CS5720 - Week 1
Slide 13 of 20

Activation Functions Introduction

What are Activation Functions?

Activation functions introduce non-linearity into neural networks, transforming the weighted sum of inputs into an output signal. Without them, even deep networks would only compute linear transformations!
Neuron with Activation:
Input → [Σ(w·x) + b] → f(z) → Output

Where f(z) is the activation function

Key Properties:
  • Non-linear transformation
  • Differentiable (for backpropagation)
  • Computationally efficient
  • Suitable gradient properties
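The neuron diagram above can be sketched directly in code. This is a minimal illustration (the function name `neuron` and the example weights are chosen here for demonstration): compute the weighted sum z = Σ(w·x) + b, then pass it through an activation f.

```python
import math

def neuron(x, w, b, f):
    """Weighted sum of inputs plus bias, passed through activation f."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return f(z)

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

# Example: z = 0.5*1.0 + (-0.25)*2.0 + 0.1 = 0.1, then sigmoid(0.1) ≈ 0.525
out = neuron([1.0, 2.0], [0.5, -0.25], 0.1, sigmoid)
```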

Why We Need Them

Without Activation Functions:

W₂(W₁x) = (W₂W₁)x = Wx

Multiple layers collapse to a single linear transformation!

With Activation Functions:

f₂(W₂·f₁(W₁x))

Can learn complex, non-linear patterns!

Benefits:
  • Enable learning of complex patterns
  • Create non-linear decision boundaries
  • Make additional depth meaningful (each layer adds representational power)
  • Model real-world relationships
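The collapse argument above can be verified numerically. A small sketch with hand-picked matrices (chosen here purely for illustration): two linear layers give exactly the same output as their single collapsed product, while inserting a ReLU between them breaks the equivalence.

```python
import numpy as np

W1 = np.array([[1.0, -1.0],
               [2.0,  0.0]])
W2 = np.array([[0.5, 1.0]])
x = np.array([1.0, 2.0])

# Two stacked linear layers collapse into one: W2(W1 x) == (W2 W1) x
two_layer = W2 @ (W1 @ x)
collapsed = (W2 @ W1) @ x          # same result, a single linear map

# With a non-linearity between the layers, the collapse no longer holds
relu = lambda z: np.maximum(0.0, z)
nonlinear = W2 @ relu(W1 @ x)      # differs from the purely linear output
```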

Interactive Activation Functions

[Interactive demo on the original slide: vary z and observe f(z) for each activation function]

Quick Comparison

  • Linear:   f(z) = z           Range: (-∞, ∞)   ❌ No non-linearity (identity)
  • Sigmoid:  f(z) = 1/(1+e⁻ᶻ)   Range: (0, 1)    ✓ Probability output
  • Tanh:     f(z) = tanh(z)     Range: (-1, 1)   ✓ Zero-centered
  • ReLU:     f(z) = max(0, z)   Range: [0, ∞)    ✓ Most popular
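The four functions in the comparison are each a one-line definition. A minimal sketch (function names are my own choice) that evaluates all four at a few sample points, so their ranges and shapes can be compared directly:

```python
import math

def linear(z):
    return z                       # identity: no non-linearity

def sigmoid(z):
    return 1 / (1 + math.exp(-z))  # squashes to (0, 1)

def tanh(z):
    return math.tanh(z)            # squashes to (-1, 1), zero-centered

def relu(z):
    return max(0.0, z)             # clips negatives to 0, range [0, ∞)

for name, f in [("linear", linear), ("sigmoid", sigmoid),
                ("tanh", tanh), ("relu", relu)]:
    print(f"{name:8s} f(-2)={f(-2):+.3f}  f(0)={f(0):+.3f}  f(2)={f(2):+.3f}")
```

Note that only sigmoid maps z = 0 to a non-zero value (0.5); the others all pass through the origin.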

Prepared by Dr. Gorkem Kar