What are Activation Functions?
Activation functions introduce non-linearity into neural networks, transforming the weighted sum of inputs into an output signal. Without them, even deep networks would only compute linear transformations!
Neuron with Activation:
Input → [Σ(w·x) + b] → f(z) → Output
Where f(z) is the activation function
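The neuron diagram above can be sketched in a few lines of NumPy. This is a minimal illustration, not a full layer: the weights `w`, bias `b`, and the choice of sigmoid as `f` are all arbitrary values picked for the example.

```python
import numpy as np

def sigmoid(z):
    # f(z) = 1 / (1 + e^(-z)), a classic activation function
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b, f=sigmoid):
    # weighted sum z = sum(w * x) + b, then activation f(z)
    z = np.dot(w, x) + b
    return f(z)

# Illustrative inputs and parameters (not learned values)
x = np.array([1.0, 2.0])
w = np.array([0.5, -0.25])
b = 0.1
out = neuron(x, w, b)  # z = 0.5*1 - 0.25*2 + 0.1 = 0.1, then sigmoid(0.1)
```

Swapping in a different `f` (ReLU, tanh, ...) changes only the final nonlinearity, not the weighted-sum step.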
Key Properties:
- Non-linear transformation
- Differentiable (for backpropagation)
- Computationally efficient
- Well-behaved gradients (avoiding vanishing or exploding updates)
Why We Need Them
Without Activation Functions:
W₂(W₁x) = (W₂W₁)x = Wx
Multiple layers collapse to a single linear transformation!
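The collapse can be verified numerically. In this sketch the matrix shapes and random seed are arbitrary; the point is that two stacked linear layers are exactly one matrix multiply:

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 3))  # layer 1: 3 inputs -> 4 units
W2 = rng.standard_normal((2, 4))  # layer 2: 4 units  -> 2 outputs
x  = rng.standard_normal(3)

# Two linear layers applied in sequence...
two_layers = W2 @ (W1 @ x)

# ...are identical to one layer with W = W2 @ W1
W = W2 @ W1
one_layer = W @ x
```

No matter how many linear layers you stack, the product of the weight matrices is still just one matrix.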
With Activation Functions:
f₂(W₂·f₁(W₁x))
Can learn complex, non-linear patterns!
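Inserting a nonlinearity between the layers breaks the collapse. A minimal sketch with hand-picked (illustrative, not learned) weights, using ReLU as f₁:

```python
import numpy as np

relu = lambda z: np.maximum(0.0, z)

# Illustrative weights chosen so that W1 @ x has a negative entry
W1 = np.array([[ 1.0, 2.0],
               [-1.0, 0.5]])
W2 = np.array([[ 1.0, 1.0]])
x  = np.array([1.0, 0.0])

with_act  = W2 @ relu(W1 @ x)  # f₂ taken as identity for simplicity
collapsed = (W2 @ W1) @ x      # the linear collapse no longer applies
```

Because ReLU zeroes the negative hidden unit, the activated network's output differs from any single linear map, so depth now buys real expressive power.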
Benefits:
- Enable learning of complex patterns
- Create non-linear decision boundaries
- Make depth meaningful (each layer adds representational power)
- Model real-world relationships
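A classic illustration of the non-linear decision boundaries bullet is XOR, which no single linear layer can compute. The weights below are hand-picked for the example (not learned): a two-unit ReLU network computes y = ReLU(x₁ + x₂) − 2·ReLU(x₁ + x₂ − 1), which reproduces the XOR truth table exactly.

```python
import numpy as np

relu = lambda z: np.maximum(0.0, z)

def xor_net(x1, x2):
    # Hand-picked weights (illustrative, not learned):
    # y = ReLU(x1 + x2) - 2 * ReLU(x1 + x2 - 1)
    h1 = relu(x1 + x2)
    h2 = relu(x1 + x2 - 1.0)
    return h1 - 2.0 * h2

outputs = [xor_net(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
# XOR truth table: 0, 1, 1, 0 — unreachable for any purely linear model
```

In practice such weights are found by training, but the existence of an exact solution shows what the activation function makes possible.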