Beyond Single Layers
Multi-Layer Perceptrons (MLPs) add hidden layers between input and output, enabling them to learn non-linear patterns that single perceptrons cannot.
Key Innovation:
By stacking layers and using non-linear activation functions, an MLP with even a single sufficiently wide hidden layer can approximate any continuous function on a compact domain to arbitrary accuracy - the Universal Approximation Theorem!
MLP Architecture:
- Input Layer: Receives raw features
- Hidden Layer(s): Learns representations
- Output Layer: Produces final predictions
- Full Connectivity: Each neuron connects to all neurons in next layer
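The architecture above can be sketched as a simple forward pass. This is a minimal illustration, not a library API: the layer sizes (3 inputs, 4 hidden units, 2 outputs) and random weights are arbitrary choices for demonstration.

```python
import numpy as np

# Hypothetical layer sizes for illustration: 3 inputs, 4 hidden units, 2 outputs.
rng = np.random.default_rng(0)
W1 = rng.standard_normal((3, 4))   # input -> hidden weights (full connectivity)
b1 = np.zeros(4)                   # hidden biases
W2 = rng.standard_normal((4, 2))   # hidden -> output weights
b2 = np.zeros(2)                   # output biases

def relu(z):
    # Non-linear activation; without it the two layers collapse into one linear map.
    return np.maximum(0.0, z)

def mlp_forward(x):
    hidden = relu(x @ W1 + b1)     # hidden layer: learned representation
    return hidden @ W2 + b2        # output layer: final predictions

x = np.array([1.0, -0.5, 2.0])
print(mlp_forward(x).shape)        # (2,)
```

Note that the non-linearity is essential: composing two purely linear layers is still just one linear transformation.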
Solving the XOR Problem
Remember how single perceptrons failed at XOR? MLPs solve it elegantly with just one hidden layer!
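To make this concrete, here is one hand-crafted 2-2-1 network that computes XOR exactly. The specific weights are just one of many possible solutions, chosen for illustration: the two hidden units compute OR and AND, and the output fires when OR is on but AND is off.

```python
import numpy as np

def step(z):
    # Heaviside step activation, as in the original perceptron.
    return (z > 0).astype(int)

# Hand-picked weights (one solution among many): hidden unit 1 computes OR,
# hidden unit 2 computes AND; the output is "OR and not AND", i.e. XOR.
W1 = np.array([[1, 1],
               [1, 1]])            # input -> hidden
b1 = np.array([-0.5, -1.5])        # thresholds for OR and AND
W2 = np.array([1, -1])             # hidden -> output
b2 = -0.5

def xor_mlp(x):
    h = step(x @ W1 + b1)          # hidden layer bends the decision boundary
    return step(h @ W2 + b2)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
print(xor_mlp(X))                  # [0 1 1 0]
```

The hidden layer remaps the inputs so that the four XOR points become linearly separable, which is exactly what a single perceptron cannot do on the raw inputs.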
| Feature | Single Perceptron | Multi-Layer Perceptron |
|---|---|---|
| Decision Boundary | Linear only | Non-linear |
| XOR Problem | Cannot solve | Easily solved |
| Hidden Layers | 0 | 1 or more |
| Learning Algorithm | Perceptron rule | Backpropagation |
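Backpropagation can be sketched in a few lines: run a forward pass, then apply the chain rule layer by layer to get gradients, and take a gradient-descent step. The sketch below trains a tiny 2-2-1 sigmoid network on XOR; the hyperparameters (learning rate, epoch count, seed) are illustrative, not tuned, and convergence to a perfect XOR solution is not guaranteed from every initialization.

```python
import numpy as np

rng = np.random.default_rng(42)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

W1 = rng.standard_normal((2, 2)); b1 = np.zeros(2)   # input -> hidden
W2 = rng.standard_normal((2, 1)); b2 = np.zeros(1)   # hidden -> output

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr, losses = 0.5, []
for _ in range(5000):
    # Forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    losses.append(np.mean((out - y) ** 2))
    # Backward pass: chain rule through output, then hidden layer
    d_out = (out - y) * out * (1 - out)      # dLoss/d(pre-activation) at output
    d_h = (d_out @ W2.T) * h * (1 - h)       # gradient pushed back to hidden layer
    # Gradient-descent update
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)

print(losses[0] > losses[-1])   # loss decreases over training
```

Unlike the perceptron rule, which only updates one layer of weights from the output error, backpropagation propagates error signals through every layer, which is what makes training hidden layers possible.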
Historical Note:
The lack of a practical algorithm for training hidden layers held back neural networks for years, until backpropagation was popularized by Rumelhart, Hinton, and Williams in 1986!