The Core Idea
Backpropagation efficiently calculates how much each weight contributed to the error by propagating error signals backward through the network.
Why Backpropagation?
Without it, we'd need to:
• Nudge one weight slightly
• Re-run the network and see how the loss changes
• Repeat for millions of weights!
Backprop calculates ALL gradients in one backward pass!
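The contrast can be sketched numerically. Below, a hypothetical linear model with a squared-error loss: the brute-force (finite-difference) route needs one extra forward pass per weight, while the analytic gradient, which is what backprop computes, comes from a single backward pass. All names (`w`, `x`, `target`) are illustrative.

```python
import numpy as np

# Hypothetical toy setup: linear model pred = w @ x, squared-error loss.
rng = np.random.default_rng(0)
w = rng.normal(size=5)          # 5 weights
x = rng.normal(size=5)          # one input example
target = 1.0

def loss(w):
    pred = w @ x
    return (pred - target) ** 2

# Brute force: perturb each weight and re-run -- one forward pass PER weight.
eps = 1e-6
grad_fd = np.array([
    (loss(w + eps * np.eye(len(w))[i]) - loss(w)) / eps
    for i in range(len(w))
])

# Analytic gradient (what backprop computes in one backward pass):
grad_bp = 2 * (w @ x - target) * x

print(np.allclose(grad_fd, grad_bp, atol=1e-4))  # → True
```

For a model with millions of weights, the finite-difference loop means millions of forward passes per update; the backward pass costs roughly the same as one forward pass.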
The Chain Rule: ∂Loss/∂weight = ∂Loss/∂output × ∂output/∂weight
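The chain rule can be checked with concrete numbers. This minimal example assumes a single weight with output = w·x and Loss = (output − y)²; the values are made up for illustration.

```python
# One-weight example: output = w * x, Loss = (output - y)**2.
x, y, w = 2.0, 0.0, 0.5

output = w * x                          # forward pass: 1.0
dloss_doutput = 2 * (output - y)        # ∂Loss/∂output = 2.0
doutput_dw = x                          # ∂output/∂weight = 2.0
dloss_dw = dloss_doutput * doutput_dw   # chain rule: 2.0 * 2.0 = 4.0

# Finite-difference check of the same derivative:
eps = 1e-6
loss = lambda w: (w * x - y) ** 2
approx = (loss(w + eps) - loss(w)) / eps
print(dloss_dw, round(approx, 3))  # → 4.0 4.0
```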
The Algorithm Steps
1. Forward Pass
Feed input through network, save all intermediate values (activations)
2. Calculate Loss
Compare prediction with target using loss function
3. Backward Pass
Starting from loss, calculate gradients layer by layer going backward
4. Update Weights
Use gradients to update all weights via gradient descent
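The four steps above can be sketched end to end for a tiny one-hidden-layer network. This is a minimal illustration, not a reference implementation; the names (`W1`, `W2`, `lr`) and the choice of a tanh hidden layer with squared-error loss are assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=(3, 1))           # one input example
y = np.array([[1.0]])                 # target
W1 = rng.normal(size=(4, 3)) * 0.1    # hidden-layer weights
W2 = rng.normal(size=(1, 4)) * 0.1    # output-layer weights
lr = 0.1                              # learning rate

for step in range(500):
    # 1. Forward pass: save intermediate activations.
    z1 = W1 @ x
    a1 = np.tanh(z1)                  # hidden activation (saved for backward)
    pred = W2 @ a1

    # 2. Calculate loss.
    loss = float((pred - y) ** 2)

    # 3. Backward pass: gradients layer by layer, starting from the loss.
    dpred = 2 * (pred - y)            # ∂Loss/∂pred
    dW2 = dpred @ a1.T                # ∂Loss/∂W2
    da1 = W2.T @ dpred                # error signal sent back to hidden layer
    dz1 = da1 * (1 - a1 ** 2)         # tanh'(z1) = 1 - tanh(z1)^2
    dW1 = dz1 @ x.T                   # ∂Loss/∂W1

    # 4. Update weights via gradient descent.
    W2 -= lr * dW2
    W1 -= lr * dW1

print(round(loss, 6))                 # loss shrinks toward 0
```

Note that step 3 reuses `a1` saved in step 1, which is why the forward pass must store intermediate activations.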
Key Insight: Each neuron only needs local information - its inputs, weights, and the error signal from the next layer!
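That locality can be made concrete for a single (linear) neuron; the activation derivative is omitted for brevity, and the function name and variables here are illustrative. Everything the neuron needs arrives as arguments: its inputs, its weights, and the error signal `delta` from the next layer.

```python
import numpy as np

def neuron_backward(a_in, w, delta):
    """Backward step for one linear neuron using only local information."""
    dw = delta * a_in    # gradient for this neuron's own weights
    d_in = delta * w     # error signal passed on to the previous layer
    return dw, d_in

a_in = np.array([0.5, -1.0])   # inputs to the neuron
w = np.array([2.0, 3.0])       # the neuron's weights
dw, d_in = neuron_backward(a_in, w, delta=0.1)
print(dw, d_in)
```

No neuron ever needs the whole network's weights: gradients for its own parameters come from its inputs, and the signal it hands backward comes from its weights.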