CS5720 - Week 2

Gradient Descent: The Big Picture

The Core Idea

Gradient Descent is an optimization algorithm that finds a (local) minimum of a function by repeatedly taking steps in the direction of steepest decrease.
Key concepts:

Gradient = Direction of steepest increase
Negative gradient = Direction of steepest decrease
Step size = Learning rate (how big our steps are)
Goal = Find the lowest point (minimum loss)
🏔️ Mountain Analogy
Imagine you're lost in foggy mountains and want to reach the valley. You can only feel the slope under your feet. Gradient descent says: always step downhill!
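
To make these concepts concrete, here is a small one-dimensional worked example (the function f(w) = w² is chosen purely for illustration):

  f(w) = w², so the gradient is f'(w) = 2w.
  At w = 3 the gradient is 6: positive, pointing uphill (toward larger w).
  Stepping against it with learning_rate = 0.1 gives
  w_new = 3 - 0.1 × 6 = 2.4, which is closer to the minimum at w = 0.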

The Algorithm

  • Step 1: Initialize
    Start with random weights (random position on the mountain)
  • Step 2: Calculate Gradient
    Find which direction is "downhill" (compute derivatives)
  • Step 3: Update Weights
    Take a step in the opposite direction of gradient
    w_new = w_old - learning_rate × gradient
  • Step 4: Repeat
    Keep stepping until we reach the bottom (convergence); a runnable sketch follows below
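
To connect the four steps to code, here is a minimal, runnable Python sketch (not from the slides; the 1-D loss f(w) = (w - 2)², the learning rate 0.1, and the 1e-6 stopping threshold are all illustrative choices):

  import random

  def loss(w):
      # Hypothetical 1-D loss with its minimum at w = 2
      return (w - 2) ** 2

  def gradient(w):
      # Derivative of the loss: d/dw (w - 2)^2 = 2 * (w - 2)
      return 2 * (w - 2)

  # Step 1: initialize at a random position on the "mountain"
  w = random.uniform(-10, 10)
  learning_rate = 0.1

  for step in range(1000):
      g = gradient(w)             # Step 2: which way is downhill?
      w = w - learning_rate * g   # Step 3: step against the gradient
      if abs(g) < 1e-6:           # Step 4: stop once the slope is ~flat
          break

  print(f"Converged to w = {w:.4f} (loss = {loss(w):.6f}) after {step + 1} steps")

With these settings the distance to the minimum shrinks by a fixed factor each step, so the loop converges in well under 100 iterations regardless of the random start.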

Interactive Mountain Descent

[Interactive demo on the original slide: click anywhere on the mountain to start descending from that point. The demo reports the current height, the gradient magnitude, and the number of steps taken.]
Prepared by Dr. Gorkem Kar