CS5720 - Week 2

Mean Squared Error for Regression

The MSE Formula

MSE = (1/n) × Σ(y_predicted - y_actual)²
Where:
n = number of examples
y_predicted = what our network predicts
y_actual = the true value
• We square the difference and average over all examples
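
The formula above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function name `mse` and the sample numbers are assumptions chosen for this example, while the argument names follow the slide's `y_predicted` and `y_actual`.

```python
def mse(y_predicted, y_actual):
    """Mean squared error: average of squared differences over n examples."""
    if len(y_predicted) != len(y_actual):
        raise ValueError("y_predicted and y_actual must have the same length")
    n = len(y_actual)  # n = number of examples
    # Square each difference, sum the squares, then divide by n.
    return sum((p - a) ** 2 for p, a in zip(y_predicted, y_actual)) / n


# Illustrative values (assumed, not taken from the slide).
y_predicted = [3.0, 5.0, 7.0, 10.0]
y_actual = [2.0, 5.0, 9.0, 10.0]

# Squared differences are 1, 0, 4, 0, so the average is 5 / 4.
result = mse(y_predicted, y_actual)
print(result)
```

With these four examples the squared differences are 1, 0, 4, and 0, so the call returns 1.25.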

Why square the errors?

• Makes all errors positive (no cancellation)
• Penalizes large errors more heavily
• Keeps the loss differentiable everywhere, unlike absolute error, which has a kink at zero (important for learning!)
• Creates a smooth error surface
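
The reasons listed above can be checked numerically. The sketch below is only an illustration under assumed numbers: the helper names `squared_error` and `squared_error_gradient` are hypothetical, and the derivative used is the standard result d/dp (p - y)² = 2(p - y).

```python
def squared_error(y_predicted, y_actual):
    """Squared error for a single example."""
    return (y_predicted - y_actual) ** 2


def squared_error_gradient(y_predicted, y_actual):
    """Derivative of the squared error with respect to the prediction."""
    return 2.0 * (y_predicted - y_actual)


# 1) No cancellation: signed errors average to zero, squared errors do not.
errors = [2.0, -2.0, 1.0, -1.0]  # assumed example errors
mean_signed_error = sum(errors) / len(errors)
mean_squared_error = sum(e ** 2 for e in errors) / len(errors)
print(mean_signed_error, mean_squared_error)

# 2) Large errors are penalized more: doubling the error quadruples the penalty.
small_penalty = squared_error(8.0, 7.0)  # error of 1
large_penalty = squared_error(9.0, 7.0)  # error of 2
penalty_ratio = large_penalty / small_penalty
print(penalty_ratio)

# 3) Differentiable: the analytic gradient matches a finite-difference estimate.
h = 1e-4
numeric_gradient = (squared_error(5.0 + h, 7.0) - squared_error(5.0 - h, 7.0)) / (2 * h)
analytic_gradient = squared_error_gradient(5.0, 7.0)
print(numeric_gradient, analytic_gradient)
```

The negative gradient at a prediction of 5.0 with a true value of 7.0 indicates that increasing the prediction would reduce the error, which is the signal gradient-based learning relies on.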

Key Properties of MSE

  • 📈 Always Non-negative
    MSE ≥ 0, and equals 0 only when predictions are perfect
  • 🎯 Differentiable Everywhere
    Smooth gradient allows efficient optimization
  • ⚖️ Sensitive to Outliers
    Large errors contribute disproportionately to the loss
  • 📊 Unit Dependent
    MSE has units of (output)², so scale matters
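
Two of the properties above, outlier sensitivity and unit dependence, are easy to see with a small experiment. The following is a sketch with made-up data: the values, the metres-to-centimetres conversion, and the helper names `mse` and `mae` are all assumptions for illustration. Mean absolute error is included only as a point of comparison.

```python
def mse(y_predicted, y_actual):
    """Mean squared error over paired predictions and true values."""
    return sum((p - a) ** 2 for p, a in zip(y_predicted, y_actual)) / len(y_actual)


def mae(y_predicted, y_actual):
    """Mean absolute error, used here only for comparison."""
    return sum(abs(p - a) for p, a in zip(y_predicted, y_actual)) / len(y_actual)


# Assumed data: true values in metres, every prediction off by exactly 1.
y_actual = [10.0, 12.0, 11.0, 13.0]
y_close = [11.0, 11.0, 12.0, 12.0]
# Same predictions, except the last one is off by 10 (an outlier).
y_outlier = [11.0, 11.0, 12.0, 23.0]

mse_close = mse(y_close, y_actual)
mse_outlier = mse(y_outlier, y_actual)
mae_close = mae(y_close, y_actual)
mae_outlier = mae(y_outlier, y_actual)
print(mse_close, mse_outlier)  # a single outlier dominates the MSE
print(mae_close, mae_outlier)  # the MAE grows far less

# Unit dependence: converting metres to centimetres multiplies every value
# by 100, so the MSE is multiplied by 100 squared.
to_cm = lambda values: [v * 100.0 for v in values]
mse_close_cm = mse(to_cm(y_close), to_cm(y_actual))
print(mse_close_cm / mse_close)
```

In this example one bad prediction raises the MSE from 1.0 to 25.75 while the MAE only rises from 1.0 to 3.25, and switching from metres to centimetres scales the MSE by a factor of 10,000.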

Interactive MSE Visualization

Prediction Value: 5.0
True Value: 7.0
MSE: (5.0 - 7.0)² = 4.00
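
For readers without access to the interactive widget, its behavior can be approximated by sweeping the prediction while holding the true value fixed. This is a rough sketch, assuming a single example (n = 1) and a hand-picked list of prediction values; it is not the code behind the original visualization.

```python
true_value = 7.0
# Assumed slider positions; the slide itself shows only the 5.0 setting.
predictions = [5.0, 6.0, 7.0, 8.0, 9.0]

# With one example, the MSE is just the squared difference.
losses = [(p - true_value) ** 2 for p in predictions]

for p, loss in zip(predictions, losses):
    print(f"prediction={p:.1f}  true={true_value:.1f}  MSE={loss:.2f}")
```

The losses trace a parabola that reaches zero when the prediction equals the true value and rises symmetrically on either side; the 5.0 setting gives the 4.00 shown on the slide.
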
Prepared by Dr. Gorkem Kar