CS5720 - Week 2
Slide 32 of 40

Overfitting and Underfitting

The Three Scenarios

🥶 Underfitting
Model is too simple - can't capture underlying patterns. Poor performance on both training and test data.
✨ Just Right
Model captures patterns well without memorizing. Good performance on both training and test data.
🔥 Overfitting
Model memorizes training data - can't generalize. Great on training data, poor on test data.

Key Indicators

How to Diagnose:

Training vs Validation Loss:
Both high → Underfitting
Large gap → Overfitting
Both low, small gap → Just right

Learning Curves:
Plot training and validation performance over time
Quick Test: If your model performs much better on training data than on validation data, you're overfitting!
Warning Sign: Perfect training accuracy (100%) is almost always a sign of overfitting in real-world problems.
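
The diagnosis rules above fit in a few lines of code. This is a minimal sketch; the thresholds (0.5 for "high" loss, 0.2 for a "large" gap) are illustrative assumptions, not fixed rules, and in practice depend on your loss scale.

```python
# Minimal sketch of the train-vs-validation-loss diagnosis rules.
# Thresholds `high` and `gap` are illustrative assumptions.
def diagnose(train_loss, val_loss, high=0.5, gap=0.2):
    if train_loss > high and val_loss > high:
        return "underfitting"   # both high -> model too simple
    if val_loss - train_loss > gap:
        return "overfitting"    # large gap -> memorizing training data
    return "just right"         # both low, small gap

print(diagnose(0.8, 0.9))    # underfitting
print(diagnose(0.05, 0.6))   # overfitting
print(diagnose(0.1, 0.15))   # just right
```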

Interactive Model Complexity Demo

[Interactive demo: sliders for Model Complexity (set to 5) and Noise Level (20%); panels show the training-data fit and the learning curves. Current readout: training accuracy 85%, validation accuracy 82% - status "Just Right".]
🐻 The Goldilocks Principle
Just like Goldilocks' porridge, we want our model complexity to be "just right" - not too simple (underfitting) and not too complex (overfitting), but perfectly balanced for good generalization.
Prepared by Dr. Gorkem Kar