CS5720 - Week 2
Slide 24 of 40

Cross-Entropy for Classification

Understanding Cross-Entropy

Cross-Entropy Loss measures the difference between two probability distributions: your predicted class probabilities and the true (typically one-hot) distribution.
Why Cross-Entropy for Classification?

Probability-based: Works with class probabilities
Logarithmic penalty: Heavily penalizes confident wrong predictions
Smooth gradients: Differentiable everywhere, unlike accuracy, so it can be optimized with gradient descent
Information theory: Measures "surprise" of predictions
Loss = -Σ(y_true × log(y_pred))
💡 Key Insight
Cross-entropy heavily penalizes confident wrong predictions: the loss is -log of the probability assigned to the true class, so it grows without bound as that probability approaches zero. Being 90% confident about the wrong class is much worse than being 60% confident!
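A minimal sketch of this penalty in plain Python (no framework assumed): with a one-hot label, the sum in the loss formula collapses to -log of the probability given to the true class, so we can compare a confidently wrong prediction against a mildly wrong one.

```python
import math

def cross_entropy(y_true, y_pred):
    # Loss = -sum(y_true * log(y_pred)); with a one-hot label this
    # reduces to -log(probability assigned to the true class).
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred))

# True class is class 0 (one-hot label [1, 0]).
label = [1.0, 0.0]

# 90% confident in the WRONG class -> only 0.10 on the true class.
confident_wrong = cross_entropy(label, [0.10, 0.90])  # -ln(0.10) ~ 2.303

# 60% confident in the wrong class -> 0.40 on the true class.
mildly_wrong = cross_entropy(label, [0.40, 0.60])     # -ln(0.40) ~ 0.916

print(confident_wrong, mildly_wrong)
```

The confidently wrong prediction costs roughly 2.5x more loss, even though both predictions pick the same wrong class.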

MSE vs Cross-Entropy

| Aspect | MSE | Cross-Entropy |
|---|---|---|
| Problem Type | Regression | Classification |
| Output | Continuous values | Probabilities |
| Gradients | Can saturate | Well-behaved |
| Penalty | Quadratic | Logarithmic |
| Use When | Predicting amounts | Predicting categories |
Quick Rule: If your output is a probability distribution over classes, use cross-entropy. If it's a continuous value, use MSE.
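The "gradients can saturate" row can be sketched numerically. For a single sigmoid output with target y = 1, the gradient of MSE with respect to the pre-activation z carries an extra sigmoid-derivative factor, while for cross-entropy that factor cancels (this is the standard binary-case derivation; the example below is an illustration, not from the slide):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Binary case, target y = 1, pre-activation z:
#   d(MSE)/dz = (p - y) * p * (1 - p)   <- extra sigmoid' factor can vanish
#   d(CE)/dz  = (p - y)                 <- sigmoid' cancels; stays useful
for z in (-6.0, -2.0, 0.0):
    p = sigmoid(z)
    mse_grad = (p - 1.0) * p * (1.0 - p)
    ce_grad = p - 1.0
    print(f"z={z:5.1f}  p={p:.4f}  dMSE/dz={mse_grad:+.4f}  dCE/dz={ce_grad:+.4f}")
```

At z = -6 the prediction is badly wrong (p ≈ 0.0025), yet the MSE gradient is nearly zero, so learning stalls; the cross-entropy gradient stays close to -1.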

Interactive Cross-Entropy Visualization

Adjust the predicted probabilities and see how the cross-entropy loss changes.
Example: the true class is Cat (Class 0). With predicted probabilities Cat 0.80 ✓, Dog 0.15, Bird 0.05, the Cross-Entropy Loss is -ln(0.80) ≈ 0.223.
Prepared by Dr. Gorkem Kar