Transfer learning leverages knowledge gained from one task to solve a related task. Instead of training from scratch, we use models pre-trained on massive datasets.
🎯 Core Idea
A CNN trained on millions of images has already learned to detect edges, shapes, textures, and complex patterns. We can reuse these learned features for our specific task.
🏗️ How It Works
Take a pre-trained model, remove the final classification layer, add your own classifier, and fine-tune the network on your specific dataset.
⚡ Why It's Powerful
Dramatically reduces training time, requires less data, and often achieves better performance than training from scratch.
✅ Key Advantages
• Faster training (hours vs days)
• Less data required (hundreds vs millions)
• Better performance on small datasets
• Lower computational costs
• Proven, state-of-the-art architectures
Popular Pre-trained Models
Choose from a variety of proven architectures, each with different trade-offs between accuracy and efficiency.
🏆 ResNet-50/101/152
25.6M params • 76.2% ImageNet accuracy
Excellent general-purpose model with skip connections
🔍 VGG-16/19
138M params • 74.4% ImageNet accuracy
Simple architecture, great for understanding features
⚙️ Inception-v3
24M params • 78.8% ImageNet accuracy
Efficient multi-scale feature extraction
📱 MobileNet-v2
3.5M params • 72.0% ImageNet accuracy
Optimized for mobile and edge devices
⚖️ EfficientNet-B0 to B7
5.3M-66M params • 77.3-84.3% accuracy
Optimal balance of accuracy and efficiency
🔄 Vision Transformer (ViT)
86M params • 85.8% ImageNet accuracy
Transformer architecture applied to vision
Transfer Learning Workflow
1. Choose Model: Select a pre-trained architecture based on your requirements
2. Load Weights: Download ImageNet-trained weights from a model zoo
3. Modify Architecture: Replace the final classification layer to match your number of classes
4. Fine-tune: Train on your dataset with lower learning rates
💻 Implementation Examples
# PyTorch Implementation
import torch
import torch.nn as nn
import torchvision.models as models

# Load a pre-trained ResNet-50 (torchvision >= 0.13 weights API;
# pretrained=True is deprecated)
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)

# Freeze the feature extractor (optional)
for param in model.parameters():
    param.requires_grad = False

# Replace the classifier head to match our number of classes
num_classes = 10
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Only train the classifier
optimizer = torch.optim.Adam(model.fc.parameters(), lr=0.001)