The Design Challenge
The Dilemma:
How do we make networks deeper AND wider while keeping computational costs manageable?
Traditional Problems:
• Deeper networks = More parameters and harder optimization
• Wider layers = Computation grows rapidly with channel count
• Fixed filter sizes = Limited flexibility
• VGG-16 had 138M parameters!
The Question:
What if the network could decide which filter size to use at each layer?
The Inception Solution
The Inception Module (building on the "Network in Network" idea):
Apply multiple filter sizes in parallel within each module and let the network learn which to use!
Key Innovations:
• Multiple filter sizes (1×1, 3×3, 5×5) plus 3×3 max pooling, all in parallel
• 1×1 convolutions for dimension reduction before the larger filters
• Auxiliary classifiers for training
• Only ~7M parameters (vs VGG-16's 138M!)
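The dimension-reduction bullet can be made concrete with a quick parameter count. The channel sizes below (192 inputs, a 16-channel bottleneck, 32 outputs) are illustrative assumptions in the spirit of the early Inception modules, not figures from this slide:

```python
# Parameter count for a 5x5 conv branch, with and without a 1x1 bottleneck.
# Channel sizes (192 in, 16 bottleneck, 32 out) are illustrative assumptions.

def conv_params(k, c_in, c_out):
    """Weights in a k x k convolution layer (biases ignored for clarity)."""
    return k * k * c_in * c_out

naive = conv_params(5, 192, 32)                             # direct 5x5 conv
reduced = conv_params(1, 192, 16) + conv_params(5, 16, 32)  # 1x1, then 5x5

print(naive)    # 153600
print(reduced)  # 15872
print(round(naive / reduced, 1))  # 9.7 -- roughly 10x fewer parameters
```

This is why the 5×5 branch stays affordable: the 1×1 convolution shrinks the channel dimension before the expensive filter is applied.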
The Result:
Winner of the ILSVRC 2014 classification task with 6.67% top-5 error - beating VGG while using roughly 20x fewer parameters!
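The auxiliary classifiers from the innovations list inject extra gradient into the middle of the network during training and are discarded at inference. A minimal sketch of how their losses are combined, assuming the paper's 0.3 weighting; the loss values themselves are made up:

```python
# Combine the main classifier's loss with two auxiliary losses during training.
# The 0.3 weight follows the GoogLeNet paper; the loss values are hypothetical.

AUX_WEIGHT = 0.3  # auxiliary heads are down-weighted, then dropped at inference

def total_loss(main_loss, aux_losses, weight=AUX_WEIGHT):
    """Training objective: main loss plus discounted auxiliary losses."""
    return main_loss + weight * sum(aux_losses)

# Hypothetical per-batch losses from the main head and two auxiliary heads:
print(round(total_loss(1.20, [1.50, 1.35]), 3))  # 2.055
```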