The Design Challenge
The Dilemma:
How do we make networks deeper AND wider while keeping computational costs manageable?
Traditional Problems:
• Deeper networks = More parameters and harder optimization
• Wider layers = Computation grows rapidly with channel count
• Fixed filter sizes = Limited flexibility
• VGG-16 had 138M parameters!
The Question:
What if the network could decide which filter size to use at each layer?
The Inception Solution
The Inception Module (building on the "Network in Network" idea):
Apply multiple filter sizes in parallel within each module and let the network learn which to use!
Key Innovations:
• Multiple filter sizes (1×1, 3×3, 5×5) plus 3×3 max pooling, all in parallel
• 1×1 convolutions for dimension reduction before the larger filters
• Auxiliary classifiers for training
• Only ~7M parameters (vs VGG-16's 138M!)
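The dimension-reduction bullet can be made concrete with a quick parameter count. The channel sizes below (192 inputs, a 16-channel bottleneck, 32 outputs) are illustrative assumptions in the spirit of the early Inception modules, not figures from this slide:

```python
# Parameter count for a 5x5 conv branch, with and without a 1x1 bottleneck.
# Channel sizes (192 in, 16 bottleneck, 32 out) are illustrative assumptions.

def conv_params(k, c_in, c_out):
    """Weights in a k x k convolution layer (biases ignored for clarity)."""
    return k * k * c_in * c_out

naive = conv_params(5, 192, 32)                             # direct 5x5 conv
reduced = conv_params(1, 192, 16) + conv_params(5, 16, 32)  # 1x1, then 5x5

print(naive)    # 153600
print(reduced)  # 15872
print(round(naive / reduced, 1))  # 9.7 -- roughly 10x fewer parameters
```

This is why the 5×5 branch stays affordable: the 1×1 convolution shrinks the channel dimension before the expensive filter is applied.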
The Result:
Winner of the ILSVRC 2014 classification task with 6.67% top-5 error - beating VGG while using roughly 20x fewer parameters!
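The auxiliary classifiers from the innovations list inject extra gradient into the middle of the network during training and are discarded at inference. A minimal sketch of how their losses are combined, assuming the paper's 0.3 weighting; the loss values themselves are made up:

```python
# Combine the main classifier's loss with two auxiliary losses during training.
# The 0.3 weight follows the GoogLeNet paper; the loss values are hypothetical.

AUX_WEIGHT = 0.3  # auxiliary heads are down-weighted, then dropped at inference

def total_loss(main_loss, aux_losses, weight=AUX_WEIGHT):
    """Training objective: main loss plus discounted auxiliary losses."""
    return main_loss + weight * sum(aux_losses)

# Hypothetical per-batch losses from the main head and two auxiliary heads:
print(round(total_loss(1.20, [1.50, 1.35]), 3))  # 2.055
```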