CS5720 - Week 10

Object Detection Introduction

What is Object Detection?

Object detection combines classification and localization: it identifies which objects are in an image and marks where each one is with a bounding box.
Key Components:

Classification: What is the object?
Localization: Where is the object?
Multiple Objects: Handle a variable number of objects per image
Confidence Scores: How sure are we?
Key Challenges:
  • 🔍 Variable Object Count
    Images can contain 0 to 100+ objects
  • 📏 Scale Variation
    Objects can be tiny or massive
  • 👁️ Occlusion Handling
    Objects may be partially hidden
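
One way to picture the components and the variable object count listed above is as a data structure. This is a minimal sketch, not the API of any particular library: the `Detection` class, its field names, the `summarize` helper, and the box coordinates are all hypothetical, and (x, y) is assumed to be the top-left corner of the box.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    # Hypothetical container; field names are illustrative only.
    label: str          # classification: what is the object?
    x: float            # localization: assumed top-left corner of the box
    y: float
    width: float
    height: float
    confidence: float   # confidence score in the range 0 to 1


def summarize(detections: List[Detection]) -> str:
    """Describe a detector's output, which may hold zero or many objects."""
    if not detections:
        return "no objects"
    return ", ".join(f"{d.label} ({d.confidence:.0%})" for d in detections)


# The output is a list, so its length can change from image to image.
empty_scene: List[Detection] = []
street_scene = [
    Detection("car", 40, 120, 200, 90, 0.95),      # made-up coordinates
    Detection("person", 260, 80, 50, 140, 0.89),   # made-up coordinates
]
print(summarize(empty_scene))
print(summarize(street_scene))
```

Returning a list rather than a single value is what lets the same output type cover an empty image and a crowded one.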

Classification vs Detection

Image Classification
🐱
Input: Image
Output: Single label
Answer: "Cat"
Object Detection
📦
Input: Image
Output: Boxes + labels
Answer: "Cat at (x,y,w,h)"
Key Insight:
Detection is much harder than classification: the model must search the entire image across many locations and scales, for an unknown number of objects!
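
To get a feel for the size of that search, the sketch below counts how many candidate windows a naive sliding-window search would examine. The slides do not prescribe this method. The function name `count_windows`, the image size, the window sizes, and the stride are all assumed values chosen for illustration.

```python
def count_windows(image_w, image_h, window_sizes, stride):
    """Count the square windows a naive sliding-window search would examine."""
    total = 0
    for size in window_sizes:
        if size > image_w or size > image_h:
            continue  # this window does not fit inside the image
        positions_x = (image_w - size) // stride + 1
        positions_y = (image_h - size) // stride + 1
        total += positions_x * positions_y
    return total


# Assumed setup: a 640x480 image, three window sizes (scales), stride of 16 pixels.
n = count_windows(640, 480, window_sizes=[64, 128, 256], stride=16)
print(n)  # 2133
```

An image classifier makes one prediction per image. Under these assumptions, a window-by-window search would make over two thousand, and that is with only three scales and square windows.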

Object Detection in Action

Street Scene with Objects
Car 95%
Person 89%
Bike 76%
Output: Multiple bounding boxes with class labels and confidence scores
Bounding Box
(x, y, width, height) coordinates that tightly enclose the object
Class Label
The predicted category of the detected object
Confidence Score
How certain the model is about this detection (0-100%)
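
The sketch below shows how the three output fields might be used together: filtering detections by confidence and converting a box to corner coordinates. It is illustrative only. The labels and scores follow the street-scene example, but the box coordinates, the 0.80 threshold, and the helper names `xywh_to_corners` and `keep_confident` are assumptions. It also assumes (x, y) is the top-left corner; some formats use the box center instead.

```python
def xywh_to_corners(box):
    """Convert (x, y, width, height) to (x_min, y_min, x_max, y_max).

    Assumes (x, y) is the top-left corner of the box.
    """
    x, y, w, h = box
    return (x, y, x + w, y + h)


def keep_confident(detections, threshold):
    """Keep detections whose confidence is at least `threshold` (0 to 1)."""
    return [d for d in detections if d["confidence"] >= threshold]


# Labels and scores follow the street-scene example; box coordinates are made up.
detections = [
    {"label": "Car", "box": (40, 120, 200, 90), "confidence": 0.95},
    {"label": "Person", "box": (260, 80, 50, 140), "confidence": 0.89},
    {"label": "Bike", "box": (330, 150, 80, 60), "confidence": 0.76},
]

# With an assumed threshold of 0.80, the Bike detection (76%) is dropped.
for d in keep_confident(detections, threshold=0.80):
    print(d["label"], xywh_to_corners(d["box"]))
```

The threshold trades missed objects against false alarms: raising it removes uncertain boxes such as the bike, and lowering it keeps them.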
Prepared by Dr. Gorkem Kar