CS5720 - Introduction to Computer Vision

What is Computer Vision?

Computer vision is a field of AI concerned with enabling computers to interpret and understand the visual world. Its aim is for machines to identify and process objects in images and videos much as human vision does.

The Goal:

• Extract meaningful information from visual data
• Make decisions based on visual input
• Understand the content and context of images
• Bridge the gap between pixels and concepts

🎯 The Fundamental Challenge

How do we go from a 2D array of pixel values to understanding that there's a "cat sitting on a chair"? The distance between raw pixel values and that kind of description is the semantic gap, and it is what makes computer vision challenging!
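
To make the "2D array of pixel values" concrete, here is a minimal sketch in plain Python. The 4x4 image, the 0 to 255 intensity range, and the helper name `mean_intensity` are illustrative assumptions, not part of the course material. The point is that everything a program can compute directly from the image is a statistic over numbers; nothing in the array says "cat" or "chair".

```python
# A tiny 4x4 grayscale "image" (hypothetical data): each entry is one
# pixel intensity, assumed here to lie in the range 0 (black) to 255 (white).
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [255, 255, 0, 0],
    [255, 255, 0, 0],
]

height = len(image)    # number of rows
width = len(image[0])  # number of columns


def mean_intensity(img):
    """Return the average pixel value of a 2D list of intensities."""
    total = sum(sum(row) for row in img)
    return total / (len(img) * len(img[0]))


print(f"{width}x{height} image, mean intensity {mean_intensity(image)}")
```

A color image would typically add a third dimension (for example, three channel values per pixel), but the gap between these numbers and a description of the scene stays the same.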

Real-World Applications

🏥

Medical Imaging

Detecting tumors, analyzing X-rays, assisting diagnosis

🚗

Autonomous Vehicles

Object detection, lane recognition, pedestrian tracking

👤

Face Recognition

Security systems, photo organization, authentication

🛒

Retail & E-commerce

Visual search, product recommendations, quality control

Core Computer Vision Tasks

🏷️

Image Classification

"This is a dog"

📦

Object Detection

"Dog at (x, y) with bounding box"

🎨

Segmentation

"These pixels are dog"

👁️

Facial Recognition

"This is person X"

🏃

Pose Estimation

"Joint locations"
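
One way to compare the tasks above is by the shape of their output. The sketch below is illustrative only: the class names, field names, and the `(x, y, width, height)` box convention are assumptions made for this example, not a standard API. The helper `mask_to_box` shows how a segmentation mask can be reduced to the coarser bounding box that a detector would report.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # assumed convention: (x, y, width, height)


@dataclass
class Classification:  # "This is a dog"
    label: str


@dataclass
class Detection:  # "Dog at (x, y) with bounding box"
    label: str
    box: Box


@dataclass
class Segmentation:  # "These pixels are dog"
    label: str
    mask: List[List[int]]  # 1 where the pixel belongs to the object, else 0


@dataclass
class Identity:  # "This is person X"
    person_id: str


@dataclass
class Pose:  # "Joint locations"
    joints: Dict[str, Tuple[int, int]]  # joint name -> (x, y)


def mask_to_box(mask: List[List[int]]) -> Box:
    """Return the tightest box around the nonzero pixels of a non-empty mask."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    x, y = min(cols), min(rows)
    return (x, y, max(cols) - x + 1, max(rows) - y + 1)


# Hypothetical example: a 2x2 object in the middle of a 4x4 image.
seg = Segmentation(
    label="dog",
    mask=[
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0],
    ],
)
det = Detection(label=seg.label, box=mask_to_box(seg.mask))
```

Read top to bottom, the first three outputs carry progressively more spatial detail: a single label, a label plus a box, and a label for every pixel.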
