CS5720 - Week 13

Adversarial Attacks on Neural Networks

Understanding the Threat

Adversarial attacks use carefully crafted inputs (adversarial examples) designed to fool neural networks into making incorrect predictions, often through perturbations that are imperceptible to humans.
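This informal definition can be stated more precisely. A common formalization (the loss $\mathcal{L}$, perturbation budget $\epsilon$, and norm choice $p$ below are standard notation, not taken from the slide) is that the attacker seeks a perturbation $\delta$ solving

$$\max_{\|\delta\|_p \le \epsilon} \mathcal{L}\bigl(f(x + \delta),\, y\bigr)$$

i.e., a perturbation small enough to remain imperceptible (bounded in an $\ell_p$ norm) while maximizing the model's loss on the true label $y$.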
🚨 Critical Security Risk
Adversarial attacks pose serious threats to AI systems in security-critical applications like autonomous vehicles, medical diagnosis, and financial fraud detection.
  • 📖 White-box Attacks: attacker has full knowledge of model architecture and parameters
  • 📦 Black-box Attacks: attacker can only query the model and observe outputs
  • 📋 Gray-box Attacks: partial knowledge of the model or training process
  • 🌍 Physical Attacks: adversarial examples that work in the real world

Common Attack Methods

Attack techniques used to generate adversarial examples:
  • ⚔ FGSM: Fast Gradient Sign Method, a single-step attack (sketched in the code after this list)
  • 🎯 PGD: Projected Gradient Descent, an iterative optimization attack (also sketched below)
  • 🔍 C&W L2 Attack: Carlini & Wagner optimization-based attack
  • 🎭 DeepFool: finds the minimal perturbation to the decision boundary
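The first two methods are simple enough to sketch in code. Below is a minimal, illustrative PyTorch implementation of FGSM and an L-infinity PGD attack; the cross-entropy loss, the step size `alpha`, the iteration count, and the assumption that inputs are images scaled to [0, 1] are choices made for this example, not details given on the slide.

```python
# Minimal, illustrative FGSM and PGD attacks (PyTorch). Assumes a
# classification model and inputs scaled to [0, 1]; these choices and the
# default hyperparameters are examples, not values from the slide.
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon):
    """Fast Gradient Sign Method: one step of size epsilon in the
    direction of the sign of the loss gradient w.r.t. the input."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]
    # Move every input dimension by +/- epsilon, then keep pixels valid.
    x_adv = x + epsilon * grad.sign()
    return x_adv.clamp(0, 1).detach()

def pgd_attack(model, x, y, epsilon, alpha=0.01, steps=40):
    """Projected Gradient Descent: repeated FGSM-style steps of size alpha,
    each projected back into the L-infinity ball of radius epsilon."""
    x = x.clone().detach()
    x_adv = x.clone()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        # Projection step: stay within epsilon of the original input,
        # and keep pixel values in the valid [0, 1] range (assumed here).
        x_adv = torch.min(torch.max(x_adv, x - epsilon), x + epsilon)
        x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()
```

PGD is essentially FGSM applied repeatedly with a smaller step, with each iterate projected back into the epsilon-ball around the original input, which is why it is usually a much stronger attack than a single FGSM step.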

Real-World Attack Scenarios

  • 🚗 Autonomous Vehicles: adversarial patches on stop signs causing misclassification (safety-critical failure mode)
  • 🏥 Medical Imaging: manipulated medical scans leading to misdiagnosis (life-threatening consequences)
  • 👤 Facial Recognition: adversarial glasses or patches bypassing security systems (security breach potential)
  • 💳 Financial Fraud: adversarial inputs to evade fraud detection systems (economic impact)
Prepared by Dr. Gorkem Kar