CS5720 - Week 13

Adversarial Attacks on Neural Networks

Understanding the Threat

Adversarial attacks use carefully crafted inputs (adversarial examples) designed to fool neural networks into making incorrect predictions, often through perturbations that are imperceptible to humans.
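This informal definition can be stated more precisely. A common formalization (the loss $\mathcal{L}$, perturbation budget $\epsilon$, and norm choice $p$ below are standard notation, not taken from the slide) is that the attacker seeks a perturbation $\delta$ solving

$$\max_{\|\delta\|_p \le \epsilon} \mathcal{L}\bigl(f(x + \delta),\, y\bigr)$$

i.e., a perturbation small enough to remain imperceptible (bounded in an $\ell_p$ norm) while maximizing the model's loss on the true label $y$.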
🚨 Critical Security Risk
Adversarial attacks pose serious threats to AI systems in security-critical applications like autonomous vehicles, medical diagnosis, and financial fraud detection.
  • 📖 White-box Attacks: attacker has full knowledge of model architecture and parameters
  • 📦 Black-box Attacks: attacker can only query the model and observe outputs
  • 📋 Gray-box Attacks: partial knowledge of the model or training process
  • 🌍 Physical Attacks: adversarial examples that work in the real world

Common Attack Methods

Attack techniques used to generate adversarial examples:
  • ⚔ FGSM: Fast Gradient Sign Method, a single-step attack (sketched in the code after this list)
  • 🎯 PGD: Projected Gradient Descent, an iterative optimization attack (also sketched below)
  • 🔍 C&W L2 Attack: Carlini & Wagner optimization-based attack
  • 🎭 DeepFool: finds the minimal perturbation to the decision boundary
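The first two methods are simple enough to sketch in code. Below is a minimal, illustrative PyTorch implementation of FGSM and an L-infinity PGD attack; the cross-entropy loss, the step size `alpha`, the iteration count, and the assumption that inputs are images scaled to [0, 1] are choices made for this example, not details given on the slide.

```python
# Minimal, illustrative FGSM and PGD attacks (PyTorch). Assumes a
# classification model and inputs scaled to [0, 1]; these choices and the
# default hyperparameters are examples, not values from the slide.
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon):
    """Fast Gradient Sign Method: one step of size epsilon in the
    direction of the sign of the loss gradient w.r.t. the input."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]
    # Move every input dimension by +/- epsilon, then keep pixels valid.
    x_adv = x + epsilon * grad.sign()
    return x_adv.clamp(0, 1).detach()

def pgd_attack(model, x, y, epsilon, alpha=0.01, steps=40):
    """Projected Gradient Descent: repeated FGSM-style steps of size alpha,
    each projected back into the L-infinity ball of radius epsilon."""
    x = x.clone().detach()
    x_adv = x.clone()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        # Projection step: stay within epsilon of the original input,
        # and keep pixel values in the valid [0, 1] range (assumed here).
        x_adv = torch.min(torch.max(x_adv, x - epsilon), x + epsilon)
        x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()
```

PGD is essentially FGSM applied repeatedly with a smaller step, with each iterate projected back into the epsilon-ball around the original input, which is why it is usually a much stronger attack than a single FGSM step.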

Real-World Attack Scenarios

  • 🚗 Autonomous Vehicles: adversarial patches on stop signs causing misclassification (safety-critical failure mode)
  • 🏥 Medical Imaging: manipulated medical scans leading to misdiagnosis (life-threatening consequences)
  • 👤 Facial Recognition: adversarial glasses or patches bypassing security systems (security breach potential)
  • 💳 Financial Fraud: adversarial inputs to evade fraud detection systems (economic impact)
Prepared by Dr. Gorkem Kar