CS5720 - API Development for ML Models

What is an ML API?

An ML API (Application Programming Interface) provides a standardized way for applications to interact with machine learning models, allowing real-time predictions through HTTP requests.

🌐 Language agnostic - any client can use the model
🔄 Real-time inference for web and mobile apps
📈 Scalable deployment with load balancing
🔒 Secure access control and authentication
📊 Easy monitoring and logging capabilities
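
To make the idea concrete, here is a minimal sketch of a model exposed over HTTP, using only the Python standard library. The `toy_model` function, the `/predict` path, and the `{"features": [...]}` payload shape are illustrative assumptions rather than a fixed convention. Any client that can send an HTTP POST with JSON could call it, which is what "language agnostic" means in practice.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def toy_model(features):
    """Hypothetical stand-in for a trained model: returns the mean of the inputs."""
    return sum(features) / len(features)


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":  # assumed route name
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            features = json.loads(self.rfile.read(length))["features"]
            body = json.dumps({"prediction": toy_model(features)}).encode("utf-8")
        except (ValueError, KeyError, TypeError, ZeroDivisionError):
            # Malformed JSON, missing key, non-numeric values, or an empty list.
            self.send_error(400, "expected JSON with a non-empty 'features' list")
            return
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def request_prediction(base_url, features):
    """Client side: POST the features as JSON and decode the JSON reply."""
    req = urllib.request.Request(
        base_url + "/predict",
        data=json.dumps({"features": features}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.loads(resp.read())


if __name__ == "__main__":
    server = HTTPServer(("127.0.0.1", 0), PredictHandler)  # port 0 picks a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    print(request_prediction(f"http://127.0.0.1:{server.server_port}", [1.0, 2.0, 3.0]))
    server.shutdown()
```

A production service would normally use one of the frameworks listed later in this section instead of `http.server`, which is meant for demonstrations only.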

API Design Patterns

🌐 REST API - Standard HTTP methods for stateless model inference
📊 GraphQL API - Flexible query language for complex model interactions
⚡ Streaming API - Real-time data processing for continuous predictions
📦 Batch API - Process multiple samples efficiently in bulk
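
The difference between single-request, batch, and streaming inference can be sketched without any web framework. The function names and the summing `predict_one` model below are hypothetical. The point is only the shape of each interface: one input to one output, a list to a list, and an iterator that yields results as inputs arrive.

```python
from typing import Iterable, Iterator, List


def predict_one(features: List[float]) -> float:
    """Hypothetical model: a single stateless prediction (REST-style call)."""
    return sum(features)


def predict_batch(samples: List[List[float]]) -> List[float]:
    """Batch pattern: many samples in one call, results returned in input order."""
    return [predict_one(sample) for sample in samples]


def predict_stream(samples: Iterable[List[float]]) -> Iterator[float]:
    """Streaming pattern: yield each prediction as soon as its input is available."""
    for sample in samples:
        yield predict_one(sample)


if __name__ == "__main__":
    data = [[1.0, 2.0], [3.0, 4.0]]
    print(predict_one(data[0]))
    print(predict_batch(data))
    for prediction in predict_stream(iter(data)):
        print(prediction)
```

With a real model, the batch version would usually pass all samples to the model in one vectorized call instead of looping, which is where the efficiency gain comes from.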

Popular ML API Frameworks

⚡ FastAPI - Modern, fast Python framework with automatic API documentation

Automatic OpenAPI/Swagger docs
Built-in data validation
Async support
Type hints integration

🌶️ Flask - Lightweight and flexible Python micro-framework

Minimal setup required
Highly customizable
Large ecosystem
Easy to learn
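
For comparison, a minimal Flask sketch of a prediction endpoint, assuming `flask` is installed. The route name, payload shape, and `toy_model` are hypothetical. Flask leaves input validation to the application, so the handler checks the payload by hand.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def toy_model(features):
    """Hypothetical stand-in for a trained model."""
    return sum(features)


@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json(silent=True)  # None if the body is not valid JSON
    features = payload.get("features") if isinstance(payload, dict) else None
    valid = isinstance(features, list) and all(
        isinstance(x, (int, float)) for x in features
    )
    if not valid:
        return jsonify(error="expected JSON with a numeric 'features' list"), 400
    return jsonify(score=toy_model(features))


if __name__ == "__main__":
    app.run(port=5000)  # development server only
```

The built-in development server is convenient for learning. A deployed service would typically sit behind a production WSGI server.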

🎯 Django REST Framework - Full-featured toolkit built on Django, with Django's built-in admin and ORM

Built-in authentication
Database ORM
Admin interface
Robust security

🔥 TorchServe - PyTorch's official serving framework for production

Multi-model serving
Auto-scaling
Model versioning
Metrics monitoring
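
As a rough illustration of multi-model serving and versioning from the client side, the sketch below builds a request for TorchServe's inference API, which by default listens on port 8080 and serves models at `/predictions/{model_name}` with an optional `/{version}` suffix. The model name `sentiment`, the version string, and the JSON payload are hypothetical. The payload a real model accepts depends on its handler.

```python
import json
import urllib.request


def build_prediction_request(model_name, payload, version=None,
                             base_url="http://localhost:8080"):
    """Build (but do not send) a POST request for a TorchServe-hosted model."""
    path = f"/predictions/{model_name}"
    if version is not None:
        path += f"/{version}"  # target one specific registered version
    return urllib.request.Request(
        base_url + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_prediction_request("sentiment", {"text": "great course"}, version="2.0")
    print(req.get_method(), req.full_url)
    # Sending the request needs a running TorchServe instance with the model loaded:
    # with urllib.request.urlopen(req, timeout=5) as resp:
    #     print(resp.read())
```

Because each model has its own path, one server can host several models, and clients choose between them (and between versions) purely through the URL.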
