
Model Deployment

Deploy models for real-time and batch inference at scale.

Real-time Inference

SageMaker endpoints
Auto scaling configuration
Multi-model endpoints
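
As a rough illustration of the auto scaling item above, the sketch below builds the two request payloads that Application Auto Scaling takes for a SageMaker endpoint variant: one registering the variant as a scalable target, and one defining a target-tracking policy. The endpoint name, variant name, capacity limits, policy name, and target value are placeholder assumptions. Nothing is sent to AWS here; in practice the dictionaries would be passed as keyword arguments to the boto3 application-autoscaling client's register_scalable_target and put_scaling_policy calls.

```python
def build_autoscaling_requests(endpoint_name, variant_name,
                               min_capacity=1, max_capacity=4,
                               invocations_per_instance=100.0):
    """Return (scalable_target, scaling_policy) request dictionaries.

    All default values are illustrative assumptions, not recommendations.
    """
    # Fields shared by both requests identify the endpoint variant to scale.
    common = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    }
    # Instance-count bounds for the variant.
    scalable_target = {
        **common,
        "MinCapacity": min_capacity,
        "MaxCapacity": max_capacity,
    }
    # Target tracking: keep invocations per instance near the target value.
    scaling_policy = {
        **common,
        "PolicyName": f"{endpoint_name}-{variant_name}-target-tracking",  # hypothetical name
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": invocations_per_instance,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance",
            },
        },
    }
    return scalable_target, scaling_policy
```

Keeping the payloads as plain dictionaries makes the configuration easy to review and test before any AWS call is made.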

Batch Inference

Batch transform jobs
Large-scale processing
Cost-effective inference
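
For the batch transform item above, a minimal sketch of the request a transform job takes might look like the following. It only assembles the payload; the job name, model name, S3 locations, instance type, and CSV content type are all placeholder assumptions. In practice the dictionary would be passed as keyword arguments to the boto3 SageMaker client's create_transform_job call.

```python
def build_transform_job_request(job_name, model_name, input_s3_uri, output_s3_uri,
                                instance_type="ml.m5.xlarge", instance_count=1):
    """Return a request dictionary for a batch transform job.

    Defaults are illustrative assumptions; choose values that fit the workload.
    """
    return {
        "TransformJobName": job_name,
        "ModelName": model_name,
        "TransformInput": {
            # Read every object under the given S3 prefix.
            "DataSource": {
                "S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": input_s3_uri}
            },
            "ContentType": "text/csv",  # assumed input format
            "SplitType": "Line",        # treat each line as one record
        },
        "TransformOutput": {"S3OutputPath": output_s3_uri},
        "TransformResources": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
        },
    }
```

Because the instances exist only for the duration of the job, this pattern suits large offline datasets where a persistent endpoint would sit idle.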

Edge Deployment

SageMaker Edge Manager
IoT device deployment
Model optimization for edge

A/B Testing

Multi-variant endpoints
Traffic splitting
Performance comparison
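
To make the traffic splitting item concrete: on a multi-variant endpoint, each variant receives a share of requests proportional to its weight divided by the sum of all variant weights. The sketch below reproduces that arithmetic locally and simulates routing. The variant names and weights are hypothetical, and the functions are illustrative helpers, not part of any AWS SDK.

```python
import random


def traffic_shares(variant_weights):
    """Map each variant name to its fraction of traffic (weight / total weight)."""
    total = sum(variant_weights.values())
    if total <= 0:
        raise ValueError("at least one variant needs a positive weight")
    return {name: weight / total for name, weight in variant_weights.items()}


def pick_variant(variant_weights, rng=random):
    """Choose one variant at random, with probability proportional to its weight."""
    names = list(variant_weights)
    weights = [variant_weights[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Hypothetical 90/10 split between a current model and a challenger.
weights = {"variant-a": 9, "variant-b": 1}
shares = traffic_shares(weights)
```

Starting the challenger at a small share and raising its weight as its metrics hold up is a common way to run the performance comparison with limited risk.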
