3. Latency and Scalability

How fast does the model produce predictions, and how does that speed change at scale? A model that takes 200 ms per prediction is fine for batch processing (score all customers overnight) but too slow for real-time applications (approve a credit card transaction in under 100 ms). At Athena's scale — 2