- **Clear ownership:** Every production model has a designated owner and an on-call rotation
- **Runbooks:** Step-by-step procedures for responding to common alerts (high null rate, prediction distribution shift, latency spike)
- **Escalation paths:** If the on-call engineer cannot resolve the issue w