the systematic monitoring of how AI systems perform after they are deployed in real clinical settings. A system that performed well in clinical trials may perform differently when used by different clinicians, on different patient populations, with different equipment, in different workflows.