Chapter 36: Quiz — AI in Healthcare Decision-Making

20 Questions

Instructions: Select the best answer for each multiple-choice question. For short-answer questions, write two to four sentences.


Question 1: The Epic Deterioration Index is best described as:

A) A laboratory test that measures biomarkers of deterioration risk
B) A machine learning model embedded in Epic's EHR that predicts hospitalized patient deterioration risk
C) A clinical protocol for early warning response to deteriorating patients
D) A standardized nursing assessment tool for patient safety

Correct Answer: B

Explanation: The Deterioration Index is an AI model that analyzes EHR data (vital signs, lab values, nursing assessments, medications) to generate a risk score predicting which patients are at risk of rapid clinical decline. It is embedded in Epic's software interface; it is not a laboratory test or a standardized nursing protocol.
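
To make the mechanism concrete, the sketch below shows a logistic-regression-style risk score computed from routine EHR observations. The features, weights, intercept, and example patient are all invented for illustration; Epic's actual model and coefficients are proprietary and are not reproduced here.

```python
import math

# Hypothetical feature weights for an EHR-based deterioration score.
# All values are invented for illustration; they are NOT Epic's.
WEIGHTS = {
    "resp_rate": 0.09,      # breaths/min
    "heart_rate": 0.02,     # beats/min
    "systolic_bp": -0.015,  # mmHg (lower pressure raises risk)
    "lactate": 0.45,        # mmol/L
    "nurse_concern": 0.8,   # 1 if a nursing assessment flags concern
}
INTERCEPT = -6.0

def deterioration_score(obs: dict) -> float:
    """Map one set of EHR observations to a risk score in [0, 1]."""
    z = INTERCEPT + sum(w * obs[name] for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))

patient = {"resp_rate": 24, "heart_rate": 112, "systolic_bp": 95,
           "lactate": 2.8, "nurse_concern": 1}
print(f"risk score: {deterioration_score(patient):.2f}")  # ~0.28
```

The point of the sketch is structural: the output is a score derived from weighted EHR inputs, which is why feature choice, training population, and threshold calibration (see Questions 8 and 17) matter so much.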


Question 2: The core methodological problem with IBM Watson for Oncology's training was that:

A) The training data contained too many cases of rare cancers
B) Watson was trained on hypothetical cases curated by MSKCC oncologists rather than real patient outcomes data
C) Watson was trained on data from too many different institutions
D) The training dataset was too small to support a reliable model

Correct Answer: B

Explanation: Watson's fundamental limitation was its training methodology: it learned from expert-curated hypothetical cases rather than from actual patient outcomes. This produced a system that encoded MSKCC's clinical reasoning without validating that reasoning against real-world outcomes, and that did not generalize beyond MSKCC's patient population and institutional context.


Question 3 "Automation bias" in clinical settings refers to:

A) AI systems that systematically favor automated recommendations over human judgment B) The tendency of clinicians to over-rely on AI recommendations without exercising adequate independent judgment C) The bias introduced when AI training data comes exclusively from automated record-keeping systems D) The preference of hospital administrators for automated AI tools over human-intensive care protocols

Correct Answer: B Explanation: Automation bias is the documented human tendency to over-rely on automated systems — accepting their recommendations with less scrutiny than the evidence would warrant. In clinical settings, this manifests as faster or less-questioned acceptance of AI-recommended actions, with reduced independent clinical reasoning.


Question 4: The Optum care management algorithm studied by Obermeyer et al. (2019) introduced racial bias primarily through:

A) Intentional programming of race as a protected characteristic
B) Use of healthcare costs as a proxy for health need, which underestimated Black patients' needs due to systemic barriers to care
C) Training data that included race as an explicit variable in predictive models
D) A testing process that evaluated performance only in predominantly white patient populations

Correct Answer: B

Explanation: The algorithm used healthcare spending as a proxy for health need, on the assumption that sicker patients spend more. Because Black patients face systemic barriers to healthcare access, they generated lower costs than equally sick white patients. The proxy variable encoded structural inequality, causing the algorithm to systematically underestimate Black patients' health needs.
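
The proxy failure can be demonstrated in a few lines of simulation. In the toy model below, both groups have identical latent health need, but one faces an access barrier that suppresses realized spending; any model trained to predict cost will then score that group as lower-need. All parameters are invented for illustration.

```python
import random

random.seed(0)

def simulate_patient(faces_barrier: bool):
    """Toy model: identical latent health need in both groups, but access
    barriers reduce realized spending for one group. Numbers are invented."""
    need = random.gauss(5.0, 1.5)            # latent need, e.g. active conditions
    access = 0.6 if faces_barrier else 1.0   # barrier suppresses utilization
    cost = max(0.0, need * access * random.gauss(1000.0, 150.0))
    return need, cost

for label, barrier in [("no barrier", False), ("barrier", True)]:
    sample = [simulate_patient(barrier) for _ in range(10_000)]
    avg_need = sum(n for n, _ in sample) / len(sample)
    avg_cost = sum(c for _, c in sample) / len(sample)
    print(f"{label:10s}  mean need: {avg_need:.2f}   mean cost: ${avg_cost:,.0f}")
```

Both groups print essentially the same mean need, but the barrier-facing group's mean cost comes out roughly 40% lower, which is exactly the gap a cost-trained algorithm converts into lower risk scores.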


Question 5: The FDA's Software as a Medical Device (SaMD) framework applies to:

A) All software used in hospitals and clinical settings
B) Software that performs medical device functions, including making or supporting clinical decisions
C) Hardware medical devices that have embedded software components
D) Any AI system trained on medical data, regardless of its intended use

Correct Answer: B

Explanation: SaMD refers specifically to software intended to perform medical device functions (making or supporting medical decisions) without being part of a hardware device. Not all software used in healthcare is SaMD; administrative software, general communication tools, and billing systems are not typically SaMD.


Question 6: The eGFR race correction that was eliminated in 2021 had what clinical effect on Black patients?

A) It overestimated kidney disease severity, leading to overtreatment
B) It assigned higher kidney function values, potentially delaying diagnosis and treatment of kidney disease
C) It required Black patients to undergo more frequent kidney function testing
D) It adjusted treatment recommendations to account for documented genetic differences in kidney function

Correct Answer: B

Explanation: The race correction assigned higher eGFR values to Black patients with identical creatinine measurements, making their kidney function appear better than it was. This could delay clinical recognition of kidney disease and, with it, the interventions that could slow disease progression. The correction was not based on robust evidence of biological difference.
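
The mechanism is visible in the equation itself. The sketch below implements the CKD-EPI 2009 creatinine equation, whose race coefficient (1.159) was removed in the 2021 refit; the example patient values are invented for illustration.

```python
def egfr_ckd_epi_2009(scr_mg_dl: float, age: int, female: bool, black: bool) -> float:
    """CKD-EPI 2009 creatinine equation (mL/min/1.73 m^2), including the
    race coefficient that the 2021 refit of the equation removed."""
    kappa = 0.7 if female else 0.9
    alpha = -0.329 if female else -0.411
    egfr = (141.0
            * min(scr_mg_dl / kappa, 1.0) ** alpha
            * max(scr_mg_dl / kappa, 1.0) ** -1.209
            * 0.993 ** age)
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159  # same creatinine, ~16% higher estimated function
    return egfr

# Identical creatinine, age, and sex (values invented for illustration):
print(round(egfr_ckd_epi_2009(1.4, 55, female=False, black=False)))  # ~56
print(round(egfr_ckd_epi_2009(1.4, 55, female=False, black=True)))   # ~65
```

At this creatinine, the uncorrected estimate falls below the common eGFR-60 threshold (CKD stage 3a range) while the race-corrected estimate sits above it, which is precisely how the coefficient could delay a diagnosis.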


Question 7: What is the primary concern about "predetermined change control plans" (PCCPs) for adaptive AI medical devices?

A) They allow AI devices to change their behavior after FDA authorization without additional review
B) They are too restrictive, preventing beneficial AI improvement
C) They require manufacturers to fund new clinical trials for every model update
D) They do not apply to AI used in software platforms rather than standalone devices

Correct Answer: A

Explanation: PCCPs allow manufacturers to define in advance what types of model updates are acceptable without triggering a new FDA review, enabling adaptive AI to update continuously. The primary concern is that model updates, even within a PCCP, could change clinical behavior in ways the plan did not fully anticipate, potentially affecting safety and effectiveness.


Question 8: Independent validation studies of the Epic Deterioration Index have generally found:

A) That the model performs exactly as Epic reports across all hospital settings
B) That performance in specific hospital populations may differ substantially from Epic's reported statistics, with variable positive predictive values
C) That the model performs better than Epic reports but only in academic medical centers
D) That the model is accurate but has significant alert fatigue issues that are uniform across settings

Correct Answer: B

Explanation: Published independent validation studies have found that the Deterioration Index's performance in real hospital populations (sensitivity, specificity, and positive predictive value) can differ substantially from Epic's internal validation statistics. This performance variation across settings is a core governance concern.
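
One reason performance travels poorly is arithmetic rather than modeling: positive predictive value depends on how common deterioration is in the local population, so the same sensitivity and specificity yield very different PPVs at different sites. The sensitivity and specificity figures below are invented for illustration.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from test characteristics and event prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)

# Identical model characteristics (invented values), three hospital case mixes:
for prevalence in (0.02, 0.05, 0.10):
    print(f"deterioration rate {prevalence:.0%}: PPV = {ppv(0.75, 0.85, prevalence):.0%}")
```

Running this prints PPVs of roughly 9%, 21%, and 36%: the "same" model looks dramatically better or worse depending only on the deployment site's case mix.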


Question 9 "Algorithmic redlining" in healthcare refers to:

A) The use of geographic data to target healthcare marketing B) The systematic disadvantaging of patients from certain demographic groups through algorithmic tools C) Algorithms that prevent patients from accessing care outside their network D) The use of zip codes to predict health risk for insurance underwriting

Correct Answer: B Explanation: Algorithmic redlining describes patterns in which algorithmic tools — risk scores, resource allocation algorithms, clinical decision support — systematically direct resources away from or create worse outcomes for patients from certain demographic groups, analogous to the historical practice of mortgage redlining.


Question 10: The pulse oximeter inaccuracy finding during COVID-19 was significant primarily because:

A) It demonstrated that all wearable medical devices need recalibration during pandemics
B) A device authorized by the FDA without skin-tone testing was found to overestimate oxygen saturation in patients with darker skin, affecting clinical decisions
C) The inaccuracy affected all patients equally but was particularly severe during COVID-19
D) Pulse oximeters had been recalled by the FDA years earlier and were being used illegally

Correct Answer: B

Explanation: Research documented that pulse oximeters systematically overestimated blood oxygen saturation in patients with darker skin tones. Clinical decisions about oxygen therapy and discharge that relied on those readings could therefore rest on inaccurate values, potentially delaying treatment for patients who were more hypoxic than the device indicated.
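
The clinical stakes come from threshold-based decisions. In the toy model below, a fixed upward reading bias (the +3-point offset is invented; real bias varies by device and saturation level) is enough to move a genuinely hypoxic patient above an illustrative treatment cutoff.

```python
THERAPY_CUTOFF = 92.0  # illustrative SpO2 threshold for escalating oxygen therapy

def displayed_spo2(true_saturation: float, overestimation: float) -> float:
    """Toy model of a systematic upward reading bias; values are invented."""
    return true_saturation + overestimation

true_saturation = 90.0  # patient is genuinely below the cutoff
for bias in (0.0, 3.0):
    reading = displayed_spo2(true_saturation, bias)
    escalate = reading < THERAPY_CUTOFF
    print(f"displayed {reading:.0f}% -> escalate therapy: {escalate}")
```

With no bias the patient is flagged for treatment; with the overestimating reading, the same patient appears fine. That is the occult hypoxemia pattern the COVID-era studies documented.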


Question 11: In Obermeyer's Optum algorithm study, what was the magnitude of the racial bias found?

A) At a given risk score, Black patients were approximately as sick as white patients
B) At a given risk score, Black patients were substantially sicker than white patients
C) At a given risk score, white patients were sicker than Black patients, reflecting appropriate risk stratification
D) The algorithm had no racial bias but had significant bias by sex

Correct Answer: B

Explanation: Obermeyer's analysis found that at the same algorithmic risk score, Black patients were significantly sicker than white patients, meaning the algorithm was effectively underestimating Black patients' health needs relative to white patients with identical scores. A Black patient at a given risk score had, on average, more health problems than a white patient at the same score.
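
This kind of finding comes from a calibration-by-group audit: bin patients by risk score, then compare measured illness burden across groups within each bin. The sketch below uses invented records with the bias signature deliberately built in; Obermeyer et al. measured burden with counts of active chronic conditions.

```python
from collections import defaultdict
from statistics import mean

# Each record: (risk score, group, active chronic conditions).
# Invented data with the bias signature built in for illustration.
records = [
    (0.8, "white", 3), (0.8, "white", 4), (0.8, "white", 3),
    (0.8, "black", 5), (0.8, "black", 4), (0.8, "black", 5),
    (0.4, "white", 1), (0.4, "white", 2),
    (0.4, "black", 3), (0.4, "black", 2),
]

by_score = defaultdict(lambda: defaultdict(list))
for score, group, conditions in records:
    by_score[score][group].append(conditions)

# At the SAME score, average illness burden should match across groups;
# a persistent gap is the bias signature Obermeyer et al. reported.
for score in sorted(by_score, reverse=True):
    summary = ", ".join(f"{g}: {mean(v):.1f} conditions"
                        for g, v in sorted(by_score[score].items()))
    print(f"risk score {score}: {summary}")
```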


Question 12: The 510(k) premarket clearance pathway poses particular challenges for novel clinical AI because:

A) It requires randomized clinical trial evidence that AI companies cannot generate
B) Novel AI tools may not have appropriate predicates; demonstrating substantial equivalence to older technologies does not ensure new AI tools are safe
C) The FDA charges fees for 510(k) review that small AI companies cannot afford
D) The pathway takes too long and prevents timely deployment of beneficial AI tools

Correct Answer: B

Explanation: The 510(k) pathway is designed for devices substantially equivalent to previously authorized devices. Novel AI clinical tools that do not have close predecessors may be able to identify a technically similar predicate, but substantial equivalence to an older technology does not, by itself, ensure that the new AI system performs safely in clinical use.


Question 13: Which of the following most accurately describes the FDA regulatory status of most AI mental health chatbots?

A) All AI mental health applications require FDA authorization before distribution
B) Most general wellness apps, including many mental health chatbots, fall under exemptions that do not require FDA review
C) AI mental health tools are regulated by the FTC rather than the FDA
D) Mental health chatbots require prescription status and are regulated as prescription digital therapeutics

Correct Answer: B

Explanation: The FDA's approach to digital health has generally exempted low-risk general wellness apps from device review requirements. Most AI mental health chatbots (Woebot, Wysa, Replika) operate under this general wellness exemption rather than FDA device authorization, meaning they have not demonstrated safety and effectiveness through FDA review.


Question 14: What distinguishes "informed consent" requirements for AI involvement in clinical care from typical technology consent in hospitals?

A) AI consent requires patient signature, whereas other technology consents are implied
B) AI consent should be material, decision-relevant, and specific to AI when AI significantly influences diagnosis or treatment, not merely a generic technology disclosure
C) There are currently no specific requirements; all AI consent is handled identically to other technology consent
D) Federal law requires specific consent for all AI involvement in clinical care

Correct Answer: B

Explanation: A principled framework for AI consent in clinical care holds that disclosure should be meaningful and decision-relevant, focused on AI uses that significantly influence clinical decisions, particularly when AI performance may vary for specific patient populations. Generic technology consent does not satisfy this standard.


Question 15: What specific patient safety risk does end-of-life AI mortality prediction create beyond standard prognostic uncertainty?

A) Mortality predictions are always inaccurate and therefore should not be used clinically
B) Algorithmic predictions presented with numerical precision may carry authority effects that shape clinical and family decision-making in ways disconnected from patient values
C) Mortality prediction tools are not covered by HIPAA and therefore create data privacy violations
D) Mortality prediction AI requires FDA authorization that currently does not exist

Correct Answer: B

Explanation: The concern with mortality prediction AI is not only statistical accuracy but also the authority effect of numerical predictions: a score such as "74% risk of death within 30 days" may shape clinical conversations, family expectations, and care intensity decisions in ways that reduce patient autonomy and fail to acknowledge the prognostic uncertainty that patient-centered care requires.


Question 16 (Short Answer): Explain the "generalizability problem" in clinical AI and why Watson for Oncology exemplifies it.

Model Answer: Generalizability refers to whether an AI model trained in one clinical context maintains its performance in different clinical contexts: different patient populations, institutional resources, and clinical practices. Watson for Oncology exemplified the generalizability problem because it was trained on hypothetical cases curated by MSKCC oncologists, encoding MSKCC's specific institutional reasoning. When deployed in hospitals in India, South Korea, and other contexts with different patient demographics, available drugs, and clinical protocols, Watson's recommendations showed significant discordance with local tumor board recommendations, precisely because the model had not been validated in those populations.


Question 17 (Short Answer): What is "alert fatigue" in the context of clinical AI, and how does it create patient safety risks?

Model Answer: Alert fatigue occurs when a clinical AI tool generates a high volume of false positive alerts: warnings about patients who are not actually deteriorating or at elevated risk. Clinicians exposed to many false positives develop habits of rapidly overriding or dismissing alerts without case-by-case clinical assessment. This desensitization creates patient safety risk because genuinely important alerts (true positives) may be dismissed alongside the false ones. Independent validation studies of the Deterioration Index have raised alert fatigue concerns based on positive predictive values low enough to generate substantial false positive alert volumes, as the sketch below illustrates.
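
A sketch of the arithmetic behind that concern, complementing the PPV example under Question 8; the PPVs and event counts are invented for illustration.

```python
def alert_burden(ppv: float, true_events: float) -> tuple[float, float]:
    """For a given PPV, how many total and false alerts accompany a fixed
    number of true events? All inputs are illustrative."""
    total_alerts = true_events / ppv
    false_alerts = total_alerts - true_events
    return total_alerts, false_alerts

# Five genuine deterioration events per week on a ward (invented figure):
for ppv in (0.10, 0.20, 0.50):
    total, false = alert_burden(ppv, true_events=5)
    print(f"PPV {ppv:.0%}: {total:.0f} alerts/week, {false:.0f} of them false")
```

At a 10% PPV, clinicians wade through 45 false alarms to find five real events, which is the behavioral soil in which dismissal habits grow.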


Question 18 (Short Answer): Why is the "silent update" to Epic's Deterioration Index ethically significant beyond the specific technical change made?

Model Answer: The ethical significance is not primarily about the specific technical change made in September 2021, but about what silent updates reveal about the governance of AI embedded in platform software: clinical tools that shape patient care can be altered without clinicians' awareness, undermining the premise of meaningful human oversight. Clinicians develop calibrated intuitions about AI tool outputs based on experience with a specific model version; when the model changes without notification, those intuitions may become miscalibrated, and the "human-in-the-loop" becomes a human exercising judgment based on incorrect assumptions about the tool they are using.


Question 19 (Short Answer): What does Watson for Oncology teach about the relationship between institutional prestige and AI validity in healthcare?

Model Answer: Watson for Oncology's partnership with MSKCC, one of the world's premier cancer centers, was used to confer institutional authority on Watson's recommendations. But clinical excellence in human expert judgment does not automatically transfer to AI systems built on curated training data from that institution. The MSKCC association was a form of appeal to authority that did not substitute for rigorous clinical validation in diverse external populations. This teaches that AI procurement in healthcare must be governed by clinical evidence standards (validation studies, outcomes data, demographic performance), not by the prestige of the institutions associated with the system's development.


Question 20 (Short Answer): A hospital is considering deploying an AI tool that predicts which emergency department patients are at highest risk of serious illness to prioritize triage. What equity considerations should the hospital assess before deployment?

Model Answer: The hospital should assess: (1) whether the AI tool's training data included patients demographically similar to its patient population, particularly minority and low-income patients who may be underrepresented in training datasets; (2) whether the vendor has provided demographic subgroup performance data showing whether the tool performs comparably for patients of different races, ethnicities, and socioeconomic backgrounds; (3) whether any proxy variables used in the model — insurance status, prior healthcare utilization, geographic data — may introduce systematic bias; and (4) what mechanisms will monitor post-deployment performance equity. Given that algorithmic triage bias would directly affect which patients receive urgent care, this equity assessment is a patient safety requirement, not an optional additional analysis.