Case Study: Medical Reversal — When Standard of Care Is Wrong
The Pattern
Medical reversal occurs when a medical practice established as standard of care is later contradicted by higher-quality evidence — showing the practice to be no better than alternatives or actively harmful. This case study examines three representative reversals to illustrate the common pathway and structural forces involved.
Case A: Hormone Replacement Therapy (HRT)
The practice: For decades, postmenopausal women were prescribed hormone replacement therapy (estrogen alone or estrogen plus a progestin) based on observational data suggesting cardiovascular protective effects. Professional guidelines recommended HRT for cardiovascular disease prevention, and millions of women received the therapy.
The evidence basis: Observational studies (not RCTs) showed that women taking HRT had lower rates of cardiovascular disease. The biological mechanism seemed plausible: estrogen had known effects on blood vessel function and cholesterol levels.
The reversal: The Women's Health Initiative (WHI), a large RCT launched in 1991 and reporting results in 2002, found that combination HRT (estrogen plus progestin) increased the risk of coronary heart disease, stroke, and breast cancer. The combination arm of the trial was stopped early because of safety concerns.
The aftermath: HRT prescribing dropped dramatically. But the reversal was incomplete: debate continued about whether HRT might still be beneficial for younger postmenopausal women (the "timing hypothesis"), and prescribing rates partially recovered.
Failure modes active: Survivorship bias, here in the form of healthy-user selection (observational studies captured women who chose HRT, a healthier population on average). Plausible story problem (the biological mechanism "made sense"). Authority cascade (professional society guidelines endorsed HRT). Sunk cost (decades of clinical training, patient expectations, pharmaceutical investment).
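The selection effect in Case A (women who chose HRT were healthier on average) can be made concrete with a small calculation. The sketch below is illustrative only: every number in it is a hypothetical assumption, not WHI data, and the two-group population model is a deliberate simplification. It shows how a treatment that raises risk can still look protective in observational data when healthier people are more likely to take it, and how randomization removes that distortion.

```python
# Hypothetical numbers chosen only to illustrate healthy-user selection.
# None of these values come from the WHI or any other study.
BASE_RISK = {"healthier": 0.05, "less_healthy": 0.15}  # event risk without treatment
POP_SHARE = {"healthier": 0.5, "less_healthy": 0.5}    # share of each group in the population
HARM_RATIO = 1.2  # assumed true effect: treatment multiplies event risk by 1.2


def risk_ratio(p_treated):
    """Return the treated/untreated event-rate ratio.

    p_treated maps each group to the probability that a member of that
    group ends up on treatment.
    """
    def arm_rate(treated):
        # Weight of each group within the chosen arm (treated or untreated).
        weight = {
            g: POP_SHARE[g] * (p_treated[g] if treated else 1 - p_treated[g])
            for g in POP_SHARE
        }
        multiplier = HARM_RATIO if treated else 1.0
        events = sum(weight[g] * BASE_RISK[g] * multiplier for g in weight)
        return events / sum(weight.values())

    return arm_rate(True) / arm_rate(False)


# Observational setting: healthier people are more likely to choose treatment.
observational = risk_ratio({"healthier": 0.6, "less_healthy": 0.2})
# Randomized setting: a coin flip assigns treatment regardless of health.
randomized = risk_ratio({"healthier": 0.5, "less_healthy": 0.5})

print(f"observational risk ratio: {observational:.2f}")  # 0.77, looks protective
print(f"randomized risk ratio:    {randomized:.2f}")     # 1.20, the assumed true harm
```

Under these assumed numbers the observational comparison reports a risk ratio below 1 even though the treatment is harmful by construction. The randomized comparison recovers the assumed ratio of 1.2 because both arms contain the same mix of healthier and less healthy people.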
Case B: Stenting for Stable Angina
The practice: Percutaneous coronary intervention (PCI), in which a stent is inserted into a narrowed coronary artery, was widely performed for patients with stable angina (chest pain during exertion that is not immediately life-threatening). The rationale seemed self-evident: the artery is narrowed, so widen it.
The evidence basis: Strong evidence supported stenting for acute coronary syndromes (heart attacks). The extrapolation to stable angina was based on anatomical reasoning and clinical experience rather than randomized evidence.
The reversal: The COURAGE trial (2007) showed that for patients with stable coronary disease, stenting plus optimal medical therapy did not reduce death or heart attack compared with optimal medical therapy alone. The ORBITA trial (2017) went further: it was sham-controlled (patients in the control group underwent catheterization without stenting) and found no significant difference in exercise capacity between stenting and the sham procedure. This suggests that much of the improvement patients report after stenting is attributable to the placebo effect.
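The logic of a sham control can be sketched with simple arithmetic. The example below assumes an additive model of symptom improvement with three components, and every number is hypothetical (none are taken from ORBITA or COURAGE). It shows why an open-label comparison can credit the stent with improvement that actually comes from undergoing a procedure at all.

```python
# Hypothetical mean improvements on an arbitrary symptom score.
# These values are illustrative assumptions, not trial results.
NATURAL_COURSE = 5.0  # improvement everyone gets (background therapy, regression to the mean)
PROCEDURE_PLACEBO = 12.0  # improvement from undergoing any invasive procedure
STENT_SPECIFIC = 1.0  # improvement caused by the stent itself


def arm_improvement(undergoes_procedure, receives_stent):
    """Mean improvement in a trial arm under the assumed additive model."""
    total = NATURAL_COURSE
    if undergoes_procedure:
        total += PROCEDURE_PLACEBO
    if receives_stent:
        total += STENT_SPECIFIC
    return total


stent_arm = arm_improvement(undergoes_procedure=True, receives_stent=True)
sham_arm = arm_improvement(undergoes_procedure=True, receives_stent=False)
medical_only_arm = arm_improvement(undergoes_procedure=False, receives_stent=False)

# Open-label comparison: placebo and stent effects are bundled together.
open_label_estimate = stent_arm - medical_only_arm
# Sham-controlled comparison: both arms share the placebo component, so it cancels.
sham_controlled_estimate = stent_arm - sham_arm

print(f"open-label estimate:      {open_label_estimate:.1f}")       # 13.0
print(f"sham-controlled estimate: {sham_controlled_estimate:.1f}")  # 1.0
```

In this toy model, only the sham-controlled difference isolates the stent-specific component. Real placebo effects need not combine additively, so the sketch shows only the direction of the bias and is not a quantitative claim.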
The aftermath: Stenting for stable angina has decreased but not disappeared. Many cardiologists continue to perform the procedure, and many patients continue to request it. The institutional infrastructure (catheterization labs, interventional cardiology fellowships, hospital revenue models) creates powerful resistance to de-adoption.
Failure modes active: Anchoring of first explanations (the mechanical model — "blocked artery → open it" — was intuitive and resistant to evidence that it didn't work for stable disease). Incentive structures (stenting is lucrative; not stenting generates no revenue). Einstellung effect (interventional cardiologists trained to intervene have difficulty recommending non-intervention). Precision without accuracy (angiographic measurements of artery narrowing gave precise anatomical data that didn't predict clinical outcomes).
Case C: Arthroscopic Surgery for Knee Osteoarthritis
The practice: Arthroscopic surgery (lavage and debridement) for knee osteoarthritis was one of the most commonly performed orthopedic procedures in the United States, with hundreds of thousands of procedures annually.
The evidence basis: Minimal. The procedure was adopted based on clinical experience and the plausible reasoning that cleaning out damaged tissue should reduce pain.
The reversal: A 2002 RCT in the New England Journal of Medicine (Moseley et al.) compared arthroscopic surgery to sham surgery (incisions made but no actual procedure performed). The result: no difference. Patients who received sham surgery improved just as much as those who received real surgery. Multiple subsequent trials confirmed the finding.
The aftermath: The procedure has declined but remains in use. Some surgeons argue that specific subgroups of patients benefit (though the evidence for this is weak). The resistance to complete de-adoption illustrates the whole failure-mode stack: sunk cost (surgical training, equipment investment), incentive structures (procedure generates revenue), authority cascade (senior surgeons' experience), and therapeutic inertia (the difficulty of unlearning an established practice).
The Common Pathway
All three cases follow the same trajectory:
- Adoption based on inadequate evidence (observational data, mechanistic reasoning, clinical experience)
- Institutional embedding (guidelines, training programs, revenue models, patient expectations)
- Delayed rigorous testing (RCTs conducted years or decades after widespread adoption)
- Reversal evidence published (RCT shows practice is ineffective or harmful)
- Slow de-adoption (institutional resistance from sunk costs, incentives, and therapeutic inertia)
The pathway reveals a structural problem: medicine's evidence hierarchy is inverted in time. The strongest evidence (RCTs) arrives last, after the practice is already institutionally embedded. By the time that evidence arrives, the switching cost is enormous.
Analysis Questions
1. For each of the three cases, identify the single failure mode that was most responsible for the practice's initial adoption and the single failure mode most responsible for its persistence after reversal evidence was published. Are they the same?
2. The chapter notes that ~40% of tested medical practices are reversed. But most practices are never rigorously tested. What does this imply about the total error rate in medical practice? Is the 40% figure likely an underestimate or overestimate of the true reversal rate?
3. Design a "pre-adoption testing requirement" that would prevent medical reversals by requiring rigorous evidence before a practice becomes standard of care. What would it look like? What resistance would it face? What legitimate concerns would it raise (consider Chapter 21's overcorrection analysis)?
4. The sham surgery trials (ORBITA, Moseley) are particularly powerful because they control for the placebo effect of intervention itself. Why are sham-controlled surgical trials rare? What institutional and ethical barriers prevent them? Are these barriers legitimate?
5. Compare the "inverted evidence hierarchy" in medicine (strong evidence comes last) to evidence practices in your own field. Does your field adopt practices before rigorously testing them? If so, what structural features drive this inversion?
Key Takeaway
Medical reversal is not an aberration — it is the predictable consequence of a system that adopts practices based on inadequate evidence and then creates enormous institutional resistance to de-adoption when better evidence arrives. The common pathway (adoption → embedding → delayed testing → reversal → slow de-adoption) is driven by structural forces, not individual failures. Any field that adopts practices without rigorous testing and then embeds them in institutional infrastructure is vulnerable to the same pattern.