Chapter 15: Further Reading
Essential Sources
1. Judea Pearl and Dana Mackenzie, The Book of Why: The New Science of Cause and Effect (Basic Books, 2018)
The most accessible introduction to causal inference ever written, by one of the field's two foundational figures. Pearl and Mackenzie present the Ladder of Causation (association, intervention, counterfactual) as an organizing framework for understanding what data can and cannot tell us. The book covers the history of causal thinking from Galton and Pearson through Wright's path analysis to Pearl's own do-calculus, with extended examples from epidemiology, economics, and artificial intelligence.
Reading guidance: Chapters 1-3 are essential background for this course: Chapter 1 introduces the Ladder of Causation, Chapter 2 covers the history of how statistics lost causation (the "Sewall Wright affair" is fascinating), and Chapter 3 introduces Bayesian networks and the do-operator. Chapter 4 (confounding and deconfounding) and Chapter 6 (paradoxes, including Simpson's) map directly to the content of this chapter. For the technically inclined reader, Pearl's academic text Causality: Models, Reasoning, and Inference (Cambridge University Press, 2nd edition, 2009) provides the formal mathematics behind everything in The Book of Why. The 2009 edition includes the complete do-calculus, the front-door criterion, and transportability — all of which appear in Chapter 17 of this textbook. However, Causality assumes fluency in probability and graph theory and is not for the casual reader.
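The gap between the first two rungs of Pearl's ladder — seeing versus doing — can be made concrete with a small simulation. The sketch below is our illustration, not an example from the book: a hypothetical binary confounder Z drives both treatment X and outcome Y, so the associational difference P(Y=1|X=1) − P(Y=1|X=0) overstates the effect, while backdoor adjustment (averaging over the marginal distribution of Z) recovers the interventional quantity P(Y=1|do(X=x)).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data: confounder Z raises both treatment uptake and the outcome.
z = rng.binomial(1, 0.5, n)
x = rng.binomial(1, 0.2 + 0.6 * z)             # Z makes treatment more likely
y = rng.binomial(1, 0.1 + 0.2 * x + 0.5 * z)   # true effect of X on Y is +0.2

# Rung 1 (association): the naive contrast is inflated by confounding.
assoc = y[x == 1].mean() - y[x == 0].mean()

# Rung 2 (intervention): backdoor adjustment averages the Z-specific
# contrasts over the *marginal* distribution of Z, mimicking do(X=x).
def p_do(x_val):
    return sum(y[(x == x_val) & (z == v)].mean() * (z == v).mean()
               for v in (0, 1))

causal = p_do(1) - p_do(0)
print(f"associational difference: {assoc:.3f}")   # well above 0.2
print(f"adjusted (do) difference: {causal:.3f}")  # close to the true 0.2
```

This is exactly the Simpson's-paradox territory of Chapter 6 of The Book of Why: which number you should report depends on the causal diagram, not on the data alone.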
2. Scott Cunningham, Causal Inference: The Mixtape (Yale University Press, 2021)
The best modern textbook for applied causal inference, written with clarity, personality, and a relentless focus on practical application. Cunningham covers the potential outcomes framework, directed acyclic graphs, matching, propensity scores, regression discontinuity, instrumental variables, difference-in-differences, and synthetic control — all with code examples (in both Stata and R) and real-world applications from economics, criminology, and public policy.
Reading guidance: Chapter 1 ("Introduction") and Chapter 2 ("Probability and Regression Review") provide background. Chapter 3 ("Directed Acyclic Graphs") is an excellent complement to our Chapter 17 — Cunningham's exposition of collider bias and the backdoor criterion is particularly clear. Chapters 4-9 cover the estimation methods that appear in our Chapter 18, each with worked examples and code. The book's tone is conversational and occasionally irreverent — it reads more like a practitioner's guide than a formal textbook, which makes it an effective complement to the more mathematical treatments. A free online version is available at mixtape.scunning.com, which also includes Python translations of the code examples.
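Collider bias, which Cunningham explains so clearly, is also easy to demonstrate numerically. The following sketch uses our own hypothetical admissions example (not one of the book's): two independent traits both cause selection, and conditioning on the selected group manufactures a correlation that does not exist in the population.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Talent and effort are generated independently...
talent = rng.normal(size=n)
effort = rng.normal(size=n)

# ...but both cause admission, making admission a collider.
admitted = (talent + effort) > 1.0

# In the full population the correlation is ~0, as designed.
r_all = np.corrcoef(talent, effort)[0, 1]

# Conditioning on the collider (looking only at admits) induces a
# spurious negative correlation: among admits, high talent tends to
# co-occur with lower effort, and vice versa.
r_admits = np.corrcoef(talent[admitted], effort[admitted])[0, 1]

print(f"corr(talent, effort) overall:      {r_all:+.3f}")
print(f"corr(talent, effort) among admits: {r_admits:+.3f}")
```

The practical lesson, which recurs throughout the Mixtape, is that "controlling for more variables" can create bias rather than remove it.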
3. Miguel A. Hernán and James M. Robins, Causal Inference: What If (Chapman & Hall/CRC, 2020)
The definitive graduate-level textbook on causal inference from an epidemiological and biostatistical perspective. Hernán and Robins develop the potential outcomes framework and graphical models in parallel, showing how each illuminates the other. The book covers identification, estimation, time-varying treatments, mediation, and interaction — topics that extend well beyond what most data scientists encounter.
Reading guidance: Part I (Chapters 1-10: "Causal Inference Without Models") is the most relevant for this course. Chapter 1 ("A Definition of Causal Effect") and Chapter 3 ("Observational Studies") are excellent introductions to the fundamental problem and confounding. Chapter 6 ("Graphical Representation of Causal Effects") bridges between the potential outcomes and graphical frameworks. Chapter 7 ("Confounding") provides the most rigorous treatment of confounding adjustment we have encountered. Part II extends to parametric models, and Part III addresses causal inference with time-varying treatments (inverse probability of treatment weighting, g-estimation, the parametric g-formula) — material that is advanced even by the standards of this textbook. The book is available free online at hsph.harvard.edu/miguel-hernan/causal-inference-book, which removes any barrier to access.
4. Paul W. Holland, "Statistics and Causal Inference" (Journal of the American Statistical Association, 81(396): 945-960, 1986)
The paper that introduced the phrase "the fundamental problem of causal inference" and formalized the connection between Rubin's potential outcomes framework and Fisher's randomization-based inference. Holland's paper is a masterpiece of expository writing: in 16 pages, he clarifies the conceptual foundations of causal inference, distinguishes between scientific and statistical questions, and explains why the potential outcomes framework resolves ambiguities that plague regression-based approaches.
Reading guidance: The entire paper is worth reading. Section 2 ("A Model for Causal Inference") defines potential outcomes and the fundamental problem with exceptional clarity. Section 3 ("Two Solutions to the Fundamental Problem") distinguishes between the "scientific solution" (finding units that are identical except for treatment) and the "statistical solution" (using randomization to make group-level comparisons valid). Section 5 ("Rubin's Model") connects potential outcomes to regression, showing how the standard regression coefficient can be interpreted causally only under specific conditions. The paper is accessible to anyone with a background in introductory statistics.
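Both halves of Holland's argument can be seen in a few lines of simulation. In code (unlike in reality) we can generate both potential outcomes for every unit, which makes the fundamental problem vivid: each real unit reveals only one of them. The sketch below, with made-up numbers of our own choosing, then shows the "statistical solution" at work — under randomization, the group-level difference in means recovers the average treatment effect we could never read off unit by unit.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000

# In simulation we can create BOTH potential outcomes per unit.
y0 = rng.normal(10, 2, n)        # outcome each unit would have if untreated
y1 = y0 + rng.normal(3, 1, n)    # outcome each unit would have if treated
true_ate = (y1 - y0).mean()      # the estimand; in reality never observed directly

# The fundamental problem: each unit reveals only ONE potential outcome.
t = rng.binomial(1, 0.5, n)      # randomized treatment assignment
y_obs = np.where(t == 1, y1, y0)

# Holland's "statistical solution": randomization makes the difference
# in group means an unbiased estimate of the average treatment effect.
estimate = y_obs[t == 1].mean() - y_obs[t == 0].mean()
print(f"true ATE:  {true_ate:.3f}")
print(f"estimate:  {estimate:.3f}")
```

The "scientific solution" — finding units identical except for treatment — corresponds to comparing y1 and y0 within a unit, which the simulation can do but an observational study cannot.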
5. Brady Neal, Introduction to Causal Inference from a Machine Learning Perspective (Online, 2020)
A modern, freely available introduction that bridges the gap between the causal inference literature (which is rooted in statistics, economics, and epidemiology) and the machine learning community. Neal covers both the potential outcomes and structural causal models frameworks, with a strong emphasis on identification and estimation using machine learning methods.
Reading guidance: Chapters 1-4 cover the conceptual foundations (association vs. causation, potential outcomes, graphical models) at a level that complements this chapter and Chapters 16-17 of this textbook. Chapters 5-7 cover identification strategies (backdoor adjustment, front-door criterion, do-calculus), and Chapters 10-11 cover causal discovery and causal machine learning. The machine learning framing makes this text particularly relevant for readers of this book, as it uses the same computational vocabulary. Available at bradyneal.com/causal-inference-course. The accompanying lecture videos provide additional context and worked examples for each chapter.