Chapter 16: Further Reading
Essential Sources
1. Donald B. Rubin, "Estimating Causal Effects of Treatments in Randomized and Nonrandomized Studies" (Journal of Educational Psychology, 1974)
The foundational paper that established the potential outcomes framework as a general approach to causal inference in both randomized and nonrandomized studies. Rubin formalized the idea that causal effects are defined as comparisons between potential outcomes (what would happen to the same unit under different treatments) and showed how randomization makes average causal effects identifiable. The paper is notable for its clarity and its insistence that causal questions must be framed in terms of manipulable treatments, a position later summarized by the slogan "no causation without manipulation" that remains influential and debated.
Reading guidance: The paper is short (13 pages) and remarkably accessible for a foundational work. Section I establishes the notation $Y_t(u)$ — the potential outcome for unit $u$ under treatment $t$ — and defines the individual causal effect as the difference between potential outcomes. Section II introduces the fundamental problem: we cannot observe both potential outcomes for the same unit. Section III is the key contribution: Rubin shows how random assignment makes the average causal effect estimable from observed data, and how observational studies can approximate this under stated assumptions. Section IV discusses the role of covariates and stratification. The paper predates much of the subsequent formalization (SUTVA, ignorability, the propensity score), but the core framework is already here. For Rubin's later formalization of assumptions, see Rubin (1978), "Bayesian Inference for Causal Effects: The Role of Randomization," which introduces the assignment mechanism framework, and Rubin (1980), "Comment on 'Randomization Analysis of Experimental Data: The Fisher Randomization Test' by Basu," which first states SUTVA explicitly.
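The argument summarized above (potential outcomes, the fundamental problem, and identification through random assignment) can be illustrated with a small simulation. This is a minimal sketch, not taken from Rubin's paper: the population, the effect sizes, and the variable names `y0`, `y1`, and `treated` are assumptions chosen for illustration, with `y0[u]` and `y1[u]` playing the role of $Y_t(u)$ for control and treatment.

```python
import random
from statistics import mean

random.seed(0)
n = 20_000

# Hypothetical population: every unit u carries two potential outcomes.
# y0[u] is the outcome under control, y1[u] the outcome under treatment.
y0 = [random.gauss(10.0, 2.0) for _ in range(n)]
effects = [random.gauss(1.5, 1.0) for _ in range(n)]  # unit-level causal effects
y1 = [a + e for a, e in zip(y0, effects)]

# The average causal effect is defined on the full table of potential
# outcomes, which no real study ever observes.
true_ate = mean(effects)

# Random assignment: a fair coin flip per unit, independent of (y0, y1).
treated = [random.random() < 0.5 for _ in range(n)]

# The fundamental problem: only one potential outcome per unit is observed.
observed = [b if t else a for a, b, t in zip(y0, y1, treated)]

# Under randomization, the difference in observed group means estimates
# the average causal effect.
treated_mean = mean(y for y, t in zip(observed, treated) if t)
control_mean = mean(y for y, t in zip(observed, treated) if not t)
estimated_ate = treated_mean - control_mean

print(f"true ATE      = {true_ate:.3f}")
print(f"estimated ATE = {estimated_ate:.3f}")
```

Replacing the coin flip with an assignment rule that depends on `y0` (for example, treating only units with low `y0`) breaks the agreement between the two numbers, which is the situation the observational-study assumptions are meant to address.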
2. Guido W. Imbens and Donald B. Rubin, Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction (Cambridge University Press, 2015)
The definitive textbook treatment of the potential outcomes framework, written by two of the field's founders (Imbens shared the 2021 Nobel Prize in Economics for his contributions to causal inference methodology). At 625 pages, it provides a comprehensive, mathematically rigorous development of the Rubin causal model, from basic definitions through randomized experiments, observational studies, and instrumental variables.
Reading guidance: Part I (Chapters 1-3) covers the material in this chapter: potential outcomes, causal estimands, and the role of the assignment mechanism. This is the essential reading. Chapter 1 ("Causality: The Basic Framework") introduces the notation and the fundamental problem. Chapter 2 ("A Brief History of the Potential Outcomes Approach to Causal Inference") provides intellectual context. Chapter 3 ("A Classification of Assignment Mechanisms") classifies studies by how treatment is assigned, a framework that determines which estimation methods are appropriate. For readers continuing to Chapter 18 of this textbook: Part II (Chapters 4-11) covers randomized experiments in extraordinary depth, and Parts III and IV (Chapters 12-20) cover observational studies (conditioning methods, propensity scores). The writing is precise but demands patience: this is not a quick read, and the authors prioritize mathematical rigor over brevity. For a more concise and applied treatment, see Imbens and Angrist (1994) on local average treatment effects and Angrist and Pischke (2009) below.
3. Paul W. Holland, "Statistics and Causal Inference" (Journal of the American Statistical Association, 1986)
The paper that crystallized the potential outcomes framework for the statistics community and gave the "fundamental problem of causal inference" its name. Holland's contribution was partly philosophical: he articulated the distinction between "causes of effects" (why did this patient recover?) and "effects of causes" (does this drug cause recovery?), arguing that statistics can address only the latter. The paper also formalized the connection between Rubin's framework and experimental design, showing that randomization is sufficient (but not necessary) for identifying average causal effects.
Reading guidance: Section 2 ("A Model for Causal Inference") is the core: Holland presents the potential outcomes notation, the fundamental problem, and the distinction between scientific and statistical solutions to the problem. The "scientific solution" (observing both potential outcomes by repeating the experiment on the same unit) is generally impossible; the "statistical solution" (using populations of units) is what the entire field is built on. Section 3 introduces the concept of "prima facie causal effect" (what we call the naive estimate) and derives the selection bias decomposition. Section 5 ("Rubin's Model") connects the framework to Rubin's earlier work and introduces the role of covariates. The paper ends with a discussion of the phrase "no causation without manipulation" and its implications for defining treatments — a discussion that remains relevant when we try to define the "treatment" in recommendation systems or other complex interventions. At 18 pages, this is one of the most important and readable papers in the field.
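For reference, the selection bias decomposition mentioned above can be written in one line. The notation is an assumption made here for illustration and is not necessarily Holland's: $D$ is a binary treatment indicator, and $Y(1)$ and $Y(0)$ are the potential outcomes under treatment and control.

$$
\underbrace{E[Y \mid D=1] - E[Y \mid D=0]}_{\text{prima facie (naive) effect}}
= \underbrace{E[Y(1) - Y(0) \mid D=1]}_{\text{average effect on the treated}}
+ \underbrace{E[Y(0) \mid D=1] - E[Y(0) \mid D=0]}_{\text{selection bias}}
$$

The identity follows from writing $E[Y \mid D=1] = E[Y(1) \mid D=1]$ and $E[Y \mid D=0] = E[Y(0) \mid D=0]$, then adding and subtracting $E[Y(0) \mid D=1]$. Under random assignment, $D$ is independent of $Y(0)$, so the selection bias term vanishes and the naive comparison has a causal interpretation.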
4. Joshua D. Angrist and Jörn-Steffen Pischke, Mostly Harmless Econometrics: An Empiricist's Companion (Princeton University Press, 2009)
One of the most influential applied econometrics textbooks of the past two decades. Angrist and Pischke present the potential outcomes framework from an economist's perspective, with a pragmatic focus on identification strategies for observational data: randomized experiments, regression, instrumental variables, difference-in-differences, and regression discontinuity. The writing is unusually lively for an econometrics text, and the examples are drawn from real published studies.
Reading guidance: Chapter 2 ("The Experimental Ideal") covers the material in this chapter: potential outcomes, selection bias, and the role of randomization. It is shorter and more informal than Imbens and Rubin (2015), making it an excellent complement for readers who want a second perspective. Chapter 3 ("Making Regression Make Sense") addresses the question central to Section 16.9 of this chapter: when does OLS regression have a causal interpretation? The answer involves the conditional independence assumption (their term for conditional ignorability) and the omitted variable bias formula, derived with characteristic clarity. Chapters 4-6 cover IV, DiD, and RD, the methods of Chapter 18 of this textbook. The book is opinionated (Angrist and Pischke are famously skeptical of structural models and Bayesian methods), and their emphasis on "design-based" identification strategies has shaped a generation of applied researchers. For a more recent treatment, see Angrist and Pischke, Mastering 'Metrics: The Path from Cause to Effect (Princeton University Press, 2015), which covers similar material at a more introductory level.
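The omitted variable bias formula can be checked numerically. The sketch below assumes a hypothetical data-generating process with one treatment `d`, one confounder `u`, and made-up coefficients; none of the numbers come from the book. It verifies the in-sample identity that the short-regression coefficient equals the long-regression coefficient plus the confounder's coefficient times the slope from regressing the confounder on the treatment.

```python
import random

random.seed(1)
n = 5_000

# Hypothetical data-generating process: u confounds both d and y.
u = [random.gauss(0, 1) for _ in range(n)]
d = [0.8 * ui + random.gauss(0, 1) for ui in u]
y = [2.0 * di + 1.5 * ui + random.gauss(0, 1) for di, ui in zip(d, u)]


def cross(a, b):
    """Sum of cross-products of deviations from the means of a and b."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))


s_dd, s_uu, s_du = cross(d, d), cross(u, u), cross(d, u)
s_dy, s_uy = cross(d, y), cross(u, y)

# Short regression: y on d only (u omitted).
beta_short = s_dy / s_dd

# Long regression: y on d and u, solved from the 2x2 normal equations.
det = s_dd * s_uu - s_du ** 2
beta_long = (s_uu * s_dy - s_du * s_uy) / det
gamma = (s_dd * s_uy - s_du * s_dy) / det  # coefficient on u

# Auxiliary regression: u on d.
delta = s_du / s_dd

print(f"short coefficient       = {beta_short:.4f}")
print(f"long coefficient        = {beta_long:.4f}")
print(f"long + gamma * delta    = {beta_long + gamma * delta:.4f}")
```

The identity holds exactly in any sample, not only in expectation, so the first and third printed values agree up to floating-point error. The causal reading of `beta_long` still depends on the conditional independence assumption holding once `u` is included.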
5. Carlos Cinelli and Chad Hazlett, "Making Sense of Sensitivity: Extending Omitted Variable Bias" (Journal of the Royal Statistical Society: Series B, 2020)
A modern paper that formalizes sensitivity analysis for omitted variable bias in terms of partial $R^2$ values. Cinelli and Hazlett show how to answer the question: "How strong would an unmeasured confounder have to be — in terms of its association with treatment and outcome — to change the qualitative conclusion of the analysis?" Their framework produces intuitive sensitivity plots (contour plots in the $R^2_{D \sim U | \mathbf{X}}$ vs. $R^2_{Y \sim U | D, \mathbf{X}}$ space) and formal bounds on the bias.
Reading guidance: Section 2 establishes the framework: the OVB is parameterized by two partial $R^2$ values, which measure how much of the residual variation in the treatment and in the outcome the omitted confounder explains. Section 3 introduces the sensitivity contour plot, which is the key visual tool. The "robustness value" (the minimum strength of association, in partial $R^2$ terms, that a confounder would need with both treatment and outcome to reduce the estimated effect to zero) provides a single-number summary of robustness. Section 5 applies the method to a real study (the effect of Darfur violence exposure on attitudes toward peace). The paper is technically demanding but rewards careful reading. For Python users, the concepts can be implemented directly using the OVB formula from Section 16.9 of this chapter; for R users, the sensemakr package provides a complete implementation. This paper is essential reading for anyone conducting observational causal analyses: the sensitivity analysis it enables should be reported alongside every causal estimate from observational data.
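For Python users, the two central quantities can be sketched in a few lines. This is a minimal illustration, not a substitute for sensemakr: the function names and the regression summary (`estimate`, `se`, `dof`) are hypothetical, and the formulas are the robustness value and the partial $R^2$ bias bound as we understand them from the paper, so they should be checked against the original before use.

```python
import math


def partial_f(t_stat, dof):
    """Partial Cohen's f of the treatment with the outcome."""
    return abs(t_stat) / math.sqrt(dof)


def robustness_value(t_stat, dof, q=1.0):
    """Equal partial R^2 with treatment and outcome that a confounder
    needs in order to reduce the estimate by a fraction q (q=1: to zero)."""
    f_q = q * partial_f(t_stat, dof)
    return 0.5 * (math.sqrt(f_q ** 4 + 4 * f_q ** 2) - f_q ** 2)


def bias_bound(se, dof, r2_yu, r2_du):
    """Absolute bias implied by a confounder U with partial R^2 r2_yu with
    the outcome (given D, X) and r2_du with the treatment (given X)."""
    return se * math.sqrt(dof) * math.sqrt(r2_yu * r2_du / (1.0 - r2_du))


# Hypothetical regression summary for the treatment coefficient.
estimate, se, dof = 0.10, 0.025, 500
t_stat = estimate / se

rv = robustness_value(t_stat, dof)
partial_r2_treatment = t_stat ** 2 / (t_stat ** 2 + dof)

print(f"robustness value              = {rv:.3f}")
print(f"partial R^2 of treatment      = {partial_r2_treatment:.3f}")
print(f"bias at (0.05, 0.05)          = {bias_bound(se, dof, 0.05, 0.05):.4f}")
print(f"bias at (rv, rv) vs. estimate = {bias_bound(se, dof, rv, rv):.4f} vs. {estimate:.4f}")
```

Evaluating `bias_bound` over a grid of the two partial $R^2$ values and subtracting the result from the estimate gives the numbers behind a sensitivity contour plot. A confounder whose two partial $R^2$ values both equal the robustness value produces a bias equal in size to the estimate itself.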