Chapter 18: Further Reading
Essential Sources
1. Guido W. Imbens and Donald B. Rubin, Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction (Cambridge University Press, 2015)
The comprehensive reference for the potential outcomes approach to causal inference, co-authored by the 2021 Nobel laureate Guido Imbens. While Chapter 16's further reading recommended Part I for foundational potential outcomes concepts, this chapter's material corresponds primarily to Parts III and IV of the book, which cover observational studies and matching methods in depth.
Reading guidance: Part III (Chapters 12-16, "Regular Assignment Mechanisms: Design") covers the selection-on-observables framework that underpins propensity score methods. Chapter 12 ("Unconfounded Treatment Assignment") develops the theory of unconfounded assignment, and Chapter 13 ("Estimating the Propensity Score") covers estimation of the score itself. In Part IV ("Regular Assignment Mechanisms: Analysis"), Chapter 17 ("Subclassification on the Propensity Score") and Chapter 18 ("Matching Estimators") provide the theoretical foundations for PSM and IPW as developed in Sections 18.2-18.5 of this chapter. The treatment is mathematically rigorous: Imbens and Rubin derive finite-sample results where possible rather than relying solely on asymptotic arguments, which gives sharper intuition about where and why these methods work. Part VI (Chapters 23-25) covers IV estimation from the potential outcomes perspective, including the LATE theorem, though the presentation differs from the structural equation approach common in econometrics. The book does not cover DiD or RD in depth (these are better treated in Angrist and Pischke or in Cunningham). For practitioners who want the theoretical foundations behind the tools, this is the definitive source. For those who want implementation guidance, supplement it with Cunningham (below).
2. Scott Cunningham, Causal Inference: The Mixtape (Yale University Press, 2021)
The most accessible modern textbook on causal inference methods, aimed at applied researchers. Cunningham covers the same methods as this chapter (matching and propensity scores, IV, DiD, and RD) with extensive examples, code in Stata, R, and Python, and an informal style that makes the material approachable without sacrificing rigor.
Reading guidance: The book is organized around identification strategies, making it an ideal companion to this chapter. Chapter 5 ("Matching and Subclassification") covers propensity score methods with practical guidance on implementation and diagnostics. Chapter 6 ("Regression Discontinuity") covers both sharp and fuzzy designs with worked examples. Chapter 7 ("Instrumental Variables") provides the most intuitive treatment of the LATE theorem available, including the "compliance types" framework (always-takers, never-takers, compliers, defiers) explained through concrete examples. Chapter 9 ("Difference-in-Differences") is outstanding: it covers the basic 2x2 design, the parallel trends assumption, and event study specifications, and it gives an early discussion of the staggered adoption problems formalized by Goodman-Bacon (2021). The book is freely available at https://mixtape.scunning.com/ and includes Stata, R, and Python code. For readers of this textbook: Cunningham provides the applied perspective that complements the mathematical rigor of Imbens and Rubin. Read the corresponding Cunningham chapter after each section of this chapter for a different angle on the same concepts.
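As a small illustration of the compliance-types framework mentioned above, the following sketch builds a deterministic toy population and compares the Wald (IV) estimate with a naive treated-versus-untreated comparison. All shares, baseline outcomes, and effects are made-up numbers chosen for illustration, and the population assumes no defiers (monotonicity); it is not taken from Cunningham's examples.

```python
# Toy population for the compliance-types view of IV (hypothetical numbers).
# type -> (units per instrument arm, untreated outcome Y0, treatment effect)
TYPES = {
    "always_taker": (20, 10.0, 5.0),
    "never_taker": (30, 4.0, 0.0),  # effect never realized: they never take treatment
    "complier": (50, 6.0, 2.0),
}


def takes_treatment(kind, z):
    """Treatment uptake D given compliance type and instrument Z (no defiers)."""
    if kind == "always_taker":
        return 1
    if kind == "never_taker":
        return 0
    return z  # compliers follow the instrument


# Each type appears equally often under Z=0 and Z=1, mimicking a randomized instrument.
rows = []  # tuples of (z, d, y)
for kind, (count, y0, effect) in TYPES.items():
    for z in (0, 1):
        d = takes_treatment(kind, z)
        rows.extend([(z, d, y0 + effect * d)] * count)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


reduced_form = mean(y for z, d, y in rows if z == 1) - mean(y for z, d, y in rows if z == 0)
first_stage = mean(d for z, d, y in rows if z == 1) - mean(d for z, d, y in rows if z == 0)
wald = reduced_form / first_stage

naive = mean(y for z, d, y in rows if d == 1) - mean(y for z, d, y in rows if d == 0)

print(f"first stage (complier share): {first_stage:.2f}")
print(f"Wald / IV estimate:           {wald:.2f}")
print(f"naive treated-vs-untreated:   {naive:.2f}")
```

In this toy population the Wald ratio recovers the complier effect of 2.0, while the naive comparison (about 6.2) mixes in baseline differences between always-takers and never-takers.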
3. Matias D. Cattaneo, Nicolas Idrobo, and Rocio Titiunik, A Practical Introduction to Regression Discontinuity Designs (Cambridge Elements, 2020)
The current standard reference for RD methodology, written by the researchers who developed the modern toolkit (the rdrobust package, the CCT optimal bandwidth, the robust confidence intervals). This slim volume (approximately 100 pages in two parts) distills the state of the art into a practical guide.
Reading guidance: Part I ("Foundations") covers sharp RD: the setup, identification assumptions, estimation via local polynomial regression, bandwidth selection, and inference. The treatment of bandwidth selection is the most current available: it presents the Calonico-Cattaneo-Titiunik (2014) approach, which pairs a data-driven optimal bandwidth with bias-corrected robust inference and has largely superseded the Imbens-Kalyanaraman (2012) bandwidth selector used in earlier literature. Part II ("Extensions") covers fuzzy RD, RD with discrete running variables, geographic RD, and multi-dimensional RD. The book provides extensive practical guidance: how to create the RD plot (binned scatter with a local polynomial fit), how to conduct falsification tests (McCrary density test, covariate balance, placebo cutoffs), and how to report RD results. For readers implementing RD in production: this is the essential reference. The authors maintain the rdrobust package (R, Stata, and Python), which implements the core estimation and inference methods described. Section 18.9 of this chapter follows the Cattaneo et al. framework; this book provides the theoretical depth behind the implementation.
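To make the idea of local polynomial estimation at a cutoff concrete, here is a minimal sketch of a sharp RD estimate using local linear regression with a triangular kernel on simulated data. The data-generating process, the true jump of 2.0, and the hand-picked bandwidth of 0.3 are all assumptions for illustration; the sketch does not perform the data-driven bandwidth selection or bias-corrected inference described in the book, for which rdrobust should be used.

```python
import random


def local_linear_intercept(xs, ys, cutoff, bandwidth):
    """Weighted least squares of y on (x - cutoff) with triangular kernel weights.

    Returns the fitted value at the cutoff (the regression intercept).
    """
    sw = swu = swuu = swy = swuy = 0.0
    for x, y in zip(xs, ys):
        u = x - cutoff
        w = max(0.0, 1.0 - abs(u) / bandwidth)  # triangular kernel weight
        sw += w
        swu += w * u
        swuu += w * u * u
        swy += w * y
        swuy += w * u * y
    # Closed-form intercept of a weighted simple linear regression.
    return (swuu * swy - swu * swuy) / (sw * swuu - swu * swu)


def sharp_rd_estimate(xs, ys, cutoff, bandwidth):
    """Difference between right and left local linear fits at the cutoff."""
    left_x = [x for x in xs if x < cutoff]
    left_y = [y for x, y in zip(xs, ys) if x < cutoff]
    right_x = [x for x in xs if x >= cutoff]
    right_y = [y for x, y in zip(xs, ys) if x >= cutoff]
    mu_left = local_linear_intercept(left_x, left_y, cutoff, bandwidth)
    mu_right = local_linear_intercept(right_x, right_y, cutoff, bandwidth)
    return mu_right - mu_left


# Simulated sharp design (hypothetical): treatment switches on at x >= 0.
random.seed(0)
TRUE_JUMP = 2.0
xs = [random.uniform(-1.0, 1.0) for _ in range(4000)]
ys = [
    1.0 + 0.8 * x + 0.5 * x * x
    + (TRUE_JUMP if x >= 0.0 else 0.0)
    + random.gauss(0.0, 0.5)
    for x in xs
]

estimate = sharp_rd_estimate(xs, ys, cutoff=0.0, bandwidth=0.3)
print(f"estimated jump at cutoff: {estimate:.3f} (true jump: {TRUE_JUMP})")
```

Fitting separate regressions on each side, with weights that shrink to zero at the bandwidth edge, is what keeps the estimate local to the cutoff; changing the bandwidth trades bias against variance, which is why the choice is not left to hand-tuning in practice.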
4. Andrew Goodman-Bacon, "Difference-in-Differences with Variation in Treatment Timing" (Journal of Econometrics, 2021)
The paper that transformed the DiD literature by demonstrating that the standard two-way fixed effects (TWFE) estimator can produce biased estimates under staggered treatment adoption when treatment effects vary over time, a finding that affected hundreds of published studies.
Reading guidance: Section 2 presents the core decomposition: the TWFE coefficient is a weighted average of all possible 2x2 DiD comparisons, where the weights depend on group sizes and treatment timing. Section 3 reveals the problem: when treatment effects are dynamic (varying over time since adoption), comparisons that use already-treated units as controls produce biased estimates because changes in the "control" group's outcomes reflect both the time trend and the evolution of that group's own treatment effect. The bias can flip the sign of the estimate. Section 4 provides the diagnostic tool: the Bacon decomposition, which reports the constituent 2x2 comparisons and their weights, allowing researchers to identify problematic comparisons. The paper is technically demanding but essential for anyone conducting DiD analyses. For readers of this textbook: if you have a staggered adoption design, read this paper before running a TWFE regression. For solutions to the problem Goodman-Bacon identifies, see Callaway and Sant'Anna (2021), "Difference-in-Differences with Multiple Time Periods" (Journal of Econometrics), and Sun and Abraham (2021), "Estimating Dynamic Treatment Effects in Event Studies with Heterogeneous Treatment Effects" (Journal of Econometrics), both of which propose estimators that avoid using already-treated units as controls.
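The problem described above can be reproduced in a few lines. The sketch below builds a noise-free toy panel with one early adopter and one late adopter, treatment effects that grow by one unit per period since adoption, and computes the TWFE coefficient by double demeaning (valid for a balanced panel). The panel dimensions, adoption dates, and effect sizes are made-up values for illustration; this is not the Bacon decomposition itself, only a demonstration of the bias it diagnoses.

```python
def double_demean(m):
    """Subtract unit means and period means, add back the grand mean (balanced panel)."""
    n_units, n_periods = len(m), len(m[0])
    grand = sum(map(sum, m)) / (n_units * n_periods)
    unit_mean = [sum(row) / n_periods for row in m]
    time_mean = [sum(row[t] for row in m) / n_units for t in range(n_periods)]
    return [
        [m[i][t] - unit_mean[i] - time_mean[t] + grand for t in range(n_periods)]
        for i in range(n_units)
    ]


def twfe_coefficient(y, d):
    """Slope on d from a regression of y on d with unit and period fixed effects."""
    y_til, d_til = double_demean(y), double_demean(d)
    num = sum(a * b for y_row, d_row in zip(y_til, d_til) for a, b in zip(y_row, d_row))
    den = sum(b * b for d_row in d_til for b in d_row)
    return num / den


# Hypothetical staggered design: unit 0 adopts in period 2, unit 1 in period 4.
N_PERIODS = 6
ADOPTION = [2, 4]
UNIT_FE = [5.0, 1.0]


def effect(t, g):
    """Dynamic effect: 1 in the adoption period, growing by 1 each period after."""
    return float(t - g + 1) if t >= g else 0.0


d = [[1.0 if t >= g else 0.0 for t in range(N_PERIODS)] for g in ADOPTION]
y = [
    [UNIT_FE[i] + 0.3 * t + effect(t, g) for t in range(N_PERIODS)]
    for i, g in enumerate(ADOPTION)
]

treated_effects = [effect(t, g) for g in ADOPTION for t in range(N_PERIODS) if t >= g]
true_att = sum(treated_effects) / len(treated_effects)
twfe = twfe_coefficient(y, d)

print(f"average effect among treated cells: {true_att:.3f}")
print(f"TWFE coefficient:                   {twfe:.3f}")
```

Every treated cell in this panel has an effect of at least 1, yet TWFE returns 0.5 against an average effect of about 2.17, because the early adopter serves as a "control" for the late adopter while its own effect is still growing.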
5. Paul R. Rosenbaum, Observational Studies (Springer, 2nd edition, 2002)
The foundational text on the design and analysis of observational studies, written by the co-inventor of the propensity score. Rosenbaum's book goes beyond estimation to address the deeper question: how do you design an observational study to be as convincing as possible, even when randomization is impossible?
Reading guidance: Part I ("Observational Studies") establishes the framework. Chapter 3 ("Overt Bias") covers propensity score methods as developed in this chapter, but with Rosenbaum's distinctive emphasis on study design over study analysis. His argument is a valuable corrective to the tendency to rely on statistical methods alone: a well-designed observational study (one that collects the right variables, defines treatment precisely, and pre-specifies the analysis plan) is more convincing than a poorly designed study with a sophisticated estimator. Part II ("Sensitivity to Hidden Bias") is Rosenbaum's greatest contribution: a framework for quantifying how sensitive a causal conclusion is to unmeasured confounding. The sensitivity parameter $\Gamma \geq 1$ bounds the departure from random assignment: two units with the same observed covariates may differ in their odds of treatment by at most a factor of $\Gamma$, with $\Gamma = 1$ corresponding to random assignment. The analysis asks how large $\Gamma$ must be before the conclusion changes; if $\Gamma$ must be implausibly large to overturn the conclusion, the finding is robust. This framework underpins Exercise 18.22 and should be applied to every observational causal estimate. For readers who conduct observational studies: Part II alone justifies reading the entire book. Rosenbaum's writing is meticulous and rewards careful attention.
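One simple instance of this sensitivity analysis is the sign test for matched pairs. Under a bias of at most $\Gamma$, the chance that the treated unit in a pair has the higher outcome is at most $\Gamma / (1 + \Gamma)$ under the null of no effect, so the worst-case one-sided p-value is a binomial tail probability. The sketch below computes that bound; the study of 50 pairs with 38 favoring treatment is a hypothetical example, and ties are assumed to have been dropped.

```python
from math import comb


def sign_test_upper_p(n_pairs, n_positive, gamma):
    """Worst-case one-sided sign-test p-value when hidden bias is at most gamma.

    n_positive is the number of pairs in which the treated unit had the higher
    outcome. With gamma = 1 this is the usual sign test under random assignment.
    """
    p_plus = gamma / (1.0 + gamma)  # largest chance that a pair favors treatment
    return sum(
        comb(n_pairs, k) * p_plus**k * (1.0 - p_plus) ** (n_pairs - k)
        for k in range(n_positive, n_pairs + 1)
    )


# Hypothetical matched-pair study: 38 of 50 pairs favor treatment.
N_PAIRS, N_POSITIVE = 50, 38

for gamma in (1.0, 1.5, 2.0, 2.5):
    p = sign_test_upper_p(N_PAIRS, N_POSITIVE, gamma)
    print(f"Gamma = {gamma:.1f}: upper bound on p-value = {p:.4f}")

# Largest Gamma on a coarse grid at which the conclusion still holds at the 5% level.
grid = [1.0 + 0.1 * step for step in range(21)]
robust = [g for g in grid if sign_test_upper_p(N_PAIRS, N_POSITIVE, g) <= 0.05]
print(f"conclusion survives up to Gamma of about {max(robust):.1f}")
```

Reporting the largest $\Gamma$ at which the bound stays below the significance level gives readers a direct measure of how much hidden bias would be needed to explain away the finding.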