Chapter 17: Further Reading
Essential Sources
1. Judea Pearl, Causality: Models, Reasoning, and Inference, 2nd edition (Cambridge University Press, 2009)
The foundational text of graphical causal models and the do-calculus. Pearl developed the framework over three decades, starting with Bayesian networks (1988) and culminating in this comprehensive treatment of causal reasoning. The second edition incorporates the completeness results for do-calculus and the complete identification algorithm, making it the definitive reference for the theory presented in this chapter.
Reading guidance: Chapter 1 ("Introduction to Probabilities, Graphs, and Causal Models") establishes the notation and the causal Markov condition. Chapter 2 ("A Theory of Inferred Causation") covers causal discovery — learning the DAG from data — which is complementary to this chapter's assumption that the DAG is given. Chapter 3 ("Causal Diagrams and the Identification of Causal Effects") is the essential reading: it presents d-separation, the backdoor criterion, the front-door criterion, and the do-calculus. This chapter corresponds most directly to Chapter 17 of this textbook. Section 3.3 (backdoor criterion) and Section 3.4 (front-door criterion) are particularly important. Chapter 7 ("The Logic of Structure-Based Counterfactuals") connects the graphical framework to potential outcomes, formalizing the equivalence between the two frameworks. The book is mathematically rigorous and rewards careful reading. For a less technical introduction, see Pearl, Glymour, and Jewell (2016) below. For Pearl's own non-technical exposition of the ideas, see The Book of Why (Pearl and Mackenzie, 2018), which presents the causal inference framework as a narrative accessible to a general audience — useful for building intuition but insufficient for implementation.
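The backdoor criterion at the heart of Pearl's Chapter 3 can be made concrete with a small simulation (not from Pearl's text; the variable names and numbers here are illustrative). A confounder Z opens a backdoor path from X to Y, so the naive regression slope is biased, while averaging stratum-specific slopes over the distribution of Z (backdoor adjustment) recovers the true effect:

```python
import numpy as np

# Illustrative simulation: Z confounds X -> Y.
# Structural model: X = Z + noise, Y = 2*X + 3*Z + noise,
# so the true causal effect of X on Y is 2.
rng = np.random.default_rng(0)
n = 200_000
Z = rng.integers(0, 2, n)            # binary confounder
X = Z + rng.normal(0, 1, n)          # treatment depends on Z
Y = 2 * X + 3 * Z + rng.normal(0, 1, n)

# Naive slope of Y on X is biased by the open backdoor path X <- Z -> Y.
naive = np.cov(X, Y)[0, 1] / np.var(X)

# Backdoor adjustment: estimate the slope within each stratum of Z,
# then average the stratum-specific slopes weighted by P(Z = z).
adjusted = sum(
    (np.cov(X[Z == z], Y[Z == z])[0, 1] / np.var(X[Z == z])) * np.mean(Z == z)
    for z in (0, 1)
)
print(round(naive, 2), round(adjusted, 2))  # naive overshoots 2; adjusted ~= 2
```

The stratify-and-average step is exactly the adjustment formula of Pearl's Section 3.3 applied to a one-variable backdoor set.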
2. Carlos Cinelli, Andrew Forney, and Judea Pearl, "A Crash Course in Good and Bad Controls" (Sociological Methods & Research, 2022)
A systematic treatment of the question this chapter addresses informally in Section 17.8: which variables should be included in a regression (or other adjustment method) and which should be excluded? Cinelli, Forney, and Pearl enumerate 18 distinct graph structures — covering confounders, mediators, colliders, descendants of each, M-bias structures, and combinations — and classify each as a good control, bad control, or neutral. The paper is a direct translation of the backdoor criterion into practical guidelines for applied researchers.
Reading guidance: The paper is organized around a series of small DAGs (Figures 1-18), each illustrating one configuration. For each configuration, the authors state whether conditioning on the variable helps (reduces bias), hurts (introduces bias), or is neutral. Table 1 summarizes all 18 cases and is worth printing and keeping at hand during any observational analysis. The most important cases for practitioners are: (1) the classic confounder (good control), (2) the mediator (bad control — blocks the causal path), (3) the collider (bad control — opens a spurious path), (4) the M-bias structure (conditioning on the "M" node introduces bias even though it appears to be a pre-treatment variable), and (5) the descendant of the mediator (bad control — partially blocks the causal path). The paper is accessible without heavy mathematical prerequisites and is the single best practical reference for variable selection in observational causal analyses. For the underlying theory, see Pearl (2009) Chapter 3.
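The collider case, arguably the least intuitive of the "bad controls," can be demonstrated in a few lines (an illustrative simulation in the spirit of the paper's collider figure, not taken from it). X has no effect on Y, but both cause C; selecting on C manufactures a spurious association:

```python
import numpy as np

# Collider ("bad control") example: X and Y are marginally independent,
# but both cause C. Conditioning on C opens a spurious path.
rng = np.random.default_rng(1)
n = 200_000
X = rng.normal(size=n)
Y = rng.normal(size=n)          # no causal effect of X on Y
C = X + Y + rng.normal(size=n)  # collider

marginal = np.corrcoef(X, Y)[0, 1]      # ~0: the correct "no effect" answer
subset = C > 1.0                        # "controlling" for C by selection
conditional = np.corrcoef(X[subset], Y[subset])[0, 1]  # spuriously negative
print(round(marginal, 2), round(conditional, 2))
```

Among units with high C, a high X makes a high Y less necessary, which is why the conditional correlation turns negative even though the variables are causally unrelated.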
3. Jonas Peters, Dominik Janzing, and Bernhard Schölkopf, Elements of Causal Inference: Foundations and Learning Algorithms (MIT Press, 2017)
A modern treatment of causal inference that bridges the graphical models community and the machine learning community. Peters, Janzing, and Schölkopf cover structural causal models, independence-based causal discovery, score-based methods, and the connection between causal inference and distribution shift (domain adaptation). The book is more compact than Pearl (2009) and focuses on the algorithmic aspects of causal reasoning.
Reading guidance: Chapter 2 ("Two Examples") provides concrete illustrations of why prediction and causation differ — useful for reinforcing the intuition from Chapters 15-17 of this textbook. Chapter 4 ("Learning Causal Structure") covers the PC algorithm and other causal discovery methods — the problem of learning the DAG from data, which this chapter assumes is given. Chapter 6 ("Causal Inference in Practice") is the most applied section, discussing real-world challenges including unmeasured confounders, feedback loops, and time series. The book assumes familiarity with probability theory and basic graph theory, and is appropriate for graduate students and practitioners. Available as a free PDF from the MIT Press website.
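Constraint-based discovery methods such as the PC algorithm are built on a single primitive: a conditional independence test. A minimal Gaussian version uses partial correlation, sketched below (my own illustrative helper, not code from the book):

```python
import numpy as np

# Gaussian conditional-independence check of the kind used as the primitive
# in constraint-based discovery: partial correlation of x and y given z,
# computed by residualizing both on z and correlating the residuals.
def partial_corr(x, y, z):
    Zm = np.column_stack([np.ones_like(z), z])
    rx = x - Zm @ np.linalg.lstsq(Zm, x, rcond=None)[0]
    ry = y - Zm @ np.linalg.lstsq(Zm, y, rcond=None)[0]
    return np.corrcoef(rx, ry)[0, 1]

rng = np.random.default_rng(2)
n = 100_000
z = rng.normal(size=n)
x = z + rng.normal(size=n)      # common-cause structure: x <- z -> y
y = z + rng.normal(size=n)

print(round(np.corrcoef(x, y)[0, 1], 2))  # dependent marginally (~0.5)
print(round(partial_corr(x, y, z), 2))    # ~0: x independent of y given z
```

A discovery algorithm runs many such tests over candidate conditioning sets and uses the resulting (in)dependence pattern to constrain the DAG.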
4. Amit Sharma and Emre Kiciman, "DoWhy: An End-to-End Library for Causal Inference" (arXiv:2011.04216, 2020)
The technical paper behind the DoWhy library used in Section 17.11. Sharma and Kiciman propose a four-step workflow for causal inference: (1) model the problem as a causal graph, (2) identify the causal estimand using graphical criteria, (3) estimate the effect using statistical methods, and (4) refute the estimate using robustness checks. DoWhy implements this workflow in Python, integrating with EconML for advanced estimation methods.
Reading guidance: Section 2 describes the four-step workflow and motivates each step. Section 3 covers the identification module, which implements the backdoor criterion, front-door criterion, and instrumental variable identification automatically from the user-specified graph. Section 4 covers the refutation module, which provides the placebo treatment, random common cause, and data subset tests demonstrated in this chapter. For hands-on implementation, the DoWhy documentation (https://www.pywhy.org/dowhy/) provides Jupyter notebook tutorials for each identification strategy. For the broader PyWhy ecosystem (which includes DoWhy, EconML, CausalML, and gcm), see the PyWhy project page. The paper and library are essential for anyone implementing graphical causal models in a production data science pipeline.
5. Miguel A. Hernán and James M. Robins, Causal Inference: What If (Chapman & Hall/CRC, 2020)
A comprehensive textbook that covers causal inference from both the potential outcomes and graphical perspectives, written by two leading epidemiologists. Hernán and Robins are unusual in synthesizing both frameworks throughout the book, rather than treating them separately. The treatment of DAGs is particularly strong, with extensive examples from public health and medicine.
Reading guidance: Part I ("Causal Inference Without Models," Chapters 1-10) covers the foundations: causal effects, randomized experiments, observational studies, effect modification, and interaction. Chapter 6 ("Graphical Representation of Causal Effects") introduces DAGs in the epidemiological context and covers the three junction types, d-separation, and the backdoor criterion. Chapter 7 ("Confounding") uses DAGs to formalize confounding, which directly complements this chapter. Chapter 8 ("Selection Bias") uses DAGs to explain collider bias and selection bias — the "bad controls" of Section 17.8. Part II ("Causal Inference With Models," Chapters 11-20) covers IP weighting, standardization, instrumental variables, and causal survival analysis. The book is freely available online (https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/) and uses a consistent notation throughout. It is the best reference for practitioners who want to see how graphical causal models are applied in real epidemiological studies, and it provides an excellent bridge between this chapter (theory) and Chapter 18 (estimation methods).
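IP weighting, the workhorse of Part II, can be sketched with simulated data (an illustrative example with an assumed data-generating process, not one of the book's): weight each unit by the inverse of its probability of receiving the treatment it actually received, which mimics a randomized trial in the reweighted sample.

```python
import numpy as np

# Inverse-probability (IP) weighting: binary confounder Z, binary treatment T
# with P(T=1|Z) depending on Z. True causal effect of T on Y is 2.
rng = np.random.default_rng(4)
n = 500_000
Z = rng.integers(0, 2, n)
pT = np.where(Z == 1, 0.8, 0.2)            # treatment assignment depends on Z
T = rng.random(n) < pT
Y = 2 * T + 3 * Z + rng.normal(size=n)

naive = Y[T].mean() - Y[~T].mean()          # confounded contrast

# Weight each unit by 1 / P(T = t_i | Z_i); here the propensities are known,
# in practice they would be estimated (e.g., by logistic regression).
w = np.where(T, 1 / pT, 1 / (1 - pT))
ipw = (np.sum(w * T * Y) / np.sum(w * T)
       - np.sum(w * ~T * Y) / np.sum(w * ~T))
print(round(naive, 2), round(ipw, 2))       # naive is biased; ipw ~= 2
```

In DAG terms, the weighting removes the arrow from Z into T, closing the backdoor path that biases the naive contrast.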