Chapter 21 Further Reading: Data Journalism and Statistical Literacy
Annotated Bibliography
The following sources are organized thematically. Entries marked with an asterisk (*) are especially recommended for readers new to statistical literacy.
Foundational Statistical Literacy
1. Huff, Darrell. How to Lie with Statistics*. W. W. Norton, 1954.
The most widely read introduction to statistical manipulation ever published. Huff's slim volume covers misleading averages, sample selection bias, visual trickery with charts, and the strategic omission of context — all illustrated with clear examples. Despite its age, virtually every technique Huff described is in daily use in contemporary media. The book is dated in its cultural references and its cheerful tone occasionally minimizes genuine harms, but as a first encounter with statistical skepticism it remains unmatched. Every chapter of this textbook owes something to Huff's foundational work. Read it in an afternoon.
2. Cairo, Alberto. How Charts Lie: Getting Smarter About Visual Information. W. W. Norton, 2019.
Cairo, a journalism professor and data visualization expert, systematically examines how charts mislead — not through outright fraud but through design choices that create false impressions. Coverage includes truncated axes, misleading map projections, cherry-picked data, and the gap between what charts seem to show and what they actually encode. Unlike Huff, Cairo focuses specifically on visual communication and provides a framework (not just examples) for evaluating charts. Essential reading alongside any data journalism curriculum.
3. Silver, Nate. The Signal and the Noise: Why So Many Predictions Fail — But Some Don't*. Penguin Press, 2012.
Silver's account of probabilistic thinking in domains from weather forecasting to baseball to elections is one of the most readable treatments of uncertainty and prediction in print. The book argues for Bayesian reasoning — updating beliefs in proportion to evidence rather than seeking certainty — and documents the systematic overconfidence that characterizes prediction across almost every domain. The chapters on polling, economic forecasting, and the limits of models are directly relevant to this chapter. Silver also honestly examines where FiveThirtyEight has been wrong, which is methodologically instructive.
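The Bayesian reasoning Silver advocates is mechanically simple: a belief is revised in proportion to how much more likely the evidence is under the hypothesis than under its negation. A minimal sketch, with illustrative numbers of our own choosing rather than any from the book:

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return P(hypothesis | evidence) via Bayes' theorem."""
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1 - prior) * p_evidence_if_false
    return numerator / denominator

# Hypothetical forecast: start at 30% belief, then observe a signal that
# appears 80% of the time when the hypothesis is true and 20% otherwise.
posterior = bayes_update(prior=0.30, p_evidence_if_true=0.80,
                         p_evidence_if_false=0.20)
print(f"posterior = {posterior:.3f}")  # 0.24 / (0.24 + 0.14) ≈ 0.632
```

The belief rises from 30% to about 63% — a substantial but not decisive shift, which is exactly the calibrated middle ground between ignoring evidence and overreacting to it that Silver describes.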
4. Kahneman, Daniel. Thinking, Fast and Slow. Farrar, Straus and Giroux, 2011.
Kahneman's synthesis of a career's research on cognitive biases provides the psychological foundation for understanding why statistical illiteracy is so persistent. His account of base rate neglect, availability heuristic, anchoring, and overconfidence illuminates why the statistical errors documented in this chapter are not merely educational failures but reflect deep features of human cognition. The book's treatment of "System 1" and "System 2" thinking is simplified but useful, and the later replication-crisis caveats about some specific findings (e.g., ego depletion) do not undermine the core cognitive bias research. Required background for any serious engagement with misinformation and critical thinking.
5. Wheelan, Charles. Naked Statistics: Stripping the Dread from the Data. W. W. Norton, 2013.
A highly accessible introduction to statistical reasoning for non-mathematicians. Wheelan covers probability, distributions, regression, and statistical inference with wit and clarity, using contemporary examples from policy, sports, and science. For readers who found statistics courses alienating, Naked Statistics offers a more humane entry point. The chapters on regression and causality are particularly useful for understanding why observational studies require such careful interpretation.
The Replication Crisis
6. Open Science Collaboration. "Estimating the Reproducibility of Psychological Science." Science 349, no. 6251 (2015): aac4716.
The landmark paper documenting that only about a third (36%) of replication attempts across 100 published psychology studies yielded statistically significant results. This paper launched the replication crisis as a mainstream scientific and public conversation. Reading the full paper (not just the abstract) reveals the careful methodological choices the researchers made and the genuine ambiguity in interpreting "failed" replications. The supplementary materials contain detailed information about individual study outcomes and are useful for instruction. Available open-access through the Open Science Framework.
7. Simmons, Joseph P., Leif D. Nelson, and Uri Simonsohn. "False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant." Psychological Science 22, no. 11 (2011): 1359–66.
The paper that first rigorously demonstrated that standard "researcher degrees of freedom" — legitimate-seeming choices about when to stop data collection, which variables to include, which subgroups to analyze — could produce p < 0.05 results from random data with high reliability. Required reading for understanding the mechanism of the replication crisis as distinct from deliberate fraud. The paper introduced the term "p-hacking" to a broad academic audience and recommended pre-registration as the primary remedy.
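The mechanism Simmons, Nelson, and Simonsohn describe can be reproduced in a few lines of simulation. The sketch below is an illustration of the general idea, not their exact design: on pure noise, allowing just two outcome variables plus one round of optional stopping pushes the false-positive rate well above the nominal 5%.

```python
import random
import statistics
from statistics import NormalDist

random.seed(42)
norm = NormalDist()

def p_value(a, b):
    """Two-sided p for a difference in means (normal approximation)."""
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    z = abs(statistics.mean(a) - statistics.mean(b)) / se
    return 2 * (1 - norm.cdf(z))

def flexible_study(n=30, extra=20):
    """Return True if ANY analysis path reaches p < .05 on null data."""
    g1 = [random.gauss(0, 1) for _ in range(n)]
    g2 = [random.gauss(0, 1) for _ in range(n)]
    dv2_a = [random.gauss(0, 1) for _ in range(n)]  # a second outcome variable
    dv2_b = [random.gauss(0, 1) for _ in range(n)]
    if p_value(g1, g2) < 0.05 or p_value(dv2_a, dv2_b) < 0.05:
        return True
    # Optional stopping: "collect more data and test again"
    g1 += [random.gauss(0, 1) for _ in range(extra)]
    g2 += [random.gauss(0, 1) for _ in range(extra)]
    return p_value(g1, g2) < 0.05

trials = 2000
hits = sum(flexible_study() for _ in range(trials))
print(f"false-positive rate with flexibility: {hits / trials:.1%}")
```

Even this modest flexibility roughly doubles the advertised error rate; the full set of degrees of freedom the paper documents inflates it much further.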
8. Gelman, Andrew, and Eric Loken. "The Statistical Crisis in Science." American Scientist 102, no. 6 (2014): 460–65.
Gelman and Loken's "garden of forking paths" metaphor — the idea that researchers' post-hoc analytical choices, even when made without conscious awareness of their effect on results, inflate false positive rates — is among the most useful frameworks for understanding why p-hacking does not require bad intentions. The article is accessible and short. Gelman's blog (Statistical Modeling, Causal Inference, and Social Science) extends this analysis with ongoing commentary on new research.
9. Ioannidis, John P. A. "Why Most Published Research Findings Are False." PLoS Medicine 2, no. 8 (2005): e124.
A mathematically elegant demonstration that, given typical effect sizes, sample sizes, publication bias, and researcher degrees of freedom in biomedical research, the majority of published positive findings are likely false positives. Written a decade before the replication crisis became common knowledge, this paper was initially dismissed by many as an overclaim. It has since been vindicated by empirical replication data. The mathematical modeling approach is accessible to readers with college-level probability.
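Ioannidis' core relation can be checked directly. With prior odds R that a tested relationship is true, power 1 − β, and significance threshold α, the positive predictive value of a "significant" finding is PPV = (1 − β)R / ((1 − β)R + α). A minimal sketch, using illustrative parameter values rather than any specific field's:

```python
def ppv(prior_odds, power, alpha=0.05):
    """Positive predictive value of a significant finding (no bias term)."""
    true_positives = power * prior_odds    # true relationships detected
    false_positives = alpha                # per unit odds of null relationships
    return true_positives / (true_positives + false_positives)

# Exploratory research: 1 true relationship per 10 tested, 50% power.
print(f"PPV = {ppv(prior_odds=0.1, power=0.5):.2f}")  # 0.05/(0.05+0.05) = 0.50
```

Under these plausible exploratory-research conditions, half of all "significant" findings are false positives even before publication bias and p-hacking are added, which is the heart of Ioannidis' argument.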
Health Statistics
10. Gigerenzer, Gerd. Calculated Risks: How to Know When Numbers Deceive You. Simon & Schuster, 2002.
Gigerenzer, a psychologist who has extensively studied medical decision-making, demonstrates that doctors, patients, and journalists systematically misunderstand health statistics — particularly conditional probabilities, screening test results, and treatment efficacy claims. The book's argument that "natural frequency" framing (27 out of 1,000 rather than 2.7%) dramatically improves statistical comprehension is backed by experimental evidence. The chapters on cancer screening statistics are especially important for media literacy and patient autonomy.
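The natural-frequency framing is easy to demonstrate on a screening-test example. The numbers below are hypothetical (not drawn from the book): 1% prevalence, 90% sensitivity, and a 9% false-positive rate.

```python
population = 1000
prevalence = 0.01           # 10 of 1,000 people have the disease
sensitivity = 0.90          # 9 of those 10 test positive
false_positive_rate = 0.09  # ~89 of the 990 healthy people also test positive

sick = population * prevalence
true_pos = sick * sensitivity
false_pos = (population - sick) * false_positive_rate

# Of everyone who tests positive, how many are actually sick?
ppv = true_pos / (true_pos + false_pos)
print(f"{true_pos:.0f} of {true_pos + false_pos:.0f} positives are real "
      f"(PPV = {ppv:.0%})")
```

Stated as "9 of 98 positives are real," the result is immediately intuitive; stated as conditional probabilities, most people (including, in Gigerenzer's studies, most physicians) guess a positive test means the patient is probably sick.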
11. Goldacre, Ben. Bad Science: Quacks, Hacks, and Big Pharma Flacks. Fourth Estate, 2008.
Goldacre, a physician and science journalist, provides an accessible and often blistering account of how pharmaceutical companies, media organizations, and supplement marketers misrepresent medical research. The book covers placebo effects, publication bias, trial design manipulation, ghost-writing of medical papers, and the misrepresentation of risk. Goldacre's style is polemical in places, but the underlying analysis is rigorous and the specific cases are well-documented. Bad Pharma (2012), his follow-up, extends the analysis to systematic publication bias in drug trials.
Polling Methodology
12. Pew Research Center. "Explaining Why the 2016 and 2020 Presidential Election Polls Were Misleading." Pew Research Center, 2021.
Available at pewresearch.org, this report provides a clear and current analysis of why presidential election polls systematically underestimated Republican vote share in 2016 and 2020. The report covers differential non-response by education level, the limits of likely voter models, and the specific challenges of state-level polling. Essential for understanding the limits of even technically sophisticated polling. The American Association for Public Opinion Research (AAPOR) commissioned a parallel task force report available at aapor.org that provides more technical detail.
Economic Statistics
13. Marmot, Michael. The Status Syndrome: How Social Standing Affects Our Health and Longevity. Times Books, 2004.
While not primarily a statistics textbook, Marmot's synthesis of the Whitehall studies and broader social epidemiology illustrates how economic statistics (GDP, income, poverty rates) fail to capture the mechanisms by which social status affects health. The book demonstrates that the relationship between socioeconomic position and health is a gradient that operates throughout the income distribution — not just at poverty thresholds — and that this gradient cannot be explained by material deprivation alone. Essential context for evaluating economic statistics' relationship to wellbeing.
14. Stiglitz, Joseph E., Amartya Sen, and Jean-Paul Fitoussi. Mismeasuring Our Lives: Why GDP Doesn't Add Up. New Press, 2010.
The report of the Commission on the Measurement of Economic Performance and Social Progress, established by French President Sarkozy in 2008. Stiglitz, Sen, and Fitoussi document GDP's limitations and propose alternative frameworks for measuring economic wellbeing, sustainability, and quality of life. The report is technically rigorous but accessible to non-economists. It articulates the intellectual case for alternative metrics such as the Human Development Index, the Genuine Progress Indicator, and the other measures discussed in Section 21.8.
15. Harcourt, Bernard E. Against Prediction: Profiling, Policing, and Punishing in an Actuarial Age. University of Chicago Press, 2007.
Harcourt's analysis of how actuarial risk assessment tools in criminal justice — tools that use statistical correlates of recidivism to guide sentencing and parole decisions — can perpetuate racial and socioeconomic disparities even when not explicitly designed to do so. The book provides a sophisticated treatment of how statistical tools interact with the social contexts in which they are deployed, and how optimizing for one statistical objective (predicted recidivism) can produce morally unacceptable outcomes. Directly relevant to algorithmic fairness discussions that extend from this chapter into Chapter 22.
Online Resources
- Our World in Data (ourworldindata.org): Freely available, rigorously sourced, clearly visualized data on global development, health, and economics. Each dataset includes full methodology and source documentation.
- Statista (statista.com): A statistical data portal with broad coverage of industry, demographic, and policy statistics. Useful for finding data quickly; methodology varies and should be verified for academic use.
- The NNT (thennt.com): Maintained by a team of physicians, this site provides patient-oriented summaries of treatment evidence expressed in NNT (number needed to treat) and NNH (number needed to harm) terms, independent of pharmaceutical framing.
- ClinicalTrials.gov: The US registry of clinical trials, useful for verifying whether a published drug trial was pre-registered and whether pre-registered and published outcomes match.
- Open Science Framework (osf.io): Hosts pre-registration documents, open data, and replication project materials for thousands of psychology and social science studies.
- Our World in Data's "Coronavirus Pandemic" tracker: A model of how to present complex, evolving public health data with methodological transparency and honest uncertainty communication.
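The NNT and NNH figures used by resources like thennt.com follow directly from trial event rates. A minimal sketch with hypothetical trial numbers:

```python
control_event_rate = 0.08   # 8% of untreated patients have the bad outcome
treated_event_rate = 0.06   # 6% of treated patients do
harm_rate_increase = 0.005  # treatment adds 0.5% risk of a side effect

arr = control_event_rate - treated_event_rate  # absolute risk reduction
nnt = 1 / arr                  # patients treated to prevent one bad outcome
nnh = 1 / harm_rate_increase   # patients treated to cause one harm

print(f"NNT = {nnt:.0f}, NNH = {nnh:.0f}")  # NNT = 50, NNH = 200
```

Note that the same trial could be advertised as a "25% relative risk reduction" (0.06 vs. 0.08); the NNT of 50 conveys the absolute benefit far more honestly, which is precisely why patient-oriented resources prefer it.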