Further Reading: Designing Studies — Sampling and Experiments

Books (Start Here)

Wheelan, C. (2013). Naked Statistics: Stripping the Dread from the Data. W.W. Norton & Company. Chapters 5 and 10 cover sampling, bias, and polling in the same conversational style as this textbook. The Literary Digest poll gets a vivid retelling, and Wheelan's treatment of polling errors is both entertaining and rigorous. If you read one supplementary book for this chapter, make it this one.

Salsburg, D. (2001). The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century. Henry Holt and Company. A narrative history of statistics told through the people who developed it. Chapter 1 covers the famous "lady tasting tea" experiment that helped Ronald Fisher develop the foundations of experimental design. The book shows how the concepts in this chapter were born — through real stories of scientists struggling to make sense of data. Accessible and genuinely entertaining.

Freedman, D., Pisani, R., & Purves, R. (2007). Statistics (4th ed.). W.W. Norton & Company. Chapters 1 and 2 provide the clearest textbook treatment of observational studies vs. experiments that I've encountered. The explanations of confounding and the Salk vaccine trial are particularly strong. More formal than Wheelan but still very readable.

Ellenberg, J. (2014). How Not to Be Wrong: The Power of Mathematical Thinking. Penguin Press. Chapters 4-6 cover survivorship bias (with the famous WWII airplane story told in full), the misuse of statistical significance, and how to think about uncertainty. Ellenberg writes for a general audience and connects mathematical ideas to real-world decision-making.

Articles and Papers

Squire, P. (1988). "Why the 1936 Literary Digest Poll Failed." Public Opinion Quarterly, 52(1), 125-133. The definitive academic analysis of the Literary Digest disaster. Squire carefully disentangles the effects of selection bias and nonresponse bias, showing that both contributed to the failure. A clear, well-written paper that's accessible to introductory students.

Kohavi, R., Tang, D., & Xu, Y. (2020). Trustworthy Online Controlled Experiments: A Practical Guide to A/B Testing. Cambridge University Press. Written by experimentation leaders from Microsoft, Google, and LinkedIn, this is the most comprehensive guide to A/B testing in industry. It covers experimental design, common pitfalls, and the statistical methods used by major tech companies. More advanced than this chapter but invaluable for anyone pursuing a career in data science or product management. The first three chapters are accessible to beginners.

Kramer, A. D. I., Guillory, J. E., & Hancock, J. T. (2014). "Experimental evidence of massive-scale emotional contagion through social networks." Proceedings of the National Academy of Sciences, 111(24), 8788-8790. The Facebook emotional contagion study discussed in Case Study 2. Reading the actual paper — which is only three pages — is a great exercise in applying the causal claims checklist and thinking about research ethics. Note the methodology, the sample size, and the ethical controversy.

Wald, A. (1943). "A Method of Estimating Plane Vulnerability Based on Damage of Survivors." Statistical Research Group, Columbia University. Abraham Wald's original WWII memo on survivorship bias in aircraft armor placement. The original document is technical, but searching for "Abraham Wald survivorship bias" will bring up many accessible summaries and visualizations. The story illustrates how statistical thinking can literally save lives.

Videos

Veritasium — "Is Most Published Research Wrong?" (YouTube, ~13 min) An excellent video explaining how study design flaws — including confounding, small samples, and publication bias — contribute to the replication crisis in science. Connects to the replication crisis preview from Chapter 1's case study and previews hypothesis testing concepts from Chapter 13.

Veritasium — "What is NOT Random?" (YouTube, ~4 min) A short, fun exploration of what "random" really means and why it's harder to achieve than you'd think. Useful for building intuition about randomization.

3Blue1Brown — "But what is a random variable?" (YouTube, ~12 min) While more mathematical than needed for this chapter, Grant Sanderson's visual explanations build deep intuition about randomness and probability that will pay off throughout the course. Watch this one when you're ready for the next level.

TED Talk — "Irene Pepperberg: What studying talking birds tells us about animal intelligence" (or search for "Clever Hans") The story of Clever Hans — a horse that appeared to solve math problems but was actually reading unconscious cues from his trainer — is one of the most famous examples of why blinding matters. Many YouTube videos and articles tell this story well. It's a vivid reminder that experimenters can inadvertently influence results.

Interactive and Online Resources

Seeing Theory — "Frequentist Inference" and "Basic Probability" (seeing-theory.brown.edu) A beautifully designed interactive website from Brown University that visualizes statistical concepts. The sampling section lets you draw samples from a population and see how different sampling methods produce different results. Spend 15-20 minutes here after this chapter to build visual intuition.

RANDOM.ORG (random.org) A website that generates true random numbers using atmospheric noise. Useful for understanding what "random" means (as opposed to pseudorandom computer algorithms) and for actually running random sampling if you're doing a class project. Try generating a random sample from a numbered list.
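If you want to try the suggested exercise without leaving your own computer, a few lines of Python will draw a simple random sample from a numbered list. (This is a minimal sketch: Python's `random` module is pseudorandom, unlike RANDOM.ORG's atmospheric-noise numbers, but it is fine for a class project; the roster size of 200 and sample size of 10 are just illustrative choices.)

```python
import random

# A numbered "population": students 1 through 200 on a hypothetical class roster
population = list(range(1, 201))

# Fix the seed so the sample is reproducible; omit this line for a fresh draw.
random.seed(42)

# Simple random sample of 10, drawn without replacement.
sample = random.sample(population, k=10)
print(sorted(sample))
```

Drawing without replacement (`random.sample` rather than repeated `random.choice` calls) matches the usual definition of a simple random sample: no unit can appear twice.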

Pew Research Center — Survey Methodology (pewresearch.org/methods) Pew is one of the most respected polling organizations in the world, and their methodology page explains exactly how they design surveys, select samples, and weight responses. Reading how professionals handle the challenges from this chapter — nonresponse bias, sampling frames, cell-phone-only households — is enlightening.

FiveThirtyEight — "How FiveThirtyEight's House, Senate, and Governor Models Work" (fivethirtyeight.com) A detailed explanation of how polling aggregation works, including how different polls are weighted based on their methodology. Excellent for seeing how the concepts in this chapter play out in real-world political forecasting.

Podcasts

Cautionary Tales — "LaLa Land: Surviving a Satisfying Screwup" and other episodes (pushkin.com) Tim Harford's podcast tells stories of human error, many of which involve statistical thinking gone wrong. The episodes on survivorship bias, the planning fallacy, and groupthink all connect to themes in this chapter.

More or Less: Behind the Statistics (BBC Radio 4 / BBC Sounds) The BBC's statistics fact-checking program examines claims made in the media and evaluates the underlying evidence. Nearly every episode involves questions about study design, sampling, and whether correlations can be interpreted as causation. Listening to a few episodes is one of the best ways to develop the critical evaluation skills from Section 4.8.

Looking Ahead

The concepts in this chapter are foundational for everything that follows. Here's where they'll come back:

  • Chapter 5 (Graphs): You'll visualize data — but the graphs are only as trustworthy as the data collection process behind them
  • Chapter 11 (Sampling Distributions): You'll formalize the idea of sampling variability — how much a sample statistic can differ from the population parameter
  • Chapter 12 (Confidence Intervals): You'll quantify the uncertainty that comes from using a sample instead of the whole population
  • Chapter 13 (Hypothesis Testing): You'll formally test whether observed differences (like Alex's A/B test results) are statistically significant
  • Chapter 16 (Comparing Two Groups): You'll learn the exact statistical tests Alex used to evaluate her A/B test
  • Chapter 17 (Power and Effect Sizes): You'll learn how to determine the sample size needed to detect a meaningful effect
  • Chapters 22-23 (Regression): You'll learn techniques for statistically controlling confounding variables in observational data
  • Chapter 26 (Statistics and AI): You'll return to the question of biased training data and algorithmic fairness
  • Chapter 27 (Ethical Data Practice): You'll explore the ethics of data collection and experimentation in depth