Further Reading: Descriptive Statistics
You've now learned the core toolkit of descriptive statistics — center, spread, and shape. If any of these topics sparked your curiosity, here are resources to take you deeper.
Tier 1: Verified Sources
David Spiegelhalter, The Art of Statistics: How to Learn from Data (Basic Books, 2019). If this chapter resonated with you, Spiegelhalter's book is the natural next step. He covers descriptive statistics, probability, and inference with extraordinary clarity, using real-world examples from crime data, medical trials, and everyday life. His approach — intuition first, formulas second — matches the philosophy of our textbook. Chapters 1-4 are especially relevant to what we covered here.
Charles Wheelan, Naked Statistics: Stripping the Dread from the Data (W. W. Norton, 2013). Wheelan is a former correspondent for The Economist who writes about statistics the way a journalist would — with stories, humor, and a relentless focus on "why should I care?" His chapters on descriptive statistics are excellent for building intuition, and his treatment of the Central Limit Theorem (which you'll meet in Chapter 21) is one of the most accessible out there.
Darrell Huff, How to Lie with Statistics (W. W. Norton, 1954; reprint 1993). Yes, 1954. This tiny, illustrated book is one of the best-selling statistics books of all time, and it's still devastatingly relevant. Huff's examples of how averages, graphs, and samples can be manipulated to mislead are the perfect companion to our Case Study 1. It's a two-hour read and will permanently sharpen your critical thinking about numbers in the news.
Edward Tufte, The Visual Display of Quantitative Information (Graphics Press, 2nd edition, 2001). If the Anscombe's Quartet section made you think more carefully about the relationship between statistics and visualization, Tufte is the master. This beautifully designed book explores how to present data honestly and effectively. It's as much about what not to do (what Tufte calls "chartjunk") as what to do well.
Allen B. Downey, Think Stats: Exploratory Data Analysis in Python (O'Reilly, 2nd edition, 2014). Downey takes a computational approach to statistics — learn by coding, not by memorizing formulas. His treatment of distributions, descriptive statistics, and probability (all in Python) aligns closely with our approach. The book is freely available online as well.
Tier 2: Attributed Resources
Francis Anscombe, "Graphs in Statistical Analysis" (1973). Published in The American Statistician, Vol. 27, No. 1. This is the original paper introducing Anscombe's Quartet — the four datasets with identical summary statistics but different visual patterns. It's short, readable, and one of the most cited papers in statistics. If you want to read the original argument for why visualization matters alongside computation, this is it.
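If you'd like to check Anscombe's claim yourself before reading the paper, here is a minimal sketch using only Python's standard library. The data values are the quartet as commonly reproduced from the 1973 paper; the names `QUARTET`, `pearson_r`, and `summarize` are our own illustrative choices, not anything from the paper or from a library.

```python
import math
import statistics

# Anscombe's four datasets, as commonly reproduced from the 1973 paper.
# Datasets I-III share the same x values; dataset IV has its own.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed from its definition."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def summarize(xs, ys):
    """Return the summary statistics Anscombe compared across datasets."""
    return {
        "mean_x": statistics.mean(xs),
        "mean_y": statistics.mean(ys),
        "var_x": statistics.variance(xs),  # sample variance (n - 1)
        "var_y": statistics.variance(ys),
        "r": pearson_r(xs, ys),
    }


summaries = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}

for name, stats in summaries.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

The four printed rows should agree to about two decimal places, which is exactly why the numbers alone can't tell the datasets apart. Drawing a scatter plot of each pair is what reveals the differences.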
Justin Matejka and George Fitzmaurice, "Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing" (2017). Published in the Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems. This paper introduces the "Datasaurus Dozen" — a modern, dramatic extension of Anscombe's Quartet that includes a dataset shaped like a dinosaur. It's fun, visually striking, and makes the same point with even more force.
John Tukey's contributions to exploratory data analysis. Tukey invented the box plot, coined the term "bit" in computing, and wrote Exploratory Data Analysis (Addison-Wesley, 1977), which is the intellectual foundation for much of what we covered in this chapter. His emphasis on looking at data before modeling it was revolutionary at the time and is now standard practice. His book is technical but historically important.
Hans Rosling's Gapminder data visualizations. Rosling's famous TED talks and the Gapminder Foundation's interactive tools (search for "Gapminder tools") are excellent examples of descriptive statistics done right: they combine careful measurement with compelling visualization to tell the story of global health and development. They are directly relevant to Case Study 2.
Recommended Next Steps
- If you want deeper statistical intuition: Read Spiegelhalter or Wheelan. Both are written for people who are curious about data but anxious about math.
- If you want more Python practice: Work through the relevant chapters of Downey's Think Stats. His computational approach complements what we've done here.
- If you're interested in how statistics can mislead: Read Huff's How to Lie with Statistics. It's short, it's fun, and it will make you a better critical thinker about every number you see in the news.
- If you want to explore real health data: Visit the WHO's Global Health Observatory or the World Bank's Open Data portal. Both provide free, downloadable datasets that you can analyze using everything you learned in this chapter.
- If you're ready to move on: Chapter 20 introduces probability thinking, the framework for reasoning about uncertainty and randomness. We'll build intuition through Python simulation before touching any formulas.