Chapter 21 Exercises: Data Journalism and Statistical Literacy

Instructions

Exercises are organized by section. Problems marked (Applied) require interpreting real-world-style data. Problems marked (Coding) require Python or spreadsheet analysis. Problems marked (Research) require consulting external sources.

Part A: Foundational Concepts (Sections 21.1–21.2)

Exercise 21.1 — Mean vs. Median: The Distortion of Averages

A neighborhood has ten households with the following annual incomes (in thousands of dollars): $38, $42, $45, $51, $53, $58, $62, $68, $74, $1,400.

(a) Calculate the mean household income.
(b) Calculate the median household income.
(c) Calculate the incomes at the 25th and 75th percentiles.
(d) A real estate developer says "the average household in this neighborhood earns over $189,000 a year." Is this statement technically accurate? Is it misleading? What would a more honest characterization say?
(e) Sketch a histogram of this distribution. What shape does it have, and what does that shape tell you about when mean and median diverge most?
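
One way to check parts (a) through (c) is with Python's standard statistics module. The sketch below is only a starting point: the variable names are illustrative, and the "inclusive" quartile method is one of several conventions, so hand-calculated percentiles may differ slightly.

```python
import statistics

# Household incomes in thousands of dollars, as listed in the exercise.
incomes = [38, 42, 45, 51, 53, 58, 62, 68, 74, 1400]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)

# ASSUMPTION: the "inclusive" method treats the data as the whole population.
# Other textbooks and tools use different interpolation rules.
q1, q2, q3 = statistics.quantiles(incomes, n=4, method="inclusive")

print(f"mean:   {mean_income:.1f}")
print(f"median: {median_income:.1f}")
print(f"25th percentile: {q1:.1f}, 75th percentile: {q3:.1f}")
```

Removing the single largest income and rerunning the script is a quick way to see how strongly one outlier pulls the mean while leaving the median almost unchanged.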

Exercise 21.2 — Absolute vs. Relative Risk: Drug Trial Analysis

A clinical trial tests a new blood pressure medication over five years. Among 2,000 patients in the control group, 120 experienced a major cardiovascular event (heart attack or stroke). Among 2,000 patients in the treatment group, 72 experienced a major cardiovascular event.

(a) Calculate the event rate in the control group (as a percentage).
(b) Calculate the event rate in the treatment group (as a percentage).
(c) Calculate the absolute risk reduction (ARR).
(d) Calculate the relative risk reduction (RRR).
(e) Calculate the Number Needed to Treat (NNT) to prevent one cardiovascular event.
(f) The drug company's press release states: "New drug reduces cardiovascular risk by 40%." Is this accurate? Write a more complete and honest summary of the drug's benefit.
(g) The drug costs $3,000 per year. Using the NNT and the five-year trial duration, calculate the cost per event prevented. Does this change your assessment of the drug's value?
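
The relationships among event rates, ARR, RRR, and NNT can be sketched in a few lines of Python. The function name risk_summary and its dictionary keys are hypothetical, chosen only for this illustration. Rounding NNT up to the next whole patient is a common convention rather than a rule.

```python
import math

def risk_summary(events_control, n_control, events_treated, n_treated):
    """Return event rates, ARR, RRR, and NNT for a two-arm trial."""
    control_rate = events_control / n_control
    treated_rate = events_treated / n_treated
    arr = control_rate - treated_rate   # absolute risk reduction
    rrr = arr / control_rate            # relative risk reduction
    nnt = math.ceil(1 / arr)            # ASSUMPTION: round up to a whole patient
    return {
        "control_rate": control_rate,
        "treated_rate": treated_rate,
        "arr": arr,
        "rrr": rrr,
        "nnt": nnt,
    }

# Counts taken from the exercise.
summary = risk_summary(120, 2000, 72, 2000)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Printing both the relative and the absolute figure side by side makes it easier to see why a press release might prefer one over the other.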

Exercise 21.3 — Base Rate Neglect: Airport Security Screening

A transportation security agency deploys a behavioral detection program in which an algorithm flags individuals for secondary screening. You are given the following information:
- In any given week, 0.1% of airline passengers are carrying prohibited materials.
- The algorithm correctly identifies 90% of passengers who are carrying prohibited materials (sensitivity = 90%).
- The algorithm incorrectly flags 5% of passengers who are not carrying prohibited materials (false positive rate = 5%).

Assume 1,000,000 passengers pass through screening in a given week.

(a) How many passengers are actually carrying prohibited materials?
(b) Of these, how many does the algorithm correctly identify?
(c) How many passengers without prohibited materials are incorrectly flagged?
(d) What is the positive predictive value (PPV) of a positive result, that is, the probability that a flagged passenger is actually carrying prohibited materials?
(e) A politician argues that because the algorithm is "90% accurate," it should be expanded. Evaluate this claim using your calculations. What does "90% accurate" actually mean here, and why is it insufficient to justify the program?
(f) What would the base rate of prohibited materials need to be for the PPV to reach 50%?
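
Working in whole-passenger counts (natural frequencies) often makes the base rate problem easier to see than working in probabilities. The following is a minimal sketch under that approach. The function name screening_counts is hypothetical, and the inputs are the figures stated in the exercise.

```python
def screening_counts(population, base_rate, sensitivity, false_positive_rate):
    """Break a screened population into true and false positives and compute PPV."""
    carriers = population * base_rate
    non_carriers = population - carriers
    true_positives = carriers * sensitivity
    false_positives = non_carriers * false_positive_rate
    ppv = true_positives / (true_positives + false_positives)
    return carriers, true_positives, false_positives, ppv

carriers, true_pos, false_pos, ppv = screening_counts(
    population=1_000_000,
    base_rate=0.001,
    sensitivity=0.90,
    false_positive_rate=0.05,
)

print(f"carriers:        {carriers:,.0f}")
print(f"true positives:  {true_pos:,.0f}")
print(f"false positives: {false_pos:,.0f}")
print(f"PPV:             {ppv:.2%}")
```

Calling the function again with a range of base_rate values is one way to explore part (f) numerically before solving it algebraically.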

Exercise 21.4 — Sample Size and Precision

A polling organization conducts three polls on the same day about the same candidate preference question. The results are:

Poll A: n = 200, Candidate X leads with 54%
Poll B: n = 1,000, Candidate X leads with 51%
Poll C: n = 5,000, Candidate X trails with 49%

(a) Calculate the approximate margin of error (at 95% confidence) for each poll.
(b) For each poll, state the range within which the true population value is estimated to fall.
(c) Polls A and C give contradictory impressions of who is ahead. Using your margin-of-error calculations, explain what a statistically careful analyst would conclude.
(d) A news website headlines Poll A with "Candidate X leads by 4 points!" Is this responsible reporting? What should the headline say instead?
(e) What minimum sample size would be needed to achieve a margin of error of ±1 percentage point at 95% confidence?
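
The usual textbook approximation for a 95% margin of error is 1.96 times the square root of p(1 - p)/n. The sketch below applies it to the three polls. It assumes simple random sampling and the normal approximation, which real polls only approximate, and the function names are hypothetical.

```python
import math

Z_95 = 1.96  # critical value for a 95% confidence level

def margin_of_error(p, n, z=Z_95):
    """Approximate margin of error for a sample proportion p with sample size n."""
    return z * math.sqrt(p * (1 - p) / n)

def required_sample_size(moe, p=0.5, z=Z_95):
    """Sample size giving the target margin of error; p = 0.5 is the worst case."""
    return z ** 2 * p * (1 - p) / moe ** 2

polls = {"A": (0.54, 200), "B": (0.51, 1000), "C": (0.49, 5000)}
for name, (p, n) in polls.items():
    moe = margin_of_error(p, n)
    print(f"Poll {name}: {p:.0%} +/- {moe:.1%} "
          f"(range {p - moe:.1%} to {p + moe:.1%})")

# Round the result up to the next whole respondent when reporting it.
print(f"n for +/-1 point: {required_sample_size(0.01):.1f}")
```

Because the margin shrinks with the square root of n, quadrupling the sample only halves the margin of error, which is worth keeping in mind for part (e).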

Part B: Statistical Misuse (Section 21.3)

Exercise 21.5 — Cherry-Picking Timeframes: Crime Statistics

The table below shows the violent crime rate (per 100,000 population) in a fictional city across ten years:

Year	Violent Crime Rate
2014	412
2015	438
2016	451
2017	445
2018	412
2019	389
2020	356
2021	331
2022	318
2023	305

Mayor A took office in January 2018. Her challenger, Candidate B, is running against her in 2024.

(a) Mayor A's campaign claims that "crime has fallen nearly 30% during Mayor A's tenure." Is this accurate? Calculate the exact percentage change from 2018 to 2023.
(b) A critic claims that "crime rose sharply under the previous mayor and Mayor A has merely benefited from a pre-existing trend." Evaluate this claim by looking at the full data series.
(c) What would be the most honest starting point for assessing Mayor A's impact on crime?
(d) Identify two different timeframes that a political opponent could use to make Mayor A look bad, and two that a supporter could use to make her look good. Calculate the percentage change for each framing.
(e) What additional information would you need to make a more rigorous assessment of mayoral impact on crime rates?
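
Comparing timeframes by hand is tedious, so a small helper can make the cherry-picking visible. The sketch below uses the table from this exercise. The helper name pct_change and the choice of start years in the loop are illustrative only.

```python
# Violent crime rate per 100,000 population, copied from the exercise table.
rates = {
    2014: 412, 2015: 438, 2016: 451, 2017: 445, 2018: 412,
    2019: 389, 2020: 356, 2021: 331, 2022: 318, 2023: 305,
}

def pct_change(start_year, end_year):
    """Percentage change in the crime rate between two years in the table."""
    start, end = rates[start_year], rates[end_year]
    return 100 * (end - start) / start

# The same end point looks quite different depending on where the window starts.
for start_year in (2014, 2016, 2017, 2018):
    print(f"{start_year} to 2023: {pct_change(start_year, 2023):+.1f}%")
```

Changing the end year as well as the start year gives the remaining framings asked for in part (d).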

Exercise 21.6 (Applied) — Misleading Axes: Recreating the Deception

The following describes a real chart (reconstruct it from the description):

A bar chart shows federal income tax revenue for two years. Year 1: $1.53 trillion. Year 2: $1.78 trillion. The chart's y-axis runs from $1.4 trillion to $1.9 trillion.

(a) Calculate the actual percentage increase in tax revenue between the two years.
(b) If a viewer judges the second bar to be approximately 2.5 times the height of the first bar on the truncated chart, what percentage increase would that imply?
(c) Redraw the chart correctly (by hand or computationally) with the y-axis starting at zero. How does the visual impression change?
(d) Under what circumstances (if any) would a truncated y-axis be defensible for a bar chart? Construct an example where you would and one where you would not use a truncated axis.
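
The distortion from a truncated axis can be quantified by comparing the ratio of the drawn bar heights with the ratio of the underlying values. The sketch below does this for the chart described above. The function name drawn_height_ratio is hypothetical, and the geometric ratio it reports may differ from what a viewer estimates by eye.

```python
def drawn_height_ratio(value_a, value_b, axis_start=0.0):
    """Ratio of bar heights as drawn when the y-axis begins at axis_start."""
    return (value_b - axis_start) / (value_a - axis_start)

year_1, year_2 = 1.53, 1.78  # federal income tax revenue, trillions of dollars

honest = drawn_height_ratio(year_1, year_2, axis_start=0.0)
truncated = drawn_height_ratio(year_1, year_2, axis_start=1.4)

print(f"actual increase:              {100 * (year_2 - year_1) / year_1:.1f}%")
print(f"height ratio, zero baseline:  {honest:.2f}")
print(f"height ratio, axis from 1.4:  {truncated:.2f}")
```

The same function can be applied to any bar chart with a nonzero baseline to estimate how much the picture exaggerates the difference.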

Exercise 21.7 — Correlation and Causation: Identifying Causal Structures

For each of the following correlations, identify which causal structure(s) — A causes B, B causes A, C causes both A and B, or spurious coincidence — seem most plausible. Justify your reasoning.

(a) Ice cream sales and drowning deaths both spike in summer months.
(b) Countries with higher rates of television ownership have lower rates of child mortality.
(c) Students who take more notes in class receive higher grades.
(d) Firefighters are more likely to be present at large fires than at small ones.
(e) Shoe size is correlated with reading ability in elementary school children.
(f) Neighborhoods with more police officers have higher crime rates.
(g) Hospital patients who receive more aggressive treatment have worse outcomes.

Exercise 21.8 — The Replication Crisis: Understanding p-Values

(a) Explain in plain language what it means when a study reports p < 0.05.
(b) If a researcher conducts 20 independent tests of null hypotheses that are all actually true, how many would you expect to produce p < 0.05 by chance?
(c) Explain "researcher degrees of freedom" and list at least four legitimate analytical choices that could affect whether a study achieves p < 0.05.
(d) What is pre-registration, and how does it address the p-hacking problem?
(e) The Open Science Collaboration (2015) attempted to replicate 100 psychology studies. Approximately 60 failed to replicate. Does this mean 60% of published psychology findings are false? What other explanations should be considered?
(f) A news article reports: "Scientists confirm that power poses boost confidence and hormonal levels." The study cited has n = 42 and was conducted in 2010. What should a statistically literate reader want to know before accepting this conclusion?
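
For part (b), the expected count of false positives is only half the story; the chance of getting at least one is often more striking. A short sketch, assuming the 20 tests are independent and every null hypothesis is true, as the exercise states:

```python
alpha = 0.05   # significance threshold
n_tests = 20   # independent tests, all with a true null hypothesis

expected_false_positives = alpha * n_tests
prob_at_least_one = 1 - (1 - alpha) ** n_tests

print(f"expected false positives:       {expected_false_positives:.1f}")
print(f"P(at least one false positive): {prob_at_least_one:.1%}")
```

Raising n_tests shows how quickly a "significant" result becomes almost guaranteed when many comparisons are run and only the successes are reported.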

Part C: Polling Methodology (Section 21.4)

Exercise 21.9 — Evaluating Poll Quality

Read the following (fictional) poll description and answer the questions:

"A new poll from the Coalition for Healthy Schools finds that 84% of parents support mandatory physical education every school day. The poll surveyed 1,247 registered parents who responded to an online invitation sent through the Coalition's email list. Margin of error: ±2.8 points."

(a) Identify at least four methodological concerns with this poll.
(b) Is the stated margin of error meaningful for this poll? Explain why or why not.
(c) Who commissioned the poll, and why might this matter?
(d) Suggest the type of question wording that might have produced the 84% figure. Then suggest an alternative wording that might produce a substantially different result.
(e) Describe what a methodologically sound poll on this topic would look like. Specify: population, sampling method, sample size, question wording approach, and how you would report the results.

Exercise 21.10 — Question Wording Effects: Designing Biased and Unbiased Questions

(a) Write a leading question about immigration policy that would be likely to produce a more restrictive response.
(b) Write an alternative leading question on the same topic that would be likely to produce a more permissive response.
(c) Write a balanced question on the same topic that follows best practices for neutrality.
(d) Explain three specific techniques pollsters use to write unbiased questions, and identify the opposite (biased) technique for each.
(e) What is a "push poll" and how does it differ from a genuine opinion poll? Give an example.

Exercise 21.11 — Margin of Error and Significance

A presidential approval poll shows the incumbent at 48% approval, with a margin of error of ±3 points. The previous month, the same poll showed 52% approval with the same margin of error.

(a) What are the 95% confidence intervals for each poll result?
(b) Do the two confidence intervals overlap? What does overlap mean for interpreting the apparent change?
(c) A news headline says "President's approval drops sharply." Is this supported by the statistical evidence? What headline would be more accurate?
(d) What sample size would be needed to make a 4-point decline statistically significant at the 95% level?
(e) Beyond sampling error, list three non-sampling sources of error that could affect this poll's accuracy.
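
Checking whether two intervals overlap is a rough test. A more direct approach compares the change with the margin of error of the difference, which for two independent samples is the square root of the sum of the squared margins. The sketch below assumes the two monthly polls are independent samples and that both stated margins are at the 95% level. The function name is hypothetical.

```python
import math

def moe_of_difference(moe_1, moe_2):
    """Margin of error for the difference between two independent estimates."""
    return math.hypot(moe_1, moe_2)  # sqrt(moe_1**2 + moe_2**2)

previous, current = 52.0, 48.0  # approval, in percentage points
moe_each = 3.0                  # stated margin of error for each poll

change = current - previous
moe_change = moe_of_difference(moe_each, moe_each)
significant = abs(change) > moe_change

print(f"change: {change:+.1f} points, margin on the change: +/-{moe_change:.2f}")
print(f"statistically significant at 95%: {significant}")
```

Note that the margin on a change is larger than the margin on either poll alone, which is why month-to-month movements need to be fairly large before they mean much.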

Part D: Data Visualization (Section 21.5)

Exercise 21.12 (Applied) — Chart Identification and Critique

For each of the following chart scenarios, identify the deceptive technique being used and explain how it misleads:

(a) A line chart showing unemployment rate uses a y-axis from 3.5% to 5.5%. The line appears to nearly double over two years, but the actual increase is from 3.8% to 4.6%.

(b) A 3D pie chart shows four categories. One slice (27%) faces the viewer while the others recede into the background. The 27% slice appears to occupy about 40% of the visual space.

(c) A choropleth map of voter turnout by county shows rural counties in deep color and urban counties in light color. A news commentator says "clearly, rural areas are more engaged in democracy than cities."

(d) A dual-axis chart shows GDP growth (left axis, 0–6%) and unemployment rate (right axis, 3.5–5.5%). The lines appear to move in perfect lockstep, suggesting they are of equal magnitude.

(e) A bar chart comparing two programs uses a y-axis starting at 78%. Program A scores 79%, Program B scores 84%. Program B's bar appears to be six times as tall as Program A's.

Exercise 21.13 (Coding) — Creating Honest Visualizations

Using Python (matplotlib, seaborn) or a spreadsheet program:

(a) Create a bar chart comparing five countries' GDP growth rates (2%, 2.5%, 1.8%, 3.1%, 0.9%) using both a zero-baseline y-axis and a truncated y-axis (starting at 0.5%). Describe the difference in visual impression.

(b) Create a choropleth map sketch (or description of how you would create one) of voter turnout by US state, then explain what a cartogram version would look like and why it might be more informative.

(c) Create a scatter plot showing a spurious correlation between two variables of your choosing. Then add a third variable represented by color to reveal the confounder.

Part E: Scientific Studies (Section 21.6)

Exercise 21.14 — Reading Study Abstracts

Read the following (constructed) abstract:

"We conducted a randomized controlled trial of mindfulness-based stress reduction (MBSR) versus waitlist control in 87 patients with moderate anxiety. After 8 weeks, MBSR participants showed significant improvement on the GAD-7 anxiety scale compared to controls (p = 0.03, Cohen's d = 0.28). Effect sizes were in the small range. Secondary outcomes including sleep quality and quality of life showed non-significant trends in the predicted direction."

(a) What is the sample size, and is it adequate to detect small effects reliably?
(b) What does p = 0.03 tell you? What doesn't it tell you?
(c) Cohen's d = 0.28 is considered a "small" effect. What does this mean in practical terms for the patients who participated?
(d) The secondary outcomes are not significant. Why is it important that the authors reported this?
(e) What would you want to know about the control condition, the therapists delivering MBSR, and the population studied before generalizing these results?
(f) How would you characterize this finding in a news article? Write a two-sentence description that is accurate about both what was found and its limitations.

Exercise 21.15 — Confounders and Study Design

A large observational study of 500,000 adults finds that people who drink red wine regularly have a 25% lower rate of cardiovascular disease than non-drinkers.

(a) List at least five plausible confounders that could explain this association without wine having any direct cardiovascular benefit.
(b) Explain what "residual confounding" means and why it is a fundamental limitation of observational studies.
(c) Describe the study design that would provide the most convincing evidence for or against a causal protective effect of red wine on cardiovascular disease. What are the practical obstacles to conducting this study?
(d) A newspaper headline reads: "Drink Red Wine to Protect Your Heart, Study Says." What would you change about this headline and why?

Part F: Health and Economic Statistics (Sections 21.7–21.8)

Exercise 21.16 — NNT in Context

A new preventive treatment for a common illness is under evaluation. Two different framings of trial results are provided:

Framing 1 (pharmaceutical company): "The treatment reduces serious illness risk by 75% in eligible adults."

Framing 2 (independent analysis): The baseline rate of serious illness in the eligible population over the trial period was 0.4%. The treatment group had a rate of 0.1%.

(a) Verify that the 75% relative risk reduction is consistent with the absolute numbers given.
(b) Calculate the ARR and NNT.
(c) The treatment has a side effect (mild nausea) in 8% of recipients. Calculate the number needed to harm (NNH) for nausea.
(d) Given NNT and NNH, how many people experience nausea for every serious illness prevented?
(e) The treatment costs $800 per person. Calculate the cost per serious illness prevented.
(f) Write two descriptions of this treatment: one from a pharmaceutical marketing perspective (using technically accurate statistics) and one from a neutral public health perspective.
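
Expressing the trade-off per 10,000 people treated can make parts (d) and (e) more concrete than working with NNT and NNH alone. This is a minimal sketch. It assumes that all nausea cases are caused by the treatment (the exercise gives no nausea rate for untreated people), and the cohort size of 10,000 is arbitrary.

```python
cohort = 10_000          # ASSUMPTION: illustrative cohort size
baseline_rate = 0.004    # serious illness without treatment
treated_rate = 0.001     # serious illness with treatment
nausea_rate = 0.08       # ASSUMPTION: all nausea is attributable to treatment
cost_per_person = 800    # dollars

illnesses_prevented = cohort * (baseline_rate - treated_rate)
nausea_cases = cohort * nausea_rate
nausea_per_illness_prevented = nausea_cases / illnesses_prevented
cost_per_illness_prevented = cohort * cost_per_person / illnesses_prevented

print(f"illnesses prevented per {cohort:,} treated: {illnesses_prevented:.0f}")
print(f"nausea cases per {cohort:,} treated:        {nausea_cases:.0f}")
print(f"nausea cases per illness prevented:      {nausea_per_illness_prevented:.1f}")
print(f"cost per illness prevented:              ${cost_per_illness_prevented:,.0f}")
```

Whether that trade-off is acceptable depends on how severe the illness is compared with the side effect, which the numbers alone cannot settle.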

Exercise 21.17 — Unemployment Rate Definitions

Using hypothetical data, suppose a country has:
- Working-age population (ages 16–64): 200 million
- Employed full-time: 120 million
- Employed part-time who want full-time work: 8 million
- Employed part-time by choice: 12 million
- Unemployed and actively searching: 7 million
- Discouraged workers (want work but stopped searching): 4 million
- Other not in labor force (students, retirees, caregivers): 49 million

(a) Calculate the U-3 (official) unemployment rate.
(b) Calculate the labor force participation rate.
(c) Calculate the U-6 (broad) unemployment rate including discouraged and involuntary part-time workers.
(d) During a recession, the official unemployment rate barely rises while U-6 increases substantially. Explain why this could happen.
(e) A politician takes office when U-3 = 6% and leaves when U-3 = 4%. They claim this as a success. What additional information do you need to evaluate this claim?
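
The three rates differ mainly in what goes into the numerator and the denominator, which a short script makes explicit. The sketch below uses the hypothetical figures above. The U-6 formula is the simplified version implied by the exercise (discouraged workers added to both numerator and denominator, involuntary part-time workers added to the numerator), which is an assumption and not the full official definition.

```python
# All figures in millions, taken from the hypothetical data in the exercise.
working_age_population = 200
full_time = 120
involuntary_part_time = 8
voluntary_part_time = 12
unemployed = 7
discouraged = 4

employed = full_time + involuntary_part_time + voluntary_part_time
labor_force = employed + unemployed

u3 = unemployed / labor_force
participation = labor_force / working_age_population

# ASSUMPTION: simplified U-6 as described in part (c).
u6 = (unemployed + discouraged + involuntary_part_time) / (labor_force + discouraged)

print(f"U-3:                 {u3:.2%}")
print(f"participation rate:  {participation:.2%}")
print(f"U-6 (simplified):    {u6:.2%}")
```

Moving a few million people from "unemployed" to "discouraged" and rerunning the script is a useful way to approach part (d).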

Exercise 21.18 (Research) — GDP and Human Wellbeing

(a) Look up the GDP per capita of three countries with similar values but substantially different levels of life expectancy, education, and subjective wellbeing. What does this suggest about GDP as a wellbeing indicator?
(b) The United Nations Human Development Index (HDI) combines income per capita (measured as GNI per capita since 2010) with life expectancy and education. Research how the HDI ranking of the United States differs from its GDP per capita ranking. What does this tell you?
(c) What is the Genuine Progress Indicator (GPI) and how does it differ from GDP? Find one documented case where GPI and GDP give contradictory signals about economic progress.

Part G: Integrated Data Literacy (Section 21.9)

Exercise 21.19 (Applied) — The Data Literacy Checklist in Action

Apply the full data literacy checklist from Section 21.9 to the following news claim:

"A new study shows that people who eat organic food have a 25% lower risk of developing cancer. The study, published in JAMA, followed 68,000 French adults for five years."

Work through all 22 questions on the checklist and write a paragraph summarizing what the claim likely does and does not establish.

Exercise 21.20 (Applied) — Full Statistical Claim Analysis

Find a statistical claim in a current news article (from any mainstream outlet) that makes a quantitative claim about health, economics, or social science. Apply the data literacy checklist and write a 500-word analysis evaluating:

(a) What the claim asserts
(b) What statistical concepts are relevant
(c) What information is missing that would be needed to evaluate the claim
(d) How the claim should be qualified or contextualized
(e) Your assessment of whether the headline accurately represents the finding

Exercise 21.21 — Evaluating Data Journalism

Choose one data journalism piece from FiveThirtyEight, The Upshot, ProPublica, or the Guardian Data Desk published in the last two years.

(a) What dataset(s) did the journalists use? Are these primary sources, secondary sources, or both?
(b) What methodology did they use to analyze the data? Did they provide enough detail for you to evaluate it?
(c) Did the visualization accurately represent the data? Apply at least three principles from Section 21.5.
(d) Did the journalists appropriately qualify their conclusions? Did they discuss limitations?
(e) What questions does this piece leave unanswered, and what additional reporting or analysis would strengthen it?

Exercise 21.22 — Constructing a Misleading Statistic

This exercise asks you to think adversarially in order to build defensive skills.

Using publicly available data on a topic of your choice (crime, healthcare, education, economics):

(a) Construct three technically accurate but potentially misleading statistical claims using: (1) a cherry-picked timeframe, (2) relative rather than absolute framing, (3) a misleading metric.
(b) For each claim, write the accurate, contextualized version that provides the full picture.
(c) Reflect: what does this exercise reveal about how easy it is to mislead with technically accurate statistics?

Exercise 21.23 — Historical Case: Tobacco Industry Statistics

The tobacco industry used statistical arguments for decades to resist the scientific consensus on smoking and cancer. Research the following:

(a) What statistical arguments did the tobacco industry use to argue that the correlation between smoking and lung cancer was not causal?
(b) What is the "Texas sharpshooter" fallacy, and how did some tobacco-funded researchers use it?
(c) What types of evidence finally established causality strongly enough to overcome tobacco industry statistical counter-arguments?
(d) How does the tobacco playbook compare to statistical arguments made by industries facing other regulatory challenges today?

Exercise 21.24 (Coding) — Monte Carlo Simulation of Sampling Variability

Write a Python program that:

(a) Simulates a population of 1,000,000 voters with a true preference of 52% for Candidate A.
(b) Repeatedly draws samples of size n = 500 from this population and records the sample proportion favoring Candidate A.
(c) After 10,000 simulations, plots a histogram of the sample proportions.
(d) Calculates the proportion of samples that would correctly declare Candidate A ahead (i.e., where sample proportion > 50%).
(e) Repeats the simulation for n = 100, 500, 1000, 2000, and 5000, and plots how the proportion of "correct" samples changes with sample size.
(f) Uses the simulation results to explain conceptually what a margin of error is.
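
As a possible starting point, the core of the simulation can be written with only the standard library. The sketch below makes two simplifying assumptions that a full solution should revisit: it draws each vote independently with probability 0.52 (an approximation to sampling without replacement from 1,000,000 voters), and it uses 2,000 repetitions per sample size to keep the run short. Plotting is left out, and the function name is hypothetical.

```python
import random

def share_of_correct_samples(n, true_p=0.52, n_sims=2000, seed=0):
    """Fraction of simulated polls of size n that show Candidate A above 50%."""
    rng = random.Random(seed)  # fixed seed so results are repeatable
    correct = 0
    for _ in range(n_sims):
        # ASSUMPTION: independent draws approximate sampling from a large population.
        votes_for_a = sum(rng.random() < true_p for _ in range(n))
        if votes_for_a / n > 0.5:
            correct += 1
    return correct / n_sims

for n in (100, 500, 1000):
    share = share_of_correct_samples(n)
    print(f"n = {n:>4}: {share:.1%} of samples show Candidate A ahead")
```

Recording each sample proportion in a list, and not only whether it exceeded 50%, gives the data needed for the histogram in part (c).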

Exercise 21.25 (Coding) — Visualizing the Base Rate Problem

Write a Python program that:

(a) Creates a visualization of the base rate / positive predictive value relationship for a medical test.
(b) Uses the following parameters as inputs: test sensitivity, test specificity, and disease prevalence.
(c) Plots PPV as a function of disease prevalence (from 0.01% to 50%) for a test with 99% sensitivity and 99% specificity.
(d) Annotates the chart to highlight the "rare disease problem", where even very good tests have poor PPV in low-prevalence populations.
(e) Adds a second curve for a test with 90% sensitivity and 90% specificity on the same plot.
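
The quantity to be plotted follows directly from Bayes' rule, so it may help to get the calculation working before adding any charting code. The sketch below defines a hypothetical ppv function and prints a small table for both tests. The prevalence values are illustrative sample points, and the plotting itself (for example with matplotlib) is left to the reader.

```python
def ppv(prevalence, sensitivity, specificity):
    """Positive predictive value from Bayes' rule."""
    true_positive_share = sensitivity * prevalence
    false_positive_share = (1 - specificity) * (1 - prevalence)
    return true_positive_share / (true_positive_share + false_positive_share)

# Illustrative prevalence values spanning the range named in part (c).
prevalences = (0.0001, 0.001, 0.01, 0.1, 0.5)

print("prevalence   PPV (99%/99%)   PPV (90%/90%)")
for prevalence in prevalences:
    strong = ppv(prevalence, 0.99, 0.99)
    weak = ppv(prevalence, 0.90, 0.90)
    print(f"{prevalence:>9.2%}   {strong:>13.1%}   {weak:>13.1%}")
```

Because the interesting behavior happens at very low prevalence, a logarithmic x-axis is worth considering when the values are plotted.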

Exercise 21.26 — Economic Statistics Audit

The following four statements about a fictional economy are all technically accurate. Using the data provided, explain what each statement conceals:

"GDP has grown 18% over the past decade" (per capita GDP has grown about 5%; population grew 12%)
"Unemployment is at a record low 3.8%" (labor force participation is at its lowest point in 40 years at 61.2%)
"Median household income rose by $2,200 last year" (inflation was 3.8%; mean household income rose by $8,700)
"Consumer prices rose only 2.1% last year" (housing costs rose 8.2%, healthcare rose 6.4%; consumer electronics fell 15%)

For each statement: (a) identify what it conceals, (b) explain why the concealed information matters, (c) write a more complete statement.

Exercise 21.27 — Polling Audit: 2020 US Presidential Election

The 2020 US presidential election featured some of the largest polling errors in modern history, particularly in states like Wisconsin, Ohio, and Florida.

(a) Research the actual polling averages versus final results in three states with large polling errors in 2020.
(b) What explanations have been offered for the systematic underestimation of the Republican vote share?
(c) The American Association for Public Opinion Research (AAPOR) commissioned a task force to investigate. What were the main findings of that task force report?
(d) Does systematic polling error in multiple consecutive election cycles suggest that polling as an enterprise is fundamentally broken, or are there specific fixable problems? Support your answer with evidence.

Exercise 21.28 — The Replication Crisis and Its Implications

Read the abstract of the Open Science Collaboration (2015) replication study, "Estimating the reproducibility of psychological science," published in Science.

(a) What percentage of studies replicated at a p < 0.05 level?
(b) What other measures of replication success did the researchers use, and how did the picture change?
(c) Which areas of psychology showed the highest replication rates, and which the lowest? What might explain these differences?
(d) The study's authors are careful not to claim that the failed replications demonstrate the original studies were wrong. Why not? What alternative explanations do they offer?
(e) What implications does widespread non-replication have for this textbook, which cites social psychology research on misinformation, motivated reasoning, and confirmation bias?

Exercise 21.29 (Research) — A Personal Data Literacy Audit

Over the course of one week, collect ten statistical claims from news sources you regularly consume (newspapers, social media, podcasts, television).

For each claim, record:
(a) The source and medium
(b) The statistical claim, quoted as precisely as possible
(c) Whether the claim provided: base rates, absolute figures, sample size, confidence intervals, causal vs. correlational qualification
(d) Your assessment of the claim's accuracy and completeness based on the tools developed in this chapter
(e) An overall grade (A–F) for statistical responsibility

At the end of the week, write a 300-word reflection on what you observed. Which statistical errors were most common? Did the quality of statistical reporting vary by source type?

Exercise 21.30 — Capstone: Statistical Analysis of a Public Health Claim

Select a specific public health claim that has been controversial or contested in the media (examples: vaccine efficacy claims, dietary recommendations, exercise guidelines, mental health statistics, COVID-19 fatality statistics).

Write a 1,000-word analysis that:
(a) Precisely states the claim and its sources
(b) Identifies the relevant statistical concepts from this chapter
(c) Evaluates what the underlying evidence actually shows (with specific reference to effect sizes, sample sizes, study designs, and replication status)
(d) Identifies how the claim has been accurately and inaccurately framed in media coverage
(e) Concludes with your best assessment of what the evidence does and does not establish, stated with appropriate uncertainty