Chapter 8 Exercises: Sampling: Who Speaks for the Public?
Tier 1: Foundational
1. The Literary Digest Autopsy In two to three paragraphs, explain the Literary Digest's 1936 failure. Your answer should cover: (a) what they did right in terms of scale, (b) what they did wrong in terms of sampling design, (c) why having more responses made things worse rather than better, and (d) what George Gallup did differently that allowed him to correctly predict the result with a fraction of the responses.
2. Sampling Method Matching Match each scenario to the sampling method that best fits: (a) simple random sample, (b) systematic sample, (c) stratified sample, (d) cluster sample.
i. A pollster randomly selects every 50th name from an alphabetically sorted voter registration list.
ii. A survey of congressional district opinion that interviews 20 respondents in each of 20 randomly selected precincts in the district.
iii. An academic survey that assigns a random number to every adult American in the Census and selects those with numbers below a threshold.
iv. A poll that separately selects 200 respondents from each of the state's five regions, then weights each region to its actual share of the likely voter population.
3. Margin of Error Calculation Using the rule of thumb MOE ≈ 1/√n at 95% confidence, calculate the margin of error for the following samples:
a. A national poll with n = 1,024 respondents
b. A state poll with n = 600 respondents
c. A subgroup of Latino voters representing 18% of a 1,200-respondent sample (how many Latino respondents are in the sample, and what is their MOE?)
d. Two polls of the same population, each with n = 800. The first shows 48% support for Candidate X; the second shows 52%. Is this difference statistically meaningful?
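The rule-of-thumb arithmetic can be sketched in a few lines of Python (an illustrative sketch, not a full answer key; the helper name `moe` is my own):

```python
import math

def moe(n):
    """Rule-of-thumb 95% margin of error: MOE ≈ 1/sqrt(n)."""
    return 1 / math.sqrt(n)

print(f"n=1024: ±{moe(1024):.1%}")   # 1/32 = 0.03125 → ±3.1%
print(f"n=600:  ±{moe(600):.1%}")

# Subgroup MOE: compute the subgroup's own n first, then apply the rule.
latino_n = round(0.18 * 1200)        # 216 respondents
print(f"Latino subgroup (n={latino_n}): ±{moe(latino_n):.1%}")

# Difference between two independent polls: the MOEs add in quadrature,
# so the MOE of the gap is larger than either poll's own MOE.
moe_diff = math.sqrt(moe(800)**2 + moe(800)**2)
print(f"MOE of the difference: ±{moe_diff:.1%}")
```

Note that the quadrature step is what part (d) turns on: a 4-point gap sits inside a ±5-point margin on the difference.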
4. Probability vs. Nonprobability Classify each of the following as probability sampling (P) or nonprobability sampling (NP). For each nonprobability sample, identify the specific type (convenience, quota, purposive, snowball) and explain why it does not qualify as probability sampling.
a. A random-digit-dial telephone survey of U.S. adults, reaching every 10-digit number with equal probability
b. An online poll embedded in a news article, open to any reader who clicks
c. A focus group of politically active young voters recruited through social media advertisements
d. A stratified sample drawn from a state's registered voter file with known selection probabilities
e. A panel of 5,000 opt-in respondents recruited through online advertising who take surveys in exchange for reward points
5. The Sampling Frame Problem For each of the following populations, identify the most commonly used sampling frame for surveys of that group, and describe one significant way the frame fails to cover the full population:
a. U.S. registered voters
b. All U.S. adults
c. Likely voters in a specific congressional district
d. Homeowners in a suburban county
6. Coverage Bias Identification A pollster is conducting a telephone survey of adults in a large Sun Belt city with the following characteristics: 35% Latino, 15% Black, 5% Asian, 45% non-Hispanic white; 60% of households are cell-phone-only; median income is $42,000; 25% of residents are non-English-dominant speakers.
The pollster uses a landline-only RDD frame and conducts all interviews in English. Identify at least four specific coverage biases this approach will introduce, and describe the direction of each bias.
7. Weighting Intuition A survey of 1,000 respondents produces the following demographic profile:
| Group | In Sample | In Target Population |
|---|---|---|
| Women | 55% | 52% |
| Men | 45% | 48% |
| College+ | 48% | 35% |
| No college | 52% | 65% |
a. Calculate the approximate weight for each of the four cells (women, men, college+, no college).
b. If college-educated respondents favor Candidate A by 60-40 and non-college respondents favor Candidate A by 45-55, estimate how weighting will shift the top-line estimate of Candidate A's support.
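A sketch of the arithmetic, assuming each cell weight is simply target share divided by sample share (a simplification; production weighting would use joint cells or raking):

```python
# Cell weights from the table above: target share / sample share.
weights = {
    "women":      0.52 / 0.55,
    "men":        0.48 / 0.45,
    "college":    0.35 / 0.48,
    "no_college": 0.65 / 0.52,
}
for cell, w in weights.items():
    print(f"{cell:10s} weight = {w:.3f}")

# Topline for Candidate A before and after weighting on education:
# unweighted uses the sample's education mix, weighted uses the population's.
unweighted = 0.48 * 0.60 + 0.52 * 0.45
weighted   = 0.35 * 0.60 + 0.65 * 0.45
print(f"unweighted: {unweighted:.1%}, weighted: {weighted:.1%}")
```

Because the sample over-represents the group more favorable to Candidate A, weighting pulls the topline down by roughly two points.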
Tier 2: Analytical
8. Design a Stratified Sample You are designing a poll of likely voters in a state with the following regional breakdown:
| Region | Share of Likely Voters | Estimated Partisan Split |
|---|---|---|
| Major metro (2 cities) | 38% | 65D-35R |
| Suburbs | 30% | 52D-48R |
| Mid-size cities | 16% | 55D-45R |
| Rural | 16% | 35D-65R |
You have a budget for n = 1,200 total interviews. Design a stratified sampling scheme that: (a) ensures each region is adequately represented for subgroup analysis, (b) correctly weights the sample to the true regional distribution for top-line estimation, and (c) oversamples rural voters at twice the proportionate rate. Show your sample sizes for each stratum and the weights you would use.
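One way to work the allocation mechanically: allocate interviews proportional to share × oversample factor, then set each stratum's design weight to restore its true population share. A sketch (the shares and the 2× rural factor come from the exercise; everything else is illustrative):

```python
total_n = 1200
shares  = {"metro": 0.38, "suburbs": 0.30, "mid_size": 0.16, "rural": 0.16}
factors = {"metro": 1.0,  "suburbs": 1.0,  "mid_size": 1.0,  "rural": 2.0}

# Allocate interviews proportional to population share times oversample factor.
raw   = {r: shares[r] * factors[r] for r in shares}
scale = total_n / sum(raw.values())
n_h   = {r: raw[r] * scale for r in shares}

# Design weight = true population share / achieved sample share.
# Oversampled strata get weights below 1; everyone else is weighted up.
weights = {r: shares[r] / (n_h[r] / total_n) for r in shares}

for r in shares:
    print(f"{r:9s} n = {n_h[r]:6.1f}  weight = {weights[r]:.2f}")
```

The pattern to notice: every non-oversampled stratum ends up with the same weight, and the rural weight is exactly half of it.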
9. The MRP Explanation A journalist calls Meridian and asks: "How can you estimate opinions in all 120 state legislative districts when your poll only has 1,200 respondents? That's only 10 people per district." Write Vivian Park's response — a clear, jargon-minimized explanation of how MRP allows small-sample inference to small geographies. The explanation should be accessible to a non-statistician but accurate.
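For intuition while drafting the explanation: the multilevel-regression half of MRP behaves like partial pooling, where a small district's raw average is shrunk toward the average of demographically similar respondents everywhere. A toy sketch (the shrinkage constant `k` is an assumed prior strength, not a fitted value; real MRP estimates it from the data):

```python
def partial_pool(n_d, mean_d, mean_overall, k=20):
    """Blend a district's own mean with the overall mean.

    With few respondents the overall mean dominates; with many
    respondents the district's own data speaks for itself.
    """
    return (n_d * mean_d + k * mean_overall) / (n_d + k)

# 10 respondents who look unusually supportive get pulled most of the
# way back toward the overall mean of 0.50...
print(partial_pool(10, 0.70, 0.50))
# ...but 1000 respondents mostly speak for themselves.
print(partial_pool(1000, 0.70, 0.50))
```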
10. Nonresponse Bias Analysis A political poll achieves a 5% response rate. The pollster argues that this is fine because, after weighting on age, gender, race, and education, the sample profile closely matches the target population. Evaluate this argument. What does the pollster's claim actually establish? What does it not establish? Under what conditions is a 5% response rate with demographic weighting adequate, and under what conditions is it problematic?
11. The Likely Voter Screen Compare the following two likely voter screening approaches and discuss their implications for sample composition and horse-race results:
Screen A: "How likely are you to vote in the November election? Very likely, somewhat likely, not very likely, or not at all likely?" → Include all who say "very likely."
Screen B: Score-based: weight respondents by their vote propensity score from the voter file, giving higher weights to respondents with more complete voting histories.
Which screen is more methodologically defensible? What are the practical tradeoffs between them? How might they produce different top-line estimates in a race where one candidate's coalition is disproportionately composed of lower-propensity voters?
12. The 2020 Polling Error Post-election analysis of the 2020 polls found systematic underestimation of Trump's support in several Midwestern states. AAPOR's report identified differential nonresponse by educational attainment as a primary factor. Write a 400-word explanation of how differential nonresponse by education can produce systematic polling error, and describe two specific methodological changes that could reduce this problem in future cycles.
13. Raking in Practice You are weighting a sample of 800 respondents. After proportionate selection, your sample has the following profile compared to target:
| Category | In Sample | Target |
|---|---|---|
| Women | 58% | 52% |
| Men | 42% | 48% |
| Age 18-34 | 12% | 21% |
| Age 35-64 | 55% | 51% |
| Age 65+ | 33% | 28% |
| White | 72% | 60% |
| Non-white | 28% | 40% |
Describe the raking process in conceptual terms: what does the algorithm do in each iteration, and why does it need to cycle multiple times? What convergence criterion would you use to know when the algorithm is finished?
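A minimal sketch of the iterative proportional fitting ("raking") loop, run on synthetic respondents drawn with roughly the sample margins above (the synthetic data and tolerance are illustrative, not part of the exercise):

```python
import random

random.seed(1)

targets = {
    "sex":  {"women": 0.52, "men": 0.48},
    "age":  {"18-34": 0.21, "35-64": 0.51, "65+": 0.28},
    "race": {"white": 0.60, "nonwhite": 0.40},
}
sample_p = {
    "sex":  {"women": 0.58, "men": 0.42},
    "age":  {"18-34": 0.12, "35-64": 0.55, "65+": 0.33},
    "race": {"white": 0.72, "nonwhite": 0.28},
}

def draw(dist):
    """Draw one category from a {category: probability} dict."""
    r, cum = random.random(), 0.0
    for cat, p in dist.items():
        cum += p
        if r < cum:
            return cat
    return cat  # guard against float rounding

people  = [{dim: draw(sample_p[dim]) for dim in sample_p} for _ in range(800)]
weights = [1.0] * len(people)

for iteration in range(100):
    worst_gap = 0.0
    # One pass = adjust weights to match each margin in turn. Fixing the
    # race margin disturbs the sex and age margins slightly, which is why
    # the algorithm must cycle until all margins hold simultaneously.
    for dim, target in targets.items():
        total = sum(weights)
        for cat, share in target.items():
            cur = sum(w for w, p in zip(weights, people) if p[dim] == cat) / total
            worst_gap = max(worst_gap, abs(cur - share))
            if cur > 0:
                factor = share / cur
                weights = [w * factor if p[dim] == cat else w
                           for w, p in zip(weights, people)]
    if worst_gap < 1e-10:  # convergence: every margin matches to tolerance
        break

total = sum(weights)
women = sum(w for w, p in zip(weights, people) if p["sex"] == "women") / total
print(f"after {iteration + 1} passes, weighted women share = {women:.4f}")
```

The convergence criterion here (largest margin gap below a tolerance) is one common choice; capping the iteration count guards against non-convergence when margins are inconsistent.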
Tier 3: Advanced
14. The Sun Belt Sampling Challenge Trish McGovern faces the following challenge when polling the Garza-Whitfield race: the state's Latino population is concentrated in southern metro counties, is more likely to be cell-phone-only, has lower telephone response rates, and includes a significant proportion of non-English-dominant speakers. Design a comprehensive sampling strategy that addresses all four challenges. Your strategy should specify: frame, mode, language, regional oversample, and weighting approach. Justify each choice.
15. The Herding Critique A prominent data journalist argues: "The polling industry is self-correcting in the wrong direction — when all polls agree on a result, it's because pollsters are adjusting toward the consensus, not because they're independently measuring the same thing." Write a 500-word response that: (a) explains the mechanism by which herding occurs, (b) evaluates the evidence for and against the claim, and (c) proposes one methodological intervention that could reduce herding in the political polling industry.
16. MRP Implementation Plan You have access to a national survey of 3,000 respondents that includes questions about attitudes toward a specific ballot initiative. You want to estimate support for the initiative in each of the 50 states, even though many states have very few respondents. Write a 400-word conceptual implementation plan for MRP that covers: the regression model specification (which predictors, and why), the Census poststratification cells you would use, how you would validate your state-level estimates, and how you would report uncertainty.
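Whatever model is chosen, the poststratification half of MRP reduces to a population-weighted average over Census cells. A sketch of that final step (the cell counts and predicted-support values are invented for illustration):

```python
def poststratify(cells):
    """State estimate = population-weighted average of cell predictions.

    cells: list of (census_population, model_predicted_support) pairs,
    one per demographic cell in the state.
    """
    total_pop = sum(n for n, _ in cells)
    return sum(n * p for n, p in cells) / total_pop

# Hypothetical state with three cells: predictions are weighted by how
# many people are actually in each cell, not how many were surveyed.
state_cells = [(400_000, 0.62), (250_000, 0.48), (350_000, 0.55)]
print(f"estimated support: {poststratify(state_cells):.3f}")
```

This is why the regression model only needs to predict well within cells; the Census, not the sample, supplies each cell's importance.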
17. Ethics of Likely Voter Modeling A progressive organization argues: "Likely voter models that rely heavily on past voting behavior are a form of voter suppression through data — they treat low-propensity voters' preferences as less real or less important." A pollster responds: "We're predicting an election, not conducting a census — we should weight toward the people who will actually vote." Write a 500-word essay that: (a) takes a clear position on this dispute, (b) engages seriously with the strongest version of the opposing argument, and (c) proposes what you consider an ethically defensible likely voter methodology.
18. The Sample Size Decision Memo You are advising a state legislative campaign with a $15,000 polling budget. They want to know: should they field one poll of n=1,200, two polls of n=600 each (conducted three weeks apart), or one poll of n=800 with a follow-up of n=400 to track specific subgroups? Write a 400-word decision memo analyzing each option in terms of statistical precision, tracking capability, and subgroup analysis capacity. Make a recommendation and defend it.
19. The Opt-In Panel Debate A media organization publishes a poll conducted with an opt-in online panel, without disclosing that it is a nonprobability sample. They report a margin of error of ±3 percentage points, identical to what they would report for a probability sample. Write a methodological critique that: (a) explains why reporting a traditional MOE for a nonprobability sample is misleading, (b) describes what information readers would need to appropriately interpret the result, and (c) proposes an alternative disclosure standard that would be both honest and comprehensible to a general audience.
20. Synthesis: Design the Ideal Garza-Whitfield Poll You have been given a $60,000 budget to conduct the most methodologically rigorous possible poll of the Garza-Whitfield Senate race. Design the poll from the ground up: sampling frame, mode, sample size, stratification, weighting approach, likely voter screen, and disclosure standards. For each decision, explicitly state the tradeoff you are making between cost, precision, coverage, and generalizability. Conclude with a candid assessment of what your ideal poll can and cannot tell us about the race.