Chapter 17 Exercises: Poll Aggregation
Conceptual Exercises
Exercise 17.1 — The Mathematics of Averaging
A Senate race has the following six polls:

- Poll A: Democrat +4 (n=600, MOE ±4.0)
- Poll B: Democrat +1 (n=800, MOE ±3.5)
- Poll C: Republican +2 (n=400, MOE ±4.9)
- Poll D: Democrat +3 (n=1,200, MOE ±2.8)
- Poll E: Democrat +5 (n=500, MOE ±4.4)
- Poll F: Democrat +2 (n=900, MOE ±3.3)
a) Compute the simple (unweighted) average of these six polls.
b) Compute a sample-size-weighted average, using the actual sample sizes as weights.
c) How much does the weighted average differ from the simple average in this case?
d) In what kind of scenario would you expect the weighting to make a larger difference?
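If you want to check your arithmetic for parts (a) and (b), a minimal sketch is below. Margins are coded as Democratic leads, with a negative value for Poll C's Republican lead:

```python
# Poll margins (Democratic lead; negative = Republican lead) and sample sizes
margins = [4, 1, -2, 3, 5, 2]          # Polls A-F
sizes   = [600, 800, 400, 1200, 500, 900]

simple   = sum(margins) / len(margins)
weighted = sum(m * n for m, n in zip(margins, sizes)) / sum(sizes)

print(f"Simple average:   D+{simple:.2f}")    # D+2.17
print(f"Weighted average: D+{weighted:.2f}")  # D+2.34
```

Note that the weighted average shifts toward the larger polls (D and F); with polls this similar in size, the shift is small.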
Exercise 17.2 — Identifying House Effects
A pollster (let's call them Apex Research) has released polls in seven competitive Senate races this cycle. In each case, compare their result to the average of all other pollsters in that race:
| Race | Apex Result | Other Pollsters Avg | Apex vs. Average |
|---|---|---|---|
| AZ-Sen | R+3 | D+1 | ______ |
| NC-Sen | R+2 | R+0 | ______ |
| GA-Sen | R+5 | D+2 | ______ |
| PA-Sen | R+1 | D+3 | ______ |
| WI-Sen | R+4 | D+1 | ______ |
| NV-Sen | R+2 | D+2 | ______ |
| OH-Sen | R+3 | R+1 | ______ |
a) Fill in the "Apex vs. Average" column. What is the average lean?
b) What is the likely explanation for this pattern?
c) If you were aggregating these polls, how would you adjust Apex's results?
d) What methodological features would you look for to understand the source of this house effect?
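One way to check part (a) is to code every margin on a single scale and take differences. The sketch below uses Democratic lead as positive (so R+3 becomes −3) and defines the lean so that a positive number means Apex is more Republican than the field:

```python
# Margins coded as Democratic lead (negative = Republican lead).
# Lean = other-pollster average minus Apex, so positive values mean
# Apex's results sit to the Republican side of the field.
races = {
    "AZ-Sen": (-3,  1),
    "NC-Sen": (-2,  0),
    "GA-Sen": (-5,  2),
    "PA-Sen": (-1,  3),
    "WI-Sen": (-4,  1),
    "NV-Sen": (-2,  2),
    "OH-Sen": (-3, -1),
}

leans = {race: others - apex for race, (apex, others) in races.items()}
avg_lean = sum(leans.values()) / len(leans)
print(leans)
print(f"Average lean: R+{avg_lean:.1f}")  # R+4.0
```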
Exercise 17.3 — Recency Weighting Design
You are building a simple aggregation model with recency weighting. You have the following polls in a gubernatorial race:
| Poll | Date | Result | Days Before Election |
|---|---|---|---|
| Poll 1 | Jan 15 | D+8 | 295 |
| Poll 2 | Mar 3 | D+5 | 248 |
| Poll 3 | Jun 1 | D+4 | 158 |
| Poll 4 | Aug 15 | D+2 | 83 |
| Poll 5 | Sep 20 | D+1 | 47 |
| Poll 6 | Oct 20 | R+1 | 17 |
a) Compute a simple average of all six polls.
b) Design a recency weighting scheme. Describe your approach in words, then apply it.
c) How sensitive is your result to the specific decay function you chose?
d) What does this race look like if you use only the last three polls? The last two?
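There is no single correct weighting scheme for part (b); one common choice is exponential decay. The sketch below assumes a 30-day half-life (a poll taken 30 days further from the election counts half as much), which is exactly the kind of tunable choice part (c) asks you to stress-test:

```python
# (Democratic lead, days before election); negative lead = Republican
polls = [(8, 295), (5, 248), (4, 158), (2, 83), (1, 47), (-1, 17)]

# Exponential decay with an assumed 30-day half-life; vary this
# constant to answer part (c).
HALF_LIFE = 30.0

def weight(days_out: float) -> float:
    return 0.5 ** (days_out / HALF_LIFE)

num = sum(margin * weight(d) for margin, d in polls)
den = sum(weight(d) for _, d in polls)
print(f"Simple average:   D+{sum(m for m, _ in polls) / len(polls):.2f}")  # D+3.17
print(f"Recency-weighted: D+{num / den:.2f}")
```

With this half-life the January and March polls contribute almost nothing, so the recency-weighted figure sits near the late polls rather than the early double-digit leads.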
Analytical Exercises
Exercise 17.4 — Evaluating an Aggregator
Choose any competitive election from a recent election cycle (2016–2024). Find the final polling average from at least two aggregators (RCP, FiveThirtyEight, etc.) and compare:
a) What was each aggregator's final polling average?
b) What was the actual election result?
c) Which aggregator was closer to the actual result?
d) Can you identify any polls that were included or excluded differently by different aggregators?
e) Was there evidence of herding in the final polls? (Hint: compare the distribution of individual poll results to what you would expect from the margins of error.)
Exercise 17.5 — The Influence Problem
Read about the 2016 Senate race in Indiana (Todd Young vs. Evan Bayh), which shifted dramatically in the final two weeks. Using news archives if needed:
a) How did aggregators rate the race at the start of October?
b) How did the rating change in the final two weeks?
c) Can you find any evidence that the early ratings affected campaign resource allocation?
d) What does this race illustrate about the limits of aggregation?
Exercise 17.6 — Herding Detection
Here are 12 polls from a competitive Senate race, all taken in a 30-day window. The reported margin of error for each is ±4 points.
Results (Democratic lead in percentage points): +4, +3, +4, +3, +5, +3, +4, +4, +3, +4, +5, +3
a) Compute the mean and standard deviation of these results.
b) If these polls were truly independent, with the stated margins of error, what standard deviation would you expect to see across the polls?
c) Is there evidence of herding? Explain your reasoning.
d) What would you do differently as a consumer of these polls if you suspected herding?
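For parts (a) and (b), the sketch below computes the observed spread and one common approximation of the expected spread. The mapping from a ±4-point MOE on each candidate's share to a standard deviation of the *margin* is an assumption you should defend or replace in your own answer:

```python
import statistics

margins = [4, 3, 4, 3, 5, 3, 4, 4, 3, 4, 5, 3]  # Democratic leads

observed_sd = statistics.pstdev(margins)
print(f"Mean:        {statistics.mean(margins):.2f}")  # 3.75
print(f"Observed SD: {observed_sd:.2f}")               # 0.72

# Under independence, a ±4-point MOE on each candidate's share implies
# a standard error of about 4 / 1.96 ~ 2.0 points on the share, and
# roughly double that on the margin (the two shares move in opposition).
expected_sd_margin = 2 * (4 / 1.96)
print(f"Expected SD of margins: ~{expected_sd_margin:.1f}")  # ~4.1
```

An observed spread several times tighter than what independent sampling would produce is the statistical signature of herding.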
Applied Exercises
Exercise 17.7 — Build Your Own Average
Select a current or recent competitive statewide race. Collect at least 8 polls from the last 60 days of the campaign (use polling databases like 538, RCP, or Ballotpedia).
Build three versions of the polling average:

1. A simple average
2. A recency-weighted average (you design the weighting)
3. A quality-weighted average (use FiveThirtyEight's pollster grades if available)
Compare the three results. Write 300-400 words discussing which average you trust most and why.
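A starter scaffold for the three averages is below. The sample rows, the letter grades, and the grade-to-weight mapping are all placeholder assumptions; replace them with the polls and pollster ratings you actually collect:

```python
# Each poll: (margin, days_before_election, pollster_grade)
# All rows and grades below are illustrative placeholders.
polls = [
    (3, 45, "A"), (1, 30, "B"), (4, 22, "A-"),
    (2, 14, "C"), (0, 9, "B+"), (2, 5, "A"),
]

# Hypothetical mapping from letter grade to weight.
GRADE_WEIGHTS = {"A": 1.0, "A-": 0.9, "B+": 0.8, "B": 0.7, "C": 0.4}

def weighted_avg(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

margins = [m for m, _, _ in polls]
simple  = sum(margins) / len(margins)
recency = weighted_avg(margins, [0.5 ** (d / 21) for _, d, _ in polls])  # assumed 21-day half-life
quality = weighted_avg(margins, [GRADE_WEIGHTS[g] for _, _, g in polls])

print(f"Simple:  {simple:+.2f}")
print(f"Recency: {recency:+.2f}")
print(f"Quality: {quality:+.2f}")
```

Comparing how far the three numbers diverge, and which polls drive the divergence, gives you the raw material for the written discussion.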
Exercise 17.8 — The Aggregator Audit
On the final day before a recent election, compile the final polling averages from at least three different aggregators (RCP, 538, Economist, DDHQ, etc.) for the same race.
a) Tabulate the averages side by side.
b) Identify any races where aggregators differed by more than 2 percentage points.
c) For one such divergent race, investigate which polls each aggregator included. What explains the difference?
d) After the election, compute which aggregator was most accurate.
Discussion Questions
Discussion 17.1
Jake Rourke argues that aggregators are a "black box run by people who want Garza to win." Is this a reasonable concern, or motivated reasoning? What would convince you that a particular aggregator was ideologically biased in its methodology? What evidence would you look for?
Discussion 17.2
Suppose you discovered that FiveThirtyEight's model had a historically Republican-leaning house effect — not through any ideological intent, but because of how their pollster rating system is calibrated. What would you want them to do about it? How would you evaluate whether their correction was adequate?
Discussion 17.3
A media outlet argues that publication of polling averages suppresses turnout on the winning side. You are the editor of a polling aggregation website. Do you have any ethical obligation to the democratic process that goes beyond accurate reporting? What, if anything, would you change about your publication practices in light of this concern?