Chapter 17 Key Takeaways: Poll Aggregation
Core Concepts
1. Aggregation reduces random error — not systematic error. The fundamental statistical insight behind poll aggregation is the Central Limit Theorem: averaging multiple independent measurements reduces the variance of the combined estimate, with the standard error shrinking by the square root of the number of polls. A ten-poll average therefore has roughly one-third (1/√10) the margin of error of a single poll. But this benefit applies only to random error. If all polls share a systematic bias — because they all use similar likely voter screens, or all miss the same demographic — averaging them gives you a very precise estimate of the wrong thing.
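The random-vs-systematic distinction can be seen in a small simulation. This sketch uses hypothetical values (a 2-point true margin, a 3-point margin of error) to show that averaging ten polls shrinks random scatter by about √10, while a shared bias passes through the average untouched:

```python
import random
import statistics

random.seed(42)

TRUE_MARGIN = 2.0   # hypothetical true margin, in points
MOE = 3.0           # each poll's 95% margin of error -> SE ≈ MOE / 1.96
SE = MOE / 1.96

def poll(bias=0.0):
    """One simulated poll: truth + shared bias + random sampling error."""
    return TRUE_MARGIN + bias + random.gauss(0, SE)

# Averaging 10 unbiased polls shrinks the spread by ~sqrt(10) ≈ 3.2x.
single = [poll() for _ in range(10_000)]
averaged = [statistics.mean(poll() for _ in range(10)) for _ in range(10_000)]
print(statistics.stdev(single))    # ≈ 1.53 (SE of one poll)
print(statistics.stdev(averaged))  # ≈ 0.48 (SE / sqrt(10))

# A shared 2-point bias survives averaging untouched: the estimate is
# precise but centered on the wrong value.
biased = [statistics.mean(poll(bias=2.0) for _ in range(10)) for _ in range(10_000)]
print(statistics.mean(biased))     # ≈ 4.0, not the true 2.0
```

The averaged estimates are three times tighter, but no amount of averaging moves the biased estimate back toward the truth.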
2. Weighted averages outperform simple averages. Simple averages treat all polls as equally reliable. Sophisticated aggregations weight by quality (pollster's historical track record), recency (more recent polls tell you more about current opinion), and sample size (larger polls are statistically more precise). Each weighting dimension adds accuracy, but each involves modeling choices that are themselves debatable.
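The three weighting dimensions can be combined multiplicatively, as in this sketch. The exponential recency decay, the 14-day half-life, the √n precision proxy, and the quality grades are all illustrative assumptions, not any particular aggregator's formula:

```python
import math
from dataclasses import dataclass

@dataclass
class Poll:
    margin: float    # candidate lead in points
    days_old: int    # days before "today" the poll ended
    n: int           # sample size
    quality: float   # pollster grade in [0, 1], from historical accuracy

def weight(p: Poll, half_life: float = 14.0) -> float:
    """Illustrative weight: quality x recency decay x sqrt(n) precision."""
    recency = 0.5 ** (p.days_old / half_life)  # assumed 14-day half-life
    return p.quality * recency * math.sqrt(p.n)

def weighted_average(polls):
    total = sum(weight(p) for p in polls)
    return sum(weight(p) * p.margin for p in polls) / total

# Two hypothetical polls: a fresh, high-quality, large one vs. a stale,
# lower-quality, smaller one.
polls = [
    Poll(margin=3.0, days_old=2,  n=1200, quality=0.9),
    Poll(margin=1.0, days_old=20, n=600,  quality=0.6),
]
print(weighted_average(polls))  # pulled well toward the fresher, better poll
```

A simple average of these two polls would say 2.0; the weighted average lands much closer to the fresh poll's 3.0, which is the intended behavior — and also illustrates why the choice of half-life and quality grades is itself debatable.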
3. House effects are systematic, detectable, and adjustable. Individual pollsters have consistent partisan leans — usually reflecting methodological choices like LV screen design, weighting procedures, and mode of interview. These house effects can be detected by comparing pollster results to the average of other pollsters in the same race. Good aggregators adjust for house effects before averaging; simple aggregators don't.
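The detection method described above — comparing each pollster to the average of the *other* pollsters in the same race — can be sketched as a leave-one-out comparison. The margins below are hypothetical:

```python
from statistics import mean

# Hypothetical margins (points) from repeated polls of the same race.
results = {
    "Pollster A": [4.0, 3.5, 4.5],
    "Pollster B": [1.0, 0.5, 1.5],
    "Pollster C": [2.5, 2.0, 3.0],
}

def house_effects(results):
    """Estimate each pollster's lean vs. the average of the OTHER pollsters.

    Leaving the pollster out of its own baseline keeps its numbers from
    diluting the very effect being measured.
    """
    effects = {}
    for name, margins in results.items():
        others = [m for other, ms in results.items() if other != name for m in ms]
        effects[name] = mean(margins) - mean(others)
    return effects

print(house_effects(results))
# Pollster A leans +2.25 toward the leader, B leans -2.25, C is neutral.
```

An aggregator that adjusts for house effects would subtract each pollster's estimated lean from its results before averaging; a simple aggregator would average the raw margins and let whichever lean is overrepresented pull the result.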
4. The aggregator ecosystem includes multiple approaches with different trade-offs. RealClearPolitics offers transparency at the cost of methodological sophistication. FiveThirtyEight offers methodological rigor at the cost of full transparency. Cook, Sabato, and others offer expert qualitative judgment that integrates information beyond polling data. Decision Desk HQ focuses on quantitative model precision. Monitoring multiple aggregators and understanding their methodological differences gives more information than relying on any single source.
5. Aggregators can influence what they measure. By rating races and displaying polling averages, aggregators shape campaign resource allocation, media coverage, and potentially voter behavior. This feedback loop — where measurement affects the quantity being measured — is one of the most philosophically challenging aspects of election forecasting. Awareness of this dynamic should make both producers and consumers of aggregations more thoughtful.
6. Herding corrupts the independence assumption. The statistical power of aggregation depends on polls being genuinely independent. When pollsters "herd" — adjusting their results toward the consensus to avoid being outliers — the aggregate reflects where pollsters think the race is rather than independent estimates of where it is. Herding is detectable (look for distributions too tight for the stated margins of error) but hard to fully correct.
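The detection heuristic mentioned above — a distribution too tight for the stated margins of error — can be sketched as a simple spread check. The 0.5 threshold and the poll numbers are illustrative assumptions, not a formal statistical test:

```python
from statistics import pstdev

def herding_flag(margins, moe=3.0, z=1.96):
    """Flag a suspiciously tight poll distribution.

    Independent polls with a 95% MOE of `moe` should scatter with a
    standard deviation of roughly moe/1.96 around the true margin even
    if opinion is perfectly stable; a much smaller observed spread
    suggests herding. The 0.5 cutoff here is an illustrative threshold.
    """
    expected_sd = moe / z
    observed_sd = pstdev(margins)
    return observed_sd < 0.5 * expected_sd, observed_sd, expected_sd

# Ten hypothetical polls all within half a point of each other, each
# claiming a 3-point margin of error:
tight = [2.0, 2.2, 1.9, 2.1, 2.0, 2.3, 1.8, 2.0, 2.1, 1.9]
flagged, observed, expected = herding_flag(tight)
print(flagged, observed, expected)  # True: spread ≈ 0.14 vs ≈ 1.53 expected
```

Ten honest, independent polls with 3-point margins of error should disagree with each other far more than these do; their very agreement is the tell.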
Practical Implications
For reading aggregations:
- Always look at which polls are included, not just the headline average
- Check whether aggregators are adjusting for house effects
- Be more confident when multiple methodologically different aggregators agree; investigate divergences
- Remember that the aggregate measures current opinion, not a prediction of Election Day outcomes
For producing polls that enter aggregations:
- Methodological transparency improves quality ratings and ultimately quality weighting
- Disclosing likely voter screen design, weighting procedures, and crosstabs earns transparency credit
- Timing matters enormously — early polls contribute little to late aggregates due to recency weighting
- Understanding your own house effects and disclosing them builds long-term credibility
Connections to Other Chapters
- Chapter 8 (Sampling): The statistical logic of averaging builds on the Central Limit Theorem introduced in the sampling discussion
- Chapter 9 (Fielding): Methodological choices in fielding (LV screens, weighting) determine house effects and quality ratings
- Chapter 18 (Fundamentals Models): Polls-plus models combine poll aggregation with structural economic and political variables
- Chapter 19 (Probabilistic Forecasting): Polling averages are inputs to probabilistic models; converting an average to a win probability requires additional machinery
- Chapter 20 (When Models Fail): The 2016 and 2020 cases show how systematic errors can corrupt even sophisticated aggregations
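The "additional machinery" needed to turn an average into a win probability (Chapter 19) can be previewed in one line: assume the Election Day margin is normally distributed around the polling average with some total uncertainty, then ask how much of that distribution sits above zero. The 4-point uncertainty below is a hypothetical value that must cover systematic error, not just sampling noise:

```python
from math import erf, sqrt

def win_probability(avg_margin: float, sigma: float) -> float:
    """P(true margin > 0), assuming the Election Day margin is normal
    around the polling average with standard deviation `sigma`.

    sigma must account for total error (including systematic error),
    not just the sampling noise left after aggregation.
    """
    # Normal CDF evaluated at avg_margin / sigma, via the error function.
    return 0.5 * (1 + erf(avg_margin / (sigma * sqrt(2))))

# A 2-point polling lead is far from a sure thing once total error
# is ~4 points:
print(round(win_probability(2.0, 4.0), 2))  # ≈ 0.69
```

This is why a clear polling lead can still correspond to a modest win probability — and why understating sigma (for example, by ignoring correlated systematic error, as in the Chapter 20 cases) produces overconfident forecasts.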
Key Terms
- Poll aggregation: Combining multiple polls into a single estimate to reduce random error
- House effect: The systematic partisan lean of a specific pollster, detected by comparing their results to other pollsters' results in the same race
- Herding: Pollsters adjusting results toward the consensus average to avoid being outliers
- Likely voter screen: The set of questions used to determine which respondents are probably going to vote; different screens produce different electorates
- Recency weighting: Giving higher weight to polls taken closer to Election Day
- Quality weighting: Giving higher weight to pollsters with better historical accuracy and methodological transparency
- Pollster rating: A grade assigned by aggregators based on historical accuracy and methodology quality