Chapter 20 Further Reading: When Models Fail
Essential Reading
AAPOR Ad Hoc Committee on 2016 Presidential Election Polling (2017). An Evaluation of 2016 Election Polls in the U.S. American Association for Public Opinion Research. The definitive industry postmortem on 2016, covering nonresponse bias, late-deciding voters, and education weighting in detail. Available at aapor.org.
AAPOR Task Force on 2020 Pre-Election Polling (2021). An Evaluation of the 2020 General Election Polls. American Association for Public Opinion Research. The follow-up assessment, documenting that the systematic polling error of 2016 persisted and worsened in 2020. Identifies partisan nonresponse bias as the central mechanism.
Sturgis, P., et al. (2016). Report of the Inquiry into the 2015 British General Election Opinion Polls. Market Research Society / British Polling Council. The essential international companion to the AAPOR 2016 report, and the most rigorous official inquiry into a polling failure yet commissioned by a national polling body.
Gelman, A., & King, G. (1993). "Why Are American Presidential Election Campaign Polls So Variable When Votes Are So Predictable?" British Journal of Political Science, 23(4), 409–451. A classic paper on the dynamics of polling movement during campaigns — essential background for understanding when polls and forecasts diverge.
Forecasting and Model Failure
Silver, N. (2012). The Signal and the Noise: Why So Many Predictions Fail — but Some Don't. Penguin Press. Foundational text on the epistemology of forecasting across fields, with extensive political application. Particularly relevant chapters on election forecasting and the concept of calibration.
Taleb, N.N. (2007). The Black Swan: The Impact of the Highly Improbable. Random House. The source of the "black swan" concept frequently invoked in debates over 2016. Essential for understanding the epistemological distinction between genuine unpredictability and methodological failure.
Lewis-Beck, M.S., & Stegmaier, M. (2000). "Economic Determinants of Electoral Outcomes." Annual Review of Political Science, 3, 183–219. Classic overview of fundamentals-based forecasting models and their assumptions — background necessary to understand why fundamentals models fail when structural assumptions break down.
Enns, P.K., & Lagodny, J. (2021). "Forecasting the 2020 Presidential Election: Leading Economic Indicators, Time for Change, and the Challenge of 2020." PS: Political Science & Politics, 54(1), 58–62. Analysis of how standard fundamentals models fared in the pandemic election year.
Nonresponse and Weighting
Pew Research Center (2012). Assessing the Representativeness of Public Opinion Surveys. Pew Research Center. Documents the dramatic decline in telephone survey response rates and its implications for sample composition.
Kennedy, C., et al. (2018). "An Evaluation of the 2016 Election Polls in the United States." Public Opinion Quarterly, 82(1), 1–33. The peer-reviewed version of the AAPOR 2016 assessment, with additional statistical detail on weighting approaches and their effectiveness.
Gelman, A., et al. (2016). "How Can We Use Economic Models to Improve Polling?" Chance, 29(4), 18–25. Technical discussion of blending poll data with fundamentals to reduce nonresponse-driven bias.
Cohn, N. (2020, November 10). "Why Polls Were So Wrong About Biden's Margin." New York Times, The Upshot. Accessible post-2020 analysis from the perspective of a practitioner who built one of the major forecasting models.
Herding
Silver, N. (2014, September 22). "The Polls Are Alright." FiveThirtyEight. Early analysis of herding in polling, including the statistical test for whether the distribution of poll results is "too tight" to be consistent with independent measurement.
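The intuition behind the "too tight" test can be sketched in a few lines: if polls were independent samples, their results should scatter at least as much as binomial sampling error predicts, so an observed spread far below that expectation is evidence of herding. This is a minimal illustration of the idea, not Silver's exact procedure, and the poll numbers below are hypothetical.

```python
import math
from statistics import mean, pstdev

def herding_ratio(poll_shares, sample_sizes):
    """Ratio of the observed spread of poll results to the spread
    expected from sampling error alone. Values well below 1 suggest
    the polls are 'too tight' to be independent measurements."""
    p = mean(poll_shares)
    # Expected SD if each poll were an independent binomial sample
    # of its stated size, averaged across the polls.
    expected_sd = math.sqrt(mean(p * (1 - p) / n for n in sample_sizes))
    observed_sd = pstdev(poll_shares)
    return observed_sd / expected_sd

# Ten hypothetical polls of ~1,000 respondents clustered within one point:
polls = [0.50, 0.51, 0.50, 0.505, 0.51, 0.495, 0.50, 0.505, 0.51, 0.50]
sizes = [1000] * 10
ratio = herding_ratio(polls, sizes)  # well below 1.0: suspiciously tight
```

With 1,000-person samples, sampling error alone implies a standard deviation of roughly 1.6 points, so a cluster this tight is unlikely to arise from independent polling.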
Erikson, R.S., & Wlezien, C. (2012). The Timeline of Presidential Elections: How Campaigns Do (and Do Not) Matter. University of Chicago Press. Research on polling dynamics over the course of campaigns, including the convergence of polls toward outcomes as Election Day approaches.
International Comparisons
Australian Market and Social Research Society (2020). Australian Polling Inquiry Report. AMSRS. The Australian counterpart to the British and American official inquiries. Particularly valuable for its analysis of the transition from telephone to online panel methodology.
Jennings, W., & Wlezien, C. (2018). "Election Polling Errors across Time and Space." Nature Human Behaviour, 2(4), 276–283. Systematic cross-national analysis of polling error magnitude and direction across 45 countries and multiple decades. Provides the empirical backbone for the claim that polling failure is an international structural phenomenon.
YouGov (2017). How YouGov's 2017 General Election Model Worked. YouGov. Technical description of the MRP methodology developed after the 2015 UK failure, which accurately predicted the 2017 hung parliament.
Calibration and Epistemology of Forecasting
Tetlock, P.E. (2005). Expert Political Judgment: How Good Is It? How Can We Know? Princeton University Press. Foundational research on the accuracy of political prediction, introducing the "hedgehog vs. fox" typology and documenting systematic overconfidence among domain experts.
Tetlock, P.E., & Gardner, D. (2015). Superforecasting: The Art and Science of Prediction. Crown Publishers. The follow-up to Expert Political Judgment, focused on the practices and mindsets of high-accuracy forecasters. Directly relevant to the calibration discussion in sections 20.9–20.10.
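The calibration scoring that underpins Tetlock's research is typically the Brier score: the mean squared error between probabilistic forecasts and binary outcomes. A brief sketch, with hypothetical numbers, shows how it penalizes overconfidence:

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between probability forecasts and binary
    outcomes (1 = event occurred, 0 = it did not). Lower is better;
    an uninformative constant 0.5 forecast scores 0.25."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

# Three hypothetical races in which the favorite won two of three.
# A calibrated 70% forecast beats an overconfident 95% forecast:
calibrated = brier_score([0.70, 0.70, 0.70], [1, 1, 0])      # ~0.223
overconfident = brier_score([0.95, 0.95, 0.95], [1, 1, 0])   # ~0.303
```

The overconfident forecaster is rewarded on the two hits but punished so heavily on the miss that the calibrated forecaster scores better overall, which is the core lesson of the superforecasting literature.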
Breiman, L. (2001). "Statistical Modeling: The Two Cultures." Statistical Science, 16(3), 199–231. Influential methodological essay distinguishing prediction-focused from explanation-focused statistical modeling — essential background for the chapter's "Prediction vs. Explanation" theme.