Chapter 22 Key Takeaways: Down-Ballot and Global Forecasting



House Forecasting

House forecasting is a data-sparse problem at the individual-race level. Most competitive House races receive little or no public polling, making the primary inputs structural: generic ballot signals, historical district partisan lean, incumbency, fundraising, and forecaster ratings. Individual-race polling, when available, is layered on top of these structural signals.

The generic ballot provides the primary anchor for national-level seat projections. But translating a generic ballot margin into a seat projection requires modeling the geographic distribution of votes across districts — a highly variable relationship that reflects geographic sorting, the location of competitive seats, and the nonlinearity of the seats-votes curve.
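To make the seats-votes translation concrete, here is a minimal sketch, assuming a uniform-swing model with one shared national error and independent district errors. The function name, the error sizes, and both ten-district maps are hypothetical illustrations, not values from the chapter.

```python
import random


def simulate_seats(district_leans, generic_margin,
                   national_sd=3.0, district_sd=6.0,
                   n_sims=2000, seed=0):
    """Monte Carlo seat counts for party A under uniform swing.

    district_leans: each district's margin for party A relative to the
        national margin, in points (hypothetical inputs).
    generic_margin: generic ballot margin for party A, in points.
    national_sd, district_sd: assumed error sizes, in points.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_sims):
        # One shared national error per simulation, so districts move together.
        national = rng.gauss(generic_margin, national_sd)
        seats = 0
        for lean in district_leans:
            margin = national + lean + rng.gauss(0.0, district_sd)
            if margin > 0:
                seats += 1
        counts.append(seats)
    counts.sort()
    return {
        "mean": sum(counts) / n_sims,
        "p05": counts[int(0.05 * n_sims)],
        "p95": counts[int(0.95 * n_sims)],
    }


# Two hypothetical 10-district maps with the same average lean (0 points).
# "packed" concentrates party A's voters in three very safe seats;
# "spread" distributes them across many marginal seats.
packed = [30, 30, 30, -4, -4, -4, -4, -4, -35, -35]
spread = [12, 8, 5, 3, 1, -1, -3, -5, -8, -12]

for name, leans in (("packed", packed), ("spread", spread)):
    print(name, simulate_seats(leans, generic_margin=2.0))
```

Running both maps at the same generic ballot margin shows how the geographic distribution of votes, not just the national margin, drives the seat count.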

Senate Forecasting Limitations

Senate forecasting faces a small-sample problem: with only 10–15 competitive races per cycle, the calibration data that any single cycle provides is thin. Candidate quality effects, which can move individual Senate races 5+ points from the generic partisan baseline, are real and important but difficult to model systematically. The 2022 Republican candidate quality problem is the clearest recent example.
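A rough way to see how thin one cycle of calibration data is: the sketch below computes the range of "hit" counts that would be consistent with a well-calibrated forecast. It assumes races are independent and all carry the same stated probability, which is a simplification (real Senate races share national error, so the true picture is even noisier). The function names and the 70% figure are hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k hits in n independent races."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def central_hit_range(n, p, coverage=0.90):
    """Equal-tailed range of hit counts for a calibrated forecast.

    lo is the smallest k whose cumulative probability exceeds the lower
    tail; hi is the smallest k whose cumulative probability reaches
    1 - tail.
    """
    tail = (1 - coverage) / 2
    cdf = 0.0
    lo = hi = None
    for k in range(n + 1):
        cdf += binom_pmf(k, n, p)
        if lo is None and cdf > tail:
            lo = k
        if hi is None and cdf >= 1 - tail:
            hi = k
    if hi is None:  # guard against floating-point shortfall in the sum
        hi = n
    return lo, hi


# Hypothetical: favorites given a 70% chance in 12 races (one cycle)
# versus 120 races (roughly ten cycles pooled).
for n in (12, 120):
    lo, hi = central_hit_range(n, 0.70)
    print(f"{n} races: {lo} to {hi} hits are consistent with calibration")
```

With a dozen races, a wide span of outcomes is compatible with a well-calibrated model, so a single cycle can neither confirm nor refute calibration with much confidence.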

MRP: Power and Limits

MRP is the most significant methodological innovation in subnational electoral estimation of the past two decades. Its core contribution: producing district- or constituency-level estimates from national surveys by combining individual-level demographic regressions with Census-based poststratification. YouGov's 2017 UK constituency model demonstrated that a large-sample national survey, appropriately modeled, can predict constituency outcomes that individual constituency polls miss.
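The poststratification half of MRP can be sketched in a few lines. This example assumes the multilevel regression step has already produced a support estimate for each demographic cell; those cell estimates, the cell labels, and the two districts' population counts are all hypothetical numbers chosen for illustration.

```python
def poststratify(cell_support, cell_counts):
    """Weight cell-level support by how many people each cell contains.

    cell_support: modeled support for a party in each demographic cell
        (assumed output of a multilevel regression, not computed here).
    cell_counts: Census-style population count for each cell in one district.
    """
    total = sum(cell_counts.values())
    weighted = sum(cell_support[cell] * n for cell, n in cell_counts.items())
    return weighted / total


# Hypothetical cell estimates shared across all districts.
cell_support = {
    ("under_45", "degree"): 0.60,
    ("under_45", "no_degree"): 0.50,
    ("45_plus", "degree"): 0.45,
    ("45_plus", "no_degree"): 0.35,
}

# Two hypothetical districts with different demographic mixes.
district_a = {
    ("under_45", "degree"): 400,
    ("under_45", "no_degree"): 300,
    ("45_plus", "degree"): 200,
    ("45_plus", "no_degree"): 100,
}
district_b = {
    ("under_45", "degree"): 100,
    ("under_45", "no_degree"): 200,
    ("45_plus", "degree"): 300,
    ("45_plus", "no_degree"): 400,
}

print(poststratify(cell_support, district_a))
print(poststratify(cell_support, district_b))
```

The same national cell estimates yield different district estimates purely because the districts' compositions differ, which is also why the approach struggles when local factors matter more than demographics.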

MRP's limits are equally important: it fails when local factors (candidate quality, local economic conditions, political culture) are more important than demographic composition; when the regression model's predictor variables fail to capture current-cycle realignments; and when sparse demographic cells in remote or unusual constituencies undermine the poststratification.

International Forecasting

International forecasting is not American forecasting exported. The fundamental challenges differ:

Multi-party systems require simultaneous estimation of several correlated vote shares, not a single two-candidate margin.
Parliamentary systems add a government-formation prediction step that depends on post-election negotiation, not just vote shares.
Data availability ranges from the American extreme of abundance to genuine polling deserts where structural and analogical approaches are the only tools available.
Partisan nonresponse bias is not unique to polling of Trump's support in the United States: Brazil's 2022 Bolsonaro underestimation and the UK's 2015 Conservative underestimation show the same mechanism operating in different political systems.
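The first point, simultaneous estimation of several vote shares, can be illustrated with a small simulation. This is a sketch under one assumed modeling choice: shares are drawn from a Dirichlet distribution centered on a polling average, so they always sum to one and a gain for one party is a loss for the others. The party labels, poll shares, and concentration value are hypothetical.

```python
import random


def simulate_shares(poll_shares, concentration=200.0, n_sims=5000, seed=0):
    """Draw joint vote shares from a Dirichlet centered on poll_shares.

    A Dirichlet draw is built from independent gamma draws normalized to
    sum to one. A larger concentration means tighter draws around the
    polling average (an assumed tuning parameter).
    """
    rng = random.Random(seed)
    parties = list(poll_shares)
    draws = []
    for _ in range(n_sims):
        gammas = [rng.gammavariate(poll_shares[p] * concentration, 1.0)
                  for p in parties]
        total = sum(gammas)
        draws.append({p: g / total for p, g in zip(parties, gammas)})
    return draws


def first_place_probs(draws):
    """Share of simulations in which each party finishes first."""
    wins = {p: 0 for p in draws[0]}
    for shares in draws:
        wins[max(shares, key=shares.get)] += 1
    return {p: w / len(draws) for p, w in wins.items()}


# Hypothetical four-party polling average.
polls = {"A": 0.35, "B": 0.30, "C": 0.20, "D": 0.15}
draws = simulate_shares(polls)
print(first_place_probs(draws))
```

A two-candidate margin model has no equivalent of this step. The sketch also stops at vote shares; seat allocation and government formation would need further modeling that it does not attempt.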

The Data-Scarcity Imperative

When direct polling data is absent or unreliable, forecasters should:

1. Use structural inputs (economic conditions, historical partisan lean, incumbency)
2. Import parameters from comparable data-rich environments with appropriate caution
3. Use expert elicitation with structured probability elicitation protocols
4. Widen confidence intervals to explicitly reflect the additional uncertainty
5. Communicate clearly what the forecast does and does not capture

The responsible threshold: a forecast in a data-scarce environment should be offered as "structured uncertainty" — a framework for thinking about possible outcomes — rather than a prediction. The Brazilian case illustrates that wide, honestly communicated intervals are more useful than narrow intervals that prove wrong.
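One simple way to widen an interval explicitly is to add extra sources of uncertainty to the baseline error in quadrature. The sketch below assumes the sources are independent and roughly normal; the function name and the specific standard deviations are hypothetical, and choosing the extra terms is a judgment call rather than a formula.

```python
import math
from statistics import NormalDist


def widened_interval(point, base_sd, extra_sds=(), level=0.90):
    """Interval around a point forecast with added uncertainty sources.

    base_sd: error the model would report in a data-rich setting.
    extra_sds: additional standard deviations, in the same units, for
        things like imported parameters or coverage gaps (assumed
        independent, so variances add).
    """
    total_sd = math.sqrt(base_sd**2 + sum(sd**2 for sd in extra_sds))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return point - z * total_sd, point + z * total_sd


# Hypothetical forecast: candidate ahead by 4 points.
narrow = widened_interval(4.0, base_sd=3.0)
wide = widened_interval(4.0, base_sd=3.0, extra_sds=(4.0,))
print("baseline interval:", narrow)
print("data-scarce interval:", wide)
```

In this illustration the baseline interval and the widened interval share the same point estimate, but only the widened one admits how much the missing data could move the result.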

Who Gets Counted: The Global Dimension

The systematic exclusion of hard-to-survey populations from polling samples is simultaneously a methodological failure and a democratic concern. Rural poor voters in Brazil, economically marginal voters in India, and linguistically isolated communities in sub-Saharan Africa are the most likely to be underrepresented in standard survey samples — and in many elections, they are the decisive margin. A forecasting methodology that cannot reach these populations produces a map that leaves out precisely the territory where elections are decided.

This is not a solvable problem in any final sense, but it can be mitigated by methodological diversity (combining telephone, in-person, and online sampling), explicit modeling of who is missing from the sample, and honest quantification of the uncertainty attributable to coverage gaps.
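Quantifying uncertainty from a coverage gap can start with simple bounds. The sketch below assumes the forecaster knows roughly what share of the electorate the sample can reach and is willing to state a plausible range for support among those it cannot reach. All names and numbers are hypothetical.

```python
def coverage_bounds(covered_share, covered_support,
                    missing_support_low, missing_support_high):
    """Bounds on overall support when part of the electorate is unsampled.

    covered_share: fraction of the electorate the survey can reach.
    covered_support: estimated support within the reachable population.
    missing_support_low/high: assumed plausible range of support among
        the population the survey cannot reach.
    """
    missing_share = 1.0 - covered_share
    base = covered_share * covered_support
    return (base + missing_share * missing_support_low,
            base + missing_share * missing_support_high)


# Hypothetical: the survey reaches 80% of voters and finds 50% support;
# support among the unreached 20% could plausibly be anywhere from 30% to 70%.
low, high = coverage_bounds(0.80, 0.50, 0.30, 0.70)
print(f"overall support between {low:.2f} and {high:.2f}")
```

The width of the resulting range depends only on the size of the gap and on how little is known about the people inside it, which makes the cost of a coverage gap explicit before any sampling error is added.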

The Fundamental Principle: Match Method to Data

The central methodological lesson of this chapter and the preceding three: the forecasting method must be matched to the data environment. A Senate race model appropriate for 20 high-quality likely voter polls is not appropriate for a state legislative race with no polls. A poll aggregation approach designed for two-party American elections is not appropriate for a multi-party parliamentary election. A fundamentals model calibrated on American economic voting is not directly importable to a country where accountability attribution works differently.

Rigor in electoral forecasting is not about applying the most sophisticated technique uniformly; it is about matching the appropriate technique to the information environment and communicating honestly about what the chosen approach can and cannot do.