Case Study 18-2: Building a Senate Fundamentals Model — The Garza-Whitfield Context

Background

This case study walks through the construction and application of a structural forecasting model for a Senate race, using the Garza-Whitfield race as the primary context. The goal is not to produce a precise point prediction but to understand how structural factors establish a baseline range of expected outcomes, and how that baseline interacts with polling data.

The case is composite and illustrative, combining realistic modeling approaches with fictional race details.

The Structural Context for Senate Races

Presidential fundamentals models are well-established. Senate fundamentals models are less developed — partly because Senate races are more idiosyncratic, partly because individual-race data is noisier than national data. But the same underlying logic applies: economic conditions, presidential approval, incumbency, and historical partisan lean shape the structural baseline.

For the Garza-Whitfield Senate race, Nadia Osei built a structural baseline model with five components:

Component 1: National Political Environment

The national political environment — measured primarily by the president's approval rating and the generic ballot — creates a headwind or tailwind for each party's candidates nationwide. In the Garza-Whitfield election year, the Democratic president's approval was approximately 48% and the generic congressional ballot showed Democrats +2.

Interpretation: A neutral-to-slightly-favorable national environment for Democrats. No significant wave in either direction.

Component 2: State Partisan Lean

The state hosting the Garza-Whitfield race had the following presidential results:
- 2024: D+2.8
- 2020: D+2.1
- 2016: R+0.4
- 2012: D+3.2
- 2008: D+4.1

Historical average: approximately D+2.4 (the five margins, counting R+0.4 as -0.4, sum to 11.8 points), suggesting a slight Democratic lean that held in four of the last five cycles, though the state was genuinely competitive.
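As a quick check on the arithmetic, the average can be reproduced in a few lines of Python. This is a minimal sketch: the variable names and the sign convention (Democratic margins positive, Republican margins negative) are illustrative choices, not part of Nadia's model.

```python
# Presidential margins from the case, in points.
# Sign convention (an assumption for this sketch): D+ is positive, R+ is negative.
presidential_margins = {2024: 2.8, 2020: 2.1, 2016: -0.4, 2012: 3.2, 2008: 4.1}


def average_lean(margins):
    """Return the unweighted mean margin across cycles."""
    return sum(margins.values()) / len(margins)


lean = average_lean(presidential_margins)
print(f"Five-cycle average: {lean:+.2f} (positive = Democratic lean)")
```

An unweighted mean treats 2008 the same as 2024. Weighting recent cycles more heavily would be a natural variation, though the case does not say that Nadia did so.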

Component 3: Garza's Incumbency

Senator Garza was serving her first full Senate term (elected six years earlier in a wave year). Her statewide approval rating, measured in separate polls, was approximately +8 net (54% approve, 46% disapprove). This is modestly positive — indicating some personal incumbency advantage beyond the structural baseline.

Component 4: State Economic Conditions

The state's economy was performing slightly better than the national average: unemployment was 0.4 points below the national rate, and median household income had grown 2.1% in real terms over the past two years. These positive local conditions slightly favor the incumbent.

Component 5: Demographic Trends

The state's Latino population had grown from 18% of the electorate in 2016 to approximately 24% in the current cycle. Latino voters had supported Garza by approximately 62-38 in her previous race. This demographic shift represented a structural advantage for Democrats that was embedded in the historical trend but not fully captured by older baseline partisan lean numbers.

Constructing the Baseline Estimate

Nadia used a simple structured judgment model rather than a formal regression (given limited data points for individual-race structural modeling):

Factor	Direction	Estimated Effect (points of margin)
National environment (D+2 generic ballot)	Slight D advantage	+1.0 to Garza
State partisan lean (D+2.4 historical)	D advantage	+2.4 baseline
Garza incumbency (net +8 approval)	D advantage	+1.5 personal advantage
State economic conditions (better than national)	Slight D advantage	+0.5
Latino demographic growth	D advantage	+0.8 vs. 2016 baseline
Total structural baseline		Garza +6.2

Note: These components are not simply additive in a rigorous statistical sense — they overlap, and some effects are partially captured in others. The structured judgment model is an approximation, not a formal regression.
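The table's arithmetic, and the caveat about overlap, can be sketched in Python. The component values come straight from the table. The `overlap_discount` parameter is purely hypothetical: it is one crude way to illustrate how double-counting would shrink the total, not something Nadia's model specifies.

```python
# Point effects from Nadia's table, in points of margin toward Garza.
components = {
    "national_environment": 1.0,
    "state_partisan_lean": 2.4,
    "incumbency": 1.5,
    "state_economy": 0.5,
    "latino_demographic_growth": 0.8,
}


def structural_baseline(effects, overlap_discount=0.0):
    """Sum the component effects, optionally shrinking the total.

    overlap_discount is a hypothetical knob (not in the case): the fraction
    of the raw sum assumed to be double-counted across components.
    """
    if not 0.0 <= overlap_discount < 1.0:
        raise ValueError("overlap_discount must be in [0, 1)")
    return sum(effects.values()) * (1.0 - overlap_discount)


raw = structural_baseline(components)
discounted = structural_baseline(components, overlap_discount=0.10)
print(f"Additive baseline: Garza {raw:+.1f}")
print(f"With a 10% overlap discount: Garza {discounted:+.1f}")
```

With no discount the sum reproduces Garza +6.2. The 10% figure is arbitrary and only shows the direction of the adjustment: any overlap between components pulls the baseline down, which suggests the simple sum may be better read as an upper-end estimate.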

Uncertainty Around the Baseline

The structural baseline of Garza +6.2 carries substantial uncertainty. Nadia estimated the 80% confidence interval as approximately Garza +2 to Garza +10.

Why so wide? Several reasons:
- The model components are estimated with significant uncertainty
- Senate races are more idiosyncratic than presidential races
- Candidate quality effects (Whitfield's specific strengths and weaknesses) aren't captured
- The state is trending but within its historical competitive range
- Turnout variation can swing a Senate race several points in either direction
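One way to make the width of this interval concrete is to ask what standard deviation it would imply. The sketch below assumes the uncertainty is roughly Normal and symmetric. That assumption is added here for illustration and is not part of Nadia's stated method.

```python
from statistics import NormalDist

low, high = 2.0, 10.0   # Nadia's approximate 80% interval, in points
coverage = 0.80

# z-score that leaves (1 - coverage) / 2 in each tail of a standard Normal.
z = NormalDist().inv_cdf(0.5 + coverage / 2)

midpoint = (low + high) / 2
implied_sd = (high - low) / (2 * z)

print(f"Interval midpoint: Garza {midpoint:+.1f}")
print(f"Implied standard deviation: {implied_sd:.1f} points")
```

Under that assumption the interval corresponds to a standard deviation of a little over 3 points. The midpoint comes out at Garza +6.0 rather than +6.2, which is consistent with the interval being stated as approximate.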

Comparing to Polling Data

The structural baseline of Garza +6.2 was 2.7 points above the polling average of Garza +3.5 at the same point in the campaign. This created what Nadia called an "interesting tension."

In a well-calibrated forecasting framework, when the structural model and the polls diverge, the right response is not to simply average them but to investigate why. Possible explanations for polls showing a smaller lead than structure suggests:

1. The structural model is overestimating Garza's advantage. Perhaps the national environment is less favorable than the generic ballot suggests for this specific state; perhaps Garza's incumbency advantage is weaker than the approval rating implies.
2. The polls are understating Garza's lead. Perhaps the LV screens are too strict and capturing a Republican-leaning likely voter pool; perhaps there are turnout dynamics favoring Democrats that the polls aren't capturing.
3. Both are approximately right and the true answer is somewhere in between. The structural model suggests Garza +6, the polls suggest Garza +3, and the truth is Garza +4 or +5.
4. Candidate quality is pulling the race back toward parity. Whitfield is running a better-than-average campaign that is cutting into the structural advantage.

This is exactly the kind of analysis that sophisticated forecasters engage in: not treating models or polls as oracles, but as imperfect instruments pointing toward the same underlying reality from different angles.

The Integrated Forecast

Nadia's integrated forecast combined the structural baseline and the polling average using a Bayesian-inspired weighting:
- Structural baseline weight: 30% (reflecting high structural uncertainty at this stage in the cycle)
- Polling average weight: 70% (reflecting more information in the polls at this late stage)

Integrated point estimate: (0.30 × 6.2) + (0.70 × 3.5) = 1.86 + 2.45 = 4.31, or approximately Garza +4.3

This number, Garza +4.3, was within about a point of what the major aggregators were showing (Garza +3.2 to Garza +3.8). The convergence gave Nadia confidence that the structural baseline and polling average were both pointing toward the same general range, and that the true expected outcome was in the Garza +3 to +5 range.
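The 30/70 combination in this section can be sketched as a plain linear blend. Note that this is a fixed-weight average, not a full Bayesian update. The function name and the alternative weights in the sweep are illustrative assumptions and are not values from the case.

```python
def integrated_forecast(structural, polls, structural_weight):
    """Linear blend of a structural baseline and a polling average."""
    if not 0.0 <= structural_weight <= 1.0:
        raise ValueError("structural_weight must be between 0 and 1")
    return structural_weight * structural + (1.0 - structural_weight) * polls


STRUCTURAL = 6.2   # structural baseline, Garza margin in points
POLLS = 3.5        # polling average, Garza margin in points

estimate = integrated_forecast(STRUCTURAL, POLLS, structural_weight=0.30)
print(f"Nadia's 30/70 blend: Garza {estimate:+.1f}")

# Sensitivity: how much the estimate moves as the structural weight changes.
# The alternative weights below are illustrative, not values from the case.
sweep = {
    w: integrated_forecast(STRUCTURAL, POLLS, w)
    for w in (0.10, 0.20, 0.30, 0.40, 0.50)
}
for weight, value in sweep.items():
    print(f"structural weight {weight:.0%}: Garza {value:+.2f}")
```

Because the blend is linear, each 10-point shift in the structural weight moves the estimate by 0.27 points, one tenth of the 2.7-point gap between the two inputs. Across the weights shown, the estimate stays between Garza +3.77 and Garza +4.85.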

What the Model Couldn't Tell Her

Nadia was careful to communicate to the campaign what the model could not address:

Specific turnout uncertainty: If Latino turnout collapsed to 2018 midterm levels (significantly lower), the race would tighten dramatically. The structural model incorporated demographic trends but not scenario-specific turnout projections.

October events: Any significant event in the final three weeks — a scandal, a national crisis, a game-changing debate moment — was not in the model.

Mobilization effects: The Garza campaign's ground game, if it was significantly stronger than Whitfield's, could add 1-2 points beyond the structural baseline. But this required execution that couldn't be modeled in advance.

Discussion Questions

1. Nadia weighted the structural model at 30% and polling at 70% in her integrated forecast. What factors should determine the weighting between structural models and polling averages? How should the weighting change as Election Day approaches?

2. The structured judgment model assigns specific numerical effects to each structural factor (e.g., +1.5 for Garza's incumbency). How would you evaluate whether these numbers are well-calibrated? What historical data would you use?

3. The structural baseline (Garza +6.2) exceeded the polling average (Garza +3.5) by almost 3 points. In Chapter 19, we'll discuss probabilistic forecasting. What probability of a Garza win would you assign, given these inputs?

4. The model doesn't include candidate quality. How would you attempt to measure candidate quality for Garza and Whitfield in a way that could be incorporated into the structural model? What observable signals would you use?

5. Compare Nadia's approach to fundamentals modeling with Jake Rourke's approach to his own internal polls (described in Chapter 17). What analytical habits distinguish these two approaches? Which practices could Jake adopt to improve his analysis?

Quantitative Extension

Using the five structural components Nadia identified, build an alternative weighting scheme:

a) Assign your own weights to each component (they should sum to 100%)
b) Compute the weighted structural baseline using your weights
c) How does your baseline compare to Nadia's Garza +6.2?
d) Using your baseline and the polling average of Garza +3.5, compute an integrated forecast with your preferred structural/polling weight split
e) Justify your weighting decisions in writing (150-200 words)