Chapter 21 Quiz: Building a Simple Election Model

DataField.Dev

Multiple Choice

1. In the three-layer election model architecture, which layer reduces the impact of sampling noise in individual polls?

a) Layer 1: Poll aggregation
b) Layer 2: Fundamentals integration
c) Layer 3: Monte Carlo simulation
d) All three layers reduce sampling noise equally

2. In the exponential decay weighting function $w = e^{-\lambda d}$, what happens to a poll's weight as its age $d$ increases?

a) The weight increases exponentially
b) The weight decreases exponentially
c) The weight decreases linearly
d) The weight remains constant regardless of age

3. Why does the model apply a square root transformation to sample size when calculating sample-size weights (i.e., use $\sqrt{n}$ instead of $n$)?

a) Because larger samples are always untrustworthy
b) To represent diminishing returns: doubling sample size does not double information content
c) Because the square root gives the margin of error directly
d) To ensure that all polls receive equal weight

4. The systematic error term in the Monte Carlo simulation is modeled as a single correlated draw per simulation rather than independent draws. This design choice is intended to capture:

a) The variability in different pollsters' methodologies
b) The possibility that all polls are simultaneously wrong in the same direction
c) Random sampling variance across different poll respondents
d) The uncertainty in the fundamentals-based prior

5. In the blended model, what does a "poll weight" of 0.75 mean?

a) 75 percent of the simulations result in a Democratic win
b) The model assigns 75 percent of its final estimate to the polling average and 25 percent to the fundamentals prior
c) 75 percent of polls in the dataset are weighted higher than average
d) The fundamentals prior is given 75 percent of the total weight

6. A sensitivity analysis shows that changing the systematic error SD from 2.0 to 4.0 shifts the win probability from 61 percent to 49 percent. What does this indicate?

a) The model is insensitive to the systematic error assumption
b) The win probability estimate is robust to uncertainty about polling bias
c) The model's output is highly sensitive to the assumption about systematic error magnitude
d) The model is incorrectly specified and should be discarded

7. The effective number of polls $N_{\text{eff}} = 1 / \sum_i w_i^2$ is useful because it:

a) Counts the total number of polls in the dataset
b) Measures how evenly weight is distributed across polls; a lower value indicates concentration in fewer polls
c) Determines the appropriate margin of error to report
d) Counts only the likely voter polls in the weighted average

8. Which of the following is NOT one of the failure modes identified in the chapter?

a) Overfitting to polling data when polls are systematically biased
b) Using inappropriate fundamentals that no longer capture the current political environment
c) Choosing an overly wide confidence interval that makes the forecast useless
d) Treating the model as certain rather than as a probabilistic estimate

9. Nadia recommends increasing the poll weight as Election Day approaches. The theoretical justification for this is:

a) Polls become cheaper to conduct closer to Election Day
b) The fundamentals model becomes more accurate closer to Election Day
c) Polling more closely approximates actual vote choice as the date approaches, reducing the value added by a historical-baseline fundamentals prior
d) The Monte Carlo simulation requires fewer draws when Election Day is close

10. The chapter distinguishes between "false precision" in win probability reporting and appropriate precision. Which of the following best represents appropriate precision given the model described?

a) "Garza has a 61.23% chance of winning."
b) "Garza has about a 60–65% chance of winning."
c) "Garza will probably win."
d) "The model cannot determine a winner."

True/False

11. The margin of error reported in the weighted average captures both sampling variance and systematic bias from differential nonresponse.

12. A point estimate of +2.1 (Garza leading by 2.1 points) should produce a win probability of exactly 50 percent in a Monte Carlo simulation.

13. Adding more polls to a weighted average always reduces uncertainty, regardless of the systematic bias in those polls.

14. The fundamentals prior is useful even when many recent polls are available, because it provides an independent source of signal not subject to polling nonresponse bias.

15. The sensitivity analysis in the chapter finds that the systematic error SD has a larger impact on win probability than the choice of decay rate.

Short Answer

16. Explain in two to three sentences why the election model's Monte Carlo simulation adds a systematic error term as a single draw per simulation rather than as an independent draw for each poll. What would happen to the probability distribution if the systematic error were modeled as independent per poll?

17. The chapter introduces the concept of "effective number of polls." Describe in one sentence what a high effective number of polls (close to the actual number of polls) vs. a low effective number of polls indicates about how weights are distributed.

18. Nadia is asked by the campaign manager what would "most change the probability estimate." Based on the sensitivity analysis in the chapter, what answer should Nadia give? Name two specific information inputs and explain briefly why each matters.

19. The chapter states that the model's blending weight should "approach 1.0 as Election Day nears." What is the theoretical reason for this recommendation, and what value of poll weight would you use on Election Day itself?

20. The model in this chapter assumes that state unemployment and presidential approval affect Senate race outcomes through fixed coefficients. Identify one scenario from Chapter 20 (or real-world knowledge) where this assumption would be most likely to fail, and explain why.

Coding Comprehension

21. Consider the following code:

systematic_err = np.random.normal(0, 2.0, n_simulations)

a) What does this line produce?
b) If this term is added to a point estimate of +2.0, approximately what fraction of simulations will produce a negative value (Whitfield wins)?
c) What would happen to that fraction if the SD were increased to 4.0?

22. The model normalizes weights with:

df["weight"] = df["weight_raw"] / df["weight_raw"].sum()

Why is normalization necessary? What property of the weights does this ensure, and why does that property matter for the weighted average calculation?

Answer Key

1. a
2. b
3. b
4. b
5. b
6. c
7. b
8. c
9. c
10. b
11. False (the margin of error captures only sampling variance)
12. False (exactly 50 percent corresponds to a point estimate of 0, a tied race. A +2.1 lead puts more than half of the simulated outcomes on Garza's side, but because total uncertainty is larger than the lead, the tail on the Whitfield side is still substantial; the win probability is roughly 60-65%, not 50%)
13. False (adding systematically biased polls does not reduce systematic uncertainty; it may increase it)
14. True
15. True (in the sensitivity table, changing systematic error SD from 2.0 to 3.5 produces a larger win-probability shift than changing the decay rate across the range tested)
16. A single draw per simulation means that every poll in that simulation shares the same directional error, which represents the scenario where all polls are simultaneously wrong in the same direction. If the error were drawn independently for each poll, the errors would largely cancel in the average within each simulation, so the simulated distribution of outcomes would be too narrow and the win probability too extreme, understating the real risk that all polls share a common bias.
17. A high effective number of polls indicates weight is evenly distributed (many polls contribute meaningfully); a low effective number indicates that one or a few polls dominate the average.
18. Accept answers that identify systematic error magnitude and the point estimate (polling average). These are the two inputs to which the win probability is most sensitive. Reducing systematic error requires either historical validation of Ordana polling accuracy or alternative data sources; improving the point estimate requires more or better-quality recent polls.
19. As Election Day approaches, polls reflect increasingly finalized voter decisions rather than evolving preferences, making them the most accurate available signal. The fundamentals prior reflects structural factors that may have already been incorporated into polls. On Election Day itself, poll weight = 1.0 (polls only) if high-quality final polls are available, since turnout has begun and fundamentals are now fully priced into revealed behavior.
20. Accept answers identifying the Dobbs example from Chapter 20 (structural break due to Supreme Court ruling), the 2020 pandemic (fundamentals models calibrated on normal-economy cycles), or any scenario where a structural change altered the relationship between approval/economy and vote choice.
21. a) An array of n_simulations random draws from a normal distribution centered at 0 with standard deviation 2.0. b) The sum has mean 2.0 and SD 2.0, so zero lies one standard deviation below the mean and approximately 16% of draws fall below 0; combined with other uncertainty terms the fraction will be higher. c) With SD 4.0, zero lies only half a standard deviation below the mean, so about 31% of draws are negative: Whitfield wins more often and Garza's win probability moves closer to 50%.
22. Normalization ensures the weights sum to 1, which is required for the weighted average to be a proper weighted mean (not a weighted sum). Without normalization, the result would scale with the total raw weight rather than representing the average of poll margins.
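
The arithmetic behind questions 21 and 22 can be checked numerically. The following is a minimal sketch rather than the chapter's own code: the seed, the value of `n_simulations`, the helper name `share_negative`, and the three raw weights and poll margins are all assumptions made up for illustration. It uses NumPy's `default_rng` generator and plain arrays in place of the `np.random.normal` call and the `df` columns shown in the questions.

```python
import numpy as np

rng = np.random.default_rng(42)  # assumed seed, for reproducibility only
n_simulations = 100_000          # assumed; the chapter's value may differ
point_estimate = 2.0             # Garza lead in points, as in question 21b


def share_negative(sd: float) -> float:
    """Fraction of simulated margins below zero (Whitfield wins)."""
    # One systematic-error draw per simulation, shared by all polls in it.
    systematic_err = rng.normal(0, sd, n_simulations)
    simulated_margin = point_estimate + systematic_err
    return float((simulated_margin < 0).mean())


share_sd2 = share_negative(2.0)  # zero sits 1 SD below the mean
share_sd4 = share_negative(4.0)  # zero sits 0.5 SD below the mean
print(f"SD 2.0: {share_sd2:.3f}   SD 4.0: {share_sd4:.3f}")

# Question 22: hypothetical raw weights and poll margins (not from the chapter).
weight_raw = np.array([3.0, 1.5, 0.5])
margins = np.array([1.0, 3.0, 5.0])
weight = weight_raw / weight_raw.sum()  # normalized weights sum to 1

weighted_sum_raw = float((weight_raw * margins).sum())  # scales with total raw weight
weighted_mean = float((weight * margins).sum())         # a proper weighted average
n_eff = float(1.0 / (weight ** 2).sum())                # effective number of polls
print(weighted_sum_raw, weighted_mean, round(n_eff, 2))
```

With these inputs the two simulated fractions should land near 0.16 and 0.31, matching the normal-distribution reasoning in the answer key. The unnormalized sum (10.0) is not on the scale of any poll margin, while the normalized version (2.0) is, and the effective number of polls comes out just above 2 because the first poll carries 60 percent of the weight.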