Chapter 4 Exercises: Thinking Like a Political Analyst
Conceptual Exercises
Exercise 4.1 — Classifying Analytical Goals
For each of the following questions, identify whether it is primarily descriptive, inferential, or predictive, and explain why the distinction matters for how you would approach the analysis.
a) "What share of registered voters in our state are under 35?"
b) "Does canvassing increase turnout among low-propensity voters?"
c) "If the national environment shifts by two points toward Democrats, what happens to our state-level Senate race?"
d) "What were the most common words used in Whitfield's stump speech over the past month?"
e) "Which precincts are most likely to have large pools of persuadable voters in the general election?"
Exercise 4.2 — Identifying Confounds
Each of the following observational findings has at least two plausible alternative explanations. For each finding, list at least two confounds and then propose a piece of evidence that would help distinguish among explanations.
a) Candidates who run more negative ads win at a higher rate than candidates who run exclusively positive ads.
b) Counties with lower median income show higher support for Whitfield's populist platform.
c) Voters contacted by door-knockers in the first week of October were more likely to report a strong candidate preference in a post-contact survey.
d) Senate incumbents who hold more town halls win reelection at higher rates than those who hold fewer.
Exercise 4.3 — Applying the Decision Tree
You are Nadia Osei, analytics director for the Garza campaign. The campaign is considering adding a series of radio ads targeting rural listeners in the state's northern counties. Using the seven-question analyst's decision tree from Section 4.7, work through the analytical setup for this decision. You do not need real data — you need to formulate the questions properly and specify what data or information would be needed to answer each.
Exercise 4.4 — Base Rates and Prior Construction
You are asked to forecast the outcome of an open-seat Senate race in a state with the following characteristics:
- The state went Republican in 5 of the last 6 presidential elections (the exception being 2020, when it went D+1.2%)
- The state has had a Republican senator for 18 consecutive years
- This cycle's national environment, as measured by the generic congressional ballot, favors Democrats by 3.5 points
- One credible early poll shows the Democratic candidate leading by 4 points
a) Construct a prior probability distribution over the Democratic candidate's win probability before seeing the poll. Justify your prior.
b) How much should the single poll update your prior? What factors affect how much weight you should give the poll?
c) What additional information would most substantially move your prior, and in which direction?
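One way to make parts (a) and (b) concrete is to treat the Democratic candidate's final margin as normally distributed and combine the prior with the poll by precision weighting. The sketch below is an illustration of that mechanic, not a model answer: the prior mean and standard deviation, and the poll's effective standard deviation (set wider than sampling error alone to allow for other survey error), are all hypothetical values that you should replace with numbers you can justify.

```python
import math


def normal_update(prior_mean, prior_sd, obs, obs_sd):
    """Normal-normal update: each source is weighted by its precision (1 / variance)."""
    prior_prec = 1.0 / prior_sd ** 2
    obs_prec = 1.0 / obs_sd ** 2
    post_var = 1.0 / (prior_prec + obs_prec)
    post_mean = post_var * (prior_prec * prior_mean + obs_prec * obs)
    return post_mean, math.sqrt(post_var)


def win_probability(mean, sd):
    """P(margin > 0) when the margin is Normal(mean, sd)."""
    return 0.5 * (1.0 + math.erf(mean / (sd * math.sqrt(2.0))))


# Hypothetical inputs, in points of Democratic margin (assumptions, not answers).
prior_mean, prior_sd = -2.0, 6.0   # assumed prior from state history and national environment
poll_margin, poll_sd = 4.0, 5.0    # poll lead from the exercise; assumed total survey error

post_mean, post_sd = normal_update(prior_mean, prior_sd, poll_margin, poll_sd)

print(f"Prior win probability:     {win_probability(prior_mean, prior_sd):.2f}")
print(f"Posterior margin estimate: {post_mean:+.1f} (sd {post_sd:.1f})")
print(f"Posterior win probability: {win_probability(post_mean, post_sd):.2f}")
```

Try rerunning with a larger poll_sd: the less you trust the single poll, the closer the posterior stays to your prior. That sensitivity is the heart of part (b).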
Exercise 4.5 — Counterfactual Construction
The Garza campaign won a closely contested primary by 2.3 percentage points. The runner-up, a progressive state senator, endorsed Garza three weeks before the general election. Shortly after the endorsement, Garza's favorability among self-described progressives rose from 61% to 74%.
a) State the causal question implied by this sequence of events.
b) List at least three alternative explanations for the favorability rise.
c) Design a piece of evidence — observational or experimental — that would help you distinguish the endorsement effect from the alternative explanations.
d) Why is it particularly difficult to identify endorsement effects in this context?
Applied Exercises
Exercise 4.6 — Gut vs. Data: Structured Integration
Jake Rourke believes that the three largest rural counties will deliver Whitfield a net advantage of 18,000 votes — a number based on his experience and knowledge of those communities. The campaign's turnout model, based on historical precinct data and modeled enthusiasm scores, predicts a net advantage of only 11,000 votes.
a) What information would you want to gather to evaluate whose estimate is more reliable?
b) How would you structure an analysis that integrates Jake's local knowledge with the model's output rather than simply choosing one over the other?
c) What cognitive biases should you be most alert to in Jake's estimate? In the model's estimate?
Exercise 4.7 — Pre-Mortem Analysis
You are on the Garza campaign's analytics team. Leadership has just approved a strategy of concentrating the last $500,000 of campaign budget on digital persuasion ads targeting college-educated suburban women in three media markets, based on the campaign's support model showing high persuasion opportunity in that segment.
Write a two-page pre-mortem. Assume the strategy was followed and Garza lost by 1.8 points. What are the three most plausible ways the analytical reasoning underlying this strategy could have been wrong? For each failure mode, what evidence could have detected the problem in advance?
Exercise 4.8 — Calibration Exercise
For each of the following political questions, provide your best probability estimate AND an honest confidence interval reflecting your uncertainty. Then identify the single piece of information that would most narrow your confidence interval.
a) In the Garza-Whitfield race, what is the probability that the candidate who leads in the final pre-election polling average wins?
b) In a state that a party's presidential candidate carried by more than 5 points, what is the probability that the same party's Senate candidate wins?
c) What is the probability that a campaign with a 3-point polling lead in the final week will win?
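Calibration cannot be judged from a single forecast; it only shows up across many of them. If you keep a log of your probability estimates and later record what happened, a sketch like the following can score them. The forecasts and outcomes here are made-up placeholders, and the function names and the choice of five bins are illustrative assumptions.

```python
from collections import defaultdict


def brier_score(forecasts, outcomes):
    """Mean squared gap between forecast probability and outcome (0 or 1); lower is better."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


def calibration_table(forecasts, outcomes, n_bins=5):
    """Group forecasts into equal-width probability bins and compare the
    average stated probability with the observed frequency in each bin."""
    bins = defaultdict(list)
    for p, o in zip(forecasts, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the top bin
        bins[idx].append((p, o))
    rows = []
    for idx in sorted(bins):
        pairs = bins[idx]
        mean_p = sum(p for p, _ in pairs) / len(pairs)
        hit_rate = sum(o for _, o in pairs) / len(pairs)
        rows.append((mean_p, hit_rate, len(pairs)))
    return rows


# Placeholder data: stated probabilities and whether each event occurred (1) or not (0).
forecasts = [0.9, 0.8, 0.7, 0.6, 0.9, 0.3]
outcomes = [1, 1, 0, 1, 1, 0]

print(f"Brier score: {brier_score(forecasts, outcomes):.3f}")
for mean_p, hit_rate, n in calibration_table(forecasts, outcomes):
    print(f"stated {mean_p:.2f}  observed {hit_rate:.2f}  (n={n})")
```

A well-calibrated forecaster's stated and observed columns roughly agree once each bin holds enough forecasts; with only a handful per bin, large gaps are expected by chance.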
Discussion Questions
4.D1 — Jake Rourke argues that his 25 years of campaign experience give him better political judgment than any model. Nadia argues that models are more reliable than individual judgment because they are not subject to cognitive bias. Who is right, and under what conditions?
4.D2 — The chapter argues that intellectual humility — expressing calibrated uncertainty — is a professional virtue for political analysts. But campaign environments reward confident projections. How should an analyst navigate this tension? What are the ethical dimensions of overstating certainty to satisfy client demands?
4.D3 — We noted that the ecological fallacy — drawing individual-level conclusions from aggregate data — has distorted understanding of populism. Can you think of a specific political narrative in recent American politics that may have been driven partly by ecological fallacy reasoning? How would you test whether the narrative holds at the individual level?
4.D4 — The chapter distinguishes prediction and explanation as different analytical goals requiring different approaches. Is it ever possible for a single analysis to serve both goals well? What are the conditions under which the tension between them is most acute?
Research Exercises
Exercise 4.9 — Case Study in Analytical Failure
Research one of the following: (a) the 2016 presidential polling miss, (b) the 2020 Senate polling misses, or (c) a specific campaign's documented analytical failure (the Romney Orca system, the Clinton 2016 analytics team's model performance, etc.). Write a three-to-five-page analysis applying the frameworks from this chapter. Specifically: What type of analysis was being done? What confounds or alternative explanations were underweighted? What base rates were neglected? How was uncertainty communicated — or not communicated — to decision-makers?
Exercise 4.10 — Interview Exercise
Interview one person who has made decisions under political uncertainty — a campaign staffer, a local official, a political journalist, or a policy analyst. Ask them: (a) How do they form prior beliefs about political questions? (b) How do they update those beliefs when new information arrives? (c) What is the most important thing they have learned about the limits of their own judgment? Write a two-page reflection connecting their responses to the frameworks in this chapter.