Case Study 6.1: The 2016 U.S. Election and the Limits of Probability Communication
Background
Before the 2016 U.S. presidential election, political forecasting models gave Donald Trump between roughly a 15% chance of winning (The New York Times) and a 28–29% chance (FiveThirtyEight's final model). Hillary Clinton was favored by every major quantitative model.
Trump won.
In the aftermath, there was widespread public confusion. Many people concluded that the models were "wrong," that probability had "failed," and that the experts didn't know what they were talking about.
But was the model wrong? What does it mean for a probabilistic forecast to be correct or incorrect?
The Communication Problem
A probability of 28% is not a small probability. It means that in roughly 3 out of 10 similar elections with these conditions, Trump would win. The model was not saying "Clinton will definitely win" — it was saying "Clinton is more likely to win."
The problem was that most people heard "28%" and interpreted it as "28% chance = basically impossible." This reflects a failure in how probability is communicated and interpreted, not in the underlying math.
For comparison:
- A 28% probability is roughly the probability that you roll a 1 on a four-sided die
- It's roughly the probability that your flight is delayed by more than 15 minutes (historically about 20–25%)
- It's roughly the probability of rain when a weather forecast says "30% chance of rain"
These are not extreme long shots. They are common events that happen regularly.
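To make the frequency interpretation concrete, here is a minimal simulation sketch in Python (the 28% figure comes from the forecast above; everything else, including the trial count and seed, is an illustrative assumption). It draws thousands of hypothetical "elections" in which the underdog has a 28% chance and counts how often the underdog wins.

```python
import random

random.seed(0)

P_UNDERDOG = 0.28   # forecast probability of the underdog winning
N_TRIALS = 10_000   # number of simulated elections (arbitrary choice)

# Count the simulated elections in which the 28% event actually happens.
wins = sum(random.random() < P_UNDERDOG for _ in range(N_TRIALS))

print(f"Underdog won {wins} of {N_TRIALS} simulated elections "
      f"({wins / N_TRIALS:.1%})")
# Expected output: close to 28%, i.e. about 3 wins in every 10 elections.
```

Run it with a few different seeds: the 28% event is mundane, not miraculous.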
What Would "Being Wrong" Look Like?
If the model consistently assigned 30% probability to events that happened 70% of the time, that would be a calibration failure. The same would be true if it assigned 30% to events that happened only 5% of the time.
The model assigned 28% to a Trump win, and Trump won; that is a single data point. You need many elections to evaluate calibration. Research on election forecasting models suggests they are reasonably well-calibrated overall — they're not systematically biased in ways that would make the Trump outcome a model failure.
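To see what evaluating calibration over many events looks like, here is an illustrative sketch (the forecasts and outcomes are simulated, not real election data; the bin width and event count are assumptions): it groups forecasts by predicted probability and compares each bin's average forecast to the observed frequency.

```python
import numpy as np

rng = np.random.default_rng(42)

n_events = 500
forecasts = rng.uniform(0.05, 0.95, size=n_events)   # predicted probabilities
outcomes = rng.random(n_events) < forecasts          # simulate a calibrated world

bin_edges = np.linspace(0.0, 1.0, 11)                # ten bins of width 0.1
bin_index = np.digitize(forecasts, bin_edges) - 1

for b in range(10):
    mask = bin_index == b
    if not mask.any():
        continue
    print(f"forecast ~{forecasts[mask].mean():.2f}  "
          f"observed {outcomes[mask].mean():.2f}  (n={mask.sum()})")

# A well-calibrated forecaster shows forecast ~= observed in every bin.
# Judging a model on one election is like judging a bin with n = 1.
```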
Discussion Questions
1. Frequency vs. outcome: The model assigned 28% probability to a Trump win. He won. Does this mean the model was wrong? What would it mean for the model to be "right"? How many elections would you need to test it?
2. Communication failure: The chapter discusses how probability is communicated vs. how it's received. What would better probability communication look like? How should "28% chance" be presented to minimize misinterpretation?
3. Bayesian update: After Trump's win, how much should you update your belief in the quality of election forecasting models? How much weight should a single data point carry? (A minimal worked sketch follows these questions.)
4. The 30% chance of rain problem: When a weather forecast says "30% chance of rain" and it doesn't rain, do you consider the forecast wrong? Why or why not? What does this reveal about how we evaluate probability?
5. Policy implications: If the public systematically misinterprets probability (treating 28% as "basically impossible"), what are the consequences for how probabilistic information is used in policy decisions?
6. Connection to luck: When you experience an unlikely outcome (a job you got against the odds; a relationship that worked when it "shouldn't have"), what conclusions should you draw? What conclusions should you not draw?
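For question 3, here is a minimal worked Bayesian update under loudly hypothetical numbers: a prior of 0.80 that the model is well calibrated, and a rival hypothesis under which the "true" win probability was 0.60. The specific values are assumptions for illustration; the point is how modestly a single outcome moves the posterior.

```python
# Hypothetical prior: 80% confidence the model is well calibrated (H_good),
# 20% that it is badly overconfident (H_bad). These numbers are assumptions.
prior_good = 0.80
prior_bad = 1 - prior_good

# Likelihood of the observed outcome (a Trump win) under each hypothesis.
likelihood_good = 0.28   # the model's own stated probability
likelihood_bad = 0.60    # assumed "true" probability if the model were badly off

# Bayes' rule: P(H_good | win) = P(win | H_good) * P(H_good) / P(win)
evidence = prior_good * likelihood_good + prior_bad * likelihood_bad
posterior_good = prior_good * likelihood_good / evidence

print(f"Posterior P(model well calibrated) = {posterior_good:.2f}")
# With these numbers the posterior falls from 0.80 to about 0.65:
# one surprising outcome shifts belief, but it does not demolish it.
```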