Case Study 2: Iowa Electronic Markets vs. Polls — 30 Years of Data
"The stock market has predicted nine of the last five recessions. Prediction markets aim to do better." — Economist joke, adapted for prediction markets
Overview
The Iowa Electronic Markets (IEM) have been forecasting U.S. presidential elections since 1988, providing the longest continuous dataset on prediction market accuracy in existence. Over this period, the IEM has been compared — favorably, critically, and endlessly — to traditional opinion polls.
This case study examines the IEM's track record across ten presidential election cycles (1988-2024), compares its accuracy to contemporaneous polling averages, and investigates when and why markets outperformed polls (or failed to). We use synthetic but realistic data modeled on the publicly available research, and we build visualizations that bring the comparison to life.
1. Background: The IEM's Design
1.1 Market Structure
The IEM operates two types of presidential election markets:
Vote-Share Market: Contracts pay based on the actual percentage of the two-party popular vote received by each candidate. A contract for the Democratic candidate pays $1 times the Democrat's share of the two-party vote. For example, if the Democrat receives 52% of the two-party vote, the Democratic contract pays $0.52 and the Republican contract pays $0.48. The market price of each contract therefore represents the market's estimate of each candidate's vote share.
Winner-Take-All Market: Contracts pay $1 if the specified candidate wins the popular vote and $0 otherwise. The market price directly represents the market's estimated probability of each candidate winning.
Both markets are real-money markets with a $500 maximum investment per individual. Trading occurs through a continuous double-auction mechanism with a limit order book.
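The two payoff rules can be sketched in Python. This is a minimal illustration of the contract definitions above, not actual IEM code:

```python
def vote_share_payoff(two_party_share: float) -> float:
    """Vote-share contract: pays $1 times the candidate's share
    of the two-party vote."""
    return 1.0 * two_party_share

def winner_take_all_payoff(candidate_won: bool) -> float:
    """Winner-take-all contract: pays $1 if the candidate wins
    the popular vote, $0 otherwise."""
    return 1.0 if candidate_won else 0.0

# The 52%/48% example from the text:
dem_share = 0.52
print(f"{vote_share_payoff(dem_share):.2f}")            # 0.52 (Democratic contract)
print(f"{vote_share_payoff(1 - dem_share):.2f}")        # 0.48 (Republican contract)
print(f"{winner_take_all_payoff(dem_share > 0.5):.2f}") # 1.00 (Democrat won popular vote)
```

Because each contract's payoff is linear in the quantity it tracks, a risk-neutral trader's willingness to pay equals her expectation of that quantity, which is why the price can be read as a vote-share estimate or a win probability.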
1.2 Participant Profile
IEM participants are not a representative sample of the electorate. They skew:
- Male (approximately 70-80%)
- Highly educated (most are college graduates or current students)
- Economically knowledgeable (many are business or economics students and faculty)
- Geographically concentrated in the Midwest (due to the University of Iowa connection)

Despite this non-representative composition, the IEM has consistently produced accurate forecasts. This is a key finding: prediction markets do not require representative participants. They require participants who are, on average, well-informed and motivated.
1.3 Data Limitations
Several limitations should be acknowledged:
- The IEM is small (typically a few hundred active traders per election cycle).
- Maximum investments are capped at $500, limiting the financial incentive for informed trading.
- Liquidity is thin compared to commercial markets, which can lead to wide bid-ask spreads that reduce price precision.
- The data from early election cycles (1988, 1992) is sparser than from later cycles.
2. Election-by-Election Analysis
2.1 1988: Bush vs. Dukakis
Actual result: Bush 53.4%, Dukakis 46.6% (two-party vote)
The inaugural IEM market. With only about 200 participants, the market was thin but functional. The IEM's election-eve prediction was approximately 53.2% for Bush — within 0.2 percentage points of the actual result. Gallup's final pre-election poll had Bush at 56% (among likely voters), a larger error of 2.6 percentage points.
Verdict: IEM more accurate than polls.
2.2 1992: Clinton vs. Bush vs. Perot
Actual result: Clinton 53.5%, Bush 46.5% (two-party vote, excluding Perot)
The three-way race complicated both polling and market forecasting. The IEM handled the third-party challenge reasonably well, with its two-party vote-share estimate close to the final result. Polling averages were also reasonably accurate in this cycle.
Verdict: Roughly comparable; both IEM and polls were close.
2.3 1996: Clinton vs. Dole
Actual result: Clinton 54.7%, Dole 45.3% (two-party vote)
A relatively easy forecast — Clinton led consistently throughout the campaign. The IEM's election-eve price was approximately 55.0% for Clinton, an error of 0.3 percentage points. Polls were also close, with final polling averages showing Clinton at approximately 54%.
Verdict: Both accurate; IEM very slightly better.
2.4 2000: Gore vs. Bush
Actual result: Gore 50.3%, Bush 49.7% (two-party popular vote; Bush won the Electoral College)
The closest election in modern history. The IEM's election-eve price for Gore was approximately 50.5%, an error of only 0.2 percentage points. The final RealClearPolitics polling average had Gore ahead by about 1.5 points among likely voters — somewhat less accurate. However, the IEM's winner-take-all market slightly favored Bush, correctly anticipating the Electoral College outcome despite Gore's popular vote edge.
Verdict: IEM more accurate on vote share; also correctly priced the close race.
2.5 2004: Bush vs. Kerry
Actual result: Bush 51.2%, Kerry 48.8% (two-party vote)
The IEM's election-eve price was approximately 51.5% for Bush, an error of 0.3 points. The final polling average had the race tighter, at roughly 50-50 among likely voters. The IEM's slight edge in accuracy was consistent with its historical pattern of outperforming polls in close races.
Verdict: IEM more accurate; polls underestimated Bush's margin.
2.6 2008: Obama vs. McCain
Actual result: Obama 53.7%, McCain 46.3% (two-party vote)
The financial crisis made this race relatively predictable by election day. Both the IEM and polls converged on a clear Obama victory. The IEM's election-eve price was approximately 53.5% for Obama, while the RealClearPolitics average had Obama at roughly 52.5%. Both were close; the IEM was marginally more accurate.
Verdict: Both accurate; IEM slightly better.
2.7 2012: Obama vs. Romney
Actual result: Obama 52.0%, Romney 48.0% (two-party vote)
The IEM's election-eve price was approximately 51.8% for Obama, an error of 0.2 points. Polling averages were close, at approximately 51.0% for Obama. Both were accurate, with the IEM slightly better.
Verdict: Both accurate; IEM very slightly better.
2.8 2016: Clinton vs. Trump
Actual result: Clinton 51.1%, Trump 48.9% (two-party popular vote; Trump won the Electoral College)
This election was widely seen as a polling failure, with most national polls overestimating Clinton's margin. The IEM's election-eve price was approximately 51.5% for Clinton, close to the actual result. Polling averages had Clinton at approximately 52.5%, a somewhat larger error.
The winner-take-all market told a more interesting story. The IEM gave Clinton approximately a 70% chance of winning, while many polling-based models gave her 85-99%. In hindsight, the IEM's more modest confidence was better calibrated.
Verdict: IEM more accurate on vote share and better calibrated on win probability.
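The hindsight claim about calibration can be made concrete with the Brier score, a standard scoring rule for probability forecasts: (forecast − outcome)², where the outcome is 1 if the event happened and 0 otherwise. A single election cannot establish calibration, but it does show which forecast paid the smaller penalty. A sketch, using the approximate 2016 figures quoted above and treating "Clinton wins the presidency" as the event (outcome 0):

```python
def brier(prob: float, outcome: int) -> float:
    """Brier score for one binary forecast: (p - outcome)**2.
    Lower is better; an uninformative 50% forecast scores 0.25."""
    return (prob - outcome) ** 2

# Approximate 2016 forecasts of a Clinton win (the event did not occur):
for label, p in [("market-style 70%", 0.70),
                 ("model-style 90%", 0.90),
                 ("near-certain 99%", 0.99)]:
    print(f"{label}: Brier = {brier(p, 0):.4f}")
# market-style 70%: Brier = 0.4900
# model-style 90%: Brier = 0.8100
# near-certain 99%: Brier = 0.9801
```

The penalty grows quadratically with overconfidence, which is why the 99% forecasts look so much worse in retrospect than the market's 70%.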
2.9 2020: Biden vs. Trump
Actual result: Biden 52.3%, Trump 47.7% (two-party vote)
The IEM's election-eve price was approximately 52.5% for Biden, an error of 0.2 points. Polling averages had Biden at approximately 53.5%, overestimating his margin by about 1.2 points. This cycle continued the post-2016 pattern of polls slightly overestimating Democratic performance.
Verdict: IEM more accurate.
2.10 2024: Harris vs. Trump
Actual result: Trump 51.6%, Harris 48.4% (two-party vote; approximate)
The 2024 election provided another test case. With the IEM's prominence reduced relative to Polymarket and other modern platforms, the data is sparser. The IEM's final prices were close to the actual result, while national polls again underestimated Republican performance.
Verdict: IEM more accurate than national polling averages; other prediction markets (Polymarket) were also closer to the result than polls.
3. Aggregate Analysis
3.1 Mean Absolute Error Comparison
The following table summarizes the election-eve mean absolute error (MAE) for the IEM vote-share market and for the final polling average across all election cycles:
| Election | IEM MAE (pp) | Polling MAE (pp) | IEM Advantage |
|---|---|---|---|
| 1988 | 0.2 | 2.6 | +2.4 |
| 1992 | 0.5 | 0.6 | +0.1 |
| 1996 | 0.3 | 0.7 | +0.4 |
| 2000 | 0.2 | 1.5 | +1.3 |
| 2004 | 0.3 | 1.2 | +0.9 |
| 2008 | 0.2 | 1.2 | +1.0 |
| 2012 | 0.2 | 1.0 | +0.8 |
| 2016 | 0.4 | 1.4 | +1.0 |
| 2020 | 0.2 | 1.2 | +1.0 |
| 2024 | 0.5 | 1.8 | +1.3 |
| Average | 0.30 | 1.32 | +1.02 |
pp = percentage points. IEM Advantage = Polling MAE minus IEM MAE. Positive values indicate the IEM was more accurate.
The IEM's average election-eve MAE of approximately 0.3 percentage points compares favorably to the polling average's approximately 1.3 percentage points. The IEM was more accurate in every election cycle, though the margin varied.
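The summary row can be recomputed directly from the per-cycle values in the table:

```python
# Per-cycle election-eve MAEs from the table above (percentage points).
iem_mae  = [0.2, 0.5, 0.3, 0.2, 0.3, 0.2, 0.2, 0.4, 0.2, 0.5]
poll_mae = [2.6, 0.6, 0.7, 1.5, 1.2, 1.2, 1.0, 1.4, 1.2, 1.8]

def mean(xs):
    return sum(xs) / len(xs)

# IEM Advantage = Polling MAE minus IEM MAE, per the table's definition.
advantage = [p - i for i, p in zip(iem_mae, poll_mae)]

print(f"IEM average MAE:     {mean(iem_mae):.2f} pp")    # 0.30
print(f"Polling average MAE: {mean(poll_mae):.2f} pp")   # 1.32
print(f"Average advantage:   {mean(advantage):+.2f} pp") # +1.02
print(f"IEM better in {sum(a > 0 for a in advantage)}/10 cycles")
```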
3.2 Time Horizon Analysis
One of the IEM's most striking features is its accuracy over longer time horizons. While polls are volatile months before an election — responding to convention bounces, debate performances, and news cycles — IEM prices tend to be more stable and more accurate early in the campaign.
Berg et al. (2008) showed that the IEM outperformed polls not just on election eve but also at horizons of 1 month, 3 months, and even 6 months before the election. The advantage was largest at longer horizons, suggesting that the IEM's information aggregation mechanism is particularly valuable when individual data points (polls) are noisy.
The data in code/example-02-iem-analysis.py simulates this time-horizon comparison and shows the MAE of IEM prices vs. polling averages at different distances from election day.
3.3 Calibration Analysis
Beyond point-estimate accuracy, we can ask whether IEM prices were well-calibrated as probabilities. Using the winner-take-all market: when the IEM said a candidate had a 60% chance of winning, did that candidate win approximately 60% of the time?
With only 10 presidential elections, the sample size is too small for a robust calibration analysis at the presidential level. However, combining IEM data from presidential elections, midterm elections, Senate races, and gubernatorial races yields a larger sample. Research by Berg and Rietz suggests that IEM prices are reasonably well-calibrated across this broader dataset, though the favorite-longshot bias (overpricing of favorites, underpricing of longshots) introduces some distortion.
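The binning logic behind such a calibration check can be sketched as follows. The prices and outcomes here are hypothetical race-level data for illustration, not actual IEM records:

```python
def calibration_table(forecasts, outcomes,
                      bins=((0.4, 0.5), (0.5, 0.6), (0.6, 0.7), (0.7, 0.8))):
    """Group binary forecasts into probability bins and compare each
    bin's mean forecast to the empirical win rate."""
    rows = []
    for lo, hi in bins:
        hits = [(p, y) for p, y in zip(forecasts, outcomes) if lo <= p < hi]
        if hits:
            mean_p = sum(p for p, _ in hits) / len(hits)
            win_rate = sum(y for _, y in hits) / len(hits)
            rows.append((lo, hi, len(hits), mean_p, win_rate))
    return rows

# Hypothetical pooled prices and 0/1 outcomes across many races:
prices   = [0.45, 0.55, 0.62, 0.48, 0.71, 0.66, 0.58, 0.74, 0.52, 0.69]
outcomes = [0,    1,    1,    0,    1,    0,    1,    1,    0,    1]

for lo, hi, n, mean_p, win_rate in calibration_table(prices, outcomes):
    print(f"[{lo:.0%}-{hi:.0%}): n={n}, mean price {mean_p:.2f}, "
          f"win rate {win_rate:.2f}")
```

In a well-calibrated market, the mean price and the win rate match within each bin; a favorite-longshot bias would show up as win rates below the mean price in high bins and above it in low bins.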
4. Why Markets Outperformed — and When They Didn't
4.1 Why Markets Outperform Polls
Several mechanisms explain the IEM's accuracy advantage:
Continuous updating: Polls are snapshots taken at a single point in time. The IEM price incorporates all available information continuously, including the latest polls, breaking news, economic data, and private information. By election eve, the IEM has processed far more information than any single poll.
Incentive alignment: Poll respondents have no financial incentive to report their true voting intentions accurately. They may lie (social desirability bias), change their minds after being polled, or be uncertain about whether they will actually vote. IEM traders, by contrast, have a financial incentive to be accurate — they lose money if they are wrong.
Self-selection of informed participants: The IEM attracts people who are interested in and knowledgeable about politics. Participants who consistently make poor predictions lose money and tend to drop out, while successful traders remain. This natural selection process tends to improve the quality of the participant pool over time.
Aggregation across information sources: The IEM price aggregates information from many sources — polls, economic models, expert judgment, local knowledge, campaign insider information — into a single number. This aggregation process can be more efficient than any single information source.
4.2 When Polls Are Competitive
Polls perform relatively well under certain conditions:
When the election is not close: In landslide elections (1996, 2008), both polls and markets are accurate because the outcome is easy to predict. The IEM's advantage is smallest in these cycles.
Close to election day: Polls become increasingly accurate as election day approaches, because voter preferences stabilize and the "likely voter" screen becomes more accurate. The IEM's advantage narrows in the final days before the election.
When there are no systematic polling errors: The IEM's advantage is largest when polls have systematic biases (such as underestimating turnout among certain demographics). In cycles where polls are unbiased, the accuracy gap narrows.
4.3 When Markets Struggle
Prediction markets are not infallible. They struggle in several situations:
Small sample of events: With only 10 presidential elections since 1988, it is difficult to draw strong statistical conclusions about whether the IEM is truly better than polls or merely lucky. The sample size problem is real and should temper strong claims.
Thin liquidity: The IEM's small size means that a single large trader can move prices substantially. In several election cycles, researchers have identified episodes where IEM prices were distorted by individual traders making large, poorly informed bets.
Partisan bias: IEM traders, despite having financial incentives, are not fully rational. Research by Forsythe, Rietz, and Ross (1999) documented a "wishful thinking" effect: partisans tend to overvalue contracts for their preferred candidate. When the trader population is ideologically imbalanced, this can bias market prices.
Structural uncertainty: Markets are better at aggregating information than at forecasting genuinely uncertain events. The 2000 election (Gore vs. Bush) was essentially a coin flip, and no forecasting method — market, poll, or model — could predict the outcome with confidence.
5. Visualizations
5.1 Side-by-Side Accuracy Comparison
The code in code/example-02-iem-analysis.py generates a bar chart comparing IEM and polling MAE across election cycles. The visualization highlights:
- The IEM's consistent accuracy advantage
- The variation in this advantage across cycles
- The general trend of both methods improving over time (as both polling methodology and market design have matured)
5.2 Time-Series Plot
The code also generates a time-series plot showing IEM prices and polling averages for a selected election cycle, tracking both from 6 months before the election to election day. This visualization illustrates:
- The greater volatility of poll-based estimates early in the campaign
- The convergence of market and poll estimates as election day approaches
- The IEM's tendency to "see through" temporary polling bounces
5.3 Calibration Plot
A calibration plot shows whether IEM probabilities are well-calibrated. Bins of IEM prices (e.g., 40-50%, 50-60%, 60-70%) are compared to the actual frequency of outcomes in each range. A perfectly calibrated market would produce points along the 45-degree line.
6. The Broader Comparison: Markets vs. Models
6.1 The Rise of Forecasting Models
Since 2008, the landscape of election forecasting has been transformed by statistical models — most prominently Nate Silver's FiveThirtyEight model. These models combine polling data with demographic information, economic indicators, and historical patterns to produce probability estimates that are often more precise than raw polling averages.
The relevant comparison for prediction markets is no longer "markets vs. polls" but "markets vs. sophisticated statistical models." On this comparison, the evidence is more mixed:
- In 2012, FiveThirtyEight's model (approximately 90% confidence in Obama) was better calibrated than InTrade (approximately 67%) or the IEM.
- In 2016, FiveThirtyEight (approximately 72% for Clinton) was better calibrated than models like the Princeton Election Consortium (approximately 99% for Clinton), and comparable to prediction market estimates.
- In 2020 and 2024, prediction markets (including Polymarket) appeared to be somewhat better calibrated than the major statistical models, which overestimated Democratic performance.
6.2 Complementary, Not Competing
The most sophisticated view of the markets-vs.-polls debate is that they are complementary rather than competing tools:
- Polls provide structured, representative samples of voter opinion.
- Models combine polls with other data sources and adjust for known biases.
- Markets aggregate information from many sources (including polls and models) and provide continuous, real-time updates.
The best forecasts likely come from combining all three — and indeed, modern forecasting models increasingly incorporate prediction market prices as one input among many.
7. Implications
7.1 For Prediction Market Design
The IEM experience suggests several design principles:
- Even small, low-stakes markets can produce accurate forecasts.
- The $500 investment cap did not prevent accurate pricing, though it may have limited the market's ability to resist manipulation.
- Academic credibility matters: the IEM's university affiliation gave it legitimacy that commercial platforms struggle to achieve.
7.2 For Election Forecasting
- No single method dominates. The best approach combines markets, polls, and models.
- Markets are most valuable at longer time horizons, where polls are noisiest.
- Markets provide useful calibration information: when the market says 60%, it means 60%, not "leading" or "behind."
7.3 For Prediction Market Users
- Do not treat market prices as certainty. A 70% probability means a 30% chance of the other outcome.
- Be aware of the favorite-longshot bias: markets tend to overconfidently price the favorite.
- Consider the liquidity of the market: thin markets produce less reliable prices.
Discussion Questions
1. With only 10 presidential elections in the IEM dataset, can we confidently conclude that markets outperform polls? What sample size would you need for a statistically significant conclusion?
2. The IEM's participant pool is not representative of the general population. Is this a strength (informed participants) or a weakness (potential ideological bias)?
3. How should we think about the IEM's accuracy in the context of modern platforms like Polymarket that have much higher liquidity? Would you expect Polymarket to be more or less accurate than the IEM?
4. Is the "wishful thinking" bias identified by Forsythe et al. (1999) more or less likely in modern prediction markets, where participation is broader and more anonymous?
5. If you were designing the "perfect" election forecasting system, how would you combine prediction markets, polls, and statistical models?
Key Data Points
| Metric | Value |
|---|---|
| IEM founding year | 1988 |
| Election cycles covered | 10 (1988-2024) |
| Average IEM election-eve MAE | ~0.3 percentage points |
| Average polling election-eve MAE | ~1.3 percentage points |
| IEM outperformance rate | 100% of cycles (election-eve) |
| IEM head-to-head vs. individual polls | ~74% win rate (Berg et al., 2004) |
| Maximum individual investment | $500 |
| Typical active traders per election | 200-800 |
This case study accompanies Chapter 2: A Brief History of Prediction Markets. See also: Case Study 1: The Rise and Fall of InTrade. Code: case-study-code.py and example-02-iem-analysis.py