Case Study 1: Wisdom of Crowds Experiment — Simulating Information Aggregation
Overview
In this case study, we build a comprehensive agent-based simulation to investigate how prediction market prices emerge from the interactions of 1,000 agents with different information, biases, and trading strategies. We will systematically vary the conditions that affect crowd wisdom—diversity, independence, the fraction of informed traders—and measure how these factors influence market accuracy.
By the end of this case study, you will have:
- Built a full agent-based simulation with 1,000 heterogeneous agents
- Demonstrated that market prices converge to the true probability under favorable conditions
- Identified the conditions under which convergence fails
- Quantified the relationship between crowd composition and market accuracy
- Produced publication-quality analysis of your simulation results
Background
The classic "wisdom of crowds" demonstration involves asking a large group to estimate some quantity—the weight of an ox, the number of jellybeans in a jar—and showing that the average estimate is remarkably close to the true value. Prediction markets extend this principle to probability estimation: the market price should converge to the true probability of an event as diverse, independent agents trade based on their private information.
But real markets are more complex than simple averaging. Agents interact through a price mechanism, they may observe and react to each other's trades, and the composition of the trading population matters. This case study uses simulation to explore these complexities.
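Before adding a price mechanism, it helps to see the averaging baseline in code. The sketch below is illustrative only (the noise level and seed are arbitrary choices, not taken from the simulation): it shows that the mean of many noisy, independent estimates is far more accurate than a typical individual estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
true_value = 0.65

# 1,000 independent, unbiased but noisy estimates of the true value
estimates = np.clip(rng.normal(true_value, 0.15, size=1000), 0.0, 1.0)

individual_error = np.abs(estimates - true_value).mean()  # typical personal miss
crowd_error = abs(estimates.mean() - true_value)          # miss of the average

print(f"Average individual error: {individual_error:.3f}")
print(f"Error of the crowd mean:  {crowd_error:.3f}")
```

With these settings the crowd mean should land within roughly a percentage point of the truth while a typical individual is off by more than ten. The market mechanism explored below has to match or beat this baseline while coping with interaction effects that simple averaging ignores.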
Part 1: Setting Up the Simulation
1.1 The Event
We simulate a prediction market for a binary event with a true probability of $\theta = 0.65$. This could represent any real-world question—an election outcome, a product launch decision, a scientific hypothesis. The key is that no single agent knows the true probability, but collectively, their information can reveal it.
1.2 Agent Architecture
Our simulation includes 1,000 agents drawn from six types:
| Agent Type | Count | Description | Information Quality |
|---|---|---|---|
| Expert fundamentalist | 30 | Deep domain knowledge | Very high (noise std = 0.03) |
| Informed fundamentalist | 120 | Good relevant knowledge | High (noise std = 0.08) |
| Casual informed | 250 | Some relevant knowledge | Medium (noise std = 0.18) |
| Noise trader | 400 | No relevant information | None (random beliefs) |
| Contrarian | 50 | Systematically biased against consensus | Inverted signals |
| Momentum trader | 150 | Follow price trends | No fundamental info |
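As a sanity check on this population design, the hedged sketch below draws signals for the informed tiers and the contrarians using the counts and noise levels from the table (the clipping bounds match the agent constructors defined below; the seed is arbitrary). The pooled informed signal should sit near the true probability, while contrarian signals cluster near its complement.

```python
import numpy as np

rng = np.random.default_rng(1)
theta = 0.65  # true probability

def draw_signals(n, noise_std):
    """Noisy signals clipped to (0.01, 0.99), as in the agent constructors."""
    return np.clip(rng.normal(theta, noise_std, size=n), 0.01, 0.99)

experts = draw_signals(30, 0.03)
informed = draw_signals(120, 0.08)
casual = draw_signals(250, 0.18)
contrarian = 1.0 - draw_signals(50, 0.15)  # inverted signals

informed_pool = np.concatenate([experts, informed, casual])
print(f"Mean informed signal:   {informed_pool.mean():.3f}")  # near theta
print(f"Mean contrarian signal: {contrarian.mean():.3f}")     # near 1 - theta
```

Noise traders and momentum traders carry no signal at all, so whether the market recovers $\theta$ depends on how the price mechanism weights these pools against each other.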
1.3 Core Simulation Code
"""
Case Study 1: Wisdom of Crowds — Agent-Based Simulation
Chapter 11: Information Aggregation Theory
This simulation demonstrates how diverse agents with partial information
can collectively arrive at accurate probability estimates through
market interaction.
"""
import numpy as np
from collections import defaultdict
class Agent:
"""Base class for all agent types."""
def __init__(self, agent_id, agent_type, cash=1000.0):
self.agent_id = agent_id
self.agent_type = agent_type
self.cash = cash
self.position = 0.0
self.trade_history = []
self.belief = 0.5 # initial uninformed belief
def compute_demand(self, current_price, price_history):
"""Compute desired trade quantity. Positive = buy, negative = sell."""
raise NotImplementedError
def update_belief(self, current_price, price_history):
"""Update belief based on market information."""
pass
class ExpertFundamentalist(Agent):
"""Expert with very accurate private signal."""
def __init__(self, agent_id, true_prob, noise_std=0.03):
super().__init__(agent_id, 'expert_fundamentalist')
# Receive a very accurate signal
self.signal = np.clip(
np.random.normal(true_prob, noise_std), 0.01, 0.99
)
self.belief = self.signal
self.intensity = 5.0 # trade aggressively
def compute_demand(self, current_price, price_history):
mispricing = self.belief - current_price
# Trade proportional to mispricing, bounded by position limits
demand = self.intensity * mispricing
# Risk management: reduce demand as position grows
if abs(self.position) > 100:
demand *= 100 / abs(self.position)
return demand
class InformedFundamentalist(Agent):
"""Trader with good but imperfect private signal."""
def __init__(self, agent_id, true_prob, noise_std=0.08):
super().__init__(agent_id, 'informed_fundamentalist')
self.signal = np.clip(
np.random.normal(true_prob, noise_std), 0.01, 0.99
)
self.belief = self.signal
self.intensity = 2.0
def compute_demand(self, current_price, price_history):
mispricing = self.belief - current_price
demand = self.intensity * mispricing
if abs(self.position) > 50:
demand *= 50 / abs(self.position)
return demand
class CasualInformed(Agent):
"""Trader with some relevant knowledge, noisy signal."""
def __init__(self, agent_id, true_prob, noise_std=0.18):
super().__init__(agent_id, 'casual_informed')
self.signal = np.clip(
np.random.normal(true_prob, noise_std), 0.01, 0.99
)
self.belief = self.signal
self.intensity = 0.8
self.learning_rate = 0.1 # they update beliefs based on price
def compute_demand(self, current_price, price_history):
mispricing = self.belief - current_price
return self.intensity * mispricing
def update_belief(self, current_price, price_history):
# Casual traders are influenced by the market price
self.belief = (1 - self.learning_rate) * self.belief + \
self.learning_rate * current_price
class NoiseTrader(Agent):
"""Trader with no relevant information; trades randomly."""
def __init__(self, agent_id):
super().__init__(agent_id, 'noise_trader')
self.belief = np.random.uniform(0.1, 0.9)
self.volatility = 0.5
def compute_demand(self, current_price, price_history):
return np.random.normal(0, self.volatility)
def update_belief(self, current_price, price_history):
# Noise traders have drifting random beliefs
self.belief = np.clip(
self.belief + np.random.normal(0, 0.05), 0.05, 0.95
)
class Contrarian(Agent):
"""Trader who systematically bets against the consensus."""
def __init__(self, agent_id, true_prob, noise_std=0.15):
super().__init__(agent_id, 'contrarian')
# Contrarians have inverted signals (model the wrong direction)
raw_signal = np.clip(
np.random.normal(true_prob, noise_std), 0.01, 0.99
)
self.signal = 1.0 - raw_signal # inverted!
self.belief = self.signal
self.intensity = 1.5
def compute_demand(self, current_price, price_history):
mispricing = self.belief - current_price
return self.intensity * mispricing
class MomentumTrader(Agent):
"""Trader who follows recent price trends."""
def __init__(self, agent_id, lookback=15, sensitivity=2.0):
super().__init__(agent_id, 'momentum_trader')
self.lookback = lookback
self.sensitivity = sensitivity
def compute_demand(self, current_price, price_history):
if len(price_history) < self.lookback:
return 0
recent = price_history[-self.lookback:]
trend = (recent[-1] - recent[0]) / self.lookback
return self.sensitivity * trend * 50
class PredictionMarketSimulation:
"""
Full prediction market simulation with heterogeneous agents.
"""
def __init__(self, true_probability, initial_price=0.50,
price_impact=0.0008):
self.true_probability = true_probability
self.price = initial_price
self.price_impact = price_impact
self.price_history = [initial_price]
self.volume_history = []
self.agents = []
self.round_data = []
def create_standard_population(self):
"""Create the standard 1000-agent population."""
agent_id = 0
# 30 expert fundamentalists
for _ in range(30):
self.agents.append(
ExpertFundamentalist(agent_id, self.true_probability)
)
agent_id += 1
# 120 informed fundamentalists
for _ in range(120):
self.agents.append(
InformedFundamentalist(agent_id, self.true_probability)
)
agent_id += 1
# 250 casual informed
for _ in range(250):
self.agents.append(
CasualInformed(agent_id, self.true_probability)
)
agent_id += 1
# 400 noise traders
for _ in range(400):
self.agents.append(NoiseTrader(agent_id))
agent_id += 1
# 50 contrarians
for _ in range(50):
self.agents.append(
Contrarian(agent_id, self.true_probability)
)
agent_id += 1
# 150 momentum traders
for _ in range(150):
self.agents.append(MomentumTrader(agent_id))
agent_id += 1
return self
def run(self, n_rounds=500, participation_rate=0.25):
"""
Run the simulation.
Parameters
----------
n_rounds : int
Number of trading rounds.
participation_rate : float
Fraction of agents active each round.
"""
for t in range(n_rounds):
# Select active agents this round
active_mask = np.random.random(len(self.agents)) < participation_rate
active_agents = [a for a, m in zip(self.agents, active_mask) if m]
# Collect demands
demands = []
for agent in active_agents:
demand = agent.compute_demand(self.price, self.price_history)
demands.append((agent, demand))
# Calculate net demand and price change
net_demand = sum(d for _, d in demands)
volume = sum(abs(d) for _, d in demands)
price_change = self.price_impact * net_demand
new_price = np.clip(self.price + price_change, 0.01, 0.99)
# Execute trades
for agent, demand in demands:
if abs(demand) > 0.001:
trade_price = (self.price + new_price) / 2
agent.position += demand
agent.cash -= demand * trade_price
agent.trade_history.append({
'round': t,
'demand': demand,
'price': trade_price
})
# Update price
self.price = new_price
self.price_history.append(self.price)
self.volume_history.append(volume)
# Agents update beliefs
for agent in self.agents:
agent.update_belief(self.price, self.price_history)
# Record round data
self.round_data.append({
'round': t,
'price': self.price,
'volume': volume,
'net_demand': net_demand,
'n_active': len(active_agents),
'price_error': abs(self.price - self.true_probability)
})
return self
def settle_and_analyze(self):
"""Settle contracts and analyze results."""
# Determine event outcome
outcome = 1.0 if np.random.random() < self.true_probability else 0.0
# Compute PnL for each agent
for agent in self.agents:
settlement_value = agent.position * outcome
agent.pnl = settlement_value + agent.cash - 1000.0
# Aggregate results by type
type_results = defaultdict(lambda: {
'count': 0, 'total_pnl': 0, 'pnls': [],
'avg_belief': 0, 'beliefs': []
})
for agent in self.agents:
t = agent.agent_type
type_results[t]['count'] += 1
type_results[t]['total_pnl'] += agent.pnl
type_results[t]['pnls'].append(agent.pnl)
type_results[t]['beliefs'].append(agent.belief)
for t in type_results:
type_results[t]['avg_pnl'] = (
type_results[t]['total_pnl'] / type_results[t]['count']
)
type_results[t]['avg_belief'] = np.mean(type_results[t]['beliefs'])
type_results[t]['pnl_std'] = np.std(type_results[t]['pnls'])
# Price accuracy metrics
prices = np.array(self.price_history)
errors = np.abs(prices - self.true_probability)
        # Convergence: first round beginning a stretch of 20 consecutive
        # rounds with error below 0.03
        converged_round = None
        for i in range(len(errors) - 19):
            if np.all(errors[i:i + 20] < 0.03):
                converged_round = i
                break
return {
'outcome': outcome,
'final_price': self.price_history[-1],
'true_probability': self.true_probability,
'final_error': errors[-1],
'mean_error': errors.mean(),
'min_error': errors.min(),
'converged_round': converged_round,
'type_results': dict(type_results),
'price_history': self.price_history,
'volume_history': self.volume_history
}
Part 2: Running the Baseline Simulation
2.1 Standard Conditions
# Baseline simulation: standard population, standard conditions
np.random.seed(42)
sim = PredictionMarketSimulation(true_probability=0.65)
sim.create_standard_population()
sim.run(n_rounds=500)
results = sim.settle_and_analyze()
print("=" * 60)
print("BASELINE SIMULATION RESULTS")
print("=" * 60)
print(f"\nTrue probability: {results['true_probability']:.3f}")
print(f"Final market price: {results['final_price']:.3f}")
print(f"Final error: {results['final_error']:.4f}")
print(f"Mean error: {results['mean_error']:.4f}")
print(f"Converged at round: {results['converged_round']}")
print(f"Event outcome: {results['outcome']}")
print(f"\n{'Agent Type':<25} {'Count':>6} {'Avg PnL':>10} {'Avg Belief':>12}")
print("-" * 55)
for agent_type, data in sorted(results['type_results'].items()):
print(f"{agent_type:<25} {data['count']:>6} "
f"{data['avg_pnl']:>10.2f} {data['avg_belief']:>12.3f}")
2.2 Interpreting Baseline Results
The baseline simulation should show:
- Price convergence. The market price should converge toward 0.65 within the first 100-200 rounds, despite starting at 0.50 and being influenced by 400 noise traders and 150 momentum traders.
- Informed traders profit. Expert and informed fundamentalists should earn positive PnL on average, while noise traders and contrarians lose money. This wealth transfer is the mechanism by which the market rewards accurate information.
- Momentum traders are a mixed bag. Their profitability depends on whether the dominant trend (from 0.50 toward 0.65) was strong enough to exploit, versus the noise-driven false trends they also chased.
Part 3: Varying Conditions
3.1 Experiment 1: Diversity Matters
What happens when we reduce the diversity of the informed population?
def run_diversity_experiment(n_simulations=50):
"""
Compare market accuracy with diverse vs. homogeneous
informed populations.
"""
results_diverse = []
results_homogeneous = []
for _ in range(n_simulations):
# Diverse: standard population
sim_d = PredictionMarketSimulation(true_probability=0.65)
sim_d.create_standard_population()
sim_d.run(n_rounds=300)
r_d = sim_d.settle_and_analyze()
results_diverse.append(r_d['final_error'])
# Homogeneous: all informed traders have same signal quality
sim_h = PredictionMarketSimulation(true_probability=0.65)
# Replace diverse fundamentalists with all the same type
agent_id = 0
for _ in range(400): # same total informed count
sim_h.agents.append(
InformedFundamentalist(agent_id, 0.65, noise_std=0.10)
)
agent_id += 1
for _ in range(400):
sim_h.agents.append(NoiseTrader(agent_id))
agent_id += 1
for _ in range(200):
sim_h.agents.append(MomentumTrader(agent_id))
agent_id += 1
sim_h.run(n_rounds=300)
r_h = sim_h.settle_and_analyze()
results_homogeneous.append(r_h['final_error'])
print("=== Diversity Experiment ===")
print(f"Diverse population - Mean error: {np.mean(results_diverse):.4f} "
f"(std: {np.std(results_diverse):.4f})")
print(f"Homogeneous population - Mean error: {np.mean(results_homogeneous):.4f} "
f"(std: {np.std(results_homogeneous):.4f})")
return results_diverse, results_homogeneous
3.2 Experiment 2: Independence and Correlated Beliefs
What happens when agents share information and their beliefs become correlated?
def run_correlation_experiment(n_simulations=50):
"""
Simulate different levels of belief correlation
(social influence between agents).
"""
correlation_levels = [0.0, 0.1, 0.2, 0.3, 0.5, 0.7]
results = {}
for rho in correlation_levels:
errors = []
for _ in range(n_simulations):
sim = PredictionMarketSimulation(true_probability=0.65)
# Create agents with correlated noise
agent_id = 0
common_shock = np.random.normal(0, 0.15) # shared error
for _ in range(200):
agent = InformedFundamentalist(agent_id, 0.65, noise_std=0.10)
# Add correlated component to signal
agent.signal = np.clip(
agent.signal + np.sqrt(rho) * common_shock,
0.01, 0.99
)
agent.belief = agent.signal
sim.agents.append(agent)
agent_id += 1
for _ in range(800):
sim.agents.append(NoiseTrader(agent_id))
agent_id += 1
sim.run(n_rounds=300)
r = sim.settle_and_analyze()
errors.append(r['final_error'])
results[rho] = {
'mean_error': np.mean(errors),
'std_error': np.std(errors),
'median_error': np.median(errors)
}
print("=== Correlation Experiment ===")
print(f"{'Correlation':>12} {'Mean Error':>12} {'Std Error':>12}")
print("-" * 38)
for rho, data in sorted(results.items()):
print(f"{rho:>12.1f} {data['mean_error']:>12.4f} "
f"{data['std_error']:>12.4f}")
return results
3.3 Experiment 3: Informed Trader Fraction
How many informed traders are needed for accurate prices?
def run_informed_fraction_experiment(n_simulations=50):
"""
Vary the fraction of informed traders and measure accuracy.
"""
fractions = [0.01, 0.02, 0.05, 0.10, 0.15, 0.20, 0.30, 0.50]
results = {}
for frac in fractions:
errors = []
convergence_rounds = []
for _ in range(n_simulations):
sim = PredictionMarketSimulation(true_probability=0.65)
agent_id = 0
n_total = 1000
n_informed = int(n_total * frac)
n_noise = n_total - n_informed
for _ in range(n_informed):
sim.agents.append(
InformedFundamentalist(agent_id, 0.65, noise_std=0.08)
)
agent_id += 1
for _ in range(n_noise):
sim.agents.append(NoiseTrader(agent_id))
agent_id += 1
sim.run(n_rounds=400)
r = sim.settle_and_analyze()
errors.append(r['final_error'])
convergence_rounds.append(
r['converged_round'] if r['converged_round'] is not None else 400
)
results[frac] = {
'mean_error': np.mean(errors),
'std_error': np.std(errors),
'mean_convergence': np.mean(convergence_rounds),
'pct_converged': sum(
1 for c in convergence_rounds if c < 400
) / len(convergence_rounds)
}
print("=== Informed Fraction Experiment ===")
print(f"{'Fraction':>10} {'Mean Error':>12} {'Convergence':>14} "
f"{'% Converged':>13}")
print("-" * 52)
for frac, data in sorted(results.items()):
print(f"{frac:>9.0%} {data['mean_error']:>12.4f} "
f"{data['mean_convergence']:>12.0f} "
f"{data['pct_converged']:>12.0%}")
return results
Part 4: Analysis and Visualization
4.1 Price Path Analysis
def analyze_price_path(price_history, true_prob, label=""):
"""Analyze properties of the simulated price path."""
prices = np.array(price_history)
errors = np.abs(prices - true_prob)
returns = np.diff(prices)
# Compute statistics
stats = {
'label': label,
'n_rounds': len(prices) - 1,
'final_price': prices[-1],
'final_error': errors[-1],
'mean_error': errors.mean(),
'max_error': errors.max(),
'return_mean': returns.mean(),
'return_std': returns.std(),
'return_autocorr_lag1': np.corrcoef(returns[:-1], returns[1:])[0, 1]
if len(returns) > 2 else 0,
        'max_drawdown': float(
            np.max(np.maximum.accumulate(prices) - prices)
        ) if len(prices) > 1 else 0
}
print(f"\n--- Price Path Analysis: {label} ---")
for key, val in stats.items():
if key != 'label':
if isinstance(val, float):
print(f" {key:<25}: {val:.4f}")
else:
print(f" {key:<25}: {val}")
return stats
4.2 Wealth Transfer Analysis
def analyze_wealth_transfer(results):
"""
Analyze how wealth flows between agent types.
This is the key mechanism driving information aggregation.
"""
print("\n=== Wealth Transfer Analysis ===")
print("(Positive PnL = gained from market; Negative = lost to market)\n")
total_informed_profit = 0
total_noise_loss = 0
for agent_type, data in sorted(results['type_results'].items()):
total_pnl = data['total_pnl']
avg_pnl = data['avg_pnl']
count = data['count']
bar_length = int(abs(avg_pnl) / 2)
if avg_pnl >= 0:
bar = "+" * min(bar_length, 40)
bar_str = f"[{bar}>"
else:
bar = "-" * min(bar_length, 40)
bar_str = f"<{bar}]"
print(f" {agent_type:<25} {avg_pnl:>+8.2f} per agent "
f"{total_pnl:>+10.2f} total {bar_str}")
        # Fundamentalists (expert + informed) and casual informed traders
        # count as informed; momentum traders fall in neither bucket
        if 'fundamentalist' in agent_type or 'casual' in agent_type:
total_informed_profit += total_pnl
if 'noise' in agent_type or 'contrarian' in agent_type:
total_noise_loss += total_pnl
    print(f"\n  Total informed profit:       {total_informed_profit:>+10.2f}")
    print(f"  Total noise/contrarian loss: {total_noise_loss:>+10.2f}")
    # The offset is approximate, not exact: momentum traders are excluded
    # from the tally, and trades execute against the price mechanism rather
    # than matched counterparties
    print(f"  Informed gains and noise/contrarian losses roughly offset: "
          f"{abs(total_informed_profit + total_noise_loss) < 100}")
Part 5: Key Findings and Discussion
5.1 Expected Results
When you run these simulations, you should observe the following patterns:
Finding 1: Convergence. With the standard population, prices converge to within 3 percentage points of the true probability within 100-200 rounds in most simulations. This demonstrates the core wisdom-of-crowds result.
Finding 2: Diversity improves accuracy. The diverse population (experts + informed + casual) produces lower mean errors than a homogeneous population with the same total number of informed traders. Different noise structures cancel more effectively.
Finding 3: Correlation is destructive. Even modest correlation ($\rho = 0.2$) significantly degrades accuracy. At $\rho = 0.5$, the market performs only marginally better than a single trader.
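Finding 3 has a clean analytic basis: if individual estimation errors have variance $\sigma^2$ and pairwise correlation $\rho$, the variance of the crowd mean is $\sigma^2\left(\rho + \frac{1-\rho}{n}\right)$, which approaches the floor $\rho\sigma^2$ no matter how large the crowd grows. A quick numeric check (parameters here are illustrative, not taken from the simulation):

```python
import numpy as np

rng = np.random.default_rng(2)
sigma, n, trials = 0.10, 200, 2000
results = {}

for rho in (0.0, 0.2, 0.5):
    # Correlated estimation errors: a shared shock plus idiosyncratic noise,
    # scaled so each error has variance sigma**2 and pairwise correlation rho
    common = rng.normal(0, 1, size=(trials, 1))
    idio = rng.normal(0, 1, size=(trials, n))
    errors = sigma * (np.sqrt(rho) * common + np.sqrt(1 - rho) * idio)
    empirical = errors.mean(axis=1).std()          # std of the crowd mean
    theory = sigma * np.sqrt(rho + (1 - rho) / n)  # analytic prediction
    results[rho] = (empirical, theory)
    print(f"rho={rho:.1f}: crowd-mean std {empirical:.4f} (theory {theory:.4f})")
```

At $\rho = 0.5$ the shared component dominates: adding more traders barely helps, which is exactly the degradation the correlation experiment measures.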
Finding 4: A few informed traders suffice. Markets with as few as 5% informed traders can achieve reasonable accuracy if those traders are active and well-capitalized. Below 2%, accuracy degrades rapidly.
Finding 5: Wealth transfer is the mechanism. Informed traders profit at the expense of noise traders and contrarians. This wealth transfer is not a bug—it is the mechanism that incentivizes information acquisition and transmission.
5.2 Discussion Questions
- How does the participation rate (fraction of agents active each round) affect convergence speed? Is it better to have all agents trade every round, or is random participation sufficient?
- If you could add one more agent type to the simulation, what would it be and why? Consider agents who acquire information at a cost, agents who switch strategies based on performance, or agents who communicate through social networks.
- How would the results change if the true probability were 0.95 instead of 0.65? Would convergence be faster or slower? Why?
- The simulation uses a simple price-impact model. How would results differ with an LMSR market maker? What are the trade-offs?
- In practice, prediction markets operate over days or weeks, not 500 discrete rounds. How would you calibrate the simulation parameters (price impact, participation rate, trading intensity) to match real-world data?
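For the LMSR question, the relevant machinery is Hanson's logarithmic market scoring rule: prices come from a cost function held by an automated market maker rather than from net order flow. A minimal binary-market sketch (the liquidity parameter b = 100 is an arbitrary choice for illustration):

```python
import numpy as np

def lmsr_cost(q_yes, q_no, b=100.0):
    """LMSR cost function C(q) = b * log(exp(q_yes/b) + exp(q_no/b))."""
    return b * np.logaddexp(q_yes / b, q_no / b)

def lmsr_price(q_yes, q_no, b=100.0):
    """Instantaneous YES price, dC/dq_yes: a logistic in (q_yes - q_no)."""
    return 1.0 / (1.0 + np.exp((q_no - q_yes) / b))

# Cost of buying 50 YES shares into a balanced book: C(50, 0) - C(0, 0)
cost = lmsr_cost(50.0, 0.0) - lmsr_cost(0.0, 0.0)
print(f"Price before trade: {lmsr_price(0.0, 0.0):.3f}")
print(f"Cost of 50 YES shares: {cost:.2f}")
print(f"Price after trade:  {lmsr_price(50.0, 0.0):.3f}")
```

Unlike the linear price-impact model used above, LMSR guarantees liquidity at every price and caps the market maker's worst-case loss at $b \ln 2$ for a binary market; the trade-off is that someone must subsidize that loss.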
5.3 Extensions
For further exploration:
- Dynamic information. Add events at specific rounds where new public information shifts the true probability. Measure how quickly the market adjusts.
- Endogenous participation. Let agents decide whether to participate based on their expected profit. This creates an equilibrium where only the most informed agents remain active.
- Network effects. Give agents a social network and let them share beliefs before trading. Measure how different network structures affect accuracy.
- Multiple markets. Run correlated markets simultaneously and see if agents who are informed about the correlation can profit by trading across markets.
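For the network-effects extension, one hedged starting point is DeGroot-style belief pooling: before trading, each agent repeatedly replaces its belief with an average over its network neighborhood. The sketch below (ring topology, 20 sharing rounds, all parameters arbitrary) shows why structure matters: a sparse ring erodes belief diversity gradually, while a complete graph collapses it to consensus in a single round.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100
initial = np.clip(rng.normal(0.65, 0.15, size=n), 0.01, 0.99)

# DeGroot updating on a ring: each agent averages its belief with its
# two neighbors' beliefs (row-stochastic weight matrix)
W = np.zeros((n, n))
for i in range(n):
    W[i, [i, (i - 1) % n, (i + 1) % n]] = 1 / 3

ring = initial.copy()
for _ in range(20):
    ring = W @ ring

# Complete-graph benchmark: one sharing round reaches consensus at the mean
dense = np.full(n, initial.mean())

print(f"Belief std: initial {initial.std():.3f}, "
      f"ring after 20 rounds {ring.std():.3f}, "
      f"complete graph {dense.std():.3f}")
```

Once beliefs have been pooled this way they are no longer independent, so feeding them into the market should reproduce the accuracy loss seen in the correlation experiment, with the network's mixing speed controlling how severe it is.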