Case Study 1: Wisdom of Crowds Experiment — Simulating Information Aggregation

Overview

In this case study, we build a comprehensive agent-based simulation to investigate how prediction market prices emerge from the interactions of 1,000 agents with different information, biases, and trading strategies. We will systematically vary the conditions that affect crowd wisdom—diversity, independence, the fraction of informed traders—and measure how these factors influence market accuracy.

By the end of this case study, you will have:

  1. Built a full agent-based simulation with 1,000 heterogeneous agents
  2. Demonstrated that market prices converge to the true probability under favorable conditions
  3. Identified the conditions under which convergence fails
  4. Quantified the relationship between crowd composition and market accuracy
  5. Produced publication-quality analysis of your simulation results

Background

The classic "wisdom of crowds" demonstration involves asking a large group to estimate some quantity—the weight of an ox, the number of jellybeans in a jar—and showing that the average estimate is remarkably close to the true value. Prediction markets extend this principle to probability estimation: the market price should converge to the true probability of an event as diverse, independent agents trade based on their private information.

But real markets are more complex than simple averaging. Agents interact through a price mechanism, they may observe and react to each other's trades, and the composition of the trading population matters. This case study uses simulation to explore these complexities.


Part 1: Setting Up the Simulation

1.1 The Event

We simulate a prediction market for a binary event with a true probability of $\theta = 0.65$. This could represent any real-world question—an election outcome, a product launch decision, a scientific hypothesis. The key is that no single agent knows the true probability, but collectively, their information can reveal it.
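
Before building the full market, it is worth sanity-checking the premise of simple averaging. The short sketch below (assuming only NumPy; the sample size and noise level mirror the casual-informed agents defined later) draws noisy signals centered on $\theta$ and confirms that their mean lands near the truth:

import numpy as np

theta = 0.65
# Draw 250 noisy signals around theta, clipped the same way the agents' signals are
signals = np.clip(np.random.normal(theta, 0.18, size=250), 0.01, 0.99)
print(f"Mean of 250 noisy signals: {signals.mean():.3f}")  # typically within ~0.02 of 0.65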

1.2 Agent Architecture

Our simulation includes 1,000 agents drawn from six types:

Agent Type               Count   Description                                Information Quality
------------------------------------------------------------------------------------------------------
Expert fundamentalist       30   Deep domain knowledge                      Very high (noise std = 0.03)
Informed fundamentalist    120   Good relevant knowledge                    High (noise std = 0.08)
Casual informed            250   Some relevant knowledge                    Medium (noise std = 0.18)
Noise trader               400   No relevant information                    None (random beliefs)
Contrarian                  50   Systematically biased against consensus    Inverted signals
Momentum trader            150   Follow price trends                        No fundamental info

1.3 Core Simulation Code

"""
Case Study 1: Wisdom of Crowds — Agent-Based Simulation
Chapter 11: Information Aggregation Theory

This simulation demonstrates how diverse agents with partial information
can collectively arrive at accurate probability estimates through
market interaction.
"""

import numpy as np
from collections import defaultdict


class Agent:
    """Base class for all agent types."""

    def __init__(self, agent_id, agent_type, cash=1000.0):
        self.agent_id = agent_id
        self.agent_type = agent_type
        self.cash = cash
        self.position = 0.0
        self.trade_history = []
        self.belief = 0.5  # initial uninformed belief

    def compute_demand(self, current_price, price_history):
        """Compute desired trade quantity. Positive = buy, negative = sell."""
        raise NotImplementedError

    def update_belief(self, current_price, price_history):
        """Update belief based on market information."""
        pass


class ExpertFundamentalist(Agent):
    """Expert with very accurate private signal."""

    def __init__(self, agent_id, true_prob, noise_std=0.03):
        super().__init__(agent_id, 'expert_fundamentalist')
        # Receive a very accurate signal
        self.signal = np.clip(
            np.random.normal(true_prob, noise_std), 0.01, 0.99
        )
        self.belief = self.signal
        self.intensity = 5.0  # trade aggressively

    def compute_demand(self, current_price, price_history):
        mispricing = self.belief - current_price
        # Trade proportional to mispricing, bounded by position limits
        demand = self.intensity * mispricing
        # Risk management: reduce demand as position grows
        if abs(self.position) > 100:
            demand *= 100 / abs(self.position)
        return demand


class InformedFundamentalist(Agent):
    """Trader with good but imperfect private signal."""

    def __init__(self, agent_id, true_prob, noise_std=0.08):
        super().__init__(agent_id, 'informed_fundamentalist')
        self.signal = np.clip(
            np.random.normal(true_prob, noise_std), 0.01, 0.99
        )
        self.belief = self.signal
        self.intensity = 2.0

    def compute_demand(self, current_price, price_history):
        mispricing = self.belief - current_price
        demand = self.intensity * mispricing
        if abs(self.position) > 50:
            demand *= 50 / abs(self.position)
        return demand


class CasualInformed(Agent):
    """Trader with some relevant knowledge, noisy signal."""

    def __init__(self, agent_id, true_prob, noise_std=0.18):
        super().__init__(agent_id, 'casual_informed')
        self.signal = np.clip(
            np.random.normal(true_prob, noise_std), 0.01, 0.99
        )
        self.belief = self.signal
        self.intensity = 0.8
        self.learning_rate = 0.1  # they update beliefs based on price

    def compute_demand(self, current_price, price_history):
        mispricing = self.belief - current_price
        return self.intensity * mispricing

    def update_belief(self, current_price, price_history):
        # Casual traders are influenced by the market price
        self.belief = (1 - self.learning_rate) * self.belief + \
                      self.learning_rate * current_price


class NoiseTrader(Agent):
    """Trader with no relevant information; trades randomly."""

    def __init__(self, agent_id):
        super().__init__(agent_id, 'noise_trader')
        self.belief = np.random.uniform(0.1, 0.9)
        self.volatility = 0.5

    def compute_demand(self, current_price, price_history):
        return np.random.normal(0, self.volatility)

    def update_belief(self, current_price, price_history):
        # Noise traders have drifting random beliefs
        self.belief = np.clip(
            self.belief + np.random.normal(0, 0.05), 0.05, 0.95
        )


class Contrarian(Agent):
    """Trader who systematically bets against the consensus."""

    def __init__(self, agent_id, true_prob, noise_std=0.15):
        super().__init__(agent_id, 'contrarian')
        # Contrarians have inverted signals (model the wrong direction)
        raw_signal = np.clip(
            np.random.normal(true_prob, noise_std), 0.01, 0.99
        )
        self.signal = 1.0 - raw_signal  # inverted!
        self.belief = self.signal
        self.intensity = 1.5

    def compute_demand(self, current_price, price_history):
        mispricing = self.belief - current_price
        return self.intensity * mispricing


class MomentumTrader(Agent):
    """Trader who follows recent price trends."""

    def __init__(self, agent_id, lookback=15, sensitivity=2.0):
        super().__init__(agent_id, 'momentum_trader')
        self.lookback = lookback
        self.sensitivity = sensitivity

    def compute_demand(self, current_price, price_history):
        if len(price_history) < self.lookback:
            return 0
        recent = price_history[-self.lookback:]
        trend = (recent[-1] - recent[0]) / self.lookback
        return self.sensitivity * trend * 50


class PredictionMarketSimulation:
    """
    Full prediction market simulation with heterogeneous agents.
    """

    def __init__(self, true_probability, initial_price=0.50,
                 price_impact=0.0008):
        self.true_probability = true_probability
        self.price = initial_price
        self.price_impact = price_impact
        self.price_history = [initial_price]
        self.volume_history = []
        self.agents = []
        self.round_data = []

    def create_standard_population(self):
        """Create the standard 1000-agent population."""
        agent_id = 0

        # 30 expert fundamentalists
        for _ in range(30):
            self.agents.append(
                ExpertFundamentalist(agent_id, self.true_probability)
            )
            agent_id += 1

        # 120 informed fundamentalists
        for _ in range(120):
            self.agents.append(
                InformedFundamentalist(agent_id, self.true_probability)
            )
            agent_id += 1

        # 250 casual informed
        for _ in range(250):
            self.agents.append(
                CasualInformed(agent_id, self.true_probability)
            )
            agent_id += 1

        # 400 noise traders
        for _ in range(400):
            self.agents.append(NoiseTrader(agent_id))
            agent_id += 1

        # 50 contrarians
        for _ in range(50):
            self.agents.append(
                Contrarian(agent_id, self.true_probability)
            )
            agent_id += 1

        # 150 momentum traders
        for _ in range(150):
            self.agents.append(MomentumTrader(agent_id))
            agent_id += 1

        return self

    def run(self, n_rounds=500, participation_rate=0.25):
        """
        Run the simulation.

        Parameters
        ----------
        n_rounds : int
            Number of trading rounds.
        participation_rate : float
            Fraction of agents active each round.
        """
        for t in range(n_rounds):
            # Select active agents this round
            active_mask = np.random.random(len(self.agents)) < participation_rate
            active_agents = [a for a, m in zip(self.agents, active_mask) if m]

            # Collect demands
            demands = []
            for agent in active_agents:
                demand = agent.compute_demand(self.price, self.price_history)
                demands.append((agent, demand))

            # Calculate net demand and price change
            net_demand = sum(d for _, d in demands)
            volume = sum(abs(d) for _, d in demands)

            price_change = self.price_impact * net_demand
            new_price = np.clip(self.price + price_change, 0.01, 0.99)

            # Execute trades
            for agent, demand in demands:
                if abs(demand) > 0.001:
                    trade_price = (self.price + new_price) / 2
                    agent.position += demand
                    agent.cash -= demand * trade_price
                    agent.trade_history.append({
                        'round': t,
                        'demand': demand,
                        'price': trade_price
                    })

            # Update price
            self.price = new_price
            self.price_history.append(self.price)
            self.volume_history.append(volume)

            # Agents update beliefs
            for agent in self.agents:
                agent.update_belief(self.price, self.price_history)

            # Record round data
            self.round_data.append({
                'round': t,
                'price': self.price,
                'volume': volume,
                'net_demand': net_demand,
                'n_active': len(active_agents),
                'price_error': abs(self.price - self.true_probability)
            })

        return self

    def settle_and_analyze(self):
        """Settle contracts and analyze results."""
        # Determine event outcome
        outcome = 1.0 if np.random.random() < self.true_probability else 0.0

        # Compute PnL for each agent
        for agent in self.agents:
            settlement_value = agent.position * outcome
            agent.pnl = settlement_value + agent.cash - 1000.0

        # Aggregate results by type
        type_results = defaultdict(lambda: {
            'count': 0, 'total_pnl': 0, 'pnls': [],
            'avg_belief': 0, 'beliefs': []
        })

        for agent in self.agents:
            t = agent.agent_type
            type_results[t]['count'] += 1
            type_results[t]['total_pnl'] += agent.pnl
            type_results[t]['pnls'].append(agent.pnl)
            type_results[t]['beliefs'].append(agent.belief)

        for t in type_results:
            type_results[t]['avg_pnl'] = (
                type_results[t]['total_pnl'] / type_results[t]['count']
            )
            type_results[t]['avg_belief'] = np.mean(type_results[t]['beliefs'])
            type_results[t]['pnl_std'] = np.std(type_results[t]['pnls'])

        # Price accuracy metrics
        prices = np.array(self.price_history)
        errors = np.abs(prices - self.true_probability)

        # Convergence: first round where error < 0.03 and stays there
        converged_round = None
        for i in range(len(errors) - 20):
            if all(errors[i:i+20] < 0.03):
                converged_round = i
                break

        return {
            'outcome': outcome,
            'final_price': self.price_history[-1],
            'true_probability': self.true_probability,
            'final_error': errors[-1],
            'mean_error': errors.mean(),
            'min_error': errors.min(),
            'converged_round': converged_round,
            'type_results': dict(type_results),
            'price_history': self.price_history,
            'volume_history': self.volume_history
        }

Part 2: Running the Baseline Simulation

2.1 Standard Conditions

# Baseline simulation: standard population, standard conditions
np.random.seed(42)

sim = PredictionMarketSimulation(true_probability=0.65)
sim.create_standard_population()
sim.run(n_rounds=500)
results = sim.settle_and_analyze()

print("=" * 60)
print("BASELINE SIMULATION RESULTS")
print("=" * 60)
print(f"\nTrue probability:    {results['true_probability']:.3f}")
print(f"Final market price:  {results['final_price']:.3f}")
print(f"Final error:         {results['final_error']:.4f}")
print(f"Mean error:          {results['mean_error']:.4f}")
print(f"Converged at round:  {results['converged_round']}")
print(f"Event outcome:       {results['outcome']}")

print(f"\n{'Agent Type':<25} {'Count':>6} {'Avg PnL':>10} {'Avg Belief':>12}")
print("-" * 55)
for agent_type, data in sorted(results['type_results'].items()):
    print(f"{agent_type:<25} {data['count']:>6} "
          f"{data['avg_pnl']:>10.2f} {data['avg_belief']:>12.3f}")

2.2 Interpreting Baseline Results

The baseline simulation should show the following patterns (a plotting sketch for visualizing them follows the list):

  1. Price convergence. The market price should converge toward 0.65 within the first 100-200 rounds, despite starting at 0.50 and being influenced by 400 noise traders and 150 momentum traders.

  2. Informed traders profit. Expert and informed fundamentalists should earn positive PnL on average, while noise traders and contrarians lose money. This wealth transfer is the mechanism by which the market rewards accurate information.

  3. Momentum traders are a mixed bag. Their profitability depends on whether the dominant trend (from 0.50 toward 0.65) was strong enough to exploit, or whether the noise-driven false trends they also chased dominated their returns.
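
To see these patterns rather than just read about them, a minimal plotting sketch (assuming matplotlib is available; the output filename is illustrative) charts the price path against the true probability, with trading volume below:

import matplotlib.pyplot as plt

fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(8, 6), sharex=True)

# Top panel: price path vs. true probability
ax1.plot(results['price_history'], label='Market price')
ax1.axhline(results['true_probability'], color='red', linestyle='--',
            label='True probability')
ax1.set_ylabel('Price')
ax1.legend()

# Bottom panel: trading volume per round
ax2.plot(results['volume_history'], color='gray')
ax2.set_ylabel('Volume')
ax2.set_xlabel('Round')

plt.tight_layout()
plt.savefig('baseline_price_path.png', dpi=300)
plt.show()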


Part 3: Varying Conditions

3.1 Experiment 1: Diversity Matters

What happens when we reduce the diversity of the informed population?

def run_diversity_experiment(n_simulations=50):
    """
    Compare market accuracy with diverse vs. homogeneous
    informed populations.
    """
    results_diverse = []
    results_homogeneous = []

    for _ in range(n_simulations):
        # Diverse: standard population
        sim_d = PredictionMarketSimulation(true_probability=0.65)
        sim_d.create_standard_population()
        sim_d.run(n_rounds=300)
        r_d = sim_d.settle_and_analyze()
        results_diverse.append(r_d['final_error'])

        # Homogeneous: all informed traders have same signal quality
        sim_h = PredictionMarketSimulation(true_probability=0.65)
        # Replace diverse fundamentalists with all the same type
        agent_id = 0
        for _ in range(400):  # same total informed count
            sim_h.agents.append(
                InformedFundamentalist(agent_id, 0.65, noise_std=0.10)
            )
            agent_id += 1
        for _ in range(400):
            sim_h.agents.append(NoiseTrader(agent_id))
            agent_id += 1
        for _ in range(200):
            sim_h.agents.append(MomentumTrader(agent_id))
            agent_id += 1
        sim_h.run(n_rounds=300)
        r_h = sim_h.settle_and_analyze()
        results_homogeneous.append(r_h['final_error'])

    print("=== Diversity Experiment ===")
    print(f"Diverse population  - Mean error: {np.mean(results_diverse):.4f} "
          f"(std: {np.std(results_diverse):.4f})")
    print(f"Homogeneous population - Mean error: {np.mean(results_homogeneous):.4f} "
          f"(std: {np.std(results_homogeneous):.4f})")

    return results_diverse, results_homogeneous
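
With 50 runs per condition, one way to check whether the gap is statistically meaningful is a two-sample test. The snippet below is a sketch that assumes SciPy is installed; Welch's t-test is one reasonable choice, not the only one:

from scipy import stats

diverse_errors, homogeneous_errors = run_diversity_experiment(n_simulations=50)
t_stat, p_value = stats.ttest_ind(diverse_errors, homogeneous_errors, equal_var=False)
print(f"Welch's t-test: t = {t_stat:.2f}, p = {p_value:.4f}")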

3.2 Experiment 2: Independence and Correlated Beliefs

What happens when agents share information and their beliefs become correlated?
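
The statistical intuition is worth stating before the code. If each informed signal has noise variance $\sigma^2$ and every pair of signals has correlation $\rho$, the variance of the crowd's average signal is

$$\mathrm{Var}(\bar{s}) = \frac{\sigma^2}{n}\left[1 + (n - 1)\rho\right] \longrightarrow \rho\,\sigma^2 \quad \text{as } n \to \infty,$$

so once $\rho > 0$ the error floor is set by the shared component of the noise rather than by the size of the crowd. The experiment below injects exactly such a shared shock: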

def run_correlation_experiment(n_simulations=50):
    """
    Simulate different levels of belief correlation
    (social influence between agents).
    """
    correlation_levels = [0.0, 0.1, 0.2, 0.3, 0.5, 0.7]
    results = {}

    for rho in correlation_levels:
        errors = []
        for _ in range(n_simulations):
            sim = PredictionMarketSimulation(true_probability=0.65)

            # Create agents with correlated noise: mix a shared shock and an
            # idiosyncratic shock so the pairwise signal correlation is rho
            # while the marginal noise std stays at 0.10
            agent_id = 0
            common_shock = np.random.normal(0, 0.10)  # shared error

            for _ in range(200):
                agent = InformedFundamentalist(agent_id, 0.65, noise_std=0.10)
                idiosyncratic = np.random.normal(0, 0.10)
                agent.signal = np.clip(
                    0.65
                    + np.sqrt(rho) * common_shock
                    + np.sqrt(1 - rho) * idiosyncratic,
                    0.01, 0.99
                )
                agent.belief = agent.signal
                sim.agents.append(agent)
                agent_id += 1

            for _ in range(800):
                sim.agents.append(NoiseTrader(agent_id))
                agent_id += 1

            sim.run(n_rounds=300)
            r = sim.settle_and_analyze()
            errors.append(r['final_error'])

        results[rho] = {
            'mean_error': np.mean(errors),
            'std_error': np.std(errors),
            'median_error': np.median(errors)
        }

    print("=== Correlation Experiment ===")
    print(f"{'Correlation':>12} {'Mean Error':>12} {'Std Error':>12}")
    print("-" * 38)
    for rho, data in sorted(results.items()):
        print(f"{rho:>12.1f} {data['mean_error']:>12.4f} "
              f"{data['std_error']:>12.4f}")

    return results

3.3 Experiment 3: Informed Trader Fraction

How many informed traders are needed for accurate prices?

def run_informed_fraction_experiment(n_simulations=50):
    """
    Vary the fraction of informed traders and measure accuracy.
    """
    fractions = [0.01, 0.02, 0.05, 0.10, 0.15, 0.20, 0.30, 0.50]
    results = {}

    for frac in fractions:
        errors = []
        convergence_rounds = []

        for _ in range(n_simulations):
            sim = PredictionMarketSimulation(true_probability=0.65)
            agent_id = 0
            n_total = 1000
            n_informed = int(n_total * frac)
            n_noise = n_total - n_informed

            for _ in range(n_informed):
                sim.agents.append(
                    InformedFundamentalist(agent_id, 0.65, noise_std=0.08)
                )
                agent_id += 1

            for _ in range(n_noise):
                sim.agents.append(NoiseTrader(agent_id))
                agent_id += 1

            sim.run(n_rounds=400)
            r = sim.settle_and_analyze()
            errors.append(r['final_error'])
            convergence_rounds.append(
                r['converged_round'] if r['converged_round'] is not None else 400
            )

        results[frac] = {
            'mean_error': np.mean(errors),
            'std_error': np.std(errors),
            'mean_convergence': np.mean(convergence_rounds),
            'pct_converged': sum(
                1 for c in convergence_rounds if c < 400
            ) / len(convergence_rounds)
        }

    print("=== Informed Fraction Experiment ===")
    print(f"{'Fraction':>10} {'Mean Error':>12} {'Convergence':>14} "
          f"{'% Converged':>13}")
    print("-" * 52)
    for frac, data in sorted(results.items()):
        print(f"{frac:>10.0%} {data['mean_error']:>12.4f} "
              f"{data['mean_convergence']:>14.0f} "
              f"{data['pct_converged']:>13.0%}")

    return results
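
The remaining experiments can be launched the same way. A possible driver (the seed is illustrative, and the reduced n_simulations keeps the runtime manageable; increase it for tighter estimates):

np.random.seed(7)  # illustrative seed

correlation_results = run_correlation_experiment(n_simulations=20)
fraction_results = run_informed_fraction_experiment(n_simulations=20)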

Part 4: Analysis and Visualization

4.1 Price Path Analysis

def analyze_price_path(price_history, true_prob, label=""):
    """Analyze properties of the simulated price path."""
    prices = np.array(price_history)
    errors = np.abs(prices - true_prob)
    returns = np.diff(prices)

    # Compute statistics
    stats = {
        'label': label,
        'n_rounds': len(prices) - 1,
        'final_price': prices[-1],
        'final_error': errors[-1],
        'mean_error': errors.mean(),
        'max_error': errors.max(),
        'return_mean': returns.mean(),
        'return_std': returns.std(),
        'return_autocorr_lag1': np.corrcoef(returns[:-1], returns[1:])[0, 1]
            if len(returns) > 2 else 0,
        'max_drawdown': max(
            prices[:i+1].max() - prices[i]
            for i in range(len(prices))
        ) if len(prices) > 1 else 0
    }

    print(f"\n--- Price Path Analysis: {label} ---")
    for key, val in stats.items():
        if key != 'label':
            if isinstance(val, float):
                print(f"  {key:<25}: {val:.4f}")
            else:
                print(f"  {key:<25}: {val}")

    return stats

4.2 Wealth Transfer Analysis

def analyze_wealth_transfer(results):
    """
    Analyze how wealth flows between agent types.
    This is the key mechanism driving information aggregation.
    """
    print("\n=== Wealth Transfer Analysis ===")
    print("(Positive PnL = gained from market; Negative = lost to market)\n")

    total_informed_profit = 0
    total_noise_loss = 0

    for agent_type, data in sorted(results['type_results'].items()):
        total_pnl = data['total_pnl']
        avg_pnl = data['avg_pnl']
        count = data['count']

        bar_length = int(abs(avg_pnl) / 2)
        if avg_pnl >= 0:
            bar = "+" * min(bar_length, 40)
            bar_str = f"[{bar}>"
        else:
            bar = "-" * min(bar_length, 40)
            bar_str = f"<{bar}]"

        print(f"  {agent_type:<25} {avg_pnl:>+8.2f} per agent  "
              f"{total_pnl:>+10.2f} total  {bar_str}")

        if 'fundamental' in agent_type or 'expert' in agent_type:
            total_informed_profit += total_pnl
        if 'noise' in agent_type or 'contrarian' in agent_type:
            total_noise_loss += total_pnl

    print(f"\n  Total informed profit:   {total_informed_profit:>+10.2f}")
    print(f"  Total noise/contrarian loss: {total_noise_loss:>+10.2f}")
    print(f"  Market is approximately zero-sum: "
          f"{abs(total_informed_profit + total_noise_loss) < 100}")
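
Both analysis functions apply directly to the baseline results produced in Part 2:

analyze_price_path(results['price_history'], results['true_probability'],
                   label="Baseline")
analyze_wealth_transfer(results)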

Part 5: Key Findings and Discussion

5.1 Expected Results

When you run these simulations, you should observe the following patterns:

Finding 1: Convergence. With the standard population, prices converge to within 3 percentage points of the true probability within 100-200 rounds in most simulations. This demonstrates the core wisdom-of-crowds result.

Finding 2: Diversity improves accuracy. The diverse population (experts + informed + casual) produces lower mean errors than a homogeneous population with the same total number of informed traders. Different noise structures cancel more effectively.

Finding 3: Correlation is destructive. Even modest correlation ($\rho = 0.2$) significantly degrades accuracy. At $\rho = 0.5$, the market performs only marginally better than a single trader.

Finding 4: A few informed traders suffice. Markets with as few as 5% informed traders can achieve reasonable accuracy if those traders are active and well-capitalized. Below 2%, accuracy degrades rapidly.

Finding 5: Wealth transfer is the mechanism. Informed traders profit at the expense of noise traders and contrarians. This wealth transfer is not a bug—it is the mechanism that incentivizes information acquisition and transmission.

5.2 Discussion Questions

  1. How does the participation rate (fraction of agents active each round) affect convergence speed? Is it better to have all agents trade every round, or is random participation sufficient?

  2. If you could add one more agent type to the simulation, what would it be and why? Consider agents who acquire information at a cost, agents who switch strategies based on performance, or agents who communicate through social networks.

  3. How would the results change if the true probability were 0.95 instead of 0.65? Would convergence be faster or slower? Why?

  4. The simulation uses a simple price-impact model. How would results differ with an LMSR market maker? What are the trade-offs? (A minimal LMSR sketch follows this list.)

  5. In practice, prediction markets operate over days or weeks, not 500 discrete rounds. How would you calibrate the simulation parameters (price impact, participation rate, trading intensity) to match real-world data?
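
For question 4, a minimal LMSR market-maker sketch for a binary market is shown below (reusing the NumPy import from above). The class name, interface, and liquidity parameter are illustrative assumptions, not part of the simulation code in this case study:

class LMSRMarketMaker:
    """Logarithmic Market Scoring Rule market maker for a binary event."""

    def __init__(self, b=100.0):
        self.b = b                # liquidity parameter: higher b = deeper market
        self.q = np.zeros(2)      # outstanding shares of [YES, NO]

    def cost(self, q):
        # LMSR cost function: C(q) = b * ln(sum_i exp(q_i / b))
        return self.b * np.log(np.sum(np.exp(q / self.b)))

    def price_yes(self):
        # Instantaneous price of the YES contract
        e = np.exp(self.q / self.b)
        return e[0] / e.sum()

    def buy(self, outcome, shares):
        """Charge a trader for buying `shares` of outcome 0 (YES) or 1 (NO)."""
        new_q = self.q.copy()
        new_q[outcome] += shares
        charge = self.cost(new_q) - self.cost(self.q)
        self.q = new_q
        return charge

Unlike the linear price-impact rule, LMSR keeps prices summing to one and bounds the market maker's worst-case loss at $b \ln 2$ for a binary market, at the cost of having to choose the liquidity parameter $b$ in advance.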

5.3 Extensions

For further exploration:

  • Dynamic information. Add events at specific rounds where new public information shifts the true probability. Measure how quickly the market adjusts.
  • Endogenous participation. Let agents decide whether to participate based on their expected profit. This creates an equilibrium where only the most informed agents remain active.
  • Network effects. Give agents a social network and let them share beliefs before trading. Measure how different network structures affect accuracy.
  • Multiple markets. Run correlated markets simultaneously and see if agents who are informed about the correlation can profit by trading across markets.