Case Study 1: The Favorite-Longshot Bias Across 10,000 Markets

Overview

In this case study, we analyze a synthetic dataset of 10,000 resolved prediction market contracts to detect, quantify, and exploit the favorite-longshot bias (FLB). We generate the dataset with a known bias structure, then apply detection algorithms as if we were analyzing real data. Finally, we build a contrarian strategy that trades against the FLB and backtest its performance.

This case study mirrors the kind of analysis a serious prediction market trader would perform on real historical data from platforms like Polymarket, PredictIt, or Kalshi.

Part 1: Generating the Synthetic Dataset

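We generate 10,000 contracts, each with a true probability drawn from a Beta(2, 2) distribution. This distribution is symmetric and dome-shaped around 0.50, so it produces a mix of likely and unlikely events with relatively few extreme longshots or favorites. Market prices are generated by applying a prospect theory probability weighting function, w(p) = p^gamma / (p^gamma + (1 - p)^gamma)^(1/gamma), to the true probabilities and then adding Gaussian noise. With gamma < 1 this function overweights small probabilities and underweights large ones, which is the favorite-longshot pattern.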

import numpy as np

np.random.seed(42)

N = 10000

# True probabilities: Beta(2,2) gives a nice spread centered at 0.5
true_probs = np.random.beta(2, 2, size=N)

# Apply prospect theory weighting to generate biased market prices
gamma = 0.65  # Known bias parameter
weighted_probs = true_probs ** gamma / (
    true_probs ** gamma + (1 - true_probs) ** gamma
) ** (1 / gamma)

# Add noise (market microstructure, private information, etc.)
noise = np.random.normal(0, 0.05, size=N)
market_prices = np.clip(weighted_probs + noise, 0.01, 0.99)

# Generate outcomes based on true probabilities
outcomes = (np.random.uniform(size=N) < true_probs).astype(float)

print(f"Dataset: {N} contracts")
print(f"True prob range: [{true_probs.min():.3f}, {true_probs.max():.3f}]")
print(f"Market price range: [{market_prices.min():.3f}, {market_prices.max():.3f}]")
print(f"Outcome rate: {outcomes.mean():.3f}")
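To see what the weighting function does before any noise is added, the following sketch tabulates w(p) for a few probabilities and locates the point where w(p) = p by bisection. It is illustrative only: the helper names `tk_weight` and `crossover` are not part of the case-study code, and gamma = 0.65 is taken from the generator above.

```python
def tk_weight(p, gamma=0.65):
    """One-parameter prospect theory weighting, same form as the generator."""
    num = p ** gamma
    return num / (num + (1 - p) ** gamma) ** (1 / gamma)


def crossover(gamma=0.65, lo=0.05, hi=0.95, tol=1e-10):
    """Bisection for the interior point where w(p) = p (hypothetical helper).

    Assumes w(lo) > lo and w(hi) < hi, which holds for gamma = 0.65.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if tk_weight(mid, gamma) > mid:
            lo = mid   # still overweighted: crossover lies to the right
        else:
            hi = mid   # underweighted: crossover lies to the left
    return (lo + hi) / 2


for p in (0.05, 0.20, 0.50, 0.80, 0.95):
    print(f"p = {p:.2f}  ->  w(p) = {tk_weight(p):.3f}")

p_star = crossover()
print(f"w(p) = p at p = {p_star:.3f}")
```

Below the crossover the noise-free price exceeds the true probability (longshots overpriced); above it the price falls short (favorites underpriced).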

Part 2: Detecting the Favorite-Longshot Bias

We bin the contracts by their market-implied probability and compare the actual outcome rate in each bin to the implied probability. The FLB predicts that low-probability bins will have actual rates below their implied rates (longshots overpriced), and high-probability bins will have actual rates above their implied rates (favorites underpriced).

n_bins = 20
bin_edges = np.linspace(0, 1, n_bins + 1)

print(f"{'Bin Center':>12} {'N':>6} {'Implied':>10} {'Actual':>10} {'Deviation':>12} {'Direction':>12}")
print("-" * 70)

bin_centers = []
actual_rates = []
deviations_list = []
counts_list = []

for i in range(n_bins):
    lo, hi = bin_edges[i], bin_edges[i + 1]
    if i == n_bins - 1:
        mask = (market_prices >= lo) & (market_prices <= hi)
    else:
        mask = (market_prices >= lo) & (market_prices < hi)

    n = mask.sum()
    if n < 10:
        continue

    center = (lo + hi) / 2
    implied = market_prices[mask].mean()  # mean market price of contracts in this bin
    actual = outcomes[mask].mean()
    dev = actual - implied

    bin_centers.append(center)
    actual_rates.append(actual)
    deviations_list.append(dev)
    counts_list.append(n)

    direction = "Overpriced" if dev < -0.01 else ("Underpriced" if dev > 0.01 else "Fair")
    print(f"{center:12.3f} {n:6d} {implied:10.3f} {actual:10.3f} {dev:12.3f} {direction:>12}")

bin_centers = np.array(bin_centers)
actual_rates_arr = np.array(actual_rates)
deviations_arr = np.array(deviations_list)

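Expected output pattern: Low-probability bins show negative deviations (actual < implied, meaning longshots are overpriced). High-probability bins show positive deviations (actual > implied, meaning favorites are underpriced). The crossover point, where the weighting function with gamma = 0.65 satisfies w(p) = p, is near 0.35-0.40, although sampling noise can shift it slightly in the binned output.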
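Whether a bin's deviation is meaningful depends on how many contracts it holds. As a rough guide, one could attach a binomial standard error to each bin's outcome rate and flag deviations larger than about two standard errors. The sketch below assumes independent contracts; `bin_z_score` is a hypothetical helper, and the example numbers are made up for illustration rather than taken from the dataset.

```python
import math


def bin_z_score(actual_rate, implied_prob, n):
    """Deviation of a bin's outcome rate from its implied probability,
    in units of the binomial standard error under the implied probability.

    Assumes the n contracts in the bin resolve independently.
    """
    se = math.sqrt(implied_prob * (1 - implied_prob) / n)
    return (actual_rate - implied_prob) / se


# Illustrative numbers only: a 5-point shortfall in a longshot bin
z_small = bin_z_score(actual_rate=0.10, implied_prob=0.15, n=50)
z_large = bin_z_score(actual_rate=0.10, implied_prob=0.15, n=800)
print(f"n = 50:  z = {z_small:.2f}")
print(f"n = 800: z = {z_large:.2f}")
```

The same 5-point deviation is within one standard error in the small bin and several standard errors out in the large one, which is why sparsely populated bins should be read with caution.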

Part 3: Quantifying the Bias

We compute several summary statistics to quantify the FLB.

from scipy.optimize import minimize_scalar
from scipy import stats

# 1. Correlation between implied probability and deviation
flb_correlation = np.corrcoef(bin_centers, deviations_arr)[0, 1]
print(f"FLB Correlation: {flb_correlation:.4f}")
print(f"  (Positive = classic FLB pattern; >0.5 is strong evidence)")

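# 2. Fit gamma from prospect theory weighting function.
# The data-generating model is price = w(true_prob), so we model each bin's
# implied probability as w(actual outcome rate), not the other way around.
# This keeps gamma_hat consistent with the inversion used by the strategy.
def weighted_sse(gamma):
    """Count-weighted sum of squared errors (the quantity being minimized)."""
    if gamma <= 0.1 or gamma > 2.0:
        return 1e12
    w = actual_rates_arr ** gamma / (
        actual_rates_arr ** gamma + (1 - actual_rates_arr) ** gamma
    ) ** (1 / gamma)
    residuals = bin_centers - w
    weights = np.array(counts_list)
    return np.sum(weights * residuals ** 2)

result = minimize_scalar(weighted_sse, bounds=(0.2, 1.8), method='bounded')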
gamma_hat = result.x
print(f"Estimated gamma: {gamma_hat:.4f}")
print(f"  (True gamma: {gamma}; gamma < 1 confirms FLB)")

# 3. Average overpricing of longshots (implied prob < 0.25)
longshot_mask = bin_centers < 0.25
if longshot_mask.any():
    avg_longshot_overpricing = -deviations_arr[longshot_mask].mean()
    print(f"Average longshot overpricing: {avg_longshot_overpricing:.4f}")

# 4. Average underpricing of favorites (implied prob > 0.75)
favorite_mask = bin_centers > 0.75
if favorite_mask.any():
    avg_favorite_underpricing = deviations_arr[favorite_mask].mean()
    print(f"Average favorite underpricing: {avg_favorite_underpricing:.4f}")

# 5. Statistical significance
slope, intercept, r_value, p_value, std_err = stats.linregress(
    bin_centers, deviations_arr
)
print(f"Linear regression slope: {slope:.4f} (p-value: {p_value:.6f})")
print(f"  (Positive slope with low p-value confirms FLB)")
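A quick way to check that a gamma fit is set up in the right direction is to run it on noise-free pairs where the answer is known. The sketch below builds prices as w(p) with gamma = 0.65, as in the generator, and recovers gamma with a plain grid search. The helper names and the grid are assumptions for illustration; it does not replace the bounded minimization above.

```python
def tk_weight(p, gamma):
    """Prospect theory weighting: maps a true probability to a price."""
    num = p ** gamma
    return num / (num + (1 - p) ** gamma) ** (1 / gamma)


def sse(gamma, probs, prices):
    """Sum of squared errors of the model price = w(prob; gamma)."""
    return sum((q - tk_weight(p, gamma)) ** 2 for p, q in zip(probs, prices))


true_gamma = 0.65
probs = [i / 100 for i in range(5, 100, 5)]           # 0.05, 0.10, ..., 0.95
prices = [tk_weight(p, true_gamma) for p in probs]    # noise-free prices

# Candidate gammas 0.30, 0.31, ..., 1.50
grid = [g / 100 for g in range(30, 151)]
gamma_fit = min(grid, key=lambda g: sse(g, probs, prices))
print(f"Recovered gamma: {gamma_fit:.2f}")
```

On real or noisy data the recovered value will not be exact, so a check like this only confirms that the objective is wired correctly, not that the estimate is precise.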

Part 4: Building a Contrarian Strategy

Now we build a strategy that exploits the FLB: sell overpriced longshots, buy underpriced favorites.

def flb_strategy(market_price, gamma_est=0.65, threshold=0.02):
    """
    Generate a trading signal based on the estimated FLB.

    Parameters
    ----------
    market_price : float
        Current market-implied probability.
    gamma_est : float
        Estimated gamma from FLB detection.
    threshold : float
        Minimum estimated edge to trade.

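    Returns
    -------
    dict with keys 'action' ('BUY', 'SELL', or 'HOLD'), 'edge'
    (estimated edge, reported as a positive number for trades and 0 for HOLD),
    and 'true_prob_est' (bias-corrected probability estimate).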
    """
    p = market_price

    # Invert the weighting function to estimate true probability
    # This is an approximation: if market_price = w(true_prob),
    # then true_prob = w_inv(market_price)
    # For simplicity, use numerical inversion
    from scipy.optimize import brentq

    def w(p_true):
        return p_true ** gamma_est / (
            p_true ** gamma_est + (1 - p_true) ** gamma_est
        ) ** (1 / gamma_est)

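    # Bracket almost the whole unit interval so that prices near the 0.01 and
    # 0.99 clip limits can still be inverted instead of hitting the fallback.
    try:
        true_prob_est = brentq(lambda x: w(x) - p, 1e-9, 1 - 1e-9)
    except ValueError:
        true_prob_est = p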

    edge = true_prob_est - p

    if edge > threshold:
        return {'action': 'BUY', 'edge': edge, 'true_prob_est': true_prob_est}
    elif edge < -threshold:
        return {'action': 'SELL', 'edge': -edge, 'true_prob_est': true_prob_est}
    else:
        return {'action': 'HOLD', 'edge': 0, 'true_prob_est': true_prob_est}
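To make the inversion step concrete, here is a standalone sketch that inverts the weighting function by bisection rather than `brentq` and reports the implied edge at a longshot price and a favorite price. It assumes gamma = 0.65 and that w is increasing on (0, 1), which holds for this gamma; `tk_weight`, `invert_weight`, and the two example prices are illustrative and not part of the strategy code.

```python
def tk_weight(p, gamma=0.65):
    """Prospect theory weighting: true probability -> market price."""
    num = p ** gamma
    return num / (num + (1 - p) ** gamma) ** (1 / gamma)


def invert_weight(price, gamma=0.65, tol=1e-12):
    """Find p with w(p) = price by bisection (assumes w is increasing)."""
    lo, hi = 1e-12, 1 - 1e-12
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if tk_weight(mid, gamma) < price:
            lo = mid   # w(mid) too small: the root lies to the right
        else:
            hi = mid   # w(mid) too large: the root lies to the left
    return (lo + hi) / 2


for price in (0.10, 0.90):
    p_est = invert_weight(price)
    edge = p_est - price
    side = "SELL" if edge < 0 else "BUY"
    print(f"price {price:.2f}: estimated true prob {p_est:.3f}, "
          f"edge {edge:+.3f} -> {side}")
```

Under these assumptions the longshot price maps to a lower estimated probability (a SELL signal) and the favorite price to a higher one (a BUY signal), matching the direction of the strategy.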

Part 5: Backtesting the Strategy

from scipy.optimize import brentq

pnl = []
trades = []
transaction_cost = 0.01  # 1 cent per contract

for i in range(N):
    signal = flb_strategy(market_prices[i], gamma_est=gamma_hat, threshold=0.02)

    if signal['action'] == 'BUY':
        profit = outcomes[i] - market_prices[i] - transaction_cost
        pnl.append(profit)
        trades.append({
            'idx': i,
            'action': 'BUY',
            'price': market_prices[i],
            'outcome': outcomes[i],
            'profit': profit,
            'edge': signal['edge']
        })
    elif signal['action'] == 'SELL':
        profit = market_prices[i] - outcomes[i] - transaction_cost
        pnl.append(profit)
        trades.append({
            'idx': i,
            'action': 'SELL',
            'price': market_prices[i],
            'outcome': outcomes[i],
            'profit': profit,
            'edge': signal['edge']
        })
    else:
        pnl.append(0)

pnl = np.array(pnl)
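# PnL of executed trades only, taken from the trade log so that a trade
# with exactly zero profit is not mistaken for a HOLD
active_pnl = np.array([t['profit'] for t in trades])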

print("=" * 60)
print("BACKTEST RESULTS: FLB Exploitation Strategy")
print("=" * 60)
print(f"Total contracts analyzed: {N}")
print(f"Trades executed: {len(trades)}")
print(f"Hold (no trade): {N - len(trades)}")
print(f"")
print(f"Total PnL: ${pnl.sum():.2f}")
print(f"Mean PnL per trade: ${active_pnl.mean():.4f}")
print(f"Median PnL per trade: ${np.median(active_pnl):.4f}")
print(f"Std dev of PnL: ${active_pnl.std():.4f}")
print(f"")
print(f"Win rate: {(active_pnl > 0).mean():.2%}")
print(f"Average win: ${active_pnl[active_pnl > 0].mean():.4f}")
print(f"Average loss: ${active_pnl[active_pnl < 0].mean():.4f}")
print(f"")

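# Sharpe ratio: per-trade mean / std scaled by sqrt(252). The scaling treats each
# trade as one daily return, a nominal convention here since the synthetic
# contracts carry no dates or holding periods.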
sharpe = active_pnl.mean() / active_pnl.std() * np.sqrt(252)
print(f"Sharpe ratio (annualized): {sharpe:.2f}")

# Maximum drawdown
cumulative = pnl.cumsum()
peak = np.maximum.accumulate(cumulative)
drawdown = peak - cumulative
max_dd = drawdown.max()
print(f"Maximum drawdown: ${max_dd:.2f}")

# Breakdown by trade direction
buy_trades = [t for t in trades if t['action'] == 'BUY']
sell_trades = [t for t in trades if t['action'] == 'SELL']
print(f"")
print(f"BUY trades: {len(buy_trades)}")
if buy_trades:
    buy_profits = np.array([t['profit'] for t in buy_trades])
    print(f"  Win rate: {(buy_profits > 0).mean():.2%}")
    print(f"  Mean PnL: ${buy_profits.mean():.4f}")

print(f"SELL trades: {len(sell_trades)}")
if sell_trades:
    sell_profits = np.array([t['profit'] for t in sell_trades])
    print(f"  Win rate: {(sell_profits > 0).mean():.2%}")
    print(f"  Mean PnL: ${sell_profits.mean():.4f}")
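Because the per-trade PnL of a binary contract is highly dispersed, a positive mean alone says little. One simple check is the t-statistic of the mean trade profit, which is the per-trade mean/std ratio multiplied by the square root of the number of trades. The sketch below assumes independent trades; `mean_pnl_tstat` is a hypothetical helper and the sample profits are invented for illustration.

```python
import math
import statistics


def mean_pnl_tstat(profits):
    """Return (mean, standard error, t-statistic) of per-trade profits.

    Assumes trades are independent; uses the sample standard deviation.
    """
    n = len(profits)
    mean = statistics.fmean(profits)
    se = statistics.stdev(profits) / math.sqrt(n)
    return mean, se, mean / se


# Invented example: selling a longshot at 0.10 with a 0.01 cost,
# winning 0.09 on 93 trades and losing 0.91 on 7 trades
sample = [0.09] * 93 + [-0.91] * 7
mean, se, t = mean_pnl_tstat(sample)
print(f"mean = {mean:.4f}, se = {se:.4f}, t = {t:.2f}")
```

A t-statistic well below 2, as in this invented sample, would mean the positive mean is not distinguishable from zero at that sample size.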

Part 6: Sensitivity Analysis

We test how the strategy performs under different assumptions about gamma and the trading threshold.

print("\nSensitivity Analysis: Varying gamma estimate")
print(f"{'Gamma':>8} {'Trades':>8} {'Total PnL':>12} {'Win Rate':>10} {'Sharpe':>8}")
print("-" * 50)

for gamma_test in [0.50, 0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.90, 1.00]:
    test_pnl = []
    test_trades = 0

    for i in range(N):
        signal = flb_strategy(market_prices[i], gamma_est=gamma_test, threshold=0.02)
        if signal['action'] == 'BUY':
            test_pnl.append(outcomes[i] - market_prices[i] - transaction_cost)
            test_trades += 1
        elif signal['action'] == 'SELL':
            test_pnl.append(market_prices[i] - outcomes[i] - transaction_cost)
            test_trades += 1

    test_pnl = np.array(test_pnl) if test_pnl else np.array([0])
    win_rate = (test_pnl > 0).mean()
    sharpe_val = test_pnl.mean() / max(test_pnl.std(), 1e-9) * np.sqrt(252)

    print(f"{gamma_test:8.2f} {test_trades:8d} {test_pnl.sum():12.2f} {win_rate:10.2%} {sharpe_val:8.2f}")

print("\nSensitivity Analysis: Varying threshold")
print(f"{'Threshold':>10} {'Trades':>8} {'Total PnL':>12} {'Win Rate':>10} {'Mean Edge':>10}")
print("-" * 55)

for thresh in [0.005, 0.01, 0.02, 0.03, 0.05, 0.08, 0.10]:
    test_pnl = []
    test_edges = []

    for i in range(N):
        signal = flb_strategy(market_prices[i], gamma_est=gamma_hat, threshold=thresh)
        if signal['action'] != 'HOLD':
            if signal['action'] == 'BUY':
                test_pnl.append(outcomes[i] - market_prices[i] - transaction_cost)
            else:
                test_pnl.append(market_prices[i] - outcomes[i] - transaction_cost)
            test_edges.append(signal['edge'])

    # Count trades before substituting a placeholder for the "no trades" case,
    # so an empty result is reported as 0 trades rather than 1
    n_trades = len(test_pnl)
    test_pnl = np.array(test_pnl) if test_pnl else np.array([0])
    test_edges = np.array(test_edges) if test_edges else np.array([0])

    print(f"{thresh:10.3f} {n_trades:8d} {test_pnl.sum():12.2f} "
          f"{(test_pnl > 0).mean():10.2%} {test_edges.mean():10.4f}")

Part 7: Key Findings and Discussion

Expected Findings

The FLB is clearly present in the synthetic data, with a positive correlation between implied probability and deviation, and a fitted gamma below 1.0.
The contrarian strategy is profitable, with a positive total PnL and a reasonable Sharpe ratio. The strategy makes money by selling overpriced longshots and buying underpriced favorites.
The strategy works best at the extremes. Contracts with very low (< 15%) or very high (> 85%) implied probabilities show the largest mispricing and the highest profit per trade.
Sensitivity to gamma estimation. The strategy is profitable for a range of gamma estimates but performs best when gamma is close to the true value. Overestimating the bias (using too low a gamma) leads to overtrading and reduced performance.
Threshold selection matters. A higher threshold means fewer but higher-quality trades. There is a trade-off between trade frequency and edge per trade.

Real-World Considerations

In real prediction markets, several additional factors must be considered:

Transaction costs. Real transaction costs (spread, fees) may be larger than the 1 cent assumed here. The strategy's profitability depends critically on keeping transaction costs low.
Liquidity constraints. Not all contracts have sufficient liquidity to execute the desired trade size at the market price.
Time decay. Capturing the FLB edge generally requires holding positions until resolution, which ties up capital and lowers the return per unit of time.
Model risk. The true gamma is unknown, and the estimated gamma may be inaccurate.
Non-stationarity. The magnitude of the FLB may change over time as markets mature and traders learn.
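The cost and capital points can be made concrete with a back-of-the-envelope calculation. The sketch below nets the estimated edge against a per-contract cost and converts it to a simple annualized return on the capital locked until resolution. All inputs are hypothetical, and it assumes that buying at price q locks q per contract while selling locks 1 - q as collateral, which may not match a given platform's margin rules.

```python
def net_edge(edge, cost):
    """Expected profit per contract after costs (edge and cost in dollars)."""
    return edge - cost


def annualized_return(edge, cost, price, action, days_to_resolution):
    """Simple (non-compounded) annualized expected return on locked capital.

    Assumption: BUY locks `price` per contract; SELL locks `1 - price`.
    """
    capital = price if action == "BUY" else 1 - price
    return net_edge(edge, cost) / capital * 365 / days_to_resolution


# Hypothetical longshot sale: price 0.10, estimated edge 0.06, held 90 days
for cost in (0.01, 0.03, 0.06):
    r = annualized_return(edge=0.06, cost=cost, price=0.10,
                          action="SELL", days_to_resolution=90)
    print(f"cost {cost:.2f}: net edge {net_edge(0.06, cost):+.3f}, "
          f"annualized return {r:+.1%}")
```

Once the cost reaches the estimated edge, the expected profit is zero regardless of how strong the bias looks in the calibration table.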

Conclusions

The favorite-longshot bias is a measurable phenomenon, and in this synthetic dataset it is also exploitable. A systematic strategy that corrects for the bias by selling longshots and buying favorites can generate positive expected returns, provided that transaction costs stay below the estimated edge and the bias magnitude is estimated accurately.

The full code for this case study is available in code/case-study-code.py.