
> "In trading, as in poker, you don't need to win every hand. You need to find situations where the odds are in your favor, bet appropriately, and repeat." -- Adapted from David Sklansky

Chapter 13: Finding and Quantifying Your Edge


Every prediction market participant faces the same fundamental question: Do I actually know something the market doesn't? This question separates profitable traders from those who slowly donate their capital to more informed participants. In this chapter, we build a rigorous framework for identifying, measuring, sizing, and tracking your edge in prediction markets. We will move from intuitive notions of "I think the market is wrong" to precise, quantifiable measures that tell you exactly how much to bet and when to walk away.


13.1 What Is "Edge"?

13.1.1 Defining Edge

Edge is a systematic advantage that allows you to generate positive expected returns from trading. In prediction markets, edge means your probability estimate for an event is more accurate than the price the market offers. If a market prices an event at 40 cents (implying 40% probability) and the true probability is 55%, you have a 15-percentage-point edge.

Formally, we define edge as:

$$ \text{Edge} = q - p $$

where:

  • $q$ is your estimated true probability of the event occurring
  • $p$ is the market's implied probability (the price of the YES contract)

When Edge > 0, buying YES has positive expected value. When Edge < 0, buying YES has negative expected value (but selling YES, or buying NO, may have positive expected value).

This definition is deceptively simple. The entire challenge of prediction market trading lies in accurately estimating $q$, honestly assessing your confidence in that estimate, and resisting the temptation to see edge where none exists.

13.1.2 Types of Edge

Not all edges are created equal. Understanding the source of your advantage helps you assess its reliability and durability.

Information Edge

You possess facts, data, or observations that other market participants do not yet have. Examples include:

  • Being physically present at an event (witnessing a candidate's health issues firsthand)
  • Having access to unpublished polling data or proprietary models
  • Reading primary sources in a foreign language that most traders cannot access
  • Having professional domain expertise (a meteorologist trading weather markets)

Information edges are often the most powerful but also the most fleeting. Once the information becomes public, the edge vanishes.

Analytical Edge

You process publicly available information more skillfully than other participants. Examples include:

  • Building a superior statistical model from the same data everyone can see
  • Correctly weighting base rates when others succumb to recency bias
  • Identifying relevant historical analogies that others overlook
  • Properly aggregating multiple weak signals into a strong forecast

Analytical edges are more sustainable than information edges because they stem from skill rather than access. However, they require continuous improvement as other traders develop better models.

Speed Edge

You react to new information faster than other participants. Examples include:

  • Automated systems that parse news feeds and trade within seconds
  • Monitoring live data sources (election night returns, sports scores) with low latency
  • Being awake and active during off-peak hours when markets are thin

Speed edges are valuable but require technical infrastructure. They are also vulnerable to competition from faster traders.

Behavioral Edge

You exploit systematic psychological biases in other traders. Examples include:

  • Buying when panic selling drives prices below fair value
  • Selling into irrational exuberance or hype-driven rallies
  • Trading against the favorite-longshot bias (markets tend to overprice longshots and underprice favorites)
  • Exploiting anchoring effects when markets react sluggishly to new information

Behavioral edges can be very durable because cognitive biases are deeply rooted. However, as prediction markets mature, behavioral mispricings may become smaller.

13.1.3 The Uncomfortable Truth: Most Traders Don't Have Edge

Before we proceed with the mechanics of exploiting edge, we must confront an uncomfortable reality: most prediction market traders do not have a consistent edge. Studies of financial markets consistently show that the vast majority of active traders underperform passive strategies. Prediction markets, while different in structure, are subject to the same basic arithmetic.

For every dollar won by an above-average trader, a dollar must be lost by a below-average trader. Before fees, trading is zero-sum; platform fees make it negative-sum, so the average trader must lose an amount equal to the fees they pay.

Consider the following thought experiment. You see a prediction market for "Will candidate X win the election?" trading at 60 cents. You believe the true probability is 70%. Before acting, ask yourself:

  • Why do I believe 70%? What specific information or analysis supports this?
  • Why is the market at 60%? What do other traders know that might make 60% correct?
  • Am I more or less informed than the typical trader in this market?
  • Have I been right in similar situations before?

If you cannot articulate clear, specific reasons why your estimate is better than the market's, you probably do not have edge. The market price already incorporates the views of many participants, some of whom may have access to information or models superior to yours.

This chapter is not meant to discourage you from trading. Rather, it is meant to give you the tools to honestly assess whether you have edge, how much edge you have, and how to size your bets accordingly. The traders who survive and profit in the long run are those who are brutally honest about their abilities.


13.2 Expected Value: The Foundation of All Trading

13.2.1 The EV Formula for Prediction Markets

Expected value (EV) is the average outcome you would experience if you could repeat a bet infinitely many times. For a prediction market trade, EV is calculated as follows.

When you buy a YES contract at price $p$ and you believe the true probability of YES is $q$:

$$ \text{EV} = q \times (1 - p) - (1 - q) \times p $$

Breaking this down:

  • With probability $q$, the event occurs, and you gain $(1 - p)$ per contract (the contract pays \$1, and you paid $p$)
  • With probability $(1 - q)$, the event does not occur, and you lose $p$ per contract

Simplifying:

$$ \text{EV} = q - p $$

This elegant result shows that the EV of buying YES is simply the difference between your probability estimate and the market price. This is exactly our definition of edge.

When you buy a NO contract at price $(1 - p)$ (equivalently, sell YES at $p$) and you believe the true probability of YES is $q$:

$$ \text{EV} = (1 - q) \times p - q \times (1 - p) = p - q $$

The EV of buying NO is positive when $p > q$, meaning the market overestimates the event's probability.

13.2.2 EV as a Percentage Return

To compare opportunities across different price points, it helps to express EV as a percentage of the capital risked:

$$ \text{EV\%} = \frac{q - p}{p} \times 100\% $$

This measures the expected return on investment per dollar spent buying YES. For example, if $q = 0.70$ and $p = 0.60$:

$$ \text{EV\%} = \frac{0.70 - 0.60}{0.60} \times 100\% = 16.7\% $$

You expect to earn 16.7 cents for every 60 cents invested, on average.

13.2.3 Worked Examples

Example 1: Clear Positive EV

Market: "Will it snow in Chicago in January?" trading at YES = $0.55
Your analysis: Historical base rate is 85%; current weather models suggest 80% this year.
Your estimate: $q = 0.80$

$$ \text{EV} = 0.80 - 0.55 = +0.25 $$

$$ \text{EV\%} = \frac{0.25}{0.55} = +45.5\% $$

This is a very large edge. Before trading, you should verify: why is the market so low? Is there information you are missing? A 25-point edge in a liquid market would be unusual.

Example 2: Marginal Positive EV

Market: "Will the incumbent win re-election?" trading at YES = $0.62
Your analysis: Polling average suggests 65%; economic models suggest 67%.
Your estimate: $q = 0.65$

$$ \text{EV} = 0.65 - 0.62 = +0.03 $$

$$ \text{EV\%} = \frac{0.03}{0.62} = +4.8\% $$

A small edge. Whether this is worth trading depends on your confidence in the estimate and the costs involved (fees, opportunity cost of capital).

Example 3: Negative EV

Market: "Will Team A win the championship?" trading at YES = $0.35
Your gut feeling: "They have a great roster, maybe 40%?"
Analysis: Team A has a 30% win probability based on Elo ratings and historical performance.
Your estimate (after analysis): $q = 0.30$

$$ \text{EV} = 0.30 - 0.35 = -0.05 $$

Buying YES has negative expected value. You would be paying 35 cents for something worth only 30 cents. If you still feel compelled to buy, consider that your gut feeling was contradicted by the data. The market may already incorporate the optimistic view.

13.2.4 The EV Scanner

In practice, you want to systematically scan many markets for positive EV opportunities rather than evaluating them one at a time. A basic EV scanner compares your probability estimates to market prices and flags opportunities above a minimum threshold.

See code/example-01-ev-calculator.py for a complete implementation. The key logic is:

def compute_ev(your_prob: float, market_price: float) -> dict:
    """Compute expected value metrics for a prediction market trade."""
    ev_yes = your_prob - market_price
    ev_no = market_price - your_prob

    best_side = "YES" if ev_yes > ev_no else "NO"
    best_ev = max(ev_yes, ev_no)

    if best_side == "YES":
        ev_pct = best_ev / market_price
    else:
        ev_pct = best_ev / (1 - market_price)

    return {
        "best_side": best_side,
        "edge": best_ev,
        "ev_pct": ev_pct,
        "cost": market_price if best_side == "YES" else (1 - market_price),
    }

When using an EV scanner, remember that the quality of the output depends entirely on the quality of your probability estimates. Garbage in, garbage out. The scanner is a tool for organizing your analysis, not a substitute for it.


13.3 The Kelly Criterion

13.3.1 The Problem: How Much to Bet?

Finding positive EV opportunities is only half the challenge. The other half is deciding how much capital to allocate to each bet. Bet too little, and you leave money on the table. Bet too much, and a string of losses can wipe you out even when you have genuine edge.

Consider a simple example: you find a coin that lands heads 60% of the time, and you can bet at even money (pay \$1 to win \$1). You have \$100. How much should you bet per flip?

  • Betting \$1 per flip: Very safe, but your bankroll grows slowly.
  • Betting \$50 per flip: You will probably win, but two consecutive losses (probability 16%) would cost you \$100 and you'd be ruined.
  • Betting \$100 per flip: A single loss wipes you out. This is clearly too much despite having edge.

The Kelly criterion provides the mathematically optimal answer: the fraction of your bankroll that maximizes the long-run growth rate of your wealth.

13.3.2 Derivation Intuition

The Kelly criterion was developed by John Kelly Jr. at Bell Labs in 1956. The key insight is that we should maximize the expected logarithm of wealth, not the expected wealth itself. Why?

Maximizing expected wealth leads to bizarre conclusions: if offered a bet with 51% chance of tripling your money and 49% chance of losing everything, expected wealth says bet everything (EV = 0.51 * 3 = 1.53 per dollar). But any reasonable person would recognize that this is a terrible idea because you go broke 49% of the time.

The logarithmic utility function accounts for the fact that losing 50% of your bankroll is more painful than gaining 50% is pleasurable. Mathematically, the geometric growth rate (which determines long-run compounded returns) is maximized by the Kelly fraction.

For a binary bet where:

  • $b$ = odds received on a winning bet (net profit per dollar wagered)
  • $p$ = probability of winning
  • $q = 1 - p$ = probability of losing

The Kelly fraction is:

$$ f^* = \frac{bp - q}{b} $$

13.3.3 Kelly for Prediction Markets

In a prediction market, buying YES at price $c$ (where $0 < c < 1$) has the following payoff structure:

  • Win: net profit of $(1 - c)$ per contract (you pay $c$, receive $1$)
  • Lose: net loss of $c$ per contract

The odds are $b = (1 - c) / c$. Substituting into the Kelly formula with your estimated win probability $p$:

$$ f^* = \frac{\frac{1-c}{c} \cdot p - (1-p)}{\frac{1-c}{c}} $$

Simplifying:

$$ f^* = \frac{p(1-c) - (1-p)c}{1-c} = \frac{p - c}{1 - c} $$

This is an elegant result for buying YES. The Kelly fraction for prediction markets is:

$$ f^*_{\text{YES}} = \frac{p - c}{1 - c} $$

where $p$ is your estimated probability and $c$ is the market price (cost of YES).

For buying NO (when you believe the market overestimates the event):

$$ f^*_{\text{NO}} = \frac{(1-p) - (1-c)}{c} = \frac{c - p}{c} $$
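The two sizing formulas translate directly into code. A minimal sketch (the function names are ours, not from the chapter's companion files):

```python
def kelly_yes(p: float, c: float) -> float:
    """Kelly fraction of bankroll for buying YES at price c, given estimated probability p."""
    f = (p - c) / (1 - c)
    return max(f, 0.0)  # no positive edge -> no bet

def kelly_no(p: float, c: float) -> float:
    """Kelly fraction of bankroll for buying NO (cost 1 - c), given estimated probability p."""
    f = (c - p) / c
    return max(f, 0.0)
```

For instance, `kelly_yes(0.75, 0.60)` returns approximately 0.375, the 37.5% of bankroll in the first worked example below.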

13.3.4 Worked Examples

Example 1: Strong Edge on YES

You estimate $p = 0.75$ for an event priced at $c = 0.60$.

$$ f^*_{\text{YES}} = \frac{0.75 - 0.60}{1 - 0.60} = \frac{0.15}{0.40} = 0.375 $$

Kelly says bet 37.5% of your bankroll. With a \$10,000 bankroll, you would buy \$3,750 worth of YES contracts (which at \$0.60 each means 6,250 contracts).

Example 2: Moderate Edge on NO

You estimate $p = 0.30$ for an event priced at $c = 0.45$.

$$ f^*_{\text{NO}} = \frac{0.45 - 0.30}{0.45} = \frac{0.15}{0.45} = 0.333 $$

Kelly says bet 33.3% of your bankroll on NO.

Example 3: Small Edge

You estimate $p = 0.52$ for an event priced at $c = 0.48$.

$$ f^*_{\text{YES}} = \frac{0.52 - 0.48}{1 - 0.48} = \frac{0.04}{0.52} = 0.077 $$

Kelly says bet 7.7% of your bankroll. The smaller the edge, the smaller the bet.

Example 4: No Edge

You estimate $p = 0.50$ for an event priced at $c = 0.50$.

$$ f^*_{\text{YES}} = \frac{0.50 - 0.50}{1 - 0.50} = 0 $$

Kelly says do not bet. This makes perfect sense: with no edge, any bet has zero or negative expected growth.

13.3.5 Properties of Kelly Betting

The Kelly criterion has several important properties:

  1. Never recommends risking more than your edge warrants. As edge shrinks toward zero, the Kelly fraction shrinks toward zero.

  2. Never recommends risking entire bankroll. The fraction is always between 0 and 1 (for positive EV bets).

  3. Maximizes long-run geometric growth rate. Over many bets, Kelly grows your bankroll faster than any other strategy with probability approaching 1.

  4. Is myopic. Each bet is sized independently based on current bankroll and edge. No need to plan ahead.

  5. Has high variance. Full Kelly can experience large drawdowns. The probability of seeing your bankroll cut in half at some point is significant.


13.4 Fractional Kelly and Practical Sizing

13.4.1 Why Full Kelly Is Too Aggressive

While the Kelly criterion is mathematically optimal for long-run growth, full Kelly betting is almost always too aggressive for real-world trading. Here is why:

Uncertainty in probability estimates. The Kelly formula assumes you know $p$ exactly. In reality, your probability estimate is itself uncertain. If you think an event has a 70% chance, your true confidence might be "somewhere between 60% and 80%." If the true probability turns out to be lower than your estimate, full Kelly will overbet.

Drawdown tolerance. Full Kelly has a 50% chance of experiencing a 50% drawdown from peak at some point. It has a 10% chance of experiencing a 90% drawdown. These drawdowns can be psychologically devastating and practically disastrous (margin calls, inability to pay rent).

Correlated bets. Kelly is derived for independent bets. In prediction markets, many bets are correlated (multiple elections in the same cycle, related economic events). Full Kelly on each individual bet ignores correlations and results in overexposure.

Model risk. Your model might be systematically wrong in ways you do not understand. Full Kelly provides no buffer against model failure.

13.4.2 Fractional Kelly

The standard practical solution is fractional Kelly: betting some fraction $\alpha$ of the full Kelly amount, where $0 < \alpha \leq 1$.

$$ f_{\text{practical}} = \alpha \cdot f^* $$

Common choices:

  • Half Kelly ($\alpha = 0.5$): The most popular choice. Achieves 75% of full Kelly's growth rate with dramatically lower drawdowns. With half Kelly, the probability of ever seeing a 50% drawdown drops from 50% to roughly 12.5%.
  • Quarter Kelly ($\alpha = 0.25$): Very conservative. Achieves roughly 44% of full Kelly's growth rate but has very modest drawdowns. Good for beginners or when you have low confidence in your estimates.
  • Third Kelly ($\alpha = 0.33$): A reasonable middle ground. Achieves roughly 55% of full Kelly's growth rate.

These growth figures follow from the small-edge approximation that betting $\alpha \cdot f^*$ achieves a fraction $\alpha(2 - \alpha)$ of the full Kelly growth rate.

13.4.3 Adjusting for Estimation Uncertainty

A principled way to choose $\alpha$ is to account for uncertainty in your probability estimate. If your estimate is $\hat{p}$ with standard error $\sigma_p$, you can compute Kelly for a range of plausible $p$ values and average the results.

A simpler heuristic: if your 80% confidence interval for $p$ is $[\hat{p} - \delta, \hat{p} + \delta]$, use the Kelly fraction computed at the conservative end of your range. For buying YES, use $p = \hat{p} - \delta$ in the formula. This naturally reduces your position size when you are less certain.

13.4.4 Over-Betting: The Asymmetric Danger

One of the most important insights from Kelly theory is that over-betting is far worse than under-betting. Consider the following:

  • Betting at 2x the Kelly fraction produces the same expected growth rate as not betting at all (it earns zero expected growth).
  • Betting at more than 2x Kelly produces negative expected growth, meaning you are expected to go broke.

This asymmetry means that when in doubt, bet less. Under-betting reduces your growth rate linearly, but over-betting can be catastrophic.

The growth rate as a function of bet size $f$ for a binary bet with probability $p$ and odds $b$ is:

$$ g(f) = p \ln(1 + bf) + (1-p) \ln(1 - f) $$

This function peaks at $f = f^*$ (the Kelly fraction) and is concave. It crosses zero at $f = 2f^*$ (approximately, for small $f^*$). Beyond $2f^*$, growth is negative.
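This shape is easy to verify numerically. A small self-checking sketch (`growth_rate` is our name for the function above):

```python
import math

def growth_rate(f: float, p: float, b: float) -> float:
    """Expected log-growth per bet when staking fraction f at net odds b with win probability p."""
    return p * math.log(1 + b * f) + (1 - p) * math.log(1 - f)

# A 55% coin at even money: f* = (bp - q)/b = 0.10
p, b = 0.55, 1.0
f_star = (b * p - (1 - p)) / b

# Growth peaks at f*, is roughly zero at 2f*, and turns negative beyond that.
assert growth_rate(f_star, p, b) > growth_rate(0.5 * f_star, p, b)
assert growth_rate(f_star, p, b) > growth_rate(1.5 * f_star, p, b)
assert abs(growth_rate(2 * f_star, p, b)) < 1e-3
assert growth_rate(2.5 * f_star, p, b) < 0
```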

13.4.5 Ruin Probability

Even with edge, there is always a probability of "ruin" (losing some specified fraction of your bankroll). For full Kelly:

$$ P(\text{bankroll ever drops to fraction } x \text{ of current}) = x $$

This means there is a 10% chance your bankroll will drop to 10% of its starting value at some point. For half Kelly:

$$ P(\text{bankroll ever drops to fraction } x) = x^3 $$

The ruin probability drops to 0.1% for reaching 10% of starting capital. This is a dramatic improvement that justifies fractional Kelly for almost all practical purposes. (More generally, betting a fraction $\alpha$ of full Kelly makes the exponent $2/\alpha - 1$.)

13.4.6 Practical Position Sizing Algorithm

Here is a practical algorithm for sizing prediction market positions:

  1. Estimate your probability $\hat{p}$ and confidence interval $[\hat{p} - \delta, \hat{p} + \delta]$.
  2. Compute Kelly using the conservative estimate $\hat{p} - \delta$ (for YES bets).
  3. Apply fractional multiplier $\alpha$ (start with 0.25 for beginners, 0.50 for experienced traders).
  4. Apply portfolio limits: no single position should exceed 5-10% of bankroll regardless of Kelly.
  5. Check correlation: if you already have positions in related markets, reduce size further.
  6. Account for fees: adjust for platform trading fees, which reduce your effective edge.
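Steps 1-4 above can be sketched as a single sizing function (illustrative only; the name, defaults, and 10% cap are our choices, and steps 5-6 would reduce the result further):

```python
def size_position(p_hat: float, delta: float, price: float,
                  alpha: float = 0.5, cap: float = 0.10) -> float:
    """Fraction of bankroll to stake on YES, per steps 1-4 of the algorithm.

    p_hat: point estimate of the event probability
    delta: half-width of your confidence interval on p_hat
    price: market price of the YES contract
    alpha: fractional-Kelly multiplier
    cap:   hard per-position limit as a fraction of bankroll
    """
    p_conservative = p_hat - delta      # steps 1-2: use the cautious end of the interval
    edge = p_conservative - price
    if edge <= 0:
        return 0.0                      # no edge left after hedging the estimate
    f_kelly = edge / (1 - price)        # full Kelly for buying YES
    return min(alpha * f_kelly, cap)    # steps 3-4: fraction, then portfolio cap
```

With $\hat{p} = 0.75$, $\delta = 0.05$, and price 0.60, the conservative Kelly fraction is 0.25, half Kelly gives 0.125, and the 10% cap binds, so the function returns 0.10.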

See code/example-02-kelly-criterion.py for a complete implementation including simulation of different Kelly fractions.


13.5 Edge Decomposition

13.5.1 Why Decompose Edge?

Knowing your total edge (the gap between your probability and the market price) is useful, but understanding where that edge comes from is even more valuable. Edge decomposition helps you:

  • Identify which skills or information sources contribute most to your profitability
  • Focus your efforts on strengthening your biggest advantages
  • Recognize when a particular source of edge is eroding
  • Make better decisions about which markets to trade

13.5.2 Components of Edge

We decompose total edge into four components:

Calibration Edge ($E_{\text{cal}}$)

The market may be systematically miscalibrated. For example, events priced at 80% might actually occur only 75% of the time. If you are better calibrated than the market, you have calibration edge. This is measured by comparing the market's calibration curve to yours.

$$ E_{\text{cal}} = q_{\text{calibrated\_market}} - p_{\text{market}} $$

where $q_{\text{calibrated\_market}}$ is what the market price should be after correcting for the market's calibration errors.

Information Edge ($E_{\text{info}}$)

You have access to information that is not yet reflected in the market price. This is the portion of your edge attributable to private or under-appreciated information.

$$ E_{\text{info}} = q_{\text{with\_info}} - q_{\text{without\_info}} $$

This measures how much your probability estimate changes when you incorporate your private information.

Model Edge ($E_{\text{model}}$)

Even with the same publicly available information, you may process it more accurately using a superior model. This is the portion of your edge from better analytical methodology.

$$ E_{\text{model}} = q_{\text{your\_model}} - q_{\text{naive\_model}} $$

where $q_{\text{naive\_model}}$ is the estimate from a simple baseline model using public data.

Timing Edge ($E_{\text{timing}}$)

You may trade at better moments -- entering before favorable price moves and avoiding unfavorable timing. This is measured by comparing your entry prices to the average price during the period.

$$ E_{\text{timing}} = p_{\text{avg}} - p_{\text{your\_entry}} $$

(for YES purchases; positive means you bought cheaper than average).

13.5.3 The Decomposition

Total edge can be approximately decomposed:

$$ E_{\text{total}} = E_{\text{cal}} + E_{\text{info}} + E_{\text{model}} + E_{\text{timing}} + E_{\text{residual}} $$

where $E_{\text{residual}}$ captures interaction effects and components not easily categorized. In practice, these components are not perfectly additive because they can interact. For example, your model might be better because it incorporates your private information. The decomposition is still useful as a rough attribution.

13.5.4 Measuring Each Component

Calibration Edge Measurement: To measure the market's calibration error, collect a large sample of market prices and outcomes. Group prices into bins (e.g., 0-10%, 10-20%, ..., 90-100%) and compare the average price in each bin to the actual frequency of YES outcomes. The gap is the calibration error at that price level.

import numpy as np

def measure_calibration_edge(prices, outcomes, n_bins=10):
    """Measure the market's average calibration error.

    prices, outcomes: arrays of market prices and 0/1 resolved outcomes.
    Returns the mean of (actual frequency - average price) across price bins;
    positive values mean the market systematically underprices events.
    """
    prices = np.asarray(prices)
    outcomes = np.asarray(outcomes)
    bins = np.linspace(0, 1, n_bins + 1)
    cal_errors = []
    for i in range(n_bins):
        # make the top bin inclusive so a price of exactly 1.0 is counted
        upper = bins[i + 1] + (1e-12 if i == n_bins - 1 else 0.0)
        mask = (prices >= bins[i]) & (prices < upper)
        if mask.sum() > 0:
            avg_price = prices[mask].mean()
            actual_freq = outcomes[mask].mean()
            cal_errors.append(actual_freq - avg_price)
    return np.mean(cal_errors)

Information Edge Measurement: Compare your probability estimates with and without your private information. If you record both estimates, the difference averaged over many trades gives your information edge.

Model Edge Measurement: Compare your model's predictions to a naive baseline (e.g., the market price itself, or a simple historical base rate). The improvement in Brier score or log-loss attributable to your model is a measure of model edge.

Timing Edge Measurement: Compare your average entry prices to the volume-weighted average price (VWAP) over the relevant period. If you consistently buy below VWAP, you have positive timing edge.
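The VWAP comparison can be sketched as follows (assuming you log your own fills and the market's trades; the function name is ours):

```python
def timing_edge(entry_prices, entry_sizes, market_prices, market_volumes):
    """Positive result means your YES entries were cheaper than the period's VWAP."""
    # Size-weighted average of your own entry prices
    my_avg = sum(p * s for p, s in zip(entry_prices, entry_sizes)) / sum(entry_sizes)
    # Volume-weighted average price of all trading over the period
    vwap = sum(p * v for p, v in zip(market_prices, market_volumes)) / sum(market_volumes)
    return vwap - my_avg
```

If you bought at 0.50 and 0.52 (equal size) while the market traded at a VWAP of 0.5625, your timing edge was about 5.25 cents per contract.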

See code/example-03-edge-tracker.py for a Python implementation of edge decomposition.


13.6 Sources of Edge in Prediction Markets

13.6.1 Domain Expertise

Perhaps the most accessible and reliable source of edge is deep knowledge in a specific domain. A climate scientist trading weather markets, an epidemiologist trading pandemic markets, or a political operative trading election markets each bring knowledge that most participants lack.

How to exploit it: Focus exclusively on markets within your domain. Build quantitative models informed by your expertise. Compare your model's outputs to market prices. Trade only when the gap exceeds a meaningful threshold.

Example: A trade policy expert notices that a market for "Will Country X impose tariffs on steel by year-end?" is priced at 25%. The expert knows that the relevant regulatory agency has already begun the administrative process, which historically leads to tariff imposition 70% of the time within the specified timeframe. The expert has a 45-percentage-point edge.

Durability: High, as long as you continue investing in domain knowledge. However, as prediction markets grow, more domain experts will participate, narrowing edges.

13.6.2 Quantitative Models

Building statistical or machine learning models that forecast events more accurately than the market consensus is a systematic approach to generating edge.

How to exploit it: Collect relevant data, build and backtest models, and trade when model predictions diverge significantly from market prices.

Example: You build a model that predicts election outcomes using economic indicators, approval ratings, and historical patterns. Your model has been backtested on 200 past elections and has a Brier score of 0.18, better than the prediction market's historical Brier score of 0.22 on similar events. You trade when your model's probability differs from the market by more than 5 percentage points.

Durability: Medium. Models can be replicated by other participants who collect the same data. However, continuous model improvement can maintain an edge.

13.6.3 News Speed

Being among the first to react to breaking news can generate edge, especially in markets that are slow to update.

How to exploit it: Monitor news sources, social media, and primary data feeds in real-time. Use automation to detect relevant news and execute trades quickly.

Example: An employment report is released showing unexpectedly strong job growth. You immediately buy YES on "Will the central bank raise interest rates at the next meeting?" before other traders update the market. You gain 3 cents of edge before the market adjusts.

Durability: Low to medium. Speed edges are eroded by faster competitors and market microstructure improvements. They require ongoing investment in infrastructure.

13.6.4 Cross-Market Information

Information in one market can inform trading decisions in related markets. This is analogous to pairs trading in financial markets.

How to exploit it: Monitor related markets for inconsistencies. If Market A implies X, and Market B implies not-X, at least one is mispriced. Trade accordingly.

Example: Market A prices "Party Z wins the election" at 60%. Market B prices "Party Z wins the popular vote" at 45%. Historically, winning the popular vote is a near-prerequisite for winning the election (though not always). This inconsistency suggests at least one market is mispriced. If you can determine which one is wrong, you have edge.
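A simple scanner can encode such relationships directly. A sketch under the stated (approximate) assumption that outcome A implies outcome B, so $P(A) \le P(B)$ should hold; the name and default tolerance are ours:

```python
def implication_inconsistent(price_a: float, price_b: float, tol: float = 0.05) -> bool:
    """Flag a likely mispricing between two related markets.

    If event A implies event B, then P(A) <= P(B) must hold, so a price
    for A more than tol above the price for B suggests at least one of
    the two markets is mispriced.
    """
    return price_a > price_b + tol
```

In the example above, "Party Z wins the election" at 0.60 against "Party Z wins the popular vote" at 0.45 would be flagged, while the reverse ordering would not.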

Durability: Medium. Cross-market arbitrage opportunities can persist because many traders focus on individual markets rather than analyzing relationships.

13.6.5 Behavioral Exploitation

Markets are composed of humans with systematic cognitive biases. Exploiting these biases is a durable source of edge.

Common biases to exploit:

Favorite-Longshot Bias: Markets tend to overprice low-probability events (longshots) and underprice high-probability events (favorites). Selling contracts priced at 5-10% that should be priced at 2-5% can be profitable over many trades.

Recency Bias: After a surprising event, markets overreact by pricing similar future events as too likely. After a political upset, markets may overprice upsets in the next election.

Narrative Bias: Events with compelling narratives get overpriced. "Will Team A complete an unprecedented comeback?" may be priced higher than the base rate would suggest because the story is exciting.

Anchoring: Markets can be slow to update from their initial prices. If a market opens at 50% and new information suggests 70%, the market might only move to 60%, anchored to the initial price.

13.6.6 Platform-Specific Inefficiencies

Different prediction market platforms have different user bases, fee structures, and liquidity profiles. These structural differences create exploitable inefficiencies.

Examples:

  • Price discrepancies between platforms trading the same event (cross-platform arbitrage)
  • Markets with low liquidity where prices lag information
  • New markets where prices have not yet converged to fair value
  • Events near resolution where prices stick at round numbers instead of moving to 95-99%

Durability: Variable. Cross-platform arbitrage opportunities may persist due to withdrawal delays and capital requirements. Liquidity-based edges are more transient.


13.7 Estimating Your True Probability

13.7.1 The Central Challenge

Everything in this chapter depends on accurately estimating $q$, the true probability of an event. If your estimate is wrong, your edge calculation is wrong, your Kelly sizing is wrong, and you will lose money. This section presents techniques for producing well-calibrated probability estimates.

13.7.2 Base Rate Analysis

The simplest and often most powerful starting point is the base rate: how often have similar events occurred in the past?

Steps:

  1. Define the reference class: what category of events does this belong to?
  2. Collect historical data on similar events.
  3. Compute the frequency of the outcome.
  4. Adjust for any factors that make the current situation different from the base rate.

Example: "Will the incumbent US president win re-election?"

  • Base rate: Since 1900, incumbents have won approximately 67% of the time (8 out of 12 who sought re-election).
  • Adjustment: Current economic conditions are poor, which historically correlates with incumbent losses. Adjust downward to perhaps 55%.

Base rates are powerful because they ground your estimate in data rather than narrative. They protect against overreaction to recent events and compelling stories.

13.7.3 Reference Class Forecasting

Reference class forecasting extends base rate analysis by carefully selecting the most relevant comparison group.

Steps:

  1. Identify the specific event you are forecasting.
  2. Define multiple possible reference classes (broad to narrow).
  3. For each class, compute the base rate.
  4. Select the reference class that is most relevant while still having sufficient sample size.
  5. Adjust the base rate for factors specific to the current situation.

Example: "Will this tech startup IPO above its listing price on the first day?"

  • Broad class: All IPOs -- about 65% trade above listing price on day one.
  • Narrower class: Tech IPOs -- about 72%.
  • Narrowest class: Tech IPOs in the current market environment (bull market) -- about 80%.
  • Adjustment for specific company factors: strong revenue growth, popular brand -- adjust to 85%.

The tension in reference class forecasting is between relevance and sample size. The most relevant reference class may contain only a few data points, making the base rate unreliable.

13.7.4 Model-Based Estimation

For events with rich quantitative data, building a statistical model can produce superior estimates.

Common approaches:

  • Logistic regression: Model the binary outcome as a function of predictive features.
  • Ensemble methods: Combine predictions from multiple models (random forests, gradient boosting, etc.) for more robust estimates.
  • Bayesian models: Start with a prior (the base rate) and update with new evidence.

Practical tips for model building:

  • Always hold out data for validation. Backtested performance on the same data used for training is meaningless.
  • Be skeptical of complex models. Simple models with a few strong features often outperform complex models, especially with limited training data.
  • Quantify model uncertainty. A model that outputs 70% +/- 15% calls for very different trading than one that outputs 70% +/- 3%.
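As a minimal illustration of the first two tips -- a simple model, validated only on held-out data -- here is a one-feature logistic regression fit by gradient descent. Everything is synthetic and pure-Python for self-containment; in practice you would use a real library and real market data:

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """One-feature logistic regression fit by batch gradient descent."""
    w = b = 0.0
    n = len(xs)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y
            gw += err * x / n
            gb += err / n
        w -= lr * gw
        b -= lr * gb
    return w, b

def brier(w, b, data):
    """Mean squared error of the model's probabilities on (x, y) pairs."""
    return sum((sigmoid(w * x + b) - y) ** 2 for x, y in data) / len(data)

# Synthetic data: the true P(YES) rises with the feature x.
random.seed(0)
xs = [random.uniform(-2, 2) for _ in range(400)]
data = [(x, 1 if random.random() < sigmoid(2 * x - 1) else 0) for x in xs]
train, holdout = data[:300], data[300:]  # always hold out data

w, b = fit_logistic([x for x, _ in train], [y for _, y in train])
print("holdout Brier:", round(brier(w, b, holdout), 3))
```

The only score that matters is the one on `holdout`; the training score always flatters the model.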

13.7.5 Combining Multiple Estimates

When you have multiple probability estimates (base rate, model, expert judgment), combining them typically produces better forecasts than using any single estimate.

Simple average: The most straightforward approach is to average your estimates. This is surprisingly effective.

$$ q_{\text{combined}} = \frac{1}{n}\sum_{i=1}^{n} q_i $$

Weighted average: If some estimates are more reliable, weight them accordingly.

$$ q_{\text{combined}} = \sum_{i=1}^{n} w_i q_i, \quad \sum w_i = 1 $$

Extremization: Research on forecasting tournaments shows that after averaging, you should often push the combined estimate away from 50%. If the average is 60%, the "extremized" estimate might be 65%. This corrects for the tendency of averages to be underconfident.

$$ q_{\text{extremized}} = \frac{q_{\text{combined}}^a}{q_{\text{combined}}^a + (1 - q_{\text{combined}})^a} $$

where $a > 1$ is the extremization parameter (typically 1.5-2.5).
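The averaging and extremization formulas above combine into a few lines. This sketch fixes $a = 2$ for concreteness; as noted, the appropriate parameter varies, so treat the constants as illustrative:

```python
def extremize(q, a=2.0):
    """Push a combined probability estimate away from 0.5 (requires a > 1)."""
    return q ** a / (q ** a + (1 - q) ** a)

def combine(estimates, weights=None, a=2.0):
    """Weighted average of probability estimates, then extremization."""
    if weights is None:
        weights = [1.0 / len(estimates)] * len(estimates)
    avg = sum(w * q for w, q in zip(weights, estimates))
    return avg, extremize(avg, a)

# Base rate, model, and expert judgment for the same event:
avg, ext = combine([0.55, 0.60, 0.65])
print(round(avg, 3), round(ext, 3))  # → 0.6 0.692
```

Note that extremization leaves 0.5 fixed and moves everything else toward the nearest extreme; with $a = 2$, an averaged 60% becomes roughly 69%.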

13.7.6 Confidence Intervals on Probability Estimates

A crucial but often neglected step is quantifying how uncertain you are about your probability estimate. This is uncertainty about a probability, not the probability itself.

If you estimate $q = 0.70$ with an 80% confidence interval of $[0.60, 0.80]$, this means:

  • Your best guess is 70%
  • You are 80% confident the true probability lies between 60% and 80%
  • There is a 10% chance the true probability is below 60% and a 10% chance it is above 80%

This uncertainty directly feeds into your Kelly sizing. A narrow confidence interval supports larger bets; a wide interval calls for smaller bets.

def kelly_with_uncertainty(prob_estimate, confidence_interval, market_price, alpha=0.5):
    """Compute fractional Kelly using the conservative end of the
    confidence interval rather than the point estimate."""
    conservative_prob = confidence_interval[0]  # lower bound for YES bets
    if conservative_prob <= market_price:
        return 0.0  # no bet -- edge not significant at the lower bound
    kelly = (conservative_prob - market_price) / (1 - market_price)
    return alpha * kelly

13.8 Edge Decay and Competition

13.8.1 How Edges Erode Over Time

No edge lasts forever. Understanding the lifecycle of an edge helps you plan strategically and adapt to changing conditions.

Information edges decay fastest. Once your private information becomes public, the edge vanishes instantly. A news-based edge might last minutes or hours. An edge from unpublished research might last weeks or months until the research is published or independently replicated.

Analytical edges decay at medium speed. As other traders adopt similar models or analytical techniques, the advantage narrows. A novel quantitative model might provide edge for months or years, but eventually, competitors will develop comparable models.

Behavioral edges decay slowest. Cognitive biases are deeply ingrained in human psychology. The favorite-longshot bias has been documented for decades and still persists in many markets. However, the magnitude of behavioral mispricings may shrink as markets attract more sophisticated participants.

13.8.2 The Market Efficiency Trajectory

Prediction markets tend to become more efficient over time:

  1. Early stage: Markets have thin liquidity and attract mostly recreational traders. Edges are large and abundant. Simple strategies (base rate trading, exploiting obvious mispricings) are profitable.

  2. Growth stage: More participants enter, including quantitative traders and domain experts. Easy mispricings are arbitraged away. Edges become smaller and harder to find. Simple strategies become less profitable.

  3. Mature stage: Markets are highly liquid and attract sophisticated institutional participants. Edges are small, fleeting, and require significant investment (data, models, infrastructure) to exploit. Only traders with genuine comparative advantage can profit consistently.

Most prediction market platforms in the mid-2020s are somewhere between stages 1 and 2, meaning there are still meaningful opportunities for skilled traders.

13.8.3 Adapting to Competition

To maintain edge as markets become more competitive:

  1. Specialize. Focus on niche markets where you have genuine domain expertise and fewer competitors.
  2. Invest in infrastructure. Better data, faster execution, and more sophisticated models create barriers to competition.
  3. Seek new markets. When new event categories or platforms launch, there are often early mover advantages.
  4. Combine edge sources. Even if each individual edge source is small, combining domain expertise, quantitative models, and behavioral awareness can produce a meaningful total edge.
  5. Track and adapt. Continuously monitor your realized edge. When it shrinks, investigate why and adjust your strategy.

13.8.4 The Lifecycle of a Trading Strategy

A typical prediction market trading strategy goes through distinct phases:

Discovery phase (weeks to months): You identify a new source of edge. Initial trades are highly profitable. You refine your approach and scale up.

Exploitation phase (months to years): You systematically exploit the edge. Returns are consistent but may gradually decline as the market adapts or competitors enter.

Decay phase (months to years): Returns diminish. The edge becomes smaller than your costs of trading (fees, time, opportunity cost). You reduce position sizes.

Retirement phase: The edge is gone. You stop trading this strategy and redirect your efforts to discovering new edges.

Successful long-term prediction market traders do not rely on a single strategy forever. They maintain a pipeline of strategies at different lifecycle stages, replacing declining edges with newly discovered ones.


13.9 When You Don't Have Edge

13.9.1 Recognizing Negative EV Situations

The most important skill in trading is not finding edge -- it is recognizing when you do not have edge. Here are warning signs that you are trading without an advantage:

You cannot articulate your edge. If someone asks "Why do you think the market is wrong?" and your answer is vague ("I just feel like it should be higher"), you probably do not have edge.

Your information is already public. If your trading thesis is based on a news article that was published hours ago, the market has likely already incorporated that information.

You are in a highly liquid, well-followed market. Markets with many active, sophisticated participants are harder to beat. Your edge in "Will the US president win re-election?" is probably smaller than your edge in "Will the mayor of a small city win re-election?" because the former attracts far more analytical attention.

Your track record shows no edge. If you have made 100+ trades and your overall return is negative or near zero, you do not have edge (at least not with your current approach). The sample size is large enough that positive edge should be apparent.

You are emotionally attached to your position. If you are rooting for a particular outcome, your probability estimate is likely biased. The desire for an outcome to occur is not evidence that it will.

13.9.2 Entertainment Value vs. Profit Motive

There is nothing wrong with trading prediction markets for entertainment, education, or engagement with current events. But it is crucial to be honest about your motivation:

If you are trading for entertainment: Set a fixed "entertainment budget" that you are willing to lose entirely, similar to a night out or a sports bet. Do not use Kelly sizing or expect positive returns. Enjoy the experience and learn from it.

If you are trading for profit: Apply the framework in this chapter rigorously. Trade only when you have quantified edge. Size positions with fractional Kelly. Track your results. Be willing to sit on the sidelines when no opportunities meet your threshold.

The dangerous middle ground is trading for entertainment while convincing yourself you are trading for profit. This leads to risk-taking without discipline and often to significant losses.

13.9.3 The Importance of Honest Self-Assessment

The prediction market community, like all trading communities, suffers from survivorship bias. You hear about traders who turned \$1,000 into \$100,000 but not about the many who turned \$10,000 into \$0. This creates the illusion that beating the market is easier than it actually is.

To honestly assess your edge:

  1. Keep detailed records. Log every trade with your probability estimate, rationale, market price, and eventual outcome.
  2. Compute your realized edge. After a sufficient number of trades (at least 50-100), compare your forecasts to outcomes. Are your 70% estimates actually correct 70% of the time?
  3. Compare to a naive strategy. What would have happened if you had simply bet on every market at the market price? If your returns are not meaningfully better than this benchmark, you do not have edge.
  4. Seek objective feedback. Share your track record with other forecasters. Participate in forecasting tournaments where your skill is measured objectively.
  5. Consider fees. Even if your forecasts are slightly better than the market, fees may consume your edge. Calculate your returns net of all costs.
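Item 2 -- checking whether your 70% estimates actually resolve YES about 70% of the time -- can be automated by bucketing your forecasts. A sketch over a hypothetical (and deliberately tiny) trade log; a real check needs far more trades per bucket:

```python
from collections import defaultdict

def calibration_table(forecasts, outcomes, n_buckets=10):
    """Bucket forecasts by decile and compare the mean forecast in each
    bucket to the observed frequency of YES outcomes."""
    buckets = defaultdict(list)
    for q, o in zip(forecasts, outcomes):
        # Small epsilon guards against float quirks like 0.7 / 0.1 == 6.999...
        key = min(int(q * n_buckets + 1e-9), n_buckets - 1)
        buckets[key].append((q, o))
    rows = []
    for key in sorted(buckets):
        pairs = buckets[key]
        mean_q = sum(q for q, _ in pairs) / len(pairs)
        freq = sum(o for _, o in pairs) / len(pairs)
        rows.append((round(mean_q, 2), round(freq, 2), len(pairs)))
    return rows

# Hypothetical log: forecast probabilities and outcomes (1 = YES).
forecasts = [0.72, 0.68, 0.71, 0.30, 0.28, 0.33, 0.75, 0.70]
outcomes = [1, 1, 0, 0, 1, 0, 1, 1]
for mean_q, freq, n in calibration_table(forecasts, outcomes):
    print(f"forecast ~{mean_q}: resolved YES {freq} of the time (n={n})")
```

A well-calibrated forecaster shows `mean_q` close to `freq` in every well-populated bucket.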

13.9.4 Tracking and Measuring Your Actual Edge

To measure edge empirically, compute the following metrics over your trading history:

Realized return per trade: Averaging your stated edge, $q_i - p_i$ (your estimate minus the market price at trade time), measures only what you believed, not whether you were right. To measure edge empirically, judge each trade by its outcome. For each trade where you bought YES, compute your actual return:

$$ R_i = \begin{cases} 1 - p_i & \text{if event occurred} \\ -p_i & \text{if event did not occur} \end{cases} $$

Your average return $\bar{R}$ should be positive if you have edge. Compute a confidence interval around $\bar{R}$ to assess statistical significance.
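Computing $\bar{R}$ and its interval is straightforward. The sketch below uses a normal approximation and a hypothetical eight-trade log; with so few trades the interval is very wide, which is exactly the point -- you need 50-100+ trades before it says much:

```python
import math

def yes_trade_returns(trades):
    """Per-contract return for YES purchases: 1 - p if the event
    occurred, -p if it did not (p = price paid)."""
    return [(1 - p) if occurred else -p for p, occurred in trades]

def mean_with_ci(returns, z=1.96):
    """Mean return with a normal-approximation 95% confidence interval."""
    n = len(returns)
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)
    half = z * math.sqrt(var / n)
    return mean, (mean - half, mean + half)

# Hypothetical trade log: (price paid for YES, did the event occur?)
trades = [(0.40, True), (0.55, False), (0.30, True), (0.62, True),
          (0.45, False), (0.35, True), (0.50, True), (0.58, False)]
mean, (low, high) = mean_with_ci(yes_trade_returns(trades))
print(round(mean, 3), round(low, 3), round(high, 3))
```

If the interval contains zero, as it will here, you cannot yet distinguish your results from luck.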

Brier score comparison: Compare your Brier score (mean squared error of your probability forecasts) to the market's Brier score:

$$ \text{BS}_{\text{you}} = \frac{1}{N}\sum_{i=1}^{N}(q_i - o_i)^2 $$

$$ \text{BS}_{\text{market}} = \frac{1}{N}\sum_{i=1}^{N}(p_i - o_i)^2 $$

where $o_i \in \{0, 1\}$ is the actual outcome. If $\text{BS}_{\text{you}} < \text{BS}_{\text{market}}$, your forecasts are more accurate.
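The Brier comparison reduces to one mean-squared-error function applied twice: once to your forecasts, once to the market prices. All numbers below are hypothetical:

```python
def brier(probs, outcomes):
    """Mean squared error of probability forecasts against 0/1 outcomes."""
    return sum((q - o) ** 2 for q, o in zip(probs, outcomes)) / len(probs)

# Hypothetical data: your forecasts, market prices at trade time, outcomes.
your_probs = [0.70, 0.20, 0.85, 0.40, 0.60]
market_prices = [0.55, 0.30, 0.75, 0.50, 0.45]
results = [1, 0, 1, 0, 1]

bs_you = brier(your_probs, results)
bs_market = brier(market_prices, results)
print(round(bs_you, 3), round(bs_market, 3), bs_you < bs_market)
```

Lower is better; beating the market's Brier score over a large sample is direct evidence of forecasting edge.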


13.10 Building an Edge Tracking System

13.10.1 Why Track Systematically?

Human memory is unreliable and biased. We remember our brilliant winning trades and forget our losing ones. Without systematic tracking, we cannot honestly assess our performance. A good edge tracking system:

  • Records every trade with full context
  • Computes key performance metrics automatically
  • Identifies patterns in your trading (which market types are most profitable, which lead to losses)
  • Provides early warning when your edge is declining
  • Forces accountability and intellectual honesty

13.10.2 What to Log for Each Trade

For every trade, record:

  1. Trade metadata: Date, time, market ID, market question, platform
  2. Your analysis: Your probability estimate ($q$), confidence interval, rationale (in writing)
  3. Market data: Market price ($p$) at time of trade, recent price history, liquidity/volume
  4. Position details: Side (YES/NO), quantity, price paid, total cost
  5. Kelly analysis: Computed Kelly fraction, actual fraction used, reason for any deviation
  6. Edge classification: Primary source of edge (information, analytical, behavioral, etc.)
  7. Resolution: Actual outcome, P&L, resolution date
  8. Post-mortem: Was your analysis correct? What did you miss? What would you do differently?

13.10.3 Key Performance Metrics

Your edge tracking system should compute:

Profit and Loss (P&L):

  • Total P&L (absolute dollars)
  • P&L per trade (average)
  • P&L by market category
  • P&L by edge source

Calibration metrics:

  • Calibration curve (predicted vs. actual probabilities)
  • Brier score
  • Log-loss score

Risk metrics:

  • Sharpe ratio (mean return / standard deviation of returns)
  • Maximum drawdown (largest peak-to-trough decline)
  • Win rate (fraction of trades with positive P&L)
  • Profit factor (gross profit / gross loss)

Edge metrics:

  • Average edge per trade (your probability - market price, signed by direction of bet)
  • Realized edge (actual returns minus market-implied expected returns)
  • Edge by category (information, analytical, behavioral, timing)
  • Edge trend over time (is your edge growing or shrinking?)
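Several of the risk metrics can be computed directly from a chronological per-trade P&L series. A minimal sketch with hypothetical dollar figures (Sharpe is omitted because it needs returns normalized by capital at risk, not raw P&L):

```python
def risk_metrics(pnls):
    """Win rate, profit factor, and maximum drawdown from a
    chronological per-trade P&L series."""
    wins = [p for p in pnls if p > 0]
    losses = [-p for p in pnls if p < 0]
    win_rate = len(wins) / len(pnls)
    profit_factor = sum(wins) / sum(losses) if losses else float("inf")
    peak = cum = max_dd = 0.0  # drawdown is measured on the cumulative curve
    for p in pnls:
        cum += p
        peak = max(peak, cum)
        max_dd = max(max_dd, peak - cum)
    return win_rate, profit_factor, max_dd

# Hypothetical per-trade P&L in dollars, in chronological order.
win_rate, profit_factor, max_dd = risk_metrics([40, -25, 10, -30, 55, -15, 20])
print(round(win_rate, 3), round(profit_factor, 3), max_dd)
```

Note that a strategy can have a win rate below 50% and still be profitable if the profit factor exceeds 1 -- which is why no single metric should be read in isolation.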

13.10.4 P&L Attribution

P&L attribution decomposes your total profits into sources, helping you understand what drives your returns.

$$ \text{Total P\&L} = \sum_{\text{categories}} \text{P\&L}_{\text{category}} $$

Useful decompositions:

  • By market type: Politics, sports, economics, science, etc.
  • By edge source: Information, analytical, behavioral, timing
  • By position size: Small, medium, large positions
  • By market price range: Low probability (0-20%), medium (20-80%), high (80-100%)
  • By holding period: Short-term (< 1 week), medium (1-4 weeks), long (> 4 weeks)
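Each decomposition is a simple group-and-sum over the trade log. A sketch, assuming each trade is a dict carrying a `pnl` field plus categorical attributes (the field names and figures are hypothetical):

```python
from collections import defaultdict

def pnl_attribution(trades, key):
    """Sum realized P&L by one categorical attribute of each trade."""
    totals = defaultdict(float)
    for trade in trades:
        totals[trade[key]] += trade["pnl"]
    return dict(totals)

# Hypothetical resolved trades.
trades = [
    {"category": "politics", "edge_source": "analytical", "pnl": 120.0},
    {"category": "sports", "edge_source": "behavioral", "pnl": -45.0},
    {"category": "politics", "edge_source": "information", "pnl": 60.0},
    {"category": "sports", "edge_source": "analytical", "pnl": -30.0},
]
print(pnl_attribution(trades, "category"))     # → {'politics': 180.0, 'sports': -75.0}
print(pnl_attribution(trades, "edge_source"))
```

Running the same function with different keys gives each cut of the attribution table in turn.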

Patterns in your P&L attribution reveal your true strengths and weaknesses. You might discover that you are profitable in political markets but lose money in sports markets, or that your information edge is real but your analytical edge is zero.

13.10.5 Python Edge Tracker

See code/example-03-edge-tracker.py for a complete Python implementation of an edge tracking system. The system provides:

  • Trade logging with full metadata
  • Automatic P&L computation
  • Calibration analysis
  • Edge decomposition
  • Performance reports
  • Visualization of results over time

The core structure:

from datetime import datetime

class EdgeTracker:
    def __init__(self):
        self.trades = []

    def log_trade(self, market_id, question, your_prob, market_price,
                  side, quantity, price, edge_source, rationale):
        """Log a new trade."""
        trade = {
            "timestamp": datetime.now(),
            "market_id": market_id,
            "question": question,
            "your_prob": your_prob,
            "market_price": market_price,
            "side": side,
            "quantity": quantity,
            "price": price,
            "edge_source": edge_source,
            "rationale": rationale,
            "resolved": False,
        }
        self.trades.append(trade)

    def resolve_trade(self, market_id, outcome):
        """Record the outcome of a trade (True if the event occurred)."""
        for trade in self.trades:
            if trade["market_id"] == market_id and not trade["resolved"]:
                trade["outcome"] = outcome
                trade["resolved"] = True
                trade["pnl"] = self._compute_pnl(trade, outcome)

    @staticmethod
    def _compute_pnl(trade, outcome):
        """P&L assuming `price` is the per-contract price paid for the
        side actually bought; winning contracts pay out 1 each."""
        won = outcome if trade["side"] == "YES" else not outcome
        payout = trade["quantity"] if won else 0.0
        return payout - trade["quantity"] * trade["price"]

    def performance_report(self):
        """Generate a summary performance report over resolved trades."""
        resolved = [t for t in self.trades if t["resolved"]]
        if not resolved:
            return {}
        total = sum(t["pnl"] for t in resolved)
        return {
            "n_trades": len(resolved),
            "total_pnl": total,
            "avg_pnl": total / len(resolved),
            "win_rate": sum(t["pnl"] > 0 for t in resolved) / len(resolved),
        }

13.11 Chapter Summary

This chapter has presented a comprehensive framework for finding, quantifying, sizing, and tracking your edge in prediction markets.

Key insights:

  1. Edge is the difference between your true probability and the market price. If your estimate is not more accurate than the market's, you do not have edge and should not trade.

  2. Expected value is simply $q - p$. Every prediction market trade can be evaluated by comparing your probability estimate to the market price. Positive EV is necessary but not sufficient for profitable trading -- you also need proper position sizing.

  3. The Kelly criterion tells you how much to bet. For buying YES at market price $p$ with estimated true probability $q$: $f^* = (q - p)/(1 - p)$. For buying NO: $f^* = (p - q)/p$.

  4. Use fractional Kelly in practice. Half Kelly achieves 75% of full Kelly's growth with much lower risk. Quarter Kelly is appropriate for beginners or uncertain estimates.

  5. Edge comes from identifiable sources. Information, analytical skill, speed, behavioral exploitation, and platform-specific inefficiencies each contribute to your total edge. Knowing where your edge comes from helps you maintain and improve it.

  6. Honest probability estimation is the foundation. Base rate analysis, reference class forecasting, model building, and estimate combination are practical tools for producing calibrated probability estimates.

  7. Edges decay over time. As markets become more efficient and competition increases, today's edge may not exist tomorrow. Successful traders maintain a pipeline of strategies and continuously seek new sources of edge.

  8. Most traders do not have edge. This is the most important lesson. Honest self-assessment, rigorous tracking, and a willingness to sit on the sidelines are what separate profitable traders from those who slowly lose money while entertaining themselves.

  9. Track everything. A systematic edge tracking system is essential for measuring your performance, identifying strengths and weaknesses, and maintaining intellectual honesty.


What's Next

In Chapter 14, we will explore Portfolio Construction and Risk Management, building on the position sizing concepts from this chapter. You will learn how to manage a portfolio of prediction market positions, handle correlations between bets, and protect your bankroll against catastrophic losses. While this chapter focused on individual trade sizing, Chapter 14 addresses the challenge of combining many trades into a coherent portfolio strategy.

We will also introduce concepts like:

  • Multi-market Kelly optimization
  • Correlation-adjusted position sizing
  • Drawdown limits and stop-loss rules
  • Portfolio rebalancing in prediction markets
  • Stress testing and scenario analysis

The skills from this chapter -- computing EV, applying Kelly, decomposing edge, and tracking performance -- will form the foundation for everything that follows in Part III.