Chapter 14: Binary Outcome Trading Strategies

> "In prediction markets, the edge belongs not to those who predict the future, but to those who price it more accurately than the crowd."

Binary prediction markets offer a unique trading environment. Unlike equities or commodities, every contract resolves to exactly 0 or 1. This terminal certainty creates distinct strategic opportunities that do not exist in traditional markets. A stock can drift sideways for years; a binary contract must eventually settle. This chapter presents a comprehensive toolkit of trading strategies designed specifically for this binary structure, from fundamental probability modeling to the mechanical exploitation of expiry convergence.

We will build on the probability foundations from Chapter 4 and the market structure concepts from Chapter 13. By the end of this chapter, you will have a working vocabulary of strategies, the mathematical frameworks to evaluate them, and Python implementations you can adapt to live markets.


14.1 Strategy Classification Framework

Before diving into individual strategies, we need a taxonomy. Binary market strategies can be organized along several dimensions.

14.1.1 The Strategy Spectrum

At the highest level, strategies differ in their information source and time horizon:

| Strategy Type   | Information Source        | Typical Horizon    | Edge Source                   |
|-----------------|---------------------------|--------------------|-------------------------------|
| Fundamental     | Domain expertise, models  | Days to months     | Superior probability estimate |
| Event-Driven    | Catalyst identification   | Hours to days      | Speed and interpretation      |
| Mean Reversion  | Price history, statistics | Hours to days      | Noise vs. signal distinction  |
| Closing the Gap | Expiry mechanics          | Days to resolution | Convergence certainty         |
| Momentum        | Price trends              | Hours to weeks     | Information cascade           |
| Contrarian      | Behavioral analysis       | Days to weeks      | Crowd overreaction            |
| News/Sentiment  | Real-time information     | Minutes to hours   | Speed advantage               |

14.1.2 When Each Strategy Applies

No single strategy dominates across all market conditions. The choice depends on several factors:

Market maturity. A newly listed contract with little liquidity may be best approached with fundamental analysis. A mature market with deep order books and extensive trading history may offer mean-reversion opportunities.

Time to expiry. Closing-the-gap strategies become increasingly powerful as expiry approaches. Fundamental strategies lose relevance in the final hours when information has already been incorporated.

Information environment. Event-driven and news strategies thrive when catalysts are identifiable. In calm periods, mean reversion and fundamental approaches may be more appropriate.

Liquidity. Momentum strategies require sufficient liquidity to enter and exit positions. In thin markets, the bid-ask spread alone may consume any edge.

14.1.3 Risk-Reward Profiles

Each strategy has a characteristic risk-reward signature:

$$\text{Expected Value} = P(\text{win}) \times \text{Payoff}_{\text{win}} + P(\text{loss}) \times \text{Payoff}_{\text{loss}}$$

For a binary contract purchased at price $p$ with estimated true probability $q$:

$$E[\text{profit}] = q \times (1 - p) - (1 - q) \times p = q - p$$

This elegantly simple formula --- expected profit equals the difference between your probability estimate and the market price --- is the foundation of all binary market strategies. Every strategy we discuss is ultimately a method for finding situations where $q - p > 0$ (for long positions) or $p - q > 0$ (for short positions).

14.1.4 The Kelly Criterion in Binary Markets

Given an estimated edge, the optimal position size follows the Kelly Criterion. For a binary contract at price $p$ with estimated true probability $q$:

$$f^* = \frac{q - p}{1 - p} \quad \text{(for buying YES)}$$

$$f^* = \frac{p - q}{p} \quad \text{(for buying NO / selling YES)}$$

where $f^*$ is the fraction of bankroll to wager. In practice, traders use fractional Kelly (typically $\frac{1}{4}$ to $\frac{1}{2}$ Kelly) to account for estimation error.
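
A minimal sketch of these sizing rules in code, with a `fraction` parameter for fractional Kelly (the function name and defaults are illustrative, not from any specific library):

```python
def kelly_size(q: float, p: float, fraction: float = 0.5) -> dict:
    """Fractional Kelly stake for a binary contract at price p,
    given an estimated true probability q."""
    if q > p:  # edge on the YES side
        full_kelly = (q - p) / (1 - p)
        direction = 'BUY_YES'
    elif q < p:  # edge on the NO side
        full_kelly = (p - q) / p
        direction = 'BUY_NO'
    else:
        return {'direction': 'NO_TRADE', 'stake': 0.0}
    return {'direction': direction, 'stake': fraction * full_kelly}

# Market at 0.40, our estimate 0.48, half Kelly:
# full Kelly = (0.48 - 0.40) / 0.60 = 0.133, so stake ~6.7% of bankroll
print(kelly_size(q=0.48, p=0.40))
```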


14.2 Fundamental Analysis Strategies

Fundamental analysis in prediction markets means building a probability model from first principles and trading when the market price diverges from your estimate. This is the most intellectually demanding strategy but often the most rewarding.

14.2.1 The Fundamental Process

A systematic fundamental analysis follows five steps:

  1. Decompose the event into measurable components
  2. Gather relevant data for each component
  3. Model the relationship between components and outcome
  4. Estimate the probability using the model
  5. Compare to market price and trade if edge exists

14.2.2 Example: Election Market Fundamentals

Consider a market on whether Candidate A will win a presidential election. A fundamental model might incorporate:

Polling data. Raw polls are noisy. A proper model must account for:

  • Pollster quality (historical accuracy)
  • Sample size and methodology
  • Timing (polls closer to the election are more predictive)
  • Likely voter vs. registered voter screens

Fundamentals. Political science research identifies several non-polling predictors:

  • Incumbent party tenure (voters tire of long-ruling parties)
  • Economic conditions (GDP growth, unemployment, real income)
  • Incumbent approval rating
  • Primary performance

Historical base rates. How often do candidates in similar positions win? This provides a prior probability that can be updated with current data.

The Bayesian framework combines these:

$$P(\text{A wins} \mid \text{data}) = \frac{P(\text{data} \mid \text{A wins}) \times P(\text{A wins})}{P(\text{data})}$$

In practice, we often work with log-odds for easier computation:

$$\log \frac{P(\text{A wins})}{P(\text{A loses})} = \log \frac{\pi}{1 - \pi} + \sum_{i} w_i \cdot x_i$$

where $\pi$ is the prior probability and $x_i$ are standardized predictor values with weights $w_i$.

14.2.3 Building the Model in Python

```python
import numpy as np
from dataclasses import dataclass
from typing import List, Dict, Optional

@dataclass
class PollData:
    """A single polling observation."""
    pollster: str
    date: str
    sample_size: int
    candidate_a_pct: float
    candidate_b_pct: float
    pollster_rating: float  # 0-1, higher is better
    methodology: str  # 'live', 'online', 'ivr'

@dataclass
class FundamentalFactors:
    """Non-polling predictors."""
    gdp_growth_pct: float
    unemployment_rate: float
    incumbent_approval: float
    incumbent_party_terms: int
    primary_vote_share: float

class FundamentalProbabilityModel:
    """
    Estimates binary outcome probability from fundamentals.
    Combines polling data with structural predictors using
    a Bayesian-inspired weighted model.
    """

    # Weights derived from historical analysis
    POLL_WEIGHT = 0.60
    FUNDAMENTALS_WEIGHT = 0.25
    HISTORICAL_WEIGHT = 0.15

    # Methodology quality adjustments
    METHODOLOGY_QUALITY = {
        'live': 1.0,
        'online': 0.85,
        'ivr': 0.75,
    }

    def __init__(self):
        self.polls: List[PollData] = []
        self.fundamentals: Optional[FundamentalFactors] = None
        self.historical_base_rate: float = 0.50

    def add_poll(self, poll: PollData):
        self.polls.append(poll)

    def set_fundamentals(self, factors: FundamentalFactors):
        self.fundamentals = factors

    def set_historical_base_rate(self, rate: float):
        self.historical_base_rate = rate

    def _weighted_poll_average(self) -> float:
        """
        Calculate quality-weighted polling average.
        Weights account for recency, sample size, pollster quality,
        and methodology.
        """
        if not self.polls:
            return 0.50

        weights = []
        values = []

        for poll in self.polls:
            # Recency weight (exponential decay, assuming
            # polls are sorted or we use date-based weighting)
            recency_w = 1.0  # Simplified; full version uses dates

            # Sample size weight (sqrt scaling)
            size_w = np.sqrt(poll.sample_size / 1000.0)

            # Pollster quality weight
            quality_w = poll.pollster_rating

            # Methodology weight
            method_w = self.METHODOLOGY_QUALITY.get(
                poll.methodology, 0.80
            )

            total_weight = recency_w * size_w * quality_w * method_w
            weights.append(total_weight)

            margin = poll.candidate_a_pct - poll.candidate_b_pct
            values.append(margin)

        weights = np.array(weights)
        values = np.array(values)
        weights = weights / weights.sum()

        weighted_margin = np.sum(weights * values)
        return self._margin_to_probability(weighted_margin)

    def _margin_to_probability(self, margin: float) -> float:
        """Convert polling margin to win probability using logistic function."""
        # Historical calibration: each polling point ~ 0.3 log-odds
        log_odds = 0.3 * margin
        prob = 1 / (1 + np.exp(-log_odds))
        return np.clip(prob, 0.01, 0.99)

    def _fundamentals_probability(self) -> float:
        """Estimate probability from non-polling fundamentals."""
        if self.fundamentals is None:
            return 0.50

        f = self.fundamentals
        score = 0.0

        # GDP growth: positive growth favors incumbent party
        score += 0.15 * f.gdp_growth_pct

        # Unemployment: lower is better for incumbent
        score -= 0.10 * (f.unemployment_rate - 5.0)

        # Approval: each point above 50% adds to log-odds
        score += 0.05 * (f.incumbent_approval - 50.0)

        # Fatigue: penalty for long incumbent party tenure
        score -= 0.20 * max(0, f.incumbent_party_terms - 1)

        # Primary strength
        score += 0.03 * (f.primary_vote_share - 50.0)

        return self._margin_to_probability(score)

    def estimate_probability(self) -> Dict[str, float]:
        """
        Combine all sources into final probability estimate.

        Returns dict with component and final probabilities.
        """
        poll_prob = self._weighted_poll_average()
        fund_prob = self._fundamentals_probability()
        hist_prob = self.historical_base_rate

        # Combine using log-odds averaging
        def to_log_odds(p):
            p = np.clip(p, 0.001, 0.999)
            return np.log(p / (1 - p))

        def from_log_odds(lo):
            return 1 / (1 + np.exp(-lo))

        combined_log_odds = (
            self.POLL_WEIGHT * to_log_odds(poll_prob)
            + self.FUNDAMENTALS_WEIGHT * to_log_odds(fund_prob)
            + self.HISTORICAL_WEIGHT * to_log_odds(hist_prob)
        )

        final_prob = from_log_odds(combined_log_odds)

        return {
            'poll_probability': round(poll_prob, 4),
            'fundamentals_probability': round(fund_prob, 4),
            'historical_probability': round(hist_prob, 4),
            'combined_probability': round(final_prob, 4),
            'confidence_interval_low': round(
                max(0.01, final_prob - 0.08), 4
            ),
            'confidence_interval_high': round(
                min(0.99, final_prob + 0.08), 4
            ),
        }

    def evaluate_trade(
        self, market_price: float, transaction_cost: float = 0.02
    ) -> Dict[str, float]:
        """
        Determine if a trade is warranted given the market price.
        """
        estimate = self.estimate_probability()
        q = estimate['combined_probability']
        p = market_price

        edge = q - p
        edge_after_costs = abs(edge) - transaction_cost

        if edge > 0:
            direction = 'BUY_YES'
            kelly_fraction = (q - p) / (1 - p)
        elif edge < 0:
            direction = 'BUY_NO'
            kelly_fraction = (p - q) / p
        else:
            direction = 'NO_TRADE'
            kelly_fraction = 0.0

        half_kelly = kelly_fraction / 2.0

        return {
            'estimated_prob': q,
            'market_price': p,
            'raw_edge': round(edge, 4),
            'edge_after_costs': round(edge_after_costs, 4),
            'direction': direction,
            'kelly_fraction': round(kelly_fraction, 4),
            'half_kelly': round(half_kelly, 4),
            'trade_recommended': edge_after_costs > 0,
        }
```
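
A usage sketch of the model; every input below is invented for illustration, not real polling or economic data:

```python
model = FundamentalProbabilityModel()
model.add_poll(PollData(
    pollster="Example Polling", date="2024-10-01", sample_size=1200,
    candidate_a_pct=49.0, candidate_b_pct=46.0,
    pollster_rating=0.85, methodology="live",
))
model.set_fundamentals(FundamentalFactors(
    gdp_growth_pct=2.1, unemployment_rate=4.2,
    incumbent_approval=47.0, incumbent_party_terms=1,
    primary_vote_share=55.0,
))
model.set_historical_base_rate(0.55)

print(model.estimate_probability())
# Compare the model's estimate to a hypothetical market price of 45 cents
print(model.evaluate_trade(market_price=0.45))
```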

14.2.4 Fundamental Strategy Trade Management

Once a fundamental model identifies an edge, trade management follows these principles:

  1. Entry. Enter when edge exceeds transaction costs by a meaningful margin (typically 3-5 cents on a $1 contract).

  2. Position sizing. Use fractional Kelly based on confidence in the estimate. Higher model confidence (a narrow confidence interval) justifies a larger fraction; lower confidence demands smaller positions (a sizing sketch follows this list).

  3. Updating. As new data arrives (new polls, economic releases), re-run the model and adjust positions. If the edge shrinks below transaction costs, exit. If it widens, consider adding.

  4. Exit. Exit when (a) the edge disappears, (b) the event resolves, or (c) you lose confidence in the model. Do not hold losing positions out of stubbornness --- the model should drive decisions, not ego.
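
One way to implement point 2 is to shrink the Kelly fraction as the model's confidence interval widens. The linear mapping below, capped at a 0.30-wide interval, is a heuristic assumption, not a canonical rule:

```python
def confidence_scaled_fraction(
    ci_low: float,
    ci_high: float,
    min_fraction: float = 0.10,
    max_fraction: float = 0.50,
) -> float:
    """Map confidence-interval width to a Kelly fraction: a very
    narrow interval earns max_fraction; a 0.30-wide (or wider)
    interval is cut to min_fraction."""
    width = max(0.0, ci_high - ci_low)
    t = min(1.0, width / 0.30)  # 0 = high confidence, 1 = low
    return max_fraction - t * (max_fraction - min_fraction)

print(confidence_scaled_fraction(0.52, 0.58))  # narrow CI -> ~0.42
print(confidence_scaled_fraction(0.35, 0.65))  # wide CI   -> 0.10
```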

14.2.5 Strengths and Limitations

Strengths:

  • Can identify large edges that persist for weeks or months
  • Not dependent on market microstructure or speed
  • Scales well: the same model can evaluate many markets

Limitations:

  • Requires genuine domain expertise
  • Model may be systematically wrong (model risk)
  • Slow to react to sudden new information
  • Overconfidence in the model leads to oversized positions


14.3 Event-Driven Trading

Event-driven trading exploits identifiable catalysts --- specific events that will move the market. In prediction markets, these catalysts are often scheduled and known in advance, creating opportunities for both pre-event positioning and post-event reaction.

14.3.1 Identifying Catalysts

Catalysts for binary prediction markets include:

  • Scheduled announcements: debate performances, earnings for corporate markets, regulatory decisions
  • Data releases: polls, economic indicators, court rulings
  • Deadlines: filing deadlines, qualification dates, legislative votes
  • External events: endorsements, scandals, geopolitical developments

The key insight is that many catalysts are predictable in timing even if their content is not. A debate will happen on a known date. A jobs report will be released on the first Friday of the month. A court ruling will come before the end of the term.

14.3.2 Pre-Event Positioning

Pre-event positioning means establishing a position before the catalyst, based on an expectation of how the event will move the market.

The volatility play. Before a major event, implied volatility is high --- the market knows the price could move significantly in either direction. If you believe the event will be less impactful than the market expects, you can sell volatility (in practice, this means selling both YES and NO at prices that collectively exceed $1.00 if the spread allows it, or trading against extreme prices).
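
A quick check for the sell-both-sides variant: if the best YES bid plus the best NO bid exceeds $1.00 by more than fees, selling one contract of each locks in the difference regardless of outcome. The fee parameter below is an assumption; use your platform's actual fee schedule:

```python
def sell_volatility_edge(
    yes_bid: float, no_bid: float, fee_per_contract: float = 0.01
) -> float:
    """Locked-in profit per contract pair from selling both sides
    (negative if no riskless edge exists). Selling collects both
    bids; exactly one side pays out $1.00 at resolution."""
    return yes_bid + no_bid - 1.0 - 2 * fee_per_contract

print(sell_volatility_edge(0.56, 0.47))  # 0.01 locked in per pair
```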

The directional play. If you have a view on the event outcome, you can position before the event. This is riskier but offers greater reward if correct.

The "buy the rumor, sell the news" play. In prediction markets, this manifests when: 1. Anticipation of a positive catalyst drives the price up gradually 2. The actual catalyst, even if positive, is "priced in" 3. Post-event, the price stagnates or drops slightly as speculative holders exit

This pattern is less reliable in prediction markets than in equities because binary contracts resolve definitively. However, it can appear in markets with distant expiry dates.

14.3.3 Post-Event Reaction Trading

Post-event reaction trading is about speed and interpretation. When a catalyst occurs:

  1. Assess the outcome relative to expectations (not in absolute terms)
  2. Compare to current market price
  3. Act quickly if the market has not yet adjusted

The edge decays rapidly. In liquid prediction markets, large catalysts are priced in within minutes. The trader must either be very fast (seconds) or focus on interpretation edge --- understanding the implications of an event better than the market, even if slower.

14.3.4 Building an Event Calendar

A systematic event-driven approach requires maintaining a calendar of upcoming catalysts:

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Dict, Optional
from enum import Enum

class EventImpact(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"

class EventCategory(Enum):
    POLL_RELEASE = "poll_release"
    DEBATE = "debate"
    ECONOMIC_DATA = "economic_data"
    LEGAL_RULING = "legal_ruling"
    DEADLINE = "deadline"
    ENDORSEMENT = "endorsement"
    VOTE = "vote"
    OTHER = "other"

@dataclass
class CatalystEvent:
    """Represents a known upcoming catalyst."""
    name: str
    date: datetime
    category: EventCategory
    expected_impact: EventImpact
    affected_markets: List[str]
    description: str
    pre_event_strategy: Optional[str] = None
    post_event_strategy: Optional[str] = None
    notes: str = ""

@dataclass
class EventCalendar:
    """
    Tracks upcoming catalysts and generates alerts for
    event-driven trading opportunities.
    """
    events: List[CatalystEvent] = field(default_factory=list)

    def add_event(self, event: CatalystEvent):
        self.events.append(event)
        self.events.sort(key=lambda e: e.date)

    def get_upcoming(
        self, days_ahead: int = 7
    ) -> List[CatalystEvent]:
        """Get events occurring within the next N days."""
        now = datetime.now()
        cutoff = now + timedelta(days=days_ahead)
        return [
            e for e in self.events
            if now <= e.date <= cutoff
        ]

    def get_high_impact(self) -> List[CatalystEvent]:
        """Get all high and critical impact events."""
        return [
            e for e in self.events
            if e.expected_impact in (
                EventImpact.HIGH, EventImpact.CRITICAL
            )
        ]

    def get_events_for_market(
        self, market_id: str
    ) -> List[CatalystEvent]:
        """Get all events affecting a specific market."""
        return [
            e for e in self.events
            if market_id in e.affected_markets
        ]

    def generate_alerts(
        self, alert_hours_before: Optional[List[int]] = None
    ) -> List[Dict]:
        """
        Generate trading alerts for upcoming events.
        Default alerts at 24h, 4h, and 1h before event.
        """
        if alert_hours_before is None:
            alert_hours_before = [24, 4, 1]

        now = datetime.now()
        alerts = []

        for event in self.events:
            for hours in alert_hours_before:
                alert_time = event.date - timedelta(hours=hours)
                if now <= alert_time <= now + timedelta(hours=1):
                    alerts.append({
                        'event': event.name,
                        'hours_until': hours,
                        'impact': event.expected_impact.value,
                        'markets': event.affected_markets,
                        'strategy': event.pre_event_strategy,
                        'message': (
                            f"ALERT: '{event.name}' in {hours}h "
                            f"[{event.expected_impact.value}] - "
                            f"Affects: {', '.join(event.affected_markets)}"
                        ),
                    })

        return alerts

    def summary_report(self) -> str:
        """Generate a formatted summary of upcoming events."""
        upcoming = self.get_upcoming(days_ahead=14)
        if not upcoming:
            return "No events in the next 14 days."

        lines = ["=== Event Calendar (Next 14 Days) ===\n"]
        for event in upcoming:
            days_away = (event.date - datetime.now()).days
            lines.append(
                f"  [{event.expected_impact.value.upper():>8}] "
                f"{event.date.strftime('%Y-%m-%d %H:%M')} "
                f"({days_away}d) - {event.name}"
            )
            if event.pre_event_strategy:
                lines.append(
                    f"           Strategy: {event.pre_event_strategy}"
                )
        return "\n".join(lines)

14.3.5 Risk Management for Event-Driven Trades

Event-driven trades carry binary event risk on top of the binary contract risk. A debate performance can be interpreted in unexpected ways. A legal ruling can have nuances that defy simple expectations.

Key risk management principles:

  • Size positions for the worst case. If the event goes against you, how much do you lose? For a binary contract at 60 cents, buying YES risks 60 cents; buying NO risks 40 cents. Size accordingly (see the sketch after this list).
  • Use multiple uncorrelated events. Diversify across different types of catalysts.
  • Have a post-event exit plan. Decide in advance what outcomes warrant holding, adding, or exiting.
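
A minimal sketch of worst-case sizing: cap the position so that a total loss on one event stays within a fixed slice of the bankroll. The 2% risk budget is an assumption to tune:

```python
def max_contracts(
    bankroll: float,
    entry_price: float,
    side: str,
    risk_budget_pct: float = 0.02,
) -> int:
    """Largest position whose worst-case loss fits the risk budget.
    Buying YES at p risks p per contract; buying NO risks 1 - p."""
    worst_case = entry_price if side == 'BUY_YES' else 1.0 - entry_price
    return int(bankroll * risk_budget_pct / worst_case)

# $10,000 bankroll, contract trading at 60 cents
print(max_contracts(10_000, 0.60, 'BUY_YES'))  # 333 contracts (risk $200)
print(max_contracts(10_000, 0.60, 'BUY_NO'))   # 500 contracts (risk $200)
```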

14.4 Mean Reversion Strategies

Mean reversion in prediction markets exploits the tendency of prices to overreact to noise and then return to a fair value. Unlike in equity markets where mean reversion relies on a stable equilibrium value, in prediction markets the "mean" is the true probability --- which itself can change. This makes mean reversion both more nuanced and, when applied correctly, more powerful.

14.4.1 When Prices Overreact

Price overreaction in prediction markets occurs due to:

  • Thin liquidity. A single large order moves the price far beyond its informational content.
  • Emotional trading. Partisan traders overreact to news that confirms or threatens their beliefs.
  • Herding. Traders follow recent price movements, amplifying noise.
  • Misinterpretation. News is initially misinterpreted, and the correction takes time.

14.4.2 Detecting Overreaction: The Bollinger Band Analog

In traditional markets, Bollinger Bands use a moving average and standard deviation to define "normal" price ranges. We adapt this concept for prediction markets with important modifications:

The binary adjustment. A binary contract's variance depends on its price level. A contract at 50 cents has maximum variance; a contract at 5 cents or 95 cents has very low variance. We must normalize for this.

The theoretical standard deviation of a binary price at level $p$ over a time window with $n$ observations is:

$$\sigma_{\text{theoretical}} = \sqrt{\frac{p(1-p)}{n}}$$

We compare actual price movements to this theoretical baseline:

$$Z = \frac{p_t - \bar{p}_{t,k}}{\sigma_{\text{observed},k}}$$

where $\bar{p}_{t,k}$ is the $k$-period moving average and $\sigma_{\text{observed},k}$ is the observed standard deviation over the same window.

A mean-reversion signal fires when $|Z|$ exceeds a threshold (typically 2.0), but only if the move cannot be explained by genuine new information.

14.4.3 Distinguishing Noise from Information

This is the critical challenge. Not every large price move is an overreaction --- some reflect genuine new information that permanently shifts the true probability.

Heuristics for distinguishing noise from signal:

| Feature            | Likely Noise                   | Likely Information |
|--------------------|--------------------------------|--------------------|
| Speed of move      | Gradual drift                  | Sharp jump         |
| Volume             | Low/normal                     | High               |
| News catalyst      | None identifiable              | Clear catalyst     |
| Time of day        | Off-hours                      | Active hours       |
| Order flow         | Single large order             | Many orders        |
| Reversion behavior | Partial reversion within hours | No reversion       |

14.4.4 Mean Reversion Signal Generator

```python
import numpy as np
from dataclasses import dataclass
from typing import List, Tuple, Optional

@dataclass
class PricePoint:
    """A single price observation."""
    timestamp: float  # Unix timestamp
    price: float
    volume: float

@dataclass
class MeanReversionSignal:
    """A detected mean-reversion opportunity."""
    timestamp: float
    current_price: float
    fair_value_estimate: float
    z_score: float
    direction: str  # 'BUY' or 'SELL'
    strength: str  # 'weak', 'moderate', 'strong'
    confidence: float
    suggested_entry: float
    suggested_target: float
    suggested_stop: float

class BinaryMeanReversionDetector:
    """
    Detects mean-reversion opportunities in binary prediction markets.
    Uses an adapted Bollinger Band approach that accounts for the
    heteroscedastic nature of binary prices.
    """

    def __init__(
        self,
        lookback_period: int = 20,
        z_threshold: float = 2.0,
        volume_filter: bool = True,
        min_price: float = 0.05,
        max_price: float = 0.95,
    ):
        self.lookback_period = lookback_period
        self.z_threshold = z_threshold
        self.volume_filter = volume_filter
        self.min_price = min_price
        self.max_price = max_price

    def calculate_bands(
        self, prices: np.ndarray
    ) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
        """
        Calculate adaptive Bollinger Bands for binary prices.
        Adjusts band width for the price-dependent variance
        of binary contracts.
        """
        n = len(prices)
        k = self.lookback_period

        middle = np.full(n, np.nan)
        upper = np.full(n, np.nan)
        lower = np.full(n, np.nan)

        for i in range(k, n):
            window = prices[i - k:i]
            mu = np.mean(window)
            sigma = np.std(window, ddof=1)

            # Theoretical sigma for a binary at this price level
            theoretical_sigma = np.sqrt(mu * (1 - mu) / k)

            # Use the larger of observed and theoretical to avoid
            # overfitting to calm periods
            effective_sigma = max(sigma, theoretical_sigma)

            middle[i] = mu
            upper[i] = min(1.0, mu + self.z_threshold * effective_sigma)
            lower[i] = max(0.0, mu - self.z_threshold * effective_sigma)

        return lower, middle, upper

    def calculate_z_score(
        self, prices: np.ndarray
    ) -> np.ndarray:
        """Calculate rolling Z-score of price relative to moving average."""
        n = len(prices)
        k = self.lookback_period
        z_scores = np.full(n, np.nan)

        for i in range(k, n):
            window = prices[i - k:i]
            mu = np.mean(window)
            sigma = np.std(window, ddof=1)

            if sigma > 1e-8:
                z_scores[i] = (prices[i] - mu) / sigma
            else:
                z_scores[i] = 0.0

        return z_scores

    def detect_signals(
        self, price_history: List[PricePoint]
    ) -> List[MeanReversionSignal]:
        """
        Analyze price history and return mean-reversion signals.
        """
        if len(price_history) < self.lookback_period + 1:
            return []

        prices = np.array([p.price for p in price_history])
        volumes = np.array([p.volume for p in price_history])
        timestamps = np.array([p.timestamp for p in price_history])

        z_scores = self.calculate_z_score(prices)
        lower, middle, upper = self.calculate_bands(prices)

        signals = []

        for i in range(self.lookback_period, len(prices)):
            current_price = prices[i]

            # Skip prices too close to 0 or 1
            if current_price < self.min_price or current_price > self.max_price:
                continue

            z = z_scores[i]
            if abs(z) < self.z_threshold:
                continue

            # Volume filter: reject signals on high volume
            # (likely information-driven)
            if self.volume_filter:
                avg_volume = np.mean(volumes[max(0, i - self.lookback_period):i])
                if avg_volume > 0 and volumes[i] > 3.0 * avg_volume:
                    continue  # High volume suggests real information

            # Determine direction and strength
            if z > 0:
                direction = 'SELL'  # Price is above mean, expect reversion down
                target = middle[i]
                stop = min(1.0, current_price + 0.10)
            else:
                direction = 'BUY'  # Price is below mean, expect reversion up
                target = middle[i]
                stop = max(0.0, current_price - 0.10)

            # Signal strength
            if abs(z) > 3.0:
                strength = 'strong'
                confidence = 0.75
            elif abs(z) > 2.5:
                strength = 'moderate'
                confidence = 0.60
            else:
                strength = 'weak'
                confidence = 0.50

            signals.append(MeanReversionSignal(
                timestamp=timestamps[i],
                current_price=current_price,
                fair_value_estimate=middle[i],
                z_score=round(z, 3),
                direction=direction,
                strength=strength,
                confidence=confidence,
                suggested_entry=current_price,
                suggested_target=round(target, 4),
                suggested_stop=round(stop, 4),
            ))

        return signals

    def backtest_signals(
        self,
        price_history: List[PricePoint],
        holding_periods: Optional[List[int]] = None,
    ) -> dict:
        """
        Backtest mean-reversion signals on historical data.
        Returns performance statistics for each holding period.
        """
        if holding_periods is None:
            holding_periods = [1, 5, 10, 20]

        signals = self.detect_signals(price_history)
        prices = np.array([p.price for p in price_history])

        results = {}
        for hp in holding_periods:
            returns = []
            for signal in signals:
                # Find the index of the signal
                idx = next(
                    (i for i, p in enumerate(price_history)
                     if p.timestamp == signal.timestamp),
                    None,
                )
                if idx is None or idx + hp >= len(prices):
                    continue

                entry_price = prices[idx]
                exit_price = prices[idx + hp]

                if signal.direction == 'BUY':
                    ret = exit_price - entry_price
                else:
                    ret = entry_price - exit_price

                returns.append(ret)

            if returns:
                returns = np.array(returns)
                results[f'holding_{hp}'] = {
                    'num_trades': len(returns),
                    'mean_return': round(np.mean(returns), 4),
                    'median_return': round(np.median(returns), 4),
                    'win_rate': round(np.mean(returns > 0), 4),
                    'avg_win': round(
                        np.mean(returns[returns > 0]), 4
                    ) if np.any(returns > 0) else 0.0,
                    'avg_loss': round(
                        np.mean(returns[returns < 0]), 4
                    ) if np.any(returns < 0) else 0.0,
                    'sharpe': round(
                        np.mean(returns) / (np.std(returns) + 1e-8), 4
                    ),
                }
            else:
                results[f'holding_{hp}'] = {
                    'num_trades': 0,
                    'message': 'No trades for this holding period',
                }

        return results
```
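
A usage sketch on synthetic data; the clipped random walk below merely stands in for real price history:

```python
import numpy as np

rng = np.random.default_rng(42)
prices = np.clip(0.50 + np.cumsum(rng.normal(0, 0.01, 200)), 0.02, 0.98)
history = [
    PricePoint(timestamp=float(i), price=float(p), volume=100.0)
    for i, p in enumerate(prices)
]

detector = BinaryMeanReversionDetector(lookback_period=20, z_threshold=2.0)
signals = detector.detect_signals(history)
print(f"{len(signals)} signals detected")
print(detector.backtest_signals(history, holding_periods=[5, 10]))
```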

14.4.5 When Mean Reversion Fails

Mean reversion strategies can lead to significant losses when:

  1. Regime change. A fundamental shift occurs (e.g., a candidate drops out of a race). The price moves far from the old mean and never returns because the true probability has genuinely changed.

  2. Trending markets. In strongly trending markets --- driven by a sequence of information arrivals --- mean reversion repeatedly triggers but the price continues moving in the same direction.

  3. Convergence to 0 or 1. As a binary contract approaches expiry, the price naturally moves toward the extremes. A mean-reversion strategy that fights this tendency will lose.

Mitigation: Always check whether a price move coincides with new information. Use a filter that suppresses mean-reversion signals within hours of major news events or when time to expiry is short.
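
A sketch of that filter; the 6-hour news window and 48-hour expiry cutoff are assumptions to calibrate per market:

```python
def suppress_signal(
    signal_ts: float,
    news_timestamps: list,
    hours_to_expiry: float,
    news_window_hours: float = 6.0,
    min_hours_to_expiry: float = 48.0,
) -> bool:
    """Return True if a mean-reversion signal should be discarded:
    it fired too close to a known news event, or the contract is
    near expiry and naturally converging toward 0 or 1."""
    if hours_to_expiry < min_hours_to_expiry:
        return True
    return any(
        abs(signal_ts - nt) < news_window_hours * 3600.0
        for nt in news_timestamps
    )
```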


14.5 Closing-the-Gap Strategy

The closing-the-gap strategy exploits one of the most powerful features of binary contracts: the certainty of terminal convergence. Every binary contract must eventually settle at 0 or 1. As expiry approaches, mispriced contracts must converge to their true value, creating a mechanical edge.

14.5.1 The Convergence Principle

Consider a contract that will resolve YES. At any time before resolution, the price might be 0.70, 0.85, or 0.95 --- but at resolution, it will be exactly 1.00. The "gap" between the current price and 1.00 closes over time.

If you can identify contracts where the outcome is nearly certain before the market reflects this certainty, you can profit from the convergence. The edge is:

$$\text{Edge} = P(\text{outcome}) \times (1 - p) - (1 - P(\text{outcome})) \times p$$

where $P(\text{outcome})$ is the true probability and $p$ is the market price.

For a contract that is almost certainly going to resolve YES ($P \approx 0.98$) trading at 0.90:

$$\text{Edge} = 0.98 \times 0.10 - 0.02 \times 0.90 = 0.098 - 0.018 = 0.080$$

An 8-cent edge with near-certainty of the outcome is attractive --- but the key word is "near-certainty." The 2% chance of being wrong risks 90 cents per contract.

14.5.2 Identifying Stale Markets

The best closing-the-gap opportunities arise in stale markets --- markets where the outcome has become clear but the price has not fully adjusted. This happens because:

  • Low attention. The market has few active traders, and no one bothers to push the price to its correct value.
  • Transaction costs. The remaining edge is small enough that transaction costs make it unattractive for most traders, but it may still be profitable in aggregate across many markets.
  • Illiquidity. Even if traders want to trade, thin order books make it difficult to execute at fair prices.

14.5.3 The Mathematics of Convergence Edge

Let $T$ be the time to expiry, $p_t$ the current price, and $q$ the true probability of YES. The expected convergence rate depends on the information arrival process.

If we model information arrival as a continuous process, the expected price path follows:

$$E[p_{t+\Delta t}] = q + (p_t - q) \cdot e^{-\lambda \Delta t}$$

where $\lambda$ is the rate of information incorporation. As $T \to 0$, $p_t \to \{0, 1\}$.

The convergence edge as a function of time to expiry:

$$\text{Edge}(T) = |q - p_t| \cdot (1 - e^{-\lambda T})$$

This formula gives the expected convergence over the remaining horizon. A gap trade is most attractive when:

  1. The mispricing $|q - p_t|$ is large
  2. The information rate $\lambda$ is high, so convergence happens quickly
  3. Time to expiry $T$ is short, so capital is committed only briefly

Note that the $(1 - e^{-\lambda T})$ term grows with $T$, but settlement guarantees full convergence at expiry regardless; what a short $T$ buys is faster capture of the same edge $|q - p_t|$, and therefore a higher return per day.
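
A direct translation of the convergence model. The information rate $\lambda$ must be estimated from how quickly comparable markets have absorbed information historically; the value below is illustrative:

```python
import numpy as np

def expected_price(p_t: float, q: float, lam: float, dt: float) -> float:
    """E[p after dt] = q + (p_t - q) * exp(-lam * dt)."""
    return q + (p_t - q) * np.exp(-lam * dt)

def convergence_edge(p_t: float, q: float, lam: float, T: float) -> float:
    """Portion of the mispricing expected to close before expiry."""
    return abs(q - p_t) * (1 - np.exp(-lam * T))

# Stale contract at 0.90, true probability 0.98, lam = 0.5/day, 5 days left
print(expected_price(0.90, 0.98, lam=0.5, dt=5.0))   # ~0.973
print(convergence_edge(0.90, 0.98, lam=0.5, T=5.0))  # ~0.073
```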

14.5.4 Practical Implementation

```python
import numpy as np
from dataclasses import dataclass
from typing import List, Dict, Optional

@dataclass
class BinaryMarket:
    """Represents a binary prediction market contract."""
    market_id: str
    question: str
    current_price: float
    time_to_expiry_days: float
    volume_24h: float
    num_traders: int
    last_trade_hours_ago: float
    category: str

@dataclass
class GapOpportunity:
    """A detected closing-the-gap opportunity."""
    market: BinaryMarket
    estimated_true_prob: float
    edge: float
    edge_per_day: float
    direction: str  # 'BUY_YES' or 'BUY_NO'
    risk: float
    reward: float
    risk_reward_ratio: float
    staleness_score: float
    confidence: str

class ClosingTheGapAnalyzer:
    """
    Identifies and evaluates closing-the-gap opportunities
    in binary prediction markets.
    """

    def __init__(
        self,
        min_edge: float = 0.03,
        max_time_to_expiry: float = 14.0,
        min_confidence_threshold: float = 0.90,
        transaction_cost: float = 0.02,
    ):
        self.min_edge = min_edge
        self.max_time_to_expiry = max_time_to_expiry
        self.min_confidence_threshold = min_confidence_threshold
        self.transaction_cost = transaction_cost

    def calculate_staleness_score(
        self, market: BinaryMarket
    ) -> float:
        """
        Score how 'stale' a market is (0-1, higher = more stale).
        Stale markets are more likely to have convergence opportunities.
        """
        # Factor 1: Time since last trade (normalized)
        recency_score = min(1.0, market.last_trade_hours_ago / 48.0)

        # Factor 2: Low trader count
        trader_score = max(0.0, 1.0 - market.num_traders / 100.0)

        # Factor 3: Low volume relative to time remaining
        expected_volume = max(1.0, market.time_to_expiry_days * 50.0)
        volume_score = max(
            0.0, 1.0 - market.volume_24h / expected_volume
        )

        # Weighted combination
        staleness = (
            0.40 * recency_score
            + 0.30 * trader_score
            + 0.30 * volume_score
        )

        return round(staleness, 3)

    def estimate_true_probability(
        self,
        market: BinaryMarket,
        external_estimate: Optional[float] = None,
    ) -> float:
        """
        Estimate the true probability of the outcome.
        If an external estimate is provided, use it.
        Otherwise, use a simple heuristic based on price extremity.
        """
        if external_estimate is not None:
            return external_estimate

        # Heuristic: for stale markets near expiry,
        # extreme prices are likely correct
        p = market.current_price
        staleness = self.calculate_staleness_score(market)

        # If market is stale and price is extreme, true prob is
        # even more extreme
        if p > 0.80:
            # Likely YES outcome; adjust upward
            adjustment = staleness * 0.05
            return min(0.99, p + adjustment)
        elif p < 0.20:
            # Likely NO outcome; adjust downward
            adjustment = staleness * 0.05
            return max(0.01, p - adjustment)
        else:
            # Near 50-50; can't determine direction from price alone
            return p

    def analyze_market(
        self,
        market: BinaryMarket,
        external_estimate: Optional[float] = None,
    ) -> GapOpportunity:
        """
        Analyze a single market for closing-the-gap opportunity.
        """
        true_prob = self.estimate_true_probability(
            market, external_estimate
        )
        p = market.current_price

        # Determine direction
        if true_prob > p:
            direction = 'BUY_YES'
            edge = true_prob - p - self.transaction_cost
            risk = p  # Maximum loss if outcome is NO
            reward = 1.0 - p  # Maximum gain if outcome is YES
        else:
            direction = 'BUY_NO'
            edge = p - true_prob - self.transaction_cost
            risk = 1.0 - p
            reward = p

        edge_per_day = edge / max(0.1, market.time_to_expiry_days)
        risk_reward = reward / risk if risk > 0 else float('inf')
        staleness = self.calculate_staleness_score(market)

        # Confidence classification
        if true_prob > 0.95 or true_prob < 0.05:
            confidence = 'high'
        elif true_prob > 0.85 or true_prob < 0.15:
            confidence = 'medium'
        else:
            confidence = 'low'

        return GapOpportunity(
            market=market,
            estimated_true_prob=round(true_prob, 4),
            edge=round(edge, 4),
            edge_per_day=round(edge_per_day, 4),
            direction=direction,
            risk=round(risk, 4),
            reward=round(reward, 4),
            risk_reward_ratio=round(risk_reward, 2),
            staleness_score=staleness,
            confidence=confidence,
        )

    def scan_markets(
        self,
        markets: List[BinaryMarket],
        external_estimates: Optional[Dict[str, float]] = None,
    ) -> List[GapOpportunity]:
        """
        Scan multiple markets and return ranked opportunities.
        """
        if external_estimates is None:
            external_estimates = {}

        opportunities = []

        for market in markets:
            if market.time_to_expiry_days > self.max_time_to_expiry:
                continue

            ext_est = external_estimates.get(market.market_id)
            opp = self.analyze_market(market, ext_est)

            if opp.edge >= self.min_edge:
                opportunities.append(opp)

        # Rank by edge per day (annualized return equivalent)
        opportunities.sort(
            key=lambda o: o.edge_per_day, reverse=True
        )

        return opportunities

    def portfolio_summary(
        self, opportunities: List[GapOpportunity]
    ) -> Dict:
        """
        Generate portfolio-level summary statistics.
        """
        if not opportunities:
            return {'message': 'No opportunities found'}

        edges = [o.edge for o in opportunities]
        risks = [o.risk for o in opportunities]

        return {
            'num_opportunities': len(opportunities),
            'avg_edge': round(np.mean(edges), 4),
            'total_edge': round(np.sum(edges), 4),
            'avg_risk': round(np.mean(risks), 4),
            'total_risk': round(np.sum(risks), 4),
            'best_opportunity': opportunities[0].market.market_id,
            'best_edge_per_day': opportunities[0].edge_per_day,
            'direction_split': {
                'BUY_YES': sum(
                    1 for o in opportunities if o.direction == 'BUY_YES'
                ),
                'BUY_NO': sum(
                    1 for o in opportunities if o.direction == 'BUY_NO'
                ),
            },
            'confidence_split': {
                'high': sum(
                    1 for o in opportunities if o.confidence == 'high'
                ),
                'medium': sum(
                    1 for o in opportunities if o.confidence == 'medium'
                ),
                'low': sum(
                    1 for o in opportunities if o.confidence == 'low'
                ),
            },
        }
```
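
A usage sketch with a single invented market and an external probability estimate (for example, from the fundamental model of Section 14.2):

```python
market = BinaryMarket(
    market_id="SENATE-XY-2024",
    question="Will the incumbent win the XY Senate seat?",
    current_price=0.88,
    time_to_expiry_days=6.0,
    volume_24h=40.0,
    num_traders=15,
    last_trade_hours_ago=30.0,
    category="politics",
)

analyzer = ClosingTheGapAnalyzer(min_edge=0.03)
opportunities = analyzer.scan_markets(
    [market], external_estimates={"SENATE-XY-2024": 0.97}
)
for opp in opportunities:
    print(opp.direction, opp.edge, opp.edge_per_day, opp.confidence)
print(analyzer.portfolio_summary(opportunities))
```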

14.5.5 Risks and Edge Cases

The closing-the-gap strategy appears nearly risk-free when the outcome is "obvious," but several risks exist:

  1. Black swan events. A candidate could drop out of a race on the last day. A court could issue a surprise ruling. Even at 95% confidence, 1 in 20 trades will lose.

  2. Resolution disputes. The market may resolve in an unexpected way due to ambiguous rules. What counts as "winning" may be disputed.

  3. Platform risk. The exchange could fail, freeze, or become insolvent before payout.

  4. Opportunity cost. Capital locked in a 3-cent edge trade for 10 days could potentially earn more elsewhere.

  5. Aggregation risk. Running the strategy across 100 markets means the occasional large loss from a surprise outcome can consume many small wins. If each winning trade earns 3 cents and each loss costs 90 cents, you need 30 wins for every loss just to break even.


14.6 Momentum and Trend Following

Momentum strategies in prediction markets capitalize on the tendency of price movements to persist over short to medium time horizons. While the efficient market hypothesis suggests that price changes should be unpredictable, behavioral factors and information cascades create exploitable trends.

14.6.1 Why Momentum Exists in Prediction Markets

Several mechanisms generate momentum in binary markets:

Slow information diffusion. Not all traders see information at the same time. A news article about a candidate might first move the price by 3 cents as early readers trade, then by another 5 cents as the story spreads over hours.

Confirmation cascades. When traders see the price moving in a direction, they interpret it as a signal that informed traders know something. This leads to further buying, reinforcing the trend.

Anchor adjustment. Traders anchor to recent prices and adjust insufficiently. A move from 0.50 to 0.55 might be the correct response to new information that warrants 0.65, but traders only partially adjust.

Position building. Large traders cannot execute their full position at once. They build over time, creating sustained directional pressure.

14.6.2 Momentum Signals for Binary Markets

Traditional momentum indicators need adaptation for binary markets:

Rate of Change (ROC):

$$\text{ROC}_k = \frac{p_t - p_{t-k}}{p_{t-k}}$$

However, for binary markets, a raw price change is more informative:

$$\Delta p_k = p_t - p_{t-k}$$

since the bounded nature of binary prices makes percentage changes misleading near the extremes.

Exponential Moving Average Crossover:

$$\text{EMA}_{\text{fast}} > \text{EMA}_{\text{slow}} \implies \text{bullish momentum}$$

Volume-Weighted Momentum:

$$\text{VW\_Mom} = \frac{\sum_{i=t-k}^{t} V_i \cdot \Delta p_i}{\sum_{i=t-k}^{t} V_i}$$

This gives more weight to price moves accompanied by high volume, filtering out noise.

14.6.3 Python Momentum Indicator

```python
import numpy as np
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class MomentumSignal:
    timestamp: float
    price: float
    momentum_score: float  # -1 to +1
    trend_direction: str  # 'UP', 'DOWN', 'NEUTRAL'
    trend_strength: str  # 'weak', 'moderate', 'strong'
    suggested_action: str

class BinaryMomentumIndicator:
    """
    Momentum indicator adapted for binary prediction markets.
    Combines price momentum, volume confirmation, and
    trend persistence into a single composite score.
    """

    def __init__(
        self,
        fast_period: int = 5,
        slow_period: int = 20,
        volume_lookback: int = 10,
        momentum_threshold: float = 0.3,
    ):
        self.fast_period = fast_period
        self.slow_period = slow_period
        self.volume_lookback = volume_lookback
        self.momentum_threshold = momentum_threshold

    def _ema(self, data: np.ndarray, period: int) -> np.ndarray:
        """Calculate exponential moving average."""
        alpha = 2.0 / (period + 1)
        ema = np.zeros_like(data)
        ema[0] = data[0]
        for i in range(1, len(data)):
            ema[i] = alpha * data[i] + (1 - alpha) * ema[i - 1]
        return ema

    def _price_momentum(self, prices: np.ndarray) -> np.ndarray:
        """
        Calculate price momentum as EMA crossover normalized
        to [-1, 1] range.
        """
        fast_ema = self._ema(prices, self.fast_period)
        slow_ema = self._ema(prices, self.slow_period)

        raw_momentum = fast_ema - slow_ema

        # Normalize by price-dependent scale
        scale = np.sqrt(prices * (1 - prices) + 0.01)
        normalized = raw_momentum / scale

        return np.clip(normalized * 5.0, -1.0, 1.0)

    def _volume_confirmation(
        self, prices: np.ndarray, volumes: np.ndarray
    ) -> np.ndarray:
        """
        Calculate volume-weighted directional signal.
        Positive when up moves have higher volume than down moves.
        """
        n = len(prices)
        confirmation = np.zeros(n)

        for i in range(1, n):
            lookback_start = max(0, i - self.volume_lookback)
            window_prices = prices[lookback_start:i + 1]
            window_volumes = volumes[lookback_start:i + 1]

            if len(window_prices) < 2:
                continue

            changes = np.diff(window_prices)
            vols = window_volumes[1:]

            up_volume = np.sum(vols[changes > 0]) if np.any(changes > 0) else 0
            down_volume = np.sum(vols[changes < 0]) if np.any(changes < 0) else 0
            total_volume = up_volume + down_volume

            if total_volume > 0:
                confirmation[i] = (up_volume - down_volume) / total_volume

        return confirmation

    def _trend_persistence(self, prices: np.ndarray) -> np.ndarray:
        """
        Measure how consistently the price has been moving
        in one direction (Hurst exponent proxy).
        """
        n = len(prices)
        persistence = np.zeros(n)
        lookback = self.slow_period

        for i in range(lookback, n):
            window = prices[i - lookback:i + 1]
            changes = np.diff(window)
            if len(changes) == 0:
                continue

            # Count consecutive same-direction moves
            same_direction = 0
            for j in range(1, len(changes)):
                if changes[j] * changes[j - 1] > 0:
                    same_direction += 1

            persistence[i] = same_direction / max(1, len(changes) - 1)
            # Map [0, 1] to [-1, 1]: 0.5 = random walk
            persistence[i] = 2.0 * (persistence[i] - 0.5)

        return persistence

    def calculate_signals(
        self,
        prices: np.ndarray,
        volumes: np.ndarray,
        timestamps: np.ndarray,
    ) -> List[MomentumSignal]:
        """
        Calculate composite momentum signals.
        """
        price_mom = self._price_momentum(prices)
        vol_confirm = self._volume_confirmation(prices, volumes)
        trend_pers = self._trend_persistence(prices)

        # Composite score: weighted average
        composite = (
            0.50 * price_mom
            + 0.30 * vol_confirm
            + 0.20 * trend_pers
        )

        signals = []
        warmup = self.slow_period + 1

        for i in range(warmup, len(prices)):
            score = composite[i]

            # Determine direction and strength
            if score > self.momentum_threshold:
                direction = 'UP'
                if score > 0.7:
                    strength = 'strong'
                elif score > 0.5:
                    strength = 'moderate'
                else:
                    strength = 'weak'
                action = 'BUY_YES'
            elif score < -self.momentum_threshold:
                direction = 'DOWN'
                if score < -0.7:
                    strength = 'strong'
                elif score < -0.5:
                    strength = 'moderate'
                else:
                    strength = 'weak'
                action = 'BUY_NO'
            else:
                direction = 'NEUTRAL'
                strength = 'weak'
                action = 'HOLD'

            signals.append(MomentumSignal(
                timestamp=timestamps[i],
                price=prices[i],
                momentum_score=round(score, 4),
                trend_direction=direction,
                trend_strength=strength,
                suggested_action=action,
            ))

        return signals
```
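
A usage sketch on synthetic data with a gentle uptrend plus noise; real inputs would be price, volume, and timestamp arrays from the exchange API:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 120
trend = np.linspace(0.40, 0.60, n)  # gentle uptrend
prices = np.clip(trend + rng.normal(0, 0.01, n), 0.02, 0.98)
volumes = rng.uniform(50, 150, n)
timestamps = np.arange(n, dtype=float)

indicator = BinaryMomentumIndicator(fast_period=5, slow_period=20)
signals = indicator.calculate_signals(prices, volumes, timestamps)
latest = signals[-1]
print(latest.trend_direction, latest.trend_strength, latest.suggested_action)
```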

14.6.4 When Momentum Fails

Momentum strategies are vulnerable to:

  • Reversals at extremes. As a binary price approaches 0 or 1, the payoff asymmetry changes. Momentum may push a price to 0.95, but the remaining 5 cents of upside is dwarfed by the 95 cents of downside risk.
  • Low liquidity traps. Apparent momentum may be a single trader building a position. When they stop, the price stalls or reverses.
  • Sudden information. A news event that contradicts the trend causes a sharp reversal.
  • Overfitting. Momentum parameters that worked in the past may not work in the future.

Best practice: Use momentum as a confirming signal rather than a standalone strategy. Combine with fundamental analysis to avoid riding momentum off a cliff.


14.7 Contrarian Strategies

Contrarian strategies take the opposite side of crowd behavior, betting that the consensus is wrong or has overreacted. In prediction markets, contrarian approaches can be highly profitable because the participant base is often biased by partisan beliefs, emotional reactions, or herding behavior.

14.7.1 The Behavioral Basis

Several well-documented biases create contrarian opportunities:

Overconfidence. Traders overestimate the precision of their own information. When positive news arrives, they push the price too high. When negative news arrives, they push it too low.

Herding. Traders follow the crowd, amplifying moves beyond what information warrants. In prediction markets, this is especially pronounced around salient events (debates, viral moments).

Recency bias. Recent events receive disproportionate weight. A strong debate performance might cause the price to move 15 cents when the historical impact of debates on election outcomes warrants only a 3-5 cent move.

Partisan trading. In political markets, participants often trade their hopes rather than their beliefs. This creates persistent biases that contrarians can exploit.

14.7.2 Contrarian Signal Detection

A contrarian signal fires when the market shows signs of overreaction. Key indicators:

Sentiment extremes. When social media sentiment, comment volume, or search trends reach extreme levels, the market is likely overshooting.

Price velocity. Rapid price moves that exceed historical norms for similar events suggest overreaction.

Volume spikes without proportionate news. High trading volume driven by speculative interest rather than new information.

Distance from fundamental value. When the market price diverges significantly from model-based estimates.

14.7.3 Python Contrarian Signal Detector

```python
import numpy as np
from dataclasses import dataclass
from typing import List, Dict, Optional

@dataclass
class ContrarianSignal:
    timestamp: float
    current_price: float
    signal_type: str  # 'overreaction_up', 'overreaction_down'
    overreaction_score: float  # 0-1, higher = more extreme
    suggested_position: str  # 'FADE_UP' (sell/buy NO), 'FADE_DOWN' (buy YES)
    expected_reversion: float
    time_horizon_periods: int
    risk_level: str

class ContrarianDetector:
    """
    Detects contrarian trading opportunities by identifying
    crowd overreaction in binary prediction markets.
    """

    def __init__(
        self,
        price_velocity_window: int = 5,
        volume_spike_threshold: float = 2.5,
        overreaction_threshold: float = 0.6,
        historical_move_percentile: float = 90.0,
    ):
        self.price_velocity_window = price_velocity_window
        self.volume_spike_threshold = volume_spike_threshold
        self.overreaction_threshold = overreaction_threshold
        self.historical_move_percentile = historical_move_percentile

    def _calculate_price_velocity(
        self, prices: np.ndarray
    ) -> np.ndarray:
        """
        Calculate rate of price change over rolling window.
        Normalized by price-level variance.
        """
        n = len(prices)
        velocity = np.zeros(n)

        for i in range(self.price_velocity_window, n):
            start = i - self.price_velocity_window
            change = prices[i] - prices[start]
            p_avg = np.mean(prices[start:i + 1])
            theoretical_std = np.sqrt(p_avg * (1 - p_avg))
            if theoretical_std > 0.01:
                velocity[i] = change / theoretical_std
            else:
                velocity[i] = 0.0

        return velocity

    def _detect_volume_spike(
        self, volumes: np.ndarray, lookback: int = 20
    ) -> np.ndarray:
        """
        Detect volume spikes relative to recent average.
        Returns spike ratio (current volume / average volume).
        """
        n = len(volumes)
        spike_ratio = np.ones(n)

        for i in range(lookback, n):
            avg_vol = np.mean(volumes[i - lookback:i])
            if avg_vol > 0:
                spike_ratio[i] = volumes[i] / avg_vol

        return spike_ratio

    def _calculate_overreaction_score(
        self,
        velocity: float,
        volume_spike: float,
        historical_velocities: np.ndarray,
    ) -> float:
        """
        Compute composite overreaction score (0-1).
        Higher = more likely to be an overreaction.
        """
        # Percentile of current velocity among historical
        abs_velocity = abs(velocity)
        abs_historical = np.abs(
            historical_velocities[historical_velocities != 0]
        )

        if len(abs_historical) == 0:
            velocity_percentile = 0.5
        else:
            velocity_percentile = np.mean(abs_historical < abs_velocity)

        # Volume spike component
        volume_component = min(1.0, volume_spike / 5.0)

        # Combined score
        score = 0.60 * velocity_percentile + 0.40 * volume_component

        return round(score, 3)

    def detect_signals(
        self,
        prices: np.ndarray,
        volumes: np.ndarray,
        timestamps: np.ndarray,
        news_events: Optional[List[float]] = None,
    ) -> List[ContrarianSignal]:
        """
        Scan price and volume data for contrarian opportunities.

        Args:
            prices: Array of prices
            volumes: Array of volumes
            timestamps: Array of timestamps
            news_events: List of timestamps when news occurred
                (signals near these are filtered out)
        """
        velocity = self._calculate_price_velocity(prices)
        volume_spike = self._detect_volume_spike(volumes)

        signals = []
        warmup = max(self.price_velocity_window, 20) + 1

        for i in range(warmup, len(prices)):
            # Skip if too close to 0 or 1
            if prices[i] < 0.08 or prices[i] > 0.92:
                continue

            v = velocity[i]
            vs = volume_spike[i]

            # Calculate overreaction score
            historical_v = velocity[warmup:i]
            score = self._calculate_overreaction_score(
                v, vs, historical_v
            )

            if score < self.overreaction_threshold:
                continue

            # Filter out news-driven moves if news timestamps provided
            if news_events:
                time_since_news = min(
                    abs(timestamps[i] - nt) for nt in news_events
                )
                # Skip moves within 2 hours (7200 s) of a news event:
                # they are likely information, not overreaction
                if time_since_news < 7200:
                    continue

            # Determine signal direction; expected reversion is half
            # the gap back to the trailing 20-period mean
            recent_mean = np.mean(prices[max(0, i - 20):i])
            expected_reversion = -0.5 * (prices[i] - recent_mean)

            if v > 0:
                signal_type = 'overreaction_up'
                position = 'FADE_UP'
            else:
                signal_type = 'overreaction_down'
                position = 'FADE_DOWN'

            # Risk classification
            if score > 0.85:
                risk = 'low'  # Very likely overreaction
                horizon = 5
            elif score > 0.70:
                risk = 'medium'
                horizon = 10
            else:
                risk = 'high'
                horizon = 20

            signals.append(ContrarianSignal(
                timestamp=timestamps[i],
                current_price=prices[i],
                signal_type=signal_type,
                overreaction_score=score,
                suggested_position=position,
                expected_reversion=round(expected_reversion, 4),
                time_horizon_periods=horizon,
                risk_level=risk,
            ))

        return signals

    def backtest_contrarian(
        self,
        prices: np.ndarray,
        volumes: np.ndarray,
        timestamps: np.ndarray,
        holding_period: int = 10,
    ) -> Dict:
        """
        Backtest contrarian signals and return performance stats.
        """
        signals = self.detect_signals(prices, volumes, timestamps)

        if not signals:
            return {'message': 'No signals detected', 'num_signals': 0}

        returns = []
        for signal in signals:
            idx = np.argmin(np.abs(timestamps - signal.timestamp))
            if idx + holding_period >= len(prices):
                continue

            entry = prices[idx]
            exit_price = prices[idx + holding_period]

            if signal.suggested_position == 'FADE_UP':
                ret = entry - exit_price  # Profit if price drops
            else:
                ret = exit_price - entry  # Profit if price rises

            returns.append(ret)

        returns = np.array(returns)

        return {
            'num_signals': len(signals),
            'num_trades': len(returns),
            'mean_return': round(np.mean(returns), 4) if len(returns) > 0 else 0,
            'win_rate': round(np.mean(returns > 0), 4) if len(returns) > 0 else 0,
            'avg_win': round(
                np.mean(returns[returns > 0]), 4
            ) if np.any(returns > 0) else 0,
            'avg_loss': round(
                np.mean(returns[returns < 0]), 4
            ) if np.any(returns < 0) else 0,
            'total_return': round(np.sum(returns), 4) if len(returns) > 0 else 0,
            'sharpe_ratio': round(
                np.mean(returns) / (np.std(returns) + 1e-8), 4
            ) if len(returns) > 0 else 0,
        }
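
A quick usage sketch on synthetic random-walk data; since the data contain no real overreactions, this demonstrates only the interface, not an edge:

rng = np.random.default_rng(42)
n = 500
# Random-walk prices clipped to the tradeable band
prices = np.clip(0.5 + np.cumsum(rng.normal(0, 0.01, n)), 0.05, 0.95)
volumes = rng.lognormal(mean=3.0, sigma=0.8, size=n)
timestamps = np.arange(n) * 3600.0  # hourly bars, in seconds

detector = ContrarianDetector()
stats = detector.backtest_contrarian(prices, volumes, timestamps)
print(stats)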

14.7.4 Risk Management for Contrarian Trades

Contrarian trading is inherently uncomfortable --- you are fighting the crowd, and the crowd is sometimes right. Key risk management rules:

  1. Size small. Never risk more than 2-3% of capital on a single contrarian trade. The crowd may be right, and you want to survive being wrong.

  2. Require multiple confirming indicators. Do not fade a move based on velocity alone. Look for volume spikes, sentiment extremes, and divergence from fundamentals.

  3. Set time limits. If the expected reversion does not occur within your time horizon, exit. The market may have been right all along.

  4. Avoid fading informed moves. If a price move is accompanied by credible new information, it is not an overreaction --- it is information incorporation. The contrarian who fights information loses.


14.8 News and Sentiment Trading

News and sentiment trading in prediction markets is about speed and interpretation. When new information arrives, the trader who incorporates it into a probability estimate fastest has an edge.

14.8.1 The Speed Advantage

In prediction markets, the speed advantage hierarchy is:

  1. First to know. Being the first to see breaking news (rare, and borders on insider trading if the information is non-public).
  2. First to interpret. Seeing the same news as everyone else but understanding its implications faster.
  3. First to act. Understanding the implications simultaneously but executing the trade faster.

For most traders, level 2 is the most accessible. You may not see the poll result before anyone else, but you can have a framework for interpreting poll results that allows you to trade within seconds of release.

14.8.2 Sentiment Scoring

Systematic sentiment analysis can provide an edge, especially when aggregating many small signals:

$$\text{Sentiment Score} = \frac{N_{\text{positive}} - N_{\text{negative}}}{N_{\text{positive}} + N_{\text{negative}} + N_{\text{neutral}}}$$

More sophisticated approaches weight by source credibility, recency, and relevance:

$$S = \frac{\sum_i w_i \cdot s_i}{\sum_i w_i}$$

where $s_i \in [-1, 1]$ is the sentiment of source $i$ and $w_i$ is its weight.
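
Both scores are easy to compute; here is a minimal sketch with illustrative inputs:

def simple_sentiment(n_pos: int, n_neg: int, n_neu: int) -> float:
    """Net sentiment over all classified items, in [-1, 1]."""
    total = n_pos + n_neg + n_neu
    return (n_pos - n_neg) / total if total else 0.0

def weighted_sentiment(scores, weights) -> float:
    """Credibility/recency-weighted mean of per-source sentiments."""
    total_w = sum(weights)
    if total_w == 0:
        return 0.0
    return sum(w * s for w, s in zip(weights, scores)) / total_w

print(simple_sentiment(6, 2, 4))                     # 0.333...
print(weighted_sentiment([0.8, -0.2], [2.0, 1.0]))   # 0.466...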

14.8.3 Python News Monitor Skeleton

import hashlib
from dataclasses import dataclass
from typing import List, Dict, Callable, Optional
from datetime import datetime
from enum import Enum

class Sentiment(Enum):
    VERY_NEGATIVE = -2
    NEGATIVE = -1
    NEUTRAL = 0
    POSITIVE = 1
    VERY_POSITIVE = 2

@dataclass
class NewsItem:
    """A single news item with metadata."""
    headline: str
    source: str
    timestamp: datetime
    url: str
    sentiment: Sentiment
    relevance_score: float  # 0-1
    affected_markets: List[str]
    content_hash: str = ""

    def __post_init__(self):
        if not self.content_hash:
            self.content_hash = hashlib.md5(
                self.headline.encode()
            ).hexdigest()

@dataclass
class SentimentAggregate:
    """Aggregated sentiment for a market."""
    market_id: str
    score: float  # -1 to +1
    num_sources: int
    time_window_hours: float
    dominant_sentiment: str
    confidence: float

class NewsMonitor:
    """
    Skeleton for a real-time news monitoring system for
    prediction markets. In production, you would connect
    this to actual news APIs (e.g., NewsAPI, Twitter/X API,
    RSS feeds).
    """

    def __init__(self):
        self.news_items: List[NewsItem] = []
        self.seen_hashes: set = set()
        self.callbacks: List[Callable] = []
        self.keyword_market_map: Dict[str, List[str]] = {}

    def register_market_keywords(
        self, market_id: str, keywords: List[str]
    ):
        """
        Map keywords to markets so incoming news can be
        automatically linked to affected markets.
        """
        for keyword in keywords:
            kw_lower = keyword.lower()
            if kw_lower not in self.keyword_market_map:
                self.keyword_market_map[kw_lower] = []
            self.keyword_market_map[kw_lower].append(market_id)

    def register_callback(self, callback: Callable):
        """Register a function to call when high-impact news arrives."""
        self.callbacks.append(callback)

    def _match_markets(self, headline: str) -> List[str]:
        """Find markets affected by a news headline."""
        headline_lower = headline.lower()
        affected = set()
        for keyword, markets in self.keyword_market_map.items():
            if keyword in headline_lower:
                affected.update(markets)
        return list(affected)

    def _estimate_sentiment(self, headline: str) -> Sentiment:
        """
        Simple rule-based sentiment estimation.
        In production, use a proper NLP model.
        """
        positive_words = [
            'wins', 'leads', 'surges', 'gains', 'rises',
            'strong', 'ahead', 'victory', 'endorsed', 'approved',
        ]
        negative_words = [
            'loses', 'trails', 'drops', 'falls', 'scandal',
            'weak', 'behind', 'defeat', 'rejected', 'denied',
        ]

        headline_lower = headline.lower()
        pos_count = sum(1 for w in positive_words if w in headline_lower)
        neg_count = sum(1 for w in negative_words if w in headline_lower)

        if pos_count > neg_count + 1:
            return Sentiment.VERY_POSITIVE
        elif pos_count > neg_count:
            return Sentiment.POSITIVE
        elif neg_count > pos_count + 1:
            return Sentiment.VERY_NEGATIVE
        elif neg_count > pos_count:
            return Sentiment.NEGATIVE
        else:
            return Sentiment.NEUTRAL

    def ingest_headline(
        self,
        headline: str,
        source: str,
        url: str = "",
        timestamp: Optional[datetime] = None,
    ) -> Optional[NewsItem]:
        """
        Process an incoming news headline.
        Returns the NewsItem if it's new, None if duplicate.
        """
        content_hash = hashlib.md5(headline.encode()).hexdigest()

        if content_hash in self.seen_hashes:
            return None

        self.seen_hashes.add(content_hash)

        if timestamp is None:
            timestamp = datetime.now()

        affected_markets = self._match_markets(headline)
        sentiment = self._estimate_sentiment(headline)

        # Relevance score based on keyword matches
        headline_lower = headline.lower()
        keyword_matches = sum(
            1 for kw in self.keyword_market_map
            if kw in headline_lower
        )
        relevance = min(1.0, keyword_matches / 3.0)

        news_item = NewsItem(
            headline=headline,
            source=source,
            timestamp=timestamp,
            url=url,
            sentiment=sentiment,
            relevance_score=relevance,
            affected_markets=affected_markets,
            content_hash=content_hash,
        )

        self.news_items.append(news_item)

        # Trigger callbacks for high-relevance news
        if relevance > 0.5 and sentiment != Sentiment.NEUTRAL:
            for callback in self.callbacks:
                callback(news_item)

        return news_item

    def get_market_sentiment(
        self,
        market_id: str,
        hours: float = 24.0,
    ) -> SentimentAggregate:
        """
        Calculate aggregate sentiment for a specific market
        over the specified time window.
        """
        now = datetime.now()
        relevant_items = [
            item for item in self.news_items
            if market_id in item.affected_markets
            and (now - item.timestamp).total_seconds() < hours * 3600
        ]

        if not relevant_items:
            return SentimentAggregate(
                market_id=market_id,
                score=0.0,
                num_sources=0,
                time_window_hours=hours,
                dominant_sentiment='NEUTRAL',
                confidence=0.0,
            )

        # Weighted sentiment score
        total_weight = 0.0
        weighted_sentiment = 0.0

        for item in relevant_items:
            weight = item.relevance_score
            # Recency weighting
            age_hours = (now - item.timestamp).total_seconds() / 3600
            recency_weight = 1.0 / (1.0 + age_hours)
            weight *= recency_weight

            weighted_sentiment += weight * item.sentiment.value
            total_weight += weight

        score = weighted_sentiment / total_weight if total_weight > 0 else 0.0
        score = max(-1.0, min(1.0, score / 2.0))  # Normalize

        if score > 0.3:
            dominant = 'POSITIVE'
        elif score < -0.3:
            dominant = 'NEGATIVE'
        else:
            dominant = 'NEUTRAL'

        confidence = min(1.0, len(relevant_items) / 10.0)

        return SentimentAggregate(
            market_id=market_id,
            score=round(score, 3),
            num_sources=len(relevant_items),
            time_window_hours=hours,
            dominant_sentiment=dominant,
            confidence=round(confidence, 3),
        )

14.8.4 Practical Considerations

  • Latency matters. In liquid prediction markets, major news is priced in within 1-5 minutes. A trader who needs 10 minutes to assess the impact of a poll release will find no edge remaining.
  • False signals. Sentiment analysis produces many false positives. A headline "Candidate A DESTROYS Candidate B in debate" might be partisan opinion, not objective analysis.
  • Source quality. Not all news is equally informative. A pollster with a strong track record releasing a new survey is far more impactful than a blog post.
  • Saturation. When a story is everywhere, the market has already priced it in. The edge comes from information that is not yet widely disseminated.

14.9 Combining Multiple Strategies

No single strategy works in all market conditions. The most robust approach combines multiple strategies, using each in the conditions where it excels and resolving conflicts through a systematic framework.

14.9.1 Signal Aggregation

When multiple strategies generate signals for the same market, we need a method to combine them. A simple approach is weighted averaging:

$$S_{\text{combined}} = \sum_{i=1}^{N} w_i \cdot S_i$$

where $S_i$ is the signal from strategy $i$ (normalized to [-1, 1]) and $w_i$ is the weight for that strategy, with $\sum w_i = 1$.

The weights should reflect:

  • Historical accuracy of each strategy
  • Current market conditions (e.g., give more weight to event-driven strategies near catalysts)
  • Strategy independence (diversification benefit)

14.9.2 Strategy Correlation

If two strategies always agree, combining them adds no diversification value. The benefit of combining strategies comes from their independence:

$$\text{Var}(S_{\text{combined}}) = \sum_i w_i^2 \sigma_i^2 + 2\sum_{i<j} w_i w_j \, \rho_{ij} \, \sigma_i \sigma_j$$

Lower correlations $\rho_{ij}$ between strategies reduce portfolio-level variance, leading to better risk-adjusted returns.

Typical strategy correlations in prediction markets:

Strategy Pair Expected Correlation
Fundamental + Mean Reversion Low (0.1-0.3)
Fundamental + Contrarian Low to Moderate (0.2-0.4)
Momentum + Contrarian Negative (-0.3 to -0.1)
Event-Driven + News High (0.5-0.8)
Closing-the-Gap + Fundamental Moderate (0.3-0.5)
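
To see the diversification effect numerically, the following sketch evaluates the variance formula for a two-strategy portfolio, using correlations in the range suggested by the table above (the volatilities and weights are assumed for illustration):

import numpy as np

def combined_std(weights, sigmas, corr):
    """Portfolio-level signal volatility from weights, per-strategy
    volatilities, and a correlation matrix."""
    w = np.asarray(weights)
    s = np.asarray(sigmas)
    cov = corr * np.outer(s, s)  # covariance from correlation
    return float(np.sqrt(w @ cov @ w))

w, s = [0.5, 0.5], [0.10, 0.10]
print(combined_std(w, s, np.array([[1.0, 0.7], [0.7, 1.0]])))    # ~0.092
print(combined_std(w, s, np.array([[1.0, -0.2], [-0.2, 1.0]])))  # ~0.063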

14.9.3 Conflict Resolution

When strategies conflict (e.g., momentum says BUY, contrarian says SELL), the conflict itself is informative:

  • High conflict = high uncertainty. Reduce position size.
  • Fundamental overrides technical. If your fundamental model disagrees with a momentum signal, trust fundamentals for longer time horizons.
  • Contrarian overrides momentum at extremes. When prices are near 0 or 1, contrarian signals are more valuable because the payoff asymmetry favors mean reversion.

14.9.4 Python Multi-Strategy Combiner

import numpy as np
from dataclasses import dataclass
from typing import List, Dict, Optional

@dataclass
class StrategySignal:
    """A signal from a single strategy."""
    strategy_name: str
    direction: float  # -1 (strong sell) to +1 (strong buy)
    confidence: float  # 0-1
    time_horizon: str  # 'short', 'medium', 'long'

@dataclass
class CombinedSignal:
    """Combined signal from multiple strategies."""
    market_id: str
    combined_direction: float
    combined_confidence: float
    recommended_action: str
    position_size_factor: float
    strategy_agreement: float
    individual_signals: List[StrategySignal]
    conflict_level: str  # 'none', 'low', 'high'

class MultiStrategyCombiner:
    """
    Combines signals from multiple trading strategies into
    a single actionable recommendation.
    """

    DEFAULT_WEIGHTS = {
        'fundamental': 0.30,
        'event_driven': 0.20,
        'mean_reversion': 0.15,
        'momentum': 0.10,
        'contrarian': 0.10,
        'closing_gap': 0.10,
        'news_sentiment': 0.05,
    }

    def __init__(
        self,
        weights: Dict[str, float] = None,
        action_threshold: float = 0.25,
        max_conflict_to_trade: float = 0.60,
    ):
        self.weights = weights or self.DEFAULT_WEIGHTS.copy()
        self.action_threshold = action_threshold
        self.max_conflict_to_trade = max_conflict_to_trade

        # Normalize weights
        total = sum(self.weights.values())
        self.weights = {
            k: v / total for k, v in self.weights.items()
        }

    def _measure_agreement(
        self, signals: List[StrategySignal]
    ) -> float:
        """
        Measure how much strategies agree.
        Returns 0 (complete disagreement) to 1 (complete agreement).
        """
        if len(signals) < 2:
            return 1.0

        directions = [s.direction for s in signals]
        # Check if all directions have the same sign
        positive = sum(1 for d in directions if d > 0)
        negative = sum(1 for d in directions if d < 0)
        total = len(directions)

        agreement = max(positive, negative) / total
        return round(agreement, 3)

    def combine_signals(
        self,
        market_id: str,
        signals: List[StrategySignal],
    ) -> CombinedSignal:
        """
        Combine multiple strategy signals into one recommendation.
        """
        if not signals:
            return CombinedSignal(
                market_id=market_id,
                combined_direction=0.0,
                combined_confidence=0.0,
                recommended_action='NO_TRADE',
                position_size_factor=0.0,
                strategy_agreement=0.0,
                individual_signals=[],
                conflict_level='none',
            )

        # Weighted combination
        total_weight = 0.0
        weighted_direction = 0.0
        weighted_confidence = 0.0

        for signal in signals:
            w = self.weights.get(signal.strategy_name, 0.10)
            adjusted_w = w * signal.confidence
            weighted_direction += adjusted_w * signal.direction
            weighted_confidence += adjusted_w * abs(signal.direction)
            total_weight += adjusted_w

        if total_weight > 0:
            combined_dir = weighted_direction / total_weight
            combined_conf = weighted_confidence / total_weight
        else:
            combined_dir = 0.0
            combined_conf = 0.0

        # Measure agreement
        agreement = self._measure_agreement(signals)

        # Conflict level
        if agreement > 0.80:
            conflict = 'none'
        elif agreement > 0.60:
            conflict = 'low'
        else:
            conflict = 'high'

        # Position size factor: scale by agreement
        position_factor = abs(combined_dir) * agreement

        # Recommended action; conflict here is 1 - agreement, so with
        # max_conflict_to_trade = 0.60 we stand aside when agreement < 0.40
        if conflict == 'high' and agreement < (1 - self.max_conflict_to_trade):
            action = 'NO_TRADE'
            position_factor = 0.0
        elif combined_dir > self.action_threshold:
            action = 'BUY_YES'
        elif combined_dir < -self.action_threshold:
            action = 'BUY_NO'
        else:
            action = 'NO_TRADE'
            position_factor = 0.0

        return CombinedSignal(
            market_id=market_id,
            combined_direction=round(combined_dir, 4),
            combined_confidence=round(combined_conf, 4),
            recommended_action=action,
            position_size_factor=round(position_factor, 4),
            strategy_agreement=agreement,
            individual_signals=signals,
            conflict_level=conflict,
        )

    def generate_report(
        self, combined: CombinedSignal
    ) -> str:
        """Generate a human-readable report of the combined signal."""
        lines = [
            f"=== Multi-Strategy Report: {combined.market_id} ===",
            f"Combined Direction: {combined.combined_direction:+.4f}",
            f"Combined Confidence: {combined.combined_confidence:.4f}",
            f"Strategy Agreement: {combined.strategy_agreement:.1%}",
            f"Conflict Level: {combined.conflict_level}",
            f"Recommended Action: {combined.recommended_action}",
            f"Position Size Factor: {combined.position_size_factor:.4f}",
            "",
            "--- Individual Signals ---",
        ]

        for signal in combined.individual_signals:
            weight = self.weights.get(signal.strategy_name, 0.10)
            lines.append(
                f"  {signal.strategy_name:>20s}: "
                f"dir={signal.direction:+.3f}  "
                f"conf={signal.confidence:.3f}  "
                f"weight={weight:.3f}  "
                f"horizon={signal.time_horizon}"
            )

        return "\n".join(lines)

14.10 Risk Management for Binary Strategies

Binary markets present unique risk management challenges. The bounded payoff (0 or 1) simplifies some aspects but introduces others.

14.10.1 Position Limits

Per-market limit. Never risk more than 5% of total capital on a single binary market. Even your highest-conviction trade can be wrong.

Strategy-level limit. Limit total capital allocated to any single strategy to 30%. If mean reversion stops working, you want to survive.

Correlation limit. Track the correlation between your positions. Five positions on different election markets may all move together if a single event (e.g., a debate) affects all of them.

14.10.2 Stop-Loss in Binary Markets

Traditional stop-losses are problematic in binary markets because:

  1. Prices are bounded. A contract bought at 0.60 can only drop to 0.00 --- a 60-cent loss. The maximum loss is known at entry.

  2. Binary resolution. If you believe a contract is worth 0.70 and it drops to 0.55, the correct response depends on whether the drop reflects new information or noise. If noise, the correct action is to buy more, not stop out.

Instead of price-based stops, use thesis-based exits (a minimal sketch follows this list):

  • Define the conditions under which your trade thesis is invalidated.
  • Exit when those conditions are met, regardless of current price.
  • This might mean exiting at a loss (if new information invalidates your estimate) or at a profit (if the thesis played out faster than expected).
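
As a concrete illustration, here is a minimal sketch of thesis-based exit tracking; the classes, market id, and state keys are hypothetical illustrations, not any platform's API:

from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class ThesisExit:
    """One invalidation condition for an open trade."""
    description: str
    is_triggered: Callable[[Dict], bool]  # takes current market state

@dataclass
class MonitoredPosition:
    market_id: str
    thesis: str
    exit_conditions: List[ThesisExit] = field(default_factory=list)

    def check_exits(self, state: Dict) -> List[str]:
        """Return descriptions of any triggered exit conditions."""
        return [
            c.description for c in self.exit_conditions
            if c.is_triggered(state)
        ]

# Hypothetical example: exit if a credible poll moves against the
# position, or if time runs out before the expected catalyst
position = MonitoredPosition(
    market_id="ELECTION-STATE-X",  # hypothetical market id
    thesis="Polls understate candidate A by ~3 points",
    exit_conditions=[
        ThesisExit(
            "High-quality poll shows candidate A trailing by 5+",
            lambda s: s.get("best_poll_margin", 0.0) <= -5.0,
        ),
        ThesisExit(
            "Under 2 days to expiry with no confirming poll",
            lambda s: s.get("days_to_expiry", 99) < 2
            and not s.get("confirming_poll", False),
        ),
    ],
)
triggered = position.check_exits(
    {"best_poll_margin": -6.0, "days_to_expiry": 10}
)
# triggered -> ["High-quality poll shows candidate A trailing by 5+"]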

14.10.3 Portfolio-Level Risk

For a portfolio of binary positions, the key risk metric is expected drawdown, approximated as the probability-weighted sum of potential position losses:

$$E[\text{drawdown}] \approx \sum_{i} P(\text{loss}_i) \times \text{position}_i$$

The worst-case scenario is when all positions lose simultaneously. While unlikely, correlated risks make this more probable than the product of individual loss probabilities.

Correlation management. Group your positions by exposure:

  • Political markets: all exposed to "political surprise" risk
  • Sports markets: less correlated with political events
  • Financial markets: correlated with economic conditions

Diversify across exposure groups to reduce portfolio-level risk.

14.10.4 The Risk of Ruin

For binary market traders, risk of ruin is a real concern. The formula for ruin probability with constant fraction betting:

$$P(\text{ruin}) = \left(\frac{1 - p_{\text{edge}}}{p_{\text{edge}}}\right)^{B/b}$$

where $B$ is total bankroll, $b$ is bet size, and $p_{\text{edge}}$ is the probability of winning each bet. This formula assumes independent bets --- which is rarely exactly true, but provides a useful approximation.

Rule of thumb: If your average edge is 5% (i.e., you win 55% of even-money bets), risk no more than 1/25 of your bankroll per trade to keep ruin probability below 1%: with $B/b = 25$, $(0.45/0.55)^{25} \approx 0.7\%$.
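
A two-line check of this rule; the function simply evaluates the gambler's-ruin formula above:

def ruin_probability(win_prob: float, bankroll_units: float) -> float:
    """Gambler's-ruin probability for even-money unit bets with
    win probability win_prob > 0.5 and bankroll_units = B / b."""
    return ((1.0 - win_prob) / win_prob) ** bankroll_units

print(ruin_probability(0.55, 25))  # ~0.0066 -> below 1%
print(ruin_probability(0.55, 20))  # ~0.0182 -> nearly 2%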


14.11 Backtesting Binary Strategies

Backtesting is essential for validating any trading strategy before deploying real capital. Binary markets present unique backtesting challenges.

14.11.1 Walk-Forward Testing

Walk-forward testing is the gold standard for backtesting prediction market strategies. The process:

  1. Train period. Calibrate strategy parameters on historical data (e.g., months 1-6).
  2. Test period. Apply the strategy out-of-sample (e.g., month 7).
  3. Walk forward. Shift the window: train on months 2-7, test on month 8. Repeat.

This prevents overfitting by ensuring the strategy is always evaluated on data it has never seen.
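
A minimal sketch of the loop above; calibrate and evaluate are placeholders for your own fitting and scoring code, and the toy example at the bottom is purely illustrative:

import numpy as np

def walk_forward(data: np.ndarray, train_len: int, test_len: int,
                 calibrate, evaluate) -> list:
    """Roll a train/test window across `data` and collect
    out-of-sample results. calibrate(train) returns fitted
    parameters; evaluate(params, test) scores the test slice."""
    results = []
    start = 0
    while start + train_len + test_len <= len(data):
        train = data[start:start + train_len]
        test = data[start + train_len:start + train_len + test_len]
        params = calibrate(train)
        results.append(evaluate(params, test))
        start += test_len  # shift the whole window forward
    return results

# Toy usage: "calibrate" a mean, "evaluate" by deviation from it
prices = np.random.uniform(0.3, 0.7, size=500)
oos = walk_forward(
    prices, train_len=120, test_len=20,
    calibrate=lambda tr: float(np.mean(tr)),
    evaluate=lambda m, te: float(np.mean(te) - m),
)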

14.11.2 Avoiding Lookahead Bias

Common sources of lookahead bias in binary market backtests:

  • Using resolution data in signal generation. If your model knows a contract resolved YES, it may inadvertently use this information to select trades.
  • Survivorship bias. Only backtesting on markets that resolved (excluding delisted or cancelled markets).
  • Data snooping. Testing many strategy variants and selecting the one that performed best on historical data. This variant is optimized for the past, not the future.

14.11.3 Realistic Fill Assumptions

Backtests must account for execution realities:

  • Bid-ask spread. You rarely trade at the mid-price. Assume you pay the spread.
  • Slippage. Large orders move the market. Assume 0.5-1.0 cents of slippage per contract.
  • Fees. Platform fees reduce returns. Include them in the backtest.
  • Partial fills. Not all orders are filled. Assume some fraction of intended trades are not executed.

14.11.4 Python Backtest Framework

import numpy as np
from dataclasses import dataclass, field
from typing import List, Dict, Callable, Optional
from enum import Enum

class Side(Enum):
    BUY_YES = "BUY_YES"
    BUY_NO = "BUY_NO"

@dataclass
class Trade:
    """A single trade in the backtest."""
    timestamp: float
    market_id: str
    side: Side
    entry_price: float
    quantity: int
    exit_price: float = 0.0
    exit_timestamp: float = 0.0
    pnl: float = 0.0
    is_open: bool = True
    resolution: Optional[int] = None  # 0 or 1

@dataclass
class BacktestConfig:
    """Configuration for a backtest run."""
    initial_capital: float = 10000.0
    transaction_cost_pct: float = 0.02
    slippage_cents: float = 0.005  # In price units: 0.005 = 0.5 cents
    max_position_pct: float = 0.05  # Max 5% of capital per trade
    partial_fill_rate: float = 0.90  # Fraction of orders filled in full

@dataclass
class BacktestResult:
    """Results of a backtest run."""
    total_return: float = 0.0
    total_return_pct: float = 0.0
    num_trades: int = 0
    win_rate: float = 0.0
    avg_profit: float = 0.0
    avg_loss: float = 0.0
    max_drawdown: float = 0.0
    sharpe_ratio: float = 0.0
    profit_factor: float = 0.0
    equity_curve: List[float] = field(default_factory=list)
    trades: List[Trade] = field(default_factory=list)
    monthly_returns: Dict[str, float] = field(default_factory=dict)

class BinaryBacktester:
    """
    Backtesting framework for binary prediction market strategies.
    Handles realistic execution, transaction costs, and
    portfolio management.
    """

    def __init__(self, config: BacktestConfig = None):
        self.config = config or BacktestConfig()
        self.capital = self.config.initial_capital
        self.trades: List[Trade] = []
        self.equity_curve: List[float] = []

    def _apply_slippage(
        self, price: float, side: Side
    ) -> float:
        """Apply slippage to execution price."""
        slip = self.config.slippage_cents
        if side == Side.BUY_YES:
            return min(1.0, price + slip)
        else:
            return max(0.0, price - slip)

    def _apply_transaction_cost(
        self, price: float
    ) -> float:
        """Calculate transaction cost for a trade."""
        return price * self.config.transaction_cost_pct

    def _entry_cost(self, price: float, side: Side) -> float:
        """Per-contract entry cost: a YES contract costs the price,
        a NO contract costs one minus the price."""
        return price if side == Side.BUY_YES else 1.0 - price

    def _calculate_position_size(
        self, price: float, side: Side
    ) -> int:
        """
        Determine position size (number of contracts) based
        on capital and risk limits.
        """
        max_capital = self.capital * self.config.max_position_pct
        unit_cost = self._entry_cost(price, side)
        cost_per_contract = unit_cost + self._apply_transaction_cost(unit_cost)
        if cost_per_contract <= 0:
            return 0

        size = int(max_capital / cost_per_contract)

        # Apply partial fill
        if np.random.random() > self.config.partial_fill_rate:
            size = int(size * np.random.uniform(0.3, 0.7))

        return max(0, size)

    def execute_trade(
        self,
        timestamp: float,
        market_id: str,
        side: Side,
        raw_price: float,
    ) -> Optional[Trade]:
        """
        Execute a trade with realistic assumptions.
        Returns the Trade object or None if trade cannot be executed.
        """
        # Apply slippage
        exec_price = self._apply_slippage(raw_price, side)

        # Calculate position size (side-aware: NO contracts cost 1 - price)
        quantity = self._calculate_position_size(exec_price, side)
        if quantity <= 0:
            return None

        # Calculate cost including transaction fees
        unit_cost = self._entry_cost(exec_price, side)
        cost = quantity * (
            unit_cost + self._apply_transaction_cost(unit_cost)
        )

        if cost > self.capital:
            return None

        self.capital -= cost

        trade = Trade(
            timestamp=timestamp,
            market_id=market_id,
            side=side,
            entry_price=exec_price,
            quantity=quantity,
        )
        self.trades.append(trade)

        return trade

    def resolve_trade(
        self,
        trade: Trade,
        resolution: int,
        exit_timestamp: float,
    ):
        """
        Resolve a trade when the binary contract settles.

        Args:
            trade: The trade to resolve
            resolution: 1 (YES) or 0 (NO)
            exit_timestamp: When the resolution occurred
        """
        trade.resolution = resolution
        trade.exit_timestamp = exit_timestamp
        trade.is_open = False

        if trade.side == Side.BUY_YES:
            if resolution == 1:
                trade.exit_price = 1.0
                trade.pnl = trade.quantity * (1.0 - trade.entry_price)
            else:
                trade.exit_price = 0.0
                trade.pnl = -trade.quantity * trade.entry_price
        else:  # BUY_NO: cost basis (1 - entry_price), pays 1.0 if NO
            if resolution == 0:
                trade.exit_price = 1.0
                trade.pnl = trade.quantity * trade.entry_price
            else:
                trade.exit_price = 0.0
                trade.pnl = -trade.quantity * (1.0 - trade.entry_price)

        # Subtract exit transaction cost
        exit_cost = self._apply_transaction_cost(abs(trade.pnl))
        trade.pnl -= exit_cost

        # Return the original cost basis plus profit (or minus loss)
        cost_basis = trade.quantity * self._entry_cost(
            trade.entry_price, trade.side
        )
        self.capital += cost_basis + trade.pnl

    def early_exit(
        self,
        trade: Trade,
        exit_price: float,
        exit_timestamp: float,
    ):
        """
        Exit a trade before resolution by selling the contract.
        """
        exit_price = self._apply_slippage(
            exit_price,
            Side.BUY_NO if trade.side == Side.BUY_YES else Side.BUY_YES,
        )

        trade.exit_price = exit_price
        trade.exit_timestamp = exit_timestamp
        trade.is_open = False

        if trade.side == Side.BUY_YES:
            # Selling YES realizes exit_price per contract
            trade.pnl = trade.quantity * (exit_price - trade.entry_price)
            proceeds = trade.quantity * exit_price
        else:
            # Selling NO realizes (1 - exit_price) per contract; the
            # position profits when the YES price has fallen
            trade.pnl = trade.quantity * (
                (1.0 - exit_price) - (1.0 - trade.entry_price)
            )
            proceeds = trade.quantity * (1.0 - exit_price)

        exit_cost = self._apply_transaction_cost(proceeds)
        trade.pnl -= exit_cost

        self.capital += proceeds - exit_cost

    def run_backtest(
        self,
        signal_generator: Callable,
        price_data: Dict[str, np.ndarray],
        resolution_data: Dict[str, int],
        timestamps: np.ndarray,
    ) -> BacktestResult:
        """
        Run a complete backtest.

        Args:
            signal_generator: Function that takes (market_id, prices, idx)
                and returns (Side, confidence) or None
            price_data: Dict mapping market_id to price arrays
            resolution_data: Dict mapping market_id to resolution (0 or 1)
            timestamps: Array of timestamps
        """
        self.capital = self.config.initial_capital
        self.trades = []
        self.equity_curve = [self.capital]

        # Walk through time
        for t_idx in range(len(timestamps)):
            for market_id, prices in price_data.items():
                if t_idx >= len(prices):
                    continue

                # Generate signal
                signal = signal_generator(market_id, prices, t_idx)

                if signal is not None:
                    side, confidence = signal  # confidence unused in this skeleton
                    self.execute_trade(
                        timestamp=timestamps[t_idx],
                        market_id=market_id,
                        side=side,
                        raw_price=prices[t_idx],
                    )

            self.equity_curve.append(self.capital)

        # Resolve all open trades
        for trade in self.trades:
            if trade.is_open and trade.market_id in resolution_data:
                self.resolve_trade(
                    trade,
                    resolution_data[trade.market_id],
                    timestamps[-1],
                )

        return self._compile_results()

    def _compile_results(self) -> BacktestResult:
        """Compile backtest statistics."""
        closed_trades = [t for t in self.trades if not t.is_open]

        if not closed_trades:
            return BacktestResult(
                equity_curve=self.equity_curve,
                trades=self.trades,
            )

        pnls = np.array([t.pnl for t in closed_trades])
        wins = pnls[pnls > 0]
        losses = pnls[pnls < 0]

        # Calculate drawdown
        equity = np.array(self.equity_curve)
        peak = np.maximum.accumulate(equity)
        drawdown = (peak - equity) / peak
        max_dd = np.max(drawdown) if len(drawdown) > 0 else 0.0

        # Per-trade Sharpe, annualized under the rough assumption
        # of one trade per day
        if len(pnls) > 1:
            sharpe = (
                np.mean(pnls) / (np.std(pnls) + 1e-8) * np.sqrt(252)
            )
        else:
            sharpe = 0.0

        total_wins = np.sum(wins) if len(wins) > 0 else 0.0
        total_losses = abs(np.sum(losses)) if len(losses) > 0 else 0.001

        return BacktestResult(
            total_return=round(self.capital - self.config.initial_capital, 2),
            total_return_pct=round(
                (self.capital / self.config.initial_capital - 1) * 100, 2
            ),
            num_trades=len(closed_trades),
            win_rate=round(np.mean(pnls > 0), 4) if len(pnls) > 0 else 0,
            avg_profit=round(np.mean(wins), 4) if len(wins) > 0 else 0,
            avg_loss=round(np.mean(losses), 4) if len(losses) > 0 else 0,
            max_drawdown=round(max_dd, 4),
            sharpe_ratio=round(sharpe, 4),
            profit_factor=round(total_wins / total_losses, 4),
            equity_curve=self.equity_curve,
            trades=self.trades,
        )

14.11.5 Interpreting Backtest Results

Key metrics for evaluating a binary market strategy:

Metric Good Excellent Warning Sign
Win Rate > 55% > 65% < 50%
Profit Factor > 1.3 > 2.0 < 1.0
Max Drawdown < 15% < 8% > 25%
Sharpe Ratio > 1.0 > 2.0 < 0.5
Trade Count > 50 > 200 < 20

Important caveat: A strategy that looks great on 20 trades might be noise. Statistical significance requires many trades. For a binary strategy with 60% win rate, you need at least 100 trades to be 95% confident that the edge is real (and not just luck at 50/50).

The minimum number of trades for statistical significance at a given win rate:

$$n \geq \frac{z_{\alpha/2}^2 \cdot p(1-p)}{(p - 0.5)^2}$$

For $p = 0.60$ and 95% confidence ($z = 1.96$):

$$n \geq \frac{1.96^2 \times 0.60 \times 0.40}{(0.60 - 0.50)^2} = \frac{0.922}{0.01} \approx 92$$

So roughly 100 trades minimum. For edge detection at finer margins, you need proportionally more.
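
A small helper makes this requirement easy to evaluate for other win rates; the function just evaluates the formula above:

import math

def min_trades_for_significance(win_rate: float, z: float = 1.96) -> int:
    """Minimum trades to distinguish win_rate from 50/50 at the
    confidence level implied by z (1.96 ~ 95%)."""
    p = win_rate
    return math.ceil(z**2 * p * (1 - p) / (p - 0.5) ** 2)

print(min_trades_for_significance(0.60))  # 93
print(min_trades_for_significance(0.55))  # 381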


14.12 Chapter Summary

This chapter presented a comprehensive toolkit for trading binary prediction markets:

Fundamental analysis (Section 14.2) builds probability estimates from first principles. It is the most intellectually demanding approach but can identify large, persistent edges. The key is combining multiple information sources (polls, fundamentals, historical base rates) through a principled Bayesian framework.

Event-driven trading (Section 14.3) exploits identifiable catalysts. Pre-event positioning captures expected moves; post-event reaction trading captures interpretation edges. Success requires a systematic event calendar and pre-planned response strategies.

Mean reversion (Section 14.4) profits from noise-driven price overreactions. The adapted Bollinger Band approach accounts for the heteroscedastic nature of binary prices. The critical skill is distinguishing noise from genuine information.

Closing the gap (Section 14.5) is the most mechanical strategy, exploiting the certainty that binary contracts must converge to 0 or 1 at expiry. It thrives in stale, low-attention markets and can be systematized across many contracts.

Momentum (Section 14.6) captures information cascades and slow diffusion effects. It works best in the middle of a contract's life and should be used cautiously near price extremes.

Contrarian (Section 14.7) strategies fade crowd overreactions, exploiting biases such as herding, overconfidence, and partisanship. Risk management is paramount because the crowd is sometimes right.

News and sentiment (Section 14.8) trading leverages speed and interpretation. The edge decays rapidly, making this the most time-sensitive strategy.

Multi-strategy combination (Section 14.9) aggregates signals from multiple approaches, improving robustness and reducing reliance on any single method. Conflict between strategies is itself informative.

Risk management (Section 14.10) and backtesting (Section 14.11) are the foundations that make all strategies viable. Without proper position sizing, correlation management, and out-of-sample validation, even the best strategy will eventually lead to ruin.

The overarching lesson is that no single strategy dominates. The most successful binary market traders maintain a toolkit of approaches, deploying each in the market conditions where it excels. They size their positions conservatively, diversify across strategies and markets, and continuously validate their methods with rigorous backtesting.


What's Next

In Chapter 15, we will turn from strategy to execution. We will examine order types and execution algorithms for prediction markets --- how to get your trades filled at favorable prices, how to minimize market impact, and how to build automated execution systems that implement the strategies from this chapter. We will also explore the practical challenges of live trading: API integration, order management, and monitoring systems.


Key equations to remember:

$$\text{Binary Edge} = q - p \quad \text{(estimated probability minus market price)}$$

$$f^* = \frac{q - p}{1 - p} \quad \text{(Kelly fraction for buying YES)}$$

$$Z = \frac{p_t - \bar{p}_{t,k}}{\sigma_k} \quad \text{(mean-reversion Z-score)}$$

$$S_{\text{combined}} = \sum_i w_i \cdot S_i \quad \text{(multi-strategy signal aggregation)}$$