Learning Objectives

  • Build game-specific predictive models for major esports titles including CS2, League of Legends, and Dota 2
  • Implement the strokes gained framework for golf tournament modeling including course fit analysis and field strength adjustments
  • Construct player projection systems for prop markets that account for game environment, matchup effects, and correlation structure
  • Extract implied probabilities from futures markets and identify value in long-term championship and award wagers
  • Develop systematic approaches to niche and thin markets where data scarcity and market inefficiency create opportunities

Chapter 22: Modeling Emerging Markets

"Emerging and alternative betting markets often feature thinner liquidity, less sharp action, and greater inefficiency. For the prepared bettor, they represent frontier opportunities." --- Adapted from the textbook outline

Chapter Overview

The preceding chapters of Part IV focused on the major team sports (NFL, NBA, MLB, NHL, soccer) and individual sports (tennis, MMA, boxing) that together account for the vast majority of global sports betting handle. These markets are deep, liquid, and heavily scrutinized by sharp bettors and sophisticated syndicates. Edges exist but are narrow, and the cost of finding them --- in terms of data, infrastructure, and intellectual capital --- is high.

This chapter turns to the other side of the market landscape: the emerging, alternative, and niche markets where the dynamics are fundamentally different. Esports betting has grown from a curiosity to a multi-billion-dollar global market in less than a decade. Golf betting, long an afterthought, has exploded with the rise of daily fantasy sports (DFS) and the availability of granular shot-level data. Player proposition (prop) markets have proliferated as sportsbooks seek to increase handle and engagement. Futures markets --- season-long championship odds, award races, division winners --- offer structural characteristics that differ from event-by-event betting. And beneath all of these, a vast ecosystem of niche sports (table tennis, darts, snooker, cricket, handball, volleyball) provides thin, inefficient markets where the analytical bettor with specialized knowledge can find substantial edge.

The common thread across all these markets is that they are less efficient than the major team sports markets. This inefficiency arises from several sources: less sharp money flowing into the market, less historical data for the sportsbook to calibrate from, higher variance in outcomes, and fewer analytical resources devoted to pricing by market makers. For the bettor willing to invest the effort to build specialized models, these markets offer returns that are increasingly difficult to find in the NFL or Premier League.

In this chapter, you will learn to:

  • Model esports matches with game-specific metrics, map probabilities, and patch-aware systems
  • Apply strokes gained decomposition to golf tournament prediction, including course fit and field strength
  • Build player projection systems for prop markets with proper environmental and correlation modeling
  • Extract value from futures markets by understanding time value, entry timing, and hedging
  • Systematically approach niche and thin markets where data scarcity is both a challenge and an advantage


22.1 Esports Betting and Modeling

The Esports Landscape

Esports --- competitive video gaming --- has grown into a major betting market, with global esports betting handle estimated at over $15 billion annually as of 2025. The three largest esports betting markets by volume are:

  1. Counter-Strike 2 (CS2, formerly CS:GO): A 5-on-5 tactical first-person shooter. Teams play a series of maps (best-of-one, best-of-three, or rarely best-of-five). Each map consists of up to 24 regulation rounds under the MR12 format (first to 13, with overtime at 12-12). The game has a deep competitive ecosystem with multiple tiers of professional play.

  2. League of Legends (LoL): A 5-on-5 multiplayer online battle arena (MOBA). Matches are typically best-of-one in regular season play and best-of-five in playoffs. Individual games last 25--45 minutes and involve complex strategic elements including champion selection (draft), objective control, and team fighting.

  3. Dota 2: Another 5-on-5 MOBA with similarities to LoL but substantially different gameplay mechanics, a more complex drafting phase, and a different competitive calendar. Dota 2's premier event, The International, features the largest prize pool in esports.

Other titles with significant betting markets include Valorant (tactical FPS similar to CS2), Overwatch 2 (team-based FPS), and various fighting games and real-time strategy titles.

Game-Specific Metrics

Each esports title has unique metrics that drive predictive modeling. Unlike traditional sports, where the fundamental rules have been stable for decades, esports are subject to patch updates --- developer-released modifications to game balance, maps, and mechanics --- that can fundamentally alter the competitive dynamics. This creates a unique modeling challenge: historical data may become less relevant after a significant patch.

CS2 Key Metrics:

| Metric | Description | Predictive Value |
| --- | --- | --- |
| Map win rate | Win percentage on each competitive map | High --- teams have strong map preferences |
| CT/T-side win rate | Win rate on counter-terrorist vs. terrorist side | Medium --- indicates tactical style |
| First kill rate | Percentage of rounds with the first kill | High --- correlates strongly with round win |
| Clutch rate | Win rate in numerically disadvantaged situations | Medium --- indicates individual skill depth |
| Average damage per round | Total damage dealt divided by rounds played | High --- fundamental measure of fragging power |
| Pistol round win rate | Win rate in economy-reset pistol rounds (1 and 13) | High --- pistol rounds have outsized impact |
| Economy management | Average equipment value, eco round frequency | Medium --- affects round-by-round competitiveness |

League of Legends Key Metrics:

| Metric | Description | Predictive Value |
| --- | --- | --- |
| Gold difference at 15 min | Average gold lead/deficit at the 15-minute mark | High --- early game advantage is strongly predictive |
| First blood rate | Frequency of scoring the first kill | Medium --- indicates early aggression |
| Dragon control rate | Percentage of dragon objectives secured | High --- dragon buffs compound over the game |
| Baron control rate | Percentage of Baron Nashor kills secured | High --- Baron buff enables team pushes |
| Tower plate gold | Gold earned from tower plates (first 14 minutes) | Medium --- indicates laning phase dominance |
| Vision score per minute | Ward placement and destruction rate | Medium --- correlates with map control |
| Champion pool depth | Number of champions competently played per role | Important for draft prediction |

Dota 2 Key Metrics:

| Metric | Description | Predictive Value |
| --- | --- | --- |
| Net worth advantage at 20 min | Average gold/experience lead at 20 minutes | High |
| Roshan control rate | Frequency of securing Roshan kills | High --- Aegis is a powerful game-changing item |
| Tower damage efficiency | Damage to structures relative to game length | Medium --- indicates strategic objective focus |
| Draft win rate | Win rate from advantageous drafts (model-predicted) | High --- Dota 2 drafts are extremely impactful |
| Buyback usage rate | Frequency and timing of buyback usage | Medium --- indicates clutch decision-making |

Map Win Probabilities in CS2

CS2's map veto system creates a critical pre-match strategic element. In a best-of-three, the teams alternate bans and picks (commonly ban-ban, pick-pick, ban-ban, with the last remaining map as the decider), reducing the seven-map competitive pool to the three maps that can be played. Modeling which maps will be played --- and each team's win probability on each map --- is essential.

The map veto model operates in two stages:

  1. Map prediction: Estimate which maps will be played based on each team's known preferences and bans. Teams have strong tendencies: some teams consistently ban certain maps, and map pools are often publicly tracked by analysts.

  2. Map-specific win probability: For each possible map, estimate each team's win probability using map-specific ratings (analogous to surface-specific tennis Elo).

$$P(\text{Team A wins BO3}) = \sum_{m_1, m_2, m_3} P(\text{maps} = (m_1, m_2, m_3)) \cdot P(A \text{ wins} | \text{maps})$$

where the sum is over all possible map sequences and the conditional probability accounts for the best-of-three structure.
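A minimal sketch of this marginalization, assuming per-map win probabilities and a map-sequence distribution have already been estimated elsewhere (both inputs below are hypothetical numbers for illustration):

```python
def bo3_series_prob(map_seq_dist: dict, map_win_prob: dict) -> float:
    """P(Team A wins the BO3), marginalized over possible map sequences."""
    total = 0.0
    for (m1, m2, m3), p_seq in map_seq_dist.items():
        p1, p2, p3 = map_win_prob[m1], map_win_prob[m2], map_win_prob[m3]
        # A wins 2-0, or 2-1 after dropping exactly one of the first two maps
        p_a = p1 * p2 + p1 * (1 - p2) * p3 + (1 - p1) * p2 * p3
        total += p_seq * p_a
    return total

# Two plausible veto outcomes, weighted by their estimated likelihood
seq_dist = {("mirage", "inferno", "nuke"): 0.7,
            ("mirage", "ancient", "nuke"): 0.3}
win_prob = {"mirage": 0.60, "inferno": 0.45, "nuke": 0.55, "ancient": 0.50}
print(bo3_series_prob(seq_dist, win_prob))
```

Because the decider (map 3) only matters when the first two maps split, uncertainty in the veto prediction mostly affects the tails of the series distribution rather than the headline moneyline price.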

Roster Changes and Patch Effects

Two factors unique to esports create both modeling challenges and betting opportunities.

Roster changes occur frequently in esports. Unlike traditional sports where rosters are large and changes at any single position have moderate impact, esports rosters are small (5 players), and replacing one player can fundamentally alter a team's capabilities. A roster change effectively creates a partially new entity, and the model must balance historical team data (which reflects the old roster) against uncertainty about the new configuration.

The standard approach is to apply a roster change discount: reduce confidence in the team's rating (increase the Glicko-2 RD equivalent) proportional to the importance of the changed player. An in-game leader (IGL) change in CS2 is more impactful than replacing a utility player; a mid-lane change in LoL is more impactful than a support change.
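A minimal sketch of such a discount, shrinking the team rating toward the global mean in proportion to the departing player's importance (the role penalty values are purely illustrative assumptions, not calibrated figures):

```python
# Hypothetical penalty by role of the replaced player: an IGL change is
# assumed to be roughly 2.5x as disruptive as swapping a rifler
ROLE_PENALTY = {"igl": 0.30, "awper": 0.20, "rifler": 0.12}


def discounted_rating(team_elo: float, changed_role: str,
                      baseline: float = 1500.0) -> float:
    """Shrink a team's Elo toward the baseline after a roster change."""
    p = ROLE_PENALTY.get(changed_role, 0.15)  # default for unknown roles
    return (1 - p) * team_elo + p * baseline


print(discounted_rating(1700.0, "igl"))     # strong team, big discount
print(discounted_rating(1700.0, "rifler"))  # same team, smaller discount
```

An equivalent effect can be achieved in a Glicko-style system by inflating the rating deviation instead of moving the point estimate; the shrinkage form above is simply easier to bolt onto a plain Elo model.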

Patch effects are changes to game balance released by the developer. Major patches can invalidate weeks or months of historical data by fundamentally changing which strategies, characters, or weapons are effective. The modeling response is:

  1. Track patch dates and weight post-patch data more heavily.
  2. Identify patch impact: Not all patches are equally disruptive. A minor balance tweak is different from a fundamental gameplay overhaul.
  3. Accelerate rating updates immediately post-patch (increase the K-factor equivalent).
  4. Model meta shifts: Track which teams and players benefit from specific types of changes (e.g., a team that favors a particular champion composition benefits when those champions are buffed).
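Steps 1 and 3 can be sketched as two small weighting functions; the decay constants and the three-week boost window below are illustrative assumptions, not tuned values:

```python
from datetime import date, timedelta


def sample_weight(match_date: date, patch_date: date,
                  pre_patch_decay: float = 0.5) -> float:
    """Down-weight matches played before the most recent major patch."""
    if match_date >= patch_date:
        return 1.0
    # Pre-patch data decays further the longer before the patch it occurred
    weeks_before = (patch_date - match_date).days / 7.0
    return pre_patch_decay ** (1.0 + weeks_before / 8.0)


def k_factor(base_k: float, match_date: date, patch_date: date,
             boost: float = 1.5, boost_weeks: int = 3) -> float:
    """Boost the Elo K-factor in the weeks immediately after a patch."""
    if patch_date <= match_date <= patch_date + timedelta(weeks=boost_weeks):
        return base_k * boost
    return base_k
```

In practice the decay rate should itself depend on how disruptive the patch was (step 2): a cosmetic update warrants almost no discounting, while an economy or map overhaul can justify discarding pre-patch data entirely.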

Data Sources for Esports

Unlike traditional sports, where data is mature and standardized, esports data sources vary widely in quality and availability.

| Source | Coverage | Cost | Quality |
| --- | --- | --- | --- |
| HLTV.org (CS2) | Comprehensive professional CS2 | Free (basic), paid (advanced) | Excellent |
| Oracle's Elixir (LoL) | Professional LoL across multiple leagues | Free | Very good |
| DotaBuff / OpenDota (Dota 2) | Professional and public Dota 2 | Free / freemium | Good |
| Liquipedia | Multi-game wiki with results/rosters | Free | Good for results, limited stats |
| PandaScore API | Multi-game real-time data | Paid | Professional grade |
| Abios API | Multi-game live and historical | Paid | Professional grade |

Python Implementation: CS2 Match Prediction

from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CS2Team:
    """CS2 team with overall and map-specific ratings."""

    name: str
    overall_elo: float = 1500.0
    map_elo: Dict[str, float] = field(default_factory=lambda: {
        "mirage": 1500.0,
        "inferno": 1500.0,
        "nuke": 1500.0,
        "overpass": 1500.0,
        "ancient": 1500.0,
        "anubis": 1500.0,
        "vertigo": 1500.0,
    })
    map_games: Dict[str, int] = field(default_factory=lambda: {
        m: 0 for m in [
            "mirage", "inferno", "nuke", "overpass",
            "ancient", "anubis", "vertigo",
        ]
    })
    map_bans: Dict[str, float] = field(default_factory=lambda: {
        m: 0.0 for m in [
            "mirage", "inferno", "nuke", "overpass",
            "ancient", "anubis", "vertigo",
        ]
    })
    roster_change_penalty: float = 0.0
    patch_version: Optional[str] = None


class CS2MatchPredictor:
    """
    Predicts CS2 match outcomes using map-specific Elo ratings.

    Features:
        - Map-specific Elo with blending to overall rating
        - Map veto probability modeling
        - Best-of-N series probability calculation
        - Roster change and patch adjustments
    """

    ALL_MAPS = [
        "mirage", "inferno", "nuke", "overpass",
        "ancient", "anubis", "vertigo",
    ]

    def __init__(
        self,
        overall_k: float = 32.0,
        map_k: float = 40.0,
        map_weight_threshold: int = 20,
        roster_change_rd_increase: float = 0.05,
    ):
        self.overall_k = overall_k
        self.map_k = map_k
        self.map_weight_threshold = map_weight_threshold
        self.roster_change_rd_increase = roster_change_rd_increase
        self.teams: Dict[str, CS2Team] = {}

    def get_or_create_team(self, name: str) -> CS2Team:
        if name not in self.teams:
            self.teams[name] = CS2Team(name=name)
        return self.teams[name]

    def _expected(self, rating_a: float, rating_b: float) -> float:
        return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

    def _blended_map_rating(self, team: CS2Team, map_name: str) -> float:
        """Blend map-specific and overall rating based on map sample size."""
        n = team.map_games.get(map_name, 0)
        weight = n / (n + self.map_weight_threshold)
        blended = (
            weight * team.map_elo.get(map_name, 1500.0)
            + (1 - weight) * team.overall_elo
        )
        # Apply roster change penalty as uncertainty (reduce extreme ratings)
        if team.roster_change_penalty > 0:
            blended = blended * (1 - team.roster_change_penalty) + \
                      1500.0 * team.roster_change_penalty
        return blended

    def predict_map(
        self,
        team_a: str,
        team_b: str,
        map_name: str,
    ) -> Dict[str, float]:
        """Predict win probability for a specific map."""
        ta = self.get_or_create_team(team_a)
        tb = self.get_or_create_team(team_b)

        rating_a = self._blended_map_rating(ta, map_name)
        rating_b = self._blended_map_rating(tb, map_name)

        prob_a = self._expected(rating_a, rating_b)

        return {
            "map": map_name,
            "team_a": team_a,
            "team_b": team_b,
            "rating_a": round(rating_a, 1),
            "rating_b": round(rating_b, 1),
            "win_prob_a": round(prob_a, 4),
            "win_prob_b": round(1 - prob_a, 4),
        }

    def predict_bo3(
        self,
        team_a: str,
        team_b: str,
        maps: Optional[List[str]] = None,
    ) -> Dict:
        """
        Predict best-of-3 series outcome.

        If maps are not specified, uses the most likely map pool
        based on team ban tendencies.
        """
        if maps is None:
            maps = self._estimate_map_pool(team_a, team_b, num_maps=3)

        map_probs = []
        for m in maps:
            pred = self.predict_map(team_a, team_b, m)
            map_probs.append(pred["win_prob_a"])

        # Best of 3: A needs to win 2 maps
        p1, p2, p3 = map_probs[0], map_probs[1], map_probs[2]

        # A wins 2-0
        p_2_0 = p1 * p2
        # A wins 2-1 (loses exactly one of the first two, wins map 3)
        p_2_1 = (p1 * (1 - p2) * p3) + ((1 - p1) * p2 * p3)
        # Total
        p_a_wins = p_2_0 + p_2_1

        return {
            "team_a": team_a,
            "team_b": team_b,
            "maps": maps,
            "map_probs_a": [round(p, 4) for p in map_probs],
            "p_2_0": round(p_2_0, 4),
            "p_2_1": round(p_2_1, 4),
            "series_win_prob_a": round(p_a_wins, 4),
            "series_win_prob_b": round(1 - p_a_wins, 4),
        }

    def _estimate_map_pool(
        self,
        team_a: str,
        team_b: str,
        num_maps: int = 3,
    ) -> List[str]:
        """
        Estimate most likely played maps based on team ban tendencies.

        Simplified model: each team bans their worst map, then picks
        their best remaining map, with the third map being the
        remaining map with highest combined rating.
        """
        ta = self.get_or_create_team(team_a)
        tb = self.get_or_create_team(team_b)

        available = list(self.ALL_MAPS)

        # Each team bans their weakest map
        worst_a = min(available, key=lambda m: ta.map_elo.get(m, 1500))
        available.remove(worst_a)
        worst_b = min(available, key=lambda m: tb.map_elo.get(m, 1500))
        available.remove(worst_b)

        # Each team picks their strongest remaining map
        best_a = max(available, key=lambda m: ta.map_elo.get(m, 1500))
        available_after_a = [m for m in available if m != best_a]
        best_b = max(available_after_a, key=lambda m: tb.map_elo.get(m, 1500))
        remaining = [m for m in available_after_a if m != best_b]

        # Decider: highest combined rating
        decider = max(
            remaining,
            key=lambda m: ta.map_elo.get(m, 1500) + tb.map_elo.get(m, 1500),
        )

        return [best_a, best_b, decider][:num_maps]

    def update_map_result(
        self,
        winner: str,
        loser: str,
        map_name: str,
    ) -> None:
        """Update ratings after a map result."""
        tw = self.get_or_create_team(winner)
        tl = self.get_or_create_team(loser)

        # Overall update
        exp_w = self._expected(tw.overall_elo, tl.overall_elo)
        tw.overall_elo += self.overall_k * (1.0 - exp_w)
        tl.overall_elo += self.overall_k * (0.0 - (1.0 - exp_w))

        # Map-specific update
        exp_w_map = self._expected(
            tw.map_elo.get(map_name, 1500),
            tl.map_elo.get(map_name, 1500),
        )
        tw.map_elo[map_name] = tw.map_elo.get(map_name, 1500) + \
            self.map_k * (1.0 - exp_w_map)
        tl.map_elo[map_name] = tl.map_elo.get(map_name, 1500) + \
            self.map_k * (0.0 - (1.0 - exp_w_map))

        tw.map_games[map_name] = tw.map_games.get(map_name, 0) + 1
        tl.map_games[map_name] = tl.map_games.get(map_name, 0) + 1


# --- Worked Example ---
predictor = CS2MatchPredictor()

# Set up team ratings (in practice, built from historical match data)
navi = predictor.get_or_create_team("NAVI")
navi.overall_elo = 1720
navi.map_elo = {
    "mirage": 1750, "inferno": 1700, "nuke": 1680,
    "overpass": 1740, "ancient": 1650, "anubis": 1710, "vertigo": 1600,
}
navi.map_games = {m: 45 for m in CS2MatchPredictor.ALL_MAPS}

faze = predictor.get_or_create_team("FaZe")
faze.overall_elo = 1690
faze.map_elo = {
    "mirage": 1680, "inferno": 1730, "nuke": 1710,
    "overpass": 1650, "ancient": 1700, "anubis": 1680, "vertigo": 1640,
}
faze.map_games = {m: 42 for m in CS2MatchPredictor.ALL_MAPS}

# Predict a best-of-3 series
result = predictor.predict_bo3("NAVI", "FaZe")
print(f"Series: {result['team_a']} vs {result['team_b']}")
print(f"Predicted maps: {result['maps']}")
print(f"Map probabilities (NAVI): {result['map_probs_a']}")
print(f"P(NAVI 2-0): {result['p_2_0']:.2%}")
print(f"P(NAVI 2-1): {result['p_2_1']:.2%}")
print(f"NAVI series win: {result['series_win_prob_a']:.2%}")
print(f"FaZe series win: {result['series_win_prob_b']:.2%}")

# Predict specific map matchup
inferno = predictor.predict_map("NAVI", "FaZe", "inferno")
print(f"\nInferno: NAVI {inferno['win_prob_a']:.2%} vs FaZe {inferno['win_prob_b']:.2%}")

Market Insight: Esports betting markets are notably less efficient than traditional sports, particularly for lower-tier leagues (e.g., ESEA Advanced in CS2, LFL in LoL). Sportsbooks often set lines based on limited information, and roster changes, coaching changes, and patch effects create rapid shifts in relative team strength that the market is slow to incorporate. Map-specific knowledge is particularly valuable: a team's map pool is often publicly known through ban/pick tracking, yet many sportsbooks do not fully adjust their series prices for map pool dynamics.


22.2 Golf Tournament Modeling

The Strokes Gained Framework

Golf is uniquely suited to quantitative modeling because it produces a continuous, interval-scale measure of performance: strokes. Unlike ball sports where "runs" or "goals" are count data with complex dependencies, each golf stroke is an independent action with a measurable outcome. Mark Broadie's strokes gained framework, developed at Columbia Business School and now the foundation of PGA Tour analytics, decomposes a golfer's performance into contributions from specific shot types relative to the field average.

The fundamental concept: every position on a golf course (lie, distance to hole) has an expected number of strokes remaining to complete the hole, based on PGA Tour averages. If a golfer takes a shot from position A (expected 3.8 strokes remaining) to position B (expected 2.5 strokes remaining), they have used 1 stroke but gained $3.8 - 2.5 - 1.0 = 0.3$ strokes relative to the average player.

$$\text{SG}_{\text{shot}} = (\text{Expected strokes from start}) - (\text{Expected strokes from end}) - 1$$

The PGA Tour decomposes strokes gained into four categories:

  1. SG: Off the Tee (SG:OTT) --- Driving performance, tee shots on par 4s and par 5s
  2. SG: Approach (SG:APP) --- Approach shots into the green
  3. SG: Around the Green (SG:ARG) --- Chips, pitches, and bunker shots
  4. SG: Putting (SG:PUTT) --- Performance on the putting surface

A golfer's total strokes gained per round is:

$$\text{SG:Total} = \text{SG:OTT} + \text{SG:APP} + \text{SG:ARG} + \text{SG:PUTT}$$
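The per-shot formula can be checked with a one-line helper; the expected-strokes baselines in the example are the illustrative numbers from the text above, not actual PGA Tour values:

```python
def strokes_gained(expected_before: float, expected_after: float) -> float:
    """Strokes gained on one shot: baseline improvement minus the stroke used."""
    return expected_before - expected_after - 1.0


# The worked example above: 3.8 expected strokes down to 2.5 in a single shot
print(round(strokes_gained(3.8, 2.5), 2))  # gains 0.3 strokes vs. the field

# Holing out from a 1.5-expected-stroke position gains 0.5
print(round(strokes_gained(1.5, 0.0), 2))
```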

Course Fit Analysis

Different golf courses place different demands on the four skill components. A long, wide-open course with generous fairways but well-defended greens rewards SG:OTT and SG:APP but is less sensitive to SG:ARG. A short, tight course with small, undulating greens rewards accuracy and short-game skill more than raw distance.

Course characterization involves analyzing historical tournament data to determine which SG components are most predictive of performance at each venue. The regression model is:

$$\text{Finish Position}_i = \beta_0 + \beta_1 \cdot \text{SG:OTT}_i + \beta_2 \cdot \text{SG:APP}_i + \beta_3 \cdot \text{SG:ARG}_i + \beta_4 \cdot \text{SG:PUTT}_i + \epsilon_i$$

The coefficients $\beta_1$ through $\beta_4$ are negative when a skill helps (more strokes gained means a lower, better finish position), and their relative magnitudes tell us which skills matter most at each course. At Augusta National (The Masters), SG:APP and SG:ARG tend to have larger coefficients because the course demands precise iron play into severely contoured greens and excellent chipping and pitching around those greens. At Torrey Pines (South Course), SG:OTT is more important because the course is long and rewards driving distance.

Sample Course Fit Profiles:

| Course | SG:OTT Weight | SG:APP Weight | SG:ARG Weight | SG:PUTT Weight |
| --- | --- | --- | --- | --- |
| Augusta National | 0.18 | 0.35 | 0.28 | 0.19 |
| Pebble Beach | 0.15 | 0.30 | 0.30 | 0.25 |
| TPC Sawgrass | 0.20 | 0.33 | 0.22 | 0.25 |
| Torrey Pines (South) | 0.30 | 0.30 | 0.18 | 0.22 |
| St Andrews (Old Course) | 0.22 | 0.25 | 0.25 | 0.28 |
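Course-specific weights like these can be estimated with an ordinary least-squares fit of finish position on the four SG components. The sketch below runs the regression on synthetic data, so the coefficients and recovered weights are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic golfer-rounds: four SG components (OTT, APP, ARG, PUTT)
n = 400
X = rng.normal(0, 1, size=(n, 4))
# Assumed "true" course sensitivities: approach play matters most here.
# Coefficients are negative because more SG lowers (improves) finish position.
true_beta = np.array([-2.0, -5.0, -3.0, -2.5])
finish = 60 + X @ true_beta + rng.normal(0, 5, n)

# OLS with intercept: finish ~ b0 + b1*OTT + b2*APP + b3*ARG + b4*PUTT
A = np.column_stack([np.ones(n), X])
beta_hat, *_ = np.linalg.lstsq(A, finish, rcond=None)

# Relative importance: normalize coefficient magnitudes to weights summing to 1
weights = np.abs(beta_hat[1:]) / np.abs(beta_hat[1:]).sum()
print(dict(zip(["ott", "app", "arg", "putt"], weights.round(2))))
```

With real data the design matrix would be each golfer's pre-tournament SG profile and the response their finish at past editions of the event; shrinking the fitted weights toward a tour-average profile guards against overfitting small course samples.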

Field Strength Adjustment

Golf tournament fields vary enormously in quality. The Masters features approximately 90 of the world's best players. A typical PGA Tour event features 144 players of mixed quality. Korn Ferry Tour events and international tours feature weaker fields.

Strokes gained data is measured relative to the field average, which means that raw SG numbers from a weak-field event are not directly comparable to those from a strong-field event. Field strength adjustment normalizes SG data to a common baseline.

The adjustment is:

$$\text{SG:Adjusted}_i = \text{SG:Raw}_i + \text{Field Strength Offset}$$

where the field strength offset is derived from the difference between the average world ranking of the field and a reference field. A common approach uses the Official World Golf Ranking (OWGR) of all competitors to compute a field quality index.
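One simple way to turn the field's OWGR ranks into an additive offset is a log-ratio against a reference field. The log transform, reference rank, and scale factor below are all assumptions chosen for illustration, not calibrated values:

```python
import math


def field_strength_offset(field_owgr: list, reference_mean_rank: float = 80.0,
                          scale: float = 0.4) -> float:
    """
    Additive SG adjustment from field quality.

    Positive for strong fields (low mean OWGR rank), where raw SG vs. the
    field understates true skill; negative for weak fields.
    """
    mean_rank = sum(field_owgr) / len(field_owgr)
    return scale * (math.log(reference_mean_rank) - math.log(mean_rank))


# Strong major-style field vs. a weak opposite-field event (ranks hypothetical)
strong = field_strength_offset([5, 12, 20, 40, 60])
weak = field_strength_offset([150, 220, 300, 400])
print(round(strong, 3), round(weak, 3))
```

A production version would replace the raw mean rank with a points-based field quality index, since OWGR rank is heavily nonlinear in underlying skill at the top of the list.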

Cut Probability and Tournament Simulation

In stroke play events, approximately half the field is eliminated after 36 holes (the "cut"). Modeling cut probability is important both for outright winner predictions and for derivative bets (top 5, top 10, top 20, make/miss cut).

Cut probability depends on:

  1. The golfer's expected performance (SG:Total, adjusted for course and field)
  2. The variance of that performance (how consistently they play)
  3. The strength of the specific field

A Monte Carlo simulation approach:

  1. For each golfer in the field, sample round scores from a distribution calibrated to their SG profile and historical variance.
  2. After 36 simulated holes, apply the cut rule (typically top 65 and ties).
  3. For surviving golfers, simulate rounds 3 and 4.
  4. Record finish positions.
  5. Repeat thousands of times to build probability distributions for all outcomes.

Python Implementation: Golf Tournament Model

import numpy as np
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Golfer:
    """Professional golfer with strokes gained profile."""

    name: str
    sg_ott: float = 0.0       # Strokes Gained: Off the Tee
    sg_app: float = 0.0       # Strokes Gained: Approach
    sg_arg: float = 0.0       # Strokes Gained: Around the Green
    sg_putt: float = 0.0      # Strokes Gained: Putting
    round_std: float = 2.8    # Standard deviation of round score vs field
    world_ranking: int = 100
    recent_form_adj: float = 0.0  # Recent performance adjustment

    @property
    def sg_total(self) -> float:
        return self.sg_ott + self.sg_app + self.sg_arg + self.sg_putt


@dataclass
class CourseProfile:
    """Course characteristics as SG component weights."""

    name: str
    sg_ott_weight: float = 0.25
    sg_app_weight: float = 0.30
    sg_arg_weight: float = 0.22
    sg_putt_weight: float = 0.23
    par: int = 72
    average_winning_score: float = -12.0  # Relative to par
    course_difficulty: float = 0.0  # Adjustment to par


class GolfTournamentSimulator:
    """
    Monte Carlo tournament simulator using strokes gained framework.

    Simulates complete tournaments including cuts, using course-fit
    adjusted golfer projections and realistic round-to-round variance.
    """

    def __init__(
        self,
        course: CourseProfile,
        field: List[Golfer],
        n_simulations: int = 10000,
        n_rounds: int = 4,
        cut_after_round: int = 2,
        cut_top_n: int = 65,
        correlation_between_rounds: float = 0.08,
        random_seed: Optional[int] = None,
    ):
        self.course = course
        self.field = field
        self.n_simulations = n_simulations
        self.n_rounds = n_rounds
        self.cut_after_round = cut_after_round
        self.cut_top_n = cut_top_n
        self.round_corr = correlation_between_rounds
        self.rng = np.random.default_rng(random_seed)

    def course_fit_score(self, golfer: Golfer) -> float:
        """
        Calculate a golfer's course-fit-adjusted expected strokes gained.

        Weights the four SG components by the course profile to produce
        an expected SG:Total specific to this course.
        """
        weights = np.array([
            self.course.sg_ott_weight,
            self.course.sg_app_weight,
            self.course.sg_arg_weight,
            self.course.sg_putt_weight,
        ])
        # Normalize weights to sum to 1
        weights = weights / weights.sum()

        sg_components = np.array([
            golfer.sg_ott,
            golfer.sg_app,
            golfer.sg_arg,
            golfer.sg_putt,
        ])

        # Course-fit SG: weight components by course importance
        # then scale so that the total magnitude is preserved
        total_sg = golfer.sg_total
        course_fit = np.sum(weights * sg_components) * 4  # Scale back up

        # Blend with raw SG:Total (course fit is informative but noisy)
        blended = 0.6 * course_fit + 0.4 * total_sg

        # Add recent form adjustment
        return blended + golfer.recent_form_adj

    def simulate_tournament(self) -> Dict[str, Dict]:
        """
        Run full Monte Carlo tournament simulation.

        Returns:
            Dictionary mapping golfer names to result distributions:
            win probability, top-5/10/20 probability, make-cut probability,
            and expected finish position.
        """
        n_golfers = len(self.field)

        # Pre-compute course-fit scores
        expected_sg = np.array([
            self.course_fit_score(g) for g in self.field
        ])
        round_stds = np.array([g.round_std for g in self.field])

        # Result accumulators
        wins = np.zeros(n_golfers)
        top_5 = np.zeros(n_golfers)
        top_10 = np.zeros(n_golfers)
        top_20 = np.zeros(n_golfers)
        made_cut = np.zeros(n_golfers)
        finish_positions = np.zeros(n_golfers)

        for sim in range(self.n_simulations):
            # Simulate all rounds
            # Each golfer's score per round = -(expected_sg + noise)
            # Lower scores are better; SG is positive = better
            cumulative_scores = np.zeros(n_golfers)

            # Generate per-golfer tournament "form" factor
            # (introduces small round-to-round correlation)
            tournament_form = self.rng.normal(0, 0.5, n_golfers)

            for rnd in range(self.n_rounds):
                if rnd == self.cut_after_round:
                    # Apply cut: only top_n continue
                    cut_line_idx = np.argsort(cumulative_scores)[:self.cut_top_n]
                    cut_mask = np.zeros(n_golfers, dtype=bool)
                    cut_mask[cut_line_idx] = True

                round_noise = self.rng.normal(0, 1, n_golfers) * round_stds
                round_scores = -(expected_sg + tournament_form * self.round_corr + round_noise)

                if rnd >= self.cut_after_round:
                    # Only update scores for golfers who made the cut
                    round_scores[~cut_mask] = np.inf
                    cumulative_scores[~cut_mask] = np.inf

                cumulative_scores += round_scores

            # Record cut results
            if self.cut_after_round < self.n_rounds:
                made_cut += cut_mask.astype(float)
            else:
                made_cut += 1.0

            # Determine finish positions (lower score = better position)
            valid = cumulative_scores < np.inf
            rankings = np.full(n_golfers, n_golfers)  # Default: last
            if valid.any():
                valid_scores = cumulative_scores[valid]
                order = np.argsort(valid_scores)
                ranks = np.empty_like(order)
                ranks[order] = np.arange(1, len(order) + 1)
                rankings[valid] = ranks

            # Accumulate results
            wins += (rankings == 1).astype(float)
            top_5 += (rankings <= 5).astype(float)
            top_10 += (rankings <= 10).astype(float)
            top_20 += (rankings <= 20).astype(float)
            finish_positions += rankings

        # Compile results
        results = {}
        for i, golfer in enumerate(self.field):
            results[golfer.name] = {
                "expected_sg": round(expected_sg[i], 3),
                "win_prob": round(wins[i] / self.n_simulations, 4),
                "top_5_prob": round(top_5[i] / self.n_simulations, 4),
                "top_10_prob": round(top_10[i] / self.n_simulations, 4),
                "top_20_prob": round(top_20[i] / self.n_simulations, 4),
                "make_cut_prob": round(made_cut[i] / self.n_simulations, 4),
                "avg_finish": round(finish_positions[i] / self.n_simulations, 1),
            }

        return results


# --- Worked Example ---
# Define course
augusta = CourseProfile(
    name="Augusta National",
    sg_ott_weight=0.18,
    sg_app_weight=0.35,
    sg_arg_weight=0.28,
    sg_putt_weight=0.19,
    par=72,
)

# Define a small illustrative field
golfers = [
    Golfer("Scottie Scheffler", sg_ott=0.8, sg_app=1.2, sg_arg=0.4,
           sg_putt=0.3, round_std=2.5, world_ranking=1),
    Golfer("Rory McIlroy", sg_ott=1.0, sg_app=0.9, sg_arg=0.2,
           sg_putt=0.1, round_std=2.7, world_ranking=3),
    Golfer("Jon Rahm", sg_ott=0.7, sg_app=0.8, sg_arg=0.5,
           sg_putt=0.4, round_std=2.6, world_ranking=5),
    Golfer("Xander Schauffele", sg_ott=0.6, sg_app=0.7, sg_arg=0.3,
           sg_putt=0.5, round_std=2.6, world_ranking=4),
    Golfer("Collin Morikawa", sg_ott=0.3, sg_app=1.1, sg_arg=0.3,
           sg_putt=0.0, round_std=2.8, world_ranking=8),
]

simulator = GolfTournamentSimulator(
    course=augusta,
    field=golfers,
    n_simulations=50000,
    random_seed=42,
)

results = simulator.simulate_tournament()

print(f"{'Golfer':<25} {'Win%':>8} {'Top5%':>8} {'Top10%':>8} "
      f"{'Cut%':>8} {'AvgFin':>8}")
print("-" * 75)
for name, data in sorted(results.items(), key=lambda x: -x[1]["win_prob"]):
    print(f"{name:<25} {data['win_prob']:>7.1%} {data['top_5_prob']:>7.1%} "
          f"{data['top_10_prob']:>7.1%} {data['make_cut_prob']:>7.1%} "
          f"{data['avg_finish']:>8.1f}")

Market Insight: Golf outright winner markets are among the most inefficient in sports betting. With 144-player fields, the favorite typically has only a 10--15% implied probability, and the long tail of the distribution creates opportunities. Course fit models that properly weight SG components can identify 80/1 or 100/1 golfers whose true probability is meaningfully higher than the market implies. Additionally, matchup bets (head-to-head between two golfers), top-5/10/20 finishes, and make/miss cut bets offer different risk-return profiles that can be exploited with the same underlying model.
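The head-to-head matchup bets mentioned above can be priced with the same simulation machinery. A minimal standalone sketch (illustrative scoring parameters, not tied to the CourseProfile/Golfer classes above): simulate two golfers' 72-hole totals relative to par and count how often the stronger-baseline golfer posts the lower score.

```python
import numpy as np

# Hypothetical per-round means (relative to par) and round-to-round
# standard deviations for two golfers. All numbers are illustrative.
rng = np.random.default_rng(42)
n_sims, n_rounds = 100_000, 4

mean_a, std_a = -1.8, 2.5   # Golfer A: better baseline
mean_b, std_b = -1.2, 2.8   # Golfer B: weaker mean, more volatile

# Simulate 72-hole totals as the sum of four independent rounds
totals_a = rng.normal(mean_a, std_a, (n_sims, n_rounds)).sum(axis=1)
totals_b = rng.normal(mean_b, std_b, (n_sims, n_rounds)).sum(axis=1)

p_a_wins = np.mean(totals_a < totals_b)  # lower score wins the matchup
print(f"P(A beats B): {p_a_wins:.3f}")
```

Because the two totals are drawn within the same simulation loop, the same structure extends naturally to shared-condition draws (weather, course setup) that induce correlation between the golfers.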


22.3 Player Prop Markets

The Prop Market Explosion

Player proposition bets --- wagers on individual player statistical outcomes rather than game results --- have become one of the fastest-growing segments of the sports betting market. The typical NFL Sunday now features thousands of prop offerings per game: passing yards, rushing yards, receiving yards, touchdowns, receptions, tackles, and dozens more. NBA, MLB, NHL, and soccer markets have followed suit.

Props are attractive to bettors because they feel personal and intuitive (everyone has an opinion about whether LeBron James will score over 25.5 points), and they are attractive to sportsbooks because the margins are wider than on sides and totals, and the volume-generating effect of same-game parlays (SGPs) --- which combine multiple props from the same game --- is substantial.

For the quantitative bettor, props present a compelling opportunity because:

  1. Less sharp action: The professional betting market focuses primarily on sides and totals. Prop markets receive less sophisticated flow.
  2. Model advantages compound: A prop model that properly accounts for game environment, matchup, and usage can find numerous edges per game.
  3. Correlation opportunities: Sportsbooks often misprice the correlation between related props, creating SGP value.

Building Player Projection Systems

A player prop projection system must estimate the distribution of a player's statistical output --- not just the mean, but the variance and shape of the distribution. The main components are:

1. Baseline projection: The player's average performance, typically a weighted combination of season-long stats and recent form. For NBA points, this might be:

$$\hat{Y}_{\text{base}} = w_{\text{season}} \cdot \bar{Y}_{\text{season}} + w_{\text{recent}} \cdot \bar{Y}_{\text{recent}}$$

where $\bar{Y}_{\text{season}}$ is the season average, $\bar{Y}_{\text{recent}}$ is the average over the last 5--10 games, and the weights are calibrated to maximize out-of-sample accuracy.

2. Game environment adjustment: Player stats are not produced in a vacuum. They depend critically on game context:

  • Pace: A game with a projected pace of 105 possessions (fast) produces more counting stats than a game projected at 95 possessions (slow). The adjustment is:

$$\hat{Y}_{\text{pace}} = \hat{Y}_{\text{base}} \cdot \frac{\text{Projected Pace}}{\text{League Average Pace}}$$

  • Game script (implied total and spread): A team that is expected to be winning by a large margin will rest starters in the fourth quarter, reducing counting stats. A team trailing will play faster and attempt more three-pointers, increasing certain stats.

  • Rest and scheduling: Back-to-back games in the NBA reduce player minutes by approximately 2--3 minutes on average, with corresponding reductions in all counting stats. Travel, altitude, and time-zone changes also matter.

3. Matchup adjustment: The opponent's defensive profile affects individual player production.

For NBA points: $$\hat{Y}_{\text{matchup}} = \hat{Y}_{\text{pace}} \cdot (1 + \text{DvP}_{\text{opponent}})$$

where DvP (Defense versus Position) measures how many points the opponent allows to the player's position relative to the league average. A DvP of +0.05 means the opponent allows 5% more points than average to that position.

For NFL receiving yards: $$\hat{Y}_{\text{matchup}} = \hat{Y}_{\text{base}} \cdot \frac{\text{Opponent Yds/Target Allowed (position)}}{\text{League Avg Yds/Target (position)}}$$

4. Usage and role adjustment: If a teammate is injured or the player's role has changed, the baseline must be adjusted. In the NBA, if a team's second-best player is out, the primary player's usage rate typically increases by 3--5 percentage points, which translates to roughly 2--4 additional points per game.
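The claim in component 1 that the season/recent weights should be calibrated to maximize out-of-sample accuracy can be made concrete with a walk-forward grid search. A minimal sketch on synthetic game logs (the drift and noise parameters are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 82-game history for one player, with mild form drift
true_mean = 25 + np.cumsum(rng.normal(0, 0.3, 82))
points = rng.normal(true_mean, 7.0)

def walk_forward_mae(w_season: float, window: int = 8) -> float:
    """Blend season-to-date and last-`window` averages, score vs next game."""
    errors = []
    for g in range(window, len(points) - 1):
        season_avg = points[:g].mean()
        recent_avg = points[g - window:g].mean()
        pred = w_season * season_avg + (1 - w_season) * recent_avg
        errors.append(abs(pred - points[g]))
    return float(np.mean(errors))

# Grid search over the season weight; recent weight is 1 - w_season
weights = np.linspace(0.0, 1.0, 21)
maes = [walk_forward_mae(w) for w in weights]
best_w = weights[int(np.argmin(maes))]
print(f"Best season weight: {best_w:.2f} (MAE {min(maes):.2f})")
```

In production the search would run over real game logs, jointly with the window length, and would be validated on a held-out season rather than the same data used for the grid.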

Correlation Analysis Between Props

The correlation between player props within the same game is a critical --- and frequently mispriced --- factor. Consider:

  • A quarterback's passing yards are positively correlated with his receivers' receiving yards (trivially, because the same plays generate both stats).
  • A quarterback's passing touchdowns are positively correlated with the game total.
  • A player's points are negatively correlated with their minutes if the game is a blowout (they sit out the fourth quarter).
  • Two receivers on the same team are negatively correlated in target share (more targets for one means fewer for the other) but may be positively correlated in overall volume if the game script favors passing.

Understanding these correlations is essential for:

  1. Pricing same-game parlays correctly (to identify when the sportsbook's SGP price understates the true correlation, creating value).
  2. Avoiding double-counting the same edge across multiple prop bets.
  3. Managing portfolio risk when betting multiple props on the same game.

Python Implementation: Player Prop Projection

import numpy as np
from dataclasses import dataclass
from typing import Dict, List, Optional
from scipy import stats


@dataclass
class PlayerProfile:
    """Player statistical profile for prop modeling."""

    name: str
    team: str
    position: str
    season_avg: float          # Season average for the stat
    recent_avg: float          # Last 5-10 games average
    season_std: float          # Standard deviation of game-to-game performance
    games_played: int
    minutes_avg: float = 0.0   # Average minutes (NBA) or snaps (NFL)
    usage_rate: float = 0.0    # Usage rate (NBA) or target share (NFL)


@dataclass
class GameEnvironment:
    """Game-level factors that affect player prop projections."""

    projected_pace: float          # Possessions (NBA) or plays (NFL)
    league_avg_pace: float
    spread: float                  # Positive = player's team favored
    implied_total: float
    league_avg_total: float
    is_back_to_back: bool = False
    teammate_out: Optional[str] = None
    teammate_usage_boost: float = 0.0  # Additional usage from teammate absence


@dataclass
class MatchupFactor:
    """Opponent defensive profile for the relevant stat/position."""

    dvp_rating: float = 0.0       # Defense vs Position: positive = weaker defense
    pace_allowed: float = 0.0     # Opponent pace differential
    position_rank: int = 15       # Rank 1=best defense, 30=worst


class PlayerPropModel:
    """
    Projects individual player stat lines for prop market analysis.

    Combines baseline projections with game environment, matchup,
    and usage adjustments to produce full probability distributions.
    """

    def __init__(
        self,
        season_weight: float = 0.55,
        recent_weight: float = 0.45,
        pace_sensitivity: float = 0.85,
        total_sensitivity: float = 0.35,
        back_to_back_penalty: float = 0.06,
        blowout_minutes_reduction: float = 0.08,
    ):
        self.season_weight = season_weight
        self.recent_weight = recent_weight
        self.pace_sensitivity = pace_sensitivity
        self.total_sensitivity = total_sensitivity
        self.b2b_penalty = back_to_back_penalty
        self.blowout_reduction = blowout_minutes_reduction

    def project(
        self,
        player: PlayerProfile,
        environment: GameEnvironment,
        matchup: MatchupFactor,
    ) -> Dict:
        """
        Generate full projection for a player prop.

        Returns projected mean, standard deviation, and over/under
        probabilities for common lines.
        """
        # Step 1: Baseline (weighted season + recent)
        baseline = (
            self.season_weight * player.season_avg
            + self.recent_weight * player.recent_avg
        )

        # Step 2: Pace adjustment
        pace_factor = 1.0 + self.pace_sensitivity * (
            (environment.projected_pace / environment.league_avg_pace) - 1.0
        )
        projection = baseline * pace_factor

        # Step 3: Game total / script adjustment
        total_factor = 1.0 + self.total_sensitivity * (
            (environment.implied_total / environment.league_avg_total) - 1.0
        )
        projection *= total_factor

        # Step 4: Blowout risk (large spread reduces minutes)
        if abs(environment.spread) > 8:
            blowout_factor = 1.0 - self.blowout_reduction * (
                (abs(environment.spread) - 8) / 10
            )
            projection *= max(0.8, blowout_factor)

        # Step 5: Matchup adjustment
        matchup_factor = 1.0 + matchup.dvp_rating
        projection *= matchup_factor

        # Step 6: Usage boost from teammate absence
        if environment.teammate_out:
            projection *= (1.0 + environment.teammate_usage_boost)

        # Step 7: Back-to-back penalty
        if environment.is_back_to_back:
            projection *= (1.0 - self.b2b_penalty)

        # Compute distribution parameters
        # Standard deviation scales with the projection
        projected_std = player.season_std * (projection / max(player.season_avg, 1))

        # Use normal distribution for continuous stats (points, yards)
        # Use Poisson for count stats (touchdowns, made threes)
        return {
            "player": player.name,
            "projected_mean": round(projection, 2),
            "projected_std": round(projected_std, 2),
            "adjustments": {
                "baseline": round(baseline, 2),
                "pace_factor": round(pace_factor, 4),
                "total_factor": round(total_factor, 4),
                "matchup_factor": round(matchup_factor, 4),
                "b2b_applied": environment.is_back_to_back,
                "teammate_boost": round(environment.teammate_usage_boost, 4),
            },
        }

    def over_under_probability(
        self,
        projected_mean: float,
        projected_std: float,
        line: float,
        distribution: str = "normal",
    ) -> Dict[str, float]:
        """
        Calculate over/under probabilities for a given line.

        Args:
            projected_mean: Projected value for the stat.
            projected_std: Projected standard deviation.
            line: The sportsbook's over/under line.
            distribution: 'normal' for continuous stats, 'poisson' for counts.

        Returns:
            Over and under probabilities.
        """
        if distribution == "normal":
            p_under = stats.norm.cdf(line, loc=projected_mean, scale=projected_std)
            p_over = 1.0 - p_under
        elif distribution == "poisson":
            # For count stats, evaluate the Poisson CDF at the floor of
            # the line. Assumes a half-point line; whole-number lines
            # can push, which this simple treatment ignores.
            p_under = stats.poisson.cdf(int(line), mu=projected_mean)
            p_over = 1.0 - p_under
        else:
            raise ValueError(f"Unknown distribution: {distribution}")

        return {
            "line": line,
            "p_over": round(p_over, 4),
            "p_under": round(p_under, 4),
            "edge_over": None,  # To be filled with market comparison
            "edge_under": None,
        }

    def find_value(
        self,
        projection: Dict,
        line: float,
        market_over_odds: int,
        market_under_odds: int,
        distribution: str = "normal",
    ) -> Dict:
        """
        Compare model projection to market line and identify value.

        Args:
            projection: Output from self.project().
            line: The sportsbook line.
            market_over_odds: American odds for the over.
            market_under_odds: American odds for the under.

        Returns:
            Dictionary with model probabilities, implied probabilities,
            and expected value for each side.
        """
        model_probs = self.over_under_probability(
            projection["projected_mean"],
            projection["projected_std"],
            line,
            distribution,
        )

        def american_to_implied(odds: int) -> float:
            if odds > 0:
                return 100 / (odds + 100)
            else:
                return abs(odds) / (abs(odds) + 100)

        def american_to_decimal(odds: int) -> float:
            if odds > 0:
                return 1 + odds / 100
            else:
                return 1 + 100 / abs(odds)

        implied_over = american_to_implied(market_over_odds)
        implied_under = american_to_implied(market_under_odds)
        decimal_over = american_to_decimal(market_over_odds)
        decimal_under = american_to_decimal(market_under_odds)

        ev_over = model_probs["p_over"] * decimal_over - 1
        ev_under = model_probs["p_under"] * decimal_under - 1

        return {
            "player": projection["player"],
            "line": line,
            "projected_mean": projection["projected_mean"],
            "model_p_over": model_probs["p_over"],
            "model_p_under": model_probs["p_under"],
            "market_implied_over": round(implied_over, 4),
            "market_implied_under": round(implied_under, 4),
            "ev_over": round(ev_over, 4),
            "ev_under": round(ev_under, 4),
            "recommended": "OVER" if ev_over > ev_under and ev_over > 0.02
                          else "UNDER" if ev_under > 0.02
                          else "NO BET",
        }


# --- Worked Example: NBA Points Prop ---
model = PlayerPropModel()

# Jayson Tatum points prop
tatum = PlayerProfile(
    name="Jayson Tatum",
    team="BOS",
    position="SF",
    season_avg=27.5,
    recent_avg=30.2,
    season_std=7.8,
    games_played=55,
    minutes_avg=36.5,
    usage_rate=0.315,
)

game_env = GameEnvironment(
    projected_pace=101.5,
    league_avg_pace=99.8,
    spread=-7.5,           # Celtics favored by 7.5
    implied_total=224.5,
    league_avg_total=222.0,
    is_back_to_back=False,
)

matchup = MatchupFactor(
    dvp_rating=0.04,  # Opponent allows 4% more points to SFs than avg
    pace_allowed=0.0,
    position_rank=22,  # 22nd ranked defense vs SF
)

projection = model.project(tatum, game_env, matchup)
print(f"Jayson Tatum Points Projection:")
print(f"  Projected mean: {projection['projected_mean']}")
print(f"  Projected std:  {projection['projected_std']}")
print(f"  Adjustments: {projection['adjustments']}")

# Compare to market
value = model.find_value(
    projection=projection,
    line=27.5,
    market_over_odds=-115,
    market_under_odds=-105,
)
print(f"\nMarket Analysis:")
print(f"  Line: {value['line']}")
print(f"  Model P(Over): {value['model_p_over']:.2%}")
print(f"  Market Implied P(Over): {value['market_implied_over']:.2%}")
print(f"  EV Over:  {value['ev_over']:+.2%}")
print(f"  EV Under: {value['ev_under']:+.2%}")
print(f"  Recommendation: {value['recommended']}")

Correlation Matrix for Same-Game Parlays

import numpy as np
from typing import List


def compute_prop_correlations(
    player_game_logs: np.ndarray,
    stat_names: List[str],
) -> np.ndarray:
    """
    Compute the correlation matrix between player stats from game logs.

    Args:
        player_game_logs: Array of shape (n_games, n_stats) containing
                          per-game stat lines.
        stat_names: Names of the statistics (for labeling).

    Returns:
        Correlation matrix as numpy array.
    """
    corr_matrix = np.corrcoef(player_game_logs, rowvar=False)
    return corr_matrix


# Example: NBA stat correlations
stat_names = ["Points", "Rebounds", "Assists", "3PM", "Steals", "Minutes"]

# Hypothetical correlation matrix for a star wing player
example_corr = np.array([
    [1.00, 0.15, 0.25, 0.55, 0.10, 0.65],  # Points
    [0.15, 1.00, 0.05, -0.10, 0.12, 0.40],  # Rebounds
    [0.25, 0.05, 1.00, 0.10, 0.08, 0.50],   # Assists
    [0.55, -0.10, 0.10, 1.00, 0.05, 0.35],  # 3PM
    [0.10, 0.12, 0.08, 0.05, 1.00, 0.25],   # Steals
    [0.65, 0.40, 0.50, 0.35, 0.25, 1.00],   # Minutes
])

print("Prop Correlation Matrix:")
header = f"{'':>12}" + "".join(f"{s:>12}" for s in stat_names)
print(header)
for i, name in enumerate(stat_names):
    row = f"{name:>12}" + "".join(f"{example_corr[i,j]:>12.2f}" for j in range(len(stat_names)))
    print(row)

# Key insight: Points and 3PM are highly correlated (0.55),
# meaning an SGP combining Over Points + Over 3PM is less
# valuable than the independent odds suggest.
# Points and Rebounds are weakly correlated (0.15),
# making them better candidates for SGP combinations.
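The practical payoff of the correlation matrix is SGP pricing. A minimal sketch, modeling points and 3PM as correlated normals with the illustrative 0.55 correlation from the matrix above (treating the 3PM count as approximately normal for simplicity; the means, stds, and lines are assumptions), shows how far the true joint over probability exceeds the independent product:

```python
import numpy as np

# Illustrative marginals for a star wing: (points, threes made)
mean = np.array([27.0, 3.2])
std = np.array([7.5, 1.6])
rho = 0.55  # points/3PM correlation from the example matrix
cov = np.array([[std[0] ** 2, rho * std[0] * std[1]],
                [rho * std[0] * std[1], std[1] ** 2]])

rng = np.random.default_rng(7)
draws = rng.multivariate_normal(mean, cov, size=200_000)

line_pts, line_3pm = 26.5, 2.5
p_pts = np.mean(draws[:, 0] > line_pts)
p_3pm = np.mean(draws[:, 1] > line_3pm)
p_joint = np.mean((draws[:, 0] > line_pts) & (draws[:, 1] > line_3pm))

print(f"P(over pts)={p_pts:.3f}  P(over 3PM)={p_3pm:.3f}")
print(f"Independent product: {p_pts * p_3pm:.3f}  Joint: {p_joint:.3f}")
```

If a sportsbook prices this two-leg SGP as if the legs were independent, the parlay is underpriced; the gap between the joint probability and the product is the structural edge.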

Market Insight: Player prop markets are among the most profitable for quantitative bettors because sportsbooks set thousands of lines daily and cannot devote deep analysis to each one. The most consistent edges come from: (1) game environment adjustments that the market underweights (pace, blowout risk, rest); (2) injury-driven usage changes before the market adjusts; and (3) same-game parlay mispricing where the sportsbook overstates the correlation between negatively correlated props or understates it for positively correlated ones. Building an automated screening system that flags the highest-EV props across all games is a key competitive advantage.


22.4 Futures Market Analysis

The Structure of Futures Markets

Futures bets are wagers on outcomes that will be resolved at some point in the future --- typically weeks or months away. Common examples include:

  • Championship futures: Betting on which team will win the Super Bowl, NBA Finals, World Series, or Stanley Cup before or during the season.
  • Division/conference winners: Betting on divisional or conference outcomes.
  • Award futures: MVP, Rookie of the Year, Defensive Player of the Year, Cy Young, etc.
  • Season win totals: Over/under on a team's regular-season wins.

Futures markets differ from game-by-game markets in several fundamental ways:

  1. Time value of money: Capital locked up in a futures bet is unavailable for other wagers. A bet placed before the NFL season starts in September will not resolve until early February, tying up capital for approximately five months.

  2. Higher margins: Sportsbooks charge wider margins on futures because the long settlement period increases their cost of carrying the risk.

  3. Evolving information: The true probability of a futures outcome changes continuously as the season unfolds (injuries, trades, form changes), creating opportunities for bettors who can model these changes faster than the market.

  4. Hedging possibilities: A bettor holding a valuable futures position can lock in profit by hedging against the position later in the season when the odds have moved in their favor.
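The opportunity cost in point 1 can be quantified by comparing expected bankroll growth over the same holding window. A back-of-envelope sketch (all EV and turnover figures are assumptions for illustration):

```python
# A futures bet with +8% EV ties up capital for five months; the same
# bankroll could instead be recycled through event bets at +2% EV each,
# turned over twice a month.
futures_ev = 0.08          # Expected return over the 5-month hold
hold_months = 5

event_ev = 0.02            # Per-bet expected return
turnovers_per_month = 2    # Times the same capital is recycled monthly

# Expected growth of $1 over the window under each use of capital
futures_growth = 1 + futures_ev
event_growth = (1 + event_ev) ** (turnovers_per_month * hold_months)

print(f"Futures: {futures_growth:.3f}x   Event betting: {event_growth:.3f}x")
```

Under these assumptions the recycled event betting compounds to more than the futures position returns, so a futures bet must clear a materially higher EV hurdle than its headline edge suggests.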

Implied Probability Extraction

The first step in futures analysis is extracting implied probabilities from the market prices and removing the overround (the sportsbook's margin).

For a market with $n$ possible outcomes at American odds $o_1, o_2, \ldots, o_n$, the raw implied probabilities are:

$$\hat{p}_i = \begin{cases} \frac{|o_i|}{|o_i| + 100} & \text{if } o_i < 0 \\ \frac{100}{o_i + 100} & \text{if } o_i > 0 \end{cases}$$

The overround is $\Sigma = \sum_{i=1}^{n} \hat{p}_i - 1$. For NFL Super Bowl futures, the overround is typically 20--40%, meaning the sum of all team implied probabilities is 1.20 to 1.40.

Removing the overround to obtain true implied probabilities requires an assumption about how the sportsbook distributes its margin. The simplest method is multiplicative normalization:

$$p_i^{\text{true}} = \frac{\hat{p}_i}{\sum_{j=1}^{n} \hat{p}_j}$$

This assumes the sportsbook applies the margin proportionally to all outcomes. In practice, sportsbooks tend to apply more margin to long shots (the "favorite-longshot bias"), so more sophisticated methods (Shin's method, power method) may produce better estimates.

Value Identification in Futures

Identifying value in futures requires a model that produces independent probability estimates for each outcome. For championship futures, this is typically a simulation model:

  1. Simulate the remaining regular season (or the entire season if pre-season).
  2. Determine playoff seedings.
  3. Simulate the playoffs.
  4. Record which team wins the championship.
  5. Repeat thousands of times.

The model's win probabilities are then compared to the market-implied probabilities. If the model assigns a team a 12% championship probability and the market-implied probability (after vig removal) is 8%, the team represents value.
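Steps 1--5 can be sketched for a simplified single-elimination bracket with hypothetical power ratings (a full model would also simulate the remaining regular season and the resulting seedings):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical power ratings (points above average) for an 8-team bracket
ratings = {"A": 6.0, "B": 4.5, "C": 3.0, "D": 2.0,
           "E": 1.0, "F": 0.0, "G": -1.0, "H": -2.0}

def win_prob(r_a: float, r_b: float, scale: float = 9.0) -> float:
    # Logistic mapping from rating gap to single-game win probability
    return 1.0 / (1.0 + 10 ** (-(r_a - r_b) / scale))

def simulate_bracket(seeds: list) -> str:
    # Single elimination: 1v8, 2v7, ...; winners re-paired each round
    field = list(seeds)
    while len(field) > 1:
        nxt = []
        for i in range(len(field) // 2):
            a, b = field[i], field[-1 - i]
            nxt.append(a if rng.random() < win_prob(ratings[a], ratings[b])
                       else b)
        field = nxt
    return field[0]

n_sims = 50_000
titles = {t: 0 for t in ratings}
for _ in range(n_sims):
    titles[simulate_bracket(list(ratings))] += 1

champ_probs = {t: c / n_sims for t, c in titles.items()}
print({t: round(p, 3) for t, p in champ_probs.items()})
```

Dividing title counts by the number of simulations yields the model probabilities that are compared against the vig-removed market probabilities in the value step above.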

Entry timing is critical for futures. The optimal time to place a futures bet is when the market has not yet incorporated information that the model has. Pre-season futures often have the widest margins but also the most uncertainty. In-season futures have tighter margins and less uncertainty, but the market is also more efficient because it has more data. The sweet spot is often early in the season, when a few weeks of data have significantly updated the model but the market is still partially anchored to pre-season assessments.

Hedging Futures Positions

A bettor who placed a $100 bet on a team at +2000 (implied 4.76%) before the season and that team reaches the championship game at +130 (implied 43.5%) has a position worth far more than its original cost. They can:

  1. Let it ride: The expected value is still positive if the model gives the team more than 43.5% in the final.
  2. Hedge for guaranteed profit: Bet on the opponent to lock in a profit regardless of outcome.
  3. Partial hedge: Bet a smaller amount on the opponent to guarantee some profit while maintaining upside.

The hedge calculation:

$$\text{Hedge bet} = \frac{\text{Futures payout}}{\text{Opponent decimal odds}}$$

If the futures bet pays $2,100 (including stake) on a win, and the opponent is +130 (2.30 decimal), the full hedge bet is $2,100 / 2.30 ≈ $913. This guarantees a profit of approximately $2,100 - $913 - $100 = $1,087 if the team wins, or $913 × 1.30 - $100 ≈ $1,087 if the opponent wins.

Python Implementation: Futures Analysis

import numpy as np
from typing import Dict, List


class FuturesAnalyzer:
    """
    Analyzes and identifies value in futures betting markets.

    Extracts implied probabilities, removes vig, compares to model
    probabilities, and calculates optimal hedging strategies.
    """

    def __init__(self):
        pass

    @staticmethod
    def american_to_implied(odds: int) -> float:
        """Convert American odds to raw implied probability."""
        if odds > 0:
            return 100.0 / (odds + 100.0)
        else:
            return abs(odds) / (abs(odds) + 100.0)

    @staticmethod
    def american_to_decimal(odds: int) -> float:
        """Convert American odds to decimal odds."""
        if odds > 0:
            return 1.0 + odds / 100.0
        else:
            return 1.0 + 100.0 / abs(odds)

    def extract_implied_probabilities(
        self,
        odds_dict: Dict[str, int],
        method: str = "multiplicative",
    ) -> Dict[str, Dict[str, float]]:
        """
        Extract implied probabilities from a futures market.

        Args:
            odds_dict: Maps team/player name to American odds.
            method: Vig removal method ('multiplicative', 'power', or 'shin').

        Returns:
            Dictionary with raw and adjusted probabilities for each outcome.
        """
        raw_probs = {
            name: self.american_to_implied(odds)
            for name, odds in odds_dict.items()
        }
        total = sum(raw_probs.values())
        overround = total - 1.0

        if method == "multiplicative":
            adjusted = {
                name: prob / total for name, prob in raw_probs.items()
            }
        elif method == "power":
            # Power method: find exponent k such that sum(p_i^k) = 1
            # Binary search for k
            lo, hi = 0.5, 2.0
            for _ in range(100):
                mid = (lo + hi) / 2
                s = sum(p ** mid for p in raw_probs.values())
                if s > 1.0:
                    lo = mid
                else:
                    hi = mid
            adjusted = {
                name: prob ** ((lo + hi) / 2)
                for name, prob in raw_probs.items()
            }
            # Renormalize
            adj_total = sum(adjusted.values())
            adjusted = {k: v / adj_total for k, v in adjusted.items()}
        elif method == "shin":
            # Shin's method: assumes insider trading creates the margin
            # z is the proportion of insider money
            # Iterative solution
            n = len(raw_probs)
            z = (total - 1) / (n - 1)  # Approximate insider proportion z
            adjusted = {}
            for name, raw_p in raw_probs.items():
                adj_p = (
                    (z ** 2 + 4 * (1 - z) * raw_p ** 2 / total) ** 0.5 - z
                ) / (2 * (1 - z))
                adjusted[name] = adj_p
            # Renormalize
            adj_total = sum(adjusted.values())
            adjusted = {k: v / adj_total for k, v in adjusted.items()}
        else:
            raise ValueError(f"Unknown method: {method}")

        results = {}
        for name in odds_dict:
            results[name] = {
                "american_odds": odds_dict[name],
                "decimal_odds": round(self.american_to_decimal(odds_dict[name]), 3),
                "raw_implied": round(raw_probs[name], 4),
                "adjusted_implied": round(adjusted[name], 4),
            }

        return {
            "outcomes": results,
            "overround": round(overround, 4),
            "overround_pct": f"{overround:.1%}",
            "method": method,
        }

    def identify_value(
        self,
        market_data: Dict,
        model_probabilities: Dict[str, float],
        min_edge: float = 0.02,
    ) -> List[Dict]:
        """
        Compare model probabilities to market-implied and flag value bets.

        Args:
            market_data: Output from extract_implied_probabilities().
            model_probabilities: Model's probability for each outcome.
            min_edge: Minimum edge threshold to flag as value.

        Returns:
            List of value bets sorted by edge (descending).
        """
        value_bets = []

        for name, data in market_data["outcomes"].items():
            model_prob = model_probabilities.get(name, 0)
            market_prob = data["adjusted_implied"]
            decimal_odds = data["decimal_odds"]

            # Expected value: model_prob * decimal_odds - 1
            ev = model_prob * decimal_odds - 1.0

            edge = model_prob - market_prob

            if edge > min_edge:
                value_bets.append({
                    "outcome": name,
                    "model_prob": round(model_prob, 4),
                    "market_prob": round(market_prob, 4),
                    "edge": round(edge, 4),
                    "american_odds": data["american_odds"],
                    "decimal_odds": decimal_odds,
                    "expected_value": round(ev, 4),
                    "ev_pct": f"{ev:.1%}",
                    "kelly_fraction": round(
                        max(0, (model_prob * decimal_odds - 1) / (decimal_odds - 1)),
                        4,
                    ),
                })

        return sorted(value_bets, key=lambda x: -x["edge"])

    def calculate_hedge(
        self,
        original_stake: float,
        original_odds: int,
        current_opponent_odds: int,
        hedge_type: str = "full",
    ) -> Dict:
        """
        Calculate hedging strategy for an existing futures position.

        Args:
            original_stake: Amount of the original futures bet.
            original_odds: American odds of the original bet.
            current_opponent_odds: Current American odds on the opponent.
            hedge_type: 'full' for guaranteed equal profit, 'partial'
                       for 50% hedge.

        Returns:
            Dictionary with hedge bet amount and guaranteed profits.
        """
        original_decimal = self.american_to_decimal(original_odds)
        opponent_decimal = self.american_to_decimal(current_opponent_odds)

        # Potential payout if original bet wins (profit only)
        original_payout = original_stake * original_decimal
        original_profit = original_payout - original_stake

        if hedge_type == "full":
            # Full hedge: equalize profit regardless of outcome
            # If original wins: original_payout - original_stake - hedge_bet
            # If opponent wins: hedge_bet * opponent_decimal - hedge_bet - original_stake
            # Set equal: original_profit - hedge = hedge * (opponent_decimal - 1) - original_stake
            # original_profit - hedge = hedge * opponent_decimal - hedge - original_stake
            # original_profit + original_stake = hedge * opponent_decimal
            # hedge = (original_profit + original_stake) / opponent_decimal
            hedge_bet = original_payout / opponent_decimal

            profit_if_original_wins = original_payout - original_stake - hedge_bet
            profit_if_opponent_wins = (
                hedge_bet * opponent_decimal - original_stake - hedge_bet
            )

        elif hedge_type == "partial":
            # Partial (50%) hedge
            hedge_bet = original_payout / opponent_decimal * 0.5

            profit_if_original_wins = original_payout - original_stake - hedge_bet
            profit_if_opponent_wins = (
                hedge_bet * opponent_decimal - original_stake - hedge_bet
            )
        else:
            raise ValueError(f"Unknown hedge type: {hedge_type}")

        return {
            "original_stake": original_stake,
            "original_odds": original_odds,
            "original_potential_profit": round(original_profit, 2),
            "hedge_bet": round(hedge_bet, 2),
            "opponent_odds": current_opponent_odds,
            "profit_if_original_wins": round(profit_if_original_wins, 2),
            "profit_if_opponent_wins": round(profit_if_opponent_wins, 2),
            "guaranteed_minimum": round(
                min(profit_if_original_wins, profit_if_opponent_wins), 2
            ),
            "hedge_type": hedge_type,
        }


# --- Worked Example: NFL Super Bowl Futures ---
analyzer = FuturesAnalyzer()

# Hypothetical Super Bowl futures market
futures_odds = {
    "Kansas City Chiefs": +350,
    "Buffalo Bills": +600,
    "Baltimore Ravens": +800,
    "San Francisco 49ers": +900,
    "Philadelphia Eagles": +1000,
    "Detroit Lions": +1200,
    "Dallas Cowboys": +1500,
    "Miami Dolphins": +2000,
    "Cincinnati Bengals": +2500,
    "Green Bay Packers": +3000,
    "Other (field)": +400,
}

market = analyzer.extract_implied_probabilities(futures_odds, method="multiplicative")
print(f"Market Overround: {market['overround_pct']}")
print(f"\n{'Team':<25} {'Odds':>8} {'Raw Impl':>10} {'Adj Impl':>10}")
print("-" * 55)
for name, data in market["outcomes"].items():
    print(f"{name:<25} {data['american_odds']:>+8} "
          f"{data['raw_implied']:>10.2%} {data['adjusted_implied']:>10.2%}")

# Model probabilities (from a simulation model)
model_probs = {
    "Kansas City Chiefs": 0.16,
    "Buffalo Bills": 0.11,
    "Baltimore Ravens": 0.09,
    "San Francisco 49ers": 0.10,  # Model likes SF more than market
    "Philadelphia Eagles": 0.08,
    "Detroit Lions": 0.07,
    "Dallas Cowboys": 0.04,
    "Miami Dolphins": 0.03,
    "Cincinnati Bengals": 0.03,
    "Green Bay Packers": 0.02,
    "Other (field)": 0.27,
}

value_bets = analyzer.identify_value(market, model_probs, min_edge=0.01)
print(f"\nValue Bets Identified:")
for bet in value_bets:
    print(f"  {bet['outcome']}: Model {bet['model_prob']:.1%} vs "
          f"Market {bet['market_prob']:.1%} | Edge: {bet['edge']:.1%} | "
          f"EV: {bet['ev_pct']}")

# Hedging example
hedge = analyzer.calculate_hedge(
    original_stake=100,
    original_odds=2000,   # Bet $100 on a team at +2000 before season
    current_opponent_odds=130,  # Team made the final, opponent is +130
    hedge_type="full",
)
print(f"\nHedging Strategy:")
print(f"  Original bet: ${hedge['original_stake']} at {hedge['original_odds']:+d}")
print(f"  Hedge bet: ${hedge['hedge_bet']:.2f} on opponent at "
      f"{hedge['opponent_odds']:+d}")
print(f"  Profit if original wins: ${hedge['profit_if_original_wins']:.2f}")
print(f"  Profit if opponent wins: ${hedge['profit_if_opponent_wins']:.2f}")
print(f"  Guaranteed minimum: ${hedge['guaranteed_minimum']:.2f}")

Market Insight: Futures markets are most inefficient at two points in time: (1) pre-season, when sportsbooks are pricing based on public perception and limited preseason information, and (2) immediately after major roster changes (trades, injuries) that significantly alter a team's championship probability but before the market fully adjusts. The time value of capital is a real cost --- tying up $500 in a futures bet for 5 months has an opportunity cost --- so the edge must be large enough to compensate. A useful rule of thumb: futures bets should have at least double the minimum edge threshold of game-by-game bets to account for the time value and wider margins.
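The opportunity-cost point can be made concrete. The sketch below is a hypothetical helper, not part of the chapter's FuturesAnalyzer; the 5% annual hurdle rate is an assumption standing in for whatever return your bankroll could earn in game-by-game markets. It nets the foregone return on locked-up capital against the futures bet's expected value:

```python
def annualized_ev(model_prob: float, american_odds: int,
                  months_held: float, annual_hurdle: float = 0.05) -> dict:
    """Net a futures bet's EV against the opportunity cost of locked capital.

    annual_hurdle is an assumed alternative return on bankroll (e.g. from
    game-by-game betting); tune it to your own situation.
    """
    decimal = 1 + american_odds / 100 if american_odds > 0 else 1 + 100 / abs(american_odds)
    ev = model_prob * decimal - 1                      # EV per $1 staked
    capital_cost = annual_hurdle * (months_held / 12)  # foregone return while locked up
    return {
        "ev": round(ev, 4),
        "capital_cost": round(capital_cost, 4),
        "net_ev": round(ev - capital_cost, 4),
        "bet_clears_hurdle": ev - capital_cost > 0,
    }

# $100 on a team at +2000 with a 6% model probability, held 5 months
result = annualized_ev(0.06, 2000, months_held=5)
```

At these numbers the raw EV is 26%, and even after deducting roughly 2% of capital cost the bet comfortably clears the hurdle; a marginal 3--4% futures edge would not.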


22.5 Niche Sports and Thin Markets

The Case for Niche Specialization

The logic of niche specialization in sports betting is the same as in any market: when fewer participants analyze a market, the informational efficiency of prices decreases, and the opportunities for informed participants increase.

Consider the contrast. The NFL Sunday main slate features billions of dollars in handle, thousands of professional bettors with sophisticated models, and real-time odds adjustment by market-making sportsbooks. The odds are set by the sharpest minds in the industry, and edges, if they exist, are measured in fractions of a percentage point.

Now consider a Tuesday afternoon table tennis match in the Russian Liga Pro, a professional darts Premier League match, or a regular-season handball game in the Danish league. These markets receive a tiny fraction of the handle, sportsbooks allocate minimal resources to pricing them, and the number of sharp bettors specializing in these sports is orders of magnitude smaller. The lines are set by algorithms with limited data, or by low-level traders using simple heuristics.

The niche bettor who builds even a modestly sophisticated model for these markets can find edges of 5--10% or more --- edges that would be unimaginable in the NFL.

Data Scarcity Challenges

The primary challenge in niche markets is data. While NFL modelers have access to decades of play-by-play data, granular injury reports, and extensive pre-game analytics, a darts modeler may have access to:

  • Match results (win/loss)
  • Checkout percentages
  • 180s thrown
  • Three-dart averages

And that may be the entirety of available data for lower-tier events.

Strategies for overcoming data scarcity:

  1. Transfer learning from related sports: Elo rating systems are sport-agnostic. A combat sports Elo system can be adapted for darts, snooker, or table tennis with minimal modification (just the K-factor and initialization).

  2. Feature engineering from limited data: Even with just match results, you can compute Elo ratings, winning streaks, head-to-head records, and home/away splits. These features, combined properly, can produce a useful model.

  3. Bayesian methods with informative priors: When data is scarce, Bayesian methods allow you to incorporate prior knowledge. If you know that player X just won the junior world championship, you can set a prior that their skill is above the average professional, even before observing any senior results.

  4. Proxy metrics: Metrics that are easy to observe (like three-dart average in darts, or break rate in snooker) can serve as proxies for underlying ability when more sophisticated metrics are unavailable.
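Strategy 3 can be illustrated with the simplest useful Bayesian model: a Beta-binomial estimate of a new player's win rate, where the prior encodes what you believe before observing any senior results. The function and parameter values below are illustrative assumptions, not fitted values:

```python
def posterior_win_rate(wins: int, losses: int,
                       prior_mean: float = 0.5, prior_strength: float = 10.0) -> float:
    """Beta-binomial shrinkage estimate of a player's win rate.

    prior_mean encodes prior knowledge (e.g. 0.60 for a decorated junior
    entering the senior tour); prior_strength is the weight of that prior,
    expressed in pseudo-matches. Both should be tuned per sport.
    """
    alpha = prior_mean * prior_strength + wins
    beta = (1 - prior_mean) * prior_strength + losses
    return alpha / (alpha + beta)

# A junior champion with only 4 senior results (3-1): the raw rate is
# 0.75, but the estimate is shrunk toward the 0.60 prior (to about 0.64).
est = posterior_win_rate(3, 1, prior_mean=0.60, prior_strength=10)
```

As senior results accumulate, the data term dominates and the prior's influence fades, which is exactly the behavior you want in a data-scarce market.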

Model Transferability

One of the most powerful approaches to niche market modeling is model transferability --- adapting a modeling framework developed for one sport to another sport with similar structure.

Examples of transferable frameworks:

| Source Sport/Model | Target Sport | What Transfers | What Must Be Adapted |
|---|---|---|---|
| Tennis Elo (surface-specific) | Table tennis | Rating system, matchup structure | K-factor; surface irrelevant; match format |
| MMA combat Elo | Boxing | Rating system, physical attributes | Scoring system, round structure, finish types |
| Baseball SG (park factors) | Cricket (batting) | Venue adjustment framework | Sport-specific metrics |
| NBA prop model | Handball player props | Projection framework, game environment | Different stat distributions, pace calculation |
| CS2 map Elo | Valorant map Elo | Map-specific rating, veto modeling | Different map pool, agent-based considerations |

The key insight is that the structure of the model transfers even when the parameters must be re-estimated. A surface-specific tennis Elo system and a table-specific snooker Elo system share the same mathematical structure; only the K-factors, blending weights, and initialization differ.

Finding Soft Lines

In thin markets, sportsbooks often set lines algorithmically using simple models or by reacting to limits in other, larger markets. This creates several exploitable patterns:

  1. Slow adjustment to roster/form changes: In niche sports, sportsbooks may not track player form as closely. A darts player who has been averaging 100+ in recent tournaments may still be priced based on their season-long average of 95.

  2. Incorrect head-to-head pricing: In individual sports with strong head-to-head effects (one player historically dominates another), thin markets may not incorporate this information.

  3. Copy-paste from stale sources: Some sportsbooks use third-party odds feeds for niche markets and do not adjust them locally. If the feed is slow to update, the bettor has a window of opportunity.

  4. Correlated markets mispricing: In niche team sports (handball, volleyball), the correlation between the moneyline and the total may not be properly modeled, creating value in parlays or derivative markets.
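Pattern 3 suggests a simple monitoring tool: strip the vig from a sharp reference book and from the suspected slow book, and flag large gaps between the two fair prices. The sketch below is a hypothetical heuristic; the 4% threshold and the example odds are assumptions, and a flagged gap is a prompt for investigation, not a guarantee of value:

```python
def no_vig_prob(odds_a: int, odds_b: int) -> float:
    """Vig-free probability of side A from a two-way market (multiplicative method)."""
    def implied(o: int) -> float:
        return 100 / (o + 100) if o > 0 else abs(o) / (abs(o) + 100)
    pa, pb = implied(odds_a), implied(odds_b)
    return pa / (pa + pb)

def flag_stale(sharp_a: int, sharp_b: int, slow_a: int, slow_b: int,
               threshold: float = 0.04) -> dict:
    """Compare a slow book's fair price to a sharper reference book's.

    A gap above `threshold` suggests the slow book has not yet adjusted.
    """
    sharp_p = no_vig_prob(sharp_a, sharp_b)
    slow_p = no_vig_prob(slow_a, slow_b)
    gap = sharp_p - slow_p
    return {
        "sharp_prob_a": round(sharp_p, 4),
        "slow_prob_a": round(slow_p, 4),
        "gap": round(gap, 4),
        "stale": abs(gap) >= threshold,
    }

# Sharp book has moved to -180/+150; a third-party feed still shows -135/+115
alert = flag_stale(-180, 150, -135, 115)
```

Run against periodic odds snapshots, this catches the copy-paste and slow-adjustment patterns above before the feed catches up.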

Python Implementation: Thin Market Rating System

import math
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple
from datetime import date


@dataclass
class NichePlayer:
    """Generic player for thin market modeling."""

    name: str
    rating: float = 1500.0
    rating_deviation: float = 200.0  # High initial uncertainty
    matches: int = 0
    last_match_date: Optional[date] = None
    sport_specific_stats: Dict[str, float] = field(default_factory=dict)


class ThinMarketModel:
    """
    Versatile rating and prediction system for niche sports.

    Designed for data-scarce environments with:
    - Adaptive K-factor based on rating confidence
    - Bayesian-inspired initialization from limited information
    - Head-to-head adjustment tracking
    - Market comparison and value identification
    """

    def __init__(
        self,
        sport: str = "generic",
        base_k: float = 40.0,
        rd_decay_per_day: float = 0.5,
        max_rd: float = 350.0,
        min_rd: float = 30.0,
        h2h_weight: float = 0.10,
        min_h2h_matches: int = 3,
    ):
        self.sport = sport
        self.base_k = base_k
        self.rd_decay_per_day = rd_decay_per_day
        self.max_rd = max_rd
        self.min_rd = min_rd
        self.h2h_weight = h2h_weight
        self.min_h2h_matches = min_h2h_matches
        self.players: Dict[str, NichePlayer] = {}
        self.h2h_records: Dict[Tuple[str, str], Dict] = {}

    def add_player(
        self,
        name: str,
        initial_rating: float = 1500.0,
        initial_rd: float = 200.0,
        stats: Optional[Dict[str, float]] = None,
    ) -> NichePlayer:
        """Add a player with optional sport-specific stats for initialization."""
        player = NichePlayer(
            name=name,
            rating=initial_rating,
            rating_deviation=initial_rd,
            sport_specific_stats=stats or {},
        )
        self.players[name] = player
        return player

    def _update_rd(self, player: NichePlayer, current_date: date) -> None:
        """Increase rating deviation based on inactivity."""
        if player.last_match_date is None:
            return
        days = (current_date - player.last_match_date).days
        if days > 0:
            new_rd = min(
                self.max_rd,
                math.sqrt(player.rating_deviation ** 2 + (self.rd_decay_per_day * days) ** 2),
            )
            player.rating_deviation = new_rd

    def _adaptive_k(self, player: NichePlayer) -> float:
        """K-factor proportional to rating uncertainty."""
        # Higher RD = higher K (more responsive when uncertain)
        rd_factor = player.rating_deviation / 100.0
        experience_factor = max(0.5, 2.0 - player.matches / 20)
        return self.base_k * rd_factor * experience_factor

    def _get_h2h_adjustment(self, name_a: str, name_b: str) -> float:
        """
        Get head-to-head adjustment if sufficient history exists.

        Returns adjustment to add to player A's log-odds.
        """
        key = (name_a, name_b)
        reverse_key = (name_b, name_a)

        if key in self.h2h_records:
            record = self.h2h_records[key]
            a_wins = record.get("a_wins", 0)
        elif reverse_key in self.h2h_records:
            # Stored from the opposite perspective: B's wins there are A's wins here
            record = self.h2h_records[reverse_key]
            a_wins = record.get("b_wins", 0)
        else:
            return 0.0

        total = record.get("total", 0)
        if total < self.min_h2h_matches:
            return 0.0

        h2h_rate = (a_wins + 0.5) / (total + 1)  # Laplace smoothing
        return self.h2h_weight * math.log(h2h_rate / (1 - h2h_rate))

    def predict(
        self,
        name_a: str,
        name_b: str,
        match_date: Optional[date] = None,
    ) -> Dict:
        """
        Generate win probability prediction.

        Combines Elo-based prediction with head-to-head adjustment
        and reports confidence based on rating deviations.
        """
        pa = self.players.get(name_a)
        pb = self.players.get(name_b)

        if pa is None or pb is None:
            raise ValueError("Both players must exist in the system.")

        # Update RDs for inactivity
        if match_date:
            self._update_rd(pa, match_date)
            self._update_rd(pb, match_date)

        # Base Elo probability
        elo_prob_a = 1.0 / (1.0 + 10.0 ** ((pb.rating - pa.rating) / 400.0))

        # H2H adjustment
        h2h_adj = self._get_h2h_adjustment(name_a, name_b)

        # Apply on log-odds scale
        log_odds = math.log(elo_prob_a / (1 - elo_prob_a)) + h2h_adj
        adjusted_prob_a = 1.0 / (1.0 + math.exp(-log_odds))

        # Confidence measure: lower combined RD = higher confidence
        combined_rd = math.sqrt(pa.rating_deviation ** 2 + pb.rating_deviation ** 2)
        confidence = max(0, min(1, 1 - combined_rd / 500))

        return {
            "player_a": name_a,
            "player_b": name_b,
            "rating_a": round(pa.rating, 1),
            "rating_b": round(pb.rating, 1),
            "rd_a": round(pa.rating_deviation, 1),
            "rd_b": round(pb.rating_deviation, 1),
            "elo_prob_a": round(elo_prob_a, 4),
            "h2h_adjustment": round(h2h_adj, 4),
            "adjusted_prob_a": round(adjusted_prob_a, 4),
            "adjusted_prob_b": round(1 - adjusted_prob_a, 4),
            "confidence": round(confidence, 3),
        }

    def find_market_value(
        self,
        name_a: str,
        name_b: str,
        market_odds_a: int,
        market_odds_b: int,
        match_date: Optional[date] = None,
        min_edge: float = 0.03,
        min_confidence: float = 0.3,
    ) -> Dict:
        """
        Compare model prediction to market odds and identify value.

        Only recommends bets when both the edge exceeds the minimum
        and the model confidence exceeds the minimum.
        """
        prediction = self.predict(name_a, name_b, match_date)

        def to_implied(odds: int) -> float:
            if odds > 0:
                return 100 / (odds + 100)
            return abs(odds) / (abs(odds) + 100)

        def to_decimal(odds: int) -> float:
            if odds > 0:
                return 1 + odds / 100
            return 1 + 100 / abs(odds)

        implied_a = to_implied(market_odds_a)
        implied_b = to_implied(market_odds_b)
        decimal_a = to_decimal(market_odds_a)
        decimal_b = to_decimal(market_odds_b)

        # Remove vig (multiplicative)
        total_implied = implied_a + implied_b
        fair_implied_a = implied_a / total_implied
        fair_implied_b = implied_b / total_implied

        ev_a = prediction["adjusted_prob_a"] * decimal_a - 1
        ev_b = prediction["adjusted_prob_b"] * decimal_b - 1

        edge_a = prediction["adjusted_prob_a"] - fair_implied_a
        edge_b = prediction["adjusted_prob_b"] - fair_implied_b

        # Determine recommendation
        recommendation = "NO BET"
        if prediction["confidence"] >= min_confidence:
            if edge_a >= min_edge and ev_a > 0:
                recommendation = f"BET {name_a}"
            elif edge_b >= min_edge and ev_b > 0:
                recommendation = f"BET {name_b}"

        return {
            "prediction": prediction,
            "market_odds": {name_a: market_odds_a, name_b: market_odds_b},
            "market_vig": round(total_implied - 1, 4),
            "fair_implied": {
                name_a: round(fair_implied_a, 4),
                name_b: round(fair_implied_b, 4),
            },
            "edge": {name_a: round(edge_a, 4), name_b: round(edge_b, 4)},
            "ev": {name_a: round(ev_a, 4), name_b: round(ev_b, 4)},
            "recommendation": recommendation,
        }

    def update(
        self,
        winner_name: str,
        loser_name: str,
        match_date: date,
    ) -> Dict:
        """Update ratings after a match result."""
        pw = self.players[winner_name]
        pl = self.players[loser_name]

        # Update RDs for time since last match
        self._update_rd(pw, match_date)
        self._update_rd(pl, match_date)

        pre_w, pre_l = pw.rating, pl.rating
        expected_w = 1.0 / (1.0 + 10.0 ** ((pl.rating - pw.rating) / 400.0))

        k_w = self._adaptive_k(pw)
        k_l = self._adaptive_k(pl)

        pw.rating += k_w * (1.0 - expected_w)
        pl.rating += k_l * (0.0 - (1.0 - expected_w))

        # Reduce RD (more confident after observing a result)
        pw.rating_deviation = max(self.min_rd, pw.rating_deviation * 0.92)
        pl.rating_deviation = max(self.min_rd, pl.rating_deviation * 0.92)

        # Update match counts and dates
        pw.matches += 1
        pl.matches += 1
        pw.last_match_date = match_date
        pl.last_match_date = match_date

        # Update H2H records
        key = (winner_name, loser_name)
        reverse_key = (loser_name, winner_name)
        if key in self.h2h_records:
            self.h2h_records[key]["a_wins"] += 1
            self.h2h_records[key]["total"] += 1
        elif reverse_key in self.h2h_records:
            self.h2h_records[reverse_key]["b_wins"] += 1
            self.h2h_records[reverse_key]["total"] += 1
        else:
            self.h2h_records[key] = {"a_wins": 1, "b_wins": 0, "total": 1}

        return {
            "winner": winner_name,
            "loser": loser_name,
            "pre_ratings": (round(pre_w, 1), round(pre_l, 1)),
            "post_ratings": (round(pw.rating, 1), round(pl.rating, 1)),
            "k_factors": (round(k_w, 1), round(k_l, 1)),
        }


# --- Worked Example: Professional Darts ---
darts = ThinMarketModel(
    sport="darts",
    base_k=40,
    h2h_weight=0.12,
    min_h2h_matches=3,
)

# Add players with initial estimates
darts.add_player("Luke Humphries", initial_rating=1750, initial_rd=60,
                 stats={"three_dart_avg": 99.5})
darts.add_player("Luke Littler", initial_rating=1700, initial_rd=80,
                 stats={"three_dart_avg": 98.2})
darts.add_player("Michael van Gerwen", initial_rating=1680, initial_rd=70,
                 stats={"three_dart_avg": 97.8})

# Simulate some results to build history
results = [
    ("Luke Humphries", "Luke Littler", date(2025, 1, 10)),
    ("Luke Littler", "Luke Humphries", date(2025, 2, 15)),
    ("Luke Humphries", "Luke Littler", date(2025, 3, 20)),
    ("Luke Humphries", "Michael van Gerwen", date(2025, 4, 5)),
    ("Luke Littler", "Michael van Gerwen", date(2025, 5, 10)),
]

for winner, loser, d in results:
    result = darts.update(winner, loser, d)
    print(f"{d}: {winner} def. {loser}")
    print(f"  Ratings: {result['post_ratings']}, K-factors: {result['k_factors']}")

# Market analysis
print("\n--- Market Value Analysis ---")
value = darts.find_market_value(
    "Luke Humphries", "Luke Littler",
    market_odds_a=-150,
    market_odds_b=+125,
    match_date=date(2025, 6, 1),
    min_edge=0.03,
)
print(f"Model prob: Humphries {value['prediction']['adjusted_prob_a']:.2%} vs "
      f"Littler {value['prediction']['adjusted_prob_b']:.2%}")
print(f"Market implied (fair): Humphries {value['fair_implied']['Luke Humphries']:.2%} vs "
      f"Littler {value['fair_implied']['Luke Littler']:.2%}")
print(f"Edges: Humphries {value['edge']['Luke Humphries']:+.2%}, "
      f"Littler {value['edge']['Luke Littler']:+.2%}")
print(f"Confidence: {value['prediction']['confidence']:.3f}")
print(f"Recommendation: {value['recommendation']}")

Risk Management in Thin Markets

Thin markets present unique risk management challenges that must be addressed:

1. Lower betting limits. Sportsbooks set much lower limits on niche sports, often $100--$500 per bet compared to $5,000--$50,000+ on major sports. This caps the absolute profit potential but does not diminish the edge percentage.

2. Account scrutiny. Because limits are low and the bettor pool is small, consistent winners in niche markets are identified and limited quickly. Diversifying across multiple sportsbooks and mixing recreational bets with sharp bets can extend account longevity.

3. Result verification. For some niche sports, result verification is slower and less reliable. Match fixing is a larger concern in lower-tier competitions where players are poorly paid. Monitoring for suspicious line movements and avoiding excessively obscure events reduces exposure to integrity issues.

4. Volatility. Small sample sizes mean higher variance in observed results. A bettor who is genuinely profitable at a 5% edge in niche darts may have a losing month simply due to variance, and the limited bet volume makes it harder to distinguish skill from luck.

5. Liquidity risk. Sportsbooks may void bets, change rules, or remove markets for niche sports with less notice than for major sports. Always read the specific rules for each market and sportsbook.
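The volatility point (item 4) can be quantified with a normal approximation. For flat-stake bets at decimal odds d with true win probability p, per-bet profit has mean p*d - 1 and standard deviation d*sqrt(p(1-p)); the number of bets needed before the cumulative mean clears zero at a given confidence follows directly. The function below is a rough sketch under assumed independent, identically sized bets:

```python
import math

def bets_to_distinguish(p_true: float, decimal_odds: float, z: float = 1.645) -> int:
    """Flat-stake bets needed before a true edge likely shows above variance.

    Normal approximation: per-bet profit has mean p*d - 1 and std dev
    d*sqrt(p(1-p)); n is where the cumulative mean clears zero at one-sided
    confidence z. A rough sketch, not a bankroll plan.
    """
    mu = p_true * decimal_odds - 1
    sigma = decimal_odds * math.sqrt(p_true * (1 - p_true))
    return math.ceil((z * sigma / mu) ** 2)

# A genuine 5% edge at even odds (p = 0.525, d = 2.0) needs on the order
# of a thousand bets before results separate from luck at 95% confidence.
n = bets_to_distinguish(0.525, 2.0)
```

At the low bet volumes typical of niche markets, this can mean a year or more of betting, which is why a losing month tells you almost nothing about whether the edge is real.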

Market Insight: The most successful niche sports bettors tend to combine deep domain knowledge of a specific sport with transferable quantitative skills. A former competitive table tennis player who learns to build an Elo system has a dual advantage: they understand the nuances of the sport (playing styles, equipment effects, mental factors) that pure quants miss, and they have the mathematical framework to convert that knowledge into systematic predictions. The ideal niche market is one where you have domain expertise, sufficient data to build at least a basic model, and enough betting volume to make the time investment worthwhile.


22.6 Chapter Summary

This chapter surveyed the landscape of emerging and alternative betting markets, providing quantitative frameworks for each major category.

Key concepts and methods covered:

  1. Esports Betting and Modeling (Section 22.1): We developed game-specific modeling approaches for the three largest esports titles (CS2, LoL, Dota 2), emphasizing map-specific Elo ratings, map veto modeling for series predictions, and the unique challenges of roster changes and patch effects. Esports markets are structurally less efficient than traditional sports, particularly at lower tiers, creating opportunities for models that properly account for map pools and team composition.

  2. Golf Tournament Modeling (Section 22.2): The strokes gained framework decomposes golfer ability into four components (off the tee, approach, around the green, putting), enabling course-fit analysis that matches a golfer's skill profile to specific course demands. Monte Carlo tournament simulation produces probability distributions for all finish positions, supporting outright winner, top-N, and make/miss cut bets. Golf markets are among the most inefficient in sports betting due to large fields and high outcome variance.

  3. Player Prop Markets (Section 22.3): We built a complete player projection system that combines baseline performance with game environment (pace, total, spread), matchup (defense vs. position), and usage adjustments. The correlation structure between props is critical for same-game parlay valuation. Prop markets offer consistent edges because sportsbooks set thousands of lines daily and cannot devote deep analysis to each one.

  4. Futures Market Analysis (Section 22.4): Futures markets have unique structural features including time value of capital, wider margins, and evolving information. We implemented implied probability extraction with multiple vig-removal methods (multiplicative, power, Shin's), systematic value identification by comparing model probabilities to market prices, and hedging strategies for existing futures positions. The optimal entry timing is typically early in the season when model updates have diverged from market anchoring.

  5. Niche Sports and Thin Markets (Section 22.5): Thin markets (darts, table tennis, snooker, handball, and others) offer the largest potential edges due to less sharp action and less sophisticated sportsbook pricing. The primary challenges --- data scarcity, low limits, and integrity concerns --- can be addressed through model transferability (adapting frameworks from related sports), Bayesian methods with informative priors, and careful risk management. The ideal niche market combines the bettor's domain expertise with sufficient data and betting volume.

Cross-cutting themes across all emerging markets:

  • Inefficiency is the opportunity. Every emerging market discussed in this chapter is less efficient than the corresponding major market. The quantitative bettor's advantage is proportional to the market's inefficiency.
  • Data challenges require creative solutions. From esports patch disruptions to golf's four-day tournament format to niche sports' limited statistical coverage, each market demands adaptation of standard methods.
  • Specialization rewards depth. The bettor who deeply understands one emerging market will outperform the bettor who superficially covers many. Choose your niche based on your domain knowledge, available data, and market access.
  • Model transferability is a superpower. The mathematical frameworks developed throughout this textbook --- Elo ratings, regression models, Monte Carlo simulation, Bayesian updating --- transfer across sports with parameter re-estimation. Building a model for one sport makes building the next one faster.

Emerging markets opportunity assessment framework:

| Market | Typical Edge | Data Availability | Limits | Frequency | Overall Opportunity |
|---|---|---|---|---|---|
| Esports (Tier 1) | 2--5% | Good | Medium | High | Strong |
| Esports (Tier 2+) | 5--10% | Moderate | Low | High | Very Strong |
| Golf (outrights) | 3--8% | Excellent | Medium | Weekly | Strong |
| Golf (matchups/props) | 3--6% | Excellent | Medium | Weekly | Strong |
| Player props (NBA/NFL) | 2--5% | Excellent | Medium-High | Daily (in-season) | Very Strong |
| Championship futures | 3--10% | Good | Medium | Seasonal | Moderate (capital locked) |
| Niche individual sports | 5--15% | Poor-Moderate | Very Low | Variable | Strong (if specialized) |
| Niche team sports | 3--10% | Moderate | Low | Variable | Moderate |

Review Questions:

  1. Why are esports models more sensitive to patch updates than traditional sports models are to rule changes? What specific mechanisms cause historical data to lose predictive value after a major patch?

  2. Explain the strokes gained framework: how does SG:Approach differ from traditional accuracy statistics like "greens in regulation percentage"? Why is the strokes gained decomposition more informative for predictive modeling?

  3. In a player prop model, why might two wide receivers on the same team have negatively correlated receiving yard totals but positively correlated game-level stats? How should a same-game parlay model handle this distinction?

  4. A sportsbook offers a team at +2500 to win the championship. Your model gives them a 6% probability. After removing vig (the market-implied probability is 3.5%), what is the expected value of a $100 bet? Should the time value of capital affect your decision?

  5. Describe three strategies for overcoming data scarcity when building a model for a niche sport with fewer than 100 available match results per player.

  6. Why do niche sports typically offer larger edges than major sports? What factors limit the ability of sharp bettors to exploit these edges at scale?


Exercises:

  1. (Programming) Build a complete CS2 map Elo system using data from HLTV.org. Track map-specific ratings for the top 20 teams over a full competitive year. Compare the predictive accuracy of: (a) overall Elo only, (b) map-specific Elo only, and (c) blended overall + map Elo. Report log-loss for each approach.

  2. (Analysis) Using PGA Tour strokes gained data, characterize three courses with distinct skill demands. For each course, identify the three golfers in the current top 50 world rankings with the best course fit and the three with the worst. Compare your predictions to historical tournament results at those venues.

  3. (Modeling) Build a player prop model for NBA points that incorporates pace, game total, spread, defense vs. position, and rest. Backtest it against a season of actual lines from a sportsbook (or a publicly available closing line dataset). Report the frequency of value bets found, the average edge, and the simulated ROI assuming flat-stake betting.

  4. (Portfolio) Select a niche sport you have domain knowledge in. Collect at least 200 match results, build an Elo system, and backtest its predictive accuracy. Then simulate a betting strategy against hypothetical closing lines (generated by adding noise to your model's probabilities). Report the expected annual profit and the number of bets required per month to achieve it.

  5. (Futures) Track a major championship futures market (Super Bowl, NBA Finals, World Cup) from pre-season through conclusion. Record odds weekly for all teams. Build a simulation model that produces weekly probability updates. Identify the point in the season where the largest discrepancy between your model and the market existed, and calculate the expected value of a bet placed at that moment.