> "The single most important thing a sports bettor can do to improve their bottom line isn't building a better model -- it's getting a better number." -- Professional sports bettor wisdom

Chapter 12: Line Shopping and Market Analysis

Line shopping -- the practice of comparing odds across multiple sportsbooks to find the best available price -- is the closest thing to a "free lunch" in sports betting. While building predictive models requires deep expertise and significant effort, line shopping requires only discipline and access to multiple accounts. In this chapter, we will rigorously quantify the impact of line shopping, introduce Closing Line Value (CLV) as the gold-standard performance metric, and build practical Python tools for systematically identifying the best available odds.


12.1 Why Line Shopping Matters

The Cost of Ignoring Line Shopping

Consider a bettor who places 1,000 bets per year, each sized to win $100, all at standard -110 juice on sides and totals. At -110 they risk $110 per bet, so the total amount risked is $110,000. At -110, the implied probability is:

$$ p_{\text{implied}} = \frac{110}{110 + 100} = 52.38\% $$

To break even, they need to win 52.38% of their bets. Now suppose that by shopping lines, they could consistently get -107 instead of -110. The new breakeven win rate becomes:

$$ p_{\text{implied}} = \frac{107}{107 + 100} = 51.69\% $$

That 0.69 percentage point reduction in breakeven win rate translates directly to profit. If our bettor wins 53% of their bets (a solid but not extraordinary win rate), here is the comparison:

| Metric | At -110 | At -107 | Difference |
|---|---|---|---|
| Bets Won | 530 | 530 | 0 |
| Bets Lost | 470 | 470 | 0 |
| Won Amount | $53,000 | $53,000 | $0 |
| Lost Amount | -$51,700 | -$50,290 | +$1,410 |
| Net Profit | **$1,300** | **$2,710** | +$1,410 |
| ROI (on amount risked) | 1.18% | 2.53% | +1.35 pts |

That seemingly minor 3-cent improvement in juice more than doubles the bettor's profit. Over a career of tens of thousands of bets, this difference adds up dramatically.
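The arithmetic behind the comparison can be checked with a few lines of Python. This is a minimal sketch that assumes, as the won and lost amounts in the table imply, that every bet is sized to win $100; the function names `breakeven_win_rate` and `season_profit` are illustrative and not part of the chapter's later toolkit.

```python
def breakeven_win_rate(odds: float) -> float:
    """Win rate needed to break even at the given American odds."""
    if odds < 0:
        return abs(odds) / (abs(odds) + 100)
    return 100 / (odds + 100)


def season_profit(wins: int, losses: int, odds: float,
                  to_win: float = 100.0) -> float:
    """Net profit when every bet is sized to win `to_win` at negative odds."""
    risk = to_win * abs(odds) / 100  # e.g. $110 risked to win $100 at -110
    return wins * to_win - losses * risk


for odds in (-110, -107):
    print(f"{odds}: breakeven {breakeven_win_rate(odds):.2%}, "
          f"net profit ${season_profit(530, 470, odds):,.0f}")
```

Only the amount risked on the 470 losing bets changes between the two prices, which is why the whole $1,410 difference shows up in the lost-amount row.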

The Point Spread Effect

For point spread bets, getting a better number is even more valuable because key numbers in football and basketball create discrete probability jumps. Consider an NFL spread of -3:

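| Line Obtained | Win % (Historical) | Expected Profit per $110 Risked |
|---|---|---|
| -3.5 (-110) | 49.5% | -$6.05 |
| -3 (-110) | 52.0% (push ~5%) | -$0.55 |
| -2.5 (-110) | 54.5% | +$4.45 |

Expected profit is the win probability times $100 minus the loss probability times $110. In the -3 row, the 52.0% counts the roughly 5% of pushes as half wins, which corresponds to 49.5% wins, 5% pushes, and 45.5% losses.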

Getting -2.5 instead of -3.5 on a football spread is worth approximately 5 percentage points of win rate. Across key numbers like 3, 7, 6, 10, and 14 in football, the value of half-point improvements varies dramatically.

Key Insight: In NFL betting, approximately 15% of games are decided by exactly 3 points, and about 6% by exactly 7. Getting the right side of these key numbers through line shopping has an outsized impact on long-term profitability.
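To see how a half point around a key number turns into dollars, the sketch below computes expected profit per $110 risked at -110 from win and push probabilities. The probabilities passed in are illustrative inputs (a 49.5% chance the favorite wins by 4 or more and a 5% chance it wins by exactly 3), and `ev_per_110` is a hypothetical helper name.

```python
def ev_per_110(win_pct: float, push_pct: float = 0.0) -> float:
    """Expected profit per $110 risked at -110; a push returns the stake."""
    loss_pct = 1.0 - win_pct - push_pct
    return win_pct * 100 - loss_pct * 110


# Illustrative probabilities for a favorite around the key number 3
p_win_by_4_plus = 0.495
p_win_by_exactly_3 = 0.05

lines = {
    "-3.5": ev_per_110(p_win_by_4_plus),                          # 3 is a loss
    "-3":   ev_per_110(p_win_by_4_plus, p_win_by_exactly_3),      # 3 is a push
    "-2.5": ev_per_110(p_win_by_4_plus + p_win_by_exactly_3),     # 3 is a win
}
for line, ev in lines.items():
    print(f"{line:>5}: {ev:+.2f} per $110 risked")
```

Each half point moves the probability mass sitting on the key number from a loss to a push, and then from a push to a win, which is why the steps are so large near 3.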

Historical Line Differences Across Books

Sportsbooks do not all post identical lines. Differences arise from:

  1. Different models and risk assessments -- Each book's trading desk uses proprietary models
  2. Different customer bases -- A book with sharp action will move lines differently than a recreational-heavy book
  3. Different hold targets -- Some books aim for 4.5% hold; others target 6% or more
  4. Timing of line moves -- Books react to information at different speeds
  5. Promotional pricing -- Reduced juice offers (-105 or even +100) on select markets

Empirical studies of line differences across major US sportsbooks reveal the following patterns:

| Market Type | Avg. Spread Difference (pts) | Avg. Moneyline Difference (cents) | Max Observed Difference |
|---|---|---|---|
| NFL Sides | 0.5 - 1.0 | 5 - 15 | 2.0 pts / 40 cents |
| NFL Totals | 0.5 - 1.5 | 5 - 15 | 2.5 pts / 30 cents |
| NBA Sides | 0.5 - 1.5 | 5 - 20 | 3.0 pts / 50 cents |
| NBA Totals | 1.0 - 2.0 | 5 - 15 | 3.0 pts / 30 cents |
| MLB ML | N/A | 10 - 30 | N/A / 60 cents |
| NHL ML | N/A | 10 - 25 | N/A / 50 cents |
| Soccer 3-Way | N/A | 10 - 40 | N/A / 80 cents |

Callout: The "Penny Wise" Principle

Many bettors dismiss small line differences ("it's only 2 cents"). But consider: if you bet 500 times per year to win $100 each and save an average of 8 cents per bet, you risk $8 less on every wager. That saving is realized on the bets you lose, roughly half of them, so it is worth about $2,000 in additional annual profit. For a bettor with a $10,000 bankroll, that is about 20% of their bankroll -- potentially the difference between a winning year and a losing one.

Quantifying the Lifetime Value of Line Shopping

Let us build a simple simulation to estimate the cumulative impact of line shopping over a career:

import numpy as np
import matplotlib.pyplot as plt

def simulate_line_shopping_impact(
    n_bets: int = 10000,
    base_win_rate: float = 0.53,
    base_odds: float = -110,
    improved_odds: float = -107,
    bet_size: float = 100,
    n_simulations: int = 5000
) -> dict:
    """
    Simulate the cumulative impact of line shopping over a betting career.

    Parameters
    ----------
    n_bets : int
        Total number of bets in the simulation
    base_win_rate : float
        Probability of winning each bet
    base_odds : float
        American odds without line shopping (negative)
    improved_odds : float
        American odds with line shopping (negative)
    bet_size : float
        Flat bet size in dollars
    n_simulations : int
        Number of Monte Carlo simulations

    Returns
    -------
    dict with cumulative profit arrays for both scenarios
    """
    def american_to_profit(odds, stake):
        """Convert American odds to profit on a win."""
        if odds < 0:
            return stake * (100 / abs(odds))
        else:
            return stake * (odds / 100)

    profit_base = american_to_profit(base_odds, bet_size)
    profit_improved = american_to_profit(improved_odds, bet_size)

    results_base = np.zeros((n_simulations, n_bets))
    results_improved = np.zeros((n_simulations, n_bets))

    for sim in range(n_simulations):
        outcomes = np.random.binomial(1, base_win_rate, n_bets)

        # Without line shopping
        pnl_base = np.where(outcomes == 1, profit_base, -bet_size)
        results_base[sim] = np.cumsum(pnl_base)

        # With line shopping
        pnl_improved = np.where(outcomes == 1, profit_improved, -bet_size)
        results_improved[sim] = np.cumsum(pnl_improved)

    return {
        'base_mean': results_base.mean(axis=0),
        'base_p5': np.percentile(results_base, 5, axis=0),
        'base_p95': np.percentile(results_base, 95, axis=0),
        'improved_mean': results_improved.mean(axis=0),
        'improved_p5': np.percentile(results_improved, 5, axis=0),
        'improved_p95': np.percentile(results_improved, 95, axis=0),
    }

# Run simulation
results = simulate_line_shopping_impact()

# Plot results
bets = np.arange(1, 10001)
plt.figure(figsize=(12, 7))
plt.plot(bets, results['base_mean'], 'b-', label='Without Line Shopping (-110)')
plt.fill_between(bets, results['base_p5'], results['base_p95'], alpha=0.15, color='blue')
plt.plot(bets, results['improved_mean'], 'r-', label='With Line Shopping (-107)')
plt.fill_between(bets, results['improved_p5'], results['improved_p95'], alpha=0.15, color='red')
plt.xlabel('Number of Bets')
plt.ylabel('Cumulative Profit ($)')
plt.title('Cumulative Impact of Line Shopping (53% Win Rate)')
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('line_shopping_impact.png', dpi=150)
plt.show()

# Print summary statistics at key milestones
for n in [100, 500, 1000, 5000, 10000]:
    base_profit = results['base_mean'][n-1]
    improved_profit = results['improved_mean'][n-1]
    diff = improved_profit - base_profit
    print(f"After {n:>5} bets: Base=${base_profit:>8.0f}, "
          f"Improved=${improved_profit:>8.0f}, "
          f"Difference=${diff:>7.0f}")

Expected output (approximate; the simulation is not seeded, so the means vary slightly from run to run around these expected values):

After   100 bets: Base=$     118, Improved=$     253, Difference=$    135
After   500 bets: Base=$     591, Improved=$    1266, Difference=$    675
After  1000 bets: Base=$    1182, Improved=$    2533, Difference=$   1351
After  5000 bets: Base=$    5909, Improved=$   12664, Difference=$   6754
After 10000 bets: Base=$   11818, Improved=$   25327, Difference=$  13509

After 10,000 bets, line shopping at just 3 cents better adds roughly $13,500 in additional profit on $100 bets. The gap widens steadily because it is a linear, additive advantage on every winning wager.
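Because the bets are flat and independent, the mean of the simulation also has a closed form: the number of bets times the expected profit per bet. The sketch below is a cross-check under the same assumptions as the simulation's defaults ($100 staked, 53% win rate); `expected_profit` is an illustrative name, not a function used elsewhere in the chapter.

```python
def american_win_profit(odds: float, stake: float) -> float:
    """Profit on a winning bet of `stake` at American `odds`."""
    if odds < 0:
        return stake * (100 / abs(odds))
    return stake * (odds / 100)


def expected_profit(n_bets: int, win_rate: float, odds: float,
                    stake: float = 100.0) -> float:
    """Expected cumulative profit of n_bets flat bets (no simulation needed)."""
    per_bet = (win_rate * american_win_profit(odds, stake)
               - (1 - win_rate) * stake)
    return n_bets * per_bet


for n in (100, 1000, 10000):
    base = expected_profit(n, 0.53, -110)
    improved = expected_profit(n, 0.53, -107)
    print(f"{n:>5} bets: base={base:,.0f}, improved={improved:,.0f}, "
          f"difference={improved - base:,.0f}")
```

The closed form gives only the mean. The Monte Carlo version is still needed for the 5th and 95th percentile bands, which show how wide the range of outcomes is around that mean.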


12.2 Closing Line Value (CLV) as a Performance Metric

What Is Closing Line Value?

Closing Line Value (CLV) is the difference between the odds at which you placed your bet and the closing odds -- the final odds available just before the event begins. CLV is widely regarded as the single best predictor of long-term betting profitability.

The logic is straightforward: the closing line represents the market's most informed, efficient estimate of the true probability, because it incorporates the maximum amount of information and sharp money. If you consistently bet at prices better than the closing line, you are extracting value that the fully efficient market did not offer at close.

Formally, for a bet placed at odds $o_{\text{placed}}$ with closing odds $o_{\text{close}}$:

$$ \text{CLV} = p_{\text{close}} - p_{\text{placed}} $$

where $p$ denotes the implied probability derived from the odds. Positive CLV means you got a better price than the market's final assessment.
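As a quick worked instance of this definition, using raw implied probabilities with the vig still included (the odds are illustrative): a bet placed at +150 on a line that closes at +130 gives

$$ p_{\text{placed}} = \frac{100}{150 + 100} = 40.00\%, \qquad p_{\text{close}} = \frac{100}{130 + 100} \approx 43.48\% $$

$$ \text{CLV} = p_{\text{close}} - p_{\text{placed}} \approx 43.48\% - 40.00\% = +3.48 \text{ percentage points} $$

Removing the vig from the closing price lowers $p_{\text{close}}$, so a no-vig CLV for the same bet is smaller than this raw figure.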

Why Beating the Close Predicts Profitability

The theoretical justification comes from market efficiency theory. If the closing line is the best available estimate of the true probability (adjusted for vig), then:

  1. Bettors with positive CLV are systematically getting prices that exceed the market's final assessment of fair value
  2. The closing line is hard to beat because it incorporates all available information, including sharp money, injury reports, weather updates, and public betting patterns
  3. CLV is independent of outcome variance -- unlike tracking wins and losses, CLV can be measured on every bet regardless of result

Empirical research consistently shows:

| CLV Range | Expected Long-term ROI | Sample Size Needed |
|---|---|---|
| < -2% | Strongly negative | ~200 bets |
| -2% to 0% | Slightly negative | ~500 bets |
| 0% to +2% | Slightly positive | ~1,000 bets |
| +2% to +4% | Solidly profitable | ~500 bets |
| > +4% | Highly profitable | ~200 bets |

Key Insight: A bettor's CLV converges to a meaningful signal much faster than their actual win rate. While you might need 5,000+ bets to be statistically confident about a 1% ROI from win/loss records alone, CLV can reveal whether you have an edge within a few hundred bets.
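A rough way to see why CLV converges faster is to ask how many bets it takes for a sample mean to sit about two standard errors above zero. The sketch below applies the normal approximation n ≈ (z·σ/μ)² to both measures. The per-bet CLV standard deviation of 5 percentage points is a hypothetical input chosen for illustration, and the resulting thresholds depend heavily on the confidence level and on how noisy a given bettor's CLV really is.

```python
import math


def bets_needed(mean_per_bet: float, sd_per_bet: float, z: float = 1.96) -> int:
    """Smallest n for which the sample mean is z standard errors above zero."""
    return math.ceil((z * sd_per_bet / mean_per_bet) ** 2)


def win_loss_sd(roi: float, odds: float = -110) -> float:
    """Per-bet SD of profit (in units staked) for a flat bettor with this ROI."""
    win = 100 / abs(odds) if odds < 0 else odds / 100
    p = (roi + 1) / (win + 1)  # win rate implied by the ROI at these odds
    return (win + 1) * math.sqrt(p * (1 - p))


# Detecting a 1% ROI from win/loss results at -110
n_results = bets_needed(0.01, win_loss_sd(0.01))

# Detecting a 1-point mean CLV, ASSUMING a per-bet CLV SD of 5 points
n_clv = bets_needed(0.01, 0.05)

print(f"Bets needed from results: {n_results:,}")
print(f"Bets needed from CLV:     {n_clv:,}")
```

The gap comes almost entirely from the standard deviation: a win/loss outcome swings by nearly a full unit per bet, while CLV typically moves by a few percentage points.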

Calculating CLV for Different Bet Types

Moneyline CLV

For moneyline bets, CLV is calculated using no-vig implied probabilities.

from dataclasses import dataclass
from typing import Optional

@dataclass
class BetRecord:
    """Record of a single bet placed."""
    event: str
    selection: str
    bet_type: str  # 'moneyline', 'spread', 'total'
    odds_placed: float  # American odds at time of bet
    odds_closing: float  # Closing American odds
    spread_placed: Optional[float] = None  # For spread/total bets
    spread_closing: Optional[float] = None
    stake: float = 100.0
    result: Optional[str] = None  # 'win', 'loss', 'push'

def american_to_implied_prob(odds: float) -> float:
    """
    Convert American odds to implied probability.

    Parameters
    ----------
    odds : float
        American odds (e.g., -110, +150)

    Returns
    -------
    float
        Implied probability between 0 and 1
    """
    if odds < 0:
        return abs(odds) / (abs(odds) + 100)
    else:
        return 100 / (odds + 100)

def remove_vig_power(prob_a: float, prob_b: float) -> tuple:
    """
    Remove vig using the power method.

    The power method solves for k such that:
        prob_a^k + prob_b^k = 1

    This is more theoretically sound than proportional normalization
    for markets with unequal vig distribution.

    Parameters
    ----------
    prob_a, prob_b : float
        Implied probabilities for each side (with vig)

    Returns
    -------
    tuple of (fair_prob_a, fair_prob_b)
    """
    from scipy.optimize import brentq

    def equation(k):
        return prob_a**k + prob_b**k - 1.0

    # Find k using root finding
    try:
        k = brentq(equation, 0.01, 10.0)
        fair_a = prob_a**k
        fair_b = prob_b**k
    except ValueError:
        # Fallback to additive method if power method fails
        total = prob_a + prob_b
        fair_a = prob_a / total
        fair_b = prob_b / total

    return fair_a, fair_b

def remove_vig_additive(prob_a: float, prob_b: float) -> tuple:
    """
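    Remove vig using proportional normalization.

    Divides each implied probability by the booked total (1 + overround).
    This is usually called the multiplicative method; a strictly additive
    method would instead subtract an equal share of the overround per side.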

    Parameters
    ----------
    prob_a, prob_b : float
        Implied probabilities for each side

    Returns
    -------
    tuple of (fair_prob_a, fair_prob_b)
    """
    total = prob_a + prob_b
    return prob_a / total, prob_b / total

def calculate_moneyline_clv(
    odds_placed: float,
    closing_odds_selection: float,
    closing_odds_opponent: float,
    method: str = 'additive'
) -> dict:
    """
    Calculate CLV for a moneyline bet.

    Parameters
    ----------
    odds_placed : float
        American odds at which the bet was placed
    closing_odds_selection : float
        Closing American odds for the selection you bet
    closing_odds_opponent : float
        Closing American odds for the other side
    method : str
        Vig removal method: 'additive' or 'power'

    Returns
    -------
    dict with CLV metrics
    """
    # Implied probability at placed odds (includes vig)
    implied_placed = american_to_implied_prob(odds_placed)

    # Closing implied probabilities (with vig)
    closing_selection = american_to_implied_prob(closing_odds_selection)
    closing_opponent = american_to_implied_prob(closing_odds_opponent)

    # Remove vig from closing line
    if method == 'power':
        fair_selection, fair_opponent = remove_vig_power(
            closing_selection, closing_opponent
        )
    else:
        fair_selection, fair_opponent = remove_vig_additive(
            closing_selection, closing_opponent
        )

    # CLV = fair closing probability - implied probability at placement
    clv = fair_selection - implied_placed

    # CLV as expected ROI
    # If fair prob is p, and you bet at implied prob q:
    # Expected return = p * (1/q - 1) - (1-p) * 1
    #                 = p/q - p - 1 + p = p/q - 1
    expected_roi = fair_selection / implied_placed - 1

    return {
        'implied_placed': implied_placed,
        'closing_no_vig': fair_selection,
        'clv_probability': clv,
        'clv_percentage': clv * 100,
        'expected_roi': expected_roi * 100,
    }

# Example: Bet on Team A at +150, line closes at +130 / -150
result = calculate_moneyline_clv(
    odds_placed=150,
    closing_odds_selection=130,
    closing_odds_opponent=-150
)

print("Moneyline CLV Analysis")
print(f"  Implied prob at bet placement: {result['implied_placed']:.4f}")
print(f"  Closing no-vig probability:    {result['closing_no_vig']:.4f}")
print(f"  CLV (probability):             {result['clv_probability']:+.4f}")
print(f"  CLV (percentage points):       {result['clv_percentage']:+.2f}%")
print(f"  Expected ROI:                  {result['expected_roi']:+.2f}%")

Expected output:

Moneyline CLV Analysis
  Implied prob at bet placement: 0.4000
  Closing no-vig probability:    0.4202
  CLV (probability):             +0.0202
  CLV (percentage points):       +2.02%
  Expected ROI:                  +5.04%

Spread and Total CLV

For spread bets, CLV is calculated differently because the line itself moves, not just the odds. We need to account for both the spread movement and the price change.

def calculate_spread_clv(
    spread_placed: float,
    odds_placed: float,
    spread_closing: float,
    odds_closing: float,
    sport: str = 'NFL'
) -> dict:
    """
    Calculate CLV for a spread bet, accounting for both
    spread movement and price changes.

    Parameters
    ----------
    spread_placed : float
        Point spread at time of bet (negative = favorite)
    odds_placed : float
        American odds at time of bet
    spread_closing : float
        Closing point spread
    odds_closing : float
        Closing American odds
    sport : str
        Sport for key-number adjustments

    Returns
    -------
    dict with CLV metrics
    """
    # Historical value of half-point by sport (approximate)
    half_point_value = {
        'NFL': {
            'default': 0.015,  # ~1.5% per half point
            3.0: 0.045,        # 4.5% around the key number 3
            7.0: 0.030,        # 3.0% around the key number 7
            6.0: 0.020,
            10.0: 0.020,
            14.0: 0.018,
        },
        'NBA': {
            'default': 0.010,  # ~1.0% per half point
        },
        'MLB': {
            'default': 0.008,
        }
    }

    # Calculate the spread movement value
    spread_diff = spread_placed - spread_closing
    # Positive spread_diff means the bettor holds a better number than the close
    # (e.g., bet at -3, closed at -3.5 gives +0.5 in the bettor's favor)

    sport_values = half_point_value.get(sport, {'default': 0.015})

    # Use the key-number value if the placed spread is within half a point of one
    key_numbers = [k for k in sport_values if k != 'default']
    nearest_key = min(key_numbers,
                      key=lambda k: abs(abs(spread_placed) - k),
                      default=None)

    if nearest_key is not None and abs(abs(spread_placed) - nearest_key) <= 0.5:
        value_per_half = sport_values[nearest_key]
    else:
        value_per_half = sport_values['default']

    spread_clv = spread_diff * value_per_half * 2  # Two half-points per full point

    # Odds-based CLV from raw implied probabilities (vig included)
    implied_placed = american_to_implied_prob(odds_placed)
    implied_closing = american_to_implied_prob(odds_closing)
    odds_clv = implied_closing - implied_placed

    # Combined CLV; spread_clv is zero when the spread has not moved
    total_clv = spread_clv + odds_clv

    return {
        'spread_placed': spread_placed,
        'spread_closing': spread_closing,
        'spread_movement': spread_diff,
        'spread_clv': spread_clv,
        'odds_clv': odds_clv,
        'total_clv': total_clv,
        'total_clv_pct': total_clv * 100,
    }

# Example: Bet Patriots -2.5 (-110), line closes at -3.5 (-110)
result = calculate_spread_clv(
    spread_placed=-2.5,
    odds_placed=-110,
    spread_closing=-3.5,
    odds_closing=-110,
    sport='NFL'
)

print("Spread CLV Analysis")
print(f"  Spread at placement: {result['spread_placed']}")
print(f"  Spread at close:     {result['spread_closing']}")
print(f"  Spread movement:     {result['spread_movement']:+.1f}")
print(f"  CLV from spread:     {result['spread_clv']*100:+.2f}%")
print(f"  CLV from odds:       {result['odds_clv']*100:+.2f}%")
print(f"  Total CLV:           {result['total_clv_pct']:+.2f}%")

Aggregating CLV Over a Sample

A single bet's CLV tells you little. The power of CLV emerges in aggregate:

import pandas as pd
import numpy as np
from scipy import stats

def analyze_clv_sample(clv_values: list, confidence: float = 0.95) -> dict:
    """
    Analyze a sample of CLV values to assess betting skill.

    Parameters
    ----------
    clv_values : list of float
        CLV for each bet (as probability, not percentage)
    confidence : float
        Confidence level for interval estimation

    Returns
    -------
    dict with statistical analysis of CLV
    """
    arr = np.array(clv_values)
    n = len(arr)
    mean_clv = arr.mean()
    std_clv = arr.std(ddof=1)
    se = std_clv / np.sqrt(n)

    # t-test: is mean CLV significantly different from 0?
    t_stat, p_value = stats.ttest_1samp(arr, 0)

    # Confidence interval
    t_crit = stats.t.ppf((1 + confidence) / 2, df=n-1)
    ci_lower = mean_clv - t_crit * se
    ci_upper = mean_clv + t_crit * se

    # Percentage of bets with positive CLV
    pct_positive = (arr > 0).mean()

    return {
        'n_bets': n,
        'mean_clv': mean_clv,
        'median_clv': np.median(arr),
        'std_clv': std_clv,
        'se': se,
        't_statistic': t_stat,
        'p_value': p_value,
        'ci_lower': ci_lower,
        'ci_upper': ci_upper,
        'pct_positive_clv': pct_positive,
        'significant': p_value < (1 - confidence),
    }

# Example: Simulated CLV data for a sharp bettor
np.random.seed(42)
sharp_clv = np.random.normal(0.02, 0.08, 500)  # Mean +2% CLV
result = analyze_clv_sample(sharp_clv.tolist())

print(f"CLV Analysis ({result['n_bets']} bets)")
print(f"  Mean CLV:    {result['mean_clv']*100:+.2f}%")
print(f"  Median CLV:  {result['median_clv']*100:+.2f}%")
print(f"  Std Dev:     {result['std_clv']*100:.2f}%")
print(f"  t-statistic: {result['t_statistic']:.3f}")
print(f"  p-value:     {result['p_value']:.6f}")
print(f"  95% CI:      [{result['ci_lower']*100:+.2f}%, {result['ci_upper']*100:+.2f}%]")
print(f"  % Positive:  {result['pct_positive_clv']*100:.1f}%")
print(f"  Significant: {result['significant']}")

Callout: CLV vs. Actual Results

A bettor can have positive CLV and still lose money over a small sample. CLV measures the expected edge, not realized outcomes. Think of CLV as measuring the quality of your decisions, while profit/loss measures your decisions plus luck. Over time, the two converge. In the short run, trust CLV more than your P&L.
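How often does a bettor with a genuine edge still show a loss? The sketch below answers that with an exact binomial sum for flat bets at -110. The 54% true win probability is a hypothetical input standing in for a positive-CLV bettor, and `prob_losing` is an illustrative helper name.

```python
import math


def prob_losing(n_bets: int, win_prob: float, odds: float = -110) -> float:
    """Exact binomial probability that a flat bettor is down after n_bets."""
    win = 100 / abs(odds) if odds < 0 else odds / 100  # profit per unit staked
    total = 0.0
    for wins in range(n_bets + 1):
        profit = wins * win - (n_bets - wins)
        if profit >= 0:
            break  # profit only grows with more wins, so stop here
        # Binomial pmf in log space to avoid overflow for large n_bets
        log_pmf = (math.lgamma(n_bets + 1) - math.lgamma(wins + 1)
                   - math.lgamma(n_bets - wins + 1)
                   + wins * math.log(win_prob)
                   + (n_bets - wins) * math.log(1 - win_prob))
        total += math.exp(log_pmf)
    return total


for n in (100, 500, 1000, 5000):
    print(f"{n:>5} bets: P(net loss) = {prob_losing(n, 0.54):.1%}")
```

Under these assumptions a losing record remains quite likely over the first hundred bets and fades only slowly, which is the practical reason to judge short samples by CLV instead of by P&L.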


12.3 Odds Comparison Techniques

Systematic Odds Comparison

Effective line shopping requires a systematic approach. Here is a framework for comparing odds across multiple books:

Step 1: Identify the Market

For each event, define the set of markets you want to compare:

  - Spread (full game, first half, quarters)
  - Total (full game, team total, first half)
  - Moneyline
  - Props (player props, game props)

Step 2: Collect Odds from All Available Books

For each market, record the current odds from every sportsbook you have access to. Here is an example comparison table for an NFL game:

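| Sportsbook | Spread | Spread Odds | Total | Total Odds | Home ML | Away ML |
|---|---|---|---|---|---|---|
| Book A | -3 | -110 | 45.5 | -110 | -155 | +135 |
| Book B | -3 | -108 | 45 | -110 | -150 | +130 |
| Book C | -2.5 | -112 | 45.5 | -108 | -148 | +132 |
| Book D | -3 | -105 | 46 | -110 | -160 | +140 |
| Book E | -3.5 | +100 | 45.5 | -105 | -145 | +128 |
| Best | -2.5 | +100 | Varies | -105 | -145 | +140 |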
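The moneyline columns of the table can be turned into a small sketch that picks the best price on each side and measures how much vig remains once the two best prices are combined. The helper names `best_price` and `overround` are illustrative, and the prices are copied from the table above.

```python
def american_to_implied_prob(odds: float) -> float:
    """Convert American odds to implied probability (vig included)."""
    if odds < 0:
        return abs(odds) / (abs(odds) + 100)
    return 100 / (odds + 100)


home_ml = {'Book A': -155, 'Book B': -150, 'Book C': -148,
           'Book D': -160, 'Book E': -145}
away_ml = {'Book A': 135, 'Book B': 130, 'Book C': 132,
           'Book D': 140, 'Book E': 128}


def best_price(odds_by_book: dict) -> tuple:
    """Return (book, odds) for the highest American price on one selection."""
    book = max(odds_by_book, key=odds_by_book.get)
    return book, odds_by_book[book]


def overround(home_odds: float, away_odds: float) -> float:
    """Sum of both implied probabilities minus 1 (the market's built-in vig)."""
    return (american_to_implied_prob(home_odds)
            + american_to_implied_prob(away_odds) - 1)


best_home = best_price(home_ml)
best_away = best_price(away_ml)
print(f"Best home price: {best_home[1]:+d} at {best_home[0]}")
print(f"Best away price: {best_away[1]:+d} at {best_away[0]}")
print(f"Book A overround:     {overround(home_ml['Book A'], away_ml['Book A']):.2%}")
print(f"Best-line overround:  {overround(best_home[1], best_away[1]):.2%}")
```

Taking the best price on each side from different books shrinks the effective vig well below what any single book in the table charges, which is the whole case for holding multiple accounts.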

Step 3: Identify Outliers

An outlier is a line significantly different from the consensus. Outliers can represent:

  - Value opportunities -- the book has mispriced the event
  - Stale lines -- the book has not yet reacted to news or sharp action
  - Different market structure -- the book may be dealing to a different number intentionally

def identify_outliers(odds_dict: dict, threshold_cents: int = 10) -> list:
    """
    Identify sportsbooks offering outlier odds on a market.

    Parameters
    ----------
    odds_dict : dict
        {sportsbook_name: american_odds} for one side of a market
    threshold_cents : int
        Minimum deviation from median to flag as outlier (in cents)

    Returns
    -------
    list of dicts with outlier information
    """
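    books = list(odds_dict.keys())
    odds = list(odds_dict.values())

    def to_cents(o):
        # Continuous "cents" scale: +100 and -100 both map to 0,
        # -110 maps to -10, and +135 maps to +35
        return o - 100 if o > 0 else o + 100

    # Implied probabilities, kept for reporting
    implied = [american_to_implied_prob(o) for o in odds]
    median_implied = np.median(implied)

    # Prices on the cents scale, used for the threshold test
    cents = [to_cents(o) for o in odds]
    median_cents = np.median(cents)

    outliers = []
    for book, odd, imp, c in zip(books, odds, implied, cents):
        deviation = (median_implied - imp) * 100  # Percentage points; positive = better than median
        deviation_cents = c - median_cents        # Positive = better price for the bettor
        if abs(deviation_cents) >= threshold_cents:
            outliers.append({
                'book': book,
                'odds': odd,
                'implied_prob': imp,
                'deviation_pct': deviation,
                'deviation_cents': deviation_cents,
                'is_value': deviation_cents > 0,
            })

    return sorted(outliers, key=lambda x: -x['deviation_pct'])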

# Example usage
odds_market = {
    'Book A': +135,
    'Book B': +130,
    'Book C': +132,
    'Book D': +140,
    'Book E': +128,
    'Book F': +155,  # Outlier!
}

outliers = identify_outliers(odds_market, threshold_cents=5)
for o in outliers:
    direction = "VALUE" if o['is_value'] else "avoid"
    print(f"  {o['book']}: {o['odds']:+d} "
          f"(deviation: {o['deviation_pct']:+.2f}%) [{direction}]")

Timing Considerations

The timing of when you compare odds matters enormously:

  1. Opening lines (released 1-2 weeks before the event): Most variation between books, but some books are slow to open and have wide vig
  2. Midweek (3-5 days before): Markets have settled after initial sharp action; good time for a systematic comparison
  3. Day of event: Maximum information incorporated; differences between books are usually smallest but can spike with late-breaking news
  4. Live/in-play: Enormous variation due to different algorithms and reaction times; potentially the most exploitable but hardest to systematically capture

Best vs. Worst Line Analysis

To understand just how much line shopping matters, let us compare the best and worst available odds for a single selection:

def best_worst_analysis(odds_by_book: dict) -> dict:
    """
    Compare the best and worst available odds for a market.

    Parameters
    ----------
    odds_by_book : dict
        {book_name: american_odds} for one selection

    Returns
    -------
    dict with comparison metrics
    """
    best_book = max(odds_by_book, key=odds_by_book.get)
    worst_book = min(odds_by_book, key=odds_by_book.get)

    best_odds = odds_by_book[best_book]
    worst_odds = odds_by_book[worst_book]

    best_implied = american_to_implied_prob(best_odds)
    worst_implied = american_to_implied_prob(worst_odds)

    edge_difference = worst_implied - best_implied

    return {
        'best_book': best_book,
        'best_odds': best_odds,
        'worst_book': worst_book,
        'worst_odds': worst_odds,
        'implied_difference': edge_difference,
        'implied_diff_pct': edge_difference * 100,
        'profit_diff_per_100': (1/best_implied - 1/worst_implied) * 100,
    }

12.4 Building a Line-Comparison Bot in Python

In this section, we build a complete line-comparison system that scrapes odds from publicly available APIs, stores them in a database, and alerts you when value appears.

Disclaimer: Before scraping any website, review its terms of service. Many sportsbooks prohibit automated data collection. The code below uses the publicly available Odds API (https://the-odds-api.com/) which aggregates odds legally. Always comply with applicable terms of service and local regulations.

Architecture Overview

Our system has four components:

  1. Data Collector -- Fetches odds from the API on a schedule
  2. Database -- Stores historical odds in SQLite
  3. Analyzer -- Compares current odds to find value
  4. Alerter -- Sends notifications when value is detected

Component 1: Data Collector

"""
odds_collector.py
Collects odds from The Odds API and stores them in a local database.
"""

import requests
import sqlite3
import time
import json
import logging
from datetime import datetime, timezone
from typing import Optional

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

class OddsCollector:
    """
    Collects and stores odds from The Odds API.

    Parameters
    ----------
    api_key : str
        API key for The Odds API
    db_path : str
        Path to SQLite database file
    sports : list of str
        Sport keys to track (e.g., 'americanfootball_nfl')
    regions : str
        Comma-separated regions (e.g., 'us,us2,eu')
    markets : str
        Comma-separated markets (e.g., 'h2h,spreads,totals')
    """

    BASE_URL = "https://api.the-odds-api.com/v4/sports"

    def __init__(
        self,
        api_key: str,
        db_path: str = "odds_history.db",
        sports: list = None,
        regions: str = "us,us2",
        markets: str = "h2h,spreads,totals"
    ):
        self.api_key = api_key
        self.db_path = db_path
        self.sports = sports or [
            'americanfootball_nfl',
            'basketball_nba',
            'icehockey_nhl',
            'baseball_mlb'
        ]
        self.regions = regions
        self.markets = markets
        self._init_database()

    def _init_database(self):
        """Create database tables if they don't exist."""
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()

        cursor.execute("""
            CREATE TABLE IF NOT EXISTS odds_snapshots (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                snapshot_time TEXT NOT NULL,
                sport TEXT NOT NULL,
                event_id TEXT NOT NULL,
                home_team TEXT NOT NULL,
                away_team TEXT NOT NULL,
                commence_time TEXT NOT NULL,
                bookmaker TEXT NOT NULL,
                market TEXT NOT NULL,
                selection TEXT NOT NULL,
                price REAL NOT NULL,
                point REAL,
                UNIQUE(snapshot_time, event_id, bookmaker, market, selection)
            )
        """)

        cursor.execute("""
            CREATE TABLE IF NOT EXISTS best_odds_log (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                log_time TEXT NOT NULL,
                event_id TEXT NOT NULL,
                market TEXT NOT NULL,
                selection TEXT NOT NULL,
                best_book TEXT NOT NULL,
                best_odds REAL NOT NULL,
                median_odds REAL NOT NULL,
                edge_vs_median REAL NOT NULL
            )
        """)

        cursor.execute("""
            CREATE INDEX IF NOT EXISTS idx_odds_event
            ON odds_snapshots(event_id, market, selection)
        """)

        cursor.execute("""
            CREATE INDEX IF NOT EXISTS idx_odds_time
            ON odds_snapshots(snapshot_time)
        """)

        conn.commit()
        conn.close()
        logger.info("Database initialized successfully.")

    def fetch_odds(self, sport: str) -> Optional[list]:
        """
        Fetch current odds for a sport from the API.

        Parameters
        ----------
        sport : str
            Sport key (e.g., 'americanfootball_nfl')

        Returns
        -------
        list of event dicts, or None on failure
        """
        url = f"{self.BASE_URL}/{sport}/odds"
        params = {
            'apiKey': self.api_key,
            'regions': self.regions,
            'markets': self.markets,
            'oddsFormat': 'american',
        }

        try:
            response = requests.get(url, params=params, timeout=30)
            response.raise_for_status()
            events = response.json()  # Parse the response body once

            remaining = response.headers.get('x-requests-remaining', '?')
            logger.info(f"Fetched {sport}: {len(events)} events. "
                       f"API requests remaining: {remaining}")

            return events

        except requests.RequestException as e:
            logger.error(f"Error fetching {sport}: {e}")
            return None

    def store_odds(self, events: list, sport: str):
        """
        Store fetched odds in the database.

        Parameters
        ----------
        events : list
            List of event dicts from the API
        sport : str
            Sport key
        """
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
        snapshot_time = datetime.now(timezone.utc).isoformat()

        rows_inserted = 0
        for event in events:
            event_id = event['id']
            home = event['home_team']
            away = event['away_team']
            commence = event['commence_time']

            for bookmaker in event.get('bookmakers', []):
                book_name = bookmaker['key']

                for market in bookmaker.get('markets', []):
                    market_key = market['key']

                    for outcome in market.get('outcomes', []):
                        selection = outcome['name']
                        price = outcome['price']
                        point = outcome.get('point')

                        try:
                            cursor.execute("""
                                INSERT OR IGNORE INTO odds_snapshots
                                (snapshot_time, sport, event_id, home_team,
                                 away_team, commence_time, bookmaker, market,
                                 selection, price, point)
                                VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
                            """, (snapshot_time, sport, event_id, home, away,
                                  commence, book_name, market_key, selection,
                                  price, point))
                            rows_inserted += cursor.rowcount
                        except sqlite3.Error as e:
                            logger.warning(f"DB insert error: {e}")

        conn.commit()
        conn.close()
        logger.info(f"Stored {rows_inserted} new odds records.")

    def collect_all(self):
        """Fetch and store odds for all configured sports."""
        for sport in self.sports:
            events = self.fetch_odds(sport)
            if events:
                self.store_odds(events, sport)
            time.sleep(1)  # Rate limiting courtesy

    def run_continuous(self, interval_seconds: int = 300):
        """
        Continuously collect odds at a specified interval.

        Parameters
        ----------
        interval_seconds : int
            Seconds between collection cycles (default: 5 minutes)
        """
        logger.info(f"Starting continuous collection every "
                    f"{interval_seconds}s...")
        while True:
            try:
                self.collect_all()
            except Exception as e:
                logger.error(f"Collection cycle error: {e}")
            time.sleep(interval_seconds)

Component 2: Odds Analyzer

"""
odds_analyzer.py
Analyzes stored odds to find the best available lines and value opportunities.
"""

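import sqlite3
import numpy as np
from datetime import datetime, timezone, timedelta
from typing import List, Dict, Optional


def american_to_implied_prob(odds: float) -> float:
    """
    Convert American odds to an implied probability (vig included).

    Same conversion as the breakeven formula in Section 12.1; defined
    here so that this module runs on its own.
    """
    if odds < 0:
        return abs(odds) / (abs(odds) + 100)
    return 100 / (odds + 100)
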

class OddsAnalyzer:
    """
    Analyzes odds stored in the database to find value.

    Parameters
    ----------
    db_path : str
        Path to the SQLite database
    min_edge : float
        Minimum edge vs. median to flag as value (default: 0.02 = 2%)
    """

    def __init__(self, db_path: str = "odds_history.db", min_edge: float = 0.02):
        self.db_path = db_path
        self.min_edge = min_edge

    def get_current_odds(self, event_id: str, market: str) -> Dict:
        """
        Get the most recent odds snapshot for an event/market.

        Parameters
        ----------
        event_id : str
            Unique event identifier
        market : str
            Market type ('h2h', 'spreads', 'totals')

        Returns
        -------
        dict mapping (selection, point) -> {bookmaker: odds}
        """
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()

        # Get the most recent snapshot time for this event
        cursor.execute("""
            SELECT MAX(snapshot_time) FROM odds_snapshots
            WHERE event_id = ? AND market = ?
        """, (event_id, market))

        latest_time = cursor.fetchone()[0]
        if not latest_time:
            conn.close()
            return {}

        cursor.execute("""
            SELECT selection, point, bookmaker, price
            FROM odds_snapshots
            WHERE event_id = ? AND market = ? AND snapshot_time = ?
        """, (event_id, market, latest_time))

        result = {}
        for selection, point, bookmaker, price in cursor.fetchall():
            key = (selection, point)
            if key not in result:
                result[key] = {}
            result[key][bookmaker] = price

        conn.close()
        return result

    def find_best_odds(self, event_id: str, market: str) -> List[Dict]:
        """
        Find the best available odds for each selection in a market.

        Returns
        -------
        list of dicts with best odds information
        """
        current = self.get_current_odds(event_id, market)

        results = []
        for (selection, point), book_odds in current.items():
            if not book_odds:
                continue

            best_book = max(book_odds, key=book_odds.get)
            best_price = book_odds[best_book]

            # Calculate median odds (as implied probability)
            implied_probs = [american_to_implied_prob(o) for o in book_odds.values()]
            median_implied = np.median(implied_probs)
            best_implied = american_to_implied_prob(best_price)

            edge = median_implied - best_implied

            results.append({
                'selection': selection,
                'point': point,
                'best_book': best_book,
                'best_price': best_price,
                'median_implied': median_implied,
                'best_implied': best_implied,
                'edge_vs_median': edge,
                'n_books': len(book_odds),
                'all_odds': book_odds,
            })

        return sorted(results, key=lambda x: -x['edge_vs_median'])

    def find_value_bets(self) -> List[Dict]:
        """
        Scan all upcoming events for value bets exceeding the minimum edge.

        Returns
        -------
        list of value bet opportunities
        """
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()

        # Get all upcoming events
        now = datetime.now(timezone.utc).isoformat()
        cursor.execute("""
            SELECT DISTINCT event_id, home_team, away_team,
                   commence_time, sport, market
            FROM odds_snapshots
            WHERE commence_time > ?
            ORDER BY commence_time
        """, (now,))

        events = cursor.fetchall()
        conn.close()

        value_bets = []
        seen = set()

        for event_id, home, away, commence, sport, market in events:
            key = (event_id, market)
            if key in seen:
                continue
            seen.add(key)

            best_odds = self.find_best_odds(event_id, market)

            for odds_info in best_odds:
                if odds_info['edge_vs_median'] >= self.min_edge:
                    value_bets.append({
                        'sport': sport,
                        'event': f"{away} @ {home}",
                        'commence': commence,
                        'market': market,
                        **odds_info,
                    })

        return sorted(value_bets, key=lambda x: -x['edge_vs_median'])

    def track_line_movement(
        self, event_id: str, market: str, selection: str
    ) -> List[Dict]:
        """
        Track how a line has moved over time for a specific selection.

        Parameters
        ----------
        event_id : str
        market : str
        selection : str

        Returns
        -------
        list of dicts with time-series line movement data
        """
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()

        cursor.execute("""
            SELECT snapshot_time, bookmaker, price, point
            FROM odds_snapshots
            WHERE event_id = ? AND market = ? AND selection = ?
            ORDER BY snapshot_time
        """, (event_id, market, selection))

        movements = []
        for snap_time, bookmaker, price, point in cursor.fetchall():
            movements.append({
                'time': snap_time,
                'bookmaker': bookmaker,
                'price': price,
                'point': point,
            })

        conn.close()
        return movements
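
To make the edge_vs_median figure from find_best_odds concrete, here is a minimal standalone sketch of the same calculation on a single selection. The book names and prices are made up for illustration, and american_to_implied_prob is re-declared so the snippet runs without the database.

```python
import statistics


def american_to_implied_prob(odds: float) -> float:
    """Convert American odds to an implied probability (vig included)."""
    if odds < 0:
        return abs(odds) / (abs(odds) + 100)
    return 100 / (odds + 100)


def best_price_vs_median(book_odds: dict) -> dict:
    """Return the best-priced book and its edge versus the median book."""
    # A numerically higher American price always pays more (-104 beats -110)
    best_book = max(book_odds, key=book_odds.get)
    implied = [american_to_implied_prob(o) for o in book_odds.values()]
    median_implied = statistics.median(implied)
    best_implied = american_to_implied_prob(book_odds[best_book])
    return {
        'best_book': best_book,
        'best_price': book_odds[best_book],
        'edge_vs_median': median_implied - best_implied,
    }


# Hypothetical prices for one side of a spread at four books
sample = {'book_a': -110, 'book_b': -108, 'book_c': -112, 'book_d': -104}
result = best_price_vs_median(sample)
print(result['best_book'], result['best_price'],
      f"{result['edge_vs_median'] * 100:.2f}%")
```

With these illustrative prices the best number is -104 and the edge versus the median is a little under 1.2 percentage points, so it would not trigger an alert at the default min_edge of 0.02. That is worth keeping in mind when choosing a threshold: an improvement of several cents is valuable over time but may still sit below a 2% cutoff.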

Component 3: Alert System

"""
odds_alerter.py
Sends alerts when value opportunities are detected.
"""

import smtplib
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart
import json
import logging
from datetime import datetime

logger = logging.getLogger(__name__)

class OddsAlerter:
    """
    Sends alerts for value betting opportunities.

    Supports email and console output. Can be extended with
    SMS, Telegram, Discord, etc.

    Parameters
    ----------
    method : str
        Alert method: 'console', 'email', or 'both'
    email_config : dict, optional
        Email configuration with keys: smtp_server, smtp_port,
        username, password, from_addr, to_addr
    """

    def __init__(self, method: str = 'console', email_config: dict = None):
        self.method = method
        self.email_config = email_config or {}

    def format_alert(self, value_bet: dict) -> str:
        """Format a value bet into a readable alert string."""
        price = value_bet['best_price']
        price_str = f"{price:+d}" if isinstance(price, int) else str(price)
        # 'point' is stored as None when the market has no point (e.g. h2h),
        # so check the value rather than relying on a .get() default
        point = value_bet.get('point')
        point_str = 'N/A' if point is None else str(point)

        lines = [
            "=" * 60,
            f"VALUE ALERT: {value_bet['event']}",
            f"Sport: {value_bet['sport']}",
            f"Market: {value_bet['market']}",
            f"Selection: {value_bet['selection']}",
            f"Point: {point_str}",
            f"Best Book: {value_bet['best_book']}",
            f"Best Price: {price_str}",
            f"Edge vs. Median: {value_bet['edge_vs_median']*100:.2f}%",
            f"# Books Compared: {value_bet['n_books']}",
            f"Game Time: {value_bet['commence']}",
            "-" * 40,
            "All available odds:",
        ]

        for book, odds in sorted(
            value_bet['all_odds'].items(),
            key=lambda x: -x[1]
        ):
            marker = " <-- BEST" if book == value_bet['best_book'] else ""
            lines.append(f"  {book:>20s}: {odds:>+6}{marker}")

        lines.append("=" * 60)
        return "\n".join(lines)

    def send_alert(self, value_bet: dict):
        """Send an alert for a value bet opportunity."""
        message = self.format_alert(value_bet)

        if self.method in ('console', 'both'):
            print(message)

        if self.method in ('email', 'both'):
            self._send_email(
                subject=f"Value Alert: {value_bet['event']} - "
                        f"{value_bet['selection']} "
                        f"({value_bet['edge_vs_median']*100:.1f}% edge)",
                body=message
            )

    def _send_email(self, subject: str, body: str):
        """Send an email alert."""
        try:
            msg = MIMEMultipart()
            msg['From'] = self.email_config['from_addr']
            msg['To'] = self.email_config['to_addr']
            msg['Subject'] = subject
            msg.attach(MIMEText(body, 'plain'))

            with smtplib.SMTP(
                self.email_config['smtp_server'],
                self.email_config['smtp_port']
            ) as server:
                server.starttls()
                server.login(
                    self.email_config['username'],
                    self.email_config['password']
                )
                server.send_message(msg)

            logger.info(f"Email alert sent: {subject}")
        except Exception as e:
            logger.error(f"Failed to send email: {e}")

    def send_batch_alert(self, value_bets: list):
        """Send alerts for multiple value bets."""
        if not value_bets:
            logger.info("No value bets found in this scan.")
            return

        logger.info(f"Found {len(value_bets)} value opportunities.")
        for vb in value_bets:
            self.send_alert(vb)

Putting It All Together: The Main Script

"""
line_shopping_bot.py
Main entry point that ties together collection, analysis, and alerting.
"""

import argparse
import time
import logging
from odds_collector import OddsCollector
from odds_analyzer import OddsAnalyzer
from odds_alerter import OddsAlerter

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)

def run_line_shopping_bot(
    api_key: str,
    min_edge: float = 0.02,
    interval: int = 300,
    alert_method: str = 'console'
):
    """
    Run the complete line shopping bot.

    Parameters
    ----------
    api_key : str
        The Odds API key
    min_edge : float
        Minimum edge to trigger alert (default: 2%)
    interval : int
        Seconds between scans (default: 300 = 5 min)
    alert_method : str
        'console', 'email', or 'both'
    """
    collector = OddsCollector(api_key=api_key)
    analyzer = OddsAnalyzer(min_edge=min_edge)
    alerter = OddsAlerter(method=alert_method)

    logger.info("Line Shopping Bot started.")
    logger.info(f"Min edge threshold: {min_edge*100:.1f}%")
    logger.info(f"Scan interval: {interval}s")

    try:
        while True:
            try:
                # Step 1: Collect latest odds
                logger.info("Collecting odds...")
                collector.collect_all()

                # Step 2: Analyze for value
                logger.info("Analyzing for value...")
                value_bets = analyzer.find_value_bets()

                # Step 3: Alert on findings
                alerter.send_batch_alert(value_bets)

                logger.info(f"Scan complete. Next scan in {interval}s.")

            except Exception as e:
                logger.error(f"Error in scan cycle: {e}")

            time.sleep(interval)

    except KeyboardInterrupt:
        # Caught out here so Ctrl-C also works while the bot is sleeping
        logger.info("Bot stopped by user.")

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Line Shopping Bot')
    parser.add_argument('--api-key', required=True, help='The Odds API key')
    parser.add_argument('--min-edge', type=float, default=0.02,
                       help='Minimum edge threshold (default: 0.02)')
    parser.add_argument('--interval', type=int, default=300,
                       help='Scan interval in seconds (default: 300)')
    parser.add_argument('--alert', choices=['console', 'email', 'both'],
                       default='console', help='Alert method')

    args = parser.parse_args()
    run_line_shopping_bot(
        api_key=args.api_key,
        min_edge=args.min_edge,
        interval=args.interval,
        alert_method=args.alert
    )

To run the bot:

python line_shopping_bot.py --api-key YOUR_KEY --min-edge 0.015 --interval 180

Callout: Extending the Bot

This implementation is a starting point. Production-level improvements include:

  • Persistent scheduling with cron or a task scheduler instead of while True
  • Telegram/Discord integration for mobile alerts
  • Dashboard using Flask or Streamlit to visualize odds movements
  • Closing line tracking by storing the final pre-game odds and computing CLV automatically
  • Bankroll integration to calculate recommended stake sizes based on Kelly criterion
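
As one illustration of the bankroll-integration idea, the sketch below turns an estimated win probability and an American price into a stake fraction using the standard Kelly formula f = (bp - q) / b. The function names and the multiplier parameter (a fractional-Kelly scaling) are assumptions made for this example and are not part of the bot above.

```python
def american_to_decimal(odds: float) -> float:
    """Convert American odds to decimal odds (stake included)."""
    if odds > 0:
        return 1 + odds / 100
    return 1 + 100 / abs(odds)


def kelly_fraction(win_prob: float, american_odds: float,
                   multiplier: float = 0.25) -> float:
    """
    Fraction of bankroll to stake under the Kelly criterion.

    multiplier scales the full-Kelly stake (0.25 = quarter Kelly). This
    scaling is an assumption of the sketch, chosen to reduce variance.
    """
    b = american_to_decimal(american_odds) - 1  # net payout per unit staked
    q = 1 - win_prob
    full_kelly = (win_prob * b - q) / b
    # Never stake on a bet with zero or negative expected value
    return max(0.0, full_kelly) * multiplier


# Hypothetical bet: 55% estimated win probability at a shopped price of -105
stake_fraction = kelly_fraction(0.55, -105)
print(f"Stake {stake_fraction * 100:.2f}% of bankroll")
```

Because the stake depends on the price, the same opinion warrants a larger bet at a better number, which is one more way line shopping feeds through to the bottom line.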


12.5 Timing Your Bets

The Timing Dilemma

One of the most important and debated questions in sports betting is: when should you place your bet? There are two broad schools of thought:

  1. Bet early (at the opener or shortly after): Capture value before the market corrects
  2. Bet late (close to game time): Maximize the information available to you

The correct answer depends on the source of your edge, the sport, and the specific market.

Opening Line Value

When sportsbooks first post a line (the "opener"), it is typically their most uncertain estimate. The opening line has several characteristics:

  • Wider vig: Books often open with higher hold to compensate for uncertainty
  • More prone to error: Less information has been priced in
  • Sharp action hasn't arrived: The market-correcting force of sharp bettors hasn't yet moved the line
  • Limits are lower: Books accept smaller bets on openers to manage risk
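
To see what "wider vig" means numerically, the following sketch computes the total vig of a two-way market as the sum of both implied probabilities minus one, and then removes it proportionally. The function names and the -112/-112 opener are hypothetical; the conversion itself is the implied-probability formula from Section 12.1.

```python
def american_to_implied_prob(odds: float) -> float:
    """Convert American odds to an implied probability (vig included)."""
    if odds < 0:
        return abs(odds) / (abs(odds) + 100)
    return 100 / (odds + 100)


def total_vig(odds_a: float, odds_b: float) -> float:
    """Total vig (overround) of a two-way market."""
    return american_to_implied_prob(odds_a) + american_to_implied_prob(odds_b) - 1


def remove_vig_proportional(odds_a: float, odds_b: float) -> tuple:
    """Scale both implied probabilities so that they sum to 1."""
    p_a = american_to_implied_prob(odds_a)
    p_b = american_to_implied_prob(odds_b)
    total = p_a + p_b
    return p_a / total, p_b / total


# Hypothetical opener at -112 on both sides vs. a standard -110 market
print(f"Opener vig:   {total_vig(-112, -112) * 100:.2f}%")
print(f"Standard vig: {total_vig(-110, -110) * 100:.2f}%")
print(remove_vig_proportional(-110, -110))
```

Proportional scaling is only one way to strip the vig; other methods distribute it differently between the two sides, so treat the no-vig numbers as estimates.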

Despite the wider vig, openers can offer substantial value because the line itself may be significantly off. Here is a decision framework:

def should_bet_early(
    your_estimated_prob: float,
    opening_implied_prob: float,
    expected_closing_implied: float,
    opening_vig: float = 0.048,
    closing_vig: float = 0.043
) -> dict:
    """
    Determine whether to bet at the opener or wait for the close.

    Parameters
    ----------
    your_estimated_prob : float
        Your model's estimated probability
    opening_implied_prob : float
        Opening line implied probability (with vig)
    expected_closing_implied : float
        Your estimate of where the closing line will land
    opening_vig : float
        Estimated total vig at open (default: 4.8%)
    closing_vig : float
        Estimated total vig at close (default: 4.3%)

    Returns
    -------
    dict with recommendation and analysis
    """
    # Approximate no-vig (fair) probabilities. In a two-way market the
    # implied probabilities sum to 1 + vig, so divide by that total.
    opening_no_vig = opening_implied_prob / (1 + opening_vig)
    closing_no_vig = expected_closing_implied / (1 + closing_vig)

    # Edge at open, measured against the price you would actually pay
    edge_at_open = your_estimated_prob - opening_implied_prob

    # Expected edge at close, measured the same way
    edge_at_close = your_estimated_prob - expected_closing_implied

    # Factor in potential line movement
    if edge_at_open > edge_at_close:
        recommendation = "BET EARLY"
        reason = ("Your edge is larger at the opener. The market will "
                 "likely move toward your position, reducing available value.")
    elif edge_at_close > edge_at_open + 0.01:
        recommendation = "WAIT"
        reason = ("Additional information is expected to increase your edge. "
                 "The value of waiting outweighs the risk of missing the number.")
    else:
        recommendation = "BET EARLY (slight)"
        reason = ("Edges are similar, but betting early locks in value and "
                 "avoids the risk of the line moving against you.")

    return {
        'recommendation': recommendation,
        'reason': reason,
        'edge_at_open': edge_at_open,
        'edge_at_close': edge_at_close,
        'edge_difference': edge_at_open - edge_at_close,
        # Fair-probability estimates, reported for reference
        'opening_no_vig': opening_no_vig,
        'closing_no_vig': closing_no_vig,
    }
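
The framework above compares edges in probability terms. As a complementary view, this short sketch expresses the same comparison in expected dollars per $100 staked. The helper name ev_per_100 and the example prices are hypothetical, and the calculation assumes your probability estimate does not change between open and close.

```python
def ev_per_100(win_prob: float, american_odds: float) -> float:
    """Expected profit on a $100 stake at the given American odds."""
    if american_odds > 0:
        win_payout = float(american_odds)
    else:
        win_payout = 100 * 100 / abs(american_odds)
    return win_prob * win_payout - (1 - win_prob) * 100


# Hypothetical scenario: you make the side 55%, the opener is -105,
# and you expect the line to close around -120
ev_open = ev_per_100(0.55, -105)
ev_close = ev_per_100(0.55, -120)
print(f"EV at open:  ${ev_open:.2f} per $100")
print(f"EV at close: ${ev_close:.2f} per $100")
```

In this made-up case nearly all of the expected value disappears if the bet is delayed until the close, which is the situation the "BET EARLY" branch is meant to capture.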

Sport-Specific Timing Strategies

Each sport has distinct characteristics that affect optimal bet timing:

NFL (American Football)

| Time Frame | Strategy | Reasoning |
|---|---|---|
| Sunday/Monday (opener) | Bet strong opinions immediately | NFL lines are the most efficient; early value disappears fastest |
| Monday-Wednesday | Monitor injury reports, bet if value on key numbers | Key-number crossings (3, 7) create discrete value jumps |
| Thursday-Saturday | Bet weather-dependent totals once forecasts firm up | Wind and precipitation significantly affect totals |
| Sunday morning | Final injury check; grab any remaining value | Inactive lists released 90 min before kickoff |

NFL-specific insight: The NFL opener-to-close line movement is typically 1-2 points for sides and 1-3 points for totals. About 60-65% of the closing line move happens in the first 24 hours after opening.

NBA (Basketball)

| Time Frame | Strategy | Reasoning |
|---|---|---|
| Morning (opener) | Bet strong model outputs early | NBA lines are less efficient at open due to volume |
| Early afternoon | Check rest/travel angles | Back-to-back games, road trips create fatigue edges |
| 2-3 hours before tip | Check lineup confirmations | Rest days for stars dramatically move lines |
| Final 30 minutes | Bet on confirmed lineups if value remains | Starting lineup confirmation is the last major info |

NBA-specific insight: Star player rest announcements move NBA lines 2-5 points. If your model accounts for rest patterns before the market does, the opener is extremely valuable.

MLB (Baseball)

| Time Frame | Strategy | Reasoning |
|---|---|---|
| Previous evening | Bet confirmed starters you are bullish on | Starting pitcher is the primary driver of MLB lines |
| Morning | Check for lineup changes, weather | Lineup card submitted 3-4 hours before first pitch |
| 1-2 hours before | Bet weather-sensitive totals | Wind direction at Wrigley Field can swing totals by 2-3 runs |
| Listed pitcher rule | Always use "listed pitcher" option | Protects against late pitcher changes |

MLB-specific insight: MLB lines move the most based on the starting pitcher. A bullpen game announcement or last-minute starter change can move a line 30-50 cents on the moneyline. Always protect yourself with "listed pitcher" bets.
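
As a rough worked example (the -130 and -160 prices are hypothetical), the implied-probability formula from Section 12.1 shows what a 30-cent move on a favorite means in probability terms:

$$ p_{\text{implied}}(-130) = \frac{130}{130 + 100} = 56.52\% $$

$$ p_{\text{implied}}(-160) = \frac{160}{160 + 100} = 61.54\% $$

A bet placed at -130 before such a move therefore holds a price about five percentage points better than the market's updated number.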

NHL (Hockey)

| Time Frame | Strategy | Reasoning |
|---|---|---|
| Morning (opener) | Bet strong opinions on underdogs | NHL moneylines have the most value on underdogs at the open |
| Early afternoon | Check goalie confirmations | Starting goalie can move a line 10-20 cents |
| 1 hour before | Confirm goalie, late scratches | Morning skate confirms starters |

Soccer

| Time Frame | Strategy | Reasoning |
|---|---|---|
| Days before | Bet early if your model is strong | Soccer markets are less efficient than US sports |
| 24 hours before | Check team sheets, travel reports | European competitions with midweek travel create fatigue |
| 1 hour before | Lineup announcements | Confirmed lineups can move lines significantly |

Empirical Analysis: Open vs. Close Profitability

To test whether opening or closing lines offer more value, we can analyze historical data:

import pandas as pd
import numpy as np

def analyze_open_vs_close_value(
    historical_bets: pd.DataFrame,
    n_bootstrap: int = 10000
) -> dict:
    """
    Analyze whether opening or closing lines are more profitable.

    Parameters
    ----------
    historical_bets : pd.DataFrame
        Must contain columns: 'open_odds', 'close_odds', 'result' (1=win, 0=loss)
    n_bootstrap : int
        Number of bootstrap samples for confidence intervals

    Returns
    -------
    dict with profitability comparison
    """
    def profit_at_odds(odds, result):
        """Calculate profit for a $100 bet."""
        if result == 1:
            if odds > 0:
                return 100 * odds / 100
            else:
                return 100 * 100 / abs(odds)
        else:
            return -100

    df = historical_bets.copy()
    df['profit_open'] = df.apply(
        lambda r: profit_at_odds(r['open_odds'], r['result']), axis=1
    )
    df['profit_close'] = df.apply(
        lambda r: profit_at_odds(r['close_odds'], r['result']), axis=1
    )

    roi_open = df['profit_open'].sum() / (len(df) * 100)
    roi_close = df['profit_close'].sum() / (len(df) * 100)

    # Bootstrap confidence intervals
    open_rois = []
    close_rois = []
    for _ in range(n_bootstrap):
        sample = df.sample(len(df), replace=True)
        open_rois.append(sample['profit_open'].sum() / (len(sample) * 100))
        close_rois.append(sample['profit_close'].sum() / (len(sample) * 100))

    return {
        'roi_at_open': roi_open,
        'roi_at_close': roi_close,
        'roi_diff': roi_open - roi_close,
        'open_ci_95': (np.percentile(open_rois, 2.5),
                       np.percentile(open_rois, 97.5)),
        'close_ci_95': (np.percentile(close_rois, 2.5),
                        np.percentile(close_rois, 97.5)),
        'n_bets': len(df),
        'pct_open_better': (df['open_odds'] > df['close_odds']).mean() * 100,
    }
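
The function above resamples a DataFrame once per bootstrap iteration, which can be slow for large samples. The sketch below is a vectorized NumPy variant that bootstraps the paired difference between open and close ROI directly. Note that it is a different design from the listing above (one interval for the difference rather than two separate intervals), and the function names, the seed parameter, and the sample data are assumptions for illustration.

```python
import numpy as np


def profit_per_100(odds, result) -> np.ndarray:
    """Profit on a $100 stake for arrays of American odds and 1/0 results."""
    odds = np.asarray(odds, dtype=float)
    result = np.asarray(result)
    win_payout = np.where(odds > 0, odds, 100 * 100 / np.abs(odds))
    return np.where(result == 1, win_payout, -100.0)


def bootstrap_roi_diff(open_odds, close_odds, result,
                       n_bootstrap: int = 2000, seed: int = 0) -> dict:
    """Paired bootstrap of (ROI at open - ROI at close) per bet."""
    rng = np.random.default_rng(seed)
    # Per-bet ROI difference; each bet keeps its own open/close pairing
    diff = (profit_per_100(open_odds, result)
            - profit_per_100(close_odds, result)) / 100
    n = len(diff)
    # Draw all resamples at once: shape (n_bootstrap, n) of row indices
    idx = rng.integers(0, n, size=(n_bootstrap, n))
    boot_means = diff[idx].mean(axis=1)
    return {
        'roi_diff': diff.mean(),
        'ci_95': (np.percentile(boot_means, 2.5),
                  np.percentile(boot_means, 97.5)),
    }


# Tiny made-up sample: four bets with open and close prices
out = bootstrap_roi_diff(
    open_odds=[-105, 120, -110, -102],
    close_odds=[-115, 110, -110, -108],
    result=[1, 0, 1, 1],
)
print(out)
```

The index matrix holds n_bootstrap × n integers, so for very large bet histories it may be preferable to lower n_bootstrap or resample in batches.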

The "Steam Move" Phenomenon

A steam move occurs when multiple sportsbooks simultaneously move their lines in the same direction, typically in response to large wagers from sharp bettors or syndicates. Understanding steam moves is critical for bet timing:

  1. Detecting steam: When 3+ books move the same direction within minutes, a steam move is likely
  2. Following steam: Betting in the direction of the move at a book that hasn't yet adjusted can be profitable
  3. Fading steam: Betting against the move is generally unprofitable; sharp money is right more often than not

def detect_steam_move(
    line_history: list,
    window_minutes: int = 10,
    min_books_moving: int = 3,
    min_move_size: float = 0.5
) -> list:
    """
    Detect potential steam moves from line movement history.

    Parameters
    ----------
    line_history : list of dict
        Each dict: {'time': datetime, 'book': str, 'line': float}
    window_minutes : int
        Time window to look for coordinated moves
    min_books_moving : int
        Minimum number of books moving in same direction
    min_move_size : float
        Minimum line movement to count (in points)

    Returns
    -------
    list of detected steam move events
    """
    from collections import defaultdict
    from datetime import timedelta

    # Group by time windows
    if not line_history:
        return []

    sorted_history = sorted(line_history, key=lambda x: x['time'])
    steam_events = []

    # End of the window that produced the most recent steam event
    last_reported_end = None

    for i, entry in enumerate(sorted_history):
        # Skip entries inside a window that already produced an event, so a
        # single coordinated move is not reported once per entry
        if last_reported_end is not None and entry['time'] <= last_reported_end:
            continue

        window_end = entry['time'] + timedelta(minutes=window_minutes)

        # Find all movements within the window
        window_moves = []
        for j in range(i, len(sorted_history)):
            if sorted_history[j]['time'] > window_end:
                break
            window_moves.append(sorted_history[j])

        # Collect each book's lines within the window, in time order
        moves_by_book = defaultdict(list)
        for move in window_moves:
            moves_by_book[move['book']].append(move['line'])

        # Determine direction of movement for each book
        # (first vs. last line seen in the window)
        up_moves = 0
        down_moves = 0
        for book, lines in moves_by_book.items():
            if len(lines) >= 2:
                change = lines[-1] - lines[0]
                if change >= min_move_size:
                    up_moves += 1
                elif change <= -min_move_size:
                    down_moves += 1

        if up_moves >= min_books_moving:
            steam_events.append({
                'time': entry['time'],
                'direction': 'UP',
                'books_moving': up_moves,
            })
            last_reported_end = window_end
        elif down_moves >= min_books_moving:
            steam_events.append({
                'time': entry['time'],
                'direction': 'DOWN',
                'books_moving': down_moves,
            })
            last_reported_end = window_end

    return steam_events
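
Once a steam move has been flagged, point 2 in the list above (following steam) comes down to finding books that have not yet adjusted. The sketch below is one simple way to do that, assuming you already have the current line at each book and the direction of the move. The function name, the min_gap parameter, and the sample lines are hypothetical; the 'UP'/'DOWN' labels match the output of detect_steam_move.

```python
import statistics


def find_stale_books(current_lines: dict, direction: str,
                     min_gap: float = 0.5) -> list:
    """
    Return books whose line lags the consensus after a steam move.

    current_lines : {book: line}
    direction     : 'UP' or 'DOWN', the direction the market moved
    min_gap       : how far (in points) a book must trail the median
    """
    consensus = statistics.median(current_lines.values())
    if direction == 'UP':
        # Market moved up, so a stale book still shows a lower number
        stale = [b for b, line in current_lines.items()
                 if consensus - line >= min_gap]
    elif direction == 'DOWN':
        stale = [b for b, line in current_lines.items()
                 if line - consensus >= min_gap]
    else:
        raise ValueError("direction must be 'UP' or 'DOWN'")
    return sorted(stale)


# Hypothetical total that steamed up from 44.5 to 45.5 at most books
lines_now = {'book_a': 45.5, 'book_b': 45.5, 'book_c': 45.5,
             'book_d': 44.5, 'book_e': 45.0}
stale_books = find_stale_books(lines_now, 'UP')
print(stale_books)
```

For a total that has steamed up, the stale lower number is where the Over is still available at the old price. Using the median as the consensus is a design choice of this sketch; a sharper reference book could be used instead.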

Callout: The Speed Advantage

In modern sports betting, information travels fast. A line that is 2 points off the market will be corrected within minutes at most books. The window to capture value from stale lines is measured in seconds to minutes, not hours. This is why automated tools (like the bot we built in Section 12.4) are increasingly important -- no human can manually check 8 sportsbooks across dozens of markets fast enough to capture every opportunity.

Combining Timing with Line Shopping

The most effective approach combines timing knowledge with systematic line shopping:

  1. Pre-game homework: Develop your opinion on a game using your model or analysis framework before lines open
  2. Opener scan: When lines open, immediately scan all books for the best number. If your opinion aligns with value, bet the opener
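  3. Mid-cycle monitoring: Use your bot to track line movements. If the market moves in the direction of your opinion (confirming your side), the value at most books may be gone, although books that have not yet adjusted may still show the old number. If the market moves against your opinion, a better number becomes available, provided your analysis still holds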
  4. Pre-game final check: 30-60 minutes before game time, do a final scan. Late-breaking news can create short-lived value opportunities
  5. CLV tracking: After the game starts, record the closing line and calculate your CLV on each bet
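
Step 5 can be sketched in a few lines. The version below uses one common definition of CLV, the no-vig closing probability minus the implied probability of the price you took, with the additive vig removal mentioned in the exercises. The function names and the example prices are assumptions for illustration, and the result may differ slightly from the Section 12.2 functions if they use another convention.

```python
def american_to_implied_prob(odds: float) -> float:
    """Convert American odds to an implied probability (vig included)."""
    if odds < 0:
        return abs(odds) / (abs(odds) + 100)
    return 100 / (odds + 100)


def closing_fair_prob_additive(close_side: float, close_other: float) -> float:
    """No-vig closing probability: subtract half the overround from each side."""
    p_side = american_to_implied_prob(close_side)
    p_other = american_to_implied_prob(close_other)
    overround = p_side + p_other - 1
    return p_side - overround / 2


def closing_line_value(bet_odds: float, close_side: float,
                       close_other: float) -> float:
    """CLV in probability points; positive means you beat the fair close."""
    fair_close = closing_fair_prob_additive(close_side, close_other)
    return fair_close - american_to_implied_prob(bet_odds)


# Hypothetical bet: taken at +100, market closes -115 on your side, -105 on the other
clv = closing_line_value(100, -115, -105)
print(f"CLV: {clv * 100:+.2f} percentage points")
```

Logging this number for every bet, alongside the placed and closing odds, gives the per-bet record needed to evaluate CLV over time.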

12.6 Chapter Summary

This chapter established line shopping and market analysis as fundamental practices for any serious sports bettor. The key takeaways are:

Line Shopping Impact:

  • Even small improvements in odds (3-5 cents) compound dramatically over thousands of bets
  • The difference between shopping and not shopping can double or triple a bettor's annual ROI
  • Key numbers in football and basketball make half-point improvements disproportionately valuable

Closing Line Value (CLV):

  • CLV is the most reliable predictor of long-term betting profitability
  • It measures the quality of your betting decisions independently of outcome variance
  • CLV converges to a meaningful signal faster than raw win/loss records
  • Calculating CLV requires recording both your placed odds and the closing odds

Practical Tools:

  • We built a complete Python system for collecting, storing, analyzing, and alerting on odds across multiple sportsbooks
  • The system uses The Odds API for data collection, SQLite for storage, and configurable alerting
  • This infrastructure can be extended with dashboards, mobile alerts, and bankroll integration

Bet Timing:

  • The optimal time to bet depends on your edge source and the sport
  • NFL and NBA bets generally favor early action for model-based edges
  • MLB bets should account for confirmed pitchers and weather
  • Steam moves represent coordinated sharp action and are generally not to be faded
  • Combining systematic line shopping with informed timing maximizes long-term expected value

In Chapter 13, we will build on these concepts to develop a complete framework for value betting -- systematically identifying, executing, and tracking bets where you have a genuine edge over the market.


Exercises

Exercise 12.1: Using the CLV calculation functions from Section 12.2, compute the CLV for the following bets:

  (a) Bet on Team A moneyline at +180. Closing line: +160 / -185. Use the additive vig removal method.
  (b) Bet on Under 48.5 at -108. Closing line: Under 47.5 at -110. (Hint: the total moved through your number.)

Exercise 12.2: Modify the OddsCollector class to also track first-half spreads and totals. What additional API parameters would you need?

Exercise 12.3: Build a function that takes a pandas DataFrame of historical odds and computes the "line shopping premium" -- the average difference between the best and worst available odds across all events in the dataset.

Exercise 12.4: Write a simulation that compares three strategies over 5,000 bets: (a) always bet at the median available odds, (b) always bet at the best available odds, (c) always bet at the worst available odds. Assume a true win probability of 52% and that odds vary by N(0, 0.03) across books around a fair line with 4.5% vig.

Exercise 12.5: Implement a "steam move detector" that processes real-time line movement data and identifies when three or more books move the same direction within a 5-minute window. Test it on synthetic data.