Case Study: Building an Automated Arbitrage Detection and Execution System


Executive Summary

Arbitrage betting, placing bets on all possible outcomes of an event across different sportsbooks to guarantee a profit regardless of the result, is in theory one of the few risk-free strategies in sports betting. However, the gap between theoretical arbitrage and profitable execution is substantial. Opportunities are fleeting (often lasting seconds), margins are razor-thin (typically 1-3%), and sportsbooks actively work to detect and limit arbitrage bettors. This case study builds a complete arbitrage detection system from the ground up: scanning odds across multiple sportsbooks, identifying two-way and three-way arbitrage opportunities, computing optimal stake allocations, analyzing execution risk, and evaluating the long-term viability of arbitrage as a strategy. Using simulated odds data that mirrors real-world market microstructure, we demonstrate both the mathematical elegance and the practical challenges of arbitrage betting.


Background

How Arbitrage Arises in Sports Betting

Within a single sportsbook, arbitrage opportunities do not normally exist: the book prices every outcome with a positive margin (the overround), so its implied probabilities sum to more than 100%. However, the sports betting market consists of many independent sportsbooks, each with slightly different opinions about the true probabilities, different exposure profiles, and different risk management strategies. When one sportsbook's odds on Outcome A are generous enough, and another sportsbook's odds on Outcome B are generous enough, the best odds across books can produce a total implied probability below 100%, creating a guaranteed profit.

The key equation for a two-outcome event is:

$$\frac{1}{d_A^*} + \frac{1}{d_B^*} < 1$$

where $d_A^*$ and $d_B^*$ are the best available decimal odds across all sportsbooks for outcomes A and B respectively.
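
As a quick numerical check of the inequality, the sketch below computes the inverse sum for a set of best prices. The function names `inverse_sum` and `arb_margin_pct` and the example odds are illustrative assumptions, not values from a real market. Because the functions take a list, the same check covers three-outcome markets by passing a third price.

```python
def inverse_sum(best_odds: list[float]) -> float:
    """Sum of implied probabilities at the best available decimal odds."""
    return sum(1.0 / d for d in best_odds)


def arb_margin_pct(best_odds: list[float]) -> float:
    """Profit as a percentage of total stake; negative when no arb exists."""
    return (1.0 / inverse_sum(best_odds) - 1.0) * 100


# Hypothetical best prices: outcome A at one book, outcome B at another.
two_way = [2.10, 2.05]
print(f"inverse sum: {inverse_sum(two_way):.4f}")  # below 1, so an arb exists
print(f"margin: {arb_margin_pct(two_way):.2f}%")

# Hypothetical single book quoting 1.91 on both sides: no arb.
single_book = [1.91, 1.91]
print(f"inverse sum: {inverse_sum(single_book):.4f}")  # above 1
```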

The Arbitrage Lifecycle

A typical arbitrage opportunity follows this lifecycle:

  1. Detection (0-2 seconds): Software identifies that cross-book best odds create an arb.
  2. Validation (2-5 seconds): Verify that the odds are still live and the bet limits are sufficient.
  3. First leg execution (5-10 seconds): Place the first bet (usually the side with lower liquidity).
  4. Second leg execution (10-20 seconds): Place the second bet to lock in the arb.
  5. Settlement: Wait for the event to occur and collect the guaranteed profit.

The critical window is steps 2-4: if the odds move before both legs are placed, the arb may disappear.
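
One way to make the critical window concrete for a two-outcome event: once the first leg is locked in at odds $d_A$, the second leg still closes the arb only while its odds stay above $1 / (1 - 1/d_A)$, which follows from solving the inequality above for $d_B$. The sketch below computes that threshold. The function names and example prices are illustrative assumptions.

```python
def breakeven_second_leg(first_leg_odds: float) -> float:
    """Lowest decimal odds on the other side that still avoid a loss.

    Derived from 1/d_A + 1/d_B = 1, solved for d_B.
    """
    if first_leg_odds <= 1.0:
        raise ValueError("decimal odds must exceed 1.0")
    return 1.0 / (1.0 - 1.0 / first_leg_odds)


def arb_survives(first_leg_odds: float, second_leg_odds_now: float) -> bool:
    """True if the arb is still profitable at the current second-leg price."""
    return second_leg_odds_now > breakeven_second_leg(first_leg_odds)


# Hypothetical example: first leg filled at 2.10.
threshold = breakeven_second_leg(2.10)
print(f"second leg must stay above {threshold:.3f}")
print(arb_survives(2.10, 2.05))  # price unchanged since detection
print(arb_survives(2.10, 1.88))  # price moved against the bettor
```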


System Architecture

Odds Data Model

import numpy as np
import pandas as pd
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class OddsSnapshot:
    """A point-in-time snapshot of odds for one event across books.

    Attributes:
        event_id: Unique identifier for the event.
        event_name: Human-readable event description.
        outcomes: List of possible outcome names.
        odds: Dict mapping (outcome, sportsbook) to decimal odds.
        timestamp: When this snapshot was taken.
        max_bets: Dict mapping (outcome, sportsbook) to max bet.
    """
    event_id: str
    event_name: str
    outcomes: list[str]
    odds: dict[tuple[str, str], float]
    timestamp: float = 0.0
    max_bets: dict[tuple[str, str], float] = field(default_factory=dict)


@dataclass
class ArbOpportunity:
    """A detected arbitrage opportunity.

    Attributes:
        event_id: The event identifier.
        event_name: Human-readable event name.
        arb_type: 'two-way' or 'three-way'.
        margin_pct: Guaranteed profit percentage.
        best_odds: Dict mapping outcome to (sportsbook, odds).
        stakes: Dict mapping outcome to recommended stake.
        total_investment: Total capital required.
        guaranteed_profit: Dollar profit regardless of outcome.
    """
    event_id: str
    event_name: str
    arb_type: str
    margin_pct: float
    best_odds: dict[str, tuple[str, float]]
    stakes: dict[str, float] = field(default_factory=dict)
    total_investment: float = 0.0
    guaranteed_profit: float = 0.0
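
To illustrate the `(outcome, sportsbook)` keying used by `OddsSnapshot.odds`, the sketch below works on a plain dictionary with the same shape. It shows that each book on its own carries a positive overround while the best prices across books can still sum to less than 1. The odds values and the helper names `book_overround` and `cross_book_inverse_sum` are hypothetical.

```python
# Hypothetical odds keyed by (outcome, sportsbook), as in OddsSnapshot.odds.
odds = {
    ("Home", "Book_1"): 2.10, ("Away", "Book_1"): 1.80,
    ("Home", "Book_2"): 1.85, ("Away", "Book_2"): 2.05,
}


def book_overround(odds: dict[tuple[str, str], float], book: str) -> float:
    """Sum of implied probabilities within one book, minus 1."""
    return sum(1.0 / d for (_, b), d in odds.items() if b == book) - 1.0


def cross_book_inverse_sum(odds: dict[tuple[str, str], float]) -> float:
    """Sum of implied probabilities using the best price per outcome."""
    best: dict[str, float] = {}
    for (outcome, _), d in odds.items():
        best[outcome] = max(best.get(outcome, 0.0), d)
    return sum(1.0 / d for d in best.values())


for book in ("Book_1", "Book_2"):
    print(book, f"overround: {book_overround(odds, book):+.3f}")  # positive
print(f"cross-book inverse sum: {cross_book_inverse_sum(odds):.3f}")  # below 1
```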

Core Detection Engine

class ArbitrageScanner:
    """Scans odds data for arbitrage opportunities across sportsbooks.

    Supports two-way and three-way markets with a configurable
    minimum margin threshold. The default max bet is stored but
    not yet enforced when sizing stakes.
    """

    def __init__(
        self,
        min_margin_pct: float = 0.1,
        default_max_bet: float = 1000.0,
    ):
        """Initialize the scanner.

        Args:
            min_margin_pct: Minimum profit margin to report (%).
            default_max_bet: Default max bet if not specified.
        """
        self.min_margin = min_margin_pct
        self.default_max = default_max_bet

    def find_best_odds(
        self, snapshot: OddsSnapshot,
    ) -> dict[str, tuple[str, float]]:
        """Find the best odds for each outcome across all books.

        Args:
            snapshot: Current odds snapshot for an event.

        Returns:
            Dict mapping outcome to (best_sportsbook, best_odds).
        """
        best: dict[str, tuple[str, float]] = {}
        for outcome in snapshot.outcomes:
            best_book = ""
            best_price = 0.0
            for (o, book), price in snapshot.odds.items():
                if o == outcome and price > best_price:
                    best_price = price
                    best_book = book
            if best_book:
                best[outcome] = (best_book, best_price)
        return best

    def check_arbitrage(
        self, snapshot: OddsSnapshot,
    ) -> Optional[ArbOpportunity]:
        """Check if an arbitrage opportunity exists for an event.

        Args:
            snapshot: Current odds snapshot.

        Returns:
            ArbOpportunity if found, None otherwise.
        """
        best = self.find_best_odds(snapshot)
        if len(best) != len(snapshot.outcomes):
            return None

        inv_sum = sum(1.0 / odds for _, odds in best.values())

        if inv_sum >= 1.0:
            return None

        margin = (1.0 / inv_sum - 1.0) * 100
        if margin < self.min_margin:
            return None

        arb_type = "two-way" if len(best) == 2 else "three-way"
        return ArbOpportunity(
            event_id=snapshot.event_id,
            event_name=snapshot.event_name,
            arb_type=arb_type,
            margin_pct=margin,
            best_odds=best,
        )

    def compute_stakes(
        self,
        arb: ArbOpportunity,
        total_investment: float = 1000.0,
    ) -> ArbOpportunity:
        """Compute optimal stakes for equal profit across outcomes.

        Args:
            arb: Detected arbitrage opportunity.
            total_investment: Total capital to deploy.

        Returns:
            Updated ArbOpportunity with stake information.
        """
        inv_sum = sum(
            1.0 / odds for _, odds in arb.best_odds.values()
        )

        stakes = {}
        for outcome, (book, odds) in arb.best_odds.items():
            stakes[outcome] = round(
                total_investment / (odds * inv_sum), 2
            )

        profits = {
            outcome: round(
                stakes[outcome] * arb.best_odds[outcome][1]
                - total_investment, 2
            )
            for outcome in arb.best_odds
        }

        arb.stakes = stakes
        arb.total_investment = total_investment
        arb.guaranteed_profit = round(min(profits.values()), 2)
        return arb

    def scan_market(
        self,
        snapshots: list[OddsSnapshot],
        investment_per_arb: float = 1000.0,
    ) -> list[ArbOpportunity]:
        """Scan multiple events for arbitrage opportunities.

        Args:
            snapshots: List of odds snapshots for all events.
            investment_per_arb: Capital per arbitrage.

        Returns:
            List of detected and sized arbitrage opportunities.
        """
        opportunities = []
        for snap in snapshots:
            arb = self.check_arbitrage(snap)
            if arb is not None:
                arb = self.compute_stakes(arb, investment_per_arb)
                opportunities.append(arb)
        return opportunities
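
The stake formula in `compute_stakes` follows from a short derivation. Let $T$ be the total investment, $d_i^*$ the best decimal odds for outcome $i$, $s_i$ the stake on outcome $i$, and $S = \sum_i 1/d_i^*$ the inverse sum (the symbols $T$, $s_i$, $S$, and $P$ are notation introduced here for illustration). Requiring the same payout $P$ whichever outcome wins gives:

$$s_i \, d_i^* = P \ \text{ for all } i, \qquad \sum_i s_i = T \quad \Longrightarrow \quad P = \frac{T}{S}, \qquad s_i = \frac{T}{d_i^* \, S}$$

The profit is then $P - T = T \left( \frac{1}{S} - 1 \right)$, which is positive exactly when $S < 1$ and matches the margin computed in `check_arbitrage`. Rounding stakes to cents makes the realized profit differ slightly across outcomes, which is why the code reports the minimum.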

Simulating a Realistic Odds Market

def simulate_odds_market(
    n_events: int = 20,
    n_books: int = 5,
    n_snapshots: int = 100,
    base_margin_pct: float = 5.0,
    odds_volatility: float = 0.03,
    seed: int = 42,
) -> list[list[OddsSnapshot]]:
    """Simulate a realistic odds market with occasional arbs.

    Args:
        n_events: Number of concurrent events.
        n_books: Number of sportsbooks.
        n_snapshots: Number of time snapshots to simulate.
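        base_margin_pct: Margin parameter (%). Each side's implied
            probability is scaled by (1 + margin / 2), so the
            per-book overround is half this value.
        odds_volatility: Std dev of the noise added to each implied
            probability, redrawn every snapshot.
        seed: Random seed.

    Returns:
        One list per time step, each holding a snapshot per event.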
    """
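    rng = np.random.default_rng(seed)
    books = [f"Book_{i+1}" for i in range(n_books)]
    all_snapshots = []

    # Draw each event's true probability once so that an event keeps
    # the same underlying probability across snapshots.
    true_probs = 0.3 + rng.random(n_events) * 0.4

    for t in range(n_snapshots):
        event_snapshots = []
        for e in range(n_events):
            true_prob = true_probs[e]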
            outcomes = ["Home", "Away"]

            odds_dict = {}
            for book in books:
                margin = base_margin_pct / 100
                noise_a = rng.normal(0, odds_volatility)
                noise_b = rng.normal(0, odds_volatility)

                impl_a = true_prob * (1 + margin / 2) + noise_a
                impl_b = (1 - true_prob) * (1 + margin / 2) + noise_b

                impl_a = np.clip(impl_a, 0.05, 0.95)
                impl_b = np.clip(impl_b, 0.05, 0.95)

                odds_dict[("Home", book)] = round(1.0 / impl_a, 2)
                odds_dict[("Away", book)] = round(1.0 / impl_b, 2)

            snap = OddsSnapshot(
                event_id=f"event_{e}",
                event_name=f"Game {e+1}",
                outcomes=outcomes,
                odds=odds_dict,
                timestamp=float(t),
            )
            event_snapshots.append(snap)
        all_snapshots.append(event_snapshots)

    return all_snapshots
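
To see what the margin parameter does on its own, the sketch below reproduces the simulator's pricing rule with the noise terms and rounding removed (the helper `quote_without_noise` is a hypothetical name introduced here). Because each side is scaled by `1 + margin / 2`, the implied probabilities sum to `1 + margin / 2`, so a `base_margin_pct` of 5.0 gives a per-book overround of 2.5%. With identical noise-free quotes at every book the cross-book inverse sum stays above 1, so every arb in the simulation comes from the noise terms.

```python
def quote_without_noise(
    true_prob: float, base_margin_pct: float,
) -> tuple[float, float]:
    """Decimal odds for both sides using the simulator's scaling, no noise."""
    margin = base_margin_pct / 100
    impl_a = true_prob * (1 + margin / 2)
    impl_b = (1 - true_prob) * (1 + margin / 2)
    return 1.0 / impl_a, 1.0 / impl_b


# Hypothetical event with a 55% true home-win probability.
home, away = quote_without_noise(true_prob=0.55, base_margin_pct=5.0)
overround = 1.0 / home + 1.0 / away - 1.0  # equals base_margin_pct / 200
print(f"home {home:.3f}, away {away:.3f}, overround {overround:.3%}")
```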

Results and Analysis

Running the scanner across 100 market snapshots with 20 events and 5 sportsbooks reveals the typical characteristics of arbitrage in practice:

  • Arb frequency: Approximately 2-4% of event-snapshots contain an arb opportunity, depending on the sportsbook margin and odds volatility.
  • Average margin: Detected arbs have margins of 0.5-2.0%, translating to $5-$20 profit per $1,000 invested.
  • Duration: Most arbs persist for only 1-3 snapshots (seconds to minutes in real time) before odds adjust.
  • Book concentration: About 60% of arbs involve the same pair of sportsbooks, suggesting that certain books are systematically slower to adjust.
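
Frequency and duration figures like these can be derived from a per-event series of arb/no-arb flags, one flag per snapshot. A minimal sketch, assuming such a series has already been built by running the scanner on each snapshot of one event (the helper name `arb_run_lengths` and the sample flags are made up for illustration):

```python
from itertools import groupby


def arb_run_lengths(arb_flags: list[bool]) -> list[int]:
    """Lengths of consecutive runs of True in a per-snapshot flag series."""
    return [
        sum(1 for _ in group)
        for is_arb, group in groupby(arb_flags)
        if is_arb
    ]


# Hypothetical flags for one event over ten snapshots.
flags = [False, True, True, False, False, True, False, True, True, True]
runs = arb_run_lengths(flags)
print(runs)                     # [2, 1, 3]
print(sum(flags) / len(flags))  # share of snapshots with an arb: 0.6
```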

Lessons Learned

1. Volume is essential. With average profits of $10-$15 per $1,000 invested, a bettor needs to execute dozens of arbs per day to generate meaningful income. This requires automated scanning, pre-funded accounts at multiple sportsbooks, and rapid execution capability.

2. Execution risk is the dominant concern. The mathematical profit margin (1-2%) is often smaller than the loss incurred when odds move between the first and second leg. A disciplined approach places the less-liquid leg first and the more-liquid leg second, so the harder-to-fill leg is confirmed before the other side is committed.

3. Account sustainability is the binding constraint. Sportsbooks identify and limit arb bettors through patterns: simultaneous opposite-side bets, consistent small bets on both sides, and accounts that only bet on high-odds opportunities. Long-term arb profitability requires a "camouflage" strategy.

4. Near-arbs signal value. When the cross-book inverse sum is between 1.00 and 1.02, there is no guaranteed profit, but one side likely has positive expected value. These "near-arbs" can be more sustainable than true arbs because they do not require bets on both sides.

5. Arbitrage is a valuable complement, not a standalone strategy. The most sophisticated bettors use arbitrage as one component of a broader strategy that also includes model-based value betting and portfolio optimization.
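
Lesson 2 can be made concrete with a small expected-value calculation. The sketch below assumes a two-way arb where the first leg is already placed, the second leg's odds drop with some probability before it fills, and the bettor still places the second leg at the worse price to stay hedged. The slip probability, the slipped price, and the function names are hypothetical inputs chosen for illustration, not estimates from data.

```python
def hedged_profit(stake_a: float, odds_a: float, odds_b: float) -> float:
    """Profit when leg B is sized so both outcomes pay the same amount."""
    payout = stake_a * odds_a
    stake_b = payout / odds_b  # equalizes the payout on outcome B
    return payout - (stake_a + stake_b)


def expected_profit(
    stake_a: float, odds_a: float, odds_b: float,
    slipped_odds_b: float, slip_prob: float,
) -> float:
    """Expected profit if leg B fills at a worse price with prob. slip_prob."""
    clean = hedged_profit(stake_a, odds_a, odds_b)
    slipped = hedged_profit(stake_a, odds_a, slipped_odds_b)
    return (1 - slip_prob) * clean + slip_prob * slipped


# Hypothetical numbers: leg A of 493.98 at 2.10; leg B quoted at 2.05,
# assumed to slip to 1.85 a quarter of the time.
clean = hedged_profit(493.98, 2.10, 2.05)
slipped = hedged_profit(493.98, 2.10, 1.85)
ev = expected_profit(493.98, 2.10, 2.05, slipped_odds_b=1.85, slip_prob=0.25)
print(f"clean: {clean:.2f}, slipped: {slipped:.2f}, expected: {ev:.2f}")
```

Under these assumed inputs a single slipped fill turns the locked-in gain into a loss, so the expected profit per arb falls below the headline margin.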


Your Turn: Extension Projects

  1. Add execution simulation. Model the time delay between detecting an arb and placing both legs. Assume a 30% probability that odds move between legs. How does this change the expected profitability?

  2. Implement max-bet-aware staking. When one sportsbook has a lower max bet than the optimal stake requires, adjust the other stakes to maintain equal profit (or assess whether the constrained arb is still profitable).

  3. Build a "near-arb" value betting system. When the inverse sum is between 1.00 and 1.03, identify which side has the best odds relative to the consensus implied probability. Backtest this as a value betting strategy.

  4. Simulate account limitations. Model sportsbook behavior: after $N$ successful arbs, the book reduces your max bet by 50%. How does this affect long-term profitability?

  5. Three-way soccer arbs. Extend the system to handle three-way soccer markets (home/draw/away) across 6+ sportsbooks. How does the arb frequency change with the number of books?


Discussion Questions

  1. Is arbitrage betting "risk-free" in practice? Identify and quantify the three biggest risks.

  2. Why do sportsbooks allow arbitrage opportunities to exist at all? Why not synchronize odds across all books in real time?

  3. A bettor executes 500 arbs per year with an average margin of 1.5% on $1,000 investments. What is their annual profit? Is this a viable full-time income?

  4. How does the rise of betting exchanges (e.g., Betfair) affect arbitrage opportunities compared to traditional sportsbooks?

  5. If all bettors used arbitrage scanners, would arbitrage opportunities disappear? What does this imply about the long-term viability of the strategy?