Chapter 25: Optimization Methods for Betting
"The essence of mathematics is not to make simple things complicated, but to make complicated things simple." --- Stan Gudder
Chapter Overview
Throughout this book, we have developed progressively more sophisticated methods for estimating probabilities, building predictive models, and evaluating betting opportunities. But estimation is only half the problem. Once you have identified a set of positive-expected-value bets, you face a second, equally challenging question: how much should you bet on each one?
This is fundamentally an optimization problem. You have a bankroll, a set of available bets with estimated edges and uncertainties, constraints imposed by sportsbooks (maximum bet sizes, minimum bets) and regulators (maximum daily/weekly exposure in some jurisdictions), and objectives that may include not just maximizing expected profit but also controlling risk, managing liquidity, and maintaining a sustainable long-term operation.
This chapter brings together the optimization toolkit needed to translate probabilistic insights into optimal betting decisions. We begin with linear programming, which handles problems with linear objectives and constraints. We then extend to portfolio optimization, adapting the mean-variance framework pioneered by Harry Markowitz for financial portfolios to the betting context. We develop algorithmic tools for detecting and executing arbitrage opportunities. We generalize the Kelly criterion (introduced in Chapter 4) with practical constraints. Finally, we tackle multi-objective optimization, which provides a principled framework for balancing competing goals.
This chapter assumes familiarity with the Kelly criterion and basic bet sizing concepts from Chapter 4, as well as the model evaluation methods from Chapter 14. All implementations use Python's cvxpy, PuLP, and scipy.optimize libraries.
In this chapter, you will learn to:
- Formulate and solve linear programs that determine optimal bet allocations under real-world constraints
- Construct efficient betting portfolios using mean-variance optimization with correlated bet outcomes
- Build automated arbitrage detection systems that scan odds across multiple sportsbooks in real time
- Extend Kelly criterion bet sizing with maximum bet constraints, maximum exposure limits, and correlation between outcomes
- Design multi-objective optimization frameworks that navigate the trade-offs between profit, risk, and operational feasibility
25.1 Linear Programming for Betting Portfolios
The Linear Programming Framework
A linear program (LP) is an optimization problem with a linear objective function and linear inequality/equality constraints. The standard form is:
$$\begin{aligned} \text{maximize} \quad & \mathbf{c}^T \mathbf{x} \\ \text{subject to} \quad & \mathbf{A} \mathbf{x} \leq \mathbf{b} \\ & \mathbf{x} \geq \mathbf{0} \end{aligned}$$
where $\mathbf{x} \in \mathbb{R}^n$ is the vector of decision variables (bet amounts), $\mathbf{c} \in \mathbb{R}^n$ is the vector of objective coefficients (expected returns), $\mathbf{A} \in \mathbb{R}^{m \times n}$ is the constraint matrix, and $\mathbf{b} \in \mathbb{R}^m$ is the constraint vector.
Linear programs have remarkable properties:
- Global optimality: If an optimal solution exists, LP algorithms find the global optimum, never a merely local one.
- Efficient algorithms: The simplex method and interior-point methods solve LPs with thousands of variables in milliseconds.
- Duality: Every LP has a dual problem that provides bounds and economic interpretations.
- Sensitivity analysis: We can determine how the optimal solution changes as parameters vary, which is critical when our probability estimates are uncertain.
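Before the full implementations below, the standard form can be exercised directly with scipy.optimize.linprog, using made-up edge and limit numbers (linprog minimizes, so the objective is negated):

```python
import numpy as np
from scipy.optimize import linprog

# Maximize c^T x  <=>  minimize (-c)^T x
c = np.array([0.06, 0.04, 0.09])         # expected return per dollar (illustrative)
A = np.array([[1.0, 1.0, 1.0]])          # total-exposure row of the constraint matrix
b = np.array([1000.0])                   # E_max: at most $1000 wagered in total
bounds = [(0, 500), (0, 500), (0, 300)]  # per-bet limits M_i (and x >= 0)

res = linprog(c=-c, A_ub=A, b_ub=b, bounds=bounds)
x_opt = res.x
# The solution fills the highest-edge bets to their limits first:
# bet 3 (9%) to $300, bet 1 (6%) to $500, the remaining $200 to bet 2.
print(x_opt, -res.fun)
```

The same greedy-looking structure (fill best edges to their caps) is exactly what the simplex method recovers here, because the constraints are so simple; the value of the LP machinery shows up once group constraints and overlapping limits interact.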
Betting Allocation as a Linear Program
Consider a bettor facing $n$ potential bets. For bet $i$:
- $x_i$ = the amount wagered on bet $i$ (decision variable)
- $e_i$ = the expected return per dollar wagered (i.e., if the true probability is $p_i$ and the decimal odds are $d_i$, then $e_i = p_i \cdot d_i - 1$)
- $B$ = the total bankroll
- $M_i$ = the maximum allowed bet on event $i$ (sportsbook limit)
- $E_{\max}$ = the maximum total exposure (total amount at risk)
The LP formulation is:
$$\begin{aligned} \text{maximize} \quad & \sum_{i=1}^{n} e_i x_i \\ \text{subject to} \quad & \sum_{i=1}^{n} x_i \leq E_{\max} \quad & \text{(total exposure limit)} \\ & 0 \leq x_i \leq M_i \quad & \text{(individual bet limits)} \\ & \sum_{i \in S_k} x_i \leq L_k \quad & \text{(sport/league exposure limits)} \end{aligned}$$
where $S_k$ denotes the set of bets in sport/league $k$ and $L_k$ is the maximum exposure for that sport.
Python Implementation with PuLP
import numpy as np
import pandas as pd
from typing import Optional
def solve_betting_lp_pulp(
expected_returns: np.ndarray,
max_bets: np.ndarray,
max_total_exposure: float,
min_bet: float = 0.0,
group_constraints: Optional[dict[str, tuple[list[int], float]]] = None,
bankroll: float = 10000.0,
) -> dict:
"""
Solve a betting allocation problem using linear programming (PuLP).
Maximizes expected profit subject to individual bet limits,
total exposure limits, and optional group constraints.
Args:
expected_returns: Array of expected return per dollar for each bet.
Positive values indicate positive-EV bets.
max_bets: Array of maximum allowed wager for each bet.
max_total_exposure: Maximum total amount wagered across all bets.
        min_bet: Reporting threshold below which a bet is not counted
            as placed. (A hard minimum-stake constraint would require
            binary variables, i.e., a mixed-integer program, and is
            not enforced here.)
group_constraints: Optional dictionary mapping group names to
(list of bet indices, max exposure for group).
bankroll: Total bankroll (for reporting purposes).
Returns:
Dictionary with optimal bet amounts, expected profit, and
constraint analysis.
"""
try:
from pulp import (
LpMaximize, LpProblem, LpVariable, lpSum, value, LpStatus
)
except ImportError:
raise ImportError("PuLP is required: pip install pulp")
n_bets = len(expected_returns)
# Create the problem
prob = LpProblem("Betting_Allocation", LpMaximize)
# Decision variables: amount to bet on each opportunity
x = [
LpVariable(f"bet_{i}", lowBound=0, upBound=max_bets[i])
for i in range(n_bets)
]
# Objective: maximize expected profit
prob += lpSum(expected_returns[i] * x[i] for i in range(n_bets))
# Constraint: total exposure
prob += lpSum(x[i] for i in range(n_bets)) <= max_total_exposure
# Group constraints (e.g., per-sport limits)
if group_constraints is not None:
for group_name, (indices, max_group) in group_constraints.items():
prob += (
lpSum(x[i] for i in indices) <= max_group,
f"group_{group_name}",
)
# Solve
prob.solve()
# Extract results
optimal_bets = np.array([value(x[i]) for i in range(n_bets)])
expected_profit = sum(
expected_returns[i] * optimal_bets[i] for i in range(n_bets)
)
total_wagered = optimal_bets.sum()
return {
"status": LpStatus[prob.status],
"optimal_bets": optimal_bets,
"expected_profit": expected_profit,
"total_wagered": total_wagered,
"roi_pct": expected_profit / total_wagered * 100 if total_wagered > 0 else 0,
"bankroll_fraction": total_wagered / bankroll,
"n_bets_placed": np.sum(optimal_bets > min_bet),
"objective_value": value(prob.objective),
}
def solve_betting_lp_cvxpy(
expected_returns: np.ndarray,
max_bets: np.ndarray,
max_total_exposure: float,
group_indices: Optional[list[list[int]]] = None,
group_limits: Optional[list[float]] = None,
) -> dict:
"""
Solve betting allocation using cvxpy (alternative to PuLP).
cvxpy is more flexible for extending to quadratic and
second-order cone programs.
Args:
expected_returns: Expected return per dollar for each bet.
max_bets: Maximum wager for each bet.
max_total_exposure: Maximum total exposure.
group_indices: List of lists of bet indices for group constraints.
group_limits: Maximum exposure per group.
Returns:
Dictionary with optimal allocation and diagnostics.
"""
import cvxpy as cp
n = len(expected_returns)
x = cp.Variable(n, nonneg=True)
# Objective: maximize expected profit
objective = cp.Maximize(expected_returns @ x)
# Constraints
constraints = [
x <= max_bets,
cp.sum(x) <= max_total_exposure,
]
if group_indices is not None and group_limits is not None:
for indices, limit in zip(group_indices, group_limits):
constraints.append(cp.sum(x[indices]) <= limit)
    # Solve (let cvxpy select an installed LP-capable solver)
    problem = cp.Problem(objective, constraints)
    problem.solve()
optimal_bets = x.value
if optimal_bets is None:
return {"status": "infeasible", "optimal_bets": np.zeros(n)}
# Clean tiny values
optimal_bets = np.maximum(optimal_bets, 0)
optimal_bets[optimal_bets < 0.01] = 0
return {
"status": problem.status,
"optimal_bets": optimal_bets,
"expected_profit": float(expected_returns @ optimal_bets),
"total_wagered": float(optimal_bets.sum()),
"objective_value": float(problem.value),
}
# Worked Example: NFL Week Betting Allocation
print("LINEAR PROGRAMMING: NFL BETTING ALLOCATION")
print("=" * 60)
# 8 potential bets for a Sunday slate
bets = pd.DataFrame({
"game": [
"KC @ BUF", "PHI @ SF", "DET @ DAL", "BAL @ CIN",
"MIA @ NYJ", "GB @ MIN", "LAR @ SEA", "DEN @ LV",
],
"bet_type": [
"KC ML", "PHI +3", "DET -6.5", "Over 47.5",
"MIA ML", "GB +1", "Under 44", "DEN -3",
],
"true_prob": [0.58, 0.56, 0.57, 0.54, 0.55, 0.53, 0.56, 0.52],
"decimal_odds": [2.10, 1.95, 1.91, 1.90, 2.20, 1.95, 1.87, 1.91],
"max_bet": [500, 500, 500, 500, 300, 500, 500, 300],
})
# Expected returns per dollar
bets["expected_return"] = bets["true_prob"] * bets["decimal_odds"] - 1
print("\nAvailable Bets:")
print(bets[["game", "bet_type", "true_prob", "decimal_odds",
"expected_return", "max_bet"]].to_string(index=False))
# Define group constraints: cap exposure by conference at $1,200 each
# (each game here has a single bet, so no per-game caps are needed)
group_constraints = {
"afc_games": ([0, 3, 4, 7], 1200), # AFC games max $1200
"nfc_games": ([1, 2, 5, 6], 1200), # NFC games max $1200
}
# Solve
result = solve_betting_lp_pulp(
expected_returns=bets["expected_return"].values,
max_bets=bets["max_bet"].values,
max_total_exposure=2000,
group_constraints=group_constraints,
bankroll=10000,
)
print(f"\nOptimization Status: {result['status']}")
print(f"\nOptimal Allocation:")
for i, (_, row) in enumerate(bets.iterrows()):
if result["optimal_bets"][i] > 0:
exp_profit = result["optimal_bets"][i] * row["expected_return"]
print(f" {row['bet_type']:<15} ${result['optimal_bets'][i]:>7.2f} "
f"(E[profit] = ${exp_profit:>6.2f})")
print(f"\nTotal wagered: ${result['total_wagered']:.2f}")
print(f"Expected profit: ${result['expected_profit']:.2f}")
print(f"Portfolio ROI: {result['roi_pct']:.2f}%")
print(f"Bankroll used: {result['bankroll_fraction']:.1%}")
print(f"Bets placed: {result['n_bets_placed']:.0f} of {len(bets)}")
Sensitivity Analysis
One of the most valuable features of LP is sensitivity analysis --- understanding how the optimal solution changes when parameters vary. In betting, our probability estimates are uncertain, so we want to know:
- How much can a probability estimate change before the optimal allocation shifts? This is the concept of allowable increase/decrease in the objective coefficients.
- What is the shadow price of each constraint? The shadow price tells us how much the optimal profit would increase if we relaxed a constraint by one unit (e.g., if the sportsbook increased our maximum bet by $1).
- Which constraints are binding? A binding constraint (one that holds with equality) is actively limiting our profit. Non-binding constraints have slack and are not currently restrictive.
Real-World Application: When a sportsbook limits your maximum bet size on a specific market, the shadow price of that constraint tells you exactly how much expected profit you are losing. If the shadow price on the "KC ML max bet" constraint is $0.08 per dollar, then increasing your limit from $500 to $600 would add $8.00 to your expected profit. This information guides which sportsbooks and markets to prioritize.
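PuLP exposes these quantities after a solve: each constraint carries a dual value (`.pi`, the shadow price) and a `.slack`. A minimal sketch with illustrative numbers (two bets with 8% and 5% edges, capped at $500 each, under a binding $700 total-exposure cap):

```python
from pulp import LpMaximize, LpProblem, LpVariable, lpSum, value

edges = [0.08, 0.05]
prob = LpProblem("sensitivity_demo", LpMaximize)
x = [LpVariable(f"bet_{i}", lowBound=0, upBound=500) for i in range(2)]
prob += lpSum(e * xi for e, xi in zip(edges, x))
prob += lpSum(x) <= 700, "total_exposure"
prob.solve()

con = prob.constraints["total_exposure"]
# The cap binds (slack 0): the marginal dollar of exposure would go to
# the 5%-edge bet, so the shadow price of the cap is 0.05.
print(f"x = {[value(v) for v in x]}")
print(f"shadow price = {con.pi:.3f}, slack = {con.slack:.1f}")
```

Reading duals this way turns the LP solver into a diagnostic tool: the constraint with the largest shadow price is the one most worth negotiating, splitting across books, or redesigning around.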
25.2 Portfolio Optimization (Markowitz for Bets)
From Finance to Betting
Harry Markowitz's Modern Portfolio Theory (MPT), introduced in 1952, revolutionized finance by formalizing the trade-off between risk and return. The key insight is that diversification (spreading investments across imperfectly correlated assets) can reduce portfolio risk without sacrificing expected return. The same principle applies to betting.
In finance, the portfolio return is:
$$R_p = \sum_{i=1}^{n} w_i R_i$$
where $w_i$ is the weight of asset $i$ and $R_i$ is its random return. The expected portfolio return is:
$$E[R_p] = \sum_{i=1}^{n} w_i \mu_i = \mathbf{w}^T \boldsymbol{\mu}$$
and the portfolio variance is:
$$\text{Var}(R_p) = \sum_{i=1}^{n} \sum_{j=1}^{n} w_i w_j \sigma_{ij} = \mathbf{w}^T \boldsymbol{\Sigma} \mathbf{w}$$
where $\boldsymbol{\Sigma}$ is the covariance matrix of returns.
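In code, these two quantities are a pair of matrix products. A quick numpy check with illustrative numbers:

```python
import numpy as np

mu = np.array([0.05, 0.03])        # expected returns (illustrative)
sigma = np.array([[0.04, 0.01],
                  [0.01, 0.09]])   # covariance matrix
w = np.array([0.6, 0.4])           # portfolio weights

exp_ret = w @ mu                   # E[R_p] = w^T mu
var = w @ sigma @ w                # Var(R_p) = w^T Sigma w
# Expanded: 0.36*0.04 + 2*0.6*0.4*0.01 + 0.16*0.09 = 0.0336
print(exp_ret, var)
```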
Adapting MPT to Betting
In a betting context:
- Assets are individual bets (or bet types/markets)
- Weights $w_i$ represent the fraction of bankroll allocated to bet $i$
- Expected returns $\mu_i = p_i \cdot d_i - 1$ where $p_i$ is the true probability and $d_i$ is the decimal odds
- Variances $\sigma_i^2 = p_i(1-p_i) \cdot d_i^2$ (for a simple win/lose bet at decimal odds $d_i$)
- Covariances $\sigma_{ij}$ capture the correlation between bet outcomes
The efficient frontier is the set of portfolios that achieve the maximum expected return for each level of risk (variance). Any portfolio below the efficient frontier is suboptimal: there exists another portfolio with the same risk but higher return, or the same return but lower risk.
Sources of Correlation Between Bets
Bet outcomes are often correlated, and ignoring these correlations leads to suboptimal portfolios:
- Same-game correlations: Betting on a team to win and the over in the same game are positively correlated (teams that win often score more).
- Conference/division correlations: Betting on multiple teams in the same division creates correlation because they play each other.
- Market-wide factors: Weather, scheduling, or league-wide trends can create correlations across many games.
- Model-based correlations: If your edge comes from a factor that affects multiple games similarly (e.g., a rest-days model), the outcomes of those bets will be correlated conditional on your model being right or wrong.
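The cost of ignoring correlation is easy to quantify. Using the binary-bet variance $\sigma_i^2 = p_i(1-p_i) d_i^2$ from above, an equal split across two identical bets has portfolio variance $\sigma^2 (1+\rho)/2$, so moving from independent bets to highly correlated ones nearly doubles the risk. A sketch with illustrative numbers:

```python
import numpy as np

p, d = 0.55, 1.91                 # win probability and decimal odds
var_single = p * (1 - p) * d**2   # variance of one bet's return per dollar
w = np.array([0.5, 0.5])          # equal split across two identical bets

for rho in [0.0, 0.3, 0.8]:
    corr = np.array([[1.0, rho], [rho, 1.0]])
    cov = var_single * corr       # both bets share the same variance
    port_var = w @ cov @ w
    # Equal-weight identity: port_var = var_single * (1 + rho) / 2
    print(f"rho={rho:.1f}  portfolio variance={port_var:.4f}")
```

At $\rho = 0$ the split halves the single-bet variance; at $\rho = 0.8$ almost all of the diversification benefit is gone.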
Python Implementation
import numpy as np
import pandas as pd
import cvxpy as cp
from scipy.optimize import minimize
from typing import Optional
class BettingPortfolioOptimizer:
"""
Mean-variance portfolio optimizer adapted for sports betting.
Constructs the efficient frontier for a set of bets with
estimated expected returns and a covariance matrix of outcomes.
"""
def __init__(
self,
expected_returns: np.ndarray,
covariance_matrix: np.ndarray,
bet_names: Optional[list[str]] = None,
):
"""
Args:
expected_returns: Array of expected return per dollar for
each bet.
covariance_matrix: n x n covariance matrix of bet returns.
bet_names: Optional list of bet names for labeling.
"""
self.mu = expected_returns
self.sigma = covariance_matrix
self.n = len(expected_returns)
self.names = bet_names or [f"Bet_{i}" for i in range(self.n)]
# Validate inputs
assert len(self.mu) == self.n
assert self.sigma.shape == (self.n, self.n)
# Check positive semi-definiteness
eigenvalues = np.linalg.eigvalsh(self.sigma)
if np.any(eigenvalues < -1e-10):
raise ValueError("Covariance matrix is not positive semi-definite.")
def optimize_portfolio(
self,
target_return: Optional[float] = None,
risk_aversion: Optional[float] = None,
max_weight: float = 0.25,
max_total_weight: float = 1.0,
long_only: bool = True,
) -> dict:
"""
Find the optimal portfolio allocation.
Can optimize for either:
- Minimum variance given a target return, or
- Maximum utility = E[R] - (lambda/2) * Var(R)
Args:
target_return: If specified, minimize variance subject to
achieving this expected return.
risk_aversion: If specified (and target_return is None),
maximize expected return minus risk_aversion/2 * variance.
max_weight: Maximum fraction of bankroll on any single bet.
max_total_weight: Maximum total fraction of bankroll wagered.
long_only: If True, no negative weights (no "selling" bets).
Returns:
Dictionary with optimal weights, portfolio return, risk,
and Sharpe-like ratio.
"""
w = cp.Variable(self.n)
# Portfolio return and risk
portfolio_return = self.mu @ w
portfolio_variance = cp.quad_form(w, self.sigma)
        # Constraints (the per-bet cap applies in all cases)
        constraints = [cp.sum(w) <= max_total_weight, w <= max_weight]
        if long_only:
            constraints.append(w >= 0)
if target_return is not None:
# Minimize variance subject to target return
constraints.append(portfolio_return >= target_return)
objective = cp.Minimize(portfolio_variance)
elif risk_aversion is not None:
# Maximize utility
objective = cp.Maximize(
portfolio_return - (risk_aversion / 2) * portfolio_variance
)
        else:
            # Default: quadratic utility with unit risk aversion
            objective = cp.Maximize(
                portfolio_return - 0.5 * portfolio_variance
            )
problem = cp.Problem(objective, constraints)
problem.solve(solver=cp.SCS)
if problem.status not in ["optimal", "optimal_inaccurate"]:
return {"status": problem.status, "weights": np.zeros(self.n)}
weights = np.array(w.value).flatten()
weights = np.maximum(weights, 0) # Clean tiny negatives
weights[weights < 1e-6] = 0
port_ret = float(self.mu @ weights)
port_var = float(weights @ self.sigma @ weights)
port_std = np.sqrt(port_var)
return {
"status": problem.status,
"weights": weights,
"portfolio_return": port_ret,
"portfolio_variance": port_var,
"portfolio_std": port_std,
"sharpe_ratio": port_ret / port_std if port_std > 0 else np.inf,
"n_active_bets": np.sum(weights > 1e-6),
"total_weight": weights.sum(),
}
def efficient_frontier(
self,
n_points: int = 50,
max_weight: float = 0.25,
max_total_weight: float = 1.0,
) -> pd.DataFrame:
"""
Compute the efficient frontier --- the set of portfolios
with maximum return for each level of risk.
Args:
n_points: Number of points on the frontier.
max_weight: Maximum weight per bet.
max_total_weight: Maximum total weight.
Returns:
DataFrame with portfolio return, std, and Sharpe for
each point on the frontier.
"""
# Find return range
min_return_port = self.optimize_portfolio(
risk_aversion=1000, max_weight=max_weight,
max_total_weight=max_total_weight
)
max_return_port = self.optimize_portfolio(
risk_aversion=0.001, max_weight=max_weight,
max_total_weight=max_total_weight
)
if min_return_port["status"] not in ["optimal", "optimal_inaccurate"]:
return pd.DataFrame()
min_ret = min_return_port["portfolio_return"]
max_ret = max_return_port["portfolio_return"]
target_returns = np.linspace(min_ret, max_ret, n_points)
frontier = []
for target in target_returns:
result = self.optimize_portfolio(
target_return=target,
max_weight=max_weight,
max_total_weight=max_total_weight,
)
if result["status"] in ["optimal", "optimal_inaccurate"]:
frontier.append({
"target_return": target,
"portfolio_return": result["portfolio_return"],
"portfolio_std": result["portfolio_std"],
"sharpe_ratio": result["sharpe_ratio"],
"n_active_bets": result["n_active_bets"],
"total_weight": result["total_weight"],
})
return pd.DataFrame(frontier)
def compute_bet_covariance(
true_probs: np.ndarray,
decimal_odds: np.ndarray,
correlation_matrix: np.ndarray,
) -> np.ndarray:
"""
Compute the covariance matrix of bet returns.
For binary bet outcomes with given correlations.
Args:
true_probs: True win probabilities for each bet.
decimal_odds: Decimal odds for each bet.
correlation_matrix: n x n correlation matrix between bet
outcomes (on the 0/1 win/lose scale).
Returns:
n x n covariance matrix of bet returns (profit/loss per dollar).
"""
n = len(true_probs)
# Standard deviations of returns
# For bet i: outcome is either (d_i - 1) with prob p_i or -1 with prob (1-p_i)
# E[R_i] = p_i * (d_i - 1) + (1-p_i) * (-1) = p_i * d_i - 1
# Var(R_i) = p_i * (1-p_i) * d_i^2
variances = true_probs * (1 - true_probs) * decimal_odds ** 2
stds = np.sqrt(variances)
# Build covariance matrix from correlation matrix and standard deviations
cov_matrix = np.outer(stds, stds) * correlation_matrix
return cov_matrix
# Worked Example: Betting Portfolio Optimization
print("PORTFOLIO OPTIMIZATION FOR BETTING")
print("=" * 60)
# 6 available bets for tonight's games
bet_data = pd.DataFrame({
"name": [
"LAL ML", "BOS -5.5", "Over 224 (MIL-PHI)",
"GSW +3", "DEN ML", "Under 218 (DAL-MEM)",
],
"true_prob": [0.58, 0.55, 0.54, 0.53, 0.60, 0.52],
"decimal_odds": [2.00, 1.91, 1.91, 1.95, 1.72, 1.91],
})
bet_data["expected_return"] = (
bet_data["true_prob"] * bet_data["decimal_odds"] - 1
)
print("Available Bets:")
print(bet_data.to_string(index=False))
# Define correlations (illustrative)
# All six bets here are on different games, so the correlations below
# represent shared model and market factors rather than same-game effects
n_bets = len(bet_data)
corr_matrix = np.eye(n_bets)
# LAL ML and BOS -5.5: very slight correlation (both favorites)
corr_matrix[0, 1] = corr_matrix[1, 0] = 0.05
# Over 224 and team MLs: moderate correlation
corr_matrix[0, 2] = corr_matrix[2, 0] = 0.15
corr_matrix[1, 2] = corr_matrix[2, 1] = 0.10
# GSW +3 and DEN ML: slight negative (if both western conf)
corr_matrix[3, 4] = corr_matrix[4, 3] = -0.05
# Under and MLs: slight
corr_matrix[4, 5] = corr_matrix[5, 4] = -0.10
cov_matrix = compute_bet_covariance(
bet_data["true_prob"].values,
bet_data["decimal_odds"].values,
corr_matrix,
)
# Create optimizer
optimizer = BettingPortfolioOptimizer(
expected_returns=bet_data["expected_return"].values,
covariance_matrix=cov_matrix,
bet_names=bet_data["name"].tolist(),
)
# Find optimal portfolio for different risk levels
print("\nOPTIMAL PORTFOLIOS AT DIFFERENT RISK AVERSION LEVELS")
print("-" * 65)
print(f"{'Lambda':>8} {'E[R]':>8} {'Std':>8} {'Sharpe':>8} {'N Bets':>8}")
print("-" * 65)
for lam in [0.5, 1.0, 2.0, 5.0, 10.0, 20.0]:
result = optimizer.optimize_portfolio(
risk_aversion=lam, max_weight=0.30, max_total_weight=1.0
)
if result["status"] in ["optimal", "optimal_inaccurate"]:
print(f"{lam:>8.1f} {result['portfolio_return']:>8.4f} "
f"{result['portfolio_std']:>8.4f} "
f"{result['sharpe_ratio']:>8.4f} "
f"{result['n_active_bets']:>8.0f}")
# Show allocation for moderate risk aversion
print("\nDETAILED ALLOCATION (lambda=2.0)")
print("-" * 40)
result = optimizer.optimize_portfolio(
risk_aversion=2.0, max_weight=0.30, max_total_weight=1.0
)
for name, weight in zip(bet_data["name"], result["weights"]):
if weight > 0.001:
dollar_amount = weight * 10000 # $10,000 bankroll
print(f" {name:<25} {weight:>6.1%} (${dollar_amount:>7.0f})")
print(f"\n Portfolio E[Return]: {result['portfolio_return']:.4f}")
print(f" Portfolio Std Dev: {result['portfolio_std']:.4f}")
print(f" Sharpe Ratio: {result['sharpe_ratio']:.4f}")
# Compute efficient frontier
print("\nEFFICIENT FRONTIER")
print("-" * 55)
frontier = optimizer.efficient_frontier(
n_points=10, max_weight=0.30, max_total_weight=1.0
)
if len(frontier) > 0:
print(f"{'E[Return]':>10} {'Std Dev':>10} {'Sharpe':>10} {'N Bets':>8}")
for _, row in frontier.iterrows():
print(f"{row['portfolio_return']:>10.4f} {row['portfolio_std']:>10.4f} "
f"{row['sharpe_ratio']:>10.4f} {row['n_active_bets']:>8.0f}")
Key Insight: The correlation between bet outcomes significantly affects optimal portfolio construction. If all your bets are positively correlated (e.g., you have a model that identifies value in favorites, and you bet several favorites on the same slate), your portfolio risk is much higher than if the bets were independent. Mean-variance optimization naturally accounts for this by diversifying across uncorrelated or negatively correlated bets. This is why diversifying across sports, bet types, and games is not just common sense but mathematically optimal.
25.3 Arbitrage Detection Algorithms
The Mathematics of Arbitrage
An arbitrage opportunity (or "arb" or "sure bet") exists when the odds across multiple sportsbooks allow you to bet on all possible outcomes of an event and guarantee a profit regardless of the result.
For a two-outcome event (e.g., a tennis match or a moneyline with no draw), let $d_A$ be the best available decimal odds on outcome A (across all books) and $d_B$ be the best available decimal odds on outcome B. An arbitrage exists if and only if:
$$\frac{1}{d_A} + \frac{1}{d_B} < 1$$
The sum $\frac{1}{d_A} + \frac{1}{d_B}$ is the total implied probability. Bookmakers normally keep this sum above 1 (the overround, which is their margin); when taking the best odds from different books pushes it below 1, the market is under-round and a guaranteed profit is possible.
The optimal stakes to ensure equal profit regardless of outcome are:
$$x_A = \frac{B}{d_A \cdot S}, \quad x_B = \frac{B}{d_B \cdot S}$$
where $B$ is the total amount to invest and $S = \frac{1}{d_A} + \frac{1}{d_B}$. The guaranteed profit is:
$$\Pi = B \left(\frac{1}{S} - 1\right) = B \left(\frac{1}{\frac{1}{d_A} + \frac{1}{d_B}} - 1\right)$$
For a three-outcome event (e.g., soccer: home win, draw, away win), the condition becomes:
$$\frac{1}{d_H} + \frac{1}{d_D} + \frac{1}{d_A} < 1$$
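A quick numeric check of the three-way condition, with hypothetical best-of-market odds:

```python
# Hypothetical best odds collected across several books for one match
d_home, d_draw, d_away = 3.10, 3.60, 2.90

inv_sum = 1/d_home + 1/d_draw + 1/d_away   # total implied probability
is_arb = inv_sum < 1.0
margin_pct = (1/inv_sum - 1) * 100         # guaranteed ROI if staked per the formulas above

print(f"implied total = {inv_sum:.4f}, arb = {is_arb}, margin = {margin_pct:.2f}%")
```

Here the implied probabilities sum to about 0.945, so staking proportionally to $1/d$ on each outcome locks in roughly a 5.8% return. Real cross-book margins are rarely this generous.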
Practical Considerations
While the mathematics of arbitrage is straightforward, execution involves several practical challenges:
- Speed: Arbitrage opportunities are fleeting. Lines move quickly, and an opportunity that exists at 2:00:00 PM may be gone by 2:00:05 PM.
- Limits: Sportsbooks limit bettors who consistently exploit arbitrage. Being identified as an "arber" can result in severely reduced limits or account closure.
- Different rules: Books may have different rules for voided legs, overtime, etc. What appears to be an arb may not be if one book voids the bet under certain conditions.
- Capital requirements: Arb margins are typically very small (1-3%), so significant capital is needed to generate meaningful profits.
- Execution risk: Placing the second leg of an arb after the first creates risk if the line moves before you can execute.
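The capital-requirements point is worth making concrete. With illustrative figures (a 1.5% margin and $1,000 cycled per opportunity):

```python
margin = 0.015                      # typical arb margin (1.5%)
target_daily_profit = 300.0
capital_per_arb = 1000.0            # total staked per opportunity

profit_per_arb = capital_per_arb * margin           # $15 per $1,000 cycle
arbs_needed = target_daily_profit / profit_per_arb  # 20 arbs per day
total_turnover = arbs_needed * capital_per_arb      # $20,000 staked daily
print(profit_per_arb, arbs_needed, total_turnover)
```

Netting $300 a day at these margins means finding twenty executable arbs and moving $20,000 through the books daily, which is why arbitrage is an operations problem as much as a mathematical one.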
Python Implementation
import numpy as np
import pandas as pd
from itertools import combinations
from typing import Optional
import time
class ArbitrageDetector:
"""
Real-time arbitrage detection across multiple sportsbooks.
Scans odds data for two-way and three-way arbitrage
opportunities, computes optimal stake allocation, and
reports guaranteed profit margins.
"""
def __init__(self):
self.opportunities = []
def check_two_way_arb(
self,
odds_a: list[tuple[str, float]],
odds_b: list[tuple[str, float]],
) -> Optional[dict]:
"""
Check for arbitrage in a two-outcome event.
Args:
odds_a: List of (sportsbook_name, decimal_odds) for outcome A.
odds_b: List of (sportsbook_name, decimal_odds) for outcome B.
Returns:
Dictionary with arb details if found, None otherwise.
"""
# Find best odds for each outcome
best_a = max(odds_a, key=lambda x: x[1])
best_b = max(odds_b, key=lambda x: x[1])
inv_sum = 1 / best_a[1] + 1 / best_b[1]
if inv_sum < 1.0:
margin = (1 / inv_sum - 1) * 100
return {
"type": "two-way",
"outcome_a_book": best_a[0],
"outcome_a_odds": best_a[1],
"outcome_b_book": best_b[0],
"outcome_b_odds": best_b[1],
"inverse_sum": inv_sum,
"profit_margin_pct": margin,
"is_cross_book": best_a[0] != best_b[0],
}
return None
def check_three_way_arb(
self,
odds_home: list[tuple[str, float]],
odds_draw: list[tuple[str, float]],
odds_away: list[tuple[str, float]],
) -> Optional[dict]:
"""
Check for arbitrage in a three-outcome event (e.g., soccer).
Args:
odds_home: List of (book, odds) for home win.
odds_draw: List of (book, odds) for draw.
odds_away: List of (book, odds) for away win.
Returns:
Dictionary with arb details if found, None otherwise.
"""
best_h = max(odds_home, key=lambda x: x[1])
best_d = max(odds_draw, key=lambda x: x[1])
best_a = max(odds_away, key=lambda x: x[1])
inv_sum = 1 / best_h[1] + 1 / best_d[1] + 1 / best_a[1]
if inv_sum < 1.0:
margin = (1 / inv_sum - 1) * 100
books_used = {best_h[0], best_d[0], best_a[0]}
return {
"type": "three-way",
"home_book": best_h[0],
"home_odds": best_h[1],
"draw_book": best_d[0],
"draw_odds": best_d[1],
"away_book": best_a[0],
"away_odds": best_a[1],
"inverse_sum": inv_sum,
"profit_margin_pct": margin,
"n_books_needed": len(books_used),
}
return None
def compute_optimal_stakes(
self,
arb: dict,
total_investment: float = 1000.0,
) -> dict:
"""
Compute optimal stake allocation for an arbitrage opportunity.
Allocates stakes to equalize profit across all outcomes.
Args:
arb: Arbitrage opportunity dictionary from check_*_arb.
total_investment: Total amount to invest.
Returns:
Dictionary with stake for each outcome and guaranteed profit.
"""
if arb["type"] == "two-way":
S = 1 / arb["outcome_a_odds"] + 1 / arb["outcome_b_odds"]
stake_a = total_investment / (arb["outcome_a_odds"] * S)
stake_b = total_investment / (arb["outcome_b_odds"] * S)
profit_if_a = stake_a * arb["outcome_a_odds"] - total_investment
profit_if_b = stake_b * arb["outcome_b_odds"] - total_investment
return {
"stake_a": round(stake_a, 2),
"stake_b": round(stake_b, 2),
"total_invested": round(stake_a + stake_b, 2),
"profit_if_a": round(profit_if_a, 2),
"profit_if_b": round(profit_if_b, 2),
"guaranteed_profit": round(min(profit_if_a, profit_if_b), 2),
"roi_pct": round(
min(profit_if_a, profit_if_b) / total_investment * 100, 3
),
}
elif arb["type"] == "three-way":
S = (1 / arb["home_odds"] + 1 / arb["draw_odds"]
+ 1 / arb["away_odds"])
stake_h = total_investment / (arb["home_odds"] * S)
stake_d = total_investment / (arb["draw_odds"] * S)
stake_a = total_investment / (arb["away_odds"] * S)
profit_h = stake_h * arb["home_odds"] - total_investment
profit_d = stake_d * arb["draw_odds"] - total_investment
profit_a = stake_a * arb["away_odds"] - total_investment
return {
"stake_home": round(stake_h, 2),
"stake_draw": round(stake_d, 2),
"stake_away": round(stake_a, 2),
"total_invested": round(stake_h + stake_d + stake_a, 2),
"profit_if_home": round(profit_h, 2),
"profit_if_draw": round(profit_d, 2),
"profit_if_away": round(profit_a, 2),
"guaranteed_profit": round(min(profit_h, profit_d, profit_a), 2),
"roi_pct": round(
min(profit_h, profit_d, profit_a) / total_investment * 100,
3,
),
}
def scan_market(
self,
market_data: list[dict],
) -> list[dict]:
"""
Scan a set of events across multiple sportsbooks for arbitrage.
Args:
market_data: List of event dictionaries, each containing:
- 'event': Event name
- 'outcomes': List of outcome names
- 'odds': Dict mapping (outcome, book) to decimal odds
Returns:
List of arbitrage opportunities found.
"""
opportunities = []
for event in market_data:
event_name = event["event"]
outcomes = event["outcomes"]
odds = event["odds"]
if len(outcomes) == 2:
odds_a = [
(book, decimal_odds)
for (outcome, book), decimal_odds in odds.items()
if outcome == outcomes[0]
]
odds_b = [
(book, decimal_odds)
for (outcome, book), decimal_odds in odds.items()
if outcome == outcomes[1]
]
arb = self.check_two_way_arb(odds_a, odds_b)
elif len(outcomes) == 3:
odds_lists = []
for outcome in outcomes:
outcome_odds = [
(book, d)
for (o, book), d in odds.items()
if o == outcome
]
odds_lists.append(outcome_odds)
arb = self.check_three_way_arb(*odds_lists)
else:
continue
if arb is not None:
arb["event"] = event_name
stakes = self.compute_optimal_stakes(arb, total_investment=1000)
arb.update(stakes)
opportunities.append(arb)
self.opportunities = opportunities
return opportunities
# Worked Example: Arbitrage Detection
print("ARBITRAGE DETECTION ACROSS SPORTSBOOKS")
print("=" * 60)
detector = ArbitrageDetector()
# Simulated odds from multiple sportsbooks
market_data = [
{
"event": "Lakers vs Celtics",
"outcomes": ["Lakers", "Celtics"],
"odds": {
("Lakers", "BookA"): 2.30,
("Lakers", "BookB"): 2.25,
("Lakers", "BookC"): 2.35, # Best Lakers odds
("Celtics", "BookA"): 1.65,
("Celtics", "BookB"): 1.72, # Best Celtics odds
("Celtics", "BookC"): 1.62,
},
},
{
"event": "Liverpool vs Arsenal",
"outcomes": ["Liverpool", "Draw", "Arsenal"],
"odds": {
("Liverpool", "BookA"): 2.40,
("Liverpool", "BookB"): 2.50,
("Liverpool", "BookC"): 2.45,
("Draw", "BookA"): 3.40,
("Draw", "BookB"): 3.30,
("Draw", "BookC"): 3.55,
("Arsenal", "BookA"): 3.00,
("Arsenal", "BookB"): 3.10,
("Arsenal", "BookC"): 2.95,
},
},
{
"event": "Djokovic vs Alcaraz",
"outcomes": ["Djokovic", "Alcaraz"],
"odds": {
("Djokovic", "BookA"): 1.85,
("Djokovic", "BookB"): 1.90,
("Djokovic", "BookC"): 1.88,
("Alcaraz", "BookA"): 2.05,
("Alcaraz", "BookB"): 2.00,
("Alcaraz", "BookC"): 2.10,
},
},
]
arbs = detector.scan_market(market_data)
print(f"\nScanned {len(market_data)} events across 3 sportsbooks")
print(f"Arbitrage opportunities found: {len(arbs)}")
for arb in arbs:
print(f"\n{'=' * 50}")
print(f"EVENT: {arb['event']}")
print(f"Type: {arb['type']}")
print(f"Profit margin: {arb['profit_margin_pct']:.3f}%")
print(f"Guaranteed profit on $1,000: ${arb['guaranteed_profit']:.2f}")
if arb["type"] == "two-way":
print(f"\nAllocation:")
print(f" Outcome A @ {arb['outcome_a_odds']:.2f} ({arb['outcome_a_book']}): "
f"${arb['stake_a']:.2f}")
print(f" Outcome B @ {arb['outcome_b_odds']:.2f} ({arb['outcome_b_book']}): "
f"${arb['stake_b']:.2f}")
elif arb["type"] == "three-way":
print(f"\nAllocation:")
print(f" Home @ {arb['home_odds']:.2f} ({arb['home_book']}): "
f"${arb['stake_home']:.2f}")
print(f" Draw @ {arb['draw_odds']:.2f} ({arb['draw_book']}): "
f"${arb['stake_draw']:.2f}")
print(f" Away @ {arb['away_odds']:.2f} ({arb['away_book']}): "
f"${arb['stake_away']:.2f}")
# Check for near-arbs (close to arbitrage, may indicate value)
print(f"\n\nNEAR-ARBITRAGE ANALYSIS")
print("-" * 50)
for event in market_data:
outcomes = event["outcomes"]
best_odds = {}
for outcome in outcomes:
outcome_odds = [
(book, d) for (o, book), d in event["odds"].items()
if o == outcome
]
best = max(outcome_odds, key=lambda x: x[1])
best_odds[outcome] = best
inv_sum = sum(1 / bo[1] for bo in best_odds.values())
margin = (inv_sum - 1) * 100
print(f"\n{event['event']}:")
for outcome, (book, odds) in best_odds.items():
print(f" {outcome}: {odds:.2f} ({book})")
print(f" Combined margin: {margin:+.2f}%")
if margin < 0:
print(f" >>> ARBITRAGE: {abs(margin):.2f}% guaranteed profit")
elif margin < 2:
print(f" >>> Near-arb: only {margin:.2f}% overround")
Common Pitfall: Arbitrage detection software often identifies "arbs" that disappear by the time you try to execute them. In practice, you need sub-second execution capability and accounts at multiple sportsbooks with sufficient limits. Many experienced arbers report that the biggest risk is not the mathematical calculation but the execution risk --- the line moving between placing the first and second legs. Always consider this gap risk in your calculations.
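To make the gap risk concrete, the following sketch (function name and odds are illustrative) recomputes the two-way payoff table when the second leg fills at worse odds than the ones used to size the stakes:

```python
def gap_risk_profit(odds_a: float, odds_b_quoted: float,
                    odds_b_executed: float, total: float = 1000.0):
    """Two-way arb payoff when leg B fills at different odds than quoted.

    Stakes are sized on the quoted odds; leg A is already placed when
    leg B's line moves, so leg B actually pays at the executed odds.
    """
    inv_sum = 1 / odds_a + 1 / odds_b_quoted
    stake_a = total * (1 / odds_a) / inv_sum
    stake_b = total - stake_a
    profit_if_a = stake_a * odds_a - total
    profit_if_b = stake_b * odds_b_executed - total
    return round(profit_if_a, 2), round(profit_if_b, 2)

# On paper, 2.10 / 2.05 is a ~3.7% arb with ~$37 locked in either way
print(gap_risk_profit(2.10, 2.05, 2.05))
# If the second leg shortens to 1.90 before you fill it, the "guaranteed"
# profit becomes a coin flip between +$37 and a real loss
print(gap_risk_profit(2.10, 2.05, 1.90))
```

A few ticks of line movement on one leg is enough to turn a locked-in profit into directional exposure, which is why execution speed matters more than detection speed.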
25.4 Constraint-Based Bet Sizing
Beyond Basic Kelly
The Kelly criterion (Chapter 4) provides the theoretically optimal bet size for maximizing long-run bankroll growth:
$$f^* = \frac{p \cdot d - 1}{d - 1} = \frac{p(d - 1) - (1 - p)}{d - 1}$$
where $f^*$ is the optimal fraction of bankroll to wager, $p$ is the true win probability, and $d$ is the decimal odds. However, the basic Kelly formula has several practical limitations:
- Single-bet assumption: Kelly assumes you are betting on one event at a time. When you have multiple simultaneous bets, the single-bet Kelly fractions do not account for the total bankroll exposure.
- No constraints: Real-world betting involves maximum bet limits, minimum bets, and regulatory exposure caps.
- Independence assumption: Kelly assumes bet outcomes are independent. Correlated bets require a multivariate extension.
- Exact probability assumption: Kelly assumes you know the true probability exactly. In practice, probability estimates have uncertainty, and full Kelly is known to be too aggressive when estimates are imprecise.
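To see how the exact-probability assumption bites, consider the sensitivity of the single-bet formula to the estimate of $p$. A minimal sketch (the helper name is ours):

```python
def kelly_fraction(p: float, d: float) -> float:
    """Single-bet Kelly fraction for win probability p and decimal odds d."""
    return max(0.0, (p * d - 1) / (d - 1))

# At even odds f* = 2p - 1, so every percentage point of error in the
# probability estimate moves the recommended stake by two points of bankroll
for p_est in (0.52, 0.55, 0.58):
    print(f"p = {p_est:.2f}: f* = {kelly_fraction(p_est, 2.0):.3f}")
```

A three-point overestimate of a 55% probability at even odds inflates the recommended stake from 10% to 16% of bankroll, which is why imprecise estimates make full Kelly systematically too aggressive.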
Multi-Bet Kelly with Constraints
For $n$ simultaneous bets with correlated outcomes, the optimal allocation maximizes the expected log-growth of the bankroll:
$$\max_{\mathbf{f}} E\left[\log\left(1 + \sum_{i=1}^{n} f_i R_i\right)\right]$$
where $f_i$ is the fraction of bankroll on bet $i$ and $R_i$ is the random return on bet $i$ (either $d_i - 1$ if it wins or $-1$ if it loses).
This expectation involves a sum over all $2^n$ possible outcome combinations (for binary bets), each weighted by its probability. With constraints, the problem becomes:
$$\begin{aligned} \max_{\mathbf{f}} \quad & \sum_{\mathbf{s} \in \{0,1\}^n} P(\mathbf{s}) \log\left(1 + \sum_{i=1}^{n} f_i r_i(s_i)\right) \\ \text{subject to} \quad & \sum_{i=1}^{n} f_i \leq f_{\max}^{\text{total}} \\ & 0 \leq f_i \leq f_{\max}^{i} \quad \forall i \\ & \text{(additional constraints)} \end{aligned}$$
where $r_i(s_i) = (d_i - 1)$ if $s_i = 1$ (win) and $r_i(s_i) = -1$ if $s_i = 0$ (loss).
This is a concave maximization problem: each term $\log\left(1 + \sum_i f_i r_i(s_i)\right)$ is the logarithm of an affine function of $\mathbf{f}$ and hence concave, and a probability-weighted sum of concave functions remains concave. The problem can therefore be solved efficiently using convex optimization tools.
Python Implementation
import numpy as np
import pandas as pd
from scipy.optimize import minimize
from itertools import product
from typing import Optional
class ConstrainedKelly:
"""
Constrained multi-bet Kelly criterion optimizer.
Computes optimal bet sizing for multiple simultaneous bets
with correlation, maximum bet constraints, and total exposure
limits. Maximizes expected log-growth of the bankroll.
"""
def __init__(self, seed: int = 42):
self.rng = np.random.default_rng(seed)
def optimize(
self,
probabilities: np.ndarray,
decimal_odds: np.ndarray,
correlation_matrix: Optional[np.ndarray] = None,
max_individual_fraction: float = 0.10,
max_total_fraction: float = 0.30,
kelly_fraction: float = 1.0,
n_scenarios: int = 10000,
) -> dict:
"""
Optimize bet sizing using constrained Kelly criterion.
For small numbers of bets (n <= 15), uses exact enumeration
over all outcome combinations. For larger numbers, uses
Monte Carlo simulation of outcomes.
Args:
probabilities: True win probability for each bet.
decimal_odds: Decimal odds for each bet.
correlation_matrix: n x n correlation matrix of outcomes.
If None, assumes independence.
max_individual_fraction: Maximum fraction of bankroll on
any single bet.
max_total_fraction: Maximum total fraction wagered.
kelly_fraction: Fraction of full Kelly to use (e.g., 0.5
for half-Kelly). Applied after optimization.
n_scenarios: Number of Monte Carlo scenarios (used when
n > 15).
Returns:
Dictionary with optimal fractions, expected growth,
and risk metrics.
"""
n = len(probabilities)
if n <= 15:
return self._optimize_exact(
probabilities, decimal_odds, correlation_matrix,
max_individual_fraction, max_total_fraction, kelly_fraction
)
else:
return self._optimize_monte_carlo(
probabilities, decimal_odds, correlation_matrix,
max_individual_fraction, max_total_fraction,
kelly_fraction, n_scenarios
)
def _optimize_exact(
self,
probs: np.ndarray,
odds: np.ndarray,
corr: Optional[np.ndarray],
max_ind: float,
max_total: float,
kelly_frac: float,
) -> dict:
"""Exact optimization over all 2^n outcome scenarios."""
n = len(probs)
# Generate all possible outcome combinations
scenarios = np.array(list(product([0, 1], repeat=n)))
n_scenarios = len(scenarios)
# Compute scenario probabilities
if corr is None or np.allclose(corr, np.eye(n)):
# Independent outcomes
scenario_probs = np.prod(
scenarios * probs + (1 - scenarios) * (1 - probs),
axis=1,
)
else:
# Correlated outcomes using Gaussian copula
scenario_probs = self._compute_correlated_probs(
probs, corr, scenarios
)
# Returns for each bet in each scenario
# Win: decimal_odds - 1; Lose: -1
returns = scenarios * (odds - 1) + (1 - scenarios) * (-1)
def neg_expected_log_growth(f):
"""Negative expected log growth (to minimize)."""
portfolio_returns = returns @ f
# Ensure 1 + portfolio_return > 0 to avoid log(0)
wealth = 1 + portfolio_returns
wealth = np.maximum(wealth, 1e-10)
return -np.sum(scenario_probs * np.log(wealth))
# Constraints
constraints = [
{"type": "ineq", "fun": lambda f: max_total - np.sum(f)},
]
bounds = [(0, max_ind)] * n
# Initial guess: scaled single-bet Kelly
f0 = np.array([
max(0, min(max_ind, (probs[i] * odds[i] - 1) / (odds[i] - 1)))
for i in range(n)
])
f0 = f0 * max_total / max(f0.sum(), max_total)
result = minimize(
neg_expected_log_growth,
f0,
method="SLSQP",
bounds=bounds,
constraints=constraints,
options={"maxiter": 1000, "ftol": 1e-12},
)
optimal_f = np.maximum(result.x, 0)
optimal_f[optimal_f < 1e-6] = 0
# Apply fractional Kelly
optimal_f *= kelly_frac
# Compute metrics
portfolio_returns = returns @ optimal_f
wealth = 1 + portfolio_returns
wealth = np.maximum(wealth, 1e-10)
expected_growth = np.sum(scenario_probs * np.log(wealth))
expected_return = np.sum(scenario_probs * portfolio_returns)
prob_loss = np.sum(scenario_probs[portfolio_returns < 0])
        # Worst-case single-round loss across all scenarios
        worst_case = portfolio_returns.min()
return {
"optimal_fractions": optimal_f,
"expected_log_growth": expected_growth,
"expected_return": expected_return,
"prob_any_loss": prob_loss,
"worst_case_loss_pct": -worst_case * 100,
"total_exposure": optimal_f.sum(),
"kelly_fraction_used": kelly_frac,
"n_bets_placed": np.sum(optimal_f > 0),
"converged": result.success,
}
def _compute_correlated_probs(
self,
probs: np.ndarray,
corr: np.ndarray,
scenarios: np.ndarray,
) -> np.ndarray:
"""
Compute joint probabilities for correlated binary outcomes
using Gaussian copula approximation.
"""
        from scipy.stats import norm
        n = len(probs)
        # Convert marginal probabilities to normal thresholds:
        # P(Z < norm.ppf(p)) = p for a standard normal Z
        thresholds = norm.ppf(probs)
        # Estimate joint scenario probabilities by Monte Carlo
        # sampling from the Gaussian copula
        n_mc = 100000
        samples = self.rng.multivariate_normal(np.zeros(n), corr, size=n_mc)
binary_outcomes = (samples < thresholds).astype(int)
scenario_probs = np.zeros(len(scenarios))
for i, scenario in enumerate(scenarios):
matches = np.all(binary_outcomes == scenario, axis=1)
scenario_probs[i] = matches.mean()
# Ensure probabilities sum to 1
scenario_probs /= scenario_probs.sum()
return scenario_probs
def _optimize_monte_carlo(
self,
probs: np.ndarray,
odds: np.ndarray,
corr: Optional[np.ndarray],
max_ind: float,
max_total: float,
kelly_frac: float,
n_scenarios: int,
) -> dict:
"""MC-based optimization for many simultaneous bets."""
n = len(probs)
# Generate scenarios
if corr is None or np.allclose(corr, np.eye(n)):
outcomes = self.rng.binomial(1, probs, size=(n_scenarios, n))
else:
from scipy.stats import norm
samples = self.rng.multivariate_normal(
np.zeros(n), corr, size=n_scenarios
)
thresholds = norm.ppf(probs)
outcomes = (samples < thresholds).astype(int)
returns = outcomes * (odds - 1) + (1 - outcomes) * (-1)
def neg_expected_log_growth(f):
portfolio_returns = returns @ f
wealth = 1 + portfolio_returns
wealth = np.maximum(wealth, 1e-10)
return -np.mean(np.log(wealth))
constraints = [
{"type": "ineq", "fun": lambda f: max_total - np.sum(f)},
]
bounds = [(0, max_ind)] * n
f0 = np.full(n, min(max_ind, max_total / n))
result = minimize(
neg_expected_log_growth,
f0,
method="SLSQP",
bounds=bounds,
constraints=constraints,
)
optimal_f = np.maximum(result.x * kelly_frac, 0)
optimal_f[optimal_f < 1e-6] = 0
portfolio_returns = returns @ optimal_f
wealth = 1 + portfolio_returns
return {
"optimal_fractions": optimal_f,
"expected_log_growth": float(np.mean(np.log(np.maximum(wealth, 1e-10)))),
"expected_return": float(np.mean(portfolio_returns)),
"prob_any_loss": float(np.mean(portfolio_returns < 0)),
"worst_case_loss_pct": float(-portfolio_returns.min() * 100),
"total_exposure": float(optimal_f.sum()),
"kelly_fraction_used": kelly_frac,
"n_bets_placed": int(np.sum(optimal_f > 0)),
"converged": result.success,
}
@staticmethod
def compare_kelly_variants(
probability: float,
decimal_odds: float,
bankroll: float = 10000.0,
) -> pd.DataFrame:
"""
Compare full Kelly, fractional Kelly, and constrained Kelly
for a single bet.
Args:
probability: True win probability.
decimal_odds: Decimal odds offered.
bankroll: Current bankroll.
Returns:
DataFrame comparing different Kelly variants.
"""
edge = probability * decimal_odds - 1
full_kelly = (probability * (decimal_odds - 1) - (1 - probability)) / (decimal_odds - 1)
full_kelly = max(0, full_kelly)
variants = []
for fraction in [0.25, 0.50, 0.75, 1.00]:
f = full_kelly * fraction
bet_amount = f * bankroll
expected_growth = (
probability * np.log(1 + f * (decimal_odds - 1))
+ (1 - probability) * np.log(1 - f)
)
expected_return = f * edge
            # Gambler's-ruin approximation for repeated fixed-size bets
            # (meaningful only when p > 0.5 and the bet is nonzero)
            prob_ruin_approx = (
                ((1 - probability) / probability) ** (bankroll / bet_amount)
                if bet_amount > 0 and probability > 0.5 else np.nan
            )
            variants.append({
                "variant": f"{fraction:.0%} Kelly",
                "fraction": f,
                "bet_amount": bet_amount,
                "expected_growth": expected_growth,
                "expected_return": expected_return,
                "max_loss_pct": f * 100,
                "prob_ruin_approx": prob_ruin_approx,
            })
return pd.DataFrame(variants)
# Worked Example: Multi-Bet Constrained Kelly
print("CONSTRAINED KELLY CRITERION")
print("=" * 60)
kelly = ConstrainedKelly(seed=42)
# 5 simultaneous bets
bets = pd.DataFrame({
"name": ["KC ML", "PHI -3", "Over 48", "DEN ML", "Under 44"],
"true_prob": [0.58, 0.55, 0.54, 0.57, 0.53],
"decimal_odds": [2.10, 1.91, 1.91, 1.95, 1.91],
})
bets["edge"] = bets["true_prob"] * bets["decimal_odds"] - 1
bets["single_kelly"] = [
max(0, (p * (d - 1) - (1 - p)) / (d - 1))
for p, d in zip(bets["true_prob"], bets["decimal_odds"])
]
print("Available Bets:")
print(bets[["name", "true_prob", "decimal_odds", "edge",
"single_kelly"]].to_string(index=False))
# Correlation: Over 48 and team MLs are correlated
corr = np.eye(5)
corr[0, 2] = corr[2, 0] = 0.25 # KC ML and Over correlated
corr[1, 2] = corr[2, 1] = 0.20 # PHI -3 and Over correlated
corr[0, 3] = corr[3, 0] = 0.05 # KC and DEN slight correlation
# Full Kelly (no constraints beyond position limits)
print("\nFULL KELLY (max 15% per bet, 40% total)")
result_full = kelly.optimize(
probabilities=bets["true_prob"].values,
decimal_odds=bets["decimal_odds"].values,
correlation_matrix=corr,
max_individual_fraction=0.15,
max_total_fraction=0.40,
kelly_fraction=1.0,
)
for name, frac in zip(bets["name"], result_full["optimal_fractions"]):
if frac > 0:
print(f" {name:<15} {frac:>6.2%} of bankroll")
print(f" Total exposure: {result_full['total_exposure']:.2%}")
print(f" E[log growth]: {result_full['expected_log_growth']:.6f}")
print(f" E[return]: {result_full['expected_return']:.4f}")
print(f" P(loss): {result_full['prob_any_loss']:.2%}")
# Half Kelly (more conservative)
print("\nHALF KELLY (max 15% per bet, 40% total)")
result_half = kelly.optimize(
probabilities=bets["true_prob"].values,
decimal_odds=bets["decimal_odds"].values,
correlation_matrix=corr,
max_individual_fraction=0.15,
max_total_fraction=0.40,
kelly_fraction=0.5,
)
for name, frac in zip(bets["name"], result_half["optimal_fractions"]):
if frac > 0:
print(f" {name:<15} {frac:>6.2%} of bankroll")
print(f" Total exposure: {result_half['total_exposure']:.2%}")
print(f" E[log growth]: {result_half['expected_log_growth']:.6f}")
print(f" Worst case: -{result_half['worst_case_loss_pct']:.1f}%")
# Compare single-bet Kelly variants
print("\nSINGLE-BET KELLY COMPARISON (KC ML)")
comparison = ConstrainedKelly.compare_kelly_variants(
probability=0.58, decimal_odds=2.10, bankroll=10000
)
print(comparison.to_string(index=False))
Key Insight: Full Kelly is almost always too aggressive in practice. Because our probability estimates contain errors, full Kelly systematically overestimates edges and recommends bets that are too large. Half-Kelly or quarter-Kelly sacrifice a small amount of expected long-run growth rate in exchange for dramatically lower variance and drawdown risk. Most professional bettors use fractional Kelly in the range of 0.25 to 0.50.
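A quick simulation makes this trade-off visible. The sketch below (edge, horizon, and seed are illustrative) compares terminal log-growth distributions for full, half, and quarter Kelly on a repeated single bet:

```python
import numpy as np

rng = np.random.default_rng(7)
p, d = 0.55, 2.0                    # true edge at even odds: f* = 0.10
n_bets, n_paths = 1000, 2000

def simulate(frac_of_kelly: float) -> np.ndarray:
    """Terminal log-bankroll growth for paths staking frac_of_kelly * f* per bet."""
    f = frac_of_kelly * (p * d - 1) / (d - 1)
    wins = rng.random((n_paths, n_bets)) < p
    growth = np.where(wins, np.log1p(f * (d - 1)), np.log1p(-f))
    return growth.sum(axis=1)

for frac in (1.0, 0.5, 0.25):
    g = simulate(frac)
    print(f"{frac:.2f} Kelly: median log-growth {np.median(g):6.2f}, "
          f"5th percentile {np.percentile(g, 5):6.2f}")
```

Full Kelly posts the highest median growth, but its lower tail is far worse; halving the stake pulls the 5th percentile up sharply at a modest cost to the median, which is the trade most professionals choose.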
25.5 Multi-Objective Optimization
Beyond Single-Objective Optimization
Real-world betting decisions involve multiple competing objectives:
- Maximize expected profit: The primary goal.
- Minimize risk: Measured by variance, max drawdown, or probability of ruin.
- Minimize time investment: Some strategies require more monitoring and execution effort than others.
- Maintain sportsbook access: Aggressive betting patterns may trigger limits; a "stealth" objective values smaller, more dispersed bets.
- Maximize liquidity: Keep enough bankroll uninvested to capitalize on unexpected opportunities.
These objectives typically conflict. More aggressive betting increases expected profit but also increases risk. Bet diversification reduces risk but may require accounts at more sportsbooks and more execution time. Stealth betting preserves account access but reduces immediate expected value.
Multi-objective optimization provides a framework for navigating these trade-offs without arbitrarily collapsing all objectives into a single number.
The Pareto Frontier
A solution $\mathbf{x}^*$ is Pareto optimal (or Pareto efficient) if no other feasible solution improves one objective without worsening at least one other objective. The set of all Pareto optimal solutions forms the Pareto frontier.
For two objectives $f_1(\mathbf{x})$ (to maximize) and $f_2(\mathbf{x})$ (to minimize):
$$\mathbf{x}^* \text{ is Pareto optimal if } \nexists \mathbf{x}: f_1(\mathbf{x}) \geq f_1(\mathbf{x}^*) \text{ and } f_2(\mathbf{x}) \leq f_2(\mathbf{x}^*) \text{ with at least one strict inequality}$$
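For a finite set of candidate strategies, Pareto optimality reduces to a dominance filter. A minimal $O(n^2)$ sketch (the helper name and sample points are ours), with return to be maximized and risk to be minimized:

```python
import numpy as np

def pareto_filter(points: np.ndarray) -> np.ndarray:
    """Return the non-dominated rows of an array of (return, risk) points.

    A point is dropped if some other point is at least as good on both
    objectives and strictly better on one. O(n^2), fine for small sets.
    """
    keep = []
    for i, (r_i, s_i) in enumerate(points):
        dominated = any(
            r_j >= r_i and s_j <= s_i and (r_j > r_i or s_j < s_i)
            for j, (r_j, s_j) in enumerate(points) if j != i
        )
        if not dominated:
            keep.append(i)
    return points[keep]

candidates = np.array([
    [0.02, 0.05],   # low return, low risk: efficient
    [0.04, 0.08],   # mid return, mid risk: efficient
    [0.03, 0.09],   # dominated by (0.04, 0.08)
    [0.05, 0.12],   # high return, high risk: efficient
])
print(pareto_filter(candidates))
```

The three surviving points are the Pareto frontier of this candidate set; which of them to bet is a preference question, not an optimization question.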
The Weighted Sum Method
The simplest approach to multi-objective optimization is to combine objectives into a single weighted sum:
$$\max_{\mathbf{x}} \sum_{k=1}^{K} \lambda_k f_k(\mathbf{x}), \quad \text{subject to constraints}$$
where $\lambda_k \geq 0$ are weights that reflect the decision-maker's preferences. By varying the weights, we trace out the Pareto frontier (for convex problems).
The $\varepsilon$-Constraint Method
An alternative approach optimizes one objective while constraining the others:
$$\max_{\mathbf{x}} f_1(\mathbf{x}) \quad \text{subject to} \quad f_k(\mathbf{x}) \geq \varepsilon_k, \; k = 2, \ldots, K$$
By varying the $\varepsilon_k$ thresholds, we again trace the Pareto frontier, but this method can handle non-convex Pareto frontiers that the weighted sum method cannot.
Python Implementation
import numpy as np
import pandas as pd
from scipy.optimize import minimize
from typing import Callable, Optional
class MultiObjectiveBettingOptimizer:
"""
Multi-objective optimization for betting strategy design.
Balances expected profit, risk, and operational constraints
to find the Pareto frontier of optimal strategies.
"""
def __init__(self, seed: int = 42):
self.rng = np.random.default_rng(seed)
def compute_pareto_frontier(
self,
probabilities: np.ndarray,
decimal_odds: np.ndarray,
n_points: int = 50,
max_individual: float = 0.10,
max_total: float = 0.50,
min_liquidity: float = 0.50,
n_simulations: int = 10000,
) -> pd.DataFrame:
"""
Compute the Pareto frontier balancing expected return and risk.
Uses the epsilon-constraint method: maximize expected return
subject to a varying upper bound on risk (variance).
Args:
probabilities: True win probabilities.
decimal_odds: Decimal odds.
n_points: Number of points on the frontier.
max_individual: Max fraction per bet.
max_total: Max total fraction.
min_liquidity: Minimum uninvested fraction.
n_simulations: Monte Carlo scenarios for risk estimation.
Returns:
DataFrame with frontier points showing return, risk, and
allocation details.
"""
n = len(probabilities)
# Generate outcome scenarios
outcomes = self.rng.binomial(
1, probabilities, size=(n_simulations, n)
)
returns = outcomes * (decimal_odds - 1) + (1 - outcomes) * (-1)
def compute_metrics(f):
"""Compute portfolio metrics for a given allocation."""
portfolio_returns = returns @ f
return {
"expected_return": np.mean(portfolio_returns),
"variance": np.var(portfolio_returns, ddof=1),
"std": np.std(portfolio_returns, ddof=1),
"sharpe": (
np.mean(portfolio_returns) / np.std(portfolio_returns, ddof=1)
if np.std(portfolio_returns) > 0 else 0
),
"max_drawdown": self._compute_max_drawdown(portfolio_returns),
"prob_loss": np.mean(portfolio_returns < 0),
"var_95": np.percentile(portfolio_returns, 5),
"cvar_95": np.mean(
portfolio_returns[portfolio_returns <= np.percentile(portfolio_returns, 5)]
),
}
# Find the range of achievable risk levels
# Minimum risk: minimize variance
def neg_return(f):
return -np.mean(returns @ f)
def portfolio_var(f):
return np.var(returns @ f, ddof=1)
bounds = [(0, max_individual)] * n
total_constraint = {
"type": "ineq",
"fun": lambda f: max_total - np.sum(f),
}
liquidity_constraint = {
"type": "ineq",
"fun": lambda f: (1 - min_liquidity) - np.sum(f),
}
# Minimum variance portfolio
f0 = np.full(n, min(max_individual, max_total / n))
min_var_result = minimize(
portfolio_var, f0, method="SLSQP",
bounds=bounds,
constraints=[total_constraint, liquidity_constraint],
)
min_var = portfolio_var(min_var_result.x)
# Maximum return portfolio
max_ret_result = minimize(
neg_return, f0, method="SLSQP",
bounds=bounds,
constraints=[total_constraint, liquidity_constraint],
)
max_var = portfolio_var(max_ret_result.x)
# Trace the frontier
var_range = np.linspace(min_var, max_var, n_points)
frontier = []
for target_var in var_range:
var_constraint = {
"type": "ineq",
"fun": lambda f, tv=target_var: tv - portfolio_var(f),
}
result = minimize(
neg_return, f0, method="SLSQP",
bounds=bounds,
constraints=[
total_constraint, liquidity_constraint, var_constraint
],
)
if result.success:
f_opt = np.maximum(result.x, 0)
metrics = compute_metrics(f_opt)
metrics["allocation"] = f_opt.copy()
metrics["total_exposure"] = f_opt.sum()
metrics["n_active_bets"] = np.sum(f_opt > 1e-4)
frontier.append(metrics)
return pd.DataFrame(frontier)
def weighted_sum_optimization(
self,
probabilities: np.ndarray,
decimal_odds: np.ndarray,
weight_return: float = 1.0,
weight_risk: float = 1.0,
weight_concentration: float = 0.0,
max_individual: float = 0.10,
max_total: float = 0.50,
n_simulations: int = 10000,
) -> dict:
"""
Optimize a weighted combination of objectives.
Objective = w_ret * E[return] - w_risk * Var(return)
- w_conc * HHI(allocation)
where HHI (Herfindahl-Hirschman Index) measures concentration.
Args:
probabilities: True win probabilities.
decimal_odds: Decimal odds.
weight_return: Weight on expected return.
weight_risk: Weight on variance (penalty).
weight_concentration: Weight on concentration (penalty).
max_individual: Max fraction per bet.
max_total: Max total fraction.
n_simulations: Monte Carlo scenarios.
Returns:
Dictionary with optimal allocation and objective breakdown.
"""
n = len(probabilities)
outcomes = self.rng.binomial(
1, probabilities, size=(n_simulations, n)
)
returns = outcomes * (decimal_odds - 1) + (1 - outcomes) * (-1)
def neg_weighted_objective(f):
port_ret = returns @ f
exp_ret = np.mean(port_ret)
var_ret = np.var(port_ret, ddof=1)
# HHI: sum of squared weights (normalized)
total_w = f.sum()
if total_w > 0:
normalized = f / total_w
hhi = np.sum(normalized ** 2)
else:
hhi = 0
objective = (
weight_return * exp_ret
- weight_risk * var_ret
- weight_concentration * hhi
)
return -objective
bounds = [(0, max_individual)] * n
constraints = [
{"type": "ineq", "fun": lambda f: max_total - np.sum(f)},
]
f0 = np.full(n, min(max_individual, max_total / n))
result = minimize(
neg_weighted_objective, f0, method="SLSQP",
bounds=bounds, constraints=constraints,
)
f_opt = np.maximum(result.x, 0)
f_opt[f_opt < 1e-6] = 0
port_ret = returns @ f_opt
exp_ret = np.mean(port_ret)
var_ret = np.var(port_ret, ddof=1)
total_w = f_opt.sum()
if total_w > 0:
normalized = f_opt / total_w
hhi = np.sum(normalized ** 2)
else:
hhi = 0
return {
"optimal_allocation": f_opt,
"expected_return": exp_ret,
"variance": var_ret,
"std": np.sqrt(var_ret),
"concentration_hhi": hhi,
"total_exposure": total_w,
"n_active_bets": int(np.sum(f_opt > 0)),
"weighted_objective": -result.fun,
"sharpe": exp_ret / np.sqrt(var_ret) if var_ret > 0 else np.inf,
}
@staticmethod
    def _compute_max_drawdown(returns: np.ndarray) -> float:
        """Maximum drawdown of cumulative returns, treating the scenario
        array as one simulated sequence of betting rounds."""
        cumsum = np.cumsum(returns)
        running_max = np.maximum.accumulate(cumsum)
        drawdowns = running_max - cumsum
        return float(drawdowns.max()) if len(drawdowns) > 0 else 0.0
# Worked Example: Multi-Objective Betting Optimization
print("MULTI-OBJECTIVE OPTIMIZATION")
print("=" * 60)
optimizer = MultiObjectiveBettingOptimizer(seed=42)
# 8 available bets
bets = pd.DataFrame({
"name": [
"KC ML", "PHI -3", "Over 48", "DEN ML",
"Under 44", "GB +1", "BAL -6", "MIA ML",
],
"true_prob": [0.58, 0.55, 0.54, 0.57, 0.53, 0.52, 0.56, 0.55],
"decimal_odds": [2.10, 1.91, 1.91, 1.95, 1.91, 1.95, 1.87, 2.05],
})
bets["edge"] = bets["true_prob"] * bets["decimal_odds"] - 1
print("Available Bets:")
print(bets[["name", "true_prob", "decimal_odds", "edge"]].to_string(index=False))
# Compute Pareto frontier
print("\nPARETO FRONTIER: RETURN VS RISK")
print("-" * 55)
frontier = optimizer.compute_pareto_frontier(
probabilities=bets["true_prob"].values,
decimal_odds=bets["decimal_odds"].values,
n_points=10,
max_individual=0.08,
max_total=0.40,
min_liquidity=0.60,
)
if len(frontier) > 0:
print(f"{'E[Return]':>10} {'Std Dev':>10} {'Sharpe':>10} "
f"{'P(Loss)':>10} {'N Bets':>8}")
print("-" * 55)
for _, row in frontier.iterrows():
print(f"{row['expected_return']:>10.4f} {row['std']:>10.4f} "
f"{row['sharpe']:>10.4f} {row['prob_loss']:>10.1%} "
f"{row['n_active_bets']:>8.0f}")
# Weighted-sum optimization with different preference profiles
print("\nWEIGHTED SUM OPTIMIZATION")
print("-" * 60)
profiles = {
"Aggressive": (2.0, 0.5, 0.0),
"Balanced": (1.0, 1.0, 0.5),
"Conservative": (0.5, 2.0, 1.0),
"Diversified": (1.0, 1.0, 2.0),
}
for profile_name, (w_ret, w_risk, w_conc) in profiles.items():
result = optimizer.weighted_sum_optimization(
probabilities=bets["true_prob"].values,
decimal_odds=bets["decimal_odds"].values,
weight_return=w_ret,
weight_risk=w_risk,
weight_concentration=w_conc,
max_individual=0.08,
max_total=0.40,
)
print(f"\n{profile_name} Profile (w_ret={w_ret}, w_risk={w_risk}, "
f"w_conc={w_conc}):")
print(f" E[return]: {result['expected_return']:.4f}")
print(f" Std dev: {result['std']:.4f}")
print(f" Sharpe: {result['sharpe']:.4f}")
print(f" HHI: {result['concentration_hhi']:.4f}")
print(f" Total exposure: {result['total_exposure']:.2%}")
print(f" Active bets: {result['n_active_bets']}")
# Show top allocations
alloc = result["optimal_allocation"]
sorted_idx = np.argsort(-alloc)
for idx in sorted_idx[:4]:
if alloc[idx] > 0.001:
print(f" {bets['name'].iloc[idx]:<15} "
f"{alloc[idx]:>6.2%}")
Choosing Your Objective Weights
The multi-objective framework forces you to be explicit about your preferences. Some guidelines:
- Professional bettors with large bankrolls and long time horizons can tolerate more risk. They should weight expected return more heavily and accept higher variance.
- Recreational bettors with limited bankrolls should weight risk reduction heavily. The psychological cost of a 30% drawdown is enormous, and it may force suboptimal decisions (like chasing losses).
- Bettors with multiple sportsbook accounts should weight concentration heavily to avoid triggering limits at any single book. A diversified pattern of moderate bets is more sustainable than large concentrated bets.
- Bettors in regulated markets with daily/weekly loss limits should incorporate these as hard constraints, not soft penalties.
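As a sketch of that last point, a regulatory loss limit can enter the optimizer as a hard scenario-based constraint rather than a weighted penalty. The example below (bets, cap, and seed are illustrative) caps the empirical 95% VaR of the one-day portfolio return; in production a smooth CVaR reformulation is preferable, since the percentile constraint is non-smooth in general:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
probs = np.array([0.58, 0.55, 0.54])   # illustrative edges
odds = np.array([2.10, 1.91, 1.91])
outcomes = rng.binomial(1, probs, size=(20_000, 3))
returns = outcomes * (odds - 1) - (1 - outcomes)

daily_loss_cap = 0.05   # hard regulatory cap on one-day loss

def neg_expected_return(f):
    return -np.mean(returns @ f)

constraints = [
    # Hard constraint, not a penalty: the empirical 95% VaR of the
    # portfolio return may not exceed the daily loss cap
    {"type": "ineq",
     "fun": lambda f: daily_loss_cap + np.percentile(returns @ f, 5)},
]
res = minimize(neg_expected_return, np.full(3, 0.02), method="SLSQP",
               bounds=[(0, 0.10)] * 3, constraints=constraints)
print(np.round(res.x, 4), f"E[return] = {-res.fun:.4f}")
```

Unlike a weighted penalty, the hard constraint guarantees the cap is respected in every solution the optimizer returns, which is what a regulator (or a risk desk) actually requires.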
Real-World Application: A bettor who has identified edges across NFL, NBA, and tennis markets faces a multi-objective problem every day: allocate limited bankroll across sports, manage time spent on each sport, maintain accounts at multiple books, and control overall risk. The multi-objective framework provides a principled way to make these trade-offs rather than relying on ad hoc rules. By computing the Pareto frontier once (a computationally expensive but one-time task), you can quickly identify the best strategy for any given day's priorities.
25.6 Chapter Summary
This chapter developed the optimization toolkit for translating probabilistic insights into optimal betting actions:
Linear Programming (Section 25.1): We formulated betting allocation as a linear program, maximizing expected profit subject to bet limits, exposure constraints, and group constraints. LP provides globally optimal solutions, efficient computation, and valuable sensitivity analysis. We implemented solvers using both PuLP and cvxpy.
Portfolio Optimization (Section 25.2): We adapted Markowitz mean-variance theory to construct efficient betting portfolios. The key insight is that correlations between bet outcomes matter: diversification across uncorrelated bets reduces risk without sacrificing expected return. We computed the efficient frontier, showing the trade-off between return and risk.
Arbitrage Detection (Section 25.3): We built algorithmic tools for scanning odds across multiple sportsbooks to identify guaranteed-profit opportunities. While mathematically straightforward, we discussed the practical challenges of speed, limits, execution risk, and account sustainability.
Constrained Kelly (Section 25.4): We extended the Kelly criterion to handle multiple simultaneous bets with correlations, individual bet limits, and total exposure constraints. We demonstrated that fractional Kelly (0.25 to 0.50 of full Kelly) provides a much better risk-return trade-off than full Kelly when probability estimates contain errors.
Multi-Objective Optimization (Section 25.5): We developed frameworks for balancing competing objectives --- profit, risk, concentration, and liquidity --- using both the Pareto frontier (epsilon-constraint method) and weighted-sum optimization. This provides a principled approach to the messy, multi-dimensional reality of betting strategy design.
Key Takeaways
- Every betting decision is an optimization problem. Making it explicit --- defining the objective, constraints, and decision variables --- leads to better decisions than intuitive allocation.
- Linear programming is the starting point: fast, globally optimal, and provides sensitivity analysis. Use it when the objective and constraints are linear.
- Bet correlations are not optional. Ignoring them leads to overconcentrated portfolios with much higher risk than intended. Always estimate and account for correlations, even approximately.
- Fractional Kelly dominates full Kelly in practice because our probability estimates are uncertain. Use 25-50% of full Kelly as a default.
- Multi-objective optimization makes trade-offs explicit. Instead of guessing how to balance profit and risk, compute the Pareto frontier and choose consciously.
- Arbitrage opportunities are mathematically simple but operationally complex. Treat arb detection as one tool in a diversified strategy, not as a standalone approach.
What Comes Next
With the optimization toolkit now in hand, you have the complete analytical pipeline: from data collection and feature engineering (Parts II-III), through modeling and evaluation (Part IV), to the advanced quantitative methods of Part V --- time series analysis, simulation, and optimization. The remaining chapters will address the practical, operational, and psychological dimensions of applying these tools in real-world betting markets.
Chapter 25 Exercises
- Formulate and solve a linear program for allocating a $5,000 bankroll across 10 bets with the following constraints: no more than $500 on any single bet, no more than $1,500 on any single sport, and no more than $3,000 total wagered. Generate random edges between 2% and 8% for each bet. How does the solution change if you tighten the individual bet limit to $300?
- Using the portfolio optimization framework, construct the efficient frontier for a set of 6 bets with the following correlation structure: bets 1-2 are in the same game (correlation 0.3), bets 3-4 are in the same game (correlation 0.25), and bets 5-6 are independent of all others. Plot the frontier and identify the maximum Sharpe ratio portfolio. How does the optimal portfolio change if you double the correlations?
- Write an arbitrage scanner that monitors odds for 5 events across 4 sportsbooks. Generate realistic odds (each book's margin should be 4-6%) and randomly perturb them to occasionally create cross-book arbs. Over 1,000 "snapshots" of odds, how many arb opportunities arise, and what is the average margin? How does reducing each book's margin to 2-3% change the frequency of arbs?
- Implement the full constrained multi-bet Kelly for a scenario with 4 correlated bets (NFL game: spread, total, moneyline, team total). The correlation matrix should reflect that these bets share a common game. Compare full Kelly, half Kelly, and quarter Kelly in terms of expected growth, variance, and worst-case loss. How sensitive is the optimal allocation to a 3 percentage point error in each probability estimate?
- Design a multi-objective betting strategy for a bettor with accounts at 3 sportsbooks who wants to maximize expected profit, minimize variance, and minimize the maximum bet at any single book (to avoid being limited). Compute the Pareto frontier over expected profit and max-single-book exposure, holding variance at a moderate level. How does the optimal strategy differ from a naive equal-allocation approach?