Case Study 41.1: Building an End-to-End NBA Betting System
Overview
Marcus Chen had spent two years building individual components: a data pipeline that pulled NBA box scores and advanced metrics, a logistic regression model predicting game outcomes, and a spreadsheet tracking his bets. Each piece worked in isolation. But his results were inconsistent --- profitable one month, losing the next --- and he could not figure out why. The problem was not any single component. The problem was that the components were not integrated into a coherent system.
This case study follows Marcus as he redesigns his operation from the ground up, implementing the complete eight-stage betting workflow described in Chapter 41. We walk through each stage with working Python code, showing how data flows from collection through to performance review, and demonstrating the feedback loops that transform a collection of tools into an adaptive, self-improving system.
The Starting Point
Marcus's NBA model used five features: home-court indicator, point differential over the last 10 games, offensive rating, defensive rating, and rest days. Trained on two seasons of data, the model produced predictions that beat a naive baseline but lagged behind the market. His record over 400 bets showed a 51.3% win rate on sides at -110 odds --- a -2.1% ROI. He was paying the vig without generating enough edge to overcome it.
The diagnosis: Marcus was skipping critical stages of the workflow. He had Stage 2 (model) and Stage 6 (bet placement), but lacked a systematic approach to the other six stages.
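A quick sanity check (simple arithmetic, not part of Marcus's codebase) makes the problem concrete: at -110, a bettor must win 110/210, or about 52.4%, of bets just to break even, so a 51.3% win rate guarantees a slow bleed.
def breakeven_win_rate(american_odds: int) -> float:
    """Win rate needed to break even at the given American odds."""
    if american_odds > 0:
        return 100.0 / (american_odds + 100.0)
    return abs(american_odds) / (abs(american_odds) + 100.0)

print(f"Break-even at -110: {breakeven_win_rate(-110):.1%}")   # 52.4%
print(f"ROI at a 51.3% win rate: {0.513 * (100 / 110) - 0.487:+.1%}")  # -2.1%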
Stage 1: Rebuilding the Data Pipeline
Marcus's first overhaul targeted data collection and preparation. His original pipeline pulled data from a single source and did not validate quality.
"""
NBA Betting System - Stage 1: Data Pipeline
Demonstrates a complete data collection, cleaning, and validation pipeline.
"""
import numpy as np
import pandas as pd
from datetime import timedelta
from typing import Dict, List, Optional, Tuple
def generate_nba_season_data(
n_games: int = 1230,
n_teams: int = 30,
start_date: str = "2023-10-24"
) -> pd.DataFrame:
"""Generate realistic synthetic NBA regular season data.
Args:
n_games: Number of games in the season.
n_teams: Number of teams in the league.
start_date: Season start date string.
Returns:
DataFrame with game-level data including scores, stats, and odds.
"""
np.random.seed(42)
teams = [f"Team_{i:02d}" for i in range(1, n_teams + 1)]
team_strengths = {t: np.random.normal(0, 5) for t in teams}
games = []
current_date = pd.Timestamp(start_date)
for game_id in range(n_games):
home_team = np.random.choice(teams)
away_team = np.random.choice([t for t in teams if t != home_team])
home_advantage = 3.5
home_strength = team_strengths[home_team]
away_strength = team_strengths[away_team]
spread_true = home_strength - away_strength + home_advantage
home_score = int(np.random.normal(112 + spread_true / 2, 12))
away_score = int(np.random.normal(112 - spread_true / 2, 12))
home_prob_true = 1 / (1 + np.exp(-spread_true / 6))
market_noise = np.random.normal(0, 0.03)
market_prob = np.clip(home_prob_true + market_noise, 0.15, 0.85)
        # Quote each side with a small vig so the implied probabilities sum
        # above 1, as real sportsbook two-way lines do.
        vig = 0.045
        imp_home = market_prob + vig / 2
        imp_away = (1 - market_prob) + vig / 2
        home_odds = (int(-100 * imp_home / (1 - imp_home)) if imp_home > 0.5
                     else int(100 * (1 - imp_home) / imp_home))
        away_odds = (int(-100 * imp_away / (1 - imp_away)) if imp_away > 0.5
                     else int(100 * (1 - imp_away) / imp_away))
pace = np.random.normal(100, 3)
off_rtg_home = np.random.normal(112 + home_strength * 0.6, 5)
def_rtg_home = np.random.normal(112 - home_strength * 0.4, 5)
off_rtg_away = np.random.normal(112 + away_strength * 0.6, 5)
def_rtg_away = np.random.normal(112 - away_strength * 0.4, 5)
games.append({
"game_id": f"G{game_id:05d}",
"game_date": current_date,
"home_team": home_team,
"away_team": away_team,
"home_score": max(home_score, 70),
"away_score": max(away_score, 70),
"home_odds": home_odds,
"away_odds": away_odds,
"pace": round(pace, 1),
"home_off_rtg": round(off_rtg_home, 1),
"home_def_rtg": round(def_rtg_home, 1),
"away_off_rtg": round(off_rtg_away, 1),
"away_def_rtg": round(def_rtg_away, 1),
"home_rest_days": np.random.choice([1, 2, 3, 4], p=[0.5, 0.3, 0.15, 0.05]),
"away_rest_days": np.random.choice([1, 2, 3, 4], p=[0.5, 0.3, 0.15, 0.05]),
})
if game_id % 15 == 0:
current_date += timedelta(days=1)
if game_id % 5 == 0:
current_date += timedelta(days=1)
df = pd.DataFrame(games)
df["home_win"] = (df["home_score"] > df["away_score"]).astype(int)
df["point_diff"] = df["home_score"] - df["away_score"]
df["total_points"] = df["home_score"] + df["away_score"]
return df
class NBADataPipeline:
"""End-to-end data pipeline for NBA betting analysis.
Collects, cleans, validates, and engineers features from NBA game data.
Attributes:
raw_data: The unprocessed game data.
clean_data: Data after cleaning and validation.
features: Fully engineered feature set ready for modeling.
"""
def __init__(self) -> None:
self.raw_data: Optional[pd.DataFrame] = None
self.clean_data: Optional[pd.DataFrame] = None
self.features: Optional[pd.DataFrame] = None
def load_data(self, data: pd.DataFrame) -> "NBADataPipeline":
"""Load raw data into the pipeline.
Args:
data: DataFrame containing NBA game data.
Returns:
Self for method chaining.
"""
self.raw_data = data.copy()
print(f"Loaded {len(data)} games from "
f"{data['game_date'].min()} to {data['game_date'].max()}")
return self
def clean(self) -> "NBADataPipeline":
"""Clean raw data by removing duplicates and handling missing values.
Returns:
Self for method chaining.
"""
if self.raw_data is None:
raise ValueError("No data loaded. Call load_data first.")
df = self.raw_data.copy()
initial_rows = len(df)
df = df.drop_duplicates(subset=["game_id"])
df = df.dropna(subset=["home_score", "away_score", "home_odds", "away_odds"])
numeric_cols = df.select_dtypes(include=[np.number]).columns
for col in numeric_cols:
median_val = df[col].median()
df[col] = df[col].fillna(median_val)
df = df[(df["home_score"] >= 50) & (df["home_score"] <= 200)]
df = df[(df["away_score"] >= 50) & (df["away_score"] <= 200)]
self.clean_data = df.reset_index(drop=True)
removed = initial_rows - len(self.clean_data)
print(f"Cleaning: removed {removed} rows, {len(self.clean_data)} remaining")
return self
def validate(self) -> Dict:
"""Run data quality checks on the cleaned dataset.
Returns:
Dictionary containing a data quality report with counts
of missing values, duplicates, date range, and outliers.
"""
if self.clean_data is None:
raise ValueError("No cleaned data. Call clean() first.")
df = self.clean_data
report: Dict = {}
missing = df.isnull().sum()
report["missing_values"] = missing[missing > 0].to_dict()
report["duplicate_games"] = df.duplicated(subset=["game_id"]).sum()
report["date_range"] = {
"min": str(df["game_date"].min()),
"max": str(df["game_date"].max()),
"total_games": len(df),
}
report["home_win_rate"] = round(df["home_win"].mean(), 4)
report["avg_total_points"] = round(df["total_points"].mean(), 1)
print(f"Validation report: {len(df)} games, "
f"home win rate: {report['home_win_rate']:.1%}, "
f"avg total: {report['avg_total_points']}")
return report
def engineer_features(self, window: int = 10) -> "NBADataPipeline":
"""Create predictive features from cleaned data.
All features use shift(1) to prevent look-ahead bias.
Args:
window: Rolling window size for moving averages.
Returns:
Self for method chaining.
"""
if self.clean_data is None:
raise ValueError("No cleaned data. Call clean() first.")
df = self.clean_data.sort_values("game_date").copy()
for team_col, prefix in [("home_team", "home"), ("away_team", "away")]:
score_col = f"{prefix}_score"
df[f"{prefix}_pts_rolling_{window}"] = (
df.groupby(team_col)[score_col]
.transform(lambda x: x.shift(1).rolling(window, min_periods=3).mean())
)
if prefix == "home":
df[f"{prefix}_margin_rolling_{window}"] = (
df.groupby(team_col)["point_diff"]
.transform(lambda x: x.shift(1).rolling(window, min_periods=3).mean())
)
else:
df[f"{prefix}_margin_rolling_{window}"] = (
df.groupby(team_col)["point_diff"]
.transform(lambda x: (-x).shift(1).rolling(window, min_periods=3).mean())
)
df[f"{prefix}_off_rtg_rolling"] = (
df.groupby(team_col)[f"{prefix}_off_rtg"]
.transform(lambda x: x.shift(1).rolling(window, min_periods=3).mean())
)
df[f"{prefix}_def_rtg_rolling"] = (
df.groupby(team_col)[f"{prefix}_def_rtg"]
.transform(lambda x: x.shift(1).rolling(window, min_periods=3).mean())
)
df["net_rating_diff"] = (
(df["home_off_rtg_rolling"] - df["home_def_rtg_rolling"])
- (df["away_off_rtg_rolling"] - df["away_def_rtg_rolling"])
)
df["rest_advantage"] = df["home_rest_days"] - df["away_rest_days"]
df["margin_diff"] = (
df[f"home_margin_rolling_{window}"]
- df[f"away_margin_rolling_{window}"]
)
self.features = df.dropna().reset_index(drop=True)
print(f"Feature engineering complete: {len(self.features)} games "
f"with {len(self.features.columns)} columns")
return self
def get_modeling_data(self) -> Tuple[pd.DataFrame, pd.Series]:
"""Return feature matrix and target for modeling.
Returns:
Tuple of (X features DataFrame, y target Series).
"""
if self.features is None:
raise ValueError("No features. Call engineer_features() first.")
feature_cols = [
c for c in self.features.columns
if "rolling" in c or c in ["rest_advantage", "net_rating_diff", "margin_diff"]
]
X = self.features[feature_cols]
y = self.features["home_win"]
return X, y
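A usage sketch, chaining the pipeline methods defined above on the synthetic season (names follow the code in this section):
# Usage sketch: run Stage 1 end to end on the synthetic season.
pipeline = NBADataPipeline()
pipeline.load_data(generate_nba_season_data()).clean()
quality_report = pipeline.validate()
pipeline.engineer_features(window=10)
X, y = pipeline.get_modeling_data()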
Stages 2-3: Model Training and Signal Generation
With clean, feature-rich data, Marcus rebuilt his models and added a second model type to create an ensemble.
"""
NBA Betting System - Stages 2-3: Model Training and Signal Generation
"""
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import TimeSeriesSplit
from sklearn.metrics import brier_score_loss
class NBAModelSuite:
"""Train and manage multiple NBA prediction models.
Provides ensemble predictions with dynamic weighting based on
recent Brier score performance.
Attributes:
models: Dictionary mapping model names to fitted model objects.
weights: Dictionary mapping model names to ensemble weights.
feature_cols: List of feature column names used for prediction.
"""
def __init__(self, feature_cols: List[str]) -> None:
self.feature_cols = feature_cols
self.models: Dict[str, object] = {}
self.weights: Dict[str, float] = {}
self._brier_history: Dict[str, List[float]] = {}
def train(self, X: pd.DataFrame, y: pd.Series) -> Dict[str, float]:
"""Train logistic regression and gradient boosting models.
Args:
X: Feature matrix.
y: Binary target (1 = home win).
Returns:
Dictionary of model name to validation Brier score.
"""
tscv = TimeSeriesSplit(n_splits=3)
val_scores: Dict[str, List[float]] = {"logistic": [], "gbm": []}
lr = LogisticRegression(C=1.0, max_iter=1000, random_state=42)
gbm = GradientBoostingClassifier(
n_estimators=100, max_depth=3, learning_rate=0.1, random_state=42
)
for train_idx, val_idx in tscv.split(X):
X_train, X_val = X.iloc[train_idx], X.iloc[val_idx]
y_train, y_val = y.iloc[train_idx], y.iloc[val_idx]
lr.fit(X_train, y_train)
gbm.fit(X_train, y_train)
val_scores["logistic"].append(
brier_score_loss(y_val, lr.predict_proba(X_val)[:, 1])
)
val_scores["gbm"].append(
brier_score_loss(y_val, gbm.predict_proba(X_val)[:, 1])
)
lr.fit(X, y)
gbm.fit(X, y)
self.models = {"logistic": lr, "gbm": gbm}
avg_brier = {name: np.mean(scores) for name, scores in val_scores.items()}
self._update_weights(avg_brier)
print("Model training complete:")
for name, score in avg_brier.items():
print(f" {name}: Brier={score:.4f}, weight={self.weights[name]:.3f}")
return avg_brier
def _update_weights(self, brier_scores: Dict[str, float]) -> None:
"""Update ensemble weights using inverse-Brier weighting.
Args:
brier_scores: Dictionary of model name to Brier score.
"""
inv_scores = {
name: 1.0 / max(score, 0.001) for name, score in brier_scores.items()
}
total = sum(inv_scores.values())
self.weights = {name: val / total for name, val in inv_scores.items()}
def predict_ensemble(self, X: pd.DataFrame) -> np.ndarray:
"""Generate weighted ensemble probability predictions.
Args:
X: Feature matrix for upcoming games.
Returns:
Array of home win probability estimates.
"""
ensemble_prob = np.zeros(len(X))
for name, model in self.models.items():
probs = model.predict_proba(X[self.feature_cols])[:, 1]
ensemble_prob += self.weights[name] * probs
return ensemble_prob
def generate_signals(
self,
games: pd.DataFrame,
min_edge: float = 0.03,
min_confidence: float = 0.55,
) -> pd.DataFrame:
"""Generate betting signals from ensemble predictions.
Compares model probabilities to market-implied probabilities
and filters by minimum edge and confidence thresholds.
Args:
games: DataFrame with features and market odds.
min_edge: Minimum required edge to generate a signal.
min_confidence: Minimum model probability for a signal.
Returns:
DataFrame of actionable signals with edge calculations.
"""
        games = games.copy()
games["model_prob_home"] = self.predict_ensemble(games)
games["model_prob_away"] = 1 - games["model_prob_home"]
games["market_prob_home"] = games["home_odds"].apply(_american_to_implied)
games["market_prob_away"] = games["away_odds"].apply(_american_to_implied)
total_implied = games["market_prob_home"] + games["market_prob_away"]
games["market_novig_home"] = games["market_prob_home"] / total_implied
games["market_novig_away"] = games["market_prob_away"] / total_implied
games["edge_home"] = games["model_prob_home"] - games["market_novig_home"]
games["edge_away"] = games["model_prob_away"] - games["market_novig_away"]
signals = []
for _, row in games.iterrows():
if row["edge_home"] >= min_edge and row["model_prob_home"] >= min_confidence:
signals.append({
"game_id": row["game_id"],
"game_date": row["game_date"],
"game": f"{row['away_team']} @ {row['home_team']}",
"side": "HOME",
"model_prob": round(row["model_prob_home"], 4),
"market_prob": round(row["market_novig_home"], 4),
"edge": round(row["edge_home"], 4),
"odds": row["home_odds"],
})
if row["edge_away"] >= min_edge and row["model_prob_away"] >= min_confidence:
signals.append({
"game_id": row["game_id"],
"game_date": row["game_date"],
"game": f"{row['away_team']} @ {row['home_team']}",
"side": "AWAY",
"model_prob": round(row["model_prob_away"], 4),
"market_prob": round(row["market_novig_away"], 4),
"edge": round(row["edge_away"], 4),
"odds": row["away_odds"],
})
result = pd.DataFrame(signals)
print(f"Generated {len(result)} signals from {len(games)} games "
f"(min_edge={min_edge}, min_conf={min_confidence})")
return result
def _american_to_implied(odds: int) -> float:
"""Convert American odds to implied probability.
Args:
odds: American odds value (positive or negative).
Returns:
Implied probability between 0 and 1.
"""
if odds > 0:
return 100.0 / (odds + 100.0)
else:
return abs(odds) / (abs(odds) + 100.0)
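The no-vig normalization in generate_signals builds directly on this conversion. A short illustration on a standard two-way -110/-110 line (a sketch, using only the function above) shows how the book's margin is stripped out:
# Illustration: vig removal on a standard -110 / -110 two-way line.
p_home = _american_to_implied(-110)   # 0.5238
p_away = _american_to_implied(-110)   # 0.5238
overround = p_home + p_away           # 1.0476 -- the book's margin
print(f"No-vig home probability: {p_home / overround:.4f}")  # 0.5000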
Stages 4-6: Bet Sizing, Risk Budget, and Execution
The signals flowed into a unified decision engine that sized bets, checked risk constraints, and produced a final bet sheet.
"""
NBA Betting System - Stages 4-6: Sizing, Risk, and Execution
"""
class BetSizer:
"""Size bets using fractional Kelly criterion.
Args:
bankroll: Current bankroll in dollars.
kelly_fraction: Fraction of full Kelly to use (default 0.25).
max_bet_pct: Maximum bet as percentage of bankroll.
"""
def __init__(
self,
bankroll: float,
kelly_fraction: float = 0.25,
max_bet_pct: float = 0.03,
) -> None:
self.bankroll = bankroll
self.kelly_fraction = kelly_fraction
self.max_bet_pct = max_bet_pct
def size_bet(self, model_prob: float, decimal_odds: float) -> float:
"""Calculate bet size using fractional Kelly.
Args:
model_prob: Model's estimated probability of winning.
decimal_odds: Decimal odds offered by the sportsbook.
Returns:
Recommended stake in dollars.
"""
b = decimal_odds - 1.0
q = 1.0 - model_prob
kelly = (b * model_prob - q) / b
if kelly <= 0:
return 0.0
fraction_kelly = kelly * self.kelly_fraction
max_bet = self.bankroll * self.max_bet_pct
stake = min(fraction_kelly * self.bankroll, max_bet)
return round(max(stake, 0), 2)
def american_to_decimal(odds: int) -> float:
"""Convert American odds to decimal odds.
Args:
odds: American odds (positive or negative integer).
Returns:
Decimal odds (always > 1.0).
"""
if odds > 0:
return 1.0 + odds / 100.0
else:
return 1.0 + 100.0 / abs(odds)
def build_bet_sheet(
signals: pd.DataFrame,
bankroll: float,
kelly_fraction: float = 0.25,
) -> pd.DataFrame:
"""Build a complete bet sheet from filtered signals.
Sizes each bet using fractional Kelly and produces an
execution-ready DataFrame.
Args:
signals: DataFrame of signals with model_prob and odds columns.
bankroll: Current bankroll in dollars.
kelly_fraction: Fraction of full Kelly to use.
Returns:
DataFrame with bet sheet including stakes and expected values.
"""
sizer = BetSizer(bankroll, kelly_fraction=kelly_fraction)
bets = []
for _, signal in signals.iterrows():
decimal_odds = american_to_decimal(signal["odds"])
stake = sizer.size_bet(signal["model_prob"], decimal_odds)
if stake > 0:
ev = signal["model_prob"] * (decimal_odds - 1) - (1 - signal["model_prob"])
bets.append({
"game_id": signal["game_id"],
"game_date": signal["game_date"],
"game": signal["game"],
"side": signal["side"],
"model_prob": signal["model_prob"],
"edge": signal["edge"],
"odds": signal["odds"],
"decimal_odds": round(decimal_odds, 3),
"stake": stake,
"ev_per_dollar": round(ev, 4),
"expected_profit": round(stake * ev, 2),
})
bet_sheet = pd.DataFrame(bets)
if len(bet_sheet) > 0:
print(f"\nBet Sheet: {len(bet_sheet)} bets, "
f"total stake: ${bet_sheet['stake'].sum():.2f}, "
f"total expected profit: ${bet_sheet['expected_profit'].sum():.2f}")
return bet_sheet
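A worked example with hypothetical numbers makes the sizing math concrete. At a 58% model probability against -110, full Kelly stakes (b*p - q)/b = (0.909 * 0.58 - 0.42)/0.909, roughly 11.8% of bankroll; quarter Kelly is about 2.9%, just inside the 3% cap.
# Worked example (hypothetical numbers): quarter-Kelly stake at -110.
sizer = BetSizer(bankroll=10_000, kelly_fraction=0.25, max_bet_pct=0.03)
decimal_odds = american_to_decimal(-110)  # 1.909
stake = sizer.size_bet(model_prob=0.58, decimal_odds=decimal_odds)
print(f"Stake: ${stake:.2f}")  # ~$295, just under the $300 cap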
Stages 7-8: Settlement and Performance Review
After implementing the full workflow, Marcus ran a walk-forward simulation over 300 games to test the integrated system.
"""
NBA Betting System - Stages 7-8: Settlement and Review
"""
def simulate_walkforward(
pipeline: "NBADataPipeline",
train_size: int = 600,
test_size: int = 300,
bankroll: float = 10000.0,
kelly_fraction: float = 0.25,
) -> pd.DataFrame:
"""Run a walk-forward simulation of the complete betting system.
Trains models on the first train_size games, then predicts and
bets on the next test_size games one at a time.
Args:
pipeline: Fully processed NBADataPipeline instance.
train_size: Number of games for initial training.
test_size: Number of games for out-of-sample testing.
bankroll: Starting bankroll.
kelly_fraction: Fraction of full Kelly for bet sizing.
Returns:
DataFrame of all settled bets with P&L.
"""
features = pipeline.features
X, y = pipeline.get_modeling_data()
if len(features) < train_size + test_size:
test_size = len(features) - train_size
X_train = X.iloc[:train_size]
y_train = y.iloc[:train_size]
model_suite = NBAModelSuite(feature_cols=X.columns.tolist())
model_suite.train(X_train, y_train)
test_games = features.iloc[train_size:train_size + test_size].copy()
signals = model_suite.generate_signals(
test_games, min_edge=0.03, min_confidence=0.55
)
if len(signals) == 0:
print("No signals generated in test period.")
return pd.DataFrame()
bet_sheet = build_bet_sheet(signals, bankroll, kelly_fraction)
if len(bet_sheet) == 0:
print("No bets sized above zero.")
return pd.DataFrame()
results = bet_sheet.merge(
test_games[["game_id", "home_win"]], on="game_id", how="left"
)
results["won"] = results.apply(
lambda r: (r["side"] == "HOME" and r["home_win"] == 1)
or (r["side"] == "AWAY" and r["home_win"] == 0),
axis=1,
)
results["pnl"] = results.apply(
lambda r: r["stake"] * (r["decimal_odds"] - 1) if r["won"] else -r["stake"],
axis=1,
)
results["cumulative_pnl"] = results["pnl"].cumsum()
total_staked = results["stake"].sum()
total_pnl = results["pnl"].sum()
win_rate = results["won"].mean()
roi = total_pnl / total_staked * 100 if total_staked > 0 else 0
print(f"\n{'='*50}")
print("WALK-FORWARD SIMULATION RESULTS")
print(f"{'='*50}")
print(f"Total bets: {len(results)}")
print(f"Win rate: {win_rate:.1%}")
print(f"Total staked: ${total_staked:,.2f}")
print(f"Total P&L: ${total_pnl:,.2f}")
print(f"ROI: {roi:+.2f}%")
print(f"Max drawdown: ${results['cumulative_pnl'].cummax().sub(results['cumulative_pnl']).max():,.2f}")
print(f"Final bankroll: ${bankroll + total_pnl:,.2f}")
return results
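With the processed pipeline from Stage 1 in hand, the complete system runs as a single call (a usage sketch reusing the pipeline object built earlier):
# Run the complete walk-forward test on the processed Stage 1 pipeline.
results = simulate_walkforward(
    pipeline, train_size=600, test_size=300,
    bankroll=10_000.0, kelly_fraction=0.25,
)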
Results and Lessons Learned
Marcus's integrated system, tested over 300 out-of-sample games, generated 47 bets meeting his edge and confidence thresholds. The key metrics:
| Metric | Before Integration | After Integration |
|---|---|---|
| Bets per 300 games | ~280 (bet almost everything) | 47 (selective) |
| Win rate | 51.3% | 54.8% |
| ROI | -2.1% | +4.1% |
| Average edge at placement | Not tracked | 4.7% |
| Bets beating closing line | Not tracked | 57.4% |
The transformation was not about a better model --- though the ensemble did outperform either individual model. The transformation was about selectivity and integration. By filtering signals through edge thresholds, the system bet only when the model had meaningful conviction. By sizing bets with fractional Kelly, it wagered more on higher-edge opportunities. By tracking closing line value, Marcus could confirm that his edge was genuine.
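The code in this case study does not log closing prices, so the following is only a sketch of what a closing line value (CLV) check might look like; the placed_odds and closing_odds columns are hypothetical additions a production bet log would need to record.
# Sketch of a CLV check. The "placed_odds" and "closing_odds" columns are
# hypothetical -- the system above does not record them.
def clv_beat_rate(bet_log: pd.DataFrame) -> float:
    """Fraction of bets placed at a better price than the close."""
    placed = bet_log["placed_odds"].apply(_american_to_implied)
    closing = bet_log["closing_odds"].apply(_american_to_implied)
    return (placed < closing).mean()  # lower implied prob = better price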
Key Takeaways from Marcus's Rebuild
- Fewer bets, higher quality. Betting on only 16% of games (47 out of 300) dramatically improved ROI because it eliminated marginal bets where the model had no meaningful edge.
- The ensemble added robustness. Neither the logistic regression nor the gradient boosting model alone matched the ensemble's performance. The ensemble was particularly valuable on games where one model was overconfident.
- Risk management preserved capital. Fractional Kelly sizing at 25% meant that even when the model was wrong, losses were manageable. The maximum drawdown never exceeded 5% of bankroll.
- The feedback loop enabled improvement. Monthly performance reviews identified that the model's edge was strongest on back-to-back games (rest advantage) and weakest on nationally televised games (more efficient pricing). This insight guided feature engineering priorities for the next iteration; a sketch of this segment review follows the list.
- Process discipline was the hardest part. Marcus's biggest challenge was not technical --- it was resisting the urge to override the system on games where he had a "gut feeling." The months where he followed the system strictly were consistently the most profitable.
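A minimal sketch of that segment review, assuming the results frame from simulate_walkforward and the pipeline object are both still in scope:
# Segment review sketch: ROI split by whether the bet's game carried a
# home rest advantage. Assumes "results" and "pipeline" are in scope.
merged = results.merge(
    pipeline.features[["game_id", "rest_advantage"]], on="game_id", how="left"
)
for has_edge, grp in merged.groupby(merged["rest_advantage"] > 0):
    roi = grp["pnl"].sum() / grp["stake"].sum()
    print(f"home rest advantage={has_edge}: {len(grp)} bets, ROI {roi:+.1%}")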
Discussion Questions
- Marcus's system generated only 47 bets from 300 games. Would you be comfortable with this level of selectivity? What are the psychological challenges of passing on most games?
- The walk-forward simulation trains on a fixed window. How would you modify the system to adapt to mid-season changes like trades, injuries, or coaching changes?
- The case study uses simulated data. What additional challenges would you expect when implementing this system with real NBA data from sources like Basketball Reference or the NBA Stats API?
- Marcus's edge was strongest on rest-advantage situations. What happens when sportsbooks also start pricing rest advantage more accurately? How should Marcus prepare for this edge decay?