Chapter 25: Game Outcome Prediction - Key Takeaways
Executive Summary
Game outcome prediction synthesizes player evaluation, team analysis, and situational factors into probabilistic forecasts. While prediction models can achieve reasonable accuracy (65-68%), betting markets represent highly efficient aggregations of information that are difficult to beat consistently. This chapter provided frameworks for building prediction models, evaluating their performance, and understanding market efficiency.
Core Concepts
1. Prediction Targets
| Target | Definition | Typical Use |
|---|---|---|
| Win Probability | P(team wins game) | Risk assessment, playoff odds |
| Point Spread | Expected margin of victory | Betting, power rankings |
| Total Points | Expected combined score | Game planning, betting |
| Exact Score | Distribution over final scores | Prop bets, simulations |
Relationship:
Win Prob = P(Margin > 0) = Φ(Spread / StdDev), where Margin ~ Normal(Spread, StdDev)
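A minimal sketch of this conversion (assuming, as the quick-reference section later does, roughly normal margins with a ~12-point standard deviation, a value you should tune on your own data):

```python
# Spread -> win probability, assuming margins ~ Normal(spread, std_dev).
from scipy.stats import norm

def win_probability(spread: float, std_dev: float = 12.0) -> float:
    """P(favored team wins) given its expected margin (the spread)."""
    return norm.cdf(spread / std_dev)

print(round(win_probability(6.0), 3))  # 6-point favorite -> ~0.691
```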
2. Baseline Models
Always compare against baselines:
| Baseline | Accuracy | Method |
|---|---|---|
| Home team always | ~58% | Pick home team |
| Better record | ~61% | Pick better team |
| Simple ratings | ~63% | Point differential based |
| Elo ratings | ~66% | Dynamic strength ratings |
| Market closing line | ~67% | Vegas consensus |
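A hedged sketch of how the first two baselines could be scored on historical data; the DataFrame columns `home_score`, `away_score`, `home_record`, and `away_record` are hypothetical names, not a fixed schema:

```python
# Score the two simplest baselines against past games (illustrative only).
import pandas as pd

def baseline_accuracies(games: pd.DataFrame) -> dict:
    home_wins = games["home_score"] > games["away_score"]
    pick_home = home_wins.mean()  # "home team always" accuracy
    # "Better record": predict home iff home record >= away record.
    pick_better = ((games["home_record"] >= games["away_record"]) == home_wins).mean()
    return {"home_team_always": pick_home, "better_record": pick_better}
```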
3. Key Prediction Factors
Primary Factors:
- Team offensive/defensive efficiency
- Home court advantage (~3-4 points)
- Recent performance trend

Situational Factors:
- Rest days differential (~1-2 points)
- Travel distance (minor)
- Altitude (Denver: +1-2 points)
- Schedule density

Information Factors:
- Injuries (varies by player impact)
- Lineup changes
- Back-to-back games
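An illustrative sketch of how these factors might stack into a single spread estimate. The coefficients are the rough point values quoted above, not fitted parameters; treat every number as a placeholder:

```python
# Toy linear stack of the adjustments above, from the home team's side.
def predicted_home_spread(rating_diff: float,
                          rest_diff_days: int = 0,
                          denver_home: bool = False) -> float:
    """Expected home-team margin; positive means home is favored."""
    spread = rating_diff                 # home rating minus away rating, in points
    spread += 3.5                        # home court advantage (~3-4 points)
    spread += 1.5 * max(-1, min(1, rest_diff_days))  # rest edge (~1-2 points)
    if denver_home:
        spread += 1.5                    # altitude bump (+1-2 points)
    return spread
```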
4. Evaluation Metrics
For Binary Predictions:
- Accuracy: percent of predictions correct
- Brier Score: mean squared probability error (lower is better)
- Log Loss: -log(predicted probability of the actual outcome)

For Spread Predictions:
- MAE: mean absolute margin error
- RMSE: root mean squared error (~11-12 points is typical)
- ATS %: against-the-spread accuracy
Formulas:
Brier Score = (1/n) × sum((p_i - o_i)^2)
Log Loss = -(1/n) × sum(o_i×log(p_i) + (1-o_i)×log(1-p_i))
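Direct implementations of these two formulas, as a minimal sketch:

```python
import numpy as np

def brier_score(p: np.ndarray, o: np.ndarray) -> float:
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return float(np.mean((p - o) ** 2))

def log_loss(p: np.ndarray, o: np.ndarray, eps: float = 1e-12) -> float:
    """Negative mean log-likelihood; clip probabilities to avoid log(0)."""
    p = np.clip(p, eps, 1 - eps)
    return float(-np.mean(o * np.log(p) + (1 - o) * np.log(1 - p)))
```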
5. Probability Calibration
Well-calibrated model: When predicting X%, outcomes should occur X% of the time.
Testing Calibration:
1. Group predictions by probability range
2. Compare predicted vs. actual win rates
3. Plot the calibration curve
4. Calculate Expected Calibration Error (ECE)
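A sketch of steps 1, 2, and 4 using fixed-width probability bins (bin count is a tunable assumption):

```python
import numpy as np

def calibration_table(p: np.ndarray, o: np.ndarray, n_bins: int = 10):
    """Per bin: (mean predicted probability, actual win rate, game count)."""
    bins = np.minimum((p * n_bins).astype(int), n_bins - 1)
    rows = []
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            rows.append((p[mask].mean(), o[mask].mean(), int(mask.sum())))
    return rows

def expected_calibration_error(p: np.ndarray, o: np.ndarray, n_bins: int = 10) -> float:
    """Count-weighted mean |predicted - actual| across bins."""
    rows = calibration_table(p, o, n_bins)
    n = sum(count for _, _, count in rows)
    return sum(count * abs(pred - actual) for pred, actual, count in rows) / n
```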
Practical Application Checklist
Building a Prediction Model
- [ ] Define prediction target (spread, win prob, total)
- [ ] Collect historical game data (3+ seasons)
- [ ] Calculate team strength metrics (Elo, efficiency, etc.)
- [ ] Engineer situational features (rest, travel, home)
- [ ] Handle missing data (injuries, early season)
- [ ] Train model with time-based validation
- [ ] Evaluate against baselines
- [ ] Check calibration
- [ ] Document uncertainty
Pre-Game Prediction Process
- [ ] Update team ratings with most recent results
- [ ] Apply home court adjustment
- [ ] Incorporate known injuries/absences
- [ ] Check for situational factors (B2B, travel)
- [ ] Generate spread prediction
- [ ] Convert to win probability
- [ ] Apply uncertainty bounds
- [ ] Compare to market line (if available)
Model Evaluation
- [ ] Calculate accuracy on holdout data
- [ ] Compare to relevant baselines
- [ ] Check calibration across probability ranges
- [ ] Test for systematic biases
- [ ] Measure ATS performance (if applicable)
- [ ] Calculate statistical significance
- [ ] Report confidence intervals
Key Formulas Quick Reference
Win Probability from Spread
Win_Prob = norm.cdf(Spread / Std_Dev)
Example: a 6-point favorite with a 12-point standard deviation: norm.cdf(6/12) = norm.cdf(0.5) ≈ 69%
Elo Expected Score
E_A = 1 / (1 + 10^((R_B - R_A) / 400))
Elo Update
R_new = R_old + K × (Actual - Expected)
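A minimal sketch of both Elo formulas together; K ≈ 20 is a common NBA choice, but it is a tunable assumption:

```python
def elo_expected(r_a: float, r_b: float) -> float:
    """Expected score for team A against team B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def elo_update(r_old: float, actual: float, expected: float, k: float = 20.0) -> float:
    """actual = 1 for a win, 0 for a loss; K controls update speed."""
    return r_old + k * (actual - expected)
```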
Pythagorean Expectation
Win% = PF^n / (PF^n + PA^n)
Where n ≈ 14 for NBA
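As a one-line function:

```python
def pythagorean_win_pct(points_for: float, points_against: float, n: float = 14.0) -> float:
    """Expected win% from points scored and allowed; n ≈ 14 for the NBA."""
    return points_for ** n / (points_for ** n + points_against ** n)
```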
Kelly Criterion
Kelly% = (bp - q) / b
Where: b = net decimal odds (decimal odds − 1), p = win probability, q = 1 − p
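A sketch with b expressed via decimal odds:

```python
def kelly_fraction(p: float, decimal_odds: float) -> float:
    """Fraction of bankroll to stake; a negative result means no bet."""
    b = decimal_odds - 1.0   # net odds: profit per unit staked
    q = 1.0 - p
    return (b * p - q) / b
```

For example, at -110 (decimal odds ≈ 1.909) with p = 0.55, the formula suggests staking about 5.5% of bankroll; many practitioners bet a fraction of Kelly to reduce variance.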
Break-Even Win Rate
At -110 odds (risk 110 to win 100): 110/210 = 52.4%
At -105 odds (risk 105 to win 100): 105/205 = 51.2%
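The same arithmetic as a small helper that works for any American odds:

```python
def break_even_rate(american_odds: int) -> float:
    """Win rate needed to break even at the given American odds."""
    if american_odds < 0:                        # e.g. -110: risk 110 to win 100
        return -american_odds / (-american_odds + 100)
    return 100 / (american_odds + 100)           # e.g. +120: risk 100 to win 120

print(round(break_even_rate(-110), 3))  # 0.524
print(round(break_even_rate(-105), 3))  # 0.512
```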
Common Mistakes to Avoid
Mistake 1: Ignoring Sample Size
Problem: Drawing conclusions from small samples.
Solution: Calculate statistical significance; 500+ games are typically needed for reliable conclusions.
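A sketch of the significance check using scipy's binomial test; the 290-of-550 record below is purely illustrative:

```python
# One-sided test of whether a win rate beats the 52.4% break-even at -110.
# binomtest is available in scipy >= 1.7.
from scipy.stats import binomtest

result = binomtest(k=290, n=550, p=0.524, alternative="greater")
print(result.pvalue)  # 290/550 = 52.7%: not significant on this sample
```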
Mistake 2: Overfitting
Problem: The model performs well on training data but poorly on new data.
Solution: Use time-based validation, regularization, and simpler models.
Mistake 3: Look-Ahead Bias
Problem: Using information that was not available at prediction time.
Solution: Enforce strict time-based data separation and walk-forward validation.
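A walk-forward skeleton, assuming `games` is sorted chronologically and `fit`/`evaluate` are placeholders for your own model:

```python
def walk_forward(games, fit, evaluate, window: int = 200):
    """fit(train_games) -> model; evaluate(model, game) -> per-game error."""
    errors = []
    for start in range(window, len(games), window):
        model = fit(games[:start])              # only information available then
        for game in games[start:start + window]:
            errors.append(evaluate(model, game))
    return errors
```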
Mistake 4: Ignoring Calibration
Problem: Focusing only on accuracy and ignoring probability quality.
Solution: Always check and report calibration metrics.
Mistake 5: Underestimating Markets
Problem: Assuming your model beats the market.
Solution: Use closing lines as the primary benchmark.
Mistake 6: Ignoring Vig
Problem: Reporting gross win rates without accounting for betting costs.
Solution: Always calculate net ROI, including transaction costs.
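A quick net-ROI check at -110, where each win returns 100/110 ≈ 0.909 units per unit risked:

```python
def net_roi(win_rate: float, bets: int = 1000, payout: float = 100 / 110) -> float:
    """Profit per unit staked after the vig at -110."""
    wins = win_rate * bets
    losses = bets - wins
    return (wins * payout - losses) / bets

print(round(net_roi(0.53), 4))  # 53% gross -> about +1.2% net ROI
```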
Summary: Model Performance Expectations
Accuracy Benchmarks
| Model Type | Expected Accuracy | Notes |
|---|---|---|
| Random | 50% | Baseline |
| Home team always | 58% | Simple baseline |
| Record-based | 61% | Better team wins |
| Simple Elo | 65% | Dynamic ratings |
| Advanced model | 66-67% | Multiple factors |
| Closing line | 67-68% | Market consensus |
ATS Expectations
| Performance | Interpretation |
|---|---|
| 48-52% | Random, no edge |
| 52-54% | Possible small edge, needs validation |
| 54-56% | Meaningful edge (rare, validate carefully) |
| 56%+ | Exceptional (verify methodology) |
Required Sample Sizes
| Target Confidence | At 52% true rate | At 55% true rate |
|---|---|---|
| 90% | 1,500 games | 400 games |
| 95% | 2,200 games | 600 games |
| 99% | 3,500 games | 1,000 games |
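These figures are rough. As a back-of-envelope check, the one-sided normal approximation below gives the same order of magnitude; exact values depend on one- vs. two-sided testing and the statistical power assumed, which is why published tables (including this one) vary:

```python
# Games needed to distinguish a true win rate from the 50% coin-flip null.
from scipy.stats import norm

def games_needed(true_rate: float, confidence: float) -> int:
    z = norm.ppf(confidence)          # one-sided critical value
    # Solve z = (true_rate - 0.5) / sqrt(0.25 / n) for n.
    return int(round((z * 0.5 / (true_rate - 0.5)) ** 2))

print(games_needed(0.52, 0.95))  # ~1691 under this approximation
```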
Market Efficiency Summary
Why Markets Are Efficient
- Competition: Many sophisticated participants
- Speed: Information incorporated in minutes
- Arbitrage: Price differences eliminated quickly
- Data: Same public data available to all
Where Edges Might Exist
- Speed: Acting on news faster than markets adjust
- Private info: Injury, lineup information
- Micro-markets: Less liquid, less efficient
- Live betting: More noise, more opportunity
- Promos: Bonuses can create positive EV
What Doesn't Work
- Simple systems (home dogs, fading public)
- Historical patterns (markets adapt)
- Public models (sharps build similar models)
- Past ATS performance (no persistence)
Further Study Recommendations
- Statistical foundations: Study probability, regression, time series
- Elo systems: Implement and calibrate your own rating system
- Market analysis: Track line movements and closing line value
- Simulation: Build Monte Carlo game and season simulators
- Machine learning: Explore advanced prediction techniques
- Evaluation: Master proper scoring rules and calibration analysis