Chapter 21: Key Takeaways - In-Game Win Probability
Core Concepts Summary
1. Win Probability Definition
- Definition: The probability that a team wins given the current game state
- Key Inputs: Score differential, time remaining, possession, team strength, home court
- Range: 0.0 (certain loss) to 1.0 (certain win)
- Starting Point: Typically ~0.50-0.55 for home team at tip-off
2. Win Probability Added (WPA)
- Definition: Change in win probability caused by a single play
- Formula: WPA = WP_after - WP_before
- Range: Typically -0.15 to +0.15 per play (higher in clutch)
- Team WPA: Play-level WPA sums to final WP minus initial WP (≈ +0.5 for a winner starting near 50%)
3. Leverage Index (LI)
- Definition: Importance of current game situation relative to average
- Formula: LI = Expected WP swing / Average expected WP swing
- Baseline: LI = 1.0 is average importance
- High Leverage: LI > 3.0 indicates critical moments
4. Model Calibration
- Definition: When predicted probabilities match actual outcomes
- Well-Calibrated: Situations assigned ~70% win probability are actually won ~70% of the time
- Brier Score: Measures combined calibration and discrimination (0-1, lower is better)
- ECE: Expected Calibration Error measures average calibration gap
Essential Formulas
Win Probability (Simplified Normal CDF Model)
WP = Phi(effective_lead / sqrt(variance * time_remaining))
Where:
- effective_lead = score_diff + possession_value + home_advantage
- possession_value = ~1.0-1.1 points
- home_advantage = ~3.0-3.5 points
- variance = ~0.068 points² per second (variance rate of the score differential)
- Phi = standard normal cumulative distribution function
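As a sketch, the model above translates directly into a few lines of Python. The parameter defaults below are the rough values quoted in this chapter, not fitted constants, and `win_probability` is a name chosen for this example.

```python
from scipy.stats import norm

def win_probability(score_diff, seconds_remaining, has_possession=False,
                    is_home=False, possession_value=1.0, home_advantage=3.0,
                    variance_per_second=0.068):
    """Simplified normal-CDF win probability from the leading team's view."""
    if seconds_remaining <= 0:
        # game over (a tie would mean overtime, treated here as a coin flip)
        return 1.0 if score_diff > 0 else (0.5 if score_diff == 0 else 0.0)
    effective_lead = score_diff
    if has_possession:
        effective_lead += possession_value
    if is_home:
        effective_lead += home_advantage
    sd = (variance_per_second * seconds_remaining) ** 0.5
    return float(norm.cdf(effective_lead / sd))

# Up 5 with the ball and 6 minutes (360 s) left, neutral court
print(round(win_probability(5, 360, has_possession=True), 3))
```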
Seconds Remaining Calculation
For regulation (quarters 1-4):
seconds_remaining = (4 - period) * 720 + clock_seconds
Where:
- period = current quarter (1-4)
- clock_seconds = minutes * 60 + seconds on game clock
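A direct translation of the formula, with the caveat that it only covers regulation; overtime periods (300 seconds each in the NBA) would need their own branch.

```python
def seconds_remaining(period, clock_minutes, clock_seconds):
    """Seconds left in regulation for periods 1-4 (720 s per quarter)."""
    return (4 - period) * 720 + clock_minutes * 60 + clock_seconds

assert seconds_remaining(1, 12, 0) == 2880  # opening tip
assert seconds_remaining(4, 0, 30) == 30    # 30 s left in Q4
```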
Brier Score
Brier Score = (1/n) * sum((predicted_i - actual_i)^2)
Where:
- predicted_i = predicted win probability for situation i
- actual_i = actual outcome (1 if win, 0 if loss)
- n = number of predictions
Interpretation:
- 0.00 = perfect predictions
- 0.25 = always predicting 0.5 (baseline)
- Lower is better
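A minimal implementation; it computes the same value as `sklearn.metrics.brier_score_loss` (which takes `(actual, predicted)` argument order).

```python
import numpy as np

def brier_score(predicted, actual):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.mean((predicted - actual) ** 2))

# Always predicting 0.5 on a balanced set lands exactly on the 0.25 baseline
print(brier_score([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0]))  # 0.25
```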
Win Probability Added
WPA = WP_after - WP_before
For a game winner:
Total Team WPA = 1.0 - initial_WP = ~0.5 (starting from neutral)
For individual play:
WPA accounts for context (time, score, leverage)
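Because WPA is just a first difference, the telescoping property above falls out immediately. A toy trajectory with made-up WP values:

```python
import numpy as np

def wpa_per_play(wp_series):
    """Per-play WPA from a chronological WP series (pre-game value first)."""
    return np.diff(np.asarray(wp_series, dtype=float))

wp = [0.53, 0.58, 0.49, 0.71, 1.00]  # illustrative trajectory; home team wins
print(wpa_per_play(wp))              # [ 0.05 -0.09  0.22  0.29]
print(wpa_per_play(wp).sum())        # 0.47 == 1.00 - 0.53 (telescopes)
```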
Leverage Index
LI = Expected WP swing in situation / Average expected WP swing
Rules of thumb:
- LI < 0.5: Low leverage (early game, blowout)
- LI = 1.0: Average leverage
- LI = 2-3: High leverage (close game, late)
- LI > 5: Extreme leverage (clutch moments)
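One way to make LI concrete is to estimate the expected WP swing by enumerating plausible next-possession outcomes under the `win_probability()` sketch above. Everything here is an illustrative assumption: the outcome distribution, the 15-second possession, and the 0.02 league-average swing placeholder, which you would estimate empirically from many game states.

```python
POSSESSION_LENGTH = 15  # assumed average possession length, in seconds

def expected_wp_swing(score_diff, secs):
    """Probability-weighted |delta WP| over toy next-possession outcomes."""
    outcomes = [(0, 0.47), (2, 0.42), (3, 0.11)]  # (points, probability) - toy values
    wp_now = win_probability(score_diff, secs)
    t_next = max(secs - POSSESSION_LENGTH, 0)
    return sum(p * abs(win_probability(score_diff + pts, t_next) - wp_now)
               for pts, p in outcomes)

def leverage_index(score_diff, secs, average_swing=0.02):
    return expected_wp_swing(score_diff, secs) / average_swing

print(round(leverage_index(0, 30), 1))    # tied, 30 s left: well above 1
print(round(leverage_index(20, 600), 2))  # up 20 in Q4: near zero
```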
Logistic Regression Model
WP = 1 / (1 + exp(-z))
Where:
z = beta_0 + beta_1*score_diff + beta_2*f(time) + beta_3*possession + ...
Common time transformations:
- log(seconds_remaining + 1)
- sqrt(seconds_remaining)
- seconds_remaining / 2880 (normalized)
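A minimal scikit-learn sketch of this specification. The play-by-play column names (`score_diff`, `seconds_remaining`, `possession`, `home_win`) are placeholders for whatever your data pipeline produces.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

def build_features(pbp: pd.DataFrame) -> pd.DataFrame:
    """Assemble the feature set described above from play-by-play rows."""
    X = pd.DataFrame(index=pbp.index)
    X["score_diff"] = pbp["score_diff"]
    X["log_time"] = np.log(pbp["seconds_remaining"] + 1)
    X["norm_time"] = pbp["seconds_remaining"] / 2880
    X["possession"] = pbp["possession"]  # -1 away, 0 dead ball, 1 home
    # interaction term: a given lead matters more as the clock runs down
    X["diff_per_sqrt_time"] = pbp["score_diff"] / (np.sqrt(pbp["seconds_remaining"]) + 1)
    return X

model = LogisticRegression(C=1.0, max_iter=1000)  # L2-regularized by default
# model.fit(build_features(train_pbp), train_pbp["home_win"])
# wp_home = model.predict_proba(build_features(live_pbp))[:, 1]
```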
Expected Calibration Error
ECE = sum over bins b of (n_b / N) * |accuracy_b - confidence_b|
Where:
- n_b = samples in bin b
- N = total samples
- accuracy_b = actual win rate in bin b
- confidence_b = average predicted probability in bin b
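The same formula in code, with the bin count as a parameter:

```python
import numpy as np

def expected_calibration_error(predicted, actual, n_bins=10):
    """Bin-weighted average gap between confidence and accuracy."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for i in range(n_bins):
        mask = (predicted >= edges[i]) & (predicted < edges[i + 1])
        if i == n_bins - 1:  # fold predictions of exactly 1.0 into the top bin
            mask |= predicted == 1.0
        if mask.any():  # mask.mean() is n_b / N
            ece += mask.mean() * abs(actual[mask].mean() - predicted[mask].mean())
    return ece

# perfectly calibrated 70% forecasts: confidence == accuracy, so ECE == 0
print(expected_calibration_error([0.7] * 10, [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]))
```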
Implementation Checklist
Building a Win Probability Model
- [ ] Data Collection
  - [ ] Gather play-by-play data (1,000+ games recommended)
  - [ ] Include: game_id, period, clock, scores, events
  - [ ] Calculate seconds remaining
  - [ ] Determine possession for each event
  - [ ] Create binary outcome variable (home_win)
- [ ] Feature Engineering
  - [ ] Score differential (primary feature)
  - [ ] Time remaining transformations (log, sqrt)
  - [ ] Possession indicator (-1, 0, 1)
  - [ ] Score-time interaction terms
  - [ ] Quarter/period indicators
  - [ ] Clutch situation flag
  - [ ] Optional: team strength adjustment
- [ ] Model Training (see the training sketch after this checklist)
  - [ ] Choose algorithm (logistic regression recommended for interpretability)
  - [ ] Use time-series cross-validation
  - [ ] Apply regularization (L2) to prevent overfitting
  - [ ] Calculate Brier score on held-out data
- [ ] Calibration
  - [ ] Create calibration curve (predicted vs actual)
  - [ ] Calculate Expected Calibration Error
  - [ ] Apply Platt scaling if needed
  - [ ] Verify calibration across all probability bins
- [ ] Validation
  - [ ] Temporal validation (train on past, test on future)
  - [ ] Stratified evaluation (by quarter, score differential)
  - [ ] Compare to baseline models
  - [ ] Check edge cases (overtime, large leads)
- [ ] Deployment
  - [ ] Build prediction API
  - [ ] Handle real-time updates
  - [ ] Monitor calibration drift
  - [ ] Document model assumptions
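A compact sketch of the training and validation steps above, assuming a chronologically ordered feature matrix `X` and label vector `y` as NumPy arrays. Note that `TimeSeriesSplit` splits by row; in production you would split on game boundaries so plays from one game never straddle train and test.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import TimeSeriesSplit

def temporal_cv_brier(X, y, n_splits=5):
    """Train on past folds, score each future fold with the Brier score."""
    scores = []
    for train_idx, test_idx in TimeSeriesSplit(n_splits=n_splits).split(X):
        model = LogisticRegression(max_iter=1000)  # L2-regularized by default
        model.fit(X[train_idx], y[train_idx])
        wp = model.predict_proba(X[test_idx])[:, 1]
        scores.append(brier_score_loss(y[test_idx], wp))
    return scores  # drift across folds hints at regime change or leakage
```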
Common Pitfalls to Avoid
1. Data Leakage
- Problem: Using future information to predict the current state
- Solution: Use time-series cross-validation; never use the game outcome as a feature
2. Ignoring Calibration
- Problem: The model has good accuracy but poor probability estimates
- Solution: Always evaluate calibration; apply post-hoc calibration techniques
3. Overcomplicating Features
- Problem: Too many features lead to overfitting
- Solution: Start simple (score, time, possession); add features incrementally
4. Misinterpreting WPA
- Problem: Using WPA to predict future performance
- Solution: WPA describes past impact; use it for narrative, not projection
5. Ignoring Context in Leverage
- Problem: Treating all high-WPA plays as equally skillful
- Solution: Normalize by leverage index; consider shot difficulty
6. Small Sample Overconfidence
- Problem: Drawing conclusions from single games or a few plays
- Solution: Report confidence intervals; aggregate over many observations
Quick Reference Tables
Benchmark Brier Scores
| Model | Brier Score | Quality |
|---|---|---|
| Perfect | 0.000 | Ideal (impossible) |
| Strong Model | 0.10-0.15 | Excellent |
| Good Model | 0.15-0.20 | Good |
| Weak Model | 0.20-0.25 | Marginal |
| Baseline (always 0.5) | 0.250 | Poor |
| Random (uniform guesses) | 0.333 | Worthless |
Leverage Index Reference
| Situation | Approximate LI |
|---|---|
| Up 20, Q4 | 0.1-0.3 |
| Start of game | 0.5-0.8 |
| Up 10, start of Q2 | 0.6-0.9 |
| Tied, end of Q3 | 1.5-2.0 |
| Down 3, 2 min left | 2.5-3.5 |
| Tied, 30 sec left | 4.0-6.0 |
| Tied, final possession | 5.0-8.0 |
Win Probability Guidelines
| Game State | Approximate Home WP |
|---|---|
| Tip-off | 52-55% |
| Up 5 at half | 72-75% |
| Up 10 at half | 85-88% |
| Up 15 at half | 92-95% |
| Down 5, 5 min left | 25-30% |
| Down 10, 5 min left | 8-12% |
| Up 3, 30 sec left, ball | 90-95% |
| Tied, 10 sec left, ball | 55-60% |
Feature Importance (Typical)
| Feature | Relative Importance |
|---|---|
| Score differential | 1.00 (baseline) |
| Time remaining | 0.60-0.70 |
| Score x Time interaction | 0.30-0.40 |
| Possession | 0.15-0.25 |
| Home court | 0.10-0.15 |
| Team strength | 0.10-0.20 |
Decision Frameworks
Framework 1: Interpreting Win Probability
1. Check current WP estimate
2. Compare to pre-game expectations
3. Identify key swings (when did WP change most?)
4. Calculate improbability if underdog wins
5. Context: Is this a meaningful probability shift?
Framework 2: Evaluating Player WPA
1. Calculate total WPA for game/season
2. Normalize by possessions played
3. Separate by leverage tier (high vs low)
4. Compare to expected WPA given opportunities
5. Caveat: WPA is descriptive, not predictive
Framework 3: Model Calibration Check
1. Bin predictions into 10 groups (0-10%, 10-20%, etc.)
2. Calculate actual win rate in each bin
3. Plot predicted vs actual (should be diagonal)
4. Identify over/under-confident regions
5. Apply recalibration if needed
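Framework 3 maps onto scikit-learn's `calibration_curve` almost one-for-one; `y` and `wp` below stand for held-out outcomes and predicted probabilities.

```python
import matplotlib.pyplot as plt
from sklearn.calibration import calibration_curve

def plot_calibration(y, wp, n_bins=10):
    """Predicted vs. actual win rate; a calibrated model hugs the diagonal."""
    frac_wins, mean_pred = calibration_curve(y, wp, n_bins=n_bins)
    plt.plot(mean_pred, frac_wins, "o-", label="model")
    plt.plot([0, 1], [0, 1], "--", label="perfect calibration")
    plt.xlabel("Predicted win probability")
    plt.ylabel("Actual win rate")
    plt.legend()
    plt.show()
```

If the curve bows consistently off the diagonal, `sklearn.calibration.CalibratedClassifierCV(method="sigmoid")` applies Platt scaling, the post-hoc fix mentioned in the checklist.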
Framework 4: Building Real-Time WP System
1. Train model on historical data
2. Set up data pipeline for live game feed
3. Calculate WP after each play
4. Store WP trajectory for visualization
5. Calculate WPA for significant events
6. Monitor for calibration drift
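A skeleton of the live loop, with heavy caveats: `feed`, the event fields, and `build_features_from_event` are all hypothetical stand-ins for your actual data pipeline.

```python
def run_live_wp(model, feed, pregame_wp=0.55):
    """Update WP after each event; record the trajectory and per-play WPA."""
    trajectory = []
    prev_wp = pregame_wp
    for event in feed:  # e.g., a generator yielding parsed live events
        x = build_features_from_event(event)  # hypothetical helper
        wp = float(model.predict_proba(x)[:, 1][0])
        trajectory.append({"event_id": event["event_id"],
                           "wp": wp, "wpa": wp - prev_wp})
        prev_wp = wp
    return trajectory
```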
Key Insights Summary
- Score differential is king: Accounts for 50-60% of model predictive power
- Time transforms matter: Log and sqrt transformations capture non-linear decay
- Possession is worth ~1 point: Important to include, especially late in games
- Calibration trumps accuracy: A well-calibrated 70% is better than a miscalibrated 75%
- Leverage varies 100x: From 0.1 in blowouts to 10+ in crunch time
- WPA is retrospective: Great for storytelling, poor for prediction
- Single games have huge variance: Even 90% WP situations lose 10% of the time
- Simple models often win: Logistic regression competes with complex ML
- Temporal validation is essential: Prevents data leakage and overfitting
- Context always matters: The same WP swing means different things in different situations
Application Scenarios
Scenario 1: Broadcasting Win Probability
- Build model with ~150ms inference time
- Display WP after each possession
- Highlight plays with WPA > 0.10
- Show "win probability graph" during breaks
- Calculate "comeback improbability" for late leads
Scenario 2: Coaching Decision Support
- Calculate WP for each strategic option
- Compare: foul vs defend, 2 vs 3, timeout vs play
- Present as "this choice gives X% better WP"
- Track decision quality over time
- Adjust for personnel and matchups
Scenario 3: Player Evaluation
- Calculate season WPA for each player
- Normalize by minutes and possessions
- Separate clutch WPA (LI > 2) from regular
- Compare to expected WPA given shot/play quality
- Use alongside other metrics (not in isolation)
Scenario 4: Fan Engagement
- Show live WP during games
- Create "nail-biter index" (time spent near 50%)
- Rank most improbable wins
- Identify "plays of the game" by WPA
- Compare current game to historical context
Tools and Resources
Recommended Software
- Python: scikit-learn, XGBoost, statsmodels
- R: hoopR, tidyverse, mgcv
- Visualization: Matplotlib, Plotly, D3.js
Data Sources
- NBA API (official play-by-play)
- Basketball-Reference (historical data)
- Second Spectrum (proprietary tracking)
- ESPN API (real-time feeds)
Key Metrics to Track
- Brier Score (overall model quality)
- ECE (calibration quality)
- WPA distribution (player/team evaluation)
- Leverage distribution (game excitement)
- Calibration drift (production monitoring)
Reference Implementations
- ESPN Win Probability
- FiveThirtyEight NBA model
- Inpredictable (Mike Beuoy)
- Cleaning the Glass
Summary Equations Card
Win Probability (Normal CDF):
WP = Phi((score_diff + poss_value + home_adv) / sqrt(var * time))
Brier Score:
BS = (1/n) * sum((pred - actual)^2)
Win Probability Added:
WPA = WP_after - WP_before
Leverage Index:
LI = E[WP_swing_current] / E[WP_swing_average]
Seconds Remaining:
sec = (4 - quarter) * 720 + clock_seconds
Effective Lead:
eff_lead = score_diff + (~1.0 if possession) + (~3.5 if home)
Expected Points per Possession:
EPP = ~1.0 to 1.1 points
Calibration Error (per bin):
CE_b = |actual_rate_b - predicted_avg_b|
ECE:
ECE = sum((n_b/N) * CE_b)