Chapter 21: Further Reading - Win Probability Models
Academic Papers
Win Probability Methodology
- "A Win Probability Model for Football" - Burke (2010) - Foundation of modern WP models - Feature selection methodology - Calibration approaches
- "Expected Points and Expected Points Added" - Burke - EP foundation for WP models - State-value framework - Integration with WP
- "Calibration of Probabilistic Predictions" - Gneiting & Raftery - Calibration theory - Proper scoring rules - Evaluation metrics
Machine Learning Applications
- "Gradient Boosting for Win Probability" - MIT Sloan - Advanced model architectures - Feature engineering - Comparison to baselines
Books
Sports Analytics
- "The Book: Playing the Percentages in Baseball" - Tango, Lichtman, Dolphin - Win probability in baseball - Transferable concepts - Decision analysis framework
- "Mathletics" - Wayne Winston - Multi-sport probability models - Basic methodology - Practical applications
Statistical Methods
- "Applied Logistic Regression" - Hosmer & Lemeshow - Logistic regression theory - Calibration assessment - Model diagnostics
- "Probabilistic Machine Learning" - Murphy - Calibration methods - Bayesian approaches - Neural network outputs
Online Resources
Football Analytics Sites
- Pro Football Reference - Win probability tools - Historical data - https://www.pro-football-reference.com/
- ESPN Win Probability - Live WP tracking - Methodology explanations - https://www.espn.com/
- Football Outsiders - WP-based analysis - Decision analysis - https://www.footballoutsiders.com/
- The Athletic - WP visualizations - Fourth-down analysis - https://theathletic.com/
Academic Resources
- NFL Big Data Bowl - WP model submissions - Code examples - https://www.kaggle.com/c/nfl-big-data-bowl
Data Sources
Play-by-Play Data
- nflverse (R) - Comprehensive PBP data - WP calculations included - https://nflverse.com/
- nfl_data_py (Python) - Python NFL data access - Pre-calculated WP
- College Football Data API - College PBP data - https://collegefootballdata.com/
Historical Archives
- Sports Reference - Historical game data - Play-level data
Tools and Libraries
Python
- scikit-learn - Logistic regression - Calibration utilities (sklearn.calibration)
- XGBoost - Gradient boosting - Probability outputs that can be calibrated post hoc - https://xgboost.readthedocs.io/
- PyTorch - Neural network WP models - https://pytorch.org/
- matplotlib/seaborn - Calibration curves - WP charts
R
- nflfastR - NFL WP calculations - Pre-built models
- ggplot2 - WP visualizations
Industry Resources
Team Analytics
- NFL Next Gen Stats - Win probability features - Real-time tracking
- ESPN Analytics - Fourth-down decisions - Live WP
Media Applications
- FiveThirtyEight - Game predictions - WP methodology articles - https://fivethirtyeight.com/
Methodological Deep Dives
Calibration Techniques
- Platt Scaling - Original paper on probability calibration - Post-hoc calibration method
- Isotonic Regression - Non-parametric calibration - scikit-learn implementation
- Temperature Scaling - Neural network calibration - Simple and effective
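To make the three techniques above concrete, here is a minimal sketch that applies each one to the same deliberately overconfident scores. Everything in it is an assumption for illustration: the data are synthetic, the factor of 2.5 that inflates the logits is arbitrary, Platt scaling is shown in its simplified form (a plain logistic regression on the raw score, without the target smoothing of the original paper), and the temperature is found by a coarse grid search rather than gradient descent.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression
from sklearn.linear_model import LogisticRegression

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(y_true, p):
    p = np.clip(p, 1e-12, 1.0 - 1e-12)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

def brier(y_true, p):
    return float(np.mean((p - y_true) ** 2))

# Synthetic outcomes drawn from known logits (not real game data).
rng = np.random.default_rng(42)
n = 10000
true_logit = rng.normal(0.0, 1.5, size=n)
y = (rng.uniform(size=n) < sigmoid(true_logit)).astype(int)
raw_logit = 2.5 * true_logit  # assumed overconfident model: logits too extreme

# Fit calibrators on the first half, evaluate on the second half.
fit_idx, test_idx = slice(0, n // 2), slice(n // 2, n)

# Platt scaling (simplified): logistic regression on the raw score.
platt = LogisticRegression(C=1e6, max_iter=1000)
platt.fit(raw_logit[fit_idx].reshape(-1, 1), y[fit_idx])
p_platt = platt.predict_proba(raw_logit[test_idx].reshape(-1, 1))[:, 1]

# Isotonic regression: monotone, non-parametric map from raw probability.
iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
iso.fit(sigmoid(raw_logit[fit_idx]), y[fit_idx])
p_iso = iso.predict(sigmoid(raw_logit[test_idx]))

# Temperature scaling: one scalar T dividing the logits, chosen by log loss.
temps = np.linspace(0.5, 5.0, 91)
losses = [log_loss(y[fit_idx], sigmoid(raw_logit[fit_idx] / t)) for t in temps]
best_t = float(temps[int(np.argmin(losses))])
p_temp = sigmoid(raw_logit[test_idx] / best_t)

results = {
    "raw": brier(y[test_idx], sigmoid(raw_logit[test_idx])),
    "platt": brier(y[test_idx], p_platt),
    "isotonic": brier(y[test_idx], p_iso),
    "temperature": brier(y[test_idx], p_temp),
}
for name, score in results.items():
    print(f"{name:12s} Brier = {score:.4f}")
print(f"selected temperature: {best_t:.2f}")
```

Scores are compared with the Brier score rather than log loss because isotonic regression can output exactly 0 or 1, which log loss penalizes very heavily on a held-out set.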
Feature Engineering
- Interaction Terms - Score × Time interactions - Field position effects
- Time Decay Features - Modeling urgency - Late-game dynamics
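The feature ideas above could be sketched as a single function that turns a game state into model inputs. The function name `build_features`, the feature names, and the decay constant of 4.0 are hypothetical choices for illustration, not the definitions used by any particular published model.

```python
import math

def build_features(score_diff, seconds_left, yardline_100, game_seconds=3600):
    """Return a dict of illustrative WP features for one play.

    score_diff   : offense score minus defense score
    seconds_left : seconds remaining in regulation
    yardline_100 : yards from the opponent's end zone (1-99)
    """
    frac_left = seconds_left / game_seconds
    elapsed = 1.0 - frac_left
    return {
        "score_diff": score_diff,
        "frac_left": frac_left,
        # Interaction term: the same lead counts for more as time runs out.
        "score_x_elapsed": score_diff * elapsed,
        # Time decay: lead scaled by remaining time (+1 avoids dividing by zero).
        "score_per_sqrt_sec": score_diff / math.sqrt(seconds_left + 1.0),
        # Urgency: rises toward 1 as the clock reaches zero (assumed constant 4.0).
        "urgency": math.exp(-4.0 * frac_left),
        # Field position effect: closeness to the opponent's end zone in [0, 1].
        "field_pos": 1.0 - yardline_100 / 100.0,
    }

early = build_features(score_diff=3, seconds_left=3000, yardline_100=75)
late = build_features(score_diff=3, seconds_left=60, yardline_100=75)
print(early)
print(late)
```

The same three-point lead produces a much larger `score_per_sqrt_sec` and `urgency` late in the game, which is the behavior a time-decay feature is meant to capture.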
Suggested Learning Path
Week 1-2: Foundations
- Study logistic regression for probability
- Implement basic WP with score/time
- Understand calibration concepts
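As a starting point for these goals, a basic score/time model might look like the sketch below. It trains on synthetic game states rather than real play-by-play data; the helper `simulate_plays`, the data-generating process, and the `leverage` feature are assumptions made so the example is self-contained, and the model is well specified only because the simulated truth uses the same feature.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

GAME_SECONDS = 3600.0

def make_features(score_diff, seconds_left):
    """Score, time, and a score-by-time term (all illustrative)."""
    score_diff = np.asarray(score_diff, dtype=float)
    frac_left = np.asarray(seconds_left, dtype=float) / GAME_SECONDS
    leverage = score_diff / np.sqrt(frac_left + 0.05)
    return np.column_stack([score_diff, frac_left, leverage])

def simulate_plays(n, seed=0):
    """Generate synthetic game states and outcomes (NOT real NFL data)."""
    rng = np.random.default_rng(seed)
    score_diff = rng.integers(-21, 22, size=n).astype(float)
    seconds_left = rng.uniform(0.0, GAME_SECONDS, size=n)
    X = make_features(score_diff, seconds_left)
    # Assumed truth: a lead matters more as time runs out.
    p_win = 1.0 / (1.0 + np.exp(-0.12 * X[:, 2]))
    won = (rng.uniform(size=n) < p_win).astype(int)
    return X, won

X_train, y_train = simulate_plays(20000, seed=0)
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

def win_probability(score_diff, seconds_left):
    """Predicted probability that the team with this score margin wins."""
    features = make_features([score_diff], [seconds_left])
    return float(model.predict_proba(features)[0, 1])

wp_tied = win_probability(0, 1800)
wp_up7_early = win_probability(7, 3400)
wp_up7_late = win_probability(7, 120)
print(f"tied at halftime:   {wp_tied:.3f}")
print(f"up 7, 3400 s left:  {wp_up7_early:.3f}")
print(f"up 7, 120 s left:   {wp_up7_late:.3f}")
```

With real data, the synthetic generator would be replaced by play-by-play rows and the game's final result as the label.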
Week 3-4: Feature Engineering
- Add game state features
- Create interaction terms
- Handle edge cases
Week 5-6: Advanced Models
- Implement gradient boosting
- Experiment with neural networks
- Compare model performance
Week 7-8: Calibration
- Analyze calibration curves
- Apply isotonic regression
- Validate on held-out data
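One way to analyze a calibration curve on held-out predictions is to bin them and compare the mean predicted WP with the observed win rate in each bin. The sketch below does this with NumPy and summarizes the gap as an expected calibration error; the function names, the ten equal-width bins, and the synthetic "overconfident" predictions are illustrative assumptions.

```python
import numpy as np

def reliability_table(y_true, p_pred, n_bins=10):
    """Return (mean predicted, observed rate, count) for each non-empty bin."""
    y_true = np.asarray(y_true, dtype=float)
    p_pred = np.asarray(p_pred, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Interior edges map each prediction to a bin id in 0..n_bins-1.
    bin_ids = np.digitize(p_pred, edges[1:-1])
    rows = []
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            rows.append((float(p_pred[mask].mean()),
                         float(y_true[mask].mean()),
                         int(mask.sum())))
    return rows

def expected_calibration_error(rows):
    """Count-weighted mean absolute gap between predicted and observed."""
    total = sum(count for _, _, count in rows)
    return sum(count * abs(pred - obs) for pred, obs, count in rows) / total

# Synthetic held-out set: outcomes drawn from known probabilities.
rng = np.random.default_rng(7)
p_true = rng.uniform(0.02, 0.98, size=20000)
outcomes = (rng.uniform(size=20000) < p_true).astype(int)

calibrated = p_true
overconfident = np.clip(0.5 + 1.6 * (p_true - 0.5), 0.01, 0.99)

ece_good = expected_calibration_error(reliability_table(outcomes, calibrated))
ece_bad = expected_calibration_error(reliability_table(outcomes, overconfident))
print(f"ECE, calibrated predictions:    {ece_good:.4f}")
print(f"ECE, overconfident predictions: {ece_bad:.4f}")
for pred, obs, count in reliability_table(outcomes, overconfident):
    print(f"predicted {pred:.2f}  observed {obs:.2f}  n={count}")
```

Plotting the first two columns of the table against each other gives the usual calibration curve; points far from the diagonal mark the probability ranges where a calibrator fitted on separate data is most needed.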
Week 9+: Applications
- Build WPA calculator
- Implement decision analysis
- Create visualizations
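A WPA calculator and a simple decision comparison can be sketched in a few lines once a WP model exists. The `Play` record, the team labels, and all probabilities below are hypothetical values for illustration; WP is expressed from the home team's perspective, so the sign is flipped when the away team has the ball.

```python
from dataclasses import dataclass

@dataclass
class Play:
    description: str
    offense: str      # team with the ball on this play
    wp_before: float  # home-team win probability before the snap
    wp_after: float   # home-team win probability after the play

def wpa_for_offense(play, home_team):
    """Win probability added, credited to the team on offense."""
    delta_home = play.wp_after - play.wp_before
    return delta_home if play.offense == home_team else -delta_home

def total_wpa_by_team(plays, home_team):
    totals = {}
    for play in plays:
        wpa = wpa_for_offense(play, home_team)
        totals[play.offense] = totals.get(play.offense, 0.0) + wpa
    return totals

def expected_wp_go(p_convert, wp_success, wp_fail):
    """Expected WP of going for it: weight both outcomes by conversion odds."""
    return p_convert * wp_success + (1.0 - p_convert) * wp_fail

# Hypothetical drive log (values invented for the example).
plays = [
    Play("HOME 12-yard pass", "HOME", 0.55, 0.60),
    Play("HOME interception thrown", "HOME", 0.60, 0.42),
    Play("AWAY 30-yard run", "AWAY", 0.42, 0.33),
]
totals = total_wpa_by_team(plays, home_team="HOME")
print(totals)

# Hypothetical fourth-down choice: go for it versus punt.
wp_go = expected_wp_go(p_convert=0.5, wp_success=0.70, wp_fail=0.45)
wp_punt = 0.55
print("go for it" if wp_go > wp_punt else "punt", round(wp_go, 3))
```

In practice the `wp_before` and `wp_after` values would come from the fitted WP model evaluated on consecutive game states, and the conversion probability from a separate model or historical rates.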