Chapter 22: Player Performance Prediction - Key Takeaways
Executive Summary
Player performance prediction is at once one of the most valuable and one of the most challenging applications of basketball analytics. This chapter provided frameworks for projecting future player performance using historical data, aging curves, similarity scores, and uncertainty quantification. Effective projections require combining statistical rigor with basketball knowledge while maintaining appropriate humility about the limits of predictability.
Core Concepts
1. The Four Components of Player Projection
Every player projection system must address four fundamental sub-problems:
| Component | Definition | Key Challenge |
|---|---|---|
| Skill Estimation | Determining true underlying abilities from observed performance | Separating signal from noise |
| Aging Adjustment | Projecting how abilities change over time | Accounting for individual variation |
| Context Adjustment | Accounting for team, role, and environment changes | Predicting future context |
| Uncertainty Quantification | Communicating confidence in projections | Calibrating probability estimates |
2. Regression to the Mean
Key Principle: Extreme observations contain more noise than typical observations and should be regressed toward the population mean.
Formula:
Regressed Estimate = (Observed × Sample Size + Prior Mean × Prior Weight) / (Sample Size + Prior Weight)
Application Rules:
- Stronger regression for smaller samples
- Less regression for more stable statistics (e.g., free throw percentage)
- More regression for volatile statistics (e.g., three-point percentage)
- Less regression for players with longer track records
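A minimal sketch of this shrinkage in Python. The player, attempt counts, and prior weight below are hypothetical; in practice the prior weight k is estimated separately for each statistic:

```python
def regress_to_mean(observed, n, prior_mean, k):
    """Shrink an observed rate toward a prior mean, where k acts as
    the prior's equivalent sample size."""
    return (observed * n + prior_mean * k) / (n + k)

# Hypothetical example: a player hits 45% of threes on 80 attempts.
# With a league-average prior of 36% weighted like ~250 attempts
# (three-point percentage is volatile, so the prior weight is large),
# the estimate regresses most of the way back toward the mean.
estimate = regress_to_mean(observed=0.45, n=80, prior_mean=0.36, k=250)
print(f"Regressed 3P% estimate: {estimate:.3f}")  # ~0.382
```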
3. Aging Curves
General Pattern:
- Most skills peak in the mid-to-late 20s
- Athletic attributes decline earlier than skill-based attributes
- Different positions and playing styles age differently
Key Statistics by Aging Pattern:
| Early Peak (23-25) | Mid Peak (25-28) | Late Peak (28-30) |
|---|---|---|
| Speed/Quickness | Scoring | Court vision |
| Vertical leap | Rebounding | Free throw % |
| Perimeter defense | Overall defense | Three-point % |
Construction Methods:
- Delta method: Track same players across seasons
- Cross-sectional: Compare players of different ages (biased)
- Mixed methods: Combine approaches with statistical corrections
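A sketch of the delta method, assuming a pandas DataFrame with hypothetical columns `player_id`, `age`, `metric` (e.g., BPM), and `minutes`; a production system would also handle gap seasons and survivorship corrections:

```python
import pandas as pd

def delta_method_curve(df: pd.DataFrame) -> pd.Series:
    """Average year-over-year change in `metric` at each age, weighted
    by combined minutes. Assumes consecutive player-seasons."""
    df = df.sort_values(["player_id", "age"])
    nxt = df.groupby("player_id").shift(-1)  # same player, next season
    paired = df.assign(
        delta=nxt["metric"] - df["metric"],
        w=df["minutes"] + nxt["minutes"],
    ).dropna(subset=["delta"])
    # Minutes-weighted mean change at each age transition
    return paired.groupby("age").apply(
        lambda g: (g["delta"] * g["w"]).sum() / g["w"].sum()
    )  # e.g., curve[27] = average change from the age-27 to age-28 season
```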
4. Similarity Scores (Comparables)
Purpose: Use historical players' career paths to inform projections for current players.
Best Practices:
- Select 5-15 comparables for statistical stability
- Weight by similarity score
- Account for era differences
- Include only comparable contexts (age, role, situation)
Common Pitfalls:
- Over-relying on small comparable sets
- Ignoring context differences
- Treating unprecedented players as having valid comparables
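A compact illustration of similarity scoring using standardized Euclidean distance (the feature names and pool structure here are hypothetical):

```python
import numpy as np

def top_comparables(target: dict, pool: list, features: list, k: int = 10):
    """Rank historical players in `pool` (name, stats-dict pairs) by
    Euclidean similarity to `target` over standardized features."""
    X = np.array([[stats[f] for f in features] for _, stats in pool], float)
    mu, sd = X.mean(axis=0), X.std(axis=0)
    z = (X - mu) / sd                                 # standardize the pool
    t = (np.array([target[f] for f in features]) - mu) / sd
    dist = np.sqrt(((z - t) ** 2).sum(axis=1))        # Euclidean distance
    sim = 1.0 / (1.0 + dist)                          # higher = more similar
    names = [name for name, _ in pool]
    return sorted(zip(names, sim), key=lambda x: -x[1])[:k]
```

The returned similarities can then weight each comparable's subsequent career path when building the projection, per the best practices above.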
5. Uncertainty Quantification
Types of Uncertainty:
1. Model uncertainty: Error in the projection model itself
2. Parameter uncertainty: Uncertainty in estimated model parameters
3. Individual uncertainty: Natural variation in player outcomes
Confidence vs. Prediction Intervals:
- Confidence intervals: Uncertainty about the model/parameters
- Prediction intervals: Uncertainty about specific future observations (always wider)
Rule of Thumb: Year-over-year individual player projections typically have prediction intervals of ±2-4 Win Shares for established players.
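A toy numerical illustration of the confidence/prediction distinction; all of the uncertainty components below are assumed numbers, not estimates from real data:

```python
import math

projection = 6.0       # projected Win Shares
se_model = 0.8         # model/parameter uncertainty (WS)
sigma_player = 1.2     # individual season-to-season variation (WS)
z = 1.96               # 95% two-sided

ci_half = z * se_model                                   # confidence interval
pi_half = z * math.sqrt(se_model**2 + sigma_player**2)   # variances add
print(f"95% CI: {projection - ci_half:.1f} to {projection + ci_half:.1f} WS")
print(f"95% PI: {projection - pi_half:.1f} to {projection + pi_half:.1f} WS")
# CI ~ 4.4-7.6 WS; PI ~ 3.2-8.8 WS (~±2.8 WS, consistent with the rule of thumb)
```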
Practical Application Checklist
For Building a Projection System
- [ ] Define clear target metrics (WS, BPM, VORP, etc.)
- [ ] Collect multiple seasons of historical data
- [ ] Implement appropriate regression to the mean
- [ ] Build or adopt validated aging curves
- [ ] Develop similarity/comparable methodology
- [ ] Incorporate context adjustments (pace, role, team)
- [ ] Validate with walk-forward cross-validation
- [ ] Calibrate uncertainty estimates
- [ ] Document limitations clearly
For Evaluating Projections
- [ ] Check that projection intervals contain observed values at expected rates (see the coverage sketch after this list)
- [ ] Compare to naive baselines (persistence, aging curve only)
- [ ] Assess calibration of probability estimates
- [ ] Examine residuals for systematic biases
- [ ] Test performance on holdout seasons
- [ ] Compare to other public projection systems
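One way to run the interval-coverage check from the first item above, as a minimal sketch (the projection arrays are hypothetical placeholders):

```python
import numpy as np

def interval_coverage(lo, hi, actual) -> float:
    """Fraction of observed outcomes falling inside their projected
    intervals. Well-calibrated 95% intervals should land near 0.95;
    much lower signals overconfidence, much higher signals intervals
    too wide to be useful."""
    lo, hi, actual = map(np.asarray, (lo, hi, actual))
    return float(np.mean((actual >= lo) & (actual <= hi)))

# Hypothetical usage against a completed season:
# coverage = interval_coverage(proj_lo, proj_hi, observed_ws)
```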
For Using Projections in Decision-Making
- [ ] Consider full projection distribution, not just point estimates
- [ ] Account for asymmetric costs of over/underestimation
- [ ] Update projections as new data becomes available
- [ ] Recognize player-specific uncertainty factors
- [ ] Combine projections with qualitative information
- [ ] Document reasoning for departures from model projections
Common Mistakes to Avoid
Mistake 1: Ignoring Regression to the Mean
Problem: Projecting extreme performances to continue
Solution: Always regress toward population/player baseline, especially for volatile statistics
Mistake 2: Using Average Aging Curves Rigidly
Problem: Applying population averages to all players equally
Solution: Adjust for player-specific factors (playing style, injury history, physical profile)
Mistake 3: Overconfident Point Estimates
Problem: Presenting projections without uncertainty
Solution: Always report confidence/prediction intervals; emphasize range over point estimate
Mistake 4: Ignoring Context Changes
Problem: Projecting statistics without accounting for expected role/team changes
Solution: Build context adjustments into projection methodology
Mistake 5: Small Sample Overreaction
Problem: Dramatically changing projections based on limited new data
Solution: Weight new information appropriately based on sample size
Mistake 6: Survivorship Bias in Comparables
Problem: Using only successful historical players as comparables
Solution: Include all relevant players, including those who declined or left the league
Key Formulas and Calculations
Standard Error of a Proportion
SE = sqrt(p × (1-p) / n)
Weighted Average (Marcel-style)
Weighted Stat = (Season1 × W1 + Season2 × W2 + Season3 × W3) / (W1 + W2 + W3)
Typical weights: 5/4/3 for most recent to oldest
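In code the weighted average is a few lines; note that a full Marcel-style system also regresses to the mean and applies an age adjustment (the stat line below is hypothetical):

```python
def marcel_weighted(seasons, weights=(5, 4, 3)):
    """Weighted average of recent seasons, ordered most recent first."""
    return sum(s * w for s, w in zip(seasons, weights)) / sum(weights[:len(seasons)])

# Hypothetical: 22.0, 19.5, and 18.0 points per game over the last three seasons
print(f"{marcel_weighted([22.0, 19.5, 18.0]):.2f}")  # 20.17
```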
Regressed Estimate
Regressed = (Observed × n + Prior × k) / (n + k)
Where k = prior weight (equivalent sample size)
Euclidean Distance (for similarity)
Distance = sqrt(sum((xi - yi)^2))
Where xi, yi are standardized feature values
Projection Confidence Interval
CI = Projection ± z × SE
Where z = 1.96 for 95% confidence
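Putting the proportion standard error and the interval formula together, a worked example with assumed numbers (the shooter and attempt count are hypothetical):

```python
import math

p, n = 0.38, 400                  # hypothetical: 38% on 400 three-point attempts
se = math.sqrt(p * (1 - p) / n)   # standard error of a proportion
z = 1.96                          # 95% confidence
lo, hi = p - z * se, p + z * se
print(f"3P%: {p:.3f} ± {z * se:.3f} (95% CI: {lo:.3f} to {hi:.3f})")
# SE ~ 0.024, so the interval spans roughly 0.332 to 0.428
```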
Summary: The Projection Mindset
Think Probabilistically
- Projections are probability distributions, not certainties
- Communicate ranges and likelihoods
- Acknowledge that outliers will occur
Update Continuously
- Incorporate new data as it becomes available
- Adjust priors based on observed performance
- Re-evaluate comparables as careers unfold
Remain Humble
- Perfect prediction is impossible
- Model limitations are inherent
- Qualitative factors matter
Focus on Decisions
- Projections serve decision-making
- Match uncertainty to decision stakes
- Consider the asymmetric costs of errors
Quick Reference: Projection Confidence by Context
| Context | Confidence Level | Key Uncertainty Factors |
|---|---|---|
| Veteran (5+ years), stable role | Higher | Aging, injury |
| Veteran, new team/role | Moderate | Context fit, role change |
| Young player (2-4 years) | Moderate | Development trajectory |
| Rookie | Lower | Translation, role, development |
| International/College prospect | Lowest | Multiple translation steps |
| Injury recovery | Low-Moderate | Recovery timeline, adaptation |
Further Study Recommendations
- Statistical foundations: Study Bayesian inference and time series analysis
- Historical systems: Examine PECOTA (baseball), CARMELO, and RAPTOR methodologies
- Practice: Build projections for upcoming seasons and compare to outcomes
- Read critically: Evaluate public projection systems for their approaches and limitations
- Iterate: Continuously improve personal models based on forecast errors