Chapter 22: Player Performance Prediction - Key Takeaways

Executive Summary

Player performance prediction is one of the most valuable, and most challenging, applications of basketball analytics. This chapter provided frameworks for projecting future player performance using historical data, aging curves, similarity scores, and uncertainty quantification. Effective projections require combining statistical rigor with basketball knowledge while maintaining appropriate humility about the limits of predictability.


Core Concepts

1. The Four Components of Player Projection

Every player projection system must address four fundamental sub-problems:

Component | Definition | Key Challenge
Skill Estimation | Determining true underlying abilities from observed performance | Separating signal from noise
Aging Adjustment | Projecting how abilities change over time | Accounting for individual variation
Context Adjustment | Accounting for team, role, and environment changes | Predicting future context
Uncertainty Quantification | Communicating confidence in projections | Calibrating probability estimates

2. Regression to the Mean

Key Principle: Extreme observations contain more noise than typical observations and should be regressed toward the population mean.

Formula:

Regressed Estimate = (Observed × Sample Size + Prior Mean × Prior Weight) / (Sample Size + Prior Weight)

Application Rules:

  • Stronger regression for smaller samples
  • Less regression for more stable statistics (e.g., free throw percentage)
  • More regression for volatile statistics (e.g., three-point percentage)
  • Less regression for players with longer track records
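The regressed-estimate formula above can be sketched in a few lines of Python. The shooting percentages and the prior weight below are illustrative assumptions, not values from the chapter:

```python
def regress_to_mean(observed, n, prior_mean, prior_weight):
    """Shrink an observed rate toward a prior mean.

    prior_weight acts as an equivalent sample size: the larger it is,
    the harder the estimate is pulled toward the prior.
    """
    return (observed * n + prior_mean * prior_weight) / (n + prior_weight)

# A player shooting 45% from three on 80 attempts, regressed toward a
# league prior of 36% weighted like 250 attempts (illustrative numbers):
estimate = regress_to_mean(0.45, 80, 0.36, 250)
print(round(estimate, 3))  # 0.382
```

Note how the small sample (80 attempts) against the heavy prior pulls the estimate most of the way back toward 36%, in line with the "stronger regression for smaller samples" rule.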

3. Aging Curves

General Pattern:

  • Most skills peak in mid-to-late 20s
  • Athletic attributes decline earlier than skill-based attributes
  • Different positions and playing styles age differently

Key Statistics by Aging Pattern:

Early Peak (23-25) | Mid Peak (25-28) | Late Peak (28-30)
Speed/Quickness | Scoring | Court vision
Vertical leap | Rebounding | Free throw %
Perimeter defense | Overall defense | Three-point %

Construction Methods:

  • Delta method: Track same players across seasons
  • Cross-sectional: Compare players of different ages (biased)
  • Mixed methods: Combine approaches with statistical corrections
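The delta method can be sketched as follows: for every player with consecutive-age seasons, record the year-over-year change in a statistic, then average those deltas at each age transition. The toy data below uses made-up Win Share values purely for illustration:

```python
from collections import defaultdict

def delta_method_curve(seasons):
    """Estimate an aging curve from (player, age, stat) season records.

    Only consecutive-age pairs from the same player contribute, which is
    what distinguishes the delta method from a cross-sectional comparison.
    """
    by_player = defaultdict(dict)
    for player, age, stat in seasons:
        by_player[player][age] = stat
    deltas = defaultdict(list)
    for ages in by_player.values():
        for age in sorted(ages):
            if age + 1 in ages:
                deltas[(age, age + 1)].append(ages[age + 1] - ages[age])
    return {t: sum(v) / len(v) for t, v in sorted(deltas.items())}

# Toy data: (player, age, win shares) -- illustrative values only
data = [("A", 24, 5.0), ("A", 25, 6.0),
        ("B", 24, 3.0), ("B", 25, 3.5),
        ("B", 29, 4.0), ("B", 30, 3.0)]
print(delta_method_curve(data))  # {(24, 25): 0.75, (29, 30): -1.0}
```

A production version would also correct for survivorship (players who decline enough to leave the league never produce an age-30 delta), which is one of the statistical corrections the mixed methods address.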

4. Similarity Scores (Comparables)

Purpose: Use historical players' career paths to inform projections for current players.

Best Practices:

  • Select 5-15 comparables for statistical stability
  • Weight by similarity score
  • Account for era differences
  • Include only comparable contexts (age, role, situation)

Common Pitfalls:

  • Over-relying on small comparable sets
  • Ignoring context differences
  • Treating unprecedented players as having valid comparables
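A minimal comparable-finder can rank candidates by Euclidean distance in standardized feature space, as described above. The feature vectors below (z-scores for scoring, rebounding, and assist rate) and the player labels are assumed for illustration:

```python
import math

def top_comparables(target, candidates, k=5):
    """Rank candidate players by Euclidean distance in standardized
    feature space; smaller distance means more similar."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    ranked = sorted(candidates.items(), key=lambda kv: dist(target, kv[1]))
    return ranked[:k]

# Features assumed already standardized (z-scores), e.g.
# [scoring, rebounding, assist rate] -- illustrative values only
target = [1.2, -0.3, 0.8]
candidates = {"P1": [1.0, -0.2, 0.9], "P2": [-1.5, 2.0, 0.1],
              "P3": [1.3, -0.5, 0.6]}
for name, feats in top_comparables(target, candidates, k=2):
    print(name)  # P1, then P3
```

Standardizing features first matters: without it, high-variance statistics such as points per game would dominate the distance.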

5. Uncertainty Quantification

Types of Uncertainty:

  1. Model uncertainty: Error in the projection model itself
  2. Parameter uncertainty: Uncertainty in estimated model parameters
  3. Individual uncertainty: Natural variation in player outcomes

Confidence vs. Prediction Intervals:

  • Confidence intervals: Uncertainty about the model/parameters
  • Prediction intervals: Uncertainty about specific future observations (always wider)

Rule of Thumb: Year-over-year individual player projections typically have prediction intervals of ±2-4 Win Shares for established players.
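The gap between the two interval types can be shown with a simple normal model: the prediction interval adds the individual season-to-season variance to the model's standard error. The numbers below (0.5 WS model SE, 1.2 WS individual spread) are assumed for illustration, not taken from the chapter:

```python
import math

def intervals(projection, se_model, sigma_individual, z=1.96):
    """95% confidence interval (model uncertainty only) vs. prediction
    interval (model uncertainty plus individual variation)."""
    ci_half = z * se_model
    pi_half = z * math.sqrt(se_model ** 2 + sigma_individual ** 2)
    return ((projection - ci_half, projection + ci_half),
            (projection - pi_half, projection + pi_half))

# Illustrative: 6.0 projected Win Shares, 0.5 WS model SE,
# 1.2 WS individual spread (assumed values)
ci, pi = intervals(6.0, 0.5, 1.2)
```

With these assumed inputs the prediction interval is roughly ±2.5 Win Shares, consistent with the ±2-4 WS rule of thumb above, while the confidence interval is only about ±1 WS.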


Practical Application Checklist

For Building a Projection System

  • [ ] Define clear target metrics (WS, BPM, VORP, etc.)
  • [ ] Collect multiple seasons of historical data
  • [ ] Implement appropriate regression to the mean
  • [ ] Build or adopt validated aging curves
  • [ ] Develop similarity/comparable methodology
  • [ ] Incorporate context adjustments (pace, role, team)
  • [ ] Validate with walk-forward cross-validation
  • [ ] Calibrate uncertainty estimates
  • [ ] Document limitations clearly
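The walk-forward cross-validation step in the checklist above amounts to always training on past seasons and scoring on the next one, never the reverse. A minimal split generator, with season labels assumed for illustration:

```python
def walk_forward_splits(seasons, min_train=3):
    """Yield (train, test) splits: train on all seasons up to t,
    evaluate on season t+1, then roll the window forward."""
    for i in range(min_train, len(seasons)):
        yield seasons[:i], seasons[i]

# Illustrative season labels
seasons = list(range(2016, 2024))
for train, test in walk_forward_splits(seasons):
    pass  # fit the projection model on `train`, score it on `test`
```

This mirrors how the system would actually be used: each test season is projected using only information available before it.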

For Evaluating Projections

  • [ ] Check that projection intervals contain observed values at expected rates
  • [ ] Compare to naive baselines (persistence, aging curve only)
  • [ ] Assess calibration of probability estimates
  • [ ] Examine residuals for systematic biases
  • [ ] Test performance on holdout seasons
  • [ ] Compare to other public projection systems
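The first evaluation item above, checking that intervals contain observed values at the expected rate, is a simple counting exercise. The intervals and outcomes below are made-up numbers for illustration:

```python
def coverage_rate(intervals, actuals):
    """Fraction of observed outcomes falling inside their projection
    intervals; a calibrated 80% interval should cover roughly 80%."""
    hits = sum(lo <= y <= hi for (lo, hi), y in zip(intervals, actuals))
    return hits / len(actuals)

# Illustrative check of nominal 80% intervals:
ivs = [(2.0, 6.0), (1.0, 5.0), (4.0, 9.0), (0.0, 3.0), (3.0, 7.0)]
obs = [5.1, 6.2, 6.5, 2.0, 3.3]
print(coverage_rate(ivs, obs))  # 0.8
```

Coverage well below the nominal level signals overconfident intervals; coverage well above it signals intervals that are wider than necessary.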

For Using Projections in Decision-Making

  • [ ] Consider full projection distribution, not just point estimates
  • [ ] Account for asymmetric costs of over/underestimation
  • [ ] Update projections as new data becomes available
  • [ ] Recognize player-specific uncertainty factors
  • [ ] Combine projections with qualitative information
  • [ ] Document reasoning for departures from model projections

Common Mistakes to Avoid

Mistake 1: Ignoring Regression to the Mean

Problem: Projecting extreme performances to continue
Solution: Always regress toward population/player baseline, especially for volatile statistics

Mistake 2: Using Average Aging Curves Rigidly

Problem: Applying population averages to all players equally
Solution: Adjust for player-specific factors (playing style, injury history, physical profile)

Mistake 3: Overconfident Point Estimates

Problem: Presenting projections without uncertainty
Solution: Always report confidence/prediction intervals; emphasize range over point estimate

Mistake 4: Ignoring Context Changes

Problem: Projecting statistics without accounting for expected role/team changes
Solution: Build context adjustments into projection methodology

Mistake 5: Small Sample Overreaction

Problem: Dramatically changing projections based on limited new data
Solution: Weight new information appropriately based on sample size

Mistake 6: Survivorship Bias in Comparables

Problem: Using only successful historical players as comparables
Solution: Include all relevant players, including those who declined or left the league


Key Formulas and Calculations

Standard Error of a Proportion

SE = sqrt(p × (1-p) / n)
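A quick numerical check of this formula, using an assumed 36% three-point percentage on 400 attempts:

```python
import math

def se_proportion(p, n):
    """Standard error of an observed proportion (e.g., a shooting %)."""
    return math.sqrt(p * (1 - p) / n)

# 36% on 400 three-point attempts (illustrative numbers):
se = se_proportion(0.36, 400)
print(round(se, 4))  # 0.024
```

Even at 400 attempts the standard error is about 2.4 percentage points, which is why single-season shooting percentages deserve substantial regression.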

Weighted Average (Marcel-style)

Weighted Stat = (Season1 × W1 + Season2 × W2 + Season3 × W3) / (W1 + W2 + W3)
Typical weights: 5/4/3 for most recent to oldest
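The Marcel-style weighting can be computed directly; the Win Share values below are illustrative:

```python
def marcel_weight(stats, weights=(5, 4, 3)):
    """Weighted average of recent seasons, most recent first,
    using the typical 5/4/3 Marcel-style weights."""
    return (sum(s * w for s, w in zip(stats, weights))
            / sum(weights[:len(stats)]))

# Win Shares over the last three seasons, most recent first
# (illustrative values): (8*5 + 6*4 + 4*3) / 12
print(round(marcel_weight([8.0, 6.0, 4.0]), 2))  # 6.33
```

Truncating the weight sum to the number of available seasons lets the same function handle players with shorter track records.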

Regressed Estimate

Regressed = (Observed × n + Prior × k) / (n + k)
Where k = prior weight (equivalent sample size)

Euclidean Distance (for similarity)

Distance = sqrt(sum((xi - yi)^2))
Where xi, yi are standardized feature values

Projection Confidence Interval

CI = Projection ± z × SE
Where z = 1.96 for 95% confidence
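Putting the interval formula into code, here applied to the illustrative proportion example (an assumed 36% rate with SE 0.024):

```python
def confidence_interval(projection, se, z=1.96):
    """Symmetric interval around a projection given its standard error;
    z = 1.96 gives 95% coverage under a normal approximation."""
    return projection - z * se, projection + z * se

# Assumed inputs: p = 0.36 with SE = 0.024
lo, hi = confidence_interval(0.36, 0.024)
print(round(lo, 3), round(hi, 3))  # 0.313 0.407
```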

Summary: The Projection Mindset

Think Probabilistically

  • Projections are probability distributions, not certainties
  • Communicate ranges and likelihoods
  • Acknowledge that outliers will occur

Update Continuously

  • Incorporate new data as it becomes available
  • Adjust priors based on observed performance
  • Re-evaluate comparables as careers unfold

Remain Humble

  • Perfect prediction is impossible
  • Model limitations are inherent
  • Qualitative factors matter

Focus on Decisions

  • Projections serve decision-making
  • Match uncertainty to decision stakes
  • Consider the asymmetric costs of errors

Quick Reference: Projection Confidence by Context

Context | Confidence Level | Key Uncertainty Factors
Veteran (5+ years), stable role | Higher | Aging, injury
Veteran, new team/role | Moderate | Context fit, role change
Young player (2-4 years) | Moderate | Development trajectory
Rookie | Lower | Translation, role, development
International/College prospect | Lowest | Multiple translation steps
Injury recovery | Low-Moderate | Recovery timeline, adaptation

Further Study Recommendations

  1. Statistical foundations: Study Bayesian inference and time series analysis
  2. Historical systems: Examine PECOTA (baseball), CARMELO, and RAPTOR methodologies
  3. Practice: Build projections for upcoming seasons and compare to outcomes
  4. Read critically: Evaluate public projection systems for their approaches and limitations
  5. Iterate: Continuously improve personal models based on forecast errors