Key Takeaways: Elo and Power Ratings
One-Page Reference
Core Elo Equations
Expected Score:
E = 1 / (1 + 10^((R_opponent - R_team) / 400))
Rating Update:
R_new = R_old + K × (Actual - Expected)
Where: Actual = 1 (win), 0 (loss), 0.5 (tie)
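
The two equations above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation; the function names and the default `k=28.0` are choices made for this example.

```python
def expected_score(r_team: float, r_opponent: float) -> float:
    """Expected score (win probability) for r_team against r_opponent."""
    return 1.0 / (1.0 + 10 ** ((r_opponent - r_team) / 400.0))


def update_rating(r_old: float, actual: float, expected: float, k: float = 28.0) -> float:
    """Shift the rating by K times the gap between actual and expected score."""
    return r_old + k * (actual - expected)


# Example: a 1550-rated team beats a 1500-rated team (no home advantage applied).
e = expected_score(1550, 1500)
new_rating = update_rating(1550, actual=1.0, expected=e)
print(f"expected={e:.3f}, new rating={new_rating:.1f}")
```

Because the favorite was already expected to win more often than not, its rating gain is smaller than the K/2 it would earn against an equally rated opponent.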
Key Parameters
| Parameter | Typical Value | Effect |
|---|---|---|
| K-Factor | 25-32 | Higher = more responsive, more volatile |
| Home Advantage | 48 Elo points | Added to the home team's rating; ≈1.9 points of spread at 25 Elo per point |
| Initial Rating | 1500 | Starting point for all teams |
| Season Regression | 1/3 toward mean | Accounts for offseason changes |
Rating to Spread Conversion
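Elo → Spread:
- ~25 Elo points ≈ 1 point of spread
- Spread = (Away_Elo - Home_Elo - HFA) / 25, where a negative spread means the home team is favored

Spread → Probability:
- Each point ≈ 3% probability shift near a pick'em
- P(home) = 1 / (1 + 10^(spread/16)), which follows from the 400-point Elo scale at 25 Elo per point (400 / 25 = 16)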
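
As a rough sketch of the conversion, the snippet below turns two ratings into a home-team spread and a home win probability. To stay on one scale, the probability is computed directly from the home-adjusted Elo gap with the expected-score formula rather than from the spread. The constant and function names are assumptions made for this example.

```python
ELO_PER_POINT = 25.0  # ~25 Elo points per point of spread
HFA_ELO = 48.0        # home advantage in Elo points


def elo_to_spread(home_elo: float, away_elo: float, hfa: float = HFA_ELO) -> float:
    """Home-team spread in points; negative means the home team is favored."""
    return (away_elo - home_elo - hfa) / ELO_PER_POINT


def home_win_probability(home_elo: float, away_elo: float, hfa: float = HFA_ELO) -> float:
    """Expected-score formula applied to the home-adjusted Elo gap."""
    return 1.0 / (1.0 + 10 ** ((away_elo - home_elo - hfa) / 400.0))


# Two evenly rated teams: the home side is favored only by the home advantage.
spread = elo_to_spread(1500, 1500)
prob = home_win_probability(1500, 1500)
print(f"spread={spread:.2f}, P(home)={prob:.3f}")
```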
Margin of Victory Adjustment
Formula:
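multiplier = log(margin + 1) × autocorr_factor
autocorr_factor = 2.2 / ((elo_diff × 0.001) + 2.2)
Where: log = natural log, margin = winning margin in points, elo_diff = winner's pre-game Elo minus loser's pre-game Elo
Effect:
- Rewards larger margins, with diminishing returns from the log
- Dampens expected blowouts (favorite wins big)
- Amplifies upset blowouts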
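
A small sketch of the multiplier, assuming `elo_diff` is the winner's pre-game rating minus the loser's (the reading under which favorite blowouts are dampened and upset blowouts amplified) and that `log` is the natural log. The function name is made up for this example.

```python
import math


def mov_multiplier(margin: float, winner_elo: float, loser_elo: float) -> float:
    """Margin-of-victory multiplier applied to K for a single game."""
    elo_diff = winner_elo - loser_elo            # positive when the favorite won
    autocorr_factor = 2.2 / (elo_diff * 0.001 + 2.2)
    return math.log(abs(margin) + 1) * autocorr_factor


# Same 21-point margin, three different pre-game situations.
favorite_wins = mov_multiplier(21, winner_elo=1600, loser_elo=1400)
even_matchup = mov_multiplier(21, winner_elo=1500, loser_elo=1500)
underdog_wins = mov_multiplier(21, winner_elo=1400, loser_elo=1600)
print(favorite_wins, even_matchup, underdog_wins)
```

The three values come out in increasing order: the same scoreline moves ratings least when the favorite delivers it and most when the underdog does.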
Rating System Comparison
| System | Strengths | Weaknesses |
|---|---|---|
| Elo | Simple, interpretable, cross-season | Path dependent, ignores margin unless adjusted |
| SRS | SOS built-in, path independent | Single season only, all games weighted equally |
| Efficiency | Granular, offense/defense splits | Complex, needs play-by-play |
Simple Rating System (SRS)
Core Equation:
Rating = Average Margin + Average Opponent Rating
- Iteratively solved until convergence
- Automatically adjusts for schedule strength
- Ratings are in point units (easier interpretation)
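
The iterative solve can be sketched as a fixed-point loop over the core equation. The input format (a list of `(team_a, team_b, margin_a)` tuples), the damping step, and the re-centering to a zero average are choices made for this example; damping is added only to keep the loop stable on small or unusual schedules.

```python
from collections import defaultdict


def solve_srs(games, max_iter=10_000, tol=1e-9, damping=0.5):
    """games: iterable of (team_a, team_b, margin_a), margin_a = points_a - points_b."""
    margins = defaultdict(list)
    opponents = defaultdict(list)
    for team_a, team_b, margin_a in games:
        margins[team_a].append(margin_a)
        opponents[team_a].append(team_b)
        margins[team_b].append(-margin_a)
        opponents[team_b].append(team_a)

    avg_margin = {t: sum(m) / len(m) for t, m in margins.items()}
    ratings = dict(avg_margin)  # start from raw average margin

    for _ in range(max_iter):
        # Core equation: rating = average margin + average opponent rating.
        target = {
            t: avg_margin[t] + sum(ratings[o] for o in opponents[t]) / len(opponents[t])
            for t in ratings
        }
        updated = {t: (1 - damping) * ratings[t] + damping * target[t] for t in ratings}
        mean = sum(updated.values()) / len(updated)
        updated = {t: r - mean for t, r in updated.items()}  # keep the average at 0
        change = max(abs(updated[t] - ratings[t]) for t in ratings)
        ratings = updated
        if change < tol:
            break
    return ratings


# Toy three-team round robin (made-up scores).
toy_games = [("A", "B", 7), ("B", "C", 3), ("A", "C", 10)]
toy_ratings = solve_srs(toy_games)
print({t: round(r, 2) for t, r in toy_ratings.items()})
```

In the toy schedule, team A's raw average margin of 8.5 is pulled down because its opponents rate below average, which is the schedule adjustment at work.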
Season Transitions
Regression to Mean:
new_rating = old_rating - (old_rating - mean) × regression_factor
Why Regress:
1. Extreme ratings partly luck-driven
2. Roster changes occur in offseason
3. Future data will update anyway

Typical Regression: 1/3 toward mean
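
A one-function sketch of the regression step, applied to a hypothetical pair of end-of-season ratings. The team labels and rating values are invented for illustration.

```python
def regress_to_mean(old_rating: float, mean: float = 1500.0,
                    regression_factor: float = 1 / 3) -> float:
    """Pull a rating part of the way back toward the league mean."""
    return old_rating - (old_rating - mean) * regression_factor


# Hypothetical end-of-season ratings.
end_of_season = {"Team A": 1650.0, "Team B": 1410.0}
next_season = {team: regress_to_mean(r) for team, r in end_of_season.items()}
print(next_season)
```

With a factor of 1/3, a team 150 points above the mean starts the next season 100 points above it.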
Performance Benchmarks
| Metric | Baseline | Good | Excellent | Market |
|---|---|---|---|---|
| Straight-Up (SU) Accuracy | 50% | 60% | 63%+ | ~63% |
| Brier Score (lower is better) | 0.250 | 0.220 | <0.210 | ~0.210 |
| Spread MAE (points, lower is better) | 14.0 | 11.0 | <10.5 | ~10.5 |
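
The three metrics in the table can be computed as sketched below. This assumes win probabilities are stated for the home team, outcomes are coded 1 for a home win and 0 for a home loss (ties left out), and Spread MAE means the mean absolute error between predicted and actual margins. The function names are made up for this example.

```python
def straight_up_accuracy(probs, outcomes):
    """Share of games where the side given more than 50% actually won."""
    correct = sum((p > 0.5) == (y == 1) for p, y in zip(probs, outcomes))
    return correct / len(probs)


def brier_score(probs, outcomes):
    """Mean squared error between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def spread_mae(predicted_margins, actual_margins):
    """Mean absolute error, in points, between predicted and actual margins."""
    errors = [abs(p - a) for p, a in zip(predicted_margins, actual_margins)]
    return sum(errors) / len(errors)


# A forecaster who always says 50% scores the 0.250 Brier baseline.
coin_flip_brier = brier_score([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])
print(coin_flip_brier)
```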
Practical Guidelines
K-Factor Selection:
- 20-25: Stable, slow adaptation
- 28-32: Balanced (most common)
- 40+: Very responsive, high volatility

Margin Capping:
- Cap at 14-24 points
- Prevents garbage time distortion
- Blowouts beyond cap add little information

Home Advantage:
- ~2.5 points (down from 3.0 historically)
- ~48 Elo points
- Continues to decline over time
When to Use Each System
Elo: Public-facing predictions, cross-season comparisons, simplicity required
SRS: Single-season analysis, schedule strength focus, spread predictions
Efficiency: Detailed analysis, matchup breakdowns, maximum information extraction
Ensemble: Combine all three for best performance
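
One simple way to combine systems is a weighted average of their win probabilities. The reference does not specify a combination method, so the averaging approach, the function name, and the probabilities below are all assumptions made for illustration.

```python
def ensemble_probability(probs, weights=None):
    """Weighted average of home win probabilities keyed by system name."""
    if weights is None:
        weights = {name: 1.0 for name in probs}  # equal weights by default
    total_weight = sum(weights[name] for name in probs)
    return sum(probs[name] * weights[name] for name in probs) / total_weight


# Made-up probabilities for a single game from three systems.
game_probs = {"elo": 0.62, "srs": 0.58, "efficiency": 0.66}
blended = ensemble_probability(game_probs)
print(round(blended, 3))
```

If weights are tuned rather than fixed, they should be fit on training seasons only, for the same reason the K-factor should not be tuned on evaluation data.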
Common Mistakes to Avoid
- Overfitting K-factor on evaluation data
- Ignoring sample size early in season
- Forgetting home advantage adjustment
- Not capping margins (one blowout distorts weeks)
- Over-interpreting early-season ratings
- Neglecting regression between seasons
Quick Implementation Checklist
- [ ] Initialize all teams at 1500
- [ ] Set K-factor (start with 28)
- [ ] Add home advantage (48 Elo)
- [ ] Process games chronologically
- [ ] Consider margin adjustment
- [ ] Apply 1/3 regression between seasons
- [ ] Calibrate to spreads
- [ ] Validate on holdout data
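
The checklist can be strung together roughly as follows. This is a sketch under several assumptions: games arrive as chronologically sorted dicts with the keys shown, the margin cap is 24 points, the home advantage is included in the Elo difference used by the margin multiplier, and ties keep a multiplier of 1. None of these details are fixed by the checklist itself.

```python
import math
from collections import defaultdict

INITIAL, K, HFA, REGRESSION, MARGIN_CAP = 1500.0, 28.0, 48.0, 1 / 3, 24


def run_elo(games, k=K, hfa=HFA, use_margin=True):
    """games: chronological list of dicts with keys
    season, home, away, home_pts, away_pts.
    Returns (final ratings, pre-game home win probabilities)."""
    ratings = defaultdict(lambda: INITIAL)
    predictions = []
    current_season = None
    for g in games:
        # Regress every known team toward the mean when a new season starts.
        if current_season is not None and g["season"] != current_season:
            for team in ratings:
                ratings[team] -= (ratings[team] - INITIAL) * REGRESSION
        current_season = g["season"]

        home, away = g["home"], g["away"]
        diff = ratings[home] + hfa - ratings[away]
        p_home = 1.0 / (1.0 + 10 ** (-diff / 400.0))
        predictions.append(p_home)  # recorded before the result is used

        margin = g["home_pts"] - g["away_pts"]
        actual = 1.0 if margin > 0 else 0.0 if margin < 0 else 0.5
        multiplier = 1.0
        if use_margin and margin != 0:
            winner_diff = diff if margin > 0 else -diff
            capped = min(abs(margin), MARGIN_CAP)
            multiplier = math.log(capped + 1) * 2.2 / (winner_diff * 0.001 + 2.2)

        shift = k * multiplier * (actual - p_home)
        ratings[home] += shift
        ratings[away] -= shift
    return dict(ratings), predictions


# Made-up two-game schedule spanning a season boundary.
demo_games = [
    {"season": 2022, "home": "A", "away": "B", "home_pts": 27, "away_pts": 17},
    {"season": 2023, "home": "B", "away": "A", "home_pts": 20, "away_pts": 20},
]
demo_ratings, demo_preds = run_elo(demo_games)
print(demo_ratings, demo_preds)
```

For validation, parameters such as `k` would be chosen using earlier seasons, and the predictions for later, untouched seasons scored with accuracy and Brier score.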
Key Insight
Simple rating systems perform surprisingly well. Elo with margin adjustment achieves ~60% straight-up accuracy—competitive with complex approaches. The key is proper calibration, thoughtful parameter selection, and honest evaluation.