Key Takeaways: Elo and Power Ratings

One-Page Reference


Core Elo Equations

Expected Score:

E = 1 / (1 + 10^((R_opponent - R_team) / 400))

Rating Update:

R_new = R_old + K × (Actual - Expected)

Where: Actual = 1 (win), 0 (loss), 0.5 (tie)
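
The two equations above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation; the function names and the example ratings are hypothetical, and K = 28 is simply a value inside the typical range.

```python
def expected_score(r_team: float, r_opponent: float) -> float:
    """Expected score for r_team against r_opponent on the 400-point logistic scale."""
    return 1.0 / (1.0 + 10.0 ** ((r_opponent - r_team) / 400.0))


def update_rating(r_old: float, expected: float, actual: float, k: float = 28.0) -> float:
    """Move the rating toward the result by K times the surprise (actual - expected)."""
    return r_old + k * (actual - expected)


# Hypothetical example: a 1600-rated team beats a 1500-rated team.
e = expected_score(1600, 1500)                 # about 0.64
new_winner = update_rating(1600, e, 1.0)       # gains K * (1 - e)
new_loser = update_rating(1500, 1.0 - e, 0.0)  # loses the same amount
print(round(e, 3), round(new_winner, 1), round(new_loser, 1))
```

Because both teams use the same K, the winner's gain equals the loser's loss, so the league-wide average rating stays fixed.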


Key Parameters

Parameter         | Typical Value   | Effect
------------------|-----------------|----------------------------------------
K-Factor          | 25-32           | Higher = more responsive, more volatile
Home Advantage    | 48 Elo points   | ~2.5 point spread
Initial Rating    | 1500            | Starting point for all teams
Season Regression | 1/3 toward mean | Accounts for offseason changes

Rating to Spread Conversion

Elo → Spread:
  • ~25 Elo points ≈ 1 point of spread
  • Spread = (Away_Elo - Home_Elo - HFA) / 25 (negative when the home team is favored)

Spread → Probability:
  • Each point ≈ 3% probability shift
  • P(home) = 1 / (1 + 10^(spread/16)), which follows from 25 Elo points per point on the 400-point Elo scale (400 / 25 = 16)
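
A hedged sketch of both conversions follows. Rather than hard-coding a divisor, the probability function converts the spread back to an Elo gap and applies the expected-score formula, which works out to an exponent of spread/16 when one point equals 25 Elo. The constant and function names are hypothetical.

```python
ELO_PER_POINT = 25.0  # from the reference: ~25 Elo points per point of spread
HFA_ELO = 48.0        # home advantage in Elo points, from the parameter table


def elo_to_spread(home_elo: float, away_elo: float, hfa: float = HFA_ELO) -> float:
    """Home-team spread in points; negative means the home team is favored."""
    return (away_elo - home_elo - hfa) / ELO_PER_POINT


def spread_to_home_prob(spread: float) -> float:
    """Convert the spread back to an Elo gap, then apply the expected-score formula."""
    elo_gap = -spread * ELO_PER_POINT  # home edge in Elo points, home advantage included
    return 1.0 / (1.0 + 10.0 ** (-elo_gap / 400.0))


# Hypothetical matchup: a 1550 home team hosts a 1500 visitor.
spread = elo_to_spread(1550, 1500)  # (1500 - 1550 - 48) / 25 = -3.92
print(round(spread, 2), round(spread_to_home_prob(spread), 3))
```

With these constants the 48-point home advantage alone is worth 48 / 25 ≈ 1.9 points, a little below the ~2.5 points quoted elsewhere on this page, so one of the two constants may need calibrating against real spreads.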


Margin of Victory Adjustment

Formula:

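multiplier = log(margin + 1) × autocorr_factor
autocorr_factor = 2.2 / ((elo_diff × 0.001) + 2.2)

Where: log = natural logarithm, margin = absolute point margin, elo_diff = winner's rating - loser's rating before the game. The multiplier scales K in the rating update.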

Effect:
  • Rewards larger margins
  • Dampens expected blowouts (favorite wins big)
  • Amplifies upset blowouts
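
The multiplier can be sketched as below. Two assumptions are baked in and should be checked against your own system: `log` is taken as the natural logarithm, and the Elo difference is measured from the winner's side (winner minus loser), which is what makes favorite blowouts shrink and upset blowouts grow. The function name is hypothetical.

```python
import math


def mov_multiplier(margin: float, winner_elo_diff: float) -> float:
    """Margin-of-victory multiplier.

    margin: point margin of the game (sign is ignored).
    winner_elo_diff: winner's pre-game rating minus loser's (assumed sign convention).
    """
    autocorr_factor = 2.2 / (winner_elo_diff * 0.001 + 2.2)
    return math.log(abs(margin) + 1.0) * autocorr_factor


# Same 21-point margin, three hypothetical pre-game gaps.
for gap in (200, 0, -200):  # favorite wins, even matchup, underdog wins
    print(gap, round(mov_multiplier(21, gap), 3))
```

A tie has margin 0, so this multiplier is 0 and the game would produce no update; whether ties should instead fall back to a multiplier of 1 is a design choice this sketch leaves open.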


Rating System Comparison

System     | Strengths                              | Weaknesses
-----------|----------------------------------------|------------------------------------
Elo        | Simple, interpretable, cross-season    | Path dependent, no margin
SRS        | SOS built-in, path independent         | Single season only, all games equal
Efficiency | Granular, offense/defense splits       | Complex, needs play-by-play

Simple Rating System (SRS)

Core Equation:

Rating = Average Margin + Average Opponent Rating
  • Iteratively solved until convergence
  • Automatically adjusts for schedule strength
  • Ratings are in point units (easier interpretation)
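
One way to solve the SRS equation iteratively is sketched below. The damping step (averaging the old and new ratings each pass) and the re-centering to a zero mean are choices made for this sketch so that the loop also settles on small or unbalanced schedules; they are not part of the equation itself. The game tuple layout and function name are hypothetical.

```python
from collections import defaultdict


def solve_srs(games, n_iter=1000, tol=1e-9):
    """games: iterable of (team_a, team_b, margin_a), margin_a = points_a - points_b."""
    margins = defaultdict(list)
    opponents = defaultdict(list)
    for a, b, m in games:
        margins[a].append(m)
        opponents[a].append(b)
        margins[b].append(-m)
        opponents[b].append(a)

    avg_margin = {t: sum(ms) / len(ms) for t, ms in margins.items()}
    ratings = dict(avg_margin)  # start from raw average margin

    for _ in range(n_iter):
        new = {}
        for t in ratings:
            avg_opp = sum(ratings[o] for o in opponents[t]) / len(opponents[t])
            target = avg_margin[t] + avg_opp           # the SRS equation
            new[t] = 0.5 * ratings[t] + 0.5 * target   # damped step (sketch's choice)
        mean = sum(new.values()) / len(new)
        new = {t: r - mean for t, r in new.items()}    # keep ratings centered on 0
        delta = max(abs(new[t] - ratings[t]) for t in new)
        ratings = new
        if delta < tol:
            break
    return ratings


# Hypothetical three-team round robin.
srs = solve_srs([("A", "B", 7), ("B", "C", 3), ("A", "C", 10)])
print({t: round(r, 2) for t, r in srs.items()})
```

The result is in points: the difference between two teams' ratings is the predicted neutral-field margin.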

Season Transitions

Regression to Mean:

new_rating = old_rating - (old_rating - mean) × regression_factor

Why Regress:
  1. Extreme ratings are partly luck-driven
  2. Roster changes occur in the offseason
  3. Future data will update the ratings anyway

Typical Regression: 1/3 toward mean
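
A minimal sketch of the season transition, assuming ratings are kept in a dict and the mean is the 1500 initial rating (both are assumptions of this example, as is the function name):

```python
def regress_to_mean(ratings, regression_factor=1.0 / 3.0, mean=1500.0):
    """Pull every rating toward the mean by the given fraction of its distance."""
    return {
        team: rating - (rating - mean) * regression_factor
        for team, rating in ratings.items()
    }


# Hypothetical end-of-season ratings.
end_of_season = {"A": 1650.0, "B": 1500.0, "C": 1350.0}
print(regress_to_mean(end_of_season))  # A moves to about 1600, C to about 1400
```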


Performance Benchmarks

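Metric                    | Baseline | Good  | Excellent | Market
--------------------------|----------|-------|-----------|-------
SU (straight-up) Accuracy | 50%      | 60%   | 63%+      | ~63%
Brier Score               | 0.250    | 0.220 | <0.210    | ~0.210
Spread MAE (points)       | 14.0     | 11.0  | <10.5     | ~10.5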
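
The three metrics can be computed as sketched below, assuming binary outcomes (1 = the predicted side won, 0 = it lost) and margins in points. Function names are hypothetical.

```python
def su_accuracy(probs, outcomes):
    """Share of games where the side given >= 50% actually won (outcomes are 1 or 0)."""
    hits = sum((p >= 0.5) == (y == 1) for p, y in zip(probs, outcomes))
    return hits / len(probs)


def brier_score(probs, outcomes):
    """Mean squared error between forecast probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def spread_mae(predicted_margins, actual_margins):
    """Mean absolute error of predicted point margins."""
    errors = [abs(p - a) for p, a in zip(predicted_margins, actual_margins)]
    return sum(errors) / len(errors)


# A forecaster that always says 50% scores exactly the 0.250 Brier baseline.
print(brier_score([0.5, 0.5], [1, 0]))
```

Lower is better for Brier score and MAE; higher is better for accuracy.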

Practical Guidelines

K-Factor Selection:
  • 20-25: Stable, slow adaptation
  • 28-32: Balanced (most common)
  • 40+: Very responsive, high volatility

Margin Capping:
  • Cap at 14-24 points
  • Prevents garbage time distortion
  • Blowouts beyond cap add little information

Home Advantage:
  • ~2.5 points (down from 3.0 historically)
  • ~48 Elo points
  • Continues to decline over time


When to Use Each System

Elo: Public-facing predictions, cross-season comparisons, simplicity required

SRS: Single-season analysis, schedule strength focus, spread predictions

Efficiency: Detailed analysis, matchup breakdowns, maximum information extraction

Ensemble: Combine all three for best performance


Common Mistakes to Avoid

  1. Overfitting K-factor on evaluation data
  2. Ignoring sample size early in season
  3. Forgetting home advantage adjustment
  4. Not capping margins (one blowout distorts weeks)
  5. Over-interpreting early-season ratings
  6. Neglecting regression between seasons

Quick Implementation Checklist

  • [ ] Initialize all teams at 1500
  • [ ] Set K-factor (start with 28)
  • [ ] Add home advantage (48 Elo)
  • [ ] Process games chronologically
  • [ ] Consider margin adjustment
  • [ ] Apply 1/3 regression between seasons
  • [ ] Calibrate to spreads
  • [ ] Validate on holdout data
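
Most of the checklist can be wired together as in the sketch below. It is deliberately minimal: it omits the margin adjustment and the spread calibration step, regresses toward the 1500 initial rating, and uses a hypothetical game-record layout (`season`, `home`, `away`, `home_pts`, `away_pts`) and made-up scores.

```python
from collections import defaultdict


def run_elo(games, k=28.0, hfa=48.0, init=1500.0, regression=1.0 / 3.0):
    """games must be sorted chronologically.

    Returns (final ratings, list of (pre-game home win probability, home result)).
    """
    ratings = defaultdict(lambda: init)
    preds = []
    season = None
    for g in games:
        if season is not None and g["season"] != season:
            for team in ratings:  # season transition: regress toward the mean
                ratings[team] -= (ratings[team] - init) * regression
        season = g["season"]

        home, away = g["home"], g["away"]
        p_home = 1.0 / (1.0 + 10.0 ** ((ratings[away] - ratings[home] - hfa) / 400.0))
        if g["home_pts"] > g["away_pts"]:
            actual = 1.0
        elif g["home_pts"] < g["away_pts"]:
            actual = 0.0
        else:
            actual = 0.5
        preds.append((p_home, actual))  # recorded before the update

        delta = k * (actual - p_home)
        ratings[home] += delta
        ratings[away] -= delta
    return dict(ratings), preds


# Made-up games, for illustration only.
games = [
    {"season": 2022, "home": "A", "away": "B", "home_pts": 24, "away_pts": 17},
    {"season": 2022, "home": "B", "away": "C", "home_pts": 20, "away_pts": 23},
    {"season": 2023, "home": "C", "away": "A", "home_pts": 10, "away_pts": 27},
    {"season": 2023, "home": "A", "away": "B", "home_pts": 21, "away_pts": 21},
]
final_ratings, preds = run_elo(games)
holdout = preds[2:]  # score only the later season
print(sum((p - y) ** 2 for p, y in holdout) / len(holdout))
```

To avoid tuning K on the evaluation data, one option is to pick K using only the earlier predictions and report metrics on the held-out slice, as the last two lines suggest.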

Key Insight

Simple rating systems perform surprisingly well. Elo with margin adjustment achieves ~60% straight-up accuracy, which is competitive with more complex approaches and within a few points of the ~63% market benchmark. The key is proper calibration, thoughtful parameter selection, and honest evaluation.