Key Takeaways: Introduction to Prediction Models

One-Page Reference


Core Concept

A prediction model is a systematic method for generating forecasts about uncertain future events based on available information—not guessing, but mathematical transformation of data into probabilistic outcomes.


The Prediction Pipeline

[Raw Data] → [Feature Engineering] → [Model] → [Predictions] → [Evaluation]
     ↑                                              ↓
     └──────────── [Feedback Loop] ←───────────────┘

Types of NFL Predictions

Type     Output                Use Case
-------  --------------------  ---------------------
Outcome  Winner + probability  Straight-up picks
Spread   Point margin          Betting analysis
Total    Combined score        Over/under analysis
Season   Win total             Futures, projections

Key Evaluation Metrics

Accuracy Metrics

Metric       Formula                  Benchmark
-----------  -----------------------  ------------------------
Straight-up  Correct / Total          50% random, 55-60% good
ATS          Covers / Total           52.4% to profit
Brier Score  mean((prob - outcome)²)  0.25 random, <0.22 good
MAE          mean(|pred - actual|)    ~10-12 points
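The Brier score and MAE rows above can be sketched in a few lines; the game data below is made up purely for illustration:

```python
def brier_score(probs, outcomes):
    """Mean squared error of predicted probabilities (0 = perfect, 0.25 = random)."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

def mae(preds, actuals):
    """Mean absolute error of predicted point margins."""
    return sum(abs(p - a) for p, a in zip(preds, actuals)) / len(preds)

# Illustrative data: three games, predicted win probability vs. 1/0 outcome.
probs = [0.72, 0.60, 0.50]
outcomes = [1, 0, 1]
print(round(brier_score(probs, outcomes), 4))  # 0.2295
```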

What "Good" Looks Like

  • Straight-up: 55-60% over large sample
  • ATS: 53-55% is elite (very rare)
  • Brier: Below 0.22
  • MAE: Below 12 points

Common Pitfalls

1. Overfitting

Problem: Model memorizes past data, fails on new data.
Solution: Use train/test splits, cross-validation.
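A minimal sketch of the train/test guard: split chronologically rather than randomly, so the test set is always "future" relative to training (`games` here is a placeholder list, ordered oldest first):

```python
# Chronological split: fit on early games, evaluate on later ones.
# A random shuffle would let future information leak into training.
games = list(range(2000, 2024))  # placeholder for chronologically ordered games
split = int(len(games) * 0.8)
train, test = games[:split], games[split:]
print(len(train), len(test))  # 19 5
```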

2. Data Leakage

Problem: Using information not available at prediction time.
Solution: Strict temporal separation of features.

3. Ignoring Variance

Problem: Actual NFL margins deviate from the spread with a standard deviation of ≈ 13.5 points.
Solution: Quantify uncertainty, accept randomness.

4. Small Sample Illusions

Problem: 60% accuracy over 20 games is statistically meaningless (the 95% CI spans roughly 38-81%).
Solution: Require 100+ game samples before drawing conclusions.


Building Blocks

1. Team Ratings

Single number representing team strength

rating = weighted_average(point_differentials)
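One way to realize this rating is an exponentially weighted average of point differentials, newest game first; the 0.9 decay factor is an illustrative choice, not a canonical value:

```python
def team_rating(point_diffs, decay=0.9):
    """Exponentially weighted average; point_diffs ordered newest game first."""
    weights = [decay ** i for i in range(len(point_diffs))]
    return sum(w * d for w, d in zip(weights, point_diffs)) / sum(weights)

# Last three games: won by 10, lost by 3, won by 7.
print(round(team_rating([10, -3, 7]), 2))  # 4.79
```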

2. Home Field Advantage

~2.5 points in modern NFL

spread = away_rating - home_rating - HFA

3. Adjustments

  • Rest days (+0.5 pts/day)
  • Travel (long distance: +1-1.5 pts)
  • Timezone (west→east: +1 pt)
  • Weather, injuries
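Combining the rating, home field advantage, and adjustments above, a spread sketch might look like this (signs follow the betting convention that negative means the home team is favored; the adjustment sizes are the illustrative values listed above):

```python
def predict_spread(away_rating, home_rating, hfa=2.5,
                   home_rest_edge_days=0, away_long_travel=False):
    """Negative spread = home team favored (betting convention)."""
    spread = away_rating - home_rating - hfa
    spread -= 0.5 * home_rest_edge_days  # rest edge: 0.5 pts per extra day
    if away_long_travel:
        spread -= 1.0                    # long trip penalizes the away team
    return spread

# Home team rated 3 points better, away team on a long road trip:
print(predict_spread(away_rating=0.0, home_rating=3.0, away_long_travel=True))  # -6.5
```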

4. Uncertainty

Independent error sources combine in quadrature (sum the variances, i.e. the squared standard deviations):

total_std = sqrt(game_std² + sample_std² + rating_std²)
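A minimal sketch of the combination, using the ≈13.5-point game std from above and made-up values for the other two terms:

```python
import math

def total_std(game_std=13.5, sample_std=2.0, rating_std=1.5):
    """Combine independent error sources in quadrature (illustrative inputs)."""
    return math.sqrt(game_std**2 + sample_std**2 + rating_std**2)

print(round(total_std(), 2))  # 13.73 -- game variance dominates the total
```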

Converting Spread to Probability

home_win_prob = 1 / (1 + 10^(spread / 17))

Spread  Home Win Prob
------  -------------
-14     85%
-7      72%
-3      60%
0       50%
+3      40%
+7      28%
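A sketch of the conversion as code, using an Elo-style logistic form; a divisor near 17 reproduces the table values:

```python
def home_win_prob(spread, k=17):
    """Convert a point spread (home perspective, negative = home favored)
    to a win probability; k ~= 17 matches the reference table above."""
    return 1 / (1 + 10 ** (spread / k))

print(round(home_win_prob(-7), 2))  # 0.72
```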

Quick Calibration Check

Your 70% predictions should win ~70% of the time.

# Group predictions by probability bin
# Compare predicted prob to actual win rate
calibration_error = |predicted_prob - actual_win_rate|
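A fuller sketch of the binning check described above (the probabilities and outcomes are illustrative):

```python
from collections import defaultdict

def calibration_table(probs, outcomes, bin_width=0.1):
    """Group predictions into probability bins; report actual win rate per bin."""
    bins = defaultdict(list)
    for p, o in zip(probs, outcomes):
        bins[round(p // bin_width * bin_width, 1)].append(o)
    return {b: sum(os) / len(os) for b, os in sorted(bins.items())}

# Predictions near 0.7 should win ~70% of the time over a large sample.
probs = [0.71, 0.74, 0.72, 0.78, 0.55, 0.52]
outcomes = [1, 1, 0, 1, 1, 0]
print(calibration_table(probs, outcomes))  # {0.5: 0.5, 0.7: 0.75}
```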

Sample Size Requirements

Sample      95% CI Width  Reliable?
----------  ------------  ---------
20 games    ±22%          No
50 games    ±14%          Barely
100 games   ±10%          Somewhat
500 games   ±4%           Yes
1000 games  ±3%           Very
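The CI widths in the table follow from the normal approximation to a binomial proportion at p = 0.5:

```python
import math

def ci_half_width(n, p=0.5, z=1.96):
    """Half-width of the 95% normal-approximation CI for an observed win rate."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (20, 50, 100, 500, 1000):
    print(n, f"±{ci_half_width(n):.0%}")  # reproduces the table column
```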

Model Comparison Framework

Aspect       Ask
-----------  --------------------------------
Inputs       What data does it use?
Process      How does it combine information?
Outputs      What does it predict?
Evaluation   How was it validated?
Uncertainty  Does it quantify confidence?

Red Flags

  • Claims of >65% sustained accuracy
  • No reported sample size
  • No out-of-sample testing
  • Using post-game data to predict
  • Ignoring model uncertainty

Baseline Expectations

Method                Expected Accuracy
--------------------  -----------------
Coin flip             50%
Always pick home      52%
Always pick favorite  67%
Vegas spread          50% ATS
Good model            55-58% SU
Elite model           58-62% SU

Remember

  1. Systematic > Intuitive - Models beat gut feelings long-term
  2. Evaluation is mandatory - No testing = no credibility
  3. Variance is real - Even perfect models have bad weeks
  4. Simple often wins - Complexity ≠ accuracy
  5. Continuous improvement - Update with new data