> "Football is a game of inches, and inches are what we measure."
Learning Objectives
- Load and process NFL play-by-play data using nflfastR and nfl_data_py in Python
- Calculate and interpret Expected Points Added (EPA), DVOA, success rate, and CPOE
- Build a regression-based spread prediction model from team efficiency statistics
- Quantify quarterback value and model the impact of injuries on point spreads
- Identify and exploit NFL-specific betting market patterns including key numbers, teasers, and weather effects
In This Chapter
Chapter 15: Modeling the NFL
"Football is a game of inches, and inches are what we measure." --- Bill Belichick (attributed)
The National Football League occupies a unique position in the American sports betting landscape. It generates more handle than any other sport in the United States, commands the attention of the sharpest bettors and most sophisticated modeling operations, and offers a market that is simultaneously among the most efficient and yet still contains exploitable edges for the well-prepared analyst. The NFL is where the largest volumes of sharp money flow, where closing lines are most respected as probability benchmarks, and where the analytical revolution that began with Moneyball in baseball has now fully arrived.
This chapter is the first in Part IV, where we move from general modeling frameworks to sport-specific applications. Everything you learned in Chapters 9 through 11---regression modeling, feature engineering, and model evaluation---now meets the particular structure, data, and market dynamics of professional American football. The NFL presents unique modeling challenges: a small sample size of only 272 regular-season games per year, enormous week-to-week variance, the outsized importance of the quarterback position, and a betting market shaped by massive public interest. It also presents unique opportunities: rich play-by-play data, well-studied efficiency metrics, and persistent market biases rooted in the sport's structure.
By the end of this chapter, you will have a working NFL spread prediction model, a framework for adjusting predictions based on injuries, and a deep understanding of the market patterns that create betting opportunities.
In this chapter, you will learn to:
- Access and process NFL play-by-play data programmatically
- Build custom efficiency metrics from raw play data
- Construct a spread prediction model using regression techniques
- Quantify and adjust for player injuries, especially at quarterback
- Exploit NFL-specific market inefficiencies including key numbers and teasers
15.1 The NFL Data Landscape
The Revolution in Public Football Data
A decade ago, building a serious NFL model required either expensive proprietary data subscriptions or laborious manual data collection. Today, the NFL data landscape is remarkably rich and almost entirely free. The open-source community---led by projects like nflfastR and its Python counterpart nfl_data_py---has democratized access to play-by-play data that includes pre-calculated advanced metrics like Expected Points Added (EPA) and win probability. This section introduces the key data sources, their structure, and the Python code you will use throughout this chapter.
nflfastR and nfl_data_py
The nflfastR project (an R package whose data is also accessible from Python via nfl_data_py) provides play-by-play data for every NFL game dating back to 1999. Each row in the dataset represents a single play, with over 300 columns of information including:
- Basic play information: game ID, date, quarter, time remaining, down, distance, yard line, play type (run, pass, punt, field goal, etc.)
- Result information: yards gained, first down indicator, turnover indicator, scoring
- Pre-calculated analytics: EPA (Expected Points Added), WPA (Win Probability Added), air yards, yards after catch, completion probability, CPOE
- Player identifiers: passer, rusher, receiver, with consistent ID mapping across seasons
Let us begin by installing and loading the data.
# Installation
# pip install nfl_data_py pandas numpy
import nfl_data_py as nfl
import pandas as pd
import numpy as np
# Load play-by-play data for recent seasons
seasons = [2022, 2023, 2024]
pbp = nfl.import_pbp_data(seasons)
print(f"Total plays loaded: {len(pbp):,}")
print(f"Columns available: {len(pbp.columns)}")
print(f"Seasons: {pbp['season'].unique()}")
print(f"Games per season: {pbp.groupby('season')['game_id'].nunique().to_dict()}")
Total plays loaded: 131,847
Columns available: 372
Seasons: [2022 2023 2024]
Games per season: {2022: 285, 2023: 285, 2024: 285}
The 285 games per season include 272 regular-season games and 13 postseason games. Let us examine the key columns we will use throughout this chapter.
# Filter to regular season and meaningful play types
regular_season = pbp[pbp['season_type'] == 'REG'].copy()
# Focus on run and pass plays (exclude special teams, penalties, etc.)
plays = regular_season[regular_season['play_type'].isin(['pass', 'run'])].copy()
print(f"Regular season run/pass plays: {len(plays):,}")
print(f"\nPlay type distribution:")
print(plays['play_type'].value_counts())
# Key columns for modeling
key_cols = [
'game_id', 'season', 'week', 'posteam', 'defteam',
'play_type', 'yards_gained', 'epa', 'success',
'down', 'ydstogo', 'yardline_100',
'pass_attempt', 'rush_attempt', 'first_down',
'air_yards', 'yards_after_catch', 'cpoe',
'wp', 'wpa', 'score_differential'
]
print(f"\nSample plays (key columns):")
print(plays[key_cols].head(10).to_string())
Understanding Expected Points Added (EPA)
EPA is the foundational metric in modern football analytics. It measures the value of each play relative to what would be expected given the game situation before the play. The concept rests on an Expected Points (EP) model, which estimates the expected value of the next score in the game (positive if the team with possession is likelier to score next, negative if its opponent is), given the down, distance, yard line, and other situational factors.
The Expected Points curve is derived empirically from historical data. For example:
- 1st and 10 at your own 20-yard line: EP $\approx$ 0.5 (the next score is worth about half a point to you, on average)
- 1st and 10 at midfield: EP $\approx$ 2.0
- 1st and goal at the opponent's 2-yard line: EP $\approx$ 5.9
EPA for a single play is then:
$$\text{EPA} = \text{EP}_{\text{after}} - \text{EP}_{\text{before}}$$
A play that moves the offense from a situation worth 1.0 expected points to a situation worth 2.5 expected points has an EPA of +1.5. A fumble that hands the ball to the opponent in a situation where they now have 3.0 expected points produces a large negative EPA for the offense.
Why is EPA superior to raw yards? Consider two plays:
1. A 5-yard gain on 3rd and 3 (converts a first down): high EPA
2. A 5-yard gain on 3rd and 8 (fails to convert): negative or near-zero EPA
Both plays gained 5 yards, but their values to the offense are dramatically different. EPA captures this situational context. It is the single most important metric in this chapter.
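To make the arithmetic concrete, here is a toy version of that comparison. The EP values below are illustrative assumptions, not outputs of the actual nflfastR EP model:
# Toy EPA calculation for the two 5-yard gains above.
# All EP values here are assumed for illustration.
ep_before_3rd_and_3 = 2.2    # 3rd and 3 at the opponent's 40 (assumed)
ep_after_conversion = 3.0    # 1st and 10 at the opponent's 35 (assumed)
epa_convert = ep_after_conversion - ep_before_3rd_and_3
ep_before_3rd_and_8 = 1.9    # 3rd and 8 at the opponent's 40 (assumed)
ep_after_failure = 1.2       # 4th and 3, likely punt or FG attempt (assumed)
epa_fail = ep_after_failure - ep_before_3rd_and_8
print(f"5-yard gain converting 3rd and 3: EPA = {epa_convert:+.1f}")
print(f"5-yard gain failing on 3rd and 8: EPA = {epa_fail:+.1f}")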
# Calculate team-level EPA metrics per game
team_game_offense = plays.groupby(['game_id', 'season', 'week', 'posteam']).agg(
total_epa=('epa', 'sum'),
epa_per_play=('epa', 'mean'),
plays_count=('epa', 'count'),
pass_epa=('epa', lambda x: x[plays.loc[x.index, 'play_type'] == 'pass'].sum()),
rush_epa=('epa', lambda x: x[plays.loc[x.index, 'play_type'] == 'run'].sum()),
success_rate=('success', 'mean'),
pass_rate=('pass_attempt', 'mean')
).reset_index()
team_game_offense.columns = [
'game_id', 'season', 'week', 'team',
'off_total_epa', 'off_epa_play', 'off_plays',
'off_pass_epa', 'off_rush_epa', 'off_success_rate', 'pass_rate'
]
# Similarly for defense (what the defense allowed)
team_game_defense = plays.groupby(['game_id', 'season', 'week', 'defteam']).agg(
def_total_epa=('epa', 'sum'),
def_epa_play=('epa', 'mean'),
def_plays=('epa', 'count'),
def_success_rate=('success', 'mean')
).reset_index()
team_game_defense.columns = [
'game_id', 'season', 'week', 'team',
'def_total_epa', 'def_epa_play', 'def_plays', 'def_success_rate'
]
print("Offensive EPA per play leaders (2024 season):")
season_2024_off = team_game_offense[team_game_offense['season'] == 2024]
season_off_avg = season_2024_off.groupby('team')['off_epa_play'].mean().sort_values(ascending=False)
print(season_off_avg.head(10).to_string())
Tracking Data and Next Gen Stats
Beyond play-by-play data, the NFL collects player tracking data via RFID chips in shoulder pads. This data, available in limited form through the NFL's Next Gen Stats platform, records each player's position on the field at 10 frames per second. While raw tracking data is not publicly available outside of the annual Big Data Bowl competition, several derived metrics are published:
- Completion Probability: The probability that a pass will be completed, based on the receiver's separation, distance to the nearest defender, air distance, and other factors
- Expected Rushing Yards: The expected yards on a rushing attempt given the positions of blockers and defenders
- Separation: The distance between a receiver and the nearest defender at the time of the throw
- Time to Throw: The elapsed time from snap to throw
These metrics are accessible via nfl_data_py's Next Gen Stats import functions:
# Load Next Gen Stats data
ngs_passing = nfl.import_ngs_data(stat_type='passing', years=[2022, 2023, 2024])
ngs_rushing = nfl.import_ngs_data(stat_type='rushing', years=[2022, 2023, 2024])
ngs_receiving = nfl.import_ngs_data(stat_type='receiving', years=[2022, 2023, 2024])
print("Next Gen Stats - Passing columns:")
print(ngs_passing.columns.tolist())
# Key NGS passing metrics
print("\nTop QBs by avg completion % over expected (2024):")
ngs_2024 = ngs_passing[
(ngs_passing['season'] == 2024) &
(ngs_passing['season_type'] == 'REG') &
(ngs_passing['week'] == 0) # Season-level aggregation
]
print(ngs_2024.sort_values('completion_percentage_above_expectation',
                           ascending=False)
      [['player_display_name', 'avg_completed_air_yards',
        'completion_percentage_above_expectation',
        'avg_time_to_throw']].head(10).to_string())
Other Key Data Sources
Beyond nflfastR, several additional data sources are valuable for NFL modeling:
| Data Source | Content | Access |
|---|---|---|
| nfl_data_py | Play-by-play, rosters, schedules, injuries, Next Gen Stats | pip install nfl_data_py |
| Pro Football Reference | Historical stats, snap counts, advanced stats | Web (free) |
| Football Outsiders | DVOA, DYAR, proprietary metrics | Web (partial paywall) |
| ESPN QBR | Total Quarterback Rating | ESPN API |
| Vegas Insider / Covers | Historical odds, lines, results | Web (free) |
| Sportsbook Review | Opening and closing lines, consensus | Web (free) |
# Load supporting data
rosters = nfl.import_rosters([2024])
schedule = nfl.import_schedules([2024])
injuries = nfl.import_injuries([2024])
print(f"Roster entries: {len(rosters)}")
print(f"Schedule entries: {len(schedule)}")
# Schedule data includes spread and over/under
print("\nSchedule columns with betting data:")
betting_cols = [c for c in schedule.columns if any(
term in c.lower() for term in ['spread', 'over', 'under', 'line', 'moneyline']
)]
print(betting_cols)
# Example: Week 1 games with spreads
week1 = schedule[schedule['week'] == 1][
['game_id', 'away_team', 'home_team', 'spread_line', 'total_line',
'home_score', 'away_score']
]
print("\nWeek 1 games:")
print(week1.to_string())
Building a Clean Game-Level Dataset
For modeling purposes, we need to aggregate play-level data into game-level and then season-level summaries. Here is a comprehensive function that builds the dataset we will use throughout this chapter.
def build_game_dataset(pbp_data, season_filter=None):
"""
Build a game-level dataset from play-by-play data.
Returns a DataFrame with one row per team per game,
including offensive and defensive efficiency metrics.
"""
df = pbp_data.copy()
if season_filter:
df = df[df['season'].isin(season_filter)]
# Filter to regular season, run/pass plays
df = df[(df['season_type'] == 'REG') &
(df['play_type'].isin(['pass', 'run']))].copy()
# Offensive stats
off_stats = df.groupby(['game_id', 'season', 'week', 'posteam']).agg(
off_epa_play=('epa', 'mean'),
off_epa_total=('epa', 'sum'),
off_success_rate=('success', 'mean'),
off_plays=('epa', 'count'),
off_pass_rate=('pass_attempt', 'mean'),
off_yards_play=('yards_gained', 'mean'),
off_first_down_rate=('first_down', 'mean'),
off_epa_pass=('epa', lambda x: x[df.loc[x.index, 'pass_attempt'] == 1].mean()),
off_epa_rush=('epa', lambda x: x[df.loc[x.index, 'rush_attempt'] == 1].mean()),
).reset_index()
off_stats.rename(columns={'posteam': 'team'}, inplace=True)
# Defensive stats
def_stats = df.groupby(['game_id', 'season', 'week', 'defteam']).agg(
def_epa_play=('epa', 'mean'),
def_epa_total=('epa', 'sum'),
def_success_rate=('success', 'mean'),
def_plays=('epa', 'count'),
def_epa_pass=('epa', lambda x: x[df.loc[x.index, 'pass_attempt'] == 1].mean()),
def_epa_rush=('epa', lambda x: x[df.loc[x.index, 'rush_attempt'] == 1].mean()),
).reset_index()
def_stats.rename(columns={'defteam': 'team'}, inplace=True)
# Merge offensive and defensive
game_data = off_stats.merge(
def_stats, on=['game_id', 'season', 'week', 'team'], how='outer'
)
# Add game results from schedule
schedule = nfl.import_schedules(game_data['season'].unique().tolist())
schedule_home = schedule[['game_id', 'home_team', 'away_team',
'home_score', 'away_score',
'spread_line', 'total_line']].copy()
game_data = game_data.merge(schedule_home, on='game_id', how='left')
# Calculate margin and result
game_data['is_home'] = (game_data['team'] == game_data['home_team']).astype(int)
game_data['points_for'] = np.where(
game_data['is_home'] == 1,
game_data['home_score'],
game_data['away_score']
)
game_data['points_against'] = np.where(
game_data['is_home'] == 1,
game_data['away_score'],
game_data['home_score']
)
game_data['margin'] = game_data['points_for'] - game_data['points_against']
return game_data
# Build the dataset
games = build_game_dataset(pbp, season_filter=[2022, 2023, 2024])
print(f"Game-level dataset: {len(games)} team-game rows")
print(f"Columns: {list(games.columns)}")
15.2 DVOA and Efficiency Metrics
Football Outsiders' DVOA
Defense-adjusted Value Over Average (DVOA) is the signature metric of Football Outsiders, and understanding it is essential for anyone modeling the NFL. DVOA measures a team's efficiency on every play compared to a baseline and adjusts for the quality of opponents faced.
The DVOA calculation works as follows:
1. Success value assignment: Each play is assigned a success value based on its outcome relative to the situation. A first down on 3rd and 2 is more valuable than a first down on 1st and 10.
2. Comparison to average: Each play's success value is compared to the league-average success value in the same situation (down, distance, location on the field, quarter, score differential).
3. Opponent adjustment: The raw values are adjusted for the quality of the opponent's defense (for offensive DVOA) or offense (for defensive DVOA), creating a strength-of-schedule correction.
The formula can be expressed conceptually as:
$$\text{DVOA} = \frac{\sum_{i=1}^{N} (\text{PlayValue}_i - \text{AvgPlayValue}_{\text{situation}_i})}{\sum_{i=1}^{N} \text{AvgPlayValue}_{\text{situation}_i}} \times \text{OppAdj}$$
DVOA is expressed as a percentage. An offensive DVOA of +15.0% means the team was 15% better than average after adjusting for opponents. For defense, negative DVOA is better (allowing fewer points than average).
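Steps 1 and 2 can be approximated directly from play-by-play data. The sketch below uses EPA as the play value and a simple down-and-distance bucketing as the situational baseline, a much coarser partition than the real DVOA uses:
# DVOA-like situational baseline: score each play against the league
# average for its down-and-distance bucket (no opponent adjustment yet).
plays['dist_bucket'] = pd.cut(
    plays['ydstogo'], bins=[0, 3, 6, 10, 100],
    labels=['short', 'medium', 'long', 'very_long']
)
situation_avg = plays.groupby(['down', 'dist_bucket'])['epa'].transform('mean')
plays['value_over_situation'] = plays['epa'] - situation_avg
team_vos = (plays.groupby('posteam')['value_over_situation']
            .mean()
            .sort_values(ascending=False))
print("Value over situational average (EPA-based):")
print(team_vos.head(10).to_string())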
While we cannot replicate the exact proprietary DVOA methodology, we can build a similar metric using EPA data.
Building an EPA-Based Efficiency System
EPA already accomplishes much of what DVOA does---measuring play value relative to situation---but without the opponent adjustment. Let us build a system that adds opponent adjustment and creates a comprehensive efficiency profile for each team.
def calculate_adjusted_epa(games_df, season, through_week=None):
"""
Calculate opponent-adjusted EPA metrics for each team in a season.
Uses iterative adjustment: each team's efficiency is adjusted
by the average quality of opponents faced.
"""
season_data = games_df[games_df['season'] == season].copy()
if through_week:
season_data = season_data[season_data['week'] <= through_week]
# Step 1: Raw EPA averages per team
team_raw = season_data.groupby('team').agg(
raw_off_epa=('off_epa_play', 'mean'),
raw_def_epa=('def_epa_play', 'mean'),
raw_off_pass_epa=('off_epa_pass', 'mean'),
raw_off_rush_epa=('off_epa_rush', 'mean'),
raw_def_pass_epa=('def_epa_pass', 'mean'),
raw_def_rush_epa=('def_epa_rush', 'mean'),
raw_success_rate=('off_success_rate', 'mean'),
games_played=('game_id', 'count')
).reset_index()
# Step 2: Calculate opponent strength for each game
# For each team-game, find the opponent
season_data['opponent'] = np.where(
season_data['is_home'] == 1,
season_data['away_team'],
season_data['home_team']
)
# Step 3: Iterative opponent adjustment (3 iterations)
adj_off = team_raw.set_index('team')['raw_off_epa'].to_dict()
adj_def = team_raw.set_index('team')['raw_def_epa'].to_dict()
for iteration in range(3):
new_adj_off = {}
new_adj_def = {}
for team in team_raw['team'].values:
team_games = season_data[season_data['team'] == team]
opponents = team_games['opponent'].values
# Opponent defensive strength (for adjusting offense)
opp_def_strength = np.mean([adj_def.get(opp, 0) for opp in opponents])
# Opponent offensive strength (for adjusting defense)
opp_off_strength = np.mean([adj_off.get(opp, 0) for opp in opponents])
raw_off = team_raw[team_raw['team'] == team]['raw_off_epa'].values[0]
raw_def = team_raw[team_raw['team'] == team]['raw_def_epa'].values[0]
# Adjust: better offense if you faced tough defenses
new_adj_off[team] = raw_off - opp_def_strength
# Adjust: better defense if you faced tough offenses
new_adj_def[team] = raw_def - opp_off_strength
adj_off = new_adj_off
adj_def = new_adj_def
# Build final ratings DataFrame
ratings = pd.DataFrame({
'team': list(adj_off.keys()),
'adj_off_epa': list(adj_off.values()),
'adj_def_epa': list(adj_def.values()),
})
# Add raw metrics back
ratings = ratings.merge(team_raw, on='team')
# Overall rating: offense minus defense (lower def EPA is better)
ratings['overall_rating'] = ratings['adj_off_epa'] - ratings['adj_def_epa']
ratings = ratings.sort_values('overall_rating', ascending=False)
return ratings
# Calculate ratings for 2024
ratings_2024 = calculate_adjusted_epa(games, season=2024)
print("2024 NFL Team Ratings (Opponent-Adjusted EPA):")
print(ratings_2024[['team', 'adj_off_epa', 'adj_def_epa', 'overall_rating',
'games_played']].head(15).to_string(index=False))
Success Rate
While EPA captures the magnitude of each play's impact, success rate provides a complementary view by measuring consistency. A common yardage-based definition counts a play as "successful" using the following thresholds:
- 1st down: gaining 50% or more of the yards needed (i.e., 5+ yards on 1st and 10)
- 2nd down: gaining 70% or more of the remaining yards needed
- 3rd/4th down: gaining 100% of the yards needed (converting the first down or scoring)
$$\text{Success Rate} = \frac{\text{Number of Successful Plays}}{\text{Total Plays}}$$
The league-average success rate is approximately 47%. Teams above 50% are consistently moving the chains; teams below 44% are struggling to sustain drives.
Success rate and EPA are correlated but distinct. A team can have a moderate EPA boosted by a few explosive plays while having a low success rate, indicating inconsistency. Conversely, a team that steadily gains 4-5 yards per play may have a high success rate but only average EPA. For modeling purposes, both metrics provide independent predictive signal.
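One caveat: the success column that ships with nflfastR data is defined as EPA > 0 rather than by yardage thresholds. If you want the threshold-based definition, it is straightforward to compute yourself; a sketch using the thresholds above:
# Threshold-based success, computed from raw play data. The 'success'
# column in nflfastR is EPA > 0, so the two measures differ slightly.
def yardage_success(row):
    if row['down'] == 1:
        return row['yards_gained'] >= 0.5 * row['ydstogo']
    elif row['down'] == 2:
        return row['yards_gained'] >= 0.7 * row['ydstogo']
    else:  # 3rd/4th down: must gain all remaining yards
        return row['yards_gained'] >= row['ydstogo']
plays['success_yardage'] = plays.apply(yardage_success, axis=1)
print(f"EPA-based success rate (nflfastR): {plays['success'].mean():.1%}")
print(f"Threshold-based success rate: {plays['success_yardage'].mean():.1%}")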
# Success rate by team
success_rates = games[games['season'] == 2024].groupby('team').agg(
off_success=('off_success_rate', 'mean'),
def_success=('def_success_rate', 'mean')
).reset_index()
success_rates['success_margin'] = success_rates['off_success'] - success_rates['def_success']
success_rates = success_rates.sort_values('success_margin', ascending=False)
print("2024 Success Rate Rankings:")
print(success_rates.to_string(index=False, float_format='%.3f'))
Completion Percentage Over Expected (CPOE)
CPOE isolates a quarterback's accuracy from the difficulty of his throws. Using the NFL's Next Gen Stats completion probability model, each pass attempt is assigned an expected completion percentage based on:
- Air distance to the receiver
- Separation from the nearest defender
- Whether the receiver is in the end zone
- Whether the quarterback was under pressure
CPOE is then:
$$\text{CPOE} = \text{Actual Completion \%} - \text{Expected Completion \%}$$
A quarterback with a CPOE of +4.0% completes passes 4 percentage points more often than the model expects given the difficulty of his throws. CPOE has been shown to be one of the most stable quarterback metrics from season to season, making it especially valuable for projections.
# CPOE from play-by-play data
pass_plays = plays[
(plays['play_type'] == 'pass') &
(plays['cpoe'].notna()) &
(plays['season'] == 2024)
].copy()
qb_cpoe = pass_plays.groupby(['passer_player_name', 'posteam']).agg(
cpoe_mean=('cpoe', 'mean'),
pass_attempts=('cpoe', 'count'),
epa_per_dropback=('epa', 'mean'),
success_rate=('success', 'mean')
).reset_index()
# Filter to QBs with significant playing time
qb_cpoe = qb_cpoe[qb_cpoe['pass_attempts'] >= 200]
qb_cpoe = qb_cpoe.sort_values('cpoe_mean', ascending=False)
print("2024 QB CPOE Rankings (min 200 attempts):")
print(qb_cpoe.to_string(index=False, float_format='%.2f'))
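The stability claim above is easy to check with the data already loaded: compute each quarterback's CPOE in consecutive seasons and correlate the two. A quick sketch:
# Year-over-year CPOE stability: correlate each QB's 2023 and 2024 CPOE
# (minimum 200 attempts in each season).
qb_seasons = (plays[(plays['play_type'] == 'pass') & plays['cpoe'].notna()]
              .groupby(['passer_player_name', 'season'])
              .agg(cpoe=('cpoe', 'mean'), attempts=('cpoe', 'count'))
              .reset_index())
qb_seasons = qb_seasons[qb_seasons['attempts'] >= 200]
wide = (qb_seasons.pivot(index='passer_player_name',
                         columns='season', values='cpoe')
        .dropna(subset=[2023, 2024]))
print(f"QBs with 200+ attempts in both 2023 and 2024: {len(wide)}")
print(f"Year-over-year CPOE correlation: r = {wide[2023].corr(wide[2024]):.2f}")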
Creating Your Own Composite Efficiency Metric
The metrics above---EPA, success rate, CPOE---each capture different aspects of team quality. For modeling, we want a single composite rating that incorporates multiple signals. Here is a weighted composite approach:
def build_composite_rating(games_df, season, through_week=None,
recency_weight=0.7, window=6):
"""
Build a composite team rating incorporating:
- Opponent-adjusted EPA (offense and defense)
- Success rate margin
- Pass/rush EPA splits
- Recent performance weighting
Parameters:
-----------
recency_weight : float
How much weight to give to the most recent 'window' games
vs. the full season (0-1)
window : int
Number of recent games for recency weighting
"""
season_data = games_df[games_df['season'] == season].copy()
if through_week:
season_data = season_data[season_data['week'] <= through_week]
teams = season_data['team'].unique()
ratings = []
for team in teams:
team_data = season_data[season_data['team'] == team].sort_values('week')
if len(team_data) < 3:
continue
# Full season metrics
full_off_epa = team_data['off_epa_play'].mean()
full_def_epa = team_data['def_epa_play'].mean()
full_success = team_data['off_success_rate'].mean()
full_def_success = team_data['def_success_rate'].mean()
# Recent window metrics
recent = team_data.tail(window)
recent_off_epa = recent['off_epa_play'].mean()
recent_def_epa = recent['def_epa_play'].mean()
recent_success = recent['off_success_rate'].mean()
recent_def_success = recent['def_success_rate'].mean()
# Weighted combination
w = recency_weight
off_epa = w * recent_off_epa + (1 - w) * full_off_epa
def_epa = w * recent_def_epa + (1 - w) * full_def_epa
success_margin = (w * recent_success + (1 - w) * full_success) - \
(w * recent_def_success + (1 - w) * full_def_success)
# Composite: 60% EPA margin, 25% success margin, 15% pass EPA premium
epa_margin = off_epa - def_epa
pass_epa_off = team_data['off_epa_pass'].mean()
pass_epa_def = team_data['def_epa_pass'].mean()
pass_margin = pass_epa_off - pass_epa_def
composite = 0.60 * epa_margin + 0.25 * success_margin + 0.15 * pass_margin
ratings.append({
'team': team,
'composite_rating': composite,
'off_epa': off_epa,
'def_epa': def_epa,
'epa_margin': epa_margin,
'success_margin': success_margin,
'pass_margin': pass_margin,
'games': len(team_data)
})
return pd.DataFrame(ratings).sort_values('composite_rating', ascending=False)
# Build Week 12 ratings for 2024
composite = build_composite_rating(games, season=2024, through_week=12)
print("Composite Team Ratings (through Week 12, 2024):")
print(composite[['team', 'composite_rating', 'off_epa', 'def_epa',
'epa_margin']].head(16).to_string(index=False, float_format='%.3f'))
15.3 Spread and Totals Modeling
The Power Ratings Approach
The most common approach to NFL spread prediction uses power ratings: a single number assigned to each team that represents its strength relative to the league. The predicted spread for a game between team A (home) and team B (away) is:
$$\text{Predicted Spread} = (\text{Rating}_{\text{home}} - \text{Rating}_{\text{away}}) + \text{HFA}$$
where HFA is the home-field advantage, typically expressed in points. Throughout this chapter, spreads are quoted from the home team's perspective: negative means the home team is favored. Be aware that data sources differ on this convention; in nflverse schedule data, spread_line is quoted the opposite way, as the home team's expected margin (positive when the home team is favored), so verify the sign before feeding real data into the code in this section. Home-field advantage in the NFL has declined steadily over the past two decades:
| Era | Average HFA (points) |
|---|---|
| 2000--2009 | 3.0 |
| 2010--2019 | 2.7 |
| 2020 (COVID, limited fans) | 0.6 |
| 2021--2024 | 1.5--2.0 |
The COVID-era data provided a natural experiment confirming that crowd noise and home comfort contribute significantly to HFA. Post-COVID, HFA has partially recovered but remains below historical norms.
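Before moving to regression, here is the power-ratings formula in code. It reuses the composite ratings from Section 15.2; the 14x scaling from EPA-per-play margin to points and the 1.8-point HFA are rough assumptions, consistent with the values used later in this chapter:
# Power-ratings spread prediction from composite ratings.
# scale=14 (EPA margin to points) and hfa=1.8 are rough assumptions.
def power_rating_spread(ratings_df, home_team, away_team, hfa=1.8, scale=14.0):
    home = ratings_df.loc[ratings_df['team'] == home_team, 'composite_rating'].iloc[0]
    away = ratings_df.loc[ratings_df['team'] == away_team, 'composite_rating'].iloc[0]
    predicted_margin = (home - away) * scale + hfa
    # Quote conventionally: negative spread means the home team is favored
    return -predicted_margin
print(f"Predicted spread: KC {power_rating_spread(composite, 'KC', 'LV'):+.1f} vs LV")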
Building a Regression-Based Spread Model
While power ratings are intuitive, a regression model allows us to weight multiple input features and quantify the uncertainty in our predictions. We will build a model that predicts game margin (home score minus away score) from team efficiency metrics.
from sklearn.linear_model import Ridge
from sklearn.model_selection import TimeSeriesSplit, cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error, mean_absolute_error
import warnings
warnings.filterwarnings('ignore')
def prepare_matchup_features(games_df, ratings_func, season, train_weeks, predict_week):
"""
Prepare features for spread prediction.
For each game in predict_week, calculate team ratings using
data only through train_weeks (no lookahead bias).
"""
# Build ratings using only data through the training window
ratings = ratings_func(games_df, season=season, through_week=train_weeks)
# Get games for the prediction week
schedule = nfl.import_schedules([season])
pred_games = schedule[schedule['week'] == predict_week].copy()
features = []
for _, game in pred_games.iterrows():
home = game['home_team']
away = game['away_team']
home_stats = ratings[ratings['team'] == home]
away_stats = ratings[ratings['team'] == away]
if len(home_stats) == 0 or len(away_stats) == 0:
continue
home_stats = home_stats.iloc[0]
away_stats = away_stats.iloc[0]
feature_row = {
'game_id': game['game_id'],
'home_team': home,
'away_team': away,
'actual_margin': game['home_score'] - game['away_score'],
'market_spread': game.get('spread_line', np.nan),
# Differential features
'off_epa_diff': home_stats['off_epa'] - away_stats['off_epa'],
'def_epa_diff': home_stats['def_epa'] - away_stats['def_epa'],
'composite_diff': home_stats['composite_rating'] - away_stats['composite_rating'],
'epa_margin_diff': home_stats['epa_margin'] - away_stats['epa_margin'],
'success_margin_diff': home_stats['success_margin'] - away_stats['success_margin'],
# Individual team features (for the model to learn interactions)
'home_off_epa': home_stats['off_epa'],
'home_def_epa': home_stats['def_epa'],
'away_off_epa': away_stats['off_epa'],
'away_def_epa': away_stats['def_epa'],
}
features.append(feature_row)
return pd.DataFrame(features)
# Build training data: use 2022-2023 seasons
# For each week, use ratings through the previous week to predict
all_matchups = []
for season in [2022, 2023]:
for week in range(4, 19): # Start at week 4 (need data to build ratings)
week_matchups = prepare_matchup_features(
games, build_composite_rating, season,
train_weeks=week-1, predict_week=week
)
all_matchups.append(week_matchups)
train_data = pd.concat(all_matchups, ignore_index=True)
train_data = train_data.dropna(subset=['actual_margin'])
print(f"Training data: {len(train_data)} games")
print(f"Features: {[c for c in train_data.columns if c not in ['game_id', 'home_team', 'away_team', 'actual_margin', 'market_spread']]}")
# Feature columns
feature_cols = [
'off_epa_diff', 'def_epa_diff', 'composite_diff',
'epa_margin_diff', 'success_margin_diff',
'home_off_epa', 'home_def_epa', 'away_off_epa', 'away_def_epa'
]
X_train = train_data[feature_cols].values
y_train = train_data['actual_margin'].values
# Ridge regression with cross-validation
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
# Test multiple regularization strengths
best_alpha = None
best_score = -np.inf
for alpha in [0.1, 0.5, 1.0, 5.0, 10.0, 50.0, 100.0]:
ridge = Ridge(alpha=alpha)
tscv = TimeSeriesSplit(n_splits=5)
scores = cross_val_score(ridge, X_train_scaled, y_train,
cv=tscv, scoring='neg_mean_squared_error')
mean_score = scores.mean()
if mean_score > best_score:
best_score = mean_score
best_alpha = alpha
print(f"Best alpha: {best_alpha}")
print(f"Best CV RMSE: {np.sqrt(-best_score):.2f}")
# Train final model
model = Ridge(alpha=best_alpha)
model.fit(X_train_scaled, y_train)
# Feature importance
for feat, coef in sorted(zip(feature_cols, model.coef_),
key=lambda x: abs(x[1]), reverse=True):
print(f" {feat:25s}: {coef:+.3f}")
print(f" {'Intercept (HFA)':25s}: {model.intercept_:+.3f}")
The intercept in this model represents the estimated home-field advantage after accounting for team quality differences. We expect it to be in the range of +1.5 to +2.5 in the modern NFL.
Evaluating Against the Market
The critical test of any model is not whether it predicts game outcomes accurately in an absolute sense, but whether it predicts outcomes more accurately than the market. The closing line set by sportsbooks reflects the collective wisdom of sharp bettors and sophisticated models. Beating the closing line consistently is the hallmark of a profitable model.
# Evaluate on 2024 season (out of sample)
test_matchups = []
for week in range(4, 19):
week_matchups = prepare_matchup_features(
games, build_composite_rating, 2024,
train_weeks=week-1, predict_week=week
)
test_matchups.append(week_matchups)
test_data = pd.concat(test_matchups, ignore_index=True)
test_data = test_data.dropna(subset=['actual_margin', 'market_spread'])
X_test = test_data[feature_cols].values
X_test_scaled = scaler.transform(X_test)
test_data['predicted_margin'] = model.predict(X_test_scaled)
# Model accuracy metrics
model_rmse = np.sqrt(mean_squared_error(test_data['actual_margin'],
test_data['predicted_margin']))
model_mae = mean_absolute_error(test_data['actual_margin'],
test_data['predicted_margin'])
# Market accuracy metrics
market_rmse = np.sqrt(mean_squared_error(test_data['actual_margin'],
-test_data['market_spread']))
market_mae = mean_absolute_error(test_data['actual_margin'],
-test_data['market_spread'])
print(f"Model RMSE: {model_rmse:.2f} | Market RMSE: {market_rmse:.2f}")
print(f"Model MAE: {model_mae:.2f} | Market MAE: {market_mae:.2f}")
# Correlation with actual margin
model_corr = np.corrcoef(test_data['actual_margin'],
test_data['predicted_margin'])[0, 1]
market_corr = np.corrcoef(test_data['actual_margin'],
-test_data['market_spread'])[0, 1]
print(f"Model r: {model_corr:.3f} | Market r: {market_corr:.3f}")
# ATS (Against the Spread) performance
test_data['model_edge'] = test_data['predicted_margin'] - (-test_data['market_spread'])
test_data['model_pick'] = np.where(test_data['model_edge'] > 0, 'home', 'away')
test_data['actual_ats'] = test_data['actual_margin'] + test_data['market_spread']
# Did the model's side cover?
test_data['model_correct'] = np.where(
test_data['model_pick'] == 'home',
test_data['actual_ats'] > 0,
test_data['actual_ats'] < 0
)
ats_record = test_data['model_correct'].mean()
print(f"\nModel ATS record: {test_data['model_correct'].sum()}-"
f"{(~test_data['model_correct']).sum()} "
f"({ats_record:.1%})")
Totals Modeling
Totals (over/under) prediction follows a similar framework but requires modeling the total points scored rather than the margin. The key insight is that total points and margin are partially independent: two teams can combine for 50 points with a margin of 28 (39-11) or 2 (26-24). The metrics that predict totals are therefore not the same ones that predict spreads.
Pace of play, offensive explosion, and defensive weakness all contribute to totals:
$$\text{Predicted Total} = \alpha + \beta_1 \cdot (\text{Home Off EPA}) + \beta_2 \cdot (\text{Away Off EPA}) + \beta_3 \cdot (\text{Home Def EPA}) + \beta_4 \cdot (\text{Away Def EPA}) + \beta_5 \cdot (\text{Pace}_{\text{home}}) + \beta_6 \cdot (\text{Pace}_{\text{away}})$$
def build_totals_model(games_df, train_seasons, test_season):
"""Build and evaluate a totals prediction model."""
# Add total points to games data
games_with_totals = games_df.copy()
games_with_totals['total_points'] = (
games_with_totals['points_for'] + games_with_totals['points_against']
)
# Calculate pace: plays per game
pace = games_with_totals.groupby(['season', 'team']).agg(
avg_plays=('off_plays', 'mean')
).reset_index()
# Build features for each game (need both teams' stats)
schedule = nfl.import_schedules(train_seasons + [test_season])
all_data = []
for _, game_row in schedule[schedule['season_type'] == 'REG'].iterrows():
season = game_row['season']
home = game_row['home_team']
away = game_row['away_team']
        # Note: full-season averages look ahead within the season; a live
        # model should restrict to weeks before the game (as in Section 15.3)
        team_season = games_with_totals[games_with_totals['season'] == season]
home_data = team_season[team_season['team'] == home]
away_data = team_season[team_season['team'] == away]
if len(home_data) < 3 or len(away_data) < 3:
continue
all_data.append({
'season': season,
'actual_total': game_row['home_score'] + game_row['away_score'],
'market_total': game_row.get('total_line', np.nan),
'home_off_epa': home_data['off_epa_play'].mean(),
'home_def_epa': home_data['def_epa_play'].mean(),
'away_off_epa': away_data['off_epa_play'].mean(),
'away_def_epa': away_data['def_epa_play'].mean(),
'home_pace': home_data['off_plays'].mean(),
'away_pace': away_data['off_plays'].mean(),
            'home_pass_rate': home_data['off_pass_rate'].mean() if 'off_pass_rate' in home_data.columns else 0.6,
            'away_pass_rate': away_data['off_pass_rate'].mean() if 'off_pass_rate' in away_data.columns else 0.6,
})
df = pd.DataFrame(all_data).dropna()
# Train/test split
train = df[df['season'].isin(train_seasons)]
test = df[df['season'] == test_season]
feature_cols = ['home_off_epa', 'home_def_epa', 'away_off_epa', 'away_def_epa',
'home_pace', 'away_pace', 'home_pass_rate', 'away_pass_rate']
X_train = train[feature_cols].values
y_train = train['actual_total'].values
X_test = test[feature_cols].values
y_test = test['actual_total'].values
scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)
X_test_s = scaler.transform(X_test)
model = Ridge(alpha=10.0)
model.fit(X_train_s, y_train)
predictions = model.predict(X_test_s)
model_rmse = np.sqrt(mean_squared_error(y_test, predictions))
market_rmse = np.sqrt(mean_squared_error(y_test, test['market_total'].values))
print(f"Totals Model RMSE: {model_rmse:.2f}")
print(f"Market Total RMSE: {market_rmse:.2f}")
return model, scaler
totals_model, totals_scaler = build_totals_model(
games, train_seasons=[2022, 2023], test_season=2024
)
Combining Model Output with Market Lines
A naive approach is to bet whenever your model disagrees with the market. A more sophisticated approach blends your model's prediction with the market line, recognizing that the market contains valuable information:
$$\text{Blended Prediction} = w \cdot \text{Model Prediction} + (1 - w) \cdot \text{Market Line}$$
The optimal weight $w$ depends on your model's relative accuracy. If your model's RMSE is 14.0 and the market's is 13.5, an optimal blend might use $w \approx 0.3$---leaning heavily on the market but incorporating your model's independent signal. The value comes from the blend, not from replacing the market entirely.
def calculate_optimal_blend(model_preds, market_preds, actuals):
"""
Find the optimal blending weight between model and market predictions.
Uses grid search to minimize RMSE.
"""
best_w = 0
best_rmse = np.inf
for w in np.arange(0, 1.01, 0.05):
blended = w * model_preds + (1 - w) * market_preds
rmse = np.sqrt(mean_squared_error(actuals, blended))
if rmse < best_rmse:
best_rmse = rmse
best_w = w
print(f"Optimal model weight: {best_w:.2f}")
print(f"Blended RMSE: {best_rmse:.2f}")
print(f"Model-only RMSE: {np.sqrt(mean_squared_error(actuals, model_preds)):.2f}")
print(f"Market-only RMSE: {np.sqrt(mean_squared_error(actuals, market_preds)):.2f}")
return best_w
# Example usage
optimal_w = calculate_optimal_blend(
test_data['predicted_margin'].values,
-test_data['market_spread'].values,
test_data['actual_margin'].values
)
15.4 Player-Level Impact and Injuries
The Outsized Importance of the Quarterback
No position in team sports has as much influence on outcomes as the NFL quarterback. A quarterback touches the ball on every offensive play, makes pre-snap reads that determine blocking assignments and route adjustments, and executes the throws or handoffs that determine play success. Research consistently shows that the difference between an elite quarterback and a replacement-level quarterback is worth approximately 8 to 12 points per game---far more than any other single position.
To quantify this, we can measure the EPA difference between starting quarterbacks and their backups using historical data:
def quantify_qb_value(pbp_data, season):
"""
Measure the value of each team's starting QB by comparing
their EPA/play to the team's backup QB EPA/play.
"""
season_data = pbp_data[
(pbp_data['season'] == season) &
(pbp_data['season_type'] == 'REG') &
(pbp_data['play_type'] == 'pass') &
(pbp_data['passer_player_name'].notna())
].copy()
# Group by team and passer
qb_stats = season_data.groupby(['posteam', 'passer_player_name']).agg(
epa_play=('epa', 'mean'),
total_epa=('epa', 'sum'),
dropbacks=('epa', 'count'),
cpoe=('cpoe', 'mean'),
success_rate=('success', 'mean')
).reset_index()
# For each team, identify starter (most dropbacks) and backup
team_qb_value = []
for team in qb_stats['posteam'].unique():
team_qbs = qb_stats[qb_stats['posteam'] == team].sort_values(
'dropbacks', ascending=False
)
if len(team_qbs) < 1:
continue
starter = team_qbs.iloc[0]
# Calculate league replacement level
all_backups = qb_stats[qb_stats['dropbacks'] < 100]
replacement_epa = all_backups['epa_play'].mean()
# Backup EPA: use actual backup if available, else replacement level
if len(team_qbs) >= 2:
backup = team_qbs.iloc[1]
backup_epa = backup['epa_play'] if backup['dropbacks'] >= 30 else replacement_epa
else:
backup_epa = replacement_epa
# Value over replacement per dropback
epa_vor = starter['epa_play'] - replacement_epa
# Estimated point value per game (assume ~35 dropbacks per game)
dropbacks_per_game = 35
points_per_epa = 1.0 # Approximate conversion
point_value_per_game = epa_vor * dropbacks_per_game * points_per_epa
team_qb_value.append({
'team': team,
'starter': starter['passer_player_name'],
'starter_epa': starter['epa_play'],
'starter_cpoe': starter['cpoe'],
'starter_dropbacks': starter['dropbacks'],
'backup_epa': backup_epa,
'replacement_epa': replacement_epa,
'epa_over_replacement': epa_vor,
'estimated_point_value': point_value_per_game
})
return pd.DataFrame(team_qb_value).sort_values(
'estimated_point_value', ascending=False
)
qb_values = quantify_qb_value(pbp, 2024)
print("2024 QB Value Over Replacement:")
print(qb_values[['team', 'starter', 'starter_epa', 'epa_over_replacement',
'estimated_point_value']].head(15).to_string(index=False, float_format='%.3f'))
Injury Impact Modeling
When a quarterback is injured, the betting market adjusts the spread. The question for the modeler is: does the market adjust enough? Research suggests that the market generally handles high-profile quarterback injuries efficiently but may underreact to injuries at other positions, particularly when multiple starters are missing along the offensive line or in the secondary.
The framework for injury adjustment is:
$$\text{Adjusted Spread} = \text{Base Spread} + \sum_{i \in \text{injured}} \Delta_i$$
where $\Delta_i$ is the estimated point impact of losing player $i$. For quarterbacks, this impact can be estimated from the EPA model above. For other positions, we need a different approach.
def estimate_positional_injury_impact():
"""
Estimated point impact per game of losing a starter at each position.
Based on historical analysis of team performance with/without starters.
These values represent approximate AVERAGE impacts. Individual players
may be worth significantly more or less.
"""
position_impacts = {
'QB': {
'elite_to_backup': -8.0, # e.g., Mahomes to backup
'good_to_backup': -5.0, # e.g., solid starter to backup
'average_to_backup': -3.0, # e.g., mediocre starter to backup
'bad_to_backup': -0.5, # starter is barely above backup
},
'Non-QB Positions (approximate per starter)': {
'LT': -1.2,
'Other OL': -0.7,
'WR1': -1.0,
'WR2': -0.5,
'RB': -0.3,
'TE': -0.5,
'Edge Rusher': -0.8,
'CB1': -0.7,
'Interior DL': -0.5,
'LB': -0.4,
'Safety': -0.4,
}
}
return position_impacts
# Display injury impact estimates
impacts = estimate_positional_injury_impact()
print("Estimated Point Impact of Losing a Starter:")
print("\nQuarterback:")
for level, impact in impacts['QB'].items():
print(f" {level:25s}: {impact:+.1f} points")
print("\nOther positions:")
for pos, impact in impacts['Non-QB Positions (approximate per starter)'].items():
print(f" {pos:25s}: {impact:+.1f} points")
The Replacement-Level Concept
The concept of "replacement level" is borrowed from baseball's WAR (Wins Above Replacement) framework. In the NFL context, replacement level represents the performance you would expect from a freely available player---the type of player who can be signed off the practice squad or obtained via a minimum-salary transaction.
For quarterbacks, the replacement level in EPA/play is approximately -0.15 to -0.10 (below average, as the average includes starters who are well above replacement). The difference between this replacement level and a team's starter is the value above replacement that the team loses when the starter is injured.
def build_injury_adjusted_prediction(base_prediction, home_team, away_team,
injuries_home, injuries_away,
qb_values_df, season):
"""
Adjust a base spread prediction for injuries.
Parameters:
-----------
base_prediction : float
Model's predicted margin (home - away) before injury adjustment
injuries_home : list of dicts
Each dict has 'player', 'position', 'status' ('out', 'doubtful', 'questionable')
injuries_away : list of dicts
Same format as injuries_home
"""
# Status probabilities of missing the game
status_prob = {
'out': 1.0,
'doubtful': 0.85,
'questionable': 0.35,
'probable': 0.05
}
home_adjustment = 0
away_adjustment = 0
for inj in injuries_home:
prob_missing = status_prob.get(inj['status'].lower(), 0)
if inj['position'] == 'QB':
# Use QB-specific value if available
qb_row = qb_values_df[
(qb_values_df['team'] == home_team) &
(qb_values_df['starter'] == inj['player'])
]
if len(qb_row) > 0:
impact = -qb_row.iloc[0]['estimated_point_value']
else:
impact = -3.0 # Default average QB impact
else:
# Use position-based estimates
pos_impacts = {
'LT': -1.2, 'OL': -0.7, 'WR': -0.7, 'RB': -0.3,
'TE': -0.5, 'DE': -0.8, 'EDGE': -0.8, 'DT': -0.5,
'CB': -0.7, 'LB': -0.4, 'S': -0.4, 'K': -1.5
}
impact = pos_impacts.get(inj['position'], -0.3)
home_adjustment += impact * prob_missing
for inj in injuries_away:
prob_missing = status_prob.get(inj['status'].lower(), 0)
if inj['position'] == 'QB':
qb_row = qb_values_df[
(qb_values_df['team'] == away_team) &
(qb_values_df['starter'] == inj['player'])
]
if len(qb_row) > 0:
impact = qb_row.iloc[0]['estimated_point_value']
else:
impact = 3.0
else:
pos_impacts = {
'LT': 1.2, 'OL': 0.7, 'WR': 0.7, 'RB': 0.3,
'TE': 0.5, 'DE': 0.8, 'EDGE': 0.8, 'DT': 0.5,
'CB': 0.7, 'LB': 0.4, 'S': 0.4, 'K': 1.5
}
impact = pos_impacts.get(inj['position'], 0.3)
away_adjustment += impact * prob_missing
adjusted_prediction = base_prediction + home_adjustment + away_adjustment
return {
'base_prediction': base_prediction,
'home_injury_adj': home_adjustment,
'away_injury_adj': away_adjustment,
'total_adjustment': home_adjustment + away_adjustment,
'adjusted_prediction': adjusted_prediction
}
# Example: predicting a game with a key QB injury
example = build_injury_adjusted_prediction(
    base_prediction=3.5,  # Model's predicted margin (home - away): home favored by 3.5
home_team='KC',
away_team='BUF',
injuries_home=[], # KC healthy
injuries_away=[
{'player': 'J.Allen', 'position': 'QB', 'status': 'Out'},
{'player': 'S.Diggs', 'position': 'WR', 'status': 'Questionable'}
],
qb_values_df=qb_values,
season=2024
)
print("Injury-Adjusted Prediction Example:")
for key, value in example.items():
if isinstance(value, float):
print(f" {key:25s}: {value:+.1f}")
else:
print(f" {key:25s}: {value}")
Monitoring the Injury Report Programmatically
The NFL publishes official injury reports on Wednesday, Thursday, and Friday of each week (for Sunday games). These reports include practice participation status and game designations. Using nfl_data_py, you can access injury data:
# Load injury data
injuries = nfl.import_injuries([2024])
print(f"Injury report entries: {len(injuries)}")
print(f"Columns: {list(injuries.columns)}")
# Example: Week 10 injury report
week10 = injuries[injuries['week'] == 10]
print(f"\nWeek 10 injury designations:")
print(week10.groupby('report_status')['gsis_id'].count())
# Focus on players listed as Out or Doubtful
critical = week10[week10['report_status'].isin(['Out', 'Doubtful'])]
print(f"\nPlayers Out or Doubtful in Week 10:")
print(critical[['team', 'full_name', 'position', 'report_status']].to_string(index=False))
15.5 NFL Betting Market Patterns
Key Numbers: 3 and 7
The most distinctive feature of NFL betting is the importance of key numbers. Because touchdowns are worth 7 points (6 + extra point) and field goals are worth 3 points, final margins cluster around these numbers and their combinations. The most common NFL margins are:
| Margin | Historical Frequency |
|---|---|
| 3 | ~14.5% |
| 7 | ~9.5% |
| 10 | ~6.0% |
| 6 | ~5.5% |
| 4 | ~5.0% |
| 1 | ~4.5% |
| 14 | ~4.5% |
This distribution has profound implications for spread betting. A spread of -3 is fundamentally different from -2.5 or -3.5 because roughly 14.5% of games land on a margin of exactly 3. The bettor who gets +3 instead of +2.5 "pushes" (ties) on those games instead of losing---a massive difference in expected value.
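That difference is easy to put a number on. Below is a back-of-envelope EV comparison for a +2.5 underdog versus a +3 underdog at -110; the probabilities are assumptions in the right historical ballpark, not measured values:
# Value of the half point from +2.5 to +3 at -110 (assumed probabilities).
p_land_3 = 0.095   # favorite wins by exactly 3 (assumption)
p_cover = 0.475    # underdog covers +2.5 outright (assumption)
payout = 100 / 110 # profit per $1 risked at -110
ev_plus_2_5 = p_cover * payout - (1 - p_cover) * 1.0
# At +3, games landing on exactly 3 push (stake returned) instead of losing
ev_plus_3 = p_cover * payout - (1 - p_cover - p_land_3) * 1.0
print(f"EV per $1 at +2.5: {ev_plus_2_5:+.3f}")
print(f"EV per $1 at +3:   {ev_plus_3:+.3f}")
print(f"Half-point value:  {ev_plus_3 - ev_plus_2_5:+.3f} per $1 staked")
Under these assumptions the half point around 3 is worth roughly nine cents per dollar staked, which is why sportsbooks charge a premium to buy on or off the 3.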
def analyze_key_numbers(games_df, seasons=None):
"""
Analyze the distribution of game margins and the
importance of key numbers in NFL betting.
"""
if seasons:
data = games_df[games_df['season'].isin(seasons)]
else:
data = games_df
# Calculate margin (use home team perspective, take absolute value for symmetry)
    margins = data.drop_duplicates(subset='game_id').copy()
    margins['abs_margin'] = margins['margin'].abs()
# Distribution of margins
margin_dist = margins['abs_margin'].value_counts().sort_index()
margin_pct = margin_dist / len(margins) * 100
print("NFL Game Margin Distribution:")
print(f"{'Margin':>6} | {'Count':>5} | {'Pct':>6} | {'Cumulative':>10}")
print("-" * 35)
cumulative = 0
for margin in range(0, 35):
if margin in margin_pct.index:
pct = margin_pct[margin]
cumulative += pct
count = margin_dist[margin]
marker = " ***" if margin in [3, 7, 10, 14, 17] else ""
print(f"{margin:>6} | {count:>5} | {pct:>5.1f}% | {cumulative:>9.1f}%{marker}")
return margin_dist
margin_dist = analyze_key_numbers(games, seasons=[2022, 2023, 2024])
Teaser Strategy
A teaser is a modified parlay where the bettor receives additional points on the spread in exchange for combining multiple selections. The most common teaser in the NFL is the 6-point, 2-team teaser at -110 odds. The strategic value of teasers in the NFL derives directly from key numbers.
The classic "Wong teaser" strategy (named after Stanford Wong) involves teasing spreads that cross both 3 and 7. For example:
- Teasing a -7.5 favorite down to -1.5: crosses 7, 6, 5, 4, 3, 2
- Teasing a +1.5 underdog up to +7.5: crosses 2, 3, 4, 5, 6, 7
The mathematical framework for evaluating a 2-team teaser at -110:
$$\text{Required Win Rate} = \frac{1.10}{2.10} = 52.38\%$$
$$\text{Required Each-Leg Win Rate} = \sqrt{0.5238} \approx 72.37\%$$
But when we tease through both 3 and 7, historical win rates on each leg are approximately 74-77%, making the teaser +EV:
def analyze_teaser_value(games_df, schedule_df, teaser_points=6):
"""
Analyze historical teaser profitability by original spread.
For each original spread, calculate the win rate after
adding teaser_points, to identify +EV teaser opportunities.
"""
# Merge game results with spreads
game_results = schedule_df[schedule_df['season_type'] == 'REG'].copy()
game_results['margin'] = game_results['home_score'] - game_results['away_score']
game_results = game_results.dropna(subset=['spread_line', 'margin'])
# For home team: covers if margin > -spread_line (spread is negative for favorites)
# After teaser: covers if margin > -(spread_line) - teaser_points
results = []
spread_ranges = [
(-8.5, -7.5, '-7.5 to -8.5 (tease to -1.5 to -2.5)'),
(-7, -7, '-7 (tease to -1)'),
(-3.5, -2.5, '-2.5 to -3.5 (tease to +2.5 to +3.5)'),
(-2, -1, '-1 to -2 (tease to +4 to +5)'),
(1, 2, '+1 to +2 (tease to +7 to +8)'),
(2.5, 3.5, '+2.5 to +3.5 (tease to +8.5 to +9.5)'),
(7, 8, '+7 to +8 (tease to +13 to +14)'),
]
for low, high, label in spread_ranges:
subset = game_results[
(game_results['spread_line'] >= low) &
(game_results['spread_line'] <= high)
]
if len(subset) < 10:
continue
# Standard cover rate (home perspective)
standard_cover = (subset['margin'] > -subset['spread_line']).mean()
# Teased cover rate
teased_cover = (
subset['margin'] > -(subset['spread_line'] + teaser_points)
).mean()
results.append({
'original_spread': label,
'games': len(subset),
'standard_cover': standard_cover,
'teased_cover': teased_cover,
})
results_df = pd.DataFrame(results)
print("Teaser Value Analysis (6-point teaser):")
print(f"Required each-leg win rate for 2-team -110 teaser: 72.36%\n")
print(results_df.to_string(index=False, float_format='%.1%'))
return results_df
schedule = nfl.import_schedules([2020, 2021, 2022, 2023, 2024])
teaser_analysis = analyze_teaser_value(games, schedule)
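To connect those leg win rates back to the breakeven math, a quick expected-value check at an assumed 75% per-leg win rate (legs treated as independent, pushes ignored):
# Two-team, 6-point teaser at -110: risk $1.10 to win $1.00.
p_leg = 0.75                  # assumed per-leg cover rate after teasing
p_win = p_leg ** 2            # both legs must cover
ev = p_win * 1.00 - (1 - p_win) * 1.10
print(f"P(win teaser) = {p_win:.1%} (breakeven: 52.4%)")
print(f"EV per $1.10 risked: {ev:+.3f} (ROI: {ev / 1.10:+.1%})")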
Divisional Game Adjustments
Games between divisional rivals exhibit different characteristics than non-divisional games. Divisional opponents play each other twice per season, leading to greater familiarity. Historical analysis shows:
- Reduced home-field advantage: Divisional games show approximately 0.5 fewer points of HFA compared to non-divisional games
- More competitive games: Divisional underdogs cover at a slightly higher rate than non-divisional underdogs
- Lower totals: Familiarity tends to reduce scoring, as defensive coordinators are better prepared against opponents they have already scouted extensively
def analyze_divisional_effects(games_df, schedule_df):
"""Analyze how divisional games differ from non-divisional games."""
# NFL division mapping
divisions = {
'AFC East': ['BUF', 'MIA', 'NE', 'NYJ'],
'AFC North': ['BAL', 'CIN', 'CLE', 'PIT'],
'AFC South': ['HOU', 'IND', 'JAX', 'TEN'],
'AFC West': ['DEN', 'KC', 'LAC', 'LV'],
'NFC East': ['DAL', 'NYG', 'PHI', 'WAS'],
'NFC North': ['CHI', 'DET', 'GB', 'MIN'],
'NFC South': ['ATL', 'CAR', 'NO', 'TB'],
'NFC West': ['ARI', 'LAR', 'SEA', 'SF'],
}
team_to_div = {}
for div, teams in divisions.items():
for team in teams:
team_to_div[team] = div
results = schedule_df[schedule_df['season_type'] == 'REG'].copy()
results['margin'] = results['home_score'] - results['away_score']
results['total'] = results['home_score'] + results['away_score']
results = results.dropna(subset=['spread_line', 'margin'])
results['is_divisional'] = results.apply(
lambda r: team_to_div.get(r['home_team']) == team_to_div.get(r['away_team']),
axis=1
)
# Compare divisional vs non-divisional
for label, subset in [('Divisional', results[results['is_divisional']]),
('Non-Divisional', results[~results['is_divisional']])]:
cover_rate = (subset['margin'] > -subset['spread_line']).mean()
avg_margin = subset['margin'].mean()
avg_total = subset['total'].mean()
avg_abs_margin = subset['margin'].abs().mean()
print(f"\n{label} Games (n={len(subset)}):")
print(f" Home team cover rate: {cover_rate:.1%}")
print(f" Average home margin: {avg_margin:.1f} (proxy for HFA)")
print(f" Average total points: {avg_total:.1f}")
print(f" Average absolute margin: {avg_abs_margin:.1f}")
analyze_divisional_effects(games, schedule)
Weather Effects
Weather is a significant factor in NFL betting, particularly for totals. Rain, snow, and especially wind can substantially reduce scoring. Temperature has a modest effect, primarily through its impact on grip and field conditions.
The key weather variables for NFL modeling are:
- Wind speed: The strongest predictor of reduced scoring. Wind over 15 mph significantly decreases passing efficiency and field goal accuracy.
- Precipitation: Rain reduces scoring by approximately 2-4 points on the total. Snow has a larger effect, approximately 3-6 points.
- Temperature: Extreme cold (below 20 degrees F) reduces scoring modestly, approximately 1-2 points.
def estimate_weather_impact(wind_mph, precipitation, temperature_f):
"""
Estimate the impact of weather on game total (points adjustment).
Based on historical analysis of NFL games played in various conditions.
Returns an adjustment to subtract from the base total prediction.
"""
adjustment = 0
# Wind impact (non-linear)
if wind_mph >= 25:
adjustment += 6.0
elif wind_mph >= 20:
adjustment += 4.0
elif wind_mph >= 15:
adjustment += 2.5
elif wind_mph >= 10:
adjustment += 1.0
# Precipitation impact
if precipitation == 'snow':
adjustment += 4.5
elif precipitation == 'heavy_rain':
adjustment += 3.0
elif precipitation == 'rain':
adjustment += 1.5
# Temperature impact
if temperature_f <= 10:
adjustment += 2.5
elif temperature_f <= 20:
adjustment += 1.5
elif temperature_f <= 32:
adjustment += 0.5
return adjustment
# Example calculations
scenarios = [
('Dome/Mild', 5, 'none', 72),
('Cold & Windy', 22, 'none', 18),
('Snow Game', 12, 'snow', 25),
('Rainy', 8, 'rain', 45),
('Extreme Wind', 30, 'none', 40),
]
print("Weather Impact on Total:")
print(f"{'Scenario':20s} | {'Wind':>5s} | {'Precip':>10s} | {'Temp':>5s} | {'Adj':>5s}")
print("-" * 60)
for name, wind, precip, temp in scenarios:
adj = estimate_weather_impact(wind, precip, temp)
print(f"{name:20s} | {wind:>5d} | {precip:>10s} | {temp:>5d} | {-adj:>+5.1f}")
Public vs. Sharp Money
One of the most persistent edges in NFL betting comes from understanding where public and sharp money flows. Public money refers to wagers from recreational bettors, who tend to bet favorites, overs, and popular teams. Sharp money refers to wagers from professional bettors and syndicates, who are price-sensitive and tend to move lines.
Key patterns:
- Fade the public on primetime games: Monday Night Football and Sunday Night Football attract disproportionate public betting on favorites and overs. Historically, underdogs and unders have shown slight edges in these spots.
- Reverse line movement: When the line moves against the side receiving the majority of bets (e.g., the public is 75% on the favorite, but the line moves toward the underdog), this indicates sharp money on the underdog. These moves are strong signals; a detection sketch follows this list.
- Steam moves: A sudden, sharp line movement across multiple sportsbooks simultaneously, typically driven by a large sharp bet or syndicate action.
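Bet-percentage data is not available through nfl_data_py, so the sketch below assumes a DataFrame from a paid feed with hypothetical open_spread, close_spread, and home_bet_pct columns (spreads quoted conventionally, negative = home favored):
# Reverse-line-movement flag. Columns open_spread, close_spread, and
# home_bet_pct are hypothetical inputs from a bet-percentage feed.
def flag_reverse_line_movement(df, bet_threshold=0.65, move_threshold=0.5):
    out = df.copy()
    line_move = out['close_spread'] - out['open_spread']
    # Public heavy on home, yet the line moved toward the away side
    rlm_on_away = (out['home_bet_pct'] >= bet_threshold) & \
                  (line_move >= move_threshold)
    # Public heavy on away, yet the line moved toward the home side
    rlm_on_home = (out['home_bet_pct'] <= 1 - bet_threshold) & \
                  (line_move <= -move_threshold)
    out['rlm_side'] = np.select([rlm_on_away, rlm_on_home],
                                ['away', 'home'], default='none')
    return out
For market-wide context without bet-percentage data, the proxy analysis below uses spread size as a stand-in for public interest: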
def analyze_public_betting_patterns(schedule_df, seasons):
"""
Analyze NFL betting patterns that indicate public/sharp divergence.
Note: True bet percentage data requires subscription services like
Sports Insights or Action Network. Here we use proxy indicators.
"""
data = schedule_df[
(schedule_df['season'].isin(seasons)) &
(schedule_df['season_type'] == 'REG')
].copy()
data['margin'] = data['home_score'] - data['away_score']
data = data.dropna(subset=['spread_line', 'margin', 'total_line'])
data['total'] = data['home_score'] + data['away_score']
# Proxy: Large favorites (likely heavy public action)
large_fav = data[data['spread_line'] <= -7] # Home favored by 7+
large_fav_cover = (large_fav['margin'] > -large_fav['spread_line']).mean()
small_spread = data[(data['spread_line'] >= -3) & (data['spread_line'] <= 3)]
small_cover = (small_spread['margin'] > -small_spread['spread_line']).mean()
# Over/under performance
over_rate = (data['total'] > data['total_line']).mean()
push_rate = (data['total'] == data['total_line']).mean()
# Primetime games (approximate: weeks have TNF, SNF, MNF)
# In the schedule data, we can identify by gameday
print("NFL Public Betting Pattern Analysis")
print(f"Seasons: {seasons}")
print(f"Total games: {len(data)}")
print(f"\nLarge favorites (7+ points):")
print(f" Count: {len(large_fav)}")
print(f" Cover rate: {large_fav_cover:.1%}")
print(f" (Breakeven at -110: 52.4%)")
print(f"\nSmall spreads (3 or less):")
print(f" Count: {len(small_spread)}")
print(f" Home cover rate: {small_cover:.1%}")
print(f"\nOver/Under:")
print(f" Over rate: {over_rate:.1%}")
print(f" Push rate: {push_rate:.1%}")
print(f" Under rate: {1 - over_rate - push_rate:.1%}")
# Spread distribution analysis
print(f"\nSpread bucket ATS performance (home team):")
bins = [(-50, -10), (-10, -7), (-7, -3), (-3, 0), (0, 3), (3, 7), (7, 50)]
for low, high in bins:
bucket = data[(data['spread_line'] > low) & (data['spread_line'] <= high)]
if len(bucket) > 10:
cover = (bucket['margin'] > -bucket['spread_line']).mean()
print(f" Spread ({low:+d} to {high:+d}): {len(bucket):>4d} games, "
f"home cover {cover:.1%}")
analyze_public_betting_patterns(schedule, seasons=[2020, 2021, 2022, 2023, 2024])
Putting It All Together: A Complete Game Analysis
Let us walk through a complete analysis of a hypothetical Week 14 matchup, combining all the tools from this chapter:
def complete_game_analysis(home_team, away_team, season, week,
games_df, pbp_data, market_spread, market_total,
home_injuries=None, away_injuries=None,
wind_mph=5, precipitation='none', temperature_f=65):
"""
Complete game analysis combining all Chapter 15 tools.
"""
print(f"{'='*60}")
print(f"GAME ANALYSIS: {away_team} @ {home_team}")
print(f"Season {season}, Week {week}")
print(f"Market: {home_team} {market_spread:+.1f}, Total {market_total}")
print(f"{'='*60}")
# 1. Team Ratings
ratings = build_composite_rating(games_df, season, through_week=week-1)
home_rating = ratings[ratings['team'] == home_team].iloc[0]
away_rating = ratings[ratings['team'] == away_team].iloc[0]
print(f"\n--- TEAM RATINGS ---")
print(f"{home_team}: Composite {home_rating['composite_rating']:+.3f} "
f"(Off: {home_rating['off_epa']:+.3f}, Def: {home_rating['def_epa']:+.3f})")
print(f"{away_team}: Composite {away_rating['composite_rating']:+.3f} "
f"(Off: {away_rating['off_epa']:+.3f}, Def: {away_rating['def_epa']:+.3f})")
    # 2. Base spread prediction
    hfa = 1.8  # Current-era home-field advantage, in points
    # Predicted home margin: positive means the home team wins by that much.
    # Multiply by ~14 to convert the EPA-scale rating to a points scale.
    predicted_margin = (home_rating['composite_rating'] -
                        away_rating['composite_rating']) * 14 + hfa
    # Negate to express as a spread (betting convention: negative = home
    # favored) so the prediction is directly comparable to market_spread
    base_spread = -predicted_margin
print(f"\n--- SPREAD PREDICTION ---")
print(f"Base model spread: {home_team} {base_spread:+.1f}")
# 3. Injury adjustments
if home_injuries or away_injuries:
qb_vals = quantify_qb_value(pbp_data, season)
adj = build_injury_adjusted_prediction(
base_spread, home_team, away_team,
home_injuries or [], away_injuries or [],
qb_vals, season
)
spread_after_injuries = adj['adjusted_prediction']
print(f"Injury adjustment: {adj['total_adjustment']:+.1f}")
print(f"Spread after injuries: {home_team} {spread_after_injuries:+.1f}")
else:
spread_after_injuries = base_spread
print("No significant injuries reported.")
# 4. Weather adjustment (for totals)
weather_adj = estimate_weather_impact(wind_mph, precipitation, temperature_f)
print(f"\n--- TOTALS PREDICTION ---")
base_total = 45.0 # Would come from totals model in practice
adjusted_total = base_total - weather_adj
print(f"Base total prediction: {base_total:.1f}")
print(f"Weather adjustment: {-weather_adj:+.1f}")
print(f"Adjusted total prediction: {adjusted_total:.1f}")
# 5. Market comparison
print(f"\n--- MARKET COMPARISON ---")
model_vs_market_spread = spread_after_injuries - market_spread
model_vs_market_total = adjusted_total - market_total
print(f"Model spread: {spread_after_injuries:+.1f} vs Market: {market_spread:+.1f}")
print(f" Difference: {model_vs_market_spread:+.1f}")
if abs(model_vs_market_spread) >= 2.0:
        # A more negative model spread than market means the model rates
        # the home side more strongly than the market does
        side = home_team if model_vs_market_spread < 0 else away_team
print(f" ** POTENTIAL EDGE on {side} **")
else:
print(f" No significant edge detected on spread.")
print(f"Model total: {adjusted_total:.1f} vs Market: {market_total}")
print(f" Difference: {model_vs_market_total:+.1f}")
if abs(model_vs_market_total) >= 2.5:
direction = "OVER" if model_vs_market_total > 0 else "UNDER"
print(f" ** POTENTIAL EDGE on {direction} **")
else:
print(f" No significant edge detected on total.")
print(f"\n{'='*60}")
# Example analysis
complete_game_analysis(
home_team='KC', away_team='BUF',
season=2024, week=14,
games_df=games, pbp_data=pbp,
market_spread=-2.5, market_total=47.5,
away_injuries=[
{'player': 'D.Knox', 'position': 'TE', 'status': 'Out'}
],
wind_mph=18, precipitation='none', temperature_f=28
)
15.6 Chapter Summary
This chapter provided a comprehensive framework for modeling the NFL from a betting perspective. We covered the entire pipeline from data acquisition to prediction to market analysis.
Key Takeaways
Data and Metrics:
- The nflfastR/nfl_data_py ecosystem provides rich play-by-play data with pre-calculated EPA, WPA, and other advanced metrics dating back to 1999.
- EPA (Expected Points Added) is the foundational metric for measuring play and player value, superior to raw yardage because it accounts for game situation.
- CPOE (Completion Percentage Over Expected) isolates quarterback accuracy from throw difficulty and is one of the most stable QB metrics year-to-year.
- Success rate complements EPA by measuring consistency rather than magnitude (a computation sketch follows this list).
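As a refresher, the core efficiency computation takes only a few lines. This sketch uses the nfl_data_py loader from earlier in the chapter and nflfastR's precomputed success column (a play is a success when its EPA is positive); the season choice is arbitrary.

```python
import nfl_data_py as nfl

# Load one season of play-by-play; 'epa' and 'success' are precomputed
pbp = nfl.import_pbp_data([2023])
plays = pbp[pbp['play_type'].isin(['run', 'pass']) & pbp['epa'].notna()]

# EPA/play measures magnitude; success rate measures consistency
team_eff = plays.groupby('posteam').agg(
    epa_play=('epa', 'mean'),
    success_rate=('success', 'mean'),
    plays=('epa', 'size'),
).sort_values('epa_play', ascending=False)
print(team_eff.head())
```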
Modeling:
- Opponent-adjusted efficiency ratings form the basis of spread prediction. An iterative adjustment process accounts for strength of schedule.
- Ridge regression with time-series cross-validation provides a principled framework for combining multiple efficiency features into spread and totals predictions.
- The optimal approach blends your model's predictions with the market line rather than ignoring market information entirely. The market encodes substantial wisdom (see the blending sketch after this list).
- Home-field advantage in the modern NFL is approximately 1.5 to 2.0 points, down significantly from historical levels.
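The market-blending point above reduces to a weighted average. A minimal sketch, with both spreads in betting convention (negative = home favored); the `blend_with_market` helper is hypothetical, and the 30% model weight is an assumption you would tune by backtesting, not an established figure.

```python
def blend_with_market(model_spread, market_spread, model_weight=0.3):
    """Shrink a model spread toward the market line (both spreads in
    betting convention: negative = home favored)."""
    return model_weight * model_spread + (1 - model_weight) * market_spread

# Model says home -5.0, market says -2.5: the blended line is -3.25
print(blend_with_market(-5.0, -2.5))  # -3.25
```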
Player Impact:
- Quarterback value dominates all other positional values. Elite QBs are worth 8-12 points over replacement per game.
- Injury adjustments should be probabilistic (accounting for questionable/doubtful designations) and position-specific, as in the sketch after this list.
- The market generally prices high-profile QB injuries efficiently but may underreact to cumulative injuries at other positions.
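A minimal sketch of the probabilistic adjustment described above. The play probabilities per designation and the example point value are illustrative assumptions, not league-published figures; in practice you would calibrate them against historical injury-report outcomes.

```python
# Approximate probability a player suits up, by injury designation
# (illustrative values only)
PLAY_PROB = {'Out': 0.00, 'Doubtful': 0.25, 'Questionable': 0.50}

def expected_injury_cost(points_if_out, status):
    """Expected points lost = full absence cost weighted by P(misses game)."""
    p_plays = PLAY_PROB.get(status, 1.0)  # unlisted players assumed to play
    return points_if_out * (1 - p_plays)

# A QB worth 6 points listed Questionable costs ~3 expected points
print(expected_injury_cost(6.0, 'Questionable'))  # 3.0
print(expected_injury_cost(6.0, 'Out'))           # 6.0
```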
Market Patterns:
- Key numbers (3 and 7) create structural features in the NFL margin distribution that affect teaser strategy and spread pricing.
- Wong teasers crossing both 3 and 7 have historically been profitable at standard -110 pricing (a qualification sketch follows this list).
- Divisional games show reduced home-field advantage and more competitive outcomes.
- Weather, particularly wind, significantly impacts totals and should be factored into over/under analysis.
- Public money tends to favor popular teams, large favorites, and overs, especially in primetime games. Reverse line movement can signal sharp action on the opposite side.
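The Wong-teaser rule is easy to encode. A minimal sketch of leg qualification under the common definition: a 6-point teaser leg qualifies when a favorite of -7.5 to -8.5 is teased down through both 7 and 3, or an underdog of +1.5 to +2.5 is teased up through both 3 and 7. The `is_wong_leg` helper is illustrative.

```python
def is_wong_leg(spread):
    """True if a 6-point tease of this spread crosses both 3 and 7."""
    teased_favorite = -8.5 <= spread <= -7.5   # e.g., -7.5 teases to -1.5
    teased_underdog = 1.5 <= spread <= 2.5     # e.g., +2.5 teases to +8.5
    return teased_favorite or teased_underdog

print(is_wong_leg(-7.5))  # True: teased to -1.5, crossing 7 and 3
print(is_wong_leg(2.0))   # True: teased to +8.0, crossing 3 and 7
print(is_wong_leg(-4.0))  # False: crosses 3 but never reaches 7
```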
Looking Ahead
In Chapter 16, we will apply a similar framework to the NBA, where the data landscape, pace of play, and market dynamics create a very different set of modeling challenges and opportunities. The NBA's 82-game season provides much larger sample sizes but introduces complications from rest, travel, and lineup variation that the NFL's weekly schedule largely avoids.
Practice Exercises
- EPA Stability: Download play-by-play data for 2022 and 2023. Calculate each team's EPA/play in weeks 1-9 and weeks 10-18 of each season. What is the correlation between the two halves? How does this inform the sample size needed for reliable team evaluation?
- Key Number Analysis: Using historical data from 2015-2024, calculate the exact frequency with which NFL games land on each margin from 1 to 21. Verify the key number frequencies presented in this chapter. How has the distribution changed since the extra point was moved back to the 15-yard line in 2015?
- QB Injury Model: Identify 10 instances where a starting QB was injured and missed multiple games. Compare the team's performance (points scored, EPA/play) with the starter versus without. How do your findings compare to the positional impact estimates in Section 15.4?
- Weather Model: Build a regression model that predicts game totals using offensive and defensive efficiency metrics plus weather variables. Compare the model's out-of-sample performance with and without weather features. Does weather improve predictions?
- Teaser Backtest: Using historical closing lines and results from 2018-2024, simulate a systematic Wong teaser strategy (6-point teasers crossing 3 and 7 at -110). What is the historical ROI? How sensitive is the result to the exact spread thresholds you use?
Further Reading
- Burke, Brian. "Expected Points and Expected Points Added." Advanced Football Analytics.
- Alamar, Benjamin. Sports Analytics: A Guide for Coaches, Managers, and Other Decision Makers. Columbia University Press, 2013.
- Winston, Wayne. Mathletics: How Gamblers, Managers, and Sports Enthusiasts Use Mathematics in Baseball, Basketball, and Football. Princeton University Press, 2009.
- Football Outsiders. "Methods to Our Madness." footballoutsiders.com.
- Carl, Sebastian, and Ben Baldwin. "nflfastR: Functions to Efficiently Access NFL Play by Play Data." github.com/nflverse/nflfastR.
- Bales, Jonathan. Fantasy Football for Smart People: How to Win at Daily Fantasy Sports. Fantasy Labs, 2015.
- Stern, Hal. "On the Probability of Winning a Football Game." The American Statistician, 1991.
- Harville, David. "Predictions for National Football League Games via Linear-Model Methodology." Journal of the American Statistical Association, 1980.