Case Study: The Fourth-Down Revolution
"The evidence was always there. It just took decades for anyone to act on it."
Executive Summary
This case study examines one of football analytics' most significant success stories: the transformation of fourth-down decision-making in the NFL. Over 15 years, what was once considered reckless aggression became recognized as optimal strategy, fundamentally changing how the game is coached.
Skills Applied:
- Expected value analysis
- Historical data interpretation
- Decision-making frameworks
- Communication of analytical insights
Data: Historical fourth-down decision data (2000-2023)
Background
The Traditional Approach
For most of football history, fourth-down decisions followed a simple heuristic: punt on fourth down, kick field goals when in range, and only "go for it" in desperate situations or with very short yardage.
This approach seemed intuitive. Failing to convert meant giving the opponent excellent field position, while punting pushed them back. The risk of failure loomed large in coaches' minds.
The Context
By the early 2000s, researchers had begun questioning this orthodoxy. Economist David Romer published a landmark paper in 2006 demonstrating that teams were far too conservative on fourth down. His analysis showed that going for it was often the higher expected-value decision, even in situations where coaches nearly always punted.
But the NFL is a conservative league. Coaches who tried unconventional strategies and failed faced intense criticism. Those who punted and lost could point to "doing the right thing." The incentives favored conservatism.
The Challenge
Central Question: "Can rigorous expected value analysis change deeply entrenched coaching behavior, and if so, how long does that change take?"
Stakeholders
| Role | Perspective | Definition of Success |
|---|---|---|
| Head Coaches | Risk-averse due to job security | Win games while minimizing criticism |
| Analytics Departments | Evidence-driven decision making | Coaches adopt EV-optimal strategies |
| General Managers | Long-term winning | Build competitive advantage |
| Media/Fans | Entertainment and narratives | Understand and appreciate good decisions |
Available Data
Data Sources
For this analysis, we use publicly available play-by-play data from nflfastR/nfl_data_py, which includes:
- All fourth-down plays from 2000-2023
- Decision made (go for it, punt, field goal attempt)
- Outcome if went for it
- Game situation (score, time, field position)
- Expected points values for each situation
Data Dictionary
| Column | Type | Description | Example |
|---|---|---|---|
| season | int | NFL season year | 2023 |
| game_id | str | Unique game identifier | 2023_01_ARI_WAS |
| down | int | Down number (4 for our analysis) | 4 |
| ydstogo | int | Yards needed for first down | 2 |
| yardline_100 | int | Yards from opponent's end zone | 45 |
| go_for_it | bool | Whether team went for it | True |
| punt | bool | Whether team punted | False |
| field_goal_attempt | bool | Whether team attempted FG | False |
| converted | bool | If went for it, did they convert? | True |
| ep_before | float | Expected points before play | 2.1 |
| ep_after | float | Expected points after play | 4.5 |
| go_boost | float | EV advantage of going vs. punt | 0.8 |
Sample Data
season down ydstogo yardline_100 decision converted go_boost
2023 4 1 45 go_for_it True 1.2
2023 4 3 32 punt NaN 0.4
2023 4 2 55 punt NaN 0.8
2023 4 1 68 field_goal NaN -0.5
2023 4 4 25 punt NaN -0.2
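The data dictionary stores the decision as three boolean flags, while the sample above shows a single `decision` column. The following is a minimal sketch of collapsing the flags into one label. It assumes at most one flag is true per play; the helper name `label_decision` and the `"other"` fallback are illustrative choices, not part of nflfastR or nfl_data_py.

```python
def label_decision(go_for_it: bool, punt: bool, field_goal_attempt: bool) -> str:
    """Collapse the three decision flags into a single label."""
    flags = {
        "go_for_it": go_for_it,
        "punt": punt,
        "field_goal": field_goal_attempt,
    }
    chosen = [name for name, is_set in flags.items() if is_set]
    # Assumption: a valid fourth-down decision sets exactly one flag.
    # Anything else (no flag, or several) is labeled "other" for manual review.
    if len(chosen) != 1:
        return "other"
    return chosen[0]


# Hypothetical rows shaped like the data dictionary
plays = [
    {"ydstogo": 1, "go_for_it": True, "punt": False, "field_goal_attempt": False},
    {"ydstogo": 3, "go_for_it": False, "punt": True, "field_goal_attempt": False},
    {"ydstogo": 5, "go_for_it": False, "punt": False, "field_goal_attempt": False},
]
for play in plays:
    play["decision"] = label_decision(
        play["go_for_it"], play["punt"], play["field_goal_attempt"]
    )
    print(play["ydstogo"], play["decision"])
```

Routing ambiguous rows to an explicit `"other"` bucket keeps them visible instead of silently counting them as punts or attempts.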
Data Quality Notes
- Expected points models have improved over time; historical estimates may differ from modern calculations
- "Go boost" (EV advantage of going for it) is calculated using current models applied retroactively
- Some game situations (late game, extreme scores) require different analysis
- Sample sizes for rare situations (fourth-and-long deep in own territory) are limited
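To make the sample-size caveat concrete, a confidence interval shows how little a handful of attempts says about the true conversion rate. The sketch below uses the Wilson score interval, a standard choice for proportions with small samples; the function name and the example counts are hypothetical and are not taken from the play-by-play data.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, attempts: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a conversion rate (z = 1.96 gives roughly 95%)."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    p_hat = successes / attempts
    denom = 1 + z**2 / attempts
    center = (p_hat + z**2 / (2 * attempts)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / attempts + z**2 / (4 * attempts**2)) / denom
    )
    return center - half_width, center + half_width


# Same observed 75% rate, very different certainty (hypothetical counts)
rare_low, rare_high = wilson_interval(3, 4)
common_low, common_high = wilson_interval(300, 400)
print(f"3 of 4:     {rare_low:.2f} to {rare_high:.2f}")
print(f"300 of 400: {common_low:.2f} to {common_high:.2f}")
```

When the interval for a rare situation is this wide, a recommendation built on its observed conversion rate should be treated as tentative.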
Analysis Approach
Phase 1: Historical Context
Let's first understand how fourth-down behavior has changed over time.
# code/case_study_01_fourth_down.py - Part 1: Historical Analysis
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# For demonstration, we'll simulate historical data
# In practice, use: nfl.import_pbp_data(range(2000, 2024))
# (with nfl_data_py imported as nfl)
np.random.seed(42)


def generate_fourth_down_data():
    """
    Generate simulated fourth-down historical data.
    This approximates the patterns observed in real NFL data.
    """
    seasons = range(2000, 2024)
    data = []
    for season in seasons:
        n_fourth_downs = np.random.randint(2800, 3200)  # League total per season
        # Go-for-it rate increases over time
        base_go_rate = 0.10 + (season - 2000) * 0.008
        if season >= 2018:
            base_go_rate += 0.05  # Acceleration in modern era
        for _ in range(n_fourth_downs):
            ydstogo = np.random.choice(
                [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                p=[0.20, 0.18, 0.15, 0.12, 0.10, 0.08, 0.06, 0.05, 0.03, 0.03]
            )
            # yardline_100 convention: yards from the opponent's end zone,
            # so SMALLER values mean the offense is closer to scoring
            yardline = np.random.randint(1, 100)
            # Go rate varies by distance and field position
            go_rate = base_go_rate * (1 + (5 - ydstogo) * 0.1)
            if yardline < 40:  # Inside the opponent's 40
                go_rate *= 1.5
            if yardline < 10:  # Inside the opponent's 10
                go_rate *= 1.3
            go_rate = min(go_rate, 0.95)
            went_for_it = np.random.random() < go_rate
            # Calculate go_boost (EV advantage of going for it)
            if ydstogo <= 2:
                go_boost = 1.0 + (70 - yardline) * 0.02
            elif ydstogo <= 4:
                go_boost = 0.3 + (70 - yardline) * 0.015
            else:
                go_boost = -0.2 + (80 - yardline) * 0.01
            go_boost += np.random.normal(0, 0.3)
            data.append({
                'season': season,
                'ydstogo': ydstogo,
                'yardline_100': yardline,
                'went_for_it': went_for_it,
                'go_boost': go_boost,
                'should_go': go_boost > 0
            })
    return pd.DataFrame(data)


# Generate data
df = generate_fourth_down_data()

# Aggregate by season
seasonal = df.groupby('season').agg({
    'went_for_it': 'mean',
    'go_boost': 'mean',
    'should_go': 'mean'
}).reset_index()
seasonal.columns = ['season', 'go_rate', 'avg_go_boost', 'optimal_go_rate']

print("Fourth-Down Decisions by Era:")
print("-" * 50)
eras = [
    (2000, 2005, "Early 2000s"),
    (2006, 2010, "Post-Romer"),
    (2011, 2017, "Analytics Growth"),
    (2018, 2023, "Modern Era")
]
for start, end, name in eras:
    era_data = seasonal[(seasonal['season'] >= start) & (seasonal['season'] <= end)]
    print(f"\n{name} ({start}-{end}):")
    print(f"  Average Go Rate: {era_data['go_rate'].mean():.1%}")
    print(f"  Optimal Go Rate: {era_data['optimal_go_rate'].mean():.1%}")
    print(f"  Gap: {(era_data['optimal_go_rate'].mean() - era_data['go_rate'].mean()):.1%}")
Key Finding 1: The Aggression Gap Has Narrowed
In the early 2000s, teams went for it on approximately 15% of fourth downs where analytics suggested they should, representing a massive aggression gap. By the 2020s, this gap had narrowed significantly as teams adopted more aggressive strategies.
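The finding above is stated as a go rate on plays where going for it was the better choice, which is a conditional rate rather than the overall go rate. The sketch below separates the two on a handful of hypothetical plays. It treats `go_boost > 0` as the "should go" rule, matching the `should_go` column in Part 1; the function names and toy values are illustrative only.

```python
def overall_go_rate(plays):
    """Share of all fourth downs on which the team went for it."""
    return sum(p["went_for_it"] for p in plays) / len(plays)


def conditional_go_rate(plays):
    """Share of 'should go' plays (go_boost > 0) on which the team went for it."""
    should_go = [p for p in plays if p["go_boost"] > 0]
    if not should_go:
        return float("nan")  # No qualifying plays, so the rate is undefined
    return sum(p["went_for_it"] for p in should_go) / len(should_go)


# Hypothetical plays: three favor going, and the team goes on only one of them
toy_plays = [
    {"went_for_it": True, "go_boost": 1.2},
    {"went_for_it": False, "go_boost": 0.4},
    {"went_for_it": False, "go_boost": 0.8},
    {"went_for_it": False, "go_boost": -0.5},
    {"went_for_it": False, "go_boost": -0.2},
]
print(f"Overall go rate:     {overall_go_rate(toy_plays):.0%}")
print(f"Go rate when optimal: {conditional_go_rate(toy_plays):.0%}")
```

On the simulated DataFrame from Part 1, the same conditional rate is `df.loc[df['go_boost'] > 0, 'went_for_it'].mean()`.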
# Visualization: Historical trend
import os

fig, ax = plt.subplots(figsize=(12, 6))
ax.plot(seasonal['season'], seasonal['go_rate'] * 100,
        marker='o', linewidth=2, label='Actual Go Rate')
ax.plot(seasonal['season'], seasonal['optimal_go_rate'] * 100,
        marker='s', linewidth=2, linestyle='--', label='Optimal Go Rate')
ax.fill_between(seasonal['season'],
                seasonal['go_rate'] * 100,
                seasonal['optimal_go_rate'] * 100,
                alpha=0.3, label='Aggression Gap')
ax.axvline(x=2006, color='red', linestyle=':', alpha=0.7, label='Romer Paper')
ax.axvline(x=2018, color='green', linestyle=':', alpha=0.7, label='Big Data Bowl')
ax.set_xlabel('Season', fontsize=12)
ax.set_ylabel('Go-For-It Rate (%)', fontsize=12)
ax.set_title('The Fourth-Down Revolution: Closing the Aggression Gap', fontsize=14)
ax.legend(loc='upper left')
ax.grid(True, alpha=0.3)
plt.tight_layout()
os.makedirs('figures', exist_ok=True)  # savefig fails if the folder is missing
plt.savefig('figures/fourth_down_historical.png', dpi=150)
plt.show()
Key Finding 2: The Acceleration After 2018
The adoption of aggressive fourth-down strategies accelerated dramatically after 2018. Several factors contributed: the Eagles' Super Bowl win with aggressive decision-making, increased public discourse about fourth-down analysis, and new coaches entering the league with analytics backgrounds.
Phase 2: Expected Value Framework
Let's examine the mathematics behind fourth-down decisions.
Applying Chapter Concept: In Section 1.1.2, we learned about expected value as the foundation of prescriptive analytics. We now apply this to fourth-down decisions.
# Part 2: Expected Value Analysis
from typing import Optional


def fourth_down_ev(
    yards_to_go: int,
    field_position: int,  # yards from opponent's goal
    conversion_prob: Optional[float] = None,
    ep_model: Optional[dict] = None
) -> dict:
    """
    Calculate expected value for fourth-down options.

    Parameters
    ----------
    yards_to_go : int
        Yards needed for first down
    field_position : int
        Yards from opponent's end zone (1-99)
    conversion_prob : float, optional
        Probability of converting. If None, estimated from distance.
    ep_model : dict, optional
        Reserved for an expected points lookup. Currently unused;
        the simplified model below is always applied.

    Returns
    -------
    dict
        Expected values for each option and recommendation
    """
    # Conversion probability by distance (approximate)
    if conversion_prob is None:
        conv_probs = {1: 0.75, 2: 0.60, 3: 0.52, 4: 0.45, 5: 0.40,
                      6: 0.35, 7: 0.30, 8: 0.28, 9: 0.25, 10: 0.22}
        conversion_prob = conv_probs.get(yards_to_go, 0.20)

    # Simplified EP model (approximate actual values)
    def get_ep(yard_line, offense=True):
        """
        Expected points with the ball at yard_line, always measured in
        yards from the opponent's end zone from OUR point of view.
        """
        if offense:
            if yard_line <= 0:
                return 7.0  # Touchdown
            elif yard_line <= 10:
                return 5.0 + (10 - yard_line) * 0.2
            else:
                return -1.5 + (100 - yard_line) * 0.055
        else:  # Opponent's ball: flip the field and the sign
            return -get_ep(100 - yard_line, offense=True)

    # Calculate EV for each option
    # Option 1: Go for it (a conversion gains at least yards_to_go)
    ep_if_convert = get_ep(max(field_position - yards_to_go, 0), offense=True)
    ep_if_fail = get_ep(field_position, offense=False)  # Turnover on downs
    ev_go = (conversion_prob * ep_if_convert +
             (1 - conversion_prob) * ep_if_fail)

    # Option 2: Punt (the ball moves TOWARD the opponent's end zone)
    net_punt = 42  # Average net punt distance
    # A touchback puts the opponent at their own 20
    punt_result = max(field_position - net_punt, 20)
    ev_punt = get_ep(punt_result, offense=False)

    # Option 3: Field Goal (if in range)
    ev_fg = None
    fg_range = field_position <= 42  # ~60 yard FG max
    if fg_range:
        # FG probability declines with distance
        fg_distance = field_position + 17  # 10-yard end zone + ~7 yards for snap/hold
        fg_prob = max(0.3, 1.0 - (fg_distance - 20) * 0.015)
        # On a miss, the opponent takes over at the spot of the kick
        ev_fg = (fg_prob * 3.0 +
                 (1 - fg_prob) * get_ep(min(field_position + 7, 80), offense=False))

    # Determine recommendation
    options = {'go': ev_go, 'punt': ev_punt}
    if ev_fg is not None:
        options['field_goal'] = ev_fg
    best_option = max(options, key=options.get)
    # Compare against None explicitly so an EV of exactly 0.0 is not dropped
    best_alternative = ev_punt if ev_fg is None else max(ev_punt, ev_fg)
    return {
        'yards_to_go': yards_to_go,
        'field_position': field_position,
        'conversion_prob': conversion_prob,
        'ev_go': round(ev_go, 2),
        'ev_punt': round(ev_punt, 2),
        'ev_fg': round(ev_fg, 2) if ev_fg is not None else None,
        'recommendation': best_option,
        'go_boost': round(ev_go - best_alternative, 2)
    }


# Analyze across situations
print("Fourth-Down Expected Value Analysis")
print("=" * 60)
situations = [
    (1, 50, "4th and 1 at midfield"),
    (2, 35, "4th and 2 at opponent's 35"),
    (3, 40, "4th and 3 at opponent's 40"),
    (1, 75, "4th and 1 at own 25"),
    (5, 30, "4th and 5 at opponent's 30 (FG range)"),
]
for yards, field_pos, description in situations:
    result = fourth_down_ev(yards, field_pos)
    print(f"\n{description}")
    print(f"  Conv. Prob: {result['conversion_prob']:.0%}")
    print(f"  EV Go: {result['ev_go']:+.2f}")
    print(f"  EV Punt: {result['ev_punt']:+.2f}")
    if result['ev_fg'] is not None:
        print(f"  EV FG: {result['ev_fg']:+.2f}")
    print(f"  Recommendation: {result['recommendation'].upper()}")
    print(f"  Go Boost: {result['go_boost']:+.2f}")
Results (illustrative values; the simplified EP model above will not reproduce them exactly):
| Situation | Conv. Prob | EV Go | EV Punt | EV FG | Rec. | Go Boost |
|---|---|---|---|---|---|---|
| 4th & 1 at midfield | 75% | +1.88 | -0.23 | N/A | GO | +2.11 |
| 4th & 2 at opp. 35 | 60% | +1.52 | -0.15 | +1.89 | FG | -0.37 |
| 4th & 3 at opp. 40 | 52% | +0.95 | -0.12 | N/A | GO | +1.07 |
| 4th & 1 at own 25 | 75% | +0.27 | +0.35 | N/A | PUNT | -0.08 |
| 4th & 5 at opp. 30 | 40% | +1.35 | -0.18 | +2.15 | FG | -0.80 |
Interpretation: The analysis reveals several insights:
1. Short-yardage situations in opponent territory strongly favor going for it
2. Field goal range changes the calculation significantly
3. Deep in own territory, the risk of failure matters more
4. The "go boost" metric quantifies the advantage of aggression
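The expected-value comparison can also be turned around to ask how often a team must convert for going to be worthwhile. Setting p * EP_convert + (1 - p) * EP_fail equal to the EV of the best alternative and solving for p gives the break-even conversion probability. The sketch below is one way to compute it; the function name and the example EP values are hypothetical and are not taken from the results table.

```python
def break_even_conversion_prob(
    ep_convert: float, ep_fail: float, ev_alternative: float
) -> float:
    """
    Conversion probability p at which going for it matches the alternative:
        p * ep_convert + (1 - p) * ep_fail = ev_alternative
    """
    if ep_convert <= ep_fail:
        raise ValueError("ep_convert must exceed ep_fail")
    p = (ev_alternative - ep_fail) / (ep_convert - ep_fail)
    # Clamp to [0, 1]: 0 means always go, 1 means never go
    return min(max(p, 0.0), 1.0)


# Hypothetical situation: converting is worth +2.0 EP, failing costs -1.0 EP,
# and the best alternative (punt or field goal) is worth +0.5 EP
p_star = break_even_conversion_prob(ep_convert=2.0, ep_fail=-1.0, ev_alternative=0.5)
print(f"Break-even conversion probability: {p_star:.0%}")
```

Comparing the break-even value with the estimated conversion probability for the distance gives the same recommendation as the go boost, but in terms coaches may find easier to reason about.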
Phase 3: Who Changed First?
Let's examine which teams led the fourth-down revolution and whether it correlated with success.
# Part 3: Team-Level Analysis
from scipy import stats


def generate_team_data():
    """Generate simulated team-level fourth-down data."""
    teams = ['ARI', 'ATL', 'BAL', 'BUF', 'CAR', 'CHI', 'CIN', 'CLE',
             'DAL', 'DEN', 'DET', 'GB', 'HOU', 'IND', 'JAX', 'KC',
             'LAC', 'LAR', 'LV', 'MIA', 'MIN', 'NE', 'NO', 'NYG',
             'NYJ', 'PHI', 'PIT', 'SEA', 'SF', 'TB', 'TEN', 'WAS']
    # Teams known for analytics (higher aggression)
    analytics_leaders = ['BAL', 'PHI', 'NE', 'LAR', 'KC', 'BUF', 'GB', 'DET']
    data = []
    for season in range(2018, 2024):
        for team in teams:
            is_leader = team in analytics_leaders
            base_rate = 0.15 if not is_leader else 0.22
            rate_growth = (season - 2018) * 0.02
            go_rate = base_rate + rate_growth + np.random.normal(0, 0.03)
            go_rate = max(0.05, min(0.45, go_rate))
            # Wins depend loosely on aggression, plus random noise
            base_wins = 8 + (go_rate - 0.15) * 20 + np.random.normal(0, 3)
            wins = int(max(2, min(15, base_wins)))
            data.append({
                'season': season,
                'team': team,
                'go_rate': go_rate,
                'wins': wins,
                'analytics_leader': is_leader
            })
    return pd.DataFrame(data)


team_df = generate_team_data()

# Analyze leaders vs. others
print("\nTeam Fourth-Down Aggression Analysis (2018-2023)")
print("=" * 55)
leaders = team_df[team_df['analytics_leader']]
others = team_df[~team_df['analytics_leader']]
# Count teams from the data: analytics_leaders is local to generate_team_data()
n_leaders = leaders['team'].nunique()
n_others = others['team'].nunique()
print(f"\nAnalytics Leaders (n={n_leaders} teams):")
print(f"  Average Go Rate: {leaders['go_rate'].mean():.1%}")
print(f"  Average Wins/Season: {leaders['wins'].mean():.1f}")
print(f"\nOther Teams (n={n_others} teams):")
print(f"  Average Go Rate: {others['go_rate'].mean():.1%}")
print(f"  Average Wins/Season: {others['wins'].mean():.1f}")

# Correlation analysis
corr, pval = stats.pearsonr(team_df['go_rate'], team_df['wins'])
print(f"\nCorrelation (Go Rate vs. Wins): r = {corr:.3f}, p = {pval:.3f}")
Key Finding 3: Early Adopters Correlated with Success
Teams that embraced fourth-down aggression earlier tended to win more games, though causation is complex. Aggressive teams may have better overall decision-making cultures, or winning teams may have more confidence to take risks.
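One source of distortion in a pooled correlation is the league-wide trend: go rates rise every season, while average wins per team stay roughly fixed. A simple partial remedy is to compare each team with its contemporaries by subtracting the season average before correlating. The sketch below illustrates this on hypothetical numbers; the helper names are illustrative, and the adjustment removes only the season trend, not the other confounders noted above.

```python
import math
from collections import defaultdict


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def demean_by_season(rows, key):
    """Subtract each season's average of `key` from that season's rows."""
    values_by_season = defaultdict(list)
    for row in rows:
        values_by_season[row["season"]].append(row[key])
    season_mean = {s: sum(v) / len(v) for s, v in values_by_season.items()}
    return [row[key] - season_mean[row["season"]] for row in rows]


# Hypothetical team-seasons: the whole league is more aggressive in 2023
rows = [
    {"season": 2018, "go_rate": 0.10, "wins": 6},
    {"season": 2018, "go_rate": 0.15, "wins": 8},
    {"season": 2018, "go_rate": 0.20, "wins": 10},
    {"season": 2023, "go_rate": 0.20, "wins": 6},
    {"season": 2023, "go_rate": 0.25, "wins": 8},
    {"season": 2023, "go_rate": 0.30, "wins": 10},
]
pooled_r = pearson([r["go_rate"] for r in rows], [r["wins"] for r in rows])
within_r = pearson(demean_by_season(rows, "go_rate"), demean_by_season(rows, "wins"))
print(f"Pooled r: {pooled_r:.2f}")
print(f"Within-season r: {within_r:.2f}")
```

In this toy data the within-season relationship is stronger than the pooled one, because the season trend adds variation in go rate that is unrelated to wins.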
Phase 4: The Adoption Curve
We can model fourth-down adoption as a diffusion of innovation.
# Part 4: Adoption Curve Analysis
def adoption_curve_analysis():
    """Analyze the S-curve adoption pattern."""
    # Simulated adoption data by year
    years = list(range(2000, 2024))

    # S-curve adoption pattern
    def s_curve(t, k=0.25, t0=2014):
        return 1 / (1 + np.exp(-k * (t - t0)))

    adoption = [s_curve(y) for y in years]

    # Categorize each year by cumulative adoption, using the standard
    # diffusion-of-innovations cutoffs (2.5%, 16%, 50%, 84%)
    categories = []
    for level in adoption:
        if level < 0.025:
            categories.append('Innovators')
        elif level < 0.16:
            categories.append('Early Adopters')
        elif level < 0.50:
            categories.append('Early Majority')
        elif level < 0.84:
            categories.append('Late Majority')
        else:
            categories.append('Laggards')
    return pd.DataFrame({
        'year': years,
        'adoption': adoption,
        'category': categories
    })


adoption_df = adoption_curve_analysis()
print("\nFourth-Down Adoption Phases:")
for cat in ['Innovators', 'Early Adopters', 'Early Majority',
            'Late Majority', 'Laggards']:
    cat_data = adoption_df[adoption_df['category'] == cat]
    if len(cat_data) > 0:  # Skip phases that fall outside 2000-2023
        print(f"\n{cat}:")
        print(f"  Years: {cat_data['year'].min()}-{cat_data['year'].max()}")
        print(f"  Adoption Range: {cat_data['adoption'].min():.0%}-{cat_data['adoption'].max():.0%}")
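Because the S-curve is a logistic function, it can be inverted to find when adoption crosses a given level: solving a = 1 / (1 + exp(-k (t - t0))) for t gives t = t0 + ln(a / (1 - a)) / k. The sketch below applies this with the same simulated defaults as `s_curve` (`k=0.25`, `t0=2014`); these are illustrative parameters, not values fitted to real data, and the function name is hypothetical.

```python
import math


def crossing_year(level: float, k: float = 0.25, t0: float = 2014) -> float:
    """Year at which the logistic adoption curve reaches `level` (0 < level < 1)."""
    if not 0 < level < 1:
        raise ValueError("level must be strictly between 0 and 1")
    # Inverse of 1 / (1 + exp(-k * (t - t0)))
    return t0 + math.log(level / (1 - level)) / k


for level in (0.16, 0.50, 0.84):
    print(f"{level:.0%} adoption reached around {crossing_year(level):.1f}")
```

The midpoint `t0` is where adoption hits 50% and grows fastest, and the 16% and 84% crossings sit symmetrically on either side of it.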
Results Summary
Key Findings
- The aggression gap narrowed dramatically from 2000-2023, with teams going for it on fourth down approximately 3x more often by 2023 compared to 2000.
- Academic research (Romer 2006) preceded behavioral change by nearly a decade, illustrating that evidence alone doesn't change entrenched behavior; organizational and cultural factors matter.
- Early adopters (analytics-forward teams) correlated with success, though causation is complex.
- The adoption followed a classic S-curve pattern, with slow initial change, rapid acceleration (2016-2020), and current plateau toward optimal levels.
Impact Quantification
| Metric | 2000-2005 | 2018-2023 | Change |
|---|---|---|---|
| Go Rate (where optimal) | ~12% | ~45% | +33pp |
| Average EP Lost to Conservatism | ~0.5/game | ~0.15/game | -70% |
| Teams with Aggressive Reputation | 0-2 | 15+ | Mainstream |
Limitations and Future Work
Limitations
- Survival Bias: We observe teams that succeeded with aggression but may miss teams that tried and failed early.
- Confounding Variables: Teams that are analytically sophisticated on fourth downs may also be better at other aspects of football.
- Game Situation Complexity: Our analysis uses simplified EP models; real decisions involve game context, opponent, weather, and more.
Future Directions
- Extend analysis to two-point conversion decisions (similar analytical framework)
- Examine in-game win probability impact of fourth-down decisions
- Study coaching tenure and job security effects on risk-taking
- Analyze opponent adaptation to aggressive teams
Discussion Questions
- Why did it take nearly a decade for teams to adopt strategies that academic research had shown were optimal?
- What organizational factors distinguish teams that adopted aggressive fourth-down strategies early from those that adopted late?
- How might the fourth-down revolution serve as a model for other potential analytical improvements in football?
- What are the limits of expected value analysis for in-game decisions? When might a coach be right to deviate from the EV-optimal choice?
- How should we evaluate coaches who make EV-optimal decisions that result in bad outcomes?
Your Turn: Mini-Project
Extend this analysis with one of the following:
Option A: Situational Deep Dive
- Focus on a specific game situation (e.g., fourth-and-1 inside opponent's 5-yard line)
- Collect detailed data on outcomes
- Analyze whether teams have reached optimal aggression in this situation
- Deliverable: 1,500-word analysis with visualizations

Option B: Team Case Study
- Select one team known for fourth-down aggression (Eagles, Ravens, Lions)
- Track their fourth-down decisions over 3-5 seasons
- Analyze whether aggression correlated with wins
- Deliverable: Team-specific report with recommendations

Option C: Two-Point Conversion Analysis
- Apply the same expected value framework to two-point decisions
- Estimate the optimal two-point attempt rate
- Compare actual rates to optimal
- Deliverable: Parallel analysis to this case study
Complete Code
Full code for this case study is available at: code/case_study_01_fourth_down.py
References
- Romer, D. (2006). "Do Firms Maximize? Evidence from Professional Football." Journal of Political Economy.
- Burke, B. (2009). "Fourth Down Analysis." Advanced NFL Stats.
- Baldwin, B. (2020). "Fourth Down Decisions." nflfastR documentation.
- Football Outsiders. Various years. "Aggressiveness Index."