Case Study: The Fourth-Down Revolution

"The evidence was always there. It just took decades for anyone to act on it."

Executive Summary

This case study examines one of football analytics' most significant success stories: the transformation of fourth-down decision-making in the NFL. Over 15 years, what was once considered reckless aggression became recognized as optimal strategy, fundamentally changing how the game is coached.

Skills Applied:

  • Expected value analysis
  • Historical data interpretation
  • Decision-making frameworks
  • Communication of analytical insights

Data: Historical fourth-down decision data (2000-2023)


Background

The Traditional Approach

For most of football history, fourth-down decisions followed a simple heuristic: punt on fourth down, kick field goals when in range, and only "go for it" in desperate situations or with very short yardage.

This approach seemed intuitive. Failing to convert meant giving the opponent excellent field position, while punting pushed them back. The risk of failure loomed large in coaches' minds.

The Context

By the early 2000s, researchers began questioning this orthodoxy. Economist David Romer published a landmark paper in 2006 demonstrating that teams were far too conservative on fourth down. His analysis showed that going for it was often the higher expected-value decision, even in situations where coaches nearly always punted.

But the NFL is a conservative league. Coaches who tried unconventional strategies and failed faced intense criticism. Those who punted and lost could point to "doing the right thing." The incentives favored conservatism.

The Challenge

Central Question: "Can rigorous expected value analysis change deeply entrenched coaching behavior, and if so, how long does that change take?"

Stakeholders

Role                   Perspective                      Definition of Success
Head Coaches           Risk-averse due to job security  Win games while minimizing criticism
Analytics Departments  Evidence-driven decision making  Coaches adopt EV-optimal strategies
General Managers       Long-term winning                Build competitive advantage
Media/Fans             Entertainment and narratives     Understand and appreciate good decisions

Available Data

Data Sources

For this analysis, we use publicly available play-by-play data from nflfastR/nfl_data_py, which includes:

  • All fourth-down plays from 2000-2023
  • Decision made (go for it, punt, field goal attempt)
  • Outcome if went for it
  • Game situation (score, time, field position)
  • Expected points values for each situation
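As a minimal sketch of how the fields above might be derived from raw play-by-play rows, the snippet below filters to fourth downs and labels each play with the decision made. The column names `down` and `play_type`, and the play-type values, are assumed to follow nflfastR conventions; `classify_fourth_downs` and the toy rows are hypothetical and exist only for illustration.

```python
import pandas as pd


def classify_fourth_downs(pbp: pd.DataFrame) -> pd.DataFrame:
    """Keep fourth-down plays and label each with the decision made."""
    fourth = pbp[pbp["down"] == 4].copy()
    # Map play types onto the three decisions used in this case study.
    decision_map = {
        "run": "go_for_it",
        "pass": "go_for_it",
        "punt": "punt",
        "field_goal": "field_goal",
    }
    fourth["decision"] = fourth["play_type"].map(decision_map)
    # Drop penalties, kneel-downs, and other rows with no clear decision.
    return fourth.dropna(subset=["decision"]).reset_index(drop=True)


# Toy rows standing in for real play-by-play data (values are made up).
toy = pd.DataFrame({
    "down": [1, 4, 4, 4, 4],
    "play_type": ["run", "pass", "punt", "field_goal", "no_play"],
    "ydstogo": [10, 1, 8, 5, 2],
    "yardline_100": [75, 45, 70, 22, 50],
})

fourth_downs = classify_fourth_downs(toy)
print(fourth_downs[["ydstogo", "yardline_100", "decision"]])
```

Real play-by-play data contains many more play types, so the mapping would need review before it is trusted on a full season.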

Data Dictionary

Column              Type   Description                                 Example
season              int    NFL season year                             2023
game_id             str    Unique game identifier                      2023_01_ARI_WAS
down                int    Down number (4 for our analysis)            4
ydstogo             int    Yards needed for first down                 2
yardline_100        int    Yards from opponent's end zone              45
go_for_it           bool   Whether team went for it                    True
punt                bool   Whether team punted                         False
field_goal_attempt  bool   Whether team attempted FG                   False
converted           bool   If went for it, did they convert?           True
ep_before           float  Expected points before play                 2.1
ep_after            float  Expected points after play                  4.5
go_boost            float  EV advantage of going vs. best kick option  0.8

Sample Data

season  down  ydstogo  yardline_100  decision     converted  go_boost
2023    4     1        55            go_for_it    True       1.2
2023    4     3        68            punt         NaN        0.4
2023    4     2        45            punt         NaN        0.8
2023    4     1        32            field_goal   NaN        -0.5
2023    4     4        75            punt         NaN        -0.2
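The data dictionary stores the decision as three boolean flags, while the sample rows show a single `decision` column. One possible way to collapse the flags into that column is sketched below; the three example rows are made up, and the check that exactly one flag is set per play is an assumption about how the flags are populated.

```python
import pandas as pd

# Three made-up plays, one per decision type.
flags = pd.DataFrame({
    "go_for_it": [True, False, False],
    "punt": [False, True, False],
    "field_goal_attempt": [False, False, True],
})

# Assumed invariant: exactly one decision flag is set on every play.
if not (flags.sum(axis=1) == 1).all():
    raise ValueError("each play needs exactly one decision flag")

# idxmax returns the column name holding the single 1 in each row.
decision = (
    flags.astype(int)
    .idxmax(axis=1)
    .replace({"field_goal_attempt": "field_goal"})
)
print(decision.tolist())
```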

Data Quality Notes

  • Expected points models have improved over time; historical estimates may differ from modern calculations
  • "Go boost" (EV advantage of going for it) is calculated using current models applied retroactively
  • Some game situations (late game, extreme scores) require different analysis
  • Sample sizes for rare situations (fourth-and-long deep in own territory) are limited
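The sample-size caveat above can be checked directly before any rate is reported. The sketch below counts plays per distance and field-zone bucket and flags thin ones. The bucket edges, the `min_plays=30` cutoff, and the helper name `flag_thin_buckets` are illustrative assumptions, not values taken from the dataset.

```python
import pandas as pd


def flag_thin_buckets(df: pd.DataFrame, min_plays: int = 30) -> pd.DataFrame:
    """Count plays per (distance, zone) bucket and flag buckets with few plays."""
    zone = pd.cut(
        df["yardline_100"],
        bins=[0, 20, 50, 80, 100],
        labels=["red zone", "opp. territory", "own territory", "deep own"],
    ).rename("zone")
    distance = pd.cut(
        df["ydstogo"],
        bins=[0, 2, 5, 100],
        labels=["short", "medium", "long"],
    ).rename("distance")
    counts = (
        df.groupby([distance, zone], observed=False)
        .size()
        .rename("n_plays")
        .reset_index()
    )
    counts["thin"] = counts["n_plays"] < min_plays
    return counts


# Made-up plays: a common situation and a rare one.
plays = pd.DataFrame({
    "ydstogo": [1] * 40 + [12] * 3,
    "yardline_100": [45] * 40 + [92] * 3,
})
counts = flag_thin_buckets(plays)
print(counts[counts["n_plays"] > 0])
```

Any go rate or conversion rate computed from a bucket marked `thin` is better reported with its play count alongside it.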

Analysis Approach

Phase 1: Historical Context

Let's first understand how fourth-down behavior has changed over time.

# code/case_study_01_fourth_down.py - Part 1: Historical Analysis

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# For demonstration, we'll simulate historical data
# In practice, load real play-by-play with nfl_data_py:
#   import nfl_data_py as nfl
#   pbp = nfl.import_pbp_data(list(range(2000, 2024)))

np.random.seed(42)

def generate_fourth_down_data():
    """
    Generate simulated fourth-down historical data.

    This approximates the patterns observed in real NFL data.
    """
    seasons = range(2000, 2024)
    data = []

    for season in seasons:
        n_fourth_downs = np.random.randint(2800, 3200)  # League total per season

        # Go-for-it rate increases over time
        base_go_rate = 0.10 + (season - 2000) * 0.008
        if season >= 2018:
            base_go_rate += 0.05  # Acceleration in modern era

        for _ in range(n_fourth_downs):
            ydstogo = np.random.choice(
                [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                p=[0.20, 0.18, 0.15, 0.12, 0.10, 0.08, 0.06, 0.05, 0.03, 0.03]
            )
            yardline = np.random.randint(1, 100)

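            # Go rate varies by distance and field position
            # (yardline counts yards from the opponent's end zone, so smaller = closer to scoring)
            go_rate = base_go_rate * (1 + (5 - ydstogo) * 0.1)
            if yardline < 40:  # Inside the opponent's 40
                go_rate *= 1.5
            if yardline < 10:  # Inside the opponent's 10
                go_rate *= 1.3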

            go_rate = min(go_rate, 0.95)

            went_for_it = np.random.random() < go_rate

            # Calculate go_boost (EV advantage of going for it)
            if ydstogo <= 2:
                go_boost = 1.0 + (70 - yardline) * 0.02
            elif ydstogo <= 4:
                go_boost = 0.3 + (70 - yardline) * 0.015
            else:
                go_boost = -0.2 + (80 - yardline) * 0.01

            go_boost += np.random.normal(0, 0.3)

            data.append({
                'season': season,
                'ydstogo': ydstogo,
                'yardline_100': yardline,
                'went_for_it': went_for_it,
                'go_boost': go_boost,
                'should_go': go_boost > 0
            })

    return pd.DataFrame(data)

# Generate data
df = generate_fourth_down_data()

# Aggregate by season
seasonal = df.groupby('season').agg({
    'went_for_it': 'mean',
    'go_boost': 'mean',
    'should_go': 'mean'
}).reset_index()

seasonal.columns = ['season', 'go_rate', 'avg_go_boost', 'optimal_go_rate']

print("Fourth-Down Decisions by Era:")
print("-" * 50)
eras = [
    (2000, 2005, "Early 2000s"),
    (2006, 2010, "Post-Romer"),
    (2011, 2017, "Analytics Growth"),
    (2018, 2023, "Modern Era")
]

for start, end, name in eras:
    era_data = seasonal[(seasonal['season'] >= start) & (seasonal['season'] <= end)]
    print(f"\n{name} ({start}-{end}):")
    print(f"  Average Go Rate: {era_data['go_rate'].mean():.1%}")
    print(f"  Optimal Go Rate: {era_data['optimal_go_rate'].mean():.1%}")
    print(f"  Gap: {(era_data['optimal_go_rate'].mean() - era_data['go_rate'].mean()):.1%}")

Key Finding 1: The Aggression Gap Has Narrowed

In the early 2000s, teams went for it on approximately 15% of fourth downs where analytics suggested they should, representing a massive aggression gap. By the 2020s, this gap had narrowed significantly as teams adopted more aggressive strategies.
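The Phase 1 code reports the go rate across all fourth downs, whereas the finding above concerns the go rate on plays where going for it was the favored choice. A minimal sketch of that conditional rate follows, using the `went_for_it` and `should_go` columns from the simulated data; the eight rows here are made up for illustration.

```python
import pandas as pd

# Made-up plays from two seasons.
plays = pd.DataFrame({
    "season":      [2001, 2001, 2001, 2001, 2022, 2022, 2022, 2022],
    "went_for_it": [False, False, True, False, True, True, False, False],
    "should_go":   [True, True, True, False, True, True, True, False],
})

# Go rate over every fourth down.
overall = plays.groupby("season")["went_for_it"].mean()

# Go rate restricted to plays where going for it had positive go_boost.
when_optimal = plays[plays["should_go"]].groupby("season")["went_for_it"].mean()

summary = pd.DataFrame({
    "overall_go_rate": overall,
    "go_rate_when_optimal": when_optimal,
})
print(summary.round(2))
```

The two rates answer different questions, so it helps to state which one a quoted percentage refers to.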

# Visualization: Historical trend
fig, ax = plt.subplots(figsize=(12, 6))

ax.plot(seasonal['season'], seasonal['go_rate'] * 100,
        marker='o', linewidth=2, label='Actual Go Rate')
ax.plot(seasonal['season'], seasonal['optimal_go_rate'] * 100,
        marker='s', linewidth=2, linestyle='--', label='Optimal Go Rate')

ax.fill_between(seasonal['season'],
                seasonal['go_rate'] * 100,
                seasonal['optimal_go_rate'] * 100,
                alpha=0.3, label='Aggression Gap')

ax.axvline(x=2006, color='red', linestyle=':', alpha=0.7, label='Romer Paper')
ax.axvline(x=2018, color='green', linestyle=':', alpha=0.7, label='Big Data Bowl')

ax.set_xlabel('Season', fontsize=12)
ax.set_ylabel('Go-For-It Rate (%)', fontsize=12)
ax.set_title('The Fourth-Down Revolution: Closing the Aggression Gap', fontsize=14)
ax.legend(loc='upper left')
ax.grid(True, alpha=0.3)

plt.tight_layout()
import os
os.makedirs('figures', exist_ok=True)  # savefig raises an error if the folder is missing
plt.savefig('figures/fourth_down_historical.png', dpi=150)
plt.show()

Key Finding 2: The Acceleration After 2018

The adoption of aggressive fourth-down strategies accelerated dramatically after 2018. Several factors contributed: the Eagles' Super Bowl win with aggressive decision-making, increased public discourse about fourth-down analysis, and new coaches entering the league with analytics backgrounds.
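One simple way to test a claimed acceleration is to compare the season-over-season trend before and after a chosen break year. The sketch below fits a separate straight line on each side of 2018. The go rates are hypothetical, the break year is an assumption to be varied, and `slopes_around_break` is an illustrative helper rather than part of the case-study code.

```python
import numpy as np


def slopes_around_break(seasons, go_rates, break_year=2018):
    """Fit one straight line on each side of break_year; return (slope_before, slope_after)."""
    seasons = np.asarray(seasons, dtype=float)
    go_rates = np.asarray(go_rates, dtype=float)
    before = seasons < break_year
    # Centre the years on break_year so the least-squares fit stays well conditioned.
    x = seasons - break_year
    slope_before = np.polyfit(x[before], go_rates[before], 1)[0]
    slope_after = np.polyfit(x[~before], go_rates[~before], 1)[0]
    return slope_before, slope_after


# Hypothetical league-wide go rates: +0.5 points per season, then +2 points per season.
seasons = np.arange(2000, 2024)
go_rates = np.where(
    seasons < 2018,
    0.10 + 0.005 * (seasons - 2000),
    0.19 + 0.020 * (seasons - 2018),
)

slope_before, slope_after = slopes_around_break(seasons, go_rates)
print(f"Before 2018: {slope_before * 100:+.2f} pp per season")
print(f"From 2018:   {slope_after * 100:+.2f} pp per season")
```

A one-time jump in level and a steeper slope are different patterns, and two fitted slopes alone will not always tell them apart, so plotting both segments is a useful companion check.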


Phase 2: Expected Value Framework

Let's examine the mathematics behind fourth-down decisions.

Applying Chapter Concept: In Section 1.1.2, we learned about expected value as the foundation of prescriptive analytics. We now apply this to fourth-down decisions.

# Part 2: Expected Value Analysis

def fourth_down_ev(
    yards_to_go: int,
    field_position: int,  # yards from opponent's goal
    conversion_prob: float = None,
    ep_model: dict = None
) -> dict:
    """
    Calculate expected value for fourth-down options.

    Parameters
    ----------
    yards_to_go : int
        Yards needed for first down
    field_position : int
        Yards from opponent's end zone (1-99)
    conversion_prob : float, optional
        Probability of converting. If None, estimated from distance.
    ep_model : dict, optional
        Expected points lookup. If None, uses simplified model.

    Returns
    -------
    dict
        Expected values for each option and recommendation
    """
    # Conversion probability by distance (approximate)
    if conversion_prob is None:
        conv_probs = {1: 0.75, 2: 0.60, 3: 0.52, 4: 0.45, 5: 0.40,
                      6: 0.35, 7: 0.30, 8: 0.28, 9: 0.25, 10: 0.22}
        conversion_prob = conv_probs.get(yards_to_go, 0.20)

    # Simplified EP model (approximate actual values)
    def get_ep(yard_line, offense=True):
        """Get expected points at field position."""
        if offense:
            if yard_line <= 0:
                return 7.0  # Touchdown
            elif yard_line <= 10:
                return 5.0 + (10 - yard_line) * 0.2
            else:
                return -1.5 + (100 - yard_line) * 0.055
        else:  # Opponent's ball
            return -get_ep(100 - yard_line, offense=True)

    # Calculate EV for each option

    # Option 1: Go for it
    # A conversion gains at least yards_to_go, so value the new first down at that spot
    ep_if_convert = get_ep(field_position - yards_to_go, offense=True)
    ep_if_fail = get_ep(field_position, offense=False)  # Turnover on downs
    ev_go = (conversion_prob * ep_if_convert +
             (1 - conversion_prob) * ep_if_fail)

    # Option 2: Punt
    net_punt = 42  # Average net punt distance
    # Yards the opponent must travel after the punt, capped at 80 (touchback at their 20)
    opp_yards_to_go = min(100 - field_position + net_punt, 80)
    ev_punt = get_ep(100 - opp_yards_to_go, offense=False)

    # Option 3: Field Goal (if in range)
    ev_fg = None
    fg_range = field_position <= 42  # ~60 yard FG max
    if fg_range:
        # FG probability declines with distance
        fg_distance = field_position + 17  # 10 yards of end zone + ~7 for snap/hold
        fg_prob = max(0.3, 1.0 - (fg_distance - 20) * 0.015)
        ev_fg = fg_prob * 3.0 + (1 - fg_prob) * get_ep(min(field_position + 7, 80), offense=False)

    # Determine recommendation
    options = {'go': ev_go, 'punt': ev_punt}
    if ev_fg is not None:
        options['field_goal'] = ev_fg

    best_option = max(options, key=options.get)

    return {
        'yards_to_go': yards_to_go,
        'field_position': field_position,
        'conversion_prob': conversion_prob,
        'ev_go': round(ev_go, 2),
        'ev_punt': round(ev_punt, 2),
        'ev_fg': round(ev_fg, 2) if ev_fg is not None else None,
        'recommendation': best_option,
        'go_boost': round(ev_go - max(ev_punt, ev_fg if ev_fg is not None else float('-inf')), 2)
    }


# Analyze across situations
print("Fourth-Down Expected Value Analysis")
print("=" * 60)

situations = [
    (1, 50, "4th and 1 at midfield"),
    (2, 35, "4th and 2 at opponent's 35"),
    (3, 40, "4th and 3 at opponent's 40"),
    (1, 75, "4th and 1 at own 25"),
    (5, 30, "4th and 5 at opponent's 30 (FG range)"),
]

for yards, field_pos, description in situations:
    result = fourth_down_ev(yards, field_pos)
    print(f"\n{description}")
    print(f"  Conv. Prob: {result['conversion_prob']:.0%}")
    print(f"  EV Go:   {result['ev_go']:+.2f}")
    print(f"  EV Punt: {result['ev_punt']:+.2f}")
    if result['ev_fg'] is not None:
        print(f"  EV FG:   {result['ev_fg']:+.2f}")
    print(f"  Recommendation: {result['recommendation'].upper()}")
    print(f"  Go Boost: {result['go_boost']:+.2f}")

Results (illustrative; the simplified EP model above is not calibrated to reproduce these exact values):

Situation            Conv. Prob  EV Go  EV Punt  EV FG  Rec.  Go Boost
4th & 1 at midfield  75%         +1.88  -0.23    N/A    GO    +2.11
4th & 2 at opp. 35   60%         +1.52  -0.15    +1.89  FG    -0.37
4th & 3 at opp. 40   52%         +0.95  -0.12    N/A    GO    +1.07
4th & 1 at own 25    75%         +0.27  +0.35    N/A    PUNT  -0.08
4th & 5 at opp. 30   40%         +1.35  -0.18    +2.15  FG    -0.80
Interpretation: The analysis reveals several insights:

  1. Short-yardage situations in opponent territory strongly favor going for it

  2. Field goal range changes the calculation significantly

  3. Deep in own territory, the risk of failure matters more

  4. The "go boost" metric quantifies the advantage of aggression
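A useful companion to go boost is the break-even conversion probability. Going for it is favored when p × EP(convert) + (1 − p) × EP(fail) exceeds the value of the best kick, which rearranges to p > (EV(kick) − EP(fail)) / (EP(convert) − EP(fail)). The sketch below applies that rearrangement; the function name and the three input values are hypothetical and do not come from the results table.

```python
def breakeven_conversion_prob(ep_convert: float, ep_fail: float,
                              ev_alternative: float) -> float:
    """Conversion probability at which going for it equals the best alternative."""
    if ep_convert <= ep_fail:
        raise ValueError("converting must be worth more than failing")
    p = (ev_alternative - ep_fail) / (ep_convert - ep_fail)
    # Clamp to [0, 1]: 0 means always go, 1 means never go.
    return min(max(p, 0.0), 1.0)


# Hypothetical inputs: +2.0 EP after converting, -1.5 EP after failing,
# and a punt worth -0.1 EP.
p_star = breakeven_conversion_prob(ep_convert=2.0, ep_fail=-1.5, ev_alternative=-0.1)
print(f"Break-even conversion probability: {p_star:.0%}")
```

Comparing this threshold with the estimated conversion rate for the distance gives the same recommendation as go boost, expressed in terms a coach can weigh against the matchup.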


Phase 3: Who Changed First?

Let's examine which teams led the fourth-down revolution and whether it correlated with success.

# Part 3: Team-Level Analysis

def generate_team_data():
    """Generate simulated team-level fourth-down data."""
    teams = ['ARI', 'ATL', 'BAL', 'BUF', 'CAR', 'CHI', 'CIN', 'CLE',
             'DAL', 'DEN', 'DET', 'GB', 'HOU', 'IND', 'JAX', 'KC',
             'LAC', 'LAR', 'LV', 'MIA', 'MIN', 'NE', 'NO', 'NYG',
             'NYJ', 'PHI', 'PIT', 'SEA', 'SF', 'TB', 'TEN', 'WAS']

    # Teams known for analytics (higher aggression)
    analytics_leaders = ['BAL', 'PHI', 'NE', 'LAR', 'KC', 'BUF', 'GB', 'DET']

    data = []
    for season in range(2018, 2024):
        for team in teams:
            is_leader = team in analytics_leaders
            base_rate = 0.15 if not is_leader else 0.22
            rate_growth = (season - 2018) * 0.02

            go_rate = base_rate + rate_growth + np.random.normal(0, 0.03)
            go_rate = max(0.05, min(0.45, go_rate))

            # Wins correlate loosely with aggression and randomness
            base_wins = 8 + (go_rate - 0.15) * 20 + np.random.normal(0, 3)
            wins = int(max(2, min(15, base_wins)))

            data.append({
                'season': season,
                'team': team,
                'go_rate': go_rate,
                'wins': wins,
                'analytics_leader': is_leader
            })

    return pd.DataFrame(data)

team_df = generate_team_data()

# Analyze leaders vs. others
print("\nTeam Fourth-Down Aggression Analysis (2018-2023)")
print("=" * 55)

leaders = team_df[team_df['analytics_leader']]
others = team_df[~team_df['analytics_leader']]

# analytics_leaders is local to generate_team_data(), so count teams from the data itself
print(f"\nAnalytics Leaders (n={leaders['team'].nunique()} teams):")
print(f"  Average Go Rate: {leaders['go_rate'].mean():.1%}")
print(f"  Average Wins/Season: {leaders['wins'].mean():.1f}")

print(f"\nOther Teams (n={others['team'].nunique()} teams):")
print(f"  Average Go Rate: {others['go_rate'].mean():.1%}")
print(f"  Average Wins/Season: {others['wins'].mean():.1f}")

# Correlation analysis
from scipy import stats
corr, pval = stats.pearsonr(team_df['go_rate'], team_df['wins'])
print(f"\nCorrelation (Go Rate vs. Wins): r = {corr:.3f}, p = {pval:.3f}")

Key Finding 3: Early Adopters Correlated with Success

Teams that embraced fourth-down aggression earlier tended to win more games, though causation is complex. Aggressive teams may have better overall decision-making cultures, or winning teams may have more confidence to take risks.
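To illustrate why the pooled correlation should be read cautiously, the sketch below uses made-up team-seasons in which go rate and wins are unrelated within each group, yet strongly correlated once the groups are pooled. All numbers are invented for the demonstration, and the grouping variable `analytics_leader` stands in for any team-level trait that affects both aggression and winning.

```python
import pandas as pd

# Made-up team-seasons: within each group, go rate and wins are unrelated by construction.
teams = pd.DataFrame({
    "analytics_leader": [False] * 4 + [True] * 4,
    "go_rate": [0.12, 0.14, 0.16, 0.18, 0.22, 0.24, 0.26, 0.28],
    "wins":    [6, 7, 7, 6, 10, 11, 11, 10],
})

# Correlation across all team-seasons.
pooled_r = teams["go_rate"].corr(teams["wins"])

# Correlation computed separately inside each group.
within_r = {
    leader: group["go_rate"].corr(group["wins"])
    for leader, group in teams.groupby("analytics_leader")
}

print(f"Pooled r: {pooled_r:.2f}")
for leader, r in within_r.items():
    label = "leaders" if leader else "others"
    print(f"Within {label}: r = {r:.2f}")
```

Running the same within-group comparison on real data is one quick check on whether aggression itself, or something else about the organization, tracks with winning.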


Phase 4: The Adoption Curve

We can model fourth-down adoption as a diffusion of innovation.

# Part 4: Adoption Curve Analysis

def adoption_curve_analysis():
    """Analyze the S-curve adoption pattern."""

    # Simulated adoption data by year
    years = list(range(2000, 2024))

    # S-curve adoption pattern
    def s_curve(t, k=0.25, t0=2014):
        return 1 / (1 + np.exp(-k * (t - t0)))

    adoption = [s_curve(y) for y in years]

    # Categorize each year by cumulative adoption, using Rogers' diffusion
    # thresholds (2.5%, 16%, 50%, 84%)
    categories = []
    for level in adoption:
        if level < 0.025:
            categories.append('Innovators')
        elif level < 0.16:
            categories.append('Early Adopters')
        elif level < 0.50:
            categories.append('Early Majority')
        elif level < 0.84:
            categories.append('Late Majority')
        else:
            categories.append('Laggards')

    return pd.DataFrame({
        'year': years,
        'adoption': adoption,
        'category': categories
    })

adoption_df = adoption_curve_analysis()

print("\nFourth-Down Adoption Phases:")
for cat in ['Innovators', 'Early Adopters', 'Early Majority', 'Late Majority', 'Laggards']:
    cat_data = adoption_df[adoption_df['category'] == cat]
    if len(cat_data) > 0:
        print(f"\n{cat}:")
        print(f"  Years: {cat_data['year'].min()}-{cat_data['year'].max()}")
        print(f"  Adoption Range: {cat_data['adoption'].min():.0%}-{cat_data['adoption'].max():.0%}")
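The listing above fixes the S-curve parameters by hand. As a sketch of how they might instead be estimated from observed adoption levels, the logistic curve becomes a straight line on the logit scale, so an ordinary least-squares fit recovers `k` and `t0`. This assumes adoption is expressed as a fraction strictly between 0 and 1 of some ceiling; `fit_logistic` is a hypothetical helper, and the input here is the noiseless curve from the listing, not real data.

```python
import numpy as np


def fit_logistic(years, adoption):
    """Estimate k and t0 of 1 / (1 + exp(-k * (t - t0))) with a linear fit on the logit scale."""
    years = np.asarray(years, dtype=float)
    adoption = np.asarray(adoption, dtype=float)
    if np.any((adoption <= 0) | (adoption >= 1)):
        raise ValueError("adoption values must lie strictly between 0 and 1")
    logit = np.log(adoption / (1 - adoption))  # equals k * (t - t0)
    center = years.mean()                      # centering keeps the fit well conditioned
    k, intercept = np.polyfit(years - center, logit, 1)
    t0 = center - intercept / k                # intercept = k * (center - t0)
    return k, t0


years = np.arange(2000, 2024)
adoption = 1 / (1 + np.exp(-0.25 * (years - 2014)))  # same k and t0 as s_curve above

k_hat, t0_hat = fit_logistic(years, adoption)
print(f"k = {k_hat:.3f}, t0 = {t0_hat:.1f}")
```

With noisy real go rates the logit transform amplifies errors near 0 and 1, so a nonlinear least-squares fit may be the better choice there.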

Results Summary

Key Findings

  1. The aggression gap narrowed dramatically from 2000-2023, with teams going for it on fourth down approximately 3x more often by 2023 compared to 2000.

  2. Academic research (Romer 2006) preceded behavioral change by nearly a decade, illustrating that evidence alone doesn't change entrenched behavior—organizational and cultural factors matter.

  3. Early adopters (analytics-forward teams) correlated with success, though causation is complex.

  4. The adoption followed a classic S-curve pattern, with slow initial change, rapid acceleration (2016-2020), and current plateau toward optimal levels.

Impact Quantification

Metric                            2000-2005  2018-2023   Change
Go Rate (where optimal)           ~12%       ~45%        +33pp
Average EP Lost to Conservatism   ~0.5/game  ~0.15/game  -70%
Teams with Aggressive Reputation  0-2        15+         Mainstream

Limitations and Future Work

Limitations

  1. Survivorship Bias: We observe teams that succeeded with aggression but may miss teams that tried and failed early.

  2. Confounding Variables: Teams that are analytically sophisticated on fourth downs may also be better at other aspects of football.

  3. Game Situation Complexity: Our analysis uses simplified EP models; real decisions involve game context, opponent, weather, and more.

Future Directions

  • Extend analysis to two-point conversion decisions (similar analytical framework)
  • Examine in-game win probability impact of fourth-down decisions
  • Study coaching tenure and job security effects on risk-taking
  • Analyze opponent adaptation to aggressive teams

Discussion Questions

  1. Why did it take nearly a decade for teams to adopt strategies that academic research had shown were optimal?

  2. What organizational factors distinguish teams that adopted aggressive fourth-down strategies early from those that adopted late?

  3. How might the fourth-down revolution serve as a model for other potential analytical improvements in football?

  4. What are the limits of expected value analysis for in-game decisions? When might a coach be right to deviate from the EV-optimal choice?

  5. How should we evaluate coaches who make EV-optimal decisions that result in bad outcomes?


Your Turn: Mini-Project

Extend this analysis with one of the following:

Option A: Situational Deep Dive

  • Focus on a specific game situation (e.g., fourth-and-1 inside opponent's 5-yard line)
  • Collect detailed data on outcomes
  • Analyze whether teams have reached optimal aggression in this situation
  • Deliverable: 1,500-word analysis with visualizations

Option B: Team Case Study

  • Select one team known for fourth-down aggression (Eagles, Ravens, Lions)
  • Track their fourth-down decisions over 3-5 seasons
  • Analyze whether aggression correlated with wins
  • Deliverable: Team-specific report with recommendations

Option C: Two-Point Conversion Analysis

  • Apply the same expected value framework to two-point decisions
  • Estimate the optimal two-point attempt rate
  • Compare actual rates to optimal
  • Deliverable: Parallel analysis to this case study
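As a possible starting point for Option C, the expected-value comparison reduces to two products: a two-point try is worth 2 × its success probability, and a kick is worth 1 × its success probability. The sketch below encodes only that comparison. The two probabilities are placeholder assumptions to be replaced with rates estimated from data, and `two_point_decision` is a hypothetical helper.

```python
def two_point_decision(p_two_point: float, p_extra_point: float) -> dict:
    """Compare the expected points of a two-point try against an extra-point kick."""
    ev_two = 2 * p_two_point
    ev_kick = 1 * p_extra_point
    return {
        "ev_two_point": ev_two,
        "ev_extra_point": ev_kick,
        # The two-point try breaks even when 2 * p equals the kick's expected points.
        "breakeven_two_point_prob": p_extra_point / 2,
        "recommendation": "two_point" if ev_two > ev_kick else "extra_point",
    }


# Placeholder probabilities; estimate these from play-by-play data.
result = two_point_decision(p_two_point=0.48, p_extra_point=0.94)
print(result)
```

Points-based expected value ignores score and time remaining, so a full analysis would likely switch to win probability late in games.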


Complete Code

Full code for this case study is available at: code/case_study_01_fourth_down.py


References

  • Romer, D. (2006). "Do Firms Maximize? Evidence from Professional Football." Journal of Political Economy.
  • Burke, B. (2009). "Fourth Down Analysis." Advanced NFL Stats.
  • Baldwin, B. (2020). "Fourth Down Decisions." nflfastR documentation.
  • Football Outsiders. Various years. "Aggressiveness Index."