In This Chapter
- Chapter Overview
- 7.1 The Devaluation of the Running Back
- 7.2 Problems with Traditional Rushing Metrics
- 7.3 EPA-Based Rushing Metrics
- 7.4 Opportunity vs. Production
- 7.5 Game Script and Rushing Context
- 7.6 Offensive Line and Scheme Effects
- 7.7 Situational Rushing Value
- 7.8 Receiving Value for Running Backs
- 7.9 Comprehensive RB Evaluation Framework
- 7.10 Practical Applications
- Chapter Summary
- Practice Exercises
- Further Reading
Chapter 7: Rushing Analytics
Chapter Overview
The running back position has undergone a dramatic philosophical shift in modern NFL analytics. Once considered the cornerstone of championship teams, rushing production now faces scrutiny for its relatively low value compared to passing. This chapter explores why traditional rushing statistics mislead evaluators, how Expected Points Added reveals the true value of running plays, and when rushing actually does matter. You'll learn to separate scheme from talent, understand the contextual factors that inflate or deflate rushing numbers, and develop a nuanced framework for evaluating running backs in today's NFL.
Learning Objectives
By the end of this chapter, you will be able to:
- Explain why yards per carry is a flawed metric for RB evaluation
- Calculate and interpret EPA-based rushing metrics
- Understand the relationship between opportunity and production
- Analyze how game script and score affect rushing statistics
- Evaluate offensive line contribution to rushing success
- Identify situational rushing value (goal line, short yardage)
- Compare running backs using efficiency and volume metrics
7.1 The Devaluation of the Running Back
Historical Context
For decades, the running back was the glamour position. The NFL's greatest teams featured legendary runners: Jim Brown, Walter Payton, Emmitt Smith, Barry Sanders. Teams drafted running backs early, paid them handsomely, and built offenses around their ground games.
Then came the analytics revolution.
Data revealed uncomfortable truths:
- Passing is more efficient: The average pass attempt produces roughly 0.10 more EPA than the average rush
- Running back production is replaceable: The difference between "elite" and "average" RBs is smaller than at any other skill position
- Running backs have short careers: The typical peak lasts 2-4 years, making long-term contracts risky
- Opportunity drives statistics: Volume explains rushing production more than talent
This doesn't mean rushing is worthless—far from it. But it does mean we need better tools to evaluate when rushing matters, which backs truly add value, and how much teams should invest in the position.
The Pass-vs-Rush Efficiency Gap
import nfl_data_py as nfl
import pandas as pd
import numpy as np
# Load 2023 data
pbp = nfl.import_pbp_data([2023])
# Compare pass vs rush efficiency (restrict to offensive pass and run plays)
offense = pbp[pbp['play_type'].isin(['pass', 'run'])]
play_type_epa = offense.groupby('play_type').agg(
    plays=('play_id', 'count'),
    total_epa=('epa', 'sum'),
    epa_per_play=('epa', 'mean'),
    success_rate=('epa', lambda x: (x > 0).mean())
).round(3)
print(play_type_epa[['plays', 'epa_per_play', 'success_rate']])
Typical results:
| Play Type | EPA/Play | Success Rate |
|---|---|---|
| Pass | ~0.05 | ~45% |
| Run | ~-0.05 | ~42% |
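The gap is stark: an average pass is worth about 0.10 EPA more than an average rush. Over a season with 500 rushing attempts, that difference adds up to about 50 expected points. Treat this as an upper bound, since defenses would adjust if a team stopped running, and note that the conversion to wins depends on the points-per-win figure assumed: at the very rough 10 points per win used later in this chapter it is about 5 wins, while more conservative conversions give a smaller number.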
Why the gap exists:
- Passes can gain more yards per play (higher ceiling)
- Passes are more likely to gain first downs on third-and-long
- Passing yards have become more reliable as rules favor offense
- Defensive penalties are more common against the pass
7.2 Problems with Traditional Rushing Metrics
Yards Per Carry: The Illusion of Efficiency
Yards per carry (YPC) is the most commonly cited rushing metric. A back averaging 5.0 YPC seems excellent; 3.5 YPC seems poor. But YPC suffers from critical flaws:
Problem 1: It ignores context
A 3-yard gain on 3rd-and-2 is more valuable than a 6-yard gain on 1st-and-10 with a 21-point lead. YPC treats them the same.
Problem 2: It's heavily scheme-dependent
Outside zone schemes often produce higher YPC than power running schemes, regardless of RB talent.
Problem 3: It rewards volatility over consistency
A back with carries of 1, 1, 1, 1, 16 yards (YPC = 4.0) appears equal to one with 4, 4, 4, 4, 4 yards (YPC = 4.0). But the consistent back is more valuable—football rewards moving the chains.
Problem 4: It conflates the line with the back
The offensive line creates (or doesn't create) the initial running lanes. YPC reflects O-line performance as much as RB skill.
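To make Problem 3 concrete, the short sketch below compares the two hypothetical backs from that example. It is only an illustration: the 4-yard threshold for a "useful" carry is an assumption chosen for this example, not a standard definition.

```python
from statistics import mean, median

# Hypothetical carry logs from the Problem 3 example (yards on each carry).
volatile_back = [1, 1, 1, 1, 16]
steady_back = [4, 4, 4, 4, 4]


def summarize(carries, useful_threshold=4):
    """Return YPC, median gain, and the share of carries at or above a threshold.

    useful_threshold is an illustrative assumption, not a standard cutoff.
    """
    useful = sum(1 for yards in carries if yards >= useful_threshold)
    return {
        "ypc": mean(carries),
        "median_gain": median(carries),
        "useful_rate": useful / len(carries),
    }


volatile_summary = summarize(volatile_back)
steady_summary = summarize(steady_back)
print(volatile_summary)
print(steady_summary)
```

Both backs show the same YPC, but the median gain and the share of useful carries separate them immediately.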
The Distribution of Rushing Plays
Rushing outcomes follow a particular pattern:
# Analyze rushing outcome distribution
rushes = pbp[pbp['rush_attempt'] == 1].copy()  # copy so new columns don't touch pbp
# Create yard buckets (pd.cut intervals are right-inclusive)
rushes['yard_bucket'] = pd.cut(
    rushes['yards_gained'],
    bins=[-50, -1, 0, 3, 5, 10, 20, 100],
    labels=['Loss', '0', '1-3', '4-5', '6-10', '11-20', '21+']
)
distribution = rushes['yard_bucket'].value_counts(normalize=True).sort_index()
print("Rushing Outcome Distribution:")
print(distribution.round(3))
Typical distribution:
- ~20% of rushes gain 0 or lose yards
- ~35% gain 1-3 yards
- ~20% gain 4-5 yards
- ~15% gain 6-10 yards
- ~10% gain 11+ yards
Most rushes cluster in a narrow, low-yardage band. The occasional explosive run inflates YPC while masking the high frequency of stuffed plays.
The Problem with Rushing Touchdowns
Rushing touchdowns seem like a measure of production, but they're largely a function of:
- Team red zone possessions (how often does the offense get close?)
- Goal-line opportunities (scheme and game script)
- Randomness (small sample sizes)
A back on a good offense with conservative goal-line plays will score more than a better back on a worse team. TDs tell us more about opportunity than ability.
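One way to see the opportunity effect is to put touchdowns next to the chances that produced them. The sketch below uses invented numbers for two hypothetical backs; the field name `carries_inside_5` is an assumption for illustration, and the calculation simplifies by treating every rushing touchdown as coming from inside the 5-yard line.

```python
# Hypothetical season lines; every number here is invented for illustration.
backs = {
    "Back A (strong offense)": {"rush_tds": 12, "carries_inside_5": 30},
    "Back B (weak offense)": {"rush_tds": 8, "carries_inside_5": 14},
}


def td_conversion_rate(line):
    """Touchdowns per carry inside the 5-yard line.

    Simplification: assumes every rushing TD came from inside the 5.
    """
    return line["rush_tds"] / line["carries_inside_5"]


rates = {name: td_conversion_rate(line) for name, line in backs.items()}
for name, rate in rates.items():
    print(f"{name}: {backs[name]['rush_tds']} TDs, {rate:.0%} conversion")
```

Back A leads in raw touchdowns, yet Back B converts a larger share of his goal-line chances; the touchdown total mostly reflects how often each offense reached the goal line.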
7.3 EPA-Based Rushing Metrics
Expected Points Added for Rushes
EPA measures the change in expected points from before to after a play. For rushing:
def calculate_rush_epa(pbp: pd.DataFrame, min_carries: int = 100) -> pd.DataFrame:
    """Calculate EPA-based rushing metrics."""
    rushes = pbp[pbp['rush_attempt'] == 1].copy()
    rb_stats = (rushes
        .groupby('rusher_player_name')
        .agg(
            carries=('rush_attempt', 'sum'),
            total_epa=('epa', 'sum'),
            epa_per_carry=('epa', 'mean'),
            yards=('yards_gained', 'sum'),
            ypc=('yards_gained', 'mean'),
            success_rate=('epa', lambda x: (x > 0).mean()),
            first_down_rate=('first_down', 'mean'),
            td_rate=('rush_touchdown', 'mean'),
            fumble_rate=('fumble_lost', 'mean')
        )
        .query(f'carries >= {min_carries}')
        .sort_values('epa_per_carry', ascending=False)
        .round(3)
    )
    return rb_stats
Interpreting Rush EPA
Unlike passing, where positive EPA is common, the average rushing play has negative EPA. This means:
| EPA/Carry | Interpretation |
|---|---|
| > 0.05 | Excellent |
| 0.00 to 0.05 | Above average |
| -0.10 to 0.00 | Average |
| -0.15 to -0.10 | Below average |
| < -0.15 | Poor |
Key insight: An RB with 0.00 EPA/carry is well above average! Breaking even is impressive in an inherently negative-value play type.
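For convenience, the interpretation bands can be wrapped in a small helper. This is a sketch of the table above, nothing more; the function name is hypothetical, and how exact boundary values are assigned (for example, whether 0.00 counts as "Above average") is an assumption made here to match the key insight.

```python
def classify_epa_per_carry(epa_per_carry: float) -> str:
    """Map EPA/carry to the interpretation bands listed in the table.

    Which band owns an exact cutoff value is an assumption of this sketch.
    """
    if epa_per_carry > 0.05:
        return "Excellent"
    if epa_per_carry >= 0.00:
        return "Above average"
    if epa_per_carry >= -0.10:
        return "Average"
    if epa_per_carry >= -0.15:
        return "Below average"
    return "Poor"


for value in (0.08, 0.00, -0.05, -0.12, -0.20):
    print(f"{value:+.2f} -> {classify_epa_per_carry(value)}")
```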
Success Rate: Consistency Over Explosiveness
Success rate measures how often a player adds value:
# Define success (EPA > 0)
rushes['success'] = rushes['epa'] > 0
# Alternative: traditional success (40%/50%/100% of yards needed)
def traditional_success(row):
    """Traditional success rate definition."""
    if row['down'] == 1:
        return row['yards_gained'] >= 0.40 * row['ydstogo']
    elif row['down'] == 2:
        return row['yards_gained'] >= 0.50 * row['ydstogo']
    else:  # 3rd or 4th down
        return row['yards_gained'] >= row['ydstogo']
rushes['trad_success'] = rushes.apply(traditional_success, axis=1)
Success rate captures consistency better than YPC:
- A back with 50% success rate consistently moves chains
- A back with 35% success rate often stalls drives
- The difference is larger than their YPC might suggest
The Negative Expected Value Problem
Because average rushing EPA is negative, comparing RBs by EPA can be misleading:
# Example: Two RBs
rb_a = {'carries': 300, 'epa_per_carry': -0.02} # "Good" back
rb_b = {'carries': 150, 'epa_per_carry': -0.08} # "Bad" back
# Total EPA
rb_a_total = 300 * -0.02 # -6 EPA
rb_b_total = 150 * -0.08 # -12 EPA
# RB A looks better, but...
# Both hurt the team compared to passing
This is why analysts argue for reduced rushing volume—even "good" rushing often costs expected points compared to passing alternatives.
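The opportunity-cost argument can be made explicit with a few lines of arithmetic. The sketch below reuses the two hypothetical backs above and assumes a league-average pass is worth about 0.05 EPA, the rough figure from the table in Section 7.1. It also assumes every carry could be swapped for an average pass, which overstates the real cost because defenses would adjust.

```python
AVG_PASS_EPA = 0.05  # assumed league-average EPA per pass (rough figure from Section 7.1)

backs = {
    "RB A": {"carries": 300, "epa_per_carry": -0.02},
    "RB B": {"carries": 150, "epa_per_carry": -0.08},
}


def opportunity_cost(carries, epa_per_carry, pass_epa=AVG_PASS_EPA):
    """Expected points forgone by rushing instead of throwing an average pass."""
    return carries * (pass_epa - epa_per_carry)


costs = {
    name: opportunity_cost(b["carries"], b["epa_per_carry"])
    for name, b in backs.items()
}
for name, cost in costs.items():
    print(f"{name}: {cost:.1f} expected points forgone")
```

Under these assumptions the more efficient back still carries the larger total cost (about 21 points against 19.5), simply because he has twice the carries.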
7.4 Opportunity vs. Production
The Volume-Efficiency Tradeoff
A fundamental challenge in rushing analysis: volume and efficiency often correlate negatively.
# Analyze volume vs efficiency relationship
rb_stats = calculate_rush_epa(pbp, min_carries=50)
# Correlation
corr = rb_stats['carries'].corr(rb_stats['epa_per_carry'])
print(f"Correlation between carries and EPA/carry: {corr:.3f}")
# Typically negative: around -0.20 to -0.40
Why this happens:
- Regression to the mean: High early-season efficiency leads to more carries, then performance regresses
- Game script: Teams ahead run more but face stacked boxes
- Fatigue: More carries may reduce per-carry efficiency
- Defense adjusts: Consistent usage becomes predictable
Yards Created vs. Yards Given
Not all rushing yards are equal. Some yards are "given" by the blocking scheme; others are "created" by the runner:
Yards before contact (YBC): Distance gained before first defender contact
- Primarily reflects O-line and scheme
- Often called "easy yards"
Yards after contact (YAC): Distance gained after initial contact
- More attributable to the runner
- Reflects power, elusiveness, and vision
# If YBC/YAC data is available (Next Gen Stats)
from typing import Optional
def decompose_rushing_production(rushes: pd.DataFrame) -> Optional[pd.DataFrame]:
    """Separate O-line contribution from RB contribution."""
    if 'yards_before_contact' not in rushes.columns:
        print("Yards before/after contact not in dataset")
        return None
    decomposition = (rushes
        .groupby('rusher_player_name')
        .agg(
            carries=('rush_attempt', 'sum'),
            total_yards=('yards_gained', 'sum'),
            ybc=('yards_before_contact', 'sum'),
            yac=('yards_after_contact', 'sum'),
            ybc_per_carry=('yards_before_contact', 'mean'),
            yac_per_carry=('yards_after_contact', 'mean')
        )
    )
    decomposition['pct_ybc'] = decomposition['ybc'] / decomposition['total_yards']
    decomposition['pct_yac'] = decomposition['yac'] / decomposition['total_yards']
    return decomposition
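Because contact data is not part of the standard play-by-play feed, a toy example may help show what the decomposition produces. The numbers below are invented, and the split assumes total yards equal yards before contact plus yards after contact.

```python
# Invented per-carry data: (yards_before_contact, yards_after_contact).
carry_log = {
    "Back behind a strong line": [(4, 1), (3, 0), (5, 2), (2, 1)],
    "Back behind a weak line": [(1, 3), (1, 4), (0, 3), (2, 4)],
}


def contact_split(carries):
    """Return total yards and the share of them gained after contact."""
    ybc = sum(before for before, _ in carries)
    yac = sum(after for _, after in carries)
    total = ybc + yac
    return {
        "total_yards": total,
        "ybc": ybc,
        "yac": yac,
        "pct_yac": yac / total if total else float("nan"),
    }


splits = {name: contact_split(carries) for name, carries in carry_log.items()}
for name, split in splits.items():
    print(f"{name}: {split['total_yards']} yards, {split['pct_yac']:.0%} after contact")
```

Both toy backs finish with the same yardage on the same number of carries, so YPC cannot tell them apart; the share gained after contact can.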
Evaluating Rushing Talent vs. Situation
To isolate RB talent from situation:
- Compare to replacement: How does the RB perform vs. backups in the same system?
- Examine YAC specifically: Yards after contact better reflect individual skill
- Control for down/distance: Compare on similar play types
- Consider box defenders: Performance against stacked boxes shows ability under pressure
from typing import Optional
def analyze_vs_stacked_box(rushes: pd.DataFrame) -> Optional[pd.DataFrame]:
    """Analyze performance against different box counts."""
    # Stacked box: 8+ defenders in box
    if 'defenders_in_box' not in rushes.columns:
        print("Defenders in box not available")
        return None
    rushes = rushes.copy()  # avoid mutating the caller's DataFrame
    rushes['stacked_box'] = rushes['defenders_in_box'] >= 8
    by_box = (rushes
        .groupby(['rusher_player_name', 'stacked_box'])
        .agg(
            carries=('rush_attempt', 'sum'),
            epa=('epa', 'mean'),
            ypc=('yards_gained', 'mean'),
            success=('epa', lambda x: (x > 0).mean())
        )
        .unstack()
    )
    return by_box
7.5 Game Script and Rushing Context
The Score Effect
Game script dramatically affects rushing statistics:
def analyze_game_script_effect(rushes: pd.DataFrame) -> pd.DataFrame:
    """Analyze rushing by score differential."""
    rushes = rushes.copy()  # avoid mutating the caller's DataFrame
    # pd.cut intervals are right-inclusive, e.g. (-7, 0] covers -6 through 0,
    # so the labels spell out exactly which margins land in each bin
    rushes['game_state'] = pd.cut(
        rushes['score_differential'],
        bins=[-100, -14, -7, 0, 7, 14, 100],
        labels=['Down 14+', 'Down 7-13', 'Tied/Down 1-6',
                'Up 1-7', 'Up 8-14', 'Up 15+']
    )
    script_analysis = (rushes
        .groupby('game_state', observed=True)
        .agg(
            carries=('rush_attempt', 'count'),
            epa=('epa', 'mean'),
            ypc=('yards_gained', 'mean'),
            success=('epa', lambda x: (x > 0).mean())
        )
        .round(3)
    )
    return script_analysis
Typical findings:
| Game State | EPA/Carry | YPC | Notes |
|---|---|---|---|
| Down 14+ | Higher | Lower | Defense expects pass |
| Close game | Lower | Average | Normal game flow |
| Up 14+ | Lower | Lower | Defense stacks box, less consequence |
Key insight: Running backs on winning teams accumulate garbage-time carries that are less efficient and have little effect on the outcome, while those on losing teams get fewer chances.
Fourth Quarter Considerations
Late-game rushing presents unique analytical challenges:
def fourth_quarter_analysis(rushes: pd.DataFrame) -> pd.DataFrame:
    """Analyze 4th quarter rushing dynamics."""
    q4 = rushes[rushes['qtr'] == 4].copy()  # copy before adding columns
    # Split by game competitiveness
    q4['competitive'] = q4['score_differential'].abs() <= 7
    q4['running_clock'] = q4['score_differential'] > 0
    analysis = (q4
        .groupby(['competitive', 'running_clock'])
        .agg(
            carries=('rush_attempt', 'count'),
            epa=('epa', 'mean'),
            success=('epa', lambda x: (x > 0).mean())
        )
    )
    return analysis
When a team is running out the clock:
- Success is redefined: Not losing yards and burning clock is success
- EPA is inappropriate: The goal isn't maximizing points but running time
- Volume inflates: Many low-value carries accumulate
Adjusting for Game Context
To fairly evaluate running backs:
def context_adjusted_rushing(rushes: pd.DataFrame) -> pd.DataFrame:
    """Adjust rushing stats for game context."""
    # Filter to "meaningful" carries
    meaningful = rushes[
        (rushes['qtr'] <= 3) |  # First 3 quarters
        (abs(rushes['score_differential']) <= 14)  # Or close in 4th
    ]
    adjusted_stats = (meaningful
        .groupby('rusher_player_name')
        .agg(
            adj_carries=('rush_attempt', 'sum'),
            adj_epa=('epa', 'mean'),
            adj_success=('epa', lambda x: (x > 0).mean())
        )
    )
    # Compare to raw stats
    raw_stats = (rushes
        .groupby('rusher_player_name')
        .agg(
            raw_carries=('rush_attempt', 'sum'),
            raw_epa=('epa', 'mean')
        )
    )
    combined = adjusted_stats.join(raw_stats)
    combined['context_penalty'] = combined['raw_epa'] - combined['adj_epa']
    return combined
7.6 Offensive Line and Scheme Effects
The O-Line Problem
Running back evaluation is inseparable from offensive line evaluation:
def team_rush_blocking_quality(pbp: pd.DataFrame) -> pd.DataFrame:
    """Estimate team rush blocking quality."""
    rushes = pbp[pbp['rush_attempt'] == 1]
    team_rushing = (rushes
        .groupby('posteam')
        .agg(
            carries=('rush_attempt', 'sum'),
            epa=('epa', 'mean'),
            ypc=('yards_gained', 'mean'),
            success_rate=('epa', lambda x: (x > 0).mean()),
            # Stuffed rate: runs for 0 or negative
            stuff_rate=('yards_gained', lambda x: (x <= 0).mean())
        )
        .sort_values('epa', ascending=False)
    )
    return team_rushing
What the O-line controls:
- Initial hole creation
- Yards before contact
- Stuff rate (runs for 0 or loss)
What the RB controls:
- Vision (finding and hitting the hole)
- Yards after contact
- Breakaway speed
Scheme-Based Rushing Styles
Different schemes produce different rushing profiles:
Outside Zone:
- Lateral movement before hitting hole
- Cutback opportunities
- Generally higher YPC
- Requires vision and patience
Inside Zone:
- Attack between the tackles
- Quick decisions
- More consistent, lower variance
Power/Gap:
- Pulling linemen create leads
- Downhill running
- Favors power over speed
from typing import Optional
def analyze_rush_direction(rushes: pd.DataFrame) -> Optional[pd.DataFrame]:
    """Analyze rushing by gap/direction."""
    # Use run_gap and run_location if available
    if 'run_gap' not in rushes.columns:
        # Approximate with description or direction
        return None
    direction_analysis = (rushes
        .groupby('run_gap')
        .agg(
            carries=('rush_attempt', 'count'),
            epa=('epa', 'mean'),
            ypc=('yards_gained', 'mean'),
            success=('epa', lambda x: (x > 0).mean()),
            big_play_rate=('yards_gained', lambda x: (x >= 10).mean())
        )
        .sort_values('epa', ascending=False)
    )
    return direction_analysis
Isolating RB from System
Perfect isolation is impossible, but approximations help:
- Multiple RBs in same system: Compare backs sharing snaps
- Same RB, different systems: Track performance across team changes
- Yards after contact focus: Emphasize RB-attributable production
- Regression models: Control for O-line, scheme, and game factors
def compare_backfield_mates(rushes: pd.DataFrame, team: str) -> pd.DataFrame:
    """Compare RBs sharing the same backfield."""
    team_rushes = rushes[rushes['posteam'] == team]
    comparison = (team_rushes
        .groupby('rusher_player_name')
        .agg(
            carries=('rush_attempt', 'sum'),
            epa=('epa', 'mean'),
            success=('epa', lambda x: (x > 0).mean()),
            ypc=('yards_gained', 'mean')
        )
        .query('carries >= 30')
        .sort_values('epa', ascending=False)
    )
    return comparison
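A full regression is beyond a short example, but the core idea of controlling for the system can be sketched by comparing each back with the rest of his own team's carries. This is a crude stand-in for a model with team effects, not a substitute for one; the play records and the helper name `epa_vs_teammates` are hypothetical.

```python
from collections import defaultdict

# Hypothetical rushing plays: (team, rusher, epa). All values are invented.
plays = [
    ("AAA", "Starter", 0.10), ("AAA", "Starter", -0.20), ("AAA", "Starter", 0.04),
    ("AAA", "Backup", -0.30), ("AAA", "Backup", -0.10),
    ("BBB", "Veteran", -0.25), ("BBB", "Veteran", -0.15),
    ("BBB", "Rookie", -0.05), ("BBB", "Rookie", 0.05),
]


def epa_vs_teammates(plays):
    """Mean EPA per carry minus the mean EPA of teammates' carries."""
    by_team = defaultdict(list)
    for team, rusher, epa in plays:
        by_team[team].append((rusher, epa))
    result = {}
    for team, rows in by_team.items():
        for rusher in {name for name, _ in rows}:
            own = [epa for name, epa in rows if name == rusher]
            others = [epa for name, epa in rows if name != rusher]
            if not others:
                continue  # no teammates to compare against
            result[(team, rusher)] = sum(own) / len(own) - sum(others) / len(others)
    return result


relative = epa_vs_teammates(plays)
for key, value in sorted(relative.items()):
    print(key, round(value, 3))
```

Because both backs on a team run behind the same line and in the same scheme, the difference removes some (not all) of the system effect. Usage patterns still differ between starters and backups, so small samples should be read cautiously.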
7.7 Situational Rushing Value
Where Running Matters
While passing is more efficient on average, specific situations favor running:
Short Yardage (3rd/4th and 1-2)
def short_yardage_analysis(pbp: pd.DataFrame) -> pd.DataFrame:
    """Analyze short yardage rushing."""
    short = pbp[
        (pbp['down'].isin([3, 4])) &
        (pbp['ydstogo'] <= 2) &
        (pbp['play_type'].isin(['pass', 'run']))  # exclude punts, field goals, etc.
    ]
    by_play_type = (short
        .groupby('play_type')
        .agg(
            plays=('play_id', 'count'),
            conversion_rate=('first_down', 'mean'),
            epa=('epa', 'mean'),
            td_rate=('touchdown', 'mean')
        )
    )
    return by_play_type
In short-yardage situations, rushing often performs better:
- Shorter required gain reduces variance advantage of passing
- Lower interception risk
- Physical running is more reliable
Goal Line (Inside the 5)
def goal_line_analysis(pbp: pd.DataFrame) -> pd.DataFrame:
    """Analyze goal line rushing."""
    goal_line = pbp[
        (pbp['yardline_100'] <= 5) &
        (pbp['play_type'].isin(['pass', 'run']))  # exclude field goals, etc.
    ]
    gl_by_type = (goal_line
        .groupby('play_type')
        .agg(
            plays=('play_id', 'count'),
            td_rate=('touchdown', 'mean'),
            epa=('epa', 'mean')
        )
    )
    return gl_by_type
Goal-line rushing has unique dynamics:
- Compressed field limits defensive options
- Physical power becomes premium
- Play-action becomes more effective
Late-Game Clock Management
When protecting leads, rushing serves non-scoring purposes:
def clock_killing_value(rushes: pd.DataFrame) -> pd.DataFrame:
    """Assess rushing in clock-killing situations."""
    clock_kill = rushes[
        (rushes['qtr'] == 4) &
        (rushes['score_differential'] > 0) &
        (rushes['score_differential'] <= 14) &
        (rushes['half_seconds_remaining'] <= 300)  # Last 5 minutes
    ].copy()  # copy before adding columns
    # Adjusted success: don't fumble, don't lose big
    clock_kill['clock_success'] = (
        (clock_kill['yards_gained'] >= -2) &
        (clock_kill['fumble_lost'] != 1)
    )
    analysis = (clock_kill
        .groupby('rusher_player_name')
        .agg(
            carries=('rush_attempt', 'sum'),
            clock_success=('clock_success', 'mean'),
            fumbles=('fumble_lost', 'sum')
        )
        .query('carries >= 10')
    )
    return analysis
Identifying Situational Specialists
Some backs excel in specific roles:
def identify_specialists(rushes: pd.DataFrame, min_carries: int = 50) -> pd.DataFrame:
    """Identify RB specialization roles."""
    # Short yardage carries
    short = rushes[rushes['ydstogo'] <= 2]
    short_specialists = (short
        .groupby('rusher_player_name')
        .agg(short_carries=('rush_attempt', 'sum'),
             short_success=('epa', lambda x: (x > 0).mean()))
    )
    # Receiving backs: not implemented here (would need reception data)
    # Overall volume
    volume = (rushes
        .groupby('rusher_player_name')
        .agg(total_carries=('rush_attempt', 'sum'))
        .query(f'total_carries >= {min_carries}')
    )
    specialists = volume.join(short_specialists)
    # Flag short-yardage specialists (backs with no short-yardage carries
    # have NaN here, which compares as False)
    specialists['short_yardage_specialist'] = (
        (specialists['short_carries'] >= 20) &
        (specialists['short_success'] > 0.55)
    )
    return specialists
7.8 Receiving Value for Running Backs
The Dual-Threat Premium
In modern NFL offenses, receiving ability separates elite backs from replacement-level:
def rushing_and_receiving_value(pbp: pd.DataFrame, min_touches: int = 100) -> pd.DataFrame:
    """Calculate combined rushing and receiving value."""
    # Rushing stats
    rushes = pbp[pbp['rush_attempt'] == 1]
    rush_stats = (rushes
        .groupby('rusher_player_name')
        .agg(
            carries=('rush_attempt', 'sum'),
            rush_epa=('epa', 'sum'),
            rush_epa_per=('epa', 'mean')
        )
    )
    # Receiving stats for RBs
    targets = pbp[(pbp['pass_attempt'] == 1)]
    # Filter to RB targets (would need position data)
    rec_stats = (targets
        .groupby('receiver_player_name')
        .agg(
            targets=('pass_attempt', 'sum'),
            receptions=('complete_pass', 'sum'),
            rec_epa=('epa', 'sum'),
            rec_epa_per=('epa', 'mean')
        )
    )
    # Join (matching player names)
    combined = rush_stats.join(rec_stats, how='outer').fillna(0)
    combined['total_touches'] = combined['carries'] + combined['targets']
    combined['total_epa'] = combined['rush_epa'] + combined['rec_epa']
    combined['epa_per_touch'] = combined['total_epa'] / combined['total_touches']
    return combined.query(f'total_touches >= {min_touches}')
Why Receiving Matters More
RB receptions are often more valuable than rushes because:
- Higher EPA per play: Short passes are more efficient than short runs
- Mismatch creation: RB-versus-linebacker matchups favor the offense
- Versatility value: Forces defense to respect multiple threats
- Third-down utility: Receiving backs stay on field in passing situations
Pass-Catching Metrics for RBs
def rb_receiving_analysis(pbp: pd.DataFrame, rb_names: list = None) -> pd.DataFrame:
    """Analyze RB receiving contributions."""
    # Start from all targets, not only completions, so catch rate is meaningful
    targets = pbp[(pbp['pass_attempt'] == 1) & pbp['receiver_player_name'].notna()]
    if rb_names:
        targets = targets[targets['receiver_player_name'].isin(rb_names)]
    rb_receiving = (targets
        .groupby('receiver_player_name')
        .agg(
            targets=('pass_attempt', 'sum'),
            receptions=('complete_pass', 'sum'),
            yards=('yards_gained', 'sum'),
            yac=('yards_after_catch', 'sum'),
            epa=('epa', 'sum'),
            epa_per_target=('epa', 'mean'),
            first_downs=('first_down', 'sum')
        )
    )
    rb_receiving['catch_rate'] = rb_receiving['receptions'] / rb_receiving['targets']
    rb_receiving['yac_per_rec'] = rb_receiving['yac'] / rb_receiving['receptions']
    return rb_receiving
7.9 Comprehensive RB Evaluation Framework
Multi-Metric Evaluation
No single metric captures RB value. A comprehensive evaluation includes:
class RBEvaluator:
    """Comprehensive running back evaluation framework."""
    def __init__(self, pbp: pd.DataFrame, min_carries: int = 100):
        self.pbp = pbp
        self.min_carries = min_carries
        self.rushes = pbp[pbp['rush_attempt'] == 1]
    def calculate_all_metrics(self) -> pd.DataFrame:
        """Calculate comprehensive RB metrics."""
        metrics = (self.rushes
            .groupby('rusher_player_name')
            .agg(
                # Volume
                carries=('rush_attempt', 'sum'),
                yards=('yards_gained', 'sum'),
                # Efficiency
                epa_total=('epa', 'sum'),
                epa_per_carry=('epa', 'mean'),
                ypc=('yards_gained', 'mean'),
                success_rate=('epa', lambda x: (x > 0).mean()),
                # Scoring
                touchdowns=('rush_touchdown', 'sum'),
                td_rate=('rush_touchdown', 'mean'),
                # Ball security
                fumbles=('fumble_lost', 'sum'),
                fumble_rate=('fumble_lost', 'mean'),
                # Explosiveness
                big_runs=('yards_gained', lambda x: (x >= 10).sum()),
                explosive_rate=('yards_gained', lambda x: (x >= 10).mean()),
                long_run=('yards_gained', 'max'),
                # Consistency
                stuff_rate=('yards_gained', lambda x: (x <= 0).mean()),
                median_gain=('yards_gained', 'median')
            )
            .query(f'carries >= {self.min_carries}')
        )
        # Add rankings
        metrics['epa_rank'] = metrics['epa_per_carry'].rank(ascending=False)
        metrics['success_rank'] = metrics['success_rate'].rank(ascending=False)
        return metrics.sort_values('epa_per_carry', ascending=False)
    def situational_breakdown(self, rb_name: str) -> dict:
        """Generate situational performance breakdown."""
        rb_rushes = self.rushes[self.rushes['rusher_player_name'] == rb_name]
        if len(rb_rushes) < 50:
            return {"error": "Insufficient sample size"}
        breakdown = {
            'overall': {
                'carries': len(rb_rushes),
                'epa': rb_rushes['epa'].mean(),
                'success': (rb_rushes['epa'] > 0).mean()
            },
            'by_down': {},
            'by_quarter': {},
            'by_score': {}
        }
        # By down
        for down in [1, 2, 3]:
            down_rushes = rb_rushes[rb_rushes['down'] == down]
            if len(down_rushes) >= 20:
                breakdown['by_down'][f'down_{down}'] = {
                    'carries': len(down_rushes),
                    'epa': down_rushes['epa'].mean(),
                    'success': (down_rushes['epa'] > 0).mean()
                }
        # By quarter
        for qtr in [1, 2, 3, 4]:
            qtr_rushes = rb_rushes[rb_rushes['qtr'] == qtr]
            if len(qtr_rushes) >= 15:
                breakdown['by_quarter'][f'Q{qtr}'] = {
                    'carries': len(qtr_rushes),
                    'epa': qtr_rushes['epa'].mean()
                }
        # By game score
        ahead = rb_rushes[rb_rushes['score_differential'] > 7]
        behind = rb_rushes[rb_rushes['score_differential'] < -7]
        close = rb_rushes[abs(rb_rushes['score_differential']) <= 7]
        breakdown['by_score'] = {
            'ahead': {'carries': len(ahead), 'epa': ahead['epa'].mean() if len(ahead) > 0 else 0},
            'behind': {'carries': len(behind), 'epa': behind['epa'].mean() if len(behind) > 0 else 0},
            'close': {'carries': len(close), 'epa': close['epa'].mean() if len(close) > 0 else 0}
        }
        return breakdown
    def generate_report(self, rb_name: str) -> str:
        """Generate text evaluation report."""
        metrics = self.calculate_all_metrics()
        if rb_name not in metrics.index:
            return f"RB {rb_name} not found or doesn't meet minimum carries"
        rb = metrics.loc[rb_name]
        n_rbs = len(metrics)
        situational = self.situational_breakdown(rb_name)
        report = f"""
========================================
RB EVALUATION REPORT: {rb_name}
========================================
VOLUME: {int(rb['carries'])} carries, {int(rb['yards'])} yards
EFFICIENCY:
  EPA/Carry: {rb['epa_per_carry']:.3f} (Rank: {int(rb['epa_rank'])}/{n_rbs})
  YPC: {rb['ypc']:.1f}
  Success Rate: {rb['success_rate']*100:.1f}% (Rank: {int(rb['success_rank'])}/{n_rbs})
EXPLOSIVENESS:
  10+ Yard Runs: {int(rb['big_runs'])} ({rb['explosive_rate']*100:.1f}%)
  Longest Run: {int(rb['long_run'])} yards
CONSISTENCY:
  Stuff Rate (0 or less): {rb['stuff_rate']*100:.1f}%
  Median Gain: {rb['median_gain']:.1f} yards
BALL SECURITY:
  Fumbles Lost: {int(rb['fumbles'])}
  Fumble Rate: {rb['fumble_rate']*100:.2f}%
SCORING:
  Touchdowns: {int(rb['touchdowns'])}
SITUATIONAL:
"""
        if 'by_score' in situational:
            scores = situational['by_score']
            report += f"""
  When Ahead (7+): {scores['ahead']['carries']} carries, {scores['ahead']['epa']:.3f} EPA
  Close Game: {scores['close']['carries']} carries, {scores['close']['epa']:.3f} EPA
  When Behind: {scores['behind']['carries']} carries, {scores['behind']['epa']:.3f} EPA
"""
        # Assessment
        report += "\nASSESSMENT:\n"
        if rb['epa_per_carry'] > 0:
            report += "  - Above-average efficiency (positive EPA rare for RBs)\n"
        if rb['success_rate'] > 0.45:
            report += "  - Highly consistent (45%+ success rate)\n"
        if rb['explosive_rate'] > 0.12:
            report += "  - Explosive threat (12%+ big play rate)\n"
        if rb['fumble_rate'] < 0.005:
            report += "  - Excellent ball security\n"
        if rb['stuff_rate'] > 0.22:
            report += "  - Concern: High stuff rate (may be O-line or vision)\n"
        return report
The Value Hierarchy
Based on analytics, RB value comes from (in order):
- Receiving ability: Highest EPA per opportunity
- Efficiency in meaningful situations: Close-game success rate
- Ball security: Fumbles have massive negative EPA
- Short-yardage conversion: Situational value
- Volume on good teams: Wins correlate with rushing volume (but causation is reversed)
7.10 Practical Applications
Draft and Contract Valuation
Analytics has transformed how teams value running backs:
def expected_value_calculation(rb_stats: pd.DataFrame,
                               replacement_epa: float = -0.08,
                               epa_per_win: float = 10.0,
                               value_per_win_m: float = 2.5) -> pd.DataFrame:
    """Estimate RB value for contract purposes."""
    # Defaults are very rough placeholders (1 win ≈ 10 EPA, 1 win ≈ $2-3M on market);
    # calibrate them before drawing conclusions
    rb_stats = rb_stats.copy()
    rb_stats['estimated_wins_above_replacement'] = (
        (rb_stats['epa_total'] - (replacement_epa * rb_stats['carries']))  # vs replacement
    ) / epa_per_win
    rb_stats['estimated_value_M'] = rb_stats['estimated_wins_above_replacement'] * value_per_win_m
    return rb_stats[['carries', 'epa_total', 'estimated_wins_above_replacement', 'estimated_value_M']]
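As a worked example of the arithmetic, take a hypothetical back with 250 carries and -5.0 total rushing EPA. Because the result scales directly with the EPA-per-win conversion, the sketch evaluates it under two values: the very rough 10 used above and a second figure of 30 that is assumed here purely to show the sensitivity. The replacement level and the $2.5M per win are the same rough assumptions as above.

```python
REPLACEMENT_EPA_PER_CARRY = -0.08  # assumed replacement level, as in the text
VALUE_PER_WIN_M = 2.5              # very rough market price per win in $M, as in the text


def wins_above_replacement(carries, epa_total, epa_per_win):
    """EPA above a replacement back on the same carries, converted to wins."""
    replacement_epa = REPLACEMENT_EPA_PER_CARRY * carries
    return (epa_total - replacement_epa) / epa_per_win


# Hypothetical back: 250 carries, -5.0 total rushing EPA.
# 10.0 is the rough conversion used in the text; 30.0 is an assumed
# alternative included only to show how much the answer moves.
estimates = {
    epa_per_win: wins_above_replacement(250, -5.0, epa_per_win)
    for epa_per_win in (10.0, 30.0)
}
for epa_per_win, wins in estimates.items():
    value_m = wins * VALUE_PER_WIN_M
    print(f"{epa_per_win:.0f} EPA per win -> {wins:.2f} wins, ${value_m:.2f}M")
```

The back is 15 EPA better than replacement either way; whether that is worth 1.5 wins or 0.5 wins depends entirely on the conversion, which is why the constants deserve calibration before they inform a contract decision.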
Key insights for RB valuation:
- Don't pay for volume: Yards and TDs are available cheaply
- Pay for receiving: Dual-threat backs command premiums
- Avoid long contracts: Short peak windows create risk
- Draft for value: Day 2-3 picks can produce starting quality
Team Rushing Strategy Analysis
def analyze_team_rush_strategy(pbp: pd.DataFrame, team: str) -> dict:
    """Analyze a team's rushing strategy."""
    # Restrict to offensive pass/run plays so rates aren't diluted by kicks or penalties
    team_plays = pbp[(pbp['posteam'] == team) & (pbp['play_type'].isin(['pass', 'run']))]
    rushes = team_plays[team_plays['rush_attempt'] == 1]
    strategy = {
        'rush_rate': rushes.shape[0] / team_plays.shape[0],
        'rush_epa': rushes['epa'].mean(),
        'first_down_rush_rate': (
            team_plays[team_plays['down'] == 1]['rush_attempt'].mean()
        ),
        'second_and_long_rush_rate': (
            team_plays[(team_plays['down'] == 2) & (team_plays['ydstogo'] >= 7)]['rush_attempt'].mean()
        ),
        'leading_rush_rate': (
            team_plays[team_plays['score_differential'] > 7]['rush_attempt'].mean()
        ),
        'behind_rush_rate': (
            team_plays[team_plays['score_differential'] < -7]['rush_attempt'].mean()
        )
    }
    return strategy
Identifying Undervalued Backs
def find_undervalued_rbs(pbp: pd.DataFrame) -> pd.DataFrame:
    """Identify potentially undervalued running backs."""
    rb_stats = RBEvaluator(pbp, min_carries=50).calculate_all_metrics()
    # High efficiency, low volume = potentially underused
    undervalued = rb_stats[
        (rb_stats['epa_per_carry'] > 0) &
        (rb_stats['carries'] < 150) &
        (rb_stats['success_rate'] > 0.45)
    ].sort_values('epa_per_carry', ascending=False)
    return undervalued
Chapter Summary
Key Takeaways
- Passing is more efficient than rushing on average by about 0.10 EPA per play
- YPC is misleading: It ignores context, rewards volatility, and conflates O-line with RB
- EPA and success rate provide better efficiency measures for rushers
- Game script inflates volume stats: Winning teams rush more in low-leverage situations
- The O-line drives a significant portion of rushing production (yards before contact)
- Receiving ability is the skill that most differentiates RB value in the modern NFL
- Situational rushing matters: Short yardage, goal line, and clock management
- RB contracts are often bad investments: Short careers and replaceability
Common Analytical Mistakes
| Mistake | Better Approach |
|---|---|
| Using only YPC | Add success rate and EPA |
| Ignoring game script | Filter to competitive situations |
| Crediting RB for all yards | Separate YBC from YAC |
| Volume = value | Efficiency matters more |
| TD totals | TD rate, context-adjusted |
Looking Ahead
Chapter 8 explores receiving analytics, examining how to evaluate pass-catchers in a passing-dominated league. We'll learn about target share, efficiency metrics, separation, and the contested catch rate—metrics that increasingly drive offensive success.
Practice Exercises
See the accompanying exercises.md file for hands-on practice problems ranging from basic EPA calculations to comprehensive RB evaluation systems.
Further Reading
See further-reading.md for academic papers, industry resources, and data sources for advanced rushing analytics.