Case Study: The Workhorse RB Debate
Scenario
The Cleveland Browns have a decision to make. Their star running back, who led the team in rushing attempts for three consecutive seasons, is entering free agency. The analytics department has been tasked with answering: Should the team re-sign their workhorse RB, or can they replicate his production more cheaply?
The RB's statistics over the past season:
- 285 carries
- 1,156 rushing yards (4.06 YPC)
- 10 rushing touchdowns
- 32 receptions, 285 receiving yards, 2 receiving TDs
His agent is seeking a 4-year, $56 million contract with $32 million guaranteed.
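As a quick sanity check, the headline numbers in the scenario can be reproduced with a few lines of arithmetic. This is a minimal sketch using only the figures quoted above; the variable names are illustrative, and dollar amounts are in millions.

```python
# Figures taken from the scenario; dollar amounts are in millions.
carries, rush_yards = 285, 1156
contract_years, contract_total, guaranteed = 4, 56.0, 32.0

yards_per_carry = rush_yards / carries              # rushing efficiency
avg_annual_value = contract_total / contract_years  # what the deal costs per season
guaranteed_share = guaranteed / contract_total      # fraction of the deal that is guaranteed

print(f"YPC: {yards_per_carry:.2f}")
print(f"Average annual value: ${avg_annual_value:.1f}M")
print(f"Guaranteed share: {guaranteed_share:.1%}")
```

The $14M average annual value is the per-year figure the valuation sections compare against.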
Data Gathering
First, let's pull the relevant data:
import nfl_data_py as nfl
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Load data
pbp = nfl.import_pbp_data([2023])
# Filter to rushing plays
rushes = pbp[pbp['rush_attempt'] == 1].copy()
# Team's rushing data
team_rushes = rushes[rushes['posteam'] == 'CLE']
# Get all RB data for comparison
rb_stats = rushes.groupby('rusher_player_name').agg(
carries=('rush_attempt', 'sum'),
yards=('yards_gained', 'sum'),
ypc=('yards_gained', 'mean'),
epa_total=('epa', 'sum'),
epa_per_carry=('epa', 'mean'),
success_rate=('epa', lambda x: (x > 0).mean()),
td=('rush_touchdown', 'sum'),
fumbles=('fumble_lost', 'sum')
).query('carries >= 100').sort_values('carries', ascending=False)
print("Top 15 RBs by Volume:")
print(rb_stats.head(15)[['carries', 'yards', 'ypc', 'epa_per_carry', 'success_rate', 'td']].round(3))
Analysis 1: Traditional vs Advanced Metrics
Traditional View
By traditional metrics, our workhorse looks solid:
- Top 8 in rushing yards
- Top 5 in rushing touchdowns
- YPC of 4.06 (near league average)
This suggests a valuable, productive player worth re-signing.
Analytics View
# Compare traditional and advanced rankings
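# method='min' gives whole-number ranks when players tie
rb_stats['yards_rank'] = rb_stats['yards'].rank(ascending=False, method='min')
rb_stats['td_rank'] = rb_stats['td'].rank(ascending=False, method='min')
rb_stats['epa_rank'] = rb_stats['epa_per_carry'].rank(ascending=False, method='min')
rb_stats['success_rank'] = rb_stats['success_rate'].rank(ascending=False, method='min')
# Example name; the player must appear in rb_stats (100+ carries in the loaded
# season), otherwise .loc raises a KeyError
workhorse = rb_stats.loc['N.Chubb']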
print(f"\nWorkhorse Rankings:")
print(f" Yards Rank: {int(workhorse['yards_rank'])}")
print(f" TD Rank: {int(workhorse['td_rank'])}")
print(f" EPA/Carry Rank: {int(workhorse['epa_rank'])}")
print(f" Success Rate Rank: {int(workhorse['success_rank'])}")
Finding: The workhorse ranks 7th in yards but only 18th in EPA per carry and 22nd in success rate. High volume is masking low efficiency.
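To make the gap between volume and efficiency concrete, here is a minimal sketch on made-up data. The per-carry values and the `summarize` helper are hypothetical and do not come from the play-by-play feed. Success is defined as EPA > 0, the same definition the `success_rate` aggregation uses.

```python
# Hypothetical per-carry records as (yards, epa); not real play-by-play data.
high_volume = [(3, -0.20), (4, -0.05), (2, -0.35), (9, 0.40), (3, -0.10), (5, 0.10)]
low_volume = [(6, 0.30), (1, -0.40), (7, 0.45)]

def summarize(carries):
    """Return volume and efficiency metrics for a list of (yards, epa) carries."""
    epa = [e for _, e in carries]
    return {
        "carries": len(carries),
        "yards": sum(y for y, _ in carries),
        "epa_per_carry": sum(epa) / len(epa),
        "success_rate": sum(e > 0 for e in epa) / len(epa),  # share of carries with EPA > 0
    }

high = summarize(high_volume)
low = summarize(low_volume)
for label, stats in (("High volume", high), ("Low volume", low)):
    print(f"{label}: {stats['yards']} yards, "
          f"{stats['epa_per_carry']:+.3f} EPA/carry, "
          f"{stats['success_rate']:.0%} success")
```

The high-volume back leads in total yards simply because he has more carries, while trailing on both per-carry measures. That is the pattern the rankings above point to.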
Analysis 2: Game Script Decomposition
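# Split by game state. score_differential is an integer, so these bins give
# Behind (down 7+), Close (within 6 points), and Ahead (up 7+).
# assign() returns a new frame, which avoids pandas' SettingWithCopyWarning.
team_rushes = team_rushes.assign(
    game_state=pd.cut(
        team_rushes['score_differential'],
        bins=[-100, -7, 6, 100],
        labels=['Behind', 'Close', 'Ahead']
    )
)
# Analyze by game state (observed=False keeps all three states for every rusher)
game_script = team_rushes.groupby(
    ['rusher_player_name', 'game_state'], observed=False
).agg(
    carries=('rush_attempt', 'count'),
    epa=('epa', 'mean')
).unstack()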
print("\nWorkhorse by Game State:")
print(game_script.loc['N.Chubb'])
Finding:
- 42% of carries came when ahead by 7+ (clock-killing mode)
- EPA per carry was -0.12 when ahead (low-value carries)
- EPA per carry was +0.05 in close games (actual competitive value)
The workhorse was accumulating volume in situations where rushing efficiency naturally suffers.
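The size of the game-script drag can be estimated with a weighted-average identity. The sketch below assumes that the -0.04 overall EPA/carry quoted in the backfield comparison and the game-state figures above describe the same set of carries. The helper `epa_outside_state` is a hypothetical name introduced for illustration.

```python
def epa_outside_state(overall_epa: float, share_in_state: float, epa_in_state: float) -> float:
    """Back out the average EPA/carry on all carries outside one game state.

    Rearranges the weighted-average identity
        overall = share * epa_in_state + (1 - share) * epa_outside
    """
    return (overall_epa - share_in_state * epa_in_state) / (1 - share_in_state)

# 42% of carries at -0.12 EPA/carry came while ahead; -0.04 is the overall figure.
non_ahead_epa = epa_outside_state(overall_epa=-0.04, share_in_state=0.42, epa_in_state=-0.12)
print(f"EPA/carry when not ahead: {non_ahead_epa:+.3f}")
```

Under these assumptions, the carries taken while not ahead average slightly above zero, so the negative overall number is driven largely by the clock-killing carries.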
Analysis 3: Backfield Comparison
# Compare all Cleveland RBs
cle_rbs = team_rushes.groupby('rusher_player_name').agg(
carries=('rush_attempt', 'sum'),
epa=('epa', 'mean'),
success=('epa', lambda x: (x > 0).mean()),
ypc=('yards_gained', 'mean')
).query('carries >= 20').sort_values('epa', ascending=False)
print("\nCleveland RB Comparison:")
print(cle_rbs.round(3))
Finding: The backup RB had:
- 68 carries with 0.02 EPA/carry (better than the workhorse's -0.04)
- Higher success rate (47% vs 41%)
- Similar YPC in the same system
Same offensive line, same scheme, different results. This suggests the workhorse isn't outperforming replacement options.
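Because the backup's sample is only 68 carries, it helps to gauge how much of the success-rate gap could be noise. The sketch below applies a pooled two-proportion z-test to the quoted rates. It treats each carry as an independent trial, which is a simplifying assumption, and `two_proportion_z` is a helper written for this illustration.

```python
import math

def two_proportion_z(p1: float, n1: int, p2: float, n2: int) -> float:
    """Pooled two-proportion z-statistic for the difference p1 - p2."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / std_err

# Backup: 47% success on 68 carries; workhorse: 41% success on 285 carries.
z = two_proportion_z(p1=0.47, n1=68, p2=0.41, n2=285)
print(f"z = {z:.2f}")
```

A z-statistic this far below the conventional 1.96 threshold means the data cannot show that the backup is truly better. They are still consistent with the narrower claim that the workhorse is not clearly outperforming him.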
Analysis 4: Receiving Value
# Add receiving contribution
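targets = pbp[pbp['pass_attempt'] == 1]
# No minimum-target filter here: the left join below keeps only qualifying rushers,
# and filtering first would zero out the receiving of backs with few targets.
rb_receiving = targets.groupby('receiver_player_name').agg(
    targets=('pass_attempt', 'count'),
    receptions=('complete_pass', 'sum'),
    rec_yards=('yards_gained', 'sum'),
    rec_epa=('epa', 'sum')
)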
# Combine with rushing
combined = rb_stats.join(rb_receiving, how='left').fillna(0)
combined['total_epa'] = combined['epa_total'] + combined['rec_epa']
combined['total_touches'] = combined['carries'] + combined['receptions']
combined['epa_per_touch'] = combined['total_epa'] / combined['total_touches']
print("\nTotal Touch Efficiency (Top 10):")
print(combined.nlargest(10, 'epa_per_touch')[
['carries', 'receptions', 'total_epa', 'epa_per_touch']
].round(3))
Finding: When including receiving, our workhorse ranks 15th in EPA per touch. His receiving contribution (32 catches, 285 yards) is modest compared to top dual-threat backs.
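To put "modest" in numbers, the receiving share of the workhorse's workload can be computed from the stat line in the scenario. This is a minimal sketch with illustrative variable names.

```python
# Season stat line from the scenario.
carries, rush_yards = 285, 1156
receptions, rec_yards = 32, 285

touches = carries + receptions
scrimmage_yards = rush_yards + rec_yards

reception_share = receptions / touches          # share of touches that were catches
rec_yard_share = rec_yards / scrimmage_yards    # share of yards gained as a receiver

print(f"Receptions as share of touches: {reception_share:.1%}")
print(f"Receiving share of scrimmage yards: {rec_yard_share:.1%}")
print(f"Yards per reception: {rec_yards / receptions:.1f}")
```

Roughly one touch in ten was a reception, so his overall per-touch efficiency is dominated by the rushing numbers.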
Analysis 5: Replacement Level Calculation
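# Define replacement level (25th percentile efficiency among rushers with 50+ carries).
# rb_stats already requires 100+ carries, so rebuild the pool from all rushes.
pool = rushes.groupby('rusher_player_name')['epa'].agg(['count', 'mean'])
replacement_epa = pool.loc[pool['count'] >= 50, 'mean'].quantile(0.25)
print(f"Replacement Level EPA/Carry: {replacement_epa:.3f}")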
# Calculate value over replacement
rb_stats['epa_over_replacement'] = (
(rb_stats['epa_per_carry'] - replacement_epa) * rb_stats['carries']
)
# Convert to estimated wins
rb_stats['wins_over_replacement'] = rb_stats['epa_over_replacement'] / 10 # ~10 EPA per win
print("\nWorkhorse Value Over Replacement:")
workhorse_vor = rb_stats.loc['N.Chubb', 'epa_over_replacement']
workhorse_war = rb_stats.loc['N.Chubb', 'wins_over_replacement']
print(f" EPA over replacement: {workhorse_vor:.1f}")
print(f" Estimated wins added: {workhorse_war:.2f}")
Finding: Despite 285 carries, the workhorse provides only ~0.3 wins above replacement. At $14M/year, this is poor value (each win costs ~$47M via this player).
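The arithmetic behind this finding can be traced end to end. The sketch below uses the same formula as the code above, with the ~10 EPA per win conversion. It takes the -0.04 EPA/carry from the backfield comparison and back-solves the replacement level implied by the ~0.3 wins figure. The implied replacement value is therefore derived, not a number reported in the analysis.

```python
EPA_PER_WIN = 10.0  # conversion used in the analysis code (~10 EPA per win)

def wins_over_replacement(epa_per_carry: float, replacement_epa: float, carries: int) -> float:
    """Per-carry margin over replacement, scaled by volume and converted to wins."""
    return (epa_per_carry - replacement_epa) * carries / EPA_PER_WIN

carries = 285
workhorse_epa = -0.04   # EPA/carry quoted in the backfield comparison
war = 0.3               # wins over replacement from the finding
salary = 14.0           # asking price, $M per year

# Replacement-level EPA/carry implied by the figures above (derived, not reported).
implied_replacement = workhorse_epa - war * EPA_PER_WIN / carries
cost_per_win = salary / war

print(f"Implied replacement EPA/carry: {implied_replacement:.3f}")
print(f"Check: {wins_over_replacement(workhorse_epa, implied_replacement, carries):.2f} wins")
print(f"Cost per win: ${cost_per_win:.1f}M")
```

A margin of about 0.01 EPA per carry over replacement, even across 285 carries, adds up to only about 3 EPA, which is why the wins figure is so small.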
Analysis 6: Contract Valuation
# Market value estimation
def estimate_rb_value(war: float, market_rate_per_win: float = 2.5) -> float:
    """Estimate RB market value ($M/year) from wins added, at a rate in $M per win."""
    return war * market_rate_per_win
workhorse_value = estimate_rb_value(workhorse_war)
print(f"\nWorkhorse Fair Value Estimate: ${workhorse_value:.1f}M/year")
print(f"Asking Price: $14M/year")
print(f"Overpay: ${14 - workhorse_value:.1f}M/year")
Finding: Based on wins added, fair value is approximately $0.75M/year. The $14M asking price represents ~$13M annual overpay.
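The fair-value estimate depends directly on the assumed market rate per win, so it is worth checking how sensitive the conclusion is to that input. In the sketch below, 2.5 is the default rate from the function above, while 5.0 and 10.0 are hypothetical stress-test values. `fair_value` mirrors `estimate_rb_value` so the block runs on its own.

```python
def fair_value(war: float, rate_per_win: float) -> float:
    """Fair annual salary in $M, given wins added and a $M-per-win rate."""
    return war * rate_per_win

WAR = 0.3       # wins over replacement, from the replacement-level finding
ASKING = 14.0   # asking price in $M/year
YEARS = 4       # contract length sought by the agent

# 2.5 is the default rate used above; 5.0 and 10.0 are hypothetical stress tests.
total_overpay = {}
for rate in (2.5, 5.0, 10.0):
    value = fair_value(WAR, rate)
    total_overpay[rate] = (ASKING - value) * YEARS
    print(f"rate ${rate:>4.1f}M/win: fair value ${value:.2f}M/yr, "
          f"overpay over {YEARS} years ${total_overpay[rate]:.1f}M")
```

Even at four times the default rate, the estimated fair value stays far below the asking price, so the recommendation does not hinge on the exact rate chosen.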
Recommendation
Do Not Re-Sign at Asking Price
Evidence:
1. Efficiency below average: 18th in EPA/carry despite 7th in rushing yards
2. Inflated by game script: 42% of carries in low-value clock-killing situations
3. Replaceable internally: Backup has better efficiency in same system
4. Modest receiving: Not a difference-making dual-threat
5. Poor value: $14M for ~0.3 wins over replacement
Alternative Approaches
- Sign a cheaper veteran ($3-5M)
  - Target efficient backups from other teams
  - Get 80% of production at 30% of cost
- Draft Day 2-3 RB ($1-2M avg)
  - Rookie RBs often produce immediately
  - 4 years of cost control
- RBBC approach ($4-6M total)
  - Split carries among 2-3 backs
  - Maintain freshness, reduce injury risk
- Counter-offer ($8M max, 2 years)
  - If team insists on workhorse model
  - Shorter term limits downside
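The options can be compared on a common scale by dividing annual cost by wins added. The sketch below reuses the illustrative WAR and cost figures that the Visualization section hardcodes for its cost-benefit panel. They are placeholders for discussion, not measured values.

```python
# (wins over replacement, annual cost in $M), taken from the illustrative
# values plotted in the cost-benefit panel of the Visualization section.
options = {
    "Workhorse": (0.3, 14.0),
    "Backup A": (0.2, 2.0),
    "Draft Pick": (0.2, 1.5),
    "Vet FA": (0.15, 4.0),
}

# Dollars (in $M) paid per win added, for each option.
cost_per_win = {name: cost / war for name, (war, cost) in options.items()}

for name, cpw in sorted(cost_per_win.items(), key=lambda kv: kv[1]):
    print(f"{name:<11} ${cpw:5.1f}M per win")

cheapest = min(cost_per_win, key=cost_per_win.get)
print(f"Cheapest per win: {cheapest}")
```

With these placeholder inputs, every alternative buys wins at a fraction of the workhorse's price.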
Visualization
fig, axes = plt.subplots(2, 2, figsize=(12, 10))
# Plot 1: Carries vs EPA
ax1 = axes[0, 0]
ax1.scatter(rb_stats['carries'], rb_stats['epa_per_carry'], alpha=0.7)
ax1.axhline(y=0, color='gray', linestyle='--')
ax1.scatter([285], [-0.04], color='red', s=100, zorder=5, label='Workhorse')
ax1.set_xlabel('Carries')
ax1.set_ylabel('EPA per Carry')
ax1.set_title('Volume vs Efficiency')
ax1.legend()
# Plot 2: Success Rate by Game State
ax2 = axes[0, 1]
states = ['Behind', 'Close', 'Ahead']
workhorse_success = [0.38, 0.44, 0.40]
league_avg = [0.44, 0.42, 0.41]
x = np.arange(len(states))
ax2.bar(x - 0.2, workhorse_success, 0.4, label='Workhorse')
ax2.bar(x + 0.2, league_avg, 0.4, label='League Avg')
ax2.set_xticks(x)
ax2.set_xticklabels(states)
ax2.set_ylabel('Success Rate')
ax2.set_title('Success by Game State')
ax2.legend()
# Plot 3: EPA Distribution
ax3 = axes[1, 0]
workhorse_rushes = team_rushes[team_rushes['rusher_player_name'] == 'N.Chubb']
ax3.hist(workhorse_rushes['epa'], bins=30, edgecolor='black', alpha=0.7)
ax3.axvline(x=0, color='red', linestyle='--', label='Break-even')
ax3.set_xlabel('EPA')
ax3.set_ylabel('Frequency')
ax3.set_title('EPA Distribution')
ax3.legend()
# Plot 4: Value vs Cost
ax4 = axes[1, 1]
rbs = ['Workhorse', 'Backup A', 'Draft Pick', 'Vet FA']
values = [0.3, 0.2, 0.2, 0.15]
costs = [14, 2, 1.5, 4]
ax4.barh(rbs, values, color='green', alpha=0.7, label='WAR')
ax4.barh(rbs, [-c/20 for c in costs], color='red', alpha=0.7, label='Cost/20')
ax4.axvline(x=0, color='black')
ax4.set_xlabel('Value (WAR) / Cost')
ax4.set_title('Cost-Benefit Comparison')
ax4.legend()  # show the WAR and Cost/20 labels set on the bars above
plt.tight_layout()
plt.savefig('workhorse_analysis.png', dpi=300, bbox_inches='tight')
plt.close()
Lessons Learned
- Volume masks efficiency: High-carry backs accumulate stats but may hurt the team
- Game script matters: Clock-killing carries have lower value
- Same-system comparisons reveal truth: Backups in the same offense isolate RB skill
- Receiving is the differentiator: Dual-threat ability separates valuable RBs
- Replacement is cheap: Day 2-3 picks and veteran backups produce starter-level output
Discussion Questions
- Could the workhorse's low efficiency be due to offensive line play rather than his ability?
- How would your recommendation change if the workhorse were 5 years younger?
- What additional data would help distinguish RB skill from situation?
- How might fan and locker room considerations affect this decision?
- Is there value to a "bell cow" approach that statistics don't capture?