Case Study: The Workhorse RB Debate

Scenario

The Cleveland Browns have a decision to make. Their star running back, who led the team in rushing attempts for three consecutive seasons, is entering free agency. The analytics department has been tasked with answering: Should the team re-sign their workhorse RB, or can they replicate his production more cheaply?

The RB's statistics over the past season:

- 285 carries
- 1,156 rushing yards (4.06 YPC)
- 10 rushing touchdowns
- 32 receptions, 285 receiving yards, 2 receiving TDs

His agent is seeking a 4-year, $56 million contract with $32 million guaranteed.
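
The per-year figure used throughout the analysis follows directly from these terms. A quick sketch of the arithmetic, with illustrative variable names:

```python
# Scenario figures as stated above (dollar amounts in millions).
total_value = 56.0
years = 4
guaranteed = 32.0
carries = 285
rushing_yards = 1156

avg_per_year = total_value / years            # average annual value
guaranteed_share = guaranteed / total_value   # fraction of the deal that is guaranteed
ypc = rushing_yards / carries                 # yards per carry

print(f"Average annual value: ${avg_per_year:.1f}M")
print(f"Guaranteed share: {guaranteed_share:.0%}")
print(f"Yards per carry: {ypc:.2f}")
```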


Data Gathering

First, let's pull the relevant data:

import nfl_data_py as nfl
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Load data
pbp = nfl.import_pbp_data([2023])

# Filter to rushing plays
rushes = pbp[pbp['rush_attempt'] == 1].copy()

# Team's rushing data
team_rushes = rushes[rushes['posteam'] == 'CLE']

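# Get per-rusher data for comparison. Grouping by rusher name keeps any player
# with 100+ carries, quarterbacks included; filter by position if that matters.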
rb_stats = rushes.groupby('rusher_player_name').agg(
    carries=('rush_attempt', 'sum'),
    yards=('yards_gained', 'sum'),
    ypc=('yards_gained', 'mean'),
    epa_total=('epa', 'sum'),
    epa_per_carry=('epa', 'mean'),
    success_rate=('epa', lambda x: (x > 0).mean()),
    td=('rush_touchdown', 'sum'),
    fumbles=('fumble_lost', 'sum')
).query('carries >= 100').sort_values('carries', ascending=False)

print("Top 15 RBs by Volume:")
print(rb_stats.head(15)[['carries', 'yards', 'ypc', 'epa_per_carry', 'success_rate', 'td']].round(3))
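
To see what this aggregation computes without downloading a season of play-by-play, the same named-aggregation pattern can be run on a toy frame. The player names and values below are made up for illustration; only the column names mirror the play-by-play columns used above.

```python
import pandas as pd

# Toy play-by-play rows (fabricated for illustration, not real data).
toy = pd.DataFrame({
    'rusher_player_name': ['A.Back', 'A.Back', 'A.Back', 'B.Back', 'B.Back'],
    'rush_attempt': [1, 1, 1, 1, 1],
    'yards_gained': [3, 12, -1, 5, 4],
    'epa': [-0.2, 0.9, -0.6, 0.3, 0.1],
})

toy_stats = toy.groupby('rusher_player_name').agg(
    carries=('rush_attempt', 'sum'),
    ypc=('yards_gained', 'mean'),
    epa_per_carry=('epa', 'mean'),
    # Success rate here means the share of carries with positive EPA.
    success_rate=('epa', lambda x: (x > 0).mean()),
)
print(toy_stats.round(3))
```

Only one of A.Back's three carries has positive EPA, so a single long run lifts his YPC while his success rate stays at one in three.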

Analysis 1: Traditional vs Advanced Metrics

Traditional View

By traditional metrics, our workhorse looks solid:

- Top 8 in rushing yards
- Top 5 in rushing touchdowns
- YPC of 4.06 (near league average)

This suggests a valuable, productive player worth re-signing.

Analytics View

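# Compare traditional and advanced rankings.
# method='min' gives tied players the same whole-number rank (e.g. 2, 2, 4).
rb_stats['yards_rank'] = rb_stats['yards'].rank(ascending=False, method='min')
rb_stats['td_rank'] = rb_stats['td'].rank(ascending=False, method='min')
rb_stats['epa_rank'] = rb_stats['epa_per_carry'].rank(ascending=False, method='min')
rb_stats['success_rank'] = rb_stats['success_rate'].rank(ascending=False, method='min')

# Example name: substitute a rusher who appears in rb_stats (100+ carries in the
# loaded season), otherwise .loc raises a KeyError.
workhorse = rb_stats.loc['N.Chubb']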
print(f"\nWorkhorse Rankings:")
print(f"  Yards Rank: {int(workhorse['yards_rank'])}")
print(f"  TD Rank: {int(workhorse['td_rank'])}")
print(f"  EPA/Carry Rank: {int(workhorse['epa_rank'])}")
print(f"  Success Rate Rank: {int(workhorse['success_rank'])}")

Finding: The workhorse ranks 7th in yards but only 18th in EPA per carry and 22nd in success rate. High volume is masking low efficiency.
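
Touchdown totals are often tied, and how ties are ranked affects the printed numbers. `Series.rank` defaults to `method='average'`, which returns fractional ranks that `int()` then truncates. The sketch below uses made-up totals to show the difference from `method='min'`.

```python
import pandas as pd

# Fabricated touchdown totals with a tie between B.Back and C.Back.
tds = pd.Series({'A.Back': 12, 'B.Back': 10, 'C.Back': 10, 'D.Back': 7})

avg_rank = tds.rank(ascending=False)                # default method='average'
min_rank = tds.rank(ascending=False, method='min')  # tied players share the best rank

print(pd.DataFrame({'td': tds, 'average': avg_rank, 'min': min_rank}))
```

With the default, both tied backs receive rank 2.5; with `method='min'` both receive rank 2 and the next back is ranked 4.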


Analysis 2: Game Script Decomposition

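# Split by game state. pd.cut bins are right-closed, so these edges give:
# Behind = trailing by 7+, Close = within 6 points, Ahead = leading by 7+.
# assign() returns a new frame, which avoids writing into a slice of `rushes`.
team_rushes = team_rushes.assign(
    game_state=pd.cut(
        team_rushes['score_differential'],
        bins=[-100, -7, 6, 100],
        labels=['Behind', 'Close', 'Ahead']
    )
)

# Analyze by game state (observed=False keeps all three states in the output)
game_script = team_rushes.groupby(
    ['rusher_player_name', 'game_state'], observed=False
).agg(
    carries=('rush_attempt', 'count'),
    epa=('epa', 'mean')
).unstack()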

print("\nWorkhorse by Game State:")
print(game_script.loc['N.Chubb'])

Finding:

- 42% of carries came when ahead by 7+ (clock-killing mode)
- EPA per carry was -0.12 when ahead (low-value carries)
- EPA per carry was +0.05 in close games (actual competitive value)

The workhorse was accumulating volume in situations where rushing efficiency naturally suffers.
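
Where a 7-point margin lands depends on the exact bin edges, because `pd.cut` treats intervals as right-closed by default. The sketch below compares two edge choices on a handful of hand-picked score differentials; the variable names are illustrative.

```python
import pandas as pd

diffs = pd.Series([-8, -7, -6, 0, 6, 7, 8], name='score_differential')
labels = ['Behind', 'Close', 'Ahead']

# 'Close' runs up to and including +7: a 7-point lead is still 'Close'.
edges_at_7 = pd.cut(diffs, bins=[-100, -7, 7, 100], labels=labels)

# 'Close' runs up to and including +6: a 7-point lead counts as 'Ahead'.
edges_at_6 = pd.cut(diffs, bins=[-100, -7, 6, 100], labels=labels)

print(pd.DataFrame({
    'score_differential': diffs,
    'close_up_to_7': edges_at_7,
    'close_up_to_6': edges_at_6,
}))
```

In both versions a 7-point deficit is labeled 'Behind', so only the second set of edges treats leads and deficits of exactly 7 symmetrically.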


Analysis 3: Backfield Comparison

# Compare all Cleveland RBs
cle_rbs = team_rushes.groupby('rusher_player_name').agg(
    carries=('rush_attempt', 'sum'),
    epa=('epa', 'mean'),
    success=('epa', lambda x: (x > 0).mean()),
    ypc=('yards_gained', 'mean')
).query('carries >= 20').sort_values('epa', ascending=False)

print("\nCleveland RB Comparison:")
print(cle_rbs.round(3))

Finding: The backup RB had:

- 68 carries with 0.02 EPA/carry (better than the workhorse's -0.04)
- A higher success rate (47% vs 41%)
- Similar YPC in the same system

Same offensive line, same scheme, different results. This suggests the workhorse isn't outperforming replacement options.
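
A 68-carry sample is small, so it is worth checking how much of the gap could be noise. The sketch below bootstraps the difference in mean EPA per carry. It runs on synthetic data: the per-carry values are drawn from normal distributions whose means match the quoted figures, and the 0.9 spread is an assumption, not a measured value. With real data, the two arrays would be replaced by each back's per-carry `epa` values.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic per-carry EPA (NOT real data). Sample sizes match the case study;
# the standard deviation of 0.9 is an assumed, illustrative spread.
workhorse_epa = rng.normal(loc=-0.04, scale=0.9, size=285)
backup_epa = rng.normal(loc=0.02, scale=0.9, size=68)

def bootstrap_mean_diff(a, b, n_boot=2000):
    """Resample each group with replacement and return the n_boot differences in means (b - a)."""
    a_means = rng.choice(a, size=(n_boot, a.size)).mean(axis=1)
    b_means = rng.choice(b, size=(n_boot, b.size)).mean(axis=1)
    return b_means - a_means

boot_diffs = bootstrap_mean_diff(workhorse_epa, backup_epa)
low, high = np.percentile(boot_diffs, [2.5, 97.5])

print(f"Observed difference (backup - workhorse): {backup_epa.mean() - workhorse_epa.mean():.3f}")
print(f"95% bootstrap interval: [{low:.3f}, {high:.3f}]")
```

If the interval spans zero, the carry counts alone do not establish that the backup is the more efficient runner, and the comparison should be read as suggestive rather than conclusive.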


Analysis 4: Receiving Value

# Add receiving contribution
targets = pbp[pbp['pass_attempt'] == 1]
rb_receiving = targets.groupby('receiver_player_name').agg(
    targets=('pass_attempt', 'count'),
    receptions=('complete_pass', 'sum'),
    rec_yards=('yards_gained', 'sum'),
    rec_epa=('epa', 'sum')
).query('targets >= 20')

# Combine with rushing
combined = rb_stats.join(rb_receiving, how='left').fillna(0)
combined['total_epa'] = combined['epa_total'] + combined['rec_epa']
combined['total_touches'] = combined['carries'] + combined['receptions']
combined['epa_per_touch'] = combined['total_epa'] / combined['total_touches']

print("\nTotal Touch Efficiency (Top 10):")
print(combined.nlargest(10, 'epa_per_touch')[
    ['carries', 'receptions', 'total_epa', 'epa_per_touch']
].round(3))

Finding: When including receiving, our workhorse ranks 15th in EPA per touch. His receiving contribution (32 catches, 285 yards) is modest compared to top dual-threat backs.
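
The join-and-fill step has a side effect: any back below the 20-target cutoff is absent from the receiving table, so after `fillna(0)` he is credited with zero receptions and zero receiving EPA. A toy example with made-up numbers shows the mechanics.

```python
import pandas as pd

# Fabricated rushing and receiving summaries (illustrative only).
rushing = pd.DataFrame(
    {'carries': [250, 180], 'epa_total': [-5.0, 4.0]},
    index=pd.Index(['A.Back', 'B.Back'], name='player'),
)
# B.Back is missing here, as he would be if he fell below the target cutoff.
receiving = pd.DataFrame(
    {'receptions': [40], 'rec_epa': [8.0]},
    index=pd.Index(['A.Back'], name='player'),
)

combined = rushing.join(receiving, how='left').fillna(0)
combined['total_epa'] = combined['epa_total'] + combined['rec_epa']
combined['total_touches'] = combined['carries'] + combined['receptions']
combined['epa_per_touch'] = combined['total_epa'] / combined['total_touches']

print(combined.round(3))
```

A.Back's receiving work turns a negative rushing total into a positive overall figure, while B.Back is evaluated on rushing alone.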


Analysis 5: Replacement Level Calculation

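# Define replacement level (25th percentile efficiency among rushers with 50+ carries).
# rb_stats only contains 100+ carry rushers, so rebuild the pool from the raw rushes.
replacement_pool = rushes.groupby('rusher_player_name').agg(
    carries=('rush_attempt', 'sum'),
    epa_per_carry=('epa', 'mean')
).query('carries >= 50')
replacement_epa = replacement_pool['epa_per_carry'].quantile(0.25)
print(f"Replacement Level EPA/Carry: {replacement_epa:.3f}")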

# Calculate value over replacement
rb_stats['epa_over_replacement'] = (
    (rb_stats['epa_per_carry'] - replacement_epa) * rb_stats['carries']
)

# Convert to estimated wins
rb_stats['wins_over_replacement'] = rb_stats['epa_over_replacement'] / 10  # ~10 EPA per win

print("\nWorkhorse Value Over Replacement:")
workhorse_vor = rb_stats.loc['N.Chubb', 'epa_over_replacement']
workhorse_war = rb_stats.loc['N.Chubb', 'wins_over_replacement']
print(f"  EPA over replacement: {workhorse_vor:.1f}")
print(f"  Estimated wins added: {workhorse_war:.2f}")

Finding: Despite 285 carries, the workhorse provides only ~0.3 wins above replacement. At $14M/year, this is poor value (each win costs ~$47M via this player).
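
The figures in this finding can be checked by working backward from the stated 0.3 wins. The sketch uses only numbers quoted in the case study, including the code's assumed conversion of roughly 10 EPA per win.

```python
carries = 285
wins_over_replacement = 0.3   # quoted in the finding
epa_per_win = 10              # conversion assumed in the analysis code
salary_per_year = 14.0        # $M, from the $56M / 4-year ask

# Total EPA above replacement implied by 0.3 wins
epa_over_replacement = wins_over_replacement * epa_per_win

# The per-carry edge over a replacement back that this implies
margin_per_carry = epa_over_replacement / carries

# Salary paid per win above replacement
cost_per_win = salary_per_year / wins_over_replacement

print(f"EPA over replacement: {epa_over_replacement:.1f}")
print(f"Edge per carry: {margin_per_carry:.4f} EPA")
print(f"Cost per win: ${cost_per_win:.1f}M")
```

An edge of about 0.01 EPA per carry over a full season of carries is what the 0.3-win estimate amounts to.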


Analysis 6: Contract Valuation

# Market value estimation
def estimate_rb_value(war: float, market_rate_per_win: float = 2.5) -> float:
    """Estimate RB market value in $M/year from wins added and a $M-per-win rate."""
    return war * market_rate_per_win

asking_price = 14.0  # $M/year ($56M over 4 years)
workhorse_value = estimate_rb_value(workhorse_war)
print(f"\nWorkhorse Fair Value Estimate: ${workhorse_value:.1f}M/year")
print(f"Asking Price: ${asking_price:.0f}M/year")
print(f"Overpay: ${asking_price - workhorse_value:.1f}M/year")

Finding: Based on wins added, fair value is approximately $0.75M/year. The $14M asking price represents ~$13M annual overpay.
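
The fair-value estimate scales linearly with the assumed market rate per win, so it is worth seeing how the conclusion holds up under other rates. In the sketch below, only the 2.5 rate comes from the analysis; the higher rates are hypothetical values chosen to stress-test the result.

```python
def estimate_rb_value(war: float, market_rate_per_win: float = 2.5) -> float:
    """Estimate market value in $M/year from wins added and a $M-per-win rate."""
    return war * market_rate_per_win

war = 0.3            # wins over replacement quoted in the case study
asking_price = 14.0  # $M/year

# 2.5 is the analysis default; the other rates are hypothetical.
for rate in [2.5, 5.0, 10.0, 20.0]:
    value = estimate_rb_value(war, rate)
    print(f"Rate ${rate:>4.1f}M/win -> fair value ${value:.2f}M, overpay ${asking_price - value:.2f}M")
```

Even at eight times the default rate, the estimated fair value stays well below the asking price under these inputs.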


Recommendation

Do Not Re-Sign at Asking Price

Evidence:

  1. Efficiency below average: 18th in EPA/carry despite ranking 7th in rushing yards
  2. Inflated by game script: 42% of carries in low-value clock-killing situations
  3. Replaceable internally: Backup has better efficiency in same system
  4. Modest receiving: Not a difference-making dual-threat
  5. Poor value: $14M for ~0.3 wins over replacement

Alternative Approaches

  1. Sign a cheaper veteran ($3-5M)
     - Target efficient backups from other teams
     - Get 80% of production at 30% of cost

  2. Draft Day 2-3 RB ($1-2M avg)
     - Rookie RBs often produce immediately
     - 4 years of cost control

  3. RBBC approach ($4-6M total)
     - Split carries among 2-3 backs
     - Maintain freshness, reduce injury risk

  4. Counter-offer ($8M max, 2 years)
     - If team insists on workhorse model
     - Shorter term limits downside
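
One way to compare these options is cost per win above replacement. The sketch below reuses the illustrative WAR and cost figures that the Visualization section hard-codes for its cost-benefit panel; they are scenario values, not measured results.

```python
import pandas as pd

# Illustrative scenario values taken from the cost-benefit panel (cost in $M/year).
options = pd.DataFrame({
    'option': ['Workhorse', 'Backup A', 'Draft Pick', 'Vet FA'],
    'war': [0.3, 0.2, 0.2, 0.15],
    'cost_m': [14, 2, 1.5, 4],
}).set_index('option')

# Dollars (in millions) paid per win above replacement
options['cost_per_win_m'] = options['cost_m'] / options['war']

print(options.sort_values('cost_per_win_m').round(1))
```

Under these assumed inputs the draft pick is the cheapest source of wins and the workhorse the most expensive.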


Visualization

fig, axes = plt.subplots(2, 2, figsize=(12, 10))

# Plot 1: Carries vs EPA
ax1 = axes[0, 0]
ax1.scatter(rb_stats['carries'], rb_stats['epa_per_carry'], alpha=0.7)
ax1.axhline(y=0, color='gray', linestyle='--')
ax1.scatter([285], [-0.04], color='red', s=100, zorder=5, label='Workhorse')
ax1.set_xlabel('Carries')
ax1.set_ylabel('EPA per Carry')
ax1.set_title('Volume vs Efficiency')
ax1.legend()

# Plot 2: Success Rate by Game State
ax2 = axes[0, 1]
states = ['Behind', 'Close', 'Ahead']
# Hard-coded scenario values; replace with rates computed from the data
workhorse_success = [0.38, 0.44, 0.40]
league_avg = [0.44, 0.42, 0.41]
x = np.arange(len(states))
ax2.bar(x - 0.2, workhorse_success, 0.4, label='Workhorse')
ax2.bar(x + 0.2, league_avg, 0.4, label='League Avg')
ax2.set_xticks(x)
ax2.set_xticklabels(states)
ax2.set_ylabel('Success Rate')
ax2.set_title('Success by Game State')
ax2.legend()

# Plot 3: EPA Distribution
ax3 = axes[1, 0]
workhorse_rushes = team_rushes[team_rushes['rusher_player_name'] == 'N.Chubb']
ax3.hist(workhorse_rushes['epa'], bins=30, edgecolor='black', alpha=0.7)
ax3.axvline(x=0, color='red', linestyle='--', label='Break-even')
ax3.set_xlabel('EPA')
ax3.set_ylabel('Frequency')
ax3.set_title('EPA Distribution')
ax3.legend()

# Plot 4: Value vs Cost
ax4 = axes[1, 1]
rbs = ['Workhorse', 'Backup A', 'Draft Pick', 'Vet FA']
values = [0.3, 0.2, 0.2, 0.15]
costs = [14, 2, 1.5, 4]
ax4.barh(rbs, values, color='green', alpha=0.7, label='WAR')
ax4.barh(rbs, [-c/20 for c in costs], color='red', alpha=0.7, label='Cost/20')
ax4.axvline(x=0, color='black')
ax4.set_xlabel('Value (WAR) / Cost')
ax4.set_title('Cost-Benefit Comparison')
ax4.legend()  # show the WAR and Cost/20 labels set on the bars above

plt.tight_layout()
plt.savefig('workhorse_analysis.png', dpi=300, bbox_inches='tight')
plt.close()

Lessons Learned

  1. Volume masks efficiency: High-carry backs accumulate stats but may hurt the team
  2. Game script matters: Clock-killing carries have lower value
  3. Same-system comparisons reveal truth: Backups in the same offense isolate RB skill
  4. Receiving is the differentiator: Dual-threat ability separates valuable RBs
  5. Replacement is cheap: Day 2-3 picks and veteran backups produce starter-level output

Discussion Questions

  1. Could the workhorse's low efficiency be due to offensive line play rather than his ability?

  2. How would your recommendation change if the workhorse were 5 years younger?

  3. What additional data would help distinguish RB skill from situation?

  4. How might fan and locker room considerations affect this decision?

  5. Is there value to a "bell cow" approach that statistics don't capture?