Chapter 6: Quarterback Evaluation

"A quarterback is not a commodity. They're the rarest and most valuable commodity in all of sports." — Jon Gruden

Learning Objectives

By the end of this chapter, you will be able to:

  1. Calculate and interpret EPA-based quarterback metrics
  2. Understand Completion Percentage Over Expected (CPOE)
  3. Analyze air yards and passing depth
  4. Adjust QB metrics for context (supporting cast, opponent, situation)
  5. Build a comprehensive quarterback evaluation framework
  6. Recognize the limitations of statistical QB evaluation

6.1 The Quarterback Evaluation Challenge

Why Quarterbacks Matter Most

The quarterback position dominates NFL analysis for good reason:

  1. Involvement: QBs handle the ball on nearly every offensive play
  2. Impact: In EPA-based studies, QB play generally explains more of the variance in team success than play at any other position
  3. Economics: The highest-paid position by far
  4. Data: More measurable actions than any other position

Why Evaluation Is Difficult

Despite data abundance, QB evaluation remains challenging:

  1. Interdependence: Receivers must get open; line must protect
  2. Scheme effects: System fit affects performance
  3. Situation selection: Play calls determine opportunity
  4. Sample variability: Even starters throw only ~500 passes per season
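
To make the sample-size point concrete, the sketch below approximates the standard error of a season-long EPA-per-dropback average. The per-play standard deviation of 1.5 EPA is an assumed round figure chosen for illustration, not a value taken from this chapter's data, and plays are treated as independent.

```python
import math

# Assumed for illustration: per-dropback EPA standard deviation of about 1.5
EPA_SD_ASSUMED = 1.5

def epa_standard_error(n_dropbacks: int, sd: float = EPA_SD_ASSUMED) -> float:
    """Standard error of a mean EPA per dropback over n independent plays."""
    return sd / math.sqrt(n_dropbacks)

for n in (100, 500, 2000):
    se = epa_standard_error(n)
    print(f"{n:>5} dropbacks: +/- {1.96 * se:.3f} EPA/play (approx. 95% interval)")
```

Under that assumption, a 500-dropback season still carries an uncertainty of roughly 0.13 EPA per play in either direction, which is why single-season rankings should be read with caution.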

6.2 EPA-Based Quarterback Metrics

EPA Per Dropback

The foundational measure of quarterback efficiency:

import pandas as pd
import numpy as np
import nfl_data_py as nfl
from typing import List, Optional, Dict

def calculate_qb_epa(
    pbp: pd.DataFrame,
    min_dropbacks: int = 200
) -> pd.DataFrame:
    """
    Calculate EPA-based QB statistics.

    Parameters
    ----------
    pbp : pd.DataFrame
        Play-by-play data
    min_dropbacks : int
        Minimum dropbacks for inclusion

    Returns
    -------
    pd.DataFrame
        QB statistics
    """
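    # Filter to dropbacks with a valid EPA. `pass` is a Python keyword, so
    # DataFrame.query cannot parse "pass == 1"; use a boolean mask instead.
    passes = pbp[(pbp['pass'] == 1) & pbp['epa'].notna()].copy()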

    qb_stats = (
        passes
        .groupby(['season', 'passer_player_id', 'passer_player_name'])
        .agg(
            dropbacks=('pass', 'count'),
            pass_attempts=('pass_attempt', 'sum'),
            completions=('complete_pass', 'sum'),
            yards=('yards_gained', 'sum'),
            air_yards=('air_yards', 'sum'),
            yac=('yards_after_catch', 'sum'),
            touchdowns=('pass_touchdown', 'sum'),
            interceptions=('interception', 'sum'),
            sacks=('sack', 'sum'),
            total_epa=('epa', 'sum'),
            epa_per_play=('epa', 'mean'),
            cpoe=('cpoe', 'mean'),
            success_rate=('success', 'mean')
        )
        .reset_index()
        .query(f"dropbacks >= {min_dropbacks}")
    )

    # Derived metrics
    qb_stats['comp_pct'] = qb_stats['completions'] / qb_stats['pass_attempts']
    qb_stats['yards_per_att'] = qb_stats['yards'] / qb_stats['pass_attempts']
    qb_stats['td_rate'] = qb_stats['touchdowns'] / qb_stats['pass_attempts']
    qb_stats['int_rate'] = qb_stats['interceptions'] / qb_stats['pass_attempts']
    qb_stats['sack_rate'] = qb_stats['sacks'] / qb_stats['dropbacks']

    return qb_stats.sort_values('epa_per_play', ascending=False)
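
The aggregation can be checked by hand on a tiny frame. The sketch below uses made-up rows (the player names and EPA values are hypothetical) with nflfastR-style column names. One caution when filtering: `pass` is a Python keyword, so a boolean mask is used here rather than a `query` string.

```python
import pandas as pd

# Made-up plays for illustration only; column names follow nflfastR conventions
toy = pd.DataFrame({
    "passer_player_name": ["A.Alpha", "A.Alpha", "A.Alpha", "B.Beta", "B.Beta", None],
    "pass": [1, 1, 1, 1, 1, 0],
    "sack": [0, 0, 1, 0, 0, 0],
    "epa": [0.8, -0.4, -1.6, 0.3, 0.1, 0.5],
})

# Keep dropbacks (passes, sacks, scrambles) that have an EPA value
dropbacks = toy[(toy["pass"] == 1) & toy["epa"].notna()]

summary = (
    dropbacks
    .groupby("passer_player_name")
    .agg(
        dropbacks=("epa", "count"),
        total_epa=("epa", "sum"),
        epa_per_play=("epa", "mean"),
    )
    .reset_index()
)
print(summary)
```

In this toy data the sack costs A.Alpha 1.6 EPA, pulling his average to -0.4 per dropback, while B.Beta averages +0.2. Counting sacks in the denominator is what distinguishes EPA per dropback from EPA per attempt.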

Total EPA

While efficiency (EPA per play) matters, volume matters too:

def analyze_epa_volume(qb_stats: pd.DataFrame) -> pd.DataFrame:
    """
    Analyze the efficiency vs volume tradeoff.

    Parameters
    ----------
    qb_stats : pd.DataFrame
        QB statistics from calculate_qb_epa

    Returns
    -------
    pd.DataFrame
        QBs with efficiency and volume metrics
    """
    qb_stats = qb_stats.copy()

    # Rank by efficiency and total
    qb_stats['epa_rank'] = qb_stats['epa_per_play'].rank(ascending=False)
    qb_stats['total_epa_rank'] = qb_stats['total_epa'].rank(ascending=False)

    # Composite score (balance efficiency and volume)
    qb_stats['composite_rank'] = (qb_stats['epa_rank'] + qb_stats['total_epa_rank']) / 2

    return qb_stats.sort_values('composite_rank')

EPA by Situation

Breaking down performance by game context:

def epa_by_situation(
    pbp: pd.DataFrame,
    qb_name: str
) -> pd.DataFrame:
    """
    Analyze QB EPA in different situations.

    Parameters
    ----------
    pbp : pd.DataFrame
        Play-by-play data
    qb_name : str
        QB name to analyze

    Returns
    -------
    pd.DataFrame
        Situational EPA breakdown
    """
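    # Boolean masks avoid two pitfalls of the query string: `pass` is a Python
    # keyword, and regex=False keeps the '.' in names like 'P.Mahomes' literal.
    qb_plays = pbp[
        (pbp['pass'] == 1)
        & pbp['passer_player_name'].str.contains(qb_name, na=False, regex=False)
    ].copy()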

    situations = {
        'all_plays': 'True',
        'early_downs': 'down <= 2',
        'third_down': 'down == 3',
        'fourth_down': 'down == 4',
        'first_half': 'qtr <= 2',
        'second_half': 'qtr >= 3',
        'trailing': 'score_differential < 0',
        'leading': 'score_differential > 0',
        'close_game': 'abs(score_differential) <= 7',
        'red_zone': 'yardline_100 <= 20',
        'third_and_long': 'down == 3 and ydstogo >= 7'
    }

    results = []
    for situation, filter_expr in situations.items():
        subset = qb_plays.query(filter_expr) if filter_expr != 'True' else qb_plays

        if len(subset) >= 20:
            results.append({
                'situation': situation,
                'plays': len(subset),
                'epa_per_play': subset['epa'].mean(),
                'success_rate': subset['success'].mean(),
                # Completions per dropback: sacks and scrambles are in the denominator
                'comp_pct': subset['complete_pass'].mean()
            })

    return pd.DataFrame(results)

6.3 Completion Percentage Over Expected (CPOE)

The Problem with Raw Completion Percentage

Raw completion percentage is heavily influenced by:

  • Target depth (short passes complete more often)
  • Receiver quality (separation matters)
  • Scheme (some systems target easier completions)

How CPOE Works

CPOE compares actual completions to an expected rate based on:

  • Distance to target
  • Separation
  • Air yards
  • Pressure
  • Throw location

$$CPOE = Actual\ Completion\% - Expected\ Completion\%$$
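
As a worked instance of the formula, the sketch below computes CPOE from per-throw completion probabilities. It assumes a column named `cp` holding the model's expected completion probability for each attempt; the five rows are hypothetical.

```python
import pandas as pd

# Hypothetical attempts: `cp` is the assumed expected completion probability
attempts = pd.DataFrame({
    "complete_pass": [1, 0, 1, 1, 0],
    "cp": [0.80, 0.35, 0.60, 0.70, 0.30],
})

actual_pct = 100 * attempts["complete_pass"].mean()    # 3 of 5 completed
expected_pct = 100 * attempts["cp"].mean()             # average of the model's odds
cpoe = actual_pct - expected_pct                       # in percentage points

print(f"Actual: {actual_pct:.1f}%  Expected: {expected_pct:.1f}%  CPOE: {cpoe:+.1f}")
```

Here the passer completes 60% against an expected 55%, for a CPOE of +5 percentage points. Averaging a per-play `cpoe` column, as the functions in this section do, gives the same number when that column is defined as 100 times (completion minus expected probability).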

def analyze_cpoe(
    pbp: pd.DataFrame,
    min_attempts: int = 200
) -> pd.DataFrame:
    """
    Analyze CPOE for quarterbacks.

    Parameters
    ----------
    pbp : pd.DataFrame
        Play-by-play data
    min_attempts : int
        Minimum pass attempts

    Returns
    -------
    pd.DataFrame
        CPOE analysis by QB
    """
    passes = pbp.query("pass_attempt == 1 and cpoe.notna()").copy()

    cpoe_stats = (
        passes
        .groupby('passer_player_name')
        .agg(
            attempts=('pass_attempt', 'sum'),
            actual_comp_pct=('complete_pass', 'mean'),
            cpoe=('cpoe', 'mean'),
            cpoe_std=('cpoe', 'std')
        )
        .query(f"attempts >= {min_attempts}")
        .reset_index()
        .sort_values('cpoe', ascending=False)
    )

    # Calculate implied expected completion %
    cpoe_stats['expected_comp_pct'] = cpoe_stats['actual_comp_pct'] - cpoe_stats['cpoe'] / 100

    return cpoe_stats


def cpoe_vs_epa(
    pbp: pd.DataFrame,
    min_attempts: int = 300
) -> pd.DataFrame:
    """
    Compare CPOE to EPA to find over/under-performers.

    Parameters
    ----------
    pbp : pd.DataFrame
        Play-by-play data
    min_attempts : int
        Minimum pass attempts

    Returns
    -------
    pd.DataFrame
        QBs with CPOE and EPA rankings
    """
    passes = pbp.query("pass_attempt == 1 and cpoe.notna() and epa.notna()")

    qb_stats = (
        passes
        .groupby('passer_player_name')
        .agg(
            attempts=('pass_attempt', 'sum'),
            cpoe=('cpoe', 'mean'),
            epa_per_att=('epa', 'mean')
        )
        .query(f"attempts >= {min_attempts}")
        .reset_index()
    )

    qb_stats['cpoe_rank'] = qb_stats['cpoe'].rank(ascending=False)
    qb_stats['epa_rank'] = qb_stats['epa_per_att'].rank(ascending=False)
    qb_stats['rank_diff'] = qb_stats['cpoe_rank'] - qb_stats['epa_rank']

    return qb_stats.sort_values('rank_diff')

Interpreting CPOE

| CPOE | Interpretation |
|------|----------------|
| > +4% | Elite accuracy |
| +2% to +4% | Above average |
| -2% to +2% | Average |
| -4% to -2% | Below average |
| < -4% | Poor accuracy |
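
These bands can be applied programmatically. The following is a minimal sketch using `pd.cut`; the helper name `cpoe_tier` is hypothetical, and because the bands do not say which side an exact boundary value belongs to, the code assumes each band includes its upper edge.

```python
import numpy as np
import pandas as pd

def cpoe_tier(cpoe: pd.Series) -> pd.Series:
    """Map CPOE (percentage points) to the interpretation bands above."""
    return pd.cut(
        cpoe,
        # Right-closed bins: (-inf, -4], (-4, -2], (-2, 2], (2, 4], (4, inf)
        bins=[-np.inf, -4, -2, 2, 4, np.inf],
        labels=[
            "Poor accuracy",
            "Below average",
            "Average",
            "Above average",
            "Elite accuracy",
        ],
    )

example = pd.Series([5.1, 3.0, 0.4, -2.5, -6.0])
print(cpoe_tier(example).tolist())
```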

6.4 Air Yards and Passing Depth

Average Depth of Target (ADOT)

def analyze_air_yards(
    pbp: pd.DataFrame,
    min_attempts: int = 200
) -> pd.DataFrame:
    """
    Analyze passing depth for quarterbacks.

    Parameters
    ----------
    pbp : pd.DataFrame
        Play-by-play data
    min_attempts : int
        Minimum pass attempts

    Returns
    -------
    pd.DataFrame
        Air yards analysis
    """
    passes = pbp.query("pass_attempt == 1 and air_yards.notna()").copy()

    air_stats = (
        passes
        .groupby('passer_player_name')
        .agg(
            attempts=('pass_attempt', 'sum'),
            total_air=('air_yards', 'sum'),
            adot=('air_yards', 'mean'),
            yac=('yards_after_catch', 'mean'),
            completed_air=('air_yards', lambda x: x[passes.loc[x.index, 'complete_pass'] == 1].sum()),
            deep_attempts=('air_yards', lambda x: (x >= 20).sum()),
            deep_completions=('air_yards', lambda x: ((x >= 20) & (passes.loc[x.index, 'complete_pass'] == 1)).sum())
        )
        .query(f"attempts >= {min_attempts}")
        .reset_index()
    )

    air_stats['cay'] = air_stats['completed_air'] / air_stats['attempts']  # Completed Air Yards
    air_stats['deep_pct'] = air_stats['deep_attempts'] / air_stats['attempts']
    air_stats['deep_comp_pct'] = air_stats['deep_completions'] / air_stats['deep_attempts']

    return air_stats.sort_values('adot', ascending=False)

Aggressive vs Conservative Passers

def passing_style_analysis(
    pbp: pd.DataFrame,
    min_attempts: int = 300
) -> pd.DataFrame:
    """
    Classify QBs by passing style.

    Parameters
    ----------
    pbp : pd.DataFrame
        Play-by-play data
    min_attempts : int
        Minimum attempts

    Returns
    -------
    pd.DataFrame
        QBs classified by passing style
    """
    passes = pbp.query("pass_attempt == 1 and air_yards.notna()").copy()

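    # Define depth buckets. Open-ended, left-closed bins keep every throw in a
    # bucket and put exactly 20 air yards in 'Deep' (the >= 20 convention).
    passes['depth_bucket'] = pd.cut(
        passes['air_yards'],
        bins=[-np.inf, 0, 10, 20, np.inf],
        labels=['Behind', 'Short', 'Intermediate', 'Deep'],
        right=False
    )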

    # Pivot for each QB
    depth_dist = (
        passes
        .groupby(['passer_player_name', 'depth_bucket'])
        .size()
        .unstack(fill_value=0)
    )

    # Normalize to percentages
    depth_pct = depth_dist.div(depth_dist.sum(axis=1), axis=0)
    depth_pct.columns = [f'{c}_pct' for c in depth_pct.columns]

    # Add EPA by depth
    epa_by_depth = (
        passes
        .groupby(['passer_player_name', 'depth_bucket'])['epa']
        .mean()
        .unstack()
    )
    epa_by_depth.columns = [f'{c}_epa' for c in epa_by_depth.columns]

    result = depth_pct.join(epa_by_depth).reset_index()

    # Filter to minimum attempts
    attempts = passes.groupby('passer_player_name').size()
    result = result[result['passer_player_name'].isin(attempts[attempts >= min_attempts].index)]

    return result
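
Bucket edges deserve a quick check. With `pd.cut`'s default right-closed bins, a throw of exactly 20 air yards lands in the (10, 20] bucket, which conflicts with the `>= 20` definition of a deep attempt used in `analyze_air_yards`. The sketch below shows a left-closed alternative on a few hand-picked values.

```python
import numpy as np
import pandas as pd

air_yards = pd.Series([-3, 0, 9, 10, 19, 20, 45])

# right=False makes each bin include its lower edge:
# [-inf, 0) Behind, [0, 10) Short, [10, 20) Intermediate, [20, inf) Deep
buckets = pd.cut(
    air_yards,
    bins=[-np.inf, 0, 10, 20, np.inf],
    labels=["Behind", "Short", "Intermediate", "Deep"],
    right=False,
)
print(pd.DataFrame({"air_yards": air_yards, "bucket": buckets}))
```

With these edges a throw at the line of scrimmage (0 air yards) counts as Short rather than Behind, and the infinite outer edges mean no throw is left unbucketed.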

6.5 Context Adjustments

Supporting Cast Effects

QB performance is influenced by receivers and offensive line:

def adjust_for_supporting_cast(
    pbp: pd.DataFrame,
    team: str,
    method: str = 'yac_adjustment'
) -> Dict:
    """
    Attempt to adjust QB stats for supporting cast.

    Parameters
    ----------
    pbp : pd.DataFrame
        Play-by-play data
    team : str
        Team to analyze
    method : str
        Adjustment method

    Returns
    -------
    Dict
        Adjustment analysis
    """
    # `pass` is a Python keyword, so filter with boolean masks rather than query()
    league_passes = pbp[pbp['pass'] == 1]
    team_passes = league_passes[league_passes['posteam'] == team]

    if method == 'yac_adjustment':
        # Compare team's YAC to league average
        team_yac = team_passes['yards_after_catch'].mean()
        league_yac = league_passes['yards_after_catch'].mean()
        yac_diff = team_yac - league_yac

        # YAC above/below average attributed to receivers
        adjustment = {
            'team_yac': team_yac,
            'league_yac': league_yac,
            'yac_difference': yac_diff,
            'interpretation': f"Receivers contributed {yac_diff:+.2f} YAC vs average"
        }

    elif method == 'pressure_adjustment':
        # Pressure proxy: sack rate plus QB-hit rate (when qb_hit is available).
        # The same definition is applied to team and league so they are comparable.
        def pressure_rate(plays: pd.DataFrame) -> float:
            hit_rate = plays['qb_hit'].mean() if 'qb_hit' in plays.columns else 0.0
            return plays['sack'].mean() + hit_rate

        team_pressure = pressure_rate(team_passes)
        league_pressure = pressure_rate(league_passes)

        adjustment = {
            'team_pressure': team_pressure,
            'league_pressure': league_pressure,
            'pressure_diff': team_pressure - league_pressure,
            'interpretation': f"QB faced {(team_pressure - league_pressure)*100:+.1f}% different pressure rate"
        }

    else:
        # Fail loudly instead of hitting an unbound `adjustment` below
        raise ValueError(f"Unknown method: {method!r}")

    return adjustment


def qb_with_vs_without_receiver(
    pbp: pd.DataFrame,
    qb_name: str,
    receiver_name: str
) -> Dict:
    """
    Compare QB performance with and without a specific receiver.

    Parameters
    ----------
    pbp : pd.DataFrame
        Play-by-play data
    qb_name : str
        QB name
    receiver_name : str
        Receiver name

    Returns
    -------
    Dict
        Comparison results
    """
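    # `pass` is a Python keyword, so use a boolean mask rather than query();
    # regex=False keeps the '.' in names like 'P.Mahomes' literal.
    qb_passes = pbp[
        (pbp['pass'] == 1)
        & pbp['passer_player_name'].str.contains(qb_name, na=False, regex=False)
    ]

    # Games with receiver (games in which the receiver was targeted at least once)
    receiver_games = qb_passes[
        qb_passes['receiver_player_name'].str.contains(receiver_name, na=False, regex=False)
    ]['game_id'].unique()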

    with_receiver = qb_passes[qb_passes['game_id'].isin(receiver_games)]
    without_receiver = qb_passes[~qb_passes['game_id'].isin(receiver_games)]

    return {
        'with_receiver': {
            'games': len(receiver_games),
            'epa': with_receiver['epa'].mean(),
            'comp_pct': with_receiver['complete_pass'].mean()
        },
        'without_receiver': {
            'games': len(qb_passes['game_id'].unique()) - len(receiver_games),
            'epa': without_receiver['epa'].mean() if len(without_receiver) > 0 else None,
            'comp_pct': without_receiver['complete_pass'].mean() if len(without_receiver) > 0 else None
        }
    }

Opponent Adjustments

def opponent_adjusted_epa(
    pbp: pd.DataFrame,
    min_dropbacks: int = 200
) -> pd.DataFrame:
    """
    Calculate opponent-adjusted EPA for QBs.

    Parameters
    ----------
    pbp : pd.DataFrame
        Play-by-play data
    min_dropbacks : int
        Minimum dropbacks

    Returns
    -------
    pd.DataFrame
        QBs with raw and adjusted EPA
    """
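    # `pass` is a Python keyword, so use a boolean mask rather than query()
    passes = pbp[(pbp['pass'] == 1) & pbp['epa'].notna()].copy()

    # Calculate opponent pass defense strength (mean EPA allowed per dropback).
    # Note: each defense's average includes the QB's own plays against it.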
    def_strength = (
        passes
        .groupby('defteam')['epa']
        .mean()
        .rename('def_epa_allowed')
    )

    passes = passes.merge(def_strength, left_on='defteam', right_index=True)

    # Calculate adjustment (opponent strength vs average)
    league_avg_def = def_strength.mean()
    passes['opponent_adjustment'] = passes['def_epa_allowed'] - league_avg_def
    passes['adjusted_epa'] = passes['epa'] - passes['opponent_adjustment']

    # Aggregate by QB
    qb_adjusted = (
        passes
        .groupby('passer_player_name')
        .agg(
            dropbacks=('pass', 'count'),
            raw_epa=('epa', 'mean'),
            adjusted_epa=('adjusted_epa', 'mean'),
            avg_opponent_adj=('opponent_adjustment', 'mean')
        )
        .query(f"dropbacks >= {min_dropbacks}")
        .reset_index()
        .sort_values('adjusted_epa', ascending=False)
    )

    qb_adjusted['adjustment'] = qb_adjusted['adjusted_epa'] - qb_adjusted['raw_epa']

    return qb_adjusted
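
One refinement worth considering: the defensive strength above includes the quarterback's own plays against that defense, so a great game by a QB makes his opponent look weaker and shrinks his own adjustment. A leave-one-out version excludes those plays. The sketch below is one possible approach on hypothetical data, not the chapter's method; the helper name `leave_one_out_def_epa` is made up for illustration.

```python
import pandas as pd

def leave_one_out_def_epa(passes: pd.DataFrame) -> pd.Series:
    """Mean EPA allowed by each play's defense, excluding that play's passer."""
    by_def = passes.groupby("defteam")["epa"]
    by_pair = passes.groupby(["defteam", "passer_player_name"])["epa"]

    # transform() broadcasts each group statistic back onto the play rows
    other_sum = by_def.transform("sum") - by_pair.transform("sum")
    other_n = by_def.transform("count") - by_pair.transform("count")

    # Defenses that faced only one passer have no outside sample, giving NaN
    return other_sum / other_n.where(other_n > 0)

# Hypothetical plays for illustration
toy = pd.DataFrame({
    "defteam": ["DEF1"] * 4 + ["DEF2"] * 2,
    "passer_player_name": ["A.Alpha", "A.Alpha", "B.Beta", "B.Beta", "A.Alpha", "A.Alpha"],
    "epa": [1.0, 0.0, -0.5, -0.5, 0.2, 0.4],
})
toy["def_epa_excl_qb"] = leave_one_out_def_epa(toy)
print(toy)
```

For A.Alpha's plays against DEF1, the defense is rated only on what it allowed to B.Beta (-0.5 per play), and vice versa (+0.5). DEF2 faced no other passer in this toy data, so its leave-one-out value is missing and would need a fallback such as the league average.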

6.6 Pressure and Time to Throw

Performance Under Pressure

def pressure_analysis(
    pbp: pd.DataFrame,
    min_dropbacks: int = 200
) -> pd.DataFrame:
    """
    Analyze QB performance under pressure (using sack as proxy).

    Parameters
    ----------
    pbp : pd.DataFrame
        Play-by-play data
    min_dropbacks : int
        Minimum dropbacks

    Returns
    -------
    pd.DataFrame
        Pressure analysis by QB
    """
    # `pass` is a Python keyword, so use a boolean mask rather than query()
    passes = pbp[(pbp['pass'] == 1) & pbp['epa'].notna()].copy()

    # Use sack as proxy for pressure
    # Note: Real pressure data from NGS or PFF would be better
    passes['pressured'] = passes['sack'] == 1

    qb_pressure = (
        passes
        .groupby('passer_player_name')
        .agg(
            dropbacks=('pass', 'count'),
            sack_rate=('sack', 'mean'),
            clean_epa=('epa', lambda x: x[~passes.loc[x.index, 'pressured']].mean()),
            sack_epa=('epa', lambda x: x[passes.loc[x.index, 'pressured']].mean()),
            overall_epa=('epa', 'mean')
        )
        .query(f"dropbacks >= {min_dropbacks}")
        .reset_index()
    )

    qb_pressure['pressure_delta'] = qb_pressure['sack_epa'] - qb_pressure['clean_epa']

    return qb_pressure.sort_values('clean_epa', ascending=False)

6.7 Comprehensive QB Evaluation Framework

The QB Rating Framework

Combining multiple metrics into a holistic view:

class QBEvaluator:
    """
    Comprehensive quarterback evaluation framework.

    Example
    -------
    >>> evaluator = QBEvaluator(pbp)
    >>> report = evaluator.evaluate('P.Mahomes')
    >>> evaluator.compare(['P.Mahomes', 'J.Allen', 'J.Burrow'])
    """

    def __init__(self, pbp: pd.DataFrame, min_dropbacks: int = 200):
        # `pass` is a Python keyword, so use a boolean mask rather than query()
        self.pbp = pbp[pbp['pass'] == 1]
        self.min_dropbacks = min_dropbacks
        self._calculate_baselines()

    def _calculate_baselines(self):
        """Calculate league baselines for percentile calculations."""
        qb_stats = (
            self.pbp
            .groupby('passer_player_name')
            .agg(
                dropbacks=('pass', 'count'),
                epa=('epa', 'mean'),
                cpoe=('cpoe', 'mean'),
                success_rate=('success', 'mean'),
                adot=('air_yards', 'mean')
            )
            .query(f"dropbacks >= {self.min_dropbacks}")
        )
        self.baselines = qb_stats

    def evaluate(self, qb_name: str) -> Dict:
        """
        Generate comprehensive evaluation for a QB.

        Parameters
        ----------
        qb_name : str
            QB name to evaluate

        Returns
        -------
        Dict
            Comprehensive evaluation
        """
        # regex=False keeps the '.' in names like 'P.Mahomes' literal
        qb_plays = self.pbp[
            self.pbp['passer_player_name'].str.contains(qb_name, na=False, regex=False)
        ]

        if len(qb_plays) < self.min_dropbacks:
            return {'error': f'Insufficient sample size: {len(qb_plays)}'}

        # Core metrics
        core = {
            'dropbacks': len(qb_plays),
            'epa_per_play': qb_plays['epa'].mean(),
            'total_epa': qb_plays['epa'].sum(),
            'cpoe': qb_plays['cpoe'].mean(),
            'success_rate': qb_plays['success'].mean(),
            'comp_pct': qb_plays['complete_pass'].mean(),
            'adot': qb_plays['air_yards'].mean(),
            'td_rate': qb_plays['pass_touchdown'].mean(),
            'int_rate': qb_plays['interception'].mean(),
            'sack_rate': qb_plays['sack'].mean()
        }

        # Percentile ranks
        percentiles = {}
        for metric in ['epa', 'cpoe', 'success_rate', 'adot']:
            if metric in self.baselines.columns:
                value = core.get(f'{metric}_per_play', core.get(metric))
                pct = (self.baselines[metric] < value).mean()
                percentiles[f'{metric}_percentile'] = pct

        # Situational breakdown
        situations = epa_by_situation(self.pbp, qb_name)

        return {
            'qb_name': qb_name,
            'core_metrics': core,
            'percentiles': percentiles,
            'situations': situations.to_dict('records')
        }

    def compare(self, qb_names: List[str]) -> pd.DataFrame:
        """
        Compare multiple QBs.

        Parameters
        ----------
        qb_names : List[str]
            QB names to compare

        Returns
        -------
        pd.DataFrame
            Comparison table
        """
        comparisons = []
        for name in qb_names:
            eval_result = self.evaluate(name)
            if 'error' not in eval_result:
                row = {'qb_name': name, **eval_result['core_metrics']}
                comparisons.append(row)

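        # No QB met the sample threshold: return an empty table instead of
        # raising a KeyError on the missing sort column
        if not comparisons:
            return pd.DataFrame()

        return pd.DataFrame(comparisons).sort_values('epa_per_play', ascending=False)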

    def generate_report(self, qb_name: str) -> str:
        """
        Generate text report for a QB.

        Parameters
        ----------
        qb_name : str
            QB name

        Returns
        -------
        str
            Formatted report
        """
        eval_result = self.evaluate(qb_name)

        if 'error' in eval_result:
            return eval_result['error']

        core = eval_result['core_metrics']
        pcts = eval_result['percentiles']

        report = f"""
QUARTERBACK EVALUATION: {qb_name}
{'=' * 50}

VOLUME
------
Dropbacks: {core['dropbacks']:,}

EFFICIENCY
----------
EPA per Play: {core['epa_per_play']:.3f} ({pcts.get('epa_percentile', 0):.0%} percentile)
Total EPA: {core['total_epa']:.1f}
Success Rate: {core['success_rate']:.1%}

ACCURACY
--------
Completion %: {core['comp_pct']:.1%}
CPOE: {core['cpoe']:.1f}% ({pcts.get('cpoe_percentile', 0):.0%} percentile)

STYLE
-----
ADOT: {core['adot']:.1f} yards
TD Rate: {core['td_rate']:.1%}
INT Rate: {core['int_rate']:.1%}

PRESSURE
--------
Sack Rate: {core['sack_rate']:.1%}
"""
        return report
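
The percentile logic inside `evaluate` is simple enough to verify in isolation. The sketch below reproduces it on hypothetical baseline values; `percentile_rank` is an illustrative helper name, not part of the class.

```python
import pandas as pd

# Hypothetical EPA-per-play values for six qualified QBs
baseline = pd.Series([-0.05, 0.02, 0.08, 0.11, 0.15, 0.21])

def percentile_rank(value: float, population: pd.Series) -> float:
    """Share of the population strictly below `value` (0.0 to 1.0)."""
    return float((population < value).mean())

print(f"{percentile_rank(0.15, baseline):.0%}")
```

Four of the six baseline values fall below 0.15, so the result is about 67%. Because the comparison is strict, a QB who is part of the baseline can never reach 100%: the league leader here scores 5/6.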

6.8 Limitations and Caveats

What Statistics Miss

  1. Pre-snap reads: Identifying defenses and audibling
  2. Ball placement: Quality of throw location
  3. Pocket presence: Movement and awareness
  4. Leadership: Intangible team effects
  5. Play-calling influence: QB input into game plan

Sample Size Concerns

def stability_analysis(
    pbp: pd.DataFrame,
    metric: str = 'epa',
    sample_sizes: List[int] = [50, 100, 200, 400, 600]
) -> pd.DataFrame:
    """
    Analyze how stable QB metrics are at different sample sizes.

    Parameters
    ----------
    pbp : pd.DataFrame
        Play-by-play data
    metric : str
        Metric to analyze
    sample_sizes : List[int]
        Sample sizes to test

    Returns
    -------
    pd.DataFrame
        Stability analysis
    """
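    # `pass` is a Python keyword, so use a boolean mask rather than query().
    # Drop rows missing the chosen metric so both halves average real values.
    passes = pbp[(pbp['pass'] == 1) & pbp[metric].notna()]

    # Get QBs with enough data. Each draw below needs 2 * n plays, so the
    # largest sizes usually require multi-season data.
    qb_plays = passes.groupby('passer_player_name').size()
    qualified_qbs = qb_plays[qb_plays >= 2 * max(sample_sizes)].index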

    results = []

    for n in sample_sizes:
        first_half = []
        second_half = []

        for qb in qualified_qbs:
            qb_data = passes[passes['passer_player_name'] == qb].sample(2 * n, replace=False)
            first_half.append(qb_data.iloc[:n][metric].mean())
            second_half.append(qb_data.iloc[n:2*n][metric].mean())

        correlation = np.corrcoef(first_half, second_half)[0, 1]
        results.append({
            'sample_size': n,
            'split_half_correlation': correlation,
            'n_qbs': len(qualified_qbs)
        })

    return pd.DataFrame(results)
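
To see what pattern to expect from a split-half analysis, it can help to run one on synthetic data where the true talent is known. The simulation below is a sketch under stated assumptions: the talent spread (0.10 EPA per play) and per-play noise (1.5 EPA) are illustrative values, and each n-play average is drawn directly from its sampling distribution rather than simulated play by play.

```python
import numpy as np

rng = np.random.default_rng(42)

# Assumed for illustration: spread of true QB talent and per-play EPA noise
N_QBS, TALENT_SD, PLAY_SD = 500, 0.10, 1.5
talent = rng.normal(0.0, TALENT_SD, size=N_QBS)

def split_half_corr(n_plays: int) -> float:
    """Correlation between two independent n-play averages per simulated QB."""
    noise_sd = PLAY_SD / np.sqrt(n_plays)  # standard error of an n-play mean
    half_a = talent + rng.normal(0.0, noise_sd, size=N_QBS)
    half_b = talent + rng.normal(0.0, noise_sd, size=N_QBS)
    return float(np.corrcoef(half_a, half_b)[0, 1])

results = {n: split_half_corr(n) for n in (50, 100, 200, 400, 600)}
for n, r in results.items():
    print(f"n={n:>3}: split-half r = {r:.2f}")
```

Under these assumptions the expected correlation is TALENT_SD² / (TALENT_SD² + PLAY_SD² / n), which rises from roughly 0.18 at 50 plays to roughly 0.73 at 600. Real data will differ, but the shape (weak signal in small samples, slowly improving with volume) is the point of the exercise.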

6.9 Chapter Summary

Quarterback evaluation requires multiple lenses:

  1. EPA measures overall play value but doesn't isolate QB contribution
  2. CPOE isolates accuracy from target difficulty
  3. Air yards reveals passing style and aggressiveness
  4. Context adjustments account for supporting cast and opponents
  5. Situational analysis reveals strengths and weaknesses

No single metric captures QB quality. The best evaluations combine statistics with film study and context understanding.


Chapter Summary Checklist

After completing this chapter, you should be able to:

  • [ ] Calculate and interpret EPA per dropback
  • [ ] Explain what CPOE measures and why it matters
  • [ ] Analyze passing depth and style
  • [ ] Adjust for supporting cast effects
  • [ ] Build comprehensive QB evaluation reports
  • [ ] Articulate limitations of statistical QB evaluation

Preview: Chapter 7

Next: Rushing Analytics — why traditional rushing stats mislead and how to properly evaluate running backs.