Exercises: Advanced Passing Metrics

Level 1: Conceptual Understanding

Exercise 1.1: EPA Basics

Explain in your own words why a 10-yard completion on 3rd and 5 has a different EPA than a 10-yard completion on 1st and 10.

Exercise 1.2: CPOE Interpretation

A quarterback has a 68% completion percentage and a CPOE of -3.2%. What does this tell you about:

- The difficulty of his throws?
- His actual accuracy?
- How he compares to an average QB throwing the same passes?

Exercise 1.3: Air Yards Concepts

Two quarterbacks have the same completion percentage (65%):

- QB A has an aDOT of 6.5 yards
- QB B has an aDOT of 10.2 yards

Which quarterback would you expect to have the higher CPOE? Explain your reasoning.

Exercise 1.4: Pressure Impact

Why is it important to analyze quarterback performance separately for clean-pocket and under-pressure situations?

Exercise 1.5: YAC vs Air Yards

A receiver has 800 receiving yards with an average of 10 yards per reception:

- 350 yards came from air yards
- 450 yards came from YAC

What does this distribution suggest about the receiver's role in the offense?


Level 2: Basic Calculations

Exercise 2.1: Calculate EPA

Given the following expected points values, calculate the EPA for each play:

| Play | EP Before | EP After | Result |
|------|-----------|----------|--------|
| A | 2.5 | 3.8 | Completion |
| B | 1.2 | -0.5 | Interception |
| C | 0.8 | 7.0 | Touchdown |
| D | 3.2 | 2.1 | Sack |
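
As a reminder of the arithmetic, EPA is the expected points after the play minus the expected points before it. The sketch below uses made-up EP values rather than the ones in the table, and the function name `play_epa` is an assumption for illustration only.

```python
def play_epa(ep_before: float, ep_after: float) -> float:
    """Return EPA as the change in expected points across one play."""
    return ep_after - ep_before


# Hypothetical values, not taken from the exercise table.
example_epa = play_epa(ep_before=1.0, ep_after=2.5)
print(f"EPA: {example_epa:+.2f}")
```

A negative result means the play left the offense in a worse expected scoring position than it started in.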

Exercise 2.2: Expected Completions

Using a simplified completion probability model where:

- Completion Probability = 0.85 - (0.015 × Air Yards)

Calculate the expected number of completions for these 5 passes:

| Pass | Air Yards |
|------|-----------|
| 1 | 5 |
| 2 | 12 |
| 3 | 3 |
| 4 | 20 |
| 5 | 8 |
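
Expected completions are the sum of the per-pass completion probabilities. A minimal sketch of that idea, using the simplified model from this exercise on hypothetical air-yards values (not the five passes listed); the function names are assumptions for illustration.

```python
from typing import List


def completion_probability(air_yards: float) -> float:
    """Simplified model from this exercise: 0.85 - 0.015 * air yards."""
    return 0.85 - 0.015 * air_yards


def expected_completions(air_yards_list: List[float]) -> float:
    """Sum the per-pass completion probabilities."""
    return sum(completion_probability(ay) for ay in air_yards_list)


# Hypothetical passes, not the ones in the exercise table.
sample_air_yards = [0, 10, 30]
print(round(expected_completions(sample_air_yards), 2))
```

Note that this linear form is not bounded to the range 0 to 1, so it only makes sense for a moderate range of air yards.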

Exercise 2.3: CPOE Calculation

A quarterback throws 40 passes with the following results:

- 28 completions (actual)
- Sum of completion probabilities: 26.4 (expected completions)

Calculate:

- Actual completion percentage
- Expected completion percentage
- CPOE
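
The three quantities are linked: CPOE is the actual completion percentage minus the expected completion percentage, both taken over the same attempts. A small sketch with hypothetical inputs (not the numbers in this exercise); the function name `cpoe` is an assumption.

```python
def cpoe(completions: int, expected_completions: float, attempts: int) -> float:
    """Return CPOE in percentage points: actual comp% minus expected comp%."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    actual_pct = 100.0 * completions / attempts
    expected_pct = 100.0 * expected_completions / attempts
    return actual_pct - expected_pct


# Hypothetical inputs: 13 completions on 20 attempts, 12.0 expected.
print(round(cpoe(completions=13, expected_completions=12.0, attempts=20), 1))
```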

Exercise 2.4: Air Yards Share

A quarterback's completions result in:

- Total receiving yards: 3,200
- Completed air yards: 2,100

Calculate the air yards share and interpret what it means.
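
One way to set this up, assuming air yards share means completed air yards divided by total receiving yards (the definition the inputs above suggest). The values below are hypothetical and the function name is an assumption.

```python
def air_yards_share(completed_air_yards: float, total_yards: float) -> float:
    """Fraction of total receiving yards gained through the air."""
    if total_yards <= 0:
        raise ValueError("total_yards must be positive")
    return completed_air_yards / total_yards


# Hypothetical totals, not the ones in this exercise.
share = air_yards_share(completed_air_yards=600, total_yards=1000)
yac_share = 1.0 - share  # the remainder came after the catch
print(f"air yards share: {share:.1%}, YAC share: {yac_share:.1%}")
```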

Exercise 2.5: Pressure-Adjusted Stats

A quarterback has these splits:

| Situation | Attempts | Completions | Yards |
|-----------|----------|-------------|-------|
| Clean Pocket | 280 | 196 | 2,520 |
| Under Pressure | 95 | 52 | 580 |

Calculate:

- Completion percentage for each situation
- YPA for each situation
- The completion percentage drop under pressure
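
The same two formulas apply to each split: completion percentage is completions over attempts, and YPA is yards over attempts. A sketch on hypothetical splits (not the table above); `split_stats` is an assumed helper name.

```python
from typing import Dict


def split_stats(attempts: int, completions: int, yards: int) -> Dict[str, float]:
    """Completion percentage and yards per attempt for one split."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    return {
        "comp_pct": 100.0 * completions / attempts,
        "ypa": yards / attempts,
    }


# Hypothetical splits, not the ones in the exercise table.
clean = split_stats(attempts=200, completions=140, yards=1600)
pressured = split_stats(attempts=50, completions=25, yards=300)

# The drop is expressed in percentage points.
comp_pct_drop = clean["comp_pct"] - pressured["comp_pct"]
print(f"drop under pressure: {comp_pct_drop:.1f} percentage points")
```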


Level 3: Implementation Exercises

Exercise 3.1: EPA Calculator Class

Implement a Python class that calculates EPA for passing plays:

from typing import Dict, List

class PassingEPACalculator:
    """Calculate EPA for passing plays."""

    def __init__(self):
        # Initialize EP lookup table
        pass

    def get_expected_points(self, yard_line: int, down: int, distance: int) -> float:
        """Return expected points for situation."""
        pass

    def calculate_play_epa(self, play: Dict) -> float:
        """Calculate EPA for a single play."""
        pass

    def aggregate_epa(self, plays: List[Dict]) -> Dict:
        """Calculate aggregate EPA metrics."""
        pass

Requirements:

- Handle completions, incompletions, interceptions, and touchdowns
- Return both per-play and aggregate metrics

Exercise 3.2: CPOE Analysis

Create a function that analyzes CPOE by depth zone:

from typing import Dict, List

import pandas as pd

def analyze_cpoe_by_depth(passes: List[Dict]) -> pd.DataFrame:
    """
    Analyze CPOE broken down by pass depth.

    Parameters:
    -----------
    passes : list
        List of passes with air_yards, completed, exp_completion_prob

    Returns:
    --------
    pd.DataFrame with columns:
        - depth_zone
        - attempts
        - actual_comp_pct
        - expected_comp_pct
        - cpoe
    """
    pass

Depth zones, by air yards, should be: Behind LOS (negative air yards), Short (0-9), Medium (10-19), Deep (20+)
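
The zone boundaries can be captured in a small helper before any grouping is done. This is only a sketch of the bucketing step, not the full analysis; the function name `depth_zone` is an assumption.

```python
def depth_zone(air_yards: float) -> str:
    """Map a pass's air yards to one of the four depth zones."""
    if air_yards < 0:
        return "Behind LOS"
    if air_yards < 10:
        return "Short (0-9)"
    if air_yards < 20:
        return "Medium (10-19)"
    return "Deep (20+)"


# Hypothetical air-yards values covering each boundary.
for ay in (-3, 0, 9, 10, 19, 20):
    print(ay, depth_zone(ay))
```

Checking the boundary values (0, 10, and 20) explicitly is worthwhile, since off-by-one errors at zone edges are easy to miss.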

Exercise 3.3: Air Yards Metrics

Implement a complete air yards analysis function:

def calculate_air_yards_metrics(passes: List[Dict]) -> Dict:
    """
    Calculate comprehensive air yards metrics.

    Returns dict with:
        - intended_air_yards (total)
        - completed_air_yards
        - iay_per_attempt
        - cay_per_attempt
        - adot
        - yac_total
        - yac_per_completion
        - air_yards_share
        - depth_distribution (dict with short, medium, deep percentages)
    """
    pass

Exercise 3.4: Pressure Analysis

Build a pressure analysis tool:

def create_pressure_splits(passes: List[Dict]) -> Dict:
    """
    Create comprehensive pressure analysis.

    Returns:
        - clean_pocket_stats (comp%, ypa, td%, int%)
        - pressure_stats
        - pressure_rate
        - pressure_to_sack_rate
        - adjusted_comp_pct (normalized to league avg pressure)
    """
    pass

Exercise 3.5: QB Comparison Tool

Create a function that generates side-by-side QB comparisons:

def compare_quarterbacks(qb1_passes: List[Dict], qb1_name: str,
                          qb2_passes: List[Dict], qb2_name: str) -> str:
    """
    Generate formatted comparison of two quarterbacks.

    Include:
        - Traditional stats
        - CPOE
        - Air yards metrics
        - Pressure performance
        - Advantages for each QB
    """
    pass

Level 4: Advanced Analysis

Exercise 4.1: Build a Completion Probability Model

Using the provided sample data, build a logistic regression model to predict completion probability:

# Sample features to include:
# - air_yards
# - under_pressure (0/1)
# - third_down (0/1)
# - seconds_remaining (game time pressure)
# - score_differential

from typing import Dict

import pandas as pd
from sklearn.linear_model import LogisticRegression

def train_completion_probability_model(training_data: pd.DataFrame) -> LogisticRegression:
    """
    Train a completion probability model.

    Parameters:
    -----------
    training_data : pd.DataFrame
        DataFrame with features and 'completed' target

    Returns:
    --------
    LogisticRegression : Trained model
    """
    pass

def evaluate_model(model, test_data: pd.DataFrame) -> Dict:
    """
    Evaluate model accuracy with:
        - Log loss
        - Calibration (predicted vs actual completion rates by decile)
        - AUC-ROC
    """
    pass

Exercise 4.2: Situational EPA Analysis

Create an analysis that breaks down EPA by game situations:

def analyze_situational_epa(passes: List[Dict]) -> pd.DataFrame:
    """
    Break down EPA by:
        - Down (1st, 2nd, 3rd, 4th)
        - Field position (own territory, midfield, red zone)
        - Score differential (leading, tied, trailing)
        - Quarter (1-4)

    Return DataFrame with EPA per dropback for each situation.
    """
    pass

Exercise 4.3: Opponent-Adjusted Metrics

Implement opponent-adjusted passing statistics:

def calculate_opponent_adjusted_stats(
    passes: List[Dict],
    opponent_pass_def_ranks: Dict[str, int],
    league_avg_stats: Dict
) -> Dict:
    """
    Adjust QB stats for opponent defensive strength.

    Parameters:
    -----------
    passes : list
        Passes with 'opponent' field
    opponent_pass_def_ranks : dict
        Mapping of team to pass defense rank (1-130)
    league_avg_stats : dict
        League average comp%, ypa, etc.

    Returns:
    --------
    dict with:
        - raw_comp_pct
        - opponent_adjusted_comp_pct
        - adjustment_factor
        - performance_by_opponent_tier
    """
    pass

Exercise 4.4: EPA Stability Analysis

Analyze how stable EPA is from week to week:

from typing import Dict, List, Tuple

def analyze_epa_stability(weekly_epa: List[Tuple[int, float]]) -> Dict:
    """
    Analyze week-to-week EPA stability.

    Parameters:
    -----------
    weekly_epa : list
        List of (week, epa_per_dropback) tuples

    Returns:
    --------
    dict with:
        - mean_epa
        - std_epa
        - coefficient_of_variation
        - trend (improving/declining/stable)
        - consistency_score
    """
    pass
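
Two of the requested outputs have standard definitions that can be sketched independently of the rest: the coefficient of variation (standard deviation divided by the absolute mean) and a trend estimate. The sketch below assumes the trend is measured as the least-squares slope of EPA against week number, which is one reasonable choice rather than the required one; the function names and the weekly values are hypothetical.

```python
from statistics import mean, stdev
from typing import List, Tuple


def epa_trend_slope(weekly_epa: List[Tuple[int, float]]) -> float:
    """Least-squares slope of EPA per dropback against week number."""
    weeks = [week for week, _ in weekly_epa]
    values = [value for _, value in weekly_epa]
    week_mean, value_mean = mean(weeks), mean(values)
    sxx = sum((w - week_mean) ** 2 for w in weeks)
    if sxx == 0:
        raise ValueError("need at least two distinct weeks")
    sxy = sum((w - week_mean) * (v - value_mean) for w, v in zip(weeks, values))
    return sxy / sxx


def coefficient_of_variation(values: List[float]) -> float:
    """Sample standard deviation divided by the absolute mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(values) / abs(m)


# Hypothetical weekly EPA per dropback.
weekly = [(1, 0.10), (2, 0.20), (3, 0.30)]
print(epa_trend_slope(weekly))
print(coefficient_of_variation([value for _, value in weekly]))
```

Because EPA per dropback is often close to zero, the coefficient of variation can become very large or unstable, which is worth keeping in mind when designing the consistency score.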

Exercise 4.5: Composite QB Score

Design and implement a composite QB scoring system:

from typing import Dict, Optional

class CompositeQBScore:
    """
    Multi-factor QB scoring system.

    Factors (with suggested weights):
        - CPOE (25%)
        - EPA per dropback (25%)
        - Pressure performance (15%)
        - Big play rate (10%)
        - Turnover avoidance (15%)
        - Situational performance (10%)
    """

    def __init__(self, weights: Optional[Dict[str, float]] = None):
        pass

    def calculate_score(self, qb_stats: Dict) -> float:
        """Calculate composite score 0-100."""
        pass

    def get_grade(self, score: float) -> str:
        """Convert score to letter grade."""
        pass

    def explain_score(self, qb_stats: Dict) -> str:
        """Return breakdown of score components."""
        pass
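
The weighting step on its own can be sketched as a weighted average, assuming each component has already been rescaled to 0-100 (how to do that rescaling is the real design question in this exercise). The dictionary keys, `DEFAULT_WEIGHTS`, and `weighted_score` are hypothetical names; the weights are the suggested ones from the docstring.

```python
from typing import Dict, Optional

# Suggested weights from the exercise; the key names are assumptions.
DEFAULT_WEIGHTS: Dict[str, float] = {
    "cpoe": 0.25,
    "epa_per_dropback": 0.25,
    "pressure_performance": 0.15,
    "big_play_rate": 0.10,
    "turnover_avoidance": 0.15,
    "situational_performance": 0.10,
}


def weighted_score(component_scores: Dict[str, float],
                   weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted average of component scores already on a 0-100 scale."""
    weights = DEFAULT_WEIGHTS if weights is None else weights
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    missing = set(weights) - set(component_scores)
    if missing:
        raise KeyError(f"missing component scores: {sorted(missing)}")
    return sum(weights[name] * component_scores[name] for name in weights)


# Hypothetical component scores: average everywhere except CPOE.
scores = {name: 50.0 for name in DEFAULT_WEIGHTS}
scores["cpoe"] = 90.0
print(round(weighted_score(scores), 2))
```

Validating that the weights sum to 1.0 keeps the result on the same 0-100 scale as the inputs when custom weights are supplied.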

Level 5: Research Projects

Exercise 5.1: CPOE Predictive Value

Research question: How well does CPOE predict future quarterback performance?

Tasks:

1. Calculate season-by-season CPOE for at least 20 quarterbacks
2. Analyze year-to-year correlation (is CPOE stable?)
3. Compare CPOE's predictive value to traditional completion percentage
4. Write a report with visualizations

Exercise 5.2: Air Yards and Offensive Scheme

Analyze how air yards metrics vary by offensive scheme:

Tasks:

1. Classify teams into offensive scheme categories (air raid, west coast, pro-style, spread, etc.)
2. Calculate average aDOT, YAC, and air yards share by scheme
3. Analyze whether CPOE means the same thing across different schemes
4. Create visualizations showing scheme differences

Exercise 5.3: Pressure Impact Study

Comprehensive study of pressure's impact on passing:

Tasks:

1. Calculate league-wide completion percentage drop under pressure
2. Identify QBs who are most/least affected by pressure
3. Analyze whether pressure resilience is stable year-to-year
4. Study relationship between offensive line quality and QB metrics
5. Present findings with statistical significance tests

Exercise 5.4: Building a Better QB Metric

Design, implement, and validate a new composite QB metric:

Tasks:

1. Identify limitations of existing metrics
2. Propose new composite metric with theoretical justification
3. Implement calculation in Python
4. Validate against future performance
5. Compare to existing metrics (passer rating, QBR, EPA)
6. Write paper with methodology and results

Exercise 5.5: Game Script and EPA

Analyze how game script affects EPA metrics:

Tasks:

1. Define game script categories (blowout win/loss, close game, comeback, etc.)
2. Calculate EPA per dropback by game script
3. Identify which QBs have most variance by game script
4. Discuss implications for QB evaluation
5. Propose adjustments for "garbage time" and game context


Bonus Challenges

Challenge A: Real-Time EPA Calculator

Build an interactive tool that calculates EPA in real-time:

- User inputs: yard line, down, distance, play result
- Tool outputs: EP before, EP after, EPA
- Visualize the EP curve

Challenge B: Historical CPOE Analysis

If you can access historical play-by-play data:

- Calculate CPOE for top QB seasons historically
- Identify the most accurate seasons ever by CPOE
- Compare to passer rating rankings

Challenge C: Machine Learning CPOE

Build a more sophisticated completion probability model:

- Use gradient boosting or neural network
- Include more features (weather, stadium, time of day)
- Compare to simple logistic regression model
- Analyze feature importance


Solutions

Solutions are available in code/exercise-solutions.py