Chapter 19: Player Performance Forecasting

Learning Objectives

By the end of this chapter, you will be able to:

  • Build projection models for individual player statistics
  • Handle player development curves and aging effects
  • Incorporate historical comparables for projection
  • Model position-specific performance trajectories
  • Account for context factors like team system and usage
  • Evaluate and calibrate player projections
  • Deploy player projection systems for fantasy and recruiting applications

Introduction

While game prediction focuses on team-level outcomes, player performance forecasting zooms in on individuals—projecting how many yards a running back will gain, how efficiently a quarterback will perform, or how a freshman will develop over four years. These projections power fantasy football decisions, recruiting evaluations, draft analysis, and strategic team building.

Player forecasting presents unique challenges beyond team prediction. Players follow developmental curves, experience injuries, change roles, and operate within constantly shifting team contexts. A quarterback's efficiency depends heavily on his offensive line and receiving corps; a running back's production reflects both his talent and his team's scheme and play-calling tendencies.

This chapter develops comprehensive approaches to player projection, from simple regression baselines to sophisticated models incorporating aging curves, historical comparables, and Bayesian updating. We'll address the practical challenges of small sample sizes, position-specific considerations, and the ever-present uncertainty in projecting human performance.


19.1 The Player Projection Problem

19.1.1 Types of Player Projections

Player projections serve different purposes requiring different approaches:

Season-Long Projections: Total statistics for an upcoming season (yards, touchdowns, fantasy points)

Game-by-Game Projections: Weekly performance predictions accounting for matchup and context

Career Development: Multi-year projections for recruiting and draft evaluation

Breakout Prediction: Identifying players likely to significantly exceed expectations

import numpy as np
import pandas as pd
from dataclasses import dataclass
from typing import Dict, List, Tuple, Optional
from enum import Enum


class ProjectionType(Enum):
    """Types of player projections."""
    SEASON = 'season'
    GAME = 'game'
    CAREER = 'career'
    BREAKOUT = 'breakout'


@dataclass
class PlayerProjection:
    """
    Complete player projection.

    Attributes:
    -----------
    player_id : str
        Unique player identifier
    name : str
        Player name
    position : str
        Player position
    team : str
        Team name
    projection_type : ProjectionType
        Type of projection
    stats : Dict[str, float]
        Projected statistics
    confidence_intervals : Dict[str, Tuple[float, float]]
        90% confidence intervals for each stat
    percentile_range : Tuple[float, float]
        Projected percentile range vs. position
    comparable_players : List[str]
        Historical comparable players
    """
    player_id: str
    name: str
    position: str
    team: str
    projection_type: ProjectionType
    stats: Dict[str, float]
    confidence_intervals: Dict[str, Tuple[float, float]]
    percentile_range: Tuple[float, float]
    comparable_players: List[str]


class Position(Enum):
    """Football positions."""
    QB = 'QB'
    RB = 'RB'
    WR = 'WR'
    TE = 'TE'
    OL = 'OL'
    DL = 'DL'
    LB = 'LB'
    DB = 'DB'
    K = 'K'
    P = 'P'

19.1.2 Key Challenges

Player projection faces several inherent challenges:

Small Samples: A quarterback might throw 400-500 passes per season, which is enough for reasonably stable efficiency estimates. A running back with 150 carries or a receiver with 60 targets provides far less data, so their observed rates are much noisier.

Role Changes: Players change roles within teams, move to new teams, or see their usage shift dramatically year-to-year.

Injuries: Injuries create missing data and may permanently alter performance trajectories.

Context Dependence: Performance depends heavily on teammates, scheme, coaching, and opponent quality.

Development Non-linearity: Players don't improve linearly—they may plateau, break out suddenly, or decline unexpectedly.

def assess_sample_quality(stats: pd.DataFrame) -> Dict[str, str]:
    """
    Assess sample size quality for projection.

    Parameters:
    -----------
    stats : pd.DataFrame
        Player statistics

    Returns:
    --------
    Dict : Sample quality assessment by stat category
    """
    # Minimum volume of the key opportunity stat for each position
    thresholds = {
        'QB': ('attempts', 200),
        'RB': ('carries', 100),
        'WR': ('targets', 50),
        'TE': ('targets', 40)
    }

    assessments = {}

    for pos, (key_stat, threshold) in thresholds.items():

        if key_stat in stats.columns:
            sample = stats[key_stat].sum()
            if sample >= threshold * 1.5:
                assessments[pos] = 'reliable'
            elif sample >= threshold:
                assessments[pos] = 'moderate'
            else:
                assessments[pos] = 'limited'

    return assessments
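
To make the small-sample point concrete, the sketch below compares the binomial standard error of the same observed success rate at two volumes. The helper name `rate_standard_error` and the 62% rate are illustrative assumptions, not values taken from real data.

```python
import math


def rate_standard_error(rate: float, n: int) -> float:
    """Binomial standard error of a success rate observed over n tries."""
    return math.sqrt(rate * (1.0 - rate) / n)


# Hypothetical samples: the same 62% success rate at two volumes
qb_se = rate_standard_error(0.62, 450)  # a season of pass attempts
wr_se = rate_standard_error(0.62, 60)   # a season of targets

print(f"450 attempts: +/- {qb_se:.3f}")
print(f" 60 targets:  +/- {wr_se:.3f}")
```

The noise on the 60-target sample is nearly three times larger, which is why low-volume players need heavier regression toward the league average.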

19.2 Baseline Projection Methods

19.2.1 Simple Regression to Mean

The most fundamental projection approach combines recent performance with regression toward league average:

class RegressionToMeanProjector:
    """
    Project using regression to the mean.

    Key insight: Extreme performances tend to regress toward average.
    The smaller the sample, the more regression is appropriate.
    """

    def __init__(self, league_averages: Dict[str, float],
                 regression_rates: Dict[str, float] = None):
        """
        Initialize projector.

        Parameters:
        -----------
        league_averages : Dict[str, float]
            League average for each stat
        regression_rates : Dict[str, float]
            Weight placed on the observed value for each stat
            (0 = regress fully to the league mean, 1 = no regression)
        """
        self.league_avg = league_averages
        self.regression_rates = regression_rates or self._default_regression_rates()

    def _default_regression_rates(self) -> Dict[str, float]:
        """Default regression rates by stat type."""
        return {
            'yards_per_attempt': 0.4,  # Highly variable, regress more
            'completion_pct': 0.5,
            'td_rate': 0.3,  # TD rate very unstable
            'int_rate': 0.4,
            'yards_per_carry': 0.4,
            'yards_per_target': 0.5,
            'catch_rate': 0.6,  # More stable
            'yards_after_catch': 0.5
        }

    def project(self, observed: float, stat_name: str,
                sample_size: int = None) -> float:
        """
        Project a statistic using regression to mean.

        Parameters:
        -----------
        observed : float
            Observed rate/average
        stat_name : str
            Name of statistic
        sample_size : int
            Sample size (adjusts regression amount)

        Returns:
        --------
        float : Projected value
        """
        league_mean = self.league_avg.get(stat_name, observed)
        base_rate = self.regression_rates.get(stat_name, 0.5)

        # Adjust regression based on sample size
        if sample_size:
            # Larger samples -> less regression
            # Using formula: effective_rate = base_rate * sqrt(sample / 300)
            effective_rate = min(base_rate * np.sqrt(sample_size / 300), 0.95)
        else:
            effective_rate = base_rate

        # Projection = weight * observed + (1 - weight) * mean
        projected = effective_rate * observed + (1 - effective_rate) * league_mean

        return projected

    def project_all_stats(self, player_stats: Dict[str, Tuple[float, int]],
                         usage_projection: int = None) -> Dict[str, float]:
        """
        Project all statistics for a player.

        Parameters:
        -----------
        player_stats : Dict[str, Tuple[float, int]]
            Dict of stat_name -> (observed_rate, sample_size)
        usage_projection : int
            Projected usage (attempts, carries, targets)

        Returns:
        --------
        Dict[str, float] : Projected statistics
        """
        projections = {}

        for stat_name, (observed, sample) in player_stats.items():
            projections[stat_name] = self.project(observed, stat_name, sample)

        return projections
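
As a worked illustration of the sample-size adjustment, the standalone sketch below applies the same square-root formula to one hypothetical quarterback at two volumes. The function name `regress_to_mean` and the input values are assumptions made for the example.

```python
import math


def regress_to_mean(observed, league_mean, base_rate, sample_size,
                    reference_sample=300, max_weight=0.95):
    """Blend an observed rate with the league mean.

    The weight on the observed value grows with the square root of the
    sample size and is capped at max_weight.
    """
    weight = min(base_rate * math.sqrt(sample_size / reference_sample),
                 max_weight)
    return weight * observed + (1.0 - weight) * league_mean


# Hypothetical passer: 9.0 yards per attempt against a 7.5 league average
small = regress_to_mean(9.0, 7.5, base_rate=0.4, sample_size=150)
large = regress_to_mean(9.0, 7.5, base_rate=0.4, sample_size=450)

print(f"150 attempts -> {small:.2f} YPA")
print(f"450 attempts -> {large:.2f} YPA")
```

With 150 attempts the projection sits near 7.9 yards per attempt. With 450 attempts it moves to roughly 8.2, because the larger sample earns more weight on the observed value.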

19.2.2 Marcel Projections

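The Marcel method (borrowed from the baseball projection system of the same name) weights the three most recent seasons, giving the most recent season the largest weight, and then regresses the result toward the league average: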

class MarcelProjector:
    """
    Marcel-style projections using weighted multi-year data.

    Weights: Current year 5, Previous year 4, Two years ago 3
    Plus regression based on total weighted opportunities.
    """

    def __init__(self, league_averages: Dict[str, float],
                 weights: Tuple[float, float, float] = (5, 4, 3)):
        """
        Initialize Marcel projector.

        Parameters:
        -----------
        league_averages : Dict[str, float]
            League average rates
        weights : Tuple[float, float, float]
            Year weights (current, -1, -2)
        """
        self.league_avg = league_averages
        self.weights = weights

    def project(self, seasons: List[Dict]) -> Dict[str, float]:
        """
        Generate Marcel projection from multi-year data.

        Parameters:
        -----------
        seasons : List[Dict]
            List of season dictionaries, most recent first
            Each dict has: 'opportunities', 'stats' (rate stats)

        Returns:
        --------
        Dict[str, float] : Projected statistics
        """
        if not seasons:
            return {k: v for k, v in self.league_avg.items()}

        # Limit to 3 most recent seasons
        seasons = seasons[:3]

        # Calculate weighted stats
        weighted_stats = {}
        total_weighted_opps = 0

        for i, season in enumerate(seasons):
            if i >= len(self.weights):
                break

            weight = self.weights[i]
            opps = season.get('opportunities', 0)

            total_weighted_opps += weight * opps

            for stat, value in season.get('stats', {}).items():
                if stat not in weighted_stats:
                    weighted_stats[stat] = 0
                weighted_stats[stat] += weight * opps * value

        # Normalize by weighted opportunities
        projections = {}
        for stat, weighted_sum in weighted_stats.items():
            if total_weighted_opps > 0:
                observed_rate = weighted_sum / total_weighted_opps
            else:
                observed_rate = self.league_avg.get(stat, 0)

            # Apply regression
            regression_opps = 1200  # Weighted opportunities at which reliability reaches 0.5
            reliability = total_weighted_opps / (total_weighted_opps + regression_opps)

            league_rate = self.league_avg.get(stat, observed_rate)
            projections[stat] = reliability * observed_rate + (1 - reliability) * league_rate

        return projections
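
The arithmetic behind a Marcel projection is easier to follow with numbers. The sketch below works through a single rate stat for a hypothetical quarterback with three seasons of data. The function name `marcel_rate` and the season values are illustrative assumptions.

```python
def marcel_rate(seasons, league_rate, weights=(5, 4, 3), regression_opps=1200):
    """Project one rate stat from (opportunities, rate) pairs, most recent first.

    Returns the projected rate and the reliability weight applied to the
    player's own weighted rate.
    """
    weighted_opps = 0.0
    weighted_sum = 0.0
    for weight, (opps, rate) in zip(weights, seasons):
        weighted_opps += weight * opps
        weighted_sum += weight * opps * rate

    if weighted_opps == 0:
        return league_rate, 0.0  # no data: fall back to the league rate

    observed = weighted_sum / weighted_opps
    reliability = weighted_opps / (weighted_opps + regression_opps)
    projected = reliability * observed + (1 - reliability) * league_rate
    return projected, reliability


# Hypothetical yards-per-attempt history: (attempts, YPA), most recent first
projection, reliability = marcel_rate(
    [(400, 8.0), (300, 7.0), (200, 7.5)], league_rate=7.5
)
print(f"reliability = {reliability:.2f}, projected YPA = {projection:.2f}")
```

Here the weighted opportunities total 3,800, so reliability is 3800 / 5000 = 0.76 and the projection lands at 7.58 yards per attempt.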

19.3 Aging and Development Curves

19.3.1 Position-Specific Aging Curves

Players at different positions follow different developmental trajectories:

class AgingCurveModel:
    """
    Model player aging and development by position.

    Different positions have different peak ages and decline rates.
    College players are typically still developing.
    """

    def __init__(self):
        # Peak ages and decline rates by position
        self.position_curves = {
            'QB': {'development_peak': 24, 'performance_peak': 28,
                   'decline_start': 35, 'decline_rate': 0.02},
            'RB': {'development_peak': 22, 'performance_peak': 25,
                   'decline_start': 28, 'decline_rate': 0.05},
            'WR': {'development_peak': 23, 'performance_peak': 27,
                   'decline_start': 31, 'decline_rate': 0.03},
            'TE': {'development_peak': 24, 'performance_peak': 28,
                   'decline_start': 32, 'decline_rate': 0.03},
            'OL': {'development_peak': 24, 'performance_peak': 28,
                   'decline_start': 33, 'decline_rate': 0.03},
            'DL': {'development_peak': 23, 'performance_peak': 27,
                   'decline_start': 31, 'decline_rate': 0.04},
            'LB': {'development_peak': 23, 'performance_peak': 26,
                   'decline_start': 30, 'decline_rate': 0.04},
            'DB': {'development_peak': 23, 'performance_peak': 26,
                   'decline_start': 29, 'decline_rate': 0.05}
        }

    def get_age_adjustment(self, position: str, current_age: int,
                          projection_year: int = 1) -> float:
        """
        Get age-based adjustment factor.

        Parameters:
        -----------
        position : str
            Player position
        current_age : int
            Current age
        projection_year : int
            Years into future for projection

        Returns:
        --------
        float : Adjustment factor (1.0 = no change)
        """
        if position not in self.position_curves:
            return 1.0

        curve = self.position_curves[position]
        projected_age = current_age + projection_year

        dev_peak = curve['development_peak']
        perf_peak = curve['performance_peak']
        decline_start = curve['decline_start']
        decline_rate = curve['decline_rate']

        # Development phase (college-age)
        if projected_age < dev_peak:
            # Expect improvement: the boost scales with the years remaining
            # until the development peak
            years_to_peak = dev_peak - projected_age
            adjustment = 1.0 + 0.03 * years_to_peak / 4  # 0.75% per remaining year

        # Prime years
        elif projected_age <= decline_start:
            adjustment = 1.0

        # Decline phase
        else:
            years_past_prime = projected_age - decline_start
            adjustment = 1.0 - decline_rate * years_past_prime

        return max(adjustment, 0.5)  # Floor at 50%

    def project_career_trajectory(self, position: str,
                                  current_age: int,
                                  current_performance: float,
                                  years: int = 5) -> pd.DataFrame:
        """
        Project career trajectory over multiple years.

        Parameters:
        -----------
        position : str
            Player position
        current_age : int
            Current age
        current_performance : float
            Current performance level (normalized)
        years : int
            Years to project

        Returns:
        --------
        pd.DataFrame : Year-by-year projections
        """
        trajectory = []

        for year in range(years + 1):
            age = current_age + year
            adjustment = self.get_age_adjustment(position, current_age, year)
            projected = current_performance * adjustment

            trajectory.append({
                'year': year,
                'age': age,
                'adjustment': adjustment,
                'projected_performance': projected
            })

        return pd.DataFrame(trajectory)
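
The prime and decline phases of the curve can be sketched on their own. The example below reuses the RB and QB parameters from the table above. The function name `decline_factor` is a hypothetical helper, and the development phase is deliberately left out.

```python
def decline_factor(age, decline_start, decline_rate, floor=0.5):
    """Linear decline after decline_start, never dropping below floor."""
    years_past_prime = max(age - decline_start, 0)
    return max(1.0 - decline_rate * years_past_prime, floor)


# Parameters taken from the position table above
rb_30 = decline_factor(30, decline_start=28, decline_rate=0.05)
qb_30 = decline_factor(30, decline_start=35, decline_rate=0.02)
rb_40 = decline_factor(40, decline_start=28, decline_rate=0.05)

print(f"RB at 30: {rb_30:.2f}")  # two years past decline_start
print(f"QB at 30: {qb_30:.2f}")  # still in the prime window
print(f"RB at 40: {rb_40:.2f}")  # held at the floor
```

The same age produces very different adjustments by position: a 30-year-old running back is already marked down 10%, while a 30-year-old quarterback is not adjusted at all.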


class CollegeDevelopmentModel:
    """
    Model college player development across 4 years.
    """

    def __init__(self):
        # Average improvement by year
        self.year_improvements = {
            'FR_to_SO': 0.15,  # 15% improvement freshman to sophomore
            'SO_to_JR': 0.10,  # 10% sophomore to junior
            'JR_to_SR': 0.05   # 5% junior to senior
        }

        # Position-specific adjustments
        self.position_factors = {
            'QB': 1.2,   # QBs develop more
            'RB': 0.8,   # RBs contribute early, so less year-over-year growth
            'WR': 1.1,
            'TE': 1.15,
            'OL': 1.1,
            'DL': 1.0,
            'LB': 0.95,
            'DB': 0.9
        }

    def project_development(self, position: str,
                           current_year: str,
                           current_performance: float) -> Dict[str, float]:
        """
        Project development through college career.

        Parameters:
        -----------
        position : str
            Player position
        current_year : str
            Current class year ('FR', 'SO', 'JR', 'SR')
        current_performance : float
            Current performance level

        Returns:
        --------
        Dict[str, float] : Projected performance by year
        """
        years = ['FR', 'SO', 'JR', 'SR']
        year_idx = years.index(current_year)

        pos_factor = self.position_factors.get(position, 1.0)

        projections = {current_year: current_performance}
        perf = current_performance

        for i in range(year_idx, 3):
            transition = f'{years[i]}_to_{years[i+1]}'
            base_improvement = self.year_improvements.get(transition, 0.05)
            improvement = base_improvement * pos_factor

            perf = perf * (1 + improvement)
            projections[years[i+1]] = perf

        return projections
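
Because each year's improvement compounds on the previous year, the position factor has a larger cumulative effect than the single-year percentages suggest. The sketch below repeats the compounding for a freshman, using the improvement rates and position factors listed above. The helper name `compound_development` and the starting value of 100 are assumptions.

```python
# Base year-over-year improvements: FR->SO, SO->JR, JR->SR
BASE_IMPROVEMENTS = [0.15, 0.10, 0.05]


def compound_development(start, position_factor):
    """Return performance for FR, SO, JR, SR given a freshman starting level."""
    levels = [start]
    for base in BASE_IMPROVEMENTS:
        levels.append(levels[-1] * (1 + base * position_factor))
    return levels


qb_path = compound_development(100.0, position_factor=1.2)
rb_path = compound_development(100.0, position_factor=0.8)

print("QB:", [round(x, 1) for x in qb_path])
print("RB:", [round(x, 1) for x in rb_path])
```

Under these assumptions a quarterback's senior-year level is about 40% above his freshman level, while a running back's is about 26% above.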

19.4 Historical Comparable Players

19.4.1 Finding Comparables

Identifying similar players from history helps contextualize projections:

from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import NearestNeighbors


class PlayerComparablesFinder:
    """
    Find historical comparable players.
    """

    def __init__(self, historical_data: pd.DataFrame,
                 comparison_features: List[str]):
        """
        Initialize comparables finder.

        Parameters:
        -----------
        historical_data : pd.DataFrame
            Historical player seasons
        comparison_features : List[str]
            Features to use for comparison
        """
        self.historical = historical_data
        self.features = comparison_features
        self.scaler = StandardScaler()
        self.nn_model = None

        self._fit_model()

    def _fit_model(self):
        """Fit the nearest neighbors model."""
        # Prepare feature matrix
        X = self.historical[self.features].fillna(0)
        X_scaled = self.scaler.fit_transform(X)

        # Fit nearest neighbors
        self.nn_model = NearestNeighbors(n_neighbors=10, metric='euclidean')
        self.nn_model.fit(X_scaled)

    def find_comparables(self, player_stats: Dict[str, float],
                        n_comparables: int = 5,
                        same_position: bool = True) -> pd.DataFrame:
        """
        Find comparable historical players.

        Parameters:
        -----------
        player_stats : Dict[str, float]
            Player's statistics
        n_comparables : int
            Number of comparables to return
        same_position : bool
            Only return same position players

        Returns:
        --------
        pd.DataFrame : Comparable players with similarity scores
        """
        # Prepare player features
        player_features = np.array([
            player_stats.get(f, 0) for f in self.features
        ]).reshape(1, -1)

        player_scaled = self.scaler.transform(player_features)

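        # Find neighbors (request extras to allow for position filtering,
        # but never more than the number of historical rows)
        n_neighbors = min(n_comparables * 2, len(self.historical))
        distances, indices = self.nn_model.kneighbors(
            player_scaled, n_neighbors=n_neighbors
        )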

        # Get comparable players
        comparables = self.historical.iloc[indices[0]].copy()
        comparables['similarity_score'] = 1 / (1 + distances[0])

        # Filter by position if requested
        if same_position and 'position' in player_stats:
            comparables = comparables[
                comparables['position'] == player_stats['position']
            ]

        return comparables.head(n_comparables)

    def project_from_comparables(self, player_stats: Dict[str, float],
                                target_stats: List[str],
                                n_comparables: int = 5) -> Dict[str, float]:
        """
        Generate projection based on comparable players.

        Parameters:
        -----------
        player_stats : Dict[str, float]
            Current player statistics
        target_stats : List[str]
            Statistics to project
        n_comparables : int
            Number of comparables to use

        Returns:
        --------
        Dict[str, float] : Projected statistics
        """
        comparables = self.find_comparables(player_stats, n_comparables)

        projections = {}
        if comparables.empty:
            # No comparable players survived filtering
            return projections

        for stat in target_stats:
            if stat in comparables.columns:
                # Weight by similarity
                weights = comparables['similarity_score']
                values = comparables[stat]
                projections[stat] = np.average(values, weights=weights)

        return projections
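
The comparables idea can be traced by hand on a tiny dataset. The sketch below is a dependency-free version of the same steps: standardize the features, measure Euclidean distance, convert distance to a similarity score, and take a similarity-weighted average of what the closest players did next. The player names, statistics, and the `next_ypg` outcome column are all invented for illustration.

```python
import math
from statistics import mean, pstdev

# Hypothetical history: (name, yards_per_game, td_per_game, next_ypg)
history = [
    ("Player A", 250.0, 2.0, 265.0),
    ("Player B", 180.0, 1.2, 190.0),
    ("Player C", 300.0, 2.8, 290.0),
    ("Player D", 240.0, 1.9, 255.0),
]
target = (245.0, 2.0)  # the player being projected

# Standardize each feature with the historical mean and population std
columns = list(zip(*[(row[1], row[2]) for row in history]))
means = [mean(col) for col in columns]
stds = [pstdev(col) for col in columns]


def standardize(features):
    return [(x - m) / s for x, m, s in zip(features, means, stds)]


target_z = standardize(target)

# Score every historical season by similarity = 1 / (1 + distance)
scored = []
for name, ypg, td, next_ypg in history:
    distance = math.dist(standardize((ypg, td)), target_z)
    scored.append((1.0 / (1.0 + distance), name, next_ypg))

# Keep the two most similar seasons and average their outcomes
top = sorted(scored, reverse=True)[:2]
top_names = [name for _, name, _ in top]
projection = sum(s * nxt for s, _, nxt in top) / sum(s for s, _, _ in top)

print(top_names, round(projection, 1))
```

The projection falls between the two comparables' outcomes and leans toward the closer match, which is the behavior the similarity weighting is meant to produce.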

19.5 Position-Specific Models

19.5.1 Quarterback Projections

class QuarterbackProjector:
    """
    Specialized projections for quarterbacks.
    """

    def __init__(self, league_averages: Dict[str, float] = None):
        self.league_avg = league_averages or {
            'completion_pct': 0.62,
            'yards_per_attempt': 7.5,
            'td_rate': 0.045,
            'int_rate': 0.025,
            'sack_rate': 0.07,
            'rush_yards_per_game': 15
        }

        self.stat_stability = {
            'completion_pct': 0.5,  # Moderate stability
            'yards_per_attempt': 0.4,
            'td_rate': 0.25,  # Low stability
            'int_rate': 0.3,
            'sack_rate': 0.5
        }

    def project_efficiency(self, past_seasons: List[Dict],
                          context_factors: Dict = None) -> Dict[str, float]:
        """
        Project quarterback efficiency metrics.

        Parameters:
        -----------
        past_seasons : List[Dict]
            Previous season statistics
        context_factors : Dict
            Team/context adjustments (OL quality, WR quality, etc.)

        Returns:
        --------
        Dict[str, float] : Projected efficiency metrics
        """
        projections = {}

        for stat, stability in self.stat_stability.items():
            # Get weighted average from past seasons
            weighted_sum = 0
            weight_total = 0

            for i, season in enumerate(past_seasons[:3]):
                year_weight = [5, 4, 3][i] if i < 3 else 0
                attempts = season.get('attempts', 0)

                if stat in season:
                    weighted_sum += year_weight * attempts * season[stat]
                    weight_total += year_weight * attempts

            if weight_total > 0:
                observed = weighted_sum / weight_total
            else:
                observed = self.league_avg[stat]

            # Apply regression
            regression_amount = 1 - stability
            regressed = stability * observed + regression_amount * self.league_avg[stat]

            projections[stat] = regressed

        # Apply context factors
        if context_factors:
            projections = self._apply_context(projections, context_factors)

        return projections

    def _apply_context(self, projections: Dict[str, float],
                      context: Dict) -> Dict[str, float]:
        """Apply context-based adjustments."""
        adjusted = projections.copy()

        # OL quality affects sack rate and time to throw
        if 'ol_quality' in context:
            ol_adj = (context['ol_quality'] - 0.5) * 0.1  # -5% to +5%
            adjusted['sack_rate'] *= (1 - ol_adj)
            adjusted['yards_per_attempt'] *= (1 + ol_adj * 0.5)

        # WR quality affects completion and yards
        if 'wr_quality' in context:
            wr_adj = (context['wr_quality'] - 0.5) * 0.08
            adjusted['completion_pct'] += wr_adj
            adjusted['yards_per_attempt'] *= (1 + wr_adj)

        return adjusted

    def project_counting_stats(self, efficiency: Dict[str, float],
                               games: int = 12,
                               dropback_rate: float = 35) -> Dict[str, float]:
        """
        Convert efficiency to counting stats.

        Parameters:
        -----------
        efficiency : Dict[str, float]
            Projected efficiency metrics
        games : int
            Games projected
        dropback_rate : float
            Dropbacks per game

        Returns:
        --------
        Dict[str, float] : Projected counting statistics
        """
        total_dropbacks = games * dropback_rate
        sacks = total_dropbacks * efficiency.get('sack_rate', 0.07)
        attempts = total_dropbacks - sacks

        return {
            'games': games,
            'attempts': int(attempts),
            'completions': int(attempts * efficiency['completion_pct']),
            'yards': int(attempts * efficiency['yards_per_attempt']),
            'touchdowns': int(attempts * efficiency['td_rate']),
            'interceptions': int(attempts * efficiency['int_rate']),
            'sacks': int(sacks)
        }
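
A quick worked example shows how the efficiency rates turn into season totals. The sketch below uses the default league-average rates listed above with 12 games and 35 dropbacks per game. It keeps fractional values rather than truncating, so the arithmetic stays visible.

```python
# League-average efficiency rates from the defaults above
efficiency = {
    "completion_pct": 0.62,
    "yards_per_attempt": 7.5,
    "td_rate": 0.045,
    "int_rate": 0.025,
    "sack_rate": 0.07,
}

games = 12
dropbacks_per_game = 35

total_dropbacks = games * dropbacks_per_game          # 420 dropbacks
sacks = total_dropbacks * efficiency["sack_rate"]     # sacks are per dropback
attempts = total_dropbacks - sacks                    # remaining dropbacks are passes

season = {
    "attempts": attempts,
    "completions": attempts * efficiency["completion_pct"],
    "yards": attempts * efficiency["yards_per_attempt"],
    "touchdowns": attempts * efficiency["td_rate"],
    "interceptions": attempts * efficiency["int_rate"],
    "sacks": sacks,
}

for stat, value in season.items():
    print(f"{stat:>13}: {value:8.1f}")
```

An average quarterback under these assumptions projects to roughly 391 attempts, 2,930 yards, and between 17 and 18 touchdowns.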

19.5.2 Running Back Projections

class RunningBackProjector:
    """
    Specialized projections for running backs.
    """

    def __init__(self):
        self.league_avg = {
            'yards_per_carry': 4.3,
            'td_rate_rush': 0.035,
            'yards_per_target': 7.5,
            'catch_rate': 0.75,
            'yards_after_contact': 2.5
        }

        # RB efficiency is less stable due to OL dependence
        self.stat_stability = {
            'yards_per_carry': 0.35,
            'td_rate_rush': 0.2,
            'catch_rate': 0.55,
            'yards_per_target': 0.4
        }

    def project(self, past_seasons: List[Dict],
               projected_usage: Dict = None) -> Dict[str, float]:
        """
        Project running back statistics.

        Parameters:
        -----------
        past_seasons : List[Dict]
            Previous season data
        projected_usage : Dict
            Projected carries, targets

        Returns:
        --------
        Dict[str, float] : Projected statistics
        """
        # Project efficiency
        efficiency = self._project_efficiency(past_seasons)

        # Apply usage projection
        if projected_usage:
            carries = projected_usage.get('carries', 150)
            targets = projected_usage.get('targets', 30)
        else:
            carries = 150
            targets = 30

        # Calculate counting stats
        return {
            'carries': carries,
            'rush_yards': int(carries * efficiency['yards_per_carry']),
            'rush_td': int(carries * efficiency['td_rate_rush']),
            'targets': targets,
            'receptions': int(targets * efficiency['catch_rate']),
            # yards_per_target already accounts for incompletions,
            # so it is not multiplied by catch_rate
            'rec_yards': int(targets * efficiency['yards_per_target']),
            'yards_per_carry': efficiency['yards_per_carry'],
            'catch_rate': efficiency['catch_rate']
        }

    def _project_efficiency(self, past_seasons: List[Dict]) -> Dict[str, float]:
        """Project efficiency metrics."""
        projections = {}

        for stat, stability in self.stat_stability.items():
            # Marcel-style weighting
            weighted_sum = 0
            weight_total = 0

            for i, season in enumerate(past_seasons[:3]):
                weight = [5, 4, 3][i] if i < 3 else 0

                # Rushing stats are weighted by carries, receiving stats by targets
                if stat in ('yards_per_carry', 'td_rate_rush'):
                    opps = season.get('carries', 0)
                else:
                    opps = season.get('targets', 0)

                if stat in season and opps > 0:
                    weighted_sum += weight * opps * season[stat]
                    weight_total += weight * opps

            if weight_total > 0:
                observed = weighted_sum / weight_total
            else:
                observed = self.league_avg[stat]

            # Regression
            projections[stat] = (
                stability * observed +
                (1 - stability) * self.league_avg[stat]
            )

        return projections
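
The usage-times-efficiency step is worth checking by hand, because per-target and per-reception rates are easy to confuse. The sketch below uses the league-average rates listed above and the default usage of 150 carries and 30 targets. Since yards per target is measured over all targets, receiving yards follow from targets alone.

```python
# League-average running back rates from the defaults above
yards_per_carry = 4.3
td_rate_rush = 0.035
catch_rate = 0.75
yards_per_target = 7.5

# Default usage assumptions
carries = 150
targets = 30

rush_yards = carries * yards_per_carry      # yards from carries
rush_td = carries * td_rate_rush            # touchdowns from carries
receptions = targets * catch_rate           # caught targets
rec_yards = targets * yards_per_target      # per-target rate, so use targets
yards_per_reception = rec_yards / receptions

print(f"rush yards: {rush_yards:.0f}, rush TD: {rush_td:.2f}")
print(f"receptions: {receptions:.1f}, rec yards: {rec_yards:.0f}")
print(f"implied yards per reception: {yards_per_reception:.1f}")
```

The implied 10 yards per reception is a useful sanity check: it equals yards per target divided by catch rate.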

19.6 Complete Projection System

19.6.1 Integrated Projection Pipeline

class PlayerProjectionSystem:
    """
    Complete player projection system.
    """

    def __init__(self, historical_data: pd.DataFrame = None):
        """
        Initialize projection system.

        Parameters:
        -----------
        historical_data : pd.DataFrame
            Historical player data for comparables
        """
        self.qb_projector = QuarterbackProjector()
        self.rb_projector = RunningBackProjector()
        self.aging_model = AgingCurveModel()
        self.dev_model = CollegeDevelopmentModel()

        if historical_data is not None:
            self.comparables = PlayerComparablesFinder(
                historical_data,
                ['yards_per_game', 'td_per_game', 'efficiency_rating']
            )
        else:
            self.comparables = None

    def project_player(self, player_data: Dict,
                      projection_type: str = 'season') -> PlayerProjection:
        """
        Generate complete player projection.

        Parameters:
        -----------
        player_data : Dict
            Player information and historical stats
        projection_type : str
            'season', 'game', 'career'

        Returns:
        --------
        PlayerProjection : Complete projection
        """
        position = player_data['position']
        past_seasons = player_data.get('past_seasons', [])

        # Position-specific projection
        if position == 'QB':
            stats = self.qb_projector.project_efficiency(past_seasons)
            counting = self.qb_projector.project_counting_stats(
                stats, games=player_data.get('projected_games', 12)
            )
            projected_stats = {**stats, **counting}

        elif position == 'RB':
            projected_stats = self.rb_projector.project(
                past_seasons,
                player_data.get('projected_usage')
            )

        else:
            # Generic projection
            projected_stats = self._generic_projection(player_data)

        # Apply aging adjustment
        if 'age' in player_data:
            age_adj = self.aging_model.get_age_adjustment(
                position, player_data['age']
            )
            projected_stats = {
                k: v * age_adj if isinstance(v, (int, float)) else v
                for k, v in projected_stats.items()
            }

        # Calculate confidence intervals
        confidence_intervals = self._calculate_confidence(
            projected_stats, len(past_seasons)
        )

        # Find comparables
        comparable_names = []
        if self.comparables is not None:
            comps = self.comparables.find_comparables(
                projected_stats, n_comparables=3
            )
            comparable_names = comps['name'].tolist() if 'name' in comps.columns else []

        return PlayerProjection(
            player_id=player_data.get('id', ''),
            name=player_data.get('name', 'Unknown'),
            position=position,
            team=player_data.get('team', ''),
            projection_type=ProjectionType(projection_type),
            stats=projected_stats,
            confidence_intervals=confidence_intervals,
            percentile_range=(25, 75),  # Placeholder
            comparable_players=comparable_names
        )

    def _generic_projection(self, player_data: Dict) -> Dict[str, float]:
        """Generic projection for positions without specialized models."""
        past_seasons = player_data.get('past_seasons', [])

        if not past_seasons:
            return {}

        # Simple weighted average
        projections = {}
        stat_keys = set()
        for season in past_seasons:
            stat_keys.update(season.keys())

        for stat in stat_keys:
            if stat in ['season', 'games', 'team']:
                continue

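            # Use only seasons that report this stat as a number, so missing
            # seasons are not counted as zeros and text fields are skipped
            pairs = [(s[stat], w) for s, w in zip(past_seasons[:3], [5, 4, 3])
                     if isinstance(s.get(stat), (int, float))]

            if pairs:
                values, weights = zip(*pairs)
                projections[stat] = np.average(values, weights=weights)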

        return projections

    def _calculate_confidence(self, stats: Dict[str, float],
                             n_seasons: int) -> Dict[str, Tuple[float, float]]:
        """Approximate uncertainty bands as a fixed percentage of each projection."""
        intervals = {}

        # Uncertainty decreases with more data
        base_uncertainty = 0.3 if n_seasons < 2 else (0.2 if n_seasons < 3 else 0.15)

        for stat, value in stats.items():
            if isinstance(value, (int, float)) and value > 0:
                uncertainty = base_uncertainty * value
                intervals[stat] = (value - uncertainty, value + uncertainty)

        return intervals

    def batch_project(self, players: List[Dict]) -> pd.DataFrame:
        """
        Generate projections for multiple players.

        Returns:
        --------
        pd.DataFrame : Projections for all players
        """
        projections = []

        for player in players:
            proj = self.project_player(player)
            row = {
                'player_id': proj.player_id,
                'name': proj.name,
                'position': proj.position,
                'team': proj.team,
                **proj.stats
            }
            projections.append(row)

        return pd.DataFrame(projections)
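
The interval logic in the system above is a simple heuristic: the band is a fixed percentage of the projected value, and that percentage shrinks as more seasons of data become available. The standalone sketch below reproduces that rule for a hypothetical 1,000-yard projection. The function name `heuristic_band` is an assumption, and these bands are not statistically calibrated intervals.

```python
def heuristic_band(value, n_seasons):
    """Return (low, high) using a fixed percentage that shrinks with more data."""
    if n_seasons < 2:
        pct = 0.30
    elif n_seasons < 3:
        pct = 0.20
    else:
        pct = 0.15
    spread = pct * value
    return value - spread, value + spread


# Hypothetical 1,000-yard projection with one, two, and three seasons of history
for n in (1, 2, 3):
    low, high = heuristic_band(1000.0, n)
    print(f"{n} season(s): {low:.0f} to {high:.0f} yards")
```

A more rigorous system would likely estimate these widths from historical projection errors, but the fixed percentages give a reasonable first approximation.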

Summary

Player performance forecasting requires balancing observed data with appropriate regression, accounting for development trajectories, and handling position-specific considerations. Key takeaways:

  1. Regression to the Mean: Small samples require significant regression toward league averages.

  2. Multi-Year Weighting: Marcel-style projections weight recent seasons more heavily while using all available data.

  3. Position-Specific Models: Different positions have different stability profiles and development curves.

  4. Context Matters: Team quality, scheme, and usage significantly impact player production.

  5. Uncertainty Quantification: Always provide confidence intervals to communicate projection uncertainty.

  6. Historical Comparables: Similar players from history provide valuable context for projections.

The next chapter applies similar techniques to recruiting analytics, projecting how players will develop before they have played a down of college football.


Chapter 19 Exercises

See exercises.md for practice problems on building player projection systems.

Chapter 19 Code Examples

  • example-01-regression-projections.py: Baseline projection methods
  • example-02-aging-curves.py: Development and aging models
  • example-03-position-models.py: Position-specific projectors
  • example-04-complete-system.py: Integrated projection system