Learning Objectives
- Develop power rating systems capable of handling 130+ team pools with sparse head-to-head data
- Quantify the impact of coaching changes on team performance and incorporate transition-year adjustments into predictive models
- Use recruiting rankings and composite scores as leading indicators of future team strength
- Build conference strength adjustments and properly account for strength of schedule in cross-conference play
- Identify and exploit college-specific market inefficiencies including public bias, overnight line movements, and early-season uncertainty
Chapter 20: Modeling College Sports
"The difference between college and pro football is the difference between investing in emerging markets and blue chips. The variance is higher, the information asymmetry is greater, and the opportunities are richer---if you know where to look." --- Rufus Peabody, professional sports bettor
College sports, and college football in particular, occupy a unique position in the sports betting landscape. The combination of a massive team pool (over 130 FBS teams in football alone, 363 Division I teams in basketball), wildly uneven talent distribution, annual roster turnover of 20--25%, and a passionate but often irrational public betting market creates an environment where quantitative models can achieve edges that are simply unavailable in the more efficient professional leagues.
But this opportunity comes with proportional challenges. The same factors that create inefficiencies---large team pools, sparse data, coaching changes, recruiting pipelines, conference realignment---also make modeling college sports significantly more difficult than modeling professional leagues. A model that performs well in the NFL, where 32 teams play 17 games each with relatively stable rosters, may fail catastrophically when applied to a landscape of 130+ teams playing 12--13 games each, with three-to-four-year roster cycles driven by recruiting and the transfer portal.
This chapter develops the quantitative framework for modeling college sports, with primary emphasis on college football (the largest college betting market) and secondary discussion of college basketball where the principles differ. We build on the power rating systems introduced in Chapter 15 and the market efficiency concepts from Chapter 16, extending them to address the unique challenges of the college domain.
Chapter Overview
College sports modeling requires solving several problems simultaneously. First, the sheer number of teams means that most pairs of teams never play each other directly, forcing us to infer relative strength through transitive comparisons and conference-level adjustments. Second, the high rate of roster and coaching turnover means that last season's performance is only a rough guide to this season's quality. Third, the availability of recruiting data provides a forward-looking signal that has no parallel in professional sports. Fourth, the college betting market is populated by a large number of recreational bettors whose biases create systematic mispricings.
We address these challenges in sequence. Section 20.1 develops power rating systems for large team pools, with explicit attention to the sparse-data problem and conference-level regression. Section 20.2 tackles the coaching change problem, quantifying the average impact of a coaching change and developing methods to adjust models during transition years. Section 20.3 introduces recruiting data as a predictive feature, showing how 247Sports and Rivals rankings predict future team strength. Section 20.4 develops conference strength adjustments for cross-conference comparisons. Section 20.5 identifies and analyzes college-specific market inefficiencies.
Throughout this chapter, all Python code is designed for practical implementation with real data. The examples use simulated data that mirrors the statistical properties of actual college football, but the code is structured to work directly with data from sources like Sports Reference, the 247Sports API, or commercial providers.
20.1 Challenges of Large Team Pools
The Scale Problem
The NFL has 32 teams. Each team plays 17 regular-season games. Over a single season, the league produces $32 \times 17 / 2 = 272$ unique game results. Every team has played roughly half the league by the end of the season, and with two or three degrees of separation, the transitive connections between any two teams are dense.
Now consider FBS college football. There are 133 teams (as of the 2024 season, with ongoing conference realignment). Each plays 12--13 regular-season games. Over a season, the league produces approximately $133 \times 12.5 / 2 \approx 830$ game results. This sounds like more data than the NFL, but the connectivity is far sparser. Teams play most of their games within their conference (8--9 games) and only 3--4 non-conference games. Many pairs of conferences have only a handful of cross-conference matchups in any given season. Some FBS teams have never played each other in history.
This sparsity creates a fundamental challenge for power ratings: how do you compare a team from the SEC to a team from the Sun Belt when they share almost no common opponents? The answer requires both statistical methodology and structural assumptions about conference strength.
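This sparsity can be made concrete by treating the season schedule as a graph, with teams as nodes and games as edges, and measuring how many steps separate two teams. A minimal sketch (the team names and toy schedule below are invented for illustration):

```python
from collections import defaultdict, deque

def schedule_graph(games):
    """Adjacency map from a list of (team_a, team_b) game pairings."""
    graph = defaultdict(set)
    for a, b in games:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def degrees_of_separation(graph, team_a, team_b):
    """BFS shortest-path length between two teams; None if no chain of games connects them."""
    if team_a == team_b:
        return 0
    seen = {team_a}
    frontier = deque([(team_a, 0)])
    while frontier:
        node, dist = frontier.popleft()
        for nbr in graph[node]:
            if nbr == team_b:
                return dist + 1
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, dist + 1))
    return None  # disconnected: no transitive comparison possible

# Toy schedule: two insular conferences joined by a single bridge game
games = [
    ('A1', 'A2'), ('A2', 'A3'), ('A1', 'A3'),  # conference A round robin
    ('B1', 'B2'), ('B2', 'B3'), ('B1', 'B3'),  # conference B round robin
    ('A3', 'B1'),                              # lone cross-conference game
]
g = schedule_graph(games)
print(degrees_of_separation(g, 'A1', 'B3'))  # → 3: every A-to-B comparison routes through one game
```

In a real FBS season, many cross-conference comparisons hinge on a handful of such bridge games, which is why a single September upset can move an entire conference's ratings.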
Power Ratings for Large Pools
The standard approach to power ratings in college sports is a least-squares or maximum likelihood framework that jointly estimates the strength of every team. The most common formulation is the margin-based rating, where each team $i$ has a single parameter $r_i$ representing its "power," and the expected margin of victory when team $i$ plays team $j$ at a neutral site is:
$$E[\text{margin}_{ij}] = r_i - r_j$$
With a home-field advantage parameter $h$, the expected margin for a home game is:
$$E[\text{margin}_{ij}] = r_i - r_j + h$$
Given $N$ observed games with actual margins $m_1, m_2, \ldots, m_N$, the least-squares ratings are found by minimizing:
$$\sum_{k=1}^{N} \left(m_k - (r_{h_k} - r_{a_k} + h)\right)^2$$
where $h_k$ and $a_k$ are the home and away teams in game $k$.
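This objective is an ordinary linear regression with a closed-form solution. A minimal sketch using a design matrix, with one row per game; the sum-to-zero constraint row is one common way to pin down the ratings' arbitrary additive constant, and the toy margins are constructed for illustration:

```python
import numpy as np

def least_squares_ratings(games, teams):
    """
    Ordinary least-squares ratings from (home_team, away_team, margin) tuples.
    Each game contributes one row: +1 for the home team, -1 for the away
    team, and a shared home-advantage column. Ratings are identified only
    up to an additive constant, so a final row pins their sum to zero.
    """
    idx = {t: i for i, t in enumerate(teams)}
    n = len(teams)
    X = np.zeros((len(games) + 1, n + 1))
    y = np.zeros(len(games) + 1)
    for k, (home, away, margin) in enumerate(games):
        X[k, idx[home]] = 1.0
        X[k, idx[away]] = -1.0
        X[k, n] = 1.0  # home-advantage column
        y[k] = margin
    X[-1, :n] = 1.0  # identification: ratings sum to zero
    sol, *_ = np.linalg.lstsq(X, y, rcond=None)
    return dict(zip(teams, sol[:n])), sol[n]

# Margins generated exactly from ratings A=+7, B=0, C=-7 with h=3
games = [('A', 'B', 10.0), ('B', 'C', 10.0), ('C', 'A', -11.0), ('A', 'C', 17.0)]
ratings, hfa = least_squares_ratings(games, ['A', 'B', 'C'])
print(round(ratings['A'], 2), round(hfa, 2))  # → 7.0 3.0
```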
This is a simple linear regression problem and can be solved efficiently even for 130+ teams. However, the raw least-squares solution has several problems in the college context:
- Sensitivity to blowouts: A 56--0 win over a weak opponent inflates the winning team's rating. Most implementations cap the margin (e.g., at 24 or 28 points) to reduce the influence of running up the score.
- Conference insularity: Teams within a conference are well-connected, but connections across conferences are sparse. A single upset in a cross-conference game can ripple through the entire conference's ratings.
- Instability early in the season: With only 2--3 games played, the system is severely under-determined and ratings are unreliable.
Regression to Conference Mean
One of the most powerful techniques for stabilizing college power ratings is regression to the conference mean. The idea is simple: before the season provides enough data to differentiate teams, we assume that each team's strength is drawn from a distribution centered on its conference's historical average strength. As the season progresses and we observe more data, we gradually shift from the prior (conference mean) to the data-driven rating.
Mathematically, this is a Bayesian approach. Let $\mu_c$ be the prior mean rating for conference $c$, and let $\sigma_c^2$ be the prior variance (how spread out teams within the conference tend to be). After observing $n$ games with an empirical rating $\hat{r}_i$, the posterior mean is:
$$r_i^{\text{post}} = \frac{\sigma_{\text{data}}^2 \cdot \mu_c + n \cdot \sigma_c^2 \cdot \hat{r}_i}{\sigma_{\text{data}}^2 + n \cdot \sigma_c^2}$$
Early in the season (small $n$), the rating is pulled strongly toward $\mu_c$. Late in the season, the data dominates. This framework neatly handles the cold-start problem and provides sensible ratings even after just one or two games.
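The posterior formula translates directly into code. A minimal sketch, where `data_var` is the per-game margin variance $\sigma_{\text{data}}^2$ (assuming the roughly 14-point margin standard deviation used elsewhere in this chapter):

```python
def posterior_rating(conf_mean, conf_var, empirical_rating, n_games,
                     data_var=14.0 ** 2):
    """
    Normal-normal posterior mean for a team's rating.
    conf_mean, conf_var: conference prior mu_c and sigma_c^2.
    empirical_rating: data-driven rating r_hat from n_games games.
    data_var: variance of a single game margin around its expectation.
    """
    if n_games == 0:
        return conf_mean  # no data yet: fall back to the prior
    num = data_var * conf_mean + n_games * conf_var * empirical_rating
    den = data_var + n_games * conf_var
    return num / den

# An SEC team (prior mean +12, sigma_c = 8) that has looked like a +20 team:
print(round(posterior_rating(12.0, 64.0, 20.0, 2), 1))   # → 15.2, pulled toward the prior
print(round(posterior_rating(12.0, 64.0, 20.0, 12), 1))  # → 18.4, the data dominates
```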
Python Code: Large-Pool Power Ratings
"""
Power rating system for college sports with 130+ teams.
Implements margin-based ratings with conference regression,
margin capping, and week-by-week updates.
"""
import numpy as np
import pandas as pd
from scipy.optimize import minimize
from typing import Dict, List, Optional, Tuple
from dataclasses import dataclass, field
@dataclass
class ConferenceProfile:
"""Prior strength distribution for a conference."""
name: str
prior_mean: float # average team power rating in conference
prior_std: float # standard deviation within conference
n_teams: int
# Conference priors based on historical performance (approximate)
CONFERENCE_PRIORS = {
'SEC': ConferenceProfile('SEC', 12.0, 8.0, 16),
'Big Ten': ConferenceProfile('Big Ten', 10.0, 8.0, 18),
'Big 12': ConferenceProfile('Big 12', 7.0, 6.0, 16),
'ACC': ConferenceProfile('ACC', 6.0, 7.0, 17),
'Pac-12': ConferenceProfile('Pac-12', 5.0, 6.0, 8),
'Group of 5': ConferenceProfile('Group of 5', -3.0, 5.0, 58),
}
class CollegePowerRatings:
"""
Least-squares power ratings with conference regression
for large college team pools.
Parameters
----------
margin_cap : float
Maximum margin of victory to use (cap blowouts).
home_advantage : float
Initial estimate of home-field advantage in points.
conference_regression_weight : float
How strongly to regress to conference mean (0 to 1).
Higher values = more regression (useful early season).
"""
def __init__(
self,
margin_cap: float = 28.0,
home_advantage: float = 3.0,
conference_regression_weight: float = 0.3,
):
self.margin_cap = margin_cap
self.home_advantage = home_advantage
self.conf_reg_weight = conference_regression_weight
self.ratings: Dict[str, float] = {}
self.team_conferences: Dict[str, str] = {}
self.games_played: Dict[str, int] = {}
def initialize_ratings(
self,
team_conferences: Dict[str, str],
conference_priors: Dict[str, ConferenceProfile] = CONFERENCE_PRIORS,
):
"""
Initialize team ratings from conference priors.
Parameters
----------
team_conferences : dict
Mapping of team name to conference name.
"""
self.team_conferences = team_conferences
for team, conf in team_conferences.items():
if conf in conference_priors:
prior = conference_priors[conf]
# Initialize at conference mean with small random noise
self.ratings[team] = prior.prior_mean + np.random.normal(
0, 1.0
)
else:
self.ratings[team] = 0.0
self.games_played[team] = 0
def _cap_margin(self, margin: float) -> float:
"""Cap the margin of victory to reduce blowout influence."""
return np.clip(margin, -self.margin_cap, self.margin_cap)
def _loss_function(
self,
params: np.ndarray,
home_teams: List[str],
away_teams: List[str],
margins: np.ndarray,
team_list: List[str],
weights: np.ndarray,
) -> float:
"""Weighted least squares loss with conference regression."""
n_teams = len(team_list)
ratings = params[:n_teams]
hfa = params[n_teams]
team_idx = {t: i for i, t in enumerate(team_list)}
# Data fit term
loss = 0.0
for k in range(len(margins)):
hi = team_idx[home_teams[k]]
ai = team_idx[away_teams[k]]
predicted = ratings[hi] - ratings[ai] + hfa
residual = margins[k] - predicted
loss += weights[k] * residual ** 2
# Conference regression penalty
for team, conf in self.team_conferences.items():
if team not in team_idx:
continue
idx = team_idx[team]
if conf in CONFERENCE_PRIORS:
prior_mean = CONFERENCE_PRIORS[conf].prior_mean
prior_std = CONFERENCE_PRIORS[conf].prior_std
n_games = self.games_played.get(team, 0)
# Regression strength decreases with games played
reg_strength = self.conf_reg_weight / (1 + n_games / 4)
loss += reg_strength * (
(ratings[idx] - prior_mean) / prior_std
) ** 2
return loss
def fit(
self,
games_df: pd.DataFrame,
recency_half_life: Optional[int] = None,
):
"""
Fit power ratings to game results.
Parameters
----------
games_df : DataFrame
Columns: 'home_team', 'away_team', 'home_score', 'away_score',
'week' (optional), 'neutral_site' (optional, bool).
recency_half_life : int, optional
If provided, apply exponential time decay with this half-life
(in weeks).
"""
df = games_df.copy()
df['margin'] = df['home_score'] - df['away_score']
df['capped_margin'] = df['margin'].apply(self._cap_margin)
# Handle neutral-site games
if 'neutral_site' not in df.columns:
df['neutral_site'] = False
# Adjust margins for neutral site
df.loc[df['neutral_site'], 'capped_margin'] += self.home_advantage
# Recency weights
if recency_half_life and 'week' in df.columns:
max_week = df['week'].max()
weeks_ago = max_week - df['week']
weights = np.exp(-np.log(2) * weeks_ago / recency_half_life)
else:
            weights = pd.Series(1.0, index=df.index)  # Series, so .values in the call below works in both branches
# Update games played
for _, row in df.iterrows():
self.games_played[row['home_team']] = (
self.games_played.get(row['home_team'], 0) + 1
)
self.games_played[row['away_team']] = (
self.games_played.get(row['away_team'], 0) + 1
)
# Collect all teams
all_teams = sorted(
set(df['home_team']) | set(df['away_team'])
)
n = len(all_teams)
# Initial parameter vector: ratings + home advantage
x0 = np.zeros(n + 1)
team_idx = {t: i for i, t in enumerate(all_teams)}
for team in all_teams:
if team in self.ratings:
x0[team_idx[team]] = self.ratings[team]
x0[n] = self.home_advantage
# Optimize
result = minimize(
self._loss_function,
x0,
args=(
df['home_team'].tolist(),
df['away_team'].tolist(),
df['capped_margin'].values,
all_teams,
weights.values,
),
method='L-BFGS-B',
options={'maxiter': 500},
)
# Extract results
for i, team in enumerate(all_teams):
self.ratings[team] = result.x[i]
self.home_advantage = result.x[n]
# Normalize: average rating = 0
avg_rating = np.mean(list(self.ratings.values()))
for team in self.ratings:
self.ratings[team] -= avg_rating
return self
def predict(
self,
home_team: str,
away_team: str,
neutral: bool = False,
) -> Dict[str, float]:
"""
Predict the expected margin and win probability.
Returns dict with 'spread', 'home_win_prob', 'total' (estimated).
"""
r_home = self.ratings.get(home_team, 0)
r_away = self.ratings.get(away_team, 0)
hfa = 0 if neutral else self.home_advantage
spread = r_home - r_away + hfa
# Convert spread to win probability using logistic approximation
# Calibrated to college football: std dev of margin ~ 14 points
sigma = 14.0
home_win_prob = 1.0 / (1.0 + np.exp(-spread / (sigma * 0.55)))
return {
'spread': spread,
'home_win_prob': home_win_prob,
'away_win_prob': 1 - home_win_prob,
}
def get_rankings(self, top_n: int = 25) -> pd.DataFrame:
"""Return top N teams by power rating."""
records = [
{
'Team': team,
'Rating': rating,
'Conference': self.team_conferences.get(team, 'Unknown'),
'Games': self.games_played.get(team, 0),
}
for team, rating in self.ratings.items()
]
return (
pd.DataFrame(records)
.sort_values('Rating', ascending=False)
.head(top_n)
.reset_index(drop=True)
)
def conference_summary(self) -> pd.DataFrame:
"""Compute average rating by conference."""
records = []
conf_ratings = {}
for team, rating in self.ratings.items():
conf = self.team_conferences.get(team, 'Unknown')
if conf not in conf_ratings:
conf_ratings[conf] = []
conf_ratings[conf].append(rating)
for conf, ratings in conf_ratings.items():
records.append({
'Conference': conf,
'Avg Rating': np.mean(ratings),
'Median Rating': np.median(ratings),
'Std Dev': np.std(ratings),
'Best': np.max(ratings),
'Worst': np.min(ratings),
'Teams': len(ratings),
})
return (
pd.DataFrame(records)
.sort_values('Avg Rating', ascending=False)
.reset_index(drop=True)
)
# --- Demonstration ---
if __name__ == "__main__":
np.random.seed(42)
# Create realistic team pool
sec_teams = [
'Alabama', 'Georgia', 'LSU', 'Tennessee', 'Ole Miss',
'Texas A&M', 'Florida', 'Auburn', 'Kentucky', 'Missouri',
'Mississippi St', 'South Carolina', 'Arkansas', 'Vanderbilt',
'Texas', 'Oklahoma',
]
big_ten_teams = [
'Ohio State', 'Michigan', 'Penn State', 'Oregon', 'USC',
'Wisconsin', 'Iowa', 'Minnesota', 'Illinois', 'Nebraska',
'Purdue', 'Indiana', 'Maryland', 'Rutgers', 'Northwestern',
'UCLA', 'Washington', 'Michigan St',
]
g5_teams = [
'Boise State', 'Memphis', 'SMU', 'Tulane', 'UNLV',
'App State', 'James Madison', 'Liberty', 'Marshall',
'Coastal Carolina', 'Troy', 'South Alabama',
]
team_conferences = {}
for t in sec_teams:
team_conferences[t] = 'SEC'
for t in big_ten_teams:
team_conferences[t] = 'Big Ten'
for t in g5_teams:
team_conferences[t] = 'Group of 5'
all_teams = sec_teams + big_ten_teams + g5_teams
# Generate synthetic season results
true_ratings = {}
for t in sec_teams:
true_ratings[t] = np.random.normal(12, 8)
for t in big_ten_teams:
true_ratings[t] = np.random.normal(10, 8)
for t in g5_teams:
true_ratings[t] = np.random.normal(-3, 5)
games = []
for week in range(1, 13):
# Generate ~30 games per week
available = list(all_teams)
np.random.shuffle(available)
n_games = min(len(available) // 2, 30)
for g in range(n_games):
home = available[2 * g]
away = available[2 * g + 1]
expected_margin = true_ratings[home] - true_ratings[away] + 3.0
actual_margin = expected_margin + np.random.normal(0, 14)
home_score = max(0, int(28 + actual_margin / 2 + np.random.normal(0, 5)))
away_score = max(0, int(28 - actual_margin / 2 + np.random.normal(0, 5)))
games.append({
'home_team': home,
'away_team': away,
'home_score': home_score,
'away_score': away_score,
'week': week,
'neutral_site': week == 1 and g < 3,
})
games_df = pd.DataFrame(games)
# Fit power ratings
model = CollegePowerRatings(
margin_cap=28,
home_advantage=3.0,
conference_regression_weight=0.3,
)
model.initialize_ratings(team_conferences)
model.fit(games_df, recency_half_life=8)
print("=== Top 25 Power Ratings ===\n")
print(model.get_rankings(25).to_string(index=False))
print("\n=== Conference Summary ===\n")
print(model.conference_summary().to_string(index=False))
print("\n=== Prediction Example ===\n")
pred = model.predict('Georgia', 'Boise State', neutral=True)
print(f"Georgia vs Boise State (neutral site):")
print(f" Predicted spread: Georgia by {pred['spread']:.1f}")
print(f" Georgia win prob: {pred['home_win_prob']:.1%}")
Early-Season vs. Late-Season Ratings
One of the most important practical considerations in college power ratings is the dramatic difference in reliability between early-season and late-season ratings. After Week 1, you have exactly one data point per team. After Week 4, you have four. After Week 12, you have twelve. The information content is not merely proportional to the number of games---it increases super-linearly because transitive connections multiply as more games are played.
A robust approach uses a sliding regression weight that evolves throughout the season:
- Weeks 0--2: Rely primarily on preseason priors (recruiting, returning production, coaching stability).
- Weeks 3--5: Blend priors with early-season results, heavily regressing toward conference means.
- Weeks 6--8: Current-season data begins to dominate, but priors still inform ratings for teams with unrepresentative schedules.
- Weeks 9--12+: Mostly data-driven, with minimal prior influence.
This evolution must be explicitly modeled; simply fitting least-squares ratings to the full season's data gives equal weight to Week 1 blowouts against FCS opponents and Week 12 rivalry games.
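One simple way to implement the schedule above is a smooth prior weight that decays with the week number. A minimal sketch; the decay shape and the week-7 scale are tuning assumptions chosen to roughly match the week ranges listed, not fitted values:

```python
import numpy as np

def prior_weight(week, scale_week=7.0, shape=2.0):
    """
    Weight placed on preseason priors in a given week; the remainder
    goes to current-season data. Decays smoothly from 1.0 at week 0.
    scale_week and shape are tuning knobs, not estimated parameters.
    """
    return float(np.exp(-(week / scale_week) ** shape))

def blended_rating(prior_rating, data_rating, week):
    """Convex blend of the preseason prior and the in-season rating."""
    w = prior_weight(week)
    return w * prior_rating + (1.0 - w) * data_rating

for wk in (0, 3, 6, 9, 12):
    print(wk, round(prior_weight(wk), 2))  # falls from 1.0 at week 0 to ~0.05 by week 12
```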
20.2 Coaching Changes and Their Impact
Quantifying Coaching Impact
Coaching changes are among the most impactful events in college sports. Unlike professional leagues, where coaching changes may produce modest effects on well-established rosters, college coaching changes can fundamentally alter a program's trajectory. The coach controls recruiting (the primary talent pipeline), scheme selection (which determines whether existing players fit the system), and culture (which affects retention in the transfer portal era).
Historical analysis of FBS coaching changes reveals several consistent patterns:
- Year 1 (Transition Year): On average, teams that change coaches experience a decline of approximately 1.5--2.5 points in power rating relative to their pre-change level in the first year. However, this average masks huge variance: some teams improve dramatically (if the previous coach was incompetent or if the new coach inherits strong talent), while others collapse.
- Year 2 (Adjustment Year): The new coach's recruits begin to arrive, and the scheme is more familiar. Teams typically stabilize around their prior level, plus or minus the quality of the coaching hire.
- Year 3+ (Steady State): The coach's full recruiting classes are now upperclassmen, and the program's identity is established. By Year 3, the coaching change effect is largely absorbed into the team's fundamental talent level and the coach's actual ability.
The magnitude of the Year-1 effect depends critically on:
- Coaching quality differential: Replacing a mediocre coach with an elite one produces a smaller negative shock (or even a positive one) compared to replacing a long-tenured coach with an unknown.
- Scheme compatibility: If the new coach runs a fundamentally different scheme (e.g., switching from a pro-style offense to a spread option), the existing roster may be poorly suited to the new system, amplifying the Year-1 decline.
- Transfer portal activity: In the modern era, coaching changes trigger significant portal activity. Key players may leave, but the new coach can also bring in immediate contributors through transfers.
- Recruiting momentum: If the previous coach's recruiting was poor, the new coach inherits a talent-depleted roster that may not improve until their own recruits arrive.
Python Code: Coaching Change Adjustment
"""
Coaching change impact analysis and model adjustment.
Quantifies the expected performance impact of coaching changes
and adjusts power ratings accordingly.
"""
import numpy as np
import pandas as pd
from typing import Dict, List, Optional, Tuple
from dataclasses import dataclass
@dataclass
class CoachingChange:
"""Record of a coaching change at a program."""
team: str
year: int
outgoing_coach: str
incoming_coach: str
outgoing_tenure: int # years at the program
incoming_source: str # 'promotion', 'lateral', 'step_up', 'external'
scheme_change: bool # True if fundamentally different scheme
year_in_tenure: int = 1 # current year of new coach's tenure
@dataclass
class CoachProfile:
"""
Profile of a coach's expected impact.
win_rate_delta is the expected change in win rate per season
relative to a replacement-level coach.
"""
name: str
career_win_pct: float
years_experience: int
prev_program_rating_delta: float # how much they improved their last job
estimated_ability: float # standardized score (-3 to +3)
class CoachingChangeModel:
"""
Model the impact of coaching changes on team performance.
Uses historical data to estimate transition-year effects
and adjusts power ratings for teams with recent coaching changes.
"""
# Historical average impacts by year of new coach's tenure
# Values are in power rating points (negative = decline)
TENURE_YEAR_IMPACTS = {
1: -2.0, # Year 1: average decline
2: -0.5, # Year 2: partial recovery
3: 0.0, # Year 3: back to baseline
4: 0.3, # Year 4: slight improvement (coach's recruits are juniors)
5: 0.5, # Year 5: peak (if coach is competent)
}
# Adjustments based on coaching change characteristics
SCHEME_CHANGE_PENALTY = -1.5 # additional Year-1 penalty for scheme change
PROMOTION_BONUS = 0.5 # internal promotion reduces disruption
ELITE_HIRE_BONUS = 2.0 # known elite coach reduces transition cost
POOR_PREDECESSOR_BONUS = 1.5 # bad previous coach means improvement room
def __init__(self):
self.coaching_changes: Dict[str, CoachingChange] = {}
self.coach_profiles: Dict[str, CoachProfile] = {}
def register_change(self, change: CoachingChange):
"""Register a coaching change for a team."""
self.coaching_changes[change.team] = change
def register_coach_profile(self, profile: CoachProfile):
"""Register a coach's ability profile."""
self.coach_profiles[profile.name] = profile
def compute_adjustment(
self,
team: str,
pre_change_rating: float,
) -> Dict[str, float]:
"""
Compute the power rating adjustment for a coaching change.
Parameters
----------
team : str
Team name.
pre_change_rating : float
Team's power rating before the coaching change.
Returns
-------
Dict with 'adjustment', 'adjusted_rating', 'confidence'.
"""
if team not in self.coaching_changes:
return {
'adjustment': 0.0,
'adjusted_rating': pre_change_rating,
'confidence': 1.0,
}
change = self.coaching_changes[team]
year = change.year_in_tenure
# Base tenure-year impact
base_impact = self.TENURE_YEAR_IMPACTS.get(year, 0.0)
# Scheme change adjustment (only in Year 1-2)
scheme_adj = 0.0
if change.scheme_change and year <= 2:
scheme_adj = self.SCHEME_CHANGE_PENALTY * (1.0 if year == 1 else 0.5)
# Source adjustment
source_adj = 0.0
if change.incoming_source == 'promotion':
source_adj = self.PROMOTION_BONUS
elif change.incoming_source == 'step_up':
source_adj = -0.5 # G5 to P5 coaches need time to adjust
# Coach quality adjustment
coach_adj = 0.0
if change.incoming_coach in self.coach_profiles:
profile = self.coach_profiles[change.incoming_coach]
if profile.estimated_ability > 1.5:
coach_adj = self.ELITE_HIRE_BONUS * min(year, 3) / 3
elif profile.estimated_ability < -0.5:
coach_adj = -1.0 # bad hire makes things worse
# Predecessor adjustment
pred_adj = 0.0
if change.outgoing_tenure >= 8 and pre_change_rating < -5:
pred_adj = self.POOR_PREDECESSOR_BONUS
total_adjustment = (
base_impact + scheme_adj + source_adj + coach_adj + pred_adj
)
# Confidence decreases for more recent changes (more uncertainty)
confidence = min(1.0, 0.5 + 0.1 * change.year_in_tenure)
return {
'adjustment': total_adjustment,
'adjusted_rating': pre_change_rating + total_adjustment,
'confidence': confidence,
'components': {
'base_tenure': base_impact,
'scheme_change': scheme_adj,
'source': source_adj,
'coach_quality': coach_adj,
'predecessor': pred_adj,
},
}
def analyze_historical_changes(
self,
results_df: pd.DataFrame,
) -> pd.DataFrame:
"""
Analyze historical coaching change impacts.
Parameters
----------
results_df : DataFrame
Columns: 'team', 'year', 'coaching_change' (bool),
'pre_change_rating', 'post_change_rating',
'year_of_tenure'.
Returns
-------
DataFrame with average impact by year of tenure.
"""
changes = results_df[results_df['coaching_change']].copy()
changes['rating_delta'] = (
changes['post_change_rating'] - changes['pre_change_rating']
)
summary = (
changes
.groupby('year_of_tenure')
.agg(
avg_delta=('rating_delta', 'mean'),
median_delta=('rating_delta', 'median'),
std_delta=('rating_delta', 'std'),
count=('rating_delta', 'count'),
)
.reset_index()
)
return summary
# --- Demonstration ---
if __name__ == "__main__":
model = CoachingChangeModel()
# Register some coaching changes
model.register_change(CoachingChange(
team='Nebraska',
year=2024,
outgoing_coach='Former Coach',
incoming_coach='Matt Rhule',
outgoing_tenure=4,
incoming_source='external',
scheme_change=True,
year_in_tenure=2,
))
model.register_change(CoachingChange(
team='Auburn',
year=2024,
outgoing_coach='Former Coach',
incoming_coach='Hugh Freeze',
outgoing_tenure=3,
incoming_source='lateral',
scheme_change=True,
year_in_tenure=2,
))
model.register_change(CoachingChange(
team='Wisconsin',
year=2024,
outgoing_coach='Former Coach',
incoming_coach='Luke Fickell',
outgoing_tenure=6,
incoming_source='lateral',
scheme_change=True,
year_in_tenure=2,
))
# Register coach profiles
model.register_coach_profile(CoachProfile(
name='Matt Rhule',
career_win_pct=0.52,
years_experience=12,
prev_program_rating_delta=8.0,
estimated_ability=0.5,
))
model.register_coach_profile(CoachProfile(
name='Hugh Freeze',
career_win_pct=0.62,
years_experience=15,
prev_program_rating_delta=12.0,
estimated_ability=1.2,
))
model.register_coach_profile(CoachProfile(
name='Luke Fickell',
career_win_pct=0.68,
years_experience=10,
prev_program_rating_delta=10.0,
estimated_ability=1.8,
))
# Compute adjustments
print("=== Coaching Change Adjustments ===\n")
for team in ['Nebraska', 'Auburn', 'Wisconsin']:
pre_rating = {'Nebraska': -2.0, 'Auburn': 3.0, 'Wisconsin': 5.0}[team]
result = model.compute_adjustment(team, pre_rating)
print(f"{team} (pre-change rating: {pre_rating:+.1f}):")
print(f" Total adjustment: {result['adjustment']:+.1f}")
print(f" Adjusted rating: {result['adjusted_rating']:+.1f}")
print(f" Confidence: {result['confidence']:.2f}")
print(f" Components:")
for comp, val in result['components'].items():
print(f" {comp:20s}: {val:+.1f}")
print()
The Transfer Portal Era
The introduction and rapid expansion of the transfer portal has fundamentally altered the dynamics of coaching changes in college sports. Before the portal (pre-2018), a coaching change meant that the new coach was largely stuck with the previous coach's roster for at least one year. Now, a coaching change triggers a cascade of portal activity:
- Key players who were loyal to the departing coach may enter the portal.
- The new coach can recruit transfers to fill scheme-specific needs.
- The net talent flow in the first transfer window after a coaching change is a strong predictor of Year-1 performance.
Modeling the portal effect requires tracking transfer data, which is now available from sources like 247Sports, On3, and various aggregation sites. The key metric is the net talent change: the sum of incoming transfer quality minus outgoing transfer quality, often measured by recruiting stars or composite rankings.
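The net talent change can be computed directly from a transfer table. A minimal sketch, assuming a hypothetical frame layout with composite ratings and origin/destination columns (the column names and player rows are invented, not a real provider schema):

```python
import pandas as pd

def net_portal_talent(transfers: pd.DataFrame, team: str) -> float:
    """Sum of incoming transfer ratings minus outgoing, for one team."""
    incoming = transfers.loc[transfers['to_team'] == team, 'rating'].sum()
    outgoing = transfers.loc[transfers['from_team'] == team, 'rating'].sum()
    return incoming - outgoing

transfers = pd.DataFrame({
    'player':    ['QB1', 'WR1', 'LB1'],
    'rating':    [0.95, 0.88, 0.90],   # composite-style 0-1 scale
    'from_team': ['Team A', 'Team B', 'Team A'],
    'to_team':   ['Team B', 'Team C', 'Team B'],
})
print(round(net_portal_talent(transfers, 'Team B'), 2))  # → 0.97 (gained QB1 and LB1, lost WR1)
```

In practice one would weight each transfer by projected playing time rather than summing raw composites, but the sign and magnitude of the net flow is the headline signal.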
20.3 Recruiting Data as a Predictor
The Foundation of College Success
In professional sports, every team has access to approximately the same talent pool through the draft and free agency. Salary caps further equalize talent distribution. College sports have no such mechanisms. Instead, the talent pipeline is driven by recruiting: 17- and 18-year-old athletes choose which school to attend, influenced by coaching quality, facilities, academic reputation, geographic proximity, NIL (Name, Image, and Likeness) opportunities, and the perceived trajectory of the program.
The result is a profoundly unequal distribution of talent. The top 10 recruiting classes routinely contain more future NFL players than the entire rosters of most Group of 5 programs. This talent asymmetry is the single most important structural feature of college sports, and it is highly predictive of on-field performance.
The 247Sports Composite and Recruiting Metrics
The 247Sports Composite is the most widely used measure of recruiting class quality. It aggregates ratings from 247Sports, Rivals, ESPN, and On3 to produce a composite score for each recruit on a 0--1 scale, and a team-level composite that ranks all 130+ FBS classes.
Key recruiting metrics include:
| Metric | Description | Predictive Value |
|---|---|---|
| Team ranking | Overall class rank (1--130+) | Very high for top-25 vs. bottom-50 |
| Average player rating | Mean composite score per recruit | High (captures class quality) |
| Number of blue chips | Recruits rated 4-star or 5-star | Very high (see blue-chip ratio) |
| Class size | Total recruits signed | Moderate (larger classes = more chances) |
| Transfer additions | Quality of incoming transfers | Increasingly important |
| Position balance | Spread of talent across positions | Moderate (scheme-dependent) |
The Blue-Chip Ratio
Research by Bud Elliott and others has demonstrated that the blue-chip ratio---the percentage of a team's roster composed of 4-star and 5-star recruits---is one of the strongest predictors of championship-level performance. The finding is striking in its simplicity:
Since the inception of the BCS/CFP era, no team has won a national championship with a blue-chip ratio below approximately 50%.
This means that to be a championship contender, at least half of a team's scholarship players must have been rated as 4-star or 5-star recruits. The blue-chip ratio essentially measures the floor of talent required to compete at the highest level.
For betting purposes, the blue-chip ratio is useful as a ceiling estimator. Teams with high blue-chip ratios (Alabama, Georgia, Ohio State) have the talent to compete for championships even in down years. Teams with low blue-chip ratios may have excellent coaching and overperform relative to their talent, but they face a hard ceiling on how far they can go.
Lag Effects in Recruiting
Recruiting data is a leading indicator of future performance, not a contemporaneous one. A top-5 recruiting class signed in February will not significantly impact the team until those freshmen are sophomores or juniors---typically a 1--3 year lag. This lag structure creates a natural forecasting advantage:
- 1 year lag: Freshmen contribute modestly (especially at non-skill positions).
- 2 year lag: Sophomore year is typically when blue-chip recruits break out.
- 3 year lag: Junior and senior years represent peak contribution.
- 4+ year lag: Minimal incremental information (players have graduated or declared for the draft).
The optimal weighting of recruiting classes for predicting performance in year $t$ is approximately:
$$\text{Talent}_{t} = 0.10 \cdot R_{t} + 0.25 \cdot R_{t-1} + 0.30 \cdot R_{t-2} + 0.25 \cdot R_{t-3} + 0.10 \cdot R_{t-4}$$
where $R_{t-k}$ is the recruiting class composite score from $k$ years ago.
Python Code: Recruiting-Based Predictions
"""
Recruiting data as a predictive feature for college football.
Implements blue-chip ratio analysis, lag-weighted recruiting
composites, and integration with power ratings.
"""
import numpy as np
import pandas as pd
from typing import Dict, List, Optional, Tuple
from sklearn.linear_model import LinearRegression
# Optimal lag weights for recruiting class impact
RECRUITING_LAG_WEIGHTS = {
0: 0.10, # current year's class (freshmen)
1: 0.25, # last year's class (sophomores)
2: 0.30, # two years ago (juniors) -- peak impact
3: 0.25, # three years ago (seniors)
4: 0.10, # four years ago (5th years, some departed)
}
def compute_talent_composite(
recruiting_history: Dict[int, float],
current_year: int,
lag_weights: Dict[int, float] = RECRUITING_LAG_WEIGHTS,
) -> float:
"""
Compute a lag-weighted recruiting talent composite.
Parameters
----------
recruiting_history : dict
Mapping of year to recruiting class composite score.
current_year : int
The season to predict.
lag_weights : dict
Mapping of lag (years) to weight.
Returns
-------
Weighted talent composite score.
"""
total_weight = 0.0
weighted_sum = 0.0
for lag, weight in lag_weights.items():
recruit_year = current_year - lag
if recruit_year in recruiting_history:
weighted_sum += weight * recruiting_history[recruit_year]
total_weight += weight
if total_weight == 0:
return 0.0
return weighted_sum / total_weight
def compute_blue_chip_ratio(
roster_stars: List[int],
threshold: int = 4,
) -> float:
"""
Compute the blue-chip ratio for a roster.
Parameters
----------
roster_stars : list of int
Star rating (1-5) for each player on the roster.
threshold : int
Minimum stars to count as "blue chip" (default 4).
Returns
-------
Blue-chip ratio (0 to 1).
"""
if not roster_stars:
return 0.0
blue_chips = sum(1 for s in roster_stars if s >= threshold)
return blue_chips / len(roster_stars)
class RecruitingModel:
"""
Predict team performance from recruiting data.
Combines lag-weighted recruiting composites with power
ratings and coaching data to generate predictions.
"""
def __init__(self):
self.team_recruiting: Dict[str, Dict[int, float]] = {}
self.team_rosters: Dict[str, List[int]] = {}
self.regression_model: Optional[LinearRegression] = None
self.feature_importances: Optional[Dict[str, float]] = None
def add_recruiting_data(
self,
team: str,
year: int,
composite_score: float,
n_blue_chips: int = 0,
class_size: int = 25,
):
"""Add a recruiting class for a team and year."""
if team not in self.team_recruiting:
self.team_recruiting[team] = {}
self.team_recruiting[team][year] = composite_score
def add_roster_data(
self,
team: str,
star_ratings: List[int],
):
"""Add current roster star ratings for blue-chip ratio."""
self.team_rosters[team] = star_ratings
def build_features(
self,
team: str,
current_year: int,
) -> Dict[str, float]:
"""
Build feature set for a team-season.
Returns
-------
Dict of features including talent composite, blue-chip ratio,
recent recruiting trend, and class-to-class volatility.
"""
history = self.team_recruiting.get(team, {})
# Lag-weighted composite
talent = compute_talent_composite(history, current_year)
# Blue-chip ratio (if roster data available)
bcr = 0.0
if team in self.team_rosters:
bcr = compute_blue_chip_ratio(self.team_rosters[team])
# Recruiting trend (are they getting better or worse?)
recent = [
history.get(current_year - k, None) for k in range(3)
]
recent = [r for r in recent if r is not None]
if len(recent) >= 2:
trend = recent[0] - recent[-1] # positive = improving
else:
trend = 0.0
# Recruiting volatility
all_scores = [
history.get(current_year - k, None) for k in range(5)
]
all_scores = [s for s in all_scores if s is not None]
volatility = np.std(all_scores) if len(all_scores) >= 2 else 0.0
return {
'talent_composite': talent,
'blue_chip_ratio': bcr,
'recruiting_trend': trend,
'recruiting_volatility': volatility,
}
def fit_prediction_model(
self,
historical_data: pd.DataFrame,
):
"""
Fit a regression model: recruiting features -> wins.
Parameters
----------
historical_data : DataFrame
Columns: 'team', 'year', 'talent_composite',
'blue_chip_ratio', 'recruiting_trend',
'actual_wins', 'power_rating'.
"""
feature_cols = [
'talent_composite', 'blue_chip_ratio',
'recruiting_trend',
]
X = historical_data[feature_cols].values
y = historical_data['power_rating'].values
self.regression_model = LinearRegression()
self.regression_model.fit(X, y)
self.feature_importances = dict(
zip(feature_cols, self.regression_model.coef_)
)
r2 = self.regression_model.score(X, y)
print(f"Recruiting -> Power Rating model: R^2 = {r2:.3f}")
print(f"Feature importances:")
for feat, coef in self.feature_importances.items():
print(f" {feat:25s}: {coef:+.3f}")
def predict_rating(
self,
team: str,
current_year: int,
) -> float:
"""Predict power rating from recruiting features."""
if self.regression_model is None:
raise ValueError("Must call fit_prediction_model first.")
features = self.build_features(team, current_year)
X = np.array([[
features['talent_composite'],
features['blue_chip_ratio'],
features['recruiting_trend'],
]])
return self.regression_model.predict(X)[0]
def recruiting_vs_performance_analysis(
data: pd.DataFrame,
) -> pd.DataFrame:
"""
Analyze the relationship between recruiting and performance
across multiple seasons.
Parameters
----------
data : DataFrame
Columns: 'team', 'year', 'recruiting_rank',
'talent_composite', 'wins', 'power_rating'.
Returns
-------
Summary DataFrame with correlations by recruiting tier.
"""
# Bin teams into recruiting tiers
data = data.copy()
data['tier'] = pd.cut(
data['recruiting_rank'],
bins=[0, 10, 25, 50, 80, 133],
        labels=['Top 10', '11-25', '26-50', '51-80', '81+'],
)
summary = (
data
.groupby('tier')
.agg(
avg_wins=('wins', 'mean'),
avg_rating=('power_rating', 'mean'),
avg_talent=('talent_composite', 'mean'),
n_teams=('team', 'count'),
)
.reset_index()
)
return summary
# --- Demonstration ---
if __name__ == "__main__":
np.random.seed(42)
model = RecruitingModel()
# Add recruiting data for sample teams
teams_recruiting = {
'Georgia': {2020: 95, 2021: 97, 2022: 93, 2023: 96, 2024: 94},
'Alabama': {2020: 98, 2021: 95, 2022: 96, 2023: 92, 2024: 90},
'Ohio State': {2020: 92, 2021: 90, 2022: 94, 2023: 95, 2024: 93},
'Oregon': {2020: 78, 2021: 82, 2022: 85, 2023: 88, 2024: 91},
'Boise State': {2020: 45, 2021: 48, 2022: 50, 2023: 52, 2024: 55},
'Iowa': {2020: 65, 2021: 62, 2022: 60, 2023: 63, 2024: 61},
}
rosters = {
'Georgia': [5]*10 + [4]*25 + [3]*35 + [2]*15,
'Alabama': [5]*12 + [4]*22 + [3]*38 + [2]*13,
'Ohio State': [5]*8 + [4]*24 + [3]*40 + [2]*13,
'Oregon': [5]*3 + [4]*15 + [3]*45 + [2]*22,
'Boise State': [4]*3 + [3]*25 + [2]*50 + [1]*7,
'Iowa': [4]*8 + [3]*35 + [2]*37 + [1]*5,
}
for team, history in teams_recruiting.items():
for year, score in history.items():
model.add_recruiting_data(team, year, score)
model.add_roster_data(team, rosters[team])
print("=== Team Recruiting Features (2024 Season) ===\n")
for team in teams_recruiting:
features = model.build_features(team, 2024)
bcr = compute_blue_chip_ratio(rosters[team])
print(f"{team:15s}:")
print(f" Talent composite: {features['talent_composite']:.1f}")
print(f" Blue-chip ratio: {bcr:.2%}")
print(f" Recruiting trend: {features['recruiting_trend']:+.1f}")
print()
# Fit prediction model with synthetic historical data
n_obs = 500
hist_data = pd.DataFrame({
'team': np.random.choice(list(teams_recruiting.keys()), n_obs),
'year': np.random.choice(range(2020, 2025), n_obs),
'talent_composite': np.random.uniform(30, 98, n_obs),
'blue_chip_ratio': np.random.uniform(0, 0.6, n_obs),
'recruiting_trend': np.random.normal(0, 5, n_obs),
'actual_wins': np.random.randint(2, 13, n_obs),
})
# Power rating correlates with talent
hist_data['power_rating'] = (
0.3 * hist_data['talent_composite']
+ 15 * hist_data['blue_chip_ratio']
+ 0.5 * hist_data['recruiting_trend']
- 15
+ np.random.normal(0, 4, n_obs)
)
print("=== Recruiting Prediction Model ===\n")
model.fit_prediction_model(hist_data)
print("\n=== Predicted Ratings from Recruiting ===\n")
for team in ['Georgia', 'Alabama', 'Ohio State', 'Oregon',
'Boise State', 'Iowa']:
pred = model.predict_rating(team, 2024)
print(f"{team:15s}: predicted rating = {pred:+.1f}")
Integrating Recruiting with On-Field Data
The optimal approach to college prediction blends recruiting data (a prior about talent) with on-field performance data (observed results). The weighting between the two should evolve over the season:
- Preseason: Recruiting accounts for 60--70% of the rating, with the remainder from prior-year performance and coaching adjustments.
- Mid-season (Week 6): Recruiting drops to 30--40%, with current-season results providing the majority signal.
- Late season (Week 10+): Recruiting accounts for only 15--20%, primarily through its correlation with roster talent.
This Bayesian blending ensures that preseason ratings are not wildly wrong (even if early results are surprising) while allowing the model to respond to genuine changes in team quality as the season unfolds.
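As a concrete sketch of this blending, the helper below linearly interpolates the weight on the recruiting prior across the season. The anchor points are illustrative values transcribed from the schedule above, not fitted parameters, and the function names are my own.

```python
import numpy as np

# (week, recruiting weight) anchors matching the schedule above;
# weeks between anchors are linearly interpolated
BLEND_ANCHORS = [(0, 0.65), (6, 0.35), (10, 0.175), (15, 0.15)]

def recruiting_weight(week: int) -> float:
    """Interpolated weight on the recruiting prior for a given week."""
    weeks, weights = zip(*BLEND_ANCHORS)
    return float(np.interp(week, weeks, weights))

def blended_rating(recruiting_rating: float,
                   onfield_rating: float,
                   week: int) -> float:
    """Convex combination of the recruiting prior and on-field rating."""
    w = recruiting_weight(week)
    return w * recruiting_rating + (1 - w) * onfield_rating

print(f"Week 0 blend:  {recruiting_weight(0):.2f} recruiting")
print(f"Week 6 blend:  {recruiting_weight(6):.2f} recruiting")
print(f"Week 12 blend: {recruiting_weight(12):.2f} recruiting")
```

Linear interpolation is a simple choice; any monotone decay schedule would serve, and the anchors should be tuned by backtesting against closing lines.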
20.4 Conference Strength and Cross-Conference Play
The Conference Strength Problem
The most contentious and consequential challenge in college sports modeling is accurately measuring conference strength. When Alabama beats Auburn, we can be reasonably confident about the relative strength of those two teams. But when Alabama beats a Sun Belt team in a non-conference game, how much of the result is explained by Alabama's quality versus the conference-level talent gap?
The problem is circular: to measure conference strength, we need to know how strong the individual teams are; to measure individual team strength, we need to know how strong their opponents (and thus their conference) are. Power rating systems handle this circularity through simultaneous estimation, but the estimates are only as good as the cross-conference data that connects the clusters.
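To make the simultaneous estimation concrete, here is a minimal Massey-style least-squares sketch: each game contributes one equation (margin approximately equals rating difference, home advantage omitted for brevity), and a sum-to-zero row pins down the overall level. The four teams and margins are toy data.

```python
import numpy as np

def massey_ratings(games, teams):
    """Solve all ratings jointly: each game margin ~ rating difference,
    plus one sum-to-zero row to make the system identifiable."""
    idx = {t: i for i, t in enumerate(teams)}
    rows, margins = [], []
    for home, away, margin in games:
        row = np.zeros(len(teams))
        row[idx[home]] = 1.0   # home team's rating enters positively
        row[idx[away]] = -1.0  # away team's rating enters negatively
        rows.append(row)
        margins.append(float(margin))
    rows.append(np.ones(len(teams)))  # constraint: ratings sum to zero
    margins.append(0.0)
    r, *_ = np.linalg.lstsq(np.vstack(rows), np.array(margins), rcond=None)
    return dict(zip(teams, r))

teams = ['A', 'B', 'C', 'D']
games = [('A', 'B', 14), ('B', 'C', 7), ('A', 'C', 20), ('C', 'D', 3)]
ratings = massey_ratings(games, teams)
for t in teams:
    print(f"{t}: {ratings[t]:+.2f}")
```

Because every team appears in equations with its opponents, one pass of least squares resolves the circularity: ratings and opponent strength are estimated at once, rather than iteratively bootstrapped off each other.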
Building Conference Power Ratings
Conference power ratings are computed as the average (or median) of the individual team ratings within each conference. However, several refinements improve accuracy:
- Weighted average: Weight each team by games played to avoid over-representing teams with unusual schedules.
- Trimmed average: Exclude the top and bottom teams in each conference to get a measure of the "typical" conference experience. A conference with one elite team and many weak teams (historically, Clemson in the ACC) has a different profile than one with many good teams and no dominant one.
- Depth measure: Use the 50th or 75th percentile team rather than the mean. "Conference depth" is how strong the middle-of-the-pack teams are, which is what most teams in the conference actually face.
- Cross-conference calibration: Use non-conference results as the primary signal for inter-conference strength comparisons. Weight cross-conference games more heavily in the conference power calculation.
Non-Conference Game Analysis
Non-conference games are the Rosetta Stone of college sports modeling. They provide the only direct evidence for cross-conference comparisons. However, they are also the noisiest data:
- Early-season timing: Most non-conference games occur in the first 3--4 weeks when rosters are least settled and schemes are least refined.
- Mismatched opponents: Many non-conference games are "buy games" where a Power 4 team pays a Group of 5 or FCS team to come lose. These blowouts provide little information about the top teams' ability.
- Neutral-site games: High-profile non-conference matchups are often played at neutral sites, requiring a home-advantage adjustment of zero rather than the standard 3 points.
- Motivation asymmetry: The Group of 5 team playing at a Power 4 stadium may be extra motivated (their Super Bowl), while the Power 4 team may be looking ahead.
Python Code: Conference Strength Analysis
"""
Conference strength analysis and cross-conference adjustment.
Computes conference power ratings, analyzes non-conference results,
and provides cross-conference calibration.
"""
import numpy as np
import pandas as pd
from typing import Dict, List, Tuple
from scipy import stats
class ConferenceStrengthAnalyzer:
"""
Analyze and compare conference strength in college sports.
Uses team-level power ratings and cross-conference results
to build a hierarchical picture of conference quality.
"""
def __init__(
self,
team_ratings: Dict[str, float],
team_conferences: Dict[str, str],
):
self.team_ratings = team_ratings
self.team_conferences = team_conferences
def compute_conference_ratings(
self,
method: str = 'trimmed_mean',
trim_pct: float = 0.15,
) -> pd.DataFrame:
"""
Compute conference-level power ratings.
Parameters
----------
method : str
'mean', 'median', 'trimmed_mean', or 'depth' (75th pct).
trim_pct : float
Fraction to trim from each end (for trimmed_mean).
Returns
-------
DataFrame with conference ratings and summary statistics.
"""
conf_teams = {}
for team, conf in self.team_conferences.items():
if conf not in conf_teams:
conf_teams[conf] = []
if team in self.team_ratings:
conf_teams[conf].append(self.team_ratings[team])
records = []
for conf, ratings in conf_teams.items():
ratings = sorted(ratings, reverse=True)
if method == 'mean':
rating = np.mean(ratings)
elif method == 'median':
rating = np.median(ratings)
elif method == 'trimmed_mean':
rating = stats.trim_mean(ratings, trim_pct)
            elif method == 'depth':
                # Team 75% of the way down the descending list (the
                # 25th-percentile rating), a measure of conference depth
                idx = int(len(ratings) * 0.75)
                rating = ratings[min(idx, len(ratings) - 1)]
else:
rating = np.mean(ratings)
records.append({
'Conference': conf,
'Rating': rating,
'Best Team': max(ratings),
'Worst Team': min(ratings),
'Std Dev': np.std(ratings),
'N Teams': len(ratings),
'Top-to-Bottom Gap': max(ratings) - min(ratings),
})
return (
pd.DataFrame(records)
.sort_values('Rating', ascending=False)
.reset_index(drop=True)
)
def analyze_cross_conference(
self,
games_df: pd.DataFrame,
) -> pd.DataFrame:
"""
Analyze cross-conference game results.
Parameters
----------
games_df : DataFrame
Columns: 'home_team', 'away_team', 'home_score',
'away_score', 'neutral_site'.
Returns
-------
Conference vs conference record summary.
"""
results = []
for _, game in games_df.iterrows():
home_conf = self.team_conferences.get(game['home_team'])
away_conf = self.team_conferences.get(game['away_team'])
if home_conf is None or away_conf is None:
continue
if home_conf == away_conf:
continue # skip intra-conference
margin = game['home_score'] - game['away_score']
home_rating = self.team_ratings.get(game['home_team'], 0)
away_rating = self.team_ratings.get(game['away_team'], 0)
expected_margin = home_rating - away_rating
if not game.get('neutral_site', False):
expected_margin += 3.0
results.append({
'home_conf': home_conf,
'away_conf': away_conf,
'actual_margin': margin,
'expected_margin': expected_margin,
'residual': margin - expected_margin,
'home_team': game['home_team'],
'away_team': game['away_team'],
})
results_df = pd.DataFrame(results)
# Summarize by conference matchup
conf_summary = (
results_df
.groupby(['home_conf', 'away_conf'])
.agg(
n_games=('actual_margin', 'count'),
avg_margin=('actual_margin', 'mean'),
avg_expected=('expected_margin', 'mean'),
avg_residual=('residual', 'mean'),
)
.reset_index()
)
return conf_summary
def strength_of_schedule(
self,
team: str,
games_df: pd.DataFrame,
) -> Dict[str, float]:
"""
Compute strength of schedule for a team.
Returns average opponent rating, rank, and percentile.
"""
# Find all opponents
opponents = []
home_games = games_df[games_df['home_team'] == team]
away_games = games_df[games_df['away_team'] == team]
for _, game in home_games.iterrows():
opp = game['away_team']
if opp in self.team_ratings:
opponents.append(self.team_ratings[opp])
for _, game in away_games.iterrows():
opp = game['home_team']
if opp in self.team_ratings:
opponents.append(self.team_ratings[opp])
if not opponents:
return {
'avg_opponent_rating': 0.0,
'sos_rank': 0,
'sos_percentile': 0.0,
}
avg_opp = np.mean(opponents)
# Compute SOS rank relative to all teams
all_sos = {}
all_teams_list = set(games_df['home_team']) | set(games_df['away_team'])
for t in all_teams_list:
opps = []
for _, g in games_df[games_df['home_team'] == t].iterrows():
o = g['away_team']
if o in self.team_ratings:
opps.append(self.team_ratings[o])
for _, g in games_df[games_df['away_team'] == t].iterrows():
o = g['home_team']
if o in self.team_ratings:
opps.append(self.team_ratings[o])
if opps:
all_sos[t] = np.mean(opps)
sorted_sos = sorted(all_sos.items(), key=lambda x: x[1], reverse=True)
rank = next(
(i+1 for i, (t, _) in enumerate(sorted_sos) if t == team),
len(sorted_sos)
)
return {
'avg_opponent_rating': avg_opp,
'sos_rank': rank,
'sos_percentile': (1 - rank / len(sorted_sos)) * 100,
'n_opponents': len(opponents),
}
def adjusted_win_loss(
team: str,
team_rating: float,
games_df: pd.DataFrame,
team_ratings: Dict[str, float],
home_advantage: float = 3.0,
) -> Dict[str, float]:
"""
Compute adjusted win-loss metrics accounting for SOS.
Calculates expected wins given the team's schedule and
the difference from actual wins.
"""
expected_wins = 0.0
actual_wins = 0
n_games = 0
# Home games
for _, game in games_df[games_df['home_team'] == team].iterrows():
opp = game['away_team']
opp_rating = team_ratings.get(opp, 0)
neutral = game.get('neutral_site', False)
hfa = 0 if neutral else home_advantage
        spread = team_rating - opp_rating + hfa
        # logistic spread-to-win-probability conversion (scale ~7.7 pts)
        win_prob = 1.0 / (1.0 + np.exp(-spread / (14 * 0.55)))
expected_wins += win_prob
if game['home_score'] > game['away_score']:
actual_wins += 1
n_games += 1
# Away games
for _, game in games_df[games_df['away_team'] == team].iterrows():
opp = game['home_team']
opp_rating = team_ratings.get(opp, 0)
neutral = game.get('neutral_site', False)
hfa = 0 if neutral else home_advantage
spread = team_rating - opp_rating - hfa
win_prob = 1.0 / (1.0 + np.exp(-spread / (14 * 0.55)))
expected_wins += win_prob
if game['away_score'] > game['home_score']:
actual_wins += 1
n_games += 1
luck = actual_wins - expected_wins
return {
'actual_wins': actual_wins,
'expected_wins': expected_wins,
'luck': luck,
'games': n_games,
}
# --- Demonstration ---
if __name__ == "__main__":
np.random.seed(42)
# Create team ratings and conferences
team_ratings = {
'Georgia': 18.5, 'Alabama': 16.2, 'Texas': 14.8,
'Ohio State': 17.1, 'Michigan': 13.5, 'Oregon': 15.0,
'Penn State': 11.2, 'USC': 8.5, 'Wisconsin': 5.0,
'Iowa': 4.2, 'Rutgers': -3.5, 'Northwestern': -6.0,
'Tennessee': 12.0, 'LSU': 10.5, 'Ole Miss': 9.8,
'Vanderbilt': -8.0, 'Mississippi St': -2.0, 'Auburn': 1.5,
'Boise State': 6.5, 'Memphis': 2.0, 'Tulane': 1.0,
'App State': -1.0, 'Troy': -5.0, 'South Alabama': -8.5,
}
team_conferences = {
'Georgia': 'SEC', 'Alabama': 'SEC', 'Texas': 'SEC',
'Tennessee': 'SEC', 'LSU': 'SEC', 'Ole Miss': 'SEC',
'Vanderbilt': 'SEC', 'Mississippi St': 'SEC', 'Auburn': 'SEC',
'Ohio State': 'Big Ten', 'Michigan': 'Big Ten', 'Oregon': 'Big Ten',
'Penn State': 'Big Ten', 'USC': 'Big Ten', 'Wisconsin': 'Big Ten',
'Iowa': 'Big Ten', 'Rutgers': 'Big Ten', 'Northwestern': 'Big Ten',
'Boise State': 'MWC', 'Memphis': 'AAC', 'Tulane': 'AAC',
'App State': 'Sun Belt', 'Troy': 'Sun Belt',
'South Alabama': 'Sun Belt',
}
analyzer = ConferenceStrengthAnalyzer(team_ratings, team_conferences)
print("=== Conference Power Ratings ===\n")
conf_ratings = analyzer.compute_conference_ratings(method='trimmed_mean')
print(conf_ratings.to_string(index=False))
# Generate synthetic games for SOS analysis
games = []
teams_list = list(team_ratings.keys())
for week in range(1, 13):
np.random.shuffle(teams_list)
for i in range(0, len(teams_list) - 1, 2):
home = teams_list[i]
away = teams_list[i + 1]
spread = team_ratings[home] - team_ratings[away] + 3
margin = spread + np.random.normal(0, 14)
hs = max(0, int(28 + margin / 2))
aws = max(0, int(28 - margin / 2))
games.append({
'home_team': home,
'away_team': away,
'home_score': hs,
'away_score': aws,
'neutral_site': False,
'week': week,
})
games_df = pd.DataFrame(games)
print("\n=== Strength of Schedule ===\n")
for team in ['Georgia', 'Boise State', 'Vanderbilt']:
sos = analyzer.strength_of_schedule(team, games_df)
print(f"{team:15s}: avg opp rating = {sos['avg_opponent_rating']:+.1f}, "
f"SOS rank = {sos['sos_rank']}")
print("\n=== Adjusted Win-Loss ===\n")
for team in ['Georgia', 'Ohio State', 'Iowa', 'Boise State']:
awl = adjusted_win_loss(
team, team_ratings[team], games_df, team_ratings
)
print(f"{team:15s}: {awl['actual_wins']:.0f}-"
f"{awl['games'] - awl['actual_wins']:.0f} actual, "
f"{awl['expected_wins']:.1f} expected wins, "
f"luck = {awl['luck']:+.1f}")
The Conference Realignment Challenge
The ongoing wave of conference realignment (2024--2026) creates an additional modeling challenge. When teams move between conferences, historical conference priors no longer apply cleanly. USC and UCLA moving from the Pac-12 to the Big Ten, or Texas and Oklahoma moving from the Big 12 to the SEC, means that:
- Conference average ratings shift abruptly.
- Travel patterns change, affecting home-field advantage.
- Historical cross-conference data becomes intra-conference data.
- The remaining conference (e.g., the diminished Pac-12) loses its top teams and must be re-evaluated entirely.
A robust model handles realignment by anchoring to team-level ratings rather than conference-level assumptions. Conference membership is a grouping variable, not a causal factor. When the groupings change, the team-level ratings remain valid; only the conference summaries need recalculation.
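A minimal sketch of that team-anchored design: conference summaries are just a regrouping of the same team ratings, so a realignment is handled by swapping the membership map. The maps below are simplified illustrations, not full conference rosters.

```python
def conference_means(team_ratings, memberships):
    """Recompute conference-level means from a team -> conference map."""
    sums, counts = {}, {}
    for team, conf in memberships.items():
        if team in team_ratings:
            sums[conf] = sums.get(conf, 0.0) + team_ratings[team]
            counts[conf] = counts.get(conf, 0) + 1
    return {conf: sums[conf] / counts[conf] for conf in sums}

ratings = {'USC': 8.5, 'UCLA': 4.0, 'Oregon': 15.0, 'Washington': 10.0,
           'Ohio State': 17.1, 'Michigan': 13.5}
before = {'USC': 'Pac-12', 'UCLA': 'Pac-12', 'Oregon': 'Pac-12',
          'Washington': 'Pac-12',
          'Ohio State': 'Big Ten', 'Michigan': 'Big Ten'}
after = {team: 'Big Ten' for team in ratings}  # toy post-realignment map

print("Before realignment:", conference_means(ratings, before))
print("After realignment: ", conference_means(ratings, after))
```

Note that no team rating changes between the two calls; only the grouping does, which is exactly why team-anchored models survive realignment intact.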
20.5 College-Specific Market Inefficiencies
The Structure of Inefficiency
College sports betting markets are less efficient than professional markets for several structural reasons:
- Public bias toward big names: Casual bettors disproportionately bet on teams they have heard of---Alabama, Ohio State, Notre Dame. This creates systematic mispricing where famous teams are slightly overvalued (too much public money on their side) and less-known teams are undervalued.
- Massive team pool: Bookmakers cannot devote the same level of attention to every game. A Big Ten prime-time matchup will have a sharply priced line; a Tuesday night MAC game will have a softer line with more room for model-driven edge.
- Information asymmetry: Injury reports, depth chart changes, and lineup adjustments are less consistently reported in college than in professional sports. Bettors with access to reliable local reporting can gain an information edge.
- Emotional betting: College fan bases are intensely loyal and geographically concentrated. When Alabama plays at LSU, the local Louisiana sportsbooks see overwhelming LSU action. This regional bias affects line movement.
- Recreational money volume: College football Saturdays attract enormous recreational betting volume, particularly on parlays and teasers. This casual money inflates the overround and creates misalignment between the closing line and true probabilities.
Overnight Lines and Early-Week Value
One of the most well-documented inefficiencies in college football betting is the value available in overnight lines (also called "openers" or "look-ahead lines"). The lifecycle of a college football line typically follows this pattern:
- Sunday evening: Sportsbooks post early lines for the following week's games. These lines reflect the oddsmaker's initial assessment, often influenced by the model and adjusted for anticipated public action.
- Monday--Wednesday: Sharp bettors evaluate the openers and take positions. Lines move in response to this informed money.
- Thursday--Friday: More information becomes available (practice reports, injury updates). Lines continue to adjust.
- Saturday morning: Final line movements occur as the largest volume of bets comes in.
- Kickoff: The closing line represents the market's final consensus.
Research consistently shows that early-week lines are less accurate than closing lines. However, there is an important nuance: the direction of line movement from opener to closer is informative. If you can identify the direction that sharp money will push the line, you can capture the value of getting in early at a better number.
The practical strategy is:
- Generate your model's line for each game on Sunday.
- Compare to the opener. If your model agrees with the opener but expects sharp money to move the line against you, bet early.
- If your model disagrees with the opener and you expect sharp money to move the line toward your position, wait.
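That timing rule can be sketched as a small decision function. The minimum-edge threshold is an arbitrary placeholder, and the anticipated move would in practice come from your own forecast of sharp action, not from this function.

```python
def opener_timing(model_spread: float,
                  opening_spread: float,
                  anticipated_move: float,
                  min_edge: float = 1.5) -> str:
    """Decide whether to bet the opener now or wait for the closer.

    All spreads are home-team spreads (negative = home favored);
    anticipated_move is the expected closer minus opener.
    """
    edge = opening_spread - model_spread  # positive = value on home side
    if abs(edge) < min_edge:
        return 'pass'
    # Expected sharp move would worsen our number: lock in the opener.
    if edge > 0 and anticipated_move < 0:
        return 'bet home early'
    if edge < 0 and anticipated_move > 0:
        return 'bet away early'
    # Otherwise the line should drift toward our number: wait.
    return 'wait'

# Our model makes the home side -7; the opener is only -4 and we expect
# sharp money to steam the line toward the home team.
print(opener_timing(-7.0, -4.0, anticipated_move=-2.0))
```

The asymmetry is the point: when the market is expected to move toward your model's number, waiting costs you the better price, so you act only when the expected movement runs against you.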
Bowl Game Angles
College football bowl games (and their modern equivalent, the expanded playoff) present a distinct set of betting opportunities:
- Motivation disparity: In non-playoff bowl games, one team may be enthusiastic about the matchup while the other views it as an unwanted consolation prize. Teams that are perceived as "disappointed" with their bowl selection historically underperform.
- Extended preparation time: Bowl games feature 3--4 weeks of preparation, compared to 6--7 days during the regular season. This extended prep time tends to benefit better-coached teams with more complex schemes, as they have more time to install game-specific adjustments.
- Player opt-outs: NFL draft prospects increasingly skip bowl games to avoid injury risk. The impact of key player opt-outs is often underpriced by the market, especially when the opt-outs are announced close to game day after the line has already been set.
- Location effects: Bowl games played in warm-weather venues (Florida, Texas, Arizona) may disadvantage northern teams that practice in cold weather during the preparation period. Conversely, the "home-field" effect of playing in a nearby bowl can boost the home team.
Early-Season Uncertainty
The first 3--4 weeks of the college football season offer unique betting opportunities because:
- Stale lines: Preseason lines are based on prior-year performance, recruiting, and coaching assessments. They cannot account for scheme changes, player development during fall camp, or the integration of transfers. The market is pricing based on expectations rather than evidence.
- Overreaction to Week 1: After Week 1 results are known, the market often overreacts. A team that wins a comfortable opening game against an FCS opponent sees its line move as if it proved something meaningful. A team that struggles in a difficult road opener may be unfairly penalized.
- Conference schedule onset: When conference play begins in Weeks 3--4, the first intra-conference results provide much stronger signals than non-conference games. Teams whose conference results diverge from preseason expectations represent potential value.
Python Code: College Market Inefficiency Analysis
"""
Analysis of college-specific market inefficiencies.
Examines public bias, overnight line value, bowl game angles,
and early-season uncertainty.
"""
import numpy as np
import pandas as pd
from typing import Dict, List, Optional, Tuple
class CollegeMarketAnalyzer:
"""
Analyze college sports betting market inefficiencies.
"""
# Historical ATS (against the spread) patterns
# Based on published research and market analysis
PUBLIC_BIAS_TEAMS = [
'Alabama', 'Ohio State', 'Notre Dame', 'Michigan',
'USC', 'Texas', 'LSU', 'Georgia', 'Florida',
'Oklahoma', 'Clemson', 'Penn State',
]
def __init__(self):
self.games_data: Optional[pd.DataFrame] = None
def load_data(self, games_df: pd.DataFrame):
"""Load game results with betting data."""
self.games_data = games_df.copy()
def analyze_public_bias(
self,
games_df: pd.DataFrame,
public_teams: List[str] = None,
) -> Dict[str, pd.DataFrame]:
"""
Analyze ATS performance of "public" teams vs. others.
Parameters
----------
games_df : DataFrame
Columns: 'home_team', 'away_team', 'home_score',
            'away_score', 'spread' (home team spread, negative = favored).
        Returns
        -------
        Dict with 'comparison', 'public_cover_rate',
        'non_public_cover_rate', and 'edge'.
"""
if public_teams is None:
public_teams = self.PUBLIC_BIAS_TEAMS
        df = games_df.copy()
        df['actual_margin'] = df['home_score'] - df['away_score']
        df['ats_margin'] = df['actual_margin'] + df['spread']
        # Orient ATS results toward the favorite: spread < 0 means the
        # home team is favored, spread > 0 means the away team is.
        df['fav_ats_margin'] = np.where(
            df['spread'] < 0, df['ats_margin'], -df['ats_margin']
        )
        df['fav_covered'] = df['fav_ats_margin'] > 0
        # Identify games involving public teams
        df['home_is_public'] = df['home_team'].isin(public_teams)
        df['away_is_public'] = df['away_team'].isin(public_teams)
        df['involves_public'] = df['home_is_public'] | df['away_is_public']
        # Public team ATS when favored
        public_fav = df[
            (df['home_is_public'] & (df['spread'] < 0)) |
            (df['away_is_public'] & (df['spread'] > 0))
        ]
        # Favorites in games involving no public team
        non_public_fav = df[
            ~df['involves_public'] & (df['spread'] != 0)
        ]
        public_cover_rate = public_fav['fav_covered'].mean() if len(public_fav) > 0 else 0
        non_public_cover_rate = non_public_fav['fav_covered'].mean() if len(non_public_fav) > 0 else 0
        comparison = pd.DataFrame([
            {
                'Category': 'Public Teams (when favored)',
                'Games': len(public_fav),
                'Cover Rate': f"{public_cover_rate:.1%}",
                'Avg ATS Margin': f"{public_fav['fav_ats_margin'].mean():+.1f}" if len(public_fav) > 0 else 'N/A',
            },
            {
                'Category': 'Non-Public Teams (when favored)',
                'Games': len(non_public_fav),
                'Cover Rate': f"{non_public_cover_rate:.1%}",
                'Avg ATS Margin': f"{non_public_fav['fav_ats_margin'].mean():+.1f}" if len(non_public_fav) > 0 else 'N/A',
            },
        ])
return {
'comparison': comparison,
'public_cover_rate': public_cover_rate,
'non_public_cover_rate': non_public_cover_rate,
'edge': non_public_cover_rate - public_cover_rate,
}
def analyze_line_movement(
self,
games_df: pd.DataFrame,
) -> pd.DataFrame:
"""
Analyze the value of betting openers vs closers.
Parameters
----------
games_df : DataFrame
Must contain: 'opening_spread', 'closing_spread',
'home_score', 'away_score'.
Returns
-------
DataFrame analyzing opener vs closer predictive value.
"""
df = games_df.copy()
df['actual_margin'] = df['home_score'] - df['away_score']
df['line_movement'] = df['closing_spread'] - df['opening_spread']
# Value of betting the opener vs closer
df['opener_ats'] = df['actual_margin'] + df['opening_spread']
df['closer_ats'] = df['actual_margin'] + df['closing_spread']
df['opener_covered'] = df['opener_ats'] > 0
df['closer_covered'] = df['closer_ats'] > 0
        # When the line moves, which side has value?
        moved_games = df[abs(df['line_movement']) >= 0.5]
        # Line moved toward home (home closed as a bigger favorite)
        moved_to_home = moved_games[moved_games['line_movement'] <= -0.5]
        # Line moved toward away
        moved_to_away = moved_games[moved_games['line_movement'] >= 0.5]
summary = pd.DataFrame([
{
'Scenario': 'All games (opener)',
'N': len(df),
'Cover Rate': f"{df['opener_covered'].mean():.1%}",
'Avg ATS': f"{df['opener_ats'].mean():+.2f}",
},
{
'Scenario': 'All games (closer)',
'N': len(df),
'Cover Rate': f"{df['closer_covered'].mean():.1%}",
'Avg ATS': f"{df['closer_ats'].mean():+.2f}",
},
{
'Scenario': 'Steam toward home (bet home opener)',
'N': len(moved_to_home),
'Cover Rate': f"{moved_to_home['opener_covered'].mean():.1%}" if len(moved_to_home) > 0 else 'N/A',
'Avg ATS': f"{moved_to_home['opener_ats'].mean():+.2f}" if len(moved_to_home) > 0 else 'N/A',
},
{
'Scenario': 'Steam toward away (bet away opener)',
'N': len(moved_to_away),
'Cover Rate': f"{(1 - moved_to_away['opener_covered']).mean():.1%}" if len(moved_to_away) > 0 else 'N/A',
'Avg ATS': f"{(-moved_to_away['opener_ats']).mean():+.2f}" if len(moved_to_away) > 0 else 'N/A',
},
])
return summary
    def analyze_early_season(
        self,
        games_df: pd.DataFrame,
    ) -> pd.DataFrame:
        """
        Analyze ATS performance by week of season.

        Early-season games (Weeks 1-3) often have softer lines.
        """
        df = games_df.copy()
        df['actual_margin'] = df['home_score'] - df['away_score']
        df['ats_margin'] = df['actual_margin'] + df['spread']
        df['covered'] = df['ats_margin'] > 0

        # Group by season period
        def get_period(week):
            if week <= 3:
                return 'Early (Wk 1-3)'
            elif week <= 6:
                return 'Early-Mid (Wk 4-6)'
            elif week <= 9:
                return 'Mid (Wk 7-9)'
            else:
                return 'Late (Wk 10+)'

        df['period'] = df['week'].apply(get_period)
        summary = (
            df.groupby('period')
            .agg(
                n_games=('covered', 'count'),
                cover_rate=('covered', 'mean'),
                avg_ats=('ats_margin', 'mean'),
                std_ats=('ats_margin', 'std'),
            )
            .reset_index()
        )
        # Format rates and margins for display
        summary['cover_rate'] = summary['cover_rate'].apply(
            lambda x: f"{x:.1%}"
        )
        summary['avg_ats'] = summary['avg_ats'].apply(
            lambda x: f"{x:+.2f}"
        )
        return summary
    def bowl_game_analysis(
        self,
        bowl_games_df: pd.DataFrame,
    ) -> pd.DataFrame:
        """
        Analyze bowl game ATS performance by various factors.

        Parameters
        ----------
        bowl_games_df : DataFrame
            Must contain: 'home_team', 'away_team', 'home_score',
            'away_score', 'spread', 'has_opt_outs' (bool),
            'bowl_type' ('playoff', 'ny6', 'other').
            Optional: 'motivation_edge' (which team wants to be there more).
        """
        df = bowl_games_df.copy()
        df['actual_margin'] = df['home_score'] - df['away_score']
        df['ats_margin'] = df['actual_margin'] + df['spread']
        df['covered'] = df['ats_margin'] > 0

        analyses = []

        # By bowl type
        for btype in df['bowl_type'].unique():
            subset = df[df['bowl_type'] == btype]
            analyses.append({
                'Factor': f'Bowl Type: {btype}',
                'N': len(subset),
                'Cover Rate': f"{subset['covered'].mean():.1%}",
                'Avg ATS': f"{subset['ats_margin'].mean():+.1f}",
            })

        # Opt-out impact
        if 'has_opt_outs' in df.columns:
            with_opts = df[df['has_opt_outs']]
            without_opts = df[~df['has_opt_outs']]
            analyses.append({
                'Factor': 'Team with opt-outs',
                'N': len(with_opts),
                'Cover Rate': f"{with_opts['covered'].mean():.1%}" if len(with_opts) > 0 else 'N/A',
                'Avg ATS': f"{with_opts['ats_margin'].mean():+.1f}" if len(with_opts) > 0 else 'N/A',
            })
            analyses.append({
                'Factor': 'Team without opt-outs',
                'N': len(without_opts),
                'Cover Rate': f"{without_opts['covered'].mean():.1%}" if len(without_opts) > 0 else 'N/A',
                'Avg ATS': f"{without_opts['ats_margin'].mean():+.1f}" if len(without_opts) > 0 else 'N/A',
            })
        return pd.DataFrame(analyses)
# --- Demonstration ---
if __name__ == "__main__":
    np.random.seed(42)
    analyzer = CollegeMarketAnalyzer()

    # Generate a synthetic season with realistic inefficiencies
    n_games = 800
    teams = [
        'Alabama', 'Georgia', 'Ohio State', 'Michigan',
        'Oregon', 'Penn State', 'Texas', 'Notre Dame',
        'Tennessee', 'USC', 'LSU', 'Ole Miss',
        'Iowa', 'Wisconsin', 'Boise State', 'Memphis',
        'Tulane', 'App State', 'Vanderbilt', 'Rutgers',
    ]
    games = []
    for i in range(n_games):
        home = np.random.choice(teams)
        away = np.random.choice([t for t in teams if t != home])

        # Public teams get slightly worse lines (market shades for public money)
        is_public_home = home in CollegeMarketAnalyzer.PUBLIC_BIAS_TEAMS
        is_public_away = away in CollegeMarketAnalyzer.PUBLIC_BIAS_TEAMS

        true_spread = np.random.normal(0, 10)  # true expected home margin
        opening_spread = true_spread + np.random.normal(0, 2)  # noise

        # Public bias: when a public team is the favorite, the opener is
        # shaded past the true margin (the public team lays extra points)
        if is_public_home and true_spread > 0:
            opening_spread += 1.0  # inflate home favorite's spread
        if is_public_away and true_spread < 0:
            opening_spread -= 1.0  # inflate away favorite's spread

        closing_spread = true_spread + np.random.normal(0, 1)  # less noise
        actual_margin = true_spread + np.random.normal(0, 14)
        home_score = max(0, int(28 + actual_margin / 2))
        away_score = max(0, int(28 - actual_margin / 2))

        games.append({
            'home_team': home,
            'away_team': away,
            'home_score': home_score,
            'away_score': away_score,
            'spread': -opening_spread,  # convention: negative = home favored
            'opening_spread': -opening_spread,
            'closing_spread': -closing_spread,
            'week': np.random.randint(1, 14),
        })
    games_df = pd.DataFrame(games)

    print("=== Public Bias Analysis ===\n")
    bias_result = analyzer.analyze_public_bias(games_df)
    print(bias_result['comparison'].to_string(index=False))
    print(f"\nEdge (fade public): {bias_result['edge']:.1%}")

    print("\n=== Line Movement Analysis ===\n")
    movement = analyzer.analyze_line_movement(games_df)
    print(movement.to_string(index=False))

    print("\n=== Early Season Analysis ===\n")
    early = analyzer.analyze_early_season(games_df)
    print(early.to_string(index=False))

    # Bowl game analysis
    n_bowls = 40
    bowl_games = []
    for i in range(n_bowls):
        home = np.random.choice(teams)
        away = np.random.choice([t for t in teams if t != home])
        true_spread = np.random.normal(0, 8)
        actual_margin = true_spread + np.random.normal(0, 14)
        bowl_games.append({
            'home_team': home,
            'away_team': away,
            'home_score': max(0, int(24 + actual_margin / 2)),
            'away_score': max(0, int(24 - actual_margin / 2)),
            'spread': -true_spread + np.random.normal(0, 1.5),
            'has_opt_outs': np.random.random() < 0.3,
            'bowl_type': np.random.choice(
                ['playoff', 'ny6', 'other'], p=[0.1, 0.15, 0.75]
            ),
        })
    bowl_df = pd.DataFrame(bowl_games)

    print("\n=== Bowl Game Analysis ===\n")
    bowl_analysis = analyzer.bowl_game_analysis(bowl_df)
    print(bowl_analysis.to_string(index=False))
Exploitable Patterns in College Betting
Based on historical analysis and market structure, the following patterns have shown persistence in college sports betting:
- Fading public teams as large favorites: When teams like Alabama or Ohio State are favored by 20+ points, the line is often inflated by public money. Historically, fading (betting against) large public favorites has shown marginal profitability. The effect is small (approximately 1--2% ROI) and requires high volume, but it is one of the most documented biases.
- Betting on mid-major teams in early-season non-conference games: Group of 5 teams hosting mid-tier Power 4 opponents in Weeks 1--3 are systematically undervalued. The market underestimates the value of home field in unfamiliar venues and overestimates the talent gap based on recruiting rankings alone.
- Conference game unders after high-scoring non-conference play: Teams that put up big numbers against weak non-conference opponents often see inflated totals in their first conference games, where the competition stiffens.
- Teams coming off bye weeks against teams on short rest: The scheduling advantage of a bye week is well established, but the market does not always fully price in the compounding effect when the opponent is on short rest (e.g., played on Thursday the previous week).
- Coaching change Year-1 unders: Teams with new coaches in Year 1, especially those implementing a new offensive scheme, tend to start slowly. The under is often the better bet in the first half of the season for these teams.
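The first pattern in the list above can be turned into a simple screen. The sketch below is illustrative only: the team set, column names, and 20-point threshold are assumptions, and `spread` follows the convention used elsewhere in this chapter (negative = home favored).

```python
import pandas as pd

# Hypothetical public-team set; in practice use ticket-count data.
PUBLIC_TEAMS = {'Alabama', 'Ohio State', 'Georgia', 'Michigan', 'Notre Dame'}

def fade_candidates(games: pd.DataFrame, min_lay: float = 20.0) -> pd.DataFrame:
    """Flag games where a public team is laying min_lay+ points --
    the 'fade large public favorites' pattern."""
    home_fade = games['home_team'].isin(PUBLIC_TEAMS) & (games['spread'] <= -min_lay)
    away_fade = games['away_team'].isin(PUBLIC_TEAMS) & (games['spread'] >= min_lay)
    out = games.loc[home_fade | away_fade].copy()
    # Bet the non-public side of each flagged game
    out['bet_side'] = ['away' if h else 'home' for h in home_fade.loc[out.index]]
    return out
```

A screen like this only identifies candidates; whether the fade is actually profitable after vig is the subject of Exercise 5.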
Market Insight: The Information Cascade of College Saturday
On a typical college football Saturday, there are 60+ FBS games. The sheer volume overwhelms the market's ability to price every game sharply. Games kicking off at noon Eastern receive less attention than prime-time games, and games involving Group of 5 teams in lesser time slots may have lines that are soft enough to exploit with a competent model. The optimal strategy for a college bettor is to focus on these lower-profile games where the information edge is largest, rather than competing with the sharp market on high-profile matchups.
20.6 Chapter Summary
This chapter developed the quantitative framework for modeling college sports, addressing the unique challenges that distinguish this domain from professional sports modeling.
Key Concepts
Large Team Pools require power rating systems that can handle 130+ teams with sparse head-to-head data. The margin-based least-squares approach, augmented with conference regression priors and margin capping, provides a practical and effective solution. Early-season ratings rely heavily on preseason priors (recruiting, coaching, returning production), while late-season ratings are primarily data-driven.
Coaching Changes are among the most impactful events in college sports. The average Year-1 coaching change produces a decline of 1.5--2.5 points in power rating, but the variance is large. Scheme changes, coaching quality, and transfer portal activity all modulate the effect. A disciplined model accounts for coaching changes explicitly rather than letting them appear as unexplained noise.
Recruiting Data is the most powerful long-term predictor of college team performance. The blue-chip ratio (percentage of 4-star and 5-star recruits on the roster) sets a ceiling on program potential. Lag-weighted recruiting composites, emphasizing classes from 2--3 years prior, provide the strongest predictive signal. Integrating recruiting with on-field performance in a Bayesian framework yields ratings that are more stable and more accurate than either source alone.
Conference Strength is the most contentious measurement in college sports. Cross-conference games provide the only direct evidence for inter-conference comparisons, but these games are noisy, often mismatched, and played early in the season. Conference power ratings, computed as trimmed means of team-level ratings with cross-conference calibration, provide the best available assessment.
Market Inefficiencies in college sports are more numerous and more exploitable than in professional markets. Public bias toward big-name programs, overnight line value, bowl game angles, and early-season uncertainty all provide edges for the quantitative bettor. The key structural advantage is the massive number of games, which overwhelms the market's ability to price every matchup sharply.
Key Formulas
Margin-based power rating prediction:
$$E[\text{margin}_{ij}] = r_i - r_j + h \cdot \mathbf{1}_{\text{home}}$$
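This equation can be fit directly by least squares, as the chapter's `CollegePowerRatings` class does at scale. A minimal sketch on a tiny hypothetical three-team schedule (indices and margins invented for illustration):

```python
import numpy as np

# Hypothetical schedule: (home_idx, away_idx, home_margin)
games = [(0, 1, 10), (1, 2, 3), (2, 0, -7), (0, 2, 14), (1, 0, -4)]
n_teams = 3

# One row per game: E[margin] = r_home - r_away + h
X = np.zeros((len(games), n_teams + 1))
y = np.zeros(len(games))
for row, (i, j, margin) in enumerate(games):
    X[row, i] = 1.0    # home team's rating enters positively
    X[row, j] = -1.0   # away team's rating enters negatively
    X[row, -1] = 1.0   # shared home-field term h
    y[row] = margin

# Ratings are identified only up to an additive constant, so append a
# sum-to-zero constraint row to pin the mean rating at 0.
X = np.vstack([X, np.append(np.ones(n_teams), 0.0)])
y = np.append(y, 0.0)

sol, *_ = np.linalg.lstsq(X, y, rcond=None)
ratings, hfa = sol[:-1], sol[-1]
pred_margin = ratings[0] - ratings[1] + hfa  # model's line for team 0 hosting team 1
```

Because shifting all ratings by a constant leaves every game residual unchanged, the constraint row is satisfied exactly at the optimum.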
Bayesian regression to conference mean:
$$r_i^{\text{post}} = \frac{\sigma_{\text{data}}^2 \cdot \mu_c + n \cdot \sigma_c^2 \cdot \hat{r}_i}{\sigma_{\text{data}}^2 + n \cdot \sigma_c^2}$$
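A direct implementation of this shrinkage; the default values for $\sigma_c$ and $\sigma_{\text{data}}$ below are illustrative assumptions, not fitted parameters:

```python
def shrink_to_conference(r_hat, n_games, mu_conf,
                         sigma_conf=6.0, sigma_data=13.0):
    """Posterior rating: precision-weighted blend of the team's raw
    rating r_hat (backed by n_games of evidence) and its conference
    mean mu_conf, per the formula above."""
    num = sigma_data ** 2 * mu_conf + n_games * sigma_conf ** 2 * r_hat
    den = sigma_data ** 2 + n_games * sigma_conf ** 2
    return num / den
```

With zero games played, the posterior is exactly the conference mean; as games accumulate, it converges to the team's own least-squares rating.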
Lag-weighted recruiting composite:
$$\text{Talent}_t = \sum_{k=0}^{4} w_k \cdot R_{t-k}$$
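Computing the composite is a weighted dot product over the last five signing classes. The weight profile below, peaking at the 2--3-year-old classes as the chapter recommends, is illustrative; in practice the weights should be fitted, not assumed:

```python
import numpy as np

# Illustrative lag weights w_0..w_4 (current class through 4 years ago)
LAG_WEIGHTS = np.array([0.10, 0.25, 0.30, 0.25, 0.10])

def talent_composite(class_scores, weights=LAG_WEIGHTS):
    """Talent_t = sum_k w_k * R_{t-k}, where class_scores[k] is the
    composite recruiting score of the class signed k years ago."""
    w = np.asarray(weights, dtype=float)
    return float(np.dot(w / w.sum(), np.asarray(class_scores, dtype=float)))
```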
Spread to win probability conversion:
$$P(\text{win}) = \frac{1}{1 + e^{-\text{spread}/(0.55 \cdot \sigma)}}$$
where $\sigma \approx 14$ points for college football and $\sigma \approx 11$ points for the NFL.
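The conversion is a one-liner; here `spread` is taken as the team's expected margin (positive = favored), so flip the sign if you are starting from the posted line:

```python
import math

def spread_to_win_prob(spread, sigma=14.0):
    """Logistic approximation above. sigma ~ 14 for college football,
    ~ 11 for the NFL; spread is the expected margin, positive = favored."""
    return 1.0 / (1.0 + math.exp(-spread / (0.55 * sigma)))
```

Because college margins are more dispersed, the same 7-point edge implies a lower win probability in college than in the NFL.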
Practical Guidelines
- Start with recruiting, layer on performance: Preseason ratings built from 4-year recruiting composites provide a stable base. Layer current-season results on top as they accumulate, but never completely abandon the recruiting prior---even at season's end, recruiting explains significant variance.
- Cap margins aggressively: In college sports, blowouts are common and uninformative. A 45-point victory over an FCS opponent tells you almost nothing about the winning team's actual quality. Cap margins at 24--28 points to prevent these games from distorting your ratings.
- Respect conference structure: Use conference regression priors in your power ratings, especially early in the season. A 3--0 Sun Belt team is not equivalent to a 3--0 SEC team, even if their margin-of-victory statistics are similar.
- Track coaching changes explicitly: Maintain a database of coaching changes with metadata (scheme change, coaching quality, portal activity). Apply systematic adjustments rather than letting your model treat a team with a coaching change like any other.
- Fish in shallow waters: Focus your betting on lower-profile games where the market is softest. A 2% edge on a Tuesday MACtion game is more exploitable than a 2% edge on an SEC on CBS headliner, because the latter line is priced much more sharply.
- Monitor the transfer portal: In the modern era, the portal is as important as recruiting for Year-1 roster assessment. Track portal additions and departures, especially at positions of need, to update your preseason ratings.
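The margin-capping guideline amounts to a single clip applied before any game enters the rating fit; `cap=24` below is one defensible choice within the 24--28 range suggested above:

```python
import numpy as np

def capped_margin(home_score, away_score, cap=24):
    """Clip the scoring margin before it enters the ratings fit, so a
    45-point FCS blowout carries no more weight than a 24-point win."""
    return float(np.clip(home_score - away_score, -cap, cap))
```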
Looking Ahead
Part IV has now covered sport-specific modeling for the major betting sports. The principles developed across these chapters---Poisson models for low-scoring sports, margin-based ratings for point-spread markets, domain-specific features like xG and recruiting, and market structure analysis---form the foundation for the more advanced techniques in Parts V and VI, where we introduce machine learning, live betting models, and portfolio-level bet management. The bettor who has mastered the fundamentals of sport-specific modeling is now ready to add computational power to their analytical framework.
Review Questions
- Why does the sparsity of cross-conference play in college football create problems for power rating systems, and how does regression to the conference mean address this issue?
- Describe the typical trajectory of a coaching change's impact over Years 1--3. What factors determine whether Year 1 will be better or worse than the average decline?
- What is the blue-chip ratio, and why does it function more as a ceiling estimator than a floor estimator for team performance?
- Explain the lag structure of recruiting data as a predictive feature. Why is the class signed 2--3 years ago more predictive of current-season performance than the class signed in the current year?
- A college football game between a public Power 4 team and an unknown Group of 5 team opens with a spread of -17. By midweek, the line moves to -19. What market dynamics likely caused this movement, and does the opener or the closer represent the better betting value?
- How would you adjust your power rating system to account for conference realignment (e.g., teams moving from one conference to another between seasons)?
Exercises
- Power Rating System (Programming): Implement the CollegePowerRatings class and fit it to a full season of FBS data from Sports Reference. Compare your top-25 ranking to the final CFP rankings and the AP Poll. Analyze where your model disagrees and explain the sources of disagreement.
- Recruiting Analysis (Data Analysis): Download five years of 247Sports team recruiting rankings. For each year, compute the lag-weighted talent composite and the blue-chip ratio. Fit a linear regression predicting end-of-season power rating from these recruiting features. What is the $R^2$? How does it change as you vary the lag weights?
- Coaching Change Study (Analysis): Identify all FBS coaching changes from 2015 to 2024. For each, compute the team's power rating change from the year before the change to the year after. Fit a regression model predicting the Year-1 change from characteristics of the coaching change (previous coach tenure, new coach source, scheme change, recruiting rank). Which factors are statistically significant?
- Conference Strength (Programming): Build a conference strength analyzer using the code from Section 20.4. Compute conference power ratings for each FBS conference using mean, median, and trimmed-mean methods. Which method produces the most stable conference rankings from year to year?
- Market Inefficiency Backtest (Programming): Using historical line data and results, backtest the "fade public favorites" strategy described in Section 20.5. Define "public favorite" as one of the top-10 most bet-on teams when favored by 14+ points. Compute the strategy's ATS record, ROI, and statistical significance over a 5-year period. Is the effect large enough to be profitable after accounting for vig?
Further Reading
- Massey, K. (1997). "Statistical Models Applied to the Rating of Sports Teams." Bachelor's thesis, Bluefield College.
- Stern, H.S. (1995). "Who's Number 1 in College Football? And How Might We Decide?" Chance, 8(3), 7--14.
- Elliott, B. (2014--ongoing). "The Blue-Chip Ratio." SB Nation / Banner Society.
- Pomeroy, K. (2003--ongoing). KenPom.com: College Basketball Ratings.
- Sagarin, J. (1985--ongoing). USA TODAY Sagarin Ratings.
- Boulier, B.L. and Stekler, H.O. (2003). "Predicting the Outcomes of National Football League Games." International Journal of Forecasting, 19(2), 257--270.
- Harville, D. (1977). "The Use of Linear-Model Methodology to Rate High School or College Football Teams." Journal of the American Statistical Association, 72(358), 278--289.