College Football Playoff Projections

Beginner 10 min read 19 views Nov 27, 2025

# College Football Playoff Projections

## Overview

Predicting College Football Playoff (CFP) selection involves analyzing team performance, strength of schedule, conference championships, and head-to-head results. The expanded 12-team format (starting in 2024) adds automatic bids, first-round byes, and a larger at-large pool, so a model has more outcomes to account for than it did under the four-team format.

## CFP Selection Criteria

### Primary Factors

1. **Win-Loss Record**: Overall and conference record
2. **Strength of Schedule**: Quality of opponents faced
3. **Conference Championships**: Auto-bids for the five highest-ranked conference champions
4. **Head-to-Head Results**: Direct matchup outcomes
5. **Common Opponents**: Performance against shared foes
6. **Computer Rankings**: SP+, FPI, and other metrics (useful modeling proxies rather than formal committee criteria)

### 12-Team Playoff Format (2024+)

- **5 Conference Champions**: Automatic bids for the five highest-ranked champions
- **7 At-Large Bids**: Best remaining teams
- **Top 4 Seeds**: Receive first-round byes (in 2024 these went to the four highest-ranked conference champions; from 2025 the field is seeded straight from the rankings)
- **Seeds 5-12**: Play first-round games

## R Implementation with cfbfastR

```r
library(cfbfastR)
library(dplyr)
library(tidyr)    # pivot_longer()
library(ggplot2)

# cfbfastR reads the CollegeFootballData API key from the
# CFBD_API_KEY environment variable; set it before running.

# Build team records from completed games
team_records <- cfbd_game_info(year = 2023) %>%
  filter(!is.na(home_points), !is.na(away_points)) %>%
  mutate(
    winner = if_else(home_points > away_points, home_team, away_team),
    loser = if_else(home_points < away_points, home_team, away_team)
  ) %>%
  pivot_longer(cols = c(winner, loser), names_to = "result", values_to = "team") %>%
  group_by(team) %>%
  summarise(
    games = n(),
    wins = sum(result == "winner"),
    losses = sum(result == "loser"),
    win_pct = wins / games
  )

# Get SP+ ratings (predictive power rating)
sp_ratings <- cfbd_ratings_sp(year = 2023)

# Get FPI ratings
# Column names below follow the original example; check names(fpi_ratings)
# against your installed cfbfastR version before selecting them.
fpi_ratings <- cfbd_ratings_fpi(year = 2023)

# Merge all data and drop teams without ratings (e.g. FCS opponents)
playoff_data <- team_records %>%
  left_join(sp_ratings %>% select(team, rating), by = "team") %>%
  rename(sp_rating = rating) %>%
  left_join(
    fpi_ratings %>% select(team, fpi, sos, resumeRanks.strengthOfRecord),
    by = "team"
  ) %>%
  filter(!is.na(sp_rating), !is.na(fpi))

# Calculate playoff probability features
playoff_features <- playoff_data %>%
  mutate(
    # Wins scaled by the team's own SP+ rating (a rough stand-in for win
    # quality; it does not count wins over top-25 opponents)
    quality_wins = wins * (sp_rating / 100),
    # Win percentage adjusted by SOS
    adj_win_pct = win_pct * (1 + sos / 100),
    # Combined rating score
    composite_score = (sp_rating + fpi) / 2,
    # Conference championship bonus (simplified: 12+ wins used as a proxy)
    conf_champ_bonus = if_else(wins >= 12, 5, 0)
  )

# Simple playoff probability model: a hand-tuned logistic transform,
# not a regression fitted to historical selections.
# For demonstration, the threshold flags teams with 11+ wins and an
# SP+ rating (not rank) of at least 15.
playoff_features <- playoff_features %>%
  mutate(
    playoff_threshold = (wins >= 11 & sp_rating >= 15),
    playoff_prob = plogis((sp_rating + fpi * 0.5 + wins * 2 - 30) / 10)
  )

# Top playoff contenders
playoff_contenders <- playoff_features %>%
  arrange(desc(playoff_prob)) %>%
  select(team, wins, losses, sp_rating, fpi, sos, playoff_prob) %>%
  head(20)

cat("Top 20 Playoff Probability Rankings:\n")
print(playoff_contenders)

# Visualize playoff probabilities
top_25 <- playoff_features %>%
  arrange(desc(playoff_prob)) %>%
  head(25)

ggplot(top_25, aes(x = reorder(team, playoff_prob), y = playoff_prob * 100)) +
  geom_col(aes(fill = if_else(wins >= 12, "12+ wins", "< 12 wins"))) +
  # Named values keep colors correct even if only one group is present
  scale_fill_manual(values = c("< 12 wins" = "steelblue", "12+ wins" = "darkgreen")) +
  coord_flip() +
  labs(
    title = "CFB Playoff Probabilities - Top 25 Teams",
    x = "Team",
    y = "Playoff Probability (%)",
    fill = "Win Total"
  ) +
  theme_minimal()

# Simulate playoff scenarios using Monte Carlo
simulate_playoff_scenarios <- function(teams_df, n_sims = 10000) {
  playoff_selections <- matrix(0, nrow = nrow(teams_df), ncol = 1)
  rownames(playoff_selections) <- teams_df$team

  for (sim in seq_len(n_sims)) {
    # Perturb each team's composite score, then take the top 12
    selected <- teams_df %>%
      mutate(
        random_factor = rnorm(n(), mean = composite_score, sd = 5),
        adjusted_score = random_factor + conf_champ_bonus
      ) %>%
      arrange(desc(adjusted_score)) %>%
      head(12)  # 12-team playoff

    # Count selections
    playoff_selections[selected$team, 1] <-
      playoff_selections[selected$team, 1] + 1
  }

  playoff_pct <- (playoff_selections / n_sims) * 100
  return(playoff_pct)
}

# Run simulation
simulation_results <- simulate_playoff_scenarios(playoff_features)

simulation_df <- data.frame(
  team = rownames(simulation_results),
  playoff_pct = simulation_results[, 1]
) %>%
  arrange(desc(playoff_pct)) %>%
  head(20)

cat("\nPlayoff Probability (10,000 Simulations):\n")
print(simulation_df)

# Plot simulation results
ggplot(simulation_df, aes(x = reorder(team, playoff_pct), y = playoff_pct)) +
  geom_col(fill = "darkblue", alpha = 0.7) +
  geom_hline(yintercept = 50, linetype = "dashed", color = "red") +
  coord_flip() +
  labs(
    title = "CFB Playoff Selection Probability (Monte Carlo Simulation)",
    x = "Team",
    y = "Playoff Selection Probability (%)",
    subtitle = "Based on 10,000 simulations"
  ) +
  theme_minimal()
```

## Python Implementation

```python
import os

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import requests

BASE_URL = 'https://api.collegefootballdata.com'
# The CFB Data API requires a bearer token. Reading it from the CFBD_API_KEY
# environment variable is an assumption made here (the same name cfbfastR uses).
HEADERS = {'Authorization': f"Bearer {os.environ['CFBD_API_KEY']}"}

def fetch_cfbd(endpoint, year):
    """Request one season of data from a CFB Data API endpoint"""
    response = requests.get(f'{BASE_URL}/{endpoint}', params={'year': year},
                            headers=HEADERS, timeout=30)
    response.raise_for_status()
    return response.json()

def get_team_records(year):
    """Fetch team records from CFB Data API"""
    return pd.DataFrame(fetch_cfbd('records', year))

def get_sp_ratings(year):
    """Fetch SP+ ratings"""
    return pd.DataFrame(fetch_cfbd('ratings/sp', year))

def get_fpi_ratings(year):
    """Fetch FPI ratings, flattening nested fields into dotted column names"""
    return pd.json_normalize(fetch_cfbd('ratings/fpi', year))

# Load data for 2023 season
records = get_team_records(2023)
sp_ratings = get_sp_ratings(2023)
fpi_ratings = get_fpi_ratings(2023)

# Prepare team records
team_records = records[['team', 'total']].copy()
team_records['wins'] = team_records['total'].apply(lambda x: x['wins'])
team_records['losses'] = team_records['total'].apply(lambda x: x['losses'])
team_records['win_pct'] = team_records['wins'] / (
    team_records['wins'] + team_records['losses']
)
team_records = team_records[['team', 'wins', 'losses', 'win_pct']]

# Merge datasets
playoff_df = team_records.merge(
    sp_ratings[['team', 'rating']], on='team', how='left'
).rename(columns={'rating': 'sp_rating'})

# Schedule strength may arrive nested under 'resumeRanks' (flattened above)
# or as a top-level field; handle both layouts.
sos_col = ('resumeRanks.strengthOfSchedule'
           if 'resumeRanks.strengthOfSchedule' in fpi_ratings.columns
           else 'strengthOfSchedule')
playoff_df = playoff_df.merge(
    fpi_ratings[['team', 'fpi', sos_col]], on='team', how='left'
).rename(columns={sos_col: 'sos'})

# Drop teams with missing data
playoff_df = playoff_df.dropna()

# Calculate composite features
playoff_df['composite_rating'] = (playoff_df['sp_rating'] + playoff_df['fpi']) / 2
playoff_df['quality_metric'] = playoff_df['win_pct'] * playoff_df['composite_rating']

# Simple playoff probability model
# Hand-tuned logistic function based on wins and ratings (not fitted to data)
def calculate_playoff_prob(row):
    """Calculate playoff probability as a percentage"""
    # Weighted combination of factors.
    # Note: if 'sos' is a rank (1 = hardest schedule), invert it before use,
    # otherwise this term rewards easier schedules.
    score = (
        row['wins'] * 3 +
        row['sp_rating'] * 0.5 +
        row['fpi'] * 0.3 +
        row['sos'] * 0.2
    )
    # Logistic transformation
    prob = 1 / (1 + np.exp(-(score - 40) / 5))
    return prob * 100

playoff_df['playoff_prob'] = playoff_df.apply(calculate_playoff_prob, axis=1)

# Sort by playoff probability
playoff_df = playoff_df.sort_values('playoff_prob', ascending=False)

print("Top 20 CFB Playoff Contenders - 2023:")
print(playoff_df[['team', 'wins', 'losses', 'sp_rating', 'fpi', 'playoff_prob']].head(20))

# Monte Carlo simulation for playoff selection
def simulate_playoffs(df, n_simulations=10000, n_teams=12, seed=None):
    """
    Simulate playoff selection using Monte Carlo method
    """
    rng = np.random.default_rng(seed)
    teams = df['team'].to_numpy()
    ratings = df['composite_rating'].to_numpy()
    counts = np.zeros(len(df), dtype=int)

    for _ in range(n_simulations):
        # Add random noise to composite rating
        sim_score = ratings + rng.normal(0, 3, size=len(ratings))
        # Select the n_teams highest simulated scores and count them
        selected = np.argsort(sim_score)[-n_teams:]
        counts[selected] += 1

    # Convert to percentages
    results_df = pd.DataFrame({
        'team': teams,
        'selection_pct': counts / n_simulations * 100
    }).sort_values('selection_pct', ascending=False)
    return results_df

# Run simulation
simulation = simulate_playoffs(playoff_df)

print("\nPlayoff Selection Probability (10,000 Simulations):")
print(simulation.head(20))

# Visualizations
fig, axes = plt.subplots(2, 2, figsize=(16, 12))

# 1. Top 25 playoff probabilities
top_25 = playoff_df.head(25)
axes[0, 0].barh(range(len(top_25)), top_25['playoff_prob'],
                color='darkgreen', alpha=0.7)
axes[0, 0].set_yticks(range(len(top_25)))
axes[0, 0].set_yticklabels(top_25['team'], fontsize=8)
axes[0, 0].set_xlabel('Playoff Probability (%)')
axes[0, 0].set_title('Top 25 Playoff Probabilities (Model-Based)')
axes[0, 0].invert_yaxis()

# 2. Monte Carlo simulation results
top_20_sim = simulation.head(20)
axes[0, 1].barh(range(len(top_20_sim)), top_20_sim['selection_pct'],
                color='steelblue', alpha=0.7)
axes[0, 1].set_yticks(range(len(top_20_sim)))
axes[0, 1].set_yticklabels(top_20_sim['team'], fontsize=8)
axes[0, 1].set_xlabel('Selection Probability (%)')
axes[0, 1].set_title('Playoff Selection Probability (10K Simulations)')
axes[0, 1].axvline(50, color='red', linestyle='--', alpha=0.5)
axes[0, 1].invert_yaxis()

# 3. Wins vs SP+ rating scatter
axes[1, 0].scatter(playoff_df['wins'], playoff_df['sp_rating'],
                   s=playoff_df['playoff_prob'] * 3, alpha=0.6,
                   c=playoff_df['playoff_prob'], cmap='RdYlGn')
axes[1, 0].set_xlabel('Wins')
axes[1, 0].set_ylabel('SP+ Rating')
axes[1, 0].set_title('Wins vs SP+ Rating (size = playoff probability)')
axes[1, 0].grid(alpha=0.3)

# 4. Rating comparison for the top contenders
top_contenders = playoff_df.head(15)
x = np.arange(len(top_contenders))
width = 0.35
axes[1, 1].barh(x - width / 2, top_contenders['sp_rating'], width,
                label='SP+ Rating', color='blue', alpha=0.7)
axes[1, 1].barh(x + width / 2, top_contenders['fpi'], width,
                label='FPI Rating', color='orange', alpha=0.7)
axes[1, 1].set_yticks(x)
axes[1, 1].set_yticklabels(top_contenders['team'], fontsize=8)
axes[1, 1].set_xlabel('Rating')
axes[1, 1].set_title('Top 15 Teams - SP+ vs FPI Comparison')
axes[1, 1].legend()
axes[1, 1].invert_yaxis()

plt.tight_layout()
plt.show()
```

## Key Modeling Considerations

### Feature Engineering

1. **Win Quality**: Weight wins by opponent strength
2. **Loss Penalty**: Timing and margin of losses matter
3. **Conference Strength**: Adjust for schedule difficulty
4. **Momentum**: Recent performance trends
5. **Head-to-Head**: Direct comparison tiebreaker

### Model Validation

- Historical accuracy: Test on past playoff selections
- Cross-validation: Train on multiple seasons and hold out whole seasons for testing
- Calibration: Ensure predicted probabilities match observed selection rates
- Committee bias: Account for subjective factors

### Limitations

- Committee subjectivity (eye test, narratives)
- Conference championship game outcomes are unknown until they are played
- Injury impacts are not captured in season stats
- Strength of schedule calculations vary between rating systems

## Practical Applications

1. **Weekly Rankings**: Update probabilities after each game
2. **Scenario Analysis**: "What if" game outcome modeling
3. **Betting Markets**: Compare model to Vegas odds
4. **Resume Building**: Identify must-win games for teams

## Resources

- [ESPN FPI](https://www.espn.com/college-football/fpi)
- [SP+ Ratings](https://www.espn.com/college-football/story/_/id/38819392/college-football-sp+-rankings-week-9)
- [CFP Selection Committee Protocol](https://collegefootballplayoff.com/sports/2016/10/20/selection-committee-protocol.aspx)
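
## Appendix: Auto-Bid Field Selection Sketch

The simulations above fill the bracket with the 12 highest simulated scores, which ignores the rule that five bids are reserved for conference champions. The following is a minimal sketch of how that rule could be layered on top of any score, not the committee's actual procedure. The function name `select_field`, the dictionary keys `score` and `is_conf_champ`, and the toy teams `T01` to `T14` are all hypothetical; it also assumes straight seeding by score once the field is set.

```python
def select_field(teams, n_teams=12, n_auto_bids=5):
    """Pick a playoff field: auto-bids for the top champions, then at-large.

    `teams` is a list of dicts with hypothetical keys
    'team', 'score', and 'is_conf_champ'.
    """
    ranked = sorted(teams, key=lambda t: t["score"], reverse=True)

    # The highest-scoring conference champions receive automatic bids
    auto = [t for t in ranked if t["is_conf_champ"]][:n_auto_bids]
    auto_names = {t["team"] for t in auto}

    # Remaining spots go to the best teams not already selected
    remaining = [t for t in ranked if t["team"] not in auto_names]
    at_large = remaining[: n_teams - len(auto)]

    # Assumption: seed the combined field straight by score
    field = sorted(auto + at_large, key=lambda t: t["score"], reverse=True)
    return [
        {
            "seed": i + 1,
            "team": t["team"],
            "bid": "auto" if t["team"] in auto_names else "at-large",
        }
        for i, t in enumerate(field)
    ]


# Toy data: 14 made-up teams with descending scores; T14 is a
# low-rated conference champion.
champions = {"T01", "T02", "T05", "T09", "T14"}
teams = [
    {"team": f"T{i:02d}", "score": 31.0 - i, "is_conf_champ": f"T{i:02d}" in champions}
    for i in range(1, 15)
]

field = select_field(teams)
for row in field:
    print(row)
```

In this toy example the low-rated champion `T14` takes the last seed and `T12`, which a plain top-12 cut would include, is left out. Inside a Monte Carlo loop the same function could be called on each set of simulated scores, provided a champion flag is available from another source.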
