Recruiting Analytics

Beginner 10 min read Nov 27, 2025
# Recruiting Analytics

## Overview

Recruiting is the lifeblood of college football programs. Analytics helps evaluate recruiting classes, predict player development, and assess the relationship between recruiting rankings and on-field success.

## Key Recruiting Metrics

### Talent Composite

- Aggregates recruiting rankings from multiple services (247Sports, Rivals, ESPN, On3)
- Normalizes the services' different rating scales
- Better predictor than single-source rankings

### Star Ratings

- 5-star: Elite prospects (roughly the top 30-35 nationally)
- 4-star: High-quality prospects (roughly the top 300)
- 3-star: Solid contributors
- 2-star and below: Developmental prospects

### Blue-Chip Ratio

- Percentage of a roster's signees who were 4-star or 5-star recruits
- Teams above a 50% blue-chip ratio have historically been playoff contenders

## R Analysis with cfbfastR

```r
library(cfbfastR)
library(dplyr)
library(purrr)    # provides map_df()
library(ggplot2)
library(tidyr)

# cfbfastR needs a CollegeFootballData API key, e.g.
# Sys.setenv(CFBD_API_KEY = "your-key")

# Load recruiting data for multiple years
recruiting_data <- map_df(2018:2023, function(year) {
  cfbd_recruiting_player(year = year) %>%
    mutate(year = year)
})

# Calculate average rating and star counts by team and class
team_recruiting <- recruiting_data %>%
  group_by(year, committed_to) %>%
  summarise(
    commits = n(),
    avg_rating = mean(rating, na.rm = TRUE),
    five_stars = sum(stars == 5, na.rm = TRUE),
    four_stars = sum(stars == 4, na.rm = TRUE),
    three_stars = sum(stars == 3, na.rm = TRUE),
    blue_chip_pct = (five_stars + four_stars) / commits * 100,
    .groups = "drop"
  ) %>%
  arrange(year, desc(avg_rating))

# Top recruiting classes 2023
top_classes_2023 <- team_recruiting %>%
  filter(year == 2023) %>%
  arrange(desc(avg_rating)) %>%
  head(25)

print("Top 25 Recruiting Classes - 2023")
print(top_classes_2023)

# Analyze recruiting vs team performance
# Get team records (completed games only)
team_records <- cfbd_game_info(year = 2023) %>%
  filter(!is.na(home_points), !is.na(away_points)) %>%
  mutate(
    winner = if_else(home_points > away_points, home_team, away_team),
    loser = if_else(home_points < away_points, home_team, away_team)
  ) %>%
  pivot_longer(cols = c(winner, loser), names_to = "result", values_to = "team") %>%
  group_by(team) %>%
  summarise(
    wins = sum(result == "winner"),
    losses = sum(result == "loser"),
    win_pct = wins / (wins + losses)
  )

# Join recruiting with performance
recruiting_performance <- team_recruiting %>%
  filter(year >= 2019, year <= 2022) %>%  # the four classes signed 1-4 years before 2023
  group_by(committed_to) %>%
  summarise(
    avg_recruiting_rating = mean(avg_rating, na.rm = TRUE),
    avg_blue_chip = mean(blue_chip_pct, na.rm = TRUE)
  ) %>%
  inner_join(team_records, by = c("committed_to" = "team"))

# Calculate correlation
cor_recruiting_wins <- cor(
  recruiting_performance$avg_recruiting_rating,
  recruiting_performance$win_pct,
  use = "complete.obs"
)

print(paste("Correlation between recruiting and winning:", round(cor_recruiting_wins, 3)))

# Visualize recruiting vs performance
ggplot(recruiting_performance, aes(x = avg_recruiting_rating, y = win_pct)) +
  geom_point(aes(size = avg_blue_chip, color = avg_blue_chip), alpha = 0.7) +
  geom_smooth(method = "lm", se = TRUE, color = "red", linetype = "dashed") +
  scale_color_gradient(low = "lightblue", high = "darkblue") +
  labs(
    title = "Recruiting Quality vs Team Performance",
    x = "Average Recruiting Rating (4-year average)",
    y = "Win Percentage (2023)",
    size = "Blue-Chip %",
    color = "Blue-Chip %"
  ) +
  theme_minimal()
```

## Python Implementation

```python
import os

import pandas as pd
import requests
import matplotlib.pyplot as plt


def get_recruiting_data(year):
    """
    Fetch recruiting data from CFB Data API
    """
    url = "https://api.collegefootballdata.com/recruiting/players"
    params = {'year': year}
    # The API requires a bearer token; this assumes the key is stored in
    # the CFBD_API_KEY environment variable.
    headers = {'Authorization': f"Bearer {os.environ['CFBD_API_KEY']}"}
    response = requests.get(url, params=params, headers=headers)
    response.raise_for_status()  # fail loudly instead of parsing an error payload
    return pd.DataFrame(response.json())


# Load recruiting data for multiple years
recruiting_dfs = []
for year in range(2019, 2024):
    df = get_recruiting_data(year)
    df['year'] = year
    recruiting_dfs.append(df)

recruiting = pd.concat(recruiting_dfs, ignore_index=True)

# Calculate team-level recruiting metrics
team_recruiting = (
    recruiting.groupby(['year', 'committedTo'])
    .agg(
        commits=('name', 'count'),                       # Number of commits
        avg_rating=('rating', 'mean'),                   # Average rating
        total_stars=('stars', 'sum'),                    # Total stars
        blue_chips=('stars', lambda s: (s >= 4).sum()),  # Blue-chip count
    )
    .reset_index()
    .rename(columns={'committedTo': 'team'})
)

team_recruiting['blue_chip_ratio'] = (
    team_recruiting['blue_chips'] / team_recruiting['commits'] * 100
)

# Identify elite recruiting programs (2019-2023 average)
elite_recruiters = team_recruiting.groupby('team').agg({
    'avg_rating': 'mean',
    'blue_chip_ratio': 'mean',
    'commits': 'sum'
}).sort_values('avg_rating', ascending=False).head(25)

print("Elite Recruiting Programs (2019-2023 Average):")
print(elite_recruiters)

# Blue-Chip Ratio analysis (teams with at least one class at 50% or higher)
blue_chip_50_plus = team_recruiting[
    team_recruiting['blue_chip_ratio'] >= 50
]['team'].unique()

print(f"\nTeams with 50%+ Blue-Chip Ratio: {len(blue_chip_50_plus)}")
print(blue_chip_50_plus)

# Visualize recruiting rankings distribution
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Star distribution
star_counts = recruiting['stars'].value_counts().sort_index()
axes[0].bar(star_counts.index, star_counts.values, color='steelblue')
axes[0].set_xlabel('Star Rating')
axes[0].set_ylabel('Number of Recruits')
axes[0].set_title('Distribution of Recruit Star Ratings (2019-2023)')

# Blue-chip ratio by team
top_teams = team_recruiting[team_recruiting['year'] == 2023].nlargest(
    15, 'blue_chip_ratio'
)
axes[1].barh(top_teams['team'], top_teams['blue_chip_ratio'], color='darkgreen')
axes[1].axvline(50, color='red', linestyle='--', label='50% Threshold')
axes[1].set_xlabel('Blue-Chip Ratio (%)')
axes[1].set_title('Top 15 Blue-Chip Ratios - 2023 Class')
axes[1].legend()

plt.tight_layout()
plt.show()
```

## Key Findings from Research

### Recruiting and Success

- Strong correlation (r > 0.65) between recruiting rankings and win percentage
- A blue-chip ratio above 50% is nearly required for playoff contention
- Teams with top-10 recruiting classes have a 3-4x higher chance of winning their conference

### Position Value

- 5-star QBs and edge rushers provide the highest ROI
- Offensive line recruits are often undervalued by star ratings
- Athletic profiles (speed, size) matter more than stars for some positions

### Regional Recruiting

- In-state talent retention correlates with program success
- Recruiting radius varies by program tier (national vs regional)
- Southern states produce a disproportionate share of elite talent

## Practical Applications

1. **Roster Management**: Identify recruiting needs by position
2. **Transfer Portal Strategy**: Fill gaps through the portal vs developing high school recruits
3. **Player Development**: Track star rating vs on-field production
4. **Competitive Analysis**: Assess the talent gap vs conference opponents

## Advanced Metrics

### Composite Rankings

- Aggregate multiple recruiting services
- Weight by historical accuracy
- Account for class size differences

### Expected Production

- Project stats based on recruiting rating + team context
- Compare actual vs expected to measure development
- Identify over/under-performers

### Recruiting Momentum

- Track commitment timing and flip rates
- Early commits vs late surge patterns
- Impact of coaching changes on recruiting

## Resources

- [247Sports Composite Rankings](https://247sports.com/Season/2024-Football/CompositeTeamRankings/)
- [On3 Industry Ranking](https://www.on3.com/recruiting/rankings/)
- [Blue-Chip Ratio Research](https://www.espn.com/college-football/story/_/id/17379747)
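
## Example: Composite Rating Sketch

The composite idea described above (aggregate several services, normalize their scales, weight by historical accuracy) can be sketched in a few lines of pandas. This is a minimal illustration, not the method any ranking service actually uses: the recruits, the service names (`service_x`, `service_y`, `service_z`), the rating values, and the accuracy weights are all made up, and z-scoring is just one reasonable way to put different scales on a common footing.

```python
import pandas as pd

# Hypothetical ratings for three recruits from three services,
# each reported on its own scale (all values are invented).
ratings = pd.DataFrame(
    {
        "recruit": ["A", "B", "C"],
        "service_x": [98.0, 91.0, 85.0],  # 0-100 style scale
        "service_y": [6.1, 5.8, 5.5],     # roughly 5.0-6.1 style scale
        "service_z": [0.99, 0.93, 0.86],  # 0-1 style scale
    }
).set_index("recruit")

# Hypothetical weights standing in for each service's historical accuracy.
weights = pd.Series({"service_x": 0.4, "service_y": 0.3, "service_z": 0.3})


def composite_rating(ratings, weights):
    """Weighted average of per-service z-scores, skipping missing ratings."""
    # Put every service on a common scale: mean 0, standard deviation 1.
    z = (ratings - ratings.mean()) / ratings.std(ddof=0)
    w = weights.reindex(z.columns)
    weighted = z.mul(w, axis=1)
    # Divide by the weight of the services that actually rated each recruit,
    # so a recruit missing from one service is not penalized.
    weight_present = z.notna().mul(w, axis=1).sum(axis=1)
    return weighted.sum(axis=1) / weight_present


composite = composite_rating(ratings, weights).sort_values(ascending=False)
print(composite)
```

Because the output is a z-score, it only ranks recruits relative to the pool passed in; comparing classes of different sizes would still need a separate adjustment, such as averaging only each team's top N signees.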

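## Example: Four-Class Blue-Chip Ratio

The Python section computes the blue-chip share one class at a time, while the metric itself is defined over a roster. One rough way to approximate a roster is to pool a team's last four signing classes, as sketched below. The team names and star values are invented, the helper `blue_chip_ratio` is hypothetical, and the sketch ignores transfers, attrition, and redshirts, so treat it as an approximation rather than the published methodology.

```python
import pandas as pd

# Hypothetical signee-level data: one row per signee (all values invented).
signees = pd.DataFrame(
    {
        "team": ["Alpha"] * 8 + ["Beta"] * 8,
        "year": [2020, 2020, 2021, 2021, 2022, 2022, 2023, 2023] * 2,
        "stars": [5, 4, 4, 3, 4, 4, 3, 5, 3, 3, 4, 3, 2, 3, 4, 3],
    }
)


def blue_chip_ratio(signees, season, window=4):
    """Percent of 4- and 5-star signees over the `window` classes ending in `season`."""
    first_class = season - window + 1
    # Keep only the classes assumed to make up the current roster.
    recent = signees[signees["year"].between(first_class, season)]
    is_blue_chip = recent["stars"] >= 4
    # The mean of a boolean column is the share of True values.
    ratio = is_blue_chip.groupby(recent["team"]).mean() * 100
    return ratio.sort_values(ascending=False)


bcr = blue_chip_ratio(signees, season=2023)
print(bcr)
```

With the real API data, the same function should work after renaming `committedTo` to `team`, assuming the `year` and `stars` columns are present as in the Python section.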