Chapter 20 Quiz: Modeling College Sports
Test your understanding of college-specific modeling challenges, power ratings, recruiting data, and market inefficiencies.
Question 1. Why is the large number of teams in college football (133 FBS) more challenging than merely having more data to process?
Answer
The challenge is not computational but structural. With 133 teams each playing 12-13 games (mostly within their conference), most pairs of teams never play each other directly. This means the model must infer relative strength through transitive comparisons and conference-level connections. With only 3-4 non-conference games per team, the cross-conference connections are sparse, making it difficult to accurately compare teams from different conferences. In the NFL, every team plays roughly half the league through 17 games, creating dense connectivity. In college football, entire conference clusters may be connected to other clusters by only a handful of cross-conference results.
Question 2. What is the margin-based power rating approach, and why is it preferred over win-loss based ratings for prediction?
Answer
Margin-based power ratings assign each team a single number representing their expected scoring margin against an average opponent at a neutral site. The predicted margin when team i hosts team j is: r_i - r_j + home_advantage. This approach is preferred over win-loss ratings because margin of victory contains more information than the binary outcome. A 35-7 win tells us much more about relative team quality than a 17-14 win, and margin-based systems can distinguish between these outcomes. Win-loss based systems like Elo effectively treat all wins the same (though they can be modified with margin multipliers). Additionally, margin-based ratings can be directly compared to point spreads, which is essential for betting applications.
Question 3. Why do college power rating systems cap margins of victory, and what is a typical cap value?
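Worked sketch for the answer to Question 2: the predicted-margin formula r_i - r_j + home_advantage can be written as a small function. The team names and rating values here are hypothetical, and the 3.0-point default is the chapter's approximate league-wide figure rather than a fitted parameter.

```python
def predicted_margin(home_rating, away_rating, home_advantage=3.0, neutral_site=False):
    """Expected home-team scoring margin: r_i - r_j + home_advantage."""
    # At a neutral site the home-field term drops out entirely.
    hfa = 0.0 if neutral_site else home_advantage
    return home_rating - away_rating + hfa


# Hypothetical ratings: points relative to an average team on a neutral field.
ratings = {"Team A": 14.5, "Team B": 6.0}

margin = predicted_margin(ratings["Team A"], ratings["Team B"])
print(f"Team A hosting Team B: predicted margin {margin:+.1f}")

neutral = predicted_margin(ratings["Team A"], ratings["Team B"], neutral_site=True)
print(f"Same matchup at a neutral site: {neutral:+.1f}")
```

Because the output is already in points, it can be set directly against a posted point spread.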
Answer
Margin capping (typically at 24-28 points) prevents blowouts from distorting ratings. In college football, a top team might beat a weak opponent 56-0, but this 56-point margin does not provide 56 points worth of information about the winning team's quality -- it mostly reflects the losing team's inability. Without capping, teams that run up the score (or play weaker schedules with more blowout opportunities) would have artificially inflated ratings. A cap of 28 roughly corresponds to "the winning team was dominant in all four quarters," beyond which additional points provide diminishing information. The specific cap value should be optimized through cross-validation against the closing spread.
Question 4. Explain the concept of regression to conference mean and why it is particularly important early in the season.
Answer
Regression to conference mean is a Bayesian technique where, before sufficient data is available, a team's rating is pulled toward the average rating of their conference. Early in the season (Weeks 1-3), each team has only 1-3 data points, which is far too little to estimate true team quality. By using the conference mean as a prior, the model assumes that an SEC team is likely better than a Sun Belt team before seeing any games. As the season progresses, the prior loses influence and the data takes over. This prevents early-season ratings from being wildly unstable -- a Georgia rating should not collapse just because they have a close game in Week 1, because the strong SEC prior keeps it anchored until enough data accumulates.
Question 5. What is the blue-chip ratio, and what is its significance for predicting championship-level performance?
Answer
The blue-chip ratio is the percentage of a team's roster composed of 4-star and 5-star recruits (as rated by services like 247Sports). Research by Bud Elliott and others has demonstrated that no team has won a national championship in the BCS/CFP era with a blue-chip ratio below approximately 50%. This means that raw talent, as measured by recruiting rankings, sets a ceiling on team performance. Teams with low blue-chip ratios can overperform through coaching and scheme, but they face a hard ceiling at the championship level. For bettors, the blue-chip ratio is useful as a ceiling estimator and for identifying teams that are likely to regress if their on-field performance exceeds what their talent level supports.
Question 6. How does the transfer portal era change the modeling of coaching changes?
Answer
Before the transfer portal, a coaching change meant the new coach inherited the previous coach's roster for at least one year. Now, coaching changes trigger significant roster turnover through the portal: players loyal to the departing coach may leave, and the new coach can bring in immediate contributors. This means: (1) the Year-1 transition penalty may be reduced if the new coach acquires portal talent, (2) the roster composition can change dramatically within weeks, making preseason projections less reliable, (3) the model needs to track net portal talent flow (quality of incoming minus outgoing transfers), and (4) the historical coaching change impact data from pre-portal eras is less applicable to the current environment.
Question 7. Why is recruiting data a leading indicator rather than a contemporaneous one, and what is the optimal lag structure?
Answer
Recruiting data reflects talent signed as high school seniors who are typically 17-18 years old. These players do not contribute significantly as freshmen (except at a few skill positions), and their peak contribution comes as juniors and seniors. The optimal lag weighting is approximately: freshman class (0.10), sophomore class (0.25), junior class (0.30), senior class (0.25), 5th-year class (0.10). This means a top-5 recruiting class signed in February has maximum predictive value for team performance 2-3 years later, not immediately. For bettors, this creates an advantage: future win totals for a team that has signed three consecutive elite classes are likely to underestimate that team's strength because the full impact of those classes has not yet materialized.
Question 8. What is the typical home-field advantage in college football, and how does it compare to the NFL?
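Worked sketch for the answer to Question 7: the lag weights (0.10, 0.25, 0.30, 0.25, 0.10) can be applied to per-class recruiting scores to get one talent number for the current roster. The class scores and their 0-100 scale are made up for illustration, and the dictionary keys are hypothetical names, not a recruiting service's schema.

```python
# Lag weights from the answer to Question 7, keyed by each class's current year.
LAG_WEIGHTS = {
    "freshman": 0.10,
    "sophomore": 0.25,
    "junior": 0.30,
    "senior": 0.25,
    "fifth_year": 0.10,
}


def lag_weighted_talent(class_scores):
    """Weighted average of recruiting-class scores using the lag weights."""
    missing = set(LAG_WEIGHTS) - set(class_scores)
    if missing:
        raise ValueError(f"missing class scores: {sorted(missing)}")
    return sum(LAG_WEIGHTS[c] * class_scores[c] for c in LAG_WEIGHTS)


# Hypothetical program whose two most recent classes are its strongest.
rising_program = {
    "freshman": 95.0,
    "sophomore": 90.0,
    "junior": 80.0,
    "senior": 80.0,
    "fifth_year": 70.0,
}
print(f"Lag-weighted talent score: {lag_weighted_talent(rising_program):.1f}")
```

In this made-up case the elite freshman class carries only a 0.10 weight today, so most of its contribution shows up in later seasons as it moves into the 0.25 and 0.30 slots.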
Answer
Home-field advantage in college football is approximately 3.0 points, which is higher than the NFL's approximately 1.5-2.0 points. Several factors contribute to the larger college advantage: (1) stadium atmospheres can be more extreme (100,000+ fans at places like Michigan, Ohio State, LSU), (2) younger players may be more affected by hostile environments, (3) travel distances can be substantial for some matchups, (4) altitude (Colorado, BYU) and climate differences matter, and (5) crowd noise makes communication more difficult for visiting offenses. However, home-field advantage varies significantly by venue -- some stadiums provide 4-5 points of advantage while smaller programs may only get 1-2 points. A sophisticated model should use venue-specific or at least team-specific home-field estimates.
Question 9. Describe two specific ways that conference realignment affects a college sports model.
Answer
(1) Conference prior disruption: When strong teams leave a conference (e.g., Texas and Oklahoma leaving the Big 12 for the SEC), the conference's historical average rating drops, but the remaining teams' individual ratings should not change. The model must update conference priors without contaminating individual team ratings. The departing conference becomes weaker, and the receiving conference becomes stronger, but only at the mean -- not necessarily in terms of variance. (2) Schedule structure changes: Conference membership determines most of a team's schedule. When a team moves to a new conference, its strength of schedule changes dramatically, which affects both its raw results and the opponent-adjusted ratings. A team that was dominant in a weaker conference may look average in a stronger one, and the model must distinguish between true quality change and schedule difficulty change.
Question 10. What is a "look-ahead line" in college football, and why can it occasionally offer value?
Answer
A look-ahead line is a point spread released approximately one week before the game, before the current week's results are known. Sportsbooks release these lines to attract early sharp action and gauge market sentiment. They can offer value because they do not incorporate: (1) injury information from the intervening week's games, (2) performance updates from the most recent results, (3) motivational factors that emerge from the current week (rivalry implications, elimination scenarios), and (4) weather forecasts. However, limits on look-ahead lines are typically very low (often $500-2000), so the practical value is limited for most bettors. The primary strategic use is comparing the look-ahead line to the eventual opening line to detect whether the market has moved in the direction your model predicts.
Question 11. Why are early-season college football markets less efficient than late-season markets?
Answer
Early-season markets have less information available: (1) no current-season performance data exists for setting accurate ratings, forcing reliance on preseason projections that may be stale; (2) roster changes from the offseason (transfers, injuries, development) are not yet reflected in results; (3) scheme changes under new coaches have not been observed; (4) non-conference opponents may be poorly rated, creating additional uncertainty; (5) the public has strong preseason narratives that may be wrong (overrating traditional powers, underrating improving programs). This combination of information scarcity and narrative-driven public betting creates wider edges for model-based bettors who have better preseason priors (from recruiting data, transfer portal tracking, and coaching analysis).
Question 12. How does the concept of "public bias" create market inefficiencies in college football?
Answer
Public bias in college football is the tendency of recreational bettors to overbet popular, nationally recognized programs (Alabama, Ohio State, USC) and underbet obscure teams (Iowa State, Memphis, Coastal Carolina). This occurs because: (1) casual bettors bet on teams they watch on TV, creating disproportionate action on nationally televised games; (2) brand recognition creates anchoring to historical reputation rather than current quality; (3) media narratives amplify recent success stories and ignore quiet improvement at lower-profile programs. The market effect is that popular teams' lines are slightly inflated (by 0.5-1.5 points), making their opponents systematically undervalued. This edge is small but persistent across many games per season.
Question 13. Explain the difference between a team's power rating and their strength of schedule, and why both matter for prediction.
Answer
A team's power rating measures their intrinsic quality -- how they would perform against an average opponent at a neutral site. Strength of schedule (SOS) measures the average quality of their opponents. Both matter because a 10-2 team that played a strong schedule is likely much better than a 10-2 team that played a weak schedule, but their raw records are identical. For prediction purposes, the power rating already accounts for SOS if computed correctly (through opponent adjustment), but bettors should track SOS separately because: (1) it identifies teams whose ratings are based on thin cross-conference evidence, (2) it helps assess rating uncertainty (a strong-SOS team's rating is more reliable), and (3) it reveals situations where the public may overvalue or undervalue a team based on record alone.
Question 14. What is the typical Year-1 impact of a coaching change in college football?
Answer
On average, teams that change coaches experience a decline of approximately 1.5-2.5 points in power rating in Year 1 (the transition year). However, this average masks enormous variance: some teams improve dramatically (especially if the previous coach was poor or the new coach is elite), while others decline by 5+ points (especially with a scheme change and roster attrition). The Year-1 penalty is driven by scheme unfamiliarity, roster fit issues, recruiting disruption, and general organizational instability. Key moderating factors include: whether the hire was internal (smaller penalty) or external (larger penalty), whether a scheme change occurred (additional 1-2 point penalty), and the quality differential between the outgoing and incoming coaches.
Question 15. How should a model handle FCS (Football Championship Subdivision) opponents in the schedule?
Answer
FCS opponents should be handled carefully because: (1) they have no FBS rating and their quality varies enormously (top FCS teams like North Dakota State could compete in many G5 conferences, while bottom FCS teams would lose to most high school all-star teams); (2) games against FCS teams are typically blowouts that provide minimal information about the FBS team's quality. The recommended approach is to: assign FCS teams a fixed rating based on their level (e.g., top FCS at -10, average FCS at -20, weak FCS at -30), cap the margin heavily for these games (perhaps at 14-21 points), and down-weight these games in the rating system (e.g., giving them 50% of the weight of an FBS game). Some modelers exclude FCS games entirely, which is defensible though it wastes some information.
Question 16. What makes bowl games different from regular-season games for modeling purposes?
Answer
Bowl games differ in several important ways: (1) Motivation asymmetry -- teams in prestigious bowls may be highly motivated while teams in lower-tier bowls may lack motivation, especially if they barely qualified or if key players are sitting out to protect their NFL draft status. (2) Preparation time -- teams have 3-4 weeks to prepare, which tends to benefit coaches who are better strategic planners, particularly underdogs. (3) No home-field advantage -- nearly all bowls are played at neutral sites, eliminating the typical 3-point HFA. (4) Roster attrition -- NFL-bound players and transfer portal entrants may not play. (5) Conference mismatches -- bowls often feature cross-conference matchups that provide valuable calibration data but also introduce matchup-specific factors (style of play, familiarity).
Question 17. How would you adapt the college football framework for college basketball?
Answer
Key adaptations: (1) More data per team (30+ games vs. 12), allowing faster convergence and less reliance on priors. (2) Larger team pool (363 D-I teams), requiring more efficient computation and stronger regularization. (3) Different pace adjustment -- basketball scoring varies enormously by tempo, so the model must account for possessions per game. (4) Smaller margins of victory relative to randomness -- a single possession can swing the outcome, making point spread prediction harder. (5) Less predictive value in recruiting -- basketball has fewer players and more single-player impact, so individual recruit assessment matters more than class-level metrics. (6) Tournament dynamics -- March Madness creates unique modeling needs for single-elimination prediction.
Question 18. What is the significance of non-conference results for cross-conference power rating calibration?
Answer
Non-conference results are the only direct evidence connecting teams from different conferences. Without them, the model cannot determine whether a 10-2 SEC team is better or worse than a 10-2 Big 12 team -- it can only rank teams within each conference. Cross-conference games serve as the "bridge" that allows the model to estimate the relative strength of conferences. However, these games are noisy (early season, mismatched opponents, motivation differences), so they must be handled carefully. Best practices include: weighting cross-conference games normally (not extra-heavily, as the noise would be amplified), using margin caps to prevent blowouts from distorting comparisons, and accounting for home/away/neutral site effects on the cross-conference margin.
Question 19. How does overnight line movement differ from regular line movement in college football?
Answer
Overnight lines (released Sunday evening for the following Saturday's games) represent the sportsbook's initial assessment with limited information. They differ from regular line movement because: (1) they are set before the full market has had time to process the previous week's results; (2) limits are typically low, meaning small sharp bets can move the line significantly; (3) injury updates from the previous day's games may not be fully incorporated; (4) public betting has not yet begun in volume. Regular line movement during the week reflects the full information set including injury reports, weather, public betting patterns, and sharp money. Overnight lines can offer value when a modeler's updated ratings differ significantly from the initial market assessment, but the low limits reduce practical utility.
Question 20. Explain the concept of "regression to conference mean" in mathematical terms.
Answer
Regression to conference mean is formalized as a Bayesian posterior estimate. Let mu_c be the prior mean rating for conference c, sigma_c^2 be the prior within-conference variance, and sigma_data^2 be the variance of individual game outcomes. After observing n games with an empirical rating r_hat, the posterior mean is: r_post = (sigma_data^2 * mu_c + n * sigma_c^2 * r_hat) / (sigma_data^2 + n * sigma_c^2). When n is small (early season), the weight on mu_c is large and the rating stays near the conference mean. As n grows, r_hat dominates and the conference prior becomes irrelevant. The key insight is that the regression strength is controlled by the ratio of observation noise to prior uncertainty: if the conference has tight talent distribution (small sigma_c), regression is stronger; if games are very noisy (large sigma_data), regression is also stronger.
Question 21. What role does the 247Sports Composite play in preseason modeling?
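Worked sketch for the answer to Question 20: the posterior-mean formula can be evaluated directly to see how the conference prior fades as games accumulate. The numeric inputs (conference mean 8, within-conference standard deviation 7, single-game standard deviation 14, empirical rating 20) are hypothetical values chosen for easy arithmetic, not estimates from data.

```python
def posterior_rating(mu_c, sigma_c, sigma_data, r_hat, n):
    """Posterior mean: (sd^2 * mu_c + n * sc^2 * r_hat) / (sd^2 + n * sc^2)."""
    prior_weight = sigma_data ** 2      # weight on the conference mean
    data_weight = n * sigma_c ** 2      # weight on the observed rating
    return (prior_weight * mu_c + data_weight * r_hat) / (prior_weight + data_weight)


# Hypothetical team: conference average is +8, early results suggest +20.
for n in (0, 1, 4, 12):
    r_post = posterior_rating(mu_c=8.0, sigma_c=7.0, sigma_data=14.0, r_hat=20.0, n=n)
    print(f"n = {n:2d} games -> posterior rating {r_post:.1f}")
```

With these inputs the rating starts at the conference mean with no games played and moves steadily toward the empirical rating by the end of a 12-game season.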
Answer
The 247Sports Composite aggregates recruiting evaluations from multiple services (247Sports, Rivals, ESPN, On3) to produce a consensus rating for each recruit and each recruiting class. For preseason modeling, it serves as the primary measure of incoming talent quality. The composite is used to: (1) compute lag-weighted talent scores that predict current team quality, (2) calculate blue-chip ratios for ceiling estimation, (3) assess recruiting trends (is a program on the rise or in decline?), (4) evaluate the talent gap between programs (which directly correlates with expected point spreads), and (5) provide a preseason prior before any games are played. Teams ranked in the top 10 of the composite tend to finish the season ranked in the top 25, making it one of the most predictive preseason features.
Question 22. Why might a team's Week 4 performance be misleading for predicting their Week 10 performance?
Answer
Several factors make early-season data unreliable: (1) Small sample size -- 3-4 games contain huge variance, and a single blowout or upset can dominate the statistics. (2) Schedule front-loading -- most non-conference games (often against weak opponents) are played early, inflating some teams' perceived quality. (3) Roster development -- young players improve significantly from September to November as they gain experience. (4) Scheme installation -- new coaches or coordinators may not have their full playbook installed until mid-season. (5) Injury accumulation -- the roster at Week 4 may be different from Week 10. (6) Regression effects -- teams that started hot or cold are likely to regress toward their true quality level. A good model accounts for these factors by heavily weighting priors in the early weeks.
Question 23. How would you estimate venue-specific home-field advantage in college football?
Answer
Rather than using a single home-field advantage number for all college games (approximately 3.0 points), a venue-specific model would estimate HFA by venue or team. The approach: (1) collect home/away margins for each team over 5-10 seasons, (2) compute each team's average home margin versus their average away margin, controlling for opponent quality, (3) shrink these estimates toward the league-average HFA to account for small samples (most teams play only 6-7 home games per year), (4) include venue-specific factors: stadium capacity, altitude, surface type, and noise level. Historical data suggests that stadiums like Tiger Stadium (LSU), Beaver Stadium (Penn State), and Autzen Stadium (Oregon) provide 4-5 points of HFA, while some smaller-venue teams get only 1-2 points.
Question 24. Describe how NIL (Name, Image, and Likeness) affects college sports modeling.
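Worked sketch for step (3) of the answer to Question 23: one simple way to shrink a raw venue estimate toward the league-average HFA is a sample-size-weighted average. The quiz does not specify a shrinkage method, so the formula and the prior strength of 40 "pseudo-games" are assumptions for illustration, and the raw estimate of 5.5 points over 35 home games is made up.

```python
LEAGUE_HFA = 3.0  # approximate league-wide home-field advantage, in points


def shrunk_hfa(raw_hfa, n_home_games, prior_games=40.0, league_hfa=LEAGUE_HFA):
    """Blend a venue's raw HFA with the league average, weighted by sample size.

    prior_games is a hypothetical tuning knob: the number of home games at
    which the venue's own data and the league average get equal weight.
    """
    total = n_home_games + prior_games
    return (n_home_games * raw_hfa + prior_games * league_hfa) / total


# Hypothetical venue: opponent-adjusted raw HFA of 5.5 points over 35 home games.
estimate = shrunk_hfa(raw_hfa=5.5, n_home_games=35)
print(f"Shrunk venue HFA: {estimate:.2f} points")
```

A larger prior_games pulls every venue closer to the league figure. In practice the value would be chosen by checking out-of-sample prediction error rather than fixed in advance.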
Answer
NIL introduces a financial dimension to college football that affects modeling through several channels: (1) Recruiting -- programs with larger NIL collectives can attract better recruits, creating a new dimension of recruiting advantage beyond coaching and facilities. (2) Retention -- NIL can prevent star players from entering the transfer portal or the NFL draft, improving roster stability. (3) Transfer portal dynamics -- NIL packages are now a primary driver of portal decisions, allowing wealthy programs to acquire immediate impact transfers. (4) Competitive balance shifts -- NIL has widened the gap between wealthy and less wealthy programs, potentially reducing parity. For modelers, NIL spending data (where available) should be treated as a leading indicator similar to recruiting rankings, though the data is less standardized and less transparent than traditional recruiting metrics.
Question 25. What is the complete workflow for a college football bettor from preseason through Week 12?