Learning Objectives

  • Define sports analytics and distinguish it from traditional statistics
  • Trace the historical development of football analytics from basic statistics to modern methods
  • Identify specific ways college football programs use analytics for competitive advantage
  • Describe the five-stage analytics workflow from question to communication
  • Recognize ethical considerations in sports analytics work

Chapter 1: Introduction to College Football Analytics

"Football is a game of inches, and inches are measured in data." — Anonymous NFL Executive

Chapter Overview

On a crisp November evening in 2013, something unusual was happening in high school football. Kevin Kelley, the head coach of Pulaski Academy in Arkansas, had built a dynasty by never punting the ball, a strategy considered insane by traditional football minds. His teams won state championships by following a simple but radical data-driven insight: keeping possession of the football was more valuable than the field position gained by punting.

Kelley's approach wasn't based on hunches or tradition. It was based on analysis. He had studied expected points—the average number of points a team could expect to score from any field position—and concluded that the math favored aggression. His teams went for it on fourth down from anywhere on the field. They always attempted onside kicks after scoring. And they won. A lot.

Kelley's story illustrates a fundamental shift happening in football at all levels. Decisions once made by gut instinct are increasingly informed by data. Strategies passed down through generations of coaches are being questioned and sometimes overturned. The sport is being transformed not by new equipment or rule changes, but by a new way of thinking: analytics.

This chapter introduces you to the world of college football analytics. You will learn what analytics means in this context, how it evolved from basic statistics to sophisticated modeling, and how programs across the country use data to gain competitive advantages. By the end, you will understand the analytical workflow that guides effective analysis and the ethical responsibilities that come with working in this field.

In this chapter, you will learn to:

  • Distinguish between statistics and analytics in a sports context
  • Understand the key developments that shaped modern football analytics
  • Identify real applications of analytics in college football programs
  • Apply the five-stage analytics workflow to football questions
  • Consider ethical implications of analytics work


1.1 What Is Sports Analytics?

The word "analytics" appears everywhere in modern sports. Teams employ "analytics departments." Media coverage references "advanced analytics." Fans debate whether to trust "analytics" over the eye test. But what does the term actually mean?

1.1.1 Defining Analytics in Sports

Analytics refers to the systematic analysis of data to discover, interpret, and communicate meaningful patterns that inform decision-making. This definition contains several important elements:

Systematic analysis: Analytics isn't casual observation. It involves structured methods—statistical techniques, computational tools, and rigorous frameworks—applied consistently across many observations.

Data: Analytics requires information, typically numerical but sometimes textual or visual. In football, this includes play-by-play records, player statistics, video, and increasingly, tracking data that captures player movements at high frequency.

Meaningful patterns: The goal isn't to collect data for its own sake but to find patterns that reveal something useful—which strategies work, which players contribute most, which tendencies can be exploited.

Decision-making: Analytics exists to inform choices. In college football, these include roster decisions (who to recruit, who to start), strategic decisions (what plays to call, when to go for fourth down), and resource decisions (where to invest limited time and money).

💡 Intuition: Think of analytics as a translation layer between raw numbers and football decisions. Data alone tells you that a quarterback completed 65% of his passes. Analytics tells you whether that rate is good given his opponents, situation, and supporting cast—and what it might predict about future performance.

1.1.2 The Evolution from Statistics to Analytics

Statistics and analytics are related but distinct concepts. Understanding their relationship clarifies what makes modern analytics different from what came before.

Traditional football statistics answer questions like:

  • How many yards did the running back gain?
  • What was the quarterback's completion percentage?
  • How many touchdowns did the team score?

These are descriptive statistics—they describe what happened. They've existed as long as football has been played. Box scores in newspapers a century ago recorded rushing yards and passing completions.

Analytics extends beyond description to answer questions like:

  • Was 150 rushing yards a good performance given the opponent's defense?
  • How much did the quarterback's completion percentage depend on easy throws versus difficult ones?
  • Which touchdowns changed the game's outcome and which came in garbage time?

This shift involves several advances:

Context-awareness: Analytics accounts for situation. A 5-yard gain on 3rd-and-3 succeeds; the same gain on 3rd-and-10 fails. Traditional statistics treat them identically; analytics distinguishes them.

Expectation models: Analytics asks not just what happened but what should have happened. Expected points models estimate how valuable a field position is. Completion probability models estimate how difficult a throw was. Performance is measured against these expectations.

Prediction: Analytics looks forward, not just backward. How will this player perform next season? Who will win this game? What's the probability of converting this fourth down? Prediction requires methods that go beyond simple counting.

Attribution: When a team succeeds, analytics asks why. Was it the quarterback's accuracy, the receivers' ability to get open, or the offensive line's pass protection? Attribution isolates contributions that traditional statistics mix together.
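
The context-awareness point above can be made concrete in a few lines of Python. This is a minimal sketch rather than a standard metric: the function name `reaches_line_to_gain` and the play dictionaries are hypothetical, and it covers only the third-down example from the text.

```python
def reaches_line_to_gain(distance_to_go: int, yards_gained: int) -> bool:
    """Return True when the gain is enough for a first down."""
    return yards_gained >= distance_to_go

# Two plays with identical yardage but different situations.
plays = [
    {"down": 3, "distance": 3, "yards": 5},
    {"down": 3, "distance": 10, "yards": 5},
]

# A traditional box score sees only the yardage.
raw_yards = [p["yards"] for p in plays]

# A context-aware view asks whether each play did its job.
outcomes = [reaches_line_to_gain(p["distance"], p["yards"]) for p in plays]
```

The yardage column is identical for both plays, while the outcome column separates the conversion from the failure.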

1.1.3 Analytics vs. Traditional Scouting

The rise of analytics has created tension with traditional evaluation methods. Understanding this tension—and how to navigate it—is important for any aspiring analyst.

Traditional scouting relies on expert observation. Experienced scouts watch games and practices, assess player attributes through the eye test, and make judgments based on years of accumulated wisdom. This approach has produced countless successful evaluations.

Traditional scouting excels at:

  • Evaluating traits that are difficult to quantify (leadership, competitiveness, football IQ)
  • Assessing potential that hasn't yet shown up in statistics (a freshman who "has all the tools")
  • Recognizing context that numbers might miss (a player performing well despite poor surrounding talent)
  • Building relationships with players, coaches, and others who provide qualitative information

Analytics complements, and sometimes challenges, traditional methods. It:

  • Provides objective baselines that check subjective impressions
  • Processes information at scale (evaluating hundreds of players efficiently)
  • Reveals patterns invisible to the naked eye (subtle tendencies, small edges)
  • Reduces cognitive biases that affect human judgment

📊 Real-World Application: Most successful programs integrate both approaches. Georgia's football program, which won consecutive national championships in the 2021 and 2022 seasons, employs both experienced scouts and a sophisticated analytics staff. They use data to support, not replace, the human judgment of coaches who have watched thousands of games.

The most effective analysts understand that data is a tool, not a replacement for football knowledge. A number without context is meaningless. "EPA of 0.15" means nothing to someone who doesn't understand football situations. The analyst's job is to bridge the worlds of data and football.


1.2 The History of Football Analytics

Modern football analytics didn't emerge from nothing. It grew from decades of statistical thinking in sports, accelerated by technological change and inspired by success in other domains. Understanding this history provides context for where we are and where we're headed.

1.2.1 Early Statistical Analysis in Football

Statistical analysis in football dates back further than many realize.

The 1960s and 1970s saw pioneering work in quantifying football performance. Virgil Carter, an NFL quarterback pursuing graduate business studies at Northwestern, collaborated with Northwestern professor Robert Machol to study optimal decision-making in football. Their 1971 paper, "Operations Research on Football," laid groundwork for later expected points models.

Bud Goode, a former high school quarterback, pioneered efficiency metrics in the 1970s. His work computing passer ratings and other metrics was decades ahead of its time, though it found limited adoption.

The 1980s and 1990s brought more systematic data collection. STATS Inc. began providing detailed play-by-play data. Football Outsiders, founded in 2003, would later use such data to develop influential metrics.

During this period, analysis remained largely the province of journalists and academics. Teams relied primarily on traditional coaching methods and scouting. The data existed, but organizational cultures weren't ready to use it systematically.

1.2.2 The Moneyball Effect on Football

The 2003 publication of Moneyball changed everything—not just for baseball but for all sports.

Michael Lewis's book told the story of how the Oakland Athletics, with one of baseball's smallest payrolls, competed successfully by using statistical analysis to find undervalued players. Billy Beane and his team didn't necessarily discover new statistics; they applied existing ones more systematically than their competitors.

Moneyball's impact on football was indirect but profound:

Changed conversations: Suddenly, owners and executives asked whether their sport had similar inefficiencies to exploit. Analytics became a respectable topic in front offices.

Created career paths: Young people interested in sports saw a new route in—through data rather than playing or coaching. Graduate programs in sports analytics emerged.

Attracted talent: Analytically-minded people who might have worked on Wall Street or in tech saw opportunities in sports.

Generated skepticism: The book also created backlash. "Analytics guys" became targets for those who felt traditional football wisdom was under attack.

📝 Note: The Moneyball approach was always about finding market inefficiencies, not about replacing scouts with spreadsheets. As football analytics matured, this nuance became clearer. The goal isn't to replace football knowledge with numbers—it's to gain edges where traditional approaches fall short.

1.2.3 The Rise of Expected Points and Win Probability

The most important analytical advances in football involved expected points and win probability models.

Expected Points (EP) assigns point values to game situations. Being on the opponent's 10-yard line is worth more than being on your own 20-yard line, but how much more? Expected points models, built by analyzing thousands of historical drives, answer this question with a data-driven estimate.

The concept isn't new (Carter and Machol explored it in the 1970s), but modern computing power enabled more sophisticated implementations. Keith Goldner, Brian Burke (Advanced NFL Stats, later Advanced Football Analytics), and later the team behind nflfastR created public expected points models that became industry standards.
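
To see the basic idea behind an expected points table, the sketch below averages the next score by field-position bucket. Everything in it is an assumption for illustration: the observations are invented, the 10-yard buckets are arbitrary, and the public models named above account for much more than field position.

```python
from collections import defaultdict
from statistics import mean

# Invented observations: (yards to the opponent's goal line, points of the
# next score from the offense's perspective; negative = opponent scored next).
observations = [
    (10, 7), (10, 3), (10, 7), (10, 7),    # opponent's 10-yard line
    (80, 0), (80, 7), (80, -3), (80, 0),   # own 20-yard line
]

def bucket(yards_to_goal: int) -> int:
    """Map a field position to the lower edge of its 10-yard bucket."""
    return (yards_to_goal // 10) * 10

by_bucket = defaultdict(list)
for yards_to_goal, next_score in observations:
    by_bucket[bucket(yards_to_goal)].append(next_score)

# Expected points per bucket = average next score observed from there.
ep_table = {b: mean(scores) for b, scores in by_bucket.items()}
```

With these made-up numbers the bucket near the goal line averages far more points than the bucket deep in a team's own territory, which is the ordering the text describes.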

Expected Points Added (EPA) follows naturally. If a play moves the offense from a situation worth 2 expected points to one worth 4 expected points, the play added 2 EPA. This provides a common currency for comparing plays of different types—a 15-yard run can be compared directly to a 20-yard pass based on their EPA values.
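
The EPA arithmetic above is a single subtraction, sketched here under simplifying assumptions. The function name is hypothetical, the EP values for the run and the pass are invented, and the sketch ignores scoring plays and changes of possession, which need additional sign handling.

```python
def expected_points_added(ep_before: float, ep_after: float) -> float:
    """EPA for a play on which the same team keeps the ball."""
    return ep_after - ep_before

# The example from the text: a situation worth 2 EP becomes one worth 4 EP.
text_example = expected_points_added(2.0, 4.0)

# Invented EP values for two different play types.
run_15_yards = expected_points_added(ep_before=1.0, ep_after=2.5)
pass_20_yards = expected_points_added(ep_before=0.5, ep_after=1.25)

# EPA puts both plays on one scale; here the shorter run added more value.
more_valuable = "run" if run_15_yards > pass_20_yards else "pass"
```

The comparison illustrates why yardage alone can mislead: with these assumed situations, the 15-yard run is worth more than the 20-yard pass.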

Win Probability (WP) models estimate the chance of winning from any game state. Down by 7 with 5 minutes left, first-and-goal from the 3? What's your win probability? These models, trained on historical game data, provide answers.

Win probability enables two important analyses:

  1. Decision evaluation: Did going for it on 4th down increase or decrease win probability relative to punting?

  2. Play value: Win Probability Added (WPA) measures how much each play changed the outcome probability. A touchdown in a blowout adds little WPA; a touchdown to take the lead late adds a lot.
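
Win Probability Added can be sketched the same way, as the difference between consecutive win probability estimates. The sequence below is invented for illustration and does not come from any real model or game.

```python
# Invented win probabilities for one team: before the first play and after
# each of four plays.
wp_states = [0.5, 0.625, 0.5, 0.875, 1.0]

# WPA for each play = win probability after the play minus before it.
wpa_per_play = [after - before for before, after in zip(wp_states, wp_states[1:])]

# Index of the play with the largest swing in either direction.
biggest_swing = max(range(len(wpa_per_play)), key=lambda i: abs(wpa_per_play[i]))

# Per-play WPA values sum to the overall change in win probability.
total_wpa = sum(wpa_per_play)
```

Because the per-play values add up to the total change in win probability, WPA gives a way to divide credit for a result among individual plays.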

These frameworks—EPA, WP, WPA—now form the foundation of serious football analysis. Parts II and IV of this textbook will teach you to calculate and apply them.


1.3 Analytics in Modern College Football

Theory is interesting; application is what matters. How do college football programs actually use analytics today?

1.3.1 How FBS Programs Use Analytics

Analytics adoption in college football varies widely. Some programs have multi-person analytics departments integrated into all aspects of operations. Others have a graduate assistant occasionally pulling reports. The trend, however, is clearly toward more systematic use of data.

Common applications include:

Opponent Analysis: Before each game, analytics staff provide reports on opponent tendencies. When do they blitz? What routes do they favor on third down? Where are their defensive weaknesses? This information shapes game plans.

Self-Evaluation: Programs use data to evaluate their own performance honestly. Are we actually good at running the ball, or have we benefited from weak opponents? What situations expose our weaknesses? Honest self-assessment prevents complacency.

Practice Focus: Limited practice time requires prioritization. Analytics helps identify which situations to emphasize. If data shows you struggle on third-and-medium, practice includes more third-and-medium scenarios.

Player Development: Tracking player metrics over time shows development trajectories. Is the young quarterback improving his accuracy under pressure? Are strength gains translating to on-field performance? Data provides objective measures of progress.

1.3.2 In-Game Decision Making

Perhaps the most visible application of analytics is in real-time game decisions.

Fourth-Down Decisions: The analytics revolution began here. Traditional wisdom said to punt on fourth down in most situations. Data showed that coaches punt too often—that going for it on fourth down succeeds frequently enough to justify the attempt, especially near midfield.

The 2023 and 2024 seasons saw more fourth-down attempts than ever in college football history. Coaches who once would have punted on 4th-and-2 from their own 35 now regularly go for it. This shift is analytics in action.
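
One common way to frame the fourth-down choice is to compare the expected points of each option. The sketch below is a simplified illustration, not any program's actual model: the function names are hypothetical, all inputs are invented, and field goal attempts are left out. The conversion probability is an input, so it can be set to a team-specific estimate instead of a league average.

```python
def go_for_it_value(p_convert: float, ep_if_convert: float, opp_ep_if_fail: float) -> float:
    """Expected points of attempting the conversion.

    A failed attempt hands the opponent a situation worth opp_ep_if_fail,
    which counts against us.
    """
    return p_convert * ep_if_convert - (1 - p_convert) * opp_ep_if_fail

def punt_value(opp_ep_after_punt: float) -> float:
    """Expected points of punting: the opponent's EP after the punt, negated."""
    return -opp_ep_after_punt

def break_even_rate(ep_if_convert: float, opp_ep_if_fail: float, opp_ep_after_punt: float) -> float:
    """Conversion probability at which going for it and punting are equal."""
    return (opp_ep_if_fail - opp_ep_after_punt) / (ep_if_convert + opp_ep_if_fail)

# Invented inputs for a fourth-and-short near midfield.
go = go_for_it_value(p_convert=0.5, ep_if_convert=2.5, opp_ep_if_fail=2.0)
punt = punt_value(opp_ep_after_punt=0.5)
threshold = break_even_rate(2.5, 2.0, 0.5)
decision = "go for it" if go > punt else "punt"
```

With these assumed values the attempt is worth more than the punt, and the break-even conversion rate is about one in three. Different inputs can reverse the answer, which is why the estimates feeding the comparison matter more than the arithmetic.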

Two-Point Conversions: Similar analysis applies to the choice between kicking an extra point and going for two. Analytics shows situations where two-point attempts are clearly optimal—notably late in games when you need to score a specific number of points.

Timeout Usage: When to call timeouts, particularly late in games, significantly affects win probability. Analytics quantifies these effects, helping coaches make better choices under pressure.

⚠️ Common Pitfall: Analytics tells you what the optimal decision is on average. Individual circumstances—your kicker's ability, your opponent's fourth-down defense, your team's recent success—may justify deviation from the average-optimal choice. Good analysts account for team-specific factors; bad analysts apply generic models blindly.

1.3.3 Recruiting and Player Development

College football recruiting is a massive undertaking. Programs evaluate thousands of prospects to fill a signing class of roughly 25 players per year. Analytics helps manage this complexity.

Recruiting Applications:

Prospect Prioritization: Which prospects are most likely to develop into impact players? While recruiting rankings from services like 247Sports and Rivals provide starting points, programs can build proprietary models that weight different attributes differently based on their needs.

Fit Assessment: Does a quarterback's skill set match our offensive scheme? Data on prospect performance in high school or junior college can be analyzed for scheme compatibility.

Evaluation Accuracy: Programs track whether their evaluations proved accurate. Did players they ranked highly actually develop as expected? This feedback loop improves future evaluation.

Transfer Portal Analysis: The transfer portal has transformed college football recruiting. Programs must now evaluate players from other colleges, not just high school. Analytics helps identify undervalued transfers—players whose production might have been suppressed by circumstances rather than ability.

1.3.4 Game Planning and Opponent Analysis

Every week, coaching staffs prepare game plans for their next opponent. Analytics enhances this process.

Tendency Analysis: Data reveals opponent tendencies that film review alone might miss. Patterns in play calling by down and distance, formation frequencies, and personnel groupings all inform game planning.

Personnel Matchup Analysis: Which defenders struggle against certain route combinations? Which offensive linemen have trouble with speed rushes? Matchup data helps identify where to attack and where to protect.

Predictive Modeling: What plays is the opponent likely to call in a given situation? Machine learning models trained on historical play-calling can suggest probabilities, helping prepare defensive responses.

Weather and Situational Adjustments: Environmental factors affect strategy. Wind direction affects passing; field conditions affect running game viability. Analytics can quantify these effects for specific situations.


1.4 The Analytics Workflow

Effective analytics follows a structured process. Understanding this workflow helps you approach problems systematically rather than diving into data without direction.

1.4.1 Question Formulation

Every analytics project begins with a question. This sounds obvious, but it's where many analyses go wrong.

Good analytical questions are:

Specific: "Is our offense good?" is too vague. "How does our red zone success rate compare to conference opponents?" is specific enough to answer.

Answerable with available data: Some questions, no matter how interesting, can't be answered with existing data. Before investing time, verify that relevant data exists.

Connected to decisions: Analytics should inform action. Ask "so what?"—if you answer this question, what would change? If nothing would change, the question may not be worth answering.

Appropriate in scope: Questions can be too broad (requiring a dissertation to answer) or too narrow (trivially answered). Good questions sit in between—substantial enough to matter but focused enough to address.

💡 Intuition: A useful exercise is to state the question and your best guess at the answer before analyzing. This forces clarity about what you're actually asking. It also helps you recognize when the data surprises you versus confirms what you expected.

1.4.2 Data Collection

With a clear question, you need data to answer it.

Data sources for college football analytics include:

  • Play-by-play data: Every play from every game, including down, distance, play type, yards gained, and outcome
  • Box score statistics: Traditional stats aggregated at the game and season level
  • Recruiting data: Ratings, rankings, and attributes of high school prospects
  • Tracking data: Player positions and movements throughout plays (limited public availability)
  • Pro Football Focus (PFF) grades: Subjective, analyst-assigned evaluations of every player on every play (subscription required)

Data collection considerations:

Availability: Is the data public, purchasable, or restricted? College football benefits from public data sources like the College Football Data API.

Quality: All data contains errors. Understanding the data's limitations prevents false confidence in results.

Relevance: Does the data actually address your question? Play-by-play data can't tell you about off-field factors.

Chapter 2 explores data sources in detail.

1.4.3 Data Processing

Raw data rarely arrives analysis-ready. Processing transforms raw data into useful form.

Common processing steps:

Cleaning: Handling missing values, correcting errors, standardizing formats. A team might be called "Ohio State," "OSU," or "THE Ohio State University" in different sources—cleaning ensures consistency.

Transformation: Creating new variables from existing ones. Down-and-distance combinations, game situation variables, opponent-adjusted metrics—these don't exist in raw data but must be computed.

Aggregation: Play-level data might need to be summarized to the game, season, or player level. Aggregation decisions affect what patterns you can detect.

Integration: Combining data from multiple sources. Merging play-by-play data with weather data or recruiting rankings requires matching on common keys.
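
The four processing steps above can be sketched end to end on a toy dataset. This is an illustration using only the Python standard library: the alias table, the play records, and the conference lookup are all hypothetical, and a real project would likely use a dataframe library and a much fuller mapping.

```python
from collections import defaultdict

# Hypothetical alias table. Note that "OSU" is ambiguous in practice
# (Oklahoma State and Oregon State also use it), so real cleaning needs context.
TEAM_ALIASES = {
    "osu": "Ohio State",
    "ohio state": "Ohio State",
    "the ohio state university": "Ohio State",
}

def clean_team(name: str) -> str:
    """Cleaning: map known aliases to one canonical team name."""
    key = name.strip().lower()
    return TEAM_ALIASES.get(key, name.strip())

raw_plays = [
    {"offense": "OSU", "down": 3, "distance": 3, "yards": 5},
    {"offense": "THE Ohio State University", "down": 3, "distance": 10, "yards": 5},
    {"offense": "Ohio State", "down": 1, "distance": 10, "yards": 12},
]

# Cleaning and transformation: standardize names, derive a first-down flag.
plays = []
for p in raw_plays:
    row = dict(p)
    row["offense"] = clean_team(p["offense"])
    row["first_down"] = p["yards"] >= p["distance"]
    plays.append(row)

# Aggregation: summarize play-level rows to the team level.
totals = defaultdict(lambda: {"plays": 0, "first_downs": 0})
for row in plays:
    team_total = totals[row["offense"]]
    team_total["plays"] += 1
    team_total["first_downs"] += row["first_down"]

# Integration: join a second source on the cleaned team name.
conferences = {"Ohio State": "Big Ten"}
summary = {
    team: {**team_total, "conference": conferences.get(team)}
    for team, team_total in totals.items()
}
```

Without the cleaning step, the three spellings would be aggregated as three different teams and the join on team name would miss two of them.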

Chapter 5 covers data processing comprehensively.

1.4.4 Analysis

With clean, processed data, you can finally analyze.

Analysis methods span a spectrum:

Descriptive analysis: Summarizing what happened. Means, percentages, rankings—the foundation of any analysis.

Comparative analysis: How does X compare to Y? Conference comparisons, year-over-year trends, before/after comparisons.

Inferential analysis: Drawing conclusions about populations from samples. Is this pattern real or just noise? Statistical significance testing addresses this question.

Predictive analysis: Forecasting future outcomes. Who will win Saturday's game? Which prospects will become starters?

Prescriptive analysis: Recommending actions. Should we go for it on fourth down? This requires combining prediction with decision frameworks.
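
As one example of the inferential step, the sketch below runs a two-proportion z-test to ask whether a difference in conversion rates could plausibly be noise. The counts are invented, the function name is hypothetical, and the normal approximation it relies on is rough for samples this small, so treat it as an illustration of the idea, not a recommended procedure.

```python
import math

def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int) -> tuple[float, float]:
    """Return (z statistic, two-sided p-value) for H0: the two rates are equal."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return z, p_value

# Invented counts: our team converted 12 of 20 fourth downs,
# while a comparison group converted 50 of 100.
z, p_value = two_proportion_z_test(12, 20, 50, 100)
looks_like_noise = p_value > 0.05
```

A 60% rate against a 50% rate sounds like a real edge, but with only 20 attempts the p-value here is well above 0.05, so the difference is consistent with chance.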

1.4.5 Communication

The most brilliant analysis accomplishes nothing if nobody understands it. Communication is where analysis becomes impact.

Effective communication principles:

Know your audience: Coaches need different information than executives. Fans need different presentations than analysts. Tailor accordingly.

Lead with insights, not methods: Decision-makers care about what you found, not how you found it. Technical details can go in appendices.

Visualize effectively: A well-designed chart communicates faster than text. Part III of this book focuses on visualization.

Acknowledge uncertainty: Predictions aren't certainties. Honest communication includes confidence intervals and limitations.

Recommend actions: Don't just present findings—translate them into recommendations. "Based on this analysis, we should..."
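
Acknowledging uncertainty can be as simple as reporting an interval next to a rate. The sketch below computes a Wilson score interval for a proportion; the counts are invented, the function name is hypothetical, and the Wilson interval is one reasonable choice among several, not a method prescribed by this chapter.

```python
import math

def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    p_hat = successes / trials
    denom = 1 + z**2 / trials
    center = (p_hat + z**2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)) / denom
    return center - half_width, center + half_width

# Invented example: 12 conversions in 20 fourth-down attempts.
low, high = wilson_interval(12, 20)
message = f"Conversion rate: 60% (95% interval roughly {low:.0%} to {high:.0%})"
```

Reporting the interval alongside the headline figure tells a coach how much weight a 20-attempt sample can bear.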


1.5 Ethics and Responsibilities in Sports Analytics

Working with sports data carries responsibilities. This section addresses ethical considerations that every analyst should understand.

1.5.1 Data Privacy Considerations

Sports analytics involves data about real people—players, coaches, and others. This creates privacy obligations.

Player data: College athletes have privacy rights. Just because data is technically accessible doesn't mean using it is appropriate. Tracking data revealing player locations raises particular concerns.

Competitive data: Teams invest resources gathering proprietary data and analysis. Sharing such data without authorization harms the organization and violates trust.

Public vs. private data: Some data is clearly public (box scores, recruiting rankings). Other data exists in gray areas (practice performance, injury information). When in doubt, err on the side of protecting privacy.

1.5.2 Responsible Analysis and Communication

How you analyze and present findings matters ethically.

Cherry-picking: Selecting only evidence that supports a preferred conclusion while ignoring contradictory evidence is intellectually dishonest. Present findings fairly, including inconvenient results.

Overconfidence: Presenting uncertain conclusions as certain misleads decision-makers. Quantify uncertainty when possible; acknowledge it always.

Misuse potential: Some analyses could be used to harm people—targeting player weaknesses in ways that risk injury, for instance. Consider potential misuse before publishing or sharing.

Attribution: Give credit for ideas and methods you didn't create. The analytics community advances through sharing; acknowledging sources maintains that culture.

1.5.3 The Human Element

Behind every data point is a human being. This matters.

Player welfare: Analysis should ultimately serve to make the game better and safer. When analysis suggests working players harder or risking their health, the human cost deserves weight alongside competitive benefit.

Respect for the game: Football has meaning beyond data points. The traditions, relationships, and experiences of players and fans have value that analytics can't capture. Analysts who reduce everything to numbers miss what makes sports matter.

Humility: Models are simplifications of complex reality. Human judgment, developed over decades of experience, captures things models miss. The best analysts are humble about what data can't tell them.


1.6 Chapter Summary

This chapter introduced you to the field of college football analytics. You learned what analytics means, how it evolved, and how it's applied in modern programs.

Key Concepts

  1. Analytics is the systematic analysis of data to discover patterns that inform decision-making. It goes beyond simple statistics by adding context, building expectations, enabling prediction, and isolating attribution.

  2. Modern football analytics evolved from early statistical work through the Moneyball-inspired revolution to today's sophisticated methods. Expected points and win probability models form the foundation.

  3. College programs use analytics for opponent analysis, self-evaluation, in-game decisions, recruiting, and player development. Application varies but is growing across the sport.

  4. The analytics workflow consists of five stages: question formulation, data collection, data processing, analysis, and communication. Each stage matters for producing useful results.

  5. Ethical considerations include data privacy, responsible analysis, and respect for the human beings behind the data.

Key Terms

| Term | Definition |
| --- | --- |
| Analytics | Systematic analysis of data to find patterns that inform decisions |
| Expected Points (EP) | The average number of points expected from a given field position |
| Expected Points Added (EPA) | The change in expected points resulting from a play |
| Win Probability (WP) | The estimated chance of winning from a given game state |
| Play-by-play data | Detailed records of every play in a game |

Decision Framework

When approaching a football analytics problem:

├── Do I have a clear, specific question?
│   ├── No → Refine the question before proceeding
│   └── Yes → Continue
├── Is relevant data available?
│   ├── No → Identify what data would be needed
│   └── Yes → Continue
├── Have I considered ethical implications?
│   ├── No → Consider privacy and potential misuse
│   └── Yes → Continue
└── Will the analysis inform a decision?
    ├── No → Consider whether the analysis is worth doing
    └── Yes → Proceed through the workflow

What's Next

In Chapter 2: The Data Landscape of NCAA Football, you will dive into the world of college football data. You will learn about available data sources, how to access them, and their strengths and limitations. By the end, you will have hands-on experience obtaining data from the College Football Data API—the foundation for analyses throughout this book.

Before moving on, complete the exercises and quiz to solidify your understanding.


Chapter 1 Exercises → exercises.md

Chapter 1 Quiz → quiz.md

Case Study: The Fourth Down Revolution → case-study-01.md

Case Study: How Analytics Changed the 2019 LSU Offense → case-study-02.md


This chapter establishes the conceptual foundation for your analytics journey. The tools and techniques in subsequent chapters build on this understanding of what analytics is and why it matters.