Quiz: Introduction to Football Analytics
Test your understanding before moving to the next chapter. Target: 70% or higher to proceed. Time: ~35 minutes
Section 1: Multiple Choice (1 point each)
1. What is the primary distinction between a "statistic" and an "analytic" in football?
- A) Statistics are for offense, analytics are for defense
- B) Analytics provide context that enables comparison or decision-making
- C) Statistics are exact, analytics are estimates
- D) Analytics require computers, statistics can be calculated by hand
Answer
**B)** Analytics provide context that enables comparison or decision-making *Explanation:* A statistic is a raw number (e.g., "65% completion rate"). Analytics adds context (such as expected values, peer comparisons, or situational adjustments) that transforms data into actionable insight. Section 1.1.1 explains this distinction with the Mahomes example.

2. Which type of analytics question is "What is the optimal play call on third-and-five?"
- A) Descriptive
- B) Predictive
- C) Prescriptive
- D) Exploratory
Answer
**C)** Prescriptive *Explanation:* Prescriptive analytics recommends actions ("What should we do?"). Questions asking what is "optimal" or using "should" typically fall in this category. See Section 1.1.2 for the full framework.

3. According to the textbook, what is the "fundamental challenge" of football analytics?
- A) Teams don't share data
- B) Football is too complex to model
- C) Small samples and high variance make signal hard to extract from noise
- D) Coaches don't trust analysts
Answer
**C)** Small samples and high variance make signal hard to extract from noise *Explanation:* Football provides limited data (17 games, hundreds of plays) and outcomes are highly variable due to randomness. This signal-and-noise problem pervades all football analysis. See Section 1.1.4.

4. What was the significance of Michael Lewis's *Moneyball* (2003) for football analytics?
- A) It introduced Expected Points Added (EPA)
- B) It demonstrated that rigorous data analysis could identify inefficiencies in sports
- C) It was the first book about football analytics
- D) It led to NFL teams immediately hiring analytics staff
Answer
**B)** It demonstrated that rigorous data analysis could identify inefficiencies in sports *Explanation:* While *Moneyball* focused on baseball, it showed the broader sports world that analytical approaches could find undervalued assets and improve decisions. This inspired cross-sport adoption. See Section 1.2.2.

5. What was the primary contribution of Football Outsiders when it launched in 2003?
- A) Creating player tracking technology
- B) Bringing rigorous analysis and metrics like DVOA to public football discourse
- C) Developing the first fourth-down model
- D) Publishing the first academic paper on football analytics
Answer
**B)** Bringing rigorous analysis and metrics like DVOA to public football discourse *Explanation:* Football Outsiders, founded by Aaron Schatz, introduced DVOA (Defense-adjusted Value Over Average) and demonstrated that rigorous analysis could be communicated to mainstream audiences. See Section 1.2.2.

6. What major development in 2018 democratized access to advanced football data?
- A) The NFL made all tracking data public
- B) The NFL Big Data Bowl began releasing tracking data samples to the public
- C) Play-by-play data became available for the first time
- D) Teams started publishing their analytics research
Answer
**B)** The NFL Big Data Bowl began releasing tracking data samples to the public *Explanation:* The Big Data Bowl competition released player tracking data (previously available only to teams) to the public for the first time, enabling broader research. See Section 1.2.4.

7. Which of the following is analytics BEST suited to evaluate?
- A) A player's leadership in the locker room
- B) A quarterback's EPA per dropback over a season
- C) How a player will adapt to a new coaching scheme
- D) Whether a player is injury-prone
Answer
**B)** A quarterback's EPA per dropback over a season *Explanation:* Performance measurement is a core strength of analytics. Leadership (A), scheme fit (C), and injury prediction (D) are areas where analytics struggles due to limited data or high randomness. See Section 1.3.4.

8. In the football analytics workflow, which step should come FIRST?
- A) Gather data
- B) Define the question
- C) Analyze data
- D) Clean data
Answer
**B)** Define the question *Explanation:* The most common mistake in analytics is answering the wrong question. Before touching data, you must clarify what decision the analysis informs, who the audience is, and what would change your mind. See Section 1.4.1.

9. According to the text, what percentage of project time is often consumed by data cleaning?
- A) 10-20%
- B) 30-40%
- C) 50-80%
- D) 90-100%
Answer
**C)** 50-80% *Explanation:* Data cleaning (handling missing values, fixing inconsistencies, creating derived variables) is unglamorous but essential. Rushing it guarantees errors downstream. See Section 1.4.4.

10. What is the typical organizational relationship between analytics departments and coaching staff in successful NFL organizations?
- A) Analytics tells coaches what to do
- B) Coaches ignore analytics
- C) Analytics provides information and recommendations; coaches make final decisions
- D) Analytics and coaching operate completely independently
Answer
**C)** Analytics provides information and recommendations; coaches make final decisions *Explanation:* The best analytics doesn't dictate; it illuminates the consequences of choices so decision-makers can act with fuller information. See Section 1.1 opening and Section 1.5.2.

Section 2: True/False (1 point each)
For each statement, indicate whether it is True or False.
11. Analytics is meant to replace traditional scouting in NFL player evaluation.
Answer
**False** *Explanation:* Analytics complements rather than replaces scouting. The best organizations integrate both approaches, using each to check and enhance the other. Traditional scouting captures intangibles that data cannot. See Section 1.1.3.

12. If a quarterback has a higher completion percentage than another quarterback over the same number of attempts, we can confidently say they are the better passer.
Answer
**False** *Explanation:* Raw completion percentage lacks context. It doesn't account for the difficulty of throws attempted, opponent quality, or random variation. Analytics adds context to enable valid comparison. See Section 1.1.1 and 1.1.4.

13. The year-to-year correlation of a metric indicates how much we should regress observed values toward the mean when making projections.
Answer
**True** *Explanation:* Metrics with low year-to-year correlations (high noise) require heavy regression to the mean. Metrics with high correlations (stable) can be projected closer to their observed values. See Section 1.3.1.

14. Every NFL team now employs at least one analytics professional.
Answer
**True** *Explanation:* As of the modern era (2018-present), all 32 NFL teams employ analytics staff, ranging from 2-3 people to departments of 15+. See Section 1.2.4.

15. Predictive analytics is generally easier than descriptive analytics because you can verify predictions.
Answer
**False** *Explanation:* Prediction is harder than description because it requires identifying patterns that generalize beyond observed data. Random variation, changing conditions, and opponent adjustments all complicate prediction. See Section 1.1.2.

16. The NFL Big Data Bowl has made complete tracking data for all NFL games publicly available.
Answer
**False** *Explanation:* The Big Data Bowl releases tracking data in limited form (specific plays or games for competition purposes), not complete data for all games. Full tracking data remains proprietary to teams. See Section 1.2.4.

Section 3: Fill in the Blank (1 point each)
17. The three types of analytics, in order of increasing difficulty, are: descriptive, __, and prescriptive.
Answer
**predictive** *Explanation:* Descriptive (what happened) → Predictive (what will happen) → Prescriptive (what should we do). Each builds on the previous. See Section 1.1.2.

18. Expected Points Added (EPA) has become the __ of modern football analysis, serving as a common metric for evaluating plays and players.
Answer
**lingua franca** (or common language / standard metric) *Explanation:* EPA is described as the "lingua franca" of football analysis because it provides a common framework for evaluating plays that is widely understood and adopted. See Section 1.2.4.

19. The challenge of extracting reliable patterns from football data despite small samples and high variance is called the __ problem.
Answer
**signal and noise** (or signal-and-noise) *Explanation:* Football's limited data and high variance make it difficult to distinguish genuine skill differences (signal) from random variation (noise). See Section 1.1.4.

20. In the analytics workflow, the step that often consumes 50-80% of project time is data __.
Answer
**cleaning** *Explanation:* Data cleaning (handling missing values, fixing inconsistencies, creating derived variables) is time-consuming but essential. See Section 1.4.4.

Section 4: Short Answer (2 points each)
Write 2-4 sentences for each answer.
21. Explain why a team might have both a traditional scouting department AND an analytics department. What does each contribute?
Sample Answer
Traditional scouting captures qualitative elements that data cannot measure: technique details, leadership presence, football intelligence, and character. Analytics provides scale (evaluating thousands of players systematically), consistency (reducing cognitive biases), and uncertainty quantification. Together, they can check each other's conclusions: if scouts love a player but analytics shows concerning patterns, that discrepancy prompts deeper investigation.
*Key points for full credit:*
- Each approach has distinct strengths
- They complement rather than replace each other
- Integration enables better decisions than either alone

22. A colleague says: "Our fourth-down model says we should have gone for it on that play, so the coach was wrong to punt." What's problematic about this reasoning?
Sample Answer
Expected value analysis tells us which decision maximizes average outcomes over many similar situations; it doesn't guarantee any single play will succeed. A decision can be analytically correct but still fail due to randomness. Moreover, models may not capture all relevant factors (game context, specific matchups, information the coach has). We should evaluate decisions by their process and expected value, not by their individual outcomes.
*Key points for full credit:*
- Expected value is about averages, not guarantees
- Correct decisions can still fail
- Models may not capture all relevant information

23. Why is defining the question considered the most important step in the analytics workflow?
Sample Answer
Without a clear question, you cannot know what data to gather, what analysis to perform, or how to interpret results. A vague question like "Is our quarterback good?" cannot be answered analytically, while a specific question like "What is the probability our QB ranks top-10 in EPA next year?" can be. Furthermore, questions should connect to decisions: analysis that doesn't inform action is wasted effort. Time spent clarifying the question prevents wasted work later.
*Key points for full credit:*
- Vague questions cannot be answered
- Questions should connect to decisions
- Clarity upfront prevents wasted work

24. Describe two ways the "signal and noise" problem affects how we should interpret quarterback statistics.
Sample Answer
First, small sample sizes (hundreds of passes) mean observed performance may not reflect true skill. A QB with a 68% completion rate may not actually be better than one with 64%; the difference could be noise. Second, we should regress statistics toward the mean when projecting future performance. A QB who led the league in passer rating likely benefited from some luck and probably won't repeat that exact performance. Both implications require humility about our conclusions.
*Key points for full credit:*
- Sample size affects reliability
- Regression to the mean needed for projection

Section 5: Matching (1 point each)
Match each analytics milestone with its approximate era:
| Milestone | Era Options |
|---|---|
| 25. Football Outsiders founded | A. Pre-2000 |
| 26. NFL Big Data Bowl launches | B. 2000-2010 |
| 27. All 32 teams employ analytics staff | C. 2010-2018 |
| 28. Bill Walsh scripts first 15 plays | D. 2018-Present |
Answers
- **25. B** (2000-2010): Football Outsiders launched in 2003
- **26. D** (2018-Present): Big Data Bowl launched in 2018
- **27. D** (2018-Present): Universal adoption occurred in the modern era
- **28. A** (Pre-2000): Bill Walsh coached the 49ers in the 1980s

*Reference: Section 1.2*

Scoring
| Section | Points | Your Score |
|---|---|---|
| Multiple Choice (1-10) | 10 | ___ |
| True/False (11-16) | 6 | ___ |
| Fill in Blank (17-20) | 4 | ___ |
| Short Answer (21-24) | 8 | ___ |
| Matching (25-28) | 4 | ___ |
| Total | 32 | ___ |
Passing Score: 23/32 (the lowest whole-number score that meets the 70% target; 22/32 is 68.75%)
Review Recommendations
- Score < 50%: Re-read entire chapter, focusing on Sections 1.1 and 1.2
- Score 50-69%: Review Sections 1.3 and 1.4, redo Part A exercises
- Score 70-85%: Good understanding! Review any missed topics before proceeding
- Score > 85%: Excellent! You're ready for Chapter 2
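If you want to check your own result, the scoring arithmetic can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the band edges are an assumption (exactly 70% is treated as passing, and exactly 85% falls in the 70-85% band).

```python
# Point values per section, copied from the scoring table.
SECTION_POINTS = {
    "Multiple Choice": 10,
    "True/False": 6,
    "Fill in Blank": 4,
    "Short Answer": 8,
    "Matching": 4,
}
TOTAL_POINTS = sum(SECTION_POINTS.values())
PASS_PERCENT = 70  # target stated at the top of the quiz


def passing_score(total=TOTAL_POINTS, percent=PASS_PERCENT):
    """Smallest whole-number score that is at least `percent` of `total`."""
    # Integer ceiling division, which avoids floating-point rounding.
    return (total * percent + 99) // 100


def recommendation(score, total=TOTAL_POINTS):
    """Map a raw score to a review recommendation (band edges are assumed)."""
    pct = 100 * score / total
    if pct < 50:
        return "Re-read entire chapter, focusing on Sections 1.1 and 1.2"
    if pct < 70:
        return "Review Sections 1.3 and 1.4, redo Part A exercises"
    if pct <= 85:
        return "Review any missed topics before proceeding"
    return "Ready for Chapter 2"


if __name__ == "__main__":
    print("Total points:", TOTAL_POINTS)
    print("Passing score:", passing_score())
    print("Score of 20:", recommendation(20))
```

The ceiling is needed because 70% of 32 is 22.4, which is not a score anyone can earn, so the threshold rounds up to 23.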
Key Concepts to Review If Needed
- Three types of analytics (Section 1.1.2)
- Signal and noise problem (Section 1.1.4)
- History and evolution (Section 1.2)
- Analytics workflow (Section 1.4)
- Organizational structure (Section 1.5)
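For readers reviewing the signal-and-noise problem, the two ideas tested in questions 13 and 24 can be illustrated with a rough Python sketch. It is not the textbook's method. It assumes each pass is an independent trial with a fixed completion probability (a deliberate simplification) and uses simple linear shrinkage toward the mean. The attempt counts, league mean, and correlation values are hypothetical numbers chosen for illustration.

```python
import math


def completion_rate_se(rate, attempts):
    """Binomial standard error of an observed completion rate.

    Assumes independent attempts with equal completion probability.
    """
    return math.sqrt(rate * (1 - rate) / attempts)


def difference_in_se_units(rate_a, attempts_a, rate_b, attempts_b):
    """Gap between two completion rates, in standard errors of the gap."""
    se_diff = math.sqrt(
        completion_rate_se(rate_a, attempts_a) ** 2
        + completion_rate_se(rate_b, attempts_b) ** 2
    )
    return (rate_a - rate_b) / se_diff


def regress_to_mean(observed, league_mean, yoy_correlation):
    """Linear shrinkage: projection = mean + r * (observed - mean)."""
    return league_mean + yoy_correlation * (observed - league_mean)


if __name__ == "__main__":
    # Hypothetical: two QBs at 68% and 64% on 500 attempts each.
    z = difference_in_se_units(0.68, 500, 0.64, 500)
    print(f"Gap in standard errors: {z:.2f}")

    # Hypothetical league mean of 65% and two made-up correlations.
    for r in (0.2, 0.5):
        projection = regress_to_mean(0.68, 0.65, r)
        print(f"r = {r}: projected completion rate {projection:.3f}")
```

Under these assumptions the four-point gap is well under two standard errors, which is why the sample answer to question 24 says it could be noise. The shrinkage function shows the point of question 13: the lower the year-to-year correlation, the closer the projection sits to the league mean.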