Exercises: Introduction to Soccer Analytics
These exercises progress from foundational concept checks to challenging applications. They reinforce the key ideas from Chapter 1 and prepare you for the technical content in later chapters.
Scoring Guide:
- ⭐ Foundational (5-10 min each)
- ⭐⭐ Intermediate (10-20 min each)
- ⭐⭐⭐ Challenging (20-40 min each)
- ⭐⭐⭐⭐ Advanced/Research (40+ min each)
Part A: Conceptual Understanding ⭐
Test your understanding of core concepts. No calculations required.
A.1. In your own words, define soccer analytics. What distinguishes it from simply collecting soccer statistics?
A.2. Explain why analytics cannot "replace" traditional expertise in soccer. Give two specific examples where human judgment remains essential despite having data.
A.3. The text states that soccer analytics developed more slowly than baseball analytics. List three factors that contributed to this delay and explain each briefly.
A.4. A colleague claims that "expected goals (xG) tells you everything you need to know about a team's quality." Explain why this statement is overly simplistic. What does xG capture, and what does it miss?
A.5. Describe the difference between event data and tracking data. For each type, give one example of an analysis that would be possible with that data type but not the other.
A.6. True or False (with explanation required): Analytics is primarily useful for wealthy clubs because smaller clubs cannot afford sophisticated data infrastructure. Justify your answer with at least one real-world example.
A.7. List the five phases of the analytics workflow described in this chapter. For each phase, write one sentence explaining its purpose.
A.8. Explain why the analytics workflow is described as iterative rather than linear. Give an example of how a finding in the "analysis" phase might cause you to revisit the "question" phase.
Part B: Stakeholder Analysis ⭐⭐
Apply your understanding of different stakeholders in soccer analytics.
B.1. Stakeholder Mapping
For each of the following decisions, identify the primary stakeholder(s) who would use analytics to inform the decision and explain what kind of analysis they would need:
a) Deciding which free-kick routine to use in an upcoming match
b) Determining whether to sell a 28-year-old midfielder valued at €30 million
c) Setting ticket prices for a crucial end-of-season match
d) Identifying potential loan signings from lower leagues
B.2. Communication Adaptation
You have conducted an analysis showing that your team's xG performance has been significantly worse at home than away this season. Write a brief (3-4 sentence) summary of this finding for each of the following audiences:
a) The head coach (tactical focus)
b) The sporting director (strategic focus)
c) A journalist (public-facing, accessible)
B.3. Needs Assessment
A newly promoted Premier League club is building its analytics department from scratch. They have budget for three full-time positions. Based on what you learned about analytics roles, which three positions would you recommend they hire first? Justify your choices.
B.4. Conflict Resolution
The head scout has identified a player he believes is perfect for the team based on video scouting and personal observation. However, the analytics team's models rate this player as average and flag several statistical concerns. How should the sporting director approach this disagreement? What information would help resolve it?
Part C: Historical Analysis ⭐⭐
Explore the development of soccer analytics.
C.1. Research Charles Reep's work in more detail. What were his main conclusions about effective soccer play? Why are these conclusions now considered methodologically flawed? What lessons can modern analysts learn from Reep's mistakes?
C.2. The "Moneyball" approach in baseball focused on identifying undervalued statistics like on-base percentage. What might be the soccer equivalent of on-base percentage—a skill or contribution that was historically undervalued but has become better understood through analytics? Explain your reasoning.
C.3. Create a timeline of key events in soccer analytics from 1950 to present. Include at least 10 events and briefly explain why each was significant. You may need to conduct additional research.
C.4. Compare and contrast the analytics approaches of two clubs: Liverpool FC and Brentford FC. How did their different resources shape their analytical strategies? What common principles did they share?
Part D: Critical Thinking ⭐⭐⭐
Engage critically with analytical concepts and claims.
D.1. The Uncertainty Problem
A model predicts that Team A has a 65% probability of beating Team B. Team B wins 2-0. Does this result mean the model was wrong? Write a short essay (200-300 words) discussing how we should evaluate probabilistic predictions.
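One common way to score probabilistic forecasts over many matches is the Brier score (the mean squared gap between the forecast probability and the 0/1 outcome; lower is better). The sketch below is only an illustration: the function name `brier_score` and the 20-match sample are hypothetical, not data from the chapter. It contrasts the single match in the exercise with a longer run in which a 65% forecast comes true 65% of the time.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and equal in length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


# The single match from the exercise: 65% forecast for Team A, Team A did not win.
single_match = brier_score([0.65], [0])  # about 0.4225

# Hypothetical sample (an assumption for illustration): 20 matches, each
# forecast at 65%, in which the favourite won 13 times (65%).
forecasts = [0.65] * 20
results = [1] * 13 + [0] * 7
many_matches = brier_score(forecasts, results)  # about 0.2275

# Baseline for comparison: always forecasting 50% on the same results.
coin_flip = brier_score([0.5] * 20, results)  # 0.25

print(single_match, many_matches, coin_flip)
```

A single result gives a noisy score; the forecast can only be judged against a baseline once many predictions have been collected.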
D.2. The Causation Challenge
An analyst notices that teams with higher pressing intensity tend to have better league positions. They conclude that pressing more will lead to better results. Critique this reasoning. What alternative explanations might exist for this correlation?
D.3. The Context Problem
Consider two strikers:
- Striker A: 15 goals from 25 shots in League A
- Striker B: 12 goals from 30 shots in League B
Without knowing anything else, can you determine which striker is better? What additional context would you need to make a fair comparison? List at least five factors that could affect this comparison.
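Before listing contextual factors, it can help to see how much uncertainty the raw numbers already carry. The sketch below computes each striker's conversion rate with a 95% Wilson score interval. It assumes every shot is an independent trial with the same scoring chance, which is a strong simplification, and the helper name `wilson_interval` is chosen here for illustration.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z ** 2 / trials
    centre = (p + z ** 2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z ** 2 / (4 * trials ** 2)) / denom
    return centre - half_width, centre + half_width


# Figures from the exercise: (goals, shots)
strikers = {"Striker A": (15, 25), "Striker B": (12, 30)}

intervals = {}
for name, (goals, shots) in strikers.items():
    low, high = wilson_interval(goals, shots)
    intervals[name] = (low, high)
    print(f"{name}: conversion {goals / shots:.0%}, 95% interval {low:.2f} to {high:.2f}")
```

With samples this small the two intervals overlap, so the raw conversion rates alone do not separate the strikers, even before shot quality, league strength, or minutes played are considered.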
D.4. The Communication Challenge
An analyst has built a complex machine learning model that predicts player transfer values. The model achieves excellent accuracy on historical data. However, when presenting to the sporting director, they fail to convey the key insights and the director remains skeptical. What might have gone wrong? How should the analyst have approached the presentation differently?
D.5. The Ethics Dilemma
You are an analyst at a club that has developed a model predicting which youth players are likely to succeed professionally. The model has 75% accuracy, meaning 25% of its predictions are wrong. The youth director wants to use this model to decide which 14-year-olds to release from the academy.
a) What are the potential benefits of using this model?
b) What are the potential harms?
c) How would you advise the youth director to use (or not use) this model?
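A headline accuracy figure hides how the errors are distributed. The sketch below works through one hypothetical reading of "75% accuracy": a cohort of 100 players, an assumed 10% base rate of eventual professional success, and a model that is right 75% of the time for both groups. None of these numbers come from the exercise apart from the 75%; they are assumptions chosen to make the arithmetic concrete.

```python
def confusion_counts(n, base_rate, sensitivity, specificity):
    """Expected confusion-matrix counts for a binary classifier.

    Returns (true_pos, false_neg, false_pos, true_neg) as expected values,
    so the counts may be fractional.
    """
    positives = n * base_rate
    negatives = n - positives
    true_pos = positives * sensitivity
    false_neg = positives - true_pos
    true_neg = negatives * specificity
    false_pos = negatives - true_neg
    return true_pos, false_neg, false_pos, true_neg


# Assumed for illustration: 100 players, 10% would succeed professionally,
# and the model is correct 75% of the time within each group.
tp, fn, fp, tn = confusion_counts(n=100, base_rate=0.10, sensitivity=0.75, specificity=0.75)

accuracy = (tp + tn) / 100   # overall share of correct predictions
precision = tp / (tp + fp)   # share of predicted successes who actually succeed
missed_talent = fn           # future professionals the model would release

print(f"accuracy={accuracy:.2f}, precision={precision:.2f}, missed={missed_talent:.1f}")
```

Under these assumptions the model is 75% accurate overall, yet only a quarter of the players it flags as future professionals would actually succeed, and a quarter of the genuine prospects would be released. Different base rates or error profiles would change the picture, which is part of what the exercise asks you to weigh.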
Part E: Application Exercises ⭐⭐⭐
Apply concepts to realistic scenarios.
E.1. Question Formulation
You are working for a mid-table team that is struggling defensively. The coach has complained that "we concede too many goals from crosses."
a) Explain why this is not yet a well-formed analytical question
b) Formulate three specific, answerable analytical questions that could help understand the problem
c) For each question, identify what data you would need to answer it
E.2. Workflow Application
Walk through the analytics workflow for the following scenario:
A Premier League club wants to identify potential signing targets for a back-up left-back position. They need someone who can challenge for the starting spot within two seasons, is under 26 years old, and fits within a €15 million budget.
For each phase of the workflow (Question, Data, Analysis, Insight, Action), describe what you would do and what outputs you would produce.
E.3. Value Proposition
Write a one-page business case for a skeptical club owner who is considering investing in an analytics department. Your case should:
a) Quantify potential benefits with specific examples
b) Acknowledge limitations honestly
c) Propose a reasonable initial investment
d) Describe how you would measure return on investment
Part F: Research and Extension ⭐⭐⭐⭐
Open-ended problems requiring additional research.
F.1. Literature Review
Find two academic papers about soccer analytics published in peer-reviewed journals. For each paper:
a) Summarize the research question and methodology
b) Describe the key findings
c) Critique the limitations
d) Explain how this research could be applied in a club setting
Recommended journals: Journal of Sports Analytics, Journal of Quantitative Analysis in Sports, International Journal of Computer Science in Sport.
F.2. Case Study Research
Research the analytics operations of one of the following clubs in depth:
- Manchester City
- Ajax
- Brighton & Hove Albion
- Red Bull Salzburg
- Club Brugge
Write a 500-word profile covering:
- History and development of their analytics department
- Key personnel and organizational structure
- Notable successes attributed to analytics
- Public philosophy and approach
F.3. Industry Analysis
The soccer data industry includes companies like Opta/Stats Perform, StatsBomb, Wyscout, InStat, and others.
a) Research and compare at least three data providers
b) Describe what types of data each provides
c) Identify their target customers (clubs, media, betting, etc.)
d) Discuss how the competitive landscape has evolved over the past decade
F.4. Future Scenarios
Write a speculative essay (500-700 words) on one of the following topics:
a) "Soccer analytics in 2035: What will be possible with advances in AI and data collection?"
b) "The democratization of analytics: Will advanced methods ever be available to amateur clubs?"
c) "Player power and analytics: How will players use data to advocate for themselves?"
Solutions
Selected solutions are available in:
- appendices/g-answers-to-selected-exercises.md (odd-numbered problems)
Full solutions are available to instructors upon request.
Reflection Questions
After completing these exercises, consider:
- Which concepts from this chapter were most challenging to understand?
- What questions do you still have about soccer analytics as a field?
- Which areas are you most excited to explore in later chapters?
- How might your background (technical, football, or otherwise) influence your perspective on analytics?
Write brief notes on these reflections to guide your continued learning.