Chapter 30 Exercises
Section 30.1: Emerging Technologies
Exercise 30.1 --- Computer Vision Evolution
Research the current state of pose estimation technology (e.g., OpenPose, MediaPipe, AlphaPose). Write a 500-word analysis comparing at least two systems on the following criteria: (a) accuracy in multi-person scenarios, (b) real-time processing capability, (c) applicability to sports contexts. Discuss what specific improvements would be needed for deployment in live soccer matches.
Exercise 30.2 --- LLM Query Design
Design a natural language query interface specification for soccer analytics. Define at least 15 example queries that an analyst might ask, categorized by complexity (simple lookups, aggregations, comparative analyses, tactical queries). For each query, specify what data sources would need to be accessed and what output format would be most useful.
Exercise 30.3 --- Wearable Sensor Fusion
A player wears a GPS vest (10 Hz position), a heart rate monitor (1 Hz), and a muscle oxygenation sensor (5 Hz). These sensors produce data at different frequencies and with different latencies. Write pseudocode for a data fusion pipeline that aligns these streams temporally and produces a unified player state estimate at 10 Hz. Discuss the challenges of sensor synchronization.
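As a possible starting point for the pseudocode, the sketch below aligns the two slower streams onto the 10 Hz GPS clock using a zero-order hold (carry the last known value forward). It rests on assumptions that are not part of the exercise: timestamps are integer milliseconds, each sensor's latency is a known constant, and the names `hold_last`, `fuse_streams`, and `latency_ms` are hypothetical. A fuller answer would also consider interpolation, clock drift, and dropped samples.

```python
from bisect import bisect_right


def hold_last(times, values, t):
    """Zero-order hold: most recent value measured at or before time t."""
    i = bisect_right(times, t) - 1
    return values[i] if i >= 0 else None


def fuse_streams(gps, hr, smo2, latency_ms=None):
    """Align heart rate and muscle oxygenation onto the GPS timeline.

    Each stream is a list of (arrival_time_ms, value) pairs sorted by time.
    latency_ms maps a sensor name to an assumed constant delay, which is
    subtracted from the arrival time to estimate the measurement time.
    """
    latency_ms = latency_ms or {}

    def corrected(stream, name):
        delay = latency_ms.get(name, 0)
        return [t - delay for t, _ in stream], [v for _, v in stream]

    gps_t, gps_v = corrected(gps, "gps")
    hr_t, hr_v = corrected(hr, "hr")
    ox_t, ox_v = corrected(smo2, "smo2")

    # One fused record per GPS sample, so the output rate matches the GPS rate.
    return [
        {
            "t_ms": t,
            "position": pos,
            "heart_rate": hold_last(hr_t, hr_v, t),
            "smo2": hold_last(ox_t, ox_v, t),
        }
        for t, pos in zip(gps_t, gps_v)
    ]


# Synthetic one-second example: GPS at 10 Hz, heart rate at 1 Hz, SmO2 at 5 Hz.
gps = [(t, (t / 100.0, 0.0)) for t in range(0, 1000, 100)]
hr = [(0, 120), (1000, 125)]
smo2 = [(t, 65.0 - t / 400.0) for t in range(0, 1000, 200)]
fused = fuse_streams(gps, hr, smo2)
```

The zero-order hold is the simplest choice and never uses future samples, which matters for a live system. Linear interpolation would give smoother estimates but requires waiting for the next sample of the slower sensor.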
Exercise 30.4 --- Edge Computing Latency Budget
A real-time tactical alert system must process tracking data from 22 players at 25 fps and deliver alerts within 3 seconds of the triggering event. Break down a latency budget across the following stages: data acquisition, preprocessing, model inference, alert generation, and delivery to the coaching staff. What are the bottleneck stages, and how would you optimize them?
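One way to organize the breakdown is a small table of per-stage allocations checked against the 3-second deadline. In the sketch below, only the 22 players, 25 fps, and 3-second figures come from the exercise. Every per-stage number is a placeholder assumption, not a measurement, and should be replaced with your own estimates.

```python
PLAYERS = 22
FPS = 25
DEADLINE_MS = 3000

# Placeholder allocations in milliseconds (assumptions for illustration only).
stage_budget_ms = {
    "data acquisition": 200,
    "preprocessing": 300,
    "model inference": 1200,
    "alert generation": 300,
    "delivery": 500,
}

frame_interval_ms = 1000 / FPS            # time between tracking frames
positions_per_second = PLAYERS * FPS      # player positions to process each second
total_ms = sum(stage_budget_ms.values())
headroom_ms = DEADLINE_MS - total_ms      # slack left for jitter and retries
largest_stage = max(stage_budget_ms, key=stage_budget_ms.get)

for stage, ms in stage_budget_ms.items():
    print(f"{stage:<18}{ms:>6} ms  {ms / DEADLINE_MS:6.1%}")
print(f"total {total_ms} ms, headroom {headroom_ms} ms, largest stage: {largest_stage}")
```

Keeping explicit headroom is a deliberate design choice: network delivery and inference times vary from frame to frame, so a budget that sums to exactly 3,000 ms would miss the deadline whenever any stage runs slow.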
Exercise 30.5 --- Synthetic Data Generation
Using the conceptual framework from Section 30.1.6, design a synthetic data generator for corner kick scenarios. Specify: (a) the state representation for each player, (b) the conditioning variables (attacking team formation, defensive strategy), (c) the output format, and (d) how you would validate that the synthetic data is realistic.
Exercise 30.6 --- AR Match Visualization
Design a mockup (hand-drawn or digital) of an augmented reality overlay for a live soccer broadcast. Include at least four analytical elements (e.g., pressing intensity heatmap, passing lane probabilities, defensive line height, xG surface). For each element, describe the underlying data source and update frequency.
Section 30.2: Data Privacy and Ethics
Exercise 30.7 --- GDPR Compliance Audit
A Premier League club collects the following data about its players: GPS tracking during training and matches, heart rate data during training, sleep data from wearable devices, social media activity monitoring, and dietary logs submitted by players. For each data type, assess: (a) the legal basis under GDPR, (b) whether explicit consent is required, (c) the appropriate retention period, and (d) data subject access request obligations.
Exercise 30.8 --- Ethics Assessment Framework
Using the EthicsAssessment class from Section 30.2.2, create assessments for the following five scenarios: (a) Using tracking data to evaluate players for contract renewal, (b) Sharing anonymized biometric data with a sports science research group, (c) Monitoring players' GPS locations during off-days, (d) Using facial recognition to assess player emotions during training, (e) Selling aggregated performance data to a betting company. For each, assign a risk level and justify your assessment.
Exercise 30.9 --- Algorithmic Bias Detection
Download a publicly available scouting dataset (or create a synthetic one). Build a simple classification model that predicts "recommend for signing" and test it for demographic parity across leagues of origin. If bias is detected, implement one mitigation strategy (re-sampling, threshold adjustment, or adversarial debiasing) and measure the impact on both fairness metrics and model accuracy.
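For the fairness test, demographic parity compares the rate of positive predictions across groups. The sketch below computes per-group selection rates and the largest gap between them using only the standard library. The function names and the toy labels are hypothetical, and the group labels stand in for leagues of origin.

```python
from collections import defaultdict


def selection_rates(predictions, groups):
    """Share of positive ("recommend for signing") predictions per group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for pred, group in zip(predictions, groups):
        counts[group][0] += int(pred)
        counts[group][1] += 1
    return {g: pos / total for g, (pos, total) in counts.items()}


def demographic_parity_difference(predictions, groups):
    """Gap between the highest and lowest group selection rates (0 = parity)."""
    rates = selection_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())


# Toy example: 1 = recommend, 0 = do not recommend.
preds = [1, 1, 0, 1, 0, 0, 0, 1]
leagues = ["A", "A", "A", "A", "B", "B", "B", "B"]
rates = selection_rates(preds, leagues)
gap = demographic_parity_difference(preds, leagues)
```

In this toy example league A is recommended at a rate of 0.75 and league B at 0.25, a gap of 0.5. Re-running the same two functions after applying a mitigation strategy gives a direct before-and-after comparison to report alongside model accuracy.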
Exercise 30.10 --- Player Data Charter
Draft a 1,000-word "Player Data Charter" for a professional soccer club. Include provisions for: data collection scope, consent mechanisms, player access rights, data portability on transfer, retention and deletion policies, third-party sharing restrictions, and breach notification procedures. Reference relevant sections of GDPR and any sports-specific regulations you can identify.
Exercise 30.11 --- Ethical Dilemma Analysis
Consider the following dilemma: Your injury prediction model identifies a player as having a 40% probability of a significant muscle injury in the next 4 weeks. The player is the team's best striker, and the club is in a title race. The coach wants to play the player in every match. Write a structured analysis covering: (a) the stakeholders and their interests, (b) the ethical principles in tension, (c) at least three possible courses of action with their consequences, and (d) your recommended approach with justification.
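When weighing the courses of action, it can help to translate the 4-week figure into shorter horizons. The sketch below does this under a strong simplifying assumption that is not part of the exercise: risk is constant and independent from week to week. Real injury risk depends on load, recovery, and match schedule, so treat the output as a rough aid to discussion rather than a model result.

```python
P_WINDOW = 0.40   # injury probability over the full window (from the exercise)
WEEKS = 4

# Assumption: constant, independent weekly risk p, so 1 - (1 - p)**WEEKS = P_WINDOW.
p_week = 1 - (1 - P_WINDOW) ** (1 / WEEKS)


def cumulative_risk(p_per_week, n_weeks):
    """Probability of at least one injury in n_weeks under the same assumption."""
    return 1 - (1 - p_per_week) ** n_weeks


for n in range(1, WEEKS + 1):
    print(f"weeks exposed: {n}  cumulative risk: {cumulative_risk(p_week, n):.3f}")
```

Under this assumption the weekly risk is roughly 12%, and resting the player for two of the four weeks would lower the cumulative risk to roughly 23%.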
Section 30.3: Democratization of Analytics
Exercise 30.12 --- Open Data Exploration
Using StatsBomb open data (or another freely available dataset), conduct a complete analytical project: (a) formulate a research question, (b) extract and clean the relevant data, (c) perform exploratory analysis, (d) build a simple model or metric, (e) create at least two visualizations, and (f) write up your findings in a 1,000-word report. Focus on clarity and reproducibility.
Exercise 30.13 --- Accessible Analytics Tool
Design (and optionally implement) a web-based tool that allows a non-technical youth coach to analyze their team's match performance. Specify the required data inputs (which must be enterable by hand), the analyses performed, the output visualizations, and the user interface. The tool should require no programming knowledge to use.
Exercise 30.14 --- Open Source Contribution Plan
Select an open-source soccer analytics library (mplsoccer, socceraction, kloppy, or another). Review its current issues and feature requests on GitHub. Write a contribution plan that includes: (a) a specific feature or bug fix you could implement, (b) the technical approach, (c) the estimated effort, and (d) how you would test your contribution.
Exercise 30.15 --- Democratization Impact Assessment
Write a 750-word essay analyzing the potential positive and negative consequences of making professional-grade analytics tools available to amateur clubs. Consider impacts on competitive balance, player development, coaching practices, and the culture of the game at grassroots level.
Exercise 30.16 --- Data Journalism Project
Choose a recent soccer tournament (World Cup, Euros, Copa América) and use publicly available data to create a "data story" suitable for publication in a general-interest sports outlet. Include at least three original visualizations, write 800--1,200 words of accessible prose, and ensure all claims are supported by data.
Section 30.4: The Human Element
Exercise 30.17 --- Communication Challenge
Take one of the analyses you have completed in a previous chapter and rewrite the findings as: (a) a 2-minute verbal briefing for a head coach before training, (b) a one-page executive summary for a sporting director, and (c) a Twitter/X thread for a public audience. Reflect on how the content, language, and emphasis changed across the three formats.
Exercise 30.18 --- Cognitive Bias Identification
Watch a full soccer match (or extended highlights) and keep a log of moments where you notice cognitive biases influencing your own judgment. For each instance, identify the bias type (from the table in Section 30.4.4), describe the situation, and explain how data could either confirm or correct the biased judgment.
Exercise 30.19 --- Interdisciplinary Team Design
You are tasked with building a five-person analytics department for a newly promoted club with a moderate budget. Define the five roles, their required skill profiles, their key responsibilities, and how they would collaborate. Justify your choices with reference to the interdisciplinary model described in Section 30.4.5.
Exercise 30.20 --- Context vs. Data Debate
Select a controversial transfer from recent seasons. Prepare two arguments: one based purely on publicly available statistical evidence, and one based on contextual factors (tactical fit, personality, league adjustment, etc.). Then write a synthesis that explains how the two perspectives complement each other and where they conflict.
Section 30.5: Predictions for the Next Decade
Exercise 30.21 --- Prediction Evaluation
Select five predictions from Section 30.5 and, for each one, identify: (a) the current evidence supporting it, (b) the current evidence against it, (c) the key uncertainties, and (d) a specific observable milestone that would indicate the prediction is on track. If you are reading this after 2028, evaluate which near-term predictions have materialized.
Exercise 30.22 --- Your Own Predictions
Write your own set of 10 predictions for soccer analytics over the next decade. For each prediction, provide: (a) a specific, falsifiable statement, (b) a timeline, (c) your confidence level (low/medium/high), and (d) the reasoning behind the prediction. Compare your predictions with those in Section 30.5 and discuss where you agree and disagree.
Exercise 30.23 --- Digital Twin Specification
Design the specification for a "digital twin" of a soccer player (as described in Section 30.5.3). Define: (a) the state variables that would be tracked, (b) the data sources feeding the model, (c) the update frequency for each variable, (d) at least three use cases, and (e) the technical challenges that would need to be overcome.
Exercise 30.24 --- Scenario Planning
Conduct a scenario planning exercise for soccer analytics in 2035. Define three scenarios (optimistic, baseline, pessimistic) based on different assumptions about technological progress, regulatory environment, and cultural adoption. For each scenario, describe the state of analytics at elite clubs, lower-league clubs, and national federations.
Section 30.6: Career and Reflection
Exercise 30.25 --- Personal Skills Audit
Using the build_skills_inventory function from Section 30.6.1 (or your own assessment framework), conduct an honest self-assessment of your current skills across all relevant dimensions. Identify your top three strengths and top three gaps. Create a 6-month learning plan to address the most critical gaps, with specific resources, milestones, and time allocations.
Exercise 30.26 --- Portfolio Project
Plan and execute a portfolio-quality analytical project. Requirements: (a) use publicly available data, (b) address a genuine analytical question, (c) include at least one original metric or model, (d) produce at least three publication-quality visualizations, (e) write a clear 1,500--2,500-word writeup, and (f) publish it on a personal blog, GitHub, or other public platform.
Exercise 30.27 --- Informational Interview
Conduct an informational interview (in person, by video call, or by email) with someone currently working in soccer analytics (at a club, data provider, media organization, or related field). Prepare at least 10 thoughtful questions. Write a 500-word summary of what you learned, including any surprises and how the conversation changed your perspective on the field.
Exercise 30.28 --- Capstone Reflection
Write a 1,500-word reflective essay on your journey through this textbook. Address: (a) the most important concepts you learned, (b) how your understanding of soccer analytics has changed, (c) the areas you found most challenging, (d) how you plan to apply what you have learned, and (e) what questions remain unanswered for you. This is an exercise in honest self-assessment, not in providing "correct" answers.