Further Reading: Introduction to Soccer Metrics

An annotated bibliography for deeper exploration of soccer metric design, validation, and communication.


Books

"The Numbers Game: Why Everything You Know About Football Is Wrong" by Chris Anderson and David Sally - A landmark popular-science book that brought analytical thinking to a broad soccer audience. - Covers why traditional statistics mislead, how randomness shapes results, and why managers and scouts need better metrics. - Why read it: The best entry point for understanding why soccer needs advanced metrics. Directly complements Sections 5.1 and 5.2.

"Soccernomics" (4th ed.) by Simon Kuper and Stefan Szymanski - Applies economic reasoning to soccer, covering transfer market inefficiencies, the value of home advantage, and the economics of penalty shootouts. - The chapter on network effects in passing is an early example of moving beyond box-score statistics. - Why read it: Provides the economic context for why metric design matters -- better metrics lead to better decisions and competitive advantage.

"The Expected Goals Philosophy" by James Tippett - A concise, practitioner-oriented introduction to xG and its implications for match analysis and betting. - Explains how xG overcomes the limitations of raw goal tallies discussed in Section 5.1. - Why read it: Bridges the gap between the conceptual framework of Chapter 5 and the xG modeling we tackle in later chapters.

"Football Hackers: The Science and Art of a Data Revolution" by Christoph Biermann - Chronicles the rise of analytics in European soccer, from early Opta data to modern tracking systems. - Features interviews with pioneering analysts and club insiders on how metrics are actually used in practice. - Why read it: Rich in real-world examples of metric adoption, communication challenges, and trust-building -- topics from Section 5.6.

"Soccermatics: Mathematical Adventures in the Beautiful Game" by David Sumpter - A mathematician's perspective on soccer, covering expected goals, network analysis, and spatial models. - Balances mathematical rigour with accessibility, making it suitable for readers at all levels. - Why read it: Provides the mathematical intuition behind many of the metric design principles in Section 5.2.

"Net Gains: Inside the Beautiful Game's Analytics Revolution" by Ryan O'Hanlon - Explores how clubs use data to gain competitive advantages, from scouting to match preparation. - Discusses the human dynamics of presenting analytics to coaches and decision-makers. - Why read it: Excellent companion to Section 5.6 on communicating metrics to stakeholders.

"Mathletics: How Gamblers, Managers, and Sports Fans Use Mathematics in Baseball, Basketball, and Football" by Wayne L. Winston - While not soccer-specific, chapters on rate vs. counting statistics, regression to the mean, and metric validation transfer directly to soccer. - Why read it: Strengthens understanding of the statistical foundations discussed in Sections 5.3 and 5.5.


Academic Papers

Greenland, S., Senn, S. J., Rothman, K. J., Carlin, J. B., Poole, C., Goodman, S. N., & Altman, D. G. (2016). "Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations." European Journal of Epidemiology, 31(4), 337-350. - A widely cited paper on common statistical misinterpretations. While written for medical statistics, every caution applies to soccer metrics. - Relevance: Underpins the discussion of sample size, noise, and uncertainty in Sections 5.3 and 5.5.

Franks, A., Miller, A., Bornn, L., & Goldsberry, K. (2015). "Characterizing the spatial structure of defensive skill in professional basketball." Annals of Applied Statistics, 9(1), 94-121. - Develops a framework for evaluating defensive metrics using spatial data. The validation methodology (stability, discrimination, prediction) generalises to soccer. - Relevance: A blueprint for the three-pillar validation framework in Section 5.5.

Rathke, A. (2017). "An examination of expected goals and shot efficiency in soccer." Journal of Human Sport and Exercise, 12(2S), S514-S529. - One of the earliest academic studies of expected goals, demonstrating that xG outperforms raw shot statistics for predicting future results. - Relevance: Validates the predictive-validity approach discussed in Section 5.5.4.

Leitner, M. C., Daumann, F., Follert, F., & Richlan, F. (2023). "The cauldron has cooled down: a systematic review on home advantage in football during the COVID-19 pandemic." Management Review Quarterly, 73(3), 1075-1105. - Systematic review of home advantage research during behind-closed-doors matches, revealing how crowd effects influence venue adjustment factors. - Relevance: Directly informs the venue adjustment discussion in Section 5.4.5.

Pollard, R. & Reep, C. (1997). "Measuring the effectiveness of playing strategies at soccer." The Statistician, 46(4), 541-550. - A pioneering paper on measuring soccer performance using zone-based metrics and early spatial segmentation. - Relevance: Historical context for why the shift from counting to context-adjusted metrics was necessary.

Spearman, W. (2018). "Beyond Expected Goals." MIT Sloan Sports Analytics Conference. - Quantifies off-ball scoring opportunities by combining a pitch control model with the likelihood of the ball reaching, and being scored from, each location on the pitch. - Relevance: Demonstrates how metric design moves from simple counting (goals) to context-rich models (expected threat, pitch control).

Fernandez, J., Bornn, L., & Cervone, D. (2021). "A framework for the fine-grained evaluation of the instantaneous expected value of soccer possessions." Machine Learning, 110(6), 1389-1427. - Proposes a comprehensive framework for valuing possessions using spatial and temporal context. - Relevance: Advanced extension of the context-adjustment ideas in Section 5.4.

Pappalardo, L., Cintia, P., Rossi, A., Massucco, E., Ferragina, P., Pedreschi, D., & Giannotti, F. (2019). "A public data set of spatio-temporal match events in soccer competitions." Scientific Data, 6, 236. - Describes the Wyscout public dataset, including data quality considerations and event definitions. - Relevance: Understanding data quality is essential for interpreting metric reliability (Section 5.5).


Online Resources

Data and Tools

StatsBomb Open Data github.com/statsbomb/open-data - Free event-level data for select competitions, including shot locations, pass coordinates, and pressure events. - Essential for practising the metric computation techniques described in this chapter.

FBref (Football Reference) fbref.com - Comprehensive per-90 statistics, percentile rankings, and scouting reports. Advanced data was supplied by StatsBomb until 2022, when the site switched to Opta, so check the site for its current coverage. - The primary public platform for the kind of player comparison discussed in Sections 5.3 and 5.6.

Understat understat.com - Expected goals data for Europe's top five leagues, with match-level and player-level xG breakdowns. - Useful for exploring the relationship between xG and actual goals (Section 5.1).

Transfermarkt transfermarkt.com - Player valuations, transfer histories, and basic performance statistics. - Provides the context data (market values, age, contract length) needed for prescriptive metrics (Section 5.2.3).

Blogs and Analysis

StatsBomb Articles statsbomb.com/articles/ - In-depth methodology articles on xG, xA, possession value, and other advanced metrics. Written by practitioners for practitioners.

The Athletic (Soccer Analytics Coverage) - Long-form analytics journalism that demonstrates effective metric communication to a general audience. Good examples of the storytelling principles from Section 5.6.

Between the Posts betweentheposts.net - Tactical match analysis supported by advanced metrics such as expected goals. Demonstrates how metrics can be woven into match narratives.

Soccerment soccerment.com - Data-driven analysis and player ratings. Illustrates how composite metrics are built and communicated.

Video and Educational

Friends of Tracking (YouTube) youtube.com/channel/UCUBFJYcag8j2rm_9HkrrA7w - Video lectures by David Sumpter, Laurie Shaw, and others on metric design, xG modeling, and spatial analysis. The series on "how to build an xG model" is an excellent practical complement to this chapter.

MIT Sloan Sports Analytics Conference (YouTube) - Recorded presentations from the leading sports analytics conference. Search for soccer/football talks on metric validation, expected goals, and recruitment analytics.

McKay Johns (YouTube) - Practical Python tutorials for soccer analytics, including per-90 calculations, radar charts, and metric visualizations.

Community

Soccer Analytics Handbook (GitHub) github.com/devinpleuler/analytics-handbook - Devin Pleuler's open-source collection of soccer analytics concepts and Python code. Covers many of the same topics as this chapter with interactive notebooks.

r/SoccerAnalytics (Reddit) - Community discussion of soccer metrics, tools, and methodologies. Good for staying current with new developments and debating metric design choices.


For Beginners

  1. Read The Numbers Game for motivation and context.
  2. Explore FBref to see per-90 statistics and percentile rankings in action.
  3. Watch the Friends of Tracking introductory videos.
  4. Practise computing per-90 rates and simple adjustments with StatsBomb Open Data.
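
As a starting point for step 4, a per-90 rate can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `per_90` helper, the 450-minute threshold, and the player totals are all hypothetical, and real totals would come from aggregating the StatsBomb event files yourself.

```python
from typing import Optional


def per_90(count: float, minutes: float, min_minutes: float = 450.0) -> Optional[float]:
    """Scale a raw count to a per-90-minute rate.

    Returns None when the player has fewer than `min_minutes` (a hypothetical
    threshold), because rates built on very few minutes are mostly noise.
    """
    if minutes <= 0 or minutes < min_minutes:
        return None
    return count * 90.0 / minutes


# Hypothetical season totals: (shots, minutes played).
totals = {
    "Player A": (30, 1800),
    "Player B": (2, 90),  # too few minutes to report a rate
}

rates = {name: per_90(shots, minutes) for name, (shots, minutes) in totals.items()}

for name, rate in rates.items():
    label = "insufficient minutes" if rate is None else f"{rate:.2f} shots per 90"
    print(f"{name}: {label}")
```

Returning None instead of a number for low-minute players is one possible design choice; it forces downstream code to handle small samples explicitly rather than rank them alongside regular starters.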

For Intermediate Practitioners

  1. Read Football Hackers for real-world examples of metric adoption.
  2. Study the Rathke (2017) and Spearman (2018) papers for validation methodology.
  3. Build a metric validation pipeline following Section 5.5.
  4. Write a scouting report following the communication principles in Section 5.6.
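
One possible first stage of the validation pipeline in step 3 is a stability check: correlate each player's value for a metric in one half of the season with the same player's value in the other half. The sketch below assumes that reading of "stability"; the `pearson_r` helper and the sample rates are hypothetical and are not taken from the chapter.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sample has no variance")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-90 values for the same five players in each half-season.
first_half = [0.8, 1.4, 2.1, 2.9, 3.5]
second_half = [1.0, 1.2, 2.4, 2.6, 3.3]

stability = pearson_r(first_half, second_half)
print(f"split-half correlation: {stability:.2f}")
```

A value near 1 suggests the metric captures something repeatable about the player; a value near 0 suggests it is dominated by noise at this sample size.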

For Advanced Analysts

  1. Read Fernandez, Bornn & Cervone (2021) for state-of-the-art valuation frameworks.
  2. Study the Franks et al. (2015) spatial skill paper for advanced validation techniques.
  3. Implement multi-factor context adjustments (Section 5.4.6) using regression models.
  4. Design and validate a novel metric, documenting stability, discrimination, and predictive power.
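
The regression idea in step 3 can be illustrated with a single context factor: fit a line of the metric against the factor, then treat each residual as the context-adjusted value. The sketch below is a simplification under stated assumptions. It uses one factor only (a multi-factor version would need a multiple-regression routine), and the `fit_line` and `context_adjusted` helpers, the possession figures, and the tackle rates are all hypothetical.

```python
from typing import List, Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("context factor has no variance")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def context_adjusted(xs: Sequence[float], ys: Sequence[float]) -> List[float]:
    """Residuals: the observed metric minus the value the context predicts."""
    slope, intercept = fit_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Hypothetical data: team possession share (%) and a player's tackles per 90.
possession = [40.0, 50.0, 60.0]
tackles_p90 = [22.0, 17.0, 15.0]

adjusted = context_adjusted(possession, tackles_p90)
for share, value in zip(possession, adjusted):
    print(f"possession {share:.0f}%: adjusted tackles {value:+.2f}")
```

A positive residual means the player records more of the action than the context alone would predict, which is the quantity the stability, discrimination, and predictive checks in step 4 would then be run on.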