Chapter 24: Further Reading
Foundational Texts
Deep Learning Theory
- Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press. The canonical textbook on deep learning, covering feedforward networks, regularization, optimization, CNNs, RNNs, and generative models. Chapters 6--10 provide the mathematical foundations for everything discussed in this chapter. Freely available at https://www.deeplearningbook.org/.
- Bishop, C. M., & Bishop, H. (2024). Deep Learning: Foundations and Concepts. Springer. A rigorous, modern treatment of deep learning with stronger emphasis on probabilistic perspectives than Goodfellow et al. Particularly strong on variational inference and generative models.
- Zhang, A., Lipton, Z. C., Li, M., & Smola, A. J. (2023). Dive into Deep Learning. Cambridge University Press. An interactive textbook with executable code examples. Excellent for practitioners who learn by implementing. Available at https://d2l.ai/.
Graph Neural Networks
- Hamilton, W. L. (2020). Graph Representation Learning. Morgan & Claypool. The most accessible introduction to GNNs, covering spectral methods, message passing, and applications. Chapter 5, on the graph neural network model, directly underpins Section 24.3.
- Bronstein, M. M., Bruna, J., Cohen, T., & Veličković, P. (2021). "Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges." arXiv:2104.13478. A unifying perspective that frames CNNs, GNNs, and Transformers as instances of a common geometric framework. Essential reading for understanding why different architectures suit different data structures.
Reinforcement Learning
- Sutton, R. S., & Barto, A. G. (2018). Reinforcement Learning: An Introduction (2nd ed.). MIT Press. The definitive RL textbook. Chapters 3--6 cover the MDP framework, value functions, and temporal-difference learning that underpin Section 24.5. Freely available at http://incompleteideas.net/book/the-book-2nd.html.
Soccer Analytics Research
Neural Networks for Expected Goals
- Anzer, G., & Bauer, P. (2021). "A Goal Scoring Probability Model for Shots Based on Synchronized Positional and Event Data in Football (Soccer)." Frontiers in Sports and Active Living, 3, 624475. Demonstrates that incorporating tracking data (defender and goalkeeper positions) into xG models via neural networks significantly improves performance over event-data-only models.
- Fernandez, J., Bornn, L., & Cervone, D. (2019). "Decomposing the Immeasurable Sport: A deep learning expected possession value framework for soccer." MIT Sloan Sports Analytics Conference. Extends expected possession value to a continuous framework using deep learning, bridging xG and pitch control concepts.
Sequence Models for Soccer Events
- Decroos, T., Bransen, L., Van Haaren, J., & Davis, J. (2019). "Actions Speak Louder Than Goals: Valuing Player Actions in Soccer." Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. Introduces VAEP, which uses gradient-boosted trees on fixed-length event sequences. The conceptual framework directly motivates the LSTM extension discussed in Case Study 1.
- Simpson, I., Beal, R. J., Locke, D., & Norman, T. J. (2022). "Seq2Event: Learning the Language of Soccer using Transformer-based Match Event Prediction." Proceedings of the 28th ACM SIGKDD Conference. Applies Transformer architectures to soccer event sequence prediction, demonstrating the potential of self-attention for capturing long-range dependencies in match events.
- Pappalardo, L., et al. (2019). "A Public Data Set of Spatio-Temporal Match Events in Soccer Competitions." Scientific Data, 6, 236. Describes the Wyscout event dataset used in many sequence modeling studies, providing context for the data structures discussed in Section 24.2.
Graph Neural Networks for Soccer
- Bialkowski, A., Lucey, P., Carr, P., Yue, Y., Sridharan, S., & Matthews, I. (2014). "Identifying Team Style in Soccer Using Formations Learned from Spatiotemporal Tracking Data." IEEE International Conference on Data Mining Workshops. Early work on formation detection from tracking data, providing the baseline against which GNN approaches (Section 24.3) are evaluated.
- Yeh, R. A., Schwing, A. G., Huang, J., & Murphy, K. (2019). "Diverse Generation for Multi-Agent Sports Games." CVPR. Uses graph-based generative models to produce realistic multi-agent trajectories in basketball, with methods directly transferable to soccer tracking data generation.
- Sun, C., et al. (2020). "Predicting Soccer Passes in Context." ECML PKDD Workshop on Machine Learning and Data Mining for Sports Analytics. Applies graph neural networks to pass prediction, demonstrating how GNNs naturally capture the relational structure of passing opportunities.
Convolutional Networks for Spatial Analysis
- Fernandez, J., & Bornn, L. (2018). "Wide Open Spaces: A statistical technique for measuring space creation in professional soccer." MIT Sloan Sports Analytics Conference. While not a deep learning paper, introduces pitch control models that motivate the CNN spatial representations discussed in Section 24.4.
- Spearman, W. (2018). "Beyond Expected Goals." MIT Sloan Sports Analytics Conference. Presents physics-based pitch control models that can serve as both training targets and baselines for CNN-based approaches.
Reinforcement Learning in Sports
- Liu, G., & Schulte, O. (2018). "Deep Reinforcement Learning in Ice Hockey for Context-Aware Player Evaluation." arXiv:1805.11088. Applies deep RL to player evaluation in hockey, with methods directly transferable to soccer. Demonstrates how learned value functions can decompose team performance into individual contributions.
- Routley, K., & Schulte, O. (2015). "A Markov Game Model for Valuing Player Actions in Ice Hockey." UAI Workshop on Machine Learning for Sports Analytics. Foundational work on using MDPs for action valuation in team sports, providing the theoretical basis for soccer RL applications.
Generative Models for Sports
- Le, H. M., Yue, Y., Carr, P., & Lucey, P. (2017). "Coordinated Multi-Agent Imitation Learning." ICML. Imitation learning for coordinated multi-agent trajectory generation, evaluated on professional soccer tracking data, with direct implications for generating realistic soccer tracking data.
- Li, C., et al. (2023). "Generative AI for Synthetic Sports Data." NeurIPS Workshop on Machine Learning for Sports. A survey of generative approaches for sports data synthesis, covering VAEs, GANs, and diffusion models with sports-specific evaluation metrics.
Technical Implementation
Frameworks and Libraries
- PyTorch Documentation. https://pytorch.org/docs/. The primary deep learning framework for research and increasingly for production. Essential for implementing the architectures described in this chapter.
- PyTorch Geometric Documentation. https://pytorch-geometric.readthedocs.io/. The leading library for GNN implementation in PyTorch. Provides efficient implementations of GCN, GAT, and message-passing layers discussed in Section 24.3.
- Hugging Face Transformers Documentation. https://huggingface.co/docs/transformers/. While primarily focused on NLP, the Transformer implementations and training utilities are directly applicable to soccer event sequence modeling.
- NumPy Documentation. https://numpy.org/doc/. All code examples in this chapter use NumPy for pedagogical clarity, making this reference essential.
Model Interpretation
- Molnar, C. (2022). Interpretable Machine Learning: A Guide for Making Black Box Models Explainable (2nd ed.). Available at https://christophm.github.io/interpretable-ml-book/. Comprehensive coverage of SHAP, LIME, and other interpretability techniques referenced in Section 24.7.5.
- SHAP Documentation. https://shap.readthedocs.io/. The primary library for computing Shapley values for neural network predictions.
Online Resources and Communities
- Friends of Tracking (YouTube channel and GitHub). Open educational content on tracking data analysis, including deep learning applications to soccer data.
- StatsBomb Open Data. https://github.com/statsbomb/open-data. Free event data for prototyping sequence models and action valuation frameworks.
- Metrica Sports Sample Data. https://github.com/metrica-sports/sample-data. Open tracking data samples for developing and testing spatial deep learning models.
- Papers With Code --- Sports Analytics. https://paperswithcode.com/. Provides links between research papers and their code implementations, useful for reproducing results from the cited literature.
Recommended Reading Sequence
For readers new to deep learning in soccer analytics, we suggest:
- Start with Zhang et al. (Dive into Deep Learning) for hands-on implementation experience
- Read Goodfellow et al. Chapters 6--10 for theoretical depth
- Study Hamilton (Graph Representation Learning) for GNN foundations
- Work through the Decroos et al. (VAEP) paper to understand action valuation
- Explore the Fernandez & Bornn papers for spatial deep learning motivation
- Implement the code examples in this chapter's code/ directory
For practitioners already working with deep learning:
- Focus on the soccer-specific papers (items under "Soccer Analytics Research")
- Study the Bronstein et al. geometric deep learning survey for architectural insights
- Explore PyTorch Geometric for efficient GNN implementation
- Review the generative models literature for data augmentation strategies