Chapter 23: Further Reading
Foundational Papers
Object Detection and Tracking in Sports
- Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016). You Only Look Once: Unified, Real-Time Object Detection. CVPR 2016. The original YOLO paper that introduced single-pass object detection. While subsequent versions (YOLOv5 through YOLOv11) have superseded the original architecture, this paper remains essential for understanding the design philosophy behind real-time detection.
- Wojke, N., Bewley, A., & Paulus, D. (2017). Simple Online and Realtime Tracking with a Deep Association Metric. ICIP 2017. Introduces DeepSORT, combining Kalman filtering with deep appearance features for robust multi-object tracking. The de facto baseline for sports tracking applications.
- Cioppa, A., Giancola, S., Deliege, A., et al. (2022). SoccerNet-Tracking: Multiple Object Tracking Dataset and Benchmark in Soccer Videos. CVPR 2022 Workshops. The first large-scale benchmark specifically for soccer player tracking from broadcast video. Essential for evaluating and comparing tracking systems on standardized data.
- Scott, A., Uchida, I., Onishi, M., Kameda, Y., Fukui, K., & Fujii, K. (2022). SoccerTrack: A Dataset and Tracking Algorithm for Soccer with Fish-Eye and Drone Videos. CVPR 2022 Workshops. Extends soccer tracking to non-broadcast camera views, demonstrating the challenges of domain transfer across camera types.
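To make the Kalman filtering half of the DeepSORT entry above concrete, here is a minimal sketch of a constant-velocity Kalman filter for a single image coordinate. It is an illustration under stated assumptions, not the paper's implementation: DeepSORT itself filters an eight-dimensional bounding-box state and adds an appearance embedding for association, both of which are omitted here. The class name, noise values, and synthetic detections are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ConstantVelocityKalman1D:
    """Tracks one coordinate (e.g. a player's x position in pixels) and its velocity."""
    pos: float
    vel: float = 0.0
    # Covariance of the state [pos, vel]; initial values are illustrative.
    p00: float = 1.0
    p01: float = 0.0
    p10: float = 0.0
    p11: float = 100.0            # large: the initial velocity is unknown
    process_var: float = 0.01     # assumed process noise added per step
    measurement_var: float = 1.0  # assumed detector noise variance

    def predict(self, dt: float = 1.0) -> float:
        """Propagate one step: x' = F x, P' = F P F^T + Q, with F = [[1, dt], [0, 1]]."""
        self.pos += self.vel * dt
        p00 = self.p00 + dt * (self.p01 + self.p10) + dt * dt * self.p11
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        self.p00 = p00 + self.process_var
        self.p01, self.p10 = p01, p10
        self.p11 += self.process_var
        return self.pos

    def update(self, measured_pos: float) -> None:
        """Correct the prediction with a position measurement (H = [1, 0])."""
        residual = measured_pos - self.pos
        s = self.p00 + self.measurement_var      # innovation variance
        k0, k1 = self.p00 / s, self.p10 / s      # Kalman gain
        self.pos += k0 * residual
        self.vel += k1 * residual
        # P = (I - K H) P, using the covariance from before the correction.
        p00, p01, p10, p11 = self.p00, self.p01, self.p10, self.p11
        self.p00 = (1 - k0) * p00
        self.p01 = (1 - k0) * p01
        self.p10 = p10 - k1 * p00
        self.p11 = p11 - k1 * p01


# Synthetic detections: a player moving 2 pixels per frame from x = 10.
track = ConstantVelocityKalman1D(pos=10.0)
for frame in range(1, 31):
    track.predict()
    track.update(10.0 + 2.0 * frame)
next_pos = track.predict()  # where to look for this player in frame 31
```

In a full tracker, the predicted position is what gets compared against new detections when deciding which detection belongs to which track; running one such filter per axis is a simplification that ignores correlations between coordinates.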
Computer Vision for Sports Analysis
- Thomas, G., Gade, R., Moeslund, T. B., Carr, P., & Hilton, A. (2017). Computer Vision for Sports: Current Applications and Research Topics. Computer Vision and Image Understanding, 159, 3--18. A comprehensive survey of CV applications across sports, providing context for soccer-specific developments within the broader field.
- Manafifard, M., Ebadi, H., & Moghaddam, H. A. (2017). A Survey on Player Tracking in Soccer Videos. Computer Vision and Image Understanding, 159, 19--46. A focused survey on player tracking in soccer, covering detection, tracking, and team identification. Provides a useful taxonomy of approaches.
- Naik, B. T., Hashmi, M. F., & Bokde, N. D. (2022). A Comprehensive Review of Computer Vision in Sports: Open Issues, Future Trends and Research Directions. Applied Sciences, 12(9), 4429. An updated survey covering the deep learning era, including transformer-based approaches and self-supervised learning methods.
Pose Estimation
- Sun, K., Xiao, B., Liu, D., & Wang, J. (2019). Deep High-Resolution Representation Learning for Human Pose Estimation. CVPR 2019. Introduces HRNet, which maintains high-resolution feature maps throughout the network. Widely used for sports pose estimation due to its accuracy on partially occluded subjects.
- Bridgeman, L., Volino, M., Guillemaut, J.-Y., & Hilton, A. (2019). Multi-Person 3D Pose Estimation and Tracking in Sports. CVPR 2019 Workshops. Addresses the specific challenges of pose estimation in sports contexts, including unusual body configurations and fast motion.
Event Detection and Action Recognition
- Deliege, A., Cioppa, A., Giancola, S., et al. (2021). SoccerNet-v2: A Dataset and Benchmarks for Holistic Understanding of Broadcast Soccer Videos. CVPR 2021 Workshops. The definitive benchmark for soccer event detection, covering action spotting, camera shot segmentation, and replay grounding.
- Giancola, S., Amine, M., Dghaily, T., & Ghanem, B. (2018). SoccerNet: A Scalable Dataset for Action Spotting in Soccer Videos. CVPR 2018 Workshops. The original SoccerNet dataset that catalysed research in automated soccer event detection. Covers goals, cards, and substitutions across 500 complete broadcast matches.
- Vats, K., Fani, M., Walters, P., Clausi, D. A., & Zelek, J. (2020). Event Detection in Coarsely Annotated Sports Videos via Parallel Multi-Receptive Field 1D Convolutions. CVPR 2020 Workshops. Demonstrates temporal convolutional approaches to event detection that handle the coarse annotation granularity typical of sports data.
Homography and Camera Calibration
- Sha, L., Hobbs, J., Felsen, P., Wei, X., Lucey, P., & Ganguly, S. (2020). End-to-End Camera Calibration for Broadcast Videos. CVPR 2020. Proposes a deep learning approach to automatic camera calibration from broadcast soccer video, eliminating the need for manual landmark identification.
- Nie, X., Chen, S., & Hamid, R. (2021). A Robust and Efficient Framework for Sports-Field Registration. WACV 2021. Addresses the practical challenges of robust homography estimation across varying broadcast conditions, camera angles, and pitch designs.
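The homography that these papers estimate automatically is a 3x3 matrix mapping image pixels to pitch coordinates. As a minimal sketch of what that mapping does, the code below fits a homography from four hand-picked point correspondences and applies it to a pixel. It uses only the standard library so that it runs anywhere; in practice a function such as OpenCV's `cv2.findHomography` would normally do this job. The pixel coordinates, the 105 m x 68 m pitch size, and all function names are assumptions made for illustration.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_homography(pixel_pts, pitch_pts):
    """Fit H from exactly four correspondences, fixing H[2][2] = 1."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(pixel_pts, pitch_pts):
        # Two linear equations per correspondence, eight unknowns in total.
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    h = solve_linear(rows, rhs)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def pixel_to_pitch(H, x, y):
    """Apply H to a pixel and divide by the homogeneous coordinate."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return u, v


# Hypothetical pixel positions of the four pitch corners in one frame,
# paired with their positions in metres on an assumed 105 x 68 pitch.
pixel_corners = [(200, 600), (1720, 600), (1400, 200), (520, 200)]
pitch_corners = [(0.0, 0.0), (105.0, 0.0), (105.0, 68.0), (0.0, 68.0)]

H = fit_homography(pixel_corners, pitch_corners)
player_on_pitch = pixel_to_pitch(H, 960, 600)  # a detection's foot point
```

Real broadcast footage rarely shows four corners at once, which is why the papers above rely on line markings, learned keypoints, or direct regression, and why robust estimators that use more than four points are preferred when noisy correspondences are available.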
Books
- Szeliski, R. (2022). Computer Vision: Algorithms and Applications, 2nd edition. Springer. The standard graduate-level textbook on computer vision. Chapters on feature detection, geometric transformations, object recognition, and motion estimation provide the theoretical foundations for the techniques used in soccer CV.
- Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press. The standard reference for deep learning theory. Chapters on convolutional networks (Ch 9) and sequence modelling (Ch 10) are directly relevant to soccer CV pipelines.
- Forsyth, D. A., & Ponce, J. (2011). Computer Vision: A Modern Approach, 2nd edition. Pearson. An accessible introduction to computer vision with strong coverage of geometric reasoning, camera models, and image segmentation.
- Lyttleton, B. (2014). Twelve Yards: The Art and Psychology of the Perfect Penalty Kick. Bantam Press. While focused on penalties, this book discusses the role of video analysis and data in understanding set-piece execution.
Online Resources
Code Repositories
- Ultralytics YOLOv8 --- github.com/ultralytics/ultralytics The most widely used object detection library for sports applications. Pre-trained models can be fine-tuned on soccer-specific data with minimal code.
- SoccerNet GitHub --- github.com/SoccerNet Open-source tools, benchmarks, and pre-trained models for soccer video understanding. Includes tracking, action spotting, and camera calibration tasks.
- Roboflow Sports Datasets --- roboflow.com/sports Curated datasets for sports object detection, including annotated soccer frames for player, ball, and referee detection.
- MMPose --- github.com/open-mmlab/mmpose A comprehensive pose estimation library supporting multiple architectures (HRNet, ViTPose) with pre-trained sports models.
- Norfair --- github.com/tryolabs/norfair A lightweight, customisable multi-object tracking library in Python. Designed for real-time applications and easy integration with detection models.
Video Lectures and Talks
- SoccerNet Challenge Workshops --- CVPR/ECCV annual workshops. Presentations from leading research groups on the latest advances in soccer video understanding. Available on the SoccerNet YouTube channel.
- Friends of Tracking: Computer Vision Series --- YouTube playlist. Practical tutorials on applying CV techniques to soccer, including player detection, tracking, and pitch calibration.
- Andrej Karpathy, "Computer Vision" lectures --- Stanford CS231n. A foundational course on deep learning for vision. While not soccer-specific, it provides the theoretical background for all CNN-based approaches discussed in this chapter.
Blog Posts and Tutorials
- Roboflow, "How to Detect Soccer Players" --- blog.roboflow.com. A step-by-step tutorial on training a custom YOLOv8 model for soccer player detection, including data annotation, training, and inference.
- Tryolabs, "Real-Time Multi-Object Tracking in Soccer" --- tryolabs.com/blog. A practical walkthrough of building a soccer tracking system using Norfair, with code examples and performance analysis.
- Davide Zambrano, "From Broadcast Video to Tracking Data" --- Medium. An accessible overview of the full pipeline from raw broadcast video to structured tracking data, with discussion of accuracy limitations.
Academic Journals and Conferences
For cutting-edge research on CV in soccer, monitor these venues:
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- European Conference on Computer Vision (ECCV)
- IEEE Winter Conference on Applications of Computer Vision (WACV)
- ACM International Conference on Multimedia
- MIT Sloan Sports Analytics Conference (annual)
- Journal of Sports Sciences
- International Journal of Computer Vision
- IEEE Transactions on Pattern Analysis and Machine Intelligence
Suggested Reading Path
For readers approaching soccer CV for the first time, we recommend:
- Start with Thomas et al. (2017) for a broad survey of CV in sports, establishing context and vocabulary.
- Read Redmon et al. (2016) to understand the YOLO detection paradigm, then explore the Ultralytics documentation for modern implementations.
- Work through the Roboflow tutorial to gain hands-on experience with player detection.
- Study Wojke et al. (2017) for tracking fundamentals, then implement a simple tracker using Norfair.
- Explore the SoccerNet benchmarks to understand the state of the art in event detection and tracking evaluation.
- Read Sha et al. (2020) for automatic camera calibration, the key enabler for converting pixel-space detections to pitch coordinates.
- Return to this chapter's code examples and case studies to consolidate understanding of the full pipeline.