Chapter 21: Further Reading - Modeling Combat Sports and Tennis
Academic Papers and Research
- Glickman, Mark E. "The Glicko System." Boston University (1995). The foundational paper introducing the Glicko rating system, which extends Elo by adding a rating deviation parameter. Explains the mathematical framework for handling uncertainty in ratings and the natural treatment of irregular competition schedules. Essential reading for anyone implementing rating systems for combat sports.
- Glickman, Mark E. "Example of the Glicko-2 Rating System." Boston University (2013). The definitive reference for the Glicko-2 algorithm, including the volatility parameter and the Illinois algorithm for root-finding in the volatility update. Provides step-by-step calculations that make implementation straightforward. Freely available online and indispensable for implementers.
- Kovalchik, Stephanie. "Searching for the GOAT of Tennis Win Prediction." Journal of Quantitative Analysis in Sports (2016). A comprehensive comparison of prediction models for professional tennis, including Elo variants, regression models, and point-based models. Finds that Elo-based systems with surface and recency adjustments are competitive with more complex approaches, validating the framework presented in this chapter.
- Ingram, Martin. "How to Extend Elo: A Bayesian Perspective." Journal of Quantitative Analysis in Sports (2021). Presents a unified Bayesian interpretation of Elo and its extensions, showing how surface-specific ratings, time decay, and other modifications can be derived from principled statistical reasoning. Particularly relevant for understanding why blending overall and surface-specific ratings works.
- Spanias, Dimitrios, and Anthony C. Constantinou. "Predicting MMA Match Outcomes with Machine Learning." International Journal of Forecasting (2022). Evaluates multiple approaches to MMA prediction, comparing Elo systems, logistic regression, and gradient-boosted trees. Finds that physical attributes (reach, age) add significant predictive value beyond ratings alone, supporting the physical attribute model presented in Section 21.4.
- del Corral, Julio, and Juan Prieto-Rodriguez. "Are Differences in Ranks Good Predictors for Grand Slam Tennis Matches?" International Journal of Forecasting (2010). Examines whether ATP rankings and Elo ratings effectively predict Grand Slam outcomes, finding that surface-specific adjustments and recent form indicators improve prediction quality significantly.
- Newton, Paul K., and Joseph B. Keller. "Probability of Winning at Tennis I. Theory and Data." Studies in Applied Mathematics (2005). Derives exact formulas for the probability of winning a tennis match from serve-point-win probabilities, accounting for the full hierarchical scoring structure. The mathematical foundation for the live win probability model in Section 21.5.
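
The hierarchical approach described in the Newton and Keller entry starts at the level of a single game. The following is a minimal sketch of that lowest level only, assuming every point on serve is won independently with the same probability p. The function name `game_win_prob` is illustrative; it is not taken from the paper or from Section 21.5.

```python
def game_win_prob(p: float) -> float:
    """Probability that the server wins a standard advantage game.

    Assumption: each point is won by the server independently
    with the same probability p (an i.i.d. point model).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    q = 1.0 - p

    # Server wins before deuce: to love, to 15, or to 30.
    # The final point is always the server's, so the counts are
    # C(3,0) = 1, C(4,1) = 4, and C(5,2) = 10.
    before_deuce = p**4 * (1 + 4 * q + 10 * q**2)

    # Game reaches deuce (3 points each): C(6,3) = 20 orderings.
    reach_deuce = 20 * p**3 * q**3

    # From deuce the server must win two points in a row before the
    # returner does; summing the geometric series gives p^2 / (1 - 2pq).
    win_from_deuce = p**2 / (1 - 2 * p * q)

    return before_deuce + reach_deuce * win_from_deuce


if __name__ == "__main__":
    for p in (0.50, 0.60, 0.65):
        print(f"p = {p:.2f} -> P(hold serve) = {game_win_prob(p):.4f}")
```

Set, tiebreak, and match probabilities can be built on top of this value in the same way, by treating each game as the unit event at the next level of the scoring hierarchy.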
Books
- Broadie, Mark. "Every Shot Counts: Using the Revolutionary Strokes Gained Approach to Improve Your Golf Game and Strategy." Gotham Books (2014). While focused on golf, Broadie's decomposition of performance into component parts provides a conceptual framework applicable to individual sports generally. The idea of decomposing tennis performance into serve, return, rally, and net components follows the same logic.
- Glickman, Mark E., and Thomas Doan. "The US Chess Rating System." Chess Café (2019). A detailed treatment of practical considerations in implementing rating systems at scale, including handling new entrants, managing inactive players, and calibrating system parameters. The lessons transfer directly to combat sports and tennis rating implementation.
- Silver, Nate. "The Signal and the Noise." Penguin Press (2012). Chapter 9 on chess and competitive gaming provides accessible discussion of Elo systems and their limitations. Silver's treatment of the difference between randomness and uncertainty is particularly relevant for understanding rating deviation in Glicko-2.
Data Sources
- Jeff Sackmann's Tennis Abstract and Match Charting Project. The most comprehensive freely available tennis data resource. Includes ATP and WTA match results dating back to the 1960s, point-by-point data for thousands of matches via the Match Charting Project, and surface-specific statistics. Available at https://github.com/JeffSackmann. Indispensable for any tennis modeling project.
- UFCStats.com. The official UFC statistics provider. Contains detailed per-fight statistics for every UFC event, including significant strikes landed and absorbed, takedown attempts and defense, control time, and submission attempts. The primary data source for building fighter profiles, style classifications, and the physical attribute model. Available at http://www.ufcstats.com.
- BoxRec. The most comprehensive boxing database, containing professional records, fight results, and physical measurements for boxers worldwide. Includes historical data spanning decades and records for regional as well as championship-level bouts. Available at https://boxrec.com.
- FiveThirtyEight Tennis Elo Ratings (archived). Before its closure, FiveThirtyEight maintained publicly available Elo ratings for professional tennis players, including surface-specific variants. The archived methodology articles provide excellent guidance on parameter choices and validation approaches. The ratings themselves serve as benchmarks for custom systems.
Online Resources and Communities
- Tennis Abstract Blog (Jeff Sackmann). Regularly publishes analytical articles on tennis prediction, Elo calibration, surface effects, and serve statistics. The blog archives contain some of the best publicly available research on tennis analytics, with code examples and methodological discussions.
- Bloody Elbow and MMA Fighting Analytics. These MMA media outlets occasionally publish analytical pieces on fighter statistics, matchup analysis, and prediction models. While less rigorous than academic sources, they provide domain knowledge and contextual understanding essential for building realistic combat sports models.
- Pinnacle Sports Betting Resources -- Combat Sports and Tennis Sections. Pinnacle publishes articles on market efficiency, closing line value, and betting strategy specifically for tennis and MMA. As a sharp-action sportsbook, their insights on where the market is most and least efficient are particularly valuable.
- The Pudding -- "The Complete History of the UFC" (interactive visualization). An excellent interactive data visualization that provides intuitive understanding of UFC fight outcomes, finishing rates across weight classes, and career trajectories. Useful for building intuition about the patterns that quantitative models should capture.
Methodological References
- Elo, Arpad. "The Rating of Chessplayers, Past and Present." Arco Publishing (1978). The original book by the creator of the Elo system. While focused on chess, the foundational principles, mathematical derivations, and philosophical discussion of what ratings measure remain directly applicable to all individual competitions. A classic that every serious rating system developer should read.
- Minka, Thomas P. "A Family of Algorithms for Approximate Bayesian Inference." PhD thesis, MIT (2001). Provides the theoretical foundation for approximate Bayesian methods used in modern rating systems, including the expectation propagation framework that generalizes Glicko-2. Technically demanding but provides deep understanding of why these algorithms work and how to extend them.