Chapter 10 Further Reading: Bayesian Thinking for Bettors

The following annotated bibliography provides resources for deeper exploration of the Bayesian topics introduced in Chapter 10. Entries are organized by category and chosen for their relevance to sports modeling and betting applications.


Books: Bayesian Statistics Foundations

1. Gelman, Andrew, Carlin, John B., Stern, Hal S., Dunson, David B., Vehtari, Aki, and Rubin, Donald B. Bayesian Data Analysis. Chapman & Hall/CRC, 2013 (3rd edition). The definitive reference on Bayesian statistics. Chapters 1--5 cover the foundational concepts (Bayes' theorem, single-parameter models, multiparameter models, and hierarchical models) that underpin Chapter 10. Chapters 11--12 cover MCMC algorithms. Chapter 15 provides an in-depth treatment of hierarchical models directly applicable to team strength estimation. The notation and examples set the standard for the field. Accessible to readers with intermediate probability and calculus backgrounds.

2. McElreath, Richard. Statistical Rethinking: A Bayesian Course with Examples in R and Stan. Chapman & Hall/CRC, 2020 (2nd edition). An outstanding pedagogical introduction to Bayesian statistics that emphasizes building and understanding models over mechanical formula application. McElreath's approach aligns closely with the philosophy of Chapter 10: start with priors, update with data, check your model. The accompanying lecture videos (freely available on YouTube) are among the best free resources for learning Bayesian statistics. Highly recommended as a companion to this textbook.

3. Davidson-Pilon, Cameron. Bayesian Methods for Hackers: Probabilistic Programming and Bayesian Inference. Addison-Wesley, 2016. A practical, code-first introduction to Bayesian methods using PyMC. Written for programmers who prefer learning through implementation rather than mathematical derivation. The first three chapters cover priors, posteriors, and MCMC with minimal prerequisites. The sports-adjacent examples (A/B testing, estimating unknown rates) translate directly to betting contexts. Available as a free Jupyter notebook on GitHub.

4. Kruschke, John K. Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan. Academic Press, 2015 (2nd edition). Known as "the puppy book" for its cover illustration, this is the most accessible full-length Bayesian textbook. Kruschke's use of intuitive diagrams and step-by-step explanations makes complex concepts approachable. The chapters on the Beta-Binomial model and hierarchical models are particularly relevant to the win probability and team strength models in Chapter 10.
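The Beta-Binomial machinery that Kruschke walks through reduces to a one-line conjugate update. A minimal sketch, with made-up numbers (the Beta(20, 20) prior, centered on a 50% win rate, and the 9-2... record are assumptions for illustration only):

```python
def beta_binomial_update(alpha, beta, wins, games):
    """Conjugate update: a Beta(alpha, beta) prior on win probability
    combined with Binomial game data yields a Beta posterior."""
    return alpha + wins, beta + (games - wins)

# Illustrative numbers only: Beta(20, 20) prior, then 9 wins in 12 games.
a_post, b_post = beta_binomial_update(20, 20, 9, 12)
posterior_mean = a_post / (a_post + b_post)  # lands between 0.5 and 9/12
```

The posterior mean falls between the prior mean (0.500) and the raw record (0.750), which is exactly the shrinkage behavior these chapters develop.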


Books: Probability and Bayesian Foundations

5. Jaynes, Edwin T. Probability Theory: The Logic of Science. Cambridge University Press, 2003. A deep, philosophical treatment of Bayesian probability as an extension of formal logic. Jaynes argues that probability theory is the unique consistent framework for reasoning under uncertainty --- an argument with direct implications for sports bettors who must make decisions under uncertainty. Not for the faint of heart mathematically, but profoundly influential in shaping the Bayesian worldview. Recommended for advanced readers seeking the philosophical foundations.

6. McGrayne, Sharon Bertsch. The Theory That Would Not Die: How Bayes' Rule Cracked the Enigma Code, Hunted Down Russian Submarines, and Emerged Triumphant from Two Centuries of Controversy. Yale University Press, 2011. A popular history of Bayes' theorem and its applications, from Turing's code-breaking to modern statistical practice. Not a technical book, but an engaging narrative that puts the Bayesian-frequentist debate into historical context. The opening quote of Chapter 10 is drawn from this work.


Books: Sports Analytics Applications

7. Albert, Jim, Glickman, Mark E., Swartz, Tim B., and Koning, Ruud H., eds. Handbook of Statistical Methods and Analyses in Sports. Chapman & Hall/CRC, 2017. A comprehensive academic handbook with chapters by leading sports statisticians. Several chapters use Bayesian methods, including Glickman's work on paired comparison models for team ratings and Albert's work on Bayesian approaches to baseball performance. The chapter on hierarchical models for team strength estimation is directly relevant to Case Study 2 in Chapter 10.

8. Albert, Jim. Bayesian Computation with R. Springer, 2009 (2nd edition). Though focused on R, Albert's book is valuable for its sports examples. Multiple chapters use baseball and football data to illustrate Bayesian concepts: estimating batting averages with Beta-Binomial models, comparing team strengths, and performing Bayesian hypothesis tests. The translation to Python is straightforward, and the conceptual insights are language-independent.


Academic Papers

9. Glickman, Mark E. and Stern, Hal S. "A State-Space Model for National Football League Scores." Journal of the American Statistical Association, 93(441), 1998, pp. 25-35. A seminal paper that applies a Bayesian state-space model to NFL game results, allowing team strengths to evolve over time. This paper introduced the idea of time-varying team parameters estimated within a Bayesian framework --- a direct precursor to the hierarchical models discussed in Section 10.5. The paper demonstrates that Bayesian smoothing produces better estimates than simple moving averages.
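The time-varying idea in the Glickman-Stern paper can be caricatured with a scalar random-walk filter; their actual state-space model is considerably richer, so treat this as a simplified sketch with invented numbers:

```python
def kalman_filter_1d(observations, obs_var, state_var, m0=0.0, v0=10.0):
    """Scalar Kalman filter for a random-walk team strength:
    theta_t = theta_{t-1} + N(0, state_var), y_t = theta_t + N(0, obs_var)."""
    m, v = m0, v0
    estimates = []
    for y in observations:
        v_pred = v + state_var           # predict: strength drifts week to week
        k = v_pred / (v_pred + obs_var)  # gain: weight on the new observation
        m = m + k * (y - m)              # update toward the observed margin
        v = (1 - k) * v_pred
        estimates.append(m)
    return estimates

# Toy sequence of observed scoring margins for one team (made-up numbers).
margins = [3.0, 7.0, -2.0, 10.0]
strength_path = kalman_filter_1d(margins, obs_var=100.0, state_var=1.0)
```

Unlike a fixed-window moving average, the gain balances each new result against accumulated uncertainty, which is the sense in which Bayesian smoothing improves on simple averages.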

10. Baio, Gianluca and Blangiardo, Marta. "Bayesian Hierarchical Model for the Prediction of Football Results." Journal of Applied Statistics, 37(2), 2010, pp. 253-264. Applies a hierarchical Poisson model to English Premier League results, estimating team-specific offensive and defensive parameters. This paper demonstrates the full Bayesian workflow for sports prediction: prior specification, MCMC sampling, posterior predictive checks, and out-of-sample evaluation. The model structure directly parallels the hierarchical model in Section 10.5, adapted for soccer scoring.
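The core of such a hierarchical Poisson model is a log-linear scoring rate. A minimal sketch (all parameter values here are invented; the full model places hierarchical priors on the attack and defence terms rather than fixing them):

```python
from math import exp, factorial

def expected_goals(mu, home_adv, attack, defence):
    """Log-linear Poisson rate: log(lambda) = mu + home_adv + attack - defence."""
    return exp(mu + home_adv + attack - defence)

def poisson_pmf(lam, k):
    """P(K = k) for K ~ Poisson(lam)."""
    return lam ** k * exp(-lam) / factorial(k)

# Hypothetical parameter values for illustration only.
lam_home = expected_goals(mu=0.2, home_adv=0.3, attack=0.25, defence=-0.1)
p_two_goals = poisson_pmf(lam_home, 2)
```

Scoring probabilities for any home/away pairing then follow by evaluating the pmf at each plausible goal count.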

11. Karlis, Dimitris and Ntzoufras, Ioannis. "Analysis of Sports Data by Using Bivariate Poisson Models." The Statistician, 52(3), 2003, pp. 381-393. Introduces bivariate Poisson models for predicting soccer scores, where the joint distribution of home and away goals captures correlation between teams' performances. The Bayesian implementation allows natural incorporation of prior information about team strengths. This paper extends the basic Poisson-Gamma conjugate model discussed in the exercises to a more realistic bivariate setting.
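The bivariate Poisson density itself is compact: with X = W1 + W3 and Y = W2 + W3 for independent Poisson components, the shared component (rate l3) induces positive correlation between home and away goals. A direct implementation of the joint pmf, as a sketch:

```python
from math import exp, factorial

def bivariate_poisson_pmf(x, y, l1, l2, l3):
    """P(X = x, Y = y) where X = W1 + W3, Y = W2 + W3 and the W's are
    independent Poisson(l1), Poisson(l2), Poisson(l3).  Setting l3 = 0
    recovers two independent Poisson margins."""
    total = 0.0
    for k in range(min(x, y) + 1):
        total += (l1 ** (x - k) / factorial(x - k)
                  * l2 ** (y - k) / factorial(y - k)
                  * l3 ** k / factorial(k))
    return exp(-(l1 + l2 + l3)) * total
```

The marginals remain Poisson (X ~ Poisson(l1 + l3), Y ~ Poisson(l2 + l3)), so the model nests the simpler independent-goals setup.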

12. Gelman, Andrew and Hill, Jennifer. Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press, 2007. While technically a book, this reference is included here for its applied chapters on multilevel modeling. Chapters 11--13 cover the theory and practice of hierarchical models with extensive worked examples. The treatment of partial pooling and shrinkage is the clearest in the literature and directly supports the concepts in Section 10.5.

13. Vehtari, Aki, Gelman, Andrew, and Gabry, Jonah. "Practical Bayesian Model Evaluation Using Leave-One-Out Cross-Validation and WAIC." Statistics and Computing, 27(5), 2017, pp. 1413-1432. The definitive reference on Bayesian model comparison using LOO-CV and WAIC. These methods are essential for comparing competing Bayesian models (e.g., Normal vs. Student-t likelihood for scoring margins, or models with different covariate sets). The ArviZ Python package implements these diagnostics directly from PyMC traces.


Applied Tutorials and Blog Posts

14. Fonnesbeck, Christopher. "Bayesian Sports Analytics with PyMC." PyMCon 2022 talk and accompanying notebooks. A tutorial by the lead developer of PyMC demonstrating Bayesian sports models, including hierarchical team strength estimation and posterior predictive analysis for game outcomes. The notebooks provide production-quality PyMC code for the types of models discussed in Section 10.5. Available through the PyMC documentation website.

15. Robinson, David. "Understanding Empirical Bayes Estimation (Using Baseball Statistics)." Variance Explained Blog, 2015-2017. An exceptional series of blog posts that uses MLB batting averages to explain empirical Bayes estimation step by step. Robinson demonstrates the Beta-Binomial model, shrinkage toward the population mean, and the connection between empirical Bayes and fully Bayesian methods. The baseball examples translate directly to the sports betting contexts in Chapter 10. Available at varianceexplained.org.
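The shrinkage estimator at the heart of Robinson's series is one line once a Beta prior has been fitted to league data. A sketch with invented prior values (Robinson fits his own prior from historical MLB batting records; alpha0 = 79 and beta0 = 229 below are assumptions for illustration):

```python
def eb_shrink(hits, at_bats, alpha0, beta0):
    """Empirical Bayes estimate: the raw batting average shrunk toward
    the league-wide prior mean alpha0 / (alpha0 + beta0)."""
    return (hits + alpha0) / (at_bats + alpha0 + beta0)

# Hypothetical Beta(79, 229) prior, centred near .256.
small_sample = eb_shrink(4, 10, 79, 229)     # raw .400, shrunk heavily
big_sample   = eb_shrink(150, 500, 79, 229)  # raw .300, shrunk mildly
```

Small samples are pulled strongly toward the league mean while large samples barely move, mirroring the partial pooling behavior of fully hierarchical models.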

16. Betancourt, Michael. "A Conceptual Introduction to Hamiltonian Monte Carlo." arXiv:1701.02434, 2017. The most accessible introduction to the MCMC algorithms used by PyMC and Stan. Betancourt explains why Hamiltonian Monte Carlo and the No-U-Turn Sampler (NUTS) are dramatically more efficient than older methods like Metropolis-Hastings and Gibbs sampling. Understanding these algorithms is not necessary for using PyMC, but the paper helps practitioners diagnose sampling problems and interpret diagnostic warnings.


Software Documentation

17. PyMC Documentation and Examples (pymc.io). The official documentation for PyMC, the probabilistic programming library used throughout Chapter 10. The "Getting Started" tutorial, the examples gallery, and the API reference for pm.Model, pm.sample, and pm.sample_posterior_predictive are essential references. The documentation includes several sports-related examples, including hierarchical models for team ratings.

18. ArviZ Documentation (arviz-devs.github.io/arviz). ArviZ is the companion library for analyzing and visualizing MCMC results. The documentation covers posterior summaries (az.summary), diagnostic plots (az.plot_trace, az.plot_posterior), model comparison (az.compare, az.loo, az.waic), and posterior predictive checks (az.plot_ppc). Every ArviZ function used in Chapter 10 is documented there with worked examples.


Data Sources

19. nflverse (nflverse.com) and nfl_data_py. The open-source ecosystem for NFL data, providing play-by-play data, game results, team statistics, and player information from 1999 to the present. The nfl_data_py Python package provides easy access to this data. For Bayesian NFL models, historical game results and team statistics are the primary data inputs for prior specification and model fitting.

20. Basketball Reference (basketball-reference.com) and nba_api. Basketball Reference provides comprehensive NBA team and player statistics. The nba_api Python package accesses the NBA's official statistics API programmatically. For the hierarchical NBA model in Case Study 2, team-level net ratings, pace statistics, and game-by-game scoring margins are the essential data inputs.


How to Use This Reading List

For readers working through this textbook sequentially, the following prioritization is suggested:

  • Start with: McElreath (entry 2) for the best introductory treatment of Bayesian thinking, or Davidson-Pilon (entry 3) if you prefer code-first learning.
  • Go deeper on theory: Gelman et al. (entry 1) chapters 1--5 and 15 for rigorous mathematical foundations and hierarchical models.
  • For sports-specific applications: Albert et al. (entry 7) and Glickman and Stern (entry 9) for academic sports Bayesian models.
  • For programming implementation: PyMC docs (entry 17) and ArviZ docs (entry 18) for the tools used in Chapter 10's code.
  • For empirical Bayes intuition: Robinson (entry 15) for the clearest explanation of shrinkage using baseball data.
  • For data: nflverse (entry 19) for NFL and basketball-reference/nba_api (entry 20) for NBA.
  • For the big picture: McGrayne (entry 6) for historical context, Jaynes (entry 5) for philosophical depth.

These resources will be referenced again in later chapters, particularly in Part IV (Advanced Quantitative Methods) where Bayesian methods are combined with machine learning and simulation techniques.