Chapter 7: Further Reading — Probability Distributions in Betting
Foundational Probability and Statistics
-
DeGroot, M.H. & Schervish, M.J. (2012). Probability and Statistics (4th ed.). Pearson. A comprehensive textbook covering all the distributions discussed in this chapter (Normal, Poisson, Binomial, Beta) with rigorous mathematical treatment and many worked examples. Chapters 5-7 are particularly relevant.
-
Casella, G. & Berger, R.L. (2002). Statistical Inference (2nd ed.). Cengage. A graduate-level reference for the theoretical foundations of probability distributions, estimation, and hypothesis testing. Useful for readers who want to understand the mathematical proofs behind the properties we use.
-
Wasserman, L. (2004). All of Statistics: A Concise Course in Statistical Inference. Springer. A concise and modern treatment that covers distributions, estimation, Bayesian inference, and nonparametric methods. Good for readers who want breadth without excessive length.
Poisson Models in Sports
-
Dixon, M. & Coles, S. (1997). "Modelling Association Football Scores and Inefficiencies in the Football Betting Market." Journal of the Royal Statistical Society: Series C (Applied Statistics), 46(2), 265-280. The seminal paper introducing the Dixon-Coles modification to the independent Poisson model for soccer. Required reading for anyone building Poisson-based soccer models.
-
Maher, M.J. (1982). "Modelling Association Football Scores." Statistica Neerlandica, 36(3), 109-118. One of the earliest papers applying independent Poisson models to soccer scoring. Establishes the framework that Dixon and Coles later refined.
-
Karlis, D. & Ntzoufras, I. (2003). "Analysis of Sports Data by Using Bivariate Poisson Models." Journal of the Royal Statistical Society: Series D, 52(3), 381-393. Extends the Poisson model to a bivariate framework that captures correlation between the two teams' goal outputs without the ad-hoc correction of Dixon-Coles.
-
Lee, A.J. (1997). "Modeling Scores in the Premier League: Is Manchester United Really the Best?" Chance, 10(1), 15-19. An accessible introduction to Poisson regression for soccer modeling, written for a general statistical audience.
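As a quick orientation before reading these papers, the independent Poisson framework they build on can be sketched in a few lines. This is a minimal illustration, not the method of any one paper: the scoring rates 1.5 and 1.1 are hypothetical, and the function names are chosen here for illustration only.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """Probability of exactly k goals when goals follow Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)


def match_outcome_probs(home_rate, away_rate, max_goals=15):
    """Home win / draw / away win probabilities under independent Poisson scoring.

    The score grid is truncated at max_goals per team; for typical soccer
    scoring rates the probability mass beyond that point is negligible.
    """
    home_win = draw = away_win = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            # Independence assumption: joint probability is the product.
            p = poisson_pmf(h, home_rate) * poisson_pmf(a, away_rate)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
    return home_win, draw, away_win


# Hypothetical expected goals for the home and away teams.
home_win, draw, away_win = match_outcome_probs(1.5, 1.1)
print(f"home {home_win:.3f}, draw {draw:.3f}, away {away_win:.3f}")
```

The Dixon-Coles and bivariate Poisson papers above can be read as different ways of relaxing the independence assumption in the inner loop.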
Binomial Models and Streaks
-
Gilovich, T., Vallone, R. & Tversky, A. (1985). "The Hot Hand in Basketball: On the Misperception of Random Sequences." Cognitive Psychology, 17(3), 295-314. The classic study arguing that perceived "hot streaks" in basketball shooting are consistent with independent Bernoulli trials. One of the most cited papers in sports psychology and behavioral economics.
-
Miller, J.B. & Sanjurjo, A. (2018). "Surprised by the Hot Hand Fallacy? A Truth in the Law of Small Numbers." Econometrica, 86(6), 2019-2047. A critical reexamination of the Gilovich-Vallone-Tversky result, showing that the original analysis contained a subtle statistical bias. Argues that there is genuine evidence for the hot hand once the bias is corrected.
-
Albert, J. & Bennett, J. (2003). Curve Ball: Baseball, Statistics, and the Role of Chance in the Game (Revised ed.). Copernicus. Uses the binomial distribution extensively to analyze streaks, slumps, and clutch performance in baseball. Highly accessible and full of practical examples.
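The bias at the center of the Miller and Sanjurjo paper can be reproduced by brute force in a small case. The sketch below is an illustration written for this reading list, not code from the paper: it enumerates every sequence of fair coin flips of a given length, computes within each sequence the proportion of heads among flips that immediately follow a head, and averages that proportion across sequences. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def expected_prop_heads_after_heads(n_flips):
    """Average, over all equally likely fair-coin sequences of length n_flips,
    of the within-sequence proportion of heads among flips that follow a head.

    Sequences with no head in the first n_flips - 1 positions are dropped,
    because the proportion is undefined for them.
    """
    proportions = []
    for seq in product("HT", repeat=n_flips):
        # Collect the flips that come immediately after a head.
        followers = [seq[i + 1] for i in range(n_flips - 1) if seq[i] == "H"]
        if followers:
            proportions.append(Fraction(followers.count("H"), len(followers)))
    return sum(proportions, Fraction(0)) / len(proportions)


result = expected_prop_heads_after_heads(4)
print(result, float(result))  # 17/42, about 0.405
```

For four flips the average is 17/42, below the 1/2 one might expect from a fair coin. A shooter with no hot hand therefore tends to look slightly "cold" after makes under this style of analysis, which is why finding no difference in the data can be read as evidence for a hot hand.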
Normal Distribution and Spread Betting
-
Stern, H.S. (1991). "On the Probability of Winning a Football Game." The American Statistician, 45(2), 116-123. Analyzes the distribution of point margins in NFL games and evaluates the normal model's adequacy. A concise and insightful paper.
-
Glickman, M.E. & Stern, H.S. (1998). "A State-Space Model for National Football League Scores." Journal of the American Statistical Association, 93(441), 25-35. Develops a dynamic model for NFL scores that accounts for time-varying team strengths. Uses the normal distribution as a building block.
-
Lopez, M.J. & Matthews, G.J. (2015). "Building an NCAA Men's Basketball Predictive Model and Quantifying Its Success." Journal of Quantitative Analysis in Sports, 11(1), 5-12. Demonstrates the use of normal distribution models for predicting basketball outcomes, including spread and totals markets.
Bayesian Methods and the Beta Distribution
-
Gelman, A., Carlin, J.B., Stern, H.S., Dunson, D.B., Vehtari, A. & Rubin, D.B. (2013). Bayesian Data Analysis (3rd ed.). Chapman & Hall/CRC. The definitive reference for Bayesian statistics. Chapter 2 covers the Beta-Binomial model in detail. Later chapters cover hierarchical models, which are natural extensions for sports applications.
-
McElreath, R. (2020). Statistical Rethinking: A Bayesian Course with Examples in R and Stan (2nd ed.). Chapman & Hall/CRC. A highly readable introduction to Bayesian statistics that builds intuition through simulation and visualization. The chapter on Beta-Binomial updating is particularly clear.
-
Albert, J. (2009). Bayesian Computation with R (2nd ed.). Springer. A practical guide to implementing Bayesian models, with many sports examples. Chapter 2 covers Beta prior and posterior distributions.
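The Beta-Binomial updating that these references treat in depth reduces to simple arithmetic, which can help when checking results from the texts by hand. The sketch below assumes a hypothetical Beta(2, 2) prior on a win probability and hypothetical data of 7 wins and 3 losses; the function names are illustrative only.

```python
def beta_binomial_update(prior_a, prior_b, successes, failures):
    """Conjugate update: a Beta(a, b) prior combined with binomial data
    gives a Beta(a + successes, b + failures) posterior."""
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical prior and data: Beta(2, 2) prior, then 7 wins and 3 losses.
post_a, post_b = beta_binomial_update(2, 2, successes=7, failures=3)
print(post_a, post_b)                        # 9 5
print(round(beta_mean(post_a, post_b), 3))   # 0.643
```

The posterior mean of 9/14 sits between the prior mean of 0.5 and the observed win rate of 0.7, illustrating the shrinkage toward the prior that the hierarchical models in Gelman et al. generalize.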
Distribution Fitting and Model Selection
-
Burnham, K.P. & Anderson, D.R. (2002). Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach (2nd ed.). Springer. The authoritative reference on AIC, BIC, and information-theoretic model selection. Essential reading for anyone comparing multiple distributional fits.
-
D'Agostino, R.B. & Stephens, M.A. (1986). Goodness-of-Fit Techniques. Marcel Dekker. A comprehensive reference covering chi-squared tests, Kolmogorov-Smirnov tests, Anderson-Darling tests, and other goodness-of-fit methods.
-
Clauset, A., Shalizi, C.R. & Newman, M.E.J. (2009). "Power-Law Distributions in Empirical Data." SIAM Review, 51(4), 661-703. While focused on power-law distributions, this paper provides an excellent tutorial on rigorous distribution fitting methodology, including maximum likelihood estimation and model comparison.
Advanced Topics
-
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer. The standard introduction to extreme value theory. Relevant for modeling tail probabilities in sports betting, where thin-tailed models such as the Normal can underestimate the frequency of extreme outcomes.
-
Nelsen, R.B. (2006). An Introduction to Copulas (2nd ed.). Springer. A clear introduction to copula theory for modeling dependent multivariate distributions. Applicable to modeling the dependence between two teams' scoring in a match.
-
Cameron, A.C. & Trivedi, P.K. (2013). Regression Analysis of Count Data (2nd ed.). Cambridge University Press. A comprehensive treatment of Poisson regression, Negative Binomial regression, zero-inflated models, and other count data models. Essential for anyone building regression-based sports models.
-
Johnson, N.L., Kemp, A.W. & Kotz, S. (2005). Univariate Discrete Distributions (3rd ed.). Wiley. An encyclopedic reference covering every discrete distribution you might encounter in sports modeling, including the Poisson, Binomial, Negative Binomial, geometric, and many others.
-
Johnson, N.L., Kotz, S. & Balakrishnan, N. (1994-1995). Continuous Univariate Distributions, Volumes 1 and 2 (2nd ed.). Wiley. The companion volumes for continuous distributions: Normal, Gamma, Beta, Weibull, log-normal, and dozens of others.
Online Resources and Datasets
-
Football-Data.co.uk (https://www.football-data.co.uk/) A comprehensive source of historical soccer results and betting odds for European leagues. Invaluable for building and testing Poisson models.
-
Basketball-Reference (https://www.basketball-reference.com/) Complete NBA statistics, game logs, and historical data. Useful for analyzing streaks, player props, and team-level distributions.
-
Pro-Football-Reference (https://www.pro-football-reference.com/) Complete NFL statistics and historical game results. Essential for analyzing spread margins and testing normal distribution models.
-
SciPy Statistical Functions (https://docs.scipy.org/doc/scipy/reference/stats.html) The official documentation for SciPy's statistical distributions module, which implements all distributions discussed in this chapter and provides functions for PMF/PDF, CDF, fitting, and hypothesis testing.
-
Pinnacle Sports Betting Resources (https://www.pinnacle.com/betting-resources/) Articles on sports betting theory from one of the sharpest bookmakers, including many that discuss distributional models for various sports.
-
FiveThirtyEight Sports Models (https://projects.fivethirtyeight.com/) Nate Silver's sports prediction models (Elo ratings, win probabilities) provide practical examples of how distributional thinking is applied to real prediction problems.