Chapter 23 Key Takeaways: Time Series Analysis for Betting

Key Concepts

  1. Stationarity: A fundamental requirement for most time series models. A (weakly) stationary series has a constant mean, a constant variance, and an autocovariance that depends only on the lag, not on time. Sports performance series are often non-stationary due to trends (improving/declining teams), structural breaks (injuries, trades), and seasonal effects (schedule-related patterns).

  2. ARIMA Models: The Autoregressive Integrated Moving Average framework combines three components: AR terms capture persistence (good weeks follow good weeks), I (differencing) removes trends to achieve stationarity, and MA terms capture the effect of past shocks. Model selection via AIC/BIC and validation through residual diagnostics (Ljung-Box test) are essential.

  3. Mean Reversion: The tendency of performance metrics to return toward their long-run average. The Ornstein-Uhlenbeck process provides a mathematical framework, with the half-life ($\ln(2)/\theta$) quantifying how quickly reversion occurs. Not all metrics mean-revert at the same speed: fumble recovery rates revert rapidly, while team talent persists.

  4. Changepoint Detection: Algorithms that identify moments when the underlying data-generating process changes. CUSUM, PELT, and BOCPD detect shifts caused by injuries, trades, coaching changes, or scheme adjustments. Detecting a changepoint before the market adjusts can create betting opportunities.

  5. Seasonal and Calendar Effects: Recurring patterns tied to the schedule rather than team quality. Day-of-week effects, back-to-back game penalties, post-bye performance, and late-season fatigue can all be modeled as seasonal components. Most well-documented calendar effects are already priced into betting markets.

  6. ACF and PACF: Diagnostic tools for identifying appropriate time series model orders. The autocorrelation function (ACF) and partial autocorrelation function (PACF) reveal the correlation structure of a series. A pure AR(p) process shows a gradually decaying ACF and a PACF that cuts off after lag p; a pure MA(q) process shows the opposite: an ACF that cuts off after lag q and a gradually decaying PACF.

  7. Forecasting vs. Fitting: In-sample model fit (R-squared, AIC) does not guarantee out-of-sample predictive accuracy. Time series models for betting must be validated using expanding or rolling window backtests that respect the temporal ordering of data.
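
The AR signature described in item 6 can be checked numerically. The following is a minimal sketch, assuming a simulated AR(1) series with a hypothetical coefficient of 0.7; the function names are illustrative, and the PACF is computed with the Durbin-Levinson recursion rather than with any particular library routine.

```python
import random


def sample_acf(x, max_lag):
    """Sample autocorrelations r_0..r_max_lag (biased estimator, divides by n)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev) / n
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / n / c0
        for k in range(max_lag + 1)
    ]


def sample_pacf(acf):
    """Partial autocorrelations from an ACF via the Durbin-Levinson recursion."""
    pacf = [1.0]
    phi_prev = []  # AR coefficients of the order-(k-1) fit
    for k in range(1, len(acf)):
        num = acf[k] - sum(phi_prev[j] * acf[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * acf[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_new = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
        phi_new.append(phi_kk)
        pacf.append(phi_kk)
        phi_prev = phi_new
    return pacf


def simulate_ar1(phi, n, seed=0):
    """Simulate X_t = phi * X_{t-1} + e_t with standard normal shocks."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x


series = simulate_ar1(phi=0.7, n=2000)  # hypothetical persistence parameter
acf = sample_acf(series, max_lag=5)
pacf = sample_pacf(acf)
for lag in range(1, 6):
    print(f"lag {lag}: ACF={acf[lag]:+.2f}  PACF={pacf[lag]:+.2f}")
```

For an AR(1) series the printed ACF should shrink gradually with the lag, while the PACF should be large at lag 1 and close to zero afterwards.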


Key Formulas

| Formula | Expression | Example |
|---|---|---|
| First Differencing | $\Delta X_t = X_t - X_{t-1}$ | Removes linear trend |
| AR(1) Process | $X_t = c + \phi X_{t-1} + \varepsilon_t$ | Persistence model |
| ARIMA(p,d,q) | $\phi(B)(1-B)^d X_t = \theta(B)\varepsilon_t$ | General time series model |
| OU Mean Reversion | $dX_t = \theta(\mu - X_t)dt + \sigma dW_t$ | Continuous-time model |
| Half-Life | $t_{1/2} = \ln(2) / \theta$ | $\theta = 0.15$: half-life = 4.6 weeks |
| CUSUM | $S_k = \sum_{i=1}^{k}(x_i - \bar{x})$ | Changepoint at $\max \lvert S_k \rvert$ |
| ADF Test | $\Delta X_t = \alpha + \beta t + \gamma X_{t-1} + \sum \delta_i \Delta X_{t-i} + \varepsilon_t$ | Reject $H_0$ ($\gamma = 0$) for stationarity |
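
The OU and half-life formulas above can be tied together with a short estimation sketch. It assumes weekly observations (time step of 1), simulates a path with hypothetical parameters, and recovers $\theta$ by regressing each observation on the previous one, using the fact that the exact discretization of the OU process is an AR(1) with slope $e^{-\theta \Delta t}$. All function names and parameter values are illustrative.

```python
import math
import random


def half_life(theta):
    """Half-life of mean reversion: ln(2) / theta."""
    return math.log(2) / theta


def simulate_ou(theta, mu, sigma, n, dt=1.0, seed=1):
    """Exact discretization of dX = theta * (mu - X) dt + sigma dW."""
    rng = random.Random(seed)
    decay = math.exp(-theta * dt)
    step_sd = sigma * math.sqrt((1 - decay**2) / (2 * theta))
    x = [mu]
    for _ in range(n - 1):
        x.append(mu + (x[-1] - mu) * decay + rng.gauss(0.0, step_sd))
    return x


def estimate_ou(x, dt=1.0):
    """Fit X_{t+1} = a + b * X_t + e by OLS and map (a, b) back to (theta, mu)."""
    prev, nxt = x[:-1], x[1:]
    n = len(prev)
    mean_prev, mean_next = sum(prev) / n, sum(nxt) / n
    sxx = sum((p - mean_prev) ** 2 for p in prev)
    sxy = sum((p - mean_prev) * (q - mean_next) for p, q in zip(prev, nxt))
    b = sxy / sxx
    a = mean_next - b * mean_prev
    if not 0 < b < 1:
        raise ValueError("no mean reversion detected (slope outside (0, 1))")
    return -math.log(b) / dt, a / (1 - b)


# Hypothetical metric: theta = 0.15 per week, long-run mean 2.0
path = simulate_ou(theta=0.15, mu=2.0, sigma=1.0, n=5000)
theta_hat, mu_hat = estimate_ou(path)
print(f"theta_hat={theta_hat:.3f}  mu_hat={mu_hat:.2f}  "
      f"half-life={half_life(theta_hat):.1f} weeks")
```

A real season provides far fewer than 5,000 observations, so the estimate of $\theta$ (and therefore the half-life) will be much noisier in practice than in this simulation.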

Quick-Reference Decision Framework

When applying time series methods to a sports betting problem, follow this progression:

Step 1 --- Assess stationarity. Run both ADF and KPSS tests. If the series is non-stationary, apply differencing. If variance is non-constant, consider a log or Box-Cox transformation. Never fit an ARMA model to non-stationary data.
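
To make the unit-root logic of Step 1 concrete, here is a deliberately simplified sketch: an unaugmented Dickey-Fuller regression with a constant and no trend or lagged differences, compared against the asymptotic 5% critical value of -2.86 for that specification. It is not a substitute for the full ADF and KPSS tests recommended above; the function names and simulated series are assumptions for illustration.

```python
import math
import random

# Asymptotic 5% critical value for the constant-only Dickey-Fuller regression
DF_CRITICAL_5PCT = -2.86


def difference(x):
    """First difference: X_t - X_{t-1}."""
    return [b - a for a, b in zip(x[:-1], x[1:])]


def dickey_fuller_stat(x):
    """t-statistic on gamma in  dX_t = alpha + gamma * X_{t-1} + e_t."""
    lagged = x[:-1]
    diffs = difference(x)
    n = len(diffs)
    mean_l, mean_d = sum(lagged) / n, sum(diffs) / n
    sxx = sum((l - mean_l) ** 2 for l in lagged)
    sxy = sum((l - mean_l) * (d - mean_d) for l, d in zip(lagged, diffs))
    gamma = sxy / sxx
    alpha = mean_d - gamma * mean_l
    rss = sum((d - alpha - gamma * l) ** 2 for l, d in zip(lagged, diffs))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return gamma / std_err


rng = random.Random(42)
noise = [rng.gauss(0.0, 1.0) for _ in range(300)]  # stationary by construction
walk = [0.0]
for _ in range(299):
    walk.append(walk[-1] + rng.gauss(0.0, 1.0))    # unit root by construction

for name, data in [("white noise", noise), ("random walk", walk),
                   ("differenced walk", difference(walk))]:
    stat = dickey_fuller_stat(data)
    verdict = "reject unit root" if stat < DF_CRITICAL_5PCT else "cannot reject unit root"
    print(f"{name:>16}: t = {stat:6.2f} -> {verdict}")
```

The differenced random walk is just its own shocks, so differencing once should turn a "cannot reject" result into a clear rejection. Real series with autocorrelated differences need the lagged-difference terms of the full ADF regression.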

Step 2 --- Check for changepoints. Before fitting a global model, test for structural breaks using PELT or BOCPD. If changepoints exist, consider fitting models only to the most recent regime. A post-changepoint segment may be short, requiring simpler models (fewer parameters).
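
Step 2 names PELT and BOCPD; as a lighter illustration, the sketch below uses the CUSUM statistic from the formula table instead, paired with a permutation test to judge whether the largest excursion could plausibly be random variation. The point-margin series is made up for the example, and the function names are hypothetical.

```python
import random


def cusum_path(x):
    """S_k = sum_{i<=k} (x_i - mean(x)) for k = 1..n."""
    mean = sum(x) / len(x)
    path, total = [], 0.0
    for value in x:
        total += value - mean
        path.append(total)
    return path


def cusum_changepoint(x):
    """Return (index of the first observation in the new regime, max |S_k|)."""
    path = cusum_path(x)
    k = max(range(len(path)), key=lambda i: abs(path[i]))
    return k + 1, abs(path[k])


def permutation_p_value(x, n_perm=1000, seed=0):
    """Share of shuffled series whose max |S_k| is at least the observed one."""
    rng = random.Random(seed)
    _, observed = cusum_changepoint(x)
    shuffled = list(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if cusum_changepoint(shuffled)[1] >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Made-up point margins: eight games before and eight after a key injury
margins = [3, 7, 4, 6, 5, 8, 4, 6, -2, -5, -1, -4, -3, -6, -2, -4]
start_of_new_regime, statistic = cusum_changepoint(margins)
p_value = permutation_p_value(margins)
print(start_of_new_regime, statistic, round(p_value, 4))
```

Here the changepoint is located at index 8 (the ninth game) with a statistic of 35.0. Note that the post-break segment contains only eight games, which is the short-regime situation the step warns about.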

Step 3 --- Identify model order. Use ACF and PACF plots to guide the choice of AR and MA orders. Confirm with AIC/BIC grid search. For seasonal data, examine the ACF at the seasonal lag and consider SARIMA.
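
The AIC/BIC grid search in Step 3 can be sketched for the pure AR case. This is a simplification under stated assumptions: only AR(p) orders are compared, each model is fit by ordinary least squares rather than full ARIMA maximum likelihood, and all candidates share the same estimation sample so their criteria are comparable. The simulated AR(2) coefficients and helper names are hypothetical.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a z = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (m[r][n] - sum(m[r][c] * z[c] for c in range(r + 1, n))) / m[r][r]
    return z


def fit_ar(x, p, start):
    """OLS fit of x_t = c + sum_j phi_j x_{t-j} + e_t for t >= start."""
    rows = [[1.0] + [x[t - j] for j in range(1, p + 1)] for t in range(start, len(x))]
    y = x[start:]
    k = p + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    rss = sum((v - sum(b * ri for b, ri in zip(beta, r))) ** 2 for r, v in zip(rows, y))
    return beta, rss, len(y)


def information_criteria(x, max_p):
    """Map each order p to (AIC, BIC), using a common sample that starts at max_p."""
    table = {}
    for p in range(max_p + 1):
        _, rss, n = fit_ar(x, p, start=max_p)
        k = p + 2  # intercept, p AR coefficients, error variance
        fit_term = n * math.log(rss / n)
        table[p] = (fit_term + 2 * k, fit_term + k * math.log(n))
    return table


def simulate_ar2(n, phi1=0.5, phi2=0.3, seed=7):
    rng = random.Random(seed)
    x = [0.0, 0.0]
    for _ in range(n - 2):
        x.append(phi1 * x[-1] + phi2 * x[-2] + rng.gauss(0.0, 1.0))
    return x


series = simulate_ar2(n=1500)
table = information_criteria(series, max_p=4)
best_aic = min(table, key=lambda p: table[p][0])
best_bic = min(table, key=lambda p: table[p][1])
for p, (aic, bic) in table.items():
    print(f"AR({p}): AIC={aic:9.1f}  BIC={bic:9.1f}")
print("best by AIC:", best_aic, " best by BIC:", best_bic)
```

BIC penalizes extra parameters more heavily than AIC, so when the two disagree BIC will point to the smaller model.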

Step 4 --- Fit and validate. Fit the model and check residual diagnostics. Use an expanding-window backtest to assess true out-of-sample accuracy. Compare the model's RMSE and bias against a naive benchmark (historical mean or random walk).
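
The expanding-window backtest in Step 4 can be sketched as a loop in which each forecast sees only the observations that precede it. The example below assumes a simple AR(1) forecaster refit at every step and compares it with the two naive benchmarks named above; the simulated series and function names are illustrative.

```python
import math
import random


def fit_ar1(history):
    """OLS estimates (c, phi) for x_t = c + phi * x_{t-1} + e_t."""
    prev, nxt = history[:-1], history[1:]
    n = len(prev)
    mean_prev, mean_next = sum(prev) / n, sum(nxt) / n
    sxx = sum((p - mean_prev) ** 2 for p in prev)
    sxy = sum((p - mean_prev) * (q - mean_next) for p, q in zip(prev, nxt))
    phi = sxy / sxx if sxx > 0 else 0.0
    return mean_next - phi * mean_prev, phi


def forecast_ar1(history):
    c, phi = fit_ar1(history)
    return c + phi * history[-1]


def forecast_mean(history):
    """Naive benchmark: historical mean."""
    return sum(history) / len(history)


def forecast_random_walk(history):
    """Naive benchmark: last observed value."""
    return history[-1]


def expanding_window_backtest(series, forecaster, min_train):
    """One-step-ahead forecasts; each uses only data strictly before time t."""
    errors = []
    for t in range(min_train, len(series)):
        prediction = forecaster(series[:t])
        errors.append(series[t] - prediction)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    bias = sum(errors) / len(errors)
    return rmse, bias


rng = random.Random(3)
series = [2.0]
for _ in range(999):  # hypothetical AR(1) metric with phi = 0.5 and mean 2.0
    series.append(1.0 + 0.5 * series[-1] + rng.gauss(0.0, 1.0))

results = {
    name: expanding_window_backtest(series, forecaster, min_train=100)
    for name, forecaster in [("AR(1)", forecast_ar1),
                             ("historical mean", forecast_mean),
                             ("random walk", forecast_random_walk)]
}
for name, (rmse, bias) in results.items():
    print(f"{name:>16}: RMSE={rmse:.3f}  bias={bias:+.3f}")
```

Because the data here really are AR(1), the model should beat both benchmarks; on real sports data the margin over the naive forecasts is usually much smaller and is the number worth reporting.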

Step 5 --- Convert forecasts to betting decisions. Compare the model's prediction to the market line. Calculate the implied probability of covering the spread using the prediction and its standard error. Only bet when the edge exceeds a minimum threshold (typically 3-5 percentage points of implied probability) to account for model uncertainty and vig.
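
The conversion in Step 5 can be sketched as follows, assuming the forecast error of the final margin is approximately normal and that the spread carries a half point so pushes can be ignored. The predicted margin, forecast standard deviation, odds, and 3-point threshold are hypothetical inputs, and the function names are illustrative.

```python
from statistics import NormalDist


def cover_probability(predicted_margin, spread, forecast_sd):
    """P(actual margin + spread > 0) under a normal forecast distribution."""
    return 1.0 - NormalDist(mu=predicted_margin, sigma=forecast_sd).cdf(-spread)


def break_even_probability(american_odds):
    """Win probability needed to break even at the given American odds."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100)
    return 100 / (american_odds + 100)


def betting_edge(predicted_margin, spread, forecast_sd, american_odds=-110):
    """Model cover probability minus the break-even probability."""
    return (cover_probability(predicted_margin, spread, forecast_sd)
            - break_even_probability(american_odds))


# Hypothetical game: model predicts a 6.0-point win, market spread is -3.5,
# and the forecast standard deviation of the margin is assumed to be 13.5 points.
MIN_EDGE = 0.03
p_cover = cover_probability(6.0, -3.5, 13.5)
edge = betting_edge(6.0, -3.5, 13.5, american_odds=-110)
should_bet = edge >= MIN_EDGE
print(f"P(cover)={p_cover:.3f}  break-even={break_even_probability(-110):.3f}  "
      f"edge={edge:+.3f}  bet={should_bet}")
```

With these inputs the cover probability is about 0.57 against a break-even of about 0.524, an edge of roughly 5 percentage points. The result is sensitive to the assumed forecast standard deviation: understating it inflates the apparent edge.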

The core principle: Sports data evolves over time. Models that ignore temporal dynamics --- trends, momentum, mean reversion, and regime changes --- leave money on the table. Time series methods capture the dynamics that static models miss, but they require careful stationarity handling, proper validation, and honest assessment of forecast uncertainty.


Ready for Chapter 24? Self-Assessment Checklist

Before moving on to Chapter 24 ("Simulation and Monte Carlo Methods"), confirm that you can do the following:

  • [ ] Perform ADF and KPSS stationarity tests and correctly interpret their opposite null hypotheses (ADF: unit root; KPSS: stationarity)
  • [ ] Apply first differencing and seasonal differencing to make a series stationary
  • [ ] Read ACF and PACF plots to identify appropriate AR and MA orders
  • [ ] Fit ARIMA models using statsmodels and evaluate them via AIC, residual diagnostics, and out-of-sample RMSE
  • [ ] Estimate Ornstein-Uhlenbeck parameters and calculate the half-life of mean reversion for a sports metric
  • [ ] Apply CUSUM, PELT, or BOCPD to detect changepoints in a team performance series
  • [ ] Distinguish between genuine structural breaks and random variation using significance tests
  • [ ] Identify and test for common calendar effects (day-of-week, rest days, post-bye, etc.)
  • [ ] Conduct a proper rolling-window or expanding-window backtest for a time series betting model
  • [ ] Convert time series forecasts into betting edge estimates with appropriate uncertainty quantification

If you can check every box with confidence, you are well prepared for Chapter 24. If any items feel uncertain, revisit the relevant sections of Chapter 23 or work through the corresponding exercises before proceeding.