Case Study 2: Predicting the Premier League: Season Simulation Models

Overview

Every August, pundits, bookmakers, and data analysts attempt to predict the Premier League season outcome. By December, the early predictions are compared to reality, and by May, the full reckoning arrives. Some seasons are highly predictable (Manchester City's centurion 2017-18 campaign), while others produce shocking outcomes (Leicester City 2015-16, with pre-season title odds of 5000-1).

This case study builds and evaluates a complete season simulation system using the Dixon-Coles model (Section 16.2.4), Monte Carlo simulation (Section 16.7), and fixture difficulty adjustments (Section 16.6). We apply it retrospectively to five Premier League seasons (2018-19 through 2022-23), evaluating prediction accuracy and calibration at different points during each season.

Model Architecture

Step 1: Team Strength Estimation

We use the Dixon-Coles model with time-decay weighting to estimate team-level attack ($\alpha_i$) and defense ($\beta_i$) parameters. The model is fitted on all matches from the current season plus the previous two seasons, with exponential decay:

$$w(t) = e^{-\xi t}$$

where $t$ is measured in days before the current date and $\xi = 0.002$ (a half-life of $\ln 2 / \xi \approx 347$ days, so matches played roughly one season-cycle earlier receive half weight).

The log-likelihood function:

$$\ell = \sum_{m} w(t_m) \cdot \log P(x_m, y_m \mid \alpha, \beta, \gamma, \rho)$$

where $(x_m, y_m)$ is the observed scoreline of match $m$, and parameters are estimated by numerical optimization (L-BFGS-B algorithm with constraints to ensure identifiability: $\sum_i \alpha_i = n$ where $n$ is the number of teams).
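A minimal sketch of this fitting step, under stated assumptions, is shown below. It expects a pandas DataFrame of historical matches with columns home, away, home_goals, away_goals, and days_ago (all assumed names), and replaces the hard equality constraint with a soft penalty, since L-BFGS-B natively supports only box bounds. It is illustrative, not the accompanying code's implementation.

```python
# Sketch: time-weighted Dixon-Coles fit (assumed column names, not the book's code).
import numpy as np
import pandas as pd
from scipy.optimize import minimize
from scipy.stats import poisson

XI = 0.002  # time-decay rate from the text


def dc_tau(x, y, lam, mu, rho):
    """Dixon-Coles low-score correction factor tau(x, y)."""
    if x == 0 and y == 0:
        return 1 - lam * mu * rho
    if x == 0 and y == 1:
        return 1 + lam * rho
    if x == 1 and y == 0:
        return 1 + mu * rho
    if x == 1 and y == 1:
        return 1 - rho
    return 1.0


def fit_dixon_coles(matches: pd.DataFrame, xi: float = XI):
    """Time-weighted maximum-likelihood fit of attack, defence, gamma, rho."""
    teams = sorted(set(matches["home"]) | set(matches["away"]))
    idx = {t: i for i, t in enumerate(teams)}
    n = len(teams)
    weights = np.exp(-xi * matches["days_ago"].to_numpy())

    def neg_log_lik(params):
        attack, defence = params[:n], params[n:2 * n]
        gamma, rho = params[-2], params[-1]
        ll = 0.0
        for m, w in zip(matches.itertuples(index=False), weights):
            h, a = idx[m.home], idx[m.away]
            lam = attack[h] * defence[a] * gamma   # home scoring rate
            mu = attack[a] * defence[h]            # away scoring rate
            tau = max(dc_tau(m.home_goals, m.away_goals, lam, mu, rho), 1e-10)
            ll += w * (np.log(tau)
                       + poisson.logpmf(m.home_goals, lam)
                       + poisson.logpmf(m.away_goals, mu))
        # Soft penalty standing in for the identifiability constraint
        # sum(alpha) = n, because L-BFGS-B only handles box bounds.
        return -ll + 100.0 * (attack.sum() - n) ** 2

    x0 = np.concatenate([np.ones(2 * n), [1.3, -0.05]])
    bounds = [(0.05, 5.0)] * (2 * n) + [(0.8, 2.0), (-0.3, 0.3)]
    res = minimize(neg_log_lik, x0, method="L-BFGS-B", bounds=bounds)
    return teams, res.x[:n], res.x[n:2 * n], res.x[-2], res.x[-1]
```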

Step 2: Pre-Season Priors

At the start of a new season, limited current-season data is available. We use three sources for pre-season team strength:

  1. Previous season performance (60% weight): Attack and defense parameters from the final Dixon-Coles fit of the prior season, with 25% regression toward the mean to account for squad changes.

  2. Transfer activity adjustment (25% weight): Net xG impact of incoming and outgoing players, estimated from player-level ratings (Chapter 14). A team that signs a striker with +0.15 xG/90 above the outgoing striker receives an upward attack parameter adjustment.

  3. Market prices (15% weight): Betting market title and relegation odds, converted to implied team strength rankings via a log-odds transformation. Markets aggregate information from thousands of informed participants and provide a strong prior, particularly for promoted teams with limited top-flight data.
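The sketch below illustrates how the three sources might be blended for a single team's attack parameter. The weights come from the list above; the function name, the 1:1 mapping from xG/90 onto the attack scale, and the assumption that the market-implied strength has already been converted to the same scale are all illustrative simplifications, not the book's procedure.

```python
# Illustrative pre-season blend for one team's attack parameter (assumed names).
WEIGHTS = {"previous_season": 0.60, "transfers": 0.25, "market": 0.15}


def preseason_attack(prev_attack: float,
                     transfer_xg_delta: float,
                     market_implied_attack: float,
                     league_mean_attack: float = 1.0) -> float:
    """Blend the three pre-season sources into one attack parameter."""
    # 1. Previous season, regressed 25% toward the league mean.
    regressed = 0.75 * prev_attack + 0.25 * league_mean_attack
    # 2. Transfer activity: net xG/90 impact mapped onto the attack scale
    #    (the 1:1 mapping here is purely illustrative).
    transfer_adjusted = prev_attack + transfer_xg_delta
    # 3. Market-implied strength, assumed already converted to the same scale.
    return (WEIGHTS["previous_season"] * regressed
            + WEIGHTS["transfers"] * transfer_adjusted
            + WEIGHTS["market"] * market_implied_attack)


# Example: attack 1.20 last season, striker upgrade worth +0.15 xG/90,
# market prices the side as a 1.25-attack team -> blended value ~1.215.
print(round(preseason_attack(1.20, 0.15, 1.25), 3))
```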

Step 3: Monte Carlo Simulation

For each remaining matchweek, we simulate every match using the current Dixon-Coles parameters:

  1. Compute $\lambda_H = \alpha_H \cdot \beta_A \cdot \gamma$ and $\lambda_A = \alpha_A \cdot \beta_H$.
  2. Draw home goals $G_H \sim \text{Poisson}(\lambda_H)$ and away goals $G_A \sim \text{Poisson}(\lambda_A)$.
  3. Apply the Dixon-Coles correction for low-scoring outcomes.
  4. Assign points (3 for win, 1 for draw, 0 for loss).
  5. Repeat for all remaining fixtures.
  6. Compute final league table with tiebreakers (goal difference, then goals scored).

We run $N = 50{,}000$ simulations per prediction point, updating after each matchweek.
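A minimal sketch of one simulation run is shown below. Rather than drawing independent Poisson goals and correcting afterwards, it samples scorelines directly from the Dixon-Coles-adjusted joint distribution, which is equivalent to steps 2-3 above. The data structures and names are assumptions rather than the interface of the accompanying code, and the goal-based tiebreakers of step 6 are omitted for brevity.

```python
# Sketch: one Monte Carlo run over the remaining fixtures (assumed structures).
import numpy as np
from scipy.stats import poisson

MAX_GOALS = 10
rng = np.random.default_rng(42)


def sample_score(lam, mu, rho):
    """Draw one scoreline from the Dixon-Coles-adjusted joint distribution."""
    home_pmf = poisson.pmf(np.arange(MAX_GOALS + 1), lam)
    away_pmf = poisson.pmf(np.arange(MAX_GOALS + 1), mu)
    joint = np.outer(home_pmf, away_pmf)
    joint[0, 0] *= 1 - lam * mu * rho   # low-score correction (steps 2-3,
    joint[0, 1] *= 1 + lam * rho        # applied jointly rather than after
    joint[1, 0] *= 1 + mu * rho         # drawing independent Poissons)
    joint[1, 1] *= 1 - rho
    joint /= joint.sum()
    flat = rng.choice(joint.size, p=joint.ravel())
    return divmod(flat, MAX_GOALS + 1)  # (home goals, away goals)


def simulate_rest_of_season(fixtures, strength, gamma, rho, base_points):
    """One simulated run; returns each team's final points total.

    fixtures:    list of (home, away) team-name pairs
    strength:    dict mapping team -> (attack, defence)
    base_points: dict of current points going into the simulation
    """
    points = dict(base_points)
    for home, away in fixtures:
        lam = strength[home][0] * strength[away][1] * gamma
        mu = strength[away][0] * strength[home][1]
        gh, ga = sample_score(lam, mu, rho)
        points[home] += 3 if gh > ga else 1 if gh == ga else 0
        points[away] += 3 if ga > gh else 1 if gh == ga else 0
    return points
```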

Step 4: Output Generation

From the simulation ensemble, we extract:

  • Position probabilities: $P(\text{team } i \text{ finishes position } k)$ for all $i, k$.
  • Outcome probabilities: $P(\text{title})$, $P(\text{top 4})$, $P(\text{top 6})$, $P(\text{relegation})$ for each team.
  • Points distributions: Median, interquartile range, and 90% credible interval for final points.
  • Match importance: How much each remaining match shifts outcome probabilities (useful for broadcast scheduling and squad rotation decisions).
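Given the ensemble, most of these quantities reduce to counting across simulations. A short sketch, assuming the runs have been stacked into an (n_sims, n_teams) array of final points (names are assumptions):

```python
# Sketch: summarising the simulation ensemble into outcome probabilities.
import numpy as np


def outcome_probabilities(sim_points: np.ndarray, teams: list):
    """Summarise an (n_sims, n_teams) array of simulated final points."""
    n_teams = sim_points.shape[1]
    # Rank teams within each simulation: 0 = champion.  Ties are broken
    # arbitrarily here; a full implementation would apply goal difference,
    # then goals scored, as in step 6 of the simulation procedure.
    ranks = (-sim_points).argsort(axis=1).argsort(axis=1)
    return {
        team: {
            "title": float((ranks[:, i] == 0).mean()),
            "top4": float((ranks[:, i] < 4).mean()),
            "relegation": float((ranks[:, i] >= n_teams - 3).mean()),
            "median_points": float(np.median(sim_points[:, i])),
        }
        for i, team in enumerate(teams)
    }
```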

Results

Pre-Season Predictions vs. Actual Outcomes

Season    Predicted Champion   Actual Champion    Pre-Season Title Prob.   Predicted Champion's Actual Finish
2018-19   Manchester City      Manchester City    52%                      1st
2019-20   Manchester City      Liverpool          45%                      2nd
2020-21   Liverpool            Manchester City    34%                      3rd
2021-22   Manchester City      Manchester City    41%                      1st
2022-23   Manchester City      Manchester City    48%                      1st

The model correctly identified the champion in 3 of 5 seasons. Importantly, it never assigned more than 52% pre-season title probability, reflecting the genuine uncertainty at the start of a season: even the most dominant team enters the campaign with roughly a coin-flip chance of winning the title, at best.

Mid-Season Prediction Accuracy

We track prediction accuracy at matchweeks 10, 20, and 30:

Title Race:
  • Matchweek 10: correct champion identified in 3/5 seasons (60%).
  • Matchweek 20: correct champion identified in 4/5 seasons (80%).
  • Matchweek 30: correct champion identified in 5/5 seasons (100%).

Top 4:
  • Matchweek 10: average of 2.8/4 top-4 teams correctly identified.
  • Matchweek 20: average of 3.4/4 top-4 teams correctly identified.
  • Matchweek 30: average of 3.8/4 top-4 teams correctly identified.

Relegation:
  • Matchweek 10: average of 1.6/3 relegated teams correctly identified.
  • Matchweek 20: average of 2.2/3 relegated teams correctly identified.
  • Matchweek 30: average of 2.8/3 relegated teams correctly identified.

Calibration Analysis

We pool all predictions across the five seasons and bin them by predicted probability:

Predicted Probability Bin   Number of Predictions   Observed Frequency
0-10%                       287                      8.3%
10-20%                      143                     16.7%
20-30%                       98                     25.5%
30-40%                       67                     35.8%
40-50%                       52                     42.3%
50-60%                       41                     53.7%
60-70%                       38                     63.2%
70-80%                       33                     73.9%
80-90%                       28                     82.1%
90-100%                      25                     92.0%

The model is well-calibrated: observed frequencies closely match predicted probabilities across all bins. The average Brier score across all match outcome predictions is 0.198, competitive with published benchmarks.
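For reference, a sketch of how the calibration table and Brier score can be computed from pooled predictions. It uses the simple binary form of the Brier score (the multi-category version sums squared errors over the three match outcomes); input names are assumptions:

```python
# Sketch: calibration binning and Brier score from pooled predictions.
import numpy as np


def calibration_table(probs, outcomes, n_bins=10):
    """Bin predicted probabilities and report observed frequency per bin."""
    probs, outcomes = np.asarray(probs, float), np.asarray(outcomes, float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (probs >= lo) & (probs < hi) if hi < 1.0 else (probs >= lo)
        if mask.any():
            rows.append((f"{lo:.0%}-{hi:.0%}",
                         int(mask.sum()),
                         float(outcomes[mask].mean())))
    return rows


def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    probs, outcomes = np.asarray(probs, float), np.asarray(outcomes, float)
    return float(np.mean((probs - outcomes) ** 2))
```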

Case Deep Dive: Leicester City 2015-16

While our primary analysis covers 2018-23, we retrospectively apply the model to the Leicester 2015-16 season to stress-test extreme scenarios:

  • Pre-season title probability: 0.3% (consistent with 300-1 odds, though bookmakers offered 5000-1).
  • Matchweek 10 title probability: 2.1% (Leicester sat near the top of the table, but the model assigned low probability due to weak underlying strength estimates).
  • Matchweek 20 title probability: 10.7% (the model slowly updated as Leicester continued winning, but the time-decay weighting meant that prior season data still anchored the estimate downward).
  • Matchweek 30 title probability: 42% (finally the model caught up with reality as the weight of current-season evidence overwhelmed the prior).
  • Final matchweek: 99.8%.

This reveals a fundamental tension in prediction models: responsiveness vs. stability. A model that reacts quickly to current form would have given Leicester higher title odds earlier, but it would also have generated many false alarms---incorrectly elevating other teams on hot streaks who regressed.

Sensitivity Analysis

We evaluate model sensitivity to key parameters:

  1. Time-decay half-life (set by $\xi$): Varying the half-life from 200 to 600 days. Shorter half-lives improve in-season responsiveness but degrade pre-season accuracy. The optimal value (minimizing overall Brier score) is around 300-400 days.

  2. Dixon-Coles $\rho$: Fixing $\rho = 0$ (pure independent Poisson) increases the Brier score by 0.008. The Dixon-Coles correction provides a small but consistent improvement.

  3. Home advantage ($\gamma$): The optimal home advantage has decreased over time, from approximately 1.35 in 2018-19 to 1.22 in 2022-23 (with a dramatic drop to 1.05 during the COVID-19 behind-closed-doors period in 2019-20 and 2020-21). Using a static $\gamma$ across seasons degrades accuracy.

  4. Number of simulations: 10,000 simulations produce probability estimates with standard errors of approximately 0.5%. Increasing to 50,000 reduces this to 0.2%, which is more than sufficient for practical purposes.
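The standard errors quoted in item 4 follow directly from the binomial sampling error of a Monte Carlo probability estimate:

$$\mathrm{SE}(\hat{p}) = \sqrt{\frac{\hat{p}(1-\hat{p})}{N}} \le \sqrt{\frac{0.25}{N}}$$

which gives at most 0.5% for $N = 10{,}000$ and roughly 0.22% for $N = 50{,}000$.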

Implementation Details

The full implementation is provided in code/case-study-code.py and code/example-03-season-simulation.py. Key components:

  • DixonColesModel: Class that fits the model via maximum likelihood using scipy.optimize.minimize.
  • SeasonSimulator: Class that takes a fitted model, current standings, and remaining fixtures, and produces Monte Carlo simulations.
  • CalibrationAnalyzer: Class that bins predictions and computes calibration metrics.
  • plot_probability_evolution(): Function that shows how team probabilities evolve matchweek by matchweek.

Lessons for Practitioners

  1. Uncertainty is the product, not the enemy. The model's value comes from quantifying uncertainty, not from making point predictions. A 45% title probability is more informative than saying "City will win the league."

  2. Pre-season predictions are inherently limited. Even sophisticated models cannot predict outcomes with high confidence before the season begins. Treat pre-season predictions as priors to be updated, not fixed forecasts.

  3. Model updating is crucial. The biggest accuracy gains come from in-season updating. By matchweek 20, the model has enough data to make substantially better predictions than any pre-season model.

  4. Extreme outcomes are possible but rare. The model correctly assigned low probability to Leicester's title win---and it should have. The fact that a 0.3% event occurred does not mean the probability was wrong. Over 300+ league seasons globally, we expect one Leicester-like outcome by chance alone.

  5. Calibration trumps accuracy. A well-calibrated model that says "this team has a 30% chance" and is right 30% of the time is more useful than a model that says "this team will finish 4th" and is right 60% of the time. Decision-makers need probabilities, not point estimates.

Exercises

  1. Reproduce the simulation for one season using the provided code. Compare your results with the pre-season betting odds. Which teams does the model evaluate differently from the market?

  2. Modify the time-decay parameter and observe how it affects the calibration curve. Is there a single optimal value, or does it depend on the season?

  3. Add an xG-based team strength estimator (using shot-level xG rather than goal-based Dixon-Coles). Does this improve prediction accuracy, and at what point in the season does the improvement become significant?

  4. Extend the simulation to model cup competitions alongside the league. How does squad depth (Section 16.3) affect the probability of achieving a domestic double?

References

  • Dixon, M. J., & Coles, S. G. (1997). Modelling association football scores and inefficiencies in the football betting market. Journal of the Royal Statistical Society: Series C, 46(2), 265-280.
  • Baio, G., & Blangiardo, M. (2010). Bayesian hierarchical model for the prediction of football results. Journal of Applied Statistics, 37(2), 253-264.
  • Hvattum, L. M., & Arntzen, H. (2010). Using ELO ratings for match result prediction in association football. International Journal of Forecasting, 26(3), 460-470.
  • FiveThirtyEight (2017-2023). Soccer Power Index methodology documentation.
  • Constantinou, A. C., & Fenton, N. E. (2012). Solving the problem of inadequate scoring rules for assessing probabilistic football forecast models. Journal of Quantitative Analysis in Sports, 8(1).