Chapter 23 Quiz: Time Series Analysis for Betting

Instructions: Answer all 25 questions. This quiz is worth 100 points. You have 75 minutes. A calculator is permitted; no notes or internet access. For multiple choice, select the single best answer.


Section 1: Multiple Choice (10 questions, 3 points each = 30 points)

Question 1. A time series is said to be weakly stationary if:

(A) Its values never change over time

(B) Its mean, variance, and autocovariance are constant over time

(C) It has no autocorrelation at any lag

(D) It follows a normal distribution at every time point

Answer **(B) Its mean, variance, and autocovariance are constant over time.** Weak (or covariance) stationarity requires that the first two moments of the process do not depend on time: the mean $E[X_t] = \mu$ is constant, the variance $\text{Var}(X_t) = \sigma^2$ is constant, and the autocovariance $\text{Cov}(X_t, X_{t+h})$ depends only on the lag $h$, not on time $t$. The series can still fluctuate (ruling out A), have autocorrelation (ruling out C), and follow any distribution (ruling out D).

Question 2. The null hypothesis of the Augmented Dickey-Fuller (ADF) test is:

(A) The series is stationary

(B) The series has a unit root (non-stationary)

(C) The series has no autocorrelation

(D) The series has constant variance

Answer **(B) The series has a unit root (non-stationary).** The ADF test has $H_0$: the series contains a unit root (is non-stationary) versus $H_1$: the series is stationary. To conclude stationarity, you need to *reject* the null hypothesis. This is the opposite of the KPSS test, whose null hypothesis is that the series *is* stationary. Running both tests helps resolve ambiguous cases.
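
For illustration, both tests are available in statsmodels. A minimal sketch on a synthetic (hypothetical) margin series:

import numpy as np
from statsmodels.tsa.stattools import adfuller, kpss

# Hypothetical weekly scoring margins -- synthetic data for illustration only
rng = np.random.default_rng(42)
margins = rng.normal(loc=3.0, scale=7.0, size=100)

adf_stat, adf_pvalue, *_ = adfuller(margins)
kpss_stat, kpss_pvalue, *_ = kpss(margins, regression="c", nlags="auto")

# ADF: small p-value -> reject the unit root -> evidence FOR stationarity.
# KPSS: small p-value -> reject stationarity -> evidence AGAINST stationarity.
print(f"ADF p-value:  {adf_pvalue:.3f}")
print(f"KPSS p-value: {kpss_pvalue:.3f}")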

Question 3. In an ARIMA(2,1,1) model, what does the "1" in the middle position represent?

(A) One autoregressive lag

(B) One moving average term

(C) The series has been differenced once to achieve stationarity

(D) The model has one seasonal component

Answer **(C) The series has been differenced once to achieve stationarity.** In ARIMA(p,d,q), $p$ is the order of the autoregressive component, $d$ is the degree of differencing, and $q$ is the order of the moving average component. ARIMA(2,1,1) means: 2 AR lags, 1st-order differencing (the model is fit to $\Delta X_t = X_t - X_{t-1}$), and 1 MA term. Differencing removes trends and helps achieve stationarity.

Question 4. A team's weekly scoring margin shows a significant positive autocorrelation at lag 1 of 0.45. This most likely indicates:

(A) The team's performance is purely random from week to week

(B) Good performances tend to be followed by good performances (momentum/persistence)

(C) The team mean-reverts rapidly after each game

(D) The data contains a seasonal pattern with period 1

Answer **(B) Good performances tend to be followed by good performances (momentum/persistence).** Positive autocorrelation at lag 1 means the current value is positively correlated with the immediately preceding value. In a sports context, this indicates persistence or "momentum": a team that performed well last week tends to perform well this week, and vice versa. If performance were random, the autocorrelation would be near zero. Mean reversion would produce negative autocorrelation. And a "seasonal" pattern with period 1 is not a meaningful concept, since seasonality requires a period of at least 2.
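
A quick way to estimate the lag-1 autocorrelation in pandas (the margin values here are hypothetical):

import pandas as pd

margins = pd.Series([14, 7, -3, 10, 3, -7, 12, 8, -1, 6, 3, 9])  # hypothetical

lag1 = margins.autocorr(lag=1)  # Pearson correlation between X_t and X_{t-1}
print(f"Lag-1 autocorrelation: {lag1:.2f}")

# Rough white-noise significance band: +/- 2/sqrt(n)
print(f"Approx. 95% band: +/- {2 / len(margins) ** 0.5:.2f}")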

Question 5. The half-life of a mean-reverting process with speed parameter $\theta = 0.10$ is approximately:

(A) 2.3 time periods

(B) 4.6 time periods

(C) 6.9 time periods

(D) 10.0 time periods

Answer **(C) 6.9 time periods.** The half-life of an Ornstein-Uhlenbeck process is $t_{1/2} = \ln(2) / \theta = 0.6931 / 0.10 = 6.931 \approx 6.9$ time periods. This means that a deviation from the long-run mean is expected to be reduced by half after approximately 6.9 time periods. Larger $\theta$ produces faster reversion and shorter half-lives.
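
The arithmetic, plus the mapping from a discrete AR(1) estimate to $\theta$ (a sketch; the AR(1) coefficient below is hypothetical, and the mapping assumes one observation per unit of time):

import numpy as np

theta = 0.10
print(f"Half-life: {np.log(2) / theta:.1f} periods")  # ~6.9

# If a discrete-time AR(1) fit gives coefficient phi, the implied
# continuous-time reversion speed is theta = -ln(phi) per period.
phi = 0.905  # hypothetical AR(1) coefficient
print(f"Implied theta: {-np.log(phi):.3f}")  # ~0.10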

Question 6. Which changepoint detection algorithm is specifically designed for online (real-time) detection rather than retrospective analysis?

(A) PELT (Pruned Exact Linear Time)

(B) Binary segmentation

(C) BOCPD (Bayesian Online Changepoint Detection)

(D) CUSUM (Cumulative Sum)

Answer **(C) BOCPD (Bayesian Online Changepoint Detection).** BOCPD, introduced by Adams and MacKay (2007), is specifically designed for online detection. It maintains a posterior distribution over the "run length" (time since the last changepoint) and updates this distribution with each new observation. PELT and binary segmentation are retrospective methods that operate on a complete dataset. While CUSUM can be applied in an online fashion, BOCPD provides a principled Bayesian framework with full posterior probabilities for changepoints at each time step.

Question 7. An NBA team goes 8-2 in their first 10 games, then 4-6 in their next 10 games. Which of the following is the BEST interpretation from a time series perspective?

(A) The team has definitely declined in quality

(B) This pattern is consistent with random variation around a true 60% win rate

(C) This is clear evidence of mean reversion

(D) A changepoint has occurred between games 10 and 11

Answer **(B) This pattern is consistent with random variation around a true 60% win rate.** With only 10 games per segment, the observed win rates (80% and 40%) are well within the range of random variation for a true 60% win rate. The standard error of a proportion over 10 trials at p=0.6 is $\sqrt{0.6 \times 0.4 / 10} = 0.155$. The 95% confidence interval for 10 games spans roughly 30% to 90%. A formal changepoint test would not reject the null of no change with this sample size. Jumping to conclusions about decline, mean reversion, or changepoints from 20 games is statistically premature.
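
The numbers in this answer can be checked directly with a few lines:

import math

p, n = 0.6, 10
se = math.sqrt(p * (1 - p) / n)
print(f"SE of a 10-game win rate at p=0.6: {se:.3f}")            # ~0.155
print(f"Approx. 95% interval: {p - 1.96 * se:.2f} to {p + 1.96 * se:.2f}")

# Probability of an 8-2 start (or better) when the true rate is 60%
p_8_plus = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(8, n + 1))
print(f"P(8+ wins in 10): {p_8_plus:.3f}")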

Question 8. In the context of time series analysis for betting, "closing line value" (CLV) measured over time is best analyzed using:

(A) A single-point hypothesis test

(B) Time series methods to detect whether the edge is persistent, trending, or mean-reverting

(C) A simple win-loss record

(D) Cross-sectional regression with no temporal component

Answer **(B) Time series methods to detect whether the edge is persistent, trending, or mean-reverting.** CLV measured over time forms a time series that can exhibit trends (as markets adapt to a bettor's edge), mean reversion, autocorrelation (streaks of good or bad value), and changepoints (when a model stops working or a market corrects). Analyzing CLV as a time series provides richer information than a single aggregate test, revealing whether the edge is stable, growing, or decaying.

Question 9. Which of the following is NOT a valid approach to handling non-stationarity in a sports time series?

(A) First differencing to remove a linear trend

(B) Seasonal differencing to remove periodic patterns

(C) Log transformation to stabilize variance

(D) Removing all data points that are more than two standard deviations from the mean

Answer **(D) Removing all data points that are more than two standard deviations from the mean.** Removing outliers based on a global threshold does not address non-stationarity; it simply discards data. If the mean is shifting over time, points from an earlier regime might appear as "outliers" when measured against the overall mean, but they were perfectly normal for their time period. Options A, B, and C are all standard, principled approaches to handling non-stationarity: differencing removes trends, seasonal differencing removes periodic components, and log transforms stabilize changing variance.

Question 10. A SARIMA(1,0,0)(1,0,0)[82] model for NBA team performance includes a seasonal AR term at lag 82. This term captures:

(A) The effect of playing the same team again in the same season

(B) The tendency for performance at the same point in consecutive seasons to be correlated

(C) Weekly patterns within a single season

(D) The influence of the All-Star break on performance

Answer **(B) The tendency for performance at the same point in consecutive seasons to be correlated.** With a seasonal period of 82 (the length of an NBA season), the seasonal AR(1) term at lag 82 captures the correlation between a team's performance at game $t$ and their performance at game $t-82$ --- that is, the same point in the previous season. This could reflect schedule difficulty patterns, player development arcs, or other factors that recur at the same stage of each season. It does not capture within-season weekly patterns (those would be at much shorter lags) or specific events like the All-Star break.

Section 2: True/False (5 questions, 3 points each = 15 points)

Question 11. True or False: If a time series has a significant trend, an ARMA model (without differencing) fitted to the raw data will produce biased and unreliable forecasts.

Answer **True.** ARMA models assume stationarity. If the data has a trend, the mean is not constant, violating this assumption. An ARMA model fitted to trended data will produce forecasts that revert to the sample mean rather than following the trend, leading to systematic under- or over-prediction. The correct approach is to difference the series to remove the trend (producing an ARIMA model) or to include a deterministic trend component.

Question 12. True or False: Mean reversion in sports statistics implies that after a period of unusually good performance, a team's future performance will be worse than their historical average.

Answer **False.** Mean reversion implies that future performance will be *closer to* the historical average than the current extreme, not that it will be *below* the average. If a team is performing at 120% of their long-run average, mean reversion predicts regression *toward* 100%, not a swing *below* 100%. Confusing "regression toward the mean" with "regression below the mean" is a common error and is related to the gambler's fallacy.

Question 13. True or False: The PELT algorithm for changepoint detection guarantees finding the globally optimal set of changepoints under its cost function.

Answer **True.** PELT (Pruned Exact Linear Time) provides an exact solution to the changepoint detection problem by pruning the search space without sacrificing optimality. Unlike binary segmentation, which is a greedy heuristic that may miss optimal solutions, PELT examines all possible segmentations (in a computationally efficient manner through pruning) and finds the global minimum of the penalized cost function. This makes PELT both exact and computationally efficient (often linear in the number of observations).
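
For reference, PELT is implemented in the third-party `ruptures` package. A minimal sketch on synthetic data with a single mean shift (package availability and penalty value are assumptions):

import numpy as np
import ruptures as rpt

# Synthetic series: mean shifts from 0 to 3 at index 50
rng = np.random.default_rng(0)
signal = np.concatenate([rng.normal(0, 1, 50), rng.normal(3, 1, 50)])

algo = rpt.Pelt(model="l2").fit(signal)
breakpoints = algo.predict(pen=10)  # segment end indices, e.g. [50, 100]
print(breakpoints)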

Question 14. True or False: If the ACF of a time series decays slowly and the PACF has a sharp cutoff after lag 2, the appropriate model is MA(2).

Answer **False.** The pattern described --- slowly decaying ACF with PACF cutoff after lag 2 --- is the signature of an AR(2) model, not an MA(2). For an MA(q) model, the ACF has a sharp cutoff after lag $q$ while the PACF decays slowly. The descriptions are reversed: AR models have slowly decaying ACF and sharp PACF cutoff; MA models have sharp ACF cutoff and slowly decaying PACF.
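
The two signatures can be verified by simulation (a sketch using statsmodels; the AR(2) coefficients are arbitrary):

import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.tsa.stattools import acf, pacf

# AR(2): X_t = 0.5 X_{t-1} + 0.3 X_{t-2} + e_t (statsmodels sign convention)
ar = np.array([1, -0.5, -0.3])
ma = np.array([1])
sample = ArmaProcess(ar, ma).generate_sample(nsample=500)

print("ACF  lags 1-5:", np.round(acf(sample, nlags=5)[1:], 2))   # decays slowly
print("PACF lags 1-5:", np.round(pacf(sample, nlags=5)[1:], 2))  # cuts off after lag 2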

Question 15. True or False: A calendar effect (e.g., poor NFL team performance on Thursday Night Football) that has been widely publicized and known to bettors for several years is unlikely to still offer a profitable betting edge.

Answer **True.** The efficient market hypothesis applied to sports betting suggests that once an effect is widely known, bettors will exploit it until the betting lines adjust to account for it. If sportsbooks know that teams perform poorly on Thursday Night Football, they will adjust their lines accordingly (e.g., shading the total downward or the spread toward the underdog), eliminating the edge for bettors. Well-documented calendar effects are typically already priced into the market. Any remaining edge would need to be in *how much* the market adjusts, not in *whether* it adjusts.

Section 3: Fill in the Blank (3 questions, 4 points each = 12 points)

Question 16. In an Ornstein-Uhlenbeck process, the parameter $\theta$ controls the ______ of mean reversion. A larger value of $\theta$ means the process reverts to its long-run mean ______ quickly.

Answer **speed** (or rate); **more**. The OU process is defined by $dX_t = \theta(\mu - X_t)dt + \sigma dW_t$, where $\theta$ is the speed of mean reversion, $\mu$ is the long-run mean, and $\sigma$ is the volatility. When $\theta$ is large, the restoring force toward $\mu$ is strong, and deviations are corrected quickly (short half-life). When $\theta$ is small, the process wanders far from the mean before being pulled back (long half-life).
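
A minimal Euler-Maruyama simulation of the process (parameter values are illustrative):

import numpy as np

def simulate_ou(theta=0.10, mu=0.0, sigma=1.0, x0=5.0, n_steps=100, dt=1.0, seed=0):
    """Simulate dX = theta*(mu - X) dt + sigma dW with an Euler-Maruyama scheme."""
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps)
    x[0] = x0
    for t in range(1, n_steps):
        drift = theta * (mu - x[t - 1]) * dt
        shock = sigma * np.sqrt(dt) * rng.standard_normal()
        x[t] = x[t - 1] + drift + shock
    return x

path = simulate_ou()
print(f"x[0] = {path[0]:.2f}, x[7] = {path[7]:.2f}")  # deviation roughly halved after ~6.9 steps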

Question 17. The Akaike Information Criterion (AIC) balances model ______ (measured by the likelihood function) against model ______ (measured by the number of parameters). Lower AIC values indicate a ______ model.

Answer **fit** (or goodness of fit); **complexity**; **better** (or preferred). AIC = $2k - 2\ln(\hat{L})$, where $k$ is the number of parameters and $\hat{L}$ is the maximized likelihood. The first term penalizes complexity (more parameters increase AIC) while the second rewards fit (higher likelihood decreases AIC). The model with the lowest AIC achieves the best trade-off between fitting the data well and keeping the model parsimonious, reducing the risk of overfitting.
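
A sketch of an AIC-based order search with statsmodels (the series here is synthetic, so the search should favor a small model):

import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(1)
series = pd.Series(rng.normal(size=200))  # synthetic stationary series

aic_by_order = {}
for p in range(3):
    for q in range(3):
        fitted = ARIMA(series, order=(p, 0, q)).fit()
        aic_by_order[(p, 0, q)] = fitted.aic

best = min(aic_by_order, key=aic_by_order.get)
print(f"Best order by AIC: {best}, AIC = {aic_by_order[best]:.1f}")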

Question 18. In CUSUM changepoint detection, the cumulative sum statistic is computed as $S_k = \sum_{i=1}^{k}(x_i - \bar{x})$. The estimated changepoint is the value of $k$ at which $|S_k|$ is ______. Before the changepoint, the series mean is ______ the overall mean, and after the changepoint, the series mean is ______ the overall mean.

Answer **maximized** (or at its maximum); **above** (or below); **below** (or above). The CUSUM reaches its maximum absolute value at the point where the series transitions from one regime to another. If $S_k$ reaches a positive maximum, the early observations are above the overall mean and the later observations are below it (or vice versa for a negative maximum). The direction of the CUSUM peak tells you the direction of the shift.

Section 4: Short Answer (3 questions, 5 points each = 15 points)

Question 19. Explain the concept of "differencing" a time series. Why is it commonly used before fitting an ARMA model? Provide the mathematical definition of first-order differencing and give a sports example where differencing would be necessary.

Answer **Differencing** transforms a non-stationary time series into a stationary one by computing the change between consecutive observations rather than the values themselves. First-order differencing is defined as: $$\Delta X_t = X_t - X_{t-1}$$ This removes a linear trend because if $X_t = a + bt + \varepsilon_t$ (a linear trend plus noise), then $\Delta X_t = b + \varepsilon_t - \varepsilon_{t-1}$, which has a constant mean $b$. Differencing is necessary before fitting ARMA models because ARMA assumes stationarity. If the raw series has a trend or unit root, the ARMA parameter estimates will be biased and forecasts will be unreliable. **Sports example:** An NBA team's cumulative win total over a season is non-stationary (it can only increase and has a clear upward trend). Differencing this series produces the game-by-game results (1 for a win, 0 for a loss), which is stationary with a mean equal to the team's true win rate.
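
The sports example in miniature (hypothetical results):

import pandas as pd

results = pd.Series([1, 1, 0, 1, 0, 1, 1, 1, 0, 1])  # 1 = win, 0 = loss (hypothetical)
cumulative_wins = results.cumsum()           # non-stationary: trends upward all season

recovered = cumulative_wins.diff().dropna()  # first difference undoes the cumsum
print(recovered.tolist())  # the stationary game-by-game results (from game 2 on)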

Question 20. A bettor observes that an NFL team has gone 7-1 ATS (against the spread) in their last 8 games. Using time series reasoning, explain two different possible interpretations of this streak and describe how you would distinguish between them using the tools from Chapter 23.

Answer **Interpretation 1: Random variation.** ATS results are approximately a coin flip (close to 50/50 for most teams over long periods). An 8-game window is very small. The probability of going 7-1 or better in 8 games by chance alone is $\binom{8}{7}(0.5)^8 + \binom{8}{8}(0.5)^8 = 9/256 \approx 3.5\%$. While unlikely for a specific prediction, across all NFL teams over a season, several such streaks are expected. **Interpretation 2: Genuine changepoint.** The team may have undergone a real improvement (new player, scheme change, returning from injury) that the market has not fully adjusted for, creating a genuine ATS edge. **How to distinguish:** Use changepoint detection (BOCPD or CUSUM) on a longer history of the team's ATS margins (not just wins/losses) to test whether the recent period shows a statistically significant shift in the mean margin. Also test the ATS margin against the spread for autocorrelation --- if there is significant positive autocorrelation, the streak may reflect persistence. Finally, check the *closing line movement* during this period: if the lines have been moving toward the team (reflecting market learning), the edge is likely shrinking and may be transient.
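
The streak probability from Interpretation 1, computed exactly (the 32-team extrapolation considers a single 8-game window per team):

from math import comb

p_streak = (comb(8, 7) + comb(8, 8)) / 2**8   # 7-1 or 8-0 ATS in 8 games at 50%
print(f"P(7-1 or better) = {p_streak:.4f}")    # 9/256 ~ 0.035

# Expected count of such streaks across 32 teams in a single 8-game window
print(f"Expected across 32 teams: {32 * p_streak:.2f}")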

Question 21. Describe how seasonal ARIMA (SARIMA) differs from standard ARIMA. Explain what the seasonal period represents in a sports context and give two examples of sports data where SARIMA would be more appropriate than ARIMA.

Answer **SARIMA** extends ARIMA by adding seasonal autoregressive, differencing, and moving average components that operate at a seasonal lag. The model is denoted SARIMA(p,d,q)(P,D,Q)[s], where the lowercase terms are the non-seasonal components, the uppercase terms are the seasonal components, and $s$ is the seasonal period. The seasonal period $s$ represents the number of observations after which the pattern repeats. In sports, this is typically the length of a full season: 82 games for the NBA, 162 for MLB, 17 for the NFL. **Example 1: MLB pitcher ERA across multiple seasons.** A pitcher's ERA might show a regular pattern within each season (stronger early, fatigued late) that repeats annually. SARIMA with $s$ equal to the number of observations per season (for example, $s \approx 32$ starts for a starting pitcher) could capture this recurring fatigue pattern. **Example 2: NFL weekly TV ratings or handle data across multiple seasons.** Betting handle on NFL games follows a weekly pattern within a season (higher on Sundays, different in prime-time slots) that repeats across seasons. SARIMA with $s = 18$ (the number of weeks in the regular season) would capture both within-season dynamics and season-to-season recurrence. Standard ARIMA would miss these repeating seasonal patterns, leading to systematically biased forecasts at certain points in the season.
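
A fitting sketch with statsmodels' SARIMAX (synthetic data standing in for a weekly NFL series; the 18-week seasonal period and the simulated seasonal shape are assumptions):

import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Synthetic weekly series spanning four 18-week seasons
rng = np.random.default_rng(7)
n_seasons, s = 4, 18
seasonal_shape = np.tile(np.sin(np.linspace(0, np.pi, s)), n_seasons)
series = pd.Series(10 + 3 * seasonal_shape + rng.normal(scale=0.5, size=n_seasons * s))

model = SARIMAX(series, order=(1, 0, 0), seasonal_order=(1, 0, 0, s))
fitted = model.fit(disp=False)
print(fitted.summary().tables[1])  # non-seasonal and seasonal AR coefficient estimates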

Section 5: Code Analysis (2 questions, 6 points each = 12 points)

Question 22. Examine the following Python code for fitting an ARIMA model:

import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

def forecast_margin(series: pd.Series, order: tuple = (1, 1, 1)) -> float:
    model = ARIMA(series, order=order)
    fitted = model.fit()
    forecast = fitted.forecast(steps=1)
    return forecast.values[0]

# Usage
margins = pd.Series([3, 7, -2, 5, 10, 4, 8, -1, 6, 3, 12, 7])
next_margin = forecast_margin(margins)

(a) The code runs without error but has a significant methodological problem for a betting application. Identify the problem.

(b) Write corrected code that addresses this problem.

(c) What additional validation should be performed on the model before using its forecasts for betting?

Answer **(a)** The code fits the model to the *entire* series and then forecasts the next value. In a real betting application, this constitutes **look-ahead bias** if the model's hyperparameters (the order (1,1,1)) were selected using this same data. More critically, the code does not validate whether (1,1,1) is the appropriate model order for this series, does not check residual diagnostics, and does not assess whether the series is actually non-stationary (justifying $d=1$). With only 12 observations, an ARIMA(1,1,1) may be severely overfit.

**(b)** Corrected code:
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.stattools import adfuller
from statsmodels.stats.diagnostic import acorr_ljungbox

def forecast_margin(series: pd.Series, order: tuple = (1, 0, 0)) -> dict:
    # Test stationarity to determine differencing
    adf_result = adfuller(series)
    d = 0 if adf_result[1] < 0.05 else 1

    actual_order = (order[0], d, order[2])
    model = ARIMA(series, order=actual_order)
    fitted = model.fit()

    # Check residual autocorrelation
    lb_test = acorr_ljungbox(fitted.resid, lags=[5], return_df=True)

    forecast = fitted.forecast(steps=1)
    conf_int = fitted.get_forecast(steps=1).conf_int()

    return {
        "forecast": forecast.values[0],
        "conf_int": conf_int.values[0],
        "aic": fitted.aic,
        "ljung_box_pvalue": lb_test["lb_pvalue"].values[0],
        "residuals_ok": lb_test["lb_pvalue"].values[0] > 0.05,
    }
**(c)** Additional validation: (1) Ljung-Box test on residuals to confirm no remaining autocorrelation. (2) Normality test on residuals (Shapiro-Wilk or Jarque-Bera). (3) Out-of-sample validation using a rolling or expanding window backtest to assess true predictive accuracy. (4) Comparison against a naive benchmark (e.g., predicting the mean or the last observation). (5) AIC/BIC comparison across multiple model orders to confirm the chosen order is optimal.
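
A minimal sketch of the expanding-window backtest in item (3), with the benchmark from item (4), assuming the corrected `forecast_margin` from part (b) is in scope:

import numpy as np
import pandas as pd

def rolling_backtest(series: pd.Series, min_train: int = 30) -> float:
    """One-step-ahead expanding-window backtest; returns out-of-sample RMSE."""
    errors = []
    for t in range(min_train, len(series)):
        train = series.iloc[:t]                  # fit only on data before game t
        pred = forecast_margin(train)["forecast"]
        errors.append(series.iloc[t] - pred)
    return float(np.sqrt(np.mean(np.square(errors))))

def naive_rmse(series: pd.Series, min_train: int = 30) -> float:
    """Benchmark: predict each game with the expanding historical mean."""
    errors = [series.iloc[t] - series.iloc[:t].mean()
              for t in range(min_train, len(series))]
    return float(np.sqrt(np.mean(np.square(errors))))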

Question 23. Examine the following changepoint detection code:

import numpy as np

def cusum_changepoint(data: list[float]) -> dict:
    """Detect a single changepoint using CUSUM."""
    n = len(data)
    mean = np.mean(data)
    cusum = np.cumsum(np.array(data) - mean)
    changepoint = np.argmax(np.abs(cusum))

    before = data[:changepoint + 1]
    after = data[changepoint + 1:]

    return {
        "changepoint_index": changepoint,
        "mean_before": np.mean(before),
        "mean_after": np.mean(after),
        "cusum_values": cusum.tolist(),
    }

(a) Trace through the execution with data = [10, 12, 8, 11, 3, 2, 5, 1]. Show the CUSUM values and identify the changepoint.

(b) What happens if data has no actual changepoint (all values drawn from the same distribution)? Will the function still report one?

(c) Suggest an improvement to handle the case where no changepoint exists.

Answer **(a)** Mean = (10+12+8+11+3+2+5+1)/8 = 52/8 = 6.5. Deviations from the mean: [3.5, 5.5, 1.5, 4.5, -3.5, -4.5, -1.5, -5.5]. CUSUM values (cumulative sum of deviations):

- $S_1 = 3.5$
- $S_2 = 3.5 + 5.5 = 9.0$
- $S_3 = 9.0 + 1.5 = 10.5$
- $S_4 = 10.5 + 4.5 = 15.0$
- $S_5 = 15.0 - 3.5 = 11.5$
- $S_6 = 11.5 - 4.5 = 7.0$
- $S_7 = 7.0 - 1.5 = 5.5$
- $S_8 = 5.5 - 5.5 = 0.0$

Maximum |CUSUM| = 15.0 at index 3 (the 4th observation), so the changepoint is at index 3. Before: [10, 12, 8, 11], mean = 10.25. After: [3, 2, 5, 1], mean = 2.75.

**(b)** Yes, the function will always report a changepoint, even if none exists. The CUSUM will still have a maximum absolute value somewhere, and `np.argmax` will return that index. This is a critical flaw: the function has no mechanism to assess whether the detected changepoint is statistically significant or just noise.

**(c)** Add a significance test. One approach is a permutation test: shuffle the data many times, compute the max |CUSUM| for each shuffle, and compare the observed max |CUSUM| to this null distribution. If the observed value exceeds the 95th percentile, the changepoint is significant:
def cusum_with_significance(data, n_permutations=1000, alpha=0.05):
    result = cusum_changepoint(data)
    observed_max = max(abs(c) for c in result["cusum_values"])

    null_maxima = []
    for _ in range(n_permutations):
        shuffled = np.random.permutation(data)
        cusum = np.cumsum(shuffled - np.mean(shuffled))
        null_maxima.append(np.max(np.abs(cusum)))

    p_value = np.mean(np.array(null_maxima) >= observed_max)
    result["p_value"] = p_value
    result["is_significant"] = p_value < alpha
    return result

Section 6: Applied Problems (2 questions, 8 points each = 16 points)

Question 24. An NFL team's game-by-game point differentials for the first 12 weeks of the season are:

[14, 7, -3, 10, 3, -7, -14, -10, -3, -7, -17, -6]

(a) (2 points) Compute the sample mean and standard deviation of the full 12-game series.

(b) (2 points) The team's pre-season projected point differential was +2.5 per game. Using a one-sample t-test, determine whether the observed performance is significantly different from the projection at the 5% level.

(c) (2 points) Apply CUSUM analysis to identify if and when a performance changepoint occurred. Report the CUSUM values and the estimated changepoint.

(d) (2 points) A sportsbook currently has the team as a 3.5-point underdog in Week 13. Based on your time series analysis, do you see value in betting the team or betting against them? Justify your answer using both the overall mean and the post-changepoint mean (if one exists).

Answer **(a)** Sum = 14+7-3+10+3-7-14-10-3-7-17-6 = -33. Mean = -33/12 = **-2.75**. Squared deviations: $(16.75)^2 + (9.75)^2 + (-0.25)^2 + (12.75)^2 + (5.75)^2 + (-4.25)^2 + (-11.25)^2 + (-7.25)^2 + (-0.25)^2 + (-4.25)^2 + (-14.25)^2 + (-3.25)^2 = 280.56 + 95.06 + 0.06 + 162.56 + 33.06 + 18.06 + 126.56 + 52.56 + 0.06 + 18.06 + 203.06 + 10.56 = 1000.25$. Sample variance = 1000.25/11 = 90.93. Standard deviation = **9.54**.

**(b)** $t = (\bar{x} - \mu_0) / (s/\sqrt{n}) = (-2.75 - 2.5) / (9.54/\sqrt{12}) = -5.25 / 2.754 = -1.906$. With 11 degrees of freedom, the two-tailed critical value at the 5% level is approximately 2.201. Since $|-1.906| < 2.201$, we **fail to reject** $H_0$: the observed performance is not significantly different from the pre-season projection at the 5% level (p-value approximately 0.08).

**(c)** CUSUM (deviations from the overall mean of -2.75):

- $S_1 = 16.75$
- $S_2 = 16.75 + 9.75 = 26.50$
- $S_3 = 26.50 - 0.25 = 26.25$
- $S_4 = 26.25 + 12.75 = 39.00$
- $S_5 = 39.00 + 5.75 = 44.75$
- $S_6 = 44.75 - 4.25 = 40.50$
- $S_7 = 40.50 - 11.25 = 29.25$
- $S_8 = 29.25 - 7.25 = 22.00$
- $S_9 = 22.00 - 0.25 = 21.75$
- $S_{10} = 21.75 - 4.25 = 17.50$
- $S_{11} = 17.50 - 14.25 = 3.25$
- $S_{12} = 3.25 - 3.25 = 0.00$

Maximum |CUSUM| = 44.75 at index 4 (Week 5). **Estimated changepoint: after Week 5.** Weeks 1-5 mean: (14+7-3+10+3)/5 = 31/5 = **+6.2**. Weeks 6-12 mean: (-7-14-10-3-7-17-6)/7 = -64/7 = **-9.14**.

**(d)** The overall mean of -2.75 is close to the line of -3.5, suggesting little value based on the full-season average. However, the changepoint analysis reveals a dramatic decline: the post-changepoint mean is **-9.14**, meaning the team has recently performed roughly 9 points below average. If this post-changepoint regime reflects the team's current true level and the line is only -3.5, there may be value in **betting against the team** (taking the opponent -3.5). The market line appears to weight the strong early-season performance too heavily. Caution is warranted, however: 7 post-changepoint games is still a small sample, and the changepoint's significance should be confirmed with a permutation test.
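
The arithmetic in (a)-(c) can be reproduced in a few lines (a sketch using numpy and scipy):

import numpy as np
from scipy import stats

diffs = np.array([14, 7, -3, 10, 3, -7, -14, -10, -3, -7, -17, -6])

print(f"Mean: {diffs.mean():.2f}, SD: {diffs.std(ddof=1):.2f}")  # -2.75, 9.54

t_stat, p_value = stats.ttest_1samp(diffs, popmean=2.5)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")                    # ~ -1.906, ~0.083

cusum = np.cumsum(diffs - diffs.mean())
cp = int(np.argmax(np.abs(cusum)))                               # index 4 -> Week 5
print(f"Changepoint after Week {cp + 1}")
print(f"Mean before: {diffs[:cp + 1].mean():.2f}, after: {diffs[cp + 1:].mean():.2f}")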

Question 25. You are building a time series model to predict NBA team scoring for the purpose of betting game totals (over/under). You have three seasons of game-by-game data for all 30 teams.

(a) (2 points) Would you build one model per team or one model for the entire league? Justify your choice, considering sample size, parameter stability, and computational cost.

(b) (2 points) You fit an ARIMA(1,0,1) model to a single team's points-per-game series. The fitted AR(1) coefficient is 0.35 and the MA(1) coefficient is -0.20. Interpret these coefficients in the context of NBA scoring.

(c) (2 points) Your model's one-step-ahead RMSE on a held-out test set is 10.2 points. The sportsbook's game total has a typical margin of 2-3 points between the over and under lines. Is your model precise enough to be useful for betting? Explain.

(d) (2 points) Propose a strategy that combines your time series model with the sportsbook's game total to identify betting opportunities. Be specific about the decision rule and any thresholds you would use.

Answer **(a)** **One model per team** is preferred despite the smaller sample size (approximately 246 games over 3 seasons per team). NBA teams have fundamentally different offensive and defensive characteristics, pace of play, and coaching philosophies; a single league-wide model would blur these differences and produce less accurate team-specific forecasts. The sample size of ~246 games per team is adequate for a low-order ARIMA model (which has only 2-5 parameters). If sample size is a concern, a hierarchical approach that shares information across teams (using league-wide priors for parameters) could combine the benefits of both approaches.

**(b)** The AR(1) coefficient of 0.35 means that a team's scoring has moderate persistence: if a team scored 10 points above their mean in one game, the model expects them to score about 3.5 points above their mean in the next game. The MA(1) coefficient of -0.20 means that a positive random shock in one game leads to a slight expected decrease in the next game, partially offsetting the AR persistence. The net effect is moderate positive autocorrelation, suggesting some game-to-game momentum in scoring but with partial mean reversion of the noise component.

**(c)** An RMSE of 10.2 points means the model's predictions are, on average, off by about 10 points for a single team's scoring. For a game total (which sums two teams' scoring), the combined RMSE would be approximately $\sqrt{10.2^2 + 10.2^2} = 14.4$ points (assuming independence). This is **far too imprecise** for a market where the edge is 2-3 points. However, the model may still be useful if it identifies *systematic biases* in the market: even if individual predictions are noisy, a model that consistently sees totals 3+ points away from the market over many games can produce a profitable average edge.

**(d)** Strategy, expressed as a decision rule (see the sketch below): (1) For each game, use the ARIMA model to predict each team's expected score and sum them for a predicted game total. (2) Calculate the difference between the model's predicted total and the sportsbook's posted total. (3) Bet the over when the model total exceeds the market total by at least 5 points (approximately 0.5 standard deviations of the model's error, chosen to filter out noise); bet the under when the model total is 5+ points below the market. (4) Size bets proportionally to the edge (larger bets for larger discrepancies). (5) Track CLV to validate that the model's signals are consistently on the right side of closing line movement. Adjust the threshold based on historical ROI: if 5 points is too tight (low ROI), increase it; if too few bets qualify, decrease it slightly.
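
The decision rule in (d), expressed as code (a sketch; the function name and the 5-point threshold are illustrative, not prescriptive):

def totals_signal(model_total: float, market_total: float, threshold: float = 5.0) -> str:
    """Bet only when the model disagrees with the market by at least `threshold` points."""
    edge = model_total - market_total
    if edge >= threshold:
        return "BET OVER"
    if edge <= -threshold:
        return "BET UNDER"
    return "NO BET"

print(totals_signal(229.5, 221.0))  # edge = +8.5 -> BET OVER
print(totals_signal(218.0, 221.0))  # edge = -3.0 -> NO BET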

Scoring Summary

| Section | Questions | Points Each | Total |
|---------|-----------|-------------|-------|
| 1. Multiple Choice | 10 | 3 | 30 |
| 2. True/False | 5 | 3 | 15 |
| 3. Fill in the Blank | 3 | 4 | 12 |
| 4. Short Answer | 3 | 5 | 15 |
| 5. Code Analysis | 2 | 6 | 12 |
| 6. Applied Problems | 2 | 8 | 16 |
| **Total** | **25** | --- | **100** |

Grade Thresholds

| Grade | Score Range | Percentage |
|-------|-------------|------------|
| A | 90-100 | 90-100% |
| B | 80-89 | 80-89% |
| C | 70-79 | 70-79% |
| D | 60-69 | 60-69% |
| F | 0-59 | 0-59% |