Quiz: Machine Learning for NFL Prediction
Question 1
What is the primary challenge of applying ML to NFL prediction?
A) Too many games to process
B) Limited sample size (only ~270 games/season)
C) NFL data is unstructured
D) Teams are impossible to compare
Question 2
Which of the following is an example of data leakage?
A) Using last week's points per game
B) Using the current game's final score as a feature
C) Using preseason power rankings
D) Using historical head-to-head records
Question 3
Temporal cross-validation differs from random cross-validation because:
A) It uses more folds
B) Training data always precedes test data chronologically
C) It produces higher accuracy
D) It doesn't require a test set
Question 4
Which gradient boosting parameter INCREASES model complexity?
A) Lower max_depth
B) Higher min_child_weight
C) Higher n_estimators
D) Higher regularization (reg_alpha)
Question 5
A model has 85% training accuracy and 58% test accuracy. This indicates:
A) An excellent model
B) Underfitting
C) Overfitting
D) Proper calibration
Question 6
For NFL prediction, the recommended number of features is typically:
A) 5-10
B) 20-30
C) 100-200
D) 500+
Question 7
Which algorithm is most commonly used for NFL game prediction?
A) Support Vector Machine
B) Naive Bayes
C) Gradient Boosting (XGBoost/LightGBM)
D) k-Nearest Neighbors
Question 8
Why might simpler models sometimes outperform complex ones in NFL prediction?
A) Complex models are slower
B) Limited data increases overfitting risk for complex models
C) Simple models have better documentation
D) NFL rules favor simple analysis
Question 9
Feature importance in tree-based models measures:
A) Causal effect on outcomes
B) How often a feature is used for splits
C) Correlation with target
D) Statistical significance
Question 10
Ensemble methods improve prediction by:
A) Using the same model multiple times
B) Combining diverse models that make different errors
C) Increasing training data
D) Reducing feature count
Question 11
Walk-forward validation involves:
A) Randomly selecting test weeks
B) Predicting one week at a time using only prior data
C) Walking through features sequentially
D) Using future data for training
Question 12
The Brier score for a random 50/50 predictor is:
A) 0.0
B) 0.10
C) 0.25
D) 0.50
Question 13
Neural networks typically require more data than gradient boosting because:
A) They have more parameters to train
B) They are faster
C) They only work with images
D) They don't need regularization
Question 14
Which regularization technique is built into XGBoost?
A) Only L1 regularization
B) Only L2 regularization
C) Both L1 (reg_alpha) and L2 (reg_lambda)
D) No regularization available
Question 15
Stacking in ensemble learning means:
A) Averaging predictions from multiple models
B) Using one model's predictions as features for another
C) Training models sequentially
D) Stacking layers in a neural network
Question 16
When comparing model performance, you should:
A) Use training accuracy
B) Use temporal held-out test accuracy
C) Trust feature importance
D) Compare to no baseline
Question 17
For class imbalance (rare events), you should consider:
A) Ignoring the minority class
B) Using class weights or oversampling
C) Only predicting the majority class
D) Removing all minority class samples
Question 18
The standard deviation of NFL score differentials (~13.5 points) affects ML because:
A) It makes prediction easier
B) It creates inherent noise that limits prediction accuracy
C) It requires larger batch sizes
D) It doesn't affect ML models
Question 19
SHAP values help with:
A) Model training speed
B) Feature importance and model explainability
C) Data collection
D) Cross-validation
Question 20
When should you retrain your NFL ML model?
A) Never after initial training
B) Periodically and when performance degrades
C) Every prediction
D) Only before the Super Bowl
Answer Key
1. B - With only ~270 games per season, NFL data is small by ML standards, making overfitting a major concern.
2. B - Using the current game's final score leaks information from the future (the outcome you're trying to predict).
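As an illustration, here is a minimal sketch of the difference, using a hypothetical per-team game log: a rolling points average whose window includes the current game leaks the outcome, while shifting the window back one game keeps the feature point-in-time safe.

```python
import pandas as pd

# Hypothetical game log: one row per team-game, in chronological order.
games = pd.DataFrame({
    "team":   ["KC", "KC", "KC", "KC"],
    "week":   [1, 2, 3, 4],
    "points": [27, 31, 17, 24],
})

# LEAKY: the rolling mean for week N includes week N's own score.
games["ppg_leaky"] = games.groupby("team")["points"].transform(
    lambda s: s.rolling(2, min_periods=1).mean()
)

# SAFE: shift(1) pushes the window back so week N sees only weeks < N.
games["ppg_safe"] = games.groupby("team")["points"].transform(
    lambda s: s.shift(1).rolling(2, min_periods=1).mean()
)
print(games)
```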
3. B - Temporal CV ensures training data comes before test data, so no information from the future leaks into training.
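A minimal sketch of the idea with scikit-learn's TimeSeriesSplit, assuming rows are already sorted chronologically (features and labels here are random placeholders):

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.random.rand(270, 5)            # ~one season of games, sorted by date
y = np.random.randint(0, 2, 270)

tscv = TimeSeriesSplit(n_splits=5)
for fold, (train_idx, test_idx) in enumerate(tscv.split(X)):
    # Every training index precedes every test index chronologically.
    assert train_idx.max() < test_idx.min()
    print(f"fold {fold}: train through row {train_idx.max()}, "
          f"test rows {test_idx.min()}-{test_idx.max()}")
```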
4. C - More estimators (trees) increase model complexity and the capacity to fit the training data.
5. C - A large gap between training and test accuracy is the classic sign of overfitting.
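A quick sketch of this failure mode: fit a deep model to random labels on a season-sized sample, and training accuracy comes out high while held-out accuracy hovers near a coin flip (the data is pure noise by construction).

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X = np.random.rand(270, 30)            # noisy features, small sample
y = np.random.randint(0, 2, 270)       # random labels: nothing to learn

# shuffle=False preserves chronological order, as a temporal split would.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, shuffle=False)
model = GradientBoostingClassifier(max_depth=6).fit(X_tr, y_tr)

print("train:", accuracy_score(y_tr, model.predict(X_tr)))  # near-perfect
print("test: ", accuracy_score(y_te, model.predict(X_te)))  # ~coin flip
```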
6. B - With ~270 games/season, 20-30 features is typically optimal; more leads to overfitting.
7. C - Gradient boosting (XGBoost, LightGBM) dominates NFL prediction due to its balance of power and robustness.
8. B - Limited NFL data means complex models tend to memorize rather than generalize.
9. B - Tree-based importance measures how often features are used for splits, not causal effects.
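A minimal sketch using XGBoost's split-count importance (this assumes the xgboost package is installed; the data is a random placeholder in which only the first feature matters):

```python
import numpy as np
import xgboost as xgb  # assumes xgboost is installed

rng = np.random.default_rng(0)
X = rng.normal(size=(270, 4))
y = (X[:, 0] > 0).astype(int)          # only feature 0 drives the label

model = xgb.XGBClassifier(n_estimators=50, max_depth=3).fit(X, y)

# importance_type="weight" counts how often each feature is chosen for a
# split -- a usage statistic, not a causal effect estimate.
print(model.get_booster().get_score(importance_type="weight"))
```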
10. B - Ensembles work when models make different errors; combining them reduces overall error.
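As a sketch of the simplest version of this idea, soft voting averages the predicted probabilities of two deliberately different learners (synthetic data as a placeholder):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=270, n_features=10, random_state=0)

# Diverse base learners tend to err on different games; averaging their
# probabilities cancels some of those errors.
ensemble = VotingClassifier(
    estimators=[("gbm", GradientBoostingClassifier()),
                ("logreg", LogisticRegression(max_iter=1000))],
    voting="soft",
)
ensemble.fit(X, y)
print(ensemble.predict_proba(X[:3]))
```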
11. B - Walk-forward simulates real prediction: each week is predicted using only data from earlier weeks.
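A minimal walk-forward sketch, assuming a hypothetical frame with one row per game, a week column, and placeholder features f1/f2:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical season: 18 weeks x 15 games, with placeholder features
# and the binary outcome (home win).
df = pd.DataFrame({
    "week": np.repeat(np.arange(1, 19), 15),
    "f1": np.random.rand(270),
    "f2": np.random.rand(270),
    "home_win": np.random.randint(0, 2, 270),
})

preds = []
for week in range(6, 19):              # start once some history exists
    train = df[df["week"] < week]      # strictly prior weeks only
    test = df[df["week"] == week]      # the week being predicted
    model = LogisticRegression().fit(train[["f1", "f2"]], train["home_win"])
    preds.append(model.predict_proba(test[["f1", "f2"]])[:, 1])
```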
12. C - A constant 0.5 forecast scores (0.5 - 1)² = (0.5 - 0)² = 0.25 on every binary outcome, so the mean Brier score is 0.25.
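This is easy to verify numerically; a short sketch with scikit-learn:

```python
import numpy as np
from sklearn.metrics import brier_score_loss

outcomes = np.random.randint(0, 2, 1000)     # simulated binary results
coin_flip = np.full(1000, 0.5)               # always predict 50%

# (0.5 - 1)^2 = (0.5 - 0)^2 = 0.25 for every game, so the mean is 0.25.
print(brier_score_loss(outcomes, coin_flip))  # exactly 0.25
```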
13. A - Neural networks have many more parameters, requiring more data to estimate them reliably.
14. C - XGBoost includes both L1 (reg_alpha) and L2 (reg_lambda) regularization parameters.
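A minimal configuration sketch (assumes the xgboost package; placeholder data):

```python
import numpy as np
import xgboost as xgb  # assumes xgboost is installed

X = np.random.rand(270, 20)
y = np.random.randint(0, 2, 270)

model = xgb.XGBClassifier(
    n_estimators=200,
    max_depth=3,
    reg_alpha=0.5,    # L1: pushes leaf weights toward exactly zero
    reg_lambda=2.0,   # L2: shrinks leaf weights smoothly
)
model.fit(X, y)
```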
15. B - Stacking uses base model predictions as features for a meta-model.
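A minimal sketch with scikit-learn's StackingClassifier; note that its default cross-validation is plain k-fold, so on real chronological data you would need to build the out-of-fold predictions without look-ahead.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=270, n_features=10, random_state=0)

# Base models' out-of-fold predictions become the meta-model's features.
stack = StackingClassifier(
    estimators=[("gbm", GradientBoostingClassifier()),
                ("logreg", LogisticRegression(max_iter=1000))],
    final_estimator=LogisticRegression(),
    cv=5,  # k-fold here; real NFL data would need temporally safe folds
)
stack.fit(X, y)
```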
16. B - Only accuracy on a temporally held-out test set reflects real-world performance.
17. B - Class weights or oversampling help models learn from rare events appropriately.
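A sketch of the class-weight knob in scikit-learn (XGBoost's analogue is scale_pos_weight); the data is a random placeholder with a rare positive class:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.random.rand(270, 5)
y = (np.random.rand(270) < 0.1).astype(int)  # rare event (~10% positive)

# class_weight="balanced" reweights samples inversely to class frequency,
# so the rare class is not drowned out during training.
model = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X, y)
```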
18. B - High variance means even perfect models have limited accuracy due to randomness.
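A back-of-envelope sketch of that ceiling, assuming score differentials are roughly normal with SD ~13.5 around the true margin: even a model that knows the true margin exactly can only pick the winner as often as the noise allows.

```python
from scipy.stats import norm

sd = 13.5                        # ~SD of NFL score differentials
for true_margin in (1, 3, 7, 10):
    # If the actual differential ~ N(true_margin, sd), the favorite
    # wins whenever the draw comes out positive.
    p_win = 1 - norm.cdf(0, loc=true_margin, scale=sd)
    print(f"true margin {true_margin:>2}: favorite wins {p_win:.1%}")
```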
19. B - SHAP values explain which features drive individual predictions.
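A minimal sketch (assumes the shap and xgboost packages; random placeholder data):

```python
import numpy as np
import shap                       # assumes the shap package is installed
import xgboost as xgb

X = np.random.rand(270, 5)
y = np.random.randint(0, 2, 270)
model = xgb.XGBClassifier(n_estimators=50).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
# One row per game, one column per feature: each value is that feature's
# contribution to this prediction relative to the average prediction.
print(shap_values[0])
```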
20. B - Periodic retraining captures new patterns; also retrain when performance drops.
Scoring Guide
- 18-20: Excellent - Ready for production ML systems
- 15-17: Good - Solid ML foundations
- 12-14: Satisfactory - Review overfitting and validation sections
- Below 12: Needs Review - Revisit chapter fundamentals