Quiz: Introduction to Prediction Models
Question 1
What is the primary purpose of a prediction model?
A) To guarantee correct outcomes
B) To systematically transform data into probabilistic forecasts
C) To replace human judgment entirely
D) To make money betting
Question 2
The standard prediction pipeline includes which steps in order?
A) Prediction → Data → Evaluation → Features
B) Data → Features → Model → Prediction → Evaluation
C) Model → Data → Features → Prediction
D) Evaluation → Prediction → Model → Data
Question 3
What does a Brier score of 0.25 indicate?
A) Perfect predictions
B) Equivalent to always predicting 50%
C) Excellent calibration
D) Poor model performance
Question 4
If a model predicts a home team has a 65% win probability, the implied point spread is approximately:
A) -3 points
B) -5 points
C) -7 points
D) -10 points
Question 5
What is "data leakage" in prediction modeling?
A) Lost data during processing
B) Using information not available at prediction time
C) Sharing proprietary models
D) Data storage failures
Question 6
A model shows 62% accuracy on 30 games. This result:
A) Proves the model is highly skilled
B) Is statistically significant at p < 0.05
C) Could easily occur by chance
D) Indicates perfect calibration
Question 7
The approximate standard deviation of NFL final margins around the point spread is:
A) 5 points
B) 8 points
C) 13-14 points
D) 20 points
Question 8
"Overfitting" occurs when a model:
A) Uses too few features
B) Memorizes training data but fails on new data
C) Is too simple to capture patterns
D) Has perfect accuracy
Question 9
What against-the-spread (ATS) accuracy is needed to break even when betting at -110 odds?
A) 50.0%
B) 52.4%
C) 55.0%
D) 60.0%
Question 10
Home field advantage in the modern NFL is approximately:
A) 1 point
B) 2.5 points
C) 5 points
D) 7 points
Question 11
To convert a point spread to win probability, which formula is commonly used?
A) wp = spread / 100
B) wp = 1 / (1 + 10^(spread/8))
C) wp = 0.5 + spread * 0.03
D) wp = e^spread / (1 + e^spread)
Question 12
A well-calibrated model's 70% predictions should:
A) Always be correct
B) Win approximately 70% of the time
C) Be more confident
D) Have lower Brier scores
Question 13
Why can't you randomly split NFL game data for train/test?
A) NFL has too few games
B) It creates temporal data leakage
C) Random splits are always wrong
D) NFL data is not random
Question 14
Mean Absolute Error (MAE) for a good NFL spread prediction model is typically:
A) 2-4 points
B) 5-7 points
C) 10-12 points
D) 15-20 points
Question 15
What does quantifying prediction uncertainty help with?
A) Making the model more accurate
B) Understanding which predictions are more reliable
C) Eliminating randomness
D) Guaranteeing profits
Question 16
Which factor typically has the LARGEST impact on NFL game predictions?
A) Weather conditions
B) Team power ratings
C) Day of the week
D) Uniform colors
Question 17
The baseline for straight-up NFL prediction accuracy (coin flip) is:
A) 45%
B) 50%
C) 55%
D) 60%
Question 18
If a model has high training accuracy but low test accuracy, you should:
A) Use more training data
B) Simplify the model
C) Add more features
D) Increase model complexity
Question 19
A 90% confidence interval for a spread prediction of -7 with σ=13.5 is approximately:
A) [-10, -4]
B) [-15, +1]
C) [-29, +15]
D) [-7, -7]
Question 20
Which metric best measures whether probability predictions are reliable?
A) Straight-up accuracy
B) ATS accuracy
C) Brier score / calibration
D) Mean absolute error
Answer Key
1. B - A prediction model systematically transforms available data into probabilistic forecasts about uncertain future events.
2. B - The standard pipeline is: Data → Feature Engineering → Model → Predictions → Evaluation, with a feedback loop.
3. B - Brier score of 0.25 equals always predicting 50% probability. Lower is better; good models are <0.22.
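4. B - At roughly 3% of win probability per point, 65% is 15 percentage points above 50%, which implies a spread of about -5 points. A normal model with σ = 13.5 gives about -5.2.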
5. B - Data leakage means using information that wouldn't be available at the time of prediction, like future results.
6. C - With only 30 games, 62% could easily occur by chance. The standard error is about 9 percentage points, so 62% is only about 1.3 standard errors above 50% and is not statistically significant.
7. C - NFL final margins vary around the point spread with a standard deviation of approximately 13-14 points, reflecting high game-to-game variance.
8. B - Overfitting occurs when a model memorizes training data (including noise) but fails to generalize to new data.
9. B - At -110 odds you risk 110 to win 100, so you need 110/210 = 11/21 ≈ 52.4% accuracy to break even.
10. B - Modern NFL home field advantage is approximately 2.5 points, down from 3.0 historically.
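11. B - The logistic form wp = 1/(1+10^(spread/8)) is the only option that both keeps the result between 0 and 1 and gives favorites (negative spreads) more than 50%. The rule of thumb of about 3% per point near a pick'em corresponds to a normal model with σ ≈ 13.5; with a scale of 8 this curve is steeper, about 7% per point.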
12. B - Calibration means predictions match reality; 70% predictions should win about 70% of the time.
13. B - Random splits could put later games in training and earlier games in testing, creating temporal data leakage.
14. C - Good NFL models typically have MAE of 10-12 points, similar to market performance.
15. B - Uncertainty quantification helps identify which predictions are more reliable and manage risk appropriately.
16. B - Team power ratings (strength) have the largest impact; they capture overall team quality.
17. B - Random guessing yields 50% accuracy, the baseline against which models are measured.
18. B - High train/low test accuracy indicates overfitting; simplifying the model reduces this gap.
19. C - 90% CI = -7 ± 1.65×13.5 = [-29.3, +15.3], showing NFL's high variance.
20. C - Brier score and calibration analysis measure whether probability predictions are reliable and well-calibrated.
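The arithmetic behind answers 3, 4, 6, 9, 14, and 19 can be checked with a short Python sketch. It assumes final margins are normally distributed around the spread with σ = 13.5 (the value used in Question 19). The function names are illustrative and not from the chapter, and the normal model is a stand-in assumption, not the logistic formula from Question 11.

```python
import math
from statistics import NormalDist

SIGMA = 13.5  # assumed std. dev. of final margin around the spread (Question 19)


def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def break_even_rate(american_odds):
    """Win rate needed to break even at negative American odds such as -110."""
    risk = -american_odds  # amount staked to win 100
    return risk / (risk + 100)


def accuracy_z_score(accuracy, n_games, baseline=0.5):
    """z-score of an observed accuracy against a coin-flip baseline."""
    std_error = math.sqrt(baseline * (1 - baseline) / n_games)
    return (accuracy - baseline) / std_error


def spread_from_win_prob(win_prob, sigma=SIGMA):
    """Spread implied by a win probability under the assumed normal model."""
    return -sigma * NormalDist().inv_cdf(win_prob)


def margin_interval(spread, level=0.90, sigma=SIGMA):
    """Central interval around a spread prediction, in spread units."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return spread - z * sigma, spread + z * sigma


def expected_mae(sigma=SIGMA):
    """Expected absolute error of a zero-mean normal error with std. dev. sigma."""
    return sigma * math.sqrt(2 / math.pi)


print("Answer 3, Brier of always-50% forecasts:", brier_score([0.5] * 4, [1, 0, 1, 1]))
print("Answer 4, spread for 65%:", round(spread_from_win_prob(0.65), 1))
print("Answer 6, z-score of 62% on 30 games:", round(accuracy_z_score(0.62, 30), 2))
print("Answer 9, break-even at -110:", round(break_even_rate(-110), 4))
print("Answer 14, expected MAE:", round(expected_mae(), 1))
print("Answer 19, 90% interval:", [round(x, 1) for x in margin_interval(-7)])
```

Under these assumptions the script agrees with the key to within rounding: a Brier score of 0.25, a spread near -5 for a 65% favorite, a z-score of about 1.3 (well short of the usual 1.96 threshold), a break-even rate of about 52.4%, an expected absolute error of about 10.8 points, and an interval of roughly [-29, +15]. The interval differs from the key by about 0.1 because the script uses z ≈ 1.645 where the key rounds to 1.65.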
Scoring Guide
- 18-20: Excellent - Ready for advanced prediction modeling
- 15-17: Good - Solid understanding of fundamentals
- 12-14: Satisfactory - Review evaluation metrics and pitfalls
- Below 12: Needs Review - Revisit chapter material
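
For readers reviewing evaluation metrics and pitfalls, the calibration check from Question 12 and the temporal split from Question 13 can be sketched as follows. This is a minimal sketch: the field names (`season`, `prob`, `home_win`), the function names, and the toy records are made up for illustration and do not come from the chapter.

```python
from collections import defaultdict


def temporal_split(games, first_test_season):
    """Train on seasons before first_test_season; test on that season and later."""
    train = [g for g in games if g["season"] < first_test_season]
    test = [g for g in games if g["season"] >= first_test_season]
    return train, test


def calibration_table(probs, outcomes, n_bins=10):
    """Return (mean forecast, observed win rate, count) for each non-empty bin."""
    bins = defaultdict(list)
    for p, o in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # a forecast of 1.0 goes in the top bin
        bins[idx].append((p, o))
    table = []
    for idx in sorted(bins):
        pairs = bins[idx]
        mean_p = sum(p for p, _ in pairs) / len(pairs)
        win_rate = sum(o for _, o in pairs) / len(pairs)
        table.append((round(mean_p, 3), round(win_rate, 3), len(pairs)))
    return table


# Hypothetical toy records, not real game data.
games = [
    {"season": 2021, "prob": 0.72, "home_win": 1},
    {"season": 2021, "prob": 0.31, "home_win": 0},
    {"season": 2022, "prob": 0.74, "home_win": 1},
    {"season": 2022, "prob": 0.71, "home_win": 0},
    {"season": 2023, "prob": 0.78, "home_win": 1},
    {"season": 2023, "prob": 0.35, "home_win": 1},
]

# Every training game precedes every test game, so no future results leak in.
train, test = temporal_split(games, first_test_season=2023)
print(len(train), "training games,", len(test), "test games")

# A well-calibrated model has observed win rates close to the mean forecast per bin.
table = calibration_table(
    [g["prob"] for g in games], [g["home_win"] for g in games]
)
for mean_forecast, win_rate, count in table:
    print(mean_forecast, win_rate, count)
```

With only a handful of games per bin the observed win rates are very noisy, which is the same sample-size caution raised in Question 6.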