Quiz: Introduction to Prediction Models


Question 1

What is the primary purpose of a prediction model?

  A) To guarantee correct outcomes
  B) To systematically transform data into probabilistic forecasts
  C) To replace human judgment entirely
  D) To make money betting


Question 2

The standard prediction pipeline includes which steps in order?

  A) Prediction → Data → Evaluation → Features
  B) Data → Features → Model → Prediction → Evaluation
  C) Model → Data → Features → Prediction
  D) Evaluation → Prediction → Model → Data


Question 3

What does a Brier score of 0.25 indicate?

  A) Perfect predictions
  B) Equivalent to always predicting 50%
  C) Excellent calibration
  D) Poor model performance


Question 4

If a model predicts a home team has a 65% win probability, the implied point spread is approximately:

  A) -3 points
  B) -5 points
  C) -7 points
  D) -10 points


Question 5

What is "data leakage" in prediction modeling?

  A) Lost data during processing
  B) Using information not available at prediction time
  C) Sharing proprietary models
  D) Data storage failures


Question 6

A model shows 62% accuracy on 30 games. This result:

  A) Proves the model is highly skilled
  B) Is statistically significant at p < 0.05
  C) Could easily occur by chance
  D) Indicates perfect calibration


Question 7

The approximate standard deviation of NFL final margins around the point spread is:

  A) 5 points
  B) 8 points
  C) 13-14 points
  D) 20 points


Question 8

"Overfitting" occurs when a model:

  A) Uses too few features
  B) Memorizes training data but fails on new data
  C) Is too simple to capture patterns
  D) Has perfect accuracy


Question 9

What against-the-spread (ATS) accuracy is needed to break even when betting at -110 odds?

  A) 50.0%
  B) 52.4%
  C) 55.0%
  D) 60.0%


Question 10

Home field advantage in the modern NFL is approximately:

  A) 1 point
  B) 2.5 points
  C) 5 points
  D) 7 points


Question 11

To convert a point spread to win probability, which formula is commonly used?

  A) wp = spread / 100
  B) wp = 1 / (1 + 10^(spread/8))
  C) wp = 0.5 + spread * 0.03
  D) wp = e^spread / (1 + e^spread)


Question 12

A well-calibrated model's 70% predictions should:

  A) Always be correct
  B) Win approximately 70% of the time
  C) Be more confident
  D) Have lower Brier scores


Question 13

Why can't you randomly split NFL game data for train/test?

  A) NFL has too few games
  B) It creates temporal data leakage
  C) Random splits are always wrong
  D) NFL data is not random


Question 14

Mean Absolute Error (MAE) for a good NFL spread prediction model is typically:

  A) 2-4 points
  B) 5-7 points
  C) 10-12 points
  D) 15-20 points


Question 15

What does quantifying prediction uncertainty help with?

  A) Making the model more accurate
  B) Understanding which predictions are more reliable
  C) Eliminating randomness
  D) Guaranteeing profits


Question 16

Which factor typically has the LARGEST impact on NFL game predictions?

  A) Weather conditions
  B) Team power ratings
  C) Day of the week
  D) Uniform colors


Question 17

The baseline for straight-up NFL prediction accuracy (coin flip) is:

  A) 45%
  B) 50%
  C) 55%
  D) 60%


Question 18

If a model has high training accuracy but low test accuracy, you should:

  A) Use more training data
  B) Simplify the model
  C) Add more features
  D) Increase model complexity


Question 19

A 90% confidence interval for a spread prediction of -7 with σ=13.5 is approximately:

  A) [-10, -4]
  B) [-15, +1]
  C) [-29, +15]
  D) [-7, -7]


Question 20

Which metric best measures whether probability predictions are reliable?

  A) Straight-up accuracy
  B) ATS accuracy
  C) Brier score / calibration
  D) Mean absolute error


Answer Key

  1. B - A prediction model systematically transforms available data into probabilistic forecasts about uncertain future events.

  2. B - The standard pipeline is: Data → Feature Engineering → Model → Predictions → Evaluation, with a feedback loop.

  3. B - A Brier score of 0.25 is what you get by always predicting 50%, since (0.5)² = 0.25 whichever team wins. Lower is better; good models are <0.22.

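  4. B - Under a normal model of the final margin with σ ≈ 13.5 points, a 65% win probability corresponds to z ≈ 0.385, so the implied spread is about -0.385 × 13.5 ≈ -5 points.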

  5. B - Data leakage means using information that wouldn't be available at the time of prediction, like future results.

  6. C - With only 30 games, 62% could easily occur by chance. The standard error is ~9%, so 62% sits only about 1.3 standard errors above 50%, which is not statistically significant.

  7. C - NFL final margins deviate from the point spread with a standard deviation of approximately 13-14 points, reflecting high game-to-game variance.

  8. B - Overfitting occurs when a model memorizes training data (including noise) but fails to generalize to new data.

  9. B - At -110 odds, you need 52.4% accuracy to break even (11/21 = 52.38%).

  10. B - Modern NFL home field advantage is approximately 2.5 points, down from 3.0 historically.

  11. B - The formula wp = 1/(1+10^(spread/8)) converts spreads to probabilities: a spread of 0 gives 50%, and a negative spread (a favorite) gives a win probability above 50%.

  12. B - Calibration means predictions match reality; 70% predictions should win about 70% of the time.

  13. B - Random splits could put later games in training and earlier games in testing, creating temporal data leakage.

  14. C - Good NFL models typically have MAE of 10-12 points, similar to market performance.

  15. B - Uncertainty quantification helps identify which predictions are more reliable and manage risk appropriately.

  16. B - Team power ratings (strength) have the largest impact; they capture overall team quality.

  17. B - Random guessing yields 50% accuracy, the baseline against which models are measured.

  18. B - High train/low test accuracy indicates overfitting; simplifying the model reduces this gap.

  19. C - 90% CI = -7 ± 1.65×13.5 = [-29.3, +15.3], showing NFL's high variance.

  20. C - Brier score and calibration analysis measure whether probability predictions are reliable and well-calibrated.
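
Several of the answers above rest on short calculations (items 3, 4, 6, 9, 14, and 19). The sketch below is one way to check that arithmetic in Python, not a reference implementation. All function names are hypothetical, and it assumes final-margin errors are roughly normal with σ = 13.5 points, the value used in Question 19.

```python
import math
from statistics import NormalDist

SIGMA = 13.5  # assumed std. dev. of final margin around the spread (Question 19)


def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def breakeven_win_rate(american_odds):
    """Win rate needed to break even at negative American odds such as -110."""
    risk = abs(american_odds)  # amount staked to win 100
    return risk / (risk + 100)


def accuracy_z_score(accuracy, n_games, baseline=0.5):
    """Number of standard errors an observed accuracy sits above the baseline."""
    standard_error = math.sqrt(baseline * (1 - baseline) / n_games)
    return (accuracy - baseline) / standard_error


def spread_for_win_prob(win_prob, sigma=SIGMA):
    """Spread implied by a win probability under the assumed normal margin model."""
    return -sigma * NormalDist().inv_cdf(win_prob)


def margin_interval(spread, level=0.90, sigma=SIGMA):
    """Central interval around a spread prediction, assuming normal errors."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return spread - z * sigma, spread + z * sigma


def expected_mae(sigma=SIGMA):
    """Mean absolute error of a zero-mean normal error with std. dev. sigma."""
    return sigma * math.sqrt(2 / math.pi)


if __name__ == "__main__":
    print(brier_score([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0]))  # item 3
    print(spread_for_win_prob(0.65))                        # item 4
    print(accuracy_z_score(0.62, 30))                       # item 6
    print(breakeven_win_rate(-110))                         # item 9
    print(expected_mae())                                   # item 14
    print(margin_interval(-7))                              # item 19
```

Under these assumptions the checks come out close to the key: a Brier score of 0.25, an implied spread of about -5.2, a z-score of about 1.3, a break-even rate of about 52.4%, an expected MAE of about 10.8 points, and an interval of roughly [-29.2, +15.2]. The interval differs slightly from item 19 only because the key rounds the multiplier to 1.65.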


Scoring Guide

  • 18-20: Excellent - Ready for advanced prediction modeling
  • 15-17: Good - Solid understanding of fundamentals
  • 12-14: Satisfactory - Review evaluation metrics and pitfalls
  • Below 12: Needs Review - Revisit chapter material