Chapter 21: Quiz - In-Game Win Probability
Instructions
Answer all questions. Each question is worth the points indicated. Total possible: 106 points.
Section A: Multiple Choice (2 points each)
Question 1
Win probability measures:
- A) A team's season winning percentage
- B) The likelihood of winning given current game state
- C) The probability of making the next shot
- D) Historical performance against similar opponents
Question 2
Which factor does NOT typically affect in-game win probability?
- A) Score differential
- B) Time remaining
- C) Jersey color
- D) Possession status
Question 3
The Brier Score measures:
- A) Only model accuracy
- B) Only model calibration
- C) Both calibration and discrimination
- D) Player shooting accuracy
Question 4
A well-calibrated win probability model means:
- A) It predicts 50% for every situation
- B) When it predicts 70%, teams win about 70% of the time
- C) It never makes mistakes
- D) It has high accuracy on rare events
Question 5
Win Probability Added (WPA) is:
- A) The change in win probability after a play
- B) A player's career winning percentage
- C) Added probability from home court
- D) The bonus from making free throws
Question 6
Leverage Index is highest in which situation?
- A) Tie game in the first quarter
- B) Up 25 points in the fourth quarter
- C) Tie game with 30 seconds left
- D) Down 10 points at halftime
Question 7
Why is logistic regression preferred for win probability?
- A) It's faster to compute
- B) It outputs probabilities between 0 and 1
- C) It doesn't require training data
- D) It has no assumptions
Question 8
A possession is worth approximately how many expected points?
- A) 0.5 points
- B) 1.0-1.1 points
- C) 2.0 points
- D) 3.0 points
Question 9
Platt scaling is used for:
- A) Scoring adjustments
- B) Model calibration
- C) Feature engineering
- D) Data collection
Question 10
Total WPA for a team in a winning game that starts at 50% win probability equals:
- A) 0.0
- B) 0.5
- C) 1.0
- D) Varies by game
Section B: True/False (2 points each)
Question 11
Win probability models can accurately predict individual game outcomes.
Question 12
A team's WPA over a game sums to approximately 0.5 if they win (starting from 0.5).
Question 13
Higher leverage index means higher importance of the current situation.
Question 14
The Brier Score ranges from 0 (perfect) to infinity.
Question 15
WPA is a good metric for predicting future player performance.
Question 16
Score differential is more important early in the game than late.
Question 17
Home court advantage should be incorporated into win probability models.
Question 18
Time-series cross-validation is recommended for win probability model evaluation.
Question 19
A model that predicts 50% win probability for all situations would be perfectly calibrated.
Question 20
Win probability naturally accounts for team strength differences.
Section C: Short Answer (4 points each)
Question 21
Calculate the seconds remaining when:
- Period: 4
- Clock: 3:45

Show your work.
Question 22
A model predicts 0.75 win probability for a team. The team is in this situation 200 times and wins 160 times. Calculate the calibration error at this probability level.
Question 23
Explain why WPA can be misleading for player evaluation. Give a specific example.
Question 24
Define leverage index mathematically and explain what LI = 5.0 means in practical terms.
Question 25
A team's win probability graph shows they were at 15% at some point but won. Calculate their "comeback improbability" and explain what this means.
Section D: Problem Solving (6 points each)
Question 26
Brier Score Calculation
Calculate the Brier Score for these predictions:
| Situation | Predicted WP | Outcome (1=win) |
|-----------|--------------|-----------------|
| 1 | 0.90 | 1 |
| 2 | 0.70 | 1 |
| 3 | 0.60 | 0 |
| 4 | 0.40 | 0 |
| 5 | 0.20 | 0 |
Show your work and interpret whether this is good performance.
Question 27
WPA Attribution
Calculate total WPA for this player:
| Play | WP Before | WP After |
|------|-----------|----------|
| Made 3PT | 0.45 | 0.58 |
| Turnover | 0.65 | 0.52 |
| Made FT | 0.70 | 0.73 |
| Missed shot | 0.55 | 0.48 |
| Assist | 0.40 | 0.50 |

a) Calculate WPA for each play
b) Calculate total WPA
c) Determine net positive/negative impact
Question 28
Feature Engineering
For these features: score_diff, seconds_remaining, possession
Create:
a) A log transformation of time
b) An interaction term between score and time
c) A possession indicator (-1, 0, 1)
Write the formulas and explain why each helps model performance.
Question 29
Calibration Analysis
A model produces these predictions:
| Probability Bin | Predictions | Actual Wins | Win Rate |
|-----------------|-------------|-------------|----------|
| 0.0-0.2 | 100 | 15 | |
| 0.2-0.4 | 150 | 52 | |
| 0.4-0.6 | 200 | 102 | |
| 0.6-0.8 | 180 | 132 | |
| 0.8-1.0 | 120 | 108 | |

a) Calculate the actual win rate for each bin
b) Identify which bins are over-confident and under-confident
c) Is the overall model well-calibrated?
Question 30
Win Probability Dynamics
A game has these key moments (scores listed as home-away):
| Time | Score | Event | WP (Home) |
|------|-------|-------|-----------|
| Start | 0-0 | Tip-off | 0.55 |
| Q2, 0:00 | 52-48 | Halftime | 0.68 |
| Q3, 6:00 | 58-65 | | 0.42 |
| Q4, 2:00 | 85-82 | | 0.78 |
| Q4, 0:10 | 88-86 | | 0.92 |
| Final | 89-88 | Home wins | 1.00 |

a) Identify the biggest WP swing
b) What likely caused the Q3 drop from 0.68 to 0.42?
c) Was this an exciting game? Justify using WP data.
Section E: Essay Questions (8 points each)
Question 31
Building a Win Probability Model
Describe the complete process of building a win probability model:
- Data requirements and preprocessing
- Feature engineering (at least 5 features)
- Model selection and training
- Calibration and evaluation
- Deployment considerations
For each step, explain both what to do and why it matters.
Question 32
Applications of Win Probability
Discuss the various applications of win probability in professional basketball:
For teams:
- In-game decision making
- Player evaluation
- Game planning

For media:
- Broadcast graphics
- Storytelling
- Fan engagement

For fans:
- Understanding game dynamics
- Historical comparisons
Provide specific examples for at least three applications and discuss limitations.
Answer Key
Section A: Multiple Choice
1. B - Likelihood of winning given current game state
2. C - Jersey color does not affect win probability
3. C - Both calibration and discrimination
4. B - Predicted probabilities match actual outcomes
5. A - Change in win probability after a play
6. C - Tie game with 30 seconds left
7. B - Outputs probabilities between 0 and 1
8. B - 1.0-1.1 points per possession
9. B - Model calibration
10. B - 0.5 (WP moves from a starting value of 0.5 to 1.0, so +0.5 net)
Section B: True/False
11. False - Models predict probabilities, not certain outcomes
12. True - Team WPA = final WP (1) - initial WP (0.5) = 0.5
13. True - Definition of leverage index
14. False - Brier Score ranges from 0 to 1
15. False - WPA describes the past and does not predict the future well
16. False - Score differential matters more late in the game
17. True - Home advantage is roughly 3-4 points
18. True - Prevents data leakage from future games
19. False - Would only be calibrated if games were 50/50
20. False - Team strength must be included explicitly as a feature
Section C: Short Answer
21.
- Period 4 means no full periods remain after the current one
- 3:45 = 3 × 60 + 45 = 225 seconds remaining
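The clock conversion in answer 21 can be sketched as a small helper. This is a minimal sketch, assuming four 12-minute regulation periods and ignoring overtime; the function name and its parameters are hypothetical, not from the chapter.

```python
def seconds_remaining(period: int, clock: str,
                      total_periods: int = 4,
                      period_length_s: int = 720) -> int:
    """Seconds left in regulation, given the current period and an 'M:SS' clock.

    Assumes 12-minute periods (720 s) and no overtime handling.
    """
    minutes, seconds = clock.split(":")
    clock_s = int(minutes) * 60 + int(seconds)
    # Full periods still to be played after the current one.
    periods_left = max(total_periods - period, 0)
    return periods_left * period_length_s + clock_s


print(seconds_remaining(4, "3:45"))  # 3 * 60 + 45 = 225
```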
22.
- Predicted: 75%, Actual: 160/200 = 80%
- Calibration error = 80% - 75% = 5 percentage points
- Model is under-confident at this level
23. WPA depends on when you play, not just how well. A bench player in blowouts has limited WPA opportunity, while a closer has high-leverage chances. Example: Two players with identical skill - one plays garbage time (low WPA), one plays crunch time (high WPA).
24.
- LI = Expected WP swing in current situation / Average expected WP swing
- LI = 5.0 means this situation is 5x more important than average
- Plays made now have 5x normal impact on winning
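One way to write the ratio from answer 24 in symbols is shown below. The notation is an assumption made here for illustration, not taken from the chapter: $s$ is the current game state, $\Delta\mathrm{WP}$ is the change in win probability on the next play, and the denominator averages over all game states $s'$.

```latex
\mathrm{LI}(s) \;=\;
\frac{\mathbb{E}\big[\,|\Delta \mathrm{WP}| \;\big|\; s\,\big]}
     {\mathbb{E}_{s'}\Big[\,\mathbb{E}\big[\,|\Delta \mathrm{WP}| \;\big|\; s'\,\big]\Big]}
```

With this form, an average situation has $\mathrm{LI} = 1$, and $\mathrm{LI} = 5.0$ means the expected swing is five times the average.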
25.
- Comeback improbability = 1 - 0.15 = 0.85, i.e. 85% improbable
- This means historically, only 15% of teams in that situation win
- The team overcame 85% odds against them
Section D: Problem Solving
26. Brier Score = (1/n) * sum((predicted - actual)^2)
- (0.90 - 1)^2 = 0.01
- (0.70 - 1)^2 = 0.09
- (0.60 - 0)^2 = 0.36
- (0.40 - 0)^2 = 0.16
- (0.20 - 0)^2 = 0.04

Sum = 0.66
Brier Score = 0.66 / 5 = 0.132
Interpretation: Brier Score of 0.132 is reasonable. Baseline (always predicting 0.5) gives 0.25. Our model is better than baseline.
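The arithmetic in answer 26 can be checked with a few lines of Python. This is a minimal sketch; the function name `brier_score` is chosen here for illustration, and the data are the five rows from Question 26.

```python
def brier_score(predictions, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if not predictions or len(predictions) != len(outcomes):
        raise ValueError("predictions and outcomes must be non-empty and equal length")
    return sum((p - o) ** 2 for p, o in zip(predictions, outcomes)) / len(predictions)


predicted = [0.90, 0.70, 0.60, 0.40, 0.20]
outcome = [1, 1, 0, 0, 0]

score = brier_score(predicted, outcome)
# Baseline: predict 0.5 for every situation.
baseline = brier_score([0.5] * len(outcome), outcome)

print(round(score, 3), round(baseline, 3))  # 0.132 0.25
```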
27. a) WPA calculations:
- Made 3PT: 0.58 - 0.45 = +0.13
- Turnover: 0.52 - 0.65 = -0.13
- Made FT: 0.73 - 0.70 = +0.03
- Missed shot: 0.48 - 0.55 = -0.07
- Assist: 0.50 - 0.40 = +0.10
b) Total WPA = 0.13 - 0.13 + 0.03 - 0.07 + 0.10 = +0.06
c) Net positive impact (+0.06 WPA)
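The per-play and total WPA in answer 27 follow directly from "WP after minus WP before". A minimal sketch using the table from Question 27 (variable names are illustrative only):

```python
# (play, WP before, WP after) rows from Question 27.
plays = [
    ("Made 3PT", 0.45, 0.58),
    ("Turnover", 0.65, 0.52),
    ("Made FT", 0.70, 0.73),
    ("Missed shot", 0.55, 0.48),
    ("Assist", 0.40, 0.50),
]

# WPA for a play is the change in win probability it produced.
wpa_by_play = {name: round(after - before, 2) for name, before, after in plays}
total_wpa = round(sum(after - before for _, before, after in plays), 2)

for name, wpa in wpa_by_play.items():
    print(f"{name}: {wpa:+.2f}")
print(f"Total WPA: {total_wpa:+.2f}")  # +0.06
```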
28. a) Log transformation: log_seconds = log(seconds_remaining + 1)
- Compresses large time values, so a one-second change matters more near the end of the game than early on
- +1 prevents log(0) error

b) Interaction term: score_time_interact = score_diff * sqrt(seconds_remaining)
- Lets the effect of a lead depend on how much time is left
- A given lead is safer with little time remaining than with a lot of time remaining; the fitted coefficient captures how much

c) Possession indicator:
- +1 if we have possession
- -1 if opponent has possession
- 0 if neutral (between plays)
- Captures ~1 point expected value of possession
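The three formulas in answer 28 can be sketched as a single feature function. This is a minimal sketch under assumptions: the function name and the `"team"` / `"opponent"` / `None` encoding of possession are hypothetical choices made here, not part of the chapter.

```python
import math


def engineer_features(score_diff: float, seconds_remaining: float, possession=None) -> dict:
    """Build the three engineered features from answer 28.

    possession: "team", "opponent", or None for a neutral state (assumed encoding).
    """
    return {
        # +1 keeps the log defined when the clock hits zero.
        "log_seconds": math.log(seconds_remaining + 1),
        # Product form of the score/time interaction, as written in the answer key.
        "score_time_interact": score_diff * math.sqrt(seconds_remaining),
        # +1 for our ball, -1 for theirs, 0 otherwise.
        "possession_indicator": {"team": 1, "opponent": -1}.get(possession, 0),
    }


features = engineer_features(score_diff=5, seconds_remaining=225, possession="team")
print(features)
```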
29. a) Win rates:
- 0.0-0.2: 15/100 = 15%
- 0.2-0.4: 52/150 = 34.7%
- 0.4-0.6: 102/200 = 51%
- 0.6-0.8: 132/180 = 73.3%
- 0.8-1.0: 108/120 = 90%

b) Analysis (expected values are bin midpoints):
- 0.0-0.2: Expected ~10%, actual 15% - under-confident
- 0.2-0.4: Expected ~30%, actual 35% - under-confident
- 0.4-0.6: Expected ~50%, actual 51% - well-calibrated
- 0.6-0.8: Expected ~70%, actual 73% - slightly under-confident
- 0.8-1.0: Expected ~90%, actual 90% - well-calibrated
c) Overall reasonably well-calibrated, with slight tendency to be under-confident (teams win more than predicted).
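The bin-by-bin comparison in answer 29 can be reproduced in code. This is a minimal sketch that assumes, as the answer key does, that each bin's expected win rate is its midpoint; the count-weighted average gap at the end is an extra summary added here for illustration, not something the question asks for.

```python
# (bin low, bin high, number of predictions, actual wins) from Question 29.
bins = [
    (0.0, 0.2, 100, 15),
    (0.2, 0.4, 150, 52),
    (0.4, 0.6, 200, 102),
    (0.6, 0.8, 180, 132),
    (0.8, 1.0, 120, 108),
]

rows = []
for low, high, n, wins in bins:
    expected = (low + high) / 2      # assumption: bin midpoint stands in for the mean prediction
    actual = wins / n
    rows.append((low, high, n, expected, actual, actual - expected))

for low, high, n, expected, actual, gap in rows:
    print(f"{low:.1f}-{high:.1f}: expected {expected:.0%}, actual {actual:.1%}, gap {gap:+.1%}")

# Count-weighted mean absolute gap across bins.
total_n = sum(n for _, _, n, _, _, _ in rows)
weighted_gap = sum(n * abs(gap) for _, _, n, _, _, gap in rows) / total_n
print(f"Weighted mean absolute gap: {weighted_gap:.1%}")
```

A positive gap means teams won more often than the bin midpoint predicted.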
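30. a) Biggest WP swing between listed moments: Q3, 6:00 (0.42) to Q4, 2:00 (0.78) = +0.36. The biggest drop is halftime (0.68) to Q3, 6:00 (0.42) = -0.26.

b) Likely causes of the drop:
- Away team went on a scoring run in the first half of Q3
- 11-point swing in the score margin (home went from +4 to -7)
- About 6 minutes of game time elapsed (assuming 12-minute quarters), during which the away team outscored the home team 17-6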
c) Yes, exciting game:
- Lead changed (home led at half, trailed in Q3, won)
- Close throughout (never more than 7 points apart late)
- Final margin only 1 point
- Home team was below 50% during game but won
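The swings between the listed moments in Question 30 can be tabulated in a few lines. This is a minimal sketch; the "total movement" figure (sum of absolute changes) is one simple way to quantify excitement and is an assumption made here, not a metric defined in the chapter.

```python
# (moment, home win probability) from the Question 30 table.
moments = [
    ("Start", 0.55),
    ("Q2, 0:00", 0.68),
    ("Q3, 6:00", 0.42),
    ("Q4, 2:00", 0.78),
    ("Q4, 0:10", 0.92),
    ("Final", 1.00),
]

# Change in home WP between each pair of consecutive moments.
swings = [
    (start, end, wp_end - wp_start)
    for (start, wp_start), (end, wp_end) in zip(moments, moments[1:])
]

biggest_swing = max(swings, key=lambda s: abs(s[2]))  # largest move in either direction
biggest_drop = min(swings, key=lambda s: s[2])        # largest move against the home team
total_movement = sum(abs(delta) for _, _, delta in swings)

for start, end, delta in swings:
    print(f"{start} -> {end}: {delta:+.2f}")
print("Biggest swing:", biggest_swing[0], "->", biggest_swing[1], f"{biggest_swing[2]:+.2f}")
print("Biggest drop:", biggest_drop[0], "->", biggest_drop[1], f"{biggest_drop[2]:+.2f}")
print(f"Total WP movement: {total_movement:.2f}")
```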
Section E: Essay Questions
31. Key points for full credit:
- Data: Play-by-play with timestamps, scores, possession; need 1000+ games
- Features: score_diff, log(time), sqrt(time), possession, home_court, score × time interaction
- Model: Logistic regression (interpretable, calibrated); regularization helps
- Evaluation: Brier score, calibration plot, cross-validation by time
- Deployment: Fast inference, handle edge cases, monitor calibration over time
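The "cross-validation by time" point can be illustrated with expanding-window folds, where each test season is evaluated by a model trained only on earlier seasons. This is a minimal sketch; the function name and the use of whole seasons as the time unit are assumptions made here for illustration.

```python
def expanding_window_folds(seasons):
    """Yield (train_seasons, test_season) pairs with training data strictly earlier.

    Each fold trains on every season before the test season, so no
    information from future games leaks into training.
    """
    ordered = sorted(seasons)
    for i in range(1, len(ordered)):
        yield ordered[:i], ordered[i]


# Hypothetical season labels, deliberately unsorted.
folds = list(expanding_window_folds([2021, 2019, 2020, 2022]))
for train, test in folds:
    print(f"train on {train}, test on {test}")
```

In practice each fold's model would be fit on the training seasons and scored (for example with the Brier score) on the held-out season.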
32. Key points for full credit:
- Teams: Timeout decisions, fouling strategy, who plays in crunch time
- Media: Real-time graphics, "dagger" play identification, historical context
- Fans: Excitement quantification, comparing games across eras
- Limitations: Doesn't capture everything, single-game variance, model uncertainty
Scoring Guide
| Section | Points | Your Score |
|---|---|---|
| A (10 questions) | 20 | |
| B (10 questions) | 20 | |
| C (5 questions) | 20 | |
| D (5 questions) | 30 | |
| E (2 questions) | 16 | |
| Total | 106 | |
Grade Scale:
- A: 90-100+
- B: 80-89
- C: 70-79
- D: 60-69
- F: Below 60