Chapter 18 Quiz: Modeling the NHL
Test your understanding of NHL-specific modeling concepts, expected goals, shot metrics, and betting market patterns.
Question 1. What does xG (expected goals) measure in the NHL, and what is the fundamental statistical framework behind it?
Answer
xG measures the probability that a given shot results in a goal, based on the shot's characteristics (distance, angle, shot type, game state, rebound status, rush vs. settled). The fundamental framework is binary classification: each shot is labeled 1 (goal) or 0 (no goal), and the model estimates the probability that the label is 1. Logistic regression is the standard baseline approach, with more complex models (gradient boosting, neural networks) used for capturing non-linear interactions. By summing individual shot xG values, we obtain a team's total expected goals for a game.
Question 2. What is the single most important feature in an xG model, and approximately how much does conversion rate change as this feature increases?
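Worked sketch for Question 1. The following is a minimal illustration of the logistic framework, not a fitted model. The coefficients, the two features used, and the names `shot_xg` and `team_xg` are assumptions made for illustration; the distance coefficient is chosen so that the odds of scoring roughly halve for every additional 10 feet.

```python
import math

# Hypothetical coefficients for illustration only; a real model is fit to shot data.
INTERCEPT = -1.0
COEF_DISTANCE = -0.07  # change in log-odds per foot of distance
COEF_REBOUND = 1.0     # change in log-odds when the shot is a rebound


def shot_xg(distance_ft, is_rebound):
    """Logistic-regression probability that a single shot becomes a goal."""
    z = INTERCEPT + COEF_DISTANCE * distance_ft + COEF_REBOUND * (1 if is_rebound else 0)
    return 1.0 / (1.0 + math.exp(-z))


def team_xg(shots):
    """Sum per-shot probabilities to get a team's expected goals."""
    return sum(shot_xg(distance, rebound) for distance, rebound in shots)


# Three hypothetical shots: (distance in feet, rebound flag)
shots = [(12.0, True), (35.0, False), (58.0, False)]
total = team_xg(shots)
print(f"Team xG: {total:.3f}")
```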
Answer
Shot distance is the single most important feature. Shots from within 30 feet of the goal have dramatically higher conversion rates (15-30%+, depending on other factors) compared to shots from beyond 50 feet (1-3%). The relationship is approximately exponential: each additional 10 feet of distance roughly halves the expected goal probability, holding other factors constant.
Question 3. What is the difference between Corsi and Fenwick?
Answer
Corsi counts all shot attempts: shots on goal, missed shots, and blocked shots. Fenwick excludes blocked shots, counting only unblocked shot attempts (shots on goal and missed shots). The rationale for Fenwick is that blocked shots may reflect the defensive team's shot-blocking ability rather than the offensive team's shot generation. In practice, the two are highly correlated ($r > 0.95$) and both serve as proxies for territorial control.
Question 4. What is PDO, and what is its league-average value? Why is it called a "luck metric"?
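Worked sketch for Question 3. The two counts differ only in whether blocked shots are included. The function names and the single-game counts below are hypothetical.

```python
def corsi(shots_on_goal, missed, blocked):
    """All shot attempts: on goal, missed, and blocked."""
    return shots_on_goal + missed + blocked


def fenwick(shots_on_goal, missed):
    """Unblocked shot attempts only."""
    return shots_on_goal + missed


def share_pct(count_for, count_against):
    """Percentage of attempts belonging to the team (CF% or FF%)."""
    return 100.0 * count_for / (count_for + count_against)


# Hypothetical counts for one game (team, then opponent)
cf = corsi(30, 12, 14)
ca = corsi(26, 10, 8)
ff = fenwick(30, 12)
fa = fenwick(26, 10)

cf_pct = share_pct(cf, ca)
ff_pct = share_pct(ff, fa)
print(f"CF% = {cf_pct:.1f}, FF% = {ff_pct:.1f}")
```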
Answer
PDO is the sum of a team's shooting percentage and save percentage at even strength: $\text{PDO} = \text{Sh\%} + \text{Sv\%}$. League average is, by definition, 100.0% (approximately 9% shooting + 91% save). PDO is called a luck metric because both shooting percentage and save percentage have low year-over-year correlations and strong mean-reversion properties. Teams with extreme PDO values are almost certainly benefiting from (or suffering from) unsustainable luck rather than genuine skill differences.
Question 5. Explain score effects in the NHL. How does a team's CF% change when leading by 1 goal versus trailing by 1 goal?
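Worked sketch for Question 4. The league average is 100 by construction because every goal scored by one team is a goal allowed by another, so league-wide shooting and save percentages are complements. The function name and the counts are hypothetical.

```python
def pdo(goals_for, shots_for, goals_against, shots_against):
    """PDO = shooting percentage + save percentage, both in percent."""
    sh_pct = 100.0 * goals_for / shots_for
    sv_pct = 100.0 * (1.0 - goals_against / shots_against)
    return sh_pct + sv_pct


# A hypothetical team running hot: 12% shooting and a 92% save percentage
hot_team = pdo(goals_for=60, shots_for=500, goals_against=40, shots_against=500)

# League-wide totals: goals for equal goals against, shots for equal shots against
league = pdo(goals_for=450, shots_for=5000, goals_against=450, shots_against=5000)

print(f"Hot team PDO: {hot_team:.1f}, league PDO: {league:.1f}")
```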
Answer
Score effects describe the systematic behavioral changes teams make based on the game score. Teams with a lead "turtle" by playing more defensively, reducing their shot generation and conceding more shot attempts to the opponent. Teams trailing press aggressively, increasing their shot generation. Specifically: when leading by 1, a team's CF% drops to approximately 47% (from 50% when tied); when trailing by 1, it rises to approximately 53%. This bias means raw CF% must be score-adjusted to accurately reflect a team's true quality at controlling play.
Question 6. What is Goals Saved Above Expected (GSAx) and why is it preferred over save percentage for goaltender evaluation?
Answer
GSAx = xGA - Actual GA, where xGA is the sum of xG values for all shots faced. Positive GSAx means the goaltender saved more goals than expected. GSAx is preferred over save percentage because it adjusts for shot quality automatically. A goaltender facing 30 shots from the slot will have a lower raw save percentage than one facing 30 shots from the point, but GSAx correctly evaluates their relative skill by accounting for the difficulty of each shot faced.
Question 7. Why do goaltender metrics require heavy regression to the mean, and what is the approximate regression constant for GSAx?
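Worked sketch for Question 6. This mirrors the slot-versus-point comparison in the answer. The per-shot xG values (0.15 for slot shots, 0.03 for point shots) and the goals allowed are hypothetical numbers chosen to show how save percentage and GSAx can disagree.

```python
def gsax(shot_xg_values, goals_allowed):
    """Goals Saved Above Expected: expected goals against minus actual goals against."""
    return sum(shot_xg_values) - goals_allowed


def save_pct(shots_faced, goals_allowed):
    """Raw save percentage, in percent."""
    return 100.0 * (shots_faced - goals_allowed) / shots_faced


# Goaltender A: 30 slot shots (hypothetical xG 0.15 each), 4 goals allowed
slot_gsax = gsax([0.15] * 30, 4)
slot_sv = save_pct(30, 4)

# Goaltender B: 30 point shots (hypothetical xG 0.03 each), 2 goals allowed
point_gsax = gsax([0.03] * 30, 2)
point_sv = save_pct(30, 2)

print(f"A: Sv% {slot_sv:.1f}, GSAx {slot_gsax:+.1f}")
print(f"B: Sv% {point_sv:.1f}, GSAx {point_gsax:+.1f}")
```

Under these assumed numbers, goaltender A has the lower save percentage but the higher GSAx.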
Answer
Goaltender metrics require heavy regression because goals are rare events (only ~9% of shots score), creating enormous variance on individual shot outcomes. A goaltender faces approximately 1,500-2,000 shots per season, but the expected difference between a good and average goaltender is only 15-20 goals. This signal is easily swamped by noise. The regression constant for GSAx is approximately 2,500-3,500 shots, meaning a goaltender needs to face this many shots before their observed rate carries 50% weight versus the league-average prior.
Question 8. Approximately what percentage of NHL regular-season games go to overtime, and how does this affect puck line betting?
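Worked sketch for Question 7. One common way to apply a regression constant is the shrinkage weight n / (n + k), which gives the observed rate exactly 50% weight when shots faced equal the constant. That functional form, the name `regressed_rate`, and the default constant of 3,000 (the middle of the stated range) are assumptions for illustration.

```python
def regressed_rate(observed_rate, shots_faced, regression_constant=3000, prior_rate=0.0):
    """Shrink an observed per-shot rate toward a prior (league average GSAx/shot is 0)."""
    weight = shots_faced / (shots_faced + regression_constant)
    return weight * observed_rate + (1.0 - weight) * prior_rate


# A goaltender with an observed GSAx/shot of +0.010 after one season of 1,800 shots
one_season = regressed_rate(0.010, 1800)

# The same observed rate after 3,000 shots carries exactly half weight
at_constant = regressed_rate(0.010, 3000)

print(f"After 1,800 shots: {one_season:+.5f}; after 3,000 shots: {at_constant:+.5f}")
```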
Answer
Approximately 23-26% of NHL regular-season games go to overtime. This significantly affects puck line betting because games that go to overtime are decided by exactly 1 goal (the OT/shootout winner), meaning the underdog at +1.5 covers and the favorite at -1.5 does not. A team with a 60% moneyline win probability might only have a 33-38% chance of covering -1.5, because the 23-26% overtime probability siphons away a large portion of potential 2+ goal wins.
Question 9. What is the typical NHL home ice advantage in terms of win percentage and expected goal differential?
Answer
Home teams win approximately 54-55% of regular-season games. In terms of expected goals, home ice advantage translates to approximately +0.08 to +0.12 goals per game (roughly 3-5 cents on the moneyline). The advantage comes from the last-change rule (favorable matchups), familiarity with the rink, crowd influence on referee behavior, and reduced travel fatigue.
Question 10. How does back-to-back game fatigue affect NHL team performance?
Answer
Teams on the second game of a back-to-back show measurable degradation: win rate drops from ~50% to 43-45%, goals against increase by approximately 0.2-0.3 per game, and save percentage drops by 3-5 points (often because a backup goaltender starts). The effect is amplified when the team must travel between games (road B2B) and when the previous game went to overtime.
Question 11. What is the "empty net" situation in hockey, and how should it be handled in an analytical model?
Answer
When trailing by 1-2 goals in the final minutes, a team pulls its goaltender for an extra attacker (6v5). This creates a high volume of shot attempts but also frequent empty-net goals for the leading team. For modeling: (1) empty-net events should be excluded from shot metrics and xG calculations because they represent a fundamentally different game state; (2) they should be included in final score calculations since bets settle on actual final scores; (3) they should be modeled separately when projecting totals, as the probability depends on score state.
Question 12. Explain the geometric mean method for combining team offensive and defensive xG rates in a matchup projection.
Answer
The geometric mean method accounts for the interaction between one team's offense and the opponent's defense. Instead of simply using one team's xGF rate, it computes: $\text{Projected xGF/60} = \text{league avg} \times \sqrt{\frac{\text{team xGF/60}}{\text{league avg}} \times \frac{\text{opp xGA/60}}{\text{league avg}}}$. This produces a natural regression effect: a strong offense facing a strong defense produces a moderate output, while a strong offense facing a weak defense produces an amplified output. It avoids the additive approach's tendency to produce unrealistic extreme projections.
Question 13. A team has CF% of 55% but xGF% of only 48%. What does this combination indicate?
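Worked sketch for Question 12. The function below evaluates the geometric mean formula from the answer directly. The name `projected_xgf60` and the example rates (league average 2.5, offense 3.0, opposing defenses at 2.0 and 3.0) are hypothetical.

```python
import math


def projected_xgf60(team_xgf60, opp_xga60, league_avg):
    """Geometric-mean blend of a team's offense and the opponent's defense."""
    offense_ratio = team_xgf60 / league_avg
    defense_ratio = opp_xga60 / league_avg
    return league_avg * math.sqrt(offense_ratio * defense_ratio)


LEAGUE_AVG = 2.5  # hypothetical league-average xGF/60

# Strong offense (3.0) against a strong defense (allows 2.0)
vs_strong_defense = projected_xgf60(3.0, 2.0, LEAGUE_AVG)

# Strong offense (3.0) against a weak defense (allows 3.0)
vs_weak_defense = projected_xgf60(3.0, 3.0, LEAGUE_AVG)

print(f"vs strong D: {vs_strong_defense:.2f}, vs weak D: {vs_weak_defense:.2f}")
```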
Answer
This combination indicates the team generates a high volume of shots but from low-quality locations. They dominate territorial play (55% of shot attempts) but their chances come primarily from the perimeter rather than the high-danger scoring areas near the net. The xG model correctly identifies that shot quantity does not equal shot quality. This team's underlying offensive production is weaker than its raw shot metrics suggest. Modern analytics has shifted toward xG-based models precisely because of these discrepancies.
Question 14. How does power play special teams performance affect game projections, and what is the approximate league-average PP%?
Answer
Power play contributes to game projections through expected goals generated per opportunity. Teams average 3-4 power play opportunities per game. League-average PP% is approximately 20-22%. An elite power play (28%) facing a poor penalty kill (75% PK%) gains roughly 0.3-0.4 extra expected goals compared to average, which is significant in a sport averaging ~6 total goals. The projection combines team PP xGF/60 with opponent PK xGA/60, scaled by expected opportunity count and duration.
Question 15. What is the significance of using Poisson distribution for NHL goal modeling, and what key assumption does it satisfy?
Answer
The Poisson distribution is well-suited for NHL goal modeling because goals are low-frequency events (teams average roughly 3 per game) that occur approximately independently of one another within a game. The key assumptions are: (1) goals occur at a roughly constant rate within each period, (2) two goals cannot occur simultaneously, and (3) the probability of a goal in a small time interval is proportional to the interval length. Unlike MLB run scoring, NHL goals do not exhibit significant clustering, so the Poisson's equal mean-variance assumption is more appropriate.
Question 16. Explain why the NHL puck line at +1.5 for underdogs frequently offers value.
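Worked sketch for Question 15. Treating each team's regulation goals as an independent Poisson variable gives regulation win and tie probabilities by summing over scorelines. Independence between the two teams, the truncation at 15 goals, the function names, and the example rates (3.1 and 2.9) are all simplifying assumptions for illustration.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected count is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def regulation_outcomes(lam_home, lam_away, max_goals=15):
    """Return (home win, tie, away win) probabilities at the end of regulation."""
    home = tie = away = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, lam_home) * poisson_pmf(a, lam_away)
            if h > a:
                home += p
            elif h == a:
                tie += p
            else:
                away += p
    return home, tie, away


p_home, p_tie, p_away = regulation_outcomes(3.1, 2.9)
print(f"home {p_home:.3f}, tie {p_tie:.3f}, away {p_away:.3f}")
```

For these rates the independent model gives a regulation tie probability of about 0.17, below the 23-26% overtime rate cited in Question 8, so a practical model would likely need to adjust the tie probability upward.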
Answer
Underdog +1.5 frequently offers value because the cover rate (typically 62-72% for standard matchups) significantly exceeds what the market often implies. The high overtime rate (23-26%) means even underdogs who lose in OT cover +1.5. Additionally, the public tends to overbet favorites, which can inflate the -1.5 favorite price and correspondingly undervalue the +1.5 underdog. When the +1.5 is priced at plus-money (rare but valuable), the implied probability is often well below the true cover probability.
Question 17. What is the difference between regulation win probability and overall win probability in the NHL?
Answer
Regulation win probability is the probability of winning during the 60-minute regulation period. Overall win probability includes overtime and shootout results. The gap between them is the team's share of the games that reach overtime (approximately 23-26% of games), which are split roughly 50/50 between the teams (with a slight edge for the better team or home team). For example, a team with 45% regulation win probability that wins 52% of the ~25% of games reaching OT has an overall win probability of $0.45 + 0.25 \times 0.52 = 0.58$, or 58%.
Question 18. How should an NHL bettor interpret a team with a strong record but low xG differential?
Answer
A team with a strong record but low xG differential is likely overperforming due to unsustainable factors: hot goaltending (high PDO), fortunate shooting percentage, or favorable overtime/shootout outcomes. Their record will likely regress as these factors revert to the mean. For bettors, this team is overvalued by the market (which prices partly on record) and represents a profitable fade opportunity. The xG differential is a better predictor of future performance than the current win-loss record.
Question 19. What is the typical impact of switching from a team's starting goaltender to a backup, expressed in expected goals per game?
Answer
The difference between a team's starter and backup is typically 0.3-0.5 expected goals per game. This is calculated from the difference in regressed GSAx per shot multiplied by shots faced per game (approximately 30). An elite starter might have a regressed GSAx/shot of +0.005 while a mediocre backup might be at -0.005, creating a 0.010 difference per shot, or 0.30 goals per game on 30 shots. This translates to approximately 8-12 cents on the moneyline.
Question 20. Why are NHL betting markets considered less efficient than NFL or NBA markets?
Answer
NHL betting markets are less efficient because: (1) the market is smaller and less liquid, with lower betting limits; (2) fewer sophisticated quantitative bettors focus primarily on hockey; (3) the high randomness of hockey outcomes makes it harder for the market to converge on true probabilities; (4) late-breaking information (goaltender confirmations, injury updates) creates informational asymmetries; and (5) back-to-back scheduling creates predictable performance degradation that the market sometimes underadjusts for.
Question 21. What role does "rebound" play as a feature in xG models?
Answer
A rebound shot is one that follows a previous shot within a short time window (typically 2-3 seconds). Rebounds have significantly higher xG because the goaltender is often out of position after making the initial save. Rebound shots from close range can have xG values of 0.25-0.40, compared to 0.05-0.10 for a typical initial shot from the same location. Including the rebound flag as a feature substantially improves xG model accuracy.
Question 22. Describe calibration in the context of an xG model. Why is isotonic calibration recommended?
Answer
Calibration refers to how well the predicted probabilities match observed frequencies. A well-calibrated model that predicts 10% goal probability on a set of shots should see approximately 10% of those shots result in goals. Isotonic calibration is recommended because: (1) raw logistic regression outputs may not be perfectly calibrated due to class imbalance (goals are ~9% of shots); (2) isotonic calibration applies a non-parametric, monotonic transformation that corrects any systematic miscalibration; and (3) well-calibrated probabilities are essential for aggregation (summing xG values across shots requires each value to be a true probability).
Question 23. What is score-adjusted Corsi, and why does it provide a more accurate picture of team quality than raw Corsi?
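Worked sketch for Question 22. This checks calibration rather than performing isotonic calibration: it bins predicted probabilities and compares the mean prediction in each bin with the observed goal rate. The function name `reliability_table`, the bin count, and the toy data are assumptions for illustration.

```python
def reliability_table(probs, outcomes, n_bins=5):
    """Return (count, mean predicted, observed rate) for each non-empty probability bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # keep p == 1.0 in the last bin
        bins[idx].append((p, y))
    rows = []
    for items in bins:
        if items:
            mean_pred = sum(p for p, _ in items) / len(items)
            observed = sum(y for _, y in items) / len(items)
            rows.append((len(items), mean_pred, observed))
    return rows


# Toy data: ten shots predicted at 0.10 (one goal), four predicted at 0.50 (two goals)
probs = [0.10] * 10 + [0.50] * 4
outcomes = [1] + [0] * 9 + [1, 1, 0, 0]

rows = reliability_table(probs, outcomes)
for count, mean_pred, observed in rows:
    print(f"n={count:2d}  predicted={mean_pred:.2f}  observed={observed:.2f}")
```

In a well-calibrated model the predicted and observed columns agree within sampling noise; systematic gaps are what a calibration step is meant to correct.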
Answer
Score-adjusted Corsi normalizes each shot attempt to the tied-game rate by dividing by the score-state multiplier. When a team is trailing by 2 goals and generates a shot attempt, that attempt is divided by the trailing team's inflated rate multiplier (e.g., 1.10), reducing its contribution. This correction removes the systematic bias where teams that frequently trail appear to have better Corsi than they truly deserve. Score-adjusted Corsi is a better measure of a team's true ability to control play because it reflects how they would perform in a neutral (tied) game state.
Question 24. How do NHL totals markets respond to goaltender announcements, and where does the edge typically lie?
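Worked sketch for Question 23. Each attempt is divided by the multiplier for the score state in which it occurred. Only the 1.10 trailing multiplier comes from the answer; the 0.90 leading multiplier, the three-state simplification, and the attempt counts are hypothetical.

```python
# Attempt-rate multipliers relative to a tied game.
# 1.10 for trailing is the example from the answer; 0.90 for leading is assumed.
MULTIPLIER = {"trailing": 1.10, "tied": 1.00, "leading": 0.90}


def score_adjusted_attempts(attempts_by_state):
    """Rescale shot attempts in each score state to their tied-game equivalent."""
    return sum(count / MULTIPLIER[state] for state, count in attempts_by_state.items())


# A hypothetical team that spent much of the game trailing
attempts = {"trailing": 33, "tied": 20, "leading": 0}

raw_total = sum(attempts.values())
adjusted_total = score_adjusted_attempts(attempts)

print(f"raw: {raw_total}, score-adjusted: {adjusted_total:.1f}")
```

The 33 attempts generated while trailing count as 30 tied-game equivalents, so the adjusted total is lower than the raw total.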
Answer
When a backup goaltender is announced (replacing a significantly better starter), the moneyline typically adjusts but the total often underadjusts. A backup goaltender might be expected to allow 0.3-0.5 more goals than the starter, which should push the total higher. However, the market frequently moves the moneyline by the appropriate amount while adjusting the total by only a portion of the expected impact. This creates a window for over bets when a significantly weaker backup is confirmed. The window is narrow (1-2 hours pre-game), so speed matters.
Question 25. A team has xGF of 3.0 per game and xGA of 2.5 per game, giving an xG differential of +0.5. However, their actual goal differential is only +0.15 per game. Explain the likely cause and project how this will resolve over the remainder of the season.