Chapter 18: Key Takeaways - Modeling the NHL
-
Expected goals (xG) is the cornerstone of modern NHL analytics and the most predictive team-level metric. An xG model assigns a goal probability to each shot based on distance, angle, shot type, and context. By summing shot-level xG, we measure how many goals a team "deserved" based on chance quality, independent of whether the puck actually went in. The xG differential correlates with future points at $r \approx 0.55$--$0.65$, compared to only $r \approx 0.35$--$0.45$ for actual goal differential.
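
The summation step can be sketched in a few lines of Python. This is a minimal illustration, not a full xG model: the `Shot` class, the team labels, and the per-shot probabilities are all hypothetical and stand in for the output of a fitted model.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    team: str
    xg: float  # model-estimated goal probability for this shot (0 to 1)


def team_xg(shots, team):
    """Sum shot-level goal probabilities for one team."""
    return sum(s.xg for s in shots if s.team == team)


# Hypothetical shot log for one game; the xG values are invented for illustration.
shots = [
    Shot("HOME", 0.08), Shot("HOME", 0.31), Shot("HOME", 0.04),
    Shot("AWAY", 0.12), Shot("AWAY", 0.05),
]

home_xg = team_xg(shots, "HOME")
away_xg = team_xg(shots, "AWAY")
xg_diff = home_xg - away_xg  # positive means HOME generated the better chances

print(f"HOME xG {home_xg:.2f}, AWAY xG {away_xg:.2f}, differential {xg_diff:+.2f}")
```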
-
Corsi and Fenwick measure territorial control and stabilize quickly. Shot-attempt differentials (Corsi counts all shot attempts; Fenwick excludes blocked shots) serve as proxies for puck possession and stabilize within 20--25 games. They are useful supplements to xG, particularly for detecting team-quality changes early in the season, before xG has a sufficient sample size.
-
PDO, the sum of a team's shooting percentage and save percentage, is the most mean-reverting metric in hockey and a direct signal for regression bets. Teams with PDO above 102% are almost certainly benefiting from unsustainable luck; teams below 98% are likely unlucky. Simply betting against extreme PDO teams has been historically profitable.
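
As a rough sketch, PDO can be computed from goal and shot counts and compared against the 98% and 102% thresholds above. PDO is conventionally the sum of shooting percentage and save percentage; the function names and the sample counts below are hypothetical.

```python
def pdo(goals_for, shots_for, goals_against, shots_against):
    """PDO on a 100-point scale: shooting percentage plus save percentage."""
    shooting_pct = goals_for / shots_for
    save_pct = 1.0 - goals_against / shots_against
    return 100.0 * (shooting_pct + save_pct)


def pdo_flag(value, high=102.0, low=98.0):
    """Classify a PDO value against the thresholds used in the text."""
    if value > high:
        return "likely to regress downward"
    if value < low:
        return "likely to regress upward"
    return "within normal range"


# Hypothetical team totals, invented for illustration.
team_pdo = pdo(goals_for=70, shots_for=600, goals_against=45, shots_against=620)
print(f"PDO {team_pdo:.1f}: {pdo_flag(team_pdo)}")
```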
-
Score effects fundamentally alter team behavior and must be adjusted for. Leading teams turtle (CF% drops to 43--47%), trailing teams press (CF% rises to 53--57%). Raw shot metrics are heavily contaminated by the score states a team played in. Score-adjusted metrics produce significantly more accurate team quality assessments.
-
Goals Saved Above Expected (GSAx) is the proper metric for goaltender evaluation. Raw save percentage is contaminated by shot quality and small-sample noise. GSAx adjusts for shot difficulty and isolates goaltender skill. However, even GSAx requires heavy regression to the mean.
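
A minimal sketch of the calculation, assuming each shot faced already carries an xG value from some shot-quality model: GSAx is the expected goals on those shots minus the goals actually allowed. The shot list is hypothetical.

```python
def gsax(shots_faced):
    """Goals Saved Above Expected: expected goals faced minus goals allowed.

    shots_faced is a list of (xg, was_goal) pairs.
    """
    expected_goals = sum(xg for xg, _ in shots_faced)
    goals_allowed = sum(1 for _, was_goal in shots_faced if was_goal)
    return expected_goals - goals_allowed


# Hypothetical shots against one goaltender; xG values are invented.
shots_faced = [
    (0.05, False), (0.20, False), (0.10, True),
    (0.35, False), (0.02, False), (0.08, False),
]

result = gsax(shots_faced)  # negative: allowed more goals than the shots warranted
print(f"GSAx: {result:+.2f}")
```

A handful of shots like this says almost nothing about skill, which is why the raw figure needs to be regressed before use.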
-
Goaltender performance requires the heaviest regression of any variable in hockey. The regression constant of approximately 3,000 shots means that even after a full season, a goaltender's observed performance carries only 33--40% weight versus the league-average prior. Small-sample goaltender results (10--15 games) are almost entirely noise.
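
The weighting follows the usual shrinkage form: with $n$ shots faced and a regression constant of 3,000, the observed performance gets weight $n / (n + 3000)$. The sketch below applies that to save percentage. The league-average value of 0.905 is an assumption chosen for illustration, as are the function names and sample inputs.

```python
REGRESSION_SHOTS = 3000   # regression constant from the text
LEAGUE_AVG_SV = 0.905     # assumed league-average save percentage (illustrative)


def observed_weight(shots, k=REGRESSION_SHOTS):
    """Weight placed on the goaltender's own results after `shots` shots."""
    return shots / (shots + k)


def regressed_save_pct(observed_sv, shots, league_sv=LEAGUE_AVG_SV, k=REGRESSION_SHOTS):
    """Blend observed save percentage with the league-average prior."""
    w = observed_weight(shots, k)
    return w * observed_sv + (1.0 - w) * league_sv


# Workloads of 1,500 and 2,000 shots reproduce the 33--40% range;
# 300 shots stands in for a short run of starts.
for shots in (300, 1500, 2000):
    w = observed_weight(shots)
    est = regressed_save_pct(0.930, shots)
    print(f"{shots:>5} shots: weight {w:.0%}, regressed SV% {est:.4f}")
```

At 300 shots the observed results carry under 10% of the weight, so a .930 run barely moves the estimate off the league average.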
-
Goaltender mispricing is the single largest source of exploitable edges in NHL betting. The market overweights recent save percentage, underweights the starter-to-backup gap, and ignores shot quality context. A properly regressed goaltender model, combined with timely monitoring of goaltender confirmations, identifies persistent value.
-
The puck line ($\pm 1.5$) is one of the most interesting NHL betting markets. The 23--26% overtime rate creates a large wedge between moneyline probability and puck line cover probability. A 60% moneyline favorite covers $-1.5$ only about 33--38% of the time. Underdog $+1.5$ cover rates of 62--72% frequently exceed market-implied probabilities.
-
Back-to-back game fatigue is a measurable and exploitable factor. Win rates drop by 5--7 percentage points on the second game of a back-to-back, with additional penalties for travel, overtime the previous night, and backup goaltender starts. The market adjusts but often insufficiently.
-
Special teams contribute meaningfully to game projections. An elite power play facing a poor penalty kill gains 0.3--0.4 expected goals per game beyond even-strength projections. Modeling PP and PK separately, then adjusting each for the strength of the opposing unit, adds predictive value.
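
One simple way to make the opponent adjustment is a multiplicative scaling: scale the team's power-play conversion rate by how leaky the opposing penalty kill is relative to league average. This is an illustrative assumption, not necessarily the adjustment a production model would use, and the rates, the three opportunities per game, and the function names are all hypothetical.

```python
def matchup_pp_rate(team_pp, opp_pk, league_pp):
    """Opponent-adjusted power-play conversion rate (multiplicative adjustment).

    team_pp   : team's power-play conversion rate (goals per opportunity)
    opp_pk    : opponent's penalty-kill success rate
    league_pp : league-average power-play conversion rate
    """
    opp_pk_fail = 1.0 - opp_pk  # rate at which the opponent concedes while shorthanded
    return team_pp * opp_pk_fail / league_pp


def expected_pp_goals(team_pp, opp_pk, league_pp, opportunities):
    """Expected power-play goals given a number of opportunities."""
    return opportunities * matchup_pp_rate(team_pp, opp_pk, league_pp)


league_pp = 0.20  # assumed league-average conversion rate

# Elite power play (26%) against a poor penalty kill (76%), 3 opportunities.
elite_vs_poor = expected_pp_goals(0.26, 0.76, league_pp, opportunities=3.0)
# League-average power play against a league-average penalty kill.
baseline = expected_pp_goals(league_pp, 1.0 - league_pp, league_pp, opportunities=3.0)

print(f"Matchup: {elite_vs_poor:.3f} PP goals, baseline: {baseline:.3f}, "
      f"gain: {elite_vs_poor - baseline:+.3f}")
```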
-
The Poisson distribution is well-suited for NHL goal modeling. Goals are low-frequency events that occur approximately independently, which makes the Poisson a better fit for NHL goals than it is for MLB run scoring (which exhibits overdispersion). The Poisson model produces accurate moneyline, puck line, and totals probabilities from expected goal projections.
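
The sketch below shows how the three market probabilities can fall out of two expected-goal projections, treating each team's regulation goals as independent Poisson variables. Several pieces are assumptions made for illustration: regulation ties are split between the teams by `ot_home_share` (50/50 by default), an overtime or shootout result adds exactly one goal to the winner's score, and the projections of 3.2 and 2.7 goals are invented.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected count is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def game_probs(lam_home, lam_away, total_line=5.5, max_goals=20, ot_home_share=0.5):
    """Moneyline, puck line, and totals probabilities from independent Poissons."""
    home_reg = away_reg = tie = home_by_2 = over = 0.0
    for h in range(max_goals + 1):
        p_h = poisson_pmf(h, lam_home)
        for a in range(max_goals + 1):
            p = p_h * poisson_pmf(a, lam_away)
            if h > a:
                home_reg += p
                if h - a >= 2:
                    home_by_2 += p  # only regulation wins by 2+ cover -1.5
            elif a > h:
                away_reg += p
            else:
                tie += p
            # Assumption: a regulation tie ends with one extra goal for the winner.
            final_total = h + a + (1 if h == a else 0)
            if final_total > total_line:
                over += p
    home_ml = home_reg + ot_home_share * tie
    away_ml = away_reg + (1.0 - ot_home_share) * tie
    return {
        "regulation_tie": tie,
        "home_ml": home_ml,
        "away_ml": away_ml,
        "home_minus_1_5": home_by_2,
        # Away +1.5 covers on any away win, any overtime game, or a one-goal home win.
        "away_plus_1_5": away_reg + tie + (home_reg - home_by_2),
        "over": over,
    }


# Hypothetical expected-goal projections for a single game.
probs = game_probs(lam_home=3.2, lam_away=2.7)
for market, p in probs.items():
    print(f"{market:>15}: {p:.3f}")
```

Because overtime games are always decided by a single goal, every regulation tie counts against the favorite on the $-1.5$ line. That is where the gap between moneyline and puck line probabilities comes from in this model.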
-
NHL betting markets are less efficient than NFL or NBA markets, creating persistent opportunities. The market is smaller, fewer sophisticated bettors focus on hockey, and late-breaking information (goaltender confirmations, schedule effects) creates informational asymmetries that the prepared bettor can exploit.