NHL Draft Analytics
Beginner
Nov 27, 2025
NHL Draft Prediction Models
NHL draft success is notoriously difficult to predict, but analytics can put scouting decisions on a more systematic footing. By analyzing historical draft data and player development patterns, teams can identify which pre-draft metrics are most strongly associated with NHL success and build models that estimate a prospect's chances of becoming an NHL regular.
Key Draft Analytics Components
- Performance Metrics: Points per game, goals, assists, plus/minus
- Physical Attributes: Height, weight, skating speed, age at draft
- League Quality Scores: Adjusting for competition level
- Success Definition: Games played, career longevity, impact metrics
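The league-quality adjustment listed above can be sketched as a simple rescaling of raw scoring rates. This is a minimal illustration rather than an established method: the `LEAGUE_FACTORS` values, the league names, and the `adjusted_points_per_game` helper are all hypothetical placeholders, and real factors would have to be estimated from historical data.

```python
# Hypothetical league factors (illustrative values only, not real estimates):
# the share of a player's scoring rate assumed to carry over to a common
# reference level.
LEAGUE_FACTORS = {
    "League A": 0.45,
    "League B": 0.30,
    "League C": 0.25,
}


def adjusted_points_per_game(points: int, games: int, league: str) -> float:
    """Scale raw points per game by the (assumed) league factor."""
    if games <= 0:
        raise ValueError("games must be positive")
    if league not in LEAGUE_FACTORS:
        raise KeyError(f"no factor defined for league {league!r}")
    return (points / games) * LEAGUE_FACTORS[league]


# Two made-up prospects with different raw scoring rates
prospects = [
    {"name": "Prospect 1", "points": 80, "games": 60, "league": "League B"},
    {"name": "Prospect 2", "points": 40, "games": 50, "league": "League A"},
]

for p in prospects:
    raw = p["points"] / p["games"]
    adj = adjusted_points_per_game(p["points"], p["games"], p["league"])
    print(f"{p['name']}: raw {raw:.2f}, adjusted {adj:.2f}")
```

With these placeholder factors, the gap between the two prospects narrows considerably after adjustment, which is the kind of effect a league quality score is meant to capture.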
Building a Draft Prediction Model
Python: Random Forest Draft Model
import pandas as pd
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
# Load draft and performance data
draft_data = pd.read_csv('nhl_draft_history.csv')
# Feature engineering for draft prediction
features = ['points_per_game', 'goals', 'assists',
            'plus_minus', 'age_at_draft', 'height_inches',
            'weight_lbs', 'league_quality_score']
X = draft_data[features]
# Target: whether the player became an NHL regular (100+ games)
y = draft_data['nhl_success']
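# Split data, stratifying so both sets keep the same success rate
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
# Scale features (random forests do not require scaling; it is kept
# here for consistency with models that do)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)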
# Train draft prediction model
model = RandomForestClassifier(
    n_estimators=200,
    max_depth=10,
    min_samples_split=20,
    random_state=42
)
model.fit(X_train_scaled, y_train)
# Evaluate model
train_score = model.score(X_train_scaled, y_train)
test_score = model.score(X_test_scaled, y_test)
print(f"Training Accuracy: {train_score:.3f}")
print(f"Testing Accuracy: {test_score:.3f}")
# Feature importance for draft evaluation
feature_importance = pd.DataFrame({
    'feature': features,
    'importance': model.feature_importances_
}).sort_values('importance', ascending=False)
print("\nMost Important Draft Factors:")
print(feature_importance)
# Predict success probability for new prospect
new_prospect = pd.DataFrame({
    'points_per_game': [1.45],
    'goals': [42],
    'assists': [38],
    'plus_minus': [15],
    'age_at_draft': [18.2],
    'height_inches': [73],
    'weight_lbs': [195],
    'league_quality_score': [8.5]
})
prospect_scaled = scaler.transform(new_prospect)
# Column 1 of predict_proba holds the probability of the positive class
success_prob = model.predict_proba(prospect_scaled)[0][1]
print(f"\nProspect NHL Success Probability: {success_prob:.1%}")
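If successful players are a minority of the training data, raw accuracy can look strong even for a model that has learned very little. One way to put the training and testing accuracy in context is to compare them with a majority-class baseline. The sketch below uses made-up labels, and `majority_baseline_accuracy` is a hypothetical helper name, not part of scikit-learn.

```python
from collections import Counter


def majority_baseline_accuracy(y_train, y_test):
    """Accuracy of always predicting the most common training label."""
    majority_label = Counter(y_train).most_common(1)[0][0]
    hits = sum(1 for label in y_test if label == majority_label)
    return hits / len(y_test)


# Made-up labels for illustration: 1 = NHL regular, 0 = not
y_train_example = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]
y_test_example = [0, 1, 0, 0, 0]

baseline = majority_baseline_accuracy(y_train_example, y_test_example)
print(f"Majority-class baseline accuracy: {baseline:.3f}")
```

A model whose testing accuracy is no better than this baseline is not yet adding information beyond the overall success rate.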
R: Draft Prediction and Analysis
library(tidyverse)
library(randomForest)
library(caret)
# Load NHL draft data
draft_data <- read_csv("nhl_draft_history.csv")
# Prepare features for modeling
draft_features <- draft_data %>%
  select(points_per_game, goals, assists, plus_minus,
         age_at_draft, height_inches, weight_lbs,
         league_quality_score, nhl_success)
# Split data
set.seed(42)
train_index <- createDataPartition(draft_features$nhl_success,
                                   p = 0.8, list = FALSE)
train_data <- draft_features[train_index, ]
test_data <- draft_features[-train_index, ]
# Train random forest model
rf_model <- randomForest(
  as.factor(nhl_success) ~ .,
  data = train_data,
  ntree = 200,
  mtry = 3,
  importance = TRUE
)
# Model performance
train_pred <- predict(rf_model, train_data)
test_pred <- predict(rf_model, test_data)
train_acc <- mean(train_pred == train_data$nhl_success)
test_acc <- mean(test_pred == test_data$nhl_success)
cat(sprintf("Training Accuracy: %.3f\n", train_acc))
cat(sprintf("Testing Accuracy: %.3f\n", test_acc))
# Variable importance
importance_df <- importance(rf_model) %>%
  as.data.frame() %>%
  rownames_to_column("variable") %>%
  arrange(desc(MeanDecreaseGini))
print("Most Important Draft Factors:")
print(importance_df)
# Predict for new prospect
new_prospect <- data.frame(
  points_per_game = 1.45,
  goals = 42,
  assists = 38,
  plus_minus = 15,
  age_at_draft = 18.2,
  height_inches = 73,
  weight_lbs = 195,
  league_quality_score = 8.5
)
success_prob <- predict(rf_model, new_prospect, type = "prob")
cat(sprintf("\nProspect NHL Success Probability: %.1f%%\n",
            success_prob[2] * 100))
# Visualize feature importance
ggplot(importance_df[1:8, ],
       aes(x = reorder(variable, MeanDecreaseGini),
           y = MeanDecreaseGini)) +
  geom_col(fill = "steelblue") +
  coord_flip() +
  labs(title = "NHL Draft Prediction - Feature Importance",
       x = "Feature", y = "Importance Score") +
  theme_minimal()
Draft Round Analysis
Historical success rates vary dramatically by draft round. First-round picks have roughly a 60% chance of playing 100+ NHL games, while seventh-round picks reach that mark only about 10% of the time. These baseline rates give context for individual predictions: a 25% predicted probability is weak for a first-round pick but well above the norm for a seventh-rounder.
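The per-round baselines described above can be computed directly from historical picks and then used to judge a model's output. The sketch below is illustrative only: the `(draft_round, nhl_success)` pairs are made up to mirror the rates quoted above, and `success_rate_by_round` and `model_probability` are hypothetical names.

```python
from collections import defaultdict


def success_rate_by_round(picks):
    """Return {round: share of picks flagged as NHL regulars}."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    for draft_round, became_regular in picks:
        totals[draft_round] += 1
        successes[draft_round] += int(became_regular)
    return {r: successes[r] / totals[r] for r in sorted(totals)}


# Made-up (draft_round, nhl_success) pairs, for illustration only
example_picks = [
    (1, 1), (1, 1), (1, 0), (1, 1), (1, 0),
    (7, 0), (7, 0), (7, 0), (7, 1), (7, 0),
    (7, 0), (7, 0), (7, 0), (7, 0), (7, 0),
]

rates = success_rate_by_round(example_picks)
for draft_round, rate in rates.items():
    print(f"Round {draft_round}: {rate:.0%} reached the success threshold")

# Compare a model's probability for a seventh-round prospect
# with that round's baseline rate
model_probability = 0.25  # hypothetical model output
lift = model_probability / rates[7]
print(f"Lift over round-7 baseline: {lift:.2f}x")
```

Reporting lift alongside the raw probability makes it easier to spot late-round prospects the model rates unusually highly for their draft position.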
Key Predictive Factors
- Points per game is typically the strongest predictor
- Age at draft matters: among players with similar stats, younger ones tend to have higher upside
- League quality adjustments are critical for fair comparisons
- Physical attributes have moderate predictive power