WNBA vs NBA Analytical Differences
Beginner
Nov 27, 2025
WNBA vs NBA: Understanding Game Differences for Better Analysis
1. Key Structural Differences
Understanding the fundamental differences between WNBA and NBA games is crucial for accurate cross-league analysis and fair comparisons.
Game Structure
| Aspect | WNBA | NBA | Impact on Analysis |
|---|---|---|---|
| Game Length | 40 minutes (4×10 min) | 48 minutes (4×12 min) | NBA games are 20% longer, which inflates per-game totals |
| Ball Size | 28.5-inch circumference (Size 6) | 29.5-inch circumference (Size 7) | Affects shooting mechanics and percentages |
| 3-Point Line | 22 feet 1.75 inches | 23 feet 9 inches (corners: 22 ft) | Distance affects shot selection and efficiency |
| Regular Season | 40 games (2023 and 2024; 44 in 2025) | 82 games | Sample size and fatigue considerations |
| Shot Clock | 24 seconds | 24 seconds | Similar pace pressure |
| Roster Size | 12 active players | 15 active players | Depth and rotation patterns |
2. Statistical Differences and Adjustments
Per-Game vs Per-Minute Metrics
The 20% difference in game length makes per-game statistics misleading when comparing across leagues. Always normalize to per-36-minute or per-100-possession rates.
Python
statistical_normalization.py
import pandas as pd
import numpy as np


class LeagueStatNormalizer:
    """
    Normalize statistics between WNBA and NBA for fair comparison.
    """

    def __init__(self):
        self.wnba_game_length = 40  # minutes
        self.nba_game_length = 48  # minutes

    def normalize_per_game_stats(self, stats_df, league, target_minutes=36):
        """
        Convert per-game stats to per-minute rates.
        Parameters:
        -----------
        stats_df : pd.DataFrame
            DataFrame with columns: player, points, rebounds, assists, minutes
        league : str
            'WNBA' or 'NBA'
        target_minutes : int
            Target minutes for normalization (default: 36)
        Returns:
        --------
        pd.DataFrame : Normalized statistics
        """
        df = stats_df.copy()
        # Calculate per-minute rates
        stat_cols = ['points', 'rebounds', 'assists', 'steals', 'blocks', 'turnovers']
        for col in stat_cols:
            if col in df.columns:
                # Per-minute rate
                df[f'{col}_per_min'] = df[col] / df['minutes']
                # Normalize to target minutes
                df[f'{col}_per_{target_minutes}'] = df[f'{col}_per_min'] * target_minutes
        return df

    def calculate_per_possession_stats(self, team_stats_df):
        """
        Calculate per-100-possession statistics.
        Parameters:
        -----------
        team_stats_df : pd.DataFrame
            Team stats with possessions, points, field_goal_attempts, etc.
        Returns:
        --------
        pd.DataFrame : Per-100-possession statistics
        """
        df = team_stats_df.copy()
        # Estimate possessions if not provided
        if 'possessions' not in df.columns:
            df['possessions'] = (
                df['field_goal_attempts'] -
                df['offensive_rebounds'] +
                df['turnovers'] +
                0.44 * df['free_throw_attempts']
            )
        # Calculate per-100-possession rates
        df['offensive_rating'] = (df['points'] / df['possessions']) * 100
        # Pace as possessions per 48 minutes. 'minutes' is assumed to be total
        # team player-minutes (240 for a regulation NBA game, 200 for a WNBA
        # game), so dividing by 5 gives game minutes.
        df['pace'] = 48 * df['possessions'] / (df['minutes'] / 5)
        df['true_shooting_pct'] = (
            df['points'] / (2 * (df['field_goal_attempts'] +
                                 0.44 * df['free_throw_attempts']))
        )
        return df

    def adjust_for_league_context(self, player_stats, league_avg_stats, league):
        """
        Adjust individual stats relative to league averages.
        Parameters:
        -----------
        player_stats : pd.Series or dict
            Player's statistics
        league_avg_stats : pd.Series or dict
            League average statistics
        league : str
            'WNBA' or 'NBA'
        Returns:
        --------
        dict : Context-adjusted statistics
        """
        adjusted = {}
        # Calculate relative performance
        adjusted['relative_ppg'] = (
            player_stats['points'] / league_avg_stats['points']
        )
        adjusted['relative_ts_pct'] = (
            player_stats['true_shooting_pct'] / league_avg_stats['true_shooting_pct']
        )
        adjusted['relative_usage'] = (
            player_stats['usage_rate'] / league_avg_stats['usage_rate']
        )
        # Z-scores for standardization
        for stat in ['points', 'rebounds', 'assists', 'player_efficiency_rating']:
            if stat in player_stats and f'{stat}_std' in league_avg_stats:
                adjusted[f'{stat}_zscore'] = (
                    (player_stats[stat] - league_avg_stats[stat]) /
                    league_avg_stats[f'{stat}_std']
                )
        return adjusted


# Example usage
if __name__ == "__main__":
    # Sample WNBA data
    wnba_data = pd.DataFrame({
        'player': ["A'ja Wilson", 'Breanna Stewart', 'Sabrina Ionescu'],
        'points': [22.8, 21.2, 19.2],
        'rebounds': [9.5, 8.3, 5.2],
        'assists': [2.3, 3.8, 6.3],
        'minutes': [34.6, 35.2, 33.8]
    })
    # Sample NBA data
    nba_data = pd.DataFrame({
        'player': ['Luka Doncic', 'Giannis Antetokounmpo', 'Joel Embiid'],
        'points': [33.9, 31.1, 33.1],
        'rebounds': [9.2, 11.8, 10.2],
        'assists': [9.8, 5.7, 4.2],
        'minutes': [37.0, 35.0, 34.7]
    })
    normalizer = LeagueStatNormalizer()
    # Normalize both leagues to per-36 minutes
    wnba_normalized = normalizer.normalize_per_game_stats(wnba_data, 'WNBA')
    nba_normalized = normalizer.normalize_per_game_stats(nba_data, 'NBA')
    print("WNBA Stats (Per-36 Minutes):")
    print(wnba_normalized[['player', 'points_per_36', 'rebounds_per_36', 'assists_per_36']])
    print("\nNBA Stats (Per-36 Minutes):")
    print(nba_normalized[['player', 'points_per_36', 'rebounds_per_36', 'assists_per_36']])
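The possession estimate used for the per-100-possession rates can be checked by hand. The sketch below applies the same box-score formula (FGA - OREB + TOV + 0.44 × FTA) to a single game; the team totals and the helper names `estimate_possessions` and `per_100_possessions` are hypothetical, chosen only for illustration.

```python
def estimate_possessions(fga, oreb, tov, fta):
    """Box-score possession estimate: FGA - OREB + TOV + 0.44 * FTA."""
    return fga - oreb + tov + 0.44 * fta


def per_100_possessions(stat_total, possessions):
    """Scale a counting stat to a per-100-possession rate."""
    return stat_total / possessions * 100


# Hypothetical single-game team totals (illustrative, not real data).
game = {"points": 84, "fga": 68, "oreb": 8, "tov": 13, "fta": 20}

poss = estimate_possessions(game["fga"], game["oreb"], game["tov"], game["fta"])
off_rating = per_100_possessions(game["points"], poss)

print(f"Estimated possessions: {poss:.1f}")
print(f"Offensive rating: {off_rating:.1f} points per 100 possessions")
```

Because the rate is expressed per possession, it does not depend on whether the game lasted 40 or 48 minutes, which is why it is usually the safer basis for cross-league comparison.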
3. Pace and Efficiency Differences
League-Wide Pace Comparison
Pace varies significantly between leagues and across seasons. Understanding these differences is essential for proper statistical context.
Python
pace_efficiency_analysis.py
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns


class PaceEfficiencyAnalyzer:
    """
    Analyze pace and efficiency differences between WNBA and NBA.
    """

    def __init__(self):
        self.wnba_game_minutes = 40
        self.nba_game_minutes = 48

    def calculate_pace(self, team_stats):
        """
        Calculate team pace (possessions per 48 minutes).
        Pace = 48 * Team Possessions / (Team Minutes / 5)
        The standard formula averages team and opponent possessions;
        this version uses the team's own possession estimate only.
        """
        possessions = (
            team_stats['fga'] -
            team_stats['oreb'] +
            team_stats['tov'] +
            0.44 * team_stats['fta']
        )
        # 'minutes' is assumed to be total team player-minutes (240 for a
        # regulation 48-minute game), so dividing by 5 gives game minutes.
        pace = 48 * possessions / (team_stats['minutes'] / 5)
        return pace

    def calculate_four_factors(self, team_stats):
        """
        Calculate Dean Oliver's Four Factors of Basketball Success.
        Returns:
        --------
        dict : Four factors (shooting, turnovers, rebounding, free throws)
        """
        # Effective Field Goal Percentage
        efg_pct = (team_stats['fgm'] + 0.5 * team_stats['fg3m']) / team_stats['fga']
        # Turnover Rate
        tov_rate = team_stats['tov'] / (
            team_stats['fga'] + 0.44 * team_stats['fta'] + team_stats['tov']
        )
        # Offensive Rebound Rate
        oreb_rate = team_stats['oreb'] / (
            team_stats['oreb'] + team_stats['opponent_dreb']
        )
        # Free Throw Rate
        ft_rate = team_stats['ftm'] / team_stats['fga']
        return {
            'efg_pct': efg_pct,
            'tov_rate': tov_rate,
            'oreb_rate': oreb_rate,
            'ft_rate': ft_rate
        }

    def compare_league_efficiency(self, wnba_teams, nba_teams):
        """
        Compare efficiency metrics between leagues.
        Parameters:
        -----------
        wnba_teams : pd.DataFrame
            WNBA team statistics
        nba_teams : pd.DataFrame
            NBA team statistics
        Returns:
        --------
        pd.DataFrame : Comparison summary
        """
        wnba_efficiency = self._calculate_league_metrics(wnba_teams, 'WNBA')
        nba_efficiency = self._calculate_league_metrics(nba_teams, 'NBA')
        comparison = pd.DataFrame({
            'Metric': ['Avg Pace', 'Avg eFG%', 'Avg TOV%', 'Avg ORB%',
                       'Avg FT Rate', 'Avg PPG', 'Avg 3PT%'],
            'WNBA': [
                wnba_efficiency['pace'],
                wnba_efficiency['efg_pct'],
                wnba_efficiency['tov_rate'],
                wnba_efficiency['oreb_rate'],
                wnba_efficiency['ft_rate'],
                wnba_efficiency['ppg'],
                wnba_efficiency['fg3_pct']
            ],
            'NBA': [
                nba_efficiency['pace'],
                nba_efficiency['efg_pct'],
                nba_efficiency['tov_rate'],
                nba_efficiency['oreb_rate'],
                nba_efficiency['ft_rate'],
                nba_efficiency['ppg'],
                nba_efficiency['fg3_pct']
            ]
        })
        comparison['Difference'] = comparison['NBA'] - comparison['WNBA']
        comparison['Pct_Diff'] = (
            (comparison['NBA'] - comparison['WNBA']) / comparison['WNBA'] * 100
        )
        return comparison

    def _calculate_league_metrics(self, teams_df, league):
        """Calculate aggregate league metrics."""
        return {
            'pace': teams_df['pace'].mean(),
            'efg_pct': teams_df['efg_pct'].mean(),
            'tov_rate': teams_df['tov_rate'].mean(),
            'oreb_rate': teams_df['oreb_rate'].mean(),
            'ft_rate': teams_df['ft_rate'].mean(),
            'ppg': teams_df['points'].mean(),
            'fg3_pct': teams_df['fg3_pct'].mean()
        }

    def analyze_style_differences(self, wnba_df, nba_df):
        """
        Analyze playing style differences between leagues.
        Returns:
        --------
        dict : Style comparison metrics
        """
        style_comparison = {
            'wnba_3pt_rate': (wnba_df['fg3a'].sum() / wnba_df['fga'].sum()),
            'nba_3pt_rate': (nba_df['fg3a'].sum() / nba_df['fga'].sum()),
            'wnba_midrange_rate': self._estimate_midrange_rate(wnba_df),
            'nba_midrange_rate': self._estimate_midrange_rate(nba_df),
            'wnba_paint_rate': (wnba_df['points_in_paint'].sum() /
                                wnba_df['points'].sum()),
            'nba_paint_rate': (nba_df['points_in_paint'].sum() /
                               nba_df['points'].sum())
        }
        return style_comparison

    def _estimate_midrange_rate(self, df):
        """Estimate mid-range attempt rate."""
        fg2a = df['fga'] - df['fg3a']
        # Estimate: ~60% of 2PA are mid-range (rough approximation)
        return 0.6 * (fg2a.sum() / df['fga'].sum())


# Example analysis
if __name__ == "__main__":
    analyzer = PaceEfficiencyAnalyzer()
    # Typical league averages (2023-24 season estimates)
    wnba_avg = {
        'pace': 79.5,
        'efg_pct': 0.495,
        'tov_rate': 0.145,
        'oreb_rate': 0.265,
        'ft_rate': 0.225,
        'ppg': 83.2,
        'fg3_pct': 0.349
    }
    nba_avg = {
        'pace': 99.8,
        'efg_pct': 0.555,
        'tov_rate': 0.127,
        'oreb_rate': 0.238,
        'ft_rate': 0.242,
        'ppg': 114.8,
        'fg3_pct': 0.367
    }
    print("League Comparison (2023-24 Season):")
    # The WNBA pace estimate is assumed to be per 40-minute game and the NBA
    # figure per 48 minutes, so this raw gap mixes game length with tempo.
    print(f"Pace Difference: NBA averages {(nba_avg['pace'] - wnba_avg['pace']):.1f} more possessions per game")
    print(f"eFG% Difference: NBA is {(nba_avg['efg_pct'] - wnba_avg['efg_pct'])*100:.1f} percentage points higher")
    print(f"3PT% Difference: NBA is {(nba_avg['fg3_pct'] - wnba_avg['fg3_pct'])*100:.1f} percentage points higher")
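Before reading the pace gap as a difference in tempo, it helps to put both figures on the same time base. The sketch below assumes the article's WNBA estimate (79.5) is possessions per 40-minute game and the NBA estimate (99.8) is per 48-minute game; the helper name `rescale_pace` is hypothetical.

```python
def rescale_pace(pace, from_minutes, to_minutes):
    """Convert possessions per `from_minutes` to possessions per `to_minutes`."""
    return pace * to_minutes / from_minutes


# League estimates from the example above; the time bases are an assumption.
wnba_pace_per_40 = 79.5
nba_pace_per_48 = 99.8

wnba_pace_per_48 = rescale_pace(wnba_pace_per_40, from_minutes=40, to_minutes=48)
raw_gap = nba_pace_per_48 - wnba_pace_per_40
same_base_gap = nba_pace_per_48 - wnba_pace_per_48

print(f"WNBA pace per 48 minutes: {wnba_pace_per_48:.1f}")
print(f"Raw gap: {raw_gap:.1f} possessions")
print(f"Gap on a 48-minute basis: {same_base_gap:.1f} possessions")
```

Under that assumption, most of the raw gap comes from the extra eight minutes of game time rather than from a faster tempo.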
4. Advanced Normalization with R
Statistical Modeling for Cross-League Comparison
R provides powerful tools for statistical modeling and normalization across different league contexts.
R
cross_league_normalization.R
# Cross-League Statistical Normalization
library(dplyr)
library(ggplot2)
library(scales)

# Function to normalize statistics across leagues
normalize_cross_league <- function(wnba_data, nba_data) {
  # Z-score normalization within each league
  wnba_normalized <- wnba_data %>%
    mutate(
      points_z = scale(points)[, 1],
      rebounds_z = scale(rebounds)[, 1],
      assists_z = scale(assists)[, 1],
      efficiency_z = scale(player_efficiency_rating)[, 1],
      league = "WNBA"
    )
  nba_normalized <- nba_data %>%
    mutate(
      points_z = scale(points)[, 1],
      rebounds_z = scale(rebounds)[, 1],
      assists_z = scale(assists)[, 1],
      efficiency_z = scale(player_efficiency_rating)[, 1],
      league = "NBA"
    )
  # Combine datasets
  combined <- bind_rows(wnba_normalized, nba_normalized)
  return(combined)
}

# Function to calculate percentile ranks within league
calculate_percentile_ranks <- function(player_stats, league_stats) {
  percentiles <- data.frame(
    player = player_stats$player,
    points_percentile = ecdf(league_stats$points)(player_stats$points) * 100,
    rebounds_percentile = ecdf(league_stats$rebounds)(player_stats$rebounds) * 100,
    assists_percentile = ecdf(league_stats$assists)(player_stats$assists) * 100,
    efficiency_percentile = ecdf(league_stats$player_efficiency_rating)(
      player_stats$player_efficiency_rating
    ) * 100
  )
  return(percentiles)
}

# Function to adjust for pace differences
adjust_for_pace <- function(stats_df, league_pace, target_pace = 100) {
  pace_factor <- target_pace / league_pace
  # any_of() skips columns that are missing, so data frames without a
  # turnovers column (like the samples below) do not raise an error
  adjusted <- stats_df %>%
    mutate(
      across(
        any_of(c("points", "rebounds", "assists", "turnovers")),
        ~ .x * pace_factor,
        .names = "{.col}_pace_adj"
      )
    )
  return(adjusted)
}

# Function to create composite performance score
# Expects z-score columns, including ts_pct_z, which
# normalize_cross_league() does not create and must be added separately
create_composite_score <- function(player_data) {
  # Weighted composite based on z-scores
  weights <- list(
    points = 0.30,
    rebounds = 0.20,
    assists = 0.25,
    efficiency = 0.15,
    shooting_pct = 0.10
  )
  composite <- player_data %>%
    mutate(
      composite_score = (
        points_z * weights$points +
          rebounds_z * weights$rebounds +
          assists_z * weights$assists +
          efficiency_z * weights$efficiency +
          ts_pct_z * weights$shooting_pct
      )
    )
  return(composite)
}

# Function to analyze shooting efficiency differences
analyze_shooting_efficiency <- function(wnba_shots, nba_shots) {
  # Calculate true shooting percentage
  calculate_ts <- function(pts, fga, fta) {
    ts_pct <- pts / (2 * (fga + 0.44 * fta))
    return(ts_pct)
  }
  wnba_shooting <- wnba_shots %>%
    mutate(
      ts_pct = calculate_ts(points, fga, fta),
      efg_pct = (fgm + 0.5 * fg3m) / fga,
      fg3_rate = fg3a / fga,
      league = "WNBA"
    )
  nba_shooting <- nba_shots %>%
    mutate(
      ts_pct = calculate_ts(points, fga, fta),
      efg_pct = (fgm + 0.5 * fg3m) / fga,
      fg3_rate = fg3a / fga,
      league = "NBA"
    )
  # Summary statistics
  shooting_summary <- bind_rows(wnba_shooting, nba_shooting) %>%
    group_by(league) %>%
    summarise(
      avg_ts_pct = mean(ts_pct, na.rm = TRUE),
      avg_efg_pct = mean(efg_pct, na.rm = TRUE),
      avg_fg3_rate = mean(fg3_rate, na.rm = TRUE),
      median_ts_pct = median(ts_pct, na.rm = TRUE),
      sd_ts_pct = sd(ts_pct, na.rm = TRUE)
    )
  return(shooting_summary)
}

# Function to model performance relative to league context
model_relative_performance <- function(player_stats, league_context) {
  # Linear regression approach
  # Turnovers enter with "+": in an R formula, "- turnovers" would drop the
  # term from the model; the fitted coefficient carries the sign instead
  model <- lm(
    player_efficiency_rating ~ points + rebounds + assists +
      steals + blocks + turnovers,
    data = league_context
  )
  # Predict expected performance
  player_stats$expected_per <- predict(model, newdata = player_stats)
  player_stats$per_above_expected <- player_stats$player_efficiency_rating -
    player_stats$expected_per
  return(player_stats)
}

# Function to adjust for minutes played
adjust_for_minutes <- function(stats_df, target_minutes = 36) {
  adjusted <- stats_df %>%
    mutate(
      across(
        c(points, rebounds, assists, steals, blocks, turnovers),
        ~ .x * (target_minutes / minutes),
        .names = "{.col}_per36"
      )
    )
  return(adjusted)
}

# Example usage
set.seed(123)
# Sample WNBA data
wnba_players <- data.frame(
  player = c("A'ja Wilson", "Breanna Stewart", "Sabrina Ionescu"),
  points = c(22.8, 21.2, 19.2),
  rebounds = c(9.5, 8.3, 5.2),
  assists = c(2.3, 3.8, 6.3),
  minutes = c(34.6, 35.2, 33.8),
  player_efficiency_rating = c(29.5, 26.8, 24.3)
)
# Sample NBA data
nba_players <- data.frame(
  player = c("Nikola Jokic", "Giannis Antetokounmpo", "Luka Doncic"),
  points = c(26.4, 31.1, 33.9),
  rebounds = c(12.4, 11.8, 9.2),
  assists = c(9.0, 5.7, 9.8),
  minutes = c(34.6, 35.0, 37.0),
  player_efficiency_rating = c(31.7, 30.8, 28.7)
)
# Normalize within each league, then combine
normalized_data <- normalize_cross_league(wnba_players, nba_players)
# Adjust for typical league pace
wnba_pace_adj <- adjust_for_pace(wnba_players, league_pace = 79.5, target_pace = 100)
nba_pace_adj <- adjust_for_pace(nba_players, league_pace = 99.8, target_pace = 100)
print("Cross-League Normalized Z-Scores:")
print(normalized_data %>% select(player, league, points_z, rebounds_z, assists_z))
# cat() interprets "\n"; print() would show the escape sequence literally.
# These are per-game stats rescaled to a 100-possession pace, not true
# per-100-possession rates.
cat("\nPace-Adjusted Statistics (scaled to a 100-possession pace):\n")
print(wnba_pace_adj %>% select(player, points_pace_adj, rebounds_pace_adj, assists_pace_adj))
5. Physical and Stylistic Differences
Key Playing Style Distinctions
- Three-Point Shooting: NBA teams attempt 3-pointers at a higher rate (~40% of all FGA vs ~30% in WNBA), but WNBA has seen rapid growth in 3PT attempts
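- Pace: NBA averages ~100 possessions per 48-minute game vs ~80 per 40-minute game in the WNBA; scaled to 48 minutes the WNBA figure is roughly 96, so most of the raw gap reflects game length rather than tempo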
- Physical Attributes: Different athletic profiles affect rim protection, rebounding, and transition play
- Shot Selection: WNBA features more mid-range shots and structured offense; NBA more transition and isolation
- Defensive Schemes: Both leagues use similar schemes, but execution differs based on athleticism and spacing
Impact on Analytics
Python
style_impact_analysis.py
import pandas as pd
import numpy as np


class StyleImpactAnalyzer:
    """
    Analyze how style differences impact statistical interpretation.
    """

    def __init__(self):
        self.style_factors = {
            'wnba': {
                'three_pt_rate': 0.30,
                'transition_rate': 0.15,
                'isolation_rate': 0.08,
                'post_up_rate': 0.12,
                'pick_and_roll_rate': 0.35
            },
            'nba': {
                'three_pt_rate': 0.40,
                'transition_rate': 0.18,
                'isolation_rate': 0.13,
                'post_up_rate': 0.08,
                'pick_and_roll_rate': 0.30
            }
        }

    def calculate_expected_efficiency(self, shot_distribution, league='wnba'):
        """
        Calculate expected efficiency based on shot distribution.
        Parameters:
        -----------
        shot_distribution : dict
            Distribution of shot types (rim, midrange, three_pt)
        league : str
            'wnba' or 'nba'
        Returns:
        --------
        float : Expected points per shot
        """
        # League-average efficiency by shot type
        efficiency_map = {
            'wnba': {
                'rim': 1.20,  # ~60% FG
                'midrange': 0.78,  # ~39% FG
                'three_pt': 1.05  # ~35% 3PT
            },
            'nba': {
                'rim': 1.30,  # ~65% FG
                'midrange': 0.82,  # ~41% FG
                'three_pt': 1.10  # ~36.7% 3PT
            }
        }
        expected_pps = 0
        for shot_type, rate in shot_distribution.items():
            expected_pps += rate * efficiency_map[league][shot_type]
        return expected_pps

    def adjust_for_play_style(self, player_stats, team_style, league):
        """
        Adjust player stats based on team playing style.
        Parameters:
        -----------
        player_stats : dict
            Player statistics
        team_style : dict
            Team style characteristics (pace, three_pt_rate, etc.)
        league : str
            'wnba' or 'nba'
        Returns:
        --------
        dict : Style-adjusted statistics
        """
        adjusted = player_stats.copy()
        # Adjust for pace
        league_avg_pace = 79.5 if league == 'wnba' else 99.8
        pace_adjustment = team_style['pace'] / league_avg_pace
        adjusted['pace_neutral_points'] = player_stats['points'] / pace_adjustment
        adjusted['pace_neutral_assists'] = player_stats['assists'] / pace_adjustment
        # Adjust for usage in team context
        if team_style['three_pt_rate'] > self.style_factors[league]['three_pt_rate']:
            # High 3PT team - adjust expectations
            adjusted['expected_3pt_makes'] = (
                player_stats['three_pt_attempts'] *
                (team_style['team_three_pt_pct'] + 0.02)  # Player benefit
            )
        return adjusted

    def compare_player_contexts(self, player_a, player_b,
                                team_a_style, team_b_style):
        """
        Compare two players accounting for different team contexts.
        Returns contextual analysis of performance differences.
        """
        # Calculate context-adjusted metrics
        context_analysis = {
            'raw_comparison': {
                'ppg_diff': player_a['points'] - player_b['points'],
                'apg_diff': player_a['assists'] - player_b['assists']
            },
            'pace_adjusted': {
                'ppg_diff': (
                    player_a['points'] / team_a_style['pace'] * 100 -
                    player_b['points'] / team_b_style['pace'] * 100
                ),
                'apg_diff': (
                    player_a['assists'] / team_a_style['pace'] * 100 -
                    player_b['assists'] / team_b_style['pace'] * 100
                )
            },
            'style_impact': {
                'player_a_benefit': self._calculate_style_benefit(
                    player_a, team_a_style
                ),
                'player_b_benefit': self._calculate_style_benefit(
                    player_b, team_b_style
                )
            }
        }
        return context_analysis

    def _calculate_style_benefit(self, player, team_style):
        """Calculate how much a player benefits from team style."""
        benefit_score = 0
        # High-volume shooter on a high-3PT team
        if player.get('three_pt_rate', 0) > 0.35 and team_style['three_pt_rate'] > 0.35:
            benefit_score += 0.15
        # Ball-handler in high-pace team
        if player.get('usage_rate', 0) > 0.25 and team_style['pace'] > 95:
            benefit_score += 0.10
        return benefit_score


# Example usage
if __name__ == "__main__":
    analyzer = StyleImpactAnalyzer()
    # Example: Compare A'ja Wilson (WNBA) context vs Giannis (NBA) context
    aja_stats = {
        'points': 22.8,
        'assists': 2.3,
        'three_pt_attempts': 2.1,
        'usage_rate': 0.28
    }
    giannis_stats = {
        'points': 31.1,
        'assists': 5.7,
        'three_pt_attempts': 2.6,
        'usage_rate': 0.35
    }
    aces_style = {
        'pace': 82.3,
        'three_pt_rate': 0.32,
        'team_three_pt_pct': 0.357
    }
    bucks_style = {
        'pace': 98.5,
        'three_pt_rate': 0.41,
        'team_three_pt_pct': 0.376
    }
    # Adjust for team contexts
    aja_adjusted = analyzer.adjust_for_play_style(aja_stats, aces_style, 'wnba')
    giannis_adjusted = analyzer.adjust_for_play_style(giannis_stats, bucks_style, 'nba')
    print("Pace-Neutral Comparison:")
    print(f"A'ja Wilson: {aja_adjusted['pace_neutral_points']:.1f} points")
    print(f"Giannis: {giannis_adjusted['pace_neutral_points']:.1f} points")
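The example above does not exercise the expected-efficiency calculation, so here is a standalone sketch of the same idea. It reuses the article's points-per-shot estimates by shot type; the shot mix, the `POINTS_PER_SHOT` constant, and the `expected_points_per_shot` helper are hypothetical names and values chosen for illustration.

```python
# Points-per-shot estimates by shot type, copied from the article's efficiency map.
POINTS_PER_SHOT = {
    "wnba": {"rim": 1.20, "midrange": 0.78, "three_pt": 1.05},
    "nba": {"rim": 1.30, "midrange": 0.82, "three_pt": 1.10},
}


def expected_points_per_shot(shot_distribution, league):
    """Weight each shot type's points per shot by its share of attempts."""
    total_share = sum(shot_distribution.values())
    if abs(total_share - 1.0) > 1e-6:
        raise ValueError("shot shares should sum to 1")
    return sum(
        share * POINTS_PER_SHOT[league][shot_type]
        for shot_type, share in shot_distribution.items()
    )


# Hypothetical shot mix: 35% at the rim, 30% mid-range, 35% from three.
shot_mix = {"rim": 0.35, "midrange": 0.30, "three_pt": 0.35}

wnba_pps = expected_points_per_shot(shot_mix, "wnba")
nba_pps = expected_points_per_shot(shot_mix, "nba")

print(f"Expected points per shot (WNBA baseline): {wnba_pps:.3f}")
print(f"Expected points per shot (NBA baseline): {nba_pps:.3f}")
```

Holding the shot mix fixed isolates the effect of league-level conversion rates, which is one way to separate shot selection from shot making.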
6. Applying Analytics Across Both Leagues
Best Practices for Cross-League Analysis
1. Always Normalize for Game Length
- Use per-36-minute or per-40-minute stats for direct comparison
- Better yet, use per-100-possession rates to account for pace
- Never compare raw per-game statistics directly
2. Account for League Context
- Compare players to their own league averages (z-scores or percentiles)
- Consider league-wide efficiency differences
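- Recognize that the same number means different things: using the league estimates above (49.5% vs 55.5% eFG), a 50% eFG is about average in the WNBA but well below average in the NBA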
3. Respect Style Differences
- WNBA players excel in fundamental skills and mid-range game
- NBA features more athleticism-dependent plays
- Different optimal strategies don't imply superiority
4. Use Advanced Metrics Carefully
- Player Efficiency Rating (PER) formulas may need league-specific adjustments
- Win Shares and VORP require league-average baselines
- Box Plus/Minus models should be trained separately for each league
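The league-relative comparison in practice 2 can be sketched in pandas as a counterpart to the R functions shown earlier. The player labels and per-36 values below are made up for illustration, and the column names are assumptions rather than a fixed schema.

```python
import pandas as pd

# Hypothetical per-36 scoring figures; not real player data.
players = pd.DataFrame({
    "player": ["W1", "W2", "W3", "W4", "N1", "N2", "N3", "N4"],
    "league": ["WNBA"] * 4 + ["NBA"] * 4,
    "points_per_36": [20.0, 16.0, 12.0, 8.0, 28.0, 24.0, 18.0, 14.0],
})

grouped = players.groupby("league")["points_per_36"]

# Z-score: distance from the player's own league mean, in league standard deviations.
players["points_z"] = (
    players["points_per_36"] - grouped.transform("mean")
) / grouped.transform("std")

# Percentile rank within the player's own league (100 = league leader).
players["points_pctile"] = grouped.rank(pct=True) * 100

print(players.sort_values("points_z", ascending=False))
```

Because both columns are computed inside each league, a WNBA and an NBA player with the same z-score stand equally far above their respective peers, regardless of raw scoring levels.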
Practical Example: Building a Cross-League Rating System
Python
cross_league_rating_system.py
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler


class CrossLeagueRatingSystem:
    """
    Build a rating system that works across WNBA and NBA.
    """

    def __init__(self):
        self.scaler = StandardScaler()
        self.league_factors = {
            'wnba': {
                'game_length': 40,
                'avg_pace': 79.5,
                'avg_efficiency': 0.495
            },
            'nba': {
                'game_length': 48,
                'avg_pace': 99.8,
                'avg_efficiency': 0.555
            }
        }

    def calculate_universal_rating(self, player_stats, league):
        """
        Calculate a universal player rating that works across leagues.
        The rating is based on:
        1. Per-possession efficiency
        2. Volume (usage rate)
        3. Impact (plus/minus data)
        4. Relative to league context
        Parameters:
        -----------
        player_stats : dict
            Player statistics
        league : str
            'wnba' or 'nba'
        Returns:
        --------
        float : Universal rating score (0-100 scale)
        """
        # Step 1: Normalize to per-100-possession basis
        possessions_per_game = self._estimate_player_possessions(
            player_stats, league
        )
        points_per_100 = (player_stats['points'] / possessions_per_game) * 100
        assists_per_100 = (player_stats['assists'] / possessions_per_game) * 100
        rebounds_per_100 = (player_stats['rebounds'] / possessions_per_game) * 100
        # Step 2: Calculate efficiency score
        efficiency_score = self._calculate_efficiency_score(
            player_stats, league
        )
        # Step 3: Calculate volume score
        volume_score = player_stats.get('usage_rate', 0.20) * 100
        # Step 4: Calculate impact score
        impact_score = self._calculate_impact_score(player_stats, league)
        # Step 5: Combine into universal rating
        universal_rating = (
            0.40 * efficiency_score +
            0.30 * impact_score +
            0.20 * volume_score +
            0.10 * self._calculate_consistency_score(player_stats)
        )
        return min(100, max(0, universal_rating))

    def _estimate_player_possessions(self, stats, league):
        """Estimate individual player possessions per game."""
        team_pace = self.league_factors[league]['avg_pace']
        game_length = self.league_factors[league]['game_length']
        # Possessions ≈ (minutes / game_length) * team_pace
        possessions = (stats['minutes'] / game_length) * team_pace
        return possessions

    def _calculate_efficiency_score(self, stats, league):
        """Calculate efficiency relative to league average."""
        # Note: 'avg_efficiency' matches the league eFG% estimates used
        # earlier, while the player value below is TS%; a league-average
        # TS% would be a more consistent baseline if available.
        league_avg_eff = self.league_factors[league]['avg_efficiency']
        # True Shooting Percentage
        ts_pct = stats['points'] / (
            2 * (stats['fga'] + 0.44 * stats['fta'])
        )
        # Relative to league average
        relative_efficiency = (ts_pct / league_avg_eff) * 50
        # Penalty for turnovers (assumes 2 per game when not provided)
        tov_penalty = stats.get('turnovers', 2) * 2
        return max(0, relative_efficiency - tov_penalty)

    def _calculate_impact_score(self, stats, league):
        """Calculate overall impact score."""
        # Combine box score stats
        box_score_impact = (
            stats['points'] * 1.0 +
            stats['rebounds'] * 1.2 +
            stats['assists'] * 1.5 +
            stats.get('steals', 0) * 2.0 +
            stats.get('blocks', 0) * 2.0 -
            stats.get('turnovers', 0) * 1.5
        )
        # Normalize by minutes
        per_minute_impact = box_score_impact / stats['minutes']
        # Scale to 0-100
        impact_score = per_minute_impact * 20
        return min(100, max(0, impact_score))

    def _calculate_consistency_score(self, stats):
        """Placeholder consistency score based on minutes played."""
        # Game-by-game data would allow a variance-based measure;
        # for now, use a placeholder based on minutes played
        consistency = min(100, (stats['minutes'] / 40) * 100)
        return consistency

    def rank_players_cross_league(self, wnba_players, nba_players):
        """
        Rank players from both leagues on the same scale.
        Parameters:
        -----------
        wnba_players : pd.DataFrame
            WNBA player statistics
        nba_players : pd.DataFrame
            NBA player statistics
        Returns:
        --------
        pd.DataFrame : Combined rankings
        """
        wnba_ratings = []
        for _, player in wnba_players.iterrows():
            rating = self.calculate_universal_rating(player.to_dict(), 'wnba')
            wnba_ratings.append({
                'player': player['player'],
                'league': 'WNBA',
                'universal_rating': rating
            })
        nba_ratings = []
        for _, player in nba_players.iterrows():
            rating = self.calculate_universal_rating(player.to_dict(), 'nba')
            nba_ratings.append({
                'player': player['player'],
                'league': 'NBA',
                'universal_rating': rating
            })
        # Combine and rank
        all_ratings = pd.DataFrame(wnba_ratings + nba_ratings)
        all_ratings = all_ratings.sort_values('universal_rating', ascending=False)
        all_ratings['rank'] = range(1, len(all_ratings) + 1)
        return all_ratings

    def generate_comparison_report(self, player1, player2, league1, league2):
        """
        Generate detailed comparison report between two players.
        """
        rating1 = self.calculate_universal_rating(player1, league1)
        rating2 = self.calculate_universal_rating(player2, league2)
        report = {
            'player1': {
                'name': player1['player'],
                'league': league1.upper(),
                'rating': rating1,
                'per_100_stats': self._calculate_per_100_stats(player1, league1)
            },
            'player2': {
                'name': player2['player'],
                'league': league2.upper(),
                'rating': rating2,
                'per_100_stats': self._calculate_per_100_stats(player2, league2)
            },
            'comparison': {
                'rating_diff': rating1 - rating2,
                'better_player': player1['player'] if rating1 > rating2 else player2['player']
            }
        }
        return report

    def _calculate_per_100_stats(self, player, league):
        """Calculate per-100-possession statistics."""
        poss = self._estimate_player_possessions(player, league)
        return {
            'points': (player['points'] / poss) * 100,
            'rebounds': (player['rebounds'] / poss) * 100,
            'assists': (player['assists'] / poss) * 100
        }


# Example usage
if __name__ == "__main__":
    rating_system = CrossLeagueRatingSystem()
    # Sample player data
    aja_wilson = {
        'player': "A'ja Wilson",
        'points': 22.8,
        'rebounds': 9.5,
        'assists': 2.3,
        'steals': 1.8,
        'blocks': 1.9,
        'turnovers': 1.6,
        'minutes': 34.6,
        'fga': 16.2,
        'fta': 6.8,
        'usage_rate': 0.281
    }
    giannis = {
        'player': 'Giannis Antetokounmpo',
        'points': 31.1,
        'rebounds': 11.8,
        'assists': 5.7,
        'steals': 1.2,
        'blocks': 1.5,
        'turnovers': 3.4,
        'minutes': 35.0,
        'fga': 20.4,
        'fta': 11.1,
        'usage_rate': 0.354
    }
    # Calculate universal ratings
    aja_rating = rating_system.calculate_universal_rating(aja_wilson, 'wnba')
    giannis_rating = rating_system.calculate_universal_rating(giannis, 'nba')
    print(f"A'ja Wilson Universal Rating: {aja_rating:.1f}")
    print(f"Giannis Antetokounmpo Universal Rating: {giannis_rating:.1f}")
    # Generate comparison report
    report = rating_system.generate_comparison_report(
        aja_wilson, giannis, 'wnba', 'nba'
    )
    print("\nComparison Report:")
    print(f"Rating Difference: {abs(report['comparison']['rating_diff']):.1f} points")
7. Key Takeaways for Analysts
Essential Principles:
- Never compare raw per-game stats: Always normalize for minutes or possessions
- Context is everything: A stat's meaning depends on league averages and pace
- Use relative metrics: Z-scores, percentiles, and league-relative measures are your friends
- Respect both games: Different doesn't mean inferior - each league has unique strengths
- Account for sample size: the WNBA's shorter season (40 games in 2023 and 2024, versus the NBA's 82) leaves more noise in season averages, so estimates need wider uncertainty bands
- Consider evolution: Both leagues evolve - ensure your analysis uses current benchmarks
- Validate assumptions: Don't assume NBA models/formulas work directly in WNBA context
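The sample-size principle can be made concrete with the standard error of a season average, which shrinks with the square root of games played. The sketch below assumes independent games and a hypothetical game-to-game standard deviation of 8 points; the helper name `standard_error` is illustrative.

```python
import math


def standard_error(game_sd, n_games):
    """Standard error of a per-game average over n_games independent games."""
    return game_sd / math.sqrt(n_games)


# Hypothetical game-to-game standard deviation in points (not measured data).
game_sd = 8.0

se_40 = standard_error(game_sd, 40)  # 40-game season
se_82 = standard_error(game_sd, 82)  # 82-game season
ratio = se_40 / se_82

print(f"Standard error over 40 games: {se_40:.2f} points")
print(f"Standard error over 82 games: {se_82:.2f} points")
print(f"Uncertainty ratio (40 vs 82 games): {ratio:.2f}x")
```

The ratio depends only on the two season lengths, so under these assumptions a 40-game average carries roughly 1.4 times the uncertainty of an 82-game average for a player with the same game-to-game variability.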
Common Pitfalls to Avoid:
- Comparing 40-minute stats to 48-minute stats directly
- Using NBA shooting efficiency benchmarks for WNBA players
- Ignoring pace differences when comparing scoring
- Applying NBA-trained machine learning models to WNBA data without retraining
- Overlooking roster size and rotation pattern differences
- Failing to account for schedule density differences
Resources for Further Learning:
- Official Stats: stats.wnba.com and stats.nba.com
- Advanced Analytics: Her Hoop Stats, Basketball Reference
- Academic Research: Journal of Quantitative Analysis in Sports
- Community: Women's basketball analytics community on Twitter/X