Comparing WNBA Players
Beginner
10 min read
Nov 27, 2025
Player Comparison Framework
Comparing WNBA players requires understanding positional differences, era adjustments, and the unique characteristics of the women's game. Statistical comparison should account for pace, playing time, team quality, and role-specific responsibilities.
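One way to make the pace adjustment concrete is to express production per 100 team possessions rather than per game. The sketch below assumes the common box-score possession estimate (FGA + 0.44 * FTA - ORB + TOV); the function names and the team totals are hypothetical and only illustrate the arithmetic.

```python
def estimate_possessions(fga, fta, orb, tov):
    """Common box-score estimate of team possessions (an approximation)."""
    return fga + 0.44 * fta - orb + tov


def per_100_possessions(stat_total, possessions):
    """Scale a raw total to a per-100-possessions rate."""
    if possessions <= 0:
        raise ValueError("possessions must be positive")
    return stat_total / possessions * 100


# Hypothetical single-game team totals, for illustration only
team_poss = estimate_possessions(fga=70, fta=20, orb=8, tov=12)
points_per_100 = per_100_possessions(stat_total=85, possessions=team_poss)

print(f"Estimated possessions: {team_poss:.1f}")
print(f"Points per 100 possessions: {points_per_100:.1f}")
```

Rates like this let a player on a slow-paced team be compared with one on a fast-paced team without the faster team's extra possessions inflating the totals.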
Key Comparison Metrics
- Per-Game Stats: Points, rebounds, assists (traditional comparison)
- Per-36 Minutes: Normalizes for playing time differences
- True Shooting %: Accounts for 2PT, 3PT, and FT efficiency
- Usage Rate: Percentage of team possessions a player uses while on the floor
- Player Efficiency Rating (PER): All-in-one metric
- Win Shares: Estimated wins contributed
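As a rough illustration of how three of these metrics are computed when attempt totals are available, the sketch below uses the standard TS% formula, a per-36 scaling, and the widely used usage-rate formula that divides team minutes by five. The function names and all input numbers are hypothetical; `team_minutes` is assumed to be total player-minutes (200 for a 40-minute regulation game).

```python
def true_shooting_pct(points, fga, fta):
    """TS% = PTS / (2 * (FGA + 0.44 * FTA))."""
    attempts = fga + 0.44 * fta
    return points / (2 * attempts) if attempts > 0 else float("nan")


def per_36(stat_per_game, minutes_per_game):
    """Scale a per-game average to a 36-minute workload."""
    return stat_per_game * 36 / minutes_per_game


def usage_rate(fga, fta, tov, minutes,
               team_fga, team_fta, team_tov, team_minutes):
    """USG% = 100 * player plays * (team minutes / 5) / (minutes * team plays)."""
    player_plays = fga + 0.44 * fta + tov
    team_plays = team_fga + 0.44 * team_fta + team_tov
    return 100 * player_plays * (team_minutes / 5) / (minutes * team_plays)


# Hypothetical per-game line for one player and her team
ts = true_shooting_pct(points=22, fga=16, fta=6)
pts36 = per_36(stat_per_game=22, minutes_per_game=34)
usg = usage_rate(fga=16, fta=6, tov=3, minutes=34,
                 team_fga=68, team_fta=20, team_tov=13, team_minutes=200)

print(f"TS%: {ts:.3f}  PTS/36: {pts36:.1f}  USG%: {usg:.1f}")
```

PER and Win Shares depend on league-wide and team-level inputs, so they are left out of this sketch.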
Python: Multi-Dimensional Player Comparison
Python: Advanced Player Comparison System
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler
from scipy.spatial.distance import euclidean
import matplotlib.pyplot as plt
import seaborn as sns
# Sample WNBA player data (normally loaded from API)
players_data = pd.DataFrame({
    'player': ['Breanna Stewart', 'A\'ja Wilson', 'Sabrina Ionescu',
               'Jewell Loyd', 'Alyssa Thomas', 'Napheesa Collier',
               'Kelsey Plum', 'Candace Parker', 'Jonquel Jones'],
    'position': ['F', 'F', 'G', 'G', 'F', 'F', 'G', 'F', 'C'],
    'minutes': [34.2, 33.8, 35.1, 33.5, 32.8, 33.2, 31.5, 28.9, 30.5],
    'points': [21.8, 22.8, 19.2, 19.6, 13.5, 20.5, 17.8, 13.2, 14.6],
    'rebounds': [8.5, 9.5, 4.8, 4.2, 9.6, 9.8, 2.9, 8.3, 8.9],
    'assists': [3.5, 2.2, 6.3, 3.8, 7.8, 3.4, 4.2, 5.1, 2.8],
    'steals': [1.5, 1.9, 1.2, 1.3, 1.7, 1.3, 1.0, 1.1, 0.8],
    'blocks': [1.2, 1.5, 0.3, 0.2, 0.8, 1.6, 0.1, 0.6, 1.4],
    'fg_pct': [0.460, 0.515, 0.447, 0.432, 0.542, 0.488, 0.438, 0.475, 0.502],
    'three_pct': [0.372, 0.315, 0.375, 0.385, 0.125, 0.345, 0.412, 0.328, 0.300],
    'ft_pct': [0.825, 0.815, 0.855, 0.882, 0.710, 0.843, 0.890, 0.788, 0.821],
    'turnovers': [2.8, 2.5, 3.2, 2.1, 3.5, 2.3, 2.4, 2.9, 1.9]
})
# =============================================================================
# 1. Calculate Advanced Metrics
# =============================================================================
def calculate_advanced_stats(player_stats, league_avg_pace=80):
    """Calculate advanced statistics for player comparison.

    Returns a new DataFrame; the input is not modified.
    league_avg_pace is not used by the simplified formulas below.
    """
    player_stats = player_stats.copy()

    # True Shooting Percentage (rough proxy).
    # The real formula is TS% = PTS / (2 * (FGA + 0.44 * FTA)), but this sample
    # has no attempt totals. points / pct stands in for FGA and FTA here, which
    # overstates both, so the result sits far below a real TS% and is at best
    # a relative shooting indicator within this sample.
    player_stats['ts_pct'] = (
        player_stats['points'] /
        (2 * (player_stats['points'] / player_stats['fg_pct'] +
              0.44 * player_stats['points'] / player_stats['ft_pct']))
    )

    # Per-36 minute stats
    min_factor = 36 / player_stats['minutes']
    player_stats['pts_per_36'] = player_stats['points'] * min_factor
    player_stats['reb_per_36'] = player_stats['rebounds'] * min_factor
    player_stats['ast_per_36'] = player_stats['assists'] * min_factor

    # Usage Rate (simplified - normally needs team data)
    # USG% = [(FGA + 0.44*FTA + TOV) * Team_Min] / [Min * (Team_FGA + 0.44*Team_FTA + Team_TOV)]
    # Simplified approximation: a per-minute activity proxy, not a true percentage
    player_stats['usage_rate'] = (
        (player_stats['points'] + player_stats['turnovers']) /
        player_stats['minutes'] * 100
    )

    # Assist-to-Turnover Ratio
    player_stats['ast_to_ratio'] = (
        player_stats['assists'] / player_stats['turnovers']
    )

    # Player Efficiency Rating (simplified): a per-36 box-score sum, not the
    # full PER formula. The missed-shot term is approximated from points and
    # FG% because the sample has no FGA or FGM columns.
    player_stats['per'] = (
        player_stats['points'] +
        player_stats['rebounds'] +
        player_stats['assists'] +
        player_stats['steals'] +
        player_stats['blocks'] -
        (player_stats['points'] / player_stats['fg_pct'] - player_stats['points']) -
        player_stats['turnovers']
    ) / player_stats['minutes'] * 36

    return player_stats
players_data = calculate_advanced_stats(players_data)
print("=== Advanced Player Statistics ===")
print(players_data[['player', 'position', 'ts_pct', 'per',
                    'usage_rate', 'ast_to_ratio']].round(3))
# =============================================================================
# 2. Multi-Dimensional Player Similarity
# =============================================================================
def find_similar_players(target_player, all_players, n_similar=3):
    """Find most similar players using Euclidean distance"""
    # Select comparison features
    features = ['pts_per_36', 'reb_per_36', 'ast_per_36',
                'ts_pct', 'usage_rate', 'steals', 'blocks']
    # Standardize features
    scaler = StandardScaler()
    scaled_features = scaler.fit_transform(all_players[features])
    # Find the target's row position (positional, so it stays correct even
    # when the DataFrame index is not 0..n-1)
    matches = np.flatnonzero((all_players['player'] == target_player).to_numpy())
    if len(matches) == 0:
        raise ValueError(f"Player not found: {target_player}")
    target_idx = matches[0]
    target_vector = scaled_features[target_idx]
    # Calculate distances
    distances = []
    for idx, player_vector in enumerate(scaled_features):
        if idx != target_idx:
            dist = euclidean(target_vector, player_vector)
            distances.append({
                'player': all_players.iloc[idx]['player'],
                'distance': dist
            })
    # Sort by similarity (smallest distance)
    similar_df = pd.DataFrame(distances).sort_values('distance').head(n_similar)
    return similar_df
# Example: Find players similar to Breanna Stewart
similar_to_stewart = find_similar_players('Breanna Stewart', players_data, n_similar=3)
print("\n=== Players Most Similar to Breanna Stewart ===")
print(similar_to_stewart)
# =============================================================================
# 3. Position-Specific Rankings
# =============================================================================
def rank_by_position(players_df, stat='per'):
    """Rank players within their position"""
    position_rankings = players_df.copy()
    position_rankings['position_rank'] = (
        position_rankings.groupby('position')[stat]
        .rank(ascending=False, method='min')
    )
    return position_rankings.sort_values(['position', 'position_rank'])
position_rankings = rank_by_position(players_data, stat='per')
print("\n=== Position Rankings by PER ===")
print(position_rankings[['player', 'position', 'per', 'position_rank']])
# =============================================================================
# 4. Player Comparison Radar Chart
# =============================================================================
def create_comparison_radar(player1, player2, all_players):
    """Create radar chart comparing two players"""
    categories = ['Points', 'Rebounds', 'Assists', 'TS%', 'Steals', 'Blocks']
    stat_cols = ['points', 'rebounds', 'assists', 'ts_pct', 'steals', 'blocks']
    p1_data = all_players[all_players['player'] == player1].iloc[0]
    p2_data = all_players[all_players['player'] == player2].iloc[0]
    # Scale each stat to 0-100 as a share of the sample maximum
    # (a max-normalization, not a percentile rank)
    p1_values = [p1_data[col] / all_players[col].max() * 100 for col in stat_cols]
    p2_values = [p2_data[col] / all_players[col].max() * 100 for col in stat_cols]
    # Close the plot
    p1_values += p1_values[:1]
    p2_values += p2_values[:1]
    angles = np.linspace(0, 2 * np.pi, len(categories), endpoint=False).tolist()
    angles += angles[:1]
    fig, ax = plt.subplots(figsize=(8, 8), subplot_kw=dict(projection='polar'))
    ax.plot(angles, p1_values, 'o-', linewidth=2, label=player1, color='blue')
    ax.fill(angles, p1_values, alpha=0.25, color='blue')
    ax.plot(angles, p2_values, 'o-', linewidth=2, label=player2, color='red')
    ax.fill(angles, p2_values, alpha=0.25, color='red')
    ax.set_xticks(angles[:-1])
    ax.set_xticklabels(categories)
    ax.set_ylim(0, 100)
    ax.set_title(f"Player Comparison: {player1} vs {player2}", size=14, pad=20)
    ax.legend(loc='upper right', bbox_to_anchor=(1.3, 1.1))
    ax.grid(True)
    return fig
# Example: Compare two elite players
# fig = create_comparison_radar('Breanna Stewart', 'A\'ja Wilson', players_data)
# plt.show()
print("\n=== Player Comparison Tools Available ===")
print("1. Advanced stat calculator (TS%, PER, Per-36)")
print("2. Similarity finder (Euclidean distance)")
print("3. Position-specific rankings")
print("4. Radar chart visualizations")
R: WNBA Player Comparison with wehoop
library(wehoop)
library(tidyverse)
library(scales)
library(ggrepel)
# Load WNBA player box scores for current season
player_stats <- wehoop::load_wnba_player_box(seasons = 2024)
# =============================================================================
# 1. Calculate Season Averages and Advanced Stats
# =============================================================================
season_averages <- player_stats %>%
  group_by(athlete_id, athlete_display_name, team_display_name) %>%
  summarise(
    games = n(),
    # Season point total; computed before `points` is overwritten with the mean
    total_points = sum(points, na.rm = TRUE),
    minutes = mean(minutes, na.rm = TRUE),
    points = mean(points, na.rm = TRUE),
    rebounds = mean(rebounds, na.rm = TRUE),
    assists = mean(assists, na.rm = TRUE),
    steals = mean(steals, na.rm = TRUE),
    blocks = mean(blocks, na.rm = TRUE),
    turnovers = mean(turnovers, na.rm = TRUE),
    fgm = sum(field_goals_made, na.rm = TRUE),
    fga = sum(field_goals_attempted, na.rm = TRUE),
    fg3m = sum(three_point_field_goals_made, na.rm = TRUE),
    fg3a = sum(three_point_field_goals_attempted, na.rm = TRUE),
    ftm = sum(free_throws_made, na.rm = TRUE),
    fta = sum(free_throws_attempted, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  filter(games >= 10) %>%  # Min 10 games played
  mutate(
    # Advanced metrics
    fg_pct = fgm / fga,
    three_pct = fg3m / fg3a,
    ft_pct = ftm / fta,
    # Season totals on both sides (fga and fta are sums, so points must be too)
    ts_pct = total_points / (2 * (fga + 0.44 * fta)),
    # Per-36 minute stats
    pts_per_36 = points * (36 / minutes),
    reb_per_36 = rebounds * (36 / minutes),
    ast_per_36 = assists * (36 / minutes),
    # Efficiency ratios
    ast_to_ratio = assists / turnovers,
    usage_rate = (points + turnovers) / minutes * 100,
    # Simplified PER; missed shots converted to a per-game value to match
    # the per-game averages in the rest of the sum
    per = (points + rebounds + assists + steals + blocks -
             (fga - fgm) / games - turnovers) / minutes * 36
  )
cat("=== Top Players by PER (min 10 games) ===\n")
print(season_averages %>%
        select(athlete_display_name, team_display_name, games,
               points, rebounds, assists, per) %>%
        arrange(desc(per)) %>%
        head(10))
# =============================================================================
# 2. Find Similar Players Using Statistical Distance
# =============================================================================
find_similar_players <- function(player_name, stats_df, n = 5) {
  # Select comparison features
  features <- c("pts_per_36", "reb_per_36", "ast_per_36",
                "ts_pct", "steals", "blocks")
  # Normalize features (z-scores)
  stats_normalized <- stats_df %>%
    select(athlete_display_name, all_of(features)) %>%
    mutate(across(all_of(features), ~scale(.x)[,1]))
  # Get target player vector
  target <- stats_normalized %>%
    filter(athlete_display_name == player_name)
  if (nrow(target) == 0) {
    cat("Player not found\n")
    return(NULL)
  }
  # Plain numeric vector so the row-wise subtraction below is element-wise
  target_vec <- unlist(target[1, features])
  # Calculate Euclidean distance to all other players
  stats_normalized %>%
    filter(athlete_display_name != player_name) %>%
    rowwise() %>%
    mutate(
      distance = sqrt(sum((c_across(all_of(features)) - target_vec)^2))
    ) %>%
    ungroup() %>%
    arrange(distance) %>%
    head(n) %>%
    select(athlete_display_name, distance)
}
# Example: Find players similar to top scorer
top_scorer <- season_averages %>%
  arrange(desc(points)) %>%
  slice(1) %>%
  pull(athlete_display_name)
cat(sprintf("\n=== Players Similar to %s ===\n", top_scorer))
similar_players <- find_similar_players(top_scorer, season_averages, n = 5)
print(similar_players)
# =============================================================================
# 3. Position-Based Comparison (Guards vs Forwards vs Centers)
# =============================================================================
# Classification by playing style (based on stats)
position_classification <- season_averages %>%
  mutate(
    position_type = case_when(
      assists >= 5 & rebounds < 6 ~ "Guard",
      rebounds >= 7 & assists < 5 ~ "Big",
      TRUE ~ "Wing/Forward"
    )
  )
cat("\n=== Position Type Averages ===\n")
position_averages <- position_classification %>%
  group_by(position_type) %>%
  summarise(
    players = n(),
    avg_points = mean(points),
    avg_rebounds = mean(rebounds),
    avg_assists = mean(assists),
    avg_ts_pct = mean(ts_pct, na.rm = TRUE),
    .groups = "drop"
  )
print(position_averages)
# =============================================================================
# 4. Head-to-Head Player Comparison Function
# =============================================================================
compare_players <- function(player1, player2, stats_df) {
  comparison <- stats_df %>%
    filter(athlete_display_name %in% c(player1, player2)) %>%
    select(athlete_display_name, games, minutes, points, rebounds,
           assists, steals, blocks, ts_pct, per) %>%
    arrange(athlete_display_name)
  return(comparison)
}
# Example comparison
top_2_scorers <- season_averages %>%
  arrange(desc(points)) %>%
  head(2) %>%
  pull(athlete_display_name)
if (length(top_2_scorers) >= 2) {
  cat(sprintf("\n=== Head-to-Head: %s vs %s ===\n",
              top_2_scorers[1], top_2_scorers[2]))
  comparison <- compare_players(top_2_scorers[1], top_2_scorers[2],
                                season_averages)
  print(comparison)
}
# =============================================================================
# 5. Visualization: Scoring vs Efficiency
# =============================================================================
ggplot(season_averages %>% filter(games >= 15),
       aes(x = ts_pct, y = points, label = athlete_display_name)) +
  geom_point(aes(size = usage_rate, color = per), alpha = 0.7) +
  geom_text_repel(data = . %>% filter(points >= 15 | ts_pct >= 0.60),
                  size = 3, max.overlaps = 10) +
  scale_color_gradient(low = "lightblue", high = "darkred",
                       name = "PER") +
  scale_x_continuous(labels = percent_format(accuracy = 1)) +
  scale_size_continuous(name = "Usage Rate") +
  labs(
    title = "WNBA Player Comparison: Scoring Volume vs Efficiency",
    subtitle = "Minimum 15 games played, sized by usage rate",
    x = "True Shooting %",
    y = "Points Per Game"
  ) +
  theme_minimal() +
  theme(legend.position = "right")
# =============================================================================
# 6. All-Around Performance: Points, Rebounds, Assists
# =============================================================================
ggplot(season_averages %>% filter(games >= 15),
       aes(x = rebounds, y = assists, size = points)) +
  geom_point(alpha = 0.6, color = "steelblue") +
  geom_text_repel(aes(label = athlete_display_name),
                  data = . %>% filter(points >= 15),
                  size = 3) +
  labs(
    title = "WNBA All-Around Performance",
    subtitle = "Points (size), Rebounds, and Assists",
    x = "Rebounds Per Game",
    y = "Assists Per Game",
    size = "Points Per Game"
  ) +
  theme_minimal()
cat("\n=== Player Comparison Framework Complete ===\n")
cat("✓ Advanced stat calculations\n")
cat("✓ Similarity analysis\n")
cat("✓ Position-based comparisons\n")
cat("✓ Head-to-head tools\n")
cat("✓ Visualization dashboards\n")
Context Matters in Comparisons
When comparing WNBA players, consider team context, playing style, and role. A high-usage scorer on a weak team may have inflated counting stats but lower efficiency. Conversely, a role player on an elite team may have excellent efficiency but limited volume.
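The volume-versus-efficiency trade-off can be seen with a small, purely hypothetical pair of stat lines. The sketch below assumes the standard TS% formula; the two players and their numbers are invented for illustration and do not describe anyone in the league.

```python
def true_shooting_pct(points, fga, fta):
    """TS% = PTS / (2 * (FGA + 0.44 * FTA))."""
    return points / (2 * (fga + 0.44 * fta))


# Hypothetical per-game lines: a high-usage scorer and a low-usage role player
high_usage = {"points": 21.0, "fga": 18.5, "fta": 5.0}
role_player = {"points": 9.5, "fga": 6.5, "fta": 2.0}

ts_high = true_shooting_pct(**high_usage)
ts_role = true_shooting_pct(**role_player)

print(f"High-usage scorer: {high_usage['points']:.1f} PPG, TS% {ts_high:.3f}")
print(f"Role player:       {role_player['points']:.1f} PPG, TS% {ts_role:.3f}")
```

In this made-up case the scorer more than doubles the role player's points while converting her chances less efficiently, which is why neither number should be read alone.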
Effective Player Comparison Checklist
- Use per-minute stats to account for playing time differences
- Incorporate efficiency metrics (TS%, eFG%) not just volume
- Consider positional context (guards vs forwards vs centers)
- Account for era differences when comparing across seasons
- Combine multiple metrics for holistic evaluation
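One simple way to act on the last two checklist items is to add eFG% and then average z-scores across several metrics. The sketch below is one possible approach, not a standard rating: the equal weighting, the metric list, and the four stat lines are all hypothetical.

```python
from statistics import mean, pstdev


def efg_pct(fgm, fg3m, fga):
    """Effective FG% = (FGM + 0.5 * 3PM) / FGA."""
    return (fgm + 0.5 * fg3m) / fga


def z_scores(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma > 0 else 0.0 for v in values]


# Hypothetical stat lines, for illustration only
players = {
    "Player A": {"pts_per_36": 22.0, "reb_per_36": 9.0, "ast_per_36": 3.0,
                 "fgm": 8.0, "fg3m": 1.5, "fga": 17.0},
    "Player B": {"pts_per_36": 15.0, "reb_per_36": 4.0, "ast_per_36": 7.0,
                 "fgm": 5.5, "fg3m": 2.0, "fga": 12.0},
    "Player C": {"pts_per_36": 18.0, "reb_per_36": 7.0, "ast_per_36": 4.0,
                 "fgm": 6.5, "fg3m": 1.0, "fga": 14.0},
    "Player D": {"pts_per_36": 11.0, "reb_per_36": 10.0, "ast_per_36": 2.0,
                 "fgm": 4.5, "fg3m": 0.0, "fga": 8.0},
}

for line in players.values():
    line["efg_pct"] = efg_pct(line["fgm"], line["fg3m"], line["fga"])

metrics = ["pts_per_36", "reb_per_36", "ast_per_36", "efg_pct"]
names = list(players)

# One z-score column per metric, then an equal-weight average per player
columns = {m: z_scores([players[n][m] for n in names]) for m in metrics}
composite = {n: mean(columns[m][i] for m in metrics)
             for i, n in enumerate(names)}

for name, score in sorted(composite.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:+.2f}")
```

Equal weights are a modeling choice; weighting by position or role would be a natural refinement when guards and bigs are ranked together.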