College Women's Basketball Analytics
Women's college basketball analytics has experienced tremendous growth in recent years, driven by increased investment in programs, better data availability, and the rising prominence of women's basketball. This guide covers the state of analytics in women's college basketball, key data sources, practical analysis techniques, and applications for coaches, analysts, and recruiters.
State of Women's College Basketball Analytics
Current Landscape
Women's college basketball analytics has evolved significantly, particularly amid the surge in viewership and investment during and after the 2023-24 season. Key developments include:
- Increased Data Availability: More programs are investing in video and scouting platforms (Synergy, Hudl) and analytics personnel
- Advanced Metrics Adoption: Teams are using efficiency ratings, possession-based metrics, and shot quality analysis
- Public Analytics Growth: Platforms like Her Hoop Stats and NCAA.com provide detailed statistics and rankings
- Transfer Portal Era: Analytics now crucial for evaluating transfer candidates and roster construction
- WNBA Pipeline: Draft analytics help identify professional prospects and developmental trajectories
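The possession-based metrics in the list above rest on an estimate of possessions from box-score totals. A minimal sketch, assuming the common approximation possessions = FGA - ORB + TOV + 0.44 * FTA (the same 0.44 free-throw weight appears in the formulas later in this guide). The function names and the sample totals are illustrative, not real team data:

```python
def estimate_possessions(fga, orb, tov, fta, ft_weight=0.44):
    """Approximate possessions from box-score totals."""
    return fga - orb + tov + ft_weight * fta


def rating_per_100(points, possessions):
    """Points per 100 possessions (usable as an offensive or defensive rating)."""
    if possessions <= 0:
        raise ValueError("possessions must be positive")
    return 100.0 * points / possessions


# Hypothetical single-game totals, for illustration only
poss = estimate_possessions(fga=62, orb=10, tov=14, fta=20)
off_rating = rating_per_100(points=78, possessions=poss)
print(f"Possessions: {poss:.1f}, offensive rating: {off_rating:.1f}")
```

Dividing by possessions rather than games lets a slow-paced team be compared with a fast-paced one on the same scale.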
Analytics Gaps and Opportunities
Despite growth, women's college basketball analytics still lags behind men's basketball in several areas:
Current Gaps
- Less tracking data availability
- Fewer public advanced stats platforms
- Limited play-by-play data for some conferences
- Smaller analytics community and resources
Growing Opportunities
- Expanding data partnerships (NCAA, conferences)
- Increased media coverage and interest
- Growing WNBA investment in scouting
- More programs hiring dedicated analysts
Key Data Sources for Women's College Basketball
1. NCAA Statistics
The NCAA provides official statistics through NCAA.com/stats, including:
- Team Statistics: Scoring, rebounding, shooting percentages, turnovers
- Individual Statistics: Player averages, efficiency ratings, per-game stats
- Conference Standings: Records, RPI, NET rankings (for tournament selection)
- Tournament Data: March Madness statistics and historical performance
Access: NCAA stats are available at ncaa.com/stats with downloadable CSV files for many categories.
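Once a statistics file has been downloaded, loading it into pandas is usually the first step. The sketch below uses an inline stand-in for such a file; the column names (Rank, Team, GM, PTS, PPG) and the team rows are hypothetical, so adjust them to whatever the actual download contains:

```python
import io

import pandas as pd

# Hypothetical stand-in for a downloaded file; real column names may differ
csv_text = """Rank,Team,GM,PTS,PPG
1,Team A,32,2720,85.0
2,Team B,30,2460,82.0
3,Team C,31,2449,79.0
"""


def load_team_stats(source):
    """Read a team stats CSV, normalize headers, and sanity-check PPG."""
    df = pd.read_csv(source)
    df.columns = [c.strip().lower() for c in df.columns]
    # Recompute points per game to catch stale or mislabeled columns
    df["ppg_check"] = (df["pts"] / df["gm"]).round(1)
    return df.sort_values("ppg", ascending=False).reset_index(drop=True)


teams = load_team_stats(io.StringIO(csv_text))
print(teams[["team", "ppg", "ppg_check"]])
```

Passing a file path instead of the `io.StringIO` object works the same way, since `pd.read_csv` accepts both.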
2. Her Hoop Stats
Her Hoop Stats is the premier independent analytics platform for women's basketball, offering:
- Advanced Team Metrics: Efficiency ratings, tempo-adjusted statistics, strength of schedule
- Player Ratings: Box score-based player evaluation metrics
- Game Predictions: Statistical models for game outcomes and tournament projections
- Historical Database: Multi-year data for trend analysis and player development tracking
- Transfer Portal Tracker: Comprehensive database of players in the transfer portal with statistics
3. Conference Websites and Stats Services
Many conferences provide detailed statistics through their official websites or third-party platforms:
- Major Conferences: SEC, Big Ten, ACC, Big 12, Pac-12 provide comprehensive stats packages
- Synergy Sports: Video and statistical platform used by many programs (subscription required)
- StatBroadcast: Live stats platform used by many conferences with play-by-play data
4. wehoop Package (R)
The wehoop R package provides programmatic access to women's college basketball data, making it easy to retrieve and analyze statistics in R. It's part of the SportsDataverse ecosystem.
Python Analysis for Women's College Basketball
Data Collection and Processing
Python tooling for women's college basketball data is less established than wehoop is for R, so one practical approach is to collect the data yourself using web scraping and API techniques:
import pandas as pd
import requests
from bs4 import BeautifulSoup
from io import StringIO
import numpy as np


class WomensCollegeBasketballScraper:
    """Scrape and process women's college basketball statistics"""

    def __init__(self):
        self.base_url = "https://www.ncaa.com/stats/basketball-women/d1"
        self.headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'
        }

    def get_team_stats(self, season, stat_category):
        """
        Retrieve team statistics from NCAA.com

        Note: the URL pattern built below is an assumption. Check it against
        the live NCAA.com stats pages, which may identify categories differently.

        Parameters:
        -----------
        season : str
            Season year (e.g., '2024')
        stat_category : str
            Category like 'scoring-offense', 'scoring-defense', 'field-goal-pct'

        Returns:
        --------
        pd.DataFrame
            Team statistics data (empty if the request or parsing fails)
        """
        url = f"{self.base_url}/{season}/{stat_category}"
        try:
            response = requests.get(url, headers=self.headers, timeout=30)
            response.raise_for_status()
            # Parse HTML and take the first table on the page
            soup = BeautifulSoup(response.content, 'html.parser')
            table = soup.find('table')
            if table:
                # Wrap the markup in StringIO; newer pandas versions warn
                # when read_html receives a literal HTML string
                df = pd.read_html(StringIO(str(table)))[0]
                df['season'] = season
                df['category'] = stat_category
                return df
            else:
                print(f"No table found for {stat_category}")
                return pd.DataFrame()
        except (requests.RequestException, ValueError) as e:
            print(f"Error fetching {stat_category}: {e}")
            return pd.DataFrame()

    def get_player_stats(self, season, stat_category):
        """Retrieve individual player statistics"""
        url = f"{self.base_url}/{season}/player/{stat_category}"
        try:
            response = requests.get(url, headers=self.headers, timeout=30)
            response.raise_for_status()
            soup = BeautifulSoup(response.content, 'html.parser')
            table = soup.find('table')
            if table:
                df = pd.read_html(StringIO(str(table)))[0]
                df['season'] = season
                df['category'] = stat_category
                return df
            else:
                return pd.DataFrame()
        except (requests.RequestException, ValueError) as e:
            print(f"Error fetching player {stat_category}: {e}")
            return pd.DataFrame()

# Example usage
scraper = WomensCollegeBasketballScraper()
# Get team offensive statistics
team_offense = scraper.get_team_stats('2024', 'scoring-offense')
team_defense = scraper.get_team_stats('2024', 'scoring-defense')
print("Top 10 Scoring Offenses:")
print(team_offense.head(10))
# Get top scorers
top_scorers = scraper.get_player_stats('2024', 'points')
print("\nTop Scorers:")
print(top_scorers.head(10))
Four Factors Analysis
Implementing Dean Oliver's Four Factors for women's college basketball:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns


class FourFactorsAnalyzer:
    """Calculate and analyze Four Factors for women's college basketball"""

    def calculate_four_factors(self, team_stats):
        """
        Calculate Four Factors from team statistics

        Parameters:
        -----------
        team_stats : pd.DataFrame
            DataFrame with columns: FGM, FGA, 3PM, FTM, FTA, ORB, DRB, TOV,
            Opp_FGM, Opp_FGA, Opp_3PM, Opp_FTM, Opp_FTA,
            Opp_ORB, Opp_DRB, Opp_TOV

        Returns:
        --------
        pd.DataFrame
            DataFrame with Four Factors calculated
        """
        df = team_stats.copy()

        # Offensive Four Factors
        df['eFG%'] = (df['FGM'] + 0.5 * df['3PM']) / df['FGA']
        df['TOV%'] = df['TOV'] / (df['FGA'] + 0.44 * df['FTA'] + df['TOV'])
        df['ORB%'] = df['ORB'] / (df['ORB'] + df['Opp_DRB'])
        df['FTRate'] = df['FTM'] / df['FGA']

        # Defensive Four Factors
        df['Opp_eFG%'] = (df['Opp_FGM'] + 0.5 * df['Opp_3PM']) / df['Opp_FGA']
        df['Opp_TOV%'] = df['Opp_TOV'] / (df['Opp_FGA'] + 0.44 * df['Opp_FTA'] + df['Opp_TOV'])
        df['DRB%'] = df['DRB'] / (df['DRB'] + df['Opp_ORB'])
        df['Opp_FTRate'] = df['Opp_FTM'] / df['Opp_FGA']

        # Heuristic composite indexes (weights 40/25/20/15). These are
        # unitless scores for ranking teams, not points per possession.
        df['Off_Efficiency'] = (df['eFG%'] * 0.40 +
                                (1 - df['TOV%']) * 0.25 +
                                df['ORB%'] * 0.20 +
                                df['FTRate'] * 0.15)
        df['Def_Efficiency'] = ((1 - df['Opp_eFG%']) * 0.40 +
                                df['Opp_TOV%'] * 0.25 +
                                df['DRB%'] * 0.20 +
                                (1 - df['Opp_FTRate']) * 0.15)
        return df

    def plot_four_factors(self, team_stats, team_name):
        """Create radar chart of Four Factors (higher is better on every axis)"""
        fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 6),
                                       subplot_kw=dict(projection='polar'))
        team_data = team_stats[team_stats['Team'] == team_name].iloc[0]

        # Offensive factors (turnover rate is inverted, so the label says so)
        categories = ['eFG%', '1 - TOV%', 'ORB%', 'FTRate']
        values = [team_data['eFG%'], 1 - team_data['TOV%'],
                  team_data['ORB%'], team_data['FTRate']]
        angles = np.linspace(0, 2 * np.pi, len(categories), endpoint=False).tolist()
        # Repeat the first point to close the polygon
        values += values[:1]
        angles += angles[:1]
        ax1.plot(angles, values, 'o-', linewidth=2, label=team_name)
        ax1.fill(angles, values, alpha=0.25)
        ax1.set_xticks(angles[:-1])
        ax1.set_xticklabels(categories)
        ax1.set_ylim(0, 1)
        ax1.set_title('Offensive Four Factors', pad=20)
        ax1.grid(True)

        # Defensive factors (opponent eFG% and FT rate are inverted)
        categories_def = ['1 - Opp eFG%', 'Opp TOV%', 'DRB%', '1 - Opp FTRate']
        values_def = [1 - team_data['Opp_eFG%'], team_data['Opp_TOV%'],
                      team_data['DRB%'], 1 - team_data['Opp_FTRate']]
        values_def += values_def[:1]
        ax2.plot(angles, values_def, 'o-', linewidth=2, label=team_name, color='red')
        ax2.fill(angles, values_def, alpha=0.25, color='red')
        ax2.set_xticks(angles[:-1])
        ax2.set_xticklabels(categories_def)
        ax2.set_ylim(0, 1)
        ax2.set_title('Defensive Four Factors', pad=20)
        ax2.grid(True)

        plt.tight_layout()
        return fig

# Example usage
# Assuming you have team statistics loaded
team_stats = pd.DataFrame({
'Team': ['South Carolina', 'UConn', 'Iowa', 'LSU'],
'FGM': [870, 825, 910, 840],
'FGA': [1750, 1680, 1850, 1720],
'3PM': [210, 245, 280, 225],
'FTM': [520, 480, 550, 490],
'FTA': [680, 620, 710, 640],
'ORB': [385, 340, 360, 370],
'DRB': [920, 880, 850, 870],
'TOV': [410, 385, 450, 420],
'Opp_FGM': [720, 750, 780, 760],
'Opp_FGA': [1680, 1720, 1780, 1740],
'Opp_3PM': [185, 195, 210, 200],
'Opp_FTM': [450, 470, 490, 475],
'Opp_FTA': [590, 610, 640, 620],
'Opp_ORB': [290, 310, 330, 315],
'Opp_DRB': [780, 760, 740, 755],
'Opp_TOV': [490, 510, 480, 495]
})
analyzer = FourFactorsAnalyzer()
team_stats_with_factors = analyzer.calculate_four_factors(team_stats)
print("Four Factors Analysis:")
print(team_stats_with_factors[['Team', 'eFG%', 'TOV%', 'ORB%', 'FTRate',
'Off_Efficiency', 'Def_Efficiency']])
# Plot for specific team
fig = analyzer.plot_four_factors(team_stats_with_factors, 'South Carolina')
plt.savefig('south_carolina_four_factors.png', dpi=300, bbox_inches='tight')
Player Evaluation Metrics
Calculate advanced player statistics for women's college basketball:
import pandas as pd
import numpy as np


class PlayerMetrics:
    """Calculate advanced player metrics for women's college basketball"""

    def __init__(self):
        self.league_avg_pace = 70.0  # Approximate for women's CBB
        self.league_avg_ortg = 100.0

    def calculate_per(self, player_stats):
        """
        Calculate a simplified Player Efficiency Rating (PER)

        This is a per-minute linear efficiency score rescaled to an average
        of 15.0, not the full pace- and league-adjusted PER formula.

        Parameters:
        -----------
        player_stats : pd.DataFrame
            Player statistics with columns: PTS, AST, REB, STL, BLK,
            TOV, FGM, FGA, FTM, FTA, MIN

        Returns:
        --------
        pd.DataFrame
            DataFrame with PER calculated
        """
        df = player_stats.copy()

        # Unadjusted rating: positive contributions minus missed shots
        # (FGA - FGM, FTA - FTM) and turnovers, per minute played
        df['uPER'] = (df['PTS'] + df['REB'] + df['AST'] + df['STL'] + df['BLK'] -
                      df['FGA'] - df['FTA'] + df['FGM'] + df['FTM'] - df['TOV']) / df['MIN']

        # Normalize so the average player in this DataFrame rates 15.0
        league_uper = df['uPER'].mean()
        df['PER'] = (df['uPER'] / league_uper) * 15.0
        return df

    def calculate_true_shooting(self, player_stats):
        """
        Calculate True Shooting Percentage
        TS% = PTS / (2 * (FGA + 0.44 * FTA))
        """
        df = player_stats.copy()
        df['TS%'] = df['PTS'] / (2 * (df['FGA'] + 0.44 * df['FTA']))
        return df

    def calculate_usage_rate(self, player_stats, team_stats):
        """
        Calculate player usage rate
        USG% = 100 * ((FGA + 0.44 * FTA + TOV) * (Team_MIN / 5)) /
               (MIN * (Team_FGA + 0.44 * Team_FTA + Team_TOV))
        """
        df = player_stats.copy()
        for idx, row in df.iterrows():
            team = row['Team']
            team_data = team_stats[team_stats['Team'] == team].iloc[0]
            player_possessions = row['FGA'] + 0.44 * row['FTA'] + row['TOV']
            team_possessions = team_data['FGA'] + 0.44 * team_data['FTA'] + team_data['TOV']
            team_min = team_data['MIN']
            df.at[idx, 'USG%'] = 100 * ((player_possessions * (team_min / 5)) /
                                        (row['MIN'] * team_possessions))
        return df

    def calculate_assist_ratio(self, player_stats):
        """
        Calculate assist to turnover ratio
        """
        df = player_stats.copy()
        # Treat zero turnovers as one to avoid division by zero
        df['AST/TO'] = df['AST'] / df['TOV'].replace(0, 1)
        return df

    def calculate_comprehensive_metrics(self, player_stats, team_stats):
        """Calculate all advanced metrics"""
        df = player_stats.copy()
        df = self.calculate_per(df)
        df = self.calculate_true_shooting(df)
        df = self.calculate_usage_rate(df, team_stats)
        df = self.calculate_assist_ratio(df)
        return df

# Example usage
player_stats = pd.DataFrame({
'Player': ['Caitlin Clark', 'Paige Bueckers', 'Hannah Hidalgo', 'JuJu Watkins'],
'Team': ['Iowa', 'UConn', 'Notre Dame', 'USC'],
'PTS': [812, 456, 634, 718],
'AST': [241, 132, 168, 145],
'REB': [227, 142, 176, 189],
'STL': [45, 38, 92, 67],
'BLK': [21, 12, 28, 34],
'TOV': [162, 78, 134, 142],
'FGM': [280, 165, 225, 258],
'FGA': [682, 378, 523, 601],
'3PM': [126, 48, 84, 62],
'FTM': [226, 126, 180, 202],
'FTA': [268, 148, 224, 248],
'MIN': [1156, 698, 1045, 1123]
})
team_stats = pd.DataFrame({
'Team': ['Iowa', 'UConn', 'Notre Dame', 'USC'],
'FGA': [1850, 1680, 1720, 1780],
'FTA': [710, 620, 645, 680],
'TOV': [450, 385, 410, 425],
'MIN': [6600, 6400, 6500, 6550]
})
metrics = PlayerMetrics()
player_stats_advanced = metrics.calculate_comprehensive_metrics(player_stats, team_stats)
print("Advanced Player Metrics:")
print(player_stats_advanced[['Player', 'PER', 'TS%', 'USG%', 'AST/TO']].round(2))
R Analysis Using wehoop
Getting Started with wehoop
The wehoop package provides easy access to women's basketball data, including college and professional leagues.
# Install wehoop (one time)
install.packages("wehoop")
# Or install development version from GitHub
# install.packages("devtools")
# devtools::install_github("sportsdataverse/wehoop")
# Load required libraries
library(wehoop)
library(dplyr)
library(ggplot2)
library(tidyr)
Loading Women's College Basketball Data
# Load team box scores for a specific season
team_box <- load_wbb_team_box(seasons = 2024)
# View structure
head(team_box)
# Load player box scores
player_box <- load_wbb_player_box(seasons = 2024)
# View top scorers
player_box %>%
group_by(athlete_display_name, team_short_display_name) %>%
summarise(
games = n(),
total_pts = sum(points, na.rm = TRUE),
ppg = mean(points, na.rm = TRUE),
.groups = 'drop'
) %>%
arrange(desc(ppg)) %>%
head(20)
# Load schedule and results
schedule <- load_wbb_schedule(seasons = 2024)
# View high-scoring games
schedule %>%
filter(!is.na(home_score) & !is.na(away_score)) %>%
mutate(total_score = home_score + away_score) %>%
arrange(desc(total_score)) %>%
select(game_date, home_team_name, home_score,
away_team_name, away_score, total_score) %>%
head(10)
Team Performance Analysis
# Calculate team efficiency metrics
team_efficiency <- team_box %>%
group_by(team_short_display_name) %>%
summarise(
games = n(),
pts = mean(team_score, na.rm = TRUE),
opp_pts = mean(opponent_team_score, na.rm = TRUE),
fg_pct = mean(field_goal_pct, na.rm = TRUE) * 100,
fg3_pct = mean(three_point_field_goal_pct, na.rm = TRUE) * 100,
ft_pct = mean(free_throw_pct, na.rm = TRUE) * 100,
rebounds = mean(rebounds, na.rm = TRUE),
assists = mean(assists, na.rm = TRUE),
turnovers = mean(turnovers, na.rm = TRUE),
steals = mean(steals, na.rm = TRUE),
blocks = mean(blocks, na.rm = TRUE),
.groups = 'drop'
) %>%
mutate(
margin = pts - opp_pts,
ast_to = assists / turnovers
) %>%
arrange(desc(margin))
# View top teams
print(head(team_efficiency, 15))
# Visualize offensive vs defensive efficiency
ggplot(team_efficiency, aes(x = pts, y = opp_pts)) +
geom_point(aes(size = margin, color = margin), alpha = 0.6) +
geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray50") +
scale_color_gradient2(low = "red", mid = "white", high = "blue", midpoint = 0) +
labs(
title = "Women's College Basketball: Offensive vs Defensive Efficiency",
subtitle = "2024 Season - Average Points Per Game",
x = "Points Scored (Offense)",
y = "Points Allowed (Defense)",
size = "Point Differential",
color = "Margin"
) +
theme_minimal() +
theme(plot.title = element_text(face = "bold", size = 14))
ggsave("wcbb_efficiency_2024.png", width = 10, height = 8, dpi = 300)
Conference Comparison Analysis
# Get team information with conferences
team_info <- wehoop::espn_wbb_teams(year = 2024)
# Join with efficiency data
conference_analysis <- team_box %>%
left_join(team_info %>% select(team_id, team_abbreviation, team_conference_name),
by = c("team_id" = "team_id")) %>%
group_by(team_conference_name) %>%
summarise(
teams = n_distinct(team_id),
avg_pts = mean(team_score, na.rm = TRUE),
avg_fg_pct = mean(field_goal_pct, na.rm = TRUE) * 100,
avg_3pt_pct = mean(three_point_field_goal_pct, na.rm = TRUE) * 100,
avg_assists = mean(assists, na.rm = TRUE),
avg_turnovers = mean(turnovers, na.rm = TRUE),
avg_rebounds = mean(rebounds, na.rm = TRUE),
.groups = 'drop'
) %>%
arrange(desc(avg_pts))
# View conference statistics
print(conference_analysis)
# Visualize conference offensive styles
conference_analysis %>%
filter(!is.na(team_conference_name)) %>%
ggplot(aes(x = avg_3pt_pct, y = avg_assists)) +
geom_point(aes(size = avg_pts, color = team_conference_name), alpha = 0.7) +
geom_text(aes(label = team_conference_name),
size = 3, vjust = -1, check_overlap = TRUE) +
labs(
title = "Women's College Basketball Conference Offensive Styles",
subtitle = "3-Point Shooting vs Ball Movement",
x = "Three-Point Shooting % (Average)",
y = "Assists Per Game (Average)",
size = "Scoring"
) +
theme_minimal() +
theme(legend.position = "none")
ggsave("conference_styles_2024.png", width = 12, height = 8, dpi = 300)
Player Performance Tracking
# Track individual player performance over season
# (zoo::rollmean requires the zoo package to be installed)
player_performance <- player_box %>%
filter(athlete_display_name %in% c("Caitlin Clark", "Paige Bueckers",
"Hannah Hidalgo", "JuJu Watkins")) %>%
group_by(athlete_display_name) %>%
arrange(game_date, .by_group = TRUE) %>%
mutate(
game_number = row_number(),
rolling_avg_pts = zoo::rollmean(points, k = 5, fill = NA, align = "right")
) %>%
ungroup()
# Plot scoring trends
ggplot(player_performance, aes(x = game_number, y = points,
color = athlete_display_name)) +
geom_line(alpha = 0.3) +
geom_line(aes(y = rolling_avg_pts), linewidth = 1) +
facet_wrap(~athlete_display_name, ncol = 2) +
labs(
title = "Elite Player Scoring Trends - 2024 Season",
subtitle = "Individual games (light) and 5-game rolling average (bold)",
x = "Game Number",
y = "Points",
color = "Player"
) +
theme_minimal() +
theme(legend.position = "bottom")
ggsave("player_scoring_trends_2024.png", width = 12, height = 8, dpi = 300)
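For readers working in Python, the five-game rolling average above can be reproduced with pandas. This is a sketch on made-up data: the player labels, dates, and point totals are hypothetical, and the column names are chosen for this example rather than taken from any data source.

```python
import pandas as pd

# Hypothetical game log for two players, six games each
games = pd.DataFrame({
    "player": ["A"] * 6 + ["B"] * 6,
    "game_date": list(pd.date_range("2024-01-01", periods=6, freq="7D")) * 2,
    "points": [20, 25, 30, 15, 10, 40, 12, 14, 16, 18, 20, 22],
})

# Order each player's games chronologically before numbering them
games = games.sort_values(["player", "game_date"]).reset_index(drop=True)
games["game_number"] = games.groupby("player").cumcount() + 1

# Trailing 5-game mean per player; the first four games have no value,
# matching fill = NA with align = "right" in the R version
games["rolling_avg_pts"] = (
    games.groupby("player")["points"]
    .transform(lambda s: s.rolling(window=5).mean())
)

print(games[["player", "game_number", "points", "rolling_avg_pts"]])
```

Grouping before rolling keeps one player's games from leaking into another player's window.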
Shot Chart Analysis
# Analyze shooting efficiency by location (if play-by-play data available)
# Note: Detailed shot location data may require additional sources
# Calculate shooting zones from box score data
shooting_analysis <- player_box %>%
mutate(
two_pt_attempts = field_goals_attempted - three_point_field_goals_attempted,
two_pt_made = field_goals_made - three_point_field_goals_made,
two_pt_pct = ifelse(two_pt_attempts > 0, two_pt_made / two_pt_attempts, NA)
) %>%
group_by(athlete_display_name, team_short_display_name) %>%
summarise(
games = n(),
total_fga = sum(field_goals_attempted, na.rm = TRUE),
three_pt_rate = sum(three_point_field_goals_attempted, na.rm = TRUE) /
sum(field_goals_attempted, na.rm = TRUE),
three_pt_pct = sum(three_point_field_goals_made, na.rm = TRUE) /
sum(three_point_field_goals_attempted, na.rm = TRUE),
two_pt_pct = sum(two_pt_made, na.rm = TRUE) / sum(two_pt_attempts, na.rm = TRUE),
ft_rate = sum(free_throws_attempted, na.rm = TRUE) /
sum(field_goals_attempted, na.rm = TRUE),
.groups = 'drop'
) %>%
filter(games >= 20) %>%
arrange(desc(total_fga))
# Visualize shot selection
shooting_analysis %>%
filter(total_fga >= 300) %>%
ggplot(aes(x = three_pt_rate, y = three_pt_pct)) +
geom_point(aes(size = total_fga, color = two_pt_pct), alpha = 0.6) +
geom_hline(yintercept = mean(shooting_analysis$three_pt_pct, na.rm = TRUE),
linetype = "dashed", color = "gray50") +
geom_vline(xintercept = mean(shooting_analysis$three_pt_rate, na.rm = TRUE),
linetype = "dashed", color = "gray50") +
scale_color_gradient(low = "red", high = "green") +
labs(
title = "Women's College Basketball Shooting Profiles",
subtitle = "High-volume shooters (300+ FGA) - 2024 Season",
x = "Three-Point Attempt Rate",
y = "Three-Point Shooting %",
size = "Total FGA",
color = "2PT%"
) +
theme_minimal()
ggsave("shooting_profiles_2024.png", width = 12, height = 8, dpi = 300)
Transfer Portal Analytics
Understanding the Transfer Portal
The transfer portal has become a critical aspect of roster management in women's college basketball. Players can enter the portal for various reasons, and analytics help programs identify high-value transfers:
- Statistical Evaluation: Assess player production at previous school
- Level Adjustment: Account for competition level differences (D1 vs D2, conference strength)
- Fit Analysis: Match player skills with team needs and system
- Development Potential: Identify players with upside for growth
- Remaining Eligibility: Consider years left and immediate vs long-term impact
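Statistical evaluation of a transfer candidate usually starts by putting production on a common playing-time basis, since a reserve at one school may project as a starter at another. A minimal sketch of per-40-minute normalization (a regulation college game is 40 minutes); the candidate's totals are hypothetical:

```python
def per_40(total, minutes):
    """Scale a season total to a per-40-minute rate."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return 40.0 * total / minutes


# Hypothetical season totals for a transfer candidate
candidate = {"pts": 420, "reb": 150, "ast": 90, "min": 840}

per40 = {
    stat: round(per_40(total, candidate["min"]), 1)
    for stat, total in candidate.items()
    if stat != "min"
}
print(per40)
```

Per-40 rates are most trustworthy for players with a meaningful minutes sample, so it is reasonable to set a minimum-minutes cutoff before comparing candidates.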
Transfer Portal Evaluation Framework
import pandas as pd
import numpy as np


class TransferPortalAnalyzer:
    """Analyze and evaluate transfer portal candidates"""

    def __init__(self):
        self.position_weights = {
            'Guard': {'scoring': 0.35, 'assist': 0.25, 'defense': 0.20,
                      'efficiency': 0.20},
            'Forward': {'scoring': 0.30, 'rebounding': 0.25, 'defense': 0.25,
                        'efficiency': 0.20},
            'Center': {'rebounding': 0.35, 'defense': 0.30, 'scoring': 0.20,
                       'efficiency': 0.15}
        }

    def calculate_transfer_value(self, player_stats, position):
        """
        Calculate overall transfer value score

        Parameters:
        -----------
        player_stats : dict
            Player statistics dictionary
        position : str
            Player position (Guard, Forward, Center)

        Returns:
        --------
        dict
            Transfer value metrics
        """
        weights = self.position_weights[position]

        # Normalize statistics (0-100 scale, capped at 100)
        scoring_score = min(player_stats['ppg'] / 25.0 * 100, 100)
        assist_score = min(player_stats['apg'] / 8.0 * 100, 100)
        rebounding_score = min(player_stats['rpg'] / 12.0 * 100, 100)
        defense_score = min((player_stats['spg'] / 3.0 * 50) +
                            (player_stats['bpg'] / 3.0 * 50), 100)
        efficiency_score = min(player_stats['ts_pct'] * 100, 100)

        # Calculate weighted score based on position
        overall_score = 0
        if 'scoring' in weights:
            overall_score += scoring_score * weights['scoring']
        if 'assist' in weights:
            overall_score += assist_score * weights['assist']
        if 'rebounding' in weights:
            overall_score += rebounding_score * weights['rebounding']
        if 'defense' in weights:
            overall_score += defense_score * weights['defense']
        if 'efficiency' in weights:
            overall_score += efficiency_score * weights['efficiency']

        return {
            'overall_score': overall_score,
            'scoring_score': scoring_score,
            'assist_score': assist_score,
            'rebounding_score': rebounding_score,
            'defense_score': defense_score,
            'efficiency_score': efficiency_score
        }

    def adjust_for_competition(self, player_stats, from_conference, to_conference):
        """
        Adjust stats based on competition level change
        Simple adjustment factors (would be more sophisticated with historical data)
        """
        conference_strength = {
            'SEC': 1.00,
            'Big Ten': 0.98,
            'ACC': 0.96,
            'Big 12': 0.97,
            'Pac-12': 0.95,
            'Big East': 0.93,
            'AAC': 0.85,
            'Other': 0.80
        }
        from_strength = conference_strength.get(from_conference, 0.80)
        to_strength = conference_strength.get(to_conference, 0.80)

        # Moving to a stronger conference should discount production, so the
        # factor is below 1 when to_strength exceeds from_strength
        adjustment_factor = from_strength / to_strength

        adjusted_stats = player_stats.copy()
        adjusted_stats['ppg'] *= adjustment_factor
        adjusted_stats['rpg'] *= adjustment_factor
        adjusted_stats['apg'] *= adjustment_factor
        return adjusted_stats

    def evaluate_transfer_fit(self, player_stats, team_needs):
        """
        Evaluate how well a transfer fits team needs

        Parameters:
        -----------
        player_stats : dict
            Player statistics
        team_needs : dict
            Team needs with priorities (0-1 scale)

        Returns:
        --------
        float
            Fit score (0-100)
        """
        fit_score = 0
        if 'scoring' in team_needs:
            fit_score += (player_stats['ppg'] / 20.0) * team_needs['scoring'] * 100
        if 'playmaking' in team_needs:
            fit_score += (player_stats['apg'] / 6.0) * team_needs['playmaking'] * 100
        if 'rebounding' in team_needs:
            fit_score += (player_stats['rpg'] / 10.0) * team_needs['rebounding'] * 100
        if 'defense' in team_needs:
            defense_impact = (player_stats['spg'] + player_stats['bpg']) / 4.0
            fit_score += defense_impact * team_needs['defense'] * 100
        if 'shooting' in team_needs:
            fit_score += player_stats['three_pt_pct'] * team_needs['shooting'] * 100

        # Normalize by the total priority weight
        total_needs = sum(team_needs.values())
        if total_needs > 0:
            fit_score = fit_score / total_needs
        return min(fit_score, 100)

# Example usage
analyzer = TransferPortalAnalyzer()
# Evaluate a transfer candidate
player = {
'name': 'Jane Smith',
'position': 'Guard',
'from_school': 'Mid-Major University',
'from_conference': 'AAC',
'ppg': 18.5,
'rpg': 4.2,
'apg': 5.1,
'spg': 2.1,
'bpg': 0.3,
'three_pt_pct': 0.38,
'ts_pct': 0.58,
'years_remaining': 2
}
# Calculate transfer value
value = analyzer.calculate_transfer_value(player, player['position'])
print(f"Transfer Value Score: {value['overall_score']:.1f}")
print(f"Scoring: {value['scoring_score']:.1f}")
print(f"Efficiency: {value['efficiency_score']:.1f}")
# Adjust for competition level
adjusted_stats = analyzer.adjust_for_competition(player, 'AAC', 'Big Ten')
print(f"\nProjected Big Ten Stats:")
print(f"PPG: {adjusted_stats['ppg']:.1f}")
print(f"APG: {adjusted_stats['apg']:.1f}")
# Evaluate fit with team needs
team_needs = {
'scoring': 0.8,
'playmaking': 0.6,
'shooting': 0.9,
'defense': 0.5
}
fit_score = analyzer.evaluate_transfer_fit(player, team_needs)
print(f"\nTeam Fit Score: {fit_score:.1f}/100")
Transfer Success Prediction
Key factors that predict successful transfers in women's college basketball:
High Success Indicators:
- Consistent production (low variance in performance)
- High efficiency metrics (TS%, eFG%, low TOV rate)
- Good character/coachability feedback
- Transferring to similar or lower competition level
- Positional need match with destination team
- Multiple years of eligibility remaining
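The first indicator, consistent production, can be quantified with the coefficient of variation (standard deviation divided by the mean) of a player's game-by-game scoring. The sketch below is one simple way to measure it, not an established transfer model; the two game logs are hypothetical and were chosen to have the same scoring average:

```python
import statistics


def scoring_consistency(points_by_game):
    """Return (mean, coefficient of variation) for per-game points."""
    if len(points_by_game) < 2:
        raise ValueError("need at least two games")
    mean_pts = statistics.mean(points_by_game)
    if mean_pts == 0:
        raise ValueError("mean points is zero; ratio is undefined")
    # Sample standard deviation relative to the mean; lower is steadier
    cv = statistics.stdev(points_by_game) / mean_pts
    return mean_pts, cv


# Hypothetical game logs with identical averages
steady = [16, 18, 17, 19, 15, 17]
streaky = [4, 30, 9, 28, 6, 25]

steady_mean, steady_cv = scoring_consistency(steady)
streaky_mean, streaky_cv = scoring_consistency(streaky)
print(f"Steady:  {steady_mean:.1f} ppg, CV = {steady_cv:.2f}")
print(f"Streaky: {streaky_mean:.1f} ppg, CV = {streaky_cv:.2f}")
```

Both players average the same points per game, but the lower coefficient of variation identifies the one whose output is more predictable from night to night.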
WNBA Draft Projection Analytics
Draft Evaluation Framework
Projecting WNBA draft prospects from college requires evaluating both current production and professional potential:
import pandas as pd
import numpy as np


class WNBADraftProjection:
    """Project WNBA draft prospects from college statistics"""

    def __init__(self):
        # Historical weights based on WNBA success
        self.evaluation_weights = {
            'production': 0.30,      # College stats
            'efficiency': 0.25,      # Advanced metrics
            'physical_tools': 0.20,  # Size, athleticism
            'skill_level': 0.15,     # Technical skills
            'intangibles': 0.10      # Leadership, character
        }

        # Position-specific requirements (heights in inches)
        self.position_requirements = {
            'Point Guard': {
                'min_height': 68,  # 5'8"
                'key_stats': ['apg', 'ast_to_ratio', 'three_pt_pct'],
                'weights': {'playmaking': 0.40, 'shooting': 0.30, 'defense': 0.30}
            },
            'Shooting Guard': {
                'min_height': 70,  # 5'10"
                'key_stats': ['ppg', 'three_pt_pct', 'ts_pct'],
                'weights': {'scoring': 0.45, 'shooting': 0.35, 'defense': 0.20}
            },
            'Wing/Forward': {
                'min_height': 72,  # 6'0"
                'key_stats': ['ppg', 'rpg', 'three_pt_pct'],
                'weights': {'versatility': 0.40, 'shooting': 0.30, 'defense': 0.30}
            },
            'Post': {
                'min_height': 75,  # 6'3"
                'key_stats': ['rpg', 'bpg', 'fg_pct'],
                'weights': {'rebounding': 0.35, 'defense': 0.35, 'scoring': 0.30}
            }
        }

    def calculate_production_score(self, player_stats):
        """Calculate college production score"""
        # Normalize per-game stats
        scoring = min(player_stats['ppg'] / 25.0, 1.0) * 100
        rebounding = min(player_stats['rpg'] / 12.0, 1.0) * 100
        playmaking = min(player_stats['apg'] / 8.0, 1.0) * 100
        # Weighted average
        production = (scoring * 0.5 + rebounding * 0.25 + playmaking * 0.25)
        return production

    def calculate_efficiency_score(self, player_stats):
        """Calculate efficiency metrics"""
        ts_score = player_stats['ts_pct'] * 100
        usage_adjusted = player_stats['per'] * (player_stats['usg_pct'] / 25.0)
        efficiency = (ts_score * 0.4 + usage_adjusted * 0.6)
        return min(efficiency, 100)

    def calculate_physical_score(self, player_profile, position):
        """Evaluate physical tools for WNBA"""
        height_score = 0
        min_height = self.position_requirements[position]['min_height']
        if player_profile['height'] >= min_height + 3:
            height_score = 100
        elif player_profile['height'] >= min_height:
            height_score = 70 + ((player_profile['height'] - min_height) / 3.0) * 30
        else:
            height_score = 40 + ((player_profile['height'] - (min_height - 4)) / 4.0) * 30

        # Athleticism indicators (from stats). Profiles in this guide nest
        # per-game stats under 'stats', so look there before the top level.
        stats = player_profile.get('stats', player_profile)
        athleticism = min((stats.get('spg', 0) * 20 +
                           stats.get('bpg', 0) * 20), 100)
        physical = (height_score * 0.6 + athleticism * 0.4)
        return physical

    def calculate_skill_score(self, player_stats, position):
        """Evaluate skill level and versatility"""
        key_stats = self.position_requirements[position]['key_stats']
        skill_scores = []
        if 'three_pt_pct' in key_stats:
            shooting = min(player_stats.get('three_pt_pct', 0) * 100, 100)
            skill_scores.append(shooting)
        if 'apg' in key_stats:
            playmaking = min(player_stats.get('apg', 0) / 6.0 * 100, 100)
            skill_scores.append(playmaking)
        if 'ast_to_ratio' in key_stats:
            decision = min(player_stats.get('ast_to_ratio', 0) / 3.0 * 100, 100)
            skill_scores.append(decision)
        if 'fg_pct' in key_stats:
            finishing = player_stats.get('fg_pct', 0) * 100
            skill_scores.append(finishing)
        return np.mean(skill_scores) if skill_scores else 50

    def project_draft_position(self, player_stats, player_profile, position):
        """
        Project overall draft grade and likely position

        Returns:
        --------
        dict with draft projection information
        """
        # Calculate component scores
        production = self.calculate_production_score(player_stats)
        efficiency = self.calculate_efficiency_score(player_stats)
        physical = self.calculate_physical_score(player_profile, position)
        skills = self.calculate_skill_score(player_stats, position)
        intangibles = player_profile.get('intangibles_score', 70)  # Default if not provided

        # Overall grade
        overall_grade = (
            production * self.evaluation_weights['production'] +
            efficiency * self.evaluation_weights['efficiency'] +
            physical * self.evaluation_weights['physical_tools'] +
            skills * self.evaluation_weights['skill_level'] +
            intangibles * self.evaluation_weights['intangibles']
        )

        # Draft position projection (three rounds; 36 picks when 12 teams
        # select, so picks 1-12 are the 1st round and 13-24 the 2nd)
        if overall_grade >= 85:
            draft_range = "Top 5 Pick (All-WNBA Potential)"
        elif overall_grade >= 75:
            draft_range = "1st Round (picks 6-12)"
        elif overall_grade >= 65:
            draft_range = "Early-to-Mid 2nd Round (picks 13-20)"
        elif overall_grade >= 55:
            draft_range = "Late 2nd / 3rd Round (picks 21-36)"
        else:
            draft_range = "Undrafted / Training Camp Invite"

        return {
            'overall_grade': overall_grade,
            'draft_range': draft_range,
            'production_score': production,
            'efficiency_score': efficiency,
            'physical_score': physical,
            'skill_score': skills,
            'intangibles_score': intangibles,
            'strengths': self._identify_strengths(player_stats, player_profile),
            'development_areas': self._identify_weaknesses(player_stats, player_profile)
        }

    def _identify_strengths(self, player_stats, player_profile):
        """Identify player strengths"""
        strengths = []
        if player_stats.get('ppg', 0) >= 18:
            strengths.append("Elite scorer")
        if player_stats.get('three_pt_pct', 0) >= 0.38:
            strengths.append("Consistent three-point shooter")
        if player_stats.get('apg', 0) >= 5:
            strengths.append("Strong playmaker")
        if player_stats.get('rpg', 0) >= 8:
            strengths.append("Strong rebounder")
        if player_stats.get('spg', 0) >= 2:
            strengths.append("Disruptive defender")
        if player_stats.get('ts_pct', 0) >= 0.60:
            strengths.append("Highly efficient")
        return strengths if strengths else ["Solid all-around player"]

    def _identify_weaknesses(self, player_stats, player_profile):
        """Identify areas for development"""
        weaknesses = []
        if player_stats.get('three_pt_pct', 0) < 0.30:
            weaknesses.append("Three-point shooting consistency")
        if player_stats.get('ast_to_ratio', 0) < 1.5:
            weaknesses.append("Decision-making / turnover issues")
        if player_stats.get('ft_pct', 0) < 0.70:
            weaknesses.append("Free throw shooting")
        if player_profile.get('height', 75) < 70 and player_stats.get('spg', 0) < 1.5:
            weaknesses.append("Defensive impact")
        return weaknesses if weaknesses else ["Continue developing all-around game"]


# Example: Evaluate draft prospects
draft_eval = WNBADraftProjection()

# Top prospect profile
prospect = {
    'name': 'Paige Bueckers',
    'position': 'Shooting Guard',
    'height': 71,  # 5'11"
    'stats': {
        'ppg': 21.2,
        'rpg': 5.2,
        'apg': 3.8,
        'spg': 2.3,
        'bpg': 1.4,
        'three_pt_pct': 0.412,
        'fg_pct': 0.532,
        'ft_pct': 0.827,
        'ts_pct': 0.639,
        'per': 28.5,
        'usg_pct': 26.3,
        'ast_to_ratio': 2.1
    },
    'intangibles_score': 90  # Leadership, winning experience
}

projection = draft_eval.project_draft_position(
    prospect['stats'],
    prospect,
    prospect['position']
)

print(f"WNBA Draft Projection: {prospect['name']}")
print(f"Overall Grade: {projection['overall_grade']:.1f}/100")
print(f"Projected Range: {projection['draft_range']}")
print("\nComponent Scores:")
print(f"  Production: {projection['production_score']:.1f}")
print(f"  Efficiency: {projection['efficiency_score']:.1f}")
print(f"  Physical Tools: {projection['physical_score']:.1f}")
print(f"  Skill Level: {projection['skill_score']:.1f}")
print(f"  Intangibles: {projection['intangibles_score']:.1f}")
print(f"\nStrengths: {', '.join(projection['strengths'])}")
print(f"Development Areas: {', '.join(projection['development_areas'])}")
Draft Prospect Comparison
Compare multiple prospects to create draft boards:
import matplotlib.pyplot as plt
import pandas as pd


def create_prospect_comparison(prospects_list, draft_evaluator):
    """Create visual comparison of draft prospects"""
    evaluations = []
    for prospect in prospects_list:
        eval_result = draft_evaluator.project_draft_position(
            prospect['stats'],
            prospect,
            prospect['position']
        )
        eval_result['name'] = prospect['name']
        eval_result['position'] = prospect['position']
        evaluations.append(eval_result)

    # Create DataFrame, sorted by overall grade (best prospect first)
    df = pd.DataFrame(evaluations)
    df = df.sort_values('overall_grade', ascending=False)

    # Create visualization
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 8))

    # Overall grades
    ax1.barh(df['name'], df['overall_grade'], color='steelblue')
    ax1.invert_yaxis()  # barh draws the first row at the bottom; flip so the top grade is on top
    ax1.set_xlabel('Overall Grade')
    ax1.set_title('WNBA Draft Prospect Rankings', fontsize=14, fontweight='bold')
    ax1.axvline(x=85, color='gold', linestyle='--', label='Top 5 Threshold')
    ax1.axvline(x=75, color='silver', linestyle='--', label='Lottery Threshold')
    ax1.legend()

    # Component scores: one line per prospect across the five categories
    categories = ['production_score', 'efficiency_score', 'physical_score',
                  'skill_score', 'intangibles_score']
    for _, row in df.head(5).iterrows():
        values = [row[cat] for cat in categories]
        ax2.plot(categories, values, marker='o', label=row['name'])
    ax2.set_title('Top 5 Prospects - Component Breakdown',
                  fontsize=14, fontweight='bold')
    ax2.set_ylim(0, 100)
    ax2.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
    ax2.grid(True, alpha=0.3)
    # Rotate the category labels on the right-hand panel only
    plt.setp(ax2.get_xticklabels(), rotation=45, ha='right')

    plt.tight_layout()
    return fig, df

# Example usage with multiple prospects
prospects = [
    {
        'name': 'Paige Bueckers',
        'position': 'Shooting Guard',
        'height': 71,
        'stats': {'ppg': 21.2, 'rpg': 5.2, 'apg': 3.8, 'spg': 2.3,
                  'three_pt_pct': 0.412, 'ts_pct': 0.639, 'per': 28.5,
                  'usg_pct': 26.3, 'ast_to_ratio': 2.1},
        'intangibles_score': 90
    },
    {
        'name': 'Hannah Hidalgo',
        'position': 'Point Guard',
        'height': 68,
        'stats': {'ppg': 22.6, 'rpg': 6.2, 'apg': 5.5, 'spg': 4.6,
                  'three_pt_pct': 0.368, 'ts_pct': 0.571, 'per': 27.8,
                  'usg_pct': 28.1, 'ast_to_ratio': 1.9},
        'intangibles_score': 85
    },
    {
        'name': 'Kiki Iriafen',
        'position': 'Wing/Forward',
        'height': 75,
        'stats': {'ppg': 19.4, 'rpg': 11.1, 'apg': 2.8, 'spg': 1.5,
                  'three_pt_pct': 0.429, 'ts_pct': 0.618, 'per': 26.2,
                  'usg_pct': 24.5, 'ast_to_ratio': 1.4},
        'intangibles_score': 80
    }
]

fig, rankings = create_prospect_comparison(prospects, draft_eval)
fig.savefig('wnba_draft_prospects_2025.png', dpi=300, bbox_inches='tight')

print("\nWNBA Draft Big Board:")
print(rankings[['name', 'position', 'overall_grade', 'draft_range']].to_string(index=False))
Practical Applications for College Programs
1. Opponent Scouting System
Build an analytics-driven scouting report for upcoming opponents:
- Offensive Tendencies: Shot selection, pace, preferred actions (PnR, post-ups, transition)
- Key Players: Usage rates, efficiency, hot zones
- Defensive Schemes: Primary coverage (man, zone), pressure, transition defense
- Situational Analysis: Clutch performance, timeout effectiveness, substitution patterns
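As a rough illustration of the offensive-tendencies item, the sketch below tallies how often an opponent runs each play type and how many points per possession it yields. The function name `summarize_tendencies` and the `play_type` / `points` field names are hypothetical; real charting exports (Synergy, Hudl, or hand-tagged film) will use their own schema.

```python
from collections import defaultdict


def summarize_tendencies(possessions):
    """Return frequency and points per possession (PPP) for each play type.

    `possessions` is a list of dicts with hypothetical keys
    'play_type' (str) and 'points' (int scored on that possession).
    """
    totals = defaultdict(lambda: {"count": 0, "points": 0})
    for poss in possessions:
        entry = totals[poss["play_type"]]
        entry["count"] += 1
        entry["points"] += poss["points"]

    n = len(possessions)
    report = {}
    for play_type, entry in totals.items():
        report[play_type] = {
            "frequency": entry["count"] / n,          # share of all possessions
            "ppp": entry["points"] / entry["count"],  # efficiency of this action
        }
    return report


# Tiny made-up sample, for illustration only
sample = [
    {"play_type": "pnr", "points": 2},
    {"play_type": "pnr", "points": 0},
    {"play_type": "transition", "points": 3},
    {"play_type": "post_up", "points": 0},
]
tendencies = summarize_tendencies(sample)
for play_type, row in sorted(tendencies.items()):
    print(f"{play_type}: {row['frequency']:.0%} of possessions, {row['ppp']:.2f} PPP")
```

Sorting the same report by frequency or by PPP gives a quick view of what the opponent runs most and what it runs best, which are not always the same action.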
2. Player Development Tracking
Monitor individual player improvement throughout the season:
- Track rolling averages for key metrics (shooting %, assist/TO ratio, rebounding rate)
- Compare early-season and late-season performance
- Identify skill development areas showing improvement
- Set measurable goals and track progress
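One way to track a rolling shooting percentage is sketched below. It sums makes and attempts over the window instead of averaging per-game percentages, so a 1-for-2 night does not count as much as a 9-for-18 night. The function name `rolling_fg_pct` and the `fgm` / `fga` keys are assumptions made for this example.

```python
from collections import deque


def rolling_fg_pct(game_log, window=5):
    """Field-goal percentage over the most recent `window` games.

    `game_log` is a chronological list of dicts with hypothetical keys
    'fgm' (makes) and 'fga' (attempts). Returns one value per game;
    None when the window contains no attempts.
    """
    recent = deque(maxlen=window)  # old games drop off automatically
    result = []
    for game in game_log:
        recent.append((game["fgm"], game["fga"]))
        made = sum(m for m, _ in recent)
        attempted = sum(a for _, a in recent)
        result.append(made / attempted if attempted else None)
    return result


# Made-up game log, for illustration only
game_log = [
    {"fgm": 4, "fga": 10},
    {"fgm": 6, "fga": 10},
    {"fgm": 2, "fga": 10},
    {"fgm": 8, "fga": 10},
]
trend = rolling_fg_pct(game_log, window=2)
print([round(x, 3) for x in trend])
```

The same pattern works for assist/turnover ratio or rebounding rate: accumulate the numerator and denominator separately over the window, then divide.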
3. Lineup Optimization
Use analytics to determine the most effective lineup combinations:
- Calculate net rating for different 5-player combinations
- Analyze spacing (3PT shooting distribution)
- Evaluate defensive versatility and switchability
- Balance ball-handling and playmaking responsibilities
4. Recruiting Analytics
Take a data-driven approach to identifying and evaluating recruits:
- Evaluate high school statistics with competition level adjustments
- Compare recruits to successful players in your system
- Project college impact based on skill profile
- Identify undervalued prospects (high efficiency, low visibility)
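A minimal sketch of the first and last items is shown below: scale raw scoring by a competition-level factor, then flag efficient players with little national visibility. The factor values, thresholds, and field names are all placeholders invented for this example, not calibrated numbers; a real adjustment would be fit to how past recruits from each level performed in college.

```python
# Hypothetical competition-level multipliers (placeholders, not calibrated)
COMPETITION_FACTORS = {"national": 1.00, "state_top": 0.90, "regional": 0.80}
DEFAULT_FACTOR = 0.75  # assumed fallback for unlisted levels


def evaluate_recruits(recruits, ts_threshold=0.58, visibility_cutoff=100):
    """Adjust scoring for competition level and flag undervalued prospects.

    A recruit is flagged when true shooting is at or above `ts_threshold`
    and she is unranked or ranked outside `visibility_cutoff`.
    """
    evaluated = []
    for recruit in recruits:
        factor = COMPETITION_FACTORS.get(recruit["competition_level"], DEFAULT_FACTOR)
        rank = recruit.get("national_rank")  # None means unranked
        low_visibility = rank is None or rank > visibility_cutoff
        evaluated.append({
            "name": recruit["name"],
            "adjusted_ppg": round(recruit["ppg"] * factor, 1),
            "undervalued": recruit["ts_pct"] >= ts_threshold and low_visibility,
        })
    return evaluated


# Made-up recruits, for illustration only
recruits = [
    {"name": "Recruit A", "ppg": 20.0, "ts_pct": 0.60,
     "competition_level": "national", "national_rank": 15},
    {"name": "Recruit B", "ppg": 25.0, "ts_pct": 0.62,
     "competition_level": "regional", "national_rank": None},
]
board = evaluate_recruits(recruits)
for row in board:
    tag = " (undervalued)" if row["undervalued"] else ""
    print(f"{row['name']}: {row['adjusted_ppg']} adjusted PPG{tag}")
```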
5. Game Strategy Analytics
In-game and pregame strategic decisions informed by data:
Key Strategic Questions:
- What pace should we play to maximize our win probability?
- Which defensive scheme is most effective against this opponent's personnel?
- When should we use full-court pressure?
- Which players should take end-of-game shots?
- What are optimal substitution patterns to manage fatigue?
6. Season Planning and Load Management
Use analytics to optimize practice intensity and player workload:
- Monitor minutes played and fatigue indicators
- Schedule rest for key players in non-conference games
- Track injury risk factors (minutes, previous injuries)
- Balance development of bench players with winning games
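For minutes monitoring, one simple approach is to compare a player's recent workload with her longer-run weekly average. The sketch below uses a 7-day window against a 28-day baseline and a ratio limit of 1.5; those window lengths and the limit are assumptions chosen for illustration, not validated injury-risk thresholds, and the function names are hypothetical.

```python
from datetime import date, timedelta


def minutes_in_window(game_minutes, as_of, days):
    """Total minutes played in the `days`-day window ending on `as_of` (inclusive)."""
    start = as_of - timedelta(days=days - 1)
    return sum(minutes for day, minutes in game_minutes if start <= day <= as_of)


def workload_flag(game_minutes, as_of, acute_days=7, chronic_days=28, ratio_limit=1.5):
    """Compare recent minutes with the longer-run average for the same span.

    `game_minutes` is a list of (date, minutes) tuples for one player.
    """
    acute = minutes_in_window(game_minutes, as_of, acute_days)
    chronic_total = minutes_in_window(game_minutes, as_of, chronic_days)
    # Scale the long window down to the length of the short one
    chronic_avg = chronic_total * acute_days / chronic_days
    ratio = acute / chronic_avg if chronic_avg else None
    return {
        "acute_minutes": acute,
        "chronic_avg_minutes": chronic_avg,
        "ratio": ratio,
        "flag": ratio is not None and ratio > ratio_limit,
    }


# Made-up minutes log, for illustration only
log = [
    (date(2025, 1, 2), 20), (date(2025, 1, 9), 20), (date(2025, 1, 16), 20),
    (date(2025, 1, 22), 38), (date(2025, 1, 25), 40), (date(2025, 1, 27), 38),
]
status = workload_flag(log, as_of=date(2025, 1, 28))
print(status)
```

A flag here is a prompt for a conversation with the athletic training staff, not a decision rule; game minutes alone ignore practice load, travel, and injury history.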
Resources for Further Learning
Data Sources
Analytics Communities
- Women's Basketball Coaches Association (WBCA) analytics workshops
- Twitter/X Women's Basketball Analytics community (#WBBAnalytics)
- SportsDataverse Discord (wehoop support and community)
Recommended Reading
- Basketball Analytics: Spatial Tracking by Stephen Shea
- Sprawlball by Kirk Goldsberry (principles applicable to women's game)
- Her Hoop Stats blog articles on women's basketball analytics
Getting Started
Start with publicly available data from NCAA.com and Her Hoop Stats. Use the code examples in this guide to build simple analyses. As you become more comfortable, expand to more sophisticated models and custom data collection. The women's college basketball analytics community is growing rapidly and welcomes new contributors.