College Women's Basketball Analytics

Beginner, 10 min read, Nov 27, 2025

Women's college basketball analytics has experienced tremendous growth in recent years, driven by increased investment in programs, better data availability, and the rising prominence of women's basketball. This guide covers the state of analytics in women's college basketball, key data sources, practical analysis techniques, and applications for coaches, analysts, and recruiters.

State of Women's College Basketball Analytics

Current Landscape

Women's college basketball analytics has evolved significantly, particularly since the explosive growth in viewership and investment following the 2023-24 season. Key developments include:

  • Increased Data Availability: More programs are investing in video and scouting platforms (Synergy, Hudl) and analytics personnel
  • Advanced Metrics Adoption: Teams are using efficiency ratings, possession-based metrics, and shot quality analysis
  • Public Analytics Growth: Platforms like Her Hoop Stats and NCAA.com provide detailed statistics and rankings
  • Transfer Portal Era: Analytics is now crucial for evaluating transfer candidates and constructing rosters
  • WNBA Pipeline: Draft analytics help identify professional prospects and developmental trajectories

Analytics Gaps and Opportunities

Despite growth, women's college basketball analytics still lags behind men's basketball in several areas:

Current Gaps

  • Less tracking data availability
  • Fewer public advanced stats platforms
  • Limited play-by-play data for some conferences
  • Smaller analytics community and resources

Growing Opportunities

  • Expanding data partnerships (NCAA, conferences)
  • Increased media coverage and interest
  • Growing WNBA investment in scouting
  • More programs hiring dedicated analysts

Key Data Sources for Women's College Basketball

1. NCAA Statistics

The NCAA provides official statistics through NCAA.com/stats, including:

  • Team Statistics: Scoring, rebounding, shooting percentages, turnovers
  • Individual Statistics: Player averages, efficiency ratings, per-game stats
  • Conference Standings: Records and NET rankings (NET replaced RPI as the women's tournament selection metric starting with the 2020-21 season)
  • Tournament Data: March Madness statistics and historical performance

Access: NCAA stats are available at ncaa.com/stats with downloadable CSV files for many categories.

2. Her Hoop Stats

Her Hoop Stats is the premier independent analytics platform for women's basketball, offering:

  • Advanced Team Metrics: Efficiency ratings, tempo-adjusted statistics, strength of schedule
  • Player Ratings: Box score-based player evaluation metrics
  • Game Predictions: Statistical models for game outcomes and tournament projections
  • Historical Database: Multi-year data for trend analysis and player development tracking
  • Transfer Portal Tracker: Comprehensive database of players in the transfer portal with statistics

3. Conference Websites and Stats Services

Many conferences provide detailed statistics through their official websites or third-party platforms:

  • Major Conferences: SEC, Big Ten, ACC, Big 12, Pac-12 provide comprehensive stats packages
  • Synergy Sports: Video and statistical platform used by many programs (subscription required)
  • StatBroadcast: Live stats platform used by many conferences with play-by-play data

4. wehoop Package (R)

The wehoop R package provides programmatic access to women's college basketball data, making it easy to retrieve and analyze statistics in R. It's part of the SportsDataverse ecosystem.

Python Analysis for Women's College Basketball

Data Collection and Processing

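Python tooling for women's college basketball is less established than wehoop is for R. The SportsDataverse project does publish a Python package (sportsdataverse), but when it does not cover what you need, you can collect data using web scraping and API techniques: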

from io import StringIO

import pandas as pd
import requests
from bs4 import BeautifulSoup

class WomensCollegeBasketballScraper:
    """Scrape and process women's college basketball statistics"""

    def __init__(self, timeout=10):
        # NOTE: the URL layout built from this base follows this guide's example
        # and may not match the live site; confirm it in a browser first.
        self.base_url = "https://www.ncaa.com/stats/basketball-women/d1"
        self.headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'
        }
        self.timeout = timeout  # seconds before a request is abandoned

    def _fetch_table(self, url, season, stat_category):
        """Download a page and return its first HTML table as a DataFrame"""
        try:
            response = requests.get(url, headers=self.headers, timeout=self.timeout)
            response.raise_for_status()

            # Parse HTML
            soup = BeautifulSoup(response.content, 'html.parser')
            table = soup.find('table')
            if table is None:
                print(f"No table found for {stat_category}")
                return pd.DataFrame()

            # Wrap the markup in StringIO: recent pandas versions deprecate
            # passing a literal HTML string to read_html
            df = pd.read_html(StringIO(str(table)))[0]
            df['season'] = season
            df['category'] = stat_category
            return df

        except (requests.RequestException, ValueError) as e:
            print(f"Error fetching {stat_category}: {e}")
            return pd.DataFrame()

    def get_team_stats(self, season, stat_category):
        """
        Retrieve team statistics from NCAA.com

        Parameters:
        -----------
        season : str
            Season year (e.g., '2024')
        stat_category : str
            Category like 'scoring-offense', 'scoring-defense', 'field-goal-pct'

        Returns:
        --------
        pd.DataFrame
            Team statistics data (empty if the request or parsing fails)
        """
        url = f"{self.base_url}/{season}/{stat_category}"
        return self._fetch_table(url, season, stat_category)

    def get_player_stats(self, season, stat_category):
        """Retrieve individual player statistics (empty DataFrame on failure)"""
        url = f"{self.base_url}/{season}/player/{stat_category}"
        return self._fetch_table(url, season, stat_category)

# Example usage
scraper = WomensCollegeBasketballScraper()

# Get team offensive statistics
team_offense = scraper.get_team_stats('2024', 'scoring-offense')
team_defense = scraper.get_team_stats('2024', 'scoring-defense')

print("Top 10 Scoring Offenses:")
print(team_offense.head(10))

# Get top scorers
top_scorers = scraper.get_player_stats('2024', 'points')
print("\nTop Scorers:")
print(top_scorers.head(10))
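
When a scraper like the one above is run repeatedly during development, it is considerate to avoid requesting the same page twice. The following is a minimal sketch of a disk cache, not part of the scraper class: the helper name `cached_fetch`, the injected `fetch` callable, and the example URL are all assumptions made for illustration. A stand-in fetcher is used so the example runs without network access.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable


def cached_fetch(url: str, fetch: Callable[[str], str], cache_dir: Path) -> str:
    """Return the page body for `url`, calling `fetch` only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the URL so that any URL maps to a safe file name.
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    cache_file = cache_dir / f"{key}.html"

    if cache_file.exists():
        return cache_file.read_text(encoding="utf-8")

    body = fetch(url)
    cache_file.write_text(body, encoding="utf-8")
    return body


# Demonstration with a stand-in fetcher (hypothetical; no network needed).
calls = []


def fake_fetch(url: str) -> str:
    calls.append(url)
    return "<table><tr><td>demo</td></tr></table>"


with tempfile.TemporaryDirectory() as tmp:
    first = cached_fetch("https://example.com/stats", fake_fetch, Path(tmp))
    second = cached_fetch("https://example.com/stats", fake_fetch, Path(tmp))

print(len(calls))  # the fetcher ran once; the second call was served from disk
```

In real use, `fetch` could be a small function that wraps `requests.get(...).text`; keeping it injectable makes the cache easy to test offline.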

Four Factors Analysis

Implementing Dean Oliver's Four Factors for women's college basketball:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

class FourFactorsAnalyzer:
    """Calculate and analyze Four Factors for women's college basketball"""

    def calculate_four_factors(self, team_stats):
        """
        Calculate Four Factors from team statistics

        Parameters:
        -----------
        team_stats : pd.DataFrame
            DataFrame with columns: FGM, FGA, 3PM, FTM, FTA, ORB, DRB, TOV,
                                   Opp_FGM, Opp_FGA, Opp_3PM, Opp_FTM, Opp_FTA,
                                   Opp_ORB, Opp_DRB, Opp_TOV

        Returns:
        --------
        pd.DataFrame
            DataFrame with Four Factors calculated
        """
        df = team_stats.copy()

        # Offensive Four Factors
        df['eFG%'] = (df['FGM'] + 0.5 * df['3PM']) / df['FGA']
        df['TOV%'] = df['TOV'] / (df['FGA'] + 0.44 * df['FTA'] + df['TOV'])
        df['ORB%'] = df['ORB'] / (df['ORB'] + df['Opp_DRB'])
        df['FTRate'] = df['FTM'] / df['FGA']

        # Defensive Four Factors
        df['Opp_eFG%'] = (df['Opp_FGM'] + 0.5 * df['Opp_3PM']) / df['Opp_FGA']
        df['Opp_TOV%'] = df['Opp_TOV'] / (df['Opp_FGA'] + 0.44 * df['Opp_FTA'] + df['Opp_TOV'])
        df['DRB%'] = df['DRB'] / (df['DRB'] + df['Opp_ORB'])
        df['Opp_FTRate'] = df['Opp_FTM'] / df['Opp_FGA']

        # Composite index using Oliver's approximate 40/25/20/15 weights
        # (a relative score, not points per possession)
        df['Off_Efficiency'] = (df['eFG%'] * 0.40 +
                                (1 - df['TOV%']) * 0.25 +
                                df['ORB%'] * 0.20 +
                                df['FTRate'] * 0.15)

        df['Def_Efficiency'] = ((1 - df['Opp_eFG%']) * 0.40 +
                                df['Opp_TOV%'] * 0.25 +
                                df['DRB%'] * 0.20 +
                                (1 - df['Opp_FTRate']) * 0.15)

        return df

    def plot_four_factors(self, team_stats, team_name):
        """Create radar chart of Four Factors"""
        fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 6),
                                        subplot_kw=dict(projection='polar'))

        team_data = team_stats[team_stats['Team'] == team_name].iloc[0]

        # Offensive factors
        # TOV% is plotted as 1 - TOV% so that larger is better on every axis
        categories = ['eFG%', '1 - TOV%', 'ORB%', 'FTRate']
        values = [team_data['eFG%'], 1 - team_data['TOV%'],
                  team_data['ORB%'], team_data['FTRate']]

        angles = np.linspace(0, 2 * np.pi, len(categories), endpoint=False).tolist()
        values += values[:1]
        angles += angles[:1]

        ax1.plot(angles, values, 'o-', linewidth=2, label=team_name)
        ax1.fill(angles, values, alpha=0.25)
        ax1.set_xticks(angles[:-1])
        ax1.set_xticklabels(categories)
        ax1.set_ylim(0, 1)
        ax1.set_title('Offensive Four Factors', pad=20)
        ax1.grid(True)

        # Defensive factors
        # Opponent eFG% and FT rate are inverted so that larger is better
        categories_def = ['1 - Opp eFG%', 'Opp TOV%', 'DRB%', '1 - Opp FTRate']
        values_def = [1 - team_data['Opp_eFG%'], team_data['Opp_TOV%'],
                      team_data['DRB%'], 1 - team_data['Opp_FTRate']]
        values_def += values_def[:1]

        ax2.plot(angles, values_def, 'o-', linewidth=2, label=team_name, color='red')
        ax2.fill(angles, values_def, alpha=0.25, color='red')
        ax2.set_xticks(angles[:-1])
        ax2.set_xticklabels(categories_def)
        ax2.set_ylim(0, 1)
        ax2.set_title('Defensive Four Factors', pad=20)
        ax2.grid(True)

        plt.tight_layout()
        return fig

# Example usage
# Illustrative season totals (placeholder numbers, not actual team statistics)
team_stats = pd.DataFrame({
    'Team': ['South Carolina', 'UConn', 'Iowa', 'LSU'],
    'FGM': [870, 825, 910, 840],
    'FGA': [1750, 1680, 1850, 1720],
    '3PM': [210, 245, 280, 225],
    'FTM': [520, 480, 550, 490],
    'FTA': [680, 620, 710, 640],
    'ORB': [385, 340, 360, 370],
    'DRB': [920, 880, 850, 870],
    'TOV': [410, 385, 450, 420],
    'Opp_FGM': [720, 750, 780, 760],
    'Opp_FGA': [1680, 1720, 1780, 1740],
    'Opp_3PM': [185, 195, 210, 200],
    'Opp_FTM': [450, 470, 490, 475],
    'Opp_FTA': [590, 610, 640, 620],
    'Opp_ORB': [290, 310, 330, 315],
    'Opp_DRB': [780, 760, 740, 755],
    'Opp_TOV': [490, 510, 480, 495]
})

analyzer = FourFactorsAnalyzer()
team_stats_with_factors = analyzer.calculate_four_factors(team_stats)

print("Four Factors Analysis:")
print(team_stats_with_factors[['Team', 'eFG%', 'TOV%', 'ORB%', 'FTRate',
                                'Off_Efficiency', 'Def_Efficiency']])

# Plot for specific team
fig = analyzer.plot_four_factors(team_stats_with_factors, 'South Carolina')
plt.savefig('south_carolina_four_factors.png', dpi=300, bbox_inches='tight')
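
To make the formulas concrete, here is a dependency-free sketch that recomputes the offensive Four Factors for the South Carolina row of the example table. The function name `four_factors` is introduced here only for illustration; the formulas are the same ones used in `calculate_four_factors`.

```python
def four_factors(fgm, fga, tpm, ftm, fta, orb, tov, opp_drb):
    """Offensive Four Factors from season totals."""
    return {
        "eFG%": (fgm + 0.5 * tpm) / fga,          # made threes count 1.5x
        "TOV%": tov / (fga + 0.44 * fta + tov),   # turnovers per play
        "ORB%": orb / (orb + opp_drb),            # share of available offensive boards
        "FTRate": ftm / fga,                      # made free throws per field goal attempt
    }


# South Carolina row from the example table
sc = four_factors(fgm=870, fga=1750, tpm=210, ftm=520, fta=680,
                  orb=385, tov=410, opp_drb=780)

for name, value in sc.items():
    print(f"{name}: {value:.3f}")
```

With these inputs the effective field goal percentage works out to 975 / 1750, or about 0.557, which should match the first row of the DataFrame output.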

Player Evaluation Metrics

Calculate advanced player statistics for women's college basketball:

import pandas as pd
import numpy as np

class PlayerMetrics:
    """Calculate advanced player metrics for women's college basketball"""

    def __init__(self):
        self.league_avg_pace = 70.0  # Approximate for women's CBB
        self.league_avg_ortg = 100.0

    def calculate_per(self, player_stats):
        """
        Calculate a simplified, PER-style efficiency rating
        (a box-score approximation, not Hollinger's full PER formula)

        Parameters:
        -----------
        player_stats : pd.DataFrame
            Player statistics with columns: PTS, AST, REB, STL, BLK,
                                          TOV, FGM, FGA, FTM, FTA, MIN

        Returns:
        --------
        pd.DataFrame
            DataFrame with PER calculated
        """
        df = player_stats.copy()

        # Simplified per-minute linear rating: box-score credits minus missed
        # shots, missed free throws, and turnovers (no pace or league adjustment)
        df['uPER'] = (df['PTS'] + df['REB'] + df['AST'] + df['STL'] + df['BLK'] -
                      df['FGA'] - df['FTA'] + df['FGM'] + df['FTM'] - df['TOV']) / df['MIN']

        # Scale so the average player in this DataFrame rates 15.0
        league_uper = df['uPER'].mean()
        df['PER'] = (df['uPER'] / league_uper) * 15.0

        return df

    def calculate_true_shooting(self, player_stats):
        """
        Calculate True Shooting Percentage

        TS% = PTS / (2 * (FGA + 0.44 * FTA))
        """
        df = player_stats.copy()
        df['TS%'] = df['PTS'] / (2 * (df['FGA'] + 0.44 * df['FTA']))
        return df

    def calculate_usage_rate(self, player_stats, team_stats):
        """
        Calculate player usage rate

        USG% = 100 * ((FGA + 0.44 * FTA + TOV) * (Team_MIN / 5)) /
               (MIN * (Team_FGA + 0.44 * Team_FTA + Team_TOV))
        """
        df = player_stats.copy()

        for idx, row in df.iterrows():
            team = row['Team']
            team_data = team_stats[team_stats['Team'] == team].iloc[0]

            player_possessions = row['FGA'] + 0.44 * row['FTA'] + row['TOV']
            team_possessions = team_data['FGA'] + 0.44 * team_data['FTA'] + team_data['TOV']
            team_min = team_data['MIN']

            df.at[idx, 'USG%'] = 100 * ((player_possessions * (team_min / 5)) /
                                        (row['MIN'] * team_possessions))

        return df

    def calculate_assist_ratio(self, player_stats):
        """
        Calculate assist-to-turnover ratio
        """
        df = player_stats.copy()
        # Treat zero turnovers as one to avoid division by zero
        df['AST/TO'] = df['AST'] / df['TOV'].replace(0, 1)
        return df

    def calculate_comprehensive_metrics(self, player_stats, team_stats):
        """Calculate all advanced metrics"""
        df = player_stats.copy()

        df = self.calculate_per(df)
        df = self.calculate_true_shooting(df)
        df = self.calculate_usage_rate(df, team_stats)
        df = self.calculate_assist_ratio(df)

        return df

# Example usage (illustrative numbers, not actual season totals)
player_stats = pd.DataFrame({
    'Player': ['Caitlin Clark', 'Paige Bueckers', 'Hannah Hidalgo', 'JuJu Watkins'],
    'Team': ['Iowa', 'UConn', 'Notre Dame', 'USC'],
    'PTS': [812, 456, 634, 718],
    'AST': [241, 132, 168, 145],
    'REB': [227, 142, 176, 189],
    'STL': [45, 38, 92, 67],
    'BLK': [21, 12, 28, 34],
    'TOV': [162, 78, 134, 142],
    'FGM': [280, 165, 225, 258],
    'FGA': [682, 378, 523, 601],
    '3PM': [126, 48, 84, 62],
    'FTM': [226, 126, 180, 202],
    'FTA': [268, 148, 224, 248],
    'MIN': [1156, 698, 1045, 1123]
})

team_stats = pd.DataFrame({
    'Team': ['Iowa', 'UConn', 'Notre Dame', 'USC'],
    'FGA': [1850, 1680, 1720, 1780],
    'FTA': [710, 620, 645, 680],
    'TOV': [450, 385, 410, 425],
    'MIN': [6600, 6400, 6500, 6550]
})

metrics = PlayerMetrics()
player_stats_advanced = metrics.calculate_comprehensive_metrics(player_stats, team_stats)

print("Advanced Player Metrics:")
print(player_stats_advanced[['Player', 'PER', 'TS%', 'USG%', 'AST/TO']].round(2))
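
As a check on the TS% and USG% formulas, the sketch below works through the first row of the example data by hand. The function names are introduced for illustration only, and the inputs are the same placeholder totals used in the example DataFrames.

```python
def true_shooting(pts, fga, fta):
    """TS% = PTS / (2 * (FGA + 0.44 * FTA))"""
    return pts / (2 * (fga + 0.44 * fta))


def usage_rate(fga, fta, tov, minutes, team_fga, team_fta, team_tov, team_minutes):
    """Share of team plays a player uses while on the floor, as a percentage."""
    player_plays = fga + 0.44 * fta + tov
    team_plays = team_fga + 0.44 * team_fta + team_tov
    return 100 * (player_plays * (team_minutes / 5)) / (minutes * team_plays)


# First player row and matching team row from the example data
ts = true_shooting(pts=812, fga=682, fta=268)
usg = usage_rate(fga=682, fta=268, tov=162, minutes=1156,
                 team_fga=1850, team_fta=710, team_tov=450, team_minutes=6600)

print(f"TS%:  {ts:.3f}")
print(f"USG%: {usg:.1f}")
```

The intermediate step is worth seeing once: the player's plays are 682 + 0.44 * 268 + 162 = 961.92, and the team's are 1850 + 0.44 * 710 + 450 = 2612.4.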

R Analysis Using wehoop

Getting Started with wehoop

The wehoop package provides easy access to women's basketball data, including college and professional leagues.

# Install wehoop (one time)
install.packages("wehoop")

# Or install development version from GitHub
# install.packages("devtools")
# devtools::install_github("sportsdataverse/wehoop")

# Load required libraries
library(wehoop)
library(dplyr)
library(ggplot2)
library(tidyr)

Loading Women's College Basketball Data

# Load team box scores for a specific season
team_box <- load_wbb_team_box(seasons = 2024)

# View structure
head(team_box)

# Load player box scores
player_box <- load_wbb_player_box(seasons = 2024)

# View top scorers
player_box %>%
  group_by(athlete_display_name, team_short_display_name) %>%
  summarise(
    games = n(),
    total_pts = sum(points, na.rm = TRUE),
    ppg = mean(points, na.rm = TRUE),
    .groups = 'drop'
  ) %>%
  arrange(desc(ppg)) %>%
  head(20)

# Load schedule and results
schedule <- load_wbb_schedule(seasons = 2024)

# View high-scoring games
schedule %>%
  filter(!is.na(home_score) & !is.na(away_score)) %>%
  mutate(total_score = home_score + away_score) %>%
  arrange(desc(total_score)) %>%
  select(game_date, home_team_name, home_score,
         away_team_name, away_score, total_score) %>%
  head(10)

Team Performance Analysis

# Calculate team efficiency metrics
team_efficiency <- team_box %>%
  group_by(team_short_display_name) %>%
  summarise(
    games = n(),
    pts = mean(team_score, na.rm = TRUE),
    opp_pts = mean(opponent_team_score, na.rm = TRUE),
    fg_pct = mean(field_goal_pct, na.rm = TRUE) * 100,
    fg3_pct = mean(three_point_field_goal_pct, na.rm = TRUE) * 100,
    ft_pct = mean(free_throw_pct, na.rm = TRUE) * 100,
    rebounds = mean(rebounds, na.rm = TRUE),
    assists = mean(assists, na.rm = TRUE),
    turnovers = mean(turnovers, na.rm = TRUE),
    steals = mean(steals, na.rm = TRUE),
    blocks = mean(blocks, na.rm = TRUE),
    .groups = 'drop'
  ) %>%
  mutate(
    margin = pts - opp_pts,
    ast_to = assists / turnovers
  ) %>%
  arrange(desc(margin))

# View top teams
print(head(team_efficiency, 15))

# Visualize offensive vs defensive efficiency
ggplot(team_efficiency, aes(x = pts, y = opp_pts)) +
  geom_point(aes(size = margin, color = margin), alpha = 0.6) +
  geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray50") +
  scale_color_gradient2(low = "red", mid = "white", high = "blue", midpoint = 0) +
  labs(
    title = "Women's College Basketball: Offensive vs Defensive Efficiency",
    subtitle = "2024 Season - Average Points Per Game",
    x = "Points Scored (Offense)",
    y = "Points Allowed (Defense)",
    size = "Point Differential",
    color = "Margin"
  ) +
  theme_minimal() +
  theme(plot.title = element_text(face = "bold", size = 14))

ggsave("wcbb_efficiency_2024.png", width = 10, height = 8, dpi = 300)
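
The chart above compares raw points per game, which rewards fast-paced teams. A tempo-free alternative divides points by an estimate of possessions. The sketch below, written in Python for consistency with the earlier examples, uses a common box-score approximation (FGA - ORB + TOV + 0.44 * FTA); the function names are assumptions for illustration, and the inputs are the placeholder South Carolina totals from the Four Factors example, with points derived as 2 * FGM + 3PM + FTM.

```python
def estimate_possessions(fga, orb, tov, fta, ft_weight=0.44):
    """Box-score possession estimate: shots not rebounded, turnovers, and FT trips."""
    return fga - orb + tov + ft_weight * fta


def points_per_100(points, possessions):
    """Offensive rating: points scored per 100 possessions."""
    return 100 * points / possessions


# Placeholder season totals from the Four Factors example table
fgm, fga, tpm, ftm, fta, orb, tov = 870, 1750, 210, 520, 680, 385, 410

points = 2 * fgm + tpm + ftm          # each made three adds one extra point
poss = estimate_possessions(fga, orb, tov, fta)
ortg = points_per_100(points, poss)

print(f"Points: {points}, possessions: {poss:.1f}, ORtg: {ortg:.1f}")
```

The same calculation with opponent totals gives a defensive rating, and the difference between the two is a pace-independent margin.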

Conference Comparison Analysis

# Get team information with conferences
team_info <- wehoop::espn_wbb_teams(year = 2024)

# Join with efficiency data
conference_analysis <- team_box %>%
  left_join(team_info %>% select(team_id, team_abbreviation, team_conference_name),
            by = c("team_id" = "team_id")) %>%
  group_by(team_conference_name) %>%
  summarise(
    teams = n_distinct(team_id),
    avg_pts = mean(team_score, na.rm = TRUE),
    avg_fg_pct = mean(field_goal_pct, na.rm = TRUE) * 100,
    avg_3pt_pct = mean(three_point_field_goal_pct, na.rm = TRUE) * 100,
    avg_assists = mean(assists, na.rm = TRUE),
    avg_turnovers = mean(turnovers, na.rm = TRUE),
    avg_rebounds = mean(rebounds, na.rm = TRUE),
    .groups = 'drop'
  ) %>%
  arrange(desc(avg_pts))

# View conference statistics
print(conference_analysis)

# Visualize conference offensive styles
conference_analysis %>%
  filter(!is.na(team_conference_name)) %>%
  ggplot(aes(x = avg_3pt_pct, y = avg_assists)) +
  geom_point(aes(size = avg_pts, color = team_conference_name), alpha = 0.7) +
  geom_text(aes(label = team_conference_name),
            size = 3, vjust = -1, check_overlap = TRUE) +
  labs(
    title = "Women's College Basketball Conference Offensive Styles",
    subtitle = "3-Point Shooting vs Ball Movement",
    x = "Three-Point Shooting % (Average)",
    y = "Assists Per Game (Average)",
    size = "Scoring"
  ) +
  theme_minimal() +
  theme(legend.position = "none")

ggsave("conference_styles_2024.png", width = 12, height = 8, dpi = 300)

Player Performance Tracking

# Track individual player performance over season
player_performance <- player_box %>%
  filter(athlete_display_name %in% c("Caitlin Clark", "Paige Bueckers",
                                      "Hannah Hidalgo", "JuJu Watkins")) %>%
  group_by(athlete_display_name) %>%
  arrange(game_date, .by_group = TRUE) %>%
  mutate(
    game_number = row_number(),
    # Trailing 5-game average (requires the zoo package); first four games are NA
    rolling_avg_pts = zoo::rollmean(points, k = 5, fill = NA, align = "right")
  ) %>%
  ungroup()

# Plot scoring trends
ggplot(player_performance, aes(x = game_number, y = points,
                                color = athlete_display_name)) +
  geom_line(alpha = 0.3) +
  geom_line(aes(y = rolling_avg_pts), linewidth = 1) +
  facet_wrap(~athlete_display_name, ncol = 2) +
  labs(
    title = "Elite Player Scoring Trends - 2024 Season",
    subtitle = "Individual games (light) and 5-game rolling average (bold)",
    x = "Game Number",
    y = "Points",
    color = "Player"
  ) +
  theme_minimal() +
  theme(legend.position = "bottom")

ggsave("player_scoring_trends_2024.png", width = 12, height = 8, dpi = 300)
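
For readers working in Python, the right-aligned rolling average used above can be sketched without any dependencies. The function name `trailing_mean` and the sample point totals are made up for illustration; the behavior mirrors `align = "right"` with `fill = NA`, so the first four entries are empty.

```python
def trailing_mean(values, window=5):
    """Mean of the current and previous (window - 1) values; None until enough games."""
    out = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            # Drop the game that just fell out of the window.
            running -= values[i - window]
        out.append(running / window if i >= window - 1 else None)
    return out


points = [18, 25, 31, 22, 29, 35, 27]  # hypothetical game-by-game scoring
rolling = trailing_mean(points)
print(rolling)
```

A trailing window only uses games already played, which is what you want when the average is meant to describe form going into the next game.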

Shot Chart Analysis

# True shot charts need shot-location data from play-by-play feeds,
# which box scores do not include and which may require additional sources

# Approximate shot selection from box score data: 2PT vs 3PT splits and free throw rate
shooting_analysis <- player_box %>%
  mutate(
    two_pt_attempts = field_goals_attempted - three_point_field_goals_attempted,
    two_pt_made = field_goals_made - three_point_field_goals_made,
    two_pt_pct = ifelse(two_pt_attempts > 0, two_pt_made / two_pt_attempts, NA)
  ) %>%
  group_by(athlete_display_name, team_short_display_name) %>%
  summarise(
    games = n(),
    total_fga = sum(field_goals_attempted, na.rm = TRUE),
    three_pt_rate = sum(three_point_field_goals_attempted, na.rm = TRUE) /
                    sum(field_goals_attempted, na.rm = TRUE),
    three_pt_pct = sum(three_point_field_goals_made, na.rm = TRUE) /
                   sum(three_point_field_goals_attempted, na.rm = TRUE),
    two_pt_pct = sum(two_pt_made, na.rm = TRUE) / sum(two_pt_attempts, na.rm = TRUE),
    ft_rate = sum(free_throws_attempted, na.rm = TRUE) /
              sum(field_goals_attempted, na.rm = TRUE),
    .groups = 'drop'
  ) %>%
  filter(games >= 20) %>%
  arrange(desc(total_fga))

# Visualize shot selection
shooting_analysis %>%
  filter(total_fga >= 300) %>%
  ggplot(aes(x = three_pt_rate, y = three_pt_pct)) +
  geom_point(aes(size = total_fga, color = two_pt_pct), alpha = 0.6) +
  geom_hline(yintercept = mean(shooting_analysis$three_pt_pct, na.rm = TRUE),
             linetype = "dashed", color = "gray50") +
  geom_vline(xintercept = mean(shooting_analysis$three_pt_rate, na.rm = TRUE),
             linetype = "dashed", color = "gray50") +
  scale_color_gradient(low = "red", high = "green") +
  labs(
    title = "Women's College Basketball Shooting Profiles",
    subtitle = "High-volume shooters (300+ FGA) - 2024 Season",
    x = "Three-Point Attempt Rate",
    y = "Three-Point Shooting %",
    size = "Total FGA",
    color = "2PT%"
  ) +
  theme_minimal()

ggsave("shooting_profiles_2024.png", width = 12, height = 8, dpi = 300)
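
The shot-selection measures in the summary above reduce to a few ratios of season totals. The sketch below shows them in Python with guards against empty denominators; the function name `shot_profile` and the input totals are hypothetical.

```python
def shot_profile(fgm, fga, tpm, tpa, fta):
    """Shot-type splits from box score totals; None where a rate is undefined."""
    two_pa = fga - tpa   # two-point attempts
    two_pm = fgm - tpm   # two-point makes
    return {
        "three_pt_rate": tpa / fga if fga else None,   # share of shots taken from three
        "three_pt_pct": tpm / tpa if tpa else None,
        "two_pt_pct": two_pm / two_pa if two_pa else None,
        "ft_rate": fta / fga if fga else None,         # free throw attempts per shot
    }


profile = shot_profile(fgm=250, fga=600, tpm=90, tpa=250, fta=180)
for name, value in profile.items():
    print(f"{name}: {value:.3f}")
```

Computing the percentages from summed makes and attempts, as the R code does, avoids the distortion that comes from averaging per-game percentages.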

Transfer Portal Analytics

Understanding the Transfer Portal

The transfer portal has become a critical aspect of roster management in women's college basketball. Players can enter the portal for various reasons, and analytics help programs identify high-value transfers:

  • Statistical Evaluation: Assess player production at previous school
  • Level Adjustment: Account for competition level differences (D1 vs D2, conference strength)
  • Fit Analysis: Match player skills with team needs and system
  • Development Potential: Identify players with upside for growth
  • Remaining Eligibility: Consider years left and immediate vs long-term impact

Transfer Portal Evaluation Framework

import pandas as pd
import numpy as np

class TransferPortalAnalyzer:
    """Analyze and evaluate transfer portal candidates"""

    def __init__(self):
        self.position_weights = {
            'Guard': {'scoring': 0.35, 'assist': 0.25, 'defense': 0.20,
                     'efficiency': 0.20},
            'Forward': {'scoring': 0.30, 'rebounding': 0.25, 'defense': 0.25,
                       'efficiency': 0.20},
            'Center': {'rebounding': 0.35, 'defense': 0.30, 'scoring': 0.20,
                      'efficiency': 0.15}
        }

    def calculate_transfer_value(self, player_stats, position):
        """
        Calculate overall transfer value score

        Parameters:
        -----------
        player_stats : dict
            Player statistics dictionary
        position : str
            Player position (Guard, Forward, Center)

        Returns:
        --------
        dict
            Transfer value metrics
        """
        weights = self.position_weights[position]

        # Normalize statistics (0-100 scale)
        scoring_score = min(player_stats['ppg'] / 25.0 * 100, 100)
        assist_score = min(player_stats['apg'] / 8.0 * 100, 100)
        rebounding_score = min(player_stats['rpg'] / 12.0 * 100, 100)
        defense_score = min((player_stats['spg'] / 3.0 * 50) +
                            (player_stats['bpg'] / 3.0 * 50), 100)
        efficiency_score = min(player_stats['ts_pct'] * 100, 100)

        # Calculate weighted score based on position
        overall_score = 0
        if 'scoring' in weights:
            overall_score += scoring_score * weights['scoring']
        if 'assist' in weights:
            overall_score += assist_score * weights['assist']
        if 'rebounding' in weights:
            overall_score += rebounding_score * weights['rebounding']
        if 'defense' in weights:
            overall_score += defense_score * weights['defense']
        if 'efficiency' in weights:
            overall_score += efficiency_score * weights['efficiency']

        return {
            'overall_score': overall_score,
            'scoring_score': scoring_score,
            'assist_score': assist_score,
            'rebounding_score': rebounding_score,
            'defense_score': defense_score,
            'efficiency_score': efficiency_score
        }

    def adjust_for_competition(self, player_stats, from_conference, to_conference):
        """
        Adjust stats based on competition level change

        Simple adjustment factors (would be more sophisticated with historical data)
        """
        conference_strength = {
            'SEC': 1.00,
            'Big Ten': 0.98,
            'ACC': 0.96,
            'Big 12': 0.97,
            'Pac-12': 0.95,
            'Big East': 0.93,
            'AAC': 0.85,
            'Other': 0.80
        }

        from_strength = conference_strength.get(from_conference, 0.80)
        to_strength = conference_strength.get(to_conference, 0.80)

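        # Moving to a stronger conference should lower projected production,
        # so scale by origin strength over destination strength
        adjustment_factor = from_strength / to_strength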

        adjusted_stats = player_stats.copy()
        adjusted_stats['ppg'] *= adjustment_factor
        adjusted_stats['rpg'] *= adjustment_factor
        adjusted_stats['apg'] *= adjustment_factor

        return adjusted_stats

    def evaluate_transfer_fit(self, player_stats, team_needs):
        """
        Evaluate how well a transfer fits team needs

        Parameters:
        -----------
        player_stats : dict
            Player statistics
        team_needs : dict
            Team needs with priorities (0-1 scale)

        Returns:
        --------
        float
            Fit score (0-100)
        """
        fit_score = 0

        if 'scoring' in team_needs:
            fit_score += (player_stats['ppg'] / 20.0) * team_needs['scoring'] * 100

        if 'playmaking' in team_needs:
            fit_score += (player_stats['apg'] / 6.0) * team_needs['playmaking'] * 100

        if 'rebounding' in team_needs:
            fit_score += (player_stats['rpg'] / 10.0) * team_needs['rebounding'] * 100

        if 'defense' in team_needs:
            defense_impact = (player_stats['spg'] + player_stats['bpg']) / 4.0
            fit_score += defense_impact * team_needs['defense'] * 100

        if 'shooting' in team_needs:
            fit_score += player_stats['three_pt_pct'] * team_needs['shooting'] * 100

        # Normalize
        total_needs = sum(team_needs.values())
        if total_needs > 0:
            fit_score = fit_score / total_needs

        return min(fit_score, 100)

# Example usage
analyzer = TransferPortalAnalyzer()

# Evaluate a transfer candidate
player = {
    'name': 'Jane Smith',
    'position': 'Guard',
    'from_school': 'Mid-Major University',
    'from_conference': 'AAC',
    'ppg': 18.5,
    'rpg': 4.2,
    'apg': 5.1,
    'spg': 2.1,
    'bpg': 0.3,
    'three_pt_pct': 0.38,
    'ts_pct': 0.58,
    'years_remaining': 2
}

# Calculate transfer value
value = analyzer.calculate_transfer_value(player, player['position'])
print(f"Transfer Value Score: {value['overall_score']:.1f}")
print(f"Scoring: {value['scoring_score']:.1f}")
print(f"Efficiency: {value['efficiency_score']:.1f}")

# Adjust for competition level
adjusted_stats = analyzer.adjust_for_competition(player, 'AAC', 'Big Ten')
print(f"\nProjected Big Ten Stats:")
print(f"PPG: {adjusted_stats['ppg']:.1f}")
print(f"APG: {adjusted_stats['apg']:.1f}")

# Evaluate fit with team needs
team_needs = {
    'scoring': 0.8,
    'playmaking': 0.6,
    'shooting': 0.9,
    'defense': 0.5
}

fit_score = analyzer.evaluate_transfer_fit(player, team_needs)
print(f"\nTeam Fit Score: {fit_score:.1f}/100")
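
It can help to see how the position weights combine into the Transfer Value Score printed above. The following sketch reproduces that arithmetic for the example guard; the helper name `weighted_score` is an assumption for illustration, while the weights and normalization ceilings are the ones defined in `TransferPortalAnalyzer`.

```python
def weighted_score(component_scores, weights):
    """Sum the 0-100 component scores over the components this position weights."""
    return sum(component_scores[name] * weight for name, weight in weights.items())


guard_weights = {"scoring": 0.35, "assist": 0.25, "defense": 0.20, "efficiency": 0.20}

# Component scores for the example guard (18.5 ppg, 5.1 apg, 2.1 spg, 0.3 bpg, 58% TS)
components = {
    "scoring": min(18.5 / 25.0 * 100, 100),
    "assist": min(5.1 / 8.0 * 100, 100),
    "rebounding": min(4.2 / 12.0 * 100, 100),   # computed but unused for guards
    "defense": min(2.1 / 3.0 * 50 + 0.3 / 3.0 * 50, 100),
    "efficiency": min(0.58 * 100, 100),
}

overall = weighted_score(components, guard_weights)
print(f"Transfer Value Score: {overall:.1f}")
```

Note that rebounding contributes nothing to a guard's score under these weights, so two guards with very different rebounding numbers can receive the same overall value.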

Transfer Success Prediction

Key factors that predict successful transfers in women's college basketball:

High Success Indicators:

  • Consistent production (low variance in performance)
  • High efficiency metrics (TS%, eFG%, low TOV rate)
  • Good character/coachability feedback
  • Transferring to similar or lower competition level
  • Positional need match with destination team
  • Multiple years of eligibility remaining
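
The first indicator, consistent production, can be quantified with the coefficient of variation (standard deviation divided by mean) of a player's game-by-game output. This is one possible measure, offered as a sketch rather than an established standard; the function name and both scoring logs are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(game_values):
    """Sample standard deviation relative to the mean; lower means steadier output."""
    if len(game_values) < 2:
        return None  # a spread cannot be estimated from a single game
    avg = mean(game_values)
    if avg == 0:
        return None  # ratio is undefined when the average is zero
    return stdev(game_values) / avg


# Two hypothetical scorers with the same 18.0 points per game
steady = [17, 19, 18, 20, 16, 18]
streaky = [6, 31, 12, 28, 9, 22]

cv_steady = coefficient_of_variation(steady)
cv_streaky = coefficient_of_variation(streaky)
print(f"Steady:  {cv_steady:.3f}")
print(f"Streaky: {cv_streaky:.3f}")
```

Both players average the same scoring, but the second player's output swings far more from game to game, which per-game averages alone would hide.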

WNBA Draft Projection Analytics

Draft Evaluation Framework

Projecting WNBA draft prospects from college requires evaluating both current production and professional potential:

import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler

class WNBADraftProjection:
    """Project WNBA draft prospects from college statistics"""

    def __init__(self):
        # Historical weights based on WNBA success
        self.evaluation_weights = {
            'production': 0.30,      # College stats
            'efficiency': 0.25,      # Advanced metrics
            'physical_tools': 0.20,  # Size, athleticism
            'skill_level': 0.15,     # Technical skills
            'intangibles': 0.10      # Leadership, character
        }

        # Position-specific requirements
        self.position_requirements = {
            'Point Guard': {
                'min_height': 68,  # 5'8"
                'key_stats': ['apg', 'ast_to_ratio', 'three_pt_pct'],
                'weights': {'playmaking': 0.40, 'shooting': 0.30, 'defense': 0.30}
            },
            'Shooting Guard': {
                'min_height': 70,  # 5'10"
                'key_stats': ['ppg', 'three_pt_pct', 'ts_pct'],
                'weights': {'scoring': 0.45, 'shooting': 0.35, 'defense': 0.20}
            },
            'Wing/Forward': {
                'min_height': 72,  # 6'0"
                'key_stats': ['ppg', 'rpg', 'three_pt_pct'],
                'weights': {'versatility': 0.40, 'shooting': 0.30, 'defense': 0.30}
            },
            'Post': {
                'min_height': 75,  # 6'3"
                'key_stats': ['rpg', 'bpg', 'fg_pct'],
                'weights': {'rebounding': 0.35, 'defense': 0.35, 'scoring': 0.30}
            }
        }

    def calculate_production_score(self, player_stats):
        """Calculate college production score"""
        # Normalize per-game stats
        scoring = min(player_stats['ppg'] / 25.0, 1.0) * 100
        rebounding = min(player_stats['rpg'] / 12.0, 1.0) * 100
        playmaking = min(player_stats['apg'] / 8.0, 1.0) * 100

        # Weighted average
        production = (scoring * 0.5 + rebounding * 0.25 + playmaking * 0.25)
        return production

    def calculate_efficiency_score(self, player_stats):
        """Calculate efficiency metrics"""
        ts_score = player_stats['ts_pct'] * 100
        usage_adjusted = player_stats['per'] * (player_stats['usg_pct'] / 25.0)

        efficiency = (ts_score * 0.4 + usage_adjusted * 0.6)
        return min(efficiency, 100)

    def calculate_physical_score(self, player_profile, position):
        """Evaluate physical tools for WNBA"""
        min_height = self.position_requirements[position]['min_height']
        height = player_profile['height']

        if height >= min_height + 3:
            height_score = 100
        elif height >= min_height:
            height_score = 70 + ((height - min_height) / 3.0) * 30
        else:
            # Clamp at 0 so a very undersized player cannot score negative
            height_score = max(40 + ((height - (min_height - 4)) / 4.0) * 30, 0)

        # Athleticism indicators (from stats). The example profiles keep steals
        # and blocks in the nested 'stats' dict, so look there first.
        stats = player_profile.get('stats', player_profile)
        athleticism = min(stats.get('spg', 0) * 20 + stats.get('bpg', 0) * 20, 100)

        physical = (height_score * 0.6 + athleticism * 0.4)
        return physical

    def calculate_skill_score(self, player_stats, position):
        """Evaluate skill level and versatility"""
        key_stats = self.position_requirements[position]['key_stats']

        skill_scores = []

        if 'three_pt_pct' in key_stats:
            shooting = min(player_stats.get('three_pt_pct', 0) * 100, 100)
            skill_scores.append(shooting)

        if 'apg' in key_stats:
            playmaking = min(player_stats.get('apg', 0) / 6.0 * 100, 100)
            skill_scores.append(playmaking)

        if 'ast_to_ratio' in key_stats:
            decision = min(player_stats.get('ast_to_ratio', 0) / 3.0 * 100, 100)
            skill_scores.append(decision)

        if 'fg_pct' in key_stats:
            finishing = player_stats.get('fg_pct', 0) * 100
            skill_scores.append(finishing)

        return np.mean(skill_scores) if skill_scores else 50

    def project_draft_position(self, player_stats, player_profile, position):
        """
        Project overall draft grade and likely position

        Returns:
        --------
        dict with draft projection information
        """
        # Calculate component scores
        production = self.calculate_production_score(player_stats)
        efficiency = self.calculate_efficiency_score(player_stats)
        physical = self.calculate_physical_score(player_profile, position)
        skills = self.calculate_skill_score(player_stats, position)
        intangibles = player_profile.get('intangibles_score', 70)  # Default if not provided

        # Overall grade
        overall_grade = (
            production * self.evaluation_weights['production'] +
            efficiency * self.evaluation_weights['efficiency'] +
            physical * self.evaluation_weights['physical_tools'] +
            skills * self.evaluation_weights['skill_level'] +
            intangibles * self.evaluation_weights['intangibles']
        )

        # Draft position projection (assumes a 36-pick draft: 3 rounds of 12)
        if overall_grade >= 85:
            draft_range = "Top 5 Pick (All-WNBA Potential)"
        elif overall_grade >= 75:
            draft_range = "1st Round (6-12)"
        elif overall_grade >= 65:
            draft_range = "Early-to-Mid 2nd Round (13-20)"
        elif overall_grade >= 55:
            draft_range = "Late 2nd / 3rd Round (21-36)"
        else:
            draft_range = "Undrafted / Training Camp Invite"

        return {
            'overall_grade': overall_grade,
            'draft_range': draft_range,
            'production_score': production,
            'efficiency_score': efficiency,
            'physical_score': physical,
            'skill_score': skills,
            'intangibles_score': intangibles,
            'strengths': self._identify_strengths(player_stats, player_profile),
            'development_areas': self._identify_weaknesses(player_stats, player_profile)
        }

    def _identify_strengths(self, player_stats, player_profile):
        """Identify player strengths"""
        strengths = []

        if player_stats.get('ppg', 0) >= 18:
            strengths.append("Elite scorer")
        if player_stats.get('three_pt_pct', 0) >= 0.38:
            strengths.append("Consistent three-point shooter")
        if player_stats.get('apg', 0) >= 5:
            strengths.append("Strong playmaker")
        if player_stats.get('rpg', 0) >= 8:
            strengths.append("Strong rebounder")
        if player_stats.get('spg', 0) >= 2:
            strengths.append("Disruptive defender")
        if player_stats.get('ts_pct', 0) >= 0.60:
            strengths.append("Highly efficient")

        return strengths if strengths else ["Solid all-around player"]

    def _identify_weaknesses(self, player_stats, player_profile):
        """Identify areas for development"""
        weaknesses = []

        # Only flag a stat that is actually present; a missing value should
        # not be treated as 0 and reported as a weakness.
        def below(stat, threshold):
            value = player_stats.get(stat)
            return value is not None and value < threshold

        if below('three_pt_pct', 0.30):
            weaknesses.append("Three-point shooting consistency")
        if below('ast_to_ratio', 1.5):
            weaknesses.append("Decision-making / turnover issues")
        if below('ft_pct', 0.70):
            weaknesses.append("Free throw shooting")
        if player_profile.get('height', 75) < 70 and player_stats.get('spg', 0) < 1.5:
            weaknesses.append("Defensive impact")

        return weaknesses if weaknesses else ["Continue developing all-around game"]

# Example: Evaluate draft prospects
draft_eval = WNBADraftProjection()

# Top prospect profile
prospect = {
    'name': 'Paige Bueckers',
    'position': 'Shooting Guard',
    'height': 71,  # 5'11"
    'stats': {
        'ppg': 21.2,
        'rpg': 5.2,
        'apg': 3.8,
        'spg': 2.3,
        'bpg': 1.4,
        'three_pt_pct': 0.412,
        'fg_pct': 0.532,
        'ft_pct': 0.827,
        'ts_pct': 0.639,
        'per': 28.5,
        'usg_pct': 26.3,
        'ast_to_ratio': 2.1
    },
    'intangibles_score': 90  # Leadership, winning experience
}

projection = draft_eval.project_draft_position(
    prospect['stats'],
    prospect,
    prospect['position']
)

print(f"WNBA Draft Projection: {prospect['name']}")
print(f"Overall Grade: {projection['overall_grade']:.1f}/100")
print(f"Projected Range: {projection['draft_range']}")
print(f"\nComponent Scores:")
print(f"  Production: {projection['production_score']:.1f}")
print(f"  Efficiency: {projection['efficiency_score']:.1f}")
print(f"  Physical Tools: {projection['physical_score']:.1f}")
print(f"  Skill Level: {projection['skill_score']:.1f}")
print(f"  Intangibles: {projection['intangibles_score']:.1f}")
print(f"\nStrengths: {', '.join(projection['strengths'])}")
print(f"Development Areas: {', '.join(projection['development_areas'])}")

Draft Prospect Comparison

Compare multiple prospects to create draft boards:

import matplotlib.pyplot as plt
import pandas as pd

def create_prospect_comparison(prospects_list, draft_evaluator):
    """Create visual comparison of draft prospects"""

    evaluations = []

    for prospect in prospects_list:
        eval_result = draft_evaluator.project_draft_position(
            prospect['stats'],
            prospect,
            prospect['position']
        )
        eval_result['name'] = prospect['name']
        eval_result['position'] = prospect['position']
        evaluations.append(eval_result)

    # Create DataFrame
    df = pd.DataFrame(evaluations)

    # Sort by overall grade
    df = df.sort_values('overall_grade', ascending=False)

    # Create visualization
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 8))

    # Overall grades
    ax1.barh(df['name'], df['overall_grade'], color='steelblue')
    ax1.set_xlabel('Overall Grade')
    ax1.set_title('WNBA Draft Prospect Rankings', fontsize=14, fontweight='bold')
    ax1.axvline(x=85, color='gold', linestyle='--', label='Top 5 Threshold')
    ax1.axvline(x=75, color='silver', linestyle='--', label='1st Round Threshold')
    ax1.legend()

    # Component scores (line plot, one line per prospect)
    categories = ['production_score', 'efficiency_score', 'physical_score',
                  'skill_score', 'intangibles_score']

    for idx, row in df.head(5).iterrows():
        values = [row[cat] for cat in categories]
        ax2.plot(categories, values, marker='o', label=row['name'])

    ax2.set_title('Top 5 Prospects - Component Breakdown',
                  fontsize=14, fontweight='bold')
    ax2.set_ylim(0, 100)
    ax2.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
    ax2.grid(True, alpha=0.3)

    # Rotate the category labels on the component plot explicitly
    plt.setp(ax2.get_xticklabels(), rotation=45, ha='right')
    plt.tight_layout()

    return fig, df

# Example usage with multiple prospects
prospects = [
    {
        'name': 'Paige Bueckers',
        'position': 'Shooting Guard',
        'height': 71,
        'stats': {'ppg': 21.2, 'rpg': 5.2, 'apg': 3.8, 'spg': 2.3,
                  'three_pt_pct': 0.412, 'ts_pct': 0.639, 'per': 28.5,
                  'usg_pct': 26.3, 'ast_to_ratio': 2.1},
        'intangibles_score': 90
    },
    {
        'name': 'Hannah Hidalgo',
        'position': 'Point Guard',
        'height': 68,
        'stats': {'ppg': 22.6, 'rpg': 6.2, 'apg': 5.5, 'spg': 4.6,
                  'three_pt_pct': 0.368, 'ts_pct': 0.571, 'per': 27.8,
                  'usg_pct': 28.1, 'ast_to_ratio': 1.9},
        'intangibles_score': 85
    },
    {
        'name': 'Kiki Iriafen',
        'position': 'Wing/Forward',
        'height': 75,
        'stats': {'ppg': 19.4, 'rpg': 11.1, 'apg': 2.8, 'spg': 1.5,
                  'three_pt_pct': 0.429, 'ts_pct': 0.618, 'per': 26.2,
                  'usg_pct': 24.5, 'ast_to_ratio': 1.4},
        'intangibles_score': 80
    }
]

fig, rankings = create_prospect_comparison(prospects, draft_eval)
plt.savefig('wnba_draft_prospects_2025.png', dpi=300, bbox_inches='tight')

print("\nWNBA Draft Big Board:")
print(rankings[['name', 'position', 'overall_grade', 'draft_range']].to_string(index=False))

Practical Applications for College Programs

1. Opponent Scouting System

Build an analytics-driven scouting report for upcoming opponents:

  • Offensive Tendencies: Shot selection, pace, preferred actions (PnR, post-ups, transition)
  • Key Players: Usage rates, efficiency, hot zones
  • Defensive Schemes: Primary coverage (man, zone), pressure, transition defense
  • Situational Analysis: Clutch performance, timeout effectiveness, substitution patterns
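As one way to start on the offensive-tendencies item, the sketch below summarizes an opponent's shot selection by zone. The shot log, its column names (`zone`, `made`), and the three zone labels are all hypothetical; real charting data from a video or tracking provider will need its own mapping.

```python
import pandas as pd

# Hypothetical shot log for one opponent: one row per field goal attempt.
shots = pd.DataFrame({
    "player": ["A", "A", "B", "B", "B", "C", "C", "A"],
    "zone":   ["rim", "three", "three", "mid", "rim", "three", "rim", "three"],
    "made":   [1, 0, 1, 0, 1, 1, 0, 1],
})

# Point value of a made shot in each (assumed) zone.
ZONE_POINTS = {"rim": 2, "mid": 2, "three": 3}
shots["points"] = shots["made"] * shots["zone"].map(ZONE_POINTS)


def shot_profile(shot_log):
    """Attempt share, FG% and points per shot for each zone."""
    grouped = shot_log.groupby("zone").agg(
        attempts=("made", "size"),
        fg_pct=("made", "mean"),
        pts_per_shot=("points", "mean"),
    )
    grouped["attempt_share"] = grouped["attempts"] / grouped["attempts"].sum()
    return grouped.sort_values("attempt_share", ascending=False)


profile = shot_profile(shots)
print(profile.round(3))
```

Grouping by `["player", "zone"]` instead of `"zone"` gives a rough per-player hot-zone table, though small samples should be read cautiously.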

2. Player Development Tracking

Monitor individual player improvement throughout the season:

  • Track rolling averages for key metrics (shooting %, assist/TO ratio, rebounding rate)
  • Compare early-season vs. late-season performance
  • Identify skill development areas showing improvement
  • Set measurable goals and track progress
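A minimal sketch of the rolling-average idea, assuming a per-game log with hypothetical columns `fgm`, `fga`, `ast`, and `tov`. Makes and attempts are summed over the window before dividing, so a low-volume game does not distort the rolling percentage.

```python
import pandas as pd

# Hypothetical game log for one player, in chronological order.
games = pd.DataFrame({
    "game": range(1, 11),
    "fgm": [4, 3, 5, 2, 6, 5, 7, 6, 8, 7],
    "fga": [12, 10, 13, 9, 12, 11, 13, 12, 14, 13],
    "ast": [2, 3, 1, 4, 3, 5, 4, 6, 5, 6],
    "tov": [4, 3, 3, 2, 3, 2, 2, 2, 1, 2],
})


def add_rolling_metrics(log, window=5):
    """Add rolling FG% and assist/turnover ratio over the last `window` games."""
    out = log.copy()
    totals = out[["fgm", "fga", "ast", "tov"]].rolling(window, min_periods=window).sum()
    out["roll_fg_pct"] = totals["fgm"] / totals["fga"]
    out["roll_ast_to"] = totals["ast"] / totals["tov"]
    return out


tracked = add_rolling_metrics(games)

# Early-season vs. late-season split (first half vs. second half of the log).
half = len(games) // 2
early, late = games.iloc[:half], games.iloc[half:]
early_fg = early["fgm"].sum() / early["fga"].sum()
late_fg = late["fgm"].sum() / late["fga"].sum()

print(tracked[["game", "roll_fg_pct", "roll_ast_to"]].round(3))
print(f"Early FG%: {early_fg:.3f}  Late FG%: {late_fg:.3f}")
```

The window length is a judgment call: shorter windows react faster but are noisier.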

3. Lineup Optimization

Use analytics to determine the most effective lineup combinations:

  • Calculate net rating for different 5-player combinations
  • Analyze spacing (3PT shooting distribution)
  • Evaluate defensive versatility and switchability
  • Balance ball-handling and playmaking responsibilities
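The net-rating calculation can be sketched as follows, assuming stint-level data in which each row records one continuous stretch for a five-player unit. The column names and the `min_possessions` cutoff are hypothetical, and the sketch assumes offensive and defensive possessions are roughly equal within a stint.

```python
import pandas as pd

# Hypothetical stints: each row is one uninterrupted stretch for a lineup.
stints = pd.DataFrame({
    "players": [
        ["A", "B", "C", "D", "E"],
        ["A", "B", "C", "D", "F"],
        ["E", "D", "C", "B", "A"],   # same five players, different order
        ["A", "B", "C", "G", "F"],
    ],
    "possessions": [30, 22, 18, 10],
    "points_for": [36, 20, 20, 12],
    "points_against": [27, 24, 16, 9],
})


def lineup_net_ratings(stint_df, min_possessions=20):
    """Aggregate stints by lineup and compute ratings per 100 possessions."""
    df = stint_df.copy()
    # Sort names so the same five players always map to the same key.
    df["lineup"] = df["players"].apply(lambda p: "-".join(sorted(p)))
    totals = df.groupby("lineup", as_index=False)[
        ["possessions", "points_for", "points_against"]
    ].sum()
    totals["off_rating"] = 100 * totals["points_for"] / totals["possessions"]
    totals["def_rating"] = 100 * totals["points_against"] / totals["possessions"]
    totals["net_rating"] = totals["off_rating"] - totals["def_rating"]
    # Drop lineups with too few possessions to say anything useful.
    kept = totals[totals["possessions"] >= min_possessions]
    return kept.sort_values("net_rating", ascending=False).reset_index(drop=True)


ratings = lineup_net_ratings(stints)
print(ratings.round(1))
```

College lineups rarely accumulate large samples, so even a filtered table like this is better treated as a prompt for film review than as a verdict.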

4. Recruiting Analytics

Take a data-driven approach to identifying and evaluating recruits:

  • Evaluate high school statistics with competition-level adjustments
  • Compare recruits to successful players in your system
  • Project college impact based on skill profile
  • Identify undervalued prospects (high efficiency, low visibility)
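One simple way to compare a recruit with players who have succeeded in your system is a standardized distance over a few statistics. The sketch below is an illustration only: the reference players, their numbers, and the choice of four features are hypothetical, and it assumes the recruit's statistics have already been adjusted for competition level.

```python
import numpy as np

# Hypothetical profiles: [ppg, rpg, apg, three_pt_pct]
program_players = {
    "Former Guard": [16.0, 3.5, 5.5, 0.38],
    "Former Wing":  [14.0, 6.0, 2.5, 0.36],
    "Former Post":  [12.0, 9.5, 1.5, 0.20],
}
recruit = [15.5, 4.0, 5.0, 0.37]


def most_similar(recruit_stats, references):
    """Rank reference players by Euclidean distance in z-score space."""
    names = list(references)
    matrix = np.array([references[n] for n in names], dtype=float)
    mean = matrix.mean(axis=0)
    std = matrix.std(axis=0)
    std[std == 0] = 1.0  # avoid dividing by zero for constant features
    z_refs = (matrix - mean) / std
    z_recruit = (np.asarray(recruit_stats, dtype=float) - mean) / std
    distances = np.linalg.norm(z_refs - z_recruit, axis=1)
    order = np.argsort(distances)
    return [(names[i], float(distances[i])) for i in order]


ranked = most_similar(recruit, program_players)
for name, dist in ranked:
    print(f"{name}: distance {dist:.2f}")
```

Standardizing first keeps points per game from dominating the distance simply because it is measured on a larger scale than shooting percentage.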

5. Game Strategy Analytics

Use data to inform pregame and in-game strategic decisions:

Key Strategic Questions:

  • What pace should we play to maximize our win probability?
  • Which defensive scheme is most effective against this opponent's personnel?
  • When should we use full-court pressure?
  • Which players should take end-of-game shots?
  • What are optimal substitution patterns to manage fatigue?
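Several of these questions reduce to comparing outcomes per possession under different choices. As a sketch for the full-court pressure question, the example below compares opponent points per possession and turnover rate with and without the press. The possession log and its labels (`press`, `base`) are hypothetical.

```python
import pandas as pd

# Hypothetical defensive possession log: one row per opponent possession.
possessions = pd.DataFrame({
    "defense": ["press"] * 4 + ["base"] * 6,
    "opp_points": [0, 2, 0, 3, 2, 0, 2, 3, 0, 2],
    "turnover":   [1, 0, 1, 0, 0, 1, 0, 0, 0, 0],
})

summary = possessions.groupby("defense").agg(
    n_possessions=("opp_points", "size"),
    opp_ppp=("opp_points", "mean"),     # opponent points per possession
    tov_rate=("turnover", "mean"),      # share of possessions ending in a turnover
)
print(summary.round(3))
```

With this few possessions the difference means little; in practice the comparison needs a full season of possessions and some control for opponent quality and game situation.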

6. Season Planning and Load Management

Use analytics to optimize practice intensity and player workload:

  • Monitor minutes played and fatigue indicators
  • Schedule rest for key players in non-conference games
  • Track injury risk factors (minutes, previous injuries)
  • Balance development of bench players with winning games
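Minutes monitoring can be sketched as a comparison of recent workload against a player's own baseline. In the example below, the three-game window and the 1.15 ratio threshold are hypothetical placeholders, not validated injury-risk cutoffs.

```python
import pandas as pd


def workload_flags(minutes_played, window=3, ratio_threshold=1.15):
    """Flag games where recent minutes run well above the prior baseline."""
    minutes = pd.Series(minutes_played, dtype=float)
    # Average over the last `window` games, including the current one.
    recent_avg = minutes.rolling(window).mean()
    # Average over all earlier games (shifted so the current game is excluded).
    baseline_avg = minutes.expanding().mean().shift(1)
    ratio = recent_avg / baseline_avg
    return pd.DataFrame({
        "minutes": minutes,
        "recent_avg": recent_avg,
        "baseline_avg": baseline_avg,
        "ratio": ratio,
        "flag": ratio > ratio_threshold,
    })


# Hypothetical minutes for one player over eight games.
game_minutes = [28, 30, 27, 31, 29, 36, 38, 37]
report = workload_flags(game_minutes)
print(report.round(2))
```

A flag here is only a cue to check in with the player and the medical staff; minutes alone say nothing about practice load or travel.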

Resources for Further Learning

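Data Sources

  • NCAA.com/stats: official team and player statistics
  • Her Hoop Stats: public advanced statistics and rankings
  • wehoop (SportsDataverse): R package for women's basketball data
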
Analytics Communities

  • Women's Basketball Coaches Association (WBCA) analytics workshops
  • Twitter/X Women's Basketball Analytics community (#WBBAnalytics)
  • SportsDataverse Discord (wehoop support and community)

Recommended Reading

  • Basketball Analytics: Spatial Tracking by Stephen Shea
  • Sprawlball by Kirk Goldsberry (principles applicable to women's game)
  • Her Hoop Stats blog articles on women's basketball analytics

Getting Started

Start with publicly available data from NCAA.com and Her Hoop Stats. Use the code examples in this guide to build simple analyses. As you become more comfortable, expand to more sophisticated models and custom data collection. The women's college basketball analytics community is growing rapidly and welcomes new contributors.
