G League and Development Analytics

1. G League Structure and Purpose

The NBA G League (formerly the NBA Development League, or D-League) is the NBA's official minor league, providing a developmental platform for players, coaches, and officials. Understanding its structure is a prerequisite for translating G League performance into NBA expectations.

League Organization

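Teams: 30+ teams, nearly all with a direct NBA affiliation
Season Structure: Roughly 50 games per team (November-March); recent seasons split these between an early-season tournament and the regular season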
Roster Composition: NBA assignment players, two-way contracts, returning rights players, and G League contracts
Playing Rules: Nearly identical to NBA with experimental rule testing
Schedule: More compressed than NBA (back-to-backs common)

Development Pathways

Player Categories:

NBA Assignment: Contracted NBA players sent down for development
Two-Way Contracts: Split time between NBA and G League
Affiliate Players: Under contract with G League team
Draft Rights: Players selected in G League draft
Returning Rights: Previously played for NBA affiliate

2. Translation Factors to NBA

Performance translation from G League to NBA requires understanding multiple contextual factors that affect statistical reliability and predictive value.

Key Translation Metrics

Factor	G League Impact	NBA Translation	Adjustment Factor
Competition Level	Lower defensive intensity	Higher turnover rate	0.75-0.85x
Pace	Similar to NBA	Per-possession metrics stable	0.95-1.0x
Three-Point Shooting	More open looks	Contested shots increase	0.85-0.90x
Free Throw Rate	Higher FT attempts	Tighter officiating	0.80-0.85x
Rebounding	Less athletic competition	More contested boards	0.70-0.80x
Assists	Weaker off-ball movement	Better spacing required	0.75-0.85x
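
As a rough illustration of how the ranges in the table might be used, the sketch below scales a G League stat line by the low and high end of each range to produce a projection band. The mapping of table rows to individual stats, the `project_range` helper, and the sample stat line are assumptions made for this example; the multipliers are copied from the table, not fitted to data.

```python
# Adjustment ranges copied from the table above as (low, high).
# Mapping each table row to a single box-score stat is an assumption.
ADJUSTMENT_RANGES = {
    "3P%": (0.85, 0.90),   # Three-Point Shooting row
    "FTA": (0.80, 0.85),   # Free Throw Rate row
    "REB": (0.70, 0.80),   # Rebounding row
    "AST": (0.75, 0.85),   # Assists row
}


def project_range(gleague_line):
    """Return {stat: (low, high)} NBA projection bands for known stats."""
    bands = {}
    for stat, value in gleague_line.items():
        if stat in ADJUSTMENT_RANGES:
            low, high = ADJUSTMENT_RANGES[stat]
            bands[stat] = (round(value * low, 3), round(value * high, 3))
    return bands


# Hypothetical per-game G League line, for illustration only.
example_line = {"3P%": 0.400, "FTA": 5.0, "REB": 8.0, "AST": 6.0}
example_bands = project_range(example_line)
for stat, (low, high) in example_bands.items():
    print(f"{stat}: {low} to {high}")
```

Reporting a band rather than a single number keeps the uncertainty in the factors visible; stats without a table entry are simply left out of the result.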

Age-Based Translation Curves

Player age significantly impacts translation probability:

Under 21: High upside, raw stats less predictive (focus on efficiency and skill indicators)
21-23: Peak translation window, stats more reliable with proper adjustments
24-26: Moderate translation probability, established skill sets translate better
27+: Low translation probability, likely career G League players unless specialized skill
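
The age brackets above can be written as a small lookup. This is a minimal sketch: the function name `age_translation_tier` is hypothetical, and truncating fractional ages to whole years is an assumption, since the brackets are stated in whole years.

```python
def age_translation_tier(age):
    """Map an age to the translation tier listed above (whole years)."""
    years = int(age)  # assumption: fractional ages are truncated
    if years < 21:
        return "High upside, raw stats less predictive"
    if years <= 23:
        return "Peak translation window"
    if years <= 26:
        return "Moderate translation probability"
    return "Low translation probability"


for sample_age in (19, 22, 25, 28):
    print(sample_age, "->", age_translation_tier(sample_age))
```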

Position-Specific Considerations

Translation Difficulty by Position:

Guards (PG/SG):
  - Shooting and playmaking translate best
  - Ball-handling under pressure less predictive
  - Defense often struggles with NBA athleticism

Wings (SF):
  - 3-and-D skills most transferable
  - Versatility premium at NBA level
  - Shot creation ability crucial

Bigs (PF/C):
  - Rim protection translates well
  - Shooting range increasingly important
  - Post skills less valuable in modern NBA
  - Switch ability critical for playing time

3. Python Code for G League Analysis

The Python class below sketches a framework for analyzing G League performance and projecting NBA translation.


import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestRegressor
import matplotlib.pyplot as plt
import seaborn as sns

class GLeagueAnalyzer:
    """
    Analyze G League performance and predict NBA translation potential.
    """

    def __init__(self):
        self.scaler = StandardScaler()
        self.model = None

    def calculate_advanced_stats(self, df):
        """
        Calculate advanced metrics from basic G League stats.

        Parameters:
        -----------
        df : DataFrame
            G League player statistics

        Returns:
        --------
        DataFrame with advanced metrics
        """
        # True Shooting Percentage
        df['TS%'] = df['PTS'] / (2 * (df['FGA'] + 0.44 * df['FTA']))

        # Effective Field Goal Percentage
        df['eFG%'] = (df['FGM'] + 0.5 * df['3PM']) / df['FGA']

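        # Usage Rate (estimate). The denominator must use TEAM totals; using
        # the player's own totals cancels out and leaves only a minutes ratio.
        # TEAM_FGA, TEAM_FTA and TEAM_TOV are assumed optional input columns.
        player_poss = df['FGA'] + 0.44 * df['FTA'] + df['TOV']
        team_cols = ['TEAM_FGA', 'TEAM_FTA', 'TEAM_TOV']
        if all(col in df.columns for col in team_cols):
            team_poss = df['TEAM_FGA'] + 0.44 * df['TEAM_FTA'] + df['TEAM_TOV']
            df['USG%'] = 100 * (player_poss * (df['TEAM_MIN'] / 5)) / (df['MIN'] * team_poss)
        else:
            df['USG%'] = np.nan  # cannot be estimated without team totals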

        # Assist-to-Turnover Ratio
        df['AST_TO_RATIO'] = df['AST'] / df['TOV'].replace(0, 1)

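        # Per 36 minute stats
        df['PTS_PER36'] = (df['PTS'] / df['MIN']) * 36
        df['REB_PER36'] = (df['REB'] / df['MIN']) * 36
        df['AST_PER36'] = (df['AST'] / df['MIN']) * 36
        df['STL_PER36'] = (df['STL'] / df['MIN']) * 36
        df['BLK_PER36'] = (df['BLK'] / df['MIN']) * 36
        # Three-point volume per 36, required by the skill-profile and
        # scouting-report methods further down
        df['3PM_PER36'] = (df['3PM'] / df['MIN']) * 36
        df['3PA_PER36'] = (df['3PA'] / df['MIN']) * 36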

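        # Per-minute box-score efficiency, used as a rough production proxy.
        # Despite the column name, this is not true Box Plus/Minus.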
        df['BPM_EST'] = (df['PTS'] + df['REB'] + df['AST'] + df['STL'] +
                         df['BLK'] - df['TOV'] - (df['FGA'] - df['FGM']) -
                         (df['FTA'] - df['FTM'])) / df['MIN']

        return df

    def apply_translation_factors(self, df, age_adjusted=True):
        """
        Apply translation factors to G League stats for NBA projection.

        Parameters:
        -----------
        df : DataFrame
            G League statistics
        age_adjusted : bool
            Apply age-based adjustments

        Returns:
        --------
        DataFrame with NBA-translated projections
        """
        translation_factors = {
            'PTS': 0.80,
            'REB': 0.75,
            'AST': 0.80,
            'STL': 0.85,
            'BLK': 0.85,
            '3P%': 0.88,
            'FG%': 0.90,
            'FT%': 0.95
        }

        # Apply base translation factors
        for stat, factor in translation_factors.items():
            if stat in df.columns:
                df[f'NBA_PROJ_{stat}'] = df[stat] * factor

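        # Age-based adjustments (counting stats only; percentage projections
        # are left alone so a young player's FT% or 3P% is not inflated)
        if age_adjusted and 'AGE' in df.columns:
            age_multiplier = np.where(df['AGE'] < 23, 1.1,
                            np.where(df['AGE'] < 25, 1.0,
                            np.where(df['AGE'] < 27, 0.9, 0.7)))

            for col in df.columns:
                if col.startswith('NBA_PROJ_') and not col.endswith('%'):
                    df[col] = df[col] * age_multiplier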

        return df

    def calculate_translation_score(self, df):
        """
        Calculate comprehensive translation score (0-100).

        Parameters:
        -----------
        df : DataFrame
            Player statistics with advanced metrics

        Returns:
        --------
        Series with translation scores
        """
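        # Component weights. 'athleticism' has no box-score input here, so it
        # is skipped and the remaining weights are renormalized to sum to 1;
        # without that step the score could never exceed 75.
        weights = {
            'efficiency': 0.25,
            'production': 0.20,
            'age': 0.15,
            'athleticism': 0.15,
            'shooting': 0.15,
            'defense': 0.10
        }

        # Efficiency score (0-100)
        efficiency = ((df['TS%'] - 0.45) / 0.25 * 100).clip(0, 100)

        # Production score (0-100). BPM_EST is a per-minute value, so this
        # scaling yields a narrow band; recalibrate it against real data.
        production = ((df['BPM_EST'] + 5) / 10 * 100).clip(0, 100)

        # Age score (younger = better)
        age_score = np.where(df['AGE'] < 23, 100,
                    np.where(df['AGE'] < 25, 80,
                    np.where(df['AGE'] < 27, 60, 30)))

        # Shooting score (three-point shooting critical)
        shooting = (df['3P%'] / 0.45 * 100).clip(0, 100)

        # Defense score (steals + blocks per 36, same 3.0 cap as the radar chart)
        defense = ((df['STL_PER36'] + df['BLK_PER36']) / 3 * 100).clip(0, 100)

        # Combine the available components and renormalize their weights
        components = {
            'efficiency': efficiency,
            'production': production,
            'age': age_score,
            'shooting': shooting,
            'defense': defense
        }
        total_weight = sum(weights[name] for name in components)
        translation_score = sum(
            weights[name] * value for name, value in components.items()
        ) / total_weight

        return translation_score.clip(0, 100)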

    def identify_skill_profile(self, row):
        """
        Classify player into skill profile categories.

        Parameters:
        -----------
        row : Series
            Player statistics

        Returns:
        --------
        String classification
        """
        if row['3PM_PER36'] > 2.5 and row['3P%'] > 0.38:
            if row['PTS_PER36'] < 15:
                return "3-and-D Specialist"
            else:
                return "Scoring Wing"

        elif row['AST_PER36'] > 6 and row['AST_TO_RATIO'] > 2.0:
            return "Floor General"

        elif row['REB_PER36'] > 10 and row['BLK_PER36'] > 1.5:
            return "Rim Protector"

        elif row['PTS_PER36'] > 20 and row['USG%'] > 25:
            return "High-Usage Scorer"

        elif row['STL_PER36'] > 1.5 and row['BLK_PER36'] > 0.8:
            return "Defensive Specialist"

        elif row['REB_PER36'] > 8 and row['3P%'] > 0.35:
            return "Stretch Big"

        else:
            return "Role Player"

    def compare_to_nba_callups(self, player_stats, historical_callups):
        """
        Compare player to historical successful NBA call-ups.

        Parameters:
        -----------
        player_stats : Series
            Current player's G League stats
        historical_callups : DataFrame
            Historical data of successful G League to NBA transitions

        Returns:
        --------
        Dictionary with similarity scores and comparable players
        """
        # Select key metrics for comparison
        comparison_metrics = [
            'PTS_PER36', 'REB_PER36', 'AST_PER36', 'TS%',
            'USG%', '3P%', 'AST_TO_RATIO', 'AGE'
        ]

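        # Normalize metrics: fit the scaler on the historical pool, then
        # transform the single player with the same mean and variance.
        # (Fitting on one row gives zero variance and no real scaling.)
        historical_normalized = self.scaler.fit_transform(
            historical_callups[comparison_metrics].values.astype(float)
        )
        player_normalized = self.scaler.transform(
            player_stats[comparison_metrics].values.astype(float).reshape(1, -1)
        )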

        # Calculate Euclidean distance
        distances = np.linalg.norm(
            historical_normalized - player_normalized, axis=1
        )

        # Find most similar players
        similar_indices = np.argsort(distances)[:5]
        comparables = historical_callups.iloc[similar_indices].copy()
        comparables['SIMILARITY'] = 100 - (distances[similar_indices] /
                                           distances.max() * 100)

        return {
            'comparables': comparables,
            'avg_similarity': comparables['SIMILARITY'].mean(),
            'top_match': comparables.iloc[0]['PLAYER_NAME']
        }

    def generate_scouting_report(self, player_data):
        """
        Generate comprehensive scouting report for NBA readiness.

        Parameters:
        -----------
        player_data : Series
            Complete player statistics and metrics

        Returns:
        --------
        Dictionary with scouting report components
        """
        report = {
            'player_name': player_data['PLAYER_NAME'],
            'age': player_data['AGE'],
            'position': player_data['POSITION'],
            'translation_score': self.calculate_translation_score(
                pd.DataFrame([player_data])
            ).iloc[0],
            'skill_profile': self.identify_skill_profile(player_data),
            'strengths': [],
            'weaknesses': [],
            'nba_role': '',
            'recommendation': ''
        }

        # Identify strengths
        if player_data['TS%'] > 0.60:
            report['strengths'].append("Elite efficiency scorer")
        if player_data['3P%'] > 0.38 and player_data['3PA_PER36'] > 5:
            report['strengths'].append("Proven three-point shooter")
        if player_data['AST_TO_RATIO'] > 2.5:
            report['strengths'].append("Excellent decision maker")
        if player_data['STL_PER36'] + player_data['BLK_PER36'] > 2.5:
            report['strengths'].append("Disruptive defender")

        # Identify weaknesses
        if player_data['TOV'] / player_data['MIN'] > 0.15:
            report['weaknesses'].append("Turnover prone")
        if player_data['FT%'] < 0.70:
            report['weaknesses'].append("Poor free throw shooter")
        if player_data['AGE'] > 25:
            report['weaknesses'].append("Limited development runway")
        if player_data['REB_PER36'] < 3 and player_data['POSITION'] in ['PF', 'C']:
            report['weaknesses'].append("Below-average rebounder for position")

        # NBA role projection
        profile = report['skill_profile']
        if profile == "3-and-D Specialist":
            report['nba_role'] = "Rotation wing, 15-20 MPG"
        elif profile == "Floor General":
            report['nba_role'] = "Backup point guard, 18-24 MPG"
        elif profile == "Rim Protector":
            report['nba_role'] = "Backup center, 15-20 MPG"
        else:
            report['nba_role'] = "Deep bench/two-way contract"

        # Overall recommendation
        if report['translation_score'] > 70:
            report['recommendation'] = "STRONG NBA PROSPECT - Recommend immediate call-up"
        elif report['translation_score'] > 55:
            report['recommendation'] = "VIABLE NBA PLAYER - Consider for two-way contract"
        elif report['translation_score'] > 40:
            report['recommendation'] = "DEVELOPMENTAL - Continue G League assignment"
        else:
            report['recommendation'] = "LIMITED NBA UPSIDE - Focus on G League role"

        return report

    def visualize_player_profile(self, player_data, nba_average):
        """
        Create radar chart comparing player to NBA average.

        Parameters:
        -----------
        player_data : Series
            Player statistics
        nba_average : Series
            NBA average statistics for position
        """
        categories = ['Scoring', 'Playmaking', 'Rebounding',
                     'Efficiency', 'Defense']

        # Normalize to 0-100 scale
        player_values = [
            (player_data['PTS_PER36'] / 25) * 100,
            (player_data['AST_PER36'] / 8) * 100,
            (player_data['REB_PER36'] / 10) * 100,
            (player_data['TS%'] / 0.65) * 100,
            ((player_data['STL_PER36'] + player_data['BLK_PER36']) / 3) * 100
        ]

        nba_values = [
            (nba_average['PTS_PER36'] / 25) * 100,
            (nba_average['AST_PER36'] / 8) * 100,
            (nba_average['REB_PER36'] / 10) * 100,
            (nba_average['TS%'] / 0.65) * 100,
            ((nba_average['STL_PER36'] + nba_average['BLK_PER36']) / 3) * 100
        ]

        # Create radar chart
        angles = np.linspace(0, 2 * np.pi, len(categories), endpoint=False)
        player_values += player_values[:1]
        nba_values += nba_values[:1]
        angles = np.concatenate((angles, [angles[0]]))

        fig, ax = plt.subplots(figsize=(10, 10), subplot_kw=dict(projection='polar'))
        ax.plot(angles, player_values, 'o-', linewidth=2, label='Player', color='#1f77b4')
        ax.fill(angles, player_values, alpha=0.25, color='#1f77b4')
        ax.plot(angles, nba_values, 'o-', linewidth=2, label='NBA Avg', color='#ff7f0e')
        ax.fill(angles, nba_values, alpha=0.25, color='#ff7f0e')
        ax.set_xticks(angles[:-1])
        ax.set_xticklabels(categories, size=12)
        ax.set_ylim(0, 100)
        ax.set_title(f"{player_data['PLAYER_NAME']} - NBA Translation Profile",
                    size=16, y=1.08)
        ax.legend(loc='upper right', bbox_to_anchor=(1.3, 1.1))
        ax.grid(True)

        return fig

# Example usage
if __name__ == "__main__":
    # Sample G League data
    gleague_data = pd.DataFrame({
        'PLAYER_NAME': ['Player A', 'Player B', 'Player C'],
        'AGE': [22, 26, 24],
        'POSITION': ['SG', 'PG', 'PF'],
        'MIN': [32.5, 28.3, 25.1],
        'PTS': [22.3, 15.7, 12.4],
        'REB': [5.2, 3.1, 8.3],
        'AST': [3.5, 6.8, 1.9],
        'STL': [1.2, 1.5, 0.8],
        'BLK': [0.4, 0.2, 1.3],
        'TOV': [2.1, 2.3, 1.5],
        'FGM': [8.1, 5.9, 4.7],
        'FGA': [16.8, 13.2, 9.3],
        '3PM': [2.4, 1.8, 0.6],
        '3PA': [6.5, 5.1, 1.8],
        '3P%': [0.369, 0.353, 0.333],
        'FTM': [3.7, 2.1, 2.4],
        'FTA': [4.3, 2.6, 3.1],
        'FT%': [0.860, 0.808, 0.774],
        'TEAM_MIN': [240, 240, 240]
    })

    # Initialize analyzer
    analyzer = GLeagueAnalyzer()

    # Calculate advanced stats
    gleague_data = analyzer.calculate_advanced_stats(gleague_data)

    # Apply translation factors
    gleague_data = analyzer.apply_translation_factors(gleague_data)

    # Calculate translation scores
    gleague_data['TRANSLATION_SCORE'] = analyzer.calculate_translation_score(gleague_data)

    # Generate scouting reports
    for idx, player in gleague_data.iterrows():
        report = analyzer.generate_scouting_report(player)
        print(f"\n{'='*60}")
        print(f"SCOUTING REPORT: {report['player_name']}")
        print(f"{'='*60}")
        print(f"Age: {report['age']} | Position: {report['position']}")
        print(f"Translation Score: {report['translation_score']:.1f}/100")
        print(f"Profile: {report['skill_profile']}")
        print(f"\nStrengths: {', '.join(report['strengths']) if report['strengths'] else 'None identified'}")
        print(f"Weaknesses: {', '.join(report['weaknesses']) if report['weaknesses'] else 'None identified'}")
        print(f"\nProjected NBA Role: {report['nba_role']}")
        print(f"Recommendation: {report['recommendation']}")

    print("\n\nTop Translation Prospects:")
    print(gleague_data[['PLAYER_NAME', 'AGE', 'TRANSLATION_SCORE']].sort_values(
        'TRANSLATION_SCORE', ascending=False
    ))
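
The comparable-player search is the step where scaling matters most, so it may help to see it in isolation. The sketch below z-scores the metrics using the mean and standard deviation of the historical pool, applies the same statistics to the prospect, and ranks the pool by Euclidean distance. The `find_comparables` helper, the three chosen metrics, and all rows of data are hypothetical and exist only for illustration.

```python
import numpy as np
import pandas as pd


def find_comparables(player, historical, metrics, top_n=3):
    """Rank historical players by z-scored Euclidean distance to `player`."""
    pool = historical[metrics].astype(float)
    mean = pool.mean()
    std = pool.std(ddof=0).replace(0, 1.0)  # avoid dividing by zero

    # Scale both the pool and the player with the pool's statistics.
    pool_z = (pool - mean) / std
    player_z = (player[metrics].astype(float) - mean) / std

    distances = np.sqrt(((pool_z - player_z) ** 2).sum(axis=1))
    ranked = historical.assign(DISTANCE=distances).sort_values("DISTANCE")
    return ranked.head(top_n)


# Made-up historical call-ups and prospect.
historical = pd.DataFrame({
    "PLAYER_NAME": ["Callup 1", "Callup 2", "Callup 3", "Callup 4"],
    "PTS_PER36": [22.0, 14.0, 18.0, 10.0],
    "AST_PER36": [4.0, 7.5, 3.0, 2.0],
    "TS%": [0.60, 0.55, 0.58, 0.52],
})
player = pd.Series({"PLAYER_NAME": "Prospect", "PTS_PER36": 21.5,
                    "AST_PER36": 4.2, "TS%": 0.59})

comparables = find_comparables(player, historical,
                               ["PTS_PER36", "AST_PER36", "TS%"])
print(comparables[["PLAYER_NAME", "DISTANCE"]])
```

Scaling with the pool's statistics keeps one high-magnitude metric (points per 36) from dominating a low-magnitude one (true shooting percentage) in the distance.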

4. R Code for Projection Models

The R functions below sketch statistical models for predicting NBA success from G League performance, using regularized regression, tree ensembles, and logistic regression.


library(tidyverse)
library(caret)
library(randomForest)
library(glmnet)
library(xgboost)
library(corrplot)

# G League NBA Translation Model
# Predicts NBA performance metrics from G League statistics

# ============================================================================
# Data Preparation
# ============================================================================

prepare_gleague_data <- function(gleague_stats, nba_outcomes) {
  # Merge G League stats with subsequent NBA performance
  # gleague_stats: G League season statistics
  # nba_outcomes: NBA performance in following season

  merged_data <- gleague_stats %>%
    inner_join(nba_outcomes, by = c("player_id", "season")) %>%
    mutate(
      # Calculate advanced G League metrics
      gl_ts_pct = gl_pts / (2 * (gl_fga + 0.44 * gl_fta)),
      gl_efg_pct = (gl_fgm + 0.5 * gl_3pm) / gl_fga,
      gl_ast_to = gl_ast / gl_tov,
      gl_usg = 100 * ((gl_fga + 0.44 * gl_fta + gl_tov) *
                      (team_min / 5)) / (gl_min * team_poss),

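      # Per-36 minute stats (steals and blocks are needed later by
      # defense_score and versatility; gl_stl and gl_blk are assumed inputs)
      gl_pts_36 = (gl_pts / gl_min) * 36,
      gl_reb_36 = (gl_reb / gl_min) * 36,
      gl_ast_36 = (gl_ast / gl_min) * 36,
      gl_stl_36 = (gl_stl / gl_min) * 36,
      gl_blk_36 = (gl_blk / gl_min) * 36,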

      # Age factors
      age_prime = case_when(
        age < 22 ~ "Young",
        age >= 22 & age < 26 ~ "Prime",
        age >= 26 ~ "Older"
      ),

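      # Position encoding, stored as a factor because randomForest()
      # does not accept character predictors
      position_group = factor(case_when(
        position %in% c("PG", "SG") ~ "Guard",
        position %in% c("SF", "PF") ~ "Wing",
        position == "C" ~ "Center"
      ), levels = c("Guard", "Wing", "Center"))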
    ) %>%
    # Remove infinite and missing values
    filter(
      is.finite(gl_ts_pct),
      is.finite(gl_ast_to),
      gl_min >= 15  # Minimum minutes filter
    )

  return(merged_data)
}

# ============================================================================
# Feature Engineering
# ============================================================================

create_translation_features <- function(data) {
  # Create advanced features for translation model

  # scale() returns a one-column matrix; z() flattens it to a plain numeric
  # vector so the new columns behave like ordinary data frame columns.
  z <- function(x) as.numeric(scale(x))

  features <- data %>%
    mutate(
      # Efficiency composite
      efficiency_score = z(gl_ts_pct) + z(gl_efg_pct) -
                        z(gl_tov / gl_min),

      # Production composite
      production_score = z(gl_pts_36) + z(gl_reb_36) +
                        z(gl_ast_36),

      # Shooting ability (z-scored accuracy weighted by attempt rate)
      shooting_score = z(gl_3p_pct) * sqrt(gl_3pa / gl_min),

      # Playmaking
      playmaking_score = z(gl_ast_36) + z(gl_ast_to),

      # Defense proxy
      defense_score = z(gl_stl_36) + z(gl_blk_36),

      # Age-adjusted production
      age_adj_prod = production_score * case_when(
        age < 23 ~ 1.2,
        age < 25 ~ 1.0,
        age < 27 ~ 0.85,
        TRUE ~ 0.7
      ),

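      # Versatility score: spread across per-36 categories, computed per player.
      # (A bare sd(c(...)) inside mutate() pools every row into one constant.)
      versatility = apply(
        cbind(gl_pts_36, gl_reb_36, gl_ast_36, gl_stl_36, gl_blk_36),
        1, sd, na.rm = TRUE
      )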
    )

  return(features)
}

# ============================================================================
# Translation Factor Estimation
# ============================================================================

estimate_translation_factors <- function(data) {
  # Estimate position-specific translation factors
  # using players who played in both G League and NBA in same season

  factors <- data %>%
    group_by(position_group) %>%
    summarise(
      pts_factor = median(nba_pts / gl_pts, na.rm = TRUE),
      reb_factor = median(nba_reb / gl_reb, na.rm = TRUE),
      ast_factor = median(nba_ast / gl_ast, na.rm = TRUE),
      fg_pct_factor = median(nba_fg_pct / gl_fg_pct, na.rm = TRUE),
      # Backticks are required: a name starting with a digit is not syntactic in R
      `3p_pct_factor` = median(nba_3p_pct / gl_3p_pct, na.rm = TRUE),
      n_players = n()
    ) %>%
    mutate(across(ends_with("_factor"), ~ifelse(is.finite(.), ., NA)))

  return(factors)
}

apply_translation_factors <- function(gleague_stats, factors) {
  # Apply empirically derived translation factors

  translated <- gleague_stats %>%
    left_join(factors, by = "position_group") %>%
    mutate(
      projected_nba_pts = gl_pts * pts_factor,
      projected_nba_reb = gl_reb * reb_factor,
      projected_nba_ast = gl_ast * ast_factor,
      projected_nba_fg_pct = gl_fg_pct * fg_pct_factor,
      projected_nba_3p_pct = gl_3p_pct * `3p_pct_factor`
    )

  return(translated)
}

# ============================================================================
# Predictive Models
# ============================================================================

# Model 1: Linear Regression with Regularization
train_ridge_model <- function(train_data, target_var) {
  # Ridge regression for NBA outcome prediction

  # Select predictor variables
  predictors <- c("gl_pts_36", "gl_reb_36", "gl_ast_36", "gl_ts_pct",
                 "gl_usg", "gl_ast_to", "age", "efficiency_score",
                 "production_score", "shooting_score")

  # Prepare matrices
  x_train <- train_data %>%
    select(all_of(predictors)) %>%
    as.matrix()

  y_train <- train_data[[target_var]]

  # Cross-validated ridge regression
  cv_model <- cv.glmnet(
    x = x_train,
    y = y_train,
    alpha = 0,  # Ridge penalty
    nfolds = 10,
    type.measure = "mse"
  )

  return(cv_model)
}

# Model 2: Random Forest
train_rf_model <- function(train_data, target_var) {
  # Random forest for capturing non-linear relationships

  predictors <- c("gl_pts_36", "gl_reb_36", "gl_ast_36", "gl_ts_pct",
                 "gl_usg", "gl_ast_to", "age", "position_group",
                 "efficiency_score", "production_score", "shooting_score",
                 "playmaking_score", "defense_score")

  formula <- as.formula(paste(target_var, "~", paste(predictors, collapse = " + ")))

  rf_model <- randomForest(
    formula = formula,
    data = train_data,
    ntree = 500,
    mtry = floor(sqrt(length(predictors))),
    importance = TRUE,
    nodesize = 5
  )

  return(rf_model)
}

# Model 3: XGBoost
train_xgboost_model <- function(train_data, target_var) {
  # Gradient boosting for high-performance predictions

  predictors <- c("gl_pts_36", "gl_reb_36", "gl_ast_36", "gl_ts_pct",
                 "gl_usg", "gl_ast_to", "age", "efficiency_score",
                 "production_score", "shooting_score", "playmaking_score",
                 "defense_score", "age_adj_prod", "versatility")

  # Convert categorical variables
  train_processed <- train_data %>%
    mutate(
      position_guard = as.integer(position_group == "Guard"),
      position_wing = as.integer(position_group == "Wing"),
      position_center = as.integer(position_group == "Center")
    )

  predictors_full <- c(predictors, "position_guard", "position_wing", "position_center")

  x_train <- train_processed %>%
    select(all_of(predictors_full)) %>%
    as.matrix()

  y_train <- train_processed[[target_var]]

  dtrain <- xgb.DMatrix(data = x_train, label = y_train)

  # Cross-validation for optimal parameters
  cv_results <- xgb.cv(
    data = dtrain,
    nrounds = 500,
    nfold = 5,
    objective = "reg:squarederror",
    eta = 0.05,
    max_depth = 6,
    subsample = 0.8,
    colsample_bytree = 0.8,
    early_stopping_rounds = 20,
    verbose = 0
  )

  # Train final model
  best_iteration <- cv_results$best_iteration

  xgb_model <- xgboost(
    data = dtrain,
    nrounds = best_iteration,
    objective = "reg:squarederror",
    eta = 0.05,
    max_depth = 6,
    subsample = 0.8,
    colsample_bytree = 0.8,
    verbose = 0
  )

  return(list(model = xgb_model, feature_names = predictors_full))
}

# ============================================================================
# Model Evaluation
# ============================================================================

evaluate_model <- function(model, test_data, target_var, model_type = "rf") {
  # Comprehensive model evaluation

  if (model_type == "ridge") {
    predictors <- c("gl_pts_36", "gl_reb_36", "gl_ast_36", "gl_ts_pct",
                   "gl_usg", "gl_ast_to", "age", "efficiency_score",
                   "production_score", "shooting_score")
    x_test <- test_data %>% select(all_of(predictors)) %>% as.matrix()
    # predict() on a cv.glmnet object returns a one-column matrix; flatten it
    predictions <- as.numeric(predict(model, newx = x_test, s = "lambda.min"))

  } else if (model_type == "rf") {
    predictions <- predict(model, newdata = test_data)

  } else if (model_type == "xgb") {
    predictors_full <- model$feature_names
    test_processed <- test_data %>%
      mutate(
        position_guard = as.integer(position_group == "Guard"),
        position_wing = as.integer(position_group == "Wing"),
        position_center = as.integer(position_group == "Center")
      )
    x_test <- test_processed %>%
      select(all_of(predictors_full)) %>%
      as.matrix()
    dtest <- xgb.DMatrix(data = x_test)
    predictions <- predict(model$model, dtest)
  }

  actual <- test_data[[target_var]]

  # Calculate metrics
  rmse <- sqrt(mean((predictions - actual)^2, na.rm = TRUE))
  mae <- mean(abs(predictions - actual), na.rm = TRUE)
  r_squared <- cor(predictions, actual, use = "complete.obs")^2

  # Prediction accuracy by categories
  accuracy_by_threshold <- tibble(
    threshold = c(5, 10, 15, 20),
    within_threshold = map_dbl(threshold, ~mean(abs(predictions - actual) <= .x, na.rm = TRUE))
  )

  results <- list(
    rmse = rmse,
    mae = mae,
    r_squared = r_squared,
    accuracy_by_threshold = accuracy_by_threshold,
    predictions = predictions,
    actual = actual
  )

  return(results)
}

# ============================================================================
# Success Probability Model
# ============================================================================

model_nba_success_probability <- function(data) {
  # Logistic regression for probability of NBA success
  # Success defined as: playing 100+ NBA minutes with PER > 10

  model_data <- data %>%
    mutate(
      nba_success = as.factor(ifelse(nba_min >= 100 & nba_per > 10, 1, 0))
    )

  # Train logistic regression
  success_model <- glm(
    nba_success ~ gl_pts_36 + gl_ts_pct + gl_usg + age +
                  efficiency_score + shooting_score + position_group,
    data = model_data,
    family = binomial(link = "logit")
  )

  # Calculate success probabilities
  model_data$success_prob <- predict(success_model, type = "response")

  # Categorize prospects
  model_data <- model_data %>%
    mutate(
      prospect_tier = case_when(
        success_prob >= 0.7 ~ "High",
        success_prob >= 0.4 ~ "Medium",
        success_prob >= 0.2 ~ "Low",
        TRUE ~ "Minimal"
      )
    )

  return(list(model = success_model, data = model_data))
}

# ============================================================================
# Visualization Functions
# ============================================================================

plot_translation_comparison <- function(evaluation_results, target_name) {
  # Scatter plot of predicted vs actual

  plot_data <- tibble(
    predicted = evaluation_results$predictions,
    actual = evaluation_results$actual
  )

  ggplot(plot_data, aes(x = predicted, y = actual)) +
    geom_point(alpha = 0.5, size = 3, color = "#1f77b4") +
    geom_abline(slope = 1, intercept = 0, linetype = "dashed",
                color = "red", linewidth = 1) +
    geom_smooth(method = "lm", se = TRUE, color = "#ff7f0e") +
    labs(
      title = paste("G League to NBA Translation:", target_name),
      subtitle = sprintf("R² = %.3f, RMSE = %.2f, MAE = %.2f",
                        evaluation_results$r_squared,
                        evaluation_results$rmse,
                        evaluation_results$mae),
      x = "Predicted NBA Value",
      y = "Actual NBA Value"
    ) +
    theme_minimal() +
    theme(
      plot.title = element_text(size = 14, face = "bold"),
      plot.subtitle = element_text(size = 11)
    )
}

plot_feature_importance <- function(rf_model) {
  # Variable importance plot

  importance_df <- importance(rf_model) %>%
    as.data.frame() %>%
    rownames_to_column("feature") %>%
    arrange(desc(`%IncMSE`)) %>%
    head(15)

  ggplot(importance_df, aes(x = reorder(feature, `%IncMSE`), y = `%IncMSE`)) +
    geom_col(fill = "#1f77b4", alpha = 0.8) +
    coord_flip() +
    labs(
      title = "Feature Importance for NBA Translation",
      x = "Feature",
      y = "Importance (% Increase in MSE)"
    ) +
    theme_minimal() +
    theme(
      plot.title = element_text(size = 14, face = "bold")
    )
}

plot_success_probability_distribution <- function(success_model_results) {
  # Distribution of success probabilities by age

  ggplot(success_model_results$data, aes(x = age, y = success_prob,
                                         color = prospect_tier)) +
    geom_point(alpha = 0.6, size = 3) +
    geom_smooth(method = "loess", se = TRUE) +
    scale_color_manual(values = c("High" = "#2ecc71", "Medium" = "#f39c12",
                                  "Low" = "#e74c3c", "Minimal" = "#95a5a6")) +
    labs(
      title = "NBA Success Probability by Age",
      x = "Player Age",
      y = "Probability of NBA Success",
      color = "Prospect Tier"
    ) +
    theme_minimal() +
    theme(
      plot.title = element_text(size = 14, face = "bold"),
      legend.position = "right"
    )
}

# ============================================================================
# Example Usage
# ============================================================================

# Load and prepare data
# gleague_stats <- read_csv("gleague_stats.csv")
# nba_outcomes <- read_csv("nba_outcomes.csv")

# prepared_data <- prepare_gleague_data(gleague_stats, nba_outcomes)
# featured_data <- create_translation_features(prepared_data)

# Split data
# set.seed(42)
# train_idx <- createDataPartition(featured_data$nba_pts, p = 0.8, list = FALSE)
# train_data <- featured_data[train_idx, ]
# test_data <- featured_data[-train_idx, ]

# Train models
# rf_model <- train_rf_model(train_data, "nba_pts")
# xgb_model <- train_xgboost_model(train_data, "nba_pts")

# Evaluate
# rf_eval <- evaluate_model(rf_model, test_data, "nba_pts", "rf")
# xgb_eval <- evaluate_model(xgb_model, test_data, "nba_pts", "xgb")

# Success probability
# success_results <- model_nba_success_probability(featured_data)

# Visualizations
# plot_translation_comparison(rf_eval, "Points Per Game")
# plot_feature_importance(rf_model)
# plot_success_probability_distribution(success_results)

print("G League translation models loaded successfully")
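
For readers working in Python, the median-ratio idea behind `estimate_translation_factors` can be sketched with pandas. Column names follow the `gl_`/`nba_` convention used in the R code; the `estimate_factors` helper and every row of sample data are made up for illustration, so the resulting factors carry no empirical meaning.

```python
import numpy as np
import pandas as pd


def estimate_factors(data, stats=("pts", "reb", "ast")):
    """Median NBA / G League ratio per position group for each stat."""
    ratios = pd.DataFrame({"position_group": data["position_group"]})
    for stat in stats:
        ratio = data[f"nba_{stat}"] / data[f"gl_{stat}"]
        # Division by zero yields inf; treat those ratios as missing.
        ratios[f"{stat}_factor"] = ratio.replace([np.inf, -np.inf], np.nan)
    grouped = ratios.groupby("position_group")
    factors = grouped.median()          # NaN ratios are skipped
    factors["n_players"] = grouped.size()
    return factors


# Made-up players who logged time in both leagues.
sample = pd.DataFrame({
    "position_group": ["Guard", "Guard", "Guard", "Center", "Center"],
    "gl_pts": [20.0, 16.0, 10.0, 12.0, 15.0],
    "nba_pts": [10.0, 12.0, 6.0, 6.0, 9.0],
    "gl_reb": [4.0, 5.0, 2.0, 10.0, 8.0],
    "nba_reb": [2.0, 3.0, 1.0, 7.0, 4.0],
    "gl_ast": [5.0, 8.0, 0.0, 2.0, 1.0],
    "nba_ast": [3.0, 4.0, 1.0, 1.0, 0.5],
})

factors = estimate_factors(sample)
print(factors.round(2))
```

The median is used rather than the mean so that a few extreme ratios, common when G League minutes are small, do not dominate a group's factor.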

5. Success Stories: From the G League to NBA Stars

Notable players who developed through the G League and became successful NBA contributors.

Elite Success Cases

Fred VanVleet (Toronto Raptors)

Path: Undrafted (2016) → Raptors 905 → NBA Champion (2019)
G League Stats: 21.9 PPG, 4.9 APG, 45.2 FG%, 41.4 3P%
NBA Peak: 20.3 PPG, 6.7 APG, All-Star (2022)
Translation Keys: Elite shooting efficiency, improved playmaking, high basketball IQ
Contract Value: 4 years, $85 million (2020)

Duncan Robinson (Miami Heat)

Path: Undrafted (2018) → Sioux Falls Skyforce → Starting NBA role
G League Stats: 21.7 PPG, 44.7 FG%, 43.5 3P% on 9.1 attempts
NBA Impact: Key rotation player, NBA Finals starter
Translation Keys: Movement shooting, off-ball awareness, volume three-point shooting
Contract Value: 5 years, $90 million (2021)

Khris Middleton (Milwaukee Bucks)

Path: 39th pick (2012) → Fort Wayne Mad Ants → Three-time All-Star
G League Stats: 20.1 PPG, 5.8 RPG, 47.6 FG%, 38.9 3P%
NBA Achievement: NBA Champion (2021), Olympic Gold Medal
Translation Keys: Two-way versatility, clutch performance, improved shot creation
Contract Value: 5 years, $177.5 million (2019)

Pascal Siakam (Toronto Raptors)

Path: 27th pick (2016) → Raptors 905 → All-NBA Second Team
G League Stats: 17.4 PPG, 6.2 RPG, 2.3 APG
NBA Peak: 22.9 PPG, 7.3 RPG, Most Improved Player (2019)
Translation Keys: Rapid skill development, work ethic, position versatility

Key Translation Patterns

Success Factor	G League Indicators	NBA Translation
Three-Point Shooting	38%+ on high volume (6+ attempts)	Immediate floor spacing value
Efficiency	60%+ True Shooting Percentage	Low-usage role player success
Age	Under 24 at G League dominance	Higher development ceiling
Physical Tools	Defensive versatility indicators	Position flexibility in NBA
Basketball IQ	High assist-to-turnover ratio	Quick NBA adjustment period
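
The quantifiable rows of the table above can be turned into a simple screening pass. The following is a minimal sketch in R, not a validated model: the data frame, its column names, and the assist-to-turnover cutoff of 2.0 are assumptions made for illustration (the table only says "high"), and the Physical Tools row is omitted because it has no numeric threshold.

```r
# Hypothetical prospect data; names and numbers are illustrative only
prospects <- data.frame(
  player  = c("Prospect A", "Prospect B", "Prospect C"),
  age     = c(22, 26, 23),
  fg3_pct = c(0.41, 0.36, 0.39),  # three-point percentage
  fg3a_pg = c(7.2, 6.5, 4.1),     # three-point attempts per game
  ts_pct  = c(0.62, 0.57, 0.61),  # true shooting percentage
  ast_to  = c(2.4, 1.3, 2.9),     # assist-to-turnover ratio
  stringsAsFactors = FALSE
)

# Flag each indicator from the table; min_ast_to is an assumed cutoff
flag_translation_indicators <- function(df, min_ast_to = 2.0) {
  df$volume_shooter <- df$fg3_pct >= 0.38 & df$fg3a_pg >= 6
  df$efficient      <- df$ts_pct >= 0.60
  df$young          <- df$age < 24
  df$high_iq_proxy  <- df$ast_to >= min_ast_to
  indicator_cols <- c("volume_shooter", "efficient", "young", "high_iq_proxy")
  df$indicators_met <- rowSums(df[, indicator_cols])
  df
}

flagged <- flag_translation_indicators(prospects)
print(flagged[, c("player", "indicators_met")])
```

A count of indicators met is only a first filter; it says nothing about how strongly each indicator predicts NBA success.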

Recent Notable Call-Ups (2023-2024)

Mac McClung: G League MVP → NBA contract (highlight-reel athleticism, scoring punch)
Craig Sword: Two-way → rotation minutes (defensive versatility)
Jordan Goodwin: Undrafted → 10-day → full contract (hustle, defense)
Taze Moore: G League Showcase → NBA opportunity (playmaking, athleticism)

6. Two-Way Contracts and Player Development

Understanding the two-way contract system and its role in player development pathways.

Two-Way Contract Structure

Contract Details (2024-25 Season)

Roster Spots: Each NBA team can have up to 3 two-way players
Salary: $578,577, a flat amount equal to 50% of the rookie minimum salary regardless of time split between the NBA and G League
NBA Games Limit: Active for a maximum of 50 NBA regular-season games
Playoff Eligibility: Not eligible for the playoff roster unless converted to a standard NBA contract before the end of the regular season
Contract Conversion: Can be converted to a standard NBA contract during the season, provided a standard roster spot is open

Two-Way Contract Advantages

Stakeholder	Benefits	Strategic Use
NBA Teams	Roster flexibility; low-risk talent evaluation; injury replacement options; development investment	Test fringe prospects; fill temporary needs; develop young talent; practice squad depth
Players	NBA exposure and coaching; higher earnings than G League; path to standard contract; major medical coverage	Showcase skills at NBA level; learn NBA systems; build relationships; accelerate development
G League Teams	High-quality players when available; connection to NBA affiliate; enhanced development role	Implement NBA systems; mentor other prospects; competitive advantage

Development Best Practices

For Organizations

Aligned Systems: Run identical offensive/defensive schemes in G League affiliate
Communication: Regular coordination between NBA and G League coaching staffs
Development Plans: Individual skill development roadmaps for each two-way player
Strategic Call-Ups: Time NBA stints for player readiness and team needs
Film Study: Shared video analysis between NBA and G League operations
Performance Tracking: Detailed analytics on development progress

For Players

Skill Focus: Identify 2-3 skills to perfect during G League time
Role Acceptance: Embrace limited NBA role when called up
Physical Preparation: Maintain NBA-ready conditioning year-round
Mental Approach: Stay ready for call-up at any moment
Relationship Building: Maximize practice time with NBA players
Film Work: Study NBA rotation players at your position

Successful Two-Way Contract Conversions

2023-24 Season Examples:

Oshae Brissett (Celtics): Two-way → multi-year standard contract → key playoff contributor
Toumani Camara (Trail Blazers): Two-way → converted → starting role
GG Jackson (Grizzlies): Two-way → converted → rotational minutes as teenager
Julian Champagnie (Spurs): Two-way → standard deal → rotation wing

Alternative Development Pathways

10-Day Contracts: Short-term evaluation periods (maximum of 2 with the same team, after which the team must sign the player for the rest of the season or release him)
Exhibit 10 Contracts: Non-guaranteed training camp deals with G League bonus
G League Ignite: Elite prospects prepare for NBA Draft (program ended 2024)
NBA Academy: International development program for elite youth players
Summer League: Showcase opportunity for unsigned players and draft picks

7. Best Practices for Prospect Evaluation

Comprehensive framework for evaluating G League players and projecting NBA success.

Multi-Dimensional Evaluation Framework

1. Statistical Analysis (40% Weight)

Primary Metrics:

True Shooting Percentage (min 58% for non-creators)
Per-36 minute production (age-adjusted)
Usage rate (context of role)
Assist-to-turnover ratio (playmakers)
Box Plus/Minus estimate

Advanced Indicators:

Shot quality (open vs contested attempts)
Shooting versatility (catch-and-shoot, pull-ups, movement)
Defensive impact proxies (deflections, charges, box-outs)
Situational performance (clutch, back-to-backs)
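
Two of the primary metrics above have standard closed forms that are easy to compute from box-score totals. The sketch below uses the common true shooting formula, TS% = PTS / (2 × (FGA + 0.44 × FTA)), and a plain per-36 rescaling; the function names and the sample totals are hypothetical, and no age adjustment is applied here.

```r
# True shooting percentage from season totals
true_shooting_pct <- function(pts, fga, fta) {
  pts / (2 * (fga + 0.44 * fta))
}

# Rescale any counting stat to a per-36-minute rate
per_36 <- function(stat_total, minutes_total) {
  stat_total / minutes_total * 36
}

# Hypothetical season totals for one player
ts    <- true_shooting_pct(pts = 600, fga = 420, fta = 150)
pts36 <- per_36(stat_total = 600, minutes_total = 1000)

round(ts, 3)   # 0.617
pts36          # 21.6
ts >= 0.58     # TRUE: clears the 58% floor listed above for non-creators
```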

2. Skills Assessment (30% Weight)

Translatable Skills Checklist:

Shooting: NBA three-point range, shot mechanics, consistency
Ball-Handling: Pressure resistance, change of pace, ambidexterity
Passing: Court vision, decision speed, pocket passes
Defense: Footwork, positioning, communication, versatility
Rebounding: Box-out technique, positioning, hands
Finishing: Touch around rim, body control, angle creation

3. Physical Tools (20% Weight)

Athleticism: Speed, explosiveness, verticality
Size: Height, wingspan, standing reach (position-relative)
Strength: Contact absorption, screen setting, post defense
Lateral Quickness: Defensive footwork, closeout speed
Durability: Injury history, workload capacity

4. Intangibles (10% Weight)

Basketball IQ: Reads, anticipation, pattern recognition
Work Ethic: Skill development trajectory, conditioning
Competitiveness: Effort level, winning plays
Coachability: Adjustment speed, implementation of feedback
Team Chemistry: Locker room presence, role acceptance
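
The four weights above (40/30/20/10) combine naturally into a single composite grade. This is a minimal sketch assuming each dimension has already been graded on a common 1-10 scale; the function name and the example grades are hypothetical.

```r
# Framework weights from the four dimensions above
weights <- c(statistical = 0.40, skills = 0.30, physical = 0.20, intangibles = 0.10)

# Weighted average of dimension grades; grades must be named like the weights
composite_score <- function(grades, w = weights) {
  stopifnot(setequal(names(grades), names(w)), abs(sum(w) - 1) < 1e-8)
  sum(grades[names(w)] * w)
}

# Hypothetical grades for one prospect (1-10 scale)
grades <- c(statistical = 7, skills = 8, physical = 6, intangibles = 9)
score  <- composite_score(grades)
score  # 2.8 + 2.4 + 1.2 + 0.9 = 7.3
```

Indexing the grades by `names(w)` keeps the result correct even if the grades are supplied in a different order than the weights.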

Red Flags and Warning Signs

Statistical Red Flags:

High turnover rate (>15% TOV rate for guards)
Poor free throw shooting (<70% FT%)
Low three-point attempt rate (<3 per game for wings)
Declining efficiency over season
Poor performance against stronger competition

Contextual Red Flags:

Age 27+ without prior NBA experience
Dominant stats but no G League All-Star selection
Poor performance in NBA call-ups
Limited defensive versatility
One-dimensional skill set
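
The red flags with explicit thresholds can be checked mechanically before film study. The sketch below covers only those four; the position codes ("G" for guards, "W" for wings), argument names, and example inputs are assumptions, and the remaining flags in the lists above need trend data or scouting judgment.

```r
# Return the names of any threshold-based red flags a player triggers.
# tov_pct and ft_pct are percentages (e.g. 17.2, 68), not fractions.
check_red_flags <- function(position, tov_pct, ft_pct, fg3a_pg, age, nba_games) {
  flags <- c(
    high_turnovers = position == "G" && tov_pct > 15,
    poor_ft        = ft_pct < 70,
    low_3pa_wing   = position == "W" && fg3a_pg < 3,
    old_no_nba     = age >= 27 && nba_games == 0
  )
  names(flags)[flags]
}

# Hypothetical 28-year-old guard with no NBA games
guard_flags <- check_red_flags("G", tov_pct = 17.2, ft_pct = 68,
                               fg3a_pg = 4.5, age = 28, nba_games = 0)
guard_flags  # "high_turnovers" "poor_ft" "old_no_nba"
```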

Evaluation Workflow

Statistical Screening: Filter for minimum thresholds (age, efficiency, production)
Film Study: Watch 3-5 full games (mix of strong/weak opponents)
Skills Inventory: Grade each NBA-relevant skill (1-10 scale)
Physical Assessment: Evaluate physical tools via film and combine data
Context Analysis: Consider role, teammates, scheme fit
Projection: Estimate NBA role, playing time, career arc
Risk Assessment: Identify potential failure modes
Recommendation: Call-up, two-way, continue development, or pass

Position-Specific Evaluation Priorities

Point Guards

Primary: Playmaking (AST%), decision-making (TOV%), three-point shooting
Secondary: Pick-and-roll navigation, defensive pressure resistance
Physical: Quickness, change of direction

Shooting Guards / Wings

Primary: Three-point shooting (volume + efficiency), defensive versatility
Secondary: Off-ball movement, transition play
Physical: Length, athleticism, lateral quickness

Power Forwards / Centers

Primary: Rim protection, rebounding, shooting range
Secondary: Screen setting, short-roll passing, switch ability
Physical: Verticality, strength, mobility

Continuous Monitoring

Maintain a living database of G League prospects with:

Weekly statistical updates
Monthly film reviews
Injury tracking
Performance trend analysis
Competitive landscape (other teams' interest)
Contract status and availability
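
For the performance trend item above, one simple option is to fit a linear trend to a weekly metric and look at the sign and size of the slope. This is a sketch under stated assumptions: the weekly true shooting values are made up, and the cutoff of -0.005 per week for calling a decline is an arbitrary illustration rather than an established threshold.

```r
# Hypothetical weekly true shooting values for one prospect
weekly <- data.frame(
  week   = 1:8,
  ts_pct = c(0.61, 0.60, 0.60, 0.58, 0.59, 0.57, 0.56, 0.55)
)

# Slope of a least-squares line: change in TS% per week
trend_slope <- function(df) {
  fit <- lm(ts_pct ~ week, data = df)
  unname(coef(fit)["week"])
}

slope <- trend_slope(weekly)
declining <- slope < -0.005  # assumed cutoff for flagging a decline
round(slope, 4)              # -0.0083
declining                    # TRUE
```

With only a handful of weekly points the slope is noisy, so a flag like this is a prompt for film review rather than a conclusion.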

Final Evaluation Principles

Context Matters: Always consider role, scheme, and competition level
Skills Over Stats: Translatable skills more predictive than raw numbers
Age Curve: Younger players have higher upside, but older players' statistics are more reliable
Sample Size: Require minimum games played (30+) for statistical reliability
Projection Humility: Translation is difficult; expect high failure rate
Organizational Fit: Evaluate specific fit with team's system and needs

Conclusion

G League analytics requires a sophisticated understanding of performance translation, contextual adjustments, and multi-dimensional player evaluation. Success depends on combining statistical analysis with skill assessment, physical-tools evaluation, and organizational fit. The two-way contract system gives players a development path and teams roster flexibility, and the success stories above show that undrafted players and later draft picks can become meaningful NBA contributors.

Key takeaways for effective G League analytics:

Apply position-specific and age-adjusted translation factors to raw statistics
Prioritize skills that translate reliably (shooting, defense, basketball IQ)
Recognize the limited window for development (ages 21-23 are the peak translation years)
Utilize comprehensive evaluation frameworks combining multiple data sources
Maintain realistic expectations about translation success rates
