Beginner | 10 min read | Nov 27, 2025
# Dean Oliver's Four Factors

Dean Oliver's Four Factors provide a framework for analyzing basketball team performance by breaking the game into four components that correlate strongly with winning: shooting, turnovers, offensive rebounding, and free throws. Together they offer a data-driven way to understand what drives success in basketball.

## The Four Factors

### 1. Effective Field Goal Percentage (eFG%)

**Formula:** `eFG% = (FGM + 0.5 * 3PM) / FGA`

Effective Field Goal Percentage adjusts standard field goal percentage for the fact that a three-point shot is worth 50% more than a two-point shot. A player who shoots 50% on two-pointers has the same eFG% (0.500) as a player who shoots 33.3% on three-pointers, because both produce one point per attempt.

**What it measures:** Shooting efficiency, accounting for the extra value of three-point shots.

### 2. Turnover Rate (TOV%)

**Formula:** `TOV% = TOV / (FGA + 0.44 * FTA + TOV)`

Turnover rate estimates the share of possessions that end in a turnover; multiplied by 100, it reads as turnovers per 100 possessions. The denominator is an estimate of possessions used, where the 0.44 coefficient approximates the fraction of free throw attempts that end a possession. A lower turnover rate is better, as it means the team is taking care of the ball and not wasting possessions.

**What it measures:** Ball security and possession efficiency.

### 3. Offensive Rebound Percentage (ORB%)

**Formula:** `ORB% = ORB / (ORB + Opp DRB)`

Offensive rebound percentage measures the share of available offensive rebounds a team grabs. Higher offensive rebounding extends possessions and creates second-chance points.

**What it measures:** Second-chance opportunities and possession creation.

### 4. Free Throw Rate (FT Rate)

**Formula:** `FT Rate = FTA / FGA`

Free throw rate measures how often a team gets to the free throw line relative to its field goal attempts. Teams that get to the line frequently tend to score more efficiently and put opponents in foul trouble. This attempt-based version captures how often fouls are drawn but not how well the free throws are converted; some sources, including Basketball-Reference, use made free throws (`FT / FGA`) so that conversion counts too.

**What it measures:** Ability to draw fouls and earn high-value free throws.

## Relative Importance Weights

Dean Oliver assigned approximate weights to each factor based on its contribution to winning:

1. **eFG%: 40%** - The most important factor
2. **TOV%: 25%** - Second most important
3. **ORB%: 20%** - Third in importance
4. **FT Rate: 15%** - Least important but still significant

These weights can be used to build a rough composite index for an offense or a defense:

**Composite Rating = (eFG% × 0.40) + ((1 - TOV%) × 0.25) + (ORB% × 0.20) + (FT Rate × 0.15)**

The four inputs sit on different scales, so this index is a convenience for ranking teams rather than a calibrated measure. It is not the same statistic as Oliver's Offensive Rating, which is points per 100 possessions.

## Team vs Individual Application

### Team-Level Analysis

The Four Factors were originally designed for team analysis and are most powerful when comparing team offensive and defensive performance. Teams should:

- **Maximize offensive eFG%** while minimizing opponent eFG%
- **Minimize turnovers** while forcing opponent turnovers
- **Maximize offensive rebounds** while limiting opponent offensive rebounds
- **Maximize free throw attempts** while minimizing opponent free throws

### Individual-Level Analysis

While the Four Factors were designed for teams, they can be adapted for individual players:

- **eFG%**: Direct measure of individual shooting efficiency
- **TOV%**: Individual turnover rate shows ball-handling ability
- **ORB%**: Individual offensive rebounding contribution
- **FT Rate**: Individual ability to draw fouls

However, ORB% and defensive metrics are more context-dependent for individuals, since they vary with position, role, and time on the court.
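The individual-level adaptation can be sketched as follows. This is a minimal illustration, not a definitive implementation: `PlayerLine` and `player_four_factors` are hypothetical names introduced here, and the individual ORB% uses the common minutes-adjusted form (as published by Basketball-Reference), which scales a player's rebounds by the share of team minutes he was on the court. The sample numbers are made up.

```python
from dataclasses import dataclass


@dataclass
class PlayerLine:
    """Hypothetical container for one player's box-score totals."""
    fgm: float
    tpm: float
    fga: float
    fta: float
    tov: float
    orb: float
    minutes: float


def player_four_factors(p, team_minutes, team_orb, opp_drb):
    """Return the four factors for one player on a 0-1 scale."""
    efg = (p.fgm + 0.5 * p.tpm) / p.fga if p.fga else 0.0

    # Same possession estimate as the team formula, using the player's own totals
    used = p.fga + 0.44 * p.fta + p.tov
    tov = p.tov / used if used else 0.0

    # Rebounds available while the player was on court: scale the team's
    # rebound chances by the fraction of team minutes he played
    chances = p.minutes * (team_orb + opp_drb)
    orb = p.orb * (team_minutes / 5) / chances if chances else 0.0

    ft_rate = p.fta / p.fga if p.fga else 0.0
    return {"eFG%": efg, "TOV%": tov, "ORB%": orb, "FT_Rate": ft_rate}


# Made-up single-game line: 36 minutes in a 240-minute (regulation) team game
player = PlayerLine(fgm=8, tpm=2, fga=16, fta=5, tov=3, orb=2, minutes=36)
result = player_four_factors(player, team_minutes=240, team_orb=10, opp_drb=30)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

The minutes adjustment is what makes individual ORB% context-dependent: the same two offensive rebounds imply a very different rate for a player who was on the floor for 12 minutes than for one who played 36.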
## Predicting Wins

The Four Factors are highly predictive of winning because they capture the essential elements of basketball:

- **Shooting efficiency (eFG%)**: Better shooting = more points per possession
- **Possession management (TOV%)**: Fewer turnovers = more scoring opportunities
- **Possession creation (ORB%)**: More offensive rebounds = additional possessions
- **Free throw generation (FT Rate)**: Free throws are efficient scoring opportunities

### Four Factors Differential

To predict team success, calculate the differential between a team's own Four Factors and those of its opponents:

- **eFG% Differential** = Team eFG% - Opponent eFG%
- **TOV% Differential** = Opponent TOV% - Team TOV% (note the reversal, since lower is better)
- **ORB% Differential** = Team ORB% - Opponent ORB%
- **FT Rate Differential** = Team FT Rate - Opponent FT Rate

Teams that excel in these differentials consistently win more games. Regression analyses of the offensive and defensive Four Factors are often cited as explaining roughly 90% of the variance in team winning percentage, though the exact figure depends on the seasons and league studied.

## Python Implementation

### Calculating Four Factors

```python
import pandas as pd


class FourFactors:
    """Calculate Dean Oliver's Four Factors for basketball analysis."""

    @staticmethod
    def effective_fg_percentage(fgm, tpm, fga):
        """
        Calculate Effective Field Goal Percentage.

        Parameters:
        -----------
        fgm : int or float
            Field goals made
        tpm : int or float
            Three-pointers made
        fga : int or float
            Field goal attempts

        Returns:
        --------
        float : eFG% (0-1 scale)
        """
        if fga == 0:
            return 0.0
        return (fgm + 0.5 * tpm) / fga

    @staticmethod
    def turnover_rate(tov, fga, fta):
        """
        Calculate Turnover Rate.

        Parameters:
        -----------
        tov : int or float
            Turnovers
        fga : int or float
            Field goal attempts
        fta : int or float
            Free throw attempts

        Returns:
        --------
        float : TOV% (0-1 scale)
        """
        # Estimated possessions used; 0.44 approximates possession-ending free throws
        possessions = fga + 0.44 * fta + tov
        if possessions == 0:
            return 0.0
        return tov / possessions

    @staticmethod
    def offensive_rebound_percentage(orb, opp_drb):
        """
        Calculate Offensive Rebound Percentage.
        Parameters:
        -----------
        orb : int or float
            Offensive rebounds
        opp_drb : int or float
            Opponent defensive rebounds

        Returns:
        --------
        float : ORB% (0-1 scale)
        """
        total_rebounds = orb + opp_drb
        if total_rebounds == 0:
            return 0.0
        return orb / total_rebounds

    @staticmethod
    def free_throw_rate(fta, fga):
        """
        Calculate Free Throw Rate.

        Parameters:
        -----------
        fta : int or float
            Free throw attempts
        fga : int or float
            Field goal attempts

        Returns:
        --------
        float : FT Rate
        """
        if fga == 0:
            return 0.0
        return fta / fga

    @staticmethod
    def calculate_all_factors(stats_dict):
        """
        Calculate all four factors from a stats dictionary.

        Parameters:
        -----------
        stats_dict : dict
            Dictionary with keys: fgm, tpm, fga, tov, fta, orb, opp_drb

        Returns:
        --------
        dict : Dictionary with all four factors
        """
        return {
            'eFG%': FourFactors.effective_fg_percentage(
                stats_dict['fgm'], stats_dict['tpm'], stats_dict['fga']
            ),
            'TOV%': FourFactors.turnover_rate(
                stats_dict['tov'], stats_dict['fga'], stats_dict['fta']
            ),
            'ORB%': FourFactors.offensive_rebound_percentage(
                stats_dict['orb'], stats_dict['opp_drb']
            ),
            'FT_Rate': FourFactors.free_throw_rate(
                stats_dict['fta'], stats_dict['fga']
            )
        }

    @staticmethod
    def weighted_rating(efg, tov, orb, ft_rate):
        """
        Calculate weighted composite rating using Oliver's weights.
        Parameters:
        -----------
        efg : float
            Effective field goal percentage (0-1)
        tov : float
            Turnover rate (0-1)
        orb : float
            Offensive rebound percentage (0-1)
        ft_rate : float
            Free throw rate

        Returns:
        --------
        float : Weighted rating (0-1 scale)
        """
        return (efg * 0.40) + ((1 - tov) * 0.25) + (orb * 0.20) + (ft_rate * 0.15)


# Example usage with team data
def analyze_team_performance():
    """Example of analyzing team performance using Four Factors."""
    # Sample team statistics for a season
    team_stats = {
        'fgm': 3250, 'tpm': 850, 'fga': 7200,
        'tov': 1150, 'fta': 1800,
        'orb': 920, 'opp_drb': 2450
    }

    # Calculate four factors
    factors = FourFactors.calculate_all_factors(team_stats)

    print("Team Four Factors Analysis")
    print("=" * 50)
    print(f"Effective FG%: {factors['eFG%']:.3f} ({factors['eFG%']*100:.1f}%)")
    print(f"Turnover Rate: {factors['TOV%']:.3f} ({factors['TOV%']*100:.1f}%)")
    print(f"Off. Rebound%: {factors['ORB%']:.3f} ({factors['ORB%']*100:.1f}%)")
    print(f"FT Rate: {factors['FT_Rate']:.3f}")
    print("=" * 50)

    # Calculate weighted rating
    rating = FourFactors.weighted_rating(
        factors['eFG%'], factors['TOV%'], factors['ORB%'], factors['FT_Rate']
    )
    print(f"Weighted Composite Rating: {rating:.3f}")

    return factors


# Analyze multiple teams
def compare_teams():
    """Compare multiple teams using Four Factors."""
    teams_data = {
        'Team A': {'fgm': 3250, 'tpm': 850, 'fga': 7200, 'tov': 1150,
                   'fta': 1800, 'orb': 920, 'opp_drb': 2450},
        'Team B': {'fgm': 3100, 'tpm': 920, 'fga': 7100, 'tov': 1050,
                   'fta': 1900, 'orb': 880, 'opp_drb': 2500},
        'Team C': {'fgm': 3300, 'tpm': 780, 'fga': 7300, 'tov': 1250,
                   'fta': 1750, 'orb': 950, 'opp_drb': 2400}
    }

    results = []
    for team_name, stats in teams_data.items():
        factors = FourFactors.calculate_all_factors(stats)
        factors['Team'] = team_name
        factors['Rating'] = FourFactors.weighted_rating(
            factors['eFG%'], factors['TOV%'], factors['ORB%'], factors['FT_Rate']
        )
        results.append(factors)

    df = pd.DataFrame(results)
    # Put the team name first, then the factors and the composite rating
    df = df[['Team', 'eFG%', 'TOV%', 'ORB%',
             'FT_Rate', 'Rating']]
    df = df.sort_values('Rating', ascending=False)

    print("\nTeam Comparison")
    print(df.to_string(index=False))

    return df


if __name__ == "__main__":
    analyze_team_performance()
    compare_teams()
```

### Visualization

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# These functions reuse the FourFactors class defined in the previous block.


def visualize_four_factors(team_data_dict):
    """
    Create comprehensive visualizations of Four Factors.

    Parameters:
    -----------
    team_data_dict : dict
        Dictionary mapping team names to their stats
    """
    # Calculate factors for all teams
    teams_factors = []
    for team_name, stats in team_data_dict.items():
        factors = FourFactors.calculate_all_factors(stats)
        factors['Team'] = team_name
        teams_factors.append(factors)

    df = pd.DataFrame(teams_factors)

    # Create figure with subplots
    fig, axes = plt.subplots(2, 2, figsize=(14, 10))
    fig.suptitle("Dean Oliver's Four Factors Analysis", fontsize=16, fontweight='bold')

    # One panel per factor: (column, x-axis label, panel title, bar color)
    panels = [
        ('eFG%', 'eFG%', 'Effective Field Goal % (Weight: 40%)', 'steelblue'),
        ('TOV%', 'TOV%', 'Turnover Rate (Weight: 25%) - Lower is Better', 'coral'),
        ('ORB%', 'ORB%', 'Offensive Rebound % (Weight: 20%)', 'forestgreen'),
        ('FT_Rate', 'FT Rate', 'Free Throw Rate (Weight: 15%)', 'mediumpurple'),
    ]

    for ax, (column, xlabel, title, color) in zip(axes.flat, panels):
        bars = ax.barh(df['Team'], df[column], color=color)
        ax.set_xlabel(xlabel, fontweight='bold')
        ax.set_title(title, fontweight='bold')
        ax.set_xlim(0, df[column].max() * 1.1)
        # Label each bar with its value
        for bar in bars:
            width = bar.get_width()
            ax.text(width, bar.get_y() + bar.get_height() / 2,
                    f'{width:.3f}', ha='left', va='center', fontsize=9)

    plt.tight_layout()
    plt.savefig('four_factors_comparison.png', dpi=300, bbox_inches='tight')
    plt.show()


def radar_chart_comparison(team_data_dict):
    """
    Create radar chart comparing teams on Four Factors.
    Parameters:
    -----------
    team_data_dict : dict
        Dictionary mapping team names to their stats
    """
    # Calculate factors
    teams_factors = []
    for team_name, stats in team_data_dict.items():
        factors = FourFactors.calculate_all_factors(stats)
        # Invert TOV% so that higher is better on every axis
        factors['TOV%_norm'] = 1 - factors['TOV%']
        factors['Team'] = team_name
        teams_factors.append(factors)

    # Set up radar chart
    categories = ['eFG%', 'Ball Security\n(1-TOV%)', 'ORB%', 'FT Rate']
    num_vars = len(categories)

    # Compute angle for each axis
    angles = np.linspace(0, 2 * np.pi, num_vars, endpoint=False).tolist()
    angles += angles[:1]  # Complete the circle

    fig, ax = plt.subplots(figsize=(10, 10), subplot_kw=dict(projection='polar'))

    colors = ['steelblue', 'coral', 'forestgreen', 'mediumpurple', 'gold']

    for idx, team_factors in enumerate(teams_factors):
        values = [
            team_factors['eFG%'],
            team_factors['TOV%_norm'],
            team_factors['ORB%'],
            team_factors['FT_Rate']
        ]
        values += values[:1]  # Complete the circle

        ax.plot(angles, values, 'o-', linewidth=2,
                label=team_factors['Team'], color=colors[idx % len(colors)])
        ax.fill(angles, values, alpha=0.15, color=colors[idx % len(colors)])

    ax.set_xticks(angles[:-1])
    ax.set_xticklabels(categories, size=11)
    # 1 - TOV% is typically above 0.8, so the radial axis must reach 1.0
    ax.set_ylim(0, 1.0)
    ax.set_title("Four Factors Radar Comparison", size=16, fontweight='bold', pad=20)
    ax.legend(loc='upper right', bbox_to_anchor=(1.3, 1.1))
    ax.grid(True)

    plt.tight_layout()
    plt.savefig('four_factors_radar.png', dpi=300, bbox_inches='tight')
    plt.show()


def correlation_with_wins(historical_data):
    """
    Visualize correlation between Four Factors and winning percentage.
    Parameters:
    -----------
    historical_data : pd.DataFrame
        DataFrame with columns: Team, eFG%, TOV%, ORB%, FT_Rate, Win%
    """
    fig, axes = plt.subplots(2, 2, figsize=(14, 10))
    fig.suptitle("Four Factors Correlation with Winning Percentage",
                 fontsize=16, fontweight='bold')

    factors_info = [
        ('eFG%', 'Effective FG%', 'steelblue', axes[0, 0]),
        ('TOV%', 'Turnover Rate', 'coral', axes[0, 1]),
        ('ORB%', 'Offensive Rebound%', 'forestgreen', axes[1, 0]),
        ('FT_Rate', 'Free Throw Rate', 'mediumpurple', axes[1, 1])
    ]

    for factor, title, color, ax in factors_info:
        ax.scatter(historical_data[factor], historical_data['Win%'],
                   alpha=0.6, s=100, color=color)

        # Add trend line (least-squares fit of degree 1)
        z = np.polyfit(historical_data[factor], historical_data['Win%'], 1)
        p = np.poly1d(z)
        x_line = np.linspace(historical_data[factor].min(),
                             historical_data[factor].max(), 100)
        ax.plot(x_line, p(x_line), "r--", alpha=0.8, linewidth=2)

        # Calculate correlation
        corr = historical_data[factor].corr(historical_data['Win%'])

        ax.set_xlabel(title, fontweight='bold', fontsize=11)
        ax.set_ylabel('Winning %', fontweight='bold', fontsize=11)
        ax.set_title(f'{title} vs Win% (r = {corr:.3f})', fontweight='bold')
        ax.grid(True, alpha=0.3)

    plt.tight_layout()
    plt.savefig('four_factors_win_correlation.png', dpi=300, bbox_inches='tight')
    plt.show()


# Example usage
if __name__ == "__main__":
    # Sample data
    teams_data = {
        'Team A': {'fgm': 3250, 'tpm': 850, 'fga': 7200, 'tov': 1150,
                   'fta': 1800, 'orb': 920, 'opp_drb': 2450},
        'Team B': {'fgm': 3100, 'tpm': 920, 'fga': 7100, 'tov': 1050,
                   'fta': 1900, 'orb': 880, 'opp_drb': 2500},
        'Team C': {'fgm': 3300, 'tpm': 780, 'fga': 7300, 'tov': 1250,
                   'fta': 1750, 'orb': 950, 'opp_drb': 2400}
    }

    visualize_four_factors(teams_data)
    radar_chart_comparison(teams_data)

    # Simulated historical data for correlation
    np.random.seed(42)
    n_teams = 30
    historical_df = pd.DataFrame({
        'Team': [f'Team {i+1}' for i in range(n_teams)],
        'eFG%': np.random.normal(0.52, 0.03, n_teams),
        'TOV%': np.random.normal(0.14, 0.02, n_teams),
        'ORB%': np.random.normal(0.27, 0.03, n_teams),
        'FT_Rate': np.random.normal(0.25, 0.04, n_teams)
    })

    # Simulate winning percentage based on factors
    historical_df['Win%'] = (
        historical_df['eFG%'] * 0.40 +
        (1 - historical_df['TOV%']) * 0.25 +
        historical_df['ORB%'] * 0.20 +
        historical_df['FT_Rate'] * 0.15 +
        np.random.normal(0, 0.05, n_teams)
    )
    historical_df['Win%'] = historical_df['Win%'].clip(0.2, 0.8)

    correlation_with_wins(historical_df)
```

## R Implementation

### Calculating Four Factors

```r
# Four Factors Calculator in R

# Function to calculate Effective Field Goal Percentage
calculate_efg <- function(fgm, tpm, fga) {
  if (fga == 0) return(0)
  return((fgm + 0.5 * tpm) / fga)
}

# Function to calculate Turnover Rate
calculate_tov_rate <- function(tov, fga, fta) {
  possessions <- fga + 0.44 * fta + tov
  if (possessions == 0) return(0)
  return(tov / possessions)
}

# Function to calculate Offensive Rebound Percentage
calculate_orb_pct <- function(orb, opp_drb) {
  total_rebounds <- orb + opp_drb
  if (total_rebounds == 0) return(0)
  return(orb / total_rebounds)
}

# Function to calculate Free Throw Rate
calculate_ft_rate <- function(fta, fga) {
  if (fga == 0) return(0)
  return(fta / fga)
}

# Calculate all four factors
calculate_four_factors <- function(fgm, tpm, fga, tov, fta, orb, opp_drb) {
  list(
    eFG = calculate_efg(fgm, tpm, fga),
    TOV = calculate_tov_rate(tov, fga, fta),
    ORB = calculate_orb_pct(orb, opp_drb),
    FT_Rate = calculate_ft_rate(fta, fga)
  )
}

# Calculate weighted rating
calculate_weighted_rating <- function(efg, tov, orb, ft_rate) {
  (efg * 0.40) + ((1 - tov) * 0.25) + (orb * 0.20) + (ft_rate * 0.15)
}

# Example: Analyze team performance
analyze_team <- function() {
  # Sample team statistics
  team_stats <- list(
    fgm = 3250, tpm = 850, fga = 7200,
    tov = 1150, fta = 1800,
    orb = 920, opp_drb = 2450
  )

  # Calculate four factors
  factors <- calculate_four_factors(
    team_stats$fgm, team_stats$tpm, team_stats$fga,
    team_stats$tov, team_stats$fta,
    team_stats$orb, team_stats$opp_drb
  )
  cat("Team Four Factors Analysis\n")
  cat(rep("=", 50), "\n", sep = "")
  cat(sprintf("Effective FG%%: %.3f (%.1f%%)\n", factors$eFG, factors$eFG * 100))
  cat(sprintf("Turnover Rate: %.3f (%.1f%%)\n", factors$TOV, factors$TOV * 100))
  cat(sprintf("Off. Rebound%%: %.3f (%.1f%%)\n", factors$ORB, factors$ORB * 100))
  cat(sprintf("FT Rate: %.3f\n", factors$FT_Rate))
  cat(rep("=", 50), "\n", sep = "")

  # Calculate weighted rating
  rating <- calculate_weighted_rating(
    factors$eFG, factors$TOV, factors$ORB, factors$FT_Rate
  )
  cat(sprintf("Weighted Composite Rating: %.3f\n", rating))

  return(factors)
}

# Compare multiple teams
compare_teams <- function() {
  library(dplyr)

  # Team data
  teams <- data.frame(
    Team = c("Team A", "Team B", "Team C"),
    fgm = c(3250, 3100, 3300),
    tpm = c(850, 920, 780),
    fga = c(7200, 7100, 7300),
    tov = c(1150, 1050, 1250),
    fta = c(1800, 1900, 1750),
    orb = c(920, 880, 950),
    opp_drb = c(2450, 2500, 2400)
  )

  # Calculate four factors for each team.
  # rowwise() is needed because the helper functions test scalars with if().
  teams <- teams %>%
    rowwise() %>%
    mutate(
      eFG = calculate_efg(fgm, tpm, fga),
      TOV = calculate_tov_rate(tov, fga, fta),
      ORB = calculate_orb_pct(orb, opp_drb),
      FT_Rate = calculate_ft_rate(fta, fga),
      Rating = calculate_weighted_rating(eFG, TOV, ORB, FT_Rate)
    ) %>%
    ungroup() %>%
    select(Team, eFG, TOV, ORB, FT_Rate, Rating) %>%
    arrange(desc(Rating))

  cat("\nTeam Comparison\n")
  # Convert to a plain data frame so row.names = FALSE is honored when printing
  print(as.data.frame(teams), row.names = FALSE)

  return(teams)
}

# Run analysis
analyze_team()
compare_teams()
```

### Visualization in R

```r
library(ggplot2)
library(tidyr)
library(dplyr)
library(gridExtra)
library(fmsb)

# Visualize four factors comparison
visualize_four_factors <- function(teams_df) {
  # Prepare data in long format
  teams_long <- teams_df %>%
    select(Team, eFG, TOV, ORB, FT_Rate) %>%
    pivot_longer(cols = -Team, names_to = "Factor", values_to = "Value")

  # Create factor labels with weights
  factor_labels <- c(
    "eFG" = "Effective FG%\n(Weight: 40%)",
    "TOV" = "Turnover Rate\n(Weight: 25%)",
    "ORB" = "Off. Rebound%\n(Weight: 20%)",
    "FT_Rate" = "FT Rate\n(Weight: 15%)"
  )

  teams_long$Factor <- factor(teams_long$Factor,
                              levels = c("eFG", "TOV", "ORB", "FT_Rate"))

  # Create grouped bar chart
  p1 <- ggplot(teams_long, aes(x = Factor, y = Value, fill = Team)) +
    geom_bar(stat = "identity", position = "dodge", width = 0.7) +
    geom_text(aes(label = sprintf("%.3f", Value)),
              position = position_dodge(width = 0.7),
              vjust = -0.5, size = 3) +
    scale_x_discrete(labels = factor_labels) +
    scale_fill_manual(values = c("Team A" = "steelblue",
                                 "Team B" = "coral",
                                 "Team C" = "forestgreen")) +
    labs(title = "Dean Oliver's Four Factors Comparison",
         x = "", y = "Value", fill = "Team") +
    theme_minimal() +
    theme(plot.title = element_text(hjust = 0.5, face = "bold", size = 16),
          axis.text.x = element_text(face = "bold", size = 10),
          legend.position = "bottom")

  ggsave("four_factors_comparison_r.png", p1, width = 12, height = 7, dpi = 300)
  print(p1)

  return(p1)
}

# Create faceted comparison
faceted_comparison <- function(teams_df) {
  teams_long <- teams_df %>%
    select(Team, eFG, TOV, ORB, FT_Rate) %>%
    pivot_longer(cols = -Team, names_to = "Factor", values_to = "Value")

  # Create separate plot for each factor
  p <- ggplot(teams_long, aes(x = Team, y = Value, fill = Team)) +
    geom_bar(stat = "identity") +
    geom_text(aes(label = sprintf("%.3f", Value)),
              vjust = -0.5, size = 3.5, fontface = "bold") +
    facet_wrap(~ Factor, scales = "free_y", nrow = 2,
               labeller = labeller(Factor = c(
                 "eFG" = "Effective FG% (40%)",
                 "TOV" = "Turnover Rate (25%)",
                 "ORB" = "Off. Rebound% (20%)",
                 "FT_Rate" = "FT Rate (15%)"
               ))) +
    scale_fill_manual(values = c("Team A" = "steelblue",
                                 "Team B" = "coral",
                                 "Team C" = "forestgreen")) +
    labs(title = "Four Factors Analysis by Team", x = "", y = "Value") +
    theme_minimal() +
    theme(plot.title = element_text(hjust = 0.5, face = "bold", size = 16),
          strip.text = element_text(face = "bold", size = 11),
          legend.position = "none",
          axis.text.x = element_text(angle = 45, hjust = 1))

  ggsave("four_factors_faceted_r.png", p, width = 12, height = 9, dpi = 300)
  print(p)

  return(p)
}

# Radar chart comparison
radar_chart_teams <- function(teams_df) {
  # Prepare data for radar chart
  radar_data <- teams_df %>%
    ungroup() %>%
    mutate(TOV_norm = 1 - TOV) %>%  # Invert TOV so higher is better
    select(Team, eFG, TOV_norm, ORB, FT_Rate)

  # fmsb expects one row per team and one column per variable,
  # so the data is not transposed
  radar_df <- as.data.frame(radar_data[, -1])
  rownames(radar_df) <- radar_data$Team

  # Add max and min rows (required by fmsb as rows 1 and 2).
  # The maximum is 1 because 1 - TOV% is typically above 0.8.
  radar_df <- rbind(
    max = rep(1, ncol(radar_df)),
    min = rep(0, ncol(radar_df)),
    radar_df
  )

  # Set up colors
  colors <- c("steelblue", "coral", "forestgreen")

  # Create radar chart
  png("four_factors_radar_r.png", width = 800, height = 800, res = 120)

  radarchart(
    radar_df,
    axistype = 1,
    pcol = colors,
    pfcol = adjustcolor(colors, alpha.f = 0.2),
    plwd = 3,
    plty = 1,
    cglcol = "grey",
    cglty = 1,
    axislabcol = "grey30",
    caxislabels = seq(0, 1, 0.25),
    cglwd = 1.5,
    vlcex = 1.1,
    title = "Four Factors Radar Comparison"
  )

  # Add legend
  legend(
    x = 0.9, y = 1.2,
    legend = radar_data$Team,
    bty = "n", pch = 20, col = colors,
    text.col = "black", cex = 1.1, pt.cex = 2
  )

  dev.off()
}

# Correlation with winning percentage
correlation_analysis <- function() {
  # Generate simulated data
  set.seed(42)
  n_teams <- 30

  historical_data <- data.frame(
    Team = paste0("Team ", 1:n_teams),
    eFG = rnorm(n_teams, 0.52, 0.03),
    TOV = rnorm(n_teams, 0.14, 0.02),
    ORB = rnorm(n_teams, 0.27, 0.03),
    FT_Rate = rnorm(n_teams, 0.25, 0.04)
  )

  # Simulate winning percentage
  # based on factors
  historical_data <- historical_data %>%
    mutate(
      Win_Pct = eFG * 0.40 + (1 - TOV) * 0.25 + ORB * 0.20 + FT_Rate * 0.15 +
        rnorm(n_teams, 0, 0.05),
      Win_Pct = pmax(0.2, pmin(0.8, Win_Pct))  # Clip between 0.2 and 0.8
    )

  # Build one scatter plot with a linear trend line for a given factor column
  make_plot <- function(col, axis_label, title_label, color) {
    r <- cor(historical_data[[col]], historical_data$Win_Pct)
    ggplot(historical_data, aes(x = .data[[col]], y = Win_Pct)) +
      geom_point(color = color, size = 3, alpha = 0.7) +
      geom_smooth(method = "lm", formula = y ~ x, color = "red",
                  linetype = "dashed", se = FALSE) +
      labs(title = sprintf("%s vs Win%% (r = %.3f)", title_label, r),
           x = axis_label, y = "Winning %") +
      theme_minimal() +
      theme(plot.title = element_text(face = "bold"))
  }

  # Create correlation plots
  p1 <- make_plot("eFG", "Effective FG%", "eFG%", "steelblue")
  p2 <- make_plot("TOV", "Turnover Rate", "TOV%", "coral")
  p3 <- make_plot("ORB", "Offensive Rebound%", "ORB%", "forestgreen")
  p4 <- make_plot("FT_Rate", "Free Throw Rate", "FT Rate", "mediumpurple")

  # Combine plots
  combined <- grid.arrange(p1, p2, p3, p4, ncol = 2,
                           top = "Four Factors Correlation with Winning Percentage")

  ggsave("four_factors_correlation_r.png", combined,
         width = 12, height = 10, dpi = 300)

  return(historical_data)
}

# Example usage
teams_df <- compare_teams()
visualize_four_factors(teams_df)
faceted_comparison(teams_df)
radar_chart_teams(teams_df)
correlation_analysis()
```

## Advanced Applications

### Defensive Four Factors

The same framework applies to defense, but from the opponent's perspective:

- **Opponent eFG%**: Lower is better (limiting opponent shooting efficiency)
- **Opponent TOV%**: Higher is better (forcing turnovers)
- **Opponent ORB%** (or Defensive Rebound%): Lower is better (limiting second chances)
- **Opponent FT Rate**: Lower is better (avoiding fouls)

### Four Factors Differential Model

```python
import numpy as np


def predict_win_probability(team_off_factors, team_def_factors,
                            opp_off_factors, opp_def_factors):
    """
    Predict win probability using Four Factors differentials.

    Each argument is a dict with keys 'eFG%', 'TOV%', 'ORB%', 'FT_Rate'.
    The *_def_factors dicts hold the values a team allows its opponents.

    Returns probability between 0 and 1.
    """
    # Calculate offensive vs opponent defense
    off_diff = (
        (team_off_factors['eFG%'] - opp_def_factors['eFG%']) * 0.40 +
        (opp_def_factors['TOV%'] - team_off_factors['TOV%']) * 0.25 +
        (team_off_factors['ORB%'] - opp_def_factors['ORB%']) * 0.20 +
        (team_off_factors['FT_Rate'] - opp_def_factors['FT_Rate']) * 0.15
    )

    # Calculate defense vs opponent offense (positive values favor the opponent)
    def_diff = (
        (opp_off_factors['eFG%'] - team_def_factors['eFG%']) * 0.40 +
        (team_def_factors['TOV%'] - opp_off_factors['TOV%']) * 0.25 +
        (opp_off_factors['ORB%'] - team_def_factors['ORB%']) * 0.20 +
        (opp_off_factors['FT_Rate'] - team_def_factors['FT_Rate']) * 0.15
    )

    # Net advantage
    net_advantage = off_diff - def_diff

    # Convert to probability using logistic function.
    # The steepness of 10 is an illustrative, uncalibrated choice;
    # fit it to real game results before relying on the output.
    win_prob = 1 / (1 + np.exp(-10 * net_advantage))

    return win_prob
```

## Key Insights

1. **eFG% dominates**: At 40% weight, shooting efficiency is the most important factor
2. **Defense matters equally**: The Four Factors apply symmetrically to defense
3. **Context dependent**: Weights may vary by era, league, and playing style
4. **Not independent**: The factors can interact (e.g., turnovers reduce FGA)
5. **Player vs Team**: Individual factors need positional context; team factors are more robust

## References

- Oliver, D. (2004). *Basketball on Paper: Rules and Tools for Performance Analysis*.
- NBA Advanced Stats (stats.nba.com)
- Basketball-Reference.com Four Factors pages
- Cleaning the Glass (cleaningtheglass.com) for modern Four Factors analysis
