Expected Assists (xA) and Key Pass Analysis

Beginner | 10 min read | Nov 27, 2025
## Overview

Expected Assists (xA) quantifies the quality of passes that lead to shots. It measures the likelihood that a pass will result in a goal, based on the xG value of the shot that follows the pass. This metric helps identify creative players and elite playmakers.

## Key Components

### Pass Quality Factors

- **End Location**: Where the pass ends (closer to goal = higher xA)
- **Pass Type**: Through ball, cross, cutback, etc.
- **Defensive Pressure**: Number of defenders bypassed
- **Speed of Play**: Quick transitions vs. patient buildup

### Key Pass Types

1. **Through Balls**: Passes that split the defense and run in behind the defensive line
2. **Crosses**: Delivery from wide areas
3. **Cutbacks**: Passes pulled back across the box
4. **Set Pieces**: Corners, free kicks

## Python Implementation

```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
from sklearn.preprocessing import StandardScaler

# Sample pass data leading to shots
pass_data = pd.DataFrame({
    'passer': ['Player A', 'Player A', 'Player B', 'Player C', 'Player B',
               'Player A', 'Player C', 'Player B', 'Player A', 'Player C'],
    'pass_type': ['through_ball', 'cross', 'through_ball', 'cutback', 'cross',
                  'through_ball', 'cross', 'cutback', 'cutback', 'through_ball'],
    'pass_distance': [25, 30, 20, 15, 35, 22, 28, 18, 16, 24],
    'defenders_bypassed': [3, 2, 4, 2, 1, 3, 2, 3, 2, 4],
    'end_distance_to_goal': [12, 8, 10, 6, 10, 14, 9, 7, 8, 11],
    'shot_xg': [0.15, 0.42, 0.25, 0.65, 0.22, 0.18, 0.38, 0.55, 0.48, 0.28],
    'goal_scored': [0, 1, 0, 1, 0, 0, 0, 1, 1, 0]
})

# Calculate xA (Expected Assists)
# xA equals the xG value of the shot that follows the pass
pass_data['xA'] = pass_data['shot_xg']

# Aggregate player-level statistics
player_xa_stats = pass_data.groupby('passer').agg({
    'xA': 'sum',
    'goal_scored': 'sum',
    'pass_type': 'count'
}).reset_index()

player_xa_stats.columns = ['Player',
                           'Total_xA', 'Actual_Assists', 'Key_Passes']
player_xa_stats['xA_Performance'] = (
    player_xa_stats['Actual_Assists'] - player_xa_stats['Total_xA']
)
player_xa_stats['xA_per_KeyPass'] = (
    player_xa_stats['Total_xA'] / player_xa_stats['Key_Passes']
)

print("Player xA Performance:")
print(player_xa_stats.round(3))

# Analyze pass types
pass_type_analysis = pass_data.groupby('pass_type').agg({
    'xA': ['mean', 'sum', 'count'],
    'goal_scored': 'sum'
}).reset_index()

pass_type_analysis.columns = ['Pass_Type', 'Avg_xA', 'Total_xA', 'Count', 'Assists']
pass_type_analysis['Conversion_Rate'] = (
    pass_type_analysis['Assists'] / pass_type_analysis['Count']
)

print("\nPass Type Analysis:")
print(pass_type_analysis.round(3))

# Advanced playmaker metrics
def calculate_playmaker_metrics(player_data):
    """
    Calculate comprehensive playmaker evaluation metrics
    """
    metrics = {
        'total_xa': player_data['xA'].sum(),
        'key_passes': len(player_data),
        'avg_xa_per_pass': player_data['xA'].mean(),
        'big_chances_created': (player_data['shot_xg'] > 0.35).sum(),
        'avg_defenders_bypassed': player_data['defenders_bypassed'].mean(),
        'through_ball_rate': (player_data['pass_type'] == 'through_ball').sum() / len(player_data)
    }
    return metrics

# Calculate metrics for each player
playmaker_profiles = {}
for player in pass_data['passer'].unique():
    player_passes = pass_data[pass_data['passer'] == player]
    playmaker_profiles[player] = calculate_playmaker_metrics(player_passes)

playmaker_df = pd.DataFrame(playmaker_profiles).T
print("\nPlaymaker Profiles:")
print(playmaker_df.round(3))

# Visualize xA vs Actual Assists
fig, axes = plt.subplots(1, 2, figsize=(14, 6))

# Plot 1: xA vs Actual Assists
ax1 = axes[0]
ax1.scatter(player_xa_stats['Total_xA'], player_xa_stats['Actual_Assists'],
            s=player_xa_stats['Key_Passes'] * 20, alpha=0.6, c='steelblue')

# Add diagonal line
max_val = max(player_xa_stats['Total_xA'].max(),
              player_xa_stats['Actual_Assists'].max())
ax1.plot([0, max_val], [0, max_val], 'r--', alpha=0.5,
         label='Expected')

for idx, row in player_xa_stats.iterrows():
    ax1.annotate(row['Player'], (row['Total_xA'], row['Actual_Assists']),
                 fontsize=9, alpha=0.7)

ax1.set_xlabel('Expected Assists (xA)')
ax1.set_ylabel('Actual Assists')
ax1.set_title('xA vs Actual Assists by Player')
ax1.legend()
ax1.grid(alpha=0.3)

# Plot 2: Pass Type Distribution
ax2 = axes[1]
pass_type_summary = pass_data.groupby('pass_type')['xA'].sum().sort_values()
colors = plt.cm.viridis(np.linspace(0.3, 0.9, len(pass_type_summary)))
pass_type_summary.plot(kind='barh', ax=ax2, color=colors)
ax2.set_xlabel('Total xA')
ax2.set_title('Total xA by Pass Type')
ax2.grid(axis='x', alpha=0.3)

plt.tight_layout()
plt.savefig('xa_analysis.png', dpi=300, bbox_inches='tight')
plt.show()

# Creative player ranking
def rank_playmakers(xa_stats, playmaker_metrics):
    """
    Create composite playmaker ranking
    """
    # Note: playmaker_metrics is accepted but not used by the score below
    # Normalize metrics (ravel() flattens the (n, 1) scaler output to 1-D)
    scaler = StandardScaler()
    ranking_df = xa_stats.copy()
    ranking_df['norm_xa'] = scaler.fit_transform(xa_stats[['Total_xA']]).ravel()
    ranking_df['norm_xa_per_pass'] = scaler.fit_transform(
        xa_stats[['xA_per_KeyPass']]
    ).ravel()

    # Composite score: equal weight on volume and per-pass quality
    ranking_df['Playmaker_Score'] = (
        ranking_df['norm_xa'] * 0.5 +
        ranking_df['norm_xa_per_pass'] * 0.5
    )
    return ranking_df.sort_values('Playmaker_Score', ascending=False)

playmaker_rankings = rank_playmakers(player_xa_stats, playmaker_df)
print("\nPlaymaker Rankings:")
print(playmaker_rankings[['Player', 'Total_xA', 'xA_per_KeyPass', 'Playmaker_Score']].round(3))

# Key pass zone analysis
def create_key_pass_zones(pass_data):
    """
    Analyze which zones produce highest xA
    """
    # Create pitch zones on a copy so the caller's DataFrame is not modified
    zoned = pass_data.copy()
    zoned['zone'] = pd.cut(
        zoned['end_distance_to_goal'],
        bins=[0, 6, 12, 18, 30],
        labels=['Box_6yd', 'Box_inner', 'Box_outer', 'Outside_box']
    )

    # observed=True keeps only the zones that actually contain passes
    zone_analysis = zoned.groupby('zone', observed=True).agg({
        'xA': ['mean', 'sum', 'count'],
        'goal_scored': 'sum'
    }).reset_index()
    return zone_analysis

zone_stats = create_key_pass_zones(pass_data)
print("\nKey Pass Zone Analysis:")
print(zone_stats)
```

## R Implementation

```r
library(tidyverse)
library(ggplot2)
library(scales)
library(plotly)

# Sample pass data leading to shots
pass_data <- data.frame(
  passer = c("Player A", "Player A", "Player B", "Player C", "Player B",
             "Player A", "Player C", "Player B", "Player A", "Player C"),
  pass_type = c("through_ball", "cross", "through_ball", "cutback", "cross",
                "through_ball", "cross", "cutback", "cutback", "through_ball"),
  pass_distance = c(25, 30, 20, 15, 35, 22, 28, 18, 16, 24),
  defenders_bypassed = c(3, 2, 4, 2, 1, 3, 2, 3, 2, 4),
  end_distance_to_goal = c(12, 8, 10, 6, 10, 14, 9, 7, 8, 11),
  shot_xg = c(0.15, 0.42, 0.25, 0.65, 0.22, 0.18, 0.38, 0.55, 0.48, 0.28),
  goal_scored = c(0, 1, 0, 1, 0, 0, 0, 1, 1, 0)
)

# Calculate xA (equals the xG of the shot that follows)
pass_data <- pass_data %>%
  mutate(xA = shot_xg)

# Aggregate player-level statistics
player_xa_stats <- pass_data %>%
  group_by(passer) %>%
  summarise(
    Total_xA = sum(xA),
    Actual_Assists = sum(goal_scored),
    Key_Passes = n()
  ) %>%
  mutate(
    xA_Performance = Actual_Assists - Total_xA,
    xA_per_KeyPass = Total_xA / Key_Passes
  )

print("Player xA Performance:")
print(player_xa_stats)

# Analyze pass types
pass_type_analysis <- pass_data %>%
  group_by(pass_type) %>%
  summarise(
    Avg_xA = mean(xA),
    Total_xA = sum(xA),
    Count = n(),
    Assists = sum(goal_scored)
  ) %>%
  mutate(Conversion_Rate = Assists / Count)

print("Pass Type Analysis:")
print(pass_type_analysis)

# Advanced playmaker metrics
calculate_playmaker_metrics <- function(player_passes) {
  tibble(
    total_xa = sum(player_passes$xA),
    key_passes = nrow(player_passes),
    avg_xa_per_pass = mean(player_passes$xA),
    big_chances_created = sum(player_passes$shot_xg > 0.35),
    avg_defenders_bypassed = mean(player_passes$defenders_bypassed),
    through_ball_rate = sum(player_passes$pass_type == "through_ball") / nrow(player_passes)
  )
}

# Calculate metrics for each player
# (group_modify() is the current dplyr replacement for the superseded do())
playmaker_profiles <- pass_data %>%
  group_by(passer) %>%
  group_modify(~ calculate_playmaker_metrics(.x)) %>%
  ungroup()

print("Playmaker Profiles:")
print(playmaker_profiles)

# Visualize xA vs Actual Assists
xa_plot <- ggplot(player_xa_stats,
                  aes(x = Total_xA, y = Actual_Assists,
                      size = Key_Passes, label = passer)) +
  geom_point(alpha = 0.6, color = "steelblue") +
  geom_abline(intercept = 0, slope = 1, linetype = "dashed",
              color = "red", alpha = 0.5) +
  geom_text(vjust = -1, size = 3) +
  labs(
    title = "xA vs Actual Assists by Player",
    x = "Expected Assists (xA)",
    y = "Actual Assists",
    size = "Key Passes"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(hjust = 0.5, size = 14, face = "bold"),
    legend.position = "bottom"
  )

print(xa_plot)

# Pass type distribution
pass_type_plot <- pass_data %>%
  group_by(pass_type) %>%
  summarise(Total_xA = sum(xA)) %>%
  arrange(Total_xA) %>%
  ggplot(aes(x = reorder(pass_type, Total_xA), y = Total_xA, fill = pass_type)) +
  geom_col(alpha = 0.8, show.legend = FALSE) +
  coord_flip() +
  labs(
    title = "Total xA by Pass Type",
    x = "Pass Type",
    y = "Total xA"
  ) +
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.5, size = 14, face = "bold"))

print(pass_type_plot)

# Creative player ranking
rank_playmakers <- function(xa_stats) {
  xa_stats %>%
    mutate(
      norm_xa = scale(Total_xA)[, 1],
      norm_xa_per_pass = scale(xA_per_KeyPass)[, 1],
      Playmaker_Score = norm_xa * 0.5 + norm_xa_per_pass * 0.5
    ) %>%
    arrange(desc(Playmaker_Score))
}

playmaker_rankings <- rank_playmakers(player_xa_stats)
print("Playmaker Rankings:")
print(playmaker_rankings %>%
        select(passer, Total_xA, xA_per_KeyPass, Playmaker_Score))

# Key pass zone analysis
zone_analysis <- pass_data %>%
  mutate(
    zone = cut(
      end_distance_to_goal,
      breaks = c(0, 6, 12, 18, 30),
      labels = c("Box_6yd", "Box_inner", "Box_outer", "Outside_box")
    )
  ) %>%
  group_by(zone) %>%
  summarise(
    Avg_xA = mean(xA),
    Total_xA = sum(xA),
    Count = n(),
    Goals = sum(goal_scored)
  )

print("Key Pass Zone Analysis:")
print(zone_analysis)

# Zone visualization
zone_plot <- ggplot(zone_analysis, aes(x = zone, y = Avg_xA, fill = zone)) +
  geom_col(alpha = 0.8, show.legend = FALSE) +
  geom_text(aes(label = sprintf("%.3f", Avg_xA)), vjust = -0.5, size = 4) +
  labs(
    title = "Average xA by Pitch Zone",
    x = "Zone",
    y = "Average xA"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(hjust = 0.5, size = 14, face = "bold"),
    axis.text.x = element_text(angle = 45, hjust = 1)
  )

print(zone_plot)
```

## Evaluation Metrics

### Player-Level Metrics

1. **Total xA**: Sum of all xA values (volume metric)
2. **xA per 90**: Expected assists per 90 minutes played
3. **xA Performance**: Actual assists minus expected assists
4. **xA per Key Pass**: Average xA per key pass (quality metric for chance creation)
5. **Big Chances Created**: Passes leading to shots with xG > 0.35

### Advanced Metrics

- **Secondary Assists**: The pass before the assist
- **Shot-Creating Actions**: Actions leading to shots
- **Progressive Passes**: Passes that move the ball significantly forward
- **Pass Completion in Final Third**: Accuracy in dangerous areas

## Interpretation

### Overperformance (Assists > xA)

- Playing with clinical finishers
- Creating chances of higher quality than the xG model captures
- May not be sustainable long-term

### Underperformance (Assists < xA)

- Teammates missing good chances
- Unlucky finishing from recipients
- Quality creation not reflected in assist totals

## Use Cases

1. **Talent Identification**: Find undervalued creative players
2. **Team Analysis**: Identify chance creation strengths and weaknesses
3. **Transfer Decisions**: Predict assist output in a new team
4. **Tactical Evaluation**: Assess playmaking patterns and effectiveness

## Best Practices

- Require a minimum sample size of 500-1000 minutes
- Consider position and tactical role
- Combine with progressive passing metrics
- Account for team quality and opponent strength
- Use alongside video analysis for full context
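
The xA per 90 metric and the minimum-minutes guideline above are not covered by the implementations earlier in the article, so here is a minimal sketch of how they might be combined in plain Python. The `PlayerSeason` record, the `per_90` and `summarize` helpers, the `MIN_MINUTES` constant, and all season totals are hypothetical and chosen only for illustration; the 500-minute cut-off is simply the lower bound of the range suggested under Best Practices.

```python
from dataclasses import dataclass

# Assumption: lower bound of the 500-1000 minute guideline
MIN_MINUTES = 500


@dataclass
class PlayerSeason:
    """Hypothetical per-player season totals (illustrative values only)."""
    player: str
    total_xa: float
    assists: int
    minutes: int


def per_90(value: float, minutes: int) -> float:
    """Scale a season total to a per-90-minute rate."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return value * 90 / minutes


def summarize(seasons, min_minutes=MIN_MINUTES):
    """Return per-90 rows for players who meet the minutes threshold,
    sorted by xA per 90 in descending order."""
    rows = []
    for s in seasons:
        if s.minutes < min_minutes:
            continue  # sample too small to rate reliably
        rows.append({
            "player": s.player,
            "xa_per_90": round(per_90(s.total_xa, s.minutes), 3),
            "assists_per_90": round(per_90(s.assists, s.minutes), 3),
            # Positive = more assists than expected, negative = fewer
            "xa_performance": round(s.assists - s.total_xa, 2),
        })
    return sorted(rows, key=lambda r: r["xa_per_90"], reverse=True)


# Made-up season totals for three players
seasons = [
    PlayerSeason("Player A", total_xa=4.5, assists=6, minutes=1800),
    PlayerSeason("Player B", total_xa=3.0, assists=2, minutes=900),
    PlayerSeason("Player C", total_xa=1.2, assists=2, minutes=300),
]

for row in summarize(seasons):
    print(row)
```

With these made-up totals, Player C is dropped for falling below the minutes threshold, and Player B ranks ahead of Player A on xA per 90 despite a lower season total, which is the kind of volume-versus-rate difference the per-90 view is meant to surface.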
