Expected Goals Against (xGA)

Intermediate 10 min read 207 views Nov 25, 2025

Expected Goals Against (xGA) is a defensive performance metric that quantifies the quality and quantity of scoring chances a team concedes. It applies the same expected goals (xG) framework used to evaluate attacking performance to the shots a team allows, estimating how many goals the team would be expected to concede given the quality of those shots. Because it accounts for shot location, type, and context, xGA usually says more about defensive performance than raw counts such as goals conceded or shots allowed. A team that consistently concedes fewer goals than its xGA may have excellent goalkeeping or may be benefiting from luck that is unlikely to last, while a team that concedes more than its xGA may have poor goalkeeping, bad luck, or defensive problems that the shot data does not fully capture.

Key Concepts

Understanding xGA requires examining multiple dimensions of defensive performance and shot quality conceded:

  • Shot Quality Conceded: The expected goal value of each shot allowed, determined by factors including distance from goal, angle, type of shot (header, foot), and preceding actions.
  • xGA Per 90 Minutes: Standardized metric allowing comparison across teams playing different numbers of matches, showing average expected goals conceded per 90-minute period.
  • xGA vs Goals Conceded: The difference between expected and actual goals conceded, revealing whether defensive performance is sustainable or influenced by variance (luck, exceptional goalkeeping).
  • xGA by Zone: Breaking down expected goals conceded by pitch area (penalty box, outside box, six-yard area) to identify defensive weaknesses in specific zones.
  • xGA from Different Situations: Separating expected goals conceded from open play, set pieces, counter-attacks, and other distinct game situations.
  • Shot Location Conceded: Mapping where opponents generate shots from, revealing defensive positioning weaknesses and areas of vulnerability.
  • Defensive xG Prevention: The ability to force opponents into low-quality shots through positioning, pressing, and defensive pressure.
  • Big Chance Prevention: Specific focus on preventing high xG opportunities (typically >0.35 xG), which disproportionately affect defensive outcomes.
  • Post-Shot xG Against (PSxGA): Advanced metric incorporating shot placement and goalkeeper positioning for more precise defensive evaluation.
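
As a rough illustration of Big Chance Prevention, the sketch below counts conceded shots above the 0.35 xG threshold mentioned in the list and reports how much of the total xGA they account for. The function name `big_chance_summary` and the shot values are hypothetical, chosen only for this example.

```python
def big_chance_summary(shot_xgs, threshold=0.35):
    """Summarise how much of a team's xGA comes from big chances."""
    big = [xg for xg in shot_xgs if xg > threshold]
    total_xga = sum(shot_xgs)
    return {
        "big_chances": len(big),
        "big_chance_xga": sum(big),
        # Share of total xGA that came from big chances (0 if no xGA at all)
        "share_of_xga": sum(big) / total_xga if total_xga > 0 else 0.0,
    }

# Hypothetical xG values for the shots conceded in one match
conceded = [0.03, 0.08, 0.41, 0.12, 0.55, 0.02]
summary = big_chance_summary(conceded)
print(summary)
```

In this made-up match, two of the six shots carry most of the xGA, which is the sense in which big chances disproportionately affect defensive outcomes.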

Mathematical Foundation

Expected Goals Against:

xGA = Σ xG(shot_i) for all shots conceded

Where xG(shot_i) is calculated using the same models as offensive xG

xGA Per 90 Minutes:

xGA/90 = (Total xGA / Total Minutes Played) × 90

Defensive Performance vs Expected:

Defensive Performance = Goals Conceded - xGA

Negative values indicate better than expected defensive performance
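
The three formulas so far can be checked with a small worked example. This is a minimal sketch on hypothetical numbers (eight shots conceded over 270 minutes, one goal conceded); `xga_summary` is a name invented for the illustration.

```python
def xga_summary(shot_xgs, goals_conceded, minutes_played):
    """Apply the xGA, xGA/90 and Defensive Performance formulas."""
    xga = sum(shot_xgs)  # xGA = sum of xG over all shots conceded
    return {
        "xga": xga,
        "xga_per_90": xga / minutes_played * 90 if minutes_played > 0 else 0.0,
        "performance_vs_expected": goals_conceded - xga,  # negative = better than expected
    }

# Hypothetical shots conceded across three full matches
shots = [0.10, 0.25, 0.05, 0.40, 0.15, 0.30, 0.05, 0.20]
result = xga_summary(shots, goals_conceded=1, minutes_played=270)
print(result)
```

Here the team concedes about 1.5 xGA (0.5 per 90) but only one goal, so the performance figure is about -0.5, meaning better than expected.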

xGA Prevention Rate:

Prevention Rate = ((League Average xGA / Team xGA) - 1) × 100

Both values should cover the same number of matches. Positive values indicate better than average defensive restriction
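
A minimal sketch of the prevention rate, assuming the baseline in the numerator is the league-average xGA over the same number of matches as the team figure. That reading of the baseline, the function name, and the numbers are assumptions made for this example.

```python
def prevention_rate(league_avg_xga, team_xga):
    """Percentage by which a team restricts xGA relative to a league baseline."""
    if team_xga <= 0:
        return None  # undefined when the team has conceded no xGA
    return (league_avg_xga / team_xga - 1) * 100

# Hypothetical season totals: league average 45.0 xGA, team 36.0 xGA
rate = prevention_rate(45.0, 36.0)
print(f"Prevention rate: {rate:+.1f}%")
```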

Shot Quality Conceded Average:

Avg Shot Quality = xGA / Shots Conceded

Lower values indicate forcing opponents into low-quality attempts

xGA Rate per Shot:

xGA per Shot = Total xGA / Total Shots Against

This is the same quantity as Avg Shot Quality above, under a different name

Defensive xG Efficiency:

Def Efficiency = (1 - (Goals Conceded / xGA)) × 100

Measures percentage improvement over expected defensive performance
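
The efficiency formula can be sketched as follows. The guard for zero xGA and the example totals are additions for illustration only.

```python
def defensive_efficiency(goals_conceded, xga):
    """Def Efficiency = (1 - goals conceded / xGA) * 100."""
    if xga <= 0:
        return None  # ratio is undefined without any xGA conceded
    return (1 - goals_conceded / xga) * 100

# Hypothetical season: 30 goals conceded from 37.5 xGA
efficiency = defensive_efficiency(30, 37.5)
print(f"Defensive efficiency: {efficiency:.1f}%")
```

A positive result means fewer goals were conceded than expected; a team conceding 45 goals from the same 37.5 xGA would score -20%.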

Zone-Specific xGA:

xGA_zone = Σ xG(shot) for shots from specific pitch zone

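Save Percentage Above Expected (based on xGA from shots on target):

Save% Above Expected = ((Shots on Target - Goals) / Shots on Target) - (1 - (xGA on Target / Shots on Target))

This is the actual save percentage minus the expected save percentage, and it simplifies to (xGA on Target - Goals) / Shots on Target. Post-shot xG is the preferred input where it is available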
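
One way to make the save-percentage comparison concrete is to subtract the save rate implied by xG on target from the actual save rate. The sketch below assumes an on-target xG total is available (ideally post-shot xG); the function name and inputs are hypothetical.

```python
def save_pct_above_expected(shots_on_target, goals_conceded, xg_on_target):
    """Actual save % minus expected save %, in percentage points."""
    if shots_on_target <= 0:
        return None
    actual_save_rate = (shots_on_target - goals_conceded) / shots_on_target
    expected_save_rate = 1 - xg_on_target / shots_on_target
    return (actual_save_rate - expected_save_rate) * 100

# Hypothetical goalkeeper: 100 shots on target faced, 25 goals, 30.0 xG on target
delta = save_pct_above_expected(100, 25, 30.0)
print(f"Save% above expected: {delta:+.1f} points")
```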

Python Implementation


import pandas as pd
import numpy as np
from statsbombpy import sb
from mplsoccer import Pitch, VerticalPitch
import matplotlib.pyplot as plt
import seaborn as sns

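# Load match data
matches = sb.matches(competition_id=2, season_id=44)
# Use a match from the competition loaded above so that team_name
# (set in the example execution below) actually appears in the events
events = sb.events(match_id=matches['match_id'].iloc[0])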

# Calculate xGA for a team
def calculate_xga(events_df, team_name):
    """Calculate expected goals against for a specific team"""
    # Find shots against the team (shots by opponents)
    shots_against = events_df[
        (events_df['type'] == 'Shot') &
        (events_df['team'] != team_name)
    ].copy()

    if len(shots_against) == 0:
        return {'xga': 0, 'shots_against': 0, 'goals_conceded': 0}

    # Calculate xGA
    total_xga = shots_against['shot_statsbomb_xg'].sum() if 'shot_statsbomb_xg' in shots_against.columns else 0
    goals_conceded = len(shots_against[shots_against['shot_outcome'] == 'Goal'])
    shots_count = len(shots_against)

    xga_stats = {
        'xga': total_xga,
        'shots_against': shots_count,
        'goals_conceded': goals_conceded,
        'avg_shot_quality': total_xga / shots_count if shots_count > 0 else 0,
        'performance_vs_expected': goals_conceded - total_xga,
        'xga_per_shot': total_xga / shots_count if shots_count > 0 else 0
    }

    return xga_stats

# Calculate xGA by zone
def calculate_xga_by_zone(events_df, team_name):
    """Break down xGA by pitch zones"""
    shots_against = events_df[
        (events_df['type'] == 'Shot') &
        (events_df['team'] != team_name)
    ].copy()

    if len(shots_against) == 0:
        return pd.DataFrame()

    # Classify shots by zone
    def classify_shot_zone(row):
        if not isinstance(row.get('location'), list):
            return 'unknown'

        x, y = row['location'][0], row['location'][1]

        # Zone classification
        if x >= 102:  # Level with the penalty area (StatsBomb pitch is 120 x 80)
            if x >= 114 and 30 <= y <= 50:  # Six-yard box
                return 'six_yard_box'
            elif 18 <= y <= 62:  # Inside the width of the penalty area
                return 'penalty_area_central'
            else:  # Level with the box but wide of it
                return 'penalty_area_wide'
        else:  # Outside box
            if 24 <= y <= 56:
                return 'outside_box_central'
            else:
                return 'outside_box_wide'

    shots_against['zone'] = shots_against.apply(classify_shot_zone, axis=1)

    # Aggregate by zone
    zone_xga = shots_against.groupby('zone').agg({
        'shot_statsbomb_xg': ['sum', 'mean', 'count'],
        'shot_outcome': lambda x: (x == 'Goal').sum()
    })

    zone_xga.columns = ['total_xga', 'avg_xg_per_shot', 'shots', 'goals']
    zone_xga['performance_vs_xg'] = zone_xga['goals'] - zone_xga['total_xga']

    return zone_xga.sort_values('total_xga', ascending=False)

# Calculate xGA by situation type
def calculate_xga_by_situation(events_df, team_name):
    """Break down xGA by different game situations"""
    shots_against = events_df[
        (events_df['type'] == 'Shot') &
        (events_df['team'] != team_name)
    ].copy()

    if len(shots_against) == 0:
        return pd.DataFrame()

    # Group by play pattern
    situation_xga = shots_against.groupby('play_pattern').agg({
        'shot_statsbomb_xg': ['sum', 'mean', 'count'],
        'shot_outcome': lambda x: (x == 'Goal').sum()
    })

    situation_xga.columns = ['total_xga', 'avg_xg_per_shot', 'shots', 'goals']
    situation_xga['performance_vs_xg'] = situation_xga['goals'] - situation_xga['total_xga']

    return situation_xga.sort_values('total_xga', ascending=False)

# Visualize shots conceded with xG values
def plot_shots_against_map(events_df, team_name):
    """Create shot map of chances conceded"""
    shots_against = events_df[
        (events_df['type'] == 'Shot') &
        (events_df['team'] != team_name)
    ].copy()

    if len(shots_against) == 0:
        print("No shots against to visualize")
        return None

    pitch = Pitch(pitch_type='statsbomb', pitch_color='#22312b', line_color='white')
    fig, ax = pitch.draw(figsize=(14, 10))

    # Separate goals and non-goals
    goals = shots_against[shots_against['shot_outcome'] == 'Goal']
    non_goals = shots_against[shots_against['shot_outcome'] != 'Goal']

    # Plot non-goals
    if len(non_goals) > 0:
        x_coords = non_goals['location'].apply(lambda x: x[0] if isinstance(x, list) else 60)
        y_coords = non_goals['location'].apply(lambda x: x[1] if isinstance(x, list) else 40)
        xg_values = non_goals['shot_statsbomb_xg'].fillna(0.05)

        pitch.scatter(x_coords, y_coords, ax=ax,
                     s=xg_values * 1000,
                     c='#00d9ff', edgecolors='white',
                     linewidths=2, alpha=0.6, zorder=2,
                     label='Shots (size = xG)')

    # Plot goals
    if len(goals) > 0:
        x_coords_goals = goals['location'].apply(lambda x: x[0] if isinstance(x, list) else 60)
        y_coords_goals = goals['location'].apply(lambda x: x[1] if isinstance(x, list) else 40)
        xg_values_goals = goals['shot_statsbomb_xg'].fillna(0.05)

        pitch.scatter(x_coords_goals, y_coords_goals, ax=ax,
                     s=xg_values_goals * 1000,
                     c='#ff0000', marker='*',
                     edgecolors='white', linewidths=2,
                     alpha=0.9, zorder=3,
                     label='Goals Conceded')

    total_xga = shots_against['shot_statsbomb_xg'].sum()
    goals_conceded = len(goals)

    plt.legend(loc='upper left', fontsize=11)
    plt.title(f'{team_name} - Shots Conceded Map\n'
              f'xGA: {total_xga:.2f} | Goals: {goals_conceded} | Diff: {goals_conceded - total_xga:+.2f}',
             fontsize=16, color='white', pad=20)
    plt.tight_layout()
    return fig

# Calculate xGA timeline over season
def calculate_xga_timeline(matches_df, team_name):
    """Track xGA progression over multiple matches"""
    team_xga_timeline = []

    for _, match in matches_df.iterrows():
        try:
            match_events = sb.events(match_id=match['match_id'])

            # Determine if team is home or away
            home_team = match['home_team']
            away_team = match['away_team']

            if team_name == home_team:
                opponent = away_team
            elif team_name == away_team:
                opponent = home_team
            else:
                continue  # Team not in this match

            # Calculate xGA for this match
            xga_stats = calculate_xga(match_events, team_name)

            team_xga_timeline.append({
                'match_id': match['match_id'],
                'match_date': match.get('match_date', ''),
                'opponent': opponent,
                'xga': xga_stats['xga'],
                'goals_conceded': xga_stats['goals_conceded'],
                'performance_vs_xg': xga_stats['performance_vs_expected']
            })

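        except Exception as e:
            # Report the failure instead of silently dropping the match
            print(f"Skipping match {match['match_id']}: {e}")
            continue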

    return pd.DataFrame(team_xga_timeline)

# Plot xGA timeline
def plot_xga_timeline(xga_timeline_df, team_name):
    """Visualize xGA progression over season"""
    if len(xga_timeline_df) == 0:
        return None

    xga_timeline_df = xga_timeline_df.reset_index(drop=True)
    xga_timeline_df['cumulative_xga'] = xga_timeline_df['xga'].cumsum()
    xga_timeline_df['cumulative_goals'] = xga_timeline_df['goals_conceded'].cumsum()

    fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(14, 10))

    # Match-by-match xGA
    x_range = range(len(xga_timeline_df))

    ax1.plot(x_range, xga_timeline_df['xga'].values,
            color='#00d9ff', linewidth=2, marker='o',
            markersize=6, label='xGA')
    ax1.plot(x_range, xga_timeline_df['goals_conceded'].values,
            color='#ff6b6b', linewidth=2, marker='s',
            markersize=6, label='Goals Conceded')

    ax1.set_ylabel('Per Match', fontsize=12)
    ax1.set_title(f'{team_name} - xGA vs Goals Conceded per Match',
                 fontsize=14, fontweight='bold')
    ax1.legend(loc='upper left')
    ax1.grid(True, alpha=0.3)

    # Cumulative
    ax2.plot(x_range, xga_timeline_df['cumulative_xga'].values,
            color='#00d9ff', linewidth=2.5, label='Cumulative xGA')
    ax2.plot(x_range, xga_timeline_df['cumulative_goals'].values,
            color='#ff6b6b', linewidth=2.5, label='Cumulative Goals Conceded')

    ax2.fill_between(x_range,
                     xga_timeline_df['cumulative_xga'].values,
                     xga_timeline_df['cumulative_goals'].values,
                     where=(xga_timeline_df['cumulative_goals'] < xga_timeline_df['cumulative_xga']),
                     alpha=0.3, color='green', label='Better than expected')

    ax2.fill_between(x_range,
                     xga_timeline_df['cumulative_xga'].values,
                     xga_timeline_df['cumulative_goals'].values,
                     where=(xga_timeline_df['cumulative_goals'] >= xga_timeline_df['cumulative_xga']),
                     alpha=0.3, color='red', label='Worse than expected')

    ax2.set_xlabel('Match Number', fontsize=12)
    ax2.set_ylabel('Cumulative Total', fontsize=12)
    ax2.set_title('Cumulative xGA vs Goals Conceded', fontsize=14, fontweight='bold')
    ax2.legend(loc='upper left')
    ax2.grid(True, alpha=0.3)

    plt.tight_layout()
    return fig

# Compare xGA across teams
def compare_teams_xga(events_df):
    """Compare xGA metrics for both teams in a match"""
    teams = events_df['team'].unique()

    comparison = {}

    for team in teams:
        xga_stats = calculate_xga(events_df, team)
        zone_xga = calculate_xga_by_zone(events_df, team)
        situation_xga = calculate_xga_by_situation(events_df, team)

        comparison[team] = {
            'overall': xga_stats,
            'by_zone': zone_xga,
            'by_situation': situation_xga
        }

    return comparison

# Shot quality conceded analysis
def analyze_shot_quality_conceded(events_df, team_name):
    """Detailed analysis of shot quality allowed"""
    shots_against = events_df[
        (events_df['type'] == 'Shot') &
        (events_df['team'] != team_name)
    ].copy()

    if len(shots_against) == 0:
        return {}

    # Categorize by shot quality
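    shots_against['quality_category'] = pd.cut(
        shots_against['shot_statsbomb_xg'].fillna(0),
        bins=[0, 0.05, 0.15, 0.35, 1.0],
        labels=['very_low', 'low', 'medium', 'high'],
        include_lowest=True  # keep shots with xG of exactly 0 (including filled NaNs)
    )

    # observed=False keeps empty categories in the output and makes the grouping explicit
    quality_breakdown = shots_against.groupby('quality_category', observed=False).agg({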
        'shot_statsbomb_xg': ['sum', 'mean', 'count'],
        'shot_outcome': lambda x: (x == 'Goal').sum()
    })

    quality_breakdown.columns = ['total_xg', 'avg_xg', 'shots', 'goals']
    quality_breakdown['conversion_rate'] = (quality_breakdown['goals'] / quality_breakdown['shots'] * 100).fillna(0)

    return quality_breakdown

# Example execution
team_name = 'Arsenal'

# Calculate overall xGA
xga_stats = calculate_xga(events, team_name)
print(f"xGA Analysis for {team_name}:")
print(f"  xGA: {xga_stats['xga']:.2f}")
print(f"  Goals Conceded: {xga_stats['goals_conceded']}")
print(f"  Performance vs Expected: {xga_stats['performance_vs_expected']:+.2f}")
print(f"  Avg Shot Quality Against: {xga_stats['avg_shot_quality']:.3f}")

# Zone analysis
print("\nxGA by Zone:")
zone_xga = calculate_xga_by_zone(events, team_name)
print(zone_xga)

# Situation analysis
print("\nxGA by Situation:")
situation_xga = calculate_xga_by_situation(events, team_name)
print(situation_xga)

# Shot quality analysis
print("\nShot Quality Conceded:")
quality_analysis = analyze_shot_quality_conceded(events, team_name)
print(quality_analysis)

# Visualizations
shot_map = plot_shots_against_map(events, team_name)
if shot_map:
    plt.show()

R Implementation


library(tidyverse)
library(StatsBombR)
library(ggsoccer)

# Load match data
competitions <- FreeCompetitions()
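# Same competition and season IDs as the Python example
matches <- FreeMatches(competitions %>% filter(competition_id == 2, season_id == 44))
# get.matchFree() expects a row of the matches data frame, not a bare match ID;
# allclean() adds the location.x / location.y columns used below
events <- get.matchFree(matches[1, ]) %>% allclean()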

# Calculate xGA for a team
calculate_xga <- function(events_data, team_name) {
  shots_against <- events_data %>%
    filter(type.name == "Shot" & team.name != team_name)

  if(nrow(shots_against) == 0) {
    return(list(xga = 0, shots_against = 0, goals_conceded = 0))
  }

  list(
    xga = sum(shots_against$shot.statsbomb_xg, na.rm = TRUE),
    shots_against = nrow(shots_against),
    goals_conceded = sum(shots_against$shot.outcome.name == "Goal", na.rm = TRUE),
    avg_shot_quality = mean(shots_against$shot.statsbomb_xg, na.rm = TRUE),
    performance_vs_expected = sum(shots_against$shot.outcome.name == "Goal", na.rm = TRUE) -
                             sum(shots_against$shot.statsbomb_xg, na.rm = TRUE),
    xga_per_shot = sum(shots_against$shot.statsbomb_xg, na.rm = TRUE) / nrow(shots_against)
  )
}

# Calculate xGA by zone
calculate_xga_by_zone <- function(events_data, team_name) {
  shots_against <- events_data %>%
    filter(type.name == "Shot" & team.name != team_name)

  if(nrow(shots_against) == 0) return(tibble())

  shots_against %>%
    mutate(
      zone = case_when(
        location.x >= 114 & location.y >= 30 & location.y <= 50 ~ "six_yard_box",
        location.x >= 102 & location.y >= 18 & location.y <= 62 ~ "penalty_area_central",
        location.x >= 102 ~ "penalty_area_wide",
        location.y >= 24 & location.y <= 56 ~ "outside_box_central",
        TRUE ~ "outside_box_wide"
      )
    ) %>%
    group_by(zone) %>%
    summarise(
      total_xga = sum(shot.statsbomb_xg, na.rm = TRUE),
      avg_xg_per_shot = mean(shot.statsbomb_xg, na.rm = TRUE),
      shots = n(),
      goals = sum(shot.outcome.name == "Goal", na.rm = TRUE),
      performance_vs_xg = goals - total_xga,
      .groups = "drop"
    ) %>%
    arrange(desc(total_xga))
}

# Calculate xGA by situation
calculate_xga_by_situation <- function(events_data, team_name) {
  events_data %>%
    filter(type.name == "Shot" & team.name != team_name) %>%
    group_by(play_pattern.name) %>%
    summarise(
      total_xga = sum(shot.statsbomb_xg, na.rm = TRUE),
      avg_xg_per_shot = mean(shot.statsbomb_xg, na.rm = TRUE),
      shots = n(),
      goals = sum(shot.outcome.name == "Goal", na.rm = TRUE),
      performance_vs_xg = goals - total_xga,
      .groups = "drop"
    ) %>%
    arrange(desc(total_xga))
}

# Visualize shots conceded map
plot_shots_against_map <- function(events_data, team_name) {
  shots_against <- events_data %>%
    filter(type.name == "Shot" & team.name != team_name)

  if(nrow(shots_against) == 0) {
    message("No shots against to visualize")
    return(NULL)
  }

  total_xga <- sum(shots_against$shot.statsbomb_xg, na.rm = TRUE)
  goals_conceded <- sum(shots_against$shot.outcome.name == "Goal", na.rm = TRUE)
  diff <- goals_conceded - total_xga

  ggplot(shots_against) +
    annotate_pitch(dimensions = pitch_statsbomb) +
    theme_pitch() +
    geom_point(
      aes(x = location.x, y = location.y,
          size = shot.statsbomb_xg,
          color = shot.outcome.name == "Goal",
          shape = shot.outcome.name == "Goal"),
      alpha = 0.7
    ) +
    scale_size_continuous(range = c(2, 12), name = "xG Value") +
    scale_color_manual(
      values = c("TRUE" = "#ff0000", "FALSE" = "#00d9ff"),
      labels = c("Other Shots", "Goals"),
      name = "Outcome"
    ) +
    scale_shape_manual(
      values = c("TRUE" = 8, "FALSE" = 16),
      labels = c("Other Shots", "Goals"),
      name = "Outcome"
    ) +
    labs(
      title = paste(team_name, "- Shots Conceded Map"),
      subtitle = sprintf("xGA: %.2f | Goals: %d | Diff: %+.2f",
                        total_xga, goals_conceded, diff)
    ) +
    theme(
      plot.title = element_text(hjust = 0.5, size = 16, face = "bold"),
      plot.subtitle = element_text(hjust = 0.5, size = 12)
    )
}

# Calculate xGA timeline over season
calculate_xga_timeline <- function(matches_data, events_list, team_name) {
  timeline <- map_df(seq_len(nrow(matches_data)), function(i) {
    match <- matches_data[i, ]
    match_events <- events_list[[i]]

    # Check if team is in match
    # FreeMatches() stores team names in these nested-name columns
    home <- match$home_team.home_team_name
    away <- match$away_team.away_team_name

    if(!(team_name %in% c(home, away))) {
      return(NULL)
    }

    opponent <- ifelse(team_name == home, away, home)

    xga_stats <- calculate_xga(match_events, team_name)

    tibble(
      match_id = match$match_id,
      match_date = match$match_date,
      opponent = opponent,
      xga = xga_stats$xga,
      goals_conceded = xga_stats$goals_conceded,
      performance_vs_xg = xga_stats$performance_vs_expected
    )
  })

  return(timeline)
}

# Plot xGA timeline
plot_xga_timeline <- function(xga_timeline_data, team_name) {
  if(nrow(xga_timeline_data) == 0) return(NULL)

  timeline_with_cumulative <- xga_timeline_data %>%
    arrange(match_date) %>%
    mutate(
      match_num = row_number(),
      cumulative_xga = cumsum(xga),
      cumulative_goals = cumsum(goals_conceded)
    )

  p1 <- ggplot(timeline_with_cumulative) +
    geom_line(aes(x = match_num, y = xga),
              color = "#00d9ff", linewidth = 1.2) +
    geom_point(aes(x = match_num, y = xga),
               color = "#00d9ff", size = 3) +
    geom_line(aes(x = match_num, y = goals_conceded),
              color = "#ff6b6b", linewidth = 1.2) +
    geom_point(aes(x = match_num, y = goals_conceded),
               color = "#ff6b6b", size = 3) +
    labs(title = paste(team_name, "- xGA vs Goals Conceded per Match"),
         y = "Per Match", x = NULL) +
    theme_minimal() +
    theme(plot.title = element_text(face = "bold", size = 14))

  p2 <- ggplot(timeline_with_cumulative) +
    geom_line(aes(x = match_num, y = cumulative_xga),
              color = "#00d9ff", linewidth = 1.5) +
    geom_line(aes(x = match_num, y = cumulative_goals),
              color = "#ff6b6b", linewidth = 1.5) +
    geom_ribbon(
      aes(x = match_num,
          ymin = pmin(cumulative_xga, cumulative_goals),
          ymax = pmax(cumulative_xga, cumulative_goals),
          fill = cumulative_goals < cumulative_xga),
      alpha = 0.3
    ) +
    scale_fill_manual(
      values = c("TRUE" = "green", "FALSE" = "red"),
      labels = c("Worse than expected", "Better than expected"),  # label order follows FALSE, TRUE
      name = "Performance"
    ) +
    labs(title = "Cumulative xGA vs Goals Conceded",
         y = "Cumulative Total", x = "Match Number") +
    theme_minimal() +
    theme(plot.title = element_text(face = "bold", size = 14))

  gridExtra::grid.arrange(p1, p2, ncol = 1)
}

# Analyze shot quality conceded
analyze_shot_quality_conceded <- function(events_data, team_name) {
  events_data %>%
    filter(type.name == "Shot" & team.name != team_name) %>%
    mutate(
      quality_category = cut(
        shot.statsbomb_xg,
        breaks = c(0, 0.05, 0.15, 0.35, 1.0),
        labels = c("very_low", "low", "medium", "high"),
        include.lowest = TRUE
      )
    ) %>%
    group_by(quality_category) %>%
    summarise(
      total_xg = sum(shot.statsbomb_xg, na.rm = TRUE),
      avg_xg = mean(shot.statsbomb_xg, na.rm = TRUE),
      shots = n(),
      goals = sum(shot.outcome.name == "Goal", na.rm = TRUE),
      conversion_rate = goals / shots * 100,
      .groups = "drop"
    )
}

# Compare teams xGA
compare_teams_xga <- function(events_data) {
  teams <- unique(events_data$team.name)

  map(teams, function(team) {
    list(
      team = team,
      overall = calculate_xga(events_data, team),
      by_zone = calculate_xga_by_zone(events_data, team),
      by_situation = calculate_xga_by_situation(events_data, team)
    )
  }) %>%
    set_names(teams)
}

# Execute analysis
team_name <- "Arsenal"

# Overall xGA
xga_stats <- calculate_xga(events, team_name)
cat(sprintf("xGA Analysis for %s:\n", team_name))
cat(sprintf("  xGA: %.2f\n", xga_stats$xga))
cat(sprintf("  Goals Conceded: %d\n", xga_stats$goals_conceded))
cat(sprintf("  Performance vs Expected: %+.2f\n", xga_stats$performance_vs_expected))
cat(sprintf("  Avg Shot Quality Against: %.3f\n", xga_stats$avg_shot_quality))

# Zone analysis
cat("\nxGA by Zone:\n")
zone_xga <- calculate_xga_by_zone(events, team_name)
print(zone_xga)

# Situation analysis
cat("\nxGA by Situation:\n")
situation_xga <- calculate_xga_by_situation(events, team_name)
print(situation_xga)

# Shot quality analysis
cat("\nShot Quality Conceded:\n")
quality_analysis <- analyze_shot_quality_conceded(events, team_name)
print(quality_analysis)

# Visualizations
shot_map <- plot_shots_against_map(events, team_name)
if(!is.null(shot_map)) print(shot_map)

# Team comparison
team_comparison <- compare_teams_xga(events)
print(team_comparison)

Practical Applications

Defensive Performance Evaluation: xGA provides context-aware assessment of defensive quality beyond simple goals conceded. Teams can identify whether their defensive results are sustainable or influenced by variance. Consistently outperforming xGA may indicate exceptional goalkeeping that cannot be maintained indefinitely, while underperforming suggests defensive improvements are needed even if recent results appear acceptable.
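
To judge whether a gap between goals conceded and xGA is plausibly just variance, one option is to simulate each conceded shot as an independent coin flip weighted by its xG. The sketch below does this with the standard library; the independence assumption, the function names, and the shot values are all simplifications made for illustration.

```python
import random

def simulate_goals_conceded(shot_xgs, n_sims=5000, seed=0):
    """Simulate total goals conceded, treating each shot as an independent trial."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [sum(1 for xg in shot_xgs if rng.random() < xg) for _ in range(n_sims)]

def prob_at_most(sims, goals):
    """Share of simulations in which the team conceded `goals` or fewer."""
    return sum(1 for s in sims if s <= goals) / len(sims)

# Hypothetical sample: 100 shots conceded at 0.1 xG each (xGA = 10.0)
shot_xgs = [0.1] * 100
sims = simulate_goals_conceded(shot_xgs)
p = prob_at_most(sims, 7)
print(f"Share of simulations with 7 or fewer goals conceded: {p:.2f}")
```

If a meaningful share of simulations produce the observed total or better, the over-performance is hard to separate from luck on that sample alone.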

Goalkeeper Assessment: Comparing goals conceded to xGA helps separate goalkeeper performance from the defense in front of them. Goalkeepers who consistently concede fewer goals than xGA are likely strong shot-stoppers, while those conceding more than expected may need additional coaching or replacement consideration. Because xGA also counts off-target and blocked shots that the goalkeeper never has to save, post-shot xG on target is the better baseline where it is available.

Tactical Adjustments: Breaking down xGA by zone and situation reveals specific defensive vulnerabilities requiring tactical solutions. High xGA from set pieces indicates need for improved organization on dead balls, while high xGA from counter-attacks suggests issues with defensive transitions and recovery speed.

Opposition Scouting: Analyzing opponent xGA patterns helps attackers identify defensive weaknesses to exploit. Teams can target zones where opponents concede high-quality chances and design attacking patterns that create similar situations.

Recruitment and Squad Planning: xGA metrics inform defensive recruitment decisions. Teams can identify whether defensive improvements require better center backs, full backs, or defensive midfielders by analyzing which areas generate highest xGA values. Similarly, teams can scout defenders from clubs with low xGA, indicating strong defensive performers.

Key Takeaways

  • xGA provides superior defensive evaluation compared to goals conceded by accounting for shot quality and chance creation allowed
  • Sustainable defensive performance aligns closely with xGA over time; significant deviations indicate luck or exceptional goalkeeping
  • Low xGA indicates effective defensive structure that prevents high-quality chances, not just shot quantity
  • Breaking down xGA by zone reveals specific defensive vulnerabilities in different pitch areas
  • Set piece xGA should be analyzed separately as it reflects different defensive skills than open play defending
  • Counter-attack xGA often represents the most dangerous situations and requires specific defensive preparation
  • Average shot quality conceded (xGA per shot) is often more revealing than total xGA, showing defensive ability to force low-quality attempts
  • Goalkeeper performance should be evaluated using saves above expected based on xGA, not just save percentage
  • Teams can strategically accept higher xGA in low-danger zones while minimizing high-quality chances in central areas
  • xGA trends over seasons help identify whether defensive improvements or declines are genuine tactical changes or statistical variance
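
For the trend analysis mentioned in the last point, a rolling average smooths single-match noise before comparing periods. This is a minimal sketch with a hypothetical helper and made-up per-match xGA values.

```python
def rolling_mean(values, window):
    """Rolling mean; positions without a full window are None."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)  # not enough matches yet
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out

# Hypothetical per-match xGA for six consecutive matches
match_xga = [1.2, 0.8, 1.0, 1.6, 0.6, 0.8]
trend = rolling_mean(match_xga, window=3)
print(trend)
```

Longer windows react more slowly but are less likely to mistake a single bad match for a real change.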

Code Examples

Simple xG Model

Basic xG model using logistic regression with distance and angle features

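import numpy as np
from sklearn.linear_model import LogisticRegression

def build_xg_model(shots_df):
    """Fit a basic xG model on shot distance and angle.

    Assumes "x" and "y" are on a 0-100 scale with the goal centred at
    (100, 50), and that "goal" is 1 for a goal and 0 otherwise.
    """
    shots = shots_df.copy()  # avoid mutating the caller's DataFrame

    # Distance to the centre of the goal, in pitch-coordinate units
    shots["distance"] = np.sqrt((shots["x"] - 100)**2 + (shots["y"] - 50)**2)
    # Approximate angle subtended by the goal, treating every shot as central.
    # 7.32 is the goal width in metres while distance is in coordinate units,
    # so this is a rough proxy rather than a true angle.
    shots["angle"] = np.arctan2(7.32 / 2, shots["distance"]) * 2

    X = shots[["distance", "angle"]]
    y = shots["goal"]

    model = LogisticRegression()
    model.fit(X, y)
    return model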
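
The angle feature above treats every shot as if it were taken from straight in front of goal. A possible refinement, sketched here under the assumption of StatsBomb coordinates (120 x 80 pitch, goal line at x = 120, posts at y = 36 and y = 44), is to measure the angle between the lines from the shot to each post. The function name `shot_features` is hypothetical.

```python
import math

GOAL_X = 120.0                            # goal line in StatsBomb coordinates
POST_LOW_Y, POST_HIGH_Y = 36.0, 44.0      # goal posts (8 yards apart)
GOAL_CENTRE_Y = 40.0

def shot_features(x, y):
    """Return (distance to goal centre, angle between the posts in radians)."""
    distance = math.hypot(GOAL_X - x, GOAL_CENTRE_Y - y)
    # Direction from the shot location to each post
    to_high = math.atan2(POST_HIGH_Y - y, GOAL_X - x)
    to_low = math.atan2(POST_LOW_Y - y, GOAL_X - x)
    return distance, abs(to_high - to_low)

central = shot_features(108, 40)   # penalty spot
wide = shot_features(108, 10)      # same depth, far out on the flank
print(central, wide)
```

With this version, a shot from a wide position gets a much smaller angle than a central shot from the same depth, which the distance-only approximation cannot express.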
