In This Chapter

Learning Objectives
Introduction
9.1 The Problem with Endpoint Metrics
9.2 Understanding Expected Threat (xT)
9.3 Building an xT Model
9.4 Progressive Actions and Their xT Value
9.5 Player Valuation Using xT
9.6 Progressive Passes and Carries
9.7 Comparison with VAEP, EPV, and Other Possession Value Models
9.8 Practical Applications
9.9 Visualization Techniques
9.10 Limitations and Considerations
9.11 Chapter Summary
Key Terminology
Key Formulas
Further Practice
References

Chapter 9: Expected Threat (xT) and Ball Progression

Learning Objectives

By the end of this chapter, you will be able to:

Understand the concept of Expected Threat (xT) and how it differs from xG and xA
Build and interpret xT grids from historical shot and goal data
Calculate player and team xT contributions from event data
Analyze ball progression metrics including progressive passes and carries
Evaluate player value through possession-based frameworks
Compare different approaches to valuing on-ball actions
Apply xT analysis to tactical and recruitment questions

Introduction

Expected Goals (xG) tells us the value of a shot. Expected Assists (xA) tells us the value of the final pass before a shot. But what about all the other actions in a possession--the passes that advance the ball up the field, the dribbles that break defensive lines, the movements that create space? How do we value the hundreds of actions that occur before a shooting opportunity emerges?

Expected Threat (xT) addresses this gap by assigning value to every position on the pitch based on how likely that position is to lead to a goal in the near future. When a player moves the ball from a low-threat area to a high-threat area, they generate positive xT--they have increased their team's chance of scoring.

This chapter introduces xT and related ball progression metrics, providing a comprehensive framework for valuing all on-ball actions, not just shots and assists. These metrics have transformed how clubs evaluate midfielders, full-backs, and other players whose contributions don't always appear in traditional statistics.

9.1 The Problem with Endpoint Metrics

9.1.1 The Missing Middle

Traditional statistics focus on endpoints: goals, assists, shots on target. Even advanced metrics like xG and xA concentrate on the final moments of an attack. But most of a soccer match involves:

Building possession from deep positions
Progressing the ball through midfield
Creating advantageous positions for attacks
Breaking defensive lines through passes or dribbles

Players who excel at these tasks--deep-lying playmakers, ball-progressing center-backs, inverted full-backs--often appear undervalued by endpoint metrics.

Consider two midfielders:

- Player A: Completes a 40-yard progressive pass that advances his team into the final third, but the attack fizzles
- Player B: Receives the ball 25 yards from goal after Player A's work, makes a simple 5-yard pass, and a teammate shoots

Traditional xA credits Player B for the assist opportunity. Player A gets nothing, despite arguably making the more valuable contribution. This systematic undervaluation of ball progression has real consequences: clubs that rely solely on xG and xA will overlook excellent midfielders and defenders who drive their team's attacking play without appearing in the shot-creation chain.

9.1.2 The Value of Position

The insight behind xT is that position has value. Standing at the edge of the opponent's penalty area is inherently more threatening than standing in your own half, even before any action occurs. This positional value exists because:

Proximity to goal: Closer positions offer shorter distances and better angles for shooting
Defensive disruption: Attacking positions force defenders into more difficult decisions
Transition advantage: Central positions in the final third require multiple defenders to cover
Shooting opportunity density: Certain zones produce many more shots per possession than others

By quantifying this positional value, we can credit players for improving their team's position, regardless of whether that improvement directly leads to a shot.

Callout: Why xT Matters for Player Valuation

Before xT and similar metrics, evaluating certain positions was extremely difficult with data alone:

- Ball-playing center-backs who drive possession forward appeared no different from passive center-backs in traditional stats
- Deep-lying playmakers who orchestrate attacks from the center circle received no statistical credit for their orchestration
- Inverted full-backs who carry the ball into midfield channels were invisible in xG/xA analysis

xT and progressive action metrics have made these players visible to data-driven scouting for the first time.

9.2 Understanding Expected Threat (xT)

9.2.1 Origin of xT: Karun Singh's Work

Expected Threat was formalized by Karun Singh in a 2019 blog post titled "Introducing Expected Threat (xT)." While the concept of positional value had been explored informally by several analysts, Singh provided the first clear mathematical framework and publicly available implementation.

Singh's key insight was to model the pitch as a Markov chain: the ball's current position determines the probability distribution of future positions, and from any future position, there is some probability of a shot and some probability of scoring. By working backward from these scoring probabilities, you can assign a threat value to every location on the pitch.

The work drew on earlier ideas from several sources. Sarah Rudd presented related concepts at the New England Symposium on Statistics in Sports as early as 2011, using Markov chains to value passes based on the zones they connect. Marek Kwiatkowski explored similar territory with his "goal-expected" framework. However, Singh's formulation was particularly elegant and accessible, and it quickly caught the attention of the broader analytics community.

Since its introduction, xT has been widely adopted by analytics departments at professional clubs, and the concept has been extended and refined by numerous researchers. It serves as the conceptual foundation for more complex possession value models like VAEP and EPV.

9.2.2 Core Concept

Expected Threat (xT) divides the pitch into a grid of zones (typically 12x8 = 96 zones or 16x12 = 192 zones). Each zone is assigned a value representing the probability that possessing the ball in that zone will lead to a goal in the next few actions.

The fundamental equation:

xT(zone) = P(shot|zone) * P(goal|shot,zone) + P(move|zone) * Sum[P(zone'|zone,move) * xT(zone')]

Where:

- P(shot|zone) = probability of shooting from this zone
- P(goal|shot,zone) = probability of scoring if shooting from this zone
- P(move|zone) = probability of moving the ball (pass/dribble) rather than shooting
- P(zone'|zone,move) = probability of the ball ending up in zone' given a move from zone
- xT(zone') = expected threat value of the destination zone

This recursive equation captures both the immediate shooting threat and the future threat from subsequent actions. The beauty of this formulation is that it propagates value backward from the goal: zones that frequently lead to dangerous positions inherit some of that danger, even if they are far from goal themselves.

Key Insight: The recursive nature of xT means that a zone in the center circle might have a non-trivial xT value not because shots are taken from there, but because passes from that zone frequently reach dangerous areas. This is exactly the kind of value that xG and xA miss entirely.
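
To make the recursion concrete, the following is a minimal sketch on a hypothetical three-zone pitch (defense, midfield, attack). Every probability is invented for illustration, and the array names are assumptions rather than part of any standard implementation. The point to notice is that the defensive zone never shoots, yet it ends up with a positive xT because its moves reach zones that do.

```python
import numpy as np

# Hypothetical three-zone pitch: 0 = defense, 1 = midfield, 2 = attack.
# Every number below is made up purely to illustrate the recursion.
shot_prob = np.array([0.00, 0.01, 0.20])   # P(shot | zone)
goal_prob = np.array([0.00, 0.03, 0.12])   # P(goal | shot, zone)
move_prob = np.array([1.00, 0.99, 0.80])   # P(move | zone)

# transition[i, j] = P(a move from zone i arrives in zone j).
# Rows sum to less than 1; the remainder is the chance the move loses the ball.
transition = np.array([
    [0.50, 0.35, 0.00],
    [0.20, 0.45, 0.20],
    [0.05, 0.25, 0.40],
])

immediate = shot_prob * goal_prob          # threat from shooting right now
xt = immediate.copy()
for _ in range(200):                       # repeatedly apply the xT equation
    xt = immediate + move_prob * (transition @ xt)

for name, value in zip(["defense", "midfield", "attack"], xt):
    print(f"{name:9s} immediate = {immediate[list(xt).index(value)]:.4f}  xT = {value:.4f}")
```

Comparing the two printed columns shows how much of each zone's value comes from future actions rather than from shooting immediately.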

9.2.3 The Zone-Based Value Surface

To understand what the xT surface looks like in practice, consider the following typical values:

Zone	Approximate xT	Interpretation
Own penalty area	0.001-0.003	Minimal threat, recovery position
Own half	0.002-0.010	Building phase
Central midfield	0.005-0.020	Transition zone
Final third wings	0.010-0.030	Crossing positions
Final third center	0.030-0.080	Dangerous territory
Edge of box	0.080-0.150	High threat
Inside penalty area	0.150-0.400	Very high threat
Central box	0.300-0.500+	Maximum threat

These values represent the probability that possession in that zone leads to a goal within the possession. Several notable patterns emerge from the xT surface:

Centrality premium. Central zones consistently have higher xT than equivalent lateral zones at the same distance from goal. This reflects the fact that central positions offer wider shooting angles and more passing options into dangerous areas.

Non-linear increase near goal. xT values increase slowly through midfield but accelerate rapidly in the final third, particularly within the penalty area. The zone just outside the center of the box might have 5-10 times the xT of a zone 20 meters further back.

Asymmetry in wide areas. Wide zones near the opponent's byline have moderate xT because they enable crosses and cutbacks, but the xT of these zones is lower than central zones at the same distance from goal, reflecting the lower conversion rate of crosses compared to central shots.

9.2.4 The Markov Chain Approach to Calculating xT

The mathematical foundation of xT is a Markov chain--a stochastic model where the probability of future states depends only on the current state, not on the sequence of events that preceded it. In the xT framework:

States are the zones on the pitch grid
Transitions are ball movements (passes and carries) between zones
Absorbing states are goals (the possession ends with a score) and possession losses (the possession ends without a score)

The Markov property assumes that where the ball goes next depends only on where it is now, not on how it got there. This is an approximation--in reality, a ball that arrived via a fast break may have different continuation probabilities than one that arrived through slow build-up--but it is a reasonable simplification that makes the math tractable.

The xT calculation proceeds as follows:

Estimate shooting probability from each zone: What fraction of actions starting in zone $z$ are shots?
Estimate scoring probability given a shot: What fraction of shots from zone $z$ result in a goal?
Estimate transition probabilities: For passes and carries starting in zone $z$, what is the probability distribution of destination zones?
Solve the value function recursively: Use value iteration to find the xT for each zone that satisfies the fundamental equation.

The value iteration converges because each step of the Markov chain has some probability of ending the possession (through a shot, turnover, or out-of-play), ensuring that the infinite series of future values converges to a finite sum.
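
Because the xT equation is linear in the unknown values, the same fixed point can also be obtained by solving one linear system instead of iterating. The sketch below assumes a single combined transition matrix and uses randomly generated inputs; `solve_xt_linear` is a hypothetical helper name, not a function from any library. It is meant as a cross-check on value iteration rather than a replacement for it.

```python
import numpy as np

def solve_xt_linear(shot_prob, goal_prob, move_prob, transition):
    """Solve xT = s + diag(move_prob) @ transition @ xT as one linear system."""
    immediate = shot_prob * goal_prob
    move_matrix = move_prob[:, np.newaxis] * transition   # scale row i by move_prob[i]
    identity = np.eye(len(immediate))
    return np.linalg.solve(identity - move_matrix, immediate)

# Random toy inputs for six zones (not real football data).
rng = np.random.default_rng(0)
n = 6
transition = rng.random((n, n))
transition *= 0.8 / transition.sum(axis=1, keepdims=True)  # rows sum to 0.8: 20% of moves fail
shot_prob = rng.uniform(0.0, 0.2, n)
goal_prob = rng.uniform(0.0, 0.3, n)
move_prob = 1.0 - shot_prob

xt_direct = solve_xt_linear(shot_prob, goal_prob, move_prob, transition)

# Value iteration on the same inputs for comparison.
xt_iter = shot_prob * goal_prob
for _ in range(500):
    xt_iter = shot_prob * goal_prob + move_prob * (transition @ xt_iter)

print("methods agree:", np.allclose(xt_direct, xt_iter))
```

The direct solve is exact but scales with the cube of the number of zones, which is negligible for a 96- or 192-zone grid.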

Callout: Markov Chains in Plain Language

A Markov chain is simply a system where what happens next depends only on where you are now, not on your history. Think of it like navigating a city: from any intersection, you can turn left, right, or go straight, and the probability of each choice depends on the intersection (the current state) but not on how you got there. xT treats the soccer pitch the same way: from any zone, the ball has certain probabilities of moving to each other zone, and we can calculate the long-run probability of scoring from each starting position.

9.3 Building an xT Model

9.3.1 Data Requirements

To build an xT model, you need:

Event data with precise coordinates
- All passes (successful and unsuccessful)
- All carries/dribbles
- All shots with outcomes
Sufficient sample size
- Minimum ~50,000 actions for stable estimates
- One full league season typically provides adequate data
- More data (multiple seasons) produces smoother estimates
Consistent coordinate system
- Standard 120x80 or 105x68 coordinate space
- All data normalized to same orientation (attacking left-to-right)
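
The coordinate requirement can be sketched as a small preprocessing step. The example below assumes the events arrive in a 105x68 space with a hypothetical `attacking_direction` column holding `"left_to_right"` or `"right_to_left"`; real providers encode direction differently (often by period and team), so treat the column name and its values as placeholders.

```python
import pandas as pd

def normalize_coordinates(df, source_dims=(105, 68), target_dims=(120, 80),
                          direction_col="attacking_direction"):
    """Rescale to the target pitch and flip so every team attacks left-to-right."""
    out = df.copy()
    flip = out[direction_col] == "right_to_left"
    for prefix in ("start", "end"):
        x_col, y_col = f"{prefix}_x", f"{prefix}_y"
        # Rescale each axis independently to the target coordinate space
        out[x_col] = out[x_col] * target_dims[0] / source_dims[0]
        out[y_col] = out[y_col] * target_dims[1] / source_dims[1]
        # Rotate the pitch 180 degrees for teams attacking right-to-left
        out.loc[flip, x_col] = target_dims[0] - out.loc[flip, x_col]
        out.loc[flip, y_col] = target_dims[1] - out.loc[flip, y_col]
    return out

events = pd.DataFrame({
    "start_x": [52.5, 52.5], "start_y": [34.0, 0.0],
    "end_x": [105.0, 0.0], "end_y": [34.0, 68.0],
    "attacking_direction": ["left_to_right", "right_to_left"],
})
normalized = normalize_coordinates(events)
print(normalized[["start_x", "start_y", "end_x", "end_y"]])
```

Both rows end at x = 120 after normalization: the second event was a pass toward the left-hand goal, which becomes a pass toward the right-hand goal once flipped.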

9.3.2 Grid Definition

Choose your grid resolution:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Common grid configurations (cell sizes are in units of a 120x80 coordinate space)
GRID_12x8 = (12, 8)   # 96 zones, 10 x 10 units each
GRID_16x12 = (16, 12)  # 192 zones, 7.5 x ~6.7 units each
GRID_24x16 = (24, 16)  # 384 zones, 5 x 5 units each

def get_zone(x, y, grid_size=(12, 8), pitch_dims=(120, 80)):
    """Convert coordinates to zone indices."""
    zone_x = int(x / pitch_dims[0] * grid_size[0])
    zone_y = int(y / pitch_dims[1] * grid_size[1])

    # Clamp to valid range
    zone_x = max(0, min(zone_x, grid_size[0] - 1))
    zone_y = max(0, min(zone_y, grid_size[1] - 1))

    return zone_x, zone_y

def zone_to_index(x, y, grid_size=(12, 8), pitch_dims=(120, 80)):
    """Convert coordinates to a single flat index."""
    zx, zy = get_zone(x, y, grid_size, pitch_dims)
    return zy * grid_size[0] + zx

Higher resolution captures more nuance but requires more data for stable estimates. The 12x8 grid is a good default that balances resolution against data requirements. For clubs with access to multiple seasons of data, 16x12 provides noticeably better spatial resolution, particularly in the penalty area where small differences in position matter significantly.

Best Practice: Start with a 12x8 grid for initial exploration and increase resolution if your dataset supports it. A useful heuristic: each zone should contain at least 100 actions for stable probability estimates. With a 12x8 grid and 50,000 actions per season, that gives roughly 520 actions per zone on average--comfortably above the minimum.
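
The 100-actions-per-zone heuristic is easy to check before committing to a grid. The sketch below counts actions per zone for three resolutions using uniformly random coordinates as a stand-in for real data; `zone_action_counts` is a hypothetical helper. Real events cluster in midfield, so actual datasets will show more sparse zones near the corners and goal lines than this uniform sample suggests.

```python
import numpy as np

def zone_action_counts(x, y, grid_size=(12, 8), pitch_dims=(120, 80)):
    """Count actions per zone; returns an array shaped (rows, cols)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    zx = np.clip((x / pitch_dims[0] * grid_size[0]).astype(int), 0, grid_size[0] - 1)
    zy = np.clip((y / pitch_dims[1] * grid_size[1]).astype(int), 0, grid_size[1] - 1)
    flat = zy * grid_size[0] + zx
    counts = np.bincount(flat, minlength=grid_size[0] * grid_size[1])
    return counts.reshape(grid_size[1], grid_size[0])

# Synthetic stand-in for one season of action start locations
rng = np.random.default_rng(42)
x = rng.uniform(0, 120, 50_000)
y = rng.uniform(0, 80, 50_000)

for grid in [(12, 8), (16, 12), (24, 16)]:
    counts = zone_action_counts(x, y, grid)
    sparse = int((counts < 100).sum())
    print(f"{grid[0]}x{grid[1]}: mean {counts.mean():.0f} per zone, {sparse} zones below 100")
```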

9.3.3 Transition Matrix Calculation

Build transition matrices showing how the ball moves between zones:

def build_transition_matrices(events_df, grid_size=(12, 8), pitch_dims=(120, 80)):
    """
    Build pass and carry transition matrices.

    Returns matrices where M[i,j] = P(end in zone j | start in zone i).
    Failed actions count toward the row total but not toward any destination,
    so each row sums to that zone's completion rate rather than to 1.
    """
    n_zones = grid_size[0] * grid_size[1]

    # Initialize count matrices
    pass_counts = np.zeros((n_zones, n_zones))
    carry_counts = np.zeros((n_zones, n_zones))
    pass_totals = np.zeros(n_zones)
    carry_totals = np.zeros(n_zones)

    # Count transitions (shots and other events have no destination zone)
    for _, event in events_df.iterrows():
        if event['type'] == 'Pass':
            counts, totals = pass_counts, pass_totals
        elif event['type'] == 'Carry':
            counts, totals = carry_counts, carry_totals
        else:
            continue

        start_zone = zone_to_index(event['start_x'], event['start_y'],
                                   grid_size, pitch_dims)
        totals[start_zone] += 1

        # Only completed actions actually deliver the ball to the end zone
        if event.get('successful', True):
            end_zone = zone_to_index(event['end_x'], event['end_y'],
                                     grid_size, pitch_dims)
            counts[start_zone, end_zone] += 1

    # Normalize to probabilities; `out` keeps rows for empty zones at zero
    # (without it, np.divide leaves those entries uninitialized)
    pass_matrix = np.divide(pass_counts, pass_totals[:, np.newaxis],
                            out=np.zeros_like(pass_counts),
                            where=pass_totals[:, np.newaxis] > 0)
    carry_matrix = np.divide(carry_counts, carry_totals[:, np.newaxis],
                             out=np.zeros_like(carry_counts),
                             where=carry_totals[:, np.newaxis] > 0)

    return pass_matrix, carry_matrix

The transition matrix is the heart of the xT model. Each row represents a starting zone, and each column represents a destination zone. The entry at row $i$, column $j$ gives the probability that a ball movement (pass or carry) starting in zone $i$ ends in zone $j$. These probabilities are estimated simply as the historical frequency of each transition.
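
On a full season of events, counting transitions row by row can be slow. The following is a minimal vectorized sketch, assuming the start and end zone indices and a success flag have already been computed as arrays; `transition_matrix_from_indices` is a hypothetical name. Failed actions are counted in the denominator only, so a row sums to the zone's completion rate.

```python
import numpy as np

def transition_matrix_from_indices(start_idx, end_idx, success, n_zones):
    """Estimate P(arrive in j | move from i) from integer zone indices."""
    start_idx = np.asarray(start_idx)
    end_idx = np.asarray(end_idx)
    success = np.asarray(success, dtype=bool)

    counts = np.zeros((n_zones, n_zones))
    # np.add.at accumulates correctly even when the same (i, j) pair repeats
    np.add.at(counts, (start_idx[success], end_idx[success]), 1)

    # Every attempt, successful or not, counts toward the row total
    totals = np.bincount(start_idx, minlength=n_zones).astype(float)
    return np.divide(counts, totals[:, np.newaxis],
                     out=np.zeros_like(counts),
                     where=totals[:, np.newaxis] > 0)

# Five toy moves between three zones; the third one fails
start = [0, 0, 0, 1, 1]
end = [1, 1, 2, 2, 0]
ok = [True, True, False, True, True]
T = transition_matrix_from_indices(start, end, ok, n_zones=3)
print(T)
```

In this toy example, zone 0 attempted three moves and completed two of them into zone 1, so `T[0, 1]` is 2/3 and the row sums to the completion rate of 2/3. Zone 2 has no data, so its row stays at zero.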

9.3.4 Solving for xT Values

Use value iteration to solve for xT:

def calculate_xt_grid(shot_prob, goal_prob, move_prob,
                       pass_matrix, carry_matrix,
                       grid_size=(12, 8),
                       max_iterations=100, tolerance=1e-6,
                       pass_weight=0.7):
    """
    Calculate xT values using value iteration.

    Parameters
    ----------
    shot_prob : array
        Probability of shooting from each zone
    goal_prob : array
        Probability of scoring given a shot from each zone
    move_prob : array
        Probability of moving the ball (rather than shooting) from each zone
    pass_matrix : array
        Transition matrix for passes
    carry_matrix : array
        Transition matrix for carries
    grid_size : tuple
        Grid dimensions
    max_iterations : int
        Maximum number of iterations
    tolerance : float
        Convergence threshold
    pass_weight : float
        Share of moves assumed to be passes (carries get 1 - pass_weight).
        0.7 is a rough global default; estimating the share per zone from
        the data would be more faithful.

    Returns
    -------
    array
        xT values for each zone
    """
    # Immediate shooting value; this term is the same in every iteration
    shot_value = shot_prob * goal_prob
    carry_weight = 1.0 - pass_weight

    # Initialize with shot values
    xT = shot_value.copy()

    for iteration in range(max_iterations):
        xT_old = xT.copy()

        # Calculate expected value of moving
        pass_value = pass_matrix @ xT
        carry_value = carry_matrix @ xT
        move_value = pass_weight * pass_value + carry_weight * carry_value

        # Update xT
        xT = shot_value + move_prob * move_value

        # Check convergence
        max_change = np.max(np.abs(xT - xT_old))
        if max_change < tolerance:
            print(f"Converged after {iteration + 1} iterations (max change: {max_change:.8f})")
            break
    else:
        # Runs only if the loop finished without hitting `break`
        print(f"Warning: Did not converge after {max_iterations} iterations")

    return xT

The value iteration algorithm works as follows:

Initialize xT values using only the immediate shooting value: xT_0(z) = P(shot|z) * P(goal|shot,z)
Update each zone's value by adding the expected future value from ball movements: xT_{n+1}(z) = P(shot|z) * P(goal|shot,z) + P(move|z) * E[xT_n(destination)]
Repeat until the maximum change between iterations falls below a tolerance threshold (typically 1e-6)

In practice, convergence typically occurs within 5-15 iterations. The speed of convergence depends on the grid resolution and the degree to which value propagates backward from the goal.

9.3.5 Practical Implementation Details

A complete xT implementation requires careful attention to several details:

def build_complete_xt_model(events_df, grid_size=(12, 8), pitch_dims=(120, 80)):
    """
    Build a complete xT model from raw event data.

    Parameters
    ----------
    events_df : pd.DataFrame
        Event data with columns: type, start_x, start_y, end_x, end_y, outcome
    grid_size : tuple
        Grid dimensions (columns, rows)
    pitch_dims : tuple
        Pitch dimensions in the coordinate system

    Returns
    -------
    dict
        Complete xT model with grid values and metadata
    """
    n_zones = grid_size[0] * grid_size[1]

    # Step 1: Calculate zone-level statistics
    shot_count = np.zeros(n_zones)
    goal_count = np.zeros(n_zones)
    move_count = np.zeros(n_zones)
    total_actions = np.zeros(n_zones)

    for _, event in events_df.iterrows():
        zone = zone_to_index(event['start_x'], event['start_y'], grid_size, pitch_dims)

        if event['type'] == 'Shot':
            shot_count[zone] += 1
            if event.get('outcome') == 'Goal':
                goal_count[zone] += 1
            total_actions[zone] += 1
        elif event['type'] in ['Pass', 'Carry']:
            move_count[zone] += 1
            total_actions[zone] += 1

    # Step 2: Calculate probabilities
    # epsilon only guards against division by zero in empty zones; it does not smooth the estimates
    epsilon = 1e-8
    shot_prob = shot_count / (total_actions + epsilon)
    goal_prob = np.divide(goal_count, shot_count, where=shot_count > 0, out=np.zeros_like(goal_count))
    move_prob = move_count / (total_actions + epsilon)

    # Step 3: Build transition matrices
    pass_matrix, carry_matrix = build_transition_matrices(events_df, grid_size)

    # Step 4: Solve for xT
    xT = calculate_xt_grid(shot_prob, goal_prob, move_prob,
                           pass_matrix, carry_matrix, grid_size)

    return {
        'xT': xT,
        'grid_size': grid_size,
        'pitch_dims': pitch_dims,
        'shot_prob': shot_prob,
        'goal_prob': goal_prob,
        'move_prob': move_prob,
        'n_actions': total_actions
    }
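
The model stores xT as a flat vector, which is convenient for matrix algebra but awkward to read. Because `zone_to_index` computes the flat index as `zy * grid_size[0] + zx`, reshaping to `(rows, columns)` recovers the pitch layout. A minimal sketch, with `xt_as_grid` as a hypothetical helper and the flat index itself used as stand-in values:

```python
import numpy as np

def xt_as_grid(xt_flat, grid_size=(12, 8)):
    """Reshape a flat xT vector to (rows, cols): row = y zone, column = x zone."""
    n_cols, n_rows = grid_size
    return np.asarray(xt_flat).reshape(n_rows, n_cols)

# Stand-in values: each zone holds its own flat index
flat = np.arange(96, dtype=float)
grid = xt_as_grid(flat)

zx, zy = 11, 4                      # last column (nearest the opponent's goal), middle row
print(grid.shape)                   # (8, 12)
print(grid[zy, zx] == flat[zy * 12 + zx])
```

With attacking direction normalized left-to-right, the rightmost columns of the reshaped array correspond to the zones closest to the opponent's goal.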

9.3.6 Handling Edge Cases

Several situations require special handling:

Zones with no data: Use smoothing or interpolation from neighbors. This is particularly important for corner zones and deep defensive positions where events are rare. A common approach is to add a small uniform prior to all zone counts before calculating probabilities.
Own penalty area: Very low values, primarily clearance zones. Some implementations set these to zero since possessions in your own box are almost always defensive situations.
Unsuccessful actions: Account for turnover probability. When a pass fails, the possession effectively ends for the team in question, so the action should be credited with zero future value (or a negative value, if you want to penalize turnovers). For carries that result in a loss of possession, the penalty should equal the xT of the zone where possession was lost.
Set pieces: May require separate treatment due to different spatial dynamics. Corner kicks, for instance, move the ball from a fixed position to the penalty area, creating a transition that does not reflect normal open-play dynamics. Some implementations build separate xT grids for open play and set pieces.
Defensive actions: The basic xT framework only values offensive ball movements. To incorporate defensive contributions, you would need to model the xT prevented by tackles, interceptions, and blocks, which requires a separate analytical framework.
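
For the sparse-zone problem, one variant of the pseudo-count idea is to shrink each zone's rate toward the overall rate, with the strength of the shrinkage set by a tunable constant. The sketch below is an assumption-laden illustration: `smoothed_rate` and `prior_strength` are hypothetical names, and the value 10 is arbitrary. Zones with little data land near the overall rate, while well-sampled zones barely move.

```python
import numpy as np

def smoothed_rate(successes, attempts, prior_strength=10.0):
    """Shrink per-zone rates toward the overall rate using pseudo-counts."""
    successes = np.asarray(successes, dtype=float)
    attempts = np.asarray(attempts, dtype=float)
    total = attempts.sum()
    overall = successes.sum() / total if total > 0 else 0.0
    # Equivalent to adding `prior_strength` imaginary attempts at the overall rate
    return (successes + prior_strength * overall) / (attempts + prior_strength)

# Toy goals and shots for four zones: empty, tiny sample, large sample, no goals
goals = np.array([0, 1, 30, 0])
shots = np.array([0, 2, 200, 50])

raw = np.divide(goals, shots, out=np.zeros(4), where=shots > 0)
rates = smoothed_rate(goals, shots)
for z in range(4):
    print(f"zone {z}: raw = {raw[z]:.3f}, smoothed = {rates[z]:.3f}")
```

The empty zone receives exactly the overall conversion rate, and the two-shot zone is pulled well below its raw 0.5. Interpolating from neighboring zones is an alternative when spatial structure matters more than the global average.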

Callout: Common Implementation Mistakes

When building your own xT model, watch out for these pitfalls:

- Not separating successful and unsuccessful actions. Unsuccessful passes should not contribute to the transition matrix because the ball does not actually arrive at the "destination"; it is intercepted or goes out of play.
- Including penalties and free kicks in open-play estimates. Set pieces create artificial transitions that distort the movement probabilities.
- Using too fine a grid. With insufficient data, a 24x16 grid will have many empty zones, creating noisy and unreliable estimates.
- Forgetting to normalize orientation. If some events have coordinates attacking left-to-right and others right-to-left, the model will be nonsensical. Always normalize all events to the same attacking direction.

9.4 Progressive Actions and Their xT Value

9.4.1 Carries vs Passes in the xT Framework

A key distinction in the xT framework is between passes and carries, which contribute differently to ball progression:

Passes move the ball between players across potentially large distances. They are high-risk (interception possible) but can bypass defensive lines entirely. A single pass can generate large xT if it moves the ball from midfield into the penalty area.

Carries (sometimes loosely called dribbles, although many data providers reserve "dribble" for a take-on past an opponent) involve a single player moving with the ball. They are generally lower-risk per meter of progression but slower and limited by the player's physical ability. Carries generate xT incrementally as the player advances.

def decompose_xt_by_action_type(events_df, xt_grid, grid_size=(12, 8)):
    """
    Decompose total xT generation into passes and carries.
    """
    pass_xt = 0
    carry_xt = 0
    pass_count = 0
    carry_count = 0

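    for _, event in events_df.iterrows():
        # Shots and other events have no usable end location, so skip them
        if event['type'] not in ('Pass', 'Carry'):
            continue

        start_zone = zone_to_index(event['start_x'], event['start_y'], grid_size)
        end_zone = zone_to_index(event['end_x'], event['end_y'], grid_size)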

        xt_added = xt_grid[end_zone] - xt_grid[start_zone]

        if event['type'] == 'Pass' and event.get('successful', True):
            pass_xt += xt_added
            pass_count += 1
        elif event['type'] == 'Carry':
            carry_xt += xt_added
            carry_count += 1

    return {
        'pass_xt_total': pass_xt,
        'carry_xt_total': carry_xt,
        'pass_count': pass_count,
        'carry_count': carry_count,
        'xt_per_pass': pass_xt / pass_count if pass_count > 0 else 0,
        'xt_per_carry': carry_xt / carry_count if carry_count > 0 else 0,
        'pass_share': pass_xt / (pass_xt + carry_xt) if (pass_xt + carry_xt) > 0 else 0
    }

In most leagues, passes account for roughly 60-70% of total positive xT generation, with carries accounting for 30-40%. However, this ratio varies dramatically by player profile. Dribbling-oriented wingers may generate 50% or more of their xT through carries, while deep-lying playmakers generate nearly all of theirs through passes.

The distinction between pass xT and carry xT is analytically important because it reveals different player profiles:

Pass-dominant xT generators (e.g., Toni Kroos, Luka Modric) advance play through vision and technique
Carry-dominant xT generators (e.g., Adama Traore, Neymar) advance play through dribbling and physical ability
Balanced contributors (e.g., Kevin De Bruyne) generate xT through both channels roughly equally

9.4.2 xT from Actions: The Basic Calculation

Once we have an xT grid, we can value individual actions:

xT_added(action) = xT(end_zone) - xT(start_zone)

For example:

- Pass from central midfield (xT=0.02) to edge of box (xT=0.10): xT added = +0.08
- Failed dribble losing possession: xT added = -0.02 (the possession xT lost)
- Backward pass to maintain possession: xT added might be slightly negative but acceptable
- Long ball from defense (xT=0.005) to attacking midfielder (xT=0.04): xT added = +0.035

The simplicity of this calculation is one of xT's great strengths. Once the grid is computed, evaluating any action requires only looking up two values and subtracting.
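
The two-lookup calculation also vectorizes cleanly, which matters when valuing every move in a season. The sketch below assumes a DataFrame of successful moves with `start_x`, `start_y`, `end_x`, `end_y` columns and uses a made-up xT grid that rises linearly toward the opponent's goal; `flat_zone` and `add_xt_column` are hypothetical helpers, and failed actions would need the separate handling described earlier.

```python
import numpy as np
import pandas as pd

def flat_zone(x, y, grid_size=(12, 8), pitch_dims=(120, 80)):
    """Vectorized zone lookup for NumPy arrays: flat index = zy * n_cols + zx."""
    zx = np.clip((x / pitch_dims[0] * grid_size[0]).astype(int), 0, grid_size[0] - 1)
    zy = np.clip((y / pitch_dims[1] * grid_size[1]).astype(int), 0, grid_size[1] - 1)
    return zy * grid_size[0] + zx

def add_xt_column(moves, xt_grid, grid_size=(12, 8), pitch_dims=(120, 80)):
    """Return a copy of `moves` with xt_added = xT(end zone) - xT(start zone)."""
    out = moves.copy()
    start = flat_zone(out["start_x"].to_numpy(dtype=float),
                      out["start_y"].to_numpy(dtype=float), grid_size, pitch_dims)
    end = flat_zone(out["end_x"].to_numpy(dtype=float),
                    out["end_y"].to_numpy(dtype=float), grid_size, pitch_dims)
    out["xt_added"] = xt_grid[end] - xt_grid[start]
    return out

# Made-up grid: xT depends only on the column and grows by 0.03 per column
toy_grid = np.tile(np.linspace(0.0, 0.33, 12), 8)

moves = pd.DataFrame({
    "start_x": [55.0, 65.0], "start_y": [40.0, 40.0],
    "end_x": [105.0, 45.0], "end_y": [40.0, 40.0],
})
valued = add_xt_column(moves, toy_grid)
print(valued[["start_x", "end_x", "xt_added"]])
```

The forward pass crosses five columns and gains 0.15 on this toy grid, while the backward pass crosses two columns and loses 0.06.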

9.5 Player Valuation Using xT

9.5.1 Calculating Player xT

For each player, sum their xT contributions:

def calculate_player_xt(events_df, xt_grid, grid_size=(12, 8)):
    """Calculate xT added by each player."""
    player_xt = {}

    for _, event in events_df.iterrows():
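        player = event['player']

        # Only ball-moving actions have a start and end zone to compare
        if pd.isna(player) or event['type'] not in ('Pass', 'Carry'):
            continue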

        # Get zones
        start_zone = zone_to_index(event['start_x'], event['start_y'], grid_size)
        end_zone = zone_to_index(event['end_x'], event['end_y'], grid_size)

        # Calculate xT added
        xt_added = xt_grid[end_zone] - xt_grid[start_zone]

        # Account for unsuccessful actions
        if not event.get('successful', True):
            xt_added = -xt_grid[start_zone]  # Lost possession value

        # Accumulate
        if player not in player_xt:
            player_xt[player] = {'xt_added': 0, 'actions': 0,
                                  'xt_passes': 0, 'xt_carries': 0,
                                  'pass_count': 0, 'carry_count': 0}

        player_xt[player]['xt_added'] += xt_added
        player_xt[player]['actions'] += 1

        if event['type'] == 'Pass':
            player_xt[player]['xt_passes'] += xt_added
            player_xt[player]['pass_count'] += 1
        elif event['type'] == 'Carry':
            player_xt[player]['xt_carries'] += xt_added
            player_xt[player]['carry_count'] += 1

    return player_xt

9.5.2 xT per 90 Minutes

Normalize by playing time:

def calculate_xt_per_90(player_xt, player_minutes):
    """Normalize xT to per-90-minute rate."""
    results = []

    for player, data in player_xt.items():
        minutes = player_minutes.get(player, 0)

        if minutes >= 450:  # Minimum 5 full matches
            xt_per_90 = data['xt_added'] / (minutes / 90)
            actions_per_90 = data['actions'] / (minutes / 90)

            results.append({
                'player': player,
                'xt_total': data['xt_added'],
                'xt_per_90': xt_per_90,
                'actions': data['actions'],
                'actions_per_90': actions_per_90,
                'xt_per_action': data['xt_added'] / data['actions'],
                'xt_passes_per90': data['xt_passes'] / (minutes / 90),
                'xt_carries_per90': data['xt_carries'] / (minutes / 90),
                'minutes': minutes
            })

    return pd.DataFrame(results).sort_values('xt_per_90', ascending=False)

9.5.3 Decomposing xT by Action Type

Break down contributions to understand player profiles:

def decompose_player_xt(events_df, xt_grid, player_name, grid_size=(12, 8)):
    """Decompose a player's xT into action types."""
    player_events = events_df[events_df['player'] == player_name]

    decomposition = {
        'passes': {'xt': 0, 'count': 0},
        'carries': {'xt': 0, 'count': 0},
        'successful_actions': {'xt': 0, 'count': 0},
        'failed_actions': {'xt': 0, 'count': 0}
    }

    for _, event in player_events.iterrows():
        start_zone = zone_to_index(event['start_x'], event['start_y'], grid_size)
        end_zone = zone_to_index(event['end_x'], event['end_y'], grid_size)

        if event.get('successful', True):
            xt_added = xt_grid[end_zone] - xt_grid[start_zone]
            decomposition['successful_actions']['xt'] += xt_added
            decomposition['successful_actions']['count'] += 1
        else:
            xt_added = -xt_grid[start_zone]
            decomposition['failed_actions']['xt'] += xt_added
            decomposition['failed_actions']['count'] += 1

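        # Map the event type to its decomposition key; naive pluralization
        # ('pass' + 's', 'carry' + 's') would never match 'passes' or 'carries'
        action_type = {'Pass': 'passes', 'Carry': 'carries'}.get(event['type'])
        if action_type is not None:
            decomposition[action_type]['xt'] += xt_added
            decomposition[action_type]['count'] += 1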

    return decomposition

9.5.4 Team-Level xT Analysis

Aggregate to team level for tactical analysis:

def calculate_team_xt(events_df, xt_grid, grid_size=(12, 8)):
    """Calculate team xT metrics."""
    team_data = {}

    for team in events_df['team'].unique():
        team_events = events_df[events_df['team'] == team]

        xt_total = 0
        by_zone = {}

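        for _, event in team_events.iterrows():
            # Only passes and carries move the ball between zones
            if event['type'] not in ('Pass', 'Carry'):
                continue

            start_zone = zone_to_index(event['start_x'], event['start_y'], grid_size)
            end_zone = zone_to_index(event['end_x'], event['end_y'], grid_size)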

            if event.get('successful', True):
                xt_added = xt_grid[end_zone] - xt_grid[start_zone]
            else:
                xt_added = -xt_grid[start_zone]

            xt_total += xt_added

            # Track by start zone for spatial analysis
            zone_key = (start_zone // grid_size[0], start_zone % grid_size[0])
            if zone_key not in by_zone:
                by_zone[zone_key] = 0
            by_zone[zone_key] += xt_added

        matches = team_events['match_id'].nunique()

        team_data[team] = {
            'xt_total': xt_total,
            'xt_per_match': xt_total / matches,
            'matches': matches,
            'xt_by_zone': by_zone
        }

    return team_data

9.6 Progressive Passes and Carries

9.6.1 Defining Progressive Actions

While xT provides a continuous measure of value added, progressive actions offer a simpler, more interpretable metric. A pass or carry is typically considered "progressive" if it:

Standard Definition (Wyscout):

- Moves the ball at least 25% closer to the opponent's goal
- Or enters the penalty area from outside it

Alternative Definition (distance-based):

- Forward passes of at least 30 meters toward goal
- Or passes into the final third from outside it

FBref/StatsBomb Definition:

- Completed passes that move the ball toward the opponent's goal line by at least 10 yards from its furthest point in the last six passes, or any completed pass into the penalty area; passes from the defending 40% of the pitch are excluded

def is_progressive_pass(start_x, start_y, end_x, end_y):
    """Determine if a pass is progressive (25% closer to goal rule)."""
    # Distance to goal from start and end
    goal_x, goal_y = 120, 40  # Standard coordinates

    start_dist = np.sqrt((goal_x - start_x)**2 + (goal_y - start_y)**2)
    end_dist = np.sqrt((goal_x - end_x)**2 + (goal_y - end_y)**2)

    # Check if 25% closer
    return end_dist < 0.75 * start_dist

def is_progressive_carry(start_x, start_y, end_x, end_y, min_distance=10):
    """Determine if a carry is progressive."""
    # Must move ball forward significantly
    forward_progress = end_x - start_x

    # Distance to goal check (same as pass)
    goal_x, goal_y = 120, 40
    start_dist = np.sqrt((goal_x - start_x)**2 + (goal_y - start_y)**2)
    end_dist = np.sqrt((goal_x - end_x)**2 + (goal_y - end_y)**2)

    return forward_progress >= min_distance and end_dist < 0.75 * start_dist
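The functions above apply only the distance criterion, while the definitions listed earlier also count entries into the penalty area. A minimal sketch of the combined rule follows. It assumes a 120x80 StatsBomb-style pitch attacked left to right, with the attacking penalty area spanning x >= 102 and 18 <= y <= 62; the names `in_penalty_area` and `is_progressive_with_box_entry` are hypothetical helpers, not part of any provider's API.

```python
import math

# Assumed 120x80 StatsBomb-style coordinates, attacking left to right.
GOAL_X, GOAL_Y = 120.0, 40.0
BOX_X_MIN, BOX_Y_MIN, BOX_Y_MAX = 102.0, 18.0, 62.0


def in_penalty_area(x, y):
    """True if (x, y) lies inside the attacking penalty area."""
    return x >= BOX_X_MIN and BOX_Y_MIN <= y <= BOX_Y_MAX


def is_progressive_with_box_entry(start_x, start_y, end_x, end_y):
    """25%-closer rule OR entry into the penalty area from outside it."""
    start_dist = math.hypot(GOAL_X - start_x, GOAL_Y - start_y)
    end_dist = math.hypot(GOAL_X - end_x, GOAL_Y - end_y)
    closer = end_dist < 0.75 * start_dist
    box_entry = (not in_penalty_area(start_x, start_y)
                 and in_penalty_area(end_x, end_y))
    return closer or box_entry
```

A short pass from (100, 40) to (103, 40) fails the 25% test but still qualifies because it crosses into the box, which is the case the second rule exists to catch.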

9.6.2 Progressive Passing Metrics

Key metrics for progressive passing:

Metric	Definition	Typical Values (CM)
Progressive passes	Count of progressive passes	5-15 per 90
Progressive pass distance	Total forward distance	200-400m per 90
Progressive pass %	Progressive / total passes	5-15%
Passes into final third	Passes ending in final third	3-10 per 90
Passes into penalty area	Passes ending in box	1-4 per 90

9.6.3 Progressive Carrying Metrics

Key metrics for progressive carries:

Metric	Definition	Typical Values
Progressive carries	Count of progressive carries	3-10 per 90
Progressive carry distance	Total forward distance	100-300m per 90
Carries into final third	Carries ending in final third	1-5 per 90
Carries into penalty area	Carries ending in box	0-2 per 90
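The zone-entry rows in the two tables above can be counted with the same logic for passes and carries. The sketch below is one possible implementation under stated assumptions: a 120x80 pitch, a final third that starts at x = 80, a penalty area at x >= 102 and 18 <= y <= 62, and the convention that an "entry" must start outside the zone. The function name `count_zone_entries` and the column names are illustrative.

```python
import pandas as pd

FINAL_THIRD_X = 80.0  # assumed start of the final third on a 120-unit pitch


def _in_box(x: pd.Series, y: pd.Series) -> pd.Series:
    """Vectorized penalty-area test (assumed box: x >= 102, 18 <= y <= 62)."""
    return (x >= 102.0) & y.between(18.0, 62.0)


def count_zone_entries(actions: pd.DataFrame, minutes_played: float) -> dict:
    """Per-90 counts of actions that move the ball into the final third or box.

    `actions` holds one action type (passes or carries) with columns
    start_x, start_y, end_x, end_y.
    """
    ft_entry = (actions['start_x'] < FINAL_THIRD_X) & (actions['end_x'] >= FINAL_THIRD_X)
    box_entry = (~_in_box(actions['start_x'], actions['start_y'])
                 & _in_box(actions['end_x'], actions['end_y']))
    factor = 90 / minutes_played
    return {
        'into_final_third_90': int(ft_entry.sum()) * factor,
        'into_penalty_area_90': int(box_entry.sum()) * factor,
    }


# Illustrative actions, not real data
example_actions = pd.DataFrame({
    'start_x': [70.0, 85.0, 95.0], 'start_y': [40.0, 40.0, 30.0],
    'end_x': [85.0, 90.0, 105.0], 'end_y': [40.0, 40.0, 40.0],
})
entry_counts = count_zone_entries(example_actions, minutes_played=90)
```

If a provider counts every action that ends in the zone, regardless of where it started, dropping the `start` conditions reproduces that looser definition.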

9.6.4 Combining Progression Metrics

Total ball progression combines passes and carries:

def calculate_progression_metrics(events_df, player_name, minutes_played):
    """Calculate comprehensive progression metrics."""
    player_events = events_df[events_df['player'] == player_name]

    passes = player_events[player_events['type'] == 'Pass']
    carries = player_events[player_events['type'] == 'Carry']

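    def flag(df, rule):
        """Boolean Series marking rows that satisfy `rule`.

        Unlike DataFrame.apply(axis=1), this stays a Series when df is empty
        (e.g. a provider that records no carries).
        """
        flags = [bool(rule(sx, sy, ex, ey)) for sx, sy, ex, ey in
                 zip(df['start_x'], df['start_y'], df['end_x'], df['end_y'])]
        return pd.Series(flags, index=df.index, dtype=bool)

    # Progressive passes and carries as boolean masks
    prog_passes = flag(passes, is_progressive_pass)
    prog_carries = flag(carries, is_progressive_carry)

    # Forward (x-axis) distance covered by the progressive actions
    prog_pass_dist = (passes.loc[prog_passes, 'end_x']
                      - passes.loc[prog_passes, 'start_x']).sum()

    prog_carry_dist = (carries.loc[prog_carries, 'end_x']
                       - carries.loc[prog_carries, 'start_x']).sum()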

    # Normalize to per 90
    factor = 90 / minutes_played

    return {
        'progressive_passes_90': prog_passes.sum() * factor,
        'progressive_carries_90': prog_carries.sum() * factor,
        'total_progressive_actions_90': (prog_passes.sum() + prog_carries.sum()) * factor,
        'progressive_pass_distance_90': prog_pass_dist * factor,
        'progressive_carry_distance_90': prog_carry_dist * factor,
        'total_progressive_distance_90': (prog_pass_dist + prog_carry_dist) * factor
    }

Callout: xT vs. Progressive Actions -- When to Use Each

Use xT when:
- You need a continuous, granular measure of value added
- You want to compare the value of different actions on the same scale
- You are building a player valuation model or scouting tool

Use progressive actions when:
- You need metrics that are easy to communicate to coaches and scouts
- You want a quick proxy for ball progression ability
- You are working with limited data or computational resources

In practice, xT per 90 and progressive actions per 90 correlate highly (r > 0.8 for most positions), meaning either can serve as a reasonable proxy for the other.
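The strength of that relationship is easy to check on your own data rather than take on trust. The sketch below computes the Pearson correlation within each position group; the table `players`, its column names, and every value in it are illustrative assumptions, not real measurements.

```python
import pandas as pd

# Hypothetical per-player table; the numbers are made up for illustration.
players = pd.DataFrame({
    'player': ['P1', 'P2', 'P3', 'P4', 'P5', 'P6'],
    'position': ['CM', 'CM', 'CM', 'CB', 'CB', 'CB'],
    'xt_per_90': [0.25, 0.31, 0.40, 0.10, 0.14, 0.18],
    'progressive_actions_90': [9.0, 12.0, 14.5, 4.0, 6.0, 7.5],
})


def correlation_by_position(df: pd.DataFrame) -> dict:
    """Pearson r between xT per 90 and progressive actions per 90, per position."""
    return {
        pos: grp['xt_per_90'].corr(grp['progressive_actions_90'])
        for pos, grp in df.groupby('position')
    }


position_r = correlation_by_position(players)
```

With a real squad list, apply a minimum-minutes filter first, since small samples can push the correlation in either direction.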

9.7 Comparison with VAEP, EPV, and Other Possession Value Models

9.7.1 VAEP (Valuing Actions by Estimating Probabilities)

VAEP, developed by Tom Decroos and colleagues at KU Leuven, takes a different approach. Instead of using a grid, VAEP uses machine learning to directly predict how each action changes the probability of:

1. Scoring in the next 10 actions
2. Conceding in the next 10 actions

$$VAEP(a) = \Delta P_{scoring}(a) - \Delta P_{conceding}(a)$$
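As a worked illustration of this formula, the sketch below takes the scoring and conceding probabilities before and after an action and returns the action's value. In a real VAEP pipeline these probabilities come from trained classifiers; here they are hard-coded, and the names `StateProbs` and `vaep_value` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StateProbs:
    """Estimated probabilities for one game state (illustrative inputs)."""
    p_score: float    # P(team scores within the next 10 actions)
    p_concede: float  # P(team concedes within the next 10 actions)


def vaep_value(before: StateProbs, after: StateProbs) -> float:
    """Change in scoring probability minus change in conceding probability."""
    delta_score = after.p_score - before.p_score
    delta_concede = after.p_concede - before.p_concede
    return delta_score - delta_concede


# A pass that raises scoring chance from 2% to 5% and trims conceding risk
# from 1.0% to 0.8% (made-up numbers)
example_value = vaep_value(StateProbs(0.02, 0.010), StateProbs(0.05, 0.008))
```

Because the conceding term is subtracted, an action that lowers the risk of conceding adds value even when the scoring probability is unchanged.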

Advantages over xT:
- Accounts for action-specific features (pass type, body part, pressure)
- Includes defensive value (preventing conceding)
- Can handle complex sequences
- Values the action itself, not just the positional change

Disadvantages:
- More complex to implement
- Requires more features and training data
- Less interpretable "black box"
- Harder to explain to non-technical stakeholders

One important distinction between xT and VAEP is how they handle unsuccessful actions. In xT, a failed pass is typically penalized by subtracting the xT of the starting zone, but this penalty is coarse--it does not account for how dangerous the turnover location is for the opponent's counter-attack. VAEP, by contrast, explicitly models the probability of conceding after a turnover, meaning that a failed pass in your own defensive third is penalized far more heavily than one in the opponent's half. This makes VAEP more suitable for evaluating players whose style involves risk-taking in dangerous areas.

VAEP has been particularly successful in academic research and has been adopted by several professional clubs. The SPADL (Soccer Player Action Description Language) framework that accompanies VAEP provides a standardized way to represent soccer events across different data providers.

Callout: When VAEP and xT Disagree

Players who rank significantly differently under VAEP and xT often reveal interesting tactical profiles:
- A player ranking higher in VAEP than xT typically performs context-dependent actions well--they make different decisions under pressure versus in space, and VAEP's richer feature set captures this.
- A player ranking higher in xT than VAEP may be generating large spatial progressions but at a high turnover cost, which xT underweights but VAEP captures through its conceding probability component.

When the two frameworks disagree substantially on a player, it is worth investigating the underlying reasons, as the disagreement itself is analytically informative.

9.7.2 EPV (Expected Possession Value)

EPV, developed by Javier Fernandez and Luke Bornn, creates a continuous surface rather than a grid, using tracking data to account for:
- Player positions (both teams)
- Ball location and trajectory
- Space control and pressure
- Available passing options

EPV provides the most accurate possession valuation but requires tracking data, which is not always available. It answers the question: "Given the exact positions of all 22 players and the ball, what is the probability that this possession ends in a goal?"

The key advantage of EPV over xT is that it is context-dependent: the same ball position can have very different EPV values depending on where the defenders and attackers are positioned. xT, by contrast, assigns the same value to a zone regardless of the defensive structure around it.

9.7.3 xGChain and xGBuildup

As discussed in Chapter 8, xGChain and xGBuildup provide simpler alternatives that credit all players involved in possessions ending in shots. These metrics share xT's goal of valuing non-endpoint actions but use a fundamentally different approach: rather than calculating positional threat, they distribute shot xG backward through the possession chain.
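The backward distribution of shot xG can be sketched in a few lines of pandas. This is a simplified illustration, not a provider's exact method: it assumes each event carries a `possession_id`, that possession xG is the sum of shot xG within the possession, and that each player is credited once per possession. The function name and column names are hypothetical.

```python
import pandas as pd


def xg_chain_and_buildup(events: pd.DataFrame) -> pd.DataFrame:
    """Credit possession xG to every player involved (xGChain), and to players
    involved outside the shot and key pass (xGBuildup).

    Assumed columns: possession_id, player, type, xg (NaN for non-shots),
    is_key_pass (bool).
    """
    events = events.assign(xg=events['xg'].fillna(0.0))
    poss_xg = events.groupby('possession_id')['xg'].sum().rename('possession_xg')

    def credit(ev: pd.DataFrame, name: str) -> pd.Series:
        # One row per (possession, player) so repeat touches are not double counted
        involved = ev[['possession_id', 'player']].drop_duplicates()
        joined = involved.join(poss_xg, on='possession_id')
        return joined.groupby('player')['possession_xg'].sum().rename(name)

    chain = credit(events, 'xg_chain')
    is_buildup = (events['type'] != 'Shot') & ~events['is_key_pass'].astype(bool)
    buildup = credit(events[is_buildup], 'xg_buildup')
    return pd.concat([chain, buildup], axis=1).fillna(0.0)


# Two toy possessions; only the first ends in a shot (illustrative data)
example_events = pd.DataFrame({
    'possession_id': [1, 1, 1, 2, 2],
    'player': ['A', 'B', 'C', 'A', 'B'],
    'type': ['Pass', 'Pass', 'Shot', 'Pass', 'Pass'],
    'xg': [None, None, 0.2, None, None],
    'is_key_pass': [False, True, False, False, False],
})
chain_table = xg_chain_and_buildup(example_events)
```

In the toy data, player A receives the full 0.2 under both metrics, while the key passer B and the shooter C receive it only under xGChain, which is the distinction the two metrics are meant to draw.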

9.7.4 Comparison of Frameworks

Framework	Data Required	Complexity	Interpretability	Accuracy	Best For
xT	Event data	Low	High	Moderate	Quick analysis, scouting
VAEP	Event data + features	Medium	Medium	Good	Detailed player evaluation
EPV	Tracking data	High	Low	High	Elite-level tactical analysis
xGChain/xGBuildup	Event data	Low	High	Limited	Build-up contribution
Progressive actions	Event data	Very Low	Very High	Low	Communication, basic scouting

9.7.5 Choosing a Framework

Select based on your resources and needs:

xT: Best for quick, interpretable analysis with event data only. Ideal for mid-tier clubs without tracking data or extensive data science resources.
VAEP: Better accuracy from event data, if you are willing to sacrifice some interpretability. Suitable for clubs with data science capacity who want more granular player evaluation.
EPV: Maximum accuracy with tracking data available. Used by elite clubs with comprehensive data infrastructure.
Progressive actions: Simplest approach, most transparent for communication with coaches and scouts.

Callout: The Practical Reality of Model Choice

In professional soccer analytics, the choice of possession value model often comes down to practical considerations rather than theoretical optimality:
- Data availability. Most clubs outside the top leagues do not have tracking data, ruling out EPV.
- Engineering resources. VAEP requires significant ML infrastructure; xT can be built in an afternoon.
- Communication needs. Coaches and sporting directors need to understand the numbers. xT and progressive actions are far easier to explain than VAEP or EPV.
- Update frequency. xT grids are stable across seasons and can be computed once. VAEP models require retraining as new data arrives.

9.8 Practical Applications

9.8.1 Scouting and Recruitment

xT excels at identifying players who progress the ball effectively:

Use Case: Finding Ball-Progressing Center-Backs

def scout_progressive_defenders(player_data, position='CB'):
    """Find defenders with high ball progression."""
    defenders = player_data[player_data['position'] == position]

    # Key metrics for progressive defenders
    metrics = defenders[[
        'player', 'team',
        'progressive_passes_90',
        'progressive_carries_90',
        'xt_per_90',
        'pass_completion_pct'
    ]].copy()

    # Composite score
    metrics['progression_score'] = (
        metrics['progressive_passes_90'].rank(pct=True) * 0.35 +
        metrics['progressive_carries_90'].rank(pct=True) * 0.25 +
        metrics['xt_per_90'].rank(pct=True) * 0.30 +
        metrics['pass_completion_pct'].rank(pct=True) * 0.10
    )

    return metrics.sort_values('progression_score', ascending=False)

This type of scouting query has become increasingly important as modern tactical systems demand center-backs who can initiate attacks from deep positions. A center-back ranking in the top 10% for xT per 90 among all center-backs is likely a strong ball-progressor who could thrive in a possession-oriented system.

Use Case: Identifying Undervalued Midfielders

Midfielders who generate high xT but low xA may be undervalued by traditional statistics. They consistently advance play into dangerous areas but do not always deliver the final pass. These players are often available at lower transfer fees because their contributions are invisible in headline statistics.

def find_undervalued_midfielders(player_data, min_minutes=1500):
    """
    Find midfielders with high ball progression but low assist numbers.
    These players may be undervalued in the market.
    """
    midfielders = player_data[
        (player_data['position'].isin(['CM', 'CDM', 'DM'])) &
        (player_data['minutes'] >= min_minutes)
    ].copy()

    midfielders['xt_per90'] = midfielders['xt_total'] / (midfielders['minutes'] / 90)
    midfielders['xa_per90'] = midfielders['xA'] / (midfielders['minutes'] / 90)

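    # High progression, low direct chance creation (0.01 guards against division by zero)
    midfielders['progression_ratio'] = midfielders['xt_per90'] / (midfielders['xa_per90'] + 0.01)

    # Keep above-median progressors (an illustrative cutoff), then rank by how far
    # xT outstrips xA so the ratio actually drives the shortlist
    strong = midfielders[midfielders['xt_per90'] >= midfielders['xt_per90'].median()]
    return strong.sort_values('progression_ratio', ascending=False)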

9.8.2 Tactical Analysis

Analyze how teams build attacks:

Use Case: Comparing Build-Up Patterns

def analyze_buildup_patterns(team_events, xt_grid, grid_size=(12, 8)):
    """Analyze team's xT generation by zone."""
    # Define zones
    zone_xt = {'defensive': 0, 'middle': 0, 'final': 0}
    zone_actions = {'defensive': 0, 'middle': 0, 'final': 0}

    for _, event in team_events.iterrows():
        start_zone = zone_to_index(event['start_x'], event['start_y'], grid_size)
        end_zone = zone_to_index(event['end_x'], event['end_y'], grid_size)

        if event.get('successful', True):
            xt_added = xt_grid[end_zone] - xt_grid[start_zone]
        else:
            xt_added = -xt_grid[start_zone]

        start_x = event['start_x']

        if start_x < 40:
            zone_xt['defensive'] += xt_added
            zone_actions['defensive'] += 1
        elif start_x < 80:
            zone_xt['middle'] += xt_added
            zone_actions['middle'] += 1
        else:
            zone_xt['final'] += xt_added
            zone_actions['final'] += 1

    # Percentage distribution
    total = sum(max(v, 0) for v in zone_xt.values())
    zone_pct = {k: max(v, 0)/total*100 if total > 0 else 0 for k, v in zone_xt.items()}

    return zone_xt, zone_pct, zone_actions

Teams with high defensive-third xT generation are skilled at building from the back. Teams with high middle-third xT generation excel at transitional play. Teams concentrated in the final third may rely on direct play or pressing high to win the ball in advanced positions.

9.8.3 Player Development

Track progression improvements over time:

def track_progression_development(player_name, seasons_data):
    """Track a player's ball progression metrics over seasons."""
    development = []

    for season, data in seasons_data.items():
        metrics = calculate_progression_metrics(
            data['events'],
            player_name,
            data['minutes']
        )
        metrics['season'] = season
        development.append(metrics)

    return pd.DataFrame(development)

Young players who show improving xT and progressive action numbers across consecutive seasons are likely developing their ability to influence the game at higher levels. This trajectory data is valuable for academy player evaluation and loan decisions.

9.8.4 Match Analysis

Use xT to analyze specific matches:

def analyze_match_xt(match_events, xt_grid, grid_size=(12, 8)):
    """Analyze xT generation throughout a match."""
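    # By time period; the last bucket runs to 120 so that second-half
    # stoppage time (minute >= 90) is not silently dropped
    time_periods = [(0, 15), (15, 30), (30, 45), (45, 60), (60, 75), (75, 120)]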

    analysis = {}
    for start, end in time_periods:
        period_events = match_events[
            (match_events['minute'] >= start) &
            (match_events['minute'] < end)
        ]

        for team in period_events['team'].unique():
            team_events = period_events[period_events['team'] == team]
            xt_total = 0
            for _, event in team_events.iterrows():
                sz = zone_to_index(event['start_x'], event['start_y'], grid_size)
                ez = zone_to_index(event['end_x'], event['end_y'], grid_size)
                if event.get('successful', True):
                    xt_total += xt_grid[ez] - xt_grid[sz]
                else:
                    xt_total -= xt_grid[sz]

            key = f"{team}_{start}-{end}"
            analysis[key] = xt_total

    return analysis

9.9 Visualization Techniques

9.9.1 xT Heatmaps

Visualize the xT grid:

def plot_xt_grid(xt_values, grid_size=(12, 8)):
    """Plot xT grid as heatmap."""
    fig, ax = plt.subplots(figsize=(14, 9))

    # Reshape to grid
    xt_matrix = xt_values.reshape(grid_size[1], grid_size[0])

    # Draw pitch background
    draw_pitch(ax)

    # Overlay heatmap
    extent = [0, 120, 0, 80]
    im = ax.imshow(xt_matrix, extent=extent, origin='lower',
                   cmap='RdYlGn', alpha=0.7, aspect='auto')

    # Colorbar
    cbar = plt.colorbar(im, ax=ax, shrink=0.8)
    cbar.set_label('Expected Threat', fontsize=12)

    ax.set_title('Expected Threat (xT) Grid', fontsize=14)

    return fig

9.9.2 Player xT Maps

Show where players generate xT:

def plot_player_xt_zones(player_events, xt_grid, grid_size=(12, 8)):
    """Plot where a player adds xT."""
    fig, ax = plt.subplots(figsize=(14, 9))
    draw_pitch(ax)

    # Calculate xT added per zone
    zone_xt = {}
    for _, event in player_events.iterrows():
        start_zone = get_zone(event['start_x'], event['start_y'])
        sz_idx = zone_to_index(event['start_x'], event['start_y'], grid_size)
        ez_idx = zone_to_index(event['end_x'], event['end_y'], grid_size)

        if event.get('successful', True):
            xt_added = xt_grid[ez_idx] - xt_grid[sz_idx]
        else:
            xt_added = -xt_grid[sz_idx]

        if start_zone not in zone_xt:
            zone_xt[start_zone] = 0
        zone_xt[start_zone] += xt_added

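    # Plot as bubbles; cell size is derived from the grid rather than hard-coded
    # (assumes get_zone returns (x_index, y_index) on a 120x80 pitch)
    cell_w, cell_h = 120 / grid_size[0], 80 / grid_size[1]
    for zone, xt in zone_xt.items():
        x_center = (zone[0] + 0.5) * cell_w  # Convert zone to coords
        y_center = (zone[1] + 0.5) * cell_h

        color = 'green' if xt > 0 else 'red'
        size = abs(xt) * 2000

        ax.scatter(x_center, y_center, s=size, c=color, alpha=0.5)

    ax.set_title('xT Generation by Zone', fontsize=14)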

    return fig

9.9.3 Progressive Action Maps

Visualize progressive passes and carries:

def plot_progressive_actions(player_events, action_type='Pass'):
    """Plot progressive passes or carries."""
    fig, ax = plt.subplots(figsize=(14, 9))
    draw_pitch(ax)

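    actions = player_events[player_events['type'] == action_type]

    # Use the rule that matches the action type (carries have their own criterion)
    is_prog_fn = is_progressive_carry if action_type == 'Carry' else is_progressive_pass

    for _, action in actions.iterrows():
        is_prog = is_prog_fn(
            action['start_x'], action['start_y'],
            action['end_x'], action['end_y']
        )

        if is_prog:
            ax.annotate('',
                       xy=(action['end_x'], action['end_y']),
                       xytext=(action['start_x'], action['start_y']),
                       arrowprops=dict(arrowstyle='->', color='blue',
                                      lw=1, alpha=0.7))

    # Pluralize correctly ('Passes', 'Carries') instead of appending 'es'
    plural = {'Pass': 'Passes', 'Carry': 'Carries'}.get(action_type, f'{action_type} actions')
    ax.set_title(f'Progressive {plural}', fontsize=14)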

    return fig

9.10 Limitations and Considerations

9.10.1 xT Limitations

The basic xT model has several known limitations:

1. Ignores defensive positioning. This is the most significant limitation. A pass into the box is not equally valuable against 2 defenders versus 5. xT assigns the same value to a zone regardless of whether the space is open or heavily defended. This means xT can overvalue actions against deep blocks (where defenders are concentrated in high-xT zones) and undervalue actions in transition (where the defense is disorganized but the ball may be in a lower-xT zone).

2. Context-independent. The same position has the same value regardless of game state. Being in the opponent's box at 0-0 in the 10th minute is valued identically to being there at 3-0 up in the 89th minute, even though the tactical context is completely different.

3. Backward passes undervalued. Necessary possession retention appears negative in xT because backward passes move the ball from higher-xT to lower-xT zones. When a midfielder recycles the ball to a center-back (a negative xT action for the midfielder) and the center-back then plays a progressive pass forward (a positive xT action), the sequence nets out to a modest positive, yet the midfielder is debited for a backward pass that was essential for repositioning and creating the subsequent opportunity.

4. Set pieces. Corner kicks don't fit the position-based framework well. A corner kick moves the ball from a relatively low-xT zone (the corner) to a high-xT zone (the penalty area), generating artificial xT that does not reflect the same type of attacking progression as open play.

5. Does not capture off-ball movement. xT measures only on-ball actions. A striker's intelligent run that drags a defender out of position, creating space for a teammate, generates zero xT but may be more valuable than many on-ball actions.

9.10.2 Practical Limitations of Coarse Grid Resolution

The choice of grid resolution introduces a trade-off that deserves explicit attention. When zones are too coarse--as in the common 12x8 grid--actions within the same zone receive zero xT credit even if they meaningfully change the ball's position. For example, in a 12x8 grid each zone covers approximately 10 meters by 10 meters. A carry from the back edge of a zone to the front edge covers nearly 10 meters of forward progression yet generates exactly zero xT because the start and end zones are identical. This "within-zone blindness" systematically undervalues short progressive actions, particularly carries and one-two passing sequences that advance the ball in small increments. In the penalty area, where even 2-3 meters of positional difference can dramatically change the shooting angle and expected goal probability, this problem is especially acute. A player who dribbles from the corner of the box to a central position 8 meters away may cross from one high-xT zone into an adjacent high-xT zone with similar values, receiving minimal credit for what was actually a very valuable action. Analysts should be aware that coarse grids tend to favor players who make long-range progressions (large zone-to-zone jumps) over those who make incremental but cumulatively valuable short progressions.

Callout: Mitigating Grid Resolution Problems

If you are restricted to a coarse grid due to limited data, two practical workarounds can help:
- Interpolation: Instead of using discrete zone lookups, interpolate xT values based on the exact coordinates within each zone. This converts the step-function xT surface into a smooth gradient and ensures that every forward movement receives proportional credit.
- Hybrid metrics: Combine xT with progressive distance metrics, which do not suffer from grid resolution issues. A player's total progressive carry distance captures value that coarse xT misses.
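The within-zone blindness described above, and the interpolation workaround, can be shown on a toy grid. The sketch below uses bilinear interpolation between cell centers, which is one common choice and not the only one. It assumes the grid is stored as a (rows, cols) NumPy array on a 120x80 pitch; the function names and the linearly increasing toy values are illustrative.

```python
import numpy as np


def zone_xt(xt_matrix, x, y, pitch=(120.0, 80.0)):
    """Discrete lookup: xT of the cell that contains (x, y)."""
    rows, cols = xt_matrix.shape
    col = min(int(x / pitch[0] * cols), cols - 1)
    row = min(int(y / pitch[1] * rows), rows - 1)
    return xt_matrix[row, col]


def interpolated_xt(xt_matrix, x, y, pitch=(120.0, 80.0)):
    """Bilinear interpolation of xT between neighboring cell centers."""
    rows, cols = xt_matrix.shape
    # Grid coordinates in which cell centers sit at integer positions
    gx = np.clip(x / pitch[0] * cols - 0.5, 0, cols - 1)
    gy = np.clip(y / pitch[1] * rows - 0.5, 0, rows - 1)
    c0, r0 = int(np.floor(gx)), int(np.floor(gy))
    c1, r1 = min(c0 + 1, cols - 1), min(r0 + 1, rows - 1)
    fx, fy = gx - c0, gy - r0
    near_row = (1 - fx) * xt_matrix[r0, c0] + fx * xt_matrix[r0, c1]
    far_row = (1 - fx) * xt_matrix[r1, c0] + fx * xt_matrix[r1, c1]
    return (1 - fy) * near_row + fy * far_row


# Toy 8x12 grid whose value rises by 0.01 per column (not a fitted xT surface)
toy_grid = np.tile(np.linspace(0.0, 0.11, 12), (8, 1))

# A 9-unit carry that starts and ends inside the same zone
discrete_gain = zone_xt(toy_grid, 69.5, 40) - zone_xt(toy_grid, 60.5, 40)
smooth_gain = interpolated_xt(toy_grid, 69.5, 40) - interpolated_xt(toy_grid, 60.5, 40)
```

The discrete lookup assigns the carry no credit because both endpoints fall in the same cell, while the interpolated surface credits it in proportion to the distance covered.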

9.10.3 Extensions and Improvements to Basic xT

Several extensions have been proposed to address xT's limitations:

Context-dependent xT. Rather than a single xT grid, some implementations build separate grids for different game states (open play vs. set pieces, score margin, time period). This partially addresses the context-independence limitation.

Risk-adjusted xT. By incorporating turnover probabilities more explicitly, risk-adjusted xT penalizes high-risk actions more heavily. A player who generates 0.10 xT per successful pass but loses the ball 30% of the time looks different from one who generates 0.08 xT per pass with a 5% turnover rate.

Directional xT. Rather than a scalar value per zone, directional xT assigns different values depending on the direction of the incoming action. A pass arriving from the wing creates a different threat profile than one arriving from a central position, even if both end in the same zone.

Opponent-adjusted xT. By computing xT grids specific to each opponent's defensive structure, analysts can better estimate the actual threat generated against different defensive styles.
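The risk-adjusted comparison above can be made concrete with a little arithmetic. The sketch below computes the expected net xT per attempted pass using this chapter's failed-action penalty (subtracting the xT of the starting zone). It assumes both quoted gains are per successful pass and that both players start from a zone worth 0.02 xT; that starting value and the function name `expected_net_xt` are assumptions made for illustration.

```python
def expected_net_xt(gain_per_success, turnover_rate, start_zone_xt):
    """Expected xT per attempt: successes add the gain, failures cost the start-zone xT."""
    return (1 - turnover_rate) * gain_per_success - turnover_rate * start_zone_xt


START_ZONE_XT = 0.02  # assumed xT of the zone both players pass from

risky_player = expected_net_xt(0.10, 0.30, START_ZONE_XT)  # 0.10 xT per success, 30% turnovers
safe_player = expected_net_xt(0.08, 0.05, START_ZONE_XT)   # 0.08 xT per success, 5% turnovers
```

Under these assumptions the lower-risk passer comes out ahead per attempt (about 0.075 versus 0.064), even though the higher-risk passer gains more on each completed pass.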

9.10.4 Data Quality Issues

Coordinate accuracy: Small errors compound across many actions. If the data provider's coordinate system has systematic biases (e.g., consistently placing events 1-2 meters from their true location), xT calculations will be affected.
Carry detection: Some providers do not capture carries well. If carries are underreported, the xT model will underweight their contribution.
Definition inconsistencies: "Progressive" varies by provider and by analyst. Always be explicit about which definition you are using.

9.10.5 Interpretation Cautions

Position matters: A fullback with high xT might just take lots of crosses from wide positions, which inflates their xT through sheer volume rather than genuine quality of progression.
Role dependence: Defensive midfielders naturally generate less xT because they operate in zones with lower xT differentials. Compare within positions, never across positions.
Risk profiles: High xT can come with high turnover rates. A player who attempts many ambitious passes will generate high gross xT but may also lose possession frequently. Always look at net xT (successful actions minus failed actions).
Team context: Playing for a possession-heavy team inflates raw xT because there are simply more on-ball actions to accumulate xT from.

9.10.6 Best Practices

Always normalize by playing time (per 90)
Compare within positions, not across positions
Consider xT alongside success rates (do not reward high-risk, low-success)
Use multiple metrics together for full picture
Account for team playing style and league context
Separate open-play xT from set-piece xT where possible
Look at both gross xT (from successful actions) and net xT (including penalties for turnovers)
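Several of these practices can be combined in one summary per player. The sketch below reports gross and net xT per 90 alongside success rate and net xT per action. It assumes each action row already carries the xT of its start and end zones; the column names and the function name `xt_summary` are hypothetical.

```python
import pandas as pd


def xt_summary(actions: pd.DataFrame, minutes_played: float) -> dict:
    """Gross/net xT per 90 plus efficiency measures.

    Assumed columns: xt_start, xt_end, successful (bool).
    """
    ok = actions['successful'].astype(bool)
    # Gross: xT added by successful actions only
    gained = (actions['xt_end'] - actions['xt_start']).where(ok, 0.0)
    # Penalty: start-zone xT forfeited on each failed action
    lost = actions['xt_start'].where(~ok, 0.0)

    gross = gained.sum()
    net = gross - lost.sum()
    n_actions = max(len(actions), 1)  # avoid dividing by zero for empty input
    per_90 = 90 / minutes_played
    return {
        'gross_xt_90': gross * per_90,
        'net_xt_90': net * per_90,
        'net_xt_per_action': net / n_actions,
        'success_rate': ok.sum() / n_actions,
    }


# Three illustrative actions: a forward pass, a backward pass, a turnover
example = pd.DataFrame({
    'xt_start': [0.02, 0.05, 0.04],
    'xt_end': [0.05, 0.03, float('nan')],
    'successful': [True, True, False],
})
summary = xt_summary(example, minutes_played=90)
```

Reporting gross and net side by side shows how much of a player's apparent threat is paid back through turnovers.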

Callout: The Complementary Metric Stack

For comprehensive player evaluation, combine xT with other metrics:
- xT per 90 + xG per 90 + xA per 90 gives a complete attacking picture
- Progressive passes/90 + Progressive carries/90 provides interpretable context for xT numbers
- Pass completion % + xT per action distinguishes efficient progressors from reckless ones
- Defensive actions per 90 rounds out the picture for midfielders and defenders

9.11 Chapter Summary

Expected Threat (xT) and ball progression metrics fill a crucial gap in soccer analytics by valuing actions that occur before shooting opportunities emerge. Key takeaways:

xT originated from Karun Singh's work formalizing the concept of positional value on the pitch, building on earlier ideas from Sarah Rudd and others
The zone-based value surface assigns threat values to every pitch position based on the probability of that position leading to a goal
The Markov chain approach models ball movement as a stochastic process, propagating value backward from the goal through value iteration
Progressive actions (passes and carries) offer a simpler alternative to xT for measuring ball progression
Carries vs passes contribute differently to xT, revealing distinct player profiles (dribblers vs. passers)
Player valuation using xT makes visible the contributions of deep-lying playmakers, ball-playing defenders, and other traditionally undervalued positions
Comparison with VAEP and EPV reveals trade-offs between complexity, accuracy, interpretability, and data requirements
Practical implementation requires careful handling of grid resolution, data quality, unsuccessful actions, and set pieces
Limitations include context-independence, inability to capture defensive positioning, and undervaluation of backward passes
Extensions like context-dependent xT and risk-adjusted xT address some of these limitations

In the next chapter, we will explore Passing Networks and Analysis, building on these concepts to understand team structure and playing patterns through network science approaches.

Key Terminology

Term	Definition
Expected Threat (xT)	Probability that possession in a zone leads to a goal
xT Grid	Division of pitch into zones with assigned threat values
Transition Matrix	Probability matrix showing ball movement between zones
Value Iteration	Algorithm for solving xT values recursively
Progressive Pass	Pass that moves ball significantly closer to goal
Progressive Carry	Ball carry that advances possession toward goal
VAEP	Machine learning framework valuing actions by goal probability change
EPV	Continuous possession value surface using tracking data
Markov Chain	Stochastic model where future states depend only on the current state
Context-Dependent xT	Extension of xT that varies values by game state

Key Formulas

xT Fundamental Equation: $$xT(z) = P(shot|z) \cdot P(goal|shot,z) + P(move|z) \cdot \sum_{z'} P(z'|z,move) \cdot xT(z')$$

xT Added by an Action: $$xT_{added} = xT(z_{end}) - xT(z_{start})$$

xT for Failed Actions: $$xT_{failed} = -xT(z_{start})$$

xT per 90: $$xT_{per90} = \frac{\sum xT_{added}}{Minutes / 90}$$

Progressive Pass Criterion: $$d_{end} < 0.75 \cdot d_{start}$$

Where $d$ is the distance from the point to the center of the opponent's goal.

Further Practice

Build an xT grid from one full season of StatsBomb data
Calculate xT per 90 for all midfielders and identify top progressors
Compare progressive passing leaders with xT leaders--explain differences
Analyze how a specific team generates xT by zone
Track a young player's ball progression development over multiple seasons
Build separate xT grids for home and away matches and compare the differences
Implement risk-adjusted xT that penalizes turnovers proportional to the xT of the zone where possession was lost

References

Singh, K. (2019). "Introducing Expected Threat (xT)." Karun.in.
Rudd, S. (2011). "A Framework for Tactical Analysis and Individual Offensive Production Assessment in Soccer Using Markov Chains." New England Symposium on Statistics in Sports (NESSIS).
Decroos, T. et al. (2019). "Actions Speak Louder than Goals: Valuing Player Actions in Soccer." KDD Conference.
Fernandez, J. & Bornn, L. (2018). "Wide Open Spaces: A Statistical Technique for Measuring Space Creation in Professional Soccer." MIT Sloan Sports Analytics Conference.
Spearman, W. (2018). "Beyond Expected Goals." MIT Sloan Sports Analytics Conference.
Sumpter, D. (2019). "Expected Threat." Friends of Tracking, YouTube.
StatsBomb (2021). "Progressive Passes and Carries." StatsBomb IQ Documentation.