> "Tactics are what you do when there is something to do; strategy is what you do when there is nothing to do." --- Savielly Tartakower (chess grandmaster, but equally applicable to football)

Learning Objectives

Detect and classify team formations from positional tracking data
Build tactical fingerprints that quantify a team's playing style
Conduct systematic opponent analysis using data-driven methods
Model in-game tactical adjustments and their effectiveness
Optimize substitution timing and selection using historical data
Analyze game state effects on team behavior and outcomes
Communicate tactical insights effectively to coaching staff

Chapter 22: Match Strategy and Tactics

The convergence of data analytics and tactical analysis represents one of the most impactful applications of modern soccer science. Where coaches once relied exclusively on video review and intuition, today's analytical frameworks provide rigorous, quantitative methods for understanding how teams play, how opponents can be exploited, and how in-game adjustments influence match outcomes.

This chapter synthesizes techniques from tracking data analysis (Chapter 18), data sources and collection (Chapter 2), and expected goals (Chapter 7) into a unified framework for match strategy. We move from detection---identifying what is happening on the pitch---to prescription---recommending what should happen next.

22.1 Formation Analysis

22.1.1 Defining Formations Mathematically

A football formation is conventionally described by a sequence of integers representing the number of players in each horizontal band of the pitch (excluding the goalkeeper). A 4-3-3, for instance, denotes four defenders, three midfielders, and three forwards. However, this shorthand obscures enormous variation: a 4-3-3 with a single pivot differs dramatically from one with a double pivot, and nominal formations shift continuously during play.

We define a formation more precisely as a mapping from players to spatial roles. Let $\mathbf{x}_i(t) \in \mathbb{R}^2$ denote the pitch position of player $i$ at time $t$. The formation template $\mathbf{F}$ is a set of $K = 10$ reference positions (outfield players):

$$ \mathbf{F} = \{\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_{10}\}, \quad \mathbf{f}_k \in \mathbb{R}^2 $$

Each reference position corresponds to a tactical role (e.g., left center-back, right winger). The observed formation at time $t$ is the assignment of players to roles that minimizes total displacement:

$$ \sigma^*(t) = \arg\min_{\sigma \in S_{10}} \sum_{i=1}^{10} \|\mathbf{x}_i(t) - \mathbf{f}_{\sigma(i)}\|^2 $$

where $S_{10}$ is the symmetric group of permutations on 10 elements. This is an instance of the assignment problem, solvable in $O(n^3)$ time via the Hungarian algorithm.

22.1.2 Formation Detection from Tracking Data

In practice, formation detection proceeds in several stages:

Normalization: Translate all coordinates so that the team's centroid is at the origin, and optionally rotate so that the direction of play aligns with the positive $x$-axis.
Phase Filtering: Separate in-possession, out-of-possession, and transition phases. Formations differ substantially between these states.
Template Matching: Compare the mean positions of each player (over a window of play) against a library of canonical formation templates.
Clustering: Apply unsupervised learning (e.g., $k$-means or Gaussian mixture models) to the set of normalized position vectors to discover formation states without predefined templates.
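The normalization stage listed above can be sketched as follows. This is a minimal illustration, assuming positions arrive as a (10, 2) array in metres; the function name `normalize_positions` and the `attacking_positive_x` flag are hypothetical and not part of any tracking provider's API.

```python
import numpy as np


def normalize_positions(
    positions: np.ndarray,
    attacking_positive_x: bool = True,
) -> np.ndarray:
    """Center outfield positions on the team centroid and align play direction.

    positions: array of shape (10, 2) in pitch coordinates.
    attacking_positive_x: hypothetical flag. When False, the team attacks
        toward negative x, so both axes are mirrored (a 180-degree rotation).
    """
    centered = positions - positions.mean(axis=0)
    if not attacking_positive_x:
        centered = -centered
    return centered


# Illustrative frame: ten made-up outfield positions
frame = np.array([[10.0 + 3 * i, 5.0 + 6 * i] for i in range(10)])
normalized = normalize_positions(frame)
```

After this step the centroid sits at the origin, so templates and observed shapes can be compared regardless of where on the pitch the block is positioned.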

import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.cluster import KMeans

def detect_formation(
    player_positions: np.ndarray,
    templates: dict[str, np.ndarray],
) -> tuple[str, float]:
    """Detect the best-matching formation template.

    Args:
        player_positions: Array of shape (10, 2) with outfield player
            coordinates, normalized to team centroid at origin.
        templates: Dictionary mapping formation names (e.g., '4-3-3')
            to arrays of shape (10, 2) with reference positions.

    Returns:
        Tuple of (formation_name, cost) where cost is the total
        squared displacement under optimal assignment.
    """
    best_formation = None
    best_cost = float('inf')

    for name, template in templates.items():
        # Build cost matrix: C[i, j] = ||player_i - template_j||^2
        cost_matrix = np.sum(
            (player_positions[:, np.newaxis, :] - template[np.newaxis, :, :]) ** 2,
            axis=2,
        )
        row_ind, col_ind = linear_sum_assignment(cost_matrix)
        cost = cost_matrix[row_ind, col_ind].sum()

        if cost < best_cost:
            best_cost = cost
            best_formation = name

    return best_formation, best_cost

Callout: Formation vs. Shape

Coaches distinguish between a team's nominal formation (what appears on the team sheet) and its effective shape (the spatial configuration during play). A team listed as 4-4-2 may compress into a 4-4-1-1 out of possession and stretch into a 3-2-5 in possession. Data-driven formation detection captures the effective shape, which is far more informative than the nominal formation.

22.1.3 Dynamic Formation Shifts

Formations are not static. Teams systematically change shape depending on phase of play, game state, and tactical instruction. To capture this, we can compute formation snapshots at regular intervals (e.g., every 5 minutes of effective playing time) and track transitions.

Define the formation state sequence as:

$$ \mathcal{S} = \{(t_1, F_1), (t_2, F_2), \ldots, (t_n, F_n)\} $$

where $F_j$ is the detected formation at time $t_j$. A formation transition occurs at $t_j$ if $F_j \neq F_{j-1}$. The frequency and nature of these transitions characterize a team's tactical flexibility.
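As a small sketch of this definition, the snippet below extracts transitions from a state sequence and counts how often each change occurs. The function name `formation_transitions` and the example sequence are illustrative assumptions.

```python
from collections import Counter


def formation_transitions(states):
    """Find formation transitions in a sequence of (time, formation) pairs.

    A transition is recorded at t_j whenever F_j differs from F_{j-1}.
    Returns the list of (time, previous, current) tuples and a Counter
    keyed by (previous, current).
    """
    transitions = [
        (t_curr, f_prev, f_curr)
        for (_, f_prev), (t_curr, f_curr) in zip(states, states[1:])
        if f_curr != f_prev
    ]
    counts = Counter((f_prev, f_curr) for _, f_prev, f_curr in transitions)
    return transitions, counts


# Made-up snapshots every 5 minutes
states = [(0, "4-3-3"), (5, "4-3-3"), (10, "4-5-1"), (15, "4-5-1"), (20, "4-3-3")]
transitions, counts = formation_transitions(states)
```

The length of `transitions` relative to the number of snapshots gives a simple flexibility rate, and `counts` shows which shape changes a team favors.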

We can also quantify formation compactness using the eigenvalues of the covariance matrix of player positions:

$$ \Sigma = \frac{1}{10} \sum_{i=1}^{10} (\mathbf{x}_i - \bar{\mathbf{x}})(\mathbf{x}_i - \bar{\mathbf{x}})^T $$

The eigenvalues $\lambda_1 \geq \lambda_2$ measure the spread along the principal axes. A compact, narrow block has small eigenvalues; a stretched formation has large ones. The ratio $\lambda_1 / \lambda_2$ indicates elongation: a high ratio suggests the team is stretched along one axis (typically length-wise during counter-attacks).
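A minimal sketch of the compactness calculation, using the same $1/10$ normalization as the covariance formula. The helper name `shape_spread` and the example positions are assumptions for illustration.

```python
import numpy as np


def shape_spread(positions: np.ndarray) -> tuple[float, float, float]:
    """Return (lambda_1, lambda_2, elongation) for a (10, 2) position array."""
    centered = positions - positions.mean(axis=0)
    cov = centered.T @ centered / len(positions)  # population covariance
    lam2, lam1 = np.linalg.eigvalsh(cov)          # eigvalsh returns ascending order
    elongation = lam1 / lam2 if lam2 > 0 else float("inf")
    return float(lam1), float(lam2), float(elongation)


# Made-up shape: stretched 40 m along x, only 4 m along y
positions = np.array(
    [[x, y] for x in (-20.0, -10.0, 0.0, 10.0, 20.0) for y in (-2.0, 2.0)]
)
lam1, lam2, elongation = shape_spread(positions)
```

For this example the variance along the length of the pitch is 200 and across it is 4, so the elongation ratio is 50, which would indicate a heavily stretched team.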

22.1.4 Formation Quality Metrics

Not all formations are equally effective in all contexts. We define several metrics:

Defensive Coverage Index (DCI): The fraction of the defensive third covered by Voronoi cells of defending players. Higher DCI indicates better spatial coverage.

$$ \text{DCI} = \frac{\sum_{i \in \text{defenders}} \text{Area}(V_i \cap \text{DefensiveThird})}{\text{Area}(\text{DefensiveThird})} $$

Passing Lane Availability (PLA): The number of viable passing lanes from the ball carrier to teammates, weighted by the probability of successful completion.
Numerical Superiority Index (NSI): For each pitch zone, the difference between the count of attacking and defending players, weighted by zone importance.
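The DCI can be approximated without constructing Voronoi polygons explicitly. The sketch below samples the defensive third on a regular grid and assigns each sample to its nearest player, which is equivalent to a discretized Voronoi diagram. It assumes the defending team protects the goal at $x = 0$, a 105 m by 68 m pitch, and that cells are computed over all players passed in; the function name and parameters are hypothetical.

```python
import numpy as np


def defensive_coverage_index(
    defenders: np.ndarray,
    attackers: np.ndarray,
    pitch_length: float = 105.0,
    pitch_width: float = 68.0,
    step: float = 1.0,
) -> float:
    """Grid approximation of the share of the defensive third nearest a defender."""
    third = pitch_length / 3
    xs = np.arange(step / 2, third, step)
    ys = np.arange(step / 2, pitch_width, step)
    gx, gy = np.meshgrid(xs, ys, indexing="ij")
    grid = np.column_stack([gx.ravel(), gy.ravel()])

    players = np.vstack([defenders, attackers])
    # Squared distance from every grid point to every player
    d2 = ((grid[:, None, :] - players[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    # Defenders occupy the first len(defenders) rows of `players`
    return float((nearest < len(defenders)).mean())


# Made-up example: two defenders inside the third, one attacker far upfield
dci = defensive_coverage_index(
    defenders=np.array([[10.0, 20.0], [10.0, 48.0]]),
    attackers=np.array([[100.0, 34.0]]),
)
```

A finer `step` improves the area estimate at the cost of more distance computations; one metre is usually ample at pitch scale.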

22.1.5 Formation Classification with Machine Learning

Beyond template matching, machine learning approaches offer more flexible formation classification. A neural network classifier can learn to map player position configurations directly to formation labels without relying on handcrafted templates.

The input representation is critical. Raw $(x, y)$ coordinates for 10 players form a 20-dimensional vector, but this representation is sensitive to player ordering. Several strategies address this:

Sorted representations: Sort players by their $x$-coordinate (back-to-front) or by role assignment, creating a canonical ordering that removes permutation sensitivity.

Set-based representations: Use architectures like Deep Sets or Set Transformers that are inherently permutation-invariant:

$$ f(\{\mathbf{x}_1, \ldots, \mathbf{x}_{10}\}) = \rho\left(\sum_{i=1}^{10} \phi(\mathbf{x}_i)\right) $$

where $\phi$ and $\rho$ are learned neural networks. This approach treats the set of player positions as an unordered collection, which aligns with the nature of the problem.

Graph-based representations: Represent the team as a graph where players are nodes and spatial relationships are edges. Graph neural networks can then learn formation patterns from the relational structure.
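To make the permutation-invariance property of the set-based form concrete, here is a toy forward pass in NumPy. The weights are random rather than learned, $\phi$ is a single ReLU layer, $\rho$ is a linear map to four hypothetical formation classes, and all sizes are arbitrary assumptions; it illustrates only the structure, not a trained classifier.

```python
import numpy as np

rng = np.random.default_rng(0)
W_phi = rng.normal(size=(2, 16))   # phi: per-player embedding weights
b_phi = rng.normal(size=16)
W_rho = rng.normal(size=(16, 4))   # rho: pooled embedding -> 4 class logits


def deep_sets_forward(positions: np.ndarray) -> np.ndarray:
    """Compute rho(sum_i phi(x_i)) for a (10, 2) array of positions."""
    embedded = np.maximum(positions @ W_phi + b_phi, 0.0)  # phi with ReLU
    pooled = embedded.sum(axis=0)                          # order-independent sum
    return pooled @ W_rho                                  # rho


positions = rng.uniform(-30, 30, size=(10, 2))
shuffled = positions[rng.permutation(10)]

logits_original = deep_sets_forward(positions)
logits_shuffled = deep_sets_forward(shuffled)
```

Because the players are combined with a sum, reordering the rows leaves the output unchanged up to floating-point rounding, which is exactly what the sorted representation tries to achieve by construction.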

Callout: The Limits of Formation Labels

While formation classification is useful for high-level tactical analysis, the obsession with formation labels (4-3-3 vs. 4-2-3-1 vs. 3-5-2) can be misleading. In reality, formations exist on a continuous spectrum, and many tactical systems defy simple labeling. The most informative analysis often focuses on the spatial relationships between specific player groups (e.g., the distance between the defensive line and the midfield line) rather than on assigning a discrete formation label. Formation labels are communication shorthand, not analytical endpoints.

22.2 Tactical Fingerprinting

22.2.1 What Is a Tactical Fingerprint?

A tactical fingerprint is a multidimensional profile that characterizes a team's playing style across a set of measurable dimensions. Just as a human fingerprint uniquely identifies an individual, a tactical fingerprint distinguishes one team's approach from another's.

The fingerprint comprises a feature vector $\mathbf{v} \in \mathbb{R}^d$ where each dimension captures a specific tactical attribute. Common dimensions include:

Dimension	Description	Typical Range
Possession %	Share of ball possession	30--70%
PPDA	Passes allowed per defensive action	5--20
Directness	Ratio of forward passes to total passes	0.2--0.5
Build-up speed	Average time from own-half recovery to final-third entry	5--30s
Pressing intensity	Defensive actions in opponent's half per minute of opponent possession	0.5--3.0
Width in attack	Average horizontal spread of attackers	20--50m
Crossing frequency	Crosses per 90 minutes	10--35
Counter-attack rate	Proportion of attacks classified as counters	5--25%
Set-piece dependency	Share of goals from set pieces	15--45%
High turnovers	Ball recoveries in final third per 90	2--10

22.2.2 Computing the Fingerprint

Given a dataset of $N$ matches for a team, we compute each dimension as a per-90-minute average, then standardize across the league:

$$ z_j = \frac{\bar{v}_j - \mu_j}{\sigma_j} $$

where $\bar{v}_j$ is the team's average on dimension $j$, and $\mu_j$, $\sigma_j$ are the league mean and standard deviation. The standardized fingerprint $\mathbf{z} = (z_1, z_2, \ldots, z_d)$ allows comparison across leagues and seasons.

import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler

def compute_tactical_fingerprint(
    team_data: pd.DataFrame,
    league_data: pd.DataFrame,
    dimensions: list[str],
) -> pd.Series:
    """Compute standardized tactical fingerprint for a team.

    Args:
        team_data: DataFrame with per-match stats for the team.
        league_data: DataFrame with per-match stats for all teams.
        dimensions: List of column names to include in fingerprint.

    Returns:
        Series with z-scores for each dimension.
    """
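    team_means = team_data[dimensions].mean()
    # Fit on per-match league rows: mean_ and scale_ describe the
    # match-level distribution. To standardize against the spread of
    # team averages instead, aggregate league_data by team first.
    scaler = StandardScaler()
    scaler.fit(league_data[dimensions])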

    z_scores = (team_means.values - scaler.mean_) / scaler.scale_
    return pd.Series(z_scores, index=dimensions)

22.2.3 Visualizing Tactical Profiles

The most common visualization for tactical fingerprints is the radar chart (also called a spider or polar chart). Each axis represents one tactical dimension, and the team's standardized score is plotted as a polygon.

For comparing two teams (e.g., in pre-match preparation), overlaying their radar charts immediately highlights areas of similarity and divergence. An alternative is the percentile bar chart, which shows each dimension as a horizontal bar indicating the team's percentile rank within the league.

22.2.4 Style Clustering

By computing fingerprints for all teams in a league, we can apply clustering algorithms to group teams by playing style. This has several applications:

Transfer scouting: Identify teams with similar styles to ease player adaptation.
Tactical preparation: Prepare generic game plans for each style cluster.
League-wide trend analysis: Track how tactical trends evolve over seasons.

Using $k$-means with $k = 4$--$6$ clusters typically yields interpretable groupings such as "possession-dominant," "direct/counter-attacking," "high-pressing," and "deep-block defensive."
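The clustering step can be sketched with a few lines of Lloyd's algorithm. This NumPy version is a simplified stand-in for a library implementation such as scikit-learn's `KMeans` (no multiple restarts, no k-means++ seeding), and the six two-dimensional fingerprints are made-up values chosen to form two obvious style groups.

```python
import numpy as np


def kmeans(X: np.ndarray, k: int, n_iter: int = 50, seed: int = 0):
    """Plain Lloyd's algorithm; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each fingerprint to its nearest center
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster is empty
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


# Rows: teams; columns: two standardized dimensions (e.g., possession, directness)
fingerprints = np.array([
    [-2.1, -2.0], [-2.0, -1.9], [-1.9, -2.1],
    [2.0, 2.1], [1.9, 2.0], [2.1, 1.9],
])
labels, centers = kmeans(fingerprints, k=2)
```

With real league data the number of clusters should be checked against interpretability and a criterion such as silhouette score rather than fixed in advance.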

Callout: The Limits of Fingerprinting

Tactical fingerprints capture tendencies, not abilities. A team with high pressing intensity is not necessarily a good pressing team---they may press frequently but ineffectively. Always pair descriptive fingerprints with outcome-based metrics (e.g., PPDA vs. opponent xG conceded while pressing high) to distinguish style from quality.

22.2.5 Similarity and Distance Metrics

To compare two tactical fingerprints $\mathbf{z}_A$ and $\mathbf{z}_B$, we use:

Euclidean distance: $$ d_E(\mathbf{z}_A, \mathbf{z}_B) = \sqrt{\sum_{j=1}^d (z_{A,j} - z_{B,j})^2} $$

Cosine similarity (direction of style, ignoring magnitude): $$ \text{cos}(\mathbf{z}_A, \mathbf{z}_B) = \frac{\mathbf{z}_A \cdot \mathbf{z}_B}{\|\mathbf{z}_A\| \|\mathbf{z}_B\|} $$

Mahalanobis distance (accounting for feature correlations): $$ d_M(\mathbf{z}_A, \mathbf{z}_B) = \sqrt{(\mathbf{z}_A - \mathbf{z}_B)^T \Sigma^{-1} (\mathbf{z}_A - \mathbf{z}_B)} $$

where $\Sigma$ is the covariance matrix of fingerprints across all teams.
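The three measures translate directly into NumPy. The function names below are illustrative, and the example covariance matrix is an assumption chosen to show how Mahalanobis distance down-weights a high-variance dimension.

```python
import numpy as np


def euclidean_distance(z_a: np.ndarray, z_b: np.ndarray) -> float:
    return float(np.linalg.norm(z_a - z_b))


def cosine_similarity(z_a: np.ndarray, z_b: np.ndarray) -> float:
    return float(z_a @ z_b / (np.linalg.norm(z_a) * np.linalg.norm(z_b)))


def mahalanobis_distance(z_a: np.ndarray, z_b: np.ndarray, cov: np.ndarray) -> float:
    diff = z_a - z_b
    # Solve a linear system rather than forming the explicit inverse
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))


z_a = np.array([1.0, 0.0])
z_b = np.array([0.0, 1.0])
cov = np.diag([4.0, 1.0])  # first dimension varies more across teams

d_e = euclidean_distance(z_a, z_b)
cos_ab = cosine_similarity(z_a, z_b)
d_m = mahalanobis_distance(z_a, z_b, cov)
```

With around 20 teams and 10 or more dimensions the estimated covariance matrix can be poorly conditioned, so some form of regularization or dimension reduction is usually needed before Mahalanobis distances are trustworthy.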

22.2.6 Temporal Evolution of Tactical Fingerprints

A team's tactical fingerprint is not static across a season. Injuries, transfers, coaching changes, and tactical evolution all cause fingerprints to shift over time. Tracking this evolution provides insight into how a team is adapting and where their current trajectory leads.

A rolling-window approach computes fingerprints over the most recent $W$ matches (typically $W = 5$ or $W = 10$), producing a time series of fingerprints:

$$ \mathbf{z}^{(t)} = \text{fingerprint}(\text{matches}_{t-W+1}, \ldots, \text{matches}_t) $$

The rate of change in the fingerprint is:

$$ \Delta \mathbf{z}^{(t)} = \mathbf{z}^{(t)} - \mathbf{z}^{(t-1)} $$

Large values of $\|\Delta \mathbf{z}^{(t)}\|$ indicate a significant tactical shift. This can flag moments when a team has changed its approach---perhaps in response to a poor run of results or a change in personnel.
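A minimal sketch of the rolling computation, under the simplifying assumption that the window fingerprint is the mean of already-standardized per-match values. The function names and the toy data are hypothetical.

```python
import numpy as np


def rolling_fingerprints(match_z: np.ndarray, window: int = 5) -> np.ndarray:
    """Average the last `window` matches for every t >= window - 1.

    match_z: array of shape (n_matches, d) with standardized per-match values.
    Returns an array of shape (n_matches - window + 1, d).
    """
    n = len(match_z)
    return np.array([
        match_z[t - window + 1 : t + 1].mean(axis=0)
        for t in range(window - 1, n)
    ])


def shift_magnitudes(fingerprints: np.ndarray) -> np.ndarray:
    """Norm of the change between consecutive rolling fingerprints."""
    return np.linalg.norm(np.diff(fingerprints, axis=0), axis=1)


# Five unremarkable matches, then one with a sharp change on dimension 0
match_z = np.zeros((6, 2))
match_z[5] = [5.0, 0.0]

fps = rolling_fingerprints(match_z, window=5)
shifts = shift_magnitudes(fps)
```

Smaller windows react faster but are noisier; a threshold on `shifts` would need calibrating against the typical match-to-match variation in the league.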

22.3 Opponent Analysis and Preparation

22.3.1 The Opponent Analysis Pipeline

Systematic opponent analysis follows a structured pipeline:

Data Collection: Aggregate event data, tracking data, and video from the opponent's recent matches (typically 5--10 most recent).
Tactical Profiling: Compute the opponent's tactical fingerprint (Section 22.2).
Pattern Detection: Identify recurring patterns in build-up play, defensive structure, set pieces, and transitions.
Vulnerability Identification: Find statistical weaknesses that can be exploited.
Game Plan Formulation: Translate analytical findings into actionable tactical instructions.
Communication: Present insights in formats accessible to coaching staff and players (Section 22.7).

22.3.2 Build-Up Play Analysis

A team's build-up play can be decomposed into sequences of passes that advance the ball from the defensive third to the attacking third. We represent each build-up sequence as a directed graph:

$$ G = (V, E), \quad V = \{p_1, p_2, \ldots, p_m\}, \quad E = \{(p_i, p_j) : \text{pass from } p_i \text{ to } p_j\} $$

where vertices are players and edges are passes. Common patterns emerge as frequently occurring subgraphs. We can use sequence mining or graph kernel methods to identify these patterns.

Key metrics for build-up analysis include:

Build-up route preference: Left, central, or right channel usage percentages.
Progression method: Short passing sequences vs. long balls vs. dribbles.
Key pivot players: Players who appear most frequently in successful build-up sequences (high betweenness centrality in the passing network).
Tempo changes: Moments where passing speed increases sharply, indicating an attempt to break defensive lines.

import networkx as nx
import pandas as pd

def analyze_buildup_patterns(
    passes: pd.DataFrame,
    pitch_length: float = 105.0,
) -> dict[str, float]:
    """Analyze build-up play patterns from pass event data.

    Args:
        passes: DataFrame with columns 'passer', 'receiver',
            'start_x', 'start_y', 'end_x', 'end_y', 'success'.
        pitch_length: Length of the pitch in meters.

    Returns:
        Dictionary with build-up metrics.
    """
    third = pitch_length / 3

    # Filter to build-up passes (starting in own defensive/middle third)
    buildup = passes[passes['start_x'] < 2 * third].copy()
    successful = buildup[buildup['success'] == 1]

    # Channel analysis: thirds of the pitch width
    # (assumes a 68 m wide pitch, i.e. boundaries at 22.67 m and 45.33 m)
    left = successful[successful['start_y'] < 22.67]
    central = successful[
        (successful['start_y'] >= 22.67) & (successful['start_y'] <= 45.33)
    ]
    right = successful[successful['start_y'] > 45.33]

    total = len(successful)
    metrics = {
        'left_channel_pct': len(left) / total * 100 if total > 0 else 0,
        'central_channel_pct': len(central) / total * 100 if total > 0 else 0,
        'right_channel_pct': len(right) / total * 100 if total > 0 else 0,
    }

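    # Passing network: edge weight = number of completed passes
    G = nx.DiGraph()
    for _, row in successful.iterrows():
        if G.has_edge(row['passer'], row['receiver']):
            G[row['passer']][row['receiver']]['weight'] += 1
        else:
            G.add_edge(row['passer'], row['receiver'], weight=1)

    # networkx interprets the weight as a path distance, so frequent
    # links must be short: use the inverse pass count as the distance.
    for _, _, data in G.edges(data=True):
        data['distance'] = 1.0 / data['weight']

    if G.number_of_nodes() > 0:
        betweenness = nx.betweenness_centrality(G, weight='distance')
        metrics['key_pivot_player'] = max(betweenness, key=betweenness.get)
        metrics['pivot_centrality'] = max(betweenness.values())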

    # Directness: fraction of passes that advance the ball
    forward = successful[successful['end_x'] > successful['start_x']]
    metrics['directness'] = len(forward) / total if total > 0 else 0

    return metrics

22.3.3 Pressing Triggers and Pressing Traps

One of the most important aspects of modern opponent analysis is understanding how a team presses and, conversely, identifying when an opponent's press can be exploited.

Pressing triggers are the specific conditions that cause a team to initiate a coordinated press. Common triggers include:

Backward pass to the goalkeeper or center-back: Many teams press aggressively when the opponent plays back, as this suggests the opposition is under pressure and may be forced into a long ball.
Ball played to a weak-footed player's weaker side: Pressing the ball onto a player's weak foot limits their passing options.
Ball received in a wide position near the touchline: The touchline acts as an extra defender, reducing the receiver's options.
Heavy touch or miscontrol: An imperfect first touch signals an opportunity to recover the ball.
Specific player receives the ball: Some teams target specific opponents they consider technically weak or slow in possession.

Identifying an opponent's pressing triggers allows a team to either avoid those triggers (circulating the ball in ways that do not activate the press) or deliberately trigger the press to exploit the space left behind.

Pressing traps are deliberate tactical setups where a team invites the opponent to play into a specific area, then springs a coordinated press. The most famous example is Jürgen Klopp's "Red Zone" pressing at Liverpool, where the team deliberately allowed opponents to build up through central areas before springing a trap with multiple players converging.

Analytically, pressing traps can be identified by computing a Trap Index for each pitch zone $z$:

$$ \text{Trap Index}(z) = \frac{\text{Ball recoveries in zone } z}{\text{Times opponent entered zone } z} $$

A high Trap Index in a specific zone suggests the team is deliberately channeling opponents there and recovering the ball at a high rate.
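The index is a simple ratio per zone. The sketch below assumes two flat lists of zone labels, one entry per opponent zone entry and one per ball recovery; the zone names and counts are invented for illustration.

```python
from collections import Counter


def trap_index(zone_entries: list[str], zone_recoveries: list[str]) -> dict[str, float]:
    """Recoveries divided by opponent entries, for every zone that was entered."""
    entries = Counter(zone_entries)
    recoveries = Counter(zone_recoveries)
    return {zone: recoveries[zone] / count for zone, count in entries.items()}


# Made-up match: opponent entered the left half-space 10 times, the center 20 times
zone_entries = ["left_half_space"] * 10 + ["central_mid"] * 20
zone_recoveries = ["left_half_space"] * 4 + ["central_mid"] * 2

index = trap_index(zone_entries, zone_recoveries)
```

Zones with few entries produce unstable ratios, so in practice a minimum entry count is worth enforcing before reading anything into a high value.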

Callout: PPDA and Its Limitations

Passes Per Defensive Action (PPDA) is the most commonly cited pressing metric, measuring how many passes a team allows the opponent to complete before making a defensive action. While useful as a high-level indicator of pressing intensity, PPDA has significant limitations: it treats all defensive actions equally (a tackle in midfield and a clearance in the box both count), it does not distinguish between effective and ineffective pressing, and it does not capture the spatial element of pressing (where on the pitch the defensive actions occur). For serious opponent analysis, PPDA should be supplemented with zone-specific pressing metrics, pressing success rates, and analysis of what happens after the press (ball recovery vs. press beaten).

22.3.4 Counter-Attacking Analysis

Counter-attacks are among the most dangerous attacking phases in soccer, producing shots of significantly higher quality than sustained possession attacks. Analyzing an opponent's counter-attacking capability, and its vulnerability to counter-attacks, is critical for game planning.

A counter-attack can be formally defined as a possession sequence that:

1. Begins with a ball recovery in the defending or middle third
2. Reaches the attacking third within a specified time threshold (typically 10--15 seconds)
3. Involves a net forward progression of at least 40 meters

Key counter-attacking metrics include:

Counter-attack frequency: How often the team launches counter-attacks per 90 minutes
Counter-attack conversion rate: Proportion of counter-attacks that produce a shot
Counter-attack xG per attack: Average quality of chances created from counters
Transition speed: Average time from ball recovery to shot or final-third entry
Key transition players: Who carries the ball and who finishes in counter-attacking sequences

import pandas as pd

def identify_counter_attacks(
    possessions: pd.DataFrame,
    time_threshold: float = 15.0,
    distance_threshold: float = 40.0,
) -> pd.DataFrame:
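    """Identify counter-attacking sequences from possession data.

    Applies the start-zone, duration, and forward-distance criteria;
    it does not separately check that the sequence ends in the
    attacking third.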

    Args:
        possessions: DataFrame with possession sequences including
            'start_x', 'end_x', 'duration_seconds', 'start_zone'.
        time_threshold: Maximum duration in seconds.
        distance_threshold: Minimum forward distance in meters.

    Returns:
        DataFrame of identified counter-attacks.
    """
    counters = possessions[
        (possessions['start_zone'].isin(['defensive_third', 'middle_third']))
        & (possessions['duration_seconds'] <= time_threshold)
        & ((possessions['end_x'] - possessions['start_x']) >= distance_threshold)
    ].copy()

    return counters

22.3.5 Defensive Vulnerability Mapping

Every team has defensive weaknesses. Data-driven vulnerability mapping identifies these systematically:

Spatial vulnerability: Compute the opponent's xG conceded by pitch zone. Zones where the opponent concedes above-average xG indicate spatial weaknesses. Overlay this with the opponent's defensive positioning data to identify the structural cause (e.g., fullback pushing too high, gap between lines).

Temporal vulnerability: Analyze when in a match the opponent is most vulnerable. Some teams tire in the final 15 minutes; others are slow starters. Compute xG conceded in 15-minute intervals:

$$ \text{xG}_{\text{conceded}}^{(k)} = \sum_{s \in S_k} \text{xG}(s) $$

where $S_k$ is the set of shots conceded in the $k$-th 15-minute interval.

Transitional vulnerability: Measure the opponent's effectiveness in defensive transitions. The counter-press success rate (percentage of possessions recovered within 5 seconds of losing the ball) and time to defensive shape (seconds from loss of possession to reaching a stable defensive formation) are key indicators.

Personnel vulnerability: Identify individual defensive weaknesses. For each defender, compute:

- 1v1 duel success rate
- Aerial duel success rate
- Frequency of being dribbled past
- Positional errors leading to shots
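The temporal vulnerability sum above can be computed with a weighted bin count. This sketch assumes shots are given as parallel arrays of match minute and xG, and that stoppage time beyond the 90th minute is folded into the final interval; both the function name and that convention are assumptions.

```python
import numpy as np


def xg_conceded_by_interval(
    minutes,
    xg,
    interval: int = 15,
    n_intervals: int = 6,
) -> np.ndarray:
    """Sum xG conceded within each `interval`-minute block of the match."""
    minutes = np.asarray(minutes)
    xg = np.asarray(xg, dtype=float)
    # Clip so that minute 90+ falls into the last interval
    idx = np.minimum(minutes // interval, n_intervals - 1).astype(int)
    return np.bincount(idx, weights=xg, minlength=n_intervals)


# Made-up shots conceded by the opponent
shot_minutes = [3, 14, 15, 44, 89, 93]
shot_xg = [0.10, 0.20, 0.05, 0.30, 0.40, 0.10]

per_interval = xg_conceded_by_interval(shot_minutes, shot_xg)
```

Summing across several matches and dividing by the number of matches gives a per-interval average that is easier to compare against league norms.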

22.3.6 Set-Piece Strategy Optimization

Set pieces account for approximately 25--30% of goals in professional football, making them a critical component of both opponent analysis and strategic planning.

For defensive set pieces (analyzing how the opponent defends corners, free kicks):

- Zonal vs. man-marking vs. hybrid systems
- Near-post and far-post coverage
- Players assigned to key zones
- Historical vulnerability to specific delivery types

For attacking set pieces (analyzing the opponent's set-piece routines):

- Preferred delivery types and target players
- Decoy runs and blocking patterns
- Short-corner and training-ground routine frequencies

Optimizing Own Set-Piece Strategy:

Set-piece optimization involves matching attacking routines to the specific weaknesses of the upcoming opponent. If an opponent uses zonal marking and has weak aerial ability in zone 2 (near post), the optimal strategy might involve an in-swinging delivery to that zone with a designated attacker making a run from outside the box.

The expected value of a set-piece routine $r$ against opponent $o$ can be estimated as:

$$ E[\text{xG}](r, o) = P(\text{delivery quality} \mid r) \times P(\text{contact} \mid r, o) \times E[\text{xG} \mid \text{contact}, r, o] $$

By computing this for a library of set-piece routines against each opponent's defensive setup, the coaching staff can select the optimal routines for each match.
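As a sketch of that selection step, the snippet below multiplies the three factors for each routine and picks the largest product. The routine names and every probability are placeholder values invented for illustration, not estimates from real data, and the `RoutineEstimate` class is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RoutineEstimate:
    name: str
    p_delivery: float        # P(delivery quality | r)
    p_contact: float         # P(contact | r, o)
    xg_given_contact: float  # E[xG | contact, r, o]

    @property
    def expected_xg(self) -> float:
        return self.p_delivery * self.p_contact * self.xg_given_contact


# Placeholder numbers for one opponent's defensive setup
routines = [
    RoutineEstimate("near-post inswinger", 0.60, 0.40, 0.12),
    RoutineEstimate("far-post outswinger", 0.70, 0.30, 0.09),
    RoutineEstimate("short corner", 0.90, 0.20, 0.07),
]

best = max(routines, key=lambda r: r.expected_xg)
```

Each factor is itself an estimate from limited samples, so the ranking is only as reliable as the least certain of the three inputs.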

Callout: The Analytics Arms Race

As opponent analysis becomes more sophisticated, teams must also consider how opponents will analyze them. This creates a strategic game of incomplete information, analogous to poker. Some teams deliberately vary their approach to make pattern detection more difficult---a form of tactical "randomization" that has parallels in game theory. The optimal frequency of different set-piece routines, for instance, can be modeled as a mixed-strategy Nash equilibrium.
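To illustrate the mixed-strategy idea, consider a deliberately simplified two-by-two zero-sum game: the attacking team chooses a near-post or far-post routine, the defending team chooses which post to overload, and the payoff is xG to the attacker. The payoff values are invented, and the closed-form solution below applies only when the game has an interior mixed equilibrium.

```python
def solve_2x2_zero_sum(payoff):
    """Mixed equilibrium of a 2x2 zero-sum game.

    payoff[i][j] is the row player's gain when row i meets column j.
    Returns (p, q, value): p is the probability of row 0, q the
    probability of column 0.
    """
    (a, b), (c, d) = payoff
    denom = a - b - c + d
    if denom == 0:
        raise ValueError("degenerate game: no unique mixed equilibrium")
    p = (d - c) / denom
    q = (d - b) / denom
    if not (0 < p < 1 and 0 < q < 1):
        raise ValueError("no interior mixed equilibrium; check for a saddle point")
    value = (a * d - b * c) / denom
    return p, q, value


# Rows: attacker plays near-post / far-post routine
# Columns: defender overloads near post / far post
payoff = [
    [0.02, 0.06],
    [0.05, 0.03],
]
p_near, q_near, value = solve_2x2_zero_sum(payoff)
```

Under these made-up payoffs the attacker should use the near-post routine about one time in three, which leaves the defender indifferent between the two setups at an expected 0.04 xG per set piece.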

22.4 In-Game Tactical Adjustments

22.4.1 Detecting Tactical Changes During a Match

Coaches make tactical adjustments throughout a match in response to the evolving game state, opponent behavior, and player performance. Detecting these changes from data requires monitoring multiple signals:

Formation shifts: As described in Section 22.1.3, tracking formation state transitions identifies structural changes (e.g., moving from a 4-3-3 to a 3-5-2).

Pressing trigger changes: A team might switch from pressing high to sitting deep. We detect this by monitoring the team's defensive line height (average $x$-coordinate of the deepest four outfield players, with the direction of play along the $x$-axis as in Section 22.1.2) and PPDA in rolling windows.

Tempo changes: Compute the team's passing tempo (passes per minute of possession) in rolling windows. A significant increase often signals a push for a goal; a decrease suggests game management.

Width changes: Track the average horizontal spread of the team. Narrowing may indicate a shift to a more defensive, compact shape.
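Two of these signals can be sketched in a few lines. The example assumes the direction of play runs along the $x$-axis with the team's own goal at low $x$ (the normalization convention of Section 22.1.2) and that a per-minute series of passes is already available; the function names and numbers are illustrative.

```python
import numpy as np


def defensive_line_height(outfield_x: np.ndarray) -> float:
    """Mean x-coordinate of the four deepest outfield players."""
    return float(np.sort(outfield_x)[:4].mean())


def rolling_mean(values: np.ndarray, window: int) -> np.ndarray:
    """Simple moving average over `window` consecutive samples."""
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")


# Made-up x-coordinates (metres from own goal line) for ten outfield players
outfield_x = np.array([5.0, 6.0, 7.0, 8.0, 20.0, 22.0, 25.0, 30.0, 35.0, 40.0])
line_height = defensive_line_height(outfield_x)

# Made-up passes per minute of possession, smoothed over 2 samples
tempo = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
smoothed_tempo = rolling_mean(tempo, window=2)
```

Comparing the smoothed series against its own first-half baseline is one simple way to flag a change worth reviewing on video.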

22.4.2 Quantifying Adjustment Effectiveness

To evaluate whether a tactical adjustment improved a team's performance, we compare metrics before and after the change using a change-point analysis framework.

Let $Y_t$ be a performance metric (e.g., xG rate per minute) at time $t$. We model the match as having a potential change point $\tau$:

$$ Y_t = \begin{cases} \mu_1 + \epsilon_t & \text{if } t < \tau \\ \mu_2 + \epsilon_t & \text{if } t \geq \tau \end{cases} $$

The effectiveness of the adjustment is measured by $\Delta = \mu_2 - \mu_1$. We estimate $\tau$ using methods such as binary segmentation or PELT (Pruned Exact Linear Time), and test whether $\Delta$ is statistically significant.

import numpy as np
from scipy import stats

def detect_tactical_change_point(
    metric_series: np.ndarray,
    min_segment_length: int = 5,
) -> dict[str, float]:
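    """Detect a single change point in a time series of tactical metrics.

    Scans every admissible split and keeps the one with the largest
    Gaussian likelihood ratio for a shift in mean (the first step of
    binary segmentation). The t-test p-value is optimistic because the
    split was chosen to maximize the difference, so treat it as a guide.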

    Args:
        metric_series: 1D array of metric values at regular intervals.
        min_segment_length: Minimum number of observations per segment.

    Returns:
        Dictionary with change point index, metric values before/after,
        and statistical significance.
    """
    n = len(metric_series)
    best_tau = None
    best_log_likelihood_ratio = -np.inf

    for tau in range(min_segment_length, n - min_segment_length + 1):
        seg1 = metric_series[:tau]
        seg2 = metric_series[tau:]

        # Log-likelihood ratio for change in mean (assuming normal)
        var_full = np.var(metric_series)
        if var_full == 0:
            continue
        var_split = (
            len(seg1) * np.var(seg1) + len(seg2) * np.var(seg2)
        ) / n

        if var_split == 0:
            continue

        llr = -n / 2 * np.log(var_split / var_full)
        if llr > best_log_likelihood_ratio:
            best_log_likelihood_ratio = llr
            best_tau = tau

    if best_tau is None:
        return {'change_point': None, 'significant': False}

    before = metric_series[:best_tau]
    after = metric_series[best_tau:]
    t_stat, p_value = stats.ttest_ind(before, after)

    return {
        'change_point': best_tau,
        'mean_before': float(np.mean(before)),
        'mean_after': float(np.mean(after)),
        'delta': float(np.mean(after) - np.mean(before)),
        't_statistic': float(t_stat),
        'p_value': float(p_value),
        'significant': p_value < 0.05,
    }

22.4.3 Real-Time Tactical Dashboards

Modern analytics departments provide coaching staff with real-time tactical dashboards during matches. These dashboards typically display:

Formation heat maps: Current spatial distribution of both teams.
Momentum indicators: Rolling xG difference, territory control, pressing efficiency.
Key matchup stats: 1v1 duel outcomes for critical individual battles.
Fatigue indicators: Player distance covered, high-speed running distance, sprint counts.
Set-piece performance: Tracking whether set-piece routines are executing as planned.

The challenge is to present this information in a format that a coach can absorb in seconds during the high-pressure environment of a live match. Section 22.7 addresses this communication challenge in detail.

22.4.4 The Feedback Loop

In-game tactical adjustments create a feedback loop:

Observe: Data systems detect current tactical state.
Analyze: Analysts compare observed patterns to pre-match expectations.
Recommend: Analysts communicate suggested adjustments.
Implement: Coaching staff decides whether to act on recommendations.
Monitor: Data systems track the effect of any changes made.

This loop operates on timescales of 5--15 minutes during a match, constrained by the practical limitations of communicating with players during play.

Callout: The Speed of Decision-Making

One of the greatest challenges in real-time tactical analysis is the mismatch between analytical processing time and coaching decision-making time. A comprehensive analysis of pressing effectiveness might take 5 minutes to run and interpret, but the coach needs an answer now. This is why pre-match preparation is so critical: the more scenarios that have been anticipated and planned for before the match, the faster the coaching staff can react during it. Effective real-time analytics is 80% preparation and 20% in-game execution.

22.5 Substitution Strategy

22.5.1 The Value of Substitutions

Substitutions are among the most impactful tactical decisions a coach makes during a match. With three to five substitutions available (depending on competition rules), each decision involves:

Who to bring on: Which player from the bench best addresses the current tactical need?
Who to take off: Which player is underperforming, fatigued, or tactically mismatched?
When to substitute: What is the optimal timing?
What to change: Should the substitution involve a formation or tactical shift?

22.5.2 Timing Models

Research has consistently shown that substitution timing matters. The general findings from large-scale analyses are:

Teams that are losing benefit from earlier substitutions (before the 58th minute).
Teams that are winning tend to benefit from later substitutions that preserve the lead.
The "fresh legs" effect is most pronounced in the 60--75 minute window, when fatigue differentials are largest.

We can model how substitution timing relates to outcomes, as a function of game state, using a survival analysis framework. Define $T$ as the time of the event of interest, here the first positive change in game state (for example, a losing team equalizing), and $Y$ as the final match outcome (win/draw/loss). The hazard function gives the instantaneous rate of that event at time $t$, given that it has not yet occurred:

$$ h(t) = \lim_{\Delta t \to 0} \frac{P(t \leq T < t + \Delta t \mid T \geq t)}{\Delta t} $$

Fitting a Cox proportional hazards model with covariates for substitution time, game state, score difference, and player fatigue indicators shows how the hazard varies with when the change is made, which in turn yields estimates of the optimal substitution window.

22.5.3 Player Impact Models

To quantify the expected impact of bringing on a specific substitute, we compute a Substitution Impact Score (SIS):

$$ \text{SIS}(p_{\text{on}}, p_{\text{off}}, t, \text{state}) = \hat{Y}(p_{\text{on}}, t, \text{state}) - \hat{Y}(p_{\text{off}}, t, \text{state}) $$

where $\hat{Y}$ is the predicted contribution of a player given the current game state and match time. This prediction draws on:

The substitute's historical per-90 metrics (xG, xA, defensive actions).
Fatigue adjustment for the player being replaced (performance decay curves).
Tactical fit: how well the substitute's profile matches the current tactical need.
Opponent-specific factors: historical performance against similar opposition styles.
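The SIS definition above can be sketched in a few lines. The text does not specify the form of $\hat{Y}$, so the multiplicative model below (per-90 output scaled by a fatigue factor, a tactical-fit score, and the minutes remaining) is purely an assumption, as are the class, field, and function names.

```python
from dataclasses import dataclass


@dataclass
class PlayerProjection:
    per90_contribution: float  # e.g. xG + xA per 90 (hypothetical composite)
    fatigue_factor: float      # 1.0 = fresh; lower = more fatigued
    tactical_fit: float        # 0..1 match to the current tactical need


def predicted_contribution(p: PlayerProjection,
                           minutes_remaining: float) -> float:
    """Assumed form of Y-hat: scaled per-90 output over the remaining time."""
    return (p.per90_contribution * p.fatigue_factor * p.tactical_fit
            * minutes_remaining / 90.0)


def substitution_impact_score(p_on: PlayerProjection,
                              p_off: PlayerProjection,
                              minutes_remaining: float) -> float:
    """SIS = Y-hat(player on) - Y-hat(player off)."""
    return (predicted_contribution(p_on, minutes_remaining)
            - predicted_contribution(p_off, minutes_remaining))


# Illustrative (made-up) inputs: a fresh substitute vs. a tiring starter.
substitute = PlayerProjection(0.45, 1.0, 0.9)
starter = PlayerProjection(0.50, 0.7, 0.8)
sis = substitution_impact_score(substitute, starter, minutes_remaining=30)
```

A positive score favors the change. In this toy example the substitute has the lower per-90 output but still scores higher once fatigue and tactical fit are applied.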

22.5.4 Fatigue Modeling

Player fatigue is a critical input to substitution decisions. We model fatigue as a function of physical output:

$$ \text{Fatigue}(t) = \alpha \cdot d_{\text{total}}(t) + \beta \cdot d_{\text{sprint}}(t) + \gamma \cdot n_{\text{accel}}(t) + \delta \cdot t $$

where $d_{\text{total}}(t)$ is total distance covered, $d_{\text{sprint}}(t)$ is high-speed running distance, $n_{\text{accel}}(t)$ is the number of high-intensity accelerations, and $t$ is minutes played. The coefficients $\alpha, \beta, \gamma, \delta$ are fit from historical data linking physical output to performance decline.

Observable indicators of fatigue include:

Declining maximum sprint speed (compared to first-half peak).
Reduced high-intensity running frequency.
Increased time between high-intensity efforts (recovery time).
Reduced pressing engagement (fewer defensive actions in the opponent's half).
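One simple way to fit the coefficients of the linear fatigue model is ordinary least squares. The sketch below assumes a design matrix with one row per player-interval and columns in the order (total distance, sprint distance, accelerations, minutes), and it uses noise-free synthetic data generated from placeholder coefficients. Neither the data nor the coefficient values come from a real dataset.

```python
import numpy as np


def fit_fatigue_coefficients(X: np.ndarray, decline: np.ndarray) -> np.ndarray:
    """Least-squares estimate of (alpha, beta, gamma, delta)."""
    coef, *_ = np.linalg.lstsq(X, decline, rcond=None)
    return coef


def fatigue(coef: np.ndarray, d_total: float, d_sprint: float,
            n_accel: float, minutes: float) -> float:
    """Evaluate the linear fatigue model for one player at one moment."""
    return float(coef @ np.array([d_total, d_sprint, n_accel, minutes]))


# Synthetic inputs: km covered, sprint km, accelerations, minutes played.
rng = np.random.default_rng(0)
X = rng.uniform([4.0, 0.2, 10.0, 30.0], [12.0, 1.2, 80.0, 95.0], size=(200, 4))
true_coef = np.array([0.05, 0.5, 0.01, 0.005])  # placeholder values
decline = X @ true_coef                         # noise-free synthetic target

coef = fit_fatigue_coefficients(X, decline)
score = fatigue(coef, d_total=8.0, d_sprint=0.6, n_accel=40, minutes=70)
```

With real data the target would be a measured performance decline, and the fit would be noisy, so regularization or a per-player random effect may be worth considering.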

Callout: The Five-Substitution Rule

The introduction of five substitutions (in three windows plus half-time) in many competitions since 2020 has fundamentally changed substitution strategy. Teams can now plan for a "second wave" of players, and the ability to make five changes reduces the risk of each individual substitution. Analytics departments model substitution strategies as a sequential decision problem: the decision to use the third substitution depends on the remaining two, creating interdependencies that require dynamic programming or simulation to optimize.

22.5.5 Sequential Decision Framework

With multiple substitutions available, the problem becomes sequential. Let $V(s, t, n)$ be the value function for state $s$ at time $t$ with $n$ substitutions remaining:

$$ V(s, t, n) = \max\left\{V_{\text{wait}}(s, t, n),\; \max_{a \in \mathcal{A}} \left[\text{SIS}(a) + V(s', t, n-1)\right]\right\} $$

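where $\mathcal{A}$ is the set of possible substitution actions, $s'$ is the state resulting from action $a$, and $V_{\text{wait}}(s, t, n)$ is the expected value of making no change now and continuing to the next decision point with all $n$ substitutions still available. This Bellman equation can be solved approximately via backward induction or Monte Carlo simulation.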
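A toy version of this recursion can be written with memoization, which is equivalent to backward induction on a small problem. Everything specific here is hypothetical: the three decision points, the candidate swaps, and the SIS values. The state $s$ is reduced to the set of swaps already used, and the value of waiting is taken to be the value at the next decision point.

```python
from functools import lru_cache

# Hypothetical SIS values per candidate swap at three decision points
# (for example 60', 70', 80'). These numbers are illustrative only.
SIS = {
    "winger": (0.10, 0.14, 0.08),
    "midfielder": (0.06, 0.09, 0.11),
    "striker": (0.12, 0.07, 0.03),
}
N_STEPS = 3


@lru_cache(maxsize=None)
def value(t: int, n: int, used: frozenset = frozenset()) -> float:
    """V(s, t, n): best total SIS from decision point t with n subs left."""
    if t == N_STEPS or n == 0:
        return 0.0
    best = value(t + 1, n, used)  # V_wait: defer to the next decision point
    for swap, gains in SIS.items():
        if swap not in used:
            # Make the swap now, then continue at the same t with n - 1 left.
            best = max(best, gains[t] + value(t, n - 1, used | {swap}))
    return best


best_two = value(0, 2)
```

Because these toy SIS values do not depend on earlier changes, the only interdependence is the shared budget $n$. A richer state (score, fatigue, formation) would make each swap's value depend on what came before, which is where simulation becomes useful.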

22.5.6 Substitution Optimization in Practice

In practice, substitution decisions are rarely made by algorithm alone. The analyst's role is to provide the coaching staff with structured information that supports their decision:

Pre-match substitution planning: Before the match, the analytics team prepares scenario-based substitution plans. For example: "If we are losing at the 55th minute, the highest-impact substitution is bringing on Player X for Player Y, which historically increases our xG rate by 0.15 per 90." These scenarios are prepared for the most likely game states.

In-match fatigue monitoring: During the match, physical performance data is monitored in real time. When a player's sprint speed drops below 85% of their first-half peak, or when their pressing engagement falls below a threshold, the analyst flags them for potential substitution.

Post-match substitution evaluation: After the match, every substitution is evaluated using the change-point framework (Section 22.4.2). Did the team's performance improve after the change? Was the timing optimal? This feedback loop improves future substitution decision-making.
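The in-match fatigue flag described above (sprint speed below 85% of the first-half peak) reduces to a simple comparison. The dictionary inputs, position labels, and speed values in km/h below are assumptions for illustration.

```python
def flag_fatigued_players(first_half_peak: dict[str, float],
                          recent_peak: dict[str, float],
                          threshold: float = 0.85) -> list[str]:
    """Players whose recent peak speed is below threshold * first-half peak."""
    return sorted(
        player
        for player, peak in first_half_peak.items()
        if player in recent_peak and recent_peak[player] < threshold * peak
    )


# Illustrative (made-up) peak sprint speeds in km/h.
first_half_peak = {"LB": 32.0, "CM": 30.0, "ST": 33.5}
recent_peak = {"LB": 26.5, "CM": 28.0, "ST": 28.6}
flagged = flag_fatigued_players(first_half_peak, recent_peak)
```

Here only the left-back is flagged: 26.5 km/h is below 85% of 32.0 km/h (27.2 km/h), while the other two players remain above their thresholds.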

22.6 Game State Analysis

22.6.1 Defining Game States

The game state is the current score differential from the perspective of a given team:

$$ \text{GS}(t) = g_{\text{team}}(t) - g_{\text{opponent}}(t) $$

Common game states are: winning ($\text{GS} > 0$), drawing ($\text{GS} = 0$), losing ($\text{GS} < 0$). Finer granularity distinguishes between winning by 1, winning by 2+, etc.
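As a minimal sketch, the game state at any minute can be derived from a list of goal events. The `(minute, scoring_team)` tuple layout and the team labels are assumptions; own goals are assumed to be recorded under the team that benefits.

```python
def game_state(goal_events: list[tuple[float, str]], team: str,
               minute: float) -> int:
    """Score differential GS(t) for `team` at `minute`."""
    gs = 0
    for goal_minute, scoring_team in goal_events:
        if goal_minute <= minute:
            gs += 1 if scoring_team == team else -1
    return gs


def state_label(gs: int) -> str:
    """Map a score differential to the coarse three-way game state."""
    if gs > 0:
        return "winning"
    if gs < 0:
        return "losing"
    return "drawing"


# Illustrative match: A scores at 23', B scores at 51' and 77'.
goals = [(23, "A"), (51, "B"), (77, "B")]
states = [state_label(game_state(goals, "A", m)) for m in (30, 60, 80)]
# states == ["winning", "drawing", "losing"]
```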

Game state profoundly affects team behavior. A vast body of research demonstrates that teams play differently depending on the score:

Losing teams tend to increase attacking intensity, push more players forward, take more risks, and concede more space.
Winning teams tend to become more conservative, defend deeper, reduce pressing intensity, and focus on game management.
Drawing teams' behavior depends on context: early draws resemble the pre-match tactical plan, while late draws often see increased urgency.

22.6.2 Game-State-Adjusted Metrics

Raw per-90 metrics can be misleading because they do not account for game state effects. A team that frequently leads will accumulate fewer attacking statistics not because they are less capable but because they spend more time managing the game.

We define game-state-adjusted metrics by computing performance separately for each game state, then aggregating with appropriate weights:

$$ \text{Metric}_{\text{adj}} = \sum_{s \in \{-2, -1, 0, +1, +2\}} w_s \cdot \text{Metric}_s $$

where $w_s$ represents a neutral weighting (e.g., league-average time spent in each game state) rather than the team's actual time in each state.

This adjustment is particularly important for:

xG models: Teams generate different quality chances in different game states.
Pressing statistics: PPDA varies dramatically with game state.
Possession statistics: Winning teams often cede possession deliberately.
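The weighted sum for game-state-adjusted metrics can be sketched as follows. One practical wrinkle is that a team may have spent no time in some game states. Renormalizing the neutral weights over the states actually observed, as done here, is an assumption rather than something the formula prescribes. The metric values and weights are illustrative.

```python
def game_state_adjusted(metric_by_state: dict[int, float],
                        neutral_weights: dict[int, float]) -> float:
    """Weighted average of per-state metrics using neutral weights.

    Weights are renormalized over the states present in `metric_by_state`.
    """
    observed = [s for s in metric_by_state if s in neutral_weights]
    total_weight = sum(neutral_weights[s] for s in observed)
    if total_weight == 0:
        raise ValueError("no overlap between observed states and weights")
    return sum(neutral_weights[s] * metric_by_state[s]
               for s in observed) / total_weight


# Illustrative values: xG per 90 by game state, and assumed league-average
# shares of time in each state (-2 means losing by 2+, +2 winning by 2+).
xg_by_state = {-1: 1.8, 0: 1.4, 1: 1.0}
league_weights = {-2: 0.05, -1: 0.20, 0: 0.50, 1: 0.20, 2: 0.05}
adjusted_xg = game_state_adjusted(xg_by_state, league_weights)
```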

22.6.3 Win Probability Models

A win probability model estimates the probability of each match outcome (win/draw/loss) at any point during the match, given the current game state and contextual factors.

The simplest model uses historical base rates. For example, if a team is winning 1-0 at the 60th minute at home, the historical win probability might be 85%. More sophisticated models incorporate:

Current xG difference (not just actual goals).
Red cards and numerical advantage/disadvantage.
Momentum indicators (recent possession, territory, shot frequency).
Team quality (pre-match Elo ratings or similar).

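A common approach uses logistic regression or gradient-boosted trees. For the binary case (win versus not win), the logistic model is shown below; a three-way win/draw/loss model replaces the sigmoid with a softmax over the three outcomes: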

$$ P(\text{Win} \mid \mathbf{x}_t) = \frac{1}{1 + e^{-\mathbf{w}^T \mathbf{x}_t}} $$

where $\mathbf{x}_t$ is the feature vector at time $t$, including score difference, time remaining, xG difference, numerical advantage, and venue.

import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

def build_win_probability_model(
    historical_data: pd.DataFrame,
    features: list[str],
) -> LogisticRegression:
    """Build a win probability model from historical match data.

    Args:
        historical_data: DataFrame with one row per match-minute,
            including feature columns and 'result' column
            (1 = win, 0 = not win).
        features: List of feature column names.

    Returns:
        Fitted logistic regression model.
    """
    X = historical_data[features].values
    y = historical_data['result'].values

    # No class re-weighting: class_weight='balanced' would shift the
    # predicted probabilities away from the observed base rates.
    model = LogisticRegression(max_iter=1000)
    model.fit(X, y)
    return model

22.6.4 Game State Transition Matrices

We can model score progression as a Markov chain. The transition matrix $\mathbf{P}$ has entries $P_{ij}$ representing the probability of moving from score state $i$ to score state $j$ in a given time interval:

$$ P_{ij} = P(\text{GS}_{t+1} = j \mid \text{GS}_t = i) $$

For example, the probability of a team going from 0-0 to 1-0 in any given minute depends on the team's scoring rate, which in turn depends on the quality of both teams.

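Propagating the current score state forward through this Markov chain, by applying the transition matrix once for each interval remaining until the final whistle, yields the distribution over final score states and hence the match outcome probabilities. Because transition probabilities can differ by state (and, in a time-inhomogeneous version, by minute), the framework can represent cascading effects: going 1-0 up changes the opponent's behavior (they push forward), which changes the transition probabilities for subsequent minutes.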
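One way to turn a transition matrix into outcome probabilities is to propagate the current state forward over the remaining intervals. The sketch below makes several simplifying assumptions: one-minute intervals, at most one goal per interval, constant per-minute scoring probabilities, and score differentials capped at ±5. The rates used are placeholders, not estimates.

```python
import numpy as np

MAX_DIFF = 5  # differentials beyond +/-5 are clamped (an approximation)
STATES = np.arange(-MAX_DIFF, MAX_DIFF + 1)


def transition_matrix(p_for: float, p_against: float) -> np.ndarray:
    """One-interval transition matrix over score differentials."""
    n = len(STATES)
    P = np.zeros((n, n))
    for i in range(n):
        P[i, min(i + 1, n - 1)] += p_for      # team scores
        P[i, max(i - 1, 0)] += p_against      # opponent scores
        P[i, i] += 1.0 - p_for - p_against    # no goal
    return P


def outcome_probabilities(P: np.ndarray, current_diff: int,
                          intervals_left: int) -> dict[str, float]:
    """Distribution over win/draw/loss after `intervals_left` steps."""
    start = np.zeros(len(STATES))
    start[current_diff + MAX_DIFF] = 1.0
    final = start @ np.linalg.matrix_power(P, intervals_left)
    return {
        "win": float(final[STATES > 0].sum()),
        "draw": float(final[STATES == 0].sum()),
        "loss": float(final[STATES < 0].sum()),
    }


P = transition_matrix(p_for=0.015, p_against=0.012)  # placeholder rates
probs = outcome_probabilities(P, current_diff=1, intervals_left=30)
```

State-dependent behavior can be represented by giving each row its own scoring probabilities, and time dependence by multiplying a different matrix for each remaining minute instead of taking a matrix power.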

22.6.5 Score Effects on Playing Style

To quantify how game state affects playing style, we compute the score effect vector for each team:

$$ \Delta \mathbf{v}_s = \mathbf{v}_s - \mathbf{v}_0 $$

where $\mathbf{v}_s$ is the tactical fingerprint (Section 22.2) computed only from minutes spent in game state $s$, and $\mathbf{v}_0$ is the fingerprint when drawing. This reveals each team's strategic response to different score situations.

Teams with small $\|\Delta \mathbf{v}_s\|$ are strategically rigid---they play similarly regardless of the score. Teams with large $\|\Delta \mathbf{v}_s\|$ are strategically adaptive. Neither is inherently better; the question is whether the adaptation is effective.
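The score effect vector and its norm are straightforward to compute once per-state fingerprints exist. The sketch below assumes fingerprints are stored as equal-length NumPy arrays keyed by game state and that features are already on comparable scales, since otherwise the norm is dominated by the largest-valued feature. The three-feature fingerprints are made up.

```python
import numpy as np


def score_effect_norms(fingerprints: dict[int, np.ndarray]) -> dict[int, float]:
    """Norm of (v_s - v_0) for every non-drawing game state s."""
    baseline = fingerprints[0]
    return {
        state: float(np.linalg.norm(vec - baseline))
        for state, vec in fingerprints.items()
        if state != 0
    }


# Illustrative fingerprints (e.g. pressing, directness, line height).
fingerprints = {
    0: np.array([0.5, 0.6, 0.4]),
    1: np.array([0.2, 0.6, 0.8]),   # changes markedly when leading
    -1: np.array([0.5, 0.6, 0.4]),  # unchanged when trailing
}
norms = score_effect_norms(fingerprints)
```

In this example the team is adaptive when winning (norm 0.5) and rigid when losing (norm 0).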

Callout: The Losing Team's Dilemma

When a team is losing, they face a fundamental trade-off: increasing attacking intensity may create more scoring opportunities but also opens space for the opponent. Research by Brechot and Flepp (2020) found that losing teams in the Bundesliga systematically over-commit to attack, conceding significantly more xG than they generate in the process. This suggests that the "rational" response to losing may be more moderate than coaches typically implement---a finding with significant strategic implications.

22.7 Communicating Tactical Insights

22.7.1 The Communication Challenge

The most sophisticated tactical analysis is worthless if it cannot be communicated effectively to decision-makers. In football, the primary consumers of tactical insights are:

Head coach: Needs strategic-level insights for game planning and in-game decisions.
Assistant coaches: Need detailed tactical information for training design and set-piece preparation.
Players: Need simple, actionable instructions they can execute under pressure.
Sporting director: Needs high-level tactical trends for strategic planning and recruitment.

Each audience requires a different level of abstraction, vocabulary, and presentation format.

22.7.2 Visualization Principles for Tactical Communication

Effective tactical visualizations follow several principles:

1. Use the pitch as the canvas. Football people think spatially. Whenever possible, overlay data on a pitch diagram rather than using abstract charts. Heat maps, pass maps, and shot maps are effective because they leverage the viewer's spatial intuition.

2. Minimize cognitive load. During a match, coaches have seconds to absorb information. Pre-match presentations should be concise. Use color coding consistently: red for danger/opponent, blue for own team, green for opportunity.

3. Show comparison, not just description. A heat map of the opponent's attacking zones is more useful when contrasted with the team's own defensive coverage. Side-by-side comparisons and overlay visualizations enable rapid comparison.

4. Animate when static images fall short. Some tactical concepts---like pressing triggers, defensive transitions, or set-piece routines---are inherently dynamic and benefit from short video clips or animated visualizations.

5. Quantify uncertainty. When presenting expected outcomes (e.g., win probability, xG), always convey the degree of uncertainty. Confidence intervals and probabilistic language ("65--75% chance of winning") are more honest and useful than point estimates.

22.7.3 The Pre-Match Report

A typical pre-match analytical report includes the following sections:

Executive Summary (1 page): Key findings and recommended tactical adjustments.
Opponent Tactical Profile (2--3 pages): Fingerprint radar chart, formation analysis, key players.
Build-Up Analysis (1--2 pages): How the opponent progresses the ball, preferred channels, key pivot players.
Defensive Analysis (1--2 pages): Pressing system, defensive line height, vulnerability zones.
Transition Analysis (1 page): Counter-attacking and counter-pressing patterns.
Set-Piece Analysis (2--3 pages): Defensive and attacking routines with diagrams.
Recommended Game Plan (1--2 pages): Specific tactical instructions derived from the analysis.
Appendix: Detailed statistical tables, additional visualizations.

22.7.4 Opposition Analysis Workflows

A structured opposition analysis workflow integrates multiple analytical stages into a coherent preparation process:

Day 1 (Match Day -4): Initial Data Pull

Aggregate event and tracking data from the opponent's last 5--10 matches
Generate automated tactical fingerprint
Run formation detection algorithms
Produce initial statistical summary

Day 2 (Match Day -3): Deep Analysis

Detailed build-up play analysis with passing network graphs
Defensive vulnerability mapping (spatial, temporal, personnel)
Counter-attack and transition profiling
Set-piece cataloging and classification

Day 3 (Match Day -2): Synthesis and Recommendations

Combine analytical findings into a coherent tactical narrative
Identify 3--5 key tactical points for the game plan
Prepare video clips illustrating each point
Draft the pre-match report

Day 4 (Match Day -1): Presentation and Refinement

Present to coaching staff
Incorporate coach feedback and adjust recommendations
Prepare player-facing materials (simplified messaging, key video clips)
Finalize set-piece plans

Match Day: Execution and Monitoring

Provide real-time tactical monitoring (Section 22.4.3)
Prepare half-time data package
Support in-game decision-making

Callout: The One-Page Rule

Many experienced coaches insist on a "one-page rule" for pre-match tactical briefs delivered to players. If the key messages cannot be summarized on a single page (or three to five bullet points), the analysis is too complex to be actionable on the pitch. The analyst's job is to distill complexity into simplicity---not to demonstrate how much analysis was performed, but to identify the two or three insights that will actually influence the outcome of the match.

22.7.5 In-Game Communication

During matches, analysts communicate with coaches via several channels:

Half-time presentations: 3--5 minute briefings with key data points and visualizations, delivered on tablets or printed.
Messaging systems: Text or structured messages sent to the bench via tablets or dedicated apps.
Pre-defined signals: Agreed-upon shorthand for common tactical adjustments (e.g., "Case A" = drop defensive line by 10 meters).

The key constraint is time: analysts must distill complex multi-dimensional analysis into actionable insights in real time. Effective in-game communication requires:

Pre-agreed frameworks and vocabulary.
Automated detection of key events and thresholds.
Tiered alerting: routine updates vs. urgent tactical flags.
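Threshold detection and tiered alerting can be combined in a small rule table. The sketch below is hypothetical throughout: the metric names, the threshold values, and the convention that lower values are worse are all assumptions chosen for illustration.

```python
from enum import Enum
from typing import Optional


class Tier(Enum):
    ROUTINE = "routine"  # include in the next scheduled update
    URGENT = "urgent"    # send to the bench immediately


# Hypothetical thresholds: metric -> (routine_below, urgent_below).
THRESHOLDS = {
    "sprint_speed_ratio": (0.90, 0.85),  # share of first-half peak speed
    "rolling_xg_diff": (-0.20, -0.50),   # own xG minus opponent xG
}


def classify(metric: str, value: float) -> Optional[Tier]:
    """Return the alert tier for a metric reading, or None if no alert."""
    routine_below, urgent_below = THRESHOLDS[metric]
    if value < urgent_below:
        return Tier.URGENT
    if value < routine_below:
        return Tier.ROUTINE
    return None


alerts = [classify("sprint_speed_ratio", v) for v in (0.95, 0.88, 0.82)]
# alerts == [None, Tier.ROUTINE, Tier.URGENT]
```

Agreeing on the thresholds with the coaching staff before the match is part of the pre-agreed framework, so that an urgent flag carries the same meaning for everyone on the bench.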

22.7.6 Building Trust with Coaching Staff

Callout: The Human Factor

The adoption of tactical analytics depends critically on the relationship between analysts and coaches. Research and industry experience consistently highlight several factors: (1) Start with the coach's questions, not your models. (2) Validate analytical findings against video---coaches trust their eyes before they trust numbers. (3) Be honest about uncertainty and limitations. (4) Provide insights, not instructions---the tactical decision remains the coach's domain. (5) Build credibility incrementally by being right about small things before proposing large changes.

22.7.7 Pitch-Based Visualization Example

The following Python snippet demonstrates creating a tactical pitch visualization suitable for a pre-match report:

import matplotlib.pyplot as plt
import matplotlib.patches as patches
import numpy as np

def draw_pitch(ax: plt.Axes, pitch_length: float = 105.0,
               pitch_width: float = 68.0) -> plt.Axes:
    """Draw a football pitch on the given axes.

    Args:
        ax: Matplotlib axes to draw on.
        pitch_length: Pitch length in meters.
        pitch_width: Pitch width in meters.

    Returns:
        The axes with pitch markings drawn.
    """
    ax.set_xlim(-2, pitch_length + 2)
    ax.set_ylim(-2, pitch_width + 2)
    ax.set_aspect('equal')
    ax.axis('off')

    # Pitch outline
    ax.plot([0, pitch_length], [0, 0], 'k-', linewidth=1.5)
    ax.plot([0, pitch_length], [pitch_width, pitch_width], 'k-', linewidth=1.5)
    ax.plot([0, 0], [0, pitch_width], 'k-', linewidth=1.5)
    ax.plot([pitch_length, pitch_length], [0, pitch_width], 'k-', linewidth=1.5)

    # Center line and circle (regulation center-circle radius is 9.15 m)
    ax.plot([pitch_length / 2, pitch_length / 2], [0, pitch_width], 'k-', linewidth=1)
    circle = plt.Circle((pitch_length / 2, pitch_width / 2), 9.15,
                         fill=False, color='black', linewidth=1)
    ax.add_patch(circle)

    # Penalty areas (regulation size: 16.5 m deep, 40.32 m wide)
    for x_start in [0, pitch_length - 16.5]:
        pa = patches.Rectangle(
            (x_start, (pitch_width - 40.32) / 2), 16.5, 40.32,
            fill=False, edgecolor='black', linewidth=1
        )
        ax.add_patch(pa)

    return ax


def plot_opponent_vulnerability(
    ax: plt.Axes,
    vulnerability_zones: list[dict],
    pitch_length: float = 105.0,
    pitch_width: float = 68.0,
) -> plt.Axes:
    """Plot opponent defensive vulnerability zones on a pitch.

    Args:
        ax: Matplotlib axes with pitch drawn.
        vulnerability_zones: List of dicts with 'x', 'y', 'radius',
            and 'severity' keys.
        pitch_length: Pitch length in meters.
        pitch_width: Pitch width in meters.

    Returns:
        The axes with vulnerability zones drawn.
    """
    for zone in vulnerability_zones:
        color_intensity = min(zone['severity'] / 10, 1.0)
        circle = plt.Circle(
            (zone['x'], zone['y']),
            zone['radius'],
            alpha=0.3 + 0.4 * color_intensity,
            color='red',
            linewidth=0,
        )
        ax.add_patch(circle)
        ax.annotate(
            f"{zone['severity']:.1f}",
            (zone['x'], zone['y']),
            ha='center', va='center',
            fontsize=9, fontweight='bold',
            color='darkred',
        )

    ax.set_title("Opponent Defensive Vulnerability Map",
                 fontsize=13, fontweight='bold', pad=10)
    return ax

22.7.8 Post-Match Review

Post-match analysis closes the loop by evaluating:

Tactical plan adherence: Did the team execute the pre-match plan? Quantify using fingerprint similarity between planned and observed style.
Adjustment effectiveness: Did in-game changes improve performance? (Section 22.4.2).
Individual performance: Player-level metrics contextualized by tactical role and opponent quality.
Opposition analysis accuracy: Were the pre-match predictions about opponent behavior correct? This feedback improves future analyses.
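For the plan-adherence check above, the text does not fix a similarity measure. Cosine similarity between the planned and observed fingerprint vectors is one reasonable choice and is used here as an assumption. The function name and example vectors are illustrative.

```python
import numpy as np


def fingerprint_similarity(planned, observed) -> float:
    """Cosine similarity between two fingerprint vectors (1 = same direction)."""
    planned = np.asarray(planned, dtype=float)
    observed = np.asarray(observed, dtype=float)
    denom = np.linalg.norm(planned) * np.linalg.norm(observed)
    if denom == 0:
        raise ValueError("fingerprints must be non-zero vectors")
    return float(planned @ observed / denom)


# Illustrative fingerprints (e.g. pressing, directness, width, line height).
planned = [0.8, 0.3, 0.6, 0.7]
observed = [0.6, 0.4, 0.6, 0.5]
adherence = fingerprint_similarity(planned, observed)
```

Cosine similarity ignores overall magnitude, so if the plan was to do "more of everything", a distance-based measure on standardized features may be more informative.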

22.7.9 Communicating Tactical Insights to Coaches: A Practical Guide

The bridge between analytical insight and coaching action is often the most difficult step in the entire process. Several practical strategies have proven effective:

Speak in football language, not data language. Instead of saying "the opponent's left-back has a duel success rate of only 38% in 1v1 situations on his left side," say "their left-back struggles when attackers go at him one-on-one down his side -- we can exploit this with our right winger." The insight is the same; the framing is different.

Lead with video evidence. Coaches are visual thinkers. When presenting an analytical finding, always have a video clip ready that illustrates the point. If the data says the opponent's defensive line is vulnerable to balls in behind during the 60-75 minute window, show three clips of them being caught out during that period. The data identifies the pattern; the video makes it real.

Be selective. Resist the temptation to present every finding. A pre-match briefing that covers 15 tactical points will result in the players remembering none of them. Focus on the 2-3 insights that are most likely to influence the match outcome.

Provide actionable instructions, not abstract observations. "The opponent plays a high line" is an observation. "When we win the ball in midfield, look for Player X making a run in behind -- there will be space between their center-backs and goalkeeper" is an instruction. Analysts should always translate observations into specific, executable actions.

Callout: The Analytics Translator

Some of the most valuable people in modern football analytics departments are not the most technically skilled data scientists or the most experienced scouts -- they are the "translators" who can move fluently between the language of data and the language of coaching. These individuals understand both the statistical nuances of the analysis and the practical realities of implementing tactical changes on the training pitch and in match situations. Building this translation capability -- whether in a single person or through collaborative processes -- is essential for any club seeking to integrate analytics into its tactical workflow.

Summary

This chapter presented a comprehensive framework for data-driven match strategy and tactics in professional football. We began with formation detection from tracking data, showing how the assignment problem, clustering methods, and machine learning classifiers can identify and track tactical shapes. Tactical fingerprinting provides a multidimensional portrait of team style, enabling systematic comparison, clustering, and temporal tracking of tactical evolution. Opponent analysis synthesizes these tools into an actionable preparation pipeline, identifying build-up patterns, pressing triggers, counter-attacking tendencies, defensive vulnerabilities, and set-piece strategies.

In-game tactical adjustments were modeled using change-point analysis, enabling quantitative evaluation of whether mid-match changes improved performance. Real-time dashboards and structured communication frameworks ensure that analytical insights reach coaching staff in actionable form. Substitution strategy was framed as a sequential decision problem, incorporating fatigue modeling, player impact estimation, and practical pre-match scenario planning. Game state analysis revealed how score differential systematically alters team behavior, and game-state-adjusted metrics correct for these confounding effects. Win probability models and game state transition matrices provide frameworks for understanding match dynamics as they unfold.

Finally, we addressed the critical challenge of communicating tactical insights to coaching staff, emphasizing practical visualization principles, structured opposition analysis workflows, the importance of speaking in football language, and the essential role of building trust between analysts and coaches. The integration of these analytical methods into the daily workflow of a professional football club represents one of the most exciting frontiers in sports analytics. As data quality improves and analytical methods mature, the role of the tactical analyst will continue to grow---not as a replacement for coaching intuition, but as a powerful complement to it.

References

Bialkowski, A., Lucey, P., Carr, P., Yue, Y., Sridharan, S., & Matthews, I. (2014). "Identifying Team Style in Soccer Using Formations Learned from Spatiotemporal Tracking Data." IEEE International Conference on Data Mining Workshops.
Fernandez, J., & Bornn, L. (2018). "Wide Open Spaces: A Statistical Technique for Measuring Space Creation in Professional Soccer." MIT Sloan Sports Analytics Conference.
Brechot, M., & Flepp, R. (2020). "Dealing with Randomness in Match Outcomes: How to Rethink Performance Evaluation in European Club Football Using Expected Goals." Journal of Sports Economics, 21(4), 335--362.
Power, P., Ruiz, H., Wei, X., & Lucey, P. (2017). "Not All Passes Are Created Equal: Objectively Measuring the Risk and Reward of Passes in Soccer from Tracking Data." ACM KDD.
Shaw, L., & Sudarshan, M. (2020). "Routine Inspection: A Playbook for Corner Kicks." MIT Sloan Sports Analytics Conference.
Decroos, T., Bransen, L., Van Haaren, J., & Davis, J. (2019). "Actions Speak Louder Than Goals: Valuing Player Actions in Soccer." ACM KDD.
Robberechts, P., & Davis, J. (2020). "Forecasting the FIFA World Cup -- Combining Result- and Goal-Based Approaches." Machine Learning and Data Mining for Sports Analytics.
Spearman, W. (2018). "Beyond Expected Goals." MIT Sloan Sports Analytics Conference.
Goes, F. R., Kempe, M., Meerhoff, L. A., & Lemmink, K. A. P. M. (2019). "Not Every Pass Can Succeed: A Framework for Measuring Passing Difficulty in Soccer Matches." Journal of Sports Sciences, 37(14), 1605--1614.
Memmert, D., & Rein, R. (2018). "Match Analysis, Big Data and Tactics: Current Trends in Elite Soccer." Deutsche Zeitschrift fur Sportmedizin, 69, 65--72.
Herold, M., Goes, F., Hartmann, S., & Memmert, D. (2022). "Machine Learning in Men's Professional Football: Current Applications and Future Directions for Improving Attacking Play." International Journal of Sports Science & Coaching, 17(4), 798--817.
Rein, R., & Memmert, D. (2016). "Big Data and Tactical Analysis in Elite Soccer: Future Challenges and Opportunities for Sports Science." SpringerPlus, 5(1), 1--13.