

Case Study 1: The AI Revolution --- How Machine Learning Is Changing Match Preparation

Background

In the 2024--25 season, a mid-table club in a top European league --- referred to here as FC Analytica to protect proprietary information --- undertook a comprehensive overhaul of their match preparation workflow. The club's new head of analytics, hired from a technology company rather than a traditional football background, proposed replacing the largely manual match preparation process with an AI-assisted pipeline. This case study examines what happened, what worked, what failed, and what lessons emerged.

The Problem

FC Analytica's coaching staff faced a common challenge: preparing for an opponent in the compressed schedule of modern football. With matches every three to four days during congested periods, the coaching staff had limited time to analyze upcoming opponents, design training sessions, and brief players.

The traditional workflow looked like this:

  1. Video analysts would watch 3--5 full matches of the upcoming opponent (15--20 hours of viewing and clipping work), creating clips organized by theme (build-up play, pressing, set pieces, transitions).
  2. The analyst team would compile statistical summaries from data providers: formation tendencies, key player statistics, set piece data.
  3. The coaching staff would review the video clips and statistics, develop a tactical plan, and design training sessions.
  4. Players would receive a pre-match briefing combining video clips and tactical instructions.

Total preparation time: approximately 30--40 person-hours per opponent. During congested periods, this was unsustainable, and preparation quality declined noticeably.

The AI-Assisted Solution

The new system introduced several ML-powered components:

Component 1: Automated Tactical Profiling

A computer vision system processed opponent match footage and automatically generated a tactical profile including:

  • Formation detection across different phases of play (in possession, out of possession, transition)
  • Pressing trigger identification (what events cause the opponent to initiate their press)
  • Build-up play patterns (preferred passing sequences from goal kicks and deep positions)
  • Defensive structure analysis (line height, compactness, weak-side coverage)

The system used a combination of convolutional neural networks for player detection and tracking, and a graph neural network operating on the resulting positional data to classify tactical states.
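
To make the pipeline concrete, below is a minimal sketch of the graph construction step. It assumes frame-level tracking data with one (x, y) position and one velocity vector per player, and uses a simple distance threshold to define edges; the function name, threshold, and feature choices are illustrative rather than FC Analytica's actual implementation.

import numpy as np


def build_frame_graph(positions: np.ndarray, velocities: np.ndarray,
                      edge_threshold_m: float = 15.0) -> dict:
    """Build a graph representation of a single tracking frame.

    positions: array of shape (n_players, 2), pitch coordinates in metres.
    velocities: array of shape (n_players, 2), velocity vectors in m/s.
    Returns node features and an adjacency matrix suitable for a
    graph neural network layer.
    """
    # Node features: position and velocity for each player.
    node_features = np.hstack([positions, velocities])

    # Edges connect pairs of players within a fixed distance of each other.
    diffs = positions[:, None, :] - positions[None, :, :]
    distances = np.linalg.norm(diffs, axis=-1)
    adjacency = (distances < edge_threshold_m).astype(float)
    np.fill_diagonal(adjacency, 0.0)  # no self-loops

    return {"node_features": node_features, "adjacency": adjacency}

In the full system, sequences of such frame graphs would be passed to a trained graph neural network that assigns tactical state labels (for example, "mid-block" or "high press") to each phase of play.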

Component 2: Key Player Threat Assessment

An ML model evaluated each opponent player based on their recent performances, generating a "threat profile" that included:

  • Primary and secondary roles (e.g., "inverted winger who drifts centrally to receive between the lines")
  • Spatial heatmaps of dangerous activity
  • Preferred actions in the final third (cut inside and shoot, overlap and cross, combination play)
  • Defensive vulnerabilities (pressing commitment, recovery speed, aerial duels)
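
One element of the threat profile, the spatial heatmap of dangerous activity, can be approximated with a weighted two-dimensional histogram over a player's event locations. The sketch below assumes an event table with pitch coordinates and a per-action danger value (for example, an expected-threat contribution); the column names are illustrative.

import numpy as np
import pandas as pd


def danger_heatmap(events: pd.DataFrame, pitch_length: float = 105.0,
                   pitch_width: float = 68.0, bins: int = 12) -> np.ndarray:
    """Aggregate a player's dangerous actions into a spatial grid.

    events: one row per on-ball action with columns 'x', 'y'
        (pitch coordinates in metres) and 'danger_value'
        (e.g., the expected-threat contribution of the action).
    Returns a (bins x bins) grid of summed danger values.
    """
    heatmap, _, _ = np.histogram2d(
        events["x"], events["y"],
        bins=bins,
        range=[[0, pitch_length], [0, pitch_width]],
        weights=events["danger_value"],
    )
    return heatmap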

Component 3: Set Piece Analysis Engine

A dedicated module analyzed the opponent's set piece routines using pattern matching:

  • Identifying recurring corner kick and free kick routines
  • Classifying set piece defensive organization (zonal, man-marking, hybrid)
  • Detecting overloaded zones and potential vulnerabilities
  • Ranking set pieces by similarity to routines FC Analytica had faced before
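
The final step, ranking by similarity to previously faced routines, can be sketched as a nearest-neighbour search over fixed-length routine descriptors (for example, delivery zone, number of attackers in the box, and runner start positions flattened into a vector). The descriptor encoding below is a placeholder; the case study does not describe the club's actual feature set.

import numpy as np


def rank_similar_routines(new_routine: np.ndarray,
                          known_routines: np.ndarray,
                          top_k: int = 5) -> list[int]:
    """Rank previously seen set piece routines by similarity.

    new_routine: 1-D descriptor vector for the opponent's routine.
    known_routines: 2-D array with one descriptor per routine
        FC Analytica has faced before.
    Returns indices of the top_k most similar known routines.
    """
    # Cosine similarity between the new routine and every stored routine.
    norms = np.linalg.norm(known_routines, axis=1) * np.linalg.norm(new_routine)
    similarities = known_routines @ new_routine / np.clip(norms, 1e-9, None)
    return list(np.argsort(similarities)[::-1][:top_k])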

Component 4: Natural Language Report Generation

An LLM-powered system synthesized the outputs of Components 1--3 into a structured pre-match report, written in natural language that the coaching staff could read without statistical expertise. The report included embedded video timestamps, linking analytical findings to specific match moments.
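
Conceptually, the generator is a structured prompt builder: the outputs of Components 1--3 are serialized, and the language model is instructed to turn them into coach-facing prose that cites the embedded timestamps. The sketch below only assembles such a prompt; the actual model call, fine-tuning, and prompt wording used by the club are not documented here.

import json


def build_report_prompt(tactical_profile: dict,
                        threat_profiles: list[dict],
                        set_piece_findings: dict) -> str:
    """Assemble a prompt for the report-generation model.

    Each argument is the structured output of one analysis component.
    Video timestamps embedded in those outputs are passed through so
    the model can reference specific match moments.
    """
    sections = {
        "tactical_profile": tactical_profile,
        "key_player_threats": threat_profiles,
        "set_pieces": set_piece_findings,
    }
    instructions = (
        "Write a pre-match report for the coaching staff. "
        "Use plain football language, avoid statistical jargon, and "
        "cite the provided video timestamps when describing patterns."
    )
    return instructions + "\n\nAnalysis data:\n" + json.dumps(sections, indent=2)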

Component 5: Training Drill Recommendation

Based on the identified opponent profile, the system suggested training drills from a curated database, matched to the specific tactical challenges the opponent presented. For example, if the opponent used a high press with wing-trapping triggers, the system would recommend build-up drills that practiced press-breaking through the center.
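
A simple version of this matching can be implemented as tag overlap between the opponent profile and a tagged drill database, for example using Jaccard similarity. The tags, drill structure, and ranking below are illustrative.

def recommend_drills(opponent_tags: set[str],
                     drill_database: list[dict],
                     top_k: int = 3) -> list[dict]:
    """Suggest training drills matched to the opponent's tactical profile.

    opponent_tags: tags produced by the profiling components,
        e.g. {"high_press", "wing_trap", "zonal_corners"}.
    drill_database: each drill is a dict with 'name' and 'tags' keys,
        e.g. {"name": "Central press-breaking 7v7", "tags": ["high_press"]}.
    Returns the top_k drills ranked by tag overlap (Jaccard similarity).
    """
    def jaccard(tags: set[str]) -> float:
        union = opponent_tags | tags
        return len(opponent_tags & tags) / len(union) if union else 0.0

    ranked = sorted(drill_database,
                    key=lambda d: jaccard(set(d["tags"])),
                    reverse=True)
    return ranked[:top_k]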

Implementation Timeline

Phase                 Duration      Activities
Data infrastructure   2 months      Integrated tracking and event data feeds, set up GPU compute
Model development     4 months      Built and trained Components 1--3
LLM integration       2 months      Fine-tuned report generation, integrated with video
Pilot testing         6 weeks       Parallel running with traditional workflow
Full deployment       Season-long   AI-first workflow with human oversight

Results

Quantitative Outcomes

  • Preparation time reduced by 60%: From 30--40 person-hours to 12--16 person-hours per opponent. The saved time was redirected to deeper analysis of specific tactical challenges and more individual player briefings.
  • Set piece defensive improvement: Opponent set piece conversion rate against FC Analytica dropped from 4.2% (previous season) to 2.8% --- a statistically significant improvement, though confounded by other changes.
  • Fewer reactive first-half changes: Reactive first-half substitutions fell from an average of 0.40 per match to 0.15, and the coaching staff reported feeling "more prepared" entering matches.

Qualitative Outcomes

  • Coaching staff engagement: After initial skepticism, the coaching staff became enthusiastic users of the automated reports. The head coach described them as "like having a very well-prepared assistant who has watched every match."
  • Player reception: Players responded positively to more focused, shorter pre-match briefings. The use of embedded video clips in the AI-generated reports improved retention.
  • Analyst role evolution: Rather than spending time on routine video tagging and statistical compilation, analysts shifted to higher-value activities: validating AI outputs, investigating edge cases, and working directly with coaches on tactical design.

Failures and Challenges

  1. The "novel formation" problem: When an opponent used an unusual tactical setup not well-represented in the training data, the automated system produced unreliable output. This happened three times during the season and required manual intervention each time.

  2. Overconfidence in automated threat assessments: In one match, the system rated an opponent's new signing as a low threat based on limited data (only 4 appearances). The player scored twice. This highlighted the danger of ML systems extrapolating from small samples.

  3. Cultural resistance from senior staff: The goalkeeping coach refused to use the set piece analysis module for three months, preferring his own video review process. He was eventually won over when the system correctly predicted an opponent's corner kick routine, allowing FC Analytica to defend it successfully.

  4. Data quality issues: Tracking data quality varied significantly across venues. Matches at certain stadiums produced lower-quality data, degrading model performance. The team had to implement quality checks and fallback procedures.

  5. Interpretability gaps: The tactical profiling model sometimes produced classifications that the coaching staff found hard to interpret. For example, it might classify an opponent's defensive structure in a way that did not correspond to the coaching staff's mental model of formations. This required ongoing calibration between the analytics team and the coaching staff.

Key Metrics Tracked

import pandas as pd


def evaluate_preparation_system(
    season_data: pd.DataFrame,
    previous_season_data: pd.DataFrame
) -> dict:
    """Compare match preparation outcomes across seasons.

    This function evaluates key performance indicators
    that may be influenced by the AI-assisted preparation system.

    Args:
        season_data: Current season match data with columns
            'opponent', 'preparation_hours', 'result',
            'first_half_subs', 'set_piece_goals_conceded'.
        previous_season_data: Previous season data with same structure.

    Returns:
        Dictionary of comparative metrics.
    """
    metrics = {}

    # Preparation efficiency
    metrics["avg_prep_hours_current"] = season_data["preparation_hours"].mean()
    metrics["avg_prep_hours_previous"] = previous_season_data["preparation_hours"].mean()
    metrics["prep_time_reduction_pct"] = (
        1 - metrics["avg_prep_hours_current"] / metrics["avg_prep_hours_previous"]
    ) * 100

    # Tactical readiness proxy: first-half substitutions
    metrics["avg_first_half_subs_current"] = season_data["first_half_subs"].mean()
    metrics["avg_first_half_subs_previous"] = previous_season_data["first_half_subs"].mean()

    # Set piece defense
    metrics["set_piece_goals_per_match_current"] = (
        season_data["set_piece_goals_conceded"].sum() / len(season_data)
    )
    metrics["set_piece_goals_per_match_previous"] = (
        previous_season_data["set_piece_goals_conceded"].sum() / len(previous_season_data)
    )

    return metrics
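
A minimal usage example, assuming both seasons' match data have been loaded into DataFrames with the columns listed in the docstring (the file names are placeholders):

current = pd.read_csv("matches_2024_25.csv")
previous = pd.read_csv("matches_2023_24.csv")

report = evaluate_preparation_system(current, previous)
print(f"Preparation time reduced by {report['prep_time_reduction_pct']:.1f}%")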

Lessons Learned

1. AI augments; it does not replace

The most important lesson was that the AI system was most effective when it augmented human expertise rather than attempting to replace it. The automated reports were a starting point for the coaching staff's analysis, not the final word. Every AI output was reviewed and contextualized by a human analyst before reaching the coaching staff.

2. Failure modes must be anticipated

The system's failures were predictable in hindsight: novel situations, small samples, and data quality issues are well-known ML challenges. The team learned to build explicit "uncertainty indicators" into every output, flagging cases where the model's confidence was low.
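
A lightweight version of such an indicator can be purely rule-based: flag any output built from a small sample or produced with low model confidence, as in the new-signing incident described above. The thresholds and field names below are illustrative and would be tuned per component.

def uncertainty_flag(sample_size: int, model_confidence: float,
                     min_sample: int = 8, min_confidence: float = 0.7) -> str:
    """Attach a human-readable uncertainty label to an automated output.

    sample_size: number of matches (or events) the output is based on,
        e.g. the 4 appearances behind the mis-rated new signing.
    model_confidence: the model's own confidence score in [0, 1].
    Thresholds are illustrative and would be tuned per component.
    """
    if sample_size < min_sample:
        return "LOW CONFIDENCE: small sample, manual review required"
    if model_confidence < min_confidence:
        return "LOW CONFIDENCE: model uncertain, manual review required"
    return "OK"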

3. Cultural change takes time

Introducing AI-assisted workflows required patience, empathy, and a willingness to accommodate different adoption speeds. Forcing adoption would have created resentment; allowing gradual, voluntary adoption with visible success stories proved more effective.

4. Data infrastructure is the foundation

The first two months spent on data infrastructure --- which produced no visible outputs --- were the most important investment. Without reliable, well-organized data pipelines, the ML models would have been built on sand.

5. Evaluation is difficult

Attributing outcomes (wins, defensive improvements) to the AI system is confounded by many other factors (player fitness, opponent quality, tactical changes unrelated to AI). The club learned to focus on process metrics (preparation time, analyst productivity, coaching staff satisfaction) rather than trying to isolate the system's impact on results.

Discussion Questions

  1. How would you design an evaluation framework that could more rigorously measure the impact of AI-assisted match preparation on match outcomes?

  2. The "novel formation" problem highlights a fundamental limitation of ML systems trained on historical data. What approaches might mitigate this limitation?

  3. The goalkeeping coach's initial resistance was eventually overcome by a specific success story. Is this a reliable method for driving adoption, or does it introduce its own biases (e.g., cherry-picking successes)?

  4. FC Analytica is a well-resourced club in a top European league. How would you adapt this approach for a club with a fraction of the budget and technical staff?

  5. The system's overconfidence about the opponent's new signing raises questions about how ML systems should handle uncertainty when data is limited. What design principles would you implement to address this?

Connection to Chapter Themes

This case study illustrates several themes from Chapter 30:

  • Emerging technologies (Section 30.1): The system combines computer vision, graph neural networks, LLMs, and pattern matching --- multiple cutting-edge technologies working together.
  • The human element (Section 30.4): The system's success depended critically on human oversight, contextual understanding, and the analyst-coach relationship.
  • Ethical considerations (Section 30.2): The system processes personal data about opposing players, raising questions about consent and proportionality.
  • Democratization (Section 30.3): While this implementation was resource-intensive, the underlying techniques are becoming more accessible, and simplified versions could benefit smaller clubs.