Learning Objectives
- Evaluate organizational models for analytics departments including embedded, centralized, and multi-club structures
- Design hiring strategies and team compositions appropriate for different stages of departmental maturity
- Specify technology infrastructure requirements for data platforms, pipelines, and analytical tools
- Establish workflow processes that integrate analytics into matchday preparation, recruitment, and planning
- Apply stakeholder management techniques to build trust and influence with coaches, executives, and medical staff
- Develop frameworks for measuring the return on investment of analytics initiatives
- Build a data-driven culture that embeds analytical thinking across the organization
Chapter 28: Building an Analytics Department
"Data is only as good as the organization that surrounds it. The best model in the world is worthless if nobody listens to it, and nobody listens if you haven't built the structures, trust, and processes to make data part of the culture." --- Rasmus Ankersen, Chairman, FC Midtjylland
Introduction
The professionalization of soccer analytics has accelerated dramatically since the mid-2010s. What began as isolated efforts by mathematically inclined coaches and hobbyist programmers has matured into a recognized organizational function within professional football clubs worldwide. Today, top-tier clubs employ teams of data scientists, analysts, software engineers, and video specialists who work in concert to extract competitive advantage from ever-growing volumes of data.
Yet building an analytics department is not simply a matter of hiring talented individuals and providing them with data. It requires deliberate organizational design, careful integration with existing football operations, robust technology infrastructure, and --- perhaps most importantly --- a sustained commitment from leadership to embed analytical thinking into the club's decision-making DNA.
This chapter provides a comprehensive framework for building, scaling, and managing an analytics department within a professional soccer organization. Whether you are a club executive tasked with establishing an analytics function from scratch, an analyst seeking to understand where your role fits in the broader organization, or a student preparing for a career in sports analytics, the principles and practices outlined here will serve as a practical guide.
We begin with organizational structures (Section 28.1), examining where analytics departments sit within club hierarchies and how reporting lines affect influence. Section 28.2 covers hiring and team composition, detailing the roles, skills, and profiles needed at various stages of departmental maturity. Section 28.3 addresses technology infrastructure --- the platforms, tools, and data pipelines that enable analytical work. Section 28.4 explores workflow and process design, establishing how analytics integrates into matchday preparation, recruitment, and long-term planning. Section 28.5 tackles stakeholder management, arguably the most critical and least technical skill required for analytics success. Section 28.6 provides frameworks for measuring the impact of analytics investment. Finally, Section 28.7 presents case studies from clubs that have successfully built analytics operations at different scales and budgets.
28.1 Organizational Structures
28.1.1 Where Analytics Lives in the Club
The organizational placement of an analytics department has profound implications for its effectiveness, influence, and long-term sustainability. Across professional football, three primary structural models have emerged:
Model 1: Embedded within Football Operations
In this model, the analytics department reports directly to the sporting director, director of football, or head coach. Analysts sit physically and organizationally alongside the coaching staff and recruitment team.
Advantages:
- Direct access to decision-makers
- Tight feedback loops between analysis and action
- Analysts develop deep contextual understanding of team needs
- Lower barriers to communication

Disadvantages:
- Vulnerability to regime change (a new manager may not value analytics)
- Risk of analysts becoming "service providers" rather than strategic partners
- Limited cross-functional influence (e.g., on commercial or medical decisions)
Model 2: Centralized Analytics Unit
Here, analytics operates as an independent department with its own leadership, budget, and strategic objectives. It serves multiple internal clients --- coaching, recruitment, medical, commercial --- and typically reports to the CEO or a chief analytics/strategy officer.
Advantages:
- Greater organizational resilience
- Ability to serve multiple stakeholders
- Clearer career progression for analysts
- Budget protection from football-side volatility

Disadvantages:
- Risk of becoming disconnected from football operations
- Potential for slower response times
- "Ivory tower" perception among coaching staff
- Requires strong internal marketing and relationship management
Model 3: Multi-Club / Group Model
For organizations managing multiple clubs (e.g., City Football Group, the Red Bull network, and other multi-club ownership groups), analytics may operate at the group level, providing centralized services to all clubs while maintaining local liaisons at each site.
Advantages:
- Economies of scale in technology and talent
- Standardized methodologies across the group
- Larger datasets for model training
- Career mobility for analysts across clubs

Disadvantages:
- Local context may be lost
- Communication overhead
- Potential resistance from individual club staff
- Complexity in managing competing priorities
Intuition: Think of these three models as analogous to how technology functions are organized in other industries. Model 1 is like having an embedded IT person in each business unit---responsive but fragile. Model 2 is like a centralized IT department---resilient but potentially distant. Model 3 is like shared services in a multinational corporation---efficient but complex. The right choice depends on your organization's size, culture, and strategic objectives. There is no universally correct answer. The best model is the one that maximizes the probability that analytical insights will actually influence decisions.
28.1.2 Reporting Lines and Authority
The reporting structure within an analytics department determines how information flows, who sets priorities, and how conflicts are resolved. A typical mature department might be structured as follows:
Chief Analytics Officer / Head of Analytics
|
+--- Lead Analyst (First Team)
| +--- Match Analyst(s)
| +--- Performance Data Analyst(s)
|
+--- Lead Analyst (Recruitment)
| +--- Scout Analyst(s)
| +--- Data Scientist(s)
|
+--- Lead Engineer / Data Engineer
| +--- Software Developer(s)
| +--- Database Administrator
|
+--- Video Analysis Lead
+--- Video Analyst(s)
+--- Video Editor(s)
The Head of Analytics should ideally have a "seat at the table" --- attending senior leadership meetings, participating in transfer committee discussions, and contributing to strategic planning. Without this access, the department risks being relegated to a reactive, service-oriented function.
Common Pitfall: A frequent organizational mistake is placing the Head of Analytics too deep in the hierarchy---reporting to an assistant coach or a junior sporting director rather than to the sporting director or CEO directly. This creates a bottleneck: every insight must pass through an intermediary who may not understand or value it. Worse, it signals to the rest of the organization that analytics is a support function rather than a strategic one. If the Head of Analytics does not have direct access to the top decision-makers, the department's influence will be structurally limited regardless of the quality of its work.
Dotted-Line Relationships: In practice, analytics staff often maintain "dotted-line" reporting relationships in addition to their formal reporting line. For example, a recruitment analyst may report formally to the Head of Analytics but have a strong dotted-line relationship with the Chief Scout. Managing these dual reporting lines requires clear expectations about priorities: when the Chief Scout needs an urgent player profile and the Head of Analytics needs a quarterly report, who wins? Establishing these protocols explicitly prevents confusion and conflict.
28.1.3 Maturity Models for Analytics Departments
Analytics departments evolve through recognizable stages of maturity. Understanding where your department sits on this spectrum helps set realistic expectations and plan for growth.
Stage 1: Ad Hoc (1-2 people)
- Single analyst or small team
- Reactive work driven by coaching requests
- Basic tools (Excel, basic video analysis)
- Limited data infrastructure
- Typical budget: $50,000 -- $150,000

Stage 2: Foundational (3-5 people)
- Dedicated roles for match analysis and recruitment
- Established data pipelines from providers (Opta, StatsBomb, etc.)
- Basic dashboards and reporting
- Beginning to influence some decisions
- Typical budget: $200,000 -- $500,000

Stage 3: Established (6-12 people)
- Specialized roles including data scientists and engineers
- Custom models and tools
- Proactive analysis alongside reactive support
- Regular integration into decision-making processes
- Typical budget: $500,000 -- $1,500,000

Stage 4: Advanced (13-25+ people)
- Full-stack analytics operation
- Proprietary data collection and tracking systems
- Research and development function
- Analytics embedded in organizational culture
- Typical budget: $1,500,000 -- $5,000,000+
We can model these stages and their associated costs programmatically. See code/example-01-department-design.py for a complete implementation.
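As a minimal sketch of how the stage definitions might be encoded (the `Stage` dataclass and `recommend_stage` helper are names invented here; the headcounts and budget figures are the illustrative ranges from the maturity model above):

```python
from dataclasses import dataclass

@dataclass
class Stage:
    name: str
    headcount: range       # typical team size
    budget_usd: tuple      # (low, high) typical annual budget

# Illustrative figures taken from the maturity model above
STAGES = [
    Stage("Ad Hoc", range(1, 3), (50_000, 150_000)),
    Stage("Foundational", range(3, 6), (200_000, 500_000)),
    Stage("Established", range(6, 13), (500_000, 1_500_000)),
    Stage("Advanced", range(13, 26), (1_500_000, 5_000_000)),
]

def recommend_stage(budget: int) -> Stage:
    """Return the most mature stage whose lower budget bound the budget covers."""
    feasible = [s for s in STAGES if budget >= s.budget_usd[0]]
    return feasible[-1] if feasible else STAGES[0]
```

A structure like this makes the budget-versus-ambition conversation explicit: a club with $600,000 to spend is an "Established"-stage operation, not an "Advanced" one, however aggressive its hiring plans.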
Real-World Application: Many clubs attempt to leap from Stage 1 directly to Stage 3 or 4, hiring aggressively and purchasing expensive technology. This almost always fails. The foundational stage is essential because it builds the relationships, trust, and institutional knowledge that advanced analytics depends on. A club that hires ten analysts before coaching staff trust a single one will have ten frustrated analysts producing work that nobody reads. The progression through stages is not just about budget---it is about organizational readiness.
28.1.4 Budget Allocation Framework
A well-designed analytics budget typically follows this allocation pattern:
$$B_{\text{total}} = B_{\text{personnel}} + B_{\text{data}} + B_{\text{technology}} + B_{\text{training}} + B_{\text{contingency}}$$
Where typical proportions are:
$$B_{\text{personnel}} \approx 0.55 \cdot B_{\text{total}}$$ $$B_{\text{data}} \approx 0.20 \cdot B_{\text{total}}$$ $$B_{\text{technology}} \approx 0.15 \cdot B_{\text{total}}$$ $$B_{\text{training}} \approx 0.05 \cdot B_{\text{total}}$$ $$B_{\text{contingency}} \approx 0.05 \cdot B_{\text{total}}$$
Personnel costs dominate because analytics is fundamentally a people-driven function. Data acquisition (event data, tracking data, video feeds) represents the second largest expense, followed by technology infrastructure (cloud computing, software licenses, hardware).
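The allocation formula above translates directly into code (proportions as given in the text; the function name is illustrative):

```python
# Typical allocation proportions from the formula above (rules of thumb, not mandates)
ALLOCATION = {
    "personnel": 0.55,
    "data": 0.20,
    "technology": 0.15,
    "training": 0.05,
    "contingency": 0.05,
}

def allocate_budget(total: float) -> dict:
    """Split a total analytics budget using the typical proportions."""
    return {category: round(total * share, 2) for category, share in ALLOCATION.items()}
```

For a $1M budget this yields roughly $550K for personnel and $200K for data, which matches the data-cost figures discussed below for a club scouting across a handful of leagues.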
Data Cost Landscape: The cost of data varies enormously depending on coverage and granularity. Event data for a single domestic league might cost $20,000--50,000 per season, while comprehensive global event data with freeze-frame information can exceed $200,000. Tracking data is typically more expensive, with per-league costs ranging from $50,000 to $150,000 depending on the provider and resolution. Video feeds add another layer of cost, particularly for leagues outside the club's home country. A club scouting globally across 20+ leagues may spend $300,000--500,000 annually on data alone.
28.2 Hiring and Team Composition
28.2.1 Core Roles and Competencies
Building an effective analytics team requires assembling diverse skill sets that collectively span the full spectrum from raw data to actionable insight. The following roles form the backbone of a mature analytics department:
Head of Analytics / Chief Analytics Officer
The department leader must combine technical credibility with organizational savvy. This role requires:
- Strategic vision for how analytics supports club objectives
- Ability to translate between technical and football languages
- Experience managing cross-functional relationships
- Understanding of both the data science and football domains
- Political skill to navigate organizational dynamics
Match / Performance Analyst
Match analysts work most closely with coaching staff, providing tactical analysis and performance insights:
- Deep tactical and technical football knowledge
- Proficiency in video analysis platforms (Hudl/SBG, Catapult, etc.)
- Statistical literacy and data visualization skills
- Communication skills (presenting to coaches and players)
- Ability to work under matchday time pressure
Data Scientist
Data scientists build the models and algorithms that power predictive and prescriptive analytics:
- Strong foundations in statistics and machine learning
- Programming proficiency (Python, R, SQL)
- Experience with sports-specific modeling (expected goals, player valuation, etc.)
- Ability to communicate results to non-technical audiences
- Research orientation with practical delivery focus
Recruitment / Scouting Analyst
Recruitment analysts support the transfer process with data-driven player identification and evaluation:
- Understanding of player evaluation frameworks
- Experience with large-scale data analysis
- Knowledge of transfer market dynamics
- Ability to integrate statistical and video-based scouting
- Cultural awareness (understanding different leagues and playing styles)
Data Engineer / Software Developer
Engineers build and maintain the technical infrastructure:
- Database design and management (SQL, NoSQL)
- ETL pipeline development
- API integration with data providers
- Web/application development for internal tools
- Cloud infrastructure management (AWS, GCP, Azure)
Video Analyst
Video analysts provide qualitative context that complements quantitative analysis:
- Expert-level understanding of tactical concepts
- Proficiency with video editing and tagging tools
- Ability to create compelling visual presentations
- Collaboration with coaching staff on session planning
- Detail orientation for coding and categorizing match events
Best Practice: When writing job descriptions for analytics roles, be specific about the balance of technical and football skills you require. Vague descriptions like "passion for football and data" attract hundreds of unqualified applicants. Instead, specify concrete requirements: "Experience building machine learning models with sports tracking data" or "Ability to deliver a 5-minute tactical presentation to coaching staff based on video and data analysis." Specificity in the job description filters the applicant pool effectively and signals to strong candidates that you understand the role.
28.2.2 The T-Shaped Analyst
The most effective analytics professionals exhibit a "T-shaped" competency profile: deep expertise in one domain (the vertical bar of the T) combined with working knowledge across adjacent areas (the horizontal bar).
$$\text{Effectiveness} = \text{Depth}_{\text{primary}} \times \sum_{i=1}^{n} w_i \cdot \text{Breadth}_{i}$$
Where $\text{Depth}_{\text{primary}}$ represents expertise in the core skill, $\text{Breadth}_i$ represents competence in adjacent skill $i$, and $w_i$ is the weight (importance) of that adjacent skill.
For example, a data scientist with deep machine learning expertise who also possesses working knowledge of football tactics, video analysis, and data engineering will be far more effective than a technically brilliant data scientist who cannot communicate findings to coaches or understand the football context of their models.
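The effectiveness formula can be made concrete under stated assumptions (skills scored on a 0-1 scale, with hypothetical weights; none of this is calibrated, it simply illustrates how breadth multiplies depth):

```python
def effectiveness(depth: float, breadth: dict, weights: dict) -> float:
    """Effectiveness = depth * sum_i(w_i * breadth_i), per the formula above.

    `depth` is expertise in the primary skill; `breadth` maps adjacent skill
    names to competence scores; `weights` gives each adjacent skill's importance.
    All values on an arbitrary 0-1 scale (illustrative, not calibrated).
    """
    return depth * sum(weights[skill] * breadth[skill] for skill in breadth)

# A technically brilliant but narrow specialist...
specialist = effectiveness(
    0.95, {"tactics": 0.1, "communication": 0.1},
    weights={"tactics": 0.5, "communication": 0.5})

# ...versus a slightly less deep but T-shaped analyst
t_shaped = effectiveness(
    0.80, {"tactics": 0.6, "communication": 0.7},
    weights={"tactics": 0.5, "communication": 0.5})
```

Because breadth enters multiplicatively, the T-shaped analyst scores several times higher despite the lower depth score, which is the point of the metaphor.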
Intuition: The T-shape metaphor captures a fundamental truth about analytics in football: no single discipline is sufficient. A brilliant statistician who cannot communicate is like a translator who speaks only one language. A football expert who cannot code is limited to manual, non-scalable analysis. The T-shape means being excellent at one thing while being conversational in several others. In practice, the "horizontal bar" of the T often matters more for career progression than the "vertical bar"---it is the breadth that enables an analyst to connect their work to the broader organizational context.
Common Pitfall: One of the most common mistakes in hiring for analytics roles is over-indexing on either football knowledge or technical skill at the expense of the other. A former professional player with no data skills cannot build models. A brilliant data scientist who has never watched a match cannot contextualize results.
The ideal candidate understands both worlds, but these individuals are rare. A practical strategy is to:
1. Hire for the skill that is harder to teach (usually technical ability)
2. Invest in football education for technical hires
3. Pair technical and football-oriented team members
4. Create an environment where mutual learning is expected
28.2.3 Scaling the Team
The sequence in which roles are added matters. A recommended hiring sequence for a club building from scratch:
| Hire Order | Role | Rationale |
|---|---|---|
| 1 | Head of Analytics | Establishes vision and credibility |
| 2 | Match/Performance Analyst | Provides immediate value to coaching staff |
| 3 | Data Scientist | Begins building models and tools |
| 4 | Recruitment Analyst | Extends analytics to transfer decisions |
| 5 | Data Engineer | Formalizes infrastructure as complexity grows |
| 6 | Video Analyst | Adds qualitative depth |
| 7+ | Additional specialists | Scale based on need (medical analytics, set-piece specialist, etc.) |
This sequence prioritizes building trust with stakeholders early (through match analysis) before investing in more technically sophisticated but slower-to-deliver capabilities (modeling, infrastructure).
Real-World Application: A Championship club in England followed this exact sequence over three seasons. In year one, they hired a Head of Analytics and a Match Analyst. The Match Analyst immediately began producing opponent reports that coaches found useful, establishing credibility. In year two, they added a Data Scientist who built an xG model and a recruitment screening tool. In year three, they hired a Recruitment Analyst and a Data Engineer. By the time the Data Engineer joined, there was already a clear backlog of infrastructure needs that justified the role. If they had hired the Data Engineer first, that person would have been building infrastructure for tools that did not yet exist, with no internal users to validate the work. Sequencing matters.
28.2.4 Compensation Benchmarks
While compensation varies significantly by geography, league level, and club resources, the following ranges provide a rough guide for European football as of the mid-2020s (in USD/EUR equivalents):
| Role | Junior (0-2 yrs) | Mid (3-5 yrs) | Senior (6+ yrs) |
|---|---|---|---|
| Match Analyst | $30K-$50K | $50K-$80K | $80K-$120K |
| Data Scientist | $50K-$80K | $80K-$130K | $130K-$200K+ |
| Recruitment Analyst | $35K-$55K | $55K-$90K | $90K-$140K |
| Data Engineer | $50K-$80K | $80K-$130K | $130K-$180K |
| Video Analyst | $25K-$45K | $45K-$70K | $70K-$100K |
| Head of Analytics | --- | $100K-$160K | $160K-$300K+ |
Note that these figures are approximate and that top-tier clubs in the Premier League or backed by well-funded ownership groups may pay significantly above these ranges, particularly for data scientists and engineers who command premium salaries in the broader technology labor market.
28.2.5 Career Development and Retention
Retaining analytics talent is a significant challenge for football clubs. The skills that make someone an effective sports data scientist---programming, machine learning, statistical modeling---are highly valued in technology, finance, and consulting, where salaries are often 50--100% higher than in football. Clubs that lose experienced analysts face not only the cost of replacement but the loss of institutional knowledge, relationships, and domain expertise that took years to build.
Effective retention strategies include:
Career Pathways: Define clear progression routes. An analyst should be able to see a path from junior analyst to senior analyst to lead analyst to Head of Analytics. Without visible progression, ambitious individuals will seek it elsewhere.
Skill Development: Budget for conference attendance, online courses, and research time. Analysts who feel they are growing professionally are more likely to stay. The 20--30% of time allocated to Tier 3 (R&D) work serves a dual purpose: it drives innovation and it keeps analysts intellectually engaged.
Mission and Purpose: Many analytics professionals choose football over higher-paying industries because they are passionate about the sport. Nurture this passion by involving analysts in meaningful decisions, giving them visibility into the outcomes of their work, and recognizing their contributions to on-pitch success.
Flexible Working: Where possible, offer remote or hybrid working arrangements. Much analytics work (modeling, data engineering, report preparation) can be done remotely, with on-site presence required primarily for matchdays and face-to-face stakeholder meetings.
Competitive Compensation: While football may not match technology sector salaries, clubs should aim to be competitive within the sports analytics market. Losing an analyst because a rival club offered 15% more is a false economy.
Best Practice: Conduct "stay interviews" with your analytics staff---not just exit interviews when they leave. Ask what keeps them motivated, what frustrates them, and what would make them consider leaving. The information from stay interviews is far more actionable than exit interviews, because by the time someone is leaving, it is usually too late to address their concerns.
28.2.6 Diversity and Inclusion in Analytics Hiring
The soccer analytics field has historically been dominated by a narrow demographic: young, white, male, from quantitative academic backgrounds. This lack of diversity is not just an equity issue---it is a performance issue. Homogeneous teams are more susceptible to groupthink, blind spots, and confirmation bias. Teams with diverse backgrounds, perspectives, and experiences produce more creative and robust analyses.
Concrete steps to improve diversity include:
- Broaden sourcing channels: Recruit from universities and communities that are underrepresented in the current talent pool.
- Re-evaluate requirements: Does the role truly require a master's degree in statistics, or would equivalent practical experience suffice?
- Structured interviews: Use standardized evaluation criteria to reduce unconscious bias in hiring decisions.
- Mentorship programs: Support junior staff from underrepresented backgrounds with dedicated mentorship.
- Inclusive culture: Ensure that the department's culture is welcoming to people of all backgrounds, not just those who fit a narrow "football nerd" stereotype.
28.3 Technology Infrastructure
28.3.1 The Analytics Technology Stack
A modern soccer analytics department relies on a layered technology stack. Each layer serves a distinct purpose and must integrate seamlessly with the others:
Layer 1: Data Sources
- Event data providers (Opta/Stats Perform, StatsBomb, Wyscout)
- Tracking data (Second Spectrum, SkillCorner, Signality, Hawk-Eye)
- Physical performance data (GPS/accelerometer: Catapult, STATSports, Polar)
- Video feeds (broadcast, tactical camera, drone)
- Internal scouting reports and coach evaluations
- Medical and wellness data
- Financial and contract data

Layer 2: Data Infrastructure
- Cloud platform (AWS, GCP, Azure)
- Data warehouse / data lake (Snowflake, BigQuery, Redshift, S3)
- ETL/ELT pipelines (Airflow, dbt, custom Python scripts)
- API layer for data access
- Version control (Git)

Layer 3: Analytics Tools
- Statistical computing (Python, R)
- Machine learning frameworks (scikit-learn, PyTorch, TensorFlow)
- Visualization (matplotlib, Plotly, D3.js, Tableau)
- Geospatial analysis tools (for pitch-based visualization)
- Video analysis platforms (Hudl/SBG, Catapult/ProZone)

Layer 4: Delivery and Communication
- Dashboards (Tableau, Power BI, Streamlit, custom web apps)
- Reporting tools (Jupyter notebooks, automated PDF reports)
- Presentation platforms (for matchday and scouting presentations)
- Mobile interfaces (for coaching staff on the training ground)
- Slack/Teams integration for alerts and notifications
28.3.2 Data Architecture Principles
The data architecture should be designed around several key principles:
Single Source of Truth (SSOT): All data consumers should access the same canonical data, avoiding the "multiple spreadsheets with different numbers" problem that plagues many organizations.
Reproducibility: Any analysis should be reproducible from raw data through to final output. This requires version-controlled code, documented data transformations, and clear lineage tracking.
Scalability: The architecture should accommodate growing data volumes without fundamental redesign. Tracking data alone can generate hundreds of gigabytes per season.
Security and Access Control: Player medical data, contract information, and scouting intelligence are highly sensitive. Role-based access control is essential.
A simplified data flow can be expressed as:
$$\text{Raw Data} \xrightarrow{\text{Ingest}} \text{Data Lake} \xrightarrow{\text{Transform}} \text{Data Warehouse} \xrightarrow{\text{Serve}} \text{Analytics Layer} \xrightarrow{\text{Deliver}} \text{End Users}$$
Intuition: Think of data architecture like plumbing in a building. The data lake is the water reservoir---it stores everything in its rawest form. The data warehouse is the water treatment plant---it cleans, structures, and organizes the data for consumption. The analytics layer is the faucet---it delivers specific data to specific users in the form they need. Just as you would not drink directly from the reservoir, end users should not query raw data directly. And just as poor plumbing creates leaks, pressure drops, and contamination, poor data architecture creates inconsistencies, delays, and errors.
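The raw-to-delivery flow can be sketched as a chain of small functions. This is a toy illustration only: a real system substitutes a data lake, a warehouse, and an orchestrator for these stages, and every name below is hypothetical.

```python
import json

def ingest(raw_records: list) -> list:
    """Data lake stage: capture everything in near-raw form, discard nothing."""
    return [json.loads(record) for record in raw_records]

def transform(records: list) -> list:
    """Warehouse stage: clean and standardize (here: drop rows missing a player id)."""
    return [r for r in records if r.get("player_id") is not None]

def serve(records: list, player_id: int) -> list:
    """Analytics layer: answer a specific question for a specific consumer."""
    return [r for r in records if r["player_id"] == player_id]

raw = ['{"player_id": 7, "event": "shot"}', '{"player_id": null, "event": "pass"}']
events = serve(transform(ingest(raw)), player_id=7)
```

Note how the end user only ever touches the output of `serve`, mirroring the principle that nobody queries the raw layer directly.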
28.3.3 Data Pipeline Design
The data pipeline is the automated system that moves data from sources through transformations to end users. A well-designed pipeline operates reliably with minimal human intervention and provides clear visibility into its status.
Ingestion Layer: Data arrives from multiple sources in different formats and at different frequencies. Event data might arrive via API within 30 seconds of an event occurring. Tracking data may arrive as batch files 2--4 hours after a match. Medical data might be entered manually into a wellness application. The ingestion layer must handle all of these patterns:
- API polling: Scheduled jobs that pull data from provider APIs at regular intervals.
- Webhook receivers: Endpoints that accept push-based data delivery from providers.
- File watchers: Processes that monitor shared folders or cloud storage for new data files.
- Manual upload interfaces: Simple web forms for data that is captured manually (scouting notes, medical assessments).
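The API-polling pattern, in miniature, is a loop that pulls new records and advances a cursor. The `fetch` callable below stands in for a real provider client (hypothetical; actual provider APIs vary in authentication, pagination, and cursor semantics):

```python
import time
from typing import Callable

def poll(fetch: Callable, since: str, max_polls: int = 3,
         interval: float = 0.0) -> list:
    """Scheduled-polling sketch: repeatedly pull new records from a provider.

    `fetch(since)` returns records newer than the `since` cursor; the cursor
    advances to the timestamp of the last record ingested.
    """
    collected = []
    for _ in range(max_polls):
        batch = fetch(since)
        if batch:
            collected.extend(batch)
            since = batch[-1]["timestamp"]  # advance the cursor
        time.sleep(interval)
    return collected

# Simulated provider responses keyed by cursor position
pages = {
    "t0": [{"timestamp": "t1", "event": "pass"}],
    "t1": [{"timestamp": "t2", "event": "shot"}],
    "t2": [],
}
ingested = poll(lambda since: pages.get(since, []), since="t0")
```

The cursor is what makes polling idempotent: restarting the job from the last stored cursor never re-ingests records it has already seen.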
Transformation Layer: Raw data is rarely in the format needed for analysis. The transformation layer applies cleaning, enrichment, and restructuring:
- Cleaning: Handle missing values, correct obvious errors, standardize formats.
- Enrichment: Add derived fields (e.g., calculate speed from consecutive position records, assign pitch zones to coordinates).
- Joining: Merge data from multiple sources (e.g., link event data to tracking data for the same match).
- Aggregation: Pre-compute commonly requested summaries (per-player-per-match statistics, rolling averages).
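The enrichment step mentioned above, deriving speed from consecutive position records, makes a compact example. Field names (`x`, `y`, `t`) and units (metres, seconds) are assumptions; real tracking providers differ:

```python
import math

def enrich_with_speed(positions: list) -> list:
    """Enrichment sketch: derive speed (m/s) from consecutive tracking records.

    Each record is assumed to have x/y coordinates in metres and a timestamp
    t in seconds. The first record has no predecessor, so its speed is None.
    """
    enriched = []
    prev = None
    for p in positions:
        speed = None
        if prev is not None:
            dt = p["t"] - prev["t"]
            if dt > 0:  # guard against duplicate or out-of-order timestamps
                speed = math.hypot(p["x"] - prev["x"], p["y"] - prev["y"]) / dt
        enriched.append({**p, "speed": speed})
        prev = p
    return enriched
```

The `dt > 0` guard is the kind of defensive check that belongs in the transformation layer rather than in every downstream analysis.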
Orchestration: Tools like Apache Airflow or Prefect schedule and coordinate pipeline tasks, handling dependencies (e.g., "do not compute player ratings until both event data and tracking data for this match have been ingested and cleaned") and managing retries when individual tasks fail.
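In miniature, the dependency-and-retry logic an orchestrator provides looks like the sketch below. This is a toy stand-in for Airflow or Prefect, not their API; all task names are hypothetical:

```python
def run_pipeline(tasks: dict, deps: dict, max_retries: int = 2) -> list:
    """Run tasks in dependency order, retrying failed tasks.

    `tasks` maps name -> callable; `deps` maps name -> prerequisite names.
    Returns the order in which tasks completed successfully.
    """
    done, order = set(), []
    while len(done) < len(tasks):
        progressed = False
        for name, fn in tasks.items():
            if name in done or not all(d in done for d in deps.get(name, [])):
                continue
            for attempt in range(max_retries + 1):
                try:
                    fn()
                    done.add(name)
                    order.append(name)
                    progressed = True
                    break
                except Exception:
                    if attempt == max_retries:
                        raise  # exhausted retries: surface the failure
        if not progressed:
            raise RuntimeError("circular or unsatisfiable dependencies")
    return order

# "Do not compute player ratings until both feeds are ingested"
log = []
completed = run_pipeline(
    tasks={"ingest_events": lambda: log.append("e"),
           "ingest_tracking": lambda: log.append("t"),
           "player_ratings": lambda: log.append("r")},
    deps={"player_ratings": ["ingest_events", "ingest_tracking"]},
)
```

Real orchestrators add scheduling, alerting, and a UI on top of exactly this dependency graph, which is why adopting one early is cheap insurance.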
Common Pitfall: A common infrastructure failure mode is the "pipeline of shame"---a collection of ad hoc scripts, scheduled tasks, and manual steps that collectively form the club's data pipeline but that no single person fully understands. When the person who wrote the scripts leaves, the pipeline becomes a black box. Eventually, something breaks, and nobody knows how to fix it. The solution is to invest in proper orchestration tooling and documentation from the start, even when the pipeline is simple. It is far easier to maintain good practices than to retrofit them onto a tangled mess of scripts.
28.3.4 Build vs. Buy Decisions
One of the most consequential technology decisions an analytics department faces is the build-versus-buy trade-off. This applies at every layer of the stack:
| Component | Buy (SaaS/Vendor) | Build (In-House) |
|---|---|---|
| Event Data | Opta, StatsBomb | Computer vision pipelines |
| Tracking Data | Second Spectrum, SkillCorner | Custom camera systems |
| Dashboards | Tableau, Power BI | Streamlit, custom web apps |
| xG Models | Provider-supplied | Custom research models |
| Recruitment Platform | Wyscout, TransferRoom | Proprietary platform |
| Video Tagging | Hudl/SBG | Custom tools |
The general principle is:
$$\text{Build if: } \frac{\text{Advantage}_{\text{build}}}{\text{Development Cost} + \text{Maintenance Cost}} > \frac{\text{Advantage}_{\text{buy}}}{\text{Buy Cost}}$$

That is, build only when the competitive advantage gained per dollar of building exceeds the advantage gained per dollar of buying.
In practice, most clubs should buy commodity services (event data, basic dashboards) and build where differentiation matters most (custom models, proprietary tools for recruitment).
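One way to operationalize the build/buy comparison is value per dollar spent. This is a deliberate simplification (in practice the "advantage" terms are rough estimates, and the function name is invented here):

```python
def prefer_build(advantage_build: float, dev_cost: float, maint_cost: float,
                 advantage_buy: float, buy_cost: float) -> bool:
    """Compare value per dollar of building vs. buying (a simplification).

    All arguments are in the same currency units; the 'advantage' figures are
    the estimated worth of each capability to the club, which in reality must
    be judged, not measured.
    """
    return advantage_build / (dev_cost + maint_cost) > advantage_buy / buy_cost
```

Even with invented numbers, forcing the comparison into this form is useful: it makes stakeholders state explicitly how much extra advantage they believe the custom tool will deliver.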
Real-World Application: One Premier League club spent 18 months and approximately $400,000 building a custom recruitment platform to replace their Wyscout-based workflow. The platform was technically impressive---it integrated multiple data sources, ran proprietary models, and presented results in a custom-designed interface. However, the scouts refused to use it. They found the interface unintuitive compared to the Wyscout platform they were accustomed to, and the proprietary models occasionally recommended players that scouts considered clearly unsuitable, eroding trust. The club eventually reverted to a modified version of their Wyscout workflow, supplemented by a simpler analytics overlay. The lesson: building custom tools is only worthwhile if the end users will actually adopt them. User research and iterative design with the intended users must be part of the development process.
Advanced: The hidden cost of building in-house tools is substantial and frequently underestimated:
- Maintenance: Software requires ongoing updates, bug fixes, and security patches
- Documentation: Custom tools need thorough documentation for knowledge transfer
- Key-person risk: If the developer leaves, can someone else maintain the code?
- Opportunity cost: Time spent building infrastructure is time not spent on analysis
A useful heuristic: build only when the in-house solution will be demonstrably superior to available commercial options AND when you have the engineering capacity to maintain it long-term.
28.3.5 Data Provider Evaluation
Selecting data providers is a critical early decision. A structured evaluation framework should consider:
- Coverage: Does the provider cover all leagues and competitions relevant to the club?
- Granularity: What level of detail is captured (e.g., basic events vs. detailed qualifiers vs. freeze frames)?
- Timeliness: How quickly after a match is data available?
- Accuracy: What is the provider's error rate, and how is quality controlled?
- API quality: How easy is it to programmatically access and integrate the data?
- Cost: What is the total cost of ownership, including potential volume-based pricing?
- Exclusivity: Does the contract allow or prevent sharing data with third parties?
A weighted scoring model can formalize this evaluation:
$$S_{\text{provider}} = \sum_{i=1}^{7} w_i \cdot s_i$$
where $w_i$ is the weight assigned to criterion $i$ and $s_i$ is the score (e.g., 1--5) for that criterion.
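As a minimal sketch, the weighted scoring model might be implemented as follows. The criterion weights and the two candidate providers' scores below are purely illustrative, not recommendations:

```python
# Weighted scoring model for data provider evaluation.
# Weights and scores are illustrative placeholders, not recommendations.

CRITERIA_WEIGHTS = {
    "coverage": 0.20,
    "granularity": 0.15,
    "timeliness": 0.10,
    "accuracy": 0.20,
    "api_quality": 0.15,
    "cost": 0.10,
    "exclusivity": 0.10,
}  # weights sum to 1.0

def provider_score(scores: dict) -> float:
    """S_provider = sum over criteria of w_i * s_i (scores on a 1-5 scale)."""
    return sum(w * scores[c] for c, w in CRITERIA_WEIGHTS.items())

# Hypothetical 1-5 scores for two candidate providers
provider_a = {"coverage": 5, "granularity": 4, "timeliness": 3,
              "accuracy": 4, "api_quality": 5, "cost": 2, "exclusivity": 3}
provider_b = {"coverage": 3, "granularity": 5, "timeliness": 4,
              "accuracy": 4, "api_quality": 3, "cost": 4, "exclusivity": 4}

print(round(provider_score(provider_a), 2))  # 3.95
print(round(provider_score(provider_b), 2))  # 3.8
```

Making the weights explicit is the point of the exercise: a club recruiting across many leagues might weight coverage heavily, and changing that weight visibly changes the ranking.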
28.4 Workflow and Process Design
28.4.1 The Analytics Workflow Lifecycle
Analytics work in professional football follows cyclical patterns aligned with the competitive calendar. Understanding these cycles is essential for resource planning and workflow design.
Matchday Cycle (Weekly during season)
Day -6: Post-match debrief (previous match)
Day -5: Opponent analysis begins
Day -4: Detailed tactical report compiled
Day -3: Presentation to coaching staff
Day -2: Player-specific analysis distributed
Day -1: Final preparations and set-piece review
Day 0: Matchday (real-time analysis, half-time report)
Day +1: Post-match analysis and reporting
Transfer Windows (Biannual)
Ongoing: Longlist maintenance and monitoring
Week -8: Priority positions identified with coaching/sporting staff
Week -6: Shortlist generation (data-driven screening)
Week -4: Deep dives on top candidates (video + data)
Week -2: Financial modeling and negotiation support
Window: Due diligence and deal support
Post: Integration tracking for new signings
Season Cycle (Annual)
Pre-season: Squad evaluation, new signing integration, baseline metrics
Early season: Model calibration, initial trend analysis
Mid-season: January window support, tactical adaptation analysis
Late season: Form analysis, playoff/relegation scenario modeling
Off-season: R&D, infrastructure upgrades, process improvement
Best Practice: Map your analytics team's workload across these overlapping cycles and identify the peak-load periods. In English football, the most demanding period is typically late December through early January, when the Christmas fixture congestion coincides with the January transfer window opening. If your team is understaffed for these peaks, consider hiring seasonal support (graduate interns, for example) or pre-building as much January window analysis as possible during the quieter autumn months.
28.4.2 Automating Routine Workflows
A significant portion of analytics work is repetitive: data ingestion, report generation, dashboard updates, and standard post-match analysis. Automating these tasks frees analysts to focus on higher-value work.
The code/example-02-workflow-automation.py file demonstrates a complete automated pipeline, but the conceptual framework is as follows:
$$\text{Analyst Time}_{\text{available}} = \text{Total Time} - \text{Time}_{\text{routine}} - \text{Time}_{\text{administrative}}$$
Maximizing $\text{Analyst Time}_{\text{available}}$ for creative, high-value analysis requires minimizing both the time consumed by routine tasks (through automation) and administrative overhead (through good process design).
Key areas for automation include:
- Data ingestion: Scheduled pulls from provider APIs
- Report generation: Templated post-match and weekly reports
- Dashboard updates: Automatic data refresh and visualization updates
- Alert systems: Notifications when metrics exceed thresholds (e.g., player workload warnings)
- Quality checks: Automated data validation upon ingestion
Measuring Automation ROI: Track how many hours per week are saved by each automated process. A post-match report that previously took an analyst 3 hours to compile manually but now generates automatically in 10 minutes saves approximately 130 hours per season (assuming 45 matches). At a loaded cost of $50/hour for analyst time, that single automation saves $6,500 per season. When you can demonstrate that $10,000 invested in automation tooling saves $30,000 per season in analyst time, the business case for further automation investment becomes straightforward.
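The arithmetic above can be wrapped in a small helper so each automation's savings are tracked consistently. The figures below are the illustrative ones from the text, not real club numbers:

```python
def automation_savings(manual_hours: float, automated_hours: float,
                       runs_per_season: int, hourly_cost: float) -> dict:
    """Estimate hours and money saved per season by one automated process."""
    hours_saved = (manual_hours - automated_hours) * runs_per_season
    return {"hours_saved": hours_saved, "cost_saved": hours_saved * hourly_cost}

# Post-match report: 3 hours manually, 10 minutes automated,
# 45 matches per season, $50/hour loaded analyst cost
report = automation_savings(3.0, 10 / 60, 45, 50.0)
print(round(report["hours_saved"], 1), round(report["cost_saved"]))  # 127.5 6375
# The text rounds these to approximately 130 hours and $6,500 per season.
```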
28.4.3 Knowledge Management
An often-overlooked aspect of workflow design is knowledge management: how does the department capture, organize, and share what it learns?
Documentation Standards:
- All code should be version-controlled and documented
- Analytical findings should be stored in a searchable repository
- Methodological decisions should be recorded with rationale
- Post-project retrospectives should be conducted and archived
Knowledge Sharing Practices:
- Regular internal seminars or "lunch and learn" sessions
- Pair programming or analysis for skill transfer
- Standard templates for common analyses
- A shared library of reusable code and visualizations
Institutional Memory:
- When analysts leave, their knowledge should not leave with them
- Documentation of all active models, dashboards, and data pipelines
- Succession planning for critical functions
- Onboarding materials for new hires
Common Pitfall: The most dangerous form of technical debt in an analytics department is not poorly written code---it is undocumented knowledge. When the only person who understands how the xG model was calibrated, or why the recruitment dashboard uses a particular weighting scheme, or how the data pipeline handles a specific edge case leaves the club, that knowledge is gone. The cost of reconstructing it---if it can be reconstructed at all---far exceeds the cost of documenting it in the first place. Make documentation a deliverable, not an afterthought.
28.4.4 Project Prioritization
Analytics departments inevitably face more requests than they can fulfill. A structured prioritization framework prevents the team from becoming purely reactive:
$$\text{Priority Score} = w_1 \cdot \text{Impact} + w_2 \cdot \text{Urgency} + w_3 \cdot \text{Feasibility} - w_4 \cdot \text{Cost}$$
Where each factor is scored on a standardized scale (e.g., 1-10) and weights reflect organizational values. For example, a club in a relegation battle might weight urgency heavily, while a club in a rebuilding phase might weight long-term impact more.
A practical approach is to maintain a backlog with three tiers:
| Tier | Description | Example |
|---|---|---|
| 1 | Must-do, time-sensitive | Pre-match opponent report |
| 2 | Important, scheduled | Monthly squad performance review |
| 3 | Valuable, flexible timing | New xG model development |
Tier 1 work takes precedence, but a healthy department should allocate at least 20-30% of capacity to Tier 3 (research and development) work to drive innovation and long-term improvement.
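As a sketch, the priority score and the tiered backlog might be combined like this. The weights, the 1-10 scores, and the tier assignments are illustrative:

```python
def priority_score(impact: float, urgency: float, feasibility: float,
                   cost: float, weights=(0.4, 0.3, 0.2, 0.1)) -> float:
    """Priority = w1*Impact + w2*Urgency + w3*Feasibility - w4*Cost (1-10 scales)."""
    w1, w2, w3, w4 = weights
    return w1 * impact + w2 * urgency + w3 * feasibility - w4 * cost

# Hypothetical backlog entries: (name, tier, impact, urgency, feasibility, cost)
backlog = [
    ("New xG model development", 3, 9, 2, 5, 7),
    ("Pre-match opponent report", 1, 7, 10, 9, 2),
    ("Monthly squad performance review", 2, 6, 5, 8, 3),
]

# Tier 1 first; within a tier, highest priority score first
ranked = sorted(backlog, key=lambda p: (p[1], -priority_score(*p[2:])))
for name, tier, *scores in ranked:
    print(f"Tier {tier}: {name} (score {priority_score(*scores):.1f})")
```

A relegation-threatened club could simply raise the urgency weight in `weights`; the backlog re-ranks itself accordingly.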
28.5 Stakeholder Management
28.5.1 Identifying and Mapping Stakeholders
Effective stakeholder management begins with understanding who the stakeholders are and what they need. In a professional football club, the analytics department's stakeholders typically include:
Primary Stakeholders (Direct users of analytics output):
- Head coach and coaching staff
- Sporting director / Director of football
- Chief scout and scouting team
- Performance / sports science staff
Secondary Stakeholders (Indirect beneficiaries or influencers):
- Club CEO and board
- Players
- Medical staff
- Academy staff and coaches
- Commercial and marketing departments
External Stakeholders:
- Ownership group
- Agents (during transfer negotiations)
- Media (selectively)
- Data providers and technology partners
Each stakeholder has different needs, communication preferences, and levels of data literacy. Mapping these characteristics is the first step toward effective engagement:
$$\text{Engagement Strategy}_i = f(\text{Influence}_i, \text{Interest}_i, \text{Data Literacy}_i)$$
28.5.2 Building Trust with Coaching Staff
The relationship between the analytics department and the coaching staff is the single most important determinant of analytics effectiveness. Building this trust requires:
1. Start with their questions, not your answers. Rather than pushing unsolicited analysis, begin by understanding what the coaching staff wants to know. Frame your work as supporting their vision, not imposing an alternative worldview.
2. Speak their language. Coaches think in tactical concepts, not statistical abstractions. Instead of "his expected threat per 90 from progressive carries is in the 94th percentile," try "he's one of the best in the league at driving forward with the ball into dangerous areas --- he does it more often and more effectively than almost anyone."
3. Be honest about uncertainty. Nothing erodes trust faster than overconfident predictions that turn out wrong. Communicate uncertainty ranges and caveats clearly. A coach who understands that your model gives a 70% probability, not a guarantee, will be more forgiving when the 30% occurs.
4. Deliver quick wins early. Before embarking on complex projects, deliver simple but useful analyses that demonstrate value. A well-designed set-piece report or opponent analysis that coaches find genuinely useful builds credibility for more ambitious work later.
5. Be present. Physical presence matters. Analysts who attend training sessions, sit in tactical meetings, and travel with the team develop better intuition and stronger relationships than those who work in isolation.
Real-World Application: At one Bundesliga club, the Head of Analytics made a deliberate decision to spend the first three months in the role doing nothing but attending training sessions, watching matches from the bench, and having informal conversations with coaching staff. No dashboards were built. No models were deployed. No reports were written. Instead, the Head of Analytics developed a deep understanding of what the coaches valued, how they communicated, and where they felt they lacked information. When the first analytical products were finally delivered---three months later than the board expected---they were precisely calibrated to the coaching staff's needs. Adoption was immediate and enthusiastic. The three-month investment in relationship-building paid dividends for years.
Intuition: The "So What?" test is essential for every piece of analysis before it reaches a stakeholder. Ask yourself:
- So what? Why does this finding matter?
- Now what? What action should be taken as a result?
- Says who? What evidence supports this conclusion?
- How confident? What is the uncertainty around this finding?
If you cannot answer all four questions clearly, the analysis is not ready for delivery.
28.5.3 Integration with Scouting, Medical, and Commercial Departments
Analytics should not operate in isolation from other club departments. The most impactful analytics work often occurs at the intersection of multiple functions:
Scouting Integration: The analytics team and the scouting department should operate as partners, not competitors. Data-driven screening identifies candidates; traditional scouting provides the qualitative assessment that data cannot capture (character, mentality, injury proneness, adaptability). A healthy workflow involves analytics generating initial longlists, scouts narrowing them through video and live observation, and the two functions collaborating on final shortlists. The worst outcome is parallel, disconnected processes---the analytics team recommending one set of players and the scouts recommending a completely different set, with the sporting director left to adjudicate between them.
Medical Department Integration: Injury risk modeling, load management, and return-to-play decisions all benefit from analytical input. The analytics team can integrate training load data, match load data, sleep and wellness questionnaires, and historical injury records to build predictive models. However, this work requires close collaboration with medical staff and must respect clinical judgment. The analyst can flag that a player's cumulative load is entering a high-risk zone; the decision to rest that player remains clinical.
Commercial and Marketing Integration: Analytics can support commercial functions through fan engagement analysis, matchday revenue optimization, pricing models for hospitality, and social media strategy. While these applications are outside the scope of this book's primary focus on football performance, they represent a significant opportunity for analytics departments to demonstrate value beyond the football operations silo.
Best Practice: Create formal touchpoints between analytics and other departments. A monthly meeting between the analytics team and the medical department to review injury risk models. A weekly call between analytics and scouting to discuss recruitment targets. A quarterly presentation to the commercial team on fan engagement trends. These structured interactions prevent siloed working and ensure that analytics is perceived as a club-wide resource, not just a football operations tool.
28.5.4 Presenting to Non-Technical Audiences: Storytelling with Data
The ability to communicate complex analyses to non-technical audiences is perhaps the most critical skill in the analytics department's repertoire. Key principles include:
Lead with the conclusion. Non-technical audiences want the answer first, then the supporting evidence. Academic conventions (methodology, then results) should be reversed.
Visualize, don't tabulate. A well-designed chart conveys more information more quickly than a table of numbers. Use pitch maps, heat maps, and intuitive visualizations that leverage spatial understanding.
Use comparisons and benchmarks. Raw numbers are meaningless without context. "72% pass completion" means nothing; "72% pass completion, compared to a league average of 68% for centre-backs" tells a story.
Layer information. Start with the headline finding, then offer progressively more detail for those who want to dig deeper. A dashboard that allows drill-down serves this purpose well.
Limit scope. A report that tries to cover everything will be ignored. Focus on the 2-3 most important findings and resist the temptation to show all your work.
Use narrative structure. The best analytical presentations follow a story arc: situation (what is the context?), complication (what is the problem or opportunity?), resolution (what does the data suggest?), and action (what should we do?). This structure gives the audience a framework for understanding why the analysis matters.
Advanced: Consider developing a "data language" specific to your club---a standardized set of terms, metrics, and visualizations that everyone from the coaching staff to the board understands. For example, if you consistently present player performance using radar charts with the same six axes (goal threat, creativity, ball progression, defensive contribution, pressing, physical output), stakeholders will develop an intuition for reading these charts over time. Consistency in visual language reduces the cognitive effort required to extract meaning, which increases the likelihood that your analysis will be understood and acted upon.
28.5.5 Managing Resistance and Skepticism
Resistance to analytics is natural and should be expected, not feared. Common sources of resistance include:
- Threat perception: "Are they trying to replace my judgment?"
- Past experience: "We tried data before and it didn't work."
- Complexity aversion: "I don't understand these models."
- Cultural inertia: "We've always done it this way."
- Valid skepticism: "Can data really capture what I see on the pitch?"
Effective responses to each:
| Source of Resistance | Response Strategy |
|---|---|
| Threat perception | Frame analytics as augmenting, not replacing, expertise |
| Past experience | Acknowledge failures, demonstrate what's different now |
| Complexity aversion | Simplify delivery, educate gradually |
| Cultural inertia | Find early adopters, let results speak |
| Valid skepticism | Acknowledge limitations honestly, combine with qualitative insight |
28.5.6 Change Management in Traditional Football Organizations
Introducing analytics into a traditional football club is fundamentally a change management challenge. Kotter's eight-step model of organizational change provides a useful framework:
- Create urgency: Demonstrate the competitive gap---show how rival clubs are using analytics to gain advantage.
- Build a coalition: Identify early adopters within the coaching and scouting staff who are open to data-driven approaches. Their advocacy is more persuasive than any report.
- Form a vision: Articulate a clear vision of what a data-informed club looks like---not a club run by algorithms, but a club where every decision is informed by the best available evidence.
- Communicate the vision: Consistently reinforce the message through actions, not just words. Every successful analytical intervention is an opportunity to demonstrate the vision in action.
- Remove obstacles: Address practical barriers (lack of data access, insufficient technology, time pressure) that prevent stakeholders from engaging with analytics.
- Create short-term wins: Deliver visible, tangible successes early. A set-piece analysis that directly contributes to a goal is worth more for organizational buy-in than a year's worth of dashboards.
- Build on gains: Use early wins to justify further investment and expand the scope of analytics.
- Anchor in culture: Over time, data-informed decision-making should become "how we do things here," not a special initiative.
Real-World Application: When Brighton & Hove Albion began building their analytics capability in the early 2010s, the club was in the lower divisions of English football. The analytics function started small, with a single analyst working closely with the coaching staff on match preparation. As the club climbed through the divisions, the analytics department grew in tandem, always ensuring that its capabilities matched the organization's readiness to absorb them. By the time Brighton reached the Premier League, analytics was deeply embedded in the club's culture---not because it was imposed from above, but because it had evolved organically alongside the club's development. The lesson is that building a data culture is a marathon, not a sprint, and attempting to force the pace risks backlash and rejection.
28.6 Measuring Analytics Impact
28.6.1 The Measurement Challenge
Measuring the return on investment (ROI) of analytics is inherently difficult because:
- Attribution: When a team wins, how much of that success is attributable to analytics versus coaching, player talent, luck, and other factors?
- Counterfactual: We cannot observe what would have happened without analytics (the counterfactual).
- Time horizons: Some analytics benefits (e.g., better recruitment) manifest over years, not weeks.
- Qualitative benefits: Improved decision-making confidence and reduced cognitive bias are real but hard to quantify.
Despite these challenges, structured measurement is essential for justifying investment and driving continuous improvement.
28.6.2 A Framework for Impact Measurement
We propose a multi-dimensional framework that captures both quantitative and qualitative impacts:
Dimension 1: Financial Impact
$$\text{ROI}_{\text{financial}} = \frac{\text{Revenue Gains} + \text{Cost Savings} - \text{Analytics Investment}}{\text{Analytics Investment}} \times 100\%$$
Financial impact can be estimated through:
- Recruitment efficiency: Transfer spend avoided through better player identification
- Salary optimization: Reduced overpayment through market value modeling
- Revenue impact: Points gained (leading to prize money, TV revenue, etc.)
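A minimal sketch of the financial ROI calculation, with entirely hypothetical figures:

```python
def analytics_roi(revenue_gains: float, cost_savings: float,
                  investment: float) -> float:
    """ROI_% = (revenue gains + cost savings - investment) / investment * 100."""
    return (revenue_gains + cost_savings - investment) / investment * 100

# Hypothetical year: $2.0m revenue impact from points gained,
# $1.5m transfer spend avoided, against a $1.0m annual analytics budget
print(f"{analytics_roi(2_000_000, 1_500_000, 1_000_000):.0f}%")  # 250%
```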
Dimension 2: Decision Quality
Track the adoption and outcomes of analytically-informed decisions: - What percentage of recruitment recommendations were followed? - Of those followed, what percentage were successful (by predefined criteria)? - How did analytically-supported signings perform versus the club average?
Dimension 3: Process Efficiency
$$\text{Efficiency Gain} = \frac{\text{Time}_{\text{before automation}} - \text{Time}_{\text{after automation}}}{\text{Time}_{\text{before automation}}} \times 100\%$$
Measure time savings from automated reporting, streamlined scouting workflows, and reduced manual data processing.
Dimension 4: Stakeholder Satisfaction
Regular surveys of internal clients (coaching staff, sporting director, recruitment team) provide qualitative feedback on:
- Relevance of analysis to their needs
- Timeliness of delivery
- Quality of communication
- Overall perceived value
28.6.3 Key Performance Indicators (KPIs)
A balanced scorecard for an analytics department might include:
Output KPIs:
- Number of pre-match reports delivered on time
- Number of player profiles generated for recruitment
- Number of ad-hoc analysis requests fulfilled
- Number of models deployed and maintained
Outcome KPIs:
- Adoption rate of analytics recommendations
- Success rate of analytically-supported signings
- Accuracy of predictive models (calibration, discrimination)
- Cost savings attributable to analytics
Process KPIs:
- Average turnaround time for analysis requests
- Percentage of routine reporting that is automated
- System uptime and data pipeline reliability
- Code quality metrics (test coverage, documentation)
People KPIs:
- Team retention rate
- Skill development and certification completion
- Internal stakeholder satisfaction scores
- Cross-functional collaboration frequency
See code/example-03-impact-measurement.py for a complete implementation of an impact measurement dashboard.
Best Practice: Review KPIs quarterly and present them to club leadership. Even if the numbers are imperfect, the act of measurement signals accountability and professionalism. It also provides a basis for resource allocation discussions: "Our recruitment model identified 15 candidates this window, 8 were shortlisted by scouts, 3 were signed, and all 3 have outperformed their expected metrics after 6 months" is a far more compelling argument for budget increases than "we think we're adding value."
28.6.4 The Points-Above-Replacement Model
One particularly useful metric borrows from baseball's Wins Above Replacement (WAR) concept. We can estimate the "points above replacement" contributed by analytics-influenced decisions:
$$\text{PAR}_{\text{analytics}} = \sum_{d \in D} P(\text{decision } d \text{ influenced by analytics}) \times \Delta \text{Points}(d)$$
Where $D$ is the set of decisions made, and $\Delta \text{Points}(d)$ is the estimated point impact of decision $d$ relative to the replacement-level alternative.
For example, suppose analytics identified a signing who contributed an estimated 4 additional points over the season compared to the alternative signing the club would have made, and we judge there was an 80% probability that the decision would not have happened without the analytics recommendation. Then:
$$\text{PAR}_{\text{signing}} = 0.8 \times 4 = 3.2 \text{ points}$$
Aggregated across recruitment, tactical, and set-piece decisions, this provides a holistic (if imperfect) estimate of analytics' contribution to sporting performance.
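The aggregation might be sketched as follows. Apart from the worked signing example above, the decision list and probabilities are hypothetical:

```python
def points_above_replacement(decisions) -> float:
    """PAR = sum over decisions of P(influenced by analytics) * delta points."""
    return sum(p_influence * delta_points for p_influence, delta_points in decisions)

# Hypothetical season of decisions: (P(analytics-driven), estimated point impact).
# The first entry is the worked signing example: 0.8 * 4 = 3.2 points.
decisions = [
    (0.8, 4.0),  # analytics-driven signing
    (0.6, 1.5),  # redesigned set-piece routines
    (0.4, 1.0),  # in-season tactical adjustments
]
print(round(points_above_replacement(decisions), 2))  # 4.5
```

The individual probabilities and point impacts are themselves estimates, so the aggregate should be presented as a rough order of magnitude, not a precise figure.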
28.6.5 Long-Term Value Creation
Beyond season-by-season measurement, analytics departments create long-term value through:
- Institutional knowledge: Building a data-driven decision-making culture that persists beyond individual staff
- Asset appreciation: Better player development and recruitment leading to increased squad market value
- Risk reduction: More informed decisions reducing the variance of outcomes
- Competitive moat: Proprietary models and data assets that are difficult for competitors to replicate
The total value of an analytics department can thus be expressed as:
$$V_{\text{analytics}} = \sum_{t=0}^{T} \frac{\text{Net Benefit}_t}{(1 + r)^t}$$
Where $\text{Net Benefit}_t$ is the total benefit minus cost in year $t$, $r$ is the discount rate, and $T$ is the time horizon. This net present value (NPV) formulation captures the compounding benefits of sustained analytics investment.
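A sketch of the NPV calculation, where the five-year benefit projection and the 8% discount rate are hypothetical:

```python
def analytics_npv(net_benefits, discount_rate: float) -> float:
    """NPV = sum_t net_benefit_t / (1 + r)^t, with t starting at year 0."""
    return sum(b / (1 + discount_rate) ** t for t, b in enumerate(net_benefits))

# Hypothetical five-year projection in $m: heavy year-0 investment,
# benefits compounding as models and trust mature; 8% discount rate
net_benefits = [-1.0, 0.2, 0.6, 1.0, 1.4]
print(round(analytics_npv(net_benefits, 0.08), 3))  # 1.522
```

Note that the projection turns positive only because later years' benefits outweigh the discounted up-front cost, which is exactly the "compounding benefits of sustained investment" argument in prose form.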
28.7 Case Studies from Top Clubs
28.7.1 FC Midtjylland: The Pioneer
FC Midtjylland (FCM) is widely regarded as the first European football club to fully embrace a data-driven approach to football management. Under the ownership of Matthew Benham and the chairmanship of Rasmus Ankersen, FCM built an analytics-first culture that delivered sustained success in the Danish Superliga and European competition.
Key Characteristics:
- Analytics integrated at every level of decision-making
- Set-piece specialization driven by statistical analysis
- Recruitment model focused on undervalued players with high statistical profiles
- Willingness to challenge conventional football wisdom with data
Organizational Approach:
- Small but influential analytics team (5-8 people at peak)
- Direct reporting to ownership, bypassing traditional football hierarchies
- Strong alignment between ownership vision and analytical methodology
- Culture of experimentation and tolerance for failure
Results:
- Multiple Danish Superliga titles
- Consistent overperformance relative to wage bill
- Successful player development and profitable transfer activity
- Established template for data-driven club management
Lessons Learned: FCM's experience demonstrates that analytics impact is not proportional to team size. A small, well-integrated team with direct access to decision-makers can be more effective than a large department that is organizationally isolated. The critical success factor was not the sophistication of the models but the willingness of the entire organization---from ownership to coaching staff---to take data-driven recommendations seriously.
For a detailed analysis of the FCM case, see case-study-01.md.
28.7.2 Liverpool FC: The Collaborative Model
Liverpool FC under the ownership of Fenway Sports Group (FSG) represents a model where analytics is deeply embedded within a collaborative decision-making structure. Rather than analytics dictating decisions, the club uses data as one input among several in a consensus-driven process.
Key Characteristics:
- Analytics as one pillar of a multi-disciplinary approach
- Strong relationships between analysts and coaching staff
- Research partnerships with academic institutions
- Integration of analytics with sports science and medical departments
Organizational Approach:
- Research department working alongside traditional football operations
- Direct involvement in transfer committee decisions
- Emphasis on throw-in coaching and set-piece optimization
- Player development informed by data-driven benchmarking
The Transfer Committee Model: Liverpool's transfer process involves a committee that includes the manager, sporting director, chief scout, and analytics lead. Each member brings a different perspective: the manager articulates the tactical profile needed, the analytics team identifies candidates who match that profile statistically, the scouts assess candidates through video and live observation, and the sporting director evaluates financial and contractual feasibility. The committee structure ensures that no single perspective dominates, and decisions are the product of informed consensus rather than individual judgment.
Impact on Recruitment: Liverpool's analytics-informed recruitment has been widely credited with identifying several players who were undervalued by the market but whose statistical profiles indicated elite potential. The club's ability to identify talent in less scouted markets (Egypt, Guinea, the Swiss lower divisions) was enabled by data-driven screening that flagged players traditional scouting networks might have overlooked.
Intuition: Liverpool's model illustrates a key principle: analytics is most effective when it is integrated into a collaborative decision-making process rather than operating as a separate, authoritative voice. The transfer committee model distributes risk---no single person's judgment can lead to a catastrophic signing---while ensuring that data is always part of the conversation. This approach requires humility from the analytics team (accepting that data is one input, not the final answer) and openness from the coaching staff (accepting that data can reveal things they cannot see).
28.7.3 Brentford FC: Punching Above Weight
Brentford FC, also owned by Matthew Benham, applied a data-driven philosophy similar to FCM's, but in the far more competitive English football pyramid. Their approach demonstrates how analytics can provide a competitive edge even in resource-constrained environments.
Key Achievements:
- Promotion to the Premier League after 74 years away
- Consistent overperformance relative to budget
- Highly profitable transfer model (buy low, sell high)
- Innovative approach to goalkeeper recruitment and set-piece design
Analytics Philosophy:
- "Moneyball" approach: identifying market inefficiencies
- Statistical models for player valuation and recruitment
- Willingness to sell high-performing players at peak value
- Reinvestment of transfer profits into analytics and recruitment infrastructure
The Brentford Model in Detail: Brentford's approach is distinguished by its willingness to make contrarian decisions based on data. The club has repeatedly signed players from lower divisions or less fashionable leagues whose statistical profiles suggested they could perform at a higher level, then sold those players at significant profit once their performance validated the data. This "buy, develop, sell" model generates transfer profits that fund both the playing budget and the analytics infrastructure, creating a virtuous cycle.
The club's analytics department played a particularly notable role in goalkeeper recruitment, identifying goalkeepers whose underlying metrics (save percentage relative to shot quality, distribution accuracy, sweeping range) were superior to their transfer valuations. This approach led to several highly successful goalkeeping signings at a fraction of the cost of more traditionally scouted alternatives.
28.7.4 Brighton & Hove Albion: The Organic Growth Model
Brighton's analytics journey is notable for its gradualism. Rather than a top-down mandate to "become data-driven," analytics capabilities grew organically alongside the club's competitive ambitions.
Key Characteristics:
- Analytics capability scaled in step with competitive level
- Strong integration between analytics and coaching from the outset
- Investment in proprietary data collection and analysis tools
- Focus on expected goals and possession value models
Organizational Approach:
- Analytics staff embedded within the football operations structure
- Close working relationship between analysts and coaching staff at all levels (first team, U23s, academy)
- Significant investment in data engineering to build robust internal infrastructure
- Culture of evidence-based decision-making extending beyond transfers to tactical preparation and player development
Results:
- Sustained Premier League presence after promotion
- Recruitment of high-impact players from less scouted markets
- Development of a recognizable, analytically-informed playing style
- European qualification achieved by a club with a fraction of the budget of traditional competitors
28.7.5 RB Leipzig: The Network Model
RB Leipzig operates within the Red Bull football network, which includes clubs in Austria (Red Bull Salzburg), Brazil (Red Bull Bragantino), the United States (New York Red Bulls), and elsewhere. This multi-club structure creates unique analytics opportunities and challenges.
Key Characteristics:
- Centralized analytics methodology across the Red Bull network
- Standardized data infrastructure enabling cross-club comparison
- Player pathway analytics: identifying when a player is ready to move from a feeder club to a higher-level club
- Pressing metrics deeply integrated into the Red Bull playing philosophy
Organizational Approach:
- Central analytics team at the group level, with local analysts at each club
- Shared data platform enabling player comparison across the network
- Regular exchange of analysts between clubs for knowledge sharing
- Strong alignment between analytics methodology and the high-pressing tactical philosophy that defines all Red Bull clubs
The Player Pathway Model: One of the most distinctive aspects of the Red Bull analytics operation is the player pathway model. Players are recruited into the network at the entry-level clubs (typically Salzburg or Bragantino), developed using standardized methods, and promoted to Leipzig when their performance data suggests they are ready for a higher competitive level. The analytics team maintains predictive models that estimate when a player's development trajectory will plateau at the current level and when a move to a more demanding environment would be beneficial. This systematic approach to player development and promotion has produced several high-value transfers and established the Red Bull network as one of the most effective player development systems in global football.
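A readiness model of this kind can be sketched very simply. The snippet below is an illustrative toy, not the Red Bull network's actual method: it flags a player as a promotion candidate when recent per-match ratings are high but the trend has flattened, a crude plateau signal. The function name, rating scale, window size, and thresholds are all hypothetical.

```python
def is_ready_to_move(ratings, window=6, slope_threshold=0.05):
    """Toy plateau detector. ratings: chronological per-match
    performance ratings (hypothetical 1-10 scale). Returns True when
    the player is performing well (mean above 7.0) but no longer
    improving (trend slope below slope_threshold)."""
    recent = ratings[-window:]
    n = len(recent)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(recent) / n
    # Ordinary least-squares slope of rating vs. match index
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, recent)) \
            / sum((x - mean_x) ** 2 for x in xs)
    return slope < slope_threshold and mean_y > 7.0

# A player whose ratings have climbed and then levelled off
print(is_ready_to_move([6.2, 6.5, 6.9, 7.3, 7.4, 7.5, 7.5, 7.4, 7.5]))
```

A production model would of course use richer inputs (minutes, opposition quality, physical data) and a proper growth-curve fit, but the structure — "performing well, no longer improving here" — is the core idea.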
Advanced: Multi-club ownership models create fascinating analytical challenges. How do you compare player performance across leagues of different quality? If a striker scores 20 goals in the Austrian Bundesliga, what is the expected translation to the German Bundesliga? These "league translation" models must account for differences in defensive quality, tactical style, physical intensity, and refereeing standards. The Red Bull network has an advantage here because their data spans multiple leagues with standardized collection methodologies, providing a rich dataset for calibrating translation models. Clubs outside such networks must rely on provider data, which may have inconsistencies in collection methodology across leagues.
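The simplest form of a league translation model is a single multiplicative adjustment. The sketch below assumes hypothetical league strength coefficients; real models condition on far more (defensive quality, tactical style, physical intensity, minutes played), and the coefficient values here are invented for illustration:

```python
def translate_goal_rate(goals_per90, from_coeff, to_coeff):
    """Scale a per-90 goal rate by the ratio of league strength
    coefficients (higher coefficient = stronger league). A purely
    illustrative first-order adjustment."""
    return goals_per90 * from_coeff / to_coeff

# Hypothetical coefficients, normalized so the target league = 1.0
AUT_BUNDESLIGA = 0.72
GER_BUNDESLIGA = 1.00

# A striker scoring 0.65 goals per 90 in Austria projects to a lower
# rate in the stronger German league:
projected = translate_goal_rate(0.65, AUT_BUNDESLIGA, GER_BUNDESLIGA)
print(f"{projected:.2f} goals per 90")
```

Networks like Red Bull's can calibrate such coefficients directly, because they observe the same players before and after moves recorded under identical data collection methodologies.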
28.7.6 Lessons Across Case Studies
Comparing these case studies reveals several common success factors:
- Ownership commitment: In every successful case, the ownership group actively champions analytics
- Patience: Building an analytics culture takes years, not months
- Integration: Analytics works best when embedded in existing processes, not bolted on
- Pragmatism: Successful departments deliver practical value, not theoretical perfection
- Adaptability: No two clubs implement analytics identically; context matters
- Trust: The human relationships between analysts and football staff are as important as the models themselves
The mathematical relationship between these factors and outcomes can be conceptualized as:
$$P(\text{success}) = 1 - \prod_{i=1}^{n} (1 - p_i)^{w_i}$$
Where $p_i$ is the probability of getting factor $i$ right and $w_i$ is the weight of that factor. This formulation captures the idea that all factors contribute to success, but no single factor is sufficient on its own --- it is the combination that matters.
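The formulation can be made concrete with a short sketch. The probabilities and weights below are purely illustrative, chosen only to show how the combination of factors drives the overall probability:

```python
from math import prod

def success_probability(factors):
    """P(success) = 1 - prod((1 - p_i) ** w_i) over (p_i, w_i) pairs:
    each factor independently contributes to success in proportion
    to its weight."""
    return 1 - prod((1 - p) ** w for p, w in factors)

# Illustrative (hypothetical) probabilities and weights for the six
# success factors listed above
factors = [
    (0.6, 1.5),  # ownership commitment
    (0.5, 1.0),  # patience
    (0.5, 1.2),  # integration
    (0.7, 1.0),  # pragmatism
    (0.6, 0.8),  # adaptability
    (0.5, 1.5),  # trust
]
print(f"P(success) = {success_probability(factors):.3f}")
```

Note how any single factor with a modest $p_i$ leaves substantial failure probability on its own, while several moderate factors together push the combined probability much higher.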
Common Failure Patterns: Just as there are common success factors, there are common failure patterns in analytics department building:
- The Technology-First Trap: Investing heavily in infrastructure and tools before understanding what the stakeholders actually need. The result is an impressive technology stack with no users.
- The Ivory Tower: Building a brilliant analytics team that produces sophisticated research but has no mechanism for translating insights into decisions. The team publishes internal papers that nobody reads.
- The Regime Change: An analytics department that depends entirely on the patronage of a single manager or sporting director. When that person leaves, the department is dismantled or marginalized.
- The Credibility Gap: Deploying models that make high-profile recommendations that turn out to be wrong, without adequately communicating the uncertainty involved. One bad call can undo years of trust-building.
- The Scope Creep: An analytics department that tries to do everything for everyone, spreading itself too thin and delivering mediocre work across many areas rather than excellent work in a few.
Best Practice: When establishing an analytics department, explicitly define what you will NOT do, at least initially. Saying "we will focus on first-team match analysis and recruitment in year one, and will not attempt medical analytics, commercial analytics, or academy integration until year three" sets clear expectations with stakeholders and prevents the scope creep that dilutes impact.
28.8 Building a Data Culture
28.8.1 What Is a Data Culture?
A data culture exists when data-informed decision-making is the organizational norm rather than the exception. In a club with a strong data culture:
- Coaches routinely ask "what does the data say?" before making tactical decisions
- Scouts combine video assessment with statistical profiles as standard practice
- The medical team monitors player load data as part of routine injury prevention
- The board evaluates transfer proposals against market value models
- Post-decision reviews include an assessment of what the data predicted versus what actually happened
Building this culture is not primarily a technology challenge---it is a human challenge. It requires changing habits, mindsets, and power dynamics that may have been in place for decades.
28.8.2 The Role of Leadership
Data culture starts at the top. If the club's CEO, chairman, or ownership group does not genuinely value data-informed decision-making, no amount of analytical talent or technology investment will create a data culture. Leadership's role is to:
- Model behavior: Actively seek data before making decisions, and be transparent about how data influenced the decision.
- Allocate resources: Fund the analytics department at a level commensurate with its strategic importance.
- Protect the function: Shield the analytics department from political pressures that might compromise its objectivity.
- Celebrate successes: Publicly recognize instances where data-informed decisions led to positive outcomes.
- Accept failures: Acknowledge that data-informed decisions will sometimes be wrong, without blaming the data or the analysts.
28.8.3 Education and Literacy
Building a data culture requires raising the data literacy of the entire organization, not just the analytics department. This means investing in education at multiple levels:
- Executive education: Board members and senior leaders need to understand basic statistical concepts (probability, uncertainty, sample size) well enough to evaluate analytical claims critically.
- Coaching staff development: Coaches need to understand what analytics can and cannot do, and how to integrate data into their existing workflows.
- Scout training: Scouts need to understand how statistical models complement their observational expertise, and how to use analytical tools in their scouting process.
- Player education: Players increasingly receive individual performance data. Helping them understand and use this data productively---rather than being confused or demoralized by it---is an important but often neglected aspect of data culture.
Real-World Application: One Premier League club runs a monthly "Data Literacy" workshop for all football operations staff, from coaches to scouts to medical personnel. The workshops cover practical topics: how to read a radar chart, what xG means and does not mean, how to interpret confidence intervals, and how to distinguish between correlation and causation. Attendance is voluntary but encouraged, and the workshops have been credited with significantly improving the quality of conversations between analytics and non-analytics staff. Coaches who previously dismissed analytics now ask informed questions and engage critically with the data---not accepting it blindly, but not dismissing it either.
Summary
Building an analytics department in professional football is a multifaceted challenge that extends far beyond hiring data scientists and purchasing data feeds. It requires:
- Strategic organizational design that positions analytics to influence decisions
- Thoughtful hiring that balances technical expertise with football knowledge and communication skills
- Robust technology infrastructure that enables efficient data processing and delivery
- Well-designed workflows that integrate analytics into the rhythms of professional football
- Skilled stakeholder management that builds trust and overcomes resistance
- Rigorous impact measurement that justifies investment and drives improvement
- Deliberate culture building that embeds data-informed thinking across the organization
- Learning from others while adapting best practices to local context
The clubs that have succeeded in building effective analytics departments share common traits: committed ownership, patient investment, pragmatic focus on value delivery, and --- above all --- recognition that analytics is not a technology project but an organizational transformation.
As the field continues to mature, the gap between analytics leaders and laggards will widen. Clubs that invest thoughtfully in building their analytics capabilities today are investing in competitive advantage that will compound for years to come.
Best Practice: If you are just beginning to build an analytics function, remember:
1. Start small, deliver value quickly, and expand based on demonstrated impact
2. Hire people who can communicate as well as they can code
3. Invest in relationships before investing in technology
4. Be patient --- cultural change takes time
5. Measure your impact, even imperfectly, from day one
6. Define your scope explicitly---what you will and will not do
7. Build for organizational resilience, not just current-regime support
8. Document everything, because institutional memory is fragile
The journey from ad-hoc analysis to a mature analytics department is long, but every successful analytics operation started with a single analyst and a willingness to try.
Chapter References
- Anderson, C., & Sally, D. (2013). The Numbers Game: Why Everything You Know About Football Is Wrong. Penguin Books.
- Ankersen, R. (2012). The Gold Mine Effect: Crack the Secrets of High Performance. Icon Books.
- Biermann, C. (2019). Football Hackers: The Science and Art of a Data Revolution. Blink Publishing.
- Davenport, T. H., & Harris, J. G. (2007). Competing on Analytics: The New Science of Winning. Harvard Business Press.
- Kotter, J. P. (2012). Leading Change. Harvard Business Review Press.
- Kuper, S., & Szymanski, S. (2018). Soccernomics (5th ed.). Nation Books.
- Laursen, G. H. N., & Thorlund, J. (2010). Business Analytics for Managers. Wiley.
- Lewis, M. (2003). Moneyball: The Art of Winning an Unfair Game. W. W. Norton.
- Pena, J. L., & Touchette, H. (2012). "A Network Theory Analysis of Football Strategies." arXiv:1206.6904.
- Provost, F., & Fawcett, T. (2013). Data Science for Business. O'Reilly Media.
- Rein, R., & Memmert, D. (2016). "Big Data and Tactical Analysis in Elite Soccer." SpringerPlus, 5(1), 1410.
- Tippett, J. (2019). The Expected Goals Philosophy. Independently published.