Case Study 2: Scaling Analytics at City Football Group
Background
The City Football Group (CFG) represents the most ambitious experiment in multi-club ownership in football history. Founded around Manchester City FC following the 2008 acquisition by the Abu Dhabi United Group, CFG has grown into a global network of football clubs spanning multiple continents. As of the mid-2020s, the group includes majority or minority stakes in clubs across England (Manchester City), the United States (New York City FC), Australia (Melbourne City FC), Japan (Yokohama F. Marinos), Spain (Girona FC), France (Troyes AC), Belgium (Lommel SK), India (Mumbai City FC), China (Sichuan Jiuniu FC), and others.
What makes CFG particularly relevant for the study of analytics departments is not just the scale of investment, but the deliberate, systematic approach to building analytics capabilities that serve an interconnected network of clubs. CFG has attempted to answer a question few other organizations in football have faced at this scale: how do you build analytics across multiple clubs, leagues, cultures, and competitive contexts?
The Challenge
CFG's analytics ambitions faced unique challenges beyond those of a single club:
- Scale and complexity: Serving 10+ clubs across different time zones, languages, and football cultures requires systems and processes that transcend local context.
- Heterogeneous data environments: Different leagues have different data providers, different levels of data availability, and different data formats. A player in the Belgian second division and one in the Premier League may have vastly different data footprints.
- Competing priorities: Each club has its own competitive objectives, sometimes conflicting with group-level strategies (e.g., a player loan that benefits the group but weakens the lending club's squad).
- Talent retention: Analysts capable of operating at this level are in high demand across the sports and technology industries.
- Integration challenge: Analytics must serve diverse stakeholders, from Manchester City's world-class coaching staff to development clubs with more modest expectations.
Organizational Design
The Hub-and-Spoke Model
CFG adopted a hub-and-spoke organizational model for analytics:
CFG Central Analytics Hub (Manchester / New York)
|
+--- Manchester City FC Analytics Team
|    +--- First Team Analytics
|    +--- Academy Analytics
|    +--- Recruitment Analytics
|
+--- Group Recruitment & Intelligence
|    +--- Global Scouting Analytics
|    +--- Player Valuation Models
|    +--- Loan Strategy Analytics
|
+--- Technology & Data Engineering
|    +--- Data Platform Team
|    +--- Software Development
|    +--- Research & Development
|
+--- Satellite Analysts (at each club)
     +--- NYC FC Local Analyst(s)
     +--- Melbourne City Local Analyst(s)
     +--- Yokohama F. Marinos Local Analyst(s)
     +--- [Other clubs...]
Central hub responsibilities:
- Maintaining the group-wide data platform
- Developing and maintaining core models (player valuation, recruitment screening, performance benchmarking)
- Setting methodological standards
- Conducting research and development
- Supporting Manchester City's first-team analytics
- Training and upskilling satellite analysts
Satellite analyst responsibilities:
- Delivering local analytics services to their club's coaching and recruitment staff
- Providing local context for centrally developed models
- Feeding local data and intelligence back to the hub
- Adapting group-wide tools to local needs
Staffing and Scale
At its peak, CFG's analytics operation employed roughly 25 to 40 or more people across all locations, making it one of the largest analytics functions in world football. The approximate breakdown is:
| Function | Headcount | Location |
|---|---|---|
| Manchester City first team | 6-8 | Manchester |
| Group recruitment analytics | 5-8 | Manchester / globally |
| Technology and data engineering | 6-10 | Manchester |
| Research and development | 3-5 | Manchester |
| Satellite analysts (all clubs) | 8-12 | Various global locations |
This scale requires significant investment. While exact figures are not public, industry estimates suggest CFG's total analytics expenditure (including personnel, data, technology, and infrastructure) runs into several million dollars annually.
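As a quick consistency check on the headcount table above, the low and high ends of each row can be summed. This is only arithmetic on the approximate ranges already given; the dictionary and function names are illustrative.

```python
# Headcount ranges (low, high) copied from the table above.
HEADCOUNT = {
    "Manchester City first team": (6, 8),
    "Group recruitment analytics": (5, 8),
    "Technology and data engineering": (6, 10),
    "Research and development": (3, 5),
    "Satellite analysts (all clubs)": (8, 12),
}


def total_range(ranges):
    """Sum the low ends and the high ends of each (low, high) range."""
    low = sum(lo for lo, _ in ranges.values())
    high = sum(hi for _, hi in ranges.values())
    return low, high


low, high = total_range(HEADCOUNT)
print(f"Implied total headcount: {low}-{high}")
```

The rows imply a total of 28 to 43 people, which is broadly in line with the approximate overall figure quoted above.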
Technology Infrastructure
The Group Data Platform
CFG's most significant technological achievement is its proprietary group-wide data platform. This system serves as the single source of truth for football data across all clubs and includes:
Data Ingestion Layer:
- Automated feeds from multiple event data providers (adapted per league)
- Tracking data integration where available
- GPS and physical performance data from each club's systems
- Video feeds and tagging data
- Internal scouting reports and evaluations
- Financial and contract data
Data Warehouse:
- Centralized storage with club-specific access controls
- Standardized data models that enable cross-club comparison
- Historical data spanning multiple seasons and leagues
- Player tracking across the group (monitoring loaned and former players)
Analytics Layer:
- Proprietary player valuation models
- Recruitment screening tools with customizable filters
- Performance benchmarking dashboards
- Tactical analysis tools integrated with video
- Injury risk and workload monitoring
Delivery Layer:
- Web-based dashboards accessible from any location
- Mobile interfaces for coaching staff
- Automated reporting and alerting
- API layer for custom analysis
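To make the idea of a standardized data model concrete, here is a minimal sketch of an ingestion step that maps provider-specific event records onto one shared schema. CFG's actual platform is proprietary and not publicly documented, so the provider names, field names, and `StandardEvent` schema below are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StandardEvent:
    """Shared event schema used for every league (hypothetical)."""
    player_id: str
    league: str
    event_type: str
    minute: int


# Hypothetical field mappings for two made-up data providers.
FIELD_MAPS = {
    "provider_a": {"player_id": "playerId", "event_type": "type", "minute": "min"},
    "provider_b": {"player_id": "pid", "event_type": "action", "minute": "match_minute"},
}


def normalize(raw: dict, provider: str, league: str) -> StandardEvent:
    """Translate one raw provider record into the shared schema."""
    fmap = FIELD_MAPS[provider]
    return StandardEvent(
        player_id=str(raw[fmap["player_id"]]),
        league=league,
        event_type=str(raw[fmap["event_type"]]).lower(),
        minute=int(raw[fmap["minute"]]),
    )


event = normalize({"pid": "x9", "action": "Shot", "match_minute": 57},
                  provider="provider_b", league="A-League")
print(event)
```

Once every feed passes through a step like this, downstream models can treat a Belgian second-division event and a Premier League event as rows of the same table, even though the source formats differ.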
Standardization vs. Localization
A critical design tension in CFG's technology is the balance between standardization (which enables scale and cross-club comparison) and localization (which ensures relevance to each club's specific context).
CFG addresses this through a "core + extension" architecture:
- Core modules are standardized across all clubs (data models, player valuation framework, recruitment screening)
- Extension modules are customizable per club (tactical dashboards adapted to local playing style, league-specific benchmarks)
This approach allows a recruitment analyst in Manchester to compare a player in the Belgian league with one in the Australian league using the same underlying methodology, while local analysts can present information to their coaching staff in contextually appropriate formats.
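One way to picture a "core + extension" split is a shared metric function plus a per-club registry of presentation functions. The sketch below is an assumption about how such a design could look, not a description of CFG's code; the registry, the `present` helper, and the benchmark value are all invented for illustration.

```python
from typing import Callable, Dict


def core_per90(total: float, minutes: float) -> float:
    """Core module: a per-90 rate computed identically for every club."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return total * 90.0 / minutes


# Extension registry: club name -> function that formats a core metric.
EXTENSIONS: Dict[str, Callable[[float], str]] = {}


def register_extension(club: str):
    """Decorator that registers a club-specific presentation function."""
    def decorator(fn: Callable[[float], str]) -> Callable[[float], str]:
        EXTENSIONS[club] = fn
        return fn
    return decorator


@register_extension("Melbourne City FC")
def melbourne_view(value: float) -> str:
    benchmark = 1.5  # hypothetical league-specific benchmark
    return f"{value:.2f} per 90 ({value - benchmark:+.2f} vs league benchmark)"


def present(club: str, total: float, minutes: float) -> str:
    """Compute the core metric, then apply the club's extension if one exists."""
    value = core_per90(total, minutes)
    formatter = EXTENSIONS.get(club, lambda v: f"{v:.2f} per 90")
    return formatter(value)


print(present("Melbourne City FC", 40, 1800))
print(present("Lommel SK", 40, 1800))
```

The number is computed once by the core function, so cross-club comparisons stay valid, while each club can change only how the result is framed for its own staff.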
Key Use Cases
1. Network Recruitment
CFG's most visible analytics application is network-wide recruitment: using data to identify talented players who can serve multiple clubs within the group.
The process:
1. Central analytics team maintains a global player database with standardized metrics
2. Statistical screening identifies players matching defined profiles (age, position, performance metrics, contract situation)
3. Shortlists are generated for each club based on their specific needs and competitive context
4. Local scouts and analysts conduct deeper evaluation (video review, live scouting)
5. Recommendations are presented to the relevant club's sporting director
6. Transfer committee makes the final decision
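Steps 2 and 3 of the process above (statistical screening and shortlist generation) can be sketched as a simple filter-and-rank over a player table. The field names, thresholds, and single performance metric are simplifying assumptions; a real screening tool would use many metrics and league adjustments.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Player:
    name: str
    age: int
    position: str
    metric_per90: float        # one standardized performance metric
    contract_months_left: int


@dataclass
class Profile:
    """A club's target profile (all thresholds hypothetical)."""
    position: str
    max_age: int
    min_metric_per90: float
    max_contract_months_left: int


def screen(players: List[Player], profile: Profile, top_n: int = 10) -> List[Player]:
    """Keep players matching the profile, ranked by the metric (best first)."""
    matches = [
        p for p in players
        if p.position == profile.position
        and p.age <= profile.max_age
        and p.metric_per90 >= profile.min_metric_per90
        and p.contract_months_left <= profile.max_contract_months_left
    ]
    return sorted(matches, key=lambda p: p.metric_per90, reverse=True)[:top_n]


candidates = [
    Player("Player A", 20, "CM", 2.1, 12),
    Player("Player B", 27, "CM", 3.0, 12),   # too old for this profile
    Player("Player C", 21, "CM", 2.6, 10),
]
shortlist = screen(candidates, Profile("CM", 23, 2.0, 18))
print([p.name for p in shortlist])
```

The output of a screen like this is only a shortlist; steps 4 to 6 (video review, live scouting, and committee decision) remain human judgments.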
Innovation: CFG's player valuation models incorporate "group utility": the estimated value a player provides not just to the acquiring club, but to the broader group. A young player signed by Lommel SK who develops successfully might be transferred internally to Girona FC and eventually to Manchester City, creating value at each step.
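The source does not say how "group utility" is computed. One plausible reading, shown here purely as an assumption, is an expected value along a pathway: the value created at each club, weighted by the probability that the player actually reaches that stage. All probabilities and values below are made up.

```python
from typing import List, Tuple

# Each stage: (club, probability of reaching it given the previous stage,
#              value created at that club, in arbitrary units).
Stage = Tuple[str, float, float]


def group_utility(pathway: List[Stage]) -> float:
    """Expected value summed over every stage of a development pathway."""
    total = 0.0
    reach_probability = 1.0
    for _club, p_step, value in pathway:
        reach_probability *= p_step   # chance the player gets this far
        total += reach_probability * value
    return total


pathway = [
    ("Lommel SK", 1.0, 1.0),         # the signing club: reached with certainty
    ("Girona FC", 0.5, 4.0),         # hypothetical 50% chance of stepping up
    ("Manchester City", 0.2, 20.0),  # hypothetical 20% chance from there
]
print(f"Value to acquiring club only: {pathway[0][2]:.1f}")
print(f"Group utility: {group_utility(pathway):.1f}")
```

Under these invented numbers the player is worth 1.0 to the signing club alone but 5.0 to the group, which illustrates why a group-level valuation can justify signings that a single-club valuation would reject.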
2. Loan Strategy Optimization
With a large academy and multiple clubs, CFG uses analytics to optimize loan placements:
$$\text{Loan Value}_{p,c} = w_1 \cdot \text{Playing Time}_{p,c} + w_2 \cdot \text{Development}_{p,c} + w_3 \cdot \text{Competitive Level}_{c} - w_4 \cdot \text{Risk}_{p,c}$$
Where $p$ represents the player, $c$ represents the candidate club, and the weights $w_1, \dots, w_4$ set the relative importance of each term. The model estimates which loan destination will maximize each player's development, considering factors like expected playing time, development environment (for example, coaching quality), competitive level, and injury risk.
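The loan value formula above translates directly into code. In this sketch each input is assumed to be pre-scaled to the range 0 to 1, and the weights and candidate-club scores are invented for illustration; the source gives no actual values.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class LoanOption:
    club: str
    playing_time: float       # expected share of minutes, 0-1
    development: float        # development environment score, 0-1
    competitive_level: float  # league/club level score, 0-1
    risk: float               # injury and other risk score, 0-1


# Hypothetical weights (w1, w2, w3, w4).
WEIGHTS: Tuple[float, float, float, float] = (0.4, 0.3, 0.2, 0.1)


def loan_value(option: LoanOption, weights=WEIGHTS) -> float:
    """Weighted sum of the three benefit terms minus the weighted risk term."""
    w1, w2, w3, w4 = weights
    return (w1 * option.playing_time
            + w2 * option.development
            + w3 * option.competitive_level
            - w4 * option.risk)


def best_loan(options: List[LoanOption], weights=WEIGHTS) -> LoanOption:
    """Return the candidate club with the highest loan value."""
    return max(options, key=lambda o: loan_value(o, weights))


options = [
    LoanOption("Club A", playing_time=0.9, development=0.6, competitive_level=0.4, risk=0.2),
    LoanOption("Club B", playing_time=0.5, development=0.8, competitive_level=0.7, risk=0.3),
]
for o in options:
    print(f"{o.club}: {loan_value(o):.2f}")
print("Best destination:", best_loan(options).club)
```

With these weights Club A scores 0.60 and Club B scores 0.55, so guaranteed minutes at a lower level beat a stronger league with less playing time. Changing the weights can reverse that ranking, which is why the choice of weights is itself a football decision rather than a purely statistical one.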
3. Cross-Club Knowledge Transfer
Analytics facilitates knowledge transfer across the group:
- Tactical innovations at one club can be evaluated statistically and shared with others
- Training methodologies can be benchmarked across clubs
- Medical and sports science best practices can be standardized
- Set-piece routines developed at one club can be adapted for others
4. Performance Benchmarking
The standardized data platform enables sophisticated benchmarking:
- How does a young midfielder at Melbourne City compare to Manchester City academy players at the same age?
- Which physical development trajectories predict senior success?
- How do league-specific playing styles affect player development?
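The first benchmarking question above (comparing a player against a reference group at the same age) can be sketched as an age-matched percentile rank. The reference data is made up, and the sketch deliberately ignores league-strength adjustment, which a real cross-league comparison would need.

```python
from typing import List, Tuple


def percentile_rank(value: float, cohort: List[float]) -> float:
    """Percentage of the cohort with a metric at or below `value`."""
    if not cohort:
        raise ValueError("cohort is empty")
    at_or_below = sum(1 for v in cohort if v <= value)
    return 100.0 * at_or_below / len(cohort)


def age_matched_percentile(value: float, age: int,
                           reference: List[Tuple[int, float]]) -> float:
    """Rank `value` against reference players of the same age only."""
    cohort = [metric for ref_age, metric in reference if ref_age == age]
    return percentile_rank(value, cohort)


# Hypothetical (age, metric) pairs for an academy reference group.
academy_reference = [(18, 1.0), (18, 1.5), (18, 2.0), (18, 2.5), (19, 3.0)]

pct = age_matched_percentile(2.0, age=18, reference=academy_reference)
print(f"Age-matched percentile: {pct:.0f}")
```

Restricting the cohort to the same age avoids penalizing a younger player for being compared with more physically mature peers.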
Challenges and Lessons
Challenge 1: Cultural Integration
Each club within CFG has its own culture, history, and expectations. Imposing a uniform analytics approach risks alienating local staff and fans. CFG learned that analytics must be adapted to local context:
- In Japan, the communication style and decision-making process differ significantly from Manchester
- Clubs with strong local identities (Girona, Yokohama) resist being perceived as mere "feeders"
- Coaching staff at different clubs have varying levels of analytical literacy
Lesson: Standardize the methodology, localize the delivery.
Challenge 2: Data Availability Gaps
Not all leagues where CFG operates have the same level of data coverage:
- Premier League: Full event data, tracking data, extensive video
- A-League (Australia): Good event data, limited tracking
- Belgian second division: Basic event data, minimal tracking
- Indian Super League: Variable coverage
Lesson: Build models that degrade gracefully with less data, and supplement statistical analysis with increased video and live scouting in data-sparse environments.
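One common way to make an estimate "degrade gracefully" is shrinkage: blend a player's observed rate with a league-wide prior, weighting the observation by how much data exists. This is a generic statistical technique offered as an illustration, not CFG's documented method; the prior strength and the scouting threshold are assumptions.

```python
def shrunk_estimate(observed_mean: float, n_obs: int,
                    prior_mean: float, prior_strength: float = 10.0) -> float:
    """Blend observation and prior; with little data the prior dominates."""
    if n_obs < 0:
        raise ValueError("n_obs must be non-negative")
    return (n_obs * observed_mean + prior_strength * prior_mean) / (n_obs + prior_strength)


def needs_extra_scouting(n_obs: int, min_obs: int = 15) -> bool:
    """Flag players whose data footprint is too thin to trust the numbers alone."""
    return n_obs < min_obs


# Same observed rate (0.6) and prior (0.3), different amounts of data.
sparse = shrunk_estimate(0.6, n_obs=5, prior_mean=0.3)
rich = shrunk_estimate(0.6, n_obs=90, prior_mean=0.3)
print(f"Sparse data (5 matches):  {sparse:.2f}, extra scouting: {needs_extra_scouting(5)}")
print(f"Rich data (90 matches):   {rich:.2f}, extra scouting: {needs_extra_scouting(90)}")
```

With 5 observations the estimate is pulled to 0.40, while with 90 it stays near the observed 0.57. The flag mirrors the second half of the lesson: when data is thin, route the player to video and live scouting rather than relying on the model.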
Challenge 3: Competing Priorities
When Manchester City's recruitment needs conflict with a satellite club's competitive objectives (e.g., recalling a loan player mid-season), analytics must navigate political complexity.
Lesson: Establish clear governance frameworks for inter-club decisions, and ensure local clubs feel the benefits of the network, not just the costs.
Challenge 4: Talent Management
Analysts with experience at CFG are highly sought after by other clubs and organizations. The group faces a constant challenge of developing talent that may leave.
Lesson: Create career pathways within the group (e.g., an analyst at Melbourne City could progress to Girona, then to Manchester City). Invest in development, accept some turnover, and build institutional knowledge into systems rather than individuals.
Challenge 5: Measuring Group-Level Impact
Measuring the ROI of a multi-club analytics function is even more complex than at a single club:
- How do you attribute the value of a player signed by Lommel who later stars at Manchester City?
- What is the return on technology investments that serve all clubs?
- How do you value knowledge transfer that improves coaching at multiple clubs?
Lesson: Develop group-level metrics alongside club-level ones. Track the full lifecycle value of players across the network.
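A minimal sketch of tracking lifecycle value at both levels: a transfer ledger from which club-level and group-level net positions are derived. The fees are invented, the group membership set is truncated for brevity, and on-pitch value is left out entirely, so this only illustrates the accounting distinction.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Subset of group clubs, for illustration only.
GROUP = {"Lommel SK", "Girona FC", "Manchester City"}

# One player's transfer history: (selling club, buying club, fee). Fees are made up.
Transfer = Tuple[str, str, float]


def lifecycle_value(transfers: List[Transfer]) -> Tuple[Dict[str, float], float]:
    """Return (net transfer position per group club, net position of the group)."""
    club_net: Dict[str, float] = defaultdict(float)
    group_net = 0.0
    for seller, buyer, fee in transfers:
        if seller in GROUP:
            club_net[seller] += fee
        if buyer in GROUP:
            club_net[buyer] -= fee
        # Only money crossing the group boundary changes the group's position.
        if (seller in GROUP) != (buyer in GROUP):
            group_net += fee if seller in GROUP else -fee
    return dict(club_net), group_net


history = [
    ("External FC", "Lommel SK", 1.0),
    ("Lommel SK", "Girona FC", 3.0),
    ("Girona FC", "Manchester City", 10.0),
]
per_club, group = lifecycle_value(history)
print(per_club)
print("Group net:", group)
```

In this example Lommel and Girona each show a club-level profit, Manchester City shows a cost, and the group as a whole has spent only the original external fee. Internal fees cancel out at group level, which is why club-level and group-level metrics need to be tracked side by side.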
Quantitative Indicators
While comprehensive financial data is not public, several indicators suggest the scale of CFG's analytics impact:
| Metric | Evidence |
|---|---|
| Recruitment efficiency | Multiple players identified through data who progressed from satellite clubs to Manchester City or were sold at significant profit |
| Loan optimization | Increasing proportion of academy players receiving appropriate loan placements and returning improved |
| Technology scale | Proprietary platform serving 10+ clubs across 5+ continents |
| Staff development | Former CFG analysts now holding senior positions at other clubs, demonstrating the quality of the operation |
| Competitive success | Manchester City's sustained domestic and European success; satellite clubs achieving local competitive objectives |
The Multi-Club Analytics Playbook
Based on CFG's experience, a playbook for multi-club analytics can be distilled:
- Invest in the platform first: A shared data platform is the foundation of multi-club analytics. Without it, each club operates in isolation.
- Hire for adaptability: Analysts in a multi-club environment need to work across cultures, time zones, and competitive contexts. Cultural intelligence is as important as technical skill.
- Define governance clearly: Who decides when club interests conflict? How are shared resources allocated? These questions must have clear answers before they become urgent.
- Create network value: Each club must benefit from being part of the network. If satellite clubs feel exploited, the model collapses.
- Balance central control with local autonomy: Over-centralization kills local innovation; under-centralization prevents scale benefits. The optimal balance shifts over time.
- Measure holistically: Single-club metrics are insufficient. Develop group-level KPIs that capture network effects and lifecycle value.
Discussion Questions
- Is the multi-club analytics model inherently advantageous, or does it introduce complexity that offsets the benefits of scale? Under what conditions does each outcome prevail?
- How should CFG handle the ethical tension between optimizing for group outcomes and honoring each club's independent sporting objectives?
- As more ownership groups adopt multi-club models, will the analytics advantage CFG currently enjoys be eroded? How can they maintain differentiation?
- What are the data privacy implications of sharing player performance and medical data across a multi-club group, particularly across different legal jurisdictions?
- If you were designing a multi-club analytics function from scratch today, what would you do differently from CFG's approach?
Python Analysis
See code/case-study-code.py for a quantitative analysis of multi-club network value, including player pathway optimization and group-level impact measurement modeling.