Case Study 21.2: Scouting a Replacement — Finding the Next N'Golo Kanté
Background
N'Golo Kanté's career trajectory represents one of the most remarkable stories in modern football and a masterclass in data-informed recruitment. Leicester City signed Kanté from SM Caen, by then a Ligue 1 club, for a reported 5.6 million GBP in 2015. He helped Leicester win an improbable Premier League title, moved to Chelsea for a reported 32 million GBP the following summer, and won the league title again in his first season there.
This case study uses the hypothetical scenario of a mid-table Premier League club that has just lost its Kanté-like defensive midfielder to a bigger club. The challenge: use a data-driven approach to identify a replacement who can replicate the departed player's impact at a fraction of the cost.
Step 1: Define the Player Profile
What Made Kanté Exceptional?
Kanté's statistical profile during his peak seasons was extraordinary for a defensive midfielder:
| Metric (per 90) | Kanté (Peak) | Positional Average | Percentile |
|---|---|---|---|
| Tackles | 4.7 | 2.8 | 98th |
| Interceptions | 3.5 | 2.1 | 96th |
| Ball recoveries | 12.2 | 9.5 | 92nd |
| Pressures | 30.3 | 22.1 | 95th |
| Pressure success rate | 32.1% | 28.8% | 90th |
| Progressive passes | 7.8 | 4.2 | 78th |
| Progressive carries | 3.2 | 2.1 | 82nd |
| Touches in opp. box | 1.8 | 1.2 | 71st |
| Distance covered (km) | 14.1 | 12.8 | 94th |
| Sprints per 90 | 48 | 35 | 96th |
Kanté's defining characteristic was the combination of elite defensive output with significant ball-progressing ability and extraordinary physical capacity. He was not merely a "ball winner" -- he was a complete midfielder who recovered possession and then advanced it effectively.
Translating the Profile into Search Criteria
We define a "Kanté-type midfielder" profile with the following search parameters:
```python
kante_profile = {
    "position": ["DM", "CM", "CDM"],
    "age_range": (21, 27),
    "min_minutes": 1200,
    "metrics": {
        "tackles_per90": {"min": 3.5, "weight": 0.18, "percentile_target": 85},
        "interceptions_per90": {"min": 2.5, "weight": 0.15, "percentile_target": 85},
        "ball_recoveries_per90": {"min": 10.0, "weight": 0.12, "percentile_target": 80},
        "pressures_per90": {"min": 24.0, "weight": 0.15, "percentile_target": 85},
        "pressure_success_pct": {"min": 30.0, "weight": 0.10, "percentile_target": 80},
        "progressive_passes_per90": {"min": 4.0, "weight": 0.12, "percentile_target": 70},
        "progressive_carries_per90": {"min": 2.0, "weight": 0.10, "percentile_target": 70},
        "distance_covered_km": {"min": 13.0, "weight": 0.08, "percentile_target": 80},
    },
}
```
Key Decision: Weighting
The weights reflect our departed player's primary value: defensive actions (tackles, interceptions, pressures) receive higher weights than ball progression. However, we explicitly require above-average progressive ability to ensure we do not sign a purely destructive midfielder who cannot contribute in possession.
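As a quick sanity check on the weighting decision, the sketch below confirms that the eight weights sum to 1 and shows how much of the total each broad category receives. The grouping into "defensive", "progression", and "physical" is an assumption made here for illustration (in particular, counting ball recoveries and pressure success rate as defensive); it is not part of the profile definition.

```python
import math

# Weights copied from the kante_profile definition.
weights = {
    "tackles_per90": 0.18,
    "interceptions_per90": 0.15,
    "ball_recoveries_per90": 0.12,
    "pressures_per90": 0.15,
    "pressure_success_pct": 0.10,
    "progressive_passes_per90": 0.12,
    "progressive_carries_per90": 0.10,
    "distance_covered_km": 0.08,
}

# Hypothetical grouping of the metrics (an assumption for illustration only).
groups = {
    "defensive": [
        "tackles_per90",
        "interceptions_per90",
        "ball_recoveries_per90",
        "pressures_per90",
        "pressure_success_pct",
    ],
    "progression": ["progressive_passes_per90", "progressive_carries_per90"],
    "physical": ["distance_covered_km"],
}


def group_shares(weights, groups):
    """Sum the metric weights inside each group."""
    return {name: sum(weights[m] for m in metrics) for name, metrics in groups.items()}


total_weight = sum(weights.values())
shares = group_shares(weights, groups)

print(f"total weight: {total_weight:.2f}")
for name, share in shares.items():
    print(f"{name:<12} {share:.2f}")
```

Under this grouping, the defensive metrics carry most of the weight while progression still contributes a meaningful share, which matches the intent described above.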
Step 2: Data Screening
Initial Filter
Starting with a database of approximately 8,500 central midfielders across 25 leagues, we apply the initial filters:
Filter 1: Position = DM, CM, CDM → 8,500 players
Filter 2: Age 21-27 → 4,200 players
Filter 3: Minutes >= 1,200 → 2,800 players
Filter 4: League tier 1 or 2 (top two divisions) → 2,100 players
Statistical Thresholds
Applying the minimum metric thresholds from our profile:
Filter 5: Tackles/90 >= 3.5 → 620 players
Filter 6: Interceptions/90 >= 2.5 → 310 players
Filter 7: Ball recoveries/90 >= 10.0 → 185 players
Filter 8: Pressures/90 >= 24.0 → 92 players
Filter 9: Progressive passes/90 >= 4.0 → 48 players
Filter 10: Progressive carries/90 >= 2.0 → 31 players
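From 8,500 candidates, we have filtered down to 31 players who meet every threshold applied above. Note that the profile's minimums for pressure success rate and distance covered were not applied as hard filters here. These 31 players form our statistical shortlist.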
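The screening funnel can be sketched as an ordered list of predicates applied one after another, recording how many players survive each step. The thresholds below are the ones listed in the filters; the field names, the `apply_funnel` helper, and the toy records are hypothetical, and the sketch assumes a lower tier number means a higher division.

```python
def apply_funnel(players, filters):
    """Apply (label, predicate) filters in order and record survivors after each."""
    remaining = list(players)
    counts = []
    for label, keep in filters:
        remaining = [p for p in remaining if keep(p)]
        counts.append((label, len(remaining)))
    return remaining, counts


# Thresholds follow the filters in the text; field names are hypothetical.
FILTERS = [
    ("position", lambda p: p["position"] in {"DM", "CM", "CDM"}),
    ("age 21-27", lambda p: 21 <= p["age"] <= 27),
    ("minutes >= 1200", lambda p: p["minutes"] >= 1200),
    ("league tier 1-2", lambda p: p["league_tier"] <= 2),
    ("tackles/90 >= 3.5", lambda p: p["tackles_per90"] >= 3.5),
    ("interceptions/90 >= 2.5", lambda p: p["interceptions_per90"] >= 2.5),
    ("recoveries/90 >= 10.0", lambda p: p["ball_recoveries_per90"] >= 10.0),
    ("pressures/90 >= 24.0", lambda p: p["pressures_per90"] >= 24.0),
    ("prog. passes/90 >= 4.0", lambda p: p["progressive_passes_per90"] >= 4.0),
    ("prog. carries/90 >= 2.0", lambda p: p["progressive_carries_per90"] >= 2.0),
]

# Invented toy records: a baseline that passes every filter, plus variants
# that each fail exactly one of them.
BASE = {
    "name": "A", "position": "DM", "age": 24, "minutes": 2400, "league_tier": 1,
    "tackles_per90": 4.1, "interceptions_per90": 2.9,
    "ball_recoveries_per90": 11.0, "pressures_per90": 26.0,
    "progressive_passes_per90": 5.0, "progressive_carries_per90": 2.4,
}
PLAYERS = [
    BASE,
    {**BASE, "name": "B", "age": 29},
    {**BASE, "name": "C", "minutes": 900},
    {**BASE, "name": "D", "progressive_passes_per90": 3.1},
    {**BASE, "name": "E", "position": "AM"},
]

shortlist, counts = apply_funnel(PLAYERS, FILTERS)
for label, n in counts:
    print(f"{label:<26} -> {n}")
```

Keeping the filters as an ordered list makes it easy to log the attrition at each stage and to see which threshold removes the most candidates.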
Step 3: Composite Scoring and Ranking
Z-Score Computation
For each of the 31 shortlisted players, we compute z-scores relative to the full population of 2,800 qualifying midfielders (those meeting the age and minutes criteria):
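$$ z_{ij} = \frac{x_{ij} - \bar{x}_j}{s_j} $$
where $x_{ij}$ is player $i$'s value for metric $j$, and $\bar{x}_j$ and $s_j$ are the mean and standard deviation of metric $j$ across the reference population.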
Weighted Composite Score
$$ S_i = \sum_{j=1}^{8} w_j \cdot z_{ij} $$
where $w_j$ is the profile weight for metric $j$ and the sum runs over the eight profile metrics.
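The two formulas above can be sketched in a few lines of standard-library Python. The `composite_scores` helper, the two-metric toy population, and the toy weights are all invented for illustration. The sketch also assumes $s_j$ is the sample standard deviation (`statistics.stdev`), which the text does not specify.

```python
from statistics import mean, stdev


def composite_scores(population, shortlist, weights):
    """Return {player name: S_i}, where S_i is the weighted sum of z-scores."""
    # Mean and sample standard deviation of each metric over the reference population.
    stats = {}
    for metric in weights:
        values = [p[metric] for p in population]
        stats[metric] = (mean(values), stdev(values))

    scores = {}
    for player in shortlist:
        total = 0.0
        for metric, w in weights.items():
            mu, sd = stats[metric]
            total += w * (player[metric] - mu) / sd
        scores[player["name"]] = total
    return scores


# Toy data with two metrics and invented weights, purely for illustration.
population = [
    {"name": "P1", "tackles_per90": 2.0, "interceptions_per90": 1.0},
    {"name": "P2", "tackles_per90": 3.0, "interceptions_per90": 2.0},
    {"name": "P3", "tackles_per90": 4.0, "interceptions_per90": 3.0},
]
toy_weights = {"tackles_per90": 0.6, "interceptions_per90": 0.4}

scores = composite_scores(population, [population[1], population[2]], toy_weights)
print(scores)
```

Here P3 sits one standard deviation above the mean on both metrics and P2 sits exactly on the mean. The z-scores are computed against the wider reference population, not against the shortlist itself, as in the text.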
Top 10 Candidates (Hypothetical Results)
| Rank | Player | Age | League | Composite Score | Key Strength |
|---|---|---|---|---|---|
| 1 | Player Alpha | 24 | Ligue 1 | 2.85 | Elite pressing + ball recovery |
| 2 | Player Beta | 23 | Bundesliga | 2.71 | Exceptional tackling + progressive carrying |
| 3 | Player Gamma | 25 | Serie A | 2.58 | Highest interception rate |
| 4 | Player Delta | 22 | Eredivisie | 2.45 | Best progressive passing of all candidates |
| 5 | Player Epsilon | 26 | Premier League | 2.38 | Most well-rounded profile |
| 6 | Player Zeta | 24 | Liga Portugal | 2.31 | Strong pressing, good aerial ability |
| 7 | Player Eta | 23 | Championship | 2.22 | High tackle rate, excellent stamina |
| 8 | Player Theta | 25 | La Liga | 2.15 | Elite interception rate, good passing |
| 9 | Player Iota | 21 | Bundesliga 2 | 2.08 | Youngest candidate, high upside |
| 10 | Player Kappa | 24 | Ligue 1 | 2.01 | Strong in all categories, no standout |
Step 4: League Adjustment
Several candidates play in leagues with different statistical environments. We apply league adjustment factors:
Adjustment Example: Player Delta (Eredivisie)
Player Delta's raw statistics in the Eredivisie:
- Tackles per 90: 4.1
- Pressures per 90: 27.8
- Progressive passes per 90: 8.2
Applying Premier League adjustment factors:
$$ \text{Tackles}^{adj} = 4.1 \times \frac{1.00}{0.83} = 4.94 $$
$$ \text{Pressures}^{adj} = 27.8 \times \frac{1.00}{0.82} = 33.90 $$
Wait -- these adjusted numbers look too good. This illustrates an important caveat: the simple ratio method can overadjust when the difficulty gap is large. We should apply a dampening factor:
$$ \text{Adjusted} = \text{Raw} \times \left(1 + d \times \left(\frac{f_{target}}{f_{source}} - 1\right)\right) $$
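where $f_{source}$ and $f_{target}$ are the adjustment factors of the player's current league and the destination league, and $d$ is a dampening factor (typically 0.5-0.7) that applies only part of the full ratio adjustment. Setting $d = 1$ recovers the simple ratio method.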
With $d = 0.6$:
$$ \text{Tackles}^{adj} = 4.1 \times (1 + 0.6 \times (1.205 - 1)) = 4.1 \times 1.123 = 4.60 $$
This is more realistic and accounts for the fact that not all of the league difference translates directly.
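The dampened adjustment is small enough to express as a single function. In this sketch the function name and parameter names are hypothetical, and the league factors are the ones quoted in the Player Delta example.

```python
def adjust_for_league(raw, f_source, f_target=1.00, dampening=1.0):
    """Scale a raw per-90 value from a source league to a target league.

    dampening=1.0 gives the simple ratio method; values below 1.0 apply
    only part of the ratio adjustment.
    """
    ratio = f_target / f_source
    return raw * (1 + dampening * (ratio - 1))


# Player Delta's tackles per 90, using the Eredivisie factor of 0.83 from the text.
simple = adjust_for_league(4.1, f_source=0.83)
dampened = adjust_for_league(4.1, f_source=0.83, dampening=0.6)

print(f"simple ratio: {simple:.2f}")
print(f"dampened:     {dampened:.2f}")
```

Exposing the dampening factor as a parameter makes it straightforward to run a sensitivity check across the 0.5-0.7 range instead of committing to a single value.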
Post-Adjustment Rankings
After league adjustment, the rankings shift:
| Rank | Player | League | Raw Score | Adjusted Score | Change |
|---|---|---|---|---|---|
| 1 | Player Alpha | Ligue 1 | 2.85 | 2.72 | Slight downward adj. |
| 2 | Player Beta | Bundesliga | 2.71 | 2.55 | Slight downward adj. |
| 3 | Player Gamma | Serie A | 2.58 | 2.48 | Minor adjustment |
| 4 | Player Epsilon | Premier League | 2.38 | 2.38 | No change (already PL) |
| 5 | Player Delta | Eredivisie | 2.45 | 2.15 | Significant downward adj. |
Player Epsilon, initially ranked 5th, rises to 4th, overtaking Player Delta, because his statistics already come from the Premier League and require no adjustment.
Step 5: Performance Projection
Age Curve Application
For each top candidate, we project forward to estimate performance over the proposed contract period (4 years):
Player Alpha (Age 24, Ligue 1):
- Current level (league-adjusted): 2.72 composite
- Expected trajectory: Near peak, slight improvement to age 26-27, then plateau
- 4-year projected average: 2.65 composite

Player Iota (Age 21, Bundesliga 2):
- Current level (league-adjusted): 1.85 composite
- Expected trajectory: Steep improvement expected, but with high uncertainty
- 4-year projected average: 2.30 composite (with 80% CI: 1.60-3.00)
MARCEL Projection for Key Metrics
For Player Alpha's tackles per 90:
- Season $t$: 4.2
- Season $t-1$: 3.8
- Season $t-2$: 3.5
- Reliability: $r = 0.62$
- League mean: 2.8
- Age adjustment: +0.05 (age 24, still improving)
$$ \hat{y}_{t+1} = \left(\frac{5 \times 4.2 + 4 \times 3.8 + 3 \times 3.5}{12}\right) \times 0.62 + 2.8 \times 0.38 + 0.05 $$
$$ = \left(\frac{21.0 + 15.2 + 10.5}{12}\right) \times 0.62 + 1.064 + 0.05 $$
$$ = 3.892 \times 0.62 + 1.064 + 0.05 = 2.413 + 1.064 + 0.05 = 3.53 $$
The projection comes to approximately 3.53 tackles per 90, which is above the league mean of 2.8 but not elite -- a realistic figure that appropriately regresses from his current high level of 4.2. Note that the calculation regresses toward the league mean but does not itself include a league-quality adjustment; the Step 4 adjustment would be applied separately.
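The projection above can be reproduced with a short function. This is a minimal sketch of the calculation exactly as written in this section (5/4/3 season weights, regression toward the league mean by reliability, and an additive age adjustment); the function and parameter names are hypothetical.

```python
def marcel_projection(seasons, reliability, league_mean,
                      age_adjustment=0.0, season_weights=(5, 4, 3)):
    """Project next season's value from recent seasons, most recent first."""
    # Recency-weighted average of the observed seasons.
    weighted_avg = (
        sum(w * x for w, x in zip(season_weights, seasons)) / sum(season_weights)
    )
    # Regress toward the league mean, then apply the age adjustment.
    return reliability * weighted_avg + (1 - reliability) * league_mean + age_adjustment


# Player Alpha's tackles per 90: seasons t, t-1, t-2.
projection = marcel_projection(
    seasons=[4.2, 3.8, 3.5],
    reliability=0.62,
    league_mean=2.8,
    age_adjustment=0.05,
)
print(f"{projection:.2f}")  # 3.53
```

Because reliability controls how much weight the player's own record receives, a lower value (for a noisier metric or a smaller sample) pulls the projection closer to the league mean.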
Step 6: Risk Assessment
Risk Profiles for the Three Priority Candidates
Player Alpha (Age 24, Ligue 1):
- Injury risk: LOW (12 days missed in 2 seasons)
- Performance sustainability: MEDIUM (some xG overperformance as a midfielder)
- Adaptation risk: MEDIUM (Ligue 1 to Premier League, language barrier)
- Character: LOW risk (no disciplinary issues, strong references)
- Financial: MEDIUM (estimated fee 18-22M EUR, reasonable for profile)
- Composite risk: 3.8/10

Player Epsilon (Age 26, Premier League):
- Injury risk: MEDIUM (recurring ankle issues, 45 days missed)
- Performance sustainability: LOW (consistent performer for 3 seasons)
- Adaptation risk: LOW (already in Premier League)
- Character: LOW risk (known in the league, good reputation)
- Financial: HIGH (estimated fee 30-35M EUR, higher wages, limited resale)
- Composite risk: 4.5/10

Player Beta (Age 23, Bundesliga):
- Injury risk: LOW (no significant injuries)
- Performance sustainability: MEDIUM (only 1.5 seasons of high-level data)
- Adaptation risk: MEDIUM (Bundesliga to Premier League)
- Character: LOW risk
- Financial: MEDIUM (estimated fee 15-20M EUR, good development potential)
- Composite risk: 3.2/10
Step 7: Scout Integration
Scouting Assignments
Based on the data analysis, we assign scouts to evaluate the three candidates carried through the risk assessment:
- Scout 1 (France-based): Watch Player Alpha 3 times, report on pressing intensity in live context, physicality in duels, and communication with teammates
- Scout 2 (Germany-based): Watch Player Beta 3 times, report on tactical discipline, positioning in different game states, and ability to handle physical press
- Scout 3 (England-based): Watch Player Epsilon 2 times, specifically assessing injury movement patterns and consistency of effort
Scout Report Excerpts (Hypothetical)
Player Alpha — Scout 1 Report:
"Exceptional engine. Covers more ground than anyone else on the pitch. Anticipation is outstanding -- reads the game two passes ahead. Concerns: slightly limited technically when pressed himself. In tight spaces, sometimes chooses the safe option when a progressive pass is available. Physical enough for the Premier League but not dominant in the air. Recommendation: SIGN -- the defensive qualities are genuine and the technical limitations are coachable."
Player Beta — Scout 2 Report:
"Very intelligent midfielder. Positions himself perfectly to cut passing lanes. Progressive carrying is a real weapon -- drives forward with purpose and picks the right moment. Concerns: tends to disappear in matches where his team dominates possession (fewer defensive opportunities). Only tested against top opposition a few times. Recommendation: MONITOR -- want to see him in bigger matches before committing."
Player Epsilon — Scout 3 Report:
"Reliable and consistent. Does the job every week without flashy moments. Knows the league, knows the demands. Concerns: the ankle is a worry -- visible discomfort after challenges on the left side. At 26 with injury history, the ceiling is what you see now. Recommendation: SIGN IF PRICE IS RIGHT -- but would not go above 25M."
Step 8: Final Recommendation
Decision Matrix
| Factor | Weight | Alpha | Beta | Epsilon |
|---|---|---|---|---|
| Statistical fit | 0.25 | 11.0 | 10.5 | 10.0 |
| Projected performance | 0.20 | 10.5 | 10.0 | 9.5 |
| Risk profile | 0.15 | 9.5 | 10.0 | 8.0 |
| Scout assessment | 0.20 | 10.5 | 9.0 | 9.5 |
| Financial value | 0.20 | 9.5 | 10.5 | 8.0 |
| Weighted Total | 1.00 | 10.28 | 10.03 | 9.10 |
Recommendation
Primary target: Player Alpha
- Best statistical fit for the departed player's role
- Strong scout endorsement
- Reasonable transfer fee with development upside
- Manageable risk profile
- Age 24 means 4-5 years of peak performance and potential resale value

Backup target: Player Beta
- Slightly lower current level but strong upside
- Lower fee provides better financial value
- Scout wants further evaluation -- monitor for the remainder of the season
- If Alpha is unavailable or price escalates, Beta becomes the primary target

Conditional target: Player Epsilon
- Only pursue if available below 25M EUR
- Injury risk and age limit the upside
- No adaptation risk is a significant advantage
- Best option for immediate impact if the club needs a short-term solution
Lessons from the Kanté Case
- The power of lower-league scouting: Kanté was identified in Ligue 2 because someone (at Leicester) was looking beyond the obvious markets. Data-driven screening makes this systematic rather than accidental.
- Profile definition is critical: "Find the next Kanté" is not a useful brief. "Find a midfielder under 27 who is in the 85th+ percentile for tackles, interceptions, and pressures while also being above average in progressive actions" is actionable.
- Context matters: Kanté's statistics at Caen were extraordinary for Ligue 2, but the question was whether they would translate. League adjustment and projection models help answer this question systematically.
- The complete package requires human assessment: Kanté's humility, work ethic, and football intelligence were apparent to scouts but invisible in data. The final decision must integrate both perspectives.
- Value creation requires timing: Leicester bought Kanté for a reported 5.6M GBP and sold him for 32M GBP a year later -- a return driven by buying before peak recognition. The recruitment funnel must prioritize value, not just talent.
Discussion Questions
- In the composite scoring step, we combined z-scores using the profile weights. What are the advantages and disadvantages of this approach compared to using percentile-based scoring?
- Player Delta (Eredivisie) dropped significantly in the rankings after league adjustment. Under what circumstances might you still consider a player whose adjusted scores are lower but whose raw talent is evident?
- The scout recommended "MONITOR" for Player Beta rather than "SIGN." How long should a club be willing to wait for additional scouting evidence before a transfer window closes?
- How would the analysis change if the club's budget were 50% lower? Which trade-offs would you make?
- If you were applying this methodology in real-time during a transfer window, what would be your minimum timeline from initial screening to final recommendation?
Code Reference
See code/case-study-code.py for the full Python implementation of this scouting replacement analysis, including similarity scoring, league adjustment, MARCEL projection, and the decision matrix calculation.