Case Study 2: Building a Player Evaluation Dashboard for a Scouting Department

Background

You have been hired as a junior data analyst at a mid-table club in a European top-flight league. The scouting department currently evaluates players using a combination of video analysis and basic statistics (goals, assists, pass completion rate). The head of recruitment has asked you to build a Player Evaluation Dashboard that the scouting team can use to compare transfer targets.

The dashboard must satisfy several constraints:

  1. Audience: The primary users are scouts and the sporting director. They are comfortable with basic statistics but have limited experience with advanced metrics. The head coach will see summaries but not interact with the dashboard directly.

  2. Scope: The dashboard focuses on attacking players (wingers, attacking midfielders, and strikers) in the top five European leagues.

  3. Data: You have access to event-level data for the current season, including shot locations, pass coordinates, carry end-points, and pressure events. You also have historical data for the past three seasons.

  4. Decision context: The club is looking to sign one attacking player in the January transfer window with a budget of 15--25 million euros. The player must improve the team's chance creation, which has been identified as a weakness.

Phase 1: Metric Selection

Task 1.1: Choosing Metrics

From the universe of available metrics, select 8--10 metrics that should appear on the dashboard for attacking players. For each metric, explain:

  • (a) What it measures and why it is relevant to the club's need (improved chance creation).
  • (b) Whether it is a rate or counting statistic, and what denominator is used.
  • (c) Its approximate stabilization period (from Section 5.5.5).
  • (d) Any known limitations or caveats.

Hint

Consider metrics from multiple phases of play: shooting, passing/creativity, ball-carrying, and pressing/work rate.

Task 1.2: What NOT to Include

Identify three metrics that might seem relevant but should be excluded from the dashboard. For each, explain why it fails one or more of the five desirable metric properties (validity, reliability, discrimination, interpretability, actionability).

Phase 2: Context Adjustments

Task 2.1: Adjustment Strategy

The dashboard will compare players across five different leagues (Premier League, La Liga, Bundesliga, Serie A, Ligue 1). Describe the context adjustments you would apply, addressing:

  • (a) League-strength adjustment: How would you handle the fact that Ligue 1 is generally considered weaker than the Premier League?
  • (b) Possession adjustment: A Barcelona winger in a 65% possession team cannot be compared directly with a Burnley winger in a 42% possession team, because raw per-90 counting stats partly reflect how often each team has the ball.
  • (c) Opponent adjustment: How do you account for the fact that a player in a weaker league faces weaker opponents?
  • (d) Which adjustments would you apply automatically, and which would you let the user toggle on or off?
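One simple way to operationalize (a) is a multiplicative league-strength coefficient applied to per-90 metrics. The coefficients below are illustrative placeholders only, not published values; in practice they would be estimated, e.g., from how players' output changes after inter-league transfers.

```python
# Hypothetical league-strength coefficients relative to a Premier League
# baseline. These numbers are assumptions for illustration, not estimates.
LEAGUE_COEF = {
    "Premier League": 1.00,  # reference league
    "La Liga": 0.95,
    "Bundesliga": 0.93,
    "Serie A": 0.94,
    "Ligue 1": 0.88,
}

def league_adjust(value_per90: float, league: str) -> float:
    """Scale a raw per-90 value toward the reference-league baseline."""
    return value_per90 * LEAGUE_COEF[league]

# Example: 0.30 xA/90 in Ligue 1 projects to roughly 0.26 at the baseline.
print(round(league_adjust(0.30, "Ligue 1"), 3))  # 0.264
```

Because the coefficients are uncertain, this is a natural candidate for a user-visible toggle in part (d) rather than a silent automatic adjustment.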

Task 2.2: Implementation

Using the code provided in code/case-study-code.py, implement possession adjustment for the following simulated player data:

Player    League           Team Poss%   xA/90 (raw)   Prog Carries/90 (raw)   Pressures/90 (raw)
Alves     La Liga          63           0.28          5.2                     18.5
Brooks    Premier League   47           0.22          3.8                     22.1
Crespo    Serie A          55           0.25          4.5                     19.8
Dupont    Ligue 1          51           0.24          4.1                     20.5
Eriksen   Bundesliga       58           0.26          4.8                     17.2

Compute possession-adjusted values for each metric and discuss how the rankings change.
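A minimal sketch of one common approach, using pandas: rescale each metric to a notional 50% possession team. On-ball metrics (xA, progressive carries) scale by 50 / possession%; pressures occur out of possession, so they scale by 50 / (100 - possession%). This linear rescaling is only one option (StatsBomb's pAdj, for example, uses a sigmoid instead).

```python
import pandas as pd

# Simulated player data from the table above.
players = pd.DataFrame({
    "player":       ["Alves", "Brooks", "Crespo", "Dupont", "Eriksen"],
    "poss_pct":     [63, 47, 55, 51, 58],
    "xa_90":        [0.28, 0.22, 0.25, 0.24, 0.26],
    "carries_90":   [5.2, 3.8, 4.5, 4.1, 4.8],
    "pressures_90": [18.5, 22.1, 19.8, 20.5, 17.2],
})

# On-ball metrics: normalize to a 50% possession baseline.
for col in ["xa_90", "carries_90"]:
    players[f"{col}_adj"] = players[col] * 50 / players["poss_pct"]

# Pressures happen while the opponent has the ball.
players["pressures_90_adj"] = players["pressures_90"] * 50 / (100 - players["poss_pct"])

print(players.round(2))
```

Under this adjustment the low-possession players (Brooks, Dupont) gain ground on the on-ball metrics, while the high-possession players (Alves, Eriksen) gain on pressures, which is exactly the ranking shift the task asks you to discuss.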

Phase 3: Validation

Task 3.1: Historical Validation

Using the past three seasons of historical data (simulated), validate your chosen metrics by:

  • (a) Computing season-over-season correlations (stability).
  • (b) Computing the ICC across all players and seasons (discrimination).
  • (c) Testing whether each metric in season $N$ predicts goal involvement (goals + assists) in season $N+1$ (predictive validity).

Summarize your findings in a table like this:

Metric            Season-over-Season r   ICC   Predictive r vs. Goal Involvement
xA/90             ?                      ?     ?
Prog Carries/90   ?                      ?     ?
...               ...                    ...   ...
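The three computations can be sketched on simulated data as follows. The player pool, talent distribution, and noise levels below are illustrative assumptions; with the real historical data you would replace the simulated arrays with observed metric values. ICC here is the one-way random-effects ICC(1), estimated from the between- and within-player mean squares.

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated panel: 100 players x 3 seasons of a per-90 metric, generated as a
# stable player "talent" plus independent season noise (illustrative only).
n_players, n_seasons = 100, 3
talent = rng.normal(0.25, 0.06, n_players)
metric = talent[:, None] + rng.normal(0, 0.05, (n_players, n_seasons))

# (a) Stability: Pearson r between consecutive seasons.
r_stab = np.corrcoef(metric[:, 0], metric[:, 1])[0, 1]

# (b) ICC(1): share of variance attributable to between-player differences.
grand = metric.mean()
ms_between = n_seasons * ((metric.mean(axis=1) - grand) ** 2).sum() / (n_players - 1)
ms_within = ((metric - metric.mean(axis=1, keepdims=True)) ** 2).sum() / (n_players * (n_seasons - 1))
icc = (ms_between - ms_within) / (ms_between + (n_seasons - 1) * ms_within)

# (c) Predictive validity: metric in season N vs. simulated goal involvement in N+1.
goal_inv = 0.8 * talent + rng.normal(0, 0.05, n_players)
r_pred = np.corrcoef(metric[:, 0], goal_inv)[0, 1]

print(f"stability r = {r_stab:.2f}, ICC = {icc:.2f}, predictive r = {r_pred:.2f}")
```

Running this per metric fills in one row of the validation table.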

Task 3.2: Minimum Minutes

Based on your validation results and the stabilization-point analysis from Section 5.5.5, recommend a minimum-minutes threshold for the dashboard. Justify your choice and explain how you would handle players who fall below this threshold (exclude them entirely, flag them with a warning, or use a shrinkage estimator).
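If you opt for the shrinkage route, one standard form pulls a low-minute player's raw per-90 value toward the positional mean, with a weight proportional to minutes played. The 900-minute "prior strength" below is a tunable assumption, not a canonical value; your stabilization-point analysis should inform it.

```python
def shrink(raw_per90: float, minutes: float, positional_mean: float,
           prior_minutes: float = 900.0) -> float:
    """Minutes-weighted shrinkage toward the positional mean.

    With minutes >> prior_minutes the raw value dominates; with few
    minutes the estimate collapses toward the peer average.
    """
    w = minutes / (minutes + prior_minutes)
    return w * raw_per90 + (1 - w) * positional_mean

# A player with only 450 minutes keeps one third of his raw signal:
# 450/(450+900) = 1/3, so 0.40 shrinks to (1/3)*0.40 + (2/3)*0.22 = 0.28.
print(round(shrink(0.40, 450, 0.22), 3))  # 0.28
```

Shrinkage lets below-threshold players stay visible on the dashboard (with a warning flag) instead of vanishing entirely.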

Phase 4: Visualization and Communication

Task 4.1: Dashboard Layout

Sketch (or describe in detail) the layout of a single-player profile page on the dashboard. Include:

  • (a) A percentile radar chart showing the player's ranking on each metric relative to positional peers.
  • (b) A comparison table allowing side-by-side comparison of up to 3 players.
  • (c) A trend line showing how the player's key metrics have evolved over the past 3 seasons.
  • (d) An uncertainty indicator (e.g., confidence interval bars) reflecting sample size.
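The data behind (a) and (d) can be prototyped before any plotting: compute the player's percentile rank against a positional peer pool, and bootstrap the peer pool to get interval bars for each radar spoke. The peer pool below is simulated; with real data it would be all qualifying players at the same position.

```python
import numpy as np

rng = np.random.default_rng(7)

# Simulated peer pool of xA/90 values for attacking midfielders (illustrative).
peers_xa90 = rng.normal(0.20, 0.07, 500)
player_xa90 = 0.28

# Percentile rank: share of peers the player outperforms on this metric.
percentile = 100.0 * (peers_xa90 < player_xa90).mean()
print(f"xA/90 percentile vs. positional peers: {percentile:.0f}")

# Bootstrap the peer pool to get a 95% interval for that percentile; these
# bounds drive the uncertainty bars in part (d).
boots = [100.0 * (rng.choice(peers_xa90, peers_xa90.size) < player_xa90).mean()
         for _ in range(1000)]
lo, hi = np.percentile(boots, [2.5, 97.5])
print(f"95% interval: [{lo:.0f}, {hi:.0f}]")
```

For a low-minute player, bootstrapping over the player's own matches instead of the peer pool would widen the interval further, making the sample-size caveat visible on the chart itself.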

Task 4.2: Scouting Report

Write a one-page scouting report (300--400 words) for one of the simulated players, following the communication principles from Section 5.6. Your report should:

  • Lead with the recruitment question (does this player improve our chance creation?).
  • Present 3--5 key metrics with context (comparisons to positional averages, possession-adjusted values).
  • Acknowledge uncertainty and limitations.
  • Include a clear recommendation.

Task 4.3: Presenting to the Sporting Director

The sporting director has 15 minutes for your presentation. Plan your presentation:

  • (a) What is your opening statement (1--2 sentences)?
  • (b) Which 3 metrics do you emphasize and why?
  • (c) How do you handle the question: "But he only played 1,800 minutes --- can we really trust these numbers?"
  • (d) What video clips would you pair with the data?

Phase 5: Ethical Considerations

Task 5.1: Data Fairness

Your dashboard compares players across leagues, age groups, and team contexts. Identify two potential fairness issues that could arise from your metric choices or adjustment methods. For example:

  • Could your league-strength adjustment systematically undervalue players from African or South American leagues?
  • Could possession adjustment penalize players on dominant teams who might actually be driving that dominance?

Task 5.2: Transparency

The head scout asks: "I do not understand how these adjusted numbers work. How do I know they are not just made up?" Describe how you would address this concern, using at least three strategies from Section 5.6.5.

Deliverables

By the end of this case study, you should produce:

  1. A metric selection document (Tasks 1.1 and 1.2).
  2. A context-adjusted comparison table (Task 2.2).
  3. A validation summary table (Task 3.1).
  4. A scouting report for one player (Task 4.2).
  5. A presentation outline (Task 4.3).
  6. Working Python code for all computations (see code/case-study-code.py).

Key Takeaways

  • Building an effective scouting dashboard requires careful metric selection grounded in the desirable metric properties from Section 5.2.
  • Cross-league comparisons demand multiple context adjustments applied thoughtfully --- with transparency about what each adjustment does and does not account for.
  • Validation is not a one-time exercise. Metrics should be re-validated as new data becomes available and as the competitive landscape changes.
  • Communication is as important as computation. The best metrics in the world are useless if scouts and decision-makers do not trust or understand them.
  • Ethical considerations --- fairness across leagues, transparency of methods, and honest acknowledgment of uncertainty --- should be built into the design from the start, not added as an afterthought.