Chapter 21: Quiz — Player Recruitment and Scouting
Instructions: Select the best answer for each question. Each question has exactly one correct answer unless otherwise stated.
Question 1. What is the primary purpose of data screening in the recruitment funnel?
(a) To replace traditional scouting entirely (b) To narrow a large universe of players to a manageable number for human evaluation (c) To determine the exact transfer fee for each candidate (d) To generate automated contract offers
Question 2. Which of the following is NOT a typical limitation of event data in player recruitment?
(a) It cannot capture off-ball movement (b) It provides limited insight into decision-making quality (c) It is only available for the top 5 European leagues (d) It does not measure leadership and communication
Question 3. Why is per-90-minute normalization used in recruitment analytics?
(a) To inflate statistics for players who play fewer minutes (b) To remove the effect of playing time differences when comparing players (c) To convert seasonal totals into weekly projections (d) To adjust for differences in match length across competitions
Question 4. A player has 2 goals in 270 minutes of play. His goals per 90 is:
(a) 0.33 (b) 0.67 (c) 1.00 (d) 0.74
Question 5. What is the "replacement fallacy" in recruitment?
(a) The belief that any player can be replaced by a cheaper alternative (b) The mistake of searching for a like-for-like replacement instead of considering complementary profiles (c) The assumption that replacement players will always underperform the original (d) The tendency to overpay for direct replacements
Question 6. Cosine similarity between two player profiles measures:
(a) The absolute difference in their statistical outputs (b) The directional alignment of their statistical vectors regardless of magnitude (c) The physical distance between their playing positions on the pitch (d) The correlation between their performance over time
Question 7. Which minimum minutes threshold is most commonly recommended for per-90 statistics in recruitment analysis?
(a) 180 minutes (2 full matches) (b) 450 minutes (5 full matches) (c) 900 minutes (10 full matches) (d) 2,700 minutes (30 full matches)
Question 8. In a composite scoring model, what is the purpose of z-score standardization?
(a) To convert all metrics to the same scale before combining them (b) To remove outliers from the dataset (c) To weight metrics by their importance (d) To adjust metrics for league difficulty
Question 9. A player has 400 minutes of playing time and an observed npxG/90 of 0.45. Using Bayesian shrinkage with a prior strength of 900 minutes and league mean of 0.20, what is his adjusted npxG/90?
(a) 0.20 (b) 0.28 (c) 0.33 (d) 0.45
Question 10. According to the general age curve model, at what age range do most outfield players reach their peak overall performance?
(a) 20-23 (b) 24-29 (c) 30-33 (d) 18-21
Question 11. In the MARCEL projection system, what does the reliability coefficient ($r$) represent?
(a) The player's consistency across matches (b) The test-retest correlation indicating how predictive a statistic is year-over-year (c) The probability that the projection is correct (d) The correlation between a player's statistics and team success
Question 12. What is the primary advantage of the delta method for estimating age curves over fitting a curve to cross-sectional data?
(a) It requires less data (b) It is computationally simpler (c) It mitigates survivorship bias by focusing on within-player changes (d) It produces smoother curves
Question 13. A midfielder averages 10.0 progressive passes per 90 in the Eredivisie (league factor 0.77). What is his league-adjusted figure for the Premier League (factor 1.00)?
(a) 8.16 (b) 10.00 (c) 12.99 (d) 10.77
Question 14. Which league adjustment method uses data from players who have transferred between leagues to calibrate statistical comparisons?
(a) League average ratios (b) Transfer-based calibration (c) Hierarchical modeling (d) Style-weighted normalization
Question 15. A team averaging 63% possession will tend to have players with inflated statistics in which category?
(a) Aerial duels won (b) Tackles per 90 (c) Progressive passes per 90 (d) Clearances per 90
Question 16. A striker scored 15 goals from 11.0 xG, so his goals minus xG is +4.0. If the expected standard deviation for a player with his shot volume is 3.0 goals, what is his z-score for finishing outperformance?
(a) 0.75 (b) 1.00 (c) 1.33 (d) 4.00
Question 17. Which of the following is NOT typically classified as a "red flag" in player recruitment?
(a) Significant overperformance of xG without a history of elite finishing (b) Declining minutes played year-over-year without clear injury explanation (c) Strong statistical performance in a top-5 European league (d) Three or more hamstring injuries in two seasons
Question 18. In a composite risk score, which risk category typically receives the highest weight?
(a) Discipline risk (b) Age risk (c) Injury risk (d) The weighting depends on the club's specific priorities and risk tolerance
Question 19. What is the typical adaptation period when a player transfers to a new league?
(a) 1-2 weeks (b) 1-3 months (c) 3-12 months (d) 18-24 months
Question 20. Which of the following player attributes is BEST captured by traditional scouting rather than data analysis?
(a) Goals per 90 minutes (b) Pass completion percentage (c) Decision-making speed under pressure (d) Distance covered per match
Question 21. In the integration framework described in this chapter, at which stage do scouts and analysts first collaborate directly?
(a) Stage 1: Data-Led Discovery (b) Stage 2: Scout-Led Evaluation (c) Stage 3: Collaborative Assessment (d) Stage 4: Decision Support
Question 22. A non-linear scoring function assigns zero credit below a minimum threshold, linear credit up to a target value, and diminishing returns above the target. What is the mathematical form used for the diminishing returns region?
(a) Exponential growth (b) Logarithmic function (c) Quadratic function (d) Sigmoid function
Question 23. When building a player database for recruitment, which of the following relationships is correctly described?
(a) Each player belongs to exactly one league across their entire career (b) A player's per-90 statistics should be stored as raw totals only (c) Biographical data, seasonal statistics, and injury records should be stored in separate but linked tables (d) Market value should be stored only at the time of the most recent transfer
Question 24. Which statement best describes the role of prediction intervals in performance projection?
(a) They guarantee that the player's performance will fall within the stated range (b) They communicate the uncertainty inherent in any projection, increasing with projection horizon (c) They replace the need for point estimates (d) They are only relevant for players under age 23
Question 25. The chapter argues that the goal of integrating data and scouting is:
(a) To prove that data analysis is superior to traditional scouting (b) To eliminate the need for live scouting (c) To improve the overall hit rate on recruitment decisions, not to eliminate mistakes entirely (d) To reduce the recruitment department's headcount and costs
Answer Key
1. (b) Data screening narrows the universe of players so that human evaluation (video and live scouting) can focus on the most promising candidates.
2. (c) Event data is available for many leagues beyond the top 5. The other options are genuine limitations of event data.
3. (b) Per-90 normalization removes playing time effects, enabling fair comparison between starters and substitutes.
4. (b) Goals per 90 = (2 / 270) * 90 = 0.667, rounded to 0.67.
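The Question 4 calculation can be sketched as a small helper. The function name `per_90` is illustrative and does not come from the chapter.

```python
def per_90(total: float, minutes: float) -> float:
    """Scale a raw season total to a per-90-minute rate."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return total / minutes * 90


# Question 4: 2 goals in 270 minutes.
goals_per_90 = per_90(2, 270)
print(round(goals_per_90, 2))  # 0.67
```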
5. (b) The replacement fallacy is searching for a statistically identical replacement rather than considering what profile best complements the existing squad.
6. (b) Cosine similarity measures the angle between two vectors, capturing directional alignment regardless of magnitude.
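To illustrate why magnitude does not matter for the Question 6 measure, here is a minimal sketch in plain Python. The three profiles are made-up numbers, not data from the chapter.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length metric vectors."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of metrics")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical profiles: player_b has exactly double player_a's output
# in every metric, player_c has a different mix of strengths.
player_a = [2.0, 4.0, 1.0]
player_b = [4.0, 8.0, 2.0]
player_c = [4.0, 1.0, 2.0]

same_shape = cosine_similarity(player_a, player_b)       # close to 1.0
different_shape = cosine_similarity(player_a, player_c)  # 14 / 21
```

Because `player_b` is a scaled copy of `player_a`, the two are treated as the same profile even though their raw outputs differ. This is why cosine similarity is usually paired with a separate check on output level.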
7. (c) 900 minutes (10 full matches) is the most commonly cited threshold, though context-dependent adjustments may be applied.
8. (a) Z-score standardization converts metrics measured on different scales (e.g., passes per 90 vs. xG per 90) to a common scale with mean 0 and standard deviation 1.
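The standardization step from Question 8 can be sketched as follows. The sample values and the equal weights are assumptions for illustration only. The population standard deviation is used here, though a sample standard deviation would be an equally reasonable choice.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize a list of values to mean 0 and standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Every player has the same value, so the metric carries no signal.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical metrics for three players, on very different scales.
passes_per_90 = [40.0, 55.0, 70.0]
xg_per_90 = [0.10, 0.30, 0.20]

z_passes = z_scores(passes_per_90)
z_xg = z_scores(xg_per_90)

# Hypothetical equal weights; a real model would set these per role.
weights = {"passes": 0.5, "xg": 0.5}
composite = [
    weights["passes"] * p + weights["xg"] * x
    for p, x in zip(z_passes, z_xg)
]
```

Without standardization, the passing metric would dominate the composite simply because its raw numbers are two orders of magnitude larger.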
9. (b) Adjusted = (400 * 0.45 + 900 * 0.20) / (400 + 900) = (180 + 180) / 1300 = 360 / 1300 = 0.277, approximately 0.28.
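The shrinkage formula used in Question 9 is a minutes-weighted average of the observed rate and the league mean. A minimal sketch, with an illustrative function name and parameter names:

```python
def shrink_rate(observed, minutes, prior_mean, prior_minutes):
    """Pull an observed per-90 rate toward the league mean.

    The prior behaves like `prior_minutes` of league-average play,
    so small samples are shrunk more than large ones.
    """
    if minutes < 0 or prior_minutes < 0 or minutes + prior_minutes == 0:
        raise ValueError("minutes must be non-negative and not both zero")
    return (minutes * observed + prior_minutes * prior_mean) / (
        minutes + prior_minutes
    )


# Question 9: 400 minutes at 0.45 npxG/90, prior of 900 minutes at 0.20.
adjusted = shrink_rate(0.45, 400, 0.20, 900)
print(round(adjusted, 3))  # 0.277
```

With the same observed rate over 4,000 minutes, the adjusted figure would sit much closer to 0.45, since the prior's 900 minutes would then carry far less relative weight.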
10. (b) Most outfield players peak between ages 24-29, though the exact range varies by position and metric.
11. (b) The reliability coefficient measures year-over-year stability of a statistic, determining how much weight to place on observed performance versus regression to the mean.
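The role of $r$ in Question 11 can be illustrated with a deliberately simplified, single-season sketch. This is an assumption about the general form (weight $r$ on the observed value, $1 - r$ on the league mean), not the chapter's full MARCEL implementation, which may also weight multiple seasons and apply age adjustments. All numbers are hypothetical.

```python
def regress_to_mean(observed, league_mean, reliability):
    """Blend an observed statistic with the league mean.

    `reliability` is the year-over-year correlation r, between 0 and 1.
    A stable statistic (r near 1) keeps most of its observed value;
    a noisy one (r near 0) is pulled almost fully to the league mean.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return reliability * observed + (1.0 - reliability) * league_mean


# Hypothetical: observed 0.50, league mean 0.30, reliability 0.6.
projection = regress_to_mean(0.50, 0.30, 0.6)  # 0.6*0.50 + 0.4*0.30 = 0.42
```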
12. (c) The delta method examines within-player year-over-year changes, avoiding the survivorship bias that occurs when only players who continue to play appear in cross-sectional data.
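A minimal sketch of the within-player idea behind Question 12, assuming the input is a list of `(player_id, age, value)` season records. The record layout and the sample data are hypothetical, and a fuller version would typically weight each change by minutes played.

```python
from collections import defaultdict


def delta_method(seasons):
    """Average year-over-year change by age, using only players who
    have a season at both age N and age N + 1."""
    by_player = defaultdict(dict)
    for player_id, age, value in seasons:
        by_player[player_id][age] = value

    deltas = defaultdict(list)
    for ages in by_player.values():
        for age, value in ages.items():
            if age + 1 in ages:
                deltas[age].append(ages[age + 1] - value)

    return {age: sum(d) / len(d) for age, d in sorted(deltas.items())}


# Hypothetical records. Player C has a single season and therefore
# contributes no delta.
seasons = [
    ("A", 22, 0.30), ("A", 23, 0.36),
    ("B", 22, 0.20), ("B", 23, 0.24), ("B", 24, 0.25),
    ("C", 24, 0.40),
]
age_deltas = delta_method(seasons)
# age_deltas[22] is the mean change from age 22 to 23, and so on.
```

Chaining these average deltas from a reference age gives the shape of the age curve, with each step comparing a player only to himself.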
13. (c) Adjusted = 10.0 * (1.00 / 0.77) = 10.0 * 1.2987 = 12.99.
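The ratio used in the Question 13 answer can be checked with a short sketch. The function name is illustrative, and the direction of the adjustment (multiplying by target factor over source factor) simply follows the formula in the answer key.

```python
def league_adjust(value, source_factor, target_factor):
    """Rescale a per-90 figure by the ratio of two league factors."""
    if source_factor <= 0:
        raise ValueError("source_factor must be positive")
    return value * target_factor / source_factor


# Question 13: 10.0 progressive passes per 90, Eredivisie factor 0.77,
# Premier League factor 1.00.
adjusted_passes = league_adjust(10.0, 0.77, 1.00)
print(round(adjusted_passes, 2))  # 12.99
```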
14. (b) Transfer-based calibration compares the same player's performance before and after moving between leagues.
15. (c) Teams with high possession will have more time on the ball, inflating passing statistics including progressive passes.
16. (c) z = (goals - xG) / SD = 4.0 / 3.0 = 1.33.
17. (c) Strong performance in a top league is a positive indicator, not a red flag. The other options are all recognized red flags.
18. (d) Risk weightings should reflect the club's specific priorities, risk tolerance, and strategic context. There is no universally "correct" weighting.
19. (c) The chapter states that adaptation typically takes 3-12 months, during which performance may not reflect true ability.
20. (c) Decision-making speed under pressure requires contextual observation that data cannot fully capture, making it best assessed by traditional scouting.
21. (c) Stage 3 (Collaborative Assessment) is where joint meetings between analysts and scouts first occur in the described framework.
22. (b) The diminishing returns region uses a logarithmic function: $1 + \alpha \cdot \ln(x / x_{\text{target}})$.
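The three regions described in Question 22 can be sketched as one piecewise function. The logarithmic branch follows the formula above. The exact form of the linear region (rising from 0 at the minimum to 1 at the target) and the value `alpha=0.5` are assumptions for illustration.

```python
import math


def threshold_score(x, x_min, x_target, alpha=0.5):
    """Piecewise score: zero, then linear, then logarithmic."""
    if not 0 < x_min < x_target:
        raise ValueError("require 0 < x_min < x_target")
    if x < x_min:
        return 0.0  # below the minimum threshold: no credit
    if x <= x_target:
        # Assumed linear ramp from 0 at x_min to 1 at x_target.
        return (x - x_min) / (x_target - x_min)
    # Diminishing returns above the target.
    return 1.0 + alpha * math.log(x / x_target)


# Hypothetical metric with a minimum of 2.0 and a target of 4.0.
below = threshold_score(1.0, 2.0, 4.0)      # 0.0
halfway = threshold_score(3.0, 2.0, 4.0)    # 0.5
at_target = threshold_score(4.0, 2.0, 4.0)  # 1.0
double = threshold_score(8.0, 2.0, 4.0)     # 1 + 0.5 * ln(2)
```

Doubling the target value adds only about a third of a point here, so a player cannot compensate for weak areas by being extreme in one metric.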
23. (c) A well-designed player database stores biographical data, seasonal statistics, and injury records in separate linked tables.
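One way to picture the separate-but-linked layout from Question 23 is a small in-memory SQLite schema. The table names, column names, and sample rows are all hypothetical, not the chapter's schema. Raw totals are stored and per-90 rates are derived at query time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE players (
        player_id  INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        birth_date TEXT
    );
    CREATE TABLE season_stats (
        player_id INTEGER NOT NULL REFERENCES players(player_id),
        season    TEXT NOT NULL,
        league    TEXT NOT NULL,
        minutes   INTEGER NOT NULL,
        goals     INTEGER NOT NULL,
        PRIMARY KEY (player_id, season, league)
    );
    CREATE TABLE injuries (
        injury_id   INTEGER PRIMARY KEY,
        player_id   INTEGER NOT NULL REFERENCES players(player_id),
        injury_type TEXT,
        days_missed INTEGER
    );
    """
)

# Hypothetical sample data: one player, two leagues, one injury.
conn.execute("INSERT INTO players VALUES (1, 'Example Player', '2001-05-14')")
conn.executemany(
    "INSERT INTO season_stats VALUES (?, ?, ?, ?, ?)",
    [(1, "2022-23", "Eredivisie", 1800, 8),
     (1, "2023-24", "Premier League", 900, 3)],
)
conn.execute("INSERT INTO injuries VALUES (1, 1, 'hamstring', 21)")

# Per-90 rates are computed from stored totals through the link.
rows = conn.execute(
    """
    SELECT p.name, s.season, s.goals * 90.0 / s.minutes AS goals_per_90
    FROM players AS p
    JOIN season_stats AS s ON s.player_id = p.player_id
    ORDER BY s.season
    """
).fetchall()
```

Keying `season_stats` on player, season, and league lets one player appear in several leagues over a career, which is why option (a) in the question is wrong.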
24. (b) Prediction intervals communicate uncertainty and widen with longer projection horizons, younger players, and smaller samples.
25. (c) The chapter explicitly states that the goal is to improve the hit rate, not to eliminate mistakes entirely.