Chapter 5 Self-Assessment Quiz
Test your understanding of the key concepts from Chapter 5. Select the best answer for each question, then expand the answer block to check your work. A score of 80% or higher indicates solid mastery of the material.
Question 1. Which of the following is the primary limitation of using raw goal tallies to evaluate strikers?
- (a) Goals are too easy to count.
- (b) Goal tallies do not account for the quality or quantity of chances received.
- (c) Goals are only relevant in knockout competitions.
- (d) Goal tallies are always biased toward home players.
Answer
**(b)** Raw goal tallies mix finishing skill with the quality and volume of chances a player receives. A striker with many high-quality chances (high xG) will naturally score more, even without superior finishing ability. Section 5.1.1 discusses this in detail.

Question 2. Pass completion rate is considered misleading primarily because:
- (a) It counts backward passes and forward passes equally.
- (b) It is always above 90% for all professional players.
- (c) It does not account for the difficulty or value of attempted passes.
- (d) It requires event-level data that is not widely available.
Answer
**(c)** Pass completion rate treats all passes equally, regardless of their difficulty, distance, or tactical value. A conservative passer who avoids risk will have a higher completion rate than an ambitious one, but the latter's passes may be far more valuable. See Section 5.1.1.

Question 3. In the signal-to-noise framework, "context effects" refers to:
- (a) Random measurement error in the data collection process.
- (b) Systematic factors like opponent quality, game state, and venue that influence observed statistics.
- (c) The player's underlying true talent level.
- (d) The sample size of the dataset.
Answer
**(b)** Context effects are systematic (non-random) influences on observed statistics, such as opponent strength, whether the team is winning or losing, home vs. away, and possession share. They sit between true talent and random noise in the decomposition. See Section 5.2.2.

Question 4. Which of the following is a rate statistic?
- (a) Total assists in a season
- (b) Total distance covered in a match
- (c) Goals per 90 minutes
- (d) Number of clean sheets
Answer
**(c)** Goals per 90 minutes divides a count (goals) by a denominator (minutes played, normalized to 90). Options (a), (b), and (d) are counting statistics that accumulate over time. See Section 5.3.1.

Question 5. A player has scored 6 goals in 810 minutes. Their goals per 90 is:
- (a) 0.53
- (b) 0.67
- (c) 0.74
- (d) 0.81
Answer
**(b)** Goals per 90 = (6 / 810) x 90 = 0.667, which rounds to 0.67. See Section 5.3.2.
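
If you want to check this arithmetic yourself, here is a minimal Python sketch; the helper name `per_90` is illustrative rather than anything defined in the chapter.

```python
def per_90(count, minutes):
    """Convert a counting statistic to a per-90-minutes rate."""
    return count / minutes * 90

goals, minutes_played = 6, 810                  # figures from Question 5
print(round(per_90(goals, minutes_played), 2))  # 0.67
```
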
Question 6. Why should per-90 rates be treated with caution for players with fewer than 900 minutes?
- (a) Per-90 calculations require at least 900 minutes to be mathematically valid.
- (b) Small samples produce unstable rate estimates that may not reflect true ability.
- (c) Players with fewer than 900 minutes are always substitutes and therefore lower quality.
- (d) The per-90 formula has a systematic bias below 900 minutes.
Answer
**(b)** With small samples, the variance of the estimated rate is high, meaning the observed per-90 value could be far from the player's true rate. This is a statistical sampling issue, not a formula bias. See Section 5.3.2.

Question 7. When is a counting statistic preferable to a rate statistic?
- (a) When comparing players with different amounts of playing time.
- (b) When the total volume of output is what matters for the decision (e.g., Golden Boot race, total squad contribution).
- (c) When you want to remove the effect of playing time.
- (d) Counting statistics are never preferable in modern analytics.
Answer
**(b)** Counting statistics are appropriate when total output matters --- for example, when deciding league awards, tallying total team production, or managing workload. See Section 5.3.4.

Question 8. A team scores 2 goals against an opponent that typically concedes 1.8 goals per match. The league average is 1.3 goals conceded per match. The opponent-adjusted goals scored is approximately:
- (a) 1.44
- (b) 1.80
- (c) 2.00
- (d) 2.77
Answer
**(a)** Adjusted = 2 x (1.3 / 1.8) = 2 x 0.722 = 1.44. The adjustment reduces the total because the opponent concedes more than average, making scoring against them less impressive. See Section 5.4.2.
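
The same ratio adjustment as a small Python sketch, using only the figures given in the question; the function name is illustrative.

```python
def opponent_adjusted(raw_value, league_avg_conceded, opponent_avg_conceded):
    """Scale a raw total by how the opponent compares with the league average."""
    return raw_value * (league_avg_conceded / opponent_avg_conceded)

# Figures from Question 8: 2 goals vs. an opponent conceding 1.8 per match,
# against a league average of 1.3 goals conceded per match.
print(round(opponent_adjusted(2, 1.3, 1.8), 2))  # 1.44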
Question 9. Game-state adjustment is necessary because:
- (a) Referees make different decisions depending on the score.
- (b) Teams change their tactical behavior depending on whether they are winning, losing, or level, which systematically affects individual statistics.
- (c) Players try harder when their team is losing.
- (d) Game state only affects goalkeeper statistics.
Answer
**(b)** Teams adjust tactics based on the score: leading teams may sit deeper and absorb pressure, while trailing teams push forward. This systematically changes the opportunities available for individual statistical actions. See Section 5.4.3.

Question 10. Possession adjustment for defensive statistics uses which denominator?
- (a) The team's own possession percentage.
- (b) The opponent's possession percentage (i.e., 1 minus the team's possession).
- (c) 50%, regardless of actual possession.
- (d) The league average possession percentage.
Answer
**(b)** Defensive actions (tackles, interceptions, blocks) occur when the opponent has the ball. Therefore, the relevant denominator is the opponent's share of possession, which equals 1 minus the team's own possession percentage. See Section 5.4.4.
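
As a rough sketch, the defensive adjustment can be written against a 50% possession baseline, mirroring the offensive formula in Question 22; the baseline and the example numbers below are assumptions for illustration, not values quoted from the chapter.

```python
def possession_adjusted_defensive(raw_value, team_possession_pct):
    """Normalize a defensive count by the opponent's share of the ball,
    scaled to a 50% possession baseline."""
    opponent_share = 100.0 - team_possession_pct
    return raw_value * (50.0 / opponent_share)

# Hypothetical: 3.0 tackles per 90 for a player whose team averages 60% possession.
print(round(possession_adjusted_defensive(3.0, 60.0), 2))  # 3.75
```
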
Question 11. The three pillars of metric validation are:
- (a) Accuracy, precision, and recall.
- (b) Stability, discrimination, and predictive power.
- (c) Validity, objectivity, and simplicity.
- (d) Correlation, regression, and classification.
Answer
**(b)** The three pillars are stability (does the metric give consistent results?), discrimination (does it separate genuinely different players/teams?), and predictive power (does it forecast future outcomes?). See Section 5.5.1.

Question 12. Split-half reliability involves:
- (a) Splitting the season into first half and second half and comparing means.
- (b) Comparing the metric for the best half of the squad against the worst half.
- (c) Dividing a player's matches into two subsets (e.g., odd and even) and correlating the metric across subsets.
- (d) Splitting the data into training and test sets for machine learning.
Answer
**(c)** Split-half reliability divides a player's data into two interleaved subsets (commonly odd-numbered and even-numbered matches), computes the metric for each subset, and measures the correlation between them. High correlation indicates stability. See Section 5.5.2.
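
A minimal sketch of the procedure, assuming each player's per-match values are already available; the toy data and function name are purely illustrative.

```python
import numpy as np

def split_half_correlation(per_match_values):
    """Correlate the metric computed on odd-numbered matches with the same
    metric computed on even-numbered matches, across players."""
    odd = [np.mean(matches[0::2]) for matches in per_match_values]   # matches 1, 3, 5, ...
    even = [np.mean(matches[1::2]) for matches in per_match_values]  # matches 2, 4, 6, ...
    return float(np.corrcoef(odd, even)[0, 1])

# Hypothetical per-match values (e.g. shots) for three players across six matches.
players = [
    [2, 3, 1, 4, 2, 3],
    [1, 0, 2, 1, 1, 0],
    [5, 4, 6, 5, 4, 6],
]
print(round(split_half_correlation(players), 2))  # ~0.82 for this toy data
```
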
Question 13. The Spearman-Brown prophecy formula is used to:
- (a) Predict future performance from past data.
- (b) Estimate the reliability of the full-length metric from the split-half correlation.
- (c) Determine the optimal sample size for a study.
- (d) Convert correlation coefficients to regression coefficients.
Answer
**(b)** The formula $r_{\text{full}} = 2r_{\text{half}} / (1 + r_{\text{half}})$ estimates what the reliability would be if we used the full dataset rather than just half. See Section 5.5.2.

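The formula translates directly into one line of Python; the example value below is hypothetical.

```python
def spearman_brown(r_half):
    """Project a split-half correlation to full-length reliability:
    r_full = 2 * r_half / (1 + r_half)."""
    return 2 * r_half / (1 + r_half)

# Hypothetical: a split-half correlation of 0.50 implies full-length reliability of ~0.67.
print(round(spearman_brown(0.50), 2))  # 0.67
```
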
Question 14. An ICC of 0.20 for a metric means:
- (a) 20% of the variance is between players, and 80% is within players --- poor discrimination.
- (b) The metric is 20% accurate.
- (c) 20% of players are above average.
- (d) The metric has been validated on 20% of the dataset.
Answer
**(a)** An ICC of 0.20 means only 20% of the total variance in the metric is attributable to genuine differences between players. The remaining 80% is noise or within-player variability. This is a poorly discriminating metric. See Section 5.5.3.
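
For reference, a one-way random-effects ICC can be estimated with a short variance decomposition. This is a standard textbook estimator sketched under the assumption of a balanced players-by-matches table, not code from the chapter.

```python
import numpy as np

def icc_oneway(data):
    """One-way random-effects ICC for a balanced (players x matches) array:
    ICC = (MS_between - MS_within) / (MS_between + (k - 1) * MS_within),
    where k is the number of matches per player."""
    data = np.asarray(data, dtype=float)
    n, k = data.shape
    player_means = data.mean(axis=1)
    grand_mean = data.mean()
    ms_between = k * ((player_means - grand_mean) ** 2).sum() / (n - 1)
    ms_within = ((data - player_means[:, None]) ** 2).sum() / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Toy example: three players, two matches each (purely illustrative numbers).
toy = [[1.0, 2.0],
       [2.0, 3.0],
       [3.0, 4.0]]
print(round(icc_oneway(toy), 2))  # 0.6
```
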
Question 15. The stabilization point $n^*$ is the sample size at which:
- (a) The metric reaches its maximum possible value.
- (b) The metric's reliability equals 1.0.
- (c) Signal and noise contribute equally to the observed value (reliability = 0.5).
- (d) The metric becomes perfectly predictive.
Answer
**(c)** The stabilization point is where reliability reaches 0.5 --- the point at which the observed metric is equally influenced by true skill and random noise. Beyond this point, skill increasingly dominates. See Section 5.5.5.
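
One common way to estimate $n^*$, shown here only as a sketch, is to assume reliability grows as $r(n) = n / (n + n^*)$ and solve for $n^*$ from a reliability measured at a known sample size. This functional form and the numbers below are assumptions for illustration, not a formula quoted from the chapter.

```python
def estimate_stabilization_point(reliability, n):
    """Under the assumption r(n) = n / (n + n_star), solve for n_star,
    the sample size where reliability = 0.5 (signal equals noise)."""
    return n * (1 - reliability) / reliability

# Hypothetical: a metric showing reliability 0.40 after 20 matches.
print(round(estimate_stabilization_point(0.40, 20), 1))  # 30.0
```
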
Question 16. Which of the following metrics typically has the longest stabilization period?
- (a) Pass completion percentage
- (b) Shots per 90
- (c) Goal conversion rate
- (d) Tackles per 90
Answer
**(c)** Goal conversion rate (goals divided by shots) involves rare events with high variance, requiring 35--40+ matches to stabilize. Process metrics like passing and tackling stabilize much faster. See Section 5.5.5.

Question 17. When presenting metrics to a head coach, the best approach is to:
- (a) Lead with the mathematical formula behind the metric.
- (b) Present as many metrics as possible to appear thorough.
- (c) Lead with the decision-relevant question and support it with 3--5 key metrics, using video alongside data.
- (d) Avoid mentioning any limitations of the analysis.
Answer
**(c)** Section 5.6 emphasizes leading with the question rather than the method, focusing on a small number of relevant metrics, and pairing data with video evidence to ground abstract numbers in observable reality.

Question 18. "Over-precision" in metric communication refers to:
- (a) Using too many decimal places, implying more accuracy than the data supports.
- (b) Being too precise about which player to sign.
- (c) Measuring too many events per match.
- (d) Using metrics that are too specialized for the audience.
Answer
**(a)** Reporting a value like "0.3247 xG per 90" implies a level of measurement precision that does not exist in the underlying data. Rounding to two significant figures (0.32) is more honest and easier to communicate. See Section 5.6.4.

Question 19. A "prescriptive metric" is one that:
- (a) Describes what happened in a past match.
- (b) Predicts future outcomes based on historical patterns.
- (c) Recommends a specific action or decision based on analytical findings.
- (d) Prescribes which data should be collected.
Answer
**(c)** Prescriptive metrics go beyond description and prediction to recommend specific actions, such as "sign this player" or "change the tactical formation." See Section 5.2.3.

Question 20. Which of the following is NOT one of the five desirable properties of a good metric?
- (a) Validity
- (b) Complexity
- (c) Reliability
- (d) Actionability
Answer
**(b)** The five desirable properties are validity, reliability, discrimination, interpretability, and actionability. Complexity is not a desirable property --- in fact, unnecessary complexity can hinder interpretability and adoption. See Section 5.2.1.

Question 21. Home advantage adjustment is applied because:
- (a) Home teams always win.
- (b) Teams historically score more goals and win more matches at home, introducing a systematic bias.
- (c) Away teams use different formations.
- (d) Stadium size affects player statistics.
Answer
**(b)** Home advantage is a well-documented phenomenon in which teams perform better at home across multiple metrics. This systematic bias should be accounted for when comparing performances across different venues. See Section 5.4.5.

Question 22. The formula for possession-adjusted offensive statistics is:
- (a) Raw Value x (Team Possession% / 50%)
- (b) Raw Value x (50% / Team Possession%)
- (c) Raw Value / Team Possession%
- (d) Raw Value x (100% - Team Possession%)
Answer
**(b)** For offensive actions, possession adjustment = Raw Value x (50% / Team Possession%). This scales down the stats of high-possession teams (who have more opportunities) and scales up those of low-possession teams. See Section 5.4.4.
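
The same formula as a one-line helper in Python; the example figures below are hypothetical.

```python
def possession_adjusted_offensive(raw_value, team_possession_pct):
    """Scale an offensive statistic to a 50% possession baseline:
    adjusted = raw x (50 / team possession %)."""
    return raw_value * (50.0 / team_possession_pct)

# Hypothetical: 0.60 xG per 90 for a player on a team averaging 62% possession.
print(round(possession_adjusted_offensive(0.60, 62.0), 2))  # 0.48
```
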
Question 23. Which validation test would best determine whether a new metric captures genuine differences between players rather than random noise?
- (a) Split-half reliability
- (b) Predictive validity against future outcomes
- (c) Intraclass correlation coefficient (ICC)
- (d) Face validity
Answer
**(c)** The ICC directly measures the ratio of between-player variance to total variance. A high ICC means the metric captures genuine player differences rather than random match-to-match fluctuation. See Section 5.5.3.

Question 24. You are comparing a winger in the Eredivisie (Dutch league) to one in the Premier League. Which adjustments should you consider? Select the most complete answer.
- (a) Only league-strength adjustment.
- (b) League-strength adjustment and per-90 normalization.
- (c) League-strength adjustment, possession adjustment, opponent adjustment, and per-90 normalization.
- (d) No adjustment is needed if both players play the same position.
Answer
**(c)** Cross-league comparisons require multiple adjustments: league strength (to account for different competition levels), possession (as league-wide possession styles differ), opponent quality (within each league), and per-90 normalization (to account for different playing time). See Sections 5.3 and 5.4.

Question 25. Building trust in analytics with coaching staff is best achieved by:
- (a) Presenting the most complex model possible to demonstrate expertise.
- (b) Starting with small, low-stakes recommendations and gradually expanding the role of data, while being transparent about uncertainty.
- (c) Replacing the coach's judgment entirely with data-driven decisions.
- (d) Only sharing results when the data agrees with the coach's intuition.
Answer
**(b)** Trust is built incrementally through transparency, a track record of honest assessment, humility about limitations, and integration into existing workflows. The Liverpool FC example in Section 5.6.5 illustrates this incremental approach.

Scoring Guide

| Score | Interpretation |
|---|---|
| 23--25 correct (92--100%) | Excellent --- you have strong command of the material |
| 20--22 correct (80--88%) | Good --- you meet the passing threshold with solid understanding |
| 16--19 correct (64--76%) | Fair --- review the sections indicated in incorrect answers |
| Below 16 (< 64%) | Needs review --- re-read the chapter before proceeding |