Self-Assessment Quiz: Descriptive Statistics in Football
Test your understanding of descriptive statistics and their application to football analytics.
Section 1: Central Tendency (Questions 1-7)
Question 1
A running back has the following yards per carry over 8 games: 3.2, 4.5, 3.8, 4.1, 12.5, 3.9, 4.2, 3.6
Which measure of central tendency is most appropriate for summarizing his "typical" performance?
A) Mean (4.98 yards) B) Median (4.00 yards) C) Mode D) Range
Question 2
A team's scoring data is: 28, 35, 31, 42, 24, 38, 33
What is the median score?
A) 31 B) 33 C) 35 D) 38
Question 3
When calculating a quarterback's season completion percentage, which approach is correct?
A) Average the completion percentage from each game B) Sum all completions divided by sum of all attempts C) Use the median game completion percentage D) Use the mode of game completion percentages
Question 4
A dataset has mean = 35 and median = 28. What does this suggest about the distribution?
A) The distribution is symmetric B) The distribution is left-skewed C) The distribution is right-skewed D) There are no outliers
Question 5
Which statement about the mode is TRUE?
A) It is always the best measure of central tendency B) It is most useful for categorical data C) There can only be one mode in a dataset D) It is resistant to outliers like the median
Question 6
A team scores the following points over 5 games: 24, 31, 28, 45, 27
What is the mean?
A) 28 B) 31 C) 30 D) 155
Question 7
Which measure would change most if a single outlier is removed?
A) Median B) Mode C) Mean D) All would change equally
Section 2: Variability (Questions 8-14)
Question 8
Team A has a standard deviation of 8 points in their scoring. Team B has a standard deviation of 3 points. Which statement is TRUE?
A) Team A scores more points on average B) Team A is more consistent C) Team B is more consistent D) The teams are equally consistent
Question 9
If a dataset has mean = 40 and standard deviation = 10, approximately what percentage of values fall between 30 and 50 (assuming a normal distribution)?
A) 50% B) 68% C) 95% D) 99.7%
Question 10
The interquartile range (IQR) is calculated as:
A) Maximum - Minimum B) Q3 - Q1 C) Mean - Median D) 2 × Standard Deviation
Question 11
Which measure of spread is most resistant to outliers?
A) Range B) Standard deviation C) Variance D) Interquartile range
Question 12
A quarterback's passer rating has a mean of 95 and standard deviation of 15. A rating of 125 represents:
A) 1 standard deviation above the mean B) 2 standard deviations above the mean C) 3 standard deviations above the mean D) An outlier by definition
Question 13
The coefficient of variation (CV) is useful for:
A) Comparing variability across different scales B) Measuring distribution shape C) Finding outliers D) Calculating correlation
Question 14
Team A: Mean = 30 PPG, SD = 6; Team B: Mean = 25 PPG, SD = 6
Which team has higher relative variability (CV)?
A) Team A B) Team B C) They are equal D) Cannot determine without more data
Section 3: Distributions (Questions 15-19)
Question 15
Rushing yards per play are typically:
A) Normally distributed B) Right-skewed (positive skew) C) Left-skewed (negative skew) D) Uniformly distributed
Question 16
Using the IQR method, a value is considered an outlier if it is:
A) More than 1 IQR from the median B) More than 1.5 × IQR below Q1 or above Q3 C) More than 2 standard deviations from the mean D) The maximum or minimum value
Question 17
A z-score of -1.5 indicates:
A) The value is 1.5 points below average B) The value is 1.5 standard deviations below the mean C) The value is in the bottom 1.5% D) The value is an outlier
Question 18
If a distribution has kurtosis > 0 (excess kurtosis), it means:
A) The distribution has lighter tails than normal B) The distribution has heavier tails than normal (more outliers) C) The distribution is symmetric D) The distribution is bimodal
Question 19
A team's scoring distribution shows two distinct peaks (at 21 and 42 points). This is called:
A) Normal distribution B) Skewed distribution C) Bimodal distribution D) Uniform distribution
Section 4: Correlation (Questions 20-25)
Question 20
A correlation coefficient of -0.85 indicates:
A) A strong positive relationship B) A strong negative relationship C) No relationship D) A weak negative relationship
Question 21
Rushing yards and passing yards both correlate positively with points scored. If they have zero correlation with each other, this suggests:
A) One causes the other B) They are independent paths to scoring C) The data is wrong D) One is more important than the other
Question 22
Which correlation value represents the weakest relationship?
A) r = 0.65 B) r = -0.72 C) r = 0.15 D) r = -0.88
Question 23
"Correlation does not imply causation" means:
A) Correlation is useless B) Two correlated variables may both be caused by a third variable C) Only positive correlations can imply causation D) Correlation is always wrong
Question 24
In a correlation matrix, the diagonal values are always:
A) 0 B) 1 C) The mean of all correlations D) Variable-specific
Question 25
Turnovers and points scored have r = -0.75. For a team whose turnovers are 1 standard deviation higher, a simple linear prediction says its scoring would:
A) Increase by 0.75 points B) Decrease by 0.75 points C) Decrease by 0.75 standard deviations (on average) D) Have no predictable change
Answer Key
| Question | Answer | Explanation |
|---|---|---|
| 1 | B | Median is resistant to the outlier (12.5 yards) |
| 2 | B | Sorted: 24, 28, 31, 33, 35, 38, 42. The middle (4th) value is 33 |
| 3 | B | Weighted by attempts gives true overall percentage |
| 4 | C | Mean > Median indicates right skew (pulled by high values) |
| 5 | B | Mode is most useful for categorical data like play types |
| 6 | B | (24+31+28+45+27)/5 = 155/5 = 31 |
| 7 | C | Mean is most sensitive to outliers |
| 8 | C | Lower standard deviation means more consistent |
| 9 | B | 68% falls within 1 standard deviation |
| 10 | B | IQR = Q3 - Q1 |
| 11 | D | IQR uses only middle 50%, ignoring extremes |
| 12 | B | (125-95)/15 = 2 standard deviations |
| 13 | A | CV standardizes by mean, allowing cross-scale comparison |
| 14 | B | Team B: CV = 6/25 = 24%; Team A: CV = 6/30 = 20% |
| 15 | B | Many short gains, few long runs creates right skew |
| 16 | B | Outlier if < Q1 - 1.5×IQR or > Q3 + 1.5×IQR |
| 17 | B | Z-score measures standard deviations from mean |
| 18 | B | Positive kurtosis = heavier tails, more extreme values |
| 19 | C | Two peaks = bimodal |
| 20 | B | Close to -1 = strong negative |
| 21 | B | Zero correlation means no linear relationship between them, so each contributes to scoring separately |
| 22 | C | Closest to zero = weakest |
| 23 | B | Third variables or coincidence can create correlation |
| 24 | B | A variable correlates perfectly with itself |
| 25 | C | In standardized units the regression slope equals r, so a 1 SD difference in turnovers predicts a 0.75 SD lower score on average |
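
The computational answers above can be checked with a few lines of Python. This is a minimal sketch that uses only the standard library `statistics` module and the numbers given in Questions 1, 2, 6, 12, and 14. The IQR check on the Question 1 data is an added illustration of the rule from Question 16; quartile conventions differ between tools, so the fence values (though not the flagged outlier here) depend on the method used.

```python
from statistics import mean, median, quantiles

# Question 1: yards per carry over 8 games
ypc = [3.2, 4.5, 3.8, 4.1, 12.5, 3.9, 4.2, 3.6]
q1_mean = mean(ypc)      # pulled upward by the 12.5 value
q1_median = median(ypc)  # average of the two middle values (3.9 and 4.1)

# Question 16 rule applied to the Question 1 data: 1.5 x IQR fences.
# Assumption: the default quartile method of statistics.quantiles.
lower_q, _, upper_q = quantiles(ypc, n=4)
iqr = upper_q - lower_q
outliers = [y for y in ypc
            if y < lower_q - 1.5 * iqr or y > upper_q + 1.5 * iqr]

# Questions 2 and 6: median and mean of the listed scores
q2_median = median([28, 35, 31, 42, 24, 38, 33])
q6_mean = mean([24, 31, 28, 45, 27])

# Question 12: z-score = (value - mean) / standard deviation
q12_z = (125 - 95) / 15

# Question 14: coefficient of variation = SD / mean
cv_team_a = 6 / 30
cv_team_b = 6 / 25

print(f"Q1: mean = {q1_mean:.3f}, median = {q1_median:.2f}, outliers = {outliers}")
print(f"Q2: median = {q2_median}")
print(f"Q6: mean = {q6_mean}")
print(f"Q12: z = {q12_z}")
print(f"Q14: CV A = {cv_team_a:.0%}, CV B = {cv_team_b:.0%}")
```

The gap between the mean and the median in Question 1 comes entirely from the single 12.5 value, which is also the only value the IQR rule flags.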
Scoring Guide
- 23-25 correct: Excellent! Strong grasp of descriptive statistics.
- 18-22 correct: Good understanding. Review topics you missed.
- 13-17 correct: Fair. More practice with calculations recommended.
- Below 13: Review chapter material thoroughly.
Topics to Review by Question
| Questions | Topic |
|---|---|
| 1-4 | Mean, median, and when to use each |
| 5-7 | Mode and central tendency selection |
| 8-11 | Standard deviation and spread measures |
| 12-14 | Z-scores and coefficient of variation |
| 15-17 | Skewness, outliers, distributions |
| 18-19 | Kurtosis and distribution shapes |
| 20-22 | Correlation strength and direction |
| 23-25 | Correlation interpretation and matrices |
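
For review, two ideas that the quiz tests conceptually can be sketched in Python: the pooled completion percentage from Question 3, and the link between correlation and standardized units from Questions 24 and 25. All of the data below (`games`, `turnovers`, `points`) is hypothetical and chosen only for illustration, and the helper names `pearson_r` and `standardized_slope` are not from the chapter.

```python
from statistics import mean, pstdev

# Hypothetical game logs: (completions, attempts) per game
games = [(8, 10), (15, 30), (30, 40)]

# Pooled percentage: total completions over total attempts
pooled_pct = 100 * sum(c for c, _ in games) / sum(a for _, a in games)
# Averaging per-game percentages weights every game equally,
# regardless of how many passes were thrown
averaged_pct = mean(100 * c / a for c, a in games)


def pearson_r(xs, ys):
    """Pearson correlation using population standard deviations."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / (pstdev(xs) * pstdev(ys))


def standardized_slope(xs, ys):
    """Least-squares slope after converting both variables to z-scores."""
    zx = [(x - mean(xs)) / pstdev(xs) for x in xs]
    zy = [(y - mean(ys)) / pstdev(ys) for y in ys]
    return sum(a * b for a, b in zip(zx, zy)) / sum(a * a for a in zx)


# Hypothetical per-game turnovers and points scored
turnovers = [0, 1, 1, 2, 3, 4]
points = [35, 31, 28, 24, 20, 17]

r = pearson_r(turnovers, points)
slope = standardized_slope(turnovers, points)

print(f"pooled = {pooled_pct:.2f}%, averaged = {averaged_pct:.2f}%")
print(f"r = {r:.3f}, standardized slope = {slope:.3f}")
print(f"self-correlation = {pearson_r(points, points):.3f}")
```

In this sketch the pooled and averaged percentages differ because the games have different attempt counts, and the standardized slope matches r, which is the reasoning behind the answer to Question 25.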