Chapter 5: Exercises

Practice Problems for Descriptive Statistics in Basketball

These exercises reinforce the concepts covered in Chapter 5. Problems range from basic calculations to applied analysis scenarios. Detailed solutions are provided at the end of this section.


Section A: Measures of Central Tendency (Problems 1-8)

Problem 1: Basic Mean Calculation

A rookie point guard recorded the following points in his first 10 NBA games: 12, 8, 15, 22, 10, 18, 14, 9, 20, 12.

a) Calculate the mean points per game. b) If he scores 35 points in game 11, what is the new mean? c) By what percentage did the mean change after the 35-point game?


Problem 2: Weighted Mean for Shooting Percentage

A player has the following three-point shooting record across three seasons:

Season     3PM    3PA    3P%
2021-22    82     240    34.2%
2022-23    156    420    37.1%
2023-24    198    510    38.8%

a) Calculate the career three-point percentage using the weighted mean formula. b) Compare this to the simple average of the three season percentages. Why do they differ?


Problem 3: Median Salary Analysis

The following are the salaries (in millions) for a hypothetical NBA team's roster:

$52.1, $38.4, $25.8, $18.2, $12.5, $8.3, $5.2, $3.8, $2.1, $2.1, $1.9, $1.8, $1.5, $1.2, $1.1

a) Calculate the mean salary. b) Calculate the median salary. c) Which measure better represents the "typical" salary? Explain.


Problem 4: Mode in Shot Selection

A shot chart analysis reveals the following distribution of shot attempts by zone for a player:

Zone               Attempts
Restricted Area    245
Paint (Non-RA)     87
Mid-Range          112
Corner 3           156
Above Break 3      245

a) Identify the mode(s) of this distribution. b) What does this bimodal pattern tell us about the player's shot selection?


Problem 5: Comparing Central Tendency Measures

The following data shows a veteran point guard's assist totals over a 20-game stretch:

5, 7, 8, 6, 12, 9, 8, 7, 6, 8, 15, 7, 8, 6, 9, 8, 7, 6, 8, 7

a) Calculate the mean, median, and mode. b) Based on the relationship between these measures, describe the distribution shape. c) Which measure would you report to describe this player's "typical" assist output?


Problem 6: Per-Game vs. Total Statistics

Player A played 72 games and averaged 18.5 PPG. Player B played 58 games and averaged 22.3 PPG.

a) Calculate total points scored by each player. b) Which player scored more total points? c) Discuss the trade-off between per-game averages and total production.


Problem 7: Mean Shift Analysis

A team's offensive rating over the first half of the season (41 games) averaged 112.5 points per 100 possessions. After making a trade, their offensive rating over the second half (41 games) averaged 118.3.

a) Calculate the season-long mean offensive rating. b) What was the improvement in offensive rating after the trade? c) Is it valid to attribute the entire improvement to the trade? What other factors might explain the change?


Problem 8: Trimmed Mean

Consider these rebound totals for a center over 15 games: 2, 8, 9, 10, 10, 11, 11, 12, 12, 13, 13, 14, 15, 16, 22

a) Calculate the standard mean. b) Calculate the 10% trimmed mean (remove the lowest 10% and the highest 10% of values; with 15 games, 10% is 1.5 values, so round down and drop one value from each end). c) Why might a trimmed mean be more appropriate for this data?


Section B: Measures of Variability (Problems 9-14)

Problem 9: Range and IQR Calculation

Two shooting guards have the following scoring outputs over 10 games:

Player X: 18, 20, 22, 19, 21, 20, 18, 22, 21, 19
Player Y: 8, 12, 28, 32, 14, 24, 18, 26, 10, 28

a) Calculate the mean, range, and IQR for each player. b) Which player is more consistent? Support your answer with the variability measures. c) If you needed a predictable 20 points per game, which player would you prefer?


Problem 10: Standard Deviation Calculation

The following are the true shooting percentages (TS%) for 12 players on a team:

0.542, 0.578, 0.612, 0.489, 0.556, 0.601, 0.523, 0.587, 0.544, 0.568, 0.492, 0.598

a) Calculate the mean TS%. b) Calculate the sample standard deviation. c) What percentage of players fall within one standard deviation of the mean?


Problem 11: Coefficient of Variation

Analyze the following league-wide statistics:

Statistic    Mean    Std Dev
PPG          14.8    6.2
RPG          4.9     2.8
APG          3.2     2.1
BPG          0.5     0.5

a) Calculate the coefficient of variation for each statistic. b) Rank the statistics from least variable to most variable (relative to their means). c) Interpret why blocks per game has such high relative variability.


Problem 12: Variance Decomposition

A player's scoring variance can be decomposed into within-game variance and between-game variance. Over 5 games, the player scored the following points by quarter:

Game    Q1    Q2    Q3    Q4    Total
1       8     6     10    8     32
2       4     8     4     6     22
3       6     6     8     8     28
4       10    4     6     12    32
5       2     4     8     2     16

a) Calculate the between-game variance (variance of game totals). b) Calculate the average within-game variance (average of each game's quarter-to-quarter variance). c) Which type of variance is larger? What does this tell us about the player?


Problem 13: Volatility Comparison

You are comparing two free agents. Their last three seasons showed the following PPG averages:

Player A: 22.1, 23.4, 22.8 (ages 26-28)
Player B: 18.5, 26.2, 21.3 (ages 26-28)

a) Calculate the mean and standard deviation of seasonal averages for each player. b) If both players are projected to average 22.0 PPG next season, which is the riskier signing based on historical volatility? c) Discuss additional factors beyond scoring volatility that should influence this decision.


Problem 14: Comparing Team Consistency

Two teams have the following point differentials for their last 10 games:

Team A: +12, +8, -3, +15, +6, +10, -1, +9, +11, +7
Team B: +25, -8, +18, -12, +22, +3, -5, +28, -2, +11

a) Calculate the mean point differential for each team. b) Calculate the standard deviation for each team. c) Which team would you consider a more reliable playoff contender? Explain.


Section C: Percentiles and Rankings (Problems 15-19)

Problem 15: Percentile Calculation

The following data represents the PER (Player Efficiency Rating) for 20 players in a division:

12.5, 14.2, 15.8, 11.3, 18.6, 22.1, 16.4, 13.7, 19.8, 15.2, 17.3, 10.8, 21.5, 14.9, 16.8, 20.2, 13.4, 18.1, 15.6, 17.9

a) Calculate the 25th, 50th, 75th, and 90th percentiles. b) A player has a PER of 19.5. What is their approximate percentile rank? c) What PER would a player need to be in the top 10%?


Problem 16: Five-Number Summary

Construct the five-number summary for the following usage rates (%) of players on a team:

12.5, 15.8, 18.2, 19.4, 21.1, 22.6, 24.8, 26.3, 28.5, 32.1

a) Identify the minimum, Q1, median, Q3, and maximum. b) Calculate the IQR and identify any potential outliers using the 1.5*IQR rule. c) Sketch a box plot based on this data.


Problem 17: Decile Analysis

A scouting department ranks draft prospects using a composite score from 0-100. The following are scores for 30 prospects:

45, 52, 58, 62, 65, 68, 70, 72, 73, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 94, 96, 98

a) Calculate the decile boundaries (10th, 20th, ..., 90th percentiles). b) If a team only drafts prospects in the top two deciles, what is the minimum score required? c) How many prospects in this dataset meet that threshold?


Problem 18: Percentile Rank by Position

The following are the assists per game (APG) for centers in the league:

0.8, 1.2, 1.5, 1.6, 1.8, 2.0, 2.1, 2.3, 2.5, 2.8, 3.1, 3.5, 4.2, 5.8, 7.2

A center averages 3.0 APG.

a) What is this center's percentile rank among centers? b) If the league-wide APG distribution has a 50th percentile of 2.4, would this center be above or below the 50th percentile league-wide? c) Discuss why position-specific percentiles provide different insights than league-wide percentiles.


Problem 19: Percentile-Based Player Profiles

Create a percentile profile for a player given the following statistics and league distributions:

Statistic    Player Value    League Mean    League SD
PPG          18.5            14.2           6.8
RPG          7.2             5.1            3.2
APG          2.8             3.4            2.5
SPG          1.4             0.9            0.5
BPG          0.6             0.5            0.5

Assuming normal distributions: a) Calculate the z-score for each statistic. b) Convert each z-score to a percentile rank. c) Create a radar chart description of this player's profile (identify their strongest and weakest areas).


Section D: Distribution Shape (Problems 20-23)

Problem 20: Skewness Interpretation

A histogram of player salaries shows a strong right tail with the following characteristics:
- Mean: $9.2 million
- Median: $4.8 million
- Mode: $1.9 million (rookie scale contract)

a) Without calculating, predict whether the skewness will be positive or negative. b) If the skewness coefficient is +2.3, interpret this value. c) Which measure of central tendency is most appropriate for describing the "typical" NBA salary?


Problem 21: Kurtosis in Game Scores

Team A's point totals over 82 games have an excess kurtosis of +4.2, while Team B's have an excess kurtosis of -0.8.

a) Which team has more extreme scoring performances (both high and low)? b) How might high positive kurtosis affect predictions for Team A's future games? c) If you were betting on game totals, which team provides more predictable outcomes?


Problem 22: Normality Assessment

The following statistics are available for a distribution of three-point shooting percentages:
- Mean: 35.8%
- Median: 36.2%
- Skewness: -0.32
- Kurtosis: 0.18

a) Based on these values, assess whether the distribution is approximately normal. b) Would parametric statistical tests be appropriate for this data? c) What graphical method could confirm your assessment?


Problem 23: Distribution Comparison

Compare the following two distributions of plus/minus ratings:

Distribution A (Starters):
- Mean: +4.2
- Median: +3.8
- Skewness: +0.45
- Kurtosis: -0.22

Distribution B (Bench Players):
- Mean: -2.1
- Median: -1.5
- Skewness: -0.68
- Kurtosis: +1.35

a) Which distribution is closer to symmetric? b) Which distribution has more extreme values? c) Interpret the difference in means and medians for each distribution.


Section E: Correlation Analysis (Problems 24-28)

Problem 24: Correlation Calculation

Calculate the Pearson correlation coefficient for the following data on minutes played (X) and points scored (Y):

Player    Minutes    Points
1         32.5       18.2
2         28.1       14.5
3         35.8       21.3
4         22.4       11.8
5         30.2       16.9
6         18.6       8.4
7         26.8       15.2
8         34.1       22.8

a) Calculate the correlation coefficient. b) Interpret the strength and direction of the relationship. c) What percentage of variance in points is explained by minutes?


Problem 25: Spurious Correlation

A study finds a correlation of r = +0.72 between a player's salary and their number of fouls per game.

a) Does this correlation suggest that paying players more causes them to foul more? b) Identify at least two potential confounding variables. c) How might you investigate whether this relationship is causal?


Problem 26: Spearman vs. Pearson

The following data shows draft position and career win shares for 10 players:

Player    Draft Position    Career WS
A         1                 142.5
B         3                 98.2
C         5                 85.6
D         8                 72.1
E         12                68.4
F         18                45.2
G         25                38.9
H         35                22.1
I         45                15.8
J         55                8.2

a) Calculate the Spearman rank correlation. b) Would you expect Pearson or Spearman to show a stronger relationship? Why? c) Interpret the practical significance of this correlation for draft strategy.


Problem 27: Correlation Matrix Analysis

The following correlation matrix shows relationships among five statistics:

       PPG     RPG     APG     TOV     MPG
PPG    1.00    0.35    0.42    0.58    0.78
RPG    0.35    1.00    -0.12   0.15    0.55
APG    0.42    -0.12   1.00    0.62    0.48
TOV    0.58    0.15    0.62    1.00    0.65
MPG    0.78    0.55    0.48    0.65    1.00

a) Which pair of statistics has the strongest correlation (excluding MPG)? b) Interpret the negative correlation between RPG and APG. c) Why does TOV correlate positively with both PPG and APG?


Problem 28: Partial Correlation

The correlation between three-point attempts (3PA) and wins is r = +0.45. The correlation between 3PA and offensive rating is r = +0.52. The correlation between offensive rating and wins is r = +0.68.

a) Calculate the partial correlation between 3PA and wins, controlling for offensive rating. b) Interpret this partial correlation. Does shooting more threes directly lead to more wins, or is the relationship mediated by offensive efficiency?


Section F: Standardization and Z-Scores (Problems 29-35)

Problem 29: Basic Z-Score Calculation

League statistics show the following for points per game:
- Mean: 15.2 PPG
- Standard Deviation: 6.4 PPG

Calculate the z-score for players with the following PPG averages: a) 28.5 PPG b) 15.2 PPG c) 8.8 PPG d) 35.0 PPG


Problem 30: Cross-Statistic Comparison

Using z-scores, compare a player's relative standing in scoring versus assists:

Scoring: Player averages 22.5 PPG (League: Mean = 15.0, SD = 6.5)
Assists: Player averages 6.2 APG (League: Mean = 3.5, SD = 2.2)

a) Calculate the z-score for each statistic. b) In which category is the player more exceptional relative to the league? c) Create a simple composite score by averaging the two z-scores.


Problem 31: Position-Adjusted Z-Scores

A point guard averages 5.5 rebounds per game.

League-wide rebounding: Mean = 5.0, SD = 2.8
Point guard rebounding: Mean = 3.8, SD = 1.2

a) Calculate the league-wide z-score. b) Calculate the position-adjusted z-score. c) Interpret the difference. Is this player a good rebounder for their position?


Problem 32: Era-Adjusted Scoring

Compare these two scoring seasons:

1985-86 Season: Player averaged 30.4 PPG
- League Mean: 22.5 PPG
- League SD: 8.2 PPG

2023-24 Season: Player averaged 31.2 PPG
- League Mean: 15.8 PPG
- League SD: 6.1 PPG

a) Calculate the z-score for each player's scoring. b) Which player was more dominant relative to their era? c) Discuss limitations of this era-adjustment approach.


Problem 33: Creating a Composite Metric

Create a "two-way player" score using the following statistics:

Category    Player    League Mean    League SD    Weight
PPG         18.5      15.2           6.4          0.25
SPG         1.8       0.9            0.5          0.25
BPG         1.2       0.5            0.5          0.25
DWS         3.8       2.1            1.4          0.25

a) Calculate the z-score for each statistic. b) Calculate the weighted composite score. c) Interpret the composite score in terms of standard deviations above/below average.


Problem 34: Percentile from Z-Score

Convert the following z-scores to percentile ranks (assume normal distribution):

a) z = +1.0 b) z = +1.96 c) z = -0.5 d) z = +2.5


Problem 35: Robust Standardization

A dataset of player salaries has the following characteristics:
- Mean: $9.8 million
- Median: $5.2 million
- Standard Deviation: $8.5 million
- IQR: $7.8 million

For a player earning $18.0 million: a) Calculate the traditional z-score. b) Calculate the robust z-score (using median and IQR). c) Which standardization better represents this player's salary relative to typical salaries? Why?


Solutions

Solution 1

a) Mean = (12 + 8 + 15 + 22 + 10 + 18 + 14 + 9 + 20 + 12) / 10 = 140 / 10 = 14.0 PPG

b) New mean = (140 + 35) / 11 = 175 / 11 = 15.91 PPG

c) Percentage change = (15.91 - 14.0) / 14.0 * 100 = 13.6% increase

Solution 2

a) Career 3P% = (82 + 156 + 198) / (240 + 420 + 510) = 436 / 1170 = 37.3%

b) Simple average = (34.2 + 37.1 + 38.8) / 3 = 36.7%. They differ because simple averaging gives equal weight to each season, while the weighted average gives more weight to seasons with more attempts.
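
The gap between the two averages can be checked numerically. The following is a minimal Python sketch using only the figures from Problem 2; the variable names are illustrative and not part of the chapter.

```python
# Three-point totals from Problem 2, stored as (makes, attempts) per season.
seasons = {
    "2021-22": (82, 240),
    "2022-23": (156, 420),
    "2023-24": (198, 510),
}

total_made = sum(made for made, _ in seasons.values())
total_att = sum(att for _, att in seasons.values())

# Weighted mean: weighting each season's percentage by its attempts
# reduces to total makes divided by total attempts.
career_pct = total_made / total_att

# Simple mean: every season counts equally, regardless of shot volume.
season_pcts = [made / att for made, att in seasons.values()]
simple_pct = sum(season_pcts) / len(season_pcts)

print(f"Weighted career 3P%: {career_pct:.1%}")
print(f"Simple average 3P%:  {simple_pct:.1%}")
```

The weighted figure is higher here because the two best shooting seasons are also the two highest-volume seasons.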

Solution 3

a) Mean = $176.0M / 15 = $11.73 million

b) Median (8th value when ordered) = $3.8 million

c) The median better represents the typical salary because the distribution is heavily right-skewed by max contracts.

Solution 4

a) The modes are Restricted Area (245) and Above Break 3 (245) - bimodal distribution.

b) This indicates modern "rim and three" shot selection, avoiding mid-range shots in favor of high-value shots.

Solution 5

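a) Mean = 157 / 20 = 7.85, Median = 7.5, Mode = 8

b) Mean (7.85) > Median (7.5) suggests a slight right skew from the two high-assist games (12, 15). The mode (8) sits above both, a reminder that the mean > median > mode ordering is only a rule of thumb for small, discrete samples.

c) The median (7.5) or mode (8) best represents typical performance; the mean is pulled slightly upward by the two outlier games.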
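
As a check on the three measures, here is a short sketch using Python's standard `statistics` module (an assumption about tooling; the chapter does not prescribe a language). `multimode` returns a list so that ties, if any, are visible.

```python
import statistics

# Assist totals from Problem 5.
assists = [5, 7, 8, 6, 12, 9, 8, 7, 6, 8, 15, 7, 8, 6, 9, 8, 7, 6, 8, 7]

mean_ast = statistics.mean(assists)
median_ast = statistics.median(assists)    # average of the 10th and 11th sorted values
modes_ast = statistics.multimode(assists)  # list of all most-frequent values

print(f"mean = {mean_ast:.2f}, median = {median_ast}, mode(s) = {modes_ast}")
```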

Solution 6

a) Player A: 72 * 18.5 = 1,332 points; Player B: 58 * 22.3 = 1,293.4, or about 1,293 points

b) Player A scored more total points despite lower per-game average.

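c) Per-game averages measure scoring rate in the games a player actually appears in; totals also reward availability and durability over a full season. Player B was the more productive scorer per game, but Player A's 14 additional games produced more total points.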

Solution 7

a) Season mean = (41 * 112.5 + 41 * 118.3) / 82 = 115.4

b) Improvement = 118.3 - 112.5 = 5.8 points per 100 possessions

c) No - schedule difficulty, health, roster integration time, and natural variance could also explain improvement.

Solution 8

a) Standard mean = 178/15 = 11.87 rebounds

b) Trimmed mean (remove 2 and 22): 154/13 = 11.85 rebounds

c) The trimmed mean reduces the influence of the outlier games (2 and 22), providing a more stable estimate. Here the two outliers nearly offset each other, so the two means are almost identical.
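
A trimmed mean is easy to get wrong by hand, so a small sketch may help. The helper `trimmed_mean` below is hypothetical (not from the chapter) and assumes the convention of rounding the number of trimmed values down, so 10% of 15 games removes one game from each end.

```python
def trimmed_mean(values, proportion):
    """Mean after dropping floor(proportion * n) values from each end."""
    ordered = sorted(values)
    k = int(proportion * len(ordered))  # number of values cut from EACH tail
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# Rebound totals from Problem 8.
rebounds = [2, 8, 9, 10, 10, 11, 11, 12, 12, 13, 13, 14, 15, 16, 22]

plain_mean = sum(rebounds) / len(rebounds)
trimmed = trimmed_mean(rebounds, 0.10)  # 10% of 15 is 1.5, so one value per tail

print(f"mean = {plain_mean:.2f}, 10% trimmed mean = {trimmed:.2f}")
```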

Solution 9

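a) Using linear interpolation between sorted values for the quartiles (position p(n - 1), as in NumPy's default and Excel's QUARTILE.INC):
- Player X: Mean = 20.0, Range = 4, Q1 = 19.0, Q3 = 21.0, IQR = 2.0
- Player Y: Mean = 20.0, Range = 24, Q1 = 12.5, Q3 = 27.5, IQR = 15.0
Other quartile conventions give slightly different IQR values but lead to the same conclusion.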

b) Player X is more consistent (lower range and IQR despite same mean).

c) Player X - their output is predictable around 20 points.
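
Quartile values depend on the interpolation convention, so it helps to make the choice explicit. The sketch below assumes the "inclusive" linear-interpolation method of Python's `statistics.quantiles`; the helper name `spread` is illustrative only.

```python
import statistics

def spread(points):
    """Return (mean, range, IQR) using inclusive linear-interpolation quartiles."""
    q1, _, q3 = statistics.quantiles(points, n=4, method="inclusive")
    return statistics.mean(points), max(points) - min(points), q3 - q1

# Scoring outputs from Problem 9.
player_x = [18, 20, 22, 19, 21, 20, 18, 22, 21, 19]
player_y = [8, 12, 28, 32, 14, 24, 18, 26, 10, 28]

x_mean, x_range, x_iqr = spread(player_x)
y_mean, y_range, y_iqr = spread(player_y)

print(f"Player X: mean = {x_mean}, range = {x_range}, IQR = {x_iqr}")
print(f"Player Y: mean = {y_mean}, range = {y_range}, IQR = {y_iqr}")
```

Switching to `method="exclusive"` changes the IQR values somewhat, but Player Y's spread remains several times larger than Player X's.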

Solution 10

a) Mean TS% = 55.75%

b) Sample SD = 0.0410 or 4.10%

c) One-SD range: 51.65% to 59.85%. Counting: 8/12 = 66.7% fall within one SD, close to the 68% expected for a normal distribution.
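
The standard deviation and the one-SD count can be verified with a few lines of Python. This is a sketch assuming the sample (n - 1) definition, which is what `statistics.stdev` computes.

```python
import statistics

# True shooting percentages from Problem 10.
ts = [0.542, 0.578, 0.612, 0.489, 0.556, 0.601,
      0.523, 0.587, 0.544, 0.568, 0.492, 0.598]

ts_mean = statistics.mean(ts)
ts_sd = statistics.stdev(ts)  # sample SD: divides by n - 1

low, high = ts_mean - ts_sd, ts_mean + ts_sd
within = [x for x in ts if low <= x <= high]
share_within = len(within) / len(ts)

print(f"mean = {ts_mean:.4f}, sample SD = {ts_sd:.4f}")
print(f"{len(within)} of {len(ts)} players ({share_within:.1%}) fall within one SD")
```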

Solution 11

a)
- PPG CV = 6.2/14.8 * 100 = 41.9%
- RPG CV = 2.8/4.9 * 100 = 57.1%
- APG CV = 2.1/3.2 * 100 = 65.6%
- BPG CV = 0.5/0.5 * 100 = 100.0%

b) Ranking (least to most variable): PPG < RPG < APG < BPG

c) Blocks are rare events with many players recording near-zero, creating high relative variability.
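
The coefficient of variation is simply SD divided by mean, expressed as a percentage. A brief sketch, with the league figures from Problem 11 hard-coded and illustrative variable names:

```python
# (mean, standard deviation) for each statistic in Problem 11.
league = {
    "PPG": (14.8, 6.2),
    "RPG": (4.9, 2.8),
    "APG": (3.2, 2.1),
    "BPG": (0.5, 0.5),
}

# CV = SD / mean * 100, a unit-free measure of relative spread.
cv = {stat: sd / mean * 100 for stat, (mean, sd) in league.items()}

# Statistics ordered from least to most variable relative to their means.
ranking = sorted(cv, key=cv.get)

for stat in ranking:
    print(f"{stat}: CV = {cv[stat]:.1f}%")
```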

Solution 12

a) Game totals: 32, 22, 28, 32, 16 (mean = 26.0). Between-game sample variance = 192 / 4 = 48.0

b) Within-game sample variances: 2.67, 3.67, 1.33, 13.33, 8.00. Average = 5.80

c) Between-game variance is larger, indicating more game-to-game inconsistency than quarter-to-quarter within games.
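
Both variance components can be reproduced with the sketch below. It assumes sample variances (n - 1 denominator) throughout, which is what `statistics.variance` returns; the variable names are illustrative.

```python
import statistics

# Points by quarter for the five games in Problem 12.
quarters = [
    [8, 6, 10, 8],
    [4, 8, 4, 6],
    [6, 6, 8, 8],
    [10, 4, 6, 12],
    [2, 4, 8, 2],
]

# Between-game variance: spread of the five game totals.
totals = [sum(game) for game in quarters]
between_var = statistics.variance(totals)

# Within-game variance: quarter-to-quarter spread inside each game, then averaged.
within_vars = [statistics.variance(game) for game in quarters]
avg_within = sum(within_vars) / len(within_vars)

print(f"between-game variance = {between_var:.2f}")
print(f"average within-game variance = {avg_within:.2f}")
```

Note that the two figures are on different scales (game totals versus single quarters), so the comparison is descriptive rather than a formal decomposition.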

Solution 13

a) Using the sample standard deviation:
- Player A: Mean = 22.77, SD = 0.65
- Player B: Mean = 22.00, SD = 3.90

b) Player B is riskier - historical volatility (SD = 3.90 versus 0.65) suggests unpredictable future performance.

c) Injury history, age, team fit, contract structure, and shooting percentages should also be considered.

Solution 14

a) Team A: Mean = +7.4; Team B: Mean = +8.0

b) Team A: SD = 5.60; Team B: SD = 14.70 (sample standard deviations)

c) Team A - similar win margin but much more consistent performance suggests reliability in playoffs.
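
A quick way to compare the two teams is to compute the mean and sample standard deviation side by side. The following sketch assumes Python's `statistics` module; names are illustrative.

```python
import statistics

# Point differentials from Problem 14.
team_a = [12, 8, -3, 15, 6, 10, -1, 9, 11, 7]
team_b = [25, -8, 18, -12, 22, 3, -5, 28, -2, 11]

mean_a, mean_b = statistics.mean(team_a), statistics.mean(team_b)
sd_a, sd_b = statistics.stdev(team_a), statistics.stdev(team_b)  # n - 1 denominator

print(f"Team A: mean = {mean_a:+.1f}, SD = {sd_a:.2f}")
print(f"Team B: mean = {mean_b:+.1f}, SD = {sd_b:.2f}")
```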

Solution 15

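a) Using linear interpolation between sorted values (position p(n - 1), as in NumPy's default and Excel's PERCENTILE.INC): Q1 (P25) = 14.08, Q2 (P50) = 16.10, Q3 (P75) = 18.23, P90 = 20.33. Other percentile conventions give slightly different values.

b) 16 of the 20 players have a PER below 19.5, so 19.5 is at approximately the 80th percentile.

c) A PER above the 90th percentile, about 20.3. In this sample only the top two players (21.5 and 22.1) clear it.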
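
Percentiles and percentile ranks can be computed as sketched below. The code assumes the "inclusive" linear-interpolation convention of `statistics.quantiles`; the helper `percentile_rank` is hypothetical and counts the share of values strictly below a given score.

```python
import statistics

# PER values from Problem 15.
per = [12.5, 14.2, 15.8, 11.3, 18.6, 22.1, 16.4, 13.7, 19.8, 15.2,
       17.3, 10.8, 21.5, 14.9, 16.8, 20.2, 13.4, 18.1, 15.6, 17.9]

# With n=100 the function returns 99 cut points; cuts[k - 1] is the k-th percentile.
cuts = statistics.quantiles(per, n=100, method="inclusive")
p25, p50, p75, p90 = (cuts[k - 1] for k in (25, 50, 75, 90))

def percentile_rank(values, x):
    """Percentage of values strictly below x."""
    return 100 * sum(v < x for v in values) / len(values)

rank_19_5 = percentile_rank(per, 19.5)

print(f"P25 = {p25:.2f}, P50 = {p50:.2f}, P75 = {p75:.2f}, P90 = {p90:.2f}")
print(f"PER 19.5 sits at about the {rank_19_5:.0f}th percentile")
```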

Solution 16

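a) Min = 12.5, Q1 = 18.5, Median = 21.85, Q3 = 25.93, Max = 32.1 (quartiles by linear interpolation between sorted values at position p(n - 1); other conventions give slightly different quartiles)

b) IQR = 25.93 - 18.5 = 7.43. Lower fence: 18.5 - 1.5 * 7.43 = 7.36, Upper fence: 25.93 + 1.5 * 7.43 = 37.06. No outliers.

c) [Box plot sketch description: Box from 18.5 to 25.93, median line at 21.85, whiskers to 12.5 and 32.1]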
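
The five-number summary and the 1.5*IQR fences can be generated together. This sketch again assumes the "inclusive" interpolation method of `statistics.quantiles`; a different quartile convention would shift Q1, Q3, and the fences slightly.

```python
import statistics

# Usage rates (%) from Problem 16.
usage = [12.5, 15.8, 18.2, 19.4, 21.1, 22.6, 24.8, 26.3, 28.5, 32.1]

q1, med, q3 = statistics.quantiles(usage, n=4, method="inclusive")
five_number = (min(usage), q1, med, q3, max(usage))

# 1.5 * IQR rule: values beyond either fence are flagged as potential outliers.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [u for u in usage if u < lower_fence or u > upper_fence]

print("five-number summary:", [round(v, 2) for v in five_number])
print(f"IQR = {iqr:.2f}, fences = ({lower_fence:.2f}, {upper_fence:.2f})")
print("potential outliers:", outliers)
```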

Solution 17

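a) Using linear interpolation between sorted values (position p(n - 1)): D1=61.6, D2=69.6, D3=74.4, D4=77.6, D5=80.5, D6=83.4, D7=86.3, D8=89.2, D9=92.2. Other percentile conventions shift these boundaries slightly.

b) The top two deciles lie above D8, so the minimum score is 89.2; in practice, 90 or higher.

c) 6 prospects meet this threshold (90, 91, 92, 94, 96, 98).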
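
Decile boundaries follow the same pattern as quartiles, with ten groups instead of four. A minimal sketch, assuming the "inclusive" method of `statistics.quantiles`:

```python
import statistics

# Composite scores for the 30 prospects in Problem 17.
scores = [45, 52, 58, 62, 65, 68, 70, 72, 73, 75,
          76, 77, 78, 79, 80, 81, 82, 83, 84, 85,
          86, 87, 88, 89, 90, 91, 92, 94, 96, 98]

# n=10 returns the nine boundaries D1..D9; deciles[7] is D8 (80th percentile).
deciles = statistics.quantiles(scores, n=10, method="inclusive")
cutoff = deciles[7]
qualifiers = [s for s in scores if s >= cutoff]

print("decile boundaries:", [round(d, 1) for d in deciles])
print(f"top-two-decile cutoff = {cutoff:.1f}, prospects above it: {qualifiers}")
```

With 30 prospects, the top two deciles should contain 20% of the sample, which is a useful sanity check on the count.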

Solution 18

a) 10/15 centers have lower APG, so approximately 67th percentile among centers.

b) 3.0 > 2.4, so above the 50th percentile league-wide.

c) Position-specific percentiles reveal skill relative to role expectations; league-wide conflates different positions.

Solution 19

a) z-scores: PPG = 0.63, RPG = 0.66, APG = -0.24, SPG = 1.00, BPG = 0.20

b) Percentiles (approx): PPG = 74th, RPG = 75th, APG = 41st, SPG = 84th, BPG = 58th

c) Strongest: steals (84th). Weakest: assists (41st). Well-rounded scorer/rebounder with elite defensive activity.
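
Converting z-scores to percentiles by table lookup can be replaced with the normal CDF. The sketch below uses `statistics.NormalDist` from Python's standard library and, like the problem, assumes each statistic is normally distributed. Because it does not round the z-scores first, a percentile may differ by one point from a table-based answer.

```python
from statistics import NormalDist

# (player value, league mean, league SD) from Problem 19.
profile = {
    "PPG": (18.5, 14.2, 6.8),
    "RPG": (7.2, 5.1, 3.2),
    "APG": (2.8, 3.4, 2.5),
    "SPG": (1.4, 0.9, 0.5),
    "BPG": (0.6, 0.5, 0.5),
}

std_normal = NormalDist()  # mean 0, SD 1

z_scores = {stat: (value - mean) / sd for stat, (value, mean, sd) in profile.items()}
percentiles = {stat: std_normal.cdf(z) * 100 for stat, z in z_scores.items()}

strongest = max(percentiles, key=percentiles.get)
weakest = min(percentiles, key=percentiles.get)

for stat in profile:
    print(f"{stat}: z = {z_scores[stat]:+.2f}, percentile = {percentiles[stat]:.1f}")
print(f"strongest: {strongest}, weakest: {weakest}")
```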

Solution 20

a) Positive skewness (right-skewed) - mean > median > mode

b) Highly right-skewed; extreme salaries create a long right tail.

c) Median ($4.8M) - resistant to extreme values, represents typical player.

Solution 21

a) Team A - positive kurtosis indicates heavier tails (more extreme games).

b) More unexpected blowouts and poor performances; wider confidence intervals for predictions.

c) Team B - negative kurtosis means fewer extreme outcomes, more predictable totals.

Solution 22

a) Approximately normal: small skewness (-0.32), near-zero excess kurtosis (0.18), mean ≈ median.

b) Yes, parametric tests are appropriate.

c) Q-Q plot or histogram with normal curve overlay.

Solution 23

a) Distribution A (skewness +0.45) is closer to symmetric than B (-0.68).

b) Distribution B (kurtosis +1.35) has more extreme values.

c) Distribution A: mean > median suggests some high positive outliers. Distribution B: mean < median suggests negative outliers pulling mean down.

Solution 24

a) Using the formula, r ≈ 0.97

b) Very strong positive correlation - minutes strongly predict points.

c) R² = 0.94, so 94% of variance in points explained by minutes.
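
"Using the formula" hides several steps, so here is a sketch that spells them out. The function `pearson_r` is a hypothetical helper written from the standard definition (sum of cross-products divided by the square root of the product of the sums of squares).

```python
import math

# Minutes and points for the eight players in Problem 24.
minutes = [32.5, 28.1, 35.8, 22.4, 30.2, 18.6, 26.8, 34.1]
points = [18.2, 14.5, 21.3, 11.8, 16.9, 8.4, 15.2, 22.8]

def pearson_r(x, y):
    """Pearson correlation from deviations about the means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(minutes, points)
r_squared = r ** 2  # share of variance in points explained by minutes

print(f"r = {r:.2f}, r^2 = {r_squared:.2f}")
```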

Solution 25

a) No - correlation does not imply causation.

b) Confounders: playing time (more minutes = more fouls and higher salary), role (aggressive defenders paid more, also foul more), veteran status.

c) Control for minutes/role; examine within-player salary changes; compare similar players with different salaries.

Solution 26

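a) Spearman rho = -1.00 (perfect negative rank correlation: career win shares fall with every later draft position, so the two rank orders are exact opposites)

b) Spearman, because the relationship is perfectly monotonic but not linear: the drop in win shares per draft slot is steepest among the top picks and flattens out later, which pulls Pearson's r below 1 in magnitude.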

c) Earlier picks strongly associated with better careers; top picks are valuable assets.
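
The rank correlation can be computed from the rank differences with the formula rho = 1 - 6 * sum(d^2) / (n(n^2 - 1)), which is valid when there are no ties. The helper `ranks` below is hypothetical and assumes untied values, as in this dataset.

```python
def ranks(values):
    """Rank from 1 (smallest) to n; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

# Draft position and career win shares from Problem 26.
draft_pos = [1, 3, 5, 8, 12, 18, 25, 35, 45, 55]
career_ws = [142.5, 98.2, 85.6, 72.1, 68.4, 45.2, 38.9, 22.1, 15.8, 8.2]

rank_pos, rank_ws = ranks(draft_pos), ranks(career_ws)
n = len(rank_pos)

# Spearman's rho from squared rank differences (no-ties formula).
d_squared = sum((a - b) ** 2 for a, b in zip(rank_pos, rank_ws))
rho = 1 - 6 * d_squared / (n * (n ** 2 - 1))

print(f"sum of squared rank differences = {d_squared}, rho = {rho:.2f}")
```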

Solution 27

a) Strongest: APG-TOV (0.62), followed by PPG-TOV (0.58)

b) Negative RPG-APG: traditional positions - rebounders (bigs) don't pass as much as guards.

c) Ball-handling creates both assists and turnovers; high-usage players have more opportunities for both.

Solution 28

a) Partial r = (0.45 - 0.52 * 0.68) / sqrt((1 - 0.52²)(1 - 0.68²)) = 0.0964 / 0.6263 = 0.15

b) After controlling for offensive rating, the 3PA-wins correlation falls from 0.45 to about 0.15. Most of the association runs through improved offensive efficiency; only a weak direct relationship remains.
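
The first-order partial correlation formula is short enough to wrap in a function. In this sketch the name `partial_corr` and its argument names are illustrative: x is 3PA, y is wins, and z is offensive rating, the variable being controlled for.

```python
import math

def partial_corr(r_xy, r_xz, r_yz):
    """Correlation between x and y after removing the linear effect of z."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator

# Correlations given in Problem 28.
r_3pa_wins = 0.45
r_3pa_ortg = 0.52
r_ortg_wins = 0.68

r_partial = partial_corr(r_3pa_wins, r_3pa_ortg, r_ortg_wins)

print(f"partial r (3PA, wins | offensive rating) = {r_partial:.2f}")
```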

Solution 29

a) z = (28.5 - 15.2) / 6.4 = +2.08

b) z = (15.2 - 15.2) / 6.4 = 0.00

c) z = (8.8 - 15.2) / 6.4 = -1.00

d) z = (35.0 - 15.2) / 6.4 = +3.09

Solution 30

a) Scoring z = (22.5 - 15.0) / 6.5 = +1.15; Assists z = (6.2 - 3.5) / 2.2 = +1.23

b) Assists - higher z-score indicates more exceptional relative to league.

c) Composite = (1.15 + 1.23) / 2 = +1.19

Solution 31

a) League z = (5.5 - 5.0) / 2.8 = +0.18

b) Position z = (5.5 - 3.8) / 1.2 = +1.42

c) Slightly above average league-wide, but excellent for a point guard (1.42 SD above position mean).

Solution 32

a) 1985-86: z = (30.4 - 22.5) / 8.2 = +0.96; 2023-24: z = (31.2 - 15.8) / 6.1 = +2.52

b) 2023-24 player was more dominant (z = 2.52 vs. 0.96).

c) Limitations: different pace, rules, defensive intensity, roster sizes, and quality of competition.

Solution 33

a) z-scores: PPG = 0.52, SPG = 1.80, BPG = 1.40, DWS = 1.21

b) Composite = 0.25(0.52 + 1.80 + 1.40 + 1.21) = 1.23

c) This player is 1.23 standard deviations above average as a two-way player - solidly above average.
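
A weighted composite is just a weighted sum of z-scores. The sketch below reproduces the calculation with the equal weights from Problem 33; changing the weights in the table is the only edit needed to test a different emphasis. Names are illustrative.

```python
# (player value, league mean, league SD, weight) from Problem 33.
categories = {
    "PPG": (18.5, 15.2, 6.4, 0.25),
    "SPG": (1.8, 0.9, 0.5, 0.25),
    "BPG": (1.2, 0.5, 0.5, 0.25),
    "DWS": (3.8, 2.1, 1.4, 0.25),
}

z_scores = {
    stat: (value - mean) / sd
    for stat, (value, mean, sd, _) in categories.items()
}

# Weighted composite: each z-score multiplied by its weight, then summed.
composite = sum(
    weight * z_scores[stat]
    for stat, (_, _, _, weight) in categories.items()
)

for stat, z in z_scores.items():
    print(f"{stat}: z = {z:+.2f}")
print(f"composite two-way score = {composite:+.2f}")
```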

Solution 34

a) z = +1.0 → 84th percentile

b) z = +1.96 → 97.5th percentile

c) z = -0.5 → 31st percentile

d) z = +2.5 → 99.4th percentile

Solution 35

a) Traditional z = (18.0 - 9.8) / 8.5 = +0.96

b) Robust z = (18.0 - 5.2) / 7.8 = +1.64

c) Robust z-score - the distribution is right-skewed, so median/IQR better represent typical values. The robust score correctly shows this player earns well above typical.
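
The two standardizations differ only in which center and spread they use. The sketch below computes both. The final line shows a common variant that is not used in the solution above: it is an assumption added for comparison, dividing the IQR by 1.349 (the IQR of a standard normal distribution) so the result is on the same scale as an ordinary z-score when the data are normal.

```python
# Salary distribution summary from Problem 35 (in $ millions).
mean, median = 9.8, 5.2
sd, iqr = 8.5, 7.8
salary = 18.0

# Traditional z-score: distance from the mean in standard deviations.
traditional_z = (salary - mean) / sd

# Robust z-score as defined in the problem: distance from the median in IQR units.
robust_z = (salary - median) / iqr

# Assumed variant (not from the chapter): rescale the IQR to an SD-like unit.
scaled_robust_z = (salary - median) / (iqr / 1.349)

print(f"traditional z = {traditional_z:+.2f}")
print(f"robust z (median, IQR) = {robust_z:+.2f}")
print(f"robust z with IQR / 1.349 = {scaled_robust_z:+.2f}")
```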