Chapter 1 Exercises: Introduction to Basketball Analytics

Overview

These exercises reinforce the concepts introduced in Chapter 1. They range from basic comprehension questions to advanced analytical problems requiring Python implementation.

Difficulty Levels:
- ★ Basic: Reinforce fundamental concepts
- ★★ Intermediate: Apply concepts to new situations
- ★★★ Advanced: Extend methods or combine multiple concepts
- ★★★★ Challenge: Open-ended research questions


Section 1: Conceptual Understanding

Exercise 1.1 ★

Define the three types of analytics (descriptive, predictive, prescriptive) and provide one example of each from basketball that was NOT mentioned in the chapter.

Exercise 1.2 ★

Explain why raw counting statistics (like total points) can be misleading when comparing players. What factors should be considered?

Exercise 1.3 ★

List the Four Factors identified by Dean Oliver and explain why shooting is weighted most heavily.

Exercise 1.4 ★★

Compare and contrast the box score era with the tracking data era. What questions can be answered with tracking data that were impossible to answer with box scores alone?

Exercise 1.5 ★★

Explain the concept of "expected value" in shot selection. Why might a team intentionally take more difficult shots?

Exercise 1.6 ★★

Describe the difference between raw plus-minus and adjusted plus-minus. What problem does adjustment solve?

Exercise 1.7 ★★★

Critique the claim that "analytics has solved basketball." What aspects of basketball remain difficult or impossible to quantify?

Exercise 1.8 ★★★

Explain why small sample sizes are particularly problematic in basketball analytics. Provide specific examples of situations where sample size issues arise.


Section 2: Mathematical Calculations

Exercise 1.9 ★

Calculate the expected points per shot for the following scenarios:
a) A player shoots 40% from three-point range
b) A player shoots 50% from mid-range (two-point range)
c) A player shoots 65% from the restricted area (two-point range)

Which shot type is most efficient?

Exercise 1.10 ★

A team has the following game statistics:
- Field Goals Made: 38
- Field Goal Attempts: 84
- Three-Pointers Made: 12
- Free Throws Made: 18
- Free Throw Attempts: 22

Calculate:
a) Field Goal Percentage (FG%)
b) Effective Field Goal Percentage (eFG%)
c) True Shooting Percentage (TS%)

Exercise 1.11 ★★

Using Dean Oliver's Four Factors, calculate all four factors for Team A:
- Field Goals Made: 42
- Field Goal Attempts: 88
- Three-Pointers Made: 14
- Turnovers: 13
- Possessions: 102
- Offensive Rebounds: 11
- Opponent Defensive Rebounds: 35
- Free Throws Made: 17
- Free Throw Attempts: 22

Exercise 1.12 ★★

A player has the following shooting splits:
- Corner three-pointers: 42% on 50 attempts
- Above-the-break three-pointers: 34% on 150 attempts
- Mid-range: 41% on 100 attempts
- At the rim: 68% on 200 attempts

Calculate the expected points per shot for each zone. If the player could choose to convert 50 mid-range attempts into another shot type, which conversion would maximize expected points?

Exercise 1.13 ★★★

A player is 15 for 40 (37.5%) from three-point range this season. Calculate the 95% confidence interval for their true three-point shooting ability. Interpret this interval in the context of talent evaluation.
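As a starting point, one common approach is the normal-approximation (Wald) interval. The sketch below assumes that method and uses hypothetical numbers (30 makes in 100 attempts) rather than the ones in the exercise; the function name `wald_interval` is an illustrative choice, and other methods (such as the Wilson interval) may behave better for small samples.

```python
import math

def wald_interval(makes, attempts, z=1.96):
    """Normal-approximation confidence interval for a shooting percentage.

    z=1.96 corresponds to roughly 95% confidence.
    """
    p_hat = makes / attempts
    std_err = math.sqrt(p_hat * (1 - p_hat) / attempts)
    return p_hat - z * std_err, p_hat + z * std_err

# Hypothetical shooter: 30 makes in 100 attempts
low, high = wald_interval(30, 100)
print(f"95% CI: {low:.3f} to {high:.3f}")
```

Note how wide the interval is even at 100 attempts; with fewer attempts it widens further.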

Exercise 1.14 ★★★

Two players have the following plus-minus statistics over 1000 minutes:
- Player A: +150 points, played with starters 80% of the time
- Player B: +50 points, played with bench players 90% of the time

Discuss why raw plus-minus might be misleading here. What additional information would you need to fairly compare these players?


Section 3: Python Programming

Exercise 1.15 ★

Write a Python function that calculates True Shooting Percentage given points, field goal attempts, and free throw attempts.

def true_shooting_percentage(points, fga, fta):
    """
    Calculate True Shooting Percentage.

    Args:
        points: Total points scored
        fga: Field goal attempts
        fta: Free throw attempts

    Returns:
        True shooting percentage as a decimal (0-1)
    """
    # Your code here
    pass

# Test with: points=25, fga=18, fta=6
# Expected output: approximately 0.606

Exercise 1.16 ★

Write a Python function that determines which shot type has higher expected value.

def better_shot(three_pt_pct, two_pt_pct):
    """
    Determine which shot type has higher expected value.

    Args:
        three_pt_pct: Three-point percentage (0-1)
        two_pt_pct: Two-point percentage (0-1)

    Returns:
        String indicating which shot is better and by how much
    """
    # Your code here
    pass

# Test cases:
# better_shot(0.35, 0.50) -> "Three-pointer is better by 0.05 expected points"
# better_shot(0.40, 0.55) -> "Three-pointer is better by 0.10 expected points"

Exercise 1.17 ★★

Implement the Four Factors calculation as a Python class with methods for each factor.

class FourFactors:
    """Calculate Dean Oliver's Four Factors for a team."""

    def __init__(self, fg, fga, threept, tov, poss, orb, opp_drb, ft, fta):
        """Initialize with game statistics."""
        # Your code here
        pass

    def effective_fg_pct(self):
        """Calculate Effective Field Goal Percentage."""
        pass

    def turnover_rate(self):
        """Calculate Turnover Rate."""
        pass

    def offensive_reb_pct(self):
        """Calculate Offensive Rebounding Percentage."""
        pass

    def free_throw_rate(self):
        """Calculate Free Throw Rate."""
        pass

    def summary(self):
        """Return dictionary of all four factors."""
        pass

Exercise 1.18 ★★

Create a function that simulates shot outcomes and calculates actual vs. expected points.

import numpy as np

def simulate_shooting(three_pt_pct, three_pt_attempts,
                      two_pt_pct, two_pt_attempts, n_simulations=1000):
    """
    Simulate shooting performance and compare to expected value.

    Args:
        three_pt_pct: True three-point percentage
        three_pt_attempts: Number of three-point attempts
        two_pt_pct: True two-point percentage
        two_pt_attempts: Number of two-point attempts
        n_simulations: Number of simulations to run

    Returns:
        Dictionary with simulation results including:
        - expected_points: Theoretical expected points
        - mean_actual_points: Average points from simulations
        - std_actual_points: Standard deviation of simulated points
        - percentiles: 5th, 25th, 50th, 75th, 95th percentile outcomes
    """
    # Your code here
    pass
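
As a hint, the core of such a simulation can be sketched for a single shot type by drawing the number of makes from a binomial distribution. The inputs below (a 35% shooter taking 8 threes per simulated game) and the seed are hypothetical, and this is only one possible approach.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # seeded so results are reproducible

# Hypothetical inputs: a 35% three-point shooter taking 8 attempts per game
attempts, true_pct, n_simulations = 8, 0.35, 10_000

# Number of makes in each simulated game, then convert makes to points
made = rng.binomial(n=attempts, p=true_pct, size=n_simulations)
points = 3 * made

expected_points = 3 * attempts * true_pct  # theoretical expectation
print("Expected points:", expected_points)
print("Mean simulated points:", points.mean())
print("5th/50th/95th percentiles:", np.percentile(points, [5, 50, 95]))
```

Extending this to two shot types means simulating each separately and adding the point totals per simulation.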

Exercise 1.19 ★★★

Create a visualization function that plots shot location data with efficiency coloring.

import matplotlib.pyplot as plt
import numpy as np

def plot_shot_efficiency(shot_data, title="Shot Efficiency Chart"):
    """
    Create a shot chart colored by efficiency.

    Args:
        shot_data: DataFrame with columns:
            - x: X coordinate (-25 to 25 feet)
            - y: Y coordinate (0 to 47 feet)
            - made: Boolean indicating if shot was made
        title: Chart title

    Returns:
        matplotlib figure object
    """
    # Your code here
    # Hint: Use hexbin or scatter with color mapping
    pass

Exercise 1.20 ★★★

Build a function that calculates confidence intervals for multiple statistics simultaneously.

def calculate_stat_confidence_intervals(player_data, confidence=0.95):
    """
    Calculate confidence intervals for shooting percentages.

    Args:
        player_data: Dictionary with keys:
            - 'fg_made', 'fg_attempts'
            - 'three_made', 'three_attempts'
            - 'ft_made', 'ft_attempts'
        confidence: Confidence level

    Returns:
        Dictionary with confidence intervals for:
        - FG%
        - 3P%
        - FT%
        - TS%
    """
    # Your code here
    pass

Section 4: Data Analysis

Exercise 1.21 ★★

Using the nba_api library (or simulated data), calculate the league-wide three-point attempt rate for each season from 2000 to present. Create a visualization showing this trend.

Exercise 1.22 ★★

Compare the shot distribution (at-rim, mid-range, three-pointer) for two contrasting teams: one "analytics-friendly" team and one traditional team. Discuss the differences.

Exercise 1.23 ★★★

Analyze the relationship between pace (possessions per game) and offensive efficiency (points per possession) across NBA teams. Is there a correlation? Create a scatter plot with team labels.

Exercise 1.24 ★★★

Calculate the Four Factors for all NBA teams from a recent season. Rank teams by each factor and identify which factors are most correlated with winning percentage.

Exercise 1.25 ★★★★

Build a simple regression model predicting team wins from the Four Factors. Which factors are most predictive? How well does the model fit the data?
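
The mechanics of such a regression can be sketched with ordinary least squares in NumPy. Everything below is simulated: the factor values are random, and the coefficients used to generate the fake win totals are arbitrary assumptions, so the output says nothing about real NBA teams.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Simulated (not real) standardized Four Factors for 30 teams.
# Columns: eFG%, TOV%, ORB%, FT rate
n_teams = 30
factors = rng.normal(size=(n_teams, 4))

# Arbitrary coefficients, used only to generate fake win totals
true_coefs = np.array([10.0, -6.0, 4.0, 2.0])
wins = 41 + factors @ true_coefs + rng.normal(scale=1.0, size=n_teams)

# Ordinary least squares with an intercept column
design = np.column_stack([np.ones(n_teams), factors])
coefs, *_ = np.linalg.lstsq(design, wins, rcond=None)

# R^2: share of the variance in wins explained by the fitted model
predicted = design @ coefs
ss_res = np.sum((wins - predicted) ** 2)
ss_tot = np.sum((wins - wins.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

print("Intercept and coefficients:", np.round(coefs, 2))
print("R^2:", round(r_squared, 3))
```

With real data, the factors should be put on a comparable scale before their coefficients are compared.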


Section 5: Critical Thinking

Exercise 1.26 ★★

"The three-point revolution means teams should only shoot three-pointers and layups." Critique this statement. Under what circumstances might mid-range shots still be valuable?

Exercise 1.27 ★★

Discuss the tension between individual player statistics and team success. How might optimizing for individual statistics harm team performance?

Exercise 1.28 ★★★

The Rockets under Daryl Morey and the Warriors dynasty had different approaches to analytics-driven basketball. Research both approaches and compare their philosophies.

Exercise 1.29 ★★★

Discuss ethical considerations in basketball analytics. Consider:
- Player privacy and tracking data
- Load management and fan experience
- Draft modeling and player agency

Exercise 1.30 ★★★★

Design an analytics study to answer: "Does taking more three-pointers cause teams to win more, or do better teams simply take more three-pointers?" How would you establish causation rather than just correlation?


Section 6: Historical Analysis

Exercise 1.31 ★

Research the "pace and space" era. When did it begin, and what factors drove its emergence?

Exercise 1.32 ★★

Compare the statistical profile of a dominant center from the 1990s (e.g., Shaquille O'Neal) with a modern center. How has the center position evolved?

Exercise 1.33 ★★

Calculate pace-adjusted statistics for a historical player to enable fair comparison with modern players. Explain your methodology.
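
One simple methodology, sketched below, scales a per-game statistic by the ratio of a reference pace to the player's team pace. The function name `pace_adjust` and all numbers are hypothetical; real pace figures would come from a data source of your choice.

```python
def pace_adjust(per_game_stat, team_pace, reference_pace):
    """Rescale a per-game stat to what it would be at the reference pace.

    Pace is measured in possessions per game.
    """
    return per_game_stat * reference_pace / team_pace

# Hypothetical player: 30.0 points per game on a team averaging 110 possessions,
# rescaled to a 100-possession environment
adjusted = pace_adjust(30.0, team_pace=110.0, reference_pace=100.0)
print(f"Pace-adjusted points per game: {adjusted:.1f}")
```

This assumes the player's production scales linearly with team possessions, which is itself a simplification worth discussing in your answer.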

Exercise 1.34 ★★★

The 2015-16 Warriors won 73 games but lost the Finals. The 2016-17 Warriors won 67 games and won the Finals. Using available statistics, analyze what changed between these two seasons.

Exercise 1.35 ★★★★

Identify a player whose reputation was significantly changed by advanced analytics (for better or worse). Present the evidence on both sides.


Section 7: Career Development

Exercise 1.36 ★

Research the career path of one person currently working in NBA analytics. What education and experience led to their current role?

Exercise 1.37 ★★

Identify three skills beyond statistics and programming that are important for basketball analytics roles. Explain why each matters.

Exercise 1.38 ★★★

Draft an outline for a portfolio project that would demonstrate your basketball analytics skills to a potential employer. What would you analyze, and how would you present it?

Exercise 1.39 ★★★

Attend (virtually or in person) a sports analytics conference presentation, or read a paper from the MIT Sloan Sports Analytics Conference. Summarize the key findings and methodology.

Exercise 1.40 ★★★★

Write a 500-word analytical article about a current NBA topic suitable for publication on a basketball analytics blog. Include at least one original calculation or visualization.


Solutions

Selected solutions are available in Appendix G. Full solutions are available to instructors upon request.

Solution Hints

Exercise 1.9: Use the formula Expected Points = Point Value × Probability
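
A minimal sketch of this formula follows. It uses hypothetical shooting percentages, not the ones in the exercise, and the function name `expected_points` is an illustrative choice.

```python
def expected_points(point_value, make_probability):
    """Expected points per attempt: point value times probability of a make."""
    return point_value * make_probability

# Hypothetical inputs for illustration only
three_ev = expected_points(3, 0.36)  # 36% three-point shooter
two_ev = expected_points(2, 0.48)    # 48% two-point shooter

print(f"Three-pointer: {three_ev:.2f} points per shot")
print(f"Two-pointer:   {two_ev:.2f} points per shot")
```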

Exercise 1.10:
- eFG% = (FGM + 0.5 × 3PM) / FGA
- TS% = Points / (2 × (FGA + 0.44 × FTA))
- Points are not listed directly; compute them as 2 × FGM + 3PM + FTM
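
To check your arithmetic, the two formulas can be applied step by step as in the sketch below. The box-score line is hypothetical and deliberately differs from the one in the exercise.

```python
# Hypothetical team line (not the numbers from Exercise 1.10)
fgm, fga = 40, 80        # field goals made / attempted (threes included)
three_pm = 10            # three-pointers made
ftm, fta = 15, 20        # free throws made / attempted

# Each made field goal is worth 2, plus 1 extra for each three, plus free throws
points = 2 * fgm + three_pm + ftm

efg_pct = (fgm + 0.5 * three_pm) / fga
ts_pct = points / (2 * (fga + 0.44 * fta))

print(f"Points: {points}")
print(f"eFG%: {efg_pct:.3f}")
print(f"TS%:  {ts_pct:.3f}")
```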

Exercise 1.15: Remember to use 0.44 as the free throw coefficient, not 0.5

Exercise 1.17: Initialize all values in __init__ and use stored values in methods
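
The pattern this hint describes can be sketched with a smaller, hypothetical class. `ShootingLine` is not part of the exercise; it only illustrates storing inputs in `__init__` and reusing them in methods.

```python
class ShootingLine:
    """Hypothetical example: store inputs once, compute from stored values."""

    def __init__(self, fg, fga):
        # Store the raw inputs as attributes so every method can use them
        self.fg = fg
        self.fga = fga

    def fg_pct(self):
        """Field goal percentage, guarding against zero attempts."""
        return self.fg / self.fga if self.fga else 0.0

    def summary(self):
        """Collect computed values in a dictionary by calling the other methods."""
        return {"FG%": self.fg_pct()}

line = ShootingLine(fg=9, fga=20)
print(line.summary())
```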


Complete these exercises before proceeding to Chapter 2.