Chapter 16 Exercises: NBA Modeling

Part A: Foundational Concepts (Exercises 1-6)

Exercise 1. Define the "Four Factors" of basketball as identified by Dean Oliver. For each factor, explain what it measures, why it matters, and provide the formula. Rank them in order of importance for predicting team success.
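
As a starting point for Exercise 1, the sketch below computes the Four Factors in the form they are commonly written. The box score numbers are hypothetical, and free throw rate is taken here as made free throws per field goal attempt; some sources use free throw attempts per field goal attempt instead.

```python
def four_factors(fgm, fg3m, fga, tov, fta, ftm, orb, opp_drb):
    """Return the Four Factors for one team from box score totals."""
    # Effective field goal percentage: a made three counts as 1.5 makes.
    efg_pct = (fgm + 0.5 * fg3m) / fga
    # Turnover rate: turnovers per estimated play (0.44 is the usual FTA weight).
    tov_pct = tov / (fga + 0.44 * fta + tov)
    # Offensive rebounding rate: share of available offensive rebounds secured.
    orb_pct = orb / (orb + opp_drb)
    # Free throw rate: made free throws per field goal attempt (assumed definition).
    ft_rate = ftm / fga
    return {"efg_pct": efg_pct, "tov_pct": tov_pct,
            "orb_pct": orb_pct, "ft_rate": ft_rate}


# Hypothetical single-game box score, for illustration only.
factors = four_factors(fgm=40, fg3m=12, fga=88, tov=14,
                       fta=25, ftm=20, orb=10, opp_drb=32)
for name, value in factors.items():
    print(f"{name}: {value:.3f}")
```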

Exercise 2. Explain the concept of pace in NBA analytics. Team A averages 102 possessions per game and Team B averages 96 possessions per game. Estimate the expected number of possessions when these two teams play each other.
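
Two simple ways to estimate matchup pace for Exercise 2 are sketched below. The league-average pace of 100 is a hypothetical value that the exercise does not supply, and neither function is the only reasonable choice.

```python
def expected_pace_simple(pace_a, pace_b):
    """Average the two teams' season paces."""
    return (pace_a + pace_b) / 2.0


def expected_pace_adjusted(pace_a, pace_b, league_pace):
    """Add each team's deviation from league average to the league average."""
    return league_pace + (pace_a - league_pace) + (pace_b - league_pace)


simple = expected_pace_simple(102, 96)
# league_pace=100 is an assumed value for illustration.
adjusted = expected_pace_adjusted(102, 96, league_pace=100.0)
print(simple, adjusted)
```

The two estimates agree only when the league average equals the midpoint of the two team paces, so the choice of baseline matters.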

Exercise 3. A team scored 112 points on 98 possessions and allowed 105 points on 100 possessions. Calculate its offensive rating and defensive rating, expressing both as per-100-possession rates. What is the team's net rating?
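
The arithmetic for Exercise 3 can be checked with a short sketch like the following. The function name is illustrative only.

```python
def rating_per_100(points, possessions):
    """Points per 100 possessions."""
    if possessions <= 0:
        raise ValueError("possessions must be positive")
    return 100.0 * points / possessions


off_rating = rating_per_100(112, 98)    # points scored per 100 possessions
def_rating = rating_per_100(105, 100)   # points allowed per 100 possessions
net_rating = off_rating - def_rating

print(f"ORtg {off_rating:.1f}, DRtg {def_rating:.1f}, Net {net_rating:+.1f}")
```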

Exercise 4. Describe the difference between usage rate and efficiency for an NBA player. A player has a usage rate of 30% and a true shooting percentage of 58%. Another has a usage rate of 18% and a true shooting percentage of 62%. Which player is more valuable and why might the answer not be straightforward?

Exercise 5. What is a "possession" in NBA analytics, and why is it the fundamental unit of analysis? Provide the standard formula for estimating team possessions from box score data.
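
One widely used simple estimate for Exercise 5 is sketched below. The 0.44 free-throw weight is the conventional coefficient, more elaborate versions of the formula exist, and the box score values are hypothetical.

```python
def estimate_possessions(fga, orb, tov, fta, ft_weight=0.44):
    """Simple box score estimate: FGA - ORB + TOV + 0.44 * FTA."""
    return fga - orb + tov + ft_weight * fta


# Hypothetical team totals for a single game.
poss = estimate_possessions(fga=88, orb=10, tov=14, fta=25)
print(f"Estimated possessions: {poss:.1f}")
```

In practice the estimate is often averaged across both teams in a game, since each side should have nearly the same number of possessions.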

Exercise 6. The NBA regular season consists of 82 games. At what point in the season (approximately how many games) does a team's net rating become a more reliable predictor of future performance than their win-loss record? Explain the statistical reasoning.

Part B: Data and Feature Engineering (Exercises 7-12)

Exercise 7. Design a feature that captures the "rest advantage" in NBA games. Consider factors such as days off, travel distance, back-to-back games, and three-games-in-four-nights situations. Specify your feature encoding and justify your design choices.
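
As one possible encoding for the schedule part of Exercise 7, the sketch below derives days of rest, a back-to-back flag, and a three-games-in-four-nights flag from a team's sorted game dates. The dates and field names are hypothetical, and travel distance is left out.

```python
from datetime import date


def rest_features(game_dates):
    """Per-game rest features from a chronologically sorted list of dates."""
    features = []
    for i, d in enumerate(game_dates):
        # Full days off before this game; None for the first game listed.
        days_rest = (d - game_dates[i - 1]).days - 1 if i > 0 else None
        # Back-to-back: played yesterday, so zero days off.
        is_b2b = days_rest == 0
        # Three in four nights: this game and the previous two span <= 4 days.
        is_3in4 = i >= 2 and (d - game_dates[i - 2]).days <= 3
        features.append({"date": d, "days_rest": days_rest,
                         "is_b2b": is_b2b, "is_3in4": is_3in4})
    return features


# Hypothetical schedule for illustration.
schedule = [date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 4), date(2024, 1, 7)]
feats = rest_features(schedule)
for row in feats:
    print(row)
```

A rest advantage feature could then be the difference between the two teams' `days_rest` values, possibly capped, but that design choice is part of the exercise.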

Exercise 8. Build a lineup-adjusted efficiency metric. When a team's starting point guard is out and replaced by a backup, how would you estimate the impact on team offensive rating? Describe your methodology using player-level on/off data.

Exercise 9. Construct a pace-adjusted prediction framework. If your model predicts Team A will score 1.12 points per possession and Team B will score 1.08 points per possession, with the expected game pace at 97 possessions, calculate the predicted total and spread.
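
The calculation in Exercise 9 reduces to multiplying each team's points per possession by the expected pace. A minimal sketch, with an illustrative function name:

```python
def predict_game(ppp_a, ppp_b, pace):
    """Convert per-possession efficiencies and pace into points, total, margin."""
    points_a = ppp_a * pace
    points_b = ppp_b * pace
    total = points_a + points_b
    margin = points_a - points_b  # positive means Team A is favored
    return points_a, points_b, total, margin


points_a, points_b, total, margin = predict_game(1.12, 1.08, 97)
print(f"A {points_a:.2f}, B {points_b:.2f}, total {total:.1f}, A by {margin:.2f}")
```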

Exercise 10. NBA games feature significant within-game variance. Design a halftime model that re-estimates game probabilities using first-half data. What features would you use? How would you weight pre-game priors versus first-half evidence?

Exercise 11. Create a travel fatigue feature using NBA schedule data. How would you quantify the difference between a team playing at home after two days off versus a team playing the second game of a back-to-back on the road after traveling across two time zones?

Exercise 12. Design a "clutch performance" metric for NBA teams. Define what constitutes clutch situations (e.g., score within 5 points in the final 5 minutes) and explain why clutch performance is largely non-predictive despite its narrative importance.

Part C: Model Building (Exercises 13-18)

Exercise 13. Build a linear regression model that predicts the point spread of NBA games using each team's offensive rating, defensive rating, pace, and home-court advantage. Train on one full season and test on the next.

Exercise 14. Implement a Bayesian team rating model for the NBA that updates after each game. Start with preseason priors based on the previous season's ratings (regressed 25% toward the mean) and update using a Kalman filter approach. Compare the Bayesian model's accuracy in the first month of the season versus a model that only uses current-season data.
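
The core update in Exercise 14 can be sketched for a single team as a scalar Kalman filter. All variance values, the 3-point home-court figure, and the opponent rating below are assumptions chosen for illustration, and the opponent's rating is treated as known, which a full model would not do.

```python
from dataclasses import dataclass


@dataclass
class TeamRating:
    mean: float  # estimated point margin versus an average team
    var: float   # uncertainty (variance) of that estimate


def preseason_prior(last_rating, league_mean=0.0, regress=0.25, prior_var=16.0):
    """Regress last season's rating toward the league mean (25% by default)."""
    mean = (1.0 - regress) * last_rating + regress * league_mean
    return TeamRating(mean=mean, var=prior_var)


def kalman_update(rating, implied_rating, obs_var=144.0, process_var=0.25):
    """One scalar Kalman step; obs_var and process_var are assumed values."""
    gain = rating.var / (rating.var + obs_var)
    new_mean = rating.mean + gain * (implied_rating - rating.mean)
    # Shrink uncertainty with the observation, then add drift between games.
    new_var = (1.0 - gain) * rating.var + process_var
    return TeamRating(mean=new_mean, var=new_var)


team = preseason_prior(last_rating=8.0)

# Hypothetical game: won by 10 at home (assumed 3-point home edge)
# against an opponent rated +2, so the implied rating is 10 - 3 + 2.
implied = 10.0 - 3.0 + 2.0
updated = kalman_update(team, implied)
print(team, updated)
```

The gain shrinks as the rating variance falls, so early-season games move the estimate more than late-season games.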

Exercise 15. Build a totals prediction model using the Four Factors. For each game, predict the total using both teams' pace, effective field goal percentage, turnover rate, offensive rebounding rate, and free throw rate. Evaluate against closing totals using MAE and RMSE.

Exercise 16. Construct a player prop model for points scored. Use a player's recent scoring average, opponent defensive rating, pace of play, and home/away split to predict their individual point total. Compare your predictions to posted player prop lines.

Exercise 17. Build an ensemble model that combines three different approaches: (a) a Four Factors regression model, (b) an Elo rating system, and (c) a market-based model using recent against-the-spread (ATS) performance. Weight the components optimally and evaluate whether the ensemble outperforms any individual model.

Exercise 18. Implement a simulation-based model for NBA games that accounts for possession-level variance. Model each possession's outcome as a draw from a scoring distribution (0, 1, 2, or 3 points with team-specific probabilities). Run 10,000 simulations per game and derive spread and total probabilities.
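
A bare-bones version of the simulation in Exercise 18 might look like the sketch below. The outcome probabilities, the 97-possession pace, and the 3.5 and 205.5 lines are all hypothetical. Possessions are treated as independent and overtime is ignored.

```python
import random

POINTS = (0, 1, 2, 3)


def simulate_scores(probs, possessions, n_sims, rng):
    """Simulate n_sims final scores; each possession is an independent draw."""
    return [sum(rng.choices(POINTS, weights=probs, k=possessions))
            for _ in range(n_sims)]


def simulate_game(probs_a, probs_b, possessions=97, n_sims=10_000, seed=0):
    """Return a list of (score_a, score_b) pairs."""
    rng = random.Random(seed)
    scores_a = simulate_scores(probs_a, possessions, n_sims, rng)
    scores_b = simulate_scores(probs_b, possessions, n_sims, rng)
    return list(zip(scores_a, scores_b))


# Assumed per-possession probabilities of scoring 0, 1, 2, or 3 points.
probs_a = (0.52, 0.04, 0.30, 0.14)  # about 1.06 points per possession
probs_b = (0.54, 0.04, 0.29, 0.13)  # about 1.01 points per possession

results = simulate_game(probs_a, probs_b)
n = len(results)
p_a_covers = sum(1 for a, b in results if a - b > 3.5) / n    # A covers -3.5
p_over = sum(1 for a, b in results if a + b > 205.5) / n      # over 205.5
mean_total = sum(a + b for a, b in results) / n

print(f"P(A covers -3.5) = {p_a_covers:.3f}")
print(f"P(over 205.5)    = {p_over:.3f}")
print(f"Mean total       = {mean_total:.1f}")
```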

Part D: Market Analysis (Exercises 19-24)

Exercise 19. The NBA has more games per season than the NFL. Analyze whether the NBA point-spread market is more or less efficient than the NFL market. Use historical data to compare the ATS records of large favorites (10+ points) in both sports.

Exercise 20. Study the "second night of a back-to-back" effect in NBA betting. Collect data on teams playing back-to-back games and calculate their ATS record. Does the market fully price in the fatigue factor?

Exercise 21. Analyze NBA totals market efficiency. Are high totals (230+) or low totals (under 210) more likely to go over or under historically? Test whether there is a systematic bias in totals setting.

Exercise 22. Study line movement patterns in NBA games. When a line moves more than 2 points between opening and closing, does the closing line outperform the opening line against the actual result? Calculate the value of "steam move" detection.

Exercise 23. Evaluate the profitability of betting on NBA teams that are 0-3 ATS in their last three games. Does mean reversion create a profitable contrarian strategy?

Exercise 24. Compare the accuracy of your model to the closing line across an entire NBA season. Calculate your model's closing line value (CLV), defined as the average difference between the line at which you took a position and the closing line. Explain why positive CLV is widely regarded as the best long-term indicator of a model's profitability.
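
For spread bets, CLV in points can be sketched as below, taking the line at which each bet was placed and comparing it with the close. Spreads are quoted from the perspective of the side that was bet (an assumed convention), and the three bets are hypothetical.

```python
def clv_points(bet_spread, close_spread):
    """Points gained versus the close, from the bettor's side.

    Both spreads are for the team that was bet: -3.5 means laying 3.5,
    +4.5 means taking 4.5. Positive output means the bet beat the close.
    """
    return bet_spread - close_spread


# Hypothetical bets: (spread at time of bet, closing spread).
bets = [(-3.5, -5.0),   # laid 3.5, closed at 5.0: beat the close by 1.5
        (4.5, 3.0),     # took 4.5, closed at 3.0: beat the close by 1.5
        (-2.0, -1.0)]   # laid 2.0, closed at 1.0: lost 1.0 to the close

clvs = [clv_points(bet, close) for bet, close in bets]
average_clv = sum(clvs) / len(clvs)
print(clvs, f"average CLV = {average_clv:+.2f} points")
```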

Part E: Advanced Applications (Exercises 25-30)

Exercise 25. Build a real-time win probability model for NBA games. Using score differential, time remaining, and possession, calculate the probability that the home team wins at any point during the game. Use this to identify potential live betting opportunities.
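
A common baseline for Exercise 25 treats the remaining scoring margin as normally distributed, with variance proportional to the time left. The sketch below follows that idea. The 12-point full-game standard deviation is an assumed value, and the possession feature is omitted for brevity.

```python
import math


def home_win_prob(home_lead, seconds_left, pregame_margin=0.0,
                  game_sd=12.0, total_seconds=2880):
    """Normal-approximation win probability for the home team.

    pregame_margin is the expected home margin over a full game;
    game_sd is an assumed standard deviation of full-game margin.
    """
    frac = seconds_left / total_seconds
    if frac <= 0:
        # Game over: the result is decided (a tie is treated as a coin flip).
        return 1.0 if home_lead > 0 else (0.0 if home_lead < 0 else 0.5)
    expected_final = home_lead + pregame_margin * frac
    sd_remaining = game_sd * math.sqrt(frac)
    z = expected_final / sd_remaining
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


print(home_win_prob(0, 2880))    # even game at tip-off
print(home_win_prob(5, 1440))    # up 5 at halftime
print(home_win_prob(5, 720))     # up 5 entering the fourth quarter
```

The same lead is worth more as time runs down, which is the behavior a live model should reproduce before any refinements are added.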

Exercise 26. Develop a model for predicting NBA playoff series outcomes. How do regular-season metrics translate to playoff performance? Account for factors like home-court advantage amplification, star-player leverage, and the effect of rest days between games.

Exercise 27. Study the impact of the NBA trade deadline on team performance. When a team makes a significant mid-season acquisition, how quickly does the betting market adjust? Build a framework for estimating the point-spread impact of trades.

Exercise 28. Construct an NBA player injury impact model. Using on/off court data, estimate the point impact of each player's absence. Which player's absence has the largest impact in the current NBA? Build a top-20 ranking.

Exercise 29. Analyze the "schedule loss" concept in NBA modeling. Some games are likely to be low-effort based on scheduling context (e.g., a game sandwiched between two rivalry matchups). Can you identify these games systematically and exploit them?

Exercise 30. Design and backtest a complete NBA betting system for an entire season. Generate predictions for every game, identify value bets where your edge exceeds a specified threshold, apply Kelly criterion staking, and report ROI with a full statistical analysis including confidence intervals and drawdown metrics.
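
The staking step in Exercise 30 can be sketched with the standard Kelly formula for a binary bet. The 55% win probability and the -110 price are hypothetical inputs, and the fractional multiplier is an optional risk control rather than part of the formula.

```python
def american_to_decimal(odds):
    """Convert American odds (e.g. -110, +150) to decimal odds."""
    if odds > 0:
        return 1.0 + odds / 100.0
    return 1.0 + 100.0 / abs(odds)


def kelly_fraction(win_prob, decimal_odds, multiplier=1.0):
    """Fraction of bankroll to stake; zero when there is no edge."""
    b = decimal_odds - 1.0            # net payout per unit staked
    q = 1.0 - win_prob
    full_kelly = (b * win_prob - q) / b
    return max(0.0, full_kelly) * multiplier


price = american_to_decimal(-110)
full = kelly_fraction(0.55, price)                   # full Kelly stake
half = kelly_fraction(0.55, price, multiplier=0.5)   # half Kelly stake
no_edge = kelly_fraction(0.50, price)                # below break-even

print(f"full {full:.4f}, half {half:.4f}, no edge {no_edge:.4f}")
```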