Chapter 24 Exercises: Simulation and Monte Carlo Methods
Instructions: Complete all exercises in the parts assigned by your instructor. Show all work for calculation problems. For programming challenges, include comments explaining your logic and provide sample output. For analysis and research problems, cite your sources where applicable.
Part A: Conceptual Understanding
Each problem is worth 5 points. Answer in complete sentences unless otherwise directed.
Exercise A.1 --- The Law of Large Numbers and Convergence
Explain the role of the Law of Large Numbers in Monte Carlo simulation. Describe (a) the difference between the weak and strong forms of the LLN, (b) why the $O(1/\sqrt{N})$ convergence rate is called "dimension-free" and why this matters for sports betting applications, (c) the practical implication that halving the standard error requires quadrupling the number of simulations, and (d) why this convergence rate is both the greatest strength and the greatest limitation of Monte Carlo methods.
Exercise A.2 --- Pseudorandom Number Generation
Discuss the concept of pseudorandom number generation and its role in simulation. Address (a) why we use pseudorandom rather than truly random numbers, (b) the importance of setting and recording seeds for reproducibility, (c) the difference between NumPy's legacy np.random module and the modern Generator API and why the latter is preferred, and (d) why generating random numbers in bulk (vectorized) is dramatically faster than generating them one at a time in a loop.
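As a concrete reference for parts (b)-(d), here is a minimal NumPy sketch (the seed and array sizes are arbitrary illustrations) contrasting the legacy global-state module with the modern Generator API, and a bulk draw versus a Python-level loop:

```python
import numpy as np

# Legacy global-state API (still works, but not recommended for new code)
np.random.seed(42)
legacy_draws = np.random.random(5)

# Modern Generator API: an explicit, seedable generator object
rng = np.random.default_rng(42)      # record this seed for reproducibility
vectorized = rng.random(1_000_000)   # one bulk call, filled in compiled code

# The slow alternative: one Python-level call per draw
looped = np.array([rng.random() for _ in range(1_000)])

print(legacy_draws[:3], vectorized.mean(), looped.mean())
```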
Exercise A.3 --- Bootstrap vs. Parametric Inference
Compare the non-parametric bootstrap with traditional parametric confidence intervals. Explain (a) when the bootstrap is preferred over analytical methods, (b) the key assumption underlying the bootstrap (that the empirical distribution approximates the true distribution), (c) why the BCa method is preferred over the simple percentile method for betting performance metrics, and (d) a scenario in sports betting where the bootstrap would give misleading results.
Exercise A.4 --- Permutation Test Logic
A bettor claims their model produces significantly better predictions for home underdogs than for home favorites. Describe how you would design a permutation test to evaluate this claim. Specifically, explain (a) the null hypothesis, (b) what labels you would permute, (c) the test statistic you would use, (d) how you would compute the p-value, and (e) what conclusion you would draw if 3.2% of permuted statistics exceeded the observed statistic.
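For orientation only, a bare-bones two-sample permutation test might look like the sketch below; the synthetic error arrays and the choice of a difference in means as the test statistic are placeholder assumptions, not the required design:

```python
import numpy as np

def permutation_pvalue(group_a, group_b, n_perm=10_000, seed=0):
    """One-sided p-value for H0: no difference in group means."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([group_a, group_b])
    n_a = len(group_a)
    observed = group_a.mean() - group_b.mean()
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                       # permute the group labels
        diff = pooled[:n_a].mean() - pooled[n_a:].mean()
        if diff >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)             # add-one correction

# Illustrative fake data: model errors for home underdogs vs. home favorites
rng = np.random.default_rng(1)
underdog_errors = rng.normal(0.10, 1.0, 120)
favorite_errors = rng.normal(0.00, 1.0, 180)
print(permutation_pvalue(underdog_errors, favorite_errors))
```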
Exercise A.5 --- Variance Reduction Intuition
Explain the intuition behind antithetic variates and control variates without using any mathematical formulas. Use a sports betting analogy for each technique. Then explain (a) why antithetic variates work best for monotone functions, (b) what makes a good control variate, (c) why importance sampling can catastrophically fail, and (d) how you would decide which variance reduction technique to apply for a given simulation problem.
Exercise A.6 --- Season Simulation Modeling Choices
When simulating a full NFL season, the modeler must make several key decisions. Discuss the following modeling choices and their implications: (a) simulating game outcomes as binary (win/loss) versus simulating point margins with a normal distribution, (b) using static team ratings versus allowing ratings to evolve throughout the simulated season, (c) whether to model home-field advantage as a fixed constant or as team-specific, and (d) how the choice of margin standard deviation (e.g., 13.5 vs. 10 points) affects the distribution of simulated season outcomes.
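To make choice (a) concrete, the sketch below simulates a single game both ways; the ratings, home-field constant, margin standard deviation, and the normal-CDF link used to turn an expected margin into a win probability are illustrative assumptions:

```python
import math
import numpy as np

rng = np.random.default_rng(7)

# Illustrative inputs (not chapter-prescribed values)
home_rating, away_rating = 3.0, 1.0   # points above league average
home_field = 2.0                      # fixed home-field advantage, in points
margin_sd = 13.5                      # NFL-style margin standard deviation

expected_margin = home_rating - away_rating + home_field

# Approach 1: binary win/loss, with win probability implied by the margin model
win_prob = 0.5 * (1 + math.erf(expected_margin / (margin_sd * math.sqrt(2))))
home_wins_binary = rng.random() < win_prob

# Approach 2: simulate the point margin directly, then derive the winner
simulated_margin = rng.normal(expected_margin, margin_sd)
home_wins_margin = simulated_margin > 0

print(f"win prob {win_prob:.3f}, binary result {home_wins_binary}, "
      f"margin {simulated_margin:+.1f}, margin result {home_wins_margin}")
```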
Exercise A.7 --- Bootstrap Sample Size Requirements
A bettor has 200 bets in their track record and wants to compute a 95% BCa confidence interval for their ROI. Discuss (a) whether 200 bets is sufficient for reliable bootstrap inference and why, (b) how the number of bootstrap replicates $B$ affects the quality of the confidence interval, (c) the rule of thumb for choosing $B$ and why 10,000 is commonly used, and (d) how you would determine if the confidence interval has stabilized (i.e., is no longer changing meaningfully with additional bootstrap replicates).
Exercise A.8 --- Monte Carlo vs. Analytical Solutions
For the problem "What is the probability that a bettor with a 54% win rate at -110 odds goes broke before doubling their bankroll, starting with 100 units?", explain (a) why this is difficult to solve analytically, (b) how you would set up a Monte Carlo simulation to estimate the answer, (c) how many simulations would be needed to estimate this probability to within 1 percentage point with 95% confidence (assume the true probability is around 15%), and (d) what variance reduction technique would be most effective for this particular problem.
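One possible skeleton for part (b) is sketched below. The problem does not fix a bet size, so the flat 5-unit stake here is an explicit assumption (and the answer is highly sensitive to it):

```python
import numpy as np

def ruin_before_double(n_sims=2_000, win_prob=0.54, payout=10 / 11,
                       start=100.0, stake=5.0, seed=0):
    """Estimate P(bankroll hits 0 before reaching 2x start) with flat bets."""
    rng = np.random.default_rng(seed)
    ruined = 0
    for _ in range(n_sims):
        bankroll = start
        while 0 < bankroll < 2 * start:
            if rng.random() < win_prob:
                bankroll += stake * payout   # a win at -110 pays 10/11 of the stake
            else:
                bankroll -= stake
        if bankroll <= 0:
            ruined += 1
    return ruined / n_sims

# Pure-Python loop kept for clarity; a production version would vectorize
# across simulations or draw the wins in blocks.
print(ruin_before_double())
```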
Part B: Calculations
Each problem is worth 5 points. Show all work and round final answers to the indicated precision.
Exercise B.1 --- Monte Carlo Standard Error
A Monte Carlo simulation of 50,000 replications estimates the probability of a 10-win NFL team making the Super Bowl. The estimated probability is 0.0372 with a standard deviation of the indicator variable of 0.1893.
(a) Calculate the standard error of this estimate.
(b) Construct a 95% confidence interval for the true probability.
(c) How many simulations would be needed to reduce the standard error to 0.0005?
(d) If each simulation takes 0.02 seconds, how long would the simulation in (c) take?
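Hint: for an indicator variable with sample standard deviation $s$, the Monte Carlo standard error, the 95% confidence interval, and the replication count needed to reach a target standard error are
$$\widehat{\mathrm{SE}} = \frac{s}{\sqrt{N}}, \qquad \hat{p} \pm 1.96\,\widehat{\mathrm{SE}}, \qquad N_{\text{required}} = \left(\frac{s}{\mathrm{SE}_{\text{target}}}\right)^{2}.$$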
Exercise B.2 --- Bootstrap Confidence Interval
A bettor's 400-bet track record at -110 odds yields a sample ROI of 3.8%. A bootstrap with 10,000 replicates produces the following statistics for the ROI distribution: mean = 3.72%, standard deviation = 2.95%, 2.5th percentile = -2.01%, 97.5th percentile = 9.65%.
(a) State the percentile bootstrap 95% confidence interval.
(b) Compute the bootstrap estimate of bias. Is the bias meaningful?
(c) Using the basic (pivotal) bootstrap method, compute the 95% confidence interval. Show why it differs from the percentile method.
(d) If the BCa method adjusts the percentiles to (2.8th, 97.8th) and the corresponding values are (-1.85%, 9.82%), state the BCa interval and explain why it differs from the percentile interval.
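Hint: writing $\hat\theta$ for the sample ROI, $\bar\theta^{*}$ for the mean of the bootstrap replicates, and $q_{\alpha}$ for the bootstrap percentiles, the bias estimate is $\bar\theta^{*} - \hat\theta$ and the basic (pivotal) interval is $\big(2\hat\theta - q_{0.975},\ 2\hat\theta - q_{0.025}\big)$.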
Exercise B.3 --- Permutation Test P-Value
You conduct a permutation test comparing the prediction accuracy (measured by Brier score) of two models across 300 games. The observed difference in mean Brier score is 0.012 (Model B is better). From 10,000 permutations, 412 produce a difference at least as large as 0.012.
(a) Compute the one-sided p-value. Is the improvement significant at the 5% level?
(b) Compute the two-sided p-value by counting permutations with absolute difference at least 0.012 (reported as 823 out of 10,000).
(c) What is the Monte Carlo standard error of the one-sided p-value estimate?
(d) If you wanted the p-value estimate to be accurate to within 0.005 with 95% confidence, how many permutations would you need?
Exercise B.4 --- Antithetic Variates Variance Reduction
For a simulation estimating the probability that an NBA team with a 0.55 win probability reaches 50 wins in an 82-game season:
(a) If the naive Monte Carlo estimate from 10,000 simulations has variance 0.0000248, what is the standard error?
(b) The antithetic variate estimator, computed as the mean of 5,000 pair averages (10,000 total simulations), has variance 0.0000112. What is the standard error?
(c) Compute the efficiency gain (variance reduction ratio).
(d) The correlation between the original and antithetic results is $\rho = -0.55$. Using the formula $\text{Var}(\text{AV}) = \frac{\sigma^2}{2N}(1 + \rho)$, where $\sigma^2$ is the per-simulation variance and $N$ is the number of antithetic pairs, verify that the observed variance reduction is consistent with this correlation.
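If you want to sanity-check part (d) empirically, the sketch below pairs each uniform draw with its antithetic complement for the 82-game problem; the seed and replication count are arbitrary, and the numbers it prints will not exactly reproduce those quoted above:

```python
import numpy as np

rng = np.random.default_rng(3)
p, games, target, n_pairs = 0.55, 82, 50, 5_000

u = rng.random((n_pairs, games))
wins = (u < p).sum(axis=1)              # original simulated seasons
wins_anti = ((1 - u) < p).sum(axis=1)   # antithetic seasons (negatively correlated)

hit = (wins >= target).astype(float)
hit_anti = (wins_anti >= target).astype(float)
pair_means = 0.5 * (hit + hit_anti)

print("antithetic estimate   :", pair_means.mean())
print("indicator correlation :", np.corrcoef(hit, hit_anti)[0, 1])
print("estimator variance    :", pair_means.var(ddof=1) / n_pairs)
```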
Exercise B.5 --- Control Variates Calculation
In a season simulation, you estimate the expected number of playoff teams from the Western Conference. The control variate is the total number of wins across all 15 Western Conference teams. Under the approximation of a closed, balanced conference schedule, every game produces exactly one win and one loss, so this total has a known expectation of $82 \times 15 / 2 = 615$.
(a) If the naive estimate has variance 2.56, the control variate has variance 145.0, and the covariance between the two is 12.8, compute the optimal coefficient $\beta^*$.
(b) Compute the correlation between the primary estimate and the control variate.
(c) Calculate the variance of the control variate estimator.
(d) What is the efficiency gain?
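Hint: for a primary estimator $X$ and a control variate $Y$ with known mean $\mu_Y$ and correlation $\rho_{XY}$ with $X$,
$$\beta^{*} = \frac{\operatorname{Cov}(X, Y)}{\operatorname{Var}(Y)}, \qquad \hat\theta_{\mathrm{CV}} = X - \beta^{*}(Y - \mu_Y), \qquad \operatorname{Var}(\hat\theta_{\mathrm{CV}}) = \operatorname{Var}(X)\,\big(1 - \rho_{XY}^{2}\big).$$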
Exercise B.6 --- Convergence Diagnostics
A simulation of March Madness brackets (10,000 replications) estimates the probability of a 1-seed winning the tournament. The running mean stabilizes as follows:
| Replications | Running Mean | Running SE |
|---|---|---|
| 1,000 | 0.232 | 0.0134 |
| 2,000 | 0.241 | 0.0096 |
| 5,000 | 0.238 | 0.0060 |
| 10,000 | 0.236 | 0.0042 |
(a) How does the running SE decrease from 1,000 to 10,000 replications? Is this consistent with the theoretical $O(1/\sqrt{N})$ rate?
(b) Construct a 95% confidence interval at each stage.
(c) At what point do the confidence intervals stabilize (i.e., they all overlap)?
(d) If the true probability is 0.235, are all four confidence intervals consistent with the truth?
Exercise B.7 --- Importance Sampling for Rare Events
You want to estimate the probability that a team with a true 0.40 win rate achieves a 10-game winning streak in a 16-game NFL season. Under the naive approach, this probability is approximately 0.0001.
(a) How many simulations would you need to get a standard error of 0.00002 using naive Monte Carlo?
(b) You propose an importance sampling distribution where the win probability is inflated to 0.70. Write the importance weight for a scenario where the team wins their first 10 games and loses the remaining 6.
(c) Compute the importance weight numerically for the scenario in (b).
(d) Explain qualitatively why importance sampling is effective for this problem but could be dangerous if you chose a proposal distribution that inflated the win probability to 0.99.
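Hint: with independent games, true win probability $p = 0.40$, and proposal win probability $q = 0.70$, the importance weight of a season outcome $x$ with $k$ wins and $16 - k$ losses is the likelihood ratio
$$L(x) = \frac{p^{k}(1-p)^{16-k}}{q^{k}(1-q)^{16-k}} = \left(\frac{0.40}{0.70}\right)^{k}\left(\frac{0.60}{0.30}\right)^{16-k}.$$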
Part C: Programming Challenges
Each problem is worth 10 points. Write clean, well-documented Python code. Include docstrings, type hints, and at least three test cases per function.
Exercise C.1 --- Complete Season Simulator with Playoff Brackets
Build a full NBA season simulator that includes both the regular season and the playoff bracket.
Requirements:
- Accept team power ratings for 30 NBA teams and a full 82-game schedule.
- Simulate game outcomes using normally distributed point margins with team-specific home-court advantages.
- At the end of the regular season, seed teams 1-8 in each conference based on win totals (with tiebreakers).
- Simulate a best-of-7 playoff series for each matchup, with home-court advantage to the higher seed.
- Track and report: win-total distributions, playoff probabilities, conference finals probabilities, and championship probabilities for all 30 teams.
- Run 10,000 simulations and display the results in a formatted table sorted by championship probability.
Exercise C.2 --- Comprehensive Bootstrap Analysis Suite
Build a BootstrapAnalyzer class that provides a complete bootstrap analysis toolkit for evaluating betting performance.
Requirements:
- Implement three CI methods: percentile, basic (pivotal), and BCa.
- Support the following statistics: ROI, win rate, Sharpe ratio, maximum drawdown, profit factor (gross wins / gross losses), and longest losing streak.
- Implement a paired bootstrap test for comparing two models or two bettors: given two sets of paired results, test whether the difference in a statistic is significant.
- Produce a formatted summary report showing point estimates, all three CI types, bootstrap bias, and the probability of the true statistic exceeding a threshold (e.g., P(ROI > 0)).
- Include a convergence diagnostic that shows how the BCa interval changes as the number of bootstrap replicates increases from 1,000 to 20,000.
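One possible starting point for the percentile method is sketched below; the method signature, internals, and the synthetic 400-bet record are suggested scaffolding only, not a prescribed interface. The basic, BCa, paired-test, and reporting features would build on the same resampled statistics.

```python
import numpy as np

class BootstrapAnalyzer:
    """Minimal scaffold: percentile CI for a statistic of per-bet profits."""

    def __init__(self, profits: np.ndarray, n_boot: int = 10_000, seed: int = 0):
        self.profits = np.asarray(profits, dtype=float)  # profit per 1-unit bet
        self.n_boot = n_boot
        self.rng = np.random.default_rng(seed)

    def percentile_ci(self, stat=np.mean, level: float = 0.95):
        n = len(self.profits)
        idx = self.rng.integers(0, n, size=(self.n_boot, n))  # resample indices
        boot_stats = np.array([stat(self.profits[row]) for row in idx])
        lo, hi = np.percentile(boot_stats, [(1 - level) / 2 * 100,
                                            (1 + level) / 2 * 100])
        return stat(self.profits), (lo, hi), boot_stats

# Example: fake 400-bet record at -110 (a win pays +10/11, a loss pays -1)
rng = np.random.default_rng(1)
profits = np.where(rng.random(400) < 0.54, 10 / 11, -1.0)
roi, ci, _ = BootstrapAnalyzer(profits).percentile_ci()
print(f"ROI {roi:.3%}, 95% percentile CI ({ci[0]:.3%}, {ci[1]:.3%})")
```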
Exercise C.3 --- Permutation Testing Framework
Build a flexible permutation testing framework that handles the common hypothesis testing scenarios in sports analytics.
Requirements:
- Implement the following test types: (1) two-sample test (compare two groups), (2) paired test (compare paired observations by sign-flipping), (3) independence test (permute one variable while holding the other fixed), and (4) trend test (permute temporal ordering).
- Support custom test statistics via a callable interface.
- Compute exact p-values when the total number of permutations is small enough (< 100,000), and approximate p-values via random sampling otherwise.
- Produce a results object that includes: observed statistic, p-value, permutation distribution (as an array), effect size (Cohen's d for two-sample), and a confidence interval for the p-value itself.
- Demonstrate all four test types on realistic sports data examples.
Exercise C.4 --- Variance Reduction Comparison Tool
Build a tool that applies multiple variance reduction techniques to the same simulation problem and produces a comprehensive comparison.
Requirements:
- Implement all four techniques from the chapter: antithetic variates, control variates, stratified sampling, and importance sampling.
- Apply all four to three different problems: (1) estimating the probability that a team with a 0.55 win probability achieves 50+ wins in 82 games, (2) estimating the expected profit of a Kelly-sized bet over 200 bets at -110 with a 54% win rate, and (3) estimating the probability of a 16-seed beating a 1-seed in a best-of-1 game simulated via a margin model.
- For each technique on each problem, report: point estimate, standard error, efficiency gain relative to naive MC, computation time, and effective sample size.
- Produce a summary table ranking the techniques by efficiency gain for each problem.
Exercise C.5 --- March Madness Bracket Optimizer
Build a March Madness bracket simulation and optimization tool that identifies contrarian bracket strategies for large bracket pools.
Requirements:
- Simulate a full 64-team bracket with team power ratings and historical seed-based upset rates.
- Run 100,000 bracket simulations to compute team advancement probabilities for all rounds.
- Given a bracket pool of $N$ participants, implement a scoring system (ESPN standard: 10-20-40-80-160-320 points per round) and simulate pool outcomes.
- Optimize bracket picks to maximize expected pool winnings (not just the probability of a perfect bracket). The key insight is that in large pools, picking some upsets that others will miss can maximize expected value.
- Compare the expected pool performance of (a) picking all favorites, (b) picking according to probability, and (c) the optimized contrarian strategy.
Part D: Analysis & Interpretation
Each problem is worth 5 points. Provide structured, well-reasoned responses.
Exercise D.1 --- Interpreting Season Simulation Output
A season simulation for the NFL produces the following results for the NFC North division (10,000 simulations):
| Team | Mean Wins | Std Wins | Playoff % | Division Win % | #1 Seed % |
|---|---|---|---|---|---|
| Lions | 11.8 | 2.1 | 87.3% | 52.1% | 14.2% |
| Packers | 10.4 | 2.3 | 68.5% | 28.7% | 6.1% |
| Vikings | 9.6 | 2.4 | 51.2% | 14.3% | 3.0% |
| Bears | 6.2 | 2.2 | 5.8% | 4.9% | 0.2% |
(a) The Lions have an 87.3% playoff probability. A futures market prices Lions to make the playoffs at -650 (implied 86.7%). Is there value? What factors would make you hesitant to bet?
(b) The Bears have a 4.9% chance to win the division. Explain why this is higher than you might expect for a 6.2-win team.
(c) The standard deviation of wins is approximately 2.1-2.4 for all teams. What drives this variability, and what does it imply about the reliability of pre-season win total projections?
(d) How would you use this simulation output to evaluate the market for Bears over/under 6.5 wins?
(e) A bettor argues that simulations are unreliable because they assume static team ratings. Evaluate this criticism and propose an approach that addresses it.
Exercise D.2 --- Bootstrap Results for a Bettor's Track Record
A bettor presents a bootstrap analysis of their 600-bet track record at -110 odds:
| Metric | Point Estimate | BCa 95% CI | P(metric > threshold) |
|---|---|---|---|
| Win Rate | 54.2% | (50.3%, 57.8%) | P(> 52.4%) = 72.3% |
| ROI | +3.6% | (-2.8%, +10.1%) | P(> 0%) = 86.4% |
| Sharpe Ratio | 0.052 | (-0.038, 0.143) | P(> 0) = 86.8% |
| Max Drawdown | $2,840 | ($1,420, $4,650) | --- |
(a) The 95% CI for ROI spans from -2.8% to +10.1%, which includes zero. Does this mean the bettor has no skill? Explain what the interval tells us and what it does not.
(b) The probability that the true win rate exceeds the breakeven point (52.4%) is 72.3%. Is this sufficient evidence of genuine skill? What probability would you require before concluding the bettor is skilled?
(c) The maximum drawdown CI ranges from $1,420 to $4,650. How should the bettor use this information for bankroll management?
(d) If the bettor continues for another 400 bets at the same true skill level, how would you expect the BCa intervals to change? Quantify the expected reduction in CI width.
Exercise D.3 --- Evaluating a Permutation Test Result
A researcher tests whether NBA teams perform significantly worse in the second game of a back-to-back compared to games with normal rest. The permutation test yields:
- Normal rest games: mean margin = +1.8 points (n = 800)
- Back-to-back games: mean margin = -0.6 points (n = 200)
- Observed difference: 2.4 points
- Permutation p-value (one-sided): 0.0023
- Cohen's d: 0.18
(a) Interpret the p-value and Cohen's d together. Is the effect statistically significant? Is it practically meaningful for betting?
(b) A 2.4-point effect from back-to-back status seems large. What confounding factors might inflate this estimate?
(c) If the sportsbook already adjusts lines by 1.5 points for back-to-back teams, is the remaining 0.9-point edge exploitable? What additional information would you need?
(d) Why is the permutation test more appropriate here than a standard t-test?
Exercise D.4 --- Comparing Variance Reduction Techniques
A modeler estimates the probability of a 7-seed upsetting a 2-seed in the first round of the NBA playoffs (best-of-7) using several techniques:
| Method | Estimate | SE | Efficiency Gain | Computation Time |
|---|---|---|---|---|
| Naive MC (N=100,000) | 0.1823 | 0.00122 | 1.0x | 12.4 sec |
| Antithetic (N=100,000) | 0.1831 | 0.00089 | 1.88x | 13.1 sec |
| Control Variate | 0.1827 | 0.00041 | 8.85x | 14.2 sec |
| Stratified | 0.1829 | 0.00068 | 3.22x | 15.8 sec |
| Importance Sampling | 0.1825 | 0.00033 | 13.67x | 18.5 sec |
(a) All estimates agree to within about 0.001. What does this consistency tell you about the reliability of each method?
(b) The control variate method achieves an 8.85x efficiency gain. Speculate on what the control variate might be in this playoff simulation context.
(c) Importance sampling achieves the highest efficiency gain but takes the most time. When is the computational overhead of importance sampling justified?
(d) If you needed to estimate this probability to within 0.001 with 95% confidence, how many simulations would each method require? Which method would you choose, considering both accuracy and computation time?
Exercise D.5 --- Simulation Pitfalls
A sports analytics firm simulates the upcoming Premier League season and publishes the following championship probabilities: Manchester City 42%, Arsenal 28%, Liverpool 18%, all others combined 12%. A critic raises several concerns. Evaluate each:
(a) "The simulation assumes team strength is fixed throughout the season, but transfers, injuries, and form changes mean strength varies." How would you quantify the impact of this assumption on the championship probabilities?
(b) "The simulation uses only 1,000 replications, which is not enough for reliable probability estimates." Calculate the standard error of the Manchester City championship probability estimate with 1,000 replications and assess whether the concern is valid.
(c) "The simulation uses normally distributed margins, but football (soccer) scores are low and discrete, so this is a poor model." Evaluate this criticism and propose an alternative approach for simulating individual match outcomes.
(d) "The probabilities should sum to more than 100% because there is uncertainty in the model." Explain why this is incorrect and what the critic might be confusing.
Part E: Research & Extension
Each problem is worth 5 points. These require independent research beyond Chapter 24. Cite all sources.
Exercise E.1 --- History of Monte Carlo Methods
Research and write a brief essay (500-700 words) tracing the history of Monte Carlo methods from their origins in the 1940s Manhattan Project (Ulam, von Neumann, Metropolis) to their modern application in sports analytics. Cover (a) the original problem that motivated Monte Carlo simulation, (b) key milestones in the development of the method, (c) the first published applications to sports or gambling, (d) how computational advances have expanded what is possible, and (e) current state-of-the-art applications in professional sports betting operations.
Exercise E.2 --- Markov Chain Monte Carlo (MCMC) in Sports
Research how MCMC methods (e.g., Metropolis-Hastings, Gibbs sampling, Hamiltonian Monte Carlo) are used in sports analytics. Find at least two published examples where MCMC was applied to a sports modeling problem. For each example, report (a) the sport and research question, (b) the specific MCMC algorithm used, (c) why standard Monte Carlo was insufficient, (d) any convergence diagnostics reported, and (e) the practical implications of the findings.
Exercise E.3 --- Quasi-Monte Carlo Methods
Research quasi-Monte Carlo (QMC) methods and low-discrepancy sequences (Halton, Sobol, Latin hypercube). Write a 400-600 word summary explaining (a) how QMC differs from standard Monte Carlo, (b) the theoretical convergence rate advantage of QMC, (c) when QMC is most effective (dimensionality considerations), (d) a concrete example of how QMC could improve a sports simulation, and (e) limitations of QMC compared to standard MC.
Exercise E.4 --- The Bootstrap in Published Sports Research
Find three published academic papers or industry reports that use bootstrap methods in sports analytics. For each, summarize (a) the research question, (b) the specific bootstrap method used (non-parametric, parametric, residual, wild, etc.), (c) the statistic for which confidence intervals were constructed, (d) the number of bootstrap replicates used, and (e) the key findings and whether the bootstrap changed the conclusions relative to parametric alternatives.
Exercise E.5 --- Simulation-Based Inference in Live Betting
Research how simulation methods are used in live (in-play) sports betting markets. Address (a) the unique computational challenges of running simulations in real time during a game, (b) how game-state information (score, time remaining, possession) is incorporated into live simulations, (c) published examples of live-game simulation models (e.g., win probability models), (d) the trade-off between simulation accuracy and computation speed in live markets, and (e) emerging approaches (e.g., neural network surrogates for simulation) that address the speed constraint.
Scoring Guide
| Part | Problems | Points Each | Total Points |
|---|---|---|---|
| A: Conceptual Understanding | 8 | 5 | 40 |
| B: Calculations | 7 | 5 | 35 |
| C: Programming Challenges | 5 | 10 | 50 |
| D: Analysis & Interpretation | 5 | 5 | 25 |
| E: Research & Extension | 5 | 5 | 25 |
| Total | 30 | --- | 175 |
Grading Criteria
Part A (Conceptual): Full credit requires clear, accurate explanations that demonstrate understanding of the underlying simulation concepts and their relevance to sports betting. Partial credit for incomplete but correct reasoning.
Part B (Calculations): Full credit requires correct final answers with all work shown. Partial credit for correct methodology with arithmetic errors.
Part C (Programming): Graded on correctness (40%), code quality and documentation (30%), and test coverage (30%). Code must execute without errors.
Part D (Analysis): Graded on analytical depth, logical reasoning, and appropriate application of simulation concepts to real-world betting scenarios. Multiple valid approaches may exist.
Part E (Research): Graded on research quality, source credibility, analytical depth, and clear writing. Minimum source requirements specified per problem.
Solutions: Complete worked solutions for all exercises are available in code/exercise-solutions.py. For programming challenges, reference implementations are provided in the code/ directory.