Chapter 41 Exercises: Putting It All Together
Instructions: Complete all exercises in the parts assigned by your instructor. Show all work for calculation problems. For programming challenges, include comments explaining your logic and provide sample output. These exercises integrate concepts from the entire textbook; you may need to refer to earlier chapters.
Part A: Conceptual Understanding
Each problem is worth 5 points. Answer in complete sentences unless otherwise directed.
Exercise A.1 --- The Eight-Stage Workflow
List all eight stages of the complete betting workflow described in Section 41.1. For each stage, identify (a) the primary input, (b) the primary output, and (c) one quality check that should be performed before moving to the next stage.
Exercise A.2 --- Portfolio Diversification Rationale
Explain why treating a collection of sports bets as a diversified portfolio reduces the variance of returns even when individual bet outcomes are binary (win or lose). Reference the law of large numbers and the role of correlation in your answer.
Exercise A.3 --- Ensemble vs. Single Model
A bettor has two models: Model A has a Brier score of 0.22 and Model B has a Brier score of 0.25. Explain why a weighted ensemble of these two models might achieve a lower Brier score than either model individually. Under what conditions would the ensemble fail to improve upon the better individual model?
Exercise A.4 --- Performance Attribution Dimensions
Describe at least five dimensions along which betting performance can be attributed (e.g., by sport, by strategy). For each dimension, give an example of an actionable insight that attribution along that dimension could reveal.
Exercise A.5 --- Edge Decay Mechanisms
Identify and describe three distinct mechanisms by which a betting edge can decay over time. For each mechanism, suggest one monitoring metric that would provide early warning of edge decay.
Exercise A.6 --- Closing Line Value
Why is closing line value (CLV) considered the single most important diagnostic for long-term betting sustainability? Explain the relationship between consistently beating the closing line and long-term profitability, and describe a scenario where positive CLV could coexist with negative short-term P&L.
Exercise A.7 --- Scaling Decisions
A bettor has placed 600 bets over eight months with a 3.2% ROI and a 54% rate of beating the closing line. Should the bettor scale up bet sizes? Discuss at least three factors that should inform this decision beyond the headline ROI number.
Exercise A.8 --- Operational Discipline
The chapter argues that "the most common reason profitable bettors fail in the long run is not model error --- it is process failure." Provide three concrete examples of process failure and explain how each undermines long-term profitability.
Part B: Calculations and Short Problems
Each problem is worth 5 points. Show all work.
Exercise B.1 --- Risk Budget Allocation
A bettor has a \$20,000 bankroll and the following risk budget:
| Sport | Strategy | Allocation (%) | Max Per Bet (%) |
|---|---|---|---|
| NFL | Sides model | 15% | 2% |
| NBA | Totals model | 12% | 1.5% |
| MLB | Moneyline model | 10% | 1.5% |
| Soccer | xG model | 8% | 1% |
(a) Calculate the dollar allocation and maximum per-bet size for each sport/strategy.
(b) The bettor currently has \$2,400 in NFL sides exposure, \$1,800 in NBA totals exposure, and \$0 in MLB and soccer. The NBA totals model generates a signal to bet \$350 on an NBA total. Can the bet be placed? Show your work.
(c) If the bettor wants to add a \$500 NFL sides bet, can it be placed under the risk budget constraints? Consider both the strategy-level and portfolio-level (25% of bankroll) limits.
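Hint: the three checks in parts (b) and (c) can be verified with a few lines of code. The sketch below is not the RiskBudget class from Section 41.2; the `Bucket` structure, the `can_place` function, and the sport names are hypothetical, and the numbers are deliberately different from the exercise so that the hand calculation is still yours to do.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    """One sport/strategy line of a risk budget (hypothetical structure)."""
    allocation_pct: float   # share of bankroll this bucket may hold as open exposure
    max_bet_pct: float      # largest single stake, as a share of bankroll
    exposure: float = 0.0   # dollars currently at risk in this bucket


def can_place(bankroll, buckets, name, stake, portfolio_cap_pct=0.25):
    """Return (ok, reason) after checking per-bet, bucket, and portfolio limits."""
    b = buckets[name]
    if stake > b.max_bet_pct * bankroll:
        return False, "exceeds max per-bet size"
    if b.exposure + stake > b.allocation_pct * bankroll:
        return False, "exceeds bucket allocation"
    total = sum(x.exposure for x in buckets.values())
    if total + stake > portfolio_cap_pct * bankroll:
        return False, "exceeds portfolio exposure limit"
    return True, "ok"


# Illustrative numbers only; these are not the values from the exercise.
buckets = {
    "NHL/puck line": Bucket(0.10, 0.02, exposure=800.0),
    "Tennis/match": Bucket(0.05, 0.01, exposure=0.0),
}
print(can_place(10_000, buckets, "NHL/puck line", 150))   # within every limit
print(can_place(10_000, buckets, "NHL/puck line", 250))   # rejected: above the per-bet cap
```

The function reports only the first limit that fails, so a written answer should still state every constraint that was checked.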
Exercise B.2 --- Ensemble Weighting
Three models have the following recent Brier scores over the last 100 predictions:
| Model | Brier Score |
|---|---|
| Logistic Regression | 0.210 |
| Random Forest | 0.195 |
| Neural Network | 0.230 |
(a) Calculate the inverse-Brier weights for each model.
(b) If the three models predict home win probabilities of 0.58, 0.62, and 0.55 respectively, what is the weighted ensemble probability?
(c) The no-vig market-implied probability is 0.54. What is the estimated edge using the ensemble probability?
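Hint: one common reading of inverse-Brier weighting gives each model a weight proportional to 1 divided by its Brier score, normalized so the weights sum to 1. The sketch below assumes that definition (check it against the chapter's) and uses a two-model illustration rather than the three models above; the function names are hypothetical.

```python
def inverse_brier_weights(brier_scores):
    """Weight each model by 1/Brier, normalized so the weights sum to 1."""
    inverse = [1.0 / b for b in brier_scores]
    total = sum(inverse)
    return [v / total for v in inverse]


def ensemble_probability(weights, probabilities):
    """Weighted average of the individual model probabilities."""
    return sum(w * p for w, p in zip(weights, probabilities))


# Illustrative two-model case (not the exercise's numbers).
weights = inverse_brier_weights([0.20, 0.25])
p_ens = ensemble_probability(weights, [0.60, 0.56])
edge = p_ens - 0.55  # edge taken as ensemble probability minus no-vig market probability
print([round(w, 4) for w in weights], round(p_ens, 4), round(edge, 4))
```

Because the ensemble probability is a weighted average, it always lies between the lowest and highest individual prediction, which is a quick sanity check on a hand calculation.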
Exercise B.3 --- Performance Metrics
A bettor's record for the month shows:
| Metric | Value |
|---|---|
| Total bets | 85 |
| Wins | 46 |
| Losses | 38 |
| Pushes | 1 |
| Total staked | \$8,500 |
| Total P&L | +\$412 |
| Average odds | -108 |
(a) Calculate the win rate (excluding pushes), ROI, and average profit per bet.
(b) The bettor's cumulative P&L curve peaked at +\$780 at bet 62, then declined to +\$412 by bet 85. What is the maximum drawdown in dollars and as a percentage of the peak?
(c) Suppose daily P&L has a mean of \$14.80 and a standard deviation of \$52.30. Calculate the annualized Sharpe-like ratio assuming 300 betting days per year.
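Hint: parts (b) and (c) can be cross-checked with the sketch below. It assumes that drawdown is measured from the running peak of the cumulative P&L curve, and that the Sharpe-like ratio uses a zero benchmark with square-root-of-time scaling (which presumes roughly independent daily results). The function names and the sample series are illustrative, not taken from the chapter.

```python
import math
from itertools import accumulate


def max_drawdown(pnl_per_bet):
    """Largest peak-to-trough drop of the cumulative P&L curve, in dollars."""
    peak = 0.0          # the curve starts at zero before the first bet
    worst = 0.0
    for equity in accumulate(pnl_per_bet):
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst


def annualized_sharpe(mean_daily, std_daily, days_per_year):
    """Sharpe-like ratio: daily mean over daily std, scaled by sqrt(days)."""
    return mean_daily / std_daily * math.sqrt(days_per_year)


# Illustrative series only: cumulative curve is 100, 50, 170, 60, 20, 110.
pnl = [100, -50, 120, -110, -40, 90]
print(max_drawdown(pnl))                              # peak 170, trough 20
print(round(annualized_sharpe(10.0, 40.0, 256), 2))   # 0.25 * sqrt(256)
```

To express the drawdown as a percentage of the peak, divide the dollar drawdown by the peak value of the cumulative curve, not by the bankroll or the total staked.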
Exercise B.4 --- Consensus Pricing
A bettor's quantitative model estimates a 0.60 probability for Team A to win. The no-vig market-implied probability is 0.55. The bettor's qualitative assessment, based on a recent coaching change, adjusts the probability upward by 0.03.
Using weights of 55% for the quantitative model, 35% for the market, and 10% for the qualitative adjustment, calculate:
(a) The consensus probability estimate.
(b) The expected value of a \$100 bet at decimal odds of 1.80 (implied probability 0.556).
(c) The Kelly fraction for this bet given the consensus probability and decimal odds of 1.80.
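Hint: for parts (b) and (c), the standard expressions at decimal odds d are EV per unit staked = p·d − 1 and full-Kelly fraction = (p·d − 1)/(d − 1). The sketch below implements those two formulas with hypothetical function names and illustrative inputs; how the three weighted components combine into the consensus probability is left to the chapter's definition.

```python
def expected_value(p_win, decimal_odds, stake=1.0):
    """Expected profit: win (odds - 1) * stake with probability p, lose the stake otherwise."""
    return stake * (p_win * (decimal_odds - 1.0) - (1.0 - p_win))


def kelly_fraction(p_win, decimal_odds):
    """Full-Kelly share of bankroll; zero when the bet has no positive edge."""
    b = decimal_odds - 1.0  # net payout per unit staked
    return max(0.0, (p_win * b - (1.0 - p_win)) / b)


# Illustrative values, not those of the exercise.
print(round(expected_value(0.55, 2.00, stake=100), 2))
print(round(kelly_fraction(0.55, 2.00), 4))
```

Note that the Kelly fraction equals the EV per unit staked divided by the net payout, so at odds shorter than even money the fraction is larger than the edge itself.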
Exercise B.5 --- Portfolio Variance
A bettor places three simultaneous bets with the following characteristics:
| Bet | Stake Weight | Std Dev of Return | Correlation with Bet 1 | Correlation with Bet 2 | Correlation with Bet 3 |
|---|---|---|---|---|---|
| 1 | 0.40 | 0.95 | 1.00 | 0.15 | 0.05 |
| 2 | 0.35 | 0.90 | 0.15 | 1.00 | 0.05 |
| 3 | 0.25 | 0.92 | 0.05 | 0.05 | 1.00 |
(a) Calculate the portfolio variance using the formula from Section 41.2.
(b) Compare this to the variance of a single bet placed at the average standard deviation. By what percentage is the portfolio variance lower?
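Hint: the formula from Section 41.2 is not reproduced here, but the standard expression for the variance of a weighted combination of returns is the double sum of w_i · w_j · ρ_ij · σ_i · σ_j over all pairs of bets, where the correlation matrix is symmetric with ones on the diagonal. The sketch below assumes that form and uses a two-bet illustration; `portfolio_variance` is a hypothetical helper.

```python
def portfolio_variance(weights, std_devs, corr):
    """Variance of a weighted sum of returns: sum_i sum_j w_i w_j rho_ij s_i s_j."""
    n = len(weights)
    return sum(
        weights[i] * weights[j] * corr[i][j] * std_devs[i] * std_devs[j]
        for i in range(n)
        for j in range(n)
    )


# Two-bet illustration with equal weights and a shared standard deviation.
w = [0.5, 0.5]
s = [1.0, 1.0]
rho = [[1.0, 0.2],
       [0.2, 1.0]]
print(portfolio_variance(w, s, rho))
```

In the illustration, a single bet with standard deviation 1.0 has variance 1.0, so comparing that figure with the printed portfolio variance shows the size of the diversification benefit.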
Exercise B.6 --- Signal Filtering
A model generates the following signals for tonight's games:
| Game | Model Prob | Market Prob (no-vig) | Side | Odds |
|---|---|---|---|---|
| Game A | 0.58 | 0.54 | Home | -130 |
| Game B | 0.52 | 0.50 | Away | +105 |
| Game C | 0.61 | 0.55 | Home | -140 |
| Game D | 0.49 | 0.48 | Away | +120 |
| Game E | 0.55 | 0.52 | Home | -115 |
Using a minimum edge threshold of 3% and a minimum model confidence of 55%, which bets pass the filter? For each passing bet, calculate the expected value per dollar wagered.
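Hint: the filter and the EV calculation can be checked with the sketch below. Two assumptions are mine rather than the chapter's: edge is taken as model probability minus no-vig market probability, and both thresholds are treated as inclusive (a bet exactly at a minimum passes). The small tolerance guards against floating-point noise when a value sits exactly on a threshold. All function names are hypothetical.

```python
def american_to_decimal(american):
    """Convert American odds to decimal odds."""
    if american > 0:
        return 1.0 + american / 100.0
    return 1.0 + 100.0 / abs(american)


def ev_per_dollar(p_win, american):
    """Expected profit per dollar staked at the given American odds."""
    return p_win * american_to_decimal(american) - 1.0


def passes_filter(model_prob, market_prob, min_edge=0.03, min_conf=0.55, tol=1e-9):
    """True when both the edge and the model probability meet their minimums."""
    edge = model_prob - market_prob
    return edge >= min_edge - tol and model_prob >= min_conf - tol


# Illustrative signal, not one of the games in the table.
print(passes_filter(0.60, 0.55), round(ev_per_dollar(0.60, -120), 4))
```

If your instructor treats the thresholds as strict inequalities, any game that sits exactly on a minimum would be excluded, so state which convention you used.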
Part C: Programming Challenges
Each problem is worth 10 points. Include working Python code with comments and sample output.
Exercise C.1 --- Complete Data Pipeline
Implement a BettingDataPipeline for a sport of your choice. Your pipeline should:
(a) Generate or load synthetic data representing at least two data sources (e.g., game results and market odds).
(b) Implement a cleaning function that handles missing values, removes duplicates, and validates date ranges.
(c) Merge the sources on appropriate keys.
(d) Run the validate_data method and print a data quality report.
Include at least 200 rows of synthetic data and demonstrate the full pipeline from collection to validation.
Exercise C.2 --- Dynamic Risk Budget System
Extend the RiskBudget class from Section 41.2 with the following enhancements:
(a) Add a rebalance method that adjusts allocations proportionally when the bankroll changes (grows or shrinks).
(b) Add a utilization_alert method that prints a warning when any sport/strategy exceeds 80% of its allocation.
(c) Add a daily_exposure_limit parameter that caps the total new exposure added in a single day.
(d) Demonstrate the enhanced system with a sequence of at least 20 simulated bets, showing rebalancing after bankroll changes.
Exercise C.3 --- Ensemble Model with Walk-Forward Evaluation
Build an ensemble predictor that:
(a) Trains at least two model types (e.g., logistic regression and gradient boosting) on synthetic or real sports data.
(b) Implements inverse-Brier weighting with a configurable evaluation window.
(c) Performs walk-forward evaluation: train on the first 70% of data, then predict one game at a time, updating weights after each prediction.
(d) Compares the ensemble's Brier score to each individual model's score over the walk-forward period.
Print a summary table showing individual and ensemble Brier scores, and plot the cumulative Brier score over time if matplotlib is available.
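Hint: the part that is easiest to get wrong in a walk-forward loop is the ordering: predict first, then score, then update the weights, so that no prediction uses its own outcome. The sketch below shows only that weighting loop, using precomputed probabilities in place of trained models. `RollingBrierEnsemble` is a hypothetical class, and the equal-weight fallback before any game has been scored is an assumption of mine.

```python
from collections import deque


class RollingBrierEnsemble:
    """Blend model probabilities with inverse-Brier weights over a trailing window."""

    def __init__(self, n_models, window=100):
        self.errors = [deque(maxlen=window) for _ in range(n_models)]

    def weights(self):
        # Equal weights until every model has at least one scored prediction.
        if any(len(e) == 0 for e in self.errors):
            return [1.0 / len(self.errors)] * len(self.errors)
        briers = [sum(e) / len(e) for e in self.errors]
        inverse = [1.0 / max(b, 1e-9) for b in briers]  # guard against a perfect score
        total = sum(inverse)
        return [v / total for v in inverse]

    def predict(self, probs):
        return sum(w * p for w, p in zip(self.weights(), probs))

    def update(self, probs, outcome):
        # Record each model's squared error once the game has settled.
        for e, p in zip(self.errors, probs):
            e.append((p - outcome) ** 2)


# Each item: (probabilities from model 1 and model 2, outcome as 1 or 0).
ens = RollingBrierEnsemble(n_models=2, window=50)
stream = [([0.7, 0.6], 1), ([0.4, 0.5], 0), ([0.8, 0.6], 1)]
for probs, outcome in stream:
    p = ens.predict(probs)      # predict before the result is known
    ens.update(probs, outcome)  # then score and update the weights
print([round(w, 3) for w in ens.weights()])
```

In a full solution, `probs` would come from models that are themselves refit on a schedule, and the `window` argument is the configurable evaluation window asked for in part (b).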
Exercise C.4 --- Full Performance Attribution Report
Using simulated bet data (at least 500 bets across 3 sports and 4 strategies over 6 months):
(a) Generate realistic bet data including dates, sports, strategies, odds, stakes, results, model probabilities, and edges.
(b) Instantiate the PerformanceAttribution class and generate the by_sport, by_strategy, by_time_period, and edge_analysis reports.
(c) Calculate the portfolio Sharpe ratio and maximum drawdown.
(d) Write a narrative summary (as comments in code or printed output) interpreting the attribution results and recommending at least two actionable changes.
Exercise C.5 --- Automated Betting Decision Engine
Build a class BettingDecisionEngine that integrates the signal generator, risk budget, and ensemble predictor into a single decision pipeline:
(a) Accept a set of upcoming games with features and market odds.
(b) Generate ensemble predictions and calculate edges.
(c) Filter signals by minimum edge and confidence thresholds.
(d) Size bets using a fractional Kelly criterion.
(e) Check each bet against the risk budget before approving.
(f) Output a structured bet sheet (DataFrame) with approved bets, including game, side, probability, edge, stake, and sportsbook recommendation.
Demonstrate the engine with at least 10 simulated upcoming games.
Part D: Analysis and Interpretation
Each problem is worth 10 points. Write clear, structured analyses.
Exercise D.1 --- Risk Budget Design
Design a complete risk budget for a hypothetical bettor with a \$15,000 bankroll who wants to bet on NFL, NBA, and tennis. The bettor has a strong NFL model (2 years of track record, 4% ROI), a newer NBA model (6 months, 2.5% ROI), and is just starting to develop a tennis model with no track record.
Specify: (a) sport-level allocations with rationale, (b) strategy-level allocations within each sport, (c) maximum per-bet sizes, (d) total portfolio exposure limit, and (e) criteria for adjusting allocations over time.
Exercise D.2 --- Model Disagreement Analysis
Two models produce the following predictions for five upcoming NBA games:
| Game | Model A (Prob Home Win) | Model B (Prob Home Win) |
|---|---|---|
| 1 | 0.62 | 0.64 |
| 2 | 0.58 | 0.42 |
| 3 | 0.71 | 0.68 |
| 4 | 0.45 | 0.55 |
| 5 | 0.53 | 0.51 |
For each game, calculate the absolute disagreement between the two models. Classify each game as "high agreement" (disagreement below 5 percentage points), "moderate disagreement" (5 to 10 percentage points), or "high disagreement" (above 10 percentage points). For each category, recommend a betting action (full position, reduced position, or pass) and explain your reasoning.
Exercise D.3 --- Operations Calendar
Design a comprehensive operations calendar for a multi-sport bettor covering daily, weekly, monthly, and quarterly activities. For each time interval, specify:
(a) The specific tasks to be performed.
(b) The metrics to review.
(c) Decision criteria for action (e.g., "If drawdown exceeds 15% of bankroll, reduce all bet sizes by 50%").
(d) Estimated time commitment.
Part E: Integration and Synthesis
Each problem is worth 10 points. These problems require synthesizing concepts from multiple chapters.
Exercise E.1 --- End-to-End System Design
Design (on paper or in pseudocode) a complete betting system for a single sport of your choice. Your design should include:
(a) Data sources and collection schedule.
(b) Feature engineering pipeline with at least 8 specific features.
(c) Model architecture (type, training schedule, validation approach).
(d) Signal generation and filtering criteria.
(e) Bet sizing methodology.
(f) Execution plan (sportsbook selection, timing, record-keeping).
(g) Performance review schedule and key metrics.
(h) Criteria for scaling up, scaling down, or retiring the strategy.
This is a design document, not an implementation. Focus on completeness and coherence.
Exercise E.2 --- Backtesting Framework
Implement a backtesting framework that simulates the complete betting workflow on historical data:
(a) Walk-forward model training and prediction.
(b) Signal generation with configurable edge thresholds.
(c) Kelly-based bet sizing with a configurable Kelly fraction.
(d) Risk budget enforcement.
(e) Bet settlement and P&L tracking.
Run the backtest on at least 200 simulated games and produce a report including total P&L, ROI, max drawdown, Sharpe ratio, and a cumulative P&L curve.
Exercise E.3 --- Market Adaptation Simulation
Simulate a scenario where a bettor's edge decays over time:
(a) Generate 1,000 bets where the true edge starts at 5% and linearly decays to 0% over the sample.
(b) Implement a monitoring system that tracks rolling ROI, rolling CLV, and rolling edge estimate over windows of 50 and 100 bets.
(c) Determine at what point the monitoring system would detect the edge decay (i.e., when the rolling metrics cross a predefined threshold).
(d) Compare two strategies: (i) maintaining constant bet sizes and (ii) dynamically reducing bet sizes as the rolling edge estimate decreases. Which strategy preserves more bankroll?
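Hint: a starting point for parts (a) through (c) is sketched below. It makes simplifying assumptions that you may want to relax: every bet is a unit stake at decimal odds of 2.0, edge means expected profit per unit staked, and only rolling ROI is monitored (rolling CLV and the rolling edge estimate would follow the same pattern). The function names, the seed, and the zero threshold are illustrative choices.

```python
import random
from collections import deque


def simulate_decaying_edge(n_bets=1000, start_edge=0.05, decimal_odds=2.0, seed=0):
    """Unit-stake bets whose true edge (EV per unit staked) falls linearly to zero."""
    rng = random.Random(seed)
    profits = []
    for i in range(n_bets):
        edge = start_edge * (1.0 - i / (n_bets - 1))
        p_win = (1.0 + edge) / decimal_odds   # solves p * odds - 1 = edge
        won = rng.random() < p_win
        profits.append(decimal_odds - 1.0 if won else -1.0)
    return profits


def rolling_roi(profits, window):
    """ROI over the trailing `window` unit-stake bets; None until the window fills."""
    buf = deque(maxlen=window)
    out = []
    for p in profits:
        buf.append(p)
        out.append(sum(buf) / window if len(buf) == window else None)
    return out


def first_alert(series, threshold):
    """Index of the first filled window whose value drops below `threshold`."""
    for i, v in enumerate(series):
        if v is not None and v < threshold:
            return i
    return None


profits = simulate_decaying_edge()
for window in (50, 100):
    roi = rolling_roi(profits, window)
    print(window, first_alert(roi, threshold=0.0))
```

At even-money odds the ROI of a 50-bet window has a standard deviation of roughly 14 percentage points, so a naive threshold will often fire long before the edge has actually decayed. Distinguishing false alarms from real decay is the substance of part (c).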
Exercise E.4 --- Multi-Sport Portfolio Optimization
Build a portfolio optimization system that:
(a) Takes as input expected edges and variances for bets across 3 sports.
(b) Calculates optimal Kelly allocations for each bet.
(c) Applies portfolio-level constraints (maximum total exposure, maximum per-sport exposure, correlation adjustments).
(d) Produces an optimized bet allocation that maximizes expected log-wealth growth subject to the constraints.
Test with at least 15 simultaneous betting opportunities and compare the constrained portfolio to the unconstrained Kelly allocation.
Exercise E.5 --- Process Quality Audit
Design and implement a "process quality audit" tool that evaluates the health of a betting operation based on its historical bet data. The tool should assess:
(a) Data quality: completeness of bet records, consistency of metadata.
(b) Discipline: adherence to bet sizing rules (no bet exceeds Kelly or risk budget limits).
(c) Execution quality: average slippage between target and actual odds.
(d) Timeliness: lag between signal generation and bet placement.
(e) Review compliance: evidence of regular performance review (e.g., annotations or adjustments in the data).
Output a "health score" from 0 to 100 with category-level subscores and specific recommendations.