Chapter 41 Exercises: Putting It All Together

Instructions: Complete all exercises in the parts assigned by your instructor. Show all work for calculation problems. For programming challenges, include comments explaining your logic and provide sample output. These exercises integrate concepts from the entire textbook; you may need to refer to earlier chapters.


Part A: Conceptual Understanding

Each problem is worth 5 points. Answer in complete sentences unless otherwise directed.


Exercise A.1 --- The Eight-Stage Workflow

List all eight stages of the complete betting workflow described in Section 41.1. For each stage, identify (a) the primary input, (b) the primary output, and (c) one quality check that should be performed before moving to the next stage.


Exercise A.2 --- Portfolio Diversification Rationale

Explain why treating a collection of sports bets as a diversified portfolio reduces the variance of returns even when individual bet outcomes are binary (win or lose). Reference the law of large numbers and the role of correlation in your answer.


Exercise A.3 --- Ensemble vs. Single Model

A bettor has two models: Model A has a Brier score of 0.22 and Model B has a Brier score of 0.25. Explain why a weighted ensemble of these two models might achieve a lower Brier score than either model individually. Under what conditions would the ensemble fail to improve upon the better individual model?


Exercise A.4 --- Performance Attribution Dimensions

Describe at least five dimensions along which betting performance can be attributed (e.g., by sport, by strategy). For each dimension, give an example of an actionable insight that attribution along that dimension could reveal.


Exercise A.5 --- Edge Decay Mechanisms

Identify and describe three distinct mechanisms by which a betting edge can decay over time. For each mechanism, suggest one monitoring metric that would provide early warning of edge decay.


Exercise A.6 --- Closing Line Value

Why is closing line value (CLV) considered the single most important diagnostic for long-term betting sustainability? Explain the relationship between consistently beating the closing line and long-term profitability, and describe a scenario where positive CLV could coexist with negative short-term P&L.


Exercise A.7 --- Scaling Decisions

A bettor has placed 600 bets over eight months with a 3.2% ROI and a 54% rate of beating the closing line. Should the bettor scale up bet sizes? Discuss at least three factors that should inform this decision beyond the headline ROI number.


Exercise A.8 --- Operational Discipline

The chapter argues that "the most common reason profitable bettors fail in the long run is not model error --- it is process failure." Provide three concrete examples of process failure and explain how each undermines long-term profitability.


Part B: Calculations and Short Problems

Each problem is worth 5 points. Show all work.


Exercise B.1 --- Risk Budget Allocation

A bettor has a \$20,000 bankroll and the following risk budget:

| Sport | Strategy | Allocation (%) | Max Per Bet (%) |
|---|---|---|---|
| NFL | Sides model | 15% | 2% |
| NBA | Totals model | 12% | 1.5% |
| MLB | Moneyline model | 10% | 1.5% |
| Soccer | xG model | 8% | 1% |

(a) Calculate the dollar allocation and maximum per-bet size for each sport/strategy.

(b) The bettor currently has \$2,400 in NFL sides exposure, \$1,800 in NBA totals exposure, and \$0 in MLB and soccer. The NBA totals model generates a signal to bet \$350 on an NBA total. Can the bet be placed? Show your work.

(c) If the bettor wants to add a \$500 NFL sides bet, can it be placed under the risk budget constraints? Consider both the strategy-level and portfolio-level (25% of bankroll) limits.
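
The layered check in parts (b) and (c) can be sketched as follows, assuming a bet must clear three limits at once: the per-bet cap, the strategy allocation, and the total portfolio cap. The function `can_place_bet`, its parameters, and the numbers in the usage lines are hypothetical; this is not the RiskBudget class from Section 41.2, and the figures are deliberately different from the exercise.

```python
def can_place_bet(bankroll, stake, strategy_exposure, total_exposure,
                  allocation_pct, max_bet_pct, portfolio_cap_pct):
    """Return (approved, reasons) for a proposed stake.

    All *_pct arguments are fractions of bankroll (0.10 means 10%).
    Illustrative only; the limit structure is an assumption.
    """
    reasons = []
    # Limit 1: a single bet may not exceed the per-bet cap.
    if stake > max_bet_pct * bankroll:
        reasons.append("exceeds per-bet limit")
    # Limit 2: open exposure in this strategy plus the new stake
    # may not exceed the strategy's allocation.
    if strategy_exposure + stake > allocation_pct * bankroll:
        reasons.append("exceeds strategy allocation")
    # Limit 3: total open exposure plus the new stake may not
    # exceed the portfolio-level cap.
    if total_exposure + stake > portfolio_cap_pct * bankroll:
        reasons.append("exceeds portfolio limit")
    return (not reasons, reasons)


# Hypothetical numbers, not the values from the exercise.
approved, reasons = can_place_bet(
    bankroll=10_000, stake=250, strategy_exposure=900, total_exposure=1_500,
    allocation_pct=0.10, max_bet_pct=0.02, portfolio_cap_pct=0.25,
)
print(approved, reasons)
```

Returning the list of violated limits, rather than a bare True or False, makes it easier to show your work: each rejected bet states which constraint it broke.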


Exercise B.2 --- Ensemble Weighting

Three models have the following recent Brier scores over the last 100 predictions:

| Model | Brier Score |
|---|---|
| Logistic Regression | 0.210 |
| Random Forest | 0.195 |
| Neural Network | 0.230 |

(a) Calculate the inverse-Brier weights for each model.

(b) If the three models predict home win probabilities of 0.58, 0.62, and 0.55 respectively, what is the weighted ensemble probability?

(c) The no-vig market-implied probability is 0.54. What is the estimated edge using the ensemble probability?
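
One common way to compute inverse-Brier weights is to take the reciprocal of each score and normalize so the weights sum to 1. The sketch below assumes that definition (check it against the one given in the chapter) and uses two hypothetical models with made-up scores rather than the three in the exercise. The function names are illustrative.

```python
def inverse_brier_weights(brier_scores):
    """Weight each model by 1/Brier, normalized so the weights sum to 1.

    Assumes every Brier score is strictly positive.
    """
    inverses = [1.0 / b for b in brier_scores]
    total = sum(inverses)
    return [inv / total for inv in inverses]


def ensemble_probability(probs, weights):
    """Weighted average of the individual model probabilities."""
    return sum(p * w for p, w in zip(probs, weights))


# Hypothetical inputs: two models with Brier scores 0.20 and 0.25.
weights = inverse_brier_weights([0.20, 0.25])
p_ensemble = ensemble_probability([0.60, 0.56], weights)

# Edge is the ensemble probability minus the no-vig market probability.
edge = p_ensemble - 0.55
print([round(w, 4) for w in weights], round(p_ensemble, 4), round(edge, 4))
```

Because a lower Brier score is better, the reciprocal gives the better-calibrated model the larger weight.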


Exercise B.3 --- Performance Metrics

A bettor's record for the month shows:

| Metric | Value |
|---|---|
| Total bets | 85 |
| Wins | 46 |
| Losses | 38 |
| Pushes | 1 |
| Total staked | \$8,500 |
| Total P&L | +\$412 |
| Average odds | -108 |

(a) Calculate the win rate (excluding pushes), ROI, and average profit per bet.

(b) The bettor's cumulative P&L curve peaked at +\$780 at bet 62, then declined to +\$412 by bet 85. What is the maximum drawdown in dollars and as a percentage of the peak?

(c) Suppose daily P&L has a mean of \$14.80 and a standard deviation of \$52.30. Calculate the annualized Sharpe-like ratio assuming 300 betting days per year.
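
The drawdown and Sharpe-like calculations can be sketched as below. Two assumptions are made here: drawdown is measured peak-to-trough on the cumulative P&L curve, and the daily ratio is annualized by multiplying by the square root of the number of betting days, with a zero risk-free rate. The function names and the sample curve are hypothetical and do not use the exercise's numbers.

```python
import math


def max_drawdown(cumulative_pnl):
    """Largest peak-to-trough decline as (dollars, fraction of that peak)."""
    peak = cumulative_pnl[0]
    worst = 0.0
    peak_at_worst = peak
    for value in cumulative_pnl:
        peak = max(peak, value)          # running high-water mark
        drop = peak - value              # current decline from the peak
        if drop > worst:
            worst, peak_at_worst = drop, peak
    # The percentage is undefined if the relevant peak is not positive.
    pct = worst / peak_at_worst if peak_at_worst > 0 else float("nan")
    return worst, pct


def annualized_sharpe(mean_daily, std_daily, days_per_year):
    """Daily mean/std ratio scaled by sqrt(days); risk-free rate taken as 0."""
    return mean_daily / std_daily * math.sqrt(days_per_year)


# Hypothetical cumulative P&L curve in dollars.
curve = [0, 120, 300, 240, 500, 350, 410]
dd_dollars, dd_pct = max_drawdown(curve)
print(dd_dollars, dd_pct)
print(annualized_sharpe(mean_daily=10, std_daily=40, days_per_year=256))
```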


Exercise B.4 --- Consensus Pricing

A bettor's quantitative model estimates a 0.60 probability for Team A to win. The no-vig market-implied probability is 0.55. The bettor's qualitative assessment, based on a recent coaching change, adjusts the probability upward by 0.03.

Using weights of 55% for the quantitative model, 35% for the market, and 10% for the qualitative adjustment, calculate:

(a) The consensus probability estimate.

(b) The expected value of a \$100 bet at decimal odds of 1.80 (implied probability 0.556).

(c) The Kelly fraction for this bet given the consensus probability and decimal odds of 1.80.
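
As a reference for the three calculations above, here is a minimal sketch using the standard formulas for decimal odds d and win probability p: expected value per unit staked is p·d − 1, and the full Kelly fraction is (p·d − 1)/(d − 1). The helper names and inputs are hypothetical. How the qualitative adjustment is turned into its own probability estimate before weighting is a modelling choice that this sketch leaves to you.

```python
def weighted_consensus(estimates, weights):
    """Weighted average of probability estimates; weights should sum to 1."""
    return sum(e * w for e, w in zip(estimates, weights))


def expected_value(p, decimal_odds, stake=1.0):
    """Expected profit: win (d - 1) * stake with prob p, lose stake otherwise."""
    return stake * (p * decimal_odds - 1.0)


def kelly_fraction(p, decimal_odds):
    """Full Kelly fraction of bankroll; floored at 0 when there is no edge."""
    b = decimal_odds - 1.0               # net odds received on a win
    return max(0.0, (p * decimal_odds - 1.0) / b)


# Hypothetical inputs, not the values from the exercise.
p = weighted_consensus([0.62, 0.56, 0.66], [0.5, 0.3, 0.2])
print(round(p, 4))
print(round(expected_value(0.55, 2.00, stake=100), 2))
print(round(kelly_fraction(0.55, 2.00), 4))
```

In practice many bettors stake only a fraction of the full Kelly amount, which is the "fractional Kelly" referred to in the programming challenges.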


Exercise B.5 --- Portfolio Variance

A bettor places three simultaneous bets with the following characteristics:

| Bet | Stake Weight | Std Dev of Return | Correlation with Bet 1 | Correlation with Bet 2 |
|---|---|---|---|---|
| 1 | 0.40 | 0.95 | 1.00 | 0.15 |
| 2 | 0.35 | 0.90 | 0.15 | 1.00 |
| 3 | 0.25 | 0.92 | 0.05 | 0.05 |

(a) Calculate the portfolio variance using the formula from Section 41.2.

(b) Compare this to the variance of a single bet placed at the average standard deviation. By what percentage is the portfolio variance lower?
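
The sketch below assumes the Section 41.2 formula is the standard portfolio variance, the double sum of w_i · w_j · σ_i · σ_j · ρ_ij over all pairs of bets, with ρ_ii = 1. It uses a hypothetical two-bet portfolio so the arithmetic is easy to verify by hand; `portfolio_variance` is an illustrative name.

```python
def portfolio_variance(weights, std_devs, corr):
    """Sum over all i, j of w_i * w_j * sigma_i * sigma_j * rho_ij.

    `corr` is a full symmetric correlation matrix with 1.0 on the diagonal.
    """
    n = len(weights)
    return sum(
        weights[i] * weights[j] * std_devs[i] * std_devs[j] * corr[i][j]
        for i in range(n)
        for j in range(n)
    )


# Hypothetical two-bet portfolio: equal weights, unit std dev, correlation 0.2.
weights = [0.5, 0.5]
std_devs = [1.0, 1.0]
corr = [[1.0, 0.2],
        [0.2, 1.0]]

var_p = portfolio_variance(weights, std_devs, corr)
single_bet_var = 1.0 ** 2
reduction = 1.0 - var_p / single_bet_var
print(round(var_p, 4), round(reduction, 4))
```

For a three-bet problem the correlation matrix must be completed by symmetry, so the correlation of Bet 1 with Bet 3 equals the correlation of Bet 3 with Bet 1.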


Exercise B.6 --- Signal Filtering

A model generates the following signals for tonight's games:

| Game | Model Prob | Market Prob (no-vig) | Side | Odds |
|---|---|---|---|---|
| Game A | 0.58 | 0.54 | Home | -130 |
| Game B | 0.52 | 0.50 | Away | +105 |
| Game C | 0.61 | 0.55 | Home | -140 |
| Game D | 0.49 | 0.48 | Away | +120 |
| Game E | 0.55 | 0.52 | Home | -115 |

Using a minimum edge threshold of 3% and a minimum model confidence of 55%, which bets pass the filter? For each passing bet, calculate the expected value per dollar wagered.
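
A minimal sketch of the filter and the expected-value calculation, assuming edge is defined as model probability minus no-vig market probability and that both thresholds are inclusive (a bet sitting exactly on a threshold passes). A small tolerance guards against floating-point rounding at the boundary. The function names and the sample signals are hypothetical.

```python
def american_to_decimal(odds):
    """Convert American odds (e.g. -110, +150) to decimal odds."""
    if odds > 0:
        return 1.0 + odds / 100.0
    return 1.0 + 100.0 / abs(odds)


def ev_per_dollar(model_prob, american_odds):
    """Expected profit per dollar staked: p * d - 1."""
    return model_prob * american_to_decimal(american_odds) - 1.0


def passes_filter(model_prob, market_prob, min_edge=0.03, min_conf=0.55):
    """Inclusive thresholds (an assumption), with a rounding tolerance."""
    tol = 1e-9
    edge = model_prob - market_prob
    return edge >= min_edge - tol and model_prob >= min_conf - tol


# Hypothetical signal, not one of the games in the exercise.
if passes_filter(model_prob=0.57, market_prob=0.52):
    print(round(ev_per_dollar(0.57, -105), 4))
```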


Part C: Programming Challenges

Each problem is worth 10 points. Include working Python code with comments and sample output.


Exercise C.1 --- Complete Data Pipeline

Implement a BettingDataPipeline for a sport of your choice. Your pipeline should:

(a) Generate or load synthetic data representing at least two data sources (e.g., game results and market odds).

(b) Implement a cleaning function that handles missing values, removes duplicates, and validates date ranges.

(c) Merge the sources on appropriate keys.

(d) Run the validate_data method and print a data quality report.

Include at least 200 rows of synthetic data and demonstrate the full pipeline from collection to validation.


Exercise C.2 --- Dynamic Risk Budget System

Extend the RiskBudget class from Section 41.2 with the following enhancements:

(a) Add a rebalance method that adjusts allocations proportionally when the bankroll changes (grows or shrinks).

(b) Add a utilization_alert method that prints a warning when any sport/strategy exceeds 80% of its allocation.

(c) Add a daily_exposure_limit parameter that caps the total new exposure added in a single day.

(d) Demonstrate the enhanced system with a sequence of at least 20 simulated bets, showing rebalancing after bankroll changes.


Exercise C.3 --- Ensemble Model with Walk-Forward Evaluation

Build an ensemble predictor that:

(a) Trains at least two model types (e.g., logistic regression and gradient boosting) on synthetic or real sports data.

(b) Implements inverse-Brier weighting with a configurable evaluation window.

(c) Performs walk-forward evaluation: train on the first 70% of data, then predict one game at a time, updating weights after each prediction.

(d) Compares the ensemble's Brier score to each individual model's score over the walk-forward period.

Print a summary table showing individual and ensemble Brier scores, and plot the cumulative Brier score over time if matplotlib is available.


Exercise C.4 --- Full Performance Attribution Report

Using simulated bet data (at least 500 bets across 3 sports and 4 strategies over 6 months):

(a) Generate realistic bet data including dates, sports, strategies, odds, stakes, results, model probabilities, and edges.

(b) Instantiate the PerformanceAttribution class and generate the by_sport, by_strategy, by_time_period, and edge_analysis reports.

(c) Calculate the portfolio Sharpe ratio and maximum drawdown.

(d) Write a narrative summary (as comments in code or printed output) interpreting the attribution results and recommending at least two actionable changes.


Exercise C.5 --- Automated Betting Decision Engine

Build a class BettingDecisionEngine that integrates the signal generator, risk budget, and ensemble predictor into a single decision pipeline:

(a) Accept a set of upcoming games with features and market odds.

(b) Generate ensemble predictions and calculate edges.

(c) Filter signals by minimum edge and confidence thresholds.

(d) Size bets using a fractional Kelly criterion.

(e) Check each bet against the risk budget before approving.

(f) Output a structured bet sheet (DataFrame) with approved bets, including game, side, probability, edge, stake, and sportsbook recommendation.

Demonstrate the engine with at least 10 simulated upcoming games.


Part D: Analysis and Interpretation

Each problem is worth 10 points. Write clear, structured analyses.


Exercise D.1 --- Risk Budget Design

Design a complete risk budget for a hypothetical bettor with a \$15,000 bankroll who wants to bet on NFL, NBA, and tennis. The bettor has a strong NFL model (2 years of track record, 4% ROI), a newer NBA model (6 months, 2.5% ROI), and is just starting to develop a tennis model with no track record.

Specify: (a) sport-level allocations with rationale, (b) strategy-level allocations within each sport, (c) maximum per-bet sizes, (d) total portfolio exposure limit, and (e) criteria for adjusting allocations over time.


Exercise D.2 --- Model Disagreement Analysis

Two models produce the following predictions for five upcoming NBA games:

| Game | Model A (Prob Home Win) | Model B (Prob Home Win) |
|---|---|---|
| 1 | 0.62 | 0.64 |
| 2 | 0.58 | 0.42 |
| 3 | 0.71 | 0.68 |
| 4 | 0.45 | 0.55 |
| 5 | 0.53 | 0.51 |

For each game, calculate the absolute disagreement between the models in percentage points. Classify each game as "high agreement" (disagreement below 5 points), "moderate disagreement" (5 to 10 points, inclusive), or "high disagreement" (above 10 points). For each category, recommend a betting action (full position, reduced position, or pass) and explain your reasoning.
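
The classification step can be sketched as below. It assumes the disagreement is the absolute difference between the two probabilities and that the band boundaries (0.05 and 0.10) fall in the moderate category. The gap is rounded before comparison so that floating-point noise does not push a boundary case into the wrong band. The function name and sample pairs are hypothetical.

```python
def classify_disagreement(prob_a, prob_b, low=0.05, high=0.10):
    """Return (gap, label) for two model probabilities.

    Boundary handling is an assumption: gap < low is "high agreement",
    low <= gap <= high is "moderate disagreement", gap > high is
    "high disagreement".
    """
    # Round to suppress binary floating-point noise such as 0.10000000000000003.
    gap = round(abs(prob_a - prob_b), 6)
    if gap < low:
        return gap, "high agreement"
    if gap <= high:
        return gap, "moderate disagreement"
    return gap, "high disagreement"


# Hypothetical prediction pairs, not the games in the exercise.
for a, b in [(0.60, 0.57), (0.50, 0.58), (0.40, 0.65)]:
    print(a, b, classify_disagreement(a, b))
```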


Exercise D.3 --- Operations Calendar

Design a comprehensive operations calendar for a multi-sport bettor covering daily, weekly, monthly, and quarterly activities. For each time interval, specify:

(a) The specific tasks to be performed.

(b) The metrics to review.

(c) Decision criteria for action (e.g., "If drawdown exceeds 15% of bankroll, reduce all bet sizes by 50%").

(d) Estimated time commitment.


Part E: Integration and Synthesis

Each problem is worth 10 points. These problems require synthesizing concepts from multiple chapters.


Exercise E.1 --- End-to-End System Design

Design (on paper or in pseudocode) a complete betting system for a single sport of your choice. Your design should include:

(a) Data sources and collection schedule.

(b) Feature engineering pipeline with at least 8 specific features.

(c) Model architecture (type, training schedule, validation approach).

(d) Signal generation and filtering criteria.

(e) Bet sizing methodology.

(f) Execution plan (sportsbook selection, timing, record-keeping).

(g) Performance review schedule and key metrics.

(h) Criteria for scaling up, scaling down, or retiring the strategy.

This is a design document, not an implementation. Focus on completeness and coherence.


Exercise E.2 --- Backtesting Framework

Implement a backtesting framework that simulates the complete betting workflow on historical data:

(a) Walk-forward model training and prediction.

(b) Signal generation with configurable edge thresholds.

(c) Kelly-based bet sizing with a configurable Kelly fraction.

(d) Risk budget enforcement.

(e) Bet settlement and P&L tracking.

Run the backtest on at least 200 simulated games and produce a report including total P&L, ROI, max drawdown, Sharpe ratio, and a cumulative P&L curve.


Exercise E.3 --- Market Adaptation Simulation

Simulate a scenario where a bettor's edge decays over time:

(a) Generate 1,000 bets where the true edge starts at 5% and linearly decays to 0% over the sample.

(b) Implement a monitoring system that tracks rolling ROI, rolling CLV, and rolling edge estimate over windows of 50 and 100 bets.

(c) Determine at what point the monitoring system would detect the edge decay (i.e., when the rolling metrics cross a predefined threshold).

(d) Compare two strategies: (i) maintaining constant bet sizes and (ii) dynamically reducing bet sizes as the rolling edge estimate decreases. Which strategy preserves more bankroll?
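
One possible starting point for parts (a) and (b) is sketched below. It makes several simplifying assumptions that you may want to replace: every bet is placed at even money (decimal odds 2.0) with a flat 1-unit stake, so a true edge e corresponds to a win probability of (1 + e)/2, and rolling ROI is the mean P&L per unit staked over the window. The function names are hypothetical, and rolling CLV and the rolling edge estimate are not modelled here.

```python
import random


def simulate_decaying_edge(n_bets=1000, start_edge=0.05, end_edge=0.0, seed=0):
    """Per-bet P&L (+1 or -1 units) with a linearly decaying true edge.

    Assumes even-money odds, so EV per unit = 2p - 1 = edge.
    """
    rng = random.Random(seed)            # seeded for reproducibility
    pnl = []
    for i in range(n_bets):
        edge = start_edge + (end_edge - start_edge) * i / (n_bets - 1)
        p_win = (1.0 + edge) / 2.0
        pnl.append(1.0 if rng.random() < p_win else -1.0)
    return pnl


def rolling_roi(pnl, window):
    """Mean P&L per unit staked over each full trailing window."""
    out = []
    running = 0.0
    for i, x in enumerate(pnl):
        running += x
        if i >= window:
            running -= pnl[i - window]   # drop the bet that left the window
        if i >= window - 1:
            out.append(running / window)
    return out


pnl = simulate_decaying_edge()
roi_50 = rolling_roi(pnl, 50)
roi_100 = rolling_roi(pnl, 100)
print(len(roi_50), len(roi_100))
```

With a 5% edge and binary outcomes, a 50-bet window is very noisy, which is part of what the exercise asks you to observe when choosing a detection threshold.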


Exercise E.4 --- Multi-Sport Portfolio Optimization

Build a portfolio optimization system that:

(a) Takes as input expected edges and variances for bets across 3 sports.

(b) Calculates optimal Kelly allocations for each bet.

(c) Applies portfolio-level constraints (maximum total exposure, maximum per-sport exposure, correlation adjustments).

(d) Produces an optimized bet allocation that maximizes expected log-wealth growth subject to the constraints.

Test with at least 15 simultaneous betting opportunities and compare the constrained portfolio to the unconstrained Kelly allocation.


Exercise E.5 --- Process Quality Audit

Design and implement a "process quality audit" tool that evaluates the health of a betting operation based on its historical bet data. The tool should assess:

(a) Data quality: completeness of bet records, consistency of metadata.

(b) Discipline: adherence to bet sizing rules (no bet exceeds Kelly or risk budget limits).

(c) Execution quality: average slippage between target and actual odds.

(d) Timeliness: lag between signal generation and bet placement.

(e) Review compliance: evidence of regular performance review (e.g., annotations or adjustments in the data).

Output a "health score" from 0 to 100 with category-level subscores and specific recommendations.