Chapter 33 Exercises: Live and In-Play Betting
Part A: Foundational Concepts (Exercises 1-6)
Exercise 1. Define live betting (in-play betting) and explain how it differs structurally from pre-game betting. In your answer, address at least four dimensions of difference: margins, limits, bet acceptance mechanisms, and market depth. For each dimension, explain the economic rationale behind the difference and how it affects the quantitative bettor's approach.
Exercise 2. A pre-game NBA moneyline market has a total overround of 4.5%. The same game's live moneyline market, observed at the start of the third quarter, carries a total overround of 7.2%. Explain why live margins are wider. Calculate the breakeven edge a bettor needs to overcome in each market (assuming symmetric vig), and discuss how the higher live margin changes the minimum model accuracy required for profitability.
Exercise 3. Rank the following six sports by their suitability for quantitative live betting, from most to least suitable: NFL football, NBA basketball, MLB baseball, NHL hockey, tennis, and soccer. For each sport, provide two supporting reasons based on the characteristics discussed in the chapter (scoring frequency, game state complexity, natural pauses, data availability, and predictability). Justify any ranking disagreements you have with the chapter's ordering.
Exercise 4. Explain the concept of "adverse selection" in the context of live betting markets. A sportsbook observes that a certain bettor consistently bets immediately after scoring events and has a 59% win rate on live moneyline wagers. From the bookmaker's perspective, describe three defensive mechanisms the book could deploy and explain how each affects the bettor's expected profitability.
Exercise 5. Describe the concept of "bet behind" (delayed acceptance) in live betting. A bettor submits a live bet at odds of 2.10. During the 3-second acceptance window, the line moves to 2.00. Explain what happens under three different acceptance models: (a) "any odds" acceptance, (b) "better odds only" acceptance, and (c) "fixed odds" acceptance. Calculate the expected cost of the acceptance delay assuming a 40% chance of adverse line movement of 0.10 during the acceptance window.
Exercise 6. Live data for an NBA game is available from three data feed tiers with the following latencies: court-side scouts (0.5s), official data feeds (3.0s), and broadcast (10s). A scoring event occurs that shifts the true win probability by 8 percentage points. Assuming the sportsbook updates its line 2.0 seconds after the event, calculate the mispricing window duration for each data feed tier. For a bettor on each tier, estimate whether the opportunity is capturable given a 200ms model computation time and a 300ms API submission time.
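One way to set up the timing comparison in Exercise 6 is sketched below. It assumes the usable window for a tier is the gap between the moment the bettor receives the data and the moment the book updates its line, and that both model computation and API submission must fit inside that gap. The function name and the choice to clamp a negative window to zero are illustrative assumptions, not definitions from the chapter.

```python
def capture_analysis(feed_latency_s, book_update_s=2.0, compute_s=0.2, submit_s=0.3):
    """Return (usable window in seconds, whether the bet fits inside it)."""
    # Window: time left between receiving the data and the book's line update.
    # Clamped at zero when the feed arrives after the book has already moved.
    window = max(0.0, book_update_s - feed_latency_s)
    # Time the bettor needs after the data arrives (assumed: compute + submit).
    needed = compute_s + submit_s
    return window, window >= needed

tiers = {"court-side": 0.5, "official": 3.0, "broadcast": 10.0}
results = {name: capture_analysis(latency) for name, latency in tiers.items()}

for name, (window, capturable) in results.items():
    print(f"{name:>10}: window={window:.1f}s capturable={capturable}")
```

The sketch ignores bet acceptance delay on the book's side, which would shrink the usable window further.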
Part B: Real-Time Model Updating (Exercises 7-12)
Exercise 7. State Bayes' theorem and explain how it applies to live win probability updating. An NBA team has a pre-game win probability of 0.60 (the prior). In the first quarter, the team falls behind by 8 points. Using the following likelihood data from historical games -- P(down 8 after Q1 | team wins) = 0.28 and P(down 8 after Q1 | team loses) = 0.52 -- compute the posterior win probability. Show all steps and interpret the result.
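The single-step update in Exercise 7 can be checked with a few lines of Python. This is a minimal sketch of Bayes' theorem for a binary outcome (win or loss); the function and argument names are hypothetical.

```python
def posterior_win_prob(prior, p_evidence_given_win, p_evidence_given_loss):
    """Bayes' theorem for a binary hypothesis (team wins vs. team loses)."""
    numerator = prior * p_evidence_given_win
    # Total probability of the evidence across both outcomes.
    evidence = numerator + (1.0 - prior) * p_evidence_given_loss
    return numerator / evidence

# Inputs from the exercise: prior 0.60, P(down 8 | win) = 0.28, P(down 8 | loss) = 0.52.
posterior = posterior_win_prob(0.60, 0.28, 0.52)
print(f"posterior win probability: {posterior:.4f}")
```

Comparing the printed value with the hand calculation is a quick check on each step of the derivation.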
Exercise 8. Describe the iterative Bayesian updating process for live betting. Starting with a prior win probability of 0.55 for the home team in an NFL game, apply the following sequence of updates:
- Event 1: Home team scores a field goal (3 points). Likelihood ratio for home win given a home field goal at that game state: 1.15.
- Event 2: Away team scores a touchdown (7 points). Likelihood ratio for home win given an away touchdown at the new game state: 0.72.
- Event 3: Home team intercepts a pass. Likelihood ratio for home win given a home interception at the new game state: 1.25.
Compute the posterior win probability after each event using the odds form of Bayes' theorem. Show that the order of updates does not matter if the events are conditionally independent given the game outcome (home win or home loss).
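The odds-form updates in Exercise 8 can be sketched as follows. The code treats the three likelihood ratios as fixed numbers, which is the conditional-independence assumption; in practice each ratio is quoted at a particular game state and would change if the events occurred in a different order. Function names are illustrative.

```python
from itertools import permutations
from math import isclose

def update(prob, likelihood_ratio):
    """Odds form of Bayes' theorem: posterior odds = prior odds * likelihood ratio."""
    odds = prob / (1.0 - prob)
    posterior_odds = odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)

def sequential_update(prior, ratios):
    """Apply the ratios in order and return the probability after each event."""
    path, p = [], prior
    for lr in ratios:
        p = update(p, lr)
        path.append(p)
    return path

ratios = [1.15, 0.72, 1.25]  # field goal, away touchdown, interception
path = sequential_update(0.55, ratios)

# Multiplication of odds is commutative, so every ordering ends at the same posterior.
finals = [sequential_update(0.55, perm)[-1] for perm in permutations(ratios)]
order_invariant = all(isclose(f, finals[0], rel_tol=1e-12) for f in finals)

print([round(p, 4) for p in path], order_invariant)
```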
Exercise 9. Write complete Python code for a simplified NFL win probability model. Your model should take as inputs: score differential, seconds remaining, current down, yards to go, and field position (yard line). Use logistic regression principles: define a feature vector from these inputs and apply the logistic function to produce a win probability. Use the following coefficient estimates: intercept = 0.0, score_diff_per_point = 0.15, time_remaining_factor = -0.001 per second, and field_position_factor = 0.005 per yard. Demonstrate the model by computing win probability for a team leading by 3 with 4:00 remaining on their own 35-yard line, 2nd down and 6.
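A possible starting point for Exercise 9 is sketched below. The exercise supplies no coefficients for down or yards to go, so this sketch keeps them in the feature vector with an assumed weight of 0.0; it also assumes the yard line is measured from the team's own goal line (own 35 maps to 35). Both choices, and all names, are assumptions to revisit in a full answer.

```python
import math
from dataclasses import dataclass

@dataclass
class GameState:
    score_diff: int         # points; positive when the team leads
    seconds_remaining: int
    down: int
    yards_to_go: int
    yard_line: int          # assumed: yards from own goal line

COEF = {
    "intercept": 0.0,
    "score_diff": 0.15,
    "seconds_remaining": -0.001,
    "yard_line": 0.005,
    "down": 0.0,            # assumption: not specified in the exercise
    "yards_to_go": 0.0,     # assumption: not specified in the exercise
}

def win_probability(state: GameState) -> float:
    """Linear predictor passed through the logistic function."""
    z = (COEF["intercept"]
         + COEF["score_diff"] * state.score_diff
         + COEF["seconds_remaining"] * state.seconds_remaining
         + COEF["yard_line"] * state.yard_line
         + COEF["down"] * state.down
         + COEF["yards_to_go"] * state.yards_to_go)
    return 1.0 / (1.0 + math.exp(-z))

# Leading by 3, 4:00 remaining (240 s), own 35-yard line, 2nd and 6.
state = GameState(score_diff=3, seconds_remaining=240, down=2, yards_to_go=6, yard_line=35)
p = win_probability(state)
print(f"win probability: {p:.3f}")
```

With the stated coefficients, more time remaining lowers the predicted probability regardless of who leads, which is worth commenting on when interpreting the output.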
Exercise 10. Explain the difference between a state-space model and a simple game-state lookup table for live win probability estimation. Define the state equation and observation equation for a basketball state-space model where the latent state is the "true scoring rate differential" between two teams. Using the Kalman filter update equations, show how an observed 12-0 scoring run in 3 minutes would update the estimated scoring rate differential, assuming a process noise variance of 1.0 and an observation noise variance of 4.0.
Exercise 11. A live betting model for tennis uses the following state variables: sets won by each player, games won in the current set by each player, and points won in the current game by each player. Calculate the total number of possible game states (ignoring deuce scenarios for simplicity) in a best-of-3-set match where sets are won at 6 games and games at 4 points. Then explain why a closed-form probability model is feasible for tennis but impractical for football, and describe how the Markov chain approach exploits the structure of tennis scoring.
Exercise 12. A sportsbook's live model updates every 5 seconds, while a sophisticated bettor's model updates every 1 second. During a 48-minute NBA game, estimate the total number of "mispricing windows" where the bettor's model has already processed new information but the book has not yet updated. Assume scoring events occur on average every 24 seconds and each event creates a potential mispricing. How many of these windows are likely to exceed the bettor's minimum edge threshold of 3%? Justify your estimate using the concept of information decay.
Part C: Latency and Execution (Exercises 13-18)
Exercise 13. Draw and label a complete latency diagram for a live bet, from the occurrence of a real-world event to the confirmation of a bet. Include the following stages: event occurrence, data capture, data transmission, data parsing, model update, decision logic, API call, book processing, and bet confirmation. Assign realistic latency values to each stage for both a "professional" setup (court-side data, co-located servers) and an "amateur" setup (broadcast data, home internet). Calculate total latencies for both.
Exercise 14. Write Python code using the asyncio and aiohttp libraries that implements a connection pool manager for submitting live bets to three different sportsbook APIs simultaneously. The manager should: (a) maintain persistent connections to each book, (b) track per-book latency statistics, (c) route each bet to the book with the lowest recent average latency, and (d) implement a circuit breaker that temporarily disables a book if its error rate exceeds 20% in the last 50 requests. Include proper error handling and logging.
Exercise 15. A live bettor's execution pipeline has the following latency components: data reception (150ms), model computation (80ms), decision logic (10ms), API call (200ms), and bet acceptance delay (variable, mean 2000ms). The total mispricing window for a typical opportunity is 3 seconds.
(a) Calculate the probability that the bet is submitted before the window closes. (b) If the acceptance delay is exponentially distributed with mean 2000ms, what fraction of submitted bets will be confirmed before the window closes? (c) The bettor is considering investing $5,000 to reduce API latency from 200ms to 50ms. If they encounter 20 opportunities per day with an average edge of 4% and an average stake of $100, calculate the ROI on this investment over a 6-month period.
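Parts (a) and (b) of Exercise 15 can be set up as below. The sketch assumes the four pre-submission components are deterministic and additive, and that a bet counts as confirmed in time when the exponentially distributed acceptance delay is shorter than the time left in the window after submission. Function names are hypothetical.

```python
import math

def submit_time_ms(data=150, model=80, decision=10, api=200):
    """Time from event to bet submission, assuming fixed additive stages."""
    return data + model + decision + api

def confirm_prob(window_ms, submitted_at_ms, mean_delay_ms):
    """P(exponential acceptance delay <= time remaining in the window)."""
    remaining = window_ms - submitted_at_ms
    if remaining <= 0:
        return 0.0
    # CDF of an exponential distribution with the given mean.
    return 1.0 - math.exp(-remaining / mean_delay_ms)

baseline_submit = submit_time_ms()
upgraded_submit = submit_time_ms(api=50)

baseline = confirm_prob(3000, baseline_submit, 2000)
upgraded = confirm_prob(3000, upgraded_submit, 2000)

print(f"submit at {baseline_submit} ms -> confirmed in time: {baseline:.3f}")
print(f"submit at {upgraded_submit} ms -> confirmed in time: {upgraded:.3f}")
```

The difference between the two probabilities is one input to part (c); how it translates into additional profit depends on assumptions the answer should state explicitly.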
Exercise 16. Compare REST API and WebSocket connection approaches for live betting. Write pseudocode for both implementations that listen for odds updates on a specific market and submit a bet when the odds exceed a threshold. Discuss the tradeoffs in terms of latency, complexity, reliability, and bandwidth. Under what conditions would you choose one over the other?
Exercise 17. A live bettor monitors odds across five sportsbooks. Book A has a latency of 100ms, Book B 200ms, Book C 300ms, Book D 150ms, and Book E 500ms. The bettor's model identifies an opportunity that exists at Books B, C, and E simultaneously (same market, same direction). The model estimates an edge of 5% at Book B, 4% at Book C, and 6% at Book E. Using the Kelly criterion at a 25% fraction (quarter Kelly) and a $10,000 bankroll, calculate the optimal bet amount for each book. Then, recalculate assuming the bettor can only submit to one book at a time (sequential execution adds 200ms per additional book). How does the execution constraint change the optimal strategy?
Exercise 18. Design and describe an automated vs. manual decision framework for live betting. Create a decision matrix with the following axes: edge source (model-based, observational, hybrid), frequency of opportunities (per minute, per quarter, per game), required response time (sub-second, 1-10 seconds, minutes), and bet volume (single bet, multiple simultaneous). For each cell in the matrix, indicate whether automated, manual, or hybrid execution is most appropriate, and explain your reasoning for at least four cells.
Part D: Mispricing Detection (Exercises 19-24)
Exercise 19. Explain three distinct causes of mispricing in live betting markets: (a) model complexity tradeoffs, (b) multi-market consistency delays, and (c) manual intervention requirements. For each cause, provide a concrete example from a specific sport, estimate the typical duration of the mispricing window, and describe the characteristics of the mispricing (direction, magnitude, predictability).
Exercise 20. Write Python code that implements a cross-book stale line detector. The detector should: (a) maintain a rolling window of odds from at least four books, (b) compute a consensus fair probability using the median of de-vigged probabilities, (c) flag any book whose current odds imply a probability that differs from the consensus by more than a configurable threshold, and (d) calculate a confidence score based on the number of books in the consensus, the magnitude of the deviation, and the time since the book last updated. Include a demonstration with synthetic data showing the detector identifying a stale line.
Exercise 21. An NBA live bettor's model shows a home team win probability of 0.72, while the sportsbook's moneyline implies a de-vigged probability of 0.65 for the home team. The vig on the live moneyline is 6% total. Calculate: (a) the bettor's estimated edge, (b) the expected value per dollar wagered on the home team, (c) the full Kelly bet as a fraction of bankroll, and (d) the quarter-Kelly bet for a $5,000 bankroll. If the bettor believes there is a 30% chance that their model is miscalibrated by 5 percentage points, recalculate the edge and Kelly fraction accounting for this model uncertainty.
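The calculations in Exercise 21 can be organized as in the sketch below. Two assumptions are made that the exercise does not fix: the 6% vig is applied proportionally, so the offered price implies a probability of 0.65 * 1.06, and the model-uncertainty case is read as a 30% chance that the model overstates the home team by 5 points. Other readings are defensible and change the numbers.

```python
def offered_decimal_odds(fair_prob, total_vig):
    """Assumed proportional vig: offered implied probability = fair * (1 + vig)."""
    return 1.0 / (fair_prob * (1.0 + total_vig))

def expected_value(p, odds):
    """Expected profit per dollar staked at decimal odds."""
    return p * odds - 1.0

def kelly_fraction(p, odds):
    """Full Kelly fraction f = (p * odds - 1) / (odds - 1), floored at zero."""
    return max(0.0, (p * odds - 1.0) / (odds - 1.0))

model_p, book_fair_p, bankroll = 0.72, 0.65, 5000.0
odds = offered_decimal_odds(book_fair_p, 0.06)

edge = model_p - book_fair_p                 # edge against the de-vigged price
ev = expected_value(model_p, odds)           # EV against the price actually offered
full_kelly = kelly_fraction(model_p, odds)
quarter_stake = 0.25 * full_kelly * bankroll

# Assumed reading of model uncertainty: blend the two candidate probabilities.
blended_p = 0.70 * model_p + 0.30 * (model_p - 0.05)
adjusted_kelly = kelly_fraction(blended_p, odds)

print(f"odds={odds:.4f} edge={edge:.3f} ev={ev:.4f}")
print(f"full Kelly={full_kelly:.4f} quarter-Kelly stake=${quarter_stake:.2f}")
print(f"blended p={blended_p:.3f} adjusted Kelly={adjusted_kelly:.4f}")
```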
Exercise 22. Describe the phenomenon of "momentum overreaction" in live betting markets. A basketball team goes on a 15-0 scoring run over 4 minutes. The pre-run win probability was 0.45 for this team. After the run, the market-implied win probability is 0.72. Your model, which accounts for mean reversion in scoring rates, estimates the win probability at 0.65. Calculate the implied expected scoring rate during the run, compare it to the team's baseline scoring rate (assume 110 points per 48 minutes), and explain why mean reversion suggests the market has overreacted. What specific model feature would capture this mean-reversion effect?
Exercise 23. Write Python code that monitors live odds from multiple books and detects "velocity anomalies" -- situations where one book's update frequency drops significantly below its historical average. The code should: (a) track the inter-update interval for each book/market combination, (b) compute a running mean and standard deviation of the interval, (c) flag when the current interval exceeds 3 standard deviations above the mean, and (d) output an alert with the book, market, expected update interval, actual interval, and z-score. Demonstrate with simulated data where one book freezes for 30 seconds during a scoring event.
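A compact starting point for Exercise 23 is sketched below. It tracks one book/market pair, keeps a running mean and standard deviation of inter-update intervals with Welford's online algorithm (a substitution chosen here for brevity, not something the chapter prescribes), and scores the time since the last update as a z-score. The class name, the `min_samples` warm-up parameter, the book and market labels, and the simulated intervals are all hypothetical.

```python
import math
import random

class IntervalTracker:
    """Running statistics of inter-update intervals for one book/market pair."""

    def __init__(self, min_samples=10):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0
        self.last_ts = None
        self.min_samples = min_samples

    def record_update(self, ts):
        # Welford's online update of the mean and sum of squared deviations.
        if self.last_ts is not None:
            interval = ts - self.last_ts
            self.n += 1
            delta = interval - self.mean
            self.mean += delta / self.n
            self.m2 += delta * (interval - self.mean)
        self.last_ts = ts

    def std(self):
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

    def zscore(self, now):
        """z-score of the time since the last update; None until warmed up."""
        sd = self.std()
        if self.last_ts is None or self.n < self.min_samples or sd == 0.0:
            return None
        return ((now - self.last_ts) - self.mean) / sd

# Simulated feed: the book normally updates every 1.5 to 2.5 seconds.
random.seed(0)
tracker, t = IntervalTracker(), 0.0
for _ in range(60):
    t += random.uniform(1.5, 2.5)
    tracker.record_update(t)

normal_z = tracker.zscore(t + 2.0)    # a typical gap
freeze_z = tracker.zscore(t + 30.0)   # the book has been silent for 30 seconds

if freeze_z is not None and freeze_z > 3.0:
    alert = {"book": "BookA", "market": "moneyline",
             "expected_interval": round(tracker.mean, 2),
             "actual_interval": 30.0, "z_score": round(freeze_z, 1)}
    print(alert)
```

A full answer would hold one tracker per book/market combination and poll `zscore` on a timer rather than only when an update arrives.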
Exercise 24. In-game injuries create some of the most significant live betting edges. Describe a systematic approach to quantifying injury impact in real time. For an NBA game, a star player (averaging 28 points, 8 rebounds, 5 assists per game on 35 minutes) leaves the game in the second quarter with an apparent ankle injury. Using the concept of "replacement-level player" and the player's estimated impact on team efficiency (assume +6.5 points per 100 possessions relative to replacement), calculate: (a) the expected change in team scoring rate for the remainder of the game, (b) the impact on the live spread, and (c) the impact on the live total. Explain why the market may be slow to fully price this information.
Part E: System Architecture and Strategy (Exercises 25-30)
Exercise 25. Design the complete five-layer architecture for a live betting system as described in the chapter: data ingestion, analytics engine, decision engine, execution layer, and monitoring. For each layer, specify: (a) the primary responsibilities, (b) the technology stack you would recommend (specific libraries, databases, message queues), (c) the interfaces between this layer and adjacent layers, and (d) the failure modes and recovery strategies. Present your design as a detailed architectural diagram with annotations.
Exercise 26. Write a complete Python implementation of a simplified live betting decision engine. The engine should: (a) accept model fair prices and book offered prices as inputs, (b) calculate edge for each opportunity, (c) apply a minimum edge threshold, (d) size bets using fractional Kelly, (e) enforce a per-game maximum exposure limit, (f) enforce a total portfolio exposure limit, and (g) output a prioritized list of recommended bets. Test with at least five simultaneous opportunities at varying edge levels.
Exercise 27. A live betting operation is considering two data feed options: Option A costs $2,000/month and provides 2-second latency official data; Option B costs $8,000/month and provides 0.5-second latency court-side data. The operation currently places an average of 50 live bets per day with a mean edge of 3.5% and mean stake of $150. Estimate how much additional edge (in percentage points) the faster data feed would provide, under the assumption that 40% of opportunities are latency-sensitive and faster data captures 60% more of the latency-sensitive opportunities. Calculate the expected monthly profit difference and determine whether the upgrade is justified. State all assumptions clearly.
Exercise 28. Implement a complete backtesting framework for a live betting strategy. The framework should: (a) replay historical game data with realistic timing, (b) simulate the latency of your data feed and execution pipeline, (c) model bet acceptance/rejection based on configurable acceptance rates, (d) track P&L, edge realized vs. expected, and execution metrics, and (e) produce a summary report. Write the Python class structure and key methods (you may use pseudocode for the data replay component, but all other methods should be complete).
Exercise 29. Describe the risk management considerations specific to live betting that do not arise (or arise less frequently) in pre-game betting. Address: (a) the risk of correlated losses from rapid sequential bets on the same event, (b) the risk of model failure during unusual game situations, (c) the risk of API outages during critical moments, (d) the risk of "chasing" -- placing increasingly large bets to recover losses within a game, and (e) the risk of data feed errors causing false signals. For each risk, propose a specific, implementable mitigation strategy.
Exercise 30. Design a comprehensive performance monitoring dashboard for a live betting operation. Specify the metrics to display in four categories: (a) model performance (calibration, Brier score, edge distribution), (b) execution performance (latency percentiles, acceptance rate, fill rate), (c) financial performance (daily P&L, cumulative P&L, ROI, Sharpe ratio), and (d) system health (data feed status, API connection status, error rates). For each metric, specify the update frequency, the visualization type (line chart, gauge, table, etc.), and the alerting threshold that should trigger a notification. Write Python code that computes at least three of these metrics from a log of executed bets.