Exercises: Chapter 26

Section A: Conceptual Understanding (Exercises 1--10)

Exercise 1: Identifying Lookahead Bias

The following backtest code snippet processes prediction market data. Identify all instances of lookahead bias and explain how to fix each one.

import pandas as pd
import numpy as np

df = pd.read_csv('market_data.csv')
df['resolution'] = df.groupby('market_id')['resolution'].transform('last')
df['zscore'] = (df['price'] - df['price'].mean()) / df['price'].std()

for i, row in df.iterrows():
    if row['zscore'] < -1.5 and row['resolution'] == 1:
        signal = 'BUY'
    elif row['zscore'] > 1.5 and row['resolution'] == 0:
        signal = 'SELL'

Exercise 2: Survivorship Bias Scenario

You download a dataset of 500 prediction markets from a platform. The dataset only includes markets that successfully resolved (YES or NO) and excludes 73 markets that were cancelled due to ambiguous resolution criteria. Your strategy specifically targets markets with unusual resolution criteria because they tend to be mispriced.

(a) Explain why your backtest results will be biased.
(b) In which direction will the bias push your estimated returns?
(c) Propose a method to correct for or mitigate this bias.

Exercise 3: Overfitting Analysis

A trader tests 50 different parameter combinations for a mean-reversion strategy on a single prediction market and selects the one with the highest Sharpe ratio. The best combination achieves an in-sample Sharpe of 3.2.

(a) Calculate the probability that at least one of the 50 tests would exceed a Sharpe of 2.0 purely by chance (assume returns are normally distributed with zero mean and unit variance, and that the sample has 100 observations).
(b) What is the expected maximum Sharpe ratio across 50 independent tests under the null hypothesis?
(c) How does this change the interpretation of the observed Sharpe of 3.2?
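As a rough aid for parts (a) and (b), the sketch below computes the chance that at least one of N independent tests crosses a threshold, plus a common approximation to the expected maximum of N standard normal test statistics. It assumes the Sharpe estimate is per-period, so that under the null its z-score is roughly SR * sqrt(T). The function names and that conversion are illustrative assumptions, not the chapter's definitions.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def tail_prob(sharpe: float, n_obs: int) -> float:
    """One-sided null probability that a per-period Sharpe estimate exceeds `sharpe`.

    Assumes the estimate is approximately N(0, 1 / n_obs) under the null.
    """
    return 1.0 - NormalDist().cdf(sharpe * math.sqrt(n_obs))


def prob_at_least_one(p_single: float, n_tests: int) -> float:
    """P(at least one of n independent tests exceeds the threshold)."""
    return 1.0 - (1.0 - p_single) ** n_tests


def expected_max_z(n_tests: int) -> float:
    """Approximate E[max] of n independent standard normal draws (needs n >= 2)."""
    inv = NormalDist().inv_cdf
    return ((1.0 - EULER_GAMMA) * inv(1.0 - 1.0 / n_tests)
            + EULER_GAMMA * inv(1.0 - 1.0 / (n_tests * math.e)))
```

To express the expected maximum in Sharpe units rather than z units, divide by sqrt(T) under the same per-period assumption.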

Exercise 4: Event-Driven vs. Vectorized

Explain why an event-driven backtesting architecture provides structural protection against lookahead bias, while a vectorized approach does not. Use a specific example involving a prediction market strategy that computes a rolling average.
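One way to picture the contrast, using hypothetical prices: a statistic computed over the whole array can silently include later bars, while an event-driven loop can only average what has already arrived. Vectorized code can be written without lookahead, but nothing in its structure enforces that.

```python
from collections import deque


def vectorized_full_mean(prices):
    """Lookahead-prone: every bar is compared with a mean over the whole series."""
    full_mean = sum(prices) / len(prices)
    return [full_mean for _ in prices]


def event_driven_rolling_mean(prices, window):
    """Lookahead-free: the value at bar t uses only prices up to and including t."""
    seen = deque(maxlen=window)
    means = []
    for price in prices:  # prices arrive one at a time, as in an event loop
        seen.append(price)
        means.append(sum(seen) / len(seen))
    return means


prices = [0.40, 0.42, 0.44, 0.80]  # hypothetical series with a late jump
leaky = vectorized_full_mean(prices)
safe = event_driven_rolling_mean(prices, window=3)
```

The first element of `leaky` already reflects the final jump to 0.80, whereas the first element of `safe` is just the first price.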

Exercise 5: Fill Simulation Importance

A strategy backtested on a prediction market with average daily volume of 200 contracts shows a 45% annual return. The strategy trades 50 contracts per signal. Using the square-root impact model with $\sigma = 0.03$, $\beta = 0.5$, and $V = 200$, calculate the expected market impact cost per trade and determine whether the strategy remains profitable after accounting for this impact.
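A minimal sketch of the impact calculation, assuming the square-root model takes the form impact = sigma * (Q / V) ** beta per contract. That functional form and the helper name `sqrt_impact` are assumptions here; check them against the chapter's definition before relying on the numbers.

```python
def sqrt_impact(order_size, daily_volume, sigma, beta=0.5):
    """Price impact per contract under the assumed form sigma * (Q / V) ** beta."""
    return sigma * (order_size / daily_volume) ** beta


# Parameters from the exercise statement.
impact_per_contract = sqrt_impact(order_size=50, daily_volume=200, sigma=0.03, beta=0.5)
impact_per_trade = impact_per_contract * 50  # dollars across the 50 contracts
```

Whether the strategy stays profitable then depends on how many trades per year generate the 45% return, which the exercise leaves for you to state.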

Exercise 6: Transaction Cost Breakdown

For a Polymarket trade with the following parameters, calculate the total transaction cost:
- Buy 100 YES contracts at ask price of $0.62
- Bid price is $0.58
- Taker fee: 2%
- Expected holding period: 45 days
- Risk-free rate: 5% annually

Include: spread cost, trading fee, and opportunity cost.
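The three components can be sketched as below. The conventions are assumptions chosen for illustration: spread cost is the half-spread paid relative to the mid price, the fee is charged on the purchase notional, and opportunity cost is simple interest on the capital tied up. Other conventions (for example, charging the full spread on a round trip) give different totals.

```python
def transaction_costs(qty, bid, ask, fee_rate, holding_days, risk_free_rate):
    """Break a buy order's cost into spread, fee, and opportunity cost (in dollars)."""
    notional = qty * ask
    spread_cost = qty * (ask - bid) / 2.0   # assumed: half-spread versus mid
    trading_fee = fee_rate * notional       # assumed: fee on purchase notional
    opportunity_cost = notional * risk_free_rate * holding_days / 365.0
    return {
        "spread_cost": spread_cost,
        "trading_fee": trading_fee,
        "opportunity_cost": opportunity_cost,
        "total": spread_cost + trading_fee + opportunity_cost,
    }


costs = transaction_costs(qty=100, bid=0.58, ask=0.62, fee_rate=0.02,
                          holding_days=45, risk_free_rate=0.05)
```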

Exercise 7: Walk-Forward Design

You have 3 years of daily prediction market data. Design a walk-forward testing scheme with:
- 6-month training windows
- 2-month test windows
- Rolling (not anchored) approach

(a) How many walk-forward steps will you have?
(b) How much of the total data will be used for out-of-sample testing?
(c) What is the minimum number of trades per window needed for statistical reliability?
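A small window generator can help check the counting in parts (a) and (b). This sketch works in whole months and assumes the window advances by one test period per step, so test periods do not overlap. The step size is an assumption; the exercise does not fix it.

```python
def rolling_windows(total_months, train_months, test_months, step_months=None):
    """Return (train_start, train_end, test_start, test_end) tuples, end-exclusive."""
    step = step_months or test_months  # assumed default: advance by one test window
    windows = []
    start = 0
    while start + train_months + test_months <= total_months:
        train_end = start + train_months
        windows.append((start, train_end, train_end, train_end + test_months))
        start += step
    return windows


windows = rolling_windows(total_months=36, train_months=6, test_months=2)
oos_months = sum(test_end - test_start for _, _, test_start, test_end in windows)
```

Inspecting `len(windows)` and `oos_months` gives the step count and the out-of-sample coverage under these assumptions.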

Exercise 8: Metric Interpretation

A strategy reports the following metrics:
- Win Rate: 72%
- Average Win: $0.04
- Average Loss: $0.12
- Profit Factor: 0.86

(a) Is this strategy profitable? Explain using the expectancy formula.
(b) What does the combination of high win rate and low profit factor suggest about the strategy's risk profile?
(c) How would you modify the strategy to improve its risk-adjusted performance?
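For part (a), the standard per-trade expectancy is win rate times average win minus loss rate times average loss, and profit factor is gross profit divided by gross loss. A minimal sketch, with function names chosen here for illustration:

```python
def expectancy(win_rate, avg_win, avg_loss):
    """Expected profit per trade; avg_loss is passed as a positive number."""
    return win_rate * avg_win - (1.0 - win_rate) * avg_loss


def profit_factor(win_rate, avg_win, avg_loss):
    """Gross profit divided by gross loss, per trade on average."""
    return (win_rate * avg_win) / ((1.0 - win_rate) * avg_loss)


per_trade = expectancy(0.72, 0.04, 0.12)
pf = profit_factor(0.72, 0.04, 0.12)
```

A profit factor below 1 and a negative expectancy are two views of the same fact, so the two results should agree in sign.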

Exercise 9: Statistical Power

(a) Calculate the minimum number of trades needed to detect a Sharpe ratio of 0.8 with 80% power at the 5% significance level.
(b) Repeat the calculation for 90% power at the 1% significance level.
(c) Discuss the practical implications for prediction market backtesting, where markets often have limited trading history.
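One common approximation for parts (a) and (b) is n = ((z_alpha + z_beta) / SR)^2, where SR is measured per trade. The sketch below assumes that formula, a one-sided test by default, and roughly normal i.i.d. trade returns. If the 0.8 is an annualized Sharpe, convert it to per-trade units first (for example, by dividing by the square root of the number of trades per year).

```python
import math
from statistics import NormalDist


def min_trades(sharpe_per_trade, alpha, power, two_sided=False):
    """Approximate sample size to detect a per-trade Sharpe ratio against zero."""
    inv = NormalDist().inv_cdf
    z_alpha = inv(1.0 - alpha / 2.0) if two_sided else inv(1.0 - alpha)
    z_beta = inv(power)
    return math.ceil(((z_alpha + z_beta) / sharpe_per_trade) ** 2)


n_a = min_trades(0.8, alpha=0.05, power=0.80)
n_b = min_trades(0.8, alpha=0.01, power=0.90)
```

Because n scales with 1 / SR^2, halving the per-trade Sharpe roughly quadruples the number of trades required.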

Exercise 10: Multiple Comparisons

You test 30 strategies on the same prediction market dataset. Five strategies show p-values below 0.05.

(a) Apply the Bonferroni correction. How many strategies remain significant?
(b) Apply the Benjamini-Hochberg procedure with FDR = 0.10. How many strategies remain significant?
(c) Which correction method is more appropriate for exploratory backtesting research, and why?
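The exercise does not list the individual p-values, so the answer depends on what you assume for them. The sketch below implements both corrections and runs them on a hypothetical set of 30 p-values, five of which are below 0.05. The numbers are made up purely to show the mechanics.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject when p <= alpha / m, where m is the number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, fdr=0.10):
    """Reject the k smallest p-values, k = largest rank with p_(k) <= (k / m) * fdr."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values: five below 0.05, the other 25 clearly insignificant.
p_values = [0.001, 0.004, 0.009, 0.02, 0.04] + [0.5] * 25
n_bonferroni = sum(bonferroni(p_values, alpha=0.05))
n_bh = sum(benjamini_hochberg(p_values, fdr=0.10))
```

On this made-up input Benjamini-Hochberg keeps more strategies than Bonferroni, which is the usual pattern since it controls the false discovery rate rather than the family-wise error rate.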


Section B: Implementation (Exercises 11--20)

Exercise 11: Build a Data Handler

Implement a CSVDataHandler class that inherits from the DataHandler abstract base class defined in Section 26.3. The handler should:
- Load data from a CSV file with columns: timestamp, market_id, last_price, bid, ask, volume, bid_size, ask_size
- Support multiple markets simultaneously
- Enforce chronological ordering
- Prevent any possibility of lookahead (only emit data up to the current timestamp)

Exercise 12: Implement a Mean-Reversion Strategy

Implement a MeanReversionStrategy class that inherits from Strategy. The strategy should:
- Compute a rolling z-score of the price over a configurable lookback window
- Generate a BUY signal when z-score < -threshold
- Generate a SELL signal when z-score > +threshold
- Return no signal when abs(z-score) <= threshold
- Use only data available through the DataHandler.get_latest() method
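The signal rule itself can be sketched independently of the framework classes. The standalone function below is hypothetical and does not use the chapter's Strategy or DataHandler interfaces. It takes the price history visible so far, measures the latest price against the mean and standard deviation of the preceding `lookback` prices (one possible design choice), and returns no signal when there is too little history or the window is flat.

```python
from statistics import mean, stdev


def zscore_signal(history, lookback, threshold):
    """Return 'BUY', 'SELL', or None from prices seen so far (needs lookback >= 2)."""
    if len(history) < lookback + 1:
        return None                       # not enough history yet
    latest = history[-1]
    window = history[-(lookback + 1):-1]  # the `lookback` prices before the latest
    sd = stdev(window)
    if sd == 0:
        return None                       # flat window: z-score undefined
    z = (latest - mean(window)) / sd
    if z < -threshold:
        return "BUY"
    if z > threshold:
        return "SELL"
    return None


base = [0.50, 0.52, 0.48, 0.50, 0.51, 0.49]  # hypothetical prices
signal_low = zscore_signal(base + [0.30], lookback=6, threshold=1.5)
signal_high = zscore_signal(base + [0.70], lookback=6, threshold=1.5)
```

Inside a real strategy class, `history` would come from whatever the data handler exposes as already-observed bars.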

Exercise 13: Build a Portfolio Manager

Implement a SimplePortfolio class that inherits from Portfolio. It should:
- Track positions in multiple markets simultaneously
- Enforce a maximum position size per market (configurable)
- Enforce a maximum total portfolio allocation (configurable)
- Convert signals to orders only when position limits allow
- Track realized and unrealized P&L

Exercise 14: Implement Fill Simulation

Extend the RealisticExecutionSimulator from Section 26.5 to support:
- A configurable "fill probability" that varies with order size relative to available liquidity
- Time-varying slippage (higher during volatile periods)
- A "queue position" model for limit orders (your order fills only after orders ahead of you in the queue)

Exercise 15: Cost Model for PredictIt

Implement a PredictItCostModel that models PredictIt's unique fee structure:
- 10% fee on profits per market (not per trade; calculated at market resolution)
- 5% withdrawal fee
- $850 maximum position per market
- No fee on losing trades

Your model should correctly track cumulative P&L per market to calculate the profit fee at resolution.
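The core fee arithmetic might look like the sketch below, which uses the rates stated in the exercise. The function name and signature are hypothetical, and position limits and per-market P&L tracking are left to the full model.

```python
def predictit_fees(market_pnl, withdrawal_amount=0.0,
                   profit_fee_rate=0.10, withdrawal_fee_rate=0.05):
    """Return (profit_fee, withdrawal_fee) in dollars for one resolved market.

    market_pnl is the cumulative P&L in that market at resolution;
    losses incur no profit fee.
    """
    profit_fee = profit_fee_rate * max(market_pnl, 0.0)
    withdrawal_fee = withdrawal_fee_rate * withdrawal_amount
    return profit_fee, withdrawal_fee


winning_fees = predictit_fees(market_pnl=100.0)
losing_fees = predictit_fees(market_pnl=-50.0, withdrawal_amount=200.0)
```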

Exercise 16: Vectorized Backtester

Implement a vectorized backtester that operates on pandas DataFrames for fast strategy screening. The backtester should:
- Accept a signal DataFrame (same index as price data, values of +1, -1, 0)
- Apply configurable transaction costs
- Compute an equity curve
- Return a dictionary of performance metrics
- Include a warning if the signal appears to use future information (basic check: correlation between signal and future returns is suspiciously high)

Exercise 17: Walk-Forward with Cross-Validation

Extend the WalkForwardEngine from Section 26.7 to support purged k-fold cross-validation, the single-test-fold special case of combinatorial purged cross-validation (CPCV):
- Within each training window, use k-fold cross-validation with a purge gap
- The purge gap prevents information leakage between train and validation folds
- Select parameters based on average cross-validated performance rather than single in-sample performance

Exercise 18: Custom Performance Metric

Implement a "Prediction Market Efficiency" metric that captures how well a strategy exploits prediction market mispricing. Define it as:

$$PME = \frac{\text{Average Edge Captured}}{\text{Average Edge Available}}$$

Where "edge available" is the absolute difference between the market price and the true resolution probability, and "edge captured" is the profit earned relative to the edge available at the time of entry.

Exercise 19: Regime-Aware Backtester

Implement a regime-detection module that identifies different market regimes (e.g., low volatility, high volatility, trending, mean-reverting) and reports strategy performance separately for each regime. Use a hidden Markov model with two states.

Exercise 20: Backtest Comparison Framework

Build a framework that can compare two strategies side-by-side:
- Run both on the same data
- Compute all metrics for each
- Perform a paired t-test to determine if the difference in returns is statistically significant
- Generate a comparison report with overlaid equity curves
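The paired t-test step reduces to a one-sample test on the per-period return differences. A minimal sketch of the statistic using only the standard library (the p-value would then come from a t-distribution with n - 1 degrees of freedom, for example via a statistics package of your choice):

```python
import math
from statistics import mean, stdev


def paired_t_statistic(returns_a, returns_b):
    """Return (t_statistic, degrees_of_freedom) for paired return series."""
    if len(returns_a) != len(returns_b):
        raise ValueError("paired series must have the same length")
    diffs = [a - b for a, b in zip(returns_a, returns_b)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t_stat, n - 1


t_stat, dof = paired_t_statistic([1.0, 2.0, 3.0], [0.0, 0.0, 0.0])  # toy inputs
```

Pairing matters because both strategies face the same market days; testing the differences removes the shared variation.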


Section C: Analysis and Research (Exercises 21--30)

Exercise 21: Sharpe Ratio Distribution

Simulate 10,000 random strategies (random signals on random returns) and plot the distribution of backtest Sharpe ratios. What is the 95th percentile Sharpe for random strategies with 500 trades? How does this change with 100, 200, and 1000 trades?
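A starting point for the simulation, kept to the standard library. It assumes standard normal per-trade returns, coin-flip long/short signals, and an unannualized per-trade Sharpe (mean over sample standard deviation). All three are illustrative choices. Plotting is left out.

```python
import random
from statistics import mean, stdev


def random_sharpe_percentile(n_strategies, n_trades, pct=0.95, seed=0):
    """Empirical percentile of per-trade Sharpe ratios for pure-noise strategies."""
    rng = random.Random(seed)
    sharpes = []
    for _ in range(n_strategies):
        # Random +1/-1 signal applied to a random return: no real edge.
        pnl = [rng.choice((-1, 1)) * rng.gauss(0.0, 1.0) for _ in range(n_trades)]
        sharpes.append(mean(pnl) / stdev(pnl))
    sharpes.sort()
    return sharpes[int(pct * (len(sharpes) - 1))]


p95_small = random_sharpe_percentile(n_strategies=1000, n_trades=100, seed=1)
```

Under these assumptions the percentile should shrink roughly like 1 / sqrt(n_trades), which is worth checking against your simulated values for 100, 200, 500, and 1000 trades.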

Exercise 22: Impact of Spread on Strategy Viability

For a strategy with a gross Sharpe ratio of 1.5 that trades once per day, plot how the net Sharpe ratio declines as the spread increases from 0 to 10 cents. At what spread does the strategy become unprofitable? How does trading frequency affect this relationship?

Exercise 23: Optimal Walk-Forward Window Size

Using simulated data with a known signal embedded in noise, test walk-forward analysis with training windows of 30, 60, 90, 120, 180, and 365 days. Plot out-of-sample performance as a function of training window size. Is there an optimal window size? How does it relate to the signal's characteristics?

Exercise 24: Bootstrap Analysis of Drawdown

Generate 10,000 bootstrap samples of a strategy's return series and compute the maximum drawdown distribution. Report the 5th, 25th, 50th, 75th, and 95th percentiles. How does this compare to the single backtest drawdown? What does this tell you about drawdown uncertainty?
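A minimal sketch of the procedure, assuming compounded returns and an i.i.d. bootstrap (resampling single periods with replacement). The i.i.d. scheme destroys any autocorrelation in the returns, so a block bootstrap may be more appropriate if the series shows streaks.

```python
import random


def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity curve, as a fraction."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


def bootstrap_drawdowns(returns, n_samples=10_000, seed=0):
    """Sorted max drawdowns of resampled return paths (i.i.d. bootstrap)."""
    rng = random.Random(seed)
    n = len(returns)
    return sorted(max_drawdown(rng.choices(returns, k=n)) for _ in range(n_samples))


def percentile(sorted_values, pct):
    """Simple empirical percentile from a pre-sorted list (pct in 0..100)."""
    return sorted_values[int(pct / 100.0 * (len(sorted_values) - 1))]


sample_returns = [0.01, -0.02, 0.03, -0.01, 0.02, -0.03]  # hypothetical
dds = bootstrap_drawdowns(sample_returns, n_samples=500, seed=42)
summary = {p: percentile(dds, p) for p in (5, 25, 50, 75, 95)}
```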

Exercise 25: The Bailey-Lopez de Prado Minimum Backtest Length

Implement the Bailey-Lopez de Prado formula for the minimum backtest length (MBL) needed to avoid false discoveries:

$$MBL \geq \frac{1}{SR^2} \left[ (z_\alpha + z_\beta)^2 + \frac{(z_\alpha + z_\beta)^4}{4} \hat{\gamma}_3^2 + \frac{(z_\alpha + z_\beta)^2}{4} (\hat{\gamma}_4 - 3) \right]$$

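Here $SR$ is the Sharpe ratio, $\hat{\gamma}_3$ and $\hat{\gamma}_4$ are the skewness and kurtosis of returns, and $z_\alpha$ and $z_\beta$ are the standard normal critical values for the chosen significance level and power. Calculate MBL for a strategy with Sharpe 1.0, skewness -0.5, and excess kurtosis 3.0 (that is, $\hat{\gamma}_4 = 6$), stating the significance level and power you assume.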
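A direct transcription of the formula above into code might look like this. The exercise does not fix the significance level or power, so the defaults below (one-sided 5% and 80%) are assumptions. The function takes excess kurtosis directly, which corresponds to the $(\hat{\gamma}_4 - 3)$ term.

```python
from statistics import NormalDist


def min_backtest_length(sharpe, skew, excess_kurtosis, alpha=0.05, power=0.80):
    """Evaluate the MBL bound as written above; units match those of the Sharpe ratio."""
    inv = NormalDist().inv_cdf
    z = inv(1.0 - alpha) + inv(power)  # z_alpha + z_beta (one-sided alpha assumed)
    bracket = (z ** 2
               + (z ** 4 / 4.0) * skew ** 2
               + (z ** 2 / 4.0) * excess_kurtosis)
    return bracket / sharpe ** 2


mbl = min_backtest_length(sharpe=1.0, skew=-0.5, excess_kurtosis=3.0)
mbl_normal = min_backtest_length(sharpe=1.0, skew=0.0, excess_kurtosis=0.0)
```

Comparing `mbl` with `mbl_normal` shows how much the non-normal terms lengthen the required backtest under this formula.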

Exercise 26: Transaction Cost Sensitivity Surface

Create a 3D surface plot showing strategy net return as a function of two cost parameters: spread (0--10 cents) and fee rate (0--5%). Identify the "break-even" contour where the strategy transitions from profitable to unprofitable.

Exercise 27: Stale Price Detection

Implement a stale price detection algorithm that identifies periods in prediction market data where the quoted price has not changed for an unusually long time. The algorithm should:
- Flag prices that have not moved in more than 2x the average inter-trade interval
- Distinguish between genuinely stale quotes and markets that legitimately trade at stable prices
- Adjust the execution simulator to use wider spreads during detected stale periods
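The first requirement can be sketched as a simple gap filter over the timestamps at which the price last changed. The function name and the use of plain numeric timestamps are illustrative assumptions. Separating stale quotes from legitimately stable markets needs extra information (for example, volume traded during the gap) that this sketch does not use.

```python
def stale_gaps(timestamps, multiple=2.0):
    """Return (start, end) pairs where the gap exceeds `multiple` x the mean interval.

    `timestamps` are the sorted times at which the price changed.
    """
    intervals = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if not intervals:
        return []
    threshold = multiple * (sum(intervals) / len(intervals))
    return [(a, b) for a, b in zip(timestamps, timestamps[1:]) if b - a > threshold]


gaps = stale_gaps([0, 1, 2, 3, 13, 14])  # hypothetical times in minutes
```

Note that long gaps inflate the average interval themselves, so a median-based threshold may be more robust in practice.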

Exercise 28: Portfolio-Level Backtesting

Extend the backtesting framework to handle a portfolio of 50 prediction markets simultaneously. The portfolio should:
- Respect a total capital constraint
- Implement equal-weight and risk-parity allocation schemes
- Compute portfolio-level metrics (including cross-market correlation effects)
- Handle the fact that different markets have different resolution dates

Exercise 29: Slippage Model Calibration

Given a dataset of 1,000 actual prediction market executions (with order size, market conditions at time of order, and actual fill price), calibrate the parameters of the square-root impact model. Report the calibrated $\beta$ coefficient and its confidence interval. Compare the calibrated model's predictions to a simple constant-slippage model.
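One way to calibrate the exponent, assuming the model has the form impact = c * (Q / V) ** beta with positive observed impacts, is ordinary least squares on the log-log relationship. The sketch below returns only point estimates; a confidence interval could come from the regression standard error or from bootstrapping the executions. The data here is synthetic and noiseless, purely to check the fit.

```python
import math


def fit_impact_exponent(participation, impact):
    """Fit log(impact) = log(c) + beta * log(Q / V) by OLS; return (beta, c)."""
    xs = [math.log(p) for p in participation]
    ys = [math.log(i) for i in impact]
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    beta = sxy / sxx
    scale = math.exp(y_bar - beta * x_bar)
    return beta, scale


# Synthetic check: data generated with beta = 0.5 and c = 0.03.
participation = [0.01, 0.05, 0.10, 0.25, 0.50]
impact = [0.03 * p ** 0.5 for p in participation]
beta_hat, scale_hat = fit_impact_exponent(participation, impact)
```

Real fills include zero or negative slippage, which the log transform cannot handle, so those observations need separate treatment (for example, nonlinear least squares on the raw values).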

Exercise 30: End-to-End Backtest Pipeline

Build a complete end-to-end backtest pipeline that:
1. Downloads prediction market data from an API (or loads from a provided CSV)
2. Cleans and validates the data
3. Implements a momentum strategy (buy markets whose prices have risen over the past N periods)
4. Runs walk-forward backtesting with parameter optimization
5. Applies the fill simulator and transaction cost model
6. Generates a full performance report with statistical significance tests
7. Produces a go/no-go recommendation for paper trading

This should be a single script that runs from start to finish and produces a PDF or HTML report.