Appendix G: Answers to Selected Exercises
This appendix provides solutions to odd-numbered exercises from Parts A and B of selected chapters. Solutions are presented with key steps shown so that readers can check their reasoning and identify any errors in their approach. For brevity, intermediate algebraic steps are occasionally condensed.
Chapter 1: What Are Prediction Markets?
Exercise 1. Explain how a prediction market differs from a poll. Give two specific advantages prediction markets have over polls for estimating the probability of a future event.
A poll asks respondents to state their beliefs or preferences and reports the distribution of responses. A prediction market asks participants to back their beliefs with money (or points) by buying and selling contracts, and produces a price that reflects the crowd's aggregated beliefs. Two advantages:
(a) Incentive alignment. Because participants have financial stakes, they are motivated to research carefully and report honestly rather than engage in cheap talk. Misinformed or insincere participants lose money to better-informed traders, which filters low-quality information out of the price.
(b) Continuous updating. Prediction market prices update in real time as new information arrives, whereas polls are conducted periodically and may reflect stale information by the time results are published.
Exercise 3. A binary contract on "Will it rain in New York City tomorrow?" is trading at $0.72. Interpret this price in probabilistic terms. What would a trader who believes the true probability of rain is 0.85 do, and why?
The market price of $0.72 implies that the crowd consensus probability of rain is approximately 72%. A trader who believes the true probability is 0.85 considers the Yes contract underpriced: they expect to receive $1.00 with probability 0.85 while paying only $0.72, yielding a positive expected value of $0.85 - $0.72 = $0.13 per contract. Therefore, this trader should buy Yes contracts. By doing so, the trader pushes the price upward, moving it closer to what they believe is the correct probability.
Exercise 5. Name three real-world prediction market platforms and describe one distinguishing characteristic of each.
(a) Kalshi: A CFTC-regulated Designated Contract Market in the United States, distinguishing it as one of the few fully regulated real-money prediction market exchanges.
(b) Polymarket: A decentralized prediction market built on blockchain infrastructure (Polygon), distinguishing it through its use of smart contracts for trustless settlement without a centralized intermediary.
(c) Metaculus: A community forecasting platform that uses proper scoring rules rather than a market mechanism. Its distinguishing feature is a structured question format with extensive community discussion and a long public track record of calibrated forecasts.
Chapter 3: Probability Foundations for Prediction Markets
Exercise 1. A complete set of contracts on an election has three outcomes: Candidate A, Candidate B, and Candidate C. Their prices are $0.45, $0.38, and $0.20. Is there an arbitrage opportunity? If so, describe the trade.
The sum of the prices is $0.45 + $0.38 + $0.20 = $1.03. Since a complete bundle always pays exactly $1.00, the contracts are collectively overpriced by $0.03. An arbitrageur should sell one contract of each outcome (selling a complete bundle), receiving $1.03. Regardless of which candidate wins, the arbitrageur pays out $1.00 to settle the one winning contract, netting a risk-free profit of $0.03 per bundle.
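The bundle check can be written as a one-line calculation (a Python sketch; the function name is illustrative):

```python
# Check a complete set of outcome contracts for bundle arbitrage.
def bundle_arbitrage(prices, payout=1.00):
    """Risk-free profit per bundle: positive means sell the bundle,
    negative means buy it (rounded to suppress float noise)."""
    return round(sum(prices) - payout, 10)

print(bundle_arbitrage([0.45, 0.38, 0.20]))  # 0.03 profit per bundle sold
```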
Exercise 3. Using the Dutch Book theorem, explain why a coherent probability assignment over mutually exclusive, exhaustive outcomes must sum to 1.
The Dutch Book theorem states that if a bettor's implied probabilities do not satisfy the axioms of probability, there exists a set of bets (a Dutch book) that guarantees a loss for the bettor. Specifically, if a set of prices for mutually exclusive and exhaustive outcomes sums to more or less than 1, a counterparty can construct a combination of trades that yields a sure profit. If the sum exceeds 1, sell all contracts and pocket the excess. If the sum is less than 1, buy all contracts at a total cost below the guaranteed $1 payout. To avoid being Dutch-booked, a rational agent must set prices (probabilities) that sum to exactly 1.
Exercise 5. A trader assesses P(A) = 0.6 and P(B | A) = 0.5. Compute P(A and B). If P(B) = 0.4, compute P(A | B) using Bayes' theorem.
By the multiplication rule: P(A and B) = P(A) * P(B | A) = 0.6 * 0.5 = 0.30.
By Bayes' theorem: P(A | B) = P(B | A) * P(A) / P(B) = 0.5 * 0.6 / 0.4 = 0.30 / 0.40 = 0.75.
Chapter 7: Combinatorial and Conditional Markets
Exercise 1. A conditional market asks: "What is the probability that GDP growth exceeds 3%, conditional on Tax Policy X being enacted?" Explain the structure of this market and what a price of $0.55 means.
This market consists of contracts that only become active (i.e., are resolved) if the conditioning event "Tax Policy X is enacted" actually occurs. If Tax Policy X is not enacted, all contracts are voided and participants receive their money back. A price of $0.55 means that, conditional on Tax Policy X being enacted, the market's consensus probability that GDP growth exceeds 3% is approximately 55%. This allows decision-makers to compare the expected economic effect of Policy X against alternatives by examining their respective conditional markets.
Exercise 3. With three binary questions, how many atomic states exist in a combinatorial market? Why does this pose a computational challenge for market makers?
Three binary questions produce 2^3 = 8 atomic states (one for each possible combination of Yes/No outcomes across the three questions). More generally, n binary questions produce 2^n atomic states. This poses a computational challenge because the market maker must maintain consistent prices across all 2^n states, and traders may wish to buy or sell contracts on arbitrary Boolean combinations of outcomes. Pricing queries and the LMSR cost function calculations can become exponential in n, making naive approaches infeasible for large numbers of questions.
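A naive sketch in Python makes the state explosion concrete: the explicit enumeration below over all 2^n states is exactly what becomes infeasible as n grows (the `lmsr_cost` helper and the liquidity value b = 100 are illustrative):

```python
import math
from itertools import product

def lmsr_cost(quantities, b=100.0):
    """LMSR cost function C(q) = b * ln(sum_i exp(q_i / b))."""
    return b * math.log(sum(math.exp(q / b) for q in quantities))

n = 3
states = list(product([0, 1], repeat=n))           # 2^3 = 8 atomic states
q = {s: 0.0 for s in states}                       # no shares outstanding
weights = {s: math.exp(q[s] / 100.0) for s in states}
price = weights[states[0]] / sum(weights.values())
print(len(states), price)  # 8 states, uniform price 0.125
```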
Exercise 5. Describe one practical application of conditional prediction markets in corporate decision-making.
A company considering two possible product strategies (Strategy A and Strategy B) could create conditional markets asking: "What will our revenue be in Q4, given that we adopt Strategy A?" and "What will our revenue be in Q4, given that we adopt Strategy B?" Employees and stakeholders trade based on their private information about customer demand, technical feasibility, and competitive dynamics. Management then compares the conditional market prices to identify which strategy the internal crowd believes will generate higher revenue, supplementing traditional analysis with an incentive-compatible information aggregation tool.
Chapter 8: Bayesian Thinking for Traders
Exercise 1. You believe the probability of an event is 0.4 (your prior). You observe evidence that is three times more likely under the event than under its negation (likelihood ratio = 3). What is your posterior probability?
Let H = event, E = evidence. Prior: P(H) = 0.4, so P(not H) = 0.6. The likelihood ratio is P(E | H) / P(E | not H) = 3.
Using the odds form of Bayes' theorem:
- Prior odds = P(H) / P(not H) = 0.4 / 0.6 = 2/3
- Posterior odds = Prior odds * Likelihood ratio = (2/3) * 3 = 2
- Posterior probability = Posterior odds / (1 + Posterior odds) = 2 / 3 = 0.667
After observing the evidence, the posterior probability is approximately 0.667 (66.7%).
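The odds-form update is easy to express as a reusable helper (a Python sketch; `posterior_from_odds` is an illustrative name):

```python
def posterior_from_odds(prior, likelihood_ratio):
    """Odds-form Bayes: posterior odds = prior odds * likelihood ratio."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

print(round(posterior_from_odds(0.4, 3), 4))  # 0.6667
```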
Exercise 3. You model an election outcome using a Beta(3, 7) prior (favoring the incumbent to lose). After observing 12 polls, of which 8 favor the incumbent, what is the posterior distribution? What is the posterior mean?
The Beta distribution is a conjugate prior for Bernoulli/binomial data. Starting with Beta(alpha, beta) = Beta(3, 7) and observing 8 successes (incumbent favored) and 4 failures (incumbent not favored) out of 12 polls:
Posterior = Beta(3 + 8, 7 + 4) = Beta(11, 11).
Posterior mean = alpha / (alpha + beta) = 11 / (11 + 11) = 11/22 = 0.50.
The data has shifted the belief from a prior mean of 3/10 = 0.30 (against the incumbent) to a posterior mean of 0.50, reflecting the surprisingly strong polling performance.
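The conjugate update can be verified in a few lines (a Python sketch; the function name is illustrative):

```python
def beta_update(alpha, beta, successes, failures):
    """Conjugate Beta-binomial update: returns posterior parameters and mean."""
    a, b = alpha + successes, beta + failures
    return (a, b), a / (a + b)

params, mean = beta_update(3, 7, successes=8, failures=4)
print(params, mean)  # (11, 11) 0.5
```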
Exercise 5. Explain the difference between the prior, likelihood, and posterior in Bayesian inference. Why is this framework natural for prediction market traders?
The prior represents the trader's initial belief about the probability of an event before considering new evidence. The likelihood measures how probable the observed evidence is under each possible state of the world. The posterior is the updated belief after combining the prior and the likelihood via Bayes' theorem.
This framework is natural for prediction market traders because they constantly update their beliefs as new information arrives (polling data, economic reports, news events). The market price can be interpreted as a crowd prior, and a disciplined trader uses Bayesian reasoning to determine whether their posterior probability differs enough from the market price to justify a trade. Bayesian updating also provides a principled method for avoiding both over-reaction and under-reaction to news.
Chapter 9: Scoring Rules and Forecast Evaluation
Exercise 1. Compute the Brier score for a forecaster who assigns probabilities of 0.9, 0.7, 0.3, and 0.8 to four events, where the first, second, and fourth events occur (o = 1) and the third does not (o = 0).
BS = (1/N) * sum of (f_i - o_i)^2.
- Event 1: (0.9 - 1)^2 = 0.01
- Event 2: (0.7 - 1)^2 = 0.09
- Event 3: (0.3 - 0)^2 = 0.09
- Event 4: (0.8 - 1)^2 = 0.04
BS = (1/4)(0.01 + 0.09 + 0.09 + 0.04) = (1/4)(0.23) = 0.0575.
This is a good Brier score (close to 0), reflecting confident and mostly accurate predictions.
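The calculation above can be checked with a short function (a Python sketch; the function name is illustrative):

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between probability forecasts and 0/1 outcomes."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

print(round(brier_score([0.9, 0.7, 0.3, 0.8], [1, 1, 0, 1]), 4))  # 0.0575
```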
Exercise 3. Prove that the Brier score is a proper scoring rule — that is, the expected Brier score is minimized when the forecaster reports their true belief p.
Consider a single binary event with true probability p. The forecaster reports r. The expected Brier score is:
E[BS] = p * (r - 1)^2 + (1 - p) * (r - 0)^2 = p(1 - r)^2 + (1 - p)r^2 = p - 2pr + pr^2 + r^2 - pr^2 = p - 2pr + r^2
Taking the derivative with respect to r and setting it to zero:
dE[BS]/dr = -2p + 2r = 0, which gives r = p.
The second derivative is 2 > 0, confirming this is a minimum. Therefore, the expected Brier score is minimized when the reported probability r equals the true probability p, which is exactly the definition of a proper scoring rule.
Exercise 5. A forecaster's reliability diagram shows that when they predict 80%, events actually occur 65% of the time. Is this forecaster overconfident or underconfident? Suggest a recalibration approach.
The forecaster is overconfident: they assign a probability of 80% to events that only occur 65% of the time, meaning their predictions are too extreme (too far from 50%). To recalibrate, one could apply:
(a) Platt scaling: Fit a logistic regression of outcomes against the forecaster's raw probabilities. The fitted sigmoid maps 0.80 to something closer to 0.65.
(b) Isotonic regression: A nonparametric approach that fits a monotonically non-decreasing step function to the (predicted, observed) pairs across all bins of the reliability diagram.
Either approach would produce a calibration function g(f) such that g(0.80) is approximately 0.65, and more generally, predictions across all levels would be adjusted toward better calibration.
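The isotonic approach can be sketched with the pool-adjacent-violators (PAV) algorithm. The following is a minimal Python illustration on hypothetical reliability-diagram bins; in practice one would use a library implementation such as scikit-learn's IsotonicRegression:

```python
def pav_calibrate(observed):
    """Pool-adjacent-violators: fit a non-decreasing sequence to observed
    frequencies (one per reliability-diagram bin, sorted by predicted prob)."""
    merged = []  # each entry is [block mean, block weight]
    for obs in observed:
        merged.append([obs, 1])
        # Pool adjacent blocks while monotonicity is violated.
        while len(merged) > 1 and merged[-2][0] > merged[-1][0]:
            m2, m1 = merged.pop(), merged.pop()
            w = m1[1] + m2[1]
            merged.append([(m1[0] * m1[1] + m2[0] * m2[1]) / w, w])
    return [mean for mean, w in merged for _ in range(w)]

# Hypothetical bins: observed frequencies 0.3, 0.2, 0.65 violate monotonicity,
# so the first two bins are pooled to their average, 0.25.
print(pav_calibrate([0.3, 0.2, 0.65]))  # [0.25, 0.25, 0.65]
```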
Chapter 13: Aggregating Forecasts and Crowd Wisdom
Exercise 1. Five forecasters assign probabilities of 0.6, 0.7, 0.65, 0.72, and 0.63 to an event. Compute the simple average and the median. Which do you prefer and why?
Simple average = (0.6 + 0.7 + 0.65 + 0.72 + 0.63) / 5 = 3.30 / 5 = 0.66.
Ordering the values: 0.60, 0.63, 0.65, 0.70, 0.72. The median = 0.65.
Both give similar results here because the distribution of forecasts is relatively symmetric and there are no extreme outliers. In general, the mean is preferred when forecasters are roughly equally skilled and conditions are benign. The median is more robust to outliers and manipulation: if the forecaster who reported 0.72 had instead reported 0.99, the mean would shift substantially (to 0.714) while the median would remain 0.65. For small groups with potential outlier forecasters, the median is often the safer choice.
Exercise 3. Explain what extremizing does and why it is useful when aggregating probabilities from multiple forecasters.
Extremizing is a transformation that pushes an aggregated probability away from 0.5 toward the nearer extreme (0 or 1). For example, if the average forecast is 0.7, an extremized version might be 0.78. This is useful because when forecasters share overlapping information (as is common when they read the same news sources), a simple average effectively double-counts shared evidence. The result is an aggregate that is insufficiently confident. Extremizing corrects for this by recognizing that if multiple partially independent sources all point in the same direction, the true probability should be more extreme than the naive average suggests. Mathematically, extremizing can be performed by converting probabilities to log-odds, multiplying by a factor d > 1, and converting back: extremized(p) = logistic(d * logit(p)).
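The log-odds transformation at the end of the paragraph can be written directly (a Python sketch; d = 1.5 is an arbitrary illustrative factor):

```python
import math

def extremize(p, d=1.5):
    """Push probability p away from 0.5 in log-odds space by factor d > 1."""
    return 1 / (1 + math.exp(-d * math.log(p / (1 - p))))

print(round(extremize(0.7), 3))  # roughly 0.781, more extreme than 0.7
```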
Exercise 5. A team of superforecasters achieves a Brier score of 0.149 on a set of geopolitical questions, while the prediction market price achieves 0.178. Does this prove that superforecasters are always better than prediction markets? Discuss at least two caveats.
No, this single comparison does not prove that superforecasters are always superior. Caveats include:
(a) Sample specificity: The result holds for this particular set of questions over this particular time period. Different question domains, difficulty levels, or time horizons might favor markets.
(b) Statistical significance: A difference of 0.029 in Brier scores may not be statistically significant given the number of questions. A formal hypothesis test or bootstrap confidence interval would be needed to rule out chance.
(c) Incentive structure: Superforecasters in the Good Judgment Project were carefully selected and highly motivated by the competitive tournament format. Prediction markets may perform differently depending on their participant pool, liquidity levels, and the financial stakes involved.
(d) Scalability: Superforecaster teams require training, management, and selection, whereas prediction markets can aggregate information from large, anonymous populations without centralized coordination.
Chapter 15: Option-Theoretic Perspectives
Exercise 1. Explain the structural similarity between a binary prediction market contract and a European binary (digital) option.
Both instruments pay a fixed amount ($1) if a specified condition is satisfied at a specified time, and nothing otherwise. For a European binary call option, the condition is that the underlying asset price exceeds the strike price at expiration. For a binary prediction market contract, the condition is that a specified real-world event occurs by the resolution date. In both cases, the fair price of the instrument equals the discounted risk-neutral probability of the condition being met. The key difference is that binary options are written on tradable financial assets with continuous price paths, whereas prediction market contracts can reference any verifiable event.
Exercise 3. As a prediction market contract approaches its resolution date, what happens to the price behavior, and why? Compare this to option time decay.
As resolution approaches, the remaining uncertainty about the outcome decreases — news that will determine the result has largely arrived or will arrive imminently. The contract price tends to migrate toward 0 or 1 as the outcome becomes increasingly apparent. This is analogous to option time decay (theta): as an option approaches expiration, its time value declines because there is less time for the underlying to move favorably. In both cases, the time dimension compresses the distribution of possible values. For prediction markets specifically, the "volatility" of the contract price decreases as the resolution date nears (except in rare cases where the outcome remains genuinely uncertain until the last moment).
Chapter 17: Decentralized Prediction Markets
Exercise 1. Describe the oracle problem in decentralized prediction markets. Why can't a smart contract simply "look up" a real-world result?
Smart contracts on a blockchain execute deterministically based on on-chain data. They cannot natively access off-chain information such as election results, weather data, or sports scores because the blockchain is a closed, self-contained system. If nodes could independently query external APIs, they might receive different results (due to timing, API errors, or manipulation), breaking consensus. The oracle problem refers to the challenge of bringing reliable, trustworthy external data onto the blockchain. Solutions include decentralized oracle networks (e.g., Chainlink) that aggregate data from multiple sources, Schelling-point mechanisms (e.g., Kleros, UMA) where reporters stake tokens and are rewarded for convergence on the truth, and trusted centralized oracles with reputational stakes.
Exercise 3. Compare the LMSR and the CPMM as AMM designs for prediction markets. Give one advantage and one disadvantage of each.
LMSR (Logarithmic Market Scoring Rule):
- Advantage: Bounded loss for the market maker (subsidizer). The maximum possible loss is determined by the liquidity parameter b and the number of outcomes, making the subsidy requirement predictable.
- Disadvantage: Requires an upfront subsidy to initialize. Someone must fund the initial liquidity, and there is no natural mechanism for liquidity providers to earn returns.
CPMM (Constant Product Market Maker):
- Advantage: Liquidity is crowd-sourced — anyone can deposit tokens into the pool and earn fees proportional to their share, creating a self-sustaining liquidity ecosystem.
- Disadvantage: Liquidity providers suffer impermanent loss when contract prices diverge, and the constant product curve provides less favorable pricing (higher slippage) for large trades compared to the LMSR in typical prediction market settings.
Chapter 22: Feature Engineering for Prediction Markets
Exercise 1. For a prediction market on "Will the Federal Reserve raise interest rates at the next meeting?", list five features you would engineer and explain the signal each captures.
(a) Fed funds futures implied rate: The market-implied probability of a rate hike derived from fed funds futures contracts. This directly captures financial market expectations.
(b) Recent CPI change (month-over-month): Inflation data is a primary input to Fed decisions. Higher-than-expected CPI readings increase the probability of a rate hike.
(c) Unemployment rate (latest release): Employment conditions influence monetary policy. Lower unemployment may encourage tightening.
(d) Fed governor speech sentiment score: NLP-derived sentiment from recent speeches by FOMC members can reveal hawkish or dovish leanings before the official decision.
(e) Prediction market price momentum (7-day slope): The recent trend in the contract's own price captures the market's evolving interpretation of all available information and can signal directional momentum.
Exercise 3. Why might including the current prediction market price as a feature in your model lead to problems? Discuss at least two issues.
(a) Circularity / reflexivity: If your model uses the current market price as a feature and your trades move the market price, there is a feedback loop. The feature (price) changes as a result of your action (trading), potentially leading to instability or self-fulfilling (or self-defeating) predictions.
(b) Leakage of the target variable: In a well-functioning market, the price is the best available estimate of the probability, which is closely related to the outcome. Using it as a feature in a model trained to predict the same outcome may cause the model to simply learn to parrot the market price, providing no additional edge and making the model unable to identify mispricing.
(c) Stale price risk: In illiquid markets, the last traded price may not reflect current information, making it an unreliable feature despite appearing highly predictive in backtests where resolution is known.
Chapter 23: Machine Learning Models for Forecasting
Exercise 1. Explain the bias-variance tradeoff in the context of building a model to predict prediction market outcomes. How does a random forest address this tradeoff?
The bias-variance tradeoff states that a model's expected prediction error decomposes into bias (error from oversimplifying assumptions), variance (error from sensitivity to training data fluctuations), and irreducible noise. A model that is too simple (e.g., logistic regression with few features) may have high bias (systematically wrong). A model that is too complex (e.g., a single deep decision tree) may have high variance (accurate on training data but erratic on new data).
A random forest addresses this by training many decision trees, each on a bootstrapped sample of the data with a random subset of features. Individual trees have low bias (each tree is deep and flexible) but high variance. Averaging across many such trees dramatically reduces variance while preserving the low bias, yielding a model that generalizes well. This makes random forests particularly well-suited for prediction market modeling where the signal-to-noise ratio is moderate and overfitting is a major risk.
Exercise 3. You train an XGBoost model that achieves 92% accuracy on your training set but only 61% on the test set. Diagnose the problem and propose three remedies.
The large gap between training accuracy (92%) and test accuracy (61%) is a classic sign of overfitting: the model has memorized patterns in the training data that do not generalize.
Three remedies:
(a) Increase regularization: Increase the L1 (reg_alpha) and L2 (reg_lambda) penalty terms, reduce max_depth, or increase min_child_weight to prevent the model from fitting noise.
(b) Reduce model complexity: Use fewer boosting rounds (n_estimators) or a lower learning rate (eta) with early stopping based on validation set performance. This prevents the model from continuing to fit residual noise.
(c) Gather more training data or apply feature selection: More data reduces the variance of the model. Alternatively, removing irrelevant or noisy features prevents the model from finding spurious correlations. Using cross-validation (e.g., 5-fold) to select hyperparameters and features provides a more robust estimate of generalization performance.
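As a sketch, the regularization-focused remedies above might translate into XGBoost parameter settings like the following. Parameter names follow XGBoost's sklearn-style API; the values are illustrative starting points, not tuned recommendations:

```python
# Illustrative XGBoost settings targeting the remedies (a) and (b) above.
overfit_controls = {
    "max_depth": 4,          # (a) shallower trees fit less noise
    "min_child_weight": 5,   # (a) require more evidence per leaf
    "reg_alpha": 1.0,        # (a) L1 regularization
    "reg_lambda": 5.0,       # (a) L2 regularization
    "learning_rate": 0.05,   # (b) smaller steps, paired with early stopping
    "n_estimators": 500,     # (b) upper bound; early stopping picks the round
}
print(sorted(overfit_controls))
```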
Chapter 26: MLOps for Prediction Market Systems
Exercise 1. Explain why model versioning is important in a production prediction market trading system. What can go wrong without it?
Model versioning maintains a record of every model that was trained, the data it was trained on, its hyperparameters, and its performance metrics. Without versioning:
(a) Irreproducibility: If a model begins performing poorly, there is no way to roll back to a previous version that was working well.
(b) No auditability: The trader cannot determine which model was responsible for specific past trades, making it impossible to diagnose losses or attribute performance.
(c) Training-serving skew: Without tracking which model version is deployed, discrepancies between the training environment and the production environment may go undetected.
Tools like MLflow's model registry address these concerns by providing version control, staging environments, and metadata tracking for models throughout their lifecycle.
Exercise 3. Design a monitoring pipeline that detects model drift for a prediction market forecasting system. What metrics would you track, and what actions would trigger retraining?
Key metrics to track:
(a) Prediction distribution shift: Monitor the distribution of the model's output probabilities over time. If the distribution changes significantly (measured by KL divergence or the Kolmogorov-Smirnov statistic) compared to a reference window, the model may be drifting.
(b) Feature distribution shift: Track the distributions of input features. If upstream data sources change format, scale, or availability, the model may receive inputs outside its training distribution.
(c) Realized Brier score (rolling window): Compute the Brier score on resolved markets over a rolling 30-day window. Compare against the model's historical baseline.
Retraining triggers: (i) Rolling Brier score exceeds 120% of the historical baseline for two consecutive weeks. (ii) Feature drift detector fires for more than three features simultaneously. (iii) A scheduled calendar trigger (e.g., monthly retraining) as a baseline safeguard.
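The triggers above can be combined into a simple check (a Python sketch; the function name and default thresholds are illustrative, mirroring the 120%/two-week/three-feature rules stated above):

```python
def should_retrain(rolling_brier, baseline_brier, weeks_over, drifted_features,
                   ratio=1.20, weeks_required=2, max_drifted=3):
    """Fire if the rolling Brier score stays above ratio * baseline for
    enough weeks, or if too many features drift simultaneously."""
    score_trigger = (rolling_brier > ratio * baseline_brier
                     and weeks_over >= weeks_required)
    drift_trigger = drifted_features > max_drifted
    return score_trigger or drift_trigger

print(should_retrain(0.20, 0.16, weeks_over=2, drifted_features=1))  # True
print(should_retrain(0.17, 0.16, weeks_over=1, drifted_features=2))  # False
```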
Chapter 30: Building and Evaluating Strategies
Exercise 1. A trading strategy has annual returns of 18%, a standard deviation of 25%, and the risk-free rate is 4%. Compute the Sharpe ratio and interpret it.
Sharpe ratio = (Mean return - Risk-free rate) / Standard deviation = (0.18 - 0.04) / 0.25 = 0.14 / 0.25 = 0.56.
A Sharpe ratio of 0.56 indicates that the strategy earns 0.56 units of excess return per unit of risk (volatility). This is a modest risk-adjusted performance. For comparison, traditional equity markets have historically delivered Sharpe ratios around 0.3-0.5. A Sharpe ratio above 1.0 is generally considered strong, and above 2.0 is exceptional. The strategy is generating positive risk-adjusted returns but has significant room for improvement.
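The computation is a one-liner (a Python sketch; the function name is illustrative):

```python
def sharpe_ratio(mean_return, risk_free, stdev):
    """Excess return per unit of volatility."""
    return (mean_return - risk_free) / stdev

print(round(sharpe_ratio(0.18, 0.04, 0.25), 2))  # 0.56
```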
Exercise 3. Explain the difference between in-sample backtesting and walk-forward validation. Why is walk-forward validation preferred for prediction market strategies?
In-sample backtesting trains a model on historical data and evaluates it on the same data (or a single held-out test set), which risks overfitting to the specific historical period. Walk-forward validation trains the model on a rolling window of past data, generates predictions for the next period, then advances the window forward and repeats. The model is always tested on data it has not seen during training.
Walk-forward validation is preferred for prediction markets because: (a) it simulates the actual trading experience where decisions are made using only past information; (b) it reveals how the model adapts (or fails to adapt) to changing market conditions, regime shifts, and evolving participant behavior; (c) it provides multiple out-of-sample evaluation points, yielding a more robust and realistic estimate of strategy performance than a single backtest period.
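The rolling-window mechanics can be sketched as a split generator (a minimal Python illustration; the function name and window sizes are arbitrary):

```python
def walk_forward_splits(n_samples, train_size, test_size):
    """Yield (train_indices, test_indices) for a rolling-window walk-forward."""
    start = 0
    while start + train_size + test_size <= n_samples:
        train = list(range(start, start + train_size))
        test = list(range(start + train_size, start + train_size + test_size))
        yield train, test
        start += test_size  # advance the window by one test period

splits = list(walk_forward_splits(10, train_size=4, test_size=2))
print(len(splits), splits[0])  # 3 ([0, 1, 2, 3], [4, 5])
```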
Chapter 34: Risk Management in Practice
Exercise 1. A trader has a $10,000 bankroll and identifies a contract trading at $0.40 that they believe has a true probability of 0.55. Compute the full Kelly stake and the half-Kelly stake.
For a binary contract paying $1:
- Cost per contract: $0.40
- Profit if correct: $1.00 - $0.40 = $0.60
- Odds (b) = profit / cost = 0.60 / 0.40 = 1.5
- p = 0.55 (believed true probability), q = 1 - p = 0.45
- Kelly fraction f* = (bp - q) / b = (1.5 * 0.55 - 0.45) / 1.5 = (0.825 - 0.45) / 1.5 = 0.375 / 1.5 = 0.25
Full Kelly stake = 0.25 * $10,000 = $2,500 (which buys 2,500 / 0.40 = 6,250 contracts). Half Kelly stake = 0.125 * $10,000 = $1,250 (which buys 3,125 contracts).
The half-Kelly approach sacrifices approximately 25% of the expected growth rate but reduces the variance and maximum drawdown substantially, making it the more common choice in practice where probability estimates are uncertain.
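The Kelly computation generalizes to any price and believed probability (a Python sketch; the function name and `multiplier` parameter are illustrative):

```python
def kelly_fraction(price, prob, multiplier=1.0):
    """Kelly stake as a fraction of bankroll for a $1-payout binary contract.
    multiplier < 1 gives fractional Kelly (e.g., 0.5 for half Kelly)."""
    b = (1 - price) / price                 # net odds: profit per dollar staked
    f = (b * prob - (1 - prob)) / b
    return max(0.0, f * multiplier)         # never bet on a negative edge

print(round(kelly_fraction(0.40, 0.55), 4))       # 0.25 (full Kelly)
print(round(kelly_fraction(0.40, 0.55, 0.5), 4))  # 0.125 (half Kelly)
```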
Exercise 3. A portfolio holds three prediction market positions with the following characteristics: Position A has a 10% probability of total loss and constitutes 30% of the portfolio. Position B has a 5% probability of total loss and constitutes 50%. Position C has a 20% probability of total loss and constitutes 20%. Assuming independence, compute the probability of losing the entire portfolio.
For the entire portfolio to be lost, all three positions must individually suffer a total loss. Under independence:
P(total loss) = P(A lost) * P(B lost) * P(C lost) = 0.10 * 0.05 * 0.20 = 0.001 = 0.1%.
Note: This is the probability of losing everything. However, the more practically relevant risk is the probability of losing a large fraction of the portfolio. If positions A and C are lost while B survives (probability = 0.10 * 0.95 * 0.20 = 0.019, or about 1.9%), the portfolio loses 30% + 20% = 50% of its value. Proper risk management requires analyzing partial loss scenarios, not just the extreme case.
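The full scenario analysis can be carried out by enumerating all 2^3 independent loss combinations (a Python sketch; the function name is illustrative):

```python
from itertools import product

def loss_distribution(positions):
    """Map {fraction of portfolio lost: probability} over all 2^n independent
    loss scenarios. positions: list of (loss probability, portfolio weight)."""
    dist = {}
    for outcome in product([False, True], repeat=len(positions)):
        prob, lost = 1.0, 0.0
        for hit, (p, w) in zip(outcome, positions):
            prob *= p if hit else (1 - p)
            lost += w if hit else 0.0
        key = round(lost, 4)
        dist[key] = dist.get(key, 0.0) + prob
    return dist

dist = loss_distribution([(0.10, 0.30), (0.05, 0.50), (0.20, 0.20)])
print(round(dist[1.0], 4))                                    # 0.001 lose all
print(round(sum(p for f, p in dist.items() if f >= 0.5), 4))  # 0.069 lose half+
```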
Exercise 5. Explain why a trader should never use full Kelly sizing in practice. Give at least three reasons.
(a) Parameter uncertainty: The Kelly criterion assumes the trader knows the true probability p exactly. In reality, probability estimates contain errors. If the true edge is smaller than estimated, full Kelly oversizes the bet, leading to excessive variance and potential ruin.
(b) Heavy tails and correlated losses: The Kelly criterion assumes outcomes are independent. In practice, prediction market positions can be correlated (e.g., multiple political markets move together). Correlated losses amplify drawdowns beyond what the Kelly formula anticipates.
(c) Utility and psychological tolerance: Full Kelly produces extremely volatile wealth paths, with frequent drawdowns of 50% or more. Few traders can psychologically tolerate such swings without making emotional decisions (going on tilt). Fractional Kelly (e.g., half Kelly) provides a smoother equity curve that most traders can stick with through inevitable losing streaks.
(d) Discrete and bounded bankrolls: The Kelly criterion is derived for an infinite-horizon setting with continuous wealth. Real traders have finite capital, finite trading opportunities, and may have non-logarithmic utility. These practical constraints favor more conservative sizing.
Solutions to even-numbered exercises are available in the instructor's companion manual.