Chapter 30: Key Takeaways

The Combinatorial Challenge

  • Real-world questions are interconnected. Markets that can price combinations, conjunctions, and conditional relationships provide richer information than isolated markets.
  • For $n$ binary events, the full joint distribution has $2^n$ states. This combinatorial explosion makes naive enumeration infeasible beyond approximately 15--20 events.
  • The design spectrum ranges from full combinatorial (exact, small $n$) to partition-based (exact for independent groups), approximation methods (scalable but approximate), lazy evaluation (trades drive computation), and bundle/conditional trading (targeted expressiveness).

Full Combinatorial Markets

  • The LMSR cost function generalizes directly to exponential state spaces: $C(\mathbf{q}) = b \ln\left(\sum_{i=1}^{N} e^{q_i / b}\right)$ where $N = 2^n$.
  • Any Boolean expression over events can be priced by summing state probabilities: $P(\phi) = \sum_{s: \phi(s)=\text{true}} p_s$.
  • Traders can bet on marginals, conjunctions, disjunctions, and conditional probabilities.
  • Feasible for up to about 15 events on commodity hardware; GPU acceleration can push this higher.

Partition-Based Markets

  • If events form independent groups, the full joint distribution factors: $P(A_1, A_2, B_1, B_2) = P(A_1, A_2) \times P(B_1, B_2)$.
  • Total states reduce from $2^n$ to $\sum 2^{k_i}$ where $k_i$ is the size of each group. For 12 events in three groups of 4, this is 48 states instead of 4,096 -- an 85x reduction.
  • Limitation: partitions cannot capture correlations between groups. The designer must balance accuracy (larger groups) against tractability (smaller groups).
  • Tree-structured dependencies and conditional markets are special cases of partition decomposition.

Approximation Methods

Method Best When Trade-off
Monte Carlo sampling General case, moderate $n$ Accuracy improves with sample count; may need many samples for rare events
Bayesian network Known causal structure Exact if structure is correct; requires structural knowledge
Mean-field Weak correlations Fast but ignores all correlations
Belief propagation Sparse dependency graphs Good for trees; approximate for loopy graphs
  • In practice, hybrid approaches work best: exact computation for common queries, partitions for independent groups, sampling as fallback.

LMSR Extensions (LMSR-C)

  • The combinatorial LMSR maintains the same mathematical form but uses lazy evaluation: only states that have been traded are instantiated.
  • Key property: maximum loss grows linearly in $n$, not exponentially. $L_{\max} = b \cdot n \cdot \ln 2 \approx 0.693 \cdot b \cdot n$.
  • The factored representation allows trades on Boolean expressions without enumerating the full state space.
  • LMSR-C preserves all desirable AMM properties: no arbitrage, path independence, bounded loss, and information incorporation.

Bundle Trading

  • A bundle is a collection of positions across multiple contracts traded as a single atomic transaction.
  • Bundle types: portfolio bundles, spread bundles, conditional bundles, parlay bundles, and hedging bundles.
  • Atomicity prevents partial execution: the entire bundle executes or none of it does.
  • Bundles enable sophisticated strategies: correlation bets, calendar spreads, scenario hedges.
  • Cross-market arbitrage detection is natural with bundle trading.

Conditional Prediction Markets

  • Conditional markets answer "If X, what is P(Y)?" using $P(Y \mid X) = P(X \text{ and } Y) / P(X)$.
  • The conditional token framework splits collateral into tokens for each condition, then further splits for outcomes. Tokens for non-occurring conditions are voided (refunded).
  • Conditional markets are the foundation of decision markets (Chapter 31) and connect to causal inference.
  • Multiple conditional markets can be inconsistent; coupled market makers or arbitrage mechanisms maintain consistency.

Computational Trade-offs

  • Approximation is "good enough" when the error is smaller than the bid-ask spread, relative rankings are preserved, and conditional relationships are directionally correct.
  • Latency targets: <100ms for active trading, <1s for casual traders, <10s for batch queries.
  • Memory scales as $O(2^n)$ for full enumeration but $O(k)$ for lazy evaluation where $k$ is the number of traded states.

Causal Inference from Markets

  • Combinatorial markets can help distinguish correlation from causation by providing prices for multiple conditional distributions simultaneously.
  • If $P(Y \mid X)$ is high but $P(Y \mid X, Z) \approx P(Y \mid \neg X, Z)$, the association may be confounded by $Z$.
  • True causal inference requires the $\text{do}$-calculus: $P(O \mid \text{do}(A))$ differs from $P(O \mid A)$ when there are confounders.
  • Decision markets (Chapter 31) address this by randomizing the decision based on market prices.

Key Formulas

Formula Description
$2^n$ Number of states for $n$ binary events
$C(\mathbf{q}) = b \ln \sum_i e^{q_i/b}$ LMSR cost function
$p_i = e^{q_i/b} / \sum_j e^{q_j/b}$ LMSR state price (softmax)
$P(\phi) = \sum_{s:\phi(s)} p_s$ Boolean expression price
$P(Y \mid X) = P(X,Y) / P(X)$ Conditional probability
$L_{\max} = b \cdot n \cdot \ln 2$ LMSR-C maximum loss
$\text{corr}(A,B) = \frac{P(AB) - P(A)P(B)}{\sqrt{P(A)(1-P(A))P(B)(1-P(B))}}$ Implied correlation