Chapter 30: Key Takeaways
The Combinatorial Challenge
- Real-world questions are interconnected. Markets that can price combinations, conjunctions, and conditional relationships provide richer information than isolated markets.
- For $n$ binary events, the full joint distribution has $2^n$ states. This combinatorial explosion makes naive enumeration infeasible beyond approximately 15--20 events.
- The design spectrum ranges from full combinatorial (exact, small $n$) to partition-based (exact for independent groups), approximation methods (scalable but approximate), lazy evaluation (trades drive computation), and bundle/conditional trading (targeted expressiveness).
Full Combinatorial Markets
- The LMSR cost function generalizes directly to exponential state spaces: $C(\mathbf{q}) = b \ln\left(\sum_{i=1}^{N} e^{q_i / b}\right)$ where $N = 2^n$.
- Any Boolean expression over events can be priced by summing state probabilities: $P(\phi) = \sum_{s: \phi(s)=\text{true}} p_s$.
- Traders can bet on marginals, conjunctions, disjunctions, and conditional probabilities.
- Feasible for up to about 15 events on commodity hardware; GPU acceleration can push this higher.
Partition-Based Markets
- If events form independent groups, the full joint distribution factors: $P(A_1, A_2, B_1, B_2) = P(A_1, A_2) \times P(B_1, B_2)$.
- Total states reduce from $2^n$ to $\sum 2^{k_i}$ where $k_i$ is the size of each group. For 12 events in three groups of 4, this is 48 states instead of 4,096 -- an 85x reduction.
- Limitation: partitions cannot capture correlations between groups. The designer must balance accuracy (larger groups) against tractability (smaller groups).
- Tree-structured dependencies and conditional markets are special cases of partition decomposition.
Approximation Methods
| Method | Best When | Trade-off |
|---|---|---|
| Monte Carlo sampling | General case, moderate $n$ | Accuracy improves with sample count; may need many samples for rare events |
| Bayesian network | Known causal structure | Exact if structure is correct; requires structural knowledge |
| Mean-field | Weak correlations | Fast but ignores all correlations |
| Belief propagation | Sparse dependency graphs | Good for trees; approximate for loopy graphs |
- In practice, hybrid approaches work best: exact computation for common queries, partitions for independent groups, sampling as fallback.
LMSR Extensions (LMSR-C)
- The combinatorial LMSR maintains the same mathematical form but uses lazy evaluation: only states that have been traded are instantiated.
- Key property: maximum loss grows linearly in $n$, not exponentially. $L_{\max} = b \cdot n \cdot \ln 2 \approx 0.693 \cdot b \cdot n$.
- The factored representation allows trades on Boolean expressions without enumerating the full state space.
- LMSR-C preserves all desirable AMM properties: no arbitrage, path independence, bounded loss, and information incorporation.
Bundle Trading
- A bundle is a collection of positions across multiple contracts traded as a single atomic transaction.
- Bundle types: portfolio bundles, spread bundles, conditional bundles, parlay bundles, and hedging bundles.
- Atomicity prevents partial execution: the entire bundle executes or none of it does.
- Bundles enable sophisticated strategies: correlation bets, calendar spreads, scenario hedges.
- Cross-market arbitrage detection is natural with bundle trading.
Conditional Prediction Markets
- Conditional markets answer "If X, what is P(Y)?" using $P(Y \mid X) = P(X \text{ and } Y) / P(X)$.
- The conditional token framework splits collateral into tokens for each condition, then further splits for outcomes. Tokens for non-occurring conditions are voided (refunded).
- Conditional markets are the foundation of decision markets (Chapter 31) and connect to causal inference.
- Multiple conditional markets can be inconsistent; coupled market makers or arbitrage mechanisms maintain consistency.
Computational Trade-offs
- Approximation is "good enough" when the error is smaller than the bid-ask spread, relative rankings are preserved, and conditional relationships are directionally correct.
- Latency targets: <100ms for active trading, <1s for casual traders, <10s for batch queries.
- Memory scales as $O(2^n)$ for full enumeration but $O(k)$ for lazy evaluation where $k$ is the number of traded states.
Causal Inference from Markets
- Combinatorial markets can help distinguish correlation from causation by providing prices for multiple conditional distributions simultaneously.
- If $P(Y \mid X)$ is high but $P(Y \mid X, Z) \approx P(Y \mid \neg X, Z)$, the association may be confounded by $Z$.
- True causal inference requires the $\text{do}$-calculus: $P(O \mid \text{do}(A))$ differs from $P(O \mid A)$ when there are confounders.
- Decision markets (Chapter 31) address this by randomizing the decision based on market prices.
Key Formulas
| Formula | Description |
|---|---|
| $2^n$ | Number of states for $n$ binary events |
| $C(\mathbf{q}) = b \ln \sum_i e^{q_i/b}$ | LMSR cost function |
| $p_i = e^{q_i/b} / \sum_j e^{q_j/b}$ | LMSR state price (softmax) |
| $P(\phi) = \sum_{s:\phi(s)} p_s$ | Boolean expression price |
| $P(Y \mid X) = P(X,Y) / P(X)$ | Conditional probability |
| $L_{\max} = b \cdot n \cdot \ln 2$ | LMSR-C maximum loss |
| $\text{corr}(A,B) = \frac{P(AB) - P(A)P(B)}{\sqrt{P(A)(1-P(A))P(B)(1-P(B))}}$ | Implied correlation |