Case Study 2: Migrating from Order Book to AMM — A Platform Architecture Decision
Background
PredictHub was a small prediction market startup that launched with an order book model. After six months of operation with approximately 2,000 registered users, the founding engineers faced a critical decision: the order book worked well for popular markets (elections, sports) but failed for niche topics (technology predictions, scientific outcomes). Thin markets had spreads exceeding 20 percentage points, making them essentially unusable.
The CTO, Marcus Chen, proposed migrating to an LMSR AMM for all markets. The CEO, Priya Kapoor, worried about the cost of subsidizing liquidity and the loss of order book price discovery in active markets. The engineering team was asked to build a proof of concept, benchmark both approaches, and present a recommendation.
The Problem: Thin Markets
PredictHub had 150 active markets at any given time. The team classified them by trading activity:
| Category | Markets | Avg Daily Trades | Avg Spread | User Satisfaction |
|---|---|---|---|---|
| Hot (elections, sports) | 15 | 200+ | 2-4% | High |
| Warm (tech, business) | 45 | 20-50 | 8-15% | Medium |
| Cold (science, niche) | 90 | 0-5 | 20-50%+ | Low |
The cold markets were the biggest problem. Users would create interesting markets, but nobody could trade because the spread was so wide. A user wanting to express a 60% belief in an outcome would need to buy at 75% (the best ask) or place a bid at 60% and wait days for a match that might never come.
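To make the cost of a wide spread concrete, here is a back-of-the-envelope check (illustrative numbers, not PredictHub data): a share pays $1 if the outcome occurs and $0 otherwise, so its expected payoff equals the trader's belief probability, and buying above that belief loses money in expectation.

```python
belief = 0.60    # trader's subjective probability that the outcome occurs
best_ask = 0.75  # cheapest available share in a thin market

# Expected payoff of a share equals the belief probability, so buying
# at the ask locks in a negative expected value per share.
expected_value = belief - best_ask
print(f"EV per share: {expected_value:+.2f}")
```

With a 15-point gap between belief and ask, the only rational options are to overpay or to post a bid and wait, which is exactly the cold-market failure mode described above.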
Comparative Analysis
Order Book Performance Metrics
The team instrumented the existing order book to collect detailed metrics:
"""Order book performance analysis for PredictHub markets."""
from dataclasses import dataclass
from typing import Optional
@dataclass
class MarketMetrics:
"""Performance metrics for a single market.
Attributes:
market_id: The market identifier.
market_category: Hot, warm, or cold classification.
avg_daily_trades: Average number of trades per day.
avg_spread_bps: Average bid-ask spread in basis points.
time_with_both_sides_pct: Percentage of time both bids and asks exist.
avg_fill_time_seconds: Average time from order placement to fill.
price_discovery_score: Correlation between final price and outcome (0-1).
"""
market_id: int
market_category: str
avg_daily_trades: float
avg_spread_bps: float
time_with_both_sides_pct: float
avg_fill_time_seconds: Optional[float]
price_discovery_score: float
def analyze_order_book_markets(metrics: list[MarketMetrics]) -> dict:
"""Summarize order book performance across market categories.
Args:
metrics: List of per-market metrics.
Returns:
Summary statistics by category.
"""
categories = {}
for m in metrics:
cat = m.market_category
if cat not in categories:
categories[cat] = {
"count": 0,
"total_spread": 0.0,
"total_both_sides": 0.0,
"total_discovery": 0.0,
"fill_times": [],
}
categories[cat]["count"] += 1
categories[cat]["total_spread"] += m.avg_spread_bps
categories[cat]["total_both_sides"] += m.time_with_both_sides_pct
categories[cat]["total_discovery"] += m.price_discovery_score
if m.avg_fill_time_seconds is not None:
categories[cat]["fill_times"].append(m.avg_fill_time_seconds)
summary = {}
for cat, data in categories.items():
n = data["count"]
fill_times = data["fill_times"]
summary[cat] = {
"market_count": n,
"avg_spread_bps": data["total_spread"] / n,
"avg_both_sides_pct": data["total_both_sides"] / n,
"avg_discovery_score": data["total_discovery"] / n,
"median_fill_time_s": (
sorted(fill_times)[len(fill_times) // 2]
if fill_times else None
),
}
return summary
# Results from PredictHub's data
sample_results = {
"hot": {
"market_count": 15,
"avg_spread_bps": 300, # 3%
"avg_both_sides_pct": 94.2,
"avg_discovery_score": 0.82,
"median_fill_time_s": 12.5,
},
"warm": {
"market_count": 45,
"avg_spread_bps": 1150, # 11.5%
"avg_both_sides_pct": 61.8,
"median_fill_time_s": 340.0, # ~5.7 minutes
"avg_discovery_score": 0.64,
},
"cold": {
"market_count": 90,
"avg_spread_bps": 3500, # 35%
"avg_both_sides_pct": 18.3,
"median_fill_time_s": None, # Most orders never fill
"avg_discovery_score": 0.41,
},
}
LMSR Simulation
The team built an LMSR simulator to project what performance would look like if the same trading activity occurred on an LMSR market:
"""LMSR simulation for PredictHub migration analysis.
Simulates how historical trading patterns would behave under LMSR
pricing to project performance metrics for the migration decision.
"""
import math
from dataclasses import dataclass, field
@dataclass
class LMSRSimulator:
"""Simulates LMSR market behavior for comparison analysis.
Attributes:
b: Liquidity parameter.
shares: Current shares outstanding per outcome.
trade_log: Record of all simulated trades.
"""
b: float
shares: list[float] = field(default_factory=lambda: [0.0, 0.0])
trade_log: list[dict] = field(default_factory=list)
def price(self, outcome: int) -> float:
"""Get the current price of an outcome.
Args:
outcome: Index of the outcome (0 or 1 for binary).
Returns:
Current price between 0 and 1.
"""
scaled = [s / self.b for s in self.shares]
max_val = max(scaled)
exps = [math.exp(s - max_val) for s in scaled]
total = sum(exps)
return exps[outcome] / total
def trade_cost(self, outcome: int, num_shares: float) -> float:
"""Compute the cost of a trade.
Args:
outcome: Which outcome to trade.
num_shares: Shares to buy (positive) or sell (negative).
Returns:
The dollar cost of the trade.
"""
old_c = self._cost_function(self.shares)
new_shares = list(self.shares)
new_shares[outcome] += num_shares
new_c = self._cost_function(new_shares)
return new_c - old_c
def execute(self, outcome: int, num_shares: float) -> dict:
"""Execute a trade and log the results.
Args:
outcome: Which outcome to trade.
num_shares: Shares to buy (positive) or sell (negative).
Returns:
Trade execution record.
"""
cost = self.trade_cost(outcome, num_shares)
price_before = self.price(outcome)
self.shares[outcome] += num_shares
price_after = self.price(outcome)
record = {
"outcome": outcome,
"shares": num_shares,
"cost": cost,
"price_before": price_before,
"price_after": price_after,
"slippage": abs(price_after - price_before),
}
self.trade_log.append(record)
return record
def _cost_function(self, shares: list[float]) -> float:
"""Compute the LMSR cost function.
Args:
shares: Share vector to evaluate.
Returns:
Cost function value.
"""
scaled = [s / self.b for s in shares]
max_val = max(scaled)
return self.b * (max_val + math.log(
sum(math.exp(s - max_val) for s in scaled)
))
def simulate_market_migration(
trade_history: list[dict],
b_values: list[float],
) -> dict[float, dict]:
"""Simulate how historical trades would perform under LMSR.
Takes a sequence of historical order book trades and replays them
through LMSR with different b values to compare outcomes.
Args:
trade_history: List of dicts with 'outcome', 'shares', and
'historical_price' keys.
b_values: List of liquidity parameters to test.
Returns:
Dictionary mapping b values to performance metrics.
"""
results = {}
for b in b_values:
sim = LMSRSimulator(b=b)
total_slippage = 0.0
total_cost_difference = 0.0
for trade in trade_history:
result = sim.execute(
outcome=trade["outcome"],
num_shares=trade["shares"],
)
total_slippage += result["slippage"]
# Compare AMM cost to historical order book price
amm_avg_price = abs(result["cost"] / trade["shares"]) if trade["shares"] != 0 else 0
cost_diff = abs(amm_avg_price - trade["historical_price"])
total_cost_difference += cost_diff
n_trades = len(trade_history)
max_loss = b * math.log(2)
all_prices = sim.all_prices() if hasattr(sim, 'all_prices') else [sim.price(0), sim.price(1)]
results[b] = {
"b_value": b,
"max_loss": round(max_loss, 2),
"avg_slippage": round(total_slippage / n_trades, 4) if n_trades > 0 else 0,
"avg_cost_difference": round(total_cost_difference / n_trades, 4) if n_trades > 0 else 0,
"final_prices": [round(p, 4) for p in all_prices],
"total_volume": n_trades,
}
return results
# Simulated benchmark data
sample_trade_history = [
{"outcome": 0, "shares": 10, "historical_price": 0.52},
{"outcome": 0, "shares": 5, "historical_price": 0.55},
{"outcome": 1, "shares": 15, "historical_price": 0.50},
{"outcome": 0, "shares": 20, "historical_price": 0.48},
{"outcome": 1, "shares": 8, "historical_price": 0.53},
]
benchmark = simulate_market_migration(
trade_history=sample_trade_history,
b_values=[50, 100, 200, 500],
)
for b_val, metrics in benchmark.items():
print(f"\nb = {b_val}:")
print(f" Max loss: ${metrics['max_loss']}")
print(f" Avg slippage: {metrics['avg_slippage']:.4f}")
print(f" Avg cost difference vs order book: {metrics['avg_cost_difference']:.4f}")
Performance Benchmarks
The engineering team ran benchmarks comparing order matching speed and memory usage:
"""Performance benchmarks: Order Book vs LMSR.
Measures throughput (orders/second) and memory usage for both
pricing mechanisms under various load conditions.
"""
import time
import math
import sys
from dataclasses import dataclass
@dataclass
class BenchmarkResult:
"""Results from a performance benchmark run.
Attributes:
mechanism: Name of the mechanism tested.
num_operations: Number of operations executed.
elapsed_seconds: Wall clock time.
ops_per_second: Throughput.
memory_bytes: Approximate memory usage.
"""
mechanism: str
num_operations: int
elapsed_seconds: float
ops_per_second: float
memory_bytes: int
def benchmark_lmsr(num_trades: int, b: float = 100.0) -> BenchmarkResult:
"""Benchmark LMSR trade execution throughput.
Args:
num_trades: Number of trades to execute.
b: Liquidity parameter.
Returns:
Benchmark results.
"""
shares = [0.0, 0.0]
def cost(s: list[float]) -> float:
scaled = [x / b for x in s]
max_val = max(scaled)
return b * (max_val + math.log(
sum(math.exp(x - max_val) for x in scaled)
))
start = time.perf_counter()
for i in range(num_trades):
outcome = i % 2
old_c = cost(shares)
shares[outcome] += 1.0
new_c = cost(shares)
_ = new_c - old_c # Trade cost
elapsed = time.perf_counter() - start
return BenchmarkResult(
mechanism="LMSR",
num_operations=num_trades,
elapsed_seconds=elapsed,
ops_per_second=num_trades / elapsed,
memory_bytes=sys.getsizeof(shares),
)
def benchmark_order_book(num_orders: int) -> BenchmarkResult:
"""Benchmark order book insertion and matching throughput.
Uses a simplified order book for fair comparison.
Args:
num_orders: Number of orders to process.
Returns:
Benchmark results.
"""
import heapq
bids: list[tuple] = []
asks: list[tuple] = []
start = time.perf_counter()
for i in range(num_orders):
if i % 2 == 0:
# Limit buy order
price = 0.50 + (i % 10) * 0.01
heapq.heappush(bids, (-price, i, 1.0))
else:
# Limit sell order - attempt match
price = 0.50 + (i % 10) * 0.01
if bids and -bids[0][0] >= price:
heapq.heappop(bids) # Match
else:
heapq.heappush(asks, (price, i, 1.0))
elapsed = time.perf_counter() - start
return BenchmarkResult(
mechanism="OrderBook",
num_operations=num_orders,
elapsed_seconds=elapsed,
ops_per_second=num_orders / elapsed,
memory_bytes=sys.getsizeof(bids) + sys.getsizeof(asks),
)
def run_benchmarks():
"""Run and display benchmark comparisons."""
print("=" * 65)
print("Performance Benchmark: LMSR vs Order Book")
print("=" * 65)
for n in [1_000, 10_000, 100_000]:
lmsr_result = benchmark_lmsr(n)
ob_result = benchmark_order_book(n)
print(f"\n--- {n:,} operations ---")
print(f"LMSR: {lmsr_result.ops_per_second:>12,.0f} ops/sec "
f"({lmsr_result.elapsed_seconds:.4f}s)")
print(f"OrderBook: {ob_result.ops_per_second:>12,.0f} ops/sec "
f"({ob_result.elapsed_seconds:.4f}s)")
ratio = ob_result.ops_per_second / lmsr_result.ops_per_second
print(f"OB/LMSR ratio: {ratio:.2f}x")
if __name__ == "__main__":
run_benchmarks()
Typical results on a standard developer machine:
| Operations | LMSR (ops/sec) | Order Book (ops/sec) | Ratio |
|---|---|---|---|
| 1,000 | ~250,000 | ~400,000 | 1.6x |
| 10,000 | ~240,000 | ~380,000 | 1.6x |
| 100,000 | ~230,000 | ~350,000 | 1.5x |
The order book was slightly faster per operation because LMSR requires exponential and logarithmic calculations, while order book matching is primarily heap operations. However, both mechanisms comfortably handle hundreds of thousands of operations per second in Python, so performance was not a deciding factor.
The Decision Framework
The team developed a structured decision framework:
Criterion 1: Liquidity Quality
| Aspect | Order Book | LMSR |
|---|---|---|
| Hot markets | Excellent natural liquidity | Good, but AMM subsidy unnecessary |
| Warm markets | Adequate but inconsistent | Consistent, always tradeable |
| Cold markets | Poor to nonexistent | Consistent, always tradeable |
| Verdict | Wins for hot markets | Wins overall |
Criterion 2: Cost
| Aspect | Order Book | LMSR |
|---|---|---|
| Infrastructure cost | Moderate (heap management) | Low (simple math) |
| Liquidity cost | $0 (traders provide) | $b \cdot \ln(n)$ per market |
| Annual subsidy (150 markets) | $0 | ~$10,400 at $b=100$ |
| Verdict | Wins on cost | Acceptable with budget |
For 150 binary markets at $b = 100$: $150 \times 100 \times \ln(2) \approx \$10,397$ annual maximum subsidy. In practice, the actual loss is much lower because markets rarely reach the worst case.
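That figure can be reproduced directly from LMSR's bounded-loss property: the worst-case loss for an $n$-outcome market is $b \ln n$, incurred at most once per market resolution.

```python
import math

markets = 150
b = 100.0
n_outcomes = 2  # binary markets

# Worst-case LMSR loss per market is b * ln(n); scale by market count.
per_market_max = b * math.log(n_outcomes)
total_max = markets * per_market_max

print(f"Max loss per market: ${per_market_max:.2f}")  # ~$69.31
print(f"Max total subsidy:   ${total_max:,.2f}")      # ~$10,397
```

This is a ceiling, not a forecast: the full $b \ln 2$ is lost only when a market that ends 50/50-priced resolves, which is why the realized subsidy reported later is far smaller.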
Criterion 3: User Experience
| Aspect | Order Book | LMSR |
|---|---|---|
| Immediate execution | Only if matching order exists | Always |
| Price transparency | Bid/ask with spread | Single clear price |
| Learning curve | Higher (limit orders, spreads) | Lower (just buy/sell) |
| Verdict | Better for sophisticated traders | Wins for mass market |
Criterion 4: Price Discovery
| Aspect | Order Book | LMSR |
|---|---|---|
| Information aggregation | Excellent in active markets | Good |
| Manipulation resistance | Moderate (visible depth) | Moderate (bounded loss) |
| Accuracy (Brier score) | 0.18 (hot markets) | Projected 0.22 |
| Verdict | Wins for well-traded markets | Adequate for all markets |
Final Recommendation: Hybrid Approach
The team recommended a hybrid approach rather than a complete migration:
- LMSR as the default for all new markets, providing guaranteed baseline liquidity.
- Order book overlay for hot markets that attract sufficient trading volume, allowing traders to offer better prices than the AMM.
- Automatic mode selection: Markets start as LMSR and automatically gain an order book when daily volume exceeds a threshold (50 trades/day).
"""Hybrid market routing logic for PredictHub."""
from enum import Enum
class MarketMode(str, Enum):
"""The active pricing mode for a market."""
LMSR_ONLY = "lmsr_only"
HYBRID = "hybrid"
class HybridRouter:
"""Routes trades to the optimal pricing mechanism.
In hybrid mode, incoming orders are first checked against the
order book. If the order book can offer a better price than the
LMSR, the trade goes through the order book. Otherwise, it executes
against the LMSR.
Attributes:
lmsr: The LMSR market maker instance.
order_book: The order book instance (may be None).
mode: Current market mode.
volume_threshold: Daily trades needed to activate hybrid mode.
"""
def __init__(self, lmsr, order_book=None, volume_threshold: int = 50):
"""Initialize the hybrid router.
Args:
lmsr: LMSR market maker instance.
order_book: Optional order book instance.
volume_threshold: Trades per day to activate order book.
"""
self.lmsr = lmsr
self.order_book = order_book
self.mode = MarketMode.LMSR_ONLY
self.volume_threshold = volume_threshold
self.daily_trade_count = 0
def route_order(self, outcome: int, shares: float, side: str,
limit_price: float = None) -> dict:
"""Route an order to the best available pricing mechanism.
The routing logic:
1. If LMSR_ONLY mode, always use LMSR.
2. If HYBRID mode and a limit order exists:
a. Check if order book has a better price.
b. If yes, execute against order book.
c. If no, execute against LMSR.
3. For market orders in HYBRID mode, compare best available
prices and use the better one.
Args:
outcome: Outcome index to trade.
shares: Number of shares.
side: "buy" or "sell".
limit_price: Optional limit price.
Returns:
Trade execution result with routing information.
"""
self.daily_trade_count += 1
if self.mode == MarketMode.LMSR_ONLY:
cost = self.lmsr.trade_cost(outcome, shares if side == "buy" else -shares)
avg_price = abs(cost / shares)
return {
"mechanism": "lmsr",
"avg_price": avg_price,
"cost": cost,
"shares": shares,
}
# Hybrid mode: compare prices
lmsr_cost = self.lmsr.trade_cost(outcome, shares if side == "buy" else -shares)
lmsr_avg_price = abs(lmsr_cost / shares)
ob_price = None
if self.order_book:
if side == "buy":
ob_price = self.order_book.get_best_ask(outcome)
else:
ob_price = self.order_book.get_best_bid(outcome)
# Use order book if it offers a better price
if ob_price is not None:
if side == "buy" and ob_price < lmsr_avg_price:
return {
"mechanism": "order_book",
"avg_price": ob_price,
"cost": ob_price * shares,
"shares": shares,
}
elif side == "sell" and ob_price > lmsr_avg_price:
return {
"mechanism": "order_book",
"avg_price": ob_price,
"cost": ob_price * shares,
"shares": shares,
}
# Default to LMSR
return {
"mechanism": "lmsr",
"avg_price": lmsr_avg_price,
"cost": lmsr_cost,
"shares": shares,
}
def check_mode_upgrade(self) -> bool:
"""Check if the market should be upgraded to hybrid mode.
Returns:
True if mode was changed to HYBRID.
"""
if (self.mode == MarketMode.LMSR_ONLY and
self.daily_trade_count >= self.volume_threshold):
self.mode = MarketMode.HYBRID
return True
return False
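The price comparison at the heart of `route_order` can be distilled into a standalone helper. This is a simplification for illustration; `better_venue` is not part of the PredictHub code above:

```python
from typing import Optional

def better_venue(side: str, ob_price: Optional[float],
                 lmsr_avg_price: float) -> str:
    """Pick the venue with the better price for the taker.

    Buyers prefer the lower price, sellers the higher one; with no
    resting order book quote, the LMSR is always available as a fallback.
    """
    if ob_price is None:
        return "lmsr"
    if side == "buy":
        return "order_book" if ob_price < lmsr_avg_price else "lmsr"
    return "order_book" if ob_price > lmsr_avg_price else "lmsr"

print(better_venue("buy", 0.54, 0.56))   # order book undercuts the AMM
print(better_venue("sell", 0.51, 0.55))  # AMM pays the seller more
```

The key design property is that the AMM side of the comparison can never be empty, so takers always get a fill even when the book is bare.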
Outcome
PredictHub implemented the hybrid approach over four weeks. Results after three months:
| Metric | Before (Order Book Only) | After (Hybrid) | Change |
|---|---|---|---|
| Active markets with trades | 60 (40%) | 135 (90%) | +125% |
| Cold market avg daily trades | 1.2 | 8.7 | +625% |
| User satisfaction (survey) | 3.1/5 | 4.2/5 | +35% |
| Monthly active traders | 340 | 780 | +129% |
| Revenue (trading fees) | $4,200/mo | $11,800/mo | +181% |
| LMSR subsidy cost | $0 | $1,800/mo | New cost |
| Net revenue change | — | +$5,800/mo | — |
The LMSR subsidy cost ($1,800/month) was a fraction of the revenue increase ($7,600/month), making the investment clearly worthwhile. Cold markets became the platform's growth engine: niche topics attracted passionate communities who previously could not participate due to illiquidity.
Key Lessons
- One mechanism does not fit all markets. Hot markets benefit from order book price discovery, while cold markets need AMM liquidity guarantees.
- The subsidy is an investment, not a cost. The LMSR subsidy created markets that generated more revenue than they cost, because engaged users also traded in hot markets.
- Start with AMM, add order book later. Building an order book is more complex and only valuable when there is sufficient volume. LMSR provides a minimum viable product for any market.
- Measure before migrating. The simulation framework allowed PredictHub to predict the impact of migration before committing to the engineering work.
- User experience drives adoption more than mechanism efficiency. Users cared about immediate execution and clear prices, not about whether the spread was 2% or 3%.