Case Study 2: Migrating from Order Book to AMM — A Platform Architecture Decision

Background

PredictHub was a small prediction market startup that launched with an order book model. After six months of operation with approximately 2,000 registered users, the founding engineers faced a critical decision: the order book worked well for popular markets (elections, sports) but failed for niche topics (technology predictions, scientific outcomes). Thin markets had spreads exceeding 20 percentage points, making them essentially unusable.

The CTO, Marcus Chen, proposed migrating to an LMSR AMM for all markets. The CEO, Priya Kapoor, worried about the cost of subsidizing liquidity and the loss of order book price discovery in active markets. The engineering team was asked to build a proof of concept, benchmark both approaches, and present a recommendation.

The Problem: Thin Markets

PredictHub had 150 active markets at any given time. The team classified them by trading activity:

| Category | Markets | Avg Daily Trades | Avg Spread | User Satisfaction |
| --- | --- | --- | --- | --- |
| Hot (elections, sports) | 15 | 200+ | 2-4% | High |
| Warm (tech, business) | 45 | 20-50 | 8-15% | Medium |
| Cold (science, niche) | 90 | 0-5 | 20-50%+ | Low |

The cold markets were the biggest problem. Users would create interesting markets, but nobody could trade because the spread was so wide. A user wanting to express a 60% belief in an outcome would need to buy at 75% (the best ask) or place a bid at 60% and wait days for a match that might never come.
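The arithmetic makes the problem concrete. Using the hypothetical numbers above (a 60% belief facing a 75% best ask), every share bought through the spread loses money in expectation:

```python
# A trader's expected value per $1 share when forced to cross a wide spread.
belief = 0.60     # trader's subjective probability that the outcome is YES
best_ask = 0.75   # cheapest available YES share in the thin order book

# A YES share pays $1 if the outcome occurs, $0 otherwise.
ev_per_share = belief * 1.0 - best_ask
print(f"EV per share: {ev_per_share:+.3f}")  # negative: the trade loses in expectation
```

At any belief below the ask, rational traders sit out, which is exactly why the cold books stayed empty.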

Comparative Analysis

Order Book Performance Metrics

The team instrumented the existing order book to collect detailed metrics:

"""Order book performance analysis for PredictHub markets."""

from dataclasses import dataclass
from typing import Optional


@dataclass
class MarketMetrics:
    """Performance metrics for a single market.

    Attributes:
        market_id: The market identifier.
        market_category: Hot, warm, or cold classification.
        avg_daily_trades: Average number of trades per day.
        avg_spread_bps: Average bid-ask spread in basis points.
        time_with_both_sides_pct: Percentage of time both bids and asks exist.
        avg_fill_time_seconds: Average time from order placement to fill.
        price_discovery_score: Correlation between final price and outcome (0-1).
    """
    market_id: int
    market_category: str
    avg_daily_trades: float
    avg_spread_bps: float
    time_with_both_sides_pct: float
    avg_fill_time_seconds: Optional[float]
    price_discovery_score: float


def analyze_order_book_markets(metrics: list[MarketMetrics]) -> dict:
    """Summarize order book performance across market categories.

    Args:
        metrics: List of per-market metrics.

    Returns:
        Summary statistics by category.
    """
    categories = {}
    for m in metrics:
        cat = m.market_category
        if cat not in categories:
            categories[cat] = {
                "count": 0,
                "total_spread": 0.0,
                "total_both_sides": 0.0,
                "total_discovery": 0.0,
                "fill_times": [],
            }
        categories[cat]["count"] += 1
        categories[cat]["total_spread"] += m.avg_spread_bps
        categories[cat]["total_both_sides"] += m.time_with_both_sides_pct
        categories[cat]["total_discovery"] += m.price_discovery_score
        if m.avg_fill_time_seconds is not None:
            categories[cat]["fill_times"].append(m.avg_fill_time_seconds)

    summary = {}
    for cat, data in categories.items():
        n = data["count"]
        fill_times = data["fill_times"]
        summary[cat] = {
            "market_count": n,
            "avg_spread_bps": data["total_spread"] / n,
            "avg_both_sides_pct": data["total_both_sides"] / n,
            "avg_discovery_score": data["total_discovery"] / n,
            "median_fill_time_s": (
                sorted(fill_times)[len(fill_times) // 2]
                if fill_times else None
            ),
        }

    return summary


# Results from PredictHub's data
sample_results = {
    "hot": {
        "market_count": 15,
        "avg_spread_bps": 300,      # 3%
        "avg_both_sides_pct": 94.2,
        "avg_discovery_score": 0.82,
        "median_fill_time_s": 12.5,
    },
    "warm": {
        "market_count": 45,
        "avg_spread_bps": 1150,     # 11.5%
        "avg_both_sides_pct": 61.8,
        "avg_discovery_score": 0.64,
        "median_fill_time_s": 340.0,  # ~5.7 minutes
    },
    "cold": {
        "market_count": 90,
        "avg_spread_bps": 3500,     # 35%
        "avg_both_sides_pct": 18.3,
        "avg_discovery_score": 0.41,
        "median_fill_time_s": None,  # Most orders never fill
    },
}

LMSR Simulation

The team built an LMSR simulator to project what performance would look like if the same trading activity occurred on an LMSR market:

"""LMSR simulation for PredictHub migration analysis.

Simulates how historical trading patterns would behave under LMSR
pricing to project performance metrics for the migration decision.
"""

import math
from dataclasses import dataclass, field


@dataclass
class LMSRSimulator:
    """Simulates LMSR market behavior for comparison analysis.

    Attributes:
        b: Liquidity parameter.
        shares: Current shares outstanding per outcome.
        trade_log: Record of all simulated trades.
    """
    b: float
    shares: list[float] = field(default_factory=lambda: [0.0, 0.0])
    trade_log: list[dict] = field(default_factory=list)

    def price(self, outcome: int) -> float:
        """Get the current price of an outcome.

        Args:
            outcome: Index of the outcome (0 or 1 for binary).

        Returns:
            Current price between 0 and 1.
        """
        scaled = [s / self.b for s in self.shares]
        max_val = max(scaled)
        exps = [math.exp(s - max_val) for s in scaled]
        total = sum(exps)
        return exps[outcome] / total

    def trade_cost(self, outcome: int, num_shares: float) -> float:
        """Compute the cost of a trade.

        Args:
            outcome: Which outcome to trade.
            num_shares: Shares to buy (positive) or sell (negative).

        Returns:
            The dollar cost of the trade.
        """
        old_c = self._cost_function(self.shares)
        new_shares = list(self.shares)
        new_shares[outcome] += num_shares
        new_c = self._cost_function(new_shares)
        return new_c - old_c

    def execute(self, outcome: int, num_shares: float) -> dict:
        """Execute a trade and log the results.

        Args:
            outcome: Which outcome to trade.
            num_shares: Shares to buy (positive) or sell (negative).

        Returns:
            Trade execution record.
        """
        cost = self.trade_cost(outcome, num_shares)
        price_before = self.price(outcome)
        self.shares[outcome] += num_shares
        price_after = self.price(outcome)

        record = {
            "outcome": outcome,
            "shares": num_shares,
            "cost": cost,
            "price_before": price_before,
            "price_after": price_after,
            "slippage": abs(price_after - price_before),
        }
        self.trade_log.append(record)
        return record

    def _cost_function(self, shares: list[float]) -> float:
        """Compute the LMSR cost function.

        Args:
            shares: Share vector to evaluate.

        Returns:
            Cost function value.
        """
        scaled = [s / self.b for s in shares]
        max_val = max(scaled)
        return self.b * (max_val + math.log(
            sum(math.exp(s - max_val) for s in scaled)
        ))


def simulate_market_migration(
    trade_history: list[dict],
    b_values: list[float],
) -> dict[float, dict]:
    """Simulate how historical trades would perform under LMSR.

    Takes a sequence of historical order book trades and replays them
    through LMSR with different b values to compare outcomes.

    Args:
        trade_history: List of dicts with 'outcome', 'shares', and
            'historical_price' keys.
        b_values: List of liquidity parameters to test.

    Returns:
        Dictionary mapping b values to performance metrics.
    """
    results = {}

    for b in b_values:
        sim = LMSRSimulator(b=b)
        total_slippage = 0.0
        total_cost_difference = 0.0

        for trade in trade_history:
            result = sim.execute(
                outcome=trade["outcome"],
                num_shares=trade["shares"],
            )
            total_slippage += result["slippage"]

            # Compare AMM cost to historical order book price
            amm_avg_price = abs(result["cost"] / trade["shares"]) if trade["shares"] != 0 else 0
            cost_diff = abs(amm_avg_price - trade["historical_price"])
            total_cost_difference += cost_diff

        n_trades = len(trade_history)
        max_loss = b * math.log(2)
        all_prices = [sim.price(0), sim.price(1)]  # Final binary-market prices

        results[b] = {
            "b_value": b,
            "max_loss": round(max_loss, 2),
            "avg_slippage": round(total_slippage / n_trades, 4) if n_trades > 0 else 0,
            "avg_cost_difference": round(total_cost_difference / n_trades, 4) if n_trades > 0 else 0,
            "final_prices": [round(p, 4) for p in all_prices],
            "total_volume": n_trades,
        }

    return results


# Simulated benchmark data
sample_trade_history = [
    {"outcome": 0, "shares": 10, "historical_price": 0.52},
    {"outcome": 0, "shares": 5, "historical_price": 0.55},
    {"outcome": 1, "shares": 15, "historical_price": 0.50},
    {"outcome": 0, "shares": 20, "historical_price": 0.48},
    {"outcome": 1, "shares": 8, "historical_price": 0.53},
]

benchmark = simulate_market_migration(
    trade_history=sample_trade_history,
    b_values=[50, 100, 200, 500],
)

for b_val, metrics in benchmark.items():
    print(f"\nb = {b_val}:")
    print(f"  Max loss: ${metrics['max_loss']}")
    print(f"  Avg slippage: {metrics['avg_slippage']:.4f}")
    print(f"  Avg cost difference vs order book: {metrics['avg_cost_difference']:.4f}")
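The max_loss figure in the simulation comes from the standard LMSR bound: the market maker can never lose more than b * ln(n) across n outcomes. A standalone check of the bound, independent of the simulator class, with an illustrative b of 100:

```python
import math


def lmsr_cost(shares: list[float], b: float) -> float:
    """LMSR cost function C(q) = b * ln(sum_i exp(q_i / b)), computed stably."""
    m = max(s / b for s in shares)
    return b * (m + math.log(sum(math.exp(s / b - m) for s in shares)))


b = 100.0
bought = 10_000.0  # drive outcome 0 to near-certainty

# Traders pay the cost-function difference for the whole block of shares.
paid = lmsr_cost([bought, 0.0], b) - lmsr_cost([0.0, 0.0], b)

# If outcome 0 resolves YES, the market maker owes $1 per share sold.
worst_case_loss = bought - paid
bound = b * math.log(2)
print(round(worst_case_loss, 2), round(bound, 2))  # loss approaches b * ln(2)
```

No matter how many shares traders buy, the realized loss never exceeds the bound; that is what makes the subsidy budgetable per market.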

Performance Benchmarks

The engineering team ran benchmarks comparing order matching speed and memory usage:

"""Performance benchmarks: Order Book vs LMSR.

Measures throughput (orders/second) and memory usage for both
pricing mechanisms under various load conditions.
"""

import time
import math
import sys
from dataclasses import dataclass


@dataclass
class BenchmarkResult:
    """Results from a performance benchmark run.

    Attributes:
        mechanism: Name of the mechanism tested.
        num_operations: Number of operations executed.
        elapsed_seconds: Wall clock time.
        ops_per_second: Throughput.
        memory_bytes: Approximate memory usage.
    """
    mechanism: str
    num_operations: int
    elapsed_seconds: float
    ops_per_second: float
    memory_bytes: int


def benchmark_lmsr(num_trades: int, b: float = 100.0) -> BenchmarkResult:
    """Benchmark LMSR trade execution throughput.

    Args:
        num_trades: Number of trades to execute.
        b: Liquidity parameter.

    Returns:
        Benchmark results.
    """
    shares = [0.0, 0.0]

    def cost(s: list[float]) -> float:
        scaled = [x / b for x in s]
        max_val = max(scaled)
        return b * (max_val + math.log(
            sum(math.exp(x - max_val) for x in scaled)
        ))

    start = time.perf_counter()
    for i in range(num_trades):
        outcome = i % 2
        old_c = cost(shares)
        shares[outcome] += 1.0
        new_c = cost(shares)
        _ = new_c - old_c  # Trade cost
    elapsed = time.perf_counter() - start

    return BenchmarkResult(
        mechanism="LMSR",
        num_operations=num_trades,
        elapsed_seconds=elapsed,
        ops_per_second=num_trades / elapsed,
        memory_bytes=sys.getsizeof(shares),
    )


def benchmark_order_book(num_orders: int) -> BenchmarkResult:
    """Benchmark order book insertion and matching throughput.

    Uses a simplified order book for fair comparison.

    Args:
        num_orders: Number of orders to process.

    Returns:
        Benchmark results.
    """
    import heapq
    bids: list[tuple] = []
    asks: list[tuple] = []

    start = time.perf_counter()
    for i in range(num_orders):
        if i % 2 == 0:
            # Limit buy order
            price = 0.50 + (i % 10) * 0.01
            heapq.heappush(bids, (-price, i, 1.0))
        else:
            # Limit sell order - attempt match
            price = 0.50 + (i % 10) * 0.01
            if bids and -bids[0][0] >= price:
                heapq.heappop(bids)  # Match
            else:
                heapq.heappush(asks, (price, i, 1.0))
    elapsed = time.perf_counter() - start

    return BenchmarkResult(
        mechanism="OrderBook",
        num_operations=num_orders,
        elapsed_seconds=elapsed,
        ops_per_second=num_orders / elapsed,
        memory_bytes=sys.getsizeof(bids) + sys.getsizeof(asks),
    )


def run_benchmarks():
    """Run and display benchmark comparisons."""
    print("=" * 65)
    print("Performance Benchmark: LMSR vs Order Book")
    print("=" * 65)

    for n in [1_000, 10_000, 100_000]:
        lmsr_result = benchmark_lmsr(n)
        ob_result = benchmark_order_book(n)

        print(f"\n--- {n:,} operations ---")
        print(f"LMSR:       {lmsr_result.ops_per_second:>12,.0f} ops/sec "
              f"({lmsr_result.elapsed_seconds:.4f}s)")
        print(f"OrderBook:  {ob_result.ops_per_second:>12,.0f} ops/sec "
              f"({ob_result.elapsed_seconds:.4f}s)")
        ratio = ob_result.ops_per_second / lmsr_result.ops_per_second
        print(f"OB/LMSR ratio: {ratio:.2f}x")


if __name__ == "__main__":
    run_benchmarks()

Typical results on a standard developer machine:

| Operations | LMSR (ops/sec) | Order Book (ops/sec) | Ratio |
| --- | --- | --- | --- |
| 1,000 | ~250,000 | ~400,000 | 1.6x |
| 10,000 | ~240,000 | ~380,000 | 1.6x |
| 100,000 | ~230,000 | ~350,000 | 1.5x |

The order book was slightly faster per operation because LMSR requires exponential and logarithmic calculations, while order book matching is mostly cheap heap operations. However, both mechanisms comfortably handle hundreds of thousands of operations per second even in pure Python, far beyond PredictHub's actual load, so performance was not a deciding factor.

The Decision Framework

The team developed a structured decision framework:

Criterion 1: Liquidity Quality

| Aspect | Order Book | LMSR |
| --- | --- | --- |
| Hot markets | Excellent natural liquidity | Good, but AMM subsidy unnecessary |
| Warm markets | Adequate but inconsistent | Consistent, always tradeable |
| Cold markets | Poor to nonexistent | Consistent, always tradeable |
| Verdict | Wins for hot markets | Wins overall |

Criterion 2: Cost

| Aspect | Order Book | LMSR |
| --- | --- | --- |
| Infrastructure cost | Moderate (heap management) | Low (simple math) |
| Liquidity cost | $0 (traders provide) | $b \cdot \ln(n)$ per market |
| Annual subsidy (150 markets) | $0 | ~$10,400 at $b=100$ |
| Verdict | Wins on cost | Acceptable with budget |

For 150 binary markets at $b = 100$: $150 \times 100 \times \ln(2) \approx \$10,397$ annual maximum subsidy. In practice, the actual loss is much lower because markets rarely reach the worst case.
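A quick check of that arithmetic, with all 150 binary markets at their worst-case loss of b * ln(2):

```python
import math

num_markets = 150
b = 100.0
outcomes = 2  # binary markets

per_market_max = b * math.log(outcomes)   # worst-case subsidy, ~$69.31
total_max = num_markets * per_market_max
print(f"Maximum total subsidy: ${total_max:,.0f}")  # ≈ $10,397
```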

Criterion 3: User Experience

| Aspect | Order Book | LMSR |
| --- | --- | --- |
| Immediate execution | Only if matching order exists | Always |
| Price transparency | Bid/ask with spread | Single clear price |
| Learning curve | Higher (limit orders, spreads) | Lower (just buy/sell) |
| Verdict | Better for sophisticated traders | Wins for mass market |

Criterion 4: Price Discovery

| Aspect | Order Book | LMSR |
| --- | --- | --- |
| Information aggregation | Excellent in active markets | Good |
| Manipulation resistance | Moderate (visible depth) | Moderate (bounded loss) |
| Accuracy (Brier score) | 0.18 (hot markets) | Projected 0.22 |
| Verdict | Wins for well-traded markets | Adequate for all markets |

Final Recommendation: Hybrid Approach

The team recommended a hybrid approach rather than a complete migration:

  1. LMSR as the default for all new markets, providing guaranteed baseline liquidity.
  2. Order book overlay for hot markets that attract sufficient trading volume, allowing traders to offer better prices than the AMM.
  3. Automatic mode selection: Markets start as LMSR and automatically gain an order book when daily volume exceeds a threshold (50 trades/day).
"""Hybrid market routing logic for PredictHub."""

from enum import Enum


class MarketMode(str, Enum):
    """The active pricing mode for a market."""
    LMSR_ONLY = "lmsr_only"
    HYBRID = "hybrid"


class HybridRouter:
    """Routes trades to the optimal pricing mechanism.

    In hybrid mode, incoming orders are first checked against the
    order book. If the order book can offer a better price than the
    LMSR, the trade goes through the order book. Otherwise, it executes
    against the LMSR.

    Attributes:
        lmsr: The LMSR market maker instance.
        order_book: The order book instance (may be None).
        mode: Current market mode.
        volume_threshold: Daily trades needed to activate hybrid mode.
    """

    def __init__(self, lmsr, order_book=None, volume_threshold: int = 50):
        """Initialize the hybrid router.

        Args:
            lmsr: LMSR market maker instance.
            order_book: Optional order book instance.
            volume_threshold: Trades per day to activate order book.
        """
        self.lmsr = lmsr
        self.order_book = order_book
        self.mode = MarketMode.LMSR_ONLY
        self.volume_threshold = volume_threshold
        self.daily_trade_count = 0

    def route_order(self, outcome: int, shares: float, side: str,
                    limit_price: float | None = None) -> dict:
        """Route an order to the best available pricing mechanism.

        The routing logic:
        1. In LMSR_ONLY mode, always execute against the LMSR.
        2. In HYBRID mode, compare the order book's best resting price
           with the LMSR's average execution price and route the order
           to whichever mechanism offers the better deal.

        Args:
            outcome: Outcome index to trade.
            shares: Number of shares.
            side: "buy" or "sell".
            limit_price: Optional limit price (accepted but not enforced
                by this simplified router).

        Returns:
            Trade execution result with routing information.
        """
        self.daily_trade_count += 1

        if self.mode == MarketMode.LMSR_ONLY:
            cost = self.lmsr.trade_cost(outcome, shares if side == "buy" else -shares)
            avg_price = abs(cost / shares)
            return {
                "mechanism": "lmsr",
                "avg_price": avg_price,
                "cost": cost,
                "shares": shares,
            }

        # Hybrid mode: compare prices
        lmsr_cost = self.lmsr.trade_cost(outcome, shares if side == "buy" else -shares)
        lmsr_avg_price = abs(lmsr_cost / shares)

        ob_price = None
        if self.order_book:
            if side == "buy":
                ob_price = self.order_book.get_best_ask(outcome)
            else:
                ob_price = self.order_book.get_best_bid(outcome)

        # Use order book if it offers a better price
        if ob_price is not None:
            if side == "buy" and ob_price < lmsr_avg_price:
                return {
                    "mechanism": "order_book",
                    "avg_price": ob_price,
                    "cost": ob_price * shares,
                    "shares": shares,
                }
            elif side == "sell" and ob_price > lmsr_avg_price:
                return {
                    "mechanism": "order_book",
                    "avg_price": ob_price,
                    "cost": ob_price * shares,
                    "shares": shares,
                }

        # Default to LMSR
        return {
            "mechanism": "lmsr",
            "avg_price": lmsr_avg_price,
            "cost": lmsr_cost,
            "shares": shares,
        }

    def check_mode_upgrade(self) -> bool:
        """Check if the market should be upgraded to hybrid mode.

        Returns:
            True if mode was changed to HYBRID.
        """
        if (self.mode == MarketMode.LMSR_ONLY and
                self.daily_trade_count >= self.volume_threshold):
            self.mode = MarketMode.HYBRID
            return True
        return False

Outcome

PredictHub implemented the hybrid approach over four weeks. Results after three months:

| Metric | Before (Order Book Only) | After (Hybrid) | Change |
| --- | --- | --- | --- |
| Active markets with trades | 60 (40%) | 135 (90%) | +125% |
| Cold market avg daily trades | 1.2 | 8.7 | +625% |
| User satisfaction (survey) | 3.1/5 | 4.2/5 | +35% |
| Monthly active traders | 340 | 780 | +129% |
| Revenue (trading fees) | $4,200/mo | $11,800/mo | +181% |
| LMSR subsidy cost | $0 | $1,800/mo | New cost |
| Net revenue change | | | +$5,800/mo |

The LMSR subsidy cost ($1,800/month) was a fraction of the revenue increase ($7,600/month), making the investment clearly worthwhile. Cold markets became the platform's growth engine: niche topics attracted passionate communities who previously could not participate due to illiquidity.
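The net figure in the table follows directly from the revenue and subsidy lines:

```python
revenue_before = 4_200    # monthly trading fees, order book only
revenue_after = 11_800    # monthly trading fees, hybrid
subsidy = 1_800           # monthly LMSR subsidy cost

revenue_gain = revenue_after - revenue_before
net_change = revenue_gain - subsidy
print(f"Revenue gain: ${revenue_gain:,}/mo, net change: ${net_change:,}/mo")
```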

Key Lessons

  1. One mechanism does not fit all markets. Hot markets benefit from order book price discovery, while cold markets need AMM liquidity guarantees.

  2. The subsidy is an investment, not a cost. The LMSR subsidy created markets that generated more revenue than they cost, because engaged users also traded in hot markets.

  3. Start with AMM, add order book later. Building an order book is more complex and only valuable when there is sufficient volume. LMSR provides a minimum viable product for any market.

  4. Measure before migrating. The simulation framework allowed PredictHub to predict the impact of migration before committing to the engineering work.

  5. User experience drives adoption more than mechanism efficiency. Users cared about immediate execution and clear prices, not about whether the spread was 2% or 3%.