Case Study 2: Multi-Platform Arbitrage System — Architecture and Lessons Learned

Overview

Arbitrage — simultaneously buying and selling equivalent assets at different prices to lock in a risk-free profit — is one of the oldest strategies in finance. In prediction markets, arbitrage opportunities arise when different platforms price the same event differently. This case study examines the design, implementation, and operational challenges of a multi-platform arbitrage system.

We follow a developer, Sana, who built a system to detect and exploit price discrepancies between Polymarket, Kalshi, and Metaculus (used as a signal, since it is not a trading platform).


The Arbitrage Opportunity

Why Arbitrage Exists in Prediction Markets

As discussed in Chapter 19 (Portfolio Strategies), prediction market arbitrage exists for several reasons:

  1. Fragmented liquidity: The same event trades on multiple platforms, each with its own pool of participants. Information reaches each pool at different speeds.

  2. Different market structures: Polymarket uses a central limit order book (CLOB) on Polygon (Chapter 7), Kalshi uses a centralized order book under CFTC regulation (Chapter 8), and Metaculus aggregates crowd forecasts rather than trades. These structural differences produce systematic price differences.

  3. Jurisdictional barriers: Some traders can only access certain platforms, creating segmented markets that cannot efficiently arbitrage themselves.

  4. Settlement risk: Cross-platform arbitrage requires capital on both platforms simultaneously, and settlement timing differs. Not everyone is willing to bear this risk, leaving opportunities for those who are.

Types of Arbitrage

Sana identified three types of arbitrage relevant to her system:

Type 1: Cross-Platform Arbitrage. Buy YES on Platform A at $0.40 and buy NO on Platform B at $0.50. Total cost: $0.90. Guaranteed payout: $1.00. Gross profit: $0.10, before fees and slippage.

Type 2: Intra-Platform Arbitrage. On a single platform, if YES + NO < $1.00 (after fees), buy both to lock in a profit. This is rare on well-functioning platforms but can occur briefly during high volatility.

Type 3: Statistical Arbitrage. When one platform's price deviates significantly from the cross-platform consensus, bet that the price will converge. This is not risk-free (the price might diverge further), but it has positive expected value if the consensus is more accurate than any single platform.
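
The three types reduce to a few lines of arithmetic. The sketch below is illustrative rather than Sana's actual code: the function names, the proportional fee_rate, and the use of a median of the other platforms as the "consensus" are all assumptions made for the example.

```python
from statistics import median


def cross_platform_profit(yes_price_a: float, no_price_b: float) -> float:
    """Type 1: gross profit per share pair from YES on A plus NO on B."""
    return 1.0 - (yes_price_a + no_price_b)


def intra_platform_profit(yes_price: float, no_price: float, fee_rate: float) -> float:
    """Type 2: profit per share pair on one platform after fees.

    fee_rate is a hypothetical proportional fee on each leg's purchase price.
    """
    cost = (yes_price + no_price) * (1.0 + fee_rate)
    return 1.0 - cost


def consensus_deviation(prices: dict[str, float], platform: str) -> float:
    """Type 3: one platform's YES price minus the median of the other platforms."""
    others = [p for name, p in prices.items() if name != platform]
    return prices[platform] - median(others)


# Type 1 with the numbers from the text: 0.40 + 0.50 = 0.90 total cost.
print(round(cross_platform_profit(0.40, 0.50), 4))        # 0.1

# Type 2: YES + NO = 0.97, but a 2% fee leaves only about one cent.
print(round(intra_platform_profit(0.48, 0.49, 0.02), 4))  # 0.0106

# Type 3: a positive deviation suggests this platform is priced high.
quotes = {"polymarket": 0.62, "kalshi": 0.55, "metaculus": 0.56}
print(round(consensus_deviation(quotes, "polymarket"), 4))  # 0.065
```

Only the first two functions describe a locked-in payoff. The third returns a signal, and acting on it is a directional bet on convergence.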


System Architecture

High-Level Design

+------------------------------------------------------------------+
|                 MULTI-PLATFORM ARBITRAGE SYSTEM                  |
+------------------------------------------------------------------+
|                                                                  |
|  +---------------+  +-------------+  +----------------+          |
|  | Polymarket    |  | Kalshi      |  | Metaculus      |          |
|  | Client        |  | Client      |  | Client         |          |
|  +-------+-------+  +------+------+  +-------+--------+          |
|          |                 |                 |                   |
|          +--------+--------+--------+--------+                   |
|                   |                 |                            |
|          +--------v-------+  +------v-----------+                |
|          | Market Matcher |  | Price Normalizer |                |
|          +--------+-------+  +------+-----------+                |
|                   |                 |                            |
|           +-------v-----------------v-------+                    |
|           |     Arbitrage Detector          |                    |
|           |  (cross-platform, intra-plat,   |                    |
|           |   statistical)                  |                    |
|           +-------+-------------------------+                    |
|                   |                                              |
|           +-------v-------+                                      |
|           | Execution     |                                      |
|           | Coordinator   |----> Platform A (YES leg)            |
|           | (sequential)  |----> Platform B (NO leg)             |
|           +---------------+                                      |
+------------------------------------------------------------------+
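
The Price Normalizer in the diagram is not listed in this case study. As a rough idea of what it has to do, the sketch below converts each platform's raw YES quote to a probability in [0, 1] and flags signal-only sources. The scale factors (dollars for Polymarket, cents for Kalshi, probabilities for Metaculus), the NormalizedQuote type, and the function name are assumptions for illustration, not taken from any platform's API documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NormalizedQuote:
    """A YES price expressed as an implied probability."""
    platform: str
    market_id: str
    yes_price: float   # implied probability in [0, 1]
    tradable: bool     # False for signal-only sources


# Hypothetical quoting conventions: divide the raw quote by this factor.
_SCALE = {"polymarket": 1.0, "kalshi": 100.0, "metaculus": 1.0}
# Metaculus is used only as a signal, so it can never be a trade leg.
_SIGNAL_ONLY = {"metaculus"}


def normalize_quote(platform: str, market_id: str, raw_yes: float) -> NormalizedQuote:
    """Convert a raw platform quote into a comparable NormalizedQuote."""
    if platform not in _SCALE:
        raise ValueError(f"unknown platform: {platform}")
    yes_price = raw_yes / _SCALE[platform]
    if not 0.0 <= yes_price <= 1.0:
        raise ValueError(f"price out of range after scaling: {yes_price}")
    return NormalizedQuote(platform, market_id, yes_price, platform not in _SIGNAL_ONLY)


quote = normalize_quote("kalshi", "EXAMPLE-1", 42)
print(quote.yes_price, quote.tradable)
```

Rejecting out-of-range prices at this boundary keeps a unit mistake (cents read as dollars, for instance) from showing up downstream as a spurious arbitrage.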

Market Matching

The hardest part of cross-platform arbitrage is identifying which markets on different platforms correspond to the same event. Sana built a market matcher with multiple strategies:

"""Market matching module for cross-platform arbitrage.

Identifies equivalent markets across platforms using a cascade of
matching strategies: manually curated mappings, shared resolution
sources, and fuzzy text similarity.
"""

from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional


@dataclass
class MarketPair:
    """A matched pair of markets across two platforms."""
    platform_a: str
    market_id_a: str
    question_a: str
    price_a: float

    platform_b: str
    market_id_b: str
    question_b: str
    price_b: float

    match_confidence: float  # 0 to 1
    match_method: str        # How the match was determined

    @property
    def price_gap(self) -> float:
        """Absolute price difference between the two platforms."""
        return abs(self.price_a - self.price_b)

    @property
    def arbitrage_profit(self) -> float:
        """Gross profit per share pair, before fees and slippage.

        Buys YES on the platform where it is cheaper and NO on the
        other platform, assuming each platform's NO price equals
        1 - YES price. Under that assumption the result is the price gap.
        """
        # Buy YES on the cheaper platform, buy NO on the more expensive
        if self.price_a < self.price_b:
            yes_cost = self.price_a
            no_cost = 1.0 - self.price_b
        else:
            yes_cost = self.price_b
            no_cost = 1.0 - self.price_a
        total_cost = yes_cost + no_cost
        if total_cost < 1.0:
            return 1.0 - total_cost
        return 0.0


class MarketMatcher:
    """Matches equivalent markets across different platforms.

    Uses a cascade of matching strategies:
    1. Manual match: manually curated mapping table
    2. Resolution source match: both markets cite the same resolution source
    3. Text similarity: fuzzy matching on question text

    Embedding-based semantic matching would be a natural fourth strategy
    but is not implemented here.
    """

    def __init__(self):
        self._manual_mappings: dict[str, str] = {}
        self._min_text_similarity = 0.65

    def add_manual_mapping(self, key_a: str, key_b: str):
        """Add a manually verified market pair."""
        self._manual_mappings[key_a] = key_b
        self._manual_mappings[key_b] = key_a

    def find_matches(
        self,
        markets_a: list[dict],
        markets_b: list[dict],
    ) -> list[MarketPair]:
        """Find matching markets between two platforms' market lists.

        Applies matching strategies in order of confidence:
        manual > resolution source > text similarity.
        """
        pairs = []
        matched_b_ids = set()

        for ma in markets_a:
            best_match = None
            best_confidence = 0.0
            best_method = ""

            for mb in markets_b:
                if mb["market_id"] in matched_b_ids:
                    continue

                # Strategy 1: Manual mapping
                key_a = f"{ma['platform']}:{ma['market_id']}"
                key_b = f"{mb['platform']}:{mb['market_id']}"
                if self._manual_mappings.get(key_a) == key_b:
                    best_match = mb
                    best_confidence = 1.0
                    best_method = "manual"
                    break

                # Strategy 2: Resolution source match
                if (
                    ma.get("resolution_source")
                    and mb.get("resolution_source")
                    and ma["resolution_source"] == mb["resolution_source"]
                ):
                    if best_confidence < 0.95:
                        best_match = mb
                        best_confidence = 0.95
                        best_method = "resolution_source"

                # Strategy 3: Text similarity
                similarity = SequenceMatcher(
                    None,
                    ma.get("question", "").lower(),
                    mb.get("question", "").lower(),
                ).ratio()
                if similarity > self._min_text_similarity and similarity > best_confidence:
                    best_match = mb
                    best_confidence = similarity
                    best_method = "text_similarity"

            if best_match and best_confidence >= self._min_text_similarity:
                matched_b_ids.add(best_match["market_id"])
                pairs.append(MarketPair(
                    platform_a=ma["platform"],
                    market_id_a=ma["market_id"],
                    question_a=ma.get("question", ""),
                    price_a=ma["yes_price"],
                    platform_b=best_match["platform"],
                    market_id_b=best_match["market_id"],
                    question_b=best_match.get("question", ""),
                    price_b=best_match["yes_price"],
                    match_confidence=best_confidence,
                    match_method=best_method,
                ))

        return pairs
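
One weakness of character-level similarity is that two questions about different dates or thresholds can score well above the 0.65 cutoff. The sketch below shows one possible guard: require that the numbers and month names in both questions agree before trusting the text score. The helper names and the choice of "key tokens" are assumptions for illustration and are not part of the matcher above.

```python
import re
from difflib import SequenceMatcher

MONTHS = {
    "january", "february", "march", "april", "may", "june", "july",
    "august", "september", "october", "november", "december",
}


def key_tokens(question: str) -> set[str]:
    """Extract the numbers and month names that pin down which event is meant."""
    words = re.findall(r"[a-z]+|\d+(?:\.\d+)?", question.lower())
    return {w for w in words if w[0].isdigit() or w in MONTHS}


def guarded_similarity(question_a: str, question_b: str) -> float:
    """Text similarity, forced to 0.0 when dates or numbers disagree."""
    if key_tokens(question_a) != key_tokens(question_b):
        return 0.0
    return SequenceMatcher(None, question_a.lower(), question_b.lower()).ratio()


a = "Will the Fed cut rates in March 2025?"
b = "Will the Fed cut rates in June 2025?"
raw = SequenceMatcher(None, a.lower(), b.lower()).ratio()
print(raw > 0.65, guarded_similarity(a, b))  # True 0.0
```

The guard trades recall for precision: it rejects some genuine matches that write dates differently, which is usually the safer error when a false match means holding two unrelated positions.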

Arbitrage Detection

Once markets are matched, the arbitrage detector evaluates each pair for profitability:

"""Arbitrage detection and evaluation module.

Identifies profitable cross-platform opportunities after
accounting for transaction costs, slippage, and settlement risk.
"""

from dataclasses import dataclass


@dataclass
class ArbitrageOpportunity:
    """A detected arbitrage opportunity with full cost analysis."""
    pair: MarketPair
    gross_profit: float          # Before costs
    transaction_costs: float     # Platform fees
    estimated_slippage: float    # Expected slippage
    net_profit: float            # After all costs
    capital_required: float      # Total capital needed
    return_pct: float            # Net profit / capital required
    time_to_settlement: float    # Hours until both legs settle
    risk_score: float            # 0 (safe) to 1 (risky)


class ArbitrageDetector:
    """Detects and evaluates arbitrage opportunities across platforms.

    Applies conservative cost estimates and risk assessment
    before flagging an opportunity as actionable.
    """

    def __init__(
        self,
        fee_rates: dict[str, float],        # Platform -> fee rate
        min_net_profit: float = 0.02,        # Minimum $0.02 profit per share
        min_return_pct: float = 0.01,        # Minimum 1% return
        max_risk_score: float = 0.5,         # Maximum acceptable risk
    ):
        self.fee_rates = fee_rates
        self.min_net_profit = min_net_profit
        self.min_return_pct = min_return_pct
        self.max_risk_score = max_risk_score

    def detect(self, pairs: list[MarketPair]) -> list[ArbitrageOpportunity]:
        """Evaluate all market pairs for arbitrage opportunities."""
        opportunities = []

        for pair in pairs:
            opp = self._evaluate_pair(pair)
            if opp and opp.net_profit >= self.min_net_profit:
                if opp.return_pct >= self.min_return_pct:
                    if opp.risk_score <= self.max_risk_score:
                        opportunities.append(opp)

        # Sort by return percentage (best first)
        opportunities.sort(key=lambda o: o.return_pct, reverse=True)
        return opportunities

    def _evaluate_pair(self, pair: MarketPair) -> ArbitrageOpportunity | None:
        """Evaluate a single market pair; return None if there is no gross profit."""
        # Determine the cheaper side
        if pair.price_a < pair.price_b:
            buy_yes_price = pair.price_a
            buy_yes_platform = pair.platform_a
            buy_no_price = 1.0 - pair.price_b
            buy_no_platform = pair.platform_b
        else:
            buy_yes_price = pair.price_b
            buy_yes_platform = pair.platform_b
            buy_no_price = 1.0 - pair.price_a
            buy_no_platform = pair.platform_a

        total_cost = buy_yes_price + buy_no_price
        gross_profit = 1.0 - total_cost

        if gross_profit <= 0:
            return None

        # Transaction costs
        fee_a = self.fee_rates.get(buy_yes_platform, 0.02)
        fee_b = self.fee_rates.get(buy_no_platform, 0.02)
        transaction_costs = buy_yes_price * fee_a + buy_no_price * fee_b

        # Slippage estimate: a flat, conservative 50 basis points per leg.
        # A production system would scale this with order-book depth.
        slippage_bps = 50
        estimated_slippage = (
            buy_yes_price * slippage_bps / 10000
            + buy_no_price * slippage_bps / 10000
        )

        net_profit = gross_profit - transaction_costs - estimated_slippage
        capital_required = total_cost + transaction_costs + estimated_slippage
        return_pct = net_profit / capital_required if capital_required > 0 else 0

        # Risk assessment
        risk_factors = []

        # Lower match confidence = higher risk (might not be the same event)
        if pair.match_confidence < 0.9:
            risk_factors.append(0.3)
        if pair.match_confidence < 0.8:
            risk_factors.append(0.3)

        # Cross-platform settlement risk
        if pair.platform_a != pair.platform_b:
            risk_factors.append(0.1)  # Different resolution sources

        risk_score = min(sum(risk_factors), 1.0)

        return ArbitrageOpportunity(
            pair=pair,
            gross_profit=gross_profit,
            transaction_costs=transaction_costs,
            estimated_slippage=estimated_slippage,
            net_profit=net_profit,
            capital_required=capital_required,
            return_pct=return_pct,
            time_to_settlement=0.0,  # Placeholder: derive from market end dates
            risk_score=risk_score,
        )
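
To see how much of a quoted gap survives costs, the same arithmetic can be run on the Type 1 example from earlier (YES at $0.40, NO at $0.50). The standalone function below mirrors the cost logic of _evaluate_pair using the listing's fallback values (2% fee per leg, 50 basis points of slippage). The function name and return shape are assumptions for illustration.

```python
def net_arbitrage_return(
    yes_price: float,
    no_price: float,
    fee_yes: float = 0.02,
    fee_no: float = 0.02,
    slippage_bps: float = 50.0,
) -> dict[str, float]:
    """Per-share-pair cost breakdown for buying YES on one platform and NO on another."""
    total_cost = yes_price + no_price
    gross_profit = 1.0 - total_cost
    transaction_costs = yes_price * fee_yes + no_price * fee_no
    estimated_slippage = total_cost * slippage_bps / 10_000
    net_profit = gross_profit - transaction_costs - estimated_slippage
    capital_required = total_cost + transaction_costs + estimated_slippage
    return {
        "gross_profit": gross_profit,
        "transaction_costs": transaction_costs,
        "estimated_slippage": estimated_slippage,
        "net_profit": net_profit,
        "capital_required": capital_required,
        "return_pct": net_profit / capital_required,
    }


for name, value in net_arbitrage_return(0.40, 0.50).items():
    print(f"{name}: {value:.4f}")
# gross_profit: 0.1000
# transaction_costs: 0.0180
# estimated_slippage: 0.0045
# net_profit: 0.0775
# capital_required: 0.9225
# return_pct: 0.0840
```

With these default rates, costs come to 2.5% of the combined purchase price, so any gap narrower than about 2.4 cents produces no net profit at all.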

Execution Coordinator

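The most critical component of an arbitrage system is the execution coordinator. Ideally both legs of the trade would execute atomically, but no mechanism guarantees atomicity across independent platforms: if only one leg fills, you hold an unhedged position, not an arbitrage. The coordinator's job is to minimize that exposure and to manage it when it occurs.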

"""Execution coordinator for multi-leg arbitrage trades.

The coordinator ensures that both legs of an arbitrage trade
execute within acceptable parameters. If one leg fails or fills
at a worse price, the coordinator manages the recovery.
"""

import logging
from dataclasses import dataclass
from enum import Enum

logger = logging.getLogger(__name__)


class LegStatus(Enum):
    """Status of a single arbitrage leg."""
    PENDING = "pending"
    SUBMITTED = "submitted"
    FILLED = "filled"
    FAILED = "failed"
    CANCELLED = "cancelled"


@dataclass
class ArbitrageLeg:
    """One leg of an arbitrage trade."""
    platform: str
    market_id: str
    side: str         # "YES" or "NO"
    quantity: float
    limit_price: float
    status: LegStatus = LegStatus.PENDING
    filled_price: float = 0.0
    filled_quantity: float = 0.0


@dataclass
class ArbitrageExecution:
    """A complete two-leg arbitrage execution."""
    opportunity: ArbitrageOpportunity
    leg_a: ArbitrageLeg
    leg_b: ArbitrageLeg
    expected_profit: float
    actual_profit: float = 0.0
    status: str = "pending"


class ExecutionCoordinator:
    """Coordinates the execution of multi-leg arbitrage trades.

    Execution strategy:
    1. Submit the less liquid leg first (higher fill risk)
    2. If it fills, immediately submit the second leg
    3. If the second leg fails, attempt to unwind the first
    4. Track all partial fills and compute actual P&L

    This sequential approach sacrifices some speed for safety.
    A more aggressive approach would submit both simultaneously
    and manage the risk of partial fills.
    """

    def __init__(self, clients: dict, max_retries: int = 3):
        self.clients = clients
        self.max_retries = max_retries
        self.executions: list[ArbitrageExecution] = []

    async def execute_arbitrage(
        self, opportunity: ArbitrageOpportunity, quantity: float
    ) -> ArbitrageExecution:
        """Execute a complete arbitrage trade.

        Returns an ArbitrageExecution with the result.
        """
        pair = opportunity.pair

        # Determine which legs to execute
        if pair.price_a < pair.price_b:
            leg_a = ArbitrageLeg(
                platform=pair.platform_a,
                market_id=pair.market_id_a,
                side="YES",
                quantity=quantity,
                limit_price=pair.price_a + 0.01,  # Small buffer
            )
            leg_b = ArbitrageLeg(
                platform=pair.platform_b,
                market_id=pair.market_id_b,
                side="NO",
                quantity=quantity,
                limit_price=(1.0 - pair.price_b) + 0.01,
            )
        else:
            leg_a = ArbitrageLeg(
                platform=pair.platform_b,
                market_id=pair.market_id_b,
                side="YES",
                quantity=quantity,
                limit_price=pair.price_b + 0.01,
            )
            leg_b = ArbitrageLeg(
                platform=pair.platform_a,
                market_id=pair.market_id_a,
                side="NO",
                quantity=quantity,
                limit_price=(1.0 - pair.price_a) + 0.01,
            )

        execution = ArbitrageExecution(
            opportunity=opportunity,
            leg_a=leg_a,
            leg_b=leg_b,
            expected_profit=opportunity.net_profit * quantity,
        )

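        # Step 1: Submit leg A (the YES leg) first. The strategy calls for
        # leading with the less liquid leg; this simplified version has no
        # liquidity data, so the order is fixed.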
        logger.info(
            f"Arbitrage: submitting leg A ({leg_a.platform} "
            f"{leg_a.side} @ {leg_a.limit_price:.4f})"
        )
        leg_a.status = await self._submit_order(leg_a)

        if leg_a.status != LegStatus.FILLED:
            execution.status = "leg_a_failed"
            logger.warning("Arbitrage failed: leg A did not fill")
            self.executions.append(execution)
            return execution

        # Step 2: Submit the second leg immediately
        logger.info(
            f"Arbitrage: leg A filled, submitting leg B ({leg_b.platform} "
            f"{leg_b.side} @ {leg_b.limit_price:.4f})"
        )
        leg_b.status = await self._submit_order(leg_b)

        if leg_b.status == LegStatus.FILLED:
            execution.status = "complete"
            execution.actual_profit = (
                1.0
                - leg_a.filled_price
                - leg_b.filled_price
                - opportunity.transaction_costs
            ) * quantity
            logger.info(
                f"Arbitrage complete: profit = ${execution.actual_profit:.4f}"
            )
        else:
            # Leg B failed — we have an unhedged position
            execution.status = "partial_fill"
            logger.error(
                f"Arbitrage PARTIAL FILL: leg A filled but leg B failed. "
                f"Unhedged position on {leg_a.platform}!"
            )
            # Attempt to unwind leg A
            await self._attempt_unwind(leg_a)

        self.executions.append(execution)
        return execution

    async def _submit_order(self, leg: ArbitrageLeg) -> LegStatus:
        """Submit an order for one leg of the arbitrage.

        In production, this calls the platform's order API.
        Here we simulate the submission.
        """
        # Simulated execution for case study purposes
        leg.status = LegStatus.FILLED
        leg.filled_price = leg.limit_price
        leg.filled_quantity = leg.quantity
        return LegStatus.FILLED

    async def _attempt_unwind(self, leg: ArbitrageLeg):
        """Attempt to unwind a filled leg when the other leg fails.

        This is an emergency procedure — we try to sell the position
        at the best available price to minimize loss.
        """
        logger.warning(
            f"Attempting to unwind {leg.side} position on {leg.platform}"
        )
        # In production: submit a market sell order
        # Accept slippage to exit quickly

    def get_summary(self) -> dict:
        """Summarize all arbitrage executions."""
        completed = [e for e in self.executions if e.status == "complete"]
        failed = [e for e in self.executions if "failed" in e.status]
        partial = [e for e in self.executions if e.status == "partial_fill"]
        total_profit = sum(e.actual_profit for e in completed)
        return {
            "total_executions": len(self.executions),
            "completed": len(completed),
            "failed": len(failed),
            "partial_fills": len(partial),
            "total_profit": total_profit,
            "avg_profit_per_trade": (
                total_profit / len(completed) if completed else 0.0
            ),
        }
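
The ExecutionCoordinator docstring mentions a more aggressive alternative: submitting both legs simultaneously and managing partial fills afterwards. A minimal sketch of that idea with asyncio.gather follows. Here submit_leg is a hypothetical stand-in for a real platform order call, and the outcome labels are illustrative rather than part of Sana's system.

```python
import asyncio
from enum import Enum


class Fill(Enum):
    FILLED = "filled"
    FAILED = "failed"


async def submit_leg(will_fill: bool, delay: float = 0.01) -> Fill:
    """Hypothetical order call: waits briefly, then reports a fill or a failure."""
    await asyncio.sleep(delay)  # simulated network latency
    return Fill.FILLED if will_fill else Fill.FAILED


async def execute_both(leg_a_fills: bool, leg_b_fills: bool, timeout: float = 1.0) -> str:
    """Submit both legs concurrently and classify the combined outcome."""
    try:
        status_a, status_b = await asyncio.wait_for(
            asyncio.gather(submit_leg(leg_a_fills), submit_leg(leg_b_fills)),
            timeout=timeout,
        )
    except asyncio.TimeoutError:
        # Leg state is unknown here: reconcile against each platform.
        return "timeout"
    if status_a is Fill.FILLED and status_b is Fill.FILLED:
        return "complete"
    if status_a is Fill.FAILED and status_b is Fill.FAILED:
        return "both_failed"   # no position was opened, nothing to unwind
    return "partial_fill"      # exactly one leg filled and is unhedged


print(asyncio.run(execute_both(True, True)))    # complete
print(asyncio.run(execute_both(True, False)))   # partial_fill
```

Concurrent submission shortens the window in which prices can move between legs, but either leg can now be the one left unhedged, so unwind logic is needed for both platforms rather than one.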

Operational Results

Data Collection Phase (4 weeks)

Sana collected cross-platform data for 4 weeks, polling every 2 minutes. Key findings:

Metric                                 Value
-------------------------------------  ----------
Matched market pairs                   47
Average price gap (matched pairs)      4.2%
Opportunities > 2% gross profit        312
Opportunities > 2% net profit          83
Average opportunity duration           18 minutes
Median opportunity size (liquidity)    $1,200

Key Findings

Finding 1: Opportunities are real but small. The average net profit per arbitrage opportunity was $0.034 per share (3.4 cents). At the median available liquidity of $1,200, or roughly 1,200 hedged share pairs at just under $1.00 per pair, this translates to about $41 per opportunity.

Finding 2: Speed matters enormously. Opportunities lasted an average of 18 minutes, but the most profitable ones (>5% gap) lasted under 5 minutes. By the time Sana's system detected and evaluated them, some had already closed.

Finding 3: Market matching is the bottleneck. Text-based matching only identified about 60% of equivalent markets. The rest had different phrasing, different resolution criteria, or different time horizons that made automated matching unreliable. Manual curation was required for the other 40%.

Finding 4: Settlement risk is real. Two platforms might resolve the same event differently due to different resolution sources or criteria. For example, one platform might use a specific data provider while another uses a different one, and their timestamps or methodologies could disagree. Sana encountered this twice in her data collection period.

Finding 5: Capital efficiency is poor. True arbitrage requires capital locked on both platforms simultaneously. If $5,000 is deployed on each platform, only the matched pairs can be arbitraged, and only when opportunities appear. The effective capital utilization was about 15%.
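
Finding 4 can be made concrete with a break-even calculation. The sketch below assumes a deliberately simple model that is not from Sana's data: with probability q the two platforms resolve in the adverse direction and both legs expire worthless, otherwise the trade earns its net profit. The function name and the $0.966 capital figure (one dollar minus the $0.034 average net profit) are assumptions for illustration.

```python
def breakeven_divergence_probability(net_profit: float, capital_at_risk: float) -> float:
    """Adverse-resolution probability at which expected profit is zero.

    Solves (1 - q) * net_profit - q * capital_at_risk = 0 for q.
    """
    return net_profit / (net_profit + capital_at_risk)


# Average net profit of $0.034 per share pair on roughly $0.966 of capital.
q = breakeven_divergence_probability(0.034, 0.966)
print(round(q, 4))  # 0.034
```

Under this model, an adverse split resolution on only a few percent of trades is enough to erase the average edge, which is why resolution criteria deserve as much scrutiny as prices.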


Lessons Learned

1. Pure Arbitrage Is Harder Than It Looks

The textbook definition of arbitrage — risk-free profit — almost never applies in practice. Every real arbitrage has execution risk (one leg might not fill), settlement risk (the two platforms might disagree on the outcome), and liquidity risk (you might not be able to trade at the quoted price).

Sana found that statistical arbitrage (betting on price convergence without a guaranteed hedge) was more profitable than pure arbitrage, but it carried directional risk. This is the distinction between "real arbitrage" and "convergence trading" discussed in Chapter 19.

2. Latency Is King

In the arbitrage game, the fastest system wins. Sana's Python-based system with 2-minute polling intervals was too slow to capture the best opportunities. A production arbitrage system would need:

  • WebSocket feeds from all platforms (sub-second latency)
  • Co-located servers near platform infrastructure
  • Pre-signed transactions ready to submit (for blockchain platforms)
  • Optimized matching algorithms that run in milliseconds
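
The cost of polling can be estimated with a simple model. The sketch below assumes an opportunity of fixed duration that appears at a uniformly random moment relative to the polling schedule, and it ignores evaluation and order-submission time. The function name and the model itself are assumptions for illustration.

```python
def polling_capture(duration_s: float, poll_interval_s: float) -> tuple[float, float]:
    """Return (probability a poll observes the opportunity,
    mean seconds remaining once it has been observed)."""
    p_detect = min(1.0, duration_s / poll_interval_s)
    # Mean wait until the first poll that lands inside the opportunity window.
    mean_wait = min(duration_s, poll_interval_s) / 2.0
    return p_detect, duration_s - mean_wait


print(polling_capture(300, 120))  # (1.0, 240.0)  5-minute gap, 2-minute polling
print(polling_capture(60, 120))   # (0.5, 30.0)   1-minute gap, 2-minute polling
print(polling_capture(300, 1))    # (1.0, 299.5)  5-minute gap, 1-second updates
```

Under these assumptions, 2-minute polling misses half of all one-minute opportunities outright and gives up a minute of every longer one before any order is sent.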

3. Regulatory Complexity Multiplies

Operating on multiple platforms means complying with multiple regulatory regimes. Polymarket operates on a blockchain with limited regulatory oversight (as of this writing). Kalshi is CFTC-regulated and has specific position limits and reporting requirements. Trading across both requires understanding and complying with both sets of rules (Chapters 38-39).

4. Capital Fragmentation Hurts Returns

The biggest practical challenge was capital fragmentation. To arbitrage between two platforms, you need capital deposited on both. This means your total capital is split, and each half earns returns only when opportunities appear on its respective platform. A $20,000 portfolio split $10,000/$10,000 can therefore underperform a $20,000 portfolio concentrated on a single platform with a directional edge.

5. Market Making Is a Better Business Model

Sana concluded that for prediction markets specifically, market making (Chapter 30) is a more capital-efficient strategy than cross-platform arbitrage. A market maker earns the spread on every trade, while an arbitrageur only profits when cross-platform discrepancies appear. However, market making requires a more sophisticated system and carries inventory risk.


Comparison: Directional Trading vs. Arbitrage

Aspect                     Directional (Case Study 1)  Arbitrage (Case Study 2)
-------------------------  --------------------------  ----------------------------
Required capital           Lower (one platform)        Higher (multiple platforms)
Risk per trade             Higher (directional)        Lower (hedged)
Expected return per trade  Higher (if edge exists)     Lower (small spreads)
Speed requirements         Moderate (hourly cycles)    High (sub-minute)
Model complexity           Higher (probability est.)   Lower (price comparison)
Regulatory complexity      Lower (one platform)        Higher (multi-platform)
Scalability                Good (many markets)         Poor (limited matched pairs)
Capital efficiency         Good                        Poor (fragmented)

Discussion Questions

  1. If you were starting a prediction market trading operation today, would you choose directional trading or arbitrage? Under what conditions would you choose the other?

  2. How would you handle the settlement risk where two platforms resolve the same event differently? Design a risk mitigation strategy.

  3. The case study shows that text-based market matching only catches about 60% of equivalent markets. Design a better matching system. What data sources and algorithms would you use?

  4. Capital efficiency was identified as a major weakness. Propose a hybrid strategy that combines directional trading and arbitrage to improve capital utilization.

  5. As prediction markets mature and institutional participants enter, how do you expect arbitrage opportunities to change? Will they become more or less frequent? Larger or smaller?