Case Study 2: Multi-Platform Arbitrage System — Architecture and Lessons Learned
Overview
Arbitrage — simultaneously buying and selling equivalent assets at different prices to lock in a risk-free profit — is one of the oldest strategies in finance. In prediction markets, arbitrage opportunities arise when different platforms price the same event differently. This case study examines the design, implementation, and operational challenges of a multi-platform arbitrage system.
We follow a developer, Sana, who built a system to detect and exploit price discrepancies between Polymarket, Kalshi, and Metaculus (used as a signal, since it is not a trading platform).
The Arbitrage Opportunity
Why Arbitrage Exists in Prediction Markets
As discussed in Chapter 19 (Portfolio Strategies), prediction market arbitrage exists for several reasons:
- Fragmented liquidity: The same event trades on multiple platforms, each with its own pool of participants. Information reaches each pool at different speeds.
- Different market structures: Polymarket uses a CLOB on Polygon (Chapter 7), Kalshi uses a centralized order book with CFTC regulation (Chapter 8), and Metaculus uses crowd forecasting. These structural differences lead to structural price differences.
- Jurisdictional barriers: Some traders can only access certain platforms, creating segmented markets that cannot efficiently arbitrage themselves.
- Settlement risk: Cross-platform arbitrage requires capital on both platforms simultaneously, and settlement timing differs. Not everyone is willing to bear this risk, leaving opportunities for those who are.
Types of Arbitrage
Sana identified three types of arbitrage relevant to her system:
Type 1: Cross-Platform Arbitrage. Buy YES on Platform A at $0.40 and buy NO on Platform B at $0.50. Total cost: $0.90. Guaranteed payout: $1.00. Profit: $0.10 per share pair, before fees.
Type 2: Intra-Platform Arbitrage. On a single platform, if YES + NO < $1.00 (after fees), buy both to lock in a profit. This is rare on well-functioning platforms but can occur briefly during high volatility.
Type 3: Statistical Arbitrage. When one platform's price deviates significantly from the cross-platform consensus, bet that the price will converge. This is not risk-free (the price might diverge further), but it has positive expected value if the consensus is more accurate than any single platform.
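The three types can be expressed as small checks. The following is a minimal sketch rather than Sana's implementation: the function names, the proportional fee model, and the 0.05 deviation threshold are assumptions made for illustration.

```python
def cross_platform_profit(yes_price_a: float, no_price_b: float) -> float:
    """Type 1: profit per share pair from YES on A plus NO on B (before fees)."""
    return max(0.0, 1.0 - (yes_price_a + no_price_b))


def intra_platform_profit(yes_price: float, no_price: float, fee_rate: float) -> float:
    """Type 2: profit per share pair on one platform.

    Assumes a fee proportional to the purchase cost (an illustrative model).
    """
    cost = (yes_price + no_price) * (1.0 + fee_rate)
    return max(0.0, 1.0 - cost)


def convergence_signal(price: float, consensus: float, threshold: float = 0.05) -> str:
    """Type 3: direction of a convergence bet, or 'none' inside the threshold."""
    deviation = price - consensus
    if deviation > threshold:
        return "buy NO"   # price looks high relative to the consensus
    if deviation < -threshold:
        return "buy YES"  # price looks low relative to the consensus
    return "none"


print(round(cross_platform_profit(0.40, 0.50), 4))        # 0.1
print(round(intra_platform_profit(0.48, 0.49, 0.02), 4))  # 0.0106
print(convergence_signal(0.62, 0.55))                     # buy NO
```

Only the first two checks describe a hedged position. The third returns a direction for a bet that can still lose if the gap widens.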
System Architecture
High-Level Design
+------------------------------------------------------------------+
|                 MULTI-PLATFORM ARBITRAGE SYSTEM                  |
+------------------------------------------------------------------+
|                                                                  |
|  +---------------+    +---------------+    +----------------+    |
|  |  Polymarket   |    |    Kalshi     |    |   Metaculus    |    |
|  |    Client     |    |    Client     |    |     Client     |    |
|  +-------+-------+    +-------+-------+    +-------+--------+    |
|          |                    |                    |             |
|          +----------+---------+----------+---------+             |
|                     |                    |                       |
|             +-------v-------+   +--------v---------+             |
|             | Market Matcher|   | Price Normalizer |             |
|             +-------+-------+   +--------+---------+             |
|                     |                    |                       |
|             +-------v--------------------v---------+             |
|             |          Arbitrage Detector          |             |
|             |     (cross-platform, intra-plat,     |             |
|             |             statistical)             |             |
|             +-------+------------------------------+             |
|                     |                                            |
|             +-------v-------+                                    |
|             |   Execution   |                                    |
|             |  Coordinator  |----> Platform A (buy YES leg)      |
|             | (atomic exec) |----> Platform B (buy NO leg)       |
|             +---------------+                                    |
+------------------------------------------------------------------+
Market Matching
The hardest part of cross-platform arbitrage is identifying which markets on different platforms correspond to the same event. Sana built a market matcher with multiple strategies:
"""Market matching module for cross-platform arbitrage.
Identifies equivalent markets across platforms using a cascade of
matching strategies, from a manually curated mapping table to fuzzy
text similarity. Embedding-based semantic matching is an optional
extension that this listing does not implement.
"""

from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional


@dataclass
class MarketPair:
    """A matched pair of markets across two platforms."""

    platform_a: str
    market_id_a: str
    question_a: str
    price_a: float  # YES price on platform A
    platform_b: str
    market_id_b: str
    question_b: str
    price_b: float  # YES price on platform B
    match_confidence: float  # 0 to 1
    match_method: str  # How the match was determined

    @property
    def price_gap(self) -> float:
        """Absolute price difference between the two platforms."""
        return abs(self.price_a - self.price_b)

    @property
    def arbitrage_profit(self) -> float:
        """Gross profit per share pair if the combined cost is below $1.00.

        For a true arbitrage: buy YES where it is cheaper and buy NO
        on the platform where YES is more expensive. Fees are ignored.
        """
        if self.price_a < self.price_b:
            yes_cost = self.price_a
            no_cost = 1.0 - self.price_b
        else:
            yes_cost = self.price_b
            no_cost = 1.0 - self.price_a
        total_cost = yes_cost + no_cost
        if total_cost < 1.0:
            return 1.0 - total_cost
        return 0.0


class MarketMatcher:
    """Matches equivalent markets across different platforms.

    Uses a cascade of matching strategies:
    1. Manual match: manually curated mapping table
    2. Resolution source match: both markets cite the same source
    3. Text similarity: fuzzy matching on question text
    4. Semantic similarity: embedding-based matching
       (optional; not implemented in this listing)
    """

    def __init__(self):
        self._manual_mappings: dict[str, str] = {}
        self._min_text_similarity = 0.65

    def add_manual_mapping(self, key_a: str, key_b: str):
        """Add a manually verified market pair (keys are 'platform:market_id')."""
        self._manual_mappings[key_a] = key_b
        self._manual_mappings[key_b] = key_a

    def find_matches(
        self,
        markets_a: list[dict],
        markets_b: list[dict],
    ) -> list[MarketPair]:
        """Find matching markets between two platforms' market lists.

        Applies matching strategies in order of confidence:
        manual > resolution source > text similarity.
        Each market on platform B is matched at most once.
        """
        pairs = []
        matched_b_ids = set()
        for ma in markets_a:
            best_match: Optional[dict] = None
            best_confidence = 0.0
            best_method = ""
            key_a = f"{ma['platform']}:{ma['market_id']}"
            for mb in markets_b:
                if mb["market_id"] in matched_b_ids:
                    continue

                # Strategy 1: Manual mapping (stop searching once found)
                key_b = f"{mb['platform']}:{mb['market_id']}"
                if self._manual_mappings.get(key_a) == key_b:
                    best_match = mb
                    best_confidence = 1.0
                    best_method = "manual"
                    break

                # Strategy 2: Resolution source match
                if (
                    ma.get("resolution_source")
                    and mb.get("resolution_source")
                    and ma["resolution_source"] == mb["resolution_source"]
                ):
                    if best_confidence < 0.95:
                        best_match = mb
                        best_confidence = 0.95
                        best_method = "resolution_source"

                # Strategy 3: Text similarity
                similarity = SequenceMatcher(
                    None,
                    ma.get("question", "").lower(),
                    mb.get("question", "").lower(),
                ).ratio()
                if similarity > self._min_text_similarity and similarity > best_confidence:
                    best_match = mb
                    best_confidence = similarity
                    best_method = "text_similarity"

            if best_match and best_confidence >= self._min_text_similarity:
                matched_b_ids.add(best_match["market_id"])
                pairs.append(MarketPair(
                    platform_a=ma["platform"],
                    market_id_a=ma["market_id"],
                    question_a=ma.get("question", ""),
                    price_a=ma["yes_price"],
                    platform_b=best_match["platform"],
                    market_id_b=best_match["market_id"],
                    question_b=best_match.get("question", ""),
                    price_b=best_match["yes_price"],
                    match_confidence=best_confidence,
                    match_method=best_method,
                ))
        return pairs
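To see how the text-similarity strategy behaves, the sketch below applies the same `SequenceMatcher` comparison to three question texts. The questions are invented for illustration and are not taken from any platform; the helper name `question_similarity` is also an assumption.

```python
from difflib import SequenceMatcher


def question_similarity(question_a: str, question_b: str) -> float:
    """Case-insensitive similarity ratio in [0, 1], as used in find_matches."""
    return SequenceMatcher(None, question_a.lower(), question_b.lower()).ratio()


# Hypothetical question texts, invented for this example.
reference = "Will the Fed cut rates in March?"
rephrased_same_event = "Fed rate cut at the March meeting?"
different_event = "Will the Fed cut rates in June?"

threshold = 0.65  # mirrors _min_text_similarity in MarketMatcher

for label, other in [
    ("rephrased, same event", rephrased_same_event),
    ("same wording, different month", different_event),
]:
    score = question_similarity(reference, other)
    print(f"{label}: {score:.2f} (match={score > threshold})")
```

On these inputs the rephrased question for the same event falls below the threshold, while the question about a different month clears it. Character-level similarity rewards shared wording rather than shared meaning, which is why the manual mapping and resolution-source strategies rank above it in the cascade.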
Arbitrage Detection
Once markets are matched, the arbitrage detector evaluates each pair for profitability:
"""Arbitrage detection and evaluation module.
Identifies profitable cross-platform opportunities after
accounting for transaction costs, slippage, and settlement risk.
"""

from dataclasses import dataclass
from typing import Optional

# MarketPair is the dataclass defined in the market matching module above.


@dataclass
class ArbitrageOpportunity:
    """A detected arbitrage opportunity with full cost analysis.

    Monetary fields are per share pair (one YES plus one NO).
    """

    pair: MarketPair
    gross_profit: float  # Before costs
    transaction_costs: float  # Platform fees
    estimated_slippage: float  # Expected slippage
    net_profit: float  # After all costs
    capital_required: float  # Total capital needed
    return_pct: float  # Net profit / capital required
    time_to_settlement: float  # Hours until both legs settle
    risk_score: float  # 0 (safe) to 1 (risky)


class ArbitrageDetector:
    """Detects and evaluates arbitrage opportunities across platforms.

    Applies conservative cost estimates and risk assessment
    before flagging an opportunity as actionable.
    """

    def __init__(
        self,
        fee_rates: dict[str, float],  # Platform -> fee rate
        min_net_profit: float = 0.02,  # Minimum $0.02 profit per share
        min_return_pct: float = 0.01,  # Minimum 1% return
        max_risk_score: float = 0.5,  # Maximum acceptable risk
    ):
        self.fee_rates = fee_rates
        self.min_net_profit = min_net_profit
        self.min_return_pct = min_return_pct
        self.max_risk_score = max_risk_score

    def detect(self, pairs: list[MarketPair]) -> list[ArbitrageOpportunity]:
        """Evaluate all market pairs for arbitrage opportunities."""
        opportunities = []
        for pair in pairs:
            opp = self._evaluate_pair(pair)
            if (
                opp is not None
                and opp.net_profit >= self.min_net_profit
                and opp.return_pct >= self.min_return_pct
                and opp.risk_score <= self.max_risk_score
            ):
                opportunities.append(opp)
        # Sort by return percentage (best first)
        opportunities.sort(key=lambda o: o.return_pct, reverse=True)
        return opportunities

    def _evaluate_pair(self, pair: MarketPair) -> Optional[ArbitrageOpportunity]:
        """Evaluate a single market pair; return None if there is no gross profit."""
        # Buy YES where it is cheaper, buy NO on the other platform
        if pair.price_a < pair.price_b:
            buy_yes_price = pair.price_a
            buy_yes_platform = pair.platform_a
            buy_no_price = 1.0 - pair.price_b
            buy_no_platform = pair.platform_b
        else:
            buy_yes_price = pair.price_b
            buy_yes_platform = pair.platform_b
            buy_no_price = 1.0 - pair.price_a
            buy_no_platform = pair.platform_a

        total_cost = buy_yes_price + buy_no_price
        gross_profit = 1.0 - total_cost
        if gross_profit <= 0:
            return None

        # Transaction costs (default to a 2% fee for unknown platforms)
        fee_a = self.fee_rates.get(buy_yes_platform, 0.02)
        fee_b = self.fee_rates.get(buy_no_platform, 0.02)
        transaction_costs = buy_yes_price * fee_a + buy_no_price * fee_b

        # Slippage estimate (a fixed rate here; less liquid markets warrant more)
        slippage_bps = 50  # Conservative: 50 basis points per leg
        estimated_slippage = (
            buy_yes_price * slippage_bps / 10000
            + buy_no_price * slippage_bps / 10000
        )

        net_profit = gross_profit - transaction_costs - estimated_slippage
        capital_required = total_cost + transaction_costs + estimated_slippage
        return_pct = net_profit / capital_required if capital_required > 0 else 0.0

        # Risk assessment
        risk_factors = []
        # Lower match confidence = higher risk (might not be the same event)
        if pair.match_confidence < 0.9:
            risk_factors.append(0.3)
        if pair.match_confidence < 0.8:
            risk_factors.append(0.3)
        # Cross-platform settlement risk
        if pair.platform_a != pair.platform_b:
            risk_factors.append(0.1)  # Different resolution sources
        risk_score = min(sum(risk_factors), 1.0)

        return ArbitrageOpportunity(
            pair=pair,
            gross_profit=gross_profit,
            transaction_costs=transaction_costs,
            estimated_slippage=estimated_slippage,
            net_profit=net_profit,
            capital_required=capital_required,
            return_pct=return_pct,
            time_to_settlement=0.0,  # Placeholder; would come from market end dates
            risk_score=risk_score,
        )
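The cost model is easier to follow with numbers. The sketch below repeats the detector's arithmetic in a standalone function, using the 2% default fee and 50 basis points of slippage from the listing above. The function name `net_arbitrage_return` and the example prices are assumptions for illustration.

```python
def net_arbitrage_return(
    yes_price: float,
    no_price: float,
    fee_yes: float = 0.02,
    fee_no: float = 0.02,
    slippage_bps: float = 50.0,
) -> dict[str, float]:
    """Per-share-pair economics of buying YES on one platform and NO on another."""
    total_cost = yes_price + no_price
    gross = 1.0 - total_cost
    fees = yes_price * fee_yes + no_price * fee_no
    slippage = total_cost * slippage_bps / 10_000
    net = gross - fees - slippage
    capital = total_cost + fees + slippage
    return {
        "gross": gross,
        "fees": fees,
        "slippage": slippage,
        "net": net,
        "return_pct": net / capital if capital > 0 else 0.0,
    }


# A 10-cent gap: YES at $0.40 on one platform, NO at $0.50 on the other.
wide = net_arbitrage_return(0.40, 0.50)
print({k: round(v, 4) for k, v in wide.items()})

# A 3-cent gap: YES at $0.47, NO at $0.50.
narrow = net_arbitrage_return(0.47, 0.50)
print({k: round(v, 4) for k, v in narrow.items()})
```

Under these assumptions the 10-cent gap nets about 7.75 cents per share pair, a return of roughly 8.4% on capital. The 3-cent gap nets well under one cent, so it would fail the detector's `min_net_profit` of $0.02 even though its gross profit looks positive.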
Execution Coordinator
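The most critical component of an arbitrage system is the execution coordinator. Ideally both legs of the trade would execute atomically, but independent platforms offer no shared transaction, so the coordinator has to fill both legs as close together as it can and recover when only one fills. If only one leg fills, you have an unhedged position, not an arbitrage.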
"""Execution coordinator for multi-leg arbitrage trades.
The coordinator ensures that both legs of an arbitrage trade
execute within acceptable parameters. If one leg fails or fills
at a worse price, the coordinator manages the recovery.
"""

import logging
from dataclasses import dataclass
from enum import Enum

# ArbitrageOpportunity is the dataclass defined in the detection module above.

logger = logging.getLogger(__name__)


class LegStatus(Enum):
    """Status of a single arbitrage leg."""

    PENDING = "pending"
    SUBMITTED = "submitted"
    FILLED = "filled"
    FAILED = "failed"
    CANCELLED = "cancelled"


@dataclass
class ArbitrageLeg:
    """One leg of an arbitrage trade."""

    platform: str
    market_id: str
    side: str  # "YES" or "NO"
    quantity: float
    limit_price: float
    status: LegStatus = LegStatus.PENDING
    filled_price: float = 0.0
    filled_quantity: float = 0.0


@dataclass
class ArbitrageExecution:
    """A complete two-leg arbitrage execution."""

    opportunity: ArbitrageOpportunity
    leg_a: ArbitrageLeg
    leg_b: ArbitrageLeg
    expected_profit: float
    actual_profit: float = 0.0
    status: str = "pending"


class ExecutionCoordinator:
    """Coordinates the execution of multi-leg arbitrage trades.

    Execution strategy:
    1. Submit the less liquid leg first (higher fill risk)
    2. If it fills, immediately submit the second leg
    3. If the second leg fails, attempt to unwind the first
    4. Track all partial fills and compute actual P&L

    This sequential approach sacrifices some speed for safety.
    A more aggressive approach would submit both simultaneously
    and manage the risk of partial fills.
    """

    def __init__(self, clients: dict, max_retries: int = 3):
        self.clients = clients
        self.max_retries = max_retries
        self.executions: list[ArbitrageExecution] = []

    async def execute_arbitrage(
        self, opportunity: ArbitrageOpportunity, quantity: float
    ) -> ArbitrageExecution:
        """Execute a complete arbitrage trade.

        Returns an ArbitrageExecution with the result.
        """
        pair = opportunity.pair

        # Leg A buys YES where it is cheaper; leg B buys NO on the other platform.
        # Each limit price adds a $0.01 buffer, which eats into the profit if hit.
        if pair.price_a < pair.price_b:
            leg_a = ArbitrageLeg(
                platform=pair.platform_a,
                market_id=pair.market_id_a,
                side="YES",
                quantity=quantity,
                limit_price=pair.price_a + 0.01,  # Small buffer
            )
            leg_b = ArbitrageLeg(
                platform=pair.platform_b,
                market_id=pair.market_id_b,
                side="NO",
                quantity=quantity,
                limit_price=(1.0 - pair.price_b) + 0.01,
            )
        else:
            leg_a = ArbitrageLeg(
                platform=pair.platform_b,
                market_id=pair.market_id_b,
                side="YES",
                quantity=quantity,
                limit_price=pair.price_b + 0.01,
            )
            leg_b = ArbitrageLeg(
                platform=pair.platform_a,
                market_id=pair.market_id_a,
                side="NO",
                quantity=quantity,
                limit_price=(1.0 - pair.price_a) + 0.01,
            )

        execution = ArbitrageExecution(
            opportunity=opportunity,
            leg_a=leg_a,
            leg_b=leg_b,
            expected_profit=opportunity.net_profit * quantity,
        )

        # Step 1: Submit leg A first. The strategy calls for the less liquid
        # leg to go first; this listing has no liquidity data, so the YES
        # leg always goes first.
        logger.info(
            f"Arbitrage: submitting leg A ({leg_a.platform} "
            f"{leg_a.side} @ {leg_a.limit_price:.4f})"
        )
        leg_a.status = await self._submit_order(leg_a)
        if leg_a.status != LegStatus.FILLED:
            execution.status = "leg_a_failed"
            logger.warning("Arbitrage failed: leg A did not fill")
            self.executions.append(execution)
            return execution

        # Step 2: Submit the second leg immediately
        logger.info(
            f"Arbitrage: leg A filled, submitting leg B ({leg_b.platform} "
            f"{leg_b.side} @ {leg_b.limit_price:.4f})"
        )
        leg_b.status = await self._submit_order(leg_b)
        if leg_b.status == LegStatus.FILLED:
            execution.status = "complete"
            execution.actual_profit = (
                1.0
                - leg_a.filled_price
                - leg_b.filled_price
                - opportunity.transaction_costs
            ) * quantity
            logger.info(
                f"Arbitrage complete: profit = ${execution.actual_profit:.4f}"
            )
        else:
            # Leg B failed: we have an unhedged position
            execution.status = "partial_fill"
            logger.error(
                f"Arbitrage PARTIAL FILL: leg A filled but leg B failed. "
                f"Unhedged position on {leg_a.platform}!"
            )
            # Attempt to unwind leg A
            await self._attempt_unwind(leg_a)

        self.executions.append(execution)
        return execution

    async def _submit_order(self, leg: ArbitrageLeg) -> LegStatus:
        """Submit an order for one leg of the arbitrage.

        In production, this calls the platform's order API.
        Here we simulate the submission.
        """
        # Simulated execution for case study purposes: always fills in
        # full at the limit price (the worst price the limit allows).
        leg.status = LegStatus.FILLED
        leg.filled_price = leg.limit_price
        leg.filled_quantity = leg.quantity
        return LegStatus.FILLED

    async def _attempt_unwind(self, leg: ArbitrageLeg):
        """Attempt to unwind a filled leg when the other leg fails.

        This is an emergency procedure: we try to sell the position
        at the best available price to minimize loss.
        """
        logger.warning(
            f"Attempting to unwind {leg.side} position on {leg.platform}"
        )
        # In production: submit a market sell order
        # Accept slippage to exit quickly

    def get_summary(self) -> dict:
        """Summarize all arbitrage executions."""
        completed = [e for e in self.executions if e.status == "complete"]
        failed = [e for e in self.executions if "failed" in e.status]
        partial = [e for e in self.executions if e.status == "partial_fill"]
        total_profit = sum(e.actual_profit for e in completed)
        return {
            "total_executions": len(self.executions),
            "completed": len(completed),
            "failed": len(failed),
            "partial_fills": len(partial),
            "total_profit": total_profit,
            "avg_profit_per_trade": (
                total_profit / len(completed) if completed else 0.0
            ),
        }
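The coordinator's docstring mentions a more aggressive alternative: submitting both legs simultaneously. A minimal sketch of that idea with `asyncio.gather` follows. The `LegResult` type, the `submit_leg` stand-in, and the outcome labels are all hypothetical and simulate fills rather than calling any platform API.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class LegResult:
    platform: str
    filled: bool


async def submit_leg(platform: str, will_fill: bool, delay: float = 0.01) -> LegResult:
    """Stand-in for a platform order call; the delay simulates network latency."""
    await asyncio.sleep(delay)
    return LegResult(platform=platform, filled=will_fill)


async def execute_concurrently(fill_a: bool, fill_b: bool) -> str:
    """Submit both legs at once and classify the outcome."""
    leg_a, leg_b = await asyncio.gather(
        submit_leg("platform_a", fill_a),
        submit_leg("platform_b", fill_b),
    )
    if leg_a.filled and leg_b.filled:
        return "complete"
    if not leg_a.filled and not leg_b.filled:
        return "no_fill"  # nothing was bought, so nothing to unwind
    # Exactly one leg filled: that leg is unhedged and needs an unwind.
    filled = leg_a if leg_a.filled else leg_b
    return f"unwind_{filled.platform}"


print(asyncio.run(execute_concurrently(True, True)))   # complete
print(asyncio.run(execute_concurrently(True, False)))  # unwind_platform_a
```

The trade-off against the sequential design is visible in the last branch. With concurrent submission either leg can end up as the unhedged one, whereas the sequential coordinator only ever has to unwind leg A.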
Operational Results
Data Collection Phase (4 weeks)
Sana collected cross-platform data for 4 weeks, polling every 2 minutes. Key findings:
| Metric | Value |
|---|---|
| Matched market pairs | 47 |
| Average price gap (matched pairs) | 4.2% |
| Opportunities > 2% gross profit | 312 |
| Opportunities > 2% net profit | 83 |
| Average opportunity duration | 18 minutes |
| Median opportunity size (liquidity) | $1,200 |
Key Findings
Finding 1: Opportunities are real but small. The average net profit per arbitrage opportunity was $0.034 per share (3.4 cents). At the median available liquidity of $1,200, which is roughly 1,200 share pairs when a pair costs close to $1.00, this translates to about $41 per opportunity.
Finding 2: Speed matters enormously. Opportunities lasted an average of 18 minutes, but the most profitable ones (>5% gap) lasted under 5 minutes. By the time Sana's system detected and evaluated them, some had already closed.
Finding 3: Market matching is the bottleneck. Text-based matching only identified about 60% of equivalent markets. The rest had different phrasing, different resolution criteria, or different time horizons that made automated matching unreliable. Manual curation was required for the other 40%.
Finding 4: Settlement risk is real. Two platforms might resolve the same event differently due to different resolution sources or criteria. For example, one platform might use a specific data provider while another uses a different one, and their timestamps or methodologies could disagree. Sana encountered this twice in her data collection period.
Finding 5: Capital efficiency is poor. True arbitrage requires capital locked on both platforms simultaneously. If $5,000 is deployed on each platform, only the matched pairs can be arbitraged, and only when opportunities appear. The effective capital utilization was about 15%.
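The arithmetic behind Findings 1 and 2 can be checked in a few lines. This is a back-of-the-envelope sketch under two stated assumptions: a share pair costs about $1.00, and opportunities open at a uniformly random moment within the 2-minute polling interval. The function names are illustrative.

```python
def profit_per_opportunity(
    net_per_pair: float, liquidity_usd: float, cost_per_pair: float = 1.0
) -> float:
    """Net profit available when the full quoted liquidity is taken."""
    pairs = liquidity_usd / cost_per_pair
    return net_per_pair * pairs


def detection_delay(poll_interval_s: float) -> tuple[float, float]:
    """(mean, worst-case) seconds between an opportunity opening and the next poll."""
    return poll_interval_s / 2.0, poll_interval_s


# Finding 1: 3.4 cents per pair at $1,200 of liquidity.
print(round(profit_per_opportunity(0.034, 1200.0), 2))  # 40.8

# Finding 2: share of an opportunity's lifetime lost to polling alone.
mean_delay, worst_delay = detection_delay(120.0)
for window_s in (18 * 60, 5 * 60):
    print(
        f"{window_s // 60}-minute window: "
        f"mean {mean_delay / window_s:.0%}, worst case {worst_delay / window_s:.0%}"
    )
```

For the 5-minute opportunities, polling alone can consume up to 40% of the window before evaluation and order submission even begin, which is consistent with the observation that some had closed by the time the system reacted.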
Lessons Learned
1. Pure Arbitrage Is Harder Than It Looks
The textbook definition of arbitrage — risk-free profit — almost never applies in practice. Every real arbitrage has execution risk (one leg might not fill), settlement risk (the two platforms might disagree on the outcome), and liquidity risk (you might not be able to trade at the quoted price).
Sana found that statistical arbitrage (betting on price convergence without a guaranteed hedge) was more profitable than pure arbitrage, but it carried directional risk. This is the distinction between "real arbitrage" and "convergence trading" discussed in Chapter 19.
2. Latency Is King
In the arbitrage game, the fastest system wins. Sana's Python-based system with 2-minute polling intervals was too slow to capture the best opportunities. A production arbitrage system would need:
- WebSocket feeds from all platforms (sub-second latency)
- Co-located servers near platform infrastructure
- Pre-signed transactions ready to submit (for blockchain platforms)
- Optimized matching algorithms that run in milliseconds
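The first item, replacing polling with pushed updates, changes the shape of the detection loop. The sketch below uses an `asyncio.Queue` as a stand-in for a WebSocket feed, so it runs without any network access. The function names, the prices, and the 0.03 gap threshold are invented for illustration, and no real platform API is shown.

```python
import asyncio
from typing import Optional


async def price_feed(queue: asyncio.Queue, updates: list[tuple[str, float]]) -> None:
    """Stand-in for a WebSocket feed: pushes each update as it 'arrives'."""
    for platform, yes_price in updates:
        await queue.put((platform, yes_price))
        await asyncio.sleep(0)  # yield control, as a network read would
    await queue.put(None)  # sentinel: feed closed


async def watch_for_gaps(queue: asyncio.Queue, min_gap: float) -> list[float]:
    """React to every update immediately instead of waiting for the next poll."""
    latest: dict[str, float] = {}
    gaps: list[float] = []
    while True:
        update: Optional[tuple[str, float]] = await queue.get()
        if update is None:
            break
        platform, yes_price = update
        latest[platform] = yes_price
        if len(latest) >= 2:
            gap = max(latest.values()) - min(latest.values())
            if gap >= min_gap:
                gaps.append(round(gap, 4))
    return gaps


async def main() -> list[float]:
    queue: asyncio.Queue = asyncio.Queue()
    updates = [
        ("polymarket", 0.40),
        ("kalshi", 0.41),
        ("kalshi", 0.46),      # gap opens here
        ("polymarket", 0.45),  # gap closes again
    ]
    _, gaps = await asyncio.gather(
        price_feed(queue, updates), watch_for_gaps(queue, min_gap=0.03)
    )
    return gaps


print(asyncio.run(main()))  # [0.06]
```

The gap is detected on the very update that creates it. A poller would see it only if a poll happened to land between the third and fourth updates.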
3. Regulatory Complexity Multiplies
Operating on multiple platforms means complying with multiple regulatory regimes. Polymarket operates on blockchain with limited regulatory oversight (as of this writing). Kalshi is CFTC-regulated and has specific position limits and reporting requirements. Operating across both requires understanding and complying with both sets of rules (Chapters 38-39).
4. Capital Fragmentation Hurts Returns
The biggest practical challenge was capital fragmentation. To arbitrage between two platforms, you need capital deposited on both. This means your total capital is split, and each half earns returns only when opportunities appear on its respective platform. A $20,000 portfolio split $10,000/$10,000 performs worse than a $20,000 portfolio concentrated on a single platform with a directional edge.
5. Market Making Is a Better Business Model
Sana concluded that for prediction markets specifically, market making (Chapter 30) is a more capital-efficient strategy than cross-platform arbitrage. A market maker earns the spread on every trade, while an arbitrageur only profits when cross-platform discrepancies appear. However, market making requires a more sophisticated system and carries inventory risk.
Comparison: Directional Trading vs. Arbitrage
| Aspect | Directional (Case Study 1) | Arbitrage (Case Study 2) |
|---|---|---|
| Required capital | Lower (one platform) | Higher (multiple platforms) |
| Risk per trade | Higher (directional) | Lower (hedged) |
| Expected return per trade | Higher (if edge exists) | Lower (small spreads) |
| Speed requirements | Moderate (hourly cycles) | High (sub-minute) |
| Model complexity | Higher (probability est.) | Lower (price comparison) |
| Regulatory complexity | Lower (one platform) | Higher (multi-platform) |
| Scalability | Good (many markets) | Poor (limited matched pairs) |
| Capital efficiency | Good | Poor (fragmented) |
Discussion Questions
- If you were starting a prediction market trading operation today, would you choose directional trading or arbitrage? Under what conditions would you choose the other?
- How would you handle the settlement risk where two platforms resolve the same event differently? Design a risk mitigation strategy.
- The case study shows that text-based market matching only catches about 60% of equivalent markets. Design a better matching system. What data sources and algorithms would you use?
- Capital efficiency was identified as a major weakness. Propose a hybrid strategy that combines directional trading and arbitrage to improve capital utilization.
- As prediction markets mature and institutional participants enter, how do you expect arbitrage opportunities to change? Will they become more or less frequent? Larger or smaller?