# Chapter 33: Key Takeaways

## Why Scaling Matters
- Prediction market traffic is extremely bursty: peak-to-normal ratios of 50:1 to 500:1 are common. Election nights, breaking news, and surprise events produce the most valuable information and the most traffic simultaneously.
- A platform that buckles under peak load fails at the exact moment it matters most. Scaling is not a luxury; it is a core requirement.
- Service Level Objectives (SLOs) drive architecture: order latency p50 < 50 ms, p99 < 200 ms, availability 99.95%, and exactly-once trade semantics.

## Event Sourcing
- Event sourcing stores all changes as an immutable, append-only sequence of events. Current state is derived by replaying events: $S(t) = \text{fold}(f, S_0, [e_1, \ldots, e_n])$.
- Benefits specific to prediction markets: regulatory audit trails, dispute resolution, temporal queries ("What was the price at 9:47 PM?"), replay-based debugging, and event-driven architecture.
- Key event types: `MarketCreated`, `OrderPlaced`, `OrderMatched`, `OrderCancelled`, `MarketResolved`, `FundsDeposited`, `PayoutIssued`.
- Snapshots prevent expensive full replays: capture state periodically, then replay only the events recorded after the snapshot (a minimal sketch follows this list).
- Optimistic concurrency control prevents conflicting updates: each event carries a version number checked against the aggregate's current version.
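
To make the fold, snapshot, and versioning bullets concrete, here is a minimal Python sketch. The `OrderPlaced`/`MarketState` shapes and the in-memory list standing in for the event log are assumptions for illustration, not the book's actual code; a real store persists events durably.

```python
from dataclasses import dataclass, field

# Hypothetical event and state shapes for illustration; real aggregates carry more fields.
@dataclass
class OrderPlaced:
    order_id: str
    price: float
    quantity: int
    version: int  # the aggregate version the writer expects to see

@dataclass
class MarketState:
    version: int = 0
    open_orders: dict = field(default_factory=dict)

def apply(state: MarketState, event) -> MarketState:
    """One fold step: f(S, e) -> S'."""
    if isinstance(event, OrderPlaced):
        state.open_orders[event.order_id] = (event.price, event.quantity)
    state.version += 1
    return state

def append_event(log: list, state: MarketState, event) -> MarketState:
    """Write path: optimistic concurrency check, then append to the immutable log."""
    if event.version != state.version:
        raise RuntimeError("version conflict: aggregate changed concurrently")
    log.append(event)
    return apply(state, event)

def replay(events, snapshot: MarketState | None = None) -> MarketState:
    """S(t) = fold(f, S0, [e1, ..., en]); start from the latest snapshot when one exists."""
    state = snapshot or MarketState()
    for event in events:
        state = apply(state, event)
    return state
```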

## CQRS (Command Query Responsibility Segregation)
- Separate the write model (commands: place order, create market) from the read model (queries: display prices, show portfolio).
- The write path optimizes for consistency and correctness. The read path optimizes for throughput and flexible querying.
- Read models are projections of the event stream, denormalized for specific access patterns: order book view, market summary view, portfolio view, leaderboard view.
- Eventual consistency between write and read models is acceptable. Staleness bound: $\text{staleness}_{\max} = \delta + \pi < 10\text{ ms}$ in practice.
- CQRS enables independent scaling: a single, powerful write server paired with many read replicas.
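
A minimal sketch of a read-model projection, assuming events arrive as plain dicts with a `type` field; the class and field names are illustrative, and a real projection would consume from a queue or event store and write to its own datastore.

```python
class MarketSummaryProjection:
    """Denormalized read model backing the 'market summary' query path."""

    def __init__(self):
        self.summaries: dict[str, dict] = {}  # market_id -> summary row

    def handle(self, event: dict) -> None:
        """Apply one event from the stream to the read model."""
        if event["type"] == "MarketCreated":
            self.summaries[event["market_id"]] = {
                "question": event["question"],
                "last_price": None,
                "volume": 0,
            }
        elif event["type"] == "OrderMatched":
            row = self.summaries[event["market_id"]]
            row["last_price"] = event["price"]
            row["volume"] += event["quantity"]

    def get(self, market_id: str) -> dict | None:
        """Query side: a cheap lookup with no joins, eventually consistent."""
        return self.summaries.get(market_id)
```

Because the projection only ever reads the event stream, it can be rebuilt from scratch or scaled out independently of the write path.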

## Database Optimization
- Schema design must reflect access patterns: current prices (very frequent), open orders (matching engine), trader positions (portfolio), recent trades (charts).
- Partial indexes are critical: `CREATE INDEX ... WHERE status = 'open'` keeps the matching engine's index small as the orders table grows.
- Connection pooling (PgBouncer): pool size = $C \times (Q_{\text{avg}} + L_{\text{avg}}) / 1000$ (worked example after this list). Transaction pooling mode allows thousands of app connections to share dozens of database connections.
- Read replicas distribute read-heavy workloads. Route read-only queries (market listings, portfolios) to replicas; reserve the primary for writes.
- Table partitioning by time for trades: queries with time range predicates automatically target only relevant partitions.
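
A worked example of the pool-sizing formula. The reading assumed here, that $C$ is database requests per second and $Q_{\text{avg}}$, $L_{\text{avg}}$ are milliseconds, follows Little's law; the numbers are illustrative.

```python
import math

def pool_size(requests_per_second: float, avg_query_ms: float, avg_latency_ms: float) -> int:
    """pool = C x (Q_avg + L_avg) / 1000: arrival rate times time spent per query."""
    return math.ceil(requests_per_second * (avg_query_ms + avg_latency_ms) / 1000)

# Assumed workload: 2,000 queries/s, 4 ms execution, 1 ms round trip
# -> roughly 10 connections busy at any instant; size the PgBouncer pool slightly above that.
print(pool_size(2000, 4, 1))  # 10
```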

## Caching Strategies
- Cache value formula: $V_{\text{cache}} = f_{\text{access}} \times c_{\text{compute}} / r_{\text{change}}$. High-value targets: market list, leaderboard, user portfolios.
- Multi-level caching: L1 (in-process, nanoseconds), L2 (Redis, microseconds), L3 (database, milliseconds).
- Cache invalidation: hybrid strategy combining TTL-based expiration (safety net), event-driven invalidation (freshness), and write-through for critical data.
- Market prices use short TTLs (2 seconds) with event-driven invalidation. Leaderboards use long TTLs (5 minutes).
- Cache stampede prevention: use locks to ensure only one request recomputes an expired value while others wait.
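
A sketch of lock-based stampede prevention, assuming the redis-py client and a running Redis instance; the key naming, 5-second lock TTL, and 50 ms wait are illustrative choices.

```python
import json
import time

import redis  # assumes the redis-py client

r = redis.Redis()

def get_with_stampede_guard(key: str, ttl_s: int, recompute, lock_ttl_s: int = 5):
    """Return a cached value; on a miss, only one caller recomputes while others wait."""
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)

    # SET NX doubles as a lock: the first caller past the expired key wins it.
    if r.set(f"lock:{key}", "1", nx=True, ex=lock_ttl_s):
        try:
            value = recompute()
            r.set(key, json.dumps(value), ex=ttl_s)
            return value
        finally:
            r.delete(f"lock:{key}")

    # Losers back off briefly and re-read instead of hammering the database.
    time.sleep(0.05)
    cached = r.get(key)
    return json.loads(cached) if cached is not None else recompute()
```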

## Message Queues and Async Processing
- Inter-service communication via message queues decouples producers from consumers and enables independent scaling.
- Priority queues: order matching (critical), settlement (high), notifications (normal), analytics (low).
- Dead letter queues: messages that fail after max retries are captured for manual inspection, not silently dropped.
- Retry strategy: exponential backoff with jitter: $t_{\text{retry}}(n) = t_{\text{base}} \cdot 2^{n-1} + \text{jitter}$.
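
A small sketch of the retry policy; the base delay, cap, and attempt count are assumed parameters, and the jitter term spreads retries so failing clients do not resynchronize.

```python
import random
import time

def retry_delay(attempt: int, base_s: float = 0.1, cap_s: float = 30.0) -> float:
    """t_retry(n) = t_base * 2^(n-1) + jitter, capped so delays stay bounded."""
    backoff = min(cap_s, base_s * 2 ** (attempt - 1))
    return backoff + random.uniform(0, backoff)  # jitter de-synchronizes retrying clients

def with_retries(fn, max_attempts: int = 5):
    """Run fn, retrying on failure; messages that exhaust retries go to the dead letter queue."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # caller routes the message to the DLQ instead of dropping it
            time.sleep(retry_delay(attempt))
```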

## Horizontal Scaling
- Stateless API servers: sessions in Redis, no local state. Scale by adding instances behind a load balancer.
- Matching engine sharding: distribute markets across shards using consistent hashing. Each shard processes its markets independently.
- WebSocket separation: dedicated WebSocket servers subscribe to Redis Pub/Sub for price updates. IP-hash load balancing for session affinity.
- Auto-scaling policies: scale up on leading indicators (CPU > 70%, p99 latency > 150 ms, queue depth > 1000). Scale down conservatively with longer evaluation periods.
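
A compact consistent-hash ring sketch for routing markets to matching-engine shards. The shard names, MD5 hash, and virtual-node count are illustrative assumptions, not the book's implementation.

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Map market IDs to matching-engine shards with minimal movement when shards change."""

    def __init__(self, shards: list[str], vnodes: int = 100):
        self.ring: list[tuple[int, str]] = []
        for shard in shards:
            for i in range(vnodes):  # virtual nodes smooth out the load distribution
                self.ring.append((self._hash(f"{shard}:{i}"), shard))
        self.ring.sort()
        self._keys = [h for h, _ in self.ring]

    @staticmethod
    def _hash(value: str) -> int:
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def shard_for(self, market_id: str) -> str:
        idx = bisect.bisect(self._keys, self._hash(market_id)) % len(self.ring)
        return self.ring[idx][1]

ring = ConsistentHashRing(["engine-1", "engine-2", "engine-3"])
print(ring.shard_for("us-election-2028"))  # a given market always routes to the same shard
```

Adding or removing a shard remaps only the keys adjacent to it on the ring, so most markets stay on their current matching engine.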

## Monitoring and Alerting
- The four golden signals: latency, traffic, errors, saturation. Every service must emit all four.
- Prediction-market-specific metrics: matching engine queue depth, bid-ask spread, order-to-trade latency, WebSocket connection count.
- Prometheus + Grafana is the standard stack: structured metrics, flexible dashboards, alerting rules.
- Anomaly detection: rolling z-score on order volume flags unusual activity (potential manipulation or system issues).
- Alert on symptoms (user-facing impact), not causes. Page on error rate, not CPU.
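
A minimal rolling z-score detector for order volume; the window size and threshold are assumed parameters to be tuned per market.

```python
from collections import deque
from statistics import mean, stdev

class RollingZScore:
    """Flag order-volume samples whose z-score against a rolling window exceeds a threshold."""

    def __init__(self, window: int = 60, threshold: float = 3.0):
        self.samples: deque[float] = deque(maxlen=window)
        self.threshold = threshold

    def observe(self, volume: float) -> bool:
        """Return True when the new sample looks anomalous versus the rolling window."""
        anomalous = False
        if len(self.samples) >= 2:
            mu, sigma = mean(self.samples), stdev(self.samples)
            if sigma > 0 and abs(volume - mu) / sigma > self.threshold:
                anomalous = True
        self.samples.append(volume)
        return anomalous
```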

## Security
- Rate limiting with token bucket algorithm: different tiers for different endpoints (order submission vs. price reads).
- DDoS mitigation: CDN-level rate limiting, challenge pages for suspicious traffic, circuit breakers.
- Input validation: all prices between 0 and 1, quantities positive, idempotency keys on all mutations.
- TLS everywhere: all inter-service communication encrypted in transit.
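
A sketch of the token bucket algorithm with per-endpoint tiers; the specific rates and burst sizes shown are assumptions for illustration.

```python
import time

class TokenBucket:
    """Per-client token bucket; order submission gets a smaller bucket than price reads."""

    def __init__(self, rate_per_s: float, capacity: float):
        self.rate = rate_per_s    # refill rate (tokens per second)
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        """Refill based on elapsed time, then spend tokens if enough are available."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

# Hypothetical tiers: 10 orders/s with bursts of 20, vs. 100 price reads/s with bursts of 200.
order_limiter = TokenBucket(rate_per_s=10, capacity=20)
read_limiter = TokenBucket(rate_per_s=100, capacity=200)
```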

## Disaster Recovery
- Recovery Time Objective (RTO): maximum acceptable downtime. Target: < 5 minutes.
- Recovery Point Objective (RPO): maximum acceptable data loss. Target: 0 (no data loss via continuous WAL archiving).
- Regular failover drills: test that automated failover works before you need it in production.
- Event sourcing naturally supports disaster recovery: the event log is the recovery mechanism.

## Key Formulas
| Formula | Description |
|---|---|
| $R = T_{\text{peak}} / T_{\text{normal}}$ | Peak-to-normal traffic ratio |
| $S(t) = \text{fold}(f, S_0, [e_1, \ldots, e_n])$ | Event sourcing state reconstruction |
| $\text{staleness}_{\max} = \delta + \pi$ | CQRS consistency bound |
| $V_{\text{cache}} = f \times c / r$ | Cache value formula |
| $t_{\text{retry}} = t_{\text{base}} \cdot 2^{n-1}$ | Exponential backoff retry delay |
| $\text{pool} = C \times (Q + L) / 1000$ | Connection pool sizing |