Chapter 35 — Key Takeaways
The big idea
Distributed databases spread data across machines for availability, read scaling, geographic locality, and scale beyond one server, but distribution forces a fundamental trade-off (CAP) that a single server never faces. Distribute for a concrete need, not prematurely.
Replication (copies of the data)
- Primary-replica: primary accepts writes; replicas serve reads + enable failover (HA). PostgreSQL streams the WAL (Ch. 28).
- Sync (no loss of acknowledged writes, slower because each commit waits for a replica) vs async (fast, slight replica lag, the most recent writes may be lost on failover).
- Scales reads and provides HA — not writes. The highest-leverage first step. (Case Study 1.)
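The sync/async difference can be sketched with a toy model. This is an illustration only, not how PostgreSQL is implemented: the Node and Cluster classes, the ship_log and failover methods, and the "balance" key are all hypothetical names, and a plain Python list stands in for the WAL.

```python
class Node:
    """One database server; `log` stands in for the WAL."""

    def __init__(self, name):
        self.name = name
        self.log = []  # ordered (key, value) writes

    def apply(self, entry):
        self.log.append(entry)

    def read(self, key):
        # The most recent write to a key wins.
        for k, v in reversed(self.log):
            if k == key:
                return v
        return None


class Cluster:
    def __init__(self, synchronous):
        self.primary = Node("primary")
        self.replica = Node("replica")
        self.synchronous = synchronous

    def write(self, key, value):
        entry = (key, value)
        self.primary.apply(entry)
        if self.synchronous:
            # Sync: the write is not acknowledged until the replica has it.
            self.replica.apply(entry)
        return "ack"

    def ship_log(self):
        # Async: the replica catches up some time after the acknowledgement.
        for entry in self.primary.log[len(self.replica.log):]:
            self.replica.apply(entry)

    def failover(self):
        # Promote the replica; entries it never received are lost.
        lost = self.primary.log[len(self.replica.log):]
        self.primary = self.replica
        return lost


def run(synchronous):
    cluster = Cluster(synchronous)
    cluster.write("balance", 100)
    cluster.ship_log()
    cluster.write("balance", 90)  # not yet shipped in async mode
    stale_read = cluster.replica.read("balance")
    lost = cluster.failover()
    return stale_read, lost


async_result = run(synchronous=False)  # (100, [("balance", 90)])
sync_result = run(synchronous=True)    # (90, [])
```

In async mode the replica answers with the older value and the failover drops the last write. In sync mode neither happens, but every write pays for the extra replica step.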
Sharding (splits the data)
- Each shard holds a subset of rows on a different node — partitioning (Ch. 25) across machines. Scales writes/storage beyond one machine.
- Hard: cross-shard queries/transactions are slow/complex; shard-key choice is critical (hot spots); rebalancing is painful.
- Last resort — don't shard prematurely. (Case Study 2: sharded before they needed to → two years of needless complexity.)
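Why the shard key matters, and why rebalancing hurts, can be seen in a small sketch. It assumes naive hash-modulo placement, which is only one possible scheme. The shard_for and distribution helpers, the four-shard count, and the synthetic user/country rows are all made up for illustration.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4  # hypothetical cluster size


def shard_for(key, num_shards=NUM_SHARDS):
    # Stable hash, so a key maps to the same shard on every run.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def distribution(keys, num_shards=NUM_SHARDS):
    # Number of rows that land on each shard.
    counts = Counter(shard_for(k, num_shards) for k in keys)
    return [counts.get(i, 0) for i in range(num_shards)]


# Synthetic rows: 1000 users, 90% of them in one country.
rows = [(f"user-{i}", "US" if i % 10 else "NZ") for i in range(1000)]

# High-cardinality key: rows spread across all shards.
by_user = distribution(user_id for user_id, _ in rows)

# Low-cardinality key: at most two shards get data, one gets 900+ rows.
by_country = distribution(country for _, country in rows)

# Adding a fifth shard under modulo placement relocates most rows.
moved = sum(shard_for(u, 4) != shard_for(u, 5) for u, _ in rows)
```

Sharding by country leaves most of the cluster idle while one node takes nearly all the load, which is the hot-spot problem. The moved count shows why growing from four to five shards is not a cheap operation under this placement scheme.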
The CAP theorem
- A distributed system can guarantee at most 2 of Consistency, Availability, and Partition-tolerance. Network partitions are inevitable, so P is mandatory → during a partition the real choice is C vs A.
- CP (consistent, may be unavailable) for correctness-critical data (balances, inventory); AP (available, possibly stale) for uptime-critical data (feeds, carts).
- Eventual consistency = replicas converge to the same value once updates stop arriving, but reads in the meantime may be stale; use it only for data that tolerates staleness.
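The CP/AP choice comes down to what a replica does when it is cut off from the primary. The sketch below is a deliberately tiny model: the Replica class, its mode flag, and the during_partition helper are hypothetical and do not correspond to any real database's API.

```python
class Unavailable(Exception):
    """Raised when a CP replica refuses to answer during a partition."""


class Replica:
    def __init__(self, mode, value):
        self.mode = mode          # "CP" or "AP"
        self.value = value
        self.partitioned = False  # True while cut off from the primary

    def receive(self, value):
        # Updates from the primary only arrive while the network is healthy.
        if not self.partitioned:
            self.value = value

    def read(self):
        if self.partitioned and self.mode == "CP":
            # CP: refuse rather than risk returning stale data.
            raise Unavailable("cannot confirm latest value")
        # AP (or healthy network): answer with whatever we have.
        return self.value


def during_partition(mode):
    replica = Replica(mode, value=100)
    replica.partitioned = True
    replica.receive(90)  # the primary moved on; this update never arrives
    try:
        answer = replica.read()
    except Unavailable:
        answer = "unavailable"
    # The partition heals and the missed update is delivered.
    replica.partitioned = False
    replica.receive(90)
    return answer, replica.read()


ap_result = during_partition("AP")  # (100, 90)
cp_result = during_partition("CP")  # ("unavailable", 90)
```

The AP replica stays up but returns the stale 100; the CP replica returns nothing at all. Once the partition heals, both converge on 90, which is the eventual-consistency guarantee in miniature.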
Distributed transactions & NewSQL
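- 2PC (two-phase commit) is slow (extra round trips to every participant) and fragile (prepared participants block if the coordinator fails) → keep transactions within a shard.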
- NewSQL (Spanner, CockroachDB, YugabyteDB) = distributed scale + relational/ACID via consensus (Raft/Paxos). The closest thing to "having it all," paid for with higher latency, since commits wait for a quorum of replicas to agree.
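The shape of two-phase commit, and where it gets stuck, can be sketched in a few lines. This is a simplified in-memory model with no networking, timeouts, or recovery log. The Participant class, the two_phase_commit function, and the coordinator_crashes flag are illustrative names, not part of any real system.

```python
class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote. A "yes" is a promise, so locks stay held from here.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants, coordinator_crashes=False):
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    if coordinator_crashes:
        # No decision is ever sent: "yes" voters stay prepared and blocked.
        return "in doubt"
    # Phase 2: commit only on a unanimous yes, otherwise abort everywhere.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.abort()
    return "aborted"


def run(flags, coordinator_crashes=False):
    shards = [Participant(f"shard-{i}", ok) for i, ok in enumerate(flags)]
    outcome = two_phase_commit(shards, coordinator_crashes)
    return outcome, [s.state for s in shards]


happy = run([True, True])    # ("committed", ["committed", "committed"])
vetoed = run([True, False])  # ("aborted", ["aborted", "aborted"])
stuck = run([True, True], coordinator_crashes=True)
# stuck == ("in doubt", ["prepared", "prepared"])
```

The stuck case is the fragile part: both shards have promised to commit and cannot safely release their locks until a coordinator tells them the outcome.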
Managed cloud + when to distribute
- Managed PostgreSQL (RDS/Cloud SQL/Aurora) + read replicas is the right default — HA and read scaling without operating the machinery.
- Escalation order: one server → tune/index → read replicas → (only on measured need) sharding/NewSQL.
Common mistakes
Premature sharding (Case Study 2); ignoring CAP; eventual consistency for data needing strong consistency; routine cross-shard queries/transactions; a bad shard key.
You can now…
- ☐ Explain why/when to distribute, and replication vs sharding trade-offs.
- ☐ State CAP and choose CP/AP per data.
- ☐ Explain eventual consistency and 2PC difficulty.
- ☐ Place NewSQL and managed cloud DBs; resist premature distribution.
Looking ahead
Chapter 36 — Specialized Databases. The right tool for unusual data: time-series, vector (AI embeddings), spatial, and search — plus PostgreSQL's extensions for each.
One sentence to carry forward: Reach for read replicas first (they solve most scaling and availability needs simply), shard or go NewSQL only on measured need, and remember that distribution forces a real consistency-vs-availability choice (CAP) you must make deliberately.