Chapter 35 — Key Takeaways

The big idea

Distributed databases spread data across machines for availability, read scaling, geography, and scale beyond one server — but distribution forces a fundamental trade-off (CAP) a single server never faces. Distribute for a concrete need, not prematurely.

Replication (copies of the data)

  • Primary-replica: primary accepts writes; replicas serve reads + enable failover (HA). PostgreSQL streams the WAL (Ch. 28).
  • Sync (no loss, slower) vs async (fast, slight lag, possible small loss on failover).
  • Scales reads and provides HA — not writes. The highest-leverage first step. (Case Study 1.)

Sharding (splits the data)

  • Each shard holds a subset of rows on a different node — partitioning (Ch. 25) across machines. Scales writes/storage beyond one machine.
  • Hard: cross-shard queries/transactions are slow/complex; shard-key choice is critical (hot spots); rebalancing is painful.
  • Last resort — don't shard prematurely. (Case Study 2: sharded before they needed to → two years of needless complexity.)

The CAP theorem

  • During an inevitable network partition, you get at most 2 of Consistency, Availability, Partition-tolerance. P is mandatory → real choice is C vs A.
  • CP (consistent, may be unavailable) for correctness-critical data (balances, inventory); AP (available, possibly stale) for uptime-critical data (feeds, carts).
  • Eventual consistency = converge eventually; match it to data that tolerates staleness.

Distributed transactions & NewSQL

  • 2PC (two-phase commit) is slow/fragile → keep transactions within a shard.
  • NewSQL (Spanner, CockroachDB, YugabyteDB) = distributed scale + relational/ACID via consensus (Raft/Paxos). The "have it all," with latency cost.

Managed cloud + when to distribute

  • Managed PostgreSQL (RDS/Cloud SQL/Aurora) + read replicas is the right default — HA and read scaling without operating the machinery.
  • Escalation order: one server → tune/index → read replicas → (only on measured need) sharding/NewSQL.

Common mistakes

Premature sharding (Case Study 2); ignoring CAP; eventual consistency for data needing strong consistency; routine cross-shard queries/transactions; a bad shard key.

You can now…

  • ☐ Explain why/when to distribute, and replication vs sharding trade-offs.
  • ☐ State CAP and choose CP/AP per data.
  • ☐ Explain eventual consistency and 2PC difficulty.
  • ☐ Place NewSQL and managed cloud DBs; resist premature distribution.

Looking ahead

Chapter 36 — Specialized Databases. The right tool for unusual data: time-series, vector (AI embeddings), spatial, and search — plus PostgreSQL's extensions for each.

One sentence to carry forward: Reach for read replicas first (they solve most scaling and availability needs simply), shard or go NewSQL only on measured need, and remember that distribution forces a real consistency-vs-availability choice (CAP) you must make deliberately.