Chapter 20 — Key Takeaways

The big idea

Denormalization is the deliberate, controlled re-introduction of redundancy to speed up reads — the conscious opposite of Chapter 19's accidental redundancy. Same redundancy, opposite intent; intent (and a consistency plan) is the whole difference.

When it pays off

Read-heavy workloads, reporting/analytics, expensive joins/aggregations on the hot path, data warehousing (star schemas, Ch. 34). The trade: faster reads ← → slower writes + integrity (drift) risk.

Techniques

  • Redundant columns — copy a frequently-joined value (e.g., order_items.unit_price: avoids a join and captures the historical price).
  • Precomputed aggregates — store a count/total (review_count, avg_rating).
  • Summary/rollup tables — batch-populated pre-aggregations.
  • Materialized views — the preferred way: the database manages the cached copy; you set a refresh cadence. (Case Study 1.)
  • Collapsing joins — merge tables (aggressive; mostly warehouse/reporting).

The cost: keep the copy consistent

No free denormalization — you pay in write overhead, consistency code, or staleness. Strategies:

  • Trigger — sees every change (app, admin, bulk, cascades); always consistent; per-write cost + hidden logic.
  • Application code — fragile (every path must remember; can't see DB-level cascades). (Case Study 2's failure.)
  • Scheduled refresh (matview/summary) — simple; tolerable staleness.
  • Accept staleness — when slightly-old is fine.

Un-maintained denormalization = the Chapter 19 anomaly, self-inflicted. (Case Study 2: the counter that lied.)

The professional sequence

Normalize → measure → try an index → try a materialized view → hand-maintain redundancy only if still needed (and then maintain + document it). Often an index (Ch. 23) solves the read problem with no redundancy — try it first.

When NOT to

  • No measured performance problem (premature denormalization).
  • Write-heavy, integrity-critical systems (ledgers, inventory).
  • When an index or matview already solves it.

Current fact vs. point-in-time

  • Duplicating a current fact that must stay in sync = risk/bug.
  • Recording a point-in-time value on purpose (unit_price at sale) = a feature. (Ch. 19 link.)

You can now…

  • ☐ Decide when denormalization is (and isn't) warranted.
  • ☐ Apply the techniques (redundant columns, aggregates, summary tables, matviews).
  • ☐ Choose a consistency strategy and state its trade-off.
  • ☐ Follow normalize→measure→index/matview→denormalize-and-maintain.
  • ☐ Distinguish deliberate denormalization from accidental-redundancy bugs.

Looking ahead

Chapter 21 — Data Modeling Patterns. Reusable solutions to recurring problems: audit trails, soft deletes, versioning, multi-tenancy, hierarchies, tagging, polymorphic associations — the design vocabulary that makes you fast.

One sentence to carry forward: Normalize for integrity, then denormalize only with a measured reason, the lowest-risk technique, and a plan to keep the copy consistent — redundancy without a maintenance plan is a bug waiting to happen.