Chapter 20 — Key Takeaways
The big idea
Denormalization is the deliberate, controlled re-introduction of redundancy to speed up reads — the conscious opposite of Chapter 19's accidental redundancy. Same redundancy, opposite intent; intent (and a consistency plan) is the whole difference.
When it pays off
Read-heavy workloads, reporting/analytics, expensive joins/aggregations on the hot path, data warehousing (star schemas, Ch. 34). The trade: faster reads ← → slower writes + integrity (drift) risk.
Techniques
- Redundant columns — copy a frequently-joined value (e.g.,
order_items.unit_price: avoids a join and captures the historical price). - Precomputed aggregates — store a count/total (
review_count,avg_rating). - Summary/rollup tables — batch-populated pre-aggregations.
- Materialized views — the preferred way: the database manages the cached copy; you set a refresh cadence. (Case Study 1.)
- Collapsing joins — merge tables (aggressive; mostly warehouse/reporting).
The cost: keep the copy consistent
No free denormalization — you pay in write overhead, consistency code, or staleness. Strategies:
- Trigger — sees every change (app, admin, bulk, cascades); always consistent; per-write cost + hidden logic.
- Application code — fragile (every path must remember; can't see DB-level cascades). (Case Study 2's failure.)
- Scheduled refresh (matview/summary) — simple; tolerable staleness.
- Accept staleness — when slightly-old is fine.
Un-maintained denormalization = the Chapter 19 anomaly, self-inflicted. (Case Study 2: the counter that lied.)
The professional sequence
Normalize → measure → try an index → try a materialized view → hand-maintain redundancy only if still needed (and then maintain + document it). Often an index (Ch. 23) solves the read problem with no redundancy — try it first.
When NOT to
- No measured performance problem (premature denormalization).
- Write-heavy, integrity-critical systems (ledgers, inventory).
- When an index or matview already solves it.
Current fact vs. point-in-time
- Duplicating a current fact that must stay in sync = risk/bug.
- Recording a point-in-time value on purpose (
unit_priceat sale) = a feature. (Ch. 19 link.)
You can now…
- ☐ Decide when denormalization is (and isn't) warranted.
- ☐ Apply the techniques (redundant columns, aggregates, summary tables, matviews).
- ☐ Choose a consistency strategy and state its trade-off.
- ☐ Follow normalize→measure→index/matview→denormalize-and-maintain.
- ☐ Distinguish deliberate denormalization from accidental-redundancy bugs.
Looking ahead
Chapter 21 — Data Modeling Patterns. Reusable solutions to recurring problems: audit trails, soft deletes, versioning, multi-tenancy, hierarchies, tagging, polymorphic associations — the design vocabulary that makes you fast.
One sentence to carry forward: Normalize for integrity, then denormalize only with a measured reason, the lowest-risk technique, and a plan to keep the copy consistent — redundancy without a maintenance plan is a bug waiting to happen.