Chapter 19 — Key Takeaways
The big idea
Normalization structures tables to eliminate redundancy — the root cause of data anomalies. Each fact should live in exactly one place; then it can't contradict itself. This is theme #1's deepest tool.
The disease and its symptoms
- Redundancy (same fact stored in many places) → three anomalies:
- Update: change one copy, others go stale (two emails for one customer).
- Insertion: can't record a fact without an unrelated one (a product with no order).
- Deletion: deleting a row loses unrelated facts (delete an order, lose product info).
The tool: functional dependencies
A → B = "A determines B" (each A has exactly one B). Slogan: every non-key attribute depends on the key, the whole key, and nothing but the key.
The normal forms
| Form | Removes | Informal rule |
|---|---|---|
| 1NF | lists-in-cells | atomic values, no repeating groups |
| 2NF | partial dependencies | non-key depends on the whole (composite) key |
| 3NF | transitive dependencies | no non-key → non-key |
| BCNF | rare 3NF edge cases | every determinant is a candidate key |
- 3NF is the practical baseline (usually = BCNF). Stopping at 2NF leaves transitive-dependency redundancy.
- Normalizing the flat orders table to 3NF yields (essentially) Mercado's actual schema — the anchor's payoff.
Subtlety: it's a tool, not dogma
- Normalization optimizes write integrity at the cost of read-time joins.
- Normalize to 3NF by default, then denormalize selectively and consciously for measured read performance (Chapter 20).
- Don't over-normalize prematurely (10-way-join schemas before any perf problem).
Current fact vs. historical snapshot
- Copying a current fact (warehouse_region on shipments) = a 3NF bug (it drifts — Case Study 2).
- Recording a point-in-time value deliberately (
order_items.unit_priceat sale) = intentional denormalization (a feature, documented). - Test: "does this need to stay in sync (bug) or capture a moment (feature)?"
You can now…
- ☐ Spot update/insertion/deletion anomalies and trace them to redundancy.
- ☐ Write functional dependencies and use them to drive splits.
- ☐ Normalize to 1NF/2NF/3NF (and recognize BCNF) step by step.
- ☐ Explain which class of redundancy each normal form removes.
- ☐ Distinguish accidental redundancy (a bug) from deliberate denormalization (a choice).
Looking ahead
Chapter 20 — Denormalization. The other half of the anchor: when and how to deliberately re-introduce redundancy for read performance, the trade-offs you accept, and why "normalize, then selectively denormalize" is the professional default.
One sentence to carry forward: Normalize so every fact lives once — then, and only with a measured reason, denormalize on purpose; redundancy by accident is a bug, redundancy by decision is a tool.