Chapter 20 — Quiz
12 questions. Answers at the bottom.
Multiple choice
Q1. Denormalization is: - A) Accidental redundancy from poor design - B) The deliberate re-introduction of controlled redundancy for read performance - C) The same as normalization - D) Always wrong
Q2. Denormalization trades: - A) Read speed for write cost / integrity risk - B) Storage for nothing - C) Correctness for fun - D) Writes for reads with no downside
Q3. Denormalization is most appropriate when: - A) Writes vastly outnumber reads - B) Reads vastly outnumber writes (read-heavy / reporting) - C) Integrity is the only concern - D) The table is tiny
Q4. order_items.unit_price (a copy of the product price) is justified because it:
- A) Saves storage
- B) Avoids a join AND captures the historical price at purchase time
- C) Is required by 3NF
- D) Prevents inserts
Q5. The often-preferred, database-managed way to denormalize an expensive query is: - A) A trigger on every table - B) A materialized view (refreshed on a schedule) - C) Comma-separated columns - D) Dropping constraints
Q6. The main ongoing cost of denormalization is: - A) Disk space only - B) Keeping the redundant data consistent (drift risk) + slower writes - C) Nothing - D) Slower reads
Q7. Before denormalizing for read speed, you should usually try: - A) Dropping the table - B) An index (and/or a materialized view) - C) Removing constraints - D) A bigger server only
Q8. Un-maintained denormalization leads to: - A) Faster writes - B) Data drift — the redundant copy diverges from the truth (an anomaly) - C) Better integrity - D) Smaller tables
Q9. A good consistency strategy for a "monthly sales dashboard" is: - A) A real-time trigger on every order - B) A scheduled refresh (staleness is acceptable) - C) No maintenance - D) Manual editing
Q10. Denormalization is usually a poor choice in: - A) A reporting warehouse - B) A read-heavy product page - C) A write-heavy, integrity-critical transactional system (e.g., a ledger) - D) A dashboard
True/False
Q11. "Normalize first, then denormalize selectively with a measured reason" is the professional default. (True / False)
Q12. If you duplicate a fact for performance, keeping it consistent is optional. (True / False)
Short answer
Q13. A team adds a comment_count column to posts for speed. Describe how it can go wrong and three ways to keep it correct.
---
Answer key
Q1 — B. Deliberate, controlled redundancy for reads — the conscious opposite of the Chapter 19 bug.
Q2 — A. Faster reads, at the cost of write overhead and integrity risk.
Q3 — B. Read-heavy / reporting workloads.
Q4 — B. Avoids a join and records the price as-of-purchase (historical snapshot).
Q5 — B. A materialized view — the database manages the cached copy; you set a refresh cadence.
Q6 — B. Maintaining consistency (drift) and slower writes. No free denormalization.
Q7 — B. Try an index (and/or matview) first — they speed reads with less/no redundancy risk.
Q8 — B. The copy drifts from the truth — exactly the anomaly normalization prevents.
Q9 — B. Scheduled refresh; monthly numbers tolerate some staleness.
Q10 — C. Write-heavy, integrity-critical systems — the drift risk usually isn't worth it.
Q11 — True. Normalize for integrity, then denormalize consciously for measured read needs.
Q12 — False. If you duplicate a fact, you own keeping it consistent — or it becomes a bug.
Q13. comment_count can drift if comments are inserted/deleted without updating it (e.g., a bulk delete, a code path that forgets), so it stops matching COUNT(*) of comments. Keep it correct via: (1) a trigger on comments that increments/decrements it (always consistent, per-write cost); (2) application code that updates it on every comment write (fragile — every path must remember); (3) a scheduled recompute from the true count (simple, slightly stale). Choose by required freshness vs. write overhead.
Scoring: 10–12 you balance both directions well; 7–9 review the consistency strategies; below 7, redo Exercises C–D.