Chapter 20 — Quiz

DataField.Dev

Chapter 20 — Quiz

12 questions. Answers at the bottom.

Multiple choice

Q1. Denormalization is: - A) Accidental redundancy from poor design - B) The deliberate re-introduction of controlled redundancy for read performance - C) The same as normalization - D) Always wrong

Q2. Denormalization trades: - A) Read speed for write cost / integrity risk - B) Storage for nothing - C) Correctness for fun - D) Writes for reads with no downside

Q3. Denormalization is most appropriate when: - A) Writes vastly outnumber reads - B) Reads vastly outnumber writes (read-heavy / reporting) - C) Integrity is the only concern - D) The table is tiny

Q4. order_items.unit_price (a copy of the product price) is justified because it: - A) Saves storage - B) Avoids a join AND captures the historical price at purchase time - C) Is required by 3NF - D) Prevents inserts

Q5. The often-preferred, database-managed way to denormalize an expensive query is: - A) A trigger on every table - B) A materialized view (refreshed on a schedule) - C) Comma-separated columns - D) Dropping constraints

Q6. The main ongoing cost of denormalization is: - A) Disk space only - B) Keeping the redundant data consistent (drift risk) + slower writes - C) Nothing - D) Slower reads

Q7. Before denormalizing for read speed, you should usually try: - A) Dropping the table - B) An index (and/or a materialized view) - C) Removing constraints - D) A bigger server only

Q8. Un-maintained denormalization leads to: - A) Faster writes - B) Data drift — the redundant copy diverges from the truth (an anomaly) - C) Better integrity - D) Smaller tables

Q9. A good consistency strategy for a "monthly sales dashboard" is: - A) A real-time trigger on every order - B) A scheduled refresh (staleness is acceptable) - C) No maintenance - D) Manual editing

Q10. Denormalization is usually a poor choice in: - A) A reporting warehouse - B) A read-heavy product page - C) A write-heavy, integrity-critical transactional system (e.g., a ledger) - D) A dashboard

True/False

Q11. "Normalize first, then denormalize selectively with a measured reason" is the professional default. (True / False)

Q12. If you duplicate a fact for performance, keeping it consistent is optional. (True / False)

Short answer

Q13. A team adds a comment_count column to posts for speed. Describe how it can go wrong and three ways to keep it correct.

---

Answer key

Q1 — B. Deliberate, controlled redundancy for reads — the conscious opposite of the Chapter 19 bug.

Q2 — A. Faster reads, at the cost of write overhead and integrity risk.

Q3 — B. Read-heavy / reporting workloads.

Q4 — B. Avoids a join and records the price as-of-purchase (historical snapshot).

Q5 — B. A materialized view — the database manages the cached copy; you set a refresh cadence.

Q6 — B. Maintaining consistency (drift) and slower writes. No free denormalization.

Q7 — B. Try an index (and/or matview) first — they speed reads with less/no redundancy risk.

Q8 — B. The copy drifts from the truth — exactly the anomaly normalization prevents.

Q9 — B. Scheduled refresh; monthly numbers tolerate some staleness.

Q10 — C. Write-heavy, integrity-critical systems — the drift risk usually isn't worth it.

Q11 — True. Normalize for integrity, then denormalize consciously for measured read needs.

Q12 — False. If you duplicate a fact, you own keeping it consistent — or it becomes a bug.

Q13. comment_count can drift if comments are inserted/deleted without updating it (e.g., a bulk delete, a code path that forgets), so it stops matching COUNT(*) of comments. Keep it correct via: (1) a trigger on comments that increments/decrements it (always consistent, per-write cost); (2) application code that updates it on every comment write (fragile — every path must remember); (3) a scheduled recompute from the true count (simple, slightly stale). Choose by required freshness vs. write overhead.

Scoring: 10–12 you balance both directions well; 7–9 review the consistency strategies; below 7, redo Exercises C–D.