Chapter 34 — Exercises
Design/analysis exercises. (answer in Appendix) = worked solution in Answers. ⭐ = stretch.
Group A — OLTP vs OLAP
A1 (34.1) List four ways an OLTP schema and an OLAP (warehouse) schema differ, and why. (answer in Appendix)
34.2 Why is running heavy analytical queries on the live transactional database a poor idea?
34.3 ⭐ Why does the warehouse deliberately denormalize when Part III said to normalize? (Tie to Chapter 20.)
Group B — Star schema
34.4 Design a star schema for Mercado sales: name the fact table (with its grain and measures) and the dimension tables. (answer in Appendix)
34.5 What is the "grain" of a fact table, and why is choosing it the most important design decision? (answer in Appendix)
34.6 Why is a date dimension almost always included? What attributes would it have?
34.7 ⭐ Write a star-schema query: revenue by category by quarter, joining the fact to the date and product dimensions.
Group C — Snowflake & SCD
34.8 Star vs snowflake schema — what's the difference and which is usually preferred? (answer in Appendix)
34.9 Describe SCD Type 1, 2, and 3. When would you use Type 2?
34.10 ⭐ A customer moves from region 'East' to 'West'. Show how Type 2 preserves which region was current for each past order.
Group D — Storage & tools
34.11 Why are analytical databases column-oriented, and why does that make aggregations fast? (Tie to Chapter 28.) (answer in Appendix)
34.12 Name three modern analytical databases and one thing each is known for.
34.13 ⭐ When might PostgreSQL (or DuckDB) be enough, with no separate warehouse? (Theme #4.)
Group E — Progressive project
34.14 Design a star schema for your domain: facts (with grain + measures) and dimensions.
34.15 Pick an SCD strategy for one changing dimension and justify it.
34.16 ⭐ Contrast your warehouse star schema with your OLTP schema and explain why they differ.
Self-check. If you can model a star schema, choose a fact grain, pick an SCD strategy, and explain why warehouses are denormalized and columnar — you can think analytically. Next: scaling across machines.