Answers to Selected Exercises

Worked answers to the exercises marked (answer in Appendix). SQL targets PostgreSQL 15+ against Mercado (seed-sample.sql). Where data values are shown, they reflect the deterministic sample. Try each exercise before reading.

Chapter 1 — What Is a Database?

1.1 A database is the organized, persistent collection of related data (e.g., the mercado data); a DBMS is the software that manages it (PostgreSQL). Data vs. the program that stores/protects/queries it.

1.6 Redundancy = the same fact stored in multiple places. When a customer's email is copied into many order rows and she changes it, updating only some copies leaves the data contradicting itself — an update anomaly (an integrity failure).

1.7 Structure (a quantity went negative because nothing enforced quantity >= 0).

1.12 Durability (a crash lost committed data because there was no durable log/backup).

1.14 Edgar F. Codd, 1970: represent all data as relations and let users ask what they want (declaratively) rather than navigate how.

1.20 "The first and last names of all gold-tier customers."

1.24 Entities: products, customers, orders, (order line items), categories, suppliers. Each gets a table; e.g., product → (name, price); customer → (name, email).

Chapter 2 — Setup

2.1 psql --version → "psql (PostgreSQL) 16.x" (any 15+). The # prompt = superuser; > = ordinary user.

2.5 \d orders shows columns/types, PK order_id, and FKs customer_id→customers, employee_id→employees, ship_address_id→addresses.

2.8 \h INSERT shows SQL syntax help for a command; \? lists psql meta-commands (backslash commands).

2.10 CREATE TABLE scratch (id integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY, note text NOT NULL); INSERT INTO scratch (note) VALUES ('a'),('b'),('c'); SELECT * FROM scratch; DROP TABLE scratch;

2.13 SELECT name, price FROM products;

2.18 Re-running is safe because seed-sample.sql begins with TRUNCATE ... RESTART IDENTITY CASCADE (and uses OVERRIDING SYSTEM VALUE with fixed IDs), so it clears and reloads deterministically.

2.21 Connection refused = the server isn't running. Start it (brew services start / systemctl start postgresql / docker start).

Chapter 3 — The Relational Model

3.1 Relation = table (customers); tuple = row (one customer); attribute = column (email); domain = allowed values (loyalty_tier ∈ {standard,…}); degree = #columns (7); cardinality = #rows (12 in sample).

3.5 Superkey: {customer_id, email} (or {customer_id}); candidate keys: {customer_id}, {email}; chosen PK: customer_id.

3.8 inventory PK = (product_id, warehouse_id); each row = the stock of one product in one warehouse.

3.10 orders FKs: customer_id→customers, employee_id→employees, ship_address_id→addresses.

3.14 SELECT first_name, last_name FROM customers WHERE phone IS NULL; — WHERE phone = NULL returns nothing because = NULL is unknown (three-valued logic), never true.

3.16 COUNT(*) = 12 (rows); COUNT(phone) = 10 (non-NULL phones). They differ by the two NULL phones — COUNT(col) skips NULLs.

3.18 Rejected by CHECK (rating BETWEEN 1 AND 5) — reviews_rating_check violation.

3.22 Tables referencing customers: addresses (customer_id), orders (customer_id), reviews (customer_id). (Check each with \d.)

Chapter 4 — Relational Algebra

4.1 Selection (σ) → WHERE.

4.7 SELECT first_name, last_name FROM customers WHERE loyalty_tier = 'gold';

4.11 π sku,price ( σ price<50 (products) ) → SELECT sku, price FROM products WHERE price < 50;

4.14 Product = 12 × 15 = 180 rows; CROSS JOIN produces every customer-order pairing (mostly meaningless).

4.17 Equivalent (selection can be pushed below the join). (b) is cheaper — it filters orders to one customer before the join, so the join does far less work.

4.20 Same rows all three ways (IN/JOIN/EXISTS) for "orders from gold customers."

Chapter 5 — SELECT/FROM/WHERE

5.1 SELECT * FROM suppliers;

5.4 SELECT name, price, ROUND(price * 1.08, 2) AS price_with_tax FROM products;

5.6 SELECT * FROM products WHERE price > 500;

5.9 SELECT * FROM customers WHERE loyalty_tier IN ('gold','platinum');

5.14 SELECT * FROM products WHERE is_active AND category_id IN (3,5) AND price < 300;

5.15 SELECT * FROM products ORDER BY price DESC;

5.20 SELECT DISTINCT loyalty_tier FROM customers;

5.23 phone = NULL is never true; use WHERE phone IS NULL.

5.27 Text compares by collation order; 'a' < 'b' is true; 'Z' < 'a' is true in typical ASCII/C-like ordering (uppercase precedes lowercase) — collation-dependent.

5.30 "Active products under $50, name and price, cheapest first."

Chapter 6 — JOINs

6.1 SELECT o.order_id, o.order_date, c.first_name, c.last_name FROM orders o JOIN customers c ON c.customer_id = o.customer_id;

6.6 SELECT c.last_name, p.name, oi.quantity FROM orders o JOIN customers c ON c.customer_id=o.customer_id JOIN order_items oi ON oi.order_id=o.order_id JOIN products p ON p.product_id=oi.product_id;

6.10 SELECT c.first_name, c.last_name, o.order_id FROM customers c LEFT JOIN orders o ON o.customer_id=c.customer_id;

6.11 SELECT c.* FROM customers c WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.customer_id=c.customer_id); (or LEFT JOIN ... WHERE o.order_id IS NULL).

6.16 SELECT e.first_name||' '||e.last_name AS employee, m.first_name||' '||m.last_name AS manager FROM employees e LEFT JOIN employees m ON m.employee_id = e.manager_id;

6.19 Comma with no condition = Cartesian product (180 rows). Add WHERE c.customer_id = o.customer_id (or use JOIN ... ON).

6.23 INNER JOIN (every order has a customer; no need to keep non-matches).

Chapter 7 — Aggregation

7.1 SELECT COUNT(*) FROM products; → 15.

7.3 SELECT COUNT(*) AS all, COUNT(phone) AS with_phone FROM customers; → 12 vs 10 (COUNT(col) skips NULLs).

7.6 SELECT loyalty_tier, COUNT(*) FROM customers GROUP BY loyalty_tier ORDER BY 2 DESC;

7.9 SELECT p.name, AVG(r.rating), COUNT(r.review_id) FROM products p LEFT JOIN reviews r ON r.product_id=p.product_id GROUP BY p.product_id, p.name;

7.11 SELECT loyalty_tier, COUNT(*) FROM customers GROUP BY loyalty_tier HAVING COUNT(*) > 2;

7.15 Error: aggregates can't appear in WHERE. Use HAVING COUNT(*) > 2 (groups don't exist at WHERE time).

7.18 SELECT c.customer_id, COALESCE(SUM(oi.quantity*oi.unit_price),0) AS revenue FROM customers c LEFT JOIN orders o ON o.customer_id=c.customer_id LEFT JOIN order_items oi ON oi.order_id=o.order_id GROUP BY c.customer_id;

7.22 Both equal here (discount is NOT NULL). In general AVG(col) skips NULLs; AVG(COALESCE(col,0)) counts them as 0.

7.25 ... GROUP BY ROLLUP (cat.name) ... adds a grand-total row (NULL category).

Chapter 8 — Built-in Functions

8.1 SELECT last_name || ', ' || first_name AS name FROM customers;

8.4 SELECT 'ORD-' || lpad(order_id::text, 5, '0') FROM orders;

8.6 SELECT name, round(price, -1) AS nearest_ten FROM products; (negative scale rounds to tens).

8.10 SELECT order_id, order_date::date FROM orders;

8.11 SELECT date_trunc('month', order_date) AS month, COUNT(*) FROM orders GROUP BY 1 ORDER BY 1;

8.15 SELECT '250'::integer + 50; → 300.

8.18 SELECT name, CASE WHEN price>=1000 THEN 'premium' WHEN price>=200 THEN 'mid-range' ELSE 'budget' END AS band FROM products;

8.20 SELECT COUNT(*) FILTER (WHERE status='delivered') AS delivered, COUNT(*) FILTER (WHERE status='cancelled') AS cancelled FROM orders;

8.22 Integer division: delivered_items / total_items truncates to 0. Use delivered_items::numeric / NULLIF(total_items,0).

Chapter 9 — Subqueries

9.1 SELECT name, price FROM products WHERE price > (SELECT AVG(price) FROM products);

9.5 SELECT * FROM orders WHERE customer_id IN (SELECT customer_id FROM customers WHERE loyalty_tier='gold');

9.9 SELECT AVG(t.tot) FROM (SELECT order_id, SUM(quantity*unit_price) tot FROM order_items GROUP BY order_id) t;

9.12 SELECT p.name, (SELECT COUNT(*) FROM reviews r WHERE r.product_id=p.product_id) AS reviews FROM products p;

9.13 SELECT first_name,last_name FROM customers c WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id=c.customer_id);

9.17 > ALL — products costlier than every category-5 product: WHERE price > ALL (SELECT price FROM products WHERE category_id=5);

9.20 Three equivalent forms (IN / JOIN / EXISTS) — same orders.

9.23 Error: the scalar subquery returns multiple rows. Use IN (...) (or aggregate / LIMIT 1).

Chapter 10 — Set Operations

10.1 SELECT first_name||' '||last_name AS name, email FROM customers UNION SELECT name, contact_email FROM suppliers ORDER BY name;

10.5 SELECT customer_id FROM orders INTERSECT SELECT customer_id FROM reviews;

10.8 SELECT product_id FROM products EXCEPT SELECT product_id FROM order_items;

10.12 SELECT name, price FROM products WHERE category_id=3 UNION ALL SELECT name, price FROM products WHERE category_id=5 ORDER BY price DESC;

10.14 Mismatched column counts. Make both two columns (e.g., SELECT name, contact_email FROM suppliers).

Chapter 11 — CTEs & Recursion

11.1 WITH per_order AS (SELECT o.order_id, SUM(oi.quantity*oi.unit_price) tot FROM orders o JOIN order_items oi ON oi.order_id=o.order_id GROUP BY o.order_id) SELECT AVG(tot), MAX(tot) FROM per_order;

11.3 WITH per_customer AS (...) SELECT * FROM per_customer WHERE revenue > (SELECT AVG(revenue) FROM per_customer);

11.5 Three CTEs: per-customer revenue → rank → join customers; WHERE rank <= 3.

11.7 WITH RECURSIVE t AS (SELECT category_id,name,parent_category_id,1 d FROM categories WHERE parent_category_id IS NULL UNION ALL SELECT c.category_id,c.name,c.parent_category_id,t.d+1 FROM categories c JOIN t ON c.parent_category_id=t.category_id) SELECT * FROM t;

11.11 Anchor = 'Laptops'; recurse to parent: ... JOIN ancestry a ON c.category_id = a.parent_category_id.

11.15 Cycles cause infinite recursion. Prevent with the CYCLE clause (or a path array) and/or a depth limit (WHERE depth < N).

Chapter 12 — Window Functions

12.1 SELECT name, price, AVG(price) OVER (PARTITION BY category_id) AS cat_avg FROM products;

12.4 SELECT name, price, ROW_NUMBER() OVER (PARTITION BY category_id ORDER BY price DESC) rn, RANK() OVER (...), DENSE_RANK() OVER (...) FROM products; — ROW_NUMBER unique; RANK skips after ties; DENSE_RANK no gaps.

12.5 SELECT * FROM (SELECT name, category_id, price, ROW_NUMBER() OVER (PARTITION BY category_id ORDER BY price DESC) rn FROM products) t WHERE rn=1;

12.8 SELECT order_id, amount, SUM(amount) OVER (ORDER BY paid_at) AS running FROM payments;

12.11 WITH m AS (SELECT date_trunc('month',paid_at) mo, SUM(amount) rev FROM payments GROUP BY 1) SELECT mo, rev, LAG(rev) OVER (ORDER BY mo) FROM m;

12.14 AVG(rev) OVER (ORDER BY day ROWS BETWEEN 2 PRECEDING AND CURRENT ROW).

12.17 Can't filter a window function in WHERE. Wrap in a subquery/CTE and filter the alias in the outer query.

Chapter 13 — Data Modification

13.1 INSERT INTO suppliers (name,country,contact_email) VALUES ('New Co','USA','x@y.example');

13.3 INSERT INTO customers (first_name,last_name,email) VALUES ('Maya','Singh','maya@example.com') RETURNING customer_id, created_at;

13.5 First SELECT * FROM products WHERE category_id=5; then UPDATE products SET price=price*1.10 WHERE category_id=5;

13.9 DELETE FROM reviews WHERE rating < 2; (preview with the same WHERE as a SELECT first).

13.13 INSERT INTO inventory (product_id,warehouse_id,quantity) VALUES (1,1,99) ON CONFLICT (product_id,warehouse_id) DO UPDATE SET quantity=EXCLUDED.quantity;

13.16 Preview the WHERE as a SELECT → BEGIN → run the change → verify with a SELECT → COMMIT (or ROLLBACK).

13.19 Missing WHERE → updates every product. Catch it by previewing the WHERE as a SELECT (count the rows).

Chapter 14 — Data Definition

14.1 CREATE TABLE wishlist (wishlist_id integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY, customer_id integer NOT NULL REFERENCES customers(customer_id), product_id integer NOT NULL REFERENCES products(product_id), added_at timestamptz NOT NULL DEFAULT now());

14.2 price→numeric; name→text; flag→boolean; order timestamp→timestamptz; rating→smallint with CHECK (rating BETWEEN 1 AND 5).

14.5 Add UNIQUE (customer_id, product_id) (or make it the PK) and the two FKs NOT NULL.

14.9 CASCADE deletes children; RESTRICT blocks delete if children exist; SET NULL nulls the child FK.

14.12 ALTER TABLE wishlist ADD COLUMN notes text; then ALTER TABLE wishlist ALTER COLUMN notes SET DEFAULT '';

14.16 CREATE SCHEMA reporting; CREATE TABLE reporting.t (...); SELECT * FROM reporting.t;

14.18 (Library example) members, books, loans with PKs, FKs, CHECKs, UNIQUE(email) — see Chapter 39 Case Study 1.

Chapter 15 — Views & Functions

15.1 CREATE VIEW order_summary AS SELECT o.order_id, c.first_name||' '||c.last_name AS customer, o.order_date, o.status, SUM(oi.quantity*oi.unit_price) AS total FROM orders o JOIN customers c ON c.customer_id=o.customer_id LEFT JOIN order_items oi ON oi.order_id=o.order_id GROUP BY o.order_id, c.first_name, c.last_name, o.order_date, o.status;

15.3 Simplification (hide the join), security (expose subset), abstraction (stable interface over changing tables).

15.5 CREATE MATERIALIZED VIEW category_sales AS SELECT c.category_id, c.name, COUNT(DISTINCT o.order_id) n, SUM(oi.quantity*oi.unit_price) revenue FROM categories c JOIN products p ON p.category_id=c.category_id JOIN order_items oi ON oi.product_id=p.product_id JOIN orders o ON o.order_id=oi.order_id GROUP BY c.category_id, c.name;

15.7 Use a matview when the query is expensive, data changes slowly, and slight staleness is acceptable.

15.9 SQL function: CREATE FUNCTION order_total(p int) RETURNS numeric AS $$ SELECT COALESCE(SUM(quantity*unit_price*(1-discount)),0) FROM order_items WHERE order_id=p $$ LANGUAGE sql STABLE;

15.12 A price_history table + AFTER UPDATE trigger inserting old/new price when changed.

15.15 Materialized view, refreshed hourly (expensive aggregate, hourly freshness OK).

15.19 The matview isn't being refreshed; schedule REFRESH MATERIALIZED VIEW [CONCURRENTLY] ....

Chapter 16 — Advanced SQL

16.1 SELECT name, attributes ->> 'color' AS color FROM products;

16.2 SELECT * FROM products WHERE attributes @> '{"color":"silver"}';

16.5 Prefer jsonb (parsed, indexable). GIN index speeds @>.

16.6 SELECT name FROM products WHERE to_tsvector('english', name||' '||coalesce(description,'')) @@ to_tsquery('english','wireless');

16.8 Full-text search stems words and ranks by relevance (and uses a GIN index); LIKE does neither.

16.10 'wireless' = ANY(tags); tags @> ARRAY['wireless','black']; SELECT product_id, unnest(tags) FROM products;

16.12 SELECT generate_series('2024-01-01'::date,'2024-12-01','1 month');

16.17 (a) PostgreSQL/JSONB; (b) PostgreSQL full-text; (c) Redis (separate); (d) a warehouse (separate).

Chapters 17–22 (Design)

17.1 Entities: customer, order, product, category, supplier. Relationships: customer places order (1:N); order contains products (M:N → order_items); product belongs to category (1:N); product from supplier (1:N).

17.4 customer↔orders 1:N; order↔products M:N; category↔products 1:N; employee↔manager 1:N (self); product↔reviews 1:N.

17.11 (a) M:N; (b) self-referencing hierarchy; (c) M:N (user-roles).

17.13 Unmodeled M:N crammed into a column. Model students, courses, and an enrollments junction.

18.1 CREATE TABLE books (book_id integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY, isbn text UNIQUE, title text NOT NULL, page_count integer, published_on date, price numeric(10,2));

18.4 FK on the many side: books.author_id REFERENCES authors.

18.7 Junction enrollments(student_id, course_id, grade) PK (student_id, course_id); grade is a relationship attribute → on the junction.

18.10 Single-table (one table + type, nullable cols); table-per-subclass (base + subtype tables, join); table-per-concrete (full table each, UNION for "all").

18.13 order_items is weak (no identity alone); PK (order_id, product_id).

19.1 Update: change a customer's email, stale copies remain. Insert: can't add a product with no order. Delete: deleting the only order for a product loses its price/category.

19.4 product_id → name, price, category_id (and sku → product_id).

19.6 "Every non-key attribute depends on the key (1NF/2NF), the whole key (2NF), and nothing but the key (3NF)."

19.7 Violates 1NF (list in a cell). Fix: enrollments junction.

19.8 Violates 2NF: product_name, product_price depend on product_id only (part of the composite key). Split product attributes into products.

20.1 Accidental redundancy = a bug (current fact copied, drifts). Deliberate denormalization = a documented choice for read speed with a maintenance plan.

20.3 unit_price avoids a join AND records the historical price at purchase time.

21.1 A product_audit(audit_id, product_id, changed_at, changed_by, action, old_row jsonb, new_row jsonb) + AFTER trigger.

21.5 Adjacency list (parent_id; recursive CTE), materialized path (LIKE), nested set (fast read/slow write), closure table (fast read, flexible).

22.1 up: CREATE TABLE wishlists (...); down: DROP TABLE wishlists;

22.4 Range-partition by month: parent PARTITION BY RANGE (order_date) + orders_2024_01 ... FOR VALUES FROM ('2024-01-01') TO ('2024-02-01').

22.7 Expand: add phone_number (nullable), deploy dual-write. Migrate: backfill in batches, switch reads. Contract: stop writing phone, drop it.

Chapters 23–28 (Performance & Internals)

23.1 EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id=4267; — with the index it's an Index Scan in ms; without, a Seq Scan over all rows.

23.4 A B-tree's shallow height makes a lookup touch ~log(n) nodes vs. scanning n rows.

23.7 JSONB @> → GIN; full-text → GIN; range overlap → GiST; integer equality → B-tree.

23.11 CREATE INDEX ON customers (lower(email)); then WHERE lower(email)=... uses it.

23.13 Slower writes, storage, maintenance.

24.1 Seq Scan (no index on price); note estimated vs actual rows and total time.

24.5 A column with no index → Seq Scan; an indexed column → Index Scan.

24.8 Likely a Hash Join or indexed Nested Loop (with orders(customer_id) index).

24.11 ANALYZE refreshes planner statistics; stale stats → wrong row estimates → bad plans.

24.14 Read plan → find the Seq Scan / mismatch → add index / ANALYZE / rewrite → re-run to confirm.

25.1 Smaller maintenance units; partition pruning; cheap retention via DROP.

25.4 CREATE TABLE orders_part (...) PARTITION BY RANGE (order_date); + monthly partitions (half-open ranges).

25.8 A date-filtered query scans only the matching month's partition (others pruned) — confirm in EXPLAIN.

26.1 A: all-or-nothing; C: constraints hold; I: concurrent txns don't interfere; D: committed data survives crashes.

26.4 BEGIN; INSERT INTO orders ...; INSERT INTO order_items ...; COMMIT; (an error → ROLLBACK, nothing persists).

26.7 Under Read Committed, T1's second read sees T2's committed change (non-repeatable read); Repeatable Read keeps one snapshot.

26.11 Read Committed (default); prevents dirty reads, not non-repeatable/phantom.

27.1 Two txns read the same value, each computes +1, one write overwrites the other → one increment lost. Invisible single-threaded.

27.4 SELECT ... FOR UPDATE locks the row; another session's update waits until commit.

27.7 UPDATE t SET ..., version=version+1 WHERE id=:id AND version=:v; — 0 rows → someone else changed it → retry.

27.10 Two sessions lock A,B in opposite order → deadlock; PostgreSQL aborts one. Fix with consistent lock ordering.

27.13 pg_advisory_lock(n) to serialize a job across sessions (a named mutex for non-row resources).

28.1 8 KB; PostgreSQL reads/writes per page (the I/O and cache unit), not per row.

28.4 First run reads from disk (misses); second run hits the buffer pool (cached) → much faster.

28.7 Write the WAL record (flushed) before modifying the page → durability; sequential writes are fast; replay recovers after a crash.

28.10 An UPDATE writes a new version, marks old dead (MVCC). VACUUM reclaims dead tuples.

28.14 Index speeds reads (fewer pages) but slows writes (every index updated, more WAL).

Chapters 29–32 (Application & Security)

29.1 conn=psycopg2.connect(...); cur=conn.cursor(); cur.execute("SELECT name,price FROM products WHERE price>200"); print(cur.fetchall()); cur.close(); conn.close()

29.4 cur.execute("SELECT * FROM customers WHERE email=%s", (email,)) — %s is a driver placeholder (value sent separately), not Python formatting; prevents injection.

29.5 "... WHERE name='"+name+"'" lets '; DROP TABLE products; -- become executed SQL. Fix: parameterize.

29.8 with conn, conn.cursor() as cur: cur.execute("INSERT INTO orders(...) VALUES(%s,%s) RETURNING order_id",(...)); oid=cur.fetchone()[0]; cur.execute("INSERT INTO order_items(...) VALUES(%s,...)",(oid,...)) — commits on success.

29.11 getconn()/putconn() reuse connections; return (don't close) so the pool can hand them out again.

29.14 A CustomerRepository with parameterized find_by_email/create methods (see Ch. 29 index).

30.1 Define Customer/Order mapped classes with relationship() (see Ch. 30 index).

30.5 Looping over customers accessing .orders fires one query per customer (N+1).

30.9 Raw SQL via session.execute(text("SELECT ... GROUP BY ...")) — analytics is clearer/faster than the ORM.

31.1 \copy (SELECT * FROM products) TO 'products.csv' WITH (FORMAT csv, HEADER true).

31.4 COPY finishes far faster than an INSERT loop (no per-statement/round-trip/commit overhead).

31.7 Load raw → loose staging_products (text cols) → INSERT INTO products SELECT ...::numeric, c.category_id ... JOIN categories ... WHERE <valid> ON CONFLICT (sku) DO UPDATE ....

31.10 Loop: UPDATE ... SET phone_number=phone WHERE phone_number IS NULL AND ...id range... per batch (re-runnable).

31.13 Drop indexes before a huge initial load, rebuild after (faster); NOT safe on a live table serving queries.

32.1 Vulnerable: concatenated input. Parameterized: %s — structurally safe because the value never becomes SQL.

32.4 CREATE ROLE app_rw LOGIN PASSWORD '...'; GRANT SELECT,INSERT,UPDATE,DELETE ON ALL TABLES IN SCHEMA public TO app_rw; and app_ro with only SELECT.

32.6 GRANT SELECT (name, price) ON products TO analyst;

32.8 ALTER TABLE t ENABLE ROW LEVEL SECURITY; CREATE POLICY p ON t USING (tenant_id = current_setting('app.tenant_id')::int); — safer than app-code filtering because a forgotten WHERE can't leak.

32.11 Require TLS so credentials/data aren't sniffable on the network.

32.14 Backups/PITR make ransomware/accidental-DROP recoverable — a security control, not just ops.

Chapters 33–40 (Beyond Relational; Admin; Career)

33.1 MongoDB=document, Redis=key-value, Cassandra=column-family, Neo4j=graph.

33.4 (a) JSONB; (b) full-text; (c) Redis (separate); (d) recursive CTE.

33.7 Graph: deep social traversal; Cassandra: massive writes; Redis: caching/sessions.

34.1 OLTP normalized/row-oriented/small-txn; OLAP denormalized/columnar/aggregating — different workloads.

34.4 Fact fact_sales (grain = one order line item; measures qty/revenue) + dimensions date/product/customer.

34.11 Columnar storage reads only the needed columns over many rows (less I/O) and compresses well.

35.1 Availability/HA, read scaling, geographic proximity, scale beyond one server.

35.3 Primary accepts writes; replicas replay the WAL and serve reads + enable failover.

35.9 CAP: at most 2 of C/A/P during a partition; since P is unavoidable, choose C vs A.

36.1 Time-series: append-heavy, time-range/aggregate, retention. InfluxDB/Prometheus; extension: TimescaleDB.

36.4 An embedding is a vector of meaning; vector search finds semantically similar items (not keyword matches).

37.1 Data model, workload, scale, consistency, query patterns, team/ops, cost, ecosystem.

37.7 SaaS billing → PostgreSQL (relational, OLTP, strong consistency, rich queries), + Redis/replicas as it grows.

38.1 SHOW shared_buffers; etc. For 16 GB RAM: shared_buffers ≈ 4 GB, effective_cache_size ≈ 8–12 GB.

38.8 pg_dump mercado > m.sql; createdb mercado_test; psql -d mercado_test -f m.sql; then compare count(*)s.

38.10 PITR = base backup + archived WAL → restore to any moment (e.g., just before a bad DELETE); a nightly dump can't.

38.12 Autovacuum reclaims dead tuples; long transactions block it (their snapshot pins tuples) → bloat.

These are selected answers. For exercises without a worked answer here, the in-chapter examples and the relevant key-takeaways provide the method; the answer is to do it against your database.