Answers to Selected Exercises
Worked answers to the exercises marked (answer in Appendix). SQL targets PostgreSQL 15+ against Mercado (
seed-sample.sql). Where data values are shown, they reflect the deterministic sample. Try each exercise before reading.
Chapter 1 — What Is a Database?
1.1 A database is the organized, persistent collection of related data (e.g., the mercado data); a DBMS is the software that manages it (PostgreSQL). Data vs. the program that stores/protects/queries it.
1.6 Redundancy = the same fact stored in multiple places. When a customer's email is copied into many order rows and she changes it, updating only some copies leaves the data contradicting itself — an update anomaly (an integrity failure).
1.7 Structure (a quantity went negative because nothing enforced quantity >= 0).
1.12 Durability (a crash lost committed data because there was no durable log/backup).
1.14 Edgar F. Codd, 1970: represent all data as relations and let users ask what they want (declaratively) rather than navigate how.
1.20 "The first and last names of all gold-tier customers."
1.24 Entities: products, customers, orders, (order line items), categories, suppliers. Each gets a table; e.g., product → (name, price); customer → (name, email).
Chapter 2 — Setup
2.1 psql --version → "psql (PostgreSQL) 16.x" (any 15+). The # prompt = superuser; > = ordinary user.
2.5 \d orders shows columns/types, PK order_id, and FKs customer_id→customers, employee_id→employees, ship_address_id→addresses.
2.8 \h INSERT shows SQL syntax help for a command; \? lists psql meta-commands (backslash commands).
2.10 CREATE TABLE scratch (id integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY, note text NOT NULL); INSERT INTO scratch (note) VALUES ('a'),('b'),('c'); SELECT * FROM scratch; DROP TABLE scratch;
2.13 SELECT name, price FROM products;
2.18 Re-running is safe because seed-sample.sql begins with TRUNCATE ... RESTART IDENTITY CASCADE (and uses OVERRIDING SYSTEM VALUE with fixed IDs), so it clears and reloads deterministically.
2.21 Connection refused = the server isn't running. Start it (brew services start / systemctl start postgresql / docker start).
Chapter 3 — The Relational Model
3.1 Relation = table (customers); tuple = row (one customer); attribute = column (email); domain = allowed values (loyalty_tier ∈ {standard,…}); degree = #columns (7); cardinality = #rows (12 in sample).
3.5 Superkey: {customer_id, email} (or {customer_id}); candidate keys: {customer_id}, {email}; chosen PK: customer_id.
3.8 inventory PK = (product_id, warehouse_id); each row = the stock of one product in one warehouse.
3.10 orders FKs: customer_id→customers, employee_id→employees, ship_address_id→addresses.
3.14 SELECT first_name, last_name FROM customers WHERE phone IS NULL; — WHERE phone = NULL returns nothing because = NULL is unknown (three-valued logic), never true.
3.16 COUNT(*) = 12 (rows); COUNT(phone) = 10 (non-NULL phones). They differ by the two NULL phones — COUNT(col) skips NULLs.
3.18 Rejected by CHECK (rating BETWEEN 1 AND 5) — reviews_rating_check violation.
3.22 Tables referencing customers: addresses (customer_id), orders (customer_id), reviews (customer_id). (Check each with \d.)
Chapter 4 — Relational Algebra
4.1 Selection (σ) → WHERE.
4.7 SELECT first_name, last_name FROM customers WHERE loyalty_tier = 'gold';
4.11 π sku,price ( σ price<50 (products) ) → SELECT sku, price FROM products WHERE price < 50;
4.14 Product = 12 × 15 = 180 rows; CROSS JOIN produces every customer-order pairing (mostly meaningless).
4.17 Equivalent (selection can be pushed below the join). (b) is cheaper — it filters orders to one customer before the join, so the join does far less work.
4.20 Same rows all three ways (IN/JOIN/EXISTS) for "orders from gold customers."
Chapter 5 — SELECT/FROM/WHERE
5.1 SELECT * FROM suppliers;
5.4 SELECT name, price, ROUND(price * 1.08, 2) AS price_with_tax FROM products;
5.6 SELECT * FROM products WHERE price > 500;
5.9 SELECT * FROM customers WHERE loyalty_tier IN ('gold','platinum');
5.14 SELECT * FROM products WHERE is_active AND category_id IN (3,5) AND price < 300;
5.15 SELECT * FROM products ORDER BY price DESC;
5.20 SELECT DISTINCT loyalty_tier FROM customers;
5.23 phone = NULL is never true; use WHERE phone IS NULL.
5.27 Text compares by collation order; 'a' < 'b' is true; 'Z' < 'a' is true in typical ASCII/C-like ordering (uppercase precedes lowercase) — collation-dependent.
5.30 "Active products under $50, name and price, cheapest first."
Chapter 6 — JOINs
6.1 SELECT o.order_id, o.order_date, c.first_name, c.last_name FROM orders o JOIN customers c ON c.customer_id = o.customer_id;
6.6 SELECT c.last_name, p.name, oi.quantity FROM orders o JOIN customers c ON c.customer_id=o.customer_id JOIN order_items oi ON oi.order_id=o.order_id JOIN products p ON p.product_id=oi.product_id;
6.10 SELECT c.first_name, c.last_name, o.order_id FROM customers c LEFT JOIN orders o ON o.customer_id=c.customer_id;
6.11 SELECT c.* FROM customers c WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.customer_id=c.customer_id); (or LEFT JOIN ... WHERE o.order_id IS NULL).
6.16 SELECT e.first_name||' '||e.last_name AS employee, m.first_name||' '||m.last_name AS manager FROM employees e LEFT JOIN employees m ON m.employee_id = e.manager_id;
6.19 Comma with no condition = Cartesian product (180 rows). Add WHERE c.customer_id = o.customer_id (or use JOIN ... ON).
6.23 INNER JOIN (every order has a customer; no need to keep non-matches).
Chapter 7 — Aggregation
7.1 SELECT COUNT(*) FROM products; → 15.
7.3 SELECT COUNT(*) AS all, COUNT(phone) AS with_phone FROM customers; → 12 vs 10 (COUNT(col) skips NULLs).
7.6 SELECT loyalty_tier, COUNT(*) FROM customers GROUP BY loyalty_tier ORDER BY 2 DESC;
7.9 SELECT p.name, AVG(r.rating), COUNT(r.review_id) FROM products p LEFT JOIN reviews r ON r.product_id=p.product_id GROUP BY p.product_id, p.name;
7.11 SELECT loyalty_tier, COUNT(*) FROM customers GROUP BY loyalty_tier HAVING COUNT(*) > 2;
7.15 Error: aggregates can't appear in WHERE. Use HAVING COUNT(*) > 2 (groups don't exist at WHERE time).
7.18 SELECT c.customer_id, COALESCE(SUM(oi.quantity*oi.unit_price),0) AS revenue FROM customers c LEFT JOIN orders o ON o.customer_id=c.customer_id LEFT JOIN order_items oi ON oi.order_id=o.order_id GROUP BY c.customer_id;
7.22 Both equal here (discount is NOT NULL). In general AVG(col) skips NULLs; AVG(COALESCE(col,0)) counts them as 0.
7.25 ... GROUP BY ROLLUP (cat.name) ... adds a grand-total row (NULL category).
Chapter 8 — Built-in Functions
8.1 SELECT last_name || ', ' || first_name AS name FROM customers;
8.4 SELECT 'ORD-' || lpad(order_id::text, 5, '0') FROM orders;
8.6 SELECT name, round(price, -1) AS nearest_ten FROM products; (negative scale rounds to tens).
8.10 SELECT order_id, order_date::date FROM orders;
8.11 SELECT date_trunc('month', order_date) AS month, COUNT(*) FROM orders GROUP BY 1 ORDER BY 1;
8.15 SELECT '250'::integer + 50; → 300.
8.18 SELECT name, CASE WHEN price>=1000 THEN 'premium' WHEN price>=200 THEN 'mid-range' ELSE 'budget' END AS band FROM products;
8.20 SELECT COUNT(*) FILTER (WHERE status='delivered') AS delivered, COUNT(*) FILTER (WHERE status='cancelled') AS cancelled FROM orders;
8.22 Integer division: delivered_items / total_items truncates to 0. Use delivered_items::numeric / NULLIF(total_items,0).
Chapter 9 — Subqueries
9.1 SELECT name, price FROM products WHERE price > (SELECT AVG(price) FROM products);
9.5 SELECT * FROM orders WHERE customer_id IN (SELECT customer_id FROM customers WHERE loyalty_tier='gold');
9.9 SELECT AVG(t.tot) FROM (SELECT order_id, SUM(quantity*unit_price) tot FROM order_items GROUP BY order_id) t;
9.12 SELECT p.name, (SELECT COUNT(*) FROM reviews r WHERE r.product_id=p.product_id) AS reviews FROM products p;
9.13 SELECT first_name,last_name FROM customers c WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id=c.customer_id);
9.17 > ALL — products costlier than every category-5 product: WHERE price > ALL (SELECT price FROM products WHERE category_id=5);
9.20 Three equivalent forms (IN / JOIN / EXISTS) — same orders.
9.23 Error: the scalar subquery returns multiple rows. Use IN (...) (or aggregate / LIMIT 1).
Chapter 10 — Set Operations
10.1 SELECT first_name||' '||last_name AS name, email FROM customers UNION SELECT name, contact_email FROM suppliers ORDER BY name;
10.5 SELECT customer_id FROM orders INTERSECT SELECT customer_id FROM reviews;
10.8 SELECT product_id FROM products EXCEPT SELECT product_id FROM order_items;
10.12 SELECT name, price FROM products WHERE category_id=3 UNION ALL SELECT name, price FROM products WHERE category_id=5 ORDER BY price DESC;
10.14 Mismatched column counts. Make both two columns (e.g., SELECT name, contact_email FROM suppliers).
Chapter 11 — CTEs & Recursion
11.1 WITH per_order AS (SELECT o.order_id, SUM(oi.quantity*oi.unit_price) tot FROM orders o JOIN order_items oi ON oi.order_id=o.order_id GROUP BY o.order_id) SELECT AVG(tot), MAX(tot) FROM per_order;
11.3 WITH per_customer AS (...) SELECT * FROM per_customer WHERE revenue > (SELECT AVG(revenue) FROM per_customer);
11.5 Three CTEs: per-customer revenue → rank → join customers; WHERE rank <= 3.
11.7 WITH RECURSIVE t AS (SELECT category_id,name,parent_category_id,1 d FROM categories WHERE parent_category_id IS NULL UNION ALL SELECT c.category_id,c.name,c.parent_category_id,t.d+1 FROM categories c JOIN t ON c.parent_category_id=t.category_id) SELECT * FROM t;
11.11 Anchor = 'Laptops'; recurse to parent: ... JOIN ancestry a ON c.category_id = a.parent_category_id.
11.15 Cycles cause infinite recursion. Prevent with the CYCLE clause (or a path array) and/or a depth limit (WHERE depth < N).
Chapter 12 — Window Functions
12.1 SELECT name, price, AVG(price) OVER (PARTITION BY category_id) AS cat_avg FROM products;
12.4 SELECT name, price, ROW_NUMBER() OVER (PARTITION BY category_id ORDER BY price DESC) rn, RANK() OVER (...), DENSE_RANK() OVER (...) FROM products; — ROW_NUMBER unique; RANK skips after ties; DENSE_RANK no gaps.
12.5 SELECT * FROM (SELECT name, category_id, price, ROW_NUMBER() OVER (PARTITION BY category_id ORDER BY price DESC) rn FROM products) t WHERE rn=1;
12.8 SELECT order_id, amount, SUM(amount) OVER (ORDER BY paid_at) AS running FROM payments;
12.11 WITH m AS (SELECT date_trunc('month',paid_at) mo, SUM(amount) rev FROM payments GROUP BY 1) SELECT mo, rev, LAG(rev) OVER (ORDER BY mo) FROM m;
12.14 AVG(rev) OVER (ORDER BY day ROWS BETWEEN 2 PRECEDING AND CURRENT ROW).
12.17 Can't filter a window function in WHERE. Wrap in a subquery/CTE and filter the alias in the outer query.
Chapter 13 — Data Modification
13.1 INSERT INTO suppliers (name,country,contact_email) VALUES ('New Co','USA','x@y.example');
13.3 INSERT INTO customers (first_name,last_name,email) VALUES ('Maya','Singh','maya@example.com') RETURNING customer_id, created_at;
13.5 First SELECT * FROM products WHERE category_id=5; then UPDATE products SET price=price*1.10 WHERE category_id=5;
13.9 DELETE FROM reviews WHERE rating < 2; (preview with the same WHERE as a SELECT first).
13.13 INSERT INTO inventory (product_id,warehouse_id,quantity) VALUES (1,1,99) ON CONFLICT (product_id,warehouse_id) DO UPDATE SET quantity=EXCLUDED.quantity;
13.16 Preview the WHERE as a SELECT → BEGIN → run the change → verify with a SELECT → COMMIT (or ROLLBACK).
13.19 Missing WHERE → updates every product. Catch it by previewing the WHERE as a SELECT (count the rows).
Chapter 14 — Data Definition
14.1 CREATE TABLE wishlist (wishlist_id integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY, customer_id integer NOT NULL REFERENCES customers(customer_id), product_id integer NOT NULL REFERENCES products(product_id), added_at timestamptz NOT NULL DEFAULT now());
14.2 price→numeric; name→text; flag→boolean; order timestamp→timestamptz; rating→smallint with CHECK (rating BETWEEN 1 AND 5).
14.5 Add UNIQUE (customer_id, product_id) (or make it the PK) and the two FKs NOT NULL.
14.9 CASCADE deletes children; RESTRICT blocks delete if children exist; SET NULL nulls the child FK.
14.12 ALTER TABLE wishlist ADD COLUMN notes text; then ALTER TABLE wishlist ALTER COLUMN notes SET DEFAULT '';
14.16 CREATE SCHEMA reporting; CREATE TABLE reporting.t (...); SELECT * FROM reporting.t;
14.18 (Library example) members, books, loans with PKs, FKs, CHECKs, UNIQUE(email) — see Chapter 39 Case Study 1.
Chapter 15 — Views & Functions
15.1 CREATE VIEW order_summary AS SELECT o.order_id, c.first_name||' '||c.last_name AS customer, o.order_date, o.status, SUM(oi.quantity*oi.unit_price) AS total FROM orders o JOIN customers c ON c.customer_id=o.customer_id LEFT JOIN order_items oi ON oi.order_id=o.order_id GROUP BY o.order_id, c.first_name, c.last_name, o.order_date, o.status;
15.3 Simplification (hide the join), security (expose subset), abstraction (stable interface over changing tables).
15.5 CREATE MATERIALIZED VIEW category_sales AS SELECT c.category_id, c.name, COUNT(DISTINCT o.order_id) n, SUM(oi.quantity*oi.unit_price) revenue FROM categories c JOIN products p ON p.category_id=c.category_id JOIN order_items oi ON oi.product_id=p.product_id JOIN orders o ON o.order_id=oi.order_id GROUP BY c.category_id, c.name;
15.7 Use a matview when the query is expensive, data changes slowly, and slight staleness is acceptable.
15.9 SQL function: CREATE FUNCTION order_total(p int) RETURNS numeric AS $$ SELECT COALESCE(SUM(quantity*unit_price*(1-discount)),0) FROM order_items WHERE order_id=p $$ LANGUAGE sql STABLE;
15.12 A price_history table + AFTER UPDATE trigger inserting old/new price when changed.
15.15 Materialized view, refreshed hourly (expensive aggregate, hourly freshness OK).
15.19 The matview isn't being refreshed; schedule REFRESH MATERIALIZED VIEW [CONCURRENTLY] ....
Chapter 16 — Advanced SQL
16.1 SELECT name, attributes ->> 'color' AS color FROM products;
16.2 SELECT * FROM products WHERE attributes @> '{"color":"silver"}';
16.5 Prefer jsonb (parsed, indexable). GIN index speeds @>.
16.6 SELECT name FROM products WHERE to_tsvector('english', name||' '||coalesce(description,'')) @@ to_tsquery('english','wireless');
16.8 Full-text search stems words and ranks by relevance (and uses a GIN index); LIKE does neither.
16.10 'wireless' = ANY(tags); tags @> ARRAY['wireless','black']; SELECT product_id, unnest(tags) FROM products;
16.12 SELECT generate_series('2024-01-01'::date,'2024-12-01','1 month');
16.17 (a) PostgreSQL/JSONB; (b) PostgreSQL full-text; (c) Redis (separate); (d) a warehouse (separate).
Chapters 17–22 (Design)
17.1 Entities: customer, order, product, category, supplier. Relationships: customer places order (1:N); order contains products (M:N → order_items); product belongs to category (1:N); product from supplier (1:N).
17.4 customer↔orders 1:N; order↔products M:N; category↔products 1:N; employee↔manager 1:N (self); product↔reviews 1:N.
17.11 (a) M:N; (b) self-referencing hierarchy; (c) M:N (user-roles).
17.13 Unmodeled M:N crammed into a column. Model students, courses, and an enrollments junction.
18.1 CREATE TABLE books (book_id integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY, isbn text UNIQUE, title text NOT NULL, page_count integer, published_on date, price numeric(10,2));
18.4 FK on the many side: books.author_id REFERENCES authors.
18.7 Junction enrollments(student_id, course_id, grade) PK (student_id, course_id); grade is a relationship attribute → on the junction.
18.10 Single-table (one table + type, nullable cols); table-per-subclass (base + subtype tables, join); table-per-concrete (full table each, UNION for "all").
18.13 order_items is weak (no identity alone); PK (order_id, product_id).
19.1 Update: change a customer's email, stale copies remain. Insert: can't add a product with no order. Delete: deleting the only order for a product loses its price/category.
19.4 product_id → name, price, category_id (and sku → product_id).
19.6 "Every non-key attribute depends on the key (1NF/2NF), the whole key (2NF), and nothing but the key (3NF)."
19.7 Violates 1NF (list in a cell). Fix: enrollments junction.
19.8 Violates 2NF: product_name, product_price depend on product_id only (part of the composite key). Split product attributes into products.
20.1 Accidental redundancy = a bug (current fact copied, drifts). Deliberate denormalization = a documented choice for read speed with a maintenance plan.
20.3 unit_price avoids a join AND records the historical price at purchase time.
21.1 A product_audit(audit_id, product_id, changed_at, changed_by, action, old_row jsonb, new_row jsonb) + AFTER trigger.
21.5 Adjacency list (parent_id; recursive CTE), materialized path (LIKE), nested set (fast read/slow write), closure table (fast read, flexible).
22.1 up: CREATE TABLE wishlists (...); down: DROP TABLE wishlists;
22.4 Range-partition by month: parent PARTITION BY RANGE (order_date) + orders_2024_01 ... FOR VALUES FROM ('2024-01-01') TO ('2024-02-01').
22.7 Expand: add phone_number (nullable), deploy dual-write. Migrate: backfill in batches, switch reads. Contract: stop writing phone, drop it.
Chapters 23–28 (Performance & Internals)
23.1 EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id=4267; — with the index it's an Index Scan in ms; without, a Seq Scan over all rows.
23.4 A B-tree's shallow height makes a lookup touch ~log(n) nodes vs. scanning n rows.
23.7 JSONB @> → GIN; full-text → GIN; range overlap → GiST; integer equality → B-tree.
23.11 CREATE INDEX ON customers (lower(email)); then WHERE lower(email)=... uses it.
23.13 Slower writes, storage, maintenance.
24.1 Seq Scan (no index on price); note estimated vs actual rows and total time.
24.5 A column with no index → Seq Scan; an indexed column → Index Scan.
24.8 Likely a Hash Join or indexed Nested Loop (with orders(customer_id) index).
24.11 ANALYZE refreshes planner statistics; stale stats → wrong row estimates → bad plans.
24.14 Read plan → find the Seq Scan / mismatch → add index / ANALYZE / rewrite → re-run to confirm.
25.1 Smaller maintenance units; partition pruning; cheap retention via DROP.
25.4 CREATE TABLE orders_part (...) PARTITION BY RANGE (order_date); + monthly partitions (half-open ranges).
25.8 A date-filtered query scans only the matching month's partition (others pruned) — confirm in EXPLAIN.
26.1 A: all-or-nothing; C: constraints hold; I: concurrent txns don't interfere; D: committed data survives crashes.
26.4 BEGIN; INSERT INTO orders ...; INSERT INTO order_items ...; COMMIT; (an error → ROLLBACK, nothing persists).
26.7 Under Read Committed, T1's second read sees T2's committed change (non-repeatable read); Repeatable Read keeps one snapshot.
26.11 Read Committed (default); prevents dirty reads, not non-repeatable/phantom.
27.1 Two txns read the same value, each computes +1, one write overwrites the other → one increment lost. Invisible single-threaded.
27.4 SELECT ... FOR UPDATE locks the row; another session's update waits until commit.
27.7 UPDATE t SET ..., version=version+1 WHERE id=:id AND version=:v; — 0 rows → someone else changed it → retry.
27.10 Two sessions lock A,B in opposite order → deadlock; PostgreSQL aborts one. Fix with consistent lock ordering.
27.13 pg_advisory_lock(n) to serialize a job across sessions (a named mutex for non-row resources).
28.1 8 KB; PostgreSQL reads/writes per page (the I/O and cache unit), not per row.
28.4 First run reads from disk (misses); second run hits the buffer pool (cached) → much faster.
28.7 Write the WAL record (flushed) before modifying the page → durability; sequential writes are fast; replay recovers after a crash.
28.10 An UPDATE writes a new version, marks old dead (MVCC). VACUUM reclaims dead tuples.
28.14 Index speeds reads (fewer pages) but slows writes (every index updated, more WAL).
Chapters 29–32 (Application & Security)
29.1 conn=psycopg2.connect(...); cur=conn.cursor(); cur.execute("SELECT name,price FROM products WHERE price>200"); print(cur.fetchall()); cur.close(); conn.close()
29.4 cur.execute("SELECT * FROM customers WHERE email=%s", (email,)) — %s is a driver placeholder (value sent separately), not Python formatting; prevents injection.
29.5 "... WHERE name='"+name+"'" lets '; DROP TABLE products; -- become executed SQL. Fix: parameterize.
29.8 with conn, conn.cursor() as cur: cur.execute("INSERT INTO orders(...) VALUES(%s,%s) RETURNING order_id",(...)); oid=cur.fetchone()[0]; cur.execute("INSERT INTO order_items(...) VALUES(%s,...)",(oid,...)) — commits on success.
29.11 getconn()/putconn() reuse connections; return (don't close) so the pool can hand them out again.
29.14 A CustomerRepository with parameterized find_by_email/create methods (see Ch. 29 index).
30.1 Define Customer/Order mapped classes with relationship() (see Ch. 30 index).
30.5 Looping over customers accessing .orders fires one query per customer (N+1).
30.9 Raw SQL via session.execute(text("SELECT ... GROUP BY ...")) — analytics is clearer/faster than the ORM.
31.1 \copy (SELECT * FROM products) TO 'products.csv' WITH (FORMAT csv, HEADER true).
31.4 COPY finishes far faster than an INSERT loop (no per-statement/round-trip/commit overhead).
31.7 Load raw → loose staging_products (text cols) → INSERT INTO products SELECT ...::numeric, c.category_id ... JOIN categories ... WHERE <valid> ON CONFLICT (sku) DO UPDATE ....
31.10 Loop: UPDATE ... SET phone_number=phone WHERE phone_number IS NULL AND ...id range... per batch (re-runnable).
31.13 Drop indexes before a huge initial load, rebuild after (faster); NOT safe on a live table serving queries.
32.1 Vulnerable: concatenated input. Parameterized: %s — structurally safe because the value never becomes SQL.
32.4 CREATE ROLE app_rw LOGIN PASSWORD '...'; GRANT SELECT,INSERT,UPDATE,DELETE ON ALL TABLES IN SCHEMA public TO app_rw; and app_ro with only SELECT.
32.6 GRANT SELECT (name, price) ON products TO analyst;
32.8 ALTER TABLE t ENABLE ROW LEVEL SECURITY; CREATE POLICY p ON t USING (tenant_id = current_setting('app.tenant_id')::int); — safer than app-code filtering because a forgotten WHERE can't leak.
32.11 Require TLS so credentials/data aren't sniffable on the network.
32.14 Backups/PITR make ransomware/accidental-DROP recoverable — a security control, not just ops.
Chapters 33–40 (Beyond Relational; Admin; Career)
33.1 MongoDB=document, Redis=key-value, Cassandra=column-family, Neo4j=graph.
33.4 (a) JSONB; (b) full-text; (c) Redis (separate); (d) recursive CTE.
33.7 Graph: deep social traversal; Cassandra: massive writes; Redis: caching/sessions.
34.1 OLTP normalized/row-oriented/small-txn; OLAP denormalized/columnar/aggregating — different workloads.
34.4 Fact fact_sales (grain = one order line item; measures qty/revenue) + dimensions date/product/customer.
34.11 Columnar storage reads only the needed columns over many rows (less I/O) and compresses well.
35.1 Availability/HA, read scaling, geographic proximity, scale beyond one server.
35.3 Primary accepts writes; replicas replay the WAL and serve reads + enable failover.
35.9 CAP: at most 2 of C/A/P during a partition; since P is unavoidable, choose C vs A.
36.1 Time-series: append-heavy, time-range/aggregate, retention. InfluxDB/Prometheus; extension: TimescaleDB.
36.4 An embedding is a vector of meaning; vector search finds semantically similar items (not keyword matches).
37.1 Data model, workload, scale, consistency, query patterns, team/ops, cost, ecosystem.
37.7 SaaS billing → PostgreSQL (relational, OLTP, strong consistency, rich queries), + Redis/replicas as it grows.
38.1 SHOW shared_buffers; etc. For 16 GB RAM: shared_buffers ≈ 4 GB, effective_cache_size ≈ 8–12 GB.
38.8 pg_dump mercado > m.sql; createdb mercado_test; psql -d mercado_test -f m.sql; then compare count(*)s.
38.10 PITR = base backup + archived WAL → restore to any moment (e.g., just before a bad DELETE); a nightly dump can't.
38.12 Autovacuum reclaims dead tuples; long transactions block it (their snapshot pins tuples) → bloat.
These are selected answers. For exercises without a worked answer here, the in-chapter examples and the relevant key-takeaways provide the method; the answer is to do it against your database.