Appendix B — The Mercado Practice Database

The complete reference for Mercado, the fictional online-marketplace database used in every chapter. ER diagram, full data dictionary (all 13 tables), relationships, sample-data sizes, and load instructions. The authoritative companion to sql/schema.sql.

Loading it

createdb mercado
psql -d mercado -f sql/schema.sql           # structure (this appendix)
psql -d mercado -f sql/seed-sample.sql       # small, deterministic data (book examples)
# performance chapters (23–25):
# psql -d mercado -f sql/generate_data.sql    # ~100K orders, ~450K rows; runs ANALYZE

ER diagram (crow's-foot, simplified)

   categories ──┐ (parent_category_id self-ref: tree)
        │ 1     │
        │       └──< categories
        │ N
   ┌────┴─────┐        ┌───────────┐        ┌────────────┐
   │ products │>──N──1─│ suppliers │        │ warehouses │
   │          │                                   │ 1
   │ (FK cat, │ 1                                  │ N
   │  FK sup) │──< inventory >──────────────────── ┘   (PK: product_id, warehouse_id)
   │          │ 1
   │          │──< reviews >──N──1── customers
   └────┬─────┘                          │ 1
        │ 1                              addresses (N)  ── customers
        │ N                              orders (N) ───< 1 customers
   order_items >──N──1── orders          orders >──N──0..1 employees (sales rep, nullable)
   (PK: order_id, product_id)            orders >──N──0..1 addresses (ship_address_id)
                                          orders ──1──< payments (N)
                                          orders ──1──< shipments (N)

   employees ──┐ (manager_id self-ref: org chart)
        │ 1    │
        │      └──< employees
        N

Two self-references (categories.parent_category_id, employees.manager_id) power recursive CTEs (Ch. 11) and self-joins (Ch. 6); two composite-key junctions (order_items, inventory) resolve many-to-many / per-pair facts.

Data dictionary (13 tables)

categories — product taxonomy (self-referencing tree) | Column | Type | Constraints | |---|---|---| | category_id | integer | PK, identity | | name | text | not null | | slug | text | not null, unique | | parent_category_id | integer | FK → categories(category_id) ON DELETE SET NULL | | description | text | |

suppliers | Column | Type | Constraints | |---|---|---| | supplier_id | integer | PK, identity | | name | text | not null | | country | text | not null | | contact_email | text | | | rating | numeric(2,1) | check 0–5 |

warehouses | Column | Type | Constraints | |---|---|---| | warehouse_id | integer | PK, identity | | code | text | not null, unique | | name | text | not null | | region | text | not null |

customers | Column | Type | Constraints | |---|---|---| | customer_id | integer | PK, identity | | first_name | text | not null | | last_name | text | not null | | email | text | not null, unique | | phone | text | (nullable) | | loyalty_tier | text | not null, default 'standard', check in (standard,silver,gold,platinum) | | created_at | timestamptz | not null, default now() |

employees — staff (self-referencing org chart) | Column | Type | Constraints | |---|---|---| | employee_id | integer | PK, identity | | first_name / last_name | text | not null | | email | text | not null, unique | | department | text | not null | | manager_id | integer | FK → employees(employee_id) ON DELETE SET NULL | | hired_at | date | not null | | salary | numeric(10,2) | not null, check ≥ 0 |

products — catalog (JSONB attributes) | Column | Type | Constraints | |---|---|---| | product_id | integer | PK, identity | | sku | text | not null, unique | | name | text | not null | | description | text | | | category_id | integer | not null, FK → categories | | supplier_id | integer | FK → suppliers ON DELETE SET NULL | | price | numeric(10,2) | not null, check ≥ 0 | | attributes | jsonb | not null, default '{}' | | is_active | boolean | not null, default true | | created_at | timestamptz | not null, default now() |

inventory — stock per product per warehouse (composite PK) | Column | Type | Constraints | |---|---|---| | product_id | integer | PK, FK → products ON DELETE CASCADE | | warehouse_id | integer | PK, FK → warehouses ON DELETE CASCADE | | quantity | integer | not null, default 0, check ≥ 0 | | reorder_level | integer | not null, default 10, check ≥ 0 |

addresses | Column | Type | Constraints | |---|---|---| | address_id | integer | PK, identity | | customer_id | integer | not null, FK → customers ON DELETE CASCADE | | kind | text | not null, default 'shipping', check in (shipping,billing) | | line1 | text | not null | | line2 / state | text | (nullable) | | city | text | not null | | postal_code | text | not null | | country | text | not null, default 'USA' | | is_default | boolean | not null, default false |

ordersorder_date is the partition key in Ch. 25 | Column | Type | Constraints | |---|---|---| | order_id | integer | PK, identity | | customer_id | integer | not null, FK → customers | | employee_id | integer | FK → employees ON DELETE SET NULL (nullable sales rep) | | ship_address_id | integer | FK → addresses (nullable) | | order_date | timestamptz | not null, default now() | | status | text | not null, default 'pending', check in (pending,paid,shipped,delivered,cancelled,returned) |

order_items — M:N junction (composite PK); unit_price is a deliberate point-in-time copy (Ch. 19–20) | Column | Type | Constraints | |---|---|---| | order_id | integer | PK, FK → orders ON DELETE CASCADE | | product_id | integer | PK, FK → products | | quantity | integer | not null, check > 0 | | unit_price | numeric(10,2) | not null, check ≥ 0 | | discount | numeric(4,3) | not null, default 0, check 0 ≤ d < 1 |

payments | Column | Type | Constraints | |---|---|---| | payment_id | integer | PK, identity | | order_id | integer | not null, FK → orders ON DELETE CASCADE | | amount | numeric(10,2) | not null, check ≥ 0 | | method | text | not null, check in (card,paypal,gift_card,bank_transfer) | | status | text | not null, default 'captured', check in (authorized,captured,refunded,failed) | | paid_at | timestamptz | not null, default now() |

shipments | Column | Type | Constraints | |---|---|---| | shipment_id | integer | PK, identity | | order_id | integer | not null, FK → orders ON DELETE CASCADE | | carrier | text | not null | | tracking_number | text | | | shipped_at / delivered_at | timestamptz | (nullable) | | status | text | not null, default 'preparing', check in (preparing,in_transit,delivered,lost) |

reviewsUNIQUE(product_id, customer_id): one review per customer per product | Column | Type | Constraints | |---|---|---| | review_id | integer | PK, identity | | product_id | integer | not null, FK → products ON DELETE CASCADE | | customer_id | integer | not null, FK → customers ON DELETE CASCADE | | rating | smallint | not null, check 1–5 | | title / body | text | | | created_at | timestamptz | not null, default now() |

Pre-built indexes

Created by schema.sql (others are left for you to add in Ch. 23–24 exercises): orders(customer_id), order_items(product_id), products(category_id), reviews(product_id).

Sample data sizes

  • seed-sample.sql (deterministic): 8 categories, 5 suppliers, 3 warehouses, 12 customers, 8 employees, 15 products, ~18 inventory rows, 8 addresses, 15 orders, ~26 order_items, 12 payments, 9 shipments, 12 reviews.
  • generate_data.sql (random, for performance work): 20K customers, 5K products, 15K inventory, 20K addresses, 100K orders, ~250K order_items, payments/shipments for relevant orders, ~40K reviews. Runs ANALYZE at the end.

What Mercado is designed to teach

Joins (orders→customers→order_items→products→categories); self-joins & recursion (categories tree, employees org chart); aggregation & window functions (sales by category, rankings, running totals); JSONB & full-text (products.attributes, names/descriptions); normalization (it's the 3NF result of Ch. 19's flat-table anchor); indexing/partitioning (orders by customer_id / order_date); transactions/concurrency (inventory, orders, payments).


See also: sql/schema.sql, sql/seed-sample.sql, sql/generate_data.sql; Appendix D (data types), Appendix F (normalization).