Case Study 1 — One JSONB Column Instead of Forty Nullable Ones
The constructive use of JSONB: a product catalog whose attributes differed wildly by category was drowning in sparse, mostly-null columns. A single
jsonbcolumn — properly indexed, and alongside the real relational columns — fixed it without adding a second database.
Background
An electronics retailer's products table had grown a column for every attribute any product might have: ram_gb, screen_inches, cpu_cores, battery_mah, page_count, isbn, wattage, material, shoe_size, voltage, and on and on — about 40 columns. The problem: a book has none of the laptop columns, a blender has none of the book columns. Every row was mostly NULL. Adding a new product category meant an ALTER TABLE to add more columns (a risky, locking operation on a big table — Chapter 14's Case Study 2). The table was a sparse, ever-growing mess, and merchandising waited on engineering every time they added a category.
One faction wanted to "just use MongoDB for products." Another proposed a notorious anti-pattern (an "EAV" key-value table). The team chose a third path: keep the relational columns relational, and put the variable attributes in a single jsonb column.
The redesign
The columns that every product has, and that the business queries, joins, and constrains on, stayed as real typed columns: product_id, sku, name, price, category_id, supplier_id, is_active. The dozens of category-specific, sparse attributes moved into one jsonb column:
ALTER TABLE products ADD COLUMN attributes jsonb NOT NULL DEFAULT '{}';
-- a laptop: {"ram_gb":16,"screen_in":14,"color":"silver"}
-- a book: {"format":"paperback","pages":420,"isbn":"978-..."}
-- a blender: {"watts":1200,"color":"red"}
(This is exactly Mercado's design — products.attributes is that column.) Querying is clean, and crucially, indexable:
CREATE INDEX idx_products_attributes ON products USING GIN (attributes);
-- "16GB laptops" — uses the GIN index
SELECT name FROM products
WHERE category_id = 3 AND attributes @> '{"ram_gb": 16}';
Now adding a new category needs no schema change — merchandising just stores whatever keys that category needs. The 40 sparse columns collapsed to one. And because it's still PostgreSQL, products kept their foreign keys, their CHECK (price >= 0), transactions, and joins — none of which a separate document database would have given them for free.
Why not MongoDB, and why not EAV
- Not MongoDB: the core of a product (price, category, supplier, active flag) is deeply relational — it joins to orders, categories, suppliers, and needs integrity and transactions. Moving products to a document store would have sacrificed all of that to solve a problem (variable attributes) that
jsonbsolves inside the relational database. (This is the Lumen lesson from Chapter 1, avoided.) - Not EAV (an
attributes(product_id, key, value)table): the entity-attribute-value pattern is flexible but makes every "products with 16GB RAM" query a self-joining nightmare and loses type safety.jsonbgives the same flexibility with far better ergonomics and a real index.
The balance that made it work
The design's success came from drawing the line correctly:
- Relational columns for what's universal, queried, joined, and constrained → typed columns with constraints.
- JSONB for what's variable, sparse, and category-specific → one indexed
jsonbcolumn.
They also resisted the temptation to push everything into JSONB. price stayed a real numeric column (you SUM it, constrain it >= 0, and want it typed), not attributes->>'price'. The rule they adopted: if you filter, join, aggregate, or constrain on it across all products, it's a column; if it's a category-specific spec you mostly read back whole, it's a JSONB key.
The analysis
-
JSONB is the right tool for genuinely variable/sparse attributes. It collapses dozens of mostly-null columns into one flexible, indexable column and removes the
ALTER TABLE-per-category treadmill. -
It complements relational design — it doesn't replace it. The universal, queried, related fields stayed as typed columns with constraints and foreign keys. JSONB handled the variable part. Theme #1 (design) and theme #4 (PostgreSQL's power) work together.
-
Index it or it's slow. The GIN index on
attributesis what makes@>containment queries fast; without it, every attribute query is a full scan (Chapter 23). -
It avoided a second database. One system — with transactions, joins, integrity, and document flexibility — instead of PostgreSQL + MongoDB to deploy, secure, sync, and reconcile. Theme #4 in action.
-
Draw the column/JSONB line deliberately. "Filter/join/aggregate/constrain on it → column; category-specific spec read back whole → JSONB key." Getting this line right is the whole skill.
Discussion questions
- Why is one
jsonbcolumn better than 40 sparse nullable columns and better than an EAV table? - Why did
priceandcategory_idstay as real columns instead of moving intoattributes? - What does the GIN index enable, and what happens to attribute queries without it?
- How does this design avoid the Lumen mistake (Chapter 1) of modeling relational data as documents?
- ⭐ State the rule for deciding whether a given attribute should be a column or a JSONB key, and apply it to three example attributes.