Case Study 2 — The Startup That Chose the Wrong Database

DataField.Dev

Case Study 2 — The Startup That Chose the Wrong Database

Chapter 1 argued that the relational model is right for most problems — but also that judgment, not loyalty, is the goal. This case study shows the other failure mode: not "spreadsheet when you needed a database," but "the wrong kind of database because it was trendy." It's a preview of the decision-making you'll master in Part VI.

Background

Lumen was a promising B2B startup building scheduling software for dental practices: appointments, patients, providers, billing, insurance claims. In their first month, the two founding engineers made an architectural decision that felt modern and forward-looking: they would store everything in a document database (a NoSQL store) rather than a "boring old" relational database. The pitch to themselves was seductive: no rigid schema, so we can move fast and change our data shape whenever we want; it scales horizontally, so we'll be ready for millions of users; and it stores JSON, which is what our API speaks anyway.

For a few months, it felt great. They could add a field to a patient document without running a migration. Prototyping was fast. They told themselves they'd dodged the bureaucracy of schemas.

Then the application grew up, and the bill came due.

Where it went wrong

Everything was related, and the document model fought it. Dental scheduling is a deeply relational domain. A patient has many appointments. An appointment involves one patient, one provider, one or more procedures, and a room. A procedure has a billing code, which maps to an insurance reimbursement, which depends on the patient's plan. These are exactly the one-to-many and many-to-many relationships the relational model was built for.

In the document store, the team had two bad options, and they tried both. First they embedded related data inside documents — each appointment document carried a copy of the patient's name, the provider's name, the procedure details. This is the redundancy trap from this chapter: when a provider changed their name, thousands of appointment documents now held a stale copy, and a "rename provider" feature became a slow, error-prone bulk update across the whole collection. Then they tried referencing (storing IDs and looking the related data up in application code) — which meant they had reinvented joins, badly, in their own code, fetching documents in loops. A single "show this week's schedule with patient and provider details" screen fired hundreds of separate lookups and took seconds to load.

Nobody enforced the rules, so the data rotted. Because the document store didn't enforce a schema, bugs in the application quietly wrote inconsistent data. Some appointment documents had status: "confirmed", others state: "CONFIRMED", others no status field at all, depending on which version of the code wrote them. Some referenced patient IDs that no longer existed (no foreign keys to prevent it). Six months in, every query had to defensively handle five historical shapes of the same data. The "flexibility" they'd celebrated had become a tax paid on every single read.

They needed transactions, and didn't have them. Billing is unforgiving. When a payment is applied to a claim, the claim's balance must decrease and a payment record must be created — both or neither. Their database (at the time, in the way they'd configured it) couldn't guarantee that two writes to two different documents happened atomically. A crash between the two left claims with payments that didn't exist, or payments applied to claims that still showed full balances. Reconciling the books became a monthly manual nightmare.

The scaling they'd optimized for never came. They had chosen horizontal scaling for "millions of users." They had, at the eighteen-month mark, four hundred dental practices. Their data would have fit comfortably on a single modest PostgreSQL server with room to spare for years. They had paid a steep, daily complexity cost to solve a scaling problem they did not have.

The analysis

Lumen's mistake wasn't choosing NoSQL — for some problems, a document store is exactly right (a product catalog with wildly varying attributes, a content management system, an event log). Their mistake was choosing it for a problem whose essence was relationships and integrity, and doing so based on hype ("schemas are slow," "NoSQL scales") rather than on the actual shape of their data and workload.

Map their pain back to the chapter's five problems and you can see they gave up the database's core strengths in each:

Structure — they had no enforced schema, so inconsistent data accumulated.
Integrity — no foreign keys, so references dangled and duplicates drifted.
Concurrency/transactions — no easy multi-record atomicity, so billing corrupted.
Querying — joins reimplemented in application code, slow and fragile.
Scale — the one thing they optimized for, and the one thing they didn't need.

When they finally migrated to PostgreSQL, the relational features erased whole categories of bugs. Foreign keys made dangling references impossible. A CHECK constraint and a normalized status column ended the five-shapes problem. A single transaction wrapped the payment-and-claim update so it could never half-happen. And a five-table join rendered the weekly schedule in milliseconds. They later added a single JSONB column (Chapter 16) for the genuinely free-form part of their data — clinical notes — getting document-style flexibility inside the relational database, exactly the synthesis this chapter described.

The lessons

Choose a database by the shape of your data and workload, not by fashion. "Is my data full of relationships and rules that must hold? Do I need transactions? Will I really exceed one big server?" Those questions — not a conference talk — should drive the decision. Part VI gives you a full framework (Chapter 37: The Database Decision).
"Schemaless" doesn't mean no schema — it means the schema lives in your application code, unenforced. The structure is still there; you've just moved responsibility for it from the database (which is very good at enforcing it) to every line of code you'll ever write (which is not).
The relational model is the right default. This is the chapter's sixth theme. Start relational; reach for something else when you have a concrete, demonstrated reason. Lumen had a hypothetical reason ("we might scale huge") and paid for it daily; the real reasons to go non-relational are specific and recognizable, and you'll learn to spot them.
You can have both. PostgreSQL's JSONB means the choice is rarely all-or-nothing. Model the relational core relationally, and use a flexible column for the genuinely unstructured parts. One well-understood database, used to its full power — theme #4.

Discussion questions

List three application domains where a document database genuinely is a strong default. What do they have in common that dental scheduling lacked?
The team celebrated "no migrations." What did that flexibility actually cost them six months later? Was it a net win or loss?
Explain in your own words why billing, specifically, needed transactions. What real-world harm came from not having them?
Lumen optimized for a scaling problem they didn't have. Why is "premature scaling" a common and expensive mistake? How could they have validated the real scale requirement first?
⭐ After migrating, they kept one JSONB column for clinical notes. Why is that a reasonable use of document-style storage inside a relational database, when embedding patient/provider data was not?