Case Study 2 — The Report That Quietly Lost a Third of the Revenue
Three-valued logic isn't an academic curiosity — it causes real, expensive, silent bugs. This is one of the most common: a perfectly reasonable-looking query that excludes
NULLrows without anyone noticing, and reports confident, wrong numbers.
Background
An online retailer (not unlike Mercado) ran a weekly "revenue by sales channel" report for its executives. Orders had a channel column indicating where the order originated: 'web', 'mobile', 'phone', or — for orders placed directly in a store, before the in-store system was integrated — NULL. About a third of all orders, the legacy in-store ones, had channel = NULL.
The analyst who built the report wanted "all non-web revenue" for one section. She wrote what looks like an obviously correct filter:
-- Intended: every order that did NOT come from the web.
SELECT SUM(amount) AS non_web_revenue
FROM orders_with_totals
WHERE channel <> 'web';
The number it returned was used in board decks for months. It was wrong by roughly a third — and nobody caught it, because there was no error, no warning, no crash. Just a confident, incorrect total.
What went wrong
The culprit is three-valued logic. For the legacy in-store orders, channel is NULL, so the condition becomes:
NULL <> 'web' → NULL (unknown, NOT true)
A WHERE clause keeps only rows where the condition is exactly true. NULL (unknown) is treated like false, so every legacy in-store order was silently excluded from "non-web revenue" — even though those orders are the very definition of non-web. The analyst's mental model was "<> 'web' means everything that isn't web," but in SQL it means "everything that is known not to be web." Unknown channels fell through the crack.
The same hazard lurks in several everyday patterns:
-- All of these silently drop NULL-channel rows:
WHERE channel <> 'web'
WHERE channel NOT IN ('web', 'mobile')
WHERE NOT (channel = 'web')
And the mirror-image surprise bites aggregates, too. A "count of orders per channel" using COUNT(channel) undercounts, because COUNT(column) skips NULLs:
SELECT COUNT(*) AS all_orders, -- counts every row
COUNT(channel) AS known_channel -- skips NULL channels
FROM orders;
The fix
Once the cause was understood, the corrections were small — and the lesson was large. To include the unknown-channel orders explicitly:
-- Correct: non-web means web is false OR channel is unknown.
SELECT SUM(amount) AS non_web_revenue
FROM orders_with_totals
WHERE channel <> 'web' OR channel IS NULL;
Or, better for the long term, eliminate the ambiguity at the source. The legacy NULL channel was really a known category — "in-store" — that had simply never been recorded. The team backfilled those rows with an explicit value and made the column NOT NULL going forward:
UPDATE orders SET channel = 'in_store' WHERE channel IS NULL;
ALTER TABLE orders ALTER COLUMN channel SET NOT NULL;
ALTER TABLE orders ADD CONSTRAINT channel_known
CHECK (channel IN ('web','mobile','phone','in_store'));
Now "non-web" is unambiguous, the report is correct, and — crucially — the next analyst can't make the same mistake, because there are no NULLs to fall through and a CHECK constraint documents the allowed values.
The analysis
-
NULLpropagates silently. There was no error to catch. The query was valid SQL; it just answered a subtly different question than intended. Silent wrongness is the most dangerous kind of bug, because nothing prompts you to look. -
"Not equal" doesn't include unknown. Internalize this: any condition comparing a possibly-null column with
=,<>,IN, orNOT INwill exclude theNULLrows unless you handle them explicitly withIS NULL/IS NOT NULL. Whenever you filter on a column, ask first: can this column be null? -
Aggregates skip
NULLs.COUNT(col),AVG(col),SUM(col)all ignore nulls. Usually that's what you want (averaging known values) — but know that it's happening, and useCOUNT(*)when you mean "how many rows." -
The deepest fix is design. The bug existed because a known category ("in-store") was stored as the unknown marker
NULL.NULLshould mean genuinely unknown or inapplicable — not "a value we didn't bother to record." Backfilling the real value and addingNOT NULL+CHECKmade an entire class of error impossible. This is theme #1 again: good design makes mistakes unrepresentable.
Discussion questions
- Walk through, step by step, why
WHERE channel <> 'web'excluded the in-store orders. What doesNULL <> 'web'evaluate to, and how doesWHEREtreat it? - Rewrite "orders not from web or mobile" correctly for a nullable
channelcolumn. - Explain the difference between
COUNT(*)andCOUNT(channel). When is each the right choice? - The team's final fix backfilled
NULLto'in_store'and addedNOT NULL+CHECK. Argue why this is better than just fixing the one report query. - ⭐ Propose a code-review checklist item that would catch this class of bug before it reaches production. (Hint: it's a question to ask about every
WHEREand every aggregate.)