Case Study 2 — The Report That Quietly Lost a Third of the Revenue

Three-valued logic isn't an academic curiosity — it causes real, expensive, silent bugs. This is one of the most common: a perfectly reasonable-looking query that excludes NULL rows without anyone noticing, and reports confident, wrong numbers.

Background

An online retailer (not unlike Mercado) ran a weekly "revenue by sales channel" report for its executives. Orders had a channel column indicating where the order originated: 'web', 'mobile', 'phone', or — for orders placed directly in a store, before the in-store system was integrated — NULL. About a third of all orders, the legacy in-store ones, had channel = NULL.

The analyst who built the report wanted "all non-web revenue" for one section. She wrote what looks like an obviously correct filter:

-- Intended: every order that did NOT come from the web.
SELECT SUM(amount) AS non_web_revenue
FROM orders_with_totals
WHERE channel <> 'web';

The number it returned was used in board decks for months. It was wrong by roughly a third — and nobody caught it, because there was no error, no warning, no crash. Just a confident, incorrect total.

What went wrong

The culprit is three-valued logic. For the legacy in-store orders, channel is NULL, so the condition becomes:

   NULL <> 'web'   →   NULL   (unknown, NOT true)

A WHERE clause keeps only rows where the condition is exactly true. NULL (unknown) is treated like false, so every legacy in-store order was silently excluded from "non-web revenue" — even though those orders are the very definition of non-web. The analyst's mental model was "<> 'web' means everything that isn't web," but in SQL it means "everything that is known not to be web." Unknown channels fell through the crack.

The same hazard lurks in several everyday patterns:

-- All of these silently drop NULL-channel rows:
WHERE channel <> 'web'
WHERE channel NOT IN ('web', 'mobile')
WHERE NOT (channel = 'web')

And the mirror-image surprise bites aggregates, too. A "count of orders per channel" using COUNT(channel) undercounts, because COUNT(column) skips NULLs:

SELECT COUNT(*)      AS all_orders,     -- counts every row
       COUNT(channel) AS known_channel  -- skips NULL channels
FROM orders;

The fix

Once the cause was understood, the corrections were small — and the lesson was large. To include the unknown-channel orders explicitly:

-- Correct: non-web means web is false OR channel is unknown.
SELECT SUM(amount) AS non_web_revenue
FROM orders_with_totals
WHERE channel <> 'web' OR channel IS NULL;

Or, better for the long term, eliminate the ambiguity at the source. The legacy NULL channel was really a known category — "in-store" — that had simply never been recorded. The team backfilled those rows with an explicit value and made the column NOT NULL going forward:

UPDATE orders SET channel = 'in_store' WHERE channel IS NULL;
ALTER TABLE orders ALTER COLUMN channel SET NOT NULL;
ALTER TABLE orders ADD CONSTRAINT channel_known
  CHECK (channel IN ('web','mobile','phone','in_store'));

Now "non-web" is unambiguous, the report is correct, and — crucially — the next analyst can't make the same mistake, because there are no NULLs to fall through and a CHECK constraint documents the allowed values.

The analysis

  1. NULL propagates silently. There was no error to catch. The query was valid SQL; it just answered a subtly different question than intended. Silent wrongness is the most dangerous kind of bug, because nothing prompts you to look.

  2. "Not equal" doesn't include unknown. Internalize this: any condition comparing a possibly-null column with =, <>, IN, or NOT IN will exclude the NULL rows unless you handle them explicitly with IS NULL / IS NOT NULL. Whenever you filter on a column, ask first: can this column be null?

  3. Aggregates skip NULLs. COUNT(col), AVG(col), SUM(col) all ignore nulls. Usually that's what you want (averaging known values) — but know that it's happening, and use COUNT(*) when you mean "how many rows."

  4. The deepest fix is design. The bug existed because a known category ("in-store") was stored as the unknown marker NULL. NULL should mean genuinely unknown or inapplicable — not "a value we didn't bother to record." Backfilling the real value and adding NOT NULL + CHECK made an entire class of error impossible. This is theme #1 again: good design makes mistakes unrepresentable.

Discussion questions

  1. Walk through, step by step, why WHERE channel <> 'web' excluded the in-store orders. What does NULL <> 'web' evaluate to, and how does WHERE treat it?
  2. Rewrite "orders not from web or mobile" correctly for a nullable channel column.
  3. Explain the difference between COUNT(*) and COUNT(channel). When is each the right choice?
  4. The team's final fix backfilled NULL to 'in_store' and added NOT NULL + CHECK. Argue why this is better than just fixing the one report query.
  5. ⭐ Propose a code-review checklist item that would catch this class of bug before it reaches production. (Hint: it's a question to ask about every WHERE and every aggregate.)