Case Study 2 — The Counter That Lied

The same idea as Case Study 1 — store a count to avoid an aggregation — but hand-maintained and under-maintained. Over time it drifted from the truth, and the wrong number drove a bad decision. This is denormalization's signature failure: redundancy you don't keep consistent becomes the very anomaly normalization was meant to prevent.

Background

A community platform showed each forum's topic_count on the forum list. Rather than COUNT(*) the topics every time (cheap, honestly — but they "optimized" prematurely), they added a denormalized column:

ALTER TABLE forums ADD COLUMN topic_count integer NOT NULL DEFAULT 0;

The plan: increment it in the application code whenever a topic is created. For a while it matched reality.

How it drifted

The redundant topic_count had to be updated by every code path that changed the number of topics — and over time, paths multiplied and some forgot:

  • The normal "create topic" path incremented it. ✓
  • A moderation tool that deleted spam topics didn't decrement it. ✗ (counts now too high)
  • A bulk import that loaded archived topics inserted rows directly, bypassing the app increment. ✗ (counts too low)
  • A "merge two forums" admin action moved topics between forums but didn't adjust either count. ✗
  • A database-level cascade (deleting a banned user removed their topics via ON DELETE CASCADE) bypassed the application entirely. ✗

Each gap was small, but they accumulated. Six months in, topic_count was wrong for most active forums — some off by dozens. Nobody noticed at first, because the number was plausible. Then leadership, looking at the forum list, decided to archive several "low-activity" forums based on small topic_count values — some of which were wrong (the counter had under-counted busy forums). Active communities were nearly shut down because of a lying counter.

What went wrong

The denormalized topic_count duplicated a fact (the number of topics) that's trivially derivable (COUNT(*) FROM topics WHERE forum_id = ?). Once duplicated, it had to be kept consistent across every path that affects topics — application code, admin tools, bulk jobs, and database cascades. Application-code maintenance can't see database-level changes (cascades) at all, and no human reliably updates a counter from a dozen scattered code paths. The copy drifted from the truth — the exact update anomaly from Chapter 19, self-inflicted by an unnecessary denormalization.

Worse, the denormalization wasn't even justified: COUNT(*) of topics per forum, with an index on topics(forum_id), is fast. They paid the drift risk to "optimize" a query that was never slow.

The fix

Two options, depending on whether the column was needed at all:

Option A — remove it (the right call here). The count was cheap to compute; the column existed for no measured reason. They dropped it and computed the count on the fly (or via a view), with an index on topics(forum_id):

ALTER TABLE forums DROP COLUMN topic_count;
-- forum list query:
SELECT f.forum_id, f.name, COUNT(t.topic_id) AS topic_count
FROM forums f LEFT JOIN topics t ON t.forum_id = f.forum_id
GROUP BY f.forum_id, f.name;

Option B — if a denormalized count were truly needed (e.g., billions of topics making COUNT(*) genuinely expensive), maintain it where all changes are visible: a trigger on topics (which fires for app writes, admin tools, bulk inserts, and cascade deletes alike), not scattered application code:

-- A trigger sees every INSERT/DELETE on topics, including cascades.
CREATE TRIGGER topics_count_maint
AFTER INSERT OR DELETE ON topics
FOR EACH ROW EXECUTE FUNCTION adjust_forum_topic_count();

They chose Option A — the simplest correct thing.

The analysis

  1. Don't denormalize without a measured need. The count was never slow; the denormalization added drift risk for zero benefit. Premature denormalization is a real cost. Try (or just keep) the honest COUNT(*) + index first.

  2. If you duplicate a fact, you own keeping it consistent across every path. Application-code maintenance is fragile: it can't see database-level changes (cascades, bulk SQL) and relies on every developer remembering. The redundancy drifts the moment one path forgets.

  3. A trigger sees what application code can't. If you must maintain a denormalized aggregate, a database trigger fires for all row changes — app writes, admin tools, bulk jobs, and cascade deletes — closing the gaps application code leaves. (At the cost of per-write overhead and hidden logic — document it.)

  4. Plausible-but-wrong is dangerous. The drifted counter looked believable, so it went unquestioned and drove a real decision (archiving active forums). Derived values that can silently diverge should be reconciled or, better, not duplicated.

  5. Contrast with Case Study 1. Same goal (avoid aggregation), opposite outcome. The difference: Case 1 used a managed matview (recomputed from truth) with a justified need and a consistency plan; Case 2 hand-maintained an unjustified column with no plan for all the write paths. Denormalization is only as good as its maintenance.

Discussion questions

  1. List the code paths that caused topic_count to drift. Why couldn't application-code increments catch all of them?
  2. Was the denormalization justified? What should the team have checked first?
  3. Why does a trigger close the gaps that scattered application code leaves open?
  4. Why is "plausible but wrong" especially dangerous for a derived metric?
  5. ⭐ Compare this case to Case Study 1 point by point. What three things did Case 1 do that Case 2 didn't?