Case Study 2 — The Counter That Lied
The same idea as Case Study 1 — store a count to avoid an aggregation — but hand-maintained and under-maintained. Over time it drifted from the truth, and the wrong number drove a bad decision. This is denormalization's signature failure: redundancy you don't keep consistent becomes the very anomaly normalization was meant to prevent.
Background
A community platform showed each forum's topic_count on the forum list. Rather than COUNT(*) the topics every time (cheap, honestly — but they "optimized" prematurely), they added a denormalized column:
ALTER TABLE forums ADD COLUMN topic_count integer NOT NULL DEFAULT 0;
The plan: increment it in the application code whenever a topic is created. For a while it matched reality.
How it drifted
The redundant topic_count had to be updated by every code path that changed the number of topics — and over time, paths multiplied and some forgot:
- The normal "create topic" path incremented it. ✓
- A moderation tool that deleted spam topics didn't decrement it. ✗ (counts now too high)
- A bulk import that loaded archived topics inserted rows directly, bypassing the app increment. ✗ (counts too low)
- A "merge two forums" admin action moved topics between forums but didn't adjust either count. ✗
- A database-level cascade (deleting a banned user removed their topics via
ON DELETE CASCADE) bypassed the application entirely. ✗
Each gap was small, but they accumulated. Six months in, topic_count was wrong for most active forums — some off by dozens. Nobody noticed at first, because the number was plausible. Then leadership, looking at the forum list, decided to archive several "low-activity" forums based on small topic_count values — some of which were wrong (the counter had under-counted busy forums). Active communities were nearly shut down because of a lying counter.
What went wrong
The denormalized topic_count duplicated a fact (the number of topics) that's trivially derivable (COUNT(*) FROM topics WHERE forum_id = ?). Once duplicated, it had to be kept consistent across every path that affects topics — application code, admin tools, bulk jobs, and database cascades. Application-code maintenance can't see database-level changes (cascades) at all, and no human reliably updates a counter from a dozen scattered code paths. The copy drifted from the truth — the exact update anomaly from Chapter 19, self-inflicted by an unnecessary denormalization.
Worse, the denormalization wasn't even justified: COUNT(*) of topics per forum, with an index on topics(forum_id), is fast. They paid the drift risk to "optimize" a query that was never slow.
The fix
Two options, depending on whether the column was needed at all:
Option A — remove it (the right call here). The count was cheap to compute; the column existed for no measured reason. They dropped it and computed the count on the fly (or via a view), with an index on topics(forum_id):
ALTER TABLE forums DROP COLUMN topic_count;
-- forum list query:
SELECT f.forum_id, f.name, COUNT(t.topic_id) AS topic_count
FROM forums f LEFT JOIN topics t ON t.forum_id = f.forum_id
GROUP BY f.forum_id, f.name;
Option B — if a denormalized count were truly needed (e.g., billions of topics making COUNT(*) genuinely expensive), maintain it where all changes are visible: a trigger on topics (which fires for app writes, admin tools, bulk inserts, and cascade deletes alike), not scattered application code:
-- A trigger sees every INSERT/DELETE on topics, including cascades.
CREATE TRIGGER topics_count_maint
AFTER INSERT OR DELETE ON topics
FOR EACH ROW EXECUTE FUNCTION adjust_forum_topic_count();
They chose Option A — the simplest correct thing.
The analysis
-
Don't denormalize without a measured need. The count was never slow; the denormalization added drift risk for zero benefit. Premature denormalization is a real cost. Try (or just keep) the honest
COUNT(*)+ index first. -
If you duplicate a fact, you own keeping it consistent across every path. Application-code maintenance is fragile: it can't see database-level changes (cascades, bulk SQL) and relies on every developer remembering. The redundancy drifts the moment one path forgets.
-
A trigger sees what application code can't. If you must maintain a denormalized aggregate, a database trigger fires for all row changes — app writes, admin tools, bulk jobs, and cascade deletes — closing the gaps application code leaves. (At the cost of per-write overhead and hidden logic — document it.)
-
Plausible-but-wrong is dangerous. The drifted counter looked believable, so it went unquestioned and drove a real decision (archiving active forums). Derived values that can silently diverge should be reconciled or, better, not duplicated.
-
Contrast with Case Study 1. Same goal (avoid aggregation), opposite outcome. The difference: Case 1 used a managed matview (recomputed from truth) with a justified need and a consistency plan; Case 2 hand-maintained an unjustified column with no plan for all the write paths. Denormalization is only as good as its maintenance.
Discussion questions
- List the code paths that caused
topic_countto drift. Why couldn't application-code increments catch all of them? - Was the denormalization justified? What should the team have checked first?
- Why does a trigger close the gaps that scattered application code leaves open?
- Why is "plausible but wrong" especially dangerous for a derived metric?
- ⭐ Compare this case to Case Study 1 point by point. What three things did Case 1 do that Case 2 didn't?