Case Study 2 — The Polymorphic Comments That Pointed Nowhere
The "type + id" polymorphic shortcut is seductive — one table, relate to anything. But it quietly sacrifices the database's most valuable guarantee: foreign-key integrity. A platform learned this when its comments started pointing at records that no longer existed.
Background
A platform let users comment on several things: posts, photos, and videos. Rather than separate comment tables or a shared parent, the team used the popular polymorphic type+id pattern (common in some ORMs):
CREATE TABLE comments (
comment_id integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
commentable_type text NOT NULL, -- 'post' | 'photo' | 'video'
commentable_id integer NOT NULL, -- the id WITHIN that type's table
author_id integer NOT NULL REFERENCES users(user_id),
body text NOT NULL,
created_at timestamptz NOT NULL DEFAULT now()
);
It felt elegant: one comments table for everything. For a while it worked.
What went wrong: no foreign key possible
The fatal property of type+id: you cannot put a foreign key on commentable_id, because it references different tables depending on commentable_type. A FK must point to one table; this column points to three. So the database could not enforce that a comment's target actually exists. That guarantee — referential integrity, the thing Chapter 3 made central — was simply off for this relationship.
The consequences accumulated:
- Orphaned comments. When a
postwas deleted, nothing cascaded or blocked the delete (no FK), so its comments lingered withcommentable_type='post', commentable_id=<gone>. The app would try to load the post for those comments and crash or show errors. Thousands of orphans built up. - Type/id mismatches from bugs. A code path wrote
commentable_type='photo'with apost's id; nothing stopped it (no FK to validate), so a comment "belonged" to a photo that didn't exist while a real post was missing its comment. Undetectable at the database level. - No cascade, no
ON DELETEoptions. All the deletion logic had to be reimplemented in application code for each type — and, as in Chapter 20's drift story, some paths (admin tools, cascade deletes of users) forgot, creating more orphans. - Queries were awkward. "All comments with their target's title" needed a
CASE/UNIONacross three joins, because there's no single table to join.
The data slowly rotted — not because anyone was careless, but because the model removed the database's ability to keep it correct.
The fix: a shared parent (supertype) table
The team migrated to a model that restores foreign-key integrity: a shared commentable supertype table that every commentable thing references, and comments point to it:
CREATE TABLE commentables ( -- the supertype
commentable_id integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
kind text NOT NULL CHECK (kind IN ('post','photo','video'))
);
-- each concrete type references its commentable row 1:1
ALTER TABLE posts ADD COLUMN commentable_id integer NOT NULL REFERENCES commentables(commentable_id);
ALTER TABLE photos ADD COLUMN commentable_id integer NOT NULL REFERENCES commentables(commentable_id);
-- comments point to the supertype — a REAL foreign key:
CREATE TABLE comments (
comment_id integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
commentable_id integer NOT NULL REFERENCES commentables(commentable_id) ON DELETE CASCADE,
author_id integer NOT NULL REFERENCES users(user_id),
body text NOT NULL
);
Now comments.commentable_id is a genuine foreign key: a comment cannot reference a nonexistent target, deleting a commentable cascades to its comments, and "all comments" is one simple join to commentables. Integrity is back, enforced by the database.
(An alternative they considered — separate junction tables per type (post_comments, photo_comments) — also restores FKs, at the cost of more tables and a UNION for "all comments." Either is defensible; both beat type+id on integrity.)
The analysis
-
type+id polymorphism trades away foreign-key integrity. One column can't reference three tables, so no FK is possible, and the database can't prevent orphans or mismatches. That's not a minor caveat — it's giving up the core guarantee relational databases exist to provide.
-
"The ORM does it" isn't a reason. Several popular ORMs default to type+id polymorphism for convenience. Convenient ≠ correct; know that adopting it means you must enforce integrity in application code (imperfectly), forever.
-
Restore integrity with a shared parent or per-type tables. Both the supertype (
commentables) and the per-type-junction approaches give real foreign keys, cascades, and easy "all comments" queries. Pick by taste; either beats type+id where integrity matters. -
Recognize the pattern and choose deliberately. The failure wasn't choosing type+id per se — it's a defensible choice when integrity is enforced elsewhere and simplicity is paramount. The failure was choosing it unaware it dropped referential integrity, and then being surprised by orphans. Decide with eyes open.
-
Echo of Chapter 3. Foreign keys make "a comment for a target that doesn't exist" unrepresentable. type+id re-opened that door; the fix closed it again. Whenever you can't add a foreign key, ask hard whether you should restructure so you can.
Discussion questions
- Why is a foreign key impossible on
commentable_idin the type+id model? - List the concrete data-integrity problems that resulted.
- How does the shared-parent (
commentables) model restore foreign-key integrity and cascades? - What's the alternative per-type-junction approach, and its trade-off?
- ⭐ When is type+id polymorphism a reasonable choice? State the conditions under which you'd accept the lost integrity.