Case Study 2 (Deep Dive): The Over-Documented File and the Comment That Lied
A contrasting scenario. Case Study 1 fixed the no-documentation end of the spectrum. This one walks the too-much end — the failure that disguises itself as diligence — and shows how over-documentation actively causes the worst kind of harm: a comment that lies. Composite and anonymized; the pattern is one every code reviewer has seen.
The setup
A new engineer, Priya, joins a payments team and is praised in her first week for "thorough documentation." Here is a representative function from her first pull request. Read it the way her reviewer should have:
def calculate_fee(amount, tier):
# initialize the fee to zero
fee = 0.0
# check if the tier is premium
if tier == "premium":
# premium users pay a 1% fee
fee = amount * 0.02
# check if the tier is standard
elif tier == "standard":
# standard users pay a 3% fee
fee = amount * 0.03
# otherwise the tier is basic
else:
# basic users pay a 5% fee
fee = amount * 0.05
# add the fixed processing charge
# 30 cents covers the card-network interchange floor; below this
# we lose money on the transaction. Do not remove. See #payments-77.
fee = fee + 0.30
# return the fee
return fee
At a glance it looks well-documented — a comment on nearly every line, nothing left unexplained. That impression is exactly the trap. This file is at the too-much end of the documentation spectrum (§24.6), and the over-documentation isn't neutral clutter; it's doing three kinds of harm at once.
Harm 1: Noise burying signal
Eight of the ten comments narrate the what — # initialize the fee to zero over fee = 0.0, # return the fee over return fee. Each adds nothing the code doesn't already say (§24.2). On their own they'd be merely useless. Together they do something worse: they bury the one comment that matters. The note about the 30-cent interchange floor — a genuine why, with a real warning ("Do not remove") and a ticket — is the single most important line in the file, and it's drowning in a sea of # check if the tier is premium. A reviewer (or a future maintainer) scanning ten comments, nine of which are noise, learns the rational lesson: the comments here aren't worth reading. So they skip all of them — including the one that says "remove this and we lose money on every transaction." Over-documentation didn't just waste space; it disabled the documentation that mattered, by training the reader to ignore the channel.
Harm 2: The comment that lies
Read the premium branch again:
# premium users pay a 1% fee
if tier == "premium":
fee = amount * 0.02
The comment says 1%. The code charges 2% (* 0.02). One of them is wrong, and a reader has no way to know which — maybe the rate was 1% when Priya wrote the comment and someone later bumped it to 2% without touching the comment (classic doc drift, §24.6), or maybe she simply mistyped. Either way, this is the failure the chapter calls worse than no comment at all: a reader who trusts the comment now believes premium users are charged 1% when they're charged 2%. With no comment, they'd have read the code (* 0.02) and known the truth. The comment doesn't aid understanding; it manufactures a false belief — and in a payments system, a false belief about a fee rate is the kind of thing that ends up in an incident report. This is why the chapter insists: every comment is a promise to keep it true, and the more comments you write — especially what-comments that shadow every line — the more promises you'll eventually break.
Harm 3: The obvious explained, the subtle skipped
Notice where Priya spent her documentation effort. Every trivial, self-evident line got a comment (they're easy to narrate). But the function has at least one genuinely subtle question that got no comment: why do the fee tiers differ by exactly these amounts, and is the order of operations right — should the fixed 30-cent charge be inside the percentage or added after? (Here it's added after, which is a real decision with billing consequences.) The over-documenter narrated all the easy whats and skipped the hard why, which is the precise inverse of where the effort should go. Voluminous and useless: everything obvious explained, the one thing worth explaining left silent.
The fix: cut to the calibrated center
The repair is subtraction, mostly — delete the narration, fix the lie, keep and surface the real why, and add the one missing why:
def calculate_fee(amount: float, tier: str) -> float:
"""Return the total fee for a charge of `amount` at the given tier.
Fee = a tier-based percentage of `amount` plus a fixed processing
charge. Tiers: premium (2%), standard (3%), basic (5%).
Args:
amount: The charge amount in dollars.
tier: One of "premium", "standard", "basic"; unknown tiers
are treated as basic.
Returns:
The total fee in dollars (percentage + fixed charge).
"""
rate = FEE_RATES.get(tier, FEE_RATES["basic"])
percentage_fee = amount * rate
# The fixed charge is added *after* the percentage, not folded into it:
# it covers the card-network interchange floor (30¢), which is a flat
# cost per transaction regardless of size. Below this we lose money.
# Do not remove. See #payments-77.
return percentage_fee + PROCESSING_CHARGE
with the rates lifted into named, self-documenting constants:
FEE_RATES = {"premium": 0.02, "standard": 0.03, "basic": 0.05}
PROCESSING_CHARGE = 0.30
What happened:
- The eight what-comments are gone. The code says all of that already (§24.2).
- The lie is gone — and it's gone structurally: the rates now live in one named place (
FEE_RATES), so there is no second copy of "1%/2%" to drift out of sync. The fix didn't just correct the comment; it removed the possibility of that drift (§24.4's coupled-documentation principle, §24.5's named constants). - The real why is preserved and now visible — the interchange-floor warning, no longer buried, reads loud because the noise around it is gone.
- The missing why is supplied — the comment now explains the after-not-folded ordering decision, the one genuinely subtle point.
- A docstring and types give callers the contract and the data shapes (§24.3–24.4).
The result has three comments where the original had ten — and it is far better documented, because comment count was never the measure. Each surviving comment earns its place with a why the code can't express; everything else became code, constants, or contract.
The takeaway
"Lots of comments" felt like diligence and was the opposite. The over-documented file buried its one crucial warning, shipped a comment that lied about a fee rate, and explained everything except the thing worth explaining. Both ends of the documentation spectrum are failures, and this is the sneakier one: no documentation looks like a problem, so people eventually fix it; too much documentation looks like a virtue, so it festers — comments rotting into lies while everyone congratulates the author for being thorough. The skill is the calibrated center: capture the why, trust the code with the what, and remember that every comment you write is a promise you'll have to keep.
Related: Chapter 24 · Case Study 1 · Exercises · Key Takeaways