Case Study 1: Raj's Function — From Silent to Self-Explaining

DataField.Dev

Case Study 1: Raj's Function — From Silent to Self-Explaining

A worked before/after. Raj Patel maintains chronoparse, a small open-source library that turns human date strings ("next Tuesday," "in 3 days") into machine dates. This is one function from its core, documented across one editing pass. Fictional but realistic; the code is the material being documented, not something you're asked to write.

The scene

It's a Tuesday, and Raj has reopened discount.py to fix a reported bug — a promotional date is resolving to the wrong day. The function in front of him is one he wrote eight months ago. He reads it twice and feels the specific dread of not understanding his own code:

def resolve(text, base=None, tz=None):
    text = text.strip().lower()
    if base is None:
        base = datetime.now()
    if tz is not None:
        base = base.astimezone(tz)
    for pattern, handler in PATTERNS:
        m = pattern.match(text)
        if m:
            r = handler(m, base)
            if r.hour == 0 and r.minute == 0:
                r = r.replace(hour=9)
            if r < base:
                r = r + timedelta(days=1)
            return r
    return None

He can read every line — Python is plain enough. What he can't read is the reasoning, and the reasoning is the whole problem. The function is a sequence of small decisions, each of which had a reason, and not one reason is on the screen:

Why lowercase and strip the text? Is that load-bearing or just tidy?
Why does a result with a midnight time get bumped to 9:00? Where did 9 come from?
Why does a result before base get pushed forward a day? Always? That looks like where the bug is — but is it a bug or a deliberate "next occurrence" rule?
Why return None on no match instead of raising?

There is exactly one comment in the file, three lines up from this function: # helper. It documents nothing. Raj has two options, and he resents both: spend an hour reverse-engineering his own intentions, or change something and hope. He's the author, and he's locked out.

This is the failure §24.1 names: the code is a complete record of what and a silent record of why, and eight months is more than enough for the why to evaporate. The fix isn't "add comments." It's to capture the why — and only the why — so that the next person to open this file (Raj, next quarter; a contributor, next week) can read and understand instead of guess.

The pass

Raj does what this chapter teaches, in order. First he reconstructs the whys (some from memory, some by re-deriving and then writing them down so he never has to again). Then he adds a docstring for callers, types the signature, and writes a why-comment at each decision — narrating nothing the code already says.

def resolve(text: str, base: datetime | None = None,
            tz: tzinfo | None = None) -> datetime | None:
    """Resolve a human date expression to a concrete datetime.

    Handles relative expressions ("tomorrow", "in 3 days") and named
    days ("next Friday"), always returning a *future* datetime relative
    to `base` — the library's promise that a parsed promo date never
    lands in the past.

    Args:
        text: The expression to resolve; case and surrounding whitespace
            are ignored.
        base: Reference point for relative expressions. Defaults to now.
        tz: If given, resolve in this timezone (promo times are local).

    Returns:
        A future datetime, or None if no pattern matches `text`.
    """
    # Patterns are authored lowercase and assume no padding, so callers
    # can pass raw user input like " Next Friday ".
    text = text.strip().lower()

    if base is None:
        base = datetime.now()

    # Resolve in the user's timezone when supplied: a "9am" promo must
    # mean 9am *local*, not 9am UTC. (This is the bug from #231 — see below.)
    if tz is not None:
        base = base.astimezone(tz)

    for pattern, handler in PATTERNS:
        m = pattern.match(text)
        if m:
            result = handler(m, base)

            # Bare dates ("next Friday") have no time, so handlers return
            # midnight. Default those to 9:00 — promos should fire during
            # business hours, not at 00:00. Explicit times are left as-is.
            if result.hour == 0 and result.minute == 0:
                result = result.replace(hour=9)

            # "Next occurrence" rule: if the resolved date is already past,
            # roll forward one day. This is what guarantees a future date —
            # and #231's bug was that the tz conversion above happened too
            # late, so this comparison ran against the wrong-zone `base`.
            if result < base:
                result = result + timedelta(days=1)

            return result

    # No pattern matched: return None so the caller can prompt the user to
    # rephrase, rather than crashing on unparseable promo copy.
    return None

What changed, and what didn't

The logic is almost untouched — Raj fixed one ordering bug (the timezone conversion), which he found only because writing the why-comment for the "next occurrence" rule forced him to articulate what base was supposed to be at that point, and the articulation exposed the mismatch. That is theme 1 in action: documenting the why made him think the why, and the thinking surfaced the bug. The documentation didn't just describe the code; it improved it.

Everything else is the chapter's toolkit applied cleanly:

The docstring gives a caller the contract — what it does, the future-date promise, the three parameters, the None return — so they can use resolve without reading the body (§24.3).
The typed signature (text: str, base: datetime | None, -> datetime | None) states the data shapes where a tool keeps them honest, and the | None return tells a caller failure is a None, not an exception (§24.4).
Every comment is a why. The lowercasing is load-bearing; the 9:00 default has a reason (business hours); the day-roll is the future-date rule; the None is a deliberate choice over crashing. There is no # strip the text, no # loop through patterns, no # return result — nothing that narrates the what (§24.2).

Count the comments: the original had one (# helper, useless). The documented version has five, every one load-bearing — more comments here, because this function is dense with non-obvious decisions, which is exactly when comments earn their place. Compare that to §24.2's first example, where documenting the why reduced the comment count. The number isn't the point; the load-bearing-ness is (§24.6).

The takeaway

The two versions are nearly the same code and completely different documents. The before locks out even its author after eight months; the after lets a stranger — including future-Raj — read it and understand every decision. The difference is not effort spent narrating; it's the why captured where the code couldn't hold it. And the bonus, which Raj didn't expect: writing the why is also how he found the bug he came to fix. Writing is thinking, even when what you're writing is a comment.

Related: Chapter 24 · Case Study 2 · Exercises · Key Takeaways