Case Study 39-B: Maya Catches a Bug Her Tests Were Waiting to Find

Background

Maya Reyes has been running her invoicing module in production for almost a year. It generates PDF invoices, sends them to clients, and logs payments to her SQLite database. In that time, she has sent 340 invoices totaling roughly $218,000 in billings. The system has never crashed, the emails have always arrived, and every client has received a correct invoice.

Or so she thought.

The Setup: Adding Tests to Existing Code

When Maya reads Chapter 39, her first reaction is mild defensiveness. Her code works. She's been using it for a year. What exactly is she supposed to be testing?

The chapter's framing changes her mind: tests aren't for code that has always worked — they're for code that will face inputs it has never seen.

She decides to add tests to her invoicing module. She creates a virtual environment, installs pytest, and writes her first test file: tests/test_invoice_calculations.py.

She starts with the function she trusts most — calculate_invoice_total:

# From maya_invoices/calculations.py (original, from Chapter 16)

def calculate_invoice_total(
    hours: float,
    hourly_rate: float,
    expenses: float = 0.0,
    discount_rate: float = 0.0,
) -> float:
    """Calculate the total amount due on an invoice."""
    subtotal = (hours * hourly_rate) + expenses
    discount_amount = subtotal * discount_rate
    return subtotal - discount_amount

Her test:

import pytest

from maya_invoices.calculations import calculate_invoice_total


def test_invoice_total_standard():
    """Standard case: 8 hours at $175/hr, no expenses, no discount."""
    result = calculate_invoice_total(hours=8, hourly_rate=175.00)
    assert result == pytest.approx(1_400.00, rel=1e-6)


def test_invoice_total_with_expenses():
    """10 hours at $175/hr plus $85 in travel expenses."""
    result = calculate_invoice_total(
        hours=10, hourly_rate=175.00, expenses=85.00
    )
    assert result == pytest.approx(1_835.00, rel=1e-6)

She runs pytest. Both pass. She feels validated.

Then she writes the discount test.

The Moment of Discovery

def test_invoice_total_with_10_percent_discount():
    """
    10% client discount on 8 hours at $175/hr.
    Expected: $1,400 * (1 - 0.10) = $1,260
    """
    result = calculate_invoice_total(
        hours=8, hourly_rate=175.00, discount_rate=0.10
    )
    assert result == pytest.approx(1_260.00, rel=1e-6)

She runs pytest.

FAILED tests/test_invoice_calculations.py::test_invoice_total_with_10_percent_discount

AssertionError: assert 1260.0 == approx(1260.0 ± 1.3e-03)

Wait. The values look the same. She reads more carefully:

E       assert 1274.0 == approx(1260.0 ± 1.3e-03)

The function returned 1274.0. She expected 1260.0. The difference is $14.00.
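The ± 1.3e-03 in that message is the tolerance pytest.approx derived from rel=1e-6. As a rough sketch of the comparison (the helper name within_tolerance is made up for illustration, and the 1e-12 absolute floor reflects pytest's documented default rather than anything in Maya's code), the check amounts to:

```python
def within_tolerance(actual: float, expected: float,
                     rel: float = 1e-6, abs_tol: float = 1e-12) -> bool:
    """Hypothetical stand-in for the float comparison pytest.approx performs."""
    # The allowed gap is the larger of the relative and absolute tolerances.
    tolerance = max(rel * abs(expected), abs_tol)
    return abs(actual - expected) <= tolerance


expected = 1_260.00
returned = 1_274.00

allowed_gap = max(1e-6 * abs(expected), 1e-12)  # about 0.00126, displayed as 1.3e-03
actual_gap = abs(returned - expected)           # the $14.00 difference

print(f"allowed gap: {allowed_gap:.2e}")
print(f"actual gap:  {actual_gap:.2f}")
print(within_tolerance(returned, expected))
```

A $14.00 gap is roughly four orders of magnitude larger than the tolerance, so floating-point rounding cannot explain it. Something in the calculation itself is wrong.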

Maya stares at this for a full minute.

She opens calculations.py and reads the function again:

def calculate_invoice_total(
    hours: float,
    hourly_rate: float,
    expenses: float = 0.0,
    discount_rate: float = 0.0,
) -> float:
    subtotal = (hours * hourly_rate) + expenses
    discount_amount = subtotal * discount_rate
    return subtotal - discount_amount

She works through the math manually for her test case:

- hours = 8, rate = $175
- subtotal = 8 × 175 + 0 = $1,400
- discount_amount = $1,400 × 0.10 = $140
- total = $1,400 - $140 = $1,260 ✓

The formula is correct. So why did the function return 1274.0?

She adds a print statement temporarily:

def calculate_invoice_total(hours, hourly_rate, expenses=0.0, discount_rate=0.0):
    subtotal = (hours * hourly_rate) + expenses
    print(f"subtotal={subtotal}, discount_rate={discount_rate}")
    discount_amount = subtotal * discount_rate
    print(f"discount_amount={discount_amount}")
    return subtotal - discount_amount

She runs the test again:

subtotal=1400.0, discount_rate=0.1
discount_amount=140.0

Everything looks right. The function returns... 1274.0? That's impossible.

Then she looks at her test again. More carefully.

She called calculate_invoice_total(hours=8, hourly_rate=175.00, discount_rate=0.10).

She opens her module and finds a second function she forgot she'd written:

def calculate_invoice_total_with_tax(
    hours: float,
    hourly_rate: float,
    expenses: float = 0.0,
    discount_rate: float = 0.0,
    tax_rate: float = 0.0,
) -> float:
    """Calculate invoice total including applicable sales tax."""
    subtotal = (hours * hourly_rate) + expenses
    discount_amount = subtotal * discount_rate
    discounted_subtotal = subtotal - discount_amount
    tax_amount = discounted_subtotal * tax_rate
    return discounted_subtotal + tax_amount

And she finds it — a line she added three months ago when a client in a state with sales tax needed invoices:

# Added 2025-11-14 — client in CA needs tax applied before discount
def calculate_invoice_total(hours, hourly_rate, expenses=0.0, discount_rate=0.0):
    subtotal = (hours * hourly_rate) + expenses
    discount_amount = subtotal * discount_rate
    # BUG IS HERE: should be subtotal - discount, not subtotal - discount + tax_adjustment
    return subtotal - discount_amount + (subtotal * 0.01)  # temp adjustment

There it is. Buried in her file, three months old, is a line she added as a temporary fix for a California client and never removed: + (subtotal * 0.01). One percent added to every invoice. Silently. For three months.
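The case study does not say exactly how the file was laid out, but one common way an edit like this hides in a long module is a second def with the same name: Python rebinds the name, and the later definition quietly replaces the earlier one. The sketch below is an illustration under that assumption, not something Maya is described as running. SAMPLE_SOURCE is a made-up stand-in for such a module, and duplicate_function_names is a hypothetical helper built on the standard-library ast module.

```python
import ast
from collections import Counter

# Hypothetical stand-in for a module that defines the same function twice.
SAMPLE_SOURCE = '''
def calculate_invoice_total(hours, hourly_rate, expenses=0.0, discount_rate=0.0):
    subtotal = (hours * hourly_rate) + expenses
    discount_amount = subtotal * discount_rate
    return subtotal - discount_amount


def calculate_invoice_total(hours, hourly_rate, expenses=0.0, discount_rate=0.0):
    subtotal = (hours * hourly_rate) + expenses
    discount_amount = subtotal * discount_rate
    return subtotal - discount_amount + (subtotal * 0.01)  # temp adjustment
'''


def duplicate_function_names(source: str) -> list[str]:
    """Return top-level function names that are defined more than once."""
    tree = ast.parse(source)
    counts = Counter(
        node.name
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    )
    return sorted(name for name, count in counts.items() if count > 1)


# Executing the source binds the name twice; the second def replaces the first.
namespace: dict = {}
exec(SAMPLE_SOURCE, namespace)
total = namespace["calculate_invoice_total"](8, 175.00, discount_rate=0.10)

print(duplicate_function_names(SAMPLE_SOURCE))
print(total)  # 1,400 - 140 + 14: the surcharged figure, not 1,260
```

Linters generally flag redefinitions like this as well. The point of the sketch is only that the check is cheap and can run before any invoice goes out.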

The Scope of the Problem

Maya pulls up her invoice log. She sorts by date.

The temp adjustment line was added on November 14, 2025. She has sent 47 invoices since then. She calculates the overcharges:

# Quick analysis in the Python REPL

invoices_since_nov = [
    # (hours, rate, expenses)
    (12.5, 175.00, 0.00),
    (8.0, 175.00, 120.00),
    # ... 45 more rows
]

overcharges = []
for hours, rate, expenses in invoices_since_nov:
    subtotal = (hours * rate) + expenses
    overcharge = subtotal * 0.01  # the 1% that shouldn't be there
    overcharges.append(overcharge)

print(f"Total overcharged: ${sum(overcharges):,.2f}")

Output: Total overcharged: $1,847.22

Forty-seven invoices were overcharged a combined $1,847.22. The individual amounts ranged from $14 to $67 per invoice: small enough that no client complained, large enough that it is a real problem.

The Fix and the Aftermath

Maya fixes the function immediately:

def calculate_invoice_total(
    hours: float,
    hourly_rate: float,
    expenses: float = 0.0,
    discount_rate: float = 0.0,
) -> float:
    """Calculate the total amount due on an invoice.

    Args:
        hours: Billable hours for this invoice.
        hourly_rate: Rate per hour in dollars.
        expenses: Reimbursable expenses in dollars. Default 0.
        discount_rate: Client discount as decimal (e.g., 0.10 for 10%). Default 0.

    Returns:
        Total amount due in dollars.
    """
    subtotal = (hours * hourly_rate) + expenses
    discount_amount = subtotal * discount_rate
    return subtotal - discount_amount

She runs the test suite. All tests pass. She commits with the message:

Fix: remove accidental 1% surcharge added Nov 14 for CA tax workaround

Bug caused 47 invoices since 2025-11-14 to be overcharged by ~1%. Total overcharge: $1,847.22 across 11 affected clients. Will issue credit memos to each client this week. Root cause: temporary tax adjustment added directly to production function, not tracked in a feature branch, not tested.

Then she emails each of the 11 affected clients, explains the error, and issues credit memos. Most clients respond graciously. One client, a small nonprofit she genuinely likes, responds: "I was wondering about that. Thank you for the integrity to catch it and tell me."
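Credit memos go to clients, not to individual invoices, so the per-invoice overcharges have to be grouped. The sketch below shows one way that might look. The client labels are invented placeholders, the rows reuse figures that already appear in this case study, and rounding each overcharge to the cent with ROUND_HALF_UP is an assumption about how the amounts were billed.

```python
from collections import defaultdict
from decimal import Decimal, ROUND_HALF_UP

SURCHARGE_RATE = Decimal("0.01")  # the 1% that should not have been billed
CENT = Decimal("0.01")

# Placeholder rows: client labels are hypothetical; the hours, rates, and
# expenses reuse sample figures from the case study, not Maya's real log.
affected_invoices = [
    # (client, hours, rate, expenses)
    ("client-a", "12.5", "175.00", "0.00"),
    ("client-a", "8.0", "175.00", "120.00"),
    ("client-b", "8.0", "175.00", "0.00"),
]


def credits_by_client(rows):
    """Sum the per-invoice overcharge for each client, rounded to the cent."""
    credits = defaultdict(Decimal)  # Decimal() starts each client at 0
    for client, hours, rate, expenses in rows:
        subtotal = Decimal(hours) * Decimal(rate) + Decimal(expenses)
        overcharge = (subtotal * SURCHARGE_RATE).quantize(
            CENT, rounding=ROUND_HALF_UP
        )
        credits[client] += overcharge
    return dict(credits)


for client, credit in sorted(credits_by_client(affected_invoices).items()):
    print(f"{client}: ${credit:,.2f}")
```

Decimal is used here instead of float because these figures end up on documents sent to clients, where a stray fraction of a cent is its own small bug.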

What the Tests Changed

Maya thinks about this carefully. The bug had been running for 94 days. In that time:

- Zero clients complained
- Zero reports showed an anomaly
- Zero system alerts fired
- She noticed nothing in her monthly revenue summaries because total billings had been trending up anyway

The bug was invisible. It would have remained invisible indefinitely.

The test found it in 0.08 seconds.

She writes a reflection in her project notes:

"Tests are not a burden on top of writing code. Tests are the only way I would have found this. I was overcharging clients for three months and didn't know it. The code 'worked' in the sense that it ran without errors. It did not work in the sense that it produced correct output.

From now on: every calculation function gets tests before I deploy. If I add a 'temporary' fix, I add it in a branch, I test it, and I document it properly. If I'm not willing to write a test for a temporary fix, I shouldn't add the fix."

The Improved Module

After the fix, Maya refactors her test file to be comprehensive — covering the cases she wishes she'd written before:

class TestCalculateInvoiceTotal:
    def test_hourly_only(self):
        assert calculate_invoice_total(8, 175.00) == pytest.approx(1_400.00)

    def test_with_expenses(self):
        assert calculate_invoice_total(8, 175.00, expenses=85.00) == pytest.approx(1_485.00)

    def test_with_discount(self):
        assert calculate_invoice_total(8, 175.00, discount_rate=0.10) == pytest.approx(1_260.00)

    def test_with_expenses_and_discount(self):
        # Discount applies to the full subtotal (labor plus expenses)
        result = calculate_invoice_total(10, 175.00, expenses=200.00, discount_rate=0.15)
        # subtotal = 1750 + 200 = 1950; discount = 292.50; total = 1657.50
        assert result == pytest.approx(1_657.50, rel=1e-6)

    def test_zero_hours(self):
        # Expenses-only invoice (retainer or reimbursement)
        assert calculate_invoice_total(0, 175.00, expenses=500.00) == pytest.approx(500.00)

    def test_no_discount_means_full_subtotal(self):
        result = calculate_invoice_total(10, 175.00, discount_rate=0.0)
        assert result == pytest.approx(1_750.00)

    def test_result_does_not_include_percentage_of_subtotal(self):
        """
        Regression test for the Nov 14 bug: the 1% surcharge.
        Total must equal exactly (hours * rate - discount), nothing more.
        """
        hours, rate = 10.0, 175.00
        expected = hours * rate  # no discount, no extras
        result = calculate_invoice_total(hours=hours, hourly_rate=rate)
        assert result == pytest.approx(expected, rel=1e-9)

That last test — the regression test — is named and documented explicitly. If anyone ever accidentally re-adds a percentage calculation to this function, that test will fail immediately. The November incident is now encoded into the test suite permanently.
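The regression test above checks a single input with no discount. A broader guard, sketched here as one possibility rather than as part of Maya's suite, sweeps a small grid of hours, expenses, and discounts and checks every total against the formula. The grid values are arbitrary choices, the function is copied in so the sketch runs on its own, and math.isclose stands in for pytest.approx so that nothing beyond the standard library is needed.

```python
import math
from itertools import product


def calculate_invoice_total(hours, hourly_rate, expenses=0.0, discount_rate=0.0):
    """Copy of the fixed function so this sketch runs on its own."""
    subtotal = (hours * hourly_rate) + expenses
    discount_amount = subtotal * discount_rate
    return subtotal - discount_amount


def test_total_matches_formula_across_inputs():
    """Total must equal subtotal * (1 - discount_rate) for every combination."""
    hours_values = [0, 1.5, 8, 12.5, 40]
    expense_values = [0.0, 85.00, 500.00]
    discount_values = [0.0, 0.10, 0.15]
    for hours, expenses, discount in product(
        hours_values, expense_values, discount_values
    ):
        subtotal = hours * 175.00 + expenses
        expected = subtotal * (1 - discount)
        result = calculate_invoice_total(hours, 175.00, expenses, discount)
        # Report the offending inputs if any combination drifts from the formula.
        assert math.isclose(result, expected, rel_tol=1e-9, abs_tol=1e-9), (
            hours, expenses, discount, result, expected
        )


if __name__ == "__main__":
    test_total_matches_formula_across_inputs()
    print("all combinations match the formula")
```

Because the function is named test_..., pytest would collect it like any other test. A 1% surcharge on the subtotal would fail every combination with a nonzero subtotal.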

Lessons for the Reader

Maya's story illustrates three things about testing that the theory doesn't fully capture:

Tests find bugs that humans don't. Maya checked her invoices. She looked at her summaries. She reviewed her module code at least twice in three months. She never saw the bug. The test found it instantly.

The bugs most worth testing are the ones you're most confident don't exist. Maya trusted this function. She'd been using it for a year. That confidence is exactly why the bug persisted — she wasn't looking for it.

Regression tests encode institutional memory. The test she named test_result_does_not_include_percentage_of_subtotal does not just test the current code. It documents a specific failure mode that happened in November 2025. As long as that test exists, that bug cannot silently return.