Case Study 1: Priya's Monday Morning Crisis
The Report That Ran for Two Years Without a Problem
Every Monday at 6:45 AM, a Python script on Acme Corp's shared drive sprang to life. It read four CSV files — one for each sales region — calculated totals and trends, and sent Sandra Chen a formatted summary before she poured her first coffee. Priya Okonkwo had written it during her first month at Acme, and for nearly two years, it ran without incident.
Until the Monday that Western Region's data export started including a new field, and the ETL job that generated the file started writing "N/A" into the revenue column for transactions that were pending reconciliation.
```
Traceback (most recent call last):
  File "weekly_report.py", line 58, in <module>
    revenue = float(row['revenue'])
ValueError: could not convert string to float: 'N/A'
```
The script stopped at the first bad row in the Western file. The Eastern, Central, and Southern reports were never processed. Sandra got nothing.
The Original Script
Here is what the script looked like when Priya reviewed it:
```python
# weekly_report.py (original version — brittle)
import csv
from pathlib import Path

REGIONS = ["western", "eastern", "central", "southern"]
DATA_DIR = Path("C:/AcmeData/weekly")

totals = {}
for region in REGIONS:
    filepath = DATA_DIR / f"{region}_sales.csv"
    region_revenue = 0.0
    row_count = 0
    with open(filepath) as f:  # Crashes if file missing
        reader = csv.DictReader(f)
        for row in reader:
            revenue = float(row['revenue'])  # Crashes on 'N/A'
            region_revenue += revenue
            row_count += 1
    totals[region] = {
        "revenue": region_revenue,
        "rows": row_count,
    }

# Build and print the report
print("=== WEEKLY SALES SUMMARY ===")
for region, data in totals.items():
    print(f"{region.capitalize():10s} ${data['revenue']:>12,.2f} ({data['rows']} transactions)")
```
Priya identified three failure points:
- `open(filepath)` — crashes with `FileNotFoundError` if any region's export did not run
- `float(row['revenue'])` — crashes with `ValueError` on "N/A", empty strings, or any non-numeric value
- `row['revenue']` — crashes with `KeyError` if the column is renamed or missing
Any one of these stops all processing, not just the problem region.
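Each failure mode can be reproduced in isolation. A minimal sketch (the file name and the row dict below are invented for illustration):

```python
# Reproduce the three failure modes from the original script in isolation.
def classify_failure(action):
    """Run an action and return the exception class name, or 'ok'."""
    try:
        action()
        return "ok"
    except (FileNotFoundError, ValueError, KeyError) as e:
        return type(e).__name__

# 1. Missing file (export never ran)
missing_file = classify_failure(lambda: open("no_such_region_sales.csv"))

# 2. Non-numeric revenue value
bad_value = classify_failure(lambda: float("N/A"))

# 3. Renamed or missing column
renamed_row = {"rev": "100.00"}  # 'revenue' column was renamed upstream
missing_key = classify_failure(lambda: renamed_row["revenue"])
```

Each of these surfaces as an unhandled exception at a different line, but the effect is identical: the whole run dies.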
The Diagnosis: Total Failure vs. Partial Success
Priya sat down with Marcus Webb to think through what the script's behavior should be when things go wrong.
"The real question," Marcus said, "is what's more useful to Sandra: no report at all, or a report that says three of four regions worked and here's why the fourth is missing?"
"Obviously the partial report," Priya said.
"Then the script needs to treat each region independently. An error in Western shouldn't affect Eastern, Central, or Southern."
They sketched out the logic:
- If a file is missing: log it as an error, record zero for that region, continue
- If a file has bad rows: log each bad row, skip it, continue processing the rest
- At the end: produce a report that explicitly flags which regions had issues
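That log-and-continue shape can be sketched independently of the full script (`summarize` and the stand-in loader below are illustrations, not part of Priya's code):

```python
import logging

logger = logging.getLogger(__name__)

REGIONS = ["western", "eastern", "central", "southern"]

def summarize(load_region):
    """Log-and-continue over regions: a failure in one region is
    recorded as zero and never blocks the others."""
    totals = {}
    for region in REGIONS:
        try:
            totals[region] = load_region(region)
        except Exception as e:
            logger.error("%s failed (%s): recording zero and continuing", region, e)
            totals[region] = 0.0
            continue  # on to the next region regardless
    return totals

# western raises KeyError in the stand-in loader; the other three still report
fake_exports = {"eastern": 10.0, "central": 20.0, "southern": 30.0}
totals = summarize(lambda region: fake_exports[region])
```

The `try`/`except` sits inside the loop body, so the error boundary is exactly one region wide.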
The Refactored Script
```python
# weekly_report.py (refactored — robust)
"""
Weekly Sales Report Generator
Acme Corp — Data Team

Processes four regional CSV files and generates a Monday morning summary.
Logs errors and produces a partial report rather than crashing completely
when one or more files have problems.

Usage:
    python weekly_report.py

Output:
    weekly_report.log — full audit trail
    Console — summary for review
"""
import csv
import logging
from dataclasses import dataclass, field
from datetime import datetime
from pathlib import Path

# ── Logging setup ──────────────────────────────────────────────────────────
logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s | %(levelname)-8s | %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
    handlers=[
        logging.StreamHandler(),
        logging.FileHandler("weekly_report.log", encoding="utf-8"),
    ],
)
logger = logging.getLogger(__name__)

# ── Constants ──────────────────────────────────────────────────────────────
REGIONS = ["western", "eastern", "central", "southern"]
DATA_DIR = Path("C:/AcmeData/weekly")
REQUIRED_COLUMNS = {"invoice_id", "customer_id", "revenue", "date"}

# ── Data containers ────────────────────────────────────────────────────────
@dataclass
class RegionResult:
    """Processing result for a single regional CSV file."""
    region: str
    total_revenue: float = 0.0
    rows_processed: int = 0
    rows_skipped: int = 0
    errors: list[str] = field(default_factory=list)
    file_ok: bool = True  # False if the file could not be loaded at all

    @property
    def has_errors(self) -> bool:
        return len(self.errors) > 0 or not self.file_ok

# ── File loading ───────────────────────────────────────────────────────────
def load_region_file(region: str) -> tuple[list[dict], bool]:
    """
    Attempt to load a region's CSV file.

    Returns (rows, success). On failure, logs the error and returns ([], False).
    Does NOT raise — the caller continues to the next region regardless.
    """
    filepath = DATA_DIR / f"{region}_sales.csv"
    try:
        with open(filepath, encoding="utf-8", newline="") as f:
            reader = csv.DictReader(f)
            rows = list(reader)
            found_columns = set(reader.fieldnames or [])
            missing = REQUIRED_COLUMNS - found_columns
            if missing:
                logger.error(
                    f"[{region.upper()}] File is missing required columns: "
                    f"{sorted(missing)}. Skipping region."
                )
                return [], False
            logger.info(f"[{region.upper()}] Loaded {len(rows)} rows from '{filepath}'")
            return rows, True
    except FileNotFoundError:
        logger.error(
            f"[{region.upper()}] File not found: '{filepath}'. "
            "Check that the overnight data export completed successfully."
        )
        return [], False
    except PermissionError:
        logger.error(
            f"[{region.upper()}] Permission denied reading '{filepath}'. "
            "The file may be locked by another process."
        )
        return [], False
    except UnicodeDecodeError:
        logger.error(
            f"[{region.upper()}] Encoding error reading '{filepath}'. "
            "Try re-saving the export file as UTF-8."
        )
        return [], False
    except Exception as e:
        logger.error(
            f"[{region.upper()}] Unexpected error loading '{filepath}': "
            f"{type(e).__name__}: {e}"
        )
        return [], False

# ── Row processing ─────────────────────────────────────────────────────────
def process_region_rows(region: str, rows: list[dict]) -> RegionResult:
    """
    Process all rows for a region, skipping invalid ones.
    Returns a RegionResult with totals and error details.
    """
    result = RegionResult(region=region)
    for i, row in enumerate(rows, start=2):  # Row 1 is the header
        # DictReader fills short rows with None, so fall back before .strip()
        invoice_id = (row.get("invoice_id") or f"row_{i}").strip() or f"row_{i}"
        try:
            revenue_raw = (row.get("revenue") or "").strip()
            # Treat null-like values as skippable, not as a crash
            if revenue_raw.lower() in ("", "n/a", "null", "none", "-"):
                raise ValueError(
                    f"revenue is '{revenue_raw}' — "
                    "likely a pending reconciliation entry"
                )
            revenue = float(revenue_raw)
            if revenue < 0:
                # Negative revenue is unusual but not necessarily wrong —
                # log it as a warning and include it (it may be a credit)
                logger.warning(
                    f"[{region.upper()}] Row {i} ({invoice_id}): "
                    f"negative revenue {revenue:.2f} — including but flagging for review"
                )
            result.total_revenue += revenue
            result.rows_processed += 1
            logger.debug(
                f"[{region.upper()}] Row {i} ({invoice_id}): ${revenue:.2f} — OK"
            )
        except ValueError as e:
            result.rows_skipped += 1
            result.errors.append(f"Row {i} ({invoice_id}): {e}")
            logger.warning(
                f"[{region.upper()}] Row {i} ({invoice_id}): SKIPPED — {e}"
            )
        except KeyError as e:
            result.rows_skipped += 1
            result.errors.append(f"Row {i} ({invoice_id}): missing column {e}")
            logger.warning(
                f"[{region.upper()}] Row {i} ({invoice_id}): SKIPPED — missing column {e}"
            )
    return result

# ── Main orchestration ─────────────────────────────────────────────────────
def generate_weekly_report() -> None:
    """
    Main function: process all regions and print the weekly summary.
    Continues processing all regions even if some have problems.
    """
    run_start = datetime.now()
    logger.info("=" * 60)
    logger.info(f"Weekly Sales Report — {run_start.strftime('%A, %B %d, %Y')}")
    logger.info("=" * 60)

    region_results: list[RegionResult] = []
    for region in REGIONS:
        logger.info(f"Processing region: {region.upper()}")
        rows, loaded_ok = load_region_file(region)
        if not loaded_ok:
            # File could not be loaded at all — record as a failed region
            failed = RegionResult(region=region, file_ok=False)
            failed.errors.append("File could not be loaded — see log for details")
            region_results.append(failed)
            continue  # Move on to the next region
        result = process_region_rows(region, rows)
        region_results.append(result)

    # ── Print the summary report ──────────────────────────────────────────
    grand_total = sum(r.total_revenue for r in region_results)
    total_rows = sum(r.rows_processed for r in region_results)
    total_skips = sum(r.rows_skipped for r in region_results)
    regions_with_errors = [r for r in region_results if r.has_errors]

    print("\n")
    print("╔══════════════════════════════════════════════════════════╗")
    print("║             ACME CORP — WEEKLY SALES SUMMARY             ║")
    print(f"║ {run_start.strftime('%A, %B %d, %Y'):<47}║")
    print("╠══════════════════════════════════════════════════════════╣")
    print(f"{'Region':<12} {'Revenue':>14} {'OK Rows':>8} {'Skipped':>8} {'Status'}")
    print("─" * 62)
    for r in region_results:
        status = "OK" if not r.has_errors else ("FILE ERROR" if not r.file_ok else "PARTIAL")
        print(
            f"{r.region.capitalize():<12} "
            f"${r.total_revenue:>13,.2f} "
            f"{r.rows_processed:>8,} "
            f"{r.rows_skipped:>8,} "
            f"{status}"
        )
    print("─" * 62)
    print(
        f"{'TOTAL':<12} ${grand_total:>13,.2f} "
        f"{total_rows:>8,} {total_skips:>8,}"
    )
    print("╚══════════════════════════════════════════════════════════╝")

    if regions_with_errors:
        print(f"\nWARNING: {len(regions_with_errors)} region(s) had issues:")
        for r in regions_with_errors:
            print(f"  {r.region.upper()}:")
            for err in r.errors[:5]:  # Show first 5 errors per region
                print(f"    - {err}")
            if len(r.errors) > 5:
                print(f"    ... and {len(r.errors) - 5} more (see weekly_report.log)")

    print("\nFull log written to: weekly_report.log")
    logger.info("Report generation complete.")

if __name__ == "__main__":
    generate_weekly_report()
```
The Test Run
Priya put together a test scenario: the Western file would have 3 bad rows (with "N/A" revenue), and the Southern file would be missing entirely (to simulate a failed export).
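A fixture setup along those lines can be sketched with `tempfile` (the helper name `build_test_fixtures` and the row values are invented; the file names follow the script's `{region}_sales.csv` convention):

```python
import csv
import tempfile
from pathlib import Path

def build_test_fixtures(data_dir: Path) -> None:
    """Write a western file with one 'N/A' row among good rows,
    a clean eastern file, and deliberately omit the southern file."""
    header = ["invoice_id", "customer_id", "revenue", "date"]
    western = [
        ["INV-1", "C-1", "100.00", "2024-11-18"],
        ["INV-2", "C-2", "N/A", "2024-11-18"],  # pending reconciliation
        ["INV-3", "C-3", "50.00", "2024-11-18"],
    ]
    eastern = [["INV-4", "C-4", "75.00", "2024-11-18"]]
    for region, rows in [("western", western), ("eastern", eastern)]:
        with open(data_dir / f"{region}_sales.csv", "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(header)
            writer.writerows(rows)
    # southern_sales.csv intentionally not written -> FileNotFoundError path

tmp = Path(tempfile.mkdtemp())
build_test_fixtures(tmp)
```

Pointing `DATA_DIR` at a directory like this exercises both the bad-row path and the missing-file path in one run.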
The output:
```
╔══════════════════════════════════════════════════════════╗
║             ACME CORP — WEEKLY SALES SUMMARY             ║
║ Monday, November 18, 2024                      ║
╠══════════════════════════════════════════════════════════╣
Region              Revenue  OK Rows  Skipped Status
──────────────────────────────────────────────────────────────
Western      $   412,847.00      209        3 PARTIAL
Eastern      $   388,502.50      186        0 OK
Central      $   291,043.00      154        0 OK
Southern     $         0.00        0        0 FILE ERROR
──────────────────────────────────────────────────────────────
TOTAL        $ 1,092,392.50      549        3
╚══════════════════════════════════════════════════════════╝

WARNING: 2 region(s) had issues:
  WESTERN:
    - Row 47 (INV-2024-1103): revenue is 'N/A' — likely a pending reconciliation entry
    - Row 89 (INV-2024-1145): revenue is 'N/A' — likely a pending reconciliation entry
    - Row 201 (INV-2024-1257): revenue is 'N/A' — likely a pending reconciliation entry
  SOUTHERN:
    - File could not be loaded — see log for details

Full log written to: weekly_report.log
```
Sandra got a report. It was not perfect, but it was three-quarters of the picture — and it told her exactly what was missing and why. The IT team could see from the log that the Southern export had not run, and Priya could fix the Western N/A values with a follow-up.
What Changed
| Before | After |
|---|---|
| Script crashes on first bad row | Script logs and skips bad rows |
| Zero output on any error | Partial output with clear error flags |
| No indication of what went wrong | Specific error messages per row and region |
| Uses `print()` | Uses `logging` with timestamps |
| All regions fail together | Each region processed independently |
| No audit trail | Full log file written to disk |
The Conversation With Marcus
When Priya showed Marcus the refactored version, he pointed to the continue statement in the region loop.
"That's the key architectural decision," he said. "By structuring the loop so each region is independent, a failure in one region can never propagate to another. The try/except blocks inside process_region_rows do the same thing at the row level."
"It's turtles all the way down," Priya said.
"Exactly. You have error boundaries at the file level and at the row level. As long as you log what you skip and why, you get a system that degrades gracefully instead of all-or-nothing."
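The two levels of boundaries Marcus describes reduce to a small skeleton (`process_all`, `load_rows`, and `parse_row` are illustrative stand-ins, not Priya's functions):

```python
def process_all(regions, load_rows, parse_row):
    """Outer boundary: one try per region. Inner boundary: one try per row.
    A failure at either level is recorded and skipped, never propagated."""
    results = {}
    for region in regions:
        try:                        # file-level boundary
            rows = load_rows(region)
        except OSError:
            results[region] = None  # whole region failed to load
            continue
        good = []
        for row in rows:
            try:                    # row-level boundary
                good.append(parse_row(row))
            except ValueError:
                pass                # skip just this row
        results[region] = good
    return results

def load(region):
    """Stand-in loader: one region's export is missing, one has a bad row."""
    if region == "south":
        raise OSError("export missing")
    return ["10", "N/A", "5"]

out = process_all(["west", "south"], load, float)
```

The outer `except` catches whole-file problems; the inner one catches per-row problems; neither can reach past its own loop iteration.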
Key Takeaways From This Case Study
- **Error boundaries should match your processing units.** Processing four regions? Each region needs its own error boundary. Processing N rows? Each row needs its own error boundary.
- **Log the reason for every skip.** "Row skipped" is useless. "Row 47 (INV-2024-1103): revenue is 'N/A' — likely pending reconciliation" is actionable.
- **A partial result is almost always better than no result.** Design your scripts for partial success.
- **The `continue` statement is your friend in loops.** After logging an error, `continue` moves to the next iteration without running the rest of the loop body.
- **Test your error handling explicitly.** Priya intentionally created a missing-file and bad-data scenario before deploying. Test with broken data, not just good data.
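As a sketch of that last point, a broken-data test for the row-level logic might look like this (`sum_valid_revenue` is a simplified stand-in for `process_region_rows`):

```python
NULL_LIKE = {"", "n/a", "null", "none", "-"}

def sum_valid_revenue(raw_values):
    """Simplified stand-in for process_region_rows: sum the values
    that parse, count the ones that get skipped."""
    total, skipped = 0.0, 0
    for raw in raw_values:
        cleaned = raw.strip()
        if cleaned.lower() in NULL_LIKE:
            skipped += 1
            continue
        try:
            total += float(cleaned)
        except ValueError:
            skipped += 1
    return total, skipped

# Deliberately mixed input: good values, null-like values, and garbage
total, skipped = sum_valid_revenue(["100.00", "N/A", " 50.5 ", "", "oops"])
```

Feeding the function deliberately broken input and asserting on both the total and the skip count verifies the error handling itself, not just the happy path.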