Case Study 16-2: Maya's Complete Invoicing System
The Milestone
This case study documents the completion of Maya Reyes's invoicing system — the automation milestone referenced throughout the book since Chapter 9. By the end of this chapter, Maya has a script she runs on the last business day of each month. Three minutes later, she has professionally formatted Excel invoices ready to send for every client she worked with that month. The manual process this replaced took most of a workday.
This case study walks through Maya's thinking, the mistakes she made along the way, the design decisions that ended up mattering, and the final working system.
Where She Started: Invoice Day Before Python
Maya had been running her consulting practice for three years before Chapter 9. Her month-end invoicing process:
- Open her project tracker spreadsheet
- Filter to the current month's completed/invoiced projects
- For each client, manually calculate the total amount (hours × rate for hourly clients, or look up the fixed fee)
- Open a Word invoice template
- Copy and paste the client's name, address, and billing details
- Type in each line item (project name, hours, amount)
- Calculate subtotal, type it in
- Calculate due date (30 days from today), type it in
- Save as PDF
- Send the PDF via email
With 12 clients and an average of 2 projects per client, this was 24 line-item tables to build manually. It took five to six hours. She routinely found errors the next day — a line item with the wrong hours, a due date she'd calculated wrong, a client name with a typo.
"The invoice is your most important business communication," her business coach David had told her. "It's the one document where an error costs you money or credibility."
She had started the Python tracking system in Chapter 9 precisely because she wanted to fix this.
Step 1: What the System Needed to Do
Before writing code, Maya listed the requirements:
Functional requirements:
- Read all projects from maya_projects.csv that are marked "Invoiced" and have an invoice date in the current billing month
- Group projects by client
- Calculate the billed amount for each project (hours × rate, or fixed fee)
- Generate one Excel invoice per client
- Each invoice must contain: her business information, the client's information, invoice number, invoice date, due date, payment terms, itemized project table, subtotal, tax line, total due, payment instructions
Quality requirements: - Professional appearance — looks like something a real business would send - Consistent formatting across all invoices - Clear, readable line items with enough description for the client to understand what they're paying for - Due date computed automatically (not typed) - Invoice numbers sequential and meaningful (not just "Invoice 1")
Operational requirements: - Runs without modification month-to-month (the period is a command-line argument) - Handles the case where a client has no projects this month (doesn't generate an empty invoice) - Produces a console summary after running so she can verify the totals before sending - If something goes wrong with one client, it shouldn't crash the whole run
Step 2: The CSV Data Structure
Maya's maya_projects.csv had evolved through the book. By Chapter 16 it had these columns:
project_id, client_name, client_company, client_address, client_email,
project_name, description, billing_type, estimated_hours, actual_hours,
hourly_rate, fixed_fee, invoice_date, status
Key design decisions in the CSV structure:
Why separate hourly_rate and fixed_fee? Some projects bill by the hour; some are a fixed amount regardless of hours. Keeping both columns means the calculation is simply:
if billing_type == "fixed":
amount = fixed_fee
else: # hourly or retainer
amount = actual_hours * hourly_rate
This is cleaner than trying to detect billing type from which column has a value.
Why invoice_date instead of billing_month? Maya used the actual invoice date (always the last day of the month for her) rather than just a month label. This let her filter by period without string matching: df["invoice_date"].dt.to_period("M") == "2023-11".
Why client_company and client_name separately? client_name is the contact person; client_company is the legal entity. The invoice goes to the company, but she needed the contact name for email. Having both in the data means no lookup required.
Step 3: The Data Loading Layer
def load_and_validate_projects(filepath: str, period: str) -> pd.DataFrame:
"""
Loads, filters, and validates project data for the billing period.
Raises ValueError if required columns are missing.
Returns empty DataFrame (not None) if no projects match the period.
"""
required_columns = {
"project_id", "client_name", "client_company", "client_address",
"client_email", "project_name", "description", "billing_type",
"actual_hours", "hourly_rate", "fixed_fee", "invoice_date", "status",
}
df = pd.read_csv(filepath, parse_dates=["invoice_date"])
# Validate columns
missing = required_columns - set(df.columns)
if missing:
raise ValueError(
f"Missing required columns in {filepath}: {missing}\n"
f"Available columns: {list(df.columns)}"
)
# Filter to billing period
period_pd = pd.Period(period, freq="M")
billed_df = df[
(df["status"] == "Invoiced")
& (df["invoice_date"].dt.to_period("M") == period_pd)
].copy()
# Compute line-item amounts
def line_amount(row):
if row["billing_type"] == "fixed":
try:
return float(row["fixed_fee"])
except (ValueError, TypeError):
return 0.0
else:
try:
return float(row["actual_hours"]) * float(row["hourly_rate"])
except (ValueError, TypeError):
return 0.0
billed_df["amount"] = billed_df.apply(line_amount, axis=1)
return billed_df
The explicit column validation was something Maya added after the second time she ran the script against a CSV she'd edited by hand and had accidentally renamed a column. The error message now tells her exactly what's missing.
Step 4: The Invoice Data Model
Maya created simple Python dataclasses to hold the invoice data before writing it to Excel. This made the code cleaner: build the data structure first, then render it to Excel. The two steps were separate and testable independently.
from dataclasses import dataclass, field
from datetime import date, timedelta
@dataclass
class LineItem:
project_name: str
description: str
billing_type: str # "hourly", "fixed", "retainer"
actual_hours: float
rate: float # $/hr for hourly/retainer; total for fixed
amount: float
@property
def rate_display(self) -> str:
"""Human-readable rate label for the invoice table."""
if self.billing_type == "fixed":
return "Fixed Fee"
return f"${self.rate:.2f}/hr"
@dataclass
class Invoice:
number: str
invoice_date: date
billing_period: str # "November 2023"
client_company: str
client_address: str
client_email: str
lines: list[LineItem] = field(default_factory=list)
@property
def due_date(self) -> date:
return self.invoice_date + timedelta(days=30)
@property
def subtotal(self) -> float:
return sum(line.amount for line in self.lines)
@property
def total_hours(self) -> float:
return sum(
line.actual_hours
for line in self.lines
if line.billing_type != "fixed"
)
def __post_init__(self):
if not self.lines:
raise ValueError(
f"Invoice {self.number} for {self.client_company} has no line items."
)
The __post_init__ guard prevents generating an invoice with no billable items — an early version of the script had silently created empty invoices for clients who had open projects but nothing invoiced in the period.
Step 5: The Excel Rendering Layer
The Excel writer was the longest part of the code. Maya deliberately separated it from the data layer so she could change the formatting without touching the business logic.
Her core insight about Excel formatting with openpyxl: define all style objects once, at the top of the file. The first version of the script defined Font(name="Calibri", bold=True, size=11) inside loops, which is inefficient and impossible to maintain. When she decided to change the header font size from 11 to 12, she had to find and update 14 different places.
After refactoring:
# Defined once at module level
FONT_HEADER = Font(name="Calibri", bold=True, size=11, color="FFFFFF")
FILL_HEADER = PatternFill(fill_type="solid", fgColor="2980B9")
# Used everywhere
cell.font = FONT_HEADER
cell.fill = FILL_HEADER
When she changed the header color from blue to navy, she changed one line.
The invoice sheet builder followed a straightforward structure:
def write_invoice_sheet(ws, invoice: Invoice) -> None:
"""Renders a complete invoice onto a worksheet."""
_write_letterhead(ws) # Rows 1-5: Maya's contact info
_write_invoice_metadata(ws, invoice) # Rows 7-11: invoice #, dates, bill-to
_write_line_items_table(ws, invoice) # Row 13+: column headers and project rows
_write_totals(ws, invoice) # After line items: subtotal, tax, total
_write_footer(ws) # After totals: payment instructions
Breaking the rendering into sub-functions made it easy to modify individual sections. When Maya decided to change the payment instructions block, she only needed to find _write_footer.
Step 6: The Number Formatting Learning Curve
Maya spent more time on number formats than on any other single aspect of the Excel output. The openpyxl number format strings look like Excel's custom format syntax, but she had to re-learn it from examples.
The formats she ended up using:
# In the line items table
ws.cell(row=row, col=HOURS_COL).number_format = '#,##0.0 "hrs"'
# Output: 14.5 hrs
ws.cell(row=row, col=AMOUNT_COL).number_format = '"$"#,##0.00'
# Output: $2,537.50
# In the totals section
subtotal_cell.number_format = '"$"#,##0.00'
# Output: $8,750.00
# For the rate column — shows "Fixed Fee" as text or a dollar rate
# This cell is handled with an actual string value, not a number format
# because the display is conditional on billing type
One mistake she made early: applying '"$"#,##0.00' to a cell that sometimes contained the string "Fixed Fee". Formatted as a number, the string displayed as zero. She learned: use conditional logic to write either a number (with number format) or a string to the cell — not both.
# Correct approach for the rate column
if line.billing_type == "fixed":
rate_cell.value = "Fixed Fee"
rate_cell.number_format = "@" # Text format — don't interpret as number
else:
rate_cell.value = line.rate
rate_cell.number_format = '"$"#,##0.00"/hr"'
Step 7: The Operational Wrapper
The script needed to be easy to run. Maya added command-line argument parsing so she didn't need to edit the file each month:
# Default: bills for the previous completed month
python maya_invoice_generator.py
# Specific period
python maya_invoice_generator.py --period 2023-10
# Different data file
python maya_invoice_generator.py --data /Dropbox/consulting/maya_projects.csv
She also added a validation summary before generating files, so she could catch data problems before producing invoices:
print(f"\nPre-flight check for {period_label}:")
print(f" Projects found: {len(billed_df)}")
print(f" Clients: {billed_df['client_name'].nunique()}")
print(f" Total billings: ${billed_df['amount'].sum():,.2f}")
print(f" Average invoice: ${billed_df.groupby('client_name')['amount'].sum().mean():,.2f}")
print()
confirm = input("Generate invoices? [y/N] ")
if confirm.lower() != "y":
print("Cancelled.")
return
She later removed the confirmation prompt after six months of always typing "y". But for the first few months while she was building trust in the output, it was useful to see the pre-flight check before committing.
Step 8: Error Handling in Production
During the third month of running the system, one client had been added to the CSV with a missing client_address field. The script crashed with a KeyError on that client, and because all the other invoices had already been generated, she nearly sent them before she realized the run hadn't completed.
She added per-client error handling:
failed_clients = []
for invoice in invoices:
try:
output_path = write_invoice_excel(invoice, output_dir)
generated_files.append(output_path)
print(f" OK {invoice.client_company}")
except Exception as e:
failed_clients.append((invoice.client_company, str(e)))
print(f" FAILED {invoice.client_company}: {e}")
if failed_clients:
print(f"\nWARNING: {len(failed_clients)} invoice(s) failed to generate:")
for client_name, error in failed_clients:
print(f" - {client_name}: {error}")
print("\nFix the errors above and re-run for the failed clients.")
Now a failure for one client produces a clear error message and the script continues to generate invoices for all other clients. The summary at the end makes it obvious that the run was incomplete.
The First Month-End Run
November 30, 2023. Maya opened her terminal at 9:45 AM and typed:
python maya_invoice_generator.py --period 2023-11
The console output:
========================================================
Maya Reyes Consulting — Invoice Generator
========================================================
Billing period : November 2023
Invoice date : November 30, 2023
Data file : maya_projects.csv
Output dir : invoices/
Loading project data ...
Found 20 projects across 10 clients.
Total billings: $38,125.00
Generating invoices ...
------------------------------------------------------------
MR-202311-01 | Harrington & Associates LLC | 16.5 hrs | $6,337.50
MR-202311-02 | BlueStar Retail Inc. | 19.5 hrs | $3,412.50
MR-202311-03 | Meridian Capital Partners | 45.5 hrs | $9,725.00
MR-202311-04 | TechPath Solutions Corp. | 21.5 hrs | $3,762.50
MR-202311-05 | Apex Manufacturing Ltd. | 40.5 hrs | $7,137.50
MR-202311-06 | Vanguard Community Foundation | 20.5 hrs | $3,587.50
MR-202311-07 | Clearwater Health Systems | 20.0 hrs | $3,500.00
MR-202311-08 | Summit Real Estate Group | 23.0 hrs | $5,500.00
MR-202311-09 | Orion Technology Partners | 9.0 hrs | $2,200.00
MR-202311-10 | Cascade Marketing Agency | 5.5 hrs | $962.50
------------------------------------------------------------
Generation complete.
Invoices generated : 10
Output directory : /Users/maya/consulting/invoices/
Grand total billed : $46,125.00
Files created:
202311_Harrington_Associates_LLC_Invoice.xlsx (18.3 KB)
202311_BlueStar_Retail_Inc_Invoice.xlsx (17.9 KB)
...
Total elapsed time: 2 minutes, 47 seconds. That included the time to open the terminal.
She opened the Harrington & Associates invoice in Excel. The formatting was exactly what she'd designed. The line items were correct. The due date was December 30. The invoice number was MR-202311-01.
She checked the Meridian Capital invoice — the largest one. The fixed-fee financial model line item showed "Fixed Fee" in the rate column, not $0.00`. The hourly items showed `$175.00/hr. The subtotal was correct.
She attached all ten files to individual emails, one per client, and sent them.
By 10:12 AM she was drinking her second coffee.
What This System Does That Manual Invoicing Couldn't
Consistency. Every invoice has the same structure, same fonts, same payment instructions. There are no invoices where she forgot to update the payment terms or used last month's due date.
Audit trail. The script logs every run to a console output she can copy into her records. She knows exactly what was billed, to whom, and when the invoice was generated.
Re-generation. If a client says they didn't receive their invoice, she runs the script again for that period. Same output, same format, same invoice number. No need to rebuild it from memory.
Month-end speed. The first time Maya ran the old manual process after building the script, she mistakenly started doing it by hand out of habit. She stopped four minutes in and ran the script instead. The contrast was stark.
The data is the source of truth. If she needs to know how many hours she billed to a client over the year, the CSV has it. If she needs to find the original line items from six months ago, the CSV has it. The invoices themselves are just a presentation layer.
What's Next for This System
The invoice generator as described in this chapter generates Excel files. The next logical improvements:
- Chapter 17: Save the invoices to a dated folder automatically, replacing the manual filing step
- Chapter 19: Email each invoice automatically using Python's
smtplib, turning the month-end process from "generate and then send" to "generate, verify, and send all in one script" - Chapter 22: Schedule the script to run automatically on the last business day of each month, so it's ready before she even thinks about invoicing
- Chapter 30: Analyze her invoicing data as a dataset — which clients have the best margins, what's her effective hourly rate by project type, what's her revenue trend over the past 12 months
The invoicing system is not finished. It's a foundation.
This case study demonstrates: the complete Python-to-Excel invoicing workflow, dataclasses as data models, per-client error handling with try/except, command-line arguments with argparse, number format edge cases for mixed billing types, and the iterative development process — including the bugs that happened in production and how they were fixed.
Milestone: "Maya's invoice system complete" — first fully automated month-end run documented above.