Case Study 1.1: Priya's Monday Morning

The Situation

Priya Okonkwo has been a junior analyst at Acme Corp for 18 months. Her official job title is "Business Intelligence Analyst," which in practice means she is the person who generates the weekly sales report that Sandra Chen, VP of Sales, reads every Monday morning.

The report covers:

- Total sales by region (Chicago, Cincinnati, Nashville, St. Louis)
- Top 10 products by revenue for the week
- Week-over-week and month-to-date comparisons
- A summary chart showing regional performance trends

The raw data comes from four CSV exports — one per region — that Marcus Webb schedules to land in a shared network folder by 6:30 AM every Monday.

The Current Process

Priya's Monday morning routine, documented step by step:

6:45 AM — Arrives early. Opens the network folder.

6:50 AM — Opens each of the four CSV files. Notes immediately that Nashville exported with column headers in a different order than usual. (This happens roughly once a month.)

7:00 AM — Opens the "master" Excel workbook, which she built by hand six months ago. Copies the data from each CSV into the appropriate sheet, making sure to paste as "values only" to avoid breaking the formulas.

7:20 AM — Fixes the Nashville data: manually reorders the columns to match the expected format.

7:25 AM — Refreshes the pivot tables. One of them breaks — the pivot table source range needs to be extended to include the new rows. Updates the range.

7:35 AM — Checks the week-over-week totals manually against last week's numbers. They're $47,000 higher than expected. Investigates. Finds that a large Chicago order was entered with a date two weeks in the past, causing it to appear in this week's data even though it's not a new sale.

7:55 AM — Makes a judgment call: emails Marcus to flag the data issue, manually excludes the order from this week's report.

8:05 AM — Updates the charts. The summary chart requires manual data entry into a lookup table.

8:20 AM — Formats the report: column widths, number formats, color coding for regions that hit quota (green) vs. those that missed (red).

8:40 AM — Sends the report to Sandra and the regional VPs, with a note about the Chicago data anomaly.

Total time: approximately 2 hours.

The Problems

Time cost: 2 hours × 52 weeks = 104 hours per year, or roughly 13 full working days. Just for this one report.

Knowledge dependency: When Priya was sick for a week in March, the report was not produced. No one else knew how to do it correctly.

Error risk: The manual steps — copying and pasting, adjusting pivot ranges, manually entering chart data — introduce opportunities for error at every stage. Priya is good at this, but she's human.

Audit trail: If someone later asks "how did you calculate that regional total?", the answer is a series of manual steps that no longer exist in a traceable form.

Data quality: The Nashville column order issue and the Chicago date error are not unusual. Data from real systems is messy, and manual handling of that messiness is unreliable and invisible.

What Python Would Do

Here's what Priya's Python script (written once, run every Monday with one command) would do:

# Conceptual outline — the code covered in Part 2 makes this real

# 1. Load all four CSV files from the network folder automatically
# 2. Standardize column names across all files
# 3. Combine into one unified dataset
# 4. Apply business rules:
#    - Flag orders with dates more than 5 days old as "historical" and exclude from weekly totals
#    - Log anomalies to an audit file
# 5. Calculate regional totals, top products, WoW and MTD comparisons
# 6. Write a formatted Excel workbook with charts
# 7. Optionally: draft and send the email with the workbook attached

Time cost after automation: 45–90 seconds.

Knowledge dependency: Any team member can run the script.

Error risk: The same logic runs every time. The script doesn't get tired, doesn't paste in the wrong range, doesn't forget to extend the pivot table.

Audit trail: The code is the audit trail.

Data quality: Anomalies are flagged automatically and logged to a file that Marcus and Priya can review.

The Learning Curve

It took Priya about 15 hours of Python work — spread across three weekends — to build the first working version of this script. The first version had bugs. The second version was better. The third version was the one she trusted enough to actually use on Monday morning.

By month three, she had enhanced it: the script now drafts the email, attaches the report, and sends it to Sandra and the regional VPs automatically. By month six, she had added anomaly detection that emails Marcus directly when something looks wrong — before the report even goes out.

Her Monday mornings now start at 7:45 AM.

Discussion Questions

  1. What is the true cost of Priya's 2-hour Monday morning process, measured in time, risk, and organizational resilience?

  2. At what point does the 15-hour Python investment break even, compared to the 104-hour annual manual cost?

  3. Priya's script automates the "what" but still requires human judgment for anomalies (like the Chicago date issue). How should a system handle the boundary between automation and human decision-making?

  4. What other tasks in Priya's role might be candidates for similar automation?

  5. If Priya's manager saw this case study, what concerns might she have about automating Priya's core task? How would you address them?