Case Study 2: From Spreadsheet to Script — A Marketing Team's Python Journey
Overview
This case study follows a fictional marketing analytics team at Greenfield Consumer Brands as they transition from Excel-based reporting to Python-based analytics over the course of six months. While the company and characters are invented, the challenges, resistance patterns, breakthrough moments, and outcomes described here are drawn from real organizational experiences documented in industry surveys and practitioner interviews. The case is designed to show you what the Python learning journey actually looks like inside a business team — not in a classroom, but on the ground.
The Company
Greenfield Consumer Brands is a mid-sized consumer packaged goods (CPG) company headquartered in Minneapolis, Minnesota. Founded in 2008, Greenfield sells organic snack foods, beverages, and personal care products through a mix of grocery chains, specialty retailers, and its own direct-to-consumer (DTC) e-commerce site. Annual revenue is approximately $280 million. The company employs 1,200 people, including a 14-person marketing department.
The Team
The marketing analytics function at Greenfield consists of five people:
- Dana Whitfield (38) — Senior Marketing Analyst, 8 years at Greenfield. Dana is the team's Excel power user. She has built a complex ecosystem of interconnected spreadsheets that the entire marketing department relies on for weekly and monthly reporting. She can write VLOOKUP formulas in her sleep and has created elaborate VBA macros that automate parts of her workflow. Dana is proud of these systems and, initially, skeptical of the need to change.
- Marcus Chen (26) — Marketing Analyst, 1 year at Greenfield. Marcus graduated from a business analytics program where he learned basic Python and R. He has been quietly frustrated that Greenfield's analytics stack is entirely Excel-based. He sees the limitations daily but has not felt empowered to push for change.
- Priya Sharma (31) — Digital Marketing Manager, 3 years at Greenfield. Priya manages the e-commerce and paid media channels. She spends significant time manually downloading data from Google Analytics, Meta Ads Manager, and Shopify, then consolidating it in Excel for her weekly channel performance report.
- James Ortiz (44) — VP of Marketing. James is results-oriented and not particularly technical. He cares about the accuracy and timeliness of the insights his team delivers, not the tools they use to produce them. He will support a change if someone can show him it will make the team faster and more reliable.
- Keiko Tanaka (29) — Marketing Coordinator, 2 years at Greenfield. Keiko handles campaign execution and basic reporting. She has no coding experience and is anxious about any change that might make her feel incompetent.
Month 1: The Breaking Point
The catalyst for change was not a strategic initiative — it was a crisis.
On the first Monday of March, Dana arrived at work to discover that her master reporting workbook — a 23-tab Excel file called Marketing_Master_v47_FINAL_FINAL.xlsx — had become corrupted over the weekend. The file would not open. IT recovered a backup, but it was three weeks old. Three weeks of updates, formula refinements, and data entries were lost.
Dana spent the next four days rebuilding the workbook from memory and partial records. By Thursday, she had recovered most of it, but two formulas were producing different numbers than before, and she could not determine which version was correct. The weekly report went out on Friday with a footnote: "Some figures are approximate due to a data recovery issue."
James Ortiz was not happy. "We're running a $40 million marketing budget based on 'approximate' figures?" he said in their Friday check-in. "How do we make sure this doesn't happen again?"
Dana's answer — better backup procedures, more careful version control of spreadsheets — was reasonable but incremental. Marcus, who had been waiting for an opening, offered a different perspective: "What if the reports weren't in Excel at all? What if they were generated from code — code that runs the same way every time, lives in version control, and can be re-run whenever we need it?"
James looked skeptical. "Show me," he said.
Month 2: The Proof of Concept
Marcus spent the next two weeks building a proof of concept. He chose the weekly channel performance report — Priya's responsibility — because it was well-defined, time-consuming, and painful.
Priya's existing process:
1. Log into Google Analytics. Download a CSV of website traffic by channel and date. (10 minutes)
2. Log into Meta Ads Manager. Download a CSV of Facebook and Instagram ad performance. (10 minutes)
3. Log into Shopify. Download a CSV of e-commerce revenue by source. (10 minutes)
4. Open all three CSVs in Excel. Manually align date ranges and channel names. (20 minutes)
5. Copy and paste data into the master reporting template. (15 minutes)
6. Update formulas and pivot tables. Fix any that broke. (20 minutes)
7. Format charts and tables. (15 minutes)
8. Export to PDF and email to James and the broader team. (10 minutes)
Total time: approximately 2 hours every Monday morning.
Marcus built a Python script that automated steps 1 through 6. The script:
```python
import pandas as pd

# Load data from CSV exports (in production, these would come from APIs)
ga_data = pd.read_csv("google_analytics_export.csv")
meta_data = pd.read_csv("meta_ads_export.csv")
shopify_data = pd.read_csv("shopify_export.csv")

# Standardize column names
ga_data.columns = [col.lower().replace(" ", "_") for col in ga_data.columns]
meta_data.columns = [col.lower().replace(" ", "_") for col in meta_data.columns]
shopify_data.columns = [col.lower().replace(" ", "_") for col in shopify_data.columns]

# Standardize date formats
ga_data["date"] = pd.to_datetime(ga_data["date"])
meta_data["date"] = pd.to_datetime(meta_data["date"])
shopify_data["date"] = pd.to_datetime(shopify_data["date"])

# Merge datasets
combined = ga_data.merge(shopify_data, on=["date", "channel"], how="left")
combined = combined.merge(meta_data, on=["date", "channel"], how="left")

# Calculate derived metrics
combined["roas"] = combined["revenue"] / combined["spend"]
combined["cpa"] = combined["spend"] / combined["conversions"]
combined["conversion_rate"] = combined["conversions"] / combined["sessions"] * 100

# Weekly summary
weekly = combined.groupby("channel").agg(
    total_sessions=("sessions", "sum"),
    total_spend=("spend", "sum"),
    total_revenue=("revenue", "sum"),
    total_conversions=("conversions", "sum")
).reset_index()
weekly["roas"] = weekly["total_revenue"] / weekly["total_spend"]
weekly["cpa"] = weekly["total_spend"] / weekly["total_conversions"]

print(weekly.to_string(index=False))
```
The script was not elegant. Marcus was not an experienced Python developer. But it worked. It produced the same numbers as Priya's manual process — and it ran in under 3 seconds.
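One rough edge worth noting: the derived-metric lines divide by spend and conversions, either of which can be zero for an unpaid channel. In pandas, division by zero silently produces `inf` or `NaN` rather than raising an error, so a slightly more defensive version would replace those sentinels before summarizing. The sketch below uses invented sample data (the column names follow the script above, but the values are illustrative):

```python
import numpy as np
import pandas as pd

# Hypothetical slice of the combined data; "organic" has no ad spend.
combined = pd.DataFrame({
    "channel": ["paid_social", "organic"],
    "revenue": [12000.0, 8000.0],
    "spend": [3000.0, 0.0],
    "conversions": [150, 0],
})

# Division by zero yields inf (x/0) or NaN (0/0) in pandas; converting
# inf to NaN keeps downstream averages and sums honest.
combined["roas"] = (combined["revenue"] / combined["spend"]).replace([np.inf, -np.inf], np.nan)
combined["cpa"] = (combined["spend"] / combined["conversions"]).replace([np.inf, -np.inf], np.nan)

print(combined)
```

With this guard, an organic channel simply shows a blank ROAS instead of an infinity that would distort any aggregate built on top of it.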
Marcus showed the proof of concept to James on a Friday afternoon. He ran the script live, showed the output, then changed the input file to a different week's data and ran it again. Same output format, same calculations, different numbers. Three seconds each time.
"How long did Priya spend on this last Monday?" James asked.
"About two hours," Marcus said.
"And if Priya goes on vacation?"
"Right now, nobody else knows how to build the report. With the script, anyone can run it."
James nodded. "Roll it out. But don't break anything — run both systems in parallel for a month."
Month 3: Resistance and Adoption
The parallel run revealed both the promise and the friction of the transition.
Priya was enthusiastic. She had been spending 8–10 hours per month on a task she found tedious, and the script freed her to focus on campaign strategy. She started learning Python herself, asking Marcus to walk her through the code line by line. Within three weeks, she could modify the script to add new metrics.
Dana was resistant. Her concerns were legitimate:
- "I've spent years building these spreadsheets. They work. Why change something that works?"
- "What happens when the code breaks? At least with Excel, I can see the formulas."
- "I don't want to be dependent on Marcus. What if he leaves?"
James addressed Dana's concerns directly. He did not dismiss her expertise or frame the transition as a replacement for her skills. Instead, he positioned it as an evolution: "Dana, you are the person who understands our data better than anyone. That knowledge is trapped inside spreadsheets that only you can maintain. If we put it into code, it becomes institutional knowledge — documented, testable, and maintainable by the team."
This reframing was critical. Dana was not being replaced; her expertise was being liberated from a fragile medium.
Keiko was anxious. She attended the first Python training session Marcus organized and left feeling overwhelmed. "I don't understand any of this," she told Priya. "I'm not a technical person."
Priya, who had been in a similar position just weeks earlier, sat with Keiko for an hour after work. They started with print("Hello, world") and worked up to loading a CSV and calculating an average. By the end of the session, Keiko had filtered a DataFrame and sorted it. "It's like... really specific Excel," she said. "The syntax is weird, but the logic is the same."
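The session's progression — load a CSV, compute an average, filter, sort — might have looked something like the sketch below. The file contents and column names here are invented for illustration; an inline string stands in for the CSV Keiko actually loaded:

```python
import io
import pandas as pd

# Stand-in for a small campaign CSV (columns are hypothetical)
csv_text = """campaign,channel,clicks,spend
Spring Launch,email,1200,0
Spring Launch,paid_social,950,1800
Earth Day,email,700,0
Earth Day,paid_social,1400,2200
"""
df = pd.read_csv(io.StringIO(csv_text))

# Calculating an average -- the equivalent of Excel's AVERAGE()
print(df["clicks"].mean())

# Filtering -- like applying an Excel AutoFilter to one column
paid = df[df["channel"] == "paid_social"]

# Sorting -- like Excel's Sort dialog, largest clicks first
paid_sorted = paid.sort_values("clicks", ascending=False)
print(paid_sorted)
```

Each step maps one-to-one onto an Excel operation Keiko already knew, which is exactly why the "really specific Excel" framing worked.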
That realization — "it's really specific Excel" — turned out to be the most effective framing for the non-technical team members. They were not learning a foreign language; they were learning a more precise way to express operations they already understood.
Month 4: The Breakthrough
The breakthrough came when Dana — the skeptic — solved a problem with Python that she could not solve in Excel.
Greenfield was planning a promotional campaign across three retail chains. Dana needed to analyze 18 months of weekly sales data by product, retailer, and region to identify which products to promote at which retailers. The dataset had 145,000 rows — well beyond Excel's comfort zone (it could technically handle 1 million rows, but her formulas slowed to a crawl above 50,000).
Marcus helped Dana load the data into a pandas DataFrame. He showed her groupby, pivot_table, and basic filtering. Dana, who understood the business logic cold, was able to direct the analysis even though she was still learning the syntax. She would say things like "Now I need to see the average weekly units for each product-retailer combination, but only for the last 6 months," and Marcus would show her the pandas code:
```python
recent = sales_df[sales_df["date"] >= "2024-10-01"]
product_retailer = recent.groupby(["product", "retailer"])["units"].mean()
product_retailer = product_retailer.reset_index()
product_retailer = product_retailer.sort_values("units", ascending=False)
```
Dana stared at the output. "That's... all of it? That would have taken me an hour in Excel. And the numbers are right — I can cross-check against the totals."
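Dana's cross-check against the totals can itself be written as code. Because a groupby only rearranges rows, the grouped sums must add back up to the raw total — a one-line assertion catches any mismatch. The data below is an invented slice of the sales table, using the same column names as the case:

```python
import pandas as pd

# Hypothetical slice of the 18-month sales data
sales_df = pd.DataFrame({
    "date": pd.to_datetime(["2024-09-02", "2024-10-07", "2024-10-14", "2024-11-04"]),
    "product": ["Granola Bites", "Granola Bites", "Kale Chips", "Kale Chips"],
    "retailer": ["ChainA", "ChainA", "ChainB", "ChainB"],
    "units": [500, 520, 310, 340],
})

recent = sales_df[sales_df["date"] >= "2024-10-01"]
grouped = recent.groupby(["product", "retailer"])["units"].sum()

# Cross-check: grouping must preserve the grand total of the filtered rows
assert grouped.sum() == recent["units"].sum()
print(grouped)
```

Building checks like this into a notebook is what let Dana trust numbers she had not tallied by hand.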
She spent the next week building the full promotional analysis in a Jupyter notebook. Marcus helped with syntax; Dana provided the business logic and quality checks. The resulting notebook was 47 cells long, mixing code with markdown explanations, and produced a complete recommendation for which products to promote at each retailer.
When she presented it to James, she did not present a spreadsheet. She presented a notebook. She walked him through the analysis cell by cell: "Here's where we load the data. Here's where we filter for the last six months. Here's the groupby that shows us performance by product and retailer. Here's the chart."
James's reaction: "Can we do this for every promotional planning cycle?"
Dana: "Yes. And it'll take me two hours instead of two days."
Month 5: Scaling Up
With all five team members at least comfortable reading Python code (and three of them writing it), the team began migrating their core reporting infrastructure:
| Report | Previous (Excel) | New (Python) | Time Savings |
|---|---|---|---|
| Weekly Channel Performance | 2 hours/week | 15 minutes/week (run script + review) | 87% |
| Monthly Marketing Dashboard | 6 hours/month | 45 minutes/month | 88% |
| Quarterly Business Review deck | 12 hours/quarter | 3 hours/quarter | 75% |
| Ad Hoc Promotional Analysis | 8-16 hours each | 2-4 hours each | 75% |
| Campaign Post-Mortem | 4 hours each | 1 hour each | 75% |
Total estimated time savings: approximately 25 hours per month across the team — equivalent to 300 hours per year, or roughly 15% of one full-time employee.
But the time savings were only part of the story. The qualitative improvements were equally significant:
- Consistency. Reports produced by code were identical in format and calculation methodology every time. No more "which version of the spreadsheet has the right formula?"
- Auditability. Every analytical decision was visible in the code. When James asked "How did you calculate ROAS?", the answer was a single line of Python, not a chain of cell references.
- Version control. The team started using Git (a version control system) to track changes to their scripts. They could see who changed what, when, and why — and roll back if needed.
- Knowledge sharing. New team members could read the code and understand the team's analytical methods. When Keiko needed to learn how the monthly dashboard worked, she read the notebook rather than asking Dana to walk her through a 23-tab spreadsheet.
Month 6: The ROI Conversation
At the end of June, James asked Marcus and Dana to present the results of the Python transition to Greenfield's CFO, who had noticed the marketing team requesting fewer "emergency data pulls" from the IT department.
They presented the following:
Direct cost savings:
- 300 hours/year of analyst time recovered: approximately $22,500 (at a blended cost of $75/hour)
- Reduced IT support requests for data exports: approximately $5,000/year
- Eliminated need for two Excel add-in licenses: $1,200/year

Indirect value:
- Faster time-to-insight: Promotional analysis that took 2 days now takes 2 hours, enabling the team to evaluate more options and make better decisions
- Reduced error rate: No more formula drift or copy-paste errors in reports
- Improved employee satisfaction: Team members reported spending more time on strategic work and less on data wrangling
- Institutional knowledge: Analytical methods are documented in code, reducing key-person risk

Investment:
- Marcus's time building initial infrastructure: approximately 80 hours (one-time)
- Team training time: approximately 40 hours total across all team members (one-time)
- Ongoing learning: approximately 2 hours/person/month (ongoing)
Simple ROI calculation:
```python
annual_savings = 22500 + 5000 + 1200  # $28,700
initial_investment = (80 + 40) * 75   # $9,000 (120 hours at $75/hour)
ongoing_cost = 5 * 2 * 12 * 75        # $9,000 (5 people, 2 hrs/month, 12 months)
first_year_net = annual_savings - initial_investment - ongoing_cost

print(f"Annual Savings: ${annual_savings:,.0f}")
print(f"Initial Investment: ${initial_investment:,.0f}")
print(f"Ongoing Annual Cost: ${ongoing_cost:,.0f}")
print(f"First Year Net: ${first_year_net:,.0f}")
print(f"Year 2+ Annual Net: ${annual_savings - ongoing_cost:,.0f}")
```

Output:

```
Annual Savings: $28,700
Initial Investment: $9,000
Ongoing Annual Cost: $9,000
First Year Net: $10,700
Year 2+ Annual Net: $19,700
```
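The same figures also yield a payback period — how many months of net savings it takes to recoup the one-time investment. This is a simple extension of the numbers above, not part of the team's original presentation:

```python
# Figures from the ROI calculation
annual_savings = 28700
initial_investment = 9000
ongoing_cost = 9000

# Net monthly benefit once the scripts are in steady-state use
monthly_net = (annual_savings - ongoing_cost) / 12

# Months needed to recoup the one-time 120-hour investment
payback_months = initial_investment / monthly_net
print(f"Payback period: {payback_months:.1f} months")
```

At roughly five and a half months, the investment pays for itself well within the first year, which is the kind of framing CFOs tend to respond to.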
The CFO approved the initiative and asked: "Which other departments should consider this?"
Lessons Learned
The Greenfield team documented their experience in an internal retrospective. Their key takeaways:
1. Start with a painful, visible problem
Marcus did not pitch Python in the abstract. He waited for a moment — the spreadsheet corruption — when the limitations of the current approach were visceral and visible. The proof of concept addressed a specific, painful process that everyone on the team recognized as inefficient.
2. Run systems in parallel
For the first month, the team ran both Excel and Python processes simultaneously. This served two purposes: it validated that the Python scripts produced correct results, and it gave the team a safety net during the transition. Parallel running requires extra effort in the short term but builds trust.
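The validation half of a parallel run can be made mechanical rather than an eyeball comparison. One way (a sketch with invented column names, not the Greenfield team's actual check) is to export both summaries and let pandas assert that every value matches:

```python
import pandas as pd

# Hypothetical weekly summaries: one from the Excel process, one from the script
excel_summary = pd.DataFrame({
    "channel": ["email", "paid_social"],
    "revenue": [5200.0, 9800.0],
})
script_summary = pd.DataFrame({
    "channel": ["email", "paid_social"],
    "revenue": [5200.0, 9800.0],
})

# assert_frame_equal raises with a precise diff if any cell differs,
# so a silent pass means the two pipelines agree for this week
pd.testing.assert_frame_equal(
    excel_summary.sort_values("channel").reset_index(drop=True),
    script_summary.sort_values("channel").reset_index(drop=True),
)
print("Parallel run matched for this week")
```

A month of weekly runs that all pass this check is much stronger evidence than "the numbers looked the same."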
3. Reframe, do not replace
The transition succeeded because it was framed as an evolution of Dana's expertise, not a repudiation of it. Dana's deep knowledge of the data and the business logic was the essential ingredient; Python was simply a better container for that knowledge. Teams that frame coding transitions as "out with the old" tend to encounter fierce resistance.
4. Pair business experts with technical translators
The most productive working model was Dana (business expert) and Marcus (Python translator) working side by side. Dana knew what needed to happen; Marcus knew how to express it in code. Over time, Dana absorbed enough Python to work independently on routine tasks, while Marcus developed a much deeper understanding of the marketing analytics domain.
5. Meet people where they are
Keiko's breakthrough came when Priya reframed Python as "really specific Excel." Different team members have different learning styles and different levels of comfort with technical material. Some learn best by reading documentation; others need a patient colleague to sit with them. The team's willingness to accommodate different learning speeds was critical.
6. Measure and communicate the ROI
The ROI presentation to the CFO was not an afterthought — it was a deliberate step that ensured the transition had organizational support and could serve as a model for other departments. Without quantified results, the transition might have been seen as a pet project rather than a business improvement.
7. Python is not the end — it is the beginning
By the end of Month 6, Marcus was already exploring pandas extensions and visualization libraries. Priya had started pulling data directly from APIs instead of downloading CSVs manually. Dana was investigating how Python could help with marketing mix modeling. The team had moved from "Can we learn Python?" to "What else can Python do for us?"
This progression — from fear, to familiarity, to fluency, to ambition — is the learning journey that this textbook is designed to accelerate.
Discussion Questions
1. The catalyst question. Greenfield's Python transition was triggered by a spreadsheet corruption crisis. What are the risks of waiting for a crisis before modernizing your analytical tools? How could Marcus have made the case for Python proactively, without a crisis?
2. Resistance as feedback. Dana's resistance to the transition included legitimate concerns (dependency on Marcus, inability to "see" formulas, risk of breaking existing systems). How would you address each of these concerns? Are there situations where resistance to a technology transition is actually correct — where the existing approach is genuinely better?
3. The "really specific Excel" reframe. Keiko's breakthrough came when Python was framed as a more precise version of something she already knew. Think of a technology or concept in your own field that could be reframed this way — something that seems foreign but is actually an extension of a familiar skill. How would you communicate this reframing to a skeptical colleague?
4. ROI calculation. The Greenfield team calculated a first-year ROI based on time savings and reduced costs. What are the limitations of this ROI calculation? What valuable outcomes (risk reduction, decision quality, employee satisfaction) are not captured in the numbers? How would you present these intangible benefits to a CFO?
5. Team dynamics. The case describes a spectrum of reactions to the Python transition — enthusiasm (Marcus, Priya), skepticism (Dana), and anxiety (Keiko). If you were James (the VP), how would you manage this spectrum? What would you do if one team member refused to engage with the new tools at all?
6. Scaling the approach. The CFO asked: "Which other departments should consider this?" Choose a department (finance, HR, operations, or sales) and sketch out a similar Python adoption plan. What would the first proof of concept be? What resistance would you expect? What would the ROI case look like?
7. Connection to Chapter 3. Identify at least three specific Python concepts from Chapter 3 (variables, data types, loops, functions, pandas operations, etc.) that appear in this case study. For each, explain how the concept was applied to solve a real business problem.
Note: Greenfield Consumer Brands is a fictional company created for educational purposes. The challenges, adoption patterns, and outcomes described in this case study are representative of real organizational experiences documented in industry surveys by McKinsey & Company (2023), Anaconda (State of Data Science Reports, 2022–2024), and practitioner interviews.