Case Study 28-1: Priya and the West Region Discovery

Background

It is the second week of January 2024. Acme Corp has just completed its fiscal year 2023, and the numbers are in. Sandra Chen, VP of Sales, is preparing the annual sales review for the executive team meeting scheduled for January 22nd.

Three of Acme's four regions finished within an acceptable range of their targets:

- North: $924,560 (target $890,000, 3.9% above)
- East: $807,210 (target $775,000, 4.2% above)
- South: $755,890 (target $740,000, 2.1% above)

And then there is the West region: $359,732 against a $620,000 target. A 42% shortfall. A $260,268 miss.
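The variance figures above are simple arithmetic, and they can be reproduced in a few lines. The following is a minimal sketch that uses only the revenue and target numbers quoted in this section; the helper name `variance_vs_target` is made up for illustration.

```python
# Revenue and target figures as quoted in the case study text.
regions = {
    "North": (924_560, 890_000),
    "East": (807_210, 775_000),
    "South": (755_890, 740_000),
    "West": (359_732, 620_000),
}


def variance_vs_target(actual: float, target: float) -> tuple[float, float]:
    """Return (dollar variance, percent variance) relative to target."""
    diff = actual - target
    return diff, diff / target * 100


# Hypothetical helper applied to every region.
variances = {name: variance_vs_target(a, t) for name, (a, t) in regions.items()}

for name, (diff, pct) in variances.items():
    print(f"{name:<6} {diff:>+12,.0f}  {pct:>+6.1f}%")
```

A negative variance means the region finished below target. West is the only region where that is the case.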

Sandra knows she cannot walk into the executive meeting and say "West region underperformed." The executives will want to know why and, more importantly, what happens in 2024. Does Acme abandon the West territory? Reduce investment there? Or double down?

She calls Priya Okonkwo on Monday morning.

"Priya, I have the year-end numbers. West region is in trouble — $360k against a $620k target. I need a real analysis, not just a summary. I need to understand whether this is a Dave problem, a market problem, a product problem, or something else entirely. Can you build me something by Wednesday morning? I want charts I can show the exec team."

Priya has been in this role for seven months — long enough to know the data, short enough that the analysis still excites her. She has acme_sales_2023.csv on her laptop and a clear Wednesday deadline.


The Analysis

Step 1: Get the Full Picture First

Priya's first rule: never start with the problem region. Start with the full picture so you have a baseline for comparison.

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
from sales_analysis import (
    load_sales_data,
    calculate_revenue_metrics,
    revenue_by_dimension,
    monthly_revenue_trend,
    pareto_analysis,
    sales_velocity,
)

df = load_sales_data("acme_sales_2023.csv")

# Company-wide metrics
company_metrics = calculate_revenue_metrics(df)
print(f"Total Revenue: ${company_metrics['total_revenue']:,.2f}")
print(f"Total Orders: {company_metrics['total_orders']:,}")
print(f"Average Order Value: ${company_metrics['average_order_value']:,.2f}")

Company total: $2,847,392 on 1,300 orders across 155 customers. A solid year overall. Average order value of $2,190.
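The `sales_analysis` module itself is not shown in this case study. As a rough idea of what a helper like `calculate_revenue_metrics` might compute, here is a minimal sketch. It assumes the data has `order_id`, `customer_id`, `revenue`, and `gross_profit` columns; those column names, the function name `sketch_revenue_metrics`, and the toy rows are assumptions for illustration, not Priya's actual code or data.

```python
import pandas as pd


def sketch_revenue_metrics(df: pd.DataFrame) -> dict:
    """Summarize a sales DataFrame (hypothetical stand-in for the real helper)."""
    total_revenue = float(df["revenue"].sum())
    total_orders = int(df["order_id"].nunique())
    return {
        "total_revenue": total_revenue,
        "total_orders": total_orders,
        # Guard against division by zero on an empty frame.
        "average_order_value": total_revenue / total_orders if total_orders else 0.0,
        "unique_customers": int(df["customer_id"].nunique()),
        "overall_margin_pct": (
            round(float(df["gross_profit"].sum()) / total_revenue * 100, 1)
            if total_revenue
            else 0.0
        ),
    }


# Toy rows, invented for illustration only.
toy = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer_id": ["C1", "C1", "C2", "C3"],
    "revenue": [1000.0, 3000.0, 2000.0, 2000.0],
    "gross_profit": [500.0, 1500.0, 1000.0, 1080.0],
})

metrics = sketch_revenue_metrics(toy)
print(metrics)
```

Returning a plain dictionary keeps the call sites simple: the same function works on the full DataFrame or on any filtered slice of it, which is how the later steps compare West with the other regions.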

Step 2: The Regional Comparison

region_df = revenue_by_dimension(df, "region")
print(region_df.to_string(index=False))

Output:

Region   Revenue    Orders   % of Total   Margin %
North    $924,560   418      32.5%        51.2%
East     $807,210   391      28.3%        50.8%
South    $755,890   351      26.5%        51.5%
West     $359,732   140      12.6%        51.1%

Three regions account for 87.3% of revenue. West accounts for 12.6%. But Priya notices something immediately: the gross margin percentage is almost identical across all four regions (50.8%–51.5%). Whatever is wrong with West, it is not a pricing problem or an efficiency problem.
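A breakdown like the one above is a grouped aggregation. The sketch below shows one way a helper such as `revenue_by_dimension` could be written with pandas. The function name `sketch_revenue_by_dimension`, the `order_id` and `gross_profit` column names, and the toy rows are assumptions; the real helper may differ.

```python
import pandas as pd


def sketch_revenue_by_dimension(df: pd.DataFrame, dimension: str) -> pd.DataFrame:
    """Revenue, order count, share of total, and margin for each value of a column."""
    grouped = (
        df.groupby(dimension)
        .agg(
            revenue=("revenue", "sum"),
            orders=("order_id", "nunique"),
            gross_profit=("gross_profit", "sum"),
        )
        .reset_index()
    )
    grouped["pct_of_total"] = (
        grouped["revenue"] / grouped["revenue"].sum() * 100
    ).round(1)
    grouped["margin_pct"] = (
        grouped["gross_profit"] / grouped["revenue"] * 100
    ).round(1)
    # Largest revenue first, so the weakest segment ends up at the bottom.
    return (
        grouped.drop(columns="gross_profit")
        .sort_values("revenue", ascending=False)
        .reset_index(drop=True)
    )


# Toy rows, invented for illustration only.
toy = pd.DataFrame({
    "order_id": [1, 2, 3, 4, 5],
    "region": ["North", "North", "East", "South", "West"],
    "revenue": [3000.0, 2000.0, 2500.0, 1500.0, 1000.0],
    "gross_profit": [1500.0, 1000.0, 1250.0, 750.0, 500.0],
})

by_region = sketch_revenue_by_dimension(toy, "region")
print(by_region.to_string(index=False))
```

Because the grouping column is a parameter, the same function can be reused for products or salespeople, which is how the later hypotheses are tested.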

Step 3: Isolate West and Test the Hypotheses

Sandra wants to know: Dave problem? Market problem? Product problem? Something else?

Priya tests each hypothesis systematically.

Hypothesis A: Deal quality problem (Dave is taking bad deals)

west = df[df["region"] == "West"].copy()
other = df[df["region"] != "West"].copy()

west_metrics = calculate_revenue_metrics(west)
other_metrics = calculate_revenue_metrics(other)

print(f"West avg order value:   ${west_metrics['average_order_value']:,.2f}")
print(f"Other avg order value:  ${other_metrics['average_order_value']:,.2f}")
print(f"West margin %:          {west_metrics.get('overall_margin_pct', 'N/A')}%")

West average order value: $2,180. Other regions: $2,205. A difference of $25, which is noise. The margin percentages are essentially identical.

Verdict on Hypothesis A: False. When Dave closes a deal, it is as good as any deal anywhere in the company.

Hypothesis B: Product mix problem (West buys different products)

west_products = revenue_by_dimension(west, "product")
other_products = revenue_by_dimension(other, "product")

print("West region top products:")
print(west_products[["product", "pct_of_total"]].head(5).to_string(index=False))

print("\nOther regions top products:")
print(other_products[["product", "pct_of_total"]].head(5).to_string(index=False))

The product ranking is nearly identical between West and other regions. Printer Paper is #1 everywhere. Binders and Pens are consistently in the top 5.

Verdict on Hypothesis B: False. West customers buy the same products as everyone else.
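Comparing two top-5 lists by eye works for a quick check. To put a number on "nearly identical", one option is to compare each product's share of revenue in the two groups and look at the largest gap. This is a sketch under assumptions: the function names and the toy rows are invented for illustration, and only the `product` and `revenue` column names are taken from the code above.

```python
import pandas as pd


def product_share(df: pd.DataFrame) -> pd.Series:
    """Each product's share of the frame's total revenue, in percent."""
    totals = df.groupby("product")["revenue"].sum()
    return totals / totals.sum() * 100


def mix_gap(group_a: pd.DataFrame, group_b: pd.DataFrame) -> pd.DataFrame:
    """Side-by-side product shares and the gap in percentage points."""
    shares = pd.concat(
        {"a_pct": product_share(group_a), "b_pct": product_share(group_b)},
        axis=1,
    ).fillna(0.0)  # a product missing from one group has a 0% share there
    shares["gap_pts"] = (shares["a_pct"] - shares["b_pct"]).round(1)
    return shares


# Toy rows, invented for illustration only (not Acme's data).
west_toy = pd.DataFrame({
    "product": ["Printer Paper", "Binders", "Pens"],
    "revenue": [400.0, 300.0, 300.0],
})
other_toy = pd.DataFrame({
    "product": ["Printer Paper", "Binders", "Pens", "Printer Paper"],
    "revenue": [2200.0, 2900.0, 2900.0, 2000.0],
})

gap = mix_gap(west_toy, other_toy)
largest_gap = gap["gap_pts"].abs().max()
print(gap)
print(f"Largest share gap: {largest_gap:.1f} percentage points")
```

With real data this would be called as `mix_gap(west, other)`. A largest gap of only a few percentage points would support the verdict that product mix is not the explanation.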

Hypothesis C: Market saturation (West territory is just smaller)

# Unique customer accounts in West and in each of the other regions
print(f"West unique customers:   {west_metrics['unique_customers']}")
print(f"North customer count:    {df[df['region']=='North']['customer_id'].nunique()}")
print(f"East customer count:     {df[df['region']=='East']['customer_id'].nunique()}")
print(f"South customer count:    {df[df['region']=='South']['customer_id'].nunique()}")

Output:

West unique customers:   15
North customer count:    42
East customer count:     38
South customer count:    41

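West has 15 accounts. Every other region has 38–42. That is a 60–65% deficit in customer count.

Verdict on Hypothesis C: Partly true, but not an explanation. West's customer base really is much smaller, yet the count alone cannot say whether the market is small or simply under-covered. That question is what Hypothesis D tests.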
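The 60–65% range comes from comparing West's customer count with each peer region in turn. A minimal sketch of that arithmetic, using the counts from the output above (the helper name `deficit_pct` is made up for illustration):

```python
# Customer counts as printed in the output above.
customers = {"North": 42, "East": 38, "South": 41, "West": 15}


def deficit_pct(count: int, benchmark: int) -> float:
    """How far `count` falls below `benchmark`, as a percentage of the benchmark."""
    return (1 - count / benchmark) * 100


# West's deficit measured against each of the other regions.
deficits = {
    region: round(deficit_pct(customers["West"], n), 1)
    for region, n in customers.items()
    if region != "West"
}
print(deficits)
```

The smallest deficit is against East and the largest is against North, which is where the two ends of the range come from.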

Hypothesis D: Sales capacity problem (not enough people to cover the territory)

west_reps = revenue_by_dimension(west, "salesperson")
print("West region salespeople:")
print(west_reps.to_string(index=False))

# Same order as the output below; region_rows avoids overwriting region_df from Step 2
for region in ["North", "East", "South"]:
    region_rows = df[df["region"] == region]
    reps = region_rows["salesperson"].nunique()
    customers = region_rows["customer_id"].nunique()
    print(f"{region}: {reps} reps, {customers} customers, "
          f"{customers/reps:.1f} customers per rep")

west_reps_count = west["salesperson"].nunique()
west_customers = west["customer_id"].nunique()
print(f"West: {west_reps_count} rep(s), {west_customers} customers, "
      f"{west_customers/west_reps_count:.1f} customers per rep")

Output:

North: 3 reps, 42 customers, 14.0 customers per rep
East:  3 reps, 38 customers, 12.7 customers per rep
South: 2 reps, 41 customers, 20.5 customers per rep
West:  1 rep,  15 customers, 15.0 customers per rep

Dave Nguyen's customer-to-rep ratio (15.0) is in line with North (14.0) and East (12.7), and well below South's 20.5. He is managing a normal number of accounts for one person. The problem is not his account management. It is that one person cannot prospect for and close new accounts across a whole territory while also servicing the accounts he already has.

Verdict on Hypothesis D: Confirmed. West region has one salesperson where it needs two or three.
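The per-region loop above can also be written as a single grouped aggregation, which keeps all four regions in one table and avoids treating West as a special case. This is a sketch of that alternative, not the code Priya ran; the function name `coverage_by_region`, the toy rows, and the salesperson labels are invented for illustration.

```python
import pandas as pd


def coverage_by_region(df: pd.DataFrame) -> pd.DataFrame:
    """Reps, customers, and customers per rep for every region in one pass."""
    coverage = df.groupby("region").agg(
        reps=("salesperson", "nunique"),
        customers=("customer_id", "nunique"),
    )
    coverage["customers_per_rep"] = (
        coverage["customers"] / coverage["reps"]
    ).round(1)
    return coverage.sort_values("customers_per_rep", ascending=False)


# Toy rows, invented for illustration only.
toy = pd.DataFrame({
    "region": ["North", "North", "North", "West", "West", "West"],
    "salesperson": ["A", "B", "A", "D", "D", "D"],
    "customer_id": ["C1", "C2", "C3", "C4", "C5", "C4"],
})

coverage = coverage_by_region(toy)
print(coverage)
```

Reading the `reps` and `customers` columns together matters more than the ratio on its own: a region can have a normal ratio and still be under-covered if both numbers are small.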

Step 4: The Monthly Pattern

# Compare West monthly performance to company average per region
west_monthly = west.groupby("year_month")["revenue"].sum()
company_monthly = df.groupby("year_month")["revenue"].sum() / 4  # per-region avg

comparison = pd.DataFrame({
    "West": west_monthly,
    "Per-Region Average": company_monthly,
}).fillna(0)
comparison["West % of Average"] = (
    comparison["West"] / comparison["Per-Region Average"] * 100
).round(1)

print(comparison.to_string())

Priya sees that West consistently runs at 40–55% of the per-region average throughout the year. This is not a one-quarter blip. It is a structural, persistent gap. Critically, the ratio does not deteriorate in Q4, when the other regions surge. West is not missing the seasonal uptick; it follows the same pattern as everyone else from a lower baseline.
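One way to tell a persistent gap from a one-quarter blip is to look at how much West's monthly ratio moves over the year. The sketch below derives a month column, computes the ratio, and reports its spread. It assumes the raw data has an `order_date` column from which `year_month` is derived; that column name, the function name `west_vs_average`, and the toy rows are assumptions for illustration.

```python
import pandas as pd


def west_vs_average(df: pd.DataFrame) -> pd.DataFrame:
    """West's monthly revenue as a percentage of the per-region average."""
    # Assumption: order dates live in an "order_date" column.
    with_month = df.assign(
        year_month=pd.to_datetime(df["order_date"]).dt.to_period("M")
    )
    monthly = (
        with_month.groupby(["year_month", "region"])["revenue"]
        .sum()
        .unstack(fill_value=0)
    )
    out = pd.DataFrame({
        "West": monthly["West"],
        # Mean across region columns = monthly total / number of regions.
        "Per-Region Average": monthly.mean(axis=1),
    })
    out["West % of Average"] = (
        out["West"] / out["Per-Region Average"] * 100
    ).round(1)
    return out


# Toy rows, invented for illustration: revenue doubles in December everywhere.
toy = pd.DataFrame({
    "order_date": ["2023-01-15"] * 4 + ["2023-12-15"] * 4,
    "region": ["North", "East", "South", "West"] * 2,
    "revenue": [300.0, 300.0, 300.0, 100.0, 600.0, 600.0, 600.0, 200.0],
})

ratio = west_vs_average(toy)
spread = ratio["West % of Average"].max() - ratio["West % of Average"].min()
print(ratio)
print(f"Spread of West's ratio across months: {spread:.1f} points")
```

In the toy data every region doubles in December, so West's ratio stays flat even though its revenue rises. A narrow spread on real data would point to a lower baseline, not a missed season.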

Step 5: Building the Evidence Chart

fig, axes = plt.subplots(1, 2, figsize=(14, 6))
fig.suptitle("West Region Performance Analysis — Acme Corp 2023",
             fontsize=14, fontweight="bold")

# Left panel: Monthly revenue by region
pivot = df.groupby(["year_month", "region"])["revenue"].sum().unstack(fill_value=0)
months = [str(p) for p in pivot.index]
x = range(len(months))

colors = {"North": "#4472C4", "South": "#ED7D31", "East": "#A9D18E", "West": "#FF6B6B"}
for region in pivot.columns:
    axes[0].plot(x, pivot[region].values, marker="o", markersize=4,
                 linewidth=2, label=region, color=colors.get(region, "#888"))

axes[0].set_title("Monthly Revenue by Region", fontweight="bold")
axes[0].set_xticks(list(x))
axes[0].set_xticklabels(months, rotation=45, ha="right", fontsize=7)
axes[0].set_ylabel("Revenue ($)")
axes[0].yaxis.set_major_formatter(mticker.FuncFormatter(lambda v, _: f"${v:,.0f}"))
axes[0].legend()

# Right panel: Full-year revenue share (explode West to highlight it)
region_totals = df.groupby("region")["revenue"].sum()
region_colors = [colors.get(r, "#888") for r in region_totals.index]
explode = [0.08 if r == "West" else 0 for r in region_totals.index]

axes[1].pie(
    region_totals.values, labels=region_totals.index,
    autopct="%1.1f%%", colors=region_colors,
    startangle=90, explode=explode
)
axes[1].set_title("2023 Revenue Share by Region\n(West highlighted)", fontweight="bold")

plt.tight_layout()
plt.savefig("west_region_analysis.png", dpi=150, bbox_inches="tight")
print("Chart saved: west_region_analysis.png")

The Recommendation

Priya completes the analysis on Tuesday afternoon. She writes a one-page summary that Sandra can present directly to the executive team:


Acme Corp West Region Analysis: FY 2023
Prepared by: Priya Okonkwo, Acting Senior Analyst
Date: January 16, 2024

Finding: West region's $260k revenue miss (42% shortfall vs. target) is a sales capacity problem, not a market, product, or individual performance problem.

Evidence:

1. West region average order value ($2,180) is nearly identical to other regions ($2,205). When deals close, they are comparable in size and quality.
2. West region gross margin (51.1%) matches the company average (51.1%). Profitability per deal is not lower.
3. West region has only 15 active customer accounts. North has 42, East has 38, South has 41.
4. West region has 1 salesperson (Dave Nguyen) carrying all 15 accounts at 15.0 customers/rep, a normal account load.
5. The performance gap is structural and consistent across all 12 months, not concentrated in any quarter.

Conclusion: Dave Nguyen's performance is sound. The West territory simply cannot be properly covered by one person. He manages his existing accounts well but lacks capacity to prospect and close new business simultaneously.

Recommendation: Hire 2 additional West region sales representatives in Q1 2024. Target enterprise customers in the 200–500 employee range (Acme's proven sweet spot). Benchmark: if West reaches 40 accounts at the company's average revenue per customer ($18,370), West region revenue would reach $734,800 — above the original target.

Financial projection: At 50% gross margin and $85k/rep total cost, each rep needs to generate $170k in new revenue to break even. At 40 accounts and average revenue, two new reps are projected to reach break-even within 18 months of hiring.
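The benchmark and break-even figures in the summary can be checked with a few lines of arithmetic. This is a minimal sketch using only the numbers quoted in this case study; the variable names are made up for illustration.

```python
# Figures as quoted in the one-page summary and earlier in the case study.
west_2023_revenue = 359_732
original_target = 620_000
avg_revenue_per_customer = 18_370
target_accounts = 40
gross_margin = 0.50
cost_per_rep = 85_000
new_reps = 2

# Benchmark: West at 40 accounts, each at the company-average revenue.
projected_revenue = target_accounts * avg_revenue_per_customer

# Break-even: revenue whose gross profit covers one rep's total cost.
breakeven_per_rep = cost_per_rep / gross_margin
breakeven_total = breakeven_per_rep * new_reps

# Extra revenue the benchmark implies over the 2023 actual.
incremental_revenue = projected_revenue - west_2023_revenue

print(f"Projected revenue at benchmark: ${projected_revenue:,.0f}")
print(f"Above original target:          {projected_revenue > original_target}")
print(f"Break-even revenue per rep:     ${breakeven_per_rep:,.0f}")
print(f"Break-even revenue, two reps:   ${breakeven_total:,.0f}")
print(f"Incremental revenue implied:    ${incremental_revenue:,.0f}")
```

The arithmetic says nothing about timing. How long it takes to reach break-even depends on how quickly new reps ramp up, which the 2023 data cannot show.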


The Outcome

Sandra presents the analysis at the January 22nd executive meeting. Marcus Webb, the IT Manager who initially questioned the value of Priya's Python work, reviews the one-pager and says: "This is exactly the kind of analysis we should have been doing every quarter."

The West region hiring plan is approved: two new sales reps, with a Q1 2024 start date. Dave Nguyen is promoted to West Region Sales Lead, with the two new hires reporting to him.

By Q3 2024, West region is tracking at $490,000 annualized — still below target, but trending in the right direction for the first time in the region's history.


What This Case Study Illustrates

The right question matters more than the right answer. Sandra did not ask "what happened to West region revenue?" She asked "is this a Dave problem, a market problem, a product problem, or something else?" That framing led to a much more useful analysis.

Testing several competing hypotheses is more valuable than looking for confirmation. Priya tested four hypotheses and eliminated three of them before confirming the fourth. If she had started with the capacity hypothesis and stopped there, she might have confirmed it without showing that the others were false, which would have been a less convincing argument.

Negative findings are evidence. The fact that average order value was nearly identical in West and other regions is a finding, not an absence of findings. "The problem is NOT deal quality" is just as important to state as "the problem IS sales capacity."

Python made the analysis replicable and transparent. Sandra could look at the code and the data and verify every number herself. The conclusion was not "trust me" — it was "here is the code, here is the data, here are the results." That transparency builds organizational trust in analytics.