Case Study 33-02: Maya and the "Do We Need AI?" Question

Character: Maya Reyes, freelance business analytics consultant
Setting: Maya is two weeks into an engagement with Northfield Supply, a regional distributor with 80 employees. The CEO, David Park, has called her in specifically because he has heard that "AI and machine learning" can transform sales operations.


The Opening Request

"We want to use artificial intelligence to optimize our sales process," David says in their first meeting. He slides a magazine article across the table. It is about a large retailer that used "AI-powered predictive analytics" to increase revenue by 23%.

Maya reads the headline. She has seen articles like this many times. She has also seen what those implementations look like under the hood — and what they cost.

"Tell me what's happening in your sales process right now that you want to fix," she says. "Not the solution. The problem."

David pauses. It is a question he was not expecting.


Listening Before Diagnosing

Over the next hour, Maya asks questions and takes notes. By the end, she has a picture of Northfield Supply's actual situation:

The stated problem: "We don't know which opportunities to prioritize. Our reps spend time on deals that go nowhere."

The underlying symptoms David describes:

1. The sales pipeline has 200+ open opportunities at any given time, but only 40–50 close each quarter
2. Reps spend significant time on small-value, low-probability deals while occasionally missing larger opportunities
3. The CRM is inconsistently filled out — maybe 60% of opportunities have complete data
4. Reporting is done in Excel, manually exported from the CRM every Monday morning
5. There is no consistent definition of what makes an opportunity "qualified"
6. Senior reps claim they can predict deal outcomes intuitively; junior reps have no framework at all

Maya writes a list. On one side: "ML could help." On the other: "Not an ML problem."


Applying the ML Decision Framework

Maya uses the framework she has developed over years of consulting engagements. She works through it systematically.

Check 1: Is There a Pattern to Find?

Machine learning requires that the outcome (deal closing vs. not closing) is actually predictable from available data. If deal outcomes are essentially random — or driven entirely by factors that are never captured in the CRM — no amount of ML will help.

Maya asks David: "When a deal closes, why does it close? When a deal dies, why does it die?"

His answer is revealing: "It's usually about the buyer's budget cycle, whether our rep built a relationship with the right person, and whether we could match their timeline. Oh, and price. Price matters a lot for the commodity items."

Maya probes: "Are any of those things recorded in the CRM?"

"Budget cycle — sometimes, if the rep notes it. Relationship quality — no, that's in someone's head. Timeline — almost never. Price — yes, we have quote amounts."

Maya's assessment: Most of the predictive signal exists only in people's heads or in unstructured notes. The structured data in the CRM is incomplete and inconsistently maintained. This is a fundamental data quality problem, not a modeling problem.

Check 2: Do You Have Enough Labeled Historical Data?

Maya asks to see the historical opportunity data. The CRM has records going back three years: 847 closed/won opportunities and 1,203 closed/lost opportunities.

Two thousand labeled examples is technically enough to train a basic model — but only if the features are meaningful and complete. Maya looks at the completeness of the key fields:

Field                  Completeness
Company size           71%
Industry               83%
Deal value             94%
Sales rep              100%
Lead source            68%
Days in pipeline       100% (calculated)
Contact title          42%
Budget confirmed       29%
Decision timeline      31%
Competitor mentioned   18%

The fields with the most predictive potential — budget confirmed, decision timeline, contact title, competitor — are also the least complete. The complete fields (sales rep, days in pipeline, deal value) are either not causal predictors or are proxies for sales rep behavior rather than deal quality.

Maya's assessment: The data is too incomplete and inconsistently captured to support a reliable predictive model. Training a model on 30% complete data for the key features would produce something that looks like it works in cross-validation but fails to generalize.
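A completeness audit like Maya's takes only a couple of pandas calls. The sketch below runs it against a hypothetical miniature export (the field names and values are invented for illustration):

```python
import pandas as pd

# Hypothetical mini-export with gaps, mirroring the fields audited above.
df = pd.DataFrame({
    "deal_value": [12000, 8000, None, 15000],
    "budget_confirmed": [True, None, None, None],
    "decision_timeline": ["Q3", None, None, None],
})

# Completeness = share of non-null values per column, as a percentage.
completeness = df.notna().mean().mul(100).round(0)
print(completeness.to_string())
```

Running this against the real export is how Maya produced the table above: one `notna().mean()` per column, no modeling required.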

Check 3: Would a Simpler Solution Solve 80% of the Problem?

This is the question Maya always asks before recommending ML.

She asks David: "If I could show you, right now, which deals in your pipeline are most likely to close — based on simple rules that your best sales reps already follow intuitively — would that solve your problem?"

"What kind of rules?"

"Things like: deals with confirmed budget and a defined timeline close at 3x the rate of deals without those. Deals that have been in the pipeline more than 90 days without activity are almost never going to close. Industry X converts better than industry Y for your product mix."

David thinks about it. "Yes, actually. If we could systematically apply what our best reps know, that would fix most of it."

Maya writes in her notes: Rules + reporting = 80% of the value. ML = marginal improvement on top, not the foundation.

Check 4: What Are the Costs of Being Wrong?

A lead scoring model makes two types of errors:

  • False positive (predicts a deal will close, it doesn't): rep over-invests time in a dead deal
  • False negative (predicts a deal won't close, it does): rep under-invests, deal dies from neglect

In Northfield's context, the cost of false negatives is high — they are already missing deals they should be winning. But a model that flags 60% of the pipeline as high-probability is worse than useless; it just shifts the prioritization problem.

With incomplete data, the model's error rate would be high enough that both types of errors would be frequent. The cost of a wrong model recommendation is real: reps who act on bad model output and get burned once will stop trusting it, and you lose not just the ML investment but the goodwill to try again.
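The asymmetry can be made concrete with a back-of-the-envelope expected-cost calculation. Every number below is hypothetical, chosen only to show the shape of the tradeoff, not Northfield's actual figures:

```python
# Hypothetical weekly cost of acting on model errors.
# All numbers are invented for this sketch.
deals_scored = 200
fp_rate, fn_rate = 0.25, 0.20   # plausible error rates on weak data
cost_fp = 300     # dollars of rep time wasted chasing a dead deal
cost_fn = 2500    # expected value lost when a winnable deal is neglected

# Expected cost per week of trusting an unreliable model.
expected_cost = deals_scored * (fp_rate * cost_fp + fn_rate * cost_fn)
print(f"${expected_cost:,.0f}")
```

The point of the arithmetic is not the total; it is that false negatives dominate the cost even at a lower rate, which is exactly Northfield's situation.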

Check 5: Do You Have the Infrastructure to Deploy and Maintain a Model?

Maya asks about the technical team. There is one IT person at Northfield, focused on infrastructure. No data engineering, no MLOps, no way to schedule a model to run against the CRM on a weekly basis without significant setup work.

Deploying a model that updates weekly, with outputs visible in the CRM, would require:

  • A data pipeline from CRM to model input
  • Model serving infrastructure
  • An integration to write scores back to the CRM
  • Monitoring to detect when performance degrades

For an 80-person company with one IT resource, this is not a realistic lift.


Maya's Recommendation: Not ML

After two days of discovery, Maya sits down with David and delivers her assessment.

"Here is what I found. You have a real problem — deals are not being prioritized well, and reps are spending time in the wrong places. Machine learning is the wrong solution for that problem right now, for three reasons.

First: data quality. The fields that would make an ML model meaningful — budget confirmation, decision timeline, contact quality — are captured less than a third of the time. A model trained on incomplete data will produce unreliable scores. Your reps will lose trust in it after the first few wrong predictions.

Second: infrastructure. Deploying and maintaining a model requires engineering work that your team doesn't currently have capacity for. We'd be building on sand.

Third: the simpler solution works. A set of four or five qualification rules, applied consistently, and visible in a dashboard your team can actually use, will get you 80% of the value you're looking for. That is solvable in four weeks, not six months."

David looks skeptical. "So you're saying we don't need AI at all?"

"I'm saying you don't need it yet. In 12–18 months, if you've tightened up your CRM hygiene and built consistent qualification practices, you'll have the data foundation that makes an ML model worth building. Right now, you'd be spending significant money to automate a process that's broken at the input."


What Maya Builds Instead

Maya's actual engagement produces three deliverables:

Deliverable 1: A Qualification Scoring Rubric (No Code)

Working with the three most experienced sales reps, Maya documents the criteria they use implicitly. The output is a 5-point scoring rubric applied at deal entry:

Criterion                                        Points
Budget confirmed (not assumed)                   2
Decision maker identified and engaged            2
Timeline defined (close date realistic)          1
Need is quantified (not "we might need this")    1
Competition identified                           1
Total possible                                   7

Deals scoring < 3 are flagged for qualification review, not advancement. This single change, fully manual, addresses the core issue: reps were advancing unqualified deals to avoid the awkward conversation about budget.
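The rubric is deliberately manual, but its logic is simple enough to sketch as a function. The field names here are hypothetical stand-ins for the checklist items:

```python
def rubric_score(deal: dict) -> int:
    """Sum the qualification rubric points for one deal.

    Northfield applies the rubric by hand at deal entry; this is only
    a sketch of its logic, with hypothetical field names.
    """
    score = 0
    if deal.get("budget_confirmed"):
        score += 2
    if deal.get("decision_maker_engaged"):
        score += 2
    if deal.get("timeline_defined"):
        score += 1
    if deal.get("need_quantified"):
        score += 1
    if deal.get("competition_identified"):
        score += 1
    return score

# A deal scoring below 3 is flagged for qualification review.
deal = {"budget_confirmed": True, "timeline_defined": True}
print(rubric_score(deal))  # 3: just clears the review threshold
```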

Deliverable 2: A Pipeline Health Dashboard (Python + pandas)

Maya builds a dashboard that the sales manager runs every Monday morning by executing one Python script against a CRM data export. It takes about 20 minutes to set up and runs in 3 seconds.

"""
Pipeline health dashboard for Northfield Supply.
Reads the weekly CRM export and produces a prioritized deal list.
"""

import pandas as pd

def load_pipeline(filepath: str) -> pd.DataFrame:
    """Load and clean the weekly CRM export."""
    df = pd.read_csv(filepath, parse_dates=["created_date", "last_activity_date"])

    # Fill calculated fields
    df["days_in_pipeline"] = (pd.Timestamp.today() - df["created_date"]).dt.days
    df["days_since_activity"] = (
        pd.Timestamp.today() - df["last_activity_date"]
    ).dt.days

    return df


def score_deal(row: pd.Series) -> float:
    """
    Rule-based deal score (0–100).
    Based on qualification rubric agreed with sales team.
    """
    score = 50.0   # base score

    # Positive signals
    if row["budget_confirmed"]:
        score += 20
    if row["decision_maker_engaged"]:
        score += 15
    if row["days_in_pipeline"] < 60:
        score += 10  # fresh deals more likely to close
    if row["deal_value"] > 10_000:
        score += 5   # larger deals get more attention

    # Negative signals (note the stale penalties stack)
    if row["days_since_activity"] > 30:
        score -= 25   # stale = dead
    if row["days_since_activity"] > 60:
        score -= 20   # very stale: applied on top of the 30-day penalty
    if row["days_in_pipeline"] > 180:
        score -= 15   # deals that old rarely close

    return max(0.0, min(100.0, score))


def generate_report(filepath: str) -> None:
    """Generate the Monday morning pipeline report."""
    df = load_pipeline(filepath)

    # Apply scoring rules
    df["priority_score"] = df.apply(score_deal, axis=1)
    df["priority_tier"] = pd.cut(
        df["priority_score"],
        bins=[0, 30, 60, 80, 100],
        labels=["Deprioritize", "Watch", "Active", "High Priority"],
    )

    # Summary statistics
    print("=" * 60)
    print("NORTHFIELD SUPPLY — PIPELINE HEALTH REPORT")
    print(f"Generated: {pd.Timestamp.today().strftime('%Y-%m-%d')}")
    print("=" * 60)

    print(f"\nTotal open opportunities: {len(df)}")
    print(f"Total pipeline value:     ${df['deal_value'].sum():,.0f}")

    print("\nPriority distribution:")
    tier_summary = df.groupby("priority_tier", observed=True).agg(
        count=("deal_value", "count"),
        total_value=("deal_value", "sum"),
    )
    print(tier_summary.to_string())

    print("\nTop 20 deals to focus on this week:")
    top_deals = (
        df[df["priority_tier"].isin(["High Priority", "Active"])]
        .sort_values("priority_score", ascending=False)
        .head(20)[["company_name", "deal_value", "priority_score",
                    "priority_tier", "days_in_pipeline", "assigned_rep"]]
    )
    print(top_deals.to_string(index=False))

    print("\nDeals to deprioritize (> 60 days stale):")
    stale = df[
        (df["priority_tier"] == "Deprioritize") &
        (df["days_since_activity"] > 60)
    ][["company_name", "deal_value", "days_since_activity", "assigned_rep"]]
    print(f"  {len(stale)} deals, ${stale['deal_value'].sum():,.0f} in value")
    print("  Recommendation: close these as lost or schedule qualification call")

The key insight: this is not machine learning. It is a carefully designed scoring function based on domain knowledge, implemented in 60 lines of Python. It does not learn from data. It applies rules that experienced humans already knew.

And it works. After six weeks:

  • 40+ stale deals were formally closed as lost, clearing the pipeline
  • Reps reported that the weekly report saved them 30–45 minutes of manual pipeline review
  • The qualification rubric reduced the number of unqualified deals entering the pipeline by roughly 25%

Deliverable 3: A Data Collection Improvement Plan

Maya documents exactly what data would need to be captured — and how consistently — to make an ML model worth building in 18 months. She gives the sales manager a checklist: what fields to make required in the CRM, what training is needed for reps, and what a "clean enough" dataset would look like.
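A readiness check of this kind can be automated against the weekly export. The thresholds and field names below are hypothetical placeholders for Maya's actual checklist:

```python
import pandas as pd

# Hypothetical "clean enough" completeness targets from the checklist.
REQUIRED_COMPLETENESS = {
    "budget_confirmed": 0.80,
    "decision_timeline": 0.80,
    "contact_title": 0.70,
}

def readiness_report(df: pd.DataFrame) -> dict:
    """Return each checklist field's actual completeness vs. its target."""
    return {
        field: (round(df[field].notna().mean(), 2), target)
        for field, target in REQUIRED_COMPLETENESS.items()
    }

# Invented mini-export to exercise the check.
df = pd.DataFrame({
    "budget_confirmed": [True, True, None, True],
    "decision_timeline": ["Q3", None, None, "Q4"],
    "contact_title": ["VP Ops", "Buyer", None, "CFO"],
})
print(readiness_report(df))
```

Run monthly, a report like this tells the sales manager when the dataset has crossed the "worth training on" line, turning the 18-month plan into something measurable.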

This is the unglamorous work that makes future ML projects actually succeed.


The Follow-Up Conversation

Three months later, David sends Maya an email.

"The dashboard and scoring rubric have been more useful than anything we've implemented in the last two years. The pipeline is cleaner, reps are focused on the right deals, and close rates are up. When do you think we'll be ready for the machine learning step?"

Maya replies:

"Six months from now, if you keep the CRM hygiene discipline you've built. The key fields you need — budget, timeline, decision maker engagement — are now at 70%+ completeness, up from under 30%. Another two quarters of that and you'll have a labeled dataset worth training on.

When that time comes, the model will be an enhancement to the scoring system you already have, not a replacement for the discipline you've built. The model will find patterns in the data that the rules miss. But the rules will still carry most of the weight.

For now, keep doing what's working."


What This Case Study Teaches

The most important diagnostic question is: "What simpler solution should we try first?"

ML is a powerful tool, but it is not the first tool. Before reaching for scikit-learn, reach for:

1. A cleaner dashboard or report
2. A simple rule-based system derived from domain knowledge
3. A process improvement that generates better data

Data quality is a prerequisite, not a problem ML solves. A model trained on incomplete, inconsistently-captured data will produce unreliable outputs that erode trust in the entire data practice.

The infrastructure to deploy and maintain a model is part of the cost. Factor it in before recommending ML to a small organization.

"Better reporting solved 80% of the problem" is a good outcome, not a failure. Maya's engagement was a success precisely because she did not oversell ML. The client got what they actually needed, not what sounded impressive in a press release.

The ML project will happen — in 18 months, when the foundation is ready. And it will succeed, because the foundation will be solid.