Case Study 2: Corporate Training That Actually Worked

Redesigning a Two-Day Workshop Using Learning Science — and Measuring What Stuck


Every corporate training professional has seen the same scene: a two-day workshop, engaged participants, enthusiastic end-of-training evaluations ("Loved it! 5 stars!"), and then... essentially nothing changes in behavior three months later.

This is so common that it has a name in L&D circles: the "smile sheet" problem. End-of-training satisfaction surveys (colloquially called "smile sheets") measure whether participants enjoyed the training, not whether they learned anything or changed their behavior. These two things correlate weakly at best.

This case study follows Tomas, a learning and development manager at a regional healthcare staffing company, who redesigned his organization's two-day Customer Excellence training program — and then actually measured what the redesign did to retention at 30 and 90 days.


The Original Program

Tomas's company trained approximately 200 new hires per year in Customer Excellence — a program covering service principles, communication protocols, escalation procedures, documentation standards, and regulatory compliance basics.

The original two-day format was standard:
  • Day 1: Content delivery (presentations, videos, slides)
  • Day 2: More content delivery, plus one role-play exercise per participant

End-of-training satisfaction scores were consistently high (average 4.3 out of 5). Participants reported the training as engaging and valuable.

But in exit interviews, six-month performance reviews, and customer feedback data, the organization consistently saw:
  • New hires forgetting regulatory compliance details (a risk issue)
  • Communication protocol errors in the first 90 days
  • Heavy reliance on "I don't remember from training, let me find out" responses to customer questions (which the service standards specifically aimed to reduce)

Tomas had a hypothesis: high satisfaction scores meant the training was pleasant, not that it was effective. He commissioned a pre/post retention study. The results were sobering:

  • Immediately post-training: average knowledge test score 84%
  • 30 days post-training: 61%
  • 90 days post-training: 44%

Nearly half of the tested knowledge (a 40-point drop, from 84% to 44%, roughly 48% of the immediate post-training score) was gone within 90 days. The training was doing almost nothing to produce lasting knowledge.
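To make that arithmetic explicit, here is a quick sketch using the study's own numbers (the variable names are ours, purely for illustration):

```python
# Knowledge test scores from the pre/post retention study.
scores = {"post_training": 84, "day_30": 61, "day_90": 44}

# Absolute drop, in percentage points, from immediately post-training to 90 days.
point_drop = scores["post_training"] - scores["day_90"]  # 40 points

# Relative loss: the fraction of demonstrated knowledge that disappeared.
relative_loss = point_drop / scores["post_training"]  # 40 / 84 ~= 0.476

print(f"Drop: {point_drop} points; relative loss: {relative_loss:.0%}")
# -> Drop: 40 points; relative loss: 48%
```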


The Redesign Principles

Tomas worked with a learning consultant to apply four core principles from the learning science literature:

Principle 1: Retrieval practice beats re-exposure. Replace "we'll cover that in the presentation" with "let's retrieve what you know about that before we cover it, then cover it, then retrieve again."

Principle 2: Spacing produces more durable learning than massing. Two consecutive days of training is massing. Distributing training across time, with gaps between sessions, produces substantially better long-term retention. In Tomas's organization, however, the two-day format was an institutional requirement, so the redesign worked within that constraint by distributing practice in the weeks after the training.

Principle 3: Interleaving produces more transfer than blocked practice. Don't teach all the communication protocols in one block, then all the escalation procedures in another block. Mix them. Require discrimination learning — which protocol applies in which situation? — not just recall within a category.
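To make the contrast concrete, here is a minimal sketch of blocked versus interleaved ordering. The topic names mirror the program's modules; the scheduling code itself is a generic illustration, not Tomas's actual materials:

```python
import random

# Practice items grouped by module, named after the program's topic areas.
items = {
    "communication_protocols":  ["comm_1", "comm_2", "comm_3"],
    "escalation_procedures":    ["esc_1", "esc_2", "esc_3"],
    "compliance_documentation": ["doc_1", "doc_2", "doc_3"],
}

def blocked_order(items: dict) -> list:
    """All items from one category, then the next: the learner always knows
    which procedure applies, so no discrimination is required."""
    return [item for category in items.values() for item in category]

def interleaved_order(items: dict, seed: int = 0) -> list:
    """Mix items across categories so each exercise forces the learner to
    first decide WHICH procedure applies, then apply it."""
    pool = blocked_order(items)
    random.Random(seed).shuffle(pool)
    return pool

print(blocked_order(items))      # ['comm_1', 'comm_2', 'comm_3', 'esc_1', ...]
print(interleaved_order(items))  # categories mixed in unpredictable order
```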

Principle 4: Follow-up reinforcement must happen. If training ends on Friday and there's no further contact with the material until a six-month performance review, forget it. Build spaced reinforcement into the weeks following training.
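One common way to operationalize Principles 2 and 4 together is an expanding-interval review calendar. The sketch below is generic: the 1/3/7/14/30-day intervals are illustrative defaults from the spacing literature, not Tomas's schedule (his program, described in the next section, used daily exercises whose type changed over eight weeks):

```python
from datetime import date, timedelta

def review_dates(training_end: date,
                 intervals: tuple = (1, 3, 7, 14, 30)) -> list:
    """Return spaced review dates after training ends.

    Expanding intervals concentrate retrieval early, when forgetting is
    fastest, and thin out once the material has stabilized.
    """
    return [training_end + timedelta(days=d) for d in intervals]

for d in review_dates(date(2024, 3, 15)):
    print(d)
# 2024-03-16  2024-03-18  2024-03-22  2024-03-29  2024-04-14
```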


The Redesigned Program

Pre-training homework (1 week before): Instead of arriving at the training cold, participants receive a pre-training packet: 5-10 "question preview" items that introduce the training topics as questions, not as content. "What do you think the most important factor is in managing an escalated customer call?" "What do you already know about HIPAA compliance in customer interactions?"

This has two functions: it activates existing knowledge (making new information more memorable) and it creates "desirable difficulty" — the question is genuinely harder to answer without the training, creating a felt knowledge gap that the training fills.

Day 1 redesign: Each module opens with a 5-minute "pre-test" — participants answer five questions about the module's content before the content is delivered. The pre-test is explicitly ungraded and no one is expected to know all the answers. Its purpose: orient participants to what they'll be learning, create gaps, and establish a baseline for self-assessment.

Each module closes with a "retrieval synthesis": participants close their materials and write, in 3-5 minutes, the three most important things from the module in their own words. These notes are theirs to keep; they become the starting point for the post-training reinforcement program.

Day 2 redesign (interleaved application): Day 2 applies the previous day's content through interleaved scenarios — customer service situations that require integrating knowledge from multiple Day 1 modules. Participants can't just retrieve "communication protocol" — they have to figure out whether the situation calls for communication protocol, escalation procedure, or compliance documentation, and then apply the right one.

This is more difficult than blocked application exercises and initially produces more errors. This is by design: the errors, discussed and corrected in real time, produce better discrimination learning than a scenario where the right category is already specified.

Post-training reinforcement program (8 weeks): This is the most significant change. For 8 weeks after training, participants receive a daily 5-minute email-based retrieval exercise, structured in four two-week phases (a scheduling sketch follows the list):
  • Weeks 1-2: short retrieval questions on training content (3-5 questions, 2-3 minutes)
  • Weeks 3-4: scenario-based retrieval (a short case scenario; participants write their response)
  • Weeks 5-6: application of content to participants' actual recent work situations
  • Weeks 7-8: integration questions combining multiple training topics
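A minimal sketch of how that day-to-exercise-type mapping could be driven in code; the week bands are the program's actual structure, while the names and implementation are hypothetical:

```python
# Program day (1-56) -> exercise type, following the four two-week phases.
PHASES = [
    (range(1, 15),  "short retrieval questions"),  # weeks 1-2
    (range(15, 29), "scenario-based retrieval"),   # weeks 3-4
    (range(29, 43), "application to own work"),    # weeks 5-6
    (range(43, 57), "integration across topics"),  # weeks 7-8
]

def exercise_type(program_day: int) -> str:
    for days, phase in PHASES:
        if program_day in days:
            return phase
    raise ValueError(f"day {program_day} is outside the 8-week program")

assert exercise_type(1)  == "short retrieval questions"
assert exercise_type(20) == "scenario-based retrieval"
assert exercise_type(56) == "integration across topics"
```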

Participation was mandatory (and tracked) for the first cohort, then strongly encouraged with manager visibility for subsequent cohorts. Completion rates averaged 78% across the 8-week program.


The Results

Tomas ran a controlled comparison: one cohort received the redesigned program, one cohort received the legacy program. Both cohorts were assessed at post-training, 30 days, and 90 days.

Post-training (immediately after):
  • Legacy: 84% (slightly higher; the traditional program covered more material in less time, producing higher immediate performance)
  • Redesigned: 81%

30 days post-training:
  • Legacy: 61%
  • Redesigned: 74%

90 days post-training:
  • Legacy: 44%
  • Redesigned: 71%

The pattern is striking. At the immediate post-training measurement, the legacy program slightly outperformed (probably due to the "illusion of knowing" created by the familiarity-based passive learning). By 30 days, the redesigned program had pulled ahead. By 90 days, the gap was 27 percentage points.
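The gap trajectory can be read straight off the comparison data (a small sketch, using only the scores reported above):

```python
# Knowledge test scores (%) from the controlled comparison.
legacy     = {"post": 84, "day_30": 61, "day_90": 44}
redesigned = {"post": 81, "day_30": 74, "day_90": 71}

for point in ("post", "day_30", "day_90"):
    gap = redesigned[point] - legacy[point]
    print(f"{point}: redesigned {redesigned[point]}% vs legacy {legacy[point]}%"
          f" (gap {gap:+d} points)")
# post: -3 points, day_30: +13 points, day_90: +27 points
```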

Behavioral measures also improved: the redesigned cohort's managers reported fewer protocol errors, fewer "I'll have to find out" customer responses, and better compliance documentation quality in the first 90 days compared to the legacy cohort.


What Made the Biggest Difference

Post-hoc analysis (Tomas tested different components in subsequent cohorts) suggested the biggest contributors to the 90-day retention improvement were:

  1. The post-training reinforcement program (largest contributor — the 8-week follow-up accounted for a substantial portion of the retention gap)
  2. The interleaved scenario practice on Day 2 (better transfer and discrimination learning)
  3. Module-level retrieval synthesis (end-of-module free recall)

The pre-training question preview had a smaller but measurable effect on the depth of learning during training (participants who engaged with it more thoroughly showed slightly better retention at 30 days).


The Business Case

Tomas made a business case for the redesign that went beyond retention scores:

  • The post-training reinforcement program cost approximately 40 hours of one-time design work, plus minimal server-side administration
  • The compliance error rate in the first 90 days dropped by approximately 35% for the redesigned cohort
  • Customer satisfaction scores for the redesigned cohort's service interactions were 8% higher at 90 days than for the legacy cohort

"Compliance errors have real financial consequences — in healthcare staffing, a compliance failure can mean a contract termination or a regulatory penalty. A 35% reduction in compliance errors in the first 90 days more than justified the entire program redesign cost in the first six months."

The most resistant stakeholder — the COO, who had initially questioned whether a 5-minute daily email exercise was worth the bother — was, by month six, asking whether the model could be applied to other training programs.


What Tomas Would Do Differently

"I wish we'd started with the retention measurement earlier. We ran the smile sheets for years without any data on what actually stuck. If I'd had the 30-day and 90-day retention data from the beginning, I think the redesign would have happened much sooner.

The other thing: the post-training program is most effective when managers are involved — when they reference the training in one-on-ones, ask questions that require applying the training, and create performance contexts where the new knowledge is actually used. The training science can do a lot, but if the workplace doesn't reinforce the learning, even the best-designed post-training program has limited impact."


This case study illustrates a broader principle: the gap between training and performance is almost always a learning design gap, not a motivation or intelligence gap. When the design respects how learning actually works — retrieval, spacing, interleaving, reinforcement — the performance follows.