Case Study: The Empirical Rule in Quality Control — When Every Millimeter Matters

The Setup

You probably don't think about ball bearings very often. But right now, there are ball bearings inside the hard drive of your computer (if it's not solid-state), the wheels of your skateboard, the motor of your refrigerator, and the hub of every car wheel on the road outside. Ball bearings are everywhere. They reduce friction, carry loads, and keep rotating machinery spinning smoothly.

And here's the thing about ball bearings: they have to be exactly the right size. A bearing that's a fraction of a millimeter too large won't fit into its housing. One that's too small will rattle, creating friction and heat that can destroy an engine. The difference between a good ball bearing and a dangerous one is measured in hundredths of a millimeter — thinner than a human hair.

This is why the Empirical Rule isn't just a textbook exercise. In manufacturing quality control, it's the difference between a product that works and a product that kills.

The Manufacturing Process

Consider a factory that produces steel ball bearings for automotive applications. The target diameter is 10.00 mm. But no manufacturing process is perfect — every bearing comes out slightly different. The question isn't "Are they all exactly 10.00 mm?" (they aren't). The question is: "How much variation is acceptable?"

After measuring thousands of bearings, the quality control team finds:

- Mean diameter: $\bar{x} = 10.00$ mm (the process is well-calibrated; the center is right on target)
- Standard deviation: $s = 0.02$ mm
- Distribution shape: Approximately bell-shaped and symmetric (confirmed by histogram)

The engineering specification says that bearings must have diameters between 9.94 mm and 10.06 mm to function safely. Anything outside this range is defective and must be rejected.

Applying the Empirical Rule

Since the distribution is approximately bell-shaped, we can apply the Empirical Rule:

Range	Calculation	Interval	% of Bearings	Result
Within 1 SD	10.00 ± 0.02	9.98 to 10.02 mm	~68%	Well within spec
Within 2 SD	10.00 ± 0.04	9.96 to 10.04 mm	~95%	Within spec
Within 3 SD	10.00 ± 0.06	9.94 to 10.06 mm	~99.7%	Exactly at spec limits

Look at what this tells us: the engineering specification limits (9.94 to 10.06 mm) sit exactly 3 standard deviations from the mean. That means:

99.7% of bearings are within spec — they're good.
0.3% of bearings fall outside spec — they're defective.

If the factory produces 100,000 bearings per day, that's about 300 defective bearings every day. That might sound small in percentage terms (99.7% pass rate!), but 300 defective bearings per day is 300 potential safety hazards.
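The arithmetic above can be reproduced in a few lines. This is a minimal sketch using the case study's figures; the constant and function names are illustrative, and the coverage percentages are the Empirical Rule's rounded values rather than exact normal probabilities.

```python
# Figures from the case study; names are illustrative.
MEAN_MM = 10.00
SD_MM = 0.02
DAILY_OUTPUT = 100_000

# Approximate coverage stated by the Empirical Rule.
EMPIRICAL_COVERAGE = {1: 0.68, 2: 0.95, 3: 0.997}


def empirical_interval(mean, sd, k):
    """Return the (low, high) interval k standard deviations around the mean."""
    return (mean - k * sd, mean + k * sd)


for k, coverage in EMPIRICAL_COVERAGE.items():
    low, high = empirical_interval(MEAN_MM, SD_MM, k)
    print(f"Within {k} SD: {low:.2f} to {high:.2f} mm (~{coverage:.1%})")

# The spec limits sit at 3 SD, so the share outside spec is about 1 - 0.997.
expected_defects = round(DAILY_OUTPUT * (1 - EMPIRICAL_COVERAGE[3]))
print(f"Expected defective bearings per day: about {expected_defects}")
```

Changing `SD_MM` while leaving the spec limits fixed is the experiment the next section walks through.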

The Standard Deviation Detective

Here's where it gets really interesting. Watch what happens when the standard deviation changes — even slightly.

Scenario A: Tight Process ($s = 0.01$ mm)

If the factory invests in better equipment and reduces the standard deviation to 0.01 mm:

Range	Interval	% of Bearings
Within 1 SD	9.99 to 10.01 mm	~68%
Within 2 SD	9.98 to 10.02 mm	~95%
Within 3 SD	9.97 to 10.03 mm	~99.7%

Now the spec limits (9.94 to 10.06) are 6 standard deviations from the mean. The percentage of bearings outside these limits? Essentially zero — about 2 in a billion. The defect rate drops from 300 per day to effectively none.

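This is the concept behind Six Sigma quality: the idea that your spec limits should be 6 standard deviations from the process mean. The figure usually quoted for Six Sigma, fewer than 3.4 defects per million items, is larger than the 2-in-a-billion rate above because it conventionally allows the process mean to drift by up to 1.5 standard deviations over time.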
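The "2 in a billion" and "3.4 per million" figures describe different situations, and a quick calculation shows how they relate. The sketch below assumes a normal model; the 1.5 SD drift allowance is the usual Six Sigma convention rather than a measured property of this factory, and `upper_tail` is an illustrative helper name.

```python
import math


def upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Perfectly centered process with limits at +/-6 SD: both tails count.
centered = 2 * upper_tail(6.0)

# Assumed convention: the mean drifts 1.5 SD toward one limit,
# leaving 4.5 SD of room on the near side and 7.5 SD on the far side.
shifted = upper_tail(4.5) + upper_tail(7.5)

print(f"Centered process: {centered * 1e9:.1f} defects per billion")
print(f"Shifted process:  {shifted * 1e6:.1f} defects per million")
```

Under these assumptions the centered rate comes out near 2 per billion and the shifted rate near 3.4 per million.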

Scenario B: Sloppy Process ($s = 0.05$ mm)

Now imagine the equipment deteriorates and the standard deviation rises to 0.05 mm, two and a half times its original value:

Range	Interval	% of Bearings
Within 1 SD	9.95 to 10.05 mm	~68%
Within 2 SD	9.90 to 10.10 mm	~95%
Within 3 SD	9.85 to 10.15 mm	~99.7%

Now 1 standard deviation nearly fills the entire spec range. The spec limits (9.94 to 10.06) are only 1.2 standard deviations from the mean (0.06 / 0.05 = 1.2). The Empirical Rule puts about 68% of bearings within 1 SD, and a normal model puts about 77% within 1.2 SD, so only about 77% of bearings are within spec. That's a 23% defect rate: 23,000 defective bearings out of every 100,000. The factory would hemorrhage money on rework and rejected parts. More critically, some defective bearings might slip through inspection and end up in vehicles.

The Lesson: Standard Deviation IS Quality

Scenario	$s$ (mm)	Spec limits in SD units	Defect rate
Tight	0.01	±6 SD	~0.0000002%
Normal	0.02	±3 SD	~0.3%
Sloppy	0.05	±1.2 SD	~23%
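The defect rates in this table can be checked with a short calculation. The sketch below assumes the diameters follow a normal distribution exactly, which is stronger than "approximately bell-shaped"; `fraction_outside` is an illustrative name. Note that the exact normal tail at ±3 SD is about 0.27%, which the Empirical Rule rounds to 0.3%.

```python
import math

MEAN_MM = 10.00
LOWER_SPEC_MM = 9.94
UPPER_SPEC_MM = 10.06


def fraction_outside(mean, sd, lower, upper):
    """Share of a normal(mean, sd) population falling outside [lower, upper]."""
    # erfc keeps precision in the far tails, where 1 - cdf would lose digits.
    below = 0.5 * math.erfc((mean - lower) / (sd * math.sqrt(2)))
    above = 0.5 * math.erfc((upper - mean) / (sd * math.sqrt(2)))
    return below + above


scenarios = {"Tight": 0.01, "Normal": 0.02, "Sloppy": 0.05}
rates = {}
for name, sd in scenarios.items():
    sd_units = (UPPER_SPEC_MM - MEAN_MM) / sd
    rates[name] = fraction_outside(MEAN_MM, sd, LOWER_SPEC_MM, UPPER_SPEC_MM)
    print(f"{name:6s} s={sd:.2f} mm  limits at +/-{sd_units:.1f} SD  "
          f"defect rate {rates[name]:.3g}")
```

The spec limits never move; only `sd` changes, and the defect rate swings across eight orders of magnitude.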

The mean didn't change in any scenario — it was always 10.00 mm. The center was fine. What changed was the spread. And the spread made the difference between a near-perfect process and a catastrophically bad one.

This is why statisticians say spread is uncertainty (Theme 4). The standard deviation doesn't just describe the data — it predicts the future. A small standard deviation means you can predict the next bearing's size with high confidence. A large standard deviation means you can't — and when you can't predict quality, you can't guarantee safety.

Real-World Application: Control Charts

In practice, quality control teams don't just calculate the standard deviation once and call it a day. They monitor the process continuously using control charts: time-series plots of measurements with lines drawn at 1, 2, and 3 standard deviations above and below the mean.

Visual description (control chart): A line graph with individual bearing diameters plotted over time (e.g., one measurement per minute for 200 minutes). The y-axis shows diameter in mm. A horizontal solid line at 10.00 mm marks the mean. Two dashed lines at 9.98 and 10.02 mark ±1 SD. Two dashed-dotted lines at 9.96 and 10.04 mark ±2 SD. Two dotted lines at 9.94 and 10.06 mark ±3 SD (the spec limits).

Most points fall within the ±1 SD lines (the central band). A few venture into the ±2 SD zone. One point, circled in red, touches the ±3 SD line. The quality engineer investigates that bearing immediately.

Starting around measurement 150, the points begin drifting upward — the process mean is shifting. Even though no individual point has crossed the ±3 SD line yet, the pattern of drift is a warning sign. The engineer would shut down the machine for recalibration before defective bearings start rolling off the line.

Control charts use the Empirical Rule in real time. If a single measurement falls beyond 3 standard deviations, it's investigated as a potential defect or process failure. If several consecutive measurements drift in one direction (even if none individually crosses a line), that pattern signals a systematic shift — the machine needs maintenance.
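Both checks described above can be sketched in code. This is a simplified illustration, not a full control-chart implementation: the readings are made-up values, the function names are hypothetical, and the run threshold of 8 is an assumption, since the case study says only "several" consecutive measurements. A streak on one side of the mean is used here as a simple stand-in for drift.

```python
def beyond_three_sd(values, mean, sd):
    """Indices of measurements more than 3 SD from the mean."""
    return [i for i, v in enumerate(values) if abs(v - mean) > 3 * sd]


def longest_run_one_side(values, mean):
    """Length of the longest streak of consecutive points on one side of the mean."""
    longest = current = 0
    previous_side = 0
    for v in values:
        side = (v > mean) - (v < mean)  # +1 above, -1 below, 0 exactly on the mean
        if side != 0 and side == previous_side:
            current += 1
        else:
            current = 1 if side != 0 else 0
        previous_side = side
        longest = max(longest, current)
    return longest


RUN_LENGTH_THRESHOLD = 8  # assumed value; the text does not specify one

# Made-up readings (mm): stable at first, then drifting upward.
readings = [10.00, 9.99, 10.01, 9.98, 10.01, 10.02, 10.01, 10.03,
            10.02, 10.03, 10.04, 10.03, 10.07]

outliers = beyond_three_sd(readings, mean=10.00, sd=0.02)
run = longest_run_one_side(readings, mean=10.00)

print("Points beyond 3 SD at indices:", outliers)
print("Longest run on one side of the mean:", run)
if run >= RUN_LENGTH_THRESHOLD:
    print("Sustained shift suspected: schedule recalibration")
```

In this example the run check would fire before the final reading crosses the 3 SD line, which is the early warning the chart is meant to provide.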

This approach was pioneered by Walter Shewhart at Bell Labs in the 1920s and remains the foundation of statistical process control in manufacturing worldwide. Every time you drive a car, fly in an airplane, or take a factory-made medication, control charts based on the Empirical Rule helped ensure that the product met specifications.

Extension: Healthcare Quality

The same principles apply far beyond manufacturing. Consider medication dosage in a hospital pharmacy.

A pharmacy prepares intravenous (IV) medication bags with a target dosage of 500 mg. The filling process has:

- Mean: 500 mg
- Standard deviation: 5 mg
- Distribution: Approximately bell-shaped

The safe dosage range is 485 to 515 mg (±15 mg from target, or ±3 SD). By the Empirical Rule, about 99.7% of bags are within the safe range. But for a busy hospital preparing 1,000 bags per day, that means about 3 bags per day are outside the safe range.

For a critically ill patient, an under-dosed bag means the treatment may not work. An over-dosed bag could cause a dangerous reaction. Three bad bags per day — out of a thousand — is not an acceptable error rate when lives are at stake.

This is why hospitals invest in automated filling systems with smaller standard deviations. Reducing the standard deviation from 5 mg to 2 mg pushes the spec limits to ±7.5 SD from the mean — virtually eliminating dosage errors. The investment in precision is, literally, an investment in lives.
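The effect of tightening the filling process can be estimated the same way as for the bearings. The sketch below assumes the fill amounts are exactly normal and centered on 500 mg; the function names are illustrative. The exact normal figure at ±3 SD is about 2.7 bags per day, which matches the Empirical Rule's "about 3" after rounding.

```python
import math

TOLERANCE_MG = 15.0   # safe range is target +/- 15 mg
BAGS_PER_DAY = 1_000


def sigma_level(tolerance, sd):
    """How many standard deviations fit between the target and a safety limit."""
    return tolerance / sd


def expected_out_of_range(tolerance, sd, count):
    """Expected number of items outside target +/- tolerance under a normal model."""
    z = tolerance / sd
    # erfc(z / sqrt(2)) is the two-sided tail probability P(|Z| > z).
    return count * math.erfc(z / math.sqrt(2))


for sd in (5.0, 2.0):
    level = sigma_level(TOLERANCE_MG, sd)
    bad_bags = expected_out_of_range(TOLERANCE_MG, sd, BAGS_PER_DAY)
    print(f"SD = {sd:.0f} mg: limits at +/-{level:.1f} SD, "
          f"about {bad_bags:.2g} unsafe bags per day")
```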

Discussion Questions

A chip manufacturer produces microprocessors with a target clock speed of 3.5 GHz. The process has a mean of 3.5 GHz and a standard deviation of 0.05 GHz. If the specification requires clock speeds between 3.35 and 3.65 GHz, what percentage of chips meet the specification? What is the "sigma level" of this process?
Two factories produce identical bolts with a target length of 50.0 mm. Factory A has $s = 0.1$ mm and Factory B has $s = 0.3$ mm. Both have the same mean (50.0 mm). If the specification allows bolts between 49.4 and 50.6 mm, calculate the defect rate for each factory using the Empirical Rule. Which factory would you rather buy from?
Why is it critical that the distribution be approximately bell-shaped before applying the Empirical Rule in quality control? What could go wrong if the distribution were actually bimodal (perhaps because two machines with different calibrations are mixing output)?
A hospital pharmacist says, "Our medication filling process has a 99.7% accuracy rate — practically perfect." A patient safety advocate responds, "For a hospital that fills 2,000 prescriptions per day, that means 6 errors per day — one every 4 hours. That's not practically perfect." Who is right? How does this disagreement illustrate the importance of considering both percentages and absolute numbers?
Connect this case study to the concept of z-scores from Section 6.8. If a bearing measures 10.05 mm in the normal process ($s = 0.02$ mm), what is its z-score? Would you flag it as a defect? What if the process had $s = 0.01$ mm — what would the z-score be then?

Key Takeaways from This Case Study

The Empirical Rule provides a powerful prediction tool for bell-shaped distributions: 68% of values within 1 SD, 95% within 2 SDs, 99.7% within 3 SDs.
Standard deviation IS the measure of quality in manufacturing. The mean tells you the target; the standard deviation tells you how consistently you hit it.
Small changes in standard deviation have enormous consequences. Cutting the standard deviation in half can transform a defect rate from problematic to negligible.
The Empirical Rule works in real time through control charts — monitoring whether a process is stable or drifting out of control.
"99.7% good" doesn't always mean "good enough." At high volumes, even tiny defect rates translate to large absolute numbers of failures. Context matters.
Spread is uncertainty, and uncertainty is risk. In manufacturing and healthcare, reducing uncertainty (the standard deviation) directly reduces the risk of harm.