Case Study 1: The Prosecutor's Fallacy — When Probability Goes to Court


Tier 2 — Attributed Findings: This case study discusses the prosecutor's fallacy, a well-documented error in probabilistic reasoning that has affected real legal cases. The specific examples referenced (the Sally Clark case and general DNA evidence issues) are based on widely reported and academically documented events. The statistical concepts and their legal implications are accurately presented. The fictional courtroom scenario used for illustration is a composite for pedagogical purposes.


The Setting

In 1999, a British woman named Sally Clark was convicted of murdering her two infant sons. Both had died suddenly — one in 1996, the other in 1998. The prosecution's key argument hinged on a statistic: an expert witness testified that the probability of two children in the same family dying of sudden infant death syndrome (SIDS) was approximately 1 in 73 million.

One in 73 million. That sounds overwhelming. Surely, the jury thought, something this improbable can't be a coincidence. Clark must be guilty.

She wasn't. The conviction was overturned on appeal in 2003, after it was shown that the statistical reasoning was fundamentally flawed — in multiple ways. The case became one of the most famous examples of the prosecutor's fallacy, and it changed how statistical evidence is used in British courts.

This case study walks through the probabilistic errors, using the tools from Chapter 20.

The Errors

Error 1: Assuming Independence

The expert witness computed the probability by squaring the probability of a single SIDS death: if P(one SIDS death) ≈ 1/8,543, then P(two SIDS deaths) = (1/8,543)^2 ≈ 1/73 million.

This calculation assumes the two deaths are independent events. But are they?

No. If one child in a family dies of SIDS, the probability of a second SIDS death in the same family is higher than in the general population — due to shared genetic factors, shared environmental conditions, and other risk factors. The events are not independent.

import numpy as np

# If independent:
p_single = 1 / 8543
p_two_independent = p_single ** 2
print(f"P(two SIDS deaths, assuming independence): 1 in {1/p_two_independent:,.0f}")

# If correlated (second death more likely given first):
# Estimates suggest P(second | first) might be 1/100 to 1/200
p_second_given_first = 1 / 100  # Upper end of the estimated range
p_two_correlated = p_single * p_second_given_first
print(f"P(two SIDS deaths, accounting for correlation): 1 in {1/p_two_correlated:,.0f}")
print(f"\nDifference: the independence assumption makes the event look {p_two_correlated/p_two_independent:.0f}x rarer than it really is")

Accounting for correlation, the probability might be closer to 1 in 850,000 — still rare, but 85 times more likely than the 1 in 73 million figure. The independence assumption massively exaggerated the improbability.
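The 1-in-100 conditional estimate is itself uncertain, so it is worth checking how sensitive the conclusion is to it. A quick sweep over a range of plausible values (the candidates below are illustrative, not from the trial record):

```python
# Sweep plausible values of P(second SIDS | first SIDS).
# The candidate values below are illustrative, not from the trial record.
p_single = 1 / 8543

for p_second in [1/50, 1/100, 1/200, 1/400]:
    p_two = p_single * p_second
    ratio = p_two / (p_single ** 2)  # how much likelier than under independence
    print(f"P(second | first) = 1/{1/p_second:.0f}: "
          f"P(double SIDS) = 1 in {1/p_two:,.0f} "
          f"({ratio:.0f}x more likely than the independence figure)")
```

Across the whole range, the independence assumption overstates the rarity by one to two orders of magnitude, so the qualitative conclusion does not hinge on the exact estimate.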

Error 2: The Prosecutor's Fallacy (Transposing the Conditional)

Even if we accept the 1 in 73 million number, there's a deeper error. The prosecution essentially argued:

"The probability of two SIDS deaths if the defendant is innocent is 1 in 73 million. Therefore, the probability that the defendant is innocent is 1 in 73 million."

This is the prosecutor's fallacy — confusing P(evidence | innocent) with P(innocent | evidence). These are different conditional probabilities, and confusing them is exactly the error we explored with Bayes' theorem in Chapter 20.

Let's think about this carefully with Bayes' theorem:

# What we know:
# P(two SIDS deaths | innocent) ≈ very small (use 1/73,000,000 for now)
# P(two deaths | guilty) ≈ 1 (if guilty, both babies die)
#
# What we need:
# P(innocent | two deaths)
#
# We ALSO need:
# P(innocent) — the prior probability (base rate of innocence)

# Consider: among all families where two infants die,
# how many are SIDS vs. murder?

# Rough estimates for illustration:
# About 650,000 births per year in England/Wales
# SIDS rate: ~1 in 8,543 per birth → about 76 SIDS deaths/year
# Double SIDS in same family: even with correlation, very rare — maybe 1 every few years
# Double infant homicide: also extremely rare

# Let's work with these numbers
p_two_sids = 1 / 73_000_000  # Using the prosecution's number
p_two_murders = 1 / 200_000_000  # Infant double homicide is also very rare

# P(evidence | innocent) × P(innocent)
# vs.
# P(evidence | guilty) × P(guilty)

# The KEY insight: BOTH explanations are extremely rare
# The question isn't "is double SIDS rare?" but
# "is double SIDS rarer than double murder?"

p_innocent_given_deaths = (p_two_sids) / (p_two_sids + p_two_murders)
p_guilty_given_deaths = (p_two_murders) / (p_two_sids + p_two_murders)

print("Simplified Bayesian analysis:")
print(f"  P(innocent | two deaths): {p_innocent_given_deaths:.4f}")
print(f"  P(guilty | two deaths):   {p_guilty_given_deaths:.4f}")
print(f"\n  Even with the prosecution's own numbers,")
print(f"  SIDS is {p_two_sids/p_two_murders:.1f}x MORE likely than murder!")

The crucial point: the prosecution treated the rarity of double SIDS as though rarity alone settled the question, when they should have weighed it against the probability of the alternative explanation. Both double SIDS and double infant homicide are extremely rare events. The question isn't "is double SIDS rare?" but "is double SIDS rarer than double homicide?" And by most estimates the answer is no: double SIDS is actually more common than double infant homicide.
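The same comparison can be phrased in odds form: the posterior odds of innocence versus guilt equal the ratio of the two rates, assuming these are the only two explanations. A minimal sketch using the same illustrative numbers as above:

```python
# Posterior odds of innocence vs. guilt, given two infant deaths,
# assuming double SIDS and double homicide are the only explanations.
rate_double_sids = 1 / 73_000_000      # prosecution's own figure
rate_double_murder = 1 / 200_000_000   # illustrative estimate, as above

posterior_odds = rate_double_sids / rate_double_murder
p_innocent = posterior_odds / (1 + posterior_odds)

print(f"Odds of SIDS vs. murder: {posterior_odds:.2f} : 1")
print(f"P(innocent | two deaths) = {p_innocent:.4f}")
```

Even granting the prosecution its own number, the odds favor the innocent explanation by roughly 2.7 to 1.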

Error 3: The Numbers Game

There's a third error that's more subtle. Even a 1-in-73-million event will eventually happen somewhere, given a large enough population observed for long enough.

# In England and Wales, there are roughly 650,000 births per year
# Over 10 years, that's 6.5 million births
# How many families would we EXPECT to have two SIDS deaths?

births_per_year = 650_000
years = 10
# Simplified: of the ~6.5 million births over the decade, assume roughly
# 3 million belong to families with two children (rough illustration)
families_with_two_children = 3_000_000

# Even at 1 in 73 million:
expected_double_sids = families_with_two_children / 73_000_000
print(f"Expected families with double SIDS in 10 years (prosecution's number): {expected_double_sids:.2f}")
print("At this rate, a double-SIDS family would appear by chance roughly once every 250 years")
print(f"\nWith corrected (correlated) probability:")
expected_corrected = families_with_two_children / 850_000
print(f"Expected families with double SIDS: {expected_corrected:.1f}")

Even using the prosecution's inflated figure, we'd expect about 0.04 such families per decade, so over a long enough span a chance case is unremarkable. With the corrected (correlated) figures, we'd expect three or four such families per decade. The mere occurrence of two deaths, in other words, does not by itself point to murder.
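If we model the number of double-SIDS families per decade as a Poisson count with the expected values computed above, we can also ask how surprising seeing at least one such family would be. A rough sketch, reusing the same illustrative 3,000,000-family estimate:

```python
import math

# Model the number of double-SIDS families per decade as Poisson,
# using the same illustrative 3,000,000 two-child families as above.
families = 3_000_000

for label, p in [("prosecution's figure", 1 / 73_000_000),
                 ("corrected figure", 1 / 850_000)]:
    lam = families * p                    # expected double-SIDS families
    p_at_least_one = 1 - math.exp(-lam)   # Poisson P(at least one case)
    print(f"{label}: expected {lam:.2f}, "
          f"P(at least one such family) = {p_at_least_one:.1%}")
```

With the corrected figure, at least one double-SIDS family per decade is close to a certainty, not a freak event.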

A Simulation

Let's simulate the full scenario to make the fallacy visceral:

np.random.seed(42)

def simulate_double_deaths(n_families=5_000_000,
                           p_sids=1/8543, p_murder=1/500_000,
                           sids_factor=50, murder_factor=10):
    """
    Vectorized simulation: for each two-child family, draw whether the
    first child dies of SIDS or homicide, then draw the second child's
    fate at the elevated (correlated) conditional rate. A family counts
    as a double event only if both draws come up.
    """
    first_sids = np.random.random(n_families) < p_sids
    second_sids = np.random.random(n_families) < p_sids * sids_factor
    first_murder = np.random.random(n_families) < p_murder
    second_murder = np.random.random(n_families) < p_murder * murder_factor

    double_sids = int(np.sum(first_sids & second_sids))
    double_murder = int(np.sum(first_murder & second_murder))
    return double_sids, double_murder

double_sids, double_murder = simulate_double_deaths()
print("In 5,000,000 simulated two-child families:")
print(f"  Families with double SIDS:   {double_sids}")
print(f"  Families with double murder: {double_murder}")

print()
print("Key insight from Bayesian reasoning:")
print("=" * 50)
print("Question: given that two babies in a family died,")
print("which is MORE likely, SIDS or murder?")
print()
print("The prosecution said: 'Double SIDS is 1 in 73 million,")
print("so it must be murder.'")
print()
print("The correct reasoning: 'BOTH are rare. Which is")
print("less rare? SIDS, by a significant margin.'")
print()
print("The fallacy: confusing P(evidence | innocent)")
print("with P(innocent | evidence).")

The Aftermath

Sally Clark's conviction was quashed on her second appeal in 2003. The Royal Statistical Society took the unusual step of issuing a public statement criticizing the misuse of statistics in the original trial. It also emerged that a prosecution pathologist had withheld microbiological results suggesting a natural cause for the second death.

Tragically, Clark never fully recovered from the wrongful conviction and the death of her children. She died in 2007 at the age of 42.

Her case led to significant reforms in how statistical evidence is presented in British courts, and it remains a cautionary tale taught in law schools, medical schools, and statistics courses worldwide.

The Lessons

Lesson 1: P(A|B) Is Not P(B|A)

The prosecutor's fallacy is the confusion of the conditional — P(evidence | innocent) vs. P(innocent | evidence). This is exactly the error that Bayes' theorem corrects. In the chapter, we saw this with medical tests. Here, we see it with life-and-death consequences.
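The medical-test version makes the gap between the two conditionals concrete. The prevalence and error rates below are illustrative numbers chosen for this sketch, not figures from the chapter:

```python
# Illustrative screening test: 1% prevalence, 95% sensitivity,
# 5% false-positive rate (numbers chosen for the sketch).
prevalence = 0.01        # P(disease)
sensitivity = 0.95       # P(positive | disease)
false_positive = 0.05    # P(positive | no disease)

# P(disease | positive) via Bayes' theorem
p_positive = sensitivity * prevalence + false_positive * (1 - prevalence)
p_disease_given_positive = sensitivity * prevalence / p_positive

print(f"P(positive | disease) = {sensitivity:.0%}")
print(f"P(disease | positive) = {p_disease_given_positive:.0%}")
```

A test that is "95% accurate" yields only about a 16% chance of disease given a positive result, because the low base rate dominates. Same structure, same fallacy.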

Lesson 2: Always Consider the Alternative Hypothesis

The prosecution asked: "How likely is this evidence if the defendant is innocent?" But they never asked the equally important question: "How likely is this evidence if the defendant is guilty?" When both the null hypothesis (innocent) and the alternative (guilty) predict rare events, you need to compare them, not just evaluate one.
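The comparison can be packaged as a small helper. The function name and the equal-priors default are my own choices for illustration, not from the chapter:

```python
def compare_hypotheses(p_evidence_given_h1, p_evidence_given_h2,
                       prior_h1=0.5, prior_h2=0.5):
    """P(H1 | evidence) when H1 and H2 are the only candidate explanations."""
    numerator = p_evidence_given_h1 * prior_h1
    return numerator / (numerator + p_evidence_given_h2 * prior_h2)

# H1 = innocent (double SIDS), H2 = guilty (double homicide),
# using the same illustrative rates as in the case analysis
p_innocent = compare_hypotheses(1 / 73_000_000, 1 / 200_000_000)
print(f"P(innocent | two deaths) = {p_innocent:.2f}")
```

Evaluating only one hypothesis is like calling `compare_hypotheses` with the second likelihood missing: the number you get back is meaningless.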

Lesson 3: Check Independence Before Multiplying

The assumption of independence led to a dramatic overestimation of rarity. In real data — medical data, social data, any data involving humans — events within the same family, community, or context are almost never independent. Always ask: "Are these events truly independent, or might they be correlated?"
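A quick simulation shows the failure mode: when the second event is correlated with the first, the naive product of marginal probabilities badly underestimates the joint probability. The 50x factor below mirrors the illustrative correlation used earlier in this case study:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2_000_000
p = 1 / 1000   # marginal probability of the first event (illustrative)
factor = 50    # conditional probability of the second event is 50x higher

first = rng.random(n) < p
second = rng.random(n) < p * factor   # draw at the elevated conditional rate

joint = np.mean(first & second)       # counts only when both events occur
print(f"Naive product p*p:  {p * p:.1e}")
print(f"Simulated P(both):  {joint:.1e}  (~{joint / (p * p):.0f}x larger)")
```

The simulated joint probability comes out roughly 50 times larger than the product of the marginals, exactly the kind of gap that turned 1 in 850,000 into 1 in 73 million.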

Lesson 4: Rare Events Happen

In a population of millions, events with probabilities of 1 in a million are expected to happen several times. The fact that something rare happened to a specific person does not mean that person caused it.
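The arithmetic behind this lesson, sketched with a round 1-in-a-million event and an illustrative population of 60 million:

```python
p = 1e-6                  # a "one in a million" event
population = 60_000_000   # illustrative population of 60 million people

expected = population * p
p_none = (1 - p) ** population   # chance it happens to no one at all

print(f"Expected occurrences: {expected:.0f}")
print(f"P(nobody experiences it): {p_none:.1e}")
```

With 60 expected occurrences, the probability that the event happens to no one at all is astronomically small. Observing one occurrence tells you almost nothing about the person it happened to.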

Discussion Questions

  1. How would you explain the prosecutor's fallacy to a juror who has no statistical training?

  2. Can you think of other real-world situations where P(A|B) and P(B|A) might be confused? (Consider medical screening, security profiling, or hiring decisions.)

  3. The Royal Statistical Society's statement said that "statistics should be presented in a way that makes clear what can and cannot be inferred." What practices would you recommend for presenting statistical evidence in court?

  4. How does the prosecutor's fallacy relate to the base rate neglect we studied with Bayes' theorem? Are they the same error, or different errors?

Connection to the Chapter

This case study directly applies conditional probability (Section 20.4), Bayes' theorem (Section 20.5), the distinction between P(A|B) and P(B|A), the concept of independence (Section 20.3), and base rate reasoning. It's a sobering illustration that probability isn't just an academic exercise — getting it wrong has real consequences.