Chapter 9 Key Takeaways: Forensic DNA Statistics

A one-page field card. If you remember nothing else from this chapter, remember the rule in the box at the bottom.


The core claims

  • A DNA "match" is empty without a number. "These profiles match" means only that they correspond; the random match probability (RMP) is what tells you whether the correspondence is unremarkable or astronomically rare.
  • The RMP is a statement about coincidence, not guilt. "1 in 43 million" means a random unrelated person would share the profile by chance with that probability. It is not "a 1-in-43-million chance the defendant is innocent."
  • The likelihood ratio (LR) is the better frame. It compares the probability of the evidence under two stated hypotheses (the defendant is a contributor vs. an unrelated unknown person is). For a single-source match, LR ≈ 1/RMP. Its virtue is that it forces both hypotheses into the open and so resists the prosecutor's fallacy described below.
  • Mixtures need software; software needs honesty. Probabilistic genotyping computes an LR for complex mixtures the human eye cannot untangle — a real advance — but the LR is a model's estimate, not a measurement, and two valid programs can disagree by orders of magnitude on a complex sample.
  • The prosecutor's fallacy is the chief danger. Transposing $P(\text{match} \mid \text{innocent})$ into $P(\text{innocent} \mid \text{match})$ has helped convict the innocent. Its mirror, the defense fallacy, strips strong evidence of context to make it look worthless. Both are wrong.
  • Bayes shows what the match probability leaves out: the prior. Posterior odds = LR × prior odds. The scientist supplies the LR (the weight); the jury supplies the prior (drawing on all other evidence). The same LR yields a coin flip or near-certainty depending on the prior, which is why the number never decides alone.
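
The last claim can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes a single-source match so that the LR can be taken as 1/RMP with the chapter's "1 in 43 million" figure, and the three prior odds are hypothetical values chosen to show the spread, not recommendations for any case.

```python
from fractions import Fraction


def posterior_probability(lr, prior_odds):
    """Apply posterior odds = LR x prior odds, then convert odds to a probability."""
    posterior_odds = lr * prior_odds
    return posterior_odds / (1 + posterior_odds)


# Assumption: single-source match, so LR ~ 1/RMP with RMP = 1 in 43 million.
LR = 43_000_000

# Hypothetical prior odds that the defendant is the source (illustrative only).
prior_odds_examples = {
    "1 to 43,000,000 (almost no other evidence)": Fraction(1, 43_000_000),
    "1 to 100,000": Fraction(1, 100_000),
    "1 to 1,000 (substantial other evidence)": Fraction(1, 1_000),
}

for label, prior_odds in prior_odds_examples.items():
    p = posterior_probability(LR, prior_odds)
    print(f"prior odds {label}: posterior probability = {float(p):.6f}")
```

With the first prior the posterior probability is exactly 0.5, a coin flip; with the last it exceeds 0.9999. The LR is identical in all three rows, and only the prior has changed.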

Method-validity verdict (NAS 2009 / PCAST 2016)

| Method / claim | Where it sits on the validity spectrum | Key error mode |
|---|---|---|
| Single-source DNA match statistics (RMP, LR) | Strong: quantified, peer-reviewed, the field's gold standard | The error is in interpretation/language, not the math (the prosecutor's fallacy) |
| Probabilistic genotyping: simple mixtures (within validated limits) | Foundationally valid (PCAST 2016) | Operating outside validated limits; biased human inputs |
| Probabilistic genotyping: complex mixtures (many contributors, low template) | Not yet established (PCAST 2016) | Model-dependence; programs disagree; source-code opacity |
| Reporting "probability the defendant is the source/guilty" | Not science: a transposed conditional | The prosecutor's fallacy itself |

Key terms (one line each)

  • Likelihood ratio (LR) — how many times more probable the evidence is under one hypothesis than a competing one.
  • Probabilistic genotyping — software that computes an LR for complex/low-template DNA mixtures.
  • Prosecutor's fallacy — treating P(evidence | innocent) as P(innocent | evidence).
  • Defense fallacy — dismissing strong evidence by counting coincidental matchers while ignoring the other evidence against this defendant.
  • Bayesian reasoning — posterior odds = LR × prior odds; updating belief with the weight of evidence.
  • Allele frequency — the proportion of chromosomes carrying a given allele; the raw material of match probabilities.
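
To make concrete how allele frequencies become a match probability, here is a minimal sketch. It rests on assumptions not stated in this summary: Hardy-Weinberg genotype proportions (p² for a homozygote, 2pq for a heterozygote), independence across loci (the product rule), and no correction for population substructure. The three loci and their allele frequencies are invented for illustration; real casework uses many more loci and published frequency databases.

```python
from math import prod


def genotype_frequency(p, q=None):
    """Hardy-Weinberg genotype frequency: p^2 if homozygous, 2pq if heterozygous."""
    return p * p if q is None else 2 * p * q


# Hypothetical profile: one (p, q) pair of allele frequencies per locus.
# q is None when the person carries two copies of the same allele.
profile = [
    (0.10, 0.20),  # heterozygote: 2 * 0.10 * 0.20
    (0.25, None),  # homozygote:   0.25 ** 2
    (0.05, 0.10),  # heterozygote: 2 * 0.05 * 0.10
]

# Product rule: multiply genotype frequencies across loci (assumes independence).
rmp = prod(genotype_frequency(p, q) for p, q in profile)

# For a single-source match, LR is approximately 1/RMP.
lr = 1 / rmp

print(f"RMP is about 1 in {round(1 / rmp):,}")
print(f"LR is about {round(lr):,}")
```

Even three loci with unremarkable allele frequencies give an RMP of about 1 in 40,000 here, which is why full multi-locus profiles reach figures in the millions and beyond.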

Themes advanced in this chapter

  • The validity spectrum: DNA statistics anchor the strong end — because they are quantified — yet even there the validity lives in the interpretation, and complex-mixture software is not yet established.
  • The CSI effect cuts both ways: a jury primed for certainty hears a hedged probability as a verdict (over-trust), while a defense recasting of the LR as coincidence can make strong evidence look worthless (under-value). Stating the limits out loud is the safeguard.

⚖️ What you can honestly say on the stand

"A random, unrelated person would share this profile by chance with a probability of about 1 in [X]; equivalently, the evidence is about [LR] times more probable if the defendant is a contributor than if an unrelated unknown person is. This speaks to whether the defendant's DNA is present — not to how or when it got there, and not to the probability that he is guilty."

What you may never say: "the probability the defendant is innocent is 1 in [X]"; "the probability he is the source is 99.99%"; "it's a match, so he did it." Each is the prosecutor's fallacy or overstated individualization wearing a number.

The whole discipline in one rule: report the strength of the evidence; never report the probability of guilt.