Chapter 9 Key Takeaways: Forensic DNA Statistics
A one-page field card. If you remember nothing else from this chapter, remember the rule in the box at the bottom.
The core claims
- A DNA "match" is empty without a number. "These profiles match" means only that they correspond; the random match probability (RMP) is what tells you whether the correspondence is unremarkable or astronomically rare.
- The RMP is a statement about coincidence, not guilt. "1 in 43 million" means a random unrelated person would share the profile by chance with that probability. It is not "a 1-in-43-million chance the defendant is innocent."
- The likelihood ratio (LR) is the better frame. It compares the probability of the evidence under two stated hypotheses (the defendant is a contributor vs. an unrelated unknown person is). For a single-source match, LR ≈ 1/RMP. Its virtue is that it forces both hypotheses into the open and so resists the prosecutor's fallacy.
- Mixtures need software; software needs honesty. Probabilistic genotyping computes an LR for complex mixtures the human eye cannot untangle — a real advance — but the LR is a model's estimate, not a measurement, and two valid programs can disagree by orders of magnitude on a complex sample.
- The prosecutor's fallacy is the chief danger. Transposing $P(\text{match} \mid \text{innocent})$ into $P(\text{innocent} \mid \text{match})$ has helped convict the innocent. Its mirror, the defense fallacy, strips strong evidence of context to make it look worthless. Both are wrong.
- Bayes shows what the match probability leaves out: the prior. In odds form, posterior odds = LR × prior odds. The scientist supplies the LR (the weight of the evidence); the jury supplies the prior (drawing on all the other evidence). The same LR yields a coin flip or near-certainty depending on the prior, which is why the number never decides alone.
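The odds form of Bayes' rule in the last bullet can be checked with a few lines of arithmetic. The sketch below reuses the chapter's "1 in 43 million" RMP and the single-source approximation LR ≈ 1/RMP; the two prior odds are hypothetical values chosen only to illustrate the contrast, and the function names are illustrative, not taken from any forensic software.

```python
from fractions import Fraction


def posterior_odds(lr, prior_odds):
    """Odds form of Bayes' rule: posterior odds = LR x prior odds."""
    return lr * prior_odds


def odds_to_probability(odds):
    """Convert odds in favor of a hypothesis to a probability."""
    return odds / (1 + odds)


# Single-source case from the text: RMP of 1 in 43 million, so LR is about 1/RMP.
rmp = Fraction(1, 43_000_000)
lr = 1 / rmp

# Hypothetical prior odds (assumptions for illustration only).
priors = {
    "prior odds 1 to 43,000,000": Fraction(1, 43_000_000),
    "prior odds 1 to 1,000": Fraction(1, 1_000),
}

for label, prior in priors.items():
    post = posterior_odds(lr, prior)
    prob = float(odds_to_probability(post))
    print(f"{label}: posterior probability = {prob:.5f}")
```

With the first prior the posterior odds are exactly 1 to 1, a coin flip; with the second they are 43,000 to 1. The LR is identical in both runs, so the difference comes entirely from the prior, which is the jury's contribution and not the scientist's.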
Method-validity verdict (NAS 2009 / PCAST 2016)
| Method / claim | Where it sits on the validity spectrum | Key error mode |
|---|---|---|
| Single-source DNA match statistics (RMP, LR) | Strong — quantified, peer-reviewed, the field's gold standard | The error is in interpretation/language, not the math (the prosecutor's fallacy) |
| Probabilistic genotyping — simple mixtures (within validated limits) | Foundationally valid (PCAST 2016) | Operating outside validated limits; biased human inputs |
| Probabilistic genotyping — complex mixtures (many contributors, low template) | Not yet established (PCAST 2016) | Model-dependence; programs disagree; source-code opacity |
| Reporting "probability the defendant is the source/guilty" | Not science — a transposed conditional | The prosecutor's fallacy itself |
Key terms (one line each)
- Likelihood ratio (LR) — how many times more probable the evidence is under one hypothesis than a competing one.
- Probabilistic genotyping — software that computes an LR for complex/low-template DNA mixtures.
- Prosecutor's fallacy — treating $P(\text{evidence} \mid \text{innocent})$ as $P(\text{innocent} \mid \text{evidence})$.
- Defense fallacy — dismissing strong evidence by counting coincidental matchers while ignoring the other evidence against this defendant.
- Bayesian reasoning — posterior odds = LR × prior odds; updating belief with the weight of evidence.
- Allele frequency — the proportion of chromosomes carrying a given allele; the raw material of match probabilities.
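To make "raw material" concrete, here is a minimal sketch of how allele frequencies can combine into a match probability. It assumes Hardy-Weinberg proportions within each locus and independence between loci (the product rule), with no subpopulation correction; casework calculations typically add such corrections. The three loci and their allele frequencies are hypothetical, not drawn from any real database.

```python
from math import prod


def genotype_frequency(p, q=None):
    """Hardy-Weinberg genotype frequency: p^2 if homozygous, 2pq if heterozygous."""
    return p * p if q is None else 2 * p * q


# Hypothetical allele frequencies for a three-locus profile (illustration only).
profile = [
    (0.10, 0.20),  # heterozygote at locus 1
    (0.05, None),  # homozygote at locus 2
    (0.15, 0.25),  # heterozygote at locus 3
]

# Product rule: multiply per-locus genotype frequencies, assuming independent loci.
rmp = prod(genotype_frequency(p, q) for p, q in profile)
print(f"RMP = {rmp:.3e}, about 1 in {1 / rmp:,.0f}")
```

Even three loci with unremarkable allele frequencies push the match probability into the range of one in a hundred thousand; adding loci multiplies in further small factors, which is how a full profile can reach figures like the chapter's 1 in 43 million.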
Themes advanced in this chapter
- The validity spectrum: DNA statistics anchor the strong end — because they are quantified — yet even there the validity lives in the interpretation, and complex-mixture software is not yet established.
- The CSI effect cuts both ways: a jury primed for certainty hears a hedged probability as a verdict (over-trust), while a defense recasting of the LR as coincidence can make strong evidence look worthless (under-value). Stating the limits out loud is the safeguard.
⚖️ What you can honestly say on the stand
"A random, unrelated person would share this profile by chance with a probability of about 1 in [X]; equivalently, the evidence is about [LR] times more probable if the defendant is a contributor than if an unrelated unknown person is. This speaks to whether the defendant's DNA is present — not to how or when it got there, and not to the probability that he is guilty."
What you may never say: "the probability the defendant is innocent is 1 in [X]"; "the probability he is the source is 99.99%"; "it's a match, so he did it." Each is the prosecutor's fallacy or overstated individualization wearing a number.
The whole discipline in one rule: report the strength of the evidence; never report the probability of guilt.