Chapter 9: Forensic DNA Statistics: Match Probabilities, Likelihood Ratios, Mixtures, and the Prosecutor's Fallacy

DataField.Dev


> "The probability of finding this evidence given innocence is not the probability of innocence given this evidence. Confusing the two has convicted the innocent."

Prerequisites: Chapters 7 and 8

Learning Objectives

Explain what a DNA 'match' actually is and why the accompanying random match probability — not the word 'match' — carries the weight of the evidence.
Construct and interpret a likelihood ratio, stating in plain language the two hypotheses it compares and what a large or small value does and does not mean.
Describe how probabilistic genotyping software interprets complex mixtures, and articulate the 'black box' validation and source-code concerns that surround it.
Identify, name, and refute the prosecutor's fallacy and the defense fallacy whenever a forensic probability is stated.
Use Bayesian reasoning to combine a likelihood ratio with prior information, and explain why the forensic scientist supplies the LR while the jury supplies the prior.
Communicate a DNA result honestly to a non-specialist — as a strength-of-evidence statement, never as a statement of guilt or of the probability the defendant is the source.

In This Chapter

Overview
Learning Paths
9.1 From "match" to probability: what a number means
9.2 The likelihood ratio: the question the court should ask
9.3 Probabilistic genotyping software (and its black-box problem)
9.4 The prosecutor's fallacy (and the defense fallacy)
9.5 Bayesian reasoning for jurors: priors and the weight of evidence
9.6 Communicating a match honestly
🗂️ The Case File
Conclusion
Key Terms
Spaced Review



"The probability of finding this evidence given innocence is not the probability of innocence given this evidence. Confusing the two has convicted the innocent." — a paraphrase, offered here as a [constructed teaching line], of the warning repeated for decades by statisticians reviewing forensic testimony; the specific error it names is documented in the cases this chapter studies.

Overview

A DNA analyst takes the stand and says the words the jury has been waiting for: "The defendant's profile matches the crime-scene profile." Heads nod. The case feels over. But that sentence, by itself, is almost empty, and in the wrong hands it is worse than empty: it is misleading. The work of this chapter is to fill that sentence with the only thing that gives it meaning: a number, honestly derived and honestly explained.

In the last two chapters you learned how a DNA profile is generated (Chapter 7) and how the field pushes that profile to its limits with touch DNA, degraded samples, and mixtures (Chapter 8). This chapter is about what happens after the profile exists — the interpretation, where the real intellectual danger lives. DNA is the one forensic discipline that sits at the strong end of the validity spectrum precisely because it can put a defensible number on a match. But a number is a tool, and a tool can be misused. The same statistic that makes DNA the gold standard can, when phrased the way prosecutors and untrained witnesses are tempted to phrase it, commit a logical error so common it has a name: the prosecutor's fallacy.

So we will move carefully. We will start with what a match probability actually measures, and why "1 in a billion" is a statement about coincidence, not about guilt. We will build the likelihood ratio, the cleaner way to frame the question, and learn to say out loud the two hypotheses it weighs. We will look at the software now used to untangle the mixtures that defeat the human eye — and at the serious, unresolved fight over whether that software's reasoning can be hidden from the defense. We will name and refute both the prosecutor's fallacy and its mirror image, the defense fallacy. And we will end where every chapter of this book ends: with what you can honestly say on the stand, and what you must refuse to say no matter how badly the room wants to hear it.

In this chapter, you will learn to:

Explain why a DNA match means nothing until it is paired with a probability, and what that probability is actually the probability of.
Build a likelihood ratio, state the two competing hypotheses it compares, and interpret its size honestly.
Describe how probabilistic genotyping software reads a mixture, and articulate the validation and source-code controversies that surround it.
Spot, name, and refute the prosecutor's fallacy and the defense fallacy the moment a number is stated.
Apply Bayesian reasoning — combining the evidence's weight with a prior — and explain the division of labor between scientist and juror.
Communicate a match in language that conveys its true strength without ever crossing into a claim about guilt.

Learning Paths

🔎 Investigator/CSI: Your collection decisions upstream (Chapters 2–3, 7–8) determine whether the number the analyst can report is a clean single-source match or a murky mixture. Read §9.3 to understand what you are handing the lab. The big idea for you: the strength of a DNA statistic is set long before the statistician sees it.
🧪 Lab analyst: This is your chapter. §9.1–9.3 are the core of modern interpretation; §9.6 is how you survive cross-examination without overstating. Know cold the difference between an RMP and a likelihood ratio, and never let an attorney put the prosecutor's fallacy in your mouth.
⚖️ Law/courtroom: §9.4 and §9.6 are the heart of it: the fallacies are the most common, most reversible interpretation errors in forensic testimony, on both sides of the aisle. §9.5 is how a Bayesian framework does and does not belong in a courtroom.
👥 General reader/juror: If you ever sit on a jury, §9.4 and §9.5 are the most important pages in this book. They are the difference between weighing DNA evidence correctly and being talked into a conclusion the number never supported.

9.1 From "match" to probability: what a number means

Begin with the word that does all the damage: match. When an analyst says two profiles "match," they mean something narrow and true — that at every genetic location compared, the types seen in the questioned sample are also seen in the reference sample, with no unexplained differences. That is a real observation. But notice what it is not. It is not a statement of how surprising the observation is. A match between two people at a single, common genetic location is utterly unremarkable; half the population might share it. A match across the full set of locations used in a modern profile is, for unrelated people, astronomically unlikely by chance. The word "match" sounds the same in both cases. The number is what tells them apart.

That number, for a single-source profile, is the random match probability (RMP) you met in Chapter 7: the probability that a person selected at random from the relevant population would, by coincidence, share the crime-scene profile. The way it is built is worth seeing once, slowly, because the logic of every DNA statistic descends from it.

The foundation is the allele frequency: how common a particular genetic variant is in a population.

Allele frequency — the proportion of chromosomes in a reference population that carry a particular allele (a specific variant) at a given genetic location. It is estimated from population databases, and it is the raw material from which every match probability is built. An allele seen on, say, 8% of chromosomes in the relevant population has an allele frequency of 0.08.

At each genetic location (a locus, in the vocabulary of Chapter 7), a person carries two alleles, one inherited from each parent. The frequency of a person's genotype at that locus — their specific pair of alleles — is computed from the individual allele frequencies, using a population-genetics model that describes how alleles combine. (The model rests on assumptions about how a population mates and mixes; population geneticists have spent decades stress-testing and correcting those assumptions, which is part of why DNA's foundation is so much stronger than that of the pattern disciplines in Part III.) The single most important reason the final RMP gets so small is that the loci are, to a good approximation, independent: your type at one location tells you almost nothing about your type at another. When events are independent, their probabilities multiply.
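The chapter does not name the population-genetics model, so the sketch below assumes the simplest textbook one, Hardy-Weinberg proportions: a homozygous genotype has frequency $p^2$ and a heterozygous genotype has frequency $2pq$. The function name and the allele frequencies are hypothetical, chosen only for illustration; real casework typically adds corrections (for example, for population substructure) that are left out here.

```python
from fractions import Fraction

def genotype_frequency(p, q=None):
    """Genotype frequency at one locus under assumed Hardy-Weinberg proportions.

    p, q are allele frequencies. Pass only p for a homozygote (two copies of
    the same allele); pass both p and q for a heterozygote.
    """
    if q is None:
        return p * p      # homozygote: p^2
    return 2 * p * q      # heterozygote: 2pq

# Hypothetical allele frequencies (8% and 5%), not from any real database.
het = genotype_frequency(Fraction(8, 100), Fraction(5, 100))
hom = genotype_frequency(Fraction(8, 100))

print(het)  # 1/125
print(hom)  # 4/625
```

Exact fractions are used so the "1 in N" reading is visible directly in the output.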

🔬 At the Bench Here is the multiplication, with deliberately round, illustrative numbers — not from any real case. Suppose the crime-scene profile has, at six independent loci, genotype frequencies of roughly $1/20$, $1/15$, $1/50$, $1/8$, $1/30$, and $1/12$. Multiply them: the chance a random person matches at all six is about $1/(20 \times 15 \times 50 \times 8 \times 30 \times 12)$, or roughly 1 in 43 million. Add more loci — modern profiles use twenty or more — and the product can fall to 1 in many billions or trillions. This multiplication is exactly why DNA can reach the "individual" end of the class-versus-individual spectrum (Chapter 1) that no pattern method can honestly reach: not because anyone examined every person on Earth, but because the rarity is calculated from a validated genetic model. The strength is real. So is the obligation to state it correctly — which is the rest of this chapter.
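The six-locus multiplication above can be checked in a few lines. This is a minimal sketch using the same illustrative frequencies; the variable names are ours, and the calculation assumes the loci are independent, as the text states.

```python
from fractions import Fraction
from math import prod

# Illustrative per-locus genotype frequencies from the example above.
locus_frequencies = [Fraction(1, d) for d in (20, 15, 50, 8, 30, 12)]

# Independence across loci is what licenses multiplying the frequencies.
rmp = prod(locus_frequencies)

print(rmp)                                   # 1/43200000
print(f"about 1 in {rmp.denominator:,}")     # about 1 in 43,200,000
```

Appending further loci to `locus_frequencies` shows how quickly the product shrinks as a profile grows.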

Now read the RMP for exactly what it says, and refuse to let it say more. "1 in 43 million" means: if this DNA came from someone other than the defendant, the chance that a random, unrelated person would happen to share this profile is 1 in 43 million. The same reading applies to "1 in 4 billion" or any smaller figure. Express it both ways every time, fraction and population, because the words discipline the meaning: "1 in 4 billion means that, in a world of roughly 8 billion people, about two unrelated people would be expected to share this profile by chance." That is a statement about coincidence. It is emphatically not the statement "there is a 1-in-43-million chance the defendant is innocent." Those two sentences are different in kind, and treating the first as if it were the second is the prosecutor's fallacy we will dismantle in §9.4. Hold that distinction; the whole chapter turns on it.

⚖️ In the Courtroom An RMP also quietly assumes a reference population. Allele frequencies differ across population groups, so an analyst computes the RMP against one or more relevant databases and, in honest practice, reports a range or the most conservative value. A competent cross-examiner will ask: which population? what if the true donor is a relative of the defendant, who shares far more DNA than a random stranger? The RMP for an unrelated person can be 1 in billions while the chance a full sibling coincidentally matches is far, far higher — sometimes only hundreds or thousands to one. The number is only as meaningful as the alternative it is measured against. We make that comparison explicit in the next section.
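To see why a sibling changes the picture so sharply, the sketch below compares the single-locus match chance for an unrelated person with that for a full sibling. It assumes Hardy-Weinberg proportions and the standard textbook full-sibling formulas, with no substructure correction; the allele frequencies and function names are hypothetical.

```python
from fractions import Fraction

def unrelated_match(p, q=None):
    """Genotype frequency for an unrelated person: p^2 or 2pq."""
    return p * p if q is None else 2 * p * q

def full_sibling_match(p, q=None):
    """Chance a full sibling shares the same genotype at one locus."""
    if q is None:
        return (1 + 2 * p + p * p) / 4       # homozygous genotype
    return (1 + p + q + 2 * p * q) / 4       # heterozygous genotype

# Hypothetical allele frequencies for a heterozygous genotype.
p, q = Fraction(1, 10), Fraction(1, 5)
stranger = unrelated_match(p, q)
sibling = full_sibling_match(p, q)

print(stranger)  # 1/25
print(sibling)   # 67/200

# Across several independent loci the gap compounds.
n_loci = 6
print(float(stranger ** n_loci))
print(float(sibling ** n_loci))
```

The sibling figure can never fall below 1/4 at any locus, which is why the product across loci stays so much larger than the figure for an unrelated person.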

The limits and the validity verdict for §9.1

The RMP is the strongest single number in forensic science, and it still has hard edges. It describes coincidence for an unrelated person, not a relative. It assumes the profile is clean and single-source — the moment a sample is a mixture or a partial, the simple multiplication no longer applies and we need the machinery of §9.2 and §9.3. And it is a statement about the DNA, never about the person's conduct: DNA can arrive at a scene by transfer (Chapter 8), so even a perfect match does not say how or when the DNA got there. On the validity spectrum (Chapters 1, 6), single-source DNA statistics sit at the strong end — quantified, peer-reviewed, grounded in population genetics, endorsed by the NAS 2009 and PCAST 2016 reports as the field's one rigorously validated comparison method. The error mode is not in the mathematics; it is in the interpretation and the language, which is exactly why the rest of this chapter exists.

9.2 The likelihood ratio: the question the court should ask

The RMP answers a question, but not quite the right one. It tells us how surprising the match would be if the defendant were not the source. It says nothing, directly, about the competing world in which the defendant is the source. The tool that holds both worlds in view at once — and the framework that now dominates forensic interpretation worldwide — is the likelihood ratio.

Likelihood ratio (LR) — a number expressing how much more (or less) probable the observed evidence is under one hypothesis than under a competing hypothesis. In forensic DNA it compares the probability of the DNA results assuming the prosecution's proposition (typically: the defendant is a contributor) to the probability of the same results assuming the defense's proposition (typically: an unknown, unrelated person is the contributor). An LR of 1 million means the evidence is one million times more probable if the prosecution's proposition is true than if the defense's is.

Write it as a fraction of two questions:

$$\text{LR} = \frac{\text{probability of the evidence IF the prosecution's proposition is true}}{\text{probability of the evidence IF the defense's proposition is true}}$$

The discipline of the LR — and the reason careful forensic scientists prefer it — is that it forces you to state both hypotheses out loud, in plain words, before you compute anything. You cannot report an LR without finishing two sentences: "Under $H_p$ (the prosecution hypothesis), the DNA came from ____." "Under $H_d$ (the defense hypothesis), the DNA came from ____." Get those propositions wrong — make them vague, or stack the defense hypothesis as a straw man — and the number is worthless no matter how cleanly it was calculated. The math is the easy part; choosing the right pair of hypotheses is the craft.

🔬 At the Bench For a clean single-source match, the LR and the RMP are simply two faces of the same coin. If the defendant is the source, the probability of seeing his profile in the evidence is essentially 1 (we'd expect exactly that). If an unrelated stranger is the source, the probability of seeing the defendant's exact profile is the RMP — say, 1 in 43 million. So: $$\text{LR} = \frac{1}{\text{RMP}} = \frac{1}{1/43{,}000{,}000} = 43{,}000{,}000.$$ The evidence is 43 million times more probable if the defendant is the source than if a random unrelated person is. Same strength, framed as a comparison rather than a coincidence. (All numbers here are illustrative.) Where the LR earns its keep is the case the RMP cannot handle at all — the mixture in §9.3 — because there the LR can still ask a clean two-hypothesis question even when no single "match probability" can be written down.
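The single-source relationship can be written as a tiny function. This is a sketch under the idealisation used above, that the evidence is certain if the defendant is the source; the function and argument names are ours.

```python
from fractions import Fraction
from math import log10

def likelihood_ratio(p_evidence_given_hp, p_evidence_given_hd):
    """Probability of the evidence under Hp divided by that under Hd."""
    if p_evidence_given_hd == 0:
        raise ValueError("P(evidence | Hd) must be greater than zero")
    return p_evidence_given_hp / p_evidence_given_hd

# Illustrative numbers from the text: P(E | Hp) = 1 and P(E | Hd) = RMP.
rmp = Fraction(1, 43_000_000)
lr = likelihood_ratio(Fraction(1), rmp)

print(lr)                    # 43000000
print(f"{log10(lr):.2f}")    # 7.63, the LR on a log10 scale
```

Keeping the numerator as an explicit argument makes it harder to forget that an LR always has two hypotheses behind it.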

Why does the framing matter if the two numbers are equivalent for a single source? Because of how each is heard. "The random match probability is 1 in 43 million" invites a listener to slide, almost irresistibly, into "so there's a 1-in-43-million chance he's innocent" — the fallacy. "The evidence is 43 million times more probable if he is a contributor than if an unrelated person is" resists that slide: it is overtly a statement about evidence under two hypotheses, not a statement about the defendant. The LR is not just better mathematics; it is better-defended language. Standards bodies including SWGDAM and the guidance behind PCAST 2016 have pushed the field toward likelihood-ratio reporting in large part for this reason.

⚖️ In the Courtroom The verbal scale matters too. Many laboratories accompany a numeric LR with a verbal equivalent — for example, an LR in the billions reported as offering "very strong support" for the prosecution proposition over the defense proposition. These scales are conventions, not laws of nature, and they must be presented as support for one hypothesis over another, never as the probability of a hypothesis. The honest expert says: "This evidence provides very strong support for the proposition that the defendant contributed, as compared to the proposition that an unrelated unknown person did." The dishonest (or untrained) expert drops the comparison and says "very strong support that he's guilty" — and has just handed the jury the prosecutor's fallacy in friendly packaging.

🔍 Check Your Understanding
1. An analyst reports an LR of 200,000 "in favor of the prosecution proposition." State, in one sentence each, the two propositions the number is comparing. (If you can't name both, the number has no meaning.)
2. Why is an LR of 1 a statement that the evidence is neutral, that is, that it favors neither side?
3. A clean single-source match has an RMP of 1 in 10 million. What is the LR, and why?

The limits and the validity verdict for §9.2

The LR is a framework, not a fact, and three of its limits matter on the stand. First, it is only as good as its two hypotheses: an inappropriate or unrealistic defense proposition (for instance, ignoring a known relative) can inflate the number dramatically. Second, the LR conveys the weight of the evidence and nothing else — it deliberately says nothing about the prior probability of guilt, which is the jury's province (§9.5). Third, for anything beyond a single-source match the probabilities in the numerator and denominator are themselves estimates, computed by models whose assumptions can be challenged. On the validity spectrum, the LR framework is sound and widely endorsed; the controversy is never the fraction itself but the inputs to it — above all the inputs that the software in the next section computes.

9.3 Probabilistic genotyping software (and its black-box problem)

In Chapter 8 you met the hard problem of the DNA mixture — biological material from two or more contributors, where the alleles overlap, the amounts are unequal, and the data are noisy with the artifacts of low-template amplification (stutter, drop-out, drop-in). For decades, analysts interpreted such mixtures by eye and by hand, deciding subjectively which peaks to "call," which contributors might be present, and what statistic to report. That subjectivity was, predictably, a source of error and of bias — give two analysts the same messy mixture and the suspect's profile, and they could reach different conclusions, sometimes nudged by knowing what answer the case wanted. The fix the field reached for is software.

Probabilistic genotyping — a class of software that uses explicit statistical models and computational methods to interpret DNA profiles, especially complex or low-template mixtures, by computing a likelihood ratio rather than relying on an analyst's subjective peak-by-peak judgment. It models the biological artifacts (stutter, drop-out, drop-in, peak-height variation) probabilistically and weighs the evidence under competing propositions about who contributed.

The promise is real and worth stating plainly. Probabilistic genotyping replaces an irreproducible human judgment with a documented, repeatable computation. It can extract usable information from mixtures that a human analyst would have to dismiss as inconclusive. It produces an LR with its assumptions written into code rather than carried in an examiner's head. Two well-known systems — one using a "fully continuous" model that accounts for peak heights, another in that same family — are now used by crime laboratories across the United States and abroad, and validation studies published by their developers and by some independent laboratories report that, used within their validated limits, they perform consistently.

And then the difficulties begin.

⚠️ Junk-Science Alert Probabilistic genotyping is not junk science — but uncritical faith in it would be. Three cautions belong on every report. First, the black box. These programs are commercial products, and developers have, in multiple cases, resisted defense requests to examine the source code, asserting trade-secret protection. A defendant convicted partly on a number produced by software he is not permitted to inspect raises a genuine due-process question: how do you cross-examine a calculation you cannot see? Courts have split, and the issue remains live. Second, different programs, different numbers. Independent comparisons have found that two validated programs, run on the same complex mixture, can return LRs that differ by orders of magnitude — and occasionally point in opposite directions. That does not mean both are "wrong," but it does mean the number is model-dependent, not a fact of nature. Third, the edge of the envelope. These tools are validated for certain numbers of contributors and minimum DNA quantities; pushed past those limits — too many contributors, too little template — their output becomes unreliable, and the obligation is to report "beyond validated limits," not to report a number anyway.

PCAST's 2016 report engaged exactly this technology and reached a careful, two-part verdict that you should be able to state. For simple mixtures — few contributors, reasonable amounts of DNA — it found probabilistic genotyping to have foundational validity: the studies show it does what it claims, within those bounds. For complex mixtures — many contributors, low template, large overlap — it found the evidence of validity not yet established, and urged caution. The takeaway is the book's recurring lesson in miniature: a method can be valid in one regime and unvalidated in another, and the honest practitioner states which regime the actual sample falls in.

🧠 Cognitive-Bias Watch Software does not eliminate cognitive bias (Chapter 31); it relocates it. The analyst still chooses the number of contributors to assume, still defines the propositions, still decides whether the sample is within validated limits — and those choices can be swayed by knowing the suspect's profile or what the detective expects. Best practice is to make as many of these decisions as possible before the reference profile is compared, and to document them. A probabilistic genotyping result is only as unbiased as the human inputs feeding it; "the computer said so" is not a defense against a biased setup. We will see in Chapter 31 how sequential unmasking is meant to protect exactly these upstream choices.

FIGURE 9.1 — "What the software is weighing"        [constructed teaching example]
   A two-person mixture at one genetic location, read off an electropherogram (Ch. 7):

      peak height
        |            ###
        |            ###                  ##
        |   ###      ###       ###        ##
        |   ###      ###       ###        ##         small peak (artifact?) -> .
        +---+--------+---------+----------+----------+----------  allele
           12       14        15         17        (18)
        major contributor: taller peaks (14, 17) ;  minor: shorter peaks (12, 15)
        (18) = a small peak that may be a true minor allele OR stutter/drop-in
   The model asks: across ALL loci, are these peak patterns MORE probable if the
   suspect is the minor contributor, or if an unknown person is? That ratio is the LR.

The diagram is schematic, and the ambiguity it shows is the whole point. A human analyst staring at that uncertain peak labeled (18) must decide, yes or no, whether to count it. Probabilistic genotyping instead carries the uncertainty through the calculation — weighting the possibility that (18) is real and the possibility that it is an artifact, across every locus at once — and folds it into a single likelihood ratio. That is a genuine methodological advance over the old "call it or don't" approach. It is also why the LR it produces is a model's considered estimate, not a measurement, and must be reported and cross-examined as such.
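One generic way to write "carrying the uncertainty through" is as a weighted sum over every candidate explanation of the peaks. The notation here is an assumption introduced for illustration, not the formula of any particular program: $E$ is the observed peak data across all loci, and $S_j$ is the $j$-th candidate combination of contributor genotypes.

$$\text{LR} = \frac{\sum_j P(E \mid S_j)\, P(S_j \mid H_p)}{\sum_j P(E \mid S_j)\, P(S_j \mid H_d)}$$

Each $P(E \mid S_j)$ scores how well one explanation fits the peaks. An explanation in which (18) is a true minor allele and one in which it is an artifact both appear in the sums, each weighted by how probable it is, rather than one being chosen and the other discarded.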

The limits and the validity verdict for §9.3

Probabilistic genotyping's place on the validity spectrum is split by design: foundationally valid for simple mixtures within its validated envelope, not-yet-established for complex ones, per PCAST 2016. Its error modes are (a) operating outside validated limits, (b) biased human inputs (assumed contributor number, propositions), and (c) the irreducible model-dependence that makes two valid programs disagree. The honest report states the program used, its validated range, the assumed number of contributors, and — critically — that the LR is an estimate from a model, not a fact. And the open courtroom problem, source-code access, is not a scientific question but a fairness one: evidence a defendant cannot examine sits uneasily in an adversarial system built on the right to confront it.

9.4 The prosecutor's fallacy (and the defense fallacy)

We arrive at the most important — and most dangerous — idea in the chapter. It is a logical error, not a mathematical one, which is exactly why it slips past so many intelligent people, including judges, attorneys, and expert witnesses who can do the arithmetic perfectly and still draw the wrong conclusion from it.

Prosecutor's fallacy — the error of treating the probability of the evidence given innocence (for example, the random match probability) as if it were the probability of innocence given the evidence. It transposes a conditional probability: "the chance a random innocent person would match is 1 in a million" is wrongly restated as "the chance the defendant is innocent is 1 in a million." The two quantities are different, and the difference can be enormous.

The fallacy is a transposed conditional. In symbols — and this is the one place the notation earns its keep — the random match probability is roughly $P(\text{match} \mid \text{innocent})$: the probability of a match given the person is not the source. What a juror cares about is $P(\text{innocent} \mid \text{match})$: the probability the person is not the source given the match. The fallacy is to assume these are the same number. They are not, and confusing $P(A \mid B)$ with $P(B \mid A)$ is one of the oldest, most seductive mistakes in all of reasoning.

A homely example breaks the spell. The probability that an animal has four legs given that it is a cow is essentially 1 — cows have four legs. The probability that an animal is a cow given that it has four legs is nowhere near 1 — most four-legged animals are not cows. Same two facts, two wildly different conditional probabilities, and anyone who swapped them would be obviously wrong. The prosecutor's fallacy is that exact swap, wearing the dignity of a DNA statistic so that it no longer looks absurd.
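Putting made-up counts on the cow example shows how far apart the two conditionals can be. The farm and its numbers are hypothetical, and every cow is assumed to have four legs.

```python
from fractions import Fraction

# Hypothetical counts for an imaginary farm.
cows = 50
other_four_legged = 950        # sheep, dogs, horses, and so on

four_legged = cows + other_four_legged

# P(four legs | cow): every cow in this toy farm has four legs.
p_four_legs_given_cow = Fraction(cows, cows)

# P(cow | four legs): cows as a share of all four-legged animals.
p_cow_given_four_legs = Fraction(cows, four_legged)

print(p_four_legs_given_cow)   # 1
print(p_cow_given_four_legs)   # 1/20
```

The first probability never changes with the size of the farm; the second depends entirely on how many other four-legged animals there are, which is the piece the transposed conditional ignores.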

Why does the swap matter so much in a courtroom? Because the probability the juror actually wants — the probability the defendant is innocent given the match — depends on something the match probability does not contain: how many other people could have left that DNA, and how likely the defendant was to be the source before the DNA was considered at all. We make that dependence explicit in §9.5. For now, hold the warning: a small random match probability does not translate, by itself, into a small probability of innocence.

⚠️ Junk-Science Alert The fallacy's most famous early appearance was not even about DNA. In a 1968 California case, People v. Collins, a couple was convicted partly on a prosecutor's claim that the probability of a random couple matching the eyewitness description (a blonde woman with a ponytail, a bearded Black man, a yellow car) was 1 in 12 million — a figure assembled by multiplying made-up frequencies for independent-seeming traits. The state's argument invited the jury to read "1 in 12 million that a random couple matches" as "1 in 12 million that this couple is innocent." The California Supreme Court reversed, in a decision now taught in statistics courses, identifying both the fabricated, unsupported probabilities and the fallacious leap from coincidence to guilt. Collins is the prosecutor's fallacy's origin story, and its lesson predates DNA: a frightening-sounding match probability tells you about coincidence, and the jump from coincidence to guilt is a separate step that the number alone cannot make.

Now the mirror image, because honesty requires naming both, and the defense has its own seductive misuse.

Defense fallacy — the error of dismissing strong evidence by noting that, in a large population, many people would coincidentally match, and concluding that the match is therefore nearly worthless. It treats the defendant as if randomly drawn from the whole matching set while ignoring all the other evidence that placed this particular defendant before the court.

Here is the defense version in action. Suppose the RMP is 1 in 10 million and the relevant population is a country of, say, 60 million people. The defense argues: "Six people in this country would match by chance. The defendant is just one of six; the odds he's the source are only 1 in 6 — reasonable doubt." That argument is also wrong, and for a complementary reason. It pretends the defendant was plucked at random from the population of 60 million, when in fact he was brought to court by other evidence — opportunity, motive, a connection to the victim — that the other five hypothetical matchers do not share. The DNA does not stand alone; it combines with everything else (again, §9.5). The defense fallacy strips the DNA of its context to make it look weak, just as the prosecutor's fallacy strips it of its limits to make it look like proof.
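The defense arithmetic, and what it leaves out, can be sketched with the odds relationship developed in §9.5 (posterior odds = LR × prior odds). The RMP and population are the illustrative figures from the paragraph above; the "informed" prior of 1 to 999 is a hypothetical stand-in for other evidence and is not a value from the text.

```python
from fractions import Fraction

rmp = Fraction(1, 10_000_000)      # illustrative RMP from the text
population = 60_000_000            # illustrative population from the text

# Expected number of people who would match by coincidence.
expected_matchers = population * rmp
print(expected_matchers)           # 6

lr = 1 / rmp                       # single-source LR, as in section 9.2

def posterior_odds(lr, prior_odds):
    """Posterior odds = likelihood ratio x prior odds."""
    return lr * prior_odds

# Defense framing: the defendant is just one random member of the population.
uniform = posterior_odds(lr, Fraction(1, population - 1))
print(f"{float(uniform):.3f}")     # 0.167, roughly 1 to 6

# Hypothetical: other evidence narrows the field to about 1,000 people.
informed = posterior_odds(lr, Fraction(1, 999))
print(f"{float(informed):,.0f}")   # 10,010 (to 1)
```

The defense figure appears only when the prior treats the defendant as a random draw from all 60 million people; any other evidence changes the prior and, with it, the conclusion.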

⚖️ In the Courtroom Both fallacies have been committed by real experts in real trials, and appellate courts have reversed convictions over the prosecutor's version. In the United Kingdom, two cases in the 1990s — one usually cited as R v. Deen and another as R v. Doheny and Adams — produced appellate guidance precisely because expert witnesses had, in effect, testified to the transposed conditional, stating or strongly implying that the match probability was the probability of innocence. The resulting guidance instructed experts to state the rarity of the profile and to stop there — to leave the leap to a conclusion about guilt to the jury, where it belongs. The lesson for any expert: report the strength of the evidence; never, ever state the probability that the defendant is guilty or is the source. The instant you say "the probability he is the source is…," you have left science and committed the fallacy.

🔍 Check Your Understanding
1. An analyst testifies: "The random match probability is 1 in 5 million, so there is only a 1-in-5-million chance the defendant is innocent." Name the error and state, in one sentence, why the two probabilities differ.
2. A defense attorney argues: "Twelve people in this state would match by chance, so my client is just 1 of 12; that's reasonable doubt." Name the error and state what the argument ignores.
3. Which conditional probability does the RMP actually report: $P(\text{match} \mid \text{not the source})$ or $P(\text{not the source} \mid \text{match})$?

The limits and the validity verdict for §9.4

There is no "method" here to place on the validity spectrum — the fallacies are interpretation errors, and that is the point of including them in a chapter about the field's most valid method. The most rigorous statistic in forensic science becomes an engine of wrongful conviction the moment its conditional is transposed. The error mode is purely linguistic and logical, which makes it both easy to commit and easy to prevent: the safeguard is a discipline of language (state coincidence, never guilt) and the Bayesian framework of the next section, which shows precisely what the match probability leaves out.

9.5 Bayesian reasoning for jurors: priors and the weight of evidence

If the prosecutor's fallacy is the disease, Bayesian reasoning is the cure — not because jurors must do arithmetic, but because the Bayesian structure shows exactly what the match probability leaves out and where it belongs.

Bayesian reasoning — a framework for updating the probability of a hypothesis as new evidence arrives, by combining a prior probability (belief before the evidence) with the weight of the new evidence (the likelihood ratio) to produce a posterior probability (belief after the evidence). In forensic terms: prior odds, multiplied by the likelihood ratio, give the posterior odds.

The relationship, stated as odds, is clean enough to hold in your head:

$$\text{posterior odds} = \text{LR} \times \text{prior odds}$$

In words: your belief after the DNA equals your belief before the DNA, multiplied by how strongly the DNA favors one hypothesis over the other. This single line explains everything that is wrong with the prosecutor's fallacy. The match probability (or the LR built from it) is only the multiplier — the weight of the evidence. To get to a probability of guilt, you must multiply it by a prior — how likely the defendant was to be the source before the DNA. The prosecutor's fallacy is the mistake of forgetting the prior entirely, of treating the multiplier as if it were already the answer.
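One way to see where the odds form comes from, sketched here with notation assumed for illustration ($E$ for the DNA evidence, $H_p$ and $H_d$ for the prosecution and defense propositions): write Bayes' theorem once for each proposition and divide. The term $P(E)$ cancels, leaving

$$\frac{P(H_p \mid E)}{P(H_d \mid E)} = \frac{P(E \mid H_p)}{P(E \mid H_d)} \times \frac{P(H_p)}{P(H_d)}$$

The left side is the posterior odds, the first factor on the right is the LR, and the second is the prior odds. If the two propositions are treated as the only possibilities, odds $O$ in favor of $H_p$ convert to a probability as

$$P(H_p \mid E) = \frac{O}{1 + O}$$

Seen this way, the prosecutor's fallacy amounts to reading $P(E \mid H_d)$ as though it were $P(H_d \mid E)$.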

A worked illustration, with illustrative numbers, makes the division of labor vivid.

🔬 At the Bench Suppose the DNA evidence carries an LR of 1,000,000 (the evidence is a million times more probable if the defendant is the source than if an unrelated person is). Now consider two very different cases:

Case A: a cold hit with nothing else. A database search of millions of profiles turns up the defendant, with no other connection to the crime. The prior odds that this particular person (rather than anyone else who could have been searched) is the source might be low: illustratively, 1 to 1,000,000. Posterior odds $= 1{,}000{,}000 \times \frac{1}{1{,}000{,}000} = 1$, that is, 1 to 1. Those are even odds, a coin flip, despite an LR in the millions. The DNA alone, against a weak prior, is far from conclusive.

Case B: DNA plus independent evidence. The same LR of 1,000,000, but the defendant was the victim's business partner, stood to gain financially, and was placed near the scene by other evidence. The prior odds might now be, say, 1 to 100. Posterior odds $= 1{,}000{,}000 \times \frac{1}{100} = 10{,}000$ to 1. Now the case is strong.

Same DNA. Same LR. Wildly different conclusions — because the prior differs. This is why the forensic scientist must supply the LR and must not supply the answer: the scientist has no business setting the prior, which depends on all the non-DNA evidence the jury alone weighs. (These numbers are illustrative; do not quote them as typical.)
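The arithmetic of the two cases can be reproduced in a few lines. This is a minimal sketch, taking the illustrative priors above as odds of 1 to 1,000,000 and 1 to 100; the function names and the conversion to a probability (which assumes the two propositions are the only possibilities) are additions for illustration.

```python
from fractions import Fraction


def posterior_odds(lr, prior_odds):
    """Odds form of Bayes' rule: posterior odds = LR x prior odds."""
    return lr * prior_odds


def odds_to_probability(odds):
    """Convert odds in favor (odds to 1) into a probability."""
    return odds / (1 + odds)


LR = 1_000_000  # illustrative likelihood ratio from the worked example

# Illustrative prior odds for the two cases (not typical values).
cases = {
    "Case A (cold hit)": Fraction(1, 1_000_000),
    "Case B (independent evidence)": Fraction(1, 100),
}

results = {}
for name, prior in cases.items():
    post = posterior_odds(LR, prior)
    prob = odds_to_probability(post)
    results[name] = (post, prob)
    print(f"{name}: posterior odds {post} to 1, probability {float(prob):.4f}")
```

The LR is a constant in the script and only the prior changes, which mirrors the division of labor: the scientist supplies `LR`, and everything in `cases` belongs to the factfinder.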

That example also quietly exposes a real-world danger you should carry forward: the cold-hit case. When a suspect is found only by searching a large database, the relevant prior can be very different from the case where independent evidence pointed to the person first, and naively reporting the same RMP can overstate the evidence. The statistical literature has argued for years about exactly how to handle database-search statistics; the safe practice is to disclose that the suspect was identified by a search and to let the factfinder weigh that, rather than to present a cold-hit match as if it were a confirmatory test of an independently chosen suspect.

⚖️ In the Courtroom Should jurors be taught Bayes' theorem and handed the LR to multiply by their own prior? Courts have been skeptical, and not without reason. In a 1990s English case usually cited as R v. Adams, the defense actually walked the jury through a Bayesian calculation; the Court of Appeal was unenthusiastic, worrying that turning jurors into amateur statisticians might supplant, rather than aid, ordinary reasoning. The mainstream position that emerged is a careful one: the expert reports the likelihood ratio (the weight of the evidence) and explains it in plain language, and the jury combines it with everything else using their judgment — not necessarily by formal arithmetic, but understanding that the DNA is one input to be weighed alongside the rest, never the whole verdict. The Bayesian framework is invaluable as a way of understanding what DNA evidence does; it is contested as a procedure to impose on a jury.

The framework, even unquantified, gives a juror a permanent defense against both fallacies. Ask: What is the weight of this DNA evidence (the LR)? And what did I believe before I heard it (the prior)? If the answer to the second is "almost nothing pointed to this person except the database hit," then even a colossal LR may leave real doubt. If independent evidence already pointed hard at the defendant, the DNA can push a strong case to a very strong one. The number never decides alone. That is not a weakness of DNA evidence; it is the correct way to use the strongest tool forensic science owns.

The limits and the validity verdict for §9.5

Bayesian reasoning is a framework, not a forensic test, so it has no spot on the validity spectrum — but it is the conceptual backbone that keeps the valid statistic from being misused. Its practical limits are two. First, the prior is genuinely hard to pin down and is not the scientist's to set, which is why courts resist formal jury arithmetic. Second, the framework assumes the LR was computed correctly and the propositions chosen well — garbage in, garbage out. Used as a way of thinking rather than a courtroom calculator, it is the single best inoculation against the prosecutor's and defense fallacies alike, which is why it closes the conceptual core of this chapter.

9.6 Communicating a match honestly

Everything in this chapter converges on a practical skill: saying a true thing about a DNA result, out loud, to people who are not statisticians, under pressure from at least one attorney who would prefer you said something stronger or weaker than the truth. Communicating uncertainty honestly is not a courtroom afterthought — it is, as Chapter 1 argued and Chapter 30 will develop in full, a core forensic competence. A perfect analysis described in misleading words becomes misleading evidence.

Start with the sentences you may say and the sentences you may not.

🔬 At the Bench: the honest script

You may say:
- "At twenty genetic locations, the defendant's profile and the crime-scene profile correspond, with no unexplained differences."
- "The probability that a random, unrelated person from the relevant population would share this profile by chance is approximately 1 in [X], a number I can express as one person in [Y] times the population of the United States."
- "The evidence is approximately [LR] times more probable if the defendant is a contributor than if an unrelated, unknown person is."
- "This is a mixture of at least two contributors; the analysis assumed [N] contributors and the propositions I compared were the following…"
- "This result speaks to whether the defendant's DNA is present. It does not speak to how or when that DNA was deposited."

You may NOT say:
- "The probability the defendant is innocent is 1 in [X]." (Prosecutor's fallacy: a transposed conditional.)
- "The probability the defendant is the source is 99.99%." (That is a posterior; it requires a prior the scientist cannot supply.)
- "This is a match, so he did it." (Conflates presence of DNA with guilt and with conduct.)
- "To a reasonable degree of scientific certainty, this is his DNA and no one else's." (Overstated individualization; reserve quantified claims for the quantified statistic.)

Notice that every permitted sentence is about the evidence and every forbidden one is about the defendant's guilt or the probability of a hypothesis. That is the entire discipline, reducible to a single instruction: report the strength of the evidence; never report the probability of guilt. If you keep the verb on the evidence ("the evidence supports," "the profile corresponds," "a random person would match with probability…") you stay inside the science. The moment the verb attaches to the person ("he is the source with probability…," "the chance he's innocent is…"), you have stepped over the line, even if every number is correct.

🧠 Cognitive-Bias Watch The pressure to overstate is not only external. An analyst who has worked a case for months, who knows the detective is sure, who has seen the other evidence, feels the pull toward the conclusion everyone wants — and that pull leaks into word choice ("strong match," "definitely him") long before it shows up in a number. The safeguard is partly structural (limit what the analyst knows about the case — Chapter 31) and partly verbal discipline: rehearse the honest script until the careful phrasing is automatic and the overstated phrasing feels wrong in the mouth. The CSI effect (Chapter 1) compounds the danger from the other side: a jury primed by television to expect certainty may hear certainty in a carefully hedged statement, so the honest expert states the limits explicitly rather than trusting the jury to infer them.

⚖️ In the Courtroom Cross-examination on a DNA statistic almost always probes the same seams, and you should welcome each question because each has an honest answer.
- "Which population did you use, and what if the donor is a relative?" Answer with the reference population and the (higher) relative-match probability.
- "Could this DNA have gotten there by transfer?" Yes: the statistic addresses presence, not mechanism (Chapter 8).
- "Was the suspect found by a database search?" Disclose it; it bears on the prior (§9.5).
- "Did software produce this number, and can the defense examine it?" Name the program, its validated limits, and the source-code question honestly (§9.3).

A DNA result honestly reported survives all of these, because every answer is already built into the careful claim. A DNA result overstated collapses under the first of them and takes the lab's credibility with it.

The thread that ties this section to the book's argument: the strongest evidence forensic science owns is undone not by bad chemistry but by bad language. The whole edifice of population genetics, capillary electrophoresis, and probabilistic modeling delivers a defensible number — and then a single transposed conditional in a closing argument can turn that number into a lie the jury believes. Guarding the language is therefore not lesser work than guarding the lab. It is the same work, finished.

The limits and the validity verdict for §9.6

Communication is a skill, not a method, but it is the final gate every DNA result must pass, and it fails more often than the chemistry does. The error modes are the two fallacies (§9.4), overstated individualization, and conflating presence with conduct. The verdict that belongs on a key-takeaways card: single-source DNA statistics are foundationally valid and quantified; their honest communication is a strength-of-evidence statement about coincidence, never a statement about guilt; and the most common way valid DNA evidence misleads a jury is through the prosecutor's fallacy in the mouth of an expert or an attorney.

🗂️ The Case File

Carrow County — the mixture, interpreted. Recall where Chapters 7 and 8 left the Mill Creek file. A touch-DNA sample lifted from the handle of the gas can recovered in the cabin yielded not a clean profile but a mixture: the major contributor consistent with the victim, Marcus Diallo, and at least one minor contributor, the data heat-degraded and low in quantity. A reference sample has now been obtained from Roy Keller, Diallo's business partner and the co-owner of the property.

The state laboratory runs the mixture through probabilistic genotyping software, assuming two contributors, with the propositions stated in advance: under $H_p$, the minor contributor is Roy Keller; under $H_d$, the minor contributor is an unknown, unrelated person. The software returns a likelihood ratio that strongly supports Keller as the minor contributor over an unknown unrelated person — a large number, in the range that laboratories describe verbally as strong support. (The exact figure is left to your workbook; treat any number you assign as illustrative.)

Now read it honestly, because this is the chapter's whole lesson applied to the case. The LR says the mixture data are far more probable if Keller is the minor contributor than if a random stranger is. It says Keller is consistent with having contributed; it does not say he did, and it says nothing about how his DNA reached the can. He is a co-owner of the property and the renovation — his DNA on a gas can stored at a site he partly owns has an innocent explanation, and the defense will press exactly that point (this is the transfer problem of Chapter 8, and the prior of §9.5). The result narrows the field; it does not close it.

Your task this chapter is twofold. First, log the mixture interpretation as a likelihood ratio with both propositions written out — not as a "match" and not as a probability that Keller is guilty. Second, a detective working the case tells a reporter: "The lab says there's only a one-in-a-billion chance it isn't Keller — basically a lock." Write, in your file, the correction. Name the error (the prosecutor's fallacy: the detective has transposed a statement about coincidence into a statement about Keller's guilt), restate what the number actually means (a random unrelated person would share this minor profile with that small probability; the strength of the evidence is an LR, not a probability of innocence), and note what the number leaves out (the prior — and the innocent-transfer explanation a co-owner has). The honest status of the file after Chapter 9: the DNA is consistent with Keller and provides strong support that he, rather than an unrelated stranger, is the minor contributor — but it is not proof, and a co-owner's DNA at the property is exactly the kind of result that must be weighed, not waved. Keller is not excluded; he is also not convicted by a number.

Conclusion

A DNA match is a number, and the number is the evidence. This chapter taught what that number is — a statement about how surprising a coincidence would be — and the disciplined ways to express it: the random match probability that quantifies coincidence, and the likelihood ratio that frames the evidence as a comparison between two stated hypotheses. We saw how probabilistic genotyping software extends that framework to the mixtures the human eye cannot untangle, and we held it to the same honest yardstick as everything else: foundationally valid for simple mixtures, not-yet-established for complex ones, and burdened by a real, unresolved fight over whether a defendant may examine the code that convicts him.

Above all, this chapter was about a single logical error and its mirror image. The prosecutor's fallacy transposes "the chance a random person would match" into "the chance the defendant is innocent," and that swap — committed by experts who can do the arithmetic flawlessly — has helped convict the innocent. The defense fallacy runs the other way, stripping strong evidence of its context to make it look worthless. The cure for both is the Bayesian structure that shows what the match probability leaves out: the prior, which the jury supplies and the scientist must not. And the practical end of all of it is communication — the honest script that reports the strength of the evidence and refuses, no matter the pressure, to report the probability of guilt.

Two of the book's themes ran straight through this chapter. The validity spectrum: DNA statistics sit at the strong end precisely because they are quantified, and even there the validity lives in the interpretation, not just the chemistry. The CSI effect: it cuts both ways. A jury primed for certainty will hear a careful probability as a verdict unless the expert states the limits out loud, and will dismiss strong evidence if a defense attorney recasts it as coincidence. The strongest tool forensic science owns is only as honest as the sentence that delivers it.

In the next chapter we leave the genome for the rest of the body's traces — blood and body fluids, the serology that finds and identifies them, and bloodstain pattern analysis, a discipline that sits far lower on the validity spectrum than DNA and far lower than television suggests. The contrast is deliberate: you are about to leave the field's most rigorous method and meet one of its most contested.

Key Terms

Likelihood ratio (LR) — a number expressing how much more probable the observed evidence is under one hypothesis (typically: the defendant is a contributor) than under a competing hypothesis (typically: an unknown, unrelated person is); the framework that now dominates forensic DNA interpretation because it forces both hypotheses to be stated.
Probabilistic genotyping — software that uses explicit statistical models to interpret DNA profiles, especially complex or low-template mixtures, computing a likelihood ratio rather than relying on an analyst's subjective peak-by-peak judgment.
Prosecutor's fallacy — the error of treating the probability of the evidence given innocence (e.g., the random match probability) as if it were the probability of innocence given the evidence; a transposed conditional that confuses coincidence with guilt.
Defense fallacy — the error of dismissing strong evidence by noting that many people in a large population would coincidentally match, while ignoring the other evidence that placed this particular defendant before the court.
Bayesian reasoning — a framework for updating belief in a hypothesis by combining a prior probability with the weight of new evidence (the likelihood ratio) to yield a posterior; in forensics, posterior odds = LR × prior odds.
Allele frequency — the proportion of chromosomes in a reference population carrying a particular allele at a given location; the raw material from which match probabilities are built.

Spaced Review

An analyst reports a random match probability of 1 in 8 million and then says, "so there's only a 1-in-8-million chance the defendant is innocent." Name the error, and rewrite the sentence so it is true. (§9.4)
Restate the relationship "posterior odds = LR × prior odds" in plain English, and explain which part the forensic scientist supplies and which part the jury supplies. (§9.5)
From Chapter 8: what makes a DNA mixture so much harder to interpret than a single-source profile, and how does probabilistic genotyping (§9.3) address that difficulty? (Ch. 8; §9.3)
From Chapter 7: what does the random match probability measure, and why must it always be stated against a reference population and with relatives in mind? (Ch. 7; §9.1)
Validity-spectrum question: Where on the NAS 2009 / PCAST 2016 spectrum do (a) single-source DNA statistics and (b) probabilistic genotyping of complex mixtures sit, and what is the key difference in their validation status? (§9.1, §9.3)