Case Study 1 — The Golden State Killer: Inside the Family Tree, and the Questions It Raised
A real, publicly documented case, and the load-bearing anchor of this book's argument about validated forensic progress. The facts below are drawn from the public record; where this study reconstructs a step for teaching, it is labeled. Chapter 8 introduced this case to establish what investigative genetic genealogy (IGG) is and that it inverted the database problem. This study revisits the anchor from the emerging-technology angle: the mechanics of the family-tree reconstruction in finer detail, and the privacy and ethics questions that crystallized in the years after the 2018 arrest. The crimes were horrific; they are described here only as far as the analysis requires, and as clinically as possible. The interest, as always, is the method.
1. Why this case is the anchor — and why it returns here
This book has named three load-bearing anchor cases. Cameron Todd Willingham is the catastrophe — junk science with a body count. Brandon Mayfield is the cautionary middle — even the gold standard is human judgment under bias. The Golden State Killer is the triumph — the emblem that real, validated forensic progress is possible, and that it looks like honest, careful science rather than a television database chirping a name.
Chapter 8 established the case's core. A serial offender left excellent crime-scene DNA across a long series of crimes in 1970s–1980s California. A decade of CODIS searches proved fruitless because the offender was not in the criminal database. The breakthrough came in 2018, when investigators, working with genealogists, used IGG to develop a SNP profile, found distant relatives in the public GEDmatch database, reconstructed family trees, and triangulated to Joseph James DeAngelo, a former police officer. His identification was then confirmed by a conventional STR comparison against abandoned DNA. He was arrested in April 2018 and later pleaded guilty.
That much you know. This study goes deeper into the two things that make the case worth returning to in a chapter on emerging technology: how the genealogy actually worked, and what it cost.
2. The mechanics, up close: how a distant cousin becomes a suspect
The popular account compresses IGG into a single magic step — "they found his DNA in a database." That is wrong in a way worth correcting, because the compression hides exactly the labor and the uncertainty that a forensic practitioner must understand. Here is the work, in finer detail than Chapter 8 gave it.
A different molecule of information. Conventional forensic typing reads roughly twenty STR loci — short tandem repeats — which are superb for matching one profile to another but carry almost no information about relatedness to distant kin. IGG instead generates a profile across hundreds of thousands of SNPs (single-nucleotide polymorphisms), the same dense genome-wide markers consumer ancestry companies use. Two STR profiles either match or they don't; two SNP profiles can be compared for the amount of DNA shared, and the amount shared is what reveals how closely two people are related. This re-typing of decades-old crime-scene DNA into a SNP profile suitable for genealogy was itself a technical accomplishment, and it is the first reason IGG could not have been done in the 1990s.
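The link between "amount shared" and "degree of relatedness" can be made concrete with a little arithmetic. The sketch below is illustrative only, not the calculation any laboratory or genealogy service performs. It assumes the textbook expectation that each generational step (meiosis) halves the autosomal DNA passed down from a given ancestor, and the function name `expected_shared_fraction` is hypothetical. Real tools measure shared segments in centimorgans, and actual sharing between distant cousins varies widely around these averages.

```python
def expected_shared_fraction(cousin_degree: int, shared_ancestors: int = 2) -> float:
    """Expected fraction of autosomal DNA shared by two nth cousins.

    Assumption (a textbook average, not a measurement): each meiosis halves
    the DNA inherited from a given ancestor. Two nth cousins are separated
    by 2 * (n + 1) meioses through each common ancestor, and full cousins
    share two common ancestors (an ancestral couple).
    """
    if cousin_degree < 1:
        raise ValueError("cousin_degree must be 1 or greater")
    meioses = 2 * (cousin_degree + 1)
    return shared_ancestors * 0.5 ** meioses


# Print the expected sharing for first through fourth cousins.
for degree in range(1, 5):
    share = expected_shared_fraction(degree)
    print(f"cousin degree {degree}: about {share:.3%} shared")
```

Under these assumptions first cousins come out at 12.5% and third cousins at under 1%, which is why a match at that distance narrows the search to a family rather than to a person.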
A database no one built for this. The SNP profile was uploaded to GEDmatch, a public site where consumer-DNA customers voluntarily upload their results to find relatives. The search did not ask "is the offender here?" — he was not. It asked "are any of his relatives here?", and returned a ranked list of people sharing enough DNA to be distant relatives, on the order of second to fourth cousins. This is the conceptual inversion that defines IGG: the offender's absence from the criminal database said nothing about his relatives' presence in a consumer one.
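What "a ranked list of relatives" means can be shown in a few lines. This is a minimal sketch with invented data: the `Match` record, the kit identifiers, the centimorgan values, and the window bounds are all hypothetical, and none of it reflects GEDmatch's actual matching algorithm or the thresholds the investigators used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    kit_id: str        # hypothetical database identifier
    shared_cm: float   # total shared DNA in centimorgans (invented values)


def rank_relatives(matches: list[Match], min_cm: float, max_cm: float) -> list[Match]:
    """Keep matches inside a shared-DNA window and sort strongest first."""
    in_window = [m for m in matches if min_cm <= m.shared_cm <= max_cm]
    return sorted(in_window, key=lambda m: m.shared_cm, reverse=True)


# Invented example data. The window bounds are illustrative assumptions:
# the lower bound drops matches too weak to be informative.
matches = [
    Match("kit-A", 48.0),
    Match("kit-B", 212.5),
    Match("kit-C", 6.1),
    Match("kit-D", 95.3),
]
ranked = rank_relatives(matches, min_cm=20.0, max_cm=400.0)
print([m.kit_id for m in ranked])
```

The output of such a search is a starting list for documentary research, not an answer: every kit on it belongs to a relative, and none belongs to the offender.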
The genealogist's labor — weeks, not seconds. Here is the part the magic-step account erases entirely. A third-cousin match does not point to the offender; it points to a common ancestor — perhaps a great-great-grandparent — from whom both the match and the offender descend, with potentially hundreds of other descendants in between. The genealogists' task was to take multiple distant matches, on different branches of the family, and reconstruct each one's family tree outward and backward from public records — censuses, birth and marriage and death certificates, obituaries, newspaper archives — until the separate trees converged on shared ancestral couples. Then they worked downward from those common ancestors through generations of descendants, triangulating from the different branches to a much smaller set of candidates, and finally winnowed that set by the case's hard constraints: approximate age, sex, and the California geography of the crimes. This is painstaking documentary research, prone to dead ends, gaps in the records, and false branches — weeks of skilled human work, not a database query.
The reconstruction, schematically [after the public record; the specific tree is generalized for teaching].
```text
                 COMMON ANCESTORS
        (great-great-grandparents, ~1800s)
            │                         │
 (branch A descendants)    (branch B descendants)
            │                         │
    DISTANT MATCH #1          DISTANT MATCH #2
 (in GEDmatch, a 3rd–4th     (in GEDmatch, on a
  cousin who uploaded)        different branch)
            │                         │
            └────── triangulate ──────┘
                         │
               candidate descendants
   (winnow by age / sex / California geography)
                         │
                         ▼
               ONE CANDIDATE SUSPECT
                         │
                         ▼
    CONFIRM: conventional STR on ABANDONED DNA
              vs. crime-scene profile
(the gold-standard method of Chapter 7 makes the ID)
```
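The triangulation in the schematic can also be sketched as a small set computation. This is a teaching simplification under stated assumptions, not the genealogists' actual procedure: it supposes the documentary work has already produced a parent-to-children tree and has identified two ancestral couples, and it treats the candidate pool as the people who descend from both. Every name, birth year, and constraint value below is invented, and `descendants` and `triangulate` are hypothetical helpers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    birth_year: int
    sex: str
    in_region: bool  # stand-in for the geographic constraint


def descendants(tree: dict[str, list[str]], root: str) -> set[str]:
    """Everyone below `root` in a parent -> children mapping."""
    found: set[str] = set()
    stack = list(tree.get(root, []))
    while stack:
        name = stack.pop()
        if name not in found:
            found.add(name)
            stack.extend(tree.get(name, []))
    return found


def triangulate(
    tree: dict[str, list[str]],
    ancestral_couples: list[str],
    people: dict[str, Person],
    birth_range: tuple[int, int],
    sex: str,
) -> list[str]:
    """Intersect the couples' descendant sets, then apply case constraints."""
    shared = set.intersection(*(descendants(tree, c) for c in ancestral_couples))
    earliest, latest = birth_range
    return sorted(
        name
        for name in shared
        if name in people
        and earliest <= people[name].birth_year <= latest
        and people[name].sex == sex
        and people[name].in_region
    )


# Invented tree: keys are a person or couple, values are their children.
tree = {
    "couple_X": ["x1", "x2"],
    "couple_Y": ["y1", "y2"],
    "x1": ["p1"],  # x1 and y1 had a child together,
    "y1": ["p1"],  # so p1 descends from both couples
    "x2": ["p3"],
    "y2": ["p2"],
    "p1": ["c1", "c2"],
}
people = {
    "p1": Person(birth_year=1915, sex="M", in_region=True),
    "c1": Person(birth_year=1944, sex="M", in_region=True),
    "c2": Person(birth_year=1947, sex="F", in_region=True),
}
candidates = triangulate(tree, ["couple_X", "couple_Y"], people, (1940, 1960), "M")
print(candidates)
```

In practice the hard part is building `tree` at all. The records are incomplete and sometimes wrong, so the real output is a short list of people to investigate, never a conclusion.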
The confirmation that did the legal work. Once the genealogy named a candidate, investigators obtained DNA from items DeAngelo discarded (abandoned DNA, which U.S. courts have generally treated as carrying no reasonable expectation of privacy once thrown away) and ran a conventional STR comparison against the crime-scene profile. That comparison, the validated method of Chapter 7, is what identified him. The family tree generated the name; ordinary, gold-standard DNA confirmed it. Keep this division of labor in front of you: it is the entire reason this case is a model of honest progress and not merely impressive progress.
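The logic of the confirming comparison is simple enough to sketch. What follows is a deliberately reduced illustration, not casework software: it assumes clean single-source profiles and ignores mixtures, partial profiles, and the match-probability statistics a real report would include. The locus names are real STR markers used only as labels, the allele values are invented, and `str_profiles_match` is a hypothetical helper.

```python
Genotype = tuple[float, float]
Profile = dict[str, Genotype]


def normalize(profile: Profile) -> dict[str, tuple[float, ...]]:
    """Sort each allele pair so (14, 13) and (13, 14) compare equal."""
    return {locus: tuple(sorted(pair)) for locus, pair in profile.items()}


def str_profiles_match(evidence: Profile, reference: Profile) -> bool:
    """True only if every locus typed in both profiles has the same alleles."""
    ev, ref = normalize(evidence), normalize(reference)
    shared_loci = ev.keys() & ref.keys()
    if not shared_loci:
        raise ValueError("no loci in common to compare")
    return all(ev[locus] == ref[locus] for locus in shared_loci)


# Invented allele values, for illustration only.
crime_scene = {"D8S1179": (13, 14), "TH01": (9.3, 6), "FGA": (22, 22)}
right_candidate = {"D8S1179": (14, 13), "TH01": (6, 9.3), "FGA": (22, 22)}
wrong_candidate = {"D8S1179": (13, 14), "TH01": (7, 9.3), "FGA": (21, 22)}

print(str_profiles_match(crime_scene, right_candidate))
print(str_profiles_match(crime_scene, wrong_candidate))
```

The second comparison is the safeguard in miniature: a candidate reached by a wrong turn in the tree fails at this step, so the genealogy's error stops before it becomes an identification.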
3. What the method did — and didn't — establish
State it precisely, because the precision is the lesson.
- IGG did generate, from crime-scene DNA and a public genealogy database, an investigative lead — a single candidate to investigate — where a decade of CODIS searching had produced nothing.
- IGG did not identify DeAngelo. The identification presented to the legal system was the conventional STR match between the crime-scene profile and the profile from his abandoned-DNA sample. The genealogy was the path to the door; the STR match is what opened it.
- The method's failure mode is not "convicting an innocent person on genealogy." The downstream STR confirmation guards against that: if the genealogy had pointed to the wrong man, his STR profile would not have matched, and the error would have been caught before any charge. The real failure mode is following the tree to the wrong branch — which wastes investigative effort and, more troublingly, draws innocent relatives into a criminal investigation. That is a real cost, but a different kind of cost than a wrongful conviction.
This is the structure that earns IGG its place on the validity spectrum. The lead-generation step is judged as an investigative method — by whether it reliably points investigations correctly, and by its privacy costs — not by a courtroom error rate it never claims. The confirmation step is judged where DNA always is: at the top, validated and strong. A method that hands its conclusion to the strongest tool in the field, and never pretends to be that tool, is being structurally honest about its own limits. That honesty is precisely what the bite mark (Chapter 16) never had.
4. What it cost: the ethics the triumph cannot erase
A celebrated success is exactly when a careful field should examine its costs most closely, because success silences scrutiny. The Golden State Killer case did not only demonstrate a powerful method; it opened a set of questions the field is still answering.
Consent at a distance. When a person uploads their DNA to a consumer service to find a half-sibling or a birth parent, they expose not only their own genome but, partially, the genomes of everyone who shares their DNA — siblings, parents, children, and cousins who consented to nothing and may not know the database exists. IGG turns that ambient exposure into a law-enforcement tool: one distant cousin's voluntary upload rendered DeAngelo's entire extended family findable. The offender's lack of consent is unobjectionable — he is a suspect. But the dozens of innocent relatives whose presence in the database made him findable did not consent either, and that is the part that should give a practitioner pause.
Database terms, changed under users' feet. GEDmatch's users had uploaded under terms that did not clearly contemplate law-enforcement searching; the propriety of the search, and the subsequent changes services made to their policies (some switching users to opt-out or opt-in regimes for police matching), became a live controversy. People who uploaded to find family did not necessarily sign up to help solve strangers' cases. Informed consent is genuinely hard when the consequences run through one's relatives and the rules can change after the fact.
Equity and uneven reach. Consumer genealogy databases over-represent people of European descent and under-represent others. IGG's power therefore falls unevenly across populations — it is more likely to find a suspect with many databased cousins than one whose community is sparsely represented. A tool's reach being a function of which groups have used consumer genetics is a fairness problem worth naming, even when the tool works.
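A toy probability model shows how sharply database coverage changes reach. It is a sketch under loud assumptions, not an estimate for any real population: it supposes each relative is in the database independently with the same probability, and the figures used (200 relatives, 2% versus 0.2% coverage) are invented for illustration. The function name `chance_of_any_match` is hypothetical.

```python
def chance_of_any_match(relatives: int, coverage: float) -> float:
    """Probability that at least one relative is in the database.

    Assumption: each of `relatives` people is present independently
    with probability `coverage`, so P(none present) = (1 - coverage) ** relatives.
    """
    if relatives < 0:
        raise ValueError("relatives must be non-negative")
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be between 0 and 1")
    return 1.0 - (1.0 - coverage) ** relatives


# Invented figures: the same family size, two levels of database coverage.
well_covered = chance_of_any_match(relatives=200, coverage=0.02)
sparsely_covered = chance_of_any_match(relatives=200, coverage=0.002)
print(f"2% coverage:   {well_covered:.0%}")
print(f"0.2% coverage: {sparsely_covered:.0%}")
```

With these invented numbers a tenfold drop in coverage takes the chance of any usable match from near certainty to roughly one in three. The method is the same in both cases; only the community's presence in the database differs.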
The regulation lag. The law governing what police may upload, to which databases, under what oversight, lagged the technology badly. In the years after 2018, an interim U.S. Department of Justice policy set conditions on forensic genetic genealogy in federally supported cases — generally limiting it to violent crimes and the identification of unknown remains, and requiring that conventional database searching be exhausted first — but a comprehensive legal framework remains a patchwork across jurisdictions and companies.
The honest posture. None of these costs makes IGG junk; the method is valid, powerful, and, married to STR confirmation, a genuine triumph. But a forensic scientist who can run a method and cannot reason about its costs is only half-trained. The mature judgment holds three things at once: IGG is a triumph that closed unsolvable cases; it is a model of methodological honesty (lead, then validated confirmation); and it is a genetic-surveillance capability that outran the rules meant to govern it. Collapsing into either boosterism or alarm is the failure; holding all three is the discipline.
5. The lesson
The Golden State Killer case teaches, in one example, the whole argument of this chapter and much of the book.
- Honest progress hands off. The single best sign of a trustworthy emerging method is that it generates a lead confirmed by a transparent, validated downstream method, never that it offers its own output as the conclusion. IGG's structure (SNP-genealogy lead → STR confirmation) is the template. An emerging method that confirms itself, with no independent check, has the bite mark's structure in new clothing.
- Validity is judged by the right question, at the right step. The lead-generation step is judged as an investigative tool (does it point correctly? at what privacy cost?); the confirmation step is judged as courtroom science (what is the error rate? DNA's, which is tiny). Conflating the two, whether by judging IGG as if the genealogy itself were the courtroom identification or by excusing an unvalidated method because it "feels" as solid as DNA, muddles both the science and the law.
- A triumph is the moment to examine costs, not to suspend judgment. The method worked; the costs to consent, equity, and privacy are real anyway. Forensic maturity is the capacity to celebrate a genuine advance and to keep asking who bears its costs, in the same breath, without flinching from either.
The Golden State Killer is this book's emblem of validated progress because it shows what such progress actually is: not certainty delivered in nine seconds, but a careful new way to generate a lead, confirmed by the field's most rigorous method, claimed at exactly its true strength — and scrutinized for its costs even in victory.
Discussion questions
- The popular account compresses IGG to "they found his DNA in a database." Using §2, identify at least three distinct steps that compression erases, and explain why a forensic practitioner must understand them rather than the slogan.
- Distinguish the genealogy lead from the STR confirmation. Which one was presented to the legal system as the identification, and why does that division of labor make IGG a model of methodological honesty (compare the bite mark, Chapter 16)?
- The chapter says IGG's failure mode is "following the tree to the wrong branch," not "convicting an innocent person on genealogy." Explain why the downstream STR confirmation changes the kind of error the method can make, and why the remaining error (wrong branch) is still a real cost.
- "Consent at a distance" means one person's upload exposes their relatives. Using §4 and Chapter 8's preview of Chapter 38, explain why informed consent is especially hard for IGG, and why a database changing its terms after users uploaded sharpens the problem.
- IGG's reach is uneven across populations because consumer databases over-represent some groups. Connect this to the training-data-bias problem in AI (§29.4): in what sense do both an IGG search and a facial-recognition model inherit the demographics of the data they rely on?
- Apply the §29.6 evaluation checklist to IGG as if it were brand new. On which items does it pass cleanly (e.g., "what does it hand off to?"), and on which does the honest answer involve costs rather than validity (e.g., "who bears the cost, and did they consent?")?