Chapter 29 — Self-Check Quiz

DataField.Dev

25 questions: 19 multiple choice and 6 short answer. Try them closed-book. The answer key is at the bottom.

Multiple choice

1. Rapid DNA is validated primarily for:
   - A. Complex crime-scene mixtures analyzed at the scene
   - B. Clean, single-source reference samples (e.g., a buccal swab)
   - C. Degraded, low-template touch DNA
   - D. Determining time since death

2. The chief danger of rapid DNA, per the chapter, is:
   - A. The STR chemistry is unsound
   - B. Scope creep: applying it to evidentiary samples outside its validated envelope, without an analyst
   - C. It is slower than a full laboratory
   - D. It cannot search CODIS under any circumstances

3. The necrobiome is most directly analogous to which earlier-chapter method?
   - A. Bite-mark comparison
   - B. Insect succession on remains (forensic entomology)
   - C. Bloodstain pattern analysis
   - D. Questioned-document examination

4. Forensic isotope analysis of an unidentified body most defensibly establishes:
   - A. The person's exact hometown
   - B. The person's identity
   - C. A region of likely origin and a life-history sketch (an investigative lead)
   - D. The time of death

5. Tooth enamel and scalp hair record different time periods because:
   - A. Enamel forms in childhood and never remodels, while hair grows out as a record of recent months
   - B. Hair is older than enamel in every person
   - C. Only enamel contains isotopes
   - D. They are chemically identical and record the same period

6. A machine-learning model that produces an output without a human-understandable reason is a:
   - A. Validated method by definition
   - B. "Black box": a problem for cross-examination and the Daubert gatekeeper
   - C. Guarantee of objectivity
   - D. Form of conventional STR typing

7. Automation bias refers to:
   - A. A machine automatically removing all human bias
   - B. The human tendency to over-trust a machine's output because it is a machine
   - C. A bias only present in DNA analysis
   - D. The speed of an automated instrument

8. Training-data bias in an AI forensic tool means:
   - A. The model invents fairness on its own
   - B. The model inherits the statistics (including the injustices) of the data it learned from, then re-emits them with false authority
   - C. The model is always more accurate on underrepresented groups
   - D. The training data are irrelevant to the output

9. Familial searching searches:
   - A. A consumer genealogy database for distant cousins
   - B. The criminal database (CODIS) for a partial match indicating a close relative
   - C. Isotope reference maps
   - D. A microbial community profile

10. Investigative genetic genealogy differs from familial searching chiefly in that IGG uses:
    - A. The same STR markers and the same criminal database
    - B. Hundreds of thousands of SNPs in a consumer genealogy database, reaching distant relatives
    - C. Only fingerprint data
    - D. No DNA at all

11. In both familial searching and IGG, the courtroom identification is made by:
    - A. The genealogy / partial-match work itself
    - B. A conventional, direct STR comparison (the gold-standard method of Chapter 7)
    - C. An AI similarity score
    - D. The isotope profile

12. In the Golden State Killer case, the genealogists' actual labor consisted mostly of:
    - A. A single instant database query
    - B. Weeks of family-tree reconstruction from public records, triangulating from distant matches
    - C. Running the suspect's fingerprints
    - D. A microbial PMI estimate

13. A "96% similarity score" from a facial-recognition system is:
    - A. The probability that the match is correct
    - B. The model's internal similarity metric, not a probability of correctness
    - C. A validated error rate
    - D. Proof of identity

14. Microbial forensics for PMI and trace association is best described as:
    - A. Fully validated and courtroom-ready
    - B. Promising research whose forensic validation (large, reproducible studies; measured error rates) is still incomplete
    - C. Discredited junk science
    - D. Identical in reliability to single-source DNA

15. The single most important question to ask of any emerging method, per §29.6, is:
    - A. Does it use sophisticated technology?
    - B. What is the measured error rate, and how do we know (and across which subgroups)?
    - C. Has a vendor claimed it is accurate?
    - D. Did it work once in a demonstration?

16. The best sign of methodological honesty in an emerging method is that it:
    - A. Confirms its own output with no independent check
    - B. Generates a lead confirmed by a transparent, validated downstream method
    - C. Produces a single confident decimal
    - D. Cannot show its work

17. Rapid DNA's position on the validity spectrum is:
    - A. Always at the top, regardless of sample
    - B. Always at the bottom
    - C. Dependent on the use: high for clean reference swabs inside its envelope, effectively off it for degraded mixtures
    - D. Impossible to assess

18. The equity concern specific to familial searching is rooted in:
    - A. Who uploads to consumer genealogy databases
    - B. Who is already in the criminal database (CODIS), since their families become partially searchable
    - C. The cost of the instrument
    - D. The speed of the search

19. "It uses deep learning, so it must be reliable" is an example of:
    - A. A validation study
    - B. The appeal to sophistication: a rhetorical move to distrust, since complexity is not validity
    - C. A measured error rate
    - D. Auditable reasoning

Short answer

20. Explain, in two sentences, why removing the human analyst makes rapid DNA both faster and less safe. Name one thing an analyst would catch that the box will not.

21. State three perils of AI/ML in forensics, and for each, give a one-phrase description of how it can produce error.

22. Distinguish an investigative lead from a courtroom identification, using IGG as the example. Why does the difference protect against wrongful conviction even if the genealogy points to the wrong family branch?

23. A vendor claims "99% accuracy." Name three things you must know before that figure could responsibly support anything in court.

24. In the cold case, this chapter delivers an exclusion ("stranger theory excluded"), not an identification. Write one sentence stating exactly what was established and its honest verb, and one sentence stating what it does not establish.

25. Explain how the validity spectrum is used prospectively in this chapter — as a gate for future methods — and why that is more durable than a fixed list of which current methods are good or bad.

Answer key

**Multiple choice:** 1-B · 2-B · 3-B · 4-C · 5-A · 6-B · 7-B · 8-B · 9-B · 10-B · 11-B · 12-B · 13-B · 14-B · 15-B · 16-B · 17-C · 18-B · 19-B

**Short answer (model points):**

**20.** Automation does the extraction, amplification, separation, and initial call without a human, which is *fast* but removes the analyst's checkpoints: the points where a person would re-extract a stubborn sample, adjust input DNA, or *look at the data and exercise judgment*. The box returns a confidently formatted result whether the input was a pristine reference swab or a degraded mixture; an analyst would recognize, for example, that a sample is a **mixture** (more than two alleles at multiple loci) and stop. The box will not.

**21.** Any three:

- (a) **black box**: produces an output with no human-auditable reason, defeating cross-examination and the *Daubert* gatekeeper;
- (b) **training-data bias**: inherits and re-emits the injustices of its training set (e.g., worse false-match rates for underrepresented groups) with false authority;
- (c) **automation bias**: humans over-trust the machine's output *because* it is a machine;
- (d) the **validation gap**: deployed without independent, blind, adversarial validation or measured subgroup error rates.

**22.** A *lead* is a candidate to investigate by independent means; an *identification* is the conclusion presented to a jury. In IGG, the genealogy generates the lead and a conventional STR comparison makes the identification. If the genealogy points to the wrong branch, the wrong candidate's STR profile *will not match* the crime-scene profile, so the error is caught *before* any charge. The failure mode becomes "wasted effort / intruded-upon innocent relatives," not "wrongful conviction."

**23.** Any three: accurate on *what* test set, drawn from *what* population; measured against *what* ground truth; with *what* false-positive and false-negative rates across *which* subgroups; validated by *whom* (independent, no stake); and whether the reasoning is auditable. A vendor accuracy claim is marketing until these are answered.

**24.** **Established (honest verb = excludes):** "Rapid DNA and investigative genetic genealogy establish that the minor contributor to the gas-can DNA mixture is *not* an unknown stranger from outside the established field of persons of interest; the 'random intruder' theory is excluded." **Not established:** "It does not name the contributor, does not establish how or when those cells reached the handle, and does not prove the contributor set the fire or committed any crime."

**25.** The chapter turns the NAS 2009 / PCAST 2016 framework from a backward-looking *audit* (re-examining the bite mark, hair comparison, arson folklore) into a forward-looking *gate*: a checklist of questions (precise claim? independent validation? measured error rate? auditable reasoning? inside its envelope? who bears the cost? what does it hand off to?) asked of *any* new method before it enters court. This is more durable than a fixed list of good/bad methods because new methods appear constantly; the *question* ("what does it prove, and how do we know?") survives every change in technology, while any list of verdicts goes stale.
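**Worked illustration for model point 23 (optional).** The point that a single headline accuracy figure can hide subgroup error rates is easy to check with arithmetic. The sketch below is only an illustration under stated assumptions: the records, the subgroup labels `group_a` and `group_b`, and the helper names `overall_accuracy` and `subgroup_error_rates` are all hypothetical and do not come from the chapter or from any real tool.

```python
from collections import defaultdict

# Hypothetical validation records, invented for illustration only.
# Each record is (subgroup, truly_a_match, tool_reported_match).
RECORDS = (
    [("group_a", True, True)] * 49
    + [("group_a", True, False)] * 1
    + [("group_a", False, False)] * 898
    + [("group_a", False, True)] * 2
    + [("group_b", True, True)] * 9
    + [("group_b", True, False)] * 1
    + [("group_b", False, False)] * 34
    + [("group_b", False, True)] * 6
)


def overall_accuracy(records):
    """Fraction of records where the tool's call agrees with ground truth."""
    correct = sum(1 for _, truth, reported in records if truth == reported)
    return correct / len(records)


def subgroup_error_rates(records):
    """False-positive and false-negative rates, computed per subgroup."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for group, truth, reported in records:
        if truth:
            counts[group]["tp" if reported else "fn"] += 1
        else:
            counts[group]["fp" if reported else "tn"] += 1

    rates = {}
    for group, c in counts.items():
        positives = c["tp"] + c["fn"]  # pairs that truly match
        negatives = c["fp"] + c["tn"]  # pairs that truly do not match
        rates[group] = {
            "n": positives + negatives,
            "false_positive_rate": c["fp"] / negatives if negatives else float("nan"),
            "false_negative_rate": c["fn"] / positives if positives else float("nan"),
        }
    return rates


print(f"overall accuracy: {overall_accuracy(RECORDS):.3f}")
for group, r in sorted(subgroup_error_rates(RECORDS).items()):
    print(
        f"{group}: n={r['n']}, "
        f"FPR={r['false_positive_rate']:.3f}, "
        f"FNR={r['false_negative_rate']:.3f}"
    )
```

With these made-up counts the overall accuracy is 0.990, yet the false-positive rate is 2 in 900 for `group_a` and 6 in 40 (0.150) for `group_b`. The headline number is dominated by the larger group, which is why the key asks for rates *across which subgroups* and not just a single percentage.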