Quiz: The Epistemic Health Checklist


Q1. The Epistemic Health Checklist evaluates:

(a) Individual claims
(b) Individual researchers
(c) Entire fields and organizations: whether they are structurally capable of producing trustworthy knowledge and correcting errors
(d) Funding agencies

**Answer: (c)** The checklist assesses the system, not individual claims (that's the Red Flag Scorecard from Chapter 31) or individual people. The question is: "Is this field capable of producing trustworthy claims?"

Q2. Dimension 1 (Dissent Tolerance) is described as "the strongest single predictor of a field's correction capacity" because:

(a) Dissenters are always right
(b) Fields that punish dissent protect their current position, right or wrong, while fields that engage with dissent can revise when wrong
(c) Consensus is always wrong
(d) Dissent is more interesting than agreement

**Answer: (b)** The treatment of dissenters reveals whether a field defends its positions through evidence (healthy) or through institutional power (unhealthy). Every wrong consensus in this book was sustained partly by suppressing those who challenged it.

Q3. In the worked examples, nutrition science scored an average of 3.1/10. This score predicts:

(a) The field will self-correct within a year
(b) The field is structurally incapable of self-correction; errors persist until external forces compel change
(c) Individual nutrition claims are all wrong
(d) Nutrition science should be abolished

**Answer: (b)** A score of 3.1 places nutrition in the "poor epistemic health" range (3-5). This doesn't mean every claim is wrong, but the field's structures actively protect error and suppress correction. The dietary fat hypothesis took 40+ years to correct, consistent with the score.

Q4. The chapter argues that "the profile pattern matters as much as the average score." This is because:

(a) All dimensions are equally important
(b) Specific vulnerabilities produce specific types of error: a field with strong replication but weak measurement validity will catch statistical errors but miss validity problems
(c) Higher averages are always better
(d) The pattern is easier to read than the average

**Answer: (b)** The pattern reveals *what kind* of error a field is vulnerable to. Uniformly moderate scores suggest diffuse risk; uneven scores suggest specific, targetable vulnerabilities.
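
To make the distinction concrete, here is a minimal sketch comparing two hypothetical four-dimension profiles with the same average. The dimension names and scores are illustrative, not drawn from the chapter's worked examples:

```python
from statistics import mean, pstdev

# Two hypothetical profiles with identical averages.
# Field A carries diffuse, moderate risk across every dimension;
# Field B has one specific, targetable vulnerability.
field_a = {"dissent": 6, "replication": 6, "transparency": 6, "measurement": 6}
field_b = {"dissent": 9, "replication": 9, "transparency": 5, "measurement": 1}

for name, profile in [("A", field_a), ("B", field_b)]:
    scores = list(profile.values())
    weakest = min(profile, key=profile.get)
    print(f"Field {name}: average={mean(scores):.1f}, "
          f"spread={pstdev(scores):.1f}, weakest dimension={weakest}")

# Field A: average=6.0, spread=0.0, weakest dimension=dissent
# Field B: average=6.0, spread=3.3, weakest dimension=measurement
```

Both fields average 6.0, but Field B's spread points to exactly the kind of error it will miss (measurement validity, per Q5), which the average alone hides.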

Q5. Dimension 4 (Measurement Validity) detects which failure modes?

(a) Authority cascade and sunk cost
(b) The streetlight effect (Ch. 4) and precision without accuracy (Ch. 12): measuring what is measurable rather than what matters
(c) The outsider problem
(d) The revision myth

**Answer: (b)** When a field's metrics are poor proxies for what actually matters (test scores for learning, body counts for winning, VaR for risk), the measurement validity dimension catches this vulnerability.

Q6. Software engineering scored 8/10 on outsider access but 4/10 on history awareness. This pattern suggests:

(a) The field is equally healthy on all dimensions
(b) The field is open to outside challenge (new people can contribute easily) but prone to the revision myth (presenting current practices as inevitable progress rather than acknowledging past failures)
(c) The field is closed to outsiders
(d) The field has excellent historical awareness

**Answer: (b)** The uneven profile reveals a specific vulnerability: software engineering's open culture allows new ideas in easily, but its poor history awareness means the field may repeat errors because it doesn't study its own past mistakes carefully enough.

Q7. A field scores 9/10 on process transparency and 2/10 on incentive alignment. What does this pattern suggest?

(a) Everything is fine
(b) The field's processes are visible, but the incentives driving those processes reward error-producing behavior: you can see exactly how the system is producing wrong answers, but the system has no incentive to fix it
(c) The field needs less transparency
(d) The field has aligned incentives

**Answer: (b)** Transparency without incentive alignment means the problems are visible but the system is not structured to address them. This is arguably worse than opacity: it produces the frustration of watching a system produce errors that everyone can see but no one can fix.

Q8. The interpretation guide suggests that a field with an average score of 1-3 should be treated with:

(a) Full trust
(b) Moderate trust with verification
(c) Default skepticism until specific claims are independently validated
(d) Complete rejection of all claims

**Answer: (c)** A score of 1-3 indicates "critical epistemic health failure": the field's structures actively protect error. This doesn't mean every claim is wrong, but the default should be skepticism rather than trust, with individual claims requiring independent validation.

Q9. Dimension 6 (Correction Speed) synthesizes the Correction Speed Model from Chapter 22. Criminal justice scored 1/10 on this dimension because:

(a) Criminal justice has no errors to correct
(b) Legal precedent, finality bias, and prosecutorial power create structural barriers that make correction take decades or longer
(c) Criminal justice corrects very quickly
(d) The dimension doesn't apply to criminal justice

**Answer: (b)** Criminal justice's Correction Speed Model profile was the most pessimistic of any field examined: legal precedent preserves error, finality bias resists re-evaluation, and no crisis mechanism forces systematic correction.

Q10. The Epistemic Health Checklist and the Red Flag Scorecard are designed to be used:

(a) Independently: use one or the other
(b) Sequentially: use the Checklist to assess the field, then the Scorecard to assess specific claims within that field, with the field assessment informing how skeptically to evaluate individual claims
(c) Only by experts
(d) Only for scientific fields

**Answer: (b)** The tools are complementary. The Checklist tells you "how much should I trust this field in general?" The Scorecard tells you "how much should I trust this specific claim?" A claim in a field with poor epistemic health deserves more scrutiny than the same claim in a field with strong epistemic health.

Scoring Guide

  • 9-10 correct: Excellent. You can use both diagnostic tools and understand their relationship.
  • 7-8 correct: Good. Review the distinction between the Checklist and the Scorecard, and the importance of profile patterns.
  • 5-6 correct: Fair. Revisit the 10 dimensions and their connections to specific failure modes.
  • Below 5: Re-read the chapter with attention to what each dimension measures and why it matters for epistemic health.