Exercises: Red Flags

Part A: Comprehension and Application

A.1. For each of the following red flag questions, identify the primary failure mode from Parts I-III that it detects: (a) "Who funded this?" (b) "What would disprove this?" (c) "What happens to people who disagree?" (d) "How does the field tell its own history?" (e) "Has this been independently replicated?"

A.2. Explain the difference between a red flag (strong signal) and a yellow flag (uncertainty signal). Give an example of each for a claim in a field you know well.

A.3. The chapter argues that "no single red flag proves a claim is wrong." Why not? Under what circumstances could a correct claim receive multiple red flags? Give a specific example.

A.4. Apply Questions 1-5 (funding, replication, falsification, beneficiaries, evidence age) to the claim that "eyewitness testimony is reliable" (circa 2000, before DNA exonerations became widespread). Score each question and explain your reasoning.

A.5. Apply Questions 6-10 (precision vs. accuracy, dissent treatment, source independence, effect size, real-world validation) to the claim that "learning styles improve education" (Chapter 30). Score each and explain.

Part B: Analysis

B.1. Complete a full 15-question Red Flag Scorecard for one of the following:
- (a) The claim that acupuncture is effective for pain management
- (b) The claim that organic food is significantly healthier than conventional food
- (c) The claim that class size reduction is the most cost-effective education intervention

Document your scoring and reasoning for each question.
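If you prefer to keep your scoring in a spreadsheet or script, the tally can be sketched as follows. The question labels and example ratings here are hypothetical placeholders, not the book's actual scorecard wording:

```python
# Minimal sketch of tallying a 15-question Red Flag Scorecard.
# Question labels and ratings below are hypothetical examples only.
from collections import Counter

# Each question gets one rating: "green" (no concern),
# "yellow" (uncertainty signal), or "red" (strong signal).
example_ratings = {
    "funding": "red",
    "replication": "yellow",
    "falsification": "red",
    # ... one entry per question, 15 in total
}

def tally(ratings):
    """Count green/yellow/red ratings across the scorecard."""
    counts = Counter(ratings.values())
    return {color: counts.get(color, 0) for color in ("green", "yellow", "red")}

print(tally(example_ratings))  # -> {'green': 0, 'yellow': 1, 'red': 2}
```

Remember that the count is a screening signal, not a verdict: the written reasoning behind each rating matters more than the total.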

B.2. The worked example scores the dietary fat hypothesis circa 1990 with 8 red flags. Apply the same scorecard to the dietary fat hypothesis circa 1965 (when Keys's work was newer and less challenged). How does the score change? What does this tell you about how red flag profiles evolve over time?

B.3. Identify a claim in your own field and apply all 15 Red Flag questions. If you find more than 5 red flags, investigate the two most concerning ones in depth. What additional evidence would you need to determine whether the red flags indicate actual error?

Part C: Synthesis and Evaluation

C.1. The Red Flag Scorecard is a screening tool, not a diagnostic tool — it identifies structural risk, not truth value. Evaluate the strengths and limitations of this approach. When might the scorecard mislead? What kinds of wrong consensuses might score well (few red flags), and what kinds of correct claims might score poorly (many red flags)?

C.2. Design three additional diagnostic questions that are not on the list of 15. For each, identify the failure mode it detects, provide an example of a case it would have caught, and define the green/yellow/red scoring criteria.

C.3. A colleague argues that the Red Flag Scorecard is itself a "plausible story" (Chapter 6) — an intuitive framework that substitutes for rigorous analysis. Evaluate this criticism. Is the scorecard evidence-based, or is it a narrative tool? How would you test whether it works?

Part D: Mixed Practice (Interleaved)

D.1. Score the pre-2012 AI winter consensus ("neural networks are a dead end") using the full 15-question scorecard. Compare your score to the worked dietary fat example in the chapter. Which consensus scored more red flags? Does this match their correction timelines?

D.2. A policymaker asks you to quickly assess the reliability of a new report claiming that a specific educational intervention improves student outcomes by 15%. You have 30 minutes. Using the Red Flag Scorecard, identify the 5 most important questions to investigate first. Justify your prioritization.

D.3. Apply the scorecard to this book. Use Question 13 (how does this book tell the history of knowledge?), Question 3 (what would disprove this book's thesis?), and Question 8 (does the evidence come from one source or many?). Score the book honestly. What red flags does it trigger, and what does that tell you about how to read it critically?