Exercises: The Streetlight Effect

Difficulty Guide: ⭐ Foundational (5–10 min) | ⭐⭐ Intermediate (10–20 min) | ⭐⭐⭐ Challenging (20–40 min) | ⭐⭐⭐⭐ Advanced/Research (40+ min)


Part A: Conceptual Understanding ⭐

A.1. State Goodhart's Law in your own words. Why does making a metric into a target weaken its validity?

A.2. Explain the McNamara Fallacy in four steps. Give one example NOT from the chapter.

A.3. What is the difference between a proxy measure and the construct it represents? Give three examples of proxy-construct pairs from different fields.

A.4. Explain the concept of "metric displacement." How does optimizing for a proxy actively crowd out the construct?

A.5. Why can't the streetlight effect be solved simply by creating better metrics?

A.6. The chapter describes the streetlight effect as "the most insidious" entry mechanism because it doesn't introduce a specific wrong answer. Explain why this makes it harder to detect than the authority cascade or unfalsifiability.


Part B: Applied Analysis ⭐⭐

B.1. Choose an organization you know well (workplace, university, government agency). Identify three metrics it uses to evaluate performance. For each, analyze: (a) What construct is the metric supposed to represent? (b) Has the metric decoupled from the construct? (c) How is the metric gamed?

B.2. The chapter discusses body counts in Vietnam, test scores in education, and patient satisfaction in healthcare. Choose one and design a better measurement system. What would you measure instead? How would you prevent gaming?

B.3. Apply the five-step "deeper pattern" from section 4.3 to a measurement system in your field: identify the construct, the proxy, the system, the decoupling, and the blindness.

B.4. The chapter argues that citation counts in academia have decoupled from research quality. Design an alternative system for evaluating academic research that is less susceptible to Goodhart's Law. What trade-offs would your system involve?

B.5. Compare the streetlight effect in education (test scores) with the streetlight effect in healthcare (quality ratings). What structural similarities exist? What are the key differences?

B.6. The "Active Right Now" section identifies social media engagement metrics as a current example. Trace the five-step pattern for social media: construct, proxy, system, decoupling, blindness.


Part C: Research Design Challenges ⭐⭐–⭐⭐⭐

C.1. Design a study to determine whether a specific metric in your field has decoupled from the construct it represents. What data would you need? What would count as evidence of decoupling?

C.2. The chapter suggests "rotating metrics" as a strategy. Design a rotating metric system for a school district. How would you implement it? What resistance would you expect?

C.3. Propose a method for measuring the "gap" between a proxy and its construct in a specific domain. How would you validate that your gap measure is itself valid? (Beware the meta-streetlight effect!)


Part D: Synthesis & Critical Thinking ⭐⭐⭐

D.1. The chapter argues that the streetlight effect interacts with the authority cascade (Chapter 2). Trace this interaction: how does the authority of psychometrics (testing science) reinforce the streetlight effect in education?

D.2. Is the streetlight effect ever beneficial? Can you identify a case where metric fixation produced a genuinely good outcome? What does this tell you about the limits of the chapter's argument?

D.3. The chapter mentions James C. Scott's concept of "legibility." How does the demand for legibility create the conditions for the streetlight effect? Is there an alternative to legibility that doesn't sacrifice the information the state (or an organization) needs?

D.4. Apply the failure mode framework from Chapters 1–4 to the streetlight effect itself. Is the chapter's argument about the streetlight effect subject to any of the failure modes it describes? (Hint: is the chapter looking under its own streetlight?)


Part M: Mixed Practice (Interleaved) ⭐⭐–⭐⭐⭐

M.1. (From Chapter 1) Map the McNamara Fallacy in Vietnam to the lifecycle of a wrong idea. At which stage did the body count metric become a "wrong idea"?

M.2. (From Chapter 2) Ancel Keys's authority reinforced the dietary fat hypothesis. The streetlight effect reinforced it as well, through calorie counts and lipid panels, which served as measurable proxies for heart health. How did these two failure modes interact?

M.3. (From Chapter 3) Can a metric-driven claim be unfalsifiable? If "educational improvement" is defined as rising test scores, is the claim "education is improving" falsifiable?

M.4. (Integration) Return to your Epistemic Audit target. Considering all four failure modes together (lifecycle, authority cascade, unfalsifiability, streetlight effect), which is most active in your field?


Part E: Research & Extension ⭐⭐⭐⭐

E.1. Read Jerry Muller's The Tyranny of Metrics (2018). Compare his argument with the chapter's framework. Where do they agree? Where does the chapter go beyond Muller?

E.2. Investigate a specific case where metric gaming produced a crisis (e.g., the Wells Fargo fake-accounts scandal, the VA hospital wait-time scandal, the Atlanta test-cheating scandal). Analyze the case using the five-step pattern and Goodhart's Law.


Solutions

Selected solutions are in appendices/answers-to-selected.md.