Chapter 15 Quiz: Self-Assessment

Instructions: Answer each question without looking back at the chapter. After completing all questions, check your answers against the key at the bottom. If you score below 70%, revisit the relevant sections before moving on to Chapter 16.

Multiple Choice

Q1. Goodhart's Law states that:

a) Good metrics always lead to good outcomes
b) When a measure becomes a target, it ceases to be a good measure
c) Metrics should never be used for decision-making
d) Only quantitative metrics are vulnerable to gaming

Q2. In the Soviet nail factory example, when production was measured by weight, factories produced:

a) Perfectly balanced nails of various sizes
b) Enormous, heavy nails that were useless for most purposes
c) Lightweight nails optimized for efficiency
d) The exact nails specified in the production plan

Q3. Campbell's Law adds to Goodhart's original insight by emphasizing that:

a) Only social indicators are vulnerable to corruption
b) Metrics always improve the processes they measure
c) Corruption pressures on the indicator distort and corrupt the social processes it was intended to monitor
d) Quantitative metrics are always superior to qualitative assessment

Q4. The term "teaching to the test" refers to:

a) Using test results to inform teaching decisions
b) Restructuring curricula around what will appear on a specific standardized exam rather than around what students most need to learn
c) Teaching students how to study effectively
d) Administering frequent formative assessments

Q5. In the Vietnam War, the "body count" metric led to all of the following EXCEPT:

a) Inflation of enemy casualty figures
b) Reclassification of civilian deaths as enemy combatants
c) Accurate assessment of strategic progress
d) Tactical decisions driven by maximizing kills rather than by strategy

Q6. CompStat gaming in policing included:

a) Upgrading misdemeanors to felonies to justify more resources
b) Downgrading serious crimes to less serious categories to improve statistics
c) Hiring more officers to reduce crime
d) Increasing community engagement programs

Q7. The Hospital Readmissions Reduction Program led some hospitals to:

a) Improve discharge planning and follow-up care (intended outcome)
b) Place returning patients in "observation status" to avoid counting them as readmissions
c) Both a and b
d) Neither a nor b

Q8. The "principal-agent problem" in the context of Goodhart's Law refers to:

a) The challenge of finding good employees
b) The gap between what a principal wants (the underlying reality) and what they can observe (the proxy metric), which agents can exploit
c) The difficulty of managing large organizations
d) The problem of choosing which metrics to use

Q9. Surgical mortality metrics can paradoxically harm patients because:

a) Surgeons become less skilled when measured
b) Surgeons may refuse to operate on high-risk patients to maintain favorable statistics
c) Hospitals invest in metrics instead of equipment
d) Mortality data is always inaccurate

Q10. The SEO industry is described as Goodhart's Law in action because:

a) Google's search algorithm is inherently flawed
b) PageRank used inbound links as a proxy for page quality, and once it became a known metric, an entire industry arose to game it
c) SEO professionals are dishonest
d) Search engines cannot measure quality

Q11. Social media engagement metrics drive outrage and polarization because:

a) Platform designers deliberately want to polarize society
b) Algorithms optimized for engagement systematically promote content that triggers strong negative emotions, since such content generates more interaction
c) Users prefer outrage content
d) There is no connection between engagement metrics and polarization

Q12. The academic "replication crisis" is connected to Goodhart's Law through:

a) The publish-or-perish incentive structure that rewards novel, publishable results over rigorously replicated findings
b) Poor laboratory equipment
c) Insufficient funding for research
d) The difficulty of understanding statistical methods

Q13. "P-hacking" refers to:

a) Hacking into academic databases
b) Running multiple statistical analyses and selectively reporting significant results
c) Using computers to analyze data more quickly
d) Publishing papers in peer-reviewed journals

Q14. Strathern's generalization differs from Goodhart's original formulation by:

a) Being more specific to monetary policy
b) Stripping away domain-specific language to reveal the universal structure applicable to any domain
c) Focusing only on academic metrics
d) Rejecting the idea that metrics can be useful

Q15. The Lucas critique in economics argues that:

a) Government intervention always improves the economy
b) Statistical relationships observed in economic data will change once policymakers try to exploit them
c) Economists should not use mathematical models
d) Markets are always efficient

Q16. Which of the following is NOT one of the five solutions to Goodhart's Law proposed in Section 15.8?

a) Multi-metric approaches
b) Eliminating all metrics and relying entirely on intuition
c) Qualitative assessment
d) Ostrom's polycentric governance

Q17. The "ratchet effect" in Soviet planning meant:

a) Quality improved steadily over time
b) Exceeding the quota raised next year's target, so factory managers deliberately avoided producing too much
c) New technologies were rapidly adopted
d) Central planners increased their control over time

Q18. The chapter's threshold concept -- "Metrics Are Models" -- means:

a) Metrics are physical devices used for measurement
b) Every metric is a simplified representation of reality, and optimization pressure exploits the gap between the metric and the reality it represents
c) Metrics should be avoided because they are never accurate
d) Models should replace metrics in all domains

Q19. The phrase "dishonest noise" refers to:

a) Random measurement error
b) Systematic bias introduced by agents who have an incentive to distort the data
c) Noise that is too loud
d) Measurement instruments that are poorly calibrated

Q20. According to the chapter, the correct response to Goodhart's Law is:

a) Stop measuring things entirely
b) Use metrics wisely -- as one input among many, held lightly, rotated frequently, and always checked against the underlying reality
c) Trust that good metrics will always produce good outcomes
d) Only use metrics that are impossible to game

Short Answer

Q21. Explain in two to three sentences why Goodhart's Law is not a problem with specific bad metrics but a structural pattern that applies to any metric used as a target.

Q22. Give one example from the chapter of a domain where the metric improved while the underlying reality got worse. Identify the metric, the underlying reality, and the direction of divergence.

Q23. In your own words, explain the difference between using a metric as a "thermometer" and using it as a "thermostat." Why does only the thermostat use trigger Goodhart's Law?

Q24. The chapter argues that blaming individual agents for gaming metrics "misses the structural point." Explain this argument in two to three sentences. Do you agree?

Q25. Name two forward connections mentioned in the chapter. What future chapters are referenced, and what concepts from Chapter 15 will they build upon?

Answer Key

Multiple Choice:

Q1: b -- This wording is Strathern's generalization of Goodhart's Law. (Section 15.7)

Q2: b -- Factories optimized for the weight metric by producing heavy, useless nails. (Section 15.1)

Q3: c -- Campbell's Law emphasizes that the indicator corrupts the very processes it was designed to monitor. (Section 15.1)

Q4: b -- Teaching to the test means restructuring instruction around the specific exam rather than around genuine learning. (Section 15.2)

Q5: c -- The body count did not accurately assess strategic progress; the Tet Offensive shattered the illusion of progress the metric had created. (Section 15.3)

Q6: b -- CompStat gaming included downgrading serious crimes to less serious categories. (Section 15.3)

Q7: c -- Some hospitals improved care (intended outcome) while others gamed the metric through observation status (unintended outcome). (Section 15.4)

Q8: b -- The principal-agent problem describes the gap between what the principal wants and what they can observe, which agents exploit. (Section 15.5)

Q9: b -- Surgeons may refuse high-risk patients to maintain favorable mortality statistics, leaving those patients without care. (Section 15.4)

Q10: b -- PageRank was a quality metric that became a target, spawning an industry dedicated to gaming it. (Section 15.4)

Q11: b -- Engagement optimization systematically promotes emotionally provocative content because it generates more measurable interaction. (Section 15.4)

Q12: a -- The publish-or-perish system rewards publication quantity and novelty over replication and rigor, incentivizing practices that produce unreliable results. (Section 15.6)

Q13: b -- P-hacking is the practice of running multiple analyses and selectively reporting those that produce statistically significant results. (Section 15.6)

Q14: b -- Strathern's formulation strips away domain-specific language, revealing the universal structure. (Section 15.7)

Q15: b -- The Lucas critique argues that exploiting observed statistical relationships changes behavior in ways that destroy those relationships. (Section 15.7)

Q16: b -- The chapter does not advocate eliminating all metrics; it advocates using them wisely. (Section 15.8)

Q17: b -- The ratchet effect punished exceeding targets by raising future targets, incentivizing managers to hide capacity. (Case Study 1)

Q18: b -- The threshold concept holds that every metric is a simplified model, and optimization pressure exploits the simplification. (Section 15.5)

Q19: b -- Dishonest noise is systematic bias from agents with incentives to distort data, as distinguished from random measurement error. (Section 15.3)

Q20: b -- Metrics should be used wisely, held lightly, supplemented by qualitative judgment, and never mistaken for the reality they represent. (Section 15.8)

Short Answer Rubric:

Q21: Goodhart's Law is structural because every metric is an incomplete model of the underlying reality it represents. Under optimization pressure, agents exploit the gap between the metric and reality -- this dynamic is inherent to any proxy measure used as a target, not to the specific choice of metric. Changing the metric changes the specific form of gaming but does not eliminate the structural vulnerability.

Q22: Acceptable examples include: test scores improved while actual learning (measured by independent assessments) did not; body counts increased while strategic progress deteriorated (the Tet Offensive revealed the gap); crime statistics improved while actual public safety may not have (downgrading and discouraged reporting); engagement metrics increased while quality of public discourse decreased (outrage amplification and polarization).

Q23: A thermometer measures passively -- it tells you the temperature without trying to change it. A thermostat measures and acts -- it detects a deviation and triggers a response to correct it. When a metric is used as a thermometer, no one has an incentive to game it. When it becomes a thermostat (an optimization target with consequences), agents have an incentive to improve the metric regardless of whether the underlying reality improves, triggering Goodhart's Law.

Q24: The structural argument holds that agents who game metrics are responding rationally to the incentive structure they face. If the system rewards metric improvement regardless of whether the underlying reality improves, then gaming is the predictable, rational response. The problem is the system design (the gap between metric and reality, combined with high-stakes incentives), not the moral character of individual agents. Agreement may vary, but the response should engage with the structural argument.

Q25: Forward connections include Chapter 16 (Legibility and Control -- Goodhart's Law as a preview of the broader legibility problem, where entire complex realities are simplified for administrative control), Chapter 21 (The Cobra Effect -- metric-based incentive systems that produce the opposite of their intended effect, Goodhart's Law pushed to its extreme), and Chapter 22 (Map and Territory -- the philosophical foundation of the map/territory distinction that underlies Goodhart's Law).
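
Optional illustration for Q13: the answer key describes p-hacking as running multiple analyses and reporting only the significant ones. The short simulation below is a sketch of why that inflates false positives; it is not taken from the chapter. It assumes every analysis is an independent two-sided z-test on 30 draws from a standard normal distribution, so the null hypothesis is true in every case and any "significant" result is a false positive. The function names, the sample size, and the choice of 20 analyses per study are all hypothetical.

```python
import random
from statistics import NormalDist, fmean


def null_p_value(rng, n=30):
    """Two-sided z-test p-value for n draws from N(0, 1); the null (mean = 0) is true."""
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
    z = fmean(sample) * n ** 0.5  # standard deviation is known to be 1 (assumption)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def false_positive_rate(analyses_per_study, studies=2000, alpha=0.05, seed=0):
    """Share of studies that can report at least one 'significant' result."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(studies):
        p_values = [null_p_value(rng) for _ in range(analyses_per_study)]
        # Selective reporting: the study counts as a "finding" if ANY analysis passes.
        if min(p_values) < alpha:
            hits += 1
    return hits / studies


honest = false_positive_rate(analyses_per_study=1)
hacked = false_positive_rate(analyses_per_study=20)
print(f"one pre-planned analysis:  {honest:.2f}")
print(f"best of twenty analyses:   {hacked:.2f}")
```

With one pre-planned analysis the false-positive rate should land near the nominal 0.05. With the best of twenty independent analyses it should land near 1 - 0.95^20, or about 0.64, even though no real effect exists anywhere in the data. The metric (a p-value below 0.05) has become the target, and it no longer tracks the reality it was meant to indicate.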