Quiz: The Replication Problem

Target: 70% or higher to proceed confidently.

Section 1: Multiple Choice (1 point each)

1. The Reproducibility Project found that approximately what percentage of 100 psychology studies replicated?
   - A) 10%
   - B) 36%
   - C) 64%
   - D) 85%

Answer

**B)** 36%. *Reference:* Section 10.1

2. P-hacking is best described as:
   - A) Deliberately fabricating data
   - B) Analyzing data in multiple ways until a statistically significant result emerges
   - C) Hiring hackers to access other researchers' data
   - D) Using obsolete statistical methods

Answer

**B)** Testing many analyses and reporting only the significant ones. *Reference:* Section 10.2

3. HARKing stands for:
   - A) Hypothesizing After Results are Known
   - B) Historical Analysis of Research Knowledge
   - C) Hierarchical Analysis of Research Knowledge
   - D) Hypothesis-Adjusted Ranking of Knowledge

Answer

**A)** Formulating hypotheses after seeing the data, then presenting research as though it was hypothesis-driven. *Reference:* Section 10.2

4. The Amgen preclinical cancer research replication attempt found what replication rate for "landmark" studies?
   - A) 11%
   - B) 36%
   - C) 50%
   - D) 75%

Answer

**A)** Only 6 of 53 landmark studies (11%) replicated. *Reference:* Section 10.4

5. Ioannidis's 2005 paper argued that most published findings are false primarily because of:
   - A) Widespread fraud
   - B) The mathematical interaction of low prior probability, small samples, researcher degrees of freedom, and publication bias
   - C) Incompetent reviewers
   - D) Outdated statistical software

Answer

**B)** The mathematical structure of the system, not individual misconduct. *Reference:* Section 10.3

6. Registered reports address publication bias by:
   - A) Requiring double-blind review
   - B) Publishing only positive results faster
   - C) Committing to publish based on methodology before results are known
   - D) Requiring larger sample sizes

Answer

**C)** Journals commit to publish based on methods, eliminating the results-based filter. *Reference:* Section 10.8

Section 2: True/False with Justification (1 point each)

7. "The replication crisis means that all of psychology is unreliable."

Answer

**False.** Many areas of psychology (perception, learning, some cognitive processes) have strong replication records. The crisis is concentrated in specific subfields and specific types of studies.

8. "P-hacking is a form of deliberate fraud."

Answer

**False.** P-hacking is typically an unconscious response to the incentive structure, not deliberate fraud. Researchers often genuinely believe they are finding real patterns. The problem lies mainly in the structure of the system rather than in individual dishonesty.

9. "Pre-registration eliminates all analytical flexibility and therefore all false positives."

Answer

**False.** Pre-registration reduces flexibility but doesn't eliminate it. Researchers can still deviate from their plans, and exploratory analyses are still possible (they're just labeled as exploratory). Pre-registration reduces false positives but doesn't eliminate them.

Section 3: Short Answer (2 points each)

10. Explain the "dice rolling" analogy for p-hacking. Why does testing 20 hypotheses and reporting only the significant one produce unreliable results?

Sample Answer

With a significance threshold of p < 0.05, about 1 in 20 tests of a true null hypothesis will produce a "significant" result purely by chance. Testing 20 hypotheses is like rolling 20 dice: getting at least one six is very likely, so a six on its own carries little information. For 20 independent tests of true nulls, the chance of at least one false positive is 1 − 0.95^20 ≈ 64%. Reporting only the six while hiding the other 19 results creates the false impression of a targeted prediction when the result is actually noise. The audience sees a "significant finding"; the full set of results shows only random variation.
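The arithmetic behind this answer can be checked with a short Python sketch. It is an illustration only, and it rests on assumptions the quiz does not state: the 20 tests are independent, every null hypothesis is true, and p-values are therefore uniformly distributed on [0, 1]. The function names `familywise_error_rate` and `simulate_p_hacking` are hypothetical.

```python
import random


def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive among independent tests of true nulls."""
    return 1 - (1 - alpha) ** num_tests


def simulate_p_hacking(num_tests: int = 20, alpha: float = 0.05,
                       num_studies: int = 20_000, seed: int = 0) -> float:
    """Fraction of simulated studies with at least one 'significant' test.

    Assumption: every null is true, so each p-value is a uniform draw on [0, 1].
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_studies):
        # A "study" runs num_tests analyses and reports success if any p < alpha.
        if any(rng.random() < alpha for _ in range(num_tests)):
            hits += 1
    return hits / num_studies


analytic = familywise_error_rate(20)
simulated = simulate_p_hacking()
print(f"analytic:  {analytic:.3f}")   # 1 - 0.95**20, about 0.642
print(f"simulated: {simulated:.3f}")  # should land close to the analytic value
```

Under these assumptions, roughly two out of three such "studies" can report a significant finding even though no real effect exists anywhere in the data.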

Section 4: Applied Scenario (3 points)

11. A pharmaceutical company publishes three studies of a new drug, all showing it reduces symptoms significantly. All three used small samples (N=40 each), were not pre-registered, and were funded by the company. No independent replication has been attempted. Using this chapter's framework, evaluate the reliability of the evidence and recommend what should happen next.

Sample Answer

Reliability concerns: (1) small samples increase false positive risk, (2) the lack of pre-registration leaves room for p-hacking and HARKing, (3) company funding creates an incentive for positive results, and (4) the absence of independent replication means the finding has not been verified outside the interested party. Ioannidis's model would predict relatively low positive predictive value under these conditions.

Recommendation: (1) require independent replication by researchers with no financial ties to the company, (2) insist on a pre-registered replication protocol, (3) use larger samples with adequate power, (4) request access to the full datasets for independent analysis, and (5) check the company's files for unpublished negative studies (a publication bias check). The drug may work, but the current evidence base is insufficient to conclude that it does.
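The claim about low positive predictive value (PPV) can be made concrete with a minimal sketch of the PPV formula from Ioannidis's 2005 paper, where R is the pre-study odds that a tested relationship is true, β is the type II error rate, α is the significance threshold, and u is the bias term. The input values below are hypothetical, chosen only to contrast a well-powered, unbiased study with a small, biased one; they are not estimates for the drug in the scenario.

```python
def positive_predictive_value(prior_odds: float, power: float,
                              alpha: float = 0.05, bias: float = 0.0) -> float:
    """PPV = ((1 - b)R + u*b*R) / (R + a - b*R + u - u*a + u*b*R).

    prior_odds (R): pre-study odds that the tested relationship is true
    power: 1 - b, the chance of detecting a true effect
    alpha (a): significance threshold
    bias (u): share of would-be null results reported as positive anyway
    """
    beta = 1 - power
    true_positives = (1 - beta) * prior_odds + bias * beta * prior_odds
    all_positives = (prior_odds + alpha - beta * prior_odds
                     + bias - bias * alpha + bias * beta * prior_odds)
    return true_positives / all_positives


# Hypothetical inputs, for illustration only.
well_powered = positive_predictive_value(prior_odds=1.0, power=0.8)
small_biased = positive_predictive_value(prior_odds=0.25, power=0.3, bias=0.2)

print(f"well-powered, unbiased: {well_powered:.2f}")
print(f"small sample, biased:   {small_biased:.2f}")
```

With these illustrative inputs, the second case has a PPV well below one half, meaning a "significant" result is more likely false than true. That is the sense in which the model predicts weak evidence from small, flexible, interested-party studies.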

Scoring & Next Steps

| Score | Assessment | Recommended Action |
|---|---|---|
| < 50% | Needs review | Re-read Sections 10.1–10.3 |
| 50–69% | Partial | Review the structural incentives and Ioannidis's argument |
| 70–85% | Solid | Ready to proceed |
| > 85% | Strong | Proceed to Chapter 11 |