Chapter 3: Key Takeaways
Core Concepts
- The replication crisis was catalyzed by Daryl Bem's 2011 precognition paper, which demonstrated that standard psychological methods could produce evidence for impossible phenomena. The subsequent Open Science Collaboration (2015) found that only 36% of 100 published findings replicated successfully.
- Four questionable research practices drove the crisis: p-hacking (analyzing data flexibly until significance appears), HARKing (hypothesizing after the results are known), publication bias (journals favoring significant results), and small samples with low statistical power.
- Landmark findings that didn't hold up include ego depletion (willpower as a limited resource), social priming effects, the Stanford Prison Experiment (methodologically compromised), the marshmallow test (its effect largely explained by socioeconomic status), and some power posing effects.
- The field is reforming: pre-registration, Registered Reports, open data, larger samples, and multi-lab replications have made new research more reliable. Psychology is stronger for having confronted its problems.
- The replication crisis means psychology is *more* trustworthy now, not less. A science that checks and corrects its work should be trusted more than one that doesn't. The crisis revealed problems that had always existed; the reforms are addressing them.
- Most popular psychology claims are based on pre-crisis research, published before pre-registration was standard, before large-scale replications were common, and before publication bias was widely recognized. This doesn't mean all pre-crisis findings are wrong, but it means their reliability should be treated as uncertain until confirmed by modern methods.
- "Has this been replicated?" is now one of the most important questions you can ask about any psychology claim. Before 2011, this question barely existed in the field's culture.
Evidence Ratings in This Chapter
| Claim | Rating | Summary |
|---|---|---|
| "Published findings in top journals are reliable" | ❌ DEBUNKED | OSC 2015: only 36% replicated |
| "Ego depletion (willpower as a muscle) is real" | ❌ DEBUNKED | Multi-lab RRR found no effect (d = 0.04) |
| "The Stanford Prison Experiment proved situations override character" | ❌ DEBUNKED | Methodologically compromised: demand effects, self-selection, coaching |
| "Psychology is less trustworthy after the crisis" | ❌ DEBUNKED | Reforms make the field more reliable than ever |
| "The replication crisis means psychology is self-correcting" | ✅ SUPPORTED | Pre-registration, open data, and Registered Reports are genuine improvements |
Key Terms Introduced
- Replication crisis: The discovery that a large proportion of published findings in psychology (and other sciences) cannot be reproduced
- P-hacking: Analyzing data flexibly until a significant result is found, then reporting only that result
- HARKing: Hypothesizing After the Results are Known — writing papers as if data-driven hypotheses were pre-specified
- Publication bias (file drawer problem): The systematic tendency of journals to publish significant results and reject null findings
- Statistical power: The probability that a study detects a real effect of a given size; low power misses true effects, raises the proportion of false positives among published significant results, and inflates the effect sizes that do get reported
- Winner's curse: The tendency for significant results from low-powered studies to overestimate the true effect
- Pre-registration: Publicly committing to hypotheses and analysis plans before collecting data
- Registered Reports: Journal format where papers are peer-reviewed and provisionally accepted before data collection
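The winner's curse can also be demonstrated with a quick simulation. This is a sketch under assumed values (the true effect size d = 0.3 and the sample size of 20 per group are illustrative, not drawn from any real study): when power is low, the studies that happen to reach significance systematically overestimate the true effect.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

TRUE_D = 0.3   # modest true effect (Cohen's d), assumed for illustration
N = 20         # per-group sample size: badly underpowered for d = 0.3

sig_effects = []
for _ in range(4000):
    a = rng.normal(0.0, 1.0, N)       # control group
    b = rng.normal(TRUE_D, 1.0, N)    # treatment group with the true effect
    p = stats.ttest_ind(b, a).pvalue
    # observed effect size: Cohen's d with pooled standard deviation
    pooled = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    d_obs = (b.mean() - a.mean()) / pooled
    if p < 0.05:                      # only 'publishable' results survive
        sig_effects.append(d_obs)

power = len(sig_effects) / 4000
mean_published_d = float(np.mean(sig_effects))
print(f"power: {power:.0%}, true d: {TRUE_D}, "
      f"mean published d: {mean_published_d:.2f}")
```

In this setup only a small fraction of studies reach significance, and those that do report an average effect far above the true d = 0.3, because with 20 participants per group only unusually large sample differences clear the significance threshold. This is why pre-crisis effect sizes from small studies tend to shrink in large replications.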
One Sentence to Remember
The replication crisis revealed that many of psychology's most famous findings were built on shaky methods — but the field's willingness to confront this honestly is why you should trust it more now, not less.