Case Study 1: The Replication Reckoning for Priming and Anchoring

The Priming Empire

In the 2000s, social priming was one of the hottest areas in psychology. The research program, led by John Bargh and colleagues, produced findings that were elegant, surprising, and perfectly suited for popular audiences:

  • Elderly priming (Bargh, Chen, & Burrows, 1996): Unscrambling sentences with elderly-related words made young people walk more slowly
  • Professor priming (Dijksterhuis & van Knippenberg, 1998): Thinking about a professor before a trivia test improved performance; thinking about a soccer hooligan impaired it
  • Warm cup effect (Williams & Bargh, 2008): Holding a warm cup made participants rate others as warmer
  • Heavy clipboard effect (Ackerman et al., 2010): Holding a heavy clipboard made participants judge résumés as more important
  • Money priming (Vohs et al., 2006): Exposure to money cues made people more self-sufficient and less helpful

These findings were published in top journals, cited thousands of times, and became staples of popular science coverage. They suggested that our behavior is shaped by environmental cues we're not even aware of — a thrilling and somewhat unsettling idea.

The Replication Attempts

Starting around 2012, researchers began attempting to replicate these findings:

| Study | Replication attempt | Result |
| --- | --- | --- |
| Elderly priming | Doyen et al. (2012) | Failed to replicate; effect appeared only when experimenters expected it |
| Elderly priming | Pashler et al. (2012) | Failed to replicate |
| Professor priming | Shanks et al. (2013) | Nine experiments, no reliable effect |
| Warm cup effect | Lynott et al. (2014) | Failed to replicate across three studies |
| Money priming | Rohrer et al. (2015) | Failed to replicate |
| Flag priming | Klein et al. (2014, Many Labs) | Failed to replicate |

The pattern was consistent: the original studies, typically with small samples (20–40 per condition) and flexible methods, produced significant effects. The replications, with larger samples and pre-registered methods, found nothing — or effects so small as to be practically meaningless.

What Went Wrong

The social priming research suffered from every methodological problem identified in Chapter 3:

Small samples. With typically 20–40 participants per condition, the original studies were underpowered, making significant results more likely to be false positives and, when the effects were real, inflated by the winner's curse (only overestimates of a small true effect clear the significance threshold).
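The winner's curse can be seen in a small simulation. This is a sketch with illustrative assumptions, not a model of any particular priming study: a true standardized effect of 0.2, 30 participants per condition, and a known-variance z-test in place of a t-test.

```python
import random
import statistics

def simulate_winners_curse(true_d=0.2, n=30, sims=20_000, seed=1):
    """Simulate many two-group studies and keep only the 'significant' ones."""
    random.seed(seed)
    se = (2 / n) ** 0.5  # standard error of the mean difference (unit variance assumed)
    significant = []
    for _ in range(sims):
        observed_d = random.gauss(true_d, se)  # each study's estimated effect
        if abs(observed_d) / se > 1.96:        # two-sided z-test at alpha = .05
            significant.append(observed_d)
    power = len(significant) / sims
    return power, statistics.mean(significant)

power, mean_significant_effect = simulate_winners_curse()
print(f"power: {power:.2f}")                      # low: most studies miss the effect
print(f"mean significant estimate: {mean_significant_effect:.2f}")  # well above the true 0.2
```

With these numbers, only a small fraction of studies reach significance, and the ones that do overestimate the true effect several-fold. A literature assembled only from the significant studies therefore exaggerates the effect even when it is real.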

Flexible analysis. Multiple possible outcome measures, covariates, and exclusion criteria provided researcher degrees of freedom that could produce significant results from noise.

Publication bias. Null results went unpublished. The published literature showed a 100% success rate for priming effects — which is itself suspicious. Real effects don't produce 100% success rates even with adequate power.
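The suspicion about a 100% success rate follows from simple arithmetic: if each study independently had a generous 80% chance of detecting a real effect (power = 0.8), the probability that all k studies come out significant is 0.8^k. A quick sketch:

```python
# Probability that k independent studies ALL reach significance,
# assuming each has 80% power (a generous assumption for small-sample work).
power = 0.8
for k in (5, 10, 20, 30):
    print(f"P(all {k} studies significant) = {power ** k:.4f}")
```

Even at 80% power, twenty consecutive significant results would occur only about 1% of the time, so an all-significant published literature points to selective publication or flexible analysis rather than a uniformly detectable effect.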

Experimenter effects. Doyen et al.'s finding that the elderly priming effect appeared only when experimenters expected it suggests that experimenter behavior (subtle cues to participants) may have driven the original results.

The Anchoring Exception

Anchoring — the finding that initial numbers influence subsequent judgments — has fared better than social priming:

Basic anchoring is robust. If you ask people "Is the Mississippi River longer or shorter than 500 miles?" before asking "How long is the Mississippi River?", they give higher estimates than people asked with a 200-mile anchor. This basic effect has been replicated extensively.

Extreme/irrelevant anchoring is weaker. Some of the more dramatic demonstrations (anchoring with obviously irrelevant numbers, like your Social Security number) have shown smaller effects in replications. The basic effect is real; the most extreme versions may have been inflated.

The mechanism debate. Researchers disagree about whether anchoring reflects genuine cognitive anchoring (the number actually influences the judgment process) or conversational pragmatics (people assume the experimenter's number is relevant information). This debate doesn't eliminate the effect but complicates the interpretation.

Discussion Questions

  1. The social priming program produced beautiful, surprising findings that turned out to be unreliable. What lessons should science communicators draw about how they present surprising findings to the public?

  2. Bargh has defended his original findings and disputed the replication failures. At what point does defending original findings become obstruction of scientific correction?

  3. The anchoring effect is robust in basic form but weaker in extreme demonstrations. How should textbooks and popular science present findings where the core effect is real but the most dramatic version is overstated?

  4. If social priming effects are real but much smaller than originally reported, do they have practical significance? Is a very small effect worth knowing about?