Case Study 1.1: The Viral Study That Wasn't

"Scientists Prove Women Are Attracted to Men with Deep Voices"


The Headline

On a Tuesday afternoon in September, a science headline begins circulating on social media. By evening it has been shared tens of thousands of times:

"Scientists Prove Women Are Attracted to Men with Deep Voices — Here's Why"

The article, published on a popular science aggregation website, is 600 words long. It leads with the finding, offers a brief evolutionary explanation (deep voices signal testosterone, which signals genetic health and dominance), quotes a sentence from the study abstract, and closes with a paragraph on "what this means for dating." The author has a friendly tone and cites the study by name. The article includes a stock photo of a conventionally attractive man.

Over the following week, this headline is picked up by eleven other outlets. Several add their own evolutionary gloss; two include tips for men on how to "deepen" their voice. A self-described dating coach posts a YouTube video citing the "new science" as confirmation of techniques he charges $299 to teach. A feminist columnist uses the same study to argue that women are socially conditioned to find markers of masculine dominance attractive, a conditioning that men then exploit. Both are confident they know what the study proved.


What the Study Actually Found

The original study, published in a mid-tier peer-reviewed journal, examined how male vocal pitch affects the attractiveness ratings that heterosexual women give to men's voices. Here is what the methodology actually looked like:

Sample: 64 undergraduate women recruited from the psychology subject pool at a single American university. All participants received course credit. The age range was 18–22. The published paper did not report participants' race or ethnicity.

Method: Participants listened to recordings of 20 male voices saying a standardized phrase ("I went to the store to buy some groceries"). Voices had been digitally manipulated to create versions that were identical except for fundamental frequency — that is, artificially deepened or raised. Participants rated each voice on a 1–7 attractiveness scale.

Finding: Women rated deeper voices as more attractive, on average. The effect was statistically significant (p < .05).

Effect size: Cohen's d = 0.31, a small effect by conventional standards (0.2 is the usual benchmark for "small," 0.5 for "medium"). In practical terms, the ratings given to deeper and higher voices overlap heavily: pitch shifts the average rating detectably, but it says little about how any one woman will rate any one voice.
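To make an effect of d = 0.31 concrete, the short Python sketch below converts Cohen's d into two more intuitive quantities: the probability that a randomly chosen rating of a deepened voice exceeds a randomly chosen rating of a raised voice, and the share of the two rating distributions that overlaps. It assumes both sets of ratings are normally distributed with equal variance, which the study summary does not report, and the function names are illustrative rather than taken from the paper.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def probability_of_superiority(d: float) -> float:
    """Chance that a random draw from the higher-mean group exceeds a random
    draw from the lower-mean group (assumes normal, equal-variance groups)."""
    return STD_NORMAL.cdf(d / sqrt(2))


def overlap_coefficient(d: float) -> float:
    """Shared area under two unit-variance normal curves whose means differ by d."""
    return 2 * STD_NORMAL.cdf(-abs(d) / 2)


if __name__ == "__main__":
    d = 0.31  # effect size reported for the original study
    print(f"P(superiority) = {probability_of_superiority(d):.3f}")
    print(f"overlap        = {overlap_coefficient(d):.3f}")
```

Under these assumptions, d = 0.31 corresponds to roughly a 59% chance that the deeper voice gets the higher rating (50% would be a coin flip) and to about 88% overlap between the two distributions.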

Replication status: Mixed. Three subsequent studies using similar methods found small positive effects. Two found no significant effect. One large preregistered replication with 312 participants found an effect in the same direction but smaller than the original (d = 0.18, just below the conventional 0.2 benchmark for a "small" effect).
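One way to read a mixed replication record is through statistical power. The sketch below uses a normal approximation to estimate how often a study would reach p < .05 (two-sided) if the true effect were the d = 0.18 reported by the large replication. It is a rough illustration, not a reanalysis: it assumes a within-participant design in which each woman contributes one deep-minus-raised difference score, and it treats d as the standardized mean of those differences. The summary above does not specify either detail, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def approx_power(true_d: float, n: int, alpha: float = 0.05) -> float:
    """Approximate two-sided power of a one-sample (paired) test.

    Normal approximation: the test statistic is treated as
    Normal(true_d * sqrt(n), 1). Assumes one difference score per
    participant, which is an assumption made for this sketch.
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)   # about 1.96 when alpha = 0.05
    shift = true_d * sqrt(n)                     # expected value of the test statistic
    upper = 1 - STD_NORMAL.cdf(z_crit - shift)   # significant in the expected direction
    lower = STD_NORMAL.cdf(-z_crit - shift)      # significant in the opposite direction
    return upper + lower


if __name__ == "__main__":
    for n in (64, 312):  # original sample size and large replication sample size
        print(f"n = {n}: approximate power at d = 0.18 is {approx_power(0.18, n):.2f}")
```

Under these assumptions, a study the size of the original (64 participants) would detect a true d of 0.18 only about 30% of the time, while a 312-participant study would do so roughly 89% of the time. A mix of significant and non-significant results among smaller studies is therefore what sampling variability alone could produce.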


What a Skeptical Reader Should Notice

The gap between the headline and the study is large. "Scientists prove" overstates what science does: science produces evidence, not proof. "Women are attracted to men with deep voices" universalizes a finding about 64 undergraduate women at one American university to all women everywhere. "Here's why" presents one evolutionary hypothesis as an established explanation.

Consider what the study does and does not tell us:

What it suggests: Among undergraduate women at one American university, in a controlled listening experiment, deeper male voices received slightly higher attractiveness ratings on average. This is a finding worth knowing about.

What it does not tell us: Whether the effect appears in non-American samples; whether it holds for women outside the 18–22 age range; whether naturally varying voices produce the same effect as digitally manipulated ones; whether vocal pitch matters when women make real-world dating decisions, where voice is one signal among dozens; whether the effect is consistent across racial and ethnic groups; whether it applies to same-sex attraction; whether bisexual women show the same pattern as the heterosexual women studied; and whether the effect, even where real, is large enough to meaningfully influence actual behavior.

The study was conducted on a WEIRD sample — Western, Educated, Industrialized, Rich, Democratic. It measured an attitudinal response (rating a voice) rather than a behavioral outcome (choosing to pursue someone). The sample was all-undergraduate, meaning all participants were in a life stage where short-term mate preferences may be more salient than long-term ones. None of this invalidates the finding. All of it constrains what the finding can responsibly claim.


Discussion Questions

  1. The article claimed that scientists "proved" the finding. Why is the word "prove" problematic in the context of a single study? What would it take to move from "a study found" to something closer to "we have good evidence that"?

  2. The effect size in the original study was d = 0.31. Using what you know about effect sizes, explain what this means practically. If you met a woman at a party and she told you she found deep voices attractive, would this study predict her reaction to any particular voice? Why or why not?

  3. Both the dating coach and the feminist columnist felt that the study confirmed their prior view. What does this suggest about how people consume scientific findings? What cognitive bias might be operating?

  4. If you were advising the science aggregation website that published the original article, what three specific changes would you recommend to the headline and framing to make it more accurate without being so technical that readers lose interest?

  5. The replication picture for this study was mixed — some studies found the effect, others didn't. What might explain the variation? What methodological factors might lead different researchers to get different results when studying the same phenomenon?


The Takeaway

The story of the viral voice study is not unusual — it is the standard pattern by which attraction science enters popular culture. A real study, with a real finding and real limitations, gets compressed into a headline that strips away the caveats, universalizes the sample, adds an evolutionary explanation that may or may not be warranted, and delivers a verdict where the actual data delivered a tentative signal.

The antidote is not cynicism. Most researchers who conduct attraction studies are doing honest work, and the finding — that voice pitch influences attractiveness ratings — is probably real in some form. The antidote is the habit this chapter introduced: asking what the evidence actually says, who it's actually about, and what follows and doesn't follow from it. Applied consistently, that habit will serve you better than any number of viral headlines.