Case Study 25.1: The OkCupid Racial Data — What *Dataclysm* Showed and What It Didn't

Background

In 2009, OkCupid co-founder Christian Rudder published a blog post titled "How Your Race Affects the Messages You Get" — drawing on millions of messages and ratings from the platform to map racial preferences in online dating. Five years later, he expanded this analysis in his book Dataclysm: Love, Sex, Race, and Identity — What Our Online Lives Tell Us About Our Offline Selves (2014). The findings were striking enough to generate years of academic and journalistic conversation, and they remain among the most-cited empirical reference points in public discussions of racial hierarchy in dating.

What the Data Showed

Rudder's OkCupid analysis examined two types of evidence: message behavior (who sends messages to whom) and star ratings (how users rated each other's attractiveness on a 1–5 scale).

The rating data revealed a consistent cross-racial hierarchy. In Rudder's analysis, White men and women received the highest average ratings from all groups. Black women received the lowest average ratings from all non-Black male groups by a substantial margin — in some comparisons, the gap was more than a full point on a 5-point scale. Asian men received systematically lower ratings from non-Asian women. Other groups fell at various intermediate points.

Message behavior told a related story: users were significantly more likely to send messages to members of their own racial group than to members of other groups, controlling for distance and other factors. When cross-racial messages were sent, response rates were lower than for in-group messages — and the response rate gaps also followed the hierarchy pattern.

Rudder also examined how racial preferences had changed over time. He found that by 2014, racial preferences in ratings had become slightly less extreme than in the mid-2000s — a modest but measurable trend toward more equal cross-racial ratings.

Methodological Issues

The OkCupid data is rich but carries significant methodological limitations that are often omitted in popular retellings.

Selection bias. OkCupid's user base in 2009–2014 was not representative of the U.S. population. It skewed urban, college-educated, and young. The racial patterns observed may differ substantially from those in more representative samples — or in different digital environments.

Self-reported race categories. Users chose their racial identity from a limited menu of options. The category "Asian" subsumed East Asian, South Asian, Southeast Asian, and Pacific Islander users, groups with very different social positions, historical relationships to U.S. racial hierarchy, and likely very different desirability profiles. Aggregating them hides that variation: a single category average may describe none of its constituent groups well.
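The aggregation problem can be made concrete with a small sketch. The numbers below are invented purely for illustration and are not OkCupid data; the subgroup names, `subgroup_means`, and `pooled_mean` are hypothetical. The point is only that a pooled category average can sit in the middle of subgroups whose own averages differ by a full point.

```python
from statistics import mean

# Invented 1-5 star ratings for three subgroups that a site's menu
# lumps into one category. These are NOT real OkCupid figures.
ratings_by_subgroup = {
    "subgroup_1": [3, 4, 4, 3],
    "subgroup_2": [2, 3, 3, 2],
    "subgroup_3": [3, 3],
}

def subgroup_means(groups):
    """Average rating for each subgroup separately."""
    return {name: mean(vals) for name, vals in groups.items()}

def pooled_mean(groups):
    """Average rating when all subgroups are treated as one category."""
    all_vals = [v for vals in groups.values() for v in vals]
    return mean(all_vals)

per_group = subgroup_means(ratings_by_subgroup)
pooled = pooled_mean(ratings_by_subgroup)

# Distance between the highest and lowest subgroup averages,
# which the pooled figure does not reveal.
spread = max(per_group.values()) - min(per_group.values())

print("per-subgroup means:", per_group)
print("pooled category mean:", pooled)
print("spread hidden by pooling:", spread)
```

In this toy case the pooled mean equals the average of only one of the three subgroups, while the other two sit half a point above and below it. Whether the real category behaved this way cannot be determined from the published aggregates.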

Context of ratings. Rating a profile on a 1–5 scale is a simplified behavior that may not translate directly to actual partner selection. Users may rate profiles differently than they swipe, and both may differ from how they behave in person.

Confounding variables. The analysis controlled for some variables but not others. If users in certain racial groups differ systematically in profile completion rates, photo quality, bio length, or other factors that influence ratings, the raw gaps between groups mix those differences with any effect of race itself. Later research has attempted to isolate racial effects more rigorously.
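One common way to probe this kind of confounding is stratification: compare groups only within levels of the suspected confounder. The sketch below uses entirely synthetic records, with hypothetical group labels and a two-level "photo quality" variable; it is not a reconstruction of Rudder's method or of any published analysis. It shows how an unadjusted gap can be larger than the gap that remains within each stratum.

```python
from collections import defaultdict
from statistics import mean

# Synthetic (group, photo_quality, rating) records, invented for illustration.
# group_x profiles mostly have high-quality photos; group_y mostly low.
records = [
    ("group_x", "high", 4.0), ("group_x", "high", 4.0), ("group_x", "high", 4.0),
    ("group_x", "low", 3.0),
    ("group_y", "high", 3.5),
    ("group_y", "low", 2.5), ("group_y", "low", 2.5), ("group_y", "low", 2.5),
]

def raw_gap(rows):
    """Difference in mean rating between groups, ignoring photo quality."""
    by_group = defaultdict(list)
    for group, _quality, rating in rows:
        by_group[group].append(rating)
    return mean(by_group["group_x"]) - mean(by_group["group_y"])

def stratified_gaps(rows):
    """Difference in mean rating between groups within each quality level."""
    by_cell = defaultdict(list)
    for group, quality, rating in rows:
        by_cell[(group, quality)].append(rating)
    strata = sorted({quality for _group, quality, _rating in rows})
    return {
        q: mean(by_cell[("group_x", q)]) - mean(by_cell[("group_y", q)])
        for q in strata
    }

unadjusted = raw_gap(records)
adjusted = stratified_gaps(records)

print("unadjusted gap:", unadjusted)
print("gap within each photo-quality stratum:", adjusted)
```

Here half of the unadjusted gap disappears once photo quality is held fixed, and half remains. Real data would need many more covariates, and stratification only addresses the confounders that were actually measured.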

How It Was Reported in Media

Popular media coverage of Rudder's findings was, almost uniformly, worse than the underlying analysis. Headlines like "Science Proves White Men Are Most Attractive" (a real headline from a national magazine) stripped away the structural context and presented the hierarchy as a statement about natural attractiveness rather than a snapshot of racialized social behavior.

This is a predictable but serious problem: data that shows the effects of structural racism (a racial hierarchy in desirability) is reported as data about natural human preferences, erasing the structural analysis entirely. The finding that Black women receive the lowest cross-racial ratings becomes "Black women are less attractive" rather than "centuries of anti-Black propaganda, legal exclusion, and media underrepresentation have produced this aggregate behavioral outcome in 2009."

Rudder himself, to his credit, framed the findings as "the least flattering aspect of human nature" and was clear that the patterns were social and mutable. But nuanced framing in a book does not survive summarization into headlines.

Subsequent Research

Subsequent academic work has confirmed, complicated, and extended Rudder's findings.

Cynthia Feliciano and colleagues (2009) found that explicit racial exclusion in self-described dating preferences was widespread: the majority of White men and women on Yahoo! Personals stated racial preferences that excluded Black partners. The exclusion was striking because it was explicit, stated outright by users rather than inferred from their behavior.

Lin and Lundquist (2013) found similar hierarchical patterns on Match.com and documented significant gender asymmetries within racial groups. Asian men were disadvantaged in cross-racial desirability while Asian women were relatively advantaged, which suggests that racial hierarchy interacts with gender in ways that Rudder's aggregate analysis could not fully capture.

Bruch and Newman (2018), using a large dating site dataset, found that most users exhibited aspirational partner selection — messaging users rated higher than themselves on the platform's desirability metric — and that this aspirational pattern itself showed racial structure: the average "desirability gap" in cross-racial messaging followed the documented hierarchy.

Implications

The OkCupid data, and the literature that followed it, does not tell us anything about what people should find attractive. It tells us what people do find attractive in aggregate, in a specific social context, at a specific historical moment — and that what they find attractive closely resembles what centuries of racial propaganda, legal exclusion, and media representation have trained them to find attractive.

That this pattern is documented does not make it natural, inevitable, or beyond scrutiny. It makes it evidence.

Discussion Questions

1. A journalist summarizes the OkCupid data as: "People prefer to date their own race." What is lost in that summary? How would you rewrite it to include the structural analysis?
2. Rudder found that racial preferences in ratings became slightly less extreme from the mid-2000s to 2014. What factors might explain this modest shift? What would you need to see in the data to conclude that the hierarchy is genuinely eroding rather than slightly softening?
3. The OkCupid data has been cited both by critical race scholars (as evidence of structural racism in desire) and by white nationalists (as evidence of natural racial hierarchy). What does the same data's availability to both interpretations tell us about the relationship between data and its social meaning?