Case Study 2: The Audit Study — Names, Race, and the Hiring Market

Chapter 18 — Born Lucky? The Sociology of Structural Advantage

Overview

In 2004, economists Marianne Bertrand and Sendhil Mullainathan published a paper with a stark title: "Are Emily and Greg More Employable Than Lakisha and Jamal? A Field Experiment on Labor Market Discrimination."

The answer the data returned was: yes. By a substantial margin.

This study — known in the social science literature simply as "the audit study" — is one of the most methodologically elegant and empirically disturbing pieces of research on discrimination in the modern hiring market. It is also one of the clearest demonstrations of a specific mechanism through which structural luck operates: the way in which characteristics of birth (race, as signaled by name) systematically alter the probability of opportunity, independent of qualifications.

This case study examines the study's design, findings, implications, and the research that has followed it.

The Research Question

Bertrand and Mullainathan were investigating a fundamental question: Does racial discrimination persist in the U.S. labor market? And if so, how does it operate?

The challenge in studying discrimination is that it is difficult to observe directly. Hiring managers rarely announce their biases. Surveys asking about discriminatory intent are easily contaminated by social desirability bias: people report what they believe they should think, not what actually drives their decisions. And observational data comparing outcomes across racial groups is confounded by countless other variables: education, experience, geography, industry, and so on.

The audit study methodology cuts through these complications using a simple and powerful design.

The Methodology

Resume Construction

The researchers constructed resumes representing a range of credential levels — high-quality and low-quality, measured by educational attainment, experience, and signals of competence. They were careful to construct resumes that were realistic, well-formatted, and contained plausible work histories.

Each resume fell clearly into either the high-quality or the low-quality category, allowing the researchers to test not just whether discrimination occurred, but whether it differed across quality levels.

Random Name Assignment

The critical manipulation was the name on the resume. The researchers drew on birth certificate data from Massachusetts to identify first names that were disproportionately given to Black versus white children. They selected names that research confirmed would be strongly racially coded by readers:

Black-associated names used: Lakisha, Jamal, Tamika, Aisha, Rasheed, Tremayne, Ebony, Leroy, Kareem, Darnell

White-associated names used: Emily, Greg, Allison, Brad, Anne, Jay, Kristen, Matthew, Laurie, Todd

These names were not chosen arbitrarily — they were selected based on data showing high racial distinctiveness. Readers receiving a resume with "Jamal" would be very likely to read the candidate as Black; readers seeing "Greg" would be very likely to read the candidate as white.

Names were randomly assigned to resumes. The same resume content might appear with a Black-sounding name in one application and a white-sounding name in another application to a different employer. This random assignment is what makes the design powerful: it isolates the effect of the name from everything else on the resume.
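The logic of random assignment can be sketched in a few lines of Python. This is only an illustration of the idea, not the researchers' actual procedure: the function name `assign_names`, the resume identifiers, the shortened name pools, and the balanced half-and-half split are all assumptions made for the example.

```python
import random

# Shortened name pools drawn from the lists quoted above (illustrative only).
BLACK_ASSOCIATED = ["Lakisha", "Jamal", "Tamika", "Aisha", "Rasheed"]
WHITE_ASSOCIATED = ["Emily", "Greg", "Allison", "Brad", "Anne"]


def assign_names(resume_ids, seed=0):
    """Randomly assign a name group, then a name, to each resume.

    ASSUMPTION: a balanced design, with half the resumes in each group.
    Because the shuffle ignores resume content, the name group is
    independent of qualifications by construction.
    """
    rng = random.Random(seed)
    n = len(resume_ids)
    groups = ["black"] * (n // 2) + ["white"] * (n - n // 2)
    rng.shuffle(groups)
    assignment = {}
    for resume_id, group in zip(resume_ids, groups):
        pool = BLACK_ASSOCIATED if group == "black" else WHITE_ASSOCIATED
        assignment[resume_id] = (group, rng.choice(pool))
    return assignment


# Hypothetical resume identifiers, for demonstration.
assignment = assign_names([f"resume_{i}" for i in range(8)], seed=42)
for resume_id, (group, name) in assignment.items():
    print(resume_id, group, name)
```

Since nothing about a resume influences which group it lands in, any systematic difference in callbacks between the two groups can be attributed to the name rather than to the content.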

The Applications

The researchers sent 4,870 resumes in response to 1,300 help-wanted ads in Boston and Chicago between July 2001 and January 2002. Industries included sales, administrative support, clerical positions, and management — a broad range of entry- and mid-level positions.

The callback rate — whether the resume received a response inviting the applicant to interview — was recorded as the outcome measure.

The Findings

Overall Callback Gap

The headline finding was clear: resumes with white-sounding names received 50% more callbacks than identical resumes with Black-sounding names.

Specifically:

- White-sounding names: 9.65% callback rate
- Black-sounding names: 6.45% callback rate
- Difference: approximately 3.2 percentage points, or about 50% more callbacks for white-sounding names

To make this concrete: for every 100 applications sent with a Black-sounding name, about 6 to 7 callbacks would be expected. For the same 100 applications with a white-sounding name, approximately 10 callbacks would be expected.
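The difference between a percentage-point gap and a relative gap is easy to blur, so a short calculation may help. The sketch below uses only the two callback rates quoted above. The function name `gap_summary` and its dictionary keys are made up for this illustration.

```python
WHITE_RATE = 0.0965  # callback rate for white-sounding names (from the text)
BLACK_RATE = 0.0645  # callback rate for Black-sounding names (from the text)


def gap_summary(white_rate, black_rate):
    """Return absolute and relative measures of a callback gap."""
    return {
        # Absolute gap, in percentage points.
        "point_gap": (white_rate - black_rate) * 100,
        # Relative gap: percent more callbacks for the white-sounding group.
        "relative_gap": (white_rate / black_rate - 1) * 100,
        # Expected callbacks per 100 applications sent.
        "white_per_100": white_rate * 100,
        "black_per_100": black_rate * 100,
        # Expected number of applications needed to get one callback.
        "white_apps_per_callback": 1 / white_rate,
        "black_apps_per_callback": 1 / black_rate,
    }


summary = gap_summary(WHITE_RATE, BLACK_RATE)
for key, value in summary.items():
    print(f"{key}: {value:.2f}")
```

Read this way, a gap of roughly 3 percentage points corresponds to needing about 10 applications per callback with a white-sounding name versus about 15 or 16 with a Black-sounding name.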

The Quality Differential

One of the most disturbing additional findings concerned the return on quality — the extra callbacks received for having a higher-quality resume.

For white-sounding names, having a high-quality resume (more experience, better credentials) versus a low-quality resume increased callbacks by approximately 30%. For Black-sounding names, the quality premium was substantially smaller.

In other words: improving your resume as a Black-sounding candidate produced less payoff than the same improvement for a white-sounding candidate. Investment in human capital — in making yourself a stronger candidate on paper — had a lower return depending on the racial coding of the name at the top of the page.

This is a particularly sharp illustration of how structural luck shapes the "return on investment" for individual effort.

Industry and Size Variation

The researchers also found variation across contexts:

- Federal contractors (who must comply with affirmative action requirements) showed smaller gaps but did not eliminate them
- Larger employers showed no smaller gap than smaller ones, contrary to a hypothesis that more formalized hiring would reduce bias
- The gap existed across all industries represented in the sample

What the Study Is and Is Not

What It Is

The audit study is a field experiment — an experiment conducted in real-world conditions rather than a laboratory. This is its greatest strength: it measured actual hiring behavior in real job markets, not hypothetical responses in survey conditions.

The random assignment of names eliminates the most common confound in discrimination research: the possibility that observed differences in outcomes reflect genuine differences in qualifications. In this study, the qualifications were literally identical — only the name differed.

The study is therefore strong evidence for discrimination at the callback stage — the stage between submitting an application and being invited to interview.
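One way to gauge how unlikely a gap of this size would be under pure chance is a two-proportion z-test. The sketch below is a rough check, not the authors' own analysis. It assumes the 4,870 resumes were split evenly between the two name groups, and it treats every resume as an independent observation even though several resumes went to each ad. The function name `two_proportion_z` is hypothetical.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """Pooled two-proportion z statistic and two-sided p-value."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# ASSUMPTION: an even split of the 4,870 resumes across the two name groups.
N_PER_GROUP = 4870 // 2

z, p_value = two_proportion_z(0.0965, N_PER_GROUP, 0.0645, N_PER_GROUP)
print(f"z = {z:.2f}, two-sided p = {p_value:.6f}")
```

Under these assumptions the z statistic comes out well above conventional significance thresholds. A more careful analysis would account for resumes being clustered within ads.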

What It Is Not

The study does not measure discrimination at every stage. It measures only callbacks, not interview performance, hiring decisions, salary offers, or long-term career outcomes. Discrimination may be larger or smaller at other stages.

The study also does not establish the mechanism. The gap in callbacks is consistent with multiple explanations: conscious racism (some employers actively do not want to hire Black employees), unconscious implicit bias (some employers are surprised by the association between the name and a strong resume, or make quick stereotyped assessments), and statistical discrimination (some employers use race as a proxy for unobserved variables they believe correlate with it, correctly or incorrectly).

Distinguishing between these mechanisms requires additional research. What the audit study establishes firmly is that the gap exists — that the name at the top of a resume systematically alters the probability of a callback for otherwise equivalent candidates.

Subsequent Research and Replication

The Bertrand and Mullainathan study was not the first of its kind — audit studies had been conducted since the 1970s — but it was by far the largest and most methodologically sophisticated at the time of publication, and it became the benchmark.

Subsequent research has generally confirmed and extended the findings:

Replication studies in multiple countries (Germany, France, Sweden, Australia) have found analogous gaps based on names or photographs associated with immigrant or minority groups.

Evolution of the design: More recent audit studies have used email correspondence, LinkedIn profiles, and even video applications to test discrimination at different stages and through different cues.

Intersectionality: Research has examined how gender interacts with race in name-based discrimination, finding that Black women face a compound effect in some industries and a different pattern in others.

The gender-name interaction in tech: Studies of tech hiring found that resumes with male names (controlling for race) received more callbacks in technical roles, and that the interaction between race and gender produced compound effects.

Structural Luck as the Operating Mechanism

The audit study is a precise measurement of one specific form of structural luck: the luck of the name you were given. That name reflects the racial community your parents identified with or were assigned to, and it is not something you chose.

Lakisha did not choose her name. She did not choose to be born into a racially stratified society. She did not choose to have her name read — by some hiring managers, unconsciously — as a signal that activates stereotypes and reduces the probability of a callback.

She worked just as hard as Emily. Her resume was equally strong. The difference in their callback rates cannot be attributed to anything Lakisha did or failed to do. It is structural. It precedes her actions. It operates before anyone evaluates her actual qualifications.

This is constitutive luck — the luck of being who you are in the society you're in — expressed with economic precision.

The Compound Effect: When Quality Doesn't Pay Off Equally

Recall the chapter's discussion of intersectionality and compound structural luck. The audit study provides a sharp illustration.

For a white-sounding candidate, improving resume quality yields a substantial payoff. For a Black-sounding candidate, the payoff is smaller. This means:

The structural luck of racial coding diminishes the return on individual effort
Working harder, building a better resume, investing more in your own qualifications — all of these have a lower expected return if your name codes as Black
This is not because the effort is less real, but because the structural filter through which the effort passes is not neutral

This is the compound form of structural disadvantage: it doesn't just create a lower starting point; it also reduces the rate at which individual investment generates returns. The gap doesn't just exist at the bottom — it widens as candidates try to improve their position.

Implications for Priya's Situation

Priya's situation is not the audit study scenario exactly. The audit study isolates racial name discrimination in the callback stage. Priya's challenge includes racial dynamics but also class-mediated access and network effects.

But the audit study is directly relevant to her understanding of why the hiring market does not function as pure meritocracy. Even when formal qualifications are equivalent — same degree, same major, similar experience — structural factors alter the distribution of outcomes in ways that are:

Large in magnitude (50% callback gap)
Systematic (occurring across industries and geographic markets)
Independent of individual effort (not closed by working harder or improving the resume)
Concentrated at early stages (the callback) before individual qualifications are even evaluated in conversation

Understanding this is not an invitation to give up. It is an invitation to strategize correctly. If the formal application process carries a structural headwind, then strategies that route around it — networking, referrals, direct conversations that make the person real before the resume lands — become more valuable, not less.

This is exactly the insight that drives Chapters 19 and 20.

Key Terms

Field experiment: An experiment conducted in real-world conditions rather than a laboratory, using actual participants (here, hiring managers) who do not know they are subjects.
Audit study: A field experiment in which researchers present equivalent candidates differing on one variable to measure discrimination.
Callback rate: The proportion of job applications that receive a response inviting the applicant to interview.
Statistical discrimination: Using group membership as a proxy for unmeasured individual characteristics, producing discrimination as a byproduct even without animus.
Return on quality: The additional benefit (in callbacks, salary, etc.) received for having higher qualifications — the study found this was lower for Black-sounding names.

Discussion Questions

The study finds that improving resume quality has a lower payoff for Black-sounding names. How does this affect the practical advice you would give someone with a Black-sounding name who is job searching? What strategies might compensate for this lower return?
The study cannot distinguish between conscious racism, implicit bias, and statistical discrimination as mechanisms for the gap. Does the mechanism matter morally? Does it matter practically (for what to do about it)?
The audit study measured only callbacks. What additional stages of the hiring process might you want to audit? How would you design those studies?
Some people argue that Black families should give their children "race-neutral" names to avoid name-based discrimination. Evaluate this argument from the perspectives of (a) individual strategic rationality, (b) collective action and social change, and (c) cultural identity and dignity.
The researchers found that federal contractors (who are required to have affirmative action plans) showed a smaller but not eliminated gap. What does this suggest about the effectiveness of formal equal opportunity requirements versus the informal mechanisms through which discrimination operates?