Case Study: Content Moderation at Scale — The Human Cost
"They said we'd be protecting people. But nobody protected us." — Former Facebook content moderator, quoted in The Verge, 2019
Overview
Every day, billions of pieces of content are posted to social media platforms: text, images, videos, live streams, stories, and comments. A fraction of this content — but a fraction measured in millions of posts per day — is violent, sexually exploitative, hateful, fraudulent, or otherwise harmful. Removing this content is the task of content moderation.
This case study examines what happens behind the screen — who actually performs the work of content moderation, under what conditions, and at what personal cost. It investigates the industry of outsourced moderation labor, the documented psychological harms to workers, the structural inequalities embedded in the moderation supply chain, and the fundamental tension between the scale of content creation and the limits of both human and algorithmic review.
Skills Applied:
- Analyzing labor conditions and power dynamics in the platform economy
- Connecting algorithmic systems to the human labor that supports them
- Evaluating the adequacy of current governance approaches to content moderation
- Applying the Power Asymmetry framework to global supply chains
The Scale of the Problem
What Needs Moderation
Content moderation is not limited to removing obvious spam. Moderators encounter:
- Child sexual abuse material (CSAM): The Internet Watch Foundation identified over 250,000 URLs containing CSAM in 2022 alone. Moderators must view this material to identify and remove it.
- Graphic violence: Videos of assaults, murders, torture, animal abuse, and war crimes. The Christchurch mosque shooting in 2019 was live-streamed on Facebook; in the first 24 hours, Facebook removed or blocked roughly 1.5 million attempted re-uploads of the video.
- Hate speech: Content targeting individuals or groups based on race, religion, gender, sexual orientation, disability, or other protected characteristics. Distinguishing hate speech from political commentary, satire, or legitimate critique requires cultural and contextual knowledge.
- Misinformation: False claims about health, elections, climate change, and public safety. Determining what is "false" requires judgment calls that may differ across political and cultural contexts.
- Self-harm and suicide content: Posts that depict or encourage self-harm. Moderators must make rapid decisions about content that may be distressing — while recognizing that some such content is a cry for help rather than a glorification.
- Terrorism and extremism: Recruitment material, propaganda, and incitement. Distinguishing extremist content from journalistic coverage, historical documentation, or academic analysis requires nuanced judgment.
The Numbers
Facebook alone receives approximately 350 million photos and 100 million videos per day. YouTube users upload more than 500 hours of video every minute. Twitter/X processes approximately 500 million tweets per day. TikTok's volume is comparable.
No technology company has enough employees to review all of this content. Instead, platforms use a tiered approach:
- Automated detection: AI systems (hash-matching for known CSAM, computer vision for violence, NLP for hate speech) flag content for review or automatically remove it.
- User reporting: Users flag content they believe violates community standards.
- Human review: Content moderators — the human beings at the center of this case study — review flagged content and make final decisions.
The automated systems handle the highest volume but are the least accurate. The human reviewers handle the most difficult cases but bear the greatest personal cost.
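To make the tiered structure just described concrete, the following sketch shows, in deliberately simplified and hypothetical form, how a triage pipeline might route a single piece of content. The hash set, classifier score, and thresholds are illustrative assumptions, not any platform's actual system; production pipelines rely on perceptual hashing (PhotoDNA-style matching) and large machine-learning models rather than the exact-hash lookup and fixed cutoffs used here.

```python
import hashlib
from dataclasses import dataclass

# Placeholder for a database of hashes of known prohibited material.
# In practice this would be a perceptual-hash database shared across the
# industry, not a set of exact SHA-256 digests.
KNOWN_PROHIBITED_HASHES: set[str] = set()


@dataclass
class ModerationDecision:
    action: str   # "remove", "human_review", or "allow"
    reason: str


def triage(content: bytes, classifier_score: float) -> ModerationDecision:
    """Route one piece of content through a simplified three-tier pipeline.

    classifier_score is an assumed probability (0 to 1) from an upstream
    model that the content violates policy; the thresholds are arbitrary.
    """
    # Tier 1: match against known prohibited material -> automatic removal.
    digest = hashlib.sha256(content).hexdigest()
    if digest in KNOWN_PROHIBITED_HASHES:
        return ModerationDecision("remove", "matched known prohibited hash")

    # Tier 2: high-confidence model prediction -> automatic removal;
    # clearly benign content is allowed without further review.
    if classifier_score >= 0.95:
        return ModerationDecision("remove", "high-confidence classifier score")
    if classifier_score < 0.50:
        return ModerationDecision("allow", "low-confidence classifier score")

    # Tier 3: everything ambiguous goes to the human review queue, which is
    # where the contextual judgment described in this case study happens.
    return ModerationDecision("human_review", "ambiguous; needs human judgment")
```

Even in this toy form, the structural point is visible: the automated tiers dispose of the clear-cut cases, and whatever remains ambiguous, which is often the most disturbing and context-dependent material, is exactly the portion that lands in front of a human moderator.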
The Workers: Who Moderates the Internet?
The Outsourcing Model
The majority of content moderation for major platforms is performed not by employees of Facebook, YouTube, or TikTok, but by contract workers employed by outsourcing companies. The largest include:
- Accenture (which expanded its Facebook moderation contracts after Cognizant exited the content moderation business in 2020)
- Teleperformance — a French company operating in 91 countries
- Majorel (now part of Teleperformance)
- Samasource (now Sama) — operating in Kenya and Uganda
These companies are hired by platforms to perform moderation at a fraction of the cost of employing moderators directly. Contracts are structured around volume and speed: moderators are typically expected to review hundreds of pieces of content per shift, making a decision every 30 seconds to 2 minutes, or roughly 240 to 960 decisions over an eight-hour shift.
Geographic Distribution
Content moderation centers are located predominantly in lower-income countries and regions:
- The Philippines: Manila has been described as the "content moderation capital of the world." Filipino moderators review content primarily in English, leveraging the country's high English proficiency and lower labor costs. Wages range from $1.50 to $4.00 per hour.
- India: Major moderation centers operate in Hyderabad, Bangalore, and New Delhi, handling content in English, Hindi, and other languages.
- Kenya and East Africa: Sama and other companies employ moderators in Nairobi to review content for Facebook and to label training data for AI safety systems, including OpenAI's ChatGPT. A 2023 Time investigation revealed that Kenyan workers training ChatGPT's safety systems were paid less than $2 per hour while being exposed to graphic text describing violence and abuse.
- The United States: Some moderation is performed domestically, primarily in cities like Tampa, Phoenix, and Austin. U.S.-based moderators are paid more ($12-$18/hour) but still significantly less than the platform employees whose products they support.
The geographic pattern reveals a clear Power Asymmetry: platforms are headquartered in Silicon Valley; the psychological labor of cleaning up those platforms is performed in the Global South.
Demographics and Conditions
Content moderators are disproportionately young, economically vulnerable, and from marginalized communities. Investigative reporting and academic research have documented:
- Wages: Significantly below what platform employees earn. A Facebook software engineer earns a median salary of approximately $200,000/year; a content moderator reviewing the most disturbing content on the same platform earns $25,000-$35,000/year (in the U.S.) or far less internationally.
- Employment status: Most moderators are contract workers, not employees of the platform. They receive fewer benefits, less job security, and limited access to mental health support.
- Non-disclosure agreements: Moderators are required to sign NDAs that prevent them from discussing their work — including the psychological harms they experience — publicly. This silence reinforces the invisibility of the labor.
- Performance metrics: Moderators are evaluated on speed and accuracy. The pressure to make rapid decisions — one every 30 seconds in some centers — leaves almost no time for the contextual judgment that complex moderation decisions require.
The Psychological Cost
Documented Harms
The psychological impact of content moderation has been documented extensively:
Post-Traumatic Stress Disorder (PTSD). A 2020 study by researchers at the University of Oxford found that content moderators exhibited PTSD symptoms at rates comparable to combat veterans and emergency first responders. Moderators reported flashbacks, nightmares, hypervigilance, and intrusive thoughts triggered by the violent and abusive content they reviewed daily.
Secondary Traumatic Stress. Unlike emergency responders, whose exposure to traumatic events is typically episodic, content moderators face continuous exposure: eight hours per day, five days per week, for the duration of their employment. The cumulative effect of this repeated exposure over months and years is particularly severe.
Behavioral and cognitive changes. Former moderators have reported:
- Developing extreme distrust of other people, particularly around children (after reviewing CSAM)
- Compulsive checking of their children's devices and online activity
- Inability to watch news or entertainment media depicting violence
- Adoption of conspiracy theories they had been exposed to repeatedly during moderation
- Substance abuse as a coping mechanism
- Relationship breakdown and social withdrawal
The Cognizant whistleblowers. In 2019, The Verge published a landmark investigation into content moderation conditions at a Cognizant facility in Tampa, Florida, that processed Facebook content. The reporting revealed:
- A moderator who developed PTSD and was prescribed medication within months of starting
- Moderators self-medicating with drugs and alcohol
- One moderator who died of a heart attack at his desk; colleagues suspected the stress contributed
- Workers who began to believe conspiracy theories they had reviewed repeatedly
- A "wellness room" that consisted of a small closet with a beanbag chair
- A counselor available to the entire facility — for one hour per week
The Invisibility Problem
Perhaps the most striking aspect of the content moderation workforce is its invisibility. Users experience clean, curated feeds without awareness that their experience depends on thousands of workers viewing the most disturbing material the internet produces. The labor is deliberately hidden — by platforms that do not publicize it, by NDAs that silence workers, and by an outsourcing structure that creates distance between the platform's brand and the conditions under which moderation is performed.
This invisibility is structurally similar to other forms of hidden labor: the garment workers whose conditions are obscured by long supply chains, the delivery drivers whose precarity is hidden by the app's seamless interface, the data labelers in Kenya and India whose work makes AI systems function but who receive neither credit nor adequate compensation.
Governance and Legal Landscape
Platform Self-Regulation
Content moderation standards are currently set by platforms themselves through "community guidelines" — terms of service that define what content is permitted and what is prohibited. These guidelines differ across platforms, change frequently, and are applied by moderators with varying levels of training and cultural context.
There is no external standard for content moderation accuracy, no independent audit process, and no regulatory requirement to disclose how moderation decisions are made, how many moderators are employed, or what conditions they work under.
Legal Protections (and Their Absence)
In the United States, Section 230 of the Communications Decency Act protects platforms from liability for user-generated content, while also protecting their right to moderate content. This legal framework creates an incentive to moderate (to remove harmful content that could damage the platform's reputation) but no legal requirement to moderate well or to protect the workers who perform the moderation.
Labor protections for content moderators are minimal in most jurisdictions. In the U.S., contract workers are excluded from many employment protections available to direct employees. In the Philippines and Kenya, workers may have even fewer protections.
The Facebook settlement (2020). Facebook agreed to pay $52 million to settle a class-action lawsuit brought by current and former content moderators who developed PTSD and other conditions. The settlement provided up to $50,000 per claimant for moderators diagnosed with PTSD. While significant, the settlement represented less than 0.1% of Facebook's annual revenue (roughly $86 billion in 2020).
The EU's Digital Services Act (2022)
The European Union's Digital Services Act (DSA), which took full effect in 2024, represents the most significant regulatory intervention in content moderation to date. Among its provisions:
- Large platforms must provide transparency reports on content moderation activities
- Users have the right to appeal moderation decisions
- Platforms must assess and mitigate "systemic risks" including the spread of illegal content and disinformation
- Platforms must provide information to researchers studying systemic risks
The DSA does not, however, directly regulate the working conditions of content moderators or require platforms to disclose details about their moderation workforce.
Stakeholder Analysis
The Users
Users benefit from content moderation — their feeds are cleaner, their children are (somewhat) protected, and the most extreme content is (mostly) removed. But users bear costs too: legitimate speech is sometimes removed, cultural expression is sometimes misinterpreted, and the system creates a false sense of safety that may reduce personal vigilance.
The Platforms
Platforms benefit from content moderation because a cleaner platform retains users and satisfies advertisers. But moderation is an expense, not a revenue generator. Platforms have an incentive to moderate just enough to maintain advertiser confidence and regulatory compliance — and no more.
The Moderators
Moderators bear the highest cost of all stakeholders. They perform psychologically damaging labor at low wages, under strict performance metrics, with minimal mental health support, and are silenced by NDAs. Their work is essential to the functioning of the platform economy, but they are the most disposable participants in it.
Advertisers
Advertisers want their products associated with safe, appealing content. They exert pressure on platforms to moderate — but this pressure is about brand safety (removing content near ads), not about worker welfare. Advertiser boycotts have been among the most effective levers for improving moderation, but the improvements they drive are oriented toward advertiser interests, not moderator wellbeing.
Discussion Questions
- The labor question. Content moderation has been described as "the most important job in Silicon Valley" and "the worst job in technology." Why is work that is this essential compensated so poorly and protected so weakly? What structural forces keep wages low and protections minimal? Connect your analysis to the Power Asymmetry.
- The automation paradox. Platforms invest heavily in automated content detection to reduce reliance on human moderators. But automated systems create their own problems: false positives (removing legitimate content), false negatives (missing harmful content), and inability to assess context, satire, or cultural nuance. Is full automation of content moderation desirable? What would be lost?
- The visibility question. Should platforms be required to publicly disclose the number of content moderators they employ, where they are located, what they are paid, and what mental health support is provided? What arguments favor transparency? What arguments might platforms make against it?
- The ethical framework question. Apply at least two ethical frameworks from Chapter 6 (e.g., utilitarianism, deontology, virtue ethics, care ethics) to the content moderation labor question. Does utilitarianism justify the current arrangement (the suffering of thousands of moderators enables a safer experience for billions of users)? What would a deontological or care ethics analysis conclude?
Your Turn: Mini-Project
Option A: Policy Proposal. Draft a one-page policy proposal — addressed to a national legislature or a platform's board of directors — establishing minimum standards for content moderation labor. Your policy should address: (1) maximum daily exposure time to graphic content, (2) required mental health support, (3) minimum compensation relative to platform employees, (4) limitations on NDAs, and (5) independent oversight of working conditions.
Option B: Supply Chain Mapping. Research the content moderation supply chain for one major platform (Facebook/Meta, YouTube/Google, TikTok, or Twitter/X). Map the chain from the platform to the outsourcing company to the moderation center. Identify: Where are the moderators located? Who employs them? What is known about their working conditions? What is unknown? Write a two-page analysis framing this as a labor supply chain analogous to those studied in garment manufacturing or electronics production.
Option C: Moderator Wellbeing Audit. Design a framework for auditing the wellbeing of content moderators. Specify: What metrics would you measure (PTSD rates, turnover, satisfaction, access to counseling)? How often would audits occur? Who would conduct them (internal teams, independent auditors, regulators)? What would trigger a platform to modify its practices? Write a two-page framework document.
References
- Newton, Casey. "The Trauma Floor: The Secret Lives of Facebook Moderators in America." The Verge, February 25, 2019.
- Newton, Casey. "Facebook Will Pay $52 Million in Settlement with Moderators Who Developed PTSD." The Verge, May 12, 2020.
- Perrigo, Billy. "Exclusive: OpenAI Used Kenyan Workers on Less Than $2 Per Hour to Make ChatGPT Less Toxic." Time, January 18, 2023.
- Roberts, Sarah T. Behind the Screen: Content Moderation in the Shadows of Social Media. New Haven: Yale University Press, 2019.
- Steiger, Miriah, Timir J. Bharucha, Sukrit Venkatagiri, Martin J. Riedl, and Matthew Lease. "The Psychological Well-Being of Content Moderators." Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. ACM, 2021.
- Gillespie, Tarleton. Custodians of the Internet: Platforms, Content Moderation, and the Hidden Decisions That Shape Social Media. New Haven: Yale University Press, 2018.
- European Commission. "Digital Services Act: Regulation (EU) 2022/2065." Official Journal of the European Union, 2022.
- Internet Watch Foundation. "Annual Report 2022: Trends in Online Child Sexual Abuse Material." Cambridge: IWF, 2023.
- Gray, Mary L., and Siddharth Suri. Ghost Work: How to Stop Silicon Valley from Building a New Global Underclass. Boston: Houghton Mifflin Harcourt, 2019.