In This Chapter
- Learning Objectives
- Introduction
- Section 18.1: The Synthetic Media Landscape
- Section 18.2: The Technology of Deepfakes
- Section 18.3: A Brief History of Image and Video Manipulation
- Section 18.4: Use Cases and Harms
- Section 18.5: Political Deepfakes
- Section 18.6: The Liar's Dividend
- Section 18.7: Detection Methods
- Section 18.8: Audio Deepfakes and Voice Cloning
- Section 18.9: Regulatory and Platform Responses
- Section 18.10: Future Trajectories
- Key Terms
- Discussion Questions
- Summary
Chapter 18: Deepfakes, Synthetic Media, and Emerging Threats
Learning Objectives
By the end of this chapter, students will be able to:
- Define and distinguish the full spectrum of synthetic media — from cheap fakes through shallow fakes to deepfakes — and apply these distinctions to real-world examples.
- Explain the core technical mechanisms underlying deepfake generation, including Generative Adversarial Networks (GANs) and diffusion models, at a conceptually rigorous level accessible without advanced mathematics.
- Place the deepfake transition point (~2017) within a longer history of media manipulation, from pre-digital darkroom work to the Photoshop era.
- Analyze the specific harms caused by synthetic media across domains including non-consensual intimate imagery, political manipulation, fraud, harassment, and reputation destruction.
- Evaluate documented cases of political deepfakes for both actual and potential harms, distinguishing verified incidents from speculative scenarios.
- Apply and explain Chesney and Citron's "Liar's Dividend" concept and assess its implications for audiovisual evidence and political discourse.
- Describe the major forensic and technical approaches to deepfake detection, including artifact analysis, GAN fingerprinting, and content provenance systems.
- Assess the regulatory and platform responses to synthetic media, including state laws criminalizing non-consensual intimate imagery and the C2PA content provenance standard.
- Reason about the future trajectory of synthetic media technology and its epistemological implications for public trust in audiovisual information.
Introduction
On March 16, 2022, a video appeared on Ukrainian social media platforms and Russian news channels showing a man who appeared to be Ukrainian President Volodymyr Zelensky announcing that he was ordering Ukrainian forces to lay down their arms and surrender to Russian forces. Zelensky, who had been consistently rallying Ukrainian resistance to the Russian invasion, looked directly into the camera and spoke in Ukrainian. The video spread rapidly. Within hours, Zelensky himself appeared on video — real this time — to debunk it. The synthetic video was crude by professional standards, with a slightly misshapen head and flat, robotic audio. But it circulated widely before being definitively identified as a deepfake.
This episode, which we examine in detail later in this chapter, captures both the promise and the reality of synthetic media as a tool for disinformation. The technology was not sophisticated enough to fool careful viewers, but it spread far enough to require an active public debunking by a sitting head of state — which itself became a story. The episode also illustrated a dynamic that researchers Robert Chesney and Danielle Citron had predicted in their 2019 article on deepfakes: that the mere existence of deepfake technology would allow people to dismiss authentic videos as fake — the "Liar's Dividend."
We are in the early stages of a technological transformation in the credibility of audiovisual media. For most of recorded history, photographs, videos, and audio recordings carried an implicit claim to authenticity — they showed what the camera saw and what the microphone captured. That implicit claim is dissolving. The same artificial intelligence systems that can generate photorealistic images of people who have never existed, that can transfer one person's facial expressions onto another's body in real time, and that can clone a person's voice from three seconds of audio, are now consumer technologies available in free mobile applications. We are moving from a world where seeing something was a reasonable basis for believing it to a world where the relationship between seeing and believing must be substantially renegotiated.
This chapter maps that transition: the technology that makes it possible, the harms it enables, the policy and technical responses that have emerged, and the broader epistemological implications for how we know what we know in a world saturated with synthetic media.
Section 18.1: The Synthetic Media Landscape
Defining Synthetic Media
"Synthetic media" is an umbrella term for any media content produced or substantially altered using computational methods, particularly artificial intelligence, such that the result depicts or sounds like people, places, or events that were not actually recorded by the originating device. The term encompasses a wide spectrum of technologies and manipulations, not all of which are problematic.
It is important to distinguish synthetic media that deceives from synthetic media that does not. AI-generated art, voice assistants, virtual influencers disclosed as synthetic, and CGI in film are all forms of synthetic media — but they do not represent epistemic threats because they are understood to be synthetic. The threat arises when synthetic media is presented as authentic, or when its synthetic nature is not disclosed in contexts where authenticity matters.
The Spectrum from Cheap Fakes to Deepfakes
Researchers Britt Paris, Joan Donovan, and others have described a spectrum of audiovisual manipulation ranging from simple, low-technology manipulations to sophisticated AI-generated content. Understanding this spectrum is important because different points on it require different detection approaches and pose different levels of threat.
Cheap fakes (also called "shallow fakes" in some taxonomies, though usage varies) are manipulations produced with readily available consumer tools, without AI involvement. They include:
- Speed manipulation: Slowing down or speeding up video to make a subject appear impaired or agitated.
- Selective editing: Removing context from video clips to misrepresent what was said or done.
- Recontextualization: Presenting authentic footage from one context (a different location, time, or event) as depicting something else.
- Dubbed audio: Replacing or overlaying audio to change what appears to be said.
Cheap fakes are technically simple and are often more consequential than deepfakes, precisely because they require less technical sophistication and can be produced and deployed faster. The slowed-down video of House Speaker Nancy Pelosi that circulated in 2019, making her appear intoxicated, was a cheap fake — simply playing back a genuine video at reduced speed. It spread to millions of viewers and required active debunking by Facebook and other platforms.
Shallowfakes (usage varies by researcher — some use this term for the cheap fake category and reserve "deepfake" for AI-generated content) include basic AI-assisted manipulations that do not involve the full face-swap or generation pipeline:
- AI-assisted colorization or restoration of historical footage
- Background replacement in video calls
- Automated lip-sync modifications
- Basic voice alteration
Deepfakes in the strict sense use deep learning (specifically neural network architectures including GANs and diffusion models) to generate or substantially replace facial appearance, voice, or body in video or audio content. The defining characteristic is that the result is not an editing artifact but a genuine AI generation — the pixels depicting the "face" were generated by a neural network trained on footage of the target person, not captured by a camera.
The most technically sophisticated current synthetic media systems can generate:
- Photo-realistic static images of people who have never existed (systems like StyleGAN, Stable Diffusion)
- Videos with real-time face-swapping that can be applied to live video streams
- Voice cloning from very short audio samples (a few seconds to a few minutes)
- Talking head videos generated from a single photograph and any audio input
- Full body motion transfer — applying one person's body movements to another
Callout Box: Terminology Note
The terms "deepfake," "synthetic media," "cheap fake," and "shallowfake" are used inconsistently across researchers, journalists, and policymakers. For this chapter, we use "deepfake" to mean AI-generated or AI-substantially-modified audiovisual content that depicts real people, "cheap fake" to mean low-technology manipulations (speed, editing, recontextualization), and "synthetic media" as the umbrella term. When reading other sources, pay attention to how these terms are defined in context.
Section 18.2: The Technology of Deepfakes
Generative Adversarial Networks (GANs)
The technology that enabled the modern deepfake era is the Generative Adversarial Network (GAN), introduced by Ian Goodfellow and colleagues in 2014. Understanding GANs at a conceptual level is essential for understanding what deepfakes are and why they have certain characteristic limitations.
A GAN consists of two neural networks trained simultaneously in an adversarial relationship:
The Generator takes random noise as input and produces synthetic output — an image, audio clip, or video frame. Initially, the generator produces nonsense. Its goal is to produce output that the discriminator cannot distinguish from real data.
The Discriminator is trained on real data and on the generator's output. Its goal is to correctly classify inputs as "real" or "generated." Initially, the discriminator can easily distinguish real from generated, but as training proceeds, the generator improves.
The "adversarial" part: the generator is trained to maximize the discriminator's error rate; the discriminator is trained to minimize it. They compete against each other. As training continues, the generator produces increasingly realistic output, and the discriminator becomes increasingly sophisticated — until (ideally) the generator produces output the discriminator cannot reliably classify.
For face synthesis, GANs are trained on large datasets of human faces. The generator learns the statistical regularities of faces — the spatial relationships between features, the distribution of skin tones, the typical illumination patterns — and can generate novel faces that match these regularities. Progressive GAN training (StyleGAN, StyleGAN2, StyleGAN3) allows generation at increasingly high resolution.
For face-swapping specifically, the GAN architecture is modified to learn an encoding of a specific person's face — the identity encoder captures what makes the target person look like themselves — and to apply that identity to any new video frame. The result is a video in which the source body performs any action, but the face appears to be the target person.
Diffusion Models
Since approximately 2022, diffusion models have largely superseded GANs for many image generation tasks, though both remain active areas of development. DALL-E, Stable Diffusion, Midjourney, and similar systems use diffusion rather than adversarial training.
A diffusion model is trained by learning to reverse a noise-addition process. During training, the model sees images with progressively more noise added, and learns to predict what noise was added at each step. To generate a new image, the model starts with pure noise and iteratively removes noise — each step guided by a text description or other conditioning signal — until a coherent image emerges.
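The forward (noising) process has a convenient closed form, which the sketch below demonstrates (our own illustration on scalar data; the linear beta schedule values are typical but arbitrary, and the "noise predictor" here is an oracle rather than a trained network):

```python
import numpy as np

rng = np.random.default_rng(0)

T = 1000
betas = np.linspace(1e-4, 0.02, T)     # linear noise schedule (illustrative values)
alpha_bar = np.cumprod(1.0 - betas)    # cumulative signal retention, "alpha-bar t"

def q_sample(x0, t, eps):
    """Forward process in closed form: x_t = sqrt(abar_t)*x0 + sqrt(1-abar_t)*eps."""
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1 - alpha_bar[t]) * eps

def predict_x0(xt, t, eps_hat):
    """Invert the forward step given a noise estimate (here: a perfect oracle)."""
    return (xt - np.sqrt(1 - alpha_bar[t]) * eps_hat) / np.sqrt(alpha_bar[t])

x0 = 2.0
eps = rng.normal(size=100_000)
x_T = q_sample(x0, T - 1, eps)          # by t=T the signal is essentially pure noise
x0_hat = predict_x0(x_T, T - 1, eps)    # a perfect noise estimate recovers x0 exactly
```

A real diffusion model replaces the oracle with a neural network trained to predict the added noise, and generation iterates many small denoising steps from pure noise rather than jumping back in one shot.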
Diffusion models have significant advantages over GANs for general image generation: they are more stable to train, produce higher quality and more diverse outputs, and are more controllable through text conditioning. For deepfake-specific face-swapping tasks, GAN-based approaches remain competitive, but diffusion models are increasingly applied to video generation tasks.
Key capability: Text-to-video synthesis. Systems like Sora (OpenAI) and similar models can generate realistic video from text descriptions. The implications for synthetic media disinformation are significant: generating a deepfake will increasingly require only a text prompt rather than training data about the specific target person.
Voice Cloning
Voice cloning — the synthesis of a specific person's voice from audio samples — has progressed rapidly. Current state-of-the-art systems (ElevenLabs, Resemble AI, and similar) can produce voice clones of sufficient quality for many deceptive purposes from very short audio samples — as little as 3-30 seconds in some demonstrations.
The underlying technology typically involves a speaker encoding (extracting the characteristics that make a voice sound like a specific person) and a neural text-to-speech system that applies those characteristics to any input text. The combination produces speech in the target person's voice saying words they never said.
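The speaker-encoding idea can be caricatured in a few lines. In the sketch below (entirely illustrative; real speaker encoders are neural networks trained on thousands of speakers), each "voice" is a sum of sinusoids at formant-like frequencies, the "embedding" is just a normalized magnitude spectrum, and cosine similarity separates same-speaker from different-speaker clips:

```python
import numpy as np

rng = np.random.default_rng(0)
SR = 8000  # sample rate (Hz)

def utterance(formants, secs=0.5):
    """Toy 'voice': sinusoids at formant-like frequencies plus recording noise."""
    t = np.arange(int(SR * secs)) / SR
    tone = sum(np.sin(2 * np.pi * f * t) for f in formants)
    return tone + 0.1 * rng.normal(size=t.size)

def speaker_embedding(signal):
    """Crude stand-in for a speaker encoder: L2-normalized magnitude spectrum."""
    mag = np.abs(np.fft.rfft(signal))
    return mag / np.linalg.norm(mag)

def similarity(u, v):
    return float(u @ v)  # cosine similarity (both inputs are unit vectors)

alice_1 = speaker_embedding(utterance([700, 1200, 2600]))  # same "vocal tract",
alice_2 = speaker_embedding(utterance([700, 1200, 2600]))  # different noise
bob     = speaker_embedding(utterance([300, 2300, 3000]))  # different "speaker"
```

A cloning system then conditions a neural text-to-speech model on such an embedding, so that arbitrary text is rendered with the target speaker's characteristics.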
Voice cloning enables a range of harms distinct from visual deepfakes: phone scams that impersonate family members claiming emergency ("grandparent scams"), fake audio evidence in legal or employment contexts, fraudulent authorization of financial transactions, and synthetic podcast or media content.
Key Term: Generative Adversarial Network (GAN)
A machine learning architecture consisting of two competing neural networks: a generator that produces synthetic content and a discriminator that attempts to detect synthetic content. The adversarial training process enables the generator to produce increasingly realistic output.
The Computational Pipeline
A typical deepfake video production involves several stages:
- Data collection: Gather training images or video of the target person (face-swap deepfakes require many images; modern methods require fewer as data efficiency improves).
- Face detection and alignment: Detect and normalize faces in the training data.
- Model training: Train the GAN or other architecture to learn the target's appearance.
- Inference (generation): Apply the trained model to new source video or audio.
- Post-processing: Correct color balance, lighting, and other artifacts to improve consistency with the source footage.
- Audio synchronization: If replacing audio, synchronize the generated voice with lip movements.
The time required for this pipeline has dropped dramatically: what required significant specialized computing resources and weeks of training in 2018 can now be accomplished in hours on consumer hardware, and many mobile applications perform real-time face-swapping with no training required.
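The stages above can be sketched as a simple orchestration loop (all names here are hypothetical stand-ins; in a real system each stage wraps a substantial subsystem such as a face detector or a GAN trainer):

```python
from dataclasses import dataclass, field

STAGES = [
    "collect_training_data",   # images/video of the target person
    "detect_and_align_faces",  # normalize face crops for training
    "train_model",             # GAN or other architecture learns the target
    "generate",                # apply the model to new source footage
    "post_process",            # color/lighting consistency fixes
    "sync_audio",              # align generated voice with lip movements
]

@dataclass
class PipelineRun:
    target: str
    completed: list = field(default_factory=list)

    def run_stage(self, name):
        # Stand-in: a production stage would invoke the relevant subsystem here.
        self.completed.append(name)

def produce(target):
    run = PipelineRun(target)
    for stage in STAGES:
        run.run_stage(stage)
    return run

run = produce("example-target")
print(run.completed)
```

The point of the ordering is that every later stage depends on the earlier ones: generation quality is bounded by training data and alignment quality long before post-processing can help.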
Section 18.3: A Brief History of Image and Video Manipulation
Pre-Digital Manipulation
The idea that photographs cannot lie is itself a modern myth. Photographic manipulation is as old as photography itself. In the 1860s, a famous composite portrait combined multiple negatives to place Abraham Lincoln's head on the body of politician John C. Calhoun, an aesthetic exercise that nonetheless demonstrated early on that photographs could depict things that never happened. Stalin's systematic removal of purged officials from photographs is perhaps the most famous example of political photo manipulation, performed laboriously by Soviet retouchers who repainted photographs to excise faces.
Darkroom techniques — dodging, burning, double exposure, sandwiching negatives — allowed skilled photographers to create composite images that could be photographically convincing. These techniques were known to the sophisticated viewer of the era and contributed to a culture of healthy skepticism about photographic evidence.
The Photoshop Era
Adobe Photoshop was released commercially in 1990 and defined a new era of image manipulation: digital tools made sophisticated alterations accessible to non-specialists, and the resulting images were typically indistinguishable from unmanipulated photographs by casual inspection. The term "photoshopped" entered common usage as a shorthand for any digitally altered image, often derogatorily.
The Photoshop era produced its characteristic problems: tabloid magazines digitally altering celebrities' bodies, news organizations accidentally or deliberately running altered images (Reuters suspended photographer Adnan Hajj in 2006 for adding smoke to a Lebanon bombing photograph), and a general awareness that still photographs could no longer be presumed authentic.
Critical limitation of the Photoshop era: Still image manipulation was relatively detectable by skilled analysts, and video manipulation remained extraordinarily labor-intensive. Creating a video that convincingly showed someone saying something they never said required either traditional video production or significant specialist effort (motion capture, CGI). This limitation meant that the epistemic threat of digital manipulation was largely bounded to still photography and text.
The Deepfake Transition (~2017)
The watershed moment for deepfake technology is typically dated to late 2017, when a Reddit user posting under the name "deepfakes" began posting AI-synthesized pornographic videos featuring celebrities' faces superimposed on adult performers' bodies. The underlying technology — a GAN-based face-swap system — was shared publicly and rapidly spread through underground communities.
Several dynamics converged at this moment to create the deepfake era:
- Technical accessibility: The GAN architecture had been published and the code was freely available.
- Consumer hardware: Graphics processing units (GPUs) powerful enough to train the models had become widely affordable.
- Training data abundance: The internet provided vast archives of photographs and video of public figures.
- Community development: Open-source deepfake communities (particularly on Reddit and GitHub) rapidly improved the technology.
By 2018, free software packages (DeepFaceLab, FaceSwap) had made deepfake production accessible to anyone with a gaming-level GPU and basic technical skills. By 2023, real-time mobile applications had made it a consumer technology requiring no technical skill at all.
The deepfake transition represents a qualitative shift from the Photoshop era: for the first time, the manipulation of video — the most persuasive and trusted medium in the modern information environment — became accessible to non-specialists and increasingly indistinguishable from authentic footage.
Section 18.4: Use Cases and Harms
Non-Consensual Intimate Imagery (NCII)
The most prevalent and documented harm from deepfakes is the non-consensual creation and distribution of intimate imagery — pornographic deepfakes created without the target's consent. Research by Sensity (formerly Deeptrace) in 2019 found that approximately 96% of all deepfakes on the public internet were non-consensual pornography targeting women. A 2023 update found the number of deepfake pornographic videos had grown to hundreds of thousands, with a handful of dedicated sites hosting millions of views.
NCII deepfakes disproportionately target women and girls, including: - Private individuals (overwhelmingly women) targeted by acquaintances, ex-partners, or online harassers - Female public figures including journalists, politicians, and entertainers - In documented cases, minors
The harms are severe and well-documented: psychological trauma (including PTSD), reputational damage, professional consequences, stalking and physical safety concerns, and in some documented cases, suicide. Unlike conventional non-consensual intimate photography (revenge porn), deepfakes do not require any genuine intimate imagery of the target — the synthetic content is generated from ordinary photographs or video. This dramatically expands the potential victim population.
Political Manipulation and Propaganda
Synthetic media poses direct threats to political discourse through several mechanisms:
Fabricating statements by political figures: Creating video or audio of politicians, government officials, or public figures making statements they never made — confessions, policy announcements, incriminating admissions, or inflammatory remarks.
Impersonating candidates and officials: Synthesizing video of candidates during elections to spread disinformation about their positions or conduct.
Manufacturing evidence: Creating fake video "evidence" of crimes, corruption, or misconduct to be used against political opponents.
International influence operations: Sovereign states or state-sponsored actors creating synthetic media to influence elections or public opinion in target countries.
Fraud and Financial Crime
Voice cloning has enabled a wave of financial fraud. In documented cases:
- A UK energy company was defrauded of €220,000 in 2019 after scammers used AI-synthesized voice to impersonate the company's German parent's CEO and direct a wire transfer.
- Multiple documented cases of "virtual kidnapping" scams have used cloned children's voices to extort parents.
- Business email compromise (BEC) scams have incorporated voice cloning to add authenticity to fraudulent authorization requests.
- The "grandparent scam" — fraudsters impersonating grandchildren in emergency situations — has been amplified by voice cloning tools.
Harassment and Reputation Destruction
Beyond mass-distributed deepfakes, synthetic media is used for targeted harassment — creating compromising imagery or video of specific individuals to damage their reputation, relationships, or career, or to coerce them into silence or compliance. This use case is particularly common against women journalists, activists, and public figures.
Section 18.5: Political Deepfakes
The Gabon President Bongo Controversy (2019)
In January 2019, a video of Gabonese President Ali Bongo Ondimba appeared on national television, the president's first public appearance in months following a stroke he had suffered in October 2018. After the broadcast, opponents and critics alleged that the video was a deepfake — a fabricated appearance designed to conceal Bongo's true medical condition or even death.
The Gabon case is important primarily as a counter-example: subsequent analysis by deepfake researchers found no definitive evidence that the Bongo video was synthetic. The oddities that prompted suspicion (stiff posture, unusual framing, limited movement) were more likely explained by the president's genuine post-stroke condition. Days after the video, Gabon's army staged an unsuccessful coup, citing the video as justification.
This episode illustrates the Liar's Dividend in practice: regardless of whether the video was real or fake, the allegation that it was fake had real consequences. The deepfake accusation served as a political tool regardless of its accuracy.
The Obama/Buzzfeed PSA (2018)
In April 2018, director Jordan Peele and BuzzFeed released a deliberately produced deepfake video of Barack Obama as a public service announcement about deepfake disinformation. In the video, "Obama," voiced by Peele's impression, says things Obama never said, and Peele then appears alongside "Obama" to explain that the video is a synthetic demonstration.
The video was explicitly labeled and disclosed as a deepfake; its purpose was educational and it was produced with the cooperation of all involved parties. It is included here because it represents an early, widely circulated demonstration of what deepfakes could do, and it set the terms of public discourse about the technology.
Notably, the "Obama" deepfake in the BuzzFeed video was visibly imperfect by 2023 standards — the face tracking and lip synchronization have artifacts visible to trained observers. The rate of quality improvement since 2018 means that similar demonstrations today would be substantially more convincing.
The Zelensky Surrender Video (2022)
Examined in detail in Case Study 18-1, the March 2022 synthetic video of Ukrainian President Zelensky represents the first confirmed use of a deepfake video in an active military conflict. While the video was crude and quickly debunked, it demonstrated the potential for synthetic media in information warfare.
Actual vs. Potential Harms
It is important to maintain an empirically grounded distinction between documented harms from political deepfakes and speculative future scenarios. As of early 2025, documented cases of genuinely deceptive, high-quality political deepfakes actually altering public opinion or electoral outcomes are limited. The Zelensky video was quickly debunked. The Gabon video's synthetic nature was never proven. Most confirmed political deepfakes have been:
- Crude enough to be quickly identified by researchers
- Attributed to state or near-state actors with significant resources
- Rapidly countered by platform labels or subject debunking
The genuine threat is forward-looking: as the technology improves, the detection and debunking advantage enjoyed by authoritative institutions will diminish. The question is not whether today's deepfakes can effectively deceive sophisticated audiences, but what the landscape will look like when deepfakes are indistinguishable from authentic footage.
Callout Box: State-Sponsored Deepfake Operations
The U.S. intelligence community has documented state-sponsored use of AI-generated images in influence operations. In 2023, the FBI and CISA warned about the use of "generative AI" by foreign adversaries to create fake personas, generate fabricated content, and amplify divisive narratives in advance of U.S. elections. Meta reported removing networks of coordinated inauthentic behavior using AI-generated profile images. The use of AI-generated imagery in influence operations is already well-documented; the use of video deepfakes for specific disinformation campaigns remains less common but is an increasing concern.
Section 18.6: The Liar's Dividend
Chesney and Citron's Concept
Professors Robert Chesney (University of Texas School of Law) and Danielle Citron (University of Virginia School of Law) coined the term "Liar's Dividend" in their foundational 2019 article "Deep Fakes: A Looming Challenge for Privacy, Democracy, and National Security" (published in the California Law Review, 2019, 107(6):1753). The concept deserves careful examination.
The Liar's Dividend refers to a second-order harm from deepfake technology beyond the direct harm of fake videos: the technology enables people who actually did or said something to plausibly claim that authentic video evidence of their conduct is a deepfake. The existence of deepfakes provides a general-purpose denial mechanism for genuine misconduct.
Consider a politician who actually made a damaging statement captured on video. Without deepfake technology, their options for denial are limited — the video was either authentic or it was not, and video authentication was simpler. With deepfake technology, they can raise the plausible claim: "That video is a deepfake. My opponents fabricated it." This claim is, in any specific case, easily made and not easily disproved, particularly in the short time window during which public opinion is formed.
Manifestations of the Liar's Dividend
Direct application: Defendants in criminal or civil cases alleging that genuine video evidence has been fabricated. As deepfake technology becomes more widely known, juries may find reasonable doubt more accessible.
Political application: Politicians alleging that damaging authentic footage is synthetic. In several documented cases, politicians in various countries have claimed that genuine videos showing them in compromising situations were deepfakes. While courts and fact-checkers can often assess these claims, the public audience may receive only the denial.
Journalistic application: Sources or institutions alleging that investigative journalism based on video or audio evidence has used fabricated material. This is a particular concern in contexts where investigative journalism already faces hostility.
Epistemic application at scale: The broader cultural effect of a media environment in which synthetic video is common is that the default presumption of video authenticity erodes. When audiences cannot routinely distinguish real from fake, they may default to generalized skepticism — treating all audiovisual evidence as potentially suspect, which has the perverse effect of protecting those who commit genuinely recorded misconduct.
The Epistemic Implications
The Liar's Dividend has profound implications for democratic governance, legal proceedings, and social truth-telling. Democratic accountability depends on the ability to document and publicize public officials' actions. Journalism's social function depends on the capacity to present evidence of misconduct. Legal proceedings depend on video and audio evidence being presumptively authentic.
Chesney and Citron argue that we may be approaching a world in which audiovisual evidence becomes nearly worthless — either believed uncritically (in contexts where people want to believe it) or dismissed as potentially fabricated (in contexts where people want to disbelieve it). This epistemic chaos serves the interests of those who wish to escape accountability.
Section 18.7: Detection Methods
Forensic Artifact Analysis
Early deepfakes produced characteristic visual artifacts that forensic analysts could identify. Understanding these artifacts is important both for historical reasons and because variants of them persist in current systems:
Face boundary artifacts: GAN-based face-swap systems must blend the generated face with the original video's head, lighting, and background. The boundary region — where the generated face meets the neck, hairline, and sides — often shows telltale blending artifacts: color inconsistencies, mismatched skin tone, unnatural texture transitions.
Eye behavior anomalies: Early deepfakes had difficulty reproducing natural eye blinking and gaze direction, largely because training datasets consisted overwhelmingly of photographs of open eyes. Research demonstrated that deepfake subjects blinked significantly less frequently than real humans. This detection method became less reliable as training data and architectures improved.
Temporal inconsistencies: Video deepfakes process frames independently (or in short segments), which can produce temporal inconsistencies — subtle variations in appearance between frames that are statistically improbable in genuine video. Frequency analysis techniques can detect these frame-to-frame inconsistencies.
High-frequency artifact analysis: Neural networks tend to produce characteristic patterns in the high-frequency content of images (the statistical properties of rapid pixel-to-pixel variations). These "GAN fingerprints" are not visible to the eye but can be detected by Fourier transform analysis.
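A minimal sketch of this idea (our own illustration with synthetic stand-ins, not a production detector): a smooth random field models the low-frequency bias of natural photographs, a faint checkerboard models the periodic residue that transposed-convolution upsampling can leave, and the fraction of spectral energy away from the center of the 2D Fourier transform separates the two.

```python
import numpy as np

rng = np.random.default_rng(0)
H = W = 128
yy, xx = np.mgrid[:H, :W]

def highfreq_ratio(img):
    """Fraction of spectral power outside the low-frequency core of the 2D FFT."""
    power = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2
    r = np.hypot(yy - H / 2, xx - W / 2)   # distance from the spectrum's center
    return power[r >= H / 8].sum() / power.sum()

# Stand-in for a camera image: low-pass-filtered white noise (energy
# concentrated at low spatial frequencies, as in most natural images)
wrap = np.hypot(np.minimum(yy, H - yy), np.minimum(xx, W - xx))
lowpass = np.exp(-((wrap / 6.0) ** 2))
real = np.real(np.fft.ifft2(np.fft.fft2(rng.normal(size=(H, W))) * lowpass))

# Stand-in for a GAN output: the same image plus a faint checkerboard,
# the kind of periodic residue an upsampling layer can imprint
fake = real + 0.05 * ((yy % 2) ^ (xx % 2))
```

The checkerboard is invisible to the eye at this amplitude, yet it concentrates energy at the highest spatial frequency, where natural images carry almost none.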
Physiological signal inconsistencies: Remote photoplethysmography (rPPG) techniques can detect subtle periodic color variations in skin caused by blood pulse. Deepfakes, generated frame by frame, do not reproduce these biological signals; they can be detected by algorithms looking for the absence of physiological signals that should be present.
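A toy version of the rPPG idea (synthetic data, with the pulse amplitude greatly exaggerated for clarity; real pulse signals are fractions of a percent of pixel intensity): compare the fraction of spectral power in the plausible heart-rate band for a skin-color trace with and without a periodic component.

```python
import numpy as np

rng = np.random.default_rng(1)
FPS, SECONDS = 30, 10
t = np.arange(FPS * SECONDS) / FPS

def pulse_band_fraction(trace, fps=FPS):
    """Fraction of spectral power in the 0.7-3.0 Hz band (42-180 bpm)."""
    trace = trace - trace.mean()
    power = np.abs(np.fft.rfft(trace)) ** 2
    freqs = np.fft.rfftfreq(trace.size, 1.0 / fps)
    band = (freqs >= 0.7) & (freqs <= 3.0)
    return power[band].sum() / power.sum()

# Mean skin-region color value per frame (synthetic): a genuine face carries
# a faint periodic component at the heart rate (~72 bpm = 1.2 Hz here),
# while frame-by-frame synthesis produces no such biological rhythm.
real_trace = np.sin(2 * np.pi * 1.2 * t) + rng.normal(0, 1.0, t.size)
fake_trace = rng.normal(0, 1.0, t.size)
```

A detector flags video whose skin regions show no spectral peak anywhere in the physiologically plausible band.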
The Detection Arms Race
Deepfake detection is fundamentally a competitive technology: as detectors identify artifacts, generators are trained to avoid those specific artifacts. GAN discriminators trained on detection can be incorporated into the generator's training loop to suppress detectable artifacts. This creates a continuous arms race between generation and detection.
Research has consistently found that detection systems trained on deepfakes from one generation of synthesis systems perform poorly on deepfakes from newer, artifact-reduced systems. This generalization failure is a fundamental challenge: detection systems based on artifact recognition will always lag behind generation systems.
Content Provenance and C2PA
The Coalition for Content Provenance and Authenticity (C2PA) represents a different approach to the authenticity problem: rather than detecting fabrication after the fact, establish authenticated provenance at the point of creation. C2PA is a technical standard (developed by Adobe, Microsoft, Intel, the BBC, and others) that allows content — images, video, audio — to carry cryptographically signed metadata recording:
- When and where it was created
- What device created it
- What software processed it
- Any transformations that were applied
A C2PA-signed image carries a "Content Credentials" certificate chain that can be verified by any compatible viewer. The signature is bound to the content — altering the content invalidates the signature. If a camera manufacturer implements C2PA, images from that camera carry authenticated creation metadata. If an AI generation system implements C2PA, its outputs carry metadata declaring that they are AI-generated.
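The binding between signature and content can be sketched in a few lines. This toy uses a shared-key HMAC as a stand-in for C2PA's actual mechanism, which relies on asymmetric signatures and X.509 certificate chains; the key name and metadata fields below are invented for illustration.

```python
import hashlib
import hmac
import json

# Stand-in for a device's signing key; real C2PA uses asymmetric keys
# anchored in a certificate chain, not a shared secret.
SIGNING_KEY = b"demo-device-key"

def sign_manifest(content, metadata):
    """Bind metadata to content by signing the content hash plus metadata."""
    payload = json.dumps(
        {"content_sha256": hashlib.sha256(content).hexdigest(), **metadata},
        sort_keys=True,
    ).encode()
    tag = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": tag}

def verify(content, manifest):
    """A manifest verifies only if the signature AND the content hash match."""
    expected = hmac.new(SIGNING_KEY, manifest["payload"], hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, manifest["signature"]):
        return False
    claimed = json.loads(manifest["payload"])["content_sha256"]
    return claimed == hashlib.sha256(content).hexdigest()

image = b"raw image bytes"
manifest = sign_manifest(image, {"device": "ExampleCam", "created": "2024-05-01"})
assert verify(image, manifest)             # untouched content verifies
assert not verify(image + b"!", manifest)  # any alteration breaks the binding
```

The essential property the sketch demonstrates is the one described above: the signature covers a hash of the content itself, so even a one-byte alteration invalidates the credential.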
Limitations of C2PA: The system depends on adoption by cameras, software, and platforms. Content without C2PA credentials cannot be distinguished from content where credentials were stripped. The system proves authentic provenance for signed content but cannot prove inauthenticity for unsigned content.
Watermarking
AI watermarking — embedding invisible but detectable signals in AI-generated content — is a complementary approach. If all AI-generation systems embed watermarks in their output, those outputs can be subsequently identified as AI-generated by detector systems reading the watermarks. Proposals for mandatory watermarking of AI-generated content have been included in various regulatory frameworks, including some EU AI Act provisions.
Limitations of watermarking: Watermarks embedded in generated content can potentially be removed or destroyed by re-saving, cropping, or other transformations. Watermarks that survive such attacks are typically more perceptible. The system also depends on all generation systems implementing watermarks — systems operated by malicious actors would not implement them.
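The fragility described above can be demonstrated with a minimal least-significant-bit watermark (a deliberately naive scheme with an invented payload; real AI watermarks spread the signal across many pixels and frequency bands precisely to resist such transformations):

```python
WATERMARK = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical 8-bit generator ID

def embed(pixels):
    """Write the watermark into the least-significant bit of each pixel."""
    return [(p & ~1) | WATERMARK[i % len(WATERMARK)] for i, p in enumerate(pixels)]

def extract(pixels):
    """Read the watermark bits back out of the first few pixels."""
    return [p & 1 for p in pixels[:len(WATERMARK)]]

def requantize(pixels, step=4):
    """Crude stand-in for lossy re-compression: snap values to a coarser grid."""
    return [min(255, round(p / step) * step) for p in pixels]

pixels = list(range(50, 90))  # a strip of 8-bit grey values
marked = embed(pixels)

assert extract(marked) == WATERMARK              # survives a lossless copy
assert extract(requantize(marked)) != WATERMARK  # destroyed by lossy re-saving
```

The trade-off in the sketch is the general one: a watermark confined to imperceptible low-order detail is exactly the detail that re-saving, cropping, and re-encoding discard first.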
Section 18.8: Audio Deepfakes and Voice Cloning
The Distinct Threat Landscape of Audio
Audio deepfakes, meaning AI-synthesized voice content, represent a threat landscape somewhat distinct from video deepfakes. The technology is more mature, more broadly deployed in malicious contexts, and in some ways harder to detect casually, because listeners are accustomed to variable audio quality on phone calls and streams.
Key attributes of the audio deepfake threat:
- Lower data requirements: Current voice cloning systems can work from seconds to minutes of audio.
- Scalable deployment: Unlike video, audio fraud can be deployed at telephone scale — automated systems can make thousands of calls simultaneously with synthesized voices.
- Emotional manipulation: Voice carries emotional information (tone, pacing, stress) that AI voice synthesis systems increasingly reproduce convincingly. A cloned voice saying "I'm hurt, please send money" is more emotionally compelling than a text message.
- Limited verification culture: Most people do not have habits of verifying telephone voices through independent channels; they rely on recognition.
Documented Fraud Cases
The €220,000 CEO Voice Fraud (2019): A UK energy company executive received a phone call from what sounded like the CEO of the company's German parent, instructing him to transfer €220,000 to a Hungarian supplier. The voice had "a slight German accent" and was described as convincingly like the CEO's actual voice. The transfer was made; the fraud was later identified. This is among the earliest documented cases of AI voice cloning used in financial fraud.
The 2024 Hong Kong CFO Deepfake Fraud: A more sophisticated case involved a finance worker at a Hong Kong company who was fooled by a deepfake video conference call involving the company's CFO and other "employees" — all of whom were synthetic. The worker transferred approximately HK$200 million (approximately $26 million) to the fraudsters.
Grandparent Scam Enhancement: The traditional "grandparent scam" — where fraudsters call elderly people claiming to be their grandchildren in trouble and needing money — has been enhanced with voice cloning technology that reproduces the grandchild's actual voice from social media audio.
Detection Approaches for Audio
Audio deepfake detection uses several approaches:
Spectral analysis: AI-generated audio has characteristic patterns in its frequency spectrum — particularly in the high-frequency and very low-frequency ranges — that differ from natural speech.
Prosodic analysis: Natural speech has subtle rhythmic and stress patterns that AI synthesis systems approximate imperfectly. Analysis of speech rhythm, pause patterns, and stress placement can distinguish synthetic speech.
Background noise analysis: Natural voice recordings typically contain environmental sounds (air conditioning, traffic, room resonance) that are absent from or inconsistent in AI-generated speech.
Challenge-response verification: In high-stakes voice authentication, requiring responses to unpredictable prompts makes replay attacks difficult; however, voice cloning systems increasingly run in real time, which weakens this protection.
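The background-noise approach above can be sketched with a toy example (invented signal parameters, not a production detector): a natural recording retains a measurable noise floor even in its quietest frames, while fully synthetic audio can drop to digital silence between words.

```python
import math
import random

def rms(samples):
    """Root-mean-square energy of a frame."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def noise_floor(samples, frame=160):
    """Estimate background level from the quietest 20% of frames.

    Natural recordings keep a persistent room-noise floor between words;
    fully synthetic speech often drops to near-digital silence instead.
    """
    frames = [samples[i:i + frame]
              for i in range(0, len(samples) - frame + 1, frame)]
    energies = sorted(rms(f) for f in frames)
    quiet = energies[: max(1, len(energies) // 5)]
    return sum(quiet) / len(quiet)

random.seed(0)
tone = [0.5 * math.sin(2 * math.pi * 220 * t / 8000) for t in range(800)]
silence = [0.0] * 800
clean = tone + silence + tone + silence             # stand-in for synthetic speech
noisy = [s + random.gauss(0, 0.01) for s in clean]  # stand-in for a real recording

assert noise_floor(noisy) > noise_floor(clean)
```

As with the video artifacts discussed earlier, this cue is easy for generators to defeat (by mixing in recorded room noise), which is why practical systems combine several independent signals.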
Section 18.9: Regulatory and Platform Responses
State Laws Criminalizing NCII Deepfakes
In the absence of comprehensive federal legislation in the United States, states have enacted patchwork protections:
- As of 2024, over 20 U.S. states have enacted laws specifically criminalizing non-consensual intimate imagery, and a growing number have expanded these laws to explicitly cover AI-generated NCII deepfakes. Virginia, California, Georgia, and New York are among states with explicit deepfake NCII provisions.
- The laws vary significantly in scope, definitions, penalties, and civil cause of action provisions.
- Federal legislation — the DEFIANCE Act, the TAKE IT DOWN Act, and others — has been proposed to create a federal cause of action for NCII deepfakes, with varying prospects.
Limitations of criminal law responses: Criminal prosecutions are resource-intensive, often require identification of an anonymous perpetrator, and occur after the harm has already been inflicted. The viral distribution of NCII deepfakes often cannot be meaningfully remedied by a conviction of the original creator.
Political Deepfakes Legislation
Several states have enacted or proposed laws specifically addressing deepfakes in electoral contexts:
- California's AB 730 (2019) prohibited distributing materially deceptive deepfakes of candidates within 60 days of an election.
- Texas's SB 751 (2019) made it a criminal offense to create and distribute deepfake videos with intent to influence an election.
- Similar legislation has been enacted or proposed in multiple additional states.
First Amendment concerns: Laws restricting synthetic media in political contexts face potential First Amendment challenges. Satire and parody — including synthetic media — are constitutionally protected forms of expression. Regulations must be carefully crafted to target deceptive uses without overly restricting protected speech.
Platform Policies
Major platforms have adopted policies on synthetic media:
- YouTube prohibits deepfakes that could mislead users about political, social, or other issues, and requires creators to disclose realistic-appearing synthetic content.
- Meta/Facebook prohibits deepfakes that could deceive the public into thinking someone said something they did not, with exceptions for clearly labeled parody.
- TikTok requires labels on AI-generated content and prohibits deepfakes that could cause harm.
- X/Twitter's policies have been less restrictive under post-2022 ownership.
Platform policies are limited in enforcement: the volume of content uploaded to major platforms far exceeds human review capacity, and automated deepfake detection systems have significant false positive and false negative rates.
C2PA Adoption
The C2PA standard has been adopted or announced by a growing list of participants:
- Camera manufacturers: Sony, Nikon, Canon, Leica
- AI generation platforms: Adobe Firefly, Microsoft, OpenAI (DALL-E)
- News organizations: BBC, CBC, Reuters
- Social media: LinkedIn has implemented C2PA display; other platforms have announced intent
- Device manufacturers: Qualcomm has announced C2PA implementation in mobile chip hardware
Full ecosystem adoption — where the majority of cameras, editing software, AI generators, and display platforms support C2PA — would substantially address the provenance problem for compliant actors. The gap is malicious actors who will not implement C2PA in their deepfake tools.
Section 18.10: Future Trajectories
Where Synthetic Media is Heading
The trajectory of synthetic media technology can be assessed along several dimensions:
Quality: Current AI generation systems produce output that ranges from clearly synthetic (for trained observers) to indistinguishable from authentic (for non-specialist viewers in favorable conditions). The direction of travel is consistently toward higher quality, lower artifact rates, and better performance across challenging conditions (lighting, angle, motion).
Data efficiency: Early deepfake systems required thousands of training images to produce convincing output for a specific person. Current few-shot learning systems can produce reasonable results from fewer than a dozen images. The threshold will continue to decline, expanding the potential victim pool.
Speed: Real-time deepfake generation — creating synthetic video in live video calls — is already possible in limited quality. Improving inference speed and quality will make real-time deepfakes increasingly convincing.
Multimodal integration: Current systems are largely specialized by modality (image generation, voice cloning, video synthesis). Integrated systems that generate coherent video, audio, and text simultaneously — each consistent with the others — will enable more sophisticated synthetic media production.
Accessibility: The trend is consistently toward less technical skill required, lower cost, and broader availability. Technologies that required specialist teams in 2018 are consumer applications in 2023.
Epistemological Implications
The deepfake trajectory has profound implications for human epistemology — for how we know what we know.
Collapse of audiovisual evidence as proof: If sufficiently advanced synthetic media can generate convincing videos of anyone saying anything, the evidentiary value of audiovisual recordings is fundamentally undermined. Legal systems, journalistic practices, and public discourse all rely on audiovisual evidence in ways that will need to be renegotiated.
Default skepticism and its costs: As audiences become aware of synthetic media, they may adopt a generalized skepticism toward audiovisual content. This skepticism, if well-calibrated, is appropriate. But generalized skepticism has costs: it can impede belief in genuine documented atrocities, undermine accountability journalism, and contribute to a nihilistic epistemology where "nobody can know anything" becomes a shield for those who actually did commit recorded wrongs.
Epistemic inequality: The ability to verify authenticity — through C2PA credentials, forensic analysis, or institutional research capacity — will not be evenly distributed. Sophisticated institutions will have tools to evaluate synthetic media that ordinary members of the public lack. This creates new information asymmetries.
The role of trust institutions: In this environment, trusted institutions — credentialed journalism, courts, scientific bodies — may play an increasingly important role as validators of authentic information. But these institutions are themselves under pressure, and the deepfake environment provides new ammunition for attacks on institutional credibility.
Building verification cultures: The most durable response to synthetic media may be cultural rather than technical: building widespread understanding of how to verify information, when to ask for provenance credentials, and how to calibrate skepticism appropriately across different contexts and sources.
Callout Box: The Responsibility of AI Developers
The generative AI companies that build and deploy voice cloning, image generation, and video synthesis systems bear significant responsibility for the synthetic media environment. Responsible development involves: implementing and promoting C2PA or equivalent provenance standards; building disclosure requirements into consumer-facing products; maintaining "know your customer" requirements for high-stakes applications; cooperating with research on detection; and engaging with policy processes around regulation. Some leading companies have adopted meaningful versions of these practices; others have not.
Key Terms
Deepfake: AI-generated or AI-substantially-modified audiovisual content depicting real people in situations they were not actually in, typically produced using generative adversarial networks or diffusion models.
Cheap Fake: Low-technology audiovisual manipulation not involving AI, including speed manipulation, selective editing, recontextualization, and basic audio dubbing.
Generative Adversarial Network (GAN): A machine learning architecture consisting of two competing neural networks (generator and discriminator) whose adversarial training enables generation of realistic synthetic content.
Diffusion Model: A class of generative AI model that learns to generate content by learning to reverse a noise-addition process; used in systems like Stable Diffusion and DALL-E.
Voice Cloning: AI synthesis of a specific person's voice from audio samples, enabling the generation of speech in that person's voice saying words they never spoke.
Liar's Dividend: The ability for people who actually said or did something to plausibly claim that authentic video or audio evidence of their conduct is a deepfake, enabled by the existence of deepfake technology.
C2PA (Coalition for Content Provenance and Authenticity): A technical standard that enables cryptographically authenticated provenance metadata to be embedded in digital content, allowing verification of the content's creation history.
Non-Consensual Intimate Imagery (NCII): Sexual imagery created and distributed without the subject's consent; "deepfake NCII" refers to AI-synthesized intimate imagery targeting real individuals.
GAN Fingerprint: Statistical patterns in the high-frequency content of GAN-generated images that are characteristic of specific GAN architectures, enabling forensic identification of synthetic images.
Error Level Analysis (ELA): A forensic technique for detecting image manipulation by re-compressing a JPEG and comparing error levels across regions; areas that have been edited and re-saved compress differently from the rest of the image.
Discussion Questions
- The Liar's Dividend suggests that deepfake technology harms truth-telling even when fake videos are quickly debunked, because their existence makes it plausible to deny authentic footage. Can you think of examples where this dynamic has already appeared — before deepfakes were widespread — and what does this tell us about the Liar's Dividend's dependence on the technology?
- Most confirmed deepfake harms involve NCII — non-consensual intimate imagery targeting women. Why do you think this category of harm is so much more prevalent than political deepfakes, despite the latter receiving more policy attention?
- The C2PA approach to provenance addresses the authenticity problem through technical means, but depends on broad adoption by both content producers and content distributors. What incentives are necessary to achieve this adoption, and what gaps would remain even in a world of complete legitimate-actor adoption?
- Researchers describe a deepfake "arms race" between generation and detection. Is this framing accurate? Are there detection approaches (like physiological signal detection or C2PA) that are not subject to this arms race, and why?
- How should legal systems adapt to the existence of deepfakes? Consider both criminal evidence rules (how should courts treat audiovisual evidence when deepfakes are possible?) and civil law (what legal frameworks should apply to the creation and distribution of NCII deepfakes?).
- The deepfake detection research literature consistently shows that detection systems trained on one generation of deepfakes perform poorly on the next generation. What are the implications of this finding for detection-based defense strategies?
- Consider the claim: "Deepfakes are just the latest in a long series of media manipulation technologies, and society has adapted to each previous one (photography, Photoshop, etc.). We will adapt to deepfakes too." Evaluate this claim. In what respects does the deepfake challenge resemble previous challenges, and in what respects is it qualitatively different?
Summary
Synthetic media — from cheap fakes through AI-generated deepfakes — represents a fundamental challenge to the presumptive authenticity of audiovisual content that has underpinned media literacy and democratic accountability for over a century. The technology, enabled by GANs and diffusion models, has improved dramatically in quality and accessibility since the 2017 transition point, and the trajectory is consistently toward greater capability with less technical barrier.
The harms are real and distributed across domains: NCII deepfakes devastate individual targets with little legal recourse; political deepfakes threaten democratic discourse; audio deepfakes enable large-scale financial fraud; and the Liar's Dividend undermines the evidentiary value of authentic footage. Responses — forensic detection, C2PA provenance, legal frameworks, platform policies — are necessary but partial. No single technical or legal solution addresses the full landscape.
The epistemological implications extend beyond any specific harm: we are in the process of collectively renegotiating the relationship between seeing and believing, between audiovisual evidence and truth. The appropriate response is not despair or blanket skepticism, but the development of verification practices — asking how we know, what credentials and provenance information are available, and which institutions are positioned to help us evaluate — that can function in a world of pervasive synthetic media.