Chapter 18: Quiz — Deepfakes, Synthetic Media, and Emerging Threats

Instructions: Answer each question. For multiple-choice questions, select the single best answer. For short-answer questions, write 2-4 sentences. Answers follow each question.


Question 1

The 2019 slowed-down video of Nancy Pelosi that made her appear intoxicated is best classified as:

A) A deepfake produced using a GAN
B) A cheap fake produced by simple speed manipulation
C) A shallowfake using AI-assisted lip synchronization
D) A synthetic media artifact generated by a diffusion model

Answer: **B) A cheap fake produced by simple speed manipulation.** The Pelosi video was created by slowing the playback of a genuine video — no AI was involved. This is the paradigm example of a "cheap fake": a low-technology manipulation that can be produced with basic video editing software. The chapter emphasizes that cheap fakes are often more consequential than deepfakes precisely because they are faster and cheaper to produce.

Question 2

In a Generative Adversarial Network (GAN), what is the role of the discriminator?

A) To generate synthetic content that is indistinguishable from real content
B) To detect whether input data is real or generated, creating adversarial pressure on the generator
C) To encode the identity of a specific person for face-swapping operations
D) To remove visual artifacts from generated content through post-processing

Answer: **B) To detect whether input data is real or generated, creating adversarial pressure on the generator.** The discriminator is trained to classify inputs as real or generated. Its accuracy creates gradient signals that train the generator to improve its output — the "adversarial" relationship. As the discriminator improves, the generator must produce more realistic content to fool it; as the generator improves, the discriminator must become more sophisticated. This competitive dynamic drives the improving quality of GAN-generated content.
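
The adversarial pressure described above comes down to two coupled loss functions. A minimal sketch in Python (numpy only; the function names are illustrative, not taken from any particular framework):

```python
import numpy as np

def discriminator_loss(d_real, d_fake):
    # Binary cross-entropy: reward the discriminator for scoring
    # real inputs near 1 and generated inputs near 0.
    return -np.mean(np.log(d_real) + np.log(1.0 - d_fake))

def generator_loss(d_fake):
    # Non-saturating generator objective: reward the generator
    # when the discriminator scores its output near 1 ("real").
    return -np.mean(np.log(d_fake))

# A discriminator that separates well (real near 0.9, fake near 0.1)
# has low loss; as the generator improves and d_fake rises, the
# generator loss falls while the discriminator loss rises. That
# push-and-pull is the adversarial dynamic.
```

Gradient descent on `generator_loss` with respect to the generator's parameters is what transmits the discriminator's judgments back into the generator, per the answer above.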

Question 3

The "Liar's Dividend" concept, coined by Chesney and Citron, refers to:

A) The financial profits earned by deepfake creators who sell synthetic content
B) The ability of people whose conduct was genuinely recorded to plausibly dismiss authentic video evidence as a deepfake
C) The economic value that lies capture compared to truths in attention-based media economies
D) The advantage gained by the first party to make a false claim before fact-checkers respond

Answer: **B) The ability of people whose conduct was genuinely recorded to plausibly dismiss authentic video evidence as a deepfake.** The Liar's Dividend describes a second-order harm from deepfake technology: the mere existence of convincing deepfakes allows bad actors to claim that genuine video evidence of their misconduct is synthetic. Even if the claim is false, it raises plausible doubt in the short time window when public opinion is formed, and provides a rhetorical defense that was unavailable before deepfake technology existed.

Question 4

Which of the following deepfake detection methods is NOT subject to the generation-detection arms race in the same way as artifact-based detection?

A) Facial boundary artifact analysis
B) Eye blinking frequency analysis
C) Content provenance via C2PA cryptographic signatures
D) GAN fingerprint detection in the frequency domain

Answer: **C) Content provenance via C2PA cryptographic signatures.** C2PA provenance works through cryptographic verification of the content's creation history, not through analysis of visual or acoustic artifacts. While generators can be trained to suppress artifacts detected by methods A, B, and D, they cannot forge cryptographic signatures from legitimate camera hardware — that would require compromising the private keys embedded in the camera's secure element. C2PA addresses the authentication problem at the provenance level rather than the artifact level, making it structurally different from the arms race dynamic.

Question 5

Approximately what percentage of deepfakes found on the public internet in 2019 were non-consensual pornographic imagery, according to Sensity (formerly Deeptrace)?

A) 20%
B) 50%
C) 76%
D) 96%

Answer: **D) 96%.** Sensity's (formerly Deeptrace's) 2019 research found that approximately 96% of all deepfakes indexed on the public internet were non-consensual pornographic imagery targeting women. This finding is important for policy prioritization: while political deepfakes receive substantial media and policy attention, NCII deepfakes represent the overwhelming majority of the actual harm being produced and distributed.

Question 6

The C2PA (Coalition for Content Provenance and Authenticity) standard addresses the deepfake problem by:

A) Training an AI detector to identify deepfakes with high accuracy
B) Requiring all AI generation systems to be registered with government authorities
C) Embedding cryptographically signed provenance metadata in content at the point of creation
D) Creating a public database of known deepfakes that platforms can check against

Answer: **C) Embedding cryptographically signed provenance metadata in content at the point of creation.** C2PA establishes a technical standard for "Content Credentials" — metadata that records the content's creation and modification history, signed with cryptographic keys from the originating hardware or software. The signatures are tamper-evident: altering the content invalidates the signature. Verified credentials establish authentic provenance; the absence of credentials does not prove inauthenticity, but the presence of credentials from trusted hardware establishes authenticity.

Question 7

Which of the following events is typically cited as the "transition point" for the modern deepfake era?

A) The release of Adobe Photoshop in 1990
B) The introduction of the GAN architecture by Ian Goodfellow in 2014
C) A Reddit user posting AI face-swap pornographic deepfakes in late 2017
D) The release of Stable Diffusion as an open-source model in 2022

Answer: **C) A Reddit user posting AI face-swap pornographic deepfakes in late 2017.** While Goodfellow's GAN paper (2014) provides the technical foundation, the 2017 Reddit posts by user "deepfakes" mark the moment when GAN-based face-swap technology was publicly shared, rapidly replicated, and made accessible to non-specialists. The subsequent open-source development (DeepFaceLab, FaceSwap) and community improvement made deepfake production widely accessible — this is the functional transition point even though the underlying technology predates it.

Question 8

What is "mode collapse" in the context of GAN training?

A) When the discriminator becomes too powerful and the generator cannot improve
B) When the generator produces only a limited variety of outputs, failing to cover the diversity of the real data distribution
C) When the entire GAN training process collapses due to numerical instabilities
D) When the training data is insufficient to train a high-quality generator

Answer: **B) When the generator produces only a limited variety of outputs, failing to cover the diversity of the real data distribution.** Mode collapse occurs when the generator finds a limited number of output types that reliably fool the discriminator and "collapses" to generating only those types, ignoring the diversity of the real data distribution. The generator optimizes for fooling the discriminator rather than for covering the full diversity of the training data, resulting in limited and repetitive output. This is a common failure mode in GAN training and is one reason why more stable training approaches (Wasserstein GAN, progressive growing) were developed.

Question 9

The physiological signal detection approach to deepfake detection exploits:

A) The fact that GAN generators produce characteristic high-frequency patterns
B) Subtle color variations in skin caused by blood pulse (remote photoplethysmography)
C) Inconsistencies in facial landmark positioning between video frames
D) Differences in compression artifacts between genuine and generated video

Answer: **B) Subtle color variations in skin caused by blood pulse (remote photoplethysmography).** Remote photoplethysmography (rPPG) detects the subtle periodic changes in skin color caused by blood flowing through capillaries with each heartbeat. These signals are present in genuine video of real people and are absent in deepfakes, which are generated frame by frame without any biological process. Detection algorithms can identify the absence of these expected physiological signals as an indicator of synthetic content.

Question 10

Voice cloning technology can typically produce convincing results from:

A) A single phoneme of audio
B) At least 10 hours of studio-quality recording
C) As little as a few seconds to a few minutes of audio, depending on the system
D) Only from recordings without background noise or processing

Answer: **C) As little as a few seconds to a few minutes of audio, depending on the system.** Current state-of-the-art voice cloning systems (ElevenLabs, Resemble AI, and similar) can produce voice clones of sufficient quality for deceptive purposes from very short audio samples. While longer samples generally produce better results, the barrier to entry for voice cloning fraud has fallen dramatically from the early systems that required many hours of training audio.

Question 11

The Gabon President Bongo video controversy (2019) is important for which of the following reasons?

A) It was the first confirmed deepfake successfully used to manipulate an election
B) It demonstrates the Liar's Dividend: the allegation that the video was a deepfake had political consequences regardless of its actual authenticity
C) It established international legal precedent for state accountability for deepfake production
D) It was successfully authenticated by C2PA provenance credentials, demonstrating the standard's utility

Answer: **B) It demonstrates the Liar's Dividend: the allegation that the video was a deepfake had political consequences regardless of its actual authenticity.** Deepfake researchers who examined the Bongo video found no definitive evidence it was synthetic; the oddities in the video were more likely explained by Bongo's post-stroke condition. Yet the deepfake allegation — regardless of its accuracy — was cited as justification for an attempted military coup. This perfectly illustrates the Liar's Dividend: the technology's existence made it plausible to dismiss a genuine video as fabricated, with real political consequences.

Question 12

"Error Level Analysis" (ELA) is a technique for:

A) Detecting GAN-generated images by analyzing their high-frequency content
B) Identifying regions of an image that have been added or modified by analyzing differential JPEG compression artifacts
C) Measuring the eye blinking rate in video to identify deepfakes
D) Verifying C2PA content credentials for authenticity

Answer: **B) Identifying regions of an image that have been added or modified by analyzing differential JPEG compression artifacts.** ELA exploits the fact that JPEG compression introduces errors that vary with image content and with the image's compression history. When an image is re-saved as a JPEG, the recompression is applied uniformly — but regions spliced in from a source at a different compression quality show different error levels than the surrounding original regions. By visualizing these differential error levels, analysts can identify areas that may have been manipulated. ELA is primarily useful for detecting composited still images, not for GAN-generated synthetic images.
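
The re-save-and-diff idea fits in a few lines with the Pillow library. This is a simplified illustration, not a forensic-grade tool; real ELA software also amplifies the difference image for visual inspection:

```python
from io import BytesIO

from PIL import Image, ImageChops

def error_level_analysis(img: Image.Image, quality: int = 90) -> Image.Image:
    """Re-save the image as JPEG at a known quality and return the
    per-pixel difference. Regions with a different compression history
    (e.g. spliced in from another source) tend to show different
    error levels than the surrounding original content."""
    buf = BytesIO()
    img.convert("RGB").save(buf, "JPEG", quality=quality)
    buf.seek(0)
    resaved = Image.open(buf)
    return ImageChops.difference(img.convert("RGB"), resaved)
```

An analyst would typically scale the brightness of the returned difference image and look for regions that stand out against the rest of the frame.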

Question 13

Under the C2PA standard, what happens to the Content Credentials when the underlying image content is altered?

A) The credentials update automatically to reflect the new modification
B) The existing credentials are invalidated (the signature no longer validates)
C) The credentials are deleted from the file automatically
D) A new credential entry is appended to the existing credential chain

Answer: **B) The existing credentials are invalidated (the signature no longer validates).** C2PA uses cryptographic digital signatures that are computed from the content's actual pixel data (or a cryptographic hash thereof). Altering the content changes the underlying data, which means the signature computed over the original data no longer matches — the signature verification fails. This tamper-evidence property is central to C2PA's security model. Legitimate modifications by an authorized tool would add a new credential assertion to the chain rather than altering the original signed assertion.
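
The tamper-evidence property can be illustrated with a toy signing scheme. Note the hedge: C2PA itself uses certificate-based digital signatures (X.509/COSE), not the HMAC used below; the sketch only shows why changing a single byte of content invalidates a signature computed over its hash:

```python
import hashlib
import hmac

def sign(content: bytes, key: bytes) -> str:
    # Sign a hash of the content (a stand-in for a real digital signature).
    digest = hashlib.sha256(content).digest()
    return hmac.new(key, digest, hashlib.sha256).hexdigest()

def verify(content: bytes, signature: str, key: bytes) -> bool:
    # Recompute the signature and compare in constant time.
    return hmac.compare_digest(sign(content, key), signature)
```

A signature computed over the original bytes verifies; the same signature checked against even slightly altered bytes fails, which is exactly the invalidation behavior described in the answer.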

Question 14

The primary limitation of deepfake detection systems based on artifact recognition is:

A) They require too much computing power to be deployed at scale
B) Generators can be trained to suppress the specific artifacts that detectors look for, creating a continuous arms race
C) They can only analyze still images, not video frames
D) They require access to the original training data to make accurate classifications

Answer: **B) Generators can be trained to suppress the specific artifacts that detectors look for, creating a continuous arms race.** This is the fundamental limitation of artifact-based detection: any specific artifact that a detector learns to identify can be used as a training signal to improve the generator — train the generator to avoid producing that artifact. Research consistently shows that detectors trained on one generation of deepfakes generalize poorly to the next. This arms race dynamic is why C2PA and other provenance-based approaches, which do not depend on artifact analysis, are seen as more structurally durable.

Question 15

The BuzzFeed/Jordan Peele Obama deepfake (2018) was:

A) A deceptive disinformation operation targeting the former president
B) A disclosed, purposely created demonstration video explaining the risks of deepfakes
C) Classified as a cheap fake because Peele's voice was dubbed rather than synthesized
D) The first deepfake to be prosecuted under California's AB 730

Answer: **B) A disclosed, purposely created demonstration video explaining the risks of deepfakes.** The BuzzFeed video was explicitly labeled as a synthetic demonstration; Jordan Peele appears visually alongside the synthetic "Obama" to explain the nature of the video. Its purpose was educational — to demonstrate the capabilities and risks of deepfake technology. It was produced with full consent and cooperation of all involved parties. It is significant historically as one of the first widely distributed demonstrations of political deepfake capability, but it was not deceptive or disinformation.

Question 16

What is "remote photoplethysmography" (rPPG)?

A) A method for detecting deepfake audio by analyzing vocal cord frequency patterns
B) Detection of blood pulse signals from subtle skin color variations visible in video recordings
C) A technique for measuring screen refresh rate artifacts in recorded video
D) Authentication of video provenance through analysis of camera sensor noise patterns

Answer: **B) Detection of blood pulse signals from subtle skin color variations visible in video recordings.** rPPG is a non-contact method for measuring physiological signals (primarily heart rate) by detecting the tiny periodic changes in skin color caused by blood flowing through capillaries. In video recordings of real people, these subtle color variations are present as a biological signal. In deepfake videos generated frame by frame, these biological signals are absent. Detection algorithms can identify their absence as an indicator of synthetic content.
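
At its core, the rPPG measurement is frequency analysis of a skin-region color signal. A minimal sketch (numpy only; the `green_means` input is assumed to come from face tracking over video frames, which is omitted here):

```python
import numpy as np

def estimate_pulse_hz(green_means, fps):
    """Estimate heart rate (Hz) from the per-frame mean green-channel
    value of a tracked skin region. The green channel carries the
    strongest blood-volume signal in ordinary RGB video."""
    signal = np.asarray(green_means, dtype=float)
    signal = signal - signal.mean()              # remove the DC offset
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
    # Restrict to plausible human heart rates: 0.7-4 Hz (42-240 bpm).
    band = (freqs >= 0.7) & (freqs <= 4.0)
    return freqs[band][np.argmax(spectrum[band])]
```

A real detector would also check whether the in-band peak rises meaningfully above the noise floor; a flat spectrum in the heart-rate band is the indicator of synthetic content described above.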

Question 17

Which of the following is a limitation specific to C2PA as a solution to the deepfake problem?

A) C2PA cannot work for video files, only for still images
B) The system can only verify content from government-approved sources
C) Content without C2PA credentials cannot be proven inauthentic — only content with credentials can be proven authentic
D) C2PA requires an internet connection for every verification

Answer: **C) Content without C2PA credentials cannot be proven inauthentic — only content with credentials can be proven authentic.** This is C2PA's fundamental limitation as a comprehensive solution: it establishes positive authentication for signed content, but the absence of credentials proves nothing. Malicious actors will not implement C2PA in their deepfake tools, so synthetic content will simply lack credentials. The question for a viewer is: does the absence of credentials mean the content is inauthentic, or does it mean it was created with a camera/tool that doesn't implement C2PA? Without universal adoption, the absence of credentials is ambiguous.

Question 18

In the documented 2019 CEO voice fraud case involving a UK energy company, approximately how much was fraudulently transferred?

A) £15,000
B) €220,000
C) $1.2 million
D) €4 million

Answer: **B) €220,000.** The UK energy company executive received a phone call from what sounded like the CEO of the company's German parent and was instructed to transfer €220,000 to a Hungarian supplier. The voice was described as convincingly like the CEO's, with "a slight German accent." The transfer was completed before the fraud was identified. This case is among the first documented uses of AI voice cloning in financial fraud.

Question 19

Diffusion models differ from GANs in their primary generative mechanism by:

A) Using two competing networks instead of one
B) Learning to reverse a noise-addition process to generate content from noise
C) Requiring explicit facial landmark detection as a preprocessing step
D) Generating content from a single reference image rather than from noise

Answer: **B) Learning to reverse a noise-addition process to generate content from noise.** Diffusion models are trained by learning to predict and remove noise that has been progressively added to training images. Generation works by starting with pure noise and iteratively applying the learned denoising process, guided by a conditioning signal (text, image reference, etc.). This is fundamentally different from the adversarial training process of GANs. Diffusion models have largely superseded GANs for high-quality image generation tasks due to better training stability and output diversity.
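
The noise-addition ("forward") process that a diffusion model learns to reverse fits in one line. The sketch below follows the standard DDPM-style formulation (numpy; `alpha_bar` is assumed to be the cumulative product of noise-schedule coefficients, and the trained denoiser that would drive the reverse loop is omitted):

```python
import numpy as np

def forward_noise(x0, t, alpha_bar, rng):
    """q(x_t | x_0): blend the clean sample x0 with Gaussian noise.
    At early timesteps the output is nearly clean; at the final
    timestep it is nearly pure noise. Training teaches a network
    to predict eps from x_t, which enables the reverse process."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps

# Generation runs the reverse direction: start from pure noise and
# repeatedly subtract the noise predicted by the trained network,
# optionally guided by a conditioning signal such as a text prompt.
```

The single blending equation, applied across a schedule of timesteps, is the entire forward process; all of the model's learned capacity goes into undoing it.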

Question 20

The term "GAN fingerprint" in deepfake forensics refers to:

A) The biometric facial patterns that identify the specific person depicted in a deepfake
B) Statistical patterns in the high-frequency content of images characteristic of specific GAN architectures
C) Metadata automatically embedded in GAN-generated images identifying the generation system
D) The specific visual artifacts produced by the GAN's discriminator during training

Answer: **B) Statistical patterns in the high-frequency content of images characteristic of specific GAN architectures.** GAN fingerprints are statistical regularities in the pixel-level (high-frequency) content of GAN-generated images that arise from the specific convolution operations and upsampling methods used in different GAN architectures. These patterns are not visible to the naked eye but can be detected through Fourier transform analysis or CNN-based detectors. They can identify not just that an image is GAN-generated, but which specific GAN architecture generated it — analogous to a forensic signature.
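
Frequency-domain fingerprint analysis starts from the 2-D Fourier spectrum of an image. A minimal sketch (numpy only; real detectors average spectra over many images and feed them to a classifier rather than inspecting a single image):

```python
import numpy as np

def log_spectrum(image: np.ndarray) -> np.ndarray:
    """Log-scaled 2-D FFT magnitude with the zero frequency centred.
    The periodic grid artifacts left by upsampling layers in GAN
    generators appear as bright off-centre peaks in this spectrum."""
    centred = image - image.mean()               # suppress the DC spike
    f = np.fft.fftshift(np.fft.fft2(centred))
    return np.log1p(np.abs(f))
```

Comparing averaged spectra across sets of images attributed to different generators is what allows the architecture-level identification described in the answer.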

Question 21

The Zelensky surrender deepfake video (March 2022) was:

A) So convincing that it was briefly believed by Ukrainian military commanders
B) Crude by contemporary standards and quickly identified as fake by analysts
C) Produced by a state-sponsored team with sophisticated post-production capabilities
D) Successfully removed from all social media platforms within one hour of posting

Answer: **B) Crude by contemporary standards and quickly identified as fake by analysts.** The March 2022 Zelensky deepfake was described by researchers as crude — with a misshapen head, flat audio, and poor lip synchronization. It was quickly identified as synthetic by deepfake researchers and was rapidly debunked by Zelensky himself in a real video. Notably, the video was not produced by a sophisticated state actor but by what appear to have been relatively unsophisticated actors. It is significant primarily for occurring in an active military conflict, not for its technical quality.

Question 22

Which of the following describes the correct understanding of cheap fakes versus deepfakes in terms of actual harm caused?

A) Deepfakes cause substantially more harm than cheap fakes because they are harder to detect
B) Cheap fakes have often caused significant harm equal to or exceeding deepfakes, because they can be produced faster and are widely accessible
C) Cheap fakes are rare compared to deepfakes and cause minimal harm
D) Only deepfakes cause measurable harm; cheap fakes are generally recognized immediately

Answer: **B) Cheap fakes have often caused significant harm equal to or exceeding deepfakes, because they can be produced faster and are widely accessible.** Research and documented case studies consistently show that cheap fakes — simple manipulations like the Pelosi slowed video — have spread widely and caused real harm, sometimes more quickly and broadly than sophisticated deepfakes. The simplicity of production means cheap fakes can be created and deployed rapidly, often faster than debunking can occur. The chapter emphasizes this to correct the assumption that sophistication equals harm; technical simplicity is often an advantage for disinformation operations.

Question 23

What is the fundamental epistemological implication of the deepfake era, as described in the chapter?

A) Video evidence will become completely worthless in legal and political contexts
B) AI will replace human judgment in evaluating the authenticity of media content
C) The presumptive authenticity of audiovisual content — that seeing is a reasonable basis for believing — must be substantially renegotiated
D) Only government-certified sources of video will be trusted by the public

Answer: **C) The presumptive authenticity of audiovisual content — that seeing is a reasonable basis for believing — must be substantially renegotiated.** The chapter argues that we are in the process of renegotiating the relationship between seeing and believing. For most of recorded history, photographs and videos carried an implicit claim to authenticity — they showed what was actually in front of the camera. Synthetic media dissolves this implicit claim. The appropriate response is not to stop believing anything (A) or to rely on AI verification (B) or government certification (D), but to develop verification practices appropriate to a world where the presumption of audiovisual authenticity is no longer warranted.

Question 24

Non-consensual intimate imagery (NCII) deepfakes differ from traditional "revenge porn" in which critical way?

A) NCII deepfakes are prosecuted more harshly under existing laws
B) NCII deepfakes require no genuine intimate imagery of the target — they can be generated from ordinary photographs
C) NCII deepfakes affect only celebrities and public figures
D) NCII deepfakes are easier to detect and remove than traditional NCII

Answer: **B) NCII deepfakes require no genuine intimate imagery of the target — they can be generated from ordinary photographs.** This is the critical difference that dramatically expands the potential victim population and the scale of the harm. Traditional NCII ("revenge porn") requires that the perpetrator have access to genuine intimate images of the target, which typically limits perpetrators to ex-partners or others who have breached trust. Deepfake NCII can be generated from any photograph of any person — including photos from public social media — meaning essentially any person is a potential victim.

Question 25

According to the chapter, which of the following produced the foundational academic paper defining the "Liar's Dividend" concept?

A) The MIT Media Lab, in their "Detect Fakes" research project
B) The Electronic Frontier Foundation, in a policy brief on AI regulation
C) Robert Chesney and Danielle Citron, in a 2019 California Law Review article
D) DARPA, in a report on synthetic media threats to national security

Answer: **C) Robert Chesney and Danielle Citron, in a 2019 California Law Review article.** The term "Liar's Dividend" was coined by law professors Robert Chesney (University of Texas) and Danielle Citron (University of Virginia) in "Deep Fakes: A Looming Challenge for Privacy, Democracy, and National Security," published in the California Law Review in 2019 (107(6):1753). This paper is among the most influential early academic analyses of deepfake technology's legal and social implications and established much of the conceptual vocabulary for subsequent academic and policy discussions.

Question 26

"Mode collapse" as a GAN training failure is most analogous to which of the following everyday scenarios?

A) A student who always gives the same answer to test questions regardless of what is asked, because that answer has always received partial credit
B) A student who gradually improves through practice but eventually reaches a performance plateau
C) A student who performs perfectly on the training material but fails all new tests
D) A student who gives diverse but consistently wrong answers across all topics

Answer: **A) A student who always gives the same answer to test questions regardless of what is asked, because that answer has always received partial credit.** This analogy captures the mode collapse dynamic: the generator finds one or a few outputs that reliably fool the discriminator and "collapses" to producing only those — just as a student who finds that answer X always gets partial credit will start giving answer X regardless of what is asked. The generator is optimizing for fooling the discriminator rather than for covering the full diversity of the real data distribution.

Question 27

What characteristic of audio deepfakes makes them a more mature and broadly deployed threat than video deepfakes as of 2024?

A) Audio deepfakes are cheaper to produce because they require less computing power
B) Audio deepfake detection technology does not yet exist
C) Audio requires less perceptual fidelity than video to be convincing, and verification culture for telephone voices is limited
D) Audio deepfakes are not covered by any regulatory frameworks

Answer: **C) Audio requires less perceptual fidelity than video to be convincing, and verification culture for telephone voices is limited.** People are accustomed to varying audio quality in phone calls and recordings, which means audio deepfakes do not need to be perfect to be convincing. Additionally, most people have no habit or mechanism for independently verifying the identity of a voice on a phone call — they rely on recognition. Combined with the lower data requirements for voice cloning and the scalability of telephone-based fraud, these factors make audio deepfakes a more immediately practical fraud tool than video deepfakes.