Chapter 39 Exercises: AI, Generative Models, and the Future of Synthetic Media
Section A: Conceptual and Analytical Exercises
Exercise 1: Hallucination Taxonomy
Design a taxonomy of LLM hallucination types relevant to misinformation. Your taxonomy should distinguish at least four categories based on: (a) the nature of the false claim, (b) the mechanism by which the hallucination occurs, and (c) the harm potential in different information contexts. For each category, provide a specific illustrative example and explain why that type of hallucination is particularly dangerous. (500–700 words)
Exercise 2: Comparative Threat Assessment
Create a threat matrix comparing the following synthetic media types on five dimensions: production cost, required technical skill, detection difficulty, maximum persuasive impact, and scalability. The media types to compare are: (a) LLM-generated false news articles, (b) AI-generated realistic images, (c) voice-cloned audio deepfakes, (d) AI-generated video deepfakes, and (e) synthetic persona social media accounts. Use a 1–5 rating scale with written justification for each cell.
Exercise 3: C2PA Provenance Chain Analysis
You are a photo editor at a wire service. Describe, step by step, how a photograph would move through a complete C2PA-compliant workflow from camera capture through publication and reader verification. At each step, identify: what is added to the provenance manifest, who is responsible for that step, and what failure modes could break the chain. Conclude with an assessment of where the weakest link is in the chain.
Exercise 4: The Liar's Dividend in Practice
Research or construct a detailed hypothetical in which authentic evidence of a harmful event is plausibly denied using the claim that it is AI-generated. Your scenario should be realistic, specify the nature of the evidence, the bad actor using the denial, the audience targeted by the denial, and the mechanisms through which the denial spreads. Then design a counter-strategy: what verification approaches could establish authenticity against a determined liar's dividend attack? (600–800 words)
Exercise 5: Detection Tool Audit
Select any publicly available AI text detection tool (GPTZero, Originality.ai, Copyleaks, etc.). Design an experiment to empirically measure its false positive rate (human text misidentified as AI) and false negative rate (AI text not detected). Specify: the text samples you would use (types, sources, lengths), the experimental design, the metrics you would collect, and how you would interpret the results. If you can actually run the experiment, report your results.
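As a starting point for the experimental design, note that the two error rates reduce to a simple calculation over labeled trials. The sketch below is a minimal Python illustration; the `truth`/`verdict` field names and the sample records are hypothetical and not part of any detection tool's API.

```python
# Minimal sketch of the error-rate calculation for Exercise 5.
# Each record pairs a ground-truth label ("human" or "ai") with the
# tool's verdict; the structure is illustrative only.

def error_rates(results):
    """Return (false_positive_rate, false_negative_rate).

    False positive: human text flagged as AI.
    False negative: AI text not flagged.
    """
    humans = [r for r in results if r["truth"] == "human"]
    ais = [r for r in results if r["truth"] == "ai"]
    fp = sum(1 for r in humans if r["verdict"] == "ai")
    fn = sum(1 for r in ais if r["verdict"] == "human")
    fpr = fp / len(humans) if humans else 0.0
    fnr = fn / len(ais) if ais else 0.0
    return fpr, fnr

sample = [
    {"truth": "human", "verdict": "ai"},     # a false positive
    {"truth": "human", "verdict": "human"},
    {"truth": "ai", "verdict": "ai"},
    {"truth": "ai", "verdict": "human"},     # a false negative
]
print(error_rates(sample))  # (0.5, 0.5)
```

The hard part of the exercise is not this arithmetic but the sampling: both rates are only meaningful relative to the text types, sources, and lengths you feed in.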
Exercise 6: EU AI Act Compliance Analysis
You are the head of trust and safety at a mid-sized AI company that provides a text generation API. Analyze which provisions of the EU AI Act apply to your service and what compliance steps your company would need to take. Address: general purpose AI model obligations, transparency requirements for generated content, technical measures for AI content detection, and what would happen to your compliance strategy if your model were open-sourced. (600–800 words)
Exercise 7: Personalized Disinformation Design Exercise
This is an adversarial thinking exercise for defensive purposes. You are a researcher designing a study to test how personalized AI-generated disinformation compares to generic disinformation in persuasive impact. Specify: the research design, the personalization variables (what information about subjects would be used), the outcome measures, the ethical safeguards, and how you would debrief participants. What would a significant effect in this study imply for platform policy and content moderation?
Exercise 8: Goldstein et al. Replication Design
The Goldstein et al. (2023) study found that GPT-3-generated persuasive messages were roughly as effective as human-written ones. Design a study that would test whether this result generalizes to: (a) a different AI model (specify which), (b) a different political topic domain, and (c) a different delivery mechanism (social media posts vs. email vs. chatbot conversation). Identify the design choices required and state your hypotheses about how the results might differ.
Exercise 9: Newsroom Policy Drafting
You are advising a regional news organization on its policy for AI-generated content. Draft a comprehensive policy covering: (a) permitted uses of AI in newsgathering and production, (b) disclosure requirements when AI is used, (c) verification procedures for external content that may be AI-generated, (d) standards for images and multimedia, and (e) staff training requirements. The policy should be practical for a newsroom with limited resources.
Exercise 10: Political Advertising Regulation Analysis
Compare the AI disclosure requirements for political advertising in three jurisdictions: the United States (federal), California, and one EU member state of your choice. For each, specify: what is required, when it applies, what the penalties are, and what is not covered. Then evaluate: which approach is most likely to achieve its protective goals, and what improvements would you recommend for any of these frameworks?
Section B: Applied and Technical Exercises
Exercise 11: Feature Engineering for AI Text Detection
Without using existing AI detection tools, design a set of stylometric features that you hypothesize could distinguish AI-generated from human-written text. For each feature, explain the theoretical basis for why it might be a discriminating signal. Then discuss: how you would validate these features experimentally, what baseline comparison you would use, and what accuracy rate would be needed for practical deployment.
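To make the exercise concrete, a few candidate features can be computed with nothing but the standard library. The sketch below is illustrative, not a validated detector; the hypothesis that AI text tends toward lower lexical variety and more uniform sentence lengths is exactly the kind of claim your validation experiment would need to test.

```python
# Illustrative stylometric features for Exercise 11 (a sketch,
# not a validated detector).
import re
from statistics import mean, pstdev

def features(text):
    words = re.findall(r"[A-Za-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    sent_lens = [len(re.findall(r"[A-Za-z']+", s)) for s in sentences]
    return {
        # lexical variety: unique words / total words
        "type_token_ratio": len(set(words)) / len(words) if words else 0.0,
        # sentence-length uniformity ("burstiness"): human text often
        # mixes short and long sentences more than AI text does
        "sentence_len_mean": mean(sent_lens) if sent_lens else 0.0,
        "sentence_len_stdev": pstdev(sent_lens) if len(sent_lens) > 1 else 0.0,
    }

print(features("The cat sat. The cat sat again!"))
```

Each feature you propose should come with a comparable one-line computation plus the theoretical argument for why it discriminates.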
Exercise 12: Watermarking Tradeoffs
The Scott Aaronson / OpenAI watermarking approach embeds a detectable statistical signal in LLM outputs by biasing token selection using a cryptographic key. Analyze the tradeoffs of this approach along the following dimensions: (a) statistical detection power vs. text quality impact, (b) robustness to paraphrasing vs. computational overhead, (c) centralized key control vs. decentralized verification, and (d) effectiveness against adversarial evasion. What modifications to the basic scheme might improve its practical utility?
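To ground the analysis, the detection side of a keyed watermark can be sketched in a few lines. Note this follows the simpler "green list" family of schemes (a key-dependent pseudorandom partition of the vocabulary) rather than Aaronson's exact construction, but the detection logic, a z-test against the roughly 50% green fraction expected in unwatermarked text, illustrates the same statistical tradeoffs.

```python
# Simplified sketch of keyed watermark detection for Exercise 12.
# A "green list" variant, not Aaronson's exact scheme: generation
# would bias sampling toward green tokens; detection measures how far
# the observed green fraction exceeds the ~50% expected by chance.
import hashlib
from math import sqrt

def is_green(token: str, key: str) -> bool:
    """Key-dependent pseudorandom coin flip for one token."""
    digest = hashlib.sha256((key + ":" + token).encode()).digest()
    return digest[0] % 2 == 0

def green_fraction(tokens, key):
    return sum(is_green(t, key) for t in tokens) / len(tokens)

def z_score(tokens, key):
    """Standard deviations above the 50% green rate expected for
    unwatermarked text (H0: green count ~ Binomial(n, 0.5))."""
    n = len(tokens)
    return (green_fraction(tokens, key) - 0.5) * sqrt(n) / 0.5
```

Even this toy version exposes the tradeoffs in the exercise: detection power grows with text length, paraphrasing changes the tokens and erodes the signal, and whoever holds the key controls verification.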
Exercise 13: Synthetic Content Economic Modeling
Model the economics of AI-assisted disinformation production. Establish reasonable cost estimates (per article, per image, per audio clip) for both human-produced and AI-assisted disinformation content. Then model: (a) the cost of producing 1,000 pieces of disinformation content by each method, (b) the implications for the ratio of disinformation output to human fact-checking capacity, and (c) how the model changes as AI capability improves and costs continue to decline. Present your analysis with a simple spreadsheet or calculation.
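A starter calculation for part (a) might look like the following; every dollar figure here is a placeholder assumption to be replaced by your own researched estimates.

```python
# Starter cost model for Exercise 13. All figures are hypothetical
# placeholders ($/item); substitute your researched estimates.
COSTS = {
    "article": {"human": 50.0, "ai": 0.50},
    "image":   {"human": 30.0, "ai": 0.25},
    "audio":   {"human": 100.0, "ai": 1.00},
}

def batch_cost(item: str, method: str, n: int = 1000) -> float:
    """Total cost of producing n pieces of one content type."""
    return COSTS[item][method] * n

for item in COSTS:
    h, a = batch_cost(item, "human"), batch_cost(item, "ai")
    print(f"{item}: human ${h:,.0f} vs AI ${a:,.0f} ({h / a:,.0f}x cheaper)")
```

Extending the model for parts (b) and (c) means adding a fact-checking-capacity term (e.g. checks per fact-checker per day) and letting the AI-side costs decay over time.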
Exercise 14: Platform Labeling Design
You are a product manager at a social media platform designing a labeling system for AI-generated content. Specify your design choices: (a) what types of content require labels, (b) what the labels say and how they are displayed, (c) how the system would detect AI-generated content (creator attestation vs. automated detection vs. third-party verification), (d) what happens when creators falsely attest, and (e) how you would measure whether the labels achieve their intended effect on user trust and behavior.
Exercise 15: Voice Cloning Ethics and Policy
Several jurisdictions have enacted laws restricting non-consensual voice cloning, particularly for deceased individuals and political figures. Research the current state of voice cloning law in the United States and the EU. Then write a policy brief (400–600 words) recommending a comprehensive legal framework for voice cloning that balances: free expression, individual dignity and likeness rights, political integrity, and legitimate creative and commercial uses.
Section C: Media Literacy Applications
Exercise 16: Synthetic Image Analysis
Examine five images — some real, some AI-generated — and document your analysis process in detail. For each image, record: what specific features you examined, which features raised or lowered your suspicion of synthetic origin, your confidence level in your assessment, and what additional verification steps you took. After completing your analysis, verify the actual origin of each image and assess the accuracy of your judgments.
Exercise 17: AI Detection Self-Audit
Write a short essay (300–400 words) on any topic of your choice, then submit it to two different AI detection tools and record the results. Next, generate an essay on the same topic using an LLM and submit it to the same tools. Compare the results and analyze: what does this experiment suggest about the reliability of AI detection tools for practical use? What factors might explain any incorrect or inconsistent results?
Exercise 18: Curriculum Design
Design a one-session (90-minute) workshop on "Synthetic Media and Critical Thinking" for a target audience of your choice (high school students, senior citizens, journalists, politicians — specify). Provide: a session plan with timing, specific activities, the media literacy skills being taught, how you will handle the challenge that detection guidance becomes outdated quickly, and an assessment mechanism. Justify your pedagogical choices.
Exercise 19: Case Reconstruction
Research the 2024 New Hampshire primary Biden robocall incident in detail. Reconstruct the following: the technical methods used to clone Biden's voice, the distribution mechanism, the timeline from production to detection, the investigation that identified the source, the legal charges filed (if any), and the policy and regulatory responses at federal and state levels. What does the case reveal about the current adequacy of legal frameworks for deepfake political speech?
Exercise 20: Epistemic Resilience Assessment
The chapter discusses the "epistemic apocalypse" concern and the countervailing evidence. Conduct your own assessment of this debate: (a) identify the three strongest arguments that AI-generated synthetic media poses a catastrophic epistemic threat, (b) identify the three strongest counterarguments, and (c) develop your own position on the realistic magnitude of the epistemic risk and the most important responses. (700–900 words)
Section D: Research and Extended Analysis Exercises
Exercise 21: AI-Generated News Site Investigation
Identify three websites that display characteristics of AI-generated content farms (high article volume, lack of identifiable human journalists, generic or inconsistent bylines, lack of editorial contact information). Document the indicators of AI generation in each. Then trace: who is monetizing these sites (ad network data if available), whether they have been flagged by NewsGuard or similar raters, and how they spread their content. What would you recommend to platform moderation teams based on your findings?
Exercise 22: Literature Review
Write a structured literature review (800–1,000 words) of the empirical research on AI-generated content detection. Cover: the major studies on accuracy of human detection of AI text, the major studies on accuracy of automated detection tools, the research on watermarking effectiveness, and the key gaps in current knowledge. Conclude with a research agenda identifying the three most important unanswered questions.
Exercise 23: International Comparison
Compare the regulatory approaches to AI-generated synthetic media taken by the EU AI Act, China's "Deep Synthesis" regulations (effective 2023), and a country of your choice with minimal AI-specific regulation. For each jurisdiction, describe: what is required, the enforcement mechanism, the penalties, and the political and economic context that shaped the approach. Then evaluate: which approach is most likely to reduce AI-generated disinformation without suppressing legitimate AI uses?
Exercise 24: Technical Deep Dive: C2PA Implementation
Study the C2PA technical specification (available at c2pa.org). Write a technical summary (600–800 words) covering: the data model for manifests, the cryptographic mechanisms used for signing and verification, how the "hard binding" of manifests to content is achieved, how the specification handles conflicts between manifest claims and content, and what the spec's own documentation identifies as its known limitations.
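Before reading the spec, it may help to see the core idea of hard binding in toy form. The sketch below is emphatically not C2PA (real manifests use JUMBF containers and X.509 certificate chains, not a shared demo key); it only shows why any edit to the asset bytes invalidates a signed hash claim.

```python
# Toy illustration of "hard binding" for Exercise 24. NOT the C2PA
# format: a bare SHA-256 of the asset bytes plus an HMAC stands in
# for the real manifest structure and certificate-based signatures.
import hashlib
import hmac

SIGNING_KEY = b"demo-key"  # stand-in for a real private signing key

def make_manifest(asset: bytes) -> dict:
    """Bind a signed hash claim to the exact asset bytes."""
    digest = hashlib.sha256(asset).hexdigest()
    sig = hmac.new(SIGNING_KEY, digest.encode(), hashlib.sha256).hexdigest()
    return {"content_hash": digest, "signature": sig}

def verify(asset: bytes, manifest: dict) -> bool:
    """Recompute the hash and signature; any byte change breaks both."""
    digest = hashlib.sha256(asset).hexdigest()
    sig = hmac.new(SIGNING_KEY, digest.encode(), hashlib.sha256).hexdigest()
    return (digest == manifest["content_hash"]
            and hmac.compare_digest(sig, manifest["signature"]))

photo = b"raw image bytes"
m = make_manifest(photo)
print(verify(photo, m))            # True
print(verify(photo + b"edit", m))  # False
```

Your summary should explain how the actual specification achieves this binding for specific media formats, and how it handles legitimate edits (new manifests referencing prior ones) rather than simply failing.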
Exercise 25: Personal Media Literacy Audit
Conduct a 72-hour media diet audit with specific attention to AI-generated content. For all information you consume: record the content, the source, whether it might be AI-generated, and what signals led you to that assessment. At the end, analyze: what proportion of your information diet might plausibly be AI-generated or AI-assisted, which contexts made you most vulnerable to undetected synthetic content, and how your information-consumption habits changed (if at all) when you were actively monitoring for AI-generated content.
Exercise 26: Policy Memo
You are a staff analyst at the Federal Election Commission. Draft a policy memo (600–800 words) recommending whether the FEC should adopt a rule requiring disclosure of AI-generated content in political advertising, addressing: the legal authority for such a rule under the Federal Election Campaign Act, the specific content that should trigger disclosure, the disclosure format and timing, enforcement mechanisms, and the arguments for and against the recommended approach that the Commission should address in its rulemaking record.
Exercise 27: Scenario Analysis
Develop a detailed scenario analysis of a hypothetical AI-generated disinformation incident in the context of a state-level election: specify the actors, the synthetic content created (type, content, volume), the distribution channels used, the detection timeline, the verification response by fact-checkers and the platform, and the ultimate impact on the election if any. Then analyze: what interventions — technical, policy, media literacy — at what points in the scenario could have reduced harm?
Exercise 28: Comparative AI Safety Analysis
Different AI model providers have implemented different safety measures to prevent their systems from being used for disinformation: content moderation filters, usage policies, monitoring systems, and generation restrictions. Compare the safety approaches of two major AI providers (OpenAI and Google, for example) specifically with respect to disinformation prevention. What are the gaps in each approach, and how might determined bad actors circumvent them?
Exercise 29: Ethics of Synthetic Media Research
Researchers studying AI-generated misinformation often need to create synthetic disinformation for experimental purposes — to test detection tools, measure persuasive impact, and train defensive systems. Analyze the ethical dimensions of this research practice. What safeguards should be required? What review processes should govern such research? How do the benefits of the research weigh against the risk that the methods, datasets, or findings could themselves be misused? Draw on existing research ethics frameworks (IRB standards, the Belmont Report) to ground your analysis.
Exercise 30: Synthesis and Future Scenario
Based on everything you have learned in this chapter, write a 1,000-word scenario describing the AI-generated information environment of 2030. Your scenario should be based on realistic projections of: AI capability improvement (considering historical rate of progress), likely regulatory developments, platform response evolution, and media literacy education outcomes. Identify the three most significant factors that will determine whether the 2030 information environment is better or worse than the current one, and explain your reasoning.