Further Reading: Generative AI: Ethics of Creation and Deception

The sources below provide deeper engagement with the themes introduced in Chapter 18. They are organized by topic and include a mix of foundational texts, empirical research, legal analyses, and investigative journalism. Annotations describe what each source covers and why it is relevant to the chapter's core questions.


Generative AI: Technical and Social Foundations

Bender, Emily M., Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?" Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (FAccT), 610-623. ACM, 2021. The paper that became a flashpoint in AI ethics debates — and contributed to the dismissal of two prominent AI ethics researchers at Google. Bender et al. argue that large language models carry underappreciated risks: environmental costs, training data biases, the illusion of understanding, and the potential for generating convincing but unreliable text. The paper's analysis of LLMs as "stochastic parrots" — systems that produce fluent text without understanding meaning — is foundational to the chapter's discussion of hallucination and epistemic risk.
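
The "stochastic parrot" framing can be made concrete with a toy sketch. The Python fragment below (an invented illustration, not Bender et al.'s method; the corpus, the parrot function, and all names are made up for the example) generates text purely by sampling word successors from co-occurrence counts. The output is often locally fluent, yet nothing in the system represents meaning, which is the paper's central point in miniature.

    import random
    from collections import defaultdict

    # Tiny toy corpus, chosen only for illustration.
    corpus = ("the model generates fluent text . the model predicts the "
              "next word . the next word follows the previous word .").split()

    # Record, for each word, the words observed to follow it.
    successors = defaultdict(list)
    for prev, nxt in zip(corpus, corpus[1:]):
        successors[prev].append(nxt)

    def parrot(start="the", length=12):
        """Sample a sequence from bigram statistics alone.

        Locally fluent output, but the system only replays
        co-occurrence counts; no meaning is represented anywhere.
        """
        words = [start]
        for _ in range(length):
            options = successors.get(words[-1])
            if not options:
                break
            words.append(random.choice(options))
        return " ".join(words)

    print(parrot())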

Weidinger, Laura, et al. "Ethical and Social Risks of Harm from Language Models." arXiv preprint arXiv:2112.04359 (2021). A comprehensive taxonomy of harms from large language models, produced by researchers at DeepMind. The paper identifies six risk areas: discrimination, exclusion, and toxicity; information hazards; misinformation harms; malicious uses; human-computer interaction harms; and environmental and socioeconomic harms. Useful as a systematic reference for understanding the full range of risks that generative AI presents, beyond the specific topics covered in the chapter.

Crawford, Kate. Atlas of AI: Power, Politics, and the Planetary Costs of Artificial Intelligence. New Haven: Yale University Press, 2021. Crawford maps the material infrastructure of AI — from the lithium mines that supply hardware to the data centers that consume energy to the human labor that trains models. Her analysis of AI as an extractive industry, not a neutral technology, provides essential context for the chapter's discussion of training data ethics and ghost work. The book demonstrates that AI's costs are distributed globally and fall disproportionately on communities with the least power.



Copyright, Fair Use, and Training Data

Lemley, Mark A., and Bryan Casey. "Fair Learning." Texas Law Review 99, no. 4 (2021): 743-785. A legal analysis of whether machine learning constitutes fair use of copyrighted material. Lemley and Casey argue that learning from copyrighted works — as distinct from copying them — is a transformative use that should be protected. The paper presents the strongest legal argument in favor of AI training on copyrighted data and is essential reading for understanding the fair use debate at the center of the AI art controversy.

Samuelson, Pamela. "Generative AI Meets Copyright." Science 381, no. 6654 (2023): 158-161. A concise analysis by one of the leading copyright scholars in the United States. Samuelson examines the legal questions raised by generative AI from both the training-data side (is scraping copyrighted works infringement?) and the output side (is AI-generated content copyrightable?). She identifies the factors courts are likely to weigh and offers predictions about outcomes. An accessible and authoritative entry point for students interested in the copyright dimensions of generative AI.

Sag, Matthew. "Copyright and Copy-Reliant Technology." Northwestern University Law Review 103, no. 4 (2009): 1607-1682. Although written before the generative AI era, Sag's analysis of "copy-reliant technology" — systems that must copy works in order to analyze them — provides an important legal framework for understanding AI training. Sag argues that intermediate copying for the purpose of extracting non-copyrightable information (such as patterns and styles) may be permissible under fair use. His framework is frequently cited in current AI copyright litigation.
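
Sag's idea of non-expressive, intermediate copying can be illustrated in miniature. The sketch below (hypothetical, not drawn from the paper; the extract_statistics function and sample text are invented) copies a text only transiently in order to compute uncopyrightable facts about it, such as word frequencies, after which the copy can be discarded.

    from collections import Counter

    def extract_statistics(text: str) -> dict:
        """Copy a work transiently; keep only non-expressive metadata.

        The returned statistics describe patterns in the work;
        the expressive content itself is not retained.
        """
        tokens = text.lower().split()
        freq = Counter(tokens)
        return {
            "token_count": len(tokens),
            "vocabulary_size": len(freq),
            "avg_word_length": sum(len(t) for t in tokens) / max(len(tokens), 1),
            "top_words": freq.most_common(3),
        }

    sample = "It was the best of times, it was the worst of times."
    print(extract_statistics(sample))
    # The intermediate copy can now be discarded; only patterns remain.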


Deepfakes, Synthetic Media, and Democratic Governance

Chesney, Robert, and Danielle Citron. "Deep Fakes: A Looming Challenge for Privacy, Democracy, and National Security." California Law Review 107 (2019): 1753-1819. The foundational legal analysis of deepfake threats. Chesney and Citron identify three categories of harm — to individuals, to organizations, and to society — and evaluate the adequacy of existing legal remedies. They introduce the concept of the "liar's dividend" and argue that deepfakes pose a structural threat to the epistemic foundations of democracy. Written before the 2024 election cycle, the paper proved remarkably prescient.

Paris, Britt, and Joan Donovan. "Deepfakes and Cheap Fakes: The Manipulation of Audio and Visual Evidence." Data & Society Research Institute, 2019. Paris and Donovan distinguish between sophisticated AI deepfakes and simpler manipulations ("cheap fakes" — slowed video, decontextualized clips, basic edits) and argue that cheap fakes may pose an equal or greater threat because they require no technical sophistication to create. The report's emphasis on the distribution mechanisms (platforms, messaging apps, social media) rather than the production technology is particularly relevant to the chapter's analysis of how deepfakes spread during elections.

Vaccari, Cristian, and Andrew Chadwick. "Deepfakes and Disinformation: Exploring the Impact of Synthetic Political Video on Deception, Uncertainty, and Trust in News." Social Media + Society 6, no. 1 (2020): 1-13. An empirical study examining how exposure to deepfake political content affects trust in news media. Vaccari and Chadwick find that exposure to deepfakes increases uncertainty about the authenticity of all video content — even content that is genuine. This finding supports the chapter's argument that the epistemic harm of deepfakes extends far beyond the specific content fabricated.


AI Labor, Ghost Work, and the Creative Economy

Gray, Mary L., and Siddharth Suri. Ghost Work: How to Stop Silicon Valley from Building a New Global Underclass. Boston: Houghton Mifflin Harcourt, 2019. The essential text on the hidden human labor that powers AI systems. Gray and Suri document the working conditions of data annotators, content moderators, and task workers on platforms like Amazon Mechanical Turk, and argue that "ghost work" constitutes a new form of invisible labor that is essential to the AI industry but excluded from its benefits. Their analysis directly informs the chapter's treatment of annotation labor and the power asymmetry between AI companies and their workers.

Perrigo, Billy. "Exclusive: OpenAI Used Kenyan Workers on Less Than $2 Per Hour to Make ChatGPT Less Toxic." Time, January 18, 2023. The investigative report that revealed the working conditions of Kenyan data annotators who labeled toxic content for OpenAI's ChatGPT. Workers described viewing hundreds of descriptions of violence, sexual abuse, and self-harm daily, for wages of approximately $2 per hour. Several reported symptoms of PTSD. The article is a primary source for the chapter's discussion of annotation labor and is essential reading for understanding the human costs hidden behind generative AI systems.

Jiang, Harry H., et al. "AI Art and Its Impact on Artists." Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society, 363-374. ACM, 2023. An empirical study of generative AI's impact on visual artists, based on surveys and interviews with professional artists. The researchers document economic impacts (income loss, market displacement), psychological impacts (devaluation of creative identity), and collective responses (opt-out campaigns, legal action). The paper provides the empirical foundation for the chapter's discussion of artist displacement and the AI art controversy case study.


Watermarking, Provenance, and Governance Tools

Coalition for Content Provenance and Authenticity (C2PA). "C2PA Technical Specification." Version 1.3, 2024. The technical specification for the C2PA standard described in Section 18.6. The document details how provenance metadata is embedded in digital content using cryptographic signatures, creating a chain of custody from creation through editing and distribution. While technical, the specification's introductory sections are accessible and provide concrete detail for understanding how content provenance works in practice.
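
To make the chain-of-custody idea concrete, the sketch below mimics the shape of a provenance manifest using only Python's standard library. It is a simplified stand-in, not the C2PA format: real manifests use certificate-based signatures and binary encodings, whereas this toy signs a JSON manifest with an HMAC key (SIGNING_KEY, make_manifest, and verify are all invented for the example).

    import hashlib, hmac, json

    SIGNING_KEY = b"demo-key"  # stand-in for a real signing certificate

    def make_manifest(content: bytes, claim: dict) -> dict:
        """Bind provenance claims to content via its hash, then sign."""
        manifest = {
            "content_sha256": hashlib.sha256(content).hexdigest(),
            "claim": claim,  # e.g. tool used, edits applied, prior manifest
        }
        payload = json.dumps(manifest, sort_keys=True).encode()
        manifest["signature"] = hmac.new(
            SIGNING_KEY, payload, hashlib.sha256).hexdigest()
        return manifest

    def verify(content: bytes, manifest: dict) -> bool:
        """Check the signature and that the content is unaltered."""
        sig = manifest.pop("signature")
        payload = json.dumps(manifest, sort_keys=True).encode()
        expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
        manifest["signature"] = sig
        return (hmac.compare_digest(sig, expected)
                and manifest["content_sha256"]
                    == hashlib.sha256(content).hexdigest())

    image = b"...raw image bytes..."
    m = make_manifest(image, {"generator": "example-model", "action": "created"})
    print(verify(image, m))          # True: chain of custody intact
    print(verify(image + b"x", m))   # False: content altered after signing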

Gu, Zhongjie, et al. "A Survey on Deepfake Detection." IEEE Transactions on Knowledge and Data Engineering 35, no. 12 (2023): 12596-12616. A comprehensive survey of deepfake detection methods, covering both image/video detection and audio detection. The paper evaluates the effectiveness of different approaches (frequency analysis, biological signal detection, neural network classifiers) and documents the arms race between generation and detection. Useful for understanding why technical detection alone cannot solve the deepfake governance challenge.
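
As a flavor of the frequency-analysis family the survey covers, the sketch below computes how much of an image's spectral energy lies outside a low-frequency core, a statistic some detectors use because certain generation pipelines leave characteristic spectral artifacts. Everything here (the synthetic arrays, the core size, high_freq_ratio itself) is invented for illustration; a real detector would be calibrated on labeled data.

    import numpy as np

    def high_freq_ratio(img: np.ndarray) -> float:
        """Fraction of spectral energy outside the central low-freq core."""
        spectrum = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2
        h, w = spectrum.shape
        core = spectrum[h // 4: 3 * h // 4, w // 4: 3 * w // 4]
        total = spectrum.sum()
        return float((total - core.sum()) / total)

    rng = np.random.default_rng(0)
    smooth = rng.random((64, 64)).cumsum(0).cumsum(1)  # low-frequency heavy
    noisy = rng.random((64, 64))                       # flat spectrum
    print(high_freq_ratio(smooth), high_freq_ratio(noisy))
    # A toy detector would flag images whose ratio departs from the
    # range observed for authentic camera images.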


Policy and Regulation

European Parliament. "Regulation (EU) 2024/1689 Laying Down Harmonised Rules on Artificial Intelligence (Artificial Intelligence Act)." Official Journal of the European Union, 2024. The EU AI Act includes specific provisions for generative AI: disclosure requirements for AI-generated content, transparency obligations for model providers, and rules governing the use of copyrighted material in training data. As the most comprehensive AI governance framework in the world, it provides essential context for the chapter's discussion of legislative responses. The Act's treatment of "general-purpose AI models" is particularly relevant to LLM governance.

U.S. Copyright Office. "Copyright Registration Guidance: Works Containing Material Generated by Artificial Intelligence." Federal Register 88, no. 51 (March 16, 2023): 16190-16194. The Copyright Office's guidance on AI-generated content registration — ruling that purely AI-generated content cannot be copyrighted but that works involving substantial human creative contribution may be eligible. The guidance has been applied in several registration decisions and is the most authoritative U.S. statement on AI authorship to date.


These readings are starting points, not endpoints. The generative AI landscape evolves faster than any textbook can capture. Students are encouraged to supplement these sources with current reporting from outlets covering AI policy, law, and ethics.