Chapter 37: AI-Generated Content and Synthetic Media

"I generated seventeen news articles about a fictional city council vote last night. It took forty-five minutes." — Tariq Hassan, opening Part 7


Opening: Forty-Five Minutes

The room had already been getting noisier for ten minutes before class was supposed to start. Someone had forwarded a link. Then someone else had. By the time Professor Marcus Webb walked in and set his bag on the desk, at least six students had their phones out and were reading the same thing.

Tariq Hassan was already standing at the front of the room, phone face-down on the table beside the projector input. He looked like he was waiting.

"Should I go ahead?" he asked Webb, who hadn't even sat down yet.

Webb looked at him. Then at the class. Then back at Tariq. "Go ahead."

Tariq picked up the phone, plugged it in. Seventeen screenshots filled the projector. News articles. Datelines from a city called Millbrook, Ohio. Headlines about a city council vote on a zoning variance for a warehouse development in a residential neighborhood. Protests at city hall. A quote from a councilmember named Patricia Yuen, expressing concerns about traffic impact. A quote from a developer named Robert Strand, emphasizing the jobs the project would create. An editorial. A follow-up article three days later reporting the vote had passed, four to three, amid community opposition.

The writing was clean. The tone was recognizable — that particular register of local civic journalism, slightly formal, slightly staid, concerned with exact vote counts and the names of people who had spoken during public comment.

"Millbrook, Ohio doesn't have a city council," Tariq said. "Patricia Yuen and Robert Strand don't exist. The zoning variance was fictional. The warehouse doesn't exist. The protests didn't happen. I wrote the prompt, I did some light editing on two of the articles, and the rest came out more or less finished."

He let that sit.

"Forty-five minutes," he said. "That includes the time I spent choosing a plausible Ohio city name."

Sophia Marin, two rows back, said what several people were thinking: "Okay, but those are obviously fake, right? If you know the city is fake?"

"Sure," Tariq said. "But what if you don't know? What if you'd never heard of Millbrook, Ohio? What if you Googled it and found a result — which you would, because there are small cities named Millbrook in several states — and the article used a real zip code?"

Webb had finally sat down. He was looking at the screenshots with an expression that Ingrid Larsen, in the back row, would later describe as "the face of a man who's been dreading this conversation."

"Here's what I want to say," Webb said. "I've been teaching this course for eleven years. I've watched a lot of things change. This one — I'm going to be honest with you — no one knows how this ends."

It was possibly the most alarming thing he had said all semester.


37.1 What AI-Generated Content Is

To understand why Tariq's forty-five-minute experiment is significant, we need to understand what he actually did — and why it represents something different from the forms of automated content production that came before it.

Generative AI and Large Language Models

The term "artificial intelligence" covers an enormous range of technologies, from simple rule-based systems to complex neural networks. What Tariq used belongs to a specific category: generative AI, and more specifically, large language models (LLMs).

A large language model is a type of artificial intelligence system trained on vast quantities of text — billions of documents drawn from the internet, books, academic papers, news archives, and other sources. During training, the model learns statistical patterns in language: which words tend to follow which other words, how sentences are constructed, how arguments are made, how different genres of writing differ in tone and structure. The training process involves billions of adjustable parameters that gradually shift to improve the model's ability to predict and generate text.

The key mechanism is next-token prediction: given a sequence of text, the model predicts what word (or sub-word unit, called a "token") is most likely to come next. This sounds simple. The emergent result of doing this across billions of training examples, with hundreds of billions of parameters, is a system that can produce fluent, coherent, contextually appropriate text across an extraordinary range of styles, topics, and purposes.

The critical distinction from earlier AI systems is this: LLMs do not retrieve existing text and modify it. They generate text that did not previously exist. When Tariq prompted the model to produce a news article about a city council vote, the model did not find a news article and swap in new names. It generated, token by token, text that coheres as a news article about the specific fictional scenario Tariq described — because its training taught it what news articles about city council votes look, sound, and read like.
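
To make the mechanism concrete, here is a deliberately tiny sketch in Python. The bigram table standing in for the "model" is a hypothetical toy (a real LLM replaces it with a neural network holding billions of learned parameters), but the generation loop (score the candidate next tokens, sample one, append it, repeat) has the same basic shape.

```python
import random

# A toy "language model": counts of which token follows which, learned
# from a tiny corpus. Real LLMs replace this table with a neural network
# holding billions of parameters, but the generation loop is analogous.
corpus = "the council voted on the zoning variance . the council approved the variance .".split()

bigram_counts: dict[str, dict[str, int]] = {}
for prev, nxt in zip(corpus, corpus[1:]):
    bigram_counts.setdefault(prev, {}).setdefault(nxt, 0)
    bigram_counts[prev][nxt] += 1

def next_token(prev: str) -> str:
    """Sample the next token in proportion to how often it followed `prev`."""
    candidates = bigram_counts.get(prev, {"<end>": 1})
    tokens, counts = zip(*candidates.items())
    return random.choices(tokens, weights=counts, k=1)[0]

def generate(start: str, max_tokens: int = 12) -> str:
    out = [start]
    for _ in range(max_tokens):
        tok = next_token(out[-1])
        if tok == "<end>":
            break
        out.append(tok)
    return " ".join(out)

print(generate("the"))  # e.g. "the council voted on the zoning variance ."
```

The point of the sketch is only that generation is prediction repeated: nothing is retrieved or copied, and every token is chosen fresh from the learned distribution.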

The Scope of Generative AI

Textual LLMs are the most widely accessible form of generative AI, but the category is broader:

Synthetic images are produced by systems like Midjourney, DALL-E, and Stable Diffusion, which generate photorealistic or stylized images from text descriptions. These systems can produce convincing photographs of people who do not exist, events that never happened, and documents that were never created.

Synthetic audio generation systems can clone a voice from a short audio sample and produce new speech in that voice saying anything the operator specifies. A politician's voice can be made to say things they never said; an expert's voice can deliver fabricated testimony.

Synthetic video — the technology underlying what are commonly called "deepfakes" — can place a person's face and voice onto another person's body, produce realistic-seeming video of events that never occurred, or animate still photographs into speaking figures. This technology is addressed in depth in Chapter 38.

Synthetic documents — PDFs, spreadsheets, research papers with appropriate formatting — can be generated with LLMs combined with document-production software, producing artifacts that look like official sources.

For the purposes of this chapter, we focus primarily on synthetic text: LLM-generated articles, social media posts, comments, and other text-based propaganda artifacts. The principles apply across modalities.

Why This Is Different

There is a temptation to classify LLM-generated content as simply "another form of fake news" — a faster version of something that has existed for decades. This classification is wrong in ways that matter.

Previous automated content production was characterized by:

  • Pre-written templates with fill-in-the-blank variables
  • Restriction to specific domains (financial reports, weather updates) where variation was predictable
  • High error rates outside the template structure
  • Stylistic artifacts that made the output recognizable as machine-produced

LLM-generated content operates under none of these constraints. It can produce coherent content in any domain, in any style, about any topic, without templates, with low error rates, and with stylistic characteristics that are increasingly difficult to distinguish from human writing. The constraint that limited previous automation — the need for human creativity and contextual judgment — has been substantially relaxed.

This is not merely an incremental improvement. It is, as we will examine, a qualitative change in the economics of content production.


37.2 A Brief History of Automated Content

The arrival of sophisticated generative AI did not emerge from nowhere. There is a history of automated and semi-automated content production that helps us understand what is genuinely new.

Early Automation: Templated Reporting

The Associated Press began using automated writing software in 2014 to produce quarterly earnings reports for publicly traded companies. The system, developed by Automated Insights, used templates populated with financial data to produce thousands of brief earnings summaries per quarter — far more than could be produced by human reporters. The articles were recognizably formulaic but served their purpose: conveying factual financial information to audiences who needed it.

This model — template plus structured data — was the dominant paradigm for automated writing throughout the 2010s. It worked well for domains with predictable structure and reliable data inputs. It produced text that was functional but identifiably automated. An earnings report generated by an algorithm reads like an earnings report generated by an algorithm: grammatically correct, informationally complete, stylistically flat.

Bot Networks: Scale Without Quality

Social media platforms in the 2010s became home to enormous networks of automated bot accounts producing and amplifying content. These bots served several functions: amplifying messages by liking, sharing, and retweeting; creating the appearance of trending topics by generating coordinated activity; and in more sophisticated operations, producing text posts that mimicked human engagement.

Early bot text was easy to identify. The content was repetitive, grammatically imperfect, contextually inappropriate. Platform algorithms eventually became reasonably effective at detecting and removing simple bot accounts based on behavioral signatures — posting frequency, account age, network patterns. The text quality was rarely the limiting factor because the detection methods focused on behavior rather than content.

The IRA Operation: Human Scale

The Internet Research Agency's 2016 influence operation, documented extensively in Chapter 20, represents the high-water mark of human-labor-based large-scale propaganda production. At its peak, the IRA employed approximately 1,000 people in a St. Petersburg office complex, working in shifts, operating multiple fake social media accounts each, producing original English-language content targeted at American audiences on divisive political topics.

The operation was genuinely impressive in scale and sophistication. Operators studied American political culture, developed distinct fictional personas with coherent backstories and posting histories, and produced content that successfully passed as authentic American grassroots expression. The operation's content was qualitatively superior to earlier bot-generated text precisely because it was produced by humans.

But it was expensive. It required physical infrastructure, management, training, quality control, and the ongoing human labor of a medium-sized company. Its output was, roughly, what a thousand people working intensively on content creation could produce.

The GPT Discontinuity

The release of GPT-3 by OpenAI in 2020, and the subsequent release of GPT-4 in 2023 and various comparable systems, represented a qualitative break from all previous automated content generation.

GPT-3's output, when prompted appropriately, was fluent, coherent, contextually appropriate, and difficult for casual readers to distinguish from human writing. GPT-4 improved on this substantially: the output was not merely fluent but capable of mimicking specific styles, sustaining complex arguments, producing appropriate emotional register, and generating content that passed review by professional editors when they were not specifically looking for AI generation.

What changed was not just quality but access and cost. GPT-3 was available via API at a cost that made even high-volume generation economically trivial. A single human operator with API access could generate content at rates that would have required hundreds of human workers under the IRA model. By 2023, consumer-facing tools made LLM generation accessible without any technical expertise — the prompt-and-output workflow that Tariq demonstrated requires no programming knowledge whatsoever.

Ingrid Larsen, thinking about this timeline, noted something that had not been fully appreciated in earlier parts of the course: "The IRA operation that everyone cites as the scary example — that was the old scary. The new scary has been available for years."

She was right. What Tariq demonstrated in forty-five minutes is available to anyone with an internet connection and a credit card. In some form, it is available for free.


37.3 The Propaganda Applications of LLMs

With this foundation, we can map the specific applications of large language models to propaganda and disinformation operations. Five categories are particularly significant.

Category A: Content Farm Automation

The most straightforward application is the automated production of large volumes of disinformation articles at near-zero marginal cost. Tariq's seventeen articles about Millbrook, Ohio represent a toy version of this capability.

A serious content farm operation using LLMs can produce hundreds or thousands of articles per day, on any topic, in any desired style, targeting any desired geographic or demographic audience. The articles can be populated with plausible-sounding but fabricated names, institutions, statistics, and quotations. They can be written in the style of legitimate local journalism, academic reporting, or community newsletters. They can be formatted and posted automatically to websites that are visually indistinguishable from legitimate news sites.

The specific danger of the local news application deserves emphasis. As documented in Chapter 32, the collapse of local journalism in the United States and elsewhere has created "news deserts" — geographic areas with minimal or no professional local news coverage. Audiences in these areas are hungry for local news and have few reference points for identifying legitimate local sources. AI-generated "local" news sites can exploit this vacuum, producing content that appears to serve a community's informational needs while actually serving a propagandist's influence goals.

Category B: Persona Generation

Social media influence operations require not just content but credible sources for that content — fake people with convincing online identities. Building such personas manually, as the IRA did, requires significant labor investment per persona. An operator must establish posting history, develop a coherent personality and set of interests, build a following through genuine-seeming interaction over time, and then leverage that established credibility for influence purposes.

LLMs dramatically reduce the labor cost of this process. An operator can prompt an LLM to generate the complete posting history, profile biography, and apparent interests of a fictional person. Generative image tools can produce a photorealistic profile picture of a person who does not exist. The operator can then maintain the persona with ongoing LLM-generated posts, with the model maintaining consistency across the posting history.

More sophisticated implementations can maintain hundreds of such personas simultaneously, each with a distinct voice and history, interacting with each other and with real users to create the appearance of an authentic online community.

Category C: Targeted Message Personalization

One of the most powerful and underappreciated propaganda applications of LLMs is the ability to generate personalized persuasive messages tailored to specific individual targets.

Traditional propaganda is broadcast: the same message goes to everyone. Microtargeted advertising, made notorious by Cambridge Analytica and practiced widely since, allows different messages to reach different audiences based on demographic and psychographic profiling. But even microtargeted messages are produced in limited variants — a few dozen versions for a few dozen audience segments.

LLM-enabled personalization can theoretically go further: generating a different version of a persuasive message for every individual target, drawing on that individual's known interests, concerns, values, and vulnerabilities. A message designed to persuade a fifty-three-year-old small business owner in rural Ohio to distrust a particular policy proposal would be constructed differently from a message targeting a twenty-two-year-old progressive activist in Seattle — not just in emphasis but in the specific examples used, the emotional register, the framing, the cultural references, and the rhetorical moves.

This represents what researchers have called "hyper-personalized influence operations" — influence that scales while remaining individualized in ways that were previously possible only in face-to-face persuasion contexts.

Category D: Comment Flooding

Online discussion platforms — newspaper comment sections, Reddit, Twitter, Facebook, YouTube comments — serve as spaces where apparent public opinion is formed and observed. Research in political psychology documents that people's assessments of public opinion are significantly influenced by what they perceive the opinion distribution to be in online discussions (the "spiral of silence" phenomenon, relevant to Chapter 14's coverage of social norms).

LLMs enable large-scale injection of synthetic comments into online discussions, creating the appearance of a dominant opinion that does not actually exist among real participants. An operator can flood a news article's comment section with LLM-generated comments expressing outrage, support, skepticism, or any desired sentiment, drowning out authentic human voices and creating the impression that the synthetic perspective represents the view of "most people."

The qualitative improvement over earlier bot comments is crucial here. Bot-generated comments in the early 2010s were easily identifiable as synthetic — they were short, formulaic, and contextually inappropriate. LLM-generated comments can engage specifically with the article's content, reference related news events, express nuanced positions with apparent emotional authenticity, and respond to other comments in ways that simulate genuine dialogue.

Category E: Sockpuppet Networks

The term "sockpuppet" refers to a fake online identity operated to manipulate discussion — typically a second account operated by someone who also has a primary account, allowing them to appear to be multiple different people. At small scale, this is a familiar form of online manipulation.

At the scale enabled by LLMs, coordinated sockpuppet networks can manufacture the appearance of mass social movements, consensus scientific opinion, grassroots political mobilization, and community concern. An operator managing a network of LLM-maintained fake personas can simulate the organic development of a social movement — complete with internal disagreements, evolving positions, and apparent learning over time — while the entire network serves a single coordinating purpose.

The central danger of sockpuppet networks is what researchers call "manufactured consensus" — the false appearance that a position is widely held, which then functions as social proof, encouraging real people to adopt it.


37.4 The Scale Problem: Why This Is Different

Understanding the propaganda applications of LLMs requires understanding the economics of the change they represent.

The Labor Economics of Propaganda Before LLMs

Propaganda has always required labor. Goebbels's Reich Ministry of Public Enlightenment and Propaganda employed over a thousand staff at its height, producing an industrial volume of content across newspapers, radio, film, posters, and public events. The IRA employed approximately 1,000 people for its 2016 operation. These are not coincidental numbers — they reflect the actual human labor required to produce propaganda at effective scale.

This labor requirement was, historically, a constraint that provided some defense. A state or organization capable of deploying a thousand full-time propaganda workers is, almost by definition, a significant state or organization. The labor investment required some commitment, some organization, some resources that not every actor could command. Propaganda at scale was, in this sense, a high-capital endeavor.

The Cost Collapse

Large language models do not eliminate the labor of content production — they compress it dramatically.

A 2023 analysis by researchers at Georgetown University's Center for Security and Emerging Technology estimated that GPT-4-quality LLMs could produce content at a rate roughly equivalent to multiplying the productivity of a human content producer by a factor of between 40 and 100, depending on the type of content. For straightforward, templated content (routine news articles, social media posts, comment responses), the multiplication factor is at the higher end.

What this means in practical terms: an operation with the human labor resources of a small marketing department — perhaps ten to twenty people — can, with LLM assistance, produce content at volumes that previously required the organizational scale of the IRA. A single motivated individual with access to commercial LLM APIs can produce content at volumes that would have required a small team in the pre-LLM era.

The cost per article of AI-generated content, for a well-optimized operation, is fractions of a cent. Human-written articles at competitive rates cost between $20 and $200 depending on quality and length. This is not a modest efficiency improvement; it is the elimination of the marginal cost of content production as a meaningful constraint.
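
The arithmetic behind those claims can be made explicit. The sketch below simply plugs in the ranges quoted in this section plus two illustrative assumptions (an individual writer's daily output and a specific per-article API cost); the inputs are placeholders for discussion, not independent estimates.

```python
# Back-of-the-envelope comparison using the ranges cited in this section.
# All inputs are illustrative assumptions, not measured values.

human_cost_per_article = (20, 200)        # USD, quoted range for human-written articles
llm_cost_per_article = 0.005              # USD, assumed "fraction of a cent" per article
human_articles_per_day = 5                # assumed output of one full-time writer
productivity_multiplier = (40, 100)       # range quoted above

team_size = 15                            # "small marketing department"
team_output_low = team_size * human_articles_per_day * productivity_multiplier[0]
team_output_high = team_size * human_articles_per_day * productivity_multiplier[1]

print(f"LLM-assisted team of {team_size}: {team_output_low:,}-{team_output_high:,} articles/day")
print(f"Cost at LLM rates: ${team_output_high * llm_cost_per_article:,.2f}/day")
print(f"Same volume at human rates: ${team_output_high * human_cost_per_article[0]:,}-"
      f"${team_output_high * human_cost_per_article[1]:,}/day")
```

Even with conservative placeholder numbers, a fifteen-person operation lands in IRA-scale output at a daily content cost measured in tens of dollars.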

The Countermeasures Problem

Webb raised this directly when Tariq finished his demonstration. "So what do we do about it?" he asked.

The obvious first answer — human fact-checking — fails immediately on scale grounds. Professional fact-checking organizations can produce between fifty and two hundred fact-checks per day across their entire staff. A single well-resourced influence operation using LLMs can produce thousands of articles per day. The ratio of false claims to fact-check capacity was already badly out of balance before generative AI; it becomes nonsensical afterward.

Automated detection (discussed in Section 37.6) faces its own fundamental limitations. Platform moderation — the removal of inauthentic content by social media companies — can reduce distribution but cannot eliminate the harm from content that is seen before removal. Legal responses (labeling requirements, transparency mandates) require disclosure from actors who have every incentive not to disclose.

This is the core of what Webb meant when he said "no one knows how this ends." The existing infrastructure of counter-disinformation work was designed for a world in which the labor constraint on content production provided at least some natural limiting factor. That constraint is now largely gone, and the institutional responses have not yet caught up.

The Firehose Connection

Chapter 39 will examine in detail the Russian "firehose of falsehood" strategy — the deliberate production of such high volumes of contradictory, confusing, and unverifiable claims that audiences simply disengage from the effort of determining what is true. This strategy has historically been constrained by the labor cost of maintaining the volume.

LLM-generated content removes that constraint. A firehose of falsehood strategy powered by AI-generated content can produce volumes of false information that swamp the information environment not periodically or around specific events, but continuously and across every domain simultaneously. The cognitive burden on audiences — and on the institutions responsible for maintaining the information environment — becomes effectively unlimited.


37.5 Content Farms and the Information Ecosystem

The abstract analysis of Section 37.4 becomes concrete when we examine what has already happened to the information ecosystem.

NewsGuard's Findings

NewsGuard, a journalism organization that monitors news and information sources, published a series of reports beginning in 2023 documenting the emergence of AI-generated news websites. Their methodology involved identifying sites with characteristics associated with automated production — regular, high-volume posting; limited or nonexistent author attribution; content covering a wide range of unrelated topics; writing style consistent with LLM output — and then analyzing their content and funding structure.

The scale of what they found was striking. By mid-2023, NewsGuard had identified more than 400 websites that appeared to be operating primarily or entirely with AI-generated content, with no identifiable human editorial staff. These sites collectively published thousands of articles per day. They were monetized primarily through programmatic advertising — the same advertising networks that fund legitimate news sites — allowing them to generate revenue without human oversight of the content being produced.

The content of these sites was not uniformly false. Much of it was accurate, or at least not identifiably false: aggregated coverage of news events, rewritten press releases, generic lifestyle content. This mix of accurate and inaccurate content is more dangerous than purely false content, because it establishes the site's credibility with readers before deploying inaccurate or manipulative content on specific issues.

The Local News Vacuum

The particular danger of AI-generated local news deserves detailed examination, because it exploits a specific vulnerability in the contemporary information ecosystem.

Between 2004 and 2023, the United States lost approximately 2,500 local newspapers. In many counties and smaller cities, there is now no professional journalist reporting on local government, school boards, city councils, zoning decisions, local elections, or police activities. This is not merely a cultural loss — it represents a genuine informational vacuum in which local civic decisions happen with minimal public scrutiny or awareness.

Into this vacuum, AI-generated "local news" sites have moved. These sites produce content that appears to be local — it references real local place names, local institutions, and sometimes real local officials — but is generated centrally, without local reporting, and often without any human review. The content may be produced by foreign actors with no stake in the community, by partisan operators seeking to influence local political opinion, or simply by commercial operators optimizing for advertising revenue without regard for accuracy.

The reader encountering one of these sites has few easy tools for evaluating its legitimacy. It looks like local journalism. It covers local topics. Its domain name may be designed to suggest local origin. Without knowledge of the local context sufficient to identify factual errors — knowledge that requires actual community membership — distinguishing AI-generated fake local journalism from authentic local journalism is genuinely difficult.

This is the information ecosystem problem in its sharpest form: not sophisticated users being deceived by high-quality disinformation, but ordinary community members seeking local information finding a fabricated simulacrum of local journalism in the space where local journalism used to be.


37.6 Detection: What Works and What Doesn't

The question Sophia raised — "But those are obviously fake, right?" — points toward the natural first line of defense: detection. If AI-generated content can be identified, it can be labeled, filtered, or discounted. The challenge is that detection is harder than it appears, and the hardness is structural rather than merely technical.

Existing Detection Tools

Several tools have been developed to identify AI-generated text:

GPTZero, developed by Edward Tian as a Princeton student project in 2022 and subsequently commercialized, uses a combination of "perplexity" (a measure of how unexpected the text would be to a language model) and "burstiness" (variation in sentence complexity) as signals of human vs. AI authorship. Human writing tends to be "burstier" — more variable in complexity — while AI-generated text tends to be more uniform.

Turnitin, the academic plagiarism detection service, added AI detection capabilities in 2023, reporting a percentage probability that submitted text is AI-generated. It is widely used in academic contexts.

OpenAI's own classifier, released in 2023, attempts to identify text generated by OpenAI's own models. It was quietly discontinued later that year, with OpenAI acknowledging that its accuracy rates were insufficient for reliable use — it produced too many false positives (flagging human writing as AI-generated) and too many false negatives (failing to flag AI-generated text).
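
The "burstiness" signal that GPTZero popularized is simple enough to approximate in a few lines, which is part of why it was attractive as a first-generation detector. The sketch below measures only variation in sentence length, a crude proxy for the sentence-complexity variation the real tools use; it illustrates the idea rather than reimplementing any commercial product.

```python
import re
import statistics

def burstiness(text: str) -> float:
    """Coefficient of variation of sentence lengths: a rough proxy for the
    variation-in-complexity signal used by perplexity/burstiness detectors.
    Higher values suggest more human-like variation."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return statistics.stdev(lengths) / statistics.mean(lengths)

uniform = ("The council approved the variance. The vote was four to three. "
           "Residents expressed concern. The project will proceed next year.")
varied = ("Four to three. That was the margin by which the council, after nearly "
          "two hours of public comment, approved a variance that residents of the "
          "neighborhood had spent weeks organizing against. The project proceeds.")

print(f"uniform text burstiness: {burstiness(uniform):.2f}")
print(f"varied text burstiness:  {burstiness(varied):.2f}")
```

Its simplicity is also its weakness, as the next subsection explains: a generator can be prompted or tuned to vary sentence structure, and this particular signal evaporates.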

The Fundamental Detection Problem

All existing detection tools share a fundamental vulnerability: they rely on characteristics of current AI-generated text that are artifacts of current model design and training, not inherent properties of AI generation. As models improve, these characteristics change.

The "perplexity" measure, for instance, works because current LLMs tend to produce text that is statistically "expected" — they consistently choose high-probability tokens. But this is a property of current optimization targets, not of generative AI in general. Models can be prompted to produce less predictable text, or fine-tuned to produce output that maximizes unpredictability in ways that defeat perplexity-based detection.

This creates a fundamental dynamic: any detectable characteristic of AI-generated text is also an optimization target for improving AI generation. When GPTZero's detection mechanism became public, it became possible to prompt LLMs to avoid the characteristics that GPTZero was detecting. When Turnitin published guidance on how it identified AI text, that guidance became a specification for what characteristics to avoid.

The detection arms race is, by its structure, asymmetric: the detector must correctly identify all AI-generated content while never falsely flagging human content; the generator merely needs to produce content that defeats the current detector. The detector is playing defense across an entire distribution; the generator is playing offense on a specific target.

Watermarking Approaches

Several researchers and organizations, including OpenAI and Google DeepMind, have investigated watermarking as an alternative to post-hoc detection. Rather than trying to identify AI-generated text after the fact, watermarking embeds an imperceptible signal in the generation process — a statistical pattern in the token selection that a detector can identify but that human readers cannot perceive.

This approach has genuine promise for contexts in which the generating organization cooperates with the watermarking process. It does not work when the generating model is operated by an actor who does not want to be identified — open-source models, for instance, can simply be run without watermarking. Watermarks can also be removed or obscured through simple text transformations: paraphrasing, translation and back-translation, or light editing.
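
The watermarking idea can be sketched concretely. The toy scheme below follows the "green list" approach described in the academic literature rather than any vendor's production system: the generator is biased toward a pseudorandom subset of the vocabulary keyed to the preceding token, and a detector that knows the shared seed checks whether an improbably high fraction of tokens landed on their green lists.

```python
import hashlib
import math

SECRET_SEED = "shared-between-generator-and-detector"  # assumption: detector knows the seed

def is_green(prev_token: str, token: str) -> bool:
    """Pseudorandomly assign roughly half the vocabulary to a 'green list'
    keyed on the previous token. A watermarking generator prefers green
    tokens; unwatermarked text should hit green ~50% of the time."""
    digest = hashlib.sha256(f"{SECRET_SEED}|{prev_token}|{token}".encode()).digest()
    return digest[0] % 2 == 0

def watermark_zscore(tokens: list[str]) -> float:
    """z-score of the observed green fraction against the 0.5 expected by chance."""
    n = len(tokens) - 1
    greens = sum(is_green(tokens[i], tokens[i + 1]) for i in range(n))
    expected, stddev = 0.5 * n, math.sqrt(0.25 * n)
    return (greens - expected) / stddev

text = "the council approved the zoning variance after public comment".split()
print(f"z = {watermark_zscore(text):.2f}  (large positive values suggest watermarked output)")
```

The structural limitation noted above is visible in the sketch: the detector needs the seed and an intact token sequence, so an operator running an unwatermarked open-source model, or anyone who paraphrases the output, defeats it.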

What Forensic Analysis Can Realistically Achieve

Webb returned to this question at several points during the class discussion. "We're not going to be able to reliably detect every piece of AI-generated content," he said. "So what can we do?"

The honest answer is that forensic analysis can identify some AI-generated content, under some conditions, with non-trivial error rates. Specifically:

  • Documents with hallucinated citations — references to papers, books, or studies that do not exist — are identifiable through citation verification, which is labor-intensive but reliable.
  • Content produced at unnatural volume and speed can be identified through behavioral patterns rather than textual analysis — the signature of the operation, not the individual piece of content.
  • Provenance inconsistencies — metadata, posting patterns, domain registration details — can identify coordinated inauthentic behavior even when the content itself is indistinguishable.
  • Factual errors about specific, verifiable local details — names, dates, institutions — can be identified by people with genuine local knowledge.

None of these methods is scalable to the volume of AI-generated content in circulation. They are investigation tools, useful for documenting and attributing specific operations, not filters that can be applied to the information environment as a whole.
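
The first of these methods, citation verification, is one of the few that can be even partially automated. The sketch below queries the public Crossref search API (the api.crossref.org endpoint and its query.bibliographic parameter are real, though rate-limited) for the closest matches to each reference string; in practice matching is fuzzy and every result still requires human review, which is exactly the scaling limit just described.

```python
import requests

def crossref_candidates(reference: str, rows: int = 3) -> list[str]:
    """Ask Crossref for the closest bibliographic matches to a reference string."""
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.bibliographic": reference, "rows": rows},
        timeout=10,
    )
    resp.raise_for_status()
    items = resp.json()["message"]["items"]
    return [item["title"][0] for item in items if item.get("title")]

# Reference strings extracted from a suspect article (illustrative examples).
references = [
    "Smith, J. (2021). Zoning variance outcomes in mid-sized Ohio cities. Journal of Urban Policy.",
]

for ref in references:
    print(f"Checking: {ref}")
    for title in crossref_candidates(ref):
        print(f"  candidate: {title}")
    print("  -> reviewer compares candidates to the claimed citation; no real match = likely hallucinated")
```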


37.7 The Authenticity Problem

The detection challenge has an even more troubling dimension than the difficulty of identifying AI-generated content. The very existence of AI generation creates a problem for authentic content — what legal scholars Robert Chesney and Danielle Citron called the "liar's dividend" in their 2019 analysis.

The Liar's Dividend

The concept is straightforward but its implications are profound. As AI-generated content becomes widely known to exist, any content — including authentic content — becomes susceptible to the claim that it is AI-generated. This claim does not have to be credible to be effective; it merely has to introduce sufficient doubt to prevent the content from having its natural rhetorical impact.

Consider the historical cases examined in this course. Authentic recordings of politicians making damaging statements have, in the past, been difficult to deny — the voice is recognizable, the context is verifiable, the statement is clear. In the post-deepfake era, any such recording can be dismissed as "AI-generated." The dismissal may be transparently false to careful analysts, but it will be accepted as plausible by audiences who (a) are predisposed to doubt the content for other reasons and (b) know enough to know that AI-generated audio is technically possible.

The liar's dividend does not require that any AI-generated content actually be deployed. The knowledge that it could be deployed is sufficient to provide plausible deniability for authentic content. A documented video of genuine misconduct, an authentic leaked document, a real recording of a private statement — each of these can be deflected by the mere possibility that AI generation exists.

Asymmetric Epistemic Effects

The liar's dividend has asymmetric effects that favor powerful actors over accountability-seeking ones. Ordinary citizens, journalists, and accountability organizations typically do not have the resources to produce sophisticated deepfakes or high-quality LLM-generated forgeries — but they also typically do not have the resources to prove that authentic evidence is authentic in a world where audiences are primed to doubt it.

Powerful actors who are subjects of authentic damaging evidence, by contrast, do have resources — communications teams, legal counsel, friendly media outlets — to amplify the doubt created by the possibility of AI generation. They can hire technical consultants to produce reports questioning the authenticity of evidence. They can use the language of AI skepticism to frame authentic accountability journalism as potentially fabricated.

This is not a hypothetical future risk. By 2023 and 2024, the defense of "that might be AI-generated" was being deployed as a rhetorical move by politicians, media figures, and organizations confronted with authentic critical content, in some cases successfully.

Implications for Propaganda Analysis

The liar's dividend changes the propaganda calculus in a specific way. Propaganda traditionally operates by introducing false content into the information environment to shape beliefs. The liar's dividend introduces uncertainty about authenticity into the information environment, which undermines the epistemic foundations on which audiences evaluate all content — including the content produced by accountability institutions.

This is not a trivial shift. Much of the counter-propaganda infrastructure analyzed in this course — fact-checking, source credibility evaluation, media literacy training — rests on the premise that there is a knowable truth that can serve as a reference point for evaluating claims. The liar's dividend erodes this premise, not by making truth unknowable in fact but by making it appear unknowable to audiences, which for practical purposes amounts to the same thing.


37.8 AI and Scientific Misinformation

The propaganda application that would have most interested the public relations operatives of the tobacco industry, had it been available to them, is the deployment of AI-generated content to manufacture scientific doubt.

Manufactured Doubt Revisited

Chapter 26 analyzed in depth how industries facing scientific consensus about their products' harms — tobacco, fossil fuels, pharmaceuticals — developed systematic strategies for manufacturing the appearance of scientific uncertainty. These strategies included funding compliant scientists, producing industry-sponsored research, creating front organizations that appeared to be independent scientific bodies, and flooding regulatory and media discussions with technical-sounding content that non-expert audiences could not easily evaluate.

These strategies were labor-intensive and expensive. They required recruiting actual scientists, commissioning actual research (however biased in design and reporting), and maintaining the appearance of legitimate scientific activity over extended periods. The tobacco industry spent decades and billions of dollars on this effort.

What AI Enables

Large language models, trained on scientific literature, can produce convincing scientific-sounding content — abstracts, literature reviews, methodology sections, discussion of findings — without any genuine research having been conducted. This content can include:

Hallucinated citations: references to plausible-sounding but nonexistent studies. An LLM generating a paragraph about the health effects of a chemical might produce citations in correct academic format to journals that exist, in volumes that exist, in years that exist — but with paper titles, authors, and findings that are entirely fabricated. The citation looks real; the study does not exist.

AI-generated academic papers: full papers formatted according to the conventions of specific fields, capable of being submitted to academic journals or pre-print servers. Several high-profile cases have documented AI-generated papers submitted to journals, some of which passed initial peer review before being identified.

AI-generated expert commentary: quotes, public comments, and testimony attributed to real or fictional experts, written in the register of scientific expertise. A regulatory comment process that receives thousands of comments, some AI-generated, cannot easily distinguish authentic expert opinion from fabricated expert opinion.

Pre-print server exploitation: pre-print servers like arXiv and SSRN allow scientific papers to be posted before peer review, providing a legitimate-seeming source for results that have not been vetted. AI-generated papers posted as pre-prints can be cited in media coverage before any review has occurred, and the citation can survive even after the paper is identified as fabricated.

The Big Tobacco Counterfactual

Consider what the tobacco industry could have done with these capabilities in 1950, when the scientific evidence linking cigarette smoking to lung cancer was just becoming clear. Instead of spending decades cultivating compliant scientists and commissioning biased research, an LLM-equipped tobacco PR operation could have:

  • Generated hundreds of plausible scientific papers questioning the cancer link, with fabricated citations, and submitted them to journals and pre-print servers under invented author names
  • Produced AI-generated expert commentary for Congressional hearings, regulatory comment periods, and media inquiries
  • Flooded academic and media discussion with LLM-generated technical objections to the genuine research, creating the appearance of scientific controversy
  • Maintained AI-generated front organization websites with the trappings of legitimate scientific institutions

The resulting "manufactured uncertainty" would have been far larger in volume, far lower in cost, and far more difficult to trace to its source than anything the tobacco industry actually achieved. And the actual tobacco industry, recall, successfully delayed meaningful regulation for decades using the labor-intensive manual version of these strategies.

The implication for contemporary public health, environmental regulation, and policy generally is not reassuring.


37.9 Platform and Regulatory Responses

Governments and platforms have recognized the challenge of AI-generated content and are developing responses. Evaluating these responses requires distinguishing between what they can accomplish and what they cannot.

Platform Policies

Major social media platforms have introduced policies requiring disclosure of AI-generated content, particularly in the context of elections and political advertising. Meta (Facebook and Instagram) announced in 2023 that it would require disclosure when political advertisements include AI-generated imagery or audio. YouTube has required disclosure of AI-generated content in videos uploaded to the platform, with particular emphasis on news and information content.

These policies face three significant limitations. First, they depend on voluntary disclosure by operators who have no incentive to disclose. A disinformation operation using LLM-generated content for influence purposes is not going to label that content as AI-generated. Platform disclosure requirements function as rules for well-intentioned actors operating in good faith; they are irrelevant to the bad actors who constitute the actual threat.

Second, enforcement of non-disclosure is limited by detection capabilities. If platforms cannot reliably identify AI-generated content, they cannot enforce disclosure requirements for AI-generated content.

Third, the scope of existing policies is narrow. Requirements focused on political advertising during election periods do not address the broader information environment; they are significant improvements for one specific, narrow context while leaving most AI-generated disinformation entirely unregulated.

The EU AI Act

The European Union's AI Act, finalized in 2024, includes provisions specifically addressing AI-generated content. Article 50 requires providers of AI systems that generate synthetic content — text, images, audio, video — to ensure that the content is labeled in a machine-readable format, using technical solutions including watermarking. The requirement applies to general-purpose AI systems capable of producing synthetic content that could deceive users.

The EU approach represents the most substantive regulatory intervention to date. Ingrid observed what she had observed throughout the course: "The EU is moving faster on this than anyone else. The U.S. has almost nothing comparable at the federal level."

The limitations of the EU Act in this context parallel those of platform policies: requirements for transparency and labeling apply to compliant operators deploying systems in commercial contexts, not to covert influence operations deliberately obscuring the origin of their content. Regulatory requirements can shape the behavior of commercial AI providers — requiring them to build watermarking into their systems, for instance — but they cannot reach bad actors who use open-source models or who operate outside EU jurisdiction.

C2PA and Content Provenance

The Coalition for Content Provenance and Authenticity (C2PA), a cross-industry initiative involving Adobe, Microsoft, Intel, and several news organizations, has developed a technical standard for content provenance — essentially a cryptographic signature embedded in digital content that records its origin and any transformations it has undergone.

The C2PA standard allows content to carry verifiable information about where it was created, what device or software produced it, and what edits have been applied. Content generated by a C2PA-compliant AI system can carry a machine-readable record of that generation. Authentic photographs taken with a C2PA-compliant camera carry a verifiable record of their capture.

This approach has genuine potential in one specific context: verifying the authenticity of content produced by compliant actors. If major news organizations, government institutions, and reputable content producers adopt C2PA, consumers can verify that content bearing a C2PA signature from a trusted source is authentic. This does not solve the problem of unattributed or deceptively attributed AI-generated content — it does not, in itself, identify synthetic content that lacks a provenance signature. But it creates a positive verification system for trusted content, which is a meaningful contribution.
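
The cryptographic core of provenance is ordinary digital signing. The sketch below, which uses the third-party cryptography package, signs a hash of an article with an Ed25519 key and verifies later copies against that signature; it is a minimal stand-in for what C2PA actually specifies (signed manifests bound to the asset, certificate chains identifying the signer, and a record of each edit), not an implementation of the standard.

```python
import hashlib
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

# A publisher (e.g., a newsroom) holds the private key; readers hold the public key.
private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()

article = b"MILLBROOK - The city council voted 4-3 on Tuesday to approve..."

# "Provenance record": sign a hash of the content at publication time.
content_hash = hashlib.sha256(article).digest()
signature = private_key.sign(content_hash)

def verify(content: bytes, sig: bytes) -> bool:
    """Check a later copy of the content against the published signature."""
    try:
        public_key.verify(sig, hashlib.sha256(content).digest())
        return True
    except InvalidSignature:
        return False

print(verify(article, signature))                 # True: untampered copy
print(verify(article + b" [edited]", signature))  # False: content changed
```

Note what the sketch verifies and what it does not: a valid signature ties the content to whoever controls the key, but content carrying no signature at all proves nothing either way, which is the positive-verification limit described above.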

Realistic Assessment

The honest assessment of existing regulatory and platform responses is that they are insufficient to the scale of the challenge, but not pointless. They establish norms, create infrastructure, and shape behavior at the legitimate end of the AI deployment spectrum. They do not solve the problem of AI-generated propaganda produced by bad-faith actors outside the regulated sphere.

This is not an argument against regulation. It is an argument for honesty about what regulation can accomplish, and for developing countermeasures that do not depend entirely on regulatory compliance by adversarial actors.


37.10 Inoculation and AI Content: Does It Transfer?

Sophia had been thinking about this throughout the class. She raised it directly: "We've spent the last month on inoculation campaigns. Does any of it still work?"

It is the right question. The inoculation approach developed in Chapters 29 and 33 is based on technique-based inoculation: exposing people to weakened forms of manipulative techniques so that they develop resistance when they encounter those techniques in the wild. The FLICC framework (Fake Experts, Logical Fallacies, Impossible Expectations, Cherry-picking, Conspiracy Theories) and similar tools categorize the rhetorical techniques that propaganda and disinformation use.

What Transfers

The good news, such as it is: AI-generated propaganda uses the same psychological manipulation techniques as human-generated propaganda. An AI-generated article manufacturing doubt about vaccine safety will use fake experts, cherry-picked studies, and conspiracy framing — because those are the techniques that work, and the LLM has been trained on examples of content that uses them. An AI-generated social media post amplifying political division will use emotional appeals, in-group/out-group dynamics, and threat framing — because those techniques are present throughout the training data on which the model learned to produce engaging political content.

Technique-based inoculation should therefore transfer to AI-generated content. A person who has been inoculated against the fake expert technique is likely to recognize and resist it whether the fake expert claim was generated by a human or an LLM. The underlying psychological mechanism — recognition of the technique, diminished persuasive impact — does not depend on the origin of the content.

This is meaningfully reassuring. The years of work developing technique-based inoculation frameworks are not rendered obsolete by AI-generated content.

New Technique Categories

However, AI-generated propaganda introduces some new technique categories that require specific inoculation approaches:

The AI Authority Appeal: Claims framed as "AI analysis confirms..." or "according to machine learning analysis..." or "data science shows..." that invoke the perceived objectivity and infallibility of AI systems to lend false authority to claims. This technique exploits the widespread perception that AI systems are neutral, objective, and free of human bias. People who have been inoculated against human expertise manipulation may not automatically apply the same skepticism to AI-attributed claims.

Synthetic Consensus Technique: The use of AI-generated comments, reviews, and social media posts to create the false appearance that a position is widely held. This technique exploits the social proof heuristic — the tendency to use others' apparent behavior as a guide to correct behavior. Inoculation specifically targeting manufactured social consensus — explaining how AI can generate the appearance of mass opinion — is needed to address this.

The AI Manufactured Doubt Factory: The deployment of AI-generated technical-sounding content to overwhelm audiences with apparent complexity and controversy. This is the scientific misinformation application discussed in Section 37.8, and it requires inoculation messages that specifically address the possibility that apparent scientific controversy may be AI-generated, not genuine scientific debate.

Design Principles for AI-Era Inoculation

Webb worked through these at the whiteboard, arriving at a set of principles that the class refined through discussion:

  1. Inoculate against the meta-claim first: Before addressing specific AI-generated content, inoculate people against the general claim structure — "AI analysis shows X" or "studies confirm Y" — by explaining how these claims can be manufactured.

  2. Emphasize process over product: Rather than trying to identify AI-generated content after the fact, inoculate people to look for provenance — where did this come from, who produced it, can the source be verified independently?

  3. Quantitative consensus is not qualitative consensus: Inoculate against the specific error of treating volume of apparent agreement (comments, shares, likes) as evidence of genuine consensus. AI can generate volume; it cannot generate genuine community agreement.

  4. Maintain technique-based inoculation as core: The existing FLICC-based and technique-based approaches remain valid and should be maintained, because AI uses the same techniques. AI-era additions are supplements, not replacements.

The progressive project associated with this chapter asks students to apply these principles to their specific target community — which we turn to in Section 37.15.


37.11 Research Breakdown

Goldstein, Sastry, et al. (2023), "Generative Language Models and Automated Influence Operations"

Citation: Goldstein, J. A., Sastry, G., Musser, M., DiResta, R., Gentzel, M., & Sedova, K. (2023). Generative language models and automated influence operations: Emerging threats and potential mitigations. arXiv preprint arXiv:2301.04246.

Methodology: The study used a combination of literature review, structured capability analysis, and experimental testing to assess the extent to which GPT-3.5 and comparable models could be used to perform specific tasks associated with influence operations, including content generation, persona construction, and targeted message drafting.

Key Findings:

The study found that LLMs substantially lower the barrier to entry for several specific influence operation tasks. For content generation, GPT-3.5-class models were capable of producing news articles, social media posts, and opinion pieces that human raters judged to be comparable in quality to human-produced content in the same genre, in a fraction of the time.

For persona construction, the models were effective at generating coherent backstories, posting histories, and consistent voice for fictional online personas — though maintaining consistency across extended interaction with real users remained a limitation.

For targeted message generation, the study found that LLMs could effectively adapt persuasive messages to specific described target profiles, using different rhetorical strategies, emotional appeals, and framing for different audiences.

The study is notably careful about what it does and does not establish. It demonstrates capability — that these things are possible — rather than deployment — that influence operations are actively using them at scale. The authors note that the gap between demonstrated capability and documented deployment in real-world operations may reflect the relative recency of the technology rather than a decision by influence operation actors not to use it.

Implications: The study's most important contribution is establishing a rigorous capability baseline. By demonstrating what these models can do under controlled conditions, it enables assessment of what influence operations using these tools would realistically be capable of — and it frames that capability as substantial.


37.12 Primary Source Analysis

An AI-Generated Disinformation Article: Structural Analysis

As an example of AI-generated disinformation in the wild, consider the structure and technique profile of the type of content that has been documented in NewsGuard's 2023 research and in subsequent academic analyses of AI-generated partisan content farms.

The Article Type: "Research Confirms" Format

One common category of AI-generated disinformation articles follows what researchers have called the "research confirms" format: an article claiming to report on a scientific study or institutional finding that lends authority to a politically motivated claim. The article begins with a claim framed as a finding ("A new study from researchers at [plausible institution] has found that..."), provides a brief description of the purported methodology, offers several specific-sounding data points, quotes one or more named "experts," and ends with implications for policy or public concern.

Applying the Course's Analytical Framework

Applying the propaganda analysis framework from Chapter 10:

Source analysis: The article names a plausible-sounding institution that either does not exist or does not have the stated center or program. The named researchers typically cannot be found in academic directories, or exist as real people but did not conduct the named study.

Technique identification: The article deploys the Fake Expert technique (FLICC) — attributing claims to invented or misrepresented experts. It deploys selective emphasis (cherry-picking) by presenting the fabricated findings without context about existing genuine research on the topic. Where the topic involves health or safety, it commonly deploys appeal to fear with precise-sounding statistics.

Stylistic analysis: Compared to authentic science journalism, AI-generated "research confirms" articles tend to produce statistics with artificial precision — percentages given to one decimal place, sample sizes that are suspiciously round numbers, effect sizes that are larger than typical findings in the field. The writing is fluent but lacks the specific hedging and qualification that characterizes careful science journalism. Authentic journalists who cover science tend to include caveats ("the study was limited by," "other researchers caution that"); AI-generated versions of this genre tend to state findings with uniform confidence.

Propagandistic function: These articles perform manufactured doubt or manufactured consensus depending on their target. When targeting established scientific consensus (vaccines, climate change), they manufacture the appearance of scientific controversy. When targeting policies the operator opposes, they manufacture the appearance of scientific backing for the opposing position.
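
Some of these stylistic observations can be turned into rough screening heuristics. The sketch below counts hedging phrases and one-decimal percentages in a passage; the marker lists and the very idea of thresholding them are illustrative assumptions, and a signal like this can at best prioritize items for human review, never classify them.

```python
import re

# Illustrative, hand-picked markers counted by crude substring matching;
# this is a screening aid for analysts, not a validated classifier.
HEDGES = ["might", "suggests", "limited by", "caution", "preliminary",
          "however", "it is unclear", "further research"]

def style_profile(text: str) -> dict[str, float]:
    words = text.lower().split()
    hedge_hits = sum(text.lower().count(h) for h in HEDGES)
    precise_stats = len(re.findall(r"\b\d+\.\d%", text))   # e.g. "62.3%"
    return {
        "hedges_per_100_words": 100 * hedge_hits / max(len(words), 1),
        "one_decimal_percentages": precise_stats,
    }

sample = ("A new study found that 62.3% of respondents reported symptoms, "
          "confirming that the chemical is dangerous to 41.7% of households.")
print(style_profile(sample))  # low hedging plus artificially precise statistics
```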

The Key Difference from Human-Generated Propaganda

The most significant analytical difference between this AI-generated content and comparable human-generated propaganda is not in the techniques used — they are identical — but in the detectability of production artifacts. Human-generated fake news articles often contain regional idioms, cultural references, or knowledge patterns that reveal the operator's actual background. A Russian-produced English-language fake article about American local politics sometimes contains subtle markers of non-native construction.

AI-generated content, trained on native English text, lacks these geographic and cultural markers. The absence of the artifacts that have historically aided attribution is itself a forensic consideration: highly polished, culturally fluent content with no detectable origin fingerprint may indicate AI generation.


37.13 Debate Framework

Resolution: "Does AI-Generated Content Represent a Fundamentally New Propaganda Threat or an Accelerated Old One?"

Position A: A New Category of Threat

The argument for novelty rests on three premises:

The labor constraint removal is categorical, not incremental. Previous improvements in propaganda efficiency — printing presses, radio, television, social media — each expanded the reach of content while leaving the cost of production relatively stable. LLMs remove the marginal cost of production almost entirely. This is not a point on a continuum; it is the removal of a constraint that has structured all previous propaganda practice.

Perfect personalization at scale is qualitatively different from broadcast messaging. All previous large-scale propaganda, including microtargeted digital advertising, was broadcast in character — the same message or a limited set of messages reaching many people. LLM-enabled propaganda can theoretically generate a unique, specifically tailored persuasive message for every individual target, drawing on individualized knowledge of that target's psychology, concerns, and vulnerabilities. This is the persuasion approach of the skilled propagandist operating one-on-one, scaled to the population. No previous technology achieved this combination.

Detection impossibility changes the epistemological situation. Previous propaganda could, in principle, be identified and attributed. The structural impossibility of reliable AI content detection — the arms race asymmetry described in Section 37.6 — means that for the first time, a form of propaganda exists that cannot be comprehensively identified or attributed. This changes the epistemic environment for all content, not just AI-generated content, through the liar's dividend mechanism.

Position B: An Accelerated Old Threat

The argument for continuity rests on three equally compelling premises:

The psychological mechanisms are unchanged. LLM-generated propaganda succeeds because of the same psychological vulnerabilities that have made propaganda effective throughout history: confirmation bias, availability heuristic, social proof, emotional appeals, authority appeals. The technology that generates the content is new; the human psychology it exploits is ancient. Interventions that address the psychological mechanisms — including the inoculation approaches developed in this course — remain effective against AI-generated propaganda.

Scale is not a new problem. The challenge of information volume overwhelming audiences' evaluative capacity has been present since at least the mass media era. Television, social media, and 24-hour news cycles have all produced versions of the scale problem. The response that has worked — media literacy education, institutional credibility systems, professional journalism standards — addresses the audience's relationship to information rather than the volume of information. These responses scale as well.

The appropriate institutional response is not new. The question of what institutions can reliably inform the public is not new; it is the central question of democratic information infrastructure, and it has been addressed through professional journalism, public broadcasting, educational institutions, and regulatory frameworks. The appropriate response to AI-generated propaganda is to strengthen these institutions, not to develop entirely new frameworks premised on AI's novelty.

Synthesis: The Position This Course Recommends

Webb was characteristically careful here. "I think both positions have something right," he said. "The question is whether the existing frameworks need to be supplemented or replaced."

The honest position is that AI-generated propaganda is both an acceleration and, in some respects, a novelty. The psychological techniques and institutional responses from the pre-AI era remain relevant — the inoculation work transfers, the media literacy work transfers, the credibility infrastructure work transfers. But the scale change and the liar's dividend represent genuinely new dimensions that require new specific responses: provenance systems, AI-specific inoculation content, institutional adaptation to a world where volume is no longer a reliable signal of authentic interest.


37.14 Action Checklist: Evaluating AI-Generated Content

The following checklist is designed for practical use when evaluating news articles, social media posts, or other informational content that may be AI-generated. It is calibrated to what careful readers can actually do, without specialized forensic tools or technical expertise.

For any text claim:

  • [ ] Verify citations independently. If the content references studies, statistics, or expert quotes, search for the actual source. Hallucinated citations are among the most reliable indicators of AI generation, and they can be checked by anyone with internet access (a minimal lookup sketch using a public bibliographic index follows this list).

  • [ ] Check the source's publication history. Does the site have a history of publication? Do the articles cover a coherent geographic or topical area consistent with genuine local journalism or subject-matter expertise? High-volume sites covering many unrelated topics with no consistent editorial voice are red flags.

  • [ ] Use lateral reading. Before engaging with the content of an article, open multiple new tabs and search for information about the source — not the claim the article makes, but the outlet making it. What do other sources say about this publication? Lateral reading is more efficient than trying to verify every factual claim.

  • [ ] Look for the human voice. Authentic journalism tends to have specific, named sources with specific, local knowledge. AI-generated content tends toward plausible-sounding generalities. A quote attributed to "Patricia Yuen, a concerned resident," with no further identification, saying something that could apply to any zoning dispute anywhere, is a different artifact from a quote from a specific person with verifiable community membership.

  • [ ] Check the writing style for artificial uniformity. Human writing varies in sentence length, complexity, and register within a piece. AI-generated text tends toward more uniform complexity. This is an imperfect heuristic, but sustained uniform prose in a genre (local news, academic summary) where human writers typically vary their approach is worth noting.

  • [ ] Be skeptical of precise statistics without sources. AI hallucination tends to produce precise-sounding numbers. "67.3% of residents opposed the proposal" from an unidentified survey is a pattern worth scrutinizing.
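
The first item on this list can be partially automated. The sketch below queries the Crossref REST API (api.crossref.org), a public index of scholarly publications, to check whether a cited study resolves to a real indexed work; the citation string in the example is invented for illustration. An empty result is a reason to keep digging, not proof of fabrication, since not every genuine source is indexed.

```python
import requests

def lookup_citation(citation_text: str, rows: int = 5) -> list[dict]:
    """Search the public Crossref index for works matching a citation string.

    Returns candidate matches (year, title, DOI) so a reader can judge
    whether the citation corresponds to a real publication.
    """
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.bibliographic": citation_text, "rows": rows},
        timeout=10,
    )
    resp.raise_for_status()
    items = resp.json()["message"]["items"]
    return [
        {
            "title": (item.get("title") or [""])[0],
            "doi": item.get("DOI"),
            "year": item.get("issued", {}).get("date-parts", [[None]])[0][0],
        }
        for item in items
    ]

# Hypothetical citation string, invented for illustration:
for hit in lookup_citation("2023 study warehouse zoning traffic impact residential neighborhoods"):
    print(hit["year"], "|", hit["title"], "|", hit["doi"])
```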

For images:

  • [ ] Check image metadata using a tool like Jeffrey's Exif Viewer. Authentic photographs often carry metadata about the camera, settings, and time of capture; AI-generated images typically lack this metadata. Note that many platforms strip metadata on upload, so its absence is weak evidence on its own (a minimal programmatic sketch follows this list).

  • [ ] Look for consistency errors in text within images, hand rendering, teeth, hair edges, and reflective surfaces — areas where current image generation models still produce characteristic errors.

  • [ ] Reverse image search to identify whether the image has prior existence online or appears only in contexts associated with the article.
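
A basic metadata check can also be scripted. The sketch below uses the Pillow imaging library's getexif() call to list whatever EXIF tags an image file carries; the filename is hypothetical. As with the checklist item above, absent metadata is weak evidence on its own, because platforms routinely strip it from authentic photographs.

```python
from PIL import ExifTags, Image  # Pillow: pip install Pillow

def exif_summary(path: str) -> dict:
    """Return human-readable EXIF tags from an image file, if any are present.

    An empty result does not prove AI generation; rich, internally consistent
    camera metadata is mild evidence of a conventional capture pipeline.
    """
    exif = Image.open(path).getexif()
    return {ExifTags.TAGS.get(tag_id, str(tag_id)): value for tag_id, value in exif.items()}

# Hypothetical filename for illustration:
# print(exif_summary("city_hall_protest.jpg"))
```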

For what detection cannot reliably do:

High-quality AI-generated text, produced with attention to these detection heuristics, may not be reliably identifiable by any of the above methods. The checklist is a tool for raising probability assessments, not for achieving certainty. The appropriate response to uncertainty is not to treat all content as equally suspect but to calibrate trust to the strength of the evidence for provenance, rather than to the content of the claim itself.


37.15 Inoculation Campaign: Future-Proofing Analysis

Progressive Project — Part 7 Component

With Chapter 37, the progressive project enters its final analytical phase: future-proofing your inoculation campaign against the AI-generated content landscape.

Analysis Framework

Students working on their inoculation campaigns should address the following questions:

1. Which AI-generated content threats are most likely to affect your target community?

Different communities face different AI content threat profiles. A community defined by local civic engagement (neighborhood associations, school board involvement) faces the local news content farm threat as its most pressing AI challenge. An elderly community with lower media literacy faces the synthetic authority figure threat — AI-generated content attributed to trusted official sources. A politically active community faces the synthetic consensus threat — manufactured apparent public opinion. A community engaged with specific policy debates faces the manufactured doubt threat in that policy domain.

Map your target community's most likely exposure points before designing AI-specific inoculation content.

2. What is your community's current awareness of AI generation capabilities?

Survey or assess: Does your target community know that news articles can be AI-generated at industrial scale? Do they know that expert quotes can be fabricated? Do they know that social media comments can be AI-generated? Research consistently shows that awareness of manipulation capability — even without specific detection ability — reduces susceptibility to that form of manipulation.

The first component of AI-era inoculation may simply be awareness-raising about capabilities that many community members do not yet know exist.

3. How would you adapt your existing inoculation messages?

Review the inoculation messages you developed in Chapters 29 and 33. Which of these address technique categories (fake experts, cherry-picking, conspiracy framing) that AI-generated content also uses? These messages should remain in your campaign — the technique transfer argument from Section 37.10 applies.

Which messages assume that content was produced by identifiable human operators with traceable funding, organizational affiliations, and geographic origins? These messages may need adaptation for a world where content can be produced without any of these identifiable characteristics.

4. Design one AI-specific inoculation message

Drawing on the principles from Section 37.10, design a specific inoculation message that addresses one of the three new AI technique categories: the AI authority appeal, the synthetic consensus technique, or the AI manufactured doubt factory.

Recall the design criteria for effective inoculation messages from Chapter 33: the message should (a) warn that an attempt to manipulate will occur, (b) provide a weakened form of the manipulative argument, and (c) provide refutation of the technique. Apply these criteria to your chosen AI technique.

Submission Requirements

The Future-Proofing Analysis should produce: (1) a threat profile assessment for your target community, (2) an awareness audit of the community's current knowledge of AI capabilities, (3) an evaluation of which existing inoculation messages transfer and which need adaptation, and (4) one fully developed AI-specific inoculation message with rationale.


Chapter Summary

Chapter 37 has examined the most technically novel challenge in the contemporary propaganda landscape: the deployment of large language models and generative AI to produce synthetic content at scale.

The analysis has moved from foundations — what LLMs are and how they work — through the five categories of propaganda application (content farm automation, persona generation, targeted personalization, comment flooding, and sockpuppet networks), to the structural challenge of scale economics (the removal of the marginal cost of content production), to the specific vulnerabilities this content exploits in the information ecosystem (local news deserts, preprint scientific publishing, regulatory comment processes).

We examined the detection challenge honestly: existing detection tools have significant limitations, the fundamental arms-race structure of detection is asymmetric and favors the generator, and no reliable population-level detection solution currently exists. We examined the liar's dividend — the way that AI generation capability undermines trust in authentic content — as perhaps the most underappreciated threat mechanism.

We asked whether the inoculation work of the prior chapters transfers, and found a qualified yes: technique-based inoculation transfers because AI-generated propaganda uses the same psychological techniques; new inoculation content is needed for the AI-specific technique categories the technology introduces.

Webb closed the class with the same honesty he had opened with. "Tariq's forty-five minutes should bother you. Not because we have no responses — we do — but because the responses are going to require more work than the existing infrastructure is set up to provide. The question isn't whether we have tools. The question is whether we'll build them fast enough."


Key Terms

Large Language Model (LLM): An artificial intelligence system trained on large text corpora, capable of generating fluent, coherent text across domains and styles through next-token prediction.

Generative AI: AI systems that produce new content (text, images, audio, video) rather than classifying or retrieving existing content.

Synthetic content: Content produced by AI generation systems, designed to appear as though produced by a human author.

Content farm: A website or operation producing large volumes of low-quality content primarily for advertising revenue or influence purposes, increasingly using AI generation.

Hallucinated citation: A reference to a study, book, or source that does not exist, generated by an LLM as a plausible-sounding but fabricated attribution.

Liar's dividend: The phenomenon by which the known possibility of AI-generated synthetic content provides plausible deniability for authentic damaging content.

Sockpuppet network: A coordinated group of fake online identities operated to manipulate discourse, at scales now enabled by LLM-maintained personas.

Synthetic consensus: The manufactured appearance of widespread opinion or agreement, created through AI-generated comments, posts, and apparent social proof.

Watermarking: An approach to AI content detection that embeds imperceptible signals in AI-generated content at the generation stage, enabling later identification.

Content provenance: A framework (such as C2PA) for cryptographically recording the origin and transformation history of digital content to enable authenticity verification.

AI authority appeal: A propaganda technique that invokes AI analysis or machine learning results to lend false authority to claims, exploiting perceptions of AI objectivity.

Next-token prediction: The core mechanism of LLMs — generating text by predicting the most probable next word or sub-word unit given the preceding text.


Chapter 38: Deepfakes and Synthetic Media — The Visual Dimension continues Part 7's examination of AI-generated content, focusing specifically on synthetic video, audio, and image manipulation, its application in political disinformation and non-consensual intimate imagery, and the specific forensic and institutional challenges it presents.