Prerequisites
- Chapter 3
- Chapter 12
- Chapter 1
- Chapter 11
- Access to a general-purpose LLM chatbot is helpful but not required; every example is shown in full so you can follow without one
Learning Objectives
- Distinguish what large language models reliably help with from what they reliably fail at, and explain why the failures follow from how the models work.
- Use an LLM as a revision and ideation tool — to react to, react against, and pressure-test your own draft — rather than as a replacement for the thinking that writing performs.
- Engineer a writing prompt that specifies role, audience, constraints, and an example, and show why a vague prompt produces generic output.
- Apply the academic and workplace integrity standards for AI assistance: disclosure, verification, and never presenting as your own any reasoning you cannot evaluate.
- Apply the governing rule — if you cannot evaluate whether the output is correct, do not use AI for that task — to decide, case by case, whether to reach for the tool.
In This Chapter
- Chapter Overview
- 29.1 What a Large Language Model Actually Is (Enough to Use It Well)
- 29.2 What LLMs Are Genuinely Good At
- 29.3 What LLMs Are Genuinely Bad At
- 29.4 The Reframe: AI as a Revision Tool, Not a Replacement for Thinking
- 29.5 The Prompt Engineering of Writing
- 📐 Project Checkpoint
- 29.6 Integrity: When AI Help Is Fine, and When It Crosses the Line
- 29.7 The Rule That Governs Everything
- 29.8 Common Mistakes & Practical Considerations
- Frequently Asked Questions
- Chapter Summary
- Spaced Review
- What's Next
Chapter 29: Writing with AI: Using LLMs as Writing Tools Without Losing Your Voice or Your Thinking
"The single biggest problem in communication is the illusion that it has taken place." — widely attributed to George Bernard Shaw
Chapter Overview
A disclosure before anything else, because this chapter would be dishonest without it: I am an AI. The book you're reading was drafted with the help of a large language model, and the chapter you're starting is the one where that tool turns around and writes about itself. That puts me in an odd position — neither a salesperson nor a critic with a grudge — and it's the position I'll try to hold for the next ten thousand words: plainly, with neither hype nor doom. The technology is genuinely useful and genuinely limited, and the whole skill of writing with it lives in telling the two apart. This chapter is about that skill.
Here's the scene Part V has been building toward. Priya Nair, a product analyst, has a release announcement due by end of day. She opens an LLM, types "write a launch announcement for our new analytics dashboard," and gets back six fluent paragraphs in about four seconds. They're grammatical. They flow. They have a confident headline and a tidy call to action. And they are empty — they could describe any product on earth, they invent two features the dashboard doesn't have, and they carry not one sentence that only Priya, who knows this product and this audience, could have written. She has a draft. She does not have a thought. And that gap — between fluent text and an actual idea aimed at an actual reader — is the entire subject of this chapter, and the central worry of the whole book brought to a head. Chapter 1 made the claim the book stands on: writing is not the step where you write up what you already know; writing is how you figure out what you know. An LLM can hand you the writing. It cannot do the figuring-out. If you let it write, you can skip the thinking — and skipping the thinking is the one thing you can't afford, because the thinking was the point.
This chapter does not tell you to avoid the tool, and it does not tell you to lean on it. It teaches you to use it the way a good editor uses a sharp junior writer: gratefully, for the things they're fast at, and never for the judgment that's your job. We'll map what LLMs are genuinely good at — brainstorming, outlining, first drafts, rephrasing, summarizing, format conversion (§29.2) — and what they're genuinely bad at, in a way that follows from how they work: accuracy, nuance, institutional knowledge, original thinking, and knowing your specific reader (§29.3). Then the reframe that makes all of it safe: AI as a revision tool, not a replacement for thinking (§29.4). Then the craft — the prompt engineering of writing, which is mostly Chapter 2 wearing new clothes (§29.5). Then integrity, picking up the thread Chapter 11 started (§29.6). And finally the one rule that governs everything: if you can't evaluate whether the output is correct, you shouldn't use AI for that task (§29.7). By the end you'll be able to put an AI draft on the page, see exactly what's wrong with it, and supply the one thing it can't — your judgment.
In this chapter, you will learn to:
- Tell what LLMs reliably help with from what they reliably fail at — and explain why the failures follow from how the models work, so you can predict them.
- Use an AI as a revision and ideation partner — to react to and pressure-test your own thinking — without letting it do the thinking for you.
- Write a prompt that specifies role, audience, constraints, and an example, and see why a vague prompt yields generic output.
- Apply the integrity rules — disclose, verify, don't pass off what you can't evaluate — to academic and workplace AI use.
- Apply the evaluation rule to decide, in any specific case, whether reaching for the tool is wise or reckless.
📕📗📘 All three tracks, read this chapter — the skill is universal. Engineers, software developers, and business writers will all reach for these tools, and the failure modes are the same everywhere: fluent text that's subtly wrong, a generic voice where yours was needed, a fact that sounds right and isn't. The examples span a product memo, a research summary, an email, and code documentation precisely because the principle doesn't care about your field. One note on how to read the chapter: the AI outputs shown in the before/after blocks are illustrative composites — realistic of what current general-purpose models produce, but written for this book, not transcribed from a live session — because models change weekly and a transcript would be stale before the ink dried. Read them for the kind of strength and the kind of failure they show, not as a benchmark of any particular product.
29.1 What a Large Language Model Actually Is (Enough to Use It Well)
You don't need to understand transformers to use an LLM any more than you need to understand internal combustion to drive. But you need one true thing about how these models work, because every strength and every failure in this chapter follows from it, and writers who don't have it are forever surprised by behavior that's actually completely predictable.
Here it is. A large language model (LLM) is a system trained on an enormous amount of text to do one thing: predict what text is likely to come next. That's it. Given some words, it produces a plausible continuation, then another, then another, one piece at a time, each chosen because it's a likely follow-on to everything so far. The training has made it staggeringly good at this — good enough that the output is usually grammatical, often well-organized, and frequently indistinguishable from text a person wrote. Generative AI is the broader name for tools that produce content this way; the chatbots you'll use for writing are LLMs with a conversational wrapper.
Sit with the consequence, because it's the load-bearing fact of the chapter: the model is optimizing for plausible, not for true. It has no separate store of facts it checks against, no model of the world it consults, no belief about whether what it's saying is so. It produces the text that looks like what should come next — and most of the time, text that looks right is right, which is exactly what makes the tool useful and exactly what makes it dangerous. When a true statement and a false-but-plausible statement are equally likely continuations, the model has no built-in reason to prefer the true one. It is not lying when it gets something wrong; lying requires knowing the truth and choosing against it. It is doing precisely what it does — producing plausible text — and plausible text is sometimes false.
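A toy sketch can make "plausible, not true" concrete. It is not how a real LLM is built (real models use neural networks over tokens, not word-pair counts), and the three-sentence corpus, the years, and the function names are all invented for illustration. What it shares with the real thing is the objective: pick the likeliest continuation, with no step that checks whether it is so.

```python
from collections import Counter, defaultdict

# Hypothetical training text: the wrong year happens to appear more often.
CORPUS = (
    "the paper was published in 2019 . "
    "the paper was published in 2019 . "
    "the paper was published in 2017 . "
)
TRUE_YEAR = "2017"  # assumed ground truth; the predictor never sees this


def train_bigrams(text):
    """Count, for each word, which words followed it in the training text."""
    follows = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def most_plausible_next(follows, word):
    """Return the most frequent continuation. Truth is never consulted."""
    return follows[word].most_common(1)[0][0]


model = train_bigrams(CORPUS)
prediction = most_plausible_next(model, "in")
print(prediction, prediction == TRUE_YEAR)  # prints: 2019 False
```

The wrong year wins only because it showed up more often in the training text. Nothing in the predictor malfunctioned, and nothing in it could have known better.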
🔄 Check Your Understanding. A colleague says, "The AI told me the wrong publication year for a paper. It lied to me." Why is "lied" the wrong word, and why does the distinction actually matter for how you use the tool?
Answer
"Lied" implies the model knew the right year and chose to state a wrong one — but the model has no store of facts to know it from. It generated a plausible-looking year because a year is what comes next in that sentence, and the wrong one was a likely continuation. It wasn't deceiving you; it was doing the only thing it does — producing probable text — and a probable-looking year is not the same as a correct one. The distinction matters enormously for use: if you think of errors as "lies," you imagine they're rare and detectable as dishonesty, and you let your guard down when the model "seems sincere" (it always seems sincere). If you understand that the model is indifferent to truth by construction, you stop expecting sincerity to track accuracy, and you verify every fact — because the fluent, confident, wrong statement is not a malfunction. It's the system working as designed, and it looks identical to the system being right.
Two more facts, lighter but useful. First, the model only "sees" what's in front of it — your conversation, plus whatever you've pasted in — within a limited context window (a budget of how much text it can hold at once). It doesn't know your company, your project, your reader, or last week's meeting unless you put that on the screen. Second, these tools are tuned to be agreeable, a tendency researchers call sycophancy: ask "is this paragraph good?" and the model leans toward yes; push back on a correct answer and it often caves and "corrects" itself to the wrong one. Agreeableness is not accuracy. A tool that wants to please you is a poor judge of whether you're right — which, as we'll see, has sharp consequences for using it to evaluate your own writing.
That's the whole mental model: a fluent next-word predictor, optimizing for plausible over true, seeing only what you show it, eager to agree. Hold those four facts and nothing in this chapter will surprise you. Lose them and you'll be perpetually astonished that a tool this articulate can be this wrong.
29.2 What LLMs Are Genuinely Good At
Let's be fair to the tool before we're hard on it, because the strengths are real and a writer who refuses to use them is leaving genuine help on the table. The pattern across all of them: LLMs are good at the parts of writing that are mechanical, generative, or transformational — and weak at the parts that require judgment about truth and audience. Everything in this section is a task where "plausible" is good enough, or where you remain the judge of the output.
Brainstorming and getting unstuck. A blank page is a thinking problem disguised as a writing problem, and the model is a tireless generator of starting points. Ask for fifteen angles on a topic, ten possible titles, five ways to open a difficult email, and you get a spread — most of it mediocre, a few worth keeping, and (the real value) one that sparks an idea of your own you wouldn't have reached cold. The model isn't being creative here; it's giving you raw material to react to. Reacting to a bad idea is far easier than generating a good one from nothing, and that asymmetry is most of what makes brainstorming with an LLM worth the four seconds it costs.
Outlining and structure. Hand the model a topic and a target reader and ask for an outline, and it produces a reasonable skeleton — the conventional sections in the conventional order. This is genuinely useful precisely because it's conventional: a standard structure is often the right structure (Chapter 4 spent a whole chapter on why readers benefit from expected shapes), and seeing the obvious organization laid out frees you to decide where to deviate from it. The outline is a default to push against, not a command to obey.
First drafts of low-stakes, formulaic text. For genres that are mostly template (a routine meeting invite, a boilerplate status update, a standard cover-letter skeleton) the model produces a serviceable first draft fast, and the stakes are low enough that "serviceable" is the bar. The key word is low-stakes: the more a document's value lives in its specific content and judgment, the less a generic first draft helps, as Priya discovered in four empty seconds.
Rephrasing and reworking sentences. This is one of the strongest uses, and it maps directly onto Chapter 12's editing work. You have a sentence that's almost right — too long, awkwardly ordered, too stiff, too casual — and you can't quite see the fix. Ask for five rewrites and you get options to choose among. You're not outsourcing the writing; you're using the model as a thesaurus for whole sentences, and you pick the one that says what you mean. The judgment stays yours; the model just widens the menu.
Summarizing and compressing — with a sharp caveat. Paste a long document and ask for a summary, and the model is fast and usually decent at it. Useful for getting the gist of something you could verify if it mattered. The caveat is large enough to bold: a summary can silently drop the one exception that mattered, invert a hedge into a certainty, or smooth away a caveat that was the whole point. Summaries are a starting orientation, not a substitute for reading the source when the source matters.
Format and mechanical conversion. Turn this prose into a bulleted list; reformat these references into IEEE style; convert this paragraph into a table; draft a JSON skeleton from this description. These transformational tasks — where the content is fixed and only its shape changes — are squarely in the model's wheelhouse, because there's little room for it to invent and the output is easy to check at a glance. (Even here: verify. A citation reformatter that "helpfully" fills in a missing page number has just fabricated data, the §29.3 hazard in miniature.)
💡 Tip — the unifying test for a "good" AI task. Look back at the list and notice what every safe use shares: either you remain the judge of the output (you pick the rephrase, you accept or reject the outline), or the task is so mechanical there's little room to go wrong (format conversion), or the stakes are low enough that "plausible" suffices. Those three conditions — you judge it, it can't easily err, or it doesn't much matter — are the green-light test. When none of them holds, you're in §29.3 territory, and you should be careful.
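The green-light test is really a three-way "or", and writing it out as one can make that easier to remember. The sketch below is only an illustration: the `Task` fields and the two sample tasks are hypothetical labels for the three conditions in the tip, and the yes/no answers are judgments you supply, not something a program can work out for you.

```python
from dataclasses import dataclass


@dataclass
class Task:
    description: str
    you_judge_output: bool   # you pick, accept, or reject the result
    hard_to_get_wrong: bool  # mechanical: content is fixed, output is easy to check
    low_stakes: bool         # "plausible" is good enough here


def green_light(task: Task) -> bool:
    """A task passes if at least one of the three conditions holds."""
    return task.you_judge_output or task.hard_to_get_wrong or task.low_stakes


rephrase = Task("five rewrites of my own sentence", True, False, False)
review = Task("write a direct report's performance review", False, False, False)

for task in (rephrase, review):
    verdict = "green light" if green_light(task) else "be careful"
    print(f"{task.description}: {verdict}")
```

The function is trivial on purpose. The hard part is answering the three questions honestly for the task in front of you.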
Watch the model help, legitimately, on a real fragment. Priya has a clunky sentence in her dashboard announcement and wants options:
Her sentence (clunky): "The new dashboard has the capability to allow users to perform the filtering of their data across a multiplicity of dimensions in a simultaneous manner." (25 words, badly nominalized)
Prompt: "Rewrite this in plain, active voice, under 15 words, for a product announcement. Give me three options."
AI options:
1. "The new dashboard lets you filter your data across many dimensions at once." (13 words)
2. "Filter your data by several dimensions simultaneously, all in one view." (11 words)
3. "Now you can slice your data across multiple dimensions in a single pass." (13 words)
This is the tool at its best. The model didn't decide what Priya wanted to say — she did, by writing the clunky sentence first. It widened her options for how to say it, and she'll pick option 1 (or blend it with her own wording) because she's the one who knows the product's voice. She applied Chapter 3's lesson — cut the nominalization, go active, halve the word count — with the model as a fast assistant. The thinking was hers; the typing got faster. That's the whole healthy relationship in one exchange.
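The length limit in Priya's prompt is a mechanical constraint, which means you can check it rather than trust it. A minimal sketch, assuming a "word" is any run of letters, digits, or apostrophes (so a hyphenated compound counts as two words); the helper names are made up for this example.

```python
import re


def count_words(text: str) -> int:
    """Count word-like tokens; punctuation and dashes are not counted."""
    return len(re.findall(r"[A-Za-z0-9']+", text))


def under_limit(text: str, limit: int) -> bool:
    """True when the text has strictly fewer than `limit` words."""
    return count_words(text) < limit


options = [
    "The new dashboard lets you filter your data across many dimensions at once.",
    "Filter your data by several dimensions simultaneously, all in one view.",
    "Now you can slice your data across multiple dimensions in a single pass.",
]

for option in options:
    status = "ok" if under_limit(option, 15) else "too long"
    print(f"{count_words(option):>2} words, {status}: {option}")
```

A count like this confirms only that the constraint was met. Whether the sentence says what Priya means is still hers to judge.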
🔄 Check Your Understanding. All six "good" uses share a structural feature that makes them safe. Name it, and explain why "write my whole performance review of a direct report" lacks it.
Answer
The shared feature is that judgment stays with the human, or isn't needed. In brainstorming you judge which ideas to keep; in rephrasing you pick the option; in outlining you decide what to change; in format conversion there's little to judge because content is fixed; in low-stakes drafting the stakes cap the cost of a bad call. Writing a performance review lacks this on every count: it requires institutional knowledge the model doesn't have (what this person actually did this year), high-stakes judgment about a real human's career, and a specific, accountable voice — yours, the manager's. The model can produce fluent review-shaped text, but it would be inventing the substance (the §29.3 failure) and you'd be presenting fabricated assessments of a real person as your own considered judgment (the §29.6 failure). The structural test holds: when the value of the document is the judgment, and the judgment must be yours, the model can't supply it — only dress it up convincingly enough to fool you into thinking it has.
[📍 Good stopping point — you've seen what the tool is for. Next: what it's not for, and why.]
29.3 What LLMs Are Genuinely Bad At
Now the hard part, and the reason this chapter exists. The strengths in §29.2 are real, but they're bounded by a set of failures that are not bugs to be patched out — they follow directly from §29.1's load-bearing fact (the model optimizes for plausible, not true). Understanding why each failure happens lets you predict it instead of being ambushed by it.
Accuracy and hallucination — the headline failure. A hallucination is a confident, fluent, completely fabricated statement: an invented statistic, a misattributed quote, a citation to a paper that does not exist, a plausible-sounding fact that's wrong. The word can mislead — it suggests a rare glitch, when in truth this is the model doing exactly what it always does (producing plausible text) in a case where the plausible thing happens to be false. The danger isn't that hallucinations are frequent; it's that they are indistinguishable from the truth on the surface. A fabricated citation has authors, a title, a journal, a year, all formatted perfectly. A made-up statistic has a confident decimal. Chapter 11 warned about this precisely: models fabricate realistic-looking sources for papers that don't exist, and they do it in the same fluent, assured tone they use for true statements. There is no tell. The fluency that makes the tool pleasant to read is the same fluency that camouflages its errors — call it the curse of the plausible: the better the text sounds, the harder a wrong fact is to catch.
Nuance, hedging, and calibrated uncertainty. Chapter 7 taught the difference between "may suggest" and "proves," and why a careful writer hedges exactly as much as the evidence warrants. LLMs are bad at this calibration. They tend to flatten — stating the tentative as certain, or (over-corrected) hedging everything into mush. A model asked to summarize a study that "found a modest association in one population" may report "studies show that X causes Y." That's not a small error; in technical writing, the degree of certainty is part of the claim, and the model routinely gets the degree wrong because the confident phrasing is often the more "plausible"-sounding continuation. It doesn't know how sure it should be, because it doesn't know anything; it produces the register that fits the surrounding text.
Institutional and contextual knowledge — the things only you know. Here is the failure that no future model fixes, because it's not about capability — it's about access. Institutional knowledge is everything specific to your situation that lives nowhere in the model's training: that your VP hates the word "synergy," that the last launch failed for a reason you can't put in writing, that this client needs hand-holding, that your team agreed last Tuesday to a convention the docs don't mention yet, that the number is sensitive for political reasons. The model has read the public internet; it has not been in your meetings, read your Slack, or felt the room. Whenever the right thing to write depends on context only you possess, the model is flying blind — and worse, it will produce confident, fluent text as if it knew, filling the gap with generic plausibility. You'll recognize this in §29.4 as the deepest reason AI can't replace you: it doesn't have your context, and your context is often the whole point.
Original thinking — the one this book cares about most. Return to Chapter 1's thesis, because this is where it bites hardest. The model recombines what it has seen; it does not have an idea, a position, a finding, a genuine insight that wasn't latent in its training. Ask it for an analysis and you get the average analysis — the conventional take, fluently expressed. That's sometimes useful (the conventional take is a fine baseline) and sometimes catastrophic (your value was supposed to be the non-conventional insight only you, with your data and your thinking, could produce). When you ask the model to "analyze" or "conclude" or "recommend," you are asking it to do the exact thing Chapter 1 said writing is for — and it can't, because it isn't thinking. It's predicting what an analysis would sound like. The shape of insight without the substance.
Knowing your audience. Chapter 2 named audience the most important word in writing. The model doesn't know yours. It knows audiences in general — it can produce text that's "for a technical reader" or "for executives" in the generic sense — but it doesn't know that your executives are impatient, that this reviewer is hostile, that your users are beginners who'll panic at the word "deprecated." Audience knowledge is a species of institutional knowledge, and the model lacks it for the same reason: it wasn't there. Generic-audience text is exactly what Priya got, and exactly why it was useless.
🧩 Productive Struggle. Before you read on, try to catch the error yourself. Here is a fluent AI-generated paragraph for a report on remote work. Read it as a fact-checker — three of its claims should make you reach for a source, and at least one is the kind that's probably fabricated. Which sentences would you refuse to publish without verifying, and why?
"Remote work has transformed the modern workplace. A 2019 Stanford study by Professor James Hartwell found that remote employees were 43% more productive than their in-office counterparts. Furthermore, research consistently shows that 87% of workers prefer hybrid arrangements, and the Harvard Business Review reported in 2021 that companies adopting remote-first policies saw turnover drop by exactly one third. These trends are reshaping how organizations think about talent."
What a careful reader flags
Almost every specific claim is suspect, and the fluency is the camouflage. (1) "A 2019 Stanford study by Professor James Hartwell … 43% more productive": there is a real, well-known Stanford remote-work study (by Nicholas Bloom), but "Professor James Hartwell" and the precise "43%" have the exact texture of a hallucination: a real institution, a confident name, a suspiciously specific number. A plausible name attached to a real-sounding study is the classic fabrication. Do not cite this without finding the actual paper, and you may not find it, because it may not exist as described. (2) "87% of workers prefer hybrid": a clean, confident statistic with no source, the kind of number models generate to sound authoritative. Verify or cut. (3) "Harvard Business Review reported in 2021 … turnover drop by exactly one third": "exactly one third" is a tell, since real findings are rarely that tidy, and the attribution is vague enough to be invented. (4) Notice what's not flagged: "Remote work has transformed the modern workplace" and "these trends are reshaping how organizations think about talent" are unfalsifiable filler, not wrong, just empty (the AI-tell padding §29.4 covers). The lesson: the paragraph reads beautifully and is a minefield. Every confident statistic and named source is a claim you must independently verify, because the model generated them to be plausible, and plausible is precisely not the same as checked.
🔍 Why Does This Work? Why do all five weaknesses trace back to a single root, rather than being five separate quirks? Because they're all the same fact seen from different angles: the model produces plausible text without holding a meaning or checking a world. Hallucination is plausible-text-that's-false. Bad hedging is plausible-register-that-misstates-certainty. The institutional-knowledge gap is plausible-text-standing-in-for-context-it-lacks. The original-thinking failure is plausible-analysis-shaped-text-without-an-actual-analysis. The audience failure is plausible-generic-text-where-specific-was-needed. One root, five symptoms. This is why a single rule can govern all of them (§29.7): if the task requires the text to be true, calibrated, contextual, original, or audience-specific (the five things "plausible" doesn't guarantee), then you, the human, must supply that quality, because the model structurally cannot. Understand the root and you never have to memorize the five symptoms; you can derive them.
29.4 The Reframe: AI as a Revision Tool, Not a Replacement for Thinking
Everything so far points to one reframe, and it's the move that makes AI safe to use: stop asking the model to write for you, and start using it to revise with you. The difference isn't cosmetic — it's the difference between outsourcing your thinking and augmenting it, and it falls directly out of Chapter 12's central lesson: revision is where the writing happens.
Here's the failure mode to avoid, stated plainly. When you prompt "write me X" and paste the result, the thinking happened in the model — which is to say, it didn't happen, because the model doesn't think; it pattern-matched. You skipped the part Chapter 1 said was the whole point. You have text without having had a thought, and (the quiet cost) you've lost the diagnostic that writing provides: the place where you'd have stalled, struggled, and discovered you didn't actually understand the thing. The struggle was the feature. Skip it and you skip the learning, the rigor, and the chance to find out your reasoning had a hole.
Now the healthy pattern. You do the thinking — you write the draft, badly, in your own words, with your own struggle — and then you bring the model in to help you revise. In this order, the thinking is yours and stays yours; the model is a fast, infinitely patient assistant working on material you've already thought through. Concretely, a revision partner is good at:
- Reacting to your draft. "Here's my paragraph; is the argument clear, and where might a skeptical reader push back?" You're not asking it to write; you're asking it to read, and to surface objections you can then think about. (Mind the sycophancy, §29.1 — it'll be too kind; weight the criticisms, discount the praise.)
- Reacting against your draft. "Argue the opposite of my conclusion." A model is a cheap devil's advocate, and a steel-manned counter-argument is genuinely useful raw material for your thinking — you read the objection, decide if it lands, and strengthen your case. The judgment about whether the counter-argument is right stays entirely with you.
- Pressure-testing structure. "Here's my outline; what's missing, what's out of order?" You wrote the outline (so you did the structural thinking); the model checks it against the conventional shape and flags gaps you then evaluate.
- Line-level revision on text you wrote. The §29.2 rephrasing use, now in its proper home: you wrote the sentence and own its meaning; the model widens your options for expressing it.
The test that separates the two modes is simple and worth memorizing: did the idea originate with you? If you wrote a real draft and the model is helping you improve it, you're revising — augmentation, the good kind. If the model produced the substance and you're polishing its output, you're outsourcing — and you've handed away the thinking that was your job and your value. Same tool, opposite uses, and the line between them is who did the thinking first.
Watch the difference on Priya's announcement. Two versions of the same task — and the gap between them is the whole chapter.
❌ The outsourced version (model thinks; Priya polishes): Priya prompts "write a launch announcement for our new analytics dashboard," gets six fluent paragraphs, fixes a typo, and ships. The result:
"We're thrilled to announce the launch of our revolutionary new analytics dashboard! In today's fast-paced data-driven world, businesses need powerful tools to unlock actionable insights. Our cutting-edge dashboard empowers users to seamlessly visualize their data and make smarter decisions. With an intuitive interface and robust features, it's a game-changer for teams of all sizes…"
It's grammatical, confident, and says nothing. "Revolutionary," "cutting-edge," "game-changer," "seamlessly," "in today's fast-paced data-driven world": every phrase is the generic default register, the AI-tell padding that occupies space and carries no information (the exact bloat Chapter 3 taught you to cut). It names no specific feature, speaks to no specific reader, makes no specific claim. Worse, buried in the paragraphs not quoted here were two features the dashboard doesn't have, invented because they're plausible things an analytics dashboard might announce (§29.3, live). Priya outsourced the thinking, and the model filled the vacuum with fluent nothing and confident fiction.
✅ The revision version (Priya thinks; model assists): Priya writes a rough draft first, in her own words, knowing her product and her audience (analysts on her company's data team who are sick of slow dashboards):
"Rough: Our old dashboard made you wait 8 seconds to load a filtered view. People stopped using it. The new one loads filtered views in under a second because we moved the aggregation to the database. If you bailed on the old dashboard, this is the one that fixes the thing you hated."
Then she brings in the model — "Tighten this for a product announcement; keep the specific 8-seconds-to-under-1-second detail and the 'we know you gave up on the old one' angle; three options, plain and active, no marketing fluff." The model returns options that keep her substance and improve the phrasing. She picks and blends:
"The old dashboard took 8 seconds to load a filtered view, so a lot of you stopped using it. The new one does it in under a second — we moved the heavy aggregation into the database. If you gave up on the old dashboard, this is the version that fixes the slow part."
Why it's better: The substance is entirely Priya's. It carries the real number (8s → <1s), the real reason (aggregation moved to the database), and the real audience insight (people had given up). The model never invented a feature, because it was never asked to supply substance, only to tighten prose Priya had already filled with her own knowledge. The thinking happened where it had to: in her head, on her rough draft, before the model touched it. The result is specific, honest, and hers, the opposite of the fluent nothing the outsourced version produced for the same announcement.
That contrast is the chapter in one example. Same product, same tool, same four seconds of model time — and the difference between an empty, partly-fabricated announcement and a sharp, honest one is entirely the order of operations: who did the thinking, and when.
🚪 Threshold Concept. This is the doorway, and crossing it changes how you reach for the tool forever. Before you cross it, you think the question is "can AI write this for me?" — you treat the model as a writer you delegate to, and "better at writing" means "needs me less." After you cross it, you understand that AI can draft, but it can't think for you — it produces fluent text without holding a meaning, so the judgment about whether the text is true, right, contextual, and yours stays entirely with you, always, no matter how good the model gets. The shift is from delegation to augmentation: you stop asking the model to replace your thinking and start using it to sharpen thinking you've already done. Once you've crossed, an empty fluent draft doesn't read as "a good start I can polish"; it reads as a warning — a sign that no thinking has happened yet and you're about to ship a vacuum. And the question at every prompt changes from "can it do this for me?" to "have I done the thinking, so this tool can help me express it?" That single reframe is the difference between a writer who uses AI and a writer AI has quietly replaced — with fluent nothing. (This is theme 1 at its sharpest: writing is thinking, so a tool that does the writing without the thinking hasn't helped you write — it's helped you skip writing, which means skip thinking, which was the entire point.)
🪞 Learning Check-In. Pause and be honest with yourself, because this is the chapter where the honest answer matters most. Think of the last time you used an AI to help with writing. Which mode were you in — did you do the thinking and use the model to revise, or did you prompt it to produce the substance and then tidy the output? There's no shame in the second; nearly everyone starts there because it's easy and the result looks finished. But notice the trade you made: in the second mode, what did you not learn, not catch, not figure out — because the struggle that would have surfaced it never happened? The goal of this chapter isn't to make you feel bad about reaching for the tool. It's to make you notice which mode you're in, every time, so the choice to outsource your thinking is at least a choice — and usually, once you notice, not the one you want to make.
29.5 The Prompt Engineering of Writing
If you're going to use the model as a revision partner, you have to talk to it well — and prompt engineering (the craft of writing instructions that get useful output) turns out to be, almost entirely, the application of skills this book already taught you. A vague prompt is a vague brief, and a vague brief produces generic work whether the writer is a person or a model. The fix is the fix you'd give any writer: specify the reader, the purpose, and the constraints. Prompt engineering is Chapter 2 (audience) and Chapter 4 (structure) and Chapter 7 (register), aimed at a machine.
The difference a specific prompt makes is dramatic. Watch the same request, vague and then engineered:
❌ Vague prompt: "Write something about our database migration for the team."
What you get: a generic, medium-length, medium-formal blob about database migrations in the abstract — true-ish, shapeless, aimed at no one, useful to no one. The model had nothing to specialize on, so it produced the average of all "something about a database migration," which is to say, nothing about yours.
✅ Engineered prompt: "You're a senior backend engineer writing a heads-up message to your team's Slack channel. Audience: five developers who know the system well. Purpose: warn them that the user-database migration runs Saturday 2am and the API will be down for ~30 minutes, and tell them not to deploy Friday afternoon. Tone: direct, calm, no corporate fluff. Length: under 120 words. Here's an example of our channel's voice: 'Heads up — deploying the search fix at 3pm, expect a 2-min blip.' Match that register."
Why it's better: Every vagueness is now a specification. The model knows the role (senior backend engineer), the audience (five developers who know the system), the purpose (the three concrete things to convey), the tone (direct, no fluff), the length (under 120 words), and — most powerfully — it has an example of the actual voice to match. The output will be specific, correctly-pitched, and close to usable, because you gave it the brief you'd give a competent human writer. The skill wasn't "knowing the magic words"; it was knowing what a good brief contains — which is Chapter 2 and Chapter 4, applied.
Four levers do most of the work. Pull them deliberately:
- Role. "You're a senior backend engineer / a patient tutor / a skeptical reviewer." Naming a role tilts the model toward a register and a stance, which mostly matters because it shapes tone and assumed expertise. (Don't over-believe it — a "role" doesn't give the model expertise it lacks; it just steers the voice.)
- Audience. The single highest-leverage lever, for the reason Chapter 2 gave: the same content for a different reader is a different document. "For executives who have three minutes" vs. "for new engineers who've never seen this system" produces genuinely different, genuinely better output than leaving it unsaid — because unsaid means generic.
- Constraints. Length, format, tone, what to include, what to leave out. "Under 120 words." "As a numbered list." "No marketing language." "Don't mention pricing." Constraints are where you encode the judgment the model lacks — they're how you fence it away from the generic and the fabricated.
- Examples (few-shot). The strongest lever after audience: show the model what you want by giving it a sample. A few-shot example — one or two instances of the target style, format, or voice — steers the output far more reliably than describing the style in words, because matching a concrete pattern is exactly what the model is built to do. "Match this register" with an example beats "be professional but friendly" every time.
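If it helps to see the four levers as a checklist you can run, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Brief` class, its field names, and the `build_prompt` and `missing_levers` helpers are hypothetical, not part of any real library, and no model is called. It only turns a brief into prompt text and reports which levers you left empty, using the two database-migration prompts from this section.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# The four levers from this section. The names are this sketch's own.
LEVERS = ("role", "audience", "constraints", "examples")


@dataclass
class Brief:
    """A writing brief: the task plus the four levers (hypothetical schema)."""
    task: str
    role: str = ""
    audience: str = ""
    constraints: list[str] = field(default_factory=list)
    examples: list[str] = field(default_factory=list)


def missing_levers(brief: Brief) -> list[str]:
    """Return the levers left empty, in the order the section lists them."""
    return [name for name in LEVERS if not getattr(brief, name)]


def build_prompt(brief: Brief) -> str:
    """Render the brief as plain prompt text, one line per specification."""
    parts = []
    if brief.role:
        parts.append(f"You're {brief.role}.")
    if brief.audience:
        parts.append(f"Audience: {brief.audience}.")
    parts.append(f"Purpose: {brief.task}.")
    parts.extend(f"Constraint: {c}." for c in brief.constraints)
    parts.extend(f"Example of the voice to match: '{e}'" for e in brief.examples)
    return "\n".join(parts)


vague = Brief(task="write something about our database migration for the team")

engineered = Brief(
    task=("warn them that the user-database migration runs Saturday 2am, "
          "the API will be down for ~30 minutes, and nobody should deploy Friday afternoon"),
    role="a senior backend engineer writing a heads-up message to your team's Slack channel",
    audience="five developers who know the system well",
    constraints=["direct, calm, no corporate fluff", "under 120 words"],
    # \u2014 is the dash in the chapter's sample Slack message.
    examples=["Heads up \u2014 deploying the search fix at 3pm, expect a 2-min blip."],
)

print(missing_levers(vague))       # ['role', 'audience', 'constraints', 'examples']
print(missing_levers(engineered))  # []
print(build_prompt(engineered))
```

The code does none of the thinking. It can tell you a lever is empty; it cannot tell you who your audience is. Filling in the fields is still your job.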
✏️ Try This. Take a vague prompt you might actually type — "help me write an email about the project delay" — and rebuild it with all four levers. Name the role (you, the project lead). Name the audience (the client, who is anxious and has been promised this date twice). Name the constraints (under 150 words; lead with the new date; one concrete reason; no excuses). Add an example if you have one (a past email whose tone worked). Then notice: you just did audience analysis (Chapter 2), structure (lead with the ask, Chapter 4), and register (Chapter 7) — and wrote them down as instructions. That's all prompt engineering is. The "skill" is just refusing to be vague.
There's a deeper point hiding in that exercise, and it's the reason this section is short. A good prompt is mostly a good brief, and writing a good brief is itself an act of thinking. To specify the audience, you have to know the audience; to state the purpose in one sentence, you have to have figured out the purpose; to set the constraints, you have to have decided what matters. By the time you've written a genuinely good prompt, you've done a substantial chunk of the thinking the document needed — which is why prompt-writing, done well, sneaks the thinking back in. The vague prompt skips the thinking and gets generic mush; the engineered prompt requires the thinking and gets useful output. The prompt is where Chapter 1 reasserts itself: you cannot brief well what you have not thought through, and the briefing is thinking.
🔄 Check Your Understanding. Someone says, "Prompt engineering is a new technical skill I need to learn from scratch." Based on this section, what's the more accurate framing, and which earlier chapters is prompt engineering mostly made of?
Answer
The more accurate framing: prompt engineering is not mostly a new skill — it's the writing skills you already have, written down as instructions for a machine. A good prompt is a good brief, and a good brief specifies the audience (Chapter 2 — the highest-leverage lever, because the same content for a different reader is a different document), the purpose and structure (Chapter 4 — what to lead with, what shape the output takes), the register and tone (Chapter 7 — formal/informal, the voice to match), and the constraints that encode your judgment (length, inclusions, exclusions). The genuinely AI-specific parts are thin: knowing the model is a plausible-text predictor (so you verify), knowing that examples (few-shot) steer it more reliably than descriptions, and knowing it sees only what's in its context window (so you paste in what it needs). Everything else is Chapters 2, 4, and 7. The reframe matters because "a new technical skill" makes prompting sound arcane and gate-kept; "a good brief, which requires you to have thought" puts it back where it belongs — on the thinking, which is yours to do.
📐 Project Checkpoint
Your Communication Portfolio now holds substantial work — by this point in the book you've drafted a technical report, user documentation, a data memo, and more, each one carried through at least one real revision. This checkpoint doesn't add a new piece. It adds a practice — a disciplined way to use AI on the pieces you already have, so the tool sharpens your portfolio instead of diluting it.
Here's the task. Take one piece from your portfolio — any of them — and run it through a deliberate AI revision pass, keeping a record of exactly what you accepted, rejected, and why. This is the chapter's whole method, made into an artifact:
- Think first; the draft is already yours (§29.4). You're not generating anything new — the piece exists, in your words, thought through. Good. That's the precondition that makes everything below augmentation and not outsourcing.
- Brief the model well (§29.5). Write a real prompt: role, audience, and constraints. Then paste in the piece itself, plus an example of the voice you want if you have one. Ask it to do one revision job: "tighten for concision," or "flag where a skeptical reader pushes back," or "where is my structure unclear?" One job per pass, so you can judge the result.
- Make it argue against you (§29.4). In a second prompt, ask the model to steel-man the opposite of your main claim, or to list the three weakest points in your piece. Read the objections as raw material for your thinking — not as verdicts.
- Verify everything factual (§29.3, §29.7). If the model touched any fact, number, citation, or name, check it against a real source. Treat every confident specific as suspect until confirmed. Log any hallucination you catch — they're instructive.
- Keep the decision log. For each suggestion the model made, write one line: accepted because… or rejected because… This log is the artifact. It proves the judgment was yours — that you used the model as a revision partner and stayed the author, accepting what improved the piece and refusing what didn't.
Keep two artifacts: the revised piece (with changes you chose), and the decision log (what you accepted/rejected and why, plus any hallucination you caught). Together they demonstrate the skill this chapter teaches — not "using AI to write," but using AI without surrendering the thinking, the voice, or the accountability. The log is the evidence that the writer in charge was you.
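The decision log can live in a notebook or a text file, but if you prefer something structured, a sketch like the following would do. The `LogEntry` schema and both helper functions are hypothetical, and the three entries are invented purely to show the shape; your log will hold your own suggestions and your own reasons.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    """One model suggestion and your verdict on it (hypothetical schema)."""
    suggestion: str
    accepted: bool
    reason: str
    hallucination: bool = False  # True if the suggestion contained a fabricated specific


def render(entry: LogEntry) -> str:
    """Format one entry as the one-line 'accepted because / rejected because' record."""
    verdict = "accepted" if entry.accepted else "rejected"
    flag = " [hallucination caught]" if entry.hallucination else ""
    return f"- {entry.suggestion}: {verdict} because {entry.reason}{flag}"


def summarize(log: list[LogEntry]) -> dict[str, int]:
    """Count accepted, rejected, and hallucination-flagged entries."""
    accepted = sum(e.accepted for e in log)
    return {
        "accepted": accepted,
        "rejected": len(log) - accepted,
        "hallucinations": sum(e.hallucination for e in log),
    }


# Invented entries, for illustration only.
log = [
    LogEntry("cut the second paragraph's opening sentence", True,
             "it repeated the heading"),
    LogEntry("swap 'fast' for 'blazing-fast'", False,
             "it isn't how I write"),
    LogEntry("add a benchmark figure", False,
             "I couldn't find any source for the number", hallucination=True),
]

for entry in log:
    print(render(entry))
print(summarize(log))  # {'accepted': 1, 'rejected': 2, 'hallucinations': 1}
```

The `reason` field is the part that matters. A log with verdicts but no reasons records what you did, not the judgment behind it.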
Next increment (Chapter 30): Part VI opens, and your portfolio's presentation piece begins. You'll take a piece you've already written and start turning its content into a talk — where the same discipline applies: the thinking is yours, the slides serve the reader (or here, the listener), and no tool can supply the judgment about what your specific audience needs to hear.
29.6 Integrity: When AI Help Is Fine, and When It Crosses the Line
Chapter 11 opened this thread; this section closes it. The integrity question isn't a vague "is using AI cheating?" — it's a set of answerable questions about disclosure, verification, and accountability, and the answers depend on your context. Let's make them concrete, because honest people get tripped up here not by bad intent but by unclear lines.
Start with the distinction that organizes everything: AI-assisted versus AI-generated. AI-assisted writing is yours — you thought it through, you wrote it, the model helped you revise, and you stand behind every claim. AI-generated writing is the model's — it produced the substance and you passed it along. The first is, in most contexts, a legitimate use of a tool (the way a spell-checker or a thesaurus is). The second is where the trouble lives, because you're presenting as your own work, and your own thinking, something that is neither. The line this chapter has drawn all along — who did the thinking — is the same line integrity turns on.
Three rules carry across every context. They're Chapter 11's three rules, sharpened:
- Disclose per your context. Whether you must say "I used AI" depends entirely on where you are. A student submitting an essay under an academic-integrity policy: disclosure is usually required, and using AI at all may be restricted or banned; read the policy, and when unsure, ask. A professional drafting an internal memo with AI help: usually no disclosure expected, the same as using a thesaurus. A researcher: many journals now require an explicit statement of AI use in the methods or acknowledgments. The rule isn't "always disclose" or "never disclose"; it's know the norm of your context and meet it, and when the norm is unclear, disclosure is the safe and honest default.
- Verify everything; accuracy is non-delegable. This is Chapter 11's hardest line and it's absolute: you are accountable for every fact, number, and citation in writing that goes out under your name, regardless of where it came from. "The AI told me" is not a defense: not to a professor, not to a boss, not to a journal editor, not to a court (lawyers have been sanctioned for filing AI-hallucinated case citations they didn't check). The model fabricates realistic-looking citations for papers that don't exist (§29.3); if you cite them without verifying, you have published a false citation, and the model's involvement changes nothing about your responsibility. Verification is the price of using the tool, and it is not optional.
- Don't present as yours reasoning you can't evaluate. This is the deepest one, and it's the bridge to §29.7. If the model produced an analysis, an argument, or a conclusion that you cannot independently judge to be sound, you must not present it as your own considered reasoning, because it isn't, and you can't even tell if it's right. Passing off conclusions you can't evaluate is a double failure: it's dishonest (claiming thinking you didn't do) and it's dangerous (you might be propagating a confident error you're not equipped to catch). Integrity here isn't only about credit; it's about not vouching for what you don't understand.
⚠️ Warning — the academic line is the strictest, and the stakes are real. In coursework, the purpose of writing is usually to demonstrate your own thinking and learning — which means AI-generating the substance doesn't just risk a policy violation, it defeats the point of the assignment (you didn't learn the thing the writing was supposed to teach you, exactly the §29.4 cost). Policies vary wildly — some courses ban AI entirely, some allow it for brainstorming only, some permit it with disclosure — and the penalties for getting it wrong can be severe. Read your institution's and your instructor's policy before using AI on any submitted work, and when in doubt, ask explicitly. Assuming permission is the expensive mistake.
A useful way to locate yourself: ask what the writing is for. If its purpose is to demonstrate your thinking (a school essay, an exam, a qualifying paper), AI-generating the substance hollows out the entire exercise — don't. If its purpose is to produce a result (a routine work email, a status update, internal notes), AI assistance is usually fine, bounded by verification and your context's norms. Most workplace writing is the second kind; most academic writing is the first; and the trouble comes from applying the second kind's permissiveness to the first kind's purpose.
🔄 Check Your Understanding. A student uses AI to brainstorm essay topics, then to outline, then to "improve the flow" of paragraphs they wrote, then asks it to "write a stronger conclusion" — and submits the result with no disclosure, under a policy that permits AI "for brainstorming and feedback only." Walk the line: where exactly did this cross from permitted to violation?
Answer
Walk it step by step against the policy ("brainstorming and feedback only"). Brainstorming topics: permitted; it's explicitly allowed, and it's the §29.2 ideation use with the student judging which topic to keep. Outlining: a gray area; arguably "feedback" if the student wrote a rough structure first and asked for critique, but if the model generated the outline, that's structural thinking outsourced, likely outside "brainstorming and feedback." "Improve the flow" of paragraphs the student wrote: defensible if it's the §29.2/§29.4 rephrasing use on text the student authored and the student judges each change; this is the closest to legitimate. "Write a stronger conclusion": this is the violation. Asking the model to write the conclusion means it produced substance (an argument-closing claim) that the student then submits as their own thinking. That's AI-generated, not assisted; it exceeds "brainstorming and feedback"; and a conclusion is exactly where the essay's thinking culminates, so it's the worst place to outsource. The lack of disclosure compounds it. The line was crossed at "write … for me" applied to substance, the same line this whole chapter draws: the model produced the thinking, and the student passed it off as theirs. Brainstorming and "improve the flow of my paragraphs" were inside the policy, and outlining was defensible only if the student drafted the structure first; "write the conclusion" stepped over.
29.7 The Rule That Governs Everything
Strip away the lists and the nuance and one rule remains, and if you remember a single sentence from this chapter, make it this one:
If you cannot evaluate whether the output is correct, you should not use AI for that task.
That's it. It's the whole chapter compressed to a test you can run in two seconds before any prompt. And it works because it's the exact complement of §29.1's load-bearing fact. The model produces plausible text, not true text; plausible and true diverge sometimes, invisibly; so the only thing standing between you and a confident, fluent, published error is your ability to tell when the output is wrong. If you have that ability — if you know the subject well enough to catch a hallucination, a miscalibrated claim, a missing exception — then AI is a powerful tool, because you're the verification layer it lacks. If you don't have that ability — if you're using AI precisely because you don't know the subject — then you have no way to catch its errors, and you're not using a tool; you're gambling, with your name on the result.
Run the rule across cases and watch it sort them cleanly:
- Rephrasing a sentence you wrote → you can evaluate the result (you know what you meant), so the rule says go ahead. ✅
- Summarizing a document you'll verify if it matters → you can check the summary against the source, so go ahead, and check when it counts. ✅
- Writing code in a language you know → you can read and test the output, so go ahead (you're the reviewer). ✅
- Writing a legal/medical/financial claim you can't verify → you cannot evaluate correctness, so do not — the §29.6 sanctioned-lawyer case is this rule violated. ❌
- Generating analysis in a domain you don't understand → you can't tell a real insight from a plausible-sounding wrong one, so do not — you'd be vouching for reasoning you can't judge (§29.6). ❌
- Citing sources the model produced → you can't confirm they exist without checking, so the rule says verify before you cite, every time (and many won't survive the check). ⚠️→✅ only after verification.
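The sorting above is mechanical enough to write down as a function, which is a fair sign of how simple the rule is. The sketch below is one possible encoding, not the only one: the two yes/no inputs (`can_evaluate_now` and `can_check_against_source`) and the `Verdict` names are assumptions made for this example. The honest answers to those two questions are still yours to supply.

```python
from enum import Enum


class Verdict(Enum):
    GO_AHEAD = "go ahead"
    VERIFY_FIRST = "verify before you use it"
    DO_NOT = "do not use AI for this task"


def governing_rule(can_evaluate_now: bool,
                   can_check_against_source: bool = False) -> Verdict:
    """Apply the rule: no ability to evaluate the output means no AI for the task."""
    if can_evaluate_now:
        return Verdict.GO_AHEAD
    if can_check_against_source:
        # You can't judge it by reading, but a real source can settle it.
        return Verdict.VERIFY_FIRST
    return Verdict.DO_NOT


# The six cases from the list above, with this sketch's reading of each.
cases = {
    "rephrasing a sentence you wrote": governing_rule(True),
    "summarizing a document you can check": governing_rule(True),
    "writing code in a language you know": governing_rule(True),
    "a legal claim you can't verify": governing_rule(False),
    "analysis in a domain you don't understand": governing_rule(False),
    "citing sources the model produced": governing_rule(False, can_check_against_source=True),
}

for task, verdict in cases.items():
    print(f"{task}: {verdict.value}")
```

Notice what the function cannot do: it cannot tell whether you answered `can_evaluate_now` truthfully. The rule only works if you are honest about how far your own judgment reaches.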
Notice the elegant thing the rule does: it makes AI most dangerous in exactly the situation where it's most tempting — when you don't know the subject and want the model to know it for you. That's the trap, and the rule is the trip-wire. The pull is strongest ("I don't understand this, let the AI handle it") at the precise moment the tool is least safe (you can't check what it gives you). The rule says: the less you know, the less you should trust the output — which is the reverse of how temptation works, and exactly why you need the rule as an external check rather than relying on judgment in the moment.
🔍 Why Does This Work? Why does "can you evaluate the output?" turn out to be the single sufficient test, when the chapter spent six sections on strengths, weaknesses, modes, and integrity? Because every other consideration collapses into it. The strengths (§29.2) are all cases where you can evaluate (you pick the rephrase, you check the format). The weaknesses (§29.3) are all cases where evaluation is hard (you can't easily spot a hallucination or a miscalibrated hedge), which is a warning to evaluate harder, not to skip it. The revision reframe (§29.4) works because revising text you wrote keeps you able to evaluate it (you own the meaning). The integrity rules (§29.6) reduce to it: "don't present reasoning you can't evaluate" is the rule, in ethical clothing. And the rule's deepest justification is Chapter 1's: writing is thinking, and evaluating whether the writing is right is itself the thinking — so the rule "only use AI where you can evaluate the output" is identical to "only use AI where you keep doing the thinking." The tool is safe exactly as far as your judgment reaches, and not one inch further. Keep the judgment, and the model is a fine assistant. Surrender it, and the model is a confident stranger writing checks your name has to cash.
29.8 Common Mistakes & Practical Considerations
The rule is simple; the ways people get it wrong are consistent. Here are the recurring failures and their fixes.
Mistake 1: Trusting fluency as a proxy for accuracy. The text reads beautifully, so it must be right. The fix: fluency is what the model optimizes for and tells you nothing about truth (§29.1, §29.3). Treat every confident specific — a number, a name, a citation, a date — as unverified until you've checked it. The better it sounds, the more deliberately you check (the curse of the plausible).
Mistake 2: Outsourcing the thinking, not the typing. Prompting "write me X" and shipping the result, having had no thought of your own. The fix: think first, draft in your own words, then use the model to revise (§29.4). The test: did the idea originate with you? If the model supplied the substance, you outsourced — and lost the thinking that was the point.
Mistake 3: Losing your voice to the default register. Letting the model's generic, "revolutionary-cutting-edge-seamlessly" voice replace yours. The fix: give it your draft and an example of your voice to match (§29.5), and edit its output back toward how you actually sound. The model's default voice is no one's voice; if your writing starts sounding like everyone's, that's the tell.
Mistake 4: Vague prompts, generic output. "Write something about X" and disappointment that it's bland. The fix: brief it like a writer — role, audience, constraints, example (§29.5). Generic in, generic out; the vagueness was yours.
Mistake 5: Skipping verification because "it's usually right." It is usually right, which is exactly the trap — the occasional confident error hides among the many correct ones. The fix: "usually right" is not "right," and you can't tell which sentence is the exception without checking (§29.7). Accuracy is non-delegable; verify what carries your name.
Mistake 6: Assuming permission in academic or regulated contexts. Using AI on submitted work without checking the policy. The fix: read the policy first; when unsure, ask; disclose when the norm requires it (§29.6). The penalty for guessing wrong is far larger than the cost of asking.
Mistake 7: Using AI precisely where you can't judge the output. Reaching for the model because you don't understand the subject. The fix: that's the one situation the governing rule forbids (§29.7). The less you know, the less you can catch the model's errors, and the more dangerous its fluent confidence becomes. Use AI where your judgment reaches; not past it.
An honest "it depends." How much you should lean on these tools scales with two things: the stakes of the document and your own expertise in its subject. A throwaway internal note in your area of mastery — lean freely; you'll catch anything wrong, and it barely matters if you don't. A high-stakes external document, or anything in a field you can't evaluate — lean little or not at all, and verify relentlessly what you do use. The two dials move together with the governing rule: high expertise plus low stakes is the safe corner; low expertise plus high stakes is the corner where AI does the most damage and tempts you the most. And one constant across every case: the tool is changing fast, the specific models will be different by the time you read this, but the rule doesn't change, because it's not about the technology — it's about the unbridgeable gap between plausible and true, and who is responsible for closing it. That responsibility is always yours.
Frequently Asked Questions
Is it OK to use AI to write?
It depends on what part of the writing and what the writing is for. Using AI to help you revise — to rephrase a sentence you wrote, brainstorm angles, pressure-test your argument, or convert a format — is usually fine, because the thinking and the substance stay yours and you remain the judge of the output (§29.4). Using AI to generate the substance — to produce the analysis, the argument, or the conclusion that you then pass off as your own — is where it crosses lines, both because you've outsourced the thinking that writing is for (Chapter 1) and because you may be presenting reasoning you can't actually evaluate (§29.6). Context decides the rest: most workplace writing tolerates AI assistance bounded by verification; academic writing, whose purpose is to demonstrate your thinking, is far stricter and often requires disclosure or restricts AI entirely. The honest one-line answer: AI to help you think and revise, usually yes; AI to think for you, no — and always read your context's policy.
Will AI replace writers?
Not in the way the question fears, and the reason is the whole chapter. AI produces fluent text without holding a meaning, checking a fact, or knowing your specific reader — so it can't do the parts of writing that are writing: having the idea, getting the fact right, calibrating the certainty, knowing what your audience needs (§29.3). What it can replace is the typing — the mechanical production of plausible prose — which was never the valuable part. A writer who only produced generic, fact-free, audience-agnostic text was already in trouble; the model just makes that clear. The writers who thrive will be the ones who do the thinking the model can't and use it to work faster on the expression. The skill that becomes more valuable, not less, is exactly this book's subject: thinking clearly and knowing your reader. The tool raises the floor on fluency and leaves the ceiling — judgment — entirely to you.
How do I keep my voice when using AI?
Two moves. First, write first: draft in your own words before the model touches anything, so the voice is already on the page and the model is editing yours rather than supplying its own (§29.4). Second, give it an example of your voice and tell it to match — a few-shot sample steers the model far more than "sound like me" does (§29.5) — and then edit its output back toward how you actually sound, deleting the generic default register ("revolutionary," "seamlessly," "in today's fast-paced world") wherever it creeps in. The failure mode is letting the model originate the text, because its default voice is no one's voice — fluent, confident, and interchangeable. If your writing starts sounding like everyone else's AI-assisted writing, that's the signal you've handed over the wheel. Keep your hands on it: your draft first, your edits last, the model only in the middle.
What is prompt engineering, really?
It's writing a good brief — and a good brief is mostly the writing skills this book already taught, aimed at a machine (§29.5). The four levers are role ("you're a senior engineer"), audience (the highest-leverage one, because the same content for a different reader is a different document — Chapter 2), constraints (length, format, tone, what to include and exclude — where you encode the judgment the model lacks), and examples (a few-shot sample, which steers the model more reliably than any description). The genuinely AI-specific knowledge is thin: the model is a plausible-text predictor (so verify), it sees only what's in its context window (so paste in what it needs), and examples beat descriptions. Everything else is audience analysis, structure, and register — Chapters 2, 4, and 7. The deeper truth: writing a good prompt requires you to have thought (you can't specify an audience you haven't considered or a purpose you haven't decided), so good prompting sneaks the thinking back in. The vague prompt skips the thinking and earns generic mush.
Why does AI make things up (hallucinate)?
Because it's built to produce plausible text, not true text, and the two aren't the same (§29.1). The model predicts likely continuations; when a question calls for a fact, name, or citation, it generates the kind of thing that plausibly fits — a real-looking author, a confident statistic, a properly-formatted reference — whether or not that specific thing is real. It isn't malfunctioning or lying; lying requires knowing the truth, and the model has no store of facts to know it from. A hallucination is just the model doing its normal job (producing plausible text) in a case where the plausible answer happens to be false — and because it's fluent and confident in exactly the same way as when it's right, there's no surface tell (§29.3, the curse of the plausible). That's why verification is non-negotiable: you cannot distinguish a fabricated fact from a true one by reading; you can only distinguish them by checking against a real source.
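Since reading alone can't separate the fabricated from the true, it can help to pull the checkable specifics out of a draft and work through them one by one. The sketch below is a deliberately crude illustration using Python's standard `re` module. The two patterns and the `verification_checklist` function are assumptions of this example; they find digits and "(Author, Year)" citations only, and they will miss names, dates written in words, and anything subtler. The sample sentence and its citation are invented.

```python
from __future__ import annotations

import re

# Crude, illustrative patterns (assumptions of this sketch, not a real detector).
CITATION = re.compile(r"\([A-Z][A-Za-z]+(?: et al\.)?,? \d{4}\)")
NUMBER = re.compile(r"\d+(?:[.,]\d+)*%?")


def verification_checklist(text: str) -> list[tuple[str, str]]:
    """List the citations and numbers in text, each one still to be verified."""
    items = [("citation", m.group(0)) for m in CITATION.finditer(text)]
    # Blank out citations first so their years aren't counted again as numbers.
    remainder = CITATION.sub(" ", text)
    items += [("number", m.group(0)) for m in NUMBER.finditer(remainder)]
    return items


# Invented sample sentence with an invented citation, for illustration only.
sample = ("Load time fell from 8 seconds to under 1 second, "
          "an 87.5% drop (Example et al., 2021).")

for kind, value in verification_checklist(sample):
    print(f"[ ] {kind}: {value}")
# [ ] citation: (Example et al., 2021)
# [ ] number: 8
# [ ] number: 1
# [ ] number: 87.5%
```

A list like this tells you what to check, not whether it's true. The checking still happens against a real source, and anything the patterns missed is still yours to catch.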
Chapter Summary
Key Takeaways
- An LLM is a fluent next-word predictor that optimizes for plausible, not true. Every strength and every failure follows from this one fact. Most plausible text is true, which makes the tool useful; some plausible text is false and looks identical, which makes it dangerous.
- LLMs are genuinely good at brainstorming, outlining, rephrasing, summarizing (with verification), low-stakes first drafts, and format conversion — the tasks where you stay the judge, the work is mechanical, or the stakes are low.
- LLMs are genuinely bad at accuracy (hallucination), calibrated nuance, institutional/contextual knowledge, original thinking, and knowing your specific audience — the five things "plausible" doesn't guarantee, all tracing to one root.
- Use AI as a revision tool, not a replacement for thinking. Do the thinking, write the draft in your own words, then bring the model in to react, pressure-test, and rephrase. The test: did the idea originate with you?
- Prompt engineering is writing a good brief — role, audience, constraints, examples — which is mostly Chapters 2, 4, and 7 aimed at a machine. A good prompt requires thinking, so it sneaks the thinking back in.
- Integrity rests on three rules: disclose per your context, verify everything (accuracy is non-delegable — "the AI told me" is no defense), and don't present as yours reasoning you can't evaluate. Academic writing is strictest because its purpose is to demonstrate your thinking.
- The governing rule: if you cannot evaluate whether the output is correct, you should not use AI for that task. The tool is safe exactly as far as your judgment reaches.
- The threshold concept: AI can draft, but it can't think for you. The judgment about whether text is true, right, and yours stays entirely with you, no matter how good the model gets.
Action Items
- Before any AI writing task, run the governing rule: can I evaluate whether this output is correct? If no, don't use AI for it.
- Reverse your order of operations: write your own rough draft first, then use the model to revise — never the other way around.
- Rebuild one vague prompt into a real brief: role, audience, constraints, and a voice example.
- Verify every fact, number, name, and citation the model produces, against a real source — especially the ones that sound authoritative.
- Before using AI on any submitted academic work, read the policy; when unsure, ask explicitly.
Common Mistakes
- Trusting fluency as a proxy for accuracy (the curse of the plausible).
- Outsourcing the thinking, not just the typing (idea originated with the model).
- Losing your voice to the generic default register.
- Vague prompts producing generic output.
- Skipping verification because "it's usually right."
- Assuming permission in academic or regulated contexts.
- Using AI precisely where you can't judge the output — the one forbidden case.
Decision Framework
| Before you reach for the model, ask… | …and do this |
|---|---|
| Can I evaluate whether the output is correct? | If no, don't use AI for this task (the governing rule). |
| Did the idea originate with me, or with the model? | If the model, you're outsourcing — write your own draft first, then revise. |
| Is this task one where I stay the judge (rephrase, outline, format)? | Green light — use it, pick the good options, discard the rest. |
| Does the output contain any fact, number, or citation? | Verify each against a real source before it goes out under your name. |
| Is this writing meant to demonstrate my thinking (school, exam)? | AI-generating the substance defeats the purpose — and check the policy. |
| Is my prompt a real brief (role, audience, constraints, example)? | If vague, rewrite it; generic in, generic out. |
| Am I reaching for AI because I don't understand the subject? | Stop — that's the one situation the rule forbids. |
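For readers who think in code, the table above can be read as a small checklist function. The sketch below is one possible encoding, not a definitive one: the `Task` fields and the `advise` function are hypothetical names, and the choice to check the governing rule first and stop immediately is an assumption about ordering that the table itself only implies.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Answers to the table's questions for one writing task (hypothetical)."""
    can_evaluate_output: bool        # Can I tell whether the output is correct?
    idea_is_mine: bool               # Did the idea originate with me?
    contains_facts: bool             # Any fact, number, or citation in the output?
    demonstrates_my_thinking: bool   # School, exam, or similar
    prompt_is_real_brief: bool       # Role, audience, constraints, example


def advise(task: Task) -> list[str]:
    """Return the actions the decision framework suggests for this task."""
    # The governing rule comes first and overrides everything else.
    if not task.can_evaluate_output:
        return ["Stop: do not use AI for this task (the governing rule)."]
    advice = []
    if not task.idea_is_mine:
        advice.append("Write your own draft first, then use the model to revise.")
    if task.contains_facts:
        advice.append("Verify each fact, number, and citation against a real source.")
    if task.demonstrates_my_thinking:
        advice.append("Do not AI-generate the substance; check the policy.")
    if not task.prompt_is_real_brief:
        advice.append("Rewrite the prompt as a brief: role, audience, constraints, example.")
    if not advice:
        advice.append("Green light: use it, pick the good options, discard the rest.")
    return advice
```

The function cannot answer the questions for you. Every boolean is a judgment only the writer can make, which is the point of the framework.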
Spaced Review
A few questions reaching back, to strengthen retention.
1. (From Chapter 12) Chapter 12 taught that revision is the work (a document becomes good in revision, not in the first draft) and gave the editing hierarchy (content → structure → paragraphs → sentences → words → proofreading). This chapter reframes AI as a revision tool. At which levels of the editing hierarchy is an LLM most safely useful, and at which is it most dangerous? How does that map onto §29.4's test, "did the idea originate with you?"
2. (From Chapter 11) Chapter 11 gave three rules for AI and integrity: disclose, verify every fact and citation, and don't present reasoning you can't evaluate as your own. This chapter built its governing rule on the third. Restate the governing rule, and explain why Chapter 11's "verify: accuracy is non-delegable" is a consequence of the model being a plausible-text predictor rather than a separate rule.
3. (From Chapter 1, bridging) Chapter 1's threshold concept was "writing is thinking, not transcription." This chapter's is "AI can draft, but it can't think for you." Show how the second is the first one's logical consequence: why does "writing is thinking" force the conclusion that a tool which does the writing without the thinking hasn't actually helped you write?
Answers
1. An LLM is **most safely useful at the bottom of the editing hierarchy**, at the level of sentences and words (rephrasing, tightening, fixing awkward constructions), because at that level *you* already own the content and the meaning; you wrote the sentence, so you can judge whether the model's rephrase still says what you meant. It's also useful as a *reactor* at the top (flagging where structure is unclear or where a reader might push back), as long as you wrote the draft and only *evaluate* its suggestions. It's **most dangerous at the top when used to *originate***, that is, to generate the content and structure (the *what you say* and *the order you say it in*), because that's exactly the thinking [Chapter 12](../../part-02-building-blocks/chapter-12-editing-and-revision/index.md) and [Chapter 1](../../part-01-writing-is-thinking/chapter-01-why-writing-matters/index.md) said the writer must do. This maps cleanly onto §29.4's test: at the word/sentence level the idea *originated with you* (you're revising your own meaning, which is augmentation), while asking the model to produce the content means the *idea originated with the model* (outsourcing, which is the thinking you skipped). The hierarchy and the test agree: safe where you've already thought, dangerous where you're asking the model to think.
2. The governing rule: **if you cannot evaluate whether the output is correct, you should not use AI for that task.** [Chapter 11](../../part-02-building-blocks/chapter-11-citing-sources/index.md)'s "verify: accuracy is non-delegable" is a *consequence* of the model being a plausible-text predictor, not a separate rule. The model optimizes for *plausible*, not *true* (§29.1), so its output is *not* pre-checked for accuracy; a fabricated citation and a real one are equally plausible continuations and look identical. Therefore the only accuracy check that exists is *yours*, after the fact. "Verify everything" isn't an arbitrary integrity demand; it's the *necessary* response to a tool that produces confident text indifferent to truth. If the model checked facts, verification would be optional; because it structurally can't, verification is the price of use. And "the AI told me" is no defense precisely because the AI was never claiming truth, only plausibility, which you mistook for truth at your own risk.
3. The second threshold concept is the first one's logical consequence by a short, tight argument. [Chapter 1](../../part-01-writing-is-thinking/chapter-01-why-writing-matters/index.md) says that *writing is thinking*: putting ideas into clear sentences is not transcribing a finished thought but the very process by which the thought gets finished and tested (the page forces the logical joints you'd skip in your head). If that's true, then the *act of writing* is the *act of thinking*; they're the same act, not two steps. Now introduce a tool that performs the *writing* (produces the sentences) *without* performing the thinking (it pattern-matches plausible text, holds no meaning, tests no logic). Because writing and thinking are the same act, a tool that does the writing-without-thinking hasn't done a *part* of your writing; it's produced the *artifact* of writing (sentences) while skipping the *function* of writing (thinking). So it can't "help you write" in the sense that matters; it can only help you *skip* writing, which is to skip thinking, which [Chapter 1](../../part-01-writing-is-thinking/chapter-01-why-writing-matters/index.md) said was the whole point. Hence: AI can draft (produce the artifact), but it can't think for you (perform the function). The first threshold makes the second unavoidable.
What's Next
You can now use an AI without letting it use you: you know what it's genuinely good at and genuinely bad at, why the failures follow from how it works, how to keep it in the revision seat instead of the thinking seat, how to brief it like a writer, where the integrity lines fall, and the one rule that governs every case. That closes Part V — the writing that lives in and around software and data. Chapter 30 opens Part VI and turns from the page to the room. You've spent twenty-nine chapters making text clear for a reader; now you'll make slides clear for a listener — and you'll find the principles transfer exactly. The assertion-evidence slide is the inverted pyramid in visual form; one point per slide is "one idea per paragraph" on a screen; and the judgment about what your audience needs to see is the same judgment no tool could supply here. The thinking is still yours. It always was.