Quiz — Chapter 29: Writing with AI

DataField.Dev

Target: 70%+ before moving on. This quiz checks whether you understand what LLMs are good and bad at and why, can tell augmentation from outsourcing, can engineer a prompt, know the integrity lines, and can apply the governing rule. Attempt each question before reading its answer and explanation.

Section 1 — Multiple Choice

1. The single rule that governs all AI use in this chapter is:
   - A) Never use AI for important writing.
   - B) Always disclose when you use AI.
   - C) If you can't evaluate whether the output is correct, you shouldn't use AI for that task.
   - D) AI is fine as long as you fix the grammar.

Answer

**C.** The governing rule (§29.7) is the chapter compressed: the model produces *plausible*, not *true*, text, so the only thing between you and a confident error is your ability to catch it — which requires being able to evaluate the output. A is too absolute (AI is fine where you can evaluate); B is one integrity rule, not the governing one, and isn't always required; D mistakes surface polish for the real issue (truth and judgment).

2. Why does an LLM "hallucinate" (produce confident, fabricated facts)?
   - A) It has bugs that engineers haven't fixed yet.
   - B) It is built to produce plausible text, not true text, so when a plausible-looking fact is wrong, it states it anyway with no way to know.
   - C) It is deliberately deceiving the user.
   - D) It only hallucinates when the prompt is badly written.

Answer

**B.** A hallucination is the model doing its normal job — generating likely continuations — in a case where the plausible answer happens to be false (§29.1, §29.3). A is wrong: it's not a fixable bug but a property of how the models work. C is wrong: deception requires knowing the truth; the model has no store of facts to know it from. D is wrong: good prompts reduce but don't eliminate it — the indifference to truth is structural.
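
A toy sketch may make answer B concrete. It is not how a real model is built: the lookup table and its probabilities are invented for illustration, and `most_plausible` is a hypothetical name. The only point it shows is that picking the likeliest continuation never consults a store of facts.

```python
# Toy illustration only. The probabilities are made up, and a real language
# model is far more complex than a lookup table.
CONTINUATIONS = {
    "The capital of Australia is": {
        "Sydney": 0.55,
        "Canberra": 0.40,
        "Melbourne": 0.05,
    },
}

# A separate fact store that the generator below never reads.
FACTS = {"The capital of Australia is": "Canberra"}


def most_plausible(prompt: str) -> str:
    """Return the highest-scoring continuation; truth is not part of the choice."""
    options = CONTINUATIONS[prompt]
    return max(options, key=options.get)


prompt = "The capital of Australia is"
answer = most_plausible(prompt)
print(f"{prompt} {answer}.")
print("Matches the facts:", answer == FACTS[prompt])
```

The wrong answer comes out in the same well-formed sentence a right one would, which is why the check has to come from the reader.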

3. Which task is the safest use of an LLM, per this chapter?
   - A) Writing the analysis section of a report in a field you don't know.
   - B) Generating a legal argument you can't verify.
   - C) Rephrasing a sentence you wrote, where you can judge whether the rephrase still says what you meant.
   - D) Drawing the conclusions of a study you haven't read.

Answer

**C.** Rephrasing your own sentence is the model at its best: you own the meaning, so you can evaluate every option and pick the right one (§29.2, §29.4, §29.7). A, B, and D all fail the governing rule — each asks the model to supply substance in a situation where you *cannot* evaluate its correctness (unfamiliar field, unverifiable legal claim, unread study). The difference is whether your judgment can catch an error; in C it can, in the others it can't.

4. "Using AI as a revision tool, not a replacement for thinking" most precisely means:
   - A) Only use AI after you've finished proofreading.
   - B) Do the thinking and write the draft yourself first, then use the model to help you improve it.
   - C) Let the AI write the draft, then you revise it.
   - D) Use AI for short documents but not long ones.

Answer

**B.** The reframe (§29.4) is about *order of operations and who thinks*: you generate the substance and the draft, then the model reacts, pressure-tests, and rephrases. C is the *opposite* — it's the outsourcing the chapter warns against (the model supplies the substance). A confuses "revision" with "proofreading" (different levels of [Chapter 12](../../part-02-building-blocks/chapter-12-editing-and-revision/index.md)'s hierarchy); D is irrelevant to the principle (length isn't the variable — who did the thinking is).

5. Which is the highest-leverage lever in a writing prompt?
   - A) Telling the model to "be creative."
   - B) Specifying the audience.
   - C) Using a longer prompt.
   - D) Asking the model to "do its best."

Answer

**B.** Audience is the highest-leverage lever (§29.5) for the reason [Chapter 2](../../part-01-writing-is-thinking/chapter-02-audience/index.md) gave: the same content for a different reader is a different document, so specifying the reader is what most transforms generic output into useful output. A and D are non-instructions that give the model nothing to specialize on (vague in, generic out); C confuses length with specificity — a long vague prompt is still vague.

6. A "few-shot example" in a prompt is:
   - A) A prompt you only use a few times.
   - B) One or two examples of the target style, format, or voice, which steer the output more reliably than describing the style in words.
   - C) A short prompt under a few words.
   - D) A way to make the model respond faster.

Answer

**B.** A few-shot example shows the model what you want by giving it a sample to match — and because matching a concrete pattern is exactly what the model is built to do, it steers output more reliably than any verbal description (§29.5). "Match this register" with an example beats "be professional but friendly." A, C, and D misread the term entirely.

7. "The AI told me" is offered as a defense for a wrong fact in a published document. Per this chapter, this defense is:
   - A) Valid, since the AI produced the error.
   - B) Valid in the workplace but not in academia.
   - C) Not a defense: accuracy is non-delegable; you're accountable for everything under your name regardless of source.
   - D) Valid only if you disclosed using AI.

Answer

**C.** Verification is non-delegable (§29.6): you are responsible for every fact, number, and citation in writing that goes out under your name, whatever produced it. Lawyers have been sanctioned for filing AI-hallucinated citations they didn't check — "the AI told me" protected no one. A, B, and D all imagine the responsibility can shift to the tool or be waived by disclosure; it can't. Disclosure is a separate obligation and doesn't excuse an unverified falsehood.

8. What does an LLM fundamentally lack that makes it unable to write your specific document well?
   - A) Enough training data.
   - B) A fast enough processor.
   - C) Institutional/contextual knowledge: the specifics of your situation, project, and audience that live nowhere in its training.
   - D) The right prompt.

Answer

**C.** Institutional knowledge (§29.3) is the failure no bigger model fixes, because it's about *access*, not capability: the model wasn't in your meetings, didn't read your Slack, doesn't know your VP or your users. D is half-true (a good prompt can *supply* some context) but the deep point stands — there's always context only you possess. A and B misdiagnose it as a scale or speed problem; it's a fundamental gap between public training data and your private situation.

9. Why is "fluent" a dangerous thing to trust in AI output?
   - A) Fluent text is always wrong.
   - B) Fluency is exactly what the model optimizes for and is unrelated to whether the content is true, so a fabricated fact reads as smoothly as a real one (the curse of the plausible).
   - C) Fluent text is harder to read.
   - D) Fluency means the model worked too hard.

Answer

**B.** The model optimizes for plausible, fluent text; fluency tells you nothing about truth, so a hallucination is delivered in exactly the same confident, well-formed prose as a correct statement — there's no surface tell (§29.1, §29.3). "The curse of the plausible": the better it sounds, the harder the wrong fact is to catch. A overstates (fluent text is often right); C and D are nonsense.

10. In which context is AI use most strictly limited, and why?
    - A) Internal work memos, because companies are cautious.
    - B) Academic coursework, because the purpose of the writing is to demonstrate your own thinking and learning, which AI-generated substance defeats.
    - C) Personal journaling, because it's private.
    - D) Marketing copy, because it must be original.

Answer

**B.** Academic writing is strictest (§29.6) because its *purpose* is to demonstrate and develop your thinking — so AI-generating the substance doesn't just risk a policy violation, it defeats the point of the assignment (you didn't learn the thing the writing was meant to teach). Most workplace writing (A) tolerates AI assistance bounded by verification. C and D aren't where the strictest lines fall.

11. A model produces an analysis in a domain you don't understand, and it "sounds insightful." The chapter says you should:
    - A) Use it: if it sounds insightful, it probably is.
    - B) Not present it as your own reasoning, because you can't evaluate whether the "insight" is real or a plausible-sounding error.
    - C) Use it but add a disclosure.
    - D) Use half of it.

Answer

**B.** This is the governing rule and the third integrity rule together (§29.6, §29.7): if you can't independently judge the reasoning sound, you must not pass it off as your own — you can't tell a genuine insight from a fluent wrong one, and you'd be vouching for what you don't understand. A is the exact trap (sounding insightful is not being correct). C and D don't fix the core problem: you still can't evaluate it, so disclosure or partial use doesn't make un-judged reasoning safe.

12. "Will AI replace writers?" The chapter's answer is best summarized as:
    - A) Yes: fluent text generation makes human writers obsolete.
    - B) No: AI can replace the typing (producing plausible prose) but not the parts that are writing: having the idea, getting facts right, knowing your specific reader.
    - C) Yes, but only for technical writing.
    - D) No: AI can't produce grammatical text.

Answer

**B.** AI raises the floor on fluency and leaves the ceiling — judgment — to humans (§29.3, FAQ). It can replace the mechanical production of generic prose (never the valuable part); it can't do the thinking, the fact-checking, or the audience-knowing that constitute writing. A overstates; D is factually wrong (the model is *very* good at grammar — that's the point); C arbitrarily restricts it.

Section 2 — True/False with Justification

State true or false and justify in one sentence.

T1. If an AI's output is grammatical and well-organized, it's probably accurate.

Answer

**False.** Fluency is what the model optimizes for and is independent of truth — a fabricated fact reads exactly as smoothly as a real one (the curse of the plausible, §29.1/§29.3), so good form is no evidence of accuracy.

T2. Using AI to brainstorm ten title options for an article you wrote is outsourcing your thinking.

Answer

**False.** It's augmentation — you wrote the article (did the thinking) and you judge which title fits; brainstorming options you select from is one of the §29.2 safe uses, because the substance and the judgment stay yours.

T3. Prompt engineering is a brand-new technical skill mostly unrelated to the writing skills in this book.

Answer

**False.** A good prompt is a good *brief* — role, audience, constraints, examples — which is mostly Chapters 2, 4, and 7 aimed at a machine; the genuinely AI-specific parts (verify, context window, examples-beat-descriptions) are thin (§29.5).

T4. A summary produced by an AI is always a safe substitute for reading the source.

Answer

**False.** A summary can silently drop the one exception that mattered, invert a hedge into a certainty, or smooth away the key caveat (§29.2) — it's a starting orientation, safe only for sources you'd verify if they mattered, never a substitute when the source matters.

T5. The more you understand a subject, the more dangerous it is to use AI for it.

Answer

**False** — it's the reverse: the *less* you understand a subject, the more dangerous AI is, because you can't catch its errors (§29.7); high expertise makes AI *safer* because you're the verification layer it lacks.

T6. Disclosing that you used AI excuses you from verifying the facts it produced.

Answer

**False.** Disclosure and verification are *separate* obligations (§29.6); disclosure tells readers AI was involved, but you remain fully accountable for the accuracy of everything under your name — "I disclosed it" is no more a defense than "the AI told me."

Section 3 — Short Answer

Two to four sentences each.

S1. Explain, using §29.1, why "the model lied to me" is the wrong way to describe an AI's factual error, and why the distinction changes how you use the tool.

Model answer + rubric

"Lied" implies the model knew the truth and chose against it, but the model has no store of facts to know it from — it produces *plausible* text, and a plausible-looking fact is sometimes false. It isn't deceiving you; it's doing its only job (predicting likely text) in a case where the likely answer is wrong. The distinction changes everything about use: if you think errors are "lies," you expect them to be rare and detectable as insincerity, and you trust the model when it "seems sincere" (it always does); if you understand it's *indifferent to truth by construction*, you verify every fact, because the confident wrong statement is the system working normally, not malfunctioning. **Rubric:** distinguishes "lying" (knows truth, chooses against) from plausible-text generation (1); states the model has no truth-store/is indifferent to truth (1); draws the use consequence — verify everything because sincerity ≠ accuracy (1).

S2. The chapter's five AI weaknesses all trace to one root. State the root, and show how two of the weaknesses follow from it.

Model answer + rubric

The root: the model produces *plausible* text without holding a meaning or checking a world. Hallucination follows directly — it's plausible-text-that's-false (a real-looking citation is a likely continuation whether or not the paper exists). The original-thinking failure follows too — asked to "analyze," the model produces analysis-*shaped* text (the average, plausible take) without performing an actual analysis, because it isn't reasoning, just predicting what an analysis would sound like. (Other valid pairs: miscalibrated hedging = plausible-register-that-misstates-certainty; institutional-knowledge gap = plausible-text standing in for context it lacks; audience failure = plausible-generic where specific was needed.) **Rubric:** states the root correctly (1); derives two distinct weaknesses *from* the root, not just lists them (1 each, cap 2).

S3. A coworker shows you a prompt — "write a blog post about our product" — and complains the output is generic. Diagnose the problem and give them the four levers that would fix it.

Model answer + rubric

The prompt is vague, so the model produced the *average* of all "blog post about a product" — generic in, generic out (§29.5). It specified nothing for the model to specialize on. The four levers: **role** (who's writing — a product expert, a developer advocate); **audience** (the highest-leverage — *who* reads this and what they care about); **constraints** (length, tone, what to include/exclude, no marketing fluff); and an **example** (a sample of the voice/format to match — more powerful than describing it). The deeper fix: a good prompt is a good brief, and writing one requires the coworker to actually *think* about audience and purpose. **Rubric:** diagnoses vagueness → generic/average output (1); names all four levers (1); notes audience as highest-leverage or that a good prompt requires thinking (1).
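
The four levers can also be written down as a small data structure, which makes it obvious when one has been left empty. This is a minimal sketch, not a prescribed tool: the `Brief` class, its field names, and every specific in the example (the role, the audience, the 600-word limit) are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Brief:
    """A writing prompt built from the four levers in §29.5 (hypothetical helper)."""

    task: str
    role: str = ""
    audience: str = ""
    constraints: list[str] = field(default_factory=list)
    example: str = ""

    def missing_levers(self) -> list[str]:
        """Name every lever that was left empty."""
        levers = {
            "role": self.role,
            "audience": self.audience,
            "constraints": self.constraints,
            "example": self.example,
        }
        return [name for name, value in levers.items() if not value]

    def render(self) -> str:
        """Assemble the prompt text, skipping any lever that was left empty."""
        parts = []
        if self.role:
            parts.append(f"You are {self.role}.")
        if self.audience:
            parts.append(f"Audience: {self.audience}")
        parts.append(f"Task: {self.task}")
        if self.constraints:
            parts.append("Constraints:\n" + "\n".join(f"- {c}" for c in self.constraints))
        if self.example:
            parts.append(f"Match the voice of this sample:\n{self.example}")
        return "\n\n".join(parts)


# The coworker's prompt: a task and nothing else.
vague = Brief(task="Write a blog post about our product.")
print(vague.missing_levers())

# The same task with all four levers filled in (all specifics are invented).
specific = Brief(
    task="Write a blog post about our product.",
    role="a developer advocate who uses the product daily",
    audience="engineers deciding whether to trial it; they care about setup time",
    constraints=["under 600 words", "no marketing fluff", "lead with one concrete use case"],
    example="[paste 2-3 sentences from a post whose voice worked]",
)
print(specific.render())
```

The code only assembles text. Deciding who the audience is and what they care about is still the thinking the coworker has to do before any field can be filled in.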

S4. Distinguish "AI-assisted" from "AI-generated" writing, and explain why the distinction is the hinge of the integrity question.

Model answer + rubric

AI-*assisted* writing is yours — you thought it through, wrote it, the model helped you revise, and you stand behind every claim. AI-*generated* writing is the model's — it produced the substance and you passed it along as your own. The distinction is the integrity hinge because the line everything turns on is *who did the thinking*: assistance keeps the thinking and substance human (a legitimate tool use, like a thesaurus), while generation means presenting as your own work and reasoning something that is neither — and may include conclusions you can't even evaluate (§29.6). **Rubric:** defines both correctly (1); identifies "who did the thinking" as the line (1); connects to why generation is the integrity problem — presenting non-yours as yours (1).

S5. Show how this chapter's threshold concept ("AI can draft, but it can't think for you") is a logical consequence of Chapter 1's ("writing is thinking, not transcription").

Model answer + rubric

[Chapter 1](../../part-01-writing-is-thinking/chapter-01-why-writing-matters/index.md) establishes that writing and thinking are the *same act* — putting ideas in sentences is how the thought gets finished and tested, not a recording of a finished thought. If writing *is* thinking, then a tool that does the writing (produces sentences) without the thinking (it pattern-matches plausible text, holds no meaning) hasn't done a *part* of your writing — it's produced the *artifact* of writing while skipping its *function*. So it can't help you write in the sense that matters; it can only help you *skip* writing, which is to skip thinking — the whole point. Hence AI can draft (the artifact) but not think for you (the function): the first threshold makes the second unavoidable. **Rubric:** states writing-and-thinking-as-same-act from [Ch 1](../../part-01-writing-is-thinking/chapter-01-why-writing-matters/index.md) (1); distinguishes artifact (sentences) from function (thinking) (1); concludes the model produces the artifact without the function, so can't replace the thinking (1).

Section 4 — Applied Scenario

Short writing/judgment tasks, graded by rubric.

A1. Improve the weak prompt. Here is a prompt that will produce generic mush:

"Write a summary of our quarterly results for the team."

Rewrite it as a real brief using all four levers (§29.5). Invent plausible specifics: revenue up 12%, one product line down, the team is the 8-person sales group, the tone should be honest and motivating without spin, under 150 words, and you have a prior update whose voice worked.

Model answer + rubric

A strong rewrite, for example:

> *"You're the sales team lead writing a quarterly-results update for your 8-person sales group, people who know the numbers and dislike spin. Purpose: report that overall revenue is up 12%, but the [Product X] line is down 8%, and frame the quarter honestly while keeping the team motivated for next quarter. Tone: direct, honest, no corporate cheerleading. Length: under 150 words, scannable. Match the voice of this prior update that landed well: '[paste 2–3 sentences of the prior update].' Lead with the headline number, name the soft spot plainly, end with one concrete focus for next quarter."*

**Rubric (5 points):**

- **Role** specified (sales team lead): **1**
- **Audience** specified with a relevant fact (8-person team that dislikes spin): **1**
- **Constraints** specified (the two real numbers, honest-not-spin tone, <150 words, what to lead with): **1**
- **Example** of voice included (or a clear placeholder for one): **1**
- The prompt encodes *judgment the model lacks* (which number to lead with, the no-spin requirement) rather than leaving it generic: **1**

4–5 = excellent (a near-usable brief). 2–3 = re-read §29.5; you're missing a lever or still leaving the model too much room to be generic. 0–1 = the prompt is still vague; generic in, generic out.

A2. Fact-check the fluent-but-wrong AI draft. An AI produced this paragraph for a report. Mark every claim you would refuse to publish without verifying, state why each is suspect (use chapter vocabulary), and rewrite the paragraph keeping only what's safe (noting what you'd check).

"The term 'technical writing' was first coined in 1953 by Dr. Albert Reese of Carnegie Mellon, who founded the field's first academic program. Today, the global technical-writing market is worth exactly $47.3 billion, growing at 14% annually, and a 2022 Gartner report confirmed that 89% of Fortune 500 companies employ dedicated technical writers."

Model answer + rubric

**Claims to flag (all are the texture of hallucination: confident, specific, attributed):**

- *"first coined in 1953 by Dr. Albert Reese of Carnegie Mellon, who founded the field's first academic program"*: a named person + named institution + specific year + a "first" claim, the classic fabrication signature (§29.3). Highly suspect; verify against real history (the precise origin story of the term is contested, and a too-tidy single-origin claim is itself a warning). Do not publish without a real source.
- *"worth exactly $47.3 billion"*: "exactly" plus a precise market figure with no source; the curse of the plausible (§29.1). Verify or cut.
- *"growing at 14% annually"*: clean round-ish statistic, unsourced. Verify or cut.
- *"a 2022 Gartner report confirmed that 89% of Fortune 500 companies…"*: named firm + year + suspiciously precise statistic ("89%"), textbook hallucination texture; the named source makes it *more* suspect, not less, because the model formats fabricated sources flawlessly. Verify by finding the actual report (it may not exist as described).

**Safe rewrite (keeps only the uncontroversial; flags the rest):**

> *"Technical writing is a well-established professional field with a recognized academic and industry presence. [If reporting market size, employment rates, or the field's origin, cite verified figures: every specific statistic and named study in the source draft must be confirmed against the actual publication before use; several have the hallmarks of fabrication and may not exist.]"*

**Rubric (6 points):**

- Flags the Reese/1953/Carnegie Mellon origin claim as a likely fabrication: **1**
- Flags "exactly $47.3 billion" (and the "exactly" tell): **1**
- Flags the 14% growth statistic: **1**
- Flags the Gartner/89%/Fortune 500 claim, noting the named source is camouflage not evidence: **1**
- Uses chapter vocabulary (hallucination, curse of the plausible, plausible-not-true, named-source-as-camouflage): **1**
- Rewrite keeps only verifiable/uncontroversial content and explicitly notes what must be checked rather than silently dropping or keeping the suspect claims: **1**

5–6 = excellent (you read it as a fact-checker and caught the camouflage). 3–4 = you caught some but trusted at least one confident specific; re-read §29.3. 0–2 = re-read §29.1 and §29.3; the fluency fooled you, which is exactly what it's built to do.
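
Part of the flagging pass in A2 can be mechanized as a first sweep. The sketch below is a set of hypothetical regex heuristics chosen for this one paragraph (the pattern names and `flag_claims` are not from the chapter). It only lists specifics worth checking; it cannot confirm or refute any of them, and a claim that matches no pattern can still be false.

```python
import re

# Hypothetical heuristics for this exercise. They surface claims to verify;
# they cannot tell a true claim from a fabricated one.
SUSPECT_PATTERNS = {
    "year": r"\b(?:19|20)\d{2}\b",
    "percentage": r"\b\d+(?:\.\d+)?%",
    "dollar figure": r"\$\d+(?:\.\d+)?(?:\s(?:million|billion|trillion))?",
    "certainty word": r"\b(?:first|exactly|confirmed)\b",
    "titled person": r"\bDr\.\s[A-Z][a-z]+(?:\s[A-Z][a-z]+)*",
}


def flag_claims(text: str) -> dict[str, list[str]]:
    """Return every match for each pattern, keyed by the kind of claim."""
    found = {}
    for kind, pattern in SUSPECT_PATTERNS.items():
        matches = re.findall(pattern, text)
        if matches:
            found[kind] = matches
    return found


# The AI-produced paragraph from A2.
DRAFT = (
    "The term 'technical writing' was first coined in 1953 by Dr. Albert Reese "
    "of Carnegie Mellon, who founded the field's first academic program. Today, "
    "the global technical-writing market is worth exactly $47.3 billion, growing "
    "at 14% annually, and a 2022 Gartner report confirmed that 89% of Fortune 500 "
    "companies employ dedicated technical writers."
)

for kind, matches in flag_claims(DRAFT).items():
    print(f"{kind}: verify {matches}")
```

Note what the patterns miss: the named institutions and the named research firm are not flagged at all, so the sweep is a prompt for the human fact-check, not a replacement for it.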

Scoring & Next Steps

| Score | What it means | Do this |
|---|---|---|
| < 50% | The plausible-vs-true distinction isn't solid yet. | Re-read §29.1 and §29.3, then redo exercises Part A (analyze AI output). The whole chapter rests on understanding that fluent ≠ true. |
| 50–70% | You've got the core idea but miss it on harder cases (integrity lines, the governing rule, prompt craft). | Redo exercises Parts B and M (improve a prompt; fact-check a draft; mixed cases). Focus on the augmentation/outsourcing test and the evaluation rule. |
| 70–85% | Solid. You can use AI well and catch its failures. | Proceed to Chapter 30 (slide design) in Part VI, where the thinking is still yours and the principles transfer to the room. |
| > 85% | Strong command. | Try exercises Part E (the hallucination hunt, the voice experiment, draft a real AI policy) and read a piece of AI-assisted writing in the wild as a critic: can you spot the default register and the unverified claims? |
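
The score bands share their endpoints (70% appears in two rows), so a lookup has to pick a side. The sketch below assumes that exactly 70% counts as passing, following the "70%+" target at the top of the quiz, and that exactly 85% stays in the 70–85% band because the top row says "> 85%". `next_step` is a hypothetical helper and the returned strings are shortened from the table.

```python
def next_step(score_percent: float) -> str:
    """Map a quiz score (0-100) to the recommended next step.

    Boundary handling is an assumption: 50 and 70 fall into the higher band,
    and 85 stays in the 70-85% band.
    """
    if not 0 <= score_percent <= 100:
        raise ValueError("score_percent must be between 0 and 100")
    if score_percent < 50:
        return "Re-read §29.1 and §29.3, then redo exercises Part A."
    if score_percent < 70:
        return "Redo exercises Parts B and M."
    if score_percent <= 85:
        return "Proceed to Chapter 30."
    return "Try exercises Part E."


for score in (42, 50, 70, 85, 92):
    print(score, "->", next_step(score))
```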

One-line self-test before moving on: the next time you reach for an AI, can you answer "can I evaluate whether this output is correct?" before you prompt? If you can't answer it, you've found the habit this chapter exists to build.