Case Study 1: The Fabricated Citation
How a Hallucination Nearly Made It Into a Published Report
Persona: Elena (Management Consultant)
Domain: Organizational Research, Client Deliverable
Error Type: Pure Hallucination — Fabricated Academic Citation
Detection Method: Systematic Citation Verification Workflow
Outcome: Error caught pre-delivery; verification protocol formalized
Background
Elena is a management consultant at a mid-size advisory firm. Her work involves producing research-backed deliverables for corporate clients: change management roadmaps, organizational effectiveness reports, strategic planning frameworks. Her clients are senior executives who read closely and ask hard questions, and her firm's reputation rests on the quality and accuracy of its analytical work.
Elena had been using AI tools for six months when the incident occurred. She used them primarily for research synthesis — asking the tool to identify relevant literature, summarize key findings, and help her build the evidentiary base for client reports. She found the tools valuable, especially for covering literature quickly in fields adjacent to her core expertise.
She had not yet developed a systematic citation verification workflow. She trusted her judgment — if a citation looked legitimate and the source sounded credible, she typically moved forward. She had checked a handful of citations in her first weeks of AI use, found them accurate, and gradually stopped checking as a matter of course.
This is the pattern that nearly cost her a client relationship.
The Incident
Elena was preparing a strategic organizational change report for a financial services client. The report included a literature review section supporting specific change management interventions. She asked an AI tool to provide supporting research, specifically requesting academic citations for claims about change resistance and intervention effectiveness.
The model returned six citations. They were well-formatted — APA style, complete with author names, journal names, volume and issue numbers, page ranges, and years. The authors were real organizational behavior researchers whose names Elena recognized from prior legitimate sources. The journal names were real and appropriate to the field. The titles were plausible: specific enough to sound like actual research, general enough to fit the claims being supported.
Elena included all six in a draft of the literature review section and moved on to the analysis.
The draft went through an internal review before client delivery. Elena's colleague Marcus — a more senior consultant doing a quality pass — noticed something when he tried to pull one of the citations to read the full paper. The DOI she had listed didn't resolve to the paper described. He tried Google Scholar. The paper title returned no results. He searched by author name and found that the researcher in question had published prolifically in organizational behavior — but not this paper.
Marcus flagged it. Elena went back and checked all six citations.
Four were real. She could verify them through Google Scholar, DOI resolution, and the journals' online archives. The papers existed, the page numbers matched, the content was consistent with how she had cited them.
Two were fabrications. The authors were real. The journals were real. The titles were plausible. Nothing else was accurate. The papers did not exist.
Why the Fabrications Were So Convincing
Elena spent some time afterward understanding why she hadn't caught this herself.
The citations were indistinguishable from real ones on visual inspection. They didn't have obvious formatting errors. The author names were people who actually publish in organizational behavior — she had seen their names before in legitimate contexts. The journals were real and appropriate to the field. The formatting was correct in every detail.
The model had learned citation patterns so thoroughly that its fabricated citations had the exact texture of real ones. This is the core of the citation hallucination problem: the dangerous ones look identical to the real ones. You cannot catch them by reading.
She also reflected on her own trust calibration error. She had done spot-checks in early use and found them accurate. From that experience, she had generalized incorrectly — concluding that the citations were reliable rather than that the ones she happened to check were real. She had been building trust on a sample that wasn't representative.
The Verification Workflow She Built
After the incident, Elena developed a citation verification workflow that she now applies to every AI-sourced citation before it enters any client document.
Step 1: DOI Resolution Check
For every citation with a DOI, she goes to doi.org and enters it directly. The DOI must resolve to the correct paper — matching title, author(s), and journal. A DOI that resolves to a different paper is a red flag for a fabricated citation where the model attached a real DOI to invented content.

Step 2: Title Search Verification
She runs the paper title in Google Scholar in quotation marks. A real paper will typically appear immediately. If nothing comes up, or if results show a completely different paper with a similar title, the citation is suspect.

Step 3: Author Verification
She checks that the named author(s) have published in the claimed field. She then confirms the specific paper appears in their publication history, accessible through Google Scholar's author profiles or the author's institutional webpage.

Step 4: Abstract Check
She reads the abstract of the actual paper (or at minimum the full title and keywords) to confirm that the paper's actual content supports the claim she is citing it for. A real paper can still be mischaracterized — the AI's description of what a paper says may not match what the paper actually says.

Step 5: Documentation
For each verified citation, she records in her project notes: the verification method used, the date, and any notes about discrepancies between how AI characterized the paper and what the actual abstract says.
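A workflow like this can also be kept as a lightweight record-keeping helper. The sketch below is illustrative only — Elena's actual protocol is manual, and the class name, field names, and sample DOI here are invented for the example. The DOI syntax check uses the commonly cited pattern of a "10." prefix, a 4–9 digit registrant code, and a suffix.

```python
import re
from dataclasses import dataclass, field
from datetime import date

# Rough DOI syntax check: "10.", a 4-9 digit registrant code, "/", a suffix.
# This only catches malformed strings; a syntactically valid DOI can still
# point nowhere, so Step 1 (actually visiting doi.org) is still required.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$", re.IGNORECASE)


@dataclass
class CitationRecord:
    """One AI-sourced citation awaiting verification (Step 5: documentation)."""
    title: str
    authors: list
    journal: str
    doi: str = ""
    checks_passed: list = field(default_factory=list)
    notes: str = ""
    checked_on: str = ""

    def doi_resolution_url(self) -> str:
        """Step 1: build the doi.org URL to visit; reject malformed DOIs."""
        if not DOI_PATTERN.match(self.doi):
            raise ValueError(f"Malformed DOI: {self.doi!r}")
        return f"https://doi.org/{self.doi}"

    def record_check(self, step: str, note: str = "") -> None:
        """Log a completed verification step with today's date and any notes."""
        self.checks_passed.append(step)
        if note:
            self.notes += f"[{step}] {note}\n"
        self.checked_on = date.today().isoformat()

    def is_verified(self) -> bool:
        """A citation counts as verified only after all four active checks."""
        required = {"doi", "title_search", "author", "abstract"}
        return required.issubset(self.checks_passed)


# Hypothetical usage — the DOI here is a placeholder, not a real paper.
rec = CitationRecord(
    title="Example Paper Title",
    authors=["A. Researcher"],
    journal="Journal of Examples",
    doi="10.1234/example.2020.001",
)
print(rec.doi_resolution_url())
rec.record_check("doi", "resolved to matching title/authors")
```

The point of the structure is the same as Elena's notes: a citation is not "probably fine" — it is either fully checked, with a dated record of each step, or it does not enter the deliverable.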
The entire process for a set of six citations takes approximately fifteen to twenty minutes — faster than she expected once she established the habit. She now blocks time for this as a distinct step in her project workflow: "citation audit" before any draft moves to review.
The Conversation with Marcus
Marcus had caught the error during a routine review, not a verification audit — he happened to want to read one of the papers in full and found the DOI didn't work. After the incident, he and Elena agreed to formalize citation verification as a workflow requirement for all AI-assisted research.
They framed it internally not as a response to distrust of AI tools — they continued using them — but as an extension of professional standards. "We already verify data from client sources," Elena noted. "We cross-check statistics. We don't take our clients' characterizations of research findings as accurate without looking at the actual research. We need to apply the same standard to AI output."
The firm began incorporating citation audit time into project scoping estimates for any deliverable that included AI-assisted literature review.
What Didn't Happen
What makes this case study valuable is not just the error itself but its near-miss structure. The fabricated citations did not make it into the client report. Elena's firm's review process caught them.
But consider the versions where it would not have been caught: a shorter project timeline with no internal review, a solo consultant with no colleague doing a quality pass, a client who reads for conclusions rather than checking citations, a report that gets filed and referenced internally without any reader trying to access the original papers.
All of those are plausible. In professional practice, the review process that caught Elena's error is not universal. Many organizations have no equivalent safeguard for AI-assisted research.
The narrow escape is the point. The gap between "nearly happened" and "didn't happen" was Marcus wanting to read a paper. A systematic verification workflow closes that gap deliberately, rather than relying on accidental discovery.
Lessons
1. Fabricated citations are visually indistinguishable from real ones. You cannot catch them by reading. They require active verification.
2. Successful spot-checks do not establish that all citations are reliable. Trust calibration requires a larger and more systematic sample, not generalization from early successes.
3. Verification is not "checking up on AI" — it is professional standards applied consistently. The same rigor you apply to client-provided data applies to AI-provided citations.
4. Building the workflow once and applying it consistently is far less costly than the alternative. Elena estimates the time cost of her verification protocol at 15-20 minutes per citation set. The time cost of a citation error in a client deliverable — correction, client conversation, reputation management — would be orders of magnitude higher.
5. Near-misses are information. When a system catches an error, the correct response is not "good, the system worked" but "what systematic change ensures I don't rely on that system working next time?"
Related: Chapter 29, Section 3 (High-risk domains: citations), Section 4 (Citation verification), Section 6 (Building a personal hallucination detection protocol)
Continue to Case Study 2: Alex's Viral Number: When a Hallucinated Statistic Gets Shared