Chapter 3: Exercises — The Right Mental Models for AI Collaboration
These exercises help you identify the mental models you currently hold, test them against your experience, and deliberately update them toward more productive frames. Some require reflection only; others ask you to run practical experiments with an AI tool.
Section A: Identifying Your Current Mental Models
Exercise 1: The Model Inventory
Before reading any further, write down honest answers to the following questions. Do not edit toward what you think the "right" answer is — the goal is to surface your actual mental models, not the models you want to have.
- When you type a message to an AI tool, what do you picture happening on the other end? What is actually responding to you?
- When the AI gives you a wrong answer, what is your first instinct about why it happened?
- When the AI gives you a great answer, what do you attribute the quality to?
- If you could get the same AI output by reading the best blog post on the topic, would you feel the AI had failed? Why or why not?
- Do you feel any social hesitation about criticizing AI output directly ("This is wrong") versus hedging it ("This is great, but could we maybe try...?")?
Review your answers against the six broken mental models. Which one or two does your self-description most resemble? Be honest — most people recognize themselves in at least one.
Exercise 2: The Behavior Audit
Look back at your last five significant AI interactions (conversations where you were working on something real, not just testing). For each one, answer:
- How long was your initial prompt? Did it include context, constraints, and purpose — or was it brief and query-like?
- Did you accept the first response, or did you iterate?
- If you iterated, what did you change — the instruction wording, or the context?
- Did you verify any factual claims in the output?
- Did you re-use context from one session in a subsequent session, or start fresh each time?
What pattern emerges? Which mental models are reflected in your actual behavior, as distinct from the models you believe you hold?
Exercise 3: The Model-to-Behavior Mapping
For each broken mental model, write down one specific behavior it would cause in a user who holds it. For example: "A person with the oracle model would never double-check a factual claim in AI output."
Then check your own behavior against each one. Mark each model as: (a) this explains behavior I recognize in myself, (b) this explains behavior I used to have but have changed, or (c) I do not recognize this in my practice.
Section B: Testing Productive Models
Exercise 4: The Brilliant Intern Onboarding
Choose a task you have been doing with AI tools that has felt inconsistent — sometimes good, sometimes off. Apply the brilliant intern model to it:
Write the "onboarding brief" that a brilliant new intern would need to do this task well for you. Include: the purpose of the task, who the audience is, what "good" looks like, what mistakes to avoid, and any relevant background or constraints. Keep it to 300–500 words.
Now use that brief as the opening of your next session on this task. Compare the output quality to what you were getting before. What changed?
Exercise 5: First Draft Engagement Practice
Take an AI-generated output — a document, a piece of code, a recommendation — that you would normally either accept or discard. Instead of making a binary choice, engage with it as a first draft.
Mark it up with three types of annotations:
- "Keep as is" — parts that are accurate and useful
- "Revise" — parts that are on the right track but need adjustment
- "Remove or replace" — parts that are wrong, off-target, or counterproductive
Then write a revision prompt that incorporates your specific markup. Compare the revised output to the original. How much did the engagement process improve quality? How much did it improve your own clarity about what you actually needed?
Exercise 6: The Thinking Partner Prompt Set
Design a set of five "thinking partner prompts" for a decision or problem you are currently working on. These should not ask the AI to produce something — they should ask it to help you think. Examples:
- "Here is my current reasoning: [reasoning]. What assumptions am I making that I should examine?"
- "What are the strongest arguments against this approach?"
- "What would I need to believe for this recommendation to be wrong?"
- "What are three alternative approaches I have not considered?"
- "What are the most likely failure modes of this plan?"
Run all five prompts and assess: did any of them surface considerations you had not thought of? What does this tell you about the value of the thinking partner use case compared to pure content generation?
Exercise 7: The Pattern Matcher Self-Assessment
List ten tasks for which you have used, or might use, AI in your work. For each one, assess the "pattern match score" — your estimate of how well-represented this type of task is in the model's training data. Use a simple scale:
- High: extremely common task type with clear, widely available examples (writing a professional email, explaining a common concept, generating code in a major framework)
- Medium: moderately common, but with significant local or domain-specific variation (analyzing industry-specific data, generating content in a specialized field)
- Low: unusual task type, requires specific local knowledge, or depends on recent information not likely in training data
For your "Low" tasks, what practices would you adopt to compensate for weak pattern matching? (Providing more examples? Verifying more thoroughly? Using a different tool?)
Exercise 8: The Amplifier Test
The amplifier model claims that input quality drives output quality. Test this directly:
Take a single task and create three versions of the prompt, varying only the quality of the input:
- Version A: A minimal prompt (one or two sentences describing what you want)
- Version B: A moderate prompt (adds purpose, audience, and a few key constraints)
- Version C: A rich prompt (adds detailed context, an example of good output, explicit criteria, and any relevant history)
Compare the three outputs on a scale you design yourself. Is the quality improvement from A to C proportional to the effort invested in C? At what level of prompt richness does the improvement plateau?
Section C: The Model Diagnostic in Practice
Exercise 9: Retroactive Diagnostic
Think of a specific time AI output surprised you — either positively or negatively. Apply the Model Diagnostic:
- Describe the specific expectation and the specific outcome
- Identify which mental model generated the expectation
- Identify the most likely mechanical explanation for the gap (training cutoff? missing context? pattern mismatch? context window limits?)
- Write the model update in one or two sentences
Share the update with someone else who uses AI tools. Does it resonate with their experience?
Exercise 10: Live Diagnostic
For your next ten AI interactions, keep a brief log. For each interaction, note:
- What you expected
- What you got
- Whether there was a gap
- If there was a gap, what model generated the expectation
At the end of ten interactions, review the log. What are the most common sources of unexpected outcomes? What model or models are producing the most inaccurate expectations?
Exercise 11: The Model Update Statement
Based on your work in Exercises 9 and 10, write a "current mental model statement" — a two to three paragraph description of how you currently understand AI tools that reflects the models you have been testing and refining. Include:
- What you believe AI tools are good at (with specifics)
- What you believe they are not good at (with specifics)
- What you believe your role in the collaboration is
- What you believe determines output quality
Return to this statement in thirty days and note what you would change.
Section D: Applying Models Across Personas
Exercise 12: Alex's Challenge
You are Alex, a marketing manager trying to produce campaign content that sounds distinctively like your brand rather than generic marketing copy. You have been using the AI as a replacement model, submitting briefs and accepting drafts. The output is consistently "fine but generic."
Apply each of the five productive mental models to this challenge. For each model, write:
- How does this model frame the problem?
- What specific prompting or workflow change does this model suggest?
- What would success look like if you applied this model?
Which model (or combination) seems most likely to resolve the generic content problem?
Exercise 13: Elena's Context Document
Elena's brilliant intern model requires her to maintain a project context document. Design a template for such a document that would work for your own projects or work context. Include all the sections you would need to brief an excellent but brand-new collaborator on the task or project. Keep the template to under 600 words — it needs to fit within a reasonable portion of a context window.
Test the template on a current project by filling it in and using it in your next AI session. What did you leave out that you should have included? What did you include that was not useful?
Exercise 14: Raj's Confidence Framework
Raj developed a framework for calibrating trust in AI output based on pattern match score. Design your own version of this framework for your domain or workflow. Identify:
- The task types in your work where you would assign high confidence to AI output (strong pattern match, low stakes of error)
- The task types where you would assign medium confidence (moderate match, important to verify)
- The task types where you would assign low confidence (weak match, time-sensitive, high stakes)
For each category, describe what your verification and review practice should look like.
Reflection Exercises
Exercise 15: The 30-Day Model Review
At the start of each week for the next four weeks, take ten minutes to review your mental models against the previous week's experience. For each week, answer:
- Did my current mental models generate accurate predictions about AI behavior this week?
- What was the largest gap between my expectation and the outcome?
- What update, if any, should I make to my models based on this week's experience?
At the end of four weeks, write a summary of how your models have changed and what drove the changes.
Exercise 16: Teach the Models
The best test of whether you have internalized a concept is whether you can teach it. Choose one of the five productive mental models and design a five-minute explanation you could give to a colleague who has never heard of the model. Include:
- What the model is and how to think about it
- Why it is more accurate than the broken model it replaces
- One concrete example from your own domain that illustrates the difference
- One specific practice change that follows from the model
Deliver the explanation to an actual colleague and ask for questions. What questions reveal gaps in your own understanding?
Exercise 17: The Institutional Mental Model Audit
Consider the mental models about AI held by the people around you — your team, your organization, or your professional community. Without attributing views to specific people:
- Which broken models do you observe most commonly in your context?
- What consequences are those models producing (poor AI adoption, over-reliance, frustration, under-use)?
- What would need to change for those models to update — what experiences, information, or conversations might shift them?
This exercise is not about criticizing others. It is about understanding the social and organizational context in which you are building your own practice.