Chapter 9: Key Takeaways
Context Management and Conversation Design
- Context windows are finite and uneven. AI models pay the most attention to the beginning (primacy) and end (recency) of the context window. Information in the middle receives less attention, a phenomenon known as the "Lost in the Middle" effect. Structure your conversations accordingly.
- Your conversation is a data structure. Treat it as an append-only, ordered sequence with fixed capacity. Every message---both yours and the AI's---consumes tokens. Design your conversation deliberately rather than letting it evolve haphazardly.
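The append-only, fixed-capacity model can be sketched in a few lines. This is an illustrative sketch, not any tool's real API: the class, the `estimate_tokens` heuristic (roughly 4 characters per token for English text), and the capacity figure are all assumptions.

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic: ~4 characters per token for English text."""
    return max(1, len(text) // 4)


class Conversation:
    """Append-only message log with a fixed token capacity."""

    def __init__(self, capacity_tokens: int = 128_000):
        self.capacity = capacity_tokens
        self.messages: list[tuple[str, str]] = []  # (role, content) pairs

    def tokens_used(self) -> int:
        return sum(estimate_tokens(content) for _, content in self.messages)

    def append(self, role: str, content: str) -> None:
        needed = estimate_tokens(content)
        if self.tokens_used() + needed > self.capacity:
            # Capacity is fixed: the remedy is to summarize and start
            # fresh, not to keep appending.
            raise OverflowError("context window exhausted; summarize and start fresh")
        self.messages.append((role, content))


conv = Conversation(capacity_tokens=1000)
conv.append("user", "Build the login endpoint." * 10)
print(conv.tokens_used())
```

Note that there is no `delete` method: like a real context window, the structure only grows until it is replaced.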
- Front-load critical context. Your first message is prime real estate. Use it to establish the project context, tech stack, coding conventions, architectural decisions, and key constraints. This investment saves thousands of tokens over the life of the conversation.
- Use the sandwich pattern for important requests. Place critical constraints at both the top and bottom of your message, with detailed code and specifications in the middle. This exploits both the primacy and recency effects to maximize attention on your requirements.
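The sandwich pattern can be generated mechanically. The helper below is a minimal sketch under assumed names (`sandwich_prompt`, the bracketing wording); the key point is that the same constraints appear verbatim at both the top and the bottom of the prompt.

```python
def sandwich_prompt(constraints: list[str], body: str) -> str:
    """Place constraints at the top and bottom, bulky detail in the middle."""
    rules = "\n".join(f"- {c}" for c in constraints)
    return (
        "Constraints (must hold):\n" + rules + "\n\n"
        + body + "\n\n"
        + "Before answering, re-check the constraints above:\n" + rules
    )


prompt = sandwich_prompt(
    ["Python 3.12 only", "No new dependencies"],
    "Refactor the payment module below...\n<code omitted>",
)
print(prompt)
```

The repetition costs a few dozen tokens but lands the constraints in both high-attention positions.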
- Choose the right multi-turn pattern for each task. Use progressive disclosure for layered features, scaffold-then-fill for architecture-first development, test-first for well-specified requirements, review-and-refine for exploratory work, and parallel exploration for comparing approaches.
- Invest in context priming. Role priming, codebase context priming, anti-pattern priming, and output format priming all measurably improve AI output quality. Create reusable priming templates for your most common coding tasks.
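A reusable priming template can be as simple as a dictionary whose sections mirror the four priming types above. The section contents here are hypothetical examples for an imagined Python service codebase; swap in your own stack and conventions.

```python
# Illustrative priming template; every value below is a placeholder.
PRIMING_TEMPLATE = {
    "role": "You are a senior backend engineer reviewing Python services.",
    "codebase": "FastAPI + SQLAlchemy monorepo; services live under src/.",
    "anti_patterns": "Avoid global state, bare except clauses, and print debugging.",
    "output_format": "Return a unified diff only, no prose.",
}


def build_priming(template: dict[str, str]) -> str:
    """Assemble labeled priming sections into one opening message."""
    return "\n\n".join(f"[{k}] {v}" for k, v in template.items())


print(build_priming(PRIMING_TEMPLATE))
```

Keeping the template in version control means every teammate primes conversations the same way.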
- Conversations degrade after 15-20 turns. Watch for degradation signals: the AI forgets constraints, contradicts itself, duplicates logic, or declines in quality. These are signals to summarize and start fresh, not signals to push through.
- Master the Fresh Start Protocol. When starting a new conversation, carry forward a concise summary plus the final code artifacts---not the entire conversation history. A fresh conversation with well-curated context outperforms a degraded conversation every time.
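The carried-forward message can be assembled from three parts: a summary, the decisions made so far, and the final artifacts. The function below is a sketch; its name, field structure, and the example content are all illustrative assumptions, not a prescribed protocol format.

```python
def fresh_start_message(summary: str, decisions: list[str],
                        artifacts: dict[str, str]) -> str:
    """Build the opening message for a fresh conversation:
    concise summary + decisions + final code, not the full transcript."""
    parts = ["Continuing prior work. Summary:", summary, "", "Decisions so far:"]
    parts += [f"- {d}" for d in decisions]
    for path, code in artifacts.items():
        parts += ["", f"Current {path}:", code]
    return "\n".join(parts)


# Hypothetical example content:
msg = fresh_start_message(
    summary="Auth service: login and refresh endpoints done; logout pending.",
    decisions=["JWT with 15-min expiry", "Argon2 for password hashing"],
    artifacts={"auth/tokens.py": "def issue_token(user_id): ..."},
)
print(msg)
```

A message like this is typically a few hundred tokens, versus tens of thousands for a full transcript.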
- Choose the right file context strategy. Provide the relevant subset of files, use interface-only context for dependencies, include contextual snippets rather than full files when possible, use tree-and-summary for large codebases, and provide diffs for debugging.
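The tree-and-summary strategy, for instance, replaces full file contents with a path plus a one-line summary per file. This sketch assumes you maintain the summaries yourself; the paths and descriptions are hypothetical.

```python
def tree_and_summary(files: dict[str, str]) -> str:
    """files maps path -> one-line summary; emit a compact codebase map."""
    return "\n".join(f"{path}  # {summary}" for path, summary in sorted(files.items()))


context = tree_and_summary({
    "src/api/routes.py": "HTTP endpoints, thin handlers only",
    "src/db/models.py": "SQLAlchemy models for users and orders",
    "src/services/auth.py": "login, token refresh, password hashing",
})
print(context)
```

Three lines of context here stand in for what might be thousands of lines of source, letting the AI ask for specific files only when it needs them.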
- Budget your context before you start. Estimate your token needs for file context, conversation turns, and response generation. Plan for 70% utilization, leaving 30% headroom for unexpected complexity.
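The 70% rule reduces to simple arithmetic. The category breakdown below (file context plus a per-turn estimate) is an illustrative assumption, not a fixed formula:

```python
def plan_budget(window_tokens: int, file_context: int, turns: int,
                tokens_per_turn: int) -> dict[str, int]:
    """Check planned usage against 70% of the context window."""
    usable = int(window_tokens * 0.70)          # leave 30% headroom
    planned = file_context + turns * tokens_per_turn
    return {"usable": usable, "planned": planned, "slack": usable - planned}


# Hypothetical numbers: a 200k window, 40k of file context,
# 15 turns at ~4k tokens each (prompt + response).
budget = plan_budget(window_tokens=200_000, file_context=40_000,
                     turns=15, tokens_per_turn=4_000)
print(budget)
```

If `slack` comes out negative, trim the file context or plan a session boundary before you start, not after the conversation degrades.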
- Use project-level configuration files. Tools like Claude Code's `CLAUDE.md` and Cursor's `.cursorrules` inject persistent context at a high-attention position in every conversation, ensuring key conventions are never forgotten.
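As an illustrative sketch only (not an official template), a minimal project configuration file of this kind might contain the same conventions you would otherwise restate every session:

```markdown
# Project conventions

- Python 3.12, FastAPI, SQLAlchemy 2.0
- Run tests with `pytest -q`; all new code needs tests
- Never introduce new dependencies without asking
- Prefer small, pure functions; no global state
```

Because the tool injects this on every conversation, these rules survive fresh starts for free.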
- Break complex tasks into focused sessions. Multiple short, focused conversations produce better results than one marathon session. Plan your session boundaries around logical units of work (one component, one feature, one layer at a time).
- Periodically restate key constraints with anchor messages. Every 5-8 turns, briefly remind the AI of your most important constraints and the current state of development. This is inexpensive insurance against context drift.
- Ask the AI to be concise when you do not need explanations. The AI's verbose explanations consume tokens from your context budget. Instructing it to return code only (when appropriate) can save 1,000-3,000 tokens per response, dramatically extending your conversation's useful life.
One-sentence summary: Treat your AI conversation as a carefully designed data structure---plan your context budget, front-load critical information, use the right multi-turn pattern for each task, and start fresh when the conversation degrades.