Chapter 13 Exercises: Working with Multiple Files and Large Codebases
These exercises progress from basic recall through creative application to challenging multi-chapter integration. Complete them in order within each tier, but feel free to skip tiers that are too easy for your current level.
Tier 1: Recall (Exercises 1-6)
These exercises test your understanding of the core concepts from this chapter.
Exercise 1: Context Strategy Identification
For each of the following scenarios, identify which context-providing strategy (interface-first, full file inclusion, dependency summary, example-driven, or progressive disclosure) would be most appropriate. Explain your reasoning.
a) You need the AI to write a new service that calls three utility functions from an existing module.
b) You need the AI to refactor an existing 200-line file to improve error handling.
c) You need the AI to create a new model file that follows the exact same pattern as five existing model files.
d) You need the AI to redesign the authentication system, but you are not sure which files are involved.
e) You need the AI to write a quick script that imports a User class with 8 fields.
Exercise 2: Repository Map Components
List the five key pieces of information that a good repository map should include for each file. Explain why each piece of information is valuable to an AI assistant.
Exercise 3: File-by-File vs. Holistic Decision
For each scenario below, state whether you would use file-by-file generation, holistic generation, or a hybrid approach:
a) Generating a complete REST API with models, routes, and services for a new microservice (12 files)
b) Adding a complex search algorithm to an existing search_engine.py module
c) Creating three new tightly-coupled data model files that reference each other
d) Building a CLI tool with 6 independent subcommands, each in its own file
e) Generating a complete test suite (8 test files) for an existing application
Exercise 4: Import Map Construction
Given the following directory structure, write a complete import map showing what each module exports and the correct import statement for each export.
myapp/
├── __init__.py
├── models/
│   ├── __init__.py
│   ├── user.py (exports: User, UserStatus)
│   └── post.py (exports: Post, PostCategory)
├── services/
│   ├── __init__.py
│   ├── user_service.py (exports: UserService)
│   └── post_service.py (exports: PostService)
└── utils/
    ├── __init__.py
    └── database.py (exports: get_session, DatabaseError)
Exercise 5: Dependency Rule Violations
Review the following import statements and identify which ones violate the standard layered architecture dependency rules (models -> no project imports, utils -> no project imports, services -> models + utils only, api -> services + models only):
# File: models/user.py
from myapp.utils.validation import validate_email
from myapp.services.auth_service import hash_password
# File: services/order_service.py
from myapp.models.order import Order
from myapp.models.user import User
from myapp.api.routes import get_current_user
# File: api/routes.py
from myapp.services.user_service import UserService
from myapp.models.user import User
# File: utils/database.py
from myapp.models.user import User
Exercise 6: Convention Drift Scenarios
Describe three specific ways that convention drift might manifest in a multi-file project generated over a long AI session. For each, explain what the drift looks like and how you would detect it.
Tier 2: Apply (Exercises 7-12)
These exercises ask you to apply the concepts in practical scenarios.
Exercise 7: Write a Context Document
Create a complete context document (under 500 words) for a fictional e-commerce application with the following components:
- User management (registration, login, profiles)
- Product catalog (categories, search, filtering)
- Shopping cart and checkout
- Order management and tracking
- Payment processing (via Stripe)
Include: architecture overview, technology stack, naming conventions, dependency rules, and module summaries.
Exercise 8: Design a Repository Map Generator Prompt
Write a prompt that asks an AI assistant to generate a repository map from a provided directory tree and a set of file contents. The prompt should specify exactly what information you want in the map (file sizes, exports, dependencies, etc.) and what format to use.
Exercise 9: Create a Consistency Reference
Given the following service file, write a prompt that uses it as a consistency reference to generate a new, different service. Your prompt should explicitly call out which patterns the AI should replicate.
class ProductService:
    """Service for managing product operations."""

    def __init__(self, db: Database, cache: Cache) -> None:
        self._db = db
        self._cache = cache

    def get_by_id(self, product_id: int) -> Product:
        """Retrieve a product by its unique identifier.

        Args:
            product_id: The unique identifier of the product.

        Returns:
            The product with the given ID.

        Raises:
            NotFoundError: If no product with the given ID exists.
        """
        cached = self._cache.get(f"product:{product_id}")
        if cached:
            return cached
        product = self._db.query(Product).filter_by(id=product_id).first()
        if not product:
            raise NotFoundError(f"Product {product_id} not found")
        self._cache.set(f"product:{product_id}", product, ttl=300)
        return product
Exercise 10: Phased Approach Planning
You need to add a "favorites" feature to an existing e-commerce application. Users should be able to favorite products, view their favorites, and receive notifications when favorited products go on sale. Plan a phased approach that breaks this into manageable AI sessions. For each phase, specify:
- What files need to be created or modified
- What context the AI needs
- What the deliverable is
Exercise 11: Monorepo Scoping
You are working in a monorepo with the following packages: user-service, product-service, order-service, notification-service, shared-models, and common-utils. You need to add a "wishlist" feature. Determine:
- Which packages need to be modified
- What order to make the changes
- What context from other packages each session needs
- How to verify cross-package consistency
Exercise 12: Sliding Window Context Management
You need to generate 10 data model files, each following the same pattern. Design a sliding window context management plan that specifies:
- What goes in the "stable context" (always present)
- What goes in the "sliding window" (current + previous file)
- How you handle the transition between files
- How you verify consistency across all 10 files at the end
Tier 3: Analyze (Exercises 13-18)
These exercises require analyzing scenarios and making judgments.
Exercise 13: Context Efficiency Analysis
A developer provides the following context to an AI assistant for generating a new API endpoint:
Here is my entire models directory (4 files, 400 lines total).
Here is my entire services directory (3 files, 600 lines total).
Here is my entire utils directory (5 files, 350 lines total).
Here is my existing routes.py (200 lines).
Here is my project's requirements.txt (50 lines).
Please add a GET /api/users/{id}/orders endpoint.
Analyze this approach:
- What is good about it?
- What is wasteful or potentially problematic?
- How would you restructure the context to be more efficient?
- Estimate the token cost of the original approach vs. your improved approach.
Exercise 14: Consistency Audit
Review the following three function signatures from different files in the same project and identify all consistency issues:
# From user_service.py
def get_user_by_id(self, userId: int) -> Optional[User]:
# From product_service.py
def get_product_by_id(self, product_id: int) -> Product | None:
# From order_service.py
def getOrderById(self, order_id: str) -> Optional[Order]:
For each issue, explain what the inconsistency is, why it matters, and what the standardized version should look like.
Exercise 15: Dependency Graph Analysis
Given the following import statements from a 6-file project, draw the dependency graph and identify:
- Any circular dependencies
- Any layer violations
- The most coupled module (most dependencies)
- The most depended-upon module
- Suggestions for improvement
# models/user.py - imports: nothing internal
# models/order.py - imports: models.user
# services/user_service.py - imports: models.user, utils.database, utils.email
# services/order_service.py - imports: models.order, models.user, services.user_service, utils.database
# utils/database.py - imports: nothing internal
# utils/email.py - imports: models.user
Exercise 16: Holistic vs. File-by-File Tradeoff Analysis
You are building a new feature that requires 6 new files:
- 2 model files (tightly coupled, reference each other)
- 2 service files (each depends on both models, loosely coupled with each other)
- 2 test files (one for each service)

Analyze the tradeoffs of three approaches:
a) Generate all 6 files in one prompt
b) Generate all 6 files one at a time
c) Generate in groups: models together, services together, tests together
For each approach, discuss: consistency, quality per file, context window usage, number of iterations needed, and risk of integration issues.
Exercise 17: Context Window Budget
You have a model with a 128,000-token context window. Your task requires:
- System prompt and instructions: ~1,000 tokens
- Style guide and conventions: ~500 tokens
- Repository map: ~800 tokens
- The file to generate: ~2,000 tokens (estimated output)
- Response overhead: ~500 tokens
This leaves approximately 123,200 tokens for source code context. Your project has 40 Python files averaging 150 lines (approximately 450 tokens) each. That is 18,000 tokens total for the entire project.
Should you include all 40 files? Analyze the tradeoffs and recommend a strategy.
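The budget arithmetic in the exercise can be sanity-checked in a few lines (the token figures are the exercise's estimates, not measurements):

```python
# Sanity check of the exercise's token-budget arithmetic.
CONTEXT_WINDOW = 128_000
overhead = 1_000 + 500 + 800 + 2_000 + 500  # instructions, style, map, output, response
available = CONTEXT_WINDOW - overhead       # room left for source code context
project_total = 40 * 450                    # 40 files at ~450 tokens each

print(available)                   # 123200
print(project_total)               # 18000
print(project_total <= available)  # True
```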
Exercise 18: Cross-Package Change Impact Analysis
In a monorepo, you need to add an is_verified boolean field to the shared User model. Analyze the ripple effects:
- Which types of files need to change?
- In what order should changes be made?
- What are the risks of making these changes using AI?
- How would you verify that all necessary changes were made?
- What testing strategy would you use?
Tier 4: Create (Exercises 19-24)
These exercises require you to build something using the chapter's concepts.
Exercise 19: Build a Repository Map Generator
Using the concepts from Section 13.2 and the example code, create a Python script that:
- Takes a directory path as input
- Recursively scans all Python files
- For each file, extracts: classes, functions, imports, and line count
- Outputs a formatted repository map suitable for pasting into an AI prompt
- Handles errors gracefully (permission errors, binary files, etc.)
Test it on a real project directory.
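One possible starting skeleton, using only the standard library; the function names and output format here are illustrative, not prescribed by the chapter:

```python
# Starting skeleton for a repository map generator (Exercise 19).
import ast
from pathlib import Path

def summarize_file(path: Path) -> str:
    """Summarize one file: classes, top-level functions, imports, line count."""
    try:
        source = path.read_text(encoding="utf-8")
        tree = ast.parse(source)
    except (OSError, SyntaxError, UnicodeDecodeError) as exc:
        # Graceful handling of unreadable or non-Python files.
        return f"{path}: skipped ({exc.__class__.__name__})"
    classes = [n.name for n in ast.walk(tree) if isinstance(n, ast.ClassDef)]
    functions = [n.name for n in tree.body if isinstance(n, ast.FunctionDef)]
    imports = sorted(
        {alias.name.split(".")[0]
         for n in ast.walk(tree) if isinstance(n, ast.Import)
         for alias in n.names}
        | {n.module.split(".")[0]
           for n in ast.walk(tree)
           if isinstance(n, ast.ImportFrom) and n.module}
    )
    lines = source.count("\n") + 1
    return (f"{path} ({lines} lines)\n"
            f"  classes: {classes}\n"
            f"  functions: {functions}\n"
            f"  imports: {imports}")

def build_map(root: str) -> str:
    """Build a repository map for every .py file under root."""
    return "\n".join(summarize_file(p) for p in sorted(Path(root).rglob("*.py")))
```

A fuller solution would also capture docstrings and method names, but this covers the extraction core of the exercise.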
Exercise 20: Build a Cross-File Context Builder
Create a Python tool that:
- Takes a target file path and a project directory
- Analyzes the target file's imports to determine its dependencies
- For each dependency within the project, extracts the public interface (class and function signatures with docstrings)
- Outputs a formatted "cross-file context" block ready for an AI prompt
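A sketch of the interface-extraction step, assuming `ast.unparse` (Python 3.9+); the helper name and output format are placeholders for your own design:

```python
# Extract a module's public interface for use as cross-file context.
import ast

def public_interface(source: str) -> list[str]:
    """Collect public class/function signatures plus first docstring lines."""
    out: list[str] = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and not node.name.startswith("_"):
            doc = ast.get_docstring(node)
            summary = f"  # {doc.splitlines()[0]}" if doc else ""
            out.append(f"def {node.name}({ast.unparse(node.args)}){summary}")
        elif isinstance(node, ast.ClassDef) and not node.name.startswith("_"):
            out.append(f"class {node.name}:")
            for item in node.body:
                # Public methods only; skip private and dunder helpers.
                if isinstance(item, ast.FunctionDef) and not item.name.startswith("_"):
                    out.append(f"    def {item.name}({ast.unparse(item.args)})")
    return out
```

The remaining work, resolving the target file's imports to paths inside the project, is the other half of the exercise.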
Exercise 21: Build a Convention Checker
Create a Python script that checks a directory of Python files for consistency in:
- Naming conventions (classes PascalCase, functions snake_case, constants UPPER_CASE)
- Docstring presence on all public functions and classes
- Import style (all absolute or all relative, but not mixed)
- Type hint coverage (functions with vs. without type hints)
Output a report of any inconsistencies found.
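A small starting point for the naming check; the regexes and report format are assumptions to adapt, and the docstring, import-style, and type-hint checks are left to you:

```python
# Naming-convention check for a convention checker (Exercise 21).
import ast
import re

PASCAL = re.compile(r"^[A-Z][A-Za-z0-9]*$")       # PascalCase for classes
SNAKE = re.compile(r"^[a-z_][a-z0-9_]*$")         # snake_case for functions

def naming_violations(source: str, filename: str = "<file>") -> list[str]:
    """Report class and function names that break the conventions above."""
    issues = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef) and not PASCAL.match(node.name):
            issues.append(f"{filename}:{node.lineno} class '{node.name}' is not PascalCase")
        elif isinstance(node, ast.FunctionDef) and not SNAKE.match(node.name):
            issues.append(f"{filename}:{node.lineno} function '{node.name}' is not snake_case")
    return issues
```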
Exercise 22: Multi-File Project Generation
Using the vibe coding techniques from this chapter, generate a complete small project (8-10 files) for a library management system. Document:
- Your context strategy (what you included in each prompt)
- Whether you used file-by-file, holistic, or hybrid generation
- Any consistency issues you encountered and how you resolved them
- The total number of prompts used
Exercise 23: Context Document Creation
Create a comprehensive context document for an existing open-source Python project. Pick any project with 20+ files. Your document should include:
- Architecture overview
- Module summaries
- Key abstractions and patterns
- Dependency map
- Naming conventions
- Import conventions
Test the document by using it to generate a new feature with AI assistance.
Exercise 24: Import Cycle Detector
Build a Python tool that:
- Scans a directory of Python files
- Parses import statements from each file
- Builds a directed dependency graph
- Detects and reports any circular dependencies
- Suggests which imports to restructure to break cycles
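The cycle-detection core might look like the sketch below, operating on an already-built graph (module name mapped to the set of modules it imports); the parsing step that builds that graph is the main part of the exercise:

```python
# Depth-first cycle detection on a module dependency graph (Exercise 24).
def find_cycles(graph: dict[str, set[str]]) -> list[list[str]]:
    """Return each cycle as a path, e.g. ['a', 'b', 'c', 'a']."""
    cycles: list[list[str]] = []
    path: list[str] = []      # current DFS path ("grey" nodes)
    visited: set[str] = set() # fully explored ("black") nodes

    def dfs(node: str) -> None:
        if node in path:
            # Back edge: the slice of the path from node onward is a cycle.
            cycles.append(path[path.index(node):] + [node])
            return
        if node in visited:
            return
        visited.add(node)
        path.append(node)
        for dep in graph.get(node, ()):
            dfs(dep)
        path.pop()

    for start in graph:
        dfs(start)
    return cycles
```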
Tier 5: Challenge (Exercises 25-30)
These exercises integrate concepts from multiple chapters and push beyond the material covered in this chapter.
Exercise 25: Automated Context Optimizer
Build a tool that, given a task description and a codebase, automatically determines the optimal set of files and file fragments to include as context for an AI prompt. The tool should:
- Parse the task description to identify key entities (models, services, endpoints mentioned)
- Map those entities to files in the codebase
- Determine the minimum set of context needed (interfaces vs. full files)
- Estimate the token count and flag if it exceeds a configurable limit
- Output a formatted context block ready for use in a prompt
This integrates Chapter 9 (Context Management) with this chapter's techniques.
Exercise 26: Multi-Agent Codebase Modification
Design (and optionally implement) a system that uses multiple AI sessions working in parallel to make a cross-cutting change to a codebase. The system should:
- Take a high-level change description
- Analyze the codebase to determine affected files
- Split the work into independent workstreams
- Generate a context package for each workstream
- Execute the workstreams (potentially in parallel)
- Verify consistency across all changes
This integrates Chapter 12 (Advanced Prompting) and previews Chapter 38 (Multi-Agent Systems).
Exercise 27: Enterprise Codebase Simulation
Create a simulated enterprise codebase (50+ files across 5+ packages in a monorepo structure) using AI. Then:
- Generate a comprehensive repository map
- Create tiered context documents (Tier 1 through Tier 4)
- Use these documents to successfully add a new cross-cutting feature
- Measure and report: number of prompts needed, consistency issues found, total tokens consumed
Exercise 28: Legacy Code Modernization Pipeline
Design and implement a pipeline that uses AI to modernize legacy Python code:
1. Analyze a legacy module and generate documentation
2. Create a modernization plan
3. Generate modernized code that maintains the public API
4. Generate tests that verify the modernized code matches the original behavior
5. Verify consistency between old and new versions
Test on a real legacy Python file (Python 2 style, no type hints, no docstrings).
Exercise 29: Team Convention Enforcement System
Build a system that:
- Reads a team's coding conventions from a configuration file
- Scans AI-generated code for convention violations
- Generates corrective prompts that can be sent to an AI to fix violations
- Tracks convention adherence over time (per developer, per module, per sprint)
This integrates Chapter 25 (Design Patterns and Clean Code) with this chapter.
Exercise 30: Repository Understanding Benchmark
Create a benchmark for measuring how well AI understands a codebase. The benchmark should:
- Take a real codebase as input
- Generate a set of questions about the code (e.g., "What happens when a user with an expired token makes a request?", "Which module handles database connection pooling?")
- Generate correct answers from manual analysis
- Measure AI accuracy with different context strategies (full code, repository map only, interface-only, tiered)
- Report which strategies yield the best comprehension for the least context cost
This is a research-oriented exercise that integrates Chapter 9, this chapter, and Chapter 7 (Understanding AI-Generated Code).