Chapter 2: Exercises
Tier 1: Recall (Remember and Understand)
Exercise 2.1 — Key Vocabulary
Define each of the following terms in your own words, using no more than two sentences each:
1. Token
2. Context window
3. Transformer
4. Attention mechanism
5. Temperature (in the context of AI)
6. RLHF
7. Fine-tuning
8. Probability distribution
9. Pre-training
10. Sampling
Exercise 2.2 — Sequence the Pipeline
The following steps describe what happens when you ask an AI coding assistant to write code. Put them in the correct order:
- A. The model predicts a probability distribution over the next token
- B. Your text is broken into tokens
- C. The selected token is appended and the process repeats
- D. Tokens are converted into numerical representations
- E. The numbers pass through the transformer's layers
- F. A token is selected from the distribution
- G. The final sequence of tokens is decoded back into text
Exercise 2.3 — True or False
Determine whether each statement is true or false, and briefly explain your reasoning:
1. AI coding assistants understand code the same way human programmers do.
2. A token always corresponds to exactly one word.
3. The context window includes both the prompt and the response.
4. Higher temperature settings produce more predictable output.
5. RLHF is performed before pre-training.
6. Transformers process input tokens in parallel, not sequentially.
7. The attention mechanism allows the model to focus on relevant parts of distant input.
8. AI models can update their knowledge after training without retraining.
Exercise 2.4 — Fill in the Blanks
Complete each sentence:
1. The architecture that powers modern AI coding assistants is called a _.
2. The fundamental unit of text that AI models process is called a _.
3. A temperature of 0 makes the model's output more _.
4. The model's training data has a _ _, meaning it does not know about events after a certain date.
5. _ _ _ is the process that aligns model output with human preferences about code quality.
Exercise 2.5 — Matching
Match each concept on the left with the best analogy on the right:
| Concept | Analogy |
|---|---|
| 1. Context window | A. A spotlight operator at a theater |
| 2. Attention mechanism | B. An adventurousness dial on a recommendation system |
| 3. Temperature | C. A contractor's set of blueprints |
| 4. Pre-training | D. An extremely well-read autocomplete system |
| 5. Language model | E. Learning a language by reading millions of books |
Tier 2: Apply (Use Knowledge in New Situations)
Exercise 2.6 — Token Estimation
Estimate the token count for each of the following code snippets. Then use the tokenization script from `code/example-01-tokenization.py` (or an online tokenizer tool) to check your estimates:
```python
# Snippet A
def hello():
    print("Hello, World!")

# Snippet B
class UserAuthentication:
    def __init__(self, database_connection_string: str, max_retry_attempts: int = 3):
        self.db_connection = database_connection_string
        self.max_retries = max_retry_attempts
        self._authenticated_users: dict[str, bool] = {}

# Snippet C
result = [x ** 2 for x in range(100) if x % 2 == 0]
```
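If a tokenizer tool is not handy, a crude sanity check is the common rule of thumb of roughly four characters per token. This is only a ballpark sketch; real BPE tokenizers treat operators, indentation, and identifiers differently, so expect your checked counts to diverge:

```python
def estimate_tokens(code: str) -> int:
    """Very rough token estimate using the ~4 characters-per-token heuristic."""
    return max(1, round(len(code) / 4))

snippet_a = 'def hello():\n    print("Hello, World!")\n'
print(estimate_tokens(snippet_a))
```

Compare this figure against the actual tokenizer output to see how far the heuristic drifts on code.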
Exercise 2.7 — Context Window Budgeting
You are working with a model that has a 4,096-token context window. You need to:
- Provide a system prompt (approximately 200 tokens)
- Include an existing class definition (approximately 800 tokens)
- Include three related function signatures (approximately 150 tokens)
- Write your instruction prompt (approximately 100 tokens)
- Leave room for the model's response
Calculate: How many tokens are available for the model's response? If the model generates approximately 4 tokens per line of code, approximately how many lines of code can it produce?
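A tiny helper can check this kind of arithmetic. The numbers in the example call below are illustrative, not the exercise's answer; plug in the exercise's figures yourself:

```python
def response_budget(window: int, *prompt_costs: int) -> int:
    """Tokens left for the model's response after fixed prompt costs."""
    return window - sum(prompt_costs)

# Illustrative numbers only: a 1,000-token window with two prompt components.
print(response_budget(1000, 100, 200))
```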
Exercise 2.8 — Temperature Prediction
For each of the following tasks, predict whether low temperature (0-0.3), medium temperature (0.3-0.7), or high temperature (0.7-1.0) would be most appropriate, and explain why:
1. Implementing the binary search algorithm
2. Generating creative variable names for a game
3. Writing unit tests for a known function
4. Brainstorming different architectural approaches
5. Converting a Python function to JavaScript
6. Writing a poem as a code comment (for fun)
Exercise 2.9 — Prompt Optimization
The following prompt is inefficient and vague. Rewrite it to be more effective, applying what you learned about how AI models process context:
"I need some code. It should work with data. The data comes from users. Process it somehow and make sure it works. Use Python. Also make it good."
Exercise 2.10 — Attention Simulation
Consider the following code and imagine you are the attention mechanism. For each blank, identify which earlier parts of the code the attention should focus on most strongly, and predict the completion:
```python
class ShoppingCart:
    def __init__(self):
        self.items = []
        self.discount_rate = 0.1

    def add_item(self, name: str, price: float, quantity: int = 1):
        self.items.append({"name": name, "price": price, "quantity": quantity})

    def calculate_total(self):
        subtotal = sum(item["price"] * item["____"] for item in self.____)
        discount = subtotal * self.____
        return subtotal - ____
```
Tier 3: Analyze (Break Down and Examine)
Exercise 2.11 — Output Analysis
An AI coding assistant was asked: "Write a function to check if a number is prime." It produced the following code:
```python
def is_prime(n):
    if n < 2:
        return False
    for i in range(2, n):
        return n % i != 0
    return True
```
Analyze the output:
1. What is wrong with this code?
2. Why might the AI have generated this particular error?
3. How does understanding token-by-token generation help explain this type of mistake?
4. Write the corrected version.
Exercise 2.12 — Training Data Influence
Consider these two responses from an AI to the prompt "Write a Python swap function":
Response A:
```python
def swap(a, b):
    temp = a
    a = b
    b = temp
    return a, b
```
Response B:
```python
def swap(a, b):
    return b, a
```
Both are correct. Analyze:
1. Why might the AI produce Response A in some cases and Response B in others?
2. Which response reflects more "Pythonic" code, and why might the model have learned both patterns?
3. How might temperature settings affect which response is generated?
Exercise 2.13 — Context Window Experiment
Design an experiment to test how context window utilization affects code quality. Describe:
1. The coding task you would use
2. Three different context configurations (minimal, moderate, rich)
3. What you would measure to evaluate quality
4. What results you would predict and why
Exercise 2.14 — Failure Mode Classification
For each of the following AI-generated code errors, classify the most likely cause (pattern matching failure, context limitation, training data bias, or novel problem difficulty):
1. The AI generates a function using a deprecated API from an older library version
2. The AI writes correct code for the first three methods of a class but introduces an inconsistency in the fourth method
3. The AI implements a well-known algorithm correctly but fails on a custom variation
4. The AI uses `var` instead of `let` in a modern JavaScript project
5. The AI generates a SQL query with a potential injection vulnerability
Exercise 2.15 — Comparing Explanations
Ask an AI coding assistant to explain how a binary search works. Then ask it again with additional context: "Explain binary search to someone who has never programmed but understands the concept of looking up a word in a physical dictionary."
Compare the two responses:
1. How did the context change the response?
2. What does this tell you about how the model uses context?
3. Which explanation would be more useful for a beginner, and why?
Tier 4: Create (Build Something New)
Exercise 2.16 — Build a Token Counter
Write a Python program that:
1. Accepts a file path as a command-line argument
2. Reads the file content
3. Estimates the token count using the approximation rules from Section 2.6
4. Compares the estimate to the actual token count using the tiktoken library (if available)
5. Reports the results with a breakdown by code vs. comments
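A minimal starting point for steps 1-3 and 5 is sketched below, assuming Python source files and the rough four-characters-per-token heuristic; the `tiktoken` comparison in step 4 is left to you:

```python
import sys

CHARS_PER_TOKEN = 4  # rough rule of thumb; real BPE counts will differ

def breakdown(source: str) -> dict[str, int]:
    """Estimate tokens in a Python file, split into code vs. comment lines."""
    code_chars = comment_chars = 0
    for line in source.splitlines():
        if line.lstrip().startswith("#"):
            comment_chars += len(line)
        else:
            code_chars += len(line)
    return {
        "code": round(code_chars / CHARS_PER_TOKEN),
        "comments": round(comment_chars / CHARS_PER_TOKEN),
    }

if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as f:
        print(breakdown(f.read()))
```

Note that this treats only full-line `#` comments as comments; handling trailing comments and docstrings is part of the exercise.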
Exercise 2.17 — Context Window Simulator
Extend the context window simulation from `code/example-02-context-window.py` to:
1. Accept multiple files as input
2. Prioritize which file contents to include based on relevance to a given task description
3. Generate a formatted prompt that fits within a specified token budget
4. Display how the context budget was allocated
Exercise 2.18 — Temperature Visualizer
Create a Python script that:
1. Defines a simple probability distribution over 10 tokens
2. Applies different temperature values (0.1, 0.5, 1.0, 2.0) to the distribution
3. Visualizes the results using matplotlib (or prints an ASCII bar chart)
4. Simulates sampling 100 tokens at each temperature and shows the frequency distribution
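A possible sketch of step 2, applying temperature in log space (the standard softmax-with-temperature formulation); the sampling and plotting steps are left to you:

```python
import math

def apply_temperature(probs: list[float], temperature: float) -> list[float]:
    """Re-weight a probability distribution; T < 1 sharpens it, T > 1 flattens it."""
    # Convert probabilities back to logits so temperature scales them directly.
    logits = [math.log(p) for p in probs]
    scaled = [l / temperature for l in logits]
    # Softmax to get a normalized distribution again.
    exps = [math.exp(s) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

probs = [0.5, 0.3, 0.1, 0.05, 0.05]  # toy distribution over 5 tokens
for t in (0.1, 0.5, 1.0, 2.0):
    adjusted = apply_temperature(probs, t)
    print(f"T={t}: top prob = {adjusted[0]:.3f}")
```

At T=1.0 the distribution is unchanged; lower values concentrate mass on the most likely token, higher values spread it out.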
Exercise 2.19 — AI Capability Assessment Tool
Design and implement a Python script that generates a set of test prompts designed to evaluate an AI coding assistant's strengths and weaknesses. The script should:
1. Generate prompts in categories: syntax, algorithms, architecture, security, documentation
2. Include expected output patterns for basic validation
3. Provide a scoring rubric for each category
4. Output a formatted report
Exercise 2.20 — Prompt Optimizer
Write a Python function that takes a raw prompt and optimizes it based on the principles from this chapter:
1. Ensures the most important information is at the beginning
2. Adds structure (headers, bullet points) to help attention
3. Estimates the token count and warns if it exceeds a budget
4. Adds explicit quality requirements if not present
5. Returns the optimized prompt with a summary of changes made
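One hedged starting point covering steps 3 and 4 only; the heuristics here (the four-characters-per-token estimate, and searching for the word "requirements") are illustrative placeholders, not the chapter's method:

```python
def optimize_prompt(prompt: str, token_budget: int = 500) -> tuple[str, list[str]]:
    """Partially optimize a prompt; returns (new_prompt, summary_of_changes)."""
    changes = []
    optimized = prompt.strip()

    # Step 3: estimate tokens (~4 chars/token) and warn if over budget.
    estimated = round(len(optimized) / 4)
    if estimated > token_budget:
        changes.append(f"warning: ~{estimated} tokens exceeds budget of {token_budget}")

    # Step 4: add explicit quality requirements if none appear to be present.
    if "requirements" not in optimized.lower():
        optimized += (
            "\n\nRequirements:\n"
            "- Include type hints and docstrings\n"
            "- Handle edge cases and invalid input"
        )
        changes.append("appended explicit quality requirements")

    return optimized, changes
```

Steps 1 and 2 (reordering content and adding structure) need real text analysis and are the interesting part of the exercise.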
Tier 5: Challenge (Extend and Synthesize)
Exercise 2.21 — Attention Heatmap
Research and implement a simplified version of the attention mechanism. Your implementation should:
1. Take a sequence of token embeddings (you can use random vectors)
2. Compute Query, Key, and Value matrices
3. Calculate attention scores between all pairs of tokens
4. Visualize the attention pattern as a heatmap
5. Explain what the heatmap reveals about which tokens "attend to" which other tokens
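A plain-Python sketch of steps 1 and 3, using random matrices in place of learned weights; the Value projection, the weighted sum, and the heatmap are deliberately left as the remaining work:

```python
import math
import random

random.seed(0)  # deterministic toy example

def matmul(a, b):
    """Multiply matrix a (m x n) by matrix b (n x p), both lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(values):
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(embeddings, d_k=4):
    """Single-head scaled dot-product attention weights over toy embeddings.

    Random Gaussian matrices stand in for the learned Query/Key projections,
    so the pattern is meaningless -- the point is the shape of the computation.
    """
    dim = len(embeddings[0])
    w_q = [[random.gauss(0, 1) for _ in range(d_k)] for _ in range(dim)]
    w_k = [[random.gauss(0, 1) for _ in range(d_k)] for _ in range(dim)]
    q = matmul(embeddings, w_q)  # one query row per token
    k = matmul(embeddings, w_k)  # one key row per token
    scale = math.sqrt(d_k)
    # scores[i][j]: how strongly token i attends to token j
    scores = [[sum(qi * kj for qi, kj in zip(q[i], k[j])) / scale
               for j in range(len(k))] for i in range(len(q))]
    return [softmax(row) for row in scores]

emb = [[random.gauss(0, 1) for _ in range(8)] for _ in range(5)]
weights = attention_weights(emb)
for row in weights:
    print(" ".join(f"{w:.2f}" for w in row))
```

Each row sums to 1: every token distributes a fixed amount of attention across all tokens, which is exactly what the heatmap in step 4 should make visible.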
Exercise 2.22 — Tokenizer Comparison
Compare tokenization across different approaches:
1. Implement a simple character-level tokenizer
2. Implement a simple word-level tokenizer
3. Use the tiktoken library for BPE tokenization
4. Compare all three on the same 10 code snippets (varying languages and complexity)
5. Analyze the trade-offs: vocabulary size, sequence length, and handling of rare tokens
6. Write a report summarizing which approach works best for code and why
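Steps 1 and 2 can be sketched in a few lines; the regex used for word-level splitting is one simple choice among many, and the `tiktoken` comparison in step 3 is left to you:

```python
import re

def char_tokenize(text: str) -> list[str]:
    """Character-level: tiny vocabulary, but very long sequences."""
    return list(text)

def word_tokenize(text: str) -> list[str]:
    """Word-level: identifiers and numbers as units, other symbols one at a time."""
    return re.findall(r"\w+|\S", text)

snippet = "def add(a, b): return a + b"
print(len(char_tokenize(snippet)), "chars vs", len(word_tokenize(snippet)), "words")
```

Comparing the two lengths on the same snippet already hints at the trade-off BPE navigates: shorter sequences than characters, smaller vocabulary than whole words.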
Exercise 2.23 — Simulated RLHF
Design a system that simulates the RLHF process for code quality:
1. Define a "code quality" scoring function that evaluates: variable naming, docstrings, error handling, type hints, and code length
2. Write a function that generates 5 variations of a simple function (simulating model outputs)
3. Score each variation using your quality function
4. Select the best variation and explain how this mimics the RLHF process
5. Discuss limitations of automated quality scoring vs. human evaluation
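A toy version of the scoring function in step 1, covering only a subset of the listed criteria with deliberately crude string checks (a real scorer would parse the code rather than pattern-match on it):

```python
def score_code(source: str) -> int:
    """Toy quality score; each heuristic loosely mirrors one exercise criterion."""
    score = 0
    if '"""' in source:                       # has a docstring
        score += 2
    if "->" in source:                        # has a return type hint (rough check)
        score += 2
    if "try" in source or "raise" in source:  # some error handling
        score += 2
    if len(source.splitlines()) <= 15:        # concise
        score += 1
    return score

candidates = [
    "def add(a, b): return a + b",
    'def add(a: int, b: int) -> int:\n    """Return the sum of a and b."""\n    return a + b',
]
best = max(candidates, key=score_code)
```

Selecting `best` by score mimics the "pick the preferred output" step of RLHF; the gap between these string checks and genuine human judgment is exactly what step 5 asks you to discuss.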
Exercise 2.24 — The Context Experiment
Conduct an actual experiment with an AI coding assistant:
1. Choose a moderately complex coding task (e.g., implement a cache with TTL expiration)
2. Submit the task three times with different levels of context:
   - Minimal: Just the task description
   - Moderate: Task description plus type hints and interface requirements
   - Rich: Task description, type hints, existing code patterns, test cases, and constraints
3. Evaluate each response for: correctness, style consistency, edge case handling, and documentation
4. Write a 500-word analysis of your findings, relating them to the concepts in this chapter
Exercise 2.25 — Teaching Exercise
Write a 1,000-word explanation of how AI coding assistants work, targeted at a non-technical manager who needs to understand the technology to make informed decisions about adopting AI tools for their team. You may not use any technical jargon without first defining it in plain language. Include at least three analogies that are different from those used in this chapter.
Exercise 2.26 — Architecture Diagram
Create a detailed diagram (using text-based tools like Mermaid or ASCII art) that shows:
1. The complete pipeline from user prompt to generated code
2. The internal structure of a transformer layer (attention + feed-forward)
3. The three stages of training (pre-training, SFT, RLHF)
4. How context window contents are structured during a typical coding interaction
Exercise 2.27 — Cross-Chapter Synthesis
Using concepts from both Chapter 1 (vibe coding fundamentals) and Chapter 2 (how AI works):
1. Explain why "describing intent clearly" (from Chapter 1) is so important, now that you understand attention mechanisms and probability distributions
2. Explain why "iterating on AI output" (from Chapter 1) is necessary, now that you understand token-by-token generation and its limitations
3. Propose three new vibe coding best practices that are directly motivated by the technical concepts in this chapter
Exercise 2.28 — Debate Preparation
Prepare arguments for both sides of the following debate: "Understanding how AI models work internally is essential for effective vibe coding." Write three strong arguments for each side, supporting each with specific technical details from this chapter.