Chapter 21 Exercises: AI-Assisted Testing Strategies
Tier 1: Recall and Understanding (Exercises 1–6)
Exercise 1: Testing Vocabulary
Define each of the following testing terms in one or two sentences. Provide one concrete example for each.
- Unit test
- Integration test
- End-to-end test
- Test fixture
- Mock object
- Test coverage
- Property-based test
Exercise 2: The Testing Pyramid
Draw or describe the testing pyramid. For each layer, explain (a) what it tests, (b) relative speed, (c) relative quantity, and (d) one example from a to-do list application.
Exercise 3: AI Code Failure Modes
List five common failure modes of AI-generated code discussed in this chapter. For each, write one sentence explaining why testing helps catch it.
Exercise 4: pytest Basics
Explain the difference between these pytest concepts. Write a one-sentence description and a two-line code example for each.
- @pytest.fixture vs. a setUp method
- @pytest.mark.parametrize vs. writing separate test functions
- @pytest.mark.skip vs. @pytest.mark.xfail
Exercise 5: Fixture Scopes
Explain what each of the following pytest fixture scopes means and give a use case for each:
- scope="function"
- scope="class"
- scope="module"
- scope="session"
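For reference, the scope is passed as an argument to the fixture decorator. The fixture below is a hypothetical example, not one from the chapter:

```python
import pytest

@pytest.fixture(scope="session")
def db_url():
    # Built once per test session and reused by every test that requests it;
    # with scope="function" (the default) it would be rebuilt for each test.
    return "sqlite:///:memory:"
```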
Exercise 6: Coverage Interpretation
Given the following coverage report, answer the questions below.
Name               Stmts   Miss  Cover   Missing
myapp/auth.py         50     15    70%   23-30, 45-51
myapp/models.py       80      4    95%   67-70
myapp/utils.py        30      0   100%
myapp/api.py         100     25    75%   34-45, 78-90, 95-98
TOTAL                260     44    83%
- Which file has the most untested code?
- Which file has the highest coverage?
- If auth.py handles password validation and api.py handles user input, which should you prioritize for additional testing? Why?
- Is 83% overall coverage sufficient? Under what circumstances would you want higher coverage?
Tier 2: Application (Exercises 7–12)
Exercise 7: Writing pytest Tests
Write pytest tests for the following function. Include at least five test cases covering normal operation, edge cases, and error handling.
def fizzbuzz(n: int) -> str:
    """Return 'Fizz' for multiples of 3, 'Buzz' for multiples of 5,
    'FizzBuzz' for multiples of both, or the number as a string."""
    if not isinstance(n, int):
        raise TypeError("Input must be an integer")
    if n <= 0:
        raise ValueError("Input must be a positive integer")
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)
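As a starting point (not a full solution), one normal case and one error-path case might look like this; fizzbuzz is inlined so the sketch is self-contained:

```python
import pytest

def fizzbuzz(n: int) -> str:
    # Inlined copy of the function under test.
    if not isinstance(n, int):
        raise TypeError("Input must be an integer")
    if n <= 0:
        raise ValueError("Input must be a positive integer")
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)

def test_multiple_of_three():
    assert fizzbuzz(9) == "Fizz"

def test_zero_raises_value_error():
    # Edge case: 0 is divisible by 15, but the guard rejects it first.
    with pytest.raises(ValueError):
        fizzbuzz(0)
```

Your remaining cases should cover multiples of 5, multiples of 15, non-multiples, and the TypeError branch.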
Exercise 8: Parametrized Tests
Rewrite the following repetitive tests using @pytest.mark.parametrize:
def test_celsius_to_fahrenheit_freezing():
    assert celsius_to_fahrenheit(0) == 32.0

def test_celsius_to_fahrenheit_boiling():
    assert celsius_to_fahrenheit(100) == 212.0

def test_celsius_to_fahrenheit_body_temp():
    assert celsius_to_fahrenheit(37) == 98.6

def test_celsius_to_fahrenheit_negative():
    assert celsius_to_fahrenheit(-40) == -40.0

def test_celsius_to_fahrenheit_absolute_zero():
    assert celsius_to_fahrenheit(-273.15) == pytest.approx(-459.67)
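The target pattern is sketched below on an unrelated toy function (double is a placeholder, not from the chapter):

```python
import pytest

def double(x: int) -> int:
    return x * 2

@pytest.mark.parametrize(
    "value, expected",
    [(1, 2), (0, 0), (-3, -6)],
)
def test_double(value: int, expected: int) -> None:
    # One decorated function replaces several near-identical test bodies;
    # pytest runs it once per (value, expected) tuple.
    assert double(value) == expected
```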
Exercise 9: Writing Fixtures
Create a conftest.py file with fixtures for testing a simple blog application. Your fixtures should provide:
- A sample blog post (dictionary with title, content, author, date)
- A list of three blog posts
- A mock database connection that yields a connection and cleans up afterward
- A configured test client (assume a Flask app)
Exercise 10: Mocking External Services
The following function calls an external weather API. Write a test that mocks the API call and verifies the function's behavior.
import requests

def get_temperature(city: str) -> float:
    """Fetch the current temperature for a city in Fahrenheit."""
    response = requests.get(
        "https://api.weather.example.com/current",
        params={"city": city, "units": "fahrenheit"},
    )
    response.raise_for_status()
    data = response.json()
    return data["temperature"]
Write tests for: (a) a successful API call, (b) a city that returns a 404 error, and (c) a network timeout.
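One way to structure the success case, sketched with unittest.mock and a stand-in copy of the function that takes the HTTP client as a parameter (an assumption made so the sketch runs without the requests package; the same idea applies when patching requests.get directly):

```python
from unittest.mock import MagicMock

def get_temperature(city: str, client) -> float:
    # Stand-in for the original: identical logic, but the client is injected.
    response = client.get(
        "https://api.weather.example.com/current",
        params={"city": city, "units": "fahrenheit"},
    )
    response.raise_for_status()
    return response.json()["temperature"]

def test_successful_call():
    fake_response = MagicMock()
    fake_response.json.return_value = {"temperature": 72.5}
    client = MagicMock()
    client.get.return_value = fake_response
    assert get_temperature("Boston", client) == 72.5
    client.get.assert_called_once()  # the API was hit exactly once
```

For the 404 and timeout cases, configure raise_for_status (or the get call itself) to raise the corresponding exception via side_effect.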
Exercise 11: Basic Property-Based Test
Write Hypothesis property-based tests for the following function. Identify at least three properties.
def reverse_string(s: str) -> str:
    """Reverse a string."""
    return s[::-1]
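One candidate property, checked here with a plain random loop so the sketch runs without Hypothesis installed; with Hypothesis, the loop becomes a test function decorated with @given(st.text()):

```python
import random
import string

def reverse_string(s: str) -> str:
    return s[::-1]

# Property: reversing twice is the identity, and length is preserved.
for _ in range(200):
    s = "".join(random.choices(string.printable, k=random.randint(0, 50)))
    assert reverse_string(reverse_string(s)) == s
    assert len(reverse_string(s)) == len(s)
```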
Exercise 12: TDD Cycle
Practice the TDD cycle. Write tests first for a function called validate_email(email: str) -> bool that should:
- Return True for valid email addresses
- Return False for strings without an @ symbol
- Return False for strings without a domain after @
- Return False for empty strings
- Return False for strings with spaces
Write only the tests (at least 8 test cases). Do not write the implementation.
Tier 3: Analysis (Exercises 13–18)
Exercise 13: Test Quality Analysis
Analyze the following tests and explain what is wrong with each one. Rewrite each test to be more effective.
def test_1():
    result = calculate_total([10, 20, 30])
    assert result

def test_user_creation():
    user = create_user("alice", "alice@example.com")
    assert user is not None
    assert user

def test_everything():
    # Test creation
    item = create_item("Widget", 9.99)
    assert item.name == "Widget"
    # Test update
    item.update_price(19.99)
    assert item.price == 19.99
    # Test deletion
    delete_item(item.id)
    assert get_item(item.id) is None
Exercise 14: Mock Overuse Diagnosis
The following test uses extensive mocking. Identify which mocks are appropriate and which are excessive. Explain your reasoning and rewrite the test with appropriate mocking.
@patch("myapp.services.validate_input")
@patch("myapp.services.calculate_tax")
@patch("myapp.services.format_currency")
@patch("myapp.services.send_receipt_email")
@patch("myapp.services.log_transaction")
def test_process_payment(mock_log, mock_email, mock_format,
                         mock_tax, mock_validate):
    mock_validate.return_value = True
    mock_tax.return_value = 8.50
    mock_format.return_value = "$108.50"
    mock_email.return_value = True
    mock_log.return_value = None

    result = process_payment(amount=100.0, card="4111111111111111")

    assert result["success"] is True
    mock_validate.assert_called_once()
    mock_tax.assert_called_once_with(100.0)
    mock_format.assert_called_once_with(108.50)
    mock_email.assert_called_once()
    mock_log.assert_called_once()
Exercise 15: Coverage Gap Analysis
You have a function with 95% line coverage but suspect the tests are still inadequate. The uncovered lines are in an error handling block. Write a set of tests that would cover the error handling and explain why error handling coverage is particularly important for AI-generated code.
import json

def parse_config(filepath: str) -> dict:
    """Parse a JSON configuration file."""
    try:
        with open(filepath, "r") as f:
            data = json.load(f)
    except FileNotFoundError:
        raise ConfigError(f"Config file not found: {filepath}")
    except json.JSONDecodeError as e:
        raise ConfigError(f"Invalid JSON in {filepath}: {e}")
    required_keys = ["database_url", "secret_key", "debug"]
    missing = [k for k in required_keys if k not in data]
    if missing:
        raise ConfigError(f"Missing required keys: {missing}")
    return data
Exercise 16: Hypothesis Strategy Design
Design Hypothesis strategies for the following data types. Write a composite strategy for each.
- A valid US phone number (format: (XXX) XXX-XXXX)
- A valid RGB color (three integers 0-255)
- A valid product (name: 1-100 chars, price: 0.01-99999.99, quantity: 0-10000)
- A valid date range (start date before end date, both within the last 10 years)
Exercise 17: Integration Test Design
Design integration tests for a user registration system that involves these components:
- UserValidator: Validates username and email format
- PasswordHasher: Hashes passwords with bcrypt
- UserRepository: Stores users in a database
- EmailService: Sends welcome emails
- RegistrationService: Orchestrates the registration process
Write at least five integration tests that verify interactions between these components. Specify which components should be real and which should be mocked in each test.
Exercise 18: Test Failure Diagnosis
The following test passes locally but fails in CI. Analyze three possible causes and suggest fixes for each.
import os

def test_log_file_creation():
    logger = AppLogger(log_dir="/tmp/app_logs")
    logger.info("Test message")
    log_files = os.listdir("/tmp/app_logs")
    assert len(log_files) == 1
    with open(f"/tmp/app_logs/{log_files[0]}") as f:
        content = f.read()
    assert "Test message" in content
Tier 4: Synthesis and Evaluation (Exercises 19–24)
Exercise 19: Complete Test Suite Design
Design a complete test suite for an AI-generated library management system with these features:
- Add, remove, and search for books
- Check out and return books
- Track overdue books
- Generate reports (most popular books, active borrowers)
- Send overdue notifications
Create a test plan document that includes:
1. Test categories (unit, integration, E2E, property-based)
2. At least 15 specific test cases organized by category
3. Fixtures needed
4. Mocking strategy
5. Coverage targets per module
6. CI pipeline configuration
Exercise 20: Property-Based Test Suite
Write a comprehensive Hypothesis property-based test suite for an AI-generated shopping cart module with these functions:
def add_item(cart: Cart, item: Item, quantity: int) -> Cart: ...
def remove_item(cart: Cart, item_id: str) -> Cart: ...
def calculate_total(cart: Cart) -> Decimal: ...
def apply_discount(cart: Cart, code: str) -> Cart: ...
def checkout(cart: Cart, payment: Payment) -> Order: ...
Write at least 8 property-based tests covering:
- Invariants (total is always non-negative, item count matches)
- Idempotency (removing a non-existent item changes nothing)
- Round-trip properties (add then remove returns original cart)
- Commutativity (adding items in any order gives same total)
Exercise 21: TDD with AI Simulation
Simulate a complete TDD-AI workflow for building a Markdown-to-HTML converter. Write:
- An initial set of 5 tests for basic Markdown features (headings, bold, italic, links, lists)
- The prompt you would give to an AI to implement the code
- Five additional tests for edge cases
- The prompt you would give to fix any expected failures
- Property-based tests for the converter
Exercise 22: Test Refactoring
The following test file has multiple problems: duplicated setup, poor naming, missing edge cases, no parametrization, and no fixtures. Refactor it into a well-structured test file.
def test1():
    calc = Calculator()
    assert calc.add(2, 3) == 5

def test2():
    calc = Calculator()
    assert calc.add(-1, 1) == 0

def test3():
    calc = Calculator()
    assert calc.subtract(10, 3) == 7

def test4():
    calc = Calculator()
    assert calc.multiply(4, 5) == 20

def test5():
    calc = Calculator()
    assert calc.divide(10, 2) == 5

def test6():
    calc = Calculator()
    try:
        calc.divide(1, 0)
        assert False
    except:
        assert True
Exercise 23: CI Pipeline Design
Design a comprehensive CI testing pipeline for a Python web application built with AI assistance. Your pipeline should include:
- A fast feedback stage (< 2 minutes)
- A thorough testing stage (< 10 minutes)
- A quality gate stage
- A deployment stage (only on main branch)
Write the complete GitHub Actions YAML configuration. Include linting (ruff), type checking (mypy), unit tests, integration tests, coverage reporting, and security scanning.
Exercise 24: Test Anti-Pattern Catalog
Create a catalog of at least 8 testing anti-patterns commonly seen in AI-generated test code. For each anti-pattern:
- Give it a descriptive name
- Show a code example
- Explain why it is problematic
- Show the corrected version
- Write a one-sentence rule to avoid it
Tier 5: Expert Challenges (Exercises 25–30)
Exercise 25: Mutation Testing Analysis
Using the mutmut library (or by manually creating mutations), analyze a small module of your choosing. Create at least 10 mutations by hand, run your test suite against each, and report:
- How many mutations were caught (killed)
- How many survived (tests still passed)
- What the surviving mutations reveal about test suite weaknesses
- New tests written to kill the surviving mutants
Exercise 26: Custom Hypothesis Strategy
Build a custom Hypothesis strategy that generates valid SQL SELECT statements. The strategy should produce queries with:
- Random but valid table names
- 1-5 column selections (or *)
- Optional WHERE clauses with valid conditions
- Optional ORDER BY clauses
- Optional LIMIT clauses
Use this strategy to test an AI-generated SQL parser.
Exercise 27: Concurrent Test Verification
Write tests that verify the thread-safety of an AI-generated cache implementation. Your tests should:
- Run multiple threads simultaneously reading and writing to the cache
- Verify that no data is lost or corrupted
- Test behavior under high contention
- Verify that cache eviction works correctly under concurrent access
- Use threading.Barrier or threading.Event to synchronize test threads
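A minimal sketch of the synchronization technique, using a plain dict plus a lock as a stand-in cache (the names here are assumptions, not the chapter's cache API):

```python
import threading

cache = {}
lock = threading.Lock()
N = 8
# A Barrier releases all workers at (nearly) the same instant,
# maximizing contention on the shared structure.
barrier = threading.Barrier(N)

def worker(i):
    barrier.wait()  # block until all N threads have arrived
    with lock:
        cache[i] = i * i

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert len(cache) == N  # no writes were lost under contention
```

For a real cache you would also assert on eviction order and repeat the run many times, since race conditions are probabilistic.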
Exercise 28: Test Generation Prompt Engineering
Design a set of 5 progressively detailed prompts for asking AI to generate tests for a REST API with CRUD operations. Start with a minimal prompt and end with a comprehensive one. For each prompt:
- Write the prompt
- Predict what the AI will generate
- Identify gaps in the predicted output
- Explain how the next prompt addresses those gaps
Evaluate the quality difference between the minimal and comprehensive prompts.
Exercise 29: Full Application Test Harness
Build a complete test harness for a command-line note-taking application that includes:
- Unit tests for all business logic (at least 20 tests)
- Integration tests for file I/O and data persistence (at least 8 tests)
- E2E tests using Click's CliRunner (at least 5 tests)
- Property-based tests using Hypothesis (at least 5 tests)
- A conftest.py with shared fixtures
- A pytest.ini or pyproject.toml with test configuration
- A coverage configuration targeting 90%+
Write all the test code. The application should support: creating notes, listing notes, searching notes, tagging notes, and exporting notes to JSON.
Exercise 30: Testing Strategy Document
Write a testing strategy document (1000+ words) for a team of developers who are adopting vibe coding practices. The document should cover:
- Why testing practices must change when using AI for code generation
- Recommended test types and their proportions
- Which tests should humans write vs. which can be AI-generated
- Required coverage levels and quality gates
- Code review checklist for AI-generated tests
- Process for integrating property-based testing
- CI/CD pipeline requirements
- Training recommendations for the team