Chapter 21 Exercises: AI-Assisted Testing Strategies

Tier 1: Recall and Understanding (Exercises 1-6)

Exercise 1: Testing Vocabulary

Define each of the following testing terms in one or two sentences. Provide one concrete example for each.

  • Unit test
  • Integration test
  • End-to-end test
  • Test fixture
  • Mock object
  • Test coverage
  • Property-based test

Exercise 2: The Testing Pyramid

Draw or describe the testing pyramid. For each layer, explain (a) what it tests, (b) relative speed, (c) relative quantity, and (d) one example from a to-do list application.

Exercise 3: AI Code Failure Modes

List five common failure modes of AI-generated code discussed in this chapter. For each, write one sentence explaining why testing helps catch it.

Exercise 4: pytest Basics

Explain the difference between the two pytest concepts in each pair below. For each concept, write a one-sentence description and a two-line code example.

  • @pytest.fixture vs. setUp method
  • @pytest.mark.parametrize vs. writing separate test functions
  • @pytest.mark.skip vs. @pytest.mark.xfail

Exercise 5: Fixture Scopes

Explain what each of the following pytest fixture scopes means and give a use case for each:

  • scope="function"
  • scope="class"
  • scope="module"
  • scope="session"

Exercise 6: Coverage Interpretation

Given the following coverage report, answer the questions below.

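Name                    Stmts   Miss  Cover   Missing
----------------------------------------------------------------
myapp/auth.py              50     15    70%   23-30, 45-51
myapp/models.py            80      4    95%   67-70
myapp/utils.py             30      0   100%
myapp/api.py              100     25    75%   34-45, 78-90, 95-98
----------------------------------------------------------------
TOTAL                     260     44    83%

  1. Which file has the most untested code?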
  2. Which file has the highest coverage?
  3. If auth.py handles password validation and api.py handles user input, which should you prioritize for additional testing? Why?
  4. Is 83% overall coverage sufficient? Under what circumstances would you want higher coverage?
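The Cover column can be reproduced from the Stmts and Miss columns, which is a useful sanity check when reading a report. The sketch below assumes coverage is computed as (Stmts - Miss) / Stmts and rounded to a whole percent; `REPORT` and `coverage_percent` are illustrative names, not part of any tool.

```python
# Statement and miss counts copied from the report above.
REPORT = {
    "myapp/auth.py": (50, 15),
    "myapp/models.py": (80, 4),
    "myapp/utils.py": (30, 0),
    "myapp/api.py": (100, 25),
}


def coverage_percent(stmts: int, miss: int) -> int:
    """Return covered statements as a whole-number percentage."""
    return round(100 * (stmts - miss) / stmts)


per_file = {name: coverage_percent(s, m) for name, (s, m) in REPORT.items()}
total_stmts = sum(s for s, _ in REPORT.values())
total_miss = sum(m for _, m in REPORT.values())
total = coverage_percent(total_stmts, total_miss)

for name, pct in per_file.items():
    print(f"{name:<20}{pct:>4}%")
print(f"{'TOTAL':<20}{total:>4}%")
```

Keep in mind that the Miss column (an absolute count) and the Cover column (a ratio) can rank files differently.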

Tier 2: Application (Exercises 7-12)

Exercise 7: Writing pytest Tests

Write pytest tests for the following function. Include at least five test cases covering normal operation, edge cases, and error handling.

def fizzbuzz(n: int) -> str:
    """Return 'Fizz' for multiples of 3, 'Buzz' for multiples of 5,
    'FizzBuzz' for multiples of both, or the number as a string."""
    if not isinstance(n, int):
        raise TypeError("Input must be an integer")
    if n <= 0:
        raise ValueError("Input must be a positive integer")
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)

Exercise 8: Parametrized Tests

Rewrite the following repetitive tests using @pytest.mark.parametrize:

def test_celsius_to_fahrenheit_freezing():
    assert celsius_to_fahrenheit(0) == 32.0

def test_celsius_to_fahrenheit_boiling():
    assert celsius_to_fahrenheit(100) == 212.0

def test_celsius_to_fahrenheit_body_temp():
    assert celsius_to_fahrenheit(37) == pytest.approx(98.6)

def test_celsius_to_fahrenheit_negative():
    assert celsius_to_fahrenheit(-40) == -40.0

def test_celsius_to_fahrenheit_absolute_zero():
    assert celsius_to_fahrenheit(-273.15) == pytest.approx(-459.67)
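For reference, the general shape of a parametrized test looks like the sketch below. It uses a hypothetical `square` function rather than `celsius_to_fahrenheit`, so the conversion exercise is still yours to do. The `pytest.param(..., id=...)` wrapper is optional and only gives each case a readable name in the test report.

```python
import pytest


def square(x: float) -> float:
    """Hypothetical function under test (not part of the exercise)."""
    return x * x


@pytest.mark.parametrize(
    ("value", "expected"),
    [
        pytest.param(0, 0, id="zero"),
        pytest.param(3, 9, id="positive"),
        pytest.param(-4, 16, id="negative"),
        # Use pytest.approx when the expected value is a float.
        pytest.param(0.1, pytest.approx(0.01), id="fraction"),
    ],
)
def test_square(value, expected):
    assert square(value) == expected
```

When run under pytest, each entry in the list is collected and reported as its own test case.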

Exercise 9: Writing Fixtures

Create a conftest.py file with fixtures for testing a simple blog application. Your fixtures should provide:

  1. A sample blog post (dictionary with title, content, author, date)
  2. A list of three blog posts
  3. A mock database connection that yields a connection and cleans up afterward
  4. A configured test client (assume a Flask app)

Exercise 10: Mocking External Services

The following function calls an external weather API. Write a test that mocks the API call and verifies the function's behavior.

import requests

def get_temperature(city: str) -> float:
    """Fetch the current temperature for a city in Fahrenheit."""
    response = requests.get(
        "https://api.weather.example.com/current",
        params={"city": city, "units": "fahrenheit"}
    )
    response.raise_for_status()
    data = response.json()
    return data["temperature"]

Write tests for: (a) a successful API call, (b) a city that returns a 404 error, and (c) a network timeout.
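One way to stand in for the HTTP response is a `unittest.mock.Mock` configured to behave like the object that `requests.get` returns. The helper below is only a sketch: `fake_response` is a hypothetical name, and `RuntimeError` is a placeholder for whatever exception the real library would raise. In a test, you would typically pass such an object to `unittest.mock.patch` as the `return_value` for `requests.get` in the module under test.

```python
from unittest.mock import Mock


def fake_response(payload=None, error=None):
    """Build a stand-in for an HTTP response object (hypothetical helper)."""
    response = Mock()
    # .json() returns the canned payload instead of parsing a real body.
    response.json.return_value = payload
    if error is not None:
        # .raise_for_status() raises, simulating an error status code.
        response.raise_for_status.side_effect = error
    return response


ok = fake_response(payload={"temperature": 72.5})
ok.raise_for_status()  # no side_effect configured, so this does nothing
print(ok.json()["temperature"])

failing = fake_response(error=RuntimeError("404 Not Found"))
try:
    failing.raise_for_status()
except RuntimeError as exc:
    print(f"raised: {exc}")
```

The same `side_effect` mechanism can be set on the patched function itself, which is useful when the call should fail before any response exists.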

Exercise 11: Basic Property-Based Test

Write Hypothesis property-based tests for the following function. Identify at least three properties.

def reverse_string(s: str) -> str:
    """Reverse a string."""
    return s[::-1]

Exercise 12: TDD Cycle

Practice the TDD cycle. Write tests first for a function called validate_email(email: str) -> bool that should:

  1. Return True for valid email addresses
  2. Return False for strings without an @ symbol
  3. Return False for strings without a domain after @
  4. Return False for empty strings
  5. Return False for strings with spaces

Write only the tests (at least 8 test cases). Do not write the implementation.


Tier 3: Analysis (Exercises 13-18)

Exercise 13: Test Quality Analysis

Analyze the following tests and explain what is wrong with each one. Rewrite each test to be more effective.

def test_1():
    result = calculate_total([10, 20, 30])
    assert result

def test_user_creation():
    user = create_user("alice", "alice@example.com")
    assert user is not None
    assert user

def test_everything():
    # Test creation
    item = create_item("Widget", 9.99)
    assert item.name == "Widget"
    # Test update
    item.update_price(19.99)
    assert item.price == 19.99
    # Test deletion
    delete_item(item.id)
    assert get_item(item.id) is None

Exercise 14: Mock Overuse Diagnosis

The following test uses extensive mocking. Identify which mocks are appropriate and which are excessive. Explain your reasoning and rewrite the test with appropriate mocking.

@patch("myapp.services.validate_input")
@patch("myapp.services.calculate_tax")
@patch("myapp.services.format_currency")
@patch("myapp.services.send_receipt_email")
@patch("myapp.services.log_transaction")
def test_process_payment(mock_log, mock_email, mock_format,
                         mock_tax, mock_validate):
    mock_validate.return_value = True
    mock_tax.return_value = 8.50
    mock_format.return_value = "$108.50"
    mock_email.return_value = True
    mock_log.return_value = None

    result = process_payment(amount=100.0, card="4111111111111111")

    assert result["success"] is True
    mock_validate.assert_called_once()
    mock_tax.assert_called_once_with(100.0)
    mock_format.assert_called_once_with(108.50)
    mock_email.assert_called_once()
    mock_log.assert_called_once()

Exercise 15: Coverage Gap Analysis

The function below has 95% line coverage, but you suspect the tests are still inadequate because the uncovered lines are all in an error-handling block. Write a set of tests that covers the error handling, and explain why error-handling coverage is particularly important for AI-generated code.

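import json


class ConfigError(Exception):
    """Raised when a configuration file cannot be loaded or is incomplete."""


def parse_config(filepath: str) -> dict: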
    """Parse a JSON configuration file."""
    try:
        with open(filepath, "r") as f:
            data = json.load(f)
    except FileNotFoundError:
        raise ConfigError(f"Config file not found: {filepath}")
    except json.JSONDecodeError as e:
        raise ConfigError(f"Invalid JSON in {filepath}: {e}")

    required_keys = ["database_url", "secret_key", "debug"]
    missing = [k for k in required_keys if k not in data]
    if missing:
        raise ConfigError(f"Missing required keys: {missing}")

    return data
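Error paths are usually tested with `pytest.raises`, whose optional `match` argument checks the exception message against a regular expression. The sketch below applies it to a hypothetical `parse_port` function rather than `parse_config`, so the exercise itself is left open.

```python
import pytest


def parse_port(text: str) -> int:
    """Hypothetical function with two error paths (not part of the exercise)."""
    try:
        port = int(text)
    except ValueError:
        raise ValueError(f"Port is not a number: {text!r}") from None
    if not 0 < port < 65536:
        raise ValueError(f"Port out of range: {port}")
    return port


def test_rejects_non_numeric_port():
    # The test fails if no ValueError is raised or the message differs.
    with pytest.raises(ValueError, match="not a number"):
        parse_port("http")


def test_rejects_out_of_range_port():
    with pytest.raises(ValueError, match="out of range"):
        parse_port("70000")
```

Checking the message as well as the exception type distinguishes two error paths that raise the same exception class.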

Exercise 16: Hypothesis Strategy Design

Design Hypothesis strategies for the following data types. Write a composite strategy for each.

  1. A valid US phone number (format: (XXX) XXX-XXXX)
  2. A valid RGB color (three integers 0-255)
  3. A valid product (name: 1-100 chars, price: 0.01-99999.99, quantity: 0-10000)
  4. A valid date range (start date before end date, both within the last 10 years)
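The `@st.composite` decorator lets a strategy draw values from other strategies and combine them. The sketch below builds a hypothetical 24-hour clock time in `HH:MM` format, which is deliberately not one of the four data types above; `clock_times` and the test name are illustrative.

```python
import re

from hypothesis import given, strategies as st


@st.composite
def clock_times(draw):
    """Generate strings such as '07:45' (hypothetical example strategy)."""
    hour = draw(st.integers(min_value=0, max_value=23))
    minute = draw(st.integers(min_value=0, max_value=59))
    return f"{hour:02d}:{minute:02d}"


@given(clock_times())
def test_clock_time_is_well_formed(value):
    # Every generated value should match the HH:MM pattern.
    assert re.fullmatch(r"([01]\d|2[0-3]):[0-5]\d", value)
```

Constraining each drawn value at its source, as done here with `min_value` and `max_value`, is generally cheaper than generating arbitrary values and filtering out the invalid ones.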

Exercise 17: Integration Test Design

Design integration tests for a user registration system that involves these components:

  • UserValidator: Validates username and email format
  • PasswordHasher: Hashes passwords with bcrypt
  • UserRepository: Stores users in a database
  • EmailService: Sends welcome emails
  • RegistrationService: Orchestrates the registration process

Write at least five integration tests that verify interactions between these components. Specify which components should be real and which should be mocked in each test.

Exercise 18: Test Failure Diagnosis

The following test passes locally but fails in CI. Analyze three possible causes and suggest fixes for each.

import os

def test_log_file_creation():
    logger = AppLogger(log_dir="/tmp/app_logs")
    logger.info("Test message")

    log_files = os.listdir("/tmp/app_logs")
    assert len(log_files) == 1

    with open(f"/tmp/app_logs/{log_files[0]}") as f:
        content = f.read()
    assert "Test message" in content

Tier 4: Synthesis and Evaluation (Exercises 19-24)

Exercise 19: Complete Test Suite Design

Design a complete test suite for an AI-generated library management system with these features:

  • Add, remove, and search for books
  • Check out and return books
  • Track overdue books
  • Generate reports (most popular books, active borrowers)
  • Send overdue notifications

Create a test plan document that includes:

  1. Test categories (unit, integration, E2E, property-based)
  2. At least 15 specific test cases organized by category
  3. Fixtures needed
  4. Mocking strategy
  5. Coverage targets per module
  6. CI pipeline configuration

Exercise 20: Property-Based Test Suite

Write a comprehensive Hypothesis property-based test suite for an AI-generated shopping cart module with these functions:

def add_item(cart: Cart, item: Item, quantity: int) -> Cart: ...
def remove_item(cart: Cart, item_id: str) -> Cart: ...
def calculate_total(cart: Cart) -> Decimal: ...
def apply_discount(cart: Cart, code: str) -> Cart: ...
def checkout(cart: Cart, payment: Payment) -> Order: ...

Write at least 8 property-based tests covering:

  • Invariants (total is always non-negative, item count matches)
  • Idempotency (removing a non-existent item changes nothing)
  • Round-trip properties (add then remove returns original cart)
  • Commutativity (adding items in any order gives same total)

Exercise 21: TDD with AI Simulation

Simulate a complete TDD-AI workflow for building a Markdown-to-HTML converter. Write:

  1. An initial set of 5 tests for basic Markdown features (headings, bold, italic, links, lists)
  2. The prompt you would give to an AI to implement the code
  3. Five additional tests for edge cases
  4. The prompt you would give to fix any expected failures
  5. Property-based tests for the converter

Exercise 22: Test Refactoring

The following test file has multiple problems: duplicated setup, poor naming, missing edge cases, no parametrization, and no fixtures. Refactor it into a well-structured test file.

def test1():
    calc = Calculator()
    assert calc.add(2, 3) == 5

def test2():
    calc = Calculator()
    assert calc.add(-1, 1) == 0

def test3():
    calc = Calculator()
    assert calc.subtract(10, 3) == 7

def test4():
    calc = Calculator()
    assert calc.multiply(4, 5) == 20

def test5():
    calc = Calculator()
    assert calc.divide(10, 2) == 5

def test6():
    calc = Calculator()
    try:
        calc.divide(1, 0)
        assert False
    except:
        assert True

Exercise 23: CI Pipeline Design

Design a comprehensive CI testing pipeline for a Python web application built with AI assistance. Your pipeline should include:

  1. A fast feedback stage (< 2 minutes)
  2. A thorough testing stage (< 10 minutes)
  3. A quality gate stage
  4. A deployment stage (only on main branch)

Write the complete GitHub Actions YAML configuration. Include linting (ruff), type checking (mypy), unit tests, integration tests, coverage reporting, and security scanning.

Exercise 24: Test Anti-Pattern Catalog

Create a catalog of at least 8 testing anti-patterns commonly seen in AI-generated test code. For each anti-pattern:

  1. Give it a descriptive name
  2. Show a code example
  3. Explain why it is problematic
  4. Show the corrected version
  5. Write a one-sentence rule to avoid it

Tier 5: Expert Challenges (Exercises 25-30)

Exercise 25: Mutation Testing Analysis

Analyze a small module of your choosing with mutation testing. Produce at least 10 mutations, either with the mutmut library or by editing the code by hand, run your test suite against each one, and report:

  1. How many mutations were caught (killed)
  2. How many survived (tests still passed)
  3. What the surviving mutations reveal about test suite weaknesses
  4. New tests written to kill the surviving mutants
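To make the terms "killed" and "survived" concrete, the sketch below applies one hand-made mutation to a hypothetical `is_adult` function and runs two small test suites against it. All names are illustrative, and the suites are plain functions rather than pytest tests so the outcome can be computed directly.

```python
def is_adult(age: int) -> bool:
    """Original code (hypothetical example)."""
    return age >= 18


def is_adult_mutant(age: int) -> bool:
    """Hand-made mutant: '>=' changed to '>'."""
    return age > 18


def weak_suite(fn) -> bool:
    """Return True if every check passes. No boundary value is tested."""
    return fn(30) is True and fn(5) is False


def strong_suite(fn) -> bool:
    """Same checks plus the boundary case at exactly 18."""
    return weak_suite(fn) and fn(18) is True


for name, suite in [("weak", weak_suite), ("strong", strong_suite)]:
    assert suite(is_adult), "the suite must pass on the original code"
    outcome = "survived" if suite(is_adult_mutant) else "killed"
    print(f"{name} suite: mutant {outcome}")
```

The weak suite lets the mutant survive because it never exercises the boundary, which is the kind of weakness item 3 asks you to describe.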

Exercise 26: Custom Hypothesis Strategy

Build a custom Hypothesis strategy that generates valid SQL SELECT statements. The strategy should produce queries with:

  • Random but valid table names
  • 1-5 column selections (or *)
  • Optional WHERE clauses with valid conditions
  • Optional ORDER BY clauses
  • Optional LIMIT clauses

Use this strategy to test an AI-generated SQL parser.

Exercise 27: Concurrent Test Verification

Write tests that verify the thread-safety of an AI-generated cache implementation. Your tests should:

  1. Run multiple threads simultaneously reading and writing to the cache
  2. Verify that no data is lost or corrupted
  3. Test behavior under high contention
  4. Verify that cache eviction works correctly under concurrent access
  5. Use threading.Barrier or threading.Event to synchronize test threads
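A `threading.Barrier` makes every worker wait until all of them are ready, which increases contention at the start of the test. The sketch below shows that harness around a hypothetical `LockedCounter`, used here as a simple stand-in for the cache; the class, the `hammer` helper, and the thread and operation counts are all assumptions for illustration.

```python
import threading


class LockedCounter:
    """Hypothetical shared object guarded by a lock (stand-in for a cache)."""

    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        with self._lock:
            self._value += 1

    @property
    def value(self) -> int:
        with self._lock:
            return self._value


def hammer(counter: LockedCounter, n_threads: int = 8, n_ops: int = 1000) -> int:
    """Run n_threads workers that start together and return the final count."""
    barrier = threading.Barrier(n_threads)

    def worker() -> None:
        barrier.wait()  # block until every worker thread is ready
        for _ in range(n_ops):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


def test_no_lost_updates():
    # With correct locking, no increment is lost under contention.
    assert hammer(LockedCounter()) == 8 * 1000
```

A passing run does not prove thread-safety, since race conditions are timing-dependent; repeating the test and raising the thread count improves the odds of exposing one.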

Exercise 28: Test Generation Prompt Engineering

Design a set of 5 progressively detailed prompts for asking AI to generate tests for a REST API with CRUD operations. Start with a minimal prompt and end with a comprehensive one. For each prompt:

  1. Write the prompt
  2. Predict what the AI will generate
  3. Identify gaps in the predicted output
  4. Explain how the next prompt addresses those gaps

Evaluate the quality difference between the minimal and comprehensive prompts.

Exercise 29: Full Application Test Harness

Build a complete test harness for a command-line note-taking application that includes:

  1. Unit tests for all business logic (at least 20 tests)
  2. Integration tests for file I/O and data persistence (at least 8 tests)
  3. E2E tests using Click's CliRunner (at least 5 tests)
  4. Property-based tests using Hypothesis (at least 5 tests)
  5. A conftest.py with shared fixtures
  6. A pytest.ini or pyproject.toml with test configuration
  7. A coverage configuration targeting 90%+

Write all the test code. The application should support: creating notes, listing notes, searching notes, tagging notes, and exporting notes to JSON.

Exercise 30: Testing Strategy Document

Write a testing strategy document (1000+ words) for a team of developers who are adopting vibe coding practices. The document should cover:

  1. Why testing practices must change when using AI for code generation
  2. Recommended test types and their proportions
  3. Which tests should humans write vs. which can be AI-generated
  4. Required coverage levels and quality gates
  5. Code review checklist for AI-generated tests
  6. Process for integrating property-based testing
  7. CI/CD pipeline requirements
  8. Training recommendations for the team