Chapter 21 Quiz: AI-Assisted Testing Strategies

Test your understanding of AI-assisted testing concepts, pytest, property-based testing, and test suite design. Each question has one best answer unless otherwise stated.


Question 1

Why does AI-generated code require more testing than manually written code?

A) AI-generated code is always buggier than human code
B) You lack the mental model of design decisions that comes from writing code yourself
C) AI cannot generate syntactically correct code
D) Testing frameworks do not support AI-generated code

Answer **B) You lack the mental model of design decisions that comes from writing code yourself.** When you write code by hand, you understand every decision, trade-off, and edge case you considered. With AI-generated code, you receive a finished product without that internal understanding, creating a "trust gap" that testing helps bridge.

Question 2

Which of the following is NOT a common failure mode of AI-generated code?

A) Happy-path bias
B) Inconsistent error handling
C) Syntax errors in every function
D) Plausible but incorrect logic

Answer **C) Syntax errors in every function.** AI-generated code is typically syntactically correct. The more insidious problems are logical errors, incomplete edge case handling, inconsistent error patterns, and code that looks correct but subtly is not.

Question 3

What is the primary advantage of pytest over Python's built-in unittest module?

A) pytest can only run on Linux
B) pytest uses simpler syntax with plain `assert` statements and powerful fixtures
C) unittest does not support test discovery
D) pytest generates code coverage reports automatically

Answer **B) pytest uses simpler syntax with plain `assert` statements and powerful fixtures.** pytest eliminates the need for `TestCase` classes, special assertion methods, and `setUp`/`tearDown` methods. Its fixture system is more flexible, and its plugin ecosystem is extensive.
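A side-by-side sketch of the two styles for the same trivial check:

```python
import unittest

# unittest style: a TestCase subclass and special assertion methods
class TestAddUnittest(unittest.TestCase):
    def test_add(self):
        self.assertEqual(1 + 2, 3)

# pytest style: a plain function and a plain assert statement
def test_add():
    assert 1 + 2 == 3
```

On failure, pytest rewrites the `assert` to show the values of both sides, so the plain statement loses none of the diagnostic detail of `assertEqual`.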

Question 4

What does the `yield` keyword do in a pytest fixture?

A) It marks the fixture as deprecated
B) It separates the setup code (before `yield`) from the teardown code (after `yield`)
C) It makes the fixture run faster
D) It allows the fixture to return multiple values

Answer **B) It separates the setup code (before yield) from the teardown code (after yield).** Code before `yield` runs before the test; the yielded value is provided to the test function; code after `yield` runs after the test completes, even if the test fails. This ensures proper resource cleanup.
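A minimal sketch of the setup/teardown split; the dict here is a stand-in for a real resource such as a file handle or connection:

```python
import pytest

@pytest.fixture
def temp_record():
    record = {"status": "open"}   # setup: runs before the test body
    yield record                  # the test function receives this value
    record["status"] = "closed"   # teardown: runs after the test, even on failure

def test_record_is_open(temp_record):
    assert temp_record["status"] == "open"
```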

Question 5

Which fixture scope creates the fixture once and reuses it for all tests in the entire test session?

A) `scope="function"`
B) `scope="module"`
C) `scope="class"`
D) `scope="session"`

Answer **D) `scope="session"`.** Session scope creates the fixture once for the entire test run. This is useful for expensive resources like database connections or loading large datasets.
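A sketch under the assumption that building the data is expensive; the `dataset` name and contents are illustrative:

```python
import pytest

# Hypothetical expensive resource: created once, shared by every test in the run
@pytest.fixture(scope="session")
def dataset():
    return {"rows": list(range(1000))}  # imagine loading a large file here

def test_row_count(dataset):
    assert len(dataset["rows"]) == 1000

def test_first_row(dataset):
    assert dataset["rows"][0] == 0
```

Both tests receive the same object; with the default `scope="function"`, the fixture body would run once per test instead.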

Question 6

What is the purpose of `@pytest.mark.parametrize`?

A) To skip tests that are slow
B) To run the same test function with different input values and expected results
C) To mock external dependencies
D) To measure test coverage

Answer **B) To run the same test function with different input values and expected results.** Parametrization eliminates the need to write separate test functions for each input case, reducing duplication while increasing test coverage across multiple scenarios.
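A short illustration; `is_even` is a made-up function under test:

```python
import pytest

def is_even(n: int) -> bool:
    return n % 2 == 0

# One test function, four cases: pytest runs it once per tuple
@pytest.mark.parametrize(
    "value, expected",
    [(0, True), (2, True), (3, False), (-4, True)],
)
def test_is_even(value, expected):
    assert is_even(value) is expected
```

Each tuple appears as a separate test in the report, so a failing case is pinpointed without hunting through a loop.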

Question 7

In the testing pyramid, which layer should have the most tests?

A) End-to-end tests
B) Integration tests
C) Unit tests
D) All layers should have equal numbers of tests

Answer **C) Unit tests.** The testing pyramid recommends that unit tests (70-80%) form the base, integration tests (15-25%) the middle, and E2E tests (5-10%) the top. Unit tests are fast, focused, and cheap to maintain.

Question 8

What is the key difference between a stub and a mock?

A) Stubs are slower than mocks
B) Stubs return predetermined values; mocks verify how they were called
C) Mocks are only used in integration tests
D) Stubs can only return None

Answer **B) Stubs return predetermined values; mocks verify how they were called.** Stubs provide controlled return values to keep the test running. Mocks add behavior verification: you can assert that specific methods were called with specific arguments a specific number of times.
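`unittest.mock.Mock` can play both roles in one test; `checkout` and the `gateway` dependency below are hypothetical:

```python
from unittest.mock import Mock

def checkout(gateway, amount):
    receipt = gateway.charge(amount)
    return receipt["id"]

gateway = Mock()

# As a stub: return a canned value so the code under test can proceed
gateway.charge.return_value = {"id": "r-1"}
assert checkout(gateway, 50) == "r-1"

# As a mock: additionally verify *how* the dependency was called
gateway.charge.assert_called_once_with(50)
```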

Question 9

When using `unittest.mock.patch`, what path should you mock?

A) The path where the object is originally defined
B) The path where the object is imported and used
C) Always use the full standard library path
D) The path does not matter; patch finds the object automatically

Answer **B) The path where the object is imported and used.** If `myapp/services.py` does `from requests import get`, you should mock `myapp.services.get`, not `requests.get`. This is because `patch` replaces the object at the location where it is looked up.
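To keep the example self-contained, the sketch below builds a throwaway `services_demo` module at runtime (standing in for `myapp/services.py`) that does `from json import loads`, then patches the name where that module looks it up:

```python
import sys
import types
from unittest.mock import patch

# Stand-in for myapp/services.py, created at runtime so the example is runnable
services = types.ModuleType("services_demo")
exec(
    "from json import loads\n"
    "def parse(payload):\n"
    "    return loads(payload)\n",
    services.__dict__,
)
sys.modules["services_demo"] = services  # register so patch() can resolve the path

# Correct: patch where the name is looked up ("services_demo.loads"),
# not where it was originally defined ("json.loads")
with patch("services_demo.loads", return_value={"ok": True}):
    assert services.parse("ignored") == {"ok": True}

# Once the patch exits, the real json.loads is restored
assert services.parse('{"n": 1}') == {"n": 1}
```

Patching `json.loads` instead would leave the copy already bound inside `services_demo` untouched, and the test would silently call the real function.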

Question 10

What does property-based testing verify?

A) That specific inputs produce specific expected outputs
B) That invariant properties hold true across many randomly generated inputs
C) That the code follows coding style guidelines
D) That external APIs are available

Answer **B) That invariant properties hold true across many randomly generated inputs.** Instead of testing specific examples, property-based testing defines properties (e.g., "sorted output is always in ascending order") and verifies them against hundreds or thousands of randomly generated inputs.

Question 11

Which Hypothesis strategy would generate a list of at least one integer?

A) `st.lists(st.integers())`
B) `st.lists(st.integers(), min_size=1)`
C) `st.integers().list()`
D) `st.nonempty(st.lists(st.integers()))`

Answer **B) `st.lists(st.integers(), min_size=1)`.** The `min_size` parameter ensures the generated list contains at least one element. Without it, Hypothesis may generate empty lists.
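Assuming the `hypothesis` package is installed, that strategy can back a small property test; `test_max_is_a_member` is an illustrative name:

```python
from hypothesis import given, strategies as st

# min_size=1 guarantees a non-empty list, so max() below cannot raise ValueError
@given(st.lists(st.integers(), min_size=1))
def test_max_is_a_member(xs):
    assert max(xs) in xs

test_max_is_a_member()  # @given-wrapped tests are plain callables
```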

Question 12

Which of the following is an example of a "round-trip" property test?

A) Testing that sorting a list does not change its length
B) Testing that encoding and then decoding data returns the original
C) Testing that a function returns the correct type
D) Testing that error messages are descriptive

Answer **B) Testing that encoding and then decoding data returns the original.** Round-trip (or "there and back") properties verify that applying an operation and then its inverse returns you to the starting point: `decode(encode(x)) == x`.
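A runnable instance using the standard library's Base64 codec (assuming `hypothesis` is installed):

```python
import base64
from hypothesis import given, strategies as st

@given(st.binary())
def test_base64_round_trip(data):
    # decode(encode(x)) == x must hold for arbitrary bytes
    assert base64.b64decode(base64.b64encode(data)) == data

test_base64_round_trip()
```

The same shape works for any encode/decode, serialize/deserialize, or save/load pair.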

Question 13

In the TDD-AI workflow, who writes the tests and who writes the implementation?

A) AI writes both tests and implementation
B) The developer writes both tests and implementation
C) The developer writes tests; AI writes the implementation
D) AI writes tests; the developer writes the implementation

Answer **C) The developer writes tests; AI writes the implementation.** This workflow puts the developer in control of the specification (via tests) while leveraging AI's speed for implementation. The developer owns what the code should do; the AI handles how.

Question 14

What does `pytest.approx` do?

A) It skips tests that are approximately correct
B) It allows approximate floating-point comparisons with a configurable tolerance
C) It rounds all test values to the nearest integer
D) It estimates how long a test will take to run

Answer **B) It allows approximate floating-point comparisons with a configurable tolerance.** Due to floating-point precision issues, exact equality comparisons with floats can fail unexpectedly. `pytest.approx(expected)` allows a small tolerance (default 1e-6) for comparisons.
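A quick demonstration with the classic `0.1 + 0.2` example:

```python
import pytest

total = 0.1 + 0.2
assert total != 0.3                    # exact comparison fails: 0.30000000000000004
assert total == pytest.approx(0.3)     # passes: default relative tolerance of 1e-6
assert 101.0 == pytest.approx(100.0, rel=0.05)  # custom 5% tolerance
```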

Question 15

What is line coverage?

A) The percentage of lines of code reviewed by a human
B) The percentage of source code lines executed during testing
C) The number of tests per line of code
D) The percentage of lines with comments

Answer **B) The percentage of source code lines executed during testing.** Line coverage measures how many source code lines were executed by at least one test, expressed as a percentage of total lines.

Question 16

What is a key limitation of code coverage as a quality metric?

A) Coverage reports are always inaccurate
B) High coverage does not guarantee that assertions are meaningful or correct
C) Coverage cannot be measured for Python code
D) Coverage tools slow down tests by a factor of 10

Answer **B) High coverage does not guarantee that assertions are meaningful or correct.** A test can execute every line of code (100% coverage) while making weak or incorrect assertions. Coverage tells you what code was *run*, not whether it was *correctly verified*.

Question 17

What is mutation testing?

A) Testing code with randomly modified inputs
B) Introducing small changes to source code and checking if tests detect them
C) Testing that code handles genetic algorithm operations
D) Rewriting tests to use different assertion methods

Answer **B) Introducing small changes to source code and checking if tests detect them.** Mutation testing modifies the source code (e.g., changing `>` to `<`, `+` to `-`) and checks whether any test fails. If a mutation is not caught ("survives"), it indicates a weakness in the test suite.
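Tools such as mutmut or Cosmic Ray automate this; the hand-made mutant below is only a sketch of the idea:

```python
# Original code under test
def is_adult(age: int) -> bool:
    return age >= 18

# Hand-made mutant: a mutation tool would flip >= to >
def is_adult_mutant(age: int) -> bool:
    return age > 18

# A weak test lets the mutant survive: both versions pass at age 30
assert is_adult(30) is True
assert is_adult_mutant(30) is True    # mutant survives -> weakness in the suite

# A boundary test kills the mutant: only the original passes at age == 18
assert is_adult(18) is True
assert is_adult_mutant(18) is False   # mutation detected
```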

Question 18

What is the purpose of `conftest.py` in pytest?

A) To configure the Python virtual environment
B) To share fixtures and hooks across multiple test files in a directory
C) To define which tests should be skipped
D) To generate coverage reports

Answer **B) To share fixtures and hooks across multiple test files in a directory.** Fixtures defined in `conftest.py` are automatically available to all test files in its directory and subdirectories, without needing to import them.

Question 19

Which of the following is an anti-pattern when mocking in tests?

A) Mocking external API calls
B) Mocking database connections in unit tests
C) Mocking internal functions so heavily that the test only verifies mock behavior
D) Using `spec=True` on mock objects

Answer **C) Mocking internal functions so heavily that the test only verifies mock behavior.** Over-mocking replaces the actual code under test with predetermined return values, meaning the test no longer verifies real behavior. Mock at boundaries (external services, I/O), not internal logic.

Question 20

In a CI pipeline, what is the purpose of the `-x` flag when running pytest?

A) It enables XML output
B) It stops execution on the first test failure for fast feedback
C) It excludes slow tests
D) It enables extra verbose output

Answer **B) It stops execution on the first test failure for fast feedback.** The `-x` (or `--exitfirst`) flag makes pytest stop after the first failure, which is useful during development and in fast-feedback CI stages.

Question 21

When prompting AI to generate tests, which approach produces the best results?

A) "Write some tests for my code" B) "Write pytest tests with fixtures and parametrize, covering happy path, edge cases (empty input, None, negative numbers), and error cases. Use type hints and docstrings." C) "Test everything" D) "Generate 100 test cases"

Answer **B) "Write pytest tests with fixtures and parametrize, covering happy path, edge cases (empty input, None, negative numbers), and error cases. Use type hints and docstrings."** Specific, structured prompts that name the testing framework, techniques, edge cases to cover, and code style produce far better test code than vague or overly broad instructions.

Question 22

What does the `@pytest.mark.xfail` marker indicate?

A) The test should not be run
B) The test is expected to fail (a known issue), and a failure should not cause the suite to fail
C) The test must complete in under one second
D) The test requires network access

Answer **B) The test is expected to fail (a known issue), and a failure should not cause the suite to fail.** `xfail` marks a test as expected to fail. If it fails, it is reported as "xfail" (not a failure). If it unexpectedly passes, it is reported as "xpass."
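A sketch using a genuine float-representation quirk as the "known issue":

```python
import pytest

# 2.675 is stored as 2.67499..., so round(2.675, 2) gives 2.67, not 2.68.
# Marking xfail documents the bug without breaking the suite.
@pytest.mark.xfail(reason="float representation: round(2.675, 2) == 2.67")
def test_round_half_up():
    assert round(2.675, 2) == 2.68
```

Adding `strict=True` makes an unexpected pass (xpass) count as a failure, which is useful for noticing when the underlying bug gets fixed.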

Question 23

Which property type tests that performing an operation twice produces the same result as performing it once?

A) Commutativity
B) Associativity
C) Idempotency
D) Reflexivity

Answer **C) Idempotency.** An idempotent operation produces the same result whether applied once or multiple times: `f(f(x)) == f(x)`. Common examples include normalizing text, deduplicating a list, or taking an absolute value.
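A minimal illustration with a made-up `normalize` function:

```python
def normalize(text: str) -> str:
    # Lowercase and collapse runs of whitespace
    return " ".join(text.lower().split())

s = "  Hello   WORLD  "
assert normalize(s) == "hello world"
assert normalize(normalize(s)) == normalize(s)  # f(f(x)) == f(x)
```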

Question 24

What is the recommended approach for testing AI-generated code that interacts with a database?

A) Use the production database
B) Skip database tests entirely
C) Use an in-memory database (e.g., SQLite `:memory:`) or a dedicated test database
D) Only test database code manually

Answer **C) Use an in-memory database (e.g., SQLite `:memory:`) or a dedicated test database.** In-memory databases are fast, isolated, and disposable. They provide realistic database behavior without affecting production data or requiring external infrastructure.
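A sketch combining such a database with a yield fixture; the `users` schema is illustrative:

```python
import sqlite3
import pytest

@pytest.fixture
def db():
    conn = sqlite3.connect(":memory:")  # fast, isolated, gone after the test
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    yield conn
    conn.close()

def test_insert_user(db):
    db.execute("INSERT INTO users (name) VALUES (?)", ("ada",))
    assert db.execute("SELECT COUNT(*) FROM users").fetchone()[0] == 1
```

Each test gets a fresh schema, so tests cannot leak state into one another or into production data.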

Question 25

According to the chapter, what is the most important mindset shift for testing in the age of AI-assisted development?

A) Trust AI-generated code completely because AI knows best
B) Avoid using AI for code generation due to unreliability
C) Your tests are the specification, and AI-generated code is the implementation that must conform to it
D) Focus exclusively on E2E tests since unit tests are redundant with AI

Answer **C) Your tests are the specification, and AI-generated code is the implementation that must conform to it.** This principle puts the developer in control of *what* the code should do while using AI for *how* it does it. Tests define the contract; AI code must fulfill it.