Chapter 8 Quiz: Subqueries and Common Table Expressions

Chapter 8 Quiz: Subqueries and Common Table Expressions

Test your understanding of subqueries, correlated subqueries, EXISTS, derived tables, and CTEs in DB2. Choose the best answer for each question.

Question 1

What is a scalar subquery?

A) A subquery that returns exactly one column but any number of rows B) A subquery that returns exactly one row and one column C) A subquery that uses only numeric data types D) A subquery that does not reference the outer query

Answer: B. A scalar subquery returns exactly one row and one column — a single value. If it returns more than one row, DB2 raises SQLCODE -811.

Question 2

What happens if a scalar subquery returns zero rows?

A) DB2 raises an error B) The result is 0 C) The result is NULL D) The result is an empty string

Answer: C. A scalar subquery that returns no rows evaluates to NULL.

Question 3

What is the key difference between a correlated and an uncorrelated subquery?

A) Correlated subqueries use CTEs; uncorrelated subqueries do not B) Correlated subqueries reference columns from the outer query; uncorrelated subqueries are self-contained C) Correlated subqueries can only appear in the WHERE clause D) Uncorrelated subqueries always run faster

Answer: B. A correlated subquery depends on the outer query by referencing its columns, requiring logical re-evaluation for each outer row. An uncorrelated subquery is self-contained and runs once.

Question 4

Given this query, what does the subquery logically compute?

SELECT * FROM ACCOUNT a
WHERE a.BALANCE > ALL (SELECT BALANCE FROM ACCOUNT WHERE ACCOUNT_TYPE = 'CHECKING');

A) Accounts with balance above the minimum checking balance B) Accounts with balance above the average checking balance C) Accounts with balance above every checking balance (i.e., above the maximum) D) Accounts with balance above any checking balance

Answer: C. > ALL means greater than every value in the set, which is equivalent to greater than the maximum.

Question 5

Why is NOT IN dangerous when the subquery result may contain NULL values?

A) NOT IN ignores NULL values entirely B) NULL in the subquery causes the NOT IN condition to evaluate to UNKNOWN for every outer row, returning zero results C) NOT IN converts NULL to 0, producing incorrect matches D) DB2 raises an error when NOT IN encounters NULL

Answer: B. When a NULL is in the subquery result, the NOT IN comparison produces UNKNOWN (because value <> NULL is UNKNOWN), and WHERE filters out UNKNOWN, so no rows are returned.

Question 6

Which alternative to NOT IN handles NULLs correctly?

A) NOT ANY B) NOT ALL C) NOT EXISTS D) EXCEPT

Answer: C. NOT EXISTS checks for the absence of matching rows and handles NULLs correctly because NULL values simply fail the equality check in the correlation condition.

Question 7

In an EXISTS subquery, does the SELECT list matter?

A) Yes — you must select the correlation column B) Yes — you must use SELECT * C) No — DB2 only checks whether any rows are returned, not what values they contain D) No — but you must use SELECT 1 specifically

Answer: C. DB2 ignores the SELECT list in an EXISTS subquery. It only checks row existence. SELECT 1 is a common convention, but any valid SELECT list works.

Question 8

What execution pattern does EXISTS implement in DB2's query plan?

A) Full outer join B) Cross join C) Semi-join D) Cartesian product

Answer: C. EXISTS implements a semi-join — the outer row is returned as soon as any match is found in the inner table, without duplication.

Question 9

What is a derived table?

A) A table created using CREATE TABLE AS B) A subquery in the FROM clause that acts as an in-line table C) A table that inherits structure from another table D) A CTE that has been materialized

Answer: B. A derived table is a subquery placed in the FROM clause, producing a result set that the outer query treats as a table.

Question 10

Which of the following is an advantage of CTEs over derived tables?

A) CTEs always run faster B) CTEs can be referenced multiple times in the same statement C) CTEs are stored permanently in the database D) CTEs can use indexes directly

Answer: B. A CTE can be referenced multiple times in the main query, avoiding the need to duplicate the subquery logic.

Question 11

How many CTEs can you define in a single WITH clause?

A) Exactly one B) Up to three C) Up to ten D) No fixed limit — as many as needed, separated by commas

Answer: D. There is no fixed limit on the number of CTEs in a WITH clause. Each CTE is separated by a comma.

Question 12

Can a CTE reference another CTE defined earlier in the same WITH clause?

A) No — CTEs cannot reference each other B) Yes — a CTE can reference any CTE defined before it C) Yes — but only if RECURSIVE is specified D) Yes — CTEs can reference CTEs defined before or after them

Answer: B. In a non-recursive WITH clause, each CTE can reference any CTE defined before it (but not those defined after it).

Question 13

What does CTE "materialization" mean in DB2?

A) The CTE is saved as a permanent table B) The CTE is created as a temporary index C) DB2 executes the CTE query and stores its result in a temporary work area D) The CTE is converted into a view

Answer: C. Materialization means DB2 executes the CTE's query, stores the result in a temp area, and subsequent references read from that stored result.

Question 14

When is DB2 most likely to materialize a CTE?

A) When the CTE is referenced only once B) When the CTE is referenced multiple times in the main query C) When the CTE contains a simple filter D) When the CTE is the last one in the WITH clause

Answer: B. DB2 is more likely to materialize a CTE when it is referenced multiple times, to avoid redundant computation.

Question 15

What is the default behavior of DB2's optimizer regarding CTE execution?

A) Always materialize B) Always inline (merge into the main query) C) Generally prefer to inline, materializing when the optimizer estimates it is cheaper D) Ask the user via a prompt

Answer: C. DB2 generally prefers inlining CTEs for maximum optimization flexibility, but may materialize when it estimates a benefit.

Question 16

Which of these is NOT a valid reason to prefer a CTE over a subquery?

A) The intermediate result is referenced multiple times B) The query has multiple levels of nesting C) You want the result stored permanently for other queries D) You want the query to read top-to-bottom

Answer: C. CTEs exist only for the duration of a single SQL statement. For permanent reuse, create a VIEW.

Question 17

What pattern is this query implementing?

SELECT c.CUSTOMER_ID, c.FIRST_NAME
FROM   CUSTOMER c
WHERE  NOT EXISTS (
    SELECT 1 FROM ACCOUNT a
    WHERE a.CUSTOMER_ID = c.CUSTOMER_ID
);

A) Semi-join B) Anti-semi-join C) Full outer join D) Self-join

Answer: B. NOT EXISTS implements an anti-semi-join — returning outer rows that have no match in the inner table.

Question 18

Examine this query:

WITH customer_bal AS (
    SELECT CUSTOMER_ID, SUM(BALANCE) AS TOTAL_BAL
    FROM   ACCOUNT
    GROUP BY CUSTOMER_ID
)
SELECT c.FIRST_NAME, c.LAST_NAME, cb.TOTAL_BAL
FROM   CUSTOMER c
JOIN   customer_bal cb ON c.CUSTOMER_ID = cb.CUSTOMER_ID
WHERE  cb.TOTAL_BAL > (SELECT AVG(TOTAL_BAL) FROM customer_bal);

How many times is the CTE customer_bal referenced?

A) One B) Two C) Three D) It depends on the number of customers

Answer: B. The CTE is referenced twice: once in the main query's FROM clause (as cb) and once in the scalar subquery in the WHERE clause.

Question 19

When refactoring nested subqueries into CTEs, which subquery should become the first CTE?

A) The outermost subquery B) The innermost subquery C) The largest subquery D) The most frequently referenced subquery

Answer: B. Start with the innermost subquery as the first CTE, then work outward. Each subsequent CTE can reference the previous ones.

Question 20

What does "thinking in sets" mean in the context of SQL subqueries?

A) Using the SET operator in every query B) Defining collections of rows that satisfy conditions, then combining them, rather than processing rows one at a time C) Always using UNION to combine results D) Avoiding aggregate functions

Answer: B. Set-based thinking means defining sets of qualifying rows and combining them (via joins, intersections, etc.) rather than iterating through rows procedurally.

Question 21

Which statement about a correlated subquery in a SELECT list is correct?

A) It must return multiple rows B) It runs once for the entire query C) It must return exactly one value per outer row (scalar) D) It cannot reference the outer query

Answer: C. A subquery in the SELECT list must be scalar — returning exactly one value per outer row. Since it references outer columns, it is correlated.

Question 22

What is the advantage of using CTEs to implement complex business eligibility rules?

A) CTEs enforce data integrity constraints B) Each rule can be isolated in its own CTE, making the logic testable and readable C) CTEs automatically cache results for faster execution D) CTEs allow you to bypass security restrictions

Answer: B. By placing each business rule in a separate CTE, you can test, read, and maintain each rule independently, then combine them in the main query.

Question 23

Consider: WHERE a.BALANCE > ANY (SELECT BALANCE FROM ACCOUNT WHERE BRANCH_ID = 5). This is equivalent to:

A) WHERE a.BALANCE > MAX(BALANCE) for branch 5 B) WHERE a.BALANCE > MIN(BALANCE) for branch 5 C) WHERE a.BALANCE > AVG(BALANCE) for branch 5 D) WHERE a.BALANCE > SUM(BALANCE) for branch 5

Answer: B. > ANY means "greater than at least one value in the set," which is equivalent to being greater than the minimum value.

Question 24

You are reviewing EXPLAIN output and see a TEMP operator associated with a CTE. What does this indicate?

A) The CTE is being inlined B) The CTE is being materialized into a temporary result set C) The CTE has a syntax error D) The CTE is referencing a temporary table

Answer: B. A TEMP operator in the EXPLAIN output indicates that DB2 materialized the CTE — executed its query and stored the result for reuse.

Question 25

Which of these is the best first step when a correlated subquery is performing poorly?

A) Rewrite it as a cursor loop in a stored procedure B) Check the EXPLAIN plan and ensure indexes exist on the correlation columns C) Add MATERIALIZED hints to the subquery D) Convert it to a UNION query

Answer: B. The first step is always to check the execution plan and ensure appropriate indexes exist. Often, an index on the correlation column transforms a slow nested-loop into a fast indexed lookup.