Chapter 9 — Key Takeaways

The big idea

A subquery nests one query inside another — the practical payoff of closure (Ch. 4). Subqueries let you filter by another query's result, compute per-row values, and aggregate-then-query.

The four shapes

Shape Returns Used in Example
Scalar one value anywhere a value goes WHERE price > (SELECT AVG(price) …)
List one column, many rows IN, ANY, ALL WHERE customer_id IN (SELECT …)
Derived table a table (alias required) FROM aggregate, then query the aggregate
Correlated depends on outer row with EXISTS, in SELECT per-row COUNT(*)

EXISTS / NOT EXISTS — your anti-join default

  • EXISTS (subquery) = true if ≥1 row; stops at first match; NULL-safe.
  • NOT EXISTS = the safe "X with no matching Y." Prefer it over NOT IN.
  • SELECT 1 is the convention inside EXISTS (the columns don't matter).

The two killer pitfalls

  • NOT IN + a NULL in the subquery → zero rows (three-valued logic). Works for years, then fails silently. Use NOT EXISTS. (Case Study 1.)
  • Correlated subqueries run per outer row → fine on small data, catastrophic at scale. Rewrite as "aggregate once, then join" (derived table/CTE). (Case Study 2.)

Other gotchas

  • A scalar subquery returning >1 row errors. Use IN, an aggregate, or LIMIT 1.
  • ANY/ALL: > ANY = greater than the minimum; > ALL = greater than the maximum; IN = = ANY.

Subquery vs. join

  • JOIN when you need the other table's columns in the output.
  • IN/EXISTS when you only need to test membership/existence (and to avoid fan-out — EXISTS never duplicates rows).
  • They're often equivalent; the optimizer rewrites between them. Pick the clearest; use EXPLAIN for performance.

You can now…

  • ☐ Write scalar, list, derived-table, and correlated subqueries.
  • ☐ Use EXISTS/NOT EXISTS for existence and NULL-safe anti-joins.
  • ☐ Use ANY/ALL and know IN = = ANY.
  • ☐ Choose subquery vs. join by output needs and fan-out.
  • ☐ Avoid the NOT IN/NULL trap and de-correlate slow subqueries.

Looking ahead

Chapter 10 — Set Operations. UNION, INTERSECT, EXCEPT — combining whole result sets ("this plus that," "in both," "in one but not the other"), the SQL form of Chapter 4's set algebra.

One sentence to carry forward: Reach for NOT EXISTS (never NOT IN) on anti-joins, and "aggregate once then join" instead of asking the same question per row — these two reflexes prevent the most common subquery disasters.