Chapter 9 — Key Takeaways
The big idea
A subquery nests one query inside another — the practical payoff of closure (Ch. 4). Subqueries let you filter by another query's result, compute per-row values, and aggregate-then-query.
The four shapes
| Shape | Returns | Used in | Example |
|---|---|---|---|
| Scalar | one value | anywhere a value goes | WHERE price > (SELECT AVG(price) …) |
| List | one column, many rows | IN, ANY, ALL |
WHERE customer_id IN (SELECT …) |
| Derived table | a table (alias required) | FROM |
aggregate, then query the aggregate |
| Correlated | depends on outer row | with EXISTS, in SELECT |
per-row COUNT(*) |
EXISTS / NOT EXISTS — your anti-join default
EXISTS (subquery)= true if ≥1 row; stops at first match;NULL-safe.NOT EXISTS= the safe "X with no matching Y." Prefer it overNOT IN.SELECT 1is the convention insideEXISTS(the columns don't matter).
The two killer pitfalls
NOT IN+ aNULLin the subquery → zero rows (three-valued logic). Works for years, then fails silently. UseNOT EXISTS. (Case Study 1.)- Correlated subqueries run per outer row → fine on small data, catastrophic at scale. Rewrite as "aggregate once, then join" (derived table/CTE). (Case Study 2.)
Other gotchas
- A scalar subquery returning >1 row errors. Use
IN, an aggregate, orLIMIT 1. ANY/ALL:> ANY= greater than the minimum;> ALL= greater than the maximum;IN== ANY.
Subquery vs. join
- JOIN when you need the other table's columns in the output.
IN/EXISTSwhen you only need to test membership/existence (and to avoid fan-out —EXISTSnever duplicates rows).- They're often equivalent; the optimizer rewrites between them. Pick the clearest; use
EXPLAINfor performance.
You can now…
- ☐ Write scalar, list, derived-table, and correlated subqueries.
- ☐ Use
EXISTS/NOT EXISTSfor existence andNULL-safe anti-joins. - ☐ Use
ANY/ALLand knowIN = = ANY. - ☐ Choose subquery vs. join by output needs and fan-out.
- ☐ Avoid the
NOT IN/NULLtrap and de-correlate slow subqueries.
Looking ahead
Chapter 10 — Set Operations. UNION, INTERSECT, EXCEPT — combining whole result sets ("this plus that," "in both," "in one but not the other"), the SQL form of Chapter 4's set algebra.
One sentence to carry forward: Reach for
NOT EXISTS(neverNOT IN) on anti-joins, and "aggregate once then join" instead of asking the same question per row — these two reflexes prevent the most common subquery disasters.