Chapter 5 Key Takeaways

  1. SELECT is your primary data retrieval tool. Name columns explicitly (SELECT col1, col2) rather than using SELECT * in production code. Explicit column lists protect against schema changes and reduce unnecessary data transfer.

  2. WHERE filters rows before they reach your result set. Master the full predicate toolkit: comparison operators (=, <>, <, >, <=, >=), BETWEEN for inclusive ranges, IN for value lists, LIKE for pattern matching (with % and _ wildcards), and IS NULL / IS NOT NULL for missing-value detection.

  3. AND binds more tightly than OR. When combining logical operators, always use parentheses to make your intent explicit. The expression A OR B AND C means A OR (B AND C), which is a frequent source of bugs.

  4. ORDER BY is the only guarantee of row order. Without it, DB2 may return rows in any sequence. Use ASC/DESC for direction, multiple columns for tie-breaking, and NULLS FIRST/LAST (on LUW) to control where NULLs appear.

  5. DISTINCT eliminates duplicate rows but has a performance cost. Use it when you genuinely need unique combinations. If you find yourself adding DISTINCT to "fix" a query that returns too many rows, the real problem is likely in your join logic or WHERE clause.

  6. Scalar functions transform individual values. Memorize the essential ones: UPPER/LOWER for case conversion, SUBSTR for extraction, LENGTH for size, TRIM for cleanup, COALESCE for NULL handling, CAST for type conversion, YEAR/MONTH/DAY for date decomposition, and ROUND/ABS for numeric formatting.

  7. CASE expressions bring conditional logic into SQL. Use simple CASE for value matching and searched CASE for Boolean conditions. Always include an ELSE clause to avoid silent NULLs. CASE works in SELECT lists, WHERE clauses, ORDER BY, and inside function arguments.

  8. NULL means unknown, not zero, not empty. NULL does not equal NULL. Any comparison with NULL yields UNKNOWN. Use IS NULL to test, COALESCE to provide defaults, and NULLIF to generate NULLs. Be especially wary of NULL in NOT IN subqueries, string concatenation, and arithmetic.

  9. FETCH FIRST N ROWS ONLY controls result set size. This is the DB2 standard syntax — do not use LIMIT (MySQL) or TOP (SQL Server). Combine with OFFSET for pagination. FETCH FIRST can improve performance when the optimizer can use it to short-circuit a sort.

  10. Column aliases (AS) and expressions make output readable. Use aliases to give computed columns meaningful names. Use string concatenation, date arithmetic, and CASE to shape raw data into the format your audience needs.

  11. SQL is a data quality tool, not just a reporting tool. The same SELECT/WHERE/CASE/function skills that produce business reports can audit data for completeness, format consistency, and logical correctness. Regulators, auditors, and business owners all rely on this capability.

  12. DB2 SQL is highly portable between platforms. The SELECT syntax in this chapter is identical on DB2 for z/OS and DB2 for LUW. The only differences involve edge-case features like NULLS FIRST/LAST (LUW only) and the specific command-line tools used to execute queries.


Key Takeaways for Chapter 5 of "IBM DB2: From First Query to Enterprise Architecture"