Chapter 24 Key Takeaways: SQL Tuning
The Tuning Mindset
- Never tune blind. Measure current performance (elapsed time, CPU, getpages, I/O) and examine the EXPLAIN output before changing any SQL.
- Validate results after every rewrite. A rewrite that changes the result set is a bug, not a tune. Always compare output row counts and sample data.
- Check prerequisites first. Stale statistics, missing indexes, and system resource problems should be addressed before rewriting SQL.
- Document every change. Record the original SQL, the rewrite, the rationale, and before/after metrics.
Sargable Predicates
- A sargable predicate has the column alone on one side of the comparison, unmodified by any function or expression. This is the single most impactful concept in SQL tuning.
- Replace date functions with range predicates.
YEAR(col) = 2025becomescol >= '2025-01-01' AND col < '2026-01-01'. - Avoid implicit data type conversions. Ensure literal types match column types to prevent the optimizer from applying hidden CAST functions.
- Move arithmetic from the column side to the literal side.
col * 1.1 > 5000becomescol > 5000 / 1.1. - Leading wildcards in LIKE prevent index use. Use trailing wildcards only, or invest in a text search index for substring searches.
Stage 1 vs Stage 2 (z/OS)
- Stage 1 predicates are evaluated by Data Manager as data is read; Stage 2 predicates are evaluated by RDS after the row is fetched. Stage 1 evaluation rejects rows earlier, saving CPU throughout the processing pipeline.
- Functions on columns make predicates Stage 2. Systematic identification and rewriting of Stage 2 predicates is one of the highest-return z/OS tuning activities.
- A predicate can be Stage 1 without being indexable (if no matching index exists). Stage 1 is still better than Stage 2 even without an index.
Join Optimization
- Join order matters enormously. Start with the most selective table (fewest rows after filtering) as the driving table for nested loop joins.
- Use EXISTS for existence checks, JOIN when you need data from both tables. Avoid IN with large subqueries.
- Always use NOT EXISTS instead of NOT IN. NOT IN returns incorrect results when the subquery can produce NULL values.
Subquery Optimization
- Replace correlated subqueries with JOINs or derived tables to avoid per-row re-execution.
- Use EXISTS instead of COUNT(*) > 0 for existence checks. EXISTS stops at the first match; COUNT must examine every matching row.
- Eliminate duplicate subqueries. If the same correlated subquery appears in both the SELECT list and WHERE clause, consolidate into a single derived table JOIN.
Sort Avoidance
- Design indexes to match ORDER BY and GROUP BY sequences. When leading index columns are fixed by equality predicates, the ORDER BY needs to match the remaining index columns.
- Use UNION ALL instead of UNION when duplicates are impossible or acceptable. UNION forces a sort for duplicate elimination.
- Replace DISTINCT with EXISTS when DISTINCT is only needed because a JOIN produces unwanted duplicates.
Pagination and FETCH FIRST
- Use FETCH FIRST n ROWS ONLY to signal the optimizer that you need only a limited result set. Combine with OPTIMIZE FOR n ROWS for strongest effect.
- Use keyset (seek-based) pagination for deep pagination. Offset pagination has O(N) cost per page; keyset pagination has O(1) cost.
Materialized Query Tables
- MQTs pre-compute expensive aggregations and can be automatically used by the optimizer when CURRENT REFRESH AGE is configured.
- Choose the right refresh strategy. REFRESH DEFERRED for batch workloads; REFRESH IMMEDIATE (LUW only) for small, frequently queried tables with low change rates.
- Keep statistics current on MQTs. The optimizer needs accurate statistics to decide whether routing to the MQT is beneficial.