Chapter 23: Key Takeaways

The Core Principle

EXPLAIN output is the single most reliable tool for understanding what DB2 is doing with your query. Reading it is not optional — it is the foundation of all rational performance tuning.


Capturing EXPLAIN

  1. On z/OS, use EXPLAIN PLAN SET QUERYNO = n FOR <query> to populate PLAN_TABLE. Use BIND ... EXPLAIN(YES) to capture plans for all static SQL in a package. Use SET CURRENT EXPLAIN MODE for dynamic SQL.

  2. On LUW, use EXPLAIN PLAN FOR <query> to populate the EXPLAIN tables. Use db2expln for quick command-line analysis. Use Visual Explain in Data Studio for graphical plan viewing.

  3. Always create a baseline before making changes. Capture EXPLAIN output, save it, and compare against it after every optimization attempt.


Access Types — From Best to Worst

  1. Index-only access (INDEXONLY = 'Y' on z/OS; IXSCAN without FETCH on LUW) is the gold standard. The query is answered entirely from the index without touching data pages.

  2. Matching index scan with high MATCHCOLS means DB2 is efficiently positioning into the index. More matching columns means less scanning.

  3. Table scan (ACCESSTYPE = 'R' on z/OS; TBSCAN on LUW) is acceptable for small tables but catastrophic for large ones with selective predicates. Always investigate table scans on large tables.

  4. Multiple index access (ACCESSTYPE = 'M'/'MX' on z/OS; IXAND/IXOR on LUW) usually indicates that a better composite index could be created.


Join Methods

  1. Nested loop join (METHOD = 1 / NLJOIN) is ideal when the outer table is small and the inner table has a good index. It is disastrous when the inner table is scanned for each outer row.

  2. Merge scan join (METHOD = 2 / MSJOIN) works well when both inputs are already sorted on the join column. Check for sort operations that indicate the data is not naturally sorted.

  3. Hash join (METHOD = 4 / HSJOIN) is the workhorse for large-scale equi-joins when no index is helpful. It requires enough memory to hold the hash table for the smaller input.


Sorts and Costs

  1. Sort operations are expensive. On z/OS, check the eight SORT columns (SORTN_ and SORTC_). On LUW, look for SORT operators. Indexes that provide data in the required order can eliminate sorts.

  2. Cost estimates are relative, not absolute. Use them to compare plans, not to predict elapsed time. Timerons (LUW) and PROCSU (z/OS) are proportional to actual cost but not equal to it.

  3. Estimated cardinalities are the optimizer's predictions of row counts at each step. When these are wildly wrong, the optimizer makes bad decisions. Stale statistics are the most common cause.


The Optimization Workflow

  1. Follow the six-step process: Capture baseline, analyze the plan, identify the problem, hypothesize a fix, test the fix, verify the improvement.

  2. Never guess. The EXPLAIN output tells you exactly where the problem is. Acting without reading it is like prescribing medicine without looking at the test results.


Protecting Against Regression

  1. REBIND can change access plans. Always compare EXPLAIN output before and after a REBIND. On z/OS, use PLANMGMT(EXTENDED) to retain previous plans as a fallback.

  2. RUNSTATS before REBIND can cause surprises. New statistics can lead the optimizer to different (sometimes worse) choices. Consider separating RUNSTATS and REBIND into different maintenance windows.

  3. Monitor critical queries. Automate EXPLAIN comparison for your most important queries. Detect access path regressions before users do.


Platform-Specific Reminders

  1. z/OS: The PLAN_TABLE must exist in your schema. ACCESSTYPE, MATCHCOLS, INDEXONLY, and METHOD are the most important columns. DSN_PREDICAT_TABLE shows predicate classification. DSN_STATEMNT_TABLE shows overall cost.

  2. LUW: The operator tree is read from bottom (leaf nodes) to top (RETURN). IXSCAN, FETCH, TBSCAN, NLJOIN, HSJOIN, SORT, and GRPBY are the most common operators. TOTAL_COST on the RETURN operator is the overall query cost.