Chapter 23: Reading and Interpreting EXPLAIN — The Single Most Important DBA Skill

"If you learn one skill from this entire book, let it be this one. Reading EXPLAIN output is how you see what DB2 is actually doing with your query. Not what you think it is doing. Not what you hope it is doing. What it is actually doing."

Every query you write is a request. You are asking DB2 to retrieve data, and you are describing what you want. But you are not telling DB2 how to get it. The optimizer makes that decision. It chooses which indexes to use, which join methods to employ, whether to sort, whether to prefetch, and in what order to access the tables. The EXPLAIN facility is the window into those decisions. Without it, you are tuning blind.

I have watched senior developers spend hours rewriting queries, adding hints, creating indexes, and rebuilding statistics — all without ever looking at the access plan. They were guessing. Sometimes they guessed right. More often, they made things worse. The DBA who reads EXPLAIN first, acts second, and verifies third will outperform the guessing developer every single time.

This chapter is the longest in Part V for good reason. We will cover EXPLAIN on both z/OS and LUW, walk through every important column and operator, learn to recognize good plans and bad plans, and then apply everything to Meridian National Bank queries. By the end, you will be able to open an EXPLAIN output and read it like a story — a story that tells you exactly where your performance problem lives.


23.1 What EXPLAIN Does and Why It Matters

When DB2 receives a SQL statement, the optimizer evaluates dozens or hundreds of possible execution strategies. It estimates the cost of each strategy — considering I/O, CPU, sorting, network transfer, and available memory — and selects the plan it believes will be cheapest. EXPLAIN captures that chosen plan and writes it to a set of tables you can query.

The EXPLAIN output tells you:

  • Which tables are accessed and in what order
  • How each table is accessed — full table scan, index scan, index-only access, or something else
  • Which indexes are used, and how many key columns match the predicates
  • How tables are joined — nested loop, merge scan, or hash join
  • Where sorts occur and why
  • Estimated costs — how expensive the optimizer thinks each step will be
  • Estimated cardinalities — how many rows the optimizer expects at each step

This information is indispensable for three reasons.

First, it reveals the actual strategy. You might assume your query uses the index you created, but EXPLAIN will show you that the optimizer chose a table scan instead — perhaps because the statistics are stale, or because the predicate is not sargable, or because the table is small enough that a scan is genuinely cheaper.

Second, it provides a baseline. Before you tune anything, capture the EXPLAIN output. After you make changes — adding an index, rewriting a predicate, updating statistics — capture it again. Comparing the two tells you whether your change helped, hurt, or had no effect. Without this before-and-after comparison, you cannot distinguish a real improvement from a placebo.

Third, it enables proactive tuning. You do not have to wait for users to complain. You can EXPLAIN a query before it ever runs in production, identify potential problems, and fix them before they cause an outage. In a well-run shop, every SQL statement is EXPLAINed during code review before it is promoted to production.

The EXPLAIN Tables

EXPLAIN does not produce a pretty report on its own. It populates a set of relational tables with structured data about the access plan. You then query those tables to see the plan. This design is deliberate — it means you can write your own analysis queries, join EXPLAIN data with catalog information, build trending reports, and automate access plan monitoring.

On z/OS, the primary table is PLAN_TABLE, with supplementary tables including DSN_STATEMNT_TABLE, DSN_FILTER_TABLE, DSN_PREDICAT_TABLE, and others. On LUW, the EXPLAIN tables include EXPLAIN_INSTANCE, EXPLAIN_STATEMENT, EXPLAIN_ARGUMENT, EXPLAIN_OBJECT, EXPLAIN_OPERATOR, EXPLAIN_PREDICATE, EXPLAIN_STREAM, and others.

The structure differs between platforms, but the concepts are identical. Both platforms tell you the same story — they just use different vocabularies.


23.2 [z/OS] Capturing EXPLAIN

On z/OS, there are three primary ways to capture EXPLAIN output.

Method 1: The EXPLAIN Statement

The most direct approach is to execute the EXPLAIN statement explicitly:

EXPLAIN PLAN SET QUERYNO = 1001 FOR
  SELECT A.ACCT_ID, A.ACCT_TYPE, C.CUST_NAME
  FROM MERIDIAN.ACCOUNTS A
  JOIN MERIDIAN.CUSTOMERS C
    ON A.CUST_ID = C.CUST_ID
  WHERE A.BRANCH_ID = 'BR-0042'
    AND A.ACCT_STATUS = 'ACTIVE';

This populates your PLAN_TABLE (and related tables) with the access plan for the specified query without actually executing it. The QUERYNO value is a tag you assign so you can find the plan later.
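To view the captured plan, query PLAN_TABLE for your QUERYNO. A minimal retrieval query might look like this (the column list matches the standard PLAN_TABLE columns described later in this section; adjust for your version):

```sql
SELECT QBLOCKNO, PLANNO, METHOD, TNAME,
       ACCESSTYPE, MATCHCOLS, ACCESSNAME, INDEXONLY, PREFETCH
FROM PLAN_TABLE
WHERE QUERYNO = 1001
ORDER BY QBLOCKNO, PLANNO;
```

Each row is one step of the plan; read the rows in PLANNO order within each query block.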

Method 2: Binding with EXPLAIN(YES)

When you bind a package or plan, you can request that DB2 explain every SQL statement in the package:

BIND PACKAGE(MERIDIAN.ACCTPKG) -
     MEMBER(ACCTPGM) -
     EXPLAIN(YES) -
     ISOLATION(CS)

This is the standard approach in production environments. Every time a package is bound or rebound, the EXPLAIN data is refreshed. You can then query PLAN_TABLE filtered by PROGNAME and COLLID to see the access plans for all statements in that package.
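A sketch of such a query, pulling the plan for every statement in the package (the TIMESTAMP column distinguishes plans captured by successive rebinds):

```sql
SELECT QUERYNO, QBLOCKNO, PLANNO, TNAME,
       ACCESSTYPE, MATCHCOLS, ACCESSNAME, INDEXONLY
FROM PLAN_TABLE
WHERE COLLID   = 'MERIDIAN'
  AND PROGNAME = 'ACCTPGM'
ORDER BY QUERYNO, QBLOCKNO, PLANNO;
```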

Method 3: EXPLAIN(YES) on Dynamic SQL

For dynamic SQL, you can set the CURRENT EXPLAIN MODE special register:

SET CURRENT EXPLAIN MODE = YES;

When this is set, every dynamic SQL statement will be both explained and executed. If you want to explain without executing, use:

SET CURRENT EXPLAIN MODE = EXPLAIN;

This is particularly useful for testing in development environments.

The PLAN_TABLE Structure

The PLAN_TABLE is the heart of z/OS EXPLAIN. It must exist in your schema before you can run EXPLAIN. DB2 ships a sample DDL in SDSNSAMP member DSNTESC, but here are the critical columns:

Column          Description
QUERYNO         Your assigned query number
QBLOCKNO        Query block number (1 for the outer query, >1 for subqueries)
PLANNO          Step number within the query block
METHOD          Join method (0=first table accessed, 1=nested loop join, 2=merge scan join, 3=additional sorts only, no new table, 4=hybrid join)
CREATOR         Schema of the accessed table
TNAME           Name of the accessed table
TABNO           Table number in the FROM clause
ACCESSTYPE      How the table is accessed (I=index scan, I1=one-fetch index scan, N=index scan with IN-list, R=tablespace scan, M/MX/MI/MU=multiple index access)
MATCHCOLS       Number of index key columns matched by predicates
ACCESSCREATOR   Schema of the index used
ACCESSNAME      Name of the index used
INDEXONLY       Whether the query can be satisfied from the index alone (Y/N)
SORTN_UNIQ      Sort on new table for uniqueness
SORTN_JOIN      Sort on new table for join
SORTN_ORDERBY   Sort on new table for ORDER BY
SORTN_GROUPBY   Sort on new table for GROUP BY
SORTC_UNIQ      Sort on composite for uniqueness
SORTC_JOIN      Sort on composite for join
SORTC_ORDERBY   Sort on composite for ORDER BY
SORTC_GROUPBY   Sort on composite for GROUP BY
TSLOCKMODE      Lock mode on the tablespace
TIMESTAMP       When the EXPLAIN was captured
REMARKS         Free-text remarks
PREFETCH        Prefetch type (S=sequential, L=list, blank=none)
COLUMN_FN_EVAL  Where column functions are evaluated
MIXOPSEQ        Sequence number for multiple index operations
ACCESSTYPE values need special attention — we will explore them fully in Section 23.8.

Supplementary z/OS EXPLAIN Tables

Beyond PLAN_TABLE, several other tables provide additional detail:

DSN_STATEMNT_TABLE contains statement-level information including the estimated cost, the SQL text, and the statement type. The COST_CATEGORY and PROCSU columns give you the optimizer's overall cost estimate.

DSN_FILTER_TABLE shows filter factors applied to predicates — the fraction of rows the optimizer expects to qualify after each predicate is applied. If you see a filter factor of 1.0 (no filtering) on a predicate you believe is selective, something is wrong with your statistics.

DSN_PREDICAT_TABLE provides detailed information about each predicate: which column it references, what operator is used, and how the predicate was classified (indexable, stage 1, stage 2, or residual).

Together, these tables give you a complete picture of the optimizer's reasoning.
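As a sketch of the kind of analysis query this design makes possible, the following joins the statement-level cost estimates to the access steps, most expensive statements first (adapt the column names to your DB2 version):

```sql
SELECT S.QUERYNO, S.COST_CATEGORY, S.PROCSU,
       P.PLANNO, P.TNAME, P.ACCESSTYPE, P.ACCESSNAME
FROM DSN_STATEMNT_TABLE S
JOIN PLAN_TABLE P
  ON P.QUERYNO = S.QUERYNO
ORDER BY S.PROCSU DESC, S.QUERYNO, P.PLANNO;
```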


23.3 [LUW] Capturing EXPLAIN

On DB2 LUW, you have three primary tools for capturing and viewing EXPLAIN output.

Method 1: The EXPLAIN Statement

Similar to z/OS, you can explicitly explain a statement:

EXPLAIN PLAN FOR
  SELECT A.ACCT_ID, A.ACCT_TYPE, C.CUST_NAME
  FROM MERIDIAN.ACCOUNTS A
  JOIN MERIDIAN.CUSTOMERS C
    ON A.CUST_ID = C.CUST_ID
  WHERE A.BRANCH_ID = 'BR-0042'
    AND A.ACCT_STATUS = 'ACTIVE';

This populates the EXPLAIN tables (EXPLAIN_INSTANCE, EXPLAIN_STATEMENT, EXPLAIN_OPERATOR, etc.). You then query those tables to see the plan.
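For example, a quick look at the most recently explained statements and their overall costs might use a query like this (EXPLAIN_LEVEL 'P' selects the optimized plan rows, 'O' the original statement text):

```sql
SELECT EXPLAIN_TIME, TOTAL_COST,
       SUBSTR(STATEMENT_TEXT, 1, 60) AS STMT
FROM EXPLAIN_STATEMENT
WHERE EXPLAIN_LEVEL = 'P'
ORDER BY EXPLAIN_TIME DESC
FETCH FIRST 10 ROWS ONLY;
```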

Before you can use EXPLAIN, the explain tables must exist. You can create them using:

db2 connect to MERIDIAN
db2 -tf ~/sqllib/misc/EXPLAIN.DDL

Or call the SYSPROC.SYSINSTALLOBJECTS procedure:

CALL SYSPROC.SYSINSTALLOBJECTS('EXPLAIN', 'C', '', CURRENT SCHEMA);

Method 2: db2expln Command

The db2expln command-line utility is often the fastest way to see an access plan on LUW. It formats the plan in a readable text tree:

db2expln -d MERIDIAN -t -statement "SELECT A.ACCT_ID, A.ACCT_TYPE, C.CUST_NAME FROM MERIDIAN.ACCOUNTS A JOIN MERIDIAN.CUSTOMERS C ON A.CUST_ID = C.CUST_ID WHERE A.BRANCH_ID = 'BR-0042' AND A.ACCT_STATUS = 'ACTIVE'"

The output shows an indented operator tree with cost estimates, cardinalities, and access details at each node. For quick analysis, db2expln is hard to beat.

You can also explain all static SQL in a package:

db2expln -d MERIDIAN -c MERIDIAN -p ACCTPKG -s 0 -t

Method 3: Visual Explain in Data Studio

IBM Data Studio provides a graphical Visual Explain tool that renders the access plan as a flowchart. Each operator is a node, with arrows showing data flow. You can click on any node to see its properties: estimated cost, cardinality, predicates applied, and arguments.

Visual Explain is excellent for complex queries with many joins, because the graphical layout makes the overall structure immediately apparent. However, for quick command-line work, db2expln is more efficient.

Setting EXPLAIN Mode

On LUW, you can capture EXPLAIN for all dynamic SQL by setting the special register:

SET CURRENT EXPLAIN MODE = YES;      -- Explain and execute
SET CURRENT EXPLAIN MODE = EXPLAIN;  -- Explain only, do not execute
SET CURRENT EXPLAIN MODE = NO;       -- Normal execution, no explain

When EXPLAIN MODE is YES or EXPLAIN, every subsequent dynamic SQL statement populates the explain tables.

The LUW EXPLAIN Tables

The key tables on LUW are:

Table              Purpose
EXPLAIN_INSTANCE   One row per EXPLAIN invocation
EXPLAIN_STATEMENT  Statement text and overall costs
EXPLAIN_OPERATOR   One row per operator in the plan tree
EXPLAIN_ARGUMENT   Arguments/properties for each operator
EXPLAIN_PREDICATE  Predicates applied at each operator
EXPLAIN_OBJECT     Tables and indexes accessed
EXPLAIN_STREAM     Data flow between operators (parent-child relationships)

The operator tree is the core of LUW EXPLAIN. Each operator has a numeric ID, a type (FETCH, IXSCAN, TBSCAN, NLJOIN, HSJOIN, MSJOIN, SORT, TEMP, GRPBY, RETURN, etc.), and connections to parent and child operators via the EXPLAIN_STREAM table.
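As an illustrative sketch (assuming the standard columns of these tables), you can list the operators of the most recent explain together with the estimated rows each one feeds to its parent:

```sql
SELECT O.OPERATOR_ID, O.OPERATOR_TYPE, O.TOTAL_COST,
       S.STREAM_COUNT AS EST_ROWS_OUT,
       S.TARGET_ID    AS PARENT_OP
FROM EXPLAIN_OPERATOR O
LEFT JOIN EXPLAIN_STREAM S
  ON S.SOURCE_ID    = O.OPERATOR_ID
 AND S.SOURCE_TYPE  = 'O'
 AND S.EXPLAIN_TIME = O.EXPLAIN_TIME
WHERE O.EXPLAIN_TIME = (SELECT MAX(EXPLAIN_TIME) FROM EXPLAIN_INSTANCE)
ORDER BY O.OPERATOR_ID;
```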


23.4 Reading the EXPLAIN Output — Access Type

The access type tells you how DB2 retrieves rows from a table. This is the single most important piece of information in the EXPLAIN output, because the difference between access types can be the difference between a query that runs in milliseconds and one that runs for hours.

Tablespace Scan / Table Scan

z/OS PLAN_TABLE: ACCESSTYPE = 'R'
LUW Operator: TBSCAN

A table scan (or tablespace scan on z/OS) reads every row in the table. DB2 starts at the beginning and reads sequentially to the end, applying predicates to each row to decide whether it qualifies.

Table scans are not inherently bad. For small tables (a few hundred rows), a table scan is often the fastest access method because it avoids the overhead of index lookups. For lookup tables, reference tables, and configuration tables, a table scan is perfectly acceptable.

Table scans become a problem when applied to large tables. Scanning a 100-million-row transaction table to find 5 matching rows is catastrophically expensive. If you see a table scan on a large table with selective predicates, that is almost always a problem worth fixing.

Common causes of unwanted table scans:

  • No suitable index exists for the predicates in the WHERE clause
  • The predicate is not indexable — for example, a function applied to the indexed column (WHERE YEAR(TRANS_DATE) = 2025), a LIKE pattern with a leading wildcard (WHERE CUST_NAME LIKE '%Smith'), or an OR condition that cannot be converted to an IN list
  • Statistics are stale or missing, causing the optimizer to underestimate the table size or overestimate the number of qualifying rows
  • The table is genuinely small, and a scan is cheaper than an index access

Index Scan

z/OS PLAN_TABLE: ACCESSTYPE = 'I', MATCHCOLS > 0
LUW Operators: IXSCAN followed by FETCH

An index scan uses an index to locate the qualifying rows, then fetches the corresponding data pages. The MATCHCOLS value (z/OS) tells you how many leading columns of the index key are being used for matching — more matching columns means a more selective (and therefore cheaper) index scan.

For example, if an index is defined on (BRANCH_ID, ACCT_STATUS, ACCT_TYPE) and your predicate is WHERE BRANCH_ID = 'BR-0042' AND ACCT_STATUS = 'ACTIVE', MATCHCOLS will be 2. The optimizer can position directly to the first qualifying entry and scan forward, reading only the entries that match both columns.

If your predicate were only WHERE ACCT_STATUS = 'ACTIVE' (the second column, but not the first), MATCHCOLS would be 0 (a nonmatching index scan), because the leading column is not constrained. The optimizer might still use this index with index screening (checking non-leading columns during the scan), or it might choose a table scan instead.

On LUW, you see this as a pair of operators: IXSCAN (scanning the index) feeding into FETCH (retrieving the data rows from the table). The IXSCAN operator's properties will show which index is used and which predicates are applied during the scan.

Index-Only Access

z/OS PLAN_TABLE: INDEXONLY = 'Y'
LUW Operators: IXSCAN with no FETCH above it

Index-only access is the gold standard. When every column needed by the query exists in the index, DB2 does not need to access the data pages at all. It reads the answer entirely from the index, which is dramatically faster — especially for queries that touch many rows but need only a few columns.

For example, if you have an index on (BRANCH_ID, ACCT_STATUS) and your query is:

SELECT COUNT(*) FROM MERIDIAN.ACCOUNTS
WHERE BRANCH_ID = 'BR-0042' AND ACCT_STATUS = 'ACTIVE';

DB2 can count the qualifying index entries without ever reading a data page. INDEXONLY = 'Y' on z/OS, and on LUW you will see IXSCAN without a FETCH operator.

Creating "covering indexes" that enable index-only access for critical queries is one of the most effective performance tuning techniques available.
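For example, if a hot query selects only ACCT_ID for active accounts in a branch, widening the key makes it index-only (the index name here is illustrative):

```sql
-- Hypothetical covering index: BRANCH_ID and ACCT_STATUS position the scan,
-- and ACCT_ID rides along in the key so no data-page access is needed
CREATE INDEX MERIDIAN.IX_ACCT_BRANCH_COVER
  ON MERIDIAN.ACCOUNTS (BRANCH_ID, ACCT_STATUS, ACCT_ID);
```

The trade-off: a wider index takes more space and slows inserts and updates, so reserve this technique for queries that matter.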

List Prefetch

z/OS PLAN_TABLE: PREFETCH = 'L'
LUW Operator: Indicated in IXSCAN/FETCH properties

List prefetch is a technique where DB2 first scans the index to collect a list of qualifying row IDs (RIDs), sorts them into page-number order, and then fetches the data pages in sequential order. This is beneficial when the index scan produces many RIDs that are scattered across many data pages — fetching them in page order minimizes random I/O.

On z/OS, PREFETCH = 'L' indicates list prefetch. When you see this, DB2 is optimizing the physical I/O pattern for a potentially large result set. It is often a sign of a reasonably good plan that just happens to touch many rows.

Sequential Prefetch

z/OS PLAN_TABLE: PREFETCH = 'S'

Sequential prefetch means DB2 is reading data pages sequentially ahead of where they are currently needed. This is the standard prefetch method for table scans and for index scans that access data in clustered order. If the data is well-clustered with respect to the index, sequential prefetch is very efficient.

Multiple Index Access

z/OS PLAN_TABLE: ACCESSTYPE = 'M' (with MX, MI, MU rows)
LUW Operators: IXAND or IXOR

Multiple index access means DB2 is using more than one index on the same table, combining the results via RID intersection (AND logic) or RID union (OR logic). This typically occurs when no single index satisfies all predicates, but two or more indexes together provide good selectivity.

On z/OS, ACCESSTYPE = 'M' introduces multiple index access; the related PLAN_TABLE rows with ACCESSTYPE MX, MI, and MU, ordered by MIXOPSEQ, show the individual index scans and the RID intersection or union steps. On LUW, you will see IXAND or IXOR operators combining the results of multiple IXSCAN operators.

Multiple index access is often a sign that a better composite index could be created. If DB2 is combining two single-column indexes, a composite index on both columns would usually be more efficient.


23.5 Reading EXPLAIN — Join Method

When a query involves more than one table, DB2 must choose a method to combine the rows. Nested loop and merge scan joins are available on both platforms. Hash join is the third method on LUW; DB2 for z/OS instead offers hybrid join (METHOD = 4) for similar situations.

Nested Loop Join (NLJOIN)

z/OS PLAN_TABLE: METHOD = 1
LUW Operator: NLJOIN

In a nested loop join, DB2 reads each row from the outer (driving) table, and for each outer row, it looks up matching rows in the inner table. This is like two nested FOR loops in a programming language.

Nested loop join is the optimal choice when:

  • The outer table is small (or heavily filtered)
  • There is a good index on the inner table's join column
  • The join produces relatively few rows

Nested loop join is problematic when:

  • The outer table is large and not well-filtered
  • The inner table lacks an appropriate index (forcing a table scan for each outer row)

The second scenario — nested loop with an inner table scan — is one of the worst patterns you can encounter. If the outer table produces 10,000 rows and the inner table has 1,000,000 rows with no index on the join column, DB2 will scan 1,000,000 rows 10,000 times. This is a performance disaster.

In the EXPLAIN output, look at the inner table's access type. If it is an index scan (ACCESSTYPE = 'I' on z/OS, IXSCAN on LUW), the nested loop is probably fine. If it is a table scan (ACCESSTYPE = 'R', TBSCAN), investigate immediately.

Merge Scan Join (MSJOIN)

z/OS PLAN_TABLE: METHOD = 2
LUW Operator: MSJOIN

In a merge scan join, DB2 sorts both the outer and inner table on the join columns (if they are not already in order), then merges them in a single pass. Both sorted lists are walked in parallel — when a match is found, the joined row is emitted.

Merge scan join is efficient when:

  • Both tables are large
  • Both tables are already sorted on the join column (or an index provides the sort order)
  • The join is an equi-join

The cost of merge scan join is dominated by the sorting step. If both inputs are already sorted (because of suitable indexes), the merge itself is very efficient. If sorts are required, the cost depends on the size of the data being sorted.

On z/OS, if you see METHOD = 2, check the SORTN_JOIN and SORTC_JOIN columns. If they are 'Y', a sort is being performed for the join. If 'N', the data is already in the right order.

Hash Join (HSJOIN)

LUW Operator: HSJOIN
z/OS PLAN_TABLE: no equivalent — METHOD = 4 indicates hybrid join, not hash join

In a hash join, DB2 reads the smaller (build) table into memory, creating a hash table on the join column. It then reads the larger (probe) table, hashing each row's join column to look up matches in the hash table.

Hash join is often the best choice when:

  • Both tables are large
  • Neither table has a suitable index on the join column
  • There is enough memory to hold the hash table for the smaller input

Hash join does not require sorted input and does not produce sorted output. It is a workhorse for large-scale joins in data warehouse queries.

Hash join is an LUW join method; DB2 for z/OS does not implement it. The closest z/OS analogue is the hybrid join (METHOD = 4), which joins outer rows to RIDs obtained from the inner table's index, sorts the intermediate result into RID order, and uses list prefetch to fetch the inner data pages.

Join Order

Beyond the join method, the order in which tables are joined matters enormously. In a five-table join, there are 120 possible orderings (5 factorial). The optimizer evaluates many of these and picks the cheapest. The EXPLAIN output shows the order by the sequence of PLANNO values (z/OS) or the structure of the operator tree (LUW).

The general principle is: start with the most selective access. The table that produces the fewest rows after filtering should be the outer (driving) table, because each row drives a lookup into the next table. If the first table produces 10 rows instead of 10,000, the entire downstream work is reduced by a factor of 1,000.


23.6 Reading EXPLAIN — Sort Operations

Sorts are expensive. They consume CPU, require memory (sort heap on LUW, sort work area on z/OS), and if the data does not fit in memory, they spill to disk — which is dramatically slower.

When Sorts Happen

Sorts occur for several reasons:

  • ORDER BY — The query requests sorted output, and no index provides the data in that order
  • GROUP BY — Aggregation may require a sort to group matching values together (although hash-based grouping can sometimes avoid this)
  • DISTINCT — Duplicate elimination may require sorting
  • UNION (not UNION ALL) — Duplicate elimination across set operations
  • Merge scan join — Both inputs must be sorted on the join column
  • Subquery materialization — Certain subquery types are materialized into a temporary table, which may require sorting

Identifying Sorts in EXPLAIN

On z/OS, the PLAN_TABLE has eight sort indicator columns:

Column          Meaning
SORTN_UNIQ      Sort new table for uniqueness (DISTINCT)
SORTN_JOIN      Sort new table for merge join
SORTN_ORDERBY   Sort new table for ORDER BY
SORTN_GROUPBY   Sort new table for GROUP BY
SORTC_UNIQ      Sort composite table for uniqueness
SORTC_JOIN      Sort composite table for merge join
SORTC_ORDERBY   Sort composite table for ORDER BY
SORTC_GROUPBY   Sort composite table for GROUP BY

The "N" variants apply to the "new" table (the table being accessed in this step). The "C" variants apply to the "composite" table (the accumulated result of previous join steps).

A value of 'Y' means a sort is being performed. A value of 'N' means no sort. To avoid sorts, you need indexes that provide data in the order required by the ORDER BY, GROUP BY, or join columns.

On LUW, sorts appear as explicit SORT operators in the operator tree. Each SORT operator shows the sort columns, the estimated number of rows being sorted, and the estimated sort cost.

Sort Overflow

When a sort does not fit in memory, it overflows to temporary disk storage. This is called a "sort overflow" and is significantly slower than an in-memory sort. On LUW, you can monitor sort overflows through the database monitor (the sort_overflows metric). On z/OS, you can detect this through IFCID trace records.

If you see a sort on a large intermediate result set, consider whether an index could eliminate it. For example, if the query has ORDER BY TRANS_DATE DESC and you have an index on TRANS_DATE, DB2 can read the index in reverse order and avoid the sort entirely.


23.7 Reading EXPLAIN — Cost Estimates

The optimizer assigns cost estimates to each step of the access plan. Understanding these costs helps you identify the most expensive parts of a query and focus your tuning efforts.

z/OS Cost Metrics

On z/OS, the DSN_STATEMNT_TABLE contains the overall cost estimate in the PROCSU (processor service units) and COST_CATEGORY columns. COST_CATEGORY values include:

  • A — The optimizer is confident in its estimate
  • B — The estimate may be less accurate (often due to host variables whose values are not known at bind time)

Individual step costs are not directly exposed in PLAN_TABLE, but you can infer relative costs from the access type, MATCHCOLS, and cardinality estimates.

LUW Cost Metrics — Timerons

On LUW, the optimizer expresses costs in timerons — an abstract unit that is proportional to actual execution time but not equal to it. You cannot say "this query will take 3 seconds because it costs 3,000 timerons." But you can say "this query costs 10x more than that one, so it will probably take roughly 10x longer."

The EXPLAIN_OPERATOR table includes the following cost columns for each operator:

Column           Meaning
TOTAL_COST       Cumulative cost from this operator down to the leaves
IO_COST          I/O component of the cost
CPU_COST         CPU component of the cost
FIRST_ROW_COST   Cost to produce the first row (useful for OLTP queries)
RE_TOTAL_COST    Re-execution cost (for operators inside a nested loop)

The RETURN operator at the top of the tree shows the total cost for the entire query. This is the number you compare before and after optimization.

Estimated Cardinality

Both platforms provide estimated cardinality — the number of rows the optimizer expects at each step. This is critical for diagnosing optimizer mistakes.

On z/OS, the CARDF column in DSN_FILTER_TABLE shows the estimated cardinality after applying filter predicates. On LUW, the STREAM_COUNT column in EXPLAIN_STREAM shows the estimated number of rows flowing between operators.

When the estimated cardinality is wildly wrong, the optimizer makes bad decisions. If it estimates 10 rows but the actual result is 10 million, it might choose a nested loop join (good for 10 rows) instead of a hash join (good for millions). Cardinality errors are the root cause of most bad access plans.

Common causes of cardinality errors:

  • Stale statistics — The table has grown or the data distribution has changed since RUNSTATS was last executed
  • Correlated predicates — The optimizer assumes predicates are independent, but in reality, knowing the value of one column tells you a lot about the value of another
  • Skewed data — If a column has a few very common values and many rare values, the average selectivity does not apply to all queries
  • Host variables or parameter markers — The optimizer may use average selectivity instead of the actual value
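When stale statistics or skew is the suspect, the first remedy is refreshing statistics with distribution detail. A typical LUW invocation (on z/OS, the RUNSTATS utility with its FREQVAL/HISTOGRAM options plays the same role):

```sql
-- LUW syntax; WITH DISTRIBUTION collects frequency and quantile
-- statistics so the optimizer can see skewed column values
RUNSTATS ON TABLE MERIDIAN.TRANSACTIONS
  WITH DISTRIBUTION AND DETAILED INDEXES ALL;
```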

23.8 [z/OS] PLAN_TABLE Deep Dive

Let us walk through the z/OS PLAN_TABLE columns in detail, using a concrete example from Meridian National Bank.

Setting Up the Scenario

Meridian has these tables and indexes:

-- Tables
MERIDIAN.ACCOUNTS     -- 2,000,000 rows
MERIDIAN.CUSTOMERS    -- 500,000 rows
MERIDIAN.TRANSACTIONS -- 50,000,000 rows

-- Key indexes
IX_ACCT_BRANCH   ON ACCOUNTS(BRANCH_ID, ACCT_STATUS)
IX_ACCT_CUST     ON ACCOUNTS(CUST_ID)
IX_CUST_PK       ON CUSTOMERS(CUST_ID) -- primary key
IX_TRANS_ACCT    ON TRANSACTIONS(ACCT_ID, TRANS_DATE)
IX_TRANS_DATE    ON TRANSACTIONS(TRANS_DATE)

Now consider this query:

EXPLAIN PLAN SET QUERYNO = 2001 FOR
  SELECT C.CUST_NAME, A.ACCT_ID, T.TRANS_AMT, T.TRANS_DATE
  FROM MERIDIAN.CUSTOMERS C
  JOIN MERIDIAN.ACCOUNTS A ON C.CUST_ID = A.CUST_ID
  JOIN MERIDIAN.TRANSACTIONS T ON A.ACCT_ID = T.ACCT_ID
  WHERE A.BRANCH_ID = 'BR-0042'
    AND A.ACCT_STATUS = 'ACTIVE'
    AND T.TRANS_DATE >= '2025-01-01'
  ORDER BY T.TRANS_DATE DESC;

Reading the PLAN_TABLE Rows

The EXPLAIN might produce three PLAN_TABLE rows:

Row 1 (PLANNO=1):

Column         Value            Interpretation
PLANNO         1                First step — the driving table
METHOD         0                First table access (no join yet)
TNAME          ACCOUNTS         Driving table is ACCOUNTS
ACCESSTYPE     I                Index scan
MATCHCOLS      2                Both columns (BRANCH_ID, ACCT_STATUS) matched
ACCESSNAME     IX_ACCT_BRANCH   Using the branch/status index
INDEXONLY      N                Must read data pages (need ACCT_ID, CUST_ID)
PREFETCH       S                Sequential prefetch
SORTN_JOIN     N                No sort needed on this table for join
SORTN_ORDERBY  N                No sort on this table for ORDER BY

Interpretation: DB2 starts with the ACCOUNTS table, using the IX_ACCT_BRANCH index with 2 matching columns. This is an excellent access — it will narrow down to only accounts in branch BR-0042 with ACTIVE status. Sequential prefetch suggests the qualifying index entries are expected to be somewhat contiguous.

Row 2 (PLANNO=2):

Column         Value           Interpretation
PLANNO         2               Second step
METHOD         1               Nested loop join
TNAME          TRANSACTIONS    Inner table is TRANSACTIONS
ACCESSTYPE     I               Index scan
MATCHCOLS      2               ACCT_ID (equal) and TRANS_DATE (range) matched
ACCESSNAME     IX_TRANS_ACCT   Using the account/date index
INDEXONLY      N               Must read data pages
PREFETCH       (blank)         No prefetch — expect few rows per lookup
SORTN_JOIN     N               No sort needed
SORTC_ORDERBY  N               No sort on composite yet

Interpretation: For each ACCOUNTS row from step 1, DB2 does a nested loop into TRANSACTIONS using the IX_TRANS_ACCT index. MATCHCOLS = 2 means both index columns position the scan: ACCT_ID with an equal predicate, and TRANS_DATE with the range predicate. This is an efficient probe.

Note that a range predicate can participate in matching, but only as the last matching column — matching stops at the first range predicate. If the index had a third column, predicates on it could no longer match and would at best be applied as index screening. Because the index is on (ACCT_ID, TRANS_DATE), DB2 positions at the first entry for the ACCT_ID with TRANS_DATE >= '2025-01-01' and scans forward through that account's remaining entries.

Row 3 (PLANNO=3):

Column         Value        Interpretation
PLANNO         3            Third step
METHOD         1            Nested loop join
TNAME          CUSTOMERS    Inner table is CUSTOMERS
ACCESSTYPE     I            Index scan
MATCHCOLS      1            One column (CUST_ID) matched
ACCESSNAME     IX_CUST_PK   Using the primary key index
INDEXONLY      N            Must read data page (need CUST_NAME)
PREFETCH       (blank)      No prefetch
SORTC_ORDERBY  Y            Sort the composite for ORDER BY

Interpretation: After joining ACCOUNTS and TRANSACTIONS, DB2 looks up each customer via the primary key index. MATCHCOLS = 1 on the primary key is optimal. The SORTC_ORDERBY = 'Y' on this final step tells us that the accumulated result will be sorted for the ORDER BY TRANS_DATE DESC clause.

Key Observations

The overall plan is: ACCOUNTS (index scan, 2 match cols) -> NL JOIN -> TRANSACTIONS (index scan, 2 match cols) -> NL JOIN -> CUSTOMERS (PK lookup) -> SORT for ORDER BY.

This is a reasonable plan. The two areas to consider for improvement:

  1. The final sort — If we could arrange for TRANSACTIONS to be accessed in TRANS_DATE DESC order, we could eliminate the sort. But that would conflict with the ACCT_ID-first access through the nested loop. There may be no way to avoid this sort.

  2. INDEXONLY on TRANSACTIONS — MATCHCOLS = 2 is already the maximum for this two-column index, because a range predicate can serve as the last matching column. The remaining cost is the data-page access for TRANS_AMT, which is not in the index. A wider index that also carried TRANS_AMT would make this step index-only, at the price of a larger index to maintain.

The ACCESSTYPE Dictionary

Here is the complete set of z/OS ACCESSTYPE values:

Value  Meaning
I      Index scan
I1     One-fetch index scan (at most one row, e.g., MIN/MAX on an indexed column)
N      Index scan using an IN-list predicate
R      Tablespace scan (table scan)
M      Multiple index access (start of the RID processing sequence)
MX     Index scan on one of the indexes in a multiple index access
MI     Intersection of RID lists (AND logic)
MU     Union of RID lists (OR logic)

23.9 [LUW] Access Plan Deep Dive

On LUW, the access plan is expressed as a tree of operators. Data flows from the bottom (leaf nodes — the table and index accesses) up to the top (the RETURN operator that delivers results to the application).

The Operator Tree Structure

Consider the same Meridian query from Section 23.8. On LUW, the db2expln output might look something like this (simplified):

RETURN   (op 1)    rows=140   cost=254.3   I/O=45
  SORT     (op 2)  rows=140   cost=253.8   I/O=45
    NLJOIN   (op 3)  rows=140   cost=240.1   I/O=44
      NLJOIN   (op 4)  rows=140   cost=238.7
        FETCH    (op 5)  rows=28               ACCOUNTS
          IXSCAN   (op 6)  rows=28   cost=3.2    IX_ACCT_BRANCH
        FETCH    (op 7)  rows=5                TRANSACTIONS
          IXSCAN   (op 8)  rows=5    cost=0.8    IX_TRANS_ACCT
      FETCH    (op 9)  rows=1    cost=1.4    CUSTOMERS
        IXSCAN   (op 10) rows=1    cost=0.4    IX_CUST_PK

Indentation shows the parent-child relationships: each operator feeds its rows to the operator above it, and for each NLJOIN the outer (driving) input is listed first.

Key LUW Operators

RETURN — The topmost operator. Delivers result rows to the application. Its TOTAL_COST is the cost of the entire query.

FETCH — Retrieves data rows from a table using RIDs provided by a child operator (usually IXSCAN). If you see FETCH, it means DB2 is accessing data pages — this is not index-only access.

IXSCAN — Scans an index. The properties show which index is used and which predicates are applied. If IXSCAN feeds directly into the parent (no FETCH), it is index-only access.

TBSCAN — Table scan. Reads all rows in a table (or a range partition). Check the estimated cardinality and the table size — if it is scanning a large table, this may be a problem.

NLJOIN — Nested loop join. The left child is the outer (driving) input, and the right child is the inner (lookup) input. For each row from the left, the right subtree is executed.

MSJOIN — Merge scan join. Both inputs must be sorted on the join column. Look for SORT operators below MSJOIN — if they are present, the data is not naturally sorted.

HSJOIN — Hash join. The smaller (build) input is hashed into memory, and the other (probe) input is scanned against the hash table. Check the estimated cardinalities — if the build input is unexpectedly large, the hash table may not fit in memory and will spill to disk.

SORT — Sorts the input rows. Look at the sort columns and the estimated input cardinality. A SORT on millions of rows is expensive.

TEMP — Materializes the input into a temporary table. This often occurs for subqueries, CTEs, or when the optimizer needs to re-read an intermediate result.

GRPBY — Performs GROUP BY aggregation. May operate on sorted or hashed input.

FILTER — Applies a predicate to filter rows. Typically used for predicates that cannot be pushed down to the table or index access level.

UNION — Combines results from multiple inputs. UNION (with duplicate elimination) may include a SORT; UNION ALL does not.

RIDSCN — RID scan operator for list prefetch operations.

Reading the Operator Tree — Bottom Up

Always read the LUW operator tree from bottom to top, following the data flow:

  1. Start with the leaf nodes (IXSCAN, TBSCAN)
  2. Note which indexes or tables they access
  3. Follow the data upward through FETCH, SORT, and join operators
  4. Check estimated cardinalities at each level — do they make sense?
  5. End at RETURN, which shows the total cost

At each operator, ask yourself: "Is this the right operation here? Is the estimated cardinality reasonable? Is there a cheaper alternative?"
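In practice, most DBAs read the LUW plan from db2exfmt output rather than by querying the explain tables directly. A typical invocation (the database name MERIDIAN is illustrative here):

```shell
# Format the most recently explained statement from the explain tables.
# -d names the database, -1 takes default options (latest explained
# statement), -o writes the formatted plan to a file.
db2exfmt -d MERIDIAN -1 -o plan.txt
```

The output file contains the operator tree, the per-operator details (arguments, predicates, estimated streams), and the optimized SQL text, all in one place.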


23.10 Common EXPLAIN Patterns

After reading hundreds (or thousands) of access plans, you start to recognize patterns. Here are the most important ones, categorized by quality.

The Good

Pattern: Index-only access on a covering index.
What you see: IXSCAN with no FETCH (LUW), or INDEXONLY = 'Y' (z/OS).
Why it is good: The query is satisfied entirely from the index. No data page I/O.
When you see it: COUNT queries, existence checks, queries selecting only indexed columns.
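As a sketch (the index name and column order here are illustrative, not from the Meridian schema), a covering index turns a column-limited lookup into index-only access:

```sql
-- Illustrative covering index: every referenced column lives in the key
CREATE INDEX MERIDIAN.IX_ACCT_BRANCH_COVER
  ON MERIDIAN.ACCOUNTS (BRANCH_ID, ACCT_STATUS, ACCT_ID, BALANCE);

-- Satisfied entirely from the index: INDEXONLY = 'Y' on z/OS,
-- IXSCAN with no FETCH on LUW
SELECT ACCT_ID, BALANCE
FROM MERIDIAN.ACCOUNTS
WHERE BRANCH_ID = 'BR-0042'
  AND ACCT_STATUS = 'ACTIVE';
```

The trade-off is a wider index to maintain on every insert and update, so reserve this technique for queries that run often enough to justify it.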

Pattern: Matching index scan with high MATCHCOLS.
What you see: MATCHCOLS equals the number of equality predicates on leading index columns (z/OS), or IXSCAN with start/stop key predicates (LUW).
Why it is good: DB2 positions directly to the qualifying entries — minimal index scanning.
When you see it: Point lookups, primary key access, well-designed composite indexes.

Pattern: Nested loop join with index lookup on inner table.
What you see: METHOD = 1 / NLJOIN, with the inner table accessed via an index.
Why it is good: Each outer row drives a cheap index lookup. Total cost is proportional to the number of outer rows.
When you see it: OLTP queries joining a few tables with good indexes.

Pattern: Hash join on large tables.
What you see: HSJOIN (LUW), with both inputs being large scans or filtered results. (DB2 for z/OS does not use hash joins; on z/OS the analogous large-join patterns are merge scan join, METHOD = 2, and hybrid join, METHOD = 4.)
Why it is good: Hash join is efficient for large-scale equi-joins when no index is useful.
When you see it: Data warehouse queries, batch reports, complex analytical queries.

The Bad

Pattern: Table scan on a large table with selective predicates.
What you see: ACCESSTYPE = 'R' / TBSCAN on a table with millions of rows, where the predicate should filter most of them.
Why it is bad: Reading all rows when only a fraction qualify wastes enormous I/O and CPU.
What to do: Check for a suitable index. Check if the predicate is sargable. Check if statistics are current.

Pattern: Sort of a large intermediate result.
What you see: SORT operator with millions of estimated input rows.
Why it is bad: Large sorts consume memory and may overflow to disk.
What to do: Check if an index could provide the data in the required order. Consider whether the sort is truly necessary (e.g., can the ORDER BY be eliminated?).

Pattern: Merge join requiring sorts on both inputs.
What you see: METHOD = 2 / MSJOIN, with SORT operators on both inputs (or SORTN_JOIN = 'Y' and SORTC_JOIN = 'Y' on the join step on z/OS).
Why it is bad: Sorting both inputs is expensive, especially for large tables.
What to do: Consider whether indexes on the join columns could eliminate the sorts. If not, a hash join might be cheaper on LUW.

Pattern: Filter operator high in the tree.
What you see: FILTER operator applied after a join, with a large number of input rows.
Why it is bad: The filter should have been pushed down to reduce rows before the join.
What to do: Check why the predicate was not pushed down — it may be a complex expression or a correlated subquery predicate.

The Ugly

Pattern: Nested loop join with inner table scan.
What you see: METHOD = 1 / NLJOIN, with the inner table accessed via ACCESSTYPE = 'R' / TBSCAN.
Why it is catastrophic: For each outer row, DB2 scans the entire inner table. If the outer produces 1,000 rows and the inner has 1,000,000 rows, that is 1 billion rows read.
What to do: Create an index on the inner table's join column. This single change can transform a query from hours to milliseconds.

Pattern: Cartesian product (cross join).
What you see: A join step with no join predicate — every row from one table is combined with every row from the other.
Why it is catastrophic: 1,000 rows x 1,000,000 rows = 1 billion result rows.
What to do: Check for a missing join condition in the query. This is almost always a bug.

Pattern: Repeated SORT-TEMP-SORT chains.
What you see: Multiple consecutive SORT and TEMP operators, suggesting the data is being materialized and re-sorted several times.
Why it is bad: Each materialization and sort adds significant cost.
What to do: Simplify the query, consider CTEs with materialization hints, or restructure the query to reduce intermediate sorts.


23.11 EXPLAIN-Driven Optimization Workflow

Reading EXPLAIN is not a one-time activity. It is part of a disciplined workflow that you should follow every time you encounter a performance problem.

Step 1: Capture the Current Plan

Before changing anything, capture the EXPLAIN output for the problematic query. This is your baseline. Save it — you will compare against it later.

-- z/OS
EXPLAIN PLAN SET QUERYNO = 9001 FOR <your query>;

-- LUW
SET CURRENT EXPLAIN MODE = EXPLAIN;
<your query>;
SET CURRENT EXPLAIN MODE = NO;
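One simple way to preserve the z/OS baseline — a local convention of ours, not a built-in DB2 feature — is to copy the explained rows into a history table you own:

```sql
-- PLAN_HISTORY is our own baseline archive, cloned from PLAN_TABLE
CREATE TABLE PLAN_HISTORY LIKE PLAN_TABLE;

-- Preserve the baseline rows for the problem query
INSERT INTO PLAN_HISTORY
  SELECT * FROM PLAN_TABLE
  WHERE QUERYNO = 9001;
```

On LUW, saving the db2exfmt output file serves the same purpose.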

Step 2: Analyze the Plan

Read through the access plan systematically:

  1. Check the access type for each table. Any unexpected table scans?
  2. Check MATCHCOLS / index usage. Are the right indexes being used? Are enough columns matching?
  3. Check the join methods. Are they appropriate for the data volumes?
  4. Check for sorts. Can any be eliminated with indexes?
  5. Check estimated cardinalities. Do they match your knowledge of the data?
  6. Check the overall cost. Which step is the most expensive?

Step 3: Identify the Problem

Based on your analysis, identify the most significant problem. Common root causes include:

  • Missing index
  • Non-sargable predicate
  • Stale statistics
  • Poor join order
  • Unnecessary sort
  • Suboptimal join method
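On LUW, one quick check for the stale-statistics root cause is the catalog itself: SYSCAT.TABLES records the row count the optimizer believes (CARD) and the timestamp of the last RUNSTATS (STATS_TIME). If STATS_TIME is old, or CARD is far from the real row count, statistics are suspect:

```sql
-- When were statistics last collected, and what row counts do they claim?
SELECT TABNAME, CARD, STATS_TIME
FROM SYSCAT.TABLES
WHERE TABSCHEMA = 'MERIDIAN'
ORDER BY STATS_TIME;
```

A CARD of -1 means statistics have never been collected for that table, which is an even louder warning than a stale timestamp.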

Step 4: Hypothesize a Fix

Formulate a specific hypothesis: "If I create an index on TRANSACTIONS(ACCT_ID, TRANS_DATE, TRANS_AMT), the optimizer will use index-only access for the transaction lookup and eliminate the FETCH."

Step 5: Test the Fix

Implement the change in a test environment and re-capture the EXPLAIN output:

-- Create the proposed index
CREATE INDEX IX_TRANS_ACCT_V2
  ON MERIDIAN.TRANSACTIONS(ACCT_ID, TRANS_DATE, TRANS_AMT);

-- Update statistics
-- z/OS: RUNSTATS
-- LUW: RUNSTATS ON TABLE MERIDIAN.TRANSACTIONS AND INDEXES ALL

-- Re-explain
EXPLAIN PLAN SET QUERYNO = 9002 FOR <your query>;

Step 6: Verify the Improvement

Compare the new plan against the baseline:

  • Did the access type change as expected?
  • Did the estimated cost decrease?
  • Did the problematic sort or scan disappear?

Also run the query and measure actual elapsed time, CPU time, and I/O. EXPLAIN estimates are not always proportional to actual performance, especially when buffer pool hit ratios or system load differ from what the optimizer assumed.
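When the plan shape is unchanged, you can diff the two z/OS captures step by step with a self-join of PLAN_TABLE on the step numbers (QUERYNO 9001 is the baseline and 9002 the re-explain from Step 5; if the optimizer changed the join order or the number of steps, compare the outputs side by side instead):

```sql
-- Step-by-step comparison of baseline (9001) vs. new (9002) plan
SELECT B.QBLOCKNO, B.PLANNO, B.TNAME,
       B.ACCESSTYPE AS OLD_ACC, N.ACCESSTYPE AS NEW_ACC,
       B.ACCESSNAME AS OLD_IX,  N.ACCESSNAME AS NEW_IX,
       B.MATCHCOLS  AS OLD_MC,  N.MATCHCOLS  AS NEW_MC,
       B.INDEXONLY  AS OLD_IO,  N.INDEXONLY  AS NEW_IO
FROM PLAN_TABLE B
JOIN PLAN_TABLE N
  ON  N.QBLOCKNO = B.QBLOCKNO
  AND N.PLANNO   = B.PLANNO
WHERE B.QUERYNO = 9001
  AND N.QUERYNO = 9002
ORDER BY B.QBLOCKNO, B.PLANNO;
```

Any row where OLD and NEW columns differ is exactly where your change took effect — or where the optimizer surprised you.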

Step 7: Document and Monitor

Record what you found, what you changed, and what improved. Monitor the query in production to confirm the improvement persists. Set up alerts for access plan regression (see Case Study 2 for a real example of this).


23.12 Meridian Bank EXPLAIN Analysis

Let us apply everything we have learned to five Meridian National Bank queries, ranging from simple to complex. For each query, we will examine the EXPLAIN output on both platforms and discuss what the plan tells us.

Query 1: Simple Customer Lookup

SELECT CUST_ID, CUST_NAME, CUST_EMAIL, CUST_PHONE
FROM MERIDIAN.CUSTOMERS
WHERE CUST_ID = 'C-00048291';

Expected plan: This is a primary key lookup. The optimizer should use IX_CUST_PK with MATCHCOLS = 1 (z/OS) or IXSCAN on the primary key index feeding a FETCH (LUW). The plan should return exactly one row.

z/OS PLAN_TABLE:

PLANNO  METHOD  TNAME      ACCESSTYPE  MATCHCOLS  ACCESSNAME  INDEXONLY
1       0       CUSTOMERS  I           1          IX_CUST_PK  N

INDEXONLY = 'N' because we need CUST_NAME, CUST_EMAIL, and CUST_PHONE, which are not in the index. The single row FETCH from the data page is trivially cheap.

LUW Operator Tree:

RETURN -> FETCH -> IXSCAN (IX_CUST_PK)

Verdict: This is a textbook-perfect plan. No optimization needed.

Query 2: Join Query — Account Balances by Branch

SELECT C.CUST_NAME, A.ACCT_ID, A.ACCT_TYPE, A.BALANCE
FROM MERIDIAN.ACCOUNTS A
JOIN MERIDIAN.CUSTOMERS C ON A.CUST_ID = C.CUST_ID
WHERE A.BRANCH_ID = 'BR-0042'
  AND A.ACCT_STATUS = 'ACTIVE'
ORDER BY A.BALANCE DESC;

z/OS PLAN_TABLE:

PLANNO  METHOD  TNAME      ACCESSTYPE  MATCHCOLS  ACCESSNAME      INDEXONLY  SORTN_ORDERBY  SORTC_ORDERBY
1       0       ACCOUNTS   I           2          IX_ACCT_BRANCH  N          N              N
2       1       CUSTOMERS  I           1          IX_CUST_PK      N          N              Y

Interpretation:

  • Step 1: Access ACCOUNTS via IX_ACCT_BRANCH with 2 matching columns — excellent selectivity.
  • Step 2: Nested loop join into CUSTOMERS via the primary key — optimal for a small number of outer rows.
  • SORTC_ORDERBY = 'Y' on step 2: the composite result will be sorted for ORDER BY BALANCE DESC.

The sort on BALANCE is unavoidable unless we create an index that includes BALANCE in the right position (which would be impractical here since we are filtering by BRANCH_ID first).

Estimated rows from step 1: If branch BR-0042 has approximately 300 active accounts, the sort is on a small result set and will be very fast.

Verdict: Good plan. The sort is on a small result set, so it is not a concern.

Query 3: Aggregation — Transaction Summary

SELECT A.BRANCH_ID,
       COUNT(*) AS TRANS_COUNT,
       SUM(T.TRANS_AMT) AS TOTAL_AMT,
       AVG(T.TRANS_AMT) AS AVG_AMT
FROM MERIDIAN.TRANSACTIONS T
JOIN MERIDIAN.ACCOUNTS A ON T.ACCT_ID = A.ACCT_ID
WHERE T.TRANS_DATE BETWEEN '2025-01-01' AND '2025-03-31'
GROUP BY A.BRANCH_ID
ORDER BY TOTAL_AMT DESC;

LUW Operator Tree (simplified):

RETURN
  SORT  (ORDER BY TOTAL_AMT DESC)
    GRPBY  (GROUP BY BRANCH_ID)
      SORT  (on BRANCH_ID, to feed the GROUP BY)
        HSJOIN
          FETCH  TRANSACTIONS
            IXSCAN  IX_TRANS_DATE
          TBSCAN  ACCOUNTS

Interpretation:

  • TRANSACTIONS is accessed via IX_TRANS_DATE — a range scan for the three-month period.
  • ACCOUNTS is accessed via a table scan (TBSCAN). This might seem alarming, but ACCOUNTS has only 2,000,000 rows, and every account might be touched, since we are grouping by BRANCH_ID across all branches.
  • HSJOIN is used because both inputs are large — the TRANSACTIONS range scan might return millions of rows.
  • SORT for the GROUP BY, then another SORT for the ORDER BY.

Potential improvement: If the TRANSACTIONS range scan returns a very large number of rows (say 5 million in three months), the hash join and sorts will be expensive. Consider:

  1. A composite index on TRANSACTIONS(TRANS_DATE, ACCT_ID, TRANS_AMT) for index-only access on the TRANSACTIONS side
  2. Pre-aggregating at the account level before joining to ACCOUNTS, to reduce the join cardinality
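The second improvement can be sketched as a rewrite. Verify with EXPLAIN that the pre-aggregation actually shrinks the join input, and note that AVG must be recomputed as a weighted average, not an average of per-account averages:

```sql
-- Pre-aggregate TRANSACTIONS per account, then join the smaller result
SELECT A.BRANCH_ID,
       SUM(TA.CNT)               AS TRANS_COUNT,
       SUM(TA.AMT)               AS TOTAL_AMT,
       SUM(TA.AMT) / SUM(TA.CNT) AS AVG_AMT  -- weighted, not AVG of AVGs
FROM (
  SELECT T.ACCT_ID,
         COUNT(*)         AS CNT,
         SUM(T.TRANS_AMT) AS AMT
  FROM MERIDIAN.TRANSACTIONS T
  WHERE T.TRANS_DATE BETWEEN '2025-01-01' AND '2025-03-31'
  GROUP BY T.ACCT_ID
) TA
JOIN MERIDIAN.ACCOUNTS A ON TA.ACCT_ID = A.ACCT_ID
GROUP BY A.BRANCH_ID
ORDER BY TOTAL_AMT DESC;
```

The inner GROUP BY collapses millions of transactions to one row per account before the join, so the hash join builds and probes a far smaller input.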

Verdict: Acceptable for a batch report. If this runs during business hours and affects OLTP performance, the index-only optimization is worth pursuing.

Query 4: Subquery — Customers with High-Value Transactions

SELECT C.CUST_ID, C.CUST_NAME
FROM MERIDIAN.CUSTOMERS C
WHERE EXISTS (
  SELECT 1 FROM MERIDIAN.ACCOUNTS A
  JOIN MERIDIAN.TRANSACTIONS T ON A.ACCT_ID = T.ACCT_ID
  WHERE A.CUST_ID = C.CUST_ID
    AND T.TRANS_AMT > 100000.00
    AND T.TRANS_DATE >= '2025-01-01'
);

z/OS PLAN_TABLE:

PLANNO  QBLOCKNO  METHOD  TNAME         ACCESSTYPE  MATCHCOLS  ACCESSNAME     INDEXONLY
1       1         0       CUSTOMERS     R           0          (none)         N
2       2         0       ACCOUNTS      I           1          IX_ACCT_CUST   N
3       2         1       TRANSACTIONS  I           1          IX_TRANS_ACCT  N

Interpretation:

  • QBLOCKNO = 1: The outer query scans CUSTOMERS via tablespace scan (ACCESSTYPE = 'R'). This is concerning if CUSTOMERS is large.
  • QBLOCKNO = 2: The correlated subquery. For each customer, it looks up ACCOUNTS via IX_ACCT_CUST (MATCHCOLS = 1 on CUST_ID), then nested-loop joins to TRANSACTIONS via IX_TRANS_ACCT (MATCHCOLS = 1 on ACCT_ID).
  • The EXISTS subquery uses early-out logic — it stops as soon as one qualifying row is found.

The table scan on CUSTOMERS means we are checking the subquery for every customer — all 500,000 of them. Even though the subquery is efficient (two index lookups), executing it 500,000 times adds up.

Potential improvements:

  1. If only a small fraction of customers have high-value transactions, consider rewriting as a join with DISTINCT, or using a semi-join hint if available.
  2. On DB2 12 for z/OS and later, the optimizer may automatically transform the EXISTS subquery into a join. Check whether it did by examining QBLOCKNO in PLAN_TABLE — if ACCOUNTS and TRANSACTIONS appear in the same query block as CUSTOMERS, the subquery was merged into the outer query.
  3. If the optimizer is not transforming it, check statistics on TRANSACTIONS for the TRANS_AMT > 100000 predicate. If the filter factor is very small (few transactions exceed $100K), the optimizer may not realize that only a handful of customers qualify.
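The join-with-DISTINCT rewrite from the first improvement looks like this. Whether it beats the EXISTS form depends on how many customers qualify, so EXPLAIN and measure both versions before committing:

```sql
-- The EXISTS semi-join expressed as an explicit join with DISTINCT
SELECT DISTINCT C.CUST_ID, C.CUST_NAME
FROM MERIDIAN.CUSTOMERS C
JOIN MERIDIAN.ACCOUNTS A     ON A.CUST_ID = C.CUST_ID
JOIN MERIDIAN.TRANSACTIONS T ON T.ACCT_ID = A.ACCT_ID
WHERE T.TRANS_AMT > 100000.00
  AND T.TRANS_DATE >= '2025-01-01';
```

This form lets the optimizer start from the highly selective TRANS_AMT predicate and join outward to CUSTOMERS, instead of probing the subquery once per customer.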

Verdict: Watch the actual execution time. The table scan on CUSTOMERS is a yellow flag, but if the EXISTS early-out is effective (most customers have at least one high-value transaction, or the subquery is very fast), it may be acceptable.

Query 5: Complex Report — Branch Performance Dashboard

SELECT B.BRANCH_NAME,
       B.BRANCH_CITY,
       ACCT_STATS.TOTAL_ACCOUNTS,
       ACCT_STATS.TOTAL_BALANCE,
       TRANS_STATS.MONTHLY_TRANS_COUNT,
       TRANS_STATS.MONTHLY_TRANS_VOLUME,
       LOAN_STATS.ACTIVE_LOANS,
       LOAN_STATS.TOTAL_OUTSTANDING
FROM MERIDIAN.BRANCHES B
LEFT JOIN LATERAL (
  SELECT COUNT(*) AS TOTAL_ACCOUNTS,
         SUM(A.BALANCE) AS TOTAL_BALANCE
  FROM MERIDIAN.ACCOUNTS A
  WHERE A.BRANCH_ID = B.BRANCH_ID
    AND A.ACCT_STATUS = 'ACTIVE'
) ACCT_STATS ON 1=1
LEFT JOIN LATERAL (
  SELECT COUNT(*) AS MONTHLY_TRANS_COUNT,
         SUM(T.TRANS_AMT) AS MONTHLY_TRANS_VOLUME
  FROM MERIDIAN.TRANSACTIONS T
  JOIN MERIDIAN.ACCOUNTS A2 ON T.ACCT_ID = A2.ACCT_ID
  WHERE A2.BRANCH_ID = B.BRANCH_ID
    AND T.TRANS_DATE >= CURRENT DATE - 30 DAYS
) TRANS_STATS ON 1=1
LEFT JOIN LATERAL (
  SELECT COUNT(*) AS ACTIVE_LOANS,
         SUM(L.OUTSTANDING_AMT) AS TOTAL_OUTSTANDING
  FROM MERIDIAN.LOANS L
  JOIN MERIDIAN.ACCOUNTS A3 ON L.ACCT_ID = A3.ACCT_ID
  WHERE A3.BRANCH_ID = B.BRANCH_ID
    AND L.LOAN_STATUS = 'ACTIVE'
) LOAN_STATS ON 1=1
ORDER BY TRANS_STATS.MONTHLY_TRANS_VOLUME DESC NULLS LAST;

This is a complex report with three lateral subqueries. The EXPLAIN output will be large, so let us focus on the key aspects.

LUW Operator Tree (high-level structure):

RETURN
  SORT  (ORDER BY MONTHLY_TRANS_VOLUME DESC)
    NLJOIN  (join to LOAN_STATS lateral)
      NLJOIN  (join to TRANS_STATS lateral)
        NLJOIN  (join to ACCT_STATS lateral)
          TBSCAN  BRANCHES
          GRPBY
            IXSCAN or TBSCAN  (ACCOUNTS filtered by BRANCH_ID)
        GRPBY
          HSJOIN or NLJOIN  (TRANSACTIONS x ACCOUNTS)
      GRPBY
        HSJOIN or NLJOIN  (LOANS x ACCOUNTS)

Key observations:

  1. BRANCHES is the outer driver — this is correct, as there are relatively few branches (perhaps 50-200). Each branch drives three lateral subquery evaluations.

  2. The ACCOUNTS lateral subquery should use IX_ACCT_BRANCH with 2 matching columns (BRANCH_ID, ACCT_STATUS = 'ACTIVE'). If it uses a table scan, that is a problem.

  3. The TRANSACTIONS lateral subquery is the most expensive part. For each branch, it must join TRANSACTIONS and ACCOUNTS to find recent transactions for that branch's accounts. If there is no index on ACCOUNTS(BRANCH_ID) that enables efficient access, this subquery will be slow.

  4. The LOANS lateral subquery has similar structure to the TRANSACTIONS one.

  5. The final SORT is on the result of all joins, which has at most a few hundred rows (one per branch). This is trivially cheap.

Potential improvements:

  1. Ensure IX_ACCT_BRANCH index is being used in all three lateral subqueries
  2. For the TRANSACTIONS subquery, consider a materialized summary table that pre-aggregates monthly transaction volumes by branch
  3. If this report runs regularly, consider creating a materialized query table (MQT) that the optimizer can route to automatically
  4. The lateral join pattern forces nested-loop execution. If the number of branches is large, consider rewriting with standard GROUP BY joins
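The MQT suggestion from the third improvement might be sketched as a REFRESH DEFERRED table at daily granularity (the table name is illustrative). Keeping TRANS_DATE in the GROUP BY, rather than embedding CURRENT DATE in the MQT definition, lets the dashboard query filter the last 30 days itself; routing to a deferred-refresh MQT also requires the CURRENT REFRESH AGE register to be set to ANY:

```sql
-- Illustrative MQT: daily transaction volume per branch
CREATE TABLE MERIDIAN.MQT_BRANCH_DAILY_TRANS AS (
  SELECT A.BRANCH_ID,
         T.TRANS_DATE,
         COUNT(*)         AS TRANS_COUNT,
         SUM(T.TRANS_AMT) AS TRANS_VOLUME
  FROM MERIDIAN.TRANSACTIONS T
  JOIN MERIDIAN.ACCOUNTS A ON T.ACCT_ID = A.ACCT_ID
  GROUP BY A.BRANCH_ID, T.TRANS_DATE
)
DATA INITIALLY DEFERRED REFRESH DEFERRED;

-- Populate and then maintain it on a schedule (e.g., nightly)
REFRESH TABLE MERIDIAN.MQT_BRANCH_DAILY_TRANS;
```

With roughly one row per branch per day, the MQT stays tiny, and the TRANS_STATS lateral collapses to a scan of a few thousand pre-aggregated rows.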

Verdict: This plan is acceptable for a batch dashboard report that runs once daily or on-demand. The nested loop over branches with indexed subquery lookups is efficient because the branch count is small. However, if this becomes a frequently-executed query, pre-aggregation is the better strategy.


Spaced Review

From Chapter 15: Transaction Isolation and Locking

Question: You capture an EXPLAIN plan and see TSLOCKMODE = 'IS' on a tablespace scan. What isolation level is in effect, and what does this mean for concurrent access?

Answer: TSLOCKMODE = 'IS' means Intent Share, indicating a read operation at Cursor Stability or Read Stability isolation. Intent Share on the tablespace allows other transactions to read and update other pages concurrently. Individual page or row locks will be acquired as rows are accessed. If you needed Repeatable Read, you would see TSLOCKMODE = 'S' (Share) on the tablespace.

From Chapter 17: RUNSTATS and Catalog Statistics

Question: You see an EXPLAIN plan where the optimizer estimates 100 rows from a table scan, but you know the actual result is 500,000 rows. What is the most likely cause, and how does it affect the plan quality?

Answer: The most likely cause is stale statistics. If RUNSTATS has not been run since the table grew significantly, the optimizer's cardinality estimates will be based on the old row count and old column statistics. An underestimate of 100 rows (instead of 500,000) would cause the optimizer to choose plans optimized for small results — like nested loop joins instead of hash joins, or to undersize sort buffers. The fix is to run RUNSTATS with current data and rebind affected packages.

From Chapter 22: Index Design

Question: Your EXPLAIN shows MATCHCOLS = 1 on a three-column composite index. The query has equality predicates on all three columns. What might explain the low MATCHCOLS value?

Answer: Several possibilities: (1) The predicates are not on the leading columns of the index — if the index is on (A, B, C) and the query has predicates on B and C but not A, MATCHCOLS will be 0; the B and C predicates can still be applied as index screening, but they cannot match. (2) The predicates use expressions or functions that prevent matching (e.g., WHERE UPPER(COL_A) = 'X'). (3) The optimizer chose a different index that only matches one column, possibly because statistics suggest it is cheaper. (4) On z/OS, certain predicate types (LIKE with a host variable, some BETWEEN patterns) may limit matching. Check DSN_PREDICAT_TABLE to see how each predicate was classified.


Summary

EXPLAIN is not a debugging tool you use when things go wrong. It is a planning tool you use before things go wrong. Every SQL statement that matters should be explained, analyzed, and understood before it runs in production.

The key concepts from this chapter:

Access types range from the excellent (index-only access) to the catastrophic (nested loop with inner table scan). The access type tells you whether DB2 is using your indexes effectively.

Join methods — nested loop, merge scan, and hash join — each have scenarios where they excel and scenarios where they struggle. The EXPLAIN output tells you which was chosen and helps you determine whether it was the right choice.

Sort operations are expensive and should be eliminated when possible through appropriate index design. The EXPLAIN output shows you exactly where sorts occur and why.

Cost estimates and estimated cardinalities help you gauge the optimizer's confidence and detect cases where stale statistics are leading to bad decisions.

The optimization workflow — capture, analyze, identify, hypothesize, test, verify — is a disciplined process that produces reliable results. Guessing is not a strategy.

In the next chapter, we will build on this foundation by exploring advanced optimization techniques — rewriting queries for better plans, using optimization profiles, and managing workload-level performance.


Key Terms

Access plan — The strategy DB2 chooses to execute a SQL statement, including access methods, join methods, join order, and sort operations

PLAN_TABLE — The z/OS table that stores EXPLAIN output, with one row per access step

EXPLAIN tables — The set of tables on LUW that store detailed access plan information

Timeron — An abstract cost unit used by the LUW optimizer; proportional to but not equal to elapsed time

MATCHCOLS — The number of leading index key columns matched by predicates (z/OS); equality predicates on leading columns, optionally followed by one range predicate

Index-only access — An access method where the query can be satisfied entirely from the index without reading data pages

Table scan / tablespace scan — Reading every row in a table sequentially — cheap for small tables, expensive for large ones

List prefetch — Collecting RIDs from an index scan, sorting them by page number, and fetching data pages in sequential order

Nested loop join — A join method where each row from the outer table drives a lookup into the inner table

Merge scan join — A join method where both inputs are sorted on the join column and merged in a single pass

Hash join — A join method where the smaller input is hashed into memory and the larger input probes the hash table

Cardinality estimate — The optimizer's prediction of how many rows an operation will produce

Filter factor — The fraction of rows that a predicate is expected to qualify (0.0 = no rows qualify, 1.0 = all rows qualify)

Sort overflow — When a sort operation does not fit in memory and spills to temporary disk storage

Sargable predicate — A predicate that can be evaluated during index scan or data page access (Search ARGument ABLE)

Operator tree — The LUW representation of an access plan as a tree of data-flow operators

Covering index — An index that contains all columns needed by a query, enabling index-only access

Access path regression — A situation where a previously efficient access plan becomes inefficient, typically after REBIND, statistics update, or DB2 migration