> "The optimizer is the most important piece of software you'll never write. It decides how every query runs. Understanding it is the single most valuable skill a DBA can develop."
In This Chapter
- 22.1 Rule-Based vs. Cost-Based Optimization
- 22.2 The Optimizer's Decision Process
- 22.3 Filter Factors and Cardinality Estimation
- 22.4 Access Path Types
- 22.5 Join Methods
- 22.6 [z/OS] Stage 1 and Stage 2 Predicates
- 22.7 Query Rewrite Optimization
- 22.8 Catalog Statistics and the Optimizer
- 22.9 Why Access Paths Change
- 22.10 Influencing the Optimizer
- 22.11 The Optimizer and the Meridian Bank
- Spaced Review: Connecting to Earlier Chapters
- Chapter Summary
- What Comes Next
Chapter 22: How the DB2 Optimizer Works — Cost-Based Optimization, Access Path Selection, and Why Your Query Plan Changed
You have been writing SQL since Chapter 5. You have designed schemas, built indexes, loaded data, and configured buffer pools. Every time you pressed Enter on a query, something decided how that query would execute — which indexes to use, in what order to access the tables, whether to sort or hash, whether to read pages sequentially or skip around the disk. That something is the DB2 optimizer.
The optimizer is not a configuration parameter you set and forget. It is not a black box that "just works." It is a sophisticated piece of software that makes hundreds of micro-decisions for every SQL statement, and those decisions are based on mathematics — on statistics, cost formulas, and combinatorial search. When a query runs fast, the optimizer made good decisions. When a query runs slowly, the optimizer may have made a poor decision — but more often, you gave it poor information to work with.
This chapter takes you inside the optimizer. We will trace the complete decision process from the moment your SQL text arrives to the moment an executable access plan is produced. We will examine how DB2 estimates the cost of every possible execution strategy, how it narrows millions of possibilities down to one plan, and why that plan can change unexpectedly. By the end, you will be able to look at an access plan and understand not just what the optimizer chose, but why it chose it — and what you can do when it chooses wrong.
Learning Objectives
After completing this chapter, you will be able to:
- Explain cost-based optimization and how DB2's optimizer makes decisions using catalog statistics, cost formulas, and search algorithms.
- Understand filter factor estimation and cardinality estimation, including the uniform distribution assumption and its limitations.
- Describe the access paths available to the optimizer — tablespace scan, index access, list prefetch, MQT rewrite — on both z/OS and LUW.
- Identify why access paths change after RUNSTATS, REBIND, or version migration.
- Understand the role of catalog statistics in optimizer decisions and know which statistics matter most.
- Relate optimizer behavior to the Meridian National Bank workload and predict how the optimizer will handle the bank's typical queries.
22.1 Rule-Based vs. Cost-Based Optimization
Before we examine DB2's optimizer in detail, we need to understand the two fundamental approaches to query optimization — because DB2 did not always work the way it does today, and understanding the evolution explains the design.
22.1.1 The Rule-Based Era
The earliest relational database optimizers used rule-based optimization (RBO). The idea was simple: define a fixed set of rules that rank access methods by desirability, and always pick the highest-ranked method that applies.
A rule-based optimizer might have rules like:
- If a unique index matches the predicate exactly, use it. (Best)
- If a non-unique index matches, use it.
- If a composite index has a matching leading column, use it.
- Otherwise, do a table scan. (Worst)
Rule-based optimization has the virtue of predictability. Given the same query and the same indexes, you always get the same plan. DBAs could memorize the rules and reliably predict optimizer behavior.
But rule-based optimization has a fatal flaw: it ignores data distribution. Consider this query:
SELECT * FROM ACCOUNT WHERE STATUS = 'ACTIVE';
If 99.5% of accounts are active, a table scan is almost certainly faster than an index scan — the index adds overhead (reading index pages, then randomly reading data pages) for no selectivity benefit. But a rule-based optimizer does not know that 99.5% of rows match. It sees an index on STATUS and blindly uses it, because the rules say "index access beats table scan."
Conversely, consider:
SELECT * FROM ACCOUNT WHERE STATUS = 'FROZEN';
If only 12 out of 2 million accounts are frozen, the index is spectacularly efficient. The rule-based optimizer uses the index here too — but for the right reason only by accident.
Oracle used rule-based optimization as its default through Oracle 9i (released 2001), and many Oracle DBAs became expert "rule memorizers." IBM chose a different path for DB2 from the very beginning.
22.1.2 The Cost-Based Revolution
DB2 has used cost-based optimization (CBO) since its initial release on z/OS in 1983. This was a foundational design decision made by the IBM San Jose Research Laboratory team, and it has shaped every aspect of DB2's architecture.
A cost-based optimizer does not follow fixed rules. Instead, it:
- Enumerates possible execution strategies (access paths, join methods, join orders).
- Estimates the cost of each strategy using a mathematical model.
- Selects the strategy with the lowest estimated cost.
The "cost" is a composite metric. In DB2, the optimizer's cost model accounts for:
| Cost Component | What It Measures |
|---|---|
| I/O cost | Number of page reads from disk (random and sequential, weighted differently) |
| CPU cost | Processor instructions for evaluating predicates, sorting, hashing, copying rows |
| Sort cost | The cost of materializing intermediate sort work files |
| Communication cost | For distributed queries, the cost of sending data between nodes |
On z/OS, the optimizer expresses cost in internal units called timerons — an abstract unit that blends I/O and CPU cost. LUW uses the same unit for its blended cost metric. Despite the name, timerons do not map to wall-clock seconds on either platform; they are meaningful only for comparing candidate plans against each other.
The key insight is that cost-based optimization requires data about the data — statistics. The optimizer must know:
- How many rows are in each table.
- How many distinct values exist in each column.
- How values are distributed (uniformly? skewed?).
- How many leaf pages are in each index.
- How data is physically clustered.
This information comes from the DB2 catalog — the system tables that store metadata about every object in the database. The RUNSTATS utility (z/OS) or RUNSTATS command (LUW) collects these statistics from the actual data and stores them in the catalog. The optimizer reads the catalog at optimization time and feeds the statistics into its cost formulas.
22.1.3 What "Cost" Really Means
Let us be precise about the optimizer's cost model, because misunderstanding it leads to confusion.
The optimizer does not try to predict the exact elapsed time of a query. It cannot — elapsed time depends on concurrent workload, buffer pool hit ratios at runtime, storage subsystem caching, and dozens of other factors the optimizer cannot know at plan-selection time.
Instead, the optimizer computes a relative cost for each candidate plan. The cost is accurate enough to rank plans: if Plan A has a cost of 500 and Plan B has a cost of 50,000, the optimizer can confidently say Plan A is better. The absolute numbers are not meaningful on their own — you cannot say "a cost of 500 means the query will run in 500 milliseconds." But the ratios are meaningful: Plan B is estimated to be roughly 100 times more expensive than Plan A.
The cost model uses parameters that represent the hardware environment:
- SEQIOCOST: Cost of a sequential I/O operation (reading the next page in sequence).
- RANDIOCOST: Cost of a random I/O operation (seeking to a non-sequential page).
- CPU speed factor: How fast the processor can evaluate instructions.
On z/OS, these values are calibrated during installation and stored in the DSNZPARM configuration. On LUW, they are configured at the database level (the DB2_OVERRIDE_BPF registry variable and the CPUSPEED and COMM_BANDWIDTH database configuration parameters, among others).
Platform Note — z/OS: On z/OS, the optimizer also factors in the type of storage hardware. zHyperLink and zHyperWrite-capable storage devices have different I/O cost characteristics than traditional FICON-attached storage. DB2 12 and later versions can detect these capabilities and adjust cost estimates accordingly.
22.1.4 Why Cost-Based Won
Cost-based optimization won because data is not uniform, and queries are not predictable. Real data has skewed distributions, correlations between columns, and volumes that change over time. A rule-based optimizer treats all indexes as equally useful, all columns as equally selective, and all tables as equally sized. A cost-based optimizer adapts to reality.
The price of cost-based optimization is complexity. You must maintain accurate statistics (Chapter 23 covers EXPLAIN tools, and we will discuss RUNSTATS in detail in Section 22.8). You must understand that the optimizer's decisions depend on its model of the data — not the data itself. When the model is accurate, the optimizer is brilliant. When the model is stale or incomplete, the optimizer can make terrible choices. The rest of this chapter teaches you how to ensure the model stays accurate and how to diagnose problems when it does not.
22.2 The Optimizer's Decision Process
When a SQL statement arrives at the DB2 engine, it passes through a pipeline of processing stages. The optimizer is one stage — but it is the most consequential. Let us trace the complete pipeline.
22.2.1 Statement Processing Pipeline
SQL Statement (text)
|
v
+------------------+
| 1. PARSING | Syntax check, resolve object names, check authorization
+------------------+
|
v
+------------------+
| 2. SEMANTIC | Verify column names, data types, view expansion
| VALIDATION |
+------------------+
|
v
+------------------+
| 3. QUERY REWRITE | Transform the query into an equivalent but more
| | efficient form (covered in Section 22.7)
+------------------+
|
v
+------------------+
| 4. ACCESS PATH | Enumerate candidate plans, estimate costs,
| SELECTION | select the lowest-cost plan
+------------------+
|
v
+------------------+
| 5. CODE | Generate executable runtime structures
| GENERATION | (the "section" or "package section")
+------------------+
|
v
Executable Access Plan
Stages 1 and 2 are straightforward — they ensure the SQL is valid. Stage 3 (query rewrite) and Stage 4 (access path selection) are where the optimizer does its real work. Stage 5 translates the chosen plan into the internal structures that the DB2 runtime engine can execute.
22.2.2 The Search Space Problem
For a simple single-table query with three candidate indexes, the optimizer might evaluate four plans: one table scan and three index-based access paths. That is trivial.
Now consider a five-table join. The optimizer must decide:
- Join order: In how many orders can you join five tables? The answer is 5! = 120. But the optimizer also considers subsets and bushy trees, not just left-deep trees, so the real number is larger.
- Join method: For each pair of tables, should we use nested loop, merge scan, or hash join? If there are three join methods and four join steps, that is 3^4 = 81 combinations per join order.
- Access method: For each table, should we use a table scan or one of potentially many indexes? If each table has an average of three candidate access methods, that is 3^5 = 243 combinations.
The total search space is approximately 120 x 81 x 243 = 2,361,960 candidate plans — for a five-table join. A ten-table join would have a search space in the billions.
The optimizer cannot exhaustively evaluate every candidate. Instead, it uses pruning and heuristic search to explore the most promising regions of the search space while skipping plans that are unlikely to be optimal.
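The plan-count arithmetic above can be reproduced with a few lines of back-of-the-envelope Python — illustrative only, since the real optimizer prunes rather than enumerating every candidate:

```python
from math import factorial

# Sketch of the search-space arithmetic for an n-table join
# (left-deep orders only; the real space is larger still).
def plan_count(tables, join_methods=3, access_methods=3):
    join_orders = factorial(tables)               # 5! = 120 join orders
    method_combos = join_methods ** (tables - 1)  # 3^4 = 81 method choices
    access_combos = access_methods ** tables      # 3^5 = 243 access choices
    return join_orders * method_combos * access_combos

print(f"{plan_count(5):,}")   # 2,361,960 candidate plans for five tables
```

Even this simplified count makes the case for heuristic pruning: doubling the table count multiplies the space by many orders of magnitude.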
22.2.3 Optimization Levels
DB2 provides control over how aggressively the optimizer searches the plan space.
On z/OS, there is no user-settable optimization level. BIND options such as OPTHINT can steer access path selection, but the optimizer generally operates at a single high level of effort for static SQL. For dynamic SQL, the optimizer may use simplified strategies if the statement is expected to be short-running.
On LUW, the CURRENT QUERY OPTIMIZATION special register controls the optimization level explicitly:
| Level | Behavior |
|---|---|
| 0 | Minimal optimization. Essentially greedy — pick the first reasonable plan. Suitable for very simple queries in OLTP. |
| 1 | Slightly more exploration than level 0. Limited join enumeration. |
| 2 | More join enumeration. Good for moderate queries. |
| 3 | Moderate optimization. Considers more join orders. |
| 5 | Default. Full optimization. Uses heuristics to prune the search space but explores broadly. |
| 7 | Full optimization with additional consideration of Cartesian products and more aggressive rewrite rules. |
| 9 | Maximum optimization. Exhaustive search. Very expensive for complex queries — may take minutes to optimize a single statement. Use only for critical long-running queries. |
-- LUW: Set optimization level for the session
SET CURRENT QUERY OPTIMIZATION = 7;
-- Now run your complex reporting query
SELECT ...
When to Change the Optimization Level: For OLTP workloads with simple queries (single-table lookups, two-table joins), the default level 5 is more than sufficient. For complex analytical queries with many joins and subqueries, levels 7 or 9 may find a better plan — but the optimization itself takes longer. For DSS (Decision Support System) workloads, it is common to run at level 7.
22.2.4 Static vs. Dynamic Optimization
DB2 has a critical distinction that affects when optimization occurs:
- Static SQL (embedded in a program, bound with BIND): Optimization occurs at BIND time. The plan is stored in the catalog and reused every time the program runs. On z/OS, this is the dominant mode for production CICS/IMS workloads.
- Dynamic SQL (prepared at runtime, e.g., from JDBC, ODBC, or CLP): Optimization occurs at PREPARE time. The plan may be cached in the dynamic statement cache (if enabled) and reused for identical SQL text.
This distinction matters enormously for access path stability. A statically bound package retains its access plan until someone issues REBIND. A dynamic SQL statement is re-optimized when it is first prepared (or when it falls out of the cache). We will return to this in Section 22.9 when we discuss why access paths change.
22.3 Filter Factors and Cardinality Estimation
The optimizer's cost estimates depend on one critical calculation: how many rows will each step of the plan produce? This is called cardinality estimation, and it is the single most important — and most error-prone — part of cost-based optimization.
22.3.1 What Is a Filter Factor?
A filter factor (FF) is the fraction of rows in a table that satisfy a given predicate. It is a number between 0 and 1.
- FF = 1.0 means the predicate matches every row (no filtering).
- FF = 0.0 means the predicate matches no rows.
- FF = 0.01 means the predicate matches 1% of rows.
The optimizer uses filter factors to estimate cardinality — the number of rows produced by an operation:
Estimated rows = Table cardinality × Filter factor
For example, if the ACCOUNT table has 2,000,000 rows and the predicate STATUS = 'ACTIVE' has a filter factor of 0.995, the optimizer estimates:
2,000,000 × 0.995 = 1,990,000 rows
That estimate tells the optimizer that an index on STATUS is useless for this predicate — it would still need to read almost every row. A table scan is cheaper.
Conversely, if STATUS = 'FROZEN' has a filter factor of 0.000006 (6 out of 1 million):
2,000,000 × 0.000006 = 12 rows
Now the index is enormously valuable. The optimizer will almost certainly choose index access.
22.3.2 The Uniform Distribution Assumption
When the optimizer has no detailed distribution statistics, it falls back on the uniform distribution assumption: it assumes that values are evenly distributed across the distinct values in a column.
The default filter factor for an equality predicate under uniform distribution is:
FF(column = value) = 1 / COLCARD
Where COLCARD is the column cardinality — the number of distinct values in the column, stored in SYSCAT.COLUMNS (LUW) or SYSIBM.SYSCOLUMNS (z/OS).
Example: The ACCOUNT_TYPE column has 4 distinct values (CHECKING, SAVINGS, CD, MONEY_MARKET). Under uniform distribution:
FF(ACCOUNT_TYPE = 'CHECKING') = 1/4 = 0.25
The optimizer estimates that 25% of accounts are of each type. If in reality 60% are CHECKING and only 5% are CD, the optimizer's estimate is wrong — and it may choose a suboptimal plan.
22.3.3 Range Predicate Filter Factors
For range predicates, the optimizer uses HIGH2KEY and LOW2KEY — the second-highest and second-lowest values in the column (stored in the catalog after RUNSTATS).
The filter factor for a range predicate is:
FF(column > value) = (HIGH2KEY - value) / (HIGH2KEY - LOW2KEY)
FF(column < value) = (value - LOW2KEY) / (HIGH2KEY - LOW2KEY)
FF(column BETWEEN low AND high) = (high - low) / (HIGH2KEY - LOW2KEY)
Example: The BALANCE column in ACCOUNT has LOW2KEY = 100.00 and HIGH2KEY = 500,000.00. For the predicate BALANCE > 400000:
FF = (500,000 - 400,000) / (500,000 - 100) = 100,000 / 499,900 ≈ 0.20
The optimizer estimates that about 20% of accounts have a balance above 400,000. If balances are actually heavily skewed toward lower values (as they are in most banks), the real percentage might be 2%. The optimizer would overestimate by a factor of 10.
22.3.4 Frequency Values and Histogram Statistics
To overcome the uniform distribution assumption, DB2 supports more detailed statistics:
Frequency values (also called most frequent values) record the actual occurrence count for the most common values in a column. These are stored in:
- z/OS: SYSIBM.SYSCOLDIST (TYPE = 'F' for frequency)
- LUW: SYSCAT.COLDIST (TYPE = 'F')
If RUNSTATS has collected frequency values for the STATUS column, the optimizer knows that 'ACTIVE' appears in 99.5% of rows and 'FROZEN' appears in 0.0006% of rows. It will compute accurate filter factors instead of assuming uniform distribution.
Histogram statistics (also called quantile statistics) divide the value range into buckets and record how many rows fall in each bucket. These are stored in:
- z/OS: SYSIBM.SYSCOLDIST (TYPE = 'Q' for quantile)
- LUW: SYSCAT.COLDIST (TYPE = 'Q')
Histograms help with range predicates on skewed data. Instead of assuming a linear distribution between LOW2KEY and HIGH2KEY, the optimizer uses the histogram buckets to estimate how many rows fall in any given range.
Collecting detailed statistics (we will cover this more in Section 22.8):
-- z/OS: Collect frequency statistics for the top 20 values
RUNSTATS TABLESPACE dbname.tsname
TABLE(ACCOUNT)
COLUMN(STATUS) FREQVAL NUMCOLS 1 COUNT 20
COLUMN(ACCOUNT_TYPE) FREQVAL NUMCOLS 1 COUNT 10;
-- LUW: Collect distribution statistics
RUNSTATS ON TABLE meridian.ACCOUNT
WITH DISTRIBUTION
ON COLUMNS (STATUS, ACCOUNT_TYPE);
22.3.5 Compound Predicate Filter Factors
When a query has multiple predicates combined with AND or OR, the optimizer combines filter factors:
FF(A AND B) = FF(A) × FF(B) -- assuming independence
FF(A OR B) = FF(A) + FF(B) - FF(A) × FF(B) -- assuming independence
FF(NOT A) = 1 - FF(A)
The critical word is independence. The optimizer assumes that predicates are statistically independent unless it has evidence otherwise. This assumption often breaks down:
SELECT * FROM CUSTOMER
WHERE STATE = 'IL' AND CITY = 'CHICAGO';
The optimizer might estimate:
FF(STATE = 'IL') = 1/50 = 0.02 (50 states)
FF(CITY = 'CHICAGO') = 1/5000 = 0.0002 (5000 distinct cities)
FF(combined) = 0.02 × 0.0002 = 0.000004
On 1,000,000 customers, the optimizer estimates 4 rows. But in reality, if 15% of Illinois customers are in Chicago, the real count is 1,000,000 × 0.02 × 0.15 = 3,000. The optimizer underestimates by a factor of 750, and this can lead to catastrophically bad plan choices (e.g., choosing nested loop join where hash join would be better).
DB2 provides multi-column frequency statistics and column group statistics to address correlated columns:
-- z/OS: Multi-column frequency values
RUNSTATS TABLESPACE dbname.tsname
TABLE(CUSTOMER)
COLUMN(STATE, CITY) FREQVAL NUMCOLS 2 COUNT 50;
-- LUW: Column group statistics
RUNSTATS ON TABLE meridian.CUSTOMER
ON COLUMNS ((STATE, CITY));
These tell the optimizer the actual joint distribution of (STATE, CITY) pairs, allowing accurate estimates for correlated predicates.
22.3.6 The Cardinality Estimation Cascade
Cardinality estimation errors are the most common cause of suboptimal plans, and they have a dangerous property: errors compound. Consider a three-table join:
- The optimizer estimates Table A produces 100 rows after filtering (actual: 10,000).
- That estimate feeds into the join with Table B. The optimizer estimates the join produces 500 rows (actual: 200,000).
- That estimate feeds into the join with Table C. By now the error has compounded so severely that the optimizer chooses nested loop join (good for small inputs) where hash join (good for large inputs) would be orders of magnitude faster.
This cascade effect is why cardinality estimation is the threshold concept of optimizer understanding. If you understand that the optimizer is making statistical guesses — not consulting an oracle of truth — and that those guesses can be wrong, especially when statistics are stale or data is correlated, you will understand 80% of all performance problems.
Threshold Concept Alert: The optimizer makes STATISTICAL decisions based on catalog data. It does not examine your actual data at optimization time. It builds a mathematical model of your data using catalog statistics, and it optimizes against that model. When the model diverges from reality, the plan diverges from optimal. This insight — that you are optimizing against a model, not against reality — is the key to understanding every access path surprise you will ever encounter.
22.4 Access Path Types
The optimizer's goal is to select an access path — a specific method of retrieving data from a table. DB2 supports numerous access path types, and each is optimal under different conditions. We will cover the most important ones for both platforms.
22.4.1 Tablespace Scan / Table Scan
z/OS: Tablespace scan. Reads every page in the tablespace sequentially. LUW: Table scan. Reads every page in the table's data containers.
This is the "brute force" approach — read everything, evaluate predicates against every row, discard non-matching rows. It sounds wasteful, but it is often the optimal choice:
- When the predicate has a high filter factor (matches most rows).
- When no suitable index exists.
- When the table is very small (the entire table fits in a few pages).
- When the optimizer determines that the sequential I/O pattern of a scan is cheaper than the random I/O pattern of index access.
Sequential I/O is dramatically faster than random I/O. On traditional spinning disks, sequential reads can be 50-100 times faster per page than random reads. Even on modern SSDs, sequential reads benefit from read-ahead and reduced command overhead. The optimizer knows this — it costs sequential I/O and random I/O differently.
Cost(tablespace scan) ≈ (total pages) × SEQIOCOST + (total rows) × CPU_per_row
22.4.2 Index Access
Index access uses a B-tree index to locate qualifying rows without scanning the entire table. DB2 supports several variants:
Matching index scan: The optimizer uses the index's key columns to navigate directly to the qualifying rows. This is the classic case — a predicate on an indexed column.
-- If ACCOUNT has an index on CUSTOMER_ID:
SELECT * FROM ACCOUNT WHERE CUSTOMER_ID = 1001;
The optimizer traverses the B-tree from root to leaf, finds the leaf page(s) containing entries for CUSTOMER_ID = 1001, and then reads the corresponding data pages.
Cost(matching index scan) ≈ (index levels) × RANDIOCOST
+ (matching leaf pages) × SEQIOCOST
+ (qualifying rows) × RANDIOCOST [data page reads]
+ (qualifying rows) × CPU_per_row
The critical factor is the last term: each qualifying row requires a random read of a data page (unless the data is clustered — see below). If many rows qualify, the random I/O cost dominates.
Non-matching index scan: The optimizer reads the entire index leaf chain sequentially but does not use the index tree to navigate. This is useful when:
- The index contains all columns needed by the query (index-only access).
- The index ordering is needed for ORDER BY or GROUP BY.
- The index is much smaller than the table.
Index-only access (also called "index-only scan"): All columns referenced in the query exist in the index. DB2 never needs to read the data pages at all. This is the fastest possible access path for queries that need only a subset of columns.
-- If there is an index on (CUSTOMER_ID, ACCOUNT_TYPE, BALANCE):
SELECT CUSTOMER_ID, ACCOUNT_TYPE, BALANCE
FROM ACCOUNT
WHERE CUSTOMER_ID = 1001;
If the index includes all three columns in the SELECT list, plus the predicate column, the optimizer can satisfy the entire query from the index without touching the data pages.
Cost(index-only) ≈ (index levels) × RANDIOCOST
+ (matching leaf pages) × SEQIOCOST
+ (qualifying entries) × CPU_per_entry
No data page reads — this is often orders of magnitude cheaper than a full index scan plus data access.
22.4.3 [z/OS] Multiple Index Access
On z/OS, DB2 can use multiple indexes on the same table simultaneously to satisfy a complex predicate. There are several flavors:
- Index ANDing: Two or more indexes each identify qualifying RIDs (Record Identifiers). DB2 intersects the RID lists — only rows that appear in all lists qualify. This is useful for predicates like WHERE BRANCH_ID = 5 AND STATUS = 'ACTIVE' when separate indexes exist on BRANCH_ID and STATUS.
- Index ORing: Two or more indexes each identify qualifying RIDs. DB2 unions the RID lists. This is useful for WHERE BRANCH_ID = 5 OR STATUS = 'FROZEN'.
These operations use the RID pool — a dedicated area of memory in DBM1 where RID lists are built, sorted, and merged.
22.4.4 List Prefetch
List prefetch is a hybrid technique that combines the selectivity of index access with the efficiency of sequential I/O.
The process is:
1. Scan the index and collect all qualifying RIDs into a list.
2. Sort the RID list by page number.
3. Read the data pages in page-number order (sequential I/O instead of random I/O).
List prefetch converts random I/O into sequential I/O, and it is one of the most powerful performance techniques in DB2's repertoire. The optimizer chooses list prefetch when:
- The predicate is selective enough to benefit from an index (not too many rows qualify).
- But enough rows qualify that random I/O for each row would be expensive.
- The data is not clustered by the index being used.
Cost(list prefetch) ≈ (index cost)
+ (sort RIDs)
+ (unique pages) × SEQIOCOST [sorted page reads]
+ (qualifying rows) × CPU_per_row
22.4.5 Sequential Prefetch and Dynamic Prefetch
Sequential prefetch reads multiple pages ahead of the current position, anticipating that the next pages will be needed. DB2 uses it during tablespace scans and during sequential index leaf-page reads. On z/OS, the prefetch quantity is typically 32 pages (configurable by buffer pool).
Dynamic prefetch detects at runtime that sequential I/O is occurring even when the optimizer did not predict it (for example, if random index reads happen to hit consecutive pages). The prefetch manager switches from single-page reads to multi-page reads dynamically.
22.4.6 [LUW] Specific Access Path Variants
On LUW, the optimizer recognizes some additional access path types:
- Index scan: Similar to index access on z/OS. Can be forward or reverse.
- Index ANDing and ORing: LUW supports these, though the implementation details differ from z/OS (bitmap-based on LUW rather than RID-pool-based).
- Block-based access (MDC): For tables using Multi-Dimensional Clustering, the optimizer can use block indexes to skip entire blocks of data.
- Table queue access: For partitioned databases (DPF), the optimizer generates table queues to move data between database partitions.
22.5 Join Methods
When a query joins two or more tables, the optimizer must choose a join method for each join step. DB2 supports several join algorithms, each optimal under different conditions.
22.5.1 Nested Loop Join (NLJ)
The nested loop join is the simplest algorithm:
For each row in the OUTER table:
For each matching row in the INNER table:
Produce a joined row
The outer table is scanned once. For each row in the outer table, the inner table is probed — typically via an index lookup on the join column.
When it is optimal:
- The outer table is small (few rows after filtering).
- An efficient index exists on the inner table's join column.
- The join produces few rows.
Cost profile:
Cost(NLJ) ≈ Cost(scan outer) + (outer rows) × Cost(probe inner)
If the outer table produces 100 rows and each inner probe costs 4 I/Os (3 index levels + 1 data page), the total inner cost is 400 I/Os. If the outer table produces 1,000,000 rows, the inner cost is 4,000,000 I/Os — catastrophic.
This is why cardinality estimation matters so much for NLJ. If the optimizer underestimates the outer cardinality, it chooses NLJ when hash join would be far better.
22.5.2 Merge Scan Join (Sort-Merge Join)
The merge scan join requires both inputs to be sorted on the join column:
Sort OUTER table on join column (if not already sorted)
Sort INNER table on join column (if not already sorted)
Merge the two sorted streams, matching on join column
When it is optimal:
- Both inputs are large.
- The inputs are already sorted (e.g., from an index scan on the join column, or from a preceding sort).
- The join is an equi-join.
Cost profile:
Cost(merge join) ≈ Cost(sort outer) + Cost(sort inner) + Cost(merge)
The merge phase is linear — O(N + M) where N and M are the input sizes. The expensive part is sorting, but if the data is already in order (from index access or a prior sort), this cost is eliminated.
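The linear merge phase can be sketched as follows. This is an illustration of the algorithm on sorted lists of join-key values, not DB2's runtime code; note how duplicates on both sides still match correctly:

```python
# Sketch: O(N + M) merge phase of a merge scan join on sorted equi-join inputs
def merge_join(outer, inner):
    """Both inputs sorted on the join key; returns matching (outer, inner) pairs."""
    result, j = [], 0
    for o in outer:
        # advance the inner cursor past keys smaller than the current outer key
        while j < len(inner) and inner[j] < o:
            j += 1
        # collect every inner key equal to the outer key (handles duplicates)
        k = j
        while k < len(inner) and inner[k] == o:
            result.append((o, inner[k]))
            k += 1
    return result

print(merge_join([1, 3, 3, 7], [3, 3, 5, 7]))
# [(3, 3), (3, 3), (3, 3), (3, 3), (7, 7)]
```

Each input is read once; only the inner cursor ever backtracks, and only across runs of duplicate keys.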
22.5.3 Hash Join
LUW has supported hash join since early versions. z/OS added hash join support in DB2 12.
The hash join builds a hash table from the smaller input (the build input) and probes it with the larger input (the probe input):
1. Read the BUILD input (smaller table), hash each row on the join column,
store in an in-memory hash table.
2. Read the PROBE input (larger table), hash each row on the join column,
look up matching rows in the hash table.
3. For each match, produce a joined row.
When it is optimal:
- Both inputs are large.
- No useful index exists on the join column.
- The build input fits in available memory (sort heap on LUW, work file storage on z/OS).
- The join is an equi-join.
Hash join is extremely efficient for large-to-large joins because it requires only one pass over each input (assuming the hash table fits in memory). Its cost is:
Cost(hash join) ≈ Cost(read build) + Cost(build hash table)
+ Cost(read probe) + Cost(probe hash table)
If the hash table does not fit in memory, DB2 uses a partitioned hash join (sometimes called a "grace hash join") that spills partitions to temp space and processes them in phases.
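The build/probe structure is compact enough to sketch end to end. The table names and rows below are hypothetical Meridian-style data, and this is an in-memory illustration of the algorithm, not DB2's spilling implementation:

```python
# Sketch: the build and probe phases of an in-memory equi-join hash join
def hash_join(build, probe, key=lambda row: row[0]):
    table = {}
    for row in build:                        # phase 1: hash the smaller (build) input
        table.setdefault(key(row), []).append(row)
    joined = []
    for row in probe:                        # phase 2: probe with the larger input
        for match in table.get(key(row), []):
            joined.append((match, row))      # phase 3: emit joined rows
    return joined

accounts = [(1001, 'CHECKING'), (1002, 'SAVINGS')]      # build: smaller input
txns = [(1001, 250.00), (1001, -40.00), (1003, 99.00)]  # probe: larger input
print(hash_join(accounts, txns))   # two matches; txn 1003 finds no account
```

One pass over each input, no index required — which is exactly why hash join dominates for large-to-large joins when the build side fits in memory.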
22.5.4 [z/OS] Hybrid Join
The hybrid join is a z/OS-specific optimization that combines aspects of nested loop and merge join. It works when:
- The inner table has a clustering index on the join column.
- The outer table is sorted on the join column.
The hybrid join sorts the outer RIDs by the inner table's clustering sequence, then merges the outer rows with the inner table's data pages. This avoids the random I/O penalty of nested loop join while avoiding the full sort cost of merge join.
22.5.5 Join Order and the Optimizer
The optimizer does not just choose the join method — it also chooses the join order. For a three-table join (A JOIN B JOIN C), the optimizer considers:
- A → B → C
- A → C → B
- B → A → C
- B → C → A
- C → A → B
- C → B → A
And for each order, a different join method at each step. The optimal join order depends on the cardinalities and available indexes. Generally, the optimizer prefers to:
- Start with the most selective table (fewest rows after filtering).
- Join to the table with the best index on the join column.
- Keep intermediate result sets small.
This is why the optimizer's cardinality estimates are so critical — they determine the join order, which determines the join methods, which determines the overall plan cost.
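How cardinality estimates steer join-order choice can be seen in a toy enumerator. In this Python sketch everything is invented for illustration (the cardinalities, the pairwise selectivities, and the cost function); DB2's real model is far richer, but even this toy reproduces the heuristics above:

```python
from itertools import permutations

# Invented estimated cardinalities after local filtering
card = {"BRANCH": 50, "ACCOUNT": 2_000_000, "TRANSACTION": 1_300_000}
# Invented join selectivities per table pair; BRANCH-TRANSACTION has no
# direct join predicate, so its "selectivity" is 1.0 (a cross-like join)
sel = {frozenset(("BRANCH", "ACCOUNT")): 1 / 50,
       frozenset(("ACCOUNT", "TRANSACTION")): 1 / 2_000_000,
       frozenset(("BRANCH", "TRANSACTION")): 1.0}

def plan_cost(order):
    """Toy cost: total size of intermediate results (smaller is better)."""
    joined, rows, cost = [order[0]], card[order[0]], 0.0
    for t in order[1:]:
        # Apply the most selective predicate linking t to what is joined so far
        s = min(sel[frozenset((t, p))] for p in joined)
        rows = rows * card[t] * s
        cost += rows
        joined.append(t)
    return cost

best = min(permutations(card), key=plan_cost)
```

Orders that would force the predicate-less BRANCH-TRANSACTION pairing early blow up the intermediate result, so the enumerator keeps BRANCH for a step where a selective predicate applies, mirroring the "keep intermediate result sets small" heuristic.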
22.6 [z/OS] Stage 1 and Stage 2 Predicates
On z/OS, DB2 evaluates predicates at different stages in the data retrieval pipeline, and the stage at which a predicate is evaluated has a significant performance impact.
22.6.1 The Predicate Evaluation Stages
+-----------------------------------------------------------+
| Index Manager (Index screening) |
| - Can evaluate predicates against index key columns |
| - Very efficient: avoids reading data pages entirely |
+-----------------------------------------------------------+
| Qualifying RIDs passed down
v
+-----------------------------------------------------------+
| Data Manager (Stage 1 / Sargable predicates) |
| - Evaluates predicates as data pages are read |
| - Efficient: evaluated at the page level in DBM1 |
+-----------------------------------------------------------+
| Qualifying rows passed up
v
+-----------------------------------------------------------+
| Relational Data System (Stage 2 / Non-sargable predicates)|
| - Evaluates predicates after rows are materialized |
| - Less efficient: requires full row materialization |
+-----------------------------------------------------------+
Stage 1 predicates (also called sargable predicates — "Search ARGument able") are evaluated by the Data Manager as it reads data pages. This is efficient because:
- The predicate is evaluated close to the data, in the buffer pool.
- Non-qualifying rows are discarded early, before being passed up the processing chain.
- The CPU cost per row is minimal.
Stage 2 predicates are evaluated by the Relational Data System (RDS) after the row has been fully materialized. This is less efficient because:
- The row has already consumed buffer pool, CPU, and possibly I/O resources.
- Evaluation happens higher in the processing chain.
- More data must flow through the pipeline.
22.6.2 Common Stage 1 vs. Stage 2 Classifications
| Predicate Form | Stage | Notes |
|---|---|---|
| COL = value | 1 | Simple equality |
| COL > value | 1 | Simple range |
| COL BETWEEN v1 AND v2 | 1 | Range |
| COL LIKE 'ABC%' | 1 | Leading-constant LIKE |
| COL IN (v1, v2, v3) | 1 | IN list |
| COL IS NULL | 1 | NULL check |
| COL LIKE '%ABC' | 2 | Leading wildcard — cannot use index |
| COL LIKE '%ABC%' | 2 | Embedded wildcard |
| YEAR(COL) = 2024 | 2 | Function on column |
| COL1 = COL2 (same table) | 2 | Column-to-column within same table |
| SUBSTR(COL, 1, 3) = 'ABC' | 2 | Function on column |
| COL + 1 = 10 | 2 | Arithmetic on column |
| VALUE(COL, 0) > 5 | 2 | COALESCE/VALUE function |
22.6.3 The Performance Impact
The difference between Stage 1 and Stage 2 can be enormous. Consider a table with 10 million rows where the predicate qualifies 100 rows:
- Stage 1 evaluation: The Data Manager evaluates the predicate as it reads each page. Non-qualifying rows are discarded immediately. Only 100 rows flow up to RDS.
- Stage 2 evaluation: All 10 million rows flow from the Data Manager up to RDS, where the predicate is finally evaluated and 9,999,900 rows are discarded.
The Stage 2 scenario wastes massive CPU and memory, even if the I/O is the same.
22.6.4 Making Predicates Sargable
The most common way to accidentally create a Stage 2 predicate is to apply a function to a column:
-- Stage 2 (BAD): Function on column
WHERE YEAR(TXN_DATE) = 2024
-- Stage 1 (GOOD): Rewrite as range predicate
WHERE TXN_DATE >= '2024-01-01' AND TXN_DATE < '2025-01-01'
-- Stage 2 (BAD): Arithmetic on column
WHERE BALANCE / 100 > 50
-- Stage 1 (GOOD): Move arithmetic to the constant side
WHERE BALANCE > 5000
-- Stage 2 (BAD): SUBSTR on column
WHERE SUBSTR(ACCOUNT_TYPE, 1, 3) = 'CHE'
-- Stage 1 (GOOD): Use LIKE with leading constant
WHERE ACCOUNT_TYPE LIKE 'CHE%'
LUW Note: LUW does not use the Stage 1/Stage 2 terminology, but the same principle applies. Predicates that can be evaluated during data access (pushed down to the storage layer) are more efficient than predicates evaluated after row materialization. The key rule is the same: avoid functions on columns in WHERE clauses when possible.
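Each rewrite is safe only because the two forms are logically equivalent. A quick Python check of the date-range example (illustrative rows; the helper names are invented):

```python
from datetime import date

def function_on_column(rows):
    # Stage 2 form: a function applied to the column, row by row
    return [r for r in rows if r["TXN_DATE"].year == 2024]

def range_predicate(rows):
    # Stage 1 form: equivalent range predicate, no function on the column
    lo, hi = date(2024, 1, 1), date(2025, 1, 1)
    return [r for r in rows if lo <= r["TXN_DATE"] < hi]

# Invented rows straddling the year boundary in both directions
rows = [{"TXN_DATE": date(2023, 12, 31)}, {"TXN_DATE": date(2024, 6, 15)},
        {"TXN_DATE": date(2024, 12, 31)}, {"TXN_DATE": date(2025, 1, 1)}]
```

The half-open range (>= Jan 1, < Jan 1 of the next year) matters: it captures December 31 while excluding January 1 of the following year, exactly matching YEAR() semantics.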
22.7 Query Rewrite Optimization
Before the optimizer selects an access path, the query rewrite phase transforms the SQL into an equivalent but potentially more efficient form. These transformations are automatic and transparent — you write SQL in whatever way is clearest, and the optimizer rewrites it for performance.
22.7.1 Predicate Pushdown
The optimizer pushes predicates as close to the data source as possible. If a predicate can be evaluated earlier in the processing pipeline, fewer rows flow through later stages.
-- Original: Predicate on the outer query
SELECT * FROM (
SELECT a.*, c.LAST_NAME
FROM ACCOUNT a JOIN CUSTOMER c ON a.CUSTOMER_ID = c.CUSTOMER_ID
) AS sub
WHERE sub.BALANCE > 100000;
-- Rewritten: Predicate pushed into the subquery
SELECT a.*, c.LAST_NAME
FROM ACCOUNT a JOIN CUSTOMER c ON a.CUSTOMER_ID = c.CUSTOMER_ID
WHERE a.BALANCE > 100000;
The rewrite applies the BALANCE filter before the join, reducing the number of rows that must be joined.
22.7.2 Subquery to Join Transformation
The optimizer can transform certain correlated subqueries into joins, which opens up more join order and join method options:
-- Original: Correlated subquery (EXISTS)
SELECT c.CUSTOMER_ID, c.LAST_NAME
FROM CUSTOMER c
WHERE EXISTS (
SELECT 1 FROM ACCOUNT a
WHERE a.CUSTOMER_ID = c.CUSTOMER_ID AND a.BALANCE > 100000
);
-- Rewritten: Semi-join
SELECT DISTINCT c.CUSTOMER_ID, c.LAST_NAME
FROM CUSTOMER c INNER JOIN ACCOUNT a
ON c.CUSTOMER_ID = a.CUSTOMER_ID
WHERE a.BALANCE > 100000;
The join form allows the optimizer to choose the join order freely (should CUSTOMER or ACCOUNT be the outer table?), whereas the correlated subquery form forces CUSTOMER to be the outer table.
22.7.3 View Merging
When a query references a view, the optimizer can merge the view definition into the query, eliminating the view as a separate processing step:
-- View definition
CREATE VIEW HIGH_VALUE_ACCOUNTS AS
SELECT * FROM ACCOUNT WHERE BALANCE > 50000;
-- Query against the view
SELECT * FROM HIGH_VALUE_ACCOUNTS WHERE ACCOUNT_TYPE = 'SAVINGS';
-- After view merging
SELECT * FROM ACCOUNT WHERE BALANCE > 50000 AND ACCOUNT_TYPE = 'SAVINGS';
View merging allows the optimizer to consider all predicates together and choose the best access path for the combined conditions.
22.7.4 MQT Routing (Materialized Query Table)
DB2 can automatically route a query to a Materialized Query Table (MQT) if the MQT's definition subsumes the query. MQTs are precomputed result sets stored as physical tables — a form of materialized view.
-- MQT definition
CREATE TABLE MONTHLY_BRANCH_TOTALS AS (
SELECT b.BRANCH_ID, b.BRANCH_NAME,
YEAR(t.TXN_DATE) AS TXN_YEAR,
MONTH(t.TXN_DATE) AS TXN_MONTH,
SUM(t.AMOUNT) AS TOTAL_AMOUNT,
COUNT(*) AS TXN_COUNT
FROM TRANSACTION t
JOIN ACCOUNT a ON t.ACCOUNT_ID = a.ACCOUNT_ID
JOIN BRANCH b ON a.BRANCH_ID = b.BRANCH_ID
GROUP BY b.BRANCH_ID, b.BRANCH_NAME, YEAR(t.TXN_DATE), MONTH(t.TXN_DATE)
) DATA INITIALLY DEFERRED REFRESH DEFERRED;
If a user query requests monthly transaction totals by branch, the optimizer may recognize that the MQT already has the answer and route the query to the MQT instead of scanning and aggregating the base tables. This can turn a 30-minute query into a sub-second lookup.
On LUW, MQT routing is controlled by the CURRENT REFRESH AGE special register and the CURRENT MAINTAINED TABLE TYPES FOR OPTIMIZATION register:
-- Enable MQT routing
SET CURRENT REFRESH AGE = ANY;
SET CURRENT MAINTAINED TABLE TYPES FOR OPTIMIZATION = ALL;
On z/OS, MQT routing is controlled by CURRENT REFRESH AGE and is influenced by the QUERYOPT bind parameter.
22.7.5 Common Subexpression Elimination
When the same expression appears multiple times in a query, the optimizer computes it once and reuses the result:
SELECT CUSTOMER_ID,
BALANCE * 1.05 AS PROJECTED_BALANCE,
CASE WHEN BALANCE * 1.05 > 100000 THEN 'HIGH' ELSE 'NORMAL' END
FROM ACCOUNT;
The expression BALANCE * 1.05 is computed once per row, not twice.
22.7.6 OR-to-UNION Transformation
In some cases, the optimizer rewrites an OR predicate into a UNION ALL, allowing each branch to use a different index:
-- Original
SELECT * FROM ACCOUNT
WHERE BRANCH_ID = 5 OR STATUS = 'FROZEN';
-- Rewritten (conceptually)
SELECT * FROM ACCOUNT WHERE BRANCH_ID = 5
UNION ALL
SELECT * FROM ACCOUNT WHERE STATUS = 'FROZEN' AND BRANCH_ID <> 5;
Each branch of the UNION ALL can use a different index (one on BRANCH_ID, one on STATUS), which may be more efficient than a table scan.
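The `AND BRANCH_ID <> 5` guard on the second branch is what makes UNION ALL (which does not remove duplicates) a valid substitute for OR. A small Python check with invented rows, including one row that satisfies both predicates:

```python
accounts = [
    {"ID": 1, "BRANCH_ID": 5, "STATUS": "ACTIVE"},
    {"ID": 2, "BRANCH_ID": 5, "STATUS": "FROZEN"},   # matches BOTH predicates
    {"ID": 3, "BRANCH_ID": 9, "STATUS": "FROZEN"},
    {"ID": 4, "BRANCH_ID": 9, "STATUS": "ACTIVE"},
]

# Original OR predicate
or_rows = [a for a in accounts if a["BRANCH_ID"] == 5 or a["STATUS"] == "FROZEN"]

# UNION ALL rewrite: the second branch excludes rows the first branch
# already returned, so the overlapping row is produced exactly once
branch1 = [a for a in accounts if a["BRANCH_ID"] == 5]
branch2 = [a for a in accounts if a["STATUS"] == "FROZEN" and a["BRANCH_ID"] != 5]
union_rows = branch1 + branch2
```

Without the guard, row 2 would appear twice in the UNION ALL result, changing the query's semantics.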
22.8 Catalog Statistics and the Optimizer
The optimizer is only as good as the statistics it consumes. This section covers the most important catalog statistics and how RUNSTATS feeds them to the optimizer.
22.8.1 The Most Important Catalog Statistics
Here are the statistics that most directly affect optimizer decisions, ranked by impact:
Table-level statistics:
| Statistic | z/OS Catalog Column | LUW Catalog Column | Impact |
|---|---|---|---|
| Table cardinality (row count) | SYSIBM.SYSTABLES.CARDF | SYSCAT.TABLES.CARD | Fundamental — affects every cost estimate |
| Number of pages | SYSIBM.SYSTABLES.NPAGES | SYSCAT.TABLES.NPAGES | Determines scan cost |
| Percentage of pages with rows | SYSIBM.SYSTABLES.PCTPAGES | SYSCAT.TABLES.FPAGES | Affects scan efficiency |
Column-level statistics:
| Statistic | z/OS Catalog Column | LUW Catalog Column | Impact |
|---|---|---|---|
| Column cardinality | SYSIBM.SYSCOLUMNS.COLCARD | SYSCAT.COLUMNS.COLCARD | Default filter factor for equality |
| High value (HIGH2KEY) | SYSIBM.SYSCOLUMNS.HIGH2KEY | SYSCAT.COLUMNS.HIGH2KEY | Range predicate estimates |
| Low value (LOW2KEY) | SYSIBM.SYSCOLUMNS.LOW2KEY | SYSCAT.COLUMNS.LOW2KEY | Range predicate estimates |
| Frequency values | SYSIBM.SYSCOLDIST | SYSCAT.COLDIST (TYPE='F') | Accurate estimates for skewed data |
| Histogram (quantile) | SYSIBM.SYSCOLDIST | SYSCAT.COLDIST (TYPE='Q') | Accurate range estimates |
Index-level statistics:
| Statistic | z/OS Catalog Column | LUW Catalog Column | Impact |
|---|---|---|---|
| Number of leaf pages | SYSIBM.SYSINDEXES.NLEAF | SYSCAT.INDEXES.NLEAF | Index scan cost |
| Number of levels | SYSIBM.SYSINDEXES.NLEVELS | SYSCAT.INDEXES.NLEVELS | Index probe cost |
| First key cardinality | SYSIBM.SYSINDEXES.FIRSTKEYCARD | SYSCAT.INDEXES.FIRSTKEYCARD | Leading column selectivity |
| Full key cardinality | SYSIBM.SYSINDEXES.FULLKEYCARD | SYSCAT.INDEXES.FULLKEYCARD | Full key selectivity |
| Cluster ratio | SYSIBM.SYSINDEXES.CLUSTERRATIO | SYSCAT.INDEXES.CLUSTERRATIO | Data clustering — affects I/O pattern |
22.8.2 The Impact of Stale Statistics
Stale statistics are the number one cause of suboptimal access paths in production DB2 systems. Here is how stale statistics cause problems:
Scenario 1: Table growth without RUNSTATS
The ACCOUNT table had 500,000 rows when RUNSTATS was last run. Since then, 1.5 million new accounts have been added. The catalog still shows CARDF = 500,000.
The optimizer estimates that a table scan reads pages for 500,000 rows. An index access for 1,000 qualifying rows looks relatively expensive compared to the "small" table scan. The optimizer chooses the table scan. But the actual table has 2,000,000 rows — the table scan reads four times more pages than estimated, and the index would have been far cheaper.
Scenario 2: Data redistribution without RUNSTATS
The STATUS column used to have a roughly uniform distribution: 25% ACTIVE, 25% CLOSED, 25% DORMANT, 25% FROZEN. RUNSTATS captured COLCARD = 4, and the optimizer assumed uniform distribution.
After a data cleanup, the distribution is now: 98% ACTIVE, 1% CLOSED, 0.5% DORMANT, 0.5% FROZEN. The optimizer still estimates FF(STATUS = 'FROZEN') = 1/4 = 0.25 when the actual filter factor is 0.005. For a 2-million-row table, the optimizer estimates 500,000 qualifying rows instead of 10,000 — it avoids the index when the index would be ideal.
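The damage in Scenario 2 is plain arithmetic. A short Python restatement of the numbers above:

```python
# Scenario 2 as arithmetic: uniform-distribution estimate vs. actual skew
table_rows = 2_000_000
colcard = 4                       # STATUS has 4 distinct values

ff_estimated = 1 / colcard        # optimizer's default filter factor: 0.25
ff_actual = 0.005                 # real share of FROZEN rows after the cleanup

est_rows = table_rows * ff_estimated     # 500,000 -> looks like scan territory
actual_rows = table_rows * ff_actual     # 10,000  -> ideal for index access
error_factor = est_rows / actual_rows    # the estimate is off by 50x
```

A 50x cardinality error is more than enough to flip the table-scan-vs-index decision, which is why distribution statistics (frequency values) matter for skewed columns.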
22.8.3 RUNSTATS Best Practices
Collect statistics after significant data changes:
-- z/OS: Comprehensive RUNSTATS
RUNSTATS TABLESPACE MERIDIAN.ACCOUNT_TS
TABLE(ACCOUNT)
INDEX(ALL)
COLUMN(STATUS) FREQVAL NUMCOLS 1 COUNT 20
COLUMN(ACCOUNT_TYPE) FREQVAL NUMCOLS 1 COUNT 10
COLUMN(BALANCE)
COLUMN(CUSTOMER_ID)
COLUMN(BRANCH_ID)
UPDATE ALL;
-- LUW: Comprehensive RUNSTATS
RUNSTATS ON TABLE MERIDIAN.ACCOUNT
WITH DISTRIBUTION ON ALL COLUMNS
AND SAMPLED DETAILED INDEXES ALL;
When to run RUNSTATS:
- After loading a significant amount of data (more than 10-15% of the table).
- After mass UPDATE operations that change indexed column values.
- After REORG operations (REORG can change clustering, which affects access path costs).
- Before REBIND operations (ensure the optimizer has current statistics).
- On a regular schedule for tables that change continuously.
Automation: On z/OS, many shops use DB2 Administration Tool or third-party tools (BMC, Broadcom) to automate RUNSTATS scheduling based on change thresholds. On LUW, the AUTO_RUNSTATS database configuration parameter enables automatic statistics collection — DB2 monitors table modification counts and triggers RUNSTATS when thresholds are exceeded.
-- LUW: Enable automatic statistics collection
UPDATE DB CFG FOR MERIDIANDB USING AUTO_RUNSTATS ON;
22.8.4 Real-Time Statistics (z/OS)
Starting with DB2 10 for z/OS, DB2 collects real-time statistics (RTS) — lightweight, continuously-updated statistics that track table and index size changes without running RUNSTATS. RTS information is stored in SYSIBM.SYSTABLESPACESTATS and SYSIBM.SYSINDEXSPACESTATS.
RTS does not replace RUNSTATS — it does not collect column-level distribution statistics. But it ensures that the optimizer always has current table cardinality and page count estimates, even between RUNSTATS executions. This is particularly important for volatile tables whose size changes dramatically throughout the day.
22.9 Why Access Paths Change
One of the most stressful events in a DBA's career is discovering that a critical production query suddenly changed its access plan and is now running ten times slower. Understanding why access paths change is essential for diagnosing and preventing these incidents.
22.9.1 The Top Five Reasons
1. RUNSTATS changed the statistics
This is the most common reason. RUNSTATS updated the catalog with new cardinality, new distribution statistics, or new cluster ratios. The optimizer re-evaluated cost formulas with the new inputs and chose a different plan.
Example: RUNSTATS runs on the TRANSACTION table and updates the cardinality from 50 million to 80 million. An index access path that was cheaper than a table scan at 50 million rows is now estimated as more expensive at 80 million rows (because the index is not well-clustered and the random I/O cost scales with cardinality). The optimizer switches to a table scan.
2. REBIND with different statistics
A REBIND recompiles a static SQL package, creating a new access plan based on current catalog statistics. If the statistics have changed since the last BIND, the new plan may be different.
This is often triggered intentionally (a DBA issues REBIND after index changes) or as a side effect (DB2 automatic rebind after a migration, or a "free" rebind triggered by DDL changes).
3. DB2 version migration
When you migrate to a new version of DB2, the optimizer may have new cost formulas, new access paths, new query rewrite rules, and new join methods. Plans that were optimal under the old optimizer may not be optimal under the new one.
IBM provides migration planning tools:
- z/OS: Plan management (the PLANMGMT and SWITCH options of REBIND) and the APCOMPARE/APREUSE bind options let you compare old and new access plans, or fall back to the previous plan, before committing to the new version.
- LUW: The db2caem (Capture and Analysis of Explain data) tool helps compare plans across versions.
4. Data volume growth
Even without RUNSTATS, if real-time statistics detect that a table has grown significantly, the optimizer may change its plan. The crossover point between "index access is cheaper" and "table scan is cheaper" depends on the number of qualifying rows relative to the table size.
5. Parameter and configuration changes
Changes to buffer pool sizes, sort heap sizes, I/O cost parameters, or optimization levels can all change optimizer decisions. For example, increasing the sort heap on LUW makes sort-based plans (merge join, sort before group by) cheaper, potentially causing the optimizer to switch from hash join to merge join.
22.9.2 How to Investigate an Access Path Change
When a query's performance changes suddenly, follow this investigation process:
1. Capture the current access plan using EXPLAIN (Chapter 23 covers this in detail).
2. Compare it with the previous plan (if you have historical EXPLAIN data).
3. Check when RUNSTATS last ran on the affected tables:
-- z/OS
SELECT NAME, CARDF, STATSTIME
FROM SYSIBM.SYSTABLES
WHERE NAME = 'ACCOUNT';
-- LUW
SELECT TABNAME, CARD, STATS_TIME
FROM SYSCAT.TABLES
WHERE TABNAME = 'ACCOUNT';
4. Check if a REBIND occurred:
-- z/OS: Check package timestamp
SELECT COLLID, NAME, BINDTIME
FROM SYSIBM.SYSPACKAGE
WHERE NAME = 'MYPACKAGE';
5. Check data volume changes: Compare current row counts with catalog statistics.
6. Check for DDL changes: New indexes, dropped indexes, altered columns.
7. Check for DB2 maintenance applied: APARs and PTFs can change optimizer behavior.
22.9.3 Access Path Stability Strategies
Preventing unexpected access path changes requires a proactive approach:
Run RUNSTATS before REBIND: Never REBIND with stale statistics. Always run RUNSTATS first, verify the statistics, and then REBIND.
Use PLANMGMT on z/OS: DB2 for z/OS supports plan management, which keeps the previous access plan as a fallback:
REBIND PACKAGE(COLL.PACKAGE) PLANMGMT(EXTENDED)
If the new plan performs worse, you can switch back:
REBIND PACKAGE(COLL.PACKAGE) SWITCH(PREVIOUS)
Monitor explain data: Capture EXPLAIN data regularly and compare plans over time. Automated tools can alert you when plans change.
Version migration testing: Before migrating DB2 versions, EXPLAIN all critical queries under both the old and new optimizer. Investigate any plan changes before they hit production.
22.10 Influencing the Optimizer
The optimizer usually makes good decisions. When it does not, you should first ensure that statistics are accurate, indexes are appropriate, and the SQL is well-written. Only after exhausting those options should you consider directly influencing the optimizer.
22.10.1 OPTIMIZE FOR n ROWS
The OPTIMIZE FOR n ROWS clause tells the optimizer that you intend to fetch only the first n rows, not the entire result set:
SELECT a.ACCOUNT_ID, a.BALANCE, c.LAST_NAME
FROM ACCOUNT a
JOIN CUSTOMER c ON a.CUSTOMER_ID = c.CUSTOMER_ID
WHERE a.BRANCH_ID = 5
ORDER BY a.BALANCE DESC
OPTIMIZE FOR 20 ROWS;
Without this clause, the optimizer might choose a plan that is optimal for returning all qualifying rows (e.g., hash join, sort). With OPTIMIZE FOR 20 ROWS, the optimizer favors plans that return the first rows quickly (e.g., nested loop join using an index that produces rows in the desired order, avoiding a sort).
This is particularly valuable for interactive applications that display results in pages.
22.10.2 REOPT (Reoptimization at Execution Time)
For queries with parameter markers or host variables, the optimizer must choose a plan without knowing the actual parameter values. It estimates filter factors using default assumptions (uniform distribution on the parameter column).
REOPT tells DB2 to re-optimize the query at execution time, when the actual parameter values are known:
-- z/OS: REOPT at BIND time
BIND PACKAGE(COLL.PKG) REOPT(ALWAYS)
-- LUW: REOPT is a bind/precompile option, not an SQL clause
BIND MYAPP.BND REOPT ONCE
-- REOPT ONCE reoptimizes on the first execution with actual values;
-- REOPT ALWAYS reoptimizes on every execution
REOPT(ALWAYS): Reoptimizes every execution. Maximum accuracy but highest overhead. REOPT(ONCE): Reoptimizes on the first execution after PREPARE, then reuses the plan. Good balance. REOPT(AUTO) (z/OS 12+): DB2 decides whether reoptimization would be beneficial based on the parameter values.
22.10.3 [z/OS] Optimization Hints (OPTHINT)
On z/OS, you can provide optimization hints through the PLAN_TABLE. A hint tells the optimizer to use a specific access path for a specific query.
-- Insert a hint into PLAN_TABLE
INSERT INTO userid.PLAN_TABLE (
QUERYNO, QBLOCKNO, APPLNAME, PROGNAME,
OPTHINT, ACCESSTYPE, MATCHCOLS, ACCESSNAME, TABNO
) VALUES (
1, 1, 'MYAPP', 'MYPROG',
'HINT001', 'I', 2, 'IX_ACCOUNT_CUST_TYPE', 1
);
Then bind the program with the hint:
BIND PACKAGE(COLL.PKG) OPTHINT('HINT001')
Warning: Optimization hints are a last resort. They override the optimizer's judgment and prevent it from adapting to changing data. A hint that improves performance today may degrade performance tomorrow when data volumes change. Use hints only when:
- You have definitively proven the optimizer is choosing a suboptimal plan.
- You have tried improving statistics, adding indexes, and rewriting the SQL.
- You have a plan to review and remove the hint periodically.
22.10.4 [LUW] Optimization Profiles and Guidelines
On LUW, the equivalent of hints is optimization profiles (also called optimization guidelines). These are XML documents that specify the desired access plan:
-- Create the SYSTOOLS.OPT_PROFILE table that stores optimization profiles
CALL SYSPROC.SYSINSTALLOBJECTS('OPT_PROFILES', 'C', NULL, NULL);
-- Store a profile, then activate it for the session
-- (Simplified example — the actual profile XML is more complex)
INSERT INTO SYSTOOLS.OPT_PROFILE (SCHEMA, NAME, PROFILE)
VALUES ('MERIDIAN', 'PROFILE1', BLOB('<OPTPROFILE>
<STMTPROFILE ID="profile1">
<STMTKEY>
<SCHEMA>MERIDIAN</SCHEMA>
<STMTTEXT>SELECT * FROM ACCOUNT WHERE STATUS = ?</STMTTEXT>
</STMTKEY>
<OPTGUIDELINES>
<IXSCAN TABLE="ACCOUNT" INDEX="IX_ACCOUNT_STATUS"/>
</OPTGUIDELINES>
</STMTPROFILE>
</OPTPROFILE>'));
SET CURRENT OPTIMIZATION PROFILE = 'MERIDIAN.PROFILE1';
LUW also supports statement-level optimization using the db2advis advisor tool, which recommends indexes and MQTs for specific queries.
22.10.5 Statistical Views (LUW)
On LUW, you can create statistical views that provide the optimizer with additional semantic information about the data:
-- Create a statistical view to help the optimizer understand a complex predicate pattern
CREATE VIEW ACTIVE_CHECKING AS
SELECT * FROM ACCOUNT
WHERE STATUS = 'ACTIVE' AND ACCOUNT_TYPE = 'CHECKING';
ALTER VIEW ACTIVE_CHECKING ENABLE QUERY OPTIMIZATION;
-- Collect distribution statistics on the statistical view itself
RUNSTATS ON TABLE MERIDIAN.ACTIVE_CHECKING
WITH DISTRIBUTION;
The optimizer can use the statistics collected on the view to make better estimates for queries that match the view's predicate pattern.
22.11 The Optimizer and the Meridian Bank
Let us apply everything we have learned to Meridian National Bank's workload. We will trace three representative queries through the optimizer's decision process.
22.11.1 Account Lookup by Customer ID
-- The bank's most frequent query: look up accounts for a customer
SELECT a.ACCOUNT_ID, a.ACCOUNT_TYPE, a.BALANCE, a.STATUS
FROM ACCOUNT a
WHERE a.CUSTOMER_ID = :hv_customer_id;
Optimizer analysis:
- Statistics consulted:
  - ACCOUNT.CARDF = 2,000,000 rows
  - CUSTOMER_ID COLCARD = 850,000 (850K distinct customers)
  - Index IX_ACCOUNT_CUSTID exists on (CUSTOMER_ID)
  - IX_ACCOUNT_CUSTID.NLEVELS = 3
  - IX_ACCOUNT_CUSTID.CLUSTERRATIO = 95 (well-clustered)
- Filter factor: FF = 1/850,000 ≈ 0.00000118. Estimated qualifying rows: 2,000,000 × 0.00000118 ≈ 2.4 rows.
- Candidate plans:
  - Table scan: Read all ~80,000 pages. Cost ≈ 80,000 × SEQIOCOST + 2,000,000 × CPU.
  - Index scan on IX_ACCOUNT_CUSTID: probe 3 index levels (root to leaf) + ~2 data pages. Cost ≈ 5 × RANDIOCOST + 2.4 × CPU.
- Decision: Index access wins overwhelmingly. The optimizer estimates the index path is approximately 16,000 times cheaper than the table scan.
- Access plan: Matching index scan on IX_ACCOUNT_CUSTID with MATCHCOLS = 1 (the index's single column matches the equality predicate).
This query runs in sub-millisecond time. The optimizer gets this right every time as long as COLCARD is reasonably accurate.
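The cost comparison above can be restated numerically. In this Python sketch the I/O and CPU weights are invented constants; the page counts come from the analysis, and the 16,000x figure compares pages touched:

```python
# Illustrative cost comparison for the account lookup
# (SEQ_IO, RAND_IO, and CPU are made-up weights for the sketch)
SEQ_IO, RAND_IO, CPU = 1.0, 4.0, 0.01

table_pages, table_rows = 80_000, 2_000_000
scan_cost = table_pages * SEQ_IO + table_rows * CPU

index_pages = 3 + 2               # root-to-leaf probe (~3 pages) + ~2 data pages
index_cost = index_pages * RAND_IO + 2.4 * CPU

io_ratio = table_pages / index_pages   # pages touched: 16,000x fewer
```

With any remotely realistic weights the index plan wins by three to four orders of magnitude, which is why this decision is robust even against imperfect statistics.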
22.11.2 Transaction History with Date Range
-- Monthly statement query: all transactions for an account in a date range
SELECT t.TXN_ID, t.TXN_TYPE, t.AMOUNT, t.TXN_DATE, t.DESCRIPTION
FROM TRANSACTION t
WHERE t.ACCOUNT_ID = :hv_account_id
AND t.TXN_DATE BETWEEN :hv_start_date AND :hv_end_date
ORDER BY t.TXN_DATE DESC;
Optimizer analysis:
- Statistics consulted:
  - TRANSACTION.CARDF = 80,000,000 rows
  - ACCOUNT_ID COLCARD = 2,000,000
  - TXN_DATE: LOW2KEY = '2020-01-02', HIGH2KEY = '2024-12-30'
  - Composite index IX_TXN_ACCT_DATE on (ACCOUNT_ID, TXN_DATE) with NLEVELS = 4
- Filter factors:
  - FF(ACCOUNT_ID = ?) = 1/2,000,000 = 0.0000005
  - FF(TXN_DATE BETWEEN '2024-03-01' AND '2024-03-31') = 31 / (365 × 5) ≈ 0.017
  - Combined FF = 0.0000005 × 0.017 ≈ 0.0000000085
  - Estimated rows: 80,000,000 × 0.0000000085 ≈ 0.68 rows
- But wait: The optimizer recognizes that the compound filter factor is misleading here because ACCOUNT_ID and TXN_DATE are in the same composite index. With MATCHCOLS = 2, the index navigates directly to the leaf pages for this specific account and date range. The actual estimate uses index-level statistics: approximately 40 rows per account per month (from FIRSTKEYCARD and FULLKEYCARD ratios).
- Candidate plans:
  - Index scan on IX_TXN_ACCT_DATE (MATCHCOLS = 2): Navigate to the start of the date range for this account, scan forward. ~40 qualifying entries. The index delivers rows in TXN_DATE order, but the DESC ordering requires a reverse scan or a sort.
  - Table scan: Cost ≈ 3,200,000 pages × SEQIOCOST. Absurdly expensive.
- Decision: Index access with MATCHCOLS = 2. If the index is defined as (ACCOUNT_ID, TXN_DATE DESC), the optimizer avoids a sort entirely.
- Performance observation: This query benefits enormously from the composite index. Without it, the optimizer would use the single-column index on ACCOUNT_ID, retrieve all transactions for the account across all years, and then filter by date — still fast for a single account, but the composite index makes it optimal.
22.11.3 Monthly Branch Performance Report
-- Management reporting: total deposits and withdrawals by branch for a month
SELECT b.BRANCH_NAME,
SUM(CASE WHEN t.TXN_TYPE = 'DEPOSIT' THEN t.AMOUNT ELSE 0 END) AS TOTAL_DEPOSITS,
SUM(CASE WHEN t.TXN_TYPE = 'WITHDRAWAL' THEN t.AMOUNT ELSE 0 END) AS TOTAL_WITHDRAWALS,
COUNT(*) AS TXN_COUNT
FROM BRANCH b
JOIN ACCOUNT a ON b.BRANCH_ID = a.BRANCH_ID
JOIN TRANSACTION t ON a.ACCOUNT_ID = t.ACCOUNT_ID
WHERE t.TXN_DATE BETWEEN '2024-03-01' AND '2024-03-31'
GROUP BY b.BRANCH_NAME
ORDER BY TOTAL_DEPOSITS DESC;
Optimizer analysis:
- This is a three-table join with aggregation. The optimizer's search space includes:
  - 3! = 6 possible join orders.
  - Multiple join methods per join step.
  - Multiple access paths per table.
- Key statistics:
  - BRANCH: 50 rows. A tiny table — a scan is always fine.
  - ACCOUNT: 2,000,000 rows.
  - TRANSACTION: 80,000,000 rows. The date predicate qualifies ~1.3 million rows (one month out of five years).
- Likely optimizer decision:
  - Start with TRANSACTION (apply the date filter first to reduce the large table).
  - Use an index on TXN_DATE to retrieve the ~1.3 million qualifying rows.
  - Hash join to ACCOUNT on ACCOUNT_ID (ACCOUNT is the build input, TRANSACTION is the probe).
  - Hash join to BRANCH on BRANCH_ID (BRANCH is the build input — 50 rows, a tiny hash table).
  - Hash aggregation for GROUP BY.
  - Sort for ORDER BY.
- MQT opportunity: If the MONTHLY_BRANCH_TOTALS MQT from Section 22.7.4 exists and is refreshed, the optimizer routes the query to the MQT — reducing the query from scanning millions of rows to reading 50 rows (one per branch for the month). This is the kind of optimization that turns a 45-second report into a 50-millisecond lookup.
- Without MQT: The query is I/O-intensive. The optimizer's choice of hash join vs. merge join depends on sort heap size and available memory. On z/OS, if hash join is not enabled (DFLTMETH parameter), the optimizer uses merge scan join with sorts on both inputs.
22.11.4 Lessons from the Meridian Workload
The three queries illustrate the optimizer's adaptability:
- Point lookup (Account by Customer ID): Index access, nested loop join (if joining), sub-millisecond. The optimizer makes this trivially correct.
- Range scan (Transaction history): Composite index access, moderate result set. The optimizer benefits from detailed statistics on the composite index.
- Analytical aggregate (Branch report): Multi-table join with aggregation on a large table. The optimizer chooses hash join and benefits enormously from MQTs. This is where stale statistics hurt the most — a wrong cardinality estimate on the date range predicate can cascade into wrong join method choices.
The pattern is clear: as query complexity increases, the optimizer's dependence on accurate statistics increases proportionally. Simple OLTP queries are forgiving — the optimizer makes the right choice even with imperfect statistics. Complex analytical queries are unforgiving — a 2x cardinality estimation error can cause a 100x performance degradation.
Spaced Review: Connecting to Earlier Chapters
From Chapter 6 (Joining Tables): You learned the SQL syntax for INNER JOIN, OUTER JOIN, and CROSS JOIN. Now you understand what happens behind the scenes — the optimizer chooses nested loop, merge scan, or hash join based on estimated cardinalities and available indexes. The join you write in SQL is a logical specification; the optimizer determines the physical execution.
From Chapter 15 (Indexing Strategies): You learned when and why to create indexes. Now you understand how the optimizer evaluates those indexes — computing the cost of index access vs. table scan, considering cluster ratio, counting leaf pages and index levels. An index is only valuable if the optimizer knows about it (from RUNSTATS) and estimates that it reduces cost.
From Chapter 17 (Statistics and RUNSTATS): You learned how to collect statistics. Now you understand why each statistic matters and how it feeds into the optimizer's filter factor calculations, cardinality estimates, and cost formulas. RUNSTATS is not administrative busywork — it is feeding the brain of the optimizer.
Chapter Summary
The DB2 optimizer is a cost-based query optimizer that evaluates multiple candidate execution plans, estimates the cost of each plan using catalog statistics, and selects the plan with the lowest estimated cost.
Key principles:
- Cost-based, not rule-based: The optimizer uses statistics and cost formulas, not fixed rules. The same query can produce different plans on different days if the statistics change.
- Filter factors drive everything: The optimizer's primary task is estimating how many rows each operation produces. Filter factors are computed from COLCARD, HIGH2KEY/LOW2KEY, frequency values, and histograms.
- The uniform distribution assumption is often wrong: Real data is skewed. Collecting detailed distribution statistics (frequency values, histograms, column group statistics) gives the optimizer the information it needs to make accurate estimates.
- Access path selection depends on cost: Table scan vs. index access, nested loop vs. hash join — every choice is based on estimated cost, which depends on estimated cardinality, which depends on statistics.
- Stage 1 vs. Stage 2 matters on z/OS: Writing sargable predicates (no functions on columns) ensures predicates are evaluated efficiently at Stage 1.
- Query rewrite is automatic but powerful: Predicate pushdown, subquery-to-join transformation, view merging, and MQT routing happen transparently.
- Stale statistics are the #1 enemy: Run RUNSTATS regularly, especially before REBIND operations and after significant data changes.
- Access paths change for identifiable reasons: RUNSTATS, REBIND, version migration, data growth, and configuration changes are the top five causes.
- Influence the optimizer as a last resort: OPTIMIZE FOR n ROWS, REOPT, and optimization hints are available when statistics and SQL improvements are insufficient.

The optimizer optimizes against a model of your data, not the data itself. When the model is accurate, the optimizer is brilliant. When the model is wrong, every decision downstream can be wrong. Your job as a DBA is to keep the model accurate.
What Comes Next
In Chapter 23, we will learn how to see the optimizer's decisions using EXPLAIN tools. You will learn to read EXPLAIN output, interpret access plan graphs, and use Visual Explain (LUW) and the PLAN_TABLE (z/OS) to diagnose performance problems. The optimizer's decisions are invisible until you EXPLAIN them — and Chapter 23 gives you the tools to make them visible.
"The optimizer is not your adversary. It is your most powerful ally — but it can only work with the information you give it. Give it accurate statistics, sargable predicates, and well-designed indexes, and it will find execution plans that no human could design by hand. Starve it of information, and it will guess. Sometimes it guesses well. Sometimes it does not. The difference between a good DBA and a great DBA is understanding which one happened and why."