Chapter 6 Exercises: DB2 Optimizer Internals
Part A: Filter Factors and Cost Estimation (Exercises 1–5)
Exercise 1: Basic Filter Factor Calculation
A table CUSTOMER has 2,000,000 rows. RUNSTATS reports the following for column REGION:
- COLCARDF = 8
- No frequency statistics collected
Calculate the filter factor for each predicate:
a) WHERE REGION = 'NORTHEAST'
b) WHERE REGION IN ('NORTHEAST', 'SOUTHEAST', 'MIDWEST')
c) WHERE REGION <> 'WEST'
d) WHERE REGION = 'NORTHEAST' AND CUST_TYPE = 'RETAIL' (CUST_TYPE COLCARDF = 4)
For part (d), explain why the independence assumption might cause a problem if all RETAIL customers happen to be in NORTHEAST.
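To check your arithmetic, here is a minimal Python sketch of the default filter-factor formulas, assuming the conventions used throughout this chapter: equality uses 1/COLCARDF, an IN list uses n/COLCARDF, a negated equality uses 1 - 1/COLCARDF, and ANDed predicates multiply under the independence assumption.

```python
# Default filter-factor arithmetic for Exercise 1 (no frequency stats).
ROWS = 2_000_000
REGION_CARD = 8
CUST_TYPE_CARD = 4

ff_eq = 1 / REGION_CARD                              # a) REGION = 'NORTHEAST'
ff_in = 3 / REGION_CARD                              # b) three-value IN list
ff_ne = 1 - 1 / REGION_CARD                          # c) REGION <> 'WEST'
ff_and = (1 / REGION_CARD) * (1 / CUST_TYPE_CARD)    # d) independence assumption

for label, ff in [("a", ff_eq), ("b", ff_in), ("c", ff_ne), ("d", ff_and)]:
    print(f"{label}) FF = {ff:.5f}, estimated rows = {ROWS * ff:,.0f}")
```

If all RETAIL customers were in NORTHEAST, the true combined filter factor for (d) would be 1/CUST_TYPE_CARD = 0.25 of the REGION filter, not the product the independence assumption predicts.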
Exercise 2: Range Predicate Filter Factors
Table DAILY_TRANSACTIONS has 500,000,000 rows. Column TRAN_DATE has:
- COLCARDF = 365
- LOW2KEY = '2024-12-16'
- HIGH2KEY = '2025-12-14'
Calculate the estimated filter factor and estimated row count for:
a) WHERE TRAN_DATE = '2025-06-15'
b) WHERE TRAN_DATE > '2025-11-01'
c) WHERE TRAN_DATE BETWEEN '2025-10-01' AND '2025-12-01'
Show your work for each calculation using the range interpolation formula.
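A sketch of the range-interpolation arithmetic, assuming the usual formulas: FF(COL > v) = (HIGH2KEY - v) / (HIGH2KEY - LOW2KEY), FF(COL BETWEEN v1 AND v2) = (v2 - v1) / (HIGH2KEY - LOW2KEY), and equality falling back to 1/COLCARDF.

```python
# Range-interpolation estimates for Exercise 2, treating dates as day counts.
from datetime import date

ROWS = 500_000_000
COLCARD = 365
LOW2KEY = date(2024, 12, 16)
HIGH2KEY = date(2025, 12, 14)
span = (HIGH2KEY - LOW2KEY).days          # width of the key range in days

ff_a = 1 / COLCARD                                          # a) equality
ff_b = (HIGH2KEY - date(2025, 11, 1)).days / span           # b) > '2025-11-01'
ff_c = (date(2025, 12, 1) - date(2025, 10, 1)).days / span  # c) BETWEEN

for label, ff in [("a", ff_a), ("b", ff_b), ("c", ff_c)]:
    print(f"{label}) FF = {ff:.5f}, estimated rows = {ROWS * ff:,.0f}")
```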
Exercise 3: Compound Predicate Analysis
Given these catalog statistics for table CLAIMS (50,000,000 rows):
- CLAIM_STATUS COLCARDF = 5
- PROVIDER_STATE COLCARDF = 52
- PROCEDURE_CODE COLCARDF = 12,000
- CLAIM_AMOUNT HIGH2KEY = 999999.99, LOW2KEY = 0.01
Calculate the combined filter factor and estimated rows for:
WHERE CLAIM_STATUS = 'OPEN'
AND PROVIDER_STATE = 'CA'
AND PROCEDURE_CODE = '99213'
AND CLAIM_AMOUNT > 500.00
Now suppose you know that 40% of all OPEN claims are from CA (because California has the most providers). How does the actual filter factor differ from the optimizer's estimate? By what factor is the optimizer wrong?
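A quick check of the independence-assumption estimate, assuming 1/COLCARDF for the equalities and range interpolation over (LOW2KEY, HIGH2KEY) for the CLAIM_AMOUNT predicate:

```python
# Combined default estimate for Exercise 3's four ANDed predicates.
ROWS = 50_000_000
ff_status = 1 / 5
ff_state = 1 / 52
ff_proc = 1 / 12_000
ff_amount = (999_999.99 - 500.00) / (999_999.99 - 0.01)   # > 500.00

ff_combined = ff_status * ff_state * ff_proc * ff_amount
print(f"combined FF = {ff_combined:.3e}, estimated rows = {ROWS * ff_combined:.1f}")
```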
Exercise 4: Frequency Statistics Impact
A table has 10,000,000 rows. Column ACCT_TYPE has COLCARDF = 5 and the following frequency statistics:
| Value | FREQUENCYF |
|---|---|
| CHK | 0.45 |
| SAV | 0.30 |
| MMA | 0.15 |
| CDA | 0.08 |
| IRA | 0.02 |
a) Without frequency statistics, what filter factor would DB2 use for ACCT_TYPE = 'CHK'?
b) With frequency statistics, what filter factor does DB2 use for ACCT_TYPE = 'CHK'?
c) How many rows does each estimate predict?
d) Which estimate is closer to reality, and by how much?
e) For which value of ACCT_TYPE would the default (1/COLCARDF) estimate be closest to actual?
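A sketch comparing the default 1/COLCARDF estimate against the frequency statistics, using the frequencies from the table above:

```python
# Default vs. frequency-based row estimates for Exercise 4.
ROWS = 10_000_000
COLCARD = 5
freq = {"CHK": 0.45, "SAV": 0.30, "MMA": 0.15, "CDA": 0.08, "IRA": 0.02}

default_ff = 1 / COLCARD    # 0.20 for every value when no FREQVAL stats exist
for value, actual_ff in freq.items():
    print(f"{value}: default rows = {ROWS * default_ff:>9,.0f}, "
          f"frequency-based rows = {ROWS * actual_ff:>9,.0f}")
```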
Exercise 5: Filter Factor and Access Path Crossover
Table ORDERS has 100,000,000 rows, 5,000,000 pages. An index IX_ORD_STATUS exists on column ORDER_STATUS (COLCARDF = 10, CLUSTERRATIO = 95%).
Assume these simplified I/O costs:
- Random I/O: 4 ms per page
- Sequential prefetch I/O: 0.1 ms per page (within a 32-page prefetch stream)
- Tablespace scan reads all 5,000,000 pages via sequential prefetch
For a query WHERE ORDER_STATUS = :HV:
a) Calculate the estimated cost (I/O only) of index access when FF = 1/10 (10,000,000 rows). Assume 1 row per RID fetch on average due to high cluster ratio.
b) Calculate the cost of a tablespace scan.
c) At what filter factor does the crossover occur (where index cost equals tablespace scan cost)?
d) If CLUSTERRATIO drops to 20%, how does the crossover point change?
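The crossover arithmetic can be sketched directly from the simplified costs above (4 ms random, 0.1 ms sequential) together with the stated assumption of one random data-page read per qualifying row; index leaf-page I/O is ignored for simplicity.

```python
# Crossover between index access and tablespace scan for Exercise 5.
PAGES = 5_000_000
ROWS = 100_000_000
RANDOM_MS = 4.0
SEQ_MS = 0.1

def index_cost_ms(ff):
    # one random data-page fetch per qualifying row (per the exercise)
    return ROWS * ff * RANDOM_MS

scan_cost_ms = PAGES * SEQ_MS               # every page via sequential prefetch

# crossover where ROWS * ff * RANDOM_MS == PAGES * SEQ_MS
crossover_ff = scan_cost_ms / (ROWS * RANDOM_MS)

print(f"index cost at FF = 1/10: {index_cost_ms(0.1):,.0f} ms")
print(f"tablespace scan cost:    {scan_cost_ms:,.0f} ms")
print(f"crossover FF:            {crossover_ff:.5f}")
```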
Part B: EXPLAIN Interpretation (Exercises 6–12)
Exercise 6: Reading a Simple PLAN_TABLE
Interpret the following PLAN_TABLE output. For each step, identify: what table is accessed, how it's accessed, what join method is used, and whether any sorts occur.
QUERYNO QBLOCKNO PLANNO METHOD TNAME ACCESSTYPE MATCHCOLS ACCESSNAME INDEXONLY PREFETCH SORTN_JOIN SORTC_JOIN SORTN_ORDERBY
------- -------- ------ ------ -------------- ---------- --------- --------------- -------- -------- ---------- ---------- -------------
100 1 1 0 EMPLOYEE I 2 IX_EMP_DEPTDT N S N N N
100     1        2      1      DEPARTMENT     I          1         IX_DEPT_PK      Y        N        N          N          N
What SQL could produce this access path? Write a plausible query.
Exercise 7: Diagnosing a Merge Scan Join
QUERYNO QBLOCKNO PLANNO METHOD TNAME ACCESSTYPE MATCHCOLS ACCESSNAME INDEXONLY PREFETCH SORTN_JOIN SORTC_JOIN
------- -------- ------ ------ ---------------- ---------- --------- ----------------- -------- -------- ---------- ----------
200 1 1 0 CLAIMS R 0 N S Y N
200 1 2 2 CLAIM_DETAILS R 0 N S N Y
a) What access method is used for each table?
b) What join method connects them?
c) Why might the optimizer have chosen tablespace scans for both tables?
d) What indexes would you investigate creating to improve this?
e) If CLAIMS has 50M rows and CLAIM_DETAILS has 200M rows, estimate the sort work file space needed (assume 200 bytes average row length for each table, 80% compression in sort work files).
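Part (e) is straightforward arithmetic; a sketch, assuming "80% compression" means the sort work-file copy occupies 20% of the uncompressed row data:

```python
# Sort work-file sizing estimate for Exercise 7, part (e).
GB = 1024 ** 3
workfile_gb = {}
for name, rows in [("CLAIMS", 50_000_000), ("CLAIM_DETAILS", 200_000_000)]:
    raw_bytes = rows * 200                         # 200-byte average rows
    workfile_gb[name] = raw_bytes * (1 - 0.80) / GB
    print(f"{name}: raw {raw_bytes / GB:.1f} GB, "
          f"sort work file = {workfile_gb[name]:.1f} GB")
```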
Exercise 8: Index-Only Access Evaluation
Given this PLAN_TABLE row:
QUERYNO QBLOCKNO PLANNO METHOD TNAME ACCESSTYPE MATCHCOLS ACCESSNAME INDEXONLY PREFETCH
------- -------- ------ ------ ------------------ ---------- --------- -------------------- -------- --------
300 1 1 0 ACCOUNT_MASTER I 1 IX_AMAST_STATUS N L
The query is:
SELECT ACCOUNT_ID, ACCOUNT_NAME, CURRENT_BALANCE
FROM ACCOUNT_MASTER
WHERE STATUS = 'A';
Index IX_AMAST_STATUS is defined on (STATUS).
a) Why is INDEXONLY = 'N' even though STATUS is the only predicate column?
b) What change to the index would make this query index-only?
c) PREFETCH = 'L' (list prefetch). What does this tell you about the CLUSTERRATIO?
d) Would a tablespace scan potentially be better here if STATUS = 'A' represents 80% of rows? Explain.
Exercise 9: Multi-Step EXPLAIN Interpretation
QUERYNO QBLOCKNO PLANNO METHOD TNAME ACCESSTYPE MATCHCOLS ACCESSNAME INDEXONLY PREFETCH SORTN_JOIN SORTC_JOIN SORTN_ORDERBY SORTN_GROUPBY
------- -------- ------ ------ ------------------- ---------- --------- ------------------ -------- -------- ---------- ---------- ------------- -------------
400 1 1 0 TRANSACTIONS I 2 IX_TRAN_ACCTDT N S N N N N
400     1        2      1      ACCOUNT_MASTER      I          1         IX_AMAST_PK        N        N        N          N          N             N
400     1        3      1      BRANCH_REFERENCE    I          1         IX_BRANCH_PK       Y        N        N          N          N             N
400     2        1      0      WORKFILE            N/A        0                            N        N        N          N          Y             N
a) How many tables are joined, and in what sequence?
b) What join method is used for each join?
c) QBLOCKNO changes to 2 in the last row — what does this indicate?
d) SORTN_ORDERBY = 'Y' in the last row — what does this tell you?
e) BRANCH_REFERENCE is INDEXONLY = 'Y' — what can you infer about the query's SELECT list?
Exercise 10: Comparing Two Plans
Plan A (current):
PLANNO METHOD TNAME ACCESSTYPE MATCHCOLS ACCESSNAME PREFETCH SORTN_JOIN
------ ------ -------------- ---------- --------- -------------- -------- ----------
1 0 ORDER_HEADER I 3 IX_OHDR_CUSTDT S N
2 1 ORDER_DETAIL I 1 IX_ODET_ORDID S N
Plan B (after RUNSTATS):
PLANNO METHOD TNAME ACCESSTYPE MATCHCOLS ACCESSNAME PREFETCH SORTN_JOIN
------ ------ -------------- ---------- --------- -------------- -------- ----------
1 0 ORDER_DETAIL R 0 S Y
2 2 ORDER_HEADER I 1 IX_OHDR_PK S N
a) Describe the fundamental differences between Plan A and Plan B.
b) What likely changed in the statistics to cause this shift?
c) Which plan would you expect to perform better for a query that returns 500 orders out of 10,000,000? Why?
d) What immediate action would you take if Plan B caused a production performance degradation?
e) What long-term prevention would you implement?
Exercise 11: DSN_STATEMNT_TABLE Analysis
QUERYNO PROCSU PROCMS COST_CATEGORY REASON
------- -------- -------- ------------- ------
500 4523891 12450 A
501 89201 340 B
502 15678234 45230 A NOIDX
a) Query 501 has COST_CATEGORY = 'B'. What does this mean, and what should you do first?
b) Query 502 has REASON = 'NOIDX'. What does this tell you about the access path?
c) Query 500 estimates 12.45 seconds. If the actual elapsed time is 180 seconds, what should you investigate?
d) How would you use the PROCSU (CPU service units) values to prioritize tuning efforts across these three queries?
Exercise 12: Parallelism in EXPLAIN
QUERYNO PLANNO METHOD TNAME ACCESSTYPE MATCHCOLS ACCESS_DEGREE JOIN_DEGREE PARALLELISM_MODE
------- ------ ------ -------------- ---------- --------- ------------- ----------- ----------------
600 1 0 HUGE_TABLE R 0 4 0 C
600 2 4 LOOKUP_TABLE I 1 1 4 C
a) What does ACCESS_DEGREE = 4 for HUGE_TABLE mean?
b) What is PARALLELISM_MODE = 'C'?
c) JOIN_DEGREE = 4 for the second step — what does this tell you about how the hybrid join (METHOD = 4) is parallelized?
d) When might you want to disable this parallelism, and how would you do it?
Part C: RUNSTATS Strategy (Exercises 13–17)
Exercise 13: RUNSTATS Frequency Design
You manage a table POLICY_CLAIMS with this profile:
- 200,000,000 rows
- 50,000 new claims inserted daily
- 10,000 claims updated to SETTLED status daily
- Quarterly purge removes claims older than 7 years (~5M rows)
Design a RUNSTATS strategy:
a) How often should full RUNSTATS run?
b) Should you use inline RUNSTATS with REORG? When does REORG run?
c) After the quarterly purge, what RUNSTATS considerations exist?
d) Write the RUNSTATS JCL with appropriate COLUMN and COLGROUP specifications, given these common predicates:
- CLAIM_STATUS = ?
- PROVIDER_STATE = ? AND PROCEDURE_CODE = ?
- CLAIM_DATE BETWEEN ? AND ?
- POLICY_TYPE = ? AND CLAIM_AMOUNT > ?
Exercise 14: Column Group Statistics
A query performs poorly because of correlated predicates on table EMPLOYEE:
WHERE DEPARTMENT = 'ENGINEERING'
AND JOB_TITLE = 'SENIOR DEVELOPER'
AND LOCATION = 'SAN FRANCISCO'
RUNSTATS reports:
- DEPARTMENT COLCARDF = 20
- JOB_TITLE COLCARDF = 150
- LOCATION COLCARDF = 30
a) What filter factor does DB2 calculate using the independence assumption?
b) If 'SENIOR DEVELOPER' only exists in ENGINEERING and FINANCE departments, and all San Francisco engineers are senior developers, what is the actual filter factor for a 1,000,000-row table?
c) Write the RUNSTATS control statement to collect column group statistics that would help.
d) After collecting column group statistics, how does the optimizer's estimate change?
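The mechanism behind part (d) can be sketched numerically. Note the column-group cardinality of 400 below is purely hypothetical, chosen only to illustrate how COLGROUP statistics replace the multiplied estimate.

```python
# Independence assumption vs. column-group estimate for Exercise 14.
ROWS = 1_000_000
independent_ff = (1 / 20) * (1 / 150) * (1 / 30)   # multiply the three FFs
print(f"independence estimate: {ROWS * independent_ff:.1f} rows")

# With COLGROUP statistics on (DEPARTMENT, JOB_TITLE, LOCATION), the
# optimizer can use 1 / (cardinality of the column group) instead:
colgroup_cardf = 400                               # hypothetical value
print(f"column-group estimate: {ROWS / colgroup_cardf:.1f} rows")
```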
Exercise 15: Statistics Profile Management
Write SQL to create statistics profiles for these tables:
a) ACCOUNT_MASTER — primary key ACCOUNT_ID, common predicates on STATUS, ACCOUNT_TYPE, BRANCH_CODE, and the combination (STATUS, ACCOUNT_TYPE)
b) DAILY_TRANSACTIONS — partitioned by TRAN_DATE, common predicates on ACCOUNT_ID, TRAN_TYPE, CHANNEL, STATUS, and combinations (TRAN_DATE, STATUS) and (CHANNEL, TRAN_TYPE)
Then write the RUNSTATS JCL that uses these profiles.
Exercise 16: Diagnosing Stale Statistics
You notice a query that used to run in 5 seconds now takes 3 minutes. EXPLAIN shows a tablespace scan where there was previously a matching index scan.
Your investigation reveals:
SELECT CARDF, NPAGES FROM SYSIBM.SYSTABLES
WHERE NAME = 'TRANSACTION_LOG' AND CREATOR = 'PROD';
-- CARDF = 50000000, NPAGES = 2500000
SELECT COUNT(*) FROM PROD.TRANSACTION_LOG;
-- COUNT = 5000000 (the catalog statistic is 10x the actual row count)
a) How did this discrepancy likely arise?
b) How does this affect the filter factor calculations?
c) What is the fix, and what prevention should you implement?
d) If the table was recently REORGed without inline RUNSTATS, could that explain the discrepancy? How?
Exercise 17: RUNSTATS Impact Simulation
Before running RUNSTATS on a critical production table, you want to understand the potential impact.
a) Describe how to use EXPLAIN(ONLY) to simulate the effect of updated statistics without actually rebinding the package.
b) Write the sequence of steps to safely evaluate RUNSTATS impact:
1. Capture current PLAN_TABLE baseline
2. Run RUNSTATS
3. Compare potential new plan to baseline
4. Decide whether to REBIND
c) How does PLANMGMT(EXTENDED) give you a safety net during this process?
Part D: Plan Stability and Prevention (Exercises 18–22)
Exercise 18: PLANMGMT Scenario
A production package was bound at three different times with these access paths:
| Version | Bind Date | Access Path | Elapsed Time |
|---|---|---|---|
| ORIGINAL | 2024-01-15 | Index scan, 4 MATCHCOLS, NL join | 22 minutes |
| PREVIOUS | 2025-06-01 | Index scan, 4 MATCHCOLS, NL join | 24 minutes |
| CURRENT | 2025-12-16 | TS scan, merge scan join | 88 minutes |
a) Write the REBIND command to switch to the PREVIOUS version.
b) Write the REBIND command to switch to the ORIGINAL version.
c) After switching, what becomes the new CURRENT, PREVIOUS, and ORIGINAL?
d) If you then REBIND with APREUSE(WARN), what happens?
Exercise 19: OPTHINT Construction
A query on CLAIMS always performs best with index IX_CLM_STATUSDT (STATUS, CLAIM_DATE) but the optimizer keeps choosing IX_CLM_PROVCODE (PROVIDER_CODE) after RUNSTATS.
a) Write the DSN_USERQUERY_TABLE INSERT to hint the preferred index.
b) Write the BIND command that activates the hint.
c) What are the risks of this approach?
d) Under what circumstances should you remove the hint?
e) What alternative approaches should you try before resorting to a hint?
Exercise 20: Preventing the CNB Incident
Design a complete plan regression prevention system for a production DB2 environment with 500 packages. Your design should include:
a) A PLAN_TABLE baseline capture process (when, how, retention)
b) An automated comparison process after REBIND
c) Alert severity levels and escalation procedures
d) RUNSTATS scheduling and validation
e) PLANMGMT settings and APREUSE usage during maintenance windows
Exercise 21: Dynamic SQL Plan Management
A CICS online application uses dynamic SQL (PREPARE/EXECUTE) for a search screen with many optional predicates. Users report intermittent slow response times.
a) Why are dynamic SQL plans potentially less stable than static SQL plans?
b) What is statement cache invalidation, and what causes it?
c) How does CONCENTRATE STATEMENTS WITH LITERALS help (or hurt)?
d) The REOPT(AUTO) bind option — what does it do and when should it be used?
e) Design a monitoring approach to detect dynamic SQL plan instability.
Exercise 22: DB2 Version Migration Access Path Management
You're migrating from DB2 12 to DB2 13 on a weekend maintenance window. 500 production packages need to be rebound.
a) Write the migration plan for preserving access path stability:
1. Pre-migration steps
2. Migration weekend steps
3. Post-migration validation
4. Fallback procedures
b) How do you handle packages where the new optimizer produces a genuinely better plan? How do you distinguish "better new plan" from "regression"?
c) What role does APREUSE play in this migration?
Part E: Applied Scenarios (Exercises 23–25)
Exercise 23: The Midnight Batch Crisis
At 11:30 PM, the end-of-day batch cycle starts. By 12:15 AM, the operations team reports that job EODPOST2 is running 5x slower than normal. You are the on-call DBA.
Walk through your complete diagnostic process:
a) First five commands you would run (in order)
b) EXPLAIN shows the access path hasn't changed — what else could cause 5x degradation?
c) You discover that CLUSTERRATIO for the primary index dropped from 98% to 31% after a mass update job ran earlier in the day. How does this explain the degradation even without an access path change?
d) What is your immediate fix?
e) What process change prevents this in the future?
Exercise 24: The Index That Made Things Worse
A developer creates a new index to "help" a slow query:
CREATE INDEX IX_TRAN_NEW
ON DAILY_TRANSACTIONS (STATUS, CHANNEL, TRAN_TYPE, AMOUNT);
After REBIND, the query that was previously using IX_DTRAN_ACCTDT (ACCOUNT_ID, TRAN_DATE, TRAN_TYPE, AMOUNT) switches to the new index — and runs 10x slower.
a) Why did the optimizer choose the new index?
b) Under what predicate pattern would IX_TRAN_NEW be appropriate?
c) The query's WHERE clause is: WHERE ACCOUNT_ID = ? AND TRAN_DATE = ? AND STATUS = 'C'. Explain why IX_DTRAN_ACCTDT is vastly superior for this query.
d) What should the developer have done before creating the index?
e) What is the correct resolution? (Hint: don't just drop the index — other queries might benefit.)
Exercise 25: Comprehensive Optimization Case
You inherit a COBOL/DB2 batch program with this query that runs every night and currently takes 4 hours:
SELECT A.ACCOUNT_ID, A.ACCOUNT_NAME, A.BRANCH_CODE,
SUM(D.AMOUNT) AS DAILY_TOTAL,
COUNT(*) AS TRAN_COUNT
FROM ACCOUNT_MASTER A
INNER JOIN DAILY_TRANSACTIONS D
ON A.ACCOUNT_ID = D.ACCOUNT_ID
INNER JOIN BRANCH_REFERENCE B
ON A.BRANCH_CODE = B.BRANCH_CODE
WHERE D.TRAN_DATE = CURRENT DATE - 1 DAY
AND D.STATUS IN ('C', 'R')
AND A.STATUS = 'A'
AND B.REGION = 'NORTHEAST'
GROUP BY A.ACCOUNT_ID, A.ACCOUNT_NAME, A.BRANCH_CODE
ORDER BY A.BRANCH_CODE, DAILY_TOTAL DESC;
EXPLAIN shows:
- Tablespace scan on DAILY_TRANSACTIONS (180 GB)
- Merge scan join with ACCOUNT_MASTER
- Nested loop join with BRANCH_REFERENCE
- Two sorts (one for join, one for ORDER BY)
Table statistics:
- DAILY_TRANSACTIONS: 500M rows, TRAN_DATE COLCARDF = 365
- ACCOUNT_MASTER: 5M rows
- BRANCH_REFERENCE: 500 rows, REGION COLCARDF = 6
a) Calculate the expected filter factor for the DAILY_TRANSACTIONS predicates.
b) Design an optimal index for DAILY_TRANSACTIONS for this query.
c) What join sequence and join methods would you expect with good indexes?
d) Could BRANCH_REFERENCE be eliminated from the join through denormalization? What are the tradeoffs?
e) Estimate the performance improvement from your proposed changes (rough order of magnitude).
f) Write the complete optimization action plan: indexes, RUNSTATS, REBIND, validation.
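A sketch of part (a)'s arithmetic: TRAN_DATE uses 1/COLCARDF, and the IN list uses n/COLCARDF. The statistics above do not give a COLCARDF for STATUS, so the value of 5 below is an illustrative assumption, not part of the exercise.

```python
# Filter-factor estimate for the DAILY_TRANSACTIONS predicates (Exercise 25a).
ROWS = 500_000_000
ff_date = 1 / 365                # TRAN_DATE = CURRENT DATE - 1 DAY
STATUS_CARD = 5                  # hypothetical COLCARDF for STATUS
ff_status = 2 / STATUS_CARD      # STATUS IN ('C', 'R')

ff = ff_date * ff_status
print(f"FF = {ff:.6f}, estimated rows = {ROWS * ff:,.0f}")
```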
Part M: Spaced Review — Connecting to Previous Chapters
Exercise M1: Optimizer and z/OS Architecture (Chapter 1)
The DB2 optimizer runs in the DBM1 address space.
a) What other DB2 processing occurs in DBM1?
b) If the optimizer is consuming excessive CPU during BIND processing (e.g., a very complex query with 15 table references), where would you see this in z/OS monitoring?
c) How does the OPTHINTS subsystem parameter (which enables optimization hints) relate to the zparm configuration you learned about in Chapter 1?
Exercise M2: Catalog Statistics and System Tables (Chapter 2)
The optimizer relies on SYSIBM.SYSCOLUMNS, SYSIBM.SYSTABLES, and SYSIBM.SYSINDEXES.
a) These are regular DB2 tables in the DSNDB06 catalog database. Can RUNSTATS itself experience performance problems accessing the catalog? Under what circumstances?
b) If the catalog tablespaces need REORG, what is the correct procedure, and how does it differ from REORGing a user tablespace?
c) How does CATMAINT relate to optimizer functionality during DB2 migration?
Exercise M3: Buffer Pools and the Optimizer (Chapter 3)
The optimizer's cost model factors in buffer pool hit ratios.
a) If a buffer pool's hit ratio improves from 85% to 99%, how might this change the optimizer's preference between index access and tablespace scan?
b) The NPAGES statistic tells the optimizer how many pages a tablespace occupies. How does this interact with the buffer pool size to influence prefetch decisions?
c) A DBA doubles the size of the buffer pool used by DAILY_TRANSACTIONS. After REBIND, the access path changes from tablespace scan to index scan. Explain the cost model mechanics that drive this change.
Exercise M4: Dataset Placement and I/O Cost (Chapter 4)
a) The optimizer estimates I/O cost using assumptions about device speed. If you migrate a tablespace from spinning disk to SSD (solid-state), does the optimizer automatically adjust its cost estimates? How?
b) DSNDB07 work files are used for sorts in merge scan joins. From Chapter 4, what dataset placement best practices apply to work files?
c) How does VSAM striping across multiple volumes affect the optimizer's tablespace scan cost estimation?
Exercise M5: SQL Patterns and Optimizer Interaction (Chapter 5)
From Chapter 5's advanced SQL patterns:
a) A correlated subquery is rewritten by the optimizer as a join. Under what circumstances does this rewrite improve the access path?
b) EXISTS vs. IN — the optimizer can transform between these. When does it matter which form you write?
c) A CASE expression in the WHERE clause — can the optimizer use an index on the column referenced inside the CASE? Explain with an example.