Chapter 27: Quiz — Performance Diagnosis Methodology

DataField.Dev

Chapter 27: Quiz — Performance Diagnosis Methodology

Test your understanding of DB2 performance diagnosis techniques and methodology.

Question 1

What is the correct first step in the performance diagnosis methodology?

A) Collect all available metrics B) Define the symptom precisely C) Run EXPLAIN on all SQL statements D) Increase the buffer pool size

Question 2

A transaction shows: Elapsed = 2,000ms, CPU = 50ms, I/O Wait = 100ms, Lock Wait = 1,800ms. This transaction is:

A) CPU-bound B) I/O-bound C) Lock-bound D) Memory-bound

Question 3

On z/OS, which accounting trace class provides the breakdown of suspension time into I/O, lock, and other categories?

A) Class 1 B) Class 2 C) Class 3 D) Class 7

Question 4

Which LUW table function returns per-statement performance metrics from the package cache?

A) MON_GET_CONNECTION B) MON_GET_ACTIVITY C) MON_GET_PKG_CACHE_STMT D) MON_GET_BUFFERPOOL

Question 5

A SQL statement reads 500,000 rows but returns only 50. The read efficiency ratio is:

A) 0.0001 B) 50 C) 10,000 D) 500,000

Question 6

What is the primary indicator of a CPU-bound SQL statement?

A) High lock wait time B) High getpage count relative to rows returned C) Low buffer pool hit ratio D) High sort overflow count

Question 7

A buffer pool hit ratio of 62% for an OLTP buffer pool indicates:

A) Excellent performance — no action needed B) The buffer pool is too large and should be reduced C) The buffer pool is too small or is being polluted by batch scans D) The database needs to be reorganized

Question 8

After making a performance change, the "before/after proof" step requires:

A) Asking users if the system feels faster B) Comparing the same metrics before and after under comparable conditions C) Running EXPLAIN and checking if the access path changed D) Restarting the database and measuring startup time

Question 9

On z/OS, the QWACAGET field in the accounting trace represents:

A) Total CPU time B) Number of getpages (buffer pool page accesses) C) Number of lock requests D) Elapsed time in seconds

Question 10

Which scenario most likely indicates stale statistics as the root cause?

A) A query that was fast yesterday and slow today with no code changes B) A query that has been gradually getting slower over 6 months C) A query that is slow only during the batch window D) A query that fails with SQLCODE -911

Question 11

The "read efficiency ratio" (ROWS_READ / ROWS_RETURNED) is most useful for identifying:

A) Lock contention problems B) Missing or suboptimal indexes C) Buffer pool sizing issues D) Sort overflow problems

Question 12

On LUW, which MON_GET function would you use to identify which tables have the most physical I/O?

A) MON_GET_PKG_CACHE_STMT B) MON_GET_CONNECTION C) MON_GET_TABLE D) MON_GET_BUFFERPOOL

Question 13

A batch job that ran in 3 hours last month now runs in 7 hours. The getpage count has increased proportionally. The most likely cause is:

A) CPU hardware degradation B) Data growth without statistics update or index addition C) Lock contention from OLTP D) Disk hardware failure

Question 14

What is a Stage 2 predicate on z/OS?

A) A predicate that is evaluated during the second phase of a two-phase commit B) A predicate that cannot be evaluated during index access and requires data page access C) A predicate that is applied after sorting the result set D) A predicate that uses a subquery

Question 15

Sequential prefetch not triggering on a sequential scan most likely indicates:

A) The buffer pool is too large B) The data is fragmented and needs REORG C) The isolation level is too restrictive D) The query is using a hash join

Question 16

You deployed a new index and see: Before (CPU 200ms, I/O 250ms), After (CPU 350ms, I/O 20ms). The most likely explanation for the CPU increase is:

A) The index is corrupt B) The index introduced more getpages than the table scan C) DB2 is now doing more work per I/O D) More rows qualify due to the new index

Question 17

Which of the following is NOT one of the four primary bottleneck categories in the diagnosis methodology?

A) CPU-bound B) Network-bound C) I/O-bound D) Lock-bound

Question 18

On LUW, the catalog cache hit ratio should be:

A) > 50% B) > 80% C) > 95% D) > 99%

Question 19

When investigating an intermittent performance problem, the most important type of data to collect is:

A) A single snapshot during the problem B) Time-series data collected at regular intervals before, during, and after the problem C) The EXPLAIN output for the affected query D) The database configuration parameters

Question 20

Why is it important to change only one thing at a time when fixing performance problems?

A) To minimize the number of restarts required B) To be able to attribute the improvement (or regression) to a specific change C) Because DB2 can only process one DDL change at a time D) To reduce the risk of lock conflicts during the change

Answer Key

B — Always define the symptom first. "The system is slow" is not actionable. A precise symptom (what, when, how much) directs the entire investigation.
C — Lock wait time (1,800ms) accounts for 90% of the elapsed time. This transaction is lock-bound.
C — Class 3 provides the suspension time breakdown (I/O, lock, latch, drain, etc.). Class 1 is total thread time, Class 2 is in-DB2 time.
C — MON_GET_PKG_CACHE_STMT returns metrics for every SQL statement cached in the package cache, including execution count, CPU, elapsed time, rows read, and rows returned.
C — 500,000 / 50 = 10,000. DB2 reads 10,000 rows for every row it returns, indicating severe index inefficiency.
B — High getpages mean DB2 is accessing many buffer pool pages, each requiring CPU for hash lookups and page access. This is the primary driver of CPU consumption in DB2.
C — A 62% hit ratio for OLTP means 38% of page accesses require physical I/O. OLTP pools should be > 98%. The pool is either too small or is being filled with non-OLTP data.
B — Quantitative comparison of the same metrics under comparable conditions (same day of week, similar load, similar time) is the only reliable way to prove a change worked.
B — QWACAGET is the getpage count — the number of times DB2 accessed a page in the buffer pool (logical reads).
B — Gradual degradation over months typically indicates data growth changing the optimizer's cost calculations, but stale statistics preventing the optimizer from choosing a better path.
B — A high ratio means DB2 reads many rows to return few. This is the classic symptom of a missing index or an index that does not match the query predicates.
C — MON_GET_TABLE returns per-table metrics including DATA_OBJECT_P_READS and INDEX_OBJECT_P_READS, showing which tables generate the most physical I/O.
B — Proportional getpage increase with data growth, without statistics updates, means DB2 is scanning more data. RUNSTATS and potential index changes are the fix.
B — Stage 2 predicates are evaluated after data pages are fetched, not during index access. They cannot leverage the index and require additional CPU and I/O. Common causes include functions on columns and leading wildcards.
B — When data is physically disorganized on disk, DB2 may not recognize the access pattern as sequential and will not trigger prefetch. REORG restores physical ordering.
C — The index eliminated I/O wait (from 250ms to 20ms) but the access path now processes data differently. The CPU increase is likely because more getpages happen against the buffer pool (all in-memory, hence CPU not I/O). Overall elapsed time likely decreased, making this a net improvement.
B — The four categories are CPU, I/O, lock, and memory. Network is an external factor outside DB2's control and is classified under "Other Wait."
D — The catalog cache should have > 99% hit ratio. A lower ratio means DB2 is repeatedly reading system catalog tables, wasting I/O and CPU.
B — Intermittent problems require time-series data to correlate the problem with specific events. A single snapshot may miss the problem entirely or capture it without context.
B — If you change two things and performance improves, you cannot know which change helped (or if one helped and the other hurt). Single-variable changes enable clear attribution.