Chapter 20 Quiz: Monitoring and Problem Determination
Test your understanding of DB2 monitoring concepts, tools, and diagnostic techniques across both z/OS and LUW platforms. Each question has one best answer unless otherwise noted.
Question 1
What is the primary advantage of proactive monitoring over reactive monitoring?
A) It eliminates the need for a DBA on-call rotation B) It detects problems before they impact end users C) It costs less to implement than reactive monitoring D) It requires fewer monitoring tools
Question 2
Which five pillars should a comprehensive DB2 monitoring strategy address?
A) CPU, Memory, Disk, Network, Security B) Availability, Performance, Capacity, Concurrency, Integrity C) Backup, Recovery, Logging, Locking, Indexing D) Hardware, Software, Network, Application, Database
Question 3
On z/OS, which DISPLAY DATABASE status code requires the most urgent response?
A) COPY (Copy Pending) B) UT (Utility Running) C) RECP (Recover Pending) D) RO (Read-Only)
Question 4
In the output of -DIS THREAD(*) DETAIL, a thread with status QW indicates:
A) The thread is actively executing SQL in DB2 B) The thread is queued and waiting for a resource C) The thread has completed and is being deallocated D) The thread is in a quiesce state
Question 5
A buffer pool hit ratio of 92% for an OLTP workload means approximately how much more physical I/O compared to a 99% hit ratio?
A) About 7% more B) About 2 times more C) About 8 times more D) About 50% more
Question 6
On z/OS, which IFCID is the single most important record for performance analysis, generated when a thread deallocates?
A) IFCID 001 — Statistics Record B) IFCID 003 — Accounting Record C) IFCID 006 — SQL Statement Start D) IFCID 044 — Lock Suspension
Question 7
Which SMF record type contains per-thread accounting data generated from the DB2 accounting trace?
A) SMF 100 B) SMF 101 C) SMF 102 D) SMF 103
Question 8
In a z/OS accounting report, if Class 1 Elapsed Time is 2.5 seconds and Class 2 Elapsed Time is 0.3 seconds, where was the majority of time spent?
A) Inside DB2 processing SQL B) Waiting for locks in DB2 C) Outside DB2, in the application or network D) Performing physical I/O
Question 9
Why is db2pd considered safe to run even during a performance crisis on LUW?
A) It uses SQL, which the optimizer handles efficiently B) It reads directly from shared memory without acquiring latches C) It only captures data when the database is idle D) It runs as a low-priority background process
Question 10
Which db2pd option would you use to identify who is blocking whom in a lock contention scenario?
A) db2pd -db MERIDIAN -locks
B) db2pd -db MERIDIAN -wlocks
C) db2pd -db MERIDIAN -transactions
D) db2pd -db MERIDIAN -applications
Question 11
What is the recommended modern approach for SQL-queryable monitoring data on DB2 LUW 9.7 and later?
A) GET SNAPSHOT commands B) db2pd with -file option C) MON_GET table functions D) Event monitors writing to files
Question 12
Which MON_GET table function would you use to find the most expensive SQL statements currently in the package cache?
A) MON_GET_CONNECTION
B) MON_GET_PKG_CACHE_STMT
C) MON_GET_TABLE
D) MON_GET_BUFFERPOOL
Question 13
An event monitor of type LOCKING would capture which of the following? (Select all that apply)
A) Deadlocks B) Lock timeouts C) Lock escalations D) Every lock acquisition
Question 14
In db2diag.log, which severity level indicates a major component failure requiring immediate investigation?
A) Warning B) Error C) Severe D) Critical
Question 15
Which db2diag command filters the diagnostic log to show only errors from the last 24 hours for the MERIDIAN database?
A) db2diag -time 24h -level Error -db MERIDIAN
B) db2diag -last 24 -severity Error -database MERIDIAN
C) db2diag -hours 24 -filter Error -name MERIDIAN
D) db2diag -period 24h -type Error -dbname MERIDIAN
Question 16
What is the recommended DIAGLEVEL setting for a production DB2 LUW instance?
A) 0 — Disabled, to minimize overhead B) 1 — Severe errors only C) 3 — Warnings and above D) 4 — All informational messages
Question 17
A sort overflow percentage of 8% indicates:
A) Normal sort behavior for most workloads B) A warning condition — sorts are occasionally spilling to disk C) A critical condition — sort heap is likely undersized D) An informational metric that rarely requires action
Question 18
If the package cache hit ratio is below 60%, the most likely cause is:
A) The package cache is too small B) The application is using literal SQL values instead of parameter markers C) Too many concurrent users are connected D) The buffer pool hit ratio is also low
Question 19
On z/OS, which command would you issue first if you receive an alert that 5 of 6 active log data sets are full?
A) -DIS BUFFERPOOL(*) DETAIL
B) -DIS THREAD(*) DETAIL
C) -DIS DATABASE(*) SPACENAM(*)
D) -DIS UTIL(*)
Question 20
What does db2support do?
A) Starts the DB2 support daemon for remote administration B) Collects diagnostic data into an archive for IBM Support C) Provides interactive troubleshooting guidance for common problems D) Connects to IBM's support portal and uploads diagnostic data automatically
Question 21
In a monitoring dashboard, Tier 1 indicators (red/yellow/green status) should refresh:
A) Every 1-2 seconds B) Every 15-30 seconds C) Every 5 minutes D) Every 15 minutes
Question 22
For the Meridian Bank monitoring setup, which traces should be left running at all times on z/OS?
A) Performance trace and audit trace B) Statistics trace and accounting trace C) Monitor trace and global trace D) All trace types with CLASS(*)
Question 23
When configuring alert hysteresis, requiring "3 consecutive intervals above threshold before alerting" serves what purpose?
A) It ensures the monitoring system has time to collect accurate data B) It prevents alert storms from transient metric spikes C) It gives the DBA time to respond before the alert fires D) It reduces the overhead of the alerting subsystem
Question 24
Which of the following is NOT a valid reason to start a z/OS performance trace in a production environment?
A) To investigate intermittent response time spikes affecting a specific plan B) To capture a general baseline of all DB2 activity for the next 48 hours C) To capture SQL statement text for a specific authorization ID experiencing problems D) To investigate lock contention between two specific threads
Question 25
In the Meridian Bank daily health check, which finding should trigger the most urgent response?
A) Buffer pool MERIDIAN_BP_DATA hit ratio at 96.5% B) Sort overflow percentage at 3.2% C) Last successful backup was 28 hours ago D) 15 active connections out of 200 maximum
Answer Key
| Question | Answer | Explanation |
|---|---|---|
| 1 | B | Proactive monitoring detects issues before user impact; it does not eliminate on-call (A) or necessarily cost less (C). |
| 2 | B | The five pillars defined in Section 20.1 are Availability, Performance, Capacity, Concurrency, and Integrity. |
| 3 | C | RECP (Recover Pending) indicates a potential data inconsistency requiring immediate RECOVER utility execution. COPY is important but not as urgent. |
| 4 | B | QW means the thread is queued waiting for a resource such as a thread limit or connection. |
| 5 | C | At 99%, 1% of requests cause I/O. At 92%, 8% cause I/O. That is 8 times more physical I/O (8%/1% = 8x). |
| 6 | B | IFCID 003 is the accounting record, generated at thread deallocation, containing comprehensive performance data. |
| 7 | B | SMF 101 records contain per-thread accounting data. SMF 100 is statistics, SMF 102 is performance trace. |
| 8 | C | Class 1 (2.5s) minus Class 2 (0.3s) = 2.2s spent outside DB2 — in the application or network layer. |
| 9 | B | db2pd reads directly from DB2's shared memory without acquiring latches, imposing virtually zero overhead. |
| 10 | B | db2pd -wlocks specifically shows lock waits with blocker-waiter relationships. -locks shows all locks but does not highlight the wait chain as clearly. |
| 11 | C | MON_GET table functions are the recommended modern approach, providing SQL-queryable monitoring data with low overhead. |
| 12 | B | MON_GET_PKG_CACHE_STMT returns statement-level metrics from the package cache, including execution counts and times. |
| 13 | A, B, C | A LOCKING event monitor captures deadlocks, lock timeouts, and lock escalations. It does not capture every individual lock acquisition (D). |
| 14 | C | Severe indicates a major component failure. Critical (D) is even more serious (database/instance compromise), but the question asks about "major component failure" specifically. |
| 15 | A | The correct syntax is db2diag -time 24h -level Error -db MERIDIAN. The other options use incorrect parameter names. |
| 16 | C | DIAGLEVEL 3 captures warnings and above — a good balance between information and log volume for production. |
| 17 | C | Sort overflow above 5% is critical per the thresholds in Section 20.10. At 8%, significant performance degradation from disk-based sorting is occurring. |
| 18 | B | A package cache hit ratio below 60% strongly suggests the application is not using parameter markers, causing unique SQL statements that cannot be reused. |
| 19 | B | Display threads to identify which thread is holding the oldest active log position, preventing log offloading. |
| 20 | B | db2support collects diagnostic data (configuration, logs, dump files, system info) into a compressed archive for IBM Support. |
| 21 | B | Tier 1 glance indicators should refresh every 15-30 seconds to reflect current state without excessive overhead. |
| 22 | B | Statistics and accounting traces should always run — they have low overhead and provide essential baseline and per-thread data. |
| 23 | B | Requiring consecutive intervals prevents transient spikes from generating false alerts (alert storms). |
| 24 | B | Performance traces generate high overhead and should never run unfiltered for extended periods. A 48-hour unfiltered trace is inappropriate for production. |
| 25 | C | A backup 28 hours old exceeds typical RPO requirements for a banking system and represents a data loss risk. The other metrics are mildly concerning but not urgent. |