Chapter 9 Key Takeaways

Core Principle

DB2 utilities are not DBA-only tools. As a COBOL developer, you own the performance, recoverability, and data integrity of your DB2 applications. That ownership extends to every utility that touches your data.

REORG

  1. REORG restores physical order, reclaims space, and refreshes compression dictionaries. It is justified when CLUSTERRATIOF drops below 0.80 or PERCDROP exceeds 0.10 — not by calendar date alone.

  2. Online REORG (SHRLEVEL CHANGE) keeps applications available during the BUILD phase. The only unavailability occurs during the SWITCH phase drain, typically lasting seconds. Requires a pre-created mapping table.

  3. TIMEOUT TERM is almost always the correct setting. It terminates the REORG if drain fails, rather than killing the applications blocking the drain. The REORG can be rescheduled; a terminated batch job may have cascading consequences.

  4. Partition-level REORG targets the work where it's needed. High-activity partitions may need weekly REORG; stable historical partitions may go months without one. Use metrics, not blanket schedules.

  5. Always use inline COPY and inline RUNSTATS with REORG. They add negligible time and eliminate two separate utility passes.

RUNSTATS

  1. RUNSTATS feeds the optimizer. Without current statistics, the optimizer uses stale data to choose access paths. This is the single most common cause of unexplained performance degradation.

  2. COLGROUP statistics are essential for correlated columns. Without them, the optimizer assumes column independence and overestimates selectivity, leading to suboptimal access paths.

  3. FREQVAL and HISTOGRAM capture data distribution. For skewed columns (where a few values dominate), FREQVAL tells the optimizer the actual selectivity instead of assuming uniform distribution.

  4. RUNSTATS profiles (USEPROFILE) ensure consistency. Define the statistics specification once in the catalog and reference it from every invocation. Eliminates the risk of incomplete or inconsistent statistics collection.

  5. RUNSTATS without REBIND is useless. The new statistics sit in the catalog doing nothing until packages are rebound. Always pair RUNSTATS with REBIND of affected packages.

COPY and RECOVER

  1. Dual image copies to different media are non-negotiable. If your only copy is on the same medium that failed, your backup is worthless. DASD + tape (or DASD + FlashCopy) provides media diversity.

  2. Copy frequency determines recovery time. More frequent copies mean less log data to apply during recovery. For critical tablespaces, daily full copies may be justified by the reduced recovery time.

  3. Point-in-time recovery requires cascade recovery of all RI-related tablespaces. Recovering a parent tablespace to an earlier point creates orphan rows in unrecoered dependent tablespaces.

  4. After any recovery, run CHECK DATA + CHECK INDEX + COPY + RUNSTATS + REBIND. This sequence verifies integrity, establishes a new recovery baseline, updates statistics, and ensures packages use current access paths.

  5. Assessment time typically exceeds recovery time. Pre-compute RI dependencies, document recovery order, and automate recovery planning. Every minute of assessment is a minute of outage.

LOAD and UNLOAD

  1. LOAD with LOG NO is 5-10x faster than SQL INSERT but requires an immediate image copy afterward, since the load cannot be recovered through the log.

  2. LOAD REPLACE deletes all existing data first. If the job fails mid-execution with LOG NO, you have a partially loaded table and no log-based recovery path. Always take an image copy before LOAD REPLACE.

  3. Parallel partition-level LOAD dramatically reduces elapsed time for partitioned tablespaces. Submit separate LOAD jobs for each partition and let them run concurrently.

  4. UNLOAD generates LOAD control statements (SYSPUNCH). These can be fed directly into a LOAD job, making UNLOAD/LOAD the standard approach for table restructuring.

CHECK and REBUILD

  1. CHECK DATA detects RI violations; CHECK INDEX detects index inconsistencies. Run both after recovery, after LOAD with ENFORCE NO, and whenever data integrity is suspect.

  2. REBUILD INDEX is the definitive fix for corrupt indexes. It drops and recreates indexes from the tablespace data. Always include STATISTICS YES.

Scheduling and Monitoring

  1. Threshold-based utility scheduling beats calendar-based scheduling. Monitor CLUSTERRATIOF, PERCDROP, STATSTIME, and COPYUPDATEDPAGESF to determine when each utility is actually needed.

  2. Pending statuses (REORP, CHKP, COPYP, RBDP) are DB2 protecting you from bad data. Don't look for workarounds — resolve the pending status by running the required utility.

  3. Your COBOL programs and DB2 utilities share the batch window. Coordinate scheduling, understand mutual dependencies, and design commit intervals that don't block utility drain operations.

The HA Banking System

  1. A complete utility schedule has four components: threshold-based triggers, calendar-based recurring jobs, event-driven execution, and monitoring/alerting. All four are needed; none alone is sufficient.