> "I've watched too many developers shrug off DB2 utilities as 'DBA stuff' and then spend three days debugging a performance problem that RUNSTATS would have fixed in four minutes." — Kwame Asante, CNB Senior Systems Architect
In This Chapter
- 9.1 Why DB2 Utilities Matter to Developers (Not Just DBA Work)
- 9.2 REORG: When It Helps, When It's Theater
- 9.3 RUNSTATS: Feeding the Optimizer
- 9.4 COPY and RECOVER: Backup Strategy and Recovery Scenarios
- 9.5 LOAD and UNLOAD: Bulk Data Operations at Scale
- 9.6 CHECK DATA/INDEX and REBUILD INDEX
- 9.7 Utility Scheduling and Monitoring
- 9.8 Project Checkpoint: DB2 Utility Schedule for the HA Banking System
- 9.9 Spaced Review
- Summary
Chapter 9: DB2 Utilities for the COBOL Developer: REORG, RUNSTATS, COPY, RECOVER, and LOAD at Scale
"I've watched too many developers shrug off DB2 utilities as 'DBA stuff' and then spend three days debugging a performance problem that RUNSTATS would have fixed in four minutes." — Kwame Asante, CNB Senior Systems Architect
You write the SQL. You design the tables. You tune the queries. And then you hand the whole thing off to the DBA team and assume they'll keep it humming. That works — right up until it doesn't. The morning your batch window overruns by ninety minutes because a REORG wasn't scheduled. The afternoon your COBOL program's EXPLAIN suddenly shows a tablespace scan where yesterday it used an index. The Friday night when a disk failure takes out your most critical tablespace and nobody on the recovery call understands the application's data dependencies.
DB2 utilities are not optional knowledge for serious COBOL developers. They are the operational backbone of every DB2 application. This chapter treats them as what they are: tools you need to understand, configure, and in many cases design the strategy for.
We will cover the five major utilities — REORG, RUNSTATS, COPY, RECOVER, and LOAD — plus the diagnostic utilities CHECK DATA, CHECK INDEX, and REBUILD INDEX. For each, you'll learn when it matters, how to invoke it, what the critical parameters are, and where the traps hide. By the end, you'll design the complete DB2 utility schedule for the HA Banking Transaction Processing System.
9.1 Why DB2 Utilities Matter to Developers (Not Just DBA Work)
Let's kill the myth right now: DB2 utilities are not the exclusive domain of database administrators. If you write COBOL programs that access DB2, you are a stakeholder in every utility that touches your data. Here's why.
The Developer's Stake
Performance ownership. Your program's performance depends on three things: the SQL you write, the indexes you define, and the statistics the optimizer uses to build access paths. Two of those three are directly controlled by DB2 utilities. When RUNSTATS hasn't been collected after a bulk load, the optimizer is flying blind. When a tablespace hasn't been REORGed in six months and your clustering index is 40% out of sequence, your sequential prefetch is reading three times the pages it should. These aren't DBA problems — they're your problems, because your batch job is the one that misses its SLA.
Recovery planning. When Sandra Chen's team at Federal Benefits lost a tablespace to a controller failure at 2:47 AM on a Tuesday, the recovery call included four DBAs and zero application developers. It took them ninety minutes to figure out which tablespaces had to be recovered in which order because of referential integrity constraints. The application developers knew those dependencies cold — they designed them. Sandra made sure her team was on every recovery call after that.
Batch window coordination. Your COBOL batch jobs and DB2 utilities compete for the same resource: time. A REORG on a tablespace locks out your batch updates. A COPY utility needs quiesce points that align with your commit logic. A LOAD REPLACE wipes your table clean — if your restart logic doesn't account for it, you'll reload data on top of a reload. Understanding utility scheduling isn't optional; it's survival.
Change management. When you add a column, change an index, or restructure a partition scheme, you're creating utility work. That ALTER TABLE you submitted? It might require a REORG to materialize. That new index? RUNSTATS needs to know about it before the optimizer can use it effectively. If you don't coordinate, you're shipping code that will underperform until someone notices.
The Utility Ecosystem
DB2 utilities operate on objects: tablespaces, index spaces, and partitions. Every utility has a utility ID (UTILID), which you use to monitor and control it. Utilities can run online (while applications access the data) or offline (with exclusive access). The choice between online and offline execution is not a DBA decision made in isolation — it directly affects your application's availability.
Here is the hierarchy of DB2 utilities we'll cover:
| Utility | Purpose | Primary Benefit |
|---|---|---|
| REORG | Reorganize tablespace or index data | Restore clustering, reclaim space, improve I/O |
| RUNSTATS | Collect optimizer statistics | Accurate access paths, better performance |
| COPY | Create image copies (backups) | Recovery capability |
| RECOVER | Restore from image copies + logs | Data recovery after failure |
| LOAD | Bulk load data into tables | High-speed data population |
| UNLOAD | Extract data from tables | Bulk data extraction |
| CHECK DATA | Verify referential integrity | Detect orphan rows, constraint violations |
| CHECK INDEX | Verify index consistency | Detect index corruption |
| REBUILD INDEX | Rebuild indexes from data | Fix corrupted indexes |
Every one of these utilities is invoked through JCL, typically using the DSNUPROC cataloged procedure. You submit the job, DB2 runs the utility, and you monitor it with the DISPLAY UTILITY command or through the utility's own output. We'll see the JCL patterns throughout this chapter.
The Cost of Ignorance
At CNB, Rob Tanner once spent two weeks tuning a COBOL batch program that processed mortgage payment postings. He rewrote SQL, added indexes, restructured the program logic. Performance improved by 12%. Then Lisa Park ran RUNSTATS on the three tablespaces involved and rebound the package. Performance improved by another 45%. The optimizer had been using stale statistics from before a major data migration that tripled the row count.
Rob doesn't ignore utilities anymore.
9.2 REORG: When It Helps, When It's Theater
REORG reorganizes data in a tablespace or index space, restoring physical order to match the clustering index sequence, reclaiming space from deleted rows, and compressing data if a compression dictionary is in use.
When REORG Actually Helps
REORG is not a magic performance button. It helps in specific, measurable situations:
Clustering degradation. When rows are inserted and the clustering index can't maintain physical order, pages split and data scatters. If your CLUSTERRATIOF drops below 80%, sequential prefetch becomes inefficient because DB2 reads pages that contain only a few relevant rows. REORG restores clustering.
Space reclamation. Deleted rows leave empty space on pages. DB2 reuses this space eventually, but fragmentation accumulates. If PERCDROP (percentage of dropped rows) exceeds 10-15%, you're carrying dead weight. REORG compacts the data.
Compression dictionary refresh. If you're using DB2 data compression, the dictionary is built during REORG. After significant data changes, the dictionary may no longer reflect the actual data patterns, reducing compression effectiveness.
Pending changes. Certain ALTER TABLE operations (adding a column with a default value, changing a column's length in certain cases) place the tablespace in REORG-pending (REORP) status. The REORG materializes those changes.
When REORG Is Theater
Running REORG on a schedule "just because" is a waste of resources. Signs you're doing unnecessary REORGs:
- CLUSTERRATIOF is above 95% and PERCDROP is below 5%
- The tablespace is small (under 1,000 pages) and performance is fine
- You REORG every night on a table that gets 200 inserts a day
- You REORG a table that's only ever accessed by direct key lookup (clustering doesn't matter for single-row FETCH)
Kwame's rule at CNB: "If you can't point to a metric that justifies it, don't REORG it. Every REORG has a cost — CPU, elapsed time, log volume, and unavailability risk. Make the cost earn its keep."
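Kwame's metric-first test can be run straight from the catalog. A minimal sketch, using the chapter's database name and the thresholds cited above (the 80% and 10% cutoffs are the rules of thumb from this section, not product defaults):

```sql
-- Clustering indexes whose CLUSTERRATIOF has degraded.
-- CLUSTERRATIOF is stored as a value between 0 and 1;
-- multiply by 100 to compare against the 80% threshold.
SELECT NAME, TBNAME, CLUSTERRATIOF * 100 AS CLUSTER_PCT
  FROM SYSIBM.SYSINDEXES
 WHERE DBNAME = 'DBCNB01'
   AND CLUSTERING = 'Y'
   AND CLUSTERRATIOF < 0.80;

-- Partitions carrying significant dropped-row dead weight.
SELECT DBNAME, TSNAME, PARTITION, PERCDROP
  FROM SYSIBM.SYSTABLEPART
 WHERE DBNAME = 'DBCNB01'
   AND PERCDROP > 10;
```

Anything these queries return is a REORG candidate backed by a number; anything they don't return is theater.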
Offline REORG
The traditional REORG takes the tablespace offline. No application access during execution. The process:
- UNLOAD: read the data, sorting it into clustering sequence (SORTDATA)
- RELOAD: write the data back in clustering order
- SORT and BUILD: rebuild all indexes from sorted keys
- Optionally take an inline image copy (COPYDDN) and collect inline statistics
//REORGTS EXEC DSNUPROC,SYSTEM=DB2P,UID='REORGACCT'
//SYSIN DD *
REORG TABLESPACE DBCNB01.TSACCT01
SORTDATA YES
SORTKEYS YES
STATISTICS
SHRLEVEL NONE
COPYDDN (SYSCOPY1,SYSCOPY2)
RECOVERYDDN (SYSREC1)
/*
//SYSCOPY1 DD DSN=CNB.DB2P.ACCT01.ICOPY1.D&DATE,
// DISP=(NEW,CATLG),
// UNIT=SYSDA,SPACE=(CYL,(500,100),RLSE)
//SYSCOPY2 DD DSN=CNB.DB2P.ACCT01.ICOPY2.D&DATE,
// DISP=(NEW,CATLG),
// UNIT=SYSDA,SPACE=(CYL,(500,100),RLSE)
//SYSREC1 DD DSN=CNB.DB2P.ACCT01.REORG.SYSREC,
// DISP=(NEW,CATLG),
// UNIT=SYSDA,SPACE=(CYL,(1000,200),RLSE)
Key parameters:
- SORTDATA YES — Sorts data pages into clustering sequence. Almost always YES unless you have a specific reason.
- SORTKEYS YES — Sorts index keys during rebuild. Significantly faster than building indexes without sorting.
- STATISTICS — Runs inline RUNSTATS at the end. Eliminates a separate RUNSTATS job. Always do this.
- COPYDDN — Produces an inline image copy. Eliminates a separate COPY job. Always do this for production tablespaces.
- SHRLEVEL NONE — Offline. Full exclusive access. Fastest execution but zero application availability.
Online REORG with SHRLEVEL CHANGE
Online REORG keeps the tablespace available for read and write access during most of the reorganization. This is the standard approach for high-availability systems.
REORG TABLESPACE DBCNB01.TSACCT01
SORTDATA YES
SORTKEYS YES
STATISTICS
SHRLEVEL CHANGE
MAPPINGTABLE CNBADM.MAP_TSACCT01
COPYDDN (SYSCOPY1,SYSCOPY2)
DRAIN_WAIT 60
RETRY 3
RETRY_DELAY 30
TIMEOUT TERM
The SHRLEVEL CHANGE process:
- UNLOAD/RELOAD phases: DB2 builds a shadow copy of the data in clustering order. Applications continue to read and write the original.
- Mapping table: Tracks the correspondence between original and shadow rows so that changes made during the reorganization can be applied to the shadow. You must create this table in advance.
- LOG phase: Iteratively applies the log records for changes applications made while the shadow was being built.
- SWITCH phase: Briefly drains access (DRAIN_WAIT), applies the final log records, and switches applications to the reorganized data. This is the only period of unavailability — typically seconds, not minutes.
Critical parameters for online REORG:
- MAPPINGTABLE — Required for SHRLEVEL CHANGE. Must exist before the REORG starts, in a tablespace other than the one being reorganized. If you forget this, the REORG fails.
- DRAIN_WAIT 60 — Seconds to wait for drain during SWITCH phase. If applications don't release their claims in 60 seconds, REORG retries.
- RETRY 3 — Number of drain retries. With long-running units of work, you may need this.
- RETRY_DELAY 30 — Seconds between retries.
- TIMEOUT TERM — If drain fails after all retries, terminate the REORG rather than terminate the applications. You want this. The alternative (TIMEOUT ABEND) kills your batch jobs.
Creating the mapping table and its required index:
CREATE TABLE CNBADM.MAP_TSACCT01
      (TYPE        CHAR(1) NOT NULL,
       SOURCE_RID  CHAR(5) NOT NULL,
       TARGET_XRID CHAR(9) NOT NULL,
       LRSN        CHAR(6) NOT NULL)
    IN DBCNB01.TSMAP01
    CCSID EBCDIC;

CREATE UNIQUE INDEX CNBADM.XMAP_TSACCT01
    ON CNBADM.MAP_TSACCT01
      (SOURCE_RID, LRSN);
Column lengths for the mapping table vary by DB2 version (the extended 10-byte RBA/LRSN format uses longer columns); check the REORG documentation for your release.
The mapping table does not need to persist data — it's used only during the REORG. But it must exist before the REORG starts, and it must be in a different tablespace from the one being reorganized.
REORG at Partition Level
For partitioned tablespaces — which every large production table should be — you can REORG individual partitions:
REORG TABLESPACE DBCNB01.TSACCT01
PART 7
SHRLEVEL CHANGE
MAPPINGTABLE CNBADM.MAP_TSACCT01
STATISTICS
COPYDDN (SYSCOPY1)
This is critical for large tables. CNB's account transaction table has 24 partitions organized by posting date. They REORG only the most recently active partitions — typically partitions covering the last 90 days. Historical partitions are stable and rarely need reorganization.
Lisa Park's partition REORG strategy at CNB:
| Partition Age | REORG Frequency | SHRLEVEL |
|---|---|---|
| Current month | Weekly (Sunday) | CHANGE |
| 1-3 months | Biweekly | CHANGE |
| 3-12 months | Monthly | CHANGE |
| 12+ months | Quarterly (if metrics warrant) | CHANGE |
REORG Pitfalls
Log volume. Online REORG with SHRLEVEL CHANGE generates significant log records — the original data, the shadow data, and the mapping. For a 10 GB tablespace, expect 25-30 GB of log data. Make sure your active log configuration can handle it without going into log-suspend.
REORG-pending cascades. If you have referential integrity constraints and you REORG a parent tablespace, the dependent tablespaces may go into CHECK-pending status. Plan your REORG sequence accordingly.
Long-running units of work. The SWITCH phase drain will fail if any application holds a claim on the tablespace with an uncommitted unit of work. This is why your COBOL programs must COMMIT at reasonable intervals — a batch program that processes 500,000 rows before committing is a REORG blocker.
Sort work space. REORG with SORTDATA needs work space roughly equal to the tablespace size. If your sort datasets are undersized, the REORG fails mid-execution. Use the DFSORT SMF records from your last REORG to size appropriately.
9.3 RUNSTATS: Feeding the Optimizer
RUNSTATS collects statistics about your data and stores them in the DB2 catalog. The optimizer uses these statistics to choose access paths. Without current statistics, the optimizer guesses — and it guesses wrong.
What RUNSTATS Collects
For tablespaces:
- CARDF — Cardinality (number of rows)
- NPAGES — Number of pages containing data
- NACTIVE — Number of active pages
- PCTROWCOMP — Percentage of rows compressed
- Column statistics: COLCARDF, HIGH2KEY, LOW2KEY, frequency distributions, histograms

For indexes:
- FIRSTKEYCARDF — Cardinality of the first key column
- FULLKEYCARDF — Cardinality of the full key
- CLUSTERRATIOF — How well the data is physically clustered
- NLEAF, NLEVELS — Index structure metrics
- Key column statistics and distributions
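You can read back what RUNSTATS stored; a sketch using the CNB table from the running example (the column names are the catalog's own):

```sql
-- Table-level statistics from the last RUNSTATS.
SELECT NAME, CARDF, NPAGESF, PCTROWCOMP, STATSTIME
  FROM SYSIBM.SYSTABLES
 WHERE CREATOR = 'CNBADM'
   AND NAME = 'ACCT_TRANSACTION';

-- Index-level statistics for the same table.
SELECT NAME, FIRSTKEYCARDF, FULLKEYCARDF,
       CLUSTERRATIOF, NLEAF, NLEVELS, STATSTIME
  FROM SYSIBM.SYSINDEXES
 WHERE TBCREATOR = 'CNBADM'
   AND TBNAME = 'ACCT_TRANSACTION';
```

STATSTIME tells you when the statistics were last refreshed — the first thing to check when an access path surprises you.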
Basic RUNSTATS
//RUNSTATS EXEC DSNUPROC,SYSTEM=DB2P,UID='RSACCT01'
//SYSIN DD *
RUNSTATS TABLESPACE DBCNB01.TSACCT01
TABLE (ALL)
INDEX (ALL)
SHRLEVEL CHANGE
REPORT YES
UPDATE ALL
HISTORY ALL
/*
Key parameters:
- TABLE (ALL) — Collect statistics for all tables in the tablespace. You can specify individual tables, but ALL is typically what you want.
- INDEX (ALL) — Collect statistics for all indexes. Critical — index statistics are arguably more important than table statistics because the optimizer uses them to decide between index access and tablespace scan.
- SHRLEVEL CHANGE — Collect statistics while applications continue to access the data. Always use this unless you have a specific reason for SHRLEVEL REFERENCE (which allows reads but not writes).
- REPORT YES — Generate a statistics report. Useful for trend analysis.
- UPDATE ALL — Update all catalog statistics. The alternative, UPDATE SPACE, only updates space-related statistics and is rarely sufficient.
- HISTORY ALL — Maintain statistics history in the catalog history tables (SYSIBM.SYSTABLES_HIST and the related *_HIST tables). Essential for tracking statistical trends and diagnosing performance changes.
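With HISTORY ALL in effect, those trends are queryable. A sketch against the table-level history catalog table, assuming the CNB names:

```sql
-- How has table cardinality moved across RUNSTATS executions?
SELECT STATSTIME, CARDF, NPAGESF
  FROM SYSIBM.SYSTABLES_HIST
 WHERE CREATOR = 'CNBADM'
   AND NAME = 'ACCT_TRANSACTION'
 ORDER BY STATSTIME DESC;
```

A sudden jump in CARDF between two rows of this result is exactly the kind of data migration that burned Rob Tanner.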
Column-Level Statistics and Distributions
Default RUNSTATS collects basic column statistics. For columns used in predicates, you need more:
RUNSTATS TABLESPACE DBCNB01.TSACCT01
    TABLE (CNBADM.ACCT_TRANSACTION)
    COLGROUP (ACCT_TYPE, POSTING_DATE)
      FREQVAL COUNT 20
    COLGROUP (BRANCH_ID, TRAN_STATUS)
      FREQVAL COUNT 15
    COLGROUP (ACCT_TYPE)
      FREQVAL COUNT 10
      HISTOGRAM NUMQUANTILES 50
    INDEX (ALL)
    SHRLEVEL CHANGE
    UPDATE ALL
    HISTORY ALL
- COLGROUP — Collects multi-column cardinality statistics. If the optimizer sees a predicate on ACCT_TYPE AND POSTING_DATE, it needs to know the combined cardinality, not just the individual column cardinalities multiplied together (which overestimates selectivity when columns are correlated).
- FREQVAL COUNT 20 — Collects the 20 most frequent values. Critical for skewed distributions. If 60% of your transactions have ACCT_TYPE = 'CHK', the optimizer needs to know that.
- HISTOGRAM NUMQUANTILES 50 — Collects a 50-bucket histogram for the column. Helps the optimizer estimate range predicate selectivity.
Inline RUNSTATS
Running RUNSTATS as part of another utility eliminates a separate job and ensures statistics are current immediately after the operation:
REORG TABLESPACE DBCNB01.TSACCT01
    STATISTICS
    TABLE (CNBADM.ACCT_TRANSACTION)
    COLGROUP (ACCT_TYPE, POSTING_DATE)
      FREQVAL COUNT 20
    INDEX (ALL)
Inline RUNSTATS works with REORG and LOAD. The STATISTICS keyword activates it, and you can specify the same TABLE/INDEX/COLGROUP options as standalone RUNSTATS.
Why inline RUNSTATS should be your default. The window between a REORG or LOAD completing and RUNSTATS executing is a vulnerability. During that window, the optimizer still uses the old statistics. If a batch job executes during that gap, it may choose a suboptimal access path. Inline RUNSTATS eliminates the gap entirely — by the time the utility completes, the catalog is already updated with current statistics.
At Federal Benefits, Sandra Chen's team once saw a four-hour batch overrun caused by exactly this scenario. A REORG completed at 01:00, the RUNSTATS job was scheduled for 02:00, and the batch job started at 01:30 with stale statistics. The optimizer chose a tablespace scan on a freshly reorganized 40 GB tablespace instead of using the clustering index, because the old statistics showed CLUSTERRATIOF at 0.45 (pre-REORG) while the actual value was 0.98 (post-REORG). After that incident, Sandra mandated inline RUNSTATS for every REORG and LOAD — no exceptions.
Inline RUNSTATS with full profile options in LOAD:
LOAD DATA INDDN SYSREC LOG NO
    RESUME YES
    COPYDDN (SYSCOPY1)
    STATISTICS
    TABLE (CNBADM.ACCT_TRANSACTION)
    COLGROUP (ACCT_TYPE, POSTING_DATE)
      FREQVAL COUNT 20
    COLGROUP (BRANCH_ID)
      FREQVAL COUNT 30
      HISTOGRAM NUMQUANTILES 100
    INDEX (ALL)
    INTO TABLE CNBADM.ACCT_TRANSACTION
    PART 7
    (ACCT_ID POSITION(1) CHAR(12)
     ...
    )
This single JCL step performs the LOAD, takes an inline image copy, and collects detailed statistics including column group cardinality, frequency values, and histograms. Three utility operations in one job step, zero vulnerability windows.
RUNSTATS Profiles
Instead of coding statistics options in every JCL stream, you can create a RUNSTATS profile that's stored in the DB2 catalog (SYSIBM.SYSTABLES_PROFILES), one profile per table:
RUNSTATS TABLESPACE DBCNB01.TSACCT01
    TABLE (CNBADM.ACCT_TRANSACTION)
    COLGROUP (ACCT_TYPE, POSTING_DATE)
      FREQVAL COUNT 20
    COLGROUP (ACCT_TYPE)
      FREQVAL COUNT 10
      HISTOGRAM NUMQUANTILES 50
    SET PROFILE
Then invoke RUNSTATS using the profile:
RUNSTATS TABLESPACE DBCNB01.TSACCT01
    TABLE (CNBADM.ACCT_TRANSACTION)
    USE PROFILE
    SHRLEVEL CHANGE
This is CNB's standard approach. The profile is defined once, maintained by the development team (who understand the data distributions), and used by every RUNSTATS invocation. No risk of a JCL coder forgetting the COLGROUP specifications.
When to Run RUNSTATS
The answer is not "on a schedule." The answer is "when your data changes enough to affect access paths."
After bulk operations. Any LOAD, mass INSERT, mass DELETE, or mass UPDATE should be followed by RUNSTATS. If your nightly batch inserts 500,000 rows into a 2-million-row table, RUNSTATS matters.
After REORG. Always. Use inline RUNSTATS.
After DDL changes. New indexes, new columns, altered column lengths — all need RUNSTATS.
When performance changes. If a query that ran in 2 seconds now takes 30, check when RUNSTATS last ran. Stale statistics are the single most common cause of unexplained performance degradation.
Threshold-based scheduling. The most sophisticated approach: monitor the STATSTIME column in SYSTABLEPART and SYSINDEXPART, compare it against the table's INSERT/DELETE/UPDATE activity (available from DB2 PM or RMF), and trigger RUNSTATS when activity exceeds a threshold. Ahmad Patel at Pinnacle Health built an automated process that does exactly this — RUNSTATS runs only when it's needed, not on a calendar.
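One way to sketch a threshold process like Ahmad's is with DB2's real-time statistics tables, which track insert/update/delete activity since the last RUNSTATS automatically. A hedged sketch — the 20% trigger is an assumed site policy, not a product default:

```sql
-- Tablespace partitions where rows changed since the last RUNSTATS
-- exceed 20% of the current row count: candidates for fresh stats.
SELECT DBNAME, NAME, PARTITION, TOTALROWS,
       STATSINSERTS + STATSUPDATES + STATSDELETES AS CHANGED
  FROM SYSIBM.SYSTABLESPACESTATS
 WHERE TOTALROWS > 0
   AND (STATSINSERTS + STATSUPDATES + STATSDELETES)
        > 0.20 * TOTALROWS;
```

Feed the result into your scheduler and RUNSTATS runs only where the data has actually moved.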
The Rebind Connection
This is the part developers forget. RUNSTATS collects statistics. But your program's package was bound with the old statistics. The optimizer won't use the new statistics until you REBIND the package or the plan.
At CNB, every RUNSTATS job has a companion REBIND step:
//REBIND EXEC PGM=IKJEFT01
//SYSTSPRT DD SYSOUT=*
//SYSTSIN DD *
DSN SYSTEM(DB2P)
REBIND PACKAGE (CNBCOLL.PGMACT01.(*)) -
EXPLAIN(YES)
END
/*
The EXPLAIN(YES) captures the new access paths in the PLAN_TABLE so you can verify the optimizer actually changed its strategy. If it didn't, the RUNSTATS might not have been needed — or the optimizer already had the right plan.
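The verification itself is a PLAN_TABLE query. A sketch using the package and collection names from the running example (PLAN_TABLE layout varies slightly by DB2 version):

```sql
-- Access paths captured by EXPLAIN(YES) at rebind time.
-- ACCESSTYPE 'R' means tablespace scan; 'I' means index access.
SELECT QUERYNO, METHOD, ACCESSTYPE, ACCESSNAME, BIND_TIME
  FROM PLAN_TABLE
 WHERE PROGNAME = 'PGMACT01'
   AND COLLID = 'CNBCOLL'
 ORDER BY BIND_TIME DESC, QUERYNO;
```

Compare the newest BIND_TIME rows against the previous set: if an 'R' turned into an 'I' on your hot query, the RUNSTATS/REBIND earned its keep.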
9.4 COPY and RECOVER: Backup Strategy and Recovery Scenarios
COPY creates image copies of tablespaces and index spaces. RECOVER restores them. Together, they are your safety net against every form of data loss — hardware failure, software corruption, human error, and disaster scenarios.
Image Copy Types
Full image copy. A complete copy of every page in the tablespace. This is your baseline. Recovery can start from any full copy and apply logs forward.
Incremental image copy. Only pages changed since the last full or incremental copy. Faster to create, but recovery is slower because DB2 must apply the full copy, then each incremental, then logs.
Inline copy. Produced during REORG or LOAD. Same as a full image copy but captured during the utility instead of as a separate step.
COPY Utility
//COPYTS EXEC DSNUPROC,SYSTEM=DB2P,UID='CPYACCT1'
//SYSIN DD *
COPY TABLESPACE DBCNB01.TSACCT01
FULL YES
SHRLEVEL REFERENCE
COPYDDN (SYSCOPY1,SYSCOPY2)
PARALLEL 4
/*
//SYSCOPY1 DD DSN=CNB.DB2P.ACCT01.FCOPY1.D&DATE,
// DISP=(NEW,CATLG),
// UNIT=SYSDA,SPACE=(CYL,(500,100),RLSE)
//SYSCOPY2 DD DSN=CNB.DB2P.ACCT01.FCOPY2.D&DATE,
// DISP=(NEW,CATLG),
// UNIT=VTAPE,SPACE=(CYL,(500,100),RLSE)
Critical parameters:
- FULL YES — Full image copy. Omit for incremental.
- SHRLEVEL REFERENCE — Allows concurrent read access but not write. This is standard for most COPY operations. SHRLEVEL CHANGE allows writes but requires additional log processing during recovery.
- COPYDDN (SYSCOPY1, SYSCOPY2) — Dual copies. Always take at least two copies to different media. CNB sends one to DASD and one to virtual tape. If you take only one copy and that media fails, your backup is gone.
- PARALLEL 4 — Process up to 4 partitions in parallel. Dramatically reduces elapsed time for partitioned tablespaces.
FlashCopy Integration
On modern IBM storage (DS8000 series), DFSMSdss FlashCopy creates point-in-time consistent copies in seconds, regardless of data size. DB2 can register these as image copies:
COPY TABLESPACE DBCNB01.TSACCT01
FULL YES
SHRLEVEL REFERENCE
FLASHCOPY YES
COPYDDN (SYSCOPY1)
FlashCopy reduces the COPY window from minutes or hours to seconds. The trade-off: a FlashCopy image copy is a VSAM dataset rather than a standard sequential image copy, so downstream processes that expect sequential input may need adjustment (COPYTOCOPY can create sequential copies from a FlashCopy image copy), and the storage subsystem must have FlashCopy capacity available. Storage management needs to be involved.
At Pinnacle Health, Diane Morrison uses FlashCopy for all production image copies. The patient record tablespace is 180 GB across 48 partitions. Traditional COPY took 45 minutes. FlashCopy takes 8 seconds.
COPY Frequency Strategy
The question isn't "how often" — it's "how much data can you afford to re-process?"
Recovery time = time to restore image copy + time to apply log records. The older your image copy, the more log records DB2 must apply, and the longer recovery takes.
CNB's image copy strategy for the account transaction tablespace:
| Copy Type | Frequency | Retention |
|---|---|---|
| Full (DASD) | Weekly Sunday | 4 weeks |
| Full (Tape) | Weekly Sunday | 90 days |
| Incremental (DASD) | Daily | 7 days |
| Inline (during REORG) | Per REORG schedule | Until next full |
For the most critical tablespaces (account balances, customer master), CNB takes full copies daily. The extra storage cost is nothing compared to the reduced recovery time.
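The recovery base DB2 would actually work from is visible in the SYSCOPY catalog table. A sketch for the running example's tablespace:

```sql
-- Image copies registered for the tablespace, newest first.
-- ICTYPE 'F' = full copy, 'I' = incremental.
SELECT ICTYPE, ICBACKUP, DSNAME, TIMESTAMP
  FROM SYSIBM.SYSCOPY
 WHERE DBNAME = 'DBCNB01'
   AND TSNAME = 'TSACCT01'
   AND ICTYPE IN ('F', 'I')
 ORDER BY TIMESTAMP DESC;
```

If this query shows your newest full copy is two weeks old, you already know roughly how long RECOVER will spend applying log.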
RECOVER Utility
RECOVER restores a tablespace from image copies and applies log records to bring it current.
RECOVER to current:
//RECOVER EXEC DSNUPROC,SYSTEM=DB2P,UID='RCVACCT1'
//SYSIN DD *
RECOVER TABLESPACE DBCNB01.TSACCT01
PARALLEL 4
/*
DB2 automatically selects the most efficient combination of full and incremental copies, then applies log records to bring the tablespace to the current point. If you suspect the most recent copies are corrupted, add RESTOREBEFORE X'rba' to make DB2 start from a copy whose starting log point precedes the specified value.
RECOVER TOCOPY — Restore to a specific image copy:
RECOVER TABLESPACE DBCNB01.TSACCT01
TOCOPY CNB.DB2P.ACCT01.FCOPY1.D260310
This restores to the exact point when the image copy was taken. No log records are applied. The tablespace reflects the data as of the copy time.
RECOVER TORBA — Point-in-time recovery:
RECOVER TABLESPACE DBCNB01.TSACCT01
TORBA X'00A1B2C3D4E5'
Recovers to a specific relative byte address in the log. This is point-in-time recovery — you restore to the exact moment before a destructive event (a bad DELETE, a failed batch job, an application bug). The RBA comes from log analysis or from DISPLAY LOG output.
RECOVER TOLOGPOINT — Point-in-time with LRSN:
RECOVER TABLESPACE DBCNB01.TSACCT01
TOLOGPOINT X'A1B2C3D4E5F6'
In data sharing environments, TOLOGPOINT uses the Log Record Sequence Number (LRSN) instead of RBA. Same concept, different addressing.
Point-in-Time Recovery: The Cascade Problem
Point-in-time recovery is powerful but dangerous. When you recover a tablespace to a previous point in time, you create a data inconsistency: the recovered tablespace reflects data from, say, 10:00 AM, while all related tablespaces reflect data from the current time.
If your recovered tablespace has referential integrity relationships with other tablespaces, those other tablespaces may now contain orphan foreign key rows pointing to parent rows that no longer exist (because the parent was recovered to an earlier state).
The rule: when you do point-in-time recovery on a parent tablespace, you must also recover all dependent tablespaces to the same point. This is why Sandra Chen at Federal Benefits insists that application developers document every referential integrity dependency in the recovery procedures. The DBAs know which tablespaces have RI constraints, but only the application developers understand the logical data dependencies that aren't enforced by RI.
After point-in-time recovery, you must:
- Run CHECK DATA to verify referential integrity
- Run CHECK INDEX to verify index consistency
- Take a full image copy (the tablespace is in COPY-pending status)
- Run RUNSTATS (the data has changed)
- REBIND affected packages
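The RI dependency map Sandra's procedures require can be bootstrapped from the catalog. A sketch for a recovered parent table (logical dependencies not enforced by RI still have to come from the application team):

```sql
-- Tables whose foreign keys reference the recovered parent table.
-- Their tablespaces must be recovered to the same point in time.
SELECT CREATOR, TBNAME, RELNAME, DELETERULE
  FROM SYSIBM.SYSRELS
 WHERE REFTBCREATOR = 'CNBADM'
   AND REFTBNAME = 'ACCT_TRANSACTION';
```

Run it recursively down the dependency chain and you have the recovery-order list before the 2:47 AM call, not during it.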
Recovery Scenarios at Scale
Scenario 1: Single tablespace media failure.
The simplest case. The disk containing your tablespace has a permanent I/O error. RECOVER to current using the most recent image copies and logs. Elapsed time depends on tablespace size and log volume.
Scenario 2: Application error — bad batch run.
Your COBOL batch program had a bug that corrupted 50,000 rows. You need to recover to the point just before the batch started. Use RECOVER TORBA with the RBA from the start of the batch job (captured in the job log or from DISPLAY LOG). Then re-run the batch with the fix.
Scenario 3: Multiple tablespace corruption.
A storage subsystem failure takes out six tablespaces across three databases. You need to recover all six, but three of them have RI relationships. Recovery order matters: recover parent tablespaces first, then dependents. Run CHECK DATA on all recovered tablespaces to verify consistency.
Scenario 4: Log loss.
The nightmare. If you lose active log data, you can only RECOVER TOCOPY — to the point of the last image copy. Everything after that is gone. This is why dual logging and remote log copy (for disaster recovery) are non-negotiable.
9.5 LOAD and UNLOAD: Bulk Data Operations at Scale
LOAD is the high-speed path for getting data into DB2 tables. UNLOAD gets it out. Both bypass normal SQL processing for raw performance.
LOAD Utility
LOAD reads sequential input data and inserts it directly into a tablespace, bypassing the SQL INSERT path. It's typically 5-10x faster than equivalent INSERT statements for large data volumes.
//LOADTS EXEC DSNUPROC,SYSTEM=DB2P,UID='LDACCT01'
//SYSREC DD DSN=CNB.PROD.ACCT.EXTRACT.D260315,
// DISP=SHR
//SYSIN DD *
LOAD DATA INDDN SYSREC LOG NO
RESUME YES
INTO TABLE CNBADM.ACCT_TRANSACTION
PART 7
(ACCT_ID POSITION(1) CHAR(12)
TRAN_DATE POSITION(13) DATE EXTERNAL
TRAN_TYPE POSITION(23) CHAR(3)
TRAN_AMOUNT POSITION(26) DECIMAL EXTERNAL(13)
TRAN_DESC POSITION(39) CHAR(50)
BRANCH_ID POSITION(89) CHAR(6)
TELLER_ID POSITION(95) CHAR(8)
POST_STATUS POSITION(103) CHAR(1)
)
/*
Key parameters:
- LOG NO — Don't log individual row inserts. This makes LOAD dramatically faster but means you cannot RECOVER through the LOAD — you must take an image copy after. LOG YES logs everything but is much slower. LOG NO is standard for bulk loads followed by COPY.
- RESUME YES — Add rows to the data already in the table. RESUME NO (with REPLACE) deletes all existing data first. Be very clear about which one you want.
- PART 7 — Load into a specific partition. Essential for partition-level data management.
- ENFORCE NO — Skip referential integrity and check constraint validation during the load. Faster, but it leaves the tablespace in CHECK-pending status, so you must run CHECK DATA afterward. ENFORCE CONSTRAINTS validates during the load.
LOAD REPLACE: The Dangerous Option
LOAD DATA INDDN SYSREC LOG NO
REPLACE
INTO TABLE CNBADM.ACCT_TRANSACTION
REPLACE deletes all existing rows in the table before loading. This is a full table replacement. If your job fails mid-LOAD with REPLACE, you have a partially loaded table and no way to get the old data back (because LOG NO means it's not in the log, and REPLACE already deleted it).
Protection strategies:
- Always take an image copy before LOAD REPLACE
- Use LOAD REPLACE only in controlled batch windows
- Have your COBOL restart logic account for the possibility that a LOAD REPLACE partially completed
- Consider RESUME YES with a preceding DELETE instead — it's slower but recoverable
LOAD with Inline COPY and RUNSTATS
LOAD DATA INDDN SYSREC LOG NO
RESUME YES
COPYDDN (SYSCOPY1, SYSCOPY2)
STATISTICS
INTO TABLE CNBADM.ACCT_TRANSACTION
PART 7
(ACCT_ID POSITION(1) CHAR(12)
...
)
The inline COPY creates an image copy at the end of the LOAD, which resolves the COPY-pending status that LOG NO creates. The inline STATISTICS runs RUNSTATS on the loaded data. This pattern — LOAD with LOG NO, inline COPY, inline RUNSTATS — is the gold standard for bulk data loading.
LOAD Data Validation and Error Handling
LOAD provides several mechanisms for handling invalid data without aborting the entire load:
ERRDDN, MAPDDN, and DISCARDDN. ERRDDN and MAPDDN name work datasets that LOAD uses internally to track records in error and map them back to their input positions. DISCARDDN names the dataset where the rejected input records themselves are written, in their original format.
LOAD DATA INDDN SYSREC LOG NO
RESUME YES
ERRDDN SYSERR
MAPDDN SYSMAP
DISCARDDN SYSDISC
ENFORCE CONSTRAINTS
INTO TABLE CNBADM.ACCT_TRANSACTION
PART 7
(ACCT_ID POSITION(1) CHAR(12)
TRAN_DATE POSITION(13) DATE EXTERNAL
TRAN_TYPE POSITION(23) CHAR(3)
TRAN_AMOUNT POSITION(26) DECIMAL EXTERNAL(13)
)
With this configuration, if row 47,382 has an invalid date format, LOAD writes that record to the discard dataset and continues loading the remaining rows. After the LOAD completes, you process the discard dataset with a COBOL program that logs the failures and optionally corrects and reloads them.
The discard dataset captures rejected records whatever the cause — a data format problem (wrong data type, truncation) or a business rule violation (a check constraint says TRAN_TYPE must be in ('DEP', 'WDR', 'XFR') and the input has 'ABC'). Because the records are preserved in their original input format, corrected copies can be fed straight into a subsequent LOAD. The utility output messages identify each discarded record and the reason for the discard.
Rob Tanner at CNB processes discard datasets through a standard error-handling COBOL program that writes to an exception table, sends an alert to the operations team, and automatically schedules a reload of corrected records in the next batch window.
WHEN clause filtering during LOAD. Like UNLOAD, LOAD supports WHEN clauses to selectively load records:
LOAD DATA INDDN SYSREC LOG NO
RESUME YES
INTO TABLE CNBADM.ACCT_TRANSACTION
PART 7
WHEN (23:25) = 'DEP'
(ACCT_ID POSITION(1) CHAR(12)
TRAN_DATE POSITION(13) DATE EXTERNAL
TRAN_TYPE POSITION(23) CHAR(3)
TRAN_AMOUNT POSITION(26) DECIMAL EXTERNAL(13)
)
The WHEN clause filters on byte positions in the input record, not on column names. Position 23:25 corresponds to the TRAN_TYPE field. Only deposit records are loaded; all other transaction types are skipped. This is useful when a single input file feeds multiple tables, each receiving a different record type.
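When one input file does feed multiple tables, you code a single LOAD with several INTO TABLE clauses, each carrying its own WHEN filter. A sketch — the deposit and withdrawal target tables are hypothetical names, not objects defined elsewhere in this book:

```
LOAD DATA INDDN SYSREC LOG NO
  RESUME YES
  INTO TABLE CNBADM.ACCT_DEPOSIT      -- hypothetical target
    WHEN (23:25) = 'DEP'
    (ACCT_ID     POSITION(1)  CHAR(12)
     TRAN_DATE   POSITION(13) DATE EXTERNAL
     TRAN_AMOUNT POSITION(26) DECIMAL EXTERNAL(13)
    )
  INTO TABLE CNBADM.ACCT_WITHDRAWAL   -- hypothetical target
    WHEN (23:25) = 'WDR'
    (ACCT_ID     POSITION(1)  CHAR(12)
     TRAN_DATE   POSITION(13) DATE EXTERNAL
     TRAN_AMOUNT POSITION(26) DECIMAL EXTERNAL(13)
    )
```

Each record lands in the table whose WHEN condition it satisfies; records matching no condition are skipped, or written to the discard dataset if DISCARDDN is specified.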
LOAD Performance Tuning
SORTKEYS. If your table has indexes, LOAD must build index entries, and building them means sorting keys. SORTKEYS n supplies an estimate of the total number of index keys to be sorted — roughly the number of rows to load multiplied by the number of indexes. With the estimate, LOAD passes keys to the sort in memory instead of writing them to intermediate work datasets, which speeds the index build substantially.
LOAD DATA INDDN SYSREC LOG NO
RESUME YES
SORTKEYS 500000
INTO TABLE CNBADM.ACCT_TRANSACTION
Parallel LOAD into partitioned tablespaces. You can submit multiple LOAD jobs, each loading into a different partition. DB2 allows concurrent utility access to different partitions of the same tablespace. CNB loads 24 partitions in parallel during their monthly data refresh — elapsed time drops from 6 hours to 25 minutes.
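The parallel pattern is simply N copies of the same job, each pointed at its own partition and its own input dataset. A sketch of one of the jobs, following the JCL conventions used earlier in this chapter — the dataset names and UID are hypothetical:

```
//* Job 1 of N -- loads partition 3 only (hypothetical names)
//LOADP03 EXEC DSNUPROC,SYSTEM=DB2P,UID='LDTRAN03'
//SYSREC  DD DSN=CNB.PROD.TRAN.INPUT.P03,DISP=SHR
//SYSIN   DD *
  LOAD DATA INDDN SYSREC LOG NO
    RESUME YES
    INTO TABLE CNBADM.ACCT_TRANSACTION
    PART 3
/*
//* Job 2 of N is identical except PART 4 and its own input
```

Because each job names a different PART, the utilities operate on disjoint partitions concurrently. Watch non-partitioned secondary indexes, though — every parallel job must update them, so they can become the serialization point.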
LOAD from cursor. Instead of a sequential file, LOAD can read from a DB2 cursor. This enables table-to-table data movement without intermediate files:
LOAD DATA INCURSOR CSRMOVE LOG NO
RESUME YES
INTO TABLE CNBADM.ACCT_TRANSACTION_ARCHIVE
The cursor CSRMOVE is declared with an EXEC SQL utility statement in the same utility input, ahead of the LOAD statement, and executes a SELECT against the source table.
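A minimal sketch of what the utility input looks like with the cursor declaration in place — the SELECT shown here (archiving transactions older than seven years) is a hypothetical example, not a requirement of the technique:

```
EXEC SQL
  DECLARE CSRMOVE CURSOR FOR
  SELECT * FROM CNBADM.ACCT_TRANSACTION
  WHERE TRAN_DATE < CURRENT DATE - 7 YEARS
ENDEXEC
LOAD DATA INCURSOR CSRMOVE LOG NO
  RESUME YES
  INTO TABLE CNBADM.ACCT_TRANSACTION_ARCHIVE
```

The columns the SELECT returns must match the target table in number, order, and compatible data types.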
UNLOAD Utility
UNLOAD extracts data from DB2 tables to sequential files. It's the complement of LOAD.
//UNLOAD EXEC DSNUPROC,SYSTEM=DB2P,UID='ULACCT01'
//SYSREC DD DSN=CNB.PROD.ACCT.UNLOAD.D260315,
// DISP=(NEW,CATLG),
// UNIT=SYSDA,SPACE=(CYL,(500,100),RLSE)
//SYSPUNCH DD DSN=CNB.PROD.ACCT.LOADCTL.D260315,
// DISP=(NEW,CATLG),
// UNIT=SYSDA,SPACE=(TRK,(5,1),RLSE)
//SYSIN DD *
UNLOAD TABLESPACE DBCNB01.TSACCT01
SHRLEVEL CHANGE
FROM TABLE CNBADM.ACCT_TRANSACTION
HEADER NONE
PUNCHDDN SYSPUNCH
/*
The SYSPUNCH dataset is critical — UNLOAD generates the LOAD control statements that exactly match the unloaded data format. You can feed this directly into a LOAD job. This is how you do table restructuring: UNLOAD the old structure, modify the LOAD control cards, LOAD into the new structure.
UNLOAD with selection:
UNLOAD TABLESPACE DBCNB01.TSACCT01
FROM TABLE CNBADM.ACCT_TRANSACTION
WHEN (TRAN_DATE >= '2025-01-01'
AND TRAN_DATE < '2026-01-01')
HEADER NONE
The WHEN clause filters rows during unload. This is faster than using a COBOL program with a cursor because UNLOAD reads pages directly without SQL overhead.
9.6 CHECK DATA/INDEX and REBUILD INDEX
These are the diagnostic and repair utilities. You hope you never need them. You should know exactly how to use them.
CHECK DATA
Verifies referential integrity constraints and check constraints. Detects orphan rows — rows in a dependent table whose foreign key doesn't match any row in the parent table.
CHECK DATA TABLESPACE DBCNB01.TSTRAN01
FOR EXCEPTION IN CNBADM.ACCT_TRANSACTION
USE CNBADM.ACCT_TRAN_EXCEPTIONS
SHRLEVEL CHANGE
DELETE YES
- FOR EXCEPTION — Names the table to check and an exception table where violating rows are copied. You must create the exception table in advance with the same columns as the base table plus a RID column and a timestamp column.
- DELETE YES — Automatically delete orphan rows. Use this with extreme caution — in most cases you want DELETE NO so you can analyze the exceptions first.
- SHRLEVEL CHANGE — Run while applications access the data.
When to run CHECK DATA: - After point-in-time recovery - After LOAD with ENFORCE NO - After any situation where you suspect RI violations - When a tablespace goes into CHECK-pending (CHKP) status
Setting up the exception table. The exception table must have the same columns as the base table, in the same order, plus two additional columns at the end: first a RID column (CHAR(4) FOR BIT DATA, or CHAR(5) where the tablespace uses 5-byte RIDs) that CHECK DATA fills with the record identifier of the violating row, then a TIMESTAMP column that CHECK DATA populates with the time checking started. Create it like this:
CREATE TABLE CNBADM.ACCT_TRAN_EXCEPTIONS
LIKE CNBADM.ACCT_TRANSACTION;
ALTER TABLE CNBADM.ACCT_TRAN_EXCEPTIONS
ADD EXCEPT_RID CHAR(4) FOR BIT DATA;
ALTER TABLE CNBADM.ACCT_TRAN_EXCEPTIONS
ADD EXCEPT_TS TIMESTAMP NOT NULL WITH DEFAULT;
The LIKE clause clones the column definitions. The ALTER adds the exception-specific columns. If your base table has LOB columns, the exception table must have corresponding LOB columns — but be aware that CHECK DATA with LOB tables is significantly slower because it must read and validate each LOB value, not just the base row. For tables with large LOBs (e.g., document storage), Ahmad Patel at Pinnacle Health runs CHECK DATA with the AUXERROR INVALIDATE option, which marks invalid LOB column values without copying the entire LOB to the exception table. This reduces the I/O overhead from catastrophic to manageable.
CHECK DATA with scope control. For partitioned tablespaces, you can limit CHECK DATA to a specific partition, just like CHECK INDEX:
CHECK DATA TABLESPACE DBCNB01.TSTRAN01
PART 7
FOR EXCEPTION IN CNBADM.ACCT_TRANSACTION
USE CNBADM.ACCT_TRAN_EXCEPTIONS
SCOPE AUXONLY
SHRLEVEL CHANGE
The SCOPE AUXONLY option checks only LOB and XML column consistency without rechecking referential integrity — useful when you know RI is clean but suspect LOB corruption after a partial recovery.
CHECK INDEX
Verifies that index entries are consistent with the actual data in the tablespace.
CHECK INDEX (CNBADM.IX_ACCT_TRAN_PK,
CNBADM.IX_ACCT_TRAN_DATE)
SHRLEVEL CHANGE
SORTDEVT SYSDA
SORTNUM 4
CHECK INDEX scans the tablespace data and compares it against each index. Any mismatch — a data row without a corresponding index entry, or an index entry pointing to a nonexistent data row — is flagged.
If CHECK INDEX finds errors, you need REBUILD INDEX.
CHECK INDEX at Partition Level. For large partitioned tablespaces, running CHECK INDEX against all partitions is expensive. Target specific partitions:
CHECK INDEX (CNBADM.IX_ACCT_TRAN_PK,
CNBADM.IX_ACCT_TRAN_DATE)
PART 7
SHRLEVEL CHANGE
SORTDEVT SYSDA
SORTNUM 4
This checks only partition 7's index entries, which is sufficient when you know the potential corruption is limited to a specific partition — for example, after a partition-level RECOVER or a failed LOAD into that partition.
CHECK INDEX output interpretation. CHECK INDEX writes its results to SYSPRINT. A clean run shows:
DSNU710I CHECK INDEX COMPLETE, RETURN CODE=0
DSNU085I UTILITY PROCESSING COMPLETE, HIGHEST RETURN CODE=0
When errors are found, you get detail records like:
DSNU711I INDEX ENTRY FOUND WITH NO MATCHING TABLE ROW
INDEX: CNBADM.IX_ACCT_TRAN_PK
KEY VALUE: X'C3D5C2F0F0F1F2F3F4...'
RID: X'000007A300000012'
Each DSNU711I message identifies a specific orphan index entry. Collect these for analysis before running REBUILD INDEX — the pattern of corruption (random entries vs. a contiguous range) tells you whether the root cause was a failed utility, a hardware error, or software corruption.
REBUILD INDEX
Reconstructs indexes from the tablespace data, discarding the existing index entries and building fresh ones — no DROP INDEX is involved, so the index definition and its authorizations stay intact.
REBUILD INDEX (ALL)
TABLESPACE DBCNB01.TSTRAN01
SHRLEVEL NONE
SORTDEVT SYSDA
SORTNUM 4
STATISTICS
REBUILD INDEX is a brute-force fix: it reads every row in the tablespace and constructs new index entries from scratch. It's definitive — after REBUILD INDEX, you know the indexes are correct.
When to use REBUILD INDEX: - After CHECK INDEX reports errors - After an abended utility that may have left indexes inconsistent - When index spaces show unexpected growth (possible duplicate entries from corruption) - After restoring a tablespace from a copy without recovering the associated indexes
The STATISTICS option collects inline statistics on the rebuilt indexes. Always include it.
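When CHECK INDEX flagged only one partition, you can limit the rebuild to that partition rather than rebuilding every index in the tablespace. A sketch, reusing the chapter's index name:

```
REBUILD INDEX (CNBADM.IX_ACCT_TRAN_PK PART 7)
  SHRLEVEL CHANGE
  SORTDEVT SYSDA
  SORTNUM 4
  STATISTICS
```

SHRLEVEL CHANGE rebuilds from a shadow while applications keep reading; the trade-off is extra log and DASD consumption during the rebuild window.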
9.7 Utility Scheduling and Monitoring
Utilities don't run themselves. You need a schedule that keeps your data healthy without consuming your entire batch window.
The Utility Schedule Framework
A complete utility schedule has four components:
- Threshold-based triggers. Monitor catalog and real-time statistics — CLUSTERRATIOF, PERCDROP, FARINDREF, STATSTIME, COPYUPDATEDPAGES — to determine when utilities are needed.
- Calendar-based schedule. Recurring jobs for time-sensitive operations (daily copies, weekly REORGs).
- Event-driven execution. Utilities triggered by specific events (after LOAD, after DDL changes, after recovery).
- Monitoring and alerting. Track utility execution, detect failures, alert on pending statuses (REORP, CHKP, COPYP).
Monitoring Utilities: DISPLAY UTILITY
-DB2P DISPLAY UTILITY(*)
This TSO or console command shows all active and stopped utilities:
DSNU105I -DB2P DSNUGDIS - USERID = CNBPROD
UTILID = REORGACCT
UTILITY = REORG
PHASE = BUILD
STATUS = ACTIVE
OBJECT = DBCNB01.TSACCT01 PART 7
PERCENT COMPLETE = 47
Key status values: - ACTIVE — Running normally - STOPPED — Stopped, can be restarted - TERMINATED — Terminated, must be restarted from scratch
To terminate a stuck utility:
-DB2P TERM UTILITY(REORGACCT)
Pending Statuses
DB2 uses restrictive statuses to enforce utility hygiene. If you skip a required utility, DB2 restricts access to the object:
| Status | Meaning | Resolution |
|---|---|---|
| REORP | REORG pending — ALTER needs materialization | Run REORG |
| CHKP | CHECK pending — RI may be violated | Run CHECK DATA |
| COPYP | COPY pending — no current image copy | Run COPY |
| RBDP | REBUILD pending — indexes may be inconsistent | Run REBUILD INDEX |
| LPL | Logical Page List — pages with I/O errors | Run RECOVER for affected pages |
Your COBOL programs will get SQLCODE -904 (resource unavailable) if they try to access an object in a restrictive pending status. This is not a bug — it is DB2 protecting you from corrupt data.
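On the application side, a defensive error handler distinguishes -904 from hard SQL errors so operations gets a meaningful alert instead of a generic abend. A sketch — the paragraph names and WS- fields are hypothetical:

```
EVALUATE TRUE
    WHEN SQLCODE = 0
        CONTINUE
    WHEN SQLCODE = -904
*       Resource unavailable: SQLERRMC carries the reason
*       code, resource type, and resource name tokens
        MOVE SQLERRMC TO WS-UNAVAIL-RESOURCE
        PERFORM 9100-LOG-RESOURCE-UNAVAIL
        PERFORM 9200-ALERT-OPERATIONS
        PERFORM 9300-CHECKPOINT-AND-EXIT
    WHEN OTHER
        PERFORM 9999-SQL-ERROR-ABEND
END-EVALUATE
```

The checkpoint-and-exit path matters: a batch job that hits -904 mid-run should end cleanly at a restartable point, because the resolution — running the pending utility — happens outside the program.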
Automating the Schedule
At CNB, Kwame's team uses a three-tier automation approach:
Tier 1: Threshold monitoring (daily). A batch job queries the DB2 catalog:
SELECT DBNAME, TSNAME, PARTITION,
CARDF, FARINDREF, PERCDROP, STATSTIME
FROM SYSIBM.SYSTABLEPART
WHERE PERCDROP > 10
OR (CARDF > 0 AND FARINDREF > CARDF * 0.10)
OR STATSTIME < CURRENT TIMESTAMP - 7 DAYS
ORDER BY DBNAME, TSNAME, PARTITION;
Companion queries pull CLUSTERRATIOF from SYSIBM.SYSINDEXSTATS (partition-level index statistics) and the updated-page counts from the real-time statistics table SYSIBM.SYSTABLESPACESTATS — those columns do not live in SYSTABLEPART.
Results feed into a scheduling system that generates the next night's utility jobs.
Tier 2: Calendar schedule (weekly/monthly). Fixed-schedule utilities for critical tablespaces where you can't afford to wait for threshold detection:
- Sunday 02:00 — REORG high-activity tablespaces
- Sunday 06:00 — Full image copies, all production tablespaces
- Daily 01:00 — Incremental image copies, all production tablespaces
- Daily 05:30 — RUNSTATS on tablespaces loaded/modified by nightly batch
Tier 3: Event-driven (on demand). Triggered by specific operations:
- After every LOAD: inline COPY + inline RUNSTATS + REBIND
- After every DDL change: REORG (if pending) + RUNSTATS + REBIND
- After every recovery: CHECK DATA + CHECK INDEX + COPY + RUNSTATS + REBIND
Production Scheduling Considerations
Utility scheduling isn't just about what to run — it's about coordinating with everything else that runs in the same batch window. Here are the production realities that training courses skip:
CPU and I/O contention with batch jobs. REORG and LOAD are I/O-intensive. Running them concurrently with your heaviest batch jobs means both compete for channel paths and storage bandwidth. At CNB, Kwame's team discovered that running REORG SHRLEVEL CHANGE on the account transaction tablespace during the nightly batch cycle increased batch elapsed time by 22%. They moved the REORG to a Sunday morning window when batch volume is lowest.
Log volume management. Online REORG (SHRLEVEL CHANGE) generates log volume on the order of 2.5-3x the tablespace size. LOAD with LOG YES generates log records for every inserted row. If your active log datasets are sized for normal workload, a large REORG or LOAD can fill them and trigger a log-suspend condition — which halts all DB2 activity until the archive process frees space. Size your active logs for peak utility workloads, not average application workloads. CNB sizes active logs at 3x the largest tablespace they REORG.
Utility job restart considerations. If a utility job fails mid-execution, you don't necessarily have to restart from scratch. REORG with SHRLEVEL CHANGE is restartable from the last completed phase. LOAD is restartable from the last commit point if LOG YES was specified. COPY is generally not restartable — you re-run it. Document the restart behavior for each utility in your runbook so the operator on call at 3 AM doesn't have to figure it out from the manual.
Maintenance window coordination with application teams. Before scheduling any utility with SHRLEVEL NONE or SHRLEVEL REFERENCE, communicate the expected unavailability window to every application team that accesses the affected tablespace. Kwame maintains a "utility impact matrix" — a spreadsheet mapping each tablespace to the applications, transactions, and batch jobs that depend on it, with the business owner's contact information. When a SHRLEVEL NONE REORG is needed, the matrix tells him exactly who to notify and what the business impact will be.
Utility Scheduling Anti-Patterns
The "Sunday Night Special." Running every REORG and RUNSTATS job on Sunday night. This blows through your batch window, generates massive log volume, and means your statistics are six days stale by Saturday.
The "Just REORG Everything" approach. Wasting CPU on tablespaces that don't need reorganization. Use metrics, not calendars.
The "We'll Get to It" deferral. Ignoring COPY-pending status because "we'll take a copy this weekend." When the disk fails on Wednesday, you'll wish you hadn't waited.
The "RUNSTATS Without REBIND" mistake. Collecting beautiful statistics that no package ever uses because nobody rebinds. RUNSTATS without REBIND is noise.
9.8 Project Checkpoint: DB2 Utility Schedule for the HA Banking System
It's time to apply everything to the HA Banking Transaction Processing System. Your project checkpoint requires a complete utility schedule for the system's DB2 objects.
The HA Banking System's DB2 Objects
From previous chapters, the system includes these primary tablespaces:
| Tablespace | Table | Partitions | Size | Daily Activity |
|---|---|---|---|---|
| TSACCTMS | ACCT_MASTER | 12 | 45 GB | 15,000 updates |
| TSTRANAC | ACCT_TRANSACTION | 24 | 180 GB | 2.5M inserts |
| TSCUSTMR | CUSTOMER_MASTER | 8 | 22 GB | 3,000 updates |
| TSAUDIT | AUDIT_TRAIL | 24 | 95 GB | 3M inserts |
| TSXREF | ACCT_XREF | 4 | 5 GB | 500 updates |
| TSPARAM | SYSTEM_PARAMS | 1 | 50 MB | 10 updates |
Your Checkpoint Deliverables
Design a complete utility schedule that addresses:
- REORG strategy. For each tablespace, specify: frequency, SHRLEVEL, partition-level or full, inline COPY/RUNSTATS. Justify your choices based on the activity profile.
- RUNSTATS strategy. Identify which columns need COLGROUP and FREQVAL statistics based on the query patterns from Chapter 6. Specify profiles for each critical tablespace.
- Image copy strategy. Full copy frequency, incremental frequency, retention periods, dual copy requirements. Calculate expected storage consumption.
- Recovery plan. Document the recovery order for the six tablespaces considering their RI relationships. Estimate recovery time for three scenarios: single tablespace failure, multiple tablespace failure, and point-in-time recovery.
- Monitoring queries. Write the SQL that your threshold-monitoring job will use to detect when each utility is needed.
Sample Solution: TSTRANAC (Account Transaction)
Here's the approach for the highest-volume tablespace:
REORG: Weekly on Sunday for the current month's partition (most insert activity), biweekly for the prior 3 months, monthly for older partitions if CLUSTERRATIOF < 0.80. SHRLEVEL CHANGE with TIMEOUT TERM. Inline COPY and RUNSTATS with COLGROUP on (ACCT_ID, TRAN_DATE) and (TRAN_TYPE, POST_STATUS).
RUNSTATS: Daily after the nightly batch load for the active partition. Use profile-based RUNSTATS with FREQVAL COUNT 20 on TRAN_TYPE and HISTOGRAM NUMQUANTILES 50 on TRAN_AMOUNT. REBIND the five packages that access this table.
Image copy: Full copy weekly Sunday (after REORG inline copy). Incremental daily at 01:00. Dual copies — DASD (7-day retention) and virtual tape (90-day retention). FlashCopy for the full weekly copy.
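The expected storage consumption falls straight out of the copy strategy. A rough worked example for TSTRANAC, assuming — hypothetically — that about 10% of pages change per day, so each incremental copy runs near 18 GB:

```
DASD (7-day retention):
  1 weekly full copy               180 GB
  6 daily incrementals x 18 GB     108 GB
  Total on DASD                   ~288 GB

Virtual tape (90-day retention):
  13 weekly full copies x 180 GB  ~2.3 TB
  77 daily incrementals x 18 GB   ~1.4 TB
  Total on tape                   ~3.7 TB
```

These totals are per tier — the dual-copy design puts one copy on DASD and one on tape. Replace the 10% change-rate assumption with your actual COPYUPDATEDPAGES numbers before committing to a storage request.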
Recovery order: TSACCTMS (parent) must be recovered before TSTRANAC (dependent). TSAUDIT can be recovered in parallel (no RI relationship). Estimated recovery time for TSTRANAC alone: full copy restore 12 minutes (FlashCopy), log apply 8-25 minutes depending on log volume. Total: 20-37 minutes.
Apply this same analysis to all six tablespaces for your complete checkpoint.
9.9 Spaced Review
From Chapter 4: Dataset Management — Utility Datasets
Every DB2 utility requires datasets: input datasets (SYSREC for LOAD), output datasets (SYSCOPY for COPY), work datasets (sort work for REORG), and control datasets (SYSIN for utility statements). Review Chapter 4's dataset allocation principles:
- How do you size SYSCOPY datasets for a 45 GB tablespace? (Answer: approximately the tablespace size for full copy, plus 20% overhead. Use RLSE to release unused space.)
- What happens if your REORG sort work dataset runs out of space? (Answer: The REORG abends. Use SMS-managed storage classes with auto-extend, or reference your DFSORT SMF data to pre-size accurately.)
- Why do you specify DISP=(NEW,CATLG) for copy datasets? (Answer: Each copy is a new dataset with a date-qualified name. CATLG ensures it's cataloged for RECOVER to find later.)
From Chapter 6: Optimizer — RUNSTATS Impact
Chapter 6 covered how the DB2 optimizer selects access paths. Review the connection:
- Why does the optimizer sometimes choose a tablespace scan even when an index exists? (Answer: Stale statistics may show low cardinality, making the optimizer think the index isn't selective enough. Or CLUSTERRATIOF may be so low that sequential prefetch through the index isn't efficient.)
- What's the relationship between COLGROUP statistics and correlated predicates? (Answer: Without COLGROUP, the optimizer assumes column values are independent and multiplies individual selectivities. With correlated columns like STATE and ZIP_CODE, this overestimates selectivity by orders of magnitude.)
- How does REORG affect optimizer decisions even before RUNSTATS runs? (Answer: REORG restores clustering, which improves CLUSTERRATIOF. But the optimizer won't know about the improvement until RUNSTATS collects the new CLUSTERRATIOF value and the package is rebound.)
Summary
DB2 utilities are operational requirements, not administrative overhead. As a COBOL developer, you own the performance, recoverability, and data integrity of your DB2 applications. That ownership extends to the utilities that maintain those properties.
REORG restores physical order and reclaims space — but only when metrics justify it. Online REORG with SHRLEVEL CHANGE keeps your applications available during reorganization, at the cost of additional log volume and the requirement for mapping tables.
RUNSTATS feeds the optimizer with current statistics. Without it, your carefully tuned SQL runs on stale information. Inline RUNSTATS, profiles, and threshold-based scheduling keep statistics fresh without wasting resources. Always REBIND after RUNSTATS.
COPY and RECOVER are your insurance policy. Dual copies to different media, appropriate frequency based on acceptable recovery time, and documented recovery procedures that account for referential integrity dependencies. Point-in-time recovery is powerful but requires recovering all related tablespaces to the same point.
LOAD provides high-speed bulk data loading with inline COPY and RUNSTATS capabilities. Use LOG NO for performance, but always take an image copy immediately after.
CHECK DATA, CHECK INDEX, and REBUILD INDEX are your diagnostic and repair tools. Run them after recovery operations, after LOAD with ENFORCE NO, and whenever you suspect data integrity issues.
The utility schedule ties it all together: threshold-based monitoring to detect when utilities are needed, calendar scheduling for critical recurring operations, and event-driven execution for post-operation housekeeping. Your HA Banking system depends on all of these working together.
Kwame sums it up: "A developer who doesn't understand DB2 utilities is like a pilot who doesn't understand maintenance. You can fly the plane, but when something goes wrong at 35,000 feet, you'd better know what keeps it airborne."