Case Study 2: Impact Analysis Using the Catalog Before a Major Change
Background
Sentinel Insurance Group (SIG) is a commercial property and casualty insurer headquartered in Hartford, Connecticut, with $22 billion in annual gross written premium. SIG's policy administration and claims management systems run on DB2 for z/OS 12 within a Parallel Sysplex configuration, processing approximately 95 million transactions per day across peak periods.
The DB2 environment is substantial:
| Metric | Value |
|---|---|
| Total databases | 28 |
| Total tables | 6,400 |
| Total indexes | 14,200 |
| Total views | 2,800 |
| Total stored procedures | 3,600 |
| Total packages (bound plans) | 18,000 |
| Total foreign key relationships | 4,200 |
| Total data volume (compressed) | 18 TB |
The DBA team consists of eight people led by Raymond Zhao, a principal DBA with 19 years of experience at SIG.
The Proposed Change
SIG's Chief Data Officer has mandated a strategic initiative: consolidate customer identity across all product lines. Today, SIG maintains separate customer records in the policy system (POLDB), the claims system (CLMDB), and the billing system (BILDB). A single human customer may have three or more rows across these systems with inconsistent identifiers, addresses, and contact information.
The solution requires a new master customer table (MASTER_CUSTOMER) and changes to the existing CUSTOMER table in POLDB:
- Add three new columns to POLDB.CUSTOMER: MASTER_CUSTOMER_ID, IDENTITY_VERIFIED_FLAG, and IDENTITY_VERIFIED_DATE
- Add a foreign key from POLDB.CUSTOMER.MASTER_CUSTOMER_ID to the new MASTER.MASTER_CUSTOMER table
- Modify the data type of POLDB.CUSTOMER.EMAIL_ADDRESS from CHAR(50) to VARCHAR(254) to support longer email addresses discovered during identity consolidation
- Drop the column POLDB.CUSTOMER.LEGACY_SYSTEM_CODE, which has been populated with blanks since a 2018 migration and is no longer referenced
The project sponsor wants to deploy all four changes in a single maintenance window, scheduled for the third Saturday of the month — a 6-hour window from 11:00 PM to 5:00 AM.
Raymond's job is to determine whether this is safe.
The Impact Analysis
Raymond begins by systematically querying the catalog to understand every dependency on the CUSTOMER table.
Step 1: Establish the Scope
First, Raymond needs to know the basic dimensions of the table being modified.
SELECT NAME, CARD, NPAGES, COLCOUNT,
CREATEDTS, ALTEREDTS, STATSTIME
FROM SYSIBM.SYSTABLES
WHERE CREATOR = 'POLDB' AND NAME = 'CUSTOMER';
Results: 14.2 million rows, 847,000 pages, 68 columns, created in 2003, last altered in 2021, last RUNSTATS two days ago. This is a large, mature table. Any change carries significant risk.
Step 2: Foreign Key Dependencies
Raymond queries for all foreign key relationships where CUSTOMER is the parent table — any child tables that reference it.
SELECT R.CREATOR AS CHILD_SCHEMA,
R.TBNAME AS CHILD_TABLE,
R.RELNAME AS CONSTRAINT_NAME,
R.DELETERULE,
T.CARD AS CHILD_ROWS
FROM SYSIBM.SYSRELS R
JOIN SYSIBM.SYSTABLES T
ON T.CREATOR = R.CREATOR AND T.NAME = R.TBNAME
WHERE R.REFTBCREATOR = 'POLDB'
AND R.REFTBNAME = 'CUSTOMER'
ORDER BY T.CARD DESC;
Results: 23 child tables across three schemas (POLDB, CLMDB, BILDB). The largest child table has 142 million rows. Seven of the relationships use CASCADE delete rules.
This is the first red flag. The proposed changes include adding a column and modifying a column data type. Adding a column is generally safe, but modifying a data type on a table with 23 dependent child tables requires checking whether any of those relationships involve the EMAIL_ADDRESS column being modified.
Step 3: Foreign Key Column Details
Raymond drills into the foreign key columns to determine which parent columns are referenced.
SELECT R.RELNAME, FK.COLNAME, FK.COLSEQ
FROM SYSIBM.SYSRELS R
JOIN SYSIBM.SYSFOREIGNKEYS FK
ON FK.CREATOR = R.CREATOR
AND FK.TBNAME = R.TBNAME
AND FK.RELNAME = R.RELNAME
WHERE R.REFTBCREATOR = 'POLDB'
AND R.REFTBNAME = 'CUSTOMER'
ORDER BY R.RELNAME, FK.COLSEQ;
Results: All 23 foreign keys reference either CUSTOMER_ID (the primary key) or a combination of CUSTOMER_ID and PRODUCT_LINE_CODE. None reference EMAIL_ADDRESS or LEGACY_SYSTEM_CODE. The data type change and column drop will not affect foreign key constraints directly.
Step 4: Index Dependencies
Raymond checks every index on the CUSTOMER table to see if any include the columns being modified or dropped.
SELECT I.NAME AS INDEX_NAME,
K.COLSEQ, K.COLNAME, K.ORDERING,
I.UNIQUERULE, I.CLUSTERING
FROM SYSIBM.SYSINDEXES I
JOIN SYSIBM.SYSKEYS K
ON K.IXCREATOR = I.CREATOR AND K.IXNAME = I.NAME
WHERE I.TBCREATOR = 'POLDB'
AND I.TBNAME = 'CUSTOMER'
ORDER BY I.NAME, K.COLSEQ;
Results: 11 indexes on the CUSTOMER table. One index (IX_CUST_EMAIL) includes EMAIL_ADDRESS as its second key column. No index includes LEGACY_SYSTEM_CODE.
This is the second critical finding. The VARCHAR conversion of EMAIL_ADDRESS will affect IX_CUST_EMAIL. On z/OS, ALTER COLUMN changes that affect indexed columns require the index to be rebuilt. Raymond notes this for the implementation plan.
Step 5: View Dependencies
Raymond checks for views that reference the CUSTOMER table.
SELECT D.DCREATOR AS VIEW_SCHEMA,
D.DNAME AS VIEW_NAME,
V.CREATOR AS BASE_SCHEMA,
V.NAME AS BASE_TABLE,
V.TYPE
FROM SYSIBM.SYSVIEWDEP D
JOIN SYSIBM.SYSTABLES V
ON V.CREATOR = D.DCREATOR AND V.NAME = D.DNAME
WHERE D.BCREATOR = 'POLDB'
AND D.BNAME = 'CUSTOMER'
ORDER BY D.DNAME;
Results: 34 views reference the CUSTOMER table. Raymond needs to check whether any of these views explicitly reference LEGACY_SYSTEM_CODE (the column being dropped) or EMAIL_ADDRESS (the column being altered).
He writes a follow-up query to check the view definitions stored in SYSIBM.SYSVIEWS:
SELECT V.CREATOR, V.NAME, V.SEQNO, V.TEXT
FROM SYSIBM.SYSVIEWS V
WHERE (V.CREATOR, V.NAME) IN (
SELECT D.DCREATOR, D.DNAME
FROM SYSIBM.SYSVIEWDEP D
WHERE D.BCREATOR = 'POLDB' AND D.BNAME = 'CUSTOMER'
)
AND (V.TEXT LIKE '%LEGACY_SYSTEM_CODE%'
OR V.TEXT LIKE '%EMAIL_ADDRESS%')
ORDER BY V.CREATOR, V.NAME, V.SEQNO;
Results: 3 views reference LEGACY_SYSTEM_CODE. 8 views reference EMAIL_ADDRESS. The 3 views referencing LEGACY_SYSTEM_CODE will fail after the column is dropped — they must be altered or recreated as part of the change.
Step 6: Package Dependencies
Raymond checks for compiled packages that reference the CUSTOMER table.
SELECT PD.DCREATOR AS PACKAGE_COLLECTION,
PD.DNAME AS PACKAGE_NAME,
P.VALID, P.OPERATIVE,
P.BINDTIME, P.OWNER
FROM SYSIBM.SYSPACKDEP PD
JOIN SYSIBM.SYSPACKAGE P
ON P.LOCATION = PD.DLOCATION
AND P.COLLID = PD.DCOLLID
AND P.NAME = PD.DNAME
WHERE PD.BCREATOR = 'POLDB'
AND PD.BNAME = 'CUSTOMER'
AND PD.BTYPE = 'T'
ORDER BY PD.DNAME;
Results: 187 packages reference the CUSTOMER table. When the table is altered, all 187 packages will be invalidated and require rebinding. Raymond needs to understand how long rebinding 187 packages will take and whether it fits within the 6-hour maintenance window.
He further filters for packages that reference the specific columns being changed:
-- Packages referencing LEGACY_SYSTEM_CODE
SELECT DISTINCT PD.DCREATOR, PD.DNAME
FROM SYSIBM.SYSPACKDEP PD
JOIN SYSIBM.SYSPACKSTMT PS
ON PS.LOCATION = PD.DLOCATION
AND PS.COLLID = PD.DCOLLID
AND PS.NAME = PD.DNAME
WHERE PD.BCREATOR = 'POLDB'
AND PD.BNAME = 'CUSTOMER'
AND PS.TEXT LIKE '%LEGACY_SYSTEM_CODE%';
Results: 4 packages explicitly reference LEGACY_SYSTEM_CODE. These packages will not just be invalidated — they will fail at bind time until the SQL is modified to remove the reference to the dropped column. These require code changes before the schema change.
Step 7: Trigger Dependencies
SELECT SCHEMA, NAME, TRIGTIME, TRIGEVENT, GRANULARITY
FROM SYSIBM.SYSTRIGGERS
WHERE TBCREATOR = 'POLDB'
AND TBNAME = 'CUSTOMER';
Results: 5 triggers defined on the CUSTOMER table. Two are AFTER INSERT triggers for audit logging. One is a BEFORE UPDATE trigger for data validation. Two are AFTER UPDATE triggers for replication. Raymond reviews the trigger text to determine if any reference the columns being modified.
The Consolidated Impact Report
Raymond compiles his findings:
| Dependency Type | Count | Affected by Change | Action Required |
|---|---|---|---|
| Foreign key child tables | 23 | 0 (no FK references modified columns) | None |
| Indexes | 11 | 1 (IX_CUST_EMAIL uses EMAIL_ADDRESS) | REBUILD INDEX after ALTER |
| Views | 34 | 11 (3 reference LEGACY_SYSTEM_CODE, 8 reference EMAIL_ADDRESS) | Recreate 3 views; verify 8 views |
| Packages | 187 | 187 (all invalidated) + 4 (reference dropped column) | REBIND 187 packages; fix 4 packages first |
| Triggers | 5 | 1 (audit trigger logs EMAIL_ADDRESS changes) | Verify trigger still valid after VARCHAR conversion |
| Total objects requiring attention | 260+ |
The Recommendation
Raymond presents his analysis to the project team and recommends splitting the change into two maintenance windows:
Window 1 (Week 1): - Add the three new columns (MASTER_CUSTOMER_ID, IDENTITY_VERIFIED_FLAG, IDENTITY_VERIFIED_DATE) - Add the new foreign key to MASTER.MASTER_CUSTOMER - These changes are additive and low-risk; existing code continues to work unchanged
Window 2 (Week 2): - Modify EMAIL_ADDRESS from CHAR(50) to VARCHAR(254) - Drop LEGACY_SYSTEM_CODE - Prerequisite: the 4 packages and 3 views referencing LEGACY_SYSTEM_CODE must be modified and recompiled before the window - During the window: ALTER TABLE, REBUILD IX_CUST_EMAIL, REBIND 187 packages - Estimated time: 4 hours (tight but feasible within the 6-hour window)
The project sponsor initially pushes back: "Why two windows? Can we not do it all at once?" Raymond walks through the risk analysis: if anything goes wrong during the additive changes in Window 1, the rollback is simple (DROP COLUMN, DROP FOREIGN KEY). But if the VARCHAR conversion and column drop are combined with the additive changes and something fails midway, the rollback becomes exponentially more complex — and the 6-hour window does not provide enough buffer for both forward execution and potential rollback.
The sponsor agrees to the two-window approach.
Outcome
Window 1 executes in 47 minutes without incident. All 187 packages remain valid because only new columns and a new foreign key were added — no existing columns were modified or dropped.
Window 2 executes in 3 hours and 22 minutes. The VARCHAR conversion completes in 18 minutes (faster than expected because the column was largely padded blanks). The column drop completes in 4 minutes. IX_CUST_EMAIL is rebuilt in 31 minutes. The 187 package rebinds complete in 2 hours and 29 minutes (the bottleneck, as expected). All 5 triggers remain valid after testing. No application errors are reported when systems resume at 5:00 AM.
Lessons Learned
1. Impact analysis is a catalog problem. Every dependency that Raymond needed to assess — foreign keys, indexes, views, packages, triggers — was available through standard catalog queries. Without the catalog, he would have been guessing based on documentation that may or may not be current.
2. The catalog reveals what documentation does not. The 34 views referencing the CUSTOMER table included 8 views that the application team had not documented. Three of these would have broken if the column drop had proceeded without the view modifications. The catalog is always current; documentation frequently is not.
3. Package count drives maintenance window sizing. The 187 package rebinds consumed 70% of the Window 2 time. Estimating rebind time from the catalog (number of packages, complexity of statements) is essential for accurate window planning.
4. Splitting changes reduces risk without adding cost. The two-window approach added one week to the timeline but dramatically reduced the risk of each window. In regulated industries, this trade-off is almost always worth making.
5. The recursive CTE is the DBA's secret weapon. Raymond's dependency chain queries allowed him to trace not just direct dependencies but indirect ones — views that depend on views, packages that reference views that reference the CUSTOMER table. Without recursive catalog queries, these second-order dependencies are easy to miss.
Discussion Questions
-
Raymond found 4 packages that explicitly reference LEGACY_SYSTEM_CODE. How would you verify that these are the ONLY packages affected? Could there be dynamic SQL that references this column but does not appear in SYSPACKDEP?
-
The VARCHAR conversion of EMAIL_ADDRESS required rebuilding IX_CUST_EMAIL. Under what circumstances could an ALTER COLUMN change avoid an index rebuild? Does the answer differ between z/OS and LUW?
-
If Raymond had been working on DB2 for LUW instead of z/OS, which SYSCAT views would he have used for each step of his analysis? Would the overall approach have been the same?
-
The project sponsor initially wanted all changes in a single window. Construct a risk argument quantifying the probability and impact of failure for the single-window approach versus the two-window approach. What is the expected cost of each option?
-
How would you build Raymond's impact analysis into an automated pre-change checklist that runs every time a schema change is proposed? What catalog queries would the checklist include?
Return to Chapter 21 | Continue to Key Takeaways