In This Chapter
- 7.1 Beyond SELECT...INTO: Why Single-Row Operations Are Killing Your Batch Jobs
- 7.2 Multi-Row FETCH and INSERT: COBOL Host Variable Arrays
- 7.3 The MERGE Statement: Eliminating the SELECT-Decide-Write Anti-Pattern
- 7.4 Temporal Tables: The Database Remembers Everything
- 7.5 Recursive CTEs: Hierarchies Without Recursive Program Calls
- 7.6 OLAP Functions: Replacing Cursor Logic with SQL
- 7.7 Choosing Set-Based vs. Cursor-Based: Decision Framework
- 7.8 Spaced Review: Connecting Back
- 7.9 HA Banking Project Checkpoint
- Chapter Summary
Chapter 7: Advanced SQL for COBOL: Multi-Row Operations, Temporal Tables, Recursive CTEs, and OLAP Functions
"I spent fifteen years writing cursor loops that processed one row at a time. When I finally benchmarked multi-row FETCH against the same workload, I wanted to go back in time and slap myself." — Kwame Asante, Senior Systems Programmer, City National Bank
You know how to embed SQL in COBOL. You can declare cursors, FETCH into host variables, handle SQLCODEs, and write WHERE clauses that the optimizer doesn't hate. Chapters 1 and 6 gave you that foundation. Now we throw away the training wheels.
This chapter covers four capabilities that separate the COBOL-DB2 practitioner who writes adequate code from the one who writes code that survives peak-volume production: multi-row operations, temporal tables, recursive CTEs, and OLAP functions. Every one of these features exists because IBM looked at real workloads — batch jobs processing hundreds of millions of rows, audit requirements demanding historical state reconstruction, hierarchy traversals that generated thousands of recursive cursor calls — and built SQL-level solutions. Your job is to use them.
The HA Banking Transaction Processing System you're building will need all four. Multi-row operations for the nightly batch cycle that processes every transaction posted during the day. Temporal tables for regulatory audit trails that must answer "what did this account look like at 3:47 PM on March 3rd?" Recursive CTEs for the organizational hierarchy that determines approval chains. OLAP functions for the running balance calculations that currently consume 40% of the daily batch window.
Let's get to work.
7.1 Beyond SELECT...INTO: Why Single-Row Operations Are Killing Your Batch Jobs
Here's a number that should bother you: every time your COBOL program executes an SQL statement, DB2 performs a thread switch between the application address space and the DB2 address space. On a z16 running at full clip, that switch costs roughly 0.01 to 0.05 milliseconds. Sounds trivial. Now multiply it by 500 million — the number of transactions CNB processes daily.
At 0.03ms per switch, 500 million switches consume 4.17 hours just in thread-switching overhead. Not I/O. Not SQL execution. Just the cost of your COBOL program saying "Hey DB2, I need something" and DB2 saying "Here you go" — five hundred million times.
The Single-Row Bottleneck
The classic COBOL-DB2 pattern you learned in every training class looks like this:
EXEC SQL
DECLARE CSR-TRANSACTIONS CURSOR FOR
SELECT ACCT_NUM, TXN_AMT, TXN_DATE, TXN_TYPE
FROM TRANSACTION_HISTORY
WHERE PROCESS_FLAG = 'N'
ORDER BY ACCT_NUM, TXN_DATE
END-EXEC.
EXEC SQL OPEN CSR-TRANSACTIONS END-EXEC.
PERFORM UNTIL SQLCODE = 100
EXEC SQL
FETCH CSR-TRANSACTIONS
INTO :WS-ACCT-NUM, :WS-TXN-AMT,
:WS-TXN-DATE, :WS-TXN-TYPE
END-EXEC
IF SQLCODE = 0
PERFORM PROCESS-SINGLE-TRANSACTION
END-IF
END-PERFORM.
One FETCH, one row, one thread switch. For every single row. If you're processing 50 million rows in a nightly batch, you're doing 50 million thread switches just for the FETCH operations — before you count the UPDATE to mark each row as processed, and the INSERT to write the audit record. That's 150 million thread switches minimum.
What Multi-Row Operations Change
DB2 introduced multi-row FETCH in V8 and expanded the capability through subsequent releases. The concept is simple: instead of fetching one row per SQL call, you fetch n rows. Instead of inserting one row per SQL call, you insert n rows. The thread switch happens once for the entire block.
The performance math changes dramatically:
| Operation Pattern | Rows | Thread Switches | Overhead at 0.03ms |
|---|---|---|---|
| Single-row FETCH | 50,000,000 | 50,000,000 | 25 minutes |
| Multi-row FETCH (100 rows) | 50,000,000 | 500,000 | 15 seconds |
| Multi-row FETCH (1,000 rows) | 50,000,000 | 50,000 | 1.5 seconds |
That's not a typo. Going from single-row to multi-row FETCH with a rowset size of 1,000 reduces thread-switching overhead from 25 minutes to 1.5 seconds. The actual I/O and SQL processing time stays roughly the same — you're eliminating pure overhead.
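The arithmetic behind that table is worth internalizing. A minimal sketch, assuming the hypothetical 0.03 ms per-call cost used above (your measured cost will differ):

```python
# Thread-switch overhead arithmetic from the table above. The 0.03 ms
# per-call figure is the chapter's illustrative assumption, not a constant.
ROWS = 50_000_000
SWITCH_MS = 0.03

def overhead_seconds(rows: int, rowset_size: int) -> float:
    """Total thread-switch overhead in seconds for a given rowset size."""
    calls = -(-rows // rowset_size)   # ceiling division: one call per rowset
    return calls * SWITCH_MS / 1000.0

print(overhead_seconds(ROWS, 1))      # ~1500 s: the 25-minute row
print(overhead_seconds(ROWS, 100))    # ~15 s
print(overhead_seconds(ROWS, 1000))   # ~1.5 s
```

The number of calls drops by the rowset size; the overhead drops with it, which is the entire argument for multi-row operations.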
💡 Production Reality: The sweet spot for rowset size is usually between 100 and 1,000 rows. Beyond 1,000, the WORKING-STORAGE requirements grow large and you start competing with other programs for virtual storage. Kwame's team at CNB settled on 500 rows after extensive benchmarking against their z15 configuration. Your optimal number will depend on row width, available storage, and buffer pool configuration.
But It's Not Just FETCH
Multi-row operations extend to INSERT and the MERGE statement (DB2's upsert pattern). The same thread-switching arithmetic applies. If your batch job produces audit records, confirmation records, or summary records, multi-row INSERT eliminates the same overhead on the output side.
And MERGE — the operation that checks "does this row exist? If yes, UPDATE it; if no, INSERT it" — eliminates the SELECT-then-decide-then-INSERT-or-UPDATE pattern that costs you three thread switches per row.
We'll cover each operation in detail. First, multi-row FETCH and INSERT.
7.2 Multi-Row FETCH and INSERT: COBOL Host Variable Arrays
The Mechanics of Multi-Row FETCH
Multi-row FETCH requires three things your single-row FETCH doesn't:
- A rowset-positioned cursor — declared with the WITH ROWSET POSITIONING clause
- Host variable arrays — COBOL tables (OCCURS) sized to match your rowset
- The FETCH...FOR n ROWS syntax — telling DB2 how many rows you want per call
Here's the same transaction fetch rewritten for multi-row operation:
WORKING-STORAGE SECTION.
01 WS-ROWSET-SIZE PIC S9(4) COMP VALUE 500.
01 WS-ROWS-FETCHED PIC S9(9) COMP VALUE 0.
01 WS-TXN-ARRAYS.
* One array per column: DB2 host-variable arrays are parallel
* arrays, not a group (row structure) with OCCURS
05 WS-ACCT-NUM PIC X(12) OCCURS 500 TIMES.
05 WS-TXN-AMT PIC S9(13)V99 COMP-3 OCCURS 500 TIMES.
05 WS-TXN-DATE PIC X(10) OCCURS 500 TIMES.
05 WS-TXN-TYPE PIC X(3) OCCURS 500 TIMES.
01 WS-IND-ARRAYS.
05 WS-IND-ACCT PIC S9(4) COMP OCCURS 500 TIMES.
05 WS-IND-AMT PIC S9(4) COMP OCCURS 500 TIMES.
05 WS-IND-DATE PIC S9(4) COMP OCCURS 500 TIMES.
05 WS-IND-TYPE PIC S9(4) COMP OCCURS 500 TIMES.
01 WS-IDX PIC S9(9) COMP VALUE 0.
PROCEDURE DIVISION.
EXEC SQL
DECLARE CSR-TXN-MULTI CURSOR
WITH ROWSET POSITIONING FOR
SELECT ACCT_NUM, TXN_AMT, TXN_DATE, TXN_TYPE
FROM TRANSACTION_HISTORY
WHERE PROCESS_FLAG = 'N'
ORDER BY ACCT_NUM, TXN_DATE
END-EXEC.
EXEC SQL OPEN CSR-TXN-MULTI END-EXEC.
PERFORM UNTIL SQLCODE = 100
EXEC SQL
FETCH NEXT ROWSET FROM CSR-TXN-MULTI
FOR :WS-ROWSET-SIZE ROWS
INTO :WS-ACCT-NUM :WS-IND-ACCT,
:WS-TXN-AMT :WS-IND-AMT,
:WS-TXN-DATE :WS-IND-DATE,
:WS-TXN-TYPE :WS-IND-TYPE
END-EXEC
IF SQLCODE = 0 OR SQLCODE = 100
MOVE SQLERRD(3) TO WS-ROWS-FETCHED
PERFORM VARYING WS-IDX FROM 1 BY 1
UNTIL WS-IDX > WS-ROWS-FETCHED
PERFORM PROCESS-SINGLE-TRANSACTION
END-PERFORM
END-IF
END-PERFORM.
EXEC SQL CLOSE CSR-TXN-MULTI END-EXEC.
Let me walk through the critical details.
SQLERRD(3) tells you the actual row count. When you ask for 500 rows and only 317 remain, SQLCODE is 100 (end of result set) but SQLERRD(3) is 317. You must check SQLERRD(3) to know how many rows were actually placed in your arrays. Ignoring this and processing all 500 slots will give you garbage data in positions 318–500 from whatever was left in WORKING-STORAGE.
⚠️ Production Bug Alert: The most common multi-row FETCH bug in production is processing stale data from the array. When the last FETCH returns SQLCODE 100 with SQLERRD(3) = 317, your loop must stop at 317 — not 500. Kwame's team found this bug in three different programs during their multi-row migration at CNB. In one case, the program had been processing phantom transactions for six weeks before anyone noticed the account balance discrepancies.
Indicator variable arrays are mandatory in production. If any column can be NULL, you need indicator arrays. The indicator array must have the same OCCURS count as the host variable array. A value of -1 in the indicator means the corresponding column was NULL for that row. Skipping indicator arrays on nullable columns will get you SQLCODE -305 in production — and it will happen at 2 AM on the one night your on-call person is unreachable.
The OCCURS dimension must match the FOR n ROWS value. If your arrays are OCCURS 500 and you FETCH FOR 1000 ROWS, DB2 will happily write past the end of your arrays and corrupt whatever sits after them in WORKING-STORAGE. The COBOL compiler won't catch this. DB2 won't catch this. You'll find out when unrelated variables suddenly contain garbage. Define the rowset size in one place — a copybook literal for the OCCURS counts and a matching host variable for the FOR clause — so a change to one can't silently miss the other.
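The same loop shape can be prototyped in Python, using sqlite3's `fetchmany(n)` as a stand-in for FETCH...FOR n ROWS and `len(rows)` as the stand-in for SQLERRD(3). Table and column names are invented for illustration:

```python
# Rowset-fetch sketch: process only the rows actually returned per call,
# never the full array size -- the stale-data bug described above.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE txn (acct TEXT, amt REAL)")
conn.executemany("INSERT INTO txn VALUES (?, ?)",
                 [(f"A{i:04d}", i * 1.0) for i in range(317)])

ROWSET_SIZE = 100
cur = conn.execute("SELECT acct, amt FROM txn ORDER BY acct")
processed = 0
while True:
    rows = cur.fetchmany(ROWSET_SIZE)   # up to ROWSET_SIZE rows per call
    if not rows:                        # end of result set
        break
    for acct, amt in rows:              # bounded by len(rows), not 100
        processed += 1
print(processed)  # 317 -- not a multiple of the rowset size
```

The last call returns 17 rows, not 100; looping over `rows` rather than over the full array size is exactly the SQLERRD(3) discipline.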
Multi-Row INSERT
The INSERT side follows the same pattern. Instead of inserting one audit record per transaction, you fill an array and insert the entire batch:
01 WS-AUDIT-ARRAYS.
* Parallel arrays again: one host-variable array per column
05 WS-AUD-ACCT PIC X(12) OCCURS 500 TIMES.
05 WS-AUD-AMT PIC S9(13)V99 COMP-3 OCCURS 500 TIMES.
05 WS-AUD-TSTAMP PIC X(26) OCCURS 500 TIMES.
05 WS-AUD-ACTION PIC X(10) OCCURS 500 TIMES.
05 WS-AUD-USER PIC X(8) OCCURS 500 TIMES.
01 WS-INSERT-COUNT PIC S9(4) COMP VALUE 0.
PROCEDURE DIVISION.
ACCUMULATE-AUDIT-ROW.
ADD 1 TO WS-INSERT-COUNT.
MOVE WS-ACCT-NUM(WS-IDX) TO
WS-AUD-ACCT(WS-INSERT-COUNT).
MOVE WS-TXN-AMT(WS-IDX) TO
WS-AUD-AMT(WS-INSERT-COUNT).
MOVE WS-CURRENT-TSTAMP TO
WS-AUD-TSTAMP(WS-INSERT-COUNT).
MOVE 'PROCESSED' TO
WS-AUD-ACTION(WS-INSERT-COUNT).
MOVE WS-PROGRAM-USER TO
WS-AUD-USER(WS-INSERT-COUNT).
IF WS-INSERT-COUNT >= WS-ROWSET-SIZE
PERFORM FLUSH-AUDIT-ROWS
END-IF.
FLUSH-AUDIT-ROWS.
IF WS-INSERT-COUNT > 0
EXEC SQL
INSERT INTO AUDIT_TRAIL
(ACCT_NUM, TXN_AMT, AUDIT_TSTAMP,
AUDIT_ACTION, AUDIT_USER)
VALUES (:WS-AUD-ACCT, :WS-AUD-AMT,
:WS-AUD-TSTAMP, :WS-AUD-ACTION,
:WS-AUD-USER)
FOR :WS-INSERT-COUNT ROWS
END-EXEC
EVALUATE SQLCODE
WHEN 0
ADD WS-INSERT-COUNT TO
WS-TOTAL-INSERTS
WHEN OTHER
PERFORM HANDLE-INSERT-ERROR
END-EVALUATE
MOVE 0 TO WS-INSERT-COUNT
END-IF.
The pattern is accumulate-then-flush. You fill the array one row at a time during your processing loop, and when the array is full (or you've reached end-of-data), you flush the entire batch with one INSERT...FOR n ROWS.
Don't forget the final flush. If you process 50,000,317 rows with a rowset size of 500, the last 317 rows will sit in the array un-inserted after your main loop ends. You must call FLUSH-AUDIT-ROWS one final time after the loop completes. This is the second most common multi-row bug after the stale-data problem.
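The accumulate-then-flush shape, final flush included, can be sketched in Python with `executemany()` playing the role of INSERT...FOR n ROWS (names invented for illustration):

```python
# Buffered insert sketch: rows accumulate until the batch is full, then one
# executemany() writes the whole batch. The final flush after the loop
# covers the partial last batch -- forget it and those rows are silently lost.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE audit_trail (acct TEXT, action TEXT)")

BATCH_SIZE = 500
buffer = []

def flush(buf):
    if buf:  # guard, like the IF WS-INSERT-COUNT > 0 above
        conn.executemany("INSERT INTO audit_trail VALUES (?, ?)", buf)
        buf.clear()

for i in range(1317):                     # 2 full batches + 317 leftovers
    buffer.append((f"A{i:06d}", "PROCESSED"))
    if len(buffer) >= BATCH_SIZE:
        flush(buffer)
flush(buffer)                             # the final flush: trailing 317 rows

count = conn.execute("SELECT COUNT(*) FROM audit_trail").fetchone()[0]
print(count)  # 1317
```

Comment out the final `flush(buffer)` and the count drops to 1000 — the exact bug described above.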
Storage Considerations
Multi-row operations trade CPU time for WORKING-STORAGE space. Each rowset requires enough storage for (rowset_size * row_width) bytes for the data arrays plus (rowset_size * 2 * column_count) bytes for indicator arrays.
For a 500-row rowset with 10 columns averaging 20 bytes each:
- Data: 500 * 200 bytes = 100 KB
- Indicators: 500 * 2 * 10 = 10 KB
- Total: ~110 KB
That's modest. But if your row is 4 KB wide (large VARCHAR columns, LOB locators) and your rowset is 1,000 rows:
- Data: 1,000 * 4,096 bytes = 4 MB
That competes with your region size, especially if the program runs in a CICS transaction.
📊 Benchmark Data from CNB: Kwame's team tested rowset sizes of 50, 100, 200, 500, 1000, and 2000 against their transaction processing workload (avg row width 187 bytes). Performance improved linearly from 50 to 500, showed diminishing returns from 500 to 1000, and actually degraded at 2000 due to virtual storage paging. Their production standard is now 500 for batch, 100 for online (CICS/IMS).
Atomic vs. Non-Atomic Insert
DB2 supports two modes for multi-row INSERT:
- Atomic (the default): If any row fails, all rows in the batch are rolled back. You get the SQLCODE for the failing row.
- Non-atomic: Declared with NOT ATOMIC CONTINUE ON SQLEXCEPTION. If row 347 out of 500 fails (e.g., duplicate key), rows 1–346 are inserted and rows 348–500 are still attempted. DB2 returns an SQLCODE indicating partial failure; use GET DIAGNOSTICS to determine how many rows succeeded and which ones failed.
For the HA Banking system, use atomic inserts for financial transactions (you don't want partial batches posted) and non-atomic for audit/logging tables (you want as many records written as possible even if some fail).
* Atomic - all or nothing
EXEC SQL
INSERT INTO POSTED_TRANSACTIONS
(ACCT_NUM, TXN_AMT, POST_DATE)
VALUES (:WS-POST-ACCT, :WS-POST-AMT,
:WS-POST-DATE)
FOR :WS-INSERT-COUNT ROWS
ATOMIC
END-EXEC.
* Non-atomic - best effort
EXEC SQL
INSERT INTO AUDIT_LOG
(ACCT_NUM, LOG_MSG, LOG_TSTAMP)
VALUES (:WS-LOG-ACCT, :WS-LOG-MSG,
:WS-LOG-TSTAMP)
FOR :WS-INSERT-COUNT ROWS
NOT ATOMIC CONTINUE ON SQLEXCEPTION
END-EXEC.
7.3 The MERGE Statement: Eliminating the SELECT-Decide-Write Anti-Pattern
The Problem MERGE Solves
Every COBOL batch program I've maintained in 25 years has some version of this logic:
EXEC SQL
SELECT COUNT(*) INTO :WS-COUNT
FROM ACCOUNT_SUMMARY
WHERE ACCT_NUM = :WS-ACCT-NUM
END-EXEC.
IF WS-COUNT > 0
EXEC SQL
UPDATE ACCOUNT_SUMMARY
SET BALANCE = BALANCE + :WS-TXN-AMT,
LAST_TXN_DATE = :WS-TXN-DATE
WHERE ACCT_NUM = :WS-ACCT-NUM
END-EXEC
ELSE
EXEC SQL
INSERT INTO ACCOUNT_SUMMARY
(ACCT_NUM, BALANCE, LAST_TXN_DATE, OPEN_DATE)
VALUES (:WS-ACCT-NUM, :WS-TXN-AMT,
:WS-TXN-DATE, CURRENT DATE)
END-EXEC
END-IF.
Three SQL statements. Three thread switches. And a concurrency hole: between your SELECT COUNT(*) and your INSERT, another thread could insert the same ACCT_NUM, giving you a duplicate key error. You handle that with SQLCODE -803 and retry logic, adding complexity and more potential thread switches.
MERGE Syntax for COBOL
MERGE does this in one statement:
EXEC SQL
MERGE INTO ACCOUNT_SUMMARY AS TGT
USING (VALUES (:WS-ACCT-NUM,
:WS-TXN-AMT,
:WS-TXN-DATE))
AS SRC (ACCT_NUM, TXN_AMT, TXN_DATE)
ON TGT.ACCT_NUM = SRC.ACCT_NUM
WHEN MATCHED THEN
UPDATE SET
BALANCE = TGT.BALANCE + SRC.TXN_AMT,
LAST_TXN_DATE = SRC.TXN_DATE
WHEN NOT MATCHED THEN
INSERT (ACCT_NUM, BALANCE,
LAST_TXN_DATE, OPEN_DATE)
VALUES (SRC.ACCT_NUM, SRC.TXN_AMT,
SRC.TXN_DATE, CURRENT DATE)
END-EXEC.
One statement. One thread switch. No concurrency gap. DB2 handles the existence check, the locking, and the insert-or-update decision atomically.
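The same no-gap upsert idea can be prototyped outside DB2. SQLite has no MERGE statement, but its INSERT...ON CONFLICT DO UPDATE gives the analogous single-statement, race-free behavior (table and column names are illustrative):

```python
# Atomic upsert sketch: one statement decides insert-vs-update, with no
# separate existence check and no window for a duplicate-key race.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE account_summary
                (acct_num TEXT PRIMARY KEY, balance REAL)""")

def post(acct, amt):
    conn.execute("""
        INSERT INTO account_summary (acct_num, balance) VALUES (?, ?)
        ON CONFLICT (acct_num) DO UPDATE
        SET balance = balance + excluded.balance""", (acct, amt))

post("A001", 100.0)   # not matched: inserts the row
post("A001", 25.0)    # matched: updates in place
bal = conn.execute(
    "SELECT balance FROM account_summary WHERE acct_num = 'A001'"
).fetchone()[0]
print(bal)  # 125.0
```

`excluded.balance` refers to the incoming value, mirroring how the MERGE above references SRC columns in its WHEN MATCHED clause.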
MERGE with Multi-Row Source
The real power emerges when you combine MERGE with a multi-row source. Instead of merging one row at a time, you can merge an entire batch using a global temporary table or a common table expression as the source:
* Step 1: Multi-row INSERT into a global temp table
EXEC SQL
INSERT INTO SESSION.TXN_BATCH
(ACCT_NUM, TXN_AMT, TXN_DATE)
VALUES (:WS-BATCH-ACCT, :WS-BATCH-AMT,
:WS-BATCH-DATE)
FOR :WS-BATCH-COUNT ROWS
END-EXEC.
* Step 2: MERGE from temp table into target
EXEC SQL
MERGE INTO ACCOUNT_SUMMARY AS TGT
USING SESSION.TXN_BATCH AS SRC
ON TGT.ACCT_NUM = SRC.ACCT_NUM
WHEN MATCHED THEN
UPDATE SET
BALANCE = TGT.BALANCE + SRC.TXN_AMT,
LAST_TXN_DATE = SRC.TXN_DATE
WHEN NOT MATCHED THEN
INSERT (ACCT_NUM, BALANCE,
LAST_TXN_DATE, OPEN_DATE)
VALUES (SRC.ACCT_NUM, SRC.TXN_AMT,
SRC.TXN_DATE, CURRENT DATE)
END-EXEC.
* Step 3: Clean up temp table for next batch
EXEC SQL
DELETE FROM SESSION.TXN_BATCH
END-EXEC.
This pattern — load a temp table with multi-row INSERT, then MERGE from the temp table — gives you both the thread-switching benefit of multi-row operations and the atomic upsert semantics of MERGE.
🔗 Connection to Chapter 6: The optimizer treats MERGE as a combination of an index lookup (the ON clause) and either an UPDATE or INSERT. The access path for the ON clause follows the same rules you learned in Chapter 6. If ACCT_NUM has a unique index, the ON clause resolves via index lookup. If it doesn't, you're doing a tablespace scan for every source row — which is a disaster at scale. Always ensure the ON clause columns are indexed.
MERGE with Conditional Logic
MERGE supports additional filtering in the WHEN clauses, letting you apply different actions based on data conditions:
EXEC SQL
MERGE INTO ACCOUNT_SUMMARY AS TGT
USING (VALUES (:WS-ACCT-NUM,
:WS-TXN-AMT,
:WS-TXN-DATE,
:WS-TXN-TYPE))
AS SRC (ACCT_NUM, TXN_AMT,
TXN_DATE, TXN_TYPE)
ON TGT.ACCT_NUM = SRC.ACCT_NUM
WHEN MATCHED AND SRC.TXN_TYPE = 'CLO' THEN
UPDATE SET
BALANCE = 0,
ACCT_STATUS = 'C',
CLOSE_DATE = SRC.TXN_DATE
WHEN MATCHED AND SRC.TXN_TYPE <> 'CLO' THEN
UPDATE SET
BALANCE = TGT.BALANCE + SRC.TXN_AMT,
LAST_TXN_DATE = SRC.TXN_DATE
WHEN NOT MATCHED THEN
INSERT (ACCT_NUM, BALANCE,
LAST_TXN_DATE, OPEN_DATE,
ACCT_STATUS)
VALUES (SRC.ACCT_NUM, SRC.TXN_AMT,
SRC.TXN_DATE, CURRENT DATE, 'A')
END-EXEC.
The multiple WHEN MATCHED clauses let you branch the update logic based on the source data. DB2 evaluates them in order and executes the first one that matches. This replaces what would otherwise be an IF/ELSE structure in COBOL wrapped around two separate UPDATE statements — eliminating an entire thread switch and the associated lock acquisition overhead.
MERGE Gotchas for COBOL Programmers
SQLCODE after MERGE: A successful MERGE returns SQLCODE 0. SQLERRD(3) contains the total number of rows affected (updates + inserts combined). If you need to know how many were updates vs. inserts, you must use the DATA CHANGE TABLE REFERENCE extension or count them yourself through other means.
Duplicate source rows: If your source data contains two rows with the same ACCT_NUM, MERGE will fail with SQLCODE -788. The source of a MERGE must not contain duplicate join keys. If your input might have duplicates, aggregate them first (using a CTE or pre-processing in COBOL).
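The pre-aggregation fix is a one-statement GROUP BY over the batch before it reaches the merge. A sketch in SQLite with invented names:

```python
# Collapse duplicate source keys with GROUP BY so each key appears exactly
# once -- the condition MERGE requires (SQLCODE -788 otherwise).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE txn_batch (acct_num TEXT, txn_amt REAL)")
conn.executemany("INSERT INTO txn_batch VALUES (?, ?)",
                 [("A001", 10.0), ("A001", 5.0), ("A002", 7.5)])

# One row per key: SUM the amounts, then feed this result to the merge
rows = conn.execute("""
    SELECT acct_num, SUM(txn_amt)
    FROM txn_batch
    GROUP BY acct_num
    ORDER BY acct_num""").fetchall()
print(rows)  # [('A001', 15.0), ('A002', 7.5)]
```

In DB2 the same GROUP BY would live in a CTE or in the SELECT that loads the SESSION temp table.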
MERGE and triggers: If the target table has BEFORE/AFTER triggers for both INSERT and UPDATE, both sets of triggers are active during MERGE. If your INSERT trigger does something that conflicts with your UPDATE trigger, you'll get unexpected behavior. Review triggers before implementing MERGE against existing tables.
7.4 Temporal Tables: The Database Remembers Everything
Why Temporal Tables Matter for Mainframe Systems
Sandra at Federal Benefits puts it bluntly: "We got subpoenaed. They wanted to know exactly what a beneficiary's eligibility status was on a specific date eighteen months ago. We had audit trails, but reconstructing the state from a sequence of change records took two analysts three weeks. With temporal tables, it would have been a single SQL query."
Temporal tables maintain historical versions of rows automatically. When you UPDATE a row, DB2 doesn't just change it — it preserves the old version in a history table with timestamps marking when it was valid. When you DELETE a row, the deleted version is preserved. You can then query the table "as of" any point in time.
DB2 for z/OS supports three types of temporal tables:
System-Period Temporal Tables
The database manages the timestamps. Every row has two system-generated columns: SYS_START (when this version became current) and SYS_END (when this version stopped being current). The current row has SYS_END set to the maximum timestamp value ('9999-12-30-00.00.00.000000').
CREATE TABLE BENEFICIARY_STATUS
(
BENEFICIARY_ID CHAR(10) NOT NULL,
STATUS_CODE CHAR(2) NOT NULL,
BENEFIT_LEVEL CHAR(3) NOT NULL,
MONTHLY_AMOUNT DECIMAL(9,2) NOT NULL,
CASE_WORKER_ID CHAR(8) NOT NULL,
SYS_START TIMESTAMP(12) NOT NULL
GENERATED ALWAYS AS ROW BEGIN,
SYS_END TIMESTAMP(12) NOT NULL
GENERATED ALWAYS AS ROW END,
TXN_START_ID TIMESTAMP(12)
GENERATED ALWAYS AS TRANSACTION
START ID,
PERIOD SYSTEM_TIME (SYS_START, SYS_END)
)
IN DBFEDBEN.TSBENSTATUS;
CREATE TABLE BENEFICIARY_STATUS_HIST
(
BENEFICIARY_ID CHAR(10) NOT NULL,
STATUS_CODE CHAR(2) NOT NULL,
BENEFIT_LEVEL CHAR(3) NOT NULL,
MONTHLY_AMOUNT DECIMAL(9,2) NOT NULL,
CASE_WORKER_ID CHAR(8) NOT NULL,
SYS_START TIMESTAMP(12) NOT NULL,
SYS_END TIMESTAMP(12) NOT NULL,
TXN_START_ID TIMESTAMP(12)
)
IN DBFEDBEN.TSBENHIST;
ALTER TABLE BENEFICIARY_STATUS
ADD VERSIONING USE HISTORY TABLE
BENEFICIARY_STATUS_HIST;
Once versioning is active, every UPDATE and DELETE automatically archives the old row to BENEFICIARY_STATUS_HIST. Your COBOL program doesn't change at all — the standard UPDATE statement works exactly as before:
EXEC SQL
UPDATE BENEFICIARY_STATUS
SET STATUS_CODE = :WS-NEW-STATUS,
BENEFIT_LEVEL = :WS-NEW-LEVEL,
MONTHLY_AMOUNT = :WS-NEW-AMOUNT
WHERE BENEFICIARY_ID = :WS-BEN-ID
END-EXEC.
DB2 handles the archival transparently. The old row (with its previous STATUS_CODE, BENEFIT_LEVEL, and MONTHLY_AMOUNT) goes to the history table with SYS_END set to the current timestamp.
Querying Temporal Data
This is where the power lives. Sandra's subpoena response becomes:
SELECT BENEFICIARY_ID, STATUS_CODE, BENEFIT_LEVEL,
MONTHLY_AMOUNT, CASE_WORKER_ID,
SYS_START, SYS_END
FROM BENEFICIARY_STATUS
FOR SYSTEM_TIME AS OF '2025-09-15-15.47.00.000000'
WHERE BENEFICIARY_ID = '0047829156';
That single query returns the row as it existed at that exact timestamp — whether the current row or a historical version. No application code changes. No cursor loops through audit trails. No reconstruction logic.
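The semantics behind AS OF are simple enough to simulate by hand, which is a useful mental model: a version is "valid at ts" when sys_start <= ts < sys_end. A sketch in sqlite3 (which has no temporal tables; all names are illustrative, and this hand-rolled archival is what DB2 does for you automatically):

```python
# System-period versioning simulated: UPDATE archives the old row to a
# history table; AS OF is a union of base + history filtered on the period.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
 CREATE TABLE status      (ben_id TEXT, code TEXT, sys_start TEXT, sys_end TEXT);
 CREATE TABLE status_hist (ben_id TEXT, code TEXT, sys_start TEXT, sys_end TEXT);
""")
MAX_TS = "9999-12-30"

def update_status(ben_id, new_code, now):
    # Archive the current version with sys_end = now, then replace it
    conn.execute("""INSERT INTO status_hist
                    SELECT ben_id, code, sys_start, ? FROM status
                    WHERE ben_id = ?""", (now, ben_id))
    conn.execute("""UPDATE status SET code = ?, sys_start = ?
                    WHERE ben_id = ?""", (new_code, now, ben_id))

def as_of(ben_id, ts):
    # The version valid at ts: sys_start <= ts < sys_end
    return conn.execute("""
        SELECT code FROM (SELECT * FROM status
                          UNION ALL SELECT * FROM status_hist)
        WHERE ben_id = ? AND sys_start <= ? AND ? < sys_end""",
        (ben_id, ts, ts)).fetchone()[0]

conn.execute("INSERT INTO status VALUES ('B1', 'ACTIVE', '2025-01-01', ?)",
             (MAX_TS,))
update_status("B1", "SUSPENDED", "2025-06-01")
print(as_of("B1", "2025-03-15"))  # ACTIVE    (the archived version)
print(as_of("B1", "2025-07-01"))  # SUSPENDED (the current version)
```

Everything this sketch does by hand — the archival INSERT, the period columns, the union of base and history — is what ADD VERSIONING makes transparent to your COBOL program.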
The temporal query forms are:
| Syntax | Returns |
|---|---|
| FOR SYSTEM_TIME AS OF timestamp | The single version valid at that moment |
| FOR SYSTEM_TIME FROM ts1 TO ts2 | All versions overlapping the range (exclusive end) |
| FOR SYSTEM_TIME BETWEEN ts1 AND ts2 | All versions overlapping the range (inclusive end) |
Embedding Temporal Queries in COBOL
From COBOL, temporal queries work like any other SQL:
01 WS-QUERY-TSTAMP PIC X(26).
01 WS-BEN-ID PIC X(10).
01 WS-STATUS-CODE PIC X(2).
01 WS-BENEFIT-LVL PIC X(3).
01 WS-MONTHLY-AMT PIC S9(7)V99 COMP-3.
MOVE '2025-09-15-15.47.00.000000'
TO WS-QUERY-TSTAMP.
EXEC SQL
SELECT STATUS_CODE, BENEFIT_LEVEL,
MONTHLY_AMOUNT
INTO :WS-STATUS-CODE, :WS-BENEFIT-LVL,
:WS-MONTHLY-AMT
FROM BENEFICIARY_STATUS
FOR SYSTEM_TIME AS OF :WS-QUERY-TSTAMP
WHERE BENEFICIARY_ID = :WS-BEN-ID
END-EXEC.
⚠️ Critical Production Consideration: The history table grows without bound. DB2 doesn't automatically purge it. For a table with 10 million rows that averages two updates per row per month, the history table will accumulate 240 million rows per year. You need a retention policy and a batch job to purge history older than your regulatory requirement (typically 7 years for financial, 10 years for federal benefits). Sandra's team at Federal Benefits runs a monthly purge job with a 10-year retention window.
Application-Period Temporal Tables
Application-period temporal tables let your application define the validity period. Instead of system-managed timestamps ("when was this row current in the database"), you define business-meaningful periods ("when was this insurance policy active").
CREATE TABLE INSURANCE_POLICY
(
POLICY_ID CHAR(12) NOT NULL,
HOLDER_ID CHAR(10) NOT NULL,
COVERAGE_TYPE CHAR(3) NOT NULL,
PREMIUM_AMT DECIMAL(9,2) NOT NULL,
POLICY_START DATE NOT NULL,
POLICY_END DATE NOT NULL,
PERIOD BUSINESS_TIME (POLICY_START, POLICY_END)
);
Queries use FOR BUSINESS_TIME instead of FOR SYSTEM_TIME:
-- What policies were active on July 4th, 2025?
SELECT POLICY_ID, HOLDER_ID, COVERAGE_TYPE, PREMIUM_AMT
FROM INSURANCE_POLICY
FOR BUSINESS_TIME AS OF '2025-07-04'
WHERE HOLDER_ID = 'H00482910';
Temporal Query Patterns for COBOL Batch
In practice, temporal queries in COBOL batch programs follow several recurring patterns. Here are the three most common:
Pattern 1: Point-in-time state reconstruction. You need every row from a table as it existed at a specific moment — for example, all account balances at quarter-end close:
EXEC SQL
DECLARE CSR-QUARTER-SNAP CURSOR FOR
SELECT ACCT_NUM, BALANCE, STATUS_CODE
FROM ACCOUNT_BALANCE
FOR SYSTEM_TIME AS OF :WS-QUARTER-END-TS
ORDER BY ACCT_NUM
END-EXEC
Pattern 2: Change history extraction. You need all versions of a row that existed during a time range — for example, every change to a beneficiary's status during an audit period:
EXEC SQL
DECLARE CSR-CHANGES CURSOR FOR
SELECT STATUS_CODE, BENEFIT_LEVEL,
MONTHLY_AMOUNT,
SYS_START, SYS_END
FROM BENEFICIARY_STATUS
FOR SYSTEM_TIME FROM :WS-AUDIT-START
TO :WS-AUDIT-END
WHERE BENEFICIARY_ID = :WS-BEN-ID
ORDER BY SYS_START
END-EXEC
This returns every version of the row that overlapped the audit window, including versions that started before the window (if they were still current during it) and versions that ended during the window. Each row in the result set includes SYS_START and SYS_END, so you can reconstruct the complete timeline.
Pattern 3: Detecting changes between two points. You need to find all rows that were modified between two timestamps — the "delta extraction" pattern for feeding downstream systems:
SELECT CURR.BENEFICIARY_ID, CURR.STATUS_CODE,
PREV.STATUS_CODE AS PREV_STATUS
FROM BENEFICIARY_STATUS
FOR SYSTEM_TIME AS OF :WS-CURRENT-TS AS CURR
INNER JOIN BENEFICIARY_STATUS
FOR SYSTEM_TIME AS OF :WS-PREVIOUS-TS AS PREV
ON CURR.BENEFICIARY_ID = PREV.BENEFICIARY_ID
WHERE CURR.STATUS_CODE <> PREV.STATUS_CODE;
This self-join across two temporal points returns only the rows that changed between the two timestamps. Sandra at Federal Benefits uses this pattern to generate daily change feeds for three downstream regulatory systems.
Bi-Temporal Tables
Ahmad at Pinnacle Health needs both: "We need to know what the patient's coverage was (business time), but we also need to know when we recorded that information (system time) for compliance audits." Bi-temporal tables combine both period types on a single table:
CREATE TABLE CLAIMS_COVERAGE
(
CLAIM_ID CHAR(15) NOT NULL,
PATIENT_ID CHAR(10) NOT NULL,
COVERAGE_CODE CHAR(5) NOT NULL,
APPROVED_AMT DECIMAL(11,2) NOT NULL,
COV_START DATE NOT NULL,
COV_END DATE NOT NULL,
SYS_START TIMESTAMP(12) NOT NULL
GENERATED ALWAYS AS ROW BEGIN,
SYS_END TIMESTAMP(12) NOT NULL
GENERATED ALWAYS AS ROW END,
TXN_START_ID TIMESTAMP(12)
GENERATED ALWAYS AS TRANSACTION
START ID,
PERIOD BUSINESS_TIME (COV_START, COV_END),
PERIOD SYSTEM_TIME (SYS_START, SYS_END)
);
Bi-temporal queries can filter on both dimensions:
-- What coverage did we THINK this patient had for July 2025,
-- as recorded in our system on August 1st, 2025?
SELECT CLAIM_ID, COVERAGE_CODE, APPROVED_AMT
FROM CLAIMS_COVERAGE
FOR SYSTEM_TIME AS OF '2025-08-01-00.00.00.000000'
FOR BUSINESS_TIME AS OF '2025-07-15'
WHERE PATIENT_ID = 'P003928471';
This answers the question: "Based on what we knew on August 1st, what was this patient's coverage for July?" If the coverage was later corrected (say, on September 3rd), the bi-temporal query still returns the August 1st view — exactly what auditors need.
Performance Impact of Temporal Tables
Temporal tables add overhead to DML operations:
- INSERT: Minimal overhead. DB2 just populates the SYS_START column.
- UPDATE: Moderate overhead. DB2 must INSERT the old row into the history table before applying the UPDATE to the base table. This effectively doubles the I/O for updates.
- DELETE: Same as UPDATE. The deleted row is inserted into the history table.
For read-heavy workloads (which most mainframe OLTP systems are), the overhead is negligible. For update-heavy batch workloads, budget 40–60% additional elapsed time for the history table writes. Diane's team at Pinnacle Health saw a 52% increase in batch elapsed time when they first enabled temporal on the claims table, which they offset by converting the batch job to multi-row operations — netting a 30% overall improvement.
📊 Combined Impact at Pinnacle Health: Before temporal + multi-row migration, Pinnacle's claims batch ran 4.2 hours. After adding temporal tables (which increased it to 6.4 hours), then converting to multi-row FETCH/INSERT with rowset size 500, the job dropped to 2.9 hours. The temporal overhead was more than paid for by the multi-row efficiency gains.
7.5 Recursive CTEs: Hierarchies Without Recursive Program Calls
The Hierarchy Problem
Every enterprise has hierarchical data: organizational charts, account hierarchies (master accounts with sub-accounts), product category trees, regulatory reporting structures, geographic region hierarchies. The traditional COBOL approach is a cursor loop with a stack:
1. FETCH the root node.
2. Push it onto a WORKING-STORAGE stack.
3. FETCH all children of the current node.
4. For each child, push it and repeat.
5. Pop the stack when you exhaust children.
This generates one SELECT per node in the hierarchy. An org chart with 50,000 employees produces 50,000+ SQL calls. An account hierarchy 12 levels deep with 200,000 nodes generates 200,000 calls, plus complex stack-management logic that's a maintenance nightmare.
CTE Basics
A Common Table Expression (CTE) is a named temporary result set defined in a WITH clause:
WITH ACTIVE_ACCOUNTS AS (
SELECT ACCT_NUM, ACCT_NAME, ACCT_TYPE, BALANCE
FROM ACCOUNTS
WHERE STATUS = 'A'
)
SELECT ACCT_NUM, ACCT_NAME, BALANCE
FROM ACTIVE_ACCOUNTS
WHERE ACCT_TYPE = 'CHECKING'
AND BALANCE > 10000;
CTEs improve readability and allow you to reference the same subquery multiple times without repeating it. But the killer feature is recursion.
Recursive CTE Syntax
A recursive CTE has two parts:
1. The anchor member: A SELECT that retrieves the starting rows (the root nodes)
2. The recursive member: A SELECT that joins back to the CTE itself, retrieving the next level
WITH ACCT_HIERARCHY (ACCT_NUM, PARENT_ACCT, ACCT_NAME,
ACCT_LEVEL, HIERARCHY_PATH) AS (
-- Anchor: top-level accounts (no parent)
SELECT ACCT_NUM, PARENT_ACCT, ACCT_NAME,
1 AS ACCT_LEVEL,
CAST(ACCT_NUM AS VARCHAR(500)) AS HIERARCHY_PATH
FROM ACCOUNT_STRUCTURE
WHERE PARENT_ACCT IS NULL
UNION ALL
-- Recursive: children of current level
SELECT CHILD.ACCT_NUM, CHILD.PARENT_ACCT,
CHILD.ACCT_NAME,
PARENT.ACCT_LEVEL + 1,
PARENT.HIERARCHY_PATH CONCAT ' > '
CONCAT CHILD.ACCT_NUM
FROM ACCOUNT_STRUCTURE CHILD
INNER JOIN ACCT_HIERARCHY PARENT
ON CHILD.PARENT_ACCT = PARENT.ACCT_NUM
)
SELECT ACCT_NUM, PARENT_ACCT, ACCT_NAME,
ACCT_LEVEL, HIERARCHY_PATH
FROM ACCT_HIERARCHY
ORDER BY HIERARCHY_PATH;
DB2 executes this iteratively:
1. Run the anchor query — get all root accounts.
2. Run the recursive query using the anchor results as ACCT_HIERARCHY — get all level-2 accounts.
3. Run the recursive query using the level-2 results — get all level-3 accounts.
4. Repeat until the recursive query returns no rows.
The entire hierarchy is resolved in a single SQL statement. One thread switch. DB2 handles the iteration internally, using work files in the TEMP database.
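The anchor/recursive shape runs almost unchanged in SQLite, which also supports WITH RECURSIVE — a convenient way to prototype the hierarchy logic before embedding it in COBOL. Table name and data below are invented:

```python
# Recursive CTE prototype: anchor selects the roots, the recursive member
# joins each level to its children, accumulating level and path.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account_structure (acct_num TEXT, parent_acct TEXT)")
conn.executemany("INSERT INTO account_structure VALUES (?, ?)",
                 [("M01", None), ("S01", "M01"), ("S02", "M01"),
                  ("S01A", "S01")])

rows = conn.execute("""
    WITH RECURSIVE acct_hierarchy (acct_num, level, path) AS (
        SELECT acct_num, 1, acct_num          -- anchor: root accounts
        FROM account_structure
        WHERE parent_acct IS NULL
        UNION ALL
        SELECT c.acct_num, p.level + 1,       -- recursive: next level down
               p.path || ' > ' || c.acct_num
        FROM account_structure c
        JOIN acct_hierarchy p ON c.parent_acct = p.acct_num
    )
    SELECT acct_num, level, path FROM acct_hierarchy
    ORDER BY path""").fetchall()
for r in rows:
    print(r)
# ('M01', 1, 'M01')
# ('S01', 2, 'M01 > S01')
# ('S01A', 3, 'M01 > S01 > S01A')
# ('S02', 2, 'M01 > S02')
```

Ordering by the accumulated path prints the tree in depth-first order, which is usually what a report wants.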
Recursive CTEs in COBOL
From COBOL, you declare a cursor over the recursive CTE and fetch the results:
EXEC SQL
DECLARE CSR-ACCT-HIER CURSOR FOR
WITH ACCT_HIERARCHY
(ACCT_NUM, PARENT_ACCT, ACCT_NAME,
ACCT_LEVEL, HIERARCHY_PATH) AS (
SELECT ACCT_NUM, PARENT_ACCT, ACCT_NAME,
1, CAST(ACCT_NUM AS VARCHAR(500))
FROM ACCOUNT_STRUCTURE
WHERE PARENT_ACCT IS NULL
AND MASTER_ACCT = :WS-MASTER-ACCT
UNION ALL
SELECT C.ACCT_NUM, C.PARENT_ACCT,
C.ACCT_NAME,
P.ACCT_LEVEL + 1,
P.HIERARCHY_PATH CONCAT ' > '
CONCAT C.ACCT_NUM
FROM ACCOUNT_STRUCTURE C
INNER JOIN ACCT_HIERARCHY P
ON C.PARENT_ACCT = P.ACCT_NUM
)
SELECT ACCT_NUM, PARENT_ACCT, ACCT_NAME,
ACCT_LEVEL, HIERARCHY_PATH
FROM ACCT_HIERARCHY
ORDER BY HIERARCHY_PATH
END-EXEC.
You can combine this with multi-row FETCH for maximum efficiency. The cursor declaration is more complex, but the FETCH logic is identical to what you learned in Section 7.2.
Preventing Infinite Recursion
If your data has cycles (account A's parent is B, B's parent is C, C's parent is A), the recursive CTE will loop until DB2 runs out of resources. Two safeguards:
1. Maximum recursion depth:
Carry a level counter and cap it inside the recursive member itself, so DB2 stops iterating. (Filtering only in the final SELECT hides the extra rows but doesn't stop the recursion.)
-- In the recursive member, add:
WHERE P.ACCT_LEVEL < 20 -- stop at 20 levels deep
2. Cycle detection with HIERARCHY_PATH:
Check whether the current node already appears in the path:
-- In the recursive member, add:
AND LOCATE(C.ACCT_NUM, P.HIERARCHY_PATH) = 0
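Both safeguards can live in the recursive member at the same time. A sketch, reusing the ACCT_HIERARCHY example above (the 20-level cap is illustrative):

```sql
SELECT C.ACCT_NUM, C.PARENT_ACCT, C.ACCT_NAME,
       P.ACCT_LEVEL + 1,
       P.HIERARCHY_PATH CONCAT ' > '
           CONCAT C.ACCT_NUM
FROM ACCOUNT_STRUCTURE C
INNER JOIN ACCT_HIERARCHY P
   ON C.PARENT_ACCT = P.ACCT_NUM
WHERE P.ACCT_LEVEL < 20                          -- depth limit
  AND LOCATE(C.ACCT_NUM, P.HIERARCHY_PATH) = 0   -- cycle check
```

Because both predicates sit in the recursive member, DB2 stops iterating the moment either condition fails, rather than discarding rows after the work is already done.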
⚠️ Production Discipline: Always include a depth limit in recursive CTEs. Even if your data shouldn't have cycles, a single bad row from a data migration or a bug in the account setup program can create one. Without a depth limit, that one bad row will consume your TEMP database and potentially bring down the DB2 subsystem. Rob at CNB learned this the hard way during a test run — the runaway recursive CTE filled 50 GB of TEMP space in under two minutes before they could cancel it.
Approval Chain Example for HA Banking
For the HA Banking system, recursive CTEs solve the approval chain problem. When a high-value transaction needs multi-level approval, you need to traverse the organizational hierarchy from the initiator up to someone with sufficient authority:
WITH APPROVAL_CHAIN (EMP_ID, EMP_NAME, TITLE,
APPROVAL_LIMIT, MGR_ID,
CHAIN_LEVEL) AS (
-- Anchor: the employee who initiated the transaction
SELECT EMP_ID, EMP_NAME, TITLE, APPROVAL_LIMIT,
MGR_ID, 1
FROM EMPLOYEE_DIRECTORY
WHERE EMP_ID = :WS-INITIATOR-ID
UNION ALL
SELECT MGR.EMP_ID, MGR.EMP_NAME, MGR.TITLE,
MGR.APPROVAL_LIMIT, MGR.MGR_ID,
SUB.CHAIN_LEVEL + 1
FROM EMPLOYEE_DIRECTORY MGR
INNER JOIN APPROVAL_CHAIN SUB
ON MGR.EMP_ID = SUB.MGR_ID
WHERE SUB.APPROVAL_LIMIT < :WS-TXN-AMOUNT
AND SUB.CHAIN_LEVEL < 10
)
SELECT EMP_ID, EMP_NAME, TITLE, APPROVAL_LIMIT,
CHAIN_LEVEL
FROM APPROVAL_CHAIN
ORDER BY CHAIN_LEVEL;
This returns the complete approval chain, stopping when it reaches someone whose approval limit exceeds the transaction amount (or the CEO, whichever comes first).
7.6 OLAP Functions: Replacing Cursor Logic with SQL
The Running Balance Problem
Here's a batch routine that exists in some form at every bank:
Open cursor ordered by ACCT_NUM, TXN_DATE.
Set PREV-ACCT to spaces.
Set RUNNING-BAL to 0.
Loop:
FETCH next row.
IF ACCT_NUM not equal PREV-ACCT
MOVE 0 to RUNNING-BAL
MOVE ACCT_NUM to PREV-ACCT
END-IF.
ADD TXN-AMT TO RUNNING-BAL.
UPDATE the row with RUNNING-BAL.
This is a running total calculation. It requires the rows to be processed in exact order, it requires maintaining state across rows, and it requires updating each row individually. The entire batch window for CNB's running balance job was 2.1 hours — processing 500 million transactions sequentially.
OLAP functions (also called window functions or analytic functions) let DB2 do this calculation within the SQL itself.
Core OLAP Functions
ROW_NUMBER() — assigns a sequential integer to each row within a partition:
SELECT ACCT_NUM, TXN_DATE, TXN_AMT,
ROW_NUMBER() OVER (
PARTITION BY ACCT_NUM
ORDER BY TXN_DATE, TXN_SEQUENCE
) AS TXN_SEQ_NUM
FROM TRANSACTION_HISTORY
WHERE PROCESS_DATE = CURRENT DATE;
This replaces the COBOL pattern of maintaining a counter that resets at each account break. Every row gets a sequence number within its account, ordered by transaction date.
RANK() and DENSE_RANK() — like ROW_NUMBER but handle ties:
-- Find the top 5 transaction amounts per account
SELECT ACCT_NUM, TXN_AMT, TXN_DATE,
RANK() OVER (
PARTITION BY ACCT_NUM
ORDER BY TXN_AMT DESC
) AS AMT_RANK
FROM TRANSACTION_HISTORY
WHERE PROCESS_DATE = CURRENT DATE;
RANK() leaves gaps (1, 2, 2, 4). DENSE_RANK() doesn't (1, 2, 2, 3). Choose based on whether your business logic needs gap-free ranking.
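To see the two side by side, you can compute both over the same window. A sketch against the TRANSACTION_HISTORY table from the examples above:

```sql
SELECT ACCT_NUM, TXN_AMT,
       RANK() OVER (
           PARTITION BY ACCT_NUM
           ORDER BY TXN_AMT DESC
       ) AS AMT_RANK,            -- ties produce 1, 2, 2, 4
       DENSE_RANK() OVER (
           PARTITION BY ACCT_NUM
           ORDER BY TXN_AMT DESC
       ) AS AMT_DENSE_RANK       -- ties produce 1, 2, 2, 3
FROM TRANSACTION_HISTORY
WHERE PROCESS_DATE = CURRENT DATE;
```

Any account with tied transaction amounts will show the two columns diverging from the first tie onward.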
LAG() and LEAD() — access previous and next rows without a self-join:
SELECT ACCT_NUM, TXN_DATE, TXN_AMT,
LAG(TXN_AMT, 1) OVER (
PARTITION BY ACCT_NUM
ORDER BY TXN_DATE
) AS PREV_TXN_AMT,
TXN_AMT - LAG(TXN_AMT, 1) OVER (
PARTITION BY ACCT_NUM
ORDER BY TXN_DATE
) AS AMT_CHANGE
FROM TRANSACTION_HISTORY;
This replaces the classic COBOL pattern of saving the previous row's values in WORKING-STORAGE and comparing them to the current row. LAG(column, n) looks back n rows; LEAD(column, n) looks forward n rows.
Running totals with SUM() OVER:
SELECT ACCT_NUM, TXN_DATE, TXN_AMT,
SUM(TXN_AMT) OVER (
PARTITION BY ACCT_NUM
ORDER BY TXN_DATE, TXN_SEQUENCE
ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW
) AS RUNNING_BALANCE
FROM TRANSACTION_HISTORY
WHERE PROCESS_DATE = CURRENT DATE
ORDER BY ACCT_NUM, TXN_DATE, TXN_SEQUENCE;
That single SQL statement replaces the entire running balance batch program. DB2 partitions the data by account, orders within each partition by date and sequence, and computes the cumulative sum from the first row through the current row.
Window Frame Specifications
The ROWS BETWEEN and RANGE BETWEEN clauses define the window frame — which rows contribute to the aggregate:
| Frame | Meaning |
|---|---|
| ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW | All rows from start of partition to current row (running total) |
| ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING | Previous row, current row, next row (3-row moving average) |
| ROWS BETWEEN 29 PRECEDING AND CURRENT ROW | 30-row moving window |
| RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW | All rows with ORDER BY values <= current row's value (handles ties differently) |
The difference between ROWS and RANGE matters when you have duplicate ORDER BY values. ROWS counts physical rows; RANGE uses the logical value. For transaction processing where you have a unique sequence number in the ORDER BY, they're equivalent. For summary reporting where you might order by date (and multiple transactions share a date), RANGE groups all same-date transactions together.
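One way to make the distinction concrete is to compute both frames over the same date-ordered partition. A sketch, again using TRANSACTION_HISTORY: with RANGE, every transaction sharing a TXN_DATE shows the same running total; with ROWS, each physical row gets its own.

```sql
SELECT ACCT_NUM, TXN_DATE, TXN_AMT,
       SUM(TXN_AMT) OVER (
           PARTITION BY ACCT_NUM
           ORDER BY TXN_DATE
           ROWS BETWEEN UNBOUNDED PRECEDING
                    AND CURRENT ROW
       ) AS ROWS_RUNNING_TOTAL,    -- counts physical rows
       SUM(TXN_AMT) OVER (
           PARTITION BY ACCT_NUM
           ORDER BY TXN_DATE
           RANGE BETWEEN UNBOUNDED PRECEDING
                     AND CURRENT ROW
       ) AS RANGE_RUNNING_TOTAL    -- groups equal TXN_DATE values
FROM TRANSACTION_HISTORY;
```

On any day with multiple transactions for one account, the two columns differ: ROWS_RUNNING_TOTAL climbs row by row, while RANGE_RUNNING_TOTAL jumps to the same-date group total for all of them.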
OLAP Functions in COBOL
You use OLAP functions in cursor declarations and fetch the computed values alongside the base columns:
01 WS-RESULT-ARRAY.
05 WS-R-ACCT OCCURS 500 TIMES PIC X(12).
05 WS-R-TXN-DATE OCCURS 500 TIMES PIC X(10).
05 WS-R-TXN-AMT OCCURS 500 TIMES PIC S9(13)V99 COMP-3.
05 WS-R-RUN-BAL OCCURS 500 TIMES PIC S9(15)V99 COMP-3.
05 WS-R-PREV-AMT OCCURS 500 TIMES PIC S9(13)V99 COMP-3.
05 WS-R-TXN-RANK OCCURS 500 TIMES PIC S9(9) COMP.
EXEC SQL
DECLARE CSR-OLAP-TXN CURSOR
WITH ROWSET POSITIONING FOR
SELECT ACCT_NUM, TXN_DATE, TXN_AMT,
SUM(TXN_AMT) OVER (
PARTITION BY ACCT_NUM
ORDER BY TXN_DATE, TXN_SEQ
ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW
) AS RUNNING_BAL,
LAG(TXN_AMT, 1, 0) OVER (
PARTITION BY ACCT_NUM
ORDER BY TXN_DATE, TXN_SEQ
) AS PREV_TXN_AMT,
ROW_NUMBER() OVER (
PARTITION BY ACCT_NUM
ORDER BY TXN_AMT DESC
) AS TXN_RANK
FROM TRANSACTION_HISTORY
WHERE PROCESS_DATE = CURRENT DATE
ORDER BY ACCT_NUM, TXN_DATE, TXN_SEQ
END-EXEC.
Now your COBOL program fetches pre-computed running balances, previous transaction amounts, and rankings. The COBOL logic becomes: fetch rows, apply business rules, write results. All the state management and accumulation logic lives in SQL where DB2's optimizer can parallelize it.
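The fetch loop itself is the rowset pattern from Section 7.2. A sketch, assuming WS-ROWS-FETCHED, WS-IX, and the 2000-APPLY-BUSINESS-RULES paragraph are defined elsewhere (they are illustrative names, not from the original program):

```cobol
PERFORM UNTIL SQLCODE NOT = 0
    EXEC SQL
        FETCH NEXT ROWSET FROM CSR-OLAP-TXN
            FOR 500 ROWS
            INTO :WS-R-ACCT, :WS-R-TXN-DATE,
                 :WS-R-TXN-AMT, :WS-R-RUN-BAL,
                 :WS-R-PREV-AMT, :WS-R-TXN-RANK
    END-EXEC
    IF SQLCODE = 0 OR SQLCODE = +100
        *> SQLERRD(3) holds the rows actually returned,
        *> including the final partial rowset (+100)
        MOVE SQLERRD(3) TO WS-ROWS-FETCHED
        PERFORM VARYING WS-IX FROM 1 BY 1
                UNTIL WS-IX > WS-ROWS-FETCHED
            PERFORM 2000-APPLY-BUSINESS-RULES
        END-PERFORM
    END-IF
END-PERFORM.
```

A negative SQLCODE also ends the loop; production code would route it to the standard error handler before closing the cursor.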
Performance Implications
OLAP functions require DB2 to sort the data according to the PARTITION BY and ORDER BY clauses. If the data is already in the right order (because of a clustering index or a prior ORDER BY), the sort is free. If not, DB2 uses work files in TEMP for the sort.
For CNB's running balance job: - Old approach (cursor loop + COBOL accumulation): 2.1 hours - OLAP function approach (multi-row FETCH of pre-computed results): 38 minutes
The 70% reduction came from three sources: 1. Eliminated 500 million individual UPDATE statements for the running balance (now computed on the fly) 2. Multi-row FETCH reduced thread switches by 99.8% 3. DB2's internal sort was faster than the sequential cursor processing because DB2 could use parallelism
📊 Benchmark Note: OLAP functions with large partitions (millions of rows per partition) consume significant TEMP space. Kwame's team sized TEMP at 2x their largest partition. For the HA Banking system, ensure your TEMP database is sized for your largest account's transaction volume. The top 0.1% of CNB accounts have over 1 million transactions per day — those partitions drive the TEMP requirement.
Combining Multiple OLAP Functions
You can use multiple OLAP functions in the same SELECT, each with its own PARTITION BY and ORDER BY:
SELECT ACCT_NUM, BRANCH_CODE, TXN_DATE, TXN_AMT,
-- Running balance per account
SUM(TXN_AMT) OVER (
PARTITION BY ACCT_NUM
ORDER BY TXN_DATE
ROWS UNBOUNDED PRECEDING
) AS ACCT_RUNNING_BAL,
-- Running total per branch
SUM(TXN_AMT) OVER (
PARTITION BY BRANCH_CODE
ORDER BY TXN_DATE
ROWS UNBOUNDED PRECEDING
) AS BRANCH_RUNNING_TOTAL,
-- Rank within branch by amount
RANK() OVER (
PARTITION BY BRANCH_CODE
ORDER BY TXN_AMT DESC
) AS BRANCH_AMT_RANK,
-- Percent of branch total
TXN_AMT /
SUM(TXN_AMT) OVER (
PARTITION BY BRANCH_CODE
) * 100 AS PCT_OF_BRANCH
FROM TRANSACTION_HISTORY
WHERE PROCESS_DATE = CURRENT DATE;
Each OVER clause defines an independent window. DB2 may need multiple sort passes for different PARTITION BY clauses, but it's still vastly more efficient than implementing these calculations in COBOL with multiple cursor passes over the data.
FIRST_VALUE and LAST_VALUE — Accessing Boundary Rows
Two additional OLAP functions that replace common COBOL patterns:
SELECT ACCT_NUM, TXN_DATE, TXN_AMT,
FIRST_VALUE(TXN_AMT) OVER (
PARTITION BY ACCT_NUM
ORDER BY TXN_DATE
ROWS BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING
) AS FIRST_TXN_AMT,
LAST_VALUE(TXN_AMT) OVER (
PARTITION BY ACCT_NUM
ORDER BY TXN_DATE
ROWS BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING
) AS LAST_TXN_AMT
FROM TRANSACTION_HISTORY
WHERE PROCESS_DATE = CURRENT DATE;
FIRST_VALUE returns the value from the first row in the window. LAST_VALUE returns the value from the last row. These replace the classic COBOL pattern of saving the first and last values in WORKING-STORAGE variables during a break-level loop. Note that LAST_VALUE requires the window frame to extend to UNBOUNDED FOLLOWING — without it, the default frame (UNBOUNDED PRECEDING to CURRENT ROW) means LAST_VALUE always returns the current row's value, which is never what you want.
NTILE() — Distribution Buckets Without COBOL Math
NTILE divides a partition's rows into a specified number of roughly equal groups and assigns a bucket number to each row. This is the function that replaces the COBOL pattern of counting total rows, dividing by the number of buckets, and manually assigning group numbers in a second pass.
SELECT ACCT_NUM, TXN_DATE, TXN_AMT,
NTILE(10) OVER (
PARTITION BY BRANCH_CODE
ORDER BY TXN_AMT DESC
) AS DECILE
FROM TRANSACTION_HISTORY
WHERE PROCESS_DATE = CURRENT DATE;
This assigns every transaction within each branch to a decile based on amount. Decile 1 holds the top 10% of transactions by amount; decile 10 holds the bottom 10%. CNB's fraud detection team uses exactly this pattern to flag transactions in the top decile for manual review. Before NTILE, they ran a two-pass COBOL program — first pass counted rows per branch, second pass assigned deciles using a counter and a threshold array in WORKING-STORAGE. The two-pass approach required either reading the data twice (doubling I/O) or caching row counts in a branch array (limiting scalability to however many branches fit in the array). NTILE does it in a single SQL pass.
The function handles uneven division cleanly. If a branch has 103 transactions and you request NTILE(10), the first three buckets get 11 rows each and the remaining seven get 10 rows each. DB2 distributes the remainder across the lower-numbered buckets. Your COBOL program doesn't need to handle the modulo arithmetic.
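You can sanity-check the distribution with a count per bucket. A sketch for a single branch, using a nested table expression (the host variable WS-BRANCH-CODE is illustrative):

```sql
SELECT DECILE, COUNT(*) AS ROWS_IN_BUCKET
FROM (
    SELECT NTILE(10) OVER (
               ORDER BY TXN_AMT DESC
           ) AS DECILE
    FROM TRANSACTION_HISTORY
    WHERE BRANCH_CODE = :WS-BRANCH-CODE
      AND PROCESS_DATE = CURRENT DATE
) AS T
GROUP BY DECILE
ORDER BY DECILE;
```

For a branch with 103 qualifying rows, deciles 1 through 3 report 11 rows each and deciles 4 through 10 report 10.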
A common production use is percentile-based reporting for regulatory compliance. Federal Benefits uses NTILE(100) to assign percentile ranks to benefit payment amounts across program categories, feeding directly into their quarterly variance report. The entire calculation that previously required a COBOL sort, a percentile assignment program, and a merge step now lives in a single cursor declaration.
-- Quartile analysis for monthly account activity reporting
SELECT BRANCH_CODE, ACCT_NUM, MONTHLY_VOLUME,
NTILE(4) OVER (
PARTITION BY BRANCH_CODE
ORDER BY MONTHLY_VOLUME
) AS VOLUME_QUARTILE,
AVG(MONTHLY_VOLUME) OVER (
PARTITION BY BRANCH_CODE
) AS BRANCH_AVG
FROM ACCOUNT_ACTIVITY_SUMMARY
WHERE REPORT_MONTH = :WS-REPORT-MONTH;
Combining NTILE with other window functions in the same SELECT — as shown above with AVG() OVER — gives you multi-dimensional analysis in a single pass. Each row gets its quartile assignment and its branch average without any COBOL accumulation logic.
7.7 Choosing Set-Based vs. Cursor-Based: Decision Framework
Not every problem should be solved with set-based SQL. Here's the decision framework Lisa at CNB developed after their migration:
Use Set-Based (OLAP, MERGE, Multi-Row) When:
- The logic is data-dependent, not business-rule-dependent. Running totals, rankings, duplicate detection, gap analysis — these are data transformations that SQL handles natively.
- The entire result set fits through a single logical operation. If you can express the transformation as "for each row, compute X based on neighboring rows," that's an OLAP function.
- Error handling is at the batch level. If a failure means "roll back the entire batch and restart," set-based operations are natural — they're already atomic.
- The row count is large. The thread-switching savings from multi-row operations scale linearly with row count. For 1,000 rows, the savings are negligible. For 10 million rows, they're transformative.
Use Cursor-Based When:
- Each row requires complex conditional business logic. If processing a transaction requires calling an external service, checking a rules engine, or applying 47 different business rules based on transaction type, account type, and customer status — that logic lives in COBOL, not SQL.
- Error handling is at the row level. If a failure on row 347 means "log the error, skip the row, continue processing," you need cursor-based logic to maintain control over each row's fate.
- The processing has side effects. If processing a row triggers a CICS transaction, sends a message to MQ, writes to a VSAM file, or calls a web service — those operations can't be embedded in SQL.
- The data volume is small. For a lookup table with 500 rows, the overhead difference between single-row and multi-row is measured in milliseconds. Don't add complexity for no benefit.
The Hybrid Approach (What Production Actually Looks Like)
Real production programs use both. The HA Banking batch job will:
- Multi-row FETCH transactions from the queue (set-based).
- COBOL logic to validate each transaction, apply business rules, determine routing (cursor-based).
- Multi-row INSERT valid transactions to the posting table (set-based).
- MERGE account summary records (set-based).
- OLAP function query for the running balance report (set-based).
- Cursor loop for transactions that failed validation, each requiring individual error handling and notification (cursor-based).
The hybrid approach is not a compromise — it's the correct architecture. Use SQL's power for data operations. Use COBOL's power for business logic. The boundary between them is the boundary between "what the database does" and "what the application decides."
🔗 Connection to Chapter 1: Remember the DB2 subsystem architecture from Chapter 1 — the separation between the application address space and the DB2 address space. Every technique in this chapter reduces the number of times your program crosses that boundary. Multi-row operations cross it fewer times. OLAP functions push computation to the DB2 side so results cross once. Recursive CTEs resolve entire hierarchies without crossing at all. The architecture hasn't changed; your use of it has.
Migration Strategy
If you're converting existing cursor-loop programs to use these techniques, Lisa recommends this order:
- Multi-row FETCH first. The change is mechanical (add OCCURS, change FETCH syntax, add loop) and the performance gain is immediate and measurable. Low risk, high reward.
- Multi-row INSERT second. Same mechanical change on the output side. The accumulate-then-flush pattern is straightforward.
- MERGE third. Replacing SELECT-then-INSERT-or-UPDATE patterns with MERGE is moderately complex — you need to verify that the ON clause columns have proper indexes and that no trigger conflicts exist.
- OLAP functions fourth. Replacing COBOL accumulation logic with OLAP functions requires rethinking the program structure. The SQL is simpler, but the COBOL changes may be extensive if the program's control flow depends on the accumulation.
- Recursive CTEs last. These replace the most complex COBOL logic (stack-based hierarchy traversal) and require the most testing. But when they work, they eliminate entire paragraphs of code.
7.8 Spaced Review: Connecting Back
Chapter 1 Review: z/OS DB2 Subsystem
Every technique in this chapter operates within the subsystem architecture you learned in Chapter 1. Multi-row operations reduce cross-address-space calls. OLAP functions shift computation to the DB2 engine. Recursive CTEs resolve within DB2's work file processing. Understanding the cost of the application-to-DB2 boundary is what motivates all of these optimizations.
Quick check: In which address space does the OLAP function's sort operation execute — the application's or DB2's? (Answer: DB2's. The sort uses DB2's work files in the TEMP database, not your program's WORKING-STORAGE.)
Chapter 6 Review: The Optimizer
Every SQL construct in this chapter affects the access path the optimizer chooses:
- Multi-row FETCH doesn't change the access path — it changes the delivery mechanism.
- MERGE uses the index on the ON clause columns. No index means tablespace scan per source row.
- Recursive CTEs generate work file access paths that appear in EXPLAIN as "work file scan" operations.
- OLAP functions add sort operations to the access path if the data isn't already in the required order.
Quick check: You add ROW_NUMBER() OVER (PARTITION BY BRANCH_CODE ORDER BY TXN_DATE) to a query. The table has a clustering index on (ACCT_NUM, TXN_DATE). Will DB2 need a sort? (Answer: Yes. The clustering index is on ACCT_NUM, not BRANCH_CODE. DB2 must sort the data by BRANCH_CODE, TXN_DATE to evaluate the OLAP function.)
7.9 HA Banking Project Checkpoint
Objective
Implement multi-row operations and temporal tables for the HA Banking Transaction Processing System.
Requirements
- Multi-row transaction fetch: Modify the transaction processing program to use multi-row FETCH with a rowset size of 500. The program reads from HA_TXN_QUEUE and processes daily transactions.
- Multi-row audit insert: Implement the accumulate-then-flush pattern for writing to HA_AUDIT_TRAIL. Use atomic INSERT for the HA_POSTED_TXN table and non-atomic for audit.
- Temporal table for account balances: Create HA_ACCOUNT_BALANCE as a system-period temporal table with history table HA_ACCOUNT_BALANCE_HIST. Every balance change must be tracked.
- MERGE for account summaries: Replace the existing SELECT/INSERT/UPDATE pattern for HA_ACCOUNT_SUMMARY with a MERGE statement.
- Running balance calculation: Implement the daily running balance report using OLAP functions (SUM() OVER with ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW).
- Approval chain query: Write a recursive CTE that traverses HA_EMPLOYEE_HIERARCHY to determine the approval chain for transactions exceeding the initiator's authority limit.
Deliverables
- DDL for HA_ACCOUNT_BALANCE temporal table and its history table
- Modified COBOL source with multi-row FETCH and INSERT
- MERGE statement for HA_ACCOUNT_SUMMARY
- Recursive CTE for approval chain
- OLAP query for running balance report
- Test cases demonstrating temporal query ("as of" a specific timestamp)
Acceptance Criteria
- Multi-row FETCH uses indicator arrays for all nullable columns
- Final flush is performed after the main processing loop
- Temporal table DDL includes both SYS_START/SYS_END and TRANSACTION START ID
- Recursive CTE includes depth limit of 15 levels
- OLAP running balance query partitions by account and orders by date + sequence
- All SQLCODEs are checked after every SQL statement (no exceptions)
Architecture Note
The temporal table for account balances is not optional. The HA Banking system operates under OCC (Office of the Comptroller of the Currency) examination requirements. Examiners will ask: "What was this account's balance at the time this suspicious transaction was processed?" Without temporal tables, answering that question requires reconstructing balance history from the audit trail — a process that takes hours per account and is error-prone. With temporal tables, it's a single SQL query.
Chapter Summary
You now have five tools that didn't exist in your COBOL-DB2 toolkit this morning:
Multi-row FETCH and INSERT eliminate thread-switching overhead by processing blocks of rows instead of individual rows. The performance improvement scales linearly with volume — exactly the characteristic you want for batch programs processing millions of rows. Remember: check SQLERRD(3) for actual row count, always include indicator arrays, and never forget the final flush.
MERGE replaces the SELECT-then-decide-then-write pattern with a single atomic operation. It eliminates concurrency gaps, reduces thread switches by 67%, and simplifies your COBOL logic. Ensure the ON clause columns are indexed and watch for duplicate source rows.
Temporal tables let DB2 maintain historical versions automatically. System-period for tracking when data was current in the database. Application-period for business-meaningful validity periods. Bi-temporal for both. The history table needs a retention policy, and DML overhead increases 40–60% for update-heavy workloads — offset by combining with multi-row operations.
Recursive CTEs resolve hierarchies in a single SQL statement, replacing stack-based COBOL logic that generates one SQL call per node. Always include a depth limit. Always check for cycles. The performance improvement grows with hierarchy size.
OLAP functions push running totals, rankings, row comparisons, and moving aggregates into SQL. They replace the most complex cursor-processing patterns with declarative SQL that DB2 can optimize and parallelize. The PARTITION BY and ORDER BY clauses are the keys — get them right and the function does the rest.
The decision isn't "which technique do I use?" It's "which combination?" Production batch programs use multi-row FETCH to get data, COBOL logic to apply business rules, multi-row INSERT to write results, MERGE to maintain summary tables, OLAP functions for analytical calculations, and recursive CTEs for hierarchy traversal. The hybrid approach is the production approach.
Next chapter, we go deeper into DB2 locking and concurrency — because all these set-based operations change the locking profile of your programs in ways that will surprise you if you're not prepared.