Case Study 32.1: COBOL Batch Program — Monthly Interest Calculation
Background
Meridian National Bank runs a monthly interest capitalization process on the last business day of each month. Unlike the daily interest accrual program (INTCALC) that calculates daily interest and stores it in an ACCRUED_INTEREST column, the monthly capitalization program (INTCAP) takes the accumulated accrued interest and adds it to the account balance — effectively "paying" the interest to the customer.
This process affects approximately 2.3 million savings and money market accounts. The batch window allocation is 45 minutes. The program must also generate an interest payment record for each account (for tax reporting purposes — IRS Form 1099-INT) and update a monthly summary table used by the finance department.
The Challenge
The previous version of INTCAP was written 15 years ago with single-row FETCH, UPDATE, and INSERT. As the bank grew, the program began exceeding its batch window. Last month it ran for 62 minutes, delaying the downstream statement generation job and nearly causing a missed SLA with the bank's online banking vendor.
The batch operations team has asked you to redesign the program for better performance without changing the business logic.
Current Program Analysis
The existing program structure:
1. OPEN cursor for all accounts with ACCRUED_INTEREST > 0
2. FETCH one row
3. UPDATE ACCOUNTS — add accrued interest to balance, zero out accrued interest
4. INSERT into INTEREST_PAYMENTS — one record per account
5. ADD to monthly summary accumulators
6. COMMIT every 500 rows
7. Loop to step 2
Profiling shows:
- 2,300,000 FETCH calls
- 2,300,000 UPDATE calls
- 2,300,000 INSERT calls
- 4,600 COMMIT calls
- Total SQL calls: approximately 6,904,600
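These totals follow directly from the loop structure: three single-row SQL calls per account plus one COMMIT per 500 rows. A quick arithmetic check (Python, illustrative only):

```python
accounts = 2_300_000
commit_interval = 500            # COMMIT every 500 rows (step 6)

fetch_calls = accounts           # one single-row FETCH per account
update_calls = accounts          # one UPDATE per account
insert_calls = accounts          # one INSERT per account
commit_calls = accounts // commit_interval

total_calls = fetch_calls + update_calls + insert_calls + commit_calls
print(f"{commit_calls:,} COMMITs, {total_calls:,} SQL calls")
# → 4,600 COMMITs, 6,904,600 SQL calls
```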
The DB2 accounting trace shows that 68% of elapsed time is in SQL call overhead, not actual I/O or computation.
Solution Design
Strategy 1: Multi-Row FETCH with Batch INSERT
Convert the single-row operations to multi-row operations with a rowset size of 500:
```cobol
EXEC SQL
    DECLARE CSR-INTCAP CURSOR WITH HOLD
        WITH ROWSET POSITIONING FOR
    SELECT ACCOUNT_NUM, BALANCE,
           ACCRUED_INTEREST, ACCT_TYPE
      FROM MERIDIAN.ACCOUNTS
     WHERE ACCRUED_INTEREST > 0
       AND STATUS = 'ACTIVE'
       AND ACCT_TYPE IN ('SAV', 'MMA')
     ORDER BY ACCOUNT_NUM
END-EXEC.
```
Host variable arrays:
```cobol
01  HV-ACCT-ARR.
    05  HV-ACCT-NUM-A   PIC X(10)            OCCURS 500.
01  HV-BAL-ARR.
    05  HV-BALANCE-A    PIC S9(13)V99 COMP-3 OCCURS 500.
01  HV-ACCR-ARR.
    05  HV-ACCRUED-A    PIC S9(13)V99 COMP-3 OCCURS 500.
01  HV-TYPE-ARR.
    05  HV-ACCT-TYPE-A  PIC X(3)             OCCURS 500.
```
Multi-row FETCH:
```cobol
EXEC SQL
    FETCH NEXT ROWSET FROM CSR-INTCAP
        FOR 500 ROWS
        INTO :HV-ACCT-NUM-A,
             :HV-BALANCE-A,
             :HV-ACCRUED-A,
             :HV-ACCT-TYPE-A
END-EXEC.

MOVE SQLERRD(3) TO WS-ROWS-FETCHED.
```
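In spirit, rowset fetching is just chunked iteration. A language-neutral sketch in Python (the `fetch_rowsets` helper is hypothetical, not part of the program):

```python
from itertools import islice

def fetch_rowsets(rows, rowset_size=500):
    """Yield successive rowsets from an iterable, mirroring
    FETCH NEXT ROWSET ... FOR 500 ROWS."""
    it = iter(rows)
    # The final rowset may be short, just as SQLERRD(3) can report
    # fewer than 500 rows on the last FETCH.
    while rowset := list(islice(it, rowset_size)):
        yield rowset

sizes = [len(rs) for rs in fetch_rowsets(range(1234))]
print(sizes)   # → [500, 500, 234]
```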
Strategy 2: Searched UPDATE Instead of Positioned UPDATE
Rather than updating one row at a time with WHERE CURRENT OF, one alternative is a single set-based searched UPDATE that capitalizes every qualifying account at once:
```cobol
* Set-based alternative: one searched UPDATE covering all
* qualifying accounts in a single SQL call
EXEC SQL
    UPDATE MERIDIAN.ACCOUNTS
       SET BALANCE = BALANCE + ACCRUED_INTEREST,
           ACCRUED_INTEREST = 0,
           LAST_INT_PAID_DT = CURRENT DATE
     WHERE ACCOUNT_NUM IN
           (SELECT ACCOUNT_NUM
              FROM MERIDIAN.ACCOUNTS
             WHERE ACCRUED_INTEREST > 0
               AND STATUS = 'ACTIVE'
               AND ACCT_TYPE IN ('SAV', 'MMA'))
END-EXEC.
```
However, this approach lacks the row-by-row control needed for error handling and summary accumulation. A hybrid approach works better: process the fetched rowset in memory, then use multi-row INSERT for the interest payment records.
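The hybrid flow can be sketched outside COBOL. Below is a minimal Python/sqlite3 analogue, where sqlite3 stands in for DB2 and `executemany` plays the role of the multi-row INSERT; the table names follow the case study, but the data and column subset are invented:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE ACCOUNTS (ACCOUNT_NUM TEXT PRIMARY KEY, "
            "BALANCE REAL, ACCRUED_INTEREST REAL)")
con.execute("CREATE TABLE INTEREST_PAYMENTS "
            "(ACCOUNT_NUM TEXT, PAYMENT_AMT REAL)")
con.executemany("INSERT INTO ACCOUNTS VALUES (?, ?, ?)",
                [("A1", 100.00, 1.25), ("A2", 200.00, 0.50)])

# 1) fetch a rowset, 2) process it in memory row by row,
# 3) bulk-insert the payment records in one call
rowset = con.execute("SELECT ACCOUNT_NUM, ACCRUED_INTEREST FROM ACCOUNTS "
                     "WHERE ACCRUED_INTEREST > 0").fetchall()
payments = []
for acct, accrued in rowset:           # row-by-row control kept in memory
    con.execute("UPDATE ACCOUNTS SET BALANCE = BALANCE + ACCRUED_INTEREST, "
                "ACCRUED_INTEREST = 0 WHERE ACCOUNT_NUM = ?", (acct,))
    payments.append((acct, accrued))
con.executemany("INSERT INTO INTEREST_PAYMENTS VALUES (?, ?)", payments)
con.commit()

print(con.execute("SELECT BALANCE FROM ACCOUNTS "
                  "WHERE ACCOUNT_NUM = 'A1'").fetchone()[0])   # → 101.25
```

Note that the UPDATE stays row-by-row, as in the final design; only the payment-record INSERT is batched.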
Strategy 3: Multi-Row INSERT for Payment Records
```cobol
EXEC SQL
    INSERT INTO MERIDIAN.INTEREST_PAYMENTS
           (ACCOUNT_NUM, PAYMENT_DATE, PAYMENT_AMT,
            ACCT_TYPE, TAX_YEAR)
       FOR :WS-INSERT-COUNT ROWS
    VALUES (:HV-PAY-ACCT-A, :HV-PAY-DATE-A,
            :HV-PAY-AMT-A, :HV-PAY-TYPE-A,
            :HV-PAY-YEAR-A)
END-EXEC.
```
Performance Results
| Metric | Before (Single-Row) | After (Multi-Row) | Improvement |
|---|---|---|---|
| FETCH calls | 2,300,000 | 4,600 | 99.8% reduction |
| UPDATE calls | 2,300,000 | 2,300,000 (still row-by-row) | No change |
| INSERT calls | 2,300,000 | 4,600 | 99.8% reduction |
| Total SQL calls | 6,904,600 | 2,313,800 | 66.5% reduction |
| Elapsed time | 62 minutes | 23 minutes | 62.9% reduction |
| CPU time | 18.4 minutes | 8.1 minutes | 56.0% reduction |
The UPDATE remains row-by-row because each account needs individual error handling, and a searched UPDATE keyed on ACCOUNT_NUM ensures we are modifying the correct row (the ORDER BY makes the cursor read-only in DB2, so WHERE CURRENT OF is not available here in any case). Even so, the overall improvement is dramatic.
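As a sanity check, the percentages in the table can be recomputed from the raw call counts (a quick Python check; COMMIT calls, assumed to stay at one per 500 rows, are included in both totals, which puts the overall reduction at 66.5%):

```python
before = {"FETCH": 2_300_000, "UPDATE": 2_300_000,
          "INSERT": 2_300_000, "COMMIT": 4_600}
after  = {"FETCH": 4_600, "UPDATE": 2_300_000,
          "INSERT": 4_600, "COMMIT": 4_600}

pct = lambda b, a: round(100 * (b - a) / b, 1)   # percent reduction

print(pct(before["FETCH"], after["FETCH"]))            # → 99.8
print(pct(sum(before.values()), sum(after.values())))  # → 66.5
print(pct(62, 23))                                     # → 62.9 (elapsed)
print(pct(18.4, 8.1))                                  # → 56.0 (CPU)
```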
Lessons Learned
- SQL call overhead dominates in high-volume batch. The actual I/O and computation per row is small; reducing the number of SQL calls is the highest-leverage optimization.
- Multi-row FETCH is the single biggest improvement for read-heavy loops. A rowset size of 500 reduces FETCH calls by 99.8%.
- Multi-row INSERT is equally powerful for programs that generate output rows.
- The optimal rowset size is not always "as large as possible." Testing showed that rowset sizes above 500 provided diminishing returns for this workload. The sweet spot depends on row width, buffer pool size, and available WORKING-STORAGE.
- WITH HOLD and periodic COMMIT remain essential. Even with multi-row operations, committing every N rowsets prevents lock escalation and keeps the batch window predictable.
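The commit cadence can be sketched in a few lines (Python; the callables and the every-10-rowsets interval are illustrative assumptions, not values from the case study):

```python
def process_in_rowsets(rowsets, apply_rowset, commit, commit_every=10):
    """Apply each rowset, issuing a COMMIT every `commit_every` rowsets."""
    for i, rowset in enumerate(rowsets, start=1):
        apply_rowset(rowset)
        if i % commit_every == 0:
            commit()
    commit()   # commit the tail (may be redundant, but harmless)

commits = []
process_in_rowsets([[0]] * 25, lambda rs: None, lambda: commits.append(1))
print(len(commits))   # → 3 (after rowsets 10, 20, and the tail)
```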
Discussion Questions
- Why was the UPDATE left as a row-by-row operation instead of converting to a set-based approach?
- What would be the risk of using ISOLATION(UR) for this program?
- How would you handle the scenario where the multi-row INSERT partially fails (some rows insert, others violate constraints)?
- If this program runs concurrently with online banking, what ISOLATION level should the BIND use?