Case Study 2: Debugging a Complex Batch Balancing Problem
Background
Cornerstone Community Bank runs a nightly batch cycle that calculates and posts accrued interest on approximately 340,000 savings accounts. The interest calculation program, INTCALC, has been in production for four years. Every morning, the accounting department receives a summary report showing the total interest accrued for the night, broken down by account type (regular savings, money market, and certificate of deposit). This total feeds into the general ledger as a debit to Interest Expense and a credit to Accrued Interest Payable.
On Wednesday morning, the chief accountant calls IT with a problem: the interest accrual total on the INTCALC report does not match the total computed independently by the bank's financial planning system. The discrepancy is $847.63. The financial planning system shows total accrued interest of $156,284.19, while INTCALC reports $155,436.56. This $847.63 difference is unacceptable for regulatory purposes -- the bank's books must balance to the penny.
The financial planning system uses a simple spreadsheet-style calculation: for each account, it multiplies the balance by the annual rate divided by 365. INTCALC uses what should be the same formula but applies it within a COBOL COMPUTE statement. The two systems process the same account data (extracted from the same master file at the same point in time). They should produce identical results.
Marcus Williams, a senior COBOL programmer, is assigned to find and fix the discrepancy. He has no abend to analyze, no error messages to read -- just two numbers that should be equal and are not.
Step 1: Verifying the Scope of the Problem
Marcus starts by determining whether the discrepancy is concentrated in a few accounts or spread across many. He modifies INTCALC to write a detail file containing each account's calculated interest, then compares it to the financial planning system's per-account detail using DFSORT ICETOOL:
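//COMPARE EXEC PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG DD SYSOUT=*
//INTCDET DD DSN=CSB.INTCALC.DETAIL,DISP=SHR
//FPSDET DD DSN=CSB.FINPLAN.DETAIL,DISP=SHR
//DIFFOUT DD DSN=CSB.COMPARE.DIFF,
// DISP=(NEW,CATLG,DELETE),
// SPACE=(CYL,(1,1),RLSE),
// DCB=(RECFM=FB,LRECL=80,BLKSIZE=0)
//TOOLIN DD *
* Join the two detail files on account number (columns 1-10)
  COPY JKFROM TO(DIFFOUT) USING(CMPR)
/*
//CMPRCNTL DD *
  JOINKEYS F1=INTCDET,FIELDS=(1,10,A)
  JOINKEYS F2=FPSDET,FIELDS=(1,10,A)
* Joined record: account (1-10), INTCALC amount (11-21), FPS amount (22-32)
  REFORMAT FIELDS=(F1:1,10,F1:21,11,F2:21,11)
* Keep only accounts whose two interest amounts differ
  INCLUDE COND=(11,11,ZD,NE,22,11,ZD)
  OUTREC BUILD=(1,32,80:X)
/*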
The comparison reveals that 12,847 accounts have different interest amounts between the two systems. Each individual difference is small -- ranging from $0.01 to $0.14 -- but they all go in the same direction: INTCALC consistently calculates slightly less interest than the financial planning system. This pattern suggests a systematic issue, not random data errors.
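The same kind of per-account comparison can be sketched off the mainframe. The Python fragment below is only an illustration, not the bank's tooling: the two dictionaries stand in for the INTCALC and financial planning detail files, and the account numbers and amounts are made-up sample values. It reports how many accounts differ, the range of the differences, and whether they all point the same way.

```python
from decimal import Decimal

def summarize_differences(intcalc, finplan):
    """Compare per-account interest from two systems.

    Both arguments map account number -> Decimal interest amount.
    Returns summary statistics for the accounts whose amounts differ.
    """
    diffs = {}
    for acct in intcalc.keys() & finplan.keys():
        delta = finplan[acct] - intcalc[acct]
        if delta != 0:
            diffs[acct] = delta
    values = list(diffs.values())
    if not values:
        return {"count": 0, "min": None, "max": None,
                "total": Decimal("0"), "one_direction": True}
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "total": sum(values, Decimal("0")),
        # True when every difference has the same sign (a systematic bias)
        "one_direction": all(v > 0 for v in values) or all(v < 0 for v in values),
    }

# Hypothetical sample data, for illustration only
intcalc = {"A001": Decimal("5.31"), "A002": Decimal("2.87"), "A003": Decimal("10.24")}
finplan = {"A001": Decimal("5.32"), "A002": Decimal("2.87"), "A003": Decimal("10.38")}

summary = summarize_differences(intcalc, finplan)
print(summary)
```

A one-directional set of small differences, as in this sample, is the signature of a systematic precision problem rather than bad input data.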
Step 2: Adding Strategic DISPLAY Statements
Marcus adds DISPLAY statements to the interest calculation paragraph to examine the intermediate values for a sample of accounts. He selects five accounts with known discrepancies:
WORKING-STORAGE SECTION.
01 WS-DEBUG-ACCOUNTS.
05 WS-DEBUG-ACCT-1 PIC X(10) VALUE '0001004523'.
05 WS-DEBUG-ACCT-2 PIC X(10) VALUE '0001078901'.
05 WS-DEBUG-ACCT-3 PIC X(10) VALUE '0001156789'.
05 WS-DEBUG-ACCT-4 PIC X(10) VALUE '0002003456'.
05 WS-DEBUG-ACCT-5 PIC X(10) VALUE '0002089012'.
01 WS-CALC-FIELDS.
05 WS-BALANCE PIC S9(11)V99.
05 WS-ANNUAL-RATE PIC 9V9(4).
05 WS-DAILY-RATE PIC 9V9(8).
05 WS-DAILY-INTEREST PIC S9(9)V99.
05 WS-ACCRUED-TOTAL PIC S9(13)V99.
01 WS-DEBUG-FLAG PIC X VALUE 'N'.
88 FL-DEBUG-THIS-ACCT VALUE 'Y'.
3000-CALCULATE-INTEREST.
*--- Check if this is a debug account
MOVE 'N' TO WS-DEBUG-FLAG
IF WS-ACCOUNT-NUM = WS-DEBUG-ACCT-1
OR WS-ACCOUNT-NUM = WS-DEBUG-ACCT-2
OR WS-ACCOUNT-NUM = WS-DEBUG-ACCT-3
OR WS-ACCOUNT-NUM = WS-DEBUG-ACCT-4
OR WS-ACCOUNT-NUM = WS-DEBUG-ACCT-5
MOVE 'Y' TO WS-DEBUG-FLAG
END-IF
MOVE MR-BALANCE TO WS-BALANCE
MOVE MR-ANNUAL-RATE TO WS-ANNUAL-RATE
IF FL-DEBUG-THIS-ACCT
DISPLAY '=== INTEREST CALC DEBUG ==='
DISPLAY 'ACCOUNT: ' WS-ACCOUNT-NUM
DISPLAY 'BALANCE: ' WS-BALANCE
DISPLAY 'ANN RATE: ' WS-ANNUAL-RATE
END-IF
*--- Step 1: Calculate daily rate
COMPUTE WS-DAILY-RATE =
WS-ANNUAL-RATE / 365
IF FL-DEBUG-THIS-ACCT
DISPLAY 'DAILY RATE (RATE/365): '
WS-DAILY-RATE
END-IF
*--- Step 2: Calculate daily interest
COMPUTE WS-DAILY-INTEREST ROUNDED =
WS-BALANCE * WS-DAILY-RATE
IF FL-DEBUG-THIS-ACCT
DISPLAY 'DAILY INTEREST (BAL*RATE): '
WS-DAILY-INTEREST
END-IF
*--- Step 3: Add to running total
ADD WS-DAILY-INTEREST TO WS-ACCRUED-TOTAL
ON SIZE ERROR
DISPLAY 'SIZE ERROR ON ACCRUAL TOTAL'
END-ADD
IF FL-DEBUG-THIS-ACCT
DISPLAY 'RUNNING TOTAL: '
WS-ACCRUED-TOTAL
DISPLAY '=== END DEBUG ==='
END-IF
.
Step 3: Analyzing the Debug Output
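Marcus runs the modified program. The debug output for account 0001004523 is shown below, with decimal points inserted for readability (DISPLAY does not print the implied decimal point of a V field):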
=== INTEREST CALC DEBUG ===
ACCOUNT: 0001004523
BALANCE: 00000045678.23
ANN RATE: 4.2500
DAILY RATE (RATE/365): 0.01164383
DAILY INTEREST (BAL*RATE): 0000000531.70
RUNNING TOTAL: 0000000000531.70
=== END DEBUG ===
The financial planning system shows the interest for this account as $5.32, not $5.31. A one-cent difference. Marcus pulls out his calculator and computes manually:
Balance: $45,678.23
Annual rate: 4.25%
Daily rate: 4.25 / 365 = 0.01164383561643835...
Daily interest: $45,678.23 * 0.0116438356... = $531.7088...
In dollars, rounded to cents: 531.7088 / 100 = 5.317088, or $5.32
Wait -- Marcus notices something. The DISPLAY shows the daily interest as $531.70, but it should be approximately $5.32 (not $531). He checks the PIC clauses:
05 WS-DAILY-INTEREST PIC S9(9)V99.
The DISPLAY output 0000000531.70 means the value stored is 531.70, not 5.3170. But the correct daily interest for a $45,678.23 balance at 4.25% should be about $5.32. Something is wrong with the calculation magnitude.
Marcus looks more carefully at the calculation and the field definitions:
05 WS-ANNUAL-RATE PIC 9V9(4).
The annual rate is stored as 4.2500 in a PIC 9V9(4) field. This means the value represents 4.2500 -- the rate as a percentage (4.25%). But the formula requires the rate as a decimal (0.0425). The program is dividing 4.25 by 365, getting 0.01164383, and then multiplying the balance by that value. The result is 100 times too large.
But the financial planning system also divides 4.25 by 365. If both systems use the rate as a percentage, both results are inflated by the same factor of 100, so the scaling cannot explain a difference between them. Marcus looks at the DISPLAY output again:
DAILY INTEREST (BAL*RATE): 0000000531.70
In a PIC S9(9)V99 field this is the value 531.70, which is what a percentage-form rate should produce: one day's share (1/365) of the annual interest, still scaled by 100 and still needing to be divided by 100. The magnitude is a false lead.
Marcus re-examines the full calculation flow and finds the actual issue is elsewhere. He looks at how the final daily interest amount is computed and stored:
*--- Step 2: Calculate daily interest
COMPUTE WS-DAILY-INTEREST ROUNDED =
WS-BALANCE * WS-DAILY-RATE
The value $531.708... is rounded to $531.70 (because WS-DAILY-INTEREST is PIC S9(9)V99 -- only two decimal places). But the correct value is $531.71 (since 0.708 rounds up). Why is ROUNDED producing $531.70 instead of $531.71?
Step 4: The Daily Rate Precision Problem
Marcus focuses on the intermediate daily rate:
DAILY RATE (RATE/365): 0.01164383
WS-DAILY-RATE is PIC 9V9(8), giving 8 decimal places. The computed daily rate is 0.01164383, but the true value is 0.01164383561643835616...
The rate has been truncated at 8 decimal places. The COMPUTE statement dividing the rate by 365 does not use ROUNDED:
COMPUTE WS-DAILY-RATE =
WS-ANNUAL-RATE / 365
Without ROUNDED, the result is truncated, not rounded. The true value 0.0116438356... is stored as 0.01164383 (truncated to 8 decimal places). The lost precision is 0.00000000561643...
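The difference between truncating and rounding the stored rate can be reproduced with Python's decimal module. This is a sketch for checking the arithmetic, not a model of the compiler: ROUND_DOWN stands in for a COMPUTE without ROUNDED, ROUND_HALF_UP for a COMPUTE with ROUNDED, and the eight-place quantum mirrors WS-DAILY-RATE.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP, getcontext

getcontext().prec = 28  # plenty of working digits for this illustration

annual_rate = Decimal("4.25")          # rate held as a percentage, as in WS-ANNUAL-RATE
exact = annual_rate / Decimal("365")   # 0.0116438356164...

eight_places = Decimal("0.00000001")
truncated = exact.quantize(eight_places, rounding=ROUND_DOWN)     # like COMPUTE without ROUNDED
rounded = exact.quantize(eight_places, rounding=ROUND_HALF_UP)    # like COMPUTE ... ROUNDED

print(truncated)          # 0.01164383
print(rounded)            # 0.01164384
print(exact - truncated)  # the precision discarded by truncation
```

Truncation always discards value, so every account's daily rate is slightly understated, never overstated.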
Marcus calculates the impact of this truncation:
Correct daily rate: 0.01164383561644
Truncated daily rate: 0.01164383000000
Difference: 0.00000000561644
Impact per account: $45,678.23 * 0.00000000561644 = $0.000256...
That is about 0.026 cents per account per day. Even if every one of the 12,847 affected accounts lost that much, the total would be only about $3.30, nowhere near the $847.63 discrepancy. Truncation of the daily rate accounts for only a small fraction of a cent per account, yet the observed discrepancies are $0.01 to $0.14 per account.
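The size of the rate-truncation effect is easy to check. The Python sketch below uses the figures from the calculation above; the assumption that every affected account carries this same balance is purely hypothetical and serves only to give a rough total.

```python
from decimal import Decimal

balance = Decimal("45678.23")
rate_loss = Decimal("0.00000000561644")  # daily-rate precision lost to truncation

# Loss for this one account for one day, in the units INTCALC displays
loss_per_day = balance * rate_loss
print(loss_per_day)                      # roughly 0.000256549

# Rough total if all affected accounts lost the same amount (hypothetical)
affected_accounts = 12847
rough_total = loss_per_day * affected_accounts
print(rough_total)                       # a little under 3.30
```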
Step 5: The Compound Truncation Effect
Marcus adds more precision to his debug output by introducing a higher-precision intermediate field:
01 WS-HIGH-PREC-FIELDS.
05 WS-HP-DAILY-RATE PIC 9V9(15).
05 WS-HP-INTEREST PIC S9(9)V9(6).
05 WS-RATE-DIFF PIC S9V9(15).
05 WS-INT-DIFF PIC S9(9)V9(6).
*--- High precision calculation for comparison
COMPUTE WS-HP-DAILY-RATE =
WS-ANNUAL-RATE / 365
COMPUTE WS-HP-INTEREST =
WS-BALANCE * WS-HP-DAILY-RATE
COMPUTE WS-RATE-DIFF =
WS-HP-DAILY-RATE - WS-DAILY-RATE
COMPUTE WS-INT-DIFF =
WS-HP-INTEREST - WS-DAILY-INTEREST
IF FL-DEBUG-THIS-ACCT
DISPLAY 'HIGH PREC RATE: '
WS-HP-DAILY-RATE
DISPLAY 'STANDARD RATE: '
WS-DAILY-RATE
DISPLAY 'RATE DIFFERENCE: '
WS-RATE-DIFF
DISPLAY 'HIGH PREC INTEREST: '
WS-HP-INTEREST
DISPLAY 'STANDARD INTEREST: '
WS-DAILY-INTEREST
DISPLAY 'INTEREST DIFF: '
WS-INT-DIFF
END-IF
The output reveals:
HIGH PREC RATE: 0.011643835616438
STANDARD RATE: 0.01164383
RATE DIFFERENCE: 0.000000005616438
HIGH PREC INTEREST: 0000000531.708856
STANDARD INTEREST: 0000000531.70
INTEREST DIFF: 0000000000.008856
The difference is $0.008856 for this single account on one day. But the program actually calculates interest for multiple days at once (the number of days since the last accrual). Marcus checks the code more carefully and finds the critical section:
2500-PROCESS-ACCOUNT.
MOVE MR-LAST-ACCRUAL-DATE TO WS-LAST-DATE
COMPUTE WS-DAYS-ELAPSED =
FUNCTION INTEGER-OF-DATE(WS-CURRENT-DATE-N)
- FUNCTION INTEGER-OF-DATE(WS-LAST-DATE-N)
IF WS-DAYS-ELAPSED > 0
PERFORM WS-DAYS-ELAPSED TIMES
PERFORM 3000-CALCULATE-INTEREST
END-PERFORM
END-IF
.
The program calculates interest by performing the daily calculation once for each elapsed day. For accounts that have not been accrued in several days (weekends, holidays), the same truncation loss is incurred once per day and adds up:
1 day: $0.008856 truncation loss
3 days: $0.026568 truncation loss (3 * $0.008856)
7 days: $0.061992 truncation loss (holiday week)
And critically, each day's interest is rounded to the nearest cent by COMPUTE ... ROUNDED before it is added to WS-ACCRUED-TOTAL (the ADD itself does no rounding). Each day's rounding error therefore accumulates separately. After 3 days, three separate roundings, each off by up to half a cent, can add up to a discrepancy of as much as 1.5 cents.
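The effect of rounding every day versus rounding once can be seen in isolation with a short Python sketch. It is an illustration under stated assumptions, not INTCALC's logic: the balance is hypothetical, the rate is a percentage divided by 36500 to give dollars, and the daily rate is kept at full precision so that only the rounding strategy differs.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")

def per_day_rounding(balance, annual_rate_pct, days):
    """Round each day's interest to the cent, then sum (the looped approach)."""
    daily = (balance * annual_rate_pct / Decimal("36500")).quantize(
        CENT, rounding=ROUND_HALF_UP)
    return daily * days

def single_rounding(balance, annual_rate_pct, days):
    """Compute the whole period at full precision and round once."""
    period = balance * annual_rate_pct * days / Decimal("36500")
    return period.quantize(CENT, rounding=ROUND_HALF_UP)

balance = Decimal("10000.00")   # hypothetical balance
rate = Decimal("4.25")          # percent
for days in (1, 3, 7):
    looped = per_day_rounding(balance, rate, days)
    once = single_rounding(balance, rate, days)
    print(days, looped, once, once - looped)
# 1 1.16 1.16 0.00
# 3 3.48 3.49 0.01
# 7 8.12 8.15 0.03
```

For this balance the daily amount rounds down, so the looped total falls one cent behind after three days and three cents behind after seven.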
Step 6: Confirming the Root Cause
Marcus now has the full picture. The discrepancy is caused by two interacting precision issues:
- Daily rate truncation: WS-DAILY-RATE (PIC 9V9(8)) truncates the daily rate, losing precision in the 9th and subsequent decimal places.
- Per-day rounding accumulation: The program calculates interest one day at a time and rounds to the nearest cent after each day. Each rounding operation can lose or gain up to half a cent. Over multiple days, these rounding differences accumulate.
The financial planning system, by contrast, calculates interest in a single operation: balance * rate / 365 * days, rounding only once at the end. This single-rounding approach preserves intermediate precision and rounds only the final result.
To confirm, Marcus modifies the program to calculate interest in a single operation and compares:
3000-CALCULATE-INTEREST-FIXED.
COMPUTE WS-DAILY-INTEREST ROUNDED =
WS-BALANCE * WS-ANNUAL-RATE
/ 365 * WS-DAYS-ELAPSED
.
With this change, the program produces a total of $156,284.19 -- matching the financial planning system exactly.
Step 7: The Fix
Marcus proposes two changes:
Change 1: Single-Operation Interest Calculation
Replace the per-day loop with a single COMPUTE that multiplies by the number of elapsed days:
2500-PROCESS-ACCOUNT.
MOVE MR-LAST-ACCRUAL-DATE TO WS-LAST-DATE
COMPUTE WS-DAYS-ELAPSED =
FUNCTION INTEGER-OF-DATE(WS-CURRENT-DATE-N)
- FUNCTION INTEGER-OF-DATE(WS-LAST-DATE-N)
IF WS-DAYS-ELAPSED > 0
PERFORM 3000-CALCULATE-INTEREST
END-IF
.
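3000-CALCULATE-INTEREST.
*--- WS-PERIOD-INTEREST is assumed to be defined as PIC S9(9)V99
*--- Multiply first and divide last so no digits are lost early
COMPUTE WS-PERIOD-INTEREST ROUNDED =
WS-BALANCE * WS-ANNUAL-RATE * WS-DAYS-ELAPSED
/ 36500
ADD WS-PERIOD-INTEREST TO WS-ACCRUED-TOTAL
ON SIZE ERROR
DISPLAY 'SIZE ERROR: ACCRUAL TOTAL'
DISPLAY 'ACCOUNT: ' WS-ACCOUNT-NUM
DISPLAY 'INTEREST: ' WS-PERIOD-INTEREST
END-ADD
.
Notice the division by 36500 instead of 365. Since the rate is stored as a percentage (e.g., 4.25), dividing by 36500 (365 * 100) handles both the percentage conversion and the per-day scaling in one step, with no intermediate rate field that could lose precision. The division is also placed last: the multiplications are exact, so the only loss of precision is the single rounding into WS-PERIOD-INTEREST. Dividing first would leave the quotient at whatever precision the compiler gives its intermediate result.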
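Where the division falls in the expression matters because any quotient held to a fixed number of decimal places discards digits before the remaining multiplications magnify the loss. The Python sketch below imitates that effect by explicitly truncating the quotient to eight decimal places. The eight-place limit is an assumption chosen to mirror WS-DAILY-RATE, not a statement of what any particular compiler does, and the balance and day count are hypothetical.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

CENT = Decimal("0.01")

def divide_first(balance, rate_pct, days, places=8):
    """Divide early, truncating the quotient to `places` decimals."""
    quantum = Decimal(1).scaleb(-places)
    daily_rate = (rate_pct / Decimal("36500")).quantize(quantum, rounding=ROUND_DOWN)
    return (balance * daily_rate * days).quantize(CENT, rounding=ROUND_HALF_UP)

def divide_last(balance, rate_pct, days):
    """Multiply everything first, divide once at the end."""
    return (balance * rate_pct * days / Decimal("36500")).quantize(
        CENT, rounding=ROUND_HALF_UP)

balance = Decimal("2500000.00")  # hypothetical large balance
rate = Decimal("4.25")           # percent
days = 30

early = divide_first(balance, rate, days)
late = divide_last(balance, rate, days)
print(early, late, late - early)  # 8732.25 8732.88 0.63
```

The larger the balance and the longer the period, the more the early truncation costs, which is why the division is best left until the end.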
Change 2: Increase Intermediate Precision
For cases where per-day calculation is required (such as accounts with daily balance changes), use higher-precision intermediate fields:
01 WS-HIGH-PREC-RATE PIC 9V9(15).
01 WS-HIGH-PREC-INT PIC S9(9)V9(6).
3000-CALCULATE-INTEREST-DAILY.
COMPUTE WS-HIGH-PREC-RATE =
WS-ANNUAL-RATE / 36500
COMPUTE WS-HIGH-PREC-INT =
WS-BALANCE * WS-HIGH-PREC-RATE
COMPUTE WS-DAILY-INTEREST ROUNDED =
WS-HIGH-PREC-INT
ADD WS-DAILY-INTEREST TO WS-ACCRUED-TOTAL
ON SIZE ERROR
DISPLAY 'SIZE ERROR: ACCRUAL TOTAL'
END-ADD
.
Step 8: Validation and Deployment
Marcus runs the fixed program against the full production dataset. The new total matches the financial planning system to the penny: $156,284.19. He creates a comparison report showing the per-account differences between the old and new calculations:
ACCOUNT       OLD INTEREST   NEW INTEREST   DIFFERENCE
0001004523          531.70         531.71        +0.01
0001078901          287.43         287.45        +0.02
0001156789        1,024.89       1,025.03        +0.14
...
TOTAL           155,436.56     156,284.19      +847.63
The accounting department reviews the comparison and approves the change. The fix is deployed the following Monday. A one-time adjustment entry of $847.63 is posted to true up the general ledger for the difference that accumulated over the past quarterly accrual period.
Lessons Learned
This case study illustrates debugging principles specific to financial COBOL programming:
Rounding errors are not rounding errors. The problem was not that ROUNDED was broken. ROUNDED worked exactly as designed -- it rounded each intermediate result to the nearest cent. The problem was that the program structure forced unnecessary intermediate rounding by calculating interest one day at a time. Each rounding operation is individually correct, but the cumulative effect of many small roundings produces a measurable discrepancy.
Precision demands precision. Financial calculations require explicit attention to the precision of every intermediate field. A PIC 9V9(8) daily rate field seems precise enough (eight decimal places), but 4.25 / 365 is a non-terminating decimal, so everything past the eighth place is discarded, and that truncation matters. The rule of thumb for financial COBOL: intermediate calculation fields should have at least 6 more decimal places than the final result requires.
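The one-sided nature of a truncated rate can be checked numerically. The sketch below is written in Python rather than COBOL, uses hypothetical balances, and models the eight-place and fifteen-place rate fields with explicit truncation. The rate is kept in percentage form, as in the original code, so the amounts are in the same scaled units as the debug output.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

CENT = Decimal("0.01")

def daily_interest(balance, rate_pct, rate_places):
    """Daily interest using a daily rate truncated to `rate_places` decimals."""
    quantum = Decimal(1).scaleb(-rate_places)
    daily_rate = (rate_pct / Decimal("365")).quantize(quantum, rounding=ROUND_DOWN)
    return (balance * daily_rate).quantize(CENT, rounding=ROUND_HALF_UP)

rate = Decimal("4.25")
# Hypothetical balances: 100 to 500,000 in steps of 100
balances = [Decimal(n) * 100 for n in range(1, 5001)]

lower = higher = 0
for bal in balances:
    low_prec = daily_interest(bal, rate, 8)
    high_prec = daily_interest(bal, rate, 15)
    if low_prec < high_prec:
        lower += 1
    elif low_prec > high_prec:
        higher += 1

print("balances where the 8-place result is lower: ", lower)
print("balances where the 8-place result is higher:", higher)  # always 0
```

Because the truncated rate can never exceed the more precise one, the lower-precision result is never higher. That matches the pattern seen in the per-account comparison, where every difference went in the same direction.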
DISPLAY debugging is an art of reduction. Marcus did not dump 340,000 accounts. He selected five accounts with known discrepancies and added DISPLAY statements that showed every intermediate value in the calculation chain. This surgical approach produced actionable information without drowning in output.
Compare two independent calculations. The most powerful debugging technique for numerical errors is to compute the same result two different ways and compare. By adding high-precision fields alongside the standard fields, Marcus could see exactly where precision was lost.
The formula matters as much as the code. The original code implemented the correct mathematical formula. The error was not in the formula but in how the formula was decomposed into computational steps. Calculating (balance * rate / 365) once and multiplying by days is mathematically equivalent to calculating (balance * rate / 365) separately for each day and summing -- but computationally, the single-step approach preserves precision because it rounds only once.
Financial systems must agree. When two systems calculate the same quantity independently, they must produce identical results. Discrepancies, no matter how small, indicate that the systems use different assumptions, precision, or rounding rules. These discrepancies must be investigated and resolved because regulators, auditors, and accountants expect the books to balance exactly.