Case Study 1: GlobalBank Transaction Routing Overhaul
Background
In January 2024, GlobalBank's transaction routing module — TXN-ROUTE-0100 — was responsible for 47% of all production incidents in the core banking system. The module, originally written in 1994 by a contractor who had since left the company, contained a single 280-line nested IF statement that determined how each incoming transaction was processed.
Maria Chen, GlobalBank's senior COBOL developer, had been advocating for a rewrite for two years. "Every time we add a new transaction type or a new regulatory requirement, someone has to modify that nested IF," she explained to the project steering committee. "And every time someone modifies it, there is a 30% chance they introduce a new bug, because no one fully understands the nesting anymore."
Derek Washington, who had joined the team six months earlier, was the catalyst for action. During his first week maintaining TXN-ROUTE-0100, he spent three full days tracing the nested IF structure on paper — using colored markers to track which ELSE matched which IF — before he felt confident enough to make a one-line change.
The Problem
The original code had the following characteristics:
- Nesting depth: 11 levels at maximum, 7 levels on average
- Raw value comparisons: 47 instances of comparing fields against literal values like 'A', 'C', 'W', etc.
- No EVALUATE statements: The entire routing decision was a single IF-ELSE tree
- No 88-level condition names: All conditions compared identifiers directly to literals
- Duplicate logic: The same validation checks appeared in 6 different branches
- Inconsistent error handling: Some branches logged errors, others silently fell through
A representative fragment (simplified from the original 280 lines):
           IF WS-TXN-CHANNEL = 'A'
               IF WS-TXN-TYPE = 'W'
                   IF WS-ACCT-TYPE = 'C' OR 'S'
                       IF WS-AMT <= 500
                           PERFORM 3100-ATM-SMALL-WD
                       ELSE
                           IF WS-AMT <= 1000
                               IF WS-CUST-LEVEL = 'P' OR 'G'
                                   PERFORM 3100-ATM-LARGE-WD
                               ELSE
                                   MOVE 'ATM LIMIT EXCEEDED'
                                       TO WS-ERR
                                   PERFORM 9100-ERROR
                               END-IF
                           ELSE
                               MOVE 'ATM LIMIT EXCEEDED'
                                   TO WS-ERR
                               PERFORM 9100-ERROR
                           END-IF
                       END-IF
                   ELSE
                       MOVE 'INVALID ACCT FOR ATM' TO WS-ERR
                       PERFORM 9100-ERROR
                   END-IF
               ELSE
                   IF WS-TXN-TYPE = 'D'
      *            ... continues for 200+ more lines
The Approach
Maria and Derek planned a three-phase refactoring:
Phase 1: Define 88-Level Condition Names
Before touching the PROCEDURE DIVISION, they created comprehensive 88-level definitions:
       01  WS-TXN-CHANNEL          PIC X(01).
           88  CHANNEL-ATM         VALUE 'A'.
           88  CHANNEL-BRANCH      VALUE 'B'.
           88  CHANNEL-MOBILE      VALUE 'M'.
           88  CHANNEL-ONLINE      VALUE 'O'.
           88  CHANNEL-BATCH       VALUE 'X'.
           88  CHANNEL-IS-SELF-SVC VALUE 'A' 'M' 'O'.
           88  CHANNEL-IS-STAFFED  VALUE 'B'.
           88  CHANNEL-IS-VALID    VALUE 'A' 'B' 'M' 'O' 'X'.
       01  WS-TXN-TYPE             PIC X(02).
           88  TXN-WITHDRAWAL      VALUE 'WD'.
           88  TXN-DEPOSIT         VALUE 'DP'.
           88  TXN-TRANSFER        VALUE 'TF'.
           88  TXN-PAYMENT         VALUE 'PY'.
           88  TXN-INQUIRY         VALUE 'IQ'.
           88  TXN-CHANGES-BAL     VALUE 'WD' 'DP' 'TF' 'PY'.
           88  TXN-IS-VALID        VALUE 'WD' 'DP' 'TF'
                                         'PY' 'IQ'.
This step alone took a full day — cataloging every value used in the original 280-line IF and assigning meaningful names. They discovered three values ('R', 'X', and '9') that appeared in the code but were not documented anywhere. After research, they determined that 'R' was a reversal transaction type added in 2008, 'X' was an internal batch channel code, and '9' was a test artifact that should never have reached production.
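To show what the condition names buy at the point of use, here is a minimal, self-contained sketch. It reuses the WS-TXN-CHANNEL definitions above; the program name DEMO88 and the DISPLAY messages are hypothetical and are not taken from GlobalBank's code.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DEMO88.
      * Hypothetical demo. Only the 88-level names and values
      * come from the case study.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-TXN-CHANNEL          PIC X(01).
           88  CHANNEL-ATM         VALUE 'A'.
           88  CHANNEL-BRANCH      VALUE 'B'.
           88  CHANNEL-MOBILE      VALUE 'M'.
           88  CHANNEL-ONLINE      VALUE 'O'.
           88  CHANNEL-BATCH       VALUE 'X'.
           88  CHANNEL-IS-SELF-SVC VALUE 'A' 'M' 'O'.
           88  CHANNEL-IS-VALID    VALUE 'A' 'B' 'M' 'O' 'X'.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * SET moves the condition's VALUE literal into the parent.
           SET CHANNEL-MOBILE TO TRUE
      * One named test replaces three literal comparisons.
           IF CHANNEL-IS-SELF-SVC
               DISPLAY 'SELF-SERVICE CHANNEL: ' WS-TXN-CHANNEL
           END-IF
      * A value outside the catalog (for example a stray '9')
      * satisfies none of the condition names.
           MOVE '9' TO WS-TXN-CHANNEL
           IF NOT CHANNEL-IS-VALID
               DISPLAY 'REJECTED CHANNEL: ' WS-TXN-CHANNEL
           END-IF
           STOP RUN.
```

Run as written, the sketch reports the mobile channel as self-service and then rejects the uncataloged value. Grouping conditions such as CHANNEL-IS-VALID also give the catalog a single place to change when a new code is introduced.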
Phase 2: Replace Nested IF with EVALUATE
The core routing logic was replaced with a two-subject EVALUATE:
      * Callers must use PERFORM 2000-ROUTE-TRANSACTION
      *     THRU 2000-ROUTE-EXIT so the GO TO exits stay in range.
       2000-ROUTE-TRANSACTION.
           PERFORM 2010-VALIDATE-INPUTS
           IF NOT WS-VALID-INPUT
               GO TO 2000-ROUTE-EXIT
           END-IF
           PERFORM 2020-CHECK-REGULATORY
           IF OFAC-BLOCKED
               PERFORM 9200-BLOCK-TXN
               GO TO 2000-ROUTE-EXIT
           END-IF
           EVALUATE TRUE ALSO TRUE
               WHEN TXN-INQUIRY ALSO ANY
                   PERFORM 3400-PROCESS-INQUIRY
               WHEN TXN-DEPOSIT ALSO CHANNEL-ATM
                   PERFORM 3210-ATM-DEPOSIT
               WHEN TXN-DEPOSIT ALSO CHANNEL-IS-STAFFED
                   PERFORM 3220-BRANCH-DEPOSIT
               WHEN TXN-DEPOSIT ALSO CHANNEL-BATCH
                   PERFORM 3230-BATCH-DEPOSIT
               WHEN TXN-DEPOSIT ALSO ANY
                   PERFORM 3200-DIGITAL-DEPOSIT
      * THRU is required: 2100 leaves via GO TO its exit paragraph.
               WHEN TXN-WITHDRAWAL ALSO ANY
                   PERFORM 2100-VALIDATE-AND-WITHDRAW
                      THRU 2100-WITHDRAW-EXIT
               WHEN TXN-TRANSFER ALSO ANY
                   PERFORM 2200-VALIDATE-AND-TRANSFER
               WHEN TXN-PAYMENT ALSO ANY
                   PERFORM 2300-VALIDATE-AND-PAY
               WHEN OTHER
                   MOVE 'UNHANDLED TXN/CHANNEL COMBO'
                       TO WS-ERROR-MSG
                   PERFORM 9100-LOG-ERROR
           END-EVALUATE
           PERFORM 2030-POST-TXN-PROCESSING
           .
       2000-ROUTE-EXIT.
           EXIT.
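The WHEN clauses of an EVALUATE are tested from top to bottom and the first match wins, which is why the specific deposit rows sit above the ANY row. The sketch below isolates that behavior in a runnable program. WS-ROUTE, the route labels, the program name DEMOEVAL, and the unlisted type code 'ZZ' are hypothetical stand-ins for the real paragraphs and data.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DEMOEVAL.
      * Hypothetical demo. WS-ROUTE and its labels are
      * illustrative, not GlobalBank's paragraphs.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-TXN-CHANNEL          PIC X(01).
           88  CHANNEL-ATM         VALUE 'A'.
           88  CHANNEL-IS-STAFFED  VALUE 'B'.
           88  CHANNEL-BATCH       VALUE 'X'.
       01  WS-TXN-TYPE             PIC X(02).
           88  TXN-DEPOSIT         VALUE 'DP'.
           88  TXN-INQUIRY         VALUE 'IQ'.
       01  WS-ROUTE                PIC X(16).
       PROCEDURE DIVISION.
       MAIN-PARA.
      * ATM deposit: matches the specific ATM row.
           MOVE 'DP' TO WS-TXN-TYPE
           MOVE 'A'  TO WS-TXN-CHANNEL
           PERFORM ROUTE-TXN
      * Mobile deposit: falls to the deposit catch-all row.
           MOVE 'M'  TO WS-TXN-CHANNEL
           PERFORM ROUTE-TXN
      * Unlisted type code: reaches WHEN OTHER.
           MOVE 'ZZ' TO WS-TXN-TYPE
           PERFORM ROUTE-TXN
           STOP RUN.
       ROUTE-TXN.
      * Rows are tested top to bottom; the first match wins.
           EVALUATE TRUE ALSO TRUE
               WHEN TXN-INQUIRY ALSO ANY
                   MOVE 'INQUIRY' TO WS-ROUTE
               WHEN TXN-DEPOSIT ALSO CHANNEL-ATM
                   MOVE 'ATM-DEPOSIT' TO WS-ROUTE
               WHEN TXN-DEPOSIT ALSO CHANNEL-IS-STAFFED
                   MOVE 'BRANCH-DEPOSIT' TO WS-ROUTE
               WHEN TXN-DEPOSIT ALSO CHANNEL-BATCH
                   MOVE 'BATCH-DEPOSIT' TO WS-ROUTE
      * Deposit catch-all; it must follow the specific rows.
               WHEN TXN-DEPOSIT ALSO ANY
                   MOVE 'DIGITAL-DEPOSIT' TO WS-ROUTE
               WHEN OTHER
                   MOVE 'UNHANDLED' TO WS-ROUTE
           END-EVALUATE
           DISPLAY WS-TXN-TYPE '/' WS-TXN-CHANNEL ' -> ' WS-ROUTE.
```

The three calls route to ATM-DEPOSIT, DIGITAL-DEPOSIT, and UNHANDLED in that order. Each WHEN line corresponds to one row of a decision table, with ANY playing the role of a "don't care" cell.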
Phase 3: Extract Validation into Separate Paragraphs
Validation logic that had been duplicated across branches was consolidated:
       2100-VALIDATE-AND-WITHDRAW.
           PERFORM 2110-CHECK-ACCT-WD-ELIGIBLE
           IF NOT WS-WD-ELIGIBLE
               GO TO 2100-WITHDRAW-EXIT
           END-IF
           PERFORM 2120-CHECK-BALANCE
           IF NOT WS-BALANCE-OK
               GO TO 2100-WITHDRAW-EXIT
           END-IF
           PERFORM 2130-CHECK-CHANNEL-LIMITS
           IF NOT WS-WITHIN-LIMITS
               EVALUATE TRUE
                   WHEN CUST-PREMIUM
                       PERFORM 2140-PREMIUM-OVERRIDE
                   WHEN OTHER
                       MOVE 'LIMIT EXCEEDED' TO WS-ERROR-MSG
                       PERFORM 9100-LOG-ERROR
                       GO TO 2100-WITHDRAW-EXIT
               END-EVALUATE
           END-IF
           PERFORM 3100-PROCESS-WITHDRAWAL
           .
       2100-WITHDRAW-EXIT.
           EXIT.
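A guard clause that leaves through GO TO only returns cleanly when the exit paragraph lies inside the performed range, so the caller has to name both paragraphs with PERFORM ... THRU. The following sketch shows the pattern end to end under simplifying assumptions: a single balance check stands in for the three validations, and the amounts, the WS-BALANCE-FLAG field, the messages, and the program name DEMOGRD are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DEMOGRD.
      * Hypothetical demo of a guard-clause exit. Amounts,
      * messages and WS-BALANCE-FLAG are illustrative.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-BALANCE              PIC 9(7)V99 VALUE 300.00.
       01  WS-AMT                  PIC 9(7)V99.
       01  WS-BALANCE-FLAG         PIC X(01).
           88  WS-BALANCE-OK       VALUE 'Y'.
           88  WS-BALANCE-SHORT    VALUE 'N'.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * THRU puts the exit paragraph inside the performed range,
      * so GO TO 2100-WITHDRAW-EXIT still returns to the caller.
           MOVE 100.00 TO WS-AMT
           PERFORM 2100-VALIDATE-AND-WITHDRAW
              THRU 2100-WITHDRAW-EXIT
           MOVE 900.00 TO WS-AMT
           PERFORM 2100-VALIDATE-AND-WITHDRAW
              THRU 2100-WITHDRAW-EXIT
           STOP RUN.
       2100-VALIDATE-AND-WITHDRAW.
           PERFORM 2120-CHECK-BALANCE
      * Guard clause: reject early and skip the happy path.
           IF NOT WS-BALANCE-OK
               DISPLAY 'REJECTED: INSUFFICIENT FUNDS'
               GO TO 2100-WITHDRAW-EXIT
           END-IF
           SUBTRACT WS-AMT FROM WS-BALANCE
           DISPLAY 'WITHDRAWAL PROCESSED'
           .
       2100-WITHDRAW-EXIT.
           EXIT.
       2120-CHECK-BALANCE.
           IF WS-AMT <= WS-BALANCE
               SET WS-BALANCE-OK TO TRUE
           ELSE
               SET WS-BALANCE-SHORT TO TRUE
           END-IF.
```

The first call processes the withdrawal; the second is rejected by the guard and returns through the exit paragraph. If the caller used a plain PERFORM 2100-VALIDATE-AND-WITHDRAW, the GO TO would jump outside the performed range and control would fall through into whatever paragraph follows the exit instead of returning.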
Results
The refactoring was completed in three weeks (including thorough regression testing) and deployed in the March 2024 release cycle.
Quantitative results (first 6 months post-deployment):
| Metric | Before | After | Change |
|---|---|---|---|
| Production incidents (monthly avg) | 4.2 | 1.7 | -60% |
| Mean time to diagnose issue | 3.1 hours | 0.8 hours | -74% |
| Lines of PROCEDURE DIVISION | 280 | 185 | -34% |
| Nesting depth (maximum) | 11 | 3 | -73% |
| New feature implementation time | 2-3 days | 4-8 hours | -75% |
| Code review time per change | 2 hours | 30 minutes | -75% |
Qualitative outcomes:
- Derek Washington could now modify the routing logic independently, without Maria's oversight
- Sarah Kim (business analyst at MedClaim, who consulted on the project) said the EVALUATE version "reads like the business rules document"
- Two new transaction types were added in the six months post-refactoring with zero incidents
- The refactored code served as a template for similar refactoring across six other modules
Lessons Learned
- 88-level condition names are the foundation. The team estimated that 40% of the readability improvement came from replacing raw value comparisons with named conditions.
- EVALUATE TRUE ALSO TRUE maps directly to business decision tables. When Sarah reviewed the refactored code against her requirements document, she could verify correctness by comparing the EVALUATE clauses to the rows and columns of her decision table.
- Guard clauses simplify the happy path. Because error conditions are checked first and exit early, the remaining code handles only valid scenarios, which sharply reduces nesting depth.
- Refactoring is not rewriting. Maria was careful to preserve the exact behavior of the original code, bugs and all. Known bugs were documented and fixed in a separate change request to maintain audit traceability.
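Preserving behavior exactly is easier to claim than to demonstrate. One common way to back the claim is a characterization harness that drives the old and new logic with every input combination and counts disagreements. The sketch below illustrates the idea only: both routing paragraphs are toy stand-ins rather than the real TXN-ROUTE-0100 logic, and the single-character codes, route labels, and program name DEMOEQV are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DEMOEQV.
      * Hypothetical harness. Both routing paragraphs are toy
      * stand-ins, not the real TXN-ROUTE-0100 logic.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-CHANNEL-LIST         PIC X(06) VALUE 'ABMOX9'.
       01  WS-TYPE-LIST            PIC X(03) VALUE 'WDI'.
       01  WS-TXN-CHANNEL          PIC X(01).
           88  CHANNEL-ATM         VALUE 'A'.
       01  WS-TXN-TYPE             PIC X(01).
           88  TXN-WITHDRAWAL      VALUE 'W'.
           88  TXN-DEPOSIT         VALUE 'D'.
       01  WS-OLD-ROUTE            PIC X(08).
       01  WS-NEW-ROUTE            PIC X(08).
       01  WS-I                    PIC 9(02).
       01  WS-J                    PIC 9(02).
       01  WS-MISMATCHES           PIC 9(04) VALUE 0.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Drive both versions with every channel/type combination.
           PERFORM VARYING WS-I FROM 1 BY 1 UNTIL WS-I > 6
               PERFORM VARYING WS-J FROM 1 BY 1 UNTIL WS-J > 3
                   MOVE WS-CHANNEL-LIST(WS-I:1) TO WS-TXN-CHANNEL
                   MOVE WS-TYPE-LIST(WS-J:1)    TO WS-TXN-TYPE
                   PERFORM OLD-ROUTE
                   PERFORM NEW-ROUTE
                   IF WS-OLD-ROUTE NOT = WS-NEW-ROUTE
                       ADD 1 TO WS-MISMATCHES
                       DISPLAY 'MISMATCH FOR ' WS-TXN-CHANNEL
                               '/' WS-TXN-TYPE
                   END-IF
               END-PERFORM
           END-PERFORM
           DISPLAY 'MISMATCHES: ' WS-MISMATCHES
           STOP RUN.
       OLD-ROUTE.
      * Legacy style: nested IF on raw literals.
           IF WS-TXN-CHANNEL = 'A'
               IF WS-TXN-TYPE = 'W'
                   MOVE 'ATM-WD' TO WS-OLD-ROUTE
               ELSE
                   IF WS-TXN-TYPE = 'D'
                       MOVE 'ATM-DEP' TO WS-OLD-ROUTE
                   ELSE
                       MOVE 'ERROR' TO WS-OLD-ROUTE
                   END-IF
               END-IF
           ELSE
               MOVE 'OTHER' TO WS-OLD-ROUTE
           END-IF.
       NEW-ROUTE.
      * Refactored style: the same decisions as table rows.
           EVALUATE TRUE ALSO TRUE
               WHEN CHANNEL-ATM ALSO TXN-WITHDRAWAL
                   MOVE 'ATM-WD' TO WS-NEW-ROUTE
               WHEN CHANNEL-ATM ALSO TXN-DEPOSIT
                   MOVE 'ATM-DEP' TO WS-NEW-ROUTE
               WHEN CHANNEL-ATM ALSO ANY
                   MOVE 'ERROR' TO WS-NEW-ROUTE
               WHEN OTHER
                   MOVE 'OTHER' TO WS-NEW-ROUTE
           END-EVALUATE.
```

With the toy logic shown, the two versions agree on all 18 combinations, so the mismatch count stays at zero. Including values outside the documented catalog (here '9' and 'I') matters because the legacy behavior for undocumented inputs is part of what has to be preserved.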
Discussion Questions
- Why did Maria insist on preserving known bugs during the refactoring rather than fixing them simultaneously? What risks would combining refactoring with bug fixes introduce?
- The team used GO TO for guard clause exits. Could the same structure be achieved without GO TO? What tradeoffs would be involved?
- The EVALUATE uses WHEN TXN-DEPOSIT ALSO CHANNEL-ATM before WHEN TXN-DEPOSIT ALSO ANY. Why is this ordering important? What would happen if the order were reversed?
- How would you approach testing this refactoring to ensure the new code behaves identically to the old code for all input combinations?