Case Study 1: GlobalBank Transaction Routing Overhaul

Background

In January 2024, GlobalBank's transaction routing module — TXN-ROUTE-0100 — was responsible for 47% of all production incidents in the core banking system. The module, originally written in 1994 by a contractor who had since left the company, contained a single 280-line nested IF statement that determined how each incoming transaction was processed.

Maria Chen, GlobalBank's senior COBOL developer, had been advocating for a rewrite for two years. "Every time we add a new transaction type or a new regulatory requirement, someone has to modify that nested IF," she explained to the project steering committee. "And every time someone modifies it, there is a 30% chance they introduce a new bug, because no one fully understands the nesting anymore."

Derek Washington, who had joined the team six months earlier, was the catalyst for action. During his first week maintaining TXN-ROUTE-0100, he spent three full days tracing the nested IF structure on paper — using colored markers to track which ELSE matched which IF — before he felt confident enough to make a one-line change.

The Problem

The original code had the following characteristics:

  • Nesting depth: 11 levels at maximum, 7 levels on average
  • Raw value comparisons: 47 instances of comparing fields against literal values like 'A', 'C', 'W', etc.
  • No EVALUATE statements: The entire routing decision was a single IF-ELSE tree
  • No 88-level condition names: All conditions compared identifiers directly to literals
  • Duplicate logic: The same validation checks appeared in 6 different branches
  • Inconsistent error handling: Some branches logged errors, others silently fell through

A representative fragment (simplified from the original 280 lines):

       IF WS-TXN-CHANNEL = 'A'
           IF WS-TXN-TYPE = 'W'
               IF WS-ACCT-TYPE = 'C' OR 'S'
                   IF WS-AMT <= 500
                       PERFORM 3100-ATM-SMALL-WD
                   ELSE
                       IF WS-AMT <= 1000
                           IF WS-CUST-LEVEL = 'P' OR 'G'
                               PERFORM 3100-ATM-LARGE-WD
                           ELSE
                               MOVE 'ATM LIMIT EXCEEDED'
                                   TO WS-ERR
                               PERFORM 9100-ERROR
                           END-IF
                       ELSE
                           MOVE 'ATM LIMIT EXCEEDED'
                               TO WS-ERR
                           PERFORM 9100-ERROR
                       END-IF
                   END-IF
               ELSE
                   MOVE 'INVALID ACCT FOR ATM' TO WS-ERR
                   PERFORM 9100-ERROR
               END-IF
           ELSE
               IF WS-TXN-TYPE = 'D'
      *        ... continues for 200+ more lines

The Approach

Maria and Derek planned a three-phase refactoring:

Phase 1: Define 88-Level Condition Names

Before touching the PROCEDURE DIVISION, they created comprehensive 88-level definitions:

       01  WS-TXN-CHANNEL           PIC X(01).
           88  CHANNEL-ATM           VALUE 'A'.
           88  CHANNEL-BRANCH        VALUE 'B'.
           88  CHANNEL-MOBILE        VALUE 'M'.
           88  CHANNEL-ONLINE        VALUE 'O'.
           88  CHANNEL-BATCH         VALUE 'X'.
           88  CHANNEL-IS-SELF-SVC   VALUE 'A' 'M' 'O'.
           88  CHANNEL-IS-STAFFED    VALUE 'B'.
           88  CHANNEL-IS-VALID      VALUE 'A' 'B' 'M' 'O' 'X'.

       01  WS-TXN-TYPE              PIC X(02).
           88  TXN-WITHDRAWAL        VALUE 'WD'.
           88  TXN-DEPOSIT           VALUE 'DP'.
           88  TXN-TRANSFER          VALUE 'TF'.
           88  TXN-PAYMENT           VALUE 'PY'.
           88  TXN-INQUIRY           VALUE 'IQ'.
           88  TXN-CHANGES-BAL       VALUE 'WD' 'DP' 'TF' 'PY'.
           88  TXN-IS-VALID          VALUE 'WD' 'DP' 'TF'
                                           'PY' 'IQ'.

This step alone took a full day — cataloging every value used in the original 280-line IF and assigning meaningful names. They discovered three values ('R', 'X', and '9') that appeared in the code but were not documented anywhere. After research, they determined that 'R' was a reversal transaction type added in 2008, 'X' was an internal batch channel code, and '9' was a test artifact that should never have reached production.
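
With the condition names in place, an input check no longer needs to list literals. The case study does not show 2010-VALIDATE-INPUTS, so the fragment below is only a sketch of how it might use them. The paragraph body, the WS-INPUT-STATUS field, and the error texts are assumptions. WS-VALID-INPUT, WS-ERROR-MSG, and 9100-LOG-ERROR are names that appear in the refactored routing code.

      *    Hypothetical status switch (WORKING-STORAGE).
       01  WS-INPUT-STATUS          PIC X(01).
           88  WS-VALID-INPUT        VALUE 'Y'.
           88  WS-INVALID-INPUT      VALUE 'N'.

      *    Hypothetical validation paragraph (PROCEDURE DIVISION).
       2010-VALIDATE-INPUTS.
      *    Assume valid, then reject any code outside the catalog.
      *    Only the first failing check is logged.
           SET WS-VALID-INPUT TO TRUE
           EVALUATE TRUE
               WHEN NOT CHANNEL-IS-VALID
                   SET WS-INVALID-INPUT TO TRUE
                   MOVE 'UNKNOWN CHANNEL CODE' TO WS-ERROR-MSG
                   PERFORM 9100-LOG-ERROR
               WHEN NOT TXN-IS-VALID
                   SET WS-INVALID-INPUT TO TRUE
                   MOVE 'UNKNOWN TXN TYPE' TO WS-ERROR-MSG
                   PERFORM 9100-LOG-ERROR
               WHEN OTHER
                   CONTINUE
           END-EVALUATE
           .

Under this sketch, a stray code such as the '9' test artifact is rejected at the entry point, because it belongs to neither CHANNEL-IS-VALID nor TXN-IS-VALID.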

Phase 2: Replace Nested IF with EVALUATE

The core routing logic was replaced with a two-subject EVALUATE, with transaction type as the first subject and channel as the second:

       2000-ROUTE-TRANSACTION.
           PERFORM 2010-VALIDATE-INPUTS
           IF NOT WS-VALID-INPUT
               GO TO 2000-ROUTE-EXIT
           END-IF

           PERFORM 2020-CHECK-REGULATORY

           IF OFAC-BLOCKED
               PERFORM 9200-BLOCK-TXN
               GO TO 2000-ROUTE-EXIT
           END-IF

           EVALUATE TRUE ALSO TRUE
               WHEN TXN-INQUIRY   ALSO ANY
                   PERFORM 3400-PROCESS-INQUIRY
               WHEN TXN-DEPOSIT   ALSO CHANNEL-ATM
                   PERFORM 3210-ATM-DEPOSIT
               WHEN TXN-DEPOSIT   ALSO CHANNEL-IS-STAFFED
                   PERFORM 3220-BRANCH-DEPOSIT
               WHEN TXN-DEPOSIT   ALSO CHANNEL-BATCH
                   PERFORM 3230-BATCH-DEPOSIT
               WHEN TXN-DEPOSIT   ALSO ANY
                   PERFORM 3200-DIGITAL-DEPOSIT
               WHEN TXN-WITHDRAWAL ALSO ANY
                   PERFORM 2100-VALIDATE-AND-WITHDRAW
                       THRU 2100-WITHDRAW-EXIT
               WHEN TXN-TRANSFER  ALSO ANY
                   PERFORM 2200-VALIDATE-AND-TRANSFER
               WHEN TXN-PAYMENT   ALSO ANY
                   PERFORM 2300-VALIDATE-AND-PAY
               WHEN OTHER
                   MOVE 'UNHANDLED TXN/CHANNEL COMBO'
                       TO WS-ERROR-MSG
                   PERFORM 9100-LOG-ERROR
           END-EVALUATE

           PERFORM 2030-POST-TXN-PROCESSING
           .
       2000-ROUTE-EXIT.
           EXIT.
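
Because the guard clauses jump to 2000-ROUTE-EXIT, the routing paragraph only returns correctly when it is performed as a range that includes the exit paragraph. The case study does not show the calling code. The sketch below assumes a hypothetical main loop: 1000-PROCESS-TRANSACTIONS, 1100-READ-NEXT-TXN, and the WS-EOF-FLAG switch are names invented for illustration.

      *    Hypothetical end-of-input switch (WORKING-STORAGE).
       01  WS-EOF-FLAG              PIC X(01) VALUE 'N'.
           88  END-OF-TXNS           VALUE 'Y'.

      *    Hypothetical driver paragraph (PROCEDURE DIVISION).
       1000-PROCESS-TRANSACTIONS.
           PERFORM 1100-READ-NEXT-TXN
           PERFORM UNTIL END-OF-TXNS
      *        THRU keeps 2000-ROUTE-EXIT inside the performed
      *        range, so GO TO 2000-ROUTE-EXIT returns here.
               PERFORM 2000-ROUTE-TRANSACTION
                   THRU 2000-ROUTE-EXIT
               PERFORM 1100-READ-NEXT-TXN
           END-PERFORM
           .

A plain PERFORM 2000-ROUTE-TRANSACTION would end the performed range at the routing paragraph's own period. A GO TO the exit paragraph would then leave that range, and control would fall through into whatever paragraph follows 2000-ROUTE-EXIT instead of returning to the caller.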

Phase 3: Extract Validation into Separate Paragraphs

Validation logic that had been duplicated across branches was consolidated:

       2100-VALIDATE-AND-WITHDRAW.
           PERFORM 2110-CHECK-ACCT-WD-ELIGIBLE
           IF NOT WS-WD-ELIGIBLE
               GO TO 2100-WITHDRAW-EXIT
           END-IF

           PERFORM 2120-CHECK-BALANCE
           IF NOT WS-BALANCE-OK
               GO TO 2100-WITHDRAW-EXIT
           END-IF

           PERFORM 2130-CHECK-CHANNEL-LIMITS
           IF NOT WS-WITHIN-LIMITS
               EVALUATE TRUE
                   WHEN CUST-PREMIUM
                       PERFORM 2140-PREMIUM-OVERRIDE
                   WHEN OTHER
                       MOVE 'LIMIT EXCEEDED' TO WS-ERROR-MSG
                       PERFORM 9100-LOG-ERROR
                       GO TO 2100-WITHDRAW-EXIT
               END-EVALUATE
           END-IF

           PERFORM 3100-PROCESS-WITHDRAWAL
           .
       2100-WITHDRAW-EXIT.
           EXIT.
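
Each extracted check follows the same pattern: it sets a status flag, and the calling paragraph tests that flag in a guard clause. The case study does not show the check paragraphs, so the fragment below is only a sketch of what 2130-CHECK-CHANNEL-LIMITS might look like. The WS-LIMIT-STATUS field and the WS-OVER-LIMIT name are assumptions. WS-AMT and the 500 threshold for ATM withdrawals are taken from the original nested IF fragment. Limits for other channels are not given in the source and are left out.

      *    Hypothetical status switch (WORKING-STORAGE).
       01  WS-LIMIT-STATUS          PIC X(01).
           88  WS-WITHIN-LIMITS      VALUE 'Y'.
           88  WS-OVER-LIMIT         VALUE 'N'.

      *    Hypothetical check paragraph (PROCEDURE DIVISION).
       2130-CHECK-CHANNEL-LIMITS.
      *    Start from "within limits" and flag only known breaches.
           SET WS-WITHIN-LIMITS TO TRUE
           EVALUATE TRUE
               WHEN CHANNEL-ATM AND WS-AMT > 500
                   SET WS-OVER-LIMIT TO TRUE
               WHEN OTHER
      *            Other channel limits are not in the case study.
                   CONTINUE
           END-EVALUATE
           .

Keeping the limit rule in one paragraph means a limit change touches a single place rather than every branch that used to repeat the comparison.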

Results

The refactoring was completed in three weeks (including thorough regression testing) and deployed in the March 2024 release cycle.

Quantitative results (first 6 months post-deployment):

Metric                               Before      After        Change
-----------------------------------  ----------  -----------  ------
Production incidents (monthly avg)   4.2         1.7          -60%
Mean time to diagnose issue          3.1 hours   0.8 hours    -74%
Lines of PROCEDURE DIVISION          280         185          -34%
Nesting depth (maximum)              11          3            -73%
New feature implementation time      2-3 days    4-8 hours    -75%
Code review time per change          2 hours     30 minutes   -75%

Qualitative outcomes:

  • Derek Washington could now modify the routing logic independently, without Maria's oversight
  • Sarah Kim (business analyst at MedClaim, who consulted on the project) said the EVALUATE version "reads like the business rules document"
  • Two new transaction types were added in the six months post-refactoring with zero incidents
  • The refactored code served as a template for similar refactoring across six other modules

Lessons Learned

  1. 88-level condition names are the foundation. The team estimated that 40% of the readability improvement came from replacing raw value comparisons with named conditions.

  2. EVALUATE TRUE ALSO TRUE maps directly to business decision tables. When Sarah reviewed the refactored code against her requirements document, she could verify correctness by comparing the EVALUATE clauses to the rows and columns of her decision table.

  3. Guard clauses simplify the happy path. By checking error conditions first and exiting, the remaining code handles only valid scenarios, which helped cut the maximum nesting depth from 11 levels to 3.

  4. Refactoring is not rewriting. Maria was careful to preserve the exact behavior of the original code, bugs and all. Known bugs were documented and fixed in a separate change request to maintain audit traceability.
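
Lesson 2 can be made concrete by writing the routing rules as a decision table. Sarah's actual table is not reproduced in the case study. The comment block below is reconstructed from the WHEN clauses of the Phase 2 EVALUATE, with one row per transaction type and one column per channel, and is only a sketch of documentation that could sit directly above the EVALUATE.

      *  ROUTING DECISION TABLE (TARGET PARAGRAPH NUMBER)
      *
      *  TXN TYPE     ATM    BRANCH  MOBILE  ONLINE  BATCH
      *  -----------  -----  ------  ------  ------  -----
      *  INQUIRY      3400   3400    3400    3400    3400
      *  DEPOSIT      3210   3220    3200    3200    3230
      *  WITHDRAWAL   2100   2100    2100    2100    2100
      *  TRANSFER     2200   2200    2200    2200    2200
      *  PAYMENT      2300   2300    2300    2300    2300
      *  (ANY OTHER)  9100 LOG-ERROR

Each row that repeats a single value corresponds to one WHEN ... ALSO ANY clause. Only the DEPOSIT row needs channel-specific clauses.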

Discussion Questions

  1. Why did Maria insist on preserving known bugs during the refactoring rather than fixing them simultaneously? What risks would combining refactoring with bug fixes introduce?

  2. The team used GO TO for guard clause exits. Could the same structure be achieved without GO TO? What tradeoffs would be involved?

  3. The EVALUATE uses WHEN TXN-DEPOSIT ALSO CHANNEL-ATM before WHEN TXN-DEPOSIT ALSO ANY. Why is this ordering important? What would happen if the order were reversed?

  4. How would you approach testing this refactoring to ensure the new code behaves identically to the old code for all input combinations?