Case Study 1: GlobalBank — Refactoring ACCT-MAINT for Better Structure
Background
ACCT-MAINT was GlobalBank's oldest continuously running COBOL program, originally written in 1989 by a contractor named Harold Reeves. Over its 35-year life, it had been modified by at least 40 different programmers. Harold's original 800-line program had grown to 4,200 lines through accretion — each new requirement adding code but never reorganizing the whole.
Maria Chen described the program to the project steering committee: "ACCT-MAINT is a geological formation. You can read the layers: 1989 code at the bottom, 1990s patches in the middle, 2000s additions on top, and a layer of 2010s hotfixes scattered through everything. No one has looked at the program as a whole since Harold left in 1993."
The Catalyst
In October 2023, a regulatory change required GlobalBank to add a new account status code: 'R' for "Restricted." This should have been a one-day change. It took two weeks and three production rollbacks.
The problem: ACCT-MAINT had 47 places where account status was checked using raw value comparisons (IF A-STAT = 'A', IF A-STAT = 'C' OR 'S'). Derek Washington, assigned to add the 'R' status, found and updated 44 of them. He missed three. The first rollback was caused by a missed check that allowed restricted accounts to be modified. The second rollback was caused by a different missed check in the batch reporting section. The third was caused by a regression introduced when fixing the second issue.
Derek's frustrated post-incident report concluded: "This program is unmaintainable in its current form. Any change touching account status requires modifying dozens of locations. 88-level condition names would have reduced this to changing a single data definition."
Assessment
Maria and Derek spent a week analyzing ACCT-MAINT before proposing the refactoring. Their findings:
Structural Issues
| Issue | Count | Risk Level |
|---|---|---|
| Paragraphs with no hierarchical numbering | 127 | Medium |
| Paragraphs with non-descriptive names | 89 | High |
| GO TO statements (total) | 78 | High |
| GO TO jumping backward (loop simulation) | 12 | Critical |
| GO TO jumping forward across >100 lines | 23 | High |
| Paragraphs exceeding 100 lines | 7 | High |
| Maximum nesting depth | 9 levels | Critical |
| Period-terminated IF statements (no END-IF) | 340+ | Medium |
Data Issues
| Issue | Count |
|---|---|
| 88-level condition names defined | 0 |
| Raw value comparisons for account status | 47 |
| Single-character variable names | 31 |
| Undocumented "magic numbers" | 19 |
Flow Issues
The most alarming discovery was the backward GO TO loop at the heart of the program:
```cobol
       PROCESS-IT.
           READ ACCT-FILE AT END MOVE 'Y' TO EOF-SW.
           IF EOF-SW = 'Y' GO TO WRAP-UP.
      *    ... 200 lines of processing ...
           GO TO PROCESS-IT.
```
This was not a PERFORM loop — it was a GO TO loop that simulated iteration. The "200 lines of processing" contained conditional GO TOs that jumped to error handling and back. Tracing the actual execution path through this tangle required drawing a flowchart by hand.
The Refactoring Plan
Maria proposed a phased approach:
Phase 1: Data Layer (Week 1)
Before touching any PROCEDURE DIVISION code, define all 88-level condition names:
```cobol
       01  WS-ACCOUNT-STATUS          PIC X(01).
           88  ACCT-ACTIVE            VALUE 'A'.
           88  ACCT-CLOSED            VALUE 'C'.
           88  ACCT-SUSPENDED         VALUE 'S'.
           88  ACCT-RESTRICTED        VALUE 'R'.
           88  ACCT-PENDING           VALUE 'P'.
           88  ACCT-PROCESSABLE       VALUE 'A' 'P'.
           88  ACCT-MODIFIABLE        VALUE 'A' 'R'.
           88  ACCT-STATUS-VALID      VALUE 'A' 'C' 'S' 'R' 'P'.
```
Replace all 47 raw value comparisons with condition name references. This was a behavior-preserving change — no logic changed, only the syntax.
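A minimal before-and-after sketch of one such replacement, assuming the status field is the `WS-ACCOUNT-STATUS` item defined above. The program name `STATDEMO` and the DISPLAY messages are illustrative and not taken from ACCT-MAINT:

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. STATDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-ACCOUNT-STATUS          PIC X(01).
           88  ACCT-CLOSED            VALUE 'C'.
           88  ACCT-SUSPENDED         VALUE 'S'.
       PROCEDURE DIVISION.
       0000-MAIN.
           MOVE 'S' TO WS-ACCOUNT-STATUS
      * Before: raw literals repeated at every check.
           IF WS-ACCOUNT-STATUS = 'C' OR 'S'
               DISPLAY 'RAW CHECK:  ACCOUNT NOT ACTIVE'
           END-IF
      * After: the same test through condition names.
           IF ACCT-CLOSED OR ACCT-SUSPENDED
               DISPLAY 'NAME CHECK: ACCOUNT NOT ACTIVE'
           END-IF
           STOP RUN.
```

Both tests take the same branch for every status value, which is why the change preserves behavior. The difference is that the literals now live in one place, the data definition.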
Phase 2: Eliminate Backward GO TO (Week 1)
Replace the GO TO loop with a proper PERFORM UNTIL:
```cobol
       1300-PROCESS-ALL-ACCOUNTS.
           PERFORM 2100-READ-ACCOUNT
           PERFORM 2000-PROCESS-ONE-ACCOUNT
               UNTIL END-OF-FILE
                  OR WS-RECORDS-READ > WS-MAX-RECORDS
           .
```
This was the highest-risk change — converting the fundamental control flow — and required the most thorough testing.
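The loop above depends on pieces the case study does not show: the `END-OF-FILE` condition name and the read paragraph that sets it. The following is a sketch of how they might fit together, not the team's actual code. The file name `ACCTFILE`, the `LINE SEQUENTIAL` organization, the switch `WS-EOF-SWITCH`, the field sizes, and the record limit are all assumptions, and `2000-PROCESS-ONE-ACCOUNT` is reduced to its read-ahead:

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LOOPDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * OPTIONAL lets the sketch run even if the file is absent.
           SELECT OPTIONAL ACCT-FILE ASSIGN TO 'ACCTFILE'
               ORGANIZATION IS LINE SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  ACCT-FILE.
       01  ACCT-RECORD                PIC X(80).
       WORKING-STORAGE SECTION.
       01  WS-EOF-SWITCH              PIC X(01) VALUE 'N'.
           88  END-OF-FILE            VALUE 'Y'.
       01  WS-RECORDS-READ            PIC 9(09) VALUE ZERO.
      * Assumed safety limit; the real value is not given.
       01  WS-MAX-RECORDS             PIC 9(09) VALUE 1000000.
       PROCEDURE DIVISION.
       1000-MAIN.
           OPEN INPUT ACCT-FILE
           PERFORM 1300-PROCESS-ALL-ACCOUNTS
           CLOSE ACCT-FILE
           DISPLAY 'RECORDS READ: ' WS-RECORDS-READ
           STOP RUN.
       1300-PROCESS-ALL-ACCOUNTS.
      * Priming read, then test-before loop.
           PERFORM 2100-READ-ACCOUNT
           PERFORM 2000-PROCESS-ONE-ACCOUNT
               UNTIL END-OF-FILE
                  OR WS-RECORDS-READ > WS-MAX-RECORDS
           .
       2000-PROCESS-ONE-ACCOUNT.
      * Real processing is omitted; only the read-ahead is shown.
           PERFORM 2100-READ-ACCOUNT
           .
       2100-READ-ACCOUNT.
           READ ACCT-FILE
               AT END
                   SET END-OF-FILE TO TRUE
               NOT AT END
                   ADD 1 TO WS-RECORDS-READ
           END-READ
           .
```

Because `PERFORM ... UNTIL` tests its condition before each iteration, the priming read ensures that an empty file skips processing entirely, matching the original `IF EOF-SW = 'Y' GO TO WRAP-UP` check.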
Phase 3: Extract Paragraphs (Week 2)
Break the 200-line processing block into named paragraphs:
```cobol
       2000-PROCESS-ONE-ACCOUNT.
           PERFORM 2100-VALIDATE-ACTION-CODE
           IF ACTION-CODE-VALID
               EVALUATE TRUE
                   WHEN ACTION-IS-ADD
                       PERFORM 3100-ADD-ACCOUNT
                   WHEN ACTION-IS-CHANGE
                       PERFORM 3200-CHANGE-ACCOUNT
                   WHEN ACTION-IS-DELETE
                       PERFORM 3300-DELETE-ACCOUNT
               END-EVALUATE
           ELSE
               PERFORM 9100-LOG-INVALID-ACTION
           END-IF
           PERFORM 2600-UPDATE-COUNTERS
           PERFORM 2100-READ-ACCOUNT
           .
```
Phase 4: Convert to END-IF (Week 2)
Replace all period-terminated IF statements with END-IF. This was tedious but straightforward, and it eliminated the "dangling else" risk entirely.
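To illustrate the "dangling else" risk, here is a small sketch contrasting the two styles. The program name, the `WS-BALANCE` field, and the messages are hypothetical and chosen only to make the pairing of ELSE visible:

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ENDIFDEM.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-ACCOUNT-STATUS          PIC X(01) VALUE 'A'.
           88  ACCT-ACTIVE            VALUE 'A'.
       01  WS-BALANCE                 PIC S9(07)V99 VALUE 100.00.
       PROCEDURE DIVISION.
       0000-MAIN.
      * Legacy style: ELSE binds to the nearest IF (the balance
      * test), not to the outer IF that the indentation suggests.
           IF ACCT-ACTIVE
               IF WS-BALANCE < 0
                   DISPLAY 'OLD: ACTIVE AND OVERDRAWN'
           ELSE
               DISPLAY 'OLD: NOT ACTIVE'.
      * Structured style: each END-IF closes exactly one IF.
           IF ACCT-ACTIVE
               IF WS-BALANCE < 0
                   DISPLAY 'NEW: ACTIVE AND OVERDRAWN'
               END-IF
           ELSE
               DISPLAY 'NEW: NOT ACTIVE'
           END-IF
           STOP RUN.
```

With an active account and a positive balance, the legacy form displays `OLD: NOT ACTIVE`, while the structured form displays nothing. The compiler accepts both, so only the explicit scope terminators protect the reader from the misleading indentation.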
Phase 5: Replace Forward GO TOs (Week 2)
The remaining forward GO TOs were converted to structured alternatives — EVALUATE, condition flags, or (in 8 cases where the team agreed it was clearest) PERFORM THRU with GO TO to an exit paragraph.
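The retained pattern might look like the sketch below. Only the paragraph name `3200-CHANGE-ACCOUNT` comes from the case study; the exit paragraph name `3200-EXIT`, the validation checks, and the messages are assumptions made for illustration:

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. THRUDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-ACCOUNT-STATUS          PIC X(01) VALUE 'C'.
           88  ACCT-CLOSED            VALUE 'C'.
       01  WS-NEW-BALANCE             PIC S9(07)V99 VALUE -50.00.
       PROCEDURE DIVISION.
       0000-MAIN.
           PERFORM 3200-CHANGE-ACCOUNT THRU 3200-EXIT
           STOP RUN.
       3200-CHANGE-ACCOUNT.
      * Guard clauses: leave early through the single exit point.
           IF ACCT-CLOSED
               DISPLAY 'REJECTED: ACCOUNT IS CLOSED'
               GO TO 3200-EXIT
           END-IF
           IF WS-NEW-BALANCE < 0
               DISPLAY 'REJECTED: NEGATIVE BALANCE'
               GO TO 3200-EXIT
           END-IF
           DISPLAY 'CHANGE APPLIED'
           .
       3200-EXIT.
           EXIT.
```

Here the closed status triggers the first guard, so only `REJECTED: ACCOUNT IS CLOSED` is displayed. Each GO TO is forward, short, and stays inside the PERFORM THRU range, which is what keeps this form traceable where the original jumps were not.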
Testing Strategy
The team created a comprehensive regression test suite before making any changes:
- Baseline capture: Ran the original program against a full month of production data and captured all output files, return codes, and statistics.
- Incremental comparison: After each phase, ran the same test data and compared outputs byte-for-byte with the baseline.
- Edge case testing: Created specific test cases for every account status code, every action code, and every error condition they could identify.
- Path coverage: Used the IBM Debug Tool to verify that all paragraphs in the refactored code were executed at least once.
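The case study does not say how the comparison was performed, and a platform compare utility may well have been used. As a rough illustration of the idea in COBOL, the sketch below compares two output files record by record. The file names `BASELINE` and `REFACTOR`, the 132-byte record length, the `LINE SEQUENTIAL` organization, and the return code of 8 are all assumptions:

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CMPDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT OPTIONAL BASE-FILE ASSIGN TO 'BASELINE'
               ORGANIZATION IS LINE SEQUENTIAL.
           SELECT OPTIONAL NEW-FILE ASSIGN TO 'REFACTOR'
               ORGANIZATION IS LINE SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  BASE-FILE.
       01  BASE-RECORD                PIC X(132).
       FD  NEW-FILE.
       01  NEW-RECORD                 PIC X(132).
       WORKING-STORAGE SECTION.
       01  WS-BASE-SWITCH             PIC X(01) VALUE 'N'.
           88  BASE-AT-END            VALUE 'Y'.
       01  WS-NEW-SWITCH              PIC X(01) VALUE 'N'.
           88  NEW-AT-END             VALUE 'Y'.
       01  WS-RECORD-NUMBER           PIC 9(09) VALUE ZERO.
       01  WS-MISMATCH-COUNT          PIC 9(09) VALUE ZERO.
       PROCEDURE DIVISION.
       1000-MAIN.
           OPEN INPUT BASE-FILE NEW-FILE
           PERFORM 2100-READ-BOTH
           PERFORM 2000-COMPARE-ONE-PAIR
               UNTIL BASE-AT-END OR NEW-AT-END
      * One file ending before the other is also a difference.
           IF NOT (BASE-AT-END AND NEW-AT-END)
               DISPLAY 'FILES DIFFER IN LENGTH'
               ADD 1 TO WS-MISMATCH-COUNT
           END-IF
           CLOSE BASE-FILE NEW-FILE
           DISPLAY 'MISMATCHES: ' WS-MISMATCH-COUNT
           IF WS-MISMATCH-COUNT > 0
               MOVE 8 TO RETURN-CODE
           END-IF
           STOP RUN.
       2000-COMPARE-ONE-PAIR.
           ADD 1 TO WS-RECORD-NUMBER
           IF BASE-RECORD NOT = NEW-RECORD
               ADD 1 TO WS-MISMATCH-COUNT
               DISPLAY 'MISMATCH AT RECORD ' WS-RECORD-NUMBER
           END-IF
           PERFORM 2100-READ-BOTH
           .
       2100-READ-BOTH.
           READ BASE-FILE
               AT END SET BASE-AT-END TO TRUE
           END-READ
           READ NEW-FILE
               AT END SET NEW-AT-END TO TRUE
           END-READ
           .
```

A record-level comparison like this is weaker than a true byte-for-byte check (for example, line sequential files do not preserve trailing spaces), so it is best read as a picture of the process rather than a replacement for a dedicated compare tool.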
Results
Quantitative
| Metric | Before | After |
|---|---|---|
| Total lines | 4,200 | 4,600 |
| Number of paragraphs | 127 | 156 |
| Average paragraph size | 33 lines | 15 lines |
| Maximum paragraph size | 237 lines | 28 lines |
| GO TO statements | 78 | 8 (all to exit paragraphs) |
| 88-level condition names | 0 | 23 |
| Maximum nesting depth | 9 | 3 |
| Cyclomatic complexity (max paragraph) | 27 | 6 |
Production Impact (First 12 Months)
| Metric | Before Refactoring | After Refactoring |
|---|---|---|
| Production incidents | 11 | 3 |
| Average time to diagnose | 4.2 hours | 0.9 hours |
| Average time to fix and deploy | 3.1 days | 0.8 days |
| New feature implementation time | 2-3 weeks | 2-3 days |
The Ultimate Validation
Six months after the refactoring, GlobalBank needed to add another new account status code: 'F' for "Frozen" (a regulatory hold). Derek Washington completed the change in four hours, including testing. He modified three lines:
- Added `88 ACCT-FROZEN VALUE 'F'` to the status definition
- Added `'F'` to the `ACCT-STATUS-VALID` 88-level value list
- Added a WHEN clause to the EVALUATE in `2200-ROUTE-BY-STATUS`
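The three changes can be pictured as follows. The data definition extends the Phase 1 list directly; the body of `2200-ROUTE-BY-STATUS` is not shown in the case study, so the EVALUATE branches and messages here are hypothetical stand-ins:

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FROZDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-ACCOUNT-STATUS          PIC X(01).
           88  ACCT-ACTIVE            VALUE 'A'.
           88  ACCT-CLOSED            VALUE 'C'.
           88  ACCT-SUSPENDED         VALUE 'S'.
           88  ACCT-RESTRICTED        VALUE 'R'.
           88  ACCT-PENDING           VALUE 'P'.
      * Change 1: new condition name.
           88  ACCT-FROZEN            VALUE 'F'.
           88  ACCT-PROCESSABLE       VALUE 'A' 'P'.
           88  ACCT-MODIFIABLE        VALUE 'A' 'R'.
      * Change 2: 'F' appended to the valid list.
           88  ACCT-STATUS-VALID      VALUE 'A' 'C' 'S' 'R' 'P' 'F'.
       PROCEDURE DIVISION.
       0000-MAIN.
           MOVE 'F' TO WS-ACCOUNT-STATUS
           PERFORM 2200-ROUTE-BY-STATUS
           STOP RUN.
       2200-ROUTE-BY-STATUS.
           EVALUATE TRUE
               WHEN ACCT-PROCESSABLE
                   DISPLAY 'ROUTE: NORMAL PROCESSING'
      * Change 3: hypothetical branch for frozen accounts.
               WHEN ACCT-FROZEN
                   DISPLAY 'ROUTE: REGULATORY HOLD'
               WHEN ACCT-STATUS-VALID
                   DISPLAY 'ROUTE: NO ACTION'
               WHEN OTHER
                   DISPLAY 'ROUTE: INVALID STATUS'
           END-EVALUATE
           .
```

Because `ACCT-PROCESSABLE` and `ACCT-MODIFIABLE` do not list `'F'`, every existing check that uses them already excludes frozen accounts without being edited.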
No other code needed to change. No missed checks, no rollbacks, no midnight pages. "That," Maria told the project review meeting, "is what good structure buys you."
Discussion Questions
- The refactoring increased total line count from 4,200 to 4,600. Does this mean the refactored code is worse? Why or why not?
- The team chose to keep 8 GO TO statements (all to exit paragraphs within PERFORM THRU ranges). Was this the right decision? What would the code look like if they had eliminated all GO TOs?
- Phase 1 (88-level definitions and replacing raw comparisons) was described as "behavior-preserving." However, it still required regression testing. Why?
- What organizational or process changes would reduce the likelihood of ACCT-MAINT deteriorating back into poorly structured code over the next 35 years?
- Harold Reeves wrote the original program in 1989, when END-IF and inline PERFORM (both introduced with the COBOL-85 standard) were still new and far from universally adopted. To what extent should we judge old code by modern standards?