Case Study 2: Federal Benefits Administration's LE Modernization and Marcus Whitfield's Knowledge Transfer
Background
Federal Benefits Administration (FBA) operates a 40-year-old mainframe application that manages benefits for 15 million federal employees and retirees. The codebase comprises 15 million lines of COBOL across 2,400 programs. The system processes enrollment changes, benefit calculations, claims adjudication, and payments — a mix of batch and CICS workloads running on a 2-LPAR Parallel Sysplex.
Sandra Chen, the modernization lead, faces a challenge that Kwame Mensah at CNB never had to: programs so old that they predate Language Environment itself.
Marcus Whitfield has been at FBA for 37 years. He started as a COBOL programmer in 1988, when MVS/XA was the operating system and VS COBOL was the compiler. He has personally written or modified over 400 of the 2,400 programs. He knows which programs have hidden assumptions about storage layout, which ones use non-standard OS macros, and which ones will break if recompiled. He is the only person who knows these things.
Marcus is retiring in four months.
The LE Landscape at FBA
Sandra commissioned a comprehensive audit of FBA's COBOL programs. Marcus led the audit, assisted by two junior programmers whom Sandra explicitly assigned as knowledge-transfer recipients.
The Audit Results
| Category | Count | LE Status | Risk |
|---|---|---|---|
| Group A: Compiled with Enterprise COBOL V6.x | 890 | Full LE compatibility | Low |
| Group B: Compiled with Enterprise COBOL V5.x | 620 | LE compatible (current) | Low-Medium |
| Group C: Compiled with Enterprise COBOL V4.x | 410 | LE compatible (degrading) | Medium |
| Group D: Compiled with Enterprise COBOL V3.x | 230 | LE compatible (fragile) | High |
| Group E: Compiled with VS COBOL II | 195 | Pre-LE; uses IGZOPT | Very High |
| Group F: Compiled with VS COBOL | 32 | Pre-LE; no runtime module | Critical |
| Group G: Source code unavailable | 23 | Unknown | Unknown/Critical |
| Total | 2,400 | | |
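The audit totals can be cross-checked with a few lines of arithmetic. The sketch below simply re-adds the counts from the table; the dictionary layout and variable names are illustrative, not part of FBA's tooling.

```python
# Program counts per audit group, copied from the table above.
AUDIT_COUNTS = {
    "A": 890,  # Enterprise COBOL V6.x
    "B": 620,  # Enterprise COBOL V5.x
    "C": 410,  # Enterprise COBOL V4.x
    "D": 230,  # Enterprise COBOL V3.x
    "E": 195,  # VS COBOL II (pre-LE)
    "F": 32,   # VS COBOL (pre-LE)
    "G": 23,   # source unavailable
}

# Groups compiled with pre-LE compilers.
PRE_LE_GROUPS = ("E", "F")

total = sum(AUDIT_COUNTS.values())
pre_le = sum(AUDIT_COUNTS[group] for group in PRE_LE_GROUPS)
pre_le_share = pre_le / total

print(f"total programs: {total}")
print(f"pre-LE programs (E+F): {pre_le} ({pre_le_share:.1%})")
```

The 227 pre-LE programs computed here are the same figure that appears as "Programs at LE risk" in the outcome metrics.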
The 32 VS COBOL programs (Group F) are the most alarming. VS COBOL — not VS COBOL II, but the original VS COBOL compiler — produced code that predates even IGZOPT. These programs run under LE's "pre-Language Environment compatibility" mode, which translates ancient runtime interfaces into current LE equivalents. Each z/OS upgrade degrades this compatibility layer.
Marcus's knowledge of these 32 programs is irreplaceable: "I wrote 11 of them. I modified another 14. For the remaining 7, I know the people who wrote them — they all retired years ago. I know which ones use GETMAIN directly, which ones assume below-the-line storage, which ones have inline assembler, and which ones can be safely recompiled."
The 23 Source-Unavailable Programs
The 23 programs with no available source code are the most dangerous from an LE perspective:
- 8 were originally part of a vendor package from a company that went out of business in 2003
- 7 were written by a contractor team in 1995; the source was on a decommissioned system
- 5 were modified by Marcus using Superzap (direct hex modification of the load module) when source was lost
- 3 were compiled from source that Marcus believes exists "somewhere in the archive" but hasn't been located
For these 23 programs, recompilation is not an option. They must run under LE's compatibility mode indefinitely — or be rewritten from scratch.
The Knowledge Transfer Plan
Sandra created a structured knowledge transfer program with three components:
Component 1: The LE Compatibility Document
Marcus spent two weeks creating a document titled "LE Considerations for FBA COBOL Programs." It was 47 pages. Excerpts:
Program FBAENR01 (Enrollment Processing — VS COBOL II, compiled 1991):
This program uses CALL 'IGZERRE' to invoke the VS COBOL II error handling routine. Under LE, this CALL is intercepted by LE's compatibility layer and redirected to CEE3ABD. The program also uses ILBOABN0 for abnormal termination — this is translated to CEEMRCR by LE's compatibility shim. Both translations work today, but IBM's roadmap shows deprecation of IGZERRE translation in a future LE release.
The program uses GETMAIN R,LV=4096 directly (an assembler macro embedded via CALL to an assembler subroutine). LE does not manage this storage — it won't appear in RPTSTG, and LE won't free it at enclave termination. The assembler subroutine FBAASM01 must be reviewed separately.
SAFE TO RECOMPILE? Yes, with caution. The embedded CALL to FBAASM01 requires AMODE 31 compatibility verification. Test in isolated CICS region before production.
Program FBACALC3 (Benefit Calculation — VS COBOL, compiled 1986, Superzap modified 2004):
DO NOT RECOMPILE. No source code. Load module was hex-modified in 2004 to change a calculation constant (federal contribution percentage from 7.00% to 7.65%). The modification is at offset X'002A40' — I changed packed decimal X'70000C' to X'76500C'. The rest of the module is original 1986 object code.
This program runs in AMODE 24. It does not use LE services directly; it uses the original VS COBOL runtime library (ILBOxxxx modules). LE provides a compatibility wrapper that loads the original modules from the LE library. If IBM removes VS COBOL compatibility from LE, this program stops working immediately.
RECOMMENDED: Rewrite from specification. The benefit calculation logic is documented in FBA Policy Manual Section 3.4.7. Estimated effort: 3 weeks.
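The Superzap change to FBACALC3 is easier to follow once the two hex strings are decoded as packed decimal. The following is a minimal sketch of a packed-decimal (COMP-3) decoder. The scale of 4 implied decimal places is an assumption chosen because it makes the bytes read as 7.0000 and 7.6500; the actual picture clause is unknown because the source is lost. The function name `unpack_comp3` is hypothetical.

```python
from decimal import Decimal

VALID_SIGNS = (0xA, 0xB, 0xC, 0xD, 0xE, 0xF)
NEGATIVE_SIGNS = (0xB, 0xD)

def unpack_comp3(raw: bytes, scale: int) -> Decimal:
    """Decode an IBM packed-decimal field: two digits per byte, sign in the last nibble."""
    digits = []
    for byte in raw[:-1]:
        digits.append(byte >> 4)
        digits.append(byte & 0x0F)
    digits.append(raw[-1] >> 4)       # last byte: one digit...
    sign_nibble = raw[-1] & 0x0F      # ...followed by the sign nibble
    if any(d > 9 for d in digits) or sign_nibble not in VALID_SIGNS:
        raise ValueError(f"not valid packed decimal: {raw.hex().upper()}")
    value = int("".join(str(d) for d in digits))
    if sign_nibble in NEGATIVE_SIGNS:
        value = -value
    return Decimal(value).scaleb(-scale)

# ASSUMPTION: 4 implied decimal places (the real field definition is not available).
before = unpack_comp3(bytes.fromhex("70000C"), scale=4)
after = unpack_comp3(bytes.fromhex("76500C"), scale=4)
print(before, "->", after)
```

Because the patch replaces three bytes with three bytes, the module length and every other offset stay unchanged, which is what makes this kind of in-place modification possible without source.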
Program FBAAUD07 (Audit Trail — Enterprise COBOL V3.1, compiled 2003):
This program calls 5 subprograms dynamically. It runs in a long-running batch job (8-hour window). It does NOT cancel subprograms after calling them. RPTSTG shows heap growth of approximately 50 MB per hour, a slow storage leak from accumulated subprogram working storage. The job currently has enough headroom (600 MB free at end-of-run). At the 50 MB/hour growth rate, that remaining headroom would last approximately 12 more hours, which is fine for the 8-hour window, but the job would fail if the window were extended that far.
The fix: add CANCEL 'FBASUB01' through CANCEL 'FBASUB05' after each processing loop iteration. But nobody has modified this program since 2003 — any change requires full regression testing of the audit trail output against 20 years of expected results.
CEEUOPT status: No CEEUOPT linked. Uses CEEDOPT defaults. TRAP is ON (from CEEDOPT). STORAGE is NONE (from CEEDOPT). This program would benefit from STORAGE(00,FE,00) but the change requires re-linking, which triggers the "no modification since 2003" concern.
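The headroom estimate for FBAAUD07 can be reproduced from the figures in the excerpt. This is a sketch under one stated assumption: the 600 MB free figure is measured at the moment the 8-hour run ends, so the 12 hours describe run time beyond the current window.

```python
GROWTH_MB_PER_HOUR = 50       # heap growth reported by RPTSTG
WINDOW_HOURS = 8              # current batch window
FREE_MB_AT_END_OF_RUN = 600   # ASSUMPTION: measured when the 8-hour run ends

# Storage leaked during one normal run.
growth_during_window_mb = GROWTH_MB_PER_HOUR * WINDOW_HOURS

# How much longer the job could run before the remaining headroom is gone.
extra_hours_of_headroom = FREE_MB_AT_END_OF_RUN / GROWTH_MB_PER_HOUR

# Longest run the job could survive at the current growth rate.
max_run_hours = WINDOW_HOURS + extra_hours_of_headroom

print(f"leaked per run: {growth_during_window_mb} MB")
print(f"extra headroom: {extra_hours_of_headroom:.0f} hours")
print(f"failure point:  about {max_run_hours:.0f} hours of run time")
```

Under a different reading, where 12 hours is the total run time before failure, the free storage at the end of an 8-hour run would be 200 MB rather than 600 MB, so the measurement point is worth confirming against the RPTSTG report.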
Component 2: CEEUOPT Standardization
Marcus and Sandra designed three standard CEEUOPT templates for FBA:
Template 1: Modern Batch (Enterprise COBOL V5+)
FBABATCH CEEXOPT HEAP=(524288,524288,ANYWHERE,KEEP, X
33554432,16777216), X
STACK=(262144,262144,ANYWHERE,KEEP, X
262144,131072), X
STORAGE=(00,FE,00), X
ALL31=ON, X
TRAP=(ON,SPIE), X
ABTERMENC=ABEND, X
CEEDUMP=ON, X
TERMTHDACT=TRACE, X
RPTSTG=OFF
END
Template 2: Legacy Batch (Enterprise COBOL V3/V4)
FBALEGCY CEEXOPT HEAP=(131072,131072,ANYWHERE,KEEP, X
8192,4096), X
STACK=(131072,131072,ANYWHERE,KEEP, X
131072,65536), X
STORAGE=(00,00,00), X
ALL31=ON, X
TRAP=(ON,SPIE), X
ABTERMENC=ABEND, X
CEEDUMP=ON, X
TERMTHDACT=TRACE
END
Note: STORAGE(00,00,00) instead of (00,FE,00) for legacy programs. Marcus explained: "Some of the V3/V4 programs have logic that depends on storage containing whatever was there before — uninitialized fields that happen to be zeros from a previous GETMAIN. Filling freed storage with X'FE' would break them. We'll use (00,00,00) for now and migrate to (00,FE,00) as we recompile and verify each program."
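The difference between the two STORAGE settings can be illustrated with a toy model. The sketch below is not LE code and does not reproduce LE's heap manager; it only models the first two STORAGE suboptions (the fill byte applied when heap storage is allocated and the fill byte applied when it is freed) and shows what a program sees if it keeps reading a block after freeing it. The function name and the 4-byte counter are hypothetical.

```python
def stale_read_after_free(alloc_fill: int, free_fill: int, length: int = 4) -> bytes:
    """Allocate a block, free it, then read it again through a stale reference."""
    block = bytearray([alloc_fill] * length)   # allocation: filled with the alloc value
    block[:] = bytes([free_fill] * length)     # free: overwritten with the free value
    return bytes(block)                        # the buggy program still reads the area

legacy_bytes = stale_read_after_free(alloc_fill=0x00, free_fill=0x00)  # STORAGE(00,00,00)
modern_bytes = stale_read_after_free(alloc_fill=0x00, free_fill=0xFE)  # STORAGE(00,FE,00)

# Interpret the area as a signed binary fullword, as a COBOL COMP counter might be.
legacy_counter = int.from_bytes(legacy_bytes, "big", signed=True)
modern_counter = int.from_bytes(modern_bytes, "big", signed=True)

print("STORAGE(00,00,00):", legacy_bytes.hex().upper(), "->", legacy_counter)
print("STORAGE(00,FE,00):", modern_bytes.hex().upper(), "->", modern_counter)
```

With a free value of X'00' the latent bug stays hidden because the stale counter still reads as zero. With X'FE' the same read returns an obviously wrong value, which is the behavior that exposes the bug and also the reason an unverified legacy program can start failing.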
Template 3: CICS (CEECOPT — all programs)
FBACICS CEEXOPT STORAGE=(00,FE,00), X
ALL31=ON, X
TRAP=(ON,SPIE), X
CEEDUMP=40, X
TERMTHDACT=MSG
END
Component 3: The Risk Register
Sandra created an LE Compatibility Risk Register — a spreadsheet tracking every program with LE risk:
| Column | Purpose |
|---|---|
| Program name | Identifier |
| Compiler version | From load module scan |
| Compile date | From load module scan |
| LE compatibility level | Mapped from compiler version |
| Source available | Yes/No/Partial |
| CEEUOPT linked | Yes/No |
| Known dependencies | Below-line, assembler, GETMAIN, IGZOPT |
| Marcus's assessment | Safe to recompile / Recompile with caution / Do not recompile / Rewrite |
| Priority | Emergency / High / Medium / Low |
| Target date | When this program will be addressed |
The register was maintained in a secured spreadsheet with version control. Sandra designated it as a "living document" — updated after every recompilation, z/OS upgrade, or LE PTF.
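One row of the register can be modeled as a small record type, which makes it easy to sort the worklist by priority. This is a sketch only: FBA kept the register in a spreadsheet, and the class and field names below are hypothetical. The three sample rows use the programs from Marcus's document; priorities follow the campaign phases (Groups E and F as Emergency, Group D as High), and any value the excerpts do not state is left as `None` or marked as an assumption.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Optional

class Priority(IntEnum):
    EMERGENCY = 0
    HIGH = 1
    MEDIUM = 2
    LOW = 3

@dataclass
class RegisterEntry:
    program: str
    compiler: str
    compile_year: int
    source_available: str                 # "Yes" / "No" / "Partial"
    assessment: str
    priority: Priority
    ceeuopt_linked: Optional[bool] = None  # None = not stated in the excerpts
    dependencies: list[str] = field(default_factory=list)
    target_date: Optional[str] = None      # left empty rather than invented

register = [
    RegisterEntry("FBAAUD07", "Enterprise COBOL V3.1", 2003,
                  "Yes",  # ASSUMPTION: the proposed CANCEL fix implies source exists
                  "Recompile with caution",  # ASSUMPTION: not stated in the excerpt
                  Priority.HIGH, ceeuopt_linked=False),
    RegisterEntry("FBAENR01", "VS COBOL II", 1991, "Yes",
                  "Recompile with caution", Priority.EMERGENCY,
                  dependencies=["assembler (FBAASM01)", "GETMAIN"]),
    RegisterEntry("FBACALC3", "VS COBOL", 1986, "No",
                  "Rewrite", Priority.EMERGENCY,
                  dependencies=["below-line (AMODE 24)"]),
]

# Most urgent first; ties broken by program name for a stable order.
worklist = sorted(register, key=lambda entry: (entry.priority, entry.program))
for entry in worklist:
    print(f"{entry.priority.name:<9} {entry.program}  {entry.assessment}")
```

Using an ordered enum for priority keeps the sort key explicit, so adding a new priority level does not depend on alphabetical order.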
The Recompilation Campaign
Sandra planned a phased recompilation campaign, prioritized by risk:
Phase 1: Emergency (Month 1) — Groups E and F
Approach: Marcus personally supervised the recompilation of the 195 VS COBOL II programs. For each:
- Retrieve source from Endevor
- Verify source matches load module (some programs had source that didn't match the production module)
- Compile with Enterprise COBOL V6.3
- Run baseline comparison test (execute both old and new modules, compare outputs)
- If outputs match: promote new module
- If outputs differ: Marcus investigates, adjusts, retests
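The baseline comparison in the steps above can be sketched as a small classifier over the output records of the old and new modules. The case study does not describe FBA's actual comparison tooling, so the function names, the three verdicts, and the rule that whitespace-only differences count as cosmetic are all assumptions for illustration. The sample records are invented.

```python
def classify_difference(old_records, new_records):
    """Return 'identical', 'cosmetic', or 'significant' for two lists of output records."""
    if old_records == new_records:
        return "identical"

    # ASSUMPTION: differences in spacing alone are treated as cosmetic.
    def squeeze(record):
        return " ".join(record.split())

    if [squeeze(r) for r in old_records] == [squeeze(r) for r in new_records]:
        return "cosmetic"
    return "significant"

def next_step(old_records, new_records):
    """Map the comparison verdict to the action in the recompilation procedure."""
    verdict = classify_difference(old_records, new_records)
    if verdict == "identical":
        return "promote new module"
    if verdict == "cosmetic":
        return "review and accept if harmless"
    return "investigate, adjust, retest"

# Invented sample output from an old and a recompiled module.
old_output = ["EMP001  ENROLLED  100.00", "EMP002  CHANGED    55.10"]
new_output = ["EMP001 ENROLLED 100.00", "EMP002 CHANGED 55.10"]

print(classify_difference(old_output, new_output))
print(next_step(old_output, new_output))
```

A real comparison would also need rules for date formatting and for numeric precision, since those were the other kinds of difference Phase 1 encountered.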
Result: 178 of 195 recompiled successfully with identical outputs. 12 had minor differences (cosmetic — date formatting, spacing) that were acceptable. 5 had significant differences requiring source modification:
- 2 programs used COBOL features removed in V6 (ALTER/GO TO DEPENDING ON with modified targets)
- 2 programs had numeric precision differences due to V6's improved decimal handling
- 1 program relied on a VS COBOL II runtime behavior that V6 doesn't replicate
The 5 difficult programs consumed 60% of the Phase 1 effort. Marcus resolved each one personally.
The 32 VS COBOL programs (Group F) were assessed individually:
- 18 were recompiled after source was located and verified
- 9 were flagged for rewrite (source unavailable or Superzap-modified)
- 5 were documented as running under LE compatibility mode with acceptance of risk
Phase 2: High Priority (Month 2) — Group D
The 230 Enterprise COBOL V3.x programs were more straightforward — no VS COBOL II compatibility issues. Recompilation was largely automated. Marcus reviewed the 30 programs he had identified as "recompile with caution" — programs with assembler subprogram calls or non-standard I/O.
Phase 3: Medium Priority (Month 3) — Group C
410 Enterprise COBOL V4.x programs. Mostly automated recompilation. Marcus was available for consultation but junior team members handled the process.
Phase 4: Ongoing — Groups A and B + the 23 Source-Unavailable
Groups A and B were already current and required no action. The 23 source-unavailable programs were placed on the Risk Register with annual testing against each LE upgrade. Sandra budgeted 4 rewrite projects per year to gradually eliminate these programs.
Marcus's Last Month
In his final month, Marcus conducted a series of knowledge-transfer sessions:
Session 1: "How to Read a CEEDUMP When LE Compatibility Mode Fails"
Marcus demonstrated CEEDUMP interpretation for programs running under LE's pre-LE compatibility layer. The CEEDUMP format is different: the traceback uses older control block names, and the condition information references IGZERRE instead of CEE3xxx messages.
Session 2: "The 23 Programs Without Source — What I Know"
Marcus spent four hours documenting each of the 23 source-unavailable programs: what they do, how they interact with other programs, what their working storage looks like (from his memory), and what would need to be replicated in a rewrite.
Session 3: "FBA's Hidden Assembler Subroutines"
Marcus identified 34 assembler subroutines called by COBOL programs. These subroutines are outside LE's management: they use direct GETMAIN, direct SVC calls, and below-the-line storage. Marcus documented each one's storage behavior and LE interaction.
Session 4: "The Programs That Will Break Next"
Marcus identified 8 programs that he predicted would fail within the next 2-3 years due to LE evolution, data volume growth, or storage constraints. He documented each prediction with technical rationale.
On his last day, Marcus gave Sandra a handwritten note: "The mainframe isn't going away, and the COBOL isn't going away. But the people who understand both are going away. Invest in the next generation."
Outcome and Metrics
Six Months After Marcus's Retirement
| Metric | Before Modernization | After Phase 1-3 | Change |
|---|---|---|---|
| Programs at LE risk (VS COBOL/VS COBOL II) | 227 | 14 | -94% |
| Programs with CEEUOPT linked | ~200 (estimated) | 1,850 | +825% |
| Programs with STORAGE(00,FE,00) | ~50 (estimated) | 1,500 | +2,900% |
| U4038 abends (6-month count) | 0 (pre-PTF) → 5 (during PTF) | 0 | Eliminated |
| Uninitialized data bugs found via STORAGE option | Unknown | 7 in first 3 months | New capability |
| Source-unavailable programs rewritten | 0 | 4 | Ongoing (4/year) |
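The Change column can be reproduced from the Before and After columns. The sketch below recomputes the three percentage changes; the "before" values for CEEUOPT and STORAGE are the table's own estimates, and the helper name `pct_change` is illustrative.

```python
def pct_change(before: float, after: float) -> float:
    """Percentage change from before to after."""
    return (after - before) / before * 100

# (before, after) pairs copied from the table above.
metrics = {
    "Programs at LE risk": (227, 14),
    "Programs with CEEUOPT linked": (200, 1850),      # before value is an estimate
    "Programs with STORAGE(00,FE,00)": (50, 1500),    # before value is an estimate
}

changes = {name: round(pct_change(b, a)) for name, (b, a) in metrics.items()}
for name, change in changes.items():
    print(f"{name}: {change:+d}%")
```

The 14 programs still at LE risk correspond to the Group F programs that were not recompiled: 9 flagged for rewrite plus 5 accepted under compatibility mode.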
Twelve Months After Marcus's Retirement
Sandra's team discovered that two of Marcus's eight "programs that will break next" predictions came true:
1. A program hit the 2 GB bar (exactly as Marcus predicted) when a reference table grew past the threshold he calculated.
2. A VS COBOL compatibility-mode program failed after a z/OS maintenance PTF removed a specific IGZERRE translation that Marcus had flagged.
Both were resolved within hours using Marcus's documentation. Without the documentation, Sandra estimated they would have taken days.
Discussion Questions
- Marcus's 47-page LE Compatibility Document is arguably the most valuable artifact in this case study. How would you ensure this document remains current after Marcus leaves? What governance process would you implement?
- Marcus chose STORAGE(00,00,00) instead of STORAGE(00,FE,00) for legacy programs because some depend on uninitialized storage behavior. Is this the right decision? What testing would you need to safely migrate these programs to (00,FE,00)?
- The 23 source-unavailable programs represent a permanent LE compatibility risk. Sandra budgeted 4 rewrites per year. At that rate, the last program would be rewritten in 6 years. Is this pace acceptable? What factors should determine the priority order?
- Compare FBA's situation with CNB's U4038 incident (Case Study 1). CNB had 17 at-risk programs; FBA had 227. What explains the difference? What organizational factor allowed FBA's technical debt to accumulate more than CNB's?
- Marcus's retiring knowledge is the real crisis, not the LE compatibility. Propose a "knowledge escrow" program that organizations could implement to prevent this situation. What knowledge should be captured, how, and in what format?
- The knowledge transfer took approximately 3 months of Marcus's time and 6 months of the junior programmers' time. Calculate the cost (assume Marcus at $85/hour, juniors at $55/hour, 160 hours/month each). Compare this to the potential cost of production failures without the knowledge transfer.