Case Study 1: GlobalBank Account Statement Formatting
Background
GlobalBank generates monthly account statements for 1.2 million customers. The statement generation pipeline reads fixed-format transaction records from the mainframe's VSAM files and produces formatted output that feeds into the document composition system (which adds logos, graphics, and produces the final PDF).
The formatting step, program GBFMT01, must transform raw fixed-field data into human-readable statement lines. This is primarily a string-handling task.
The Problem
Derek Washington has been asked to enhance the statement formatting program. The current version uses only MOVE statements with predefined output line templates. It works but has limitations:
- Transaction descriptions are always 30 characters, even when the actual description is shorter — resulting in large gaps in the output
- Branch codes appear as raw codes ("NYC01") instead of human-readable names ("New York — Main Branch")
- Dates appear in YYYYMMDD format instead of MM/DD/YYYY
- Amounts are not formatted with thousand separators
Maria Chen reviews Derek's proposed approach: "Good plan. Use STRING with DELIMITED BY SPACE to trim trailing spaces from descriptions. Use reference modification for the date reformatting — it's faster than UNSTRING for a fixed-format field. Use INSPECT CONVERTING to ensure descriptions are uppercase for consistency."
Requirements
- Reformat dates from YYYYMMDD to MM/DD/YYYY
- Format amounts with dollar signs and thousand separators
- Trim trailing spaces from transaction descriptions
- Convert branch codes to full names using a lookup table
- Build formatted statement lines using STRING
- Handle edge cases: missing descriptions, zero amounts, unknown branch codes
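The branch-code requirement, including the unknown-code edge case, could be met with a small table and SEARCH. The following is a minimal sketch, not GBFMT01's actual code: the table contents, the data names (WS-BRANCH-TABLE, TXN-BRANCH, WS-BRANCH-OUT) and the "Unknown Branch" fallback text are all assumptions made for illustration, and a plain hyphen stands in for the dash in the branch name.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. BRLOOKUP.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical branch data; a real program would more likely
      * load this from a file or copybook than hard-code it.
       01  WS-BRANCH-DATA.
           05  FILLER PIC X(5)  VALUE "NYC01".
           05  FILLER PIC X(30) VALUE "New York - Main Branch".
           05  FILLER PIC X(5)  VALUE "CHI02".
           05  FILLER PIC X(30) VALUE "Chicago - Loop Branch".
       01  WS-BRANCH-TABLE REDEFINES WS-BRANCH-DATA.
           05  WS-BRANCH-ENTRY OCCURS 2 TIMES
                               INDEXED BY BR-IDX.
               10  WS-BR-CODE  PIC X(5).
               10  WS-BR-NAME  PIC X(30).
       01  TXN-BRANCH          PIC X(5) VALUE "NYC01".
       01  WS-BRANCH-OUT       PIC X(30).
       PROCEDURE DIVISION.
      * Serial SEARCH starts from the current index value.
           SET BR-IDX TO 1
           SEARCH WS-BRANCH-ENTRY
               AT END
      *            Unknown code: fall back to a placeholder name.
                   MOVE "Unknown Branch" TO WS-BRANCH-OUT
               WHEN WS-BR-CODE(BR-IDX) = TXN-BRANCH
                   MOVE WS-BR-NAME(BR-IDX) TO WS-BRANCH-OUT
           END-SEARCH
           DISPLAY WS-BRANCH-OUT
           STOP RUN.
```

A serial SEARCH is adequate for a handful of branches. For a large table kept in code order, SEARCH ALL with an ASCENDING KEY clause on the table would perform a binary search instead.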
Key Design Decisions
Date formatting: Use reference modification (not UNSTRING) because the date format is fixed — characters 1-4 are always the year, 5-6 are always the month, 7-8 are always the day. No parsing is needed.
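A minimal sketch of this date reformatting follows. The field names TXN-DATE and WS-DATE-OUT and the sample value are assumptions for illustration; the slashes are placed once in the output field's initial value so that only the digits need to be moved.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DATEFMT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical input date in YYYYMMDD form.
       01  TXN-DATE            PIC X(8)  VALUE "20240315".
      * Output template already carries the two slashes.
       01  WS-DATE-OUT         PIC X(10) VALUE "  /  /    ".
       PROCEDURE DIVISION.
      * Month: source positions 5-6 to output positions 1-2.
           MOVE TXN-DATE(5:2) TO WS-DATE-OUT(1:2)
      * Day: source positions 7-8 to output positions 4-5.
           MOVE TXN-DATE(7:2) TO WS-DATE-OUT(4:2)
      * Year: source positions 1-4 to output positions 7-10.
           MOVE TXN-DATE(1:4) TO WS-DATE-OUT(7:4)
           DISPLAY WS-DATE-OUT
           STOP RUN.
```

With the sample value, the program displays 03/15/2024. No validation is performed here; the sketch assumes the mainframe record always carries eight valid digits.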
Amount formatting: Use MOVE to a numeric-edited field (PIC $ZZZ,ZZ9.99), then STRING the edited field into the output line.
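The amount step might look like the sketch below. TXN-AMOUNT, WS-AMOUNT-EDIT, the "Amount: " label and the unsigned PIC 9(6)V99 source field are assumptions; only the edit picture comes from the design decision above.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. AMTFMT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical unsigned amount field.
       01  TXN-AMOUNT          PIC 9(6)V99 VALUE 1234.56.
      * Numeric-edited field: fixed dollar sign, zero suppression,
      * and a comma inserted once a significant digit precedes it.
       01  WS-AMOUNT-EDIT      PIC $ZZZ,ZZ9.99.
       01  WS-OUTPUT           PIC X(40) VALUE SPACES.
       01  WS-PTR              PIC 9(3)  VALUE 1.
       PROCEDURE DIVISION.
      * The MOVE performs the editing.
           MOVE TXN-AMOUNT TO WS-AMOUNT-EDIT
      * STRING then appends the edited text to the output line.
           STRING "Amount: "     DELIMITED BY SIZE
                  WS-AMOUNT-EDIT DELIMITED BY SIZE
               INTO WS-OUTPUT
               WITH POINTER WS-PTR
           END-STRING
           DISPLAY WS-OUTPUT
           STOP RUN.
```

After the MOVE, WS-AMOUNT-EDIT holds `$  1,234.56`; a zero amount edits to a dollar sign, six spaces and `0.00`, because the final 9 forces the units digit to print. The picture has no sign position, so signed amounts would need an extended picture (for example a trailing CR or minus).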
Description trimming: The obvious choice is DELIMITED BY SPACE in STRING, but some descriptions contain embedded spaces ("ATM WITHDRAWAL"). DELIMITED BY SPACE stops at the first space, so "ATM WITHDRAWAL" would be truncated to "ATM". Solution: compute the actual length using a right-trim loop, then use reference modification with that length.
      * Find actual length of description
           MOVE 30 TO WS-DESC-LEN
           PERFORM UNTIL WS-DESC-LEN = 0
                   OR TXN-DESC(WS-DESC-LEN:1) NOT = SPACE
               SUBTRACT 1 FROM WS-DESC-LEN
           END-PERFORM
      * Use the actual length in STRING; an all-blank description
      * (length 0) is skipped. WS-PTR must already hold a valid
      * position in WS-OUTPUT (1 or greater).
           IF WS-DESC-LEN > 0
               STRING TXN-DESC(1:WS-DESC-LEN)
                       DELIMITED BY SIZE
                   INTO WS-OUTPUT
                   WITH POINTER WS-PTR
               END-STRING
           END-IF
Performance Results
The enhanced GBFMT01 processes 1.2 million statements (average 47 transactions each, ~56 million transaction lines) in the batch window. Performance comparison:
| Approach | Time for 56M lines |
|---|---|
| Template MOVE only (old) | 12 minutes |
| STRING for everything (initial) | 28 minutes |
| Hybrid: MOVE template + ref mod (final) | 14 minutes |
The "STRING for everything" approach was too slow. Derek adopted Maria's hybrid approach: use MOVE to place the output line template, then use reference modification to overlay the variable fields. STRING is used only for the description field where actual concatenation is needed.
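The fixed-position part of the hybrid approach can be sketched as follows. The 60-byte line layout, the column positions and the data names (WS-LINE-TEMPLATE, TXN-DATE, TXN-AMOUNT, WS-AMOUNT-EDIT) are assumptions for illustration, not GBFMT01's real layout, and the description step that still uses STRING is left out.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. HYBRID.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical 60-byte layout: date in columns 1-10,
      * amount in columns 45-55, everything else blank.
       01  WS-LINE-TEMPLATE    PIC X(60)
           VALUE "  /  /    ".
       01  WS-OUTPUT           PIC X(60).
       01  TXN-DATE            PIC X(8)    VALUE "20240315".
       01  TXN-AMOUNT          PIC 9(6)V99 VALUE 200.00.
       01  WS-AMOUNT-EDIT      PIC $ZZZ,ZZ9.99.
       PROCEDURE DIVISION.
      * One MOVE lays down the constant parts of the line.
           MOVE WS-LINE-TEMPLATE TO WS-OUTPUT
      * Fixed-position fields are overlaid by reference modification.
           MOVE TXN-DATE(5:2) TO WS-OUTPUT(1:2)
           MOVE TXN-DATE(7:2) TO WS-OUTPUT(4:2)
           MOVE TXN-DATE(1:4) TO WS-OUTPUT(7:4)
      * Edit the amount first, then place the 11 edited bytes.
           MOVE TXN-AMOUNT     TO WS-AMOUNT-EDIT
           MOVE WS-AMOUNT-EDIT TO WS-OUTPUT(45:11)
           DISPLAY WS-OUTPUT
           STOP RUN.
```

Each overlay is a plain MOVE to a known offset, so no delimiter scanning or pointer maintenance is involved, which is the likely reason this style stays close to the template-only timing.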
Lessons Learned
- Reference modification outperforms STRING for fixed-position output — when you know exactly where a field goes, MOVE with reference modification is faster
- STRING is essential for variable-length concatenation — trimmed descriptions, built-up messages
- INSPECT CONVERTING is an efficient case converter, typically faster than character-by-character loops
- Right-trimming is a fundamental pattern — compute actual length, then use it everywhere
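The case-conversion lesson can be illustrated with a short sketch. The mixed-case sample value is an assumption; TXN-DESC is the 30-character description field used earlier in the case study.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UPCASE.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical mixed-case description.
       01  TXN-DESC            PIC X(30) VALUE "Atm Withdrawal".
       PROCEDURE DIVISION.
      * Each character found in the first literal is replaced by
      * the character at the same position in the second literal;
      * all other characters, including spaces, are left alone.
           INSPECT TXN-DESC CONVERTING
               "abcdefghijklmnopqrstuvwxyz"
               TO "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
           DISPLAY TXN-DESC
           STOP RUN.
```

With the sample value, the field becomes ATM WITHDRAWAL followed by trailing spaces. The conversion happens in place, so it can run before the right-trim loop without affecting the computed length.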
Discussion Questions
- Why is DELIMITED BY SPACE problematic for fields with embedded spaces? What alternatives exist?
- How would you handle transaction descriptions that contain special characters (ampersands, angle brackets) if the output is destined for an HTML template?
- The hybrid approach (template MOVE + reference modification) was about 17% slower than the pure template approach (14 minutes versus 12). Is this acceptable? What factors determine whether the more readable statement output is worth the performance cost?
- How would you modify this program to produce pipe-delimited output for a downstream analytics system instead of formatted text?