Case Study 1: GlobalBank Variable-Length Transaction Description Parser
Background
GlobalBank's online banking platform generates transaction descriptions in a structured but variable-length format. Each description contains a channel identifier, an action type, and a variable number of key-value detail fields. The daily statement generation batch must parse these descriptions to extract the merchant name, reference number, and other details for formatted customer statements.
The Problem
Transaction descriptions arrive in the following format:
CHANNEL/ACTION/KEY1:VALUE1/KEY2:VALUE2/.../KEYn:VALUEn
Examples:
POS/PURCHASE/MERCHANT:WHOLE-FOODS/REF:POS99281/AMT:87.43
ATM/WITHDRAWAL/BRANCH:0042/TERMINAL:ATM-7/CITY:BOSTON
ACH/DIRECT-DEPOSIT/EMPLOYER:ACME-CORP/REF:DD20240615
WIRE/INTL/BENEFICIARY:J-SMITH/SWIFT:ABCDUS33/COUNTRY:UK
The challenge: descriptions range from 20 to 200 characters and contain 2 to 8 fields. The detail fields are optional and appear in no guaranteed order. The batch processes 1.2 million transactions nightly.
Design
Derek Washington proposed using reference modification with the pointer-scan pattern rather than UNSTRING because:
1. The number of fields varies per record (a single UNSTRING needs one INTO receiving field coded for each expected field)
2. Key-value pairs need secondary parsing on the colon delimiter
3. Performance: in his assessment, reference modification avoids UNSTRING's delimiter-scanning overhead
Maria Chen approved the design with the requirement that every reference modification operation include boundary validation.
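A minimal sketch of what such a boundary check might look like is shown here. It is not the GlobalBank code: every data name (WS-DESC, WS-DESC-LEN, WS-START, WS-LEN, WS-OUT) is hypothetical, and the sample values are chosen only to illustrate the rule that a start/length pair must lie entirely inside the populated part of the description before it is used.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. REFMODCK.
      * Sketch only: all names are hypothetical, not taken from
      * the GlobalBank source.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-DESC           PIC X(200) VALUE
           "POS/PURCHASE/MERCHANT:WHOLE-FOODS".
      * Populated length of WS-DESC (33 characters of text).
       01  WS-DESC-LEN       PIC 9(3)   VALUE 33.
      * Candidate substring: the third field of the text above.
       01  WS-START          PIC 9(3)   VALUE 14.
       01  WS-LEN            PIC 9(3)   VALUE 20.
       01  WS-OUT            PIC X(40)  VALUE SPACES.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Reject any start/length pair that is zero or that would
      * reach past the populated part of the description.
           IF WS-START < 1
              OR WS-LEN < 1
              OR WS-START + WS-LEN - 1 > WS-DESC-LEN
               DISPLAY "BOUNDS ERROR: START=" WS-START
                       " LEN=" WS-LEN
           ELSE
               MOVE WS-DESC(WS-START:WS-LEN) TO WS-OUT
               DISPLAY "FIELD=" WS-OUT
           END-IF
           STOP RUN.
```

With the sample values the check passes, because 14 + 20 - 1 equals the populated length of 33. Changing WS-LEN to 21 would take the error branch instead of reading one byte of padding.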
Implementation Highlights
The parser uses a two-pass approach:
1. First pass: Scan for slash delimiters, recording the start position and length of each field
2. Second pass: For detail fields (the third field onward), split on the colon to extract key and value
A field map table stores up to 10 field positions:
       01  WS-FIELD-MAP.
           05  WS-FM-COUNT      PIC 99 VALUE ZERO.
           05  WS-FM-ENTRY      OCCURS 10 TIMES.
               10  WS-FM-START  PIC 9(3).
               10  WS-FM-LEN    PIC 9(3).
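To make the first pass concrete, here is one way the slash scan could populate this table. It is a sketch under stated assumptions rather than the production parser: WS-FIELD-MAP is the table from the case study, while WS-DESC, WS-DESC-LEN, WS-TRAIL, WS-POS, WS-FLD-START, WS-IDX, and the paragraph names are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SCANPASS.
      * Sketch only. WS-FIELD-MAP is the table from the case
      * study; every other name is hypothetical.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-DESC           PIC X(200) VALUE
           "ACH/DIRECT-DEPOSIT/EMPLOYER:ACME-CORP/REF:DD20240615".
       01  WS-DESC-LEN       PIC 9(3)   VALUE ZERO.
       01  WS-TRAIL          PIC 9(3)   VALUE ZERO.
       01  WS-POS            PIC 9(3)   VALUE ZERO.
       01  WS-FLD-START      PIC 9(3)   VALUE 1.
       01  WS-IDX            PIC 99     VALUE ZERO.
       01  WS-FIELD-MAP.
           05  WS-FM-COUNT      PIC 99 VALUE ZERO.
           05  WS-FM-ENTRY      OCCURS 10 TIMES.
               10  WS-FM-START  PIC 9(3).
               10  WS-FM-LEN    PIC 9(3).
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Measure the populated length by counting trailing spaces.
           INSPECT FUNCTION REVERSE(WS-DESC)
               TALLYING WS-TRAIL FOR LEADING SPACES
           COMPUTE WS-DESC-LEN =
                   FUNCTION LENGTH(WS-DESC) - WS-TRAIL
           IF WS-DESC-LEN > 0
               PERFORM SCAN-FOR-SLASHES
           END-IF
           PERFORM VARYING WS-IDX FROM 1 BY 1
                   UNTIL WS-IDX > WS-FM-COUNT
               DISPLAY "FIELD " WS-IDX
                       " START=" WS-FM-START(WS-IDX)
                       " LEN="   WS-FM-LEN(WS-IDX)
           END-PERFORM
           STOP RUN.
       SCAN-FOR-SLASHES.
      * Pointer scan: examine one character at a time.
           PERFORM VARYING WS-POS FROM 1 BY 1
                   UNTIL WS-POS > WS-DESC-LEN
               IF WS-DESC(WS-POS:1) = "/"
                   PERFORM RECORD-FIELD
                   COMPUTE WS-FLD-START = WS-POS + 1
               END-IF
           END-PERFORM
      * The loop exits with WS-POS one past the end, so the same
      * length formula records the final, unterminated field.
           PERFORM RECORD-FIELD.
       RECORD-FIELD.
      * Boundary check: never write past the 10-entry table.
      * A recorded length of zero marks an empty field.
           IF WS-FM-COUNT < 10
               ADD 1 TO WS-FM-COUNT
               MOVE WS-FLD-START TO WS-FM-START(WS-FM-COUNT)
               COMPUTE WS-FM-LEN(WS-FM-COUNT) =
                       WS-POS - WS-FLD-START
           ELSE
               DISPLAY "TOO MANY FIELDS, EXTRA FIELD IGNORED"
           END-IF.
```

For the sample description this records four entries with (start, length) pairs of (1, 3), (5, 14), (20, 18), and (39, 14). Because an empty field such as the one in `A//B` is stored with length zero, any later reference modification would need to test WS-FM-LEN before using it.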
After parsing, the program searches the extracted key-value pairs for specific keys (MERCHANT, REF, EMPLOYER, etc.) to populate the statement detail fields.
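The second pass and the key lookup could be sketched as follows for a single detail field. This is an illustration, not the production code: all data names are hypothetical, and the start and length values stand in for one entry that the first pass would have recorded. INSPECT ... TALLYING is used here to locate the colon; a character-by-character scan would work equally well.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. KVSPLIT.
      * Sketch only: every name below is hypothetical.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-DESC           PIC X(200) VALUE
           "POS/PURCHASE/MERCHANT:WHOLE-FOODS/REF:POS99281".
      * Start and length of one detail field, as a first pass
      * would have recorded them (field 3 of the text above).
       01  WS-F-START        PIC 9(3)   VALUE 14.
       01  WS-F-LEN          PIC 9(3)   VALUE 20.
       01  WS-COLON-OFF      PIC 9(3)   VALUE ZERO.
       01  WS-V-START        PIC 9(3)   VALUE ZERO.
       01  WS-V-LEN          PIC 9(3)   VALUE ZERO.
       01  WS-KEY            PIC X(20)  VALUE SPACES.
       01  WS-VALUE          PIC X(60)  VALUE SPACES.
       01  WS-STMT-MERCHANT  PIC X(60)  VALUE SPACES.
       01  WS-STMT-REF       PIC X(60)  VALUE SPACES.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Count the characters before the first colon in the field.
      * If there is no colon the count equals the field length.
           INSPECT WS-DESC(WS-F-START:WS-F-LEN)
               TALLYING WS-COLON-OFF
               FOR CHARACTERS BEFORE INITIAL ":"
      * Require a non-empty key and a non-empty value.
           IF WS-COLON-OFF < 1
              OR WS-COLON-OFF + 1 >= WS-F-LEN
               DISPLAY "MALFORMED DETAIL FIELD"
           ELSE
               COMPUTE WS-V-START =
                       WS-F-START + WS-COLON-OFF + 1
               COMPUTE WS-V-LEN = WS-F-LEN - WS-COLON-OFF - 1
               MOVE WS-DESC(WS-F-START:WS-COLON-OFF) TO WS-KEY
               MOVE WS-DESC(WS-V-START:WS-V-LEN) TO WS-VALUE
               PERFORM STORE-BY-KEY
           END-IF
           DISPLAY "MERCHANT=" WS-STMT-MERCHANT
           STOP RUN.
       STORE-BY-KEY.
      * Match on the key name, never on the field position.
           EVALUATE WS-KEY
               WHEN "MERCHANT"
                   MOVE WS-VALUE TO WS-STMT-MERCHANT
               WHEN "REF"
                   MOVE WS-VALUE TO WS-STMT-REF
               WHEN OTHER
                   CONTINUE
           END-EVALUATE.
```

For the sample field the colon offset is 8, so the key is taken from positions 14 to 21 (MERCHANT) and the value from positions 23 to 33 (WHOLE-FOODS). Keys longer than the 20-byte WS-KEY would be truncated by the MOVE, so a real parser would likely check that as well.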
Results
The parser processed all 1.2 million descriptions in under 90 seconds, well within the batch window. Zero parsing errors occurred in the first month of production, a result the team attributed to the comprehensive boundary checking that Maria required.
Lessons Learned
- Reference modification boundary checks add negligible overhead (less than 2% of total parse time) but prevent catastrophic data corruption
- The field-map intermediate structure made the code far more readable than a single-pass approach would have been
- Key-value ordering varies — never assume a specific order; always search by key name
Discussion Questions
- How would you modify the parser to handle descriptions that contain literal slash characters in values (e.g., MERCHANT:7-ELEVEN/FOOD)?
- What would be the impact of using UNSTRING instead of reference modification for this use case?
- How should the parser handle malformed descriptions (missing channel, no delimiters)?
- If the description format were changed to JSON in a future modernization, how would the parsing approach change?