Case Study 1: GlobalBank Variable-Length Transaction Description Parser

DataField.Dev

Background

GlobalBank's online banking platform generates transaction descriptions in a structured but variable-length format. Each description contains a channel identifier, an action type, and a variable number of key-value detail fields. The daily statement generation batch must parse these descriptions to extract the merchant name, reference number, and other details for formatted customer statements.

The Problem

Transaction descriptions arrive in the following format:

CHANNEL/ACTION/KEY1:VALUE1/KEY2:VALUE2/.../KEYn:VALUEn

Examples:

POS/PURCHASE/MERCHANT:WHOLE-FOODS/REF:POS99281/AMT:87.43
ATM/WITHDRAWAL/BRANCH:0042/TERMINAL:ATM-7/CITY:BOSTON
ACH/DIRECT-DEPOSIT/EMPLOYER:ACME-CORP/REF:DD20240615
WIRE/INTL/BENEFICIARY:J-SMITH/SWIFT:ABCDUS33/COUNTRY:UK

The challenge: descriptions range from 20 to 200 characters and contain 2 to 8 fields. The detail fields are optional and appear in no guaranteed order. The batch processes 1.2 million transactions nightly.

Design

Derek Washington proposed using reference modification with the pointer-scan pattern rather than UNSTRING, for three reasons:

1. The number of fields varies per record, whereas a single UNSTRING statement needs its INTO receiving fields coded in advance.
2. Key-value pairs need secondary parsing on the colon delimiter.
3. Performance: reference modification avoids UNSTRING's delimiter-scanning overhead.

Maria Chen approved the design with the requirement that every reference modification operation include boundary validation.
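Maria's requirement can be sketched as a guard evaluated before every reference-modified read. The program below is a minimal illustration under stated assumptions, not GlobalBank's production code: the names WS-DESC, WS-START, WS-LEN, WS-OUT and the paragraph SAFE-EXTRACT are hypothetical, the 200-byte buffer simply matches the maximum description length, and the source is written in free format (for example, compiled with GnuCOBOL using `cobc -x -free`).

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. BNDCHECK.
DATA DIVISION.
WORKING-STORAGE SECTION.
*> Hypothetical names; 200 matches the maximum description length.
01  WS-DESC       PIC X(200) VALUE
    "POS/PURCHASE/MERCHANT:WHOLE-FOODS/REF:POS99281/AMT:87.43".
01  WS-START      PIC 9(3).
01  WS-LEN        PIC 9(3).
01  WS-OUT        PIC X(40).
PROCEDURE DIVISION.
MAIN-PARA.
    *> In-range request: bytes 5 through 12 hold the action type.
    MOVE 5 TO WS-START
    MOVE 8 TO WS-LEN
    PERFORM SAFE-EXTRACT
    *> Out-of-range request: the span would run past byte 200.
    MOVE 198 TO WS-START
    MOVE 10 TO WS-LEN
    PERFORM SAFE-EXTRACT
    STOP RUN.
SAFE-EXTRACT.
    *> Reject a zero start, a zero length, or a span ending
    *> beyond the buffer before touching the data.
    IF WS-START < 1 OR WS-LEN < 1
       OR WS-START + WS-LEN - 1 > FUNCTION LENGTH(WS-DESC)
        DISPLAY "BOUNDS ERROR: START=" WS-START " LEN=" WS-LEN
    ELSE
        MOVE WS-DESC(WS-START:WS-LEN) TO WS-OUT
        DISPLAY "FIELD=" WS-OUT
    END-IF.
```

The first call extracts PURCHASE; the second is rejected because 198 + 10 - 1 exceeds 200. Routing every extraction through one guarded paragraph keeps the check in a single place instead of repeating it at each reference modification.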

Implementation Highlights

The parser uses a two-pass approach:

1. First pass: Scan for slash delimiters, recording the start position and length of each field
2. Second pass: For detail fields (position 3+), split on the colon to extract key and value

A field map table stores up to 10 field positions:

01  WS-FIELD-MAP.
    05  WS-FM-COUNT     PIC 99 VALUE ZERO.
    05  WS-FM-ENTRY     OCCURS 10 TIMES.
        10  WS-FM-START PIC 9(3).
        10  WS-FM-LEN   PIC 9(3).
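
One way the first pass might populate this table is sketched below. Only WS-FIELD-MAP comes from the case study; WS-DESC, WS-DESC-LEN, WS-TRAIL, WS-POS, WS-FLD-START, WS-FLD-LEN, WS-IDX and the paragraph RECORD-FIELD are hypothetical names. The sketch also assumes free-format source, that empty fields are skipped, and that fields beyond the tenth are ignored; the production parser may treat both cases differently.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. FLDSCAN.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  WS-DESC         PIC X(200) VALUE
    "ATM/WITHDRAWAL/BRANCH:0042/TERMINAL:ATM-7/CITY:BOSTON".
01  WS-DESC-LEN     PIC 9(3).
01  WS-TRAIL        PIC 9(3).
01  WS-POS          PIC 9(3).
01  WS-FLD-START    PIC 9(3).
01  WS-FLD-LEN      PIC 9(3).
01  WS-IDX          PIC 99.
01  WS-FIELD-MAP.
    05  WS-FM-COUNT     PIC 99 VALUE ZERO.
    05  WS-FM-ENTRY     OCCURS 10 TIMES.
        10  WS-FM-START PIC 9(3).
        10  WS-FM-LEN   PIC 9(3).
PROCEDURE DIVISION.
MAIN-PARA.
    *> Count trailing spaces so the padding is never scanned.
    MOVE 0 TO WS-TRAIL
    INSPECT FUNCTION REVERSE(WS-DESC)
        TALLYING WS-TRAIL FOR LEADING SPACES
    COMPUTE WS-DESC-LEN = FUNCTION LENGTH(WS-DESC) - WS-TRAIL
    *> Pointer scan: each slash closes the field that began
    *> at WS-FLD-START.
    MOVE 1 TO WS-FLD-START
    PERFORM VARYING WS-POS FROM 1 BY 1
            UNTIL WS-POS > WS-DESC-LEN
        IF WS-DESC(WS-POS:1) = "/"
            PERFORM RECORD-FIELD
            COMPUTE WS-FLD-START = WS-POS + 1
        END-IF
    END-PERFORM
    *> The last field has no closing slash; WS-POS now sits
    *> one byte past the end of the data.
    PERFORM RECORD-FIELD
    PERFORM VARYING WS-IDX FROM 1 BY 1
            UNTIL WS-IDX > WS-FM-COUNT
        DISPLAY WS-IDX ": "
            WS-DESC(WS-FM-START(WS-IDX):WS-FM-LEN(WS-IDX))
    END-PERFORM
    STOP RUN.
RECORD-FIELD.
    COMPUTE WS-FLD-LEN = WS-POS - WS-FLD-START
    *> Assumption: skip empty fields, and never write past
    *> the 10-entry table.
    IF WS-FLD-LEN > 0 AND WS-FM-COUNT < 10
        ADD 1 TO WS-FM-COUNT
        MOVE WS-FLD-START TO WS-FM-START(WS-FM-COUNT)
        MOVE WS-FLD-LEN   TO WS-FM-LEN(WS-FM-COUNT)
    END-IF.
```

For the sample ATM description this records five entries, starting with position 1, length 3 for the channel. Because the table only ever receives positions that the scan has already confirmed lie inside the trimmed data with a nonzero length, the later reference modifications that read through it stay within bounds.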

After parsing, the program searches the extracted key-value pairs for specific keys (MERCHANT, REF, EMPLOYER, etc.) to populate the statement detail fields.
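
The second pass and the key lookup could look roughly like the following. To keep the sketch short it assumes the detail fields have already been copied out of the description into a small table; WS-DETAIL, WS-KV-TABLE, WS-WANTED-KEY, the 30-byte and 12-byte field sizes, and the paragraphs SPLIT-DETAIL and FIND-KEY are all hypothetical, and the source is free format.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. KVSPLIT.
DATA DIVISION.
WORKING-STORAGE SECTION.
*> Assumption: detail fields already extracted by the first pass.
01  WS-DETAILS.
    05  FILLER PIC X(30) VALUE "MERCHANT:WHOLE-FOODS".
    05  FILLER PIC X(30) VALUE "REF:POS99281".
    05  FILLER PIC X(30) VALUE "AMT:87.43".
01  WS-DETAIL-TBL REDEFINES WS-DETAILS.
    05  WS-DETAIL       PIC X(30) OCCURS 3 TIMES.
01  WS-DETAIL-COUNT     PIC 9 VALUE 3.
01  WS-KV-TABLE.
    05  WS-KV-ENTRY     OCCURS 3 TIMES.
        10  WS-KV-KEY   PIC X(12).
        10  WS-KV-VALUE PIC X(30).
01  WS-IDX              PIC 9.
01  WS-POS              PIC 99.
01  WS-COLON            PIC 99.
01  WS-FOUND            PIC 9.
01  WS-WANTED-KEY       PIC X(12).
PROCEDURE DIVISION.
MAIN-PARA.
    PERFORM VARYING WS-IDX FROM 1 BY 1
            UNTIL WS-IDX > WS-DETAIL-COUNT
        PERFORM SPLIT-DETAIL
    END-PERFORM
    *> Look fields up by key name, never by position.
    MOVE "REF" TO WS-WANTED-KEY
    PERFORM FIND-KEY
    MOVE "EMPLOYER" TO WS-WANTED-KEY
    PERFORM FIND-KEY
    STOP RUN.
SPLIT-DETAIL.
    MOVE SPACES TO WS-KV-KEY(WS-IDX) WS-KV-VALUE(WS-IDX)
    MOVE 0 TO WS-COLON
    *> Find the first colon in this 30-byte detail field.
    PERFORM VARYING WS-POS FROM 1 BY 1
            UNTIL WS-POS > 30 OR WS-COLON > 0
        IF WS-DETAIL(WS-IDX)(WS-POS:1) = ":"
            MOVE WS-POS TO WS-COLON
        END-IF
    END-PERFORM
    *> Boundary check: at least one byte on each side of the colon.
    IF WS-COLON > 1 AND WS-COLON < 30
        MOVE WS-DETAIL(WS-IDX)(1:WS-COLON - 1)
            TO WS-KV-KEY(WS-IDX)
        MOVE WS-DETAIL(WS-IDX)(WS-COLON + 1:)
            TO WS-KV-VALUE(WS-IDX)
    END-IF.
FIND-KEY.
    MOVE 0 TO WS-FOUND
    PERFORM VARYING WS-IDX FROM 1 BY 1
            UNTIL WS-IDX > WS-DETAIL-COUNT OR WS-FOUND > 0
        IF WS-KV-KEY(WS-IDX) = WS-WANTED-KEY
            MOVE WS-IDX TO WS-FOUND
        END-IF
    END-PERFORM
    IF WS-FOUND > 0
        DISPLAY WS-WANTED-KEY "=" WS-KV-VALUE(WS-FOUND)
    ELSE
        DISPLAY WS-WANTED-KEY " NOT PRESENT"
    END-IF.
```

The lookup for REF succeeds and the lookup for EMPLOYER reports the key as absent, which is the expected outcome for an optional field on a POS purchase. A detail field with no colon, or with nothing on one side of it, leaves its table entry blank rather than triggering an out-of-range reference modification.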

Results

The parser processed all 1.2 million descriptions in under 90 seconds, well within the batch window. No parsing errors occurred in the first month of production, a result attributed to the comprehensive boundary checking that Maria required.

Lessons Learned

Reference modification boundary checks add negligible overhead — less than 2% of total parse time — but prevent catastrophic data corruption
The field-map intermediate structure made the code far more readable than a single-pass approach would have been
Key-value ordering varies — never assume a specific order; always search by key name

Discussion Questions

How would you modify the parser to handle descriptions that contain literal slash characters in values (e.g., MERCHANT:7-ELEVEN/FOOD)?
What would be the impact of using UNSTRING instead of reference modification for this use case?
How should the parser handle malformed descriptions (missing channel, no delimiters)?
If the description format were changed to JSON in a future modernization, how would the parsing approach change?