> "When you absolutely need to reach into a data item and extract exactly the bytes you want — no more, no less — reference modification is your scalpel. It is precise, powerful, and unforgiving of errors." — Priya Kapoor, Architect, GlobalBank
In This Chapter
- 19.1 Reference Modification Fundamentals
- 19.2 The LENGTH OF Special Register
- 19.3 Dynamic Substring Operations
- 19.4 Parsing Without UNSTRING: Reference Modification Patterns
- 19.5 Advanced Parsing Patterns with Reference Modification
- 19.6 Production Patterns: Multi-Record Parsing
- 19.7 Pointer-Based Processing: ADDRESS OF
- 19.6 Processing Variable-Format Records
- 19.7 STRING Statement with POINTER Phrase
- 19.8 UNSTRING with POINTER Phrase
- 19.9 Advanced Technique: Building a Generic Field Extractor
- 19.10 Reference Modification in CICS and Online Programs
- 19.11 GlobalBank Case Study: Transaction Description Parser
- 19.12 MedClaim Case Study: EDI 837 Claim Parsing
- 19.12 Complete Worked Example: CSV to Fixed-Width Converter
- 19.13 Defensive Programming for Reference Modification
- 19.13 The Student Mainframe Lab
- 19.15 Performance Considerations for Reference Modification
- 19.16 Chapter Summary
Chapter 19: Pointer and Reference Modification
Most COBOL data manipulation works at the field level: you MOVE entire fields, INSPECT entire fields, STRING and UNSTRING entire fields. But there are times when you need finer control — when you need to extract the third through seventh characters of a field, or build an output string one piece at a time without knowing in advance how long each piece will be, or parse a variable-format record where field boundaries shift depending on the data.
Reference modification gives you byte-level access to any alphanumeric data item. Combined with the LENGTH OF special register and pointer-based techniques, it provides the tools you need for advanced data manipulation — the same kind of substring and pointer operations that programmers in C or Java take for granted.
In this chapter, we explore reference modification in depth, develop practical patterns for parsing and building dynamic strings, and apply these techniques to real-world problems at both GlobalBank and MedClaim.
19.1 Reference Modification Fundamentals
Reference modification allows you to refer to a substring of any alphanumeric or national data item by specifying a starting position and, optionally, a length.
Basic Syntax
identifier(starting-position : length)
- starting-position: An arithmetic expression that evaluates to a positive integer. Position 1 is the leftmost byte.
- length: An optional arithmetic expression that evaluates to a positive integer. If omitted, the substring extends from the starting position to the end of the item.
Simple Examples
01 WS-FULL-NAME PIC X(30) VALUE "CHEN, MARIA L.".
01 WS-FIRST-CHAR PIC X(1).
01 WS-LAST-NAME PIC X(15).
01 WS-SUBSTRING PIC X(10).
PROCEDURE DIVISION.
MOVE WS-FULL-NAME(1:4) TO WS-LAST-NAME
*> WS-LAST-NAME = "CHEN" (positions 1-4)
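MOVE WS-FULL-NAME(1:1) TO WS-FIRST-CHAR
*> WS-FIRST-CHAR = "C" (position 1)
MOVE WS-FULL-NAME(7:5) TO WS-SUBSTRING
*> WS-SUBSTRING = "MARIA" (positions 7-11), space-padded
MOVE WS-FULL-NAME(7:) TO WS-SUBSTRING
*> Positions 7 through 30 are sent; the 10-byte receiver
*> keeps the first 10: "MARIA L.  "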
Rules and Constraints
- The starting position must be >= 1.
- The starting position plus length minus 1 must not exceed the total length of the data item.
- Both starting position and length must evaluate to positive integers.
- Reference modification can be applied to any alphanumeric, alphabetic, or national item — including group items.
- Reference modification can be used anywhere an identifier is valid: MOVE, IF, EVALUATE, STRING, DISPLAY, etc.
⚠️ Defensive Programming Alert: The compiler does not generate runtime boundary checks for reference modification unless you enable the SSRANGE (or equivalent) option. Out-of-range reference modification silently reads or writes memory outside the data item, causing data corruption or abends that are extremely difficult to diagnose.
Arithmetic Expressions in Reference Modification
The starting position and length can be arithmetic expressions:
01 WS-POS PIC 99 VALUE 1.
01 WS-LEN PIC 99 VALUE 5.
01 WS-DATA PIC X(50).
MOVE WS-DATA(WS-POS:WS-LEN) TO WS-OUTPUT
*> Start 3 positions past current, take 10 bytes
MOVE WS-DATA(WS-POS + 3 : WS-LEN + 5)
TO WS-OUTPUT
*> Dynamic extraction based on calculated values
COMPUTE WS-POS = WS-FIELD-OFFSET + 1
COMPUTE WS-LEN = WS-FIELD-LENGTH
MOVE WS-DATA(WS-POS:WS-LEN) TO WS-EXTRACTED
Reference Modification on Group Items
You can apply reference modification to group items, treating the entire group as a string of bytes:
01 WS-CUSTOMER-RECORD.
05 WS-CUST-ID PIC 9(6).
05 WS-CUST-NAME PIC X(30).
05 WS-CUST-ADDR PIC X(50).
*> Extract bytes 7 through 36 (the name field)
MOVE WS-CUSTOMER-RECORD(7:30) TO WS-NAME-ONLY
*> This is equivalent to:
MOVE WS-CUST-NAME TO WS-NAME-ONLY
While extracting named fields this way is pointless (just use the field name), it becomes valuable when processing records with variable layouts or when the field boundaries are data-driven.
19.2 The LENGTH OF Special Register
The LENGTH OF special register returns the number of bytes allocated to a data item. It is evaluated at compile time for fixed-length items and at runtime for variable-length items (those with OCCURS DEPENDING ON).
01 WS-RECORD PIC X(100).
01 WS-REC-LENGTH PIC 9(5).
MOVE LENGTH OF WS-RECORD TO WS-REC-LENGTH
*> WS-REC-LENGTH = 100
01 WS-TABLE-AREA.
05 WS-COUNT PIC 9(3) COMP.
05 WS-ITEM PIC X(20) OCCURS 1 TO 50 TIMES
DEPENDING ON WS-COUNT.
MOVE 10 TO WS-COUNT
MOVE LENGTH OF WS-TABLE-AREA TO WS-REC-LENGTH
*> WS-REC-LENGTH = 2 + (10 * 20) = 202
Using LENGTH OF with Reference Modification
LENGTH OF is invaluable for building safe reference modification logic:
IF WS-POS + WS-LEN - 1 > LENGTH OF WS-DATA
DISPLAY "REF MOD OUT OF RANGE"
DISPLAY "POS=" WS-POS " LEN=" WS-LEN
" MAX=" LENGTH OF WS-DATA
PERFORM 9900-ERROR-HANDLER
ELSE
MOVE WS-DATA(WS-POS:WS-LEN) TO WS-OUTPUT
END-IF
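The check above can be packaged as a reusable paragraph. The following is a minimal sketch, assuming a fixed-format compiler such as GnuCOBOL or a recent Enterprise COBOL; the program name SAFESUB, the paragraph name SAFE-EXTRACT, and the sample value are illustrative and not part of the chapter's programs. It also rejects a position or length below 1, which the upper-bound test alone does not catch.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SAFESUB.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-DATA         PIC X(20) VALUE "GLOBALBANK-ACCT-0042".
       01  WS-OUTPUT       PIC X(20).
       01  WS-POS          PIC 9(3).
       01  WS-LEN          PIC 9(3).
       PROCEDURE DIVISION.
       MAIN-PARA.
      *> In range: bytes 12 through 15
           MOVE 12 TO WS-POS
           MOVE 4  TO WS-LEN
           PERFORM SAFE-EXTRACT
      *> Out of range: 17 + 10 - 1 = 26, past the 20-byte field
           MOVE 17 TO WS-POS
           MOVE 10 TO WS-LEN
           PERFORM SAFE-EXTRACT
           STOP RUN.

       SAFE-EXTRACT.
      *> Validate both bounds before any reference modification
           MOVE SPACES TO WS-OUTPUT
           IF WS-POS < 1 OR WS-LEN < 1
              OR WS-POS + WS-LEN - 1 > LENGTH OF WS-DATA
               DISPLAY "REF MOD OUT OF RANGE: POS=" WS-POS
                       " LEN=" WS-LEN
           ELSE
               MOVE WS-DATA(WS-POS:WS-LEN) TO WS-OUTPUT
               DISPLAY "EXTRACTED: [" WS-OUTPUT "]"
           END-IF.
```

The first call extracts the four bytes "ACCT"; the second is rejected before the MOVE is attempted. Centralizing the test in one paragraph keeps every caller honest, at the cost of routing extractions through shared work fields.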
FUNCTION LENGTH vs. LENGTH OF
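COBOL provides both LENGTH OF (a special register) and FUNCTION LENGTH (an intrinsic function). For alphanumeric items they return the same byte count; they differ for national (PIC N) items, where LENGTH OF counts bytes and FUNCTION LENGTH counts national character positions: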
MOVE LENGTH OF WS-FIELD TO WS-LEN
*> Returns allocated byte length
MOVE FUNCTION LENGTH(WS-FIELD) TO WS-LEN
*> Also returns byte length, but can be used in
*> arithmetic expressions more naturally
COMPUTE WS-LAST = FUNCTION LENGTH(WS-FIELD)
MOVE WS-FIELD(WS-LAST:1) TO WS-LAST-CHAR
*> Gets the last character of the field
FUNCTION LENGTH can also be applied to literals:
MOVE FUNCTION LENGTH("HELLO WORLD") TO WS-LEN
*> WS-LEN = 11
19.3 Dynamic Substring Operations
Reference modification becomes truly powerful when combined with variables for position and length, enabling dynamic data extraction.
Scanning for a Delimiter
A common pattern is scanning a field for a specific character and extracting what comes before and after it:
01 WS-INPUT PIC X(80).
01 WS-PART-1 PIC X(80).
01 WS-PART-2 PIC X(80).
01 WS-SCAN-POS PIC 99.
01 WS-DELIM-POS PIC 99 VALUE ZERO.
01 WS-INPUT-LEN PIC 99.
PERFORM SPLIT-ON-COMMA.
SPLIT-ON-COMMA.
MOVE SPACES TO WS-PART-1 WS-PART-2
MOVE FUNCTION LENGTH(WS-INPUT) TO WS-INPUT-LEN
* Find the comma
MOVE ZERO TO WS-DELIM-POS
PERFORM VARYING WS-SCAN-POS FROM 1 BY 1
UNTIL WS-SCAN-POS > WS-INPUT-LEN
OR WS-DELIM-POS > ZERO
IF WS-INPUT(WS-SCAN-POS:1) = ","
MOVE WS-SCAN-POS TO WS-DELIM-POS
END-IF
END-PERFORM
IF WS-DELIM-POS > ZERO
* Extract before comma
IF WS-DELIM-POS > 1
MOVE WS-INPUT(1:WS-DELIM-POS - 1)
TO WS-PART-1
END-IF
* Extract after comma
IF WS-DELIM-POS < WS-INPUT-LEN
MOVE WS-INPUT(WS-DELIM-POS + 1:)
TO WS-PART-2
END-IF
ELSE
* No comma found - entire input is part 1
MOVE WS-INPUT TO WS-PART-1
END-IF
.
💡 Why not use UNSTRING? UNSTRING is often cleaner for simple delimiter-based splitting. But reference modification gives you more control: you can handle multiple delimiter types, skip quoted sections, process nested delimiters, or extract fixed-position fields from variable-format records where UNSTRING would be awkward.
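For comparison, here is a sketch of the same comma split written with UNSTRING. It reuses the field names from the example above; the program name SPLITUNS, the sample value, and the WS-FIELD-COUNT counter are illustrative assumptions.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SPLITUNS.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-INPUT           PIC X(80) VALUE "CHEN,MARIA".
       01  WS-PART-1          PIC X(80).
       01  WS-PART-2          PIC X(80).
       01  WS-FIELD-COUNT     PIC 9(2) VALUE ZERO.
       PROCEDURE DIVISION.
       MAIN-PARA.
      *> Receivers are not cleared by UNSTRING when no data
      *> reaches them, so initialize them first
           MOVE SPACES TO WS-PART-1 WS-PART-2
           MOVE ZERO TO WS-FIELD-COUNT
           UNSTRING WS-INPUT DELIMITED BY ","
               INTO WS-PART-1 WS-PART-2
               TALLYING IN WS-FIELD-COUNT
               ON OVERFLOW
                   DISPLAY "MORE THAN ONE COMMA IN INPUT"
           END-UNSTRING
           DISPLAY "PART 1: [" WS-PART-1(1:20) "]"
           DISPLAY "PART 2: [" WS-PART-2(1:20) "]"
           DISPLAY "FIELDS FILLED: " WS-FIELD-COUNT
           STOP RUN.
```

One statement replaces the scan loop, which is why UNSTRING is usually the better choice for a plain split. The reference modification version earns its extra lines only when the scan has to make decisions UNSTRING cannot express, such as ignoring a comma inside quotation marks.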
Extracting Fixed Fields from Variable Positions
Consider a record where the first 2 bytes indicate a record type, and the layout of the remaining bytes depends on the type:
01 WS-VARIABLE-RECORD PIC X(200).
01 WS-REC-TYPE PIC X(2).
01 WS-FIELD-A PIC X(20).
01 WS-FIELD-B PIC X(10).
01 WS-START PIC 9(3).
MOVE WS-VARIABLE-RECORD(1:2) TO WS-REC-TYPE
EVALUATE WS-REC-TYPE
WHEN "01"
*> Type 01: name at pos 3-22, code at pos 23-32
MOVE WS-VARIABLE-RECORD(3:20) TO WS-FIELD-A
MOVE WS-VARIABLE-RECORD(23:10) TO WS-FIELD-B
WHEN "02"
*> Type 02: code at pos 3-12, name at pos 13-32
MOVE WS-VARIABLE-RECORD(13:20) TO WS-FIELD-A
MOVE WS-VARIABLE-RECORD(3:10) TO WS-FIELD-B
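WHEN "03"
*> Type 03: 2-byte length at pos 3-4, then name from pos 5
*> (WS-START holds the name length here, not a position)
MOVE WS-VARIABLE-RECORD(3:2) TO WS-START
*> A zero or non-numeric length is not a valid
*> reference modification length, so guard the MOVE
IF WS-START IS NUMERIC AND WS-START > ZERO
MOVE WS-VARIABLE-RECORD(5:WS-START)
TO WS-FIELD-A
END-IF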
END-EVALUATE
19.4 Parsing Without UNSTRING: Reference Modification Patterns
Reference modification enables powerful parsing patterns that go beyond what UNSTRING can do easily.
Pattern 1: Token Extraction with Variable-Width Delimiters
*------------------------------------------------------------
* Extract tokens from a pipe-delimited string where
* fields can be empty (consecutive pipes)
*------------------------------------------------------------
01 WS-INPUT-STRING PIC X(200).
01 WS-TOKENS.
05 WS-TOKEN PIC X(50) OCCURS 10 TIMES.
01 WS-PARSE-VARS.
05 WS-CURR-POS PIC 9(3) VALUE 1.
05 WS-TOKEN-START PIC 9(3).
05 WS-TOKEN-NUM PIC 9(2) VALUE 1.
05 WS-MAX-LEN PIC 9(3).
05 WS-TOKEN-LEN PIC 9(3).
PARSE-PIPE-DELIMITED.
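* Use the trimmed length so the trailing pad spaces are
* not counted as part of the last token
MOVE FUNCTION LENGTH(
FUNCTION TRIM(WS-INPUT-STRING TRAILING))
TO WS-MAX-LEN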
MOVE 1 TO WS-CURR-POS
MOVE 1 TO WS-TOKEN-NUM
MOVE SPACES TO WS-TOKENS
PERFORM UNTIL WS-CURR-POS > WS-MAX-LEN
OR WS-TOKEN-NUM > 10
MOVE WS-CURR-POS TO WS-TOKEN-START
* Scan for next pipe or end of string
PERFORM UNTIL WS-CURR-POS > WS-MAX-LEN
IF WS-INPUT-STRING(WS-CURR-POS:1)
= "|"
EXIT PERFORM
END-IF
ADD 1 TO WS-CURR-POS
END-PERFORM
* Extract token
COMPUTE WS-TOKEN-LEN =
WS-CURR-POS - WS-TOKEN-START
IF WS-TOKEN-LEN > ZERO
AND WS-TOKEN-LEN <= 50
MOVE WS-INPUT-STRING(
WS-TOKEN-START:WS-TOKEN-LEN)
TO WS-TOKEN(WS-TOKEN-NUM)
END-IF
ADD 1 TO WS-TOKEN-NUM
ADD 1 TO WS-CURR-POS
END-PERFORM
.
Pattern 2: Right-to-Left Parsing
Sometimes you need to parse from the right — for example, extracting a file extension from a filename:
01 WS-FILENAME PIC X(80).
01 WS-EXTENSION PIC X(10).
01 WS-BASENAME PIC X(70).
01 WS-SCAN PIC 9(3).
01 WS-DOT-POS PIC 9(3) VALUE ZERO.
01 WS-NAME-LEN PIC 9(3).
EXTRACT-EXTENSION.
MOVE SPACES TO WS-EXTENSION WS-BASENAME
MOVE FUNCTION LENGTH(WS-FILENAME)
TO WS-NAME-LEN
* Find rightmost period by scanning backward
MOVE ZERO TO WS-DOT-POS
PERFORM VARYING WS-SCAN
FROM WS-NAME-LEN BY -1
UNTIL WS-SCAN < 1
OR WS-DOT-POS > ZERO
IF WS-FILENAME(WS-SCAN:1) = "."
MOVE WS-SCAN TO WS-DOT-POS
END-IF
END-PERFORM
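* Require WS-DOT-POS > 1: a dot in position 1 would give
* a zero-length basename in WS-FILENAME(1:WS-DOT-POS - 1)
IF WS-DOT-POS > 1 AND < WS-NAME-LEN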
MOVE WS-FILENAME(WS-DOT-POS + 1:
WS-NAME-LEN - WS-DOT-POS)
TO WS-EXTENSION
MOVE WS-FILENAME(1:WS-DOT-POS - 1)
TO WS-BASENAME
ELSE
MOVE WS-FILENAME TO WS-BASENAME
MOVE SPACES TO WS-EXTENSION
END-IF
.
Pattern 3: Building Dynamic Output with a Pointer
A common and extremely useful pattern is building an output string piece by piece using a position pointer:
01 WS-OUTPUT-LINE PIC X(132).
01 WS-OUT-PTR PIC 9(3) VALUE 1.
01 WS-PIECE-LEN PIC 9(3).
BUILD-OUTPUT-LINE.
MOVE SPACES TO WS-OUTPUT-LINE
MOVE 1 TO WS-OUT-PTR
* Append account number
MOVE WS-ACCT-NUM TO
WS-OUTPUT-LINE(WS-OUT-PTR:10)
ADD 10 TO WS-OUT-PTR
* Append separator
MOVE " | " TO
WS-OUTPUT-LINE(WS-OUT-PTR:3)
ADD 3 TO WS-OUT-PTR
* Append name (trimmed)
MOVE FUNCTION LENGTH(
FUNCTION TRIM(WS-CUST-NAME TRAILING))
TO WS-PIECE-LEN
IF WS-PIECE-LEN > ZERO
MOVE FUNCTION TRIM(
WS-CUST-NAME TRAILING)
TO WS-OUTPUT-LINE(
WS-OUT-PTR:WS-PIECE-LEN)
ADD WS-PIECE-LEN TO WS-OUT-PTR
END-IF
* Append separator
MOVE " | " TO
WS-OUTPUT-LINE(WS-OUT-PTR:3)
ADD 3 TO WS-OUT-PTR
* Append formatted balance
MOVE WS-FORMATTED-BAL TO
WS-OUTPUT-LINE(WS-OUT-PTR:15)
ADD 15 TO WS-OUT-PTR
.
This pattern produces tightly packed output without the wasted space that occurs when moving fixed-length fields to fixed positions. It is the COBOL equivalent of string concatenation in modern languages.
📊 Performance Note: The pointer-based build pattern can be more efficient than repeated STRING...DELIMITED BY operations for building complex output lines, because STRING must scan each sending field for its delimiter unless DELIMITED BY SIZE is used. Reference modification with a pointer goes directly to the target position.
19.5 Advanced Parsing Patterns with Reference Modification
Before we cover pointer-based processing, let us explore several more parsing patterns that arise in production COBOL programs. These patterns combine the scanning, extracting, and building techniques from the previous sections into reusable solutions for common data manipulation problems.
Pattern 4: Nested Delimiter Parsing
Some data formats use multiple delimiter levels. For example, GlobalBank's batch control records use pipes to separate fields and colons to separate sub-fields:
CTRL|BATCH:20240615:001|STATUS:OPEN|RECORDS:15234|AMOUNT:4523891.77
The parser must handle two levels: first split on pipes, then split selected fields on colons:
01 WS-CTRL-RECORD PIC X(200).
01 WS-LEVEL-1.
05 WS-L1-COUNT PIC 99 VALUE ZERO.
05 WS-L1-FIELD PIC X(60) OCCURS 10 TIMES.
01 WS-LEVEL-2.
05 WS-L2-COUNT PIC 99 VALUE ZERO.
05 WS-L2-FIELD PIC X(30) OCCURS 5 TIMES.
01 WS-PARSE-CTL.
05 WS-PC-POS PIC 9(3).
05 WS-PC-START PIC 9(3).
05 WS-PC-LEN PIC 9(3).
05 WS-PC-MAX PIC 9(3).
05 WS-PC-DELIM PIC X(1).
PARSE-CTRL-RECORD.
* Level 1: Split on pipes
MOVE "|" TO WS-PC-DELIM
MOVE WS-CTRL-RECORD TO WS-WORK-BUFFER
PERFORM SPLIT-ON-DELIMITER
MOVE WS-L1-COUNT TO WS-SAVE-L1-COUNT
* Copy results to level-1 array
PERFORM VARYING WS-PC-POS FROM 1 BY 1
UNTIL WS-PC-POS > WS-L1-COUNT
MOVE WS-SPLIT-RESULT(WS-PC-POS)
TO WS-L1-FIELD(WS-PC-POS)
END-PERFORM
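* Level 2: Split field 2 on colons
MOVE ":" TO WS-PC-DELIM
MOVE WS-L1-FIELD(2) TO WS-WORK-BUFFER
PERFORM SPLIT-ON-DELIMITER
* Copy the three sub-fields to the level-2 array
PERFORM VARYING WS-PC-POS FROM 1 BY 1
UNTIL WS-PC-POS > 3
MOVE WS-SPLIT-RESULT(WS-PC-POS)
TO WS-L2-FIELD(WS-PC-POS)
END-PERFORM
* Now WS-L2-FIELD(1) = "BATCH"
* WS-L2-FIELD(2) = "20240615"
* WS-L2-FIELD(3) = "001"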
.
The SPLIT-ON-DELIMITER paragraph is a reusable component that accepts any single-character delimiter and populates a result array. Building reusable parsing components is a hallmark of well-designed COBOL programs.
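The chapter does not list SPLIT-ON-DELIMITER itself, so the following is only one possible sketch of it. The names WS-WORK-BUFFER, WS-SPLIT-RESULT, and WS-PC-DELIM come from the calling code above; their sizes, the WS-SPLIT-COUNT counter, the WS-IDX subscript, and the program name SPLITDLM are assumptions made for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SPLITDLM.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-WORK-BUFFER      PIC X(200).
       01  WS-PC-DELIM         PIC X(1).
       01  WS-SPLIT-COUNT      PIC 99 VALUE ZERO.
       01  WS-SPLIT-TABLE.
           05  WS-SPLIT-RESULT PIC X(60) OCCURS 10 TIMES.
       01  WS-PARSE-CTL.
           05  WS-PC-POS       PIC 9(3).
           05  WS-PC-START     PIC 9(3).
           05  WS-PC-LEN       PIC 9(3).
           05  WS-PC-MAX       PIC 9(3).
       01  WS-IDX              PIC 99.
       PROCEDURE DIVISION.
       MAIN-PARA.
           MOVE "BATCH:20240615:001" TO WS-WORK-BUFFER
           MOVE ":" TO WS-PC-DELIM
           PERFORM SPLIT-ON-DELIMITER
           PERFORM VARYING WS-IDX FROM 1 BY 1
                   UNTIL WS-IDX > WS-SPLIT-COUNT
               DISPLAY WS-IDX ": ["
                       WS-SPLIT-RESULT(WS-IDX)(1:20) "]"
           END-PERFORM
           STOP RUN.

       SPLIT-ON-DELIMITER.
           MOVE ZERO TO WS-SPLIT-COUNT
           MOVE SPACES TO WS-SPLIT-TABLE
      *> Trimmed length: trailing pad is not part of the last piece
           MOVE FUNCTION LENGTH(
               FUNCTION TRIM(WS-WORK-BUFFER TRAILING)) TO WS-PC-MAX
           MOVE 1 TO WS-PC-POS
           PERFORM UNTIL WS-PC-POS > WS-PC-MAX
                      OR WS-SPLIT-COUNT >= 10
               MOVE WS-PC-POS TO WS-PC-START
      *> Advance to the next delimiter or the end of the data
               PERFORM UNTIL WS-PC-POS > WS-PC-MAX
                   IF WS-WORK-BUFFER(WS-PC-POS:1) = WS-PC-DELIM
                       EXIT PERFORM
                   END-IF
                   ADD 1 TO WS-PC-POS
               END-PERFORM
               COMPUTE WS-PC-LEN = WS-PC-POS - WS-PC-START
               ADD 1 TO WS-SPLIT-COUNT
      *> Empty or oversized pieces leave the slot as spaces
               IF WS-PC-LEN > ZERO AND WS-PC-LEN <= 60
                   MOVE WS-WORK-BUFFER(WS-PC-START:WS-PC-LEN)
                     TO WS-SPLIT-RESULT(WS-SPLIT-COUNT)
               END-IF
               ADD 1 TO WS-PC-POS
           END-PERFORM.
```

Because the paragraph reports how many pieces it found, a caller can copy exactly that many results instead of assuming a fixed count. In this sketch an empty field after a trailing delimiter is not reported; a production version would need to decide how to treat that case.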
Pattern 5: Fixed-Width Field Extraction with a Field Map
When processing records from multiple sources with different layouts, a field map table eliminates hard-coded positions:
01 WS-FIELD-MAP.
05 WS-FM-COUNT PIC 99 VALUE ZERO.
05 WS-FM-ENTRY OCCURS 20 TIMES.
10 WS-FM-NAME PIC X(15).
10 WS-FM-START PIC 9(3).
10 WS-FM-LENGTH PIC 9(3).
01 WS-INPUT-RECORD PIC X(500).
01 WS-EXTRACTED PIC X(100).
EXTRACT-BY-FIELD-MAP.
PERFORM VARYING WS-FM-IDX FROM 1 BY 1
UNTIL WS-FM-IDX > WS-FM-COUNT
* Boundary check
IF WS-FM-START(WS-FM-IDX) +
WS-FM-LENGTH(WS-FM-IDX) - 1
> FUNCTION LENGTH(WS-INPUT-RECORD)
DISPLAY "FIELD MAP OVERFLOW: "
WS-FM-NAME(WS-FM-IDX)
PERFORM 9100-LOG-ERROR
ELSE
MOVE SPACES TO WS-EXTRACTED
MOVE WS-INPUT-RECORD(
WS-FM-START(WS-FM-IDX):
WS-FM-LENGTH(WS-FM-IDX))
TO WS-EXTRACTED
DISPLAY WS-FM-NAME(WS-FM-IDX)
" = [" WS-EXTRACTED "]"
END-IF
END-PERFORM
.
This data-driven approach means you can support new record layouts by loading a different field map — no program changes required. It is the COBOL equivalent of a schema or metadata definition.
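To make the "load a different field map" step concrete, here is a small sketch that fills two map entries at run time and then runs the same extraction loop. The two-field layout, the sample record, and the program name FLDMAP are hypothetical; a real program would presumably load the entries from a file or table rather than from MOVE statements.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FLDMAP.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-FIELD-MAP.
           05  WS-FM-COUNT        PIC 99 VALUE ZERO.
           05  WS-FM-ENTRY OCCURS 20 TIMES.
               10  WS-FM-NAME     PIC X(15).
               10  WS-FM-START    PIC 9(3).
               10  WS-FM-LENGTH   PIC 9(3).
       01  WS-FM-IDX              PIC 99.
       01  WS-INPUT-RECORD        PIC X(500).
       01  WS-EXTRACTED           PIC X(100).
       PROCEDURE DIVISION.
       MAIN-PARA.
      *> Hypothetical layout: ID in bytes 1-6, name in bytes 7-26
           MOVE "CUST-ID"   TO WS-FM-NAME(1)
           MOVE 1           TO WS-FM-START(1)
           MOVE 6           TO WS-FM-LENGTH(1)
           MOVE "CUST-NAME" TO WS-FM-NAME(2)
           MOVE 7           TO WS-FM-START(2)
           MOVE 20          TO WS-FM-LENGTH(2)
           MOVE 2           TO WS-FM-COUNT
           MOVE "004217CHEN, MARIA L." TO WS-INPUT-RECORD
           PERFORM VARYING WS-FM-IDX FROM 1 BY 1
                   UNTIL WS-FM-IDX > WS-FM-COUNT
      *> Reject entries that fall outside either field
               IF WS-FM-START(WS-FM-IDX) < 1
                  OR WS-FM-LENGTH(WS-FM-IDX) < 1
                  OR WS-FM-LENGTH(WS-FM-IDX) > LENGTH OF WS-EXTRACTED
                  OR WS-FM-START(WS-FM-IDX)
                     + WS-FM-LENGTH(WS-FM-IDX) - 1
                     > LENGTH OF WS-INPUT-RECORD
                   DISPLAY "BAD MAP ENTRY: " WS-FM-NAME(WS-FM-IDX)
               ELSE
                   MOVE SPACES TO WS-EXTRACTED
                   MOVE WS-INPUT-RECORD(
                       WS-FM-START(WS-FM-IDX):
                       WS-FM-LENGTH(WS-FM-IDX))
                     TO WS-EXTRACTED
                   DISPLAY WS-FM-NAME(WS-FM-IDX) " = ["
                       WS-EXTRACTED(1:WS-FM-LENGTH(WS-FM-IDX)) "]"
               END-IF
           END-PERFORM
           STOP RUN.
```

Supporting a second layout then means changing only the six MOVE statements that populate the map, or the external source they stand in for. The validation is done per entry because the map is data and can be wrong in ways a hard-coded position cannot.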
📊 Production Pattern: MedClaim's EDI processing system uses field maps stored in DB2 tables. When a new payer sends claims in a slightly different format, Sarah Kim updates the field map rows rather than requesting a program change. James Okafor estimates this has prevented over 40 change requests in the past year.
Pattern 6: Hexadecimal Dump with Reference Modification
For debugging, a hexadecimal dump of a data item can be invaluable. Reference modification makes this straightforward:
01 WS-HEX-TABLE.
05 FILLER PIC X(16)
VALUE "0123456789ABCDEF".
01 WS-HEX-CHARS REDEFINES WS-HEX-TABLE.
05 WS-HEX-CHAR PIC X(1) OCCURS 16 TIMES.
01 WS-DUMP-LINE PIC X(80).
01 WS-DUMP-PTR PIC 9(3).
01 WS-BYTE-VAL PIC 9(3) COMP.
01 WS-HIGH-NIBBLE PIC 9 COMP.
01 WS-LOW-NIBBLE PIC 9 COMP.
01 WS-DUMP-POS PIC 9(5).
01 WS-DUMP-MAX PIC 9(5).
HEX-DUMP-FIELD.
MOVE FUNCTION LENGTH(WS-TARGET-FIELD)
TO WS-DUMP-MAX
MOVE 1 TO WS-DUMP-POS
PERFORM UNTIL WS-DUMP-POS > WS-DUMP-MAX
MOVE SPACES TO WS-DUMP-LINE
MOVE 1 TO WS-DUMP-PTR
* Position label
STRING WS-DUMP-POS DELIMITED BY SIZE
": " DELIMITED BY SIZE
INTO WS-DUMP-LINE
WITH POINTER WS-DUMP-PTR
END-STRING
* Hex bytes (up to 16 per line)
PERFORM VARYING WS-BYTE-IDX FROM 0 BY 1
UNTIL WS-BYTE-IDX >= 16
OR WS-DUMP-POS + WS-BYTE-IDX
> WS-DUMP-MAX
COMPUTE WS-BYTE-VAL =
FUNCTION ORD(WS-TARGET-FIELD(
WS-DUMP-POS + WS-BYTE-IDX
:1)) - 1
COMPUTE WS-HIGH-NIBBLE =
WS-BYTE-VAL / 16
COMPUTE WS-LOW-NIBBLE =
FUNCTION MOD(WS-BYTE-VAL, 16)
MOVE WS-HEX-CHAR(
WS-HIGH-NIBBLE + 1) TO
WS-DUMP-LINE(WS-DUMP-PTR:1)
ADD 1 TO WS-DUMP-PTR
MOVE WS-HEX-CHAR(
WS-LOW-NIBBLE + 1) TO
WS-DUMP-LINE(WS-DUMP-PTR:1)
ADD 1 TO WS-DUMP-PTR
MOVE " " TO
WS-DUMP-LINE(WS-DUMP-PTR:1)
ADD 1 TO WS-DUMP-PTR
END-PERFORM
DISPLAY WS-DUMP-LINE
ADD 16 TO WS-DUMP-POS
END-PERFORM
.
This hex dump utility is a tool that Derek Washington keeps in his personal copybook library. It has saved him countless hours debugging data corruption issues in production — being able to see the actual hex values of a record reveals problems (embedded binary data, EBCDIC/ASCII issues, packed decimal corruption) that DISPLAY alone cannot show.
Pattern 7: Word Counting and Text Analysis
For report generation and data quality analysis, counting words and analyzing text content is useful:
01 WS-TEXT-INPUT PIC X(200).
01 WS-WORD-COUNT PIC 9(3) VALUE ZERO.
01 WS-IN-WORD-FLAG PIC X(1) VALUE "N".
88 WS-IN-WORD VALUE "Y".
88 WS-NOT-IN-WORD VALUE "N".
01 WS-TEXT-POS PIC 9(3).
01 WS-TEXT-LEN PIC 9(3).
01 WS-CURR-CHAR PIC X(1).
COUNT-WORDS.
MOVE ZERO TO WS-WORD-COUNT
SET WS-NOT-IN-WORD TO TRUE
MOVE FUNCTION LENGTH(
FUNCTION TRIM(WS-TEXT-INPUT TRAILING))
TO WS-TEXT-LEN
PERFORM VARYING WS-TEXT-POS FROM 1 BY 1
UNTIL WS-TEXT-POS > WS-TEXT-LEN
MOVE WS-TEXT-INPUT(WS-TEXT-POS:1)
TO WS-CURR-CHAR
IF WS-CURR-CHAR = SPACE
IF WS-IN-WORD
SET WS-NOT-IN-WORD TO TRUE
END-IF
ELSE
IF WS-NOT-IN-WORD
ADD 1 TO WS-WORD-COUNT
SET WS-IN-WORD TO TRUE
END-IF
END-IF
END-PERFORM
.
💡 Design Insight: Nearly every one of these patterns follows the same fundamental structure: initialize a position variable (to 1, or to the field length when scanning backward as in Pattern 2), step one byte at a time using reference modification, track state (in-word, in-quotes, delimiter position), and act based on what each byte contains. Once you internalize this pattern, you can parse virtually any text format in COBOL.
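The "in-quotes" state mentioned above is not shown in the earlier patterns, so here is a minimal sketch of it: a comma split that ignores commas inside double quotation marks. The program name QUOTESCN, the field names, and the sample line are illustrative assumptions, and escaped quotes inside a quoted field are not handled.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. QUOTESCN.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-LINE            PIC X(80)
           VALUE '1001,"CHEN, MARIA",SAVINGS'.
       01  WS-LINE-LEN        PIC 9(3).
       01  WS-POS             PIC 9(3).
       01  WS-START           PIC 9(3).
       01  WS-LEN             PIC 9(3).
       01  WS-FIELD-NUM       PIC 99 VALUE ZERO.
       01  WS-QUOTE-FLAG      PIC X VALUE "N".
           88  WS-IN-QUOTES       VALUE "Y".
           88  WS-NOT-IN-QUOTES   VALUE "N".
       PROCEDURE DIVISION.
       MAIN-PARA.
           MOVE FUNCTION LENGTH(FUNCTION TRIM(WS-LINE TRAILING))
             TO WS-LINE-LEN
           SET WS-NOT-IN-QUOTES TO TRUE
           MOVE 1 TO WS-START
           PERFORM VARYING WS-POS FROM 1 BY 1
                   UNTIL WS-POS > WS-LINE-LEN
               EVALUATE TRUE
      *> A quotation mark toggles the state
                   WHEN WS-LINE(WS-POS:1) = '"'
                       IF WS-IN-QUOTES
                           SET WS-NOT-IN-QUOTES TO TRUE
                       ELSE
                           SET WS-IN-QUOTES TO TRUE
                       END-IF
      *> A comma ends a field only outside quotes
                   WHEN WS-LINE(WS-POS:1) = ","
                        AND WS-NOT-IN-QUOTES
                       PERFORM EMIT-FIELD
                       COMPUTE WS-START = WS-POS + 1
               END-EVALUATE
           END-PERFORM
      *> The last field has no trailing comma; WS-POS is now
      *> one past the end of the data
           PERFORM EMIT-FIELD
           STOP RUN.

       EMIT-FIELD.
           ADD 1 TO WS-FIELD-NUM
           COMPUTE WS-LEN = WS-POS - WS-START
           IF WS-LEN > ZERO
               DISPLAY WS-FIELD-NUM ": ["
                       WS-LINE(WS-START:WS-LEN) "]"
           ELSE
               DISPLAY WS-FIELD-NUM ": []"
           END-IF.
```

The sample line yields three fields, with the quoted name kept whole and its quotation marks left in place. Stripping the quotes is a matter of adjusting WS-START and WS-LEN by one when the first byte of the field is a quotation mark.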
19.6 Production Patterns: Multi-Record Parsing
In production systems, reference modification is rarely used on a single field in isolation. It is part of a larger processing pipeline where records arrive in variable formats, must be validated, parsed, transformed, and routed. This section presents production-grade patterns that combine reference modification with error handling, logging, and recovery.
Pattern: Record Type Identification and Routing
Many mainframe file formats use the first few bytes of each record to identify the record type. A single file may contain header records, detail records, trailer records, and control records — all with different layouts:
01 WS-INPUT-RECORD PIC X(500).
01 WS-REC-TYPE PIC X(2).
01 WS-REC-SUBTYPE PIC X(3).
01 WS-REC-LEN PIC 9(3).
ROUTE-RECORD.
* Extract record type from first 2 bytes
MOVE WS-INPUT-RECORD(1:2) TO WS-REC-TYPE
* Extract subtype from bytes 3-5
MOVE WS-INPUT-RECORD(3:3) TO WS-REC-SUBTYPE
EVALUATE WS-REC-TYPE
WHEN "HD"
PERFORM PARSE-HEADER-RECORD
WHEN "DT"
EVALUATE WS-REC-SUBTYPE
WHEN "CLM"
PERFORM PARSE-CLAIM-DETAIL
WHEN "SVC"
PERFORM PARSE-SERVICE-LINE
WHEN "ADJ"
PERFORM PARSE-ADJUSTMENT
WHEN OTHER
PERFORM LOG-UNKNOWN-SUBTYPE
END-EVALUATE
WHEN "TR"
PERFORM PARSE-TRAILER-RECORD
WHEN OTHER
PERFORM LOG-UNKNOWN-RECORD
END-EVALUATE
.
At MedClaim, James Okafor processes remittance files that contain 14 different record types. Each type has a different layout, and the layouts change between payer versions. Reference modification allows the routing logic to remain stable even as individual record parsers are updated.
Pattern: Variable-Length Field Extraction with Length Prefixes
Some formats (particularly those originating from IBM systems) use length-prefixed fields rather than delimiters. Each field is preceded by a 2-byte or 4-byte length indicator:
01 WS-LP-RECORD PIC X(2000).
01 WS-LP-POS PIC 9(4) VALUE 1.
01 WS-LP-FIELD-LEN PIC 9(4).
01 WS-LP-FIELD-DATA PIC X(500).
01 WS-LP-FIELD-COUNT PIC 99 VALUE ZERO.
PARSE-LENGTH-PREFIXED.
MOVE 1 TO WS-LP-POS
MOVE ZERO TO WS-LP-FIELD-COUNT
PERFORM UNTIL WS-LP-POS >= WS-RECORD-LEN
OR WS-LP-FIELD-COUNT >= 20
* Extract 2-byte length prefix
IF WS-LP-POS + 1 > WS-RECORD-LEN
EXIT PERFORM
END-IF
MOVE WS-LP-RECORD(WS-LP-POS:2)
TO WS-LP-FIELD-LEN-X
COMPUTE WS-LP-FIELD-LEN =
FUNCTION NUMVAL(WS-LP-FIELD-LEN-X)
ADD 2 TO WS-LP-POS
* Validate length
IF WS-LP-FIELD-LEN > 0
AND WS-LP-FIELD-LEN <= 500
AND WS-LP-POS + WS-LP-FIELD-LEN - 1
<= WS-RECORD-LEN
* Extract field data
MOVE SPACES TO WS-LP-FIELD-DATA
MOVE WS-LP-RECORD(
WS-LP-POS:WS-LP-FIELD-LEN)
TO WS-LP-FIELD-DATA
ADD 1 TO WS-LP-FIELD-COUNT
ADD WS-LP-FIELD-LEN TO WS-LP-POS
ELSE
* Invalid length — log and stop parsing
DISPLAY "BAD FIELD LEN AT POS "
WS-LP-POS ": " WS-LP-FIELD-LEN
EXIT PERFORM
END-IF
END-PERFORM
.
⚠️ Defensive Programming: Every reference modification in this code is preceded by a boundary check. The check WS-LP-POS + WS-LP-FIELD-LEN - 1 <= WS-RECORD-LEN prevents reading past the end of the record. Without this check, a corrupt length prefix could cause the program to read into adjacent memory, a bug that might work silently for months before manifesting as corrupt output.
Pattern: Building Audit Trail Records
Production systems must log their processing for audit and debugging. Reference modification is ideal for building variable-length audit records:
01 WS-AUDIT-REC PIC X(500).
01 WS-AUDIT-POS PIC 9(3) VALUE 1.
BUILD-AUDIT-RECORD.
MOVE SPACES TO WS-AUDIT-REC
MOVE 1 TO WS-AUDIT-POS
* Timestamp (21 bytes)
MOVE FUNCTION CURRENT-DATE
TO WS-AUDIT-REC(WS-AUDIT-POS:21)
ADD 21 TO WS-AUDIT-POS
* Separator
MOVE "|" TO WS-AUDIT-REC(WS-AUDIT-POS:1)
ADD 1 TO WS-AUDIT-POS
* Program ID (8 bytes)
MOVE "CLMPROC " TO
WS-AUDIT-REC(WS-AUDIT-POS:8)
ADD 8 TO WS-AUDIT-POS
* Separator
MOVE "|" TO WS-AUDIT-REC(WS-AUDIT-POS:1)
ADD 1 TO WS-AUDIT-POS
* Action code (3 bytes)
MOVE WS-ACTION-CODE TO
WS-AUDIT-REC(WS-AUDIT-POS:3)
ADD 3 TO WS-AUDIT-POS
* Separator
MOVE "|" TO WS-AUDIT-REC(WS-AUDIT-POS:1)
ADD 1 TO WS-AUDIT-POS
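* Claim ID (variable length: append only the trimmed text;
* WS-CLM-LEN is a PIC 9(3) work field assumed to be declared)
MOVE FUNCTION LENGTH(FUNCTION TRIM(WS-CLM-ID))
TO WS-CLM-LEN
IF WS-CLM-LEN > ZERO
MOVE FUNCTION TRIM(WS-CLM-ID) TO
WS-AUDIT-REC(WS-AUDIT-POS:WS-CLM-LEN)
ADD WS-CLM-LEN TO WS-AUDIT-POS
END-IF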
* Write the audit record (only the used portion)
COMPUTE WS-AUDIT-WRITE-LEN =
WS-AUDIT-POS - 1
WRITE AUDIT-FILE-REC FROM
WS-AUDIT-REC(1:WS-AUDIT-WRITE-LEN)
.
This pointer-build pattern (maintain a position counter, append each piece, advance the counter) is the single most important reference modification idiom in production COBOL. Maria Chen estimates that 60% of all reference modification code at GlobalBank uses this pattern.
19.7 Pointer-Based Processing: ADDRESS OF
COBOL provides pointer data items and the ADDRESS OF special register for lower-level memory manipulation. This is an advanced feature used primarily for:
- Interfacing with non-COBOL programs (C, Assembler)
- Processing dynamically allocated memory
- Optimizing access to large data areas in LINKAGE SECTION
Pointer Data Items
01 WS-DATA-PTR USAGE IS POINTER.
01 WS-NULL-PTR USAGE IS POINTER VALUE NULL.
ADDRESS OF Special Register
Every level-01 or level-77 item in the LINKAGE SECTION has an associated ADDRESS OF special register that holds its memory address:
LINKAGE SECTION.
01 LS-RECORD PIC X(100).
01 LS-BUFFER PIC X(500).
PROCEDURE DIVISION.
SET ADDRESS OF LS-RECORD TO WS-DATA-PTR
*> Now LS-RECORD refers to whatever WS-DATA-PTR
*> points to
SET Statement with Pointers
SET WS-DATA-PTR TO ADDRESS OF WS-RECORD
*> WS-DATA-PTR now contains the address of WS-RECORD
SET ADDRESS OF LS-RECORD TO WS-DATA-PTR
*> LS-RECORD now overlays the memory at WS-DATA-PTR
SET WS-DATA-PTR TO NULL
*> Reset pointer to null
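The SET forms above can be exercised in a single small program. This is a sketch under the assumption that the compiler allows ADDRESS OF on a WORKING-STORAGE item (GnuCOBOL and recent Enterprise COBOL do); the program name PTRDEMO, the LS-ACCT and LS-NAME subfields, and the sample data are illustrative.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PTRDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-RECORD          PIC X(20) VALUE "ACCT0042CHEN".
       01  WS-DATA-PTR        USAGE IS POINTER VALUE NULL.
       LINKAGE SECTION.
      *> No storage of its own: it maps whatever it is pointed at
       01  LS-RECORD.
           05  LS-ACCT        PIC X(8).
           05  LS-NAME        PIC X(12).
       PROCEDURE DIVISION.
       MAIN-PARA.
           SET WS-DATA-PTR TO ADDRESS OF WS-RECORD
           IF WS-DATA-PTR = NULL
               DISPLAY "POINTER NOT SET"
               STOP RUN
           END-IF
      *> Overlay the LINKAGE layout on WS-RECORD's storage
           SET ADDRESS OF LS-RECORD TO WS-DATA-PTR
           DISPLAY "ACCT: " LS-ACCT
           DISPLAY "NAME: " LS-NAME
      *> Writing through the overlay changes WS-RECORD itself
           MOVE "KAPOOR" TO LS-NAME
           DISPLAY "WS-RECORD NOW: " WS-RECORD
           STOP RUN.
```

The final DISPLAY shows that LS-RECORD and WS-RECORD are two views of the same bytes, which is the whole point of SET ADDRESS OF and also the reason a wrong pointer is so destructive.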
Practical Example: Processing a Buffer
When receiving data from a CICS communication area, a Language Environment service, or a C function, you often get a pointer to a buffer rather than the data directly:
WORKING-STORAGE SECTION.
01 WS-BUFFER-PTR USAGE IS POINTER.
01 WS-CURRENT-PTR USAGE IS POINTER.
01 WS-OFFSET PIC 9(5) COMP VALUE ZERO.
01 WS-REC-COUNT PIC 9(5) COMP VALUE ZERO.
LINKAGE SECTION.
01 LS-RESPONSE-BUFFER.
05 LS-RESP-LENGTH PIC 9(5) COMP.
05 LS-RESP-DATA PIC X(4000).
01 LS-PARSED-RECORD.
05 LS-PR-TYPE PIC X(2).
05 LS-PR-LENGTH PIC 9(4) COMP.
05 LS-PR-PAYLOAD PIC X(996).
PROCEDURE DIVISION.
* Assume WS-BUFFER-PTR is set by caller
SET ADDRESS OF LS-RESPONSE-BUFFER
TO WS-BUFFER-PTR
* Process records within the buffer
MOVE ZERO TO WS-OFFSET
PERFORM UNTIL WS-OFFSET >= LS-RESP-LENGTH
SET WS-CURRENT-PTR TO WS-BUFFER-PTR
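* Skip the 4-byte buffer header (PIC 9(5) COMP is a
* binary fullword), then advance by the current offset
SET WS-CURRENT-PTR UP BY 4
SET WS-CURRENT-PTR UP BY WS-OFFSET
SET ADDRESS OF LS-PARSED-RECORD
TO WS-CURRENT-PTR
ADD 1 TO WS-REC-COUNT
PERFORM 3100-PROCESS-RECORD
* Step past this record, assuming LS-PR-LENGTH holds the
* payload length: payload plus the 4-byte record header
* (2-byte type + 2-byte binary length)
ADD LS-PR-LENGTH TO WS-OFFSET
ADD 4 TO WS-OFFSET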
END-PERFORM
.
⚠️ Safety Warning: Pointer manipulation is the most dangerous area of COBOL programming. An incorrect pointer leads to storage violations (S0C4 abend on z/OS). Use pointers only when absolutely necessary, validate pointer values, and test exhaustively. Most business logic should use reference modification instead.
19.6 Processing Variable-Format Records
One of the most valuable applications of reference modification is processing records whose layout varies based on content. This is common in financial systems (different transaction types), healthcare (different claim formats), and data interchange (EDI, XML-like structures).
GlobalBank: Variable-Length Transaction Descriptions
GlobalBank's transaction records include a variable-length description field. The record layout is:
Positions 1-10: Account number
Positions 11-18: Transaction date (YYYYMMDD)
Positions 19-19: Transaction type (D=Debit, C=Credit)
Positions 20-31: Amount (9(10)V99)
Positions 32-34: Description length (3 digits)
Positions 35-?: Description text (variable length)
01 WS-TRANS-RECORD PIC X(534).
01 WS-TRANS-FIELDS.
05 WS-TR-ACCT PIC X(10).
05 WS-TR-DATE PIC 9(8).
05 WS-TR-TYPE PIC X(1).
05 WS-TR-AMOUNT PIC 9(10)V99.
05 WS-TR-DESC-LEN PIC 9(3).
05 WS-TR-DESC PIC X(500).
PARSE-TRANSACTION.
MOVE WS-TRANS-RECORD(1:10) TO WS-TR-ACCT
MOVE WS-TRANS-RECORD(11:8) TO WS-TR-DATE
MOVE WS-TRANS-RECORD(19:1) TO WS-TR-TYPE
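* An alphanumeric-to-numeric MOVE would treat the 12 bytes
* as an integer and misalign the implied decimal point.
* Reference-modifying the receiver makes it alphanumeric,
* so the digits are copied byte for byte.
MOVE WS-TRANS-RECORD(20:12) TO WS-TR-AMOUNT(1:12)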
MOVE WS-TRANS-RECORD(32:3) TO WS-TR-DESC-LEN
* Defensive check on description length. The field is
* unsigned, so test for non-numeric bytes rather than < 0.
IF WS-TR-DESC-LEN IS NOT NUMERIC
OR WS-TR-DESC-LEN > 500
DISPLAY "INVALID DESC LENGTH: "
WS-TR-DESC-LEN
MOVE ZERO TO WS-TR-DESC-LEN
MOVE SPACES TO WS-TR-DESC
PERFORM 9100-LOG-ERROR
ELSE
IF WS-TR-DESC-LEN > ZERO
MOVE WS-TRANS-RECORD(35:WS-TR-DESC-LEN)
TO WS-TR-DESC
ELSE
MOVE SPACES TO WS-TR-DESC
END-IF
END-IF
.
MedClaim: Processing Variable-Format EDI Segments
Electronic Data Interchange (EDI) is the backbone of healthcare claims processing. EDI segments use delimiters (typically * for elements and ~ for segments) rather than fixed positions. Reference modification is ideal for parsing these.
An EDI 837 claim segment might look like:
CLM*12345678*1200.50*11:B:1*Y*A~
The parser must extract each element between the asterisks:
01 WS-EDI-SEGMENT PIC X(500).
01 WS-ELEMENTS.
05 WS-ELEMENT PIC X(50) OCCURS 20 TIMES.
01 WS-EDI-VARS.
05 WS-ELEM-NUM PIC 99 VALUE 1.
05 WS-SCAN-POS PIC 9(3) VALUE 1.
05 WS-ELEM-START PIC 9(3).
05 WS-ELEM-LEN PIC 9(3).
05 WS-SEG-LEN PIC 9(3).
PARSE-EDI-SEGMENT.
MOVE SPACES TO WS-ELEMENTS
MOVE 1 TO WS-ELEM-NUM
MOVE 1 TO WS-SCAN-POS
* Find segment terminator to get actual length
MOVE ZERO TO WS-SEG-LEN
PERFORM VARYING WS-SEG-LEN FROM 1 BY 1
UNTIL WS-SEG-LEN >
FUNCTION LENGTH(WS-EDI-SEGMENT)
IF WS-EDI-SEGMENT(WS-SEG-LEN:1) = "~"
OR WS-EDI-SEGMENT(WS-SEG-LEN:1) =
SPACES
SUBTRACT 1 FROM WS-SEG-LEN
EXIT PERFORM
END-IF
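END-PERFORM
* No terminator or space found: clamp to the field length
* so later scans never reference a byte past the end
IF WS-SEG-LEN > FUNCTION LENGTH(WS-EDI-SEGMENT)
MOVE FUNCTION LENGTH(WS-EDI-SEGMENT) TO WS-SEG-LEN
END-IF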
* Skip segment identifier (first element before *)
PERFORM UNTIL WS-SCAN-POS > WS-SEG-LEN
IF WS-EDI-SEGMENT(WS-SCAN-POS:1) = "*"
ADD 1 TO WS-SCAN-POS
EXIT PERFORM
END-IF
ADD 1 TO WS-SCAN-POS
END-PERFORM
* Extract remaining elements
PERFORM UNTIL WS-SCAN-POS > WS-SEG-LEN
OR WS-ELEM-NUM > 20
MOVE WS-SCAN-POS TO WS-ELEM-START
* Find next delimiter
PERFORM UNTIL WS-SCAN-POS > WS-SEG-LEN
IF WS-EDI-SEGMENT(WS-SCAN-POS:1)
= "*"
OR WS-EDI-SEGMENT(WS-SCAN-POS:1)
= "~"
EXIT PERFORM
END-IF
ADD 1 TO WS-SCAN-POS
END-PERFORM
COMPUTE WS-ELEM-LEN =
WS-SCAN-POS - WS-ELEM-START
IF WS-ELEM-LEN > ZERO
AND WS-ELEM-LEN <= 50
MOVE WS-EDI-SEGMENT(
WS-ELEM-START:WS-ELEM-LEN)
TO WS-ELEMENT(WS-ELEM-NUM)
END-IF
ADD 1 TO WS-ELEM-NUM
ADD 1 TO WS-SCAN-POS
END-PERFORM
SUBTRACT 1 FROM WS-ELEM-NUM
DISPLAY "PARSED " WS-ELEM-NUM " ELEMENTS"
.
🔗 Cross-Reference: EDI processing is covered in greater detail in Chapter 33 (Interfacing with External Systems). The parsing techniques introduced here form the foundation for the full EDI 837/835 processing pipeline discussed there.
Understanding Pointer Arithmetic
Pointer arithmetic in COBOL is more restricted than in C. You can only increment or decrement pointers using SET:
SET WS-PTR UP BY 100
*> Advances the pointer by 100 bytes
SET WS-PTR DOWN BY 50
*> Backs the pointer up by 50 bytes
You cannot add two pointers, subtract two pointers to get a distance, or compare pointers with < or >. The only pointer comparison available is equality:
IF WS-PTR = NULL
DISPLAY "POINTER IS NULL"
END-IF
IF WS-PTR-1 = WS-PTR-2
DISPLAY "POINTERS MATCH"
END-IF
These restrictions are intentional — they prevent the kind of arbitrary memory access bugs that plague C programs. COBOL's pointer model is a controlled subset designed for specific inter-language and dynamic memory use cases, not general-purpose memory manipulation.
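A short sketch shows SET ... UP BY walking a buffer in fixed-size steps. The buffer contents, the 10-byte entry size, the program name PTRWALK, and the field names are assumptions chosen for illustration; the buffer lives in WORKING-STORAGE here only so the example is self-contained.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PTRWALK.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-BUFFER          PIC X(30)
           VALUE "REC-000001REC-000002REC-000003".
       01  WS-PTR             USAGE IS POINTER.
       01  WS-COUNT           PIC 9 VALUE ZERO.
       LINKAGE SECTION.
       01  LS-ENTRY           PIC X(10).
       PROCEDURE DIVISION.
       MAIN-PARA.
           SET WS-PTR TO ADDRESS OF WS-BUFFER
      *> Exactly three 10-byte entries fit in the 30-byte buffer
           PERFORM 3 TIMES
               SET ADDRESS OF LS-ENTRY TO WS-PTR
               ADD 1 TO WS-COUNT
               DISPLAY "ENTRY " WS-COUNT ": " LS-ENTRY
      *> Step to the next entry (10 = size of LS-ENTRY)
               SET WS-PTR UP BY 10
           END-PERFORM
           STOP RUN.
```

After the last iteration the pointer sits one entry past the buffer, which is harmless only because it is never dereferenced again. Since pointers cannot be compared with < or >, the loop bound has to come from a count or a length, never from the pointer itself.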
When to Use Pointers vs. Reference Modification
The decision between pointers and reference modification is usually straightforward:
| Use Case | Best Approach |
|---|---|
| Extracting substrings from fixed-length fields | Reference modification |
| Building output strings dynamically | Reference modification with pointer variable |
| Parsing delimited data | Reference modification |
| Interfacing with C functions | Pointers (ADDRESS OF) |
| Processing CICS COMMAREA/TWA | Pointers (SET ADDRESS OF) |
| Accessing dynamically allocated memory | Pointers (CEEGTST/CEECZST) |
| Processing LINKAGE SECTION data | Pointers (SET ADDRESS OF) |
| Overlaying different record structures | Reference modification or REDEFINES |
In practice, reference modification handles 95% of byte-level data manipulation needs. Pointers are reserved for system-level programming and inter-language communication.
Working with NULL Pointers
Always initialize pointers before use and check for NULL before dereferencing:
01 WS-DATA-PTR USAGE IS POINTER VALUE NULL.
*> Before using:
IF WS-DATA-PTR = NULL
DISPLAY "ERROR: POINTER NOT INITIALIZED"
PERFORM 9900-ABEND
END-IF
SET ADDRESS OF LS-RECORD TO WS-DATA-PTR
*> Safe to access LS-RECORD now
A NULL pointer dereference causes a S0C4 abend on z/OS (equivalent to a segmentation fault on Unix). The abend dump may not clearly indicate the cause, making NULL pointer bugs particularly difficult to diagnose in production. Always validate.
Reference Modification with INSPECT
A lesser-known but powerful combination is using reference modification within an INSPECT statement to examine or replace characters within a specific portion of a field:
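*> Count digits in positions 5-10 only. TALLYING adds to
*> the counter, so clear it first. INSPECT has no range
*> (THRU) form, so each digit is listed.
MOVE ZERO TO WS-DIGIT-COUNT
INSPECT WS-DATA(5:6)
TALLYING WS-DIGIT-COUNT
FOR ALL "0" ALL "1" ALL "2" ALL "3" ALL "4"
ALL "5" ALL "6" ALL "7" ALL "8" ALL "9"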
*> Replace all spaces with zeros in positions 1-8
INSPECT WS-AMOUNT-FIELD(1:8)
REPLACING ALL SPACES BY ZEROS
*> Convert lowercase to uppercase in first 20 chars
INSPECT WS-NAME-FIELD(1:20)
CONVERTING "abcdefghijklmnopqrstuvwxyz"
TO "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
This combination is useful when you need to transform only part of a field — for example, uppercasing a name field without affecting a numeric suffix, or zero-filling only the integer portion of an amount string.
🧪 Try It Yourself: String Utilities Library
Build a set of reusable string utility paragraphs using reference modification:
1. LEFT-PAD: Pad a string on the left with a specified character to a target length
2. RIGHT-PAD: Pad on the right (similar to default COBOL behavior, but with a configurable pad character)
3. CENTER: Center a string within a target field
4. CONTAINS: Return TRUE if a substring exists within a string
5. REPLACE-ALL: Replace all occurrences of a search string with a replacement string
6. SUBSTR-COUNT: Count occurrences of a substring
These utilities mirror the string functions available in Java or Python, implemented using COBOL reference modification.
19.7 STRING Statement with POINTER Phrase
The STRING statement has an optional POINTER phrase that maintains a position counter, enabling you to build output strings incrementally:
01 WS-REPORT-LINE PIC X(132).
01 WS-STR-PTR PIC 9(3) VALUE 1.
BUILD-REPORT-LINE.
MOVE SPACES TO WS-REPORT-LINE
MOVE 1 TO WS-STR-PTR
STRING
WS-ACCT-NUM DELIMITED BY SIZE
" | " DELIMITED BY SIZE
WS-CUST-NAME DELIMITED BY " "
" | " DELIMITED BY SIZE
WS-BALANCE DELIMITED BY SIZE
INTO WS-REPORT-LINE
WITH POINTER WS-STR-PTR
END-STRING
* WS-STR-PTR now points to the position
* AFTER the last character written.
* Subtract 1 to get the actual content length.
SUBTRACT 1 FROM WS-STR-PTR
DISPLAY "LINE LENGTH: " WS-STR-PTR
.
The POINTER phrase is also useful for building output across multiple STRING operations:
BUILD-MULTI-PART.
MOVE SPACES TO WS-OUTPUT
MOVE 1 TO WS-STR-PTR
* Part 1: Header
STRING "ACCT: " DELIMITED BY SIZE
WS-ACCT-NUM DELIMITED BY SIZE
INTO WS-OUTPUT
WITH POINTER WS-STR-PTR
END-STRING
* Part 2: Conditionally add branch info
IF WS-INCLUDE-BRANCH = "Y"
STRING " BRANCH: " DELIMITED BY SIZE
WS-BRANCH-NAME DELIMITED BY " "
INTO WS-OUTPUT
WITH POINTER WS-STR-PTR
END-STRING
END-IF
* Part 3: Always add date
STRING " DATE: " DELIMITED BY SIZE
WS-FORMATTED-DATE DELIMITED BY SIZE
INTO WS-OUTPUT
WITH POINTER WS-STR-PTR
END-STRING
.
💡 Key Insight: The WITH POINTER phrase makes STRING work like the pointer-based reference modification pattern from Section 19.4, but with the convenience of delimiter handling. Use STRING WITH POINTER when your pieces have natural delimiters; use reference modification when you need exact byte positioning.
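The STRING examples above do not show what happens when the pieces do not fit. The following is a minimal sketch, assuming a deliberately small 20-byte target, of how ON OVERFLOW can be combined with WITH POINTER. All data names here (WS-LINE, WS-PTR, WS-USED, WS-OVERFLOW-FLAG) are hypothetical and chosen only for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. STRPTR.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *> Hypothetical names. The target is deliberately small
      *> (20 bytes) so that the second STRING overflows.
       01  WS-LINE             PIC X(20).
       01  WS-PTR              PIC 9(3).
       01  WS-USED             PIC 9(3).
       01  WS-OVERFLOW-FLAG    PIC X VALUE "N".
           88  WS-OVERFLOWED         VALUE "Y".
       PROCEDURE DIVISION.
       MAIN-PARA.
           MOVE SPACES TO WS-LINE
           MOVE 1 TO WS-PTR
      *> Part 1 fits: 11 characters are written.
           STRING "ACCT: " DELIMITED BY SIZE
                  "12345"  DELIMITED BY SIZE
               INTO WS-LINE
               WITH POINTER WS-PTR
               ON OVERFLOW
                   SET WS-OVERFLOWED TO TRUE
           END-STRING
      *> Part 2 needs 17 more bytes but only 9 remain.
           STRING " BRANCH: DOWNTOWN" DELIMITED BY SIZE
               INTO WS-LINE
               WITH POINTER WS-PTR
               ON OVERFLOW
                   SET WS-OVERFLOWED TO TRUE
           END-STRING
      *> Pointer minus one is the number of bytes in use.
           SUBTRACT 1 FROM WS-PTR GIVING WS-USED
           IF WS-OVERFLOWED
               DISPLAY "LINE TRUNCATED AFTER " WS-USED " BYTES"
           END-IF
           DISPLAY "[" WS-LINE "]"
           STOP RUN.
```

The overflow condition is also raised when the pointer is less than 1 or greater than the target length plus 1 on entry, which is why the pointer is reset to 1 before the first STRING.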
19.8 UNSTRING with POINTER Phrase
Similarly, UNSTRING supports a POINTER phrase for incremental parsing:
01 WS-CSV-RECORD PIC X(500).
01 WS-FIELDS.
05 WS-FIELD-1 PIC X(30).
05 WS-FIELD-2 PIC X(30).
05 WS-FIELD-3 PIC X(30).
01 WS-UNS-PTR PIC 9(3) VALUE 1.
01 WS-DELIM-FOUND PIC X(1).
01 WS-FIELD-COUNT PIC 9(3).
PARSE-CSV.
MOVE 1 TO WS-UNS-PTR
* Extract first field
UNSTRING WS-CSV-RECORD
DELIMITED BY "," OR X"0D" OR X"0A"
INTO WS-FIELD-1
DELIMITER IN WS-DELIM-FOUND
COUNT IN WS-FIELD-COUNT
WITH POINTER WS-UNS-PTR
END-UNSTRING
* Extract second field (continues from where
* the first left off)
UNSTRING WS-CSV-RECORD
DELIMITED BY "," OR X"0D" OR X"0A"
INTO WS-FIELD-2
WITH POINTER WS-UNS-PTR
END-UNSTRING
* Extract third field
UNSTRING WS-CSV-RECORD
DELIMITED BY "," OR X"0D" OR X"0A"
INTO WS-FIELD-3
WITH POINTER WS-UNS-PTR
END-UNSTRING
.
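When the number of fields is not known in advance, the three separate UNSTRING statements above can be folded into a loop. The sketch below assumes a non-empty record and a hypothetical 10-entry field table; the names WS-REC-LEN, WS-FIELD-IDX, and WS-FIELD are illustrative and do not appear elsewhere in the chapter.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UNSLOOP.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-CSV-RECORD       PIC X(80)
               VALUE "ACCT01,SMITH,NEW YORK,NY".
       01  WS-REC-LEN          PIC 9(3).
       01  WS-UNS-PTR          PIC 9(3).
       01  WS-FIELD-IDX        PIC 9(2) VALUE ZERO.
       01  WS-FIELD-TABLE.
           05  WS-FIELD        PIC X(30) OCCURS 10 TIMES.
       PROCEDURE DIVISION.
       MAIN-PARA.
      *> Length of the populated part of the record.
           MOVE FUNCTION LENGTH(
               FUNCTION TRIM(WS-CSV-RECORD TRAILING))
               TO WS-REC-LEN
           MOVE 1 TO WS-UNS-PTR
      *> Each pass extracts one field and leaves the pointer
      *> just past the delimiter that ended it.
           PERFORM UNTIL WS-UNS-PTR > WS-REC-LEN
                      OR WS-FIELD-IDX >= 10
               ADD 1 TO WS-FIELD-IDX
               UNSTRING WS-CSV-RECORD
                   DELIMITED BY ","
                   INTO WS-FIELD(WS-FIELD-IDX)
                   WITH POINTER WS-UNS-PTR
               END-UNSTRING
               DISPLAY "FIELD " WS-FIELD-IDX ": "
                       WS-FIELD(WS-FIELD-IDX)
           END-PERFORM
           STOP RUN.
```

The loop ends when the pointer moves past the last populated byte or when the table is full, so a record with more than ten fields is truncated rather than overrunning the table.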
19.9 Advanced Technique: Building a Generic Field Extractor
Let us combine reference modification, LENGTH OF, and pointer techniques to build a reusable field extraction utility:
*============================================================
* Generic delimited field extractor
* Input: WS-GFE-INPUT - string to parse
* WS-GFE-DELIM - delimiter character
* WS-GFE-FIELD-NUM - which field to extract (1-based)
* Output: WS-GFE-RESULT - extracted field
* WS-GFE-RESULT-LEN - length of extracted field
* WS-GFE-STATUS - 'F'ound, 'N'ot found, 'E'rror
*============================================================
01 WS-GENERIC-FIELD-EXTRACT.
05 WS-GFE-INPUT PIC X(500).
05 WS-GFE-DELIM PIC X(1).
05 WS-GFE-FIELD-NUM PIC 99.
05 WS-GFE-RESULT PIC X(100).
05 WS-GFE-RESULT-LEN PIC 9(3).
05 WS-GFE-STATUS PIC X(1).
88 WS-GFE-FOUND VALUE "F".
88 WS-GFE-NOT-FOUND VALUE "N".
88 WS-GFE-ERROR VALUE "E".
01 WS-GFE-WORK.
05 WS-GFE-POS PIC 9(3).
05 WS-GFE-START PIC 9(3).
05 WS-GFE-CURR-FIELD PIC 99.
05 WS-GFE-MAX-LEN PIC 9(3).
05 WS-GFE-LEN PIC 9(3).
EXTRACT-DELIMITED-FIELD.
MOVE SPACES TO WS-GFE-RESULT
MOVE ZERO TO WS-GFE-RESULT-LEN
SET WS-GFE-NOT-FOUND TO TRUE
IF WS-GFE-FIELD-NUM < 1
SET WS-GFE-ERROR TO TRUE
EXIT PARAGRAPH
END-IF
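* Measure only the populated part of the input so that
* trailing padding is not counted in the last field
MOVE FUNCTION LENGTH(
FUNCTION TRIM(WS-GFE-INPUT TRAILING))
TO WS-GFE-MAX-LEN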
MOVE 1 TO WS-GFE-POS
MOVE 1 TO WS-GFE-CURR-FIELD
* Skip to the target field
PERFORM UNTIL WS-GFE-CURR-FIELD >=
WS-GFE-FIELD-NUM
OR WS-GFE-POS > WS-GFE-MAX-LEN
IF WS-GFE-INPUT(WS-GFE-POS:1) =
WS-GFE-DELIM
ADD 1 TO WS-GFE-CURR-FIELD
END-IF
ADD 1 TO WS-GFE-POS
END-PERFORM
IF WS-GFE-CURR-FIELD < WS-GFE-FIELD-NUM
SET WS-GFE-NOT-FOUND TO TRUE
EXIT PARAGRAPH
END-IF
* We are now at the start of the target field
MOVE WS-GFE-POS TO WS-GFE-START
* Find the end of this field
PERFORM UNTIL WS-GFE-POS > WS-GFE-MAX-LEN
IF WS-GFE-INPUT(WS-GFE-POS:1) =
WS-GFE-DELIM
EXIT PERFORM
END-IF
ADD 1 TO WS-GFE-POS
END-PERFORM
COMPUTE WS-GFE-LEN =
WS-GFE-POS - WS-GFE-START
IF WS-GFE-LEN > 100
MOVE 100 TO WS-GFE-LEN
END-IF
IF WS-GFE-LEN > ZERO
MOVE WS-GFE-INPUT(
WS-GFE-START:WS-GFE-LEN)
TO WS-GFE-RESULT
MOVE WS-GFE-LEN TO WS-GFE-RESULT-LEN
SET WS-GFE-FOUND TO TRUE
ELSE
SET WS-GFE-FOUND TO TRUE
MOVE ZERO TO WS-GFE-RESULT-LEN
END-IF
.
This utility can parse CSV, pipe-delimited, tab-delimited, or any single-character-delimited format by changing WS-GFE-DELIM.
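A caller often wants to know how many fields a record holds before extracting them. One possible companion to the extractor, sketched here under the assumption of a single-character delimiter and a non-empty input, counts delimiters with INSPECT over the populated part of the input. The names WS-INPUT, WS-DELIM, WS-USED-LEN, WS-DELIM-COUNT, and WS-FIELD-TOTAL are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FLDCOUNT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-INPUT            PIC X(100)
               VALUE "POS|PURCHASE|MERCHANT:AMAZON.COM|REF:A123456".
       01  WS-DELIM            PIC X VALUE "|".
       01  WS-USED-LEN         PIC 9(3).
       01  WS-DELIM-COUNT      PIC 9(3) VALUE ZERO.
       01  WS-FIELD-TOTAL      PIC 9(3).
       PROCEDURE DIVISION.
       MAIN-PARA.
      *> Ignore trailing padding when counting.
           MOVE FUNCTION LENGTH(
               FUNCTION TRIM(WS-INPUT TRAILING))
               TO WS-USED-LEN
      *> TALLYING adds to the counter, so reset it first.
           MOVE ZERO TO WS-DELIM-COUNT
           INSPECT WS-INPUT(1:WS-USED-LEN)
               TALLYING WS-DELIM-COUNT FOR ALL WS-DELIM
      *> N delimiters separate N + 1 fields.
           ADD 1 TO WS-DELIM-COUNT GIVING WS-FIELD-TOTAL
           DISPLAY "FIELDS PRESENT: " WS-FIELD-TOTAL
           STOP RUN.
```

For the sample value this displays a field total of 004. With the total known, a caller can bound its extraction loop directly instead of probing until the extractor reports not found.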
19.10 Reference Modification in CICS and Online Programs
Reference modification is not limited to batch programs. Online programs running under CICS or IMS make extensive use of reference modification for processing communication area data, building dynamic screen content, and parsing web service payloads.
CICS Communication Area Parsing
In CICS programs, the DFHCOMMAREA is the primary mechanism for passing data between programs. When programs follow a "router" pattern — where a front-end program determines which back-end program to invoke — the communication area often contains a variable-format payload:
01 DFHCOMMAREA.
05 CA-REQUEST-TYPE PIC X(4).
05 CA-PAYLOAD-LEN PIC 9(4).
05 CA-PAYLOAD PIC X(2000).
* Parse based on request type
EVALUATE CA-REQUEST-TYPE
WHEN "INQY"
* Inquiry: payload is account number (10) + date (8)
MOVE CA-PAYLOAD(1:10) TO WS-ACCT-NUM
MOVE CA-PAYLOAD(11:8) TO WS-INQUIRY-DATE
WHEN "XFER"
* Transfer: from-acct (10) + to-acct (10)
* + amount (12) + memo (50)
MOVE CA-PAYLOAD(1:10) TO WS-FROM-ACCT
MOVE CA-PAYLOAD(11:10) TO WS-TO-ACCT
MOVE CA-PAYLOAD(21:12) TO WS-XFER-AMOUNT
MOVE CA-PAYLOAD(33:50) TO WS-XFER-MEMO
WHEN "STMT"
* Statement: acct (10) + start-date (8)
* + end-date (8) + page-num (4)
MOVE CA-PAYLOAD(1:10) TO WS-STMT-ACCT
MOVE CA-PAYLOAD(11:8) TO WS-STMT-START
MOVE CA-PAYLOAD(19:8) TO WS-STMT-END
MOVE CA-PAYLOAD(27:4) TO WS-STMT-PAGE
END-EVALUATE
💡 Design Pattern: Maria Chen explains: "Every CICS program at GlobalBank uses a communication area layout where the first few bytes identify the request type, followed by a payload that varies by type. Reference modification is how we extract the payload fields without needing a separate copybook for every possible request format. The front-end knows how to build the payload; the back-end knows how to parse it."
Building Dynamic BMS Map Data
When constructing dynamic screen content in CICS, reference modification enables building formatted display lines without static copybook structures:
01 WS-SCREEN-LINE PIC X(80).
01 WS-POS PIC 99.
BUILD-DETAIL-LINE.
MOVE SPACES TO WS-SCREEN-LINE
MOVE 1 TO WS-POS
* Column 1-10: Account number
MOVE WS-ACCT-NUM TO WS-SCREEN-LINE(1:10)
* Column 12-19: Date
MOVE WS-DATE-FORMATTED TO WS-SCREEN-LINE(12:8)
* Column 21-35: Description (truncated to 15)
MOVE WS-DESCRIPTION(1:15)
TO WS-SCREEN-LINE(21:15)
* Column 37-50: Amount (right-justified)
MOVE WS-DISPLAY-AMT TO WS-SCREEN-LINE(37:14)
* Column 52-53: Status
MOVE WS-STATUS-CODE TO WS-SCREEN-LINE(52:2)
.
Processing JSON Payloads from Web Services
Modern COBOL programs increasingly process JSON data received from web services. While IBM provides JSON PARSE in Enterprise COBOL 6.2 and later, many shops still process JSON payloads using reference modification, particularly when the JSON structure is simple and predictable:
01 WS-JSON-PAYLOAD PIC X(5000).
01 WS-JSON-LEN PIC 9(4).
01 WS-SCAN-POS PIC 9(4).
01 WS-VALUE-START PIC 9(4).
01 WS-VALUE-LEN PIC 9(4).
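01 WS-TARGET-KEY PIC X(30).
* Work fields used below (sizes are assumptions:
* 34 = 30-byte key plus the 4 quote/colon characters)
01 WS-SEARCH-PATTERN PIC X(34).
01 WS-PATTERN-LEN PIC 99.
01 WS-EXTRACTED-VALUE PIC X(200).
01 WS-VALUE-FLAG PIC X VALUE "N".
88 WS-VALUE-FOUND VALUE "Y".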
FIND-JSON-VALUE.
* Simple JSON value extractor
* Finds "key":"value" and returns value
* Does NOT handle nested objects or arrays
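* Build search string: "key":"
* (the pointer must be set to 1 before STRING)
MOVE SPACES TO WS-SEARCH-PATTERN
MOVE 1 TO WS-PATTERN-LEN
STRING '"' WS-TARGET-KEY DELIMITED BY SPACES
'":"' DELIMITED BY SIZE
INTO WS-SEARCH-PATTERN
WITH POINTER WS-PATTERN-LEN
END-STRING
SUBTRACT 1 FROM WS-PATTERN-LEN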
* Scan for the key
MOVE 1 TO WS-SCAN-POS
PERFORM UNTIL WS-SCAN-POS >
WS-JSON-LEN - WS-PATTERN-LEN
IF WS-JSON-PAYLOAD(
WS-SCAN-POS:WS-PATTERN-LEN)
= WS-SEARCH-PATTERN(1:WS-PATTERN-LEN)
* Found the key — value starts after it
COMPUTE WS-VALUE-START =
WS-SCAN-POS + WS-PATTERN-LEN
* Find the closing quote
MOVE WS-VALUE-START TO WS-SCAN-POS
* Test the bound first so the reference
* modification never runs past the payload
PERFORM UNTIL
WS-SCAN-POS > WS-JSON-LEN
OR WS-JSON-PAYLOAD(WS-SCAN-POS:1)
= '"'
ADD 1 TO WS-SCAN-POS
END-PERFORM
COMPUTE WS-VALUE-LEN =
WS-SCAN-POS - WS-VALUE-START
IF WS-VALUE-LEN > 0
AND WS-VALUE-LEN <= 200
MOVE WS-JSON-PAYLOAD(
WS-VALUE-START:WS-VALUE-LEN)
TO WS-EXTRACTED-VALUE
SET WS-VALUE-FOUND TO TRUE
END-IF
EXIT PERFORM
END-IF
ADD 1 TO WS-SCAN-POS
END-PERFORM
.
⚠️ Production Warning: This simple JSON parser handles only flat key-value pairs with string values. For production JSON processing with nested objects, arrays, and escaped characters, use IBM's JSON PARSE statement (Enterprise COBOL 6.2+) or a JSON parsing subprogram. Tomás Rivera at MedClaim notes: "We started with reference modification for JSON, and it worked fine until we got a payload with escaped quotes inside a value. Now we use JSON PARSE for anything from external systems and only use ref-mod parsing for our own internal simple-format messages."
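To make the escaped-quote problem concrete, here is a minimal sketch of a closing-quote scan that skips any character preceded by a backslash. It is an illustration of the failure mode, not a replacement for JSON PARSE: it does not decode the escapes, and the program name, data names, and sample payload are all hypothetical. The value is assumed to start at byte 10, just after the opening quote.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. JSONESC.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-JSON             PIC X(60)
               VALUE '{"memo":"He said \"hi\" twice"}'.
       01  WS-JSON-LEN         PIC 9(4).
       01  WS-VALUE-START      PIC 9(4) VALUE 10.
       01  WS-POS              PIC 9(4).
       01  WS-VALUE-LEN        PIC 9(4).
       01  WS-CLOSED-FLAG      PIC X VALUE "N".
           88  WS-CLOSED             VALUE "Y".
       PROCEDURE DIVISION.
       MAIN-PARA.
           MOVE FUNCTION LENGTH(
               FUNCTION TRIM(WS-JSON TRAILING))
               TO WS-JSON-LEN
           MOVE WS-VALUE-START TO WS-POS
           PERFORM UNTIL WS-POS > WS-JSON-LEN OR WS-CLOSED
               EVALUATE WS-JSON(WS-POS:1)
      *> A backslash escapes the next byte: skip both.
                   WHEN "\"
                       ADD 2 TO WS-POS
      *> An unescaped quote ends the value.
                   WHEN '"'
                       SET WS-CLOSED TO TRUE
                   WHEN OTHER
                       ADD 1 TO WS-POS
               END-EVALUATE
           END-PERFORM
           IF WS-CLOSED
               COMPUTE WS-VALUE-LEN = WS-POS - WS-VALUE-START
               IF WS-VALUE-LEN > 0
                   DISPLAY "RAW VALUE: "
                       WS-JSON(WS-VALUE-START:WS-VALUE-LEN)
               ELSE
                   DISPLAY "EMPTY VALUE"
               END-IF
           ELSE
               DISPLAY "NO CLOSING QUOTE FOUND"
           END-IF
           STOP RUN.
```

The extracted value still contains the backslashes. Turning the escape sequences back into plain characters is a separate step, and it is one of the reasons a real parser is the safer choice for external payloads.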
XML Element Extraction with Reference Modification
Healthcare systems frequently process XML documents. Reference modification provides a lightweight XML element extraction capability:
01 WS-XML-DATA PIC X(10000).
01 WS-XML-LEN PIC 9(5).
01 WS-TAG-NAME PIC X(30).
01 WS-OPEN-TAG PIC X(32).
01 WS-CLOSE-TAG PIC X(33).
01 WS-OPEN-LEN PIC 99.
01 WS-CLOSE-LEN PIC 99.
EXTRACT-XML-ELEMENT.
* Build open and close tags
* (each pointer must be set to 1 before STRING)
MOVE SPACES TO WS-OPEN-TAG WS-CLOSE-TAG
MOVE 1 TO WS-OPEN-LEN WS-CLOSE-LEN
STRING "<" WS-TAG-NAME DELIMITED BY SPACES
">" DELIMITED BY SIZE
INTO WS-OPEN-TAG
WITH POINTER WS-OPEN-LEN
END-STRING
SUBTRACT 1 FROM WS-OPEN-LEN
STRING "</" WS-TAG-NAME DELIMITED BY SPACES
">" DELIMITED BY SIZE
INTO WS-CLOSE-TAG
WITH POINTER WS-CLOSE-LEN
END-STRING
SUBTRACT 1 FROM WS-CLOSE-LEN
* Scan for open tag
MOVE 1 TO WS-SCAN-POS
SET WS-ELEMENT-NOT-FOUND TO TRUE
PERFORM UNTIL WS-SCAN-POS >
WS-XML-LEN - WS-OPEN-LEN
IF WS-XML-DATA(WS-SCAN-POS:WS-OPEN-LEN)
= WS-OPEN-TAG(1:WS-OPEN-LEN)
* Content starts after open tag
COMPUTE WS-VALUE-START =
WS-SCAN-POS + WS-OPEN-LEN
* Find close tag
MOVE WS-VALUE-START TO WS-INNER-POS
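* The last position where the close tag can start is
* WS-XML-LEN - WS-CLOSE-LEN + 1
PERFORM UNTIL WS-INNER-POS >
WS-XML-LEN - WS-CLOSE-LEN + 1
IF WS-XML-DATA(
WS-INNER-POS:WS-CLOSE-LEN)
= WS-CLOSE-TAG(1:WS-CLOSE-LEN)
COMPUTE WS-VALUE-LEN =
WS-INNER-POS
- WS-VALUE-START
* An empty element has length zero, and a
* zero-length reference modification is invalid
IF WS-VALUE-LEN > 0
MOVE WS-XML-DATA(
WS-VALUE-START:WS-VALUE-LEN)
TO WS-EXTRACTED-VALUE
ELSE
MOVE SPACES TO WS-EXTRACTED-VALUE
END-IF
SET WS-ELEMENT-FOUND TO TRUE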
EXIT PERFORM
END-IF
ADD 1 TO WS-INNER-POS
END-PERFORM
EXIT PERFORM
END-IF
ADD 1 TO WS-SCAN-POS
END-PERFORM
.
Sarah Kim uses this pattern at MedClaim for processing CDA (Clinical Document Architecture) documents that arrive with eligibility responses: "We extract just the three or four elements we need — patient name, subscriber ID, effective dates — and ignore the rest of the XML structure. It is not a full XML parser, but for targeted extraction of known elements, it is fast and reliable."
INSPECT Combined with Reference Modification
The INSPECT statement and reference modification are powerful when combined. INSPECT can count or replace characters within a specific portion of a field:
* Count commas only in the data portion (skip header)
INSPECT WS-RECORD(WS-HEADER-LEN + 1:
WS-RECORD-LEN - WS-HEADER-LEN)
TALLYING WS-COMMA-COUNT
FOR ALL ","
* Replace pipes with commas in the payload only
INSPECT WS-BUFFER(WS-PAYLOAD-START:
WS-PAYLOAD-LEN)
REPLACING ALL "|" BY ","
* Count digits in a specific substring
INSPECT WS-FIELD(WS-START:WS-LEN)
TALLYING WS-DIGIT-COUNT
FOR ALL "0" "1" "2" "3" "4" "5"
"6" "7" "8" "9"
This combination is particularly useful when processing records with a fixed-format header followed by a variable-format body. You can apply INSPECT operations to just the body portion without affecting the header.
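The header/body case can be tried end to end with a small standalone program. This is a sketch under stated assumptions: a hypothetical 19-byte record whose first 10 bytes are the header, with names that mirror the fragment above but are declared here only for the demonstration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. INSPBODY.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *> Bytes 1-10 are the header, bytes 11-19 the body.
       01  WS-RECORD           PIC X(40)
               VALUE "HDR|A,B|01X|Y,Z|W,Q".
       01  WS-HEADER-LEN       PIC 9(3) VALUE 10.
       01  WS-RECORD-LEN       PIC 9(3) VALUE 19.
       01  WS-COMMA-COUNT      PIC 9(3) VALUE ZERO.
       PROCEDURE DIVISION.
       MAIN-PARA.
      *> Count commas in the body only.
           MOVE ZERO TO WS-COMMA-COUNT
           INSPECT WS-RECORD(WS-HEADER-LEN + 1:
                             WS-RECORD-LEN - WS-HEADER-LEN)
               TALLYING WS-COMMA-COUNT FOR ALL ","
      *> Replace pipes in the body only.
           INSPECT WS-RECORD(WS-HEADER-LEN + 1:
                             WS-RECORD-LEN - WS-HEADER-LEN)
               REPLACING ALL "|" BY ","
           DISPLAY "BODY COMMAS: " WS-COMMA-COUNT
           DISPLAY "[" WS-RECORD "]"
           STOP RUN.
```

The count comes out as 002 even though the header also contains a comma, and the record becomes HDR|A,B|01X,Y,Z,W,Q (followed by padding): the pipes in the header are untouched.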
19.11 GlobalBank Case Study: Transaction Description Parser
GlobalBank's online banking system sends transaction descriptions in a structured format that varies by transaction type. The description field encodes multiple pieces of information separated by slashes:
ATM/WITHDRAWAL/BRANCH-0042/TERMINAL-7
POS/PURCHASE/MERCHANT:AMAZON.COM/REF:A123456
ACH/DIRECT-DEPOSIT/EMPLOYER:ACME-CORP
WIRE/INTL/BENEFICIARY:J.SMITH/SWIFT:ABCDUS33
The parser must handle:
- Variable number of fields per transaction type
- Fields that contain colons (key:value pairs)
- Missing optional fields
01 WS-TRANS-DESC PIC X(200).
01 WS-PARSED-TRANS.
05 WS-PT-CHANNEL PIC X(10).
05 WS-PT-ACTION PIC X(20).
05 WS-PT-DETAILS.
10 WS-PT-DETAIL PIC X(50)
OCCURS 5 TIMES.
05 WS-PT-DETAIL-CNT PIC 9.
PARSE-TRANS-DESCRIPTION.
MOVE SPACES TO WS-PARSED-TRANS
MOVE ZERO TO WS-PT-DETAIL-CNT
* Use the generic field extractor
MOVE WS-TRANS-DESC TO WS-GFE-INPUT
MOVE "/" TO WS-GFE-DELIM
* Field 1: Channel
MOVE 1 TO WS-GFE-FIELD-NUM
PERFORM EXTRACT-DELIMITED-FIELD
IF WS-GFE-FOUND
MOVE WS-GFE-RESULT TO WS-PT-CHANNEL
END-IF
* Field 2: Action
MOVE 2 TO WS-GFE-FIELD-NUM
PERFORM EXTRACT-DELIMITED-FIELD
IF WS-GFE-FOUND
MOVE WS-GFE-RESULT TO WS-PT-ACTION
END-IF
* Fields 3+: Variable details
PERFORM VARYING WS-GFE-FIELD-NUM
FROM 3 BY 1
UNTIL WS-GFE-FIELD-NUM > 7
OR WS-PT-DETAIL-CNT >= 5
PERFORM EXTRACT-DELIMITED-FIELD
IF WS-GFE-FOUND
ADD 1 TO WS-PT-DETAIL-CNT
MOVE WS-GFE-RESULT TO
WS-PT-DETAIL(WS-PT-DETAIL-CNT)
ELSE
EXIT PERFORM
END-IF
END-PERFORM
.
After parsing, the detail fields can be further decomposed. For POS transactions, the detail field "MERCHANT:AMAZON.COM" needs to be split on the colon to extract the merchant name. Derek Washington's code applies the same reference modification technique a second time, running the same scanning pattern over each detail field:
PARSE-KEY-VALUE-DETAILS.
* For each detail that contains a colon,
* extract key and value
PERFORM VARYING WS-DT-IDX FROM 1 BY 1
UNTIL WS-DT-IDX > WS-PT-DETAIL-CNT
* Find the colon
MOVE ZERO TO WS-COLON-POS
PERFORM VARYING WS-SCAN FROM 1 BY 1
UNTIL WS-SCAN >
FUNCTION LENGTH(
WS-PT-DETAIL(WS-DT-IDX))
IF WS-PT-DETAIL(WS-DT-IDX)
(WS-SCAN:1) = ":"
MOVE WS-SCAN TO WS-COLON-POS
EXIT PERFORM
END-IF
END-PERFORM
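IF WS-COLON-POS > 0
* Key is before colon (empty if the colon comes first,
* and a zero-length reference modification is invalid)
IF WS-COLON-POS > 1
MOVE WS-PT-DETAIL(WS-DT-IDX)
(1:WS-COLON-POS - 1)
TO WS-DT-KEY(WS-DT-IDX)
END-IF
* Value is after colon; trim trailing spaces only so
* the length still lines up with WS-COLON-POS
COMPUTE WS-VAL-LEN =
FUNCTION LENGTH(
FUNCTION TRIM(
WS-PT-DETAIL(WS-DT-IDX) TRAILING))
- WS-COLON-POS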
IF WS-VAL-LEN > 0
MOVE WS-PT-DETAIL(WS-DT-IDX)
(WS-COLON-POS + 1:WS-VAL-LEN)
TO WS-DT-VALUE(WS-DT-IDX)
END-IF
END-IF
END-PERFORM
.
💡 Design Pattern: This two-level parsing strategy — first split the record on the primary delimiter, then split individual fields on a secondary delimiter — is the standard approach for structured text in COBOL. It generalizes to any number of levels, though in practice more than two levels is rare and may indicate that a different data format (XML, JSON) would be more appropriate.
19.12 MedClaim Case Study: EDI 837 Claim Parsing
MedClaim receives electronic claims in ANSI X12 837 format. Each claim consists of multiple segments, each terminated by ~, with elements separated by * and sub-elements by :.
A simplified claim extract:
ST*837*0001~
BHT*0019*00*12345*20240615*1200*CH~
CLM*CLAIM001*1500.00*11:B:1*Y*A~
SV1*HC:99213*125.00*UN*1~
DTP*472*D8*20240610~
The parsing uses reference modification to navigate through the variable-length segments:
01 WS-EDI-BUFFER PIC X(5000).
01 WS-EDI-BUF-LEN PIC 9(5).
01 WS-SEG-START PIC 9(5) VALUE 1.
01 WS-SEG-END PIC 9(5).
01 WS-CURRENT-SEG PIC X(500).
01 WS-SEG-ID PIC X(3).
PROCESS-EDI-BUFFER.
MOVE FUNCTION LENGTH(
FUNCTION TRIM(WS-EDI-BUFFER TRAILING))
TO WS-EDI-BUF-LEN
MOVE 1 TO WS-SEG-START
PERFORM UNTIL WS-SEG-START > WS-EDI-BUF-LEN
* Find segment terminator
MOVE WS-SEG-START TO WS-SEG-END
PERFORM UNTIL WS-SEG-END > WS-EDI-BUF-LEN
IF WS-EDI-BUFFER(WS-SEG-END:1) = "~"
EXIT PERFORM
END-IF
ADD 1 TO WS-SEG-END
END-PERFORM
* Extract segment
COMPUTE WS-ELEM-LEN =
WS-SEG-END - WS-SEG-START
IF WS-ELEM-LEN > 0
AND WS-ELEM-LEN <= 500
MOVE SPACES TO WS-CURRENT-SEG
MOVE WS-EDI-BUFFER(
WS-SEG-START:WS-ELEM-LEN)
TO WS-CURRENT-SEG
* Get segment identifier (first 2-3 chars)
MOVE WS-CURRENT-SEG(1:3)
TO WS-SEG-ID
PERFORM PROCESS-SEGMENT
END-IF
* Advance past terminator
COMPUTE WS-SEG-START =
WS-SEG-END + 1
END-PERFORM
.
PROCESS-SEGMENT.
EVALUATE TRUE
WHEN WS-SEG-ID(1:2) = "ST"
PERFORM PARSE-ST-SEGMENT
WHEN WS-SEG-ID(1:3) = "BHT"
PERFORM PARSE-BHT-SEGMENT
WHEN WS-SEG-ID(1:3) = "CLM"
PERFORM PARSE-CLM-SEGMENT
WHEN WS-SEG-ID(1:3) = "SV1"
PERFORM PARSE-SV1-SEGMENT
WHEN WS-SEG-ID(1:3) = "DTP"
PERFORM PARSE-DTP-SEGMENT
WHEN OTHER
DISPLAY "UNKNOWN SEGMENT: " WS-SEG-ID
END-EVALUATE
.
⚖️ The Modernization Spectrum: EDI is a 1970s data interchange format still used for billions of dollars in healthcare transactions daily. Modern COBOL systems must parse this format reliably while increasingly also supporting JSON and XML. Reference modification provides the byte-level precision needed for EDI's strict positional requirements, while newer COBOL features (FUNCTION TRIM, FUNCTION LOWER-CASE) help with modern format translation. The modernization path is not to replace COBOL but to extend its capabilities.
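The case study leaves the individual segment parsers (PARSE-CLM-SEGMENT and the others) undefined. As one possible shape for such a paragraph, the sketch below splits the sample CLM segment on the element separator and then splits its composite element on the sub-element separator. It uses UNSTRING for brevity; the field names and sizes are hypothetical, and only the first four elements are captured.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CLMPARSE.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-CURRENT-SEG      PIC X(80)
               VALUE "CLM*CLAIM001*1500.00*11:B:1*Y*A".
       01  WS-CLM-ELEMENTS.
           05  WS-CLM-SEG-ID      PIC X(3).
           05  WS-CLM-CLAIM-ID    PIC X(20).
           05  WS-CLM-AMOUNT      PIC X(12).
           05  WS-CLM-COMPOSITE   PIC X(15).
       01  WS-CLM-PLACE.
           05  WS-FACILITY-CODE   PIC X(2).
           05  WS-QUALIFIER       PIC X(2).
           05  WS-FREQUENCY       PIC X(1).
       01  WS-ELEM-COUNT       PIC 9(2) VALUE ZERO.
       PROCEDURE DIVISION.
       MAIN-PARA.
      *> Level 1: elements are separated by "*".
           MOVE ZERO TO WS-ELEM-COUNT
           UNSTRING WS-CURRENT-SEG DELIMITED BY "*"
               INTO WS-CLM-SEG-ID
                    WS-CLM-CLAIM-ID
                    WS-CLM-AMOUNT
                    WS-CLM-COMPOSITE
               TALLYING IN WS-ELEM-COUNT
           END-UNSTRING
      *> Level 2: sub-elements are separated by ":".
           UNSTRING WS-CLM-COMPOSITE DELIMITED BY ":"
               INTO WS-FACILITY-CODE
                    WS-QUALIFIER
                    WS-FREQUENCY
           END-UNSTRING
           DISPLAY "ELEMENTS READ: " WS-ELEM-COUNT
           DISPLAY "CLAIM ID:      " WS-CLM-CLAIM-ID
           DISPLAY "AMOUNT:        " WS-CLM-AMOUNT
           DISPLAY "FACILITY CODE: " WS-FACILITY-CODE
           DISPLAY "QUALIFIER:     " WS-QUALIFIER
           DISPLAY "FREQUENCY:     " WS-FREQUENCY
           STOP RUN.
```

For the sample segment the claim ID is CLAIM001, the amount is 1500.00, and the composite 11:B:1 splits into 11, B, and 1. A production parser would also need to cope with empty elements and with separators that differ between trading partners.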
19.12 Complete Worked Example: CSV to Fixed-Width Converter
Let us bring together multiple reference modification patterns into a complete program. To stay self-contained, it parses two hard-coded CSV lines (with quoted fields) rather than reading a file, converts them to fixed-width output records, and demonstrates defensive boundary checking throughout.
IDENTIFICATION DIVISION.
PROGRAM-ID. CSV2FIX.
*============================================================
* CSV to Fixed-Width Converter
* Parses quoted CSV input and produces fixed-width output
* using reference modification for all parsing operations.
*============================================================
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-CSV-LINE PIC X(500).
01 WS-OUTPUT-REC PIC X(200).
01 WS-FIELD-VALUES.
05 WS-FV-COUNT PIC 99 VALUE ZERO.
05 WS-FV-ENTRY OCCURS 20 TIMES.
10 WS-FV-VALUE PIC X(50).
10 WS-FV-LEN PIC 9(3).
01 WS-CSV-PARSE.
05 WS-CP-POS PIC 9(3) VALUE 1.
05 WS-CP-START PIC 9(3).
05 WS-CP-LEN PIC 9(3).
05 WS-CP-MAX PIC 9(3).
05 WS-CP-IN-QUOTES PIC X VALUE "N".
88 WS-CP-QUOTED VALUE "Y".
88 WS-CP-UNQUOTED VALUE "N".
05 WS-CP-CHAR PIC X.
* Output field map
01 WS-OUTPUT-MAP.
05 WS-OM-ENTRY OCCURS 6 TIMES.
10 WS-OM-START PIC 9(3).
10 WS-OM-LENGTH PIC 9(3).
* Record counter
01 WS-REC-COUNT PIC 9(7) VALUE ZERO.
01 WS-ERR-COUNT PIC 9(5) VALUE ZERO.
PROCEDURE DIVISION.
0000-MAIN.
PERFORM 1000-SETUP-FIELD-MAP
PERFORM 2000-PROCESS-TEST-DATA
DISPLAY "RECORDS: " WS-REC-COUNT
" ERRORS: " WS-ERR-COUNT
STOP RUN
.
1000-SETUP-FIELD-MAP.
* Define output positions for 6 fields
MOVE 001 TO WS-OM-START(1)
MOVE 010 TO WS-OM-LENGTH(1)
MOVE 011 TO WS-OM-START(2)
MOVE 030 TO WS-OM-LENGTH(2)
MOVE 041 TO WS-OM-START(3)
MOVE 020 TO WS-OM-LENGTH(3)
MOVE 061 TO WS-OM-START(4)
MOVE 015 TO WS-OM-LENGTH(4)
MOVE 076 TO WS-OM-START(5)
MOVE 002 TO WS-OM-LENGTH(5)
MOVE 078 TO WS-OM-START(6)
MOVE 010 TO WS-OM-LENGTH(6)
.
2000-PROCESS-TEST-DATA.
MOVE
'ABC001,"Smith, John",123 Main St,New York,NY,10001'
TO WS-CSV-LINE
PERFORM 3000-PARSE-AND-CONVERT
ADD 1 TO WS-REC-COUNT
MOVE
'DEF002,Jane Doe,"456 Oak Ave, Apt 3B",Boston,MA,02101'
TO WS-CSV-LINE
PERFORM 3000-PARSE-AND-CONVERT
ADD 1 TO WS-REC-COUNT
.
3000-PARSE-AND-CONVERT.
PERFORM 3100-PARSE-CSV-LINE
PERFORM 3200-BUILD-FIXED-RECORD
DISPLAY "OUTPUT: [" WS-OUTPUT-REC "]"
.
3100-PARSE-CSV-LINE.
MOVE ZERO TO WS-FV-COUNT
MOVE 1 TO WS-CP-POS
SET WS-CP-UNQUOTED TO TRUE
MOVE FUNCTION LENGTH(
FUNCTION TRIM(WS-CSV-LINE TRAILING))
TO WS-CP-MAX
PERFORM UNTIL WS-CP-POS > WS-CP-MAX
OR WS-FV-COUNT >= 20
MOVE WS-CP-POS TO WS-CP-START
MOVE ZERO TO WS-CP-LEN
* Check if field starts with quote
IF WS-CSV-LINE(WS-CP-POS:1) = '"'
* Quoted field - scan for closing quote
ADD 1 TO WS-CP-POS
MOVE WS-CP-POS TO WS-CP-START
PERFORM UNTIL WS-CP-POS > WS-CP-MAX
IF WS-CSV-LINE(WS-CP-POS:1) = '"'
EXIT PERFORM
END-IF
ADD 1 TO WS-CP-POS
END-PERFORM
COMPUTE WS-CP-LEN =
WS-CP-POS - WS-CP-START
ADD 1 TO WS-CP-POS
ELSE
* Unquoted field - scan for comma
PERFORM UNTIL WS-CP-POS > WS-CP-MAX
IF WS-CSV-LINE(WS-CP-POS:1) = ","
EXIT PERFORM
END-IF
ADD 1 TO WS-CP-POS
END-PERFORM
COMPUTE WS-CP-LEN =
WS-CP-POS - WS-CP-START
END-IF
* Store extracted field
ADD 1 TO WS-FV-COUNT
MOVE SPACES TO WS-FV-VALUE(WS-FV-COUNT)
IF WS-CP-LEN > 0 AND WS-CP-LEN <= 50
MOVE WS-CSV-LINE(
WS-CP-START:WS-CP-LEN)
TO WS-FV-VALUE(WS-FV-COUNT)
END-IF
MOVE WS-CP-LEN TO
WS-FV-LEN(WS-FV-COUNT)
* Step past the comma (or past the end of the line)
ADD 1 TO WS-CP-POS
END-PERFORM
.
3200-BUILD-FIXED-RECORD.
MOVE SPACES TO WS-OUTPUT-REC
PERFORM VARYING WS-CP-POS FROM 1 BY 1
UNTIL WS-CP-POS > WS-FV-COUNT
OR WS-CP-POS > 6
* Boundary check
IF WS-OM-START(WS-CP-POS) +
WS-OM-LENGTH(WS-CP-POS) - 1
> FUNCTION LENGTH(WS-OUTPUT-REC)
ADD 1 TO WS-ERR-COUNT
ELSE
IF WS-FV-LEN(WS-CP-POS) <=
WS-OM-LENGTH(WS-CP-POS)
MOVE WS-FV-VALUE(WS-CP-POS) TO
WS-OUTPUT-REC(
WS-OM-START(WS-CP-POS):
WS-OM-LENGTH(WS-CP-POS))
ELSE
* Truncate to output field size
MOVE WS-FV-VALUE(WS-CP-POS)(
1:WS-OM-LENGTH(WS-CP-POS))
TO WS-OUTPUT-REC(
WS-OM-START(WS-CP-POS):
WS-OM-LENGTH(WS-CP-POS))
END-IF
END-IF
END-PERFORM
.
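This complete worked example demonstrates:
- Quoted CSV parsing with reference modification (not UNSTRING)
- Quote handling by branching on the first character of each field (the WS-CP-IN-QUOTES flag is declared but never tested, because this version does not scan character by character)
- Field map-driven output generation
- Boundary checking on both input extraction and output placement
- Truncation handling when input exceeds output field width
- Error counting for operational monitoring

It does not handle escaped (doubled) quotes inside a quoted field.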
19.13 Defensive Programming for Reference Modification
Reference modification errors are among the hardest COBOL bugs to diagnose because, when range checking is off, an out-of-bounds reference silently reads or overwrites adjacent storage instead of failing.
Always Validate Position and Length
SAFE-REF-MOD.
* Leave the paragraph after each failed check so the
* unsafe MOVE below is never reached
IF WS-START-POS < 1
DISPLAY "START POS < 1: " WS-START-POS
PERFORM 9900-ERROR-HANDLER
EXIT PARAGRAPH
END-IF
IF WS-REF-LENGTH < 1
DISPLAY "LENGTH < 1: " WS-REF-LENGTH
PERFORM 9900-ERROR-HANDLER
EXIT PARAGRAPH
END-IF
COMPUTE WS-END-POS =
WS-START-POS + WS-REF-LENGTH - 1
IF WS-END-POS > LENGTH OF WS-TARGET
DISPLAY "REF MOD OVERFLOW: END POS "
WS-END-POS " > MAX "
LENGTH OF WS-TARGET
PERFORM 9900-ERROR-HANDLER
EXIT PARAGRAPH
END-IF
MOVE WS-TARGET(WS-START-POS:WS-REF-LENGTH)
TO WS-OUTPUT
.
Enable Runtime Range Checking
On IBM Enterprise COBOL, compile with the SSRANGE option, which generates runtime checks for subscripts and reference modification:
CBL SSRANGE
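On Micro Focus COBOL, the SSRANGE compiler directive plays the same role (the BOUND directive covers only subscripts and indexes). Check your compiler documentation for the exact syntax on your platform, for example:
$SET SSRANGE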
Test Edge Cases
Always test with:
- Position = 1 (first byte)
- Position = LENGTH OF item (last byte, length 1)
- Length = LENGTH OF item (entire field)
- Zero-length input strings
- Maximum-length input strings
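The edge cases listed above can be exercised with a small harness. The following is a sketch, assuming a 10-byte target and a hypothetical CHECK-AND-EXTRACT paragraph that applies the same three checks as SAFE-REF-MOD; it drives the first-byte, last-byte, and whole-field cases plus two references that must be rejected.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. EDGETEST.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-TARGET           PIC X(10) VALUE "ABCDEFGHIJ".
       01  WS-START-POS        PIC S9(4).
       01  WS-REF-LENGTH       PIC S9(4).
       01  WS-OUTPUT           PIC X(10).
       PROCEDURE DIVISION.
       MAIN-PARA.
      *> First byte
           MOVE 1  TO WS-START-POS
           MOVE 1  TO WS-REF-LENGTH
           PERFORM CHECK-AND-EXTRACT
      *> Last byte
           MOVE 10 TO WS-START-POS
           MOVE 1  TO WS-REF-LENGTH
           PERFORM CHECK-AND-EXTRACT
      *> Entire field
           MOVE 1  TO WS-START-POS
           MOVE 10 TO WS-REF-LENGTH
           PERFORM CHECK-AND-EXTRACT
      *> One byte past the end: must be rejected
           MOVE 10 TO WS-START-POS
           MOVE 2  TO WS-REF-LENGTH
           PERFORM CHECK-AND-EXTRACT
      *> Start position zero: must be rejected
           MOVE 0  TO WS-START-POS
           MOVE 5  TO WS-REF-LENGTH
           PERFORM CHECK-AND-EXTRACT
           STOP RUN.
       CHECK-AND-EXTRACT.
           MOVE SPACES TO WS-OUTPUT
           IF WS-START-POS >= 1
              AND WS-REF-LENGTH >= 1
              AND WS-START-POS + WS-REF-LENGTH - 1
                  <= FUNCTION LENGTH(WS-TARGET)
               MOVE WS-TARGET(WS-START-POS:WS-REF-LENGTH)
                   TO WS-OUTPUT
               DISPLAY "OK  [" WS-OUTPUT "]"
           ELSE
               DISPLAY "REJECTED"
           END-IF.
```

The first three calls should extract data and the last two should be rejected. Declaring the position and length as signed fields keeps a negative value from being stored as its absolute value before the check sees it.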
19.13 The Student Mainframe Lab
🧪 Try It Yourself: CSV Parser
Build a program that reads a CSV file and parses each line using reference modification (not UNSTRING). Handle:
1. Regular comma-separated values
2. Quoted fields that may contain commas: "Smith, John",42,"New York"
3. Empty fields between consecutive commas
4. Display each field on a separate line with its field number
Hint: Maintain a state variable ("in quotes" vs. "not in quotes") as you scan each character.
🧪 Try It Yourself: Dynamic Report Builder
Write a program that builds report lines dynamically using the pointer pattern. The report should include:
1. A configurable set of columns (name, account, balance, branch)
2. Variable-width columns (only as wide as the longest value)
3. Column separators
4. A header line and a separator line
5. Right-aligned numeric columns
Use reference modification with a pointer variable to assemble each line.
Common Reference Modification Bugs and Their Symptoms
Understanding how reference modification bugs manifest helps you diagnose them quickly in production:
Bug: Start position = 0
*> BUG: WS-POS was not initialized and contains 0
MOVE WS-DATA(WS-POS:5) TO WS-OUTPUT
Symptom: On z/OS without SSRANGE, this reads from one byte before the start of WS-DATA, producing garbage in the first byte of output. With SSRANGE enabled, it produces a runtime error. On some platforms, it may work by accident if the preceding byte happens to contain valid data.
Bug: Length exceeds remaining bytes
*> BUG: WS-POS = 45, WS-LEN = 10, but WS-DATA is PIC X(50)
MOVE WS-DATA(WS-POS:WS-LEN) TO WS-OUTPUT
*> Attempts to read positions 45-54, but only 45-50 exist
Symptom: Reads 4 bytes past the end of WS-DATA (positions 51-54), picking up whatever data follows it in WORKING-STORAGE. Often manifests as random characters appended to otherwise valid data. The results can change whenever you add or remove other working-storage items, which makes the bug look intermittent.
Bug: Moving to reference-modified target that overlaps
*> BUG: Source and target overlap
MOVE WS-RECORD(5:10) TO WS-RECORD(3:10)
Symptom: Results are unpredictable. The compiler may process the MOVE left-to-right or right-to-left, and the result depends on which direction. This is undefined behavior in the COBOL standard. If you need to shift data within a field, use an intermediate work area.
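The intermediate work area approach can be sketched as follows. The record content and the name WS-WORK are hypothetical; the point is only that neither MOVE has overlapping operands.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SHIFTOK.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-RECORD           PIC X(20)
               VALUE "ABCDEFGHIJKLMNOPQRST".
      *> Work area sized to the span being shifted.
       01  WS-WORK             PIC X(10).
       PROCEDURE DIVISION.
       MAIN-PARA.
      *> Step 1: copy the source span out of the record.
           MOVE WS-RECORD(5:10) TO WS-WORK
      *> Step 2: copy it back in at the new position.
           MOVE WS-WORK TO WS-RECORD(3:10)
           DISPLAY WS-RECORD
           STOP RUN.
```

This displays ABEFGHIJKLMNMNOPQRST: bytes 5 through 14 of the original now sit at positions 3 through 12, and the result no longer depends on the direction in which the compiler copies bytes.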
Bug: Negative or zero length from arithmetic
COMPUTE WS-LEN = WS-END-POS - WS-START-POS
*> If WS-END-POS < WS-START-POS, WS-LEN is negative
MOVE WS-DATA(WS-START-POS:WS-LEN) TO WS-OUTPUT
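Symptom: Unpredictable. If WS-LEN is declared unsigned (PIC 9), the COMPUTE stores the absolute value, so a result of -3 silently becomes a length of 3. If it is signed, a zero or negative length reaches the reference modification: some compilers treat that as an error, while others may process it as a very large positive number. Always validate: IF WS-LEN > ZERO AND WS-LEN <= maximum-reasonable-value.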
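A short sketch of the difference between signed and unsigned length fields, assuming hypothetical positions where the end falls before the start. The names WS-LEN-UNSIGNED and WS-LEN are illustrative; the signed field is declared SIGN IS LEADING SEPARATE only so that it displays readably.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LENCHECK.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-DATA             PIC X(30)
               VALUE "REFERENCE MODIFICATION".
       01  WS-START-POS        PIC 9(3) VALUE 12.
       01  WS-END-POS          PIC 9(3) VALUE 8.
       01  WS-LEN-UNSIGNED     PIC 9(4).
       01  WS-LEN              PIC S9(4)
               SIGN IS LEADING SEPARATE.
       01  WS-OUTPUT           PIC X(30).
       PROCEDURE DIVISION.
       MAIN-PARA.
      *> Unsigned receiver: the sign of 8 - 12 is lost.
           COMPUTE WS-LEN-UNSIGNED = WS-END-POS - WS-START-POS
           DISPLAY "UNSIGNED LENGTH: " WS-LEN-UNSIGNED
      *> Signed receiver: the negative result is preserved
      *> and can be rejected before any reference modification.
           COMPUTE WS-LEN = WS-END-POS - WS-START-POS
           IF WS-START-POS >= 1
              AND WS-LEN > ZERO
              AND WS-START-POS + WS-LEN - 1
                  <= FUNCTION LENGTH(WS-DATA)
               MOVE WS-DATA(WS-START-POS:WS-LEN) TO WS-OUTPUT
               DISPLAY "[" WS-OUTPUT "]"
           ELSE
               DISPLAY "REJECTED LENGTH: " WS-LEN
           END-IF
           STOP RUN.
```

The unsigned field ends up holding 4, which would pass a simple greater-than-zero test, while the signed field keeps the negative value and the reference is rejected.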
⚠️ Production Story: Derek Washington spent three days debugging a production problem where customer addresses occasionally had random account numbers appended to them. The cause was a reference modification that read 5 bytes past the end of the address field. The bytes that followed in WORKING-STORAGE happened to be the start of the account number field. Adding a single boundary check (IF WS-POS + WS-LEN - 1 > LENGTH OF WS-ADDRESS) fixed the bug permanently.
19.15 Performance Considerations for Reference Modification
Reference modification is generally very fast because it translates to simple address arithmetic at the machine instruction level. However, some patterns can cause performance issues:
Avoid Repeated LENGTH OF Calculations in Loops
*> POTENTIALLY SLOWER: FUNCTION LENGTH may be re-evaluated on each iteration
PERFORM VARYING WS-POS FROM 1 BY 1
UNTIL WS-POS > FUNCTION LENGTH(WS-DATA)
...
END-PERFORM
*> FASTER: Compute once, reuse
MOVE FUNCTION LENGTH(WS-DATA) TO WS-MAX-POS
PERFORM VARYING WS-POS FROM 1 BY 1
UNTIL WS-POS > WS-MAX-POS
...
END-PERFORM
For fixed-length fields, FUNCTION LENGTH is resolved at compile time and has zero runtime cost. But for variable-length fields (OCCURS DEPENDING ON) or function arguments, it may require runtime computation.
Minimize Reference Modification in Inner Loops
If you are processing millions of records and each record requires multiple reference modification operations, consider using REDEFINES to create named fields for the most common extraction patterns, reserving reference modification for the truly dynamic cases.
*> Instead of this (many ref-mods per record):
MOVE WS-RECORD(1:10) TO WS-ACCT
MOVE WS-RECORD(11:8) TO WS-DATE
MOVE WS-RECORD(19:12) TO WS-AMOUNT
*> Use this (named fields, no ref-mod at all):
01 WS-RECORD.
05 WS-ACCT PIC X(10).
05 WS-DATE PIC 9(8).
05 WS-AMOUNT PIC 9(10)V99.
Reference modification is for cases where the field boundaries are not known at compile time. When they are known, use named fields: they are more readable, they carry their own PICTURE and USAGE, and they are at least as fast.
19.16 Chapter Summary
Reference modification and pointer techniques give you byte-level control over COBOL data, enabling capabilities that go far beyond standard MOVE and STRING operations:
- Reference modification identifier(start:length) extracts or overwrites specific bytes within any alphanumeric item. Both start and length can be arithmetic expressions, enabling fully dynamic data access.
- LENGTH OF and FUNCTION LENGTH return the byte size of data items, essential for safe boundary checking in reference modification operations.
- Dynamic substring operations — scanning for delimiters, extracting tokens, parsing variable-format records — are built by combining reference modification with position-tracking variables.
- The pointer build pattern — maintaining a position counter and appending pieces via reference modification — is the standard COBOL idiom for constructing dynamic output.
- ADDRESS OF and POINTER data items provide memory-level access for interfacing with non-COBOL programs and processing dynamically allocated buffers, but carry significant risk.
- STRING/UNSTRING WITH POINTER extends the built-in string operations with incremental processing capability.
- Defensive programming is critical: always validate position and length before reference modification, enable SSRANGE during development, and test boundary conditions exhaustively.
These techniques are essential for processing modern data interchange formats (EDI, delimited files, variable-format records) within COBOL programs, bridging the gap between COBOL's fixed-format heritage and the variable-format data that modern systems exchange.
⚖️ The Modernization Spectrum: Reference modification represents COBOL's pragmatic response to a changing data landscape. Where COBOL was designed for fixed-format records with known field positions, today's data is often variable-format, delimited, or structured as XML and JSON. Reference modification does not attempt to transform COBOL into a string-processing language like Python or Perl — instead, it provides just enough byte-level access to handle the variable-format cases that fixed-format structures cannot. This is the Modernization Spectrum in action: adapting without abandoning the language's core strengths of clarity, reliability, and decimal precision.
"Reference modification is where COBOL meets C. It gives you the same byte-level power — and the same responsibility. Use it wisely." — Maria Chen, Senior Developer, GlobalBank