Chapter 19: Pointer and Reference Modification

DataField.Dev

> "When you absolutely need to reach into a data item and extract exactly the bytes you want — no more, no less — reference modification is your scalpel. It is precise, powerful, and unforgiving of errors." — Priya Kapoor, Architect, GlobalBank

In This Chapter

19.1 Reference Modification Fundamentals
19.2 The LENGTH OF Special Register
19.3 Dynamic Substring Operations
19.4 Parsing Without UNSTRING: Reference Modification Patterns
19.5 Advanced Parsing Patterns with Reference Modification
19.6 Production Patterns: Multi-Record Parsing
19.7 Pointer-Based Processing: ADDRESS OF
19.6 Processing Variable-Format Records
19.7 STRING Statement with POINTER Phrase
19.8 UNSTRING with POINTER Phrase
19.9 Advanced Technique: Building a Generic Field Extractor
19.10 Reference Modification in CICS and Online Programs
19.11 GlobalBank Case Study: Transaction Description Parser
19.12 MedClaim Case Study: EDI 837 Claim Parsing
19.12 Complete Worked Example: CSV to Fixed-Width Converter
19.13 Defensive Programming for Reference Modification
19.13 The Student Mainframe Lab
19.15 Performance Considerations for Reference Modification
19.16 Chapter Summary

Most COBOL data manipulation works at the field level: you MOVE entire fields, INSPECT entire fields, STRING and UNSTRING entire fields. But there are times when you need finer control — when you need to extract the third through seventh characters of a field, or build an output string one piece at a time without knowing in advance how long each piece will be, or parse a variable-format record where field boundaries shift depending on the data.

Reference modification gives you byte-level access to any alphanumeric data item. Combined with the LENGTH OF special register and pointer-based techniques, it provides the tools you need for advanced data manipulation — the same kind of substring and pointer operations that programmers in C or Java take for granted.

In this chapter, we explore reference modification in depth, develop practical patterns for parsing and building dynamic strings, and apply these techniques to real-world problems at both GlobalBank and MedClaim.

19.1 Reference Modification Fundamentals

Reference modification allows you to refer to a substring of any alphanumeric or national data item by specifying a starting position and, optionally, a length.

Basic Syntax

identifier(starting-position : length)

starting-position: An arithmetic expression that evaluates to a positive integer. Position 1 is the leftmost byte.
length: An optional arithmetic expression that evaluates to a positive integer. If omitted, the substring extends from the starting position to the end of the item.

Simple Examples

01  WS-FULL-NAME    PIC X(30) VALUE "CHEN, MARIA L.".
01  WS-FIRST-CHAR   PIC X(1).
01  WS-LAST-NAME    PIC X(15).
01  WS-SUBSTRING    PIC X(10).

PROCEDURE DIVISION.
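    MOVE WS-FULL-NAME(1:4) TO WS-LAST-NAME
    *> WS-LAST-NAME = "CHEN" (positions 1-4), space-padded

    MOVE WS-FULL-NAME(7:1) TO WS-FIRST-CHAR
    *> WS-FIRST-CHAR = "M" (position 7)

    MOVE WS-FULL-NAME(7:5) TO WS-SUBSTRING
    *> WS-SUBSTRING = "MARIA     " (positions 7-11, padded)

    MOVE WS-FULL-NAME(7:) TO WS-SUBSTRING
    *> Sends position 7 through 30; the 10-byte receiver
    *> truncates it to "MARIA L.  "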

Rules and Constraints

The starting position must be >= 1.
The starting position plus length minus 1 must not exceed the total length of the data item.
Both starting position and length must evaluate to positive integers.
Reference modification can be applied to any alphanumeric, alphabetic, or national item — including group items.
Reference modification can be used anywhere an identifier is valid: MOVE, IF, EVALUATE, STRING, DISPLAY, etc.

⚠️ Defensive Programming Alert: The compiler does not generate runtime boundary checks for reference modification unless you enable the SSRANGE (or equivalent) option. Out-of-range reference modification silently reads or writes memory outside the data item, causing data corruption or abends that are extremely difficult to diagnose.

Arithmetic Expressions in Reference Modification

The starting position and length can be arithmetic expressions:

01  WS-POS   PIC 99 VALUE 1.
01  WS-LEN   PIC 99 VALUE 5.
01  WS-DATA  PIC X(50).

    MOVE WS-DATA(WS-POS:WS-LEN) TO WS-OUTPUT

    *> Start 3 positions past current, take 10 bytes
    MOVE WS-DATA(WS-POS + 3 : WS-LEN + 5)
        TO WS-OUTPUT

    *> Dynamic extraction based on calculated values
    COMPUTE WS-POS = WS-FIELD-OFFSET + 1
    COMPUTE WS-LEN = WS-FIELD-LENGTH
    MOVE WS-DATA(WS-POS:WS-LEN) TO WS-EXTRACTED

Reference Modification on Group Items

You can apply reference modification to group items, treating the entire group as a string of bytes:

01  WS-CUSTOMER-RECORD.
    05  WS-CUST-ID      PIC 9(6).
    05  WS-CUST-NAME    PIC X(30).
    05  WS-CUST-ADDR    PIC X(50).

    *> Extract bytes 7 through 36 (the name field)
    MOVE WS-CUSTOMER-RECORD(7:30) TO WS-NAME-ONLY

    *> This is equivalent to:
    MOVE WS-CUST-NAME TO WS-NAME-ONLY

While extracting named fields this way is pointless (just use the field name), it becomes valuable when processing records with variable layouts or when the field boundaries are data-driven.

19.2 The LENGTH OF Special Register

The LENGTH OF special register returns the number of bytes allocated to a data item. It is evaluated at compile time for fixed-length items and at runtime for variable-length items (those with OCCURS DEPENDING ON).

01  WS-RECORD      PIC X(100).
01  WS-REC-LENGTH  PIC 9(5).

    MOVE LENGTH OF WS-RECORD TO WS-REC-LENGTH
    *> WS-REC-LENGTH = 100

01  WS-TABLE-AREA.
    05  WS-COUNT    PIC 9(3) COMP.
    05  WS-ITEM     PIC X(20) OCCURS 1 TO 50 TIMES
                    DEPENDING ON WS-COUNT.

    MOVE 10 TO WS-COUNT
    MOVE LENGTH OF WS-TABLE-AREA TO WS-REC-LENGTH
    *> WS-REC-LENGTH = 2 + (10 * 20) = 202

Using LENGTH OF with Reference Modification

LENGTH OF is invaluable for building safe reference modification logic:

    IF WS-POS < 1 OR WS-LEN < 1
       OR WS-POS + WS-LEN - 1 > LENGTH OF WS-DATA
        DISPLAY "REF MOD OUT OF RANGE"
        DISPLAY "POS=" WS-POS " LEN=" WS-LEN
                " MAX=" LENGTH OF WS-DATA
        PERFORM 9900-ERROR-HANDLER
    ELSE
        MOVE WS-DATA(WS-POS:WS-LEN) TO WS-OUTPUT
    END-IF

FUNCTION LENGTH vs. LENGTH OF

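COBOL provides both LENGTH OF (a special register) and FUNCTION LENGTH (an intrinsic function). For alphanumeric items they return the same value. They differ in two ways: FUNCTION LENGTH also accepts literals, and for national (PIC N) items it counts character positions where LENGTH OF counts bytes:

    MOVE LENGTH OF WS-FIELD TO WS-LEN
    *> Returns allocated byte length

    MOVE FUNCTION LENGTH(WS-FIELD) TO WS-LEN
    *> Same value for an alphanumeric item; for a national
    *> item, character positions rather than bytes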

    COMPUTE WS-LAST = FUNCTION LENGTH(WS-FIELD)
    MOVE WS-FIELD(WS-LAST:1) TO WS-LAST-CHAR
    *> Gets the last character of the field

FUNCTION LENGTH can also be applied to literals:

    MOVE FUNCTION LENGTH("HELLO WORLD") TO WS-LEN
    *> WS-LEN = 11

19.3 Dynamic Substring Operations

Reference modification becomes truly powerful when combined with variables for position and length, enabling dynamic data extraction.

Scanning for a Delimiter

A common pattern is scanning a field for a specific character and extracting what comes before and after it:

       01  WS-INPUT       PIC X(80).
       01  WS-PART-1      PIC X(80).
       01  WS-PART-2      PIC X(80).
       01  WS-SCAN-POS    PIC 99.
       01  WS-DELIM-POS   PIC 99 VALUE ZERO.
       01  WS-INPUT-LEN   PIC 99.

       PERFORM SPLIT-ON-COMMA.

       SPLIT-ON-COMMA.
           MOVE SPACES TO WS-PART-1 WS-PART-2
           MOVE FUNCTION LENGTH(WS-INPUT) TO WS-INPUT-LEN

      *    Find the comma
           MOVE ZERO TO WS-DELIM-POS
           PERFORM VARYING WS-SCAN-POS FROM 1 BY 1
               UNTIL WS-SCAN-POS > WS-INPUT-LEN
                  OR WS-DELIM-POS > ZERO
               IF WS-INPUT(WS-SCAN-POS:1) = ","
                   MOVE WS-SCAN-POS TO WS-DELIM-POS
               END-IF
           END-PERFORM

           IF WS-DELIM-POS > ZERO
      *        Extract before comma
               IF WS-DELIM-POS > 1
                   MOVE WS-INPUT(1:WS-DELIM-POS - 1)
                       TO WS-PART-1
               END-IF
      *        Extract after comma
               IF WS-DELIM-POS < WS-INPUT-LEN
                   MOVE WS-INPUT(WS-DELIM-POS + 1:)
                       TO WS-PART-2
               END-IF
           ELSE
      *        No comma found - entire input is part 1
               MOVE WS-INPUT TO WS-PART-1
           END-IF
           .

💡 Why not use UNSTRING? UNSTRING is often cleaner for simple delimiter-based splitting. But reference modification gives you more control: you can handle multiple delimiter types, skip quoted sections, process nested delimiters, or extract fixed-position fields from variable-format records where UNSTRING would be awkward.
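As one illustration of the "skip quoted sections" case, the following is a minimal sketch of a complete program that splits on the first comma found outside double quotes. The program name QSPLIT, the sample value, and the WS-QUOTE-FLAG switch are assumptions made for this example; it does not handle escaped or unbalanced quotes.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. QSPLIT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-INPUT       PIC X(80)
           VALUE '"CHEN, MARIA",CHECKING'.
       01  WS-PART-1      PIC X(80).
       01  WS-PART-2      PIC X(80).
       01  WS-SCAN-POS    PIC 9(3).
       01  WS-DELIM-POS   PIC 9(3) VALUE ZERO.
       01  WS-INPUT-LEN   PIC 9(3).
      *    Hypothetical state switch: are we inside quotes?
       01  WS-QUOTE-FLAG  PIC X(1) VALUE "N".
           88  WS-IN-QUOTES      VALUE "Y".
           88  WS-NOT-IN-QUOTES  VALUE "N".
       PROCEDURE DIVISION.
       MAIN-PARA.
           MOVE SPACES TO WS-PART-1 WS-PART-2
           MOVE FUNCTION LENGTH(WS-INPUT) TO WS-INPUT-LEN
           MOVE ZERO TO WS-DELIM-POS
           SET WS-NOT-IN-QUOTES TO TRUE
      *    Find the first comma that is outside double quotes
           PERFORM VARYING WS-SCAN-POS FROM 1 BY 1
               UNTIL WS-SCAN-POS > WS-INPUT-LEN
                  OR WS-DELIM-POS > ZERO
               EVALUATE TRUE
                   WHEN WS-INPUT(WS-SCAN-POS:1) = '"'
      *                A quote toggles the in-quotes state
                       IF WS-IN-QUOTES
                           SET WS-NOT-IN-QUOTES TO TRUE
                       ELSE
                           SET WS-IN-QUOTES TO TRUE
                       END-IF
                   WHEN WS-INPUT(WS-SCAN-POS:1) = ","
                        AND WS-NOT-IN-QUOTES
                       MOVE WS-SCAN-POS TO WS-DELIM-POS
               END-EVALUATE
           END-PERFORM
           IF WS-DELIM-POS > 1
               MOVE WS-INPUT(1:WS-DELIM-POS - 1) TO WS-PART-1
           END-IF
           IF WS-DELIM-POS > ZERO
              AND WS-DELIM-POS < WS-INPUT-LEN
               MOVE WS-INPUT(WS-DELIM-POS + 1:) TO WS-PART-2
           END-IF
      *    No unquoted comma: the whole input is part 1
           IF WS-DELIM-POS = ZERO
               MOVE WS-INPUT TO WS-PART-1
           END-IF
           DISPLAY "PART 1: " FUNCTION TRIM(WS-PART-1)
           DISPLAY "PART 2: " FUNCTION TRIM(WS-PART-2)
           STOP RUN.
```

For the sample value, the comma at position 6 sits inside the quotes and is ignored, so the split happens at the comma in position 14: the first part keeps the quoted name and the second part is CHECKING.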

Extracting Fixed Fields from Variable Positions

Consider a record where the first 2 bytes indicate a record type, and the layout of the remaining bytes depends on the type:

01  WS-VARIABLE-RECORD  PIC X(200).
01  WS-REC-TYPE         PIC X(2).
01  WS-FIELD-A          PIC X(20).
01  WS-FIELD-B          PIC X(10).
01  WS-START            PIC 9(3).

    MOVE WS-VARIABLE-RECORD(1:2) TO WS-REC-TYPE

    EVALUATE WS-REC-TYPE
        WHEN "01"
            *> Type 01: name at pos 3-22, code at pos 23-32
            MOVE WS-VARIABLE-RECORD(3:20) TO WS-FIELD-A
            MOVE WS-VARIABLE-RECORD(23:10) TO WS-FIELD-B
        WHEN "02"
            *> Type 02: code at pos 3-12, name at pos 13-32
            MOVE WS-VARIABLE-RECORD(13:20) TO WS-FIELD-A
            MOVE WS-VARIABLE-RECORD(3:10) TO WS-FIELD-B
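        WHEN "03"
            *> Type 03: 2-digit name length at pos 3-4,
            *> name starts at pos 5. WS-START holds the
            *> length here; validate it before use.
            IF WS-VARIABLE-RECORD(3:2) IS NUMERIC
                MOVE WS-VARIABLE-RECORD(3:2) TO WS-START
                IF WS-START >= 1 AND WS-START <= 20
                    MOVE WS-VARIABLE-RECORD(5:WS-START)
                        TO WS-FIELD-A
                END-IF
            END-IF
    END-EVALUATE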

19.4 Parsing Without UNSTRING: Reference Modification Patterns

Reference modification enables powerful parsing patterns that go beyond what UNSTRING can do easily.

Pattern 1: Token Extraction with Variable-Width Delimiters

      *------------------------------------------------------------
      * Extract tokens from a pipe-delimited string where
      * fields can be empty (consecutive pipes)
      *------------------------------------------------------------
       01  WS-INPUT-STRING  PIC X(200).
       01  WS-TOKENS.
           05  WS-TOKEN     PIC X(50) OCCURS 10 TIMES.
       01  WS-PARSE-VARS.
           05  WS-CURR-POS  PIC 9(3) VALUE 1.
           05  WS-TOKEN-START PIC 9(3).
           05  WS-TOKEN-NUM  PIC 9(2) VALUE 1.
           05  WS-MAX-LEN   PIC 9(3).
           05  WS-TOKEN-LEN  PIC 9(3).

       PARSE-PIPE-DELIMITED.
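      *    Scan only up to the last non-space byte; otherwise the
      *    final token absorbs the trailing padding, exceeds 50
      *    bytes, and is silently skipped
           MOVE FUNCTION LENGTH(
               FUNCTION TRIM(WS-INPUT-STRING TRAILING))
               TO WS-MAX-LEN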
           MOVE 1 TO WS-CURR-POS
           MOVE 1 TO WS-TOKEN-NUM
           MOVE SPACES TO WS-TOKENS

           PERFORM UNTIL WS-CURR-POS > WS-MAX-LEN
                      OR WS-TOKEN-NUM > 10
               MOVE WS-CURR-POS TO WS-TOKEN-START
      *        Scan for next pipe or end of string
               PERFORM UNTIL WS-CURR-POS > WS-MAX-LEN
                   IF WS-INPUT-STRING(WS-CURR-POS:1)
                      = "|"
                       EXIT PERFORM
                   END-IF
                   ADD 1 TO WS-CURR-POS
               END-PERFORM

      *        Extract token
               COMPUTE WS-TOKEN-LEN =
                   WS-CURR-POS - WS-TOKEN-START
               IF WS-TOKEN-LEN > ZERO
                AND WS-TOKEN-LEN <= 50
                   MOVE WS-INPUT-STRING(
                       WS-TOKEN-START:WS-TOKEN-LEN)
                       TO WS-TOKEN(WS-TOKEN-NUM)
               END-IF

               ADD 1 TO WS-TOKEN-NUM
               ADD 1 TO WS-CURR-POS
           END-PERFORM
           .

Pattern 2: Right-to-Left Parsing

Sometimes you need to parse from the right — for example, extracting a file extension from a filename:

       01  WS-FILENAME    PIC X(80).
       01  WS-EXTENSION   PIC X(10).
       01  WS-BASENAME    PIC X(70).
       01  WS-SCAN        PIC 9(3).
       01  WS-DOT-POS     PIC 9(3) VALUE ZERO.
       01  WS-NAME-LEN    PIC 9(3).

       EXTRACT-EXTENSION.
           MOVE SPACES TO WS-EXTENSION WS-BASENAME
           MOVE FUNCTION LENGTH(WS-FILENAME)
               TO WS-NAME-LEN

      *    Find rightmost period by scanning backward
           MOVE ZERO TO WS-DOT-POS
           PERFORM VARYING WS-SCAN
               FROM WS-NAME-LEN BY -1
               UNTIL WS-SCAN < 1
                  OR WS-DOT-POS > ZERO
               IF WS-FILENAME(WS-SCAN:1) = "."
                   MOVE WS-SCAN TO WS-DOT-POS
               END-IF
           END-PERFORM

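      *    Require at least one byte before the dot so that
      *    WS-FILENAME(1:WS-DOT-POS - 1) never has length zero
           IF WS-DOT-POS > 1 AND < WS-NAME-LEN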
               MOVE WS-FILENAME(WS-DOT-POS + 1:
                   WS-NAME-LEN - WS-DOT-POS)
                   TO WS-EXTENSION
               MOVE WS-FILENAME(1:WS-DOT-POS - 1)
                   TO WS-BASENAME
           ELSE
               MOVE WS-FILENAME TO WS-BASENAME
               MOVE SPACES TO WS-EXTENSION
           END-IF
           .

Pattern 3: Building Dynamic Output with a Pointer

A common and extremely useful pattern is building an output string piece by piece using a position pointer:

       01  WS-OUTPUT-LINE  PIC X(132).
       01  WS-OUT-PTR      PIC 9(3) VALUE 1.
       01  WS-PIECE-LEN    PIC 9(3).

       BUILD-OUTPUT-LINE.
           MOVE SPACES TO WS-OUTPUT-LINE
           MOVE 1 TO WS-OUT-PTR

      *    Append account number
           MOVE WS-ACCT-NUM TO
               WS-OUTPUT-LINE(WS-OUT-PTR:10)
           ADD 10 TO WS-OUT-PTR

      *    Append separator
           MOVE " | " TO
               WS-OUTPUT-LINE(WS-OUT-PTR:3)
           ADD 3 TO WS-OUT-PTR

      *    Append name (trimmed)
           MOVE FUNCTION LENGTH(
               FUNCTION TRIM(WS-CUST-NAME TRAILING))
               TO WS-PIECE-LEN
           IF WS-PIECE-LEN > ZERO
               MOVE FUNCTION TRIM(
                   WS-CUST-NAME TRAILING)
                   TO WS-OUTPUT-LINE(
                      WS-OUT-PTR:WS-PIECE-LEN)
               ADD WS-PIECE-LEN TO WS-OUT-PTR
           END-IF

      *    Append separator
           MOVE " | " TO
               WS-OUTPUT-LINE(WS-OUT-PTR:3)
           ADD 3 TO WS-OUT-PTR

      *    Append formatted balance
           MOVE WS-FORMATTED-BAL TO
               WS-OUTPUT-LINE(WS-OUT-PTR:15)
           ADD 15 TO WS-OUT-PTR
           .

This pattern produces tightly packed output without the wasted space that occurs when moving fixed-length fields to fixed positions. It is the COBOL equivalent of string concatenation in modern languages.

📊 Performance Note: The pointer-based build pattern can be more efficient than repeated STRING ... DELIMITED BY operations that name a delimiter, because STRING must then scan each sending field for that delimiter on every call (DELIMITED BY SIZE avoids the scan). Reference modification with a pointer goes directly to the target position. The difference depends on compiler and data, so measure before rewriting working code.
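The Pattern 3 listing advances the pointer without checking that the next piece still fits in the output line. The sketch below wraps the append step in one reusable paragraph that performs that check. The program name APPENDER, the staging fields WS-PIECE and WS-PIECE-LEN, the overflow switch, and the 40-byte line are all assumptions chosen to keep the example small.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. APPENDER.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-OUTPUT-LINE   PIC X(40).
       01  WS-OUT-PTR       PIC 9(3) VALUE 1.
      *    Hypothetical staging fields for one appended piece
       01  WS-PIECE         PIC X(30).
       01  WS-PIECE-LEN     PIC 9(3).
       01  WS-OVERFLOW-FLAG PIC X(1) VALUE "N".
           88  WS-OVERFLOW      VALUE "Y".
       PROCEDURE DIVISION.
       MAIN-PARA.
           MOVE SPACES TO WS-OUTPUT-LINE
           MOVE 1 TO WS-OUT-PTR
           MOVE "0012345678" TO WS-PIECE
           MOVE 10 TO WS-PIECE-LEN
           PERFORM APPEND-PIECE
           MOVE " | " TO WS-PIECE
           MOVE 3 TO WS-PIECE-LEN
           PERFORM APPEND-PIECE
           MOVE "CHEN, MARIA L." TO WS-PIECE
           MOVE 14 TO WS-PIECE-LEN
           PERFORM APPEND-PIECE
      *    27 bytes are used; 15 more would end at byte 42,
      *    so this piece is rejected
           MOVE " | 000001234.56" TO WS-PIECE
           MOVE 15 TO WS-PIECE-LEN
           PERFORM APPEND-PIECE
           DISPLAY "[" WS-OUTPUT-LINE "]"
           IF WS-OVERFLOW
               DISPLAY "OUTPUT TRUNCATED"
           END-IF
           STOP RUN.

       APPEND-PIECE.
      *    Append only when the piece fits in source and target
           IF WS-PIECE-LEN > ZERO
               IF WS-PIECE-LEN > LENGTH OF WS-PIECE
                  OR WS-OUT-PTR + WS-PIECE-LEN - 1
                     > LENGTH OF WS-OUTPUT-LINE
                   SET WS-OVERFLOW TO TRUE
               ELSE
                   MOVE WS-PIECE(1:WS-PIECE-LEN)
                       TO WS-OUTPUT-LINE(
                          WS-OUT-PTR:WS-PIECE-LEN)
                   ADD WS-PIECE-LEN TO WS-OUT-PTR
               END-IF
           END-IF
           .
```

Keeping the bounds check in a single paragraph means every append goes through the same guard, at the cost of staging each piece in a work field first.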

19.5 Advanced Parsing Patterns with Reference Modification

Before we cover pointer-based processing, let us explore several more parsing patterns that arise in production COBOL programs. These patterns combine the scanning, extracting, and building techniques from the previous sections into reusable solutions for common data manipulation problems.

Pattern 4: Nested Delimiter Parsing

Some data formats use multiple delimiter levels. For example, GlobalBank's batch control records use pipes to separate fields and colons to separate sub-fields:

CTRL|BATCH:20240615:001|STATUS:OPEN|RECORDS:15234|AMOUNT:4523891.77

The parser must handle two levels: first split on pipes, then split selected fields on colons:

       01  WS-CTRL-RECORD    PIC X(200).
       01  WS-LEVEL-1.
           05  WS-L1-COUNT   PIC 99 VALUE ZERO.
           05  WS-L1-FIELD   PIC X(60) OCCURS 10 TIMES.
       01  WS-LEVEL-2.
           05  WS-L2-COUNT   PIC 99 VALUE ZERO.
           05  WS-L2-FIELD   PIC X(30) OCCURS 5 TIMES.
       01  WS-PARSE-CTL.
           05  WS-PC-POS     PIC 9(3).
           05  WS-PC-START   PIC 9(3).
           05  WS-PC-LEN     PIC 9(3).
           05  WS-PC-MAX     PIC 9(3).
           05  WS-PC-DELIM   PIC X(1).

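       PARSE-CTRL-RECORD.
      *    SPLIT-ON-DELIMITER (not shown) is assumed to split
      *    WS-WORK-BUFFER on WS-PC-DELIM, fill WS-SPLIT-RESULT,
      *    and set WS-SPLIT-COUNT (an assumed counter name).
      *    Level 1: Split on pipes
           MOVE "|" TO WS-PC-DELIM
           MOVE WS-CTRL-RECORD TO WS-WORK-BUFFER
           PERFORM SPLIT-ON-DELIMITER
           MOVE WS-SPLIT-COUNT TO WS-L1-COUNT
      *    Copy results to level-1 array
           PERFORM VARYING WS-PC-POS FROM 1 BY 1
               UNTIL WS-PC-POS > WS-L1-COUNT
               MOVE WS-SPLIT-RESULT(WS-PC-POS)
                   TO WS-L1-FIELD(WS-PC-POS)
           END-PERFORM

      *    Level 2: Split field 2 on colons
           MOVE ":" TO WS-PC-DELIM
           MOVE WS-L1-FIELD(2) TO WS-WORK-BUFFER
           PERFORM SPLIT-ON-DELIMITER
           MOVE WS-SPLIT-COUNT TO WS-L2-COUNT
      *    Copy results to level-2 array
           PERFORM VARYING WS-PC-POS FROM 1 BY 1
               UNTIL WS-PC-POS > WS-L2-COUNT
               MOVE WS-SPLIT-RESULT(WS-PC-POS)
                   TO WS-L2-FIELD(WS-PC-POS)
           END-PERFORM
      *    Now WS-L2-FIELD(1) = "BATCH"
      *        WS-L2-FIELD(2) = "20240615"
      *        WS-L2-FIELD(3) = "001"
           .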

The SPLIT-ON-DELIMITER paragraph is a reusable component that accepts any single-character delimiter and populates a result array. Building reusable parsing components is a hallmark of well-designed COBOL programs.
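The chapter does not list SPLIT-ON-DELIMITER itself, so the following is only one way such a paragraph might look, wrapped in a small driver so it can be run on its own. The layout of WS-SPLIT-AREA, the WS-SPLIT-COUNT counter, the 60-byte result width, and the program name SPLITDEMO are assumptions for this sketch. A trailing delimiter does not produce a final empty field here.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SPLITDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-WORK-BUFFER    PIC X(200).
       01  WS-PC-DELIM       PIC X(1).
      *    Assumed result area shared by all callers
       01  WS-SPLIT-AREA.
           05  WS-SPLIT-COUNT   PIC 99 VALUE ZERO.
           05  WS-SPLIT-RESULT  PIC X(60) OCCURS 10 TIMES.
       01  WS-PC-POS         PIC 9(3).
       01  WS-PC-START       PIC 9(3).
       01  WS-PC-LEN         PIC 9(3).
       01  WS-PC-MAX         PIC 9(3).
       01  WS-IDX            PIC 99.
       PROCEDURE DIVISION.
       MAIN-PARA.
           MOVE "BATCH:20240615:001" TO WS-WORK-BUFFER
           MOVE ":" TO WS-PC-DELIM
           PERFORM SPLIT-ON-DELIMITER
           PERFORM VARYING WS-IDX FROM 1 BY 1
               UNTIL WS-IDX > WS-SPLIT-COUNT
               DISPLAY WS-IDX ": "
                   FUNCTION TRIM(WS-SPLIT-RESULT(WS-IDX))
           END-PERFORM
           STOP RUN.

       SPLIT-ON-DELIMITER.
      *    Zero the count and blank every result slot
           INITIALIZE WS-SPLIT-AREA
      *    Ignore trailing padding in the work buffer
           MOVE FUNCTION LENGTH(
               FUNCTION TRIM(WS-WORK-BUFFER TRAILING))
               TO WS-PC-MAX
           MOVE 1 TO WS-PC-POS
           PERFORM UNTIL WS-PC-POS > WS-PC-MAX
                      OR WS-SPLIT-COUNT >= 10
               MOVE WS-PC-POS TO WS-PC-START
      *        Advance to the next delimiter or end of data
               PERFORM UNTIL WS-PC-POS > WS-PC-MAX
                   IF WS-WORK-BUFFER(WS-PC-POS:1) = WS-PC-DELIM
                       EXIT PERFORM
                   END-IF
                   ADD 1 TO WS-PC-POS
               END-PERFORM
               ADD 1 TO WS-SPLIT-COUNT
               COMPUTE WS-PC-LEN = WS-PC-POS - WS-PC-START
      *        Empty fields stay blank; long fields are cut to 60
               IF WS-PC-LEN > 60
                   MOVE 60 TO WS-PC-LEN
               END-IF
               IF WS-PC-LEN > ZERO
                   MOVE WS-WORK-BUFFER(WS-PC-START:WS-PC-LEN)
                       TO WS-SPLIT-RESULT(WS-SPLIT-COUNT)
               END-IF
      *        Step past the delimiter
               ADD 1 TO WS-PC-POS
           END-PERFORM
           .
```

Run against the sample sub-field, the driver lists the three pieces BATCH, 20240615, and 001. Because the delimiter is a parameter, the same paragraph serves both the pipe level and the colon level.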

Pattern 5: Fixed-Width Field Extraction with a Field Map

When processing records from multiple sources with different layouts, a field map table eliminates hard-coded positions:

       01  WS-FIELD-MAP.
           05  WS-FM-COUNT    PIC 99 VALUE ZERO.
           05  WS-FM-ENTRY    OCCURS 20 TIMES.
               10  WS-FM-NAME    PIC X(15).
               10  WS-FM-START   PIC 9(3).
               10  WS-FM-LENGTH  PIC 9(3).

       01  WS-INPUT-RECORD   PIC X(500).
       01  WS-EXTRACTED      PIC X(100).
       01  WS-FM-IDX         PIC 99.

       EXTRACT-BY-FIELD-MAP.
           PERFORM VARYING WS-FM-IDX FROM 1 BY 1
               UNTIL WS-FM-IDX > WS-FM-COUNT
      *        Boundary check
               IF WS-FM-START(WS-FM-IDX) +
                  WS-FM-LENGTH(WS-FM-IDX) - 1
                  > FUNCTION LENGTH(WS-INPUT-RECORD)
                   DISPLAY "FIELD MAP OVERFLOW: "
                           WS-FM-NAME(WS-FM-IDX)
                   PERFORM 9100-LOG-ERROR
               ELSE
                   MOVE SPACES TO WS-EXTRACTED
                   MOVE WS-INPUT-RECORD(
                       WS-FM-START(WS-FM-IDX):
                       WS-FM-LENGTH(WS-FM-IDX))
                       TO WS-EXTRACTED
                   DISPLAY WS-FM-NAME(WS-FM-IDX)
                           " = [" WS-EXTRACTED "]"
               END-IF
           END-PERFORM
           .

This data-driven approach means you can support new record layouts by loading a different field map — no program changes required. It is the COBOL equivalent of a schema or metadata definition.

📊 Production Pattern: MedClaim's EDI processing system uses field maps stored in DB2 tables. When a new payer sends claims in a slightly different format, Sarah Kim updates the field map rows rather than requesting a program change. James Okafor estimates this has prevented over 40 change requests in the past year.
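To make the "load a different field map" idea concrete, the sketch below runs the same extraction paragraph against two layouts by changing only the map entries. The two layouts, the sample record, and the program name FMAPDEMO are invented for illustration; a production system would load the entries from a file or table rather than MOVE statements.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FMAPDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-FIELD-MAP.
           05  WS-FM-COUNT    PIC 99 VALUE ZERO.
           05  WS-FM-ENTRY    OCCURS 20 TIMES.
               10  WS-FM-NAME    PIC X(15).
               10  WS-FM-START   PIC 9(3).
               10  WS-FM-LENGTH  PIC 9(3).
       01  WS-FM-IDX          PIC 99.
       01  WS-INPUT-RECORD    PIC X(500).
       01  WS-EXTRACTED       PIC X(100).
       PROCEDURE DIVISION.
       MAIN-PARA.
      *    Layout A (hypothetical): id at 1-6, name at 7-36
           MOVE 2 TO WS-FM-COUNT
           MOVE "CUST-ID"   TO WS-FM-NAME(1)
           MOVE 1           TO WS-FM-START(1)
           MOVE 6           TO WS-FM-LENGTH(1)
           MOVE "CUST-NAME" TO WS-FM-NAME(2)
           MOVE 7           TO WS-FM-START(2)
           MOVE 30          TO WS-FM-LENGTH(2)
           MOVE "000042CHEN, MARIA L." TO WS-INPUT-RECORD
           PERFORM EXTRACT-BY-FIELD-MAP
      *    Layout B (hypothetical): name at 1-30, id at 31-36.
      *    Only the map changes; the paragraph does not.
           MOVE 31 TO WS-FM-START(1)
           MOVE 1  TO WS-FM-START(2)
           MOVE SPACES TO WS-INPUT-RECORD
           MOVE "CHEN, MARIA L." TO WS-INPUT-RECORD(1:30)
           MOVE "000042"         TO WS-INPUT-RECORD(31:6)
           PERFORM EXTRACT-BY-FIELD-MAP
           STOP RUN.

       EXTRACT-BY-FIELD-MAP.
           PERFORM VARYING WS-FM-IDX FROM 1 BY 1
               UNTIL WS-FM-IDX > WS-FM-COUNT
      *        Reject entries that fall outside the record
               IF WS-FM-START(WS-FM-IDX) < 1
                  OR WS-FM-LENGTH(WS-FM-IDX) < 1
                  OR WS-FM-START(WS-FM-IDX)
                     + WS-FM-LENGTH(WS-FM-IDX) - 1
                     > LENGTH OF WS-INPUT-RECORD
                   DISPLAY "BAD MAP ENTRY: "
                           WS-FM-NAME(WS-FM-IDX)
               ELSE
                   MOVE WS-INPUT-RECORD(
                       WS-FM-START(WS-FM-IDX):
                       WS-FM-LENGTH(WS-FM-IDX))
                       TO WS-EXTRACTED
                   DISPLAY FUNCTION TRIM(WS-FM-NAME(WS-FM-IDX))
                           " = ["
                           FUNCTION TRIM(WS-EXTRACTED) "]"
               END-IF
           END-PERFORM
           .
```

Both passes should report the same customer id and name even though the bytes sit at different offsets, which is the point of driving the positions from data.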

Pattern 6: Hexadecimal Dump with Reference Modification

For debugging, a hexadecimal dump of a data item can be invaluable. Reference modification makes this straightforward:

       01  WS-HEX-TABLE.
           05  FILLER PIC X(16)
               VALUE "0123456789ABCDEF".
       01  WS-HEX-CHARS REDEFINES WS-HEX-TABLE.
           05  WS-HEX-CHAR PIC X(1) OCCURS 16 TIMES.

       01  WS-DUMP-LINE      PIC X(80).
       01  WS-DUMP-PTR       PIC 9(3).
       01  WS-BYTE-VAL       PIC 9(3) COMP.
      *    Nibbles range 0-15, so they need two digits
       01  WS-HIGH-NIBBLE    PIC 99 COMP.
       01  WS-LOW-NIBBLE     PIC 99 COMP.
       01  WS-BYTE-IDX       PIC 99 COMP.
       01  WS-DUMP-POS       PIC 9(5).
       01  WS-DUMP-MAX       PIC 9(5).

       HEX-DUMP-FIELD.
           MOVE FUNCTION LENGTH(WS-TARGET-FIELD)
               TO WS-DUMP-MAX
           MOVE 1 TO WS-DUMP-POS

           PERFORM UNTIL WS-DUMP-POS > WS-DUMP-MAX
               MOVE SPACES TO WS-DUMP-LINE
               MOVE 1 TO WS-DUMP-PTR

      *        Position label
               STRING WS-DUMP-POS DELIMITED BY SIZE
                      ": " DELIMITED BY SIZE
                      INTO WS-DUMP-LINE
                      WITH POINTER WS-DUMP-PTR
               END-STRING

      *        Hex bytes (up to 16 per line)
               PERFORM VARYING WS-BYTE-IDX FROM 0 BY 1
                   UNTIL WS-BYTE-IDX >= 16
                      OR WS-DUMP-POS + WS-BYTE-IDX
                         > WS-DUMP-MAX
                   COMPUTE WS-BYTE-VAL =
                       FUNCTION ORD(WS-TARGET-FIELD(
                           WS-DUMP-POS + WS-BYTE-IDX
                           :1)) - 1
                   COMPUTE WS-HIGH-NIBBLE =
                       WS-BYTE-VAL / 16
                   COMPUTE WS-LOW-NIBBLE =
                       FUNCTION MOD(WS-BYTE-VAL, 16)
                   MOVE WS-HEX-CHAR(
                       WS-HIGH-NIBBLE + 1) TO
                       WS-DUMP-LINE(WS-DUMP-PTR:1)
                   ADD 1 TO WS-DUMP-PTR
                   MOVE WS-HEX-CHAR(
                       WS-LOW-NIBBLE + 1) TO
                       WS-DUMP-LINE(WS-DUMP-PTR:1)
                   ADD 1 TO WS-DUMP-PTR
                   MOVE " " TO
                       WS-DUMP-LINE(WS-DUMP-PTR:1)
                   ADD 1 TO WS-DUMP-PTR
               END-PERFORM

               DISPLAY WS-DUMP-LINE
               ADD 16 TO WS-DUMP-POS
           END-PERFORM
           .

This hex dump utility is a tool that Derek Washington keeps in his personal copybook library. It has saved him countless hours debugging data corruption issues in production — being able to see the actual hex values of a record reveals problems (embedded binary data, EBCDIC/ASCII issues, packed decimal corruption) that DISPLAY alone cannot show.

Pattern 7: Word Counting and Text Analysis

For report generation and data quality analysis, counting words and analyzing text content is useful:

       01  WS-TEXT-INPUT     PIC X(200).
       01  WS-WORD-COUNT     PIC 9(3) VALUE ZERO.
       01  WS-IN-WORD-FLAG   PIC X(1) VALUE "N".
           88  WS-IN-WORD        VALUE "Y".
           88  WS-NOT-IN-WORD    VALUE "N".
       01  WS-TEXT-POS        PIC 9(3).
       01  WS-TEXT-LEN        PIC 9(3).
       01  WS-CURR-CHAR      PIC X(1).

       COUNT-WORDS.
           MOVE ZERO TO WS-WORD-COUNT
           SET WS-NOT-IN-WORD TO TRUE
           MOVE FUNCTION LENGTH(
               FUNCTION TRIM(WS-TEXT-INPUT TRAILING))
               TO WS-TEXT-LEN

           PERFORM VARYING WS-TEXT-POS FROM 1 BY 1
               UNTIL WS-TEXT-POS > WS-TEXT-LEN
               MOVE WS-TEXT-INPUT(WS-TEXT-POS:1)
                   TO WS-CURR-CHAR

               IF WS-CURR-CHAR = SPACE
                   IF WS-IN-WORD
                       SET WS-NOT-IN-WORD TO TRUE
                   END-IF
               ELSE
                   IF WS-NOT-IN-WORD
                       ADD 1 TO WS-WORD-COUNT
                       SET WS-IN-WORD TO TRUE
                   END-IF
               END-IF
           END-PERFORM
           .

💡 Design Insight: Every one of these patterns follows the same fundamental structure: initialize a position variable to 1, scan forward one byte at a time using reference modification, track state (in-word, in-quotes, delimiter position), and act based on what each byte contains. Once you internalize this pattern, you can parse virtually any text format in COBOL.

19.6 Production Patterns: Multi-Record Parsing

In production systems, reference modification is rarely used on a single field in isolation. It is part of a larger processing pipeline where records arrive in variable formats, must be validated, parsed, transformed, and routed. This section presents production-grade patterns that combine reference modification with error handling, logging, and recovery.

Pattern: Record Type Identification and Routing

Many mainframe file formats use the first few bytes of each record to identify the record type. A single file may contain header records, detail records, trailer records, and control records — all with different layouts:

       01  WS-INPUT-RECORD    PIC X(500).
       01  WS-REC-TYPE        PIC X(2).
       01  WS-REC-SUBTYPE     PIC X(3).
       01  WS-REC-LEN         PIC 9(3).

       ROUTE-RECORD.
      *    Extract record type from first 2 bytes
           MOVE WS-INPUT-RECORD(1:2) TO WS-REC-TYPE
      *    Extract subtype from bytes 3-5
           MOVE WS-INPUT-RECORD(3:3) TO WS-REC-SUBTYPE

           EVALUATE WS-REC-TYPE
               WHEN "HD"
                   PERFORM PARSE-HEADER-RECORD
               WHEN "DT"
                   EVALUATE WS-REC-SUBTYPE
                       WHEN "CLM"
                           PERFORM PARSE-CLAIM-DETAIL
                       WHEN "SVC"
                           PERFORM PARSE-SERVICE-LINE
                       WHEN "ADJ"
                           PERFORM PARSE-ADJUSTMENT
                       WHEN OTHER
                           PERFORM LOG-UNKNOWN-SUBTYPE
                   END-EVALUATE
               WHEN "TR"
                   PERFORM PARSE-TRAILER-RECORD
               WHEN OTHER
                   PERFORM LOG-UNKNOWN-RECORD
           END-EVALUATE
           .

At MedClaim, James Okafor processes remittance files that contain 14 different record types. Each type has a different layout, and the layouts change between payer versions. Reference modification allows the routing logic to remain stable even as individual record parsers are updated.

Pattern: Variable-Length Field Extraction with Length Prefixes

Some formats (particularly those originating from IBM systems) use length-prefixed fields rather than delimiters. Each field is preceded by a 2-byte or 4-byte length indicator:

       01  WS-LP-RECORD       PIC X(2000).
       01  WS-LP-POS          PIC 9(4) VALUE 1.
       01  WS-LP-FIELD-LEN    PIC 9(4).
       01  WS-LP-FIELD-DATA   PIC X(500).
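       01  WS-LP-FIELD-COUNT  PIC 99 VALUE ZERO.
      *    Assumed declarations for names the paragraph uses;
      *    the caller sets WS-RECORD-LEN to the bytes read
       01  WS-LP-FIELD-LEN-X  PIC X(2).
       01  WS-RECORD-LEN      PIC 9(4).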

       PARSE-LENGTH-PREFIXED.
           MOVE 1 TO WS-LP-POS
           MOVE ZERO TO WS-LP-FIELD-COUNT

           PERFORM UNTIL WS-LP-POS >= WS-RECORD-LEN
                      OR WS-LP-FIELD-COUNT >= 20
      *        Extract 2-byte length prefix
               IF WS-LP-POS + 1 > WS-RECORD-LEN
                   EXIT PERFORM
               END-IF
               MOVE WS-LP-RECORD(WS-LP-POS:2)
                   TO WS-LP-FIELD-LEN-X
               COMPUTE WS-LP-FIELD-LEN =
                   FUNCTION NUMVAL(WS-LP-FIELD-LEN-X)
               ADD 2 TO WS-LP-POS

      *        Validate length
               IF WS-LP-FIELD-LEN > 0
                   AND WS-LP-FIELD-LEN <= 500
                   AND WS-LP-POS + WS-LP-FIELD-LEN - 1
                       <= WS-RECORD-LEN
      *            Extract field data
                   MOVE SPACES TO WS-LP-FIELD-DATA
                   MOVE WS-LP-RECORD(
                       WS-LP-POS:WS-LP-FIELD-LEN)
                       TO WS-LP-FIELD-DATA
                   ADD 1 TO WS-LP-FIELD-COUNT
                   ADD WS-LP-FIELD-LEN TO WS-LP-POS
               ELSE
      *            Invalid length — log and stop parsing
                   DISPLAY "BAD FIELD LEN AT POS "
                       WS-LP-POS ": " WS-LP-FIELD-LEN
                   EXIT PERFORM
               END-IF
           END-PERFORM
           .

⚠️ Defensive Programming: Every reference modification in this code is preceded by a boundary check. The check WS-LP-POS + WS-LP-FIELD-LEN - 1 <= WS-RECORD-LEN prevents reading past the end of the record. Without this check, a corrupt length prefix could cause the program to read into adjacent memory — a bug that might work silently for months before manifesting as corrupt output.

Pattern: Building Audit Trail Records

Production systems must log their processing for audit and debugging. Reference modification is ideal for building variable-length audit records:

       01  WS-AUDIT-REC       PIC X(500).
       01  WS-AUDIT-POS       PIC 9(3) VALUE 1.

       BUILD-AUDIT-RECORD.
           MOVE SPACES TO WS-AUDIT-REC
           MOVE 1 TO WS-AUDIT-POS

      *    Timestamp (21 bytes)
           MOVE FUNCTION CURRENT-DATE
               TO WS-AUDIT-REC(WS-AUDIT-POS:21)
           ADD 21 TO WS-AUDIT-POS

      *    Separator
           MOVE "|" TO WS-AUDIT-REC(WS-AUDIT-POS:1)
           ADD 1 TO WS-AUDIT-POS

      *    Program ID (8 bytes)
           MOVE "CLMPROC " TO
               WS-AUDIT-REC(WS-AUDIT-POS:8)
           ADD 8 TO WS-AUDIT-POS

      *    Separator
           MOVE "|" TO WS-AUDIT-REC(WS-AUDIT-POS:1)
           ADD 1 TO WS-AUDIT-POS

      *    Action code (3 bytes)
           MOVE WS-ACTION-CODE TO
               WS-AUDIT-REC(WS-AUDIT-POS:3)
           ADD 3 TO WS-AUDIT-POS

      *    Separator
           MOVE "|" TO WS-AUDIT-REC(WS-AUDIT-POS:1)
           ADD 1 TO WS-AUDIT-POS

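      *    Claim ID (variable length): append only the trimmed
      *    bytes. WS-AUDIT-PIECE-LEN is an assumed PIC 9(3) field.
           MOVE FUNCTION LENGTH(FUNCTION TRIM(WS-CLM-ID))
               TO WS-AUDIT-PIECE-LEN
           IF WS-AUDIT-PIECE-LEN > ZERO
               MOVE FUNCTION TRIM(WS-CLM-ID) TO
                   WS-AUDIT-REC(
                       WS-AUDIT-POS:WS-AUDIT-PIECE-LEN)
               ADD WS-AUDIT-PIECE-LEN TO WS-AUDIT-POS
           END-IF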

      *    Write the audit record (only the used portion)
           COMPUTE WS-AUDIT-WRITE-LEN =
               WS-AUDIT-POS - 1
           WRITE AUDIT-FILE-REC FROM
               WS-AUDIT-REC(1:WS-AUDIT-WRITE-LEN)
           .

This pointer-build pattern (maintain a position counter, append each piece, advance the counter) is the single most important reference modification idiom in production COBOL. Maria Chen estimates that 60% of all reference modification code at GlobalBank uses this pattern.

19.7 Pointer-Based Processing: ADDRESS OF

COBOL provides pointer data items and the ADDRESS OF special register for lower-level memory manipulation. This is an advanced feature used primarily for:

Interfacing with non-COBOL programs (C, Assembler)
Processing dynamically allocated memory
Optimizing access to large data areas in LINKAGE SECTION

Pointer Data Items

01  WS-DATA-PTR    USAGE IS POINTER.
01  WS-NULL-PTR    USAGE IS POINTER VALUE NULL.

ADDRESS OF Special Register

Every level-01 or level-77 item in the LINKAGE SECTION has an associated ADDRESS OF special register that holds its memory address:

LINKAGE SECTION.
01  LS-RECORD      PIC X(100).
01  LS-BUFFER      PIC X(500).

PROCEDURE DIVISION.
    SET ADDRESS OF LS-RECORD TO WS-DATA-PTR
    *> Now LS-RECORD refers to whatever WS-DATA-PTR
    *> points to

SET Statement with Pointers

    SET WS-DATA-PTR TO ADDRESS OF WS-RECORD
    *> WS-DATA-PTR now contains the address of WS-RECORD

    SET ADDRESS OF LS-RECORD TO WS-DATA-PTR
    *> LS-RECORD now overlays the memory at WS-DATA-PTR

    SET WS-DATA-PTR TO NULL
    *> Reset pointer to null
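The three SET forms above can be tried together in a small program. This is a sketch under assumptions: the program name PTRDEMO, the sample value, and the LS-PREFIX and LS-NUMBER subfields are invented, and the LINKAGE SECTION item is used purely as an overlay rather than as a passed parameter.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PTRDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-RECORD      PIC X(20) VALUE "ACCT0012345678".
       01  WS-DATA-PTR    USAGE IS POINTER.
       LINKAGE SECTION.
      *    No storage of its own: a template that is laid over
      *    whatever address it is given
       01  LS-RECORD.
           05  LS-PREFIX  PIC X(4).
           05  LS-NUMBER  PIC X(10).
       PROCEDURE DIVISION.
       MAIN-PARA.
           SET WS-DATA-PTR TO NULL
           SET WS-DATA-PTR TO ADDRESS OF WS-RECORD
      *    Validate the pointer before mapping anything onto it
           IF WS-DATA-PTR = NULL
               DISPLAY "NO BUFFER"
           ELSE
               SET ADDRESS OF LS-RECORD TO WS-DATA-PTR
               DISPLAY "PREFIX: " LS-PREFIX
               DISPLAY "NUMBER: " LS-NUMBER
      *        Writes through the overlay change WS-RECORD
               MOVE "SAVE" TO LS-PREFIX
               DISPLAY "RECORD: " WS-RECORD
           END-IF
           STOP RUN.
```

Because LS-RECORD and WS-RECORD share the same storage after the SET, the final DISPLAY is expected to show the record with its first four bytes replaced.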

Practical Example: Processing a Buffer

When receiving data from a CICS communication area, a Language Environment service, or a C function, you often get a pointer to a buffer rather than the data directly:

       LINKAGE SECTION.
       01  LS-RESPONSE-BUFFER.
           05  LS-RESP-LENGTH   PIC 9(5) COMP.
           05  LS-RESP-DATA     PIC X(4000).

       01  LS-PARSED-RECORD.
           05  LS-PR-TYPE       PIC X(2).
           05  LS-PR-LENGTH     PIC 9(4) COMP.
           05  LS-PR-PAYLOAD    PIC X(996).

       WORKING-STORAGE SECTION.
       01  WS-BUFFER-PTR       USAGE IS POINTER.
       01  WS-CURRENT-PTR      USAGE IS POINTER.
       01  WS-OFFSET           PIC 9(5) COMP VALUE ZERO.
       01  WS-REC-COUNT        PIC 9(5) COMP VALUE ZERO.

       PROCEDURE DIVISION.
      *    Assume WS-BUFFER-PTR is set by caller
           SET ADDRESS OF LS-RESPONSE-BUFFER
               TO WS-BUFFER-PTR

      *    Process records within the buffer
           MOVE ZERO TO WS-OFFSET
           PERFORM UNTIL WS-OFFSET >= LS-RESP-LENGTH
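               SET WS-CURRENT-PTR TO WS-BUFFER-PTR
      *        Skip the buffer header: PIC 9(5) COMP occupies
      *        a 4-byte binary fullword, not 5 bytes
               SET WS-CURRENT-PTR UP BY 4
               SET WS-CURRENT-PTR UP BY WS-OFFSET
               SET ADDRESS OF LS-PARSED-RECORD
                   TO WS-CURRENT-PTR
               ADD 1 TO WS-REC-COUNT
               PERFORM 3100-PROCESS-RECORD
      *        Step over the payload (assuming LS-PR-LENGTH is
      *        the payload length) plus the 4-byte record
      *        header: 2-byte type + 2-byte binary length
               ADD LS-PR-LENGTH TO WS-OFFSET
               ADD 4 TO WS-OFFSET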
           END-PERFORM
           .

⚠️ Safety Warning: Pointer manipulation is the most dangerous area of COBOL programming. An incorrect pointer leads to storage violations (S0C4 abend on z/OS). Use pointers only when absolutely necessary, validate pointer values, and test exhaustively. Most business logic should use reference modification instead.

19.6 Processing Variable-Format Records

One of the most valuable applications of reference modification is processing records whose layout varies based on content. This is common in financial systems (different transaction types), healthcare (different claim formats), and data interchange (EDI, XML-like structures).

GlobalBank: Variable-Length Transaction Descriptions

GlobalBank's transaction records include a variable-length description field. The record layout is:

Positions 1-10:   Account number
Positions 11-18:  Transaction date (YYYYMMDD)
Positions 19-19:  Transaction type (D=Debit, C=Credit)
Positions 20-31:  Amount (9(10)V99)
Positions 32-34:  Description length (3 digits)
Positions 35-?:   Description text (variable length)

       01  WS-TRANS-RECORD    PIC X(534).
       01  WS-TRANS-FIELDS.
           05  WS-TR-ACCT     PIC X(10).
           05  WS-TR-DATE     PIC 9(8).
           05  WS-TR-TYPE     PIC X(1).
           05  WS-TR-AMOUNT   PIC 9(10)V99.
      *    Byte view of the amount. Moving an alphanumeric
      *    sender straight into 9(10)V99 would treat the 12
      *    digits as an integer and lose the implied decimal.
           05  WS-TR-AMOUNT-X REDEFINES WS-TR-AMOUNT
                              PIC X(12).
           05  WS-TR-DESC-LEN PIC 9(3).
           05  WS-TR-DESC     PIC X(500).

       PARSE-TRANSACTION.
           MOVE WS-TRANS-RECORD(1:10)  TO WS-TR-ACCT
           MOVE WS-TRANS-RECORD(11:8)  TO WS-TR-DATE
           MOVE WS-TRANS-RECORD(19:1)  TO WS-TR-TYPE
           MOVE WS-TRANS-RECORD(20:12) TO WS-TR-AMOUNT-X
           MOVE WS-TRANS-RECORD(32:3)  TO WS-TR-DESC-LEN

      *    Defensive check on description length. The field is
      *    unsigned, so test the raw bytes for non-numeric
      *    data rather than testing for a value below zero.
           IF WS-TRANS-RECORD(32:3) IS NOT NUMERIC
           OR WS-TR-DESC-LEN > 500
               DISPLAY "INVALID DESC LENGTH: "
                       WS-TR-DESC-LEN
               MOVE ZERO TO WS-TR-DESC-LEN
               MOVE SPACES TO WS-TR-DESC
               PERFORM 9100-LOG-ERROR
           ELSE
               IF WS-TR-DESC-LEN > ZERO
                   MOVE WS-TRANS-RECORD(35:WS-TR-DESC-LEN)
                       TO WS-TR-DESC
               ELSE
                   MOVE SPACES TO WS-TR-DESC
               END-IF
           END-IF
           .

MedClaim: Processing Variable-Format EDI Segments

Electronic Data Interchange (EDI) is the backbone of healthcare claims processing. EDI segments use delimiters (typically * for elements and ~ for segments) rather than fixed positions. Reference modification is ideal for parsing these.

An EDI 837 claim segment might look like:

CLM*12345678*1200.50*11:B:1*Y*A~

The parser must extract each element between the asterisks:

       01  WS-EDI-SEGMENT    PIC X(500).
       01  WS-ELEMENTS.
           05  WS-ELEMENT     PIC X(50) OCCURS 20 TIMES.
       01  WS-EDI-VARS.
           05  WS-ELEM-NUM    PIC 99 VALUE 1.
           05  WS-SCAN-POS    PIC 9(3) VALUE 1.
           05  WS-ELEM-START  PIC 9(3).
           05  WS-ELEM-LEN    PIC 9(3).
           05  WS-SEG-LEN     PIC 9(3).

       PARSE-EDI-SEGMENT.
           MOVE SPACES TO WS-ELEMENTS
           MOVE 1 TO WS-ELEM-NUM
           MOVE 1 TO WS-SCAN-POS

      *    Find segment terminator to get actual length
           MOVE ZERO TO WS-SEG-LEN
           PERFORM VARYING WS-SEG-LEN FROM 1 BY 1
               UNTIL WS-SEG-LEN >
                     FUNCTION LENGTH(WS-EDI-SEGMENT)
               IF WS-EDI-SEGMENT(WS-SEG-LEN:1) = "~"
                  OR WS-EDI-SEGMENT(WS-SEG-LEN:1) =
                     SPACES
                   SUBTRACT 1 FROM WS-SEG-LEN
                   EXIT PERFORM
               END-IF
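           END-PERFORM
      *    No terminator or space found: clamp to the field
      *    size so later references stay in bounds
           IF WS-SEG-LEN > FUNCTION LENGTH(WS-EDI-SEGMENT)
               MOVE FUNCTION LENGTH(WS-EDI-SEGMENT)
                   TO WS-SEG-LEN
           END-IF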

      *    Skip segment identifier (first element before *)
           PERFORM UNTIL WS-SCAN-POS > WS-SEG-LEN
               IF WS-EDI-SEGMENT(WS-SCAN-POS:1) = "*"
                   ADD 1 TO WS-SCAN-POS
                   EXIT PERFORM
               END-IF
               ADD 1 TO WS-SCAN-POS
           END-PERFORM

      *    Extract remaining elements
           PERFORM UNTIL WS-SCAN-POS > WS-SEG-LEN
                      OR WS-ELEM-NUM > 20
               MOVE WS-SCAN-POS TO WS-ELEM-START

      *        Find next delimiter
               PERFORM UNTIL WS-SCAN-POS > WS-SEG-LEN
                   IF WS-EDI-SEGMENT(WS-SCAN-POS:1)
                      = "*"
                      OR WS-EDI-SEGMENT(WS-SCAN-POS:1)
                         = "~"
                       EXIT PERFORM
                   END-IF
                   ADD 1 TO WS-SCAN-POS
               END-PERFORM

               COMPUTE WS-ELEM-LEN =
                   WS-SCAN-POS - WS-ELEM-START
               IF WS-ELEM-LEN > ZERO
                AND WS-ELEM-LEN <= 50
                   MOVE WS-EDI-SEGMENT(
                       WS-ELEM-START:WS-ELEM-LEN)
                       TO WS-ELEMENT(WS-ELEM-NUM)
               END-IF

               ADD 1 TO WS-ELEM-NUM
               ADD 1 TO WS-SCAN-POS
           END-PERFORM

           SUBTRACT 1 FROM WS-ELEM-NUM
           DISPLAY "PARSED " WS-ELEM-NUM " ELEMENTS"
           .

🔗 Cross-Reference: EDI processing is covered in greater detail in Chapter 33 (Interfacing with External Systems). The parsing techniques introduced here form the foundation for the full EDI 837/835 processing pipeline discussed there.

Understanding Pointer Arithmetic

Pointer arithmetic in COBOL is more restricted than in C. You can only increment or decrement pointers using SET:

    SET WS-PTR UP BY 100
    *> Advances the pointer by 100 bytes

    SET WS-PTR DOWN BY 50
    *> Backs the pointer up by 50 bytes

You cannot add two pointers, subtract two pointers to get a distance, or compare pointers with < or >. The only pointer comparison available is equality:

    IF WS-PTR = NULL
        DISPLAY "POINTER IS NULL"
    END-IF

    IF WS-PTR-1 = WS-PTR-2
        DISPLAY "POINTERS MATCH"
    END-IF

These restrictions are intentional: they limit the kind of arbitrary memory access that causes so many bugs in C programs. COBOL's pointer model is a controlled subset designed for specific inter-language and dynamic memory use cases, not general-purpose memory manipulation.
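As a sketch of how SET ... UP BY is typically used, the following standalone program steps a pointer across three 4-byte entries. The names PTRWALK, WS-TABLE, and LS-ENTRY are illustrative assumptions and do not come from the chapter's programs.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PTRWALK.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-TABLE.
           05  FILLER      PIC X(4) VALUE "AAAA".
           05  FILLER      PIC X(4) VALUE "BBBB".
           05  FILLER      PIC X(4) VALUE "CCCC".
       01  WS-PTR          USAGE IS POINTER.
       01  WS-IDX          PIC 9 VALUE ZERO.
       LINKAGE SECTION.
       01  LS-ENTRY        PIC X(4).
       PROCEDURE DIVISION.
       MAIN-PARA.
           SET WS-PTR TO ADDRESS OF WS-TABLE
           PERFORM VARYING WS-IDX FROM 1 BY 1 UNTIL WS-IDX > 3
      *        Overlay the 4-byte view on the current entry
               SET ADDRESS OF LS-ENTRY TO WS-PTR
               DISPLAY "ENTRY " WS-IDX ": " LS-ENTRY
      *        The increment is in bytes, so it must equal
      *        the entry length exactly
               SET WS-PTR UP BY 4
           END-PERFORM
           STOP RUN.
```

Unlike C pointer arithmetic, the increment is not scaled by the size of the item: UP BY 4 always means four bytes. After the last pass the pointer sits just beyond the table, which is harmless here only because it is never dereferenced again.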

When to Use Pointers vs. Reference Modification

The decision between pointers and reference modification is usually straightforward:

Use Case	Best Approach
Extracting substrings from fixed-length fields	Reference modification
Building output strings dynamically	Reference modification with pointer variable
Parsing delimited data	Reference modification
Interfacing with C functions	Pointers (ADDRESS OF)
Processing CICS COMMAREA/TWA	Pointers (SET ADDRESS OF)
Accessing dynamically allocated memory	Pointers (CEEGTST/CEECZST)
Processing LINKAGE SECTION data	Pointers (SET ADDRESS OF)
Overlaying different record structures	Reference modification or REDEFINES

In practice, reference modification handles 95% of byte-level data manipulation needs. Pointers are reserved for system-level programming and inter-language communication.

Working with NULL Pointers

Always initialize pointers before use and check for NULL before dereferencing:

01  WS-DATA-PTR    USAGE IS POINTER VALUE NULL.

*> Before using:
    IF WS-DATA-PTR = NULL
        DISPLAY "ERROR: POINTER NOT INITIALIZED"
        PERFORM 9900-ABEND
    END-IF

    SET ADDRESS OF LS-RECORD TO WS-DATA-PTR
    *> Safe to access LS-RECORD now

Dereferencing a NULL pointer typically causes an S0C4 abend on z/OS (the equivalent of a segmentation fault on Unix). The abend dump may not clearly indicate the cause, making NULL pointer bugs particularly difficult to diagnose in production. Always validate.

Reference Modification with INSPECT

A lesser-known but powerful combination is using reference modification within an INSPECT statement to examine or replace characters within a specific portion of a field:

    *> Count digits in positions 5-10 only
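    *> (TALLYING adds to the counter, so reset it first;
    *> ALL takes a list of literals, not a THRU range)
    MOVE ZERO TO WS-DIGIT-COUNT
    INSPECT WS-DATA(5:6)
        TALLYING WS-DIGIT-COUNT
        FOR ALL "0" "1" "2" "3" "4"
                "5" "6" "7" "8" "9"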

    *> Replace all spaces with zeros in positions 1-8
    INSPECT WS-AMOUNT-FIELD(1:8)
        REPLACING ALL SPACES BY ZEROS

    *> Convert lowercase to uppercase in first 20 chars
    INSPECT WS-NAME-FIELD(1:20)
        CONVERTING "abcdefghijklmnopqrstuvwxyz"
        TO         "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

This combination is useful when you need to transform only part of a field — for example, uppercasing a name field without affecting a numeric suffix, or zero-filling only the integer portion of an amount string.
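The partial-field behavior can be checked with a short standalone program. This is a minimal sketch: the program name INSPREF, the counter WS-SPACE-COUNT, and the sample value are assumptions made for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. INSPREF.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-NAME-FIELD   PIC X(12) VALUE "smith   ab12".
       01  WS-SPACE-COUNT  PIC 9(2)  VALUE ZERO.
       PROCEDURE DIVISION.
       MAIN-PARA.
      *    Uppercase only the first 8 bytes; the suffix is untouched
           INSPECT WS-NAME-FIELD(1:8)
               CONVERTING "abcdefghijklmnopqrstuvwxyz"
               TO         "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
      *    Tally counters accumulate, so reset before each use
           MOVE ZERO TO WS-SPACE-COUNT
           INSPECT WS-NAME-FIELD(1:8)
               TALLYING WS-SPACE-COUNT FOR ALL SPACE
           DISPLAY WS-NAME-FIELD " SPACES=" WS-SPACE-COUNT
           STOP RUN.
```

The program displays `SMITH   ab12 SPACES=03`: the lowercase suffix in positions 9-12 is left alone because it lies outside the reference-modified range.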

🧪 Try It Yourself: String Utilities Library

Build a set of reusable string utility paragraphs using reference modification:

1. LEFT-PAD: Pad a string on the left with a specified character to a target length
2. RIGHT-PAD: Pad on the right (similar to default COBOL behavior, but with a configurable pad character)
3. CENTER: Center a string within a target field
4. CONTAINS: Return TRUE if a substring exists within a string
5. REPLACE-ALL: Replace all occurrences of a search string with a replacement string
6. SUBSTR-COUNT: Count occurrences of a substring

These utilities mirror the string functions available in Java or Python, implemented using COBOL reference modification.

19.7 STRING Statement with POINTER Phrase

The STRING statement has an optional POINTER phrase that maintains a position counter, enabling you to build output strings incrementally:

       01  WS-REPORT-LINE   PIC X(132).
       01  WS-STR-PTR       PIC 9(3) VALUE 1.

       BUILD-REPORT-LINE.
           MOVE SPACES TO WS-REPORT-LINE
           MOVE 1 TO WS-STR-PTR

           STRING
               WS-ACCT-NUM DELIMITED BY SIZE
               " | "        DELIMITED BY SIZE
               WS-CUST-NAME DELIMITED BY "  "
               " | "        DELIMITED BY SIZE
               WS-BALANCE   DELIMITED BY SIZE
               INTO WS-REPORT-LINE
               WITH POINTER WS-STR-PTR
           END-STRING

      *    WS-STR-PTR now points to the position
      *    AFTER the last character written.
      *    Subtract 1 to get the actual content length.
           SUBTRACT 1 FROM WS-STR-PTR
           DISPLAY "LINE LENGTH: " WS-STR-PTR
           .

The POINTER phrase is also useful for building output across multiple STRING operations:

       BUILD-MULTI-PART.
           MOVE SPACES TO WS-OUTPUT
           MOVE 1 TO WS-STR-PTR

      *    Part 1: Header
           STRING "ACCT: " DELIMITED BY SIZE
                  WS-ACCT-NUM DELIMITED BY SIZE
                  INTO WS-OUTPUT
                  WITH POINTER WS-STR-PTR
           END-STRING

      *    Part 2: Conditionally add branch info
           IF WS-INCLUDE-BRANCH = "Y"
               STRING " BRANCH: " DELIMITED BY SIZE
                      WS-BRANCH-NAME DELIMITED BY "  "
                      INTO WS-OUTPUT
                      WITH POINTER WS-STR-PTR
               END-STRING
           END-IF

      *    Part 3: Always add date
           STRING " DATE: " DELIMITED BY SIZE
                  WS-FORMATTED-DATE DELIMITED BY SIZE
                  INTO WS-OUTPUT
                  WITH POINTER WS-STR-PTR
           END-STRING
           .

💡 Key Insight: The WITH POINTER phrase makes STRING work like the pointer-based reference modification pattern from Section 19.4, but with the convenience of delimiter handling. Use STRING WITH POINTER when your pieces have natural delimiters; use reference modification when you need exact byte positioning.
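The examples in this section do not show what happens when the target fills up. STRING raises the overflow condition when the receiving field is full before all source data has been transferred, and also when the pointer starts below 1 or beyond the end of the field. The sketch below, with a deliberately small 20-byte target, shows one way to detect that; the program name STRPTR and the sample values are illustrative assumptions.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. STRPTR.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-OUTPUT       PIC X(20).
       01  WS-STR-PTR      PIC 9(3) VALUE 1.
       01  WS-ACCT-NUM     PIC X(10) VALUE "1234567890".
       01  WS-BRANCH-NAME  PIC X(15) VALUE "DOWNTOWN".
       PROCEDURE DIVISION.
       MAIN-PARA.
           MOVE SPACES TO WS-OUTPUT
           MOVE 1 TO WS-STR-PTR
      *    Part 1 needs 16 bytes and fits in the target
           STRING "ACCT: " DELIMITED BY SIZE
                  WS-ACCT-NUM DELIMITED BY SIZE
                  INTO WS-OUTPUT
                  WITH POINTER WS-STR-PTR
               ON OVERFLOW
                   DISPLAY "PART 1 TRUNCATED"
           END-STRING
      *    Part 2 needs more than the 4 bytes that remain
           STRING " BR: " DELIMITED BY SIZE
                  WS-BRANCH-NAME DELIMITED BY SPACE
                  INTO WS-OUTPUT
                  WITH POINTER WS-STR-PTR
               ON OVERFLOW
                   DISPLAY "PART 2 TRUNCATED AT " WS-STR-PTR
           END-STRING
           DISPLAY "[" WS-OUTPUT "]"
           STOP RUN.
```

Without the ON OVERFLOW phrase the second STRING would truncate silently, so multi-part builders are a natural place to code it.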

19.8 UNSTRING with POINTER Phrase

Similarly, UNSTRING supports a POINTER phrase for incremental parsing:

       01  WS-CSV-RECORD   PIC X(500).
       01  WS-FIELDS.
           05  WS-FIELD-1   PIC X(30).
           05  WS-FIELD-2   PIC X(30).
           05  WS-FIELD-3   PIC X(30).
       01  WS-UNS-PTR      PIC 9(3) VALUE 1.
       01  WS-DELIM-FOUND  PIC X(1).
       01  WS-FIELD-COUNT  PIC 9(3).

       PARSE-CSV.
           MOVE 1 TO WS-UNS-PTR

      *    Extract first field
           UNSTRING WS-CSV-RECORD
               DELIMITED BY "," OR X"0D" OR X"0A"
               INTO WS-FIELD-1
                   DELIMITER IN WS-DELIM-FOUND
                   COUNT IN WS-FIELD-COUNT
               WITH POINTER WS-UNS-PTR
           END-UNSTRING

      *    Extract second field (continues from where
      *    the first left off)
           UNSTRING WS-CSV-RECORD
               DELIMITED BY "," OR X"0D" OR X"0A"
               INTO WS-FIELD-2
               WITH POINTER WS-UNS-PTR
           END-UNSTRING

      *    Extract third field
           UNSTRING WS-CSV-RECORD
               DELIMITED BY "," OR X"0D" OR X"0A"
               INTO WS-FIELD-3
               WITH POINTER WS-UNS-PTR
           END-UNSTRING
           .
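When the number of fields is not known in advance, the same UNSTRING can sit inside a loop that runs until the pointer passes the end of the data. The following is a minimal sketch under that assumption; the program name UNSLOOP, the single reusable WS-FIELD, and the sample record are illustrative, not part of the chapter's programs.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UNSLOOP.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-CSV-RECORD   PIC X(40)
                           VALUE "ALPHA,BETA,,DELTA".
       01  WS-REC-LEN      PIC 9(3).
       01  WS-UNS-PTR      PIC 9(3).
       01  WS-FIELD        PIC X(30).
       01  WS-FIELD-LEN    PIC 9(3).
       01  WS-FIELD-NUM    PIC 9(2) VALUE ZERO.
       PROCEDURE DIVISION.
       MAIN-PARA.
      *    Ignore the trailing padding of the fixed-size record
           MOVE FUNCTION LENGTH(
               FUNCTION TRIM(WS-CSV-RECORD TRAILING))
               TO WS-REC-LEN
           MOVE 1 TO WS-UNS-PTR
      *    Each pass extracts one field and advances the pointer
           PERFORM UNTIL WS-UNS-PTR > WS-REC-LEN
               MOVE SPACES TO WS-FIELD
               UNSTRING WS-CSV-RECORD(1:WS-REC-LEN)
                   DELIMITED BY ","
                   INTO WS-FIELD COUNT IN WS-FIELD-LEN
                   WITH POINTER WS-UNS-PTR
               END-UNSTRING
               ADD 1 TO WS-FIELD-NUM
               DISPLAY WS-FIELD-NUM " LEN=" WS-FIELD-LEN
                       " [" WS-FIELD(1:10) "]"
           END-PERFORM
           STOP RUN.
```

The empty third field comes back with a count of zero, which is how the loop distinguishes "present but empty" from "no more fields". One limitation of this loop condition is that an empty field after a final trailing comma is not reported.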

19.9 Advanced Technique: Building a Generic Field Extractor

Let us combine reference modification, LENGTH OF, and pointer techniques to build a reusable field extraction utility:

      *============================================================
      * Generic delimited field extractor
      * Input:  WS-GFE-INPUT    - string to parse
      *         WS-GFE-DELIM    - delimiter character
      *         WS-GFE-FIELD-NUM - which field to extract (1-based)
      * Output: WS-GFE-RESULT   - extracted field
      *         WS-GFE-RESULT-LEN - length of extracted field
      *         WS-GFE-STATUS   - 'F'ound, 'N'ot found, 'E'rror
      *============================================================
       01  WS-GENERIC-FIELD-EXTRACT.
           05  WS-GFE-INPUT       PIC X(500).
           05  WS-GFE-DELIM       PIC X(1).
           05  WS-GFE-FIELD-NUM   PIC 99.
           05  WS-GFE-RESULT      PIC X(100).
           05  WS-GFE-RESULT-LEN  PIC 9(3).
           05  WS-GFE-STATUS      PIC X(1).
               88  WS-GFE-FOUND       VALUE "F".
               88  WS-GFE-NOT-FOUND   VALUE "N".
               88  WS-GFE-ERROR       VALUE "E".
       01  WS-GFE-WORK.
           05  WS-GFE-POS         PIC 9(3).
           05  WS-GFE-START       PIC 9(3).
           05  WS-GFE-CURR-FIELD  PIC 99.
           05  WS-GFE-MAX-LEN    PIC 9(3).
           05  WS-GFE-LEN        PIC 9(3).

       EXTRACT-DELIMITED-FIELD.
           MOVE SPACES TO WS-GFE-RESULT
           MOVE ZERO TO WS-GFE-RESULT-LEN
           SET WS-GFE-NOT-FOUND TO TRUE

           IF WS-GFE-FIELD-NUM < 1
               SET WS-GFE-ERROR TO TRUE
               EXIT PARAGRAPH
           END-IF

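      *    Scan only up to the last non-space character so the
      *    final field does not absorb the trailing padding
           MOVE FUNCTION LENGTH(
               FUNCTION TRIM(WS-GFE-INPUT TRAILING))
               TO WS-GFE-MAX-LEN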
           MOVE 1 TO WS-GFE-POS
           MOVE 1 TO WS-GFE-CURR-FIELD

      *    Skip to the target field
           PERFORM UNTIL WS-GFE-CURR-FIELD >=
                         WS-GFE-FIELD-NUM
                      OR WS-GFE-POS > WS-GFE-MAX-LEN
               IF WS-GFE-INPUT(WS-GFE-POS:1) =
                  WS-GFE-DELIM
                   ADD 1 TO WS-GFE-CURR-FIELD
               END-IF
               ADD 1 TO WS-GFE-POS
           END-PERFORM

           IF WS-GFE-CURR-FIELD < WS-GFE-FIELD-NUM
               SET WS-GFE-NOT-FOUND TO TRUE
               EXIT PARAGRAPH
           END-IF

      *    We are now at the start of the target field
           MOVE WS-GFE-POS TO WS-GFE-START

      *    Find the end of this field
           PERFORM UNTIL WS-GFE-POS > WS-GFE-MAX-LEN
               IF WS-GFE-INPUT(WS-GFE-POS:1) =
                  WS-GFE-DELIM
                   EXIT PERFORM
               END-IF
               ADD 1 TO WS-GFE-POS
           END-PERFORM

           COMPUTE WS-GFE-LEN =
               WS-GFE-POS - WS-GFE-START
           IF WS-GFE-LEN > 100
               MOVE 100 TO WS-GFE-LEN
           END-IF

           IF WS-GFE-LEN > ZERO
               MOVE WS-GFE-INPUT(
                   WS-GFE-START:WS-GFE-LEN)
                   TO WS-GFE-RESULT
               MOVE WS-GFE-LEN TO WS-GFE-RESULT-LEN
               SET WS-GFE-FOUND TO TRUE
           ELSE
               SET WS-GFE-FOUND TO TRUE
               MOVE ZERO TO WS-GFE-RESULT-LEN
           END-IF
           .

This utility can parse CSV, pipe-delimited, tab-delimited, or any single-character-delimited format by changing WS-GFE-DELIM.

19.10 Reference Modification in CICS and Online Programs

Reference modification is not limited to batch programs. Online programs running under CICS or IMS make extensive use of reference modification for processing communication area data, building dynamic screen content, and parsing web service payloads.

CICS Communication Area Parsing

In CICS programs, the DFHCOMMAREA is the primary mechanism for passing data between programs. When programs follow a "router" pattern — where a front-end program determines which back-end program to invoke — the communication area often contains a variable-format payload:

       01  DFHCOMMAREA.
           05  CA-REQUEST-TYPE   PIC X(4).
           05  CA-PAYLOAD-LEN    PIC 9(4).
           05  CA-PAYLOAD        PIC X(2000).

      *    Parse based on request type
           EVALUATE CA-REQUEST-TYPE
               WHEN "INQY"
      *            Inquiry: payload is account number (10) + date (8)
                   MOVE CA-PAYLOAD(1:10) TO WS-ACCT-NUM
                   MOVE CA-PAYLOAD(11:8) TO WS-INQUIRY-DATE
               WHEN "XFER"
      *            Transfer: from-acct (10) + to-acct (10)
      *                      + amount (12) + memo (50)
                   MOVE CA-PAYLOAD(1:10)  TO WS-FROM-ACCT
                   MOVE CA-PAYLOAD(11:10) TO WS-TO-ACCT
                   MOVE CA-PAYLOAD(21:12) TO WS-XFER-AMOUNT
                   MOVE CA-PAYLOAD(33:50) TO WS-XFER-MEMO
               WHEN "STMT"
      *            Statement: acct (10) + start-date (8)
      *                       + end-date (8) + page-num (4)
                   MOVE CA-PAYLOAD(1:10)  TO WS-STMT-ACCT
                   MOVE CA-PAYLOAD(11:8)  TO WS-STMT-START
                   MOVE CA-PAYLOAD(19:8)  TO WS-STMT-END
                   MOVE CA-PAYLOAD(27:4)  TO WS-STMT-PAGE
               WHEN OTHER
      *            Unknown request type: reject rather than guess
      *            (9100-LOG-ERROR is an assumed error paragraph)
                   PERFORM 9100-LOG-ERROR
           END-EVALUATE

💡 Design Pattern: Maria Chen explains: "Every CICS program at GlobalBank uses a communication area layout where the first few bytes identify the request type, followed by a payload that varies by type. Reference modification is how we extract the payload fields without needing a separate copybook for every possible request format. The front-end knows how to build the payload; the back-end knows how to parse it."

Building Dynamic BMS Map Data

When constructing dynamic screen content in CICS, reference modification enables building formatted display lines without static copybook structures:

       01  WS-SCREEN-LINE    PIC X(80).
       01  WS-POS            PIC 99.

       BUILD-DETAIL-LINE.
           MOVE SPACES TO WS-SCREEN-LINE
           MOVE 1 TO WS-POS
      *    Column 1-10: Account number
           MOVE WS-ACCT-NUM TO WS-SCREEN-LINE(1:10)
      *    Column 12-19: Date
           MOVE WS-DATE-FORMATTED TO WS-SCREEN-LINE(12:8)
      *    Column 21-35: Description (truncated to 15)
           MOVE WS-DESCRIPTION(1:15)
               TO WS-SCREEN-LINE(21:15)
      *    Column 37-50: Amount (right-justified)
           MOVE WS-DISPLAY-AMT TO WS-SCREEN-LINE(37:14)
      *    Column 52-53: Status
           MOVE WS-STATUS-CODE TO WS-SCREEN-LINE(52:2)
           .

Processing JSON Payloads from Web Services

Modern COBOL programs increasingly process JSON data received from web services. While IBM provides JSON PARSE in Enterprise COBOL 6+, many shops still process JSON payloads using reference modification, particularly when the JSON structure is simple and predictable:

       01  WS-JSON-PAYLOAD   PIC X(5000).
       01  WS-JSON-LEN       PIC 9(4).
       01  WS-SCAN-POS       PIC 9(4).
       01  WS-VALUE-START    PIC 9(4).
       01  WS-VALUE-LEN      PIC 9(4).
       01  WS-TARGET-KEY     PIC X(30).

       FIND-JSON-VALUE.
      *    Simple JSON value extractor
      *    Finds "key":"value" and returns value
      *    Does NOT handle nested objects or arrays

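      *    Build search string: "key":"
      *    The pointer must start at 1; otherwise STRING
      *    raises the overflow condition and moves nothing.
      *    (WS-SEARCH-PATTERN and WS-PATTERN-LEN are assumed
      *    to be declared elsewhere in WORKING-STORAGE.)
           MOVE SPACES TO WS-SEARCH-PATTERN
           MOVE 1 TO WS-PATTERN-LEN
           STRING '"' WS-TARGET-KEY DELIMITED SPACES
                  '":"' DELIMITED SIZE
               INTO WS-SEARCH-PATTERN
               WITH POINTER WS-PATTERN-LEN
           END-STRING
           SUBTRACT 1 FROM WS-PATTERN-LEN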

      *    Scan for the key
           MOVE 1 TO WS-SCAN-POS
           PERFORM UNTIL WS-SCAN-POS >
                         WS-JSON-LEN - WS-PATTERN-LEN
               IF WS-JSON-PAYLOAD(
                   WS-SCAN-POS:WS-PATTERN-LEN)
                       = WS-SEARCH-PATTERN(1:WS-PATTERN-LEN)
      *            Found the key — value starts after it
                   COMPUTE WS-VALUE-START =
                       WS-SCAN-POS + WS-PATTERN-LEN
      *            Find the closing quote
                   MOVE WS-VALUE-START TO WS-SCAN-POS
      *            Test the bound first so the reference never
      *            runs past the end of the payload
                   PERFORM UNTIL
                       WS-SCAN-POS > WS-JSON-LEN
                       OR WS-JSON-PAYLOAD(WS-SCAN-POS:1)
                           = '"'
                       ADD 1 TO WS-SCAN-POS
                   END-PERFORM
                   COMPUTE WS-VALUE-LEN =
                       WS-SCAN-POS - WS-VALUE-START
                   IF WS-VALUE-LEN > 0
                       AND WS-VALUE-LEN <= 200
                       MOVE WS-JSON-PAYLOAD(
                           WS-VALUE-START:WS-VALUE-LEN)
                           TO WS-EXTRACTED-VALUE
                       SET WS-VALUE-FOUND TO TRUE
                   END-IF
                   EXIT PERFORM
               END-IF
               ADD 1 TO WS-SCAN-POS
           END-PERFORM
           .

⚠️ Production Warning: This simple JSON parser handles only flat key-value pairs with string values. For production JSON processing with nested objects, arrays, and escaped characters, use IBM's JSON PARSE statement (Enterprise COBOL 6.1+) or a JSON parsing subprogram. Tomás Rivera at MedClaim notes: "We started with reference modification for JSON, and it worked fine until we got a payload with escaped quotes inside a value. Now we use JSON PARSE for anything from external systems and only use ref-mod parsing for our own internal simple-format messages."
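For readers who want to run the flat key-value technique end to end, here is a self-contained sketch with every data item declared. It follows the same idea as FIND-JSON-VALUE but is not the chapter's listing: the program name JSONKEY, the items WS-PATTERN, WS-LAST-START, WS-VALUE, and WS-FOUND-FLAG, and the sample payload are all assumptions made for illustration. It shares the limits stated in the warning above (flat string values only, no escaped quotes, no whitespace around the colon).

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. JSONKEY.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-JSON-PAYLOAD   PIC X(80) VALUE
           '{"acct":"12345","status":"OPEN"}'.
       01  WS-JSON-LEN       PIC 9(4).
       01  WS-TARGET-KEY     PIC X(30) VALUE "status".
       01  WS-PATTERN        PIC X(34).
       01  WS-PATTERN-LEN    PIC 9(4).
       01  WS-LAST-START     PIC 9(4).
       01  WS-SCAN-POS       PIC 9(4).
       01  WS-VALUE-START    PIC 9(4).
       01  WS-VALUE-LEN      PIC 9(4).
       01  WS-VALUE          PIC X(20) VALUE SPACES.
       01  WS-FOUND-FLAG     PIC X VALUE "N".
           88  WS-FOUND      VALUE "Y".
       PROCEDURE DIVISION.
       MAIN-PARA.
           MOVE FUNCTION LENGTH(
               FUNCTION TRIM(WS-JSON-PAYLOAD TRAILING))
               TO WS-JSON-LEN
      *    Build the pattern "key":" (the pointer starts at 1)
           MOVE SPACES TO WS-PATTERN
           MOVE 1 TO WS-PATTERN-LEN
           STRING '"' DELIMITED BY SIZE
                  WS-TARGET-KEY DELIMITED BY SPACE
                  '":"' DELIMITED BY SIZE
               INTO WS-PATTERN
               WITH POINTER WS-PATTERN-LEN
           END-STRING
           SUBTRACT 1 FROM WS-PATTERN-LEN
      *    Scan only where the whole pattern still fits
           IF WS-PATTERN-LEN < WS-JSON-LEN
               COMPUTE WS-LAST-START =
                   WS-JSON-LEN - WS-PATTERN-LEN + 1
               PERFORM VARYING WS-SCAN-POS FROM 1 BY 1
                   UNTIL WS-SCAN-POS > WS-LAST-START
                      OR WS-FOUND
                   IF WS-JSON-PAYLOAD(WS-SCAN-POS:WS-PATTERN-LEN)
                       = WS-PATTERN(1:WS-PATTERN-LEN)
                       PERFORM EXTRACT-VALUE
                   END-IF
               END-PERFORM
           END-IF
           IF WS-FOUND
               DISPLAY "VALUE=[" WS-VALUE(1:WS-VALUE-LEN) "]"
           ELSE
               DISPLAY "KEY NOT FOUND"
           END-IF
           STOP RUN.

       EXTRACT-VALUE.
           COMPUTE WS-VALUE-START =
               WS-SCAN-POS + WS-PATTERN-LEN
           MOVE ZERO TO WS-VALUE-LEN
      *    Count bytes up to the closing quote; the bound is
      *    tested first so the reference stays inside the data
           PERFORM UNTIL
               WS-VALUE-START + WS-VALUE-LEN > WS-JSON-LEN
               OR WS-JSON-PAYLOAD(
                   WS-VALUE-START + WS-VALUE-LEN:1) = '"'
               ADD 1 TO WS-VALUE-LEN
           END-PERFORM
           IF WS-VALUE-LEN > 0 AND WS-VALUE-LEN <= 20
               MOVE WS-JSON-PAYLOAD(
                   WS-VALUE-START:WS-VALUE-LEN) TO WS-VALUE
               SET WS-FOUND TO TRUE
           END-IF
           .
```

Keeping the extracted length in WS-VALUE-LEN lets the caller display or move exactly the bytes that were found instead of the padded 20-byte field.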

XML Element Extraction with Reference Modification

Healthcare systems frequently process XML documents. Reference modification provides a lightweight XML element extraction capability:

       01  WS-XML-DATA       PIC X(10000).
       01  WS-XML-LEN        PIC 9(5).
       01  WS-TAG-NAME       PIC X(30).
       01  WS-OPEN-TAG       PIC X(32).
       01  WS-CLOSE-TAG      PIC X(33).
       01  WS-OPEN-LEN       PIC 99.
       01  WS-CLOSE-LEN      PIC 99.

       EXTRACT-XML-ELEMENT.
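      *    Build open and close tags. Each pointer must start
      *    at 1; a pointer below 1 raises the overflow
      *    condition and STRING transfers nothing.
           MOVE SPACES TO WS-OPEN-TAG WS-CLOSE-TAG
           MOVE 1 TO WS-OPEN-LEN WS-CLOSE-LEN
           STRING "<" WS-TAG-NAME DELIMITED SPACES
                  ">" DELIMITED SIZE
               INTO WS-OPEN-TAG
               WITH POINTER WS-OPEN-LEN
           END-STRING
           SUBTRACT 1 FROM WS-OPEN-LEN

           STRING "</" WS-TAG-NAME DELIMITED SPACES
                  ">" DELIMITED SIZE
               INTO WS-CLOSE-TAG
               WITH POINTER WS-CLOSE-LEN
           END-STRING
           SUBTRACT 1 FROM WS-CLOSE-LEN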

      *    Scan for open tag
           MOVE 1 TO WS-SCAN-POS
           SET WS-ELEMENT-NOT-FOUND TO TRUE
           PERFORM UNTIL WS-SCAN-POS >
                         WS-XML-LEN - WS-OPEN-LEN
               IF WS-XML-DATA(WS-SCAN-POS:WS-OPEN-LEN)
                   = WS-OPEN-TAG(1:WS-OPEN-LEN)
      *            Content starts after open tag
                   COMPUTE WS-VALUE-START =
                       WS-SCAN-POS + WS-OPEN-LEN
      *            Find close tag
                   MOVE WS-VALUE-START TO WS-INNER-POS
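      *            The "+ 1" lets the last valid start position
      *            be tested, so a close tag at the very end
      *            of the data is still found
                   PERFORM UNTIL WS-INNER-POS >
                       WS-XML-LEN - WS-CLOSE-LEN + 1
                       IF WS-XML-DATA(
                           WS-INNER-POS:WS-CLOSE-LEN)
                           = WS-CLOSE-TAG(1:WS-CLOSE-LEN)
                           COMPUTE WS-VALUE-LEN =
                               WS-INNER-POS
                               - WS-VALUE-START
      *                    A zero length is invalid in reference
      *                    modification, so guard empty elements
                           MOVE SPACES TO WS-EXTRACTED-VALUE
                           IF WS-VALUE-LEN > 0
                               MOVE WS-XML-DATA(
                                   WS-VALUE-START:WS-VALUE-LEN)
                                   TO WS-EXTRACTED-VALUE
                           END-IF
                           SET WS-ELEMENT-FOUND TO TRUE
                           EXIT PERFORM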
                       END-IF
                       ADD 1 TO WS-INNER-POS
                   END-PERFORM
                   EXIT PERFORM
               END-IF
               ADD 1 TO WS-SCAN-POS
           END-PERFORM
           .

Sarah Kim uses this pattern at MedClaim for processing CDA (Clinical Document Architecture) documents that arrive with eligibility responses: "We extract just the three or four elements we need — patient name, subscriber ID, effective dates — and ignore the rest of the XML structure. It is not a full XML parser, but for targeted extraction of known elements, it is fast and reliable."

INSPECT Combined with Reference Modification

The INSPECT statement and reference modification are powerful when combined. INSPECT can count or replace characters within a specific portion of a field:

      *    Count commas only in the data portion (skip header)
           INSPECT WS-RECORD(WS-HEADER-LEN + 1:
               WS-RECORD-LEN - WS-HEADER-LEN)
               TALLYING WS-COMMA-COUNT
               FOR ALL ","

      *    Replace pipes with commas in the payload only
           INSPECT WS-BUFFER(WS-PAYLOAD-START:
               WS-PAYLOAD-LEN)
               REPLACING ALL "|" BY ","

      *    Count digits in a specific substring
           INSPECT WS-FIELD(WS-START:WS-LEN)
               TALLYING WS-DIGIT-COUNT
               FOR ALL "0" "1" "2" "3" "4" "5"
                       "6" "7" "8" "9"

This combination is particularly useful when processing records with a fixed-format header followed by a variable-format body. You can apply INSPECT operations to just the body portion without affecting the header.

19.11 GlobalBank Case Study: Transaction Description Parser

GlobalBank's online banking system sends transaction descriptions in a structured format that varies by transaction type. The description field encodes multiple pieces of information separated by slashes:

ATM/WITHDRAWAL/BRANCH-0042/TERMINAL-7
POS/PURCHASE/MERCHANT:AMAZON.COM/REF:A123456
ACH/DIRECT-DEPOSIT/EMPLOYER:ACME-CORP
WIRE/INTL/BENEFICIARY:J.SMITH/SWIFT:ABCDUS33

The parser must handle:

- Variable number of fields per transaction type
- Fields that contain colons (key:value pairs)
- Missing optional fields

       01  WS-TRANS-DESC     PIC X(200).
       01  WS-PARSED-TRANS.
           05  WS-PT-CHANNEL     PIC X(10).
           05  WS-PT-ACTION      PIC X(20).
           05  WS-PT-DETAILS.
               10  WS-PT-DETAIL  PIC X(50)
                                 OCCURS 5 TIMES.
           05  WS-PT-DETAIL-CNT  PIC 9.

       PARSE-TRANS-DESCRIPTION.
           MOVE SPACES TO WS-PARSED-TRANS
           MOVE ZERO TO WS-PT-DETAIL-CNT

      *    Use the generic field extractor
           MOVE WS-TRANS-DESC TO WS-GFE-INPUT
           MOVE "/" TO WS-GFE-DELIM

      *    Field 1: Channel
           MOVE 1 TO WS-GFE-FIELD-NUM
           PERFORM EXTRACT-DELIMITED-FIELD
           IF WS-GFE-FOUND
               MOVE WS-GFE-RESULT TO WS-PT-CHANNEL
           END-IF

      *    Field 2: Action
           MOVE 2 TO WS-GFE-FIELD-NUM
           PERFORM EXTRACT-DELIMITED-FIELD
           IF WS-GFE-FOUND
               MOVE WS-GFE-RESULT TO WS-PT-ACTION
           END-IF

      *    Fields 3+: Variable details
           PERFORM VARYING WS-GFE-FIELD-NUM
               FROM 3 BY 1
               UNTIL WS-GFE-FIELD-NUM > 7
                  OR WS-PT-DETAIL-CNT >= 5
               PERFORM EXTRACT-DELIMITED-FIELD
               IF WS-GFE-FOUND
                   ADD 1 TO WS-PT-DETAIL-CNT
                   MOVE WS-GFE-RESULT TO
                       WS-PT-DETAIL(WS-PT-DETAIL-CNT)
               ELSE
                   EXIT PERFORM
               END-IF
           END-PERFORM
           .

After parsing, the detail fields can be further decomposed. For POS transactions, the detail field "MERCHANT:AMAZON.COM" needs to be split on the colon to extract the merchant name. Derek Washington's code applies the same scanning pattern a second time, one level down, to each detail field:

       PARSE-KEY-VALUE-DETAILS.
      *    For each detail that contains a colon,
      *    extract key and value
           PERFORM VARYING WS-DT-IDX FROM 1 BY 1
               UNTIL WS-DT-IDX > WS-PT-DETAIL-CNT
      *        Find the colon
               MOVE ZERO TO WS-COLON-POS
               PERFORM VARYING WS-SCAN FROM 1 BY 1
                   UNTIL WS-SCAN >
                       FUNCTION LENGTH(
                           WS-PT-DETAIL(WS-DT-IDX))
                   IF WS-PT-DETAIL(WS-DT-IDX)
                       (WS-SCAN:1) = ":"
                       MOVE WS-SCAN TO WS-COLON-POS
                       EXIT PERFORM
                   END-IF
               END-PERFORM

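      *        A colon in position 1 would give a zero-length
      *        key, which reference modification does not allow
               IF WS-COLON-POS > 1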
      *            Key is before colon
                   MOVE WS-PT-DETAIL(WS-DT-IDX)
                       (1:WS-COLON-POS - 1)
                       TO WS-DT-KEY(WS-DT-IDX)
      *            Value is after colon
                   COMPUTE WS-VAL-LEN =
                       FUNCTION LENGTH(
                           FUNCTION TRIM(
                               WS-PT-DETAIL(WS-DT-IDX)))
                       - WS-COLON-POS
                   IF WS-VAL-LEN > 0
                       MOVE WS-PT-DETAIL(WS-DT-IDX)
                           (WS-COLON-POS + 1:WS-VAL-LEN)
                           TO WS-DT-VALUE(WS-DT-IDX)
                   END-IF
               END-IF
           END-PERFORM
           .

💡 Design Pattern: This two-level parsing strategy — first split the record on the primary delimiter, then split individual fields on a secondary delimiter — is the standard approach for structured text in COBOL. It generalizes to any number of levels, though in practice more than two levels is rare and may indicate that a different data format (XML, JSON) would be more appropriate.
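The second-level split can be tried in isolation on a single detail field. This is a minimal sketch, assuming one key:value pair per field; the program name KVSPLIT and the items WS-DETAIL, WS-KEY, and WS-VALUE are illustrative and do not appear in the GlobalBank listing.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. KVSPLIT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-DETAIL       PIC X(50) VALUE "MERCHANT:AMAZON.COM".
       01  WS-DETAIL-LEN   PIC 9(3).
       01  WS-COLON-POS    PIC 9(3) VALUE ZERO.
       01  WS-SCAN         PIC 9(3).
       01  WS-KEY          PIC X(20) VALUE SPACES.
       01  WS-VALUE        PIC X(30) VALUE SPACES.
       PROCEDURE DIVISION.
       MAIN-PARA.
      *    Work with the content length, not the padded size
           MOVE FUNCTION LENGTH(
               FUNCTION TRIM(WS-DETAIL TRAILING))
               TO WS-DETAIL-LEN
      *    Locate the first colon
           PERFORM VARYING WS-SCAN FROM 1 BY 1
               UNTIL WS-SCAN > WS-DETAIL-LEN
                  OR WS-COLON-POS > 0
               IF WS-DETAIL(WS-SCAN:1) = ":"
                   MOVE WS-SCAN TO WS-COLON-POS
               END-IF
           END-PERFORM
      *    Need at least one byte on each side of the colon
           IF WS-COLON-POS > 1
              AND WS-COLON-POS < WS-DETAIL-LEN
               MOVE WS-DETAIL(1:WS-COLON-POS - 1) TO WS-KEY
               MOVE WS-DETAIL(WS-COLON-POS + 1:
                   WS-DETAIL-LEN - WS-COLON-POS) TO WS-VALUE
           END-IF
           DISPLAY "KEY=[" WS-KEY "] VALUE=[" WS-VALUE "]"
           STOP RUN.
```

Both guards matter: a colon in the first or last position would otherwise produce a zero-length reference, which is invalid.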

19.12 MedClaim Case Study: EDI 837 Claim Parsing

MedClaim receives electronic claims in ANSI X12 837 format. Each claim consists of multiple segments, each terminated by ~, with elements separated by * and sub-elements by :.

A simplified claim extract:

ST*837*0001~
BHT*0019*00*12345*20240615*1200*CH~
CLM*CLAIM001*1500.00*11:B:1*Y*A~
SV1*HC:99213*125.00*UN*1~
DTP*472*D8*20240610~

The parsing uses reference modification to navigate through the variable-length segments:

       01  WS-EDI-BUFFER     PIC X(5000).
       01  WS-EDI-BUF-LEN    PIC 9(5).
       01  WS-SEG-START      PIC 9(5) VALUE 1.
       01  WS-SEG-END        PIC 9(5).
       01  WS-CURRENT-SEG    PIC X(500).
       01  WS-SEG-ID         PIC X(3).

       PROCESS-EDI-BUFFER.
           MOVE FUNCTION LENGTH(
               FUNCTION TRIM(WS-EDI-BUFFER TRAILING))
               TO WS-EDI-BUF-LEN
           MOVE 1 TO WS-SEG-START

           PERFORM UNTIL WS-SEG-START > WS-EDI-BUF-LEN
      *        Find segment terminator
               MOVE WS-SEG-START TO WS-SEG-END
               PERFORM UNTIL WS-SEG-END > WS-EDI-BUF-LEN
                   IF WS-EDI-BUFFER(WS-SEG-END:1) = "~"
                       EXIT PERFORM
                   END-IF
                   ADD 1 TO WS-SEG-END
               END-PERFORM

      *        Extract segment
               COMPUTE WS-ELEM-LEN =
                   WS-SEG-END - WS-SEG-START
               IF WS-ELEM-LEN > 0
                AND WS-ELEM-LEN <= 500
                   MOVE SPACES TO WS-CURRENT-SEG
                   MOVE WS-EDI-BUFFER(
                       WS-SEG-START:WS-ELEM-LEN)
                       TO WS-CURRENT-SEG

      *            Get segment identifier (first 2-3 chars)
                   MOVE WS-CURRENT-SEG(1:3)
                       TO WS-SEG-ID
                   PERFORM PROCESS-SEGMENT
               END-IF

      *        Advance past terminator
               COMPUTE WS-SEG-START =
                   WS-SEG-END + 1
           END-PERFORM
           .

       PROCESS-SEGMENT.
           EVALUATE TRUE
               WHEN WS-SEG-ID(1:2) = "ST"
                   PERFORM PARSE-ST-SEGMENT
               WHEN WS-SEG-ID(1:3) = "BHT"
                   PERFORM PARSE-BHT-SEGMENT
               WHEN WS-SEG-ID(1:3) = "CLM"
                   PERFORM PARSE-CLM-SEGMENT
               WHEN WS-SEG-ID(1:3) = "SV1"
                   PERFORM PARSE-SV1-SEGMENT
               WHEN WS-SEG-ID(1:3) = "DTP"
                   PERFORM PARSE-DTP-SEGMENT
               WHEN OTHER
                   DISPLAY "UNKNOWN SEGMENT: " WS-SEG-ID
           END-EVALUATE
           .
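
The PARSE-xxx-SEGMENT paragraphs are not shown in this extract. As a rough sketch of what one of them could look like, the stand-alone program below splits the sample CLM segment into elements on * and then splits the composite 11:B:1 into sub-elements on :. The program name CLMSPLIT, every WS- name, and the table sizes are hypothetical, chosen only for illustration. It also assumes the delimiters shown above and that the composite is the fourth entry (counting the segment tag as the first).

       IDENTIFICATION DIVISION.
       PROGRAM-ID. CLMSPLIT.
      * Hypothetical sketch: names and sizes are illustrative only.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-SEG            PIC X(500).
       01  WS-SEG-LEN        PIC 9(3).
       01  WS-POS            PIC 9(3).
       01  WS-START          PIC 9(3).
       01  WS-LEN            PIC 9(3).
       01  WS-IDX            PIC 99.
       01  WS-ELEM-COUNT     PIC 99 VALUE ZERO.
       01  WS-ELEM-TABLE.
           05  WS-ELEM       PIC X(30) OCCURS 10 TIMES.

       PROCEDURE DIVISION.
       MAIN-PARA.
           MOVE "CLM*CLAIM001*1500.00*11:B:1*Y*A" TO WS-SEG
           MOVE FUNCTION LENGTH(
               FUNCTION TRIM(WS-SEG TRAILING))
               TO WS-SEG-LEN

      *    Level 1: close an element at every "*"
           MOVE 1 TO WS-START
           PERFORM VARYING WS-POS FROM 1 BY 1
               UNTIL WS-POS > WS-SEG-LEN
               IF WS-SEG(WS-POS:1) = "*"
                   PERFORM STORE-ELEMENT
                   COMPUTE WS-START = WS-POS + 1
               END-IF
           END-PERFORM
      *    The last element has no trailing delimiter
           PERFORM STORE-ELEMENT

           PERFORM VARYING WS-IDX FROM 1 BY 1
               UNTIL WS-IDX > WS-ELEM-COUNT
               DISPLAY "ELEMENT " WS-IDX ": " WS-ELEM(WS-IDX)
           END-PERFORM

      *    Level 2: split the composite (entry 4) on ":"
           MOVE FUNCTION LENGTH(
               FUNCTION TRIM(WS-ELEM(4) TRAILING))
               TO WS-SEG-LEN
           MOVE 1 TO WS-START
           PERFORM VARYING WS-POS FROM 1 BY 1
               UNTIL WS-POS > WS-SEG-LEN
               IF WS-ELEM(4)(WS-POS:1) = ":"
                   PERFORM SHOW-SUB-ELEMENT
                   COMPUTE WS-START = WS-POS + 1
               END-IF
           END-PERFORM
           PERFORM SHOW-SUB-ELEMENT
           STOP RUN
           .

       STORE-ELEMENT.
      *    WS-POS sits on the delimiter, or one past the end
           COMPUTE WS-LEN = WS-POS - WS-START
           IF WS-ELEM-COUNT < 10
               ADD 1 TO WS-ELEM-COUNT
               MOVE SPACES TO WS-ELEM(WS-ELEM-COUNT)
               IF WS-LEN > 0 AND WS-LEN <= 30
                   MOVE WS-SEG(WS-START:WS-LEN)
                       TO WS-ELEM(WS-ELEM-COUNT)
               END-IF
           END-IF
           .

       SHOW-SUB-ELEMENT.
           COMPUTE WS-LEN = WS-POS - WS-START
           IF WS-LEN > 0
               DISPLAY "  SUB-ELEMENT: "
                       WS-ELEM(4)(WS-START:WS-LEN)
           END-IF
           .

For the sample segment this should list six entries (CLM, CLAIM001, 1500.00, 11:B:1, Y, A) followed by the three sub-elements 11, B, and 1. Because WS-START never moves past WS-POS, the computed length cannot go negative, and every reference is guarded by a length test before it is used.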

⚖️ The Modernization Spectrum: EDI is a 1970s data interchange format still used for billions of dollars in healthcare transactions daily. Modern COBOL systems must parse this format reliably while increasingly also supporting JSON and XML. EDI is delimited rather than fixed-width, but the meaning of each element depends on its ordinal position within the segment, so a parser has to track delimiters byte by byte. Reference modification provides that precision, while newer COBOL features (FUNCTION TRIM, FUNCTION LOWER-CASE) help with modern format translation. The modernization path is not to replace COBOL but to extend its capabilities.

19.12 Complete Worked Example: CSV to Fixed-Width Converter

Let us bring together multiple reference modification patterns into one complete program. It parses CSV lines with quoted fields (two hard-coded test lines stand in for file input), converts them to fixed-width output records, and applies defensive boundary checks along the way.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. CSV2FIX.
      *============================================================
      * CSV to Fixed-Width Converter
      * Parses quoted CSV input and produces fixed-width output
      * using reference modification for all parsing operations.
      *============================================================

       DATA DIVISION.
       WORKING-STORAGE SECTION.

       01  WS-CSV-LINE        PIC X(500).
       01  WS-OUTPUT-REC      PIC X(200).
       01  WS-FIELD-VALUES.
           05  WS-FV-COUNT    PIC 99 VALUE ZERO.
           05  WS-FV-ENTRY    OCCURS 20 TIMES.
               10  WS-FV-VALUE    PIC X(50).
               10  WS-FV-LEN      PIC 9(3).

       01  WS-CSV-PARSE.
           05  WS-CP-POS       PIC 9(3) VALUE 1.
           05  WS-CP-START     PIC 9(3).
           05  WS-CP-LEN       PIC 9(3).
           05  WS-CP-MAX       PIC 9(3).
           05  WS-CP-IN-QUOTES PIC X VALUE "N".
               88  WS-CP-QUOTED   VALUE "Y".
               88  WS-CP-UNQUOTED VALUE "N".
           05  WS-CP-CHAR      PIC X.

      * Output field map
       01  WS-OUTPUT-MAP.
           05  WS-OM-ENTRY  OCCURS 6 TIMES.
               10  WS-OM-START   PIC 9(3).
               10  WS-OM-LENGTH  PIC 9(3).

      * Record counter
       01  WS-REC-COUNT        PIC 9(7) VALUE ZERO.
       01  WS-ERR-COUNT        PIC 9(5) VALUE ZERO.

       PROCEDURE DIVISION.
       0000-MAIN.
           PERFORM 1000-SETUP-FIELD-MAP
           PERFORM 2000-PROCESS-TEST-DATA
           DISPLAY "RECORDS: " WS-REC-COUNT
                   " ERRORS: " WS-ERR-COUNT
           STOP RUN
           .

       1000-SETUP-FIELD-MAP.
      *    Define output positions for 6 fields
           MOVE 001 TO WS-OM-START(1)
           MOVE 010 TO WS-OM-LENGTH(1)
           MOVE 011 TO WS-OM-START(2)
           MOVE 030 TO WS-OM-LENGTH(2)
           MOVE 041 TO WS-OM-START(3)
           MOVE 020 TO WS-OM-LENGTH(3)
           MOVE 061 TO WS-OM-START(4)
           MOVE 015 TO WS-OM-LENGTH(4)
           MOVE 076 TO WS-OM-START(5)
           MOVE 002 TO WS-OM-LENGTH(5)
           MOVE 078 TO WS-OM-START(6)
           MOVE 010 TO WS-OM-LENGTH(6)
           .

       2000-PROCESS-TEST-DATA.
           MOVE
           'ABC001,"Smith, John",123 Main St,New York,NY,10001'
               TO WS-CSV-LINE
           PERFORM 3000-PARSE-AND-CONVERT
           ADD 1 TO WS-REC-COUNT

           MOVE
           'DEF002,Jane Doe,"456 Oak Ave, Apt 3B",Boston,MA,02101'
               TO WS-CSV-LINE
           PERFORM 3000-PARSE-AND-CONVERT
           ADD 1 TO WS-REC-COUNT
           .

       3000-PARSE-AND-CONVERT.
           PERFORM 3100-PARSE-CSV-LINE
           PERFORM 3200-BUILD-FIXED-RECORD
           DISPLAY "OUTPUT: [" WS-OUTPUT-REC "]"
           .

       3100-PARSE-CSV-LINE.
           MOVE ZERO TO WS-FV-COUNT
           MOVE 1 TO WS-CP-POS
           SET WS-CP-UNQUOTED TO TRUE

           MOVE FUNCTION LENGTH(
               FUNCTION TRIM(WS-CSV-LINE TRAILING))
               TO WS-CP-MAX

           PERFORM UNTIL WS-CP-POS > WS-CP-MAX
                      OR WS-FV-COUNT >= 20
               MOVE WS-CP-POS TO WS-CP-START
               MOVE ZERO TO WS-CP-LEN

      *        Check if field starts with quote
               IF WS-CSV-LINE(WS-CP-POS:1) = '"'
      *            Quoted field - scan for closing quote
                   ADD 1 TO WS-CP-POS
                   MOVE WS-CP-POS TO WS-CP-START
                   PERFORM UNTIL WS-CP-POS > WS-CP-MAX
                       IF WS-CSV-LINE(WS-CP-POS:1) = '"'
                           EXIT PERFORM
                       END-IF
                       ADD 1 TO WS-CP-POS
                   END-PERFORM
                   COMPUTE WS-CP-LEN =
                       WS-CP-POS - WS-CP-START
                   ADD 1 TO WS-CP-POS
               ELSE
      *            Unquoted field - scan for comma
                   PERFORM UNTIL WS-CP-POS > WS-CP-MAX
                       IF WS-CSV-LINE(WS-CP-POS:1) = ","
                           EXIT PERFORM
                       END-IF
                       ADD 1 TO WS-CP-POS
                   END-PERFORM
                   COMPUTE WS-CP-LEN =
                       WS-CP-POS - WS-CP-START
               END-IF

      *        Store extracted field; a value longer than the
      *        50-byte slot is truncated and counted as an error
               ADD 1 TO WS-FV-COUNT
               MOVE SPACES TO WS-FV-VALUE(WS-FV-COUNT)
               IF WS-CP-LEN > 50
                   ADD 1 TO WS-ERR-COUNT
                   MOVE 50 TO WS-CP-LEN
               END-IF
               IF WS-CP-LEN > 0
                   MOVE WS-CSV-LINE(
                       WS-CP-START:WS-CP-LEN)
                       TO WS-FV-VALUE(WS-FV-COUNT)
               END-IF
               MOVE WS-CP-LEN TO
                   WS-FV-LEN(WS-FV-COUNT)

      *        Step past the delimiter that ended the field
      *        (harmless when already beyond the end of the line)
               ADD 1 TO WS-CP-POS
           END-PERFORM
           .

       3200-BUILD-FIXED-RECORD.
           MOVE SPACES TO WS-OUTPUT-REC
           PERFORM VARYING WS-CP-POS FROM 1 BY 1
               UNTIL WS-CP-POS > WS-FV-COUNT
                  OR WS-CP-POS > 6
      *        Boundary check
               IF WS-OM-START(WS-CP-POS) +
                  WS-OM-LENGTH(WS-CP-POS) - 1
                  > FUNCTION LENGTH(WS-OUTPUT-REC)
                   ADD 1 TO WS-ERR-COUNT
               ELSE
                   IF WS-FV-LEN(WS-CP-POS) <=
                      WS-OM-LENGTH(WS-CP-POS)
                       MOVE WS-FV-VALUE(WS-CP-POS) TO
                           WS-OUTPUT-REC(
                               WS-OM-START(WS-CP-POS):
                               WS-OM-LENGTH(WS-CP-POS))
                   ELSE
      *                Truncate to output field size
                       MOVE WS-FV-VALUE(WS-CP-POS)(
                           1:WS-OM-LENGTH(WS-CP-POS))
                           TO WS-OUTPUT-REC(
                               WS-OM-START(WS-CP-POS):
                               WS-OM-LENGTH(WS-CP-POS))
                   END-IF
               END-IF
           END-PERFORM
           .

This complete worked example demonstrates:
- Quoted CSV parsing with reference modification (not UNSTRING)
- Quote-aware scanning: a field that opens with a quote is read up to its closing quote, so embedded commas survive (the WS-CP-IN-QUOTES flag is declared and reset for each line, but this version never needs to test it)
- Field map-driven output generation
- Boundary checking on both input extraction and output placement
- Truncation handling when input exceeds output field width
- Error counting for operational monitoring
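
The listing above decides once per field whether it is quoted. An alternative is a single character-by-character pass driven by an in-quotes flag. The following is a minimal sketch of that variant, not part of CSV2FIX: the program name CSVSTATE and all WS- names are hypothetical, and it assumes quotes are never escaped by doubling.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. CSVSTATE.
      * Hypothetical sketch: names and sizes are illustrative only.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-LINE           PIC X(100).
       01  WS-LINE-LEN       PIC 9(3).
       01  WS-POS            PIC 9(3).
       01  WS-FIELD          PIC X(50).
       01  WS-FIELD-LEN      PIC 9(3) VALUE ZERO.
       01  WS-FIELD-NUM      PIC 99 VALUE ZERO.
       01  WS-QUOTE-FLAG     PIC X VALUE "N".
           88  WS-IN-QUOTES      VALUE "Y".
           88  WS-NOT-IN-QUOTES  VALUE "N".

       PROCEDURE DIVISION.
       MAIN-PARA.
           MOVE 'A1,"Smith, John",,NY' TO WS-LINE
           MOVE FUNCTION LENGTH(
               FUNCTION TRIM(WS-LINE TRAILING))
               TO WS-LINE-LEN
           MOVE SPACES TO WS-FIELD
           PERFORM VARYING WS-POS FROM 1 BY 1
               UNTIL WS-POS > WS-LINE-LEN
               EVALUATE TRUE
                   WHEN WS-LINE(WS-POS:1) = '"'
      *                A quote toggles the state and is not copied
                       IF WS-IN-QUOTES
                           SET WS-NOT-IN-QUOTES TO TRUE
                       ELSE
                           SET WS-IN-QUOTES TO TRUE
                       END-IF
                   WHEN WS-LINE(WS-POS:1) = ","
                    AND WS-NOT-IN-QUOTES
      *                A comma ends the field only outside quotes
                       PERFORM EMIT-FIELD
                   WHEN OTHER
      *                Append the byte to the current field
                       IF WS-FIELD-LEN < 50
                           ADD 1 TO WS-FIELD-LEN
                           MOVE WS-LINE(WS-POS:1)
                               TO WS-FIELD(WS-FIELD-LEN:1)
                       END-IF
               END-EVALUATE
           END-PERFORM
      *    The last field has no trailing comma
           PERFORM EMIT-FIELD
           STOP RUN
           .

       EMIT-FIELD.
           ADD 1 TO WS-FIELD-NUM
      *    A zero-length reference is invalid, so test first
           IF WS-FIELD-LEN > 0
               DISPLAY "FIELD " WS-FIELD-NUM ": ["
                       WS-FIELD(1:WS-FIELD-LEN) "]"
           ELSE
               DISPLAY "FIELD " WS-FIELD-NUM ": []"
           END-IF
           MOVE SPACES TO WS-FIELD
           MOVE ZERO TO WS-FIELD-LEN
           .

For the sample line this should report four fields: A1, then Smith, John with its comma intact, then an empty field, then NY. The trade-off is one single-byte reference per character instead of one multi-byte reference per field, in exchange for a scanner whose behavior is easy to reason about state by state.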

19.13 Defensive Programming for Reference Modification

Reference modification errors are among the hardest COBOL bugs to diagnose: without range checking, an out-of-range reference silently reads or overwrites adjacent storage instead of failing at the faulty statement.

Always Validate Position and Length

       SAFE-REF-MOD.
      *    Only the WHEN OTHER branch reaches the MOVE, so a failed
      *    check can never fall through to the reference.
           COMPUTE WS-END-POS =
               WS-START-POS + WS-REF-LENGTH - 1
           EVALUATE TRUE
               WHEN WS-START-POS < 1
                   DISPLAY "START POS < 1: " WS-START-POS
                   PERFORM 9900-ERROR-HANDLER
               WHEN WS-REF-LENGTH < 1
                   DISPLAY "LENGTH < 1: " WS-REF-LENGTH
                   PERFORM 9900-ERROR-HANDLER
               WHEN WS-END-POS > LENGTH OF WS-TARGET
                   DISPLAY "REF MOD OVERFLOW: END POS "
                           WS-END-POS " > MAX "
                           LENGTH OF WS-TARGET
                   PERFORM 9900-ERROR-HANDLER
               WHEN OTHER
                   MOVE WS-TARGET(WS-START-POS:WS-REF-LENGTH)
                       TO WS-OUTPUT
           END-EVALUATE
           .

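Enable Range Checking at Compile Time

On IBM Enterprise COBOL, compile with the SSRANGE option. The option is set at compile time, but the checks it generates for subscripts, indexes, and reference modification run at execution time: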

CBL SSRANGE

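On Micro Focus COBOL, the SSRANGE compiler directive enables the corresponding check (the BOUND directive covers table subscripts and indexes only):

      $SET SSRANGE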

Test Edge Cases

Always test with:
- Position = 1 (first byte)
- Position = LENGTH OF item (last byte, length 1)
- Length = LENGTH OF item (entire field)
- Zero-length input strings
- Maximum-length input strings
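
The edge cases listed above can be exercised with a small driver. The sketch below is one possible shape for such a test, assuming a 10-byte item. The program name EDGETEST, the paragraph CHECKED-EXTRACT, and the WS- names are all hypothetical.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. EDGETEST.
      * Hypothetical test driver: all names are illustrative.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-ITEM           PIC X(10) VALUE "ABCDEFGHIJ".
       01  WS-START-POS      PIC S9(4) COMP.
       01  WS-REF-LENGTH     PIC S9(4) COMP.
       01  WS-CASE-NAME      PIC X(16).
       01  WS-RESULT         PIC X(10).

       PROCEDURE DIVISION.
       MAIN-PARA.
           MOVE "FIRST BYTE" TO WS-CASE-NAME
           MOVE 1 TO WS-START-POS
           MOVE 1 TO WS-REF-LENGTH
           PERFORM CHECKED-EXTRACT

           MOVE "LAST BYTE" TO WS-CASE-NAME
           MOVE 10 TO WS-START-POS
           MOVE 1 TO WS-REF-LENGTH
           PERFORM CHECKED-EXTRACT

           MOVE "ENTIRE FIELD" TO WS-CASE-NAME
           MOVE 1 TO WS-START-POS
           MOVE 10 TO WS-REF-LENGTH
           PERFORM CHECKED-EXTRACT

           MOVE "ZERO LENGTH" TO WS-CASE-NAME
           MOVE 1 TO WS-START-POS
           MOVE 0 TO WS-REF-LENGTH
           PERFORM CHECKED-EXTRACT

           MOVE "ONE PAST END" TO WS-CASE-NAME
           MOVE 2 TO WS-START-POS
           MOVE 10 TO WS-REF-LENGTH
           PERFORM CHECKED-EXTRACT

           MOVE "START ZERO" TO WS-CASE-NAME
           MOVE 0 TO WS-START-POS
           MOVE 5 TO WS-REF-LENGTH
           PERFORM CHECKED-EXTRACT
           STOP RUN
           .

       CHECKED-EXTRACT.
      *    The reference is made only when all three checks pass
           IF WS-START-POS >= 1
              AND WS-REF-LENGTH >= 1
              AND WS-START-POS + WS-REF-LENGTH - 1
                  <= LENGTH OF WS-ITEM
               MOVE SPACES TO WS-RESULT
               MOVE WS-ITEM(WS-START-POS:WS-REF-LENGTH)
                   TO WS-RESULT
               DISPLAY WS-CASE-NAME " OK  [" WS-RESULT "]"
           ELSE
               DISPLAY WS-CASE-NAME " REJECTED"
           END-IF
           .

The first three cases should be accepted, returning A, J, and the whole field, and the last three should be rejected. Signed binary items are used for the position and length so that a zero or negative value can be represented and caught rather than silently converted.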

19.13 The Student Mainframe Lab

🧪 Try It Yourself: CSV Parser

Build a program that reads a CSV file and parses each line using reference modification (not UNSTRING). Handle:
1. Regular comma-separated values
2. Quoted fields that may contain commas: "Smith, John",42,"New York"
3. Empty fields between consecutive commas
4. Display each field on a separate line with its field number

Hint: Maintain a state variable ("in quotes" vs. "not in quotes") as you scan each character.

🧪 Try It Yourself: Dynamic Report Builder

Write a program that builds report lines dynamically using the pointer pattern. The report should include:
1. A configurable set of columns (name, account, balance, branch)
2. Variable-width columns (only as wide as the longest value)
3. Column separators
4. A header line and a separator line
5. Right-aligned numeric columns

Use reference modification with a pointer variable to assemble each line.

Common Reference Modification Bugs and Their Symptoms

Understanding how reference modification bugs manifest helps you diagnose them quickly in production:

Bug: Start position = 0

*> BUG: WS-POS was not initialized and contains 0
MOVE WS-DATA(WS-POS:5) TO WS-OUTPUT

Symptom: On z/OS without SSRANGE, this reads from one byte before the start of WS-DATA, producing garbage in the first byte of output. With SSRANGE enabled, it produces a runtime error. On some platforms, it may work by accident if the preceding byte happens to contain valid data.

Bug: Length exceeds remaining bytes

*> BUG: WS-POS = 45, WS-LEN = 10, but WS-DATA is PIC X(50)
MOVE WS-DATA(WS-POS:WS-LEN) TO WS-OUTPUT
*> Attempts to read positions 45-54, but only 45-50 exist

Symptom: Reads 4 bytes past the end of WS-DATA (positions 51-54), picking up whatever data follows it in WORKING-STORAGE. Often manifests as random characters appended to otherwise valid data. Extremely intermittent: results change if you add or remove other working-storage items.

Bug: Moving to reference-modified target that overlaps

*> BUG: Source and target overlap
MOVE WS-RECORD(5:10) TO WS-RECORD(3:10)

Symptom: Results are unpredictable. The compiler may process the MOVE left-to-right or right-to-left, and the result depends on which direction. This is undefined behavior in the COBOL standard. If you need to shift data within a field, use an intermediate work area.
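
The intermediate work area mentioned above can be as simple as the following sketch. The program name SHIFTSAF and the item WS-SHIFT-WORK are hypothetical. The work area is sized to the 10 bytes being moved.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. SHIFTSAF.
      * Hypothetical sketch: names and sizes are illustrative only.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-RECORD         PIC X(20)
                             VALUE "ABCDEFGHIJKLMNOPQRST".
       01  WS-SHIFT-WORK     PIC X(10).

       PROCEDURE DIVISION.
       MAIN-PARA.
      *    Copy the source bytes out first
           MOVE WS-RECORD(5:10) TO WS-SHIFT-WORK
      *    Then write them back; the operands no longer overlap
           MOVE WS-SHIFT-WORK TO WS-RECORD(3:10)
           DISPLAY "[" WS-RECORD "]"
           STOP RUN
           .

Positions 5-14 (EFGHIJKLMN) land at positions 3-12, so the program displays [ABEFGHIJKLMNMNOPQRST] regardless of the direction in which the compiler copies bytes.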

Bug: Negative or zero length from arithmetic

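COMPUTE WS-LEN = WS-END-POS - WS-START-POS
*> If WS-END-POS < WS-START-POS, the result is negative.
*> A signed WS-LEN keeps the negative value; an unsigned
*> WS-LEN drops the sign, so -5 is stored as 5.
MOVE WS-DATA(WS-START-POS:WS-LEN) TO WS-OUTPUT

Symptom: Depends on how WS-LEN is declared. If it is unsigned, the sign is lost when the result is stored, and the MOVE extracts a plausible-looking but wrong substring. If it is signed, a zero or negative length is an out-of-range reference: range checking (SSRANGE) reports it at run time, and without range checking the behavior is unpredictable. Always validate: confirm WS-END-POS > WS-START-POS before subtracting, then IF WS-LEN > ZERO AND WS-LEN <= maximum-reasonable-value.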
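
One way to apply that validation is to compare the two positions before subtracting, so a reversed pair never reaches the COMPUTE. This is a minimal sketch under assumed names: LENGUARD, GUARDED-EXTRACT, and the sample data are hypothetical, and WS-END-POS is taken to be the position of the delimiter that follows the value.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. LENGUARD.
      * Hypothetical sketch: names and data are illustrative only.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-DATA           PIC X(20) VALUE "ACCOUNT=12345;".
       01  WS-START-POS      PIC 9(3) VALUE 9.
       01  WS-END-POS        PIC 9(3) VALUE 14.
       01  WS-LEN            PIC 9(3).
       01  WS-OUTPUT         PIC X(20).

       PROCEDURE DIVISION.
       MAIN-PARA.
      *    Normal case: value starts at 9, delimiter found at 14
           PERFORM GUARDED-EXTRACT
      *    Simulate a bad search result: end before start
           MOVE 3 TO WS-END-POS
           PERFORM GUARDED-EXTRACT
           STOP RUN
           .

       GUARDED-EXTRACT.
           MOVE SPACES TO WS-OUTPUT
      *    Compare first; subtract only when the length is positive
      *    and the last byte (WS-END-POS - 1) is inside WS-DATA
           IF WS-START-POS >= 1
              AND WS-END-POS > WS-START-POS
              AND WS-END-POS - 1 <= LENGTH OF WS-DATA
               COMPUTE WS-LEN = WS-END-POS - WS-START-POS
               MOVE WS-DATA(WS-START-POS:WS-LEN) TO WS-OUTPUT
               DISPLAY "EXTRACTED: [" WS-OUTPUT "]"
           ELSE
               DISPLAY "REJECTED: END " WS-END-POS
                       " START " WS-START-POS
           END-IF
           .

The first call should extract 12345 (padded to the width of WS-OUTPUT) and the second should be rejected. Because the subtraction happens only after the comparison, it does not matter that WS-LEN is unsigned here.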

⚠️ Production Story: Derek Washington spent three days debugging a production problem where customer addresses occasionally had random account numbers appended to them. The cause was a reference modification that read 5 bytes past the end of the address field. The bytes that followed in WORKING-STORAGE happened to be the start of the account number field. Adding a single boundary check (IF WS-POS + WS-LEN - 1 > LENGTH OF WS-ADDRESS) fixed the bug permanently.

19.15 Performance Considerations for Reference Modification

Reference modification is generally very fast because it translates to simple address arithmetic at the machine instruction level. However, some patterns can cause performance issues:

Avoid Repeated Length Calculations in Loops

*> POTENTIALLY SLOW: FUNCTION LENGTH sits in the loop test
*> and may be re-evaluated on every iteration
    PERFORM VARYING WS-POS FROM 1 BY 1
        UNTIL WS-POS > FUNCTION LENGTH(WS-DATA)
        ...
    END-PERFORM

*> FASTER: Compute once, reuse
    MOVE FUNCTION LENGTH(WS-DATA) TO WS-MAX-POS
    PERFORM VARYING WS-POS FROM 1 BY 1
        UNTIL WS-POS > WS-MAX-POS
        ...
    END-PERFORM

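For fixed-length fields, FUNCTION LENGTH is resolved at compile time and has zero runtime cost, so the two loops above perform the same. The difference appears for variable-length fields (OCCURS DEPENDING ON) or when the argument is itself a function result, such as FUNCTION LENGTH(FUNCTION TRIM(WS-DATA TRAILING)); those may require runtime computation on every pass.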

Minimize Reference Modification in Inner Loops

If you are processing millions of records and each record requires multiple reference modification operations, consider defining named subordinate fields for the most common extraction patterns (directly in the record description, or through a REDEFINES of the buffer), reserving reference modification for the truly dynamic cases.

*> Instead of this (many ref-mods per record):
    MOVE WS-RECORD(1:10) TO WS-ACCT
    MOVE WS-RECORD(11:8) TO WS-DATE
    MOVE WS-RECORD(19:12) TO WS-AMOUNT

*> Use this (named fields, no ref-mod at all):
01  WS-RECORD.
    05  WS-ACCT   PIC X(10).
    05  WS-DATE   PIC 9(8).
    05  WS-AMOUNT PIC 9(10)V99.

Reference modification is for cases where the field boundaries are not known at compile time. When they are known, use named fields — they are faster and more readable.

19.16 Chapter Summary

Reference modification and pointer techniques give you byte-level control over COBOL data, enabling capabilities that go far beyond standard MOVE and STRING operations:

- Reference modification, written identifier(start:length), extracts or overwrites specific bytes within any alphanumeric item. Both start and length can be arithmetic expressions, enabling fully dynamic data access.
- LENGTH OF and FUNCTION LENGTH return the byte size of data items, essential for safe boundary checking in reference modification operations.
- Dynamic substring operations (scanning for delimiters, extracting tokens, parsing variable-format records) are built by combining reference modification with position-tracking variables.
- The pointer build pattern (maintaining a position counter and appending pieces via reference modification) is the standard COBOL idiom for constructing dynamic output.
- ADDRESS OF and POINTER data items provide memory-level access for interfacing with non-COBOL programs and processing dynamically allocated buffers, but carry significant risk.
- STRING/UNSTRING WITH POINTER extends the built-in string operations with incremental processing capability.
- Defensive programming is critical: always validate position and length before reference modification, enable SSRANGE during development, and test boundary conditions exhaustively.

These techniques are essential for processing modern data interchange formats (EDI, delimited files, variable-format records) within COBOL programs, bridging the gap between COBOL's fixed-format heritage and the variable-format data that modern systems exchange.

⚖️ The Modernization Spectrum: Reference modification represents COBOL's pragmatic response to a changing data landscape. Where COBOL was designed for fixed-format records with known field positions, today's data is often variable-format, delimited, or structured as XML and JSON. Reference modification does not attempt to transform COBOL into a string-processing language like Python or Perl — instead, it provides just enough byte-level access to handle the variable-format cases that fixed-format structures cannot. This is the Modernization Spectrum in action: adapting without abandoning the language's core strengths of clarity, reliability, and decimal precision.

"Reference modification is where COBOL meets C. It gives you the same byte-level power — and the same responsibility. Use it wisely." — Maria Chen, Senior Developer, GlobalBank