Case Study 2: MedClaim Code Translation Tables
Background
MedClaim's claims processing system uses hundreds of code values: procedure codes (CPT), diagnosis codes (ICD-10 subsets), specialty codes, denial reason codes, state codes, plan type codes, and more. Originally, each code translation was hardcoded as an EVALUATE statement:
TRANSLATE-SPECIALTY-CODE.
EVALUATE PRV-SPECIALTY-CODE
WHEN 'CAR'
MOVE 'CARDIOLOGY' TO WS-SPEC-DESCRIPTION
WHEN 'DER'
MOVE 'DERMATOLOGY' TO WS-SPEC-DESCRIPTION
WHEN 'ENT'
MOVE 'EAR NOSE AND THROAT'
TO WS-SPEC-DESCRIPTION
WHEN 'GAS'
MOVE 'GASTROENTEROLOGY'
TO WS-SPEC-DESCRIPTION
* ... 150 more WHEN clauses ...
WHEN OTHER
MOVE 'UNKNOWN' TO WS-SPEC-DESCRIPTION
END-EVALUATE.
Sarah Kim quantified the problem: the claims adjudication program had 12 different EVALUATE blocks translating different code types, totaling over 2,800 WHEN clauses. "Every time a new code is added — which happens quarterly for procedure codes — we have to modify and recompile the program," she reported. "Last quarter, a missed recompile caused 3,000 claims to get 'UNKNOWN' descriptions on their explanation of benefits."
James Okafor proposed replacing the hardcoded EVALUATEs with data-driven translation tables stored in relative files.
Design
The Code Table Pattern
Each code type would have its own relative file. Since the codes are short (typically 2-5 characters), they can be mapped directly to relative record numbers using a simple conversion:
| Code Type | Code Length | Max Codes | RRDS Slots | Mapping Strategy |
|---|---|---|---|---|
| State codes | 2 alpha | 60 | 676 | (char1-'A')*26 + (char2-'A') + 1 |
| Specialty codes | 3 alpha | 200 | 17,576 | (c1*676) + (c2*26) + c3 + 1 |
| Plan type codes | 2 alphanumeric | 50 | 1,296 | Positional encoding |
| Denial reason codes | 5 alphanumeric | 300 | 397 | Hash (modulo prime 397) |
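The denial-reason table cannot use a positional encoding (36^5 slots for 5 alphanumeric characters would be far too many), so the design hashes into a prime-sized table instead. The case study does not specify the hash function; the sketch below is an illustrative Python model of one plausible scheme (a multiplicative hash modulo 397 with linear probing), not MedClaim's actual code:

```python
# Illustrative model only: the hash function and probing scheme are
# assumptions, since the case study names only the prime 397.
TABLE_SIZE = 397  # prime; slots numbered 1..397, like RRDS relative keys

def denial_hash(code: str) -> int:
    """Multiplicative hash of a 5-char denial code, mapped to slot 1..397."""
    h = 0
    for ch in code.upper().ljust(5):
        h = (h * 31 + ord(ch)) % TABLE_SIZE
    return h + 1  # RRDS relative record numbers start at 1

def store(table: dict, code: str, desc: str) -> int:
    """Insert with linear probing on collision; returns the slot used."""
    slot = denial_hash(code)
    while slot in table and table[slot][0] != code:
        slot = slot % TABLE_SIZE + 1  # wrap slot 397 back to 1
    table[slot] = (code, desc)
    return slot
```

With 300 codes in 397 slots the load factor is about 0.75, so some probing on insert and lookup is unavoidable — a trade-off the positional tables deliberately avoid.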
State Code Mapping Example
For two-letter state codes (AL, AK, AZ, ..., WY), the mapping converts each letter to a number (A=0, B=1, ..., Z=25) and computes:
RRN = (first_letter * 26) + second_letter + 1
'AL' → (0 * 26) + 11 + 1 = 12
'CA' → (2 * 26) + 0 + 1 = 53
'NY' → (13 * 26) + 24 + 1 = 363
'TX' → (19 * 26) + 23 + 1 = 518
This creates a sparse file (676 slots for ~60 actual states/territories), but the sparsity is acceptable because:
- The file is tiny (676 * 60 bytes = ~40 KB)
- Access is always by direct calculation — no hashing, no probing, no collisions
- The mapping is deterministic and reversible
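The mapping is simple enough to check outside COBOL. A minimal Python sketch of the same arithmetic (illustrative only; the production logic lives in the COBOL loader):

```python
def state_rrn(code: str) -> int:
    """Map a two-letter state code to an RRDS relative record number.
    A=0 ... Z=25; RRN = (first * 26) + second + 1, giving slots 1..676."""
    first = ord(code[0]) - ord('A')
    second = ord(code[1]) - ord('A')
    return first * 26 + second + 1

for code in ("AL", "CA", "NY", "TX"):
    print(code, state_rrn(code))  # prints 12, 53, 363, 518 respectively
```

Because the function is a pure base-26 positional encoding, it is also reversible: divmod(rrn - 1, 26) recovers the two letter positions.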
Implementation
Code Table Loader
A generic table loader program reads code/description pairs from a sequential extract and loads them into an RRDS:
IDENTIFICATION DIVISION.
PROGRAM-ID. CODE-TBL-LOAD.
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
SELECT CODE-INPUT-FILE
ASSIGN TO CODEINP
ORGANIZATION IS LINE SEQUENTIAL
FILE STATUS IS WS-INP-STATUS.
SELECT CODE-TABLE-FILE
ASSIGN TO CODETBL
ORGANIZATION IS RELATIVE
ACCESS MODE IS RANDOM
RELATIVE KEY IS WS-RELATIVE-KEY
FILE STATUS IS WS-TBL-STATUS.
DATA DIVISION.
FILE SECTION.
FD CODE-INPUT-FILE.
01 CODE-INPUT-RECORD.
05 CI-CODE-VALUE PIC X(05).
05 CI-DESCRIPTION PIC X(50).
05 CI-EFFECTIVE-DATE PIC 9(08).
05 CI-TERM-DATE PIC 9(08).
FD CODE-TABLE-FILE.
01 CODE-TABLE-RECORD.
05 CT-CODE-VALUE PIC X(05).
05 CT-DESCRIPTION PIC X(50).
05 CT-EFFECTIVE-DATE PIC 9(08).
05 CT-TERM-DATE PIC 9(08).
WORKING-STORAGE SECTION.
01 WS-INP-STATUS PIC XX.
01 WS-TBL-STATUS PIC XX.
88 WS-TBL-SUCCESS VALUE '00'.
88 WS-TBL-DUP-KEY VALUE '22'.
01 WS-RELATIVE-KEY PIC 9(07).
01 WS-RECORDS-LOADED PIC 9(05) VALUE ZERO.
01 WS-ALPHA-VALUES PIC X(36)
VALUE 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'.
01 WS-CHAR-POS-1 PIC 9(02).
01 WS-CHAR-POS-2 PIC 9(02).
01 WS-CHAR-POS-3 PIC 9(02).
01 WS-EOF PIC X VALUE 'N'.
88 WS-END-OF-FILE VALUE 'Y'.
PROCEDURE DIVISION.
0000-MAIN.
OPEN INPUT CODE-INPUT-FILE
* Create the RRDS empty, then reopen it I-O for random writes
OPEN OUTPUT CODE-TABLE-FILE
CLOSE CODE-TABLE-FILE
OPEN I-O CODE-TABLE-FILE
PERFORM UNTIL WS-END-OF-FILE
READ CODE-INPUT-FILE
AT END
SET WS-END-OF-FILE TO TRUE
NOT AT END
PERFORM 1000-COMPUTE-RRN
PERFORM 2000-WRITE-RECORD
END-READ
END-PERFORM
CLOSE CODE-INPUT-FILE
CODE-TABLE-FILE
DISPLAY 'Code table loaded: '
WS-RECORDS-LOADED ' records'
STOP RUN.
1000-COMPUTE-RRN.
* Convert first 2 characters to positional RRN
* Each position: A-Z = 0-25, 0-9 = 26-35
* INSPECT TALLYING accumulates, so zero the tallies
* before each record
MOVE ZERO TO WS-CHAR-POS-1
             WS-CHAR-POS-2
PERFORM 1100-GET-CHAR-POS-1
PERFORM 1200-GET-CHAR-POS-2
COMPUTE WS-RELATIVE-KEY =
(WS-CHAR-POS-1 * 36) + WS-CHAR-POS-2 + 1.
1100-GET-CHAR-POS-1.
INSPECT WS-ALPHA-VALUES
TALLYING WS-CHAR-POS-1
FOR CHARACTERS BEFORE INITIAL
CI-CODE-VALUE(1:1).
1200-GET-CHAR-POS-2.
INSPECT WS-ALPHA-VALUES
TALLYING WS-CHAR-POS-2
FOR CHARACTERS BEFORE INITIAL
CI-CODE-VALUE(2:1).
2000-WRITE-RECORD.
MOVE CI-CODE-VALUE TO CT-CODE-VALUE
MOVE CI-DESCRIPTION TO CT-DESCRIPTION
MOVE CI-EFFECTIVE-DATE TO CT-EFFECTIVE-DATE
MOVE CI-TERM-DATE TO CT-TERM-DATE
WRITE CODE-TABLE-RECORD
INVALID KEY
IF WS-TBL-DUP-KEY
DISPLAY 'Duplicate code: '
CI-CODE-VALUE
' at slot ' WS-RELATIVE-KEY
ELSE
DISPLAY 'Write error: ' WS-TBL-STATUS
END-IF
NOT INVALID KEY
ADD 1 TO WS-RECORDS-LOADED
END-WRITE.
Code Table Lookup Module
The lookup module was designed as a reusable called program:
IDENTIFICATION DIVISION.
PROGRAM-ID. CODE-LOOKUP.
DATA DIVISION.
LINKAGE SECTION.
01 LS-REQUEST.
05 LS-TABLE-NAME PIC X(08).
05 LS-CODE-VALUE PIC X(05).
05 LS-DESCRIPTION PIC X(50).
05 LS-RETURN-CODE PIC 9(02).
PROCEDURE DIVISION USING LS-REQUEST.
* Called by claims programs to translate codes
* Returns LS-DESCRIPTION and LS-RETURN-CODE
* RC=00: Found, RC=23: Not found, RC=99: Error
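The body of CODE-LOOKUP is elided above. Functionally, it computes the RRN from LS-CODE-VALUE, reads that slot, and verifies that the stored code matches before returning the description. A hedged Python model of that control flow for a two-letter code table (a dict stands in for the RRDS; return codes follow the comment block above):

```python
RC_FOUND, RC_NOT_FOUND, RC_ERROR = "00", "23", "99"

def code_lookup(table: dict, code: str):
    """Illustrative model of the CODE-LOOKUP flow, not the COBOL itself.
    `table` maps relative record numbers to (code, description) tuples."""
    if len(code) != 2 or not code.isalpha():
        return "", RC_ERROR                     # malformed request
    rrn = (ord(code[0]) - ord('A')) * 26 + (ord(code[1]) - ord('A')) + 1
    record = table.get(rrn)                     # a random READ in COBOL
    if record is None or record[0] != code:
        return "", RC_NOT_FOUND                 # mirrors file status 23
    return record[1], RC_FOUND

states = {12: ("AL", "ALABAMA"), 53: ("CA", "CALIFORNIA")}
```

Note the verification step: even with a deterministic mapping, comparing the stored code against the requested one guards against stale or mis-loaded slots.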
Results
Before vs. After
| Metric | Before (EVALUATE) | After (RRDS tables) |
|---|---|---|
| Code change deployment | Recompile + redeploy | Load new table data |
| Time to add a new code | 2-4 hours (dev + test) | 5 minutes (add to extract) |
| Lines of code for translations | 2,800+ WHEN clauses | 150 lines (generic lookup) |
| Risk of missed recompile | High (happened quarterly) | None (data-driven) |
| Runtime lookup performance | ~0 ms (in-memory EVALUATE) | ~2 ms (1 RRDS I/O) |
Performance Trade-off
The RRDS approach was slightly slower than in-memory EVALUATE (2 ms vs. essentially 0 ms per lookup). For 500,000 claims/month with an average of 8 code lookups per claim, this added:
4,000,000 lookups * 2 ms = 8,000 seconds = ~2.2 hours total
But: spread across 30 days of processing = ~4.4 minutes/day
Sarah Kim approved: "Four extra minutes per day is nothing compared to the quarterly panic of updating 2,800 WHEN clauses without breaking anything."
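The overhead arithmetic above can be spot-checked directly:

```python
# Monthly lookup volume and the resulting RRDS I/O overhead.
lookups = 500_000 * 8            # claims/month * avg lookups per claim
total_seconds = lookups * 2 / 1000   # ~2 ms per RRDS read
total_hours = total_seconds / 3600
per_day_minutes = total_seconds / 30 / 60  # spread over 30 days

print(round(total_hours, 1))     # prints 2.2
print(round(per_day_minutes, 1)) # prints 4.4
```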
Hybrid Optimization
For the highest-volume code tables (state codes, specialty codes), James Okafor later added an in-memory cache using COBOL tables (OCCURS with SEARCH ALL). The RRDS was read once at program startup to populate the table, and then all lookups were in-memory:
01 WS-STATE-TABLE.
05 WS-STATE-ENTRY OCCURS 60 TIMES
ASCENDING KEY IS WS-ST-CODE
INDEXED BY WS-ST-IDX.
10 WS-ST-CODE PIC X(02).
10 WS-ST-DESCRIPTION PIC X(30).
This gave the best of both worlds: data-driven (codes stored externally, loaded at startup) with in-memory performance (zero I/O per lookup during processing).
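The OCCURS/SEARCH ALL cache amounts to a binary search over a sorted in-memory table. An illustrative Python equivalent using the standard `bisect` module (table contents are sample data, not MedClaim's full state list):

```python
import bisect

# Sorted (code, description) pairs, loaded once from the RRDS at startup.
STATE_TABLE = [("AL", "ALABAMA"), ("CA", "CALIFORNIA"),
               ("NY", "NEW YORK"), ("TX", "TEXAS")]
CODES = [c for c, _ in STATE_TABLE]  # plays the role of ASCENDING KEY

def cached_lookup(code: str) -> str:
    """Binary search, like COBOL SEARCH ALL: zero I/O per lookup."""
    i = bisect.bisect_left(CODES, code)
    if i < len(CODES) and CODES[i] == code:
        return STATE_TABLE[i][1]
    return "UNKNOWN"
```

As with SEARCH ALL, correctness depends on the table being fully sorted on the key at load time; an unsorted entry silently breaks the search.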
Discussion Questions
1. The character-position mapping for state codes creates a sparse file (676 slots for 60 records). Under what circumstances would this sparsity become a problem? At what code length would a hash function be preferable?
2. Compare the RRDS translation table approach with a DB2 reference table. What are the advantages of each? When would you recommend migrating to DB2?
3. James's hybrid approach (load RRDS into COBOL table at startup) eliminates the runtime I/O overhead. What are the limitations? What if the code table has 50,000 entries?
4. The quarterly code update problem was caused by a process failure (missed recompile), not a technical limitation. Could the same problem have been solved by better change management procedures instead of a technical redesign?
5. MedClaim now has 12 RRDS code tables. What operational challenges does this create for the systems administration team? How would you organize the table refresh process?