Case Study 2: MedClaim Code Translation Tables
Background
MedClaim's claims processing system uses hundreds of code values: procedure codes (CPT), diagnosis codes (ICD-10 subsets), specialty codes, denial reason codes, state codes, plan type codes, and more. Originally, each code translation was hardcoded as an EVALUATE statement:
TRANSLATE-SPECIALTY-CODE.
EVALUATE PRV-SPECIALTY-CODE
WHEN 'CAR'
MOVE 'CARDIOLOGY' TO WS-SPEC-DESCRIPTION
WHEN 'DER'
MOVE 'DERMATOLOGY' TO WS-SPEC-DESCRIPTION
WHEN 'ENT'
MOVE 'EAR NOSE AND THROAT'
TO WS-SPEC-DESCRIPTION
WHEN 'GAS'
MOVE 'GASTROENTEROLOGY'
TO WS-SPEC-DESCRIPTION
* ... 150 more WHEN clauses ...
WHEN OTHER
MOVE 'UNKNOWN' TO WS-SPEC-DESCRIPTION
END-EVALUATE.
Sarah Kim quantified the problem: the claims adjudication program had 12 different EVALUATE blocks translating different code types, totaling over 2,800 WHEN clauses. "Every time a new code is added — which happens quarterly for procedure codes — we have to modify and recompile the program," she reported. "Last quarter, a missed recompile caused 3,000 claims to get 'UNKNOWN' descriptions on their explanation of benefits."
James Okafor proposed replacing the hardcoded EVALUATEs with data-driven translation tables stored in relative files.
Design
The Code Table Pattern
Each code type would have its own relative file. Since the codes are short (typically 2-5 characters), they can be mapped directly to relative record numbers using a simple conversion:
| Code Type | Code Length | Max Codes | RRDS Slots | Mapping Strategy |
|---|---|---|---|---|
| State codes | 2 alpha | 60 | 676 | (char1-'A')*26 + (char2-'A') + 1 |
| Specialty codes | 3 alpha | 200 | 17,576 | (c1*676) + (c2*26) + c3 + 1 |
| Plan type codes | 2 alphanumeric | 50 | 1,296 | Positional encoding |
| Denial reason codes | 5 alphanumeric | 300 | 397 | Hash (modulo prime 397) |
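The denial-reason table cannot use a positional encoding (36^5 slots for 5 alphanumeric characters would be far too many), so the design hashes into a prime-sized table instead. The case study does not specify the hash function; the sketch below is an illustrative Python model of one plausible scheme (a multiplicative hash modulo 397 with linear probing), not MedClaim's actual code:

```python
# Illustrative model only: the hash function and probing scheme are
# assumptions, since the case study names only the prime 397.
TABLE_SIZE = 397  # prime; slots numbered 1..397, like RRDS relative keys

def denial_hash(code: str) -> int:
    """Multiplicative hash of a 5-char denial code, mapped to slot 1..397."""
    h = 0
    for ch in code.upper().ljust(5):
        h = (h * 31 + ord(ch)) % TABLE_SIZE
    return h + 1  # RRDS relative record numbers start at 1

def store(table: dict, code: str, desc: str) -> int:
    """Insert with linear probing on collision; returns the slot used."""
    slot = denial_hash(code)
    while slot in table and table[slot][0] != code:
        slot = slot % TABLE_SIZE + 1  # wrap slot 397 back to 1
    table[slot] = (code, desc)
    return slot
```

With 300 codes in 397 slots the load factor is about 0.75, so some probing on insert and lookup is unavoidable — a trade-off the positional tables deliberately avoid.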
State Code Mapping Example
For two-letter state codes (AL, AK, AZ, ..., WY), the mapping converts each letter to a number (A=0, B=1, ..., Z=25) and computes:
RRN = (first_letter * 26) + second_letter + 1
'AL' → (0 * 26) + 11 + 1 = 12
'CA' → (2 * 26) + 0 + 1 = 53
'NY' → (13 * 26) + 24 + 1 = 363
'TX' → (19 * 26) + 23 + 1 = 518
This creates a sparse file (676 slots for ~60 actual states/territories), but the sparsity is acceptable because:
- The file is tiny (676 * 60 bytes = ~40 KB)
- Access is always by direct calculation — no hashing, no probing, no collisions
- The mapping is deterministic and reversible
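The mapping is simple enough to check outside COBOL. A minimal Python sketch of the same arithmetic (illustrative only; the production logic lives in the COBOL loader):

```python
def state_rrn(code: str) -> int:
    """Map a two-letter state code to an RRDS relative record number.
    A=0 ... Z=25; RRN = (first * 26) + second + 1, giving slots 1..676."""
    first = ord(code[0]) - ord('A')
    second = ord(code[1]) - ord('A')
    return first * 26 + second + 1

for code in ("AL", "CA", "NY", "TX"):
    print(code, state_rrn(code))  # prints 12, 53, 363, 518 respectively
```

Because the function is a pure base-26 positional encoding, it is also reversible: divmod(rrn - 1, 26) recovers the two letter positions.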
Implementation
Code Table Loader
A generic table loader program reads code/description pairs from a sequential extract and loads them into an RRDS:
IDENTIFICATION DIVISION.
PROGRAM-ID. CODE-TBL-LOAD.
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
SELECT CODE-INPUT-FILE
ASSIGN TO CODEINP
ORGANIZATION IS LINE SEQUENTIAL
FILE STATUS IS WS-INP-STATUS.
SELECT CODE-TABLE-FILE
ASSIGN TO CODETBL
ORGANIZATION IS RELATIVE
ACCESS MODE IS RANDOM
RELATIVE KEY IS WS-RELATIVE-KEY
FILE STATUS IS WS-TBL-STATUS.
DATA DIVISION.
FILE SECTION.
FD CODE-INPUT-FILE.
01 CODE-INPUT-RECORD.
05 CI-CODE-VALUE PIC X(05).
05 CI-DESCRIPTION PIC X(50).
05 CI-EFFECTIVE-DATE PIC 9(08).
05 CI-TERM-DATE PIC 9(08).
FD CODE-TABLE-FILE.
01 CODE-TABLE-RECORD.
05 CT-CODE-VALUE PIC X(05).
05 CT-DESCRIPTION PIC X(50).
05 CT-EFFECTIVE-DATE PIC 9(08).
05 CT-TERM-DATE PIC 9(08).
WORKING-STORAGE SECTION.
01 WS-INP-STATUS PIC XX.
01 WS-TBL-STATUS PIC XX.
88 WS-TBL-SUCCESS VALUE '00'.
88 WS-TBL-DUP-KEY VALUE '22'.
01 WS-RELATIVE-KEY PIC 9(07).
01 WS-RECORDS-LOADED PIC 9(05) VALUE ZERO.
01 WS-ALPHA-VALUES PIC X(36)
VALUE 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'.
01 WS-CHAR-POS-1 PIC 9(02).
01 WS-CHAR-POS-2 PIC 9(02).
01 WS-CHAR-POS-3 PIC 9(02).
01 WS-EOF PIC X VALUE 'N'.
88 WS-END-OF-FILE VALUE 'Y'.
PROCEDURE DIVISION.
0000-MAIN.
OPEN INPUT CODE-INPUT-FILE
* Create the RRDS empty, then reopen it I-O for random writes
OPEN OUTPUT CODE-TABLE-FILE
CLOSE CODE-TABLE-FILE
OPEN I-O CODE-TABLE-FILE
PERFORM UNTIL WS-END-OF-FILE
READ CODE-INPUT-FILE
AT END
SET WS-END-OF-FILE TO TRUE
NOT AT END
PERFORM 1000-COMPUTE-RRN
PERFORM 2000-WRITE-RECORD
END-READ
END-PERFORM
CLOSE CODE-INPUT-FILE
CODE-TABLE-FILE
DISPLAY 'Code table loaded: '
WS-RECORDS-LOADED ' records'
STOP RUN.
1000-COMPUTE-RRN.
* Convert first 2 characters to positional RRN
* Each position: A-Z = 0-25, 0-9 = 26-35
* INSPECT TALLYING accumulates, so zero the tallies
* before each record
MOVE ZERO TO WS-CHAR-POS-1
             WS-CHAR-POS-2
PERFORM 1100-GET-CHAR-POS-1
PERFORM 1200-GET-CHAR-POS-2
COMPUTE WS-RELATIVE-KEY =
(WS-CHAR-POS-1 * 36) + WS-CHAR-POS-2 + 1.
1100-GET-CHAR-POS-1.
INSPECT WS-ALPHA-VALUES
TALLYING WS-CHAR-POS-1
FOR CHARACTERS BEFORE INITIAL
CI-CODE-VALUE(1:1).
1200-GET-CHAR-POS-2.
INSPECT WS-ALPHA-VALUES
TALLYING WS-CHAR-POS-2
FOR CHARACTERS BEFORE INITIAL
CI-CODE-VALUE(2:1).
2000-WRITE-RECORD.
MOVE CI-CODE-VALUE TO CT-CODE-VALUE
MOVE CI-DESCRIPTION TO CT-DESCRIPTION
MOVE CI-EFFECTIVE-DATE TO CT-EFFECTIVE-DATE
MOVE CI-TERM-DATE TO CT-TERM-DATE
WRITE CODE-TABLE-RECORD
INVALID KEY
IF WS-TBL-DUP-KEY
DISPLAY 'Duplicate code: '
CI-CODE-VALUE
' at slot ' WS-RELATIVE-KEY
ELSE
DISPLAY 'Write error: ' WS-TBL-STATUS
END-IF
NOT INVALID KEY
ADD 1 TO WS-RECORDS-LOADED
END-WRITE.
Code Table Lookup Module
The lookup module was designed as a reusable called program:
IDENTIFICATION DIVISION.
PROGRAM-ID. CODE-LOOKUP.
DATA DIVISION.
LINKAGE SECTION.
01 LS-REQUEST.
05 LS-TABLE-NAME PIC X(08).
05 LS-CODE-VALUE PIC X(05).
05 LS-DESCRIPTION PIC X(50).
05 LS-RETURN-CODE PIC 9(02).
PROCEDURE DIVISION USING LS-REQUEST.
* Called by claims programs to translate codes
* Returns LS-DESCRIPTION and LS-RETURN-CODE
* RC=00: Found, RC=23: Not found, RC=99: Error
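The body of CODE-LOOKUP is elided above. Functionally, it computes the RRN from LS-CODE-VALUE, reads that slot, and verifies that the stored code matches before returning the description. A hedged Python model of that control flow for a two-letter code table (a dict stands in for the RRDS; return codes follow the comment block above):

```python
RC_FOUND, RC_NOT_FOUND, RC_ERROR = "00", "23", "99"

def code_lookup(table: dict, code: str):
    """Illustrative model of the CODE-LOOKUP flow, not the COBOL itself.
    `table` maps relative record numbers to (code, description) tuples."""
    if len(code) != 2 or not code.isalpha():
        return "", RC_ERROR                     # malformed request
    rrn = (ord(code[0]) - ord('A')) * 26 + (ord(code[1]) - ord('A')) + 1
    record = table.get(rrn)                     # a random READ in COBOL
    if record is None or record[0] != code:
        return "", RC_NOT_FOUND                 # mirrors file status 23
    return record[1], RC_FOUND

states = {12: ("AL", "ALABAMA"), 53: ("CA", "CALIFORNIA")}
```

Note the verification step: even with a deterministic mapping, comparing the stored code against the requested one guards against stale or mis-loaded slots.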
Results
Before vs. After
| Metric | Before (EVALUATE) | After (RRDS tables) |
|---|---|---|
| Code change deployment | Recompile + redeploy | Load new table data |
| Time to add a new code | 2-4 hours (dev + test) | 5 minutes (add to extract) |
| Lines of code for translations | 2,800+ WHEN clauses | 150 lines (generic lookup) |
| Risk of missed recompile | High (happened quarterly) | None (data-driven) |
| Runtime lookup performance | ~0 ms (in-memory EVALUATE) | ~2 ms (1 RRDS I/O) |
Performance Trade-off
The RRDS approach was slightly slower than in-memory EVALUATE (2 ms vs. essentially 0 ms per lookup). For 500,000 claims/month with an average of 8 code lookups per claim, this added:
4,000,000 lookups * 2 ms = 8,000 seconds = ~2.2 hours total
But: spread across 30 days of processing = ~4.4 minutes/day
Sarah Kim approved: "Four extra minutes per day is nothing compared to the quarterly panic of updating 2,800 WHEN clauses without breaking anything."
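The overhead arithmetic above can be spot-checked directly:

```python
# Monthly lookup volume and the resulting RRDS I/O overhead.
lookups = 500_000 * 8            # claims/month * avg lookups per claim
total_seconds = lookups * 2 / 1000   # ~2 ms per RRDS read
total_hours = total_seconds / 3600
per_day_minutes = total_seconds / 30 / 60  # spread over 30 days

print(round(total_hours, 1))     # prints 2.2
print(round(per_day_minutes, 1)) # prints 4.4
```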
Hybrid Optimization
For the highest-volume code tables (state codes, specialty codes), James Okafor later added an in-memory cache using COBOL tables (OCCURS with SEARCH ALL). The RRDS was read once at program startup to populate the table, and then all lookups were in-memory:
01 WS-STATE-TABLE.
05 WS-STATE-ENTRY OCCURS 60 TIMES
ASCENDING KEY IS WS-ST-CODE
INDEXED BY WS-ST-IDX.
10 WS-ST-CODE PIC X(02).
10 WS-ST-DESCRIPTION PIC X(30).
This gave the best of both worlds: data-driven (codes stored externally, loaded at startup) with in-memory performance (zero I/O per lookup during processing).
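The OCCURS/SEARCH ALL cache amounts to a binary search over a sorted in-memory table. An illustrative Python equivalent using the standard `bisect` module (table contents are sample data, not MedClaim's full state list):

```python
import bisect

# Sorted (code, description) pairs, loaded once from the RRDS at startup.
STATE_TABLE = [("AL", "ALABAMA"), ("CA", "CALIFORNIA"),
               ("NY", "NEW YORK"), ("TX", "TEXAS")]
CODES = [c for c, _ in STATE_TABLE]  # plays the role of ASCENDING KEY

def cached_lookup(code: str) -> str:
    """Binary search, like COBOL SEARCH ALL: zero I/O per lookup."""
    i = bisect.bisect_left(CODES, code)
    if i < len(CODES) and CODES[i] == code:
        return STATE_TABLE[i][1]
    return "UNKNOWN"
```

As with SEARCH ALL, correctness depends on the table being fully sorted on the key at load time; an unsorted entry silently breaks the search.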
Discussion Questions
1. The character-position mapping for state codes creates a sparse file (676 slots for 60 records). Under what circumstances would this sparsity become a problem? At what code length would a hash function be preferable?
2. Compare the RRDS translation table approach with a DB2 reference table. What are the advantages of each? When would you recommend migrating to DB2?
3. James's hybrid approach (load RRDS into COBOL table at startup) eliminates the runtime I/O overhead. What are the limitations? What if the code table has 50,000 entries?
4. The quarterly code update problem was caused by a process failure (missed recompile), not a technical limitation. Could the same problem have been solved by better change management procedures instead of a technical redesign?
5. MedClaim now has 12 RRDS code tables. What operational challenges does this create for the systems administration team? How would you organize the table refresh process?