Chapter 12: Indexed File Processing (VSAM KSDS) -- Key Takeaways
Chapter Summary
While sequential files are processed from beginning to end, many business applications require the ability to access specific records directly by a key value, such as an account number, employee ID, or product code. Indexed file processing, implemented on IBM mainframes through VSAM Key-Sequenced Data Sets (KSDS), provides this capability. This chapter covered how to define indexed files in COBOL, how to perform random and sequential access against them, and how to create and manage VSAM KSDS clusters using IDCAMS.
An indexed file maintains its records in key sequence and builds an index structure that enables direct access to any record by its primary key. In COBOL, indexed files are defined with ORGANIZATION IS INDEXED in the SELECT clause, along with RECORD KEY to specify the primary key field and optional ALTERNATE RECORD KEY for secondary access paths. The ACCESS MODE clause determines whether the program reads records sequentially, randomly by key, or dynamically (combining both modes in a single program). With these three access modes a program can scan the whole file in key order, look up individual records by key, or mix the two, for example by positioning to a key and then browsing forward.
The PROCEDURE DIVISION statements for indexed files extend beyond the READ and WRITE of sequential processing to include REWRITE (update an existing record in place), DELETE (remove a record by key), and START (position the file pointer for sequential reading from a specific key value). Each of these operations updates the FILE STATUS variable, and checking this status is even more critical with indexed files than with sequential files because operations such as writing a duplicate key or reading a nonexistent record are expected conditions rather than catastrophic errors. We also examined how VSAM KSDS clusters are defined outside the COBOL program using IDCAMS DEFINE CLUSTER, which establishes the physical dataset with its key position, key length, record size, and space allocation.
Key Concepts
- ORGANIZATION IS INDEXED in the SELECT clause tells the COBOL runtime that the file is an indexed file backed by a VSAM KSDS (or equivalent indexed file on non-mainframe platforms).
- RECORD KEY IS identifies the primary key field, which must be a field defined within the FD record description. Every record must have a unique primary key value.
- ALTERNATE RECORD KEY IS defines a secondary key that provides an additional access path to the data. The WITH DUPLICATES phrase allows multiple records to share the same alternate key value.
- ACCESS MODE IS SEQUENTIAL processes records in key sequence from beginning to end, operating similarly to sequential file access but in primary key order.
- ACCESS MODE IS RANDOM allows direct access to any record by moving the desired key value to the RECORD KEY field before performing a READ, REWRITE, or DELETE.
- ACCESS MODE IS DYNAMIC combines sequential and random access in a single program, enabling operations such as positioning to a specific key with START and then reading forward sequentially.
- READ file-name in random mode retrieves the record whose primary key matches the current value in the RECORD KEY field. The INVALID KEY clause handles the case where no matching record exists.
- READ file-name NEXT RECORD in dynamic mode reads the next record in key sequence, used after START or a previous READ NEXT to traverse the file sequentially from a given position.
- WRITE record-name adds a new record to the indexed file, inserting it in the correct key sequence position. The INVALID KEY clause handles duplicate key violations.
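- REWRITE record-name replaces an existing record with the contents of the record area. In sequential access mode, the last I/O statement executed for the file must have been a successful READ, and the primary key must not be changed. In random or dynamic mode, the record replaced is the one whose primary key matches the RECORD KEY field, so a prior READ is not strictly required, although reading first is the usual way to obtain the current contents.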
- DELETE file-name removes the record whose primary key matches the current RECORD KEY value. In sequential access mode, the record must have been previously read.
- START file-name positions the file pointer to a record matching a specified key condition, without actually reading the record. It is used with READ NEXT to begin sequential reading from a specific point.
- START supports relational conditions on the key: KEY IS EQUAL TO, KEY IS GREATER THAN, KEY IS NOT LESS THAN, and KEY IS GREATER THAN OR EQUAL TO. The last two are equivalent ways of writing the same test.
- FILE STATUS codes for indexed files include "00" (success), "02" (success, with a duplicate alternate key), "10" (end of file), "21" (sequence error, for example writing keys out of ascending order in sequential access mode), "22" (duplicate primary key on WRITE), and "23" (record not found on READ/START/DELETE).
- VSAM KSDS clusters are defined using IDCAMS DEFINE CLUSTER with parameters for dataset name (NAME), key length and offset (KEYS), average and maximum record size (RECORDSIZE), and space allocation (CYLINDERS/TRACKS/RECORDS).
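The Quick Reference reads only by primary key, so the alternate-key access path is worth a separate illustration. The following is a minimal sketch, assuming the CUSTOMER-FILE layout from the Quick Reference and a file that has already been loaded. The KEY IS phrase on a random READ makes CUST-LAST-NAME the key of reference, and later READ NEXT statements then walk the records in alternate-key order. The program name ALTKEYRD and the fields WS-SEARCH-NAME and WS-DONE-FLAG are illustrative names, not part of the chapter. On z/OS the alternate index and its path must also be defined and allocated, which is not shown here.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ALTKEYRD.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT CUSTOMER-FILE
               ASSIGN TO CUSTFILE
               ORGANIZATION IS INDEXED
               ACCESS MODE IS DYNAMIC
               RECORD KEY IS CUST-ID
               ALTERNATE RECORD KEY IS CUST-LAST-NAME
                   WITH DUPLICATES
               FILE STATUS IS WS-CUST-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  CUSTOMER-FILE.
       01  CUSTOMER-RECORD.
           05  CUST-ID             PIC X(10).
           05  CUST-LAST-NAME      PIC X(25).
           05  CUST-FIRST-NAME     PIC X(20).
           05  CUST-BALANCE        PIC S9(7)V99 COMP-3.
           05  CUST-STATUS-CODE    PIC X(01).
       WORKING-STORAGE SECTION.
       01  WS-CUST-STATUS          PIC X(02).
      * Illustrative search value and loop flag (assumptions).
       01  WS-SEARCH-NAME          PIC X(25) VALUE "Smith".
       01  WS-DONE-FLAG            PIC X(01) VALUE "N".
           88  WS-DONE             VALUE "Y".
       PROCEDURE DIVISION.
       MAIN-PARA.
           OPEN INPUT CUSTOMER-FILE
           IF WS-CUST-STATUS NOT = "00"
               DISPLAY "OPEN failed, status " WS-CUST-STATUS
               STOP RUN
           END-IF
      * Random READ on the alternate key: returns the first record
      * with this last name and makes CUST-LAST-NAME the key of
      * reference for the READ NEXT statements that follow.
           MOVE WS-SEARCH-NAME TO CUST-LAST-NAME
           READ CUSTOMER-FILE
               KEY IS CUST-LAST-NAME
               INVALID KEY
                   DISPLAY "No customer named " WS-SEARCH-NAME
                   SET WS-DONE TO TRUE
           END-READ
      * Walk the duplicates; stop at end of file or when the
      * alternate key value changes.
           PERFORM UNTIL WS-DONE
               DISPLAY CUST-ID " " CUST-FIRST-NAME
               READ CUSTOMER-FILE NEXT RECORD
                   AT END
                       SET WS-DONE TO TRUE
                   NOT AT END
                       IF CUST-LAST-NAME NOT = WS-SEARCH-NAME
                           SET WS-DONE TO TRUE
                       END-IF
               END-READ
           END-PERFORM
           CLOSE CUSTOMER-FILE
           STOP RUN.
```

A successful READ on an alternate key may return status "02" rather than "00" when more records with the same key value follow, which is why the loop relies on INVALID KEY and AT END instead of testing for "00".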
Common Pitfalls
- Not setting the RECORD KEY before random READ: The READ statement in random mode uses the current value of the RECORD KEY field to locate the record. If the key field contains spaces, zeros, or a stale value from a previous operation, the READ retrieves the wrong record or returns status "23".
- Attempting to change the primary key on REWRITE: REWRITE updates the record in place, and the primary key value must remain unchanged. In sequential access mode, a REWRITE with a modified primary key fails with status "21". In random or dynamic mode, the modified value is taken as the key of the record to replace, so the REWRITE either fails with status "23" or overwrites a different record. To change a record's key, DELETE the old record and WRITE a new one.
- Forgetting INVALID KEY or a FILE STATUS check: With indexed files, conditions such as "record not found" and "duplicate key" are normal business situations, not errors. Failing to handle them can cause the program to abend or to silently produce incorrect results.
- Using READ instead of READ NEXT in dynamic mode: In ACCESS MODE IS DYNAMIC, a plain READ performs a random read by key. To read sequentially after a START, you must use READ file-name NEXT RECORD. Omitting NEXT causes each READ to re-read the record matching the current key value.
- Not defining the VSAM cluster before running the program: Unlike sequential files, which can be created automatically by JCL, VSAM KSDS files must be pre-defined using IDCAMS DEFINE CLUSTER before a COBOL program can open them. Attempting to OPEN a nonexistent VSAM file produces a FILE STATUS error (typically "35").
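- Mismatch between COBOL key definition and IDCAMS key definition: The key length and offset specified in IDCAMS DEFINE CLUSTER must exactly match the length and position of the RECORD KEY field in the COBOL FD. KEYS takes the length first and then a zero-based offset, so KEYS(10 0) describes a 10-byte key starting at the first byte of the record. A mismatch typically makes the OPEN fail (usually with status "39", an attribute conflict); if it goes undetected, it can lead to abends or corrupted data.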
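- Opening an indexed file in OUTPUT mode: OPEN OUTPUT prepares an indexed file for initial loading and does not preserve existing records. On z/OS, OPEN OUTPUT against a KSDS that already contains records generally succeeds, and empties the file, only if the cluster was defined with REUSE; otherwise the OPEN fails. On other platforms it typically replaces the file. If the intent is to add records to an existing file, use OPEN I-O and WRITE, or OPEN EXTEND where supported.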
- Not handling FILE STATUS "02" for alternate keys: Status "02" means a record was successfully written or rewritten but has a duplicate value in an alternate key that allows duplicates. This is an informational status, not an error, but programs that check only for "00" may incorrectly treat it as a failure.
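Several of these pitfalls can be avoided with one consistent pattern. The following is a minimal sketch, assuming the CUSTOMER-FILE layout from the Quick Reference and an existing, loaded file. It sets the key before the random READ, classifies FILE STATUS with EVALUATE, treats "02" as success, and changes a primary key by writing a copy and deleting the original instead of using REWRITE. The program name CHGKEY and the fields WS-OLD-ID and WS-NEW-ID are illustrative, and writing the new record before deleting the old one is a choice made for this sketch, not something the chapter prescribes.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CHGKEY.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT CUSTOMER-FILE
               ASSIGN TO CUSTFILE
               ORGANIZATION IS INDEXED
               ACCESS MODE IS RANDOM
               RECORD KEY IS CUST-ID
               ALTERNATE RECORD KEY IS CUST-LAST-NAME
                   WITH DUPLICATES
               FILE STATUS IS WS-CUST-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  CUSTOMER-FILE.
       01  CUSTOMER-RECORD.
           05  CUST-ID             PIC X(10).
           05  CUST-LAST-NAME      PIC X(25).
           05  CUST-FIRST-NAME     PIC X(20).
           05  CUST-BALANCE        PIC S9(7)V99 COMP-3.
           05  CUST-STATUS-CODE    PIC X(01).
       WORKING-STORAGE SECTION.
       01  WS-CUST-STATUS          PIC X(02).
      * "02" is informational, so it counts as success here.
           88  CUST-OK             VALUE "00" "02".
           88  CUST-DUP-KEY        VALUE "22".
           88  CUST-NOT-FOUND      VALUE "23".
      * Illustrative key values (assumptions).
       01  WS-OLD-ID               PIC X(10) VALUE "CUST000123".
       01  WS-NEW-ID               PIC X(10) VALUE "CUST000124".
       PROCEDURE DIVISION.
       MAIN-PARA.
           OPEN I-O CUSTOMER-FILE
           IF WS-CUST-STATUS NOT = "00"
               DISPLAY "OPEN failed, status " WS-CUST-STATUS
               STOP RUN
           END-IF
      * Always set the key explicitly before a random READ.
           MOVE WS-OLD-ID TO CUST-ID
           READ CUSTOMER-FILE
           EVALUATE TRUE
               WHEN CUST-OK
                   PERFORM MOVE-TO-NEW-KEY
               WHEN CUST-NOT-FOUND
                   DISPLAY "Not found: " WS-OLD-ID
               WHEN OTHER
                   DISPLAY "READ error, status " WS-CUST-STATUS
           END-EVALUATE
           CLOSE CUSTOMER-FILE
           STOP RUN.
       MOVE-TO-NEW-KEY.
      * Write the copy under the new key first, so the original
      * record survives if the new key is already in use.
           MOVE WS-NEW-ID TO CUST-ID
           WRITE CUSTOMER-RECORD
           EVALUATE TRUE
               WHEN CUST-OK
                   MOVE WS-OLD-ID TO CUST-ID
                   DELETE CUSTOMER-FILE RECORD
                   IF NOT CUST-OK
                       DISPLAY "DELETE failed, status "
                           WS-CUST-STATUS
                   END-IF
               WHEN CUST-DUP-KEY
                   DISPLAY "New key already exists: " WS-NEW-ID
               WHEN OTHER
                   DISPLAY "WRITE error, status " WS-CUST-STATUS
           END-EVALUATE.
```

Because the FILE STATUS clause is present, the READ, WRITE, and DELETE statements can omit INVALID KEY and leave all classification to the EVALUATE that follows each one.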
Quick Reference
      * ENVIRONMENT DIVISION -- indexed file definition
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT CUSTOMER-FILE
               ASSIGN TO CUSTFILE
               ORGANIZATION IS INDEXED
               ACCESS MODE IS DYNAMIC
               RECORD KEY IS CUST-ID
               ALTERNATE RECORD KEY IS CUST-LAST-NAME
                   WITH DUPLICATES
               FILE STATUS IS WS-CUST-STATUS.
      * DATA DIVISION -- FD and record layout
       DATA DIVISION.
       FILE SECTION.
       FD  CUSTOMER-FILE.
       01  CUSTOMER-RECORD.
           05  CUST-ID             PIC X(10).
           05  CUST-LAST-NAME      PIC X(25).
           05  CUST-FIRST-NAME     PIC X(20).
           05  CUST-BALANCE        PIC S9(7)V99 COMP-3.
           05  CUST-STATUS-CODE    PIC X(01).
       WORKING-STORAGE SECTION.
       01  WS-CUST-STATUS          PIC X(02).
           88  CUST-SUCCESS        VALUE "00".
           88  CUST-DUP-ALT-KEY    VALUE "02".
           88  CUST-EOF            VALUE "10".
           88  CUST-DUP-KEY        VALUE "22".
           88  CUST-NOT-FOUND      VALUE "23".
      * Random READ by primary key
           MOVE "CUST000123" TO CUST-ID
           READ CUSTOMER-FILE
               INVALID KEY
                   DISPLAY "Customer not found"
               NOT INVALID KEY
                   DISPLAY "Found: " CUST-FIRST-NAME
                       " " CUST-LAST-NAME
           END-READ
      * WRITE a new record
           MOVE "CUST000999" TO CUST-ID
           MOVE "Smith" TO CUST-LAST-NAME
           MOVE "John" TO CUST-FIRST-NAME
           MOVE 1500.00 TO CUST-BALANCE
           MOVE "A" TO CUST-STATUS-CODE
           WRITE CUSTOMER-RECORD
               INVALID KEY
                   DISPLAY "Duplicate key: " CUST-ID
           END-WRITE
      * REWRITE (update) an existing record
           MOVE "CUST000123" TO CUST-ID
           READ CUSTOMER-FILE
               INVALID KEY DISPLAY "Not found"
           END-READ
           IF CUST-SUCCESS
               ADD 100.00 TO CUST-BALANCE
               REWRITE CUSTOMER-RECORD
                   INVALID KEY
                       DISPLAY "Rewrite error"
               END-REWRITE
           END-IF
      * DELETE a record
           MOVE "CUST000456" TO CUST-ID
           DELETE CUSTOMER-FILE
               INVALID KEY
                   DISPLAY "Delete failed - not found"
           END-DELETE
      * START and READ NEXT (sequential from a point)
           MOVE "CUST000500" TO CUST-ID
           START CUSTOMER-FILE
               KEY IS GREATER THAN OR EQUAL TO CUST-ID
               INVALID KEY
                   DISPLAY "No records from this key"
           END-START
           IF CUST-SUCCESS
               PERFORM UNTIL CUST-EOF
                   READ CUSTOMER-FILE NEXT RECORD
                       AT END SET CUST-EOF TO TRUE
                       NOT AT END
                           DISPLAY CUST-ID " "
                               CUST-LAST-NAME
                   END-READ
               END-PERFORM
           END-IF
      * IDCAMS DEFINE CLUSTER (JCL step)
      * KEYS(10 0) and RECORDSIZE(61 61) match the FD above: a 10-byte
      * key at offset 0 in a 61-byte fixed-length record.
      * //DEFVSAM  EXEC PGM=IDCAMS
      * //SYSPRINT DD SYSOUT=*
      * //SYSIN    DD *
      *   DEFINE CLUSTER (                          -
      *       NAME(PROD.CUSTOMER.KSDS)              -
      *       INDEXED                               -
      *       KEYS(10 0)                            -
      *       RECORDSIZE(61 61)                     -
      *       CYLINDERS(5 1)                        -
      *       FREESPACE(20 10)                      -
      *       SHAREOPTIONS(2 3)                     -
      *       )                                     -
      *     DATA  (NAME(PROD.CUSTOMER.KSDS.DATA))   -
      *     INDEX (NAME(PROD.CUSTOMER.KSDS.INDEX))
      * /*
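A newly defined cluster is empty, and OPEN OUTPUT is the mode intended for its initial load. The following is a minimal load sketch, assuming the CUSTOMER-FILE layout above. It opens the file for OUTPUT in sequential access mode and writes records in ascending primary-key order, so a key that arrives out of order shows up as status "21" and a repeated key as status "22". The program name LOADCUST, the WS-LOADED counter, and the three sample customers are hypothetical. On z/OS the alternate index would also have to be defined and built separately, which this sketch does not show.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LOADCUST.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT CUSTOMER-FILE
               ASSIGN TO CUSTFILE
               ORGANIZATION IS INDEXED
               ACCESS MODE IS SEQUENTIAL
               RECORD KEY IS CUST-ID
               ALTERNATE RECORD KEY IS CUST-LAST-NAME
                   WITH DUPLICATES
               FILE STATUS IS WS-CUST-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  CUSTOMER-FILE.
       01  CUSTOMER-RECORD.
           05  CUST-ID             PIC X(10).
           05  CUST-LAST-NAME      PIC X(25).
           05  CUST-FIRST-NAME     PIC X(20).
           05  CUST-BALANCE        PIC S9(7)V99 COMP-3.
           05  CUST-STATUS-CODE    PIC X(01).
       WORKING-STORAGE SECTION.
       01  WS-CUST-STATUS          PIC X(02).
           88  CUST-OK             VALUE "00" "02".
           88  CUST-SEQ-ERROR      VALUE "21".
           88  CUST-DUP-KEY        VALUE "22".
       01  WS-LOADED               PIC 9(04) VALUE ZERO.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * OPEN OUTPUT is for the initial load of an empty file.
           OPEN OUTPUT CUSTOMER-FILE
           IF WS-CUST-STATUS NOT = "00"
               DISPLAY "OPEN failed, status " WS-CUST-STATUS
               STOP RUN
           END-IF
      * Hypothetical sample rows, supplied in ascending CUST-ID order.
           MOVE "CUST000100" TO CUST-ID
           MOVE "Smith"      TO CUST-LAST-NAME
           MOVE "John"       TO CUST-FIRST-NAME
           PERFORM WRITE-CUSTOMER
           MOVE "CUST000200" TO CUST-ID
           MOVE "Jones"      TO CUST-LAST-NAME
           MOVE "Ann"        TO CUST-FIRST-NAME
           PERFORM WRITE-CUSTOMER
           MOVE "CUST000300" TO CUST-ID
           MOVE "Smith"      TO CUST-LAST-NAME
           MOVE "Mary"       TO CUST-FIRST-NAME
           PERFORM WRITE-CUSTOMER
           DISPLAY "Records loaded: " WS-LOADED
           CLOSE CUSTOMER-FILE
           STOP RUN.
       WRITE-CUSTOMER.
           MOVE ZERO TO CUST-BALANCE
           MOVE "A"  TO CUST-STATUS-CODE
           WRITE CUSTOMER-RECORD
           EVALUATE TRUE
      * "02" only reports a duplicate alternate key; the write worked.
               WHEN CUST-OK
                   ADD 1 TO WS-LOADED
               WHEN CUST-SEQ-ERROR
                   DISPLAY "Out of sequence: " CUST-ID
               WHEN CUST-DUP-KEY
                   DISPLAY "Duplicate key: " CUST-ID
               WHEN OTHER
                   DISPLAY "WRITE error, status " WS-CUST-STATUS
           END-EVALUATE.
```

In a real load the rows would come from a sorted sequential input file, not from MOVE statements. The sketch hard-codes them only so that it is self-contained.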
What's Next
Chapter 13 will explore relative file processing and additional VSAM file organizations, including RRDS (Relative Record Data Sets) and ESDS (Entry-Sequenced Data Sets). You will see how ORGANIZATION IS RELATIVE uses a numeric record number as the key, providing fixed-position access to records. The chapter also covers advanced file-handling topics such as file sharing, file locking, and strategies for handling concurrent access in online and batch environments, building on the indexed file foundations established in this chapter.