Appendix E: VSAM and File Organization Reference

Virtual Storage Access Method (VSAM) is the primary file access method for online and batch COBOL applications on IBM z/OS. While newer data stores — DB2, IMS, MQ — handle many workloads, VSAM remains the backbone of high-performance file-based processing. CICS applications use VSAM for control tables, configuration data, and transaction logs. Batch applications use VSAM for master files, work files, and inter-step communication. Understanding VSAM's architecture is not optional for the intermediate COBOL programmer — it is essential.

This appendix covers VSAM's internal architecture, the four data set types, complete IDCAMS syntax for common operations, performance tuning parameters, the full file status code table, alternate index management, and a production tuning checklist.


E.1 VSAM Architecture

E.1.1 Control Intervals (CIs)

The control interval is VSAM's fundamental unit of data transfer. When your COBOL program issues a READ, VSAM transfers an entire CI from disk to a buffer in memory. The CI contains:

+------------------------------------------------------------------+
| Record 1 | Record 2 | ... | Record n | Free Space | RDFs | CIDF |
+------------------------------------------------------------------+
  • Records: The actual data, stored sequentially within the CI.
  • Free space: Unused space within the CI, available for record insertion or expansion. Controlled by the FREESPACE parameter.
  • RDFs (Record Definition Fields): 3 bytes each, at the end of the CI (before the CIDF), describing record lengths. For a run of adjacent records of the same length, VSAM uses a pair of RDFs: one holds the record length and the other holds the number of consecutive records of that length.
  • CIDF (Control Interval Definition Field): 4 bytes at the very end of the CI, containing the offset and length of free space within the CI.

CI sizes: 512 bytes to 32,768 bytes, in multiples of 512 (up to 8,192) or multiples of 2,048 (above 8,192). VSAM selects a default CI size based on the DEFINE CLUSTER parameters, but you can override it.
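The size rule above can be checked mechanically. The following is a minimal sketch, not an IBM utility: the program name and WS-CI-SIZE sample value are made up for illustration, and it tests only the multiples-of-512 and multiples-of-2,048 rule stated in the text.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CISIZECK.
      * Sketch: checks a proposed CI size against the rule in
      * the text. WS-CI-SIZE is a hypothetical sample input.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-CI-SIZE          PIC 9(5) VALUE 12288.
       01  WS-REMAINDER        PIC 9(5).
       01  WS-VALID-FLAG       PIC X    VALUE 'N'.
           88  CI-SIZE-VALID            VALUE 'Y'.
       PROCEDURE DIVISION.
       0000-MAIN.
           EVALUATE TRUE
      * Outside the 512 to 32,768 range: leave the flag at 'N'.
               WHEN WS-CI-SIZE < 512 OR WS-CI-SIZE > 32768
                   CONTINUE
      * Up to 8,192 the size must be a multiple of 512.
               WHEN WS-CI-SIZE <= 8192
                   COMPUTE WS-REMAINDER =
                       FUNCTION MOD(WS-CI-SIZE, 512)
                   IF WS-REMAINDER = 0
                       SET CI-SIZE-VALID TO TRUE
                   END-IF
      * Above 8,192 the size must be a multiple of 2,048.
               WHEN OTHER
                   COMPUTE WS-REMAINDER =
                       FUNCTION MOD(WS-CI-SIZE, 2048)
                   IF WS-REMAINDER = 0
                       SET CI-SIZE-VALID TO TRUE
                   END-IF
           END-EVALUATE
           IF CI-SIZE-VALID
               DISPLAY 'CI SIZE ' WS-CI-SIZE ' IS VALID'
           ELSE
               DISPLAY 'CI SIZE ' WS-CI-SIZE ' IS NOT VALID'
           END-IF
           STOP RUN.
```

A check like this is mainly useful in a job-generation tool or a design review script; IDCAMS itself adjusts an invalid CONTROLINTERVALSIZE value when the cluster is defined.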

Why CI size matters: Larger CIs transfer more data per I/O operation, which benefits sequential processing. Smaller CIs waste less buffer space when accessing individual records randomly, reducing memory pressure. The optimal CI size depends on your access pattern.

E.1.2 Control Areas (CAs)

A control area is a group of CIs that VSAM manages as a unit for space allocation and free-space distribution. VSAM allocates space in whole CAs: when a data set needs more space, it extends by the secondary allocation amount, which is always a whole number of CAs.

Control Area:
+-------+-------+-------+-------+- - -+-------+---------------+
| CI 1  | CI 2  | CI 3  | CI 4  | ... | CI n  | Free CIs      |
+-------+-------+-------+-------+- - -+-------+---------------+
  • The number of CIs per CA is determined by the CI size and the allocation unit (tracks or cylinders).
  • Free CIs within a CA are reserved by the FREESPACE CA percentage.
  • CA splits occur when a CI split requires space that is not available within the current CA.

E.1.3 The Index Component

For KSDS data sets, VSAM maintains a separate index component organized as a B+ tree:

                      Index Set
                   /      |      \
            Sequence  Sequence  Sequence
            Set       Set       Set
             /  \      /  \      /  \
           CI   CI   CI   CI   CI   CI
           (data records)
  • Sequence set: The lowest level of the index. Each sequence set record points to one CA and lists the highest key in each CI within that CA. The sequence set is horizontally chained for sequential access.
  • Index set: Higher levels of the B+ tree. Each index set record points to a range of sequence set records. Multi-level indexes exist only for very large data sets.

Index lookup: For a random READ by key, VSAM traverses the index from the top of the index set down to the sequence set, then reads the target CI from the data component. For a data set with a 3-level index (rare — typically 2 levels suffice for millions of records), this requires 3 index I/Os plus 1 data I/O. With buffering (LSR pool or NSR buffers), index levels are typically cached in memory, reducing this to 1 data I/O.

E.1.4 Catalogs

VSAM data sets are registered in ICF catalogs (Integrated Catalog Facility). The catalog records the data set's attributes, volume location, and component names. The LISTCAT command displays catalog information.

Every z/OS system has a master catalog and typically multiple user catalogs. The high-level qualifier of the data set name determines which user catalog is searched. VSAM data sets must be cataloged — there is no concept of an uncataloged VSAM file.


E.2 VSAM Data Set Types

E.2.1 Comparison Table

| Feature | KSDS | ESDS | RRDS | LDS |
|---|---|---|---|---|
| Full name | Key-Sequenced | Entry-Sequenced | Relative Record | Linear |
| Record access by | Primary key, alternate key | RBA (Relative Byte Address) | Relative record number | Direct memory mapping |
| Random access | Yes (by key) | By RBA only | Yes (by slot number) | Memory-mapped pages |
| Sequential access | Yes (key order) | Yes (insertion order) | Yes (slot order) | Yes (byte stream) |
| Records can be deleted | Yes | No (logically only) | Yes (slot becomes empty) | N/A |
| Records can be updated | Yes (same or different length) | Yes (same length only) | Yes (same length) | N/A |
| Variable-length records | Yes | Yes | No (fixed only) | N/A |
| Spanned records | Yes | Yes | No | N/A |
| Index component | Yes | No | No | No |
| Typical use | Master files, reference tables | Logs, journals, audit trails | Hash-accessible tables | DB2 tablespaces, memory-mapped files |

E.2.2 KSDS — Key-Sequenced Data Set

Architecture: Data component + index component. Records are stored in key sequence within CIs. The index provides rapid key-based access.

Strengths:
  • Random access by primary key: O(log n) via B+ tree.
  • Sequential access in key order, efficient for batch processing.
  • Alternate indexes for access by secondary keys.
  • Variable-length records supported.
  • Records can be inserted, updated, and deleted.

Weaknesses:
  • CI and CA splits degrade performance over time (see E.5).
  • More complex to define and tune than sequential files.
  • Index maintenance adds overhead to insert/update/delete operations.

COBOL SELECT:

SELECT CUSTOMER-FILE
    ASSIGN TO CUSTFILE
    ORGANIZATION IS INDEXED
    ACCESS MODE IS DYNAMIC
    RECORD KEY IS CUST-KEY
    ALTERNATE RECORD KEY IS CUST-NAME WITH DUPLICATES
    FILE STATUS IS WS-CUST-STATUS.

Best for: Any data that requires both random and sequential access by a unique key. Customer masters, product catalogs, account files, reference tables.

E.2.3 ESDS — Entry-Sequenced Data Set

Architecture: Data component only (no index). Records are appended in the order they are written and cannot be physically deleted. Each record is identified by its Relative Byte Address (RBA) — its byte offset from the beginning of the data set.

Strengths:
  • Simple and fast for sequential writing (append only).
  • No index overhead.
  • Alternate indexes can be defined over ESDS.
  • Records can be updated in place (same length only).

Weaknesses:
  • No random access by key (only by RBA, which is not a logical identifier).
  • Records cannot be deleted (only logically flagged).
  • Data set grows monotonically and must be periodically reorganized.

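COBOL SELECT (Enterprise COBOL identifies a VSAM ESDS by the AS- prefix on the assignment name; the DD name in the JCL is still AUDITLOG):

SELECT AUDIT-LOG
    ASSIGN TO AS-AUDITLOG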
    ORGANIZATION IS SEQUENTIAL
    ACCESS MODE IS SEQUENTIAL
    FILE STATUS IS WS-AUDIT-STATUS.

Best for: Audit trails, transaction logs, journals — any append-only workload where records are written once and read sequentially.
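The append-only pattern can be sketched as a small program. This is an illustration under stated assumptions: the 80-byte record layout, the program name, and the return code value are hypothetical; the AS- prefix is the Enterprise COBOL convention for a VSAM ESDS; and the data set is assumed to have been loaded at least once, so that OPEN EXTEND can position at the end.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. AUDITAPP.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT AUDIT-LOG
               ASSIGN TO AS-AUDITLOG
               ORGANIZATION IS SEQUENTIAL
               ACCESS MODE IS SEQUENTIAL
               FILE STATUS IS WS-AUDIT-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  AUDIT-LOG.
      * Hypothetical 80-byte audit record layout.
       01  AUDIT-RECORD.
           05  AUDIT-TIMESTAMP     PIC X(21).
           05  AUDIT-TEXT          PIC X(59).
       WORKING-STORAGE SECTION.
       01  WS-AUDIT-STATUS         PIC XX.
       PROCEDURE DIVISION.
       0000-MAIN.
      * EXTEND positions at the end, so each WRITE appends.
           OPEN EXTEND AUDIT-LOG
           IF WS-AUDIT-STATUS NOT = '00'
               DISPLAY 'OPEN FAILED, STATUS: ' WS-AUDIT-STATUS
               MOVE 12 TO RETURN-CODE
               STOP RUN
           END-IF
      * CURRENT-DATE returns 21 characters (date, time, offset).
           MOVE FUNCTION CURRENT-DATE TO AUDIT-TIMESTAMP
           MOVE 'SAMPLE AUDIT ENTRY' TO AUDIT-TEXT
           WRITE AUDIT-RECORD
           IF WS-AUDIT-STATUS NOT = '00'
               DISPLAY 'WRITE FAILED, STATUS: ' WS-AUDIT-STATUS
               MOVE 12 TO RETURN-CODE
           END-IF
           CLOSE AUDIT-LOG
           STOP RUN.
```

Because the program never issues REWRITE or DELETE, it matches the write-once, read-sequentially workload described above.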

E.2.4 RRDS — Relative Record Data Set

Architecture: Data component only, organized as fixed-size slots numbered from 1 to n. Each slot either contains a record or is empty.

Strengths:
  • O(1) random access by record number, the fastest VSAM access.
  • No index overhead.
  • Simple to understand and manage.
  • Records can be inserted, updated, and deleted.

Weaknesses:
  • Fixed-length records only.
  • Slots must be pre-allocated: the data set has a fixed maximum capacity.
  • Empty slots waste space if the data set is sparse.
  • No access by logical key (only by slot number).

COBOL SELECT:

SELECT HASH-TABLE
    ASSIGN TO HASHTBL
    ORGANIZATION IS RELATIVE
    ACCESS MODE IS RANDOM
    RELATIVE KEY IS WS-SLOT-NUMBER
    FILE STATUS IS WS-HASH-STATUS.

Best for: Lookup tables where the record number has logical meaning (e.g., state codes 01-50, product category codes), hash tables, temporary work files with known capacity.
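A random read by slot number can be sketched as follows. The 80-byte record, the slot number 36, and the program name are assumptions chosen for illustration; the SELECT matches the one shown above.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. RRDSREAD.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT HASH-TABLE
               ASSIGN TO HASHTBL
               ORGANIZATION IS RELATIVE
               ACCESS MODE IS RANDOM
               RELATIVE KEY IS WS-SLOT-NUMBER
               FILE STATUS IS WS-HASH-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  HASH-TABLE.
      * Hypothetical fixed-length 80-byte record.
       01  HASH-RECORD             PIC X(80).
       WORKING-STORAGE SECTION.
      * The relative key must live outside the record area.
       01  WS-SLOT-NUMBER          PIC 9(8) COMP.
       01  WS-HASH-STATUS          PIC XX.
       PROCEDURE DIVISION.
       0000-MAIN.
           OPEN INPUT HASH-TABLE
           IF WS-HASH-STATUS NOT = '00'
               DISPLAY 'OPEN FAILED, STATUS: ' WS-HASH-STATUS
               MOVE 12 TO RETURN-CODE
               STOP RUN
           END-IF
      * Set the slot number, then READ fetches that slot.
           MOVE 36 TO WS-SLOT-NUMBER
           READ HASH-TABLE
           EVALUATE WS-HASH-STATUS
               WHEN '00'
                   DISPLAY 'SLOT 36: ' HASH-RECORD
               WHEN '23'
                   DISPLAY 'SLOT 36 IS EMPTY'
               WHEN OTHER
                   DISPLAY 'READ FAILED, STATUS: ' WS-HASH-STATUS
           END-EVALUATE
           CLOSE HASH-TABLE
           STOP RUN.
```

An empty slot is reported as status 23 (record not found), so the program treats it as a normal outcome and not as an error.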

E.2.5 LDS — Linear Data Set

Architecture: A byte-stream data set with no record structure. VSAM treats it as a sequence of 4,096-byte pages that can be memory-mapped via DIV (Data-in-Virtual) or window services.

Strengths:
  • Memory-mapped access, extremely fast for programs that can work with raw pages.
  • Used internally by DB2 for tablespaces and index spaces.
  • No CI/CA structure overhead.

Weaknesses:
  • No record-level access: the program must manage its own record layout within pages.
  • Cannot be accessed with standard COBOL file I/O.
  • Specialized use cases only.

Best for: DB2 tablespaces, shared memory regions, data spaces. Not typically used directly by COBOL application programs.


E.3 IDCAMS DEFINE CLUSTER — Complete Syntax

E.3.1 KSDS Definition

DEFINE CLUSTER (                               -
    NAME(hlq.data.set.name)                    -
    INDEXED                                    -
    KEYS(length offset)                        -
    RECORDSIZE(average maximum)                -
    SHAREOPTIONS(crossregion crosssystem)       -
    FREESPACE(ci-percent ca-percent)            -
    SPEED | RECOVERY                            -
    REUSE | NOREUSE                             -
    SPANNED | NONSPANNED                        -
    ERASE | NOERASE                             -
    WRITECHECK | NOWRITECHECK                   -
    IMBED | NOIMBED                             -
    REPLICATE | NOREPLICATE                     -
  )                                            -
  DATA (                                       -
    NAME(hlq.data.set.name.DATA)               -
    CYLINDERS(primary secondary)               -
    | TRACKS(primary secondary)                -
    | RECORDS(primary secondary)               -
    | KILOBYTES(primary secondary)             -
    | MEGABYTES(primary secondary)             -
    CONTROLINTERVALSIZE(bytes)                 -
    BUFFERSPACE(bytes)                         -
  )                                            -
  INDEX (                                      -
    NAME(hlq.data.set.name.INDEX)              -
    CYLINDERS(primary secondary)               -
    CONTROLINTERVALSIZE(bytes)                 -
  )                                            -
  CATALOG(catalog-name)

E.3.2 ESDS Definition

DEFINE CLUSTER (                               -
    NAME(hlq.esds.name)                        -
    NONINDEXED                                 -
    RECORDSIZE(average maximum)                -
    SHAREOPTIONS(2 3)                          -
  )                                            -
  DATA (                                       -
    NAME(hlq.esds.name.DATA)                   -
    CYLINDERS(primary secondary)               -
    CONTROLINTERVALSIZE(bytes)                 -
  )

E.3.3 RRDS Definition

DEFINE CLUSTER (                               -
    NAME(hlq.rrds.name)                        -
    NUMBERED                                   -
    RECORDSIZE(length length)                  -
    SHAREOPTIONS(2 3)                          -
  )                                            -
  DATA (                                       -
    NAME(hlq.rrds.name.DATA)                   -
    RECORDS(primary secondary)                 -
  )

Note: RRDS RECORDSIZE must specify the same value for both average and maximum (fixed-length only).

E.3.4 Key Parameters Explained

KEYS(length offset):
  • length: Number of bytes in the primary key (1 to 255).
  • offset: Byte position of the key within the record (0-based). For a key starting at position 1 (COBOL-style), use offset 0.

RECORDSIZE(average maximum):
  • For fixed-length records, specify the same value for both.
  • For variable-length records, average is used for space calculation and maximum sets the upper limit.

SHAREOPTIONS(crossregion crosssystem):

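| Value | Cross-Region | Cross-System |
|---|---|---|
| 1 | One writer OR multiple readers | Not valid (only 3 and 4 are accepted) |
| 2 | One writer AND multiple readers | Not valid (only 3 and 4 are accepted) |
| 3 | Multiple writers and readers (no VSAM integrity) | Same |
| 4 | Multiple writers and readers (buffer refresh) | Same |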
  • SHAREOPTIONS(2 3) is the most common choice for CICS files: one CICS region writes while others read, with cross-system sharing requiring application-level integrity.
  • SHAREOPTIONS(1) provides the strongest integrity but the least concurrency.
  • Never use SHAREOPTIONS(3) or (4) without application-level record locking — VSAM does not guarantee buffer coherency.

FREESPACE(ci-percent ca-percent):
  • ci-percent: Percentage of each CI to leave free during initial load or reorganization. Free space within CIs allows records to be inserted without splitting.
  • ca-percent: Percentage of CIs within each CA to leave empty. Free CIs accommodate CI splits without triggering CA splits.
  • Typical values: FREESPACE(20 10) for moderate insert activity, FREESPACE(40 20) for heavy insert activity, FREESPACE(0 0) for read-only reference files.

SPEED vs. RECOVERY:
  • SPEED: Skips preformatting during initial load. Faster loading, but if the load fails mid-stream, the data set is unusable and the load must start over.
  • RECOVERY: Preformats each control area before records are loaded into it. Slower, but a failed load can be restarted.


E.4 File Status Codes — Complete Reference

Every COBOL file I/O operation sets the two-byte file status field. The first byte indicates the category; the second provides specificity.

E.4.1 Status Code Table

| Code | Category | Meaning | Recommended Action |
|---|---|---|---|
| 00 | Success | Operation completed successfully | Continue normal processing |
| 02 | Success | Read: duplicate key exists for alternate key. Write: record written, alternate key is duplicate | Continue; be aware of duplicates |
| 04 | Success | Record length mismatch (read a record shorter or longer than expected) | Verify record layout; may need variable-length handling |
| 05 | Success | OPEN on OPTIONAL file: file not present, treated as empty | Continue; file will be created on first WRITE |
| 07 | Success | Non-reel/unit CLOSE with NO REWIND or REEL/UNIT for non-tape | Continue; informational only |
| 10 | End of file | No more records (sequential READ past end) | Normal condition; end processing loop |
| 14 | End of file | Relative file: READ of record beyond file boundary | Stop sequential READ; may need to extend file |
| 21 | Invalid key | Sequence error: key of record being written is not in ascending sequence (sequential access on KSDS) | Fix program logic; records must be in key order for sequential WRITE |
| 22 | Invalid key | Duplicate primary key on WRITE, or duplicate alternate key on WRITE when alternates do not allow duplicates | Check for pre-existing record; handle duplicate key business logic |
| 23 | Invalid key | Record not found on READ, START, or DELETE | Handle "not found" condition; may be normal business logic |
| 24 | Invalid key | Boundary violation: WRITE beyond file boundary (RRDS: slot > maximum, or disk space exhausted) | Extend file allocation or handle as error |
| 30 | Permanent error | Non-recoverable I/O error | Log error details, abend or skip record with error handling |
| 34 | Permanent error | Boundary violation on sequential WRITE: disk full | Extend allocation; check SPACE parameters |
| 35 | Permanent error | OPEN failed: file does not exist and OPTIONAL not specified | Verify data set name and catalog entry |
| 37 | Permanent error | OPEN mode conflict: file does not support the requested mode | Check file organization vs. OPEN mode (e.g., OPEN I-O on sequential) |
| 38 | Permanent error | OPEN failed: file previously locked with CLOSE WITH LOCK | Open in a new run unit or use a different file |
| 39 | Permanent error | Attribute conflict: actual file attributes do not match program's definition | Verify RECFM, LRECL, KEY length/position vs. FD and SELECT |
| 41 | Logic error | OPEN on a file that is already open | Check program flow; ensure no duplicate OPEN |
| 42 | Logic error | CLOSE on a file that is not open | Check program flow; ensure file was opened |
| 43 | Logic error | DELETE or REWRITE without a prior successful READ (for sequential access) | Issue READ before DELETE/REWRITE |
| 44 | Logic error | REWRITE with different record length on fixed-length file, or boundary violation | Check record length; fixed-length REWRITE must use same size |
| 46 | Logic error | Sequential READ on a file not positioned (no prior READ succeeded, or positioned beyond end) | Issue START to reposition, or verify prior READ succeeded |
| 47 | Logic error | READ on file not opened INPUT or I-O | Check OPEN statement mode |
| 48 | Logic error | WRITE on file not opened OUTPUT, I-O, or EXTEND | Check OPEN statement mode |
| 49 | Logic error | DELETE or REWRITE on file not opened I-O | Change OPEN mode to I-O |

E.4.2 VSAM-Specific Extended Status (Return Code + Function Code + Feedback Code)

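When a VSAM request fails, most usefully when the basic file status is 9x (90-99), the extended file status provides additional VSAM-specific information. In Enterprise COBOL, name a second data item in the FILE STATUS clause (FILE STATUS IS WS-FILE-STATUS WS-VSAM-CODE) and define it as a 6-byte group of three binary halfwords:

05 WS-FILE-STATUS.
   10 WS-STATUS-1      PIC X.
   10 WS-STATUS-2      PIC X.
05 WS-VSAM-CODE.
   10 WS-VSAM-RETURN-CODE   PIC 9(2) COMP.
   10 WS-VSAM-FUNCTION-CODE PIC 9(1) COMP.
   10 WS-VSAM-FEEDBACK-CODE PIC 9(3) COMP.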

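Common VSAM feedback codes for record-level (logical) errors, reported with return code 8. OPEN and CLOSE failures use a separate set of codes, so always interpret the feedback code together with the return code:

| Feedback | Meaning |
|---|---|
| 004 | End of data set reached during sequential retrieval |
| 008 | Duplicate key |
| 012 | Key sequence error |
| 016 | Record not found |
| 020 | Exclusive control conflict (record or CI held by another request) |
| 028 | Out of space (data set cannot be extended) |
| 040 | Not enough virtual storage |
| 092 | Update or erase requested without a preceding read for update |
| 096 | Attempt to change the primary key during an update |
| 108 | Invalid record length |
| 116 | Request not allowed while the data set is being loaded |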
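The extended status fields are only filled in when the FILE STATUS clause names a second data item. The following sketch assumes IBM Enterprise COBOL (the two-name FILE STATUS form is an IBM extension); the record layout, program name, and return code value are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. VSAMSTAT.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT CUSTOMER-FILE
               ASSIGN TO CUSTFILE
               ORGANIZATION IS INDEXED
               ACCESS MODE IS RANDOM
               RECORD KEY IS CUST-KEY
               FILE STATUS IS WS-FILE-STATUS WS-VSAM-CODE.
       DATA DIVISION.
       FILE SECTION.
       FD  CUSTOMER-FILE.
      * Hypothetical layout: 10-byte key plus 70 bytes of data.
       01  CUSTOMER-RECORD.
           05  CUST-KEY              PIC X(10).
           05  CUST-DATA             PIC X(70).
       WORKING-STORAGE SECTION.
       01  WS-FILE-STATUS            PIC XX.
      * Six bytes: three binary halfwords set by VSAM.
       01  WS-VSAM-CODE.
           05  WS-VSAM-RETURN-CODE   PIC 9(2) COMP.
           05  WS-VSAM-FUNCTION-CODE PIC 9(1) COMP.
           05  WS-VSAM-FEEDBACK-CODE PIC 9(3) COMP.
       PROCEDURE DIVISION.
       0000-MAIN.
           OPEN INPUT CUSTOMER-FILE
           IF WS-FILE-STATUS NOT = '00'
      * Report all four values so the failure can be looked up.
               DISPLAY 'OPEN STATUS:   ' WS-FILE-STATUS
               DISPLAY 'VSAM RETURN:   ' WS-VSAM-RETURN-CODE
               DISPLAY 'VSAM FUNCTION: ' WS-VSAM-FUNCTION-CODE
               DISPLAY 'VSAM FEEDBACK: ' WS-VSAM-FEEDBACK-CODE
               MOVE 16 TO RETURN-CODE
               STOP RUN
           END-IF
           CLOSE CUSTOMER-FILE
           STOP RUN.
```

Displaying the return, function, and feedback codes together matters because the same feedback number can mean different things for OPEN, CLOSE, and record-level requests.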

E.4.3 File Status Checking Pattern

PERFORM 2100-READ-CUSTOMER
EVALUATE WS-CUST-STATUS
    WHEN '00'
    WHEN '02'
        CONTINUE
    WHEN '23'
        PERFORM 2110-CUSTOMER-NOT-FOUND
    WHEN '10'
        SET WS-END-OF-FILE TO TRUE
    WHEN OTHER
        DISPLAY 'UNEXPECTED FILE STATUS: ' WS-CUST-STATUS
        DISPLAY 'ON FILE: CUSTOMER-FILE'
        DISPLAY 'OPERATION: READ'
        PERFORM 9000-FILE-ERROR-ABEND
END-EVALUATE

Best practice: Check file status after every I/O operation. Build a standard error-handling paragraph that displays the file name, operation, and status code, then calls a centralized abend routine.
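One way to structure such a standard paragraph is sketched below. Everything specific here is an assumption for illustration: the record layout, the sample key, the WS-ERR-FILE and WS-ERR-OPERATION fields, and the choice of ending with return code 16 where a site abend routine would normally be called.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CUSTREAD.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT CUSTOMER-FILE
               ASSIGN TO CUSTFILE
               ORGANIZATION IS INDEXED
               ACCESS MODE IS RANDOM
               RECORD KEY IS CUST-KEY
               FILE STATUS IS WS-CUST-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  CUSTOMER-FILE.
      * Hypothetical layout: 10-byte key plus 70 bytes of data.
       01  CUSTOMER-RECORD.
           05  CUST-KEY            PIC X(10).
           05  CUST-DATA           PIC X(70).
       WORKING-STORAGE SECTION.
       01  WS-CUST-STATUS          PIC XX.
       01  WS-ERR-FILE             PIC X(30).
       01  WS-ERR-OPERATION        PIC X(10).
       PROCEDURE DIVISION.
       0000-MAIN.
      * Record the context before each I/O statement.
           MOVE 'CUSTOMER-FILE' TO WS-ERR-FILE
           MOVE 'OPEN' TO WS-ERR-OPERATION
           OPEN INPUT CUSTOMER-FILE
           IF WS-CUST-STATUS NOT = '00'
               PERFORM 9000-FILE-ERROR-ABEND
           END-IF
           MOVE 'READ' TO WS-ERR-OPERATION
           MOVE 'C000000001' TO CUST-KEY
           READ CUSTOMER-FILE
           EVALUATE WS-CUST-STATUS
               WHEN '00'
                   DISPLAY 'FOUND: ' CUST-DATA
               WHEN '23'
                   DISPLAY 'CUSTOMER NOT FOUND'
               WHEN OTHER
                   PERFORM 9000-FILE-ERROR-ABEND
           END-EVALUATE
           MOVE 'CLOSE' TO WS-ERR-OPERATION
           CLOSE CUSTOMER-FILE
           IF WS-CUST-STATUS NOT = '00'
               PERFORM 9000-FILE-ERROR-ABEND
           END-IF
           STOP RUN.
      * Common exit for any unexpected status. This sketch ends
      * with return code 16; a site abend routine would go here.
       9000-FILE-ERROR-ABEND.
           DISPLAY 'UNEXPECTED FILE STATUS: ' WS-CUST-STATUS
           DISPLAY 'ON FILE: ' WS-ERR-FILE
           DISPLAY 'OPERATION: ' WS-ERR-OPERATION
           MOVE 16 TO RETURN-CODE
           STOP RUN.
```

Setting the file name and operation before each I/O statement keeps the error paragraph generic, so one copy can serve every file in the program.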


E.5 CI/CA Splits and Performance Impact

E.5.1 CI Splits

A CI split occurs when VSAM needs to insert or update a record in a CI that has insufficient free space. VSAM:

  1. Allocates a free CI within the same CA.
  2. Moves approximately half the records from the full CI to the new CI.
  3. Inserts the new record in the appropriate CI.
  4. Updates the index to reflect the new CI.

Performance impact:
  • The split itself requires multiple I/O operations (read original CI, write both CIs, update index).
  • After the split, records that were in key sequence within one CI are now spread across two CIs. Sequential reads that previously required one I/O now require two.
  • Repeated splits create a fragmented data set where logical key sequence no longer corresponds to physical disk sequence.

E.5.2 CA Splits

A CA split occurs when a CI split is needed but there are no free CIs in the current CA. VSAM must:

  1. Allocate a new CA (secondary allocation).
  2. Move approximately half the CIs from the full CA to the new CA.
  3. Update the sequence set and index set.

Performance impact: CA splits are far more expensive than CI splits. They move large amounts of data, cause significant I/O, and can trigger secondary space allocation. Excessive CA splits are the leading cause of VSAM performance degradation in production.

E.5.3 Diagnosing Split Activity

Use LISTCAT with the ALL option to see split statistics:

LISTCAT ENTRIES(MY.VSAM.KSDS) ALL

Key fields in the output:

| Field | Meaning | Healthy Value |
|---|---|---|
| SPLITS-CI | Number of CI splits since last reorganization | Varies; ratio to inserts matters |
| SPLITS-CA | Number of CA splits since last reorganization | Should be near zero |
| EXTENTS | Number of physical extents | Fewer is better; < 5 is ideal |
| REC-TOTAL | Total records | Baseline for analysis |
| REC-DELETED | Deleted records (space not reclaimed until reorg) | Should be small relative to total |

Rule of thumb: If CI splits exceed 10% of the record count, or if any CA splits have occurred, the data set needs reorganization.
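The rule of thumb is simple enough to express directly. In this sketch the three input values are made-up sample figures, as if copied by hand from LISTCAT output; the program does not read the catalog.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. REORGCHK.
      * Sketch of the rule of thumb: reorganize when CI splits
      * exceed 10% of the record count or any CA split exists.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical sample values, not real statistics.
       01  WS-REC-TOTAL        PIC 9(9) VALUE 250000.
       01  WS-CI-SPLITS        PIC 9(9) VALUE 31000.
       01  WS-CA-SPLITS        PIC 9(9) VALUE 0.
       PROCEDURE DIVISION.
       0000-MAIN.
      * "Splits > 10% of records" is tested as splits * 10 >
      * records, which avoids division and rounding.
           IF WS-CA-SPLITS > 0
              OR WS-CI-SPLITS * 10 > WS-REC-TOTAL
               DISPLAY 'REORGANIZATION RECOMMENDED'
           ELSE
               DISPLAY 'NO REORGANIZATION NEEDED YET'
           END-IF
           STOP RUN.
```

With the sample values, 31,000 CI splits against 250,000 records is above the 10% threshold, so the program recommends reorganization even though no CA split has occurred.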

E.5.4 Reorganization

Reorganizing a VSAM KSDS restores records to physical key sequence, eliminates deleted-record space, and resets the split counters:

//* STEP 1: UNLOAD
//UNLOAD   EXEC PGM=IDCAMS
//SYSPRINT DD   SYSOUT=*
//INDD     DD   DSN=MY.VSAM.KSDS,DISP=SHR
//OUTDD    DD   DSN=MY.VSAM.BACKUP,
//              DISP=(NEW,CATLG,DELETE),
//              SPACE=(CYL,(50,10))
//SYSIN    DD   *
  REPRO INFILE(INDD) OUTFILE(OUTDD)
/*
//*
//* STEP 2: DELETE AND REDEFINE
//REDEF    EXEC PGM=IDCAMS,COND=(0,NE,UNLOAD)
//SYSPRINT DD   SYSOUT=*
//SYSIN    DD   *
  DELETE MY.VSAM.KSDS CLUSTER PURGE
  DEFINE CLUSTER ( ... same parameters as original ... )
/*
//*
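//* STEP 3: RELOAD. COND=(0,NE) TESTS EVERY PRIOR STEP, SO A
//* FAILED UNLOAD THAT BYPASSES REDEF ALSO BYPASSES THE RELOAD.
//RELOAD   EXEC PGM=IDCAMS,COND=(0,NE)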
//SYSPRINT DD   SYSOUT=*
//INDD     DD   DSN=MY.VSAM.BACKUP,DISP=SHR
//OUTDD    DD   DSN=MY.VSAM.KSDS,DISP=SHR
//SYSIN    DD   *
  REPRO INFILE(INDD) OUTFILE(OUTDD)
/*

E.6 Alternate Index Creation and Maintenance

An alternate index (AIX) allows access to a KSDS or ESDS by a secondary key, for example reading a customer file by name when the primary key is customer number.

E.6.1 Steps to Create an Alternate Index

Step 1: Define the AIX

DEFINE ALTERNATEINDEX (                         -
    NAME(MY.VSAM.CUSTNAME.AIX)                  -
    RELATE(MY.VSAM.CUSTOMER)                    -
    KEYS(30 50)                                 -
    NONUNIQUEKEY                                -
    UPGRADE                                     -
    RECORDSIZE(50 500)                          -
    SHAREOPTIONS(2 3)                           -
  )                                             -
  DATA (                                        -
    NAME(MY.VSAM.CUSTNAME.AIX.DATA)             -
    CYLINDERS(5 2)                              -
  )                                             -
  INDEX (                                       -
    NAME(MY.VSAM.CUSTNAME.AIX.INDEX)            -
    CYLINDERS(2 1)                              -
  )

Key parameters:

  • RELATE — The base cluster that this AIX indexes.
  • KEYS(length offset) — The alternate key's length and position within the base cluster's records.
  • NONUNIQUEKEY or UNIQUEKEY — Whether duplicate alternate key values are allowed.
  • UPGRADE — VSAM automatically updates the AIX when the base cluster is updated. Without UPGRADE, the AIX becomes stale.
  • RECORDSIZE — For NONUNIQUEKEY, records contain pointers to all base records with the same alternate key. The maximum record size must accommodate the largest expected number of duplicates.

Step 2: Define a PATH

DEFINE PATH (                                   -
    NAME(MY.VSAM.CUSTNAME.PATH)                 -
    PATHENTRY(MY.VSAM.CUSTNAME.AIX)             -
  )

The PATH associates the AIX with the base cluster and provides a single name for COBOL to reference.

Step 3: Build the AIX

BLDINDEX INFILE(BASEDD) OUTFILE(AIXDD)

This scans the base cluster and populates the AIX. Required for the initial build and after any bulk load that bypasses the UPGRADE mechanism.

E.6.2 COBOL Access via Alternate Index

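SELECT CUSTOMER-FILE
    ASSIGN TO CUSTFILE
    ORGANIZATION IS INDEXED
    ACCESS MODE IS DYNAMIC
    RECORD KEY IS CUST-NUMBER
    ALTERNATE RECORD KEY IS CUST-NAME WITH DUPLICATES
    FILE STATUS IS WS-CUST-STATUS.

The JCL needs one DD statement for the base cluster and one for each alternate index PATH. With Enterprise COBOL, the path ddname is the base ddname plus a numeric suffix (1 for the first ALTERNATE RECORD KEY, 2 for the second, and so on), with the base name truncated on the right if the result would exceed eight characters:

//CUSTFILE DD   DSN=MY.VSAM.CUSTOMER,DISP=SHR
//CUSTFIL1 DD   DSN=MY.VSAM.CUSTNAME.PATH,DISP=SHR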

To read by alternate key:

MOVE 'SMITH, JOHN' TO CUST-NAME
READ CUSTOMER-FILE
    KEY IS CUST-NAME
    INVALID KEY
        PERFORM 2100-CUSTOMER-NOT-FOUND
END-READ
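Because CUST-NAME allows duplicates, a keyed READ returns only the first matching record. A common way to retrieve the rest, sketched here, is to keep issuing READ NEXT while the file status is 02, which signals that the next record has the same alternate key. The record layout (name at offset 50, matching KEYS(30 50)), the ddname, and the program name are assumptions for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. AIXBROWS.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT CUSTOMER-FILE
               ASSIGN TO CUSTFILE
               ORGANIZATION IS INDEXED
               ACCESS MODE IS DYNAMIC
               RECORD KEY IS CUST-NUMBER
               ALTERNATE RECORD KEY IS CUST-NAME WITH DUPLICATES
               FILE STATUS IS WS-CUST-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  CUSTOMER-FILE.
      * Hypothetical layout: name at offset 50 for KEYS(30 50).
       01  CUSTOMER-RECORD.
           05  CUST-NUMBER         PIC X(10).
           05  FILLER              PIC X(40).
           05  CUST-NAME           PIC X(30).
           05  FILLER              PIC X(120).
       WORKING-STORAGE SECTION.
       01  WS-CUST-STATUS          PIC XX.
       PROCEDURE DIVISION.
       0000-MAIN.
           OPEN INPUT CUSTOMER-FILE
           IF WS-CUST-STATUS NOT = '00'
               DISPLAY 'OPEN FAILED, STATUS: ' WS-CUST-STATUS
               MOVE 12 TO RETURN-CODE
               STOP RUN
           END-IF
      * The keyed READ makes CUST-NAME the key of reference.
           MOVE 'SMITH, JOHN' TO CUST-NAME
           READ CUSTOMER-FILE KEY IS CUST-NAME
           EVALUATE WS-CUST-STATUS
               WHEN '00'
               WHEN '02'
                   PERFORM 1000-LIST-DUPLICATES
               WHEN '23'
                   DISPLAY 'NO CUSTOMER WITH THAT NAME'
               WHEN OTHER
                   DISPLAY 'READ FAILED, STATUS: ' WS-CUST-STATUS
           END-EVALUATE
           CLOSE CUSTOMER-FILE
           STOP RUN.
      * Status 02 means another record with the same name follows;
      * status 00 marks the last record with that name.
       1000-LIST-DUPLICATES.
           DISPLAY 'CUSTOMER NUMBER: ' CUST-NUMBER
           PERFORM UNTIL WS-CUST-STATUS NOT = '02'
               READ CUSTOMER-FILE NEXT RECORD
               IF WS-CUST-STATUS = '00' OR '02'
                   DISPLAY 'CUSTOMER NUMBER: ' CUST-NUMBER
               END-IF
           END-PERFORM.
```

Driving the loop from the file status avoids comparing names in the program and stops cleanly on any unexpected status.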

E.6.3 AIX Maintenance Considerations

  • UPGRADE: Always specify UPGRADE unless you have a specific reason not to. Without it, inserts and updates to the base cluster will not update the AIX, leading to incorrect query results.
  • Rebuild after bulk loads: If you use REPRO to bulk-load the base cluster (which bypasses UPGRADE), you must rebuild all AIXes afterward with BLDINDEX.
  • Performance impact: Each AIX adds overhead to every insert, update, and delete on the base cluster. Limit the number of AIXes to those genuinely needed for application access patterns.
  • NONUNIQUEKEY record sizes: For a NONUNIQUEKEY AIX, the maximum record size should accommodate the maximum number of duplicate key values. Over a KSDS base cluster, each AIX record holds 5 bytes of control information, the alternate key, and one primary key per duplicate. If your customer name key could have up to 100 duplicates and the primary key is 10 bytes, the max record size should be at least 5 + 30 (AIX key) + 100 * 10 = 1,035 bytes.
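
The sizing arithmetic for a NONUNIQUEKEY AIX can be sketched as a small calculation. It assumes the AIX record layout for a KSDS base cluster: 5 bytes of control information, the alternate key, and one primary key per duplicate. The key lengths and duplicate count are sample values, and the program name is made up.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. AIXSIZE.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Sample inputs: 30-byte name key, 10-byte customer number,
      * and at most 100 customers sharing one name.
       01  WS-AIX-KEY-LEN      PIC 9(3) VALUE 30.
       01  WS-PRIME-KEY-LEN    PIC 9(3) VALUE 10.
       01  WS-MAX-DUPLICATES   PIC 9(5) VALUE 100.
       01  WS-MAX-RECSIZE      PIC 9(7).
       PROCEDURE DIVISION.
       0000-MAIN.
      * 5 control bytes + alternate key + one primary key per
      * duplicate (assumed layout over a KSDS base cluster).
           COMPUTE WS-MAX-RECSIZE =
               5 + WS-AIX-KEY-LEN
                 + (WS-MAX-DUPLICATES * WS-PRIME-KEY-LEN)
           DISPLAY 'MAXIMUM AIX RECORD SIZE: ' WS-MAX-RECSIZE
           STOP RUN.
```

The result is the value to use as the maximum in RECORDSIZE on DEFINE ALTERNATEINDEX; a WRITE that would push an AIX record past it fails, so it is safer to size for the worst case.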

E.7 VSAM Performance Parameters

E.7.1 FREESPACE

FREESPACE(ci-percent ca-percent)
| Scenario | Recommended CI% | Recommended CA% | Rationale |
|---|---|---|---|
| Read-only reference file | 0 | 0 | No inserts, maximize data density |
| Low insert rate (< 5% growth/month) | 10 | 5 | Minimal splits |
| Moderate insert rate (5-20% growth) | 20 | 10 | Balance space and performance |
| High insert rate (> 20% growth) | 30-40 | 15-20 | Minimize splits between reorgs |
| Sequential key insertion (e.g., timestamp) | 0-5 | 5 | Inserts at end, minimal CI splits |
| Random key insertion (e.g., hash or UUID) | 30-40 | 15-20 | Inserts throughout file, maximum split prevention |

E.7.2 BUFFERSPACE and Buffer Management

NSR (Non-Shared Resources):

BUFFERSPACE(bytes)

Specifies the virtual storage VSAM allocates for I/O buffers. More buffers mean more CIs can be cached in memory, reducing physical I/O.

Guidelines:

  • For sequential processing: At least 2 data buffers per string (number of concurrent I/O operations). More buffers enable read-ahead.
  • For random processing: Buffer the entire index component if possible, plus 2-3 data buffers.
  • Formula for index buffering: index CI size * number of index levels * 2

LSR (Local Shared Resources):

For CICS applications, VSAM files typically use LSR buffering, controlled by CICS resource definitions (FILE and LSRPOOL, historically the FCT) rather than IDCAMS. LSR pools share buffers across multiple files, improving overall memory utilization.

Benefits of LSR:
  • Buffer pool is shared: a single pool can serve many files.
  • Hiperspace buffering (data space buffering) can extend the effective buffer pool.
  • CICS manages buffer allocation and look-aside (checking if a CI is already in a buffer before issuing I/O).

E.7.3 CI Size Selection

| Access Pattern | Optimal CI Size | Rationale |
|---|---|---|
| Sequential batch processing | 12,288 - 32,768 bytes | Maximize data per I/O |
| Random online access (CICS) | 2,048 - 4,096 bytes | Minimize wasted buffer space |
| Mixed (batch + online) | 4,096 - 8,192 bytes | Compromise |
| Small reference tables | Match record size | Fit one record per CI for random, or maximize for sequential |

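Formulas for records per CI (integer division, before any FREESPACE is set aside):

Fixed-length records:    Records per CI = (CI_size - 10) / record_length
Variable-length records: Records per CI = (CI_size - 4) / (average_record_length + 3)

For fixed-length records, the 10 bytes of control information are the CIDF (4) plus two 3-byte RDFs that together describe the whole run of equal-length records. For variable-length records, the CI holds the 4-byte CIDF plus one 3-byte RDF per record, so each record effectively costs its own length plus 3 bytes.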
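
As a worked example, the following sketch computes records per CI for both record formats. It assumes the control-field sizes given in E.1.1 (a 4-byte CIDF and 3-byte RDFs, with two RDFs covering a run of fixed-length records and one RDF per variable-length record), and ignores FREESPACE. The CI size and record length are sample values.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. RECPERCI.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Sample inputs: 4,096-byte CI and 200-byte records.
       01  WS-CI-SIZE          PIC 9(5) VALUE 4096.
       01  WS-RECORD-LENGTH    PIC 9(5) VALUE 200.
       01  WS-FIXED-PER-CI     PIC 9(5).
       01  WS-VARIABLE-PER-CI  PIC 9(5).
       PROCEDURE DIVISION.
       0000-MAIN.
      * Fixed length: 4-byte CIDF + two 3-byte RDFs = 10 bytes.
      * COMPUTE truncates the quotient to a whole record count.
           COMPUTE WS-FIXED-PER-CI =
               (WS-CI-SIZE - 10) / WS-RECORD-LENGTH
      * Variable length: 4-byte CIDF plus one RDF per record.
           COMPUTE WS-VARIABLE-PER-CI =
               (WS-CI-SIZE - 4) / (WS-RECORD-LENGTH + 3)
           DISPLAY 'FIXED-LENGTH RECORDS PER CI:    '
               WS-FIXED-PER-CI
           DISPLAY 'VARIABLE-LENGTH RECORDS PER CI: '
               WS-VARIABLE-PER-CI
           STOP RUN.
```

For the sample values both formats hold 20 records per CI (4,086 / 200 and 4,092 / 203, each truncated), which shows that the per-record RDF overhead only matters when records are short relative to the CI.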

E.7.4 SHAREOPTIONS

| Cross-Region | Integrity | Concurrency | Use Case |
|---|---|---|---|
| 1 | Full | Low (exclusive write) | Single-region batch |
| 2 | Read integrity | Moderate (one writer + readers) | CICS + batch read |
| 3 | None (application must manage) | Full | Multiple CICS regions |
| 4 | Partial (buffer refresh) | Full | Multi-system shared |

For CICS applications with multiple regions, SHAREOPTIONS(2 3) is standard, with CICS VSAM Record-Level Sharing (RLS) providing the necessary integrity beyond what SHAREOPTIONS alone guarantees.


E.8 VSAM Tuning Checklist

Use this checklist when deploying a new VSAM file to production or investigating performance issues with an existing one.

E.8.1 Design Phase

  • [ ] Choose the correct data set type (KSDS, ESDS, RRDS) based on access requirements.
  • [ ] Define primary key with the correct length and offset.
  • [ ] Set RECORDSIZE accurately — measure the actual record layout.
  • [ ] Choose CI size appropriate for the primary access pattern.
  • [ ] Set FREESPACE based on expected insert rate and distribution.
  • [ ] Set SHAREOPTIONS based on concurrent access requirements.
  • [ ] Define SPACE allocation: primary for initial load, secondary for growth.
  • [ ] Plan alternate indexes — only those required by application access patterns.
  • [ ] Specify UPGRADE on all alternate indexes unless there is a documented reason not to.

E.8.2 Initial Load

  • [ ] Use SPEED for initial loads (faster, no preformatting).
  • [ ] Sort input data by primary key before REPRO into KSDS.
  • [ ] Verify with LISTCAT that record count matches expected count.
  • [ ] Build all alternate indexes after initial load (BLDINDEX).
  • [ ] Record baseline statistics (LISTCAT ALL): record count, CI/CA splits (should be 0), extents.

E.8.3 Ongoing Monitoring

  • [ ] Schedule weekly LISTCAT to check CI/CA split counts.
  • [ ] Monitor extent count — excessive extents (> 10) signal need for reorganization.
  • [ ] Track deleted-record count — deleted records are not reclaimed until reorganization.
  • [ ] Monitor response time for random READ operations — degradation suggests fragmentation.
  • [ ] Check buffer hit ratio in CICS statistics — low ratio means insufficient buffers.

E.8.4 Reorganization

  • [ ] Schedule regular reorganization based on split activity (monthly for high-update files, quarterly for moderate).
  • [ ] Always REPRO to a backup before DELETE/DEFINE/REPRO cycle.
  • [ ] Rebuild all alternate indexes after reorganization.
  • [ ] Verify record counts match before and after.
  • [ ] Take baseline LISTCAT statistics after reorganization.
  • [ ] Consider adjusting FREESPACE if split patterns indicate the current values are too low or too high.

E.8.5 Troubleshooting

  • [ ] Slow random reads: Check buffer allocation, CI size, and index levels. Buffer the index component.
  • [ ] Slow sequential processing: Check CI size (too small?), check for excessive CA splits (physical fragmentation).
  • [ ] Space exhaustion: Check extent count, secondary allocation size, and deleted-record accumulation.
  • [ ] Duplicate key errors: Verify key definition (length, offset) matches actual record layout. Check for data corruption.
  • [ ] Open failures: Run VERIFY to fix end-of-file marker after abnormal termination. Check SHAREOPTIONS for concurrent access conflicts.
  • [ ] Stale alternate index: Rebuild with BLDINDEX. Verify UPGRADE is specified on AIX definition.

This appendix provides the VSAM knowledge an intermediate COBOL programmer needs to make informed decisions about file design, to interpret error conditions correctly, and to diagnose performance issues. For comprehensive VSAM documentation, consult IBM's DFSMS Access Method Services for Catalogs (SC23-6853), DFSMS Using Data Sets (SC23-6855), and the CICS Transaction Server Resource Definition Guide for CICS-specific VSAM configuration.