Case Study: Migrating from VSAM to DB2 at a Credit Union
Background
Lakewood Federal Credit Union, serving 185,000 members across three states, had operated its core account processing system on VSAM files since 1991. The system comprised 167 COBOL batch programs and 43 CICS online programs that collectively managed member accounts, share (savings) accounts, loan accounts, certificates of deposit, and transaction histories. The VSAM infrastructure included 34 KSDS (Key-Sequenced Data Set) files, 8 RRDS (Relative Record Data Set) files, and 12 ESDS (Entry-Sequenced Data Set) files.
By 2021, the VSAM-based architecture had become a significant liability. The credit union's leadership identified four critical problems driving the need for change.
First, reporting was painfully slow and inflexible. Every new report required a sequential scan of one or more VSAM files, and complex queries that joined data across multiple files required custom batch programs that took weeks to develop. Regulatory examiners had begun requesting ad-hoc queries that the current system simply could not support in a timely manner.
Second, data integrity enforcement was entirely the responsibility of application code. With 210 programs accessing the same VSAM files, inconsistencies had crept in over the years. An internal audit found 1,847 records with orphaned cross-references, where an account pointed to a member record that did not exist or vice versa. In a relational database, foreign key constraints would have prevented these inconsistencies from arising in the first place.
Third, concurrent access limitations were causing processing bottlenecks. Outside of Record Level Sharing, VSAM serializes access at the control interval rather than the record level, and the available file-sharing options could not support the concurrent read/write patterns that the credit union's growing online banking platform demanded. Members occasionally encountered "system busy" messages during peak hours when batch processes conflicted with online access.
Fourth, disaster recovery was complex. Replicating VSAM files to a disaster recovery site required bespoke backup and synchronization procedures that were fragile and difficult to test. DB2's built-in replication capabilities offered a more robust solution.
The credit union's CIO, Samuel Okafor, approved a two-year project to migrate core account data from VSAM to DB2, with a budget of $1.8 million covering software licenses, consulting services, and internal staff allocation.
Planning the Migration
The project team, led by senior developer Angela Rivas, consisted of four COBOL developers, one DB2 database administrator, one systems programmer, and an external consulting firm with DB2 migration experience.
The team's first task was to map the existing VSAM file structures to a relational database schema. This was more complex than it initially appeared. VSAM files in the Lakewood system used several design patterns that did not translate directly to relational tables.
The primary challenge was the use of redefines and variable-length records. The member master VSAM file, for example, contained records where the same physical bytes could represent different data depending on a type code:
* Original VSAM Member Master Record Layout
01  MEMBER-MASTER-RECORD.
    05  MM-MEMBER-NUMBER       PIC X(10).
    05  MM-RECORD-TYPE         PIC X(02).
        88  MM-TYPE-PERSONAL   VALUE 'PE'.
        88  MM-TYPE-BUSINESS   VALUE 'BU'.
        88  MM-TYPE-JOINT      VALUE 'JT'.
    05  MM-COMMON-DATA.
        10  MM-NAME-LAST       PIC X(25).
        10  MM-NAME-FIRST      PIC X(20).
        10  MM-DATE-OPENED     PIC 9(08).
        10  MM-STATUS-CODE     PIC X(01).
*   The 100-byte type-specific area is redefined once per type
    05  MM-TYPE-SPECIFIC-DATA  PIC X(100).
    05  MM-PERSONAL-DATA REDEFINES MM-TYPE-SPECIFIC-DATA.
        10  MM-P-SSN           PIC X(09).
        10  MM-P-DOB           PIC 9(08).
        10  MM-P-EMPLOYER      PIC X(30).
        10  FILLER             PIC X(53).
    05  MM-BUSINESS-DATA REDEFINES MM-TYPE-SPECIFIC-DATA.
        10  MM-B-TAX-ID        PIC X(09).
        10  MM-B-INCORP-DT     PIC 9(08).
        10  MM-B-BUS-TYPE      PIC X(20).
        10  MM-B-CONTACT       PIC X(30).
        10  FILLER             PIC X(33).
    05  MM-JOINT-DATA REDEFINES MM-TYPE-SPECIFIC-DATA.
        10  MM-J-SSN-PRIM      PIC X(09).
        10  MM-J-SSN-SEC       PIC X(09).
        10  MM-J-DOB-PRIM      PIC 9(08).
        10  MM-J-DOB-SEC       PIC 9(08).
        10  FILLER             PIC X(66).
In the relational model, the team chose a table-per-type inheritance pattern. A base MEMBER table held the common data, and separate tables for MEMBER_PERSONAL, MEMBER_BUSINESS, and MEMBER_JOINT held the type-specific attributes, linked by foreign keys. This eliminated the REDEFINES pattern and made each field independently queryable.
The DB2 schema for this structure was:
CREATE TABLE MEMBER (
MEMBER_NUMBER CHAR(10) NOT NULL,
RECORD_TYPE CHAR(2) NOT NULL,
NAME_LAST VARCHAR(25) NOT NULL,
NAME_FIRST VARCHAR(20) NOT NULL,
DATE_OPENED DATE NOT NULL,
STATUS_CODE CHAR(1) NOT NULL,
CONSTRAINT PK_MEMBER PRIMARY KEY (MEMBER_NUMBER),
CONSTRAINT CK_MEMBER_TYPE
CHECK (RECORD_TYPE IN ('PE', 'BU', 'JT'))
);
CREATE TABLE MEMBER_PERSONAL (
MEMBER_NUMBER CHAR(10) NOT NULL,
SSN CHAR(9) NOT NULL,
DATE_OF_BIRTH DATE NOT NULL,
EMPLOYER_NAME VARCHAR(30),
CONSTRAINT PK_MEMBER_PERSONAL
PRIMARY KEY (MEMBER_NUMBER),
CONSTRAINT FK_MEMBER_PERSONAL
FOREIGN KEY (MEMBER_NUMBER)
REFERENCES MEMBER (MEMBER_NUMBER)
);
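The MEMBER_BUSINESS and MEMBER_JOINT tables followed the same pattern. As an illustration, here is a plausible sketch of the joint-membership table, with column names adapted from the original copybook (the team's exact choices are not recorded here):
CREATE TABLE MEMBER_JOINT (
    MEMBER_NUMBER CHAR(10) NOT NULL,
    SSN_PRIMARY CHAR(9) NOT NULL,
    SSN_SECONDARY CHAR(9) NOT NULL,
    DOB_PRIMARY DATE NOT NULL,
    DOB_SECONDARY DATE NOT NULL,
    CONSTRAINT PK_MEMBER_JOINT
        PRIMARY KEY (MEMBER_NUMBER),
    CONSTRAINT FK_MEMBER_JOINT
        FOREIGN KEY (MEMBER_NUMBER)
        REFERENCES MEMBER (MEMBER_NUMBER)
);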
Converting COBOL Programs: From VSAM to Embedded SQL
The conversion of COBOL programs was the most labor-intensive phase of the project. Each program had to be analyzed to identify all VSAM operations, and each operation had to be translated to its DB2 equivalent.
The team developed a conversion pattern guide that mapped common VSAM operations to their Embedded SQL equivalents. The most fundamental conversion was the basic record read.
The original VSAM random read operation:
* VSAM Random Read by Key
2100-READ-MEMBER-RECORD.
MOVE WS-MEMBER-NUMBER TO MM-MEMBER-NUMBER
READ MEMBER-MASTER-FILE
KEY IS MM-MEMBER-NUMBER
INVALID KEY
SET MEMBER-NOT-FOUND TO TRUE
MOVE '1' TO WS-READ-STATUS
NOT INVALID KEY
SET MEMBER-FOUND TO TRUE
MOVE '0' TO WS-READ-STATUS
END-READ
.
Became the following Embedded SQL equivalent:
* DB2 Equivalent - Single Row SELECT
2100-READ-MEMBER-RECORD.
EXEC SQL
SELECT MEMBER_NUMBER,
RECORD_TYPE,
NAME_LAST,
NAME_FIRST,
DATE_OPENED,
STATUS_CODE
INTO :WS-MEMBER-NUMBER,
:WS-RECORD-TYPE,
:WS-NAME-LAST,
:WS-NAME-FIRST,
:WS-DATE-OPENED,
:WS-STATUS-CODE
FROM MEMBER
WHERE MEMBER_NUMBER = :WS-SEARCH-MEMBER-NUM
END-EXEC
EVALUATE SQLCODE
WHEN 0
SET MEMBER-FOUND TO TRUE
MOVE '0' TO WS-READ-STATUS
WHEN +100
SET MEMBER-NOT-FOUND TO TRUE
MOVE '1' TO WS-READ-STATUS
WHEN OTHER
MOVE SQLCODE TO WS-DISPLAY-SQLCODE
STRING 'DB2 ERROR ON MEMBER SELECT: '
WS-DISPLAY-SQLCODE
DELIMITED BY SIZE
INTO ER-MESSAGE-TEXT
END-STRING
PERFORM 8500-LOG-ERROR
MOVE '9' TO WS-READ-STATUS
END-EVALUATE
.
Sequential file processing, which was the backbone of most batch programs, required conversion from VSAM sequential reads to DB2 cursor processing:
* ORIGINAL VSAM Sequential Processing
PROCEDURE DIVISION.
0000-MAIN-CONTROL.
OPEN INPUT MEMBER-MASTER-FILE
READ MEMBER-MASTER-FILE
AT END SET END-OF-FILE TO TRUE
END-READ
PERFORM 2000-PROCESS-MEMBER
UNTIL END-OF-FILE
CLOSE MEMBER-MASTER-FILE
STOP RUN
.
2000-PROCESS-MEMBER.
PERFORM 2100-APPLY-BUSINESS-LOGIC
READ MEMBER-MASTER-FILE
AT END SET END-OF-FILE TO TRUE
END-READ
.
The DB2 equivalent using cursors:
* DB2 Cursor-Based Sequential Processing
WORKING-STORAGE SECTION.
EXEC SQL INCLUDE SQLCA END-EXEC.
EXEC SQL DECLARE MEMBER-CURSOR CURSOR FOR
SELECT MEMBER_NUMBER,
RECORD_TYPE,
NAME_LAST,
NAME_FIRST,
DATE_OPENED,
STATUS_CODE
FROM MEMBER
ORDER BY MEMBER_NUMBER
END-EXEC.
PROCEDURE DIVISION.
0000-MAIN-CONTROL.
PERFORM 1000-INITIALIZE
PERFORM 2000-PROCESS-MEMBER
UNTIL END-OF-FILE
PERFORM 9000-TERMINATE
STOP RUN
.
1000-INITIALIZE.
EXEC SQL OPEN MEMBER-CURSOR END-EXEC
IF SQLCODE NOT = 0
PERFORM 8500-HANDLE-SQL-ERROR
STOP RUN
END-IF
PERFORM 8000-FETCH-NEXT-MEMBER
.
* Mirrors the VSAM version's 2000-PROCESS-MEMBER: apply the
* business logic, then fetch the next row
2000-PROCESS-MEMBER.
PERFORM 2100-APPLY-BUSINESS-LOGIC
PERFORM 8000-FETCH-NEXT-MEMBER
.
8000-FETCH-NEXT-MEMBER.
EXEC SQL
FETCH MEMBER-CURSOR
INTO :WS-MEMBER-NUMBER,
:WS-RECORD-TYPE,
:WS-NAME-LAST,
:WS-NAME-FIRST,
:WS-DATE-OPENED,
:WS-STATUS-CODE
END-EXEC
EVALUATE SQLCODE
WHEN 0
SET NOT-END-OF-FILE TO TRUE
WHEN +100
SET END-OF-FILE TO TRUE
WHEN OTHER
PERFORM 8500-HANDLE-SQL-ERROR
END-EVALUATE
.
9000-TERMINATE.
EXEC SQL CLOSE MEMBER-CURSOR END-EXEC
EXEC SQL COMMIT END-EXEC
.
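One wrinkle deserves mention here: in DB2, a COMMIT closes any open cursor that is not declared WITH HOLD. Batch programs that committed periodically while fetching (see the commit-interval discussion under Performance Tuning below) therefore needed their cursors declared to survive commits, roughly as follows:
* Cursor that remains open across intermediate COMMITs
EXEC SQL DECLARE MEMBER-CURSOR CURSOR WITH HOLD FOR
    SELECT MEMBER_NUMBER,
           RECORD_TYPE,
           NAME_LAST,
           NAME_FIRST,
           DATE_OPENED,
           STATUS_CODE
    FROM MEMBER
    ORDER BY MEMBER_NUMBER
END-EXEC.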
Write operations underwent a similar transformation. VSAM WRITE and REWRITE statements became SQL INSERT and UPDATE statements, with the critical addition of COMMIT logic to manage transaction boundaries, something that had no direct equivalent in the VSAM world.
* DB2 Update with Commit Control
2300-UPDATE-MEMBER-STATUS.
EXEC SQL
UPDATE MEMBER
SET STATUS_CODE = :WS-NEW-STATUS
WHERE MEMBER_NUMBER = :WS-MEMBER-NUMBER
END-EXEC
EVALUATE SQLCODE
WHEN 0
ADD 1 TO CT-RECORDS-UPDATED
WHEN -803
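* Note: -803 can occur on an UPDATE only when the statement
* touches a column covered by a unique index; it is kept here
* as part of the team's standard error-handling template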
MOVE 'DUPLICATE KEY ON MEMBER UPDATE'
TO ER-MESSAGE-TEXT
PERFORM 8500-LOG-ERROR
WHEN -911
MOVE 'TIMEOUT/DEADLOCK ON MEMBER UPDATE'
TO ER-MESSAGE-TEXT
PERFORM 8500-LOG-ERROR
PERFORM 8600-HANDLE-DEADLOCK
WHEN OTHER
PERFORM 8500-HANDLE-SQL-ERROR
END-EVALUATE
* Commit every 500 records to limit lock duration
IF CT-RECORDS-UPDATED > 0
AND FUNCTION MOD(CT-RECORDS-UPDATED, 500) = 0
EXEC SQL COMMIT END-EXEC
END-IF
.
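The INSERT counterpart followed the same template. Here is a representative sketch (the paragraph name and counter are illustrative), in which SQLCODE -803 signals the classic duplicate-key condition on the primary key:
* DB2 Insert with Duplicate-Key Handling (illustrative)
2400-INSERT-MEMBER-ROW.
    EXEC SQL
        INSERT INTO MEMBER
               (MEMBER_NUMBER, RECORD_TYPE,
                NAME_LAST, NAME_FIRST,
                DATE_OPENED, STATUS_CODE)
        VALUES (:WS-MEMBER-NUMBER, :WS-RECORD-TYPE,
                :WS-NAME-LAST, :WS-NAME-FIRST,
                :WS-DATE-OPENED, :WS-STATUS-CODE)
    END-EXEC
    EVALUATE SQLCODE
        WHEN 0
            ADD 1 TO CT-RECORDS-INSERTED
        WHEN -803
            MOVE 'DUPLICATE MEMBER NUMBER ON INSERT'
                TO ER-MESSAGE-TEXT
            PERFORM 8500-LOG-ERROR
        WHEN OTHER
            PERFORM 8500-HANDLE-SQL-ERROR
    END-EVALUATE
    .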
Data Migration: The Critical Cutover
Migrating 185,000 member records, 420,000 account records, and 14 million transaction history records from VSAM to DB2 required meticulous planning. The team developed a three-stage migration approach.
Stage one was a full historical load. All existing data was extracted from VSAM files and loaded into DB2 tables using a set of custom COBOL extraction programs and DB2 LOAD utilities. This stage was performed on a weekend and took approximately nine hours. The extraction programs handled data transformations required by the new schema, including converting the REDEFINES-based record types into separate table rows and transforming PIC 9(08) dates into proper DB2 DATE columns.
The date conversion alone was a substantial effort. VSAM records stored dates in multiple formats across different files: YYYYMMDD in some, MMDDYYYY in others, and Julian dates (YYYYDDD) in a few legacy files. The extraction programs normalized all dates to the ISO format expected by DB2:
* Date Conversion During Migration Extract
3100-CONVERT-DATE-FORMAT.
EVALUATE TRUE
WHEN WS-SOURCE-FORMAT = 'YYYYMMDD'
MOVE WS-SOURCE-DATE(1:4) TO WS-DB2-YEAR
MOVE WS-SOURCE-DATE(5:2) TO WS-DB2-MONTH
MOVE WS-SOURCE-DATE(7:2) TO WS-DB2-DAY
WHEN WS-SOURCE-FORMAT = 'MMDDYYYY'
MOVE WS-SOURCE-DATE(5:4) TO WS-DB2-YEAR
MOVE WS-SOURCE-DATE(1:2) TO WS-DB2-MONTH
MOVE WS-SOURCE-DATE(3:2) TO WS-DB2-DAY
WHEN WS-SOURCE-FORMAT = 'JULIAN'
PERFORM 3150-CONVERT-JULIAN-TO-GREG
END-EVALUATE
STRING WS-DB2-YEAR '-'
WS-DB2-MONTH '-'
WS-DB2-DAY
DELIMITED BY SIZE
INTO WS-DB2-DATE-STRING
END-STRING
.
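The Julian branch could lean on COBOL's intrinsic calendar functions. A minimal sketch of what 3150-CONVERT-JULIAN-TO-GREG might look like, assuming illustrative scratch fields WS-JULIAN-NUM PIC 9(07) and WS-GREG-DATE PIC 9(08):
* Hypothetical Julian (YYYYDDD) to Gregorian conversion
3150-CONVERT-JULIAN-TO-GREG.
    MOVE WS-SOURCE-DATE(1:7) TO WS-JULIAN-NUM
*   INTEGER-OF-DAY maps YYYYDDD to an integer day count;
*   DATE-OF-INTEGER maps that count back to YYYYMMDD
    COMPUTE WS-GREG-DATE =
        FUNCTION DATE-OF-INTEGER
            (FUNCTION INTEGER-OF-DAY (WS-JULIAN-NUM))
    MOVE WS-GREG-DATE(1:4) TO WS-DB2-YEAR
    MOVE WS-GREG-DATE(5:2) TO WS-DB2-MONTH
    MOVE WS-GREG-DATE(7:2) TO WS-DB2-DAY
    .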
Stage two was a delta synchronization. Between the full load and the final cutover, daily synchronization jobs extracted changes from VSAM (identified through a journal facility) and applied them to DB2. This kept the DB2 database current during the testing period without requiring another full load.
Stage three was the final cutover. On the designated cutover weekend, the delta synchronization was run one final time after all online systems were taken offline. A comprehensive reconciliation program then compared record counts and control totals between VSAM and DB2 to verify data integrity.
* Data Reconciliation - Compare VSAM and DB2 Record Counts
3000-RECONCILE-MEMBER-COUNTS.
MOVE 0 TO CT-VSAM-MEMBERS
SET NOT-END-OF-FILE TO TRUE
OPEN INPUT MEMBER-MASTER-FILE
READ MEMBER-MASTER-FILE
AT END SET END-OF-FILE TO TRUE
END-READ
PERFORM UNTIL END-OF-FILE
ADD 1 TO CT-VSAM-MEMBERS
READ MEMBER-MASTER-FILE
AT END SET END-OF-FILE TO TRUE
END-READ
END-PERFORM
CLOSE MEMBER-MASTER-FILE
EXEC SQL
SELECT COUNT(*)
INTO :CT-DB2-MEMBERS
FROM MEMBER
END-EXEC
IF CT-VSAM-MEMBERS = CT-DB2-MEMBERS
DISPLAY 'MEMBER COUNT RECONCILED: '
CT-VSAM-MEMBERS
ELSE
DISPLAY '*** MEMBER COUNT MISMATCH ***'
DISPLAY 'VSAM: ' CT-VSAM-MEMBERS
DISPLAY 'DB2: ' CT-DB2-MEMBERS
MOVE 'Y' TO WS-RECONCILE-ERROR
END-IF
.
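Control totals followed the same shape as the count check. A sketch of a balance-total comparison, with illustrative field names (WS-VSAM-BALANCE-TOTAL is assumed to have been accumulated while reading the VSAM share file; the indicator variable guards against a NULL result from SUM on an empty table):
* Control Total Check (illustrative field names)
EXEC SQL
    SELECT SUM(CURRENT_BALANCE)
    INTO :WS-DB2-BALANCE-TOTAL :WS-BAL-NULL-IND
    FROM SHARE_ACCOUNT
END-EXEC
IF WS-BAL-NULL-IND < 0
    OR WS-DB2-BALANCE-TOTAL NOT = WS-VSAM-BALANCE-TOTAL
    DISPLAY '*** SHARE BALANCE TOTAL MISMATCH ***'
    MOVE 'Y' TO WS-RECONCILE-ERROR
END-IF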
Performance Tuning
The initial performance of the converted programs was disappointing. Several batch programs that had run in under thirty minutes with VSAM were taking over two hours with DB2. The team undertook a systematic performance tuning effort.
The first discovery was that many programs were performing row-at-a-time processing where set-based operations would have been far more efficient. The nightly interest calculation program, for example, had been converted by translating each VSAM READ into a FETCH and each REWRITE into an UPDATE, preserving the original one-record-at-a-time logic. Rewriting the core calculation as a single SQL UPDATE with a subselect reduced the runtime from 97 minutes to 4 minutes:
* Optimized Interest Calculation - Set-Based Approach
2000-CALCULATE-DAILY-INTEREST.
EXEC SQL
UPDATE SHARE_ACCOUNT SA
SET SA.ACCRUED_INTEREST =
SA.ACCRUED_INTEREST +
(SA.CURRENT_BALANCE *
(SELECT IT.DAILY_RATE
FROM INTEREST_TIER IT
WHERE IT.TIER_CODE = SA.INTEREST_TIER
AND IT.EFFECTIVE_DATE <= CURRENT DATE
AND (IT.EXPIRATION_DATE IS NULL
OR IT.EXPIRATION_DATE >
CURRENT DATE)))
WHERE SA.STATUS_CODE = 'A'
AND SA.CURRENT_BALANCE > 0
END-EXEC
IF SQLCODE = 0
MOVE SQLERRD(3) TO CT-ACCOUNTS-UPDATED
DISPLAY 'INTEREST CALCULATED FOR '
CT-ACCOUNTS-UPDATED ' ACCOUNTS'
EXEC SQL COMMIT END-EXEC
ELSE
PERFORM 8500-HANDLE-SQL-ERROR
EXEC SQL ROLLBACK END-EXEC
END-IF
.
The second major tuning effort involved index optimization. The initial DB2 schema had been created with only primary key indexes. Performance analysis using DB2's EXPLAIN facility revealed numerous table scans on frequently queried columns. Adding secondary indexes on STATUS_CODE, DATE_OPENED, and INTEREST_TIER columns for the SHARE_ACCOUNT table reduced several batch program runtimes by 60% or more.
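The index names below are assumptions, but the DDL shows the shape of the change:
CREATE INDEX IX_SHARE_STATUS
    ON SHARE_ACCOUNT (STATUS_CODE);
CREATE INDEX IX_SHARE_DATE_OPENED
    ON SHARE_ACCOUNT (DATE_OPENED);
CREATE INDEX IX_SHARE_TIER
    ON SHARE_ACCOUNT (INTEREST_TIER);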
The third optimization was commit frequency tuning. Early converted programs committed after every single row update, which created enormous log overhead. The team standardized on commit intervals based on program type: every 500 rows for batch updates, every 1,000 rows for batch inserts, and at logical transaction boundaries for online programs.
Challenges and Setbacks
The migration was not without serious difficulties. The most significant setback occurred during parallel testing in month nine of the project.
The team had been running converted programs in parallel with their VSAM counterparts for six weeks, comparing output record by record. Most programs produced identical results. However, the monthly statement generation program produced different account balances for 3,247 out of 185,000 members. Investigation revealed that the discrepancy was caused by differences in sort order between VSAM and DB2.
The VSAM transaction history file was an ESDS (Entry-Sequenced Data Set), meaning records were stored in the order they were written. The DB2 equivalent table had no guaranteed ordering unless an ORDER BY clause was specified. The statement program accumulated transactions in the order they were read, and for transactions occurring on the same date, the VSAM ordering (by entry sequence) differed from the DB2 retrieval order, which is undefined in the absence of ORDER BY. Because the statements included running balance calculations, even small ordering differences compounded across a month's worth of activity.
The fix required adding a sequence number column to the transaction history table, populated during migration from the ESDS relative byte address, and adding ORDER BY clauses to all cursor declarations that processed transactions:
EXEC SQL DECLARE TXN-HISTORY-CURSOR CURSOR FOR
SELECT TXN_DATE,
TXN_SEQUENCE,
TXN_TYPE,
TXN_AMOUNT,
TXN_DESCRIPTION
FROM TRANSACTION_HISTORY
WHERE ACCOUNT_NUMBER = :WS-ACCOUNT-NUMBER
AND TXN_DATE BETWEEN :WS-START-DATE
AND :WS-END-DATE
ORDER BY TXN_DATE, TXN_SEQUENCE
END-EXEC
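The supporting schema change itself was small; a sketch, with the exact column definition assumed:
ALTER TABLE TRANSACTION_HISTORY
    ADD TXN_SEQUENCE INTEGER NOT NULL DEFAULT 0;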
Another challenge was the handling of VSAM alternate indexes. Several programs accessed the same VSAM file through different keys using alternate index paths. In the DB2 environment, these became different WHERE clauses on the same table, but the programs had been written to process each alternate index path as if it were a separate file. Reconciling these different access patterns into a clean set of SQL queries required careful analysis of each program's actual data access requirements.
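For example, a program that had read the member file through a Social Security number alternate index could be reduced to an ordinary predicate against the new schema. A sketch, with an assumed host variable (a cursor would be needed instead if the SSN were not unique):
EXEC SQL
    SELECT M.MEMBER_NUMBER,
           M.NAME_LAST,
           M.NAME_FIRST
    INTO  :WS-MEMBER-NUMBER,
          :WS-NAME-LAST,
          :WS-NAME-FIRST
    FROM MEMBER M
         INNER JOIN MEMBER_PERSONAL MP
            ON MP.MEMBER_NUMBER = M.MEMBER_NUMBER
    WHERE MP.SSN = :WS-SEARCH-SSN
END-EXEC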
Lessons Learned
The Lakewood migration completed on schedule and within budget, going live after twenty months of effort. The team documented several important lessons.
The conversion was not a mechanical translation exercise. Simply replacing READ with SELECT and WRITE with INSERT produced technically functional but poorly performing programs. The team learned to think about data access differently when working with DB2, leveraging set-based operations, join capabilities, and the query optimizer in ways that had no VSAM equivalent.
Data type conversions required exhaustive testing. Differences in how COBOL PIC clauses and DB2 column types handled edge cases, particularly around packed decimal precision, null values, and date boundaries, produced subtle discrepancies that were caught only through comprehensive parallel testing.
Commit strategy was more important than initially appreciated. The VSAM environment had no concept of explicit transaction commit; changes were written directly to disk. Introducing commit points into batch programs required careful analysis to determine appropriate intervals that balanced performance with recovery granularity.
The sort order issue demonstrated that assumptions embedded in decades-old code were often invisible and undocumented. No specification document mentioned that the statement program depended on ESDS insertion order; it was simply how VSAM worked, and the original developers had relied on it without comment.
Finally, the credit union's reporting capabilities improved dramatically after the migration. Ad-hoc queries that previously required custom batch programs could now be answered in minutes using SQL. The first time a regulatory examiner requested a complex cross-reference report and received it within an hour, Samuel Okafor knew the investment had been worthwhile.
Conclusion
The Lakewood Federal Credit Union's VSAM-to-DB2 migration demonstrates both the substantial benefits and the genuine complexity of modernizing COBOL data access patterns. The relational model brought referential integrity, flexible querying, improved concurrent access, and simplified disaster recovery. However, achieving these benefits required more than a mechanical code conversion; it demanded a fundamental shift in how the development team thought about data access, transaction management, and performance optimization. For organizations contemplating a similar migration, the Lakewood experience argues strongly for comprehensive parallel testing, performance benchmarking from the earliest stages, and a willingness to redesign data access patterns rather than simply translating them.