Case Study 1: End-of-Day Transaction Sorting
Background
Heartland National Bank processes approximately 1.2 million transactions each business day across its retail banking, commercial banking, and online banking channels. These transactions arrive throughout the day in chronological order -- the order in which customers performed them -- and are accumulated in a raw transaction file. At the close of business, the bank's batch posting system must apply these transactions to customer accounts in a specific sequence: sorted by account number, then by transaction date, and within the same date, by transaction type (debits before credits, to avoid temporarily inflating balances).
The raw transaction file also contains records that should not be posted: voided transactions, transactions flagged for compliance review, and test transactions injected by the QA team during the day. These must be filtered out before sorting. Additionally, the operations team needs a summary of transaction volumes by type and channel, produced as a byproduct of the sort process.
This case study demonstrates COBOL's SORT statement used with both an INPUT PROCEDURE and an OUTPUT PROCEDURE, showing how to filter, validate, sort, and summarize data in a single program step rather than a chain of separate jobs.
Data Design
The Raw Transaction File
Transactions arrive in the raw file exactly as they were generated, with a 200-byte fixed-length record:
           SELECT RAW-TRANS-FILE
               ASSIGN TO RAWTXN
               ORGANIZATION IS SEQUENTIAL
               FILE STATUS IS WS-RAW-STATUS.

       FD  RAW-TRANS-FILE
           RECORDING MODE IS F
           RECORD CONTAINS 200 CHARACTERS.
       01  RAW-TRANS-RECORD.
           05  RTR-TXN-ID              PIC X(12).
           05  RTR-ACCOUNT-NUMBER      PIC X(10).
           05  RTR-TXN-DATE            PIC 9(8).
           05  RTR-TXN-TIME            PIC 9(6).
           05  RTR-TXN-TYPE            PIC X(2).
               88  RTR-DEBIT           VALUE "DB".
               88  RTR-CREDIT          VALUE "CR".
               88  RTR-FEE             VALUE "FE".
               88  RTR-INTEREST        VALUE "IN".
               88  RTR-ADJUSTMENT      VALUE "AJ".
               88  RTR-REVERSAL        VALUE "RV".
           05  RTR-TXN-AMOUNT          PIC S9(11)V99.
           05  RTR-CHANNEL             PIC X(3).
               88  RTR-BRANCH          VALUE "BRN".
               88  RTR-ATM             VALUE "ATM".
               88  RTR-ONLINE          VALUE "ONL".
               88  RTR-MOBILE          VALUE "MOB".
               88  RTR-ACH             VALUE "ACH".
               88  RTR-WIRE            VALUE "WIR".
           05  RTR-BRANCH-CODE         PIC X(4).
           05  RTR-TELLER-ID           PIC X(6).
           05  RTR-STATUS-FLAG         PIC X(1).
               88  RTR-NORMAL          VALUE "N".
               88  RTR-VOIDED          VALUE "V".
               88  RTR-COMPLIANCE      VALUE "C".
               88  RTR-TEST            VALUE "T".
               88  RTR-POSTABLE        VALUE "N".
           05  RTR-DESCRIPTION         PIC X(30).
           05  RTR-FILLER              PIC X(105).
The Sort Work File
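The sort description (SD) entry defines the record layout of the sort work file, including the key fields that the SORT statement uses for ordering. It mirrors the raw record field for field: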
           SELECT SORT-WORK-FILE
               ASSIGN TO SORTWK01.

       SD  SORT-WORK-FILE
           RECORD CONTAINS 200 CHARACTERS.
       01  SORT-RECORD.
           05  SR-TXN-ID               PIC X(12).
           05  SR-ACCOUNT-NUMBER       PIC X(10).
           05  SR-TXN-DATE             PIC 9(8).
           05  SR-TXN-TIME             PIC 9(6).
           05  SR-TXN-TYPE             PIC X(2).
           05  SR-TXN-AMOUNT           PIC S9(11)V99.
           05  SR-CHANNEL              PIC X(3).
           05  SR-BRANCH-CODE          PIC X(4).
           05  SR-TELLER-ID            PIC X(6).
           05  SR-STATUS-FLAG          PIC X(1).
           05  SR-DESCRIPTION          PIC X(30).
           05  SR-FILLER               PIC X(105).
The Sorted Output File
           SELECT SORTED-TRANS-FILE
               ASSIGN TO SRTDTXN
               ORGANIZATION IS SEQUENTIAL
               FILE STATUS IS WS-SRT-STATUS.

       FD  SORTED-TRANS-FILE
           RECORDING MODE IS F
           RECORD CONTAINS 200 CHARACTERS.
       01  SORTED-TRANS-RECORD         PIC X(200).
The Sort Statement with INPUT and OUTPUT PROCEDURE
The heart of the program is the SORT statement that connects the input filtering, the sort operation, and the output summarization:
       PROCEDURE DIVISION.
       0000-MAIN.
           PERFORM 1000-INITIALIZE
           SORT SORT-WORK-FILE
               ON ASCENDING KEY SR-ACCOUNT-NUMBER
               ON ASCENDING KEY SR-TXN-DATE
               ON DESCENDING KEY SR-TXN-TYPE
               INPUT PROCEDURE IS 2000-FILTER-INPUT
               OUTPUT PROCEDURE IS 3000-SUMMARIZE-OUTPUT
           PERFORM 4000-PRODUCE-REPORTS
           PERFORM 9000-TERMINATE
           STOP RUN
           .
The three-level sort key ensures the correct posting order:
- Account number (ascending): Groups all transactions for the same account together, allowing the posting program to load each account once and apply all its transactions before moving to the next.
- Transaction date (ascending): Within each account, transactions are ordered chronologically. This is important for interest calculations and overdraft detection, where the date of each transaction affects the outcome.
- Transaction type (descending): The type codes collate as "AJ" < "CR" < "DB" < "FE" < "IN" < "RV" in both EBCDIC and ASCII, so an ascending key would place credits ("CR") ahead of debits ("DB"). Sorting this key in descending order puts "DB" before "CR" within the same date, the conservative sequence that prevents a credit from temporarily inflating the balance before a same-day debit draws it down. The result still depends on how the codes happen to be spelled (interest, "IN", also lands ahead of debits), so a derived posting-priority field is the more robust key when the full sequence of types matters.
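Any key built directly on the type code depends on how the codes collate (in both EBCDIC and ASCII, "CR" collates before "DB"). One alternative is to sort on a derived priority digit instead. The following is a minimal sketch under stated assumptions, not the bank's actual design: the program name PRIODEMO, the field WS-POST-PRIORITY, and the priority values themselves (including where adjustments and reversals fall) are hypothetical and chosen only for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PRIODEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * The six type codes from the raw record layout
       01  WS-TYPE-LIST            PIC X(12)
                                   VALUE "DBCRFEINAJRV".
       01  WS-TYPE-TABLE REDEFINES WS-TYPE-LIST.
           05  WS-TYPE-CODE        PIC X(2) OCCURS 6 TIMES.
       01  WS-IDX                  PIC 9(2).
      * Hypothetical derived key: lower digits post first
       01  WS-POST-PRIORITY        PIC 9(1).
       PROCEDURE DIVISION.
       0000-MAIN.
           PERFORM VARYING WS-IDX FROM 1 BY 1 UNTIL WS-IDX > 6
               PERFORM 1000-DERIVE-PRIORITY
               DISPLAY WS-TYPE-CODE (WS-IDX) " -> "
                       WS-POST-PRIORITY
           END-PERFORM
           STOP RUN
           .
       1000-DERIVE-PRIORITY.
      * Assumed policy: debit-like types first, then
      * adjustments and reversals, then credit-like types
           EVALUATE WS-TYPE-CODE (WS-IDX)
               WHEN "DB"
               WHEN "FE"
                   MOVE 1 TO WS-POST-PRIORITY
               WHEN "AJ"
               WHEN "RV"
                   MOVE 2 TO WS-POST-PRIORITY
               WHEN "CR"
               WHEN "IN"
                   MOVE 3 TO WS-POST-PRIORITY
               WHEN OTHER
                   MOVE 9 TO WS-POST-PRIORITY
           END-EVALUATE
           .
```

In the case-study program, a digit like this could be set in the INPUT PROCEDURE just before the RELEASE and stored in a one-byte field carved out of SR-FILLER, which would then serve as the third sort key in ascending order. A new type code would sort wherever its assigned priority says, regardless of its spelling.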
INPUT PROCEDURE: Filtering and Validation
The INPUT PROCEDURE reads the raw transaction file, filters out non-postable records, validates remaining records, and RELEASEs valid records to the sort:
       2000-FILTER-INPUT.
           OPEN INPUT RAW-TRANS-FILE
           OPEN OUTPUT REJECT-FILE
           MOVE 0 TO WS-RAW-READ-COUNT
           MOVE 0 TO WS-RELEASED-COUNT
           MOVE 0 TO WS-FILTERED-COUNT
           MOVE 0 TO WS-REJECTED-COUNT
           PERFORM 2100-READ-RAW-FILE
           PERFORM UNTIL WS-RAW-EOF
               ADD 1 TO WS-RAW-READ-COUNT
      * Step 1: Filter non-postable records
               IF NOT RTR-POSTABLE
                   ADD 1 TO WS-FILTERED-COUNT
                   PERFORM 2200-TALLY-FILTERED
               ELSE
      * Step 2: Validate postable records
                   PERFORM 2300-VALIDATE-TRANSACTION
                   IF WS-VALIDATION-FAILED
                       ADD 1 TO WS-REJECTED-COUNT
                       PERFORM 2400-WRITE-REJECT
                   ELSE
      * Step 3: Accumulate input statistics
                       PERFORM 2500-ACCUMULATE-INPUT-STATS
      * Step 4: Release valid record to the sort
                       MOVE RAW-TRANS-RECORD TO SORT-RECORD
                       RELEASE SORT-RECORD
                       ADD 1 TO WS-RELEASED-COUNT
                   END-IF
               END-IF
      * Every branch reaches this single READ, so each record
      * is handled exactly once per iteration
               PERFORM 2100-READ-RAW-FILE
           END-PERFORM
           CLOSE RAW-TRANS-FILE
           CLOSE REJECT-FILE
           .
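The filter-then-RELEASE pattern can be tried in isolation with a pared-down program. This is a sketch under stated assumptions rather than an excerpt of the production job: the program name RELDEMO, the 19-byte record (account, date, status flag), and the hard-coded test table that stands in for RAW-TRANS-FILE are all made up for the example. It also checks the SORT-RETURN special register after the sort, which the listing above does not show.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. RELDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT SORT-WORK-FILE ASSIGN TO SORTWK01.
       DATA DIVISION.
       FILE SECTION.
       SD  SORT-WORK-FILE.
       01  SORT-RECORD.
           05  SR-ACCOUNT-NUMBER   PIC X(10).
           05  SR-TXN-DATE         PIC 9(8).
           05  SR-STATUS-FLAG      PIC X(1).
       WORKING-STORAGE SECTION.
      * Five hard-coded records stand in for the raw file:
      * account (10), date (8), status flag (1)
       01  WS-TEST-DATA.
           05  FILLER  PIC X(19) VALUE "000000000220260210N".
           05  FILLER  PIC X(19) VALUE "000000000120260210V".
           05  FILLER  PIC X(19) VALUE "000000000120260210N".
           05  FILLER  PIC X(19) VALUE "000000000120260209N".
           05  FILLER  PIC X(19) VALUE "000000000120260210T".
       01  WS-TEST-TABLE REDEFINES WS-TEST-DATA.
           05  WS-TEST-REC         PIC X(19) OCCURS 5 TIMES.
       01  WS-IDX                  PIC 9(2).
       01  WS-RELEASED-COUNT       PIC 9(4) VALUE 0.
       01  WS-FILTERED-COUNT       PIC 9(4) VALUE 0.
       01  WS-SORT-EOF-FLAG        PIC X(1) VALUE "N".
           88  WS-SORT-EOF         VALUE "Y".
       PROCEDURE DIVISION.
       0000-MAIN.
           SORT SORT-WORK-FILE
               ON ASCENDING KEY SR-ACCOUNT-NUMBER
               ON ASCENDING KEY SR-TXN-DATE
               INPUT PROCEDURE IS 2000-FILTER-INPUT
               OUTPUT PROCEDURE IS 3000-SHOW-OUTPUT
      * A nonzero SORT-RETURN means the sort did not complete
           IF SORT-RETURN NOT = 0
               DISPLAY "SORT FAILED, SORT-RETURN = " SORT-RETURN
               MOVE 16 TO RETURN-CODE
           END-IF
           DISPLAY "RELEASED: " WS-RELEASED-COUNT
                   " FILTERED: " WS-FILTERED-COUNT
           STOP RUN
           .
       2000-FILTER-INPUT.
           PERFORM VARYING WS-IDX FROM 1 BY 1 UNTIL WS-IDX > 5
               MOVE WS-TEST-REC (WS-IDX) TO SORT-RECORD
      * Only status "N" is postable; everything else is skipped
               IF SR-STATUS-FLAG = "N"
                   RELEASE SORT-RECORD
                   ADD 1 TO WS-RELEASED-COUNT
               ELSE
                   ADD 1 TO WS-FILTERED-COUNT
               END-IF
           END-PERFORM
           .
       3000-SHOW-OUTPUT.
           PERFORM UNTIL WS-SORT-EOF
               RETURN SORT-WORK-FILE
                   AT END
                       SET WS-SORT-EOF TO TRUE
                   NOT AT END
                       DISPLAY SORT-RECORD
               END-RETURN
           END-PERFORM
           .
```

With the five test records shown, three are released and two are filtered. The OUTPUT PROCEDURE should display account 0000000001 for 20260209, then the same account for 20260210, then account 0000000002. The unquoted ASSIGN TO SORTWK01 follows the case study; a compiler off the mainframe may prefer a quoted literal there.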
Filtering Logic
Non-postable records are identified by the status flag. Each type is counted separately for the operations report:
       2200-TALLY-FILTERED.
           EVALUATE TRUE
               WHEN RTR-VOIDED
                   ADD 1 TO WS-VOIDED-COUNT
               WHEN RTR-COMPLIANCE
                   ADD 1 TO WS-COMPLIANCE-COUNT
               WHEN RTR-TEST
                   ADD 1 TO WS-TEST-COUNT
           END-EVALUATE
           .
Validation Logic
Records that pass the status filter still need data validation before they can be trusted for posting:
       2300-VALIDATE-TRANSACTION.
           SET WS-VALIDATION-PASSED TO TRUE
      * Check 1: Account number must be numeric
           IF RTR-ACCOUNT-NUMBER IS NOT NUMERIC
               MOVE "INVALID ACCOUNT NUMBER"
                   TO WS-REJECT-REASON
               SET WS-VALIDATION-FAILED TO TRUE
           END-IF
      * Check 2: Transaction date must be valid
           IF WS-VALIDATION-PASSED
               IF RTR-TXN-DATE < 20200101
                  OR RTR-TXN-DATE > WS-CURRENT-DATE-NUM
                   MOVE "INVALID TRANSACTION DATE"
                       TO WS-REJECT-REASON
                   SET WS-VALIDATION-FAILED TO TRUE
               END-IF
           END-IF
      * Check 3: Amount must be positive and non-zero
           IF WS-VALIDATION-PASSED
               IF RTR-TXN-AMOUNT = ZEROS
                  OR RTR-TXN-AMOUNT < ZEROS
                   MOVE "INVALID TRANSACTION AMOUNT"
                       TO WS-REJECT-REASON
                   SET WS-VALIDATION-FAILED TO TRUE
               END-IF
           END-IF
      * Check 4: Transaction type must be recognized
           IF WS-VALIDATION-PASSED
               IF NOT (RTR-DEBIT OR RTR-CREDIT
                       OR RTR-FEE OR RTR-INTEREST
                       OR RTR-ADJUSTMENT OR RTR-REVERSAL)
                   MOVE "UNKNOWN TRANSACTION TYPE"
                       TO WS-REJECT-REASON
                   SET WS-VALIDATION-FAILED TO TRUE
               END-IF
           END-IF
           .
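Check 2 is a range test, so a value such as 20230229 (not a real calendar date) would pass it as long as it falls between the two bounds. One way to tighten the check is sketched here, assuming a compiler that provides the TEST-DATE-YYYYMMDD intrinsic function from the 2002 COBOL standard; confirm availability in your compiler's language reference. The program name DATEDEMO and the fields WS-CANDIDATE-DATE and WS-DATE-CHECK are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DATEDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-CANDIDATE-DATE       PIC 9(8).
       01  WS-CURRENT-DATE-NUM     PIC 9(8).
       01  WS-DATE-CHECK           PIC 9(1).
       PROCEDURE DIVISION.
       0000-MAIN.
      * First eight characters of CURRENT-DATE are YYYYMMDD
           MOVE FUNCTION CURRENT-DATE (1:8)
               TO WS-CURRENT-DATE-NUM
      * 2023 was not a leap year, 2024 was
           MOVE 20230229 TO WS-CANDIDATE-DATE
           PERFORM 1000-CHECK-DATE
           MOVE 20240229 TO WS-CANDIDATE-DATE
           PERFORM 1000-CHECK-DATE
           STOP RUN
           .
       1000-CHECK-DATE.
      * TEST-DATE-YYYYMMDD returns 0 for a real calendar date
      * and a nonzero value naming the first bad component
           COMPUTE WS-DATE-CHECK =
               FUNCTION TEST-DATE-YYYYMMDD (WS-CANDIDATE-DATE)
           EVALUATE TRUE
               WHEN WS-DATE-CHECK NOT = 0
                   DISPLAY WS-CANDIDATE-DATE " NOT A CALENDAR DATE"
               WHEN WS-CANDIDATE-DATE < 20200101
               WHEN WS-CANDIDATE-DATE > WS-CURRENT-DATE-NUM
                   DISPLAY WS-CANDIDATE-DATE " OUT OF RANGE"
               WHEN OTHER
                   DISPLAY WS-CANDIDATE-DATE " ACCEPTED"
           END-EVALUATE
           .
```

The calendar test runs first so that the range comparison only ever sees a well-formed date. In 2300-VALIDATE-TRANSACTION the same function call could sit in front of the existing range test, with a nonzero result setting WS-VALIDATION-FAILED.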
Input Statistics Accumulation
While reading the input, the program accumulates counts and totals by channel and type. These statistics feed the operations report without requiring a second pass through the data:
       2500-ACCUMULATE-INPUT-STATS.
      * Count by channel
           EVALUATE TRUE
               WHEN RTR-BRANCH
                   ADD 1 TO WS-BRANCH-TXN-COUNT
                   ADD RTR-TXN-AMOUNT TO WS-BRANCH-TXN-TOTAL
               WHEN RTR-ATM
                   ADD 1 TO WS-ATM-TXN-COUNT
                   ADD RTR-TXN-AMOUNT TO WS-ATM-TXN-TOTAL
               WHEN RTR-ONLINE
                   ADD 1 TO WS-ONLINE-TXN-COUNT
                   ADD RTR-TXN-AMOUNT TO WS-ONLINE-TXN-TOTAL
               WHEN RTR-MOBILE
                   ADD 1 TO WS-MOBILE-TXN-COUNT
                   ADD RTR-TXN-AMOUNT TO WS-MOBILE-TXN-TOTAL
               WHEN RTR-ACH
                   ADD 1 TO WS-ACH-TXN-COUNT
                   ADD RTR-TXN-AMOUNT TO WS-ACH-TXN-TOTAL
               WHEN RTR-WIRE
                   ADD 1 TO WS-WIRE-TXN-COUNT
                   ADD RTR-TXN-AMOUNT TO WS-WIRE-TXN-TOTAL
           END-EVALUATE
      * Count by type (debits vs credits)
           EVALUATE TRUE
               WHEN RTR-DEBIT
               WHEN RTR-FEE
                   ADD 1 TO WS-TOTAL-DEBIT-COUNT
                   ADD RTR-TXN-AMOUNT TO WS-TOTAL-DEBIT-AMT
               WHEN RTR-CREDIT
               WHEN RTR-INTEREST
                   ADD 1 TO WS-TOTAL-CREDIT-COUNT
                   ADD RTR-TXN-AMOUNT TO WS-TOTAL-CREDIT-AMT
               WHEN RTR-ADJUSTMENT
               WHEN RTR-REVERSAL
                   ADD 1 TO WS-TOTAL-ADJUST-COUNT
                   ADD RTR-TXN-AMOUNT TO WS-TOTAL-ADJUST-AMT
           END-EVALUATE
           .
OUTPUT PROCEDURE: Summarization and Account Grouping
The OUTPUT PROCEDURE receives records from the sort in sorted order and writes them to the output file while producing account-level summary records:
       3000-SUMMARIZE-OUTPUT.
           OPEN OUTPUT SORTED-TRANS-FILE
           OPEN OUTPUT ACCOUNT-SUMMARY-FILE
           MOVE 0 TO WS-OUTPUT-COUNT
           MOVE 0 TO WS-ACCOUNT-COUNT
           MOVE SPACES TO WS-PREV-ACCOUNT
           PERFORM 3100-RETURN-SORTED
           PERFORM UNTIL WS-SORT-EOF
      * Detect account break
               IF SR-ACCOUNT-NUMBER NOT = WS-PREV-ACCOUNT
                   IF WS-PREV-ACCOUNT NOT = SPACES
                       PERFORM 3300-WRITE-ACCOUNT-SUMMARY
                   END-IF
                   PERFORM 3200-START-NEW-ACCOUNT
               END-IF
      * Accumulate account-level totals
               PERFORM 3400-ACCUMULATE-ACCOUNT-TOTALS
      * Write sorted record to output
               MOVE SORT-RECORD TO SORTED-TRANS-RECORD
               WRITE SORTED-TRANS-RECORD
               ADD 1 TO WS-OUTPUT-COUNT
               PERFORM 3100-RETURN-SORTED
           END-PERFORM
      * Write summary for the last account
           IF WS-PREV-ACCOUNT NOT = SPACES
               PERFORM 3300-WRITE-ACCOUNT-SUMMARY
           END-IF
           CLOSE SORTED-TRANS-FILE
           CLOSE ACCOUNT-SUMMARY-FILE
           .
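       3100-RETURN-SORTED.
      * RETURN places the next sorted record in SORT-RECORD, the
      * record area of the SD. The INTO phrase is for copying to
      * a different area and must not name the record area itself
           RETURN SORT-WORK-FILE
               AT END SET WS-SORT-EOF TO TRUE
           END-RETURN
           .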
       3200-START-NEW-ACCOUNT.
           MOVE SR-ACCOUNT-NUMBER TO WS-PREV-ACCOUNT
           MOVE 0 TO WS-ACCT-TXN-COUNT
           MOVE 0 TO WS-ACCT-DEBIT-TOTAL
           MOVE 0 TO WS-ACCT-CREDIT-TOTAL
           MOVE 0 TO WS-ACCT-NET-CHANGE
           ADD 1 TO WS-ACCOUNT-COUNT
           .
       3300-WRITE-ACCOUNT-SUMMARY.
           COMPUTE WS-ACCT-NET-CHANGE =
               WS-ACCT-CREDIT-TOTAL - WS-ACCT-DEBIT-TOTAL
           MOVE WS-PREV-ACCOUNT TO ASR-ACCOUNT-NUMBER
           MOVE WS-ACCT-TXN-COUNT TO ASR-TXN-COUNT
           MOVE WS-ACCT-DEBIT-TOTAL TO ASR-TOTAL-DEBITS
           MOVE WS-ACCT-CREDIT-TOTAL
               TO ASR-TOTAL-CREDITS
           MOVE WS-ACCT-NET-CHANGE TO ASR-NET-CHANGE
           WRITE ACCOUNT-SUMMARY-RECORD
           .
       3400-ACCUMULATE-ACCOUNT-TOTALS.
           ADD 1 TO WS-ACCT-TXN-COUNT
           EVALUATE TRUE
               WHEN SR-TXN-TYPE = "DB"
               WHEN SR-TXN-TYPE = "FE"
                   ADD SR-TXN-AMOUNT TO WS-ACCT-DEBIT-TOTAL
               WHEN SR-TXN-TYPE = "CR"
               WHEN SR-TXN-TYPE = "IN"
                   ADD SR-TXN-AMOUNT TO WS-ACCT-CREDIT-TOTAL
               WHEN SR-TXN-TYPE = "AJ"
               WHEN SR-TXN-TYPE = "RV"
                   IF SR-TXN-AMOUNT > 0
                       ADD SR-TXN-AMOUNT
                           TO WS-ACCT-CREDIT-TOTAL
                   ELSE
                       SUBTRACT SR-TXN-AMOUNT
                           FROM WS-ACCT-DEBIT-TOTAL
                   END-IF
           END-EVALUATE
           .
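The control-break logic in the OUTPUT PROCEDURE does not depend on the sort itself, only on the records arriving in account order, so it can be exercised on its own. The sketch below assumes a small in-memory table that is already sorted; the program name BRKDEMO, the 17-byte entries, and the whole-dollar amounts are invented for the example, and only "DB"/"FE" versus everything else is distinguished.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. BRKDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Five records already in account order:
      * account (10), type (2), amount in whole dollars (5)
       01  WS-SORTED-DATA.
           05  FILLER  PIC X(17) VALUE "0000000001DB00100".
           05  FILLER  PIC X(17) VALUE "0000000001CR00250".
           05  FILLER  PIC X(17) VALUE "0000000002DB00040".
           05  FILLER  PIC X(17) VALUE "0000000002FE00005".
           05  FILLER  PIC X(17) VALUE "0000000003CR00900".
       01  WS-SORTED-TABLE REDEFINES WS-SORTED-DATA.
           05  WS-ENTRY OCCURS 5 TIMES.
               10  WS-E-ACCOUNT    PIC X(10).
               10  WS-E-TYPE       PIC X(2).
               10  WS-E-AMOUNT     PIC 9(5).
       01  WS-IDX                  PIC 9(2).
       01  WS-PREV-ACCOUNT         PIC X(10) VALUE SPACES.
       01  WS-ACCT-DEBIT-TOTAL     PIC S9(7) VALUE 0.
       01  WS-ACCT-CREDIT-TOTAL    PIC S9(7) VALUE 0.
       01  WS-ACCT-NET-CHANGE      PIC S9(7) VALUE 0.
       01  WS-NET-EDITED           PIC -(7)9.
       PROCEDURE DIVISION.
       0000-MAIN.
           PERFORM VARYING WS-IDX FROM 1 BY 1 UNTIL WS-IDX > 5
      * A change of account number closes the previous group
               IF WS-E-ACCOUNT (WS-IDX) NOT = WS-PREV-ACCOUNT
                   IF WS-PREV-ACCOUNT NOT = SPACES
                       PERFORM 2000-SHOW-SUMMARY
                   END-IF
                   MOVE WS-E-ACCOUNT (WS-IDX) TO WS-PREV-ACCOUNT
                   MOVE 0 TO WS-ACCT-DEBIT-TOTAL
                   MOVE 0 TO WS-ACCT-CREDIT-TOTAL
               END-IF
               IF WS-E-TYPE (WS-IDX) = "DB" OR "FE"
                   ADD WS-E-AMOUNT (WS-IDX)
                       TO WS-ACCT-DEBIT-TOTAL
               ELSE
                   ADD WS-E-AMOUNT (WS-IDX)
                       TO WS-ACCT-CREDIT-TOTAL
               END-IF
           END-PERFORM
      * The last group has no following break, so flush it here
           IF WS-PREV-ACCOUNT NOT = SPACES
               PERFORM 2000-SHOW-SUMMARY
           END-IF
           STOP RUN
           .
       2000-SHOW-SUMMARY.
           COMPUTE WS-ACCT-NET-CHANGE =
               WS-ACCT-CREDIT-TOTAL - WS-ACCT-DEBIT-TOTAL
           MOVE WS-ACCT-NET-CHANGE TO WS-NET-EDITED
           DISPLAY WS-PREV-ACCOUNT " NET CHANGE: " WS-NET-EDITED
           .
```

For the test data shown, the three accounts should report net changes of 150, -45, and 900. The flush after the loop is the step most often forgotten: without it the final account never gets a summary.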
JCL for the End-of-Day Sort Job
//EODTSORT JOB (ACCT),'EOD TXN SORT',
// CLASS=A,MSGCLASS=X,NOTIFY=&SYSUID
//*
//* END-OF-DAY TRANSACTION SORT AND SUMMARIZE
//*
//SORTJOB EXEC PGM=EODTSORT,REGION=256M
//STEPLIB DD DSN=PROD.LOADLIB,DISP=SHR
//*
//* RAW TRANSACTION INPUT (DAILY GDG)
//RAWTXN DD DSN=BANK.DAILY.RAW.TRANS(0),DISP=SHR
//*
//* SORT WORK FILES
//SORTWK01 DD DSN=&&SORTWK1,
// DISP=(NEW,DELETE),
// SPACE=(CYL,(100,50)),
// UNIT=SYSDA
//SORTWK02 DD DSN=&&SORTWK2,
// DISP=(NEW,DELETE),
// SPACE=(CYL,(100,50)),
// UNIT=SYSDA
//SORTWK03 DD DSN=&&SORTWK3,
// DISP=(NEW,DELETE),
// SPACE=(CYL,(100,50)),
// UNIT=SYSDA
//*
//* SORTED OUTPUT (INPUT TO POSTING PROGRAM)
//SRTDTXN DD DSN=BANK.DAILY.SORTED.TRANS(+1),
// DISP=(NEW,CATLG,DELETE),
// SPACE=(CYL,(50,20)),
// DCB=(RECFM=FB,LRECL=200,BLKSIZE=0)
//*
//* ACCOUNT-LEVEL SUMMARY FILE
//ACCTSUM DD DSN=BANK.DAILY.ACCOUNT.SUMMARY(+1),
// DISP=(NEW,CATLG,DELETE),
// SPACE=(CYL,(10,5)),
// DCB=(RECFM=FB,LRECL=80,BLKSIZE=0)
//*
//* REJECTED TRANSACTIONS
//REJECTS DD DSN=BANK.DAILY.REJECTS(+1),
// DISP=(NEW,CATLG,DELETE),
// SPACE=(CYL,(5,2)),
// DCB=(RECFM=FB,LRECL=230,BLKSIZE=0)
//*
//* OPERATIONS REPORT
//OPSRPT DD SYSOUT=*
//*
//SYSOUT DD SYSOUT=*
//SYSUDUMP DD SYSOUT=*
The Operations Report
The program produces a report at the end of the run that gives the operations team a complete picture of the day's transaction flow:
================================================================
                    HEARTLAND NATIONAL BANK
               END-OF-DAY TRANSACTION SORT REPORT
           RUN DATE: 2026-02-10    RUN TIME: 18:30:45
================================================================

INPUT SUMMARY
--------------------------------------------------------
  Raw Records Read:                  1,203,847
  Filtered Out:
    Voided Transactions:                 3,218
    Compliance Hold:                       487
    Test Transactions:                      92
  Rejected (Validation Errors):            156
  Released to Sort:                  1,199,894

OUTPUT SUMMARY
--------------------------------------------------------
  Sorted Records Written:            1,199,894
  Unique Accounts Affected:            287,443

CHANNEL BREAKDOWN
--------------------------------------------------------
  Channel        Count        Total Amount
  -------   ----------   -----------------
  Branch       312,445   $  847,293,104.55
  ATM          289,101   $  234,567,890.00
  Online       298,776   $  512,344,221.87
  Mobile       187,442   $  198,765,432.10
  ACH           98,230   $1,234,567,890.22
  Wire          13,900   $2,876,543,210.99

DEBIT / CREDIT SUMMARY
--------------------------------------------------------
  Total Debits:      687,221   $3,102,445,667.88
  Total Credits:     498,773   $2,798,112,980.45
  Adjustments:        13,900   $    3,523,101.40
================================================================
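The counts in this report are control totals: every record read must be accounted for as filtered, rejected, or released, and every released record must come back out of the sort. A minimal reconciliation sketch follows, using the sample figures from the report. The program name RECONDEM and the field WS-ACCOUNTED-FOR are assumptions; in the real program the check would more likely sit in 4000-PRODUCE-REPORTS and use the live counters.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. RECONDEM.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Counts copied from the sample operations report
       01  WS-RAW-READ-COUNT       PIC 9(9) VALUE 1203847.
       01  WS-VOIDED-COUNT         PIC 9(9) VALUE 3218.
       01  WS-COMPLIANCE-COUNT     PIC 9(9) VALUE 487.
       01  WS-TEST-COUNT           PIC 9(9) VALUE 92.
       01  WS-REJECTED-COUNT       PIC 9(9) VALUE 156.
       01  WS-RELEASED-COUNT       PIC 9(9) VALUE 1199894.
       01  WS-OUTPUT-COUNT         PIC 9(9) VALUE 1199894.
      * Hypothetical work field for the reconciliation sum
       01  WS-ACCOUNTED-FOR        PIC 9(9) VALUE 0.
       PROCEDURE DIVISION.
       0000-MAIN.
      * Every record read must be filtered, rejected, or released
           COMPUTE WS-ACCOUNTED-FOR =
               WS-VOIDED-COUNT + WS-COMPLIANCE-COUNT
               + WS-TEST-COUNT + WS-REJECTED-COUNT
               + WS-RELEASED-COUNT
      * Records returned by the sort must equal records released
           IF WS-ACCOUNTED-FOR = WS-RAW-READ-COUNT
              AND WS-OUTPUT-COUNT = WS-RELEASED-COUNT
               DISPLAY "CONTROL TOTALS BALANCE"
           ELSE
               DISPLAY "CONTROL TOTALS OUT OF BALANCE"
               MOVE 16 TO RETURN-CODE
           END-IF
           STOP RUN
           .
```

With the sample figures, 3,218 + 487 + 92 + 156 + 1,199,894 equals the 1,203,847 records read, so the program displays CONTROL TOTALS BALANCE. Setting a nonzero RETURN-CODE on a mismatch lets the job scheduler stop the posting step from running against a suspect file.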
Why INPUT and OUTPUT PROCEDURE Together
This case study demonstrates why the combination of INPUT and OUTPUT PROCEDURE is often preferable to the simpler SORT...USING...GIVING form:
- INPUT PROCEDURE advantage: The filtering and validation logic runs before records enter the sort. This means the sort processes only valid, postable records, reducing sort time and memory usage. Without INPUT PROCEDURE, the program would need a separate pre-processing step to filter the raw file.
- OUTPUT PROCEDURE advantage: The account-level summarization takes advantage of the sorted order. Because the OUTPUT PROCEDURE receives records in sorted sequence, it can detect account breaks with a simple comparison against the previous account number. Without OUTPUT PROCEDURE, a separate post-processing step would be needed to scan the sorted file again.
- Single pass efficiency: The entire process -- read, filter, validate, sort, summarize, write -- happens in a single program execution. There are no intermediate temporary files between the filter step and the sort, or between the sort and the summary step. This eliminates disk I/O for intermediate files and reduces the batch window.
- Statistics without extra I/O: Both the input statistics (channel breakdown) and the output statistics (account counts) are gathered during the sort process itself. A separate statistics program would require an additional sequential pass through the data.
Testing the Sort Program
Test Scenario 1: Sort Order Verification
- Create a test file with transactions for three accounts, in reverse order
- Run the sort program
- Verify that the output file is ordered by account, then date, then type
- Confirm that within the same account and date, debits appear before credits
Test Scenario 2: Filter Effectiveness
- Create a test file with a mix of normal, voided, compliance, and test records
- Run the sort program
- Verify that only normal records appear in the sorted output
- Verify that the operations report counts match the input file
Test Scenario 3: Validation Rejection
- Include records with invalid account numbers (alphabetic characters), future dates, zero amounts, and unknown transaction types
- Verify each is written to the reject file with the correct reason code
- Verify none appear in the sorted output
Test Scenario 4: Account Summary Accuracy
- Create a test file with known transactions for a single account
- Calculate expected debit total, credit total, and net change manually
- Verify the account summary record matches the manual calculation
Discussion Questions
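- The third sort key relies on the collating order of the two-character type codes to sequence debits ("DB") and credits ("CR") within a date; note that "CR" collates before "DB" in both EBCDIC and ASCII. What would happen if a new transaction type "AA" (automatic adjustment) were introduced? Would it sort into the correct position, and if not, how would you redesign the sort key?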
- Why does the INPUT PROCEDURE use RELEASE rather than WRITE, and the OUTPUT PROCEDURE use RETURN rather than READ? What would happen if you attempted to use READ on the sort work file inside an OUTPUT PROCEDURE?
- The program accumulates statistics using a series of ADD statements in an EVALUATE. Could this be done more efficiently using a table (array) indexed by channel code? What are the trade-offs?
- If the raw transaction file contained 50 million records instead of 1.2 million, what changes to the JCL sort work file allocation would be necessary? How does DFSORT handle files that exceed the sort work file capacity?
- The OUTPUT PROCEDURE detects account breaks by comparing the current account number to the previous one. This works because the sort guarantees order. What would happen if the SORT statement failed silently and returned unsorted records? How could the OUTPUT PROCEDURE detect this condition?