Case Study 1: End-of-Day Transaction Sorting

Background

Heartland National Bank processes approximately 1.2 million transactions each business day across its retail banking, commercial banking, and online banking channels. These transactions arrive throughout the day in chronological order -- the order in which customers performed them -- and are accumulated in a raw transaction file. At the close of business, the bank's batch posting system must apply these transactions to customer accounts in a specific sequence: sorted by account number, then by transaction date, and within the same date, by transaction type (debits before credits, to avoid temporarily inflating balances).

The raw transaction file also contains records that should not be posted: voided transactions, transactions flagged for compliance review, and test transactions injected by the QA team during the day. These must be filtered out before sorting. Additionally, the operations team needs a summary of transaction volumes by type and channel, produced as a byproduct of the sort process.

This case study demonstrates COBOL's SORT statement with both INPUT PROCEDURE and OUTPUT PROCEDURE, showing how to filter, validate, sort, and summarize data in a single program execution.


Data Design

The Raw Transaction File

Transactions arrive in the raw file exactly as they were generated, with a 200-byte fixed-length record:

       SELECT RAW-TRANS-FILE
           ASSIGN TO RAWTXN
           ORGANIZATION IS SEQUENTIAL
           FILE STATUS IS WS-RAW-STATUS.

       FD  RAW-TRANS-FILE
           RECORDING MODE IS F
           RECORD CONTAINS 200 CHARACTERS.

       01  RAW-TRANS-RECORD.
           05  RTR-TXN-ID             PIC X(12).
           05  RTR-ACCOUNT-NUMBER     PIC X(10).
           05  RTR-TXN-DATE           PIC 9(8).
           05  RTR-TXN-TIME           PIC 9(6).
           05  RTR-TXN-TYPE           PIC X(2).
               88  RTR-DEBIT          VALUE "DB".
               88  RTR-CREDIT         VALUE "CR".
               88  RTR-FEE            VALUE "FE".
               88  RTR-INTEREST       VALUE "IN".
               88  RTR-ADJUSTMENT     VALUE "AJ".
               88  RTR-REVERSAL       VALUE "RV".
           05  RTR-TXN-AMOUNT         PIC S9(11)V99.
           05  RTR-CHANNEL            PIC X(3).
               88  RTR-BRANCH         VALUE "BRN".
               88  RTR-ATM            VALUE "ATM".
               88  RTR-ONLINE         VALUE "ONL".
               88  RTR-MOBILE         VALUE "MOB".
               88  RTR-ACH            VALUE "ACH".
               88  RTR-WIRE           VALUE "WIR".
           05  RTR-BRANCH-CODE        PIC X(4).
           05  RTR-TELLER-ID          PIC X(6).
           05  RTR-STATUS-FLAG        PIC X(1).
               88  RTR-NORMAL         VALUE "N".
               88  RTR-VOIDED         VALUE "V".
               88  RTR-COMPLIANCE     VALUE "C".
               88  RTR-TEST           VALUE "T".
               88  RTR-POSTABLE       VALUE "N".
           05  RTR-DESCRIPTION        PIC X(30).
           05  RTR-FILLER             PIC X(105).

The Sort Work File

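The sort description (SD) entry defines the record layout, including the key fields that the SORT statement uses for ordering. It mirrors the raw record field for field, so a raw record can be moved into it as a group: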

       SELECT SORT-WORK-FILE
           ASSIGN TO SORTWK01.

       SD  SORT-WORK-FILE
           RECORD CONTAINS 200 CHARACTERS.

       01  SORT-RECORD.
           05  SR-TXN-ID              PIC X(12).
           05  SR-ACCOUNT-NUMBER      PIC X(10).
           05  SR-TXN-DATE            PIC 9(8).
           05  SR-TXN-TIME            PIC 9(6).
           05  SR-TXN-TYPE            PIC X(2).
           05  SR-TXN-AMOUNT          PIC S9(11)V99.
           05  SR-CHANNEL             PIC X(3).
           05  SR-BRANCH-CODE         PIC X(4).
           05  SR-TELLER-ID           PIC X(6).
           05  SR-STATUS-FLAG         PIC X(1).
           05  SR-DESCRIPTION         PIC X(30).
           05  SR-FILLER              PIC X(105).

The Sorted Output File

       SELECT SORTED-TRANS-FILE
           ASSIGN TO SRTDTXN
           ORGANIZATION IS SEQUENTIAL
           FILE STATUS IS WS-SRT-STATUS.

       FD  SORTED-TRANS-FILE
           RECORDING MODE IS F
           RECORD CONTAINS 200 CHARACTERS.

       01  SORTED-TRANS-RECORD        PIC X(200).

The Sort Statement with INPUT and OUTPUT PROCEDURE

The heart of the program is the SORT statement that connects the input filtering, the sort operation, and the output summarization:

       PROCEDURE DIVISION.
       0000-MAIN.
           PERFORM 1000-INITIALIZE
           SORT SORT-WORK-FILE
               ON ASCENDING KEY SR-ACCOUNT-NUMBER
               ON ASCENDING KEY SR-TXN-DATE
               ON DESCENDING KEY SR-TXN-TYPE
               INPUT PROCEDURE IS 2000-FILTER-INPUT
               OUTPUT PROCEDURE IS 3000-SUMMARIZE-OUTPUT
           PERFORM 4000-PRODUCE-REPORTS
           PERFORM 9000-TERMINATE
           STOP RUN
           .

The three-level sort key ensures the correct posting order:

  1. Account number (ascending): Groups all transactions for the same account together, allowing the posting program to load each account once and apply all its transactions before moving to the next.

  2. Transaction date (ascending): Within each account, transactions are ordered chronologically. This is important for interest calculations and overdraft detection, where the date of each transaction affects the outcome.

  3. Transaction type (descending): Within the same date, debits ("DB") must precede credits ("CR"). Because "CR" collates before "DB" in both EBCDIC and ASCII, an ascending key would post credits first, so the key is declared DESCENDING. This conservative ordering prevents a credit from temporarily inflating the balance before a debit on the same day draws it down. Note that the key orders the raw type codes, not debit or credit behavior: the full descending sequence is RV, IN, FE, DB, CR, AJ, so only the DB/CR pair is guaranteed to fall in the intended order.
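
The third key depends on how the two-character type codes happen to collate. One way to state the posting order explicitly is to sort on a numeric priority derived from the type code. The standalone sketch below illustrates the idea. The program name, DEMO-SORT-FILE, the DS- fields, and the priority values 1 to 3 are assumptions made for illustration and are not part of the bank's record layout; a production version would presumably need to carve the priority byte out of the filler area of the sort record.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. TYPEPRIO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT DEMO-SORT-FILE ASSIGN TO SORTWK01.
       DATA DIVISION.
       FILE SECTION.
       SD  DEMO-SORT-FILE.
       01  DEMO-SORT-RECORD.
      *    Hypothetical priority byte: 1 = debit-like,
      *    2 = credit-like, 3 = everything else
           05  DS-TYPE-PRIORITY       PIC 9(1).
           05  DS-TXN-TYPE            PIC X(2).
       WORKING-STORAGE SECTION.
      *    The six type codes, deliberately out of posting order
       01  WS-TYPE-LIST               PIC X(12)
               VALUE "CRDBINFERVAJ".
       01  WS-TYPE-TABLE REDEFINES WS-TYPE-LIST.
           05  WS-TYPE-ENTRY          PIC X(2) OCCURS 6 TIMES.
       01  WS-IDX                     PIC 9(2).
       01  WS-EOF-FLAG                PIC X(1) VALUE "N".
           88  WS-DEMO-EOF            VALUE "Y".
       PROCEDURE DIVISION.
       0000-MAIN.
           SORT DEMO-SORT-FILE
               ON ASCENDING KEY DS-TYPE-PRIORITY
               ON ASCENDING KEY DS-TXN-TYPE
               INPUT PROCEDURE IS 1000-RELEASE-TYPES
               OUTPUT PROCEDURE IS 2000-SHOW-ORDER
           STOP RUN
           .
       1000-RELEASE-TYPES.
      *    Both fields are set on every pass because the record
      *    area is undefined after a RELEASE
           PERFORM VARYING WS-IDX FROM 1 BY 1 UNTIL WS-IDX > 6
               MOVE WS-TYPE-ENTRY (WS-IDX) TO DS-TXN-TYPE
               EVALUATE DS-TXN-TYPE
                   WHEN "DB"
                   WHEN "FE"
                       MOVE 1 TO DS-TYPE-PRIORITY
                   WHEN "CR"
                   WHEN "IN"
                       MOVE 2 TO DS-TYPE-PRIORITY
                   WHEN OTHER
                       MOVE 3 TO DS-TYPE-PRIORITY
               END-EVALUATE
               RELEASE DEMO-SORT-RECORD
           END-PERFORM
           .
       2000-SHOW-ORDER.
           PERFORM UNTIL WS-DEMO-EOF
               RETURN DEMO-SORT-FILE
                   AT END
                       SET WS-DEMO-EOF TO TRUE
                   NOT AT END
                       DISPLAY DS-TYPE-PRIORITY " " DS-TXN-TYPE
               END-RETURN
           END-PERFORM
           .

Run as written, the records come back in the order DB, FE, CR, IN, AJ, RV. With this design the posting order is stated in the EVALUATE rather than implied by the spelling of the codes.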


INPUT PROCEDURE: Filtering and Validation

The INPUT PROCEDURE reads the raw transaction file, filters out non-postable records, validates remaining records, and RELEASEs valid records to the sort:

       2000-FILTER-INPUT.
           OPEN INPUT RAW-TRANS-FILE
           OPEN OUTPUT REJECT-FILE

           MOVE 0 TO WS-RAW-READ-COUNT
           MOVE 0 TO WS-RELEASED-COUNT
           MOVE 0 TO WS-FILTERED-COUNT
           MOVE 0 TO WS-REJECTED-COUNT

           PERFORM 2100-READ-RAW-FILE
           PERFORM UNTIL WS-RAW-EOF
               ADD 1 TO WS-RAW-READ-COUNT

      *        Step 1: Filter non-postable records.
      *        COBOL has no loop-continue statement (CONTINUE is
      *        a no-op), so the steps are nested in IF/ELSE and
      *        the next record is read once, at the bottom.
               IF NOT RTR-POSTABLE
                   ADD 1 TO WS-FILTERED-COUNT
                   PERFORM 2200-TALLY-FILTERED
               ELSE
      *            Step 2: Validate postable records
                   PERFORM 2300-VALIDATE-TRANSACTION
                   IF WS-VALIDATION-FAILED
                       ADD 1 TO WS-REJECTED-COUNT
                       PERFORM 2400-WRITE-REJECT
                   ELSE
      *                Step 3: Accumulate input statistics
                       PERFORM 2500-ACCUMULATE-INPUT-STATS

      *                Step 4: Release valid record to the sort
                       MOVE RAW-TRANS-RECORD TO SORT-RECORD
                       RELEASE SORT-RECORD
                       ADD 1 TO WS-RELEASED-COUNT
                   END-IF
               END-IF

               PERFORM 2100-READ-RAW-FILE
           END-PERFORM

           CLOSE RAW-TRANS-FILE
           CLOSE REJECT-FILE
           .

Filtering Logic

Non-postable records are identified by the status flag. Each type is counted separately for the operations report:

       2200-TALLY-FILTERED.
           EVALUATE TRUE
               WHEN RTR-VOIDED
                   ADD 1 TO WS-VOIDED-COUNT
               WHEN RTR-COMPLIANCE
                   ADD 1 TO WS-COMPLIANCE-COUNT
               WHEN RTR-TEST
                   ADD 1 TO WS-TEST-COUNT
           END-EVALUATE
           .

Validation Logic

Records that pass the status filter still need data validation before they can be trusted for posting:

       2300-VALIDATE-TRANSACTION.
           SET WS-VALIDATION-PASSED TO TRUE

      *    Check 1: Account number must be numeric
           IF RTR-ACCOUNT-NUMBER IS NOT NUMERIC
               MOVE "INVALID ACCOUNT NUMBER"
                   TO WS-REJECT-REASON
               SET WS-VALIDATION-FAILED TO TRUE
           END-IF

      *    Check 2: Transaction date must fall within the accepted
      *    range (calendar validity is not checked here)
           IF WS-VALIDATION-PASSED
               IF RTR-TXN-DATE < 20200101
               OR RTR-TXN-DATE > WS-CURRENT-DATE-NUM
                   MOVE "INVALID TRANSACTION DATE"
                       TO WS-REJECT-REASON
                   SET WS-VALIDATION-FAILED TO TRUE
               END-IF
           END-IF

      *    Check 3: Amount must be positive and non-zero
           IF WS-VALIDATION-PASSED
               IF RTR-TXN-AMOUNT = ZEROS
               OR RTR-TXN-AMOUNT < ZEROS
                   MOVE "INVALID TRANSACTION AMOUNT"
                       TO WS-REJECT-REASON
                   SET WS-VALIDATION-FAILED TO TRUE
               END-IF
           END-IF

      *    Check 4: Transaction type must be recognized
           IF WS-VALIDATION-PASSED
               IF NOT (RTR-DEBIT OR RTR-CREDIT
                   OR RTR-FEE OR RTR-INTEREST
                   OR RTR-ADJUSTMENT OR RTR-REVERSAL)
                   MOVE "UNKNOWN TRANSACTION TYPE"
                       TO WS-REJECT-REASON
                   SET WS-VALIDATION-FAILED TO TRUE
               END-IF
           END-IF
           .
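
Check 2 is a range comparison, so a value such as 20230231 would pass it even though no such day exists. If calendar validity matters to the posting program, a further check along the following lines could be added. This is a standalone sketch, not part of the bank's program: DATECHK, the WS-CHK- fields, WS-DAYS-TABLE, and the three sample dates are all assumptions chosen for illustration.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. DATECHK.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-DATE-TO-CHECK           PIC 9(8).
       01  WS-DATE-PARTS REDEFINES WS-DATE-TO-CHECK.
           05  WS-CHK-YEAR            PIC 9(4).
           05  WS-CHK-MONTH           PIC 9(2).
           05  WS-CHK-DAY             PIC 9(2).
      *    Days per month, January to December, non-leap year
       01  WS-DAYS-LIST               PIC X(24)
               VALUE "312831303130313130313031".
       01  WS-DAYS-TABLE REDEFINES WS-DAYS-LIST.
           05  WS-DAYS-IN-MONTH       PIC 9(2) OCCURS 12 TIMES.
       01  WS-MAX-DAY                 PIC 9(2).
       01  WS-DATE-FLAG               PIC X(1).
           88  WS-DATE-VALID          VALUE "Y".
           88  WS-DATE-INVALID        VALUE "N".
       PROCEDURE DIVISION.
       0000-MAIN.
           MOVE 20240229 TO WS-DATE-TO-CHECK
           PERFORM 1000-CHECK-AND-SHOW
           MOVE 20230229 TO WS-DATE-TO-CHECK
           PERFORM 1000-CHECK-AND-SHOW
           MOVE 20231301 TO WS-DATE-TO-CHECK
           PERFORM 1000-CHECK-AND-SHOW
           STOP RUN
           .
       1000-CHECK-AND-SHOW.
           PERFORM 2000-CHECK-CALENDAR-DATE
           DISPLAY WS-DATE-TO-CHECK " VALID=" WS-DATE-FLAG
           .
       2000-CHECK-CALENDAR-DATE.
      *    Assume invalid until every test has passed
           SET WS-DATE-INVALID TO TRUE
           IF WS-DATE-TO-CHECK IS NUMERIC
               IF WS-CHK-MONTH >= 1 AND WS-CHK-MONTH <= 12
                   PERFORM 2100-CHECK-DAY
               END-IF
           END-IF
           .
       2100-CHECK-DAY.
           MOVE WS-DAYS-IN-MONTH (WS-CHK-MONTH) TO WS-MAX-DAY
      *    February has 29 days in a Gregorian leap year
           IF WS-CHK-MONTH = 2
               IF FUNCTION MOD (WS-CHK-YEAR, 400) = 0
               OR (FUNCTION MOD (WS-CHK-YEAR, 4) = 0
               AND FUNCTION MOD (WS-CHK-YEAR, 100) NOT = 0)
                   MOVE 29 TO WS-MAX-DAY
               END-IF
           END-IF
           IF WS-CHK-DAY >= 1 AND WS-CHK-DAY <= WS-MAX-DAY
               SET WS-DATE-VALID TO TRUE
           END-IF
           .

The three sample dates display VALID=Y, VALID=N, and VALID=N respectively. The month is tested before it is used as a subscript, so an out-of-range month never indexes the table.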

Input Statistics Accumulation

While reading the input, the program accumulates counts and totals by channel and type. These statistics feed the operations report without requiring a second pass through the data:

       2500-ACCUMULATE-INPUT-STATS.
      *    Count by channel
           EVALUATE TRUE
               WHEN RTR-BRANCH
                   ADD 1 TO WS-BRANCH-TXN-COUNT
                   ADD RTR-TXN-AMOUNT TO WS-BRANCH-TXN-TOTAL
               WHEN RTR-ATM
                   ADD 1 TO WS-ATM-TXN-COUNT
                   ADD RTR-TXN-AMOUNT TO WS-ATM-TXN-TOTAL
               WHEN RTR-ONLINE
                   ADD 1 TO WS-ONLINE-TXN-COUNT
                   ADD RTR-TXN-AMOUNT TO WS-ONLINE-TXN-TOTAL
               WHEN RTR-MOBILE
                   ADD 1 TO WS-MOBILE-TXN-COUNT
                   ADD RTR-TXN-AMOUNT TO WS-MOBILE-TXN-TOTAL
               WHEN RTR-ACH
                   ADD 1 TO WS-ACH-TXN-COUNT
                   ADD RTR-TXN-AMOUNT TO WS-ACH-TXN-TOTAL
               WHEN RTR-WIRE
                   ADD 1 TO WS-WIRE-TXN-COUNT
                   ADD RTR-TXN-AMOUNT TO WS-WIRE-TXN-TOTAL
           END-EVALUATE

      *    Count by type (debits vs credits)
           EVALUATE TRUE
               WHEN RTR-DEBIT
               WHEN RTR-FEE
                   ADD 1 TO WS-TOTAL-DEBIT-COUNT
                   ADD RTR-TXN-AMOUNT TO WS-TOTAL-DEBIT-AMT
               WHEN RTR-CREDIT
               WHEN RTR-INTEREST
                   ADD 1 TO WS-TOTAL-CREDIT-COUNT
                   ADD RTR-TXN-AMOUNT TO WS-TOTAL-CREDIT-AMT
               WHEN RTR-ADJUSTMENT
               WHEN RTR-REVERSAL
                   ADD 1 TO WS-TOTAL-ADJUST-COUNT
                   ADD RTR-TXN-AMOUNT TO WS-TOTAL-ADJUST-AMT
           END-EVALUATE
           .

OUTPUT PROCEDURE: Summarization and Account Grouping

The OUTPUT PROCEDURE receives records from the sort in sorted order and writes them to the output file while producing account-level summary records:

       3000-SUMMARIZE-OUTPUT.
           OPEN OUTPUT SORTED-TRANS-FILE
           OPEN OUTPUT ACCOUNT-SUMMARY-FILE

           MOVE 0 TO WS-OUTPUT-COUNT
           MOVE 0 TO WS-ACCOUNT-COUNT
           MOVE SPACES TO WS-PREV-ACCOUNT

           PERFORM 3100-RETURN-SORTED
           PERFORM UNTIL WS-SORT-EOF
      *        Detect account break
               IF SR-ACCOUNT-NUMBER NOT = WS-PREV-ACCOUNT
                   IF WS-PREV-ACCOUNT NOT = SPACES
                       PERFORM 3300-WRITE-ACCOUNT-SUMMARY
                   END-IF
                   PERFORM 3200-START-NEW-ACCOUNT
               END-IF

      *        Accumulate account-level totals
               PERFORM 3400-ACCUMULATE-ACCOUNT-TOTALS

      *        Write sorted record to output
               MOVE SORT-RECORD TO SORTED-TRANS-RECORD
               WRITE SORTED-TRANS-RECORD
               ADD 1 TO WS-OUTPUT-COUNT

               PERFORM 3100-RETURN-SORTED
           END-PERFORM

      *    Write summary for the last account
           IF WS-PREV-ACCOUNT NOT = SPACES
               PERFORM 3300-WRITE-ACCOUNT-SUMMARY
           END-IF

           CLOSE SORTED-TRANS-FILE
           CLOSE ACCOUNT-SUMMARY-FILE
           .

       3100-RETURN-SORTED.
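      *    RETURN makes the next sorted record available in
      *    SORT-RECORD, so no INTO phrase is needed. INTO must
      *    not name the sort file's own record area.
           RETURN SORT-WORK-FILE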
               AT END SET WS-SORT-EOF TO TRUE
           END-RETURN
           .

       3200-START-NEW-ACCOUNT.
           MOVE SR-ACCOUNT-NUMBER TO WS-PREV-ACCOUNT
           MOVE 0 TO WS-ACCT-TXN-COUNT
           MOVE 0 TO WS-ACCT-DEBIT-TOTAL
           MOVE 0 TO WS-ACCT-CREDIT-TOTAL
           MOVE 0 TO WS-ACCT-NET-CHANGE
           ADD 1 TO WS-ACCOUNT-COUNT
           .

       3300-WRITE-ACCOUNT-SUMMARY.
           COMPUTE WS-ACCT-NET-CHANGE =
               WS-ACCT-CREDIT-TOTAL - WS-ACCT-DEBIT-TOTAL

           MOVE WS-PREV-ACCOUNT     TO ASR-ACCOUNT-NUMBER
           MOVE WS-ACCT-TXN-COUNT   TO ASR-TXN-COUNT
           MOVE WS-ACCT-DEBIT-TOTAL TO ASR-TOTAL-DEBITS
           MOVE WS-ACCT-CREDIT-TOTAL
                                     TO ASR-TOTAL-CREDITS
           MOVE WS-ACCT-NET-CHANGE  TO ASR-NET-CHANGE

           WRITE ACCOUNT-SUMMARY-RECORD
           .

       3400-ACCUMULATE-ACCOUNT-TOTALS.
           ADD 1 TO WS-ACCT-TXN-COUNT
           EVALUATE TRUE
               WHEN SR-TXN-TYPE = "DB"
               WHEN SR-TXN-TYPE = "FE"
                   ADD SR-TXN-AMOUNT TO WS-ACCT-DEBIT-TOTAL
               WHEN SR-TXN-TYPE = "CR"
               WHEN SR-TXN-TYPE = "IN"
                   ADD SR-TXN-AMOUNT TO WS-ACCT-CREDIT-TOTAL
               WHEN SR-TXN-TYPE = "AJ"
               WHEN SR-TXN-TYPE = "RV"
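      *            Validation (Check 3) rejects zero and negative
      *            amounts, so the ELSE branch is defensive only.
      *            Subtracting a negative amount increases the
      *            debit total by its absolute value.
                   IF SR-TXN-AMOUNT > 0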
                       ADD SR-TXN-AMOUNT
                           TO WS-ACCT-CREDIT-TOTAL
                   ELSE
                       SUBTRACT SR-TXN-AMOUNT
                           FROM WS-ACCT-DEBIT-TOTAL
                   END-IF
           END-EVALUATE
           .

JCL for the End-of-Day Sort Job

//EODTSORT JOB (ACCT),'EOD TXN SORT',
//         CLASS=A,MSGCLASS=X,NOTIFY=&SYSUID
//*
//* END-OF-DAY TRANSACTION SORT AND SUMMARIZE
//*
//SORTJOB  EXEC PGM=EODTSORT,REGION=256M
//STEPLIB  DD DSN=PROD.LOADLIB,DISP=SHR
//*
//* RAW TRANSACTION INPUT (DAILY GDG)
//RAWTXN   DD DSN=BANK.DAILY.RAW.TRANS(0),DISP=SHR
//*
//* SORT WORK FILES
//SORTWK01 DD DSN=&&SORTWK1,
//         DISP=(NEW,DELETE),
//         SPACE=(CYL,(100,50)),
//         UNIT=SYSDA
//SORTWK02 DD DSN=&&SORTWK2,
//         DISP=(NEW,DELETE),
//         SPACE=(CYL,(100,50)),
//         UNIT=SYSDA
//SORTWK03 DD DSN=&&SORTWK3,
//         DISP=(NEW,DELETE),
//         SPACE=(CYL,(100,50)),
//         UNIT=SYSDA
//*
//* SORTED OUTPUT (INPUT TO POSTING PROGRAM)
//SRTDTXN  DD DSN=BANK.DAILY.SORTED.TRANS(+1),
//         DISP=(NEW,CATLG,DELETE),
//         SPACE=(CYL,(50,20)),
//         DCB=(RECFM=FB,LRECL=200,BLKSIZE=0)
//*
//* ACCOUNT-LEVEL SUMMARY FILE
//ACCTSUM  DD DSN=BANK.DAILY.ACCOUNT.SUMMARY(+1),
//         DISP=(NEW,CATLG,DELETE),
//         SPACE=(CYL,(10,5)),
//         DCB=(RECFM=FB,LRECL=80,BLKSIZE=0)
//*
//* REJECTED TRANSACTIONS
//REJECTS  DD DSN=BANK.DAILY.REJECTS(+1),
//         DISP=(NEW,CATLG,DELETE),
//         SPACE=(CYL,(5,2)),
//         DCB=(RECFM=FB,LRECL=230,BLKSIZE=0)
//*
//* OPERATIONS REPORT
//OPSRPT   DD SYSOUT=*
//*
//SYSOUT   DD SYSOUT=*
//SYSUDUMP DD SYSOUT=*

The Operations Report

The program produces a report at the end of the run that gives the operations team a complete picture of the day's transaction flow:

================================================================
  HEARTLAND NATIONAL BANK
  END-OF-DAY TRANSACTION SORT REPORT
  RUN DATE: 2026-02-10    RUN TIME: 18:30:45
================================================================

  INPUT SUMMARY
  --------------------------------------------------------
  Raw Records Read:              1,203,847
  Filtered Out:
    Voided Transactions:             3,218
    Compliance Hold:                   487
    Test Transactions:                  92
  Rejected (Validation Errors):        156
  Released to Sort:              1,199,894

  OUTPUT SUMMARY
  --------------------------------------------------------
  Sorted Records Written:        1,199,894
  Unique Accounts Affected:        287,443

  CHANNEL BREAKDOWN
  --------------------------------------------------------
  Channel       Count       Total Amount
  -------  ----------  -----------------
  Branch      312,445  $  847,293,104.55
  ATM         289,101  $  234,567,890.00
  Online      298,776  $  512,344,221.87
  Mobile      187,442  $  198,765,432.10
  ACH          98,230  $1,234,567,890.22
  Wire         13,900  $2,876,543,210.99

  DEBIT / CREDIT SUMMARY
  --------------------------------------------------------
  Total Debits:    687,221   $3,102,445,667.88
  Total Credits:   498,773   $2,798,112,980.45
  Adjustments:      13,900   $    3,523,101.40

================================================================
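
The counts in the report are control totals: every record read should be accounted for as filtered, rejected, or released, and every released record should be written. For the sample figures, 3,218 + 487 + 92 + 156 + 1,199,894 = 1,203,847. A balancing check of this kind could be sketched as follows. The counter names follow the program's working storage, but the hard-coded values (copied from the sample report), WS-ACCOUNTED-FOR, the program name, and the return code of 8 are assumptions for illustration.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. RECONCIL.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *    Counts copied from the sample report for illustration
       01  WS-RAW-READ-COUNT          PIC 9(9) VALUE 1203847.
       01  WS-VOIDED-COUNT            PIC 9(9) VALUE 3218.
       01  WS-COMPLIANCE-COUNT        PIC 9(9) VALUE 487.
       01  WS-TEST-COUNT              PIC 9(9) VALUE 92.
       01  WS-REJECTED-COUNT          PIC 9(9) VALUE 156.
       01  WS-RELEASED-COUNT          PIC 9(9) VALUE 1199894.
       01  WS-OUTPUT-COUNT            PIC 9(9) VALUE 1199894.
      *    Hypothetical work field for the balancing sum
       01  WS-ACCOUNTED-FOR           PIC 9(9).
       PROCEDURE DIVISION.
       0000-MAIN.
           COMPUTE WS-ACCOUNTED-FOR =
               WS-VOIDED-COUNT + WS-COMPLIANCE-COUNT
               + WS-TEST-COUNT + WS-REJECTED-COUNT
               + WS-RELEASED-COUNT
      *    Input side: read = filtered + rejected + released.
      *    Output side: written = released.
           IF WS-ACCOUNTED-FOR = WS-RAW-READ-COUNT
           AND WS-OUTPUT-COUNT = WS-RELEASED-COUNT
               DISPLAY "CONTROL TOTALS BALANCE"
           ELSE
               DISPLAY "CONTROL TOTALS OUT OF BALANCE"
               MOVE 8 TO RETURN-CODE
           END-IF
           STOP RUN
           .

With the sample figures the program displays CONTROL TOTALS BALANCE. In the real job the same comparison would run on the live counters before the report is printed, so that a mismatch can stop the posting step.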

Why INPUT and OUTPUT PROCEDURE Together

This case study demonstrates why the combination of INPUT and OUTPUT PROCEDURE is often preferable to the simpler SORT...USING...GIVING form:

  1. INPUT PROCEDURE advantage: The filtering and validation logic runs before records enter the sort. This means the sort processes only valid, postable records, reducing sort time and memory usage. Without INPUT PROCEDURE, the program would need a separate pre-processing step to filter the raw file.

  2. OUTPUT PROCEDURE advantage: The account-level summarization takes advantage of the sorted order. Because the OUTPUT PROCEDURE receives records in sorted sequence, it can detect account breaks with a simple comparison against the previous account number. Without OUTPUT PROCEDURE, a separate post-processing step would be needed to scan the sorted file again.

  3. Single-step efficiency: The entire process -- read, filter, validate, sort, summarize, write -- happens in a single program execution. Apart from the sort work files themselves, no intermediate data sets are written between the filter step and the sort, or between the sort and the summary step. This eliminates the disk I/O those intermediate files would require and shortens the batch window.

  4. Statistics without extra I/O: Both the input statistics (channel breakdown) and the output statistics (account counts) are gathered during the sort process itself. A separate statistics program would require an additional sequential pass through the data.


Testing the Sort Program

Test Scenario 1: Sort Order Verification

  1. Create a test file with transactions for three accounts, in reverse order
  2. Run the sort program
  3. Verify that the output file is ordered by account, then date, then type
  4. Confirm that within the same account and date, debits appear before credits

Test Scenario 2: Filter Effectiveness

  1. Create a test file with a mix of normal, voided, compliance, and test records
  2. Run the sort program
  3. Verify that only normal records appear in the sorted output
  4. Verify that the operations report counts match the input file
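
The input file for a scenario like this one can be produced by a small generator program. The sketch below writes one record for each status flag. Only the field layout and the flag values N, V, C, and T come from the raw record definition; the program name, the TTR- prefix, and every sample value (account number, date, amount, description) are hypothetical. The transaction date should be changed to one that is not later than the run date, so that the normal record is not rejected by validation.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. TESTGEN.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT TEST-TRANS-FILE
               ASSIGN TO RAWTXN
               ORGANIZATION IS SEQUENTIAL
               FILE STATUS IS WS-TEST-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  TEST-TRANS-FILE
           RECORD CONTAINS 200 CHARACTERS.
       01  TEST-TRANS-RECORD.
           05  TTR-TXN-ID             PIC X(12).
           05  TTR-ACCOUNT-NUMBER     PIC X(10).
           05  TTR-TXN-DATE           PIC 9(8).
           05  TTR-TXN-TIME           PIC 9(6).
           05  TTR-TXN-TYPE           PIC X(2).
           05  TTR-TXN-AMOUNT         PIC S9(11)V99.
           05  TTR-CHANNEL            PIC X(3).
           05  TTR-BRANCH-CODE        PIC X(4).
           05  TTR-TELLER-ID          PIC X(6).
           05  TTR-STATUS-FLAG        PIC X(1).
           05  TTR-DESCRIPTION        PIC X(30).
           05  FILLER                 PIC X(105).
       WORKING-STORAGE SECTION.
       01  WS-TEST-STATUS             PIC X(2).
      *    One record per flag: normal, voided, compliance, test
       01  WS-FLAG-LIST               PIC X(4) VALUE "NVCT".
       01  WS-FLAG-TABLE REDEFINES WS-FLAG-LIST.
           05  WS-FLAG-ENTRY          PIC X(1) OCCURS 4 TIMES.
       01  WS-IDX                     PIC 9(2).
       01  WS-SEQ-DISPLAY             PIC 9(4).
       PROCEDURE DIVISION.
       0000-MAIN.
           OPEN OUTPUT TEST-TRANS-FILE
           IF WS-TEST-STATUS NOT = "00"
               DISPLAY "OPEN FAILED, STATUS " WS-TEST-STATUS
               MOVE 16 TO RETURN-CODE
               STOP RUN
           END-IF
           PERFORM VARYING WS-IDX FROM 1 BY 1 UNTIL WS-IDX > 4
               PERFORM 1000-BUILD-RECORD
               WRITE TEST-TRANS-RECORD
           END-PERFORM
           CLOSE TEST-TRANS-FILE
           DISPLAY "WROTE 4 TEST RECORDS"
           STOP RUN
           .
       1000-BUILD-RECORD.
      *    All values below are sample data, not real accounts
           MOVE SPACES TO TEST-TRANS-RECORD
           MOVE WS-IDX TO WS-SEQ-DISPLAY
           STRING "TEST" WS-SEQ-DISPLAY DELIMITED BY SIZE
               INTO TTR-TXN-ID
           MOVE "0000012345" TO TTR-ACCOUNT-NUMBER
           MOVE 20260210 TO TTR-TXN-DATE
           MOVE 093000 TO TTR-TXN-TIME
           MOVE "DB" TO TTR-TXN-TYPE
           MOVE 25.00 TO TTR-TXN-AMOUNT
           MOVE "ATM" TO TTR-CHANNEL
           MOVE WS-FLAG-ENTRY (WS-IDX) TO TTR-STATUS-FLAG
           MOVE "FILTER TEST RECORD" TO TTR-DESCRIPTION
           .

For this four-record file, the expected result under Test Scenario 2 is one record in the sorted output and a count of one each for voided, compliance, and test transactions.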

Test Scenario 3: Validation Rejection

  1. Include records with invalid account numbers (alphabetic characters), future dates, zero amounts, and unknown transaction types
  2. Verify each is written to the reject file with the correct reject reason
  3. Verify none appear in the sorted output

Test Scenario 4: Account Summary Accuracy

  1. Create a test file with known transactions for a single account
  2. Calculate expected debit total, credit total, and net change manually
  3. Verify the account summary record matches the manual calculation

Discussion Questions

  1. The sort key relies on the collating order of the two-character transaction type codes to place debits ("DB") ahead of credits ("CR"). What would happen if a new transaction type "AA" (automatic adjustment) were introduced? Would it sort into the correct position, and if not, how would you change the sort key design?

  2. Why does the INPUT PROCEDURE use RELEASE rather than WRITE, and the OUTPUT PROCEDURE use RETURN rather than READ? What would happen if you attempted to use READ on the sort work file inside an OUTPUT PROCEDURE?

  3. The program accumulates statistics using a series of ADD statements in an EVALUATE. Could this be done more efficiently using a table (array) indexed by channel code? What are the trade-offs?

  4. If the raw transaction file contained 50 million records instead of 1.2 million, what changes to the JCL sort work file allocation would be necessary? How does DFSORT handle files that exceed the sort work file capacity?

  5. The OUTPUT PROCEDURE detects account breaks by comparing the current account number to the previous one. This works because the sort guarantees order. What would happen if the SORT statement failed silently and returned unsorted records? How could the OUTPUT PROCEDURE detect this condition?