Case Study 1: GlobalBank Daily Transaction Sort

Background

Every night at 10:00 PM Eastern, GlobalBank's batch cycle begins. The first critical step is GBSORT01 — the daily transaction sort. This program takes the day's raw transaction feed (a sequential file of all transactions from ATMs, branches, online banking, and wire transfers) and produces a clean, sorted file for downstream processing.

The sorted file feeds into BAL-CALC (balance calculation), RPT-DAILY (daily reports), and TXN-HIST (historical archive update). If GBSORT01 fails or produces incorrect output, the entire nightly cycle halts.

The Problem

Derek Washington has been asked to enhance GBSORT01. The current version uses a simple USING/GIVING sort with no validation. Last month, a corrupted feed from the ATM network included 12,000 records with blank account numbers. These records passed through the sort and caused S0C7 ABENDs in BAL-CALC, which expected numeric account data in specific positions. The entire batch cycle had to be restarted at 2:00 AM after Maria Chen manually removed the bad records.

Maria's requirement: "Add validation in an INPUT PROCEDURE. If more than 0.5% of records fail validation, abort the sort — something is wrong with the feed and we need to investigate, not process garbage."

Requirements

  1. Validate every transaction record before releasing to sort:
     - Account number must not be spaces
     - Transaction date must be numeric and within the current year
     - Transaction type must be D, W, T, F, or I
     - Amount must not be zero
     - Branch code must be in the valid branch table

  2. Sort valid records by account number (ascending), date (ascending), time (ascending), amount (descending)

  3. Implement the 0.5% error threshold — once at least 1,000 records have been read, abort the sort if the running error rate exceeds the threshold (the minimum sample prevents aborting on a handful of early rejects)

  4. Write rejected records to an error file with reason codes

  5. Produce a sorted output file for downstream processing

  6. Display reconciliation statistics
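
The validation rules in requirement 1 can be sketched as a single paragraph inside the INPUT PROCEDURE. The field names and reason codes below are illustrative assumptions, not GlobalBank's actual copybook layout:

      * SKETCH ONLY -- FIELD AND REASON-CODE NAMES ARE ASSUMED,
      * NOT TAKEN FROM THE ACTUAL GBSORT01 COPYBOOKS.
       2100-VALIDATE-RECORD.
           MOVE 'Y' TO WS-RECORD-VALID
           MOVE SPACES TO WS-REJECT-REASON
           IF TXN-ACCOUNT = SPACES
               MOVE 'N'   TO WS-RECORD-VALID
               MOVE 'R01' TO WS-REJECT-REASON
           END-IF
           IF TXN-DATE NOT NUMERIC
               OR TXN-DATE(1:4) NOT = WS-CURRENT-YEAR
               MOVE 'N'   TO WS-RECORD-VALID
               MOVE 'R02' TO WS-REJECT-REASON
           END-IF
      *    ABBREVIATED CONDITION: NOT = 'D' AND NOT = 'W' AND ...
           IF TXN-TYPE NOT = 'D' AND 'W' AND 'T' AND 'F' AND 'I'
               MOVE 'N'   TO WS-RECORD-VALID
               MOVE 'R03' TO WS-REJECT-REASON
           END-IF
           IF TXN-AMOUNT = ZERO
               MOVE 'N'   TO WS-RECORD-VALID
               MOVE 'R04' TO WS-REJECT-REASON
           END-IF
           PERFORM 1300-VALIDATE-BRANCH
           IF NOT BRANCH-VALID
               MOVE 'N'   TO WS-RECORD-VALID
               MOVE 'R05' TO WS-REJECT-REASON
           END-IF.

A record that ends the paragraph with WS-RECORD-VALID = 'Y' is released to the sort; otherwise it is written to the error file with its reason code (requirement 4).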

Design Decisions

Derek discusses the design with Maria:

Derek: "Should I validate the branch code against a file or a hardcoded table?"

Maria: "Use a table in WORKING-STORAGE loaded from a reference file at program start. The branch list changes maybe twice a year — loading it once at startup is efficient, and it avoids repeated file I/O during the sort."
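
Maria's startup load might look like the following sketch. BRANCH-REF-FILE, BRF-BRANCH-CODE, and WS-EOF are illustrative names assumed for this example:

      * SKETCH -- BRANCH-REF-FILE AND BRF-BRANCH-CODE ARE
      * ILLUSTRATIVE NAMES, NOT THE ACTUAL REFERENCE FILE.
       0200-LOAD-BRANCH-TABLE.
           OPEN INPUT BRANCH-REF-FILE
           MOVE ZERO TO WS-BRANCH-COUNT
           PERFORM UNTIL WS-EOF = 'Y'
               READ BRANCH-REF-FILE
                   AT END MOVE 'Y' TO WS-EOF
                   NOT AT END
                       ADD 1 TO WS-BRANCH-COUNT
                       MOVE BRF-BRANCH-CODE
                         TO WS-BRANCH-ENTRY(WS-BRANCH-COUNT)
               END-READ
           END-PERFORM
           CLOSE BRANCH-REF-FILE.

Loading once at startup means the validation loop touches only WORKING-STORAGE, with no file I/O per record.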

Derek: "What about the error threshold — should it be hardcoded at 0.5%?"

Maria: "Make it a working storage variable initialized to 0.5, but eventually we should read it from a parameter file. For now, hardcoded is fine — just make sure it is in a clearly labeled data item, not a magic number buried in the code."
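
Per Maria's guidance, the threshold becomes a clearly named WORKING-STORAGE item; the exact PICTURE here is an assumption:

       01  WS-MAX-ERROR-PCT     PIC 9(03)V99 VALUE 0.50.
      *    0.5 PERCENT -- REPLACE THE VALUE CLAUSE WITH A
      *    PARAMETER-FILE READ WHEN THAT REQUIREMENT ARRIVES.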

Solution Architecture

[Raw Transaction Feed] → INPUT PROCEDURE (validate + filter)
                                 ↓
                         [SORT by acct/date/time/amount]
                                 ↓
                         GIVING → [Sorted Transaction File]

Side outputs:
  [Error/Reject File] ← rejected records with reason codes
  [Console log]       ← reconciliation statistics
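
The pipeline above maps onto a single SORT statement. The file, key, and paragraph names in this sketch are illustrative:

      * SKETCH -- FILE AND KEY NAMES ARE ILLUSTRATIVE.
           SORT SORT-WORK-FILE
               ON ASCENDING KEY  SW-ACCOUNT
               ON ASCENDING KEY  SW-TXN-DATE
               ON ASCENDING KEY  SW-TXN-TIME
               ON DESCENDING KEY SW-AMOUNT
               INPUT PROCEDURE IS 2000-VALIDATE-AND-RELEASE
               GIVING SORTED-TXN-FILE

Because the sorted output needs no further per-record processing, GIVING is used on the output side; only the input side needs a procedure.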

Key Code Excerpts

Branch Table Validation

       01  WS-BRANCH-TABLE.
           05  WS-BRANCH-COUNT  PIC 9(03) VALUE 25.
           05  WS-BRANCHES.
               10  FILLER       PIC X(05) VALUE 'NYC01'.
               10  FILLER       PIC X(05) VALUE 'NYC02'.
               10  FILLER       PIC X(05) VALUE 'CHI01'.
               10  FILLER       PIC X(05) VALUE 'LAX01'.
               10  FILLER       PIC X(05) VALUE 'LAX02'.
      *        ... additional branches ...
           05  WS-BRANCH-ENTRY REDEFINES WS-BRANCHES
               OCCURS 25 TIMES PIC X(05).
       01  WS-BRANCH-FOUND     PIC X VALUE 'N'.
           88  BRANCH-VALID    VALUE 'Y'.

       1300-VALIDATE-BRANCH.
      *    WC-BRANCH HOLDS THE BRANCH CODE OF THE RECORD
      *    CURRENTLY BEING VALIDATED; WS-BRX IS A NUMERIC
      *    SUBSCRIPT DEFINED IN WORKING-STORAGE.
           MOVE 'N' TO WS-BRANCH-FOUND
           PERFORM VARYING WS-BRX FROM 1 BY 1
               UNTIL WS-BRX > WS-BRANCH-COUNT
               OR BRANCH-VALID
               IF WC-BRANCH = WS-BRANCH-ENTRY(WS-BRX)
                   SET BRANCH-VALID TO TRUE
               END-IF
           END-PERFORM.
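
If the OCCURS clause on WS-BRANCH-ENTRY were extended with an INDEXED BY phrase, the same lookup could use the SEARCH verb instead of an explicit PERFORM loop. This sketch assumes an index BR-IDX declared on the table:

      * ASSUMES OCCURS 25 TIMES ... INDEXED BY BR-IDX
      * ON WS-BRANCH-ENTRY.
           MOVE 'N' TO WS-BRANCH-FOUND
           SET BR-IDX TO 1
           SEARCH WS-BRANCH-ENTRY
               AT END CONTINUE
               WHEN WS-BRANCH-ENTRY(BR-IDX) = WC-BRANCH
                   SET BRANCH-VALID TO TRUE
           END-SEARCH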

Error Threshold Check

       1400-CHECK-THRESHOLD.
      *    SKIP THE CHECK UNTIL A MEANINGFUL SAMPLE HAS BEEN READ
           IF WS-READ-CNT > 1000
      *        WS-ERROR-PCT MUST CARRY DECIMAL PLACES (E.G.
      *        PIC 9(03)V99) OR THE DIVISION TRUNCATES TO ZERO
               COMPUTE WS-ERROR-PCT ROUNDED =
                   (WS-ERROR-CNT / WS-READ-CNT) * 100
               IF WS-ERROR-PCT > WS-MAX-ERROR-PCT
                   DISPLAY 'GBSORT01: ERROR THRESHOLD EXCEEDED'
                   DISPLAY '  ERROR RATE: ' WS-ERROR-PCT '%'
                   DISPLAY '  THRESHOLD:  ' WS-MAX-ERROR-PCT '%'
                   DISPLAY '  ERRORS: ' WS-ERROR-CNT
                       ' OF ' WS-READ-CNT
      *            A NONZERO SORT-RETURN TELLS THE SORT TO
      *            TERMINATE WITHOUT WRITING THE GIVING FILE
                   MOVE 16 TO SORT-RETURN
               END-IF
           END-IF.
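
Requirement 6's reconciliation statistics can be a short display paragraph run from the mainline after the SORT statement returns. The counter names here are assumed:

      * SKETCH -- COUNTER NAMES ASSUMED; RELEASED PLUS
      * REJECTED SHOULD ALWAYS EQUAL RECORDS READ.
       9000-DISPLAY-STATS.
           DISPLAY 'GBSORT01 RECONCILIATION'
           DISPLAY '  RECORDS READ:     ' WS-READ-CNT
           DISPLAY '  RECORDS RELEASED: ' WS-VALID-CNT
           DISPLAY '  RECORDS REJECTED: ' WS-ERROR-CNT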

Results

After deployment, GBSORT01 caught several data quality issues in its first week:

  Date   Records     Errors   Rate     Action
  Mon    2,312,456       34   0.001%   Normal — sorted successfully
  Tue    2,298,103       28   0.001%   Normal — sorted successfully
  Wed    2,345,892   15,234   0.65%    ABORTED — ATM feed corruption
  Thu    2,301,445       41   0.002%   Normal — sorted successfully
  Fri    2,567,201       52   0.002%   Normal — sorted successfully

On Wednesday, the program correctly identified the corrupted ATM feed and aborted, preventing the downstream S0C7 ABENDs that had occurred the previous month. Operations was alerted immediately and the ATM feed was resubmitted after the vendor corrected the issue.

Lessons Learned

  1. Validation in INPUT PROCEDURE is not optional — it is a production necessity
  2. Error thresholds prevent cascade failures — better to stop early than corrupt downstream files
  3. Reconciliation counts are essential — they enable operations to verify correct processing without examining every record
  4. The SORT verb is more than just ordering — with INPUT/OUTPUT PROCEDUREs, it becomes a complete data pipeline stage

Discussion Questions

  1. What would happen if the error threshold were set too low (e.g., 0.01%)? Too high (e.g., 10%)?
  2. How would you modify this program to handle multiple transaction feeds (ATM, online, branch) arriving as separate files?
  3. What additional validation rules would you add for a banking transaction?
  4. How would you test this program's error threshold logic without a production-sized dataset?