
Chapter 32: Performance Tuning for COBOL Programs

Part VI - Mainframe Environment and Batch Processing

In the world of mainframe computing, performance is not an abstract concern --- it is a business-critical requirement measured in dollars and cents. Every CPU second consumed by a COBOL program has a direct cost, often calculated from the number of Million Service Units (MSU) consumed, which determines the monthly software licensing fees an organization pays to IBM and other vendors. A poorly performing batch job that overruns its processing window can delay end-of-day settlement, impacting millions of dollars in financial transactions. A CICS transaction that takes two seconds instead of half a second can mean the difference between a responsive teller system and long customer queues at the branch.

This chapter provides a comprehensive guide to performance tuning for COBOL programs running on z/OS. We will examine performance from every angle: compiler options that generate faster code, coding techniques that reduce CPU consumption, I/O optimization that minimizes elapsed time, DB2 tuning that eliminates unnecessary overhead, and CICS performance patterns that keep online systems responsive. Throughout, we will use realistic examples drawn from financial batch processing and high-volume transaction systems.

Key Concept

Performance tuning on the mainframe is a disciplined engineering activity, not guesswork. Every optimization must be measured before and after implementation. The three fundamental metrics are CPU time (the cost you pay for processing), elapsed time (the time the user or batch window experiences), and I/O count (often the dominant factor in elapsed time). Changes that reduce one metric may increase another, so you must understand the trade-offs.


32.1 Performance Fundamentals

Before tuning any COBOL program, you must understand what you are measuring and why. Mainframe performance is characterized by several distinct metrics, each telling a different part of the story.

32.1.1 CPU Time

CPU time is the amount of processor time consumed by your program. On z/OS, CPU time is reported in two forms:

  • TCB time --- Time spent executing your application code under the Task Control Block. This is the time your COBOL program is actually running instructions on the processor.
  • SRB time --- Time spent executing system services on behalf of your task under the Service Request Block. This includes I/O completion processing, paging, and other system activities.

CPU time is the primary cost driver because IBM's Sub-Capacity Pricing model bases software license fees on the rolling four-hour average of MSU consumption. Reducing CPU time directly reduces costs.

32.1.2 Elapsed Time (Wall Clock Time)

Elapsed time is the total time from when your job step starts to when it completes. Elapsed time includes:

  • CPU time (your code executing)
  • I/O wait time (waiting for disk reads and writes)
  • Queue time (waiting for CPU, memory, or other resources)
  • Lock/latch wait time (waiting for DB2 or VSAM locks)

In batch processing, elapsed time determines whether your jobs fit within the batch window. A typical banking batch window runs from approximately 6:00 PM to 6:00 AM, during which all end-of-day processing must complete before online systems come back up.

32.1.3 I/O Counts

I/O operations are often the single largest contributor to elapsed time. Each physical I/O to a disk subsystem takes milliseconds, and those milliseconds add up quickly when processing millions of records. The key I/O metrics are:

  • EXCP count --- The number of Execute Channel Programs issued, representing physical I/O operations.
  • Connect time --- The time the channel is connected to the device performing the I/O.
  • Disconnect time --- The time waiting for the device to position (seek time, rotational delay).

32.1.4 Memory Usage

Memory on z/OS is divided into regions, and each job step has a REGION parameter that controls how much virtual storage is available. Excessive memory usage can lead to:

  • Paging, which dramatically increases elapsed time.
  • Storage shortages that prevent other jobs from running.
  • ABEND S878 (insufficient virtual storage).

The JCL to capture basic performance metrics for a batch job:

//PERFTEST JOB (BANK,PERF),'PERFORMANCE TEST',
//             CLASS=A,MSGCLASS=X,MSGLEVEL=(1,1),
//             NOTIFY=&SYSUID,
//             REGION=0M
//*================================================================*
//* PERFORMANCE MEASUREMENT JOB                                     *
//* REGION=0M allows the job to use all available storage.          *
//* In production, specify an appropriate REGION size.              *
//*================================================================*
//*
//* Step 1: Run the program being measured
//*
//RUNPGM   EXEC PGM=DAYENDBT,TIME=1440
//STEPLIB  DD DSN=BANK.PROD.LOADLIB,DISP=SHR
//CUSTMAST DD DSN=BANK.PROD.CUSTMAST,DISP=SHR
//TRANFILE DD DSN=BANK.PROD.DAYTRANS,DISP=SHR
//OUTFILE  DD DSN=BANK.PROD.DAYEND.OUTPUT,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,
//            SPACE=(CYL,(100,50),RLSE),
//            DCB=(RECFM=FB,LRECL=200,BLKSIZE=27800)
//SYSOUT   DD SYSOUT=*
//*
//* The job log will show CPU time, elapsed time, and EXCP counts
//* in the IEF373I/IEF374I (step) and IEF375I/IEF376I (job)
//* messages.
//* Example output:
//*   IEF373I STEP /RUNPGM  / START 2025040.1800
//*   IEF374I STEP /RUNPGM  / STOP  2025040.1823
//*            CPU    0MIN 12.45SEC  SRB  0MIN 01.22SEC

32.2 COBOL Compiler Optimization Options

The IBM Enterprise COBOL compiler provides several options that directly affect the performance of the generated code. Choosing the right compiler options is one of the easiest and most impactful performance improvements you can make.

32.2.1 OPTIMIZE

The OPTIMIZE option is the single most important performance-related compiler option. It instructs the compiler to analyze your COBOL code and generate more efficient machine code:

  • OPTIMIZE(0) --- No optimization (default). The compiler generates straightforward code that is easy to debug but not optimized for performance.
  • OPTIMIZE(1) --- Standard optimization. The compiler performs local optimizations within each paragraph, including common subexpression elimination, strength reduction, and dead code elimination.
  • OPTIMIZE(2) --- Full optimization. The compiler performs global optimizations across the entire program, including interprocedural analysis, loop optimization, and register allocation improvements.

Key Concept

Always compile production programs with OPTIMIZE(2). The performance improvement typically ranges from 5% to 30% CPU reduction compared to OPTIMIZE(0), with no changes to your source code. The only trade-off is longer compile times and the fact that debugging optimized code can be more difficult because the generated instructions may not correspond one-to-one with source statements.

//*================================================================*
//* COMPILE WITH OPTIMIZATION FOR PRODUCTION                        *
//*================================================================*
//COMPILE  EXEC PGM=IGYCRCTL,
//             PARM=('OPTIMIZE(2)','TRUNC(OPT)','NUMPROC(PFD)',
//             'FASTSRT','RENT','NOSSRANGE','LIST','OFFSET','XREF')
//STEPLIB  DD DSN=IGY.V6R4M0.SIGYCOMP,DISP=SHR
//SYSIN    DD DSN=BANK.SOURCE.COBOL(DAYENDBT),DISP=SHR
//SYSLIB   DD DSN=BANK.SOURCE.COPYLIB,DISP=SHR
//         DD DSN=CICS.SDFHCOB,DISP=SHR
//SYSLIN   DD DSN=&&OBJECT,DISP=(NEW,PASS),
//            UNIT=SYSDA,SPACE=(CYL,(5,2))
//SYSPRINT DD SYSOUT=*
//SYSUT1   DD UNIT=SYSDA,SPACE=(CYL,(10,5))
//SYSUT2   DD UNIT=SYSDA,SPACE=(CYL,(10,5))
//SYSUT3   DD UNIT=SYSDA,SPACE=(CYL,(10,5))
//SYSUT4   DD UNIT=SYSDA,SPACE=(CYL,(10,5))
//SYSUT5   DD UNIT=SYSDA,SPACE=(CYL,(10,5))
//SYSUT6   DD UNIT=SYSDA,SPACE=(CYL,(10,5))
//SYSUT7   DD UNIT=SYSDA,SPACE=(CYL,(10,5))

32.2.2 TRUNC

The TRUNC option controls how the compiler handles BINARY (COMP) data items when their values exceed the number of digits specified in the PICTURE clause:

  • TRUNC(STD) --- Truncates binary values to the number of digits in the PIC clause. This generates extra instructions for every binary arithmetic operation to ensure truncation.
  • TRUNC(OPT) --- Assumes the programmer ensures values do not exceed the PIC size. No truncation instructions are generated, resulting in faster binary arithmetic.
  • TRUNC(BIN) --- Treats binary items as full binary values regardless of the PIC clause. Useful for interfacing with non-COBOL programs, but it generates the most overhead of the three options.
      *================================================================*
      * TRUNC(OPT) vs TRUNC(STD) impact example                        *
      * With TRUNC(STD), every binary operation includes extra          *
      * instructions to truncate to the PIC size.                       *
      * With TRUNC(OPT), the compiler trusts that values fit.           *
      *================================================================*

       01  WS-COUNTERS.
           05  WS-RECORD-COUNT    PIC S9(9)  COMP.
           05  WS-LOOP-INDEX      PIC S9(4)  COMP.
           05  WS-TABLE-SIZE      PIC S9(4)  COMP VALUE 1000.

      * With TRUNC(STD), this simple ADD generates approximately
      * 6-8 machine instructions including a divide to truncate.
      * With TRUNC(OPT), it generates 1-2 instructions.
           ADD 1 TO WS-RECORD-COUNT.

32.2.3 NUMPROC

The NUMPROC option controls how the compiler handles sign processing for packed decimal (COMP-3) and zoned decimal (DISPLAY) data:

  • NUMPROC(NOPFD) --- The compiler generates code to "fix" the sign on every numeric operation, converting any valid sign representation to the preferred sign. This is the safest but slowest option.
  • NUMPROC(PFD) --- The compiler assumes all numeric data already has preferred signs. No sign-fixing instructions are generated. This is significantly faster for programs that do heavy numeric processing.
  • NUMPROC(MIG) --- A migration option that mimics the sign handling of older compilers to ease conversion. Not recommended for new development.

Key Concept

For programs that perform intensive decimal arithmetic --- such as interest calculation, general ledger posting, or end-of-day settlement --- switching from NUMPROC(NOPFD) to NUMPROC(PFD) can reduce CPU consumption for arithmetic operations by 10-20%. However, you must ensure that all input data has valid preferred signs, or you may get incorrect results.
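A common way to get the safety of NUMPROC(NOPFD) with the speed of NUMPROC(PFD) is to cleanse data at the edge of the system: validate and re-sign incoming fields in a routine compiled with NOPFD, then let the PFD-compiled calculation modules assume preferred signs. A minimal sketch of the edge check (the data names and the 9100 paragraph are illustrative):

      *    Edge validation, in a module compiled with NUMPROC(NOPFD).
      *    The class test screens out corrupt digits or signs; the
      *    ADD rewrites the field with a preferred sign before it
      *    reaches modules compiled with NUMPROC(PFD).
           IF WS-INPUT-AMOUNT IS NUMERIC
               ADD 0 TO WS-INPUT-AMOUNT
           ELSE
               ADD 1 TO WS-REJECT-COUNT
               PERFORM 9100-REJECT-RECORD
           END-IF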

32.2.4 FASTSRT

The FASTSRT option tells the compiler to allow DFSORT (or SyncSort) to manage the I/O for SORT input and output files directly, bypassing COBOL's I/O routines. This can dramatically reduce the overhead of SORT operations:

      *================================================================*
      * SORT optimization with FASTSRT                                  *
      * When FASTSRT is active, the sort product handles all I/O        *
      * for the USING and GIVING files directly, eliminating the        *
      * overhead of COBOL's file management routines.                   *
      *================================================================*

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT SORT-FILE  ASSIGN TO SORTWORK.
           SELECT INPUT-FILE ASSIGN TO INFILE.
           SELECT OUTPUT-FILE ASSIGN TO OUTFILE.

       DATA DIVISION.
       FILE SECTION.
       SD  SORT-FILE.
       01  SORT-RECORD.
           05  SR-ACCOUNT-NUMBER  PIC X(10).
           05  SR-TRANS-DATE      PIC X(8).
           05  SR-TRANS-TIME      PIC X(6).
           05  SR-TRANS-AMOUNT    PIC S9(11)V99 COMP-3.
           05  SR-TRANS-TYPE      PIC X(2).
           05  FILLER             PIC X(171).

       FD  INPUT-FILE.
       01  INPUT-RECORD           PIC X(200).

       FD  OUTPUT-FILE.
       01  OUTPUT-RECORD          PIC X(200).

       PROCEDURE DIVISION.
       0000-MAIN-LOGIC.
      *    With FASTSRT, this SORT with USING/GIVING allows the
      *    sort product to handle all I/O directly --- much faster
      *    than using INPUT PROCEDURE / OUTPUT PROCEDURE.
           SORT SORT-FILE
               ON ASCENDING KEY SR-ACCOUNT-NUMBER
               ON ASCENDING KEY SR-TRANS-DATE
               ON ASCENDING KEY SR-TRANS-TIME
               USING INPUT-FILE
               GIVING OUTPUT-FILE

           IF SORT-RETURN NOT = 0
               DISPLAY 'SORT FAILED. SORT RETURN CODE: '
                       SORT-RETURN
                   UPON CONSOLE
               MOVE 16 TO RETURN-CODE
           END-IF

           STOP RUN.

32.2.5 Other Performance-Related Options

  • SSRANGE --- Adds range checking for subscripts and reference modification, with a typical overhead of 5-15%. Use in test; consider NOSSRANGE in production once the program is validated.
  • NOTEST --- Removes debugging hooks, reducing code size and improving performance. Use in production.
  • AWO --- APPLY WRITE-ONLY: buffers output writes for better I/O efficiency. Use for sequential output files.
  • BLOCK0 --- Applies BLOCK CONTAINS 0 to eligible QSAM files that lack a BLOCK CONTAINS clause, so the system chooses the block size. Use for optimal blocking.
  • RENT --- Generates reentrant code, required for CICS and recommended for batch. Always use.
  • ARITH(EXTEND) --- Allows up to 31 digits in arithmetic, with slightly more overhead at large precisions. Use only when you need extended precision.

32.3 Efficient Coding Techniques

Beyond compiler options, the way you write your COBOL code has a significant impact on performance. This section covers the most impactful coding techniques.

32.3.1 SEARCH vs. SEARCH ALL

COBOL provides two table search verbs: SEARCH (linear search) and SEARCH ALL (binary search). The performance difference is dramatic for large tables:

  • SEARCH examines entries sequentially from the current index position. For a table of N entries, it averages N/2 comparisons to find an entry.
  • SEARCH ALL uses a binary search algorithm, requiring at most log2(N) comparisons. For a table of 10,000 entries, SEARCH averages 5,000 comparisons; SEARCH ALL requires at most 14.
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SRCHPERF.
      *================================================================*
      * TABLE SEARCH PERFORMANCE COMPARISON                             *
      * Demonstrates SEARCH ALL (binary) vs SEARCH (linear)             *
      * for a rate lookup table used in interest calculations.          *
      *                                                                 *
      * Cross-reference: Chapter 27 (Table Handling)                    *
      *================================================================*

       DATA DIVISION.
       WORKING-STORAGE SECTION.

       01  WS-RATE-TABLE.
           05  WS-RATE-ENTRY OCCURS 500 TIMES
               ASCENDING KEY IS WS-RATE-PRODUCT-CODE
               INDEXED BY WS-RATE-IDX.
               10  WS-RATE-PRODUCT-CODE   PIC X(6).
               10  WS-RATE-TIER-CODE      PIC X(2).
               10  WS-RATE-EFFECTIVE-DATE PIC 9(8).
               10  WS-RATE-VALUE          PIC 9V9(6) COMP-3.
               10  WS-RATE-DESCRIPTION    PIC X(30).

       01  WS-SEARCH-KEY              PIC X(6).
       01  WS-FOUND-RATE              PIC 9V9(6).
       01  WS-FOUND-FLAG              PIC 9 VALUE 0.
           88  WS-RATE-FOUND          VALUE 1.
           88  WS-RATE-NOT-FOUND      VALUE 0.

       PROCEDURE DIVISION.

       1000-BINARY-SEARCH-RATE.
      *    SEARCH ALL: Binary search - O(log N) performance
      *    The table MUST be in ascending order by the KEY field
      *    and the ASCENDING KEY clause must be specified in the
      *    OCCURS clause.
           MOVE 0 TO WS-FOUND-FLAG

           SEARCH ALL WS-RATE-ENTRY
               AT END
                   MOVE 0 TO WS-FOUND-FLAG
               WHEN WS-RATE-PRODUCT-CODE(WS-RATE-IDX)
                   = WS-SEARCH-KEY
                   MOVE 1 TO WS-FOUND-FLAG
                   MOVE WS-RATE-VALUE(WS-RATE-IDX)
                       TO WS-FOUND-RATE
           END-SEARCH.

       2000-LINEAR-SEARCH-RATE.
      *    SEARCH: Linear search - O(N) performance
      *    Much slower for large tables, but does not require
      *    the table to be sorted.
           SET WS-RATE-IDX TO 1
           MOVE 0 TO WS-FOUND-FLAG

           SEARCH WS-RATE-ENTRY
               AT END
                   MOVE 0 TO WS-FOUND-FLAG
               WHEN WS-RATE-PRODUCT-CODE(WS-RATE-IDX)
                   = WS-SEARCH-KEY
                   MOVE 1 TO WS-FOUND-FLAG
                   MOVE WS-RATE-VALUE(WS-RATE-IDX)
                       TO WS-FOUND-RATE
           END-SEARCH.

32.3.2 PERFORM VARYING Optimization

The PERFORM VARYING statement is one of the most frequently executed statements in COBOL programs. Small optimizations in loop processing can have a large cumulative effect when the loop executes millions of times:

      *================================================================*
      * PERFORM VARYING OPTIMIZATION TECHNIQUES                         *
      *================================================================*

        01  WS-LOOP-VARS.
            05  WS-IDX                 PIC S9(8) COMP.
            05  WS-MAX-IDX             PIC S9(8) COMP.
            05  WS-ITEM-COUNT          PIC S9(8) COMP.
            05  WS-TOTAL               PIC S9(13)V99 COMP-3.
            05  WS-TEMP                PIC S9(11)V99 COMP-3.
            05  WS-AVERAGE             PIC S9(13)V99 COMP-3.
            05  WS-SEARCH-ACCOUNT      PIC X(10).
            05  WS-UPPER-SEARCH-ACCT   PIC X(10).

       01  WS-TRANSACTION-TABLE.
           05  WS-TRANS-COUNT         PIC S9(8) COMP.
           05  WS-TRANS-ENTRY OCCURS 10000 TIMES
               INDEXED BY WS-TRANS-IDX.
               10  WS-TRANS-ACCT      PIC X(10).
               10  WS-TRANS-AMOUNT    PIC S9(11)V99 COMP-3.
               10  WS-TRANS-TYPE      PIC X(2).

      *--- TECHNIQUE 1: Use INDEXED BY instead of a data item -----*
      *    Index names are stored as displacement values, eliminating
      *    the multiplication needed with subscripts.
      *
      *    SLOWER (subscript - requires multiplication on each access):
      *    PERFORM VARYING WS-IDX FROM 1 BY 1
      *        UNTIL WS-IDX > WS-TRANS-COUNT
      *        ADD WS-TRANS-AMOUNT(WS-IDX) TO WS-TOTAL
      *    END-PERFORM
      *
      *    FASTER (index - displacement is pre-calculated):
           PERFORM VARYING WS-TRANS-IDX FROM 1 BY 1
               UNTIL WS-TRANS-IDX > WS-TRANS-COUNT
               ADD WS-TRANS-AMOUNT(WS-TRANS-IDX) TO WS-TOTAL
           END-PERFORM

      *--- TECHNIQUE 2: Move invariant operations outside the loop -*
      *
      *    SLOWER (function called on every iteration):
      *    PERFORM VARYING WS-TRANS-IDX FROM 1 BY 1
      *        UNTIL WS-TRANS-IDX > WS-TRANS-COUNT
      *        IF WS-TRANS-ACCT(WS-TRANS-IDX) =
      *           FUNCTION UPPER-CASE(WS-SEARCH-ACCOUNT)
      *            ADD WS-TRANS-AMOUNT(WS-TRANS-IDX) TO WS-TOTAL
      *        END-IF
      *    END-PERFORM
      *
      *    FASTER (function called once before the loop):
           MOVE FUNCTION UPPER-CASE(WS-SEARCH-ACCOUNT)
               TO WS-UPPER-SEARCH-ACCT

           PERFORM VARYING WS-TRANS-IDX FROM 1 BY 1
               UNTIL WS-TRANS-IDX > WS-TRANS-COUNT
               IF WS-TRANS-ACCT(WS-TRANS-IDX) =
                   WS-UPPER-SEARCH-ACCT
                   ADD WS-TRANS-AMOUNT(WS-TRANS-IDX) TO WS-TOTAL
               END-IF
           END-PERFORM.

      *--- TECHNIQUE 3: Minimize operations inside the loop --------*
      *
      *    SLOWER (multiple operations per iteration):
      *    PERFORM VARYING WS-TRANS-IDX FROM 1 BY 1
      *        UNTIL WS-TRANS-IDX > WS-TRANS-COUNT
      *        MOVE WS-TRANS-AMOUNT(WS-TRANS-IDX) TO WS-TEMP
      *        ADD WS-TEMP TO WS-TOTAL
      *        ADD 1 TO WS-ITEM-COUNT
      *        COMPUTE WS-AVERAGE = WS-TOTAL / WS-ITEM-COUNT
      *    END-PERFORM
      *
      *    FASTER (compute average once after the loop):
           MOVE 0 TO WS-TOTAL
           MOVE 0 TO WS-ITEM-COUNT
           PERFORM VARYING WS-TRANS-IDX FROM 1 BY 1
               UNTIL WS-TRANS-IDX > WS-TRANS-COUNT
               ADD WS-TRANS-AMOUNT(WS-TRANS-IDX) TO WS-TOTAL
               ADD 1 TO WS-ITEM-COUNT
           END-PERFORM
           IF WS-ITEM-COUNT > 0
               COMPUTE WS-AVERAGE = WS-TOTAL / WS-ITEM-COUNT
           END-IF.

32.3.3 Data Type Efficiency

The choice of data types in COBOL has a direct impact on CPU consumption. Different data types have different costs for arithmetic, comparison, and move operations:

      *================================================================*
      * DATA TYPE PERFORMANCE CHARACTERISTICS                           *
      *================================================================*

       01  WS-DATA-TYPES.
      *    COMP (BINARY) - Fastest for arithmetic and comparisons
      *    when used as subscripts, loop counters, and flags.
      *    Uses hardware binary arithmetic instructions.
           05  WS-BINARY-COUNTER      PIC S9(8) COMP.
           05  WS-BINARY-FLAG         PIC S9(4) COMP.

      *    COMP-3 (PACKED DECIMAL) - Best for financial calculations.
      *    Uses hardware packed decimal instructions.
      *    Ideal for amounts, balances, quantities.
           05  WS-PACKED-AMOUNT       PIC S9(13)V99 COMP-3.
           05  WS-PACKED-RATE         PIC S9V9(6) COMP-3.

      *    DISPLAY (ZONED DECIMAL) - Slowest for arithmetic.
      *    Requires conversion to packed or binary before operations.
      *    Use only for data that will be displayed/printed as-is.
           05  WS-DISPLAY-AMOUNT      PIC 9(13)V99.

      *    COMP-5 (NATIVE BINARY) - Same as COMP but always uses
      *    full binary range regardless of TRUNC option.
      *    Use for interfacing with non-COBOL programs.
           05  WS-NATIVE-BINARY       PIC S9(8) COMP-5.

      *================================================================*
      * PERFORMANCE RULE: For counters, subscripts, and flags,          *
      * always use PIC S9(8) COMP. This maps to a fullword (4 bytes)   *
      * and uses the fastest hardware instructions.                     *
      *                                                                 *
      * For financial amounts, use COMP-3 (packed decimal).             *
      * Avoid DISPLAY for any field involved in arithmetic.             *
      *================================================================*

      *    SLOWER - mixing DISPLAY and COMP-3 causes conversions:
      *    ADD WS-DISPLAY-AMOUNT TO WS-PACKED-AMOUNT
      *    (The compiler must convert DISPLAY to packed decimal
      *     before the addition, then convert back)

      *    FASTER - keep all arithmetic operands as COMP-3:
           ADD WS-PACKED-RATE TO WS-PACKED-AMOUNT.

Key Concept

The golden rule of COBOL data types for performance is: use COMP (binary) for counters, subscripts, loop variables, and flags; use COMP-3 (packed decimal) for financial amounts and arithmetic operands; use DISPLAY only for fields that are read from or written to external files in character format. Never perform arithmetic on DISPLAY fields if you can avoid it. This principle was also discussed in Chapter 28 when covering numeric data types.

32.3.4 Efficient EVALUATE vs. Nested IF

When testing multiple conditions, EVALUATE is generally more efficient and readable than nested IF statements, particularly when the compiler can optimize it into a branch table:

      *    EFFICIENT: EVALUATE generates optimized branch logic
           EVALUATE WS-TRANS-TYPE
               WHEN 'DP'
                   PERFORM 5100-PROCESS-DEPOSIT
               WHEN 'WD'
                   PERFORM 5200-PROCESS-WITHDRAWAL
               WHEN 'XF'
                   PERFORM 5300-PROCESS-TRANSFER
               WHEN 'PY'
                   PERFORM 5400-PROCESS-PAYMENT
               WHEN 'FE'
                   PERFORM 5500-PROCESS-FEE
               WHEN OTHER
                   PERFORM 5900-PROCESS-UNKNOWN
           END-EVALUATE.
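For contrast, the equivalent nested IF chain re-tests the transaction type at each level, evaluating half the conditions on average before finding a match:

      *    LESS EFFICIENT: each level is a separate comparison,
      *    and the nesting obscures the parallel structure.
      *    IF WS-TRANS-TYPE = 'DP'
      *        PERFORM 5100-PROCESS-DEPOSIT
      *    ELSE
      *        IF WS-TRANS-TYPE = 'WD'
      *            PERFORM 5200-PROCESS-WITHDRAWAL
      *        ELSE
      *            IF WS-TRANS-TYPE = 'XF'
      *                PERFORM 5300-PROCESS-TRANSFER
      *            ELSE
      *                PERFORM 5900-PROCESS-UNKNOWN
      *            END-IF
      *        END-IF
      *    END-IF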

32.4 File I/O Optimization

I/O is almost always the dominant factor in batch program elapsed time. A program that processes 10 million records can spend 90% or more of its elapsed time waiting for I/O. Optimizing I/O is therefore the highest-leverage activity for batch performance tuning.

32.4.1 Block Size Optimization

The block size determines how many logical records are read or written in a single physical I/O operation. A larger block size means fewer I/O operations, which directly reduces elapsed time:

//*================================================================*
//* BLOCK SIZE OPTIMIZATION EXAMPLES                                *
//*================================================================*
//*
//* POOR PERFORMANCE - small block size, many I/O operations:
//* //CUSTMAST DD DSN=BANK.PROD.CUSTMAST,DISP=SHR,
//* //            DCB=(RECFM=FB,LRECL=200,BLKSIZE=200)
//* (1 record per block = 1 I/O per record)
//*
//* GOOD PERFORMANCE - optimal block size:
//CUSTMAST DD DSN=BANK.PROD.CUSTMAST,DISP=SHR,
//            DCB=(RECFM=FB,LRECL=200,BLKSIZE=27800)
//* (139 records per block, using half-track blocking)
//*
//* BEST PRACTICE - let the system determine optimal block size:
//* //CUSTMAST DD DSN=BANK.PROD.CUSTMAST,DISP=SHR,
//* //            DCB=(RECFM=FB,LRECL=200,BLKSIZE=0)
//* BLKSIZE=0 tells DFSMS to choose the optimal block size
//* based on the device geometry.
//*
//*================================================================*
//* BLOCK SIZE CALCULATION FOR 3390 DISK:                           *
//* Track capacity = 56,664 bytes                                   *
//* Half-track = 27,998 bytes                                       *
//* For LRECL=200: BLKSIZE = (27998 / 200) * 200 = 27800           *
//* This fits 139 records per block.                                *
//*                                                                 *
//* Impact: Processing 10,000,000 records                           *
//*   BLKSIZE=200:  10,000,000 I/Os                                 *
//*   BLKSIZE=27800:    71,943 I/Os                                 *
//*   Reduction: 99.3% fewer I/O operations                         *
//*================================================================*

32.4.2 Buffer Optimization (BUFNO and BUFSIZE)

Buffers are areas of memory that hold blocks of data being read from or written to files. More buffers allow the system to read ahead (for input) or write behind (for output), overlapping I/O with processing:

//*================================================================*
//* BUFFER OPTIMIZATION                                             *
//*================================================================*
//*
//* For sequential input files - read-ahead buffering:
//TRANFILE DD DSN=BANK.PROD.DAYTRANS,DISP=SHR,
//            DCB=(BUFNO=20)
//* 20 buffers allows aggressive read-ahead, keeping the
//* program processing while the next blocks are being read.
//*
//* For sequential output files - write-behind buffering:
//OUTFILE  DD DSN=BANK.PROD.DAYEND.OUTPUT,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,
//            SPACE=(CYL,(100,50),RLSE),
//            DCB=(RECFM=FB,LRECL=200,BLKSIZE=27800,BUFNO=20)
//*
//* For VSAM files - use BUFFERSPACE or AMP parameter:
//ACCTMAST DD DSN=BANK.PROD.ACCTMAST,DISP=SHR,
//            AMP=('BUFND=20,BUFNI=10')
//* BUFND=20: 20 data buffers for data CI reads
//* BUFNI=10: 10 index buffers to cache the VSAM index

32.4.3 VSAM Tuning

VSAM (Virtual Storage Access Method) files are the most common file organization for online and random-access data on z/OS. As discussed in Chapter 22, VSAM performance depends on several factors.

CI and CA Splits:

A Control Interval (CI) split occurs when a record is inserted into a full CI. The CI must be split into two, with half the records moved to a new CI. A Control Area (CA) split is even more expensive --- it occurs when all CIs in a CA are full and a new CI cannot be allocated within the CA.
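Split counts can be checked at any time with an IDCAMS LISTCAT step; the SPLITS-CI and SPLITS-CA fields appear in the STATISTICS section of the output. A sketch of the monitoring step, using the cluster name from the examples in this chapter:

//LISTSTAT EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  LISTCAT ENTRIES(BANK.PROD.ACCTMAST) ALL
/*
//* In the STATISTICS section, steadily rising SPLITS-CI and
//* SPLITS-CA counts mean the file needs more FREESPACE or a
//* reorganization (e.g., REPRO to a freshly defined cluster).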

       IDENTIFICATION DIVISION.
       PROGRAM-ID. VSAMPERF.
      *================================================================*
      * VSAM PERFORMANCE OPTIMIZATION                                   *
      * Demonstrates techniques for efficient VSAM access               *
      *================================================================*

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT ACCOUNT-FILE
               ASSIGN TO ACCTMAST
               ORGANIZATION IS INDEXED
               ACCESS MODE IS DYNAMIC
               RECORD KEY IS ACCT-KEY
               FILE STATUS IS WS-FILE-STATUS.

       DATA DIVISION.
       FILE SECTION.
       FD  ACCOUNT-FILE.
       01  ACCOUNT-RECORD.
           05  ACCT-KEY               PIC X(14).
           05  ACCT-DATA              PIC X(186).

       WORKING-STORAGE SECTION.
       01  WS-FILE-STATUS             PIC XX.
       01  WS-RECORDS-READ            PIC S9(8) COMP VALUE 0.

       PROCEDURE DIVISION.

       1000-SEQUENTIAL-BROWSE-TECHNIQUE.
      *    For processing many records in key sequence,
      *    sequential (browse) access is MUCH faster than
      *    random READ because it uses sequential buffering.
      *
      *    SLOWER - random reads for sequential processing:
      *    PERFORM VARYING WS-IDX FROM 1 BY 1
      *        UNTIL WS-IDX > WS-NUM-ACCOUNTS
      *        MOVE WS-ACCT-TABLE(WS-IDX) TO ACCT-KEY
      *        READ ACCOUNT-FILE
      *            INVALID KEY CONTINUE
      *        END-READ
      *    END-PERFORM
      *
      *    FASTER - sequential browse when reading many records:
           MOVE LOW-VALUES TO ACCT-KEY
           START ACCOUNT-FILE
               KEY IS NOT LESS THAN ACCT-KEY
               INVALID KEY
                   DISPLAY 'START FAILED' UPON CONSOLE
           END-START

           PERFORM UNTIL WS-FILE-STATUS NOT = '00'
               READ ACCOUNT-FILE NEXT
                   AT END
                       CONTINUE
               END-READ
               IF WS-FILE-STATUS = '00'
                   ADD 1 TO WS-RECORDS-READ
                   PERFORM 2000-PROCESS-ACCOUNT
               END-IF
           END-PERFORM.

       3000-BATCH-UPDATE-TECHNIQUE.
      *    For batch updates, sort the updates into key sequence
      *    and process sequentially. This minimizes random I/O
      *    and CI splits by updating records in physical order.
           CONTINUE.
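The pre-sort itself can be a standalone sort step ahead of the update program. The control statement below sorts on a 14-byte character key starting in position 1, matching ACCT-KEY; the update dataset names are illustrative:

//SORTUPD  EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=BANK.PROD.ACCT.UPDATES,DISP=SHR
//SORTOUT  DD DSN=BANK.PROD.ACCT.UPDATES.SORTED,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(CYL,(50,10),RLSE)
//SYSIN    DD *
  SORT FIELDS=(1,14,CH,A)
/*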

VSAM Performance JCL:

//*================================================================*
//* VSAM CLUSTER DEFINITION WITH PERFORMANCE OPTIONS                *
//*================================================================*
//DEFVSAM  EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  DELETE BANK.PROD.ACCTMAST CLUSTER PURGE
  SET MAXCC = 0

  DEFINE CLUSTER ( -
    NAME(BANK.PROD.ACCTMAST) -
    INDEXED -
    RECORDSIZE(200 200) -
    KEYS(14 0) -
    SHAREOPTIONS(2 3) -
    SPEED -
    FREESPACE(20 10) -
    CONTROLINTERVALSIZE(4096) -
    ) -
  DATA ( -
    NAME(BANK.PROD.ACCTMAST.DATA) -
    CYLINDERS(500 100) -
    CONTROLINTERVALSIZE(4096) -
    BUFFERSPACE(1048576) -
    ) -
  INDEX ( -
    NAME(BANK.PROD.ACCTMAST.INDEX) -
    CYLINDERS(10 5) -
    CONTROLINTERVALSIZE(2048) -
    )
/*
//*
//* FREESPACE(20 10):
//*   20% free space in each CI for future inserts
//*   10% of CIs left empty in each CA for CI splits
//*
//* SHAREOPTIONS(2 3):
//*   2 = Within one system: one writer plus any number of
//*       readers; read integrity is the readers' responsibility
//*   3 = Across systems: fully shared with no VSAM protection;
//*       applications must serialize access themselves (e.g.,
//*       via ENQ/GRS)
//*
//* BUFFERSPACE(1048576):
//*   1 MB of buffer space for data I/O

Key Concept

The most common VSAM performance problem is excessive CI and CA splits caused by insufficient FREESPACE. Monitor CI/CA split counts using IDCAMS LISTCAT and reorganize VSAM files before splits become a significant performance drag. A VSAM file with heavy insert activity should have FREESPACE(20 10) or higher, adjusted based on monitoring data. This was discussed in Chapter 29 when covering VSAM file design.


32.5 DB2 Performance

For COBOL programs that access DB2, the SQL statements are often the dominant consumer of both CPU and elapsed time. Efficient SQL is critical for performance.

32.5.1 Efficient SQL Coding

The way you write SQL directly affects how DB2 accesses data. Even small changes in SQL syntax can cause DB2 to choose a dramatically different access path:

       IDENTIFICATION DIVISION.
       PROGRAM-ID. DB2PERF.
      *================================================================*
      * DB2 PERFORMANCE OPTIMIZATION EXAMPLES                           *
      * Demonstrates efficient SQL coding techniques for COBOL.         *
      *                                                                 *
      * Cross-reference: Chapter 24 (DB2 Programming),                  *
      *                  Chapter 30 (SQL Best Practices)                 *
      *================================================================*

       DATA DIVISION.
       WORKING-STORAGE SECTION.
           EXEC SQL INCLUDE SQLCA END-EXEC.

       01  WS-HOST-VARS.
           05  WS-ACCT-NUMBER         PIC X(10).
           05  WS-BRANCH-CODE         PIC X(4).
           05  WS-START-DATE          PIC X(10).
           05  WS-END-DATE            PIC X(10).
           05  WS-TRANS-TOTAL         PIC S9(13)V99 COMP-3.
           05  WS-TRANS-COUNT         PIC S9(8) COMP.
           05  WS-ACCT-BALANCE        PIC S9(13)V99 COMP-3.
           05  WS-ACCT-NAME           PIC X(30).
           05  WS-ACCT-STATUS         PIC X(1).
           05  WS-NULL-IND            PIC S9(4) COMP.

       PROCEDURE DIVISION.

       1000-USE-INDEX-FRIENDLY-PREDICATES.
      *    INEFFICIENT - function on column prevents index use:
      *    EXEC SQL
      *        SELECT ACCT_NUMBER, ACCT_NAME
      *        INTO :WS-ACCT-NUMBER, :WS-ACCT-NAME
      *        FROM BANKDB.CUSTOMER
      *        WHERE SUBSTR(ACCT_NUMBER,1,4) = :WS-BRANCH-CODE
      *    END-EXEC
      *
      *    EFFICIENT - range predicate allows index scan:
           EXEC SQL
               SELECT ACCT_NUMBER, ACCT_NAME
               INTO :WS-ACCT-NUMBER, :WS-ACCT-NAME
               FROM BANKDB.CUSTOMER
               WHERE ACCT_NUMBER >= :WS-BRANCH-CODE || '000000'
               AND   ACCT_NUMBER <= :WS-BRANCH-CODE || '999999'
           END-EXEC.

       2000-SELECT-ONLY-NEEDED-COLUMNS.
      *    INEFFICIENT - SELECT * reads all columns:
      *    EXEC SQL
      *        SELECT *
      *        INTO :WS-CUSTOMER-RECORD
      *        FROM BANKDB.CUSTOMER
      *        WHERE ACCT_NUMBER = :WS-ACCT-NUMBER
      *    END-EXEC
      *
      *    EFFICIENT - select only what you need:
           EXEC SQL
               SELECT ACCT_NAME, ACCT_BALANCE, ACCT_STATUS
               INTO :WS-ACCT-NAME, :WS-ACCT-BALANCE,
                    :WS-ACCT-STATUS
               FROM BANKDB.CUSTOMER
               WHERE ACCT_NUMBER = :WS-ACCT-NUMBER
           END-EXEC.

       3000-USE-HOST-VARIABLES-NOT-LITERALS.
      *    INEFFICIENT - with dynamic SQL, each distinct literal
      *    value becomes a separate statement in the dynamic
      *    statement cache; with static SQL, literals tie the
      *    bound access path to a single value:
      *    EXEC SQL
      *        SELECT COUNT(*)
      *        INTO :WS-TRANS-COUNT
      *        FROM BANKDB.TRANSACTIONS
      *        WHERE BRANCH_CODE = 'B001'
      *        AND   TRANS_DATE >= '2025-01-01'
      *    END-EXEC
      *
      *    EFFICIENT - host variables allow access path reuse:
           MOVE 'B001' TO WS-BRANCH-CODE
           MOVE '2025-01-01' TO WS-START-DATE
           EXEC SQL
               SELECT COUNT(*)
               INTO :WS-TRANS-COUNT
               FROM BANKDB.TRANSACTIONS
               WHERE BRANCH_CODE = :WS-BRANCH-CODE
               AND   TRANS_DATE >= :WS-START-DATE
           END-EXEC.

       4000-AVOID-UNNECESSARY-SORTS.
      *    INEFFICIENT - ORDER BY on a non-indexed column
      *    forces a sort operation in DB2:
      *    EXEC SQL
      *        DECLARE C-UNSORTED CURSOR FOR
      *        SELECT ACCT_NUMBER, TRANS_DATE, TRANS_AMOUNT
      *        FROM   BANKDB.TRANSACTIONS
      *        WHERE  BRANCH_CODE = :WS-BRANCH-CODE
      *        ORDER BY TRANS_AMOUNT DESC
      *    END-EXEC
      *
      *    EFFICIENT - ORDER BY on the indexed key avoids a sort:
           EXEC SQL
               DECLARE C-SORTED CURSOR FOR
               SELECT ACCT_NUMBER, TRANS_DATE, TRANS_AMOUNT
               FROM   BANKDB.TRANSACTIONS
               WHERE  BRANCH_CODE = :WS-BRANCH-CODE
               ORDER BY ACCT_NUMBER, TRANS_DATE
           END-EXEC.

       5000-USE-FETCH-FOR-MULTIPLE-ROWS.
      *    EFFICIENT - MULTI-ROW FETCH retrieves many rows in
      *    one DB2 call, reducing the number of FETCH calls
      *    from one per row to one per block of rows:
            EXEC SQL
                DECLARE C-MULTI CURSOR
                    WITH ROWSET POSITIONING FOR
                SELECT ACCT_NUMBER, TRANS_DATE, TRANS_AMOUNT
                FROM   BANKDB.TRANSACTIONS
                WHERE  BRANCH_CODE = :WS-BRANCH-CODE
                AND    TRANS_DATE BETWEEN :WS-START-DATE
                                     AND :WS-END-DATE
                ORDER BY ACCT_NUMBER
            END-EXEC

            EXEC SQL OPEN C-MULTI END-EXEC

            EXEC SQL
                FETCH NEXT ROWSET FROM C-MULTI
                FOR 100 ROWS
                INTO :WS-ACCT-ARRAY,
                     :WS-DATE-ARRAY,
                     :WS-AMOUNT-ARRAY
            END-EXEC.
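The saving is easy to quantify. Assuming an illustrative result set of one million rows and a rowset of 100, this sketch counts the cursor calls each approach issues (the INTO arrays themselves would be declared with OCCURS 100):

```python
import math

def fetch_calls(total_rows, rowset_size=1):
    # One DB2 call per rowset; single-row FETCH is rowset_size = 1.
    return math.ceil(total_rows / rowset_size)

single_row = fetch_calls(1_000_000)        # 1,000,000 calls
multi_row  = fetch_calls(1_000_000, 100)   # 10,000 calls
print(single_row // multi_row)             # -> 100x fewer API crossings
```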

32.5.2 Using EXPLAIN to Analyze Access Paths

The DB2 EXPLAIN facility reveals how DB2 will access data for your SQL statements. This is essential for understanding and optimizing performance:

//*================================================================*
//* EXPLAIN THE ACCESS PATH FOR A COBOL PROGRAM'S SQL               *
//*================================================================*
//EXPLAIN  EXEC PGM=IKJEFT01,DYNAMNBR=20
//STEPLIB  DD DSN=DSN.V13R1.SDSNLOAD,DISP=SHR
//SYSTSPRT DD SYSOUT=*
//SYSPRINT DD SYSOUT=*
//SYSTSIN  DD *
  DSN SYSTEM(DB2P)
  BIND PLAN(DAYENDPL) -
       MEMBER(DAYENDBT) -
       ACTION(REPLACE) -
       EXPLAIN(YES) -
       ISOLATION(CS)
  END
/*
//*
//* After the bind, query the PLAN_TABLE to see access paths:
//*
//QUERY    EXEC PGM=IKJEFT01,DYNAMNBR=20
//STEPLIB  DD DSN=DSN.V13R1.SDSNLOAD,DISP=SHR
//SYSTSPRT DD SYSOUT=*
//SYSPRINT DD SYSOUT=*
//SYSTSIN  DD *
  DSN SYSTEM(DB2P)
  RUN PROGRAM(DSNTEP2) PLAN(DSNTEP2) -
      LIB('DSN.V13R1.RUNLIB.LOAD')
  END
//SYSIN    DD *
  SELECT QUERYNO, QBLOCKNO, PLANNO,
         METHOD, TNAME, ACCESSTYPE,
         MATCHCOLS, INDEXONLY, ACCESSNAME
  FROM   SYSIBM.PLAN_TABLE
  WHERE  APPLNAME = 'DAYENDPL'
  ORDER BY QUERYNO, QBLOCKNO, PLANNO;
/*

Key EXPLAIN output values to look for:

  Column      Good values                        Problem values
  ----------  ---------------------------------  ----------------------------
  ACCESSTYPE  I (index), I1 (one-fetch index)    R (tablespace scan)
  MATCHCOLS   > 0 (index columns matched)        0 (non-matching index scan)
  INDEXONLY   Y (answered from the index alone)  N (must access data pages)
  METHOD      0 (no join), 1 (nested loop join)  2 or 3 (merge scan join, or
                                                 extra sorts for ORDER BY,
                                                 GROUP BY, DISTINCT)
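When PLAN_TABLE output runs to hundreds of rows, scanning it mechanically beats eyeballing. This hypothetical Python filter applies the rules above to rows exported from SYSIBM.PLAN_TABLE (the sample rows here are illustrative, not from a real bind):

```python
# Flag access paths worth investigating. Column names follow
# SYSIBM.PLAN_TABLE; the input is assumed to be rows exported via
# DSNTEP2 or SPUFI and parsed into dicts.
def flag_access_paths(rows):
    problems = []
    for r in rows:
        if r["ACCESSTYPE"] == "R":
            problems.append((r["QUERYNO"], "tablespace scan"))
        elif r["ACCESSTYPE"].startswith("I") and r["MATCHCOLS"] == 0:
            problems.append((r["QUERYNO"], "non-matching index scan"))
        if r["METHOD"] == 3:
            problems.append((r["QUERYNO"], "sort required"))
    return problems

rows = [
    {"QUERYNO": 10, "ACCESSTYPE": "I", "MATCHCOLS": 2, "METHOD": 0},
    {"QUERYNO": 20, "ACCESSTYPE": "R", "MATCHCOLS": 0, "METHOD": 0},
    {"QUERYNO": 30, "ACCESSTYPE": "I", "MATCHCOLS": 0, "METHOD": 3},
]
for queryno, issue in flag_access_paths(rows):
    print(queryno, issue)
```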

32.5.3 Host Variable Optimization

The way you define host variables in COBOL can affect DB2 performance. Mismatched data types between host variables and DB2 columns cause DB2 to perform conversions, which adds CPU overhead:

      *================================================================*
      * HOST VARIABLE ALIGNMENT WITH DB2 COLUMN TYPES                   *
      *================================================================*

      *    DB2 Column                      COBOL Host Variable
      *    --------------------------      -------------------------
      *    CHAR(10)                        PIC X(10)
      *    VARCHAR(30)                     01 WS-NAME.
      *                                       49 WS-NAME-LEN PIC S9(4) COMP.
      *                                       49 WS-NAME-TEXT PIC X(30).
      *    DECIMAL(15,2)                   PIC S9(13)V99 COMP-3
      *    INTEGER                         PIC S9(9) COMP
      *    SMALLINT                        PIC S9(4) COMP
      *    DATE                            PIC X(10)
      *    TIMESTAMP                       PIC X(26)
      *
      *    IMPORTANT: Mismatched types cause DB2 to convert at runtime.
      *    For example, if a DB2 column is DECIMAL(15,2) and your host
      *    variable is PIC S9(13)V99 COMP-3, no conversion is needed.
      *    But if your host variable is zoned decimal (PIC S9(13)V99
      *    DISPLAY), DB2 must convert on every fetch.

       01  WS-EFFICIENT-HOST-VARS.
      *    These match DB2 column types exactly - no conversion needed
           05  WS-ACCT-NUMBER          PIC X(10).
           05  WS-ACCT-BALANCE         PIC S9(13)V99 COMP-3.
           05  WS-TRANS-COUNT          PIC S9(9) COMP.
           05  WS-TRANS-DATE           PIC X(10).
           05  WS-LAST-UPDATE          PIC X(26).

       01  WS-INEFFICIENT-HOST-VARS.
      *    These require DB2 runtime conversion - AVOID
           05  WS-BAD-BALANCE          PIC 9(13)V99.
           05  WS-BAD-COUNT            PIC 9(9).

32.6 CICS Performance

CICS performance tuning focuses on keeping transaction response times low and system throughput high. The techniques differ from batch tuning because CICS programs must share resources with many other concurrent transactions.

32.6.1 Pseudo-Conversational vs. Conversational Design

As discussed in Chapter 23, pseudo-conversational programming is essential for CICS performance. In a conversational program, the task waits while the user reads the screen, consuming a CICS task slot and all associated resources. In a pseudo-conversational program, the task ends after sending the screen and restarts when the user presses Enter:

      *================================================================*
      * PSEUDO-CONVERSATIONAL PATTERN FOR CICS PERFORMANCE              *
      * Cross-reference: Chapter 23 (CICS Programming)                  *
      *================================================================*

       PROCEDURE DIVISION.
       0000-MAIN-LOGIC.
           EVALUATE TRUE
               WHEN EIBCALEN = 0
      *            First entry - display the initial screen
                   PERFORM 1000-SEND-MAP
               WHEN EIBAID = DFHENTER
      *            User pressed Enter - process input
                   PERFORM 2000-RECEIVE-AND-PROCESS
               WHEN EIBAID = DFHPF3
      *            User pressed PF3 - exit
                   EXEC CICS RETURN END-EXEC
               WHEN OTHER
                   PERFORM 1000-SEND-MAP
           END-EVALUATE

      *    Return control to CICS - the task ENDS here.
      *    No resources are consumed while the user thinks.
           EXEC CICS RETURN
               TRANSID('BINQ')
               COMMAREA(WS-COMMAREA)
               LENGTH(LENGTH OF WS-COMMAREA)
           END-EXEC.
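The resource impact of the two designs can be estimated with Little's law: the number of task slots in use equals the transaction arrival rate times how long each task is held. The rates below are illustrative assumptions, not measurements:

```python
# Little's law: concurrent tasks = arrival rate x time each task is
# held. Arrival rate and think time below are assumed figures.
def concurrent_tasks(arrivals_per_sec, hold_time_sec):
    return arrivals_per_sec * hold_time_sec

# Conversational: the task is held for the ~20 s the user spends
# reading the screen.
print(concurrent_tasks(50, 20.0))   # -> 1000 task slots tied up
# Pseudo-conversational: the task lives only for the ~0.2 s of work.
print(concurrent_tasks(50, 0.2))    # ~10 active tasks
```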

32.6.2 BMS Optimization

BMS (Basic Mapping Support) map processing can be optimized by sending only the data that has changed:

      *================================================================*
      * BMS PERFORMANCE OPTIMIZATION TECHNIQUES                         *
      *================================================================*

       PROCEDURE DIVISION.

       1000-EFFICIENT-MAP-SEND.
      *    TECHNIQUE 1: Avoid resending static map content.
      *    On first display, send the full map (format plus data)
      *    with ERASE:
           EXEC CICS SEND MAP('ACCTMAP')
               MAPSET('ACCTSET')
               FROM(ACCTMAPO)
               ERASE
               RESP(WS-RESP)
           END-EXEC.

       1100-UPDATE-DATA-ONLY.
      *    On subsequent displays, send DATAONLY to avoid
      *    resending the entire map format:
           EXEC CICS SEND MAP('ACCTMAP')
               MAPSET('ACCTSET')
               FROM(ACCTMAPO)
               DATAONLY
               RESP(WS-RESP)
           END-EXEC.

       2000-MINIMIZE-COMMAREA-SIZE.
      *    TECHNIQUE 2: Keep the COMMAREA as small as possible.
      *    CICS saves the COMMAREA when the task ends and passes
      *    a copy back when the next task starts, so every byte
      *    adds storage and data-movement cost per interaction.
      *    Smaller = faster.
      *
      *    BAD: Storing a 2000-byte record in the COMMAREA
      *    01  WS-BIG-COMMAREA.
      *        05  CA-CUSTOMER-RECORD  PIC X(2000).
      *
      *    GOOD: Store only the key; re-read the data on restart
       01  WS-SMALL-COMMAREA.
           05  CA-STATE               PIC X(2).
           05  CA-ACCOUNT-KEY         PIC X(14).
           05  CA-SCREEN-PAGE         PIC S9(4) COMP.
           05  CA-ERROR-FLAG          PIC X(1).
      *    Total: 21 bytes instead of 2000+ bytes

32.6.3 CICS Resource Optimization

      *================================================================*
      * CICS RESOURCE OPTIMIZATION TECHNIQUES                           *
      *================================================================*

       PROCEDURE DIVISION.

       1000-EFFICIENT-FILE-READS.
      *    TECHNIQUE: Combine file reads when possible.
      *    Each EXEC CICS READ is a separate request to the
      *    CICS file control module. Minimize the number of
      *    file requests per transaction.
      *
      *    SLOWER - two separate reads:
      *    EXEC CICS READ FILE('CUSTMAST') ...
      *    EXEC CICS READ FILE('ACCTMAST') ...
      *
      *    FASTER - if data is in DB2, use a single SQL join:
           EXEC SQL
               SELECT C.CUST_NAME, A.ACCT_BALANCE,
                      A.ACCT_STATUS
               INTO :WS-CUST-NAME, :WS-ACCT-BALANCE,
                    :WS-ACCT-STATUS
               FROM BANKDB.CUSTOMER C
               INNER JOIN BANKDB.ACCT_MASTER A
               ON C.CUST_ID = A.CUST_ID
               WHERE A.ACCT_NUMBER = :WS-ACCT-NUMBER
           END-EXEC.

       2000-EFFICIENT-ENQUEUE.
      *    TECHNIQUE: Minimize the time locks are held.
      *    Read with UPDATE only when you are about to update.
      *    Do all validation BEFORE acquiring the lock.

      *    SLOWER - lock held during validation:
      *    EXEC CICS READ FILE('ACCTMAST') UPDATE ...
      *    PERFORM 3000-VALIDATE-INPUT     <-- lock held here
      *    EXEC CICS REWRITE FILE('ACCTMAST') ...

      *    FASTER - validate first, then lock briefly:
           EXEC CICS READ FILE('ACCTMAST')
               INTO(WS-ACCOUNT-RECORD)
               RIDFLD(WS-ACCT-KEY)
               RESP(WS-RESP)
           END-EXEC

           PERFORM 3000-VALIDATE-INPUT

           IF WS-INPUT-VALID
      *        Now acquire the lock and update quickly
               EXEC CICS READ FILE('ACCTMAST')
                   INTO(WS-ACCOUNT-RECORD)
                   RIDFLD(WS-ACCT-KEY)
                   UPDATE
                   RESP(WS-RESP)
               END-EXEC
      *        Verify data hasn't changed since our read
               IF WS-ACCT-BALANCE = WS-SAVED-BALANCE
                   PERFORM 4000-APPLY-UPDATE
                   EXEC CICS REWRITE FILE('ACCTMAST')
                       FROM(WS-ACCOUNT-RECORD)
                       RESP(WS-RESP)
                   END-EXEC
               ELSE
      *            Optimistic lock failure - data changed
                   EXEC CICS UNLOCK FILE('ACCTMAST')
                       RESP(WS-RESP)
                   END-EXEC
                   PERFORM 5000-HANDLE-CONCURRENT-UPDATE
               END-IF
           END-IF.
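The control flow above is the classic optimistic-locking pattern, and it is language-independent. This hedged Python sketch models the same flow with a dict standing in for the VSAM file (all names and values are hypothetical):

```python
# Language-neutral sketch of the optimistic pattern: validate against
# an unlocked read, then lock, re-check, and update.
def optimistic_update(store, key, expected_balance, delta):
    record = store[key]                 # READ ... UPDATE (lock point)
    if record["balance"] != expected_balance:
        return "CONCURRENT-UPDATE"      # data changed since first read
    record["balance"] += delta          # apply and REWRITE
    return "OK"

store = {"ACCT1": {"balance": 500.00}}
saved = store["ACCT1"]["balance"]       # unlocked first read
# ... all validation happens here, with no lock held ...
print(optimistic_update(store, "ACCT1", saved, 25.00))   # OK
print(store["ACCT1"]["balance"])                         # 525.0
```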

32.7 Memory Optimization

Memory layout in WORKING-STORAGE and the choice of data representations can significantly affect both CPU consumption and memory footprint.

32.7.1 WORKING-STORAGE Layout

The order of fields in WORKING-STORAGE can affect performance due to hardware alignment considerations and cache line utilization:

      *================================================================*
      * WORKING-STORAGE LAYOUT OPTIMIZATION                             *
      *================================================================*

      *    PRINCIPLE 1: Group frequently-accessed fields together.
      *    Fields accessed together in the same paragraph should be
      *    defined near each other so they occupy the same cache line.

      *    GOOD - related fields are adjacent:
       01  WS-TRANSACTION-PROCESSING.
           05  WS-TRANS-COUNT      PIC S9(8) COMP.
           05  WS-TRANS-TOTAL      PIC S9(13)V99 COMP-3.
           05  WS-TRANS-TYPE       PIC X(2).
           05  WS-TRANS-STATUS     PIC X(1).

      *    PRINCIPLE 2: Align COMP fields on fullword boundaries.
      *    When SYNCHRONIZED is specified, the compiler inserts
      *    slack bytes to align binary fields. Define SYNC COMP
      *    fields in groups to minimize slack.

      *    SUBOPTIMAL - slack bytes inserted for alignment:
      *    01  WS-MIXED-FIELDS.
      *        05  WS-FLAG-1       PIC X.        (1 byte)
      *                                          (3 bytes slack)
      *        05  WS-COUNTER-1    PIC S9(8) COMP. (4 bytes)
      *        05  WS-FLAG-2       PIC X.        (1 byte)
      *                                          (3 bytes slack)
      *        05  WS-COUNTER-2    PIC S9(8) COMP. (4 bytes)
      *    Total: 16 bytes with 6 bytes wasted on slack

      *    OPTIMAL - COMP fields grouped together:
       01  WS-ALIGNED-FIELDS.
           05  WS-COUNTER-1        PIC S9(8) COMP.
           05  WS-COUNTER-2        PIC S9(8) COMP.
           05  WS-FLAG-1           PIC X.
           05  WS-FLAG-2           PIC X.
      *    Total: 10 bytes with no slack
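The slack arithmetic can be checked mechanically. The sketch below is a simplified model, assuming fullword (4-byte) alignment for SYNCHRONIZED COMP fields and byte alignment for PIC X; it walks a field list in declaration order and returns the total size including padding:

```python
# Simplified model of slack-byte insertion for SYNCHRONIZED fields.
# Alignments here are assumptions: fullword COMP -> 4, PIC X -> 1.
def layout_size(fields):
    """fields: (length, alignment) tuples in declaration order."""
    offset = 0
    for length, align in fields:
        offset += (-offset) % align     # slack bytes to reach boundary
        offset += length
    return offset

mixed   = [(1, 1), (4, 4), (1, 1), (4, 4)]   # X, COMP, X, COMP
grouped = [(4, 4), (4, 4), (1, 1), (1, 1)]   # COMP, COMP, X, X
print(layout_size(mixed))     # -> 16 (6 slack bytes)
print(layout_size(grouped))   # -> 10 (no slack)
```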

      *    PRINCIPLE 3: Use COMP-3 for financial amounts.
      *    COMP-3 uses roughly half the bytes of DISPLAY
      *    for the same precision:
      *
      *    PIC S9(13)V99 DISPLAY  = 15 bytes (zoned decimal,
      *                                        sign embedded)
      *    PIC S9(13)V99 COMP-3   =  8 bytes (packed decimal)
      *    PIC S9(13)V99 COMP     =  8 bytes (binary, but slower
      *                                        for decimal arithmetic)
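The byte counts follow directly from the encodings, assuming the default embedded sign (SIGN SEPARATE adds one byte to a DISPLAY field): zoned decimal stores one digit per byte, while packed decimal stores two digits per byte plus a sign nibble. A quick check:

```python
import math

def display_bytes(digits):
    """Zoned decimal: one byte per digit, sign overpunched."""
    return digits

def comp3_bytes(digits):
    """Packed decimal: two digits per byte plus a sign nibble."""
    return math.ceil((digits + 1) / 2)

print(display_bytes(15))   # PIC S9(13)V99 DISPLAY -> 15 bytes
print(comp3_bytes(15))     # PIC S9(13)V99 COMP-3  ->  8 bytes
```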

32.7.2 Reducing Memory Footprint

For programs that process large tables or arrays, memory optimization can prevent paging and improve performance:

      *================================================================*
      * MEMORY OPTIMIZATION FOR LARGE TABLE PROCESSING                  *
      *================================================================*

       01  WS-DESIGN-CHOICES.
      *    APPROACH 1: Full table in memory (fast but memory-heavy)
      *    Use when table is small enough and accessed randomly.
           05  WS-SMALL-TABLE.
               10  WS-SMALL-ENTRY OCCURS 1000 TIMES
                   INDEXED BY WS-SM-IDX.
                   15  WS-SM-KEY       PIC X(10).
                   15  WS-SM-DATA      PIC X(40).
      *        Memory: 50,000 bytes = ~49 KB (acceptable)

      *    APPROACH 2: For large tables, load only what you need.
      *    Instead of loading 1,000,000 records into a table,
      *    process them in blocks or use a sorted file with
      *    sequential access.

      *    BAD - trying to load everything into memory:
      *    05  WS-HUGE-TABLE.
      *        10  WS-HUGE-ENTRY OCCURS 1000000 TIMES.
      *            15  WS-HG-KEY   PIC X(10).
      *            15  WS-HG-DATA  PIC X(190).
      *    Memory: 200,000,000 bytes = ~191 MB (too large!)

      *    GOOD - process in blocks of 10,000:
       01  WS-BLOCK-TABLE.
           05  WS-BLOCK-SIZE          PIC S9(8) COMP VALUE 10000.
           05  WS-BLOCK-ENTRY OCCURS 10000 TIMES
               INDEXED BY WS-BLK-IDX.
               10  WS-BLK-KEY         PIC X(10).
               10  WS-BLK-DATA        PIC X(190).
      *    Memory: 2,000,000 bytes = ~1.9 MB (manageable)

32.8 Batch Job Performance

Batch processing is the heart of mainframe workloads in financial institutions. End-of-day processing, statement generation, interest calculation, and regulatory reporting all run as batch jobs. Optimizing the batch cycle is critical for meeting processing windows.

32.8.1 SORT Optimization

SORT operations are among the most resource-intensive activities in batch processing. The IBM DFSORT product (or SyncSort alternative) provides extensive tuning options:

//*================================================================*
//* OPTIMIZED SORT JOB FOR END-OF-DAY TRANSACTION PROCESSING        *
//*================================================================*
//SORTJOB  JOB (BANK,EOD),'EOD SORT',
//             CLASS=A,MSGCLASS=X,MSGLEVEL=(1,1),
//             NOTIFY=&SYSUID,REGION=0M
//*
//SORTRANS EXEC PGM=SORT,PARM='DYNALLOC=(SYSDA,10)'
//*
//* SORT CONTROL STATEMENTS
//SYSIN    DD *
  SORT FIELDS=(1,10,CH,A,   Account number ascending
               11,8,CH,A,   Transaction date ascending
               19,6,CH,A)   Transaction time ascending
  INCLUDE COND=(35,1,CH,EQ,C'A')  Include only active records
  SUM FIELDS=NONE                  Remove duplicate keys
  OPTION FILSZ=E50000000          Estimated 50 million records
/*
//*
//* INPUT FILE - daily transactions
//SORTIN   DD DSN=BANK.PROD.DAYTRANS,DISP=SHR
//*
//* OUTPUT FILE - sorted transactions
//SORTOUT  DD DSN=BANK.PROD.DAYTRANS.SORTED,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,
//            SPACE=(CYL,(500,100),RLSE),
//            DCB=(RECFM=FB,LRECL=200,BLKSIZE=27800)
//*
//* SORT WORK DATASETS - allocated dynamically via DYNALLOC
//* The PARM DYNALLOC=(SYSDA,10) allocates up to 10 work datasets
//* on SYSDA, allowing sort to use parallel I/O.
//*
//SYSOUT   DD SYSOUT=*
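The effect of the BLKSIZE=27800 (the half-track figure used for 3390 DASD) on I/O counts is easy to estimate. A quick calculation for the 50-million-record file assumed in the FILSZ estimate:

```python
# Effect of blocking on sequential I/O counts for RECFM=FB,
# LRECL=200, BLKSIZE=27800 as coded in the JCL above.
def records_per_block(blksize, lrecl):
    return blksize // lrecl

def blocks_for(records, blksize, lrecl):
    per_block = records_per_block(blksize, lrecl)
    return -(-records // per_block)     # ceiling division

print(records_per_block(27800, 200))        # -> 139 records per block
print(blocks_for(50_000_000, 27800, 200))   # -> 359,713 block I/Os
```

Unblocked, the same file would take 50 million I/Os; blocking cuts that by a factor of 139.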

Using SORT within a COBOL program efficiently:

       IDENTIFICATION DIVISION.
       PROGRAM-ID. EODPROC.
      *================================================================*
      * END-OF-DAY TRANSACTION PROCESSOR                                *
      * Uses internal SORT with FASTSRT for optimal performance.        *
      * Processes the daily transaction file, sorting by account        *
      * and applying to the account master.                             *
      *                                                                 *
      * Cross-reference: Chapter 28 (SORT/MERGE),                       *
      *                  Chapter 31 (Security)                           *
      *================================================================*

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT SORT-FILE    ASSIGN TO SORTWORK.
           SELECT TRANS-FILE   ASSIGN TO TRANSIN.
           SELECT ACCOUNT-FILE ASSIGN TO ACCTMAST
               ORGANIZATION IS INDEXED
               ACCESS MODE IS RANDOM
               RECORD KEY IS AM-ACCT-KEY
               FILE STATUS IS WS-ACCT-STATUS.
           SELECT REPORT-FILE  ASSIGN TO RPTOUT.

       DATA DIVISION.
       FILE SECTION.
       SD  SORT-FILE.
       01  SORT-RECORD.
           05  SR-ACCT-NUMBER     PIC X(10).
           05  SR-TRANS-DATE      PIC X(8).
           05  SR-TRANS-TIME      PIC X(6).
           05  SR-TRANS-AMOUNT    PIC S9(11)V99 COMP-3.
           05  SR-TRANS-TYPE      PIC X(2).
           05  SR-TRANS-STATUS    PIC X(1).
           05  FILLER             PIC X(166).

       FD  TRANS-FILE
           BLOCK CONTAINS 0 RECORDS.
       01  TRANS-RECORD           PIC X(200).

       FD  ACCOUNT-FILE.
       01  ACCOUNT-MASTER.
           05  AM-ACCT-KEY        PIC X(14).
           05  AM-ACCT-NAME       PIC X(30).
           05  AM-ACCT-BALANCE    PIC S9(13)V99 COMP-3.
           05  AM-ACCT-STATUS     PIC X(1).
           05  AM-TRANS-COUNT     PIC S9(5) COMP.
           05  AM-LAST-TRANS-DATE PIC X(8).
           05  FILLER             PIC X(135).

       FD  REPORT-FILE.
       01  REPORT-RECORD          PIC X(133).

       WORKING-STORAGE SECTION.
       01  WS-ACCT-STATUS         PIC XX.
       01  WS-PREV-ACCT           PIC X(10).
       01  WS-ACCT-TOTAL          PIC S9(13)V99 COMP-3.
       01  WS-ACCT-TRANS-COUNT    PIC S9(5) COMP.
       01  WS-RECORDS-PROCESSED   PIC S9(9) COMP VALUE 0.
       01  WS-ACCOUNTS-UPDATED    PIC S9(9) COMP VALUE 0.
       01  WS-EOF-FLAG            PIC 9 VALUE 0.
           88  END-OF-SORTED      VALUE 1.

       PROCEDURE DIVISION.
       0000-MAIN-LOGIC.
           OPEN I-O ACCOUNT-FILE
           OPEN OUTPUT REPORT-FILE

      *    FASTSRT handles the USING file I/O directly.
      *    OUTPUT PROCEDURE processes the sorted records.
           SORT SORT-FILE
               ON ASCENDING KEY SR-ACCT-NUMBER
               ON ASCENDING KEY SR-TRANS-DATE
               ON ASCENDING KEY SR-TRANS-TIME
               USING TRANS-FILE
               OUTPUT PROCEDURE 2000-PROCESS-SORTED

           CLOSE ACCOUNT-FILE
           CLOSE REPORT-FILE

           DISPLAY 'EOD PROCESSING COMPLETE'       UPON CONSOLE
           DISPLAY '  TRANSACTIONS: ' WS-RECORDS-PROCESSED
               UPON CONSOLE
           DISPLAY '  ACCOUNTS:     ' WS-ACCOUNTS-UPDATED
               UPON CONSOLE
           STOP RUN.

       2000-PROCESS-SORTED.
           MOVE SPACES TO WS-PREV-ACCT
           MOVE 0 TO WS-ACCT-TOTAL
           MOVE 0 TO WS-ACCT-TRANS-COUNT

           RETURN SORT-FILE
               AT END SET END-OF-SORTED TO TRUE
           END-RETURN

           PERFORM UNTIL END-OF-SORTED
      *        Accumulate transactions for the same account
               IF SR-ACCT-NUMBER NOT = WS-PREV-ACCT
                   AND WS-PREV-ACCT NOT = SPACES
      *            Account break - apply accumulated total
                   PERFORM 3000-UPDATE-ACCOUNT
               END-IF

               IF SR-ACCT-NUMBER NOT = WS-PREV-ACCT
                   MOVE SR-ACCT-NUMBER TO WS-PREV-ACCT
                   MOVE 0 TO WS-ACCT-TOTAL
                   MOVE 0 TO WS-ACCT-TRANS-COUNT
               END-IF

      *        Accumulate the transaction amount
               EVALUATE SR-TRANS-TYPE
                   WHEN 'CR'
                       ADD SR-TRANS-AMOUNT TO WS-ACCT-TOTAL
                   WHEN 'DB'
                       SUBTRACT SR-TRANS-AMOUNT FROM WS-ACCT-TOTAL
               END-EVALUATE
               ADD 1 TO WS-ACCT-TRANS-COUNT
               ADD 1 TO WS-RECORDS-PROCESSED

               RETURN SORT-FILE
                   AT END SET END-OF-SORTED TO TRUE
               END-RETURN
           END-PERFORM

      *    Process the last account
           IF WS-PREV-ACCT NOT = SPACES
               PERFORM 3000-UPDATE-ACCOUNT
           END-IF.

       3000-UPDATE-ACCOUNT.
      *    Read the account master record
           MOVE SPACES TO AM-ACCT-KEY
           MOVE WS-PREV-ACCT TO AM-ACCT-KEY(1:10)
           READ ACCOUNT-FILE
               INVALID KEY
                   DISPLAY 'ACCOUNT NOT FOUND: ' WS-PREV-ACCT
                       UPON CONSOLE
               NOT INVALID KEY
      *            Apply the net transaction total
                   ADD WS-ACCT-TOTAL TO AM-ACCT-BALANCE
                   ADD WS-ACCT-TRANS-COUNT TO AM-TRANS-COUNT
      *            Rewrite the updated record
                   REWRITE ACCOUNT-MASTER
                       INVALID KEY
                           DISPLAY 'REWRITE FAILED: ' WS-PREV-ACCT
                               UPON CONSOLE
                   END-REWRITE
                   ADD 1 TO WS-ACCOUNTS-UPDATED
           END-READ.

32.8.2 Checkpoint/Restart

For long-running batch jobs, checkpoint/restart allows a job to resume from its last checkpoint after a failure, rather than starting over from the beginning. This is critical for jobs that take hours to complete:

       IDENTIFICATION DIVISION.
       PROGRAM-ID. CHKPTRST.
      *================================================================*
      * CHECKPOINT/RESTART IMPLEMENTATION                               *
      * Saves processing state at regular intervals so the job          *
      * can be restarted from the last checkpoint after a failure.      *
      *================================================================*

       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-CHECKPOINT-DATA.
           05  WS-CHK-RECORD-COUNT    PIC S9(9) COMP.
           05  WS-CHK-LAST-KEY        PIC X(14).
           05  WS-CHK-RUNNING-TOTAL   PIC S9(15)V99 COMP-3.
           05  WS-CHK-HASH-TOTAL      PIC S9(18) COMP.
           05  WS-CHK-TIMESTAMP       PIC X(26).

       01  WS-CHECKPOINT-INTERVAL     PIC S9(8) COMP VALUE 100000.
       01  WS-RECORDS-SINCE-CHK       PIC S9(8) COMP VALUE 0.
       01  WS-RESTART-FLAG            PIC 9 VALUE 0.
           88  WS-IS-RESTART          VALUE 1.
           88  WS-IS-INITIAL          VALUE 0.
       01  WS-CHECKPOINT-ID           PIC X(8).

       PROCEDURE DIVISION.
       0000-MAIN-LOGIC.
      *    Check if this is a restart
           PERFORM 0500-CHECK-RESTART

           IF WS-IS-RESTART
               PERFORM 0600-RESTORE-CHECKPOINT
           END-IF

           PERFORM 1000-PROCESS-RECORDS
               UNTIL WS-END-OF-FILE

      *    Final checkpoint at end of processing
           PERFORM 5000-TAKE-CHECKPOINT

           STOP RUN.

       0500-CHECK-RESTART.
      *    Check for the existence of a checkpoint dataset
      *    If it exists and contains valid data, this is a restart
           OPEN INPUT CHECKPOINT-FILE
           IF WS-CHK-FILE-STATUS = '00'
               READ CHECKPOINT-FILE INTO WS-CHECKPOINT-DATA
               IF WS-CHK-FILE-STATUS = '00'
                   MOVE 1 TO WS-RESTART-FLAG
                   DISPLAY 'RESTART DETECTED. RESUMING FROM '
                           'RECORD ' WS-CHK-RECORD-COUNT
                           ' KEY ' WS-CHK-LAST-KEY
                       UPON CONSOLE
               END-IF
               CLOSE CHECKPOINT-FILE
           ELSE
               MOVE 0 TO WS-RESTART-FLAG
           END-IF.

       0600-RESTORE-CHECKPOINT.
      *    Position input file past already-processed records
      *    Restore running totals from checkpoint
           DISPLAY 'RESTORING FROM CHECKPOINT: '
                   'RECORDS=' WS-CHK-RECORD-COUNT
                   ' LAST-KEY=' WS-CHK-LAST-KEY
               UPON CONSOLE.

       1000-PROCESS-RECORDS.
      *    Process the current record
           PERFORM 2000-BUSINESS-LOGIC

           ADD 1 TO WS-RECORDS-SINCE-CHK

      *    Take a checkpoint at regular intervals
           IF WS-RECORDS-SINCE-CHK >= WS-CHECKPOINT-INTERVAL
               PERFORM 5000-TAKE-CHECKPOINT
               MOVE 0 TO WS-RECORDS-SINCE-CHK
           END-IF

           PERFORM 1100-READ-NEXT-RECORD.

       5000-TAKE-CHECKPOINT.
      *    Save current processing state
           MOVE FUNCTION CURRENT-DATE TO WS-CHK-TIMESTAMP
           MOVE WS-RECORD-COUNT       TO WS-CHK-RECORD-COUNT
           MOVE WS-CURRENT-KEY        TO WS-CHK-LAST-KEY
           MOVE WS-RUNNING-TOTAL      TO WS-CHK-RUNNING-TOTAL

      *    Write checkpoint data
           OPEN OUTPUT CHECKPOINT-FILE
           WRITE CHECKPOINT-RECORD FROM WS-CHECKPOINT-DATA
           CLOSE CHECKPOINT-FILE

      *    Issue a DB2 COMMIT to save database changes
           EXEC SQL COMMIT END-EXEC

           DISPLAY 'CHECKPOINT TAKEN AT RECORD '
                   WS-CHK-RECORD-COUNT
                   ' TIME ' WS-CHK-TIMESTAMP
               UPON CONSOLE.

       2000-BUSINESS-LOGIC.
           CONTINUE.

       1100-READ-NEXT-RECORD.
           CONTINUE.
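Choosing WS-CHECKPOINT-INTERVAL is a trade-off: frequent checkpoints add I/O and COMMIT overhead, while infrequent ones increase the work redone after a failure. The sketch below explores the trade-off with purely illustrative rates (0.5 s per checkpoint, 5,000 records/s throughput, one failure per run), not measurements from the program above:

```python
# Illustrative cost model for the checkpoint interval. All rates are
# assumptions; plug in measured figures from your own jobs.
def expected_overhead_sec(interval, total_recs, chkpt_cost_sec,
                          recs_per_sec, failures_per_run):
    checkpoints = total_recs / interval
    chkpt_time = checkpoints * chkpt_cost_sec
    # A failure costs, on average, half an interval of redone work.
    rework = failures_per_run * (interval / 2) / recs_per_sec
    return chkpt_time + rework

for interval in (10_000, 100_000, 1_000_000):
    cost = expected_overhead_sec(interval, 50_000_000, 0.5, 5_000, 1)
    print(interval, round(cost, 1))
```

With these rates the checkpoint cost dominates, which is why intervals like the 100,000 used above should be tuned from measured checkpoint cost and failure history rather than picked once and forgotten.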

32.8.3 Parallel Processing

For very large batch workloads, splitting the work across multiple parallel job steps or jobs can dramatically reduce elapsed time:

//*================================================================*
//* PARALLEL BATCH PROCESSING - END OF DAY                          *
//* Split the account master into ranges and process in parallel.   *
//* Each step processes a different range of account numbers.        *
//*================================================================*
//EODPAR   JOB (BANK,EOD),'PARALLEL EOD',
//             CLASS=A,MSGCLASS=X,MSGLEVEL=(1,1),
//             NOTIFY=&SYSUID
//*
//*----------------------------------------------------------------*
//* STEP 1: Split the transaction file by account range             *
//*----------------------------------------------------------------*
//SPLIT    EXEC PGM=SORT
//SORTIN   DD DSN=BANK.PROD.DAYTRANS.SORTED,DISP=SHR
//SORTOUT1 DD DSN=&&RANGE1,DISP=(NEW,PASS),
//            UNIT=SYSDA,SPACE=(CYL,(100,50)),
//            DCB=(RECFM=FB,LRECL=200,BLKSIZE=27800)
//SORTOUT2 DD DSN=&&RANGE2,DISP=(NEW,PASS),
//            UNIT=SYSDA,SPACE=(CYL,(100,50)),
//            DCB=(RECFM=FB,LRECL=200,BLKSIZE=27800)
//SORTOUT3 DD DSN=&&RANGE3,DISP=(NEW,PASS),
//            UNIT=SYSDA,SPACE=(CYL,(100,50)),
//            DCB=(RECFM=FB,LRECL=200,BLKSIZE=27800)
//SORTOUT4 DD DSN=&&RANGE4,DISP=(NEW,PASS),
//            UNIT=SYSDA,SPACE=(CYL,(100,50)),
//            DCB=(RECFM=FB,LRECL=200,BLKSIZE=27800)
//SYSIN    DD *
  SORT FIELDS=COPY
  OUTFIL FNAMES=SORTOUT1,INCLUDE=(1,4,CH,LE,C'2499')
  OUTFIL FNAMES=SORTOUT2,INCLUDE=(1,4,CH,GE,C'2500',
         AND,1,4,CH,LE,C'4999')
  OUTFIL FNAMES=SORTOUT3,INCLUDE=(1,4,CH,GE,C'5000',
         AND,1,4,CH,LE,C'7499')
  OUTFIL FNAMES=SORTOUT4,INCLUDE=(1,4,CH,GE,C'7500')
/*
//SYSOUT   DD SYSOUT=*
//*
//*----------------------------------------------------------------*
//* STEPS 2-5: Process each account range                           *
//* Note: Steps within a single job ALWAYS run serially. To run     *
//* the ranges truly in parallel, submit each range as a separate   *
//* job (writing the split output to cataloged datasets instead of  *
//* && temporary datasets) and let the scheduler run the four jobs  *
//* concurrently. The ranges do not overlap, so concurrent jobs     *
//* never contend for the same accounts.                            *
//*----------------------------------------------------------------*
//PROC1    EXEC PGM=EODPROC,COND=(0,NE,SPLIT)
//STEPLIB  DD DSN=BANK.PROD.LOADLIB,DISP=SHR
//TRANSIN  DD DSN=&&RANGE1,DISP=(OLD,DELETE)
//ACCTMAST DD DSN=BANK.PROD.ACCTMAST,DISP=SHR
//RPTOUT   DD SYSOUT=*
//SYSOUT   DD SYSOUT=*
//*
//PROC2    EXEC PGM=EODPROC,COND=(0,NE,SPLIT)
//STEPLIB  DD DSN=BANK.PROD.LOADLIB,DISP=SHR
//TRANSIN  DD DSN=&&RANGE2,DISP=(OLD,DELETE)
//ACCTMAST DD DSN=BANK.PROD.ACCTMAST,DISP=SHR
//RPTOUT   DD SYSOUT=*
//SYSOUT   DD SYSOUT=*
//*
//PROC3    EXEC PGM=EODPROC,COND=(0,NE,SPLIT)
//STEPLIB  DD DSN=BANK.PROD.LOADLIB,DISP=SHR
//TRANSIN  DD DSN=&&RANGE3,DISP=(OLD,DELETE)
//ACCTMAST DD DSN=BANK.PROD.ACCTMAST,DISP=SHR
//RPTOUT   DD SYSOUT=*
//SYSOUT   DD SYSOUT=*
//*
//PROC4    EXEC PGM=EODPROC,COND=(0,NE,SPLIT)
//STEPLIB  DD DSN=BANK.PROD.LOADLIB,DISP=SHR
//TRANSIN  DD DSN=&&RANGE4,DISP=(OLD,DELETE)
//ACCTMAST DD DSN=BANK.PROD.ACCTMAST,DISP=SHR
//RPTOUT   DD SYSOUT=*
//SYSOUT   DD SYSOUT=*
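
The range-split idea is independent of JCL. A sketch in Python (illustrative only, not mainframe code) that partitions transactions into the same four account-number ranges and hands each range to a separate worker:

```python
from concurrent.futures import ThreadPoolExecutor

# Boundaries mirror the INCLUDE conditions in the SORT step above
RANGES = [("0000", "2499"), ("2500", "4999"),
          ("5000", "7499"), ("7500", "9999")]

def split_by_range(transactions):
    """Partition transactions on the first four key characters."""
    buckets = [[] for _ in RANGES]
    for txn in transactions:
        prefix = txn[:4]
        for i, (lo, hi) in enumerate(RANGES):
            if lo <= prefix <= hi:
                buckets[i].append(txn)
                break
    return buckets

def process_range(bucket):
    """Stand-in for EODPROC; here it just counts its records."""
    return len(bucket)

# Invented sample data: 1,000 transactions with 4-digit account prefixes
txns = ["%04d-TXN" % (n * 37 % 10000) for n in range(1000)]
buckets = split_by_range(txns)

# The four ranges are disjoint, so the workers never touch the same
# account and can safely run at the same time
with ThreadPoolExecutor(max_workers=4) as pool:
    counts = list(pool.map(process_range, buckets))

assert sum(counts) == len(txns)   # nothing lost, nothing duplicated
```

The closing assertion is the key correctness property of any split-and-parallelize design: the partitions must cover every record exactly once.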

32.9 Performance Monitoring Tools

Effective performance tuning requires good measurement. z/OS provides several tools for collecting and analyzing performance data.

32.9.1 SMF Records

The System Management Facilities (SMF) produce records that capture detailed information about every job, step, and resource usage on the system. The most important SMF record types for COBOL performance tuning are:

SMF Record Type   Content                                              Use
---------------   --------------------------------------------------   ---------------------------------------------
Type 30           Job/step resource usage (CPU, I/O, memory)           Primary source for batch job performance data
Type 42           VSAM dataset activity (opens, closes, I/O, splits)   VSAM tuning and split analysis
Type 101          DB2 accounting data                                  DB2 performance analysis
Type 102          DB2 statistics                                       DB2 system-level tuning
Type 110          CICS performance data                                CICS transaction response time analysis

The following job uses IFASMFDP to extract Type 30 records for an overnight batch window:
//*================================================================*
//* EXTRACT SMF TYPE 30 RECORDS FOR PERFORMANCE ANALYSIS            *
//*================================================================*
//SMFEXT   JOB (BANK,PERF),'SMF EXTRACT',CLASS=A
//STEP1    EXEC PGM=IFASMFDP
//SYSPRINT DD SYSOUT=*
//DUMPIN   DD DSN=SYS1.MAN1,DISP=SHR
//DUMPOUT  DD DSN=BANK.PERF.SMF30.EXTRACT,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(CYL,(50,20),RLSE),
//            DCB=(RECFM=VBS,LRECL=32760,BLKSIZE=32760)
//SYSIN    DD *
  INDD(DUMPIN,OPTIONS(DUMP))
  OUTDD(DUMPOUT,TYPE(30))
  DATE(2025060,2025061)
  START(1800)
  END(0600)
/*

32.9.2 IBM Debugging and Performance Tools

IBM provides several tools specifically designed for analyzing COBOL program performance:

IBM Debug Tool / z/OS Debugger allows you to step through COBOL programs, set breakpoints, and examine data values. While primarily a debugging tool, it is invaluable for understanding program flow and identifying performance bottlenecks in logic.

Strobe (a Compuware, now BMC, product) and IBM Application Performance Analyzer are sampling-based performance analysis tools. They periodically sample where the CPU is executing within your COBOL program and produce a report showing which paragraphs and statements consume the most CPU:

 ============================================================
  APPLICATION PERFORMANCE ANALYZER - PROGRAM: DAYENDBT
  SAMPLING INTERVAL: 2 MS    TOTAL SAMPLES: 50,000
 ============================================================

  TOP CPU CONSUMERS:
  RANK  PARAGRAPH              SAMPLES    PCT    CUMULATIVE
  ----  ---------------------  --------  -----   ----------
    1   3000-UPDATE-ACCOUNT       18,500  37.0%      37.0%
    2   2500-CALCULATE-INTEREST   12,000  24.0%      61.0%
    3   4000-WRITE-AUDIT-RECORD    7,500  15.0%      76.0%
    4   1000-READ-TRANSACTION      5,000  10.0%      86.0%
    5   5000-GENERATE-REPORT       3,000   6.0%      92.0%
    6   OTHER                      4,000   8.0%     100.0%

  TOP STATEMENTS IN 3000-UPDATE-ACCOUNT:
  LINE   STATEMENT                          SAMPLES   PCT
  -----  ---------------------------------  -------  -----
  00523  COMPUTE WS-NEW-BALANCE =            8,200   44.3%
         WS-OLD-BALANCE + WS-NET-CHANGE
  00528  REWRITE ACCOUNT-MASTER              6,100   33.0%
  00519  READ ACCOUNT-FILE                   4,200   22.7%

This output tells you exactly where to focus your tuning efforts. In this case, 37% of CPU time is spent in the 3000-UPDATE-ACCOUNT paragraph, and within that paragraph, the COMPUTE statement and the REWRITE are the dominant consumers.
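
The arithmetic behind such a sampling report is straightforward: the fraction of samples that land in a paragraph estimates the fraction of CPU that paragraph consumes. A toy aggregation in Python (the sample data is invented to mirror the report above):

```python
from collections import Counter

# Each entry names the paragraph that was executing when sampled
samples = (["3000-UPDATE-ACCOUNT"] * 370
           + ["2500-CALCULATE-INTEREST"] * 240
           + ["4000-WRITE-AUDIT-RECORD"] * 150
           + ["OTHER"] * 240)

def cpu_report(samples):
    """Rank paragraphs by sample count, with percent and cumulative percent."""
    total = len(samples)
    report, cumulative = [], 0.0
    for name, count in Counter(samples).most_common():
        pct = 100.0 * count / total
        cumulative += pct
        report.append((name, count, round(pct, 1), round(cumulative, 1)))
    return report

top = cpu_report(samples)[0]
# The busiest paragraph accounts for 370 of the 1,000 samples (37.0%)
```

With enough samples, the estimate converges on the true CPU distribution, which is why sampling profilers impose far less overhead than instrumenting every statement.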

32.9.3 Batch Job Elapsed Time Analysis

A systematic approach to analyzing batch job elapsed time:

       IDENTIFICATION DIVISION.
       PROGRAM-ID. PERFMON.
      *================================================================*
      * PERFORMANCE MONITORING INSTRUMENTATION                          *
      * Add timing instrumentation to critical sections of code         *
      * to identify elapsed-time bottlenecks.                           *
      *================================================================*

       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-PERF-COUNTERS.
           05  WS-START-TIME          PIC X(21).
           05  WS-END-TIME            PIC X(21).
           05  WS-FILE-READ-COUNT     PIC S9(9) COMP VALUE 0.
           05  WS-DB2-CALL-COUNT      PIC S9(9) COMP VALUE 0.
           05  WS-SORT-COUNT          PIC S9(9) COMP VALUE 0.
           05  WS-RECORDS-PROCESSED   PIC S9(9) COMP VALUE 0.

       01  WS-TIMING-SECTION.
           05  WS-SECTION-START       PIC 9(8).
           05  FILLER REDEFINES WS-SECTION-START.
               10  WS-START-HH        PIC 99.
               10  WS-START-MM        PIC 99.
               10  WS-START-SS        PIC 99.
               10  WS-START-HS        PIC 99.
           05  WS-SECTION-END         PIC 9(8).
           05  FILLER REDEFINES WS-SECTION-END.
               10  WS-END-HH          PIC 99.
               10  WS-END-MM          PIC 99.
               10  WS-END-SS          PIC 99.
               10  WS-END-HS          PIC 99.
           05  WS-SECTION-ELAPSED     PIC S9(7)V99 COMP-3.
           05  WS-ELAPSED-DISP        PIC -(7)9.99.

       PROCEDURE DIVISION.
       0000-MAIN-LOGIC.
      *    Capture start time
           MOVE FUNCTION CURRENT-DATE TO WS-START-TIME
           DISPLAY 'PROGRAM START: ' WS-START-TIME UPON CONSOLE

           PERFORM 1000-PHASE-1-READ-AND-SORT
           PERFORM 2000-PHASE-2-PROCESS-AND-UPDATE
           PERFORM 3000-PHASE-3-GENERATE-REPORTS

      *    Capture end time and display statistics
           MOVE FUNCTION CURRENT-DATE TO WS-END-TIME
           DISPLAY 'PROGRAM END:   ' WS-END-TIME UPON CONSOLE
           DISPLAY 'STATISTICS:' UPON CONSOLE
           DISPLAY '  FILE READS:    ' WS-FILE-READ-COUNT
               UPON CONSOLE
           DISPLAY '  DB2 CALLS:     ' WS-DB2-CALL-COUNT
               UPON CONSOLE
           DISPLAY '  RECORDS:       ' WS-RECORDS-PROCESSED
               UPON CONSOLE
           STOP RUN.

       1000-PHASE-1-READ-AND-SORT.
           ACCEPT WS-SECTION-START FROM TIME
           DISPLAY 'PHASE 1 START: SORT TRANSACTIONS'
               UPON CONSOLE

      *    ... sort processing ...

           ACCEPT WS-SECTION-END FROM TIME
           PERFORM 9100-COMPUTE-ELAPSED
           DISPLAY 'PHASE 1 COMPLETE. ELAPSED: '
               WS-ELAPSED-DISP ' SECONDS'
               UPON CONSOLE.

       2000-PHASE-2-PROCESS-AND-UPDATE.
           ACCEPT WS-SECTION-START FROM TIME
           DISPLAY 'PHASE 2 START: PROCESS UPDATES'
               UPON CONSOLE

      *    ... update processing ...

           ACCEPT WS-SECTION-END FROM TIME
           PERFORM 9100-COMPUTE-ELAPSED
           DISPLAY 'PHASE 2 COMPLETE. ELAPSED: '
               WS-ELAPSED-DISP ' SECONDS'
               UPON CONSOLE.

       3000-PHASE-3-GENERATE-REPORTS.
           ACCEPT WS-SECTION-START FROM TIME
           DISPLAY 'PHASE 3 START: GENERATE REPORTS'
               UPON CONSOLE

      *    ... report generation ...

           ACCEPT WS-SECTION-END FROM TIME
           PERFORM 9100-COMPUTE-ELAPSED
           DISPLAY 'PHASE 3 COMPLETE. ELAPSED: '
               WS-ELAPSED-DISP ' SECONDS'
               UPON CONSOLE.

       9100-COMPUTE-ELAPSED.
      *    ACCEPT ... FROM TIME returns HHMMSShh, so both readings
      *    must be converted to seconds before subtracting; a raw
      *    subtraction of the two values gives meaningless results
           COMPUTE WS-SECTION-ELAPSED =
               (WS-END-HH * 3600) + (WS-END-MM * 60)
               + WS-END-SS + (WS-END-HS / 100)
               - (WS-START-HH * 3600) - (WS-START-MM * 60)
               - WS-START-SS - (WS-START-HS / 100)
      *    Adjust when a phase runs across midnight
           IF WS-SECTION-ELAPSED < 0
               ADD 86400 TO WS-SECTION-ELAPSED
           END-IF
           MOVE WS-SECTION-ELAPSED TO WS-ELAPSED-DISP.
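
One subtlety in the instrumentation above: ACCEPT ... FROM TIME returns a clock value in HHMMSShh form (hours, minutes, seconds, hundredths), so the two readings must be converted to seconds before they are subtracted, and a phase that runs across midnight needs a wrap adjustment. The arithmetic, checked in Python:

```python
def hhmmsshh_to_seconds(t):
    """Convert an ACCEPT ... FROM TIME value (HHMMSShh) to seconds."""
    hh, rest = divmod(t, 1000000)
    mm, rest = divmod(rest, 10000)
    ss, hs = divmod(rest, 100)
    return hh * 3600 + mm * 60 + ss + hs / 100.0

def elapsed_seconds(start, end):
    """Elapsed time between two TIME readings, allowing one midnight wrap."""
    delta = hhmmsshh_to_seconds(end) - hhmmsshh_to_seconds(start)
    return delta + 86400 if delta < 0 else delta

# A naive subtraction of the raw values gives 404100 here; the real
# elapsed time across 11:59:59.00 -> 12:00:00.00 is one second
assert elapsed_seconds(11595900, 12000000) == 1.0
# A phase crossing midnight: 23:50:00 -> 00:10:00 is 20 minutes
assert elapsed_seconds(23500000, 100000) == 1200.0
```

For overnight batch work, the midnight adjustment is not a corner case; the window in this chapter runs from 7:00 PM to 5:00 AM.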

32.10 Real-World Tuning Case Study: Optimizing a Daily Batch Cycle

To bring all of these techniques together, let us walk through a realistic performance tuning case study based on a daily end-of-day batch cycle at a mid-sized retail bank.

32.10.1 The Problem

The bank's end-of-day batch cycle processes approximately 15 million debit and credit card transactions daily. The batch window is from 7:00 PM to 5:00 AM (10 hours). The current cycle takes 9 hours and 45 minutes, leaving only a 15-minute margin. With transaction volumes growing 15% year over year, the batch window will be exceeded within two months unless performance is improved.

The batch cycle consists of:

Job      Description                              Current Elapsed Time
------   --------------------------------------   --------------------
EOD010   Sort daily transactions                  45 minutes
EOD020   Apply transactions to account master     3 hours 15 minutes
EOD030   Calculate daily interest                 2 hours 30 minutes
EOD040   Generate customer statements             1 hour 45 minutes
EOD050   Regulatory reporting (SOX/BSA)           1 hour 30 minutes
Total                                             9 hours 45 minutes

32.10.2 Analysis Phase

Using the tools described earlier (SMF records, Application Performance Analyzer, DB2 EXPLAIN), we analyze each job:

EOD020 - Transaction Processing (3h 15m):

  • Application Performance Analyzer shows 62% of CPU time in DB2 calls.
  • DB2 EXPLAIN reveals a tablespace scan on the account master for each transaction lookup (the index on ACCT_NUMBER was inadvertently dropped during a recent DB2 migration).
  • VSAM LISTCAT shows 47,000 CI splits on the account master file since the last reorganization.

EOD030 - Interest Calculation (2h 30m):

  • The program uses DISPLAY numeric fields for all interest rate calculations, causing constant pack/unpack conversions.
  • The program is compiled with OPTIMIZE(0) and NUMPROC(NOPFD).
  • Each account is read individually with random VSAM I/O rather than sequential browse.

EOD040 - Statement Generation (1h 45m):

  • The output file has BLKSIZE=133 (unblocked), causing one I/O per line of output.
  • The program reads the entire 200-byte customer record to extract just the name and address.

32.10.3 Optimization Actions

Fix 1: Restore the DB2 index (EOD020)

-- Recreate the missing index on the account master table
CREATE UNIQUE INDEX BANKDB.ACCT_MASTER_PK
    ON BANKDB.ACCT_MASTER (ACCT_NUMBER ASC)
    USING STOGROUP BANKSGRP
    PRIQTY 500000
    SECQTY 100000
    BUFFERPOOL BP1
    CLOSE NO;

-- Verify the access path is now using the index
EXPLAIN ALL SET QUERYNO = 1 FOR
    SELECT ACCT_NUMBER, ACCT_BALANCE, ACCT_STATUS
    FROM BANKDB.ACCT_MASTER
    WHERE ACCT_NUMBER = '0001234567';
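
The cost of the missing index is easy to model. A tablespace scan examines on the order of half the table per lookup, while a B-tree index probe costs roughly log2(N) comparisons. A back-of-the-envelope comparison in Python (the account row count is an assumption for illustration; the transaction count comes from the case study):

```python
import math

accounts = 2_000_000        # assumed rows in ACCT_MASTER
transactions = 15_000_000   # daily card transactions

# Tablespace scan: on average half the table is examined per lookup
scan_compares = transactions * (accounts // 2)

# Index probe: roughly log2(N) comparisons per lookup
index_compares = transactions * math.ceil(math.log2(accounts))

# The scan does tens of thousands of times more work per lookup
ratio = scan_compares // index_compares
```

This is why a single dropped index can turn a one-hour job into a multi-hour one: the per-lookup cost changes by orders of magnitude, multiplied by millions of lookups.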

Fix 2: Reorganize the VSAM file (EOD020)

//*================================================================*
//* REORGANIZE ACCOUNT MASTER VSAM FILE                             *
//* Eliminates CI/CA splits and restores free space                 *
//*================================================================*
//REORG    JOB (BANK,DBA),'VSAM REORG',CLASS=A
//*
//STEP1    EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//INFILE   DD DSN=BANK.PROD.ACCTMAST,DISP=SHR
//BACKUP   DD DSN=BANK.PROD.ACCTMAST.BACKUP,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(CYL,(600,100),RLSE)
//SYSIN    DD *
  REPRO INFILE(INFILE) OUTFILE(BACKUP)
/*
//*
//STEP2    EXEC PGM=IDCAMS,COND=(0,NE,STEP1)
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  DELETE BANK.PROD.ACCTMAST CLUSTER
  DEFINE CLUSTER ( -
    NAME(BANK.PROD.ACCTMAST) -
    INDEXED -
    RECORDSIZE(200 200) -
    KEYS(14 0) -
    FREESPACE(20 10) -
    SHAREOPTIONS(2 3) -
    SPEED) -
  DATA ( -
    NAME(BANK.PROD.ACCTMAST.DATA) -
    CYLINDERS(500 100) -
    CONTROLINTERVALSIZE(4096) -
    BUFFERSPACE(1048576)) -
  INDEX ( -
    NAME(BANK.PROD.ACCTMAST.INDEX) -
    CYLINDERS(10 5))
/*
//*
//STEP3    EXEC PGM=IDCAMS,COND=(0,NE,STEP2)
//SYSPRINT DD SYSOUT=*
//INFILE   DD DSN=BANK.PROD.ACCTMAST.BACKUP,DISP=SHR
//OUTFILE  DD DSN=BANK.PROD.ACCTMAST,DISP=SHR
//SYSIN    DD *
  REPRO INFILE(INFILE) OUTFILE(OUTFILE)
/*
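
The FREESPACE(20 10) figure in the DEFINE can be sanity-checked. For fixed-length 200-byte records in a 4 KB control interval, VSAM reserves a few bytes of control information (roughly a 4-byte CIDF plus two 3-byte RDFs; the exact figure here is an approximation), and the 20% CI free space percentage limits how many records are loaded per CI:

```python
CI_SIZE = 4096
RECORD_LEN = 200
CONTROL_BYTES = 10        # approx: 4-byte CIDF plus two 3-byte RDFs
FREESPACE_PCT = 20        # the CI percentage from FREESPACE(20 10)

usable = CI_SIZE - CONTROL_BYTES
max_records = usable // RECORD_LEN                    # CI packed full
reserved = CI_SIZE * FREESPACE_PCT // 100
loaded_records = (usable - reserved) // RECORD_LEN    # CI at load time

# 20 records fit a packed CI; with 20% free space only 16 are loaded,
# leaving room for 4 inserts per CI before a split becomes necessary
```

Trading a little DASD space for insert headroom is exactly what keeps the CI split count near zero between reorganizations.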

Fix 3: Optimize the interest calculation program (EOD030)

       IDENTIFICATION DIVISION.
       PROGRAM-ID. EOD030.
      *================================================================*
      * OPTIMIZED DAILY INTEREST CALCULATION                            *
      *                                                                 *
      * Performance improvements:                                       *
      *   1. Changed DISPLAY fields to COMP-3 for arithmetic            *
      *   2. Compiled with OPTIMIZE(2), NUMPROC(PFD), TRUNC(OPT)       *
      *   3. Changed from random reads to sequential browse             *
      *   4. Moved rate table lookup outside the per-account loop       *
      *   5. Used SEARCH ALL instead of SEARCH for rate lookup          *
      *================================================================*

       DATA DIVISION.
       WORKING-STORAGE SECTION.

      *    BEFORE (all DISPLAY - slow):
      *    01  WS-INTEREST-CALC.
      *        05  WS-DAILY-RATE        PIC 9V9(8).
      *        05  WS-ACCRUED-INT       PIC 9(11)V99.
      *        05  WS-BALANCE           PIC 9(13)V99.
      *
      *    AFTER (COMP-3 - fast):
       01  WS-INTEREST-CALC.
           05  WS-DAILY-RATE           PIC S9V9(8) COMP-3.
           05  WS-ACCRUED-INT          PIC S9(11)V99 COMP-3.
           05  WS-BALANCE              PIC S9(13)V99 COMP-3.
           05  WS-ANNUAL-RATE          PIC S9V9(6) COMP-3.
           05  WS-DAYS-IN-YEAR         PIC S9(3) COMP VALUE 365.

       01  WS-RATE-TABLE.
           05  WS-NUM-RATES            PIC S9(4) COMP.
      *    OCCURS DEPENDING ON keeps SEARCH ALL from examining
      *    entries that were never loaded
           05  WS-RATE-ENTRY OCCURS 1 TO 200 TIMES
               DEPENDING ON WS-NUM-RATES
               ASCENDING KEY IS WS-RT-PRODUCT
               INDEXED BY WS-RT-IDX.
               10  WS-RT-PRODUCT       PIC X(6).
               10  WS-RT-ANNUAL-RATE   PIC S9V9(6) COMP-3.

      *    Work fields for the FETCH (a host variable may not be
      *    subscripted, so each row is fetched here and then moved
      *    into the table)
       01  WS-RATE-WORK.
           05  WS-WK-PRODUCT           PIC X(6).
           05  WS-WK-ANNUAL-RATE       PIC S9V9(6) COMP-3.

       PROCEDURE DIVISION.
       0000-MAIN-LOGIC.
      *    Load the rate table ONCE before processing
           PERFORM 0500-LOAD-RATE-TABLE

      *    Use sequential browse instead of random reads
           OPEN INPUT ACCOUNT-FILE
           MOVE LOW-VALUES TO AM-ACCT-KEY
           START ACCOUNT-FILE KEY NOT LESS THAN AM-ACCT-KEY

           PERFORM UNTIL WS-END-OF-FILE
               READ ACCOUNT-FILE NEXT
                   AT END SET WS-END-OF-FILE TO TRUE
               END-READ
               IF NOT WS-END-OF-FILE
                   PERFORM 2000-CALCULATE-INTEREST
               END-IF
           END-PERFORM

           CLOSE ACCOUNT-FILE
           STOP RUN.

       0500-LOAD-RATE-TABLE.
      *    Load all interest rates into memory once
           MOVE 0 TO WS-NUM-RATES
           EXEC SQL
               DECLARE C-RATES CURSOR FOR
               SELECT PRODUCT_CODE, ANNUAL_RATE
               FROM   BANKDB.INTEREST_RATES
               WHERE  EFFECTIVE_DATE <= CURRENT DATE
               AND    EXPIRY_DATE >= CURRENT DATE
               ORDER BY PRODUCT_CODE
           END-EXEC

           EXEC SQL OPEN C-RATES END-EXEC

      *    Priming fetch, then loop: the counter is incremented only
      *    after a successful fetch, so it is never one too high
           EXEC SQL
               FETCH C-RATES
               INTO :WS-WK-PRODUCT, :WS-WK-ANNUAL-RATE
           END-EXEC
           PERFORM UNTIL SQLCODE NOT = 0
               ADD 1 TO WS-NUM-RATES
               MOVE WS-WK-PRODUCT
                   TO WS-RT-PRODUCT (WS-NUM-RATES)
               MOVE WS-WK-ANNUAL-RATE
                   TO WS-RT-ANNUAL-RATE (WS-NUM-RATES)
               EXEC SQL
                   FETCH C-RATES
                   INTO :WS-WK-PRODUCT, :WS-WK-ANNUAL-RATE
               END-EXEC
           END-PERFORM

           EXEC SQL CLOSE C-RATES END-EXEC

           DISPLAY 'LOADED ' WS-NUM-RATES ' INTEREST RATES'
               UPON CONSOLE.

       2000-CALCULATE-INTEREST.
      *    Look up the rate using binary search (SEARCH ALL)
           SEARCH ALL WS-RATE-ENTRY
               AT END
                   MOVE 0 TO WS-ANNUAL-RATE
               WHEN WS-RT-PRODUCT(WS-RT-IDX) = AM-PRODUCT-CODE
                   MOVE WS-RT-ANNUAL-RATE(WS-RT-IDX)
                       TO WS-ANNUAL-RATE
           END-SEARCH

      *    Calculate daily interest using COMP-3 fields
      *    (no pack/unpack conversions needed)
           IF WS-ANNUAL-RATE > 0
               COMPUTE WS-DAILY-RATE =
                   WS-ANNUAL-RATE / WS-DAYS-IN-YEAR
               COMPUTE WS-ACCRUED-INT ROUNDED =
                   AM-ACCT-BALANCE * WS-DAILY-RATE
           ELSE
               MOVE 0 TO WS-ACCRUED-INT
           END-IF.

Fix 4: Optimize statement generation output (EOD040)

//*================================================================*
//* STATEMENT GENERATION - OPTIMIZED JCL                            *
//* Changed BLKSIZE from 133 to 27930 (210x reduction in I/O)      *
//* Added BUFNO=20 for write-behind buffering                       *
//*================================================================*
//EOD040   EXEC PGM=STMTGEN
//STEPLIB  DD DSN=BANK.PROD.LOADLIB,DISP=SHR
//CUSTMAST DD DSN=BANK.PROD.CUSTMAST,DISP=SHR,
//            AMP=('BUFND=20,BUFNI=10')
//STMTOUT  DD DSN=BANK.PROD.STATEMENTS.DAILY,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,
//            SPACE=(CYL,(200,50),RLSE),
//            DCB=(RECFM=FBA,LRECL=133,BLKSIZE=27930,BUFNO=20)
//SYSOUT   DD SYSOUT=*
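
The "210x reduction" claimed in the JCL comment is simply the blocking factor: 27930 / 133 = 210 records per block, so physical writes drop by the same factor. The arithmetic, with an assumed nightly line count:

```python
LRECL = 133
NEW_BLKSIZE = 27930      # half-track blocking; the old BLKSIZE was 133
lines = 15_000_000       # assumed statement lines written per night

blocking_factor = NEW_BLKSIZE // LRECL      # records per physical block
old_ios = lines                             # unblocked: one write per line
new_ios = -(-lines // blocking_factor)      # ceiling division

# 15,000,000 physical writes shrink to 71,429 - the 210x reduction
```

Because each physical write carries roughly the same fixed overhead regardless of block size, elapsed time for a write-bound step falls almost in proportion to the I/O count.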

32.10.4 Results

After implementing all optimizations:

Job      Before     After      Improvement
------   --------   --------   -----------------------------------------
EOD010   45 min     40 min     11% (SORT work area tuning)
EOD020   3h 15m     55 min     72% (DB2 index + VSAM reorg)
EOD030   2h 30m     45 min     70% (COMP-3 + sequential I/O + OPTIMIZE(2))
EOD040   1h 45m     25 min     76% (block size + buffering)
EOD050   1h 30m     1h 20m     11% (minor SQL tuning)
Total    9h 45m     4h 05m     58% overall reduction

The batch window now has nearly 6 hours of headroom, providing capacity for the projected 15% annual transaction growth for the next several years.
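
That headroom claim can be quantified. If elapsed time scales in proportion to transaction volume, the tuned 4h05m cycle keeps fitting the 10-hour window until volume grows by a factor of 600/245, which at 15% per year takes roughly six and a half years:

```python
import math

window_min = 10 * 60      # batch window: 7:00 PM to 5:00 AM
cycle_min = 4 * 60 + 5    # tuned cycle: 4h 05m
growth = 1.15             # 15% annual transaction growth

# Years until the cycle (assumed proportional to volume) fills the window
years = math.log(window_min / cycle_min) / math.log(growth)
# years comes out to about 6.4
```

The proportional-scaling assumption is optimistic (some costs grow faster than linearly), so in practice the team would re-measure annually rather than trust the projection.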

Key Concept

The largest performance gains almost always come from fixing fundamental issues --- missing indexes, unblocked output files, wrong data types for arithmetic, and unnecessary random I/O. Micro-optimizations within COBOL code (shaving a few instructions from a loop) are far less impactful than addressing these structural problems. Always measure first, fix the biggest problems first, and measure again after each change.


32.11 Performance Tuning Checklist

Use this checklist when tuning a COBOL program. Items are ordered by typical impact, from highest to lowest:

Compiler Options

  • [ ] OPTIMIZE(2) for production compilations
  • [ ] TRUNC(OPT) if binary values always fit PIC size
  • [ ] NUMPROC(PFD) if all numeric data has preferred signs
  • [ ] FASTSRT for programs that use SORT
  • [ ] NOTEST for production (remove debugging hooks)
  • [ ] AWO for sequential output files

File I/O

  • [ ] Block size optimized for device geometry (half-track or full-track)
  • [ ] BUFNO increased for sequential files (10-30 buffers)
  • [ ] VSAM BUFND and BUFNI specified via AMP parameter
  • [ ] VSAM FREESPACE appropriate for insert patterns
  • [ ] VSAM files reorganized regularly to eliminate CI/CA splits
  • [ ] Sequential access used instead of random when processing >15% of records

DB2

  • [ ] All WHERE clause predicates are indexable (no functions on columns)
  • [ ] Indexes exist for all frequently-used access paths
  • [ ] EXPLAIN shows index access, not tablespace scans
  • [ ] Host variables match DB2 column types exactly
  • [ ] SELECT lists include only needed columns (no SELECT *)
  • [ ] Multi-row FETCH used for cursor processing
  • [ ] COMMIT frequency balanced between lock duration and overhead

COBOL Coding

  • [ ] COMP (binary) used for counters, subscripts, and flags
  • [ ] COMP-3 (packed decimal) used for financial amounts
  • [ ] SEARCH ALL used for sorted table lookups
  • [ ] Index names used instead of subscripts for table access
  • [ ] Invariant expressions moved outside loops
  • [ ] Average computed after the loop, not inside it
  • [ ] EVALUATE used instead of nested IF for multi-way branches

CICS

  • [ ] Pseudo-conversational design used throughout
  • [ ] COMMAREA kept as small as possible
  • [ ] BMS DATAONLY used for screen refreshes
  • [ ] Locks acquired as late as possible and held briefly
  • [ ] DB2 calls minimized (use JOINs instead of multiple SELECTs)

Batch Job Design

  • [ ] SORT parameters tuned (FILSZ, DYNALLOC, work area allocation)
  • [ ] Checkpoint/restart implemented for long-running jobs
  • [ ] Parallel processing considered for large-volume jobs
  • [ ] Job scheduling dependencies reviewed for unnecessary serialization

32.12 Summary

Performance tuning for COBOL programs on z/OS is both an art and a science. The science lies in measuring accurately, understanding the architecture, and applying known optimization techniques. The art lies in knowing which optimizations to apply first and understanding the trade-offs between CPU time, elapsed time, memory usage, and code maintainability.

In this chapter, we have covered:

  • Performance fundamentals --- CPU time, elapsed time, I/O counts, and memory usage as the four pillars of mainframe performance measurement.
  • Compiler optimization --- OPTIMIZE(2), TRUNC(OPT), NUMPROC(PFD), and FASTSRT as the most impactful compiler options for performance.
  • Efficient coding techniques --- SEARCH ALL for sorted tables, PERFORM VARYING optimization, data type selection, and EVALUATE for multi-way branching.
  • File I/O optimization --- Block size calculation, buffer management, and VSAM tuning including FREESPACE, CI/CA split management, and buffer allocation.
  • DB2 performance --- Index-friendly predicates, EXPLAIN analysis, host variable alignment, and multi-row FETCH.
  • CICS performance --- Pseudo-conversational design, BMS optimization, COMMAREA minimization, and lock management.
  • Memory optimization --- WORKING-STORAGE layout, data type sizing, and block processing for large datasets.
  • Batch performance --- SORT optimization, checkpoint/restart, and parallel processing patterns.
  • Monitoring tools --- SMF records, Application Performance Analyzer, and instrumentation techniques.
  • Case study --- A realistic end-of-day batch cycle optimization that achieved a 58% reduction in elapsed time through a combination of DB2 index restoration, VSAM reorganization, compiler option changes, data type corrections, and I/O optimization.

The security considerations discussed in Chapter 31 add overhead to every resource access. As you tune for performance, remember that security overhead is non-negotiable --- you must optimize within the constraints of your security model, never by circumventing it. Similarly, the audit trail requirements discussed in Chapter 31 add I/O that must be accounted for in your performance budget.

Performance tuning is an ongoing activity, not a one-time event. Transaction volumes grow, data accumulates, and access patterns change. The monitoring and measurement techniques in this chapter should be part of your regular operational routine, identifying and addressing performance degradation before it impacts the business.


Chapter Review Questions:

  1. What are the three fundamental performance metrics on z/OS, and what does each one measure?
  2. Explain the difference between OPTIMIZE(0), OPTIMIZE(1), and OPTIMIZE(2). Why should production programs always use OPTIMIZE(2)?
  3. A program processes 10 million records. Compare the number of I/O operations for BLKSIZE=200 vs. BLKSIZE=27800 with LRECL=200. What is the percentage reduction?
  4. Why is SEARCH ALL significantly faster than SEARCH for a table with 10,000 entries? What is the prerequisite for using SEARCH ALL?
  5. Describe three ways to optimize DB2 SQL performance in a COBOL program. Which typically has the largest impact?
  6. Why is pseudo-conversational programming essential for CICS performance? What resources are wasted in conversational mode?
  7. Explain the concept of checkpoint/restart and why it is important for long-running batch jobs.
  8. Given a batch cycle with five sequential jobs taking 2h, 3h, 1h, 2h, and 1h respectively, how would you redesign the cycle to reduce total elapsed time through parallel processing? What constraints must you consider?
  9. Review the performance tuning case study. Which single fix had the largest impact and why?
  10. How do the security requirements from Chapter 31 affect performance tuning decisions? Give two specific examples.