In This Chapter
- 32.1 Performance Fundamentals
- 32.2 COBOL Compiler Optimization Options
- 32.3 Efficient Coding Techniques
- 32.4 File I/O Optimization
- 32.5 DB2 Performance
- 32.6 CICS Performance
- 32.7 Memory Optimization
- 32.8 Batch Job Performance
- 32.9 Performance Monitoring Tools
- 32.10 Real-World Tuning Case Study: Optimizing a Daily Batch Cycle
- 32.11 Performance Tuning Checklist
- 32.12 Summary
Chapter 32: Performance Tuning for COBOL Programs
Part VI - Mainframe Environment and Batch Processing
In the world of mainframe computing, performance is not an abstract concern --- it is a business-critical requirement measured in dollars and cents. Every CPU second consumed by a COBOL program has a direct cost, often calculated by the Million Service Units (MSU) consumed, which determines the monthly software licensing fees an organization pays to IBM and other vendors. A poorly performing batch job that overruns its processing window can delay end-of-day settlement, impacting millions of dollars in financial transactions. A CICS transaction that takes two seconds instead of half a second can mean the difference between a responsive teller system and long customer queues at the branch.
This chapter provides a comprehensive guide to performance tuning for COBOL programs running on z/OS. We will examine performance from every angle: compiler options that generate faster code, coding techniques that reduce CPU consumption, I/O optimization that minimizes elapsed time, DB2 tuning that eliminates unnecessary overhead, and CICS performance patterns that keep online systems responsive. Throughout, we will use realistic examples drawn from financial batch processing and high-volume transaction systems.
Key Concept
Performance tuning on the mainframe is a disciplined engineering activity, not guesswork. Every optimization must be measured before and after implementation. The three fundamental metrics are CPU time (the resource you are billed for), elapsed time (the time the user or batch window experiences), and I/O count (often the dominant factor in elapsed time). Changes that reduce one metric may increase another, so you must understand the trade-offs.
32.1 Performance Fundamentals
Before tuning any COBOL program, you must understand what you are measuring and why. Mainframe performance is characterized by several distinct metrics, each telling a different part of the story.
32.1.1 CPU Time
CPU time is the amount of processor time consumed by your program. On z/OS, CPU time is reported in two forms:
- TCB time --- Time spent executing your application code under the Task Control Block. This is the time your COBOL program is actually running instructions on the processor.
- SRB time --- Time spent executing system services on behalf of your task under the Service Request Block. This includes I/O completion processing, paging, and other system activities.
CPU time is the primary cost driver because IBM's Sub-Capacity Pricing model bases software license fees on the rolling four-hour average of MSU consumption. Reducing CPU time directly reduces costs.
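The rolling four-hour average can be sketched as a simple windowed calculation. The sketch below is illustrative only and is written in Python for clarity; IBM's actual sub-capacity reporting works from SMF interval data, and the interval length and MSU figures here are assumptions for demonstration.

```python
# Illustrative model of a rolling four-hour average (R4HA) of MSU
# consumption. Assumes 5-minute samples; real reporting is done by
# IBM's SCRT tooling from SMF data, so treat this as a sketch.
from collections import deque

INTERVALS_PER_4H = 48  # 48 five-minute intervals in four hours

def rolling_4h_average(msu_samples):
    """Yield the rolling 4-hour average after each 5-minute sample."""
    window = deque(maxlen=INTERVALS_PER_4H)
    for sample in msu_samples:
        window.append(sample)
        yield sum(window) / len(window)

# A job that spikes MSU for one hour raises the rolling average far
# less than its peak, which is why spreading work out can lower the
# licensing peak.
quiet = [100] * 36           # 3 hours at a steady 100 MSU
spike = [400] * 12           # 1 hour at 400 MSU
averages = list(rolling_4h_average(quiet + spike))
print(f"Peak interval: 400 MSU, peak 4-hour average: {max(averages):.0f} MSU")
```

Because the billing metric is the four-hour average rather than the instantaneous peak, CPU reductions in long-running work pay off even when short spikes remain.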
32.1.2 Elapsed Time (Wall Clock Time)
Elapsed time is the total time from when your job step starts to when it completes. Elapsed time includes:
- CPU time (your code executing)
- I/O wait time (waiting for disk reads and writes)
- Queue time (waiting for CPU, memory, or other resources)
- Lock/latch wait time (waiting for DB2 or VSAM locks)
In batch processing, elapsed time determines whether your jobs fit within the batch window. A typical banking batch window runs from approximately 6:00 PM to 6:00 AM, during which all end-of-day processing must complete before online systems come back up.
32.1.3 I/O Counts
I/O operations are often the single largest contributor to elapsed time. Each physical I/O to a disk subsystem takes milliseconds, and those milliseconds add up quickly when processing millions of records. The key I/O metrics are:
- EXCP count --- The number of Execute Channel Programs issued, representing physical I/O operations.
- Connect time --- The time the channel is connected to the device performing the I/O.
- Disconnect time --- The time waiting for the device to position (seek time, rotational delay).
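A quick back-of-the-envelope model shows why EXCP counts matter so much for elapsed time. The per-I/O service time below is an assumed figure purely for illustration; real values come from RMF/SMF reports for your disk subsystem.

```python
# Back-of-the-envelope model of how EXCP count drives elapsed time.
# ms_per_io is an assumption for illustration (connect + disconnect);
# measure the real figure with RMF before drawing conclusions.

def estimated_io_seconds(excp_count, ms_per_io=2.0):
    """Estimate total I/O service time in seconds for an EXCP count."""
    return excp_count * ms_per_io / 1000.0

# Ten million single-record I/Os vs. the same data read with good
# blocking (see 32.4.1 for where 71,943 comes from):
print(f"10,000,000 EXCPs: {estimated_io_seconds(10_000_000):,.0f} s")
print(f"    71,943 EXCPs: {estimated_io_seconds(71_943):,.0f} s")
```

Even at a modest two milliseconds per physical I/O, the difference between the two EXCP counts is hours versus minutes of elapsed time.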
32.1.4 Memory Usage
Memory on z/OS is divided into regions, and each job step has a REGION parameter that controls how much virtual storage is available. Excessive memory usage can lead to:
- Paging, which dramatically increases elapsed time.
- Storage shortages that prevent other jobs from running.
- ABEND S878 (insufficient virtual storage).
The following JCL captures basic performance metrics for a batch job:
//PERFTEST JOB (BANK,PERF),'PERFORMANCE TEST',
// CLASS=A,MSGCLASS=X,MSGLEVEL=(1,1),
// NOTIFY=&SYSUID,
// REGION=0M
//*================================================================*
//* PERFORMANCE MEASUREMENT JOB *
//* REGION=0M allows the job to use all available storage. *
//* In production, specify an appropriate REGION size. *
//*================================================================*
//*
//* Step 1: Run the program being measured
//*
//RUNPGM EXEC PGM=DAYENDBT,TIME=1440
//STEPLIB DD DSN=BANK.PROD.LOADLIB,DISP=SHR
//CUSTMAST DD DSN=BANK.PROD.CUSTMAST,DISP=SHR
//TRANFILE DD DSN=BANK.PROD.DAYTRANS,DISP=SHR
//OUTFILE DD DSN=BANK.PROD.DAYEND.OUTPUT,
// DISP=(NEW,CATLG,DELETE),
// UNIT=SYSDA,
// SPACE=(CYL,(100,50),RLSE),
// DCB=(RECFM=FB,LRECL=200,BLKSIZE=27800)
//SYSOUT DD SYSOUT=*
//*
//* The job log shows CPU time, elapsed time, and EXCP counts in
//* the IEF373I/IEF374I (step start/stop) and IEF375I/IEF376I
//* (job start/stop) messages. Example output:
//* IEF373I STEP/RUNPGM /START 25040.1800
//* IEF374I STEP/RUNPGM /STOP  25040.1823
//*         CPU     0MIN 12.45SEC    SRB     0MIN 01.22SEC
32.2 COBOL Compiler Optimization Options
The IBM Enterprise COBOL compiler provides several options that directly affect the performance of the generated code. Choosing the right compiler options is one of the easiest and most impactful performance improvements you can make.
32.2.1 OPTIMIZE
The OPTIMIZE option is the single most important performance-related compiler option. It instructs the compiler to analyze your COBOL code and generate more efficient machine code:
- OPTIMIZE(0) --- No optimization (default). The compiler generates straightforward code that is easy to debug but not optimized for performance.
- OPTIMIZE(1) --- Standard optimization. The compiler performs local optimizations within each paragraph, including common subexpression elimination, strength reduction, and dead code elimination.
- OPTIMIZE(2) --- Full optimization. The compiler performs global optimizations across the entire program, including interprocedural analysis, loop optimization, and register allocation improvements.
Key Concept
Always compile production programs with OPTIMIZE(2). The performance improvement typically ranges from 5% to 30% CPU reduction compared to OPTIMIZE(0), with no changes to your source code. The only trade-off is longer compile times and the fact that debugging optimized code can be more difficult because the generated instructions may not correspond one-to-one with source statements.
//*================================================================*
//* COMPILE WITH OPTIMIZATION FOR PRODUCTION *
//*================================================================*
//COMPILE EXEC PGM=IGYCRCTL,
// PARM='OPTIMIZE(2),TRUNC(OPT),NUMPROC(PFD),
// FASTSRT,NOSSRANGE,LIST,OFFSET,XREF'
//STEPLIB DD DSN=IGY.V6R4M0.SIGYCOMP,DISP=SHR
//SYSIN DD DSN=BANK.SOURCE.COBOL(DAYENDBT),DISP=SHR
//SYSLIB DD DSN=BANK.SOURCE.COPYLIB,DISP=SHR
// DD DSN=CICS.SDFHCOB,DISP=SHR
//SYSLIN DD DSN=&&OBJECT,DISP=(NEW,PASS),
// UNIT=SYSDA,SPACE=(CYL,(5,2))
//SYSPRINT DD SYSOUT=*
//SYSUT1 DD UNIT=SYSDA,SPACE=(CYL,(10,5))
//SYSUT2 DD UNIT=SYSDA,SPACE=(CYL,(10,5))
//SYSUT3 DD UNIT=SYSDA,SPACE=(CYL,(10,5))
//SYSUT4 DD UNIT=SYSDA,SPACE=(CYL,(10,5))
//SYSUT5 DD UNIT=SYSDA,SPACE=(CYL,(10,5))
//SYSUT6 DD UNIT=SYSDA,SPACE=(CYL,(10,5))
//SYSUT7 DD UNIT=SYSDA,SPACE=(CYL,(10,5))
32.2.2 TRUNC
The TRUNC option controls how the compiler handles BINARY (COMP) data items when their values exceed the number of digits specified in the PICTURE clause:
- TRUNC(STD) --- Truncates binary values to the number of digits in the PIC clause. This generates extra instructions for every binary arithmetic operation to ensure truncation.
- TRUNC(OPT) --- Assumes the programmer ensures values do not exceed the PIC size. No truncation instructions are generated, resulting in faster binary arithmetic.
- TRUNC(BIN) --- Treats binary items as full binary values regardless of the PIC clause. Useful for interfacing with non-COBOL programs but generates the most overhead for decimal arithmetic.
*================================================================*
* TRUNC(OPT) vs TRUNC(STD) impact example *
* With TRUNC(STD), every binary operation includes extra *
* instructions to truncate to the PIC size. *
* With TRUNC(OPT), the compiler trusts that values fit. *
*================================================================*
01 WS-COUNTERS.
05 WS-RECORD-COUNT PIC S9(9) COMP.
05 WS-LOOP-INDEX PIC S9(4) COMP.
05 WS-TABLE-SIZE PIC S9(4) COMP VALUE 1000.
* With TRUNC(STD), this simple ADD generates approximately
* 6-8 machine instructions including a divide to truncate.
* With TRUNC(OPT), it generates 1-2 instructions.
ADD 1 TO WS-RECORD-COUNT.
32.2.3 NUMPROC
The NUMPROC option controls how the compiler handles sign processing for packed decimal (COMP-3) and zoned decimal (DISPLAY) data:
- NUMPROC(NOPFD) --- The compiler generates code to "fix" the sign on every numeric operation, converting any valid sign representation to the preferred sign. This is the safest but slowest option.
- NUMPROC(PFD) --- The compiler assumes all numeric data already has preferred signs. No sign-fixing instructions are generated. This is significantly faster for programs that do heavy numeric processing.
- NUMPROC(MIG) --- A migration aid carried over from older compilers; it is not supported in Enterprise COBOL V5 and later. Not recommended for new development.
Key Concept
For programs that perform intensive decimal arithmetic --- such as interest calculation, general ledger posting, or end-of-day settlement --- switching from NUMPROC(NOPFD) to NUMPROC(PFD) can reduce CPU consumption for arithmetic operations by 10-20%. However, you must ensure that all input data has valid preferred signs, or you may get incorrect results.
32.2.4 FASTSRT
The FASTSRT option tells the compiler to allow DFSORT (or SyncSort) to manage the I/O for SORT input and output files directly, bypassing COBOL's I/O routines. This can dramatically reduce the overhead of SORT operations:
*================================================================*
* SORT optimization with FASTSRT *
* When FASTSRT is active, the sort product handles all I/O *
* for the USING and GIVING files directly, eliminating the *
* overhead of COBOL's file management routines. *
*================================================================*
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
SELECT SORT-FILE ASSIGN TO SORTWORK.
SELECT INPUT-FILE ASSIGN TO INFILE.
SELECT OUTPUT-FILE ASSIGN TO OUTFILE.
DATA DIVISION.
FILE SECTION.
SD SORT-FILE.
01 SORT-RECORD.
05 SR-ACCOUNT-NUMBER PIC X(10).
05 SR-TRANS-DATE PIC X(8).
05 SR-TRANS-TIME PIC X(6).
05 SR-TRANS-AMOUNT PIC S9(11)V99 COMP-3.
05 SR-TRANS-TYPE PIC X(2).
05 FILLER PIC X(167).
FD INPUT-FILE.
01 INPUT-RECORD PIC X(200).
FD OUTPUT-FILE.
01 OUTPUT-RECORD PIC X(200).
PROCEDURE DIVISION.
0000-MAIN-LOGIC.
* With FASTSRT, this SORT with USING/GIVING allows the
* sort product to handle all I/O directly --- much faster
* than using INPUT PROCEDURE / OUTPUT PROCEDURE.
SORT SORT-FILE
ON ASCENDING KEY SR-ACCOUNT-NUMBER
ON ASCENDING KEY SR-TRANS-DATE
ON ASCENDING KEY SR-TRANS-TIME
USING INPUT-FILE
GIVING OUTPUT-FILE
IF SORT-RETURN NOT = 0
DISPLAY 'SORT FAILED. SORT RETURN CODE: '
SORT-RETURN
UPON CONSOLE
MOVE 16 TO RETURN-CODE
END-IF
STOP RUN.
32.2.5 Additional Performance-Related Options
| Option | Performance Impact | Recommendation |
|---|---|---|
| SSRANGE | Adds range-checking for subscripts and reference modification. Overhead of 5-15%. | Use in test; consider removing in production if validated. |
| NOTEST | Removes debugging hooks. Reduces code size and improves performance. | Use in production. |
| AWO | Apply Write Only --- buffers output writes for better I/O efficiency. | Use for sequential output files. |
| BLOCK0 | Treats BLOCK CONTAINS 0 as system-determined blocking. | Use for optimal block sizes. |
| RENT | Generates reentrant code, required for CICS and recommended for batch. | Always use. |
| ARITH(EXTEND) | Allows 31-digit precision for COMPUTE. Slightly more overhead for large precision. | Use when you need extended precision. |
32.3 Efficient Coding Techniques
Beyond compiler options, the way you write your COBOL code has a significant impact on performance. This section covers the most impactful coding techniques.
32.3.1 SEARCH ALL vs. SEARCH
COBOL provides two table search verbs: SEARCH (linear search) and SEARCH ALL (binary search). The performance difference is dramatic for large tables:
- SEARCH examines entries sequentially from the current index position. For a table of N entries, it averages N/2 comparisons to find an entry.
- SEARCH ALL uses a binary search algorithm, requiring at most log2(N) comparisons. For a table of 10,000 entries, SEARCH averages 5,000 comparisons; SEARCH ALL requires at most 14.
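The comparison counts quoted above can be cross-checked with a short sketch, written here in Python for illustration rather than COBOL, since the point is the arithmetic rather than the syntax:

```python
# Cross-check of the SEARCH vs. SEARCH ALL comparison counts:
# a successful linear search averages N/2 probes, while a binary
# search needs at most ceil(log2(N)).
import math

def linear_avg_comparisons(n):
    return n / 2  # average for a successful linear SEARCH

def binary_max_comparisons(n):
    return math.ceil(math.log2(n))  # worst case for SEARCH ALL

for n in (500, 10_000, 1_000_000):
    print(f"N={n:>9,}: SEARCH avg {linear_avg_comparisons(n):>9,.0f}, "
          f"SEARCH ALL max {binary_max_comparisons(n):>2}")
```

The gap widens as the table grows: at a million entries a linear search averages half a million comparisons while a binary search needs at most twenty.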
IDENTIFICATION DIVISION.
PROGRAM-ID. SRCHPERF.
*================================================================*
* TABLE SEARCH PERFORMANCE COMPARISON *
* Demonstrates SEARCH ALL (binary) vs SEARCH (linear) *
* for a rate lookup table used in interest calculations. *
* *
* Cross-reference: Chapter 27 (Table Handling) *
*================================================================*
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-RATE-TABLE.
05 WS-RATE-ENTRY OCCURS 500 TIMES
ASCENDING KEY IS WS-RATE-PRODUCT-CODE
INDEXED BY WS-RATE-IDX.
10 WS-RATE-PRODUCT-CODE PIC X(6).
10 WS-RATE-TIER-CODE PIC X(2).
10 WS-RATE-EFFECTIVE-DATE PIC 9(8).
10 WS-RATE-VALUE PIC 9V9(6) COMP-3.
10 WS-RATE-DESCRIPTION PIC X(30).
01 WS-SEARCH-KEY PIC X(6).
01 WS-FOUND-RATE PIC 9V9(6).
01 WS-FOUND-FLAG PIC 9 VALUE 0.
88 WS-RATE-FOUND VALUE 1.
88 WS-RATE-NOT-FOUND VALUE 0.
PROCEDURE DIVISION.
1000-BINARY-SEARCH-RATE.
* SEARCH ALL: Binary search - O(log N) performance
* The table MUST be in ascending order by the KEY field
* and the ASCENDING KEY clause must be specified in the
* OCCURS clause.
MOVE 0 TO WS-FOUND-FLAG
SEARCH ALL WS-RATE-ENTRY
AT END
MOVE 0 TO WS-FOUND-FLAG
WHEN WS-RATE-PRODUCT-CODE(WS-RATE-IDX)
= WS-SEARCH-KEY
MOVE 1 TO WS-FOUND-FLAG
MOVE WS-RATE-VALUE(WS-RATE-IDX)
TO WS-FOUND-RATE
END-SEARCH.
2000-LINEAR-SEARCH-RATE.
* SEARCH: Linear search - O(N) performance
* Much slower for large tables, but does not require
* the table to be sorted.
SET WS-RATE-IDX TO 1
MOVE 0 TO WS-FOUND-FLAG
SEARCH WS-RATE-ENTRY
AT END
MOVE 0 TO WS-FOUND-FLAG
WHEN WS-RATE-PRODUCT-CODE(WS-RATE-IDX)
= WS-SEARCH-KEY
MOVE 1 TO WS-FOUND-FLAG
MOVE WS-RATE-VALUE(WS-RATE-IDX)
TO WS-FOUND-RATE
END-SEARCH.
32.3.2 PERFORM VARYING Optimization
The PERFORM VARYING statement is one of the most frequently executed statements in COBOL programs. Small optimizations in loop processing can have a large cumulative effect when the loop executes millions of times:
*================================================================*
* PERFORM VARYING OPTIMIZATION TECHNIQUES *
*================================================================*
01 WS-LOOP-VARS.
05 WS-IDX PIC S9(8) COMP.
05 WS-MAX-IDX PIC S9(8) COMP.
05 WS-ITEM-COUNT PIC S9(8) COMP.
05 WS-TOTAL PIC S9(13)V99 COMP-3.
05 WS-AVERAGE PIC S9(13)V99 COMP-3.
05 WS-TEMP PIC S9(13)V99 COMP-3.
05 WS-SEARCH-ACCOUNT PIC X(10).
05 WS-UPPER-SEARCH-ACCT PIC X(10).
01 WS-TRANSACTION-TABLE.
05 WS-TRANS-COUNT PIC S9(8) COMP.
05 WS-TRANS-ENTRY OCCURS 10000 TIMES
INDEXED BY WS-TRANS-IDX.
10 WS-TRANS-ACCT PIC X(10).
10 WS-TRANS-AMOUNT PIC S9(11)V99 COMP-3.
10 WS-TRANS-TYPE PIC X(2).
*--- TECHNIQUE 1: Use INDEXED BY instead of a data item -----*
* Index names are stored as displacement values, eliminating
* the multiplication needed with subscripts.
*
* SLOWER (subscript - requires multiplication on each access):
* PERFORM VARYING WS-IDX FROM 1 BY 1
* UNTIL WS-IDX > WS-TRANS-COUNT
* ADD WS-TRANS-AMOUNT(WS-IDX) TO WS-TOTAL
* END-PERFORM
*
* FASTER (index - displacement is pre-calculated):
PERFORM VARYING WS-TRANS-IDX FROM 1 BY 1
UNTIL WS-TRANS-IDX > WS-TRANS-COUNT
ADD WS-TRANS-AMOUNT(WS-TRANS-IDX) TO WS-TOTAL
END-PERFORM
*--- TECHNIQUE 2: Move invariant operations outside the loop -*
*
* SLOWER (function called on every iteration):
* PERFORM VARYING WS-TRANS-IDX FROM 1 BY 1
* UNTIL WS-TRANS-IDX > WS-TRANS-COUNT
* IF WS-TRANS-ACCT(WS-TRANS-IDX) =
* FUNCTION UPPER-CASE(WS-SEARCH-ACCOUNT)
* ADD WS-TRANS-AMOUNT(WS-TRANS-IDX) TO WS-TOTAL
* END-IF
* END-PERFORM
*
* FASTER (function called once before the loop):
MOVE FUNCTION UPPER-CASE(WS-SEARCH-ACCOUNT)
TO WS-UPPER-SEARCH-ACCT
PERFORM VARYING WS-TRANS-IDX FROM 1 BY 1
UNTIL WS-TRANS-IDX > WS-TRANS-COUNT
IF WS-TRANS-ACCT(WS-TRANS-IDX) =
WS-UPPER-SEARCH-ACCT
ADD WS-TRANS-AMOUNT(WS-TRANS-IDX) TO WS-TOTAL
END-IF
END-PERFORM.
*--- TECHNIQUE 3: Minimize operations inside the loop --------*
*
* SLOWER (multiple operations per iteration):
* PERFORM VARYING WS-TRANS-IDX FROM 1 BY 1
* UNTIL WS-TRANS-IDX > WS-TRANS-COUNT
* MOVE WS-TRANS-AMOUNT(WS-TRANS-IDX) TO WS-TEMP
* ADD WS-TEMP TO WS-TOTAL
* ADD 1 TO WS-ITEM-COUNT
* COMPUTE WS-AVERAGE = WS-TOTAL / WS-ITEM-COUNT
* END-PERFORM
*
* FASTER (compute average once after the loop):
MOVE 0 TO WS-TOTAL
MOVE 0 TO WS-ITEM-COUNT
PERFORM VARYING WS-TRANS-IDX FROM 1 BY 1
UNTIL WS-TRANS-IDX > WS-TRANS-COUNT
ADD WS-TRANS-AMOUNT(WS-TRANS-IDX) TO WS-TOTAL
ADD 1 TO WS-ITEM-COUNT
END-PERFORM
IF WS-ITEM-COUNT > 0
COMPUTE WS-AVERAGE = WS-TOTAL / WS-ITEM-COUNT
END-IF.
32.3.3 Data Type Efficiency
The choice of data types in COBOL has a direct impact on CPU consumption. Different data types have different costs for arithmetic, comparison, and move operations:
*================================================================*
* DATA TYPE PERFORMANCE CHARACTERISTICS *
*================================================================*
01 WS-DATA-TYPES.
* COMP (BINARY) - Fastest for arithmetic and comparisons
* when used as subscripts, loop counters, and flags.
* Uses hardware binary arithmetic instructions.
05 WS-BINARY-COUNTER PIC S9(8) COMP.
05 WS-BINARY-FLAG PIC S9(4) COMP.
* COMP-3 (PACKED DECIMAL) - Best for financial calculations.
* Uses hardware packed decimal instructions.
* Ideal for amounts, balances, quantities.
05 WS-PACKED-AMOUNT PIC S9(13)V99 COMP-3.
05 WS-PACKED-RATE PIC S9V9(6) COMP-3.
* DISPLAY (ZONED DECIMAL) - Slowest for arithmetic.
* Requires conversion to packed or binary before operations.
* Use only for data that will be displayed/printed as-is.
05 WS-DISPLAY-AMOUNT PIC 9(13)V99.
* COMP-5 (NATIVE BINARY) - Same as COMP but always uses
* full binary range regardless of TRUNC option.
* Use for interfacing with non-COBOL programs.
05 WS-NATIVE-BINARY PIC S9(8) COMP-5.
*================================================================*
* PERFORMANCE RULE: For counters, subscripts, and flags, *
* always use PIC S9(8) COMP. This maps to a fullword (4 bytes) *
* and uses the fastest hardware instructions. *
* *
* For financial amounts, use COMP-3 (packed decimal). *
* Avoid DISPLAY for any field involved in arithmetic. *
*================================================================*
* SLOWER - mixing DISPLAY and COMP-3 causes conversions:
* ADD WS-DISPLAY-AMOUNT TO WS-PACKED-AMOUNT
* (The compiler must convert DISPLAY to packed decimal
* before the addition, then convert back)
* FASTER - keep all arithmetic operands as COMP-3:
ADD WS-PACKED-RATE TO WS-PACKED-AMOUNT.
Key Concept
The golden rule of COBOL data types for performance is: use COMP (binary) for counters, subscripts, loop variables, and flags; use COMP-3 (packed decimal) for financial amounts and arithmetic operands; use DISPLAY only for fields that are read from or written to external files in character format. Never perform arithmetic on DISPLAY fields if you can avoid it. This principle was also discussed in Chapter 28 when covering numeric data types.
32.3.4 Efficient EVALUATE vs. Nested IF
When testing multiple conditions, EVALUATE is generally more efficient and readable than nested IF statements, particularly when the compiler can optimize it into a branch table:
* EFFICIENT: EVALUATE generates optimized branch logic
EVALUATE WS-TRANS-TYPE
WHEN 'DP'
PERFORM 5100-PROCESS-DEPOSIT
WHEN 'WD'
PERFORM 5200-PROCESS-WITHDRAWAL
WHEN 'XF'
PERFORM 5300-PROCESS-TRANSFER
WHEN 'PY'
PERFORM 5400-PROCESS-PAYMENT
WHEN 'FE'
PERFORM 5500-PROCESS-FEE
WHEN OTHER
PERFORM 5900-PROCESS-UNKNOWN
END-EVALUATE.
32.4 File I/O Optimization
I/O is almost always the dominant factor in batch program elapsed time. A program that processes 10 million records can spend 90% or more of its elapsed time waiting for I/O. Optimizing I/O is therefore the highest-leverage activity for batch performance tuning.
32.4.1 Block Size Optimization
The block size determines how many logical records are read or written in a single physical I/O operation. A larger block size means fewer I/O operations, which directly reduces elapsed time:
//*================================================================*
//* BLOCK SIZE OPTIMIZATION EXAMPLES *
//*================================================================*
//*
//* POOR PERFORMANCE - small block size, many I/O operations:
//* //CUSTMAST DD DSN=BANK.PROD.CUSTMAST,DISP=SHR,
//* // DCB=(RECFM=FB,LRECL=200,BLKSIZE=200)
//* (1 record per block = 1 I/O per record)
//*
//* GOOD PERFORMANCE - optimal block size:
//CUSTMAST DD DSN=BANK.PROD.CUSTMAST,DISP=SHR,
// DCB=(RECFM=FB,LRECL=200,BLKSIZE=27800)
//* (139 records per block, using half-track blocking)
//*
//* BEST PRACTICE - let the system determine optimal block size:
//* //CUSTMAST DD DSN=BANK.PROD.CUSTMAST,DISP=SHR,
//* // DCB=(RECFM=FB,LRECL=200,BLKSIZE=0)
//* BLKSIZE=0 tells DFSMS to choose the optimal block size
//* based on the device geometry.
//*
//*================================================================*
//* BLOCK SIZE CALCULATION FOR 3390 DISK: *
//* Track capacity = 56,664 bytes *
//* Half-track = 27,998 bytes *
//* For LRECL=200: BLKSIZE = (27998 / 200) * 200 = 27800 *
//* This fits 139 records per block. *
//* *
//* Impact: Processing 10,000,000 records *
//* BLKSIZE=200: 10,000,000 I/Os *
//* BLKSIZE=27800: 71,943 I/Os *
//* Reduction: 99.3% fewer I/O operations *
//*================================================================*
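The half-track arithmetic in the comments above can be packaged as a small calculator. It is expressed here in Python for illustration; the 3390 geometry figures are taken from the JCL comments.

```python
# Half-track blocking calculator for 3390 geometry, reproducing the
# figures in the JCL comments above (half-track = 27,998 bytes).
import math

HALF_TRACK = 27_998  # usable half-track block size on a 3390

def optimal_blksize(lrecl):
    """Largest BLKSIZE <= half-track that is a multiple of LRECL."""
    return (HALF_TRACK // lrecl) * lrecl

def physical_ios(record_count, lrecl, blksize):
    """Physical I/Os needed to read record_count blocked records."""
    return math.ceil(record_count / (blksize // lrecl))

blk = optimal_blksize(200)
print(f"BLKSIZE={blk}, {blk // 200} records/block")
print(f"I/Os for 10,000,000 records: {physical_ios(10_000_000, 200, blk):,}")
```

In practice BLKSIZE=0 delegates this calculation to DFSMS, which knows the device geometry; the calculator simply shows where the numbers come from.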
32.4.2 Buffer Optimization (BUFNO and BUFSIZE)
Buffers are areas of memory that hold blocks of data being read from or written to files. More buffers allow the system to read ahead (for input) or write behind (for output), overlapping I/O with processing:
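The effect of read-ahead can be captured in a simplified timing model: with enough buffers, processing overlaps I/O, so elapsed time approaches the larger of the two instead of their sum. The timings below are assumed figures purely for illustration.

```python
# Simplified model of why buffering shortens elapsed time.
# With one buffer the program alternates compute and wait; with deep
# read-ahead the two activities overlap almost completely.

def elapsed_unbuffered(cpu_s, io_s):
    """Single buffer: compute and I/O are serialized."""
    return cpu_s + io_s

def elapsed_buffered(cpu_s, io_s):
    """Deep read-ahead: compute and I/O fully overlapped (ideal case)."""
    return max(cpu_s, io_s)

cpu, io = 120.0, 300.0   # hypothetical job-step seconds
print(f"BUFNO=1 : ~{elapsed_unbuffered(cpu, io):.0f}s elapsed")
print(f"BUFNO=20: ~{elapsed_buffered(cpu, io):.0f}s elapsed")
```

Real behavior falls between the two extremes, but the model explains why adding buffers helps most when I/O time and CPU time are comparable.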
//*================================================================*
//* BUFFER OPTIMIZATION *
//*================================================================*
//*
//* For sequential input files - read-ahead buffering:
//TRANFILE DD DSN=BANK.PROD.DAYTRANS,DISP=SHR,
// DCB=(BUFNO=20)
//* 20 buffers allows aggressive read-ahead, keeping the
//* program processing while the next blocks are being read.
//*
//* For sequential output files - write-behind buffering:
//OUTFILE DD DSN=BANK.PROD.DAYEND.OUTPUT,
// DISP=(NEW,CATLG,DELETE),
// UNIT=SYSDA,
// SPACE=(CYL,(100,50),RLSE),
// DCB=(RECFM=FB,LRECL=200,BLKSIZE=27800,BUFNO=20)
//*
//* For VSAM files - use BUFFERSPACE or AMP parameter:
//ACCTMAST DD DSN=BANK.PROD.ACCTMAST,DISP=SHR,
// AMP=('BUFND=20,BUFNI=10')
//* BUFND=20: 20 data buffers for data CI reads
//* BUFNI=10: 10 index buffers to cache the VSAM index
32.4.3 VSAM Tuning
VSAM (Virtual Storage Access Method) files are the most common file organization for online and random-access data on z/OS. As discussed in Chapter 22, VSAM performance depends on several factors.
CI and CA Splits:
A Control Interval (CI) split occurs when a record is inserted into a full CI. The CI must be split into two, with half the records moved to a new CI. A Control Area (CA) split is even more expensive --- it occurs when all CIs in a CA are full and a new CI cannot be allocated within the CA.
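The relationship between FREESPACE and split activity can be illustrated with a toy model: keys arrive in random order, each CI holds a fixed number of records, and inserting into a full CI forces a split. The CI capacity, CI count, and insert pattern below are assumptions purely for illustration, not a model of real VSAM internals.

```python
# Toy model of CI splits: CIs are pre-loaded to (100 - FREESPACE)%
# full, then random inserts arrive; a full CI splits into two
# half-full CIs. All sizes here are invented for demonstration.
import random

def count_ci_splits(inserts, ci_capacity, freespace_pct, seed=7):
    """Count CI splits for a given FREESPACE percentage."""
    rng = random.Random(seed)
    preload = int(ci_capacity * (100 - freespace_pct) / 100)
    cis = [preload] * 100            # record counts per CI
    splits = 0
    for _ in range(inserts):
        i = rng.randrange(len(cis))
        if cis[i] >= ci_capacity:    # full: split records in half
            splits += 1
            half = cis[i] // 2
            cis.insert(i + 1, cis[i] - half)
            cis[i] = half
        cis[i] += 1
    return splits

for fs in (0, 10, 20, 30):
    print(f"FREESPACE({fs:>2}): {count_ci_splits(5000, 20, fs)} CI splits")
```

Even this crude model shows the pattern seen in practice: free space absorbs inserts that would otherwise trigger splits, at the cost of a larger initial file.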
IDENTIFICATION DIVISION.
PROGRAM-ID. VSAMPERF.
*================================================================*
* VSAM PERFORMANCE OPTIMIZATION *
* Demonstrates techniques for efficient VSAM access *
*================================================================*
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
SELECT ACCOUNT-FILE
ASSIGN TO ACCTMAST
ORGANIZATION IS INDEXED
ACCESS MODE IS DYNAMIC
RECORD KEY IS ACCT-KEY
FILE STATUS IS WS-FILE-STATUS.
DATA DIVISION.
FILE SECTION.
FD ACCOUNT-FILE.
01 ACCOUNT-RECORD.
05 ACCT-KEY PIC X(14).
05 ACCT-DATA PIC X(186).
WORKING-STORAGE SECTION.
01 WS-FILE-STATUS PIC XX.
01 WS-RECORDS-READ PIC S9(8) COMP VALUE 0.
PROCEDURE DIVISION.
1000-SEQUENTIAL-BROWSE-TECHNIQUE.
* For processing many records in key sequence,
* sequential (browse) access is MUCH faster than
* random READ because it uses sequential buffering.
*
* SLOWER - random reads for sequential processing:
* PERFORM VARYING WS-IDX FROM 1 BY 1
* UNTIL WS-IDX > WS-NUM-ACCOUNTS
* MOVE WS-ACCT-TABLE(WS-IDX) TO ACCT-KEY
* READ ACCOUNT-FILE
* INVALID KEY CONTINUE
* END-READ
* END-PERFORM
*
* FASTER - sequential browse when reading many records:
MOVE LOW-VALUES TO ACCT-KEY
START ACCOUNT-FILE
KEY IS NOT LESS THAN ACCT-KEY
INVALID KEY
DISPLAY 'START FAILED' UPON CONSOLE
END-START
PERFORM UNTIL WS-FILE-STATUS NOT = '00'
READ ACCOUNT-FILE NEXT
AT END
CONTINUE
END-READ
IF WS-FILE-STATUS = '00'
ADD 1 TO WS-RECORDS-READ
PERFORM 2000-PROCESS-ACCOUNT
END-IF
END-PERFORM.
3000-BATCH-UPDATE-TECHNIQUE.
* For batch updates, sort the updates into key sequence
* and process sequentially. This minimizes random I/O
* and CI splits by updating records in physical order.
CONTINUE.
VSAM Performance JCL:
//*================================================================*
//* VSAM CLUSTER DEFINITION WITH PERFORMANCE OPTIONS *
//*================================================================*
//DEFVSAM EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN DD *
DELETE BANK.PROD.ACCTMAST CLUSTER PURGE
SET MAXCC = 0
DEFINE CLUSTER ( -
NAME(BANK.PROD.ACCTMAST) -
INDEXED -
RECORDSIZE(200 200) -
KEYS(14 0) -
SHAREOPTIONS(2 3) -
SPEED -
FREESPACE(20 10) -
CONTROLINTERVALSIZE(4096) -
) -
DATA ( -
NAME(BANK.PROD.ACCTMAST.DATA) -
CYLINDERS(500 100) -
CONTROLINTERVALSIZE(4096) -
BUFFERSPACE(1048576) -
) -
INDEX ( -
NAME(BANK.PROD.ACCTMAST.INDEX) -
CYLINDERS(10 5) -
CONTROLINTERVALSIZE(2048) -
)
/*
//*
//* FREESPACE(20 10):
//* 20% free space in each CI for future inserts
//* 10% of CIs left empty in each CA for CI splits
//*
//* SHAREOPTIONS(2 3):
//* 2 = Cross-region: multiple readers plus one writer;
//* VSAM ensures write integrity but not read integrity
//* 3 = Cross-system: fully shared; serialization and data
//* integrity are the user's responsibility
//*
//* BUFFERSPACE(1048576):
//* 1 MB of buffer space for data I/O
Key Concept
The most common VSAM performance problem is excessive CI and CA splits caused by insufficient FREESPACE. Monitor CI/CA split counts using IDCAMS LISTCAT and reorganize VSAM files before splits become a significant performance drag. A VSAM file with heavy insert activity should have FREESPACE(20 10) or higher, adjusted based on monitoring data. This was discussed in Chapter 29 when covering VSAM file design.
32.5 DB2 Performance
For COBOL programs that access DB2, the SQL statements are often the dominant consumer of both CPU and elapsed time. Efficient SQL is critical for performance.
32.5.1 Efficient SQL Coding
The way you write SQL directly affects how DB2 accesses data. Even small changes in SQL syntax can cause DB2 to choose a dramatically different access path:
IDENTIFICATION DIVISION.
PROGRAM-ID. DB2PERF.
*================================================================*
* DB2 PERFORMANCE OPTIMIZATION EXAMPLES *
* Demonstrates efficient SQL coding techniques for COBOL. *
* *
* Cross-reference: Chapter 24 (DB2 Programming), *
* Chapter 30 (SQL Best Practices) *
*================================================================*
DATA DIVISION.
WORKING-STORAGE SECTION.
EXEC SQL INCLUDE SQLCA END-EXEC.
01 WS-HOST-VARS.
05 WS-ACCT-NUMBER PIC X(10).
05 WS-BRANCH-CODE PIC X(4).
05 WS-START-DATE PIC X(10).
05 WS-END-DATE PIC X(10).
05 WS-TRANS-TOTAL PIC S9(13)V99 COMP-3.
05 WS-TRANS-COUNT PIC S9(8) COMP.
05 WS-ACCT-BALANCE PIC S9(13)V99 COMP-3.
05 WS-ACCT-NAME PIC X(30).
05 WS-ACCT-STATUS PIC X(1).
05 WS-NULL-IND PIC S9(4) COMP.
01 WS-FETCH-ARRAYS.
05 WS-ACCT-ARRAY PIC X(10) OCCURS 100 TIMES.
05 WS-DATE-ARRAY PIC X(10) OCCURS 100 TIMES.
05 WS-AMOUNT-ARRAY PIC S9(11)V99 COMP-3 OCCURS 100 TIMES.
PROCEDURE DIVISION.
1000-USE-INDEX-FRIENDLY-PREDICATES.
* INEFFICIENT - function on column prevents index use:
* EXEC SQL
* SELECT ACCT_NUMBER, ACCT_NAME
* INTO :WS-ACCT-NUMBER, :WS-ACCT-NAME
* FROM BANKDB.CUSTOMER
* WHERE SUBSTR(ACCT_NUMBER,1,4) = :WS-BRANCH-CODE
* END-EXEC
*
* EFFICIENT - range predicate allows index scan:
EXEC SQL
SELECT ACCT_NUMBER, ACCT_NAME
INTO :WS-ACCT-NUMBER, :WS-ACCT-NAME
FROM BANKDB.CUSTOMER
WHERE ACCT_NUMBER >= :WS-BRANCH-CODE || '000000'
AND ACCT_NUMBER <= :WS-BRANCH-CODE || '999999'
END-EXEC.
2000-SELECT-ONLY-NEEDED-COLUMNS.
* INEFFICIENT - SELECT * reads all columns:
* EXEC SQL
* SELECT *
* INTO :WS-CUSTOMER-RECORD
* FROM BANKDB.CUSTOMER
* WHERE ACCT_NUMBER = :WS-ACCT-NUMBER
* END-EXEC
*
* EFFICIENT - select only what you need:
EXEC SQL
SELECT ACCT_NAME, ACCT_BALANCE, ACCT_STATUS
INTO :WS-ACCT-NAME, :WS-ACCT-BALANCE,
:WS-ACCT-STATUS
FROM BANKDB.CUSTOMER
WHERE ACCT_NUMBER = :WS-ACCT-NUMBER
END-EXEC.
3000-USE-HOST-VARIABLES-NOT-LITERALS.
* INEFFICIENT - literals cause different access paths
* to be cached for each different value:
* EXEC SQL
* SELECT COUNT(*)
* INTO :WS-TRANS-COUNT
* FROM BANKDB.TRANSACTIONS
* WHERE BRANCH_CODE = 'B001'
* AND TRANS_DATE >= '2025-01-01'
* END-EXEC
*
* EFFICIENT - host variables allow access path reuse:
MOVE 'B001' TO WS-BRANCH-CODE
MOVE '2025-01-01' TO WS-START-DATE
EXEC SQL
SELECT COUNT(*)
INTO :WS-TRANS-COUNT
FROM BANKDB.TRANSACTIONS
WHERE BRANCH_CODE = :WS-BRANCH-CODE
AND TRANS_DATE >= :WS-START-DATE
END-EXEC.
4000-AVOID-UNNECESSARY-SORTS.
* INEFFICIENT - ORDER BY on a non-indexed column
* forces a sort operation in DB2:
* EXEC SQL
* DECLARE C-UNSORTED CURSOR FOR
* SELECT ACCT_NUMBER, TRANS_DATE, TRANS_AMOUNT
* FROM BANKDB.TRANSACTIONS
* WHERE BRANCH_CODE = :WS-BRANCH-CODE
* ORDER BY TRANS_AMOUNT DESC
* END-EXEC
*
* EFFICIENT - ORDER BY on the indexed key avoids a sort:
EXEC SQL
DECLARE C-SORTED CURSOR FOR
SELECT ACCT_NUMBER, TRANS_DATE, TRANS_AMOUNT
FROM BANKDB.TRANSACTIONS
WHERE BRANCH_CODE = :WS-BRANCH-CODE
ORDER BY ACCT_NUMBER, TRANS_DATE
END-EXEC.
5000-USE-FETCH-FOR-MULTIPLE-ROWS.
* EFFICIENT - MULTI-ROW FETCH retrieves many rows in
* one DB2 call, reducing the number of FETCH calls
* from one per row to one per block of rows:
EXEC SQL
DECLARE C-MULTI CURSOR
WITH ROWSET POSITIONING FOR
SELECT ACCT_NUMBER, TRANS_DATE, TRANS_AMOUNT
FROM BANKDB.TRANSACTIONS
WHERE BRANCH_CODE = :WS-BRANCH-CODE
AND TRANS_DATE BETWEEN :WS-START-DATE
AND :WS-END-DATE
ORDER BY ACCT_NUMBER
END-EXEC
EXEC SQL OPEN C-MULTI END-EXEC
EXEC SQL
FETCH NEXT ROWSET FROM C-MULTI
FOR 100 ROWS
INTO :WS-ACCT-ARRAY,
:WS-DATE-ARRAY,
:WS-AMOUNT-ARRAY
END-EXEC.
32.5.2 Using EXPLAIN to Analyze Access Paths
The DB2 EXPLAIN facility reveals how DB2 will access data for your SQL statements. This is essential for understanding and optimizing performance:
//*================================================================*
//* EXPLAIN THE ACCESS PATH FOR A COBOL PROGRAM'S SQL *
//*================================================================*
//EXPLAIN EXEC PGM=IKJEFT01,DYNAMNBR=20
//STEPLIB DD DSN=DSN.V13R1.SDSNLOAD,DISP=SHR
//SYSTSPRT DD SYSOUT=*
//SYSPRINT DD SYSOUT=*
//SYSTSIN DD *
DSN SYSTEM(DB2P)
BIND PACKAGE(DAYENDCL) -
MEMBER(DAYENDBT) -
ACTION(REPLACE) -
EXPLAIN(YES) -
ISOLATION(CS)
END
/*
//*
//* After the bind, query the PLAN_TABLE to see access paths:
//*
//QUERY EXEC PGM=IKJEFT01,DYNAMNBR=20
//STEPLIB DD DSN=DSN.V13R1.SDSNLOAD,DISP=SHR
//SYSTSPRT DD SYSOUT=*
//SYSPRINT DD SYSOUT=*
//SYSTSIN DD *
DSN SYSTEM(DB2P)
RUN PROGRAM(DSNTEP2) PLAN(DSNTEP2) -
LIB('DSN.V13R1.RUNLIB.LOAD')
END
//SYSIN DD *
SELECT QUERYNO, QBLOCKNO, PLANNO,
METHOD, TNAME, ACCESSTYPE,
MATCHCOLS, INDEXONLY, ACCESSNAME
  -- PLAN_TABLE is created under (and qualified by) your own
  -- authorization ID or CURRENT SQLID, not SYSIBM
  FROM PLAN_TABLE
WHERE APPLNAME = 'DAYENDPL'
ORDER BY QUERYNO, QBLOCKNO, PLANNO;
/*
Key EXPLAIN output values to look for:
| Column | Good Values | Problem Values |
|---|---|---|
| ACCESSTYPE | I (index), I1 (one-fetch index) | R (tablespace scan) |
| MATCHCOLS | > 0 (index columns matched) | 0 (no matching, full index scan) |
| INDEXONLY | Y (data from index only, no table access) | N (must access table pages) |
| METHOD | 0 (first table accessed), 1 (nested loop join) | 2 (merge scan join), 3 (additional sorts for ORDER BY, GROUP BY, etc.), 4 (hybrid join) |
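A simple script can scan PLAN_TABLE extracts for the problem values in the table above. This is an illustrative sketch only: each row is modeled as a plain dictionary keyed by the column names shown, not read from DB2:

```python
def access_path_warnings(row: dict) -> list:
    """Flag the classic problem values in one PLAN_TABLE row."""
    warnings = []
    if row.get("ACCESSTYPE") == "R":
        warnings.append("tablespace scan - check for a usable index")
    elif str(row.get("ACCESSTYPE", "")).startswith("I") \
            and row.get("MATCHCOLS", 0) == 0:
        warnings.append("non-matching index scan - predicate not indexable")
    if row.get("METHOD") == 3:
        warnings.append("additional sort - compare ORDER BY with index order")
    return warnings

bad_row = {"ACCESSTYPE": "R", "MATCHCOLS": 0, "METHOD": 3}
good_row = {"ACCESSTYPE": "I", "MATCHCOLS": 2, "METHOD": 0}
print(access_path_warnings(bad_row))
print(access_path_warnings(good_row))
```

Running a check like this against every bound package after each rebind catches access-path regressions (such as the dropped index in the case study later in this chapter) before they reach the batch window.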
32.5.3 Host Variable Optimization
The way you define host variables in COBOL can affect DB2 performance. Mismatched data types between host variables and DB2 columns cause DB2 to perform conversions, which adds CPU overhead:
*================================================================*
* HOST VARIABLE ALIGNMENT WITH DB2 COLUMN TYPES *
*================================================================*
* DB2 Column COBOL Host Variable
* -------------------------- -------------------------
* CHAR(10) PIC X(10)
* VARCHAR(30) 01 WS-NAME.
* 49 WS-NAME-LEN PIC S9(4) COMP.
* 49 WS-NAME-TEXT PIC X(30).
* DECIMAL(13,2) PIC S9(13)V99 COMP-3
* INTEGER PIC S9(9) COMP
* SMALLINT PIC S9(4) COMP
* DATE PIC X(10)
* TIMESTAMP PIC X(26)
*
* IMPORTANT: Mismatched types cause DB2 to convert at runtime.
* For example, if a DB2 column is DECIMAL(13,2) and your host
* variable is PIC S9(11)V99 COMP-3 (DECIMAL(13,2)), no
* conversion is needed. But if your host variable is
* PIC 9(13).99 (DISPLAY), DB2 must convert on every fetch.
01 WS-EFFICIENT-HOST-VARS.
* These match DB2 column types exactly - no conversion needed
05 WS-ACCT-NUMBER PIC X(10).
05 WS-ACCT-BALANCE PIC S9(13)V99 COMP-3.
05 WS-TRANS-COUNT PIC S9(9) COMP.
05 WS-TRANS-DATE PIC X(10).
05 WS-LAST-UPDATE PIC X(26).
01 WS-INEFFICIENT-HOST-VARS.
* These require DB2 runtime conversion - AVOID
05 WS-BAD-BALANCE PIC 9(13)V99.
05 WS-BAD-COUNT PIC 9(9).
32.6 CICS Performance
CICS performance tuning focuses on keeping transaction response times low and system throughput high. The techniques differ from batch tuning because CICS programs must share resources with many other concurrent transactions.
32.6.1 Pseudo-Conversational vs. Conversational Design
As discussed in Chapter 23, pseudo-conversational programming is essential for CICS performance. In a conversational program, the task waits while the user reads the screen, consuming a CICS task slot and all associated resources. In a pseudo-conversational program, the task ends after sending the screen and restarts when the user presses Enter:
*================================================================*
* PSEUDO-CONVERSATIONAL PATTERN FOR CICS PERFORMANCE *
* Cross-reference: Chapter 23 (CICS Programming) *
*================================================================*
PROCEDURE DIVISION.
0000-MAIN-LOGIC.
EVALUATE TRUE
WHEN EIBCALEN = 0
* First entry - display the initial screen
PERFORM 1000-SEND-MAP
WHEN EIBAID = DFHENTER
* User pressed Enter - process input
PERFORM 2000-RECEIVE-AND-PROCESS
WHEN EIBAID = DFHPF3
* User pressed PF3 - exit
EXEC CICS RETURN END-EXEC
WHEN OTHER
PERFORM 1000-SEND-MAP
END-EVALUATE
* Return control to CICS - the task ENDS here.
* No resources are consumed while the user thinks.
EXEC CICS RETURN
TRANSID('BINQ')
COMMAREA(WS-COMMAREA)
LENGTH(LENGTH OF WS-COMMAREA)
END-EXEC.
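The resource saving of the pseudo-conversational pattern can be estimated with Little's Law (L = lambda x W): the average number of concurrent tasks equals the arrival rate times the time each task exists. The workload figures below are assumptions for illustration, not measurements from the listing:

```python
# Assumed workload: 50 transaction starts/second, 0.3 s of actual
# processing, and 15 s of user think time per interaction.
def avg_concurrent_tasks(arrivals_per_s: float, task_duration_s: float) -> float:
    """Little's Law: average tasks in the system = arrival rate x duration."""
    return arrivals_per_s * task_duration_s

# Conversational: the task (and its storage) is held through think time.
conversational = avg_concurrent_tasks(50, 0.3 + 15.0)
# Pseudo-conversational: the task ends before the user starts thinking.
pseudo = avg_concurrent_tasks(50, 0.3)
print(conversational, pseudo)
```

Under these assumptions the region holds about 765 concurrent tasks conversationally versus about 15 pseudo-conversationally, which is why conversational designs exhaust MXT limits and storage long before the CPU is busy.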
32.6.2 BMS Optimization
BMS (Basic Mapping Support) map processing can be optimized by sending only the data that has changed:
*================================================================*
* BMS PERFORMANCE OPTIMIZATION TECHNIQUES *
*================================================================*
PROCEDURE DIVISION.
1000-EFFICIENT-MAP-SEND.
* TECHNIQUE 1: Send the full map only once.
* On first display, send the map with ERASE to paint
* the whole screen; on later displays send DATAONLY:
EXEC CICS SEND MAP('ACCTMAP')
MAPSET('ACCTSET')
FROM(ACCTMAPO)
ERASE
RESP(WS-RESP)
END-EXEC.
1100-UPDATE-DATA-ONLY.
* On subsequent displays, send DATAONLY to avoid
* resending the entire map format:
EXEC CICS SEND MAP('ACCTMAP')
MAPSET('ACCTSET')
FROM(ACCTMAPO)
DATAONLY
RESP(WS-RESP)
END-EXEC.
2000-MINIMIZE-COMMAREA-SIZE.
* TECHNIQUE 2: Keep the COMMAREA as small as possible.
* CICS must save every byte of the COMMAREA when the
* task ends and copy it back when the next task starts,
* and the saved copy occupies virtual storage between
* tasks. Smaller = faster and less storage.
*
* BAD: Storing a 2000-byte record in the COMMAREA
* 01 WS-BIG-COMMAREA.
* 05 CA-CUSTOMER-RECORD PIC X(2000).
*
* GOOD: Store only the key; re-read the data on restart
01 WS-SMALL-COMMAREA.
05 CA-STATE PIC X(2).
05 CA-ACCOUNT-KEY PIC X(14).
05 CA-SCREEN-PAGE PIC S9(4) COMP.
05 CA-ERROR-FLAG PIC X(1).
* Total: 19 bytes (2 + 14 + 2 + 1) instead of 2000+ bytes
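The cumulative effect is easy to model. The sketch below derives the small COMMAREA length from the PIC clauses above (a halfword COMP item is 2 bytes) and assumes a hypothetical 500,000 pseudo-conversational interactions per day:

```python
def commarea_traffic_bytes(area_len: int, tasks_per_day: int) -> int:
    """Bytes CICS must save at task end plus restore at task start."""
    return area_len * tasks_per_day * 2

# CA-STATE (2) + CA-ACCOUNT-KEY (14) + CA-SCREEN-PAGE (halfword COMP, 2)
# + CA-ERROR-FLAG (1)
small_len = 2 + 14 + 2 + 1

big = commarea_traffic_bytes(2000, 500_000)     # 2000-byte COMMAREA
small = commarea_traffic_bytes(small_len, 500_000)
print(small_len, big, small)
```

At this assumed volume, the 2000-byte design moves about 2 GB of COMMAREA data per day where the 19-byte design moves under 20 MB, roughly a hundredfold reduction.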
32.6.3 CICS Resource Optimization
*================================================================*
* CICS RESOURCE OPTIMIZATION TECHNIQUES *
*================================================================*
PROCEDURE DIVISION.
1000-EFFICIENT-FILE-READS.
* TECHNIQUE: Combine file reads when possible.
* Each EXEC CICS READ is a separate request to the
* CICS file control module. Minimize the number of
* file requests per transaction.
*
* SLOWER - two separate reads:
* EXEC CICS READ FILE('CUSTMAST') ...
* EXEC CICS READ FILE('ACCTMAST') ...
*
* FASTER - if data is in DB2, use a single SQL join:
EXEC SQL
SELECT C.CUST_NAME, A.ACCT_BALANCE,
A.ACCT_STATUS
INTO :WS-CUST-NAME, :WS-ACCT-BALANCE,
:WS-ACCT-STATUS
FROM BANKDB.CUSTOMER C
INNER JOIN BANKDB.ACCT_MASTER A
ON C.CUST_ID = A.CUST_ID
WHERE A.ACCT_NUMBER = :WS-ACCT-NUMBER
END-EXEC.
2000-EFFICIENT-ENQUEUE.
* TECHNIQUE: Minimize the time locks are held.
* Read with UPDATE only when you are about to update.
* Do all validation BEFORE acquiring the lock.
* SLOWER - lock held during validation:
* EXEC CICS READ FILE('ACCTMAST') UPDATE ...
* PERFORM 3000-VALIDATE-INPUT <-- lock held here
* EXEC CICS REWRITE FILE('ACCTMAST') ...
* FASTER - validate first, then lock briefly:
EXEC CICS READ FILE('ACCTMAST')
INTO(WS-ACCOUNT-RECORD)
RIDFLD(WS-ACCT-KEY)
RESP(WS-RESP)
    END-EXEC
*   Save the balance for the optimistic re-check below
    MOVE WS-ACCT-BALANCE TO WS-SAVED-BALANCE
    PERFORM 3000-VALIDATE-INPUT
IF WS-INPUT-VALID
* Now acquire the lock and update quickly
EXEC CICS READ FILE('ACCTMAST')
INTO(WS-ACCOUNT-RECORD)
RIDFLD(WS-ACCT-KEY)
UPDATE
RESP(WS-RESP)
END-EXEC
* Verify data hasn't changed since our read
IF WS-ACCT-BALANCE = WS-SAVED-BALANCE
PERFORM 4000-APPLY-UPDATE
EXEC CICS REWRITE FILE('ACCTMAST')
FROM(WS-ACCOUNT-RECORD)
RESP(WS-RESP)
END-EXEC
ELSE
* Optimistic lock failure - data changed
EXEC CICS UNLOCK FILE('ACCTMAST')
RESP(WS-RESP)
END-EXEC
PERFORM 5000-HANDLE-CONCURRENT-UPDATE
END-IF
END-IF.
32.7 Memory Optimization
Memory layout in WORKING-STORAGE and the choice of data representations can significantly affect both CPU consumption and memory footprint.
32.7.1 WORKING-STORAGE Layout
The order of fields in WORKING-STORAGE can affect performance due to hardware alignment considerations and cache line utilization:
*================================================================*
* WORKING-STORAGE LAYOUT OPTIMIZATION *
*================================================================*
* PRINCIPLE 1: Group frequently-accessed fields together.
* Fields accessed together in the same paragraph should be
* defined near each other so they occupy the same cache line.
* GOOD - related fields are adjacent:
01 WS-TRANSACTION-PROCESSING.
05 WS-TRANS-COUNT PIC S9(8) COMP.
05 WS-TRANS-TOTAL PIC S9(13)V99 COMP-3.
05 WS-TRANS-TYPE PIC X(2).
05 WS-TRANS-STATUS PIC X(1).
* PRINCIPLE 2: Group binary (COMP) fields together.
* When SYNCHRONIZED is specified, the compiler inserts
* slack bytes so each binary item falls on its natural
* boundary. Define SYNC COMP fields in groups to
* minimize slack.
* SUBOPTIMAL - slack bytes inserted for alignment:
* 01 WS-MIXED-FIELDS.
* 05 WS-FLAG-1 PIC X. (1 byte)
* (3 bytes slack)
* 05 WS-COUNTER-1 PIC S9(8) COMP SYNC. (4 bytes)
* 05 WS-FLAG-2 PIC X. (1 byte)
* (3 bytes slack)
* 05 WS-COUNTER-2 PIC S9(8) COMP SYNC. (4 bytes)
* Total: 16 bytes with 6 bytes wasted on slack
* OPTIMAL - COMP fields grouped together:
01 WS-ALIGNED-FIELDS.
05 WS-COUNTER-1 PIC S9(8) COMP.
05 WS-COUNTER-2 PIC S9(8) COMP.
05 WS-FLAG-1 PIC X.
05 WS-FLAG-2 PIC X.
* Total: 10 bytes with no slack
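The slack-byte arithmetic can be verified with a small layout simulator. This sketch assumes the compiler pads each SYNCHRONIZED binary item to its natural boundary (4 bytes for a fullword), which is how Enterprise COBOL documents the behavior:

```python
def layout(fields):
    """fields: list of (name, size_bytes, alignment_bytes).
    Returns (total_bytes, slack_bytes) for a sequential layout in
    which each field is padded up to its alignment boundary."""
    offset = slack = 0
    for _name, size, align in fields:
        pad = (-offset) % align        # bytes needed to reach boundary
        slack += pad
        offset += pad + size
    return offset, slack

mixed = [("WS-FLAG-1", 1, 1), ("WS-COUNTER-1", 4, 4),
         ("WS-FLAG-2", 1, 1), ("WS-COUNTER-2", 4, 4)]
grouped = [("WS-COUNTER-1", 4, 4), ("WS-COUNTER-2", 4, 4),
           ("WS-FLAG-1", 1, 1), ("WS-FLAG-2", 1, 1)]
print(layout(mixed))
print(layout(grouped))
```

The mixed ordering comes out at 16 bytes with 6 bytes of slack; the grouped ordering needs only 10 bytes with none, matching the totals in the comments above.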
* PRINCIPLE 3: Use COMP-3 for financial amounts.
* COMP-3 uses approximately half the bytes of DISPLAY
* for the same precision:
*
* PIC S9(13)V99 DISPLAY = 15 bytes (16 with SIGN SEPARATE)
* PIC S9(13)V99 COMP-3 = 8 bytes
* PIC S9(13)V99 COMP = 8 bytes (but slower for
* decimal arithmetic)
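The byte counts above follow simple formulas, sketched here in Python as a quick reference (digit counts include both sides of the implied decimal point):

```python
def zoned_display_bytes(digits: int) -> int:
    """USAGE DISPLAY: one byte per digit; the sign is overpunched
    into the last digit (add 1 for SIGN SEPARATE)."""
    return digits

def comp3_bytes(digits: int) -> int:
    """Packed decimal (COMP-3): two digits per byte plus a sign nibble."""
    return digits // 2 + 1

def comp_bytes(digits: int) -> int:
    """Binary (COMP): halfword up to 4 digits, fullword up to 9,
    doubleword up to 18."""
    if digits <= 4:
        return 2
    if digits <= 9:
        return 4
    return 8

# PIC S9(13)V99 has 15 digit positions:
print(zoned_display_bytes(15), comp3_bytes(15), comp_bytes(15))
```

These formulas reproduce the table's values: 15 bytes in DISPLAY, 8 in COMP-3, and 8 in COMP, plus the 2-byte SMALLINT and 4-byte INTEGER sizes used in the host-variable table earlier.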
32.7.2 Reducing Memory Footprint
For programs that process large tables or arrays, memory optimization can prevent paging and improve performance:
*================================================================*
* MEMORY OPTIMIZATION FOR LARGE TABLE PROCESSING *
*================================================================*
01 WS-DESIGN-CHOICES.
* APPROACH 1: Full table in memory (fast but memory-heavy)
* Use when table is small enough and accessed randomly.
05 WS-SMALL-TABLE.
10 WS-SMALL-ENTRY OCCURS 1000 TIMES
INDEXED BY WS-SM-IDX.
15 WS-SM-KEY PIC X(10).
15 WS-SM-DATA PIC X(40).
* Memory: 50,000 bytes = ~49 KB (acceptable)
* APPROACH 2: For large tables, load only what you need.
* Instead of loading 1,000,000 records into a table,
* process them in blocks or use a sorted file with
* sequential access.
* BAD - trying to load everything into memory:
* 05 WS-HUGE-TABLE.
* 10 WS-HUGE-ENTRY OCCURS 1000000 TIMES.
* 15 WS-HG-KEY PIC X(10).
* 15 WS-HG-DATA PIC X(190).
* Memory: 200,000,000 bytes = ~191 MB (too large!)
* GOOD - process in blocks of 10,000:
01 WS-BLOCK-TABLE.
05 WS-BLOCK-SIZE PIC S9(8) COMP VALUE 10000.
05 WS-BLOCK-ENTRY OCCURS 10000 TIMES
INDEXED BY WS-BLK-IDX.
10 WS-BLK-KEY PIC X(10).
10 WS-BLK-DATA PIC X(190).
* Memory: 2,000,000 bytes = ~1.9 MB (manageable)
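The memory arithmetic above reduces to a one-line formula, shown here as a Python sketch for checking any proposed table design:

```python
def table_megabytes(occurs: int, entry_bytes: int) -> float:
    """WORKING-STORAGE footprint of a fixed table, in MiB."""
    return occurs * entry_bytes / (1024 * 1024)

small = table_megabytes(1_000, 50)        # the small lookup table
huge = table_megabytes(1_000_000, 200)    # the rejected full-load design
block = table_megabytes(10_000, 200)      # one processing block
print(round(small, 2), round(huge, 1), round(block, 1))
```

The full-load design costs about 190.7 MiB per concurrent execution, while the blocked design stays under 2 MiB, which is the difference between a region that pages and one that does not.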
32.8 Batch Job Performance
Batch processing is the heart of mainframe workloads in financial institutions. End-of-day processing, statement generation, interest calculation, and regulatory reporting all run as batch jobs. Optimizing the batch cycle is critical for meeting processing windows.
32.8.1 SORT Optimization
SORT operations are among the most resource-intensive activities in batch processing. The IBM DFSORT product (or SyncSort alternative) provides extensive tuning options:
//*================================================================*
//* OPTIMIZED SORT JOB FOR END-OF-DAY TRANSACTION PROCESSING *
//*================================================================*
//SORTJOB JOB (BANK,EOD),'EOD SORT',
// CLASS=A,MSGCLASS=X,MSGLEVEL=(1,1),
// NOTIFY=&SYSUID,REGION=0M
//*
//SORTRANS EXEC PGM=SORT,PARM='DYNALLOC=(SYSDA,10)'
//*
//* SORT CONTROL STATEMENTS
//SYSIN DD *
SORT FIELDS=(1,10,CH,A, Account number ascending
11,8,CH,A, Transaction date ascending
19,6,CH,A) Transaction time ascending
INCLUDE COND=(35,1,CH,EQ,C'A') Include only active records
  SUM FIELDS=NONE              Keep first record per duplicate key
OPTION FILSZ=E50000000 Estimated 50 million records
/*
//*
//* INPUT FILE - daily transactions
//SORTIN DD DSN=BANK.PROD.DAYTRANS,DISP=SHR
//*
//* OUTPUT FILE - sorted transactions
//SORTOUT DD DSN=BANK.PROD.DAYTRANS.SORTED,
// DISP=(NEW,CATLG,DELETE),
// UNIT=SYSDA,
// SPACE=(CYL,(500,100),RLSE),
// DCB=(RECFM=FB,LRECL=200,BLKSIZE=27800)
//*
//* SORT WORK DATASETS - allocated dynamically via DYNALLOC
//* The PARM DYNALLOC=(SYSDA,10) allocates up to 10 work datasets
//* on SYSDA, allowing sort to use parallel I/O.
//*
//SYSOUT DD SYSOUT=*
Using SORT within a COBOL program efficiently:
IDENTIFICATION DIVISION.
PROGRAM-ID. EODPROC.
*================================================================*
* END-OF-DAY TRANSACTION PROCESSOR *
* Uses internal SORT with FASTSRT for optimal performance. *
* Processes the daily transaction file, sorting by account *
* and applying to the account master. *
* *
* Cross-reference: Chapter 28 (SORT/MERGE), *
* Chapter 31 (Security) *
*================================================================*
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
SELECT SORT-FILE ASSIGN TO SORTWORK.
SELECT TRANS-FILE ASSIGN TO TRANSIN.
SELECT ACCOUNT-FILE ASSIGN TO ACCTMAST
ORGANIZATION IS INDEXED
ACCESS MODE IS RANDOM
RECORD KEY IS AM-ACCT-KEY
FILE STATUS IS WS-ACCT-STATUS.
SELECT REPORT-FILE ASSIGN TO RPTOUT.
DATA DIVISION.
FILE SECTION.
SD SORT-FILE.
01 SORT-RECORD.
05 SR-ACCT-NUMBER PIC X(10).
05 SR-TRANS-DATE PIC X(8).
05 SR-TRANS-TIME PIC X(6).
05 SR-TRANS-AMOUNT PIC S9(11)V99 COMP-3.
05 SR-TRANS-TYPE PIC X(2).
05 SR-TRANS-STATUS PIC X(1).
05 FILLER PIC X(166).
FD TRANS-FILE
BLOCK CONTAINS 0 RECORDS.
01 TRANS-RECORD PIC X(200).
FD ACCOUNT-FILE.
01 ACCOUNT-MASTER.
05 AM-ACCT-KEY PIC X(14).
05 AM-ACCT-NAME PIC X(30).
05 AM-ACCT-BALANCE PIC S9(13)V99 COMP-3.
05 AM-ACCT-STATUS PIC X(1).
05 AM-TRANS-COUNT PIC S9(5) COMP.
05 AM-LAST-TRANS-DATE PIC X(8).
05 FILLER PIC X(135).
FD REPORT-FILE.
01 REPORT-RECORD PIC X(133).
WORKING-STORAGE SECTION.
01 WS-ACCT-STATUS PIC XX.
01 WS-PREV-ACCT PIC X(10).
01 WS-ACCT-TOTAL PIC S9(13)V99 COMP-3.
01 WS-ACCT-TRANS-COUNT PIC S9(5) COMP.
01 WS-RECORDS-PROCESSED PIC S9(9) COMP VALUE 0.
01 WS-ACCOUNTS-UPDATED PIC S9(9) COMP VALUE 0.
01 WS-EOF-FLAG PIC 9 VALUE 0.
88 END-OF-SORTED VALUE 1.
PROCEDURE DIVISION.
0000-MAIN-LOGIC.
OPEN I-O ACCOUNT-FILE
OPEN OUTPUT REPORT-FILE
* FASTSRT handles the USING file I/O directly.
* OUTPUT PROCEDURE processes the sorted records.
SORT SORT-FILE
ON ASCENDING KEY SR-ACCT-NUMBER
ON ASCENDING KEY SR-TRANS-DATE
ON ASCENDING KEY SR-TRANS-TIME
USING TRANS-FILE
OUTPUT PROCEDURE 2000-PROCESS-SORTED
CLOSE ACCOUNT-FILE
CLOSE REPORT-FILE
DISPLAY 'EOD PROCESSING COMPLETE' UPON CONSOLE
DISPLAY ' TRANSACTIONS: ' WS-RECORDS-PROCESSED
UPON CONSOLE
DISPLAY ' ACCOUNTS: ' WS-ACCOUNTS-UPDATED
UPON CONSOLE
STOP RUN.
2000-PROCESS-SORTED.
MOVE SPACES TO WS-PREV-ACCT
MOVE 0 TO WS-ACCT-TOTAL
MOVE 0 TO WS-ACCT-TRANS-COUNT
RETURN SORT-FILE
AT END SET END-OF-SORTED TO TRUE
END-RETURN
PERFORM UNTIL END-OF-SORTED
* Accumulate transactions for the same account
IF SR-ACCT-NUMBER NOT = WS-PREV-ACCT
AND WS-PREV-ACCT NOT = SPACES
* Account break - apply accumulated total
PERFORM 3000-UPDATE-ACCOUNT THRU 3000-EXIT
END-IF
IF SR-ACCT-NUMBER NOT = WS-PREV-ACCT
MOVE SR-ACCT-NUMBER TO WS-PREV-ACCT
MOVE 0 TO WS-ACCT-TOTAL
MOVE 0 TO WS-ACCT-TRANS-COUNT
END-IF
* Accumulate the transaction amount
EVALUATE SR-TRANS-TYPE
WHEN 'CR'
ADD SR-TRANS-AMOUNT TO WS-ACCT-TOTAL
WHEN 'DB'
SUBTRACT SR-TRANS-AMOUNT FROM WS-ACCT-TOTAL
END-EVALUATE
ADD 1 TO WS-ACCT-TRANS-COUNT
ADD 1 TO WS-RECORDS-PROCESSED
RETURN SORT-FILE
AT END SET END-OF-SORTED TO TRUE
END-RETURN
END-PERFORM
* Process the last account
IF WS-PREV-ACCT NOT = SPACES
PERFORM 3000-UPDATE-ACCOUNT THRU 3000-EXIT
END-IF.
3000-UPDATE-ACCOUNT.
* Read the account master record (the key is 14 bytes;
* clear it first so the trailing bytes are not stale)
    MOVE SPACES TO AM-ACCT-KEY
    MOVE WS-PREV-ACCT TO AM-ACCT-KEY(1:10)
READ ACCOUNT-FILE
INVALID KEY
DISPLAY 'ACCOUNT NOT FOUND: ' WS-PREV-ACCT
UPON CONSOLE
GO TO 3000-EXIT
END-READ
* Apply the net transaction total
ADD WS-ACCT-TOTAL TO AM-ACCT-BALANCE
ADD WS-ACCT-TRANS-COUNT TO AM-TRANS-COUNT
* Rewrite the updated record
REWRITE ACCOUNT-MASTER
INVALID KEY
DISPLAY 'REWRITE FAILED: ' WS-PREV-ACCT
UPON CONSOLE
END-REWRITE
ADD 1 TO WS-ACCOUNTS-UPDATED.
3000-EXIT.
EXIT.
32.8.2 Checkpoint/Restart
For long-running batch jobs, checkpoint/restart allows a job to resume from its last checkpoint after a failure, rather than starting over from the beginning. This is critical for jobs that take hours to complete:
IDENTIFICATION DIVISION.
PROGRAM-ID. CHKPTRST.
*================================================================*
* CHECKPOINT/RESTART IMPLEMENTATION *
* Saves processing state at regular intervals so the job *
* can be restarted from the last checkpoint after a failure. *
*================================================================*
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-CHECKPOINT-DATA.
05 WS-CHK-RECORD-COUNT PIC S9(9) COMP.
05 WS-CHK-LAST-KEY PIC X(14).
05 WS-CHK-RUNNING-TOTAL PIC S9(15)V99 COMP-3.
05 WS-CHK-HASH-TOTAL PIC S9(18) COMP.
05 WS-CHK-TIMESTAMP PIC X(26).
01 WS-CHECKPOINT-INTERVAL PIC S9(8) COMP VALUE 100000.
01 WS-RECORDS-SINCE-CHK PIC S9(8) COMP VALUE 0.
01 WS-RESTART-FLAG PIC 9 VALUE 0.
88 WS-IS-RESTART VALUE 1.
88 WS-IS-INITIAL VALUE 0.
01 WS-CHECKPOINT-ID PIC X(8).
01 WS-END-OF-FILE-FLAG PIC 9 VALUE 0.
    88 WS-END-OF-FILE VALUE 1.
* (The FILE-CONTROL and FD entries for CHECKPOINT-FILE and the
*  input file, and the definitions of WS-RECORD-COUNT,
*  WS-CURRENT-KEY, WS-RUNNING-TOTAL, and WS-CHK-FILE-STATUS,
*  are omitted here for brevity.)
PROCEDURE DIVISION.
0000-MAIN-LOGIC.
* Check if this is a restart
PERFORM 0500-CHECK-RESTART
IF WS-IS-RESTART
PERFORM 0600-RESTORE-CHECKPOINT
END-IF
PERFORM 1000-PROCESS-RECORDS
UNTIL WS-END-OF-FILE
* Final checkpoint at end of processing
PERFORM 5000-TAKE-CHECKPOINT
STOP RUN.
0500-CHECK-RESTART.
* Check for the existence of a checkpoint dataset
* If it exists and contains valid data, this is a restart
OPEN INPUT CHECKPOINT-FILE
IF WS-CHK-FILE-STATUS = '00'
READ CHECKPOINT-FILE INTO WS-CHECKPOINT-DATA
IF WS-CHK-FILE-STATUS = '00'
MOVE 1 TO WS-RESTART-FLAG
DISPLAY 'RESTART DETECTED. RESUMING FROM '
'RECORD ' WS-CHK-RECORD-COUNT
' KEY ' WS-CHK-LAST-KEY
UPON CONSOLE
END-IF
CLOSE CHECKPOINT-FILE
ELSE
MOVE 0 TO WS-RESTART-FLAG
END-IF.
0600-RESTORE-CHECKPOINT.
* Position input file past already-processed records
* Restore running totals from checkpoint
DISPLAY 'RESTORING FROM CHECKPOINT: '
'RECORDS=' WS-CHK-RECORD-COUNT
' LAST-KEY=' WS-CHK-LAST-KEY
UPON CONSOLE.
1000-PROCESS-RECORDS.
* Process the current record
PERFORM 2000-BUSINESS-LOGIC
ADD 1 TO WS-RECORDS-SINCE-CHK
* Take a checkpoint at regular intervals
IF WS-RECORDS-SINCE-CHK >= WS-CHECKPOINT-INTERVAL
PERFORM 5000-TAKE-CHECKPOINT
MOVE 0 TO WS-RECORDS-SINCE-CHK
END-IF
PERFORM 1100-READ-NEXT-RECORD.
5000-TAKE-CHECKPOINT.
* Save current processing state
MOVE FUNCTION CURRENT-DATE TO WS-CHK-TIMESTAMP
MOVE WS-RECORD-COUNT TO WS-CHK-RECORD-COUNT
MOVE WS-CURRENT-KEY TO WS-CHK-LAST-KEY
MOVE WS-RUNNING-TOTAL TO WS-CHK-RUNNING-TOTAL
* Write checkpoint data
OPEN OUTPUT CHECKPOINT-FILE
WRITE CHECKPOINT-RECORD FROM WS-CHECKPOINT-DATA
CLOSE CHECKPOINT-FILE
* Issue a DB2 COMMIT to save database changes
EXEC SQL COMMIT END-EXEC
DISPLAY 'CHECKPOINT TAKEN AT RECORD '
WS-CHK-RECORD-COUNT
' TIME ' WS-CHK-TIMESTAMP
UPON CONSOLE.
2000-BUSINESS-LOGIC.
CONTINUE.
1100-READ-NEXT-RECORD.
CONTINUE.
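Choosing the checkpoint interval is a trade-off: checkpoint too often and the commits dominate, too rarely and a failure reruns hours of work. Young's approximation gives a reasonable starting point. The checkpoint cost, failure rate, and throughput below are assumed figures for illustration only:

```python
import math

def youngs_interval_seconds(checkpoint_cost_s: float, mtbf_s: float) -> float:
    """Young's approximation: the interval that roughly minimizes
    (checkpoint overhead + expected rework after a failure) is
    sqrt(2 * checkpoint_cost * mean_time_between_failures)."""
    return math.sqrt(2 * checkpoint_cost_s * mtbf_s)

# Assumptions: each checkpoint (write + COMMIT) costs 5 seconds, and
# the job environment averages one failure per 24 hours of run time.
interval_s = youngs_interval_seconds(5, 24 * 3600)
# At an assumed 2,000 records/second, convert to a record interval:
interval_records = int(interval_s * 2000)
print(round(interval_s), interval_records)
```

Under these assumptions the formula suggests checkpointing roughly every 15 minutes (about 1.9 million records). The 100,000-record interval in the listing is far more conservative, which is a defensible choice when restart simplicity matters more than the commit overhead; the point is to pick the interval deliberately rather than by habit.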
32.8.3 Parallel Processing
For very large batch workloads, splitting the work across multiple parallel job steps or jobs can dramatically reduce elapsed time:
//*================================================================*
//* PARALLEL BATCH PROCESSING - END OF DAY *
//* Split the account master into ranges and process in parallel. *
//* Each step processes a different range of account numbers. *
//*================================================================*
//EODPAR JOB (BANK,EOD),'PARALLEL EOD',
// CLASS=A,MSGCLASS=X,MSGLEVEL=(1,1),
// NOTIFY=&SYSUID
//*
//*----------------------------------------------------------------*
//* STEP 1: Split the transaction file by account range *
//*----------------------------------------------------------------*
//SPLIT EXEC PGM=SORT
//SORTIN DD DSN=BANK.PROD.DAYTRANS.SORTED,DISP=SHR
//SORTOUT1 DD DSN=&&RANGE1,DISP=(NEW,PASS),
// UNIT=SYSDA,SPACE=(CYL,(100,50)),
// DCB=(RECFM=FB,LRECL=200,BLKSIZE=27800)
//SORTOUT2 DD DSN=&&RANGE2,DISP=(NEW,PASS),
// UNIT=SYSDA,SPACE=(CYL,(100,50)),
// DCB=(RECFM=FB,LRECL=200,BLKSIZE=27800)
//SORTOUT3 DD DSN=&&RANGE3,DISP=(NEW,PASS),
// UNIT=SYSDA,SPACE=(CYL,(100,50)),
// DCB=(RECFM=FB,LRECL=200,BLKSIZE=27800)
//SORTOUT4 DD DSN=&&RANGE4,DISP=(NEW,PASS),
// UNIT=SYSDA,SPACE=(CYL,(100,50)),
// DCB=(RECFM=FB,LRECL=200,BLKSIZE=27800)
//SYSIN DD *
SORT FIELDS=COPY
OUTFIL FNAMES=SORTOUT1,INCLUDE=(1,4,CH,LE,C'2499')
OUTFIL FNAMES=SORTOUT2,INCLUDE=(1,4,CH,GE,C'2500',
AND,1,4,CH,LE,C'4999')
OUTFIL FNAMES=SORTOUT3,INCLUDE=(1,4,CH,GE,C'5000',
AND,1,4,CH,LE,C'7499')
OUTFIL FNAMES=SORTOUT4,INCLUDE=(1,4,CH,GE,C'7500')
/*
//SYSOUT DD SYSOUT=*
//*
//*----------------------------------------------------------------*
//* STEPS 2-5: Process each account range *
//* Note: Steps within a single job ALWAYS run serially. To run *
//* the four ranges concurrently, submit them as four separate *
//* jobs (typically released together by the job scheduler). *
//* They can then overlap safely because they update *
//* non-overlapping account ranges, provided the file's share *
//* options (or VSAM RLS) permit multiple concurrent updaters. *
//*----------------------------------------------------------------*
//PROC1 EXEC PGM=EODPROC,COND=(0,NE,SPLIT)
//STEPLIB DD DSN=BANK.PROD.LOADLIB,DISP=SHR
//TRANSIN DD DSN=&&RANGE1,DISP=(OLD,DELETE)
//ACCTMAST DD DSN=BANK.PROD.ACCTMAST,DISP=SHR
//RPTOUT DD SYSOUT=*
//SYSOUT DD SYSOUT=*
//*
//PROC2 EXEC PGM=EODPROC,COND=(0,NE,SPLIT)
//STEPLIB DD DSN=BANK.PROD.LOADLIB,DISP=SHR
//TRANSIN DD DSN=&&RANGE2,DISP=(OLD,DELETE)
//ACCTMAST DD DSN=BANK.PROD.ACCTMAST,DISP=SHR
//RPTOUT DD SYSOUT=*
//SYSOUT DD SYSOUT=*
//*
//PROC3 EXEC PGM=EODPROC,COND=(0,NE,SPLIT)
//STEPLIB DD DSN=BANK.PROD.LOADLIB,DISP=SHR
//TRANSIN DD DSN=&&RANGE3,DISP=(OLD,DELETE)
//ACCTMAST DD DSN=BANK.PROD.ACCTMAST,DISP=SHR
//RPTOUT DD SYSOUT=*
//SYSOUT DD SYSOUT=*
//*
//PROC4 EXEC PGM=EODPROC,COND=(0,NE,SPLIT)
//STEPLIB DD DSN=BANK.PROD.LOADLIB,DISP=SHR
//TRANSIN DD DSN=&&RANGE4,DISP=(OLD,DELETE)
//ACCTMAST DD DSN=BANK.PROD.ACCTMAST,DISP=SHR
//RPTOUT DD SYSOUT=*
//SYSOUT DD SYSOUT=*
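The elapsed-time benefit of range splitting is governed by the slowest range: the cycle cannot finish before the split step plus the longest-running partition. The per-range timings below are hypothetical, and the parallel figure assumes the four ranges really do run as separate concurrent jobs:

```python
def serial_elapsed(split_min: float, range_minutes: list) -> float:
    """All ranges run one after another in a single job."""
    return split_min + sum(range_minutes)

def parallel_elapsed(split_min: float, range_minutes: list) -> float:
    """The split runs first; the ranges then overlap, so the cycle
    finishes with the slowest range."""
    return split_min + max(range_minutes)

ranges = [50, 48, 52, 45]          # assumed minutes per account range
serial = serial_elapsed(15, ranges)
parallel = parallel_elapsed(15, ranges)
print(serial, parallel)
```

With these assumptions, elapsed time drops from 210 minutes to 67, a 3.1x improvement rather than a full 4x. Balancing the ranges (so no partition is much slower than the others) is what closes that gap, which is why the split boundaries deserve periodic review as account distributions drift.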
32.9 Performance Monitoring Tools
Effective performance tuning requires good measurement. z/OS provides several tools for collecting and analyzing performance data.
32.9.1 SMF Records
The System Management Facilities (SMF) produce records that capture detailed information about every job, step, and resource usage on the system. The most important SMF record types for COBOL performance tuning are:
| SMF Record Type | Content | Use |
|---|---|---|
| Type 30 | Job/step resource usage (CPU, I/O, memory) | Primary source for batch job performance data |
| Type 42 | VSAM dataset activity (opens, closes, I/O counts, splits) | VSAM tuning and split analysis |
| Type 101 | DB2 accounting data | DB2 performance analysis |
| Type 102 | DB2 statistics | DB2 system-level tuning |
| Type 110 | CICS performance data | CICS transaction response time analysis |
//*================================================================*
//* EXTRACT SMF TYPE 30 RECORDS FOR PERFORMANCE ANALYSIS *
//*================================================================*
//SMFEXT JOB (BANK,PERF),'SMF EXTRACT',CLASS=A
//STEP1 EXEC PGM=IFASMFDP
//SYSPRINT DD SYSOUT=*
//DUMPIN DD DSN=SYS1.MAN1,DISP=SHR
//DUMPOUT DD DSN=BANK.PERF.SMF30.EXTRACT,
// DISP=(NEW,CATLG,DELETE),
// UNIT=SYSDA,SPACE=(CYL,(50,20),RLSE),
// DCB=(RECFM=VBS,LRECL=32760,BLKSIZE=32760)
//SYSIN DD *
INDD(DUMPIN,OPTIONS(DUMP))
OUTDD(DUMPOUT,TYPE(30))
DATE(2025060,2025061)
START(1800)
END(0600)
/*
32.9.2 IBM Debugging and Performance Tools
IBM provides several tools specifically designed for analyzing COBOL program performance:
IBM Debug Tool / z/OS Debugger allows you to step through COBOL programs, set breakpoints, and examine data values. While primarily a debugging tool, it is invaluable for understanding program flow and identifying performance bottlenecks in logic.
Strobe (now IBM Application Performance Analyzer) is a sampling-based performance analysis tool. It periodically samples where the CPU is executing within your COBOL program and produces a report showing which paragraphs and statements consume the most CPU:
============================================================
APPLICATION PERFORMANCE ANALYZER - PROGRAM: DAYENDBT
SAMPLING INTERVAL: 2 MS TOTAL SAMPLES: 50,000
============================================================
TOP CPU CONSUMERS:
RANK PARAGRAPH SAMPLES PCT CUMULATIVE
---- --------------------- -------- ----- ----------
1 3000-UPDATE-ACCOUNT 18,500 37.0% 37.0%
2 2500-CALCULATE-INTEREST 12,000 24.0% 61.0%
3 4000-WRITE-AUDIT-RECORD 7,500 15.0% 76.0%
4 1000-READ-TRANSACTION 5,000 10.0% 86.0%
5 5000-GENERATE-REPORT 3,000 6.0% 92.0%
6 OTHER 4,000 8.0% 100.0%
TOP STATEMENTS IN 3000-UPDATE-ACCOUNT:
LINE STATEMENT SAMPLES PCT
----- --------------------------------- ------- -----
00523 COMPUTE WS-NEW-BALANCE = 8,200 44.3%
WS-OLD-BALANCE + WS-NET-CHANGE
00528 REWRITE ACCOUNT-MASTER 6,100 33.0%
00519 READ ACCOUNT-FILE 4,200 22.7%
This output tells you exactly where to focus your tuning efforts. In this case, 37% of CPU time is spent in the 3000-UPDATE-ACCOUNT paragraph, and within that paragraph, the COMPUTE statement and the REWRITE are the dominant consumers.
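The percent and cumulative-percent columns in a sampling report are straightforward to reproduce, which is useful when you only have raw sample counts. This sketch ranks strictly by sample count, so the catch-all OTHER bucket lands wherever its count puts it rather than last as in the report above:

```python
def cpu_profile(samples: dict) -> list:
    """Rank paragraphs by sample count, with percent and cumulative
    percent of total samples, as a sampling profiler reports them."""
    total = sum(samples.values())
    report, cumulative = [], 0.0
    for name, count in sorted(samples.items(), key=lambda kv: -kv[1]):
        pct = 100.0 * count / total
        cumulative += pct
        report.append((name, round(pct, 1), round(cumulative, 1)))
    return report

rows = cpu_profile({
    "3000-UPDATE-ACCOUNT": 18_500,
    "2500-CALCULATE-INTEREST": 12_000,
    "4000-WRITE-AUDIT-RECORD": 7_500,
    "1000-READ-TRANSACTION": 5_000,
    "5000-GENERATE-REPORT": 3_000,
    "OTHER": 4_000,
})
for row in rows:
    print(row)
```

The practical rule this arithmetic supports: tune in rank order and stop when the cumulative column shows diminishing returns, since halving a 6% paragraph can never buy back more than 3% of total CPU.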
32.9.3 Batch Job Elapsed Time Analysis
A systematic approach to analyzing batch job elapsed time:
IDENTIFICATION DIVISION.
PROGRAM-ID. PERFMON.
*================================================================*
* PERFORMANCE MONITORING INSTRUMENTATION *
* Add timing instrumentation to critical sections of code *
* to identify elapsed-time bottlenecks. *
*================================================================*
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-PERF-COUNTERS.
05 WS-START-TIME PIC X(21).
05 WS-END-TIME PIC X(21).
05 WS-FILE-READ-COUNT PIC S9(9) COMP VALUE 0.
05 WS-DB2-CALL-COUNT PIC S9(9) COMP VALUE 0.
05 WS-SORT-COUNT PIC S9(9) COMP VALUE 0.
05 WS-RECORDS-PROCESSED PIC S9(9) COMP VALUE 0.
01 WS-TIMING-SECTION.
05 WS-SECTION-START PIC 9(8)V9(6).
05 WS-SECTION-END PIC 9(8)V9(6).
05 WS-SECTION-ELAPSED PIC 9(8)V9(6).
PROCEDURE DIVISION.
0000-MAIN-LOGIC.
* Capture start time
MOVE FUNCTION CURRENT-DATE TO WS-START-TIME
DISPLAY 'PROGRAM START: ' WS-START-TIME UPON CONSOLE
PERFORM 1000-PHASE-1-READ-AND-SORT
PERFORM 2000-PHASE-2-PROCESS-AND-UPDATE
PERFORM 3000-PHASE-3-GENERATE-REPORTS
* Capture end time and display statistics
MOVE FUNCTION CURRENT-DATE TO WS-END-TIME
DISPLAY 'PROGRAM END: ' WS-END-TIME UPON CONSOLE
DISPLAY 'STATISTICS:' UPON CONSOLE
DISPLAY ' FILE READS: ' WS-FILE-READ-COUNT
UPON CONSOLE
DISPLAY ' DB2 CALLS: ' WS-DB2-CALL-COUNT
UPON CONSOLE
DISPLAY ' RECORDS: ' WS-RECORDS-PROCESSED
UPON CONSOLE
STOP RUN.
1000-PHASE-1-READ-AND-SORT.
*    SECONDS-PAST-MIDNIGHT returns seconds directly, avoiding
*    the arithmetic errors of subtracting raw hhmmsstt values
*    (add 86400 to the end time if a phase crosses midnight)
    MOVE FUNCTION SECONDS-PAST-MIDNIGHT TO WS-SECTION-START
    DISPLAY 'PHASE 1 START: SORT TRANSACTIONS'
        UPON CONSOLE
*    ... sort processing ...
    MOVE FUNCTION SECONDS-PAST-MIDNIGHT TO WS-SECTION-END
    COMPUTE WS-SECTION-ELAPSED =
        WS-SECTION-END - WS-SECTION-START
    DISPLAY 'PHASE 1 COMPLETE. ELAPSED: '
        WS-SECTION-ELAPSED ' SECONDS'
        UPON CONSOLE.
2000-PHASE-2-PROCESS-AND-UPDATE.
    MOVE FUNCTION SECONDS-PAST-MIDNIGHT TO WS-SECTION-START
    DISPLAY 'PHASE 2 START: PROCESS UPDATES'
        UPON CONSOLE
*    ... update processing ...
    MOVE FUNCTION SECONDS-PAST-MIDNIGHT TO WS-SECTION-END
    COMPUTE WS-SECTION-ELAPSED =
        WS-SECTION-END - WS-SECTION-START
    DISPLAY 'PHASE 2 COMPLETE. ELAPSED: '
        WS-SECTION-ELAPSED ' SECONDS'
        UPON CONSOLE.
3000-PHASE-3-GENERATE-REPORTS.
    MOVE FUNCTION SECONDS-PAST-MIDNIGHT TO WS-SECTION-START
    DISPLAY 'PHASE 3 START: GENERATE REPORTS'
        UPON CONSOLE
*    ... report generation ...
    MOVE FUNCTION SECONDS-PAST-MIDNIGHT TO WS-SECTION-END
    COMPUTE WS-SECTION-ELAPSED =
        WS-SECTION-END - WS-SECTION-START
    DISPLAY 'PHASE 3 COMPLETE. ELAPSED: '
        WS-SECTION-ELAPSED ' SECONDS'
        UPON CONSOLE.
32.10 Real-World Tuning Case Study: Optimizing a Daily Batch Cycle
To bring all of these techniques together, let us walk through a realistic performance tuning case study based on a daily end-of-day batch cycle at a mid-sized retail bank.
32.10.1 The Problem
The bank's end-of-day batch cycle processes approximately 15 million debit and credit card transactions daily. The batch window is from 7:00 PM to 5:00 AM (10 hours). The current cycle takes 9 hours and 45 minutes, leaving only a 15-minute margin. With transaction volumes growing 15% year over year, the batch window will be exceeded within two months unless performance is improved.
The batch cycle consists of:
| Job | Description | Current Elapsed Time |
|---|---|---|
| EOD010 | Sort daily transactions | 45 minutes |
| EOD020 | Apply transactions to account master | 3 hours 15 minutes |
| EOD030 | Calculate daily interest | 2 hours 30 minutes |
| EOD040 | Generate customer statements | 1 hour 45 minutes |
| EOD050 | Regulatory reporting (SOX/BSA) | 1 hour 30 minutes |
| Total | | 9 hours 45 minutes |
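The "two months" figure can be checked with a quick growth model, under the simplifying assumption that elapsed time scales linearly with transaction volume:

```python
import math

current_hours = 9.75      # 9 hours 45 minutes today
window_hours = 10.0       # 7:00 PM to 5:00 AM
annual_growth = 0.15      # 15% volume growth per year

# Solve current * (1 + g)^t = window for t (in years), then convert.
years_until_overrun = (math.log(window_hours / current_hours)
                       / math.log(1 + annual_growth))
months_until_overrun = years_until_overrun * 12
print(round(months_until_overrun, 1))
```

The model lands at roughly 2.2 months, consistent with the estimate above, and it also shows how thin the margin is: even a 9-hour cycle would buy only about nine months at this growth rate, so the tuning targets below aim for much larger reductions.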
32.10.2 Analysis Phase
Using the tools described earlier (SMF records, Application Performance Analyzer, DB2 EXPLAIN), we analyze each job:
EOD020 - Transaction Processing (3h 15m): - Application Performance Analyzer shows 62% of CPU time in DB2 calls. - DB2 EXPLAIN reveals a tablespace scan on the account master for each transaction lookup (the index on ACCT_NUMBER was inadvertently dropped during a recent DB2 migration). - VSAM LISTCAT shows 47,000 CI splits on the account master file since the last reorganization.
EOD030 - Interest Calculation (2h 30m): - The program uses DISPLAY numeric fields for all interest rate calculations, causing constant pack/unpack conversions. - The program is compiled with OPTIMIZE(0) and NUMPROC(NOPFD). - Each account is read individually with random VSAM I/O rather than sequential browse.
EOD040 - Statement Generation (1h 45m): - The output file has BLKSIZE=133 (unblocked), causing one I/O per line of output. - The program reads the entire 200-byte customer record to extract just the name and address.
32.10.3 Optimization Actions
Fix 1: Restore the DB2 index (EOD020)
-- Recreate the missing index on the account master table
CREATE UNIQUE INDEX BANKDB.ACCT_MASTER_PK
ON BANKDB.ACCT_MASTER (ACCT_NUMBER ASC)
USING STOGROUP BANKSGRP
PRIQTY 500000
SECQTY 100000
BUFFERPOOL BP1
CLOSE NO;
-- Verify the access path is now using the index
EXPLAIN ALL SET QUERYNO = 1 FOR
SELECT ACCT_NUMBER, ACCT_BALANCE, ACCT_STATUS
FROM BANKDB.ACCT_MASTER
WHERE ACCT_NUMBER = '0001234567';
Fix 2: Reorganize the VSAM file (EOD020)
//*================================================================*
//* REORGANIZE ACCOUNT MASTER VSAM FILE *
//* Eliminates CI/CA splits and restores free space *
//*================================================================*
//REORG JOB (BANK,DBA),'VSAM REORG',CLASS=A
//*
//STEP1 EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//INFILE DD DSN=BANK.PROD.ACCTMAST,DISP=SHR
//BACKUP DD DSN=BANK.PROD.ACCTMAST.BACKUP,
// DISP=(NEW,CATLG,DELETE),
// UNIT=SYSDA,SPACE=(CYL,(600,100),RLSE)
//SYSIN DD *
REPRO INFILE(INFILE) OUTFILE(BACKUP)
/*
//*
//STEP2 EXEC PGM=IDCAMS,COND=(0,NE,STEP1)
//SYSPRINT DD SYSOUT=*
//SYSIN DD *
DELETE BANK.PROD.ACCTMAST CLUSTER
DEFINE CLUSTER ( -
NAME(BANK.PROD.ACCTMAST) -
INDEXED -
RECORDSIZE(200 200) -
KEYS(14 0) -
FREESPACE(20 10) -
SHAREOPTIONS(2 3) -
SPEED) -
DATA ( -
NAME(BANK.PROD.ACCTMAST.DATA) -
CYLINDERS(500 100) -
CONTROLINTERVALSIZE(4096) -
BUFFERSPACE(1048576)) -
INDEX ( -
NAME(BANK.PROD.ACCTMAST.INDEX) -
CYLINDERS(10 5))
/*
//*
//STEP3 EXEC PGM=IDCAMS,COND=(0,NE,STEP2)
//SYSPRINT DD SYSOUT=*
//INFILE DD DSN=BANK.PROD.ACCTMAST.BACKUP,DISP=SHR
//OUTFILE DD DSN=BANK.PROD.ACCTMAST,DISP=SHR
//SYSIN DD *
REPRO INFILE(INFILE) OUTFILE(OUTFILE)
/*
Fix 3: Optimize the interest calculation program (EOD030)
IDENTIFICATION DIVISION.
PROGRAM-ID. EOD030.
*================================================================*
* OPTIMIZED DAILY INTEREST CALCULATION *
* *
* Performance improvements: *
* 1. Changed DISPLAY fields to COMP-3 for arithmetic *
* 2. Compiled with OPTIMIZE(2), NUMPROC(PFD), TRUNC(OPT) *
* 3. Changed from random reads to sequential browse *
* 4. Moved rate table lookup outside the per-account loop *
* 5. Used SEARCH ALL instead of SEARCH for rate lookup *
*================================================================*
DATA DIVISION.
WORKING-STORAGE SECTION.
* BEFORE (all DISPLAY - slow):
* 01 WS-INTEREST-CALC.
* 05 WS-DAILY-RATE PIC 9V9(8).
* 05 WS-ACCRUED-INT PIC 9(11)V99.
* 05 WS-BALANCE PIC 9(13)V99.
*
* AFTER (COMP-3 - fast):
01 WS-INTEREST-CALC.
05 WS-DAILY-RATE PIC S9V9(8) COMP-3.
05 WS-ACCRUED-INT PIC S9(11)V99 COMP-3.
05 WS-BALANCE PIC S9(13)V99 COMP-3.
05 WS-ANNUAL-RATE PIC S9V9(6) COMP-3.
05 WS-DAYS-IN-YEAR PIC S9(3) COMP VALUE 365.
01 WS-RATE-TABLE.
05 WS-NUM-RATES PIC S9(4) COMP.
05 WS-RATE-ENTRY OCCURS 200 TIMES
ASCENDING KEY IS WS-RT-PRODUCT
INDEXED BY WS-RT-IDX.
10 WS-RT-PRODUCT PIC X(6).
10 WS-RT-ANNUAL-RATE PIC S9V9(6) COMP-3.
PROCEDURE DIVISION.
0000-MAIN-LOGIC.
* Load the rate table ONCE before processing
PERFORM 0500-LOAD-RATE-TABLE
* Use sequential browse instead of random reads
OPEN INPUT ACCOUNT-FILE
MOVE LOW-VALUES TO AM-ACCT-KEY
START ACCOUNT-FILE KEY NOT LESS THAN AM-ACCT-KEY
PERFORM UNTIL WS-END-OF-FILE
READ ACCOUNT-FILE NEXT
AT END SET WS-END-OF-FILE TO TRUE
END-READ
IF NOT WS-END-OF-FILE
PERFORM 2000-CALCULATE-INTEREST
END-IF
END-PERFORM
CLOSE ACCOUNT-FILE
STOP RUN.
0500-LOAD-RATE-TABLE.
* Load all interest rates into memory once
MOVE 0 TO WS-NUM-RATES
EXEC SQL
DECLARE C-RATES CURSOR FOR
SELECT PRODUCT_CODE, ANNUAL_RATE
FROM BANKDB.INTEREST_RATES
WHERE EFFECTIVE_DATE <= CURRENT DATE
AND EXPIRY_DATE >= CURRENT DATE
ORDER BY PRODUCT_CODE
END-EXEC
EXEC SQL OPEN C-RATES END-EXEC
PERFORM UNTIL SQLCODE NOT = 0
ADD 1 TO WS-NUM-RATES
EXEC SQL
FETCH C-RATES
INTO :WS-RT-PRODUCT(WS-NUM-RATES),
:WS-RT-ANNUAL-RATE(WS-NUM-RATES)
END-EXEC
END-PERFORM
* The final FETCH returns SQLCODE +100 after the count was
* already incremented, so back that increment out
SUBTRACT 1 FROM WS-NUM-RATES
EXEC SQL CLOSE C-RATES END-EXEC
DISPLAY 'LOADED ' WS-NUM-RATES ' INTEREST RATES'
UPON CONSOLE.
2000-CALCULATE-INTEREST.
* Look up the rate using binary search (SEARCH ALL)
SEARCH ALL WS-RATE-ENTRY
AT END
MOVE 0 TO WS-ANNUAL-RATE
WHEN WS-RT-PRODUCT(WS-RT-IDX) = AM-PRODUCT-CODE
MOVE WS-RT-ANNUAL-RATE(WS-RT-IDX)
TO WS-ANNUAL-RATE
END-SEARCH
* Calculate daily interest using COMP-3 fields
* (no pack/unpack conversions needed)
IF WS-ANNUAL-RATE > 0
COMPUTE WS-DAILY-RATE =
WS-ANNUAL-RATE / WS-DAYS-IN-YEAR
COMPUTE WS-ACCRUED-INT ROUNDED =
AM-ACCT-BALANCE * WS-DAILY-RATE
ELSE
MOVE 0 TO WS-ACCRUED-INT
END-IF.
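The per-account arithmetic in EOD030 is simple enough to trace outside the mainframe. The Python sketch below mirrors the two COMPUTE statements (daily rate = annual rate / days in year, accrued interest rounded to cents, matching COMPUTE ... ROUNDED's round-half-up behavior); the sample balance and rate are hypothetical:

```python
from decimal import Decimal, ROUND_HALF_UP

def daily_interest(balance: Decimal, annual_rate: Decimal,
                   days_in_year: int = 365) -> Decimal:
    """Mirror EOD030's 2000-CALCULATE-INTEREST paragraph:
    daily rate = annual rate / days in year,
    accrued = balance * daily rate, rounded to cents."""
    daily_rate = annual_rate / days_in_year
    return (balance * daily_rate).quantize(Decimal("0.01"),
                                           rounding=ROUND_HALF_UP)

# Hypothetical example: a $10,000.00 balance at a 3.650% annual rate
# accrues exactly $1.00 per day (0.03650 / 365 = 0.0001 daily).
print(daily_interest(Decimal("10000.00"), Decimal("0.03650")))  # 1.00
```

Decimal arithmetic is used deliberately: like COMP-3 packed decimal, it represents fractional rates exactly, which binary floating point cannot.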
Fix 4: Optimize statement generation output (EOD040)
//*================================================================*
//* STATEMENT GENERATION - OPTIMIZED JCL *
//* Changed BLKSIZE from 133 to 27930 (210x reduction in I/O) *
//* Added BUFNO=20 for write-behind buffering *
//*================================================================*
//EOD040 EXEC PGM=STMTGEN
//STEPLIB DD DSN=BANK.PROD.LOADLIB,DISP=SHR
//CUSTMAST DD DSN=BANK.PROD.CUSTMAST,DISP=SHR,
// AMP=('BUFND=20,BUFNI=10')
//STMTOUT DD DSN=BANK.PROD.STATEMENTS.DAILY,
// DISP=(NEW,CATLG,DELETE),
// UNIT=SYSDA,
// SPACE=(CYL,(200,50),RLSE),
// DCB=(RECFM=FBA,LRECL=133,BLKSIZE=27930,BUFNO=20)
//SYSOUT DD SYSOUT=*
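The 210x claim in the JCL comment follows directly from the blocking factor: with LRECL=133 and BLKSIZE=27930, each physical write carries 210 records instead of one. A quick check in Python:

```python
# Verify the "210x reduction in I/O" claim from the EOD040 JCL comment.

def blocking_factor(blksize: int, lrecl: int) -> int:
    """Records carried per physical I/O for fixed-blocked records."""
    return blksize // lrecl

assert 27930 % 133 == 0             # BLKSIZE is an exact multiple of LRECL
print(blocking_factor(27930, 133))  # 210 records per block
```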
32.10.4 Results
After implementing all optimizations:
| Job | Before | After | Improvement |
|---|---|---|---|
| EOD010 | 45 min | 40 min | 11% (SORT work area tuning) |
| EOD020 | 3h 15m | 55 min | 72% (DB2 index + VSAM reorg) |
| EOD030 | 2h 30m | 45 min | 70% (COMP-3 + sequential I/O + OPTIMIZE(2)) |
| EOD040 | 1h 45m | 25 min | 76% (block size + buffering) |
| EOD050 | 1h 30m | 1h 20m | 11% (minor SQL tuning) |
| Total | 9h 45m | 4h 05m | 58% overall reduction |
The batch window now has nearly 6 hours of headroom, providing capacity for the projected 15% annual transaction growth for the next several years.
Key Concept
The largest performance gains almost always come from fixing fundamental issues --- missing indexes, unblocked output files, wrong data types for arithmetic, and unnecessary random I/O. Micro-optimizations within COBOL code (shaving a few instructions from a loop) are far less impactful than addressing these structural problems. Always measure first, fix the biggest problems first, and measure again after each change.
32.11 Performance Tuning Checklist
Use this checklist when tuning a COBOL program. Items are ordered by typical impact, from highest to lowest:
Compiler Options
- [ ] OPTIMIZE(2) for production compilations
- [ ] TRUNC(OPT) if binary values always fit PIC size
- [ ] NUMPROC(PFD) if all numeric data has preferred signs
- [ ] FASTSRT for programs that use SORT
- [ ] NOTEST for production (remove debugging hooks)
- [ ] AWO for sequential output files
File I/O
- [ ] Block size optimized for device geometry (half-track or full-track)
- [ ] BUFNO increased for sequential files (10-30 buffers)
- [ ] VSAM BUFND and BUFNI specified via AMP parameter
- [ ] VSAM FREESPACE appropriate for insert patterns
- [ ] VSAM files reorganized regularly to eliminate CI/CA splits
- [ ] Sequential access used instead of random when processing >15% of records
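The first item above deserves a worked example. On a 3390 device the half-track capacity is 27,998 bytes, and the best half-track BLKSIZE is the largest multiple of LRECL that fits. This Python sketch reproduces the two block sizes used elsewhere in this chapter:

```python
# Largest BLKSIZE that is a multiple of LRECL and still fits in a
# half track of a 3390 device (27,998 usable bytes per half track).
HALF_TRACK_3390 = 27998

def best_blksize(lrecl: int, capacity: int = HALF_TRACK_3390) -> int:
    """Largest multiple of lrecl not exceeding the track-segment capacity."""
    return (capacity // lrecl) * lrecl

print(best_blksize(133))  # 27930 -- the value used in the EOD040 JCL
print(best_blksize(200))  # 27800 -- the value in review question 3
```

In practice you can also let the system choose by coding BLKSIZE=0 and relying on system-determined block size, but knowing the arithmetic lets you audit what the system picked.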
DB2
- [ ] All WHERE clause predicates are indexable (no functions on columns)
- [ ] Indexes exist for all frequently-used access paths
- [ ] EXPLAIN shows index access, not tablespace scans
- [ ] Host variables match DB2 column types exactly
- [ ] SELECT lists include only needed columns (no SELECT *)
- [ ] Multi-row FETCH used for cursor processing
- [ ] COMMIT frequency balanced between lock duration and overhead
COBOL Coding
- [ ] COMP (binary) used for counters, subscripts, and flags
- [ ] COMP-3 (packed decimal) used for financial amounts
- [ ] SEARCH ALL used for sorted table lookups
- [ ] Index names used instead of subscripts for table access
- [ ] Invariant expressions moved outside loops
- [ ] Average computed after the loop, not inside it
- [ ] EVALUATE used instead of nested IF for multi-way branches
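The case for SEARCH ALL on sorted tables is easy to quantify: a serial SEARCH examines N/2 entries on average, while a binary search probes at most floor(log2 N) + 1 entries. A small Python sketch for the two table sizes used in this chapter:

```python
import math

# Average comparisons for a serial SEARCH vs. worst-case probes
# for the binary search that SEARCH ALL performs on a sorted table.

def serial_avg(n: int) -> float:
    """Average entries examined by a serial search that finds its target."""
    return n / 2

def binary_worst(n: int) -> int:
    """Worst-case probes for a binary search over n sorted entries."""
    return math.floor(math.log2(n)) + 1

# The 200-entry rate table in EOD030:
print(serial_avg(200), binary_worst(200))        # 100.0 vs 8 probes

# The 10,000-entry table in review question 4:
print(serial_avg(10_000), binary_worst(10_000))  # 5000.0 vs 14 probes
```

The prerequisite, of course, is that the table is sorted on the key named in the ASCENDING KEY clause; SEARCH ALL on an unsorted table returns wrong answers silently.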
CICS
- [ ] Pseudo-conversational design used throughout
- [ ] COMMAREA kept as small as possible
- [ ] BMS DATAONLY used for screen refreshes
- [ ] Locks acquired as late as possible and held briefly
- [ ] DB2 calls minimized (use JOINs instead of multiple SELECTs)
Batch Job Design
- [ ] SORT parameters tuned (FILSZ, DYNALLOC, work area allocation)
- [ ] Checkpoint/restart implemented for long-running jobs
- [ ] Parallel processing considered for large-volume jobs
- [ ] Job scheduling dependencies reviewed for unnecessary serialization
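The parallel-processing item can be illustrated with a toy scheduling model: when the jobs in a stage are independent, the stage's elapsed time is the longest job in it rather than the sum. The five job durations below come from review question 8; the dependency grouping is hypothetical:

```python
# Elapsed time for a batch cycle run serially vs. in parallel stages.
# Jobs within a stage run concurrently; stages run one after another.

def serial_elapsed(durations):
    """Total elapsed hours when every job runs back to back."""
    return sum(durations)

def parallel_elapsed(stages):
    """Total elapsed hours when each stage runs its jobs concurrently."""
    return sum(max(stage) for stage in stages)

jobs = [2, 3, 1, 2, 1]          # hours, from review question 8
stages = [[2, 3], [1], [2, 1]]  # hypothetical dependency grouping
print(serial_elapsed(jobs))      # 9 hours serially
print(parallel_elapsed(stages))  # 6 hours with two parallel stages
```

The real constraints are the ones the checklist names: data dependencies between jobs, contention for the same datasets and DB2 objects, and available CPU and initiator capacity during the window.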
32.12 Summary
Performance tuning for COBOL programs on z/OS is both an art and a science. The science lies in measuring accurately, understanding the architecture, and applying known optimization techniques. The art lies in knowing which optimizations to apply first and understanding the trade-offs between CPU time, elapsed time, memory usage, and code maintainability.
In this chapter, we have covered:
- Performance fundamentals --- CPU time, elapsed time, I/O counts, and memory usage as the four pillars of mainframe performance measurement.
- Compiler optimization --- OPTIMIZE(2), TRUNC(OPT), NUMPROC(PFD), and FASTSRT as the most impactful compiler options for performance.
- Efficient coding techniques --- SEARCH ALL for sorted tables, PERFORM VARYING optimization, data type selection, and EVALUATE for multi-way branching.
- File I/O optimization --- Block size calculation, buffer management, and VSAM tuning including FREESPACE, CI/CA split management, and buffer allocation.
- DB2 performance --- Index-friendly predicates, EXPLAIN analysis, host variable alignment, and multi-row FETCH.
- CICS performance --- Pseudo-conversational design, BMS optimization, COMMAREA minimization, and lock management.
- Memory optimization --- WORKING-STORAGE layout, data type sizing, and block processing for large datasets.
- Batch performance --- SORT optimization, checkpoint/restart, and parallel processing patterns.
- Monitoring tools --- SMF records, Application Performance Analyzer, and instrumentation techniques.
- Case study --- A realistic end-of-day batch cycle optimization that achieved a 58% reduction in elapsed time through a combination of DB2 index restoration, VSAM reorganization, compiler option changes, data type corrections, and I/O optimization.
The security considerations discussed in Chapter 31 add overhead to every resource access. As you tune for performance, remember that security overhead is non-negotiable --- you must optimize within the constraints of your security model, never by circumventing it. Similarly, the audit trail requirements discussed in Chapter 31 add I/O that must be accounted for in your performance budget.
Performance tuning is an ongoing activity, not a one-time event. Transaction volumes grow, data accumulates, and access patterns change. The monitoring and measurement techniques in this chapter should be part of your regular operational routine, identifying and addressing performance degradation before it impacts the business.
Chapter Review Questions:
- What are the three fundamental performance metrics on z/OS, and what does each one measure?
- Explain the difference between OPTIMIZE(0), OPTIMIZE(1), and OPTIMIZE(2). Why should production programs always use OPTIMIZE(2)?
- A program processes 10 million records. Compare the number of I/O operations for BLKSIZE=200 vs. BLKSIZE=27800 with LRECL=200. What is the percentage reduction?
- Why is SEARCH ALL significantly faster than SEARCH for a table with 10,000 entries? What is the prerequisite for using SEARCH ALL?
- Describe three ways to optimize DB2 SQL performance in a COBOL program. Which typically has the largest impact?
- Why is pseudo-conversational programming essential for CICS performance? What resources are wasted in conversational mode?
- Explain the concept of checkpoint/restart and why it is important for long-running batch jobs.
- Given a batch cycle with five sequential jobs taking 2h, 3h, 1h, 2h, and 1h respectively, how would you redesign the cycle to reduce total elapsed time through parallel processing? What constraints must you consider?
- Review the performance tuning case study. Which single fix had the largest impact and why?
- How do the security requirements from Chapter 31 affect performance tuning decisions? Give two specific examples.