# Chapter 26: Key Takeaways

*Batch Performance at Scale: I/O Optimization, Buffer Tuning, SORT Optimization, and DFSORT Tricks*
## Core Principle
Performance is measurement, not guessing. Every optimization decision must be grounded in data — SMF Type 30 decomposition into CPU, I/O, DB2, and Other wait time. Without this decomposition, you're optimizing the 15% while ignoring the 85%.
## Core Takeaways
- **Follow the priority stack.** Eliminate unnecessary work first. Then optimize I/O (buffers, BLKSIZE, access methods). Then tune DFSORT. Then adjust compiler options. Then tune DB2. Then — and only then — consider advanced techniques. Skipping levels costs time and money.
- **BLKSIZE is the single highest-ROI I/O change.** Half-track blocking (BLKSIZE ≈ 27998, rounded down to a multiple of LRECL) reduces EXCP count by 80-99% compared to common bad defaults. The formula: `FLOOR(27998 / LRECL) × LRECL`. Memorize it. Apply it to every sequential batch dataset.
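The formula is mechanical enough to script. A minimal sketch (the function name is illustrative, not a real utility):

```python
def optimal_blksize(lrecl, half_track=27998):
    """Largest multiple of LRECL that fits in half a 3390 track (27,998 bytes)."""
    if lrecl <= 0 or lrecl > half_track:
        raise ValueError("LRECL must be between 1 and the half-track capacity")
    return (half_track // lrecl) * lrecl

# Common record lengths:
print(optimal_blksize(80))    # 27920
print(optimal_blksize(133))   # 27930
print(optimal_blksize(500))   # 27500
```

Note how close every result stays to 27,998: the wasted space per block is always less than one LRECL.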
- **Buffers are the cheapest performance investment.** BUFNO=20-30 for sequential files costs under a megabyte of memory and can reduce I/O wait by 40-50%. Full index buffering for VSAM KSDS (BUFNI equal to the total number of index records) eliminates 3 index I/Os per random read — at a memory cost measured in tens of megabytes.
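The "under a megabyte" claim is easy to verify: each sequential buffer is roughly one BLKSIZE-sized area of virtual storage. A rough sketch (the helper name is illustrative):

```python
def bufno_memory_bytes(bufno, blksize):
    """Approximate virtual storage for sequential buffers: one BLKSIZE-sized area each."""
    return bufno * blksize

# 30 buffers at a half-track BLKSIZE for LRECL=80:
cost = bufno_memory_bytes(30, 27920)
print(f"{cost / 1024 / 1024:.2f} MB")  # 0.80 MB
```

Thirty half-track buffers fit in well under a megabyte, which is why raising BUFNO is almost always the first tuning knob to turn.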
- **DFSORT is not just for sorting.** INCLUDE/OMIT filters records. OUTREC/INREC reformats them. SUM aggregates them. OUTFIL splits them into multiple outputs. ICETOOL chains multiple operations. Twenty-three COBOL programs at Federal Benefits were replaced by DFSORT — running 4x faster with 96% less code.
- **Filter before sort.** An INCLUDE or OMIT statement that runs before SORT reduces the record count entering the sort phase. Every excluded record is a record not sorted, not written to work datasets, and not written to output. This is free performance.
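The principle is language-independent. A small Python analogy of the INCLUDE-before-SORT pattern (illustrative only; in a real job the filtering happens in DFSORT control statements, not application code):

```python
import random

# 100,000 (type, key) records; roughly 20% are type "A"
records = [("A" if random.random() < 0.2 else "Z", random.random())
           for _ in range(100_000)]

# Sort-then-filter: every record passes through the sort
slow = [r for r in sorted(records, key=lambda r: r[1]) if r[0] == "A"]

# Filter-then-sort: only ~20% of the volume enters the sort
fast = sorted((r for r in records if r[0] == "A"), key=lambda r: r[1])

assert slow == fast  # identical output, far less work in the sort phase
```

Both paths produce the same answer; the second simply sorts a fifth of the records.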
- **MAINSIZE=MAX is mandatory for production sorts.** DFSORT's merge passes are determined by available memory. More memory means fewer passes. Going from 3 merge passes to 1 cuts elapsed time in half. Allocate 1 GB regions for large sort jobs.
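Using the merge-pass formula from this chapter, a sketch of how memory drives pass count (the helper name is illustrative, and real DFSORT behavior also depends on work-dataset layout):

```python
import math

def merge_passes(record_count, record_length, memory_bytes):
    """Approximate merge passes per the chapter's formula: CEIL(log base (M/R) of N)."""
    fan_in = memory_bytes // record_length   # records that fit in memory at once
    if fan_in < 2:
        raise ValueError("not enough memory to sort at all")
    return max(1, math.ceil(math.log(record_count, fan_in)))

# 100 million 500-byte records:
print(merge_passes(100_000_000, 500, 4 * 1024**2))    # 3 passes with 4 MB
print(merge_passes(100_000_000, 500, 256 * 1024**2))  # 2 passes with 256 MB
```

Each eliminated pass is one less full read-and-write of the entire dataset through the work files, which is where the elapsed-time savings come from.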
- **OPT(2) saves 20-40% CPU on compute-intensive batch.** It's a compiler switch, not a code change. Use OPT(1) for development, OPT(2) for production. The build pipeline should handle the switch automatically.
- **FASTSRT delegates SORT I/O to DFSORT.** It requires USING/GIVING (not INPUT/OUTPUT PROCEDURE) and typically improves SORT performance by 30-50%. If your INPUT PROCEDURE only filters records, replace it with a DFSORT INCLUDE and switch to USING.
- **Commit every 1,000-5,000 records in DB2 batch.** Committing too frequently wastes time on synchronization overhead. Committing too infrequently risks lock escalation and long recovery windows. The sweet spot depends on LOCKMAX and contention patterns.
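The frequent-commit penalty can be estimated with the chapter's 2-5 ms-per-commit figure. A sketch assuming a mid-range 3.5 ms (the numbers are illustrative):

```python
def commit_overhead_seconds(total_records, commit_interval, ms_per_commit=3.5):
    """Total time spent in commit processing, per the chapter's 2-5 ms estimate."""
    commits = -(-total_records // commit_interval)   # ceiling division
    return commits * ms_per_commit / 1000

# 10 million records: committing every record vs. every 2,000
print(commit_overhead_seconds(10_000_000, 1))       # 35000.0 seconds (~9.7 hours)
print(commit_overhead_seconds(10_000_000, 2_000))   # 17.5 seconds
```

The same job pays hours of pure commit overhead at one extreme and seconds at the other, which is why "committing every record" appears in the red flags below.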
- **FOR FETCH ONLY on every read cursor.** It enables sequential prefetch, avoids intent-exclusive locks, and costs nothing. Every DB2 read cursor in a batch program should have it.
## Formulas to Remember

- Optimal BLKSIZE: `FLOOR(27998 / LRECL) × LRECL`
- Performance decomposition (SMF Type 30):
  - CPU% = SMF30CPT / SMF30AET × 100
  - I/O% = (SMF30TEP + SMF30TIS) / SMF30AET × 100
  - DB2% = SMF30_DB2_CLASS2 / SMF30AET × 100
  - Other% = 100 - CPU% - I/O% - DB2%
- DFSORT merge passes: `CEIL(LOG base (M/R) of N)`, where M = available memory, R = record length, N = record count
- Commit overhead: total commit time = number of commits × 2-5 ms per commit
- VSAM index I/O savings: EXCPs saved = index levels × record count × (1 - 1/BUFNI coverage ratio)
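The decomposition can be wrapped in a small helper (illustrative; it assumes all fields have already been converted to seconds, since raw SMF timer fields typically need unit conversion first):

```python
def decompose(smf30cpt, smf30tep, smf30tis, db2_class2, smf30aet):
    """Break elapsed time into CPU / I/O / DB2 / Other percentages (SMF Type 30)."""
    cpu = smf30cpt / smf30aet * 100
    io = (smf30tep + smf30tis) / smf30aet * 100
    db2 = db2_class2 / smf30aet * 100
    other = 100 - cpu - io - db2
    return {"CPU%": round(cpu, 1), "I/O%": round(io, 1),
            "DB2%": round(db2, 1), "Other%": round(other, 1)}

# A job with 3,600 s elapsed: 540 s CPU, 1,980 s I/O, 720 s in DB2
print(decompose(540, 1800, 180, 720, 3600))
# {'CPU%': 15.0, 'I/O%': 55.0, 'DB2%': 20.0, 'Other%': 10.0}
```

This is the 15%-vs-85% split from the core principle made concrete: tuning CPU on this job caps the win at 15%, while I/O offers 55%.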
## Red Flags
- Any sequential batch dataset with BLKSIZE < 10,000 (check immediately)
- BUFNO at default (5) on any critical-path job
- VSAM BUFNI=1 on any batch-accessed KSDS
- COBOL SORT with INPUT/OUTPUT PROCEDURE when USING/GIVING is possible
- OPT(0) on any production batch program
- DB2 batch programs committing every record
- No SMF-based performance baseline for critical-path jobs
- Critical-path jobs with "Other" wait time > 20%
- DFSORT running with MAINSIZE default instead of MAX
- COBOL programs that only sort, merge, or reformat (should be DFSORT)
## Production Checklist

### I/O Optimization

- [ ] All critical-path datasets use half-track optimal BLKSIZE
- [ ] BUFNO=20-30 for large sequential files
- [ ] VSAM BUFNI caches full index (or top 2 levels minimum)
- [ ] VSAM BUFND ≥ 10 for data components
- [ ] No DISP=OLD conflicts between concurrent jobs
- [ ] EXCP counts baselined and trending

### DFSORT

- [ ] All sort/merge/reformat jobs evaluated for DFSORT replacement
- [ ] INCLUDE/OMIT used to filter before sort wherever possible
- [ ] INREC used to reduce record size before sort where applicable
- [ ] MAINSIZE=MAX on all production sort jobs
- [ ] DYNALLOC for work datasets (not pre-allocated)

### Compiler

- [ ] OPT(2) for all production batch programs
- [ ] FASTSRT enabled for programs with SORT verbs
- [ ] NOSEQCHK for production (SEQCHK for test)
- [ ] NUMPROC(PFD) unless legacy data has invalid packed signs
- [ ] SSRANGE on (safety over speed)

### DB2

- [ ] Commit frequency 1,000-5,000 records (not every record)
- [ ] FOR FETCH ONLY on all read cursors
- [ ] DEGREE(ANY) in batch plan BIND
- [ ] Sequential/list prefetch verified in EXPLAIN
- [ ] RUNSTATS current for batch-accessed tablespaces

### Monitoring

- [ ] SMF Type 30 performance baseline for all critical-path jobs
- [ ] Performance dashboard reviewed daily
- [ ] Trend analysis reviewed weekly
- [ ] Quarterly performance review for critical-path jobs
- [ ] Performance checklist for all new batch programs before production