Chapter 26: Key Takeaways

Batch Performance at Scale: I/O Optimization, Buffer Tuning, SORT Optimization, and DFSORT Tricks


Core Principle

Performance is measurement, not guessing. Every optimization decision must be grounded in data: the decomposition of SMF Type 30 elapsed time into CPU, I/O, DB2, and other wait components. Without this decomposition, you're optimizing the 15% while ignoring the 85%.


Core Takeaways

  1. Follow the priority stack. Eliminate unnecessary work first. Then optimize I/O (buffers, BLKSIZE, access methods). Then tune DFSORT. Then adjust compiler options. Then tune DB2. Then — and only then — consider advanced techniques. Skipping levels costs time and money.

  2. BLKSIZE is the single highest-ROI I/O change. Half-track blocking (BLKSIZE ≈ 27998 adjusted for LRECL) reduces EXCP count by 80-99% compared to common bad defaults. The formula: FLOOR(27998 / LRECL) × LRECL. Memorize it. Apply it to every sequential batch dataset.

  3. Buffers are the cheapest performance investment. BUFNO=20-30 for sequential files costs under a megabyte of memory and can reduce I/O wait by 40-50%. Full index buffering for VSAM KSDS (BUFNI equal to total index records) eliminates 3 index I/Os per random read — at a memory cost measured in tens of megabytes.

  4. DFSORT is not just for sorting. INCLUDE/OMIT filters records. OUTREC/INREC reformats them. SUM aggregates them. OUTFIL splits them into multiple outputs. ICETOOL chains multiple operations. Twenty-three COBOL programs at Federal Benefits were replaced by DFSORT — running 4x faster with 96% less code.

  5. Filter before sort. An INCLUDE or OMIT statement that runs before SORT reduces the record count entering the sort phase. Every excluded record is a record not sorted, not written to work datasets, and not written to output. This is free performance.

  6. MAINSIZE=MAX is mandatory for production sorts. DFSORT's merge passes are determined by available memory. More memory means fewer passes. Going from 3 merge passes to 1 cuts elapsed time in half. Allocate 1 GB regions for large sort jobs.

  7. OPT(2) saves 20-40% CPU on compute-intensive batch. It's a compiler switch, not a code change. Use OPT(1) for development, OPT(2) for production. The build pipeline should handle the switch automatically.

  8. FASTSRT delegates SORT I/O to DFSORT. It requires USING/GIVING (not INPUT/OUTPUT PROCEDURE) and typically improves SORT performance by 30-50%. If your INPUT PROCEDURE only filters records, replace it with DFSORT INCLUDE and switch to USING.

  9. Commit every 1,000-5,000 records in DB2 batch. Too frequent wastes time on synchronization overhead. Too infrequent risks lock escalation and long recovery windows. The sweet spot depends on LOCKMAX and contention patterns.

  10. FOR FETCH ONLY on every read cursor. It enables sequential prefetch, avoids intent exclusive locks, and costs nothing. Every DB2 read cursor in a batch program should have it.
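Takeaways 4 through 6 often combine in a single DFSORT step. A minimal sketch — the dataset names, the status-byte position in the INCLUDE condition, and the sort key fields are placeholders, not values from the chapter:

```jcl
//SORTSTEP EXEC PGM=SORT,REGION=0M
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=PROD.DAILY.TXNS,DISP=SHR
//SORTOUT  DD DSN=PROD.DAILY.TXNS.SORTED,DISP=(NEW,CATLG),
//            SPACE=(CYL,(500,100),RLSE)
//SYSIN    DD *
* Filter before sort (point 5): only 'A' (active) records enter the sort
  INCLUDE COND=(21,1,CH,EQ,C'A')
* Sort on a 10-byte key at position 1
  SORT FIELDS=(1,10,CH,A)
* Use all available storage (point 6) and dynamic work datasets
  OPTION MAINSIZE=MAX,DYNALLOC=(SYSDA,8)
/*
```

Because INCLUDE runs in the input phase, excluded records never reach the sort phase or the work datasets, which is exactly the "free performance" of point 5.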


Formulas to Remember

Optimal BLKSIZE: FLOOR(27998 / LRECL) × LRECL
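The half-track formula is easy to verify with a few lines of Python (a sketch; the 27,998-byte figure is half the usable capacity of a 3390 track):

```python
def optimal_blksize(lrecl: int, half_track: int = 27998) -> int:
    """Largest BLKSIZE <= half a 3390 track that is a multiple of LRECL.

    Two blocks of this size fill one track with minimal wasted space.
    """
    if lrecl <= 0 or lrecl > half_track:
        raise ValueError("LRECL must be between 1 and the half-track size")
    return (half_track // lrecl) * lrecl

print(optimal_blksize(80))    # 27920 -> 349 records per block
print(optimal_blksize(133))   # 27930 -> 210 print lines per block
```

Compare that with a legacy BLKSIZE of 80 (one record per block): the same file needs 349 times as many EXCPs.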

Performance decomposition:

CPU%   = SMF30CPT / SMF30AET × 100
I/O%   = (SMF30TEP + SMF30TIS) / SMF30AET × 100
DB2%   = SMF30_DB2_CLASS2 / SMF30AET × 100
Other% = 100 - CPU% - I/O% - DB2%
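The decomposition above can be sketched as a small function. The field names follow the formulas; the job numbers in the example are hypothetical:

```python
def decompose(smf30cpt, smf30tep, smf30tis, db2_class2, smf30aet):
    """Split total elapsed time (SMF30AET) into percentage buckets.

    All inputs must be in the same time unit (e.g. seconds).
    """
    cpu = smf30cpt / smf30aet * 100
    io = (smf30tep + smf30tis) / smf30aet * 100
    db2 = db2_class2 / smf30aet * 100
    other = 100 - cpu - io - db2
    return {"CPU%": cpu, "I/O%": io, "DB2%": db2, "Other%": other}

# Hypothetical job: 3600 s elapsed, 540 s CPU, 1800 s I/O wait, 900 s in DB2
print(decompose(540, 1200, 600, 900, 3600))
```

Here I/O dominates at 50%, so buffer and BLKSIZE work comes before compiler or DB2 tuning — the priority stack in action.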

DFSORT merge passes: CEIL(log base (M/R) of N), where M = available sort memory, R = record length, and N = record count (M/R is the number of records that fit in memory at once)
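A sketch of the merge-pass formula, with hypothetical region sizes, shows why MAINSIZE matters (takeaway 6):

```python
import math

def merge_passes(memory_bytes: int, record_len: int, record_count: int) -> int:
    """CEIL(log base (M/R) of N): passes needed when M/R records fit in memory."""
    fan_in = memory_bytes // record_len          # records held in memory at once
    return max(1, math.ceil(math.log(record_count) / math.log(fan_in)))

# 10 million 200-byte records:
print(merge_passes(64 * 2**20, 200, 10_000_000))   # 64 MB region -> 2 passes
print(merge_passes(2 * 2**30, 200, 10_000_000))    # 2 GB region  -> 1 pass
```

Halving the pass count roughly halves elapsed time, since every pass rereads and rewrites the entire dataset through the work files.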

Commit overhead: Total commit time = Number of commits × 2-5ms per commit
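The commit-overhead formula makes takeaway 9 concrete. A sketch, assuming the midpoint 3 ms per commit:

```python
def commit_overhead_seconds(records: int, commit_interval: int,
                            ms_per_commit: float = 3.0) -> float:
    """Total time spent in commits for a batch run (2-5 ms each; 3 ms assumed)."""
    commits = -(-records // commit_interval)     # ceiling division
    return commits * ms_per_commit / 1000

# 10 million records: commit-every-record vs. commit-every-2,000
print(commit_overhead_seconds(10_000_000, 1))       # 30000.0 s (over 8 hours)
print(commit_overhead_seconds(10_000_000, 2_000))   # 15.0 s
```

Committing every record turns pure overhead into the dominant cost of the job; at a 2,000-record interval it becomes negligible while keeping recovery windows short.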

VSAM index I/O savings: EXCP saved ≈ Index levels × Random read count × index coverage fraction (the share of index records held in BUFNI buffers; a fully cached index eliminates all index EXCPs)


Red Flags

  • Any sequential batch dataset with BLKSIZE < 10,000 (check immediately)
  • BUFNO at default (5) on any critical-path job
  • VSAM BUFNI=1 on any batch-accessed KSDS
  • COBOL SORT with INPUT/OUTPUT PROCEDURE when USING/GIVING is possible
  • OPT(0) on any production batch program
  • DB2 batch programs committing every record
  • No SMF-based performance baseline for critical-path jobs
  • Critical-path jobs with "Other" wait time > 20%
  • DFSORT running with MAINSIZE default instead of MAX
  • COBOL programs that only sort, merge, or reformat (should be DFSORT)

Production Checklist

I/O Optimization:
- [ ] All critical-path datasets use half-track optimal BLKSIZE
- [ ] BUFNO=20-30 for large sequential files
- [ ] VSAM BUFNI caches full index (or top 2 levels minimum)
- [ ] VSAM BUFND ≥ 10 for data components
- [ ] No DISP=OLD conflicts between concurrent jobs
- [ ] EXCP counts baselined and trending

DFSORT:
- [ ] All sort/merge/reformat jobs evaluated for DFSORT replacement
- [ ] INCLUDE/OMIT used to filter before sort wherever possible
- [ ] INREC used to reduce record size before sort where applicable
- [ ] MAINSIZE=MAX on all production sort jobs
- [ ] DYNALLOC for work datasets (not pre-allocated)

Compiler:
- [ ] OPT(2) for all production batch programs
- [ ] FASTSRT enabled for programs with SORT verbs
- [ ] NOSEQCHK for production (SEQCHK for test)
- [ ] NUMPROC(PFD) unless legacy data has invalid packed signs
- [ ] SSRANGE on (safety over speed)

DB2:
- [ ] Commit frequency 1,000-5,000 (not every record)
- [ ] FOR FETCH ONLY on all read cursors
- [ ] DEGREE(ANY) in batch plan BIND
- [ ] Sequential/list prefetch verified in EXPLAIN
- [ ] RUNSTATS current for batch-accessed tablespaces

Monitoring:
- [ ] SMF Type 30 performance baseline for all critical-path jobs
- [ ] Performance dashboard reviewed daily
- [ ] Trend analysis reviewed weekly
- [ ] Quarterly performance review for critical-path jobs
- [ ] Performance checklist for all new batch programs before production