
Chapter 25: Buffer Pool and Memory Tuning — Giving DB2 the Right Resources

"I have seen more performance problems solved by proper buffer pool configuration than by any amount of SQL tuning. You can write the perfect query, but if DB2 does not have the memory it needs to keep your data close, every read hits the disk — and disk is slow. Memory is fast. That is the fundamental truth of database performance."

— A senior DBA, after resolving a 6-hour batch window problem by resizing three buffer pools


Introduction

Every database operation — every SELECT, every INSERT, every index lookup — ultimately depends on reading data pages into memory. The buffer pool is that memory. It is the single most important performance component in DB2, and getting it right is the difference between a system that flies and one that crawls.

This chapter is about understanding how DB2 uses memory, how to configure buffer pools on both z/OS and LUW platforms, and how to monitor and tune memory allocation so your workloads get the resources they need. We will cover the theory, but more importantly, we will cover the practice: real configuration values, real monitoring queries, and real tuning decisions.

If you remember nothing else from this chapter, remember this: a page read from the buffer pool takes microseconds; a page read from disk takes milliseconds. That is a difference of three orders of magnitude. Every page you can keep in memory instead of reading from disk is a performance win — and over the millions of page requests a production system handles every hour, those microseconds add up to minutes, hours, and real money.


25.1 Buffer Pool Fundamentals

25.1.1 What Is a Buffer Pool?

A buffer pool is a region of memory that DB2 uses to cache data pages and index pages read from disk. When DB2 needs a page — whether to satisfy a query, update a row, or traverse an index — it first checks the buffer pool. If the page is there (a buffer pool hit), the operation proceeds at memory speed. If the page is not there (a buffer pool miss), DB2 must perform a physical I/O to read the page from disk, copy it into the buffer pool, and then proceed.

Every table space and index space in DB2 is assigned to exactly one buffer pool. The buffer pool acts as the intermediary between the logical world of SQL and the physical world of disk storage. No application thread ever reads from disk directly. All access goes through the buffer pool — this design gives DB2 complete control over caching strategy.

25.1.2 Page Sizes

DB2 supports multiple page sizes for buffer pools:

| Page Size | Typical Use | Max Row Length |
|---|---|---|
| 4 KB | General purpose, catalog, small tables | ~4,005 bytes |
| 8 KB | Mixed workloads, moderate row sizes | ~8,101 bytes |
| 16 KB | Larger rows, LOB descriptors, XML | ~16,293 bytes |
| 32 KB | Large sequential scans, temp tables, wide rows | ~32,677 bytes |

On z/OS, the page size of a buffer pool is determined by its number range (BP0-BP49 are 4 KB, BP8K0-BP8K9 are 8 KB, BP16K0-BP16K9 are 16 KB, BP32K0-BP32K9 are 32 KB). On LUW, you specify the page size when you create the buffer pool.

A fundamental rule: the page size of a buffer pool must match the page size of the table spaces assigned to it. You cannot assign a 4 KB table space to a 32 KB buffer pool. Plan your page size strategy before you create your first table space, because changing it later requires unloading and reloading the data.

25.1.3 The Buffer Pool Hit Ratio

The buffer pool hit ratio is the most commonly cited buffer pool metric. It measures the percentage of page requests satisfied from the buffer pool without requiring physical I/O:

Buffer Pool Hit Ratio = (1 - (Physical Reads / Logical Reads)) * 100

Or equivalently:

Hit Ratio = (Buffer Pool Hits / (Buffer Pool Hits + Physical Reads)) * 100
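To make the arithmetic concrete, here is a small illustrative Python helper — just the formula above, not a DB2 utility:

```python
def hit_ratio(logical_reads, physical_reads):
    """Buffer pool hit ratio, as a percentage.

    logical_reads  = total page requests (getpages)
    physical_reads = requests that had to go to disk
    """
    if logical_reads == 0:
        return 0.0  # no activity yet; report 0 rather than divide by zero
    return (1 - physical_reads / logical_reads) * 100

# 2,000,000 getpages with 200,000 physical reads -> 90% hit ratio
print(round(hit_ratio(2_000_000, 200_000), 2))  # 90.0
```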

Target hit ratios by workload type:

| Workload Type | Target Hit Ratio | Rationale |
|---|---|---|
| OLTP (random access) | 95-99%+ | Each miss is a synchronous wait |
| Mixed (OLTP + reporting) | 85-95% | Balance between random and sequential |
| Batch/sequential processing | 70-90% | Prefetch handles throughput |
| Data warehouse (large scans) | 50-80% | Pages used once rarely justify caching |

A word of caution: the hit ratio is a useful indicator, but it is not the whole story. A 99% hit ratio is meaningless if the 1% of misses are all on your most critical transaction path. Conversely, a 75% hit ratio on a batch pool may be perfectly acceptable because those reads are handled by prefetch. Always dig deeper than the headline number.

The diminishing returns curve. Consider a buffer pool with a 90% hit ratio and 2,000,000 logical reads per hour. That means 200,000 physical reads. Doubling the buffer pool might raise the hit ratio to 95%, cutting physical reads to 100,000 — a savings of 100,000 reads. Doubling it again might reach 97.5%, saving another 50,000 reads. Doubling once more might reach 98.75%, saving 25,000. Each doubling of memory yields half the benefit of the previous doubling. At some point, the memory is more valuable elsewhere.
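The curve is easy to reproduce in a few lines of illustrative Python, under the same simplifying assumption as the example above (each doubling of the pool halves the miss rate — real workloads are rarely this tidy):

```python
logical_reads = 2_000_000   # getpages per hour
miss_rate = 0.10            # 90% hit ratio to start

savings = []
for doubling in range(4):
    physical = round(logical_reads * miss_rate)
    savings.append(physical)
    print(f"pool x{2 ** doubling}: hit ratio {100 * (1 - miss_rate):.2f}%, "
          f"{physical:,} physical reads/hour")
    miss_rate /= 2          # assumption: each doubling halves the misses
```

The output walks down the same steps as the text: 200,000, 100,000, 50,000, 25,000 physical reads per hour — each doubling of memory buying half the benefit of the previous one.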

25.1.4 Random vs. Sequential I/O

Understanding the difference between random and sequential I/O is essential for buffer pool tuning.

Random I/O occurs when DB2 reads individual pages scattered across the disk. This happens during index-driven access to data pages, singleton lookups, and OLTP transactions. Random I/O is expensive because each read requires a separate disk seek (or, on flash storage, a separate I/O operation with its own latency).

Sequential I/O occurs when DB2 reads contiguous pages in order. This happens during table scans, sequential prefetch, and batch processing. Sequential I/O is more efficient because modern storage can read consecutive pages much faster than random ones — the seek cost is amortized across many pages.

DB2 uses prefetch (also called read-ahead) to optimize sequential I/O. When DB2 detects a sequential access pattern, it reads ahead multiple pages at once, overlapping I/O with processing. On z/OS, this is controlled by the sequential prefetch quantity (related to the SEGSIZE of the table space). On LUW, it is controlled by the PREFETCHSIZE of the table space and the EXTENTSIZE setting.
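To see why detecting a sequential pattern pays off, here is a toy Python model of read-ahead. The trigger rule (three consecutive pages) and the prefetch quantity are simplified stand-ins, not DB2's actual algorithm:

```python
def plan_reads(page_requests, prefetch_quantity=32):
    """Toy read-ahead: once three consecutive page numbers are seen,
    fetch the next prefetch_quantity pages in one asynchronous I/O."""
    fetched = set()
    io_ops = 0
    run = 0
    prev = None
    for p in page_requests:
        run = run + 1 if prev is not None and p == prev + 1 else 1
        prev = p
        if p not in fetched:
            if run >= 3:
                fetched.update(range(p, p + prefetch_quantity))  # one big I/O
            else:
                fetched.add(p)                                   # demand read
            io_ops += 1
    return io_ops

# Scanning 128 consecutive pages: 2 demand reads, then a handful of
# prefetch I/Os instead of 126 more single-page reads
print(plan_reads(range(1, 129)))
```

Six I/O operations instead of 128 demand reads — and in the real engine the prefetch I/Os complete while the application is still processing earlier pages.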

The implications for buffer pool tuning are profound:

  • Random access workloads benefit enormously from large buffer pools because keeping pages in memory avoids expensive random disk seeks. If the entire working set fits in the buffer pool, random I/O drops to near zero.
  • Sequential access workloads benefit less from large buffer pools because prefetch already makes I/O relatively efficient, and scanned pages may only be used once before being discarded.

This is why many experienced DBAs create separate buffer pools for random and sequential workloads — a practice we will explore in detail in Section 25.5.

25.1.5 How DB2 Manages Buffer Pool Pages

When the buffer pool is full and DB2 needs to bring in a new page, it must evict an existing page. DB2 uses a variant of the Least Recently Used (LRU) algorithm to decide which page to evict.

On z/OS, the buffer pool manager maintains two chains:

  1. LRU chain: Pages that have not been updated since their last write to disk. These "clean" pages can be evicted immediately because the disk copy is current.
  2. Dirty page chain (also called the write queue): Pages that have been modified but not yet written to disk. These must be written to disk before they can be evicted — otherwise the changes would be lost.

DB2 uses deferred write to asynchronously write dirty pages to disk, moving them from the dirty chain to the LRU chain where they can be stolen. The timing of deferred writes is controlled by thresholds (DWQT, VDWQT) which we will cover in Section 25.2.

On LUW, the mechanism is similar but uses page cleaners — background threads that write dirty pages to disk. The number of page cleaners is controlled by the NUM_IOCLEANERS configuration parameter. Page cleaners are triggered by several conditions: a buffer pool reaching its dirty page threshold, the log sequence number gap growing too large, or a page steal attempt finding only dirty pages.
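A toy model makes the clean/dirty distinction tangible. This is illustrative Python, not DB2 internals — the real implementation uses separate chains, thresholds, and asynchronous writers:

```python
from collections import OrderedDict

class MiniBufferPool:
    """Toy LRU cache that distinguishes clean from dirty pages."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()   # page_id -> dirty flag (LRU order)
        self.disk_writes = 0

    def get(self, page_id, update=False):
        if page_id in self.pages:
            dirty = self.pages.pop(page_id) or update
            self.pages[page_id] = dirty          # move to MRU end
            return "hit"
        if len(self.pages) >= self.capacity:
            victim, dirty = self.pages.popitem(last=False)  # steal LRU page
            if dirty:
                self.disk_writes += 1            # must write before stealing
        self.pages[page_id] = update
        return "miss"

bp = MiniBufferPool(capacity=2)
bp.get(1, update=True)   # miss: page 1 enters the pool dirty
bp.get(2)                # miss: page 2 enters clean
bp.get(3)                # pool full: steals dirty page 1, forcing a write
print(bp.disk_writes)    # 1
```

In real DB2, deferred writes and page cleaners try to ensure that by the time a page reaches the steal point it is already clean, so the application thread never pays for that write.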

25.1.6 Synchronous vs. Asynchronous I/O

This distinction matters enormously for application response time:

  • Asynchronous I/O (prefetch reads, deferred writes): DB2 requests the I/O and continues working while it completes. The application thread is not blocked. Prefetch reads are initiated by DB2's prefetch engine, which works ahead of the application thread.
  • Synchronous I/O (demand reads, forced writes): The application thread is blocked waiting for the I/O to complete. This happens when a needed page is not in the buffer pool, was not anticipated by prefetch, and must be read immediately. It also happens when DB2 must write a dirty page to make room and the page cleaners have not kept up.

The goal of buffer pool tuning is to minimize synchronous I/O. Every synchronous read is a moment where your application is waiting instead of working. On a system processing 5,000 transactions per second, even 1 ms of unnecessary synchronous I/O per transaction translates to 5 seconds of wasted CPU/thread time every second — a staggering cost at scale.
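The arithmetic in that last sentence is worth spelling out:

```python
tps = 5_000               # transactions per second
sync_wait_ms = 1.0        # avoidable synchronous I/O wait per transaction

# Total thread time spent waiting, per second of wall-clock time:
wasted_seconds_per_second = tps * sync_wait_ms / 1000
print(wasted_seconds_per_second)  # 5.0
```

Five seconds of accumulated thread wait for every elapsed second — time the threads could have spent doing work at memory speed.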


25.2 Buffer Pool Configuration on z/OS

25.2.1 Buffer Pool Naming and Ranges

On z/OS, buffer pools are predefined with fixed names and page sizes:

| Buffer Pool Range | Page Size | Count |
|---|---|---|
| BP0 - BP49 | 4 KB | 50 pools |
| BP8K0 - BP8K9 | 8 KB | 10 pools |
| BP16K0 - BP16K9 | 16 KB | 10 pools |
| BP32K0 - BP32K9 | 32 KB | 10 pools |

BP0 is the default buffer pool and always exists. It holds the DB2 catalog and directory table spaces. Every other buffer pool must be explicitly activated by setting its VPSIZE to a non-zero value.

A common mistake on z/OS is leaving all objects in BP0. This creates a single contention point where batch scans compete with OLTP lookups, index traversals compete with sort work, and the catalog competes with application data. We will address this in the assignment strategy section.

25.2.2 Key Configuration Parameters

Buffer pools on z/OS are configured using the -ALTER BUFFERPOOL command (issued through the DB2 command processor) or through installation panel DSNTIP1. The critical parameters are:

VPSIZE (Virtual Pool Size)

This is the number of pages (buffers) in the virtual buffer pool — the actual memory allocated. This is the most important parameter for any buffer pool.

-ALTER BUFFERPOOL(BP1) VPSIZE(50000)

This allocates 50,000 pages x 4 KB = approximately 195 MB for BP1.
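The pages-to-memory conversion comes up constantly when sizing pools, so a throwaway helper is handy (illustrative Python; 1 MB = 1,048,576 bytes here):

```python
def pool_memory_mb(vpsize_pages, page_size_bytes=4096):
    """Memory consumed by a buffer pool, in MB."""
    return vpsize_pages * page_size_bytes / (1024 * 1024)

print(pool_memory_mb(50_000))          # 195.3125 -> the ~195 MB above
print(pool_memory_mb(10_000, 32_768))  # 312.5    -> ~313 MB for a 32 KB pool
```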

Sizing guidelines for VPSIZE:

| Workload | VPSIZE Guideline |
|---|---|
| Hot OLTP tables | Large enough for entire working set; target 98%+ hit ratio |
| Index pools | Large enough for all non-leaf pages plus hot leaf pages; target 99%+ |
| Batch/sequential | Moderate; prefetch handles throughput; 80-90% hit ratio acceptable |
| Temporary work | Based on concurrent sort/temp usage patterns |
| Catalog (BP0) | At least 10,000-20,000 pages; more if you have many packages/plans |

The practical approach to sizing VPSIZE: start with the total pages of the table spaces assigned to the pool (query SYSIBM.SYSTABLESPACE for NACTIVE or the VSAM catalog for high-allocated). If the working set is smaller than the total data size (it usually is), use monitoring data to determine the active page count.

VPSEQT (Virtual Pool Sequential Steal Threshold)

This parameter controls what percentage of the buffer pool can be consumed by sequentially accessed pages. It is perhaps the most underappreciated parameter in z/OS buffer pool tuning.

-ALTER BUFFERPOOL(BP1) VPSEQT(80)

A value of 80 means that sequentially accessed pages can occupy at most 80% of the buffer pool. Once this threshold is reached, DB2 will steal (evict) sequential pages preferentially, protecting the randomly accessed pages that remain.

Why this matters: Without VPSEQT protection, a single large table scan can flush every carefully cached OLTP page out of the buffer pool. Consider a buffer pool with 100,000 pages holding the hot working set of your OLTP application. A batch job starts scanning a 500,000-page table. With a default VPSEQT of 80, the scan can consume 80,000 of those pages — displacing 80% of the OLTP cache. Your OLTP response times will spike until the cache rebuilds, which may take 15-30 minutes of normal OLTP activity.

Tuning VPSEQT:

| Pool Purpose | VPSEQT Setting | Rationale |
|---|---|---|
| Dedicated random OLTP | 0-10 | Sequential pages used once and discarded |
| Dedicated index pool | 0-5 | Indexes are almost always random access |
| Mixed OLTP + light reporting | 20-40 | Allow some sequential but protect random |
| Dedicated sequential batch | 80-100 | Most access is sequential |
| Dedicated temp/sort | 80-95 | Sorts are sequential |

Setting VPSEQT to 0 means no sequential pages will be retained in the pool after they are used — they are immediately eligible for steal. This is aggressive but effective for protecting pools dedicated to random OLTP workloads.
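A back-of-the-envelope simulation shows what VPSEQT buys you. This toy Python model (simplified steal logic, using the numbers from the scenario above) tracks how many random pages survive a large scan:

```python
def random_pages_surviving(capacity=100_000, vpseqt=10, scan_pages=500_000):
    """Toy model: pool starts full of random OLTP pages; the scan may
    occupy at most vpseqt% of the pool before DB2 steals sequential
    pages instead of random ones."""
    seq_cap = capacity * vpseqt // 100
    random_in_pool = capacity
    seq_in_pool = 0
    for _ in range(scan_pages):
        if seq_in_pool < seq_cap:
            random_in_pool -= 1   # a random page is stolen for the scan
            seq_in_pool += 1
        # otherwise a sequential page is recycled; random pages survive
    return random_in_pool

print(random_pages_surviving(vpseqt=80))  # 20,000 random pages survive
print(random_pages_surviving(vpseqt=10))  # 90,000 random pages survive
```

With the default-like 80 the scan displaces 80% of the OLTP cache, exactly as in the earlier scenario; with VPSEQT at 10 the damage is capped at 10%.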

DWQT (Deferred Write Queue Threshold)

This parameter controls when DB2 starts asynchronously writing dirty pages to disk. It is expressed as a percentage of the buffer pool.

-ALTER BUFFERPOOL(BP1) DWQT(50)

When the percentage of dirty (updated but not yet written) pages in the buffer pool reaches DWQT, DB2 triggers deferred write processing to flush pages to disk and free up space on the LRU chain.

Why it matters: If dirty pages are not written to disk fast enough, the LRU chain runs out of clean (stealable) pages. When that happens, DB2 must perform a synchronous write — the application thread is suspended while the dirty page is written to disk. Synchronous writes are a serious performance problem because they add disk write latency directly to the application's response time.

Guidelines:

| Scenario | DWQT | Rationale |
|---|---|---|
| Read-heavy pool | 40-50% | Fewer dirty pages, less urgency |
| Write-heavy pool | 25-35% | Flush dirty pages earlier to prevent sync writes |
| Very large pool (>1M pages) | 30-40% | Even a small percentage means many pages to write |

VDWQT (Vertical Deferred Write Queue Threshold)

While DWQT is a pool-level threshold (how much of the entire pool is dirty), VDWQT triggers deferred writes when the dirty pages belonging to any single data set (the VSAM data set backing a table space or index) exceed the specified percentage of the buffer pool.

-ALTER BUFFERPOOL(BP1) VDWQT(10)

This prevents a single hot table space from monopolizing the buffer pool's write capacity. If one table space is being hammered with updates, VDWQT ensures its dirty pages get flushed before they crowd out other table spaces.

Typical value: 10% (the default). Adjust upward to 15-20% for pools dedicated to a single hot table space.
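The interplay of the two write thresholds can be sketched as a simple check (illustrative Python; real deferred write scheduling is asynchronous and considerably more involved, and the data set names here are invented):

```python
def deferred_write_triggered(dirty_total, dirty_by_dataset, pool_size,
                             dwqt=50, vdwqt=10):
    """Toy check of the two deferred-write thresholds.

    dirty_total:      dirty pages in the whole pool
    dirty_by_dataset: {dataset_name: dirty page count}
    Returns which threshold (if any) fires.
    """
    if dirty_total * 100 >= pool_size * dwqt:
        return "DWQT"                 # pool-level threshold reached
    for ds, n in dirty_by_dataset.items():
        if n * 100 >= pool_size * vdwqt:
            return f"VDWQT:{ds}"      # one data set dominates the dirty pages
    return None

print(deferred_write_triggered(30_000, {"TS1": 4_000}, 100_000))   # None
print(deferred_write_triggered(30_000, {"TS1": 12_000}, 100_000))  # VDWQT:TS1
print(deferred_write_triggered(60_000, {"TS1": 12_000}, 100_000))  # DWQT
```

Because VDWQT is the lower threshold, a single hot table space typically triggers writes long before the pool as a whole reaches DWQT — which is exactly the intended behavior.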

PGSTEAL (Page Steal Algorithm)

Controls the algorithm used to select pages for eviction:

| Value | Algorithm | Best For |
|---|---|---|
| LRU | Least Recently Used | Most workloads (default) |
| FIFO | First In, First Out | Sequential/temp workloads where recency is irrelevant |
| NONE | No page stealing | Small reference tables that must always be in memory |

-ALTER BUFFERPOOL(BP2) PGSTEAL(LRU)

PGSTEAL(NONE) deserves special attention. When set to NONE, no pages are ever evicted from the pool. This means the pool must be large enough to hold every page of every table space assigned to it. Use this for small, frequently accessed reference tables (branch codes, currency codes, product codes) where you want to guarantee zero physical I/O. But be cautious — if the data grows beyond the pool size, DB2 is forced to steal pages anyway (recent releases spill into an overflow area), and you lose the zero-I/O guarantee.

PGFIX (Page Fix)

When set to YES, DB2 page-fixes the buffer pool pages in real (central) storage, preventing the z/OS operating system from paging them out to auxiliary storage (disk-based paging).

-ALTER BUFFERPOOL(BP1) PGFIX(YES)

Why this matters: Without PGFIX(YES), z/OS may page buffer pool memory to auxiliary storage during memory pressure. When DB2 tries to access a page it believes is in the buffer pool, z/OS must first page it back into real memory — this is called double paging or a page fault. The cost: DB2 thinks the page is cached (no physical read counter), but the access takes milliseconds instead of microseconds. This makes the problem invisible to DB2 monitoring — hit ratios look fine, but response times are terrible.

Recommendation: For production systems with sufficient real memory, always use PGFIX(YES). The performance impact of double paging is severe and nearly impossible to diagnose from DB2 statistics alone.

25.2.3 Comprehensive z/OS Buffer Pool Configuration Example

Here is a realistic buffer pool configuration for a production z/OS DB2 subsystem with 64 GB allocated to DB2:

-- BP0: Catalog and directory (always active, system objects only)
-ALTER BUFFERPOOL(BP0)
  VPSIZE(20000)      -- ~78 MB for catalog
  VPSEQT(50)         -- Some BIND/PREPARE scans are sequential
  DWQT(40)
  VDWQT(10)
  PGSTEAL(LRU)
  PGFIX(YES)

-- BP1: Hot OLTP data tables (ACCOUNTS, CUSTOMERS, BALANCES)
-ALTER BUFFERPOOL(BP1)
  VPSIZE(500000)     -- ~1.95 GB
  VPSEQT(10)         -- Protect random pages aggressively
  DWQT(35)
  VDWQT(10)
  PGSTEAL(LRU)
  PGFIX(YES)

-- BP2: OLTP indexes
-ALTER BUFFERPOOL(BP2)
  VPSIZE(250000)     -- ~977 MB
  VPSEQT(5)          -- Almost all access is random
  DWQT(30)
  VDWQT(10)
  PGSTEAL(LRU)
  PGFIX(YES)

-- BP3: Transaction history (mixed random + sequential)
-ALTER BUFFERPOOL(BP3)
  VPSIZE(400000)     -- ~1.56 GB
  VPSEQT(40)         -- Allow batch scans but protect OLTP inserts
  DWQT(35)
  VDWQT(15)
  PGSTEAL(LRU)
  PGFIX(YES)

-- BP4: Batch/sequential data (audit logs, reports)
-ALTER BUFFERPOOL(BP4)
  VPSIZE(200000)     -- ~781 MB
  VPSEQT(90)         -- Mostly sequential
  DWQT(45)
  VDWQT(20)
  PGSTEAL(FIFO)      -- Pages used once
  PGFIX(YES)

-- BP5: Reference/lookup tables (branch, currency, product codes)
-ALTER BUFFERPOOL(BP5)
  VPSIZE(5000)       -- ~20 MB (entire tables fit)
  VPSEQT(0)
  DWQT(50)
  PGSTEAL(NONE)      -- Never evict — always in memory
  PGFIX(YES)

-- BP6: Temporary/work file tables
-ALTER BUFFERPOOL(BP6)
  VPSIZE(150000)     -- ~586 MB
  VPSEQT(90)
  DWQT(50)
  VDWQT(20)
  PGSTEAL(FIFO)
  PGFIX(YES)

-- BP32K0: 32 KB page temp for large sorts
-ALTER BUFFERPOOL(BP32K0)
  VPSIZE(10000)      -- ~313 MB (32 KB pages)
  VPSEQT(90)
  DWQT(40)
  PGSTEAL(FIFO)
  PGFIX(YES)

25.3 Hiperpool Concepts on z/OS

25.3.1 What Was Hiperpool?

Hiperpool (High-Performance Pool) was a secondary caching layer available on older z/OS systems that used Expanded Storage — a hardware feature of S/390 and early zSeries processors. Expanded Storage was faster than disk but slower than central (real) storage. It was accessed in 4 KB pages through special hardware instructions (PAGEIN/PAGEOUT).

Hiperpool acted as a secondary cache between the virtual buffer pool and disk:

Application Thread
    |
    v
Virtual Buffer Pool (Central Storage) -- fastest
    |  (page stolen)
    v
Hiperpool (Expanded Storage) -- fast
    |  (not found / castout)
    v
DASD (Disk) -- slow

When a page was stolen from the virtual buffer pool, instead of being discarded, it could be cached in the hiperpool. If the page was needed again, reading it from hiperpool was significantly faster than reading from disk (microseconds vs. milliseconds), though slower than reading from the virtual buffer pool.

25.3.2 The Decline of Hiperpool

On modern z/OS systems, Expanded Storage no longer exists as a separate hardware tier — it was eliminated with the move to 64-bit z/Architecture. Modern mainframes have large amounts of real (central) memory — 256 GB, 512 GB, or even multiple terabytes. IBM removed hiperpool functionality in DB2 Version 8 for z/OS, when buffer pools moved above the 2 GB bar into 64-bit storage, for several reasons:

  1. No hardware distinction: Central and expanded storage are now the same physical memory. The hiperpool layer adds software overhead without a hardware benefit.
  2. CPU cost: Moving pages between virtual pool and hiperpool consumes CPU cycles for no benefit when both are backed by the same physical memory.
  3. Larger virtual pools: Modern memory sizes make it practical to allocate very large virtual buffer pools that subsume the role hiperpool used to play.

The recommended approach on modern z/OS:

  1. Disable hiperpool by setting HPSIZE to 0 for all buffer pools.
  2. Increase VPSIZE to absorb the workload hiperpool was handling.
  3. Use PGFIX(YES) to ensure buffer pool pages stay in real memory.

-- Modern configuration: hiperpool disabled
-ALTER BUFFERPOOL(BP1)
  VPSIZE(500000)
  HPSIZE(0)          -- Hiperpool disabled
  PGFIX(YES)

25.3.3 When You Might Still See Hiperpool

If you are working with an older z/OS system or inheriting a legacy configuration, you may encounter hiperpool parameters:

  • HPSIZE: Number of pages in the hiperpool (0 = disabled)
  • CASTOUT: Whether hiperpool pages are written to disk when stolen (YES/NO). YES provides persistence; NO treats hiperpool as pure cache.
  • HITEFAC: A threshold factor controlling when pages move between virtual pool and hiperpool

Migration path: Increase VPSIZE by the amount of HPSIZE, set HPSIZE to 0, monitor hit ratios for a week. In virtually all cases, the larger virtual pool with PGFIX(YES) will match or outperform the old hiperpool configuration — with less CPU overhead.


25.4 Buffer Pool Management on LUW

25.4.1 Creating Buffer Pools

On DB2 LUW, buffer pools are created explicitly with the CREATE BUFFERPOOL statement:

CREATE BUFFERPOOL bp_oltp_data
  SIZE 50000          -- 50,000 pages
  PAGESIZE 8192;      -- 8 KB pages

The SIZE parameter specifies the number of pages. The total memory consumed is SIZE x PAGESIZE. In this example: 50,000 x 8,192 = approximately 390 MB.

Page size options on LUW: 4096, 8192, 16384, 32768 bytes.

Unlike z/OS where you have predefined pool names, LUW lets you name pools with meaningful identifiers. Use this to your advantage — bp_oltp_data is far more informative than BP1 when someone reads your configuration two years from now.

25.4.2 Altering Buffer Pools

You can change the size of a buffer pool dynamically, without stopping the database:

ALTER BUFFERPOOL bp_oltp_data
  SIZE 100000;        -- Increase to 100,000 pages

On DB2 LUW 10.1 and later, size changes take effect immediately. DB2 gradually allocates (or releases) memory as needed. There is no outage and no disruption to running applications.

To let DB2 manage the size automatically:

ALTER BUFFERPOOL bp_oltp_data
  SIZE AUTOMATIC;

When set to AUTOMATIC, DB2's Self-Tuning Memory Manager (STMM) adjusts the buffer pool size based on workload demands. We will discuss STMM in detail in Section 25.7.

25.4.3 The IBMDEFAULTBP Buffer Pool

Every DB2 LUW database has a default buffer pool called IBMDEFAULTBP. It is created automatically during database creation with a 4 KB page size. Any table space not explicitly assigned to a buffer pool uses IBMDEFAULTBP.

Important: Do not leave everything in IBMDEFAULTBP. Just as on z/OS, putting all objects into a single pool creates contention between workloads. IBMDEFAULTBP should primarily hold the system catalog tables and small miscellaneous objects. Everything else should be in purpose-built pools.

On a new database, a common first step is:

-- Shrink default pool to hold just catalog
ALTER BUFFERPOOL IBMDEFAULTBP SIZE 25000;  -- ~100 MB for catalog at 4 KB

-- Create purpose-built pools for application data
CREATE BUFFERPOOL bp_oltp_data SIZE 500000 PAGESIZE 8192;
CREATE BUFFERPOOL bp_oltp_idx  SIZE 250000 PAGESIZE 8192;
CREATE BUFFERPOOL bp_temp      SIZE 100000 PAGESIZE 32768;

25.4.4 NPAGES and Sizing

The NPAGES parameter (synonymous with SIZE in newer versions) controls the buffer pool size in pages. Here are practical guidelines for distributing memory:

| Buffer Pool Purpose | Share of DATABASE_MEMORY |
|---|---|
| Critical OLTP data | 40-60% |
| Indexes | 15-25% |
| Temporary table spaces | 10-15% |
| Default (catalog, misc) | 5-10% |
| Large table scans / batch | 5-10% |

For a system with 32 GB allocated to DATABASE_MEMORY using 8 KB pages:

-- Approximate total pages available for buffer pools: ~3,200,000
-- (Not all DATABASE_MEMORY goes to buffer pools; subtract heaps, cache, etc.)

CREATE BUFFERPOOL bp_oltp_data
  SIZE 1500000 PAGESIZE 8192;     -- ~11.4 GB (47%)

CREATE BUFFERPOOL bp_indexes
  SIZE 600000 PAGESIZE 8192;      -- ~4.6 GB (19%)

CREATE BUFFERPOOL bp_temp
  SIZE 300000 PAGESIZE 32768;     -- ~9.2 GB at 32K pages

CREATE BUFFERPOOL bp_batch
  SIZE 200000 PAGESIZE 8192;      -- ~1.5 GB (6%)

-- IBMDEFAULTBP for catalog: keep small
ALTER BUFFERPOOL IBMDEFAULTBP SIZE 50000;  -- ~200 MB

25.4.5 Block-Based Buffer Pools

DB2 LUW supports block-based areas within buffer pools to improve sequential prefetch efficiency:

CREATE BUFFERPOOL bp_seq_data
  SIZE 100000
  PAGESIZE 8192
  NUMBLOCKPAGES 30000     -- Pages reserved for block I/O
  BLOCKSIZE 32;           -- 32 pages per block

Without block-based areas, prefetched pages are scattered throughout the buffer pool wherever free slots are found. With block-based areas, DB2 allocates contiguous blocks of memory for prefetched pages, which can improve performance for sequential access patterns because the prefetch I/O can be completed in larger, more efficient units.

Guidelines for block-based areas:

| Parameter | Setting | Notes |
|---|---|---|
| NUMBLOCKPAGES | 20-30% of SIZE | Reserve this portion for block I/O |
| BLOCKSIZE | Match EXTENTSIZE | Align with table space extent size for best results |

Use block-based areas for buffer pools that serve sequential workloads (batch processing, reporting, data warehouse scans). Do not use them for OLTP pools where access is primarily random — the block reservation wastes memory that would be better used for random page caching.


25.5 Buffer Pool Assignment Strategy

Proper buffer pool assignment is one of the most impactful performance decisions a DBA makes. The goal is to separate workloads so they do not interfere with each other and to give each workload the memory profile it needs.

25.5.1 The Separation Principle

The fundamental principle is: separate things that behave differently. Specifically:

  1. Separate data from indexes. Indexes are almost always accessed randomly (non-leaf pages via probe, leaf pages via scan within a narrow range). Data pages have more varied access patterns. Index non-leaf pages have extremely high reuse — the root page and first-level pages of a busy index may be accessed millions of times per hour.

  2. Separate random access from sequential access. A large sequential scan can flush all the carefully cached random-access pages out of a shared buffer pool. This is the most common cause of sudden OLTP response time spikes — a batch job starts scanning a large table and destroys the OLTP cache.

  3. Separate hot tables from cold tables. A small number of frequently accessed tables typically account for 80-90% of all I/O. Give them dedicated memory so they are not competing with rarely accessed tables.

  4. Separate temporary work from persistent data. Temporary table spaces have different access patterns (write-heavy, short-lived, sequential) and lifecycle characteristics (data is created, used once, and discarded).

  5. Separate the catalog. The system catalog is accessed during statement preparation, BIND operations, and DDL. It has its own access pattern and should not compete with application data.

25.5.2 A Practical Assignment Strategy

Here is a buffer pool assignment strategy that works well for most production systems. This is not a one-size-fits-all template — adjust based on your workload analysis — but it provides a solid starting point:

Pool 1: Catalog and System Objects
  • Assign: DB2 catalog, directory, system table spaces
  • Size: Moderate (enough for 95%+ hit ratio on catalog operations)
  • Character: Mostly random reads during PREPARE; sequential during BIND/RUNSTATS

Pool 2: Hot OLTP Data
  • Assign: The 10-20 most frequently accessed transaction tables
  • Size: Large — this is where you spend the lion's share of your memory budget
  • Character: Random read/write, high concurrency, latency-sensitive
  • VPSEQT: Low (5-20%) to prevent sequential eviction

Pool 3: OLTP Indexes
  • Assign: Indexes on the hot OLTP tables
  • Size: Large enough to hold all non-leaf pages plus frequently accessed leaf pages
  • Character: Random reads, very high reuse on upper levels
  • VPSEQT: Very low (0-10%)

Pool 4: Batch/Sequential Data
  • Assign: Tables primarily accessed by batch jobs, reports, large scans
  • Size: Moderate (sequential prefetch handles throughput; pages are used once)
  • Character: Sequential read, occasional bulk write
  • VPSEQT: High (80-100%)

Pool 5: Temporary Work
  • Assign: Temporary table spaces (sorts, intermediate results, declared GTTs)
  • Size: Based on concurrent sort activity and temp table usage
  • Character: Short-lived, sequential, write-then-read

Pool 6: Reference/Lookup Data
  • Assign: Small, frequently read reference tables (branch codes, currency codes)
  • Size: Small but sufficient to hold entire tables in memory permanently
  • Character: Read-only or read-mostly, random access, zero physical I/O tolerance
  • PGSTEAL: NONE on z/OS; fixed large enough on LUW

25.5.3 Identifying Hot Tables

Before you can assign tables to pools, you need to know which tables are hot. Examine the I/O statistics:

On z/OS (from the statistics trace or SYSIBM.SYSTABLESPACESTATS):

SELECT DBNAME, TSNAME, NACTIVE, GETPAGE, PHYSREAD,
       CASE WHEN GETPAGE > 0
            THEN DECIMAL((1.0 - FLOAT(PHYSREAD)/FLOAT(GETPAGE)) * 100, 5, 2)
            ELSE 0 END AS HIT_RATIO
FROM SYSIBM.SYSTABLESPACESTATS
ORDER BY GETPAGE DESC
FETCH FIRST 20 ROWS ONLY;

On LUW (from the monitoring table functions):

SELECT SUBSTR(TBSP_NAME, 1, 30) AS TBSP_NAME,
       POOL_DATA_L_READS + POOL_INDEX_L_READS AS LOGICAL_READS,
       POOL_DATA_P_READS + POOL_INDEX_P_READS AS PHYSICAL_READS,
       CASE WHEN (POOL_DATA_L_READS + POOL_INDEX_L_READS) > 0
            THEN DECIMAL(
              (1.0 - FLOAT(POOL_DATA_P_READS + POOL_INDEX_P_READS) /
                     FLOAT(POOL_DATA_L_READS + POOL_INDEX_L_READS)) * 100, 5, 2)
            ELSE 0 END AS HIT_RATIO
FROM TABLE(MON_GET_TABLESPACE(NULL, -2)) AS T
ORDER BY POOL_DATA_L_READS + POOL_INDEX_L_READS DESC
FETCH FIRST 20 ROWS ONLY;

The tables at the top of the GETPAGE / logical reads list are your candidates for dedicated buffer pools. In most systems, you will find that 5-10 table spaces account for 70-80% of all logical reads.

25.5.4 Moving Table Spaces Between Buffer Pools

On z/OS:

ALTER TABLESPACE dbname.tsname
  BUFFERPOOL BP2;

This takes effect immediately for new page requests. Pages already cached in the old buffer pool are managed normally and will be stolen as needed. There is no outage.

On LUW:

ALTER TABLESPACE ts_oltp_data
  BUFFERPOOL bp_oltp_data;

Similarly immediate on LUW. The table space begins using the new buffer pool for all subsequent page requests.


25.6 EDM Pool, RID Pool, and Sort Pool on z/OS

Buffer pools are not the only memory areas that matter on z/OS. Three other pools are critical for performance, and neglecting them can undermine even the best buffer pool configuration.

25.6.1 EDM Pool (Environmental Descriptor Manager Pool)

The EDM pool caches several categories of metadata and compiled SQL artifacts:

  • CT (Cursor Table) and PT (Package Table) entries: The compiled, executable form of SQL statements from bound packages and plans
  • DBDs (Database Descriptors): Metadata about databases, table spaces, tables, indexes, and their physical structure
  • Dynamic Statement Cache entries: Prepared forms of dynamic SQL statements (the EDMSTMTC portion)
  • Authorization cache entries: Cached authorization checks

Why it matters: When a SQL statement is executed, DB2 must locate its compiled form (CT/PT), load the relevant DBDs for the tables it references, and verify authorization. If any of these are not in the EDM pool, DB2 must read them from the directory or catalog — a costly operation that adds directly to statement execution time.

Key configuration parameters (DSNZPARM or zparm):

| Parameter | Purpose | Starting Point |
|---|---|---|
| EDM_SKELETON_POOL (or EDMSKLP) | CT/PT skeleton storage | 50-100 MB |
| EDMDBDC | Database descriptor cache | 20-50 MB |
| EDMSTMTC | Dynamic statement cache | 100-500 MB |

The dynamic statement cache (EDMSTMTC) deserves special attention. Modern applications — especially Java and web applications using JDBC — rely heavily on dynamic SQL with parameter markers. The dynamic statement cache allows DB2 to reuse the compiled form of these statements across multiple executions and even across multiple threads. A larger EDMSTMTC means more statements can be cached, reducing the CPU cost of repeated PREPARE operations.
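The caching effect of parameter markers can be sketched in a few lines of Python — an illustrative model of a statement cache keyed on SQL text, not DB2's actual cache implementation:

```python
# Illustrative sketch: a statement cache keyed on SQL text. Literal SQL
# produces a new cache entry per value; parameter markers reuse one entry.
class StatementCache:
    def __init__(self):
        self.cache = {}          # SQL text -> "compiled" form
        self.prepares = 0        # full PREPAREs performed (cache misses)

    def execute(self, sql, params=()):
        if sql not in self.cache:
            self.prepares += 1   # cache miss: pay the PREPARE cost
            self.cache[sql] = f"compiled<{sql}>"
        return self.cache[sql]   # cache hit: reuse the compiled form

literal = StatementCache()
for acct in range(1000):
    literal.execute(f"SELECT * FROM accounts WHERE id = {acct}")

marker = StatementCache()
for acct in range(1000):
    marker.execute("SELECT * FROM accounts WHERE id = ?", (acct,))

print(literal.prepares)  # 1000 -- every statement is unique, cache useless
print(marker.prepares)   # 1    -- one compiled form reused 1000 times
```

This is why "review application for literal SQL" is the standard remedy for a low dynamic statement cache hit ratio: no amount of EDMSTMTC can help when every statement text is unique.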

Monitoring the EDM pool:

Watch for these indicators in the statistics trace (IFCID 0001/0002) or in tools like OMEGAMON:

| Symptom | Meaning | Action |
|---|---|---|
| High CT/PT not-in-pool rate | EDM pool too small for skeleton storage | Increase EDM_SKELETON_POOL |
| High DBD not-in-pool rate | Insufficient DBD cache | Increase EDMDBDC |
| Low dynamic statement cache hit ratio (<85%) | Too many unique statements or cache too small | Increase EDMSTMTC; review application for literal SQL |
| EDM pool full conditions | Critical — DB2 is discarding cached objects | Increase all EDM parameters |

25.6.2 RID Pool (Record Identifier Pool)

The RID pool is used during list prefetch and multi-index access operations. When DB2 decides to use an index to identify qualifying rows and then access the data pages in physical order (rather than index order), it stores the Record Identifiers (RIDs) in the RID pool, sorts them by page number, and then reads the data pages sequentially. This converts random I/O into sequential I/O — a significant optimization.

Multi-index access (index ANDing and ORing) also uses the RID pool. When two indexes are used to satisfy a query (e.g., one on CITY and one on STATE), DB2 stores the RID lists from each index in the RID pool and performs set intersection (AND) or union (OR).

Why it matters: If the RID pool is too small, DB2 cannot build the RID list and must fall back to a table space scan. This is called a RID failure or RID FAIL. The query still runs, but the access path changes from an efficient index-driven list prefetch to a full scan — potentially reading millions of pages instead of thousands.

The insidious part: RID failures are silent. The query returns correct results. There is no error. The only symptom is that the query takes 10x or 100x longer than expected. You must monitor for RID failures proactively.

Key parameter:

  • MAXRBLK: Maximum number of 4 KB blocks for the RID pool. Total size = MAXRBLK x 4 KB.

Sizing guidelines:

| Workload | MAXRBLK | Total Size |
|---|---|---|
| Light OLTP, few list prefetches | 5,000-10,000 | 20-40 MB |
| Medium mixed workload | 10,000-25,000 | 40-100 MB |
| Heavy reporting with multi-index access | 25,000-100,000 | 100-400 MB |

Monitor the RID pool failure count (QXRIDPFL in the accounting trace) and the high-water mark of RID pool usage. If you see any RID failures, increase MAXRBLK immediately — each failure represents a query that ran orders of magnitude slower than it should have.

25.6.3 Sort Pool

The sort pool provides memory for in-memory sort operations. When DB2 must sort data — for ORDER BY, GROUP BY, DISTINCT, merge joins, UNION, or index creation — it uses memory from the sort pool.

Key parameter:

  • SRTPOOL: Size of the sort pool in KB. This is the total memory available for all concurrent sort operations in the DB2 subsystem.

How sorting works on z/OS:

  1. DB2 allocates a portion of the sort pool for the sort operation.
  2. If the data to be sorted fits in the allocated memory, the sort completes entirely in memory (an in-memory sort). This is fast.
  3. If the data exceeds the available memory, DB2 spills intermediate sort results to a work file (a table space defined for temporary use). This is a sort overflow or work file sort. It is significantly slower because it involves writing to and reading from disk.
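The in-memory versus work-file decision can be sketched as an external merge sort — a simplified model; DB2's actual sort and work-file management are considerably more elaborate:

```python
# Simplified model of the sort decision: if the rows fit in sort memory,
# sort in place; otherwise sort memory-sized runs ("work files") and merge.
import heapq

def db2_style_sort(rows, sort_mem_rows):
    if len(rows) <= sort_mem_rows:
        return sorted(rows), 0          # in-memory sort, no overflow
    # overflow: sort each memory-sized run, then merge the runs
    runs = [sorted(rows[i:i + sort_mem_rows])
            for i in range(0, len(rows), sort_mem_rows)]
    return list(heapq.merge(*runs)), len(runs)

data = [7, 3, 9, 1, 8, 2, 6, 4, 5, 0]

in_mem, spills = db2_style_sort(data, sort_mem_rows=100)
print(spills)                    # 0 -- fits in sort memory

spilled, runs = db2_style_sort(data, sort_mem_rows=4)
print(runs)                      # 3 -- three work-file runs were merged
print(spilled == sorted(data))   # True -- same result, more I/O
```

The result is identical either way; the overflow path simply trades memory for (much slower) work-file I/O, which is why sort overflows are worth monitoring.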

Sizing guidelines:

| Workload | SRTPOOL | Notes |
|---|---|---|
| Light OLTP (few sorts) | 2-10 MB | Most OLTP should avoid sorts via indexes |
| Medium mixed | 10-40 MB | Some GROUP BY and ORDER BY in reports |
| Heavy reporting/analytics | 40-200 MB | Complex queries with multiple sorts |

Monitor sort overflows. A sort overflow rate above 2-3% indicates the sort pool is too small — or that queries are sorting unnecessarily large result sets (consider adding indexes to avoid the sort).


25.7 Memory Architecture on LUW

25.7.1 The Memory Hierarchy

DB2 LUW organizes memory into a clear hierarchy:

Operating System Memory (Physical RAM)
  |
  +-- INSTANCE_MEMORY (total ceiling for the DB2 instance)
       |
       +-- DATABASE_MEMORY (total for each database)
       |    |
       |    +-- Buffer Pools (bp_data, bp_index, bp_temp, ...)
       |    +-- Lock List (LOCKLIST)
       |    +-- Package Cache (PCKCACHESZ)
       |    +-- Catalog Cache (CATALOGCACHE_SZ)
       |    +-- Shared Sort Heap (SHEAPTHRES_SHR)
       |    +-- Utility Heap (UTIL_HEAP_SZ)
       |    +-- Database Heap (DBHEAP)
       |    +-- Other heaps
       |
       +-- FCM (Fast Communication Manager) buffers
       +-- Monitor Heap (MON_HEAP_SZ)
       +-- Audit Buffer
       +-- Agent Private Memory (per-agent sort heap, stmt heap, etc.)

Understanding this hierarchy is essential. Each level sets a ceiling for the level below it. If INSTANCE_MEMORY is 32 GB, then the sum of all DATABASE_MEMORY allocations plus instance-level consumers cannot exceed 32 GB.
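The budgeting rule can be expressed as a quick sanity check. This is illustrative Python, not a DB2 utility; the database names and the overhead figure are assumptions:

```python
# Sanity check for the memory hierarchy: all DATABASE_MEMORY allocations,
# plus instance-level consumers (FCM, monitor heap, agent memory), must
# fit under INSTANCE_MEMORY. Figures in GB for readability.

def check_budget(instance_gb, database_allocs_gb, instance_overhead_gb):
    used = sum(database_allocs_gb.values()) + instance_overhead_gb
    return used <= instance_gb, used

ok, used = check_budget(
    instance_gb=32,
    database_allocs_gb={"oltp_db": 20, "reporting_db": 8},  # assumed names
    instance_overhead_gb=3,   # FCM, monitor heap, agents (assumed estimate)
)
print(ok, used)   # True 31 -- fits under the 32 GB ceiling

ok, used = check_budget(32, {"oltp_db": 24, "reporting_db": 8}, 3)
print(ok, used)   # False 35 -- overcommitted by 3 GB
```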

25.7.2 INSTANCE_MEMORY

This parameter sets the upper limit of memory that the entire DB2 instance can consume. It includes all databases, FCM buffers, and instance-level memory consumers.

-- View current setting
db2 GET DBM CFG | grep -i instance_memory

-- Set explicitly (value in 4 KB pages)
db2 UPDATE DBM CFG USING INSTANCE_MEMORY 8388608  -- 8,388,608 x 4 KB = 32 GB

-- Let DB2 manage it
db2 UPDATE DBM CFG USING INSTANCE_MEMORY AUTOMATIC

Guideline: On a dedicated database server, set INSTANCE_MEMORY to 70-80% of total physical RAM, leaving the rest for the OS kernel, filesystem cache, and other processes. On a shared server, calculate DB2's fair share and set it explicitly.

Example: 64 GB server, dedicated to DB2:

INSTANCE_MEMORY = 64 GB * 0.75 = 48 GB = 12,582,912 pages (at 4 KB)
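Because these parameters are expressed in 4 KB pages, a small converter helps avoid arithmetic slips. An illustrative helper, not a DB2 tool:

```python
# Convert between GB and the 4 KB pages used by INSTANCE_MEMORY and
# DATABASE_MEMORY (1 GB = 1024 * 1024 KB).
PAGE_KB = 4

def gb_to_pages(gb, page_kb=PAGE_KB):
    return gb * 1024 * 1024 // page_kb   # GB -> KB -> pages

def pages_to_gb(pages, page_kb=PAGE_KB):
    return pages * page_kb / (1024 * 1024)

print(gb_to_pages(48))       # 12582912 -- the INSTANCE_MEMORY example above
print(gb_to_pages(32))       # 8388608
print(pages_to_gb(6291456))  # 24.0
```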

25.7.3 DATABASE_MEMORY

This parameter limits the total memory for a specific database within the instance. It encompasses buffer pools, the lock list, package cache, sort heaps (shared threshold), and other database-level memory consumers.

-- View current setting
db2 GET DB CFG FOR mydb | grep -i database_memory

-- Set explicitly (in 4 KB pages)
db2 UPDATE DB CFG FOR mydb USING DATABASE_MEMORY 6291456  -- ~24 GB

-- Let DB2 manage it (recommended for most cases)
db2 UPDATE DB CFG FOR mydb USING DATABASE_MEMORY AUTOMATIC

Key relationship: DATABASE_MEMORY must fit within INSTANCE_MEMORY. If you have multiple databases in one instance, their combined DATABASE_MEMORY allocations (plus instance overhead) must not exceed INSTANCE_MEMORY.

25.7.4 Self-Tuning Memory Manager (STMM)

STMM is DB2 LUW's automatic memory tuning feature. When enabled, STMM monitors workload patterns and dynamically redistributes memory among consumers to optimize overall throughput.

Enabling STMM:

-- Enable the self-tuning feature
db2 UPDATE DB CFG FOR mydb USING SELF_TUNING_MEM ON

-- Enable automatic management for specific consumers
db2 UPDATE DB CFG FOR mydb USING DATABASE_MEMORY AUTOMATIC
db2 UPDATE DB CFG FOR mydb USING LOCKLIST AUTOMATIC
db2 UPDATE DB CFG FOR mydb USING PCKCACHESZ AUTOMATIC
db2 UPDATE DB CFG FOR mydb USING SORTHEAP AUTOMATIC

-- Enable for buffer pools
ALTER BUFFERPOOL bp_data SIZE AUTOMATIC;
ALTER BUFFERPOOL bp_index SIZE AUTOMATIC;

How STMM works:

  1. STMM runs periodically (typically every few minutes, adjustable).
  2. It evaluates each memory consumer's benefit function — an estimate of how much performance would improve if the consumer received more memory. For buffer pools, this is based on the predicted reduction in physical I/O. For sort heap, it is based on the predicted reduction in sort overflows.
  3. It transfers memory from low-benefit consumers to high-benefit consumers.
  4. It respects the DATABASE_MEMORY cap as the total available for redistribution.
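One rebalancing step can be sketched as a greedy transfer from the lowest-benefit consumer to the highest. This is a deliberately simplified model — STMM's real benefit estimation and pacing are far more sophisticated — and the consumer names and benefit values are invented:

```python
# Sketch of one STMM-style rebalancing step: move a chunk of memory from
# the consumer with the lowest estimated benefit to the one with the
# highest, keeping the total (the DATABASE_MEMORY cap) constant.

def stmm_step(sizes_mb, benefit_per_mb, chunk_mb=64):
    donor = min(sizes_mb, key=lambda c: benefit_per_mb[c])
    recipient = max(sizes_mb, key=lambda c: benefit_per_mb[c])
    if benefit_per_mb[recipient] > benefit_per_mb[donor] and sizes_mb[donor] > chunk_mb:
        sizes_mb[donor] -= chunk_mb
        sizes_mb[recipient] += chunk_mb
    return sizes_mb

sizes = {"bp_data": 4096, "sortheap": 1024, "pkg_cache": 512}
benefit = {"bp_data": 0.9, "sortheap": 0.1, "pkg_cache": 0.4}  # assumed estimates

total_before = sum(sizes.values())
sizes = stmm_step(sizes, benefit)
print(sizes["sortheap"], sizes["bp_data"])  # 960 4160 -- sort heap donated 64 MB
print(sum(sizes.values()) == total_before)  # True -- the cap is respected
```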

When STMM works well:

  • Workloads that change throughout the day (heavy sorts in the morning, heavy OLTP in the afternoon)
  • New systems where the DBA does not yet know the optimal memory distribution
  • Systems where the DBA does not have time for manual tuning

When to be cautious with STMM:

  • Critical OLTP pools where you need a guaranteed minimum. STMM might shrink a pool during a low-activity period, causing a cold cache when activity resumes.
  • Very large systems (hundreds of GB) where STMM's periodic adjustments may cause measurable disruption.

The hybrid approach: Enable STMM globally but set fixed sizes for your most critical pools. This gives STMM flexibility to redistribute among less critical consumers while protecting the pools that matter most:

-- Fixed size for critical OLTP (STMM cannot shrink this)
ALTER BUFFERPOOL bp_oltp_data SIZE 500000;

-- STMM-managed for everything else
ALTER BUFFERPOOL bp_batch SIZE AUTOMATIC;
ALTER BUFFERPOOL bp_temp SIZE AUTOMATIC;

25.7.5 Key Memory Configuration Parameters on LUW

| Parameter | Purpose | Typical Range | STMM? |
|---|---|---|---|
| SORTHEAP | Memory per sort operation (private) | 256-4096 pages | Yes |
| SHEAPTHRES_SHR | Shared sort heap threshold (total for all sorts) | 5-20% of DATABASE_MEMORY | Yes |
| LOCKLIST | Memory for lock management | AUTOMATIC or 10,000-50,000 pages | Yes |
| PCKCACHESZ | Package (plan) cache size | AUTOMATIC or 5,000-100,000 pages | Yes |
| CATALOGCACHE_SZ | Catalog cache size | AUTOMATIC or 1,000-10,000 pages | Yes |
| LOGBUFSZ | Log buffer size | 256-2048 pages | No |
| NUM_IOCLEANERS | Number of page cleaner threads | 1-2 per CPU core | N/A |
| NUM_IOSERVERS | Number of prefetch I/O server threads | 1-2 per physical disk | N/A |

25.8 Monitoring Memory Health

Effective monitoring is the foundation of tuning. You cannot tune what you cannot measure.

25.8.1 Essential Metrics

1. Buffer Pool Hit Ratio

Calculate this separately for data pages and index pages, and for each buffer pool individually. An aggregate hit ratio hides problems.
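A quick numeric illustration of how aggregation masks a sick pool (invented numbers):

```python
# Two pools: a busy OLTP pool at 99% hits and a smaller batch pool at 60%.
# The aggregate ratio looks healthy and hides the problem entirely.

def hit_ratio(logical, physical):
    return (1 - physical / logical) * 100

pools = {
    "bp_oltp":  {"logical": 9_000_000, "physical":  90_000},   # 99% hits
    "bp_batch": {"logical": 1_000_000, "physical": 400_000},   # 60% hits
}

for name, p in pools.items():
    print(name, round(hit_ratio(p["logical"], p["physical"]), 1))
# bp_oltp 99.0
# bp_batch 60.0

total_l = sum(p["logical"] for p in pools.values())
total_p = sum(p["physical"] for p in pools.values())
print(round(hit_ratio(total_l, total_p), 1))  # 95.1 -- looks fine, is not
```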

-- LUW: Per-buffer-pool hit ratios (data and index separately)
SELECT SUBSTR(BP_NAME, 1, 20) AS BP_NAME,
       POOL_DATA_L_READS,
       POOL_DATA_P_READS,
       CASE WHEN POOL_DATA_L_READS > 0
            THEN DECIMAL((1.0 - FLOAT(POOL_DATA_P_READS) /
                  FLOAT(POOL_DATA_L_READS)) * 100, 5, 2)
            ELSE -1 END AS DATA_HIT_RATIO,
       POOL_INDEX_L_READS,
       POOL_INDEX_P_READS,
       CASE WHEN POOL_INDEX_L_READS > 0
            THEN DECIMAL((1.0 - FLOAT(POOL_INDEX_P_READS) /
                  FLOAT(POOL_INDEX_L_READS)) * 100, 5, 2)
            ELSE -1 END AS INDEX_HIT_RATIO
FROM TABLE(MON_GET_BUFFERPOOL(NULL, -2)) AS T
WHERE POOL_DATA_L_READS + POOL_INDEX_L_READS > 0
ORDER BY BP_NAME;

2. Synchronous Read Percentage

A high percentage of synchronous (non-prefetch) physical reads indicates pages that are not in the buffer pool and cannot be anticipated by prefetch. These are the reads that directly impact application response time.

-- LUW: Synchronous vs. asynchronous reads
SELECT SUBSTR(BP_NAME, 1, 20) AS BP_NAME,
       POOL_DATA_P_READS AS TOTAL_DATA_PHYS_READS,
       POOL_ASYNC_DATA_READS AS ASYNC_DATA_READS,
       POOL_DATA_P_READS - POOL_ASYNC_DATA_READS AS SYNC_DATA_READS,
       CASE WHEN POOL_DATA_P_READS > 0
            THEN DECIMAL(
              FLOAT(POOL_DATA_P_READS - POOL_ASYNC_DATA_READS) /
              FLOAT(POOL_DATA_P_READS) * 100, 5, 2)
            ELSE 0 END AS SYNC_READ_PCT
FROM TABLE(MON_GET_BUFFERPOOL(NULL, -2)) AS T
WHERE POOL_DATA_P_READS > 0
ORDER BY SYNC_READ_PCT DESC;

3. Getpages and Physical Reads Over Time (z/OS)

On z/OS, the key counters from the buffer pool statistics (IFCID 0001/0002) are:

| Counter | Meaning |
|---|---|
| GETPAGE | Total logical page requests |
| SYNCIO | Synchronous (random) physical reads — the expensive ones |
| PREFETCH | Asynchronous (sequential) physical reads |
| SYNCWRITE | Synchronous writes — a serious problem indicator |
| DWWRITE | Deferred (asynchronous) writes — normal, expected |

Track these over time. A suddenly increasing SYNCIO count suggests either data growth, a new query, or a change in access patterns. Correlate with workload changes.

4. Page Cleaner Activity (LUW)

-- LUW: Page cleaner triggers and dirty page steals
SELECT SUBSTR(BP_NAME, 1, 20) AS BP_NAME,
       POOL_DRTY_PG_STEAL_CLNS AS DIRTY_STEALS,
       POOL_DRTY_PG_THRSH_CLNS AS THRESHOLD_CLEANS,
       POOL_LSN_GAP_CLNS AS LSN_GAP_CLEANS
FROM TABLE(MON_GET_BUFFERPOOL(NULL, -2)) AS T
WHERE POOL_DRTY_PG_STEAL_CLNS > 0
   OR POOL_DRTY_PG_THRSH_CLNS > 0;

DIRTY_STEALS is the critical metric. A dirty steal means DB2 needed to evict a page, but the only candidates were dirty pages that had not yet been written to disk by the page cleaners. The application thread must wait for the synchronous write. This indicates one or more of: buffer pool too small, not enough page cleaners (NUM_IOCLEANERS), or I/O subsystem too slow to keep up with write demand.
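The dirty-steal mechanism can be modeled with a tiny LRU sketch. This is illustrative only — DB2's page cleaning is asynchronous and far more nuanced — but it shows why a dirty victim at the steal point stalls the reading thread:

```python
# Minimal LRU buffer pool: on a miss with a full pool, steal the
# least-recently-used frame. If that victim is dirty, the reader must
# wait for a synchronous write -- the "dirty steal".
from collections import OrderedDict

def access(pool, capacity, page, dirty, stats, write=False):
    if page not in pool and len(pool) >= capacity:
        victim, _ = pool.popitem(last=False)   # steal the LRU frame
        if victim in dirty:
            stats["dirty_steals"] += 1         # synchronous write stall
            dirty.discard(victim)
    pool[page] = True
    pool.move_to_end(page)                     # mark most recently used
    if write:
        dirty.add(page)

pool, dirty, stats = OrderedDict(), set(), {"dirty_steals": 0}
for p in range(4):                             # fill the pool with dirty pages
    access(pool, capacity=4, page=p, dirty=dirty, stats=stats, write=True)
access(pool, 4, page=99, dirty=dirty, stats=stats)  # must steal dirty page 0
print(stats["dirty_steals"])  # 1
```

Page cleaners exist precisely to break this chain: by writing dirty pages out ahead of demand, they ensure the victim at steal time is already clean.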

5. Sort Overflows

-- LUW: Sort memory usage
SELECT TOTAL_SORTS,
       SORT_OVERFLOWS,
       CASE WHEN TOTAL_SORTS > 0
            THEN DECIMAL(FLOAT(SORT_OVERFLOWS) /
                  FLOAT(TOTAL_SORTS) * 100, 5, 2)
            ELSE 0 END AS OVERFLOW_PCT
FROM TABLE(MON_GET_DATABASE(-2)) AS T;

A sort overflow percentage above 2-3% indicates SORTHEAP is too small or queries are generating unnecessarily large sorts.

6. Package Cache Hit Ratio (LUW)

-- LUW: Package cache effectiveness
SELECT PKG_CACHE_LOOKUPS,
       PKG_CACHE_INSERTS,
       CASE WHEN PKG_CACHE_LOOKUPS > 0
            THEN DECIMAL(
              (1.0 - FLOAT(PKG_CACHE_INSERTS) /
                     FLOAT(PKG_CACHE_LOOKUPS)) * 100, 5, 2)
            ELSE 0 END AS PKG_CACHE_HIT_RATIO
FROM TABLE(MON_GET_DATABASE(-2)) AS T;

Target: 95%+ hit ratio. A lower ratio means DB2 is frequently re-preparing SQL statements, consuming CPU.

25.8.2 Setting Up a Monitoring Baseline

Before you tune anything, establish a baseline. Record the following metrics during normal operation:

  1. Buffer pool hit ratios (per pool, data and index separately)
  2. Physical I/O rates (synchronous and asynchronous)
  3. Getpage rates (z/OS) or logical read rates (LUW)
  4. Sort overflow counts and rates
  5. Package cache / EDM pool hit ratios
  6. Overall CPU usage, I/O wait times, response times
  7. Peak period timing (when does load spike?)

Capture these metrics at regular intervals (every 15-30 minutes) for at least one full business cycle — typically one week. This gives you a picture of normal behavior, including peak periods, batch windows, month-end processing, and quiet times. Without this baseline, you have no way to measure whether your tuning changes helped or hurt.


25.9 Tuning Methodology

Buffer pool and memory tuning is not a one-time activity. It is a continuous cycle of observation, analysis, and adjustment.

25.9.1 The Baseline-Measure-Adjust Cycle

    +---> BASELINE <---+
    |    (Capture current state,
    |     document configuration)
    |         |
    |         v
    |     ANALYZE
    |    (Identify the bottleneck:
    |     which pool, which metric?)
    |         |
    |         v
    |      ADJUST
    |    (Make ONE change)
    |         |
    |         v
    +---- VERIFY
         (Same metrics, same period.
          Did it help?)

Step 1: Baseline. Capture all memory-related metrics during normal operation. Document the current configuration: buffer pool sizes, assignment of table spaces to pools, STMM settings, sort heap sizes, all thresholds.

Step 2: Analyze. Identify the specific problem. Do not guess — let the data tell you:

  • Which buffer pool has the lowest hit ratio?
  • Which table space is generating the most physical I/O?
  • Are there sort overflows? How many?
  • Is the package cache / EDM pool seeing evictions?
  • Are there synchronous writes?
  • Is there a time-of-day pattern (batch window impact)?

Step 3: Adjust. Make one change at a time. This is the most important discipline in tuning. If you change three things simultaneously — increase a buffer pool, move two table spaces, and adjust VPSEQT — you will not know which change helped (or hurt). Common adjustments:

| Adjustment | When To Use |
|---|---|
| Increase buffer pool size 10-25% | Low hit ratio, working set does not fit |
| Move hot table space to dedicated pool | High-volume table in shared pool |
| Lower VPSEQT on OLTP pool | Sequential scans evicting random pages |
| Raise VPSEQT on batch pool | Sequential workload needs more cache |
| Increase SORTHEAP/SRTPOOL | Sort overflow rate above 2-3% |
| Increase EDMSTMTC/PCKCACHESZ | Dynamic SQL cache hit ratio below 90% |
| Add page cleaners | Dirty page steals occurring |

Step 4: Verify. After the change, capture the same metrics during the same period (same day of week, same time of day) and compare to the baseline. Did the target metric improve? Did any other metric degrade? If the change helped, update your baseline and consider the next adjustment. If it did not help, revert and investigate further.

25.9.2 Common Tuning Scenarios

Scenario 1: Overall low buffer pool hit ratio

Diagnosis: The buffer pool is too small for the working set, or workload mixing is causing mutual eviction.

Actions:

  1. Identify the table spaces consuming the most pages (query statistics).
  2. Determine if all those table spaces need to be in this pool.
  3. If workloads are mixed (random + sequential), separate them into different pools.
  4. If the working set legitimately requires more memory, increase the buffer pool size.

Scenario 2: Hit ratio drops during batch window

Diagnosis: Batch sequential scans are flushing OLTP pages from the buffer pool.

Actions:

  1. Create a separate buffer pool for batch table spaces.
  2. On z/OS, reduce VPSEQT on the OLTP pool to 10-20%.
  3. Move batch-scanned table spaces to the new pool.
  4. Monitor OLTP hit ratios during the next batch window.

Scenario 3: High synchronous read rate despite acceptable hit ratio

Diagnosis: The hit ratio is averaged over time, but specific critical transactions are experiencing misses.

Actions:

  1. Identify specific table spaces with high sync read counts.
  2. Create a dedicated buffer pool for those objects.
  3. Consider using PGSTEAL(NONE) if the tables are small enough to cache entirely.
  4. Evaluate whether index-only access paths could eliminate data page reads.

Scenario 4: Synchronous writes appearing

Diagnosis: Page cleaners (LUW) or deferred write (z/OS) cannot keep up. The buffer pool runs out of clean pages to steal.

Actions:

  1. On z/OS, lower DWQT to trigger deferred writes earlier (e.g., 30% instead of 50%).
  2. On LUW, increase NUM_IOCLEANERS.
  3. If the pool is simply too small, increase its size.
  4. Check I/O subsystem performance — slow storage can cause write backup.

Scenario 5: Memory pressure from sorts

Diagnosis: Sort overflows are consuming disk I/O and slowing queries.

Actions:

  1. Increase SORTHEAP (LUW) or SRTPOOL (z/OS).
  2. Review the queries causing large sorts — can indexes eliminate the sort?
  3. On LUW, check SHEAPTHRES_SHR to ensure the shared threshold is not limiting individual sort operations.
  4. Consider 32 KB page temp table spaces for sort work files.

25.9.3 Tuning Priority Order

When resources are limited and you need to decide what to tune first, follow this order:

  1. Buffer pools for critical OLTP tables and indexes — biggest impact on response time
  2. Package cache / EDM pool — reduces CPU overhead for dynamic SQL
  3. Sort memory — eliminates sort overflows
  4. Separate batch from OLTP — prevents cross-workload interference
  5. Lock list — prevents lock escalation (which causes deadlocks and timeouts)
  6. Buffer pools for batch/sequential workloads — improves batch throughput
  7. Temporary table space buffer pools — reduces work file I/O

25.10 Meridian Bank Memory Configuration Example

Let us bring everything together with a comprehensive example. Meridian Bank is a mid-size financial institution running DB2 on both z/OS (for core banking) and LUW (for digital banking and analytics).

25.10.1 The z/OS Configuration

Meridian Bank's core banking runs on a z/OS DB2 subsystem with 128 GB of real memory available on the LPAR. DB2 is allocated 48 GB. The workload includes:

  • Online banking transactions: 5,000 transactions per second during peak
  • ATM/POS transactions: 2,000 per second
  • End-of-day batch: Processes 8 million transactions nightly
  • Regulatory reporting: Monthly batch jobs scanning large tables

Buffer pool allocation:

-- BP0: Catalog and directory
-ALTER BUFFERPOOL(BP0)
  VPSIZE(25000) VPSEQT(50) DWQT(40) PGFIX(YES)
  -- 25,000 x 4 KB = 98 MB

-- BP1: Core account tables (ACCOUNTS, CUSTOMERS, BALANCES)
-ALTER BUFFERPOOL(BP1)
  VPSIZE(2500000) VPSEQT(5) DWQT(35) VDWQT(10) PGFIX(YES)
  -- 2,500,000 x 4 KB = 9.5 GB
  -- VPSEQT(5) because this pool is purely random access

-- BP2: Core account indexes
-ALTER BUFFERPOOL(BP2)
  VPSIZE(1500000) VPSEQT(0) DWQT(30) PGFIX(YES)
  -- 1,500,000 x 4 KB = 5.7 GB
  -- VPSEQT(0) because indexes are always random here

-- BP3: Transaction history tables (mixed OLTP insert + batch scan)
-ALTER BUFFERPOOL(BP3)
  VPSIZE(3000000) VPSEQT(30) DWQT(35) VDWQT(15) PGFIX(YES)
  -- 3,000,000 x 4 KB = 11.4 GB

-- BP4: Transaction history indexes
-ALTER BUFFERPOOL(BP4)
  VPSIZE(1000000) VPSEQT(5) DWQT(30) PGFIX(YES)
  -- 1,000,000 x 4 KB = 3.8 GB

-- BP5: Reference/lookup tables (branch, currency, country codes)
-ALTER BUFFERPOOL(BP5)
  VPSIZE(50000) VPSEQT(0) DWQT(30) PGSTEAL(NONE) PGFIX(YES)
  -- 50,000 x 4 KB = 195 MB
  -- PGSTEAL(NONE): these small tables must ALWAYS be in memory

-- BP6: Batch processing data (audit log, large report tables)
-ALTER BUFFERPOOL(BP6)
  VPSIZE(500000) VPSEQT(95) DWQT(45) PGSTEAL(FIFO) PGFIX(YES)
  -- 500,000 x 4 KB = 2.0 GB

-- BP7: Work/temp tables
-ALTER BUFFERPOOL(BP7)
  VPSIZE(300000) VPSEQT(90) DWQT(50) PGSTEAL(FIFO) PGFIX(YES)
  -- 300,000 x 4 KB = 1.2 GB

-- BP8K0: 8 KB page size tables (wide row tables)
-ALTER BUFFERPOOL(BP8K0)
  VPSIZE(100000) VPSEQT(50) DWQT(40) PGFIX(YES)
  -- 100,000 x 8 KB = 781 MB

-- Total buffer pool: ~34.6 GB
-- Remaining ~13.4 GB: EDM pool, RID pool, sort pool, z/OS overhead

Other memory areas:

EDM_SKELETON_POOL  = 80 MB
EDMDBDC            = 40 MB
EDMSTMTC           = 300 MB    -- Large cache for Java/JDBC dynamic SQL
MAXRBLK            = 25000     -- 100 MB RID pool
SRTPOOL            = 20 MB

25.10.2 The LUW Configuration

Meridian Bank's digital banking platform runs on DB2 LUW with 64 GB of RAM on a dedicated Linux server. The workload includes:

  • Mobile/web banking API: 10,000 requests per second at peak
  • Real-time fraud detection: Low-latency queries against recent transactions
  • Nightly analytics: Aggregation and reporting jobs

Instance and database memory:

-- Instance level: 48 GB (75% of 64 GB)
db2 UPDATE DBM CFG USING INSTANCE_MEMORY 12582912  -- 48 GB in 4 KB pages

-- Database level: STMM-managed within instance ceiling
db2 UPDATE DB CFG FOR meridiandb USING DATABASE_MEMORY AUTOMATIC

-- Enable STMM
db2 UPDATE DB CFG FOR meridiandb USING SELF_TUNING_MEM ON

Buffer pools:

-- Default pool for catalog (4 KB pages)
ALTER BUFFERPOOL IBMDEFAULTBP SIZE 50000;  -- ~200 MB

-- OLTP data pool (8 KB pages for this database)
CREATE BUFFERPOOL bp_oltp_data
  SIZE 2000000 PAGESIZE 8192;              -- ~15.3 GB (fixed, critical)

-- OLTP index pool (8 KB pages)
CREATE BUFFERPOOL bp_oltp_idx
  SIZE 1000000 PAGESIZE 8192;              -- ~7.6 GB (fixed, critical)

-- Fraud detection pool (hot, latency-sensitive tables)
CREATE BUFFERPOOL bp_fraud
  SIZE 500000 PAGESIZE 8192;               -- ~3.8 GB (fixed)

-- Analytics/reporting pool (STMM-managed)
CREATE BUFFERPOOL bp_analytics
  SIZE AUTOMATIC PAGESIZE 8192             -- STMM adjusts
  NUMBLOCKPAGES 90000                      -- Block I/O for sequential scans
  BLOCKSIZE 32;

-- Temporary table space pool (32 KB pages, STMM-managed)
CREATE BUFFERPOOL bp_temp
  SIZE AUTOMATIC PAGESIZE 32768;

Other memory parameters:

db2 UPDATE DB CFG FOR meridiandb USING SORTHEAP 2048 AUTOMATIC
db2 UPDATE DB CFG FOR meridiandb USING SHEAPTHRES_SHR 1000000
db2 UPDATE DB CFG FOR meridiandb USING PCKCACHESZ AUTOMATIC
db2 UPDATE DB CFG FOR meridiandb USING LOCKLIST AUTOMATIC
db2 UPDATE DB CFG FOR meridiandb USING LOGBUFSZ 512
db2 UPDATE DB CFG FOR meridiandb USING NUM_IOCLEANERS 12
db2 UPDATE DB CFG FOR meridiandb USING NUM_IOSERVERS 16

25.10.3 Monitoring Results After Tuning

After implementing the multi-pool configuration (replacing a single monolithic buffer pool), Meridian Bank observed:

| Metric | Before (Single Pool) | After (Multi-Pool) |
|---|---|---|
| Core account hit ratio | 87% | 99.2% |
| Transaction history hit ratio | 72% | 91% |
| Index hit ratio | 90% | 99.7% |
| Average OLTP response time | 12 ms | 3 ms |
| 95th percentile response time | 85 ms | 12 ms |
| Batch window duration | 5.5 hours | 3.2 hours |
| Sort overflows per hour | 340 | 12 |
| Sync reads per second (z/OS) | 8,500 | 1,200 |
| Sync writes per hour (z/OS) | 125,000 | < 100 |

The key insight: the original system had a single large buffer pool for everything. By separating workloads and right-sizing each pool, they eliminated contention between random OLTP access and sequential batch scans. The OLTP improvement was dramatic — a 4x improvement in average response time — because the hot working set now fits entirely in its dedicated pool with no risk of eviction by batch scans.

The batch window also improved, not because it got more memory, but because its sequential access pattern was no longer competing with random OLTP pages. With a dedicated FIFO pool and high VPSEQT, batch prefetch ran without interference.


25.11 Advanced Topics

25.11.1 Buffer Pool Warm-up

When DB2 starts (or a buffer pool is activated), the buffer pool is empty — "cold." Every first access to every page requires a physical read. This "cold start" period can cause severe performance degradation, especially for OLTP systems that are expected to respond in milliseconds from the moment they are online.

On z/OS: DB2 for z/OS supports buffer pool SAVE and RESTORE functionality. At planned shutdown, DB2 saves a map of which pages were in each buffer pool to a BSDS (Bootstrap Dataset) page set. On restart, it restores those pages, effectively pre-warming the buffer pool. This dramatically reduces the cold start period.

The save/restore behavior is configured in DSNZPARM and operates automatically during planned shutdowns. It does not help with crash restarts (the save did not complete).

On LUW: DB2 LUW does not have built-in buffer pool warm-up in most versions. Common workarounds:

  1. Warm-up scripts: Run queries that touch critical tables immediately after database activation:
-- Touch account and customer data
SELECT COUNT(*) FROM meridian.accounts;
SELECT COUNT(*) FROM meridian.customers;
-- Force index scans on critical indexes
SELECT COUNT(DISTINCT account_id) FROM meridian.transactions
  WHERE trans_date >= CURRENT DATE - 7 DAYS;
  2. Automated warm-up: Schedule the warm-up script as part of the database activation procedure so it runs automatically after every restart.

  3. Gradual ramp-up: After a restart, route only a fraction of traffic to the instance until hit ratios stabilize (typically 15-30 minutes for a large OLTP system).

25.11.2 Dynamic Buffer Pool Resizing

Both platforms support changing buffer pool sizes without an outage:

z/OS: The -ALTER BUFFERPOOL command takes effect immediately. DB2 dynamically allocates or releases real storage. However, if you request more memory than the LPAR has available (and PGFIX is YES), the ALTER will fail.

LUW: The ALTER BUFFERPOOL ... SIZE command takes effect immediately. Growing a buffer pool requires available memory within DATABASE_MEMORY. Shrinking takes time as DB2 gradually releases pages that are no longer needed.

Best practice: Make incremental changes (10-25% at a time) rather than dramatic jumps. A sudden doubling of a buffer pool may cause memory pressure elsewhere. Monitor for 30-60 minutes after each change before making the next.

25.11.3 Memory Overcommitment

A common and dangerous mistake is allocating more total buffer pool memory than physically available. The consequences differ by platform but are universally bad:

  • z/OS with PGFIX(YES): The ALTER BUFFERPOOL command will fail if sufficient real storage is not available. This is actually the safe case — the failure prevents the problem.
  • z/OS without PGFIX(YES): z/OS pages buffer pool memory to auxiliary storage (DASD). DB2 does not know this is happening. Hit ratios look fine, but every "cached" page access requires a paging I/O. Performance is catastrophically bad and extremely hard to diagnose.
  • LUW on Linux/AIX: The OS may allow overcommitment (especially Linux with overcommit_memory=1). DB2 thinks it has the memory, but the OS is silently swapping pages. Performance degrades gradually and unpredictably. In extreme cases, the OOM killer may terminate the DB2 process.

Rule of thumb: Total DB2 memory allocation should never exceed 75-80% of physical RAM on a dedicated server. On a shared server, coordinate with the system administrator to ensure DB2's allocation is realistic.

25.11.4 Monitoring Tools Summary

| Platform | Tool | What It Shows |
|---|---|---|
| z/OS | -DISPLAY BUFFERPOOL(bpname) DETAIL(*) | Real-time buffer pool statistics |
| z/OS | RMF (Resource Measurement Facility) | System-wide and DB2 memory/CPU metrics |
| z/OS | OMEGAMON for DB2 | Real-time monitoring with alerts and thresholds |
| z/OS | SMF records (Type 100-102, IFCIDs) | Historical statistics for trend analysis |
| z/OS | -DISPLAY RESOURCE MANAGER DETAIL | RID pool and sort pool usage |
| LUW | MON_GET_BUFFERPOOL() | Buffer pool metrics per pool |
| LUW | MON_GET_DATABASE() | Database-level memory and sort metrics |
| LUW | MON_GET_TABLESPACE() | Per-table space I/O metrics |
| LUW | db2pd -bufferpools | Real-time buffer pool snapshot from memory |
| LUW | db2pd -mempools | Memory pool allocation details |
| LUW | db2mtrk -d -v | Memory tracker — every memory allocation |
| LUW | db2top | Interactive real-time monitoring (ncurses UI) |

Summary

Buffer pool and memory tuning is the single most impactful performance activity a DBA can undertake. The principles are straightforward, even if the execution requires patience and discipline:

  1. Separate workloads into dedicated buffer pools so they do not interfere with each other. Random OLTP and sequential batch are enemies in a shared pool.

  2. Size buffer pools to hold the working set of each workload. For critical OLTP tables, this often means caching the entire table. For batch workloads, a moderate pool with good prefetch is sufficient.

  3. Protect random access pages from sequential scan eviction. On z/OS, use VPSEQT aggressively on OLTP pools. On LUW, use separate pools.

  4. Prevent synchronous writes by tuning deferred write thresholds (z/OS DWQT) or page cleaner counts (LUW NUM_IOCLEANERS).

  5. Do not forget the other memory areas. On z/OS, the EDM pool (especially EDMSTMTC for dynamic SQL), RID pool, and sort pool are critical. On LUW, the package cache, sort heap, and lock list all compete for the same DATABASE_MEMORY budget.

  6. Monitor continuously — hit ratios, synchronous reads, sort overflows, package cache hits. Capture baselines. Compare across time periods.

  7. Tune incrementally — one change at a time, measure the impact, then proceed.

  8. Use STMM on LUW as a starting point, but understand what it is doing. Override it for critical pools that must not shrink.

  9. Never overcommit memory. Paging is the enemy. Leave 20-25% of physical RAM for the operating system.

  10. Document everything. The DBA who tunes your system next year needs to understand why BP2 has VPSEQT(0) and why BP5 uses PGSTEAL(FIFO).

The Meridian Bank example demonstrates that even moderate systems can achieve dramatic performance improvements by moving from a single monolithic buffer pool to a properly segmented and sized memory configuration. The investment in understanding your workload patterns and matching memory resources to those patterns pays dividends every second of every day your system is running.


In the next chapter, we turn from giving DB2 the right resources to ensuring DB2 uses those resources efficiently. We will explore query tuning and access path analysis — reading EXPLAIN output, understanding the optimizer's decisions, and guiding it toward better plans when it goes astray.