In This Chapter
- Learning Objectives
- 14.1 Logical to Physical Mapping — From Entities to Tablespaces
- 14.2 [z/OS] Tablespace Types — The Full Taxonomy
- 14.3 [LUW] Tablespace Architecture — SMS, DMS, and Automatic Storage
- 14.4 Partitioning Strategies — Divide and Conquer
- 14.5 Row Compression and Page Compression
- 14.6 Data Placement and I/O Optimization
- 14.7 LOB Management
- 14.8 Temporary Table Strategies
- 14.9 Storage Estimation and Capacity Planning
- 14.10 Physical Design for Different Workloads
- 14.11 The Meridian Bank Physical Design — Putting It All Together
- Spaced Review: Concepts from Earlier Chapters
- Chapter Summary
- Key Terms Introduced in This Chapter
Chapter 14: Physical Database Design — Tablespaces, Partitioning, Compression, and Storage Optimization
A perfect logical model with bad physical design is like a sports car with bicycle tires. You have engineered a beautiful machine — normalized entities, well-chosen data types, referential integrity constraints, elegant relationships — and none of it matters if you put the data in the wrong tablespace, choose the wrong page size, skip compression on a 200 GB table, or leave partitioning to chance. Physical design is where the logical blueprint meets the physics of disk platters, memory buses, and I/O channels. It is where theory becomes throughput. In this chapter, we get our hands dirty.
Learning Objectives
After completing this chapter, you will be able to:
- Translate a logical data model into physical DB2 objects — tables, tablespaces, databases, storage groups — with deliberate, documented decisions for each mapping.
- Choose appropriate tablespace types on z/OS (segmented tablespaces, universal table spaces in their partition-by-growth and partition-by-range flavors, and LOB tablespaces) and on LUW (SMS, DMS, automatic storage; regular, large, temporary) based on workload characteristics.
- Design range-partitioning strategies for large tables, selecting partition keys, sizing partitions, and planning partition-level maintenance operations.
- Implement row compression and page compression, estimate compression ratios, and calculate the return on investment in CPU versus storage savings.
- Plan storage allocation and growth for production systems using capacity planning calculations: row size, rows per page, pages per partition, free space overhead, and multi-year growth projections.
- Create the complete physical implementation of the Meridian National Bank database model with production-grade DDL on both z/OS and LUW.
14.1 Logical to Physical Mapping — From Entities to Tablespaces
In Chapter 13, we designed a logical model for Meridian National Bank. We have entities, attributes, relationships, primary keys, and foreign keys. That model is platform-independent — it says nothing about how DB2 will store the data on disk. Now we bridge that gap.
14.1.1 The Mapping Hierarchy
Every DB2 database has a hierarchy of physical objects. The hierarchy differs between z/OS and LUW, but the concept is the same: logical entities must be placed into physical containers, and those containers must be placed on physical storage.
z/OS Physical Hierarchy:
Storage Group (STOGROUP)
└── Database (DATABASE)
└── Tablespace (TABLESPACE)
└── Table (TABLE)
└── Index (INDEX → in its own INDEXSPACE)
LUW Physical Hierarchy:
Storage Path / Container
└── Tablespace (TABLESPACE)
└── Table (TABLE)
└── Index (INDEX → in the same or different tablespace)
On z/OS, the DATABASE object is a logical grouping mechanism that controls locking granularity and utility scope. On LUW, the DATABASE is the top-level container — there is no separate DATABASE object inside a database. This is one of the most common points of confusion for cross-platform practitioners.
14.1.2 One Table, One Tablespace — Or Not?
The fundamental mapping question is: how many tables share a tablespace?
z/OS approach: Modern best practice is one table per tablespace using Universal Table Spaces (UTS). Older systems sometimes used segmented tablespaces with multiple tables, but this pattern creates operational headaches — you cannot REORG one table without affecting the others, and space reclamation becomes unpredictable.
LUW approach: Multiple tables can share a tablespace, and this is common for small reference tables. However, large tables should have their own tablespace for independent management, backup, and restore operations.
The Meridian Bank decision: Each major entity gets its own tablespace. Small reference tables (BRANCH, PRODUCT_TYPE, TRANSACTION_CODE) share a single tablespace. This gives us:
| Logical Entity | z/OS Tablespace | LUW Tablespace | Rationale |
|---|---|---|---|
| CUSTOMER | TS_CUSTOMER (PBG UTS) | TS_CUSTOMER (AUTO 8K) | Growth-oriented, moderate size |
| ACCOUNT | TS_ACCOUNT (PBG UTS) | TS_ACCOUNT (AUTO 8K) | Growth-oriented, moderate size |
| TRANSACTION_HISTORY | TS_TRANS_HIST (PBR UTS) | TS_TRANS_HIST (AUTO 8K RNG) | Range-partitioned by month |
| LOAN | TS_LOAN (PBG UTS) | TS_LOAN (AUTO 8K) | Moderate size, variable growth |
| LOAN_PAYMENT | TS_LOAN_PMT (PBR UTS) | TS_LOAN_PMT (AUTO 8K RNG) | Range-partitioned by payment date |
| EMPLOYEE | TS_EMPLOYEE (PBG UTS) | TS_EMPLOYEE (AUTO 4K) | Small, stable |
| Reference tables | TS_REFERENCE (PBG UTS) | TS_REFERENCE (AUTO 4K) | Multiple small tables, shared |
14.1.3 Page Size Selection
DB2 supports four page sizes: 4 KB, 8 KB, 16 KB, and 32 KB. The choice is not arbitrary — it has direct implications for row capacity, buffer pool efficiency, and I/O cost.
Page size selection rules:
- Calculate maximum row length. A row must fit on one page (minus page overhead). On z/OS, page overhead is approximately 22-42 bytes depending on the tablespace type. On LUW, page overhead is approximately 91 bytes for a 4K page.
- Consider the number of rows per page. More rows per page means fewer I/O operations for sequential scans. But if you typically access single rows by key, you want just enough rows per page to avoid wasting buffer pool memory on unneeded rows.
- Consider VARCHAR and LOB usage. Tables with highly variable row lengths benefit from larger pages because they reduce the percentage of wasted space per page.
- Consider index size. Larger pages mean wider index entries can fit, but index pages in the buffer pool occupy more memory.
Meridian Bank page size decisions:
| Table | Average Row Size | Max Row Size | Page Size | Rows/Page (est.) |
|---|---|---|---|---|
| CUSTOMER | 420 bytes | 680 bytes | 8 KB | ~15 |
| ACCOUNT | 180 bytes | 260 bytes | 8 KB | ~35 |
| TRANSACTION_HISTORY | 240 bytes | 380 bytes | 8 KB | ~26 |
| LOAN | 520 bytes | 840 bytes | 8 KB | ~12 |
| LOAN_PAYMENT | 160 bytes | 220 bytes | 4 KB | ~18 |
| EMPLOYEE | 350 bytes | 500 bytes | 4 KB | ~8 |
| BRANCH | 280 bytes | 400 bytes | 4 KB | ~8 |
We chose 8 KB for the high-volume transaction and customer tables to balance rows-per-page against buffer pool utilization. The smaller reference and payment tables use 4 KB pages because they have shorter rows and smaller working sets.
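As a sanity check, the rows-per-page estimates in the table can be approximated with a short calculation. The overhead constants below (91 bytes of page overhead, 10 bytes per row, PCTFREE 10) are illustrative assumptions rather than exact DB2 internals, which is why the results land slightly above the chapter's more conservative figures.

```python
# Illustrative rows-per-page estimate. The overhead constants are
# assumptions for this sketch; actual values vary by platform and
# tablespace type.
PAGE_OVERHEAD = 91     # bytes reserved per page (assumed)
ROW_OVERHEAD = 10      # per-row header bytes (assumed)
PCTFREE = 0.10         # free space reserved at load/REORG time

def rows_per_page(page_size, avg_row_size):
    # Usable space after page overhead and PCTFREE, divided by the
    # average row plus its per-row overhead.
    usable = (page_size - PAGE_OVERHEAD) * (1 - PCTFREE)
    return int(usable // (avg_row_size + ROW_OVERHEAD))

# CUSTOMER: 420-byte average rows on 8 KB pages (chapter estimates ~15)
print(rows_per_page(8192, 420))
# LOAN_PAYMENT: 160-byte average rows on 4 KB pages (chapter estimates ~18)
print(rows_per_page(4096, 160))
```

Varying the assumed overheads moves the result by a row or two either way, which is why published estimates are usually rounded down.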
14.2 [z/OS] Tablespace Types — The Full Taxonomy
DB2 for z/OS has the richest set of tablespace types of any relational database. Understanding them is essential for z/OS DBAs, and understanding why they exist helps LUW practitioners appreciate the design tradeoffs.
14.2.1 The Evolution of z/OS Tablespace Types
The history matters because you will encounter all of these in production:
- Simple tablespaces (DB2 V1, 1983). Rows from multiple tables interleaved on the same pages. Deprecated. Never create new simple tablespaces.
- Segmented tablespaces (DB2 V3, 1993). Pages grouped into segments, each segment belonging to one table. Space is reclaimed at the segment level. Still supported but considered legacy.
- Classic partitioned tablespaces (DB2 V3). Data divided into fixed partitions by key range. Each partition is a separate VSAM data set. Limited to 64 partitions originally.
- Partition-by-Growth (PBG) Universal Table Spaces (DB2 9, 2007). Automatically adds partitions as data grows. No key-range definition needed. Maximum 4096 partitions.
- Partition-by-Range (PBR) Universal Table Spaces (DB2 9, 2007). Key-range partitioned like classic, but with Universal Table Space features. Maximum 4096 partitions with DSSIZE up to 128 GB per partition.
- LOB tablespaces (DB2 V6, 1999). Dedicated tablespaces for CLOB, BLOB, and DBCLOB data.
14.2.2 Universal Table Spaces (UTS) — The Modern Standard
IBM's recommendation since DB2 9 for z/OS is clear: use Universal Table Spaces for all new development. UTS comes in two flavors:
Partition-by-Growth (PBG):
CREATE TABLESPACE TS_CUSTOMER
IN MERIDIAN_DB
USING STOGROUP SGMERIDIAN
MAXPARTITIONS 128
DSSIZE 4G
SEGSIZE 32
BUFFERPOOL BP8K0
LOCKSIZE ROW
CLOSE NO
COMPRESS YES;
Key parameters:
- MAXPARTITIONS 128 — DB2 creates new VSAM data sets automatically as data grows. 128 partitions at 4 GB DSSIZE = 512 GB maximum capacity.
- DSSIZE 4G — Each partition (data set) can grow to 4 GB. Valid values: 4G, 8G, 16G, 32G, 64G, 128G (powers of 2 from 4 to 128).
- SEGSIZE 32 — Pages are grouped into 32-page segments. Valid values are multiples of 4 from 4 to 64. SEGSIZE 32 is the most common choice, balancing space reclamation granularity against internal overhead.
- BUFFERPOOL BP8K0 — Assigns this tablespace to an 8 KB buffer pool. The buffer pool must already exist and must match the implied page size.
- LOCKSIZE ROW — Row-level locking for OLTP workloads. Alternatives: PAGE, TABLE, TABLESPACE.
- COMPRESS YES — Enables row-level compression (covered in Section 14.5).
When to use PBG: Tables that grow unpredictably, reference data tables, tables where you do not need partition-level operations (REORG one partition, drop a partition).
Partition-by-Range (PBR):
CREATE TABLESPACE TS_TRANS_HIST
IN MERIDIAN_DB
USING STOGROUP SGMERIDIAN
DSSIZE 16G
SEGSIZE 32
BUFFERPOOL BP8K0
LOCKSIZE ROW
CLOSE NO
COMPRESS YES
NUMPARTS 60;
CREATE TABLE MERIDIAN.TRANSACTION_HISTORY (
TRANS_ID BIGINT NOT NULL,
ACCOUNT_ID BIGINT NOT NULL,
TRANS_DATE DATE NOT NULL,
TRANS_TYPE_CODE CHAR(4) NOT NULL,
AMOUNT DECIMAL(15,2) NOT NULL,
RUNNING_BALANCE DECIMAL(15,2),
DESCRIPTION VARCHAR(200),
BRANCH_ID INTEGER,
TELLER_ID INTEGER,
CHANNEL_CODE CHAR(3),
REFERENCE_NUM VARCHAR(30),
CREATED_TS TIMESTAMP NOT NULL WITH DEFAULT
)
IN MERIDIAN_DB.TS_TRANS_HIST
PARTITION BY RANGE (TRANS_DATE)
(PARTITION 1 ENDING AT ('2024-01-31'),
PARTITION 2 ENDING AT ('2024-02-29'),
PARTITION 3 ENDING AT ('2024-03-31'),
PARTITION 4 ENDING AT ('2024-04-30'),
PARTITION 5 ENDING AT ('2024-05-31'),
-- ... one partition per month ...
PARTITION 60 ENDING AT ('2028-12-31'));
Key parameters for PBR:
- NUMPARTS 60 — Fixed number of partitions. Each is a separate VSAM data set that can be independently managed.
- DSSIZE 16G — Each partition can hold up to 16 GB of data. For TRANSACTION_HISTORY at ~42 million rows per month and 240 bytes per average row, a monthly partition holds roughly 42M * 240 bytes ≈ 10 GB of raw data; dividing by 0.85 to allow for free space and page overhead gives approximately 11-12 GB per partition, so 16G provides comfortable headroom.
- PARTITION BY RANGE (TRANS_DATE) — The partitioning column and boundary values.
When to use PBR: Large fact tables where you need partition-level operations — REORG one month without touching others, slide a partition window by adding new and archiving old, run utilities on individual partitions in parallel.
14.2.3 DSSIZE — Getting It Right
DSSIZE determines the maximum size of each partition's VSAM data set. Getting it wrong has consequences:
- Too small: The partition fills up and the tablespace enters STOP status. Recovery requires ALTER TABLESPACE to increase DSSIZE, followed by REORG.
- Too large: Each partition's VSAM data set is allocated with a larger maximum, which affects VSAM control interval processing and can waste catalog entries. In practice, allocating a larger DSSIZE than needed has minimal cost — the space is not physically allocated until used.
Rule of thumb: Set DSSIZE to at least 2x your expected partition size to accommodate growth. For very large tables, 4x is not unreasonable for a five-year horizon.
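The rule of thumb can be sketched as a small helper. The valid DSSIZE values are taken from the list in Section 14.2.2, the 2x multiplier follows the rule of thumb, and the 7 GB example partition is hypothetical.

```python
# Pick the smallest valid DSSIZE that is at least growth_factor times
# the expected partition size. DSSIZE candidates follow the list given
# earlier in the chapter.
VALID_DSSIZE_GB = [4, 8, 16, 32, 64, 128]

def choose_dssize(expected_partition_gb, growth_factor=2.0):
    target = expected_partition_gb * growth_factor
    for size in VALID_DSSIZE_GB:
        if size >= target:
            return size
    raise ValueError("expected size exceeds the largest DSSIZE; repartition")

# Hypothetical 7 GB expected partition, 2x growth allowance -> 16 (GB)
print(choose_dssize(7))
```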
14.2.4 Segmented Tablespaces — Legacy but Still Alive
You will encounter segmented tablespaces in any z/OS shop with DB2 systems older than 2007. They look like this:
CREATE TABLESPACE TS_LEGACY
IN MERIDIAN_DB
USING STOGROUP SGMERIDIAN
SEGSIZE 32
BUFFERPOOL BP4K0
LOCKSIZE PAGE;
A segmented tablespace has no partitions and is stored as a single VSAM data set. Multiple tables can reside in the same segmented tablespace, each occupying distinct segments. When all rows of a table in a segment are deleted, DB2 reclaims the entire segment and makes it available for reuse.
Why migrate away from segmented tablespaces?
- No partition-level utility support (REORG affects the entire tablespace).
- A single VSAM data set limits scalability.
- Space reclamation is less efficient than with PBG.
- IBM has signaled that segmented tablespaces may be deprecated in future DB2 versions.
Migration path: Convert segmented tablespaces to PBG UTS using ALTER TABLESPACE ... MAXPARTITIONS n. DB2 will materialize the change during the next REORG.
14.2.5 LOB Tablespaces on z/OS
LOB (Large Object) columns — CLOB, BLOB, DBCLOB — require dedicated LOB tablespaces on z/OS. You cannot store LOB data in a regular tablespace; DB2 stores it separately and maintains a pointer (the LOB locator) in the base table row.
CREATE LOB TABLESPACE TS_CUST_DOCS
IN MERIDIAN_DB
USING STOGROUP SGMERIDIAN
DSSIZE 16G
BUFFERPOOL BP32K1
LOCKSIZE LOB
CLOSE NO
COMPRESS NO;
LOB tablespaces always use 32 KB pages internally. The BUFFERPOOL must be a 32 KB buffer pool. Compression on LOB tablespaces is generally not beneficial because LOB data (images, PDFs, compressed archives) is often already compressed at the application level.
14.3 [LUW] Tablespace Architecture — SMS, DMS, and Automatic Storage
DB2 LUW takes a fundamentally different approach to storage management. Where z/OS DBAs think in terms of VSAM data sets and storage groups, LUW DBAs think in terms of containers, extent sizes, and storage paths.
14.3.1 The Three Storage Models
System-Managed Space (SMS):
DB2 delegates space management to the operating system file system. Each container is a directory, and DB2 creates files within it.
CREATE TABLESPACE TS_REFERENCE
MANAGED BY SYSTEM
USING ('/db2data/meridian/ts_reference');
SMS was the default in older DB2 LUW versions. It is simple but offers the least control — the file system handles extent allocation, and DB2 cannot reclaim free space within files efficiently.
Database-Managed Space (DMS):
DB2 manages raw devices or pre-allocated files directly. DB2 has full control over space allocation at the extent level.
CREATE TABLESPACE TS_CUSTOMER
MANAGED BY DATABASE
USING (FILE '/db2data/meridian/ts_customer_01' 500000,
FILE '/db2data/meridian/ts_customer_02' 500000)
EXTENTSIZE 32
PREFETCHSIZE 96
BUFFERPOOL BP8K;
Each container is pre-allocated (500,000 pages in this example). DMS gives DB2 full control over extent allocation and supports features like table-level online REORG and index-level space reclamation.
Automatic Storage:
This is the modern recommendation for DB2 LUW (introduced in DB2 9.1, enhanced in every subsequent release). DB2 manages container creation, sizing, and growth automatically based on storage paths defined at the database level.
-- At database creation:
CREATE DATABASE MERIDIAN_DIGITAL
AUTOMATIC STORAGE YES
ON '/db2data/path1', '/db2data/path2', '/db2data/path3'
DBPATH ON '/db2system';
-- Tablespace creation with automatic storage:
CREATE TABLESPACE TS_CUSTOMER
PAGESIZE 8192
MANAGED BY AUTOMATIC STORAGE
EXTENTSIZE 32
PREFETCHSIZE AUTOMATIC
BUFFERPOOL BP8K
NO FILE SYSTEM CACHING
AUTORESIZE YES
MAXSIZE 100G;
With automatic storage, DB2 distributes data across the defined storage paths using a round-robin striping algorithm at the extent level. Containers are created and resized automatically. The DBA specifies the logical parameters (page size, extent size, maximum size) and DB2 handles the physical layout.
The recommendation is clear: Use automatic storage for all new LUW databases. It simplifies administration, enables online storage path additions, and supports automatic rebalancing when you add new storage paths.
14.3.2 Tablespace Types on LUW
DB2 LUW has three tablespace types, each serving a specific purpose:
Regular tablespaces store user table data and index data. This is where your CUSTOMER, ACCOUNT, and TRANSACTION_HISTORY tables live. Page sizes: 4K, 8K, 16K, or 32K.
Large tablespaces store LOB data (CLOB, BLOB, DBCLOB) and long VARCHAR/VARGRAPHIC data that exceeds the page size limit. In current LUW releases, large is also the default type for regular table data: large tablespaces use 6-byte record identifiers, which allow far more data pages per tablespace and more rows per page than regular tablespaces.
Temporary tablespaces store intermediate results during query processing — sorts, hash joins, temporary tables. There are two subtypes:
- System temporary tablespaces — managed by DB2 automatically for internal operations.
- User temporary tablespaces — required for DECLARE GLOBAL TEMPORARY TABLE operations.
-- System temporary tablespace
CREATE SYSTEM TEMPORARY TABLESPACE TEMPSPACE_8K
PAGESIZE 8192
MANAGED BY AUTOMATIC STORAGE
EXTENTSIZE 32
PREFETCHSIZE AUTOMATIC
BUFFERPOOL BP8K_TEMP;
-- User temporary tablespace
CREATE USER TEMPORARY TABLESPACE USERTEMP_8K
PAGESIZE 8192
MANAGED BY AUTOMATIC STORAGE
EXTENTSIZE 32
BUFFERPOOL BP8K_UTEMP;
14.3.3 Extent Sizing on LUW
An extent is the unit of space allocation in DB2 LUW. When a table needs more space, DB2 allocates one extent at a time. The extent size (number of pages per extent) affects:
- Prefetch efficiency: Sequential prefetch reads ahead in extent-sized chunks. Larger extents mean fewer prefetch requests for sequential scans.
- Space efficiency: Smaller extents waste less space for small tables. A table with 10 rows in a tablespace with EXTENTSIZE 256 will still consume 256 pages.
- Concurrency: Extent-level operations (allocation, deallocation) acquire internal latches. Fewer, larger extents mean fewer latch acquisitions.
Guidelines for extent sizing:
| Table Size | Recommended EXTENTSIZE | Rationale |
|---|---|---|
| < 1,000 pages | 4 or 8 | Minimize wasted space |
| 1,000-100,000 | 16 or 32 | Balance space and prefetch |
| > 100,000 pages | 32 or 64 | Optimize sequential prefetch |
For Meridian Bank, we use EXTENTSIZE 32 for most tablespaces — a solid default that works well for tables ranging from thousands to millions of pages.
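A related heuristic, sketched below, is to set PREFETCHSIZE to EXTENTSIZE times the number of containers, so that a single prefetch request can drive every container in parallel; PREFETCHSIZE AUTOMATIC lets DB2 compute an equivalent value itself. The helper function is illustrative, not a DB2 API.

```python
# Heuristic prefetch sizing: one extent per container per prefetch
# request keeps all containers busy during sequential scans.
def prefetch_size(extent_size, num_containers, parallelism=1):
    return extent_size * num_containers * parallelism

# EXTENTSIZE 32 across 3 containers -> 96-page prefetch
print(prefetch_size(32, 3))
# The same extent size across 2 containers -> 64 pages
print(prefetch_size(32, 2))
```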
14.3.4 Page Size Strategy on LUW
LUW allows different page sizes per tablespace, and each page size requires its own buffer pool. This means page size decisions ripple into buffer pool configuration:
-- Create buffer pools for each page size we will use
CREATE BUFFERPOOL BP4K PAGESIZE 4096 SIZE 25000;
CREATE BUFFERPOOL BP8K PAGESIZE 8192 SIZE 50000;
CREATE BUFFERPOOL BP8K_TEMP PAGESIZE 8192 SIZE 10000;
CREATE BUFFERPOOL BP8K_UTEMP PAGESIZE 8192 SIZE 5000;
CREATE BUFFERPOOL BP32K PAGESIZE 32768 SIZE 10000;
Every tablespace with PAGESIZE 8192 must reference a buffer pool that was created with PAGESIZE 8192. Mismatches cause DDL errors. Plan your page sizes carefully — changing a tablespace's page size later requires exporting and reimporting the data.
14.4 Partitioning Strategies — Divide and Conquer
Table partitioning is the single most impactful physical design decision for large tables. A well-partitioned table is easy to manage, fast to query, and efficient to maintain. A poorly partitioned table — or an unpartitioned large table — becomes an operational nightmare.
14.4.1 Why Partition?
The benefits of partitioning are concrete and measurable:
- Partition elimination. When a query includes a predicate on the partitioning key, DB2 skips partitions that cannot contain matching rows. A query for January 2025 transactions touches only the January 2025 partition — the other 59 partitions are never read.
- Partition-level utilities. On z/OS, you can REORG, RUNSTATS, COPY, and RECOVER individual partitions. On LUW, you can DETACH and ATTACH partitions for instant archival and data loading.
- Parallel processing. DB2 can process multiple partitions simultaneously during queries, utilities, and maintenance operations.
- Rolling window management. Add new partitions for incoming data and detach/drop old partitions for expired data. This is far more efficient than DELETE operations on unpartitioned tables.
14.4.2 Range Partitioning by Date — The Dominant Pattern
In banking, insurance, telecommunications, and virtually every transaction-processing system, the most effective partitioning strategy is range partitioning by date. The reasons are straightforward:
- Queries almost always filter by date range ("show me last month's transactions").
- Data ages uniformly — January's data ages together, not randomly.
- Archival policies are date-based ("retain 7 years of transactions").
- Monthly maintenance windows align with monthly partitions.
z/OS range partitioning:
CREATE TABLE MERIDIAN.TRANSACTION_HISTORY (
-- columns as defined above
)
IN MERIDIAN_DB.TS_TRANS_HIST
PARTITION BY RANGE (TRANS_DATE)
(PARTITION 1 ENDING AT ('2024-01-31'),
PARTITION 2 ENDING AT ('2024-02-29'),
PARTITION 3 ENDING AT ('2024-03-31'),
-- ... monthly partitions ...
PARTITION 60 ENDING AT ('2028-12-31'));
LUW range partitioning:
CREATE TABLE MERIDIAN.TRANSACTION_HISTORY (
TRANS_ID BIGINT NOT NULL,
ACCOUNT_ID BIGINT NOT NULL,
TRANS_DATE DATE NOT NULL,
TRANS_TYPE_CODE CHAR(4) NOT NULL,
AMOUNT DECIMAL(15,2) NOT NULL,
RUNNING_BALANCE DECIMAL(15,2),
DESCRIPTION VARCHAR(200),
BRANCH_ID INTEGER,
TELLER_ID INTEGER,
CHANNEL_CODE CHAR(3),
REFERENCE_NUM VARCHAR(30),
CREATED_TS TIMESTAMP NOT NULL WITH DEFAULT
)
IN TS_TRANS_HIST
PARTITION BY RANGE (TRANS_DATE)
(STARTING '2024-01-01' ENDING '2024-01-31',
STARTING '2024-02-01' ENDING '2024-02-29',
STARTING '2024-03-01' ENDING '2024-03-31',
STARTING '2024-04-01' ENDING '2024-04-30',
STARTING '2024-05-01' ENDING '2024-05-31',
STARTING '2024-06-01' ENDING '2024-06-30',
STARTING '2024-07-01' ENDING '2024-07-31',
STARTING '2024-08-01' ENDING '2024-08-31',
STARTING '2024-09-01' ENDING '2024-09-30',
STARTING '2024-10-01' ENDING '2024-10-31',
STARTING '2024-11-01' ENDING '2024-11-30',
STARTING '2024-12-01' ENDING '2024-12-31'
-- additional years added via ALTER TABLE ADD PARTITION
);
Note the syntax difference: z/OS uses ENDING AT with inclusive upper bounds; LUW uses STARTING ... ENDING with explicit ranges.
14.4.3 Partition Key Selection
The partitioning key must satisfy several criteria simultaneously:
- Query alignment. The partitioning key should appear in the WHERE clause of the most frequent and expensive queries. If 80% of your queries filter by TRANS_DATE, partition by TRANS_DATE.
- Even distribution. Partitions should contain roughly equal amounts of data. Months with different numbers of days (28 vs. 31) cause modest skew; months with different business volumes (December holidays vs. February) cause more significant skew. Monitor and accept reasonable imbalance.
- Maintenance alignment. The partitioning key should align with your data lifecycle — if you archive data older than 5 years, partition boundaries should make it easy to detach or drop a year's worth of data.
- Immutability. The partitioning key should rarely or never change. If TRANS_DATE changes after insertion, DB2 must move the row between partitions, which is expensive. Transaction dates do not change — they are ideal.
Anti-patterns to avoid:
- Partitioning by a high-cardinality artificial key (TRANS_ID). This distributes data evenly but provides no query benefit — nobody queries by "TRANS_ID between 100000 and 200000."
- Partitioning by a low-cardinality column (STATUS). You end up with a few massive partitions and many empty ones.
- Partitioning by a column that is not in the most frequent queries. The overhead exists, but partition elimination never fires.
14.4.4 Partition Sizing
How much data should a single partition hold? There is no universal rule, but these guidelines help:
- z/OS: Each partition is a VSAM data set. VSAM data sets below 2 GB are easier to manage (DFDSS backup/restore, faster REORG). Aim for partitions in the 2-8 GB range for high-volume tables.
- LUW: Each partition is an extent-managed region within the tablespace containers. Partition size affects REORG time and backup granularity. Partitions in the 1-10 GB range are practical for most workloads.
- Utility windows: If your REORG window is 2 hours and REORG processes ~5 GB/hour, each partition should be no larger than 10 GB.
- Parallelism: More partitions = more opportunities for parallel utility execution. But too many small partitions add catalog overhead and complicate administration.
For Meridian Bank's TRANSACTION_HISTORY table:
- 500 million transactions per year / 12 months = ~42 million rows per month.
- 42 million rows * 240 bytes average = ~10 GB per month (raw data).
- With PCTFREE 10 and overhead: ~11.5 GB per monthly partition.
- This fits comfortably within a 16 GB DSSIZE on z/OS and within LUW partition limits.
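The arithmetic above is easy to verify. In the sketch below, the 0.9 factor models PCTFREE 10; additional page overhead pushes the real figure toward the ~11.5 GB quoted.

```python
# Reproduce the Meridian TRANSACTION_HISTORY partition-sizing arithmetic.
rows_per_year = 500_000_000
avg_row_bytes = 240
pctfree = 0.10                                    # free space per page

rows_per_month = rows_per_year / 12               # ~41.7 million rows
raw_gb = rows_per_month * avg_row_bytes / 1e9     # raw data per month
sized_gb = raw_gb / (1 - pctfree)                 # allow for PCTFREE 10

print(round(rows_per_month / 1e6, 1))   # 41.7
print(round(raw_gb, 1))                 # 10.0
print(round(sized_gb, 1))               # 11.1
```

With a few percent added for page headers and mapping pages, the estimate lands at the chapter's ~11.5 GB, comfortably inside a 16 GB DSSIZE.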
14.4.5 Partition-Level Operations
The real power of partitioning shows in operations:
z/OS partition-level utilities:
-- REORG a single partition (month)
REORG TABLESPACE MERIDIAN_DB.TS_TRANS_HIST PART 13;
-- RUNSTATS on a single partition
RUNSTATS TABLESPACE MERIDIAN_DB.TS_TRANS_HIST PART 13
TABLE (ALL) INDEX (ALL);
-- IMAGE COPY a single partition
COPY TABLESPACE MERIDIAN_DB.TS_TRANS_HIST DSNUM 13
FULL YES;
LUW partition management:
-- Add a new monthly partition
ALTER TABLE MERIDIAN.TRANSACTION_HISTORY
ADD PARTITION STARTING '2025-01-01' ENDING '2025-01-31';
-- Detach an old partition for archival
ALTER TABLE MERIDIAN.TRANSACTION_HISTORY
DETACH PARTITION PART_2024_01
INTO TABLE MERIDIAN.TRANS_ARCHIVE_202401;
-- Attach a pre-loaded staging table as a partition
ALTER TABLE MERIDIAN.TRANSACTION_HISTORY
ATTACH PARTITION STARTING '2025-02-01' ENDING '2025-02-28'
FROM TABLE MERIDIAN.TRANS_STAGING_202502;
SET INTEGRITY FOR MERIDIAN.TRANSACTION_HISTORY
ALLOW WRITE ACCESS
IMMEDIATE CHECKED;
The LUW DETACH/ATTACH pattern is extraordinarily powerful. DETACH converts a partition into a standalone table in seconds — regardless of partition size. You can then archive, export, or drop that table at your leisure without affecting the active partitions. This is the foundation of efficient data lifecycle management.
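The monthly ADD/DETACH statements lend themselves to generation by script. The sketch below follows the DDL shown above; the PART_YYYY_MM partition-naming convention and the helper-function names are assumptions for illustration.

```python
# Sketch: generate rolling-window DDL for an LUW range-partitioned table.
# The PART_YYYY_MM naming convention is an assumed site standard.
import calendar

def add_partition_ddl(table, year, month):
    # End the partition on the last calendar day of the month.
    last_day = calendar.monthrange(year, month)[1]
    return (f"ALTER TABLE {table} ADD PARTITION "
            f"STARTING '{year:04d}-{month:02d}-01' "
            f"ENDING '{year:04d}-{month:02d}-{last_day:02d}'")

def detach_partition_ddl(table, archive_table, year, month):
    return (f"ALTER TABLE {table} DETACH PARTITION "
            f"PART_{year:04d}_{month:02d} INTO TABLE {archive_table}")

print(add_partition_ddl("MERIDIAN.TRANSACTION_HISTORY", 2025, 1))
print(detach_partition_ddl("MERIDIAN.TRANSACTION_HISTORY",
                           "MERIDIAN.TRANS_ARCHIVE_202401", 2024, 1))
```

Note that calendar.monthrange handles leap years, so the February 2024 boundary comes out as '2024-02-29' without special-casing.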
14.5 Row Compression and Page Compression
Storage is not free. For Meridian National Bank with 500 million annual transactions and a 7-year retention policy, TRANSACTION_HISTORY alone will consume approximately 840 GB uncompressed (3.5 billion rows * 240 bytes). Compression can reduce this by 50-80%, saving hundreds of gigabytes of storage, reducing I/O, and often improving query performance because more data fits in the buffer pool.
14.5.1 How DB2 Compression Works
DB2 uses dictionary-based compression. The concept is simple: find repeating patterns in the data and replace them with shorter symbols.
Row compression (classic/static compression):
DB2 builds a compression dictionary by scanning the table data. The dictionary maps frequently occurring byte patterns (not just column values, but any repeating byte sequence) to short symbols. After the dictionary is built, each row is compressed by replacing occurrences of dictionary entries with their symbols.
The dictionary is stored in the tablespace and loaded into memory when the tablespace is accessed. Dictionary size is typically 32-128 KB depending on data characteristics.
How the dictionary is built (z/OS):
1. Run REORG TABLESPACE ... COMPRESS YES or LOAD ... REPLACE COMPRESS YES.
2. DB2 samples the data (or scans it entirely) and builds the compression dictionary.
3. Subsequent LOAD and INSERT operations use the dictionary to compress new rows.
4. The dictionary remains static until the next REORG.
How the dictionary is built (LUW):
1. Create the table with COMPRESS YES.
2. Run REORG TABLE ... RESETDICTIONARY to build the dictionary.
3. Alternatively, DB2 can build the dictionary automatically when sufficient data has been inserted (automatic dictionary creation, or ADC).
4. The dictionary is stored in the table object itself.
Adaptive compression (z/OS DB2 11+, LUW DB2 10.1+):
Adaptive compression adds a second level of compression on top of the static dictionary. It creates page-level compression maps that exploit patterns specific to each page — for example, if all rows on a page have the same BRANCH_ID value, that value can be represented once in the page map rather than repeated in every row.
-- z/OS: Enable adaptive compression
ALTER TABLESPACE MERIDIAN_DB.TS_TRANS_HIST
COMPRESS YES;
-- Then REORG to build the dictionary and enable adaptive compression
-- LUW: Enable adaptive compression
ALTER TABLE MERIDIAN.TRANSACTION_HISTORY
COMPRESS YES ADAPTIVE;
Adaptive compression typically adds 10-30% additional compression on top of static row compression, at the cost of slightly higher CPU usage during data access.
14.5.2 Estimating Compression Ratios
Before enabling compression in production, estimate the savings.
z/OS estimation:
-- Use DSN1COMP utility to estimate compression ratio
//DSN1COMP EXEC PGM=DSN1COMP
//SYSUT1   DD DSN=MERIDIAN.DSNDBD.MERIDIAN_DB.TS_TRANS_HIST.I0001.A001,DISP=SHR
//SYSPRINT DD SYSOUT=*
DSN1COMP reads a sample of the data and reports the expected compression ratio. A typical output might show:
COMPRESSION DICTIONARY BUILT FROM 50000 ROWS
PERCENTAGE OF BYTES SAVED: 62%
DICTIONARY SIZE: 64KB
LUW estimation:
-- Use ADMIN_GET_TAB_COMPRESS_INFO to estimate
SELECT COMPRESS_ATTR, PAGES_SAVED_PERCENT, BYTES_SAVED_PERCENT
FROM TABLE(SYSPROC.ADMIN_GET_TAB_COMPRESS_INFO('MERIDIAN',
'TRANSACTION_HISTORY', 'ESTIMATE')) AS T;
Typical compression ratios by data type:
| Data Characteristics | Expected Ratio | Savings |
|---|---|---|
| Numeric, high cardinality | 30-45% | Modest |
| Character, moderate cardinality | 50-65% | Good |
| Repeating patterns (status codes, dates) | 60-75% | Very good |
| Mixed workload, typical banking data | 55-70% | Good |
| Already-compressed LOB data | 0-5% | None |
14.5.3 Compression ROI Analysis
Compression trades CPU cycles for storage savings. Every read decompresses rows; every write compresses them. Is it worth it?
CPU cost of compression:
- Row decompression adds 2-5% CPU overhead per SQL statement that reads compressed data.
- Row compression (during INSERT/UPDATE) adds 5-10% CPU overhead.
- Adaptive compression adds another 1-3% on top of row compression.
Storage savings:
- A 62% compression ratio on 840 GB of TRANSACTION_HISTORY saves roughly 520 GB of disk.
- At a fully burdened cost of roughly $8/GB/month for enterprise storage (primary, replicated, and backup copies), that is approximately $52,000/year.
- Over 7 years (the retention period): approximately $364,000 in storage savings.
I/O savings:
- Compressed data means fewer pages to read from disk. Buffer pool hit ratios improve because more data fits in the same amount of memory.
- Sequential scans read fewer pages. A table scan that reads 100 million pages uncompressed reads 38 million pages at a 62% compression ratio. This can cut batch job elapsed time by 50% or more.
The ROI formula:
Annual Savings = (Storage Savings * $/GB/year) + (I/O Reduction * $/IO/year)
Annual Cost = (CPU Overhead * $/MSU or $/core/year)
ROI = (Annual Savings - Annual Cost) / Annual Cost
For most banking workloads, compression ROI is strongly positive. The exceptions are:
- Very small tables (dictionary overhead exceeds savings).
- CPU-constrained environments where every MSU counts (rare in modern shops with zIIP offload).
- Tables with already-compressed data (LOBs containing JPEGs, PDFs).
Meridian Bank compression decisions:
| Table | Compress? | Expected Ratio | Annual Storage Savings |
|---|---|---|---|
| TRANSACTION_HISTORY | YES | 62% | $52,000 |
| CUSTOMER | YES | 55% | $3,200 |
| ACCOUNT | YES | 48% | $1,800 |
| LOAN | YES | 58% | $4,100 |
| LOAN_PAYMENT | YES | 60% | $8,500 |
| EMPLOYEE | NO | ~45% | $12 (not worth it) |
| BRANCH | NO | ~40% | $3 (not worth it) |
Total annual storage savings from compression: approximately $69,600. Over 7 years: $487,200. CPU overhead at approximately $15,000/year for additional MSU consumption. Net 7-year savings: $382,200.
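The totals in that paragraph follow directly from the decision table; a quick check of the arithmetic (all figures are the chapter's planning estimates):

```python
# Net 7-year compression savings from the Meridian Bank decision table.
annual_by_table = [52_000, 3_200, 1_800, 4_100, 8_500]  # compressed tables only
annual_total = sum(annual_by_table)               # $69,600 per year
seven_year_savings = annual_total * 7             # $487,200 over retention
cpu_overhead = 15_000 * 7                         # $105,000 of added MSU cost
net_savings = seven_year_savings - cpu_overhead   # $382,200 net
print(annual_total, seven_year_savings, net_savings)
```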
14.6 Data Placement and I/O Optimization
Where data physically resides on storage devices affects performance. Even in the age of SANs and all-flash arrays, data placement matters.
14.6.1 Separating Hot and Cold Data
Not all data is accessed equally. In Meridian Bank's TRANSACTION_HISTORY:
- The current month's partition is hot — it receives all new inserts and is queried constantly by online banking.
- The previous 2-3 months are warm — queried for recent statement generation and customer service.
- Months 4-12 are cool — queried occasionally for annual statements and audit.
- Months 13+ are cold — queried rarely, primarily for regulatory compliance.
z/OS placement strategy:
Use different storage groups to direct hot and cold data to different storage tiers:
-- High-performance storage for current data
CREATE STOGROUP SG_HOT
VOLUMES ('VOL001', 'VOL002', 'VOL003', 'VOL004')
VCAT MERIDIAN;
-- Standard storage for historical data
CREATE STOGROUP SG_COLD
VOLUMES ('VOL100', 'VOL101', 'VOL102', 'VOL103')
VCAT MERIDIAN;
-- Assign partitions to storage groups
ALTER TABLESPACE MERIDIAN_DB.TS_TRANS_HIST
PART 59 USING STOGROUP SG_HOT; -- Current month
ALTER TABLESPACE MERIDIAN_DB.TS_TRANS_HIST
PART 58 USING STOGROUP SG_HOT; -- Previous month
-- Older partitions use the tablespace default STOGROUP (SG_COLD)
LUW placement strategy:
Use multiple storage paths with tablespace-level or database-level separation:
-- Fast SSD storage path for hot data
-- (Defined at database level and referenced by tablespace)
CREATE TABLESPACE TS_TRANS_CURRENT
PAGESIZE 8192
MANAGED BY AUTOMATIC STORAGE
EXTENTSIZE 32
BUFFERPOOL BP8K;
-- After partitioning, manage placement through storage groups:
CREATE STOGROUP SG_FAST ON '/db2data/ssd_pool1', '/db2data/ssd_pool2';
CREATE STOGROUP SG_STANDARD ON '/db2data/hdd_pool1', '/db2data/hdd_pool2';
ALTER TABLESPACE TS_TRANS_HIST
USING STOGROUP SG_FAST;
14.6.2 RAID Considerations
Most enterprise DB2 systems run on RAID arrays behind a SAN. The RAID level affects performance:
- RAID 1 (mirroring): Best random read performance. 2x storage cost. Good for transaction logs and high-IOPS tablespaces.
- RAID 5 (distributed parity): Good read performance, moderate write penalty. Suitable for most tablespaces.
- RAID 10 (striped mirrors): Excellent read and write performance. 2x storage cost. Ideal for transaction logs and heavily updated tablespaces.
- RAID 6 (double parity): Similar to RAID 5 but with higher write penalty. Better fault tolerance. Good for cold data.
Recommendations for Meridian Bank:
| Object Type | RAID Level | Rationale |
|---|---|---|
| Transaction logs | RAID 10 | Write-intensive, latency-critical |
| Hot tablespaces | RAID 10 | High IOPS, mixed read/write |
| Warm tablespaces | RAID 5 | Read-mostly, cost-effective |
| Cold tablespaces | RAID 6 | Archival, fault-tolerant, lowest cost/GB |
| Temp tablespaces | RAID 10 | Sort-heavy workloads need write performance |
14.6.3 Flash and SSD Placement
Solid-state storage has transformed I/O optimization. Where spinning disks deliver 150-200 IOPS, an enterprise SSD delivers 50,000-100,000 IOPS. For random read workloads (OLTP index lookups), this difference is decisive.
What to put on flash/SSD:
- Active transaction log data sets (z/OS) or log files (LUW).
- Hot tablespace partitions (current month's transactions).
- Frequently accessed indexes, especially clustering indexes on large tables.
- System catalog tablespaces (high metadata access during SQL compilation).
- Temporary tablespaces if sort-heavy workloads dominate.
What can stay on spinning disk:
- Cold archival partitions.
- Backup copies and archive logs.
- LOB data that is infrequently accessed.
On z/OS with DS8900 or equivalent storage, IBM's Easy Tier feature can automatically migrate hot extents to SSD and cold extents to HDD within the same storage pool. This reduces the DBA's placement burden but does not eliminate the need for deliberate tablespace-level separation.
14.7 LOB Management
Large Object columns store unstructured data — documents, images, audio, XML payloads. Meridian National Bank stores customer documents (loan applications, identity verification scans, signed agreements) as BLOBs and customer correspondence as CLOBs.
14.7.1 LOB Storage Architecture
z/OS:
- LOB data is stored in a dedicated LOB tablespace, separate from the base table.
- Each LOB value is stored in a contiguous chain of 32 KB pages.
- The base table row contains an 8-byte LOB locator (pointer), not the data itself.
- A single LOB value can be up to 2 GB.
LUW:
- LOB data can be stored inline (in the base table row) if it is small enough, or in a separate large tablespace.
- Inline LOBs are stored in the same tablespace as the base table, within the row data. The default inline length is 0 bytes (no inlining); you can raise it to approximately (page_size - row_overhead).
- Non-inline LOBs are stored in the tablespace named in the table's LONG IN clause, or in the table's own tablespace if no LONG IN clause is specified.
14.7.2 Inline LOBs on LUW — A Performance Optimization
For LOBs that are typically small (under 1-2 KB), inline storage avoids the overhead of a separate LOB read I/O:
CREATE TABLE MERIDIAN.CUSTOMER_NOTES (
NOTE_ID BIGINT NOT NULL,
CUSTOMER_ID BIGINT NOT NULL,
NOTE_DATE TIMESTAMP NOT NULL,
NOTE_TEXT CLOB(32000) INLINE LENGTH 2000,
NOTE_TYPE CHAR(4),
PRIMARY KEY (NOTE_ID)
)
IN TS_CUST_NOTES;
The INLINE LENGTH 2000 clause tells DB2 to store LOBs up to 2,000 bytes directly in the base table row. LOBs larger than 2,000 bytes overflow to the large tablespace. This is highly effective when most values are small but occasional values are large — a pattern common for customer notes and correspondence.
Inline LOB trade-offs:
- Pro: Eliminates additional I/O for small LOBs. Row retrieval is a single page read.
- Pro: Improves sequential scan performance for tables with many small LOBs.
- Con: Increases average row size, reducing rows-per-page for the base table.
- Con: Buffer pool pages contain LOB data, which may not be the best use of limited memory.
14.7.3 LOB Tablespace Sizing
LOB tablespaces tend to grow unpredictably because LOB sizes are highly variable. A customer document might be 50 KB (a short text agreement) or 5 MB (a multi-page scanned document).
Sizing approach:
1. Profile the expected LOB sizes: average, median, 95th percentile, maximum.
2. Estimate the number of LOB values over the capacity planning horizon.
3. Apply the formula: Total LOB Storage = Count * Average_LOB_Size * 1.1 (overhead).
4. Add generous growth margin (50-100%) because LOB sizes tend to increase over time as resolution increases and document formats evolve.
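The four steps reduce to simple arithmetic. A sketch, checked against the loan-application figures used in the estimate (350 KB average size, 200,000 values, decimal units as in the table):

```python
# LOB tablespace sizing: count * average size * 1.1 overhead (decimal GB).
def lob_storage_gb(count, avg_size_kb, overhead=1.10):
    raw_gb = count * avg_size_kb / 1_000_000    # KB -> GB, decimal units
    return raw_gb, raw_gb * overhead

raw, sized = lob_storage_gb(200_000, 350)       # loan applications
print(f"raw {raw:.0f} GB, with overhead {sized:.0f} GB")
```

Remember that step 4's growth margin (50-100%) is applied on top of this figure when provisioning.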
Meridian Bank LOB estimate:
| LOB Use Case | Avg Size | Count (Year 1) | Raw Storage | With Overhead |
|---|---|---|---|---|
| Loan applications | 350 KB | 200,000 | 70 GB | 77 GB |
| Identity scans | 1.2 MB | 500,000 | 600 GB | 660 GB |
| Customer correspondence | 15 KB | 2,000,000 | 30 GB | 33 GB |
| Signed agreements | 200 KB | 300,000 | 60 GB | 66 GB |
| Total | | | 760 GB | 836 GB |
LOB data dominates the storage footprint. This is typical in modern banking — unstructured data grows faster than structured data.
14.8 Temporary Table Strategies
DB2 uses temporary storage for sort operations, hash joins, intermediate query results, and application-declared temporary tables. Proper sizing and management of temporary space prevents costly sort overflows and application failures.
14.8.1 System Temporary Tablespaces
DB2 uses system temporary tablespaces for internal operations. On both z/OS and LUW, these are created during database setup and sized based on workload:
z/OS work files:
CREATE TABLESPACE TEMPDB01
IN DSNDB07
USING STOGROUP SGTEMP
SEGSIZE 32
BUFFERPOOL BP32K1
LOCKSIZE TABLE
CLOSE YES;
z/OS DB2 uses the DSNDB07 database for work files. The work file tablespace should use 32 KB pages for efficient sort I/O. Size it based on the largest concurrent sort requirement — typically the largest batch job that performs ORDER BY, GROUP BY, or DISTINCT on a large result set.
LUW system temporary tablespace:
CREATE SYSTEM TEMPORARY TABLESPACE TEMPSPACE_LARGE
PAGESIZE 32768
MANAGED BY AUTOMATIC STORAGE
EXTENTSIZE 64
PREFETCHSIZE AUTOMATIC
BUFFERPOOL BP32K_TEMP;
14.8.2 Declared Global Temporary Tables (DGTT)
DGTTs are session-scoped temporary tables that exist only for the duration of a database connection. They are invaluable for complex multi-step processing:
-- z/OS and LUW syntax is identical
DECLARE GLOBAL TEMPORARY TABLE SESSION.TEMP_DAILY_TOTALS (
BRANCH_ID INTEGER,
TRANS_DATE DATE,
TOTAL_DEBITS DECIMAL(15,2),
TOTAL_CREDITS DECIMAL(15,2),
TRANS_COUNT INTEGER
)
ON COMMIT PRESERVE ROWS
NOT LOGGED;
Key characteristics:
- Data is visible only to the declaring session.
- No catalog entries (no concurrency contention on catalog tables).
- NOT LOGGED — changes are not written to the transaction log, which is dramatically faster for bulk inserts.
- ON COMMIT PRESERVE ROWS — data survives COMMIT (alternative: ON COMMIT DELETE ROWS).
- Requires a user temporary tablespace.
14.8.3 Created Global Temporary Tables (CGTT)
CGTTs have a persistent definition in the catalog but session-scoped data:
CREATE GLOBAL TEMPORARY TABLE MERIDIAN.GTT_STATEMENT_WORK (
ACCOUNT_ID BIGINT,
TRANS_DATE DATE,
AMOUNT DECIMAL(15,2),
DESCRIPTION VARCHAR(200),
RUNNING_BAL DECIMAL(15,2)
)
ON COMMIT PRESERVE ROWS;
DGTT vs. CGTT — when to use each:
| Feature | DGTT | CGTT |
|---|---|---|
| Catalog entry | No | Yes |
| Definition persistence | Session only | Permanent |
| Statistics | Not available | Can collect RUNSTATS |
| Optimizer awareness | Limited | Full |
| Suited for | Ad hoc processing | Repeated patterns |
For Meridian Bank's month-end batch processing, we use a CGTT for the statement generation working table because the optimizer needs accurate statistics to choose the best join strategy. For ad hoc customer service queries, we use DGTTs.
14.8.4 Temp Space Sizing
Undersized temporary space causes sort overflows (DB2 spills to disk or fails outright) and SQLCODE -904 errors. Oversized temporary space wastes storage and memory.
Sizing formula:
Temp Space = MAX(Largest_Sort_Requirement, Concurrent_Sort_Sum)
Largest_Sort_Requirement =
Rows_in_Sort * (Sort_Key_Length + Pointer_Size + Overhead)
/ Page_Usable_Bytes * Page_Size
Concurrent_Sort_Sum =
Sum of all simultaneous sort requirements during peak processing
For Meridian Bank:
- Largest sort: Month-end statement generation sorting 500M transactions by ACCOUNT_ID, TRANS_DATE. Sort key = 8 + 4 = 12 bytes. With overhead: ~20 bytes per row. 500M * 20 = 10 GB sort space needed.
- Concurrent sorts during peak: 10 GB (statement) + 2 GB (batch reporting) + 1 GB (online queries) = 13 GB.
- Allocate a 20 GB temporary tablespace to provide headroom.
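The sizing above in code form; the per-row overhead and concurrency figures are the estimates from this section:

```python
# Temp space sizing: MAX(largest single sort, sum of concurrent sorts).
def sort_gb(rows, bytes_per_row):
    return rows * bytes_per_row / 1_000_000_000   # decimal GB

largest = sort_gb(500_000_000, 20)    # month-end statement sort: 10 GB
concurrent = largest + 2.0 + 1.0      # + batch reporting + online queries
required = max(largest, concurrent)   # 13 GB at peak
allocated = 20.0                      # headroom above the peak requirement
print(largest, required, allocated)
```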
14.9 Storage Estimation and Capacity Planning
Accurate storage estimation prevents two equally bad outcomes: running out of space (database goes down) and over-provisioning (money wasted). The calculation is mechanical but requires precision.
14.9.1 Row Size Calculation
The actual storage size of a row differs from the sum of column lengths due to overhead:
z/OS row overhead:
- Row header: 6 bytes (standard) or 8 bytes (with ROWID).
- NULL indicator byte: 1 byte per 8 nullable columns (rounded up).
- VARCHAR length prefix: 2 bytes per VARCHAR column.
- VARCHAR actual data: variable.
- Row padding for alignment: 0-7 bytes depending on data types.
LUW row overhead:
- Row header: 8-10 bytes.
- NULL indicator bitmap: 1 byte per 8 nullable columns.
- VARCHAR length prefix: 2 bytes per VARCHAR column (4 bytes for VARCHAR > 255).
- Column offsets for variable-length columns: 2 bytes each.
Example: Meridian Bank CUSTOMER table row size:
Column Type Fixed/Var Size
------ ---- --------- ----
CUSTOMER_ID BIGINT Fixed 8
SSN_HASH CHAR(64) Fixed 64
FIRST_NAME VARCHAR(50) Variable 2 + avg 12
LAST_NAME VARCHAR(50) Variable 2 + avg 14
MIDDLE_NAME VARCHAR(30) Variable 2 + avg 6
DATE_OF_BIRTH DATE Fixed 4
EMAIL VARCHAR(100) Variable 2 + avg 28
PHONE_PRIMARY VARCHAR(20) Variable 2 + avg 12
PHONE_SECONDARY VARCHAR(20) Variable 2 + avg 10
ADDRESS_LINE_1 VARCHAR(100) Variable 2 + avg 30
ADDRESS_LINE_2 VARCHAR(100) Variable 2 + avg 10
CITY VARCHAR(50) Variable 2 + avg 12
STATE_CODE CHAR(2) Fixed 2
ZIP_CODE VARCHAR(10) Variable 2 + avg 6
COUNTRY_CODE CHAR(3) Fixed 3
CUSTOMER_SINCE DATE Fixed 4
CUSTOMER_TYPE CHAR(1) Fixed 1
STATUS CHAR(1) Fixed 1
RISK_RATING SMALLINT Fixed 2
CREDIT_SCORE SMALLINT Fixed 2
KYC_STATUS CHAR(1) Fixed 1
KYC_DATE DATE Fixed 4
LAST_UPDATED TIMESTAMP Fixed 10
CREATED_TS TIMESTAMP Fixed 10
--------------------------------------------------------------
Row header (z/OS) 6
NULL indicators (8 nullable cols) 1
--------------------------------------------------------------
Total (average) ~420 bytes
Total (maximum) ~680 bytes
14.9.2 Pages Required
Once you have the average row size, calculate pages:
Usable_Bytes_Per_Page = Page_Size - Page_Overhead
z/OS: 8192 - 22 = 8170 bytes (approximate, varies by tablespace type)
LUW: 8192 - 91 = 8101 bytes
Rows_Per_Page = FLOOR(Usable_Bytes_Per_Page / Average_Row_Size)
CUSTOMER: FLOOR(8170 / 420) = 19 rows/page (z/OS)
CUSTOMER: FLOOR(8101 / 420) = 19 rows/page (LUW)
Effective_Rows_Per_Page = Rows_Per_Page * (1 - PCTFREE/100)
With PCTFREE 10: 19 * 0.90 = 17 effective rows/page
Pages_Required = CEIL(Total_Rows / Effective_Rows_Per_Page)
CUSTOMER: CEIL(2,000,000 / 17) = 117,648 pages
Storage = Pages_Required * Page_Size
CUSTOMER: 117,648 * 8192 = 963 MB ≈ 0.94 GB
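The full chain of page arithmetic is easy to script. A sketch using the z/OS CUSTOMER figures above:

```python
import math

# Pages and storage for CUSTOMER: 8 KB pages, ~22 bytes page overhead (z/OS),
# 420-byte average row, PCTFREE 10, 2 million rows.
usable = 8192 - 22                                     # 8170 usable bytes/page
rows_per_page = usable // 420                          # 19
effective = math.floor(rows_per_page * (1 - 10/100))   # 17 with PCTFREE 10
pages = math.ceil(2_000_000 / effective)               # 117,648
storage_bytes = pages * 8192                           # ~0.96 GB (decimal)
print(rows_per_page, effective, pages, storage_bytes)
```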
14.9.3 PCTFREE — Leaving Room for Growth
PCTFREE specifies the percentage of each page to leave free during LOAD and REORG operations. This free space accommodates row growth from UPDATE operations (VARCHAR columns getting longer) and new rows inserted in clustering sequence.
PCTFREE guidelines:
| Workload | PCTFREE | Rationale |
|---|---|---|
| Insert-only (append) | 0-5 | Rows never grow; new inserts go to end |
| Light updates, short VARCHARs | 5-10 | Modest row growth |
| Heavy updates, growing VARCHARs | 10-20 | Rows grow significantly |
| Clustering maintenance | 15-20 | Space for inserts in clustering sequence |
For Meridian Bank:
- TRANSACTION_HISTORY: PCTFREE 0 (insert-only, never updated).
- CUSTOMER: PCTFREE 10 (occasional address and contact updates).
- ACCOUNT: PCTFREE 10 (balance updates, but balance is fixed-length DECIMAL so row size does not change; the 10% is for clustering maintenance).
- LOAN: PCTFREE 15 (status changes, note updates).
14.9.4 Growth Projections
Storage is not static. Capacity planning requires multi-year projections:
Meridian Bank 5-year storage projection:
Year 1 Baseline:
CUSTOMER: 2.0M rows → 0.94 GB
ACCOUNT: 5.5M rows → 1.23 GB
TRANSACTION_HISTORY: 500M rows → 115.0 GB (compressed: ~44 GB)
LOAN: 400K rows → 0.18 GB
LOAN_PAYMENT: 4.8M rows → 0.52 GB
LOB data: → 836.0 GB
Indexes (est. 30% of data): → 35.0 GB
Temp space: → 20.0 GB
─────────────────────────────────────────
Year 1 Total:                → ~938 GB
Growth Assumptions:
Customers: +200K/year (10% growth)
Accounts: +550K/year (10% growth)
Transactions: +50M/year (10% growth)
LOB data: +100 GB/year (12% growth — resolution increases)
Year 5 Projection:
CUSTOMER: 2.8M rows → 1.32 GB
ACCOUNT: 7.7M rows → 1.72 GB
TRANSACTION_HISTORY: 2.75B rows → 632.5 GB (compressed: ~240 GB)
LOAN: 600K rows → 0.27 GB
LOAN_PAYMENT: 26.8M rows → 2.91 GB
LOB data: → 1,236 GB
Indexes: → 88.0 GB
Temp space: → 25.0 GB
─────────────────────────────────────────
Year 5 Total: → ~1,595 GB (compressed data)
→ ~2,400 GB (if uncompressed)
Compression saves approximately 805 GB over 5 years. This is the number that justifies the compression investment to management.
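Projections like this scale per-row storage from the baseline year. A sketch for the TRANSACTION_HISTORY line, using the Year-1 figures as the basis:

```python
# Year-5 TRANSACTION_HISTORY storage from Year-1 density and projected rows.
gb_per_m_rows = 115.0 / 500               # Year 1: 115 GB over 500M rows
year5_rows_m = 2_750                      # 2.75B rows projected
year5_uncompressed = year5_rows_m * gb_per_m_rows   # 632.5 GB
year5_compressed = year5_uncompressed * (1 - 0.62)  # ~240 GB at 62% ratio
print(year5_uncompressed, round(year5_compressed))
```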
14.9.5 Storage Allocation Parameters
z/OS — PRIQTY and SECQTY:
CREATE TABLESPACE TS_CUSTOMER
IN MERIDIAN_DB
USING STOGROUP SGMERIDIAN
PRIQTY 960000 -- Primary allocation in KB (~938 MB)
SECQTY 96000 -- Secondary allocation in KB (~94 MB)
-- ...
PRIQTY is the initial space allocation. SECQTY is the size of each secondary extent allocated when the primary is exhausted. Excessive secondary extents fragment the data set and degrade sequential access.
Rule of thumb: Set PRIQTY to the expected table size at deployment. Set SECQTY to 10% of PRIQTY. Monitor and adjust before the data set approaches the VSAM extent limit (255 extents for a non-extended-format data set).
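The rule of thumb as arithmetic; 937.5 MB is the deployment-time CUSTOMER estimate that rounds into the DDL figures above:

```python
# PRIQTY = expected size at deployment (in KB); SECQTY = 10% of PRIQTY.
def allocation_kb(expected_mb):
    priqty = int(expected_mb * 1024)   # MB -> KB
    secqty = priqty // 10
    return priqty, secqty

priqty, secqty = allocation_kb(937.5)  # matches PRIQTY 960000 / SECQTY 96000
print(priqty, secqty)
```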
LUW — AUTORESIZE:
With automatic storage, LUW handles allocation automatically. The key parameter is MAXSIZE:
CREATE TABLESPACE TS_CUSTOMER
MANAGED BY AUTOMATIC STORAGE
AUTORESIZE YES
INITIALSIZE 1G
INCREASESIZE 100M
MAXSIZE 50G;
AUTORESIZE YES allows DB2 to grow the tablespace automatically. MAXSIZE prevents runaway growth from consuming all available storage — a safety valve that has saved many production systems from uncontrolled data loads.
14.10 Physical Design for Different Workloads
No single physical design is optimal for all workloads. The right design depends on the access patterns, concurrency requirements, and operational constraints of your specific system.
14.10.1 OLTP Design Patterns
Online Transaction Processing workloads are characterized by:
- Short, simple transactions (single-row lookups, small inserts).
- High concurrency (thousands of simultaneous users).
- Low latency requirements (sub-second response times).
- Random I/O patterns (index lookups scatter across the tablespace).
Physical design for OLTP:
| Parameter | Recommended Value | Rationale |
|---|---|---|
| Page size | 4K or 8K | Minimize buffer pool waste for single-row access |
| LOCKSIZE | ROW | Maximize concurrency |
| PCTFREE | 5-10 | Moderate free space for updates |
| Compression | YES (adaptive) | More data in buffer pool = higher hit ratio |
| CLOSE | NO (z/OS) | Keep data sets open to avoid OPEN overhead |
| Buffer pool | Large, dedicated | High hit ratio critical for OLTP |
| EXTENTSIZE (LUW) | 16-32 | Moderate — sequential prefetch rare |
14.10.2 Batch Processing Design Patterns
Batch workloads are characterized by:
- Large sequential scans (full table scans, range scans).
- Bulk INSERT/UPDATE/DELETE operations.
- Single-user or low-concurrency execution.
- Throughput-oriented (total elapsed time matters, not individual response time).
Physical design for batch:
| Parameter | Recommended Value | Rationale |
|---|---|---|
| Page size | 8K, 16K, or 32K | More rows per I/O for sequential scans |
| LOCKSIZE | TABLE or PAGE | Reduce lock overhead for bulk operations |
| PCTFREE | 0-5 | Minimize wasted space; batch rarely updates |
| Compression | YES | Reduce I/O volume for large scans |
| Prefetch | Large prefetch size | Sequential prefetch reads ahead aggressively |
| EXTENTSIZE (LUW) | 32-64 | Align with prefetch for efficient I/O |
14.10.3 Mixed Workload Design
Most real systems — including Meridian National Bank — run mixed workloads: OLTP during the day, batch at night, reporting throughout. This requires compromises:
Strategy 1: Time-based separation. Use the same physical design but change DB2 configuration (buffer pool sizes, prefetch thresholds) between OLTP and batch windows. This is simple but limits flexibility.
Strategy 2: Workload-specific tablespaces. Place the same data in different tablespaces optimized for different access patterns. This works for read-only data (materialized query tables for reporting) but not for the base transaction data.
Strategy 3: Partitioning as the mediator. This is the most practical approach for Meridian Bank:
- Current month's partition: OLTP-optimized (small PCTFREE, ROW locking, in a hot buffer pool).
- Historical partitions: batch/reporting-optimized (no free space needed, can tolerate TABLE locking during batch, compressed heavily).
- The partitioning key (TRANS_DATE) naturally separates hot and cold access patterns.
14.10.4 Data Warehouse Considerations
If Meridian Bank feeds a data warehouse (and it should), the warehouse physical design differs substantially:
| Parameter | Warehouse Value | Rationale |
|---|---|---|
| Page size | 32K | Maximum rows per I/O; row sizes are large after denormalization |
| LOCKSIZE | TABLE | Single-user loads; concurrent reads use uncommitted read |
| PCTFREE | 0 | Data is loaded, not updated |
| Compression | YES (aggressive) | Storage savings critical for multi-TB warehouse |
| Partitioning | Range by date | Align with ETL load frequency |
| MQTs/Summary tables | Extensive | Pre-computed aggregations for dashboard queries |
14.11 The Meridian Bank Physical Design — Putting It All Together
This section synthesizes everything in the chapter into a complete, production-ready physical design for Meridian National Bank. Every decision references the principles discussed above.
14.11.1 Environment Summary
| Parameter | z/OS Environment | LUW Environment |
|---|---|---|
| DB2 Version | Db2 13 for z/OS | Db2 11.5.8 for LUW |
| Platform | z16 LPAR, 10 GCP, 8 zIIP | x86_64, 32 cores, 256 GB RAM |
| Storage | DS8950, 200 TB, Easy Tier | NetApp AFF-A800, 100 TB NVMe |
| Use case | Core banking | Digital banking |
| Users | 2,000 concurrent | 50,000 concurrent (API) |
14.11.2 z/OS Physical Design
Storage groups:
CREATE STOGROUP SG_MERI_HOT
VOLUMES ('HOT01', 'HOT02', 'HOT03', 'HOT04')
VCAT MERIDIAN;
CREATE STOGROUP SG_MERI_STD
VOLUMES ('STD01', 'STD02', 'STD03', 'STD04',
'STD05', 'STD06', 'STD07', 'STD08')
VCAT MERIDIAN;
CREATE STOGROUP SG_MERI_COLD
VOLUMES ('CLD01', 'CLD02', 'CLD03', 'CLD04')
VCAT MERIDIAN;
CREATE STOGROUP SG_MERI_TEMP
VOLUMES ('TMP01', 'TMP02')
VCAT MERIDIAN;
Database:
CREATE DATABASE MERIDIAN_DB
STOGROUP SG_MERI_STD
BUFFERPOOL BP8K0
INDEXBP BP8K1
CCSID UNICODE;
Tablespace layout (z/OS):
| Tablespace | Type | DSSIZE | SEGSIZE | BP | COMPRESS | LOCKSIZE | Notes |
|---|---|---|---|---|---|---|---|
| TS_CUSTOMER | PBG UTS | 4G | 32 | BP8K0 | YES | ROW | MaxParts 128 |
| TS_ACCOUNT | PBG UTS | 4G | 32 | BP8K0 | YES | ROW | MaxParts 128 |
| TS_TRANS_HIST | PBR UTS | 16G | 32 | BP8K0 | YES | ROW | 60 parts, monthly |
| TS_LOAN | PBG UTS | 4G | 32 | BP8K0 | YES | ROW | MaxParts 64 |
| TS_LOAN_PMT | PBR UTS | 4G | 32 | BP4K0 | YES | ROW | 60 parts, monthly |
| TS_EMPLOYEE | PBG UTS | 4G | 32 | BP4K0 | NO | ROW | MaxParts 4, small |
| TS_REFERENCE | PBG UTS | 4G | 32 | BP4K0 | NO | ROW | MaxParts 4, multi-table |
| TS_CUST_DOCS | LOB | 16G | n/a | BP32K1 | NO | LOB | Customer documents |
| TS_TEMP01 | Work | n/a | 32 | BP32K2 | NO | TABLE | In DSNDB07 |
14.11.3 LUW Physical Design
Buffer pools:
CREATE BUFFERPOOL BP4K PAGESIZE 4096 SIZE 25000 AUTOMATIC;
CREATE BUFFERPOOL BP8K PAGESIZE 8192 SIZE 100000 AUTOMATIC;
CREATE BUFFERPOOL BP8K_IDX PAGESIZE 8192 SIZE 50000 AUTOMATIC;
CREATE BUFFERPOOL BP32K PAGESIZE 32768 SIZE 15000 AUTOMATIC;
CREATE BUFFERPOOL BP8K_TEMP PAGESIZE 8192 SIZE 20000 AUTOMATIC;
CREATE BUFFERPOOL BP32K_TEMP PAGESIZE 32768 SIZE 10000 AUTOMATIC;
CREATE BUFFERPOOL BP8K_UTEMP PAGESIZE 8192 SIZE 10000 AUTOMATIC;
Storage groups:
CREATE STOGROUP SG_HOT ON '/db2data/nvme_pool1', '/db2data/nvme_pool2';
CREATE STOGROUP SG_STANDARD ON '/db2data/ssd_pool1', '/db2data/ssd_pool2';
CREATE STOGROUP SG_INDEX ON '/db2data/nvme_pool3';
CREATE STOGROUP SG_LOB ON '/db2data/hdd_pool1', '/db2data/hdd_pool2';
CREATE STOGROUP SG_TEMP ON '/db2data/nvme_pool4';
Tablespace layout (LUW):
-- Customer tablespace
CREATE TABLESPACE TS_CUSTOMER
IN DATABASE PARTITION GROUP IBMDEFAULTGROUP
PAGESIZE 8192
MANAGED BY AUTOMATIC STORAGE
USING STOGROUP SG_HOT
EXTENTSIZE 32
PREFETCHSIZE AUTOMATIC
BUFFERPOOL BP8K
NO FILE SYSTEM CACHING
AUTORESIZE YES
MAXSIZE 50G;
-- Transaction history (range-partitioned)
CREATE TABLESPACE TS_TRANS_HIST
PAGESIZE 8192
MANAGED BY AUTOMATIC STORAGE
USING STOGROUP SG_HOT
EXTENTSIZE 32
PREFETCHSIZE AUTOMATIC
BUFFERPOOL BP8K
NO FILE SYSTEM CACHING
AUTORESIZE YES
MAXSIZE 500G;
-- Reference data
CREATE TABLESPACE TS_REFERENCE
PAGESIZE 4096
MANAGED BY AUTOMATIC STORAGE
USING STOGROUP SG_STANDARD
EXTENTSIZE 8
PREFETCHSIZE AUTOMATIC
BUFFERPOOL BP4K
AUTORESIZE YES
MAXSIZE 5G;
-- Temporary tablespaces
CREATE SYSTEM TEMPORARY TABLESPACE TEMPSPACE_8K
PAGESIZE 8192
MANAGED BY AUTOMATIC STORAGE
USING STOGROUP SG_TEMP
EXTENTSIZE 64
PREFETCHSIZE AUTOMATIC
BUFFERPOOL BP8K_TEMP;
CREATE SYSTEM TEMPORARY TABLESPACE TEMPSPACE_32K
PAGESIZE 32768
MANAGED BY AUTOMATIC STORAGE
USING STOGROUP SG_TEMP
EXTENTSIZE 64
PREFETCHSIZE AUTOMATIC
BUFFERPOOL BP32K_TEMP;
CREATE USER TEMPORARY TABLESPACE USERTEMP_8K
PAGESIZE 8192
MANAGED BY AUTOMATIC STORAGE
USING STOGROUP SG_TEMP
EXTENTSIZE 32
BUFFERPOOL BP8K_UTEMP;
14.11.4 Complete Storage Budget
| Object Category | z/OS (Compressed) | z/OS (Uncompressed) | LUW (Compressed) | LUW (Uncompressed) |
|---|---|---|---|---|
| Base table data | 52 GB | 120 GB | 52 GB | 120 GB |
| Indexes | 36 GB | 36 GB | 36 GB | 36 GB |
| LOB data | 836 GB | 836 GB | 836 GB | 836 GB |
| Temp space | 20 GB | 20 GB | 20 GB | 20 GB |
| Catalog/directory | 2 GB | 2 GB | 2 GB | 2 GB |
| Log space | 50 GB | 50 GB | 50 GB | 50 GB |
| Total Year 1 | 996 GB | 1,064 GB | 996 GB | 1,064 GB |
| Total Year 5 | 1,595 GB | 2,400 GB | 1,595 GB | 2,400 GB |
The compression decision saves approximately 805 GB of storage by Year 5, on the order of $240,000 cumulatively at a fully burdened enterprise storage cost of roughly $100/GB/year.
14.11.5 Design Decision Log
Every physical design decision should be documented. Here is the decision log for Meridian Bank:
| Decision | Choice | Rationale | Alternatives Considered |
|---|---|---|---|
| Page size for CUSTOMER | 8K | 420-byte avg row; 19 rows/page; good buffer fit | 4K (too few rows), 16K (waste) |
| Partitioning key for TRANSACTION_HISTORY | TRANS_DATE (monthly) | 80% of queries filter by date; monthly lifecycle | ACCOUNT_ID (no date elimination) |
| Compression for TRANSACTION_HISTORY | YES (adaptive) | 62% savings on 115 GB; $52K/year savings | NO (saves CPU but wastes $52K) |
| LOB storage strategy | Separate LOB tablespace | Large LOBs (avg 350KB-1.2MB); inline not suitable | Inline (only suits small LOBs) |
| Tablespace type for CUSTOMER (z/OS) | PBG UTS | Unpredictable growth; no need for partition ops | PBR (unnecessary complexity) |
| Tablespace type for TRANS_HIST (z/OS) | PBR UTS | Need partition-level REORG, COPY, archival | PBG (loses partition operations) |
| LOCKSIZE for OLTP tables | ROW | High concurrency (2K users); minimize contention | PAGE (too much contention) |
| Temp space sizing | 20 GB | Largest sort: 10 GB; concurrent peak: 13 GB | 10 GB (too tight for peak) |
| PCTFREE for TRANSACTION_HISTORY | 0 | Insert-only; never updated | 10 (wastes 11.5 GB per year) |
| PCTFREE for CUSTOMER | 10 | Address/contact updates grow VARCHAR columns | 0 (row migration risk) |
Spaced Review: Concepts from Earlier Chapters
This section reinforces critical concepts from Chapters 3, 11, and 13 that directly support physical design work.
From Chapter 3: DB2 Architecture
Review Question 1: In the z/OS address space model, which address space handles buffer pool management and data I/O?
Answer: The DBM1 (Database Services) address space owns the buffer pools and handles all data page reads and writes. When we assign a tablespace to buffer pool BP8K0, that assignment determines which region of DBM1's virtual storage will cache this tablespace's pages. The size of DBM1's region limit directly constrains how large our buffer pools can be — a physical design consideration.
Review Question 2: On LUW, what is the relationship between an instance, a database, and a tablespace?
Answer: An instance is a collection of processes and memory that manages one or more databases. A database contains tablespaces. A tablespace contains tables and indexes. When we create tablespace TS_CUSTOMER in the MERIDIAN_DIGITAL database, we are allocating storage within the database's container files, managed by the instance's buffer pools. The instance's DBHEAP and CATALOGCACHE_SZ parameters determine how much memory is available for catalog metadata, which grows with each tablespace we create.
From Chapter 11: Data Types and Domains
Review Question 3: Why does the choice between CHAR(10) and VARCHAR(10) matter for physical design?
Answer: CHAR(10) always consumes exactly 10 bytes per row, regardless of the actual value. VARCHAR(10) consumes 2 bytes (length prefix) + actual data length. For a column that is almost always 10 characters, CHAR is more space-efficient (10 vs. 12 bytes). For a column that averages 4 characters, VARCHAR is more efficient (6 vs. 10 bytes). Across 2 million CUSTOMER rows, choosing VARCHAR for the STATE_CODE column (always 2 characters) would waste 4 MB (2 extra bytes * 2M rows) compared to CHAR(2). Conversely, choosing CHAR for EMAIL (average 28 characters plus a 2-byte prefix as VARCHAR, max 100) would waste 140 MB (70 extra bytes * 2M rows). Data type choice is physical design.
Review Question 4: How does DECIMAL precision affect storage?
Answer: DB2 stores DECIMAL(p,s) in packed decimal format: CEIL((p+1)/2) bytes. DECIMAL(15,2) — our AMOUNT column — uses 8 bytes. DECIMAL(7,2) would use only 4 bytes. If AMOUNT never exceeds $99,999.99, we could use DECIMAL(7,2) and save 4 bytes per row * 500M rows/year = 2 GB/year. But financial systems must accommodate large values (wire transfers, commercial loans), so DECIMAL(15,2) is the correct choice despite the extra storage.
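The packed-decimal rule is easy to verify in a few lines:

```python
import math

# DECIMAL(p,s) occupies ceil((p + 1) / 2) bytes, independent of the scale s.
def packed_decimal_bytes(precision):
    return math.ceil((precision + 1) / 2)

amount_15 = packed_decimal_bytes(15)   # 8 bytes: DECIMAL(15,2)
amount_7 = packed_decimal_bytes(7)     # 4 bytes: DECIMAL(7,2)
savings_gb = (amount_15 - amount_7) * 500_000_000 / 1_000_000_000
print(amount_15, amount_7, savings_gb)   # 2.0 GB/year at 500M rows/year
```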
From Chapter 13: Logical Design
Review Question 5: How does normalization level affect physical design?
Answer: A fully normalized (3NF) model produces more tables with fewer columns each — meaning more tablespaces, more joins at query time, but smaller row sizes and less data redundancy. A denormalized model produces fewer, wider tables — fewer tablespaces, fewer joins, but larger rows and data redundancy that must be managed. For Meridian Bank, we kept the 3NF design for OLTP and will build denormalized summary tables (materialized query tables) for reporting. This is a physical design decision driven by logical design principles.
Chapter Summary
Physical database design translates a logical model into a performing, maintainable, cost-effective system. The key decisions are:
- Tablespace type selection. On z/OS, use Universal Table Spaces — PBG for unpredictable growth, PBR for large tables needing partition-level operations. On LUW, use automatic storage with appropriate page sizes. Migrate away from legacy tablespace types (segmented on z/OS, SMS on LUW).
- Partitioning. Range partition large tables by date. Align partition boundaries with query patterns, maintenance windows, and data lifecycle policies. The TRANSACTION_HISTORY table at Meridian Bank uses monthly partitions — the most common and most effective pattern in transaction-processing systems.
- Compression. Enable compression on all tables larger than a few thousand pages. The ROI is almost always positive: storage savings and I/O reduction outweigh the modest CPU overhead. For Meridian Bank, compression saves approximately $70,000 per year in storage costs.
- Page size and free space. Choose page size based on row length and access pattern. Set PCTFREE based on update frequency and row-growth potential. Insert-only tables (TRANSACTION_HISTORY) need no free space; frequently updated tables (CUSTOMER, LOAN) need 10-20%.
- Storage placement. Separate hot and cold data onto appropriate storage tiers. Put transaction logs and active partitions on flash/SSD. Put archival partitions on cost-effective spinning disk.
- Capacity planning. Calculate storage requirements with precision: row size, rows per page, PCTFREE, growth projections. Present 5-year projections to management to secure adequate storage budgets before you run out of space.
- Documentation. Every physical design decision should be recorded with its rationale and alternatives considered. The design decision log is a living document that future DBAs will consult when they need to understand why the system is structured the way it is.
Physical design is not glamorous work. There are no elegant algorithms or clever SQL tricks. But it is the foundation on which everything else rests. Get it right, and your system runs smoothly for years. Get it wrong, and you will spend those years firefighting storage emergencies, performance crises, and maintenance nightmares.
In the next chapter, we turn to the other half of the physical design equation: index design. If tablespace design determines where the data lives, index design determines how fast you can find it.
Key Terms Introduced in This Chapter
| Term | Definition |
|---|---|
| Universal Table Space (UTS) | z/OS tablespace type that supports both PBG and PBR modes; the modern standard |
| Partition-by-Growth (PBG) | UTS mode where DB2 automatically adds partitions as data grows |
| Partition-by-Range (PBR) | UTS mode where data is distributed across partitions based on key-range boundaries |
| DSSIZE | z/OS parameter defining the maximum size of each partition's VSAM data set |
| Automatic Storage | LUW storage model where DB2 manages container creation, sizing, and growth automatically |
| Extent | LUW unit of space allocation; a fixed number of contiguous pages |
| Row compression | Dictionary-based compression that replaces repeating byte patterns with short symbols |
| Adaptive compression | Additional page-level compression on top of row compression (z/OS 11+, LUW 10.5+) |
| PCTFREE | Percentage of page space left free during LOAD/REORG for future row growth |
| PRIQTY / SECQTY | z/OS parameters for primary and secondary space allocation of VSAM data sets |
| Partition elimination | Optimizer technique that skips partitions that cannot contain qualifying rows |
| Inline LOB | LUW feature storing small LOB values directly in the base table row |
| SEGSIZE | z/OS parameter defining the number of pages per segment in a tablespace |
Next: Chapter 15 — Index Design: B+ Trees, Clustering, Composite Indexes, and Index-Only Access