Case Study 30.1: Partition Strategy for 5-Year Transaction History

Background

Pacific Coast Savings, a mid-sized savings institution with 800,000 customer accounts, is migrating its transaction processing from a legacy system to DB2 for z/OS v13. The new TRANSACTION_HISTORY table must store five years of transaction records to meet regulatory requirements.

Current State

  • Volume: 8 million transactions per month (approximately 270,000 per day).
  • Row size: Average 320 bytes per row (after compression).
  • Total data: 5 years * 12 months * 8 million rows = 480 million rows, approximately 150 GB.
  • Growth: Transaction volumes are increasing at 12% annually due to digital banking adoption.
  • Retention: Exactly 5 years (60 months). Data older than 5 years must be archived and purged from the active table.
  • Query patterns:
  • 70% of queries filter on TRANS_DATE (date range, typically 1-3 months).
  • 20% filter on ACCOUNT_ID + TRANS_DATE.
  • 10% are batch reporting queries scanning 6-12 months of data.
  • Maintenance window: 6 hours on Saturday nights (22:00-04:00).

Platform

DB2 for z/OS v13 running on a z15 mainframe with 32 CPs dedicated to the DB2 subsystem. Storage is DS8950F with 256 GB flash cache.

The Challenge

The DBA team must design a partitioning strategy that:

  1. Supports the 5-year retention policy with efficient archival.
  2. Enables partition elimination for the predominant query patterns.
  3. Allows partition-level maintenance within the 6-hour weekly window.
  4. Accommodates 12% annual growth without redesign.
  5. Maintains high availability (99.95%) for the most recent 2 years of data.

Design Decisions

Partition Key Selection

Decision: TRANS_DATE (DATE) as the single partition key.

Rationale: 70% of queries filter on TRANS_DATE, making it the clear choice for partition elimination. The 5-year retention policy maps directly to monthly partitions. ACCOUNT_ID was considered as a composite key but rejected because it would complicate the rolling window pattern and most account-based queries also include a date range.

Partition Granularity

Decision: Monthly partitions.

Rationale: 60 monthly partitions for 5 years — well within the z/OS limit. Each monthly partition holds approximately 8 million rows (2.5 GB), a manageable size for partition-level REORG (estimated 8 minutes) and COPY (estimated 3 minutes). Daily partitions (1,825 total) were rejected due to excessive partition count and catalog overhead.

DSSIZE and Table Space Configuration

CREATE TABLESPACE TSHIST
    IN PCSLOAN
    USING STOGROUP PCSSTOR
    DSSIZE 8G
    BUFFERPOOL BP32K
    SEGSIZE 64
    MAXPARTITIONS 96
    LOCKSIZE ROW
    CLOSE NO;

DSSIZE 8G: Each monthly partition is approximately 2.5 GB, growing to 4.5 GB in 5 years at 12% annual growth. 8 GB provides ample headroom.

MAXPARTITIONS 96: 60 active + 36 growth headroom. This avoids the need to ALTER MAXPARTITIONS for years.

Index Strategy

-- Primary key (partitioned, DPSI)
-- Including TRANS_DATE enables a partitioned unique index
ALTER TABLE MERIDIAN.TRANSACTION_HISTORY
    ADD CONSTRAINT PK_TRANS
    PRIMARY KEY (TRANS_DATE, TRANS_ID);

-- Account lookup (DPSI)
CREATE INDEX MERIDIAN.IX_TRANS_ACCT
    ON MERIDIAN.TRANSACTION_HISTORY (ACCOUNT_ID, TRANS_DATE)
    CLUSTER
    PARTITIONED;

-- Transaction type (DPSI)
CREATE INDEX MERIDIAN.IX_TRANS_TYPE
    ON MERIDIAN.TRANSACTION_HISTORY (TRANS_TYPE, TRANS_DATE)
    PARTITIONED;

All indexes are DPSIs. The team accepted the multi-probe overhead for account lookups without a date filter (the 20% of queries) in exchange for full partition independence.

Rolling Window Design

Monthly rotation on the first Saturday of each month:

  1. UNLOAD the oldest partition to archive data sets.
  2. ROTATE PARTITION FIRST TO LAST with the new month's end date.
  3. RUNSTATS on the new (empty) last partition.
  4. COPY the new partition (empty — fast).

Estimated execution time: 15 minutes (dominated by the UNLOAD of 8 million rows).

Results

After six months of production operation:

Metric Before Partitioning After Partitioning
Average query response (date-range) 4.2 seconds 0.3 seconds
Weekly REORG duration 6.5 hours (full table) 12 minutes (single partition)
Monthly archive operation 3 hours (DELETE + REORG) 15 minutes (ROTATE)
Recovery time (single partition) 4 hours (full table) 8 minutes
Buffer pool hit ratio 88% 97%

The 14x improvement in date-range query response time was the most impactful result, directly enabling the digital banking team to meet their 500ms API response time SLA.

Lessons Learned

  1. Include the partition key in the primary key: The team initially defined the PK as (TRANS_ID) alone, which forced an NPSI for the unique index. Changing to (TRANS_DATE, TRANS_ID) allowed a partitioned unique index.

  2. Plan for RUNSTATS timing: The team initially ran RUNSTATS only after the monthly rotation. They found that queries against the newest partition performed poorly until RUNSTATS ran because the optimizer assumed an empty partition. Solution: run RUNSTATS after the first full day of loading into the new partition.

  3. Monitor partition sizes proactively: December partitions were 35% larger than February partitions due to holiday transaction volumes. The team set up automated alerts for any partition exceeding 150% of the average.

Discussion Questions

  1. If Pacific Coast Savings acquires another bank and must incorporate 3 years of historical data from the acquired bank, how would you modify the partitioning strategy?

  2. The analytics team wants to run queries that scan the entire 5-year history without a date filter. How would you optimize these queries given the partitioned table design?

  3. If the 12% annual growth rate continues for 5 years, what is the estimated partition size and should the DSSIZE be increased?