Appendix A: Mathematical Foundations for DB2 Practitioners

Database systems are, at their core, applied mathematics. This appendix consolidates the formal underpinnings that DB2's optimizer, storage manager, and buffer pool rely on every time you execute a query. You do not need to memorize these formulas, but understanding them will make you a sharper diagnostician when EXPLAIN output looks wrong or capacity plans fall short.


A.1 Relational Algebra

Every SQL statement you write is translated by DB2 into an internal tree of relational algebra operators before the optimizer considers access paths. The six fundamental operators follow.

A.1.1 SELECT (Restriction) --- σ

The SELECT operator filters rows based on a predicate. It does not choose columns; that is PROJECT.

σ_predicate(R)

Example. Given a relation EMPLOYEE(EMPNO, NAME, DEPT, SALARY):

σ_{DEPT='D01'}(EMPLOYEE)

Returns only those tuples where DEPT = 'D01'. In SQL this maps to the WHERE clause:

SELECT * FROM EMPLOYEE WHERE DEPT = 'D01';

Properties:
- Commutative: σ_p(σ_q(R)) = σ_q(σ_p(R))
- Cascadable: σ_{p AND q}(R) = σ_p(σ_q(R))

The optimizer exploits these properties to push selective predicates down the plan tree, reducing intermediate result sizes early.

A.1.2 PROJECT --- π

PROJECT removes columns (attributes) from a relation, producing a result with only the specified attributes.

π_{A1, A2, ...}(R)

Example:

π_{EMPNO, NAME}(EMPLOYEE)

In SQL:

SELECT DISTINCT EMPNO, NAME FROM EMPLOYEE;

Note the DISTINCT. Relational algebra works on sets; SQL works on multisets (bags) unless you explicitly request DISTINCT. DB2's optimizer knows when duplicate elimination is necessary and when it can be skipped.

A.1.3 JOIN --- ⋈

JOIN combines tuples from two relations based on a matching condition.

Theta Join:

R ⋈_{condition} S

Equijoin (theta join where the condition uses equality):

EMPLOYEE ⋈_{EMPLOYEE.DEPT = DEPARTMENT.DEPTNO} DEPARTMENT

Natural Join (equijoin on all identically named columns, projecting away duplicates):

R ⋈ S

Join cardinality estimation (critical for the optimizer):

For an equijoin on attribute A between relations R and S:

|R ⋈ S| = (|R| * |S|) / max(V(A, R), V(A, S))

Where V(A, R) is the number of distinct values of attribute A in relation R. This formula assumes uniform distribution and attribute independence---assumptions that DB2's RUNSTATS-collected statistics refine with histograms and frequent-value data.
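As a quick sanity check, the estimate can be computed directly. The sketch below is illustrative; the relation sizes and distinct-value counts are invented:

```python
# Textbook equijoin-size estimate: assumes uniform distribution and
# attribute independence, as described above.
def join_cardinality(card_r, card_s, distinct_r, distinct_s):
    """|R join S| on A = |R| * |S| / max(V(A,R), V(A,S))."""
    return (card_r * card_s) // max(distinct_r, distinct_s)

# 10,000 employees joined to 50 departments on DEPT = DEPTNO,
# with 50 distinct values on each side: each employee matches
# exactly one department, so the join returns 10,000 rows.
print(join_cardinality(10_000, 50, 50, 50))  # 10000
```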

A.1.4 UNION

UNION combines tuples from two union-compatible relations, eliminating duplicates.

R ∪ S

Both R and S must have the same degree (number of attributes) and domain-compatible attributes. In SQL:

SELECT EMPNO, NAME FROM EMPLOYEE_NYC
UNION
SELECT EMPNO, NAME FROM EMPLOYEE_LON;

Cardinality bound: |R ∪ S| ≤ |R| + |S|

A.1.5 INTERSECT

INTERSECT returns tuples present in both relations.

R ∩ S

Cardinality bound: |R ∩ S| ≤ min(|R|, |S|)

A.1.6 DIFFERENCE

DIFFERENCE returns tuples in R that are not in S.

R - S

Cardinality bound: |R - S| ≤ |R|

In SQL this corresponds to EXCEPT:

SELECT EMPNO FROM ALL_EMPLOYEES
EXCEPT
SELECT EMPNO FROM TERMINATED_EMPLOYEES;

A.1.7 Derived Operators

Several common operators are defined in terms of the six fundamentals:

Operator     Notation               Definition
Semi-join    R ⋉ S                  π_{attrs(R)}(R ⋈ S)
Anti-join    R ▷ S                  R - (R ⋉ S)
Division     R ÷ S                  Answers "for all" queries
Outer join   R ⟕ S, R ⟖ S, R ⟗ S    Join extension that preserves non-matching tuples

DB2's optimizer internally uses semi-join and anti-join transformations heavily, especially when rewriting EXISTS and NOT EXISTS subqueries.


A.2 Set Theory Basics for SQL

SQL operates on multisets (bags), not pure sets: a SELECT returns duplicates unless DISTINCT is specified. The set operators are the exception; they default to set semantics, and you must write ALL to keep duplicates. This distinction matters for aggregation and for understanding why UNION ALL is cheaper than UNION.

Key multiset operations:

Operation    Set Semantics (default)    Bag Semantics (ALL form)
UNION        Eliminates duplicates      UNION ALL preserves them
INTERSECT    Eliminates duplicates      INTERSECT ALL preserves the minimum count
EXCEPT       Eliminates duplicates      EXCEPT ALL subtracts counts

NULL handling in set operations. Two NULLs are considered duplicates for purposes of UNION, INTERSECT, and EXCEPT. This differs from the general SQL rule that NULL = NULL evaluates to UNKNOWN, and it catches practitioners off guard during data reconciliation work.

Three-valued logic. SQL uses three truth values: TRUE, FALSE, and UNKNOWN. Any comparison involving NULL yields UNKNOWN. The WHERE clause keeps only rows where the predicate evaluates to TRUE.

p  q   p AND q   p OR q   NOT p
T  T   T         T        F
T  U   U         T        F
T  F   F         T        F
U  U   U         U        U
U  F   F         U        U
F  F   F         F        T
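The truth table can be checked mechanically. The sketch below uses Python's None to stand in for UNKNOWN; it is an analogy for the SQL rules, not how DB2 implements NULLs:

```python
# Three-valued (Kleene) logic with None standing in for UNKNOWN.
def tv_and(p, q):
    if p is False or q is False:
        return False                 # FALSE dominates AND
    if p is None or q is None:
        return None                  # otherwise UNKNOWN propagates
    return True

def tv_or(p, q):
    if p is True or q is True:
        return True                  # TRUE dominates OR
    if p is None or q is None:
        return None
    return False

def tv_not(p):
    return None if p is None else not p

# A WHERE clause keeps only rows where the predicate is TRUE,
# so rows whose comparison yields UNKNOWN are filtered out:
salaries = [10, None, 30]
kept = [s for s in salaries if (None if s is None else s > 20) is True]
print(kept)  # [30]
```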

A.3 Cost-Based Optimizer Math

DB2's optimizer is cost-based on both z/OS and LUW. It estimates the cost of each candidate access path in terms of I/O, CPU, and (on z/OS) communications cost. The core inputs are statistics gathered by RUNSTATS.

A.3.1 Filter Factors

A filter factor (FF) is the estimated fraction of rows that satisfy a predicate. It ranges from 0.0 to 1.0.

Equal predicate on a column with uniform distribution:

FF(col = value) = 1 / COLCARDF

Where COLCARDF is the column cardinality (number of distinct values), stored as COLCARDF in SYSIBM.SYSCOLUMNS (z/OS) and as COLCARD in SYSCAT.COLUMNS (LUW).

Range predicates:

FF(col > value) = (HIGH2KEY - value) / (HIGH2KEY - LOW2KEY)
FF(col BETWEEN v1 AND v2) = (v2 - v1) / (HIGH2KEY - LOW2KEY)

These assume uniform distribution between LOW2KEY and HIGH2KEY.

IN list:

FF(col IN (v1, v2, ..., vn)) = n / COLCARDF

Capped at 1.0.

LIKE predicate:

FF(col LIKE 'prefix%') ≈ 1 / COLCARDF   (if prefix is selective)
FF(col LIKE '%substring%') ≈ default filter factor (typically 1/10 to 1/20)

DB2 uses a default filter factor when it cannot estimate selectivity. On z/OS, the classic default is 1/25 for equal predicates on unindexed columns without statistics.

Compound predicates:

FF(p1 AND p2) = FF(p1) * FF(p2)          -- assumes independence
FF(p1 OR p2)  = FF(p1) + FF(p2) - FF(p1) * FF(p2)
FF(NOT p)     = 1 - FF(p)

The independence assumption is often wrong in practice. DB2 on z/OS v12 and Db2 LUW v11.5 both support multi-column statistics (column groups) that capture correlations, improving estimates when columns are dependent.
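A small calculator makes the filter-factor arithmetic concrete. The column statistics below are invented for illustration:

```python
# Filter-factor arithmetic under the uniformity and independence
# assumptions described above.
def ff_equal(colcard):
    return 1.0 / colcard

def ff_range_gt(value, low2key, high2key):
    return (high2key - value) / (high2key - low2key)

def ff_and(f1, f2):
    return f1 * f2                   # independence assumption

def ff_or(f1, f2):
    return f1 + f2 - f1 * f2

# DEPT has 20 distinct values; SALARY spans 30,000..150,000.
f_dept = ff_equal(20)                            # DEPT = 'D01'    -> 0.05
f_sal = ff_range_gt(120_000, 30_000, 150_000)    # SALARY > 120000 -> 0.25
print(ff_and(f_dept, f_sal))  # 0.0125
```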

A.3.2 Cardinality Estimation

The estimated number of qualifying rows after applying predicates:

Estimated_Rows = CARD(table) * FF(predicate)

Where CARD(table) is the table cardinality from catalog statistics.

For a join between R and S on column A:

Estimated_Join_Rows = CARD(R) * CARD(S) * FF(join_predicate)
FF(R.A = S.A) = 1 / max(COLCARDF(A, R), COLCARDF(A, S))

Cascading joins compound the estimation error. If each single-table estimate is off by a factor of 2, a five-way join estimate can be off by 2^5 = 32x. This is why complex queries sometimes get catastrophically bad plans---and why keeping statistics current with RUNSTATS is non-negotiable.

A.3.3 Cost Formulas (Simplified)

Table scan cost:

Cost_tablescan = Pages(T) * IO_cost_per_page + CARD(T) * CPU_cost_per_row

Index scan cost (matching index):

Cost_indexscan = Levels(I) * IO_cost_per_page           -- index traversal
               + MatchingPages(I) * IO_cost_per_page    -- leaf page scan
               + QualifyingRows * IO_cost_per_page      -- data page fetch (worst case)
               + QualifyingRows * CPU_cost_per_row      -- row processing

In practice, buffer pool caching dramatically reduces the effective I/O cost term, which is where buffer pool hit ratio enters the equation.
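A toy comparison shows why a selective index wins. The I/O and CPU constants below are illustrative relative weights, not DB2's internal cost coefficients:

```python
# Toy access-path comparison using the simplified formulas above.
IO_COST = 1.0     # relative cost per physical page read
CPU_COST = 0.01   # relative cost per row processed

def tablescan_cost(pages, card):
    return pages * IO_COST + card * CPU_COST

def indexscan_cost(levels, matching_pages, qualifying_rows):
    return (levels * IO_COST                 # index traversal
            + matching_pages * IO_COST       # leaf page scan
            + qualifying_rows * IO_COST      # worst case: one data page per row
            + qualifying_rows * CPU_COST)    # row processing

# A 1M-row table on 50,000 pages, where an index qualifies 1,000 rows:
print(tablescan_cost(50_000, 1_000_000))   # 60000.0
print(indexscan_cost(3, 10, 1_000))        # 1023.0
```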


A.4 B+ Tree Height Calculations

DB2 indexes are B+ trees. Understanding the height tells you the maximum number of I/Os needed to traverse from root to leaf.

A.4.1 Basic Formula

Height = ⌈log_F(N)⌉ + 1

Where:
- F = fanout (number of entries per index page)
- N = number of leaf pages
- The "+1" accounts for the leaf level itself

A.4.2 Computing Fanout

F = floor(PageSize / (KeySize + RID_Size + Overhead))

Typical values:
- PageSize: 4096 bytes (4 KB, the z/OS default) or larger
- RID_Size: 4 bytes (z/OS traditional) or 6 bytes (LUW with large RIDs)
- Overhead: approximately 10-12 bytes per entry for pointers and headers

Example. For a 4 KB page, an 8-byte integer key, and 4-byte RID with 10-byte overhead:

F = floor(4096 / (8 + 4 + 10)) = floor(4096 / 22) = 186

A.4.3 Height Examples

Rows            Fanout   Leaf Pages   Height   Max I/O to Leaf
1,000           186      6            2        2
100,000         186      538          3        3
10,000,000      186      53,764       4        4
1,000,000,000   186      5,376,345    4        4

The practical takeaway: even a billion-row table needs only four I/Os (at most) to traverse from root to leaf. Since the root page and often the first non-leaf level are cached in the buffer pool, the effective cost is usually two physical I/Os---one for the lower non-leaf page and one for the leaf page.
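The height calculation is easy to script, reusing the formulas from this section and the example fanout parameters:

```python
# B+ tree fanout and height, following the formulas in A.4.
import math

def fanout(page_size, key_size, rid_size, overhead):
    return page_size // (key_size + rid_size + overhead)

def height(total_rows, f):
    leaf_pages = math.ceil(total_rows / f)
    if leaf_pages <= 1:
        return 1                     # the root page is the single leaf
    return math.ceil(math.log(leaf_pages, f)) + 1

f = fanout(4096, 8, 4, 10)           # 4 KB page, 8-byte key, 4-byte RID
print(f)                             # 186
print(height(1_000_000_000, f))      # 4
```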


A.5 Buffer Pool Hit Ratio Math

The buffer pool is DB2's in-memory cache for data and index pages. The hit ratio measures how often a page request is satisfied from memory without physical I/O.

A.5.1 Basic Hit Ratio

Hit_Ratio = (Logical_Reads - Physical_Reads) / Logical_Reads * 100

Or equivalently:

Hit_Ratio = (1 - Physical_Reads / Logical_Reads) * 100

Target values:
- Data pages: > 80% for OLTP workloads; 95%+ is achievable for well-tuned systems
- Index pages: > 95% (index pages are reused heavily and should almost always be in cache)

A.5.2 Marginal Benefit of Adding Memory

The relationship between buffer pool size and hit ratio is not linear. It is often modeled as a power curve:

Hit_Ratio ≈ (Buffer_Pool_Pages / Working_Set_Pages)^α   (capped at 1.0)

Where α depends on the access pattern. For uniformly random access, α ≈ 1 and the hit ratio grows linearly until the pool holds the working set. For patterns with temporal locality, α < 1, meaning the hit ratio climbs faster than linearly when the pool holds only part of the working set.

Practical rule of thumb: Doubling the buffer pool size from 50% to 100% of the working set yields diminishing returns. The biggest gains come from ensuring the buffer pool is large enough to hold the "hot" set of frequently accessed pages.
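A minimal sketch of such a curve, treating both the functional form and the exponent values as modeling assumptions rather than measured DB2 behavior:

```python
# Power-curve hit-ratio model: linear for random access, concave
# (faster early gains) when the workload has temporal locality.
def hit_ratio(buffer_pages, working_set_pages, alpha=1.0):
    frac = min(1.0, buffer_pages / working_set_pages)
    return frac ** alpha

# Pool holding a quarter of the working set: random access vs.
# a workload with strong temporal locality.
print(hit_ratio(25_000, 100_000, alpha=1.0))   # 0.25
print(hit_ratio(25_000, 100_000, alpha=0.5))   # 0.5
```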

A.5.3 z/OS-Specific: GETPAGE and Read Counts

On z/OS, you monitor hit ratio using:

-DISPLAY BUFFERPOOL(BP0) DETAIL

Key fields:
- GETPAGE: Total logical page requests
- SYNCHRONOUS READ: Physical reads (synchronous I/O)
- PREFETCH READ: Physical reads via sequential or list prefetch

Hit_Ratio = (GETPAGE - SYNC_READ - PREFETCH_READ) / GETPAGE * 100

Some practitioners exclude prefetch reads from the denominator since prefetch is expected to cause physical I/O. The formula above is the conservative (honest) version.

A.5.4 LUW: BPHR from Snapshot or MON_GET

SELECT BP_NAME,
       POOL_DATA_L_READS,
       POOL_DATA_P_READS,
       CASE WHEN POOL_DATA_L_READS > 0
            THEN DECIMAL(1.0 - (FLOAT(POOL_DATA_P_READS) / POOL_DATA_L_READS), 5, 3) * 100
            ELSE 0
       END AS DATA_HIT_RATIO
FROM TABLE(MON_GET_BUFFERPOOL(NULL, -2)) AS T;

A.6 Capacity Planning Formulas

A.6.1 Storage Sizing

Table space size estimation:

Row_Size = Σ(column_sizes) + overhead_per_row
Rows_Per_Page = floor((PageSize - Page_Overhead) / (Row_Size + Row_Overhead))
Pages_Needed = ceil(Total_Rows / Rows_Per_Page)
Table_Space_Size = Pages_Needed * PageSize

Typical overhead values:
- z/OS page overhead: 22 bytes (4 KB page)
- z/OS row overhead: 6 bytes (non-LOB)
- LUW page overhead: 91 bytes (4 KB page)
- LUW row overhead: 8-10 bytes depending on null indicators

Include free space:

Effective_Rows_Per_Page = Rows_Per_Page * (1 - PCTFREE / 100)
Adjusted_Pages = ceil(Total_Rows / Effective_Rows_Per_Page)
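Putting the sizing formulas together; the overhead defaults below are the typical z/OS 4 KB figures quoted above:

```python
# Table-space sizing following the formulas above.
import math

def pages_needed(total_rows, row_size, page_size=4096,
                 page_overhead=22, row_overhead=6, pctfree=0):
    rows_per_page = (page_size - page_overhead) // (row_size + row_overhead)
    effective = rows_per_page * (1 - pctfree / 100)   # free-space adjustment
    return math.ceil(total_rows / effective)

# 10M rows of 200 bytes each, PCTFREE 10:
p = pages_needed(10_000_000, 200, pctfree=10)
print(p, "pages =", p * 4096 // (1024 * 1024), "MB")  # 584796 pages = 2284 MB
```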

Index size estimation:

Index_Entry_Size = KeySize + RID_Size + Entry_Overhead
Entries_Per_Leaf = floor((PageSize - Leaf_Overhead) / Index_Entry_Size)
Leaf_Pages = ceil(Total_Rows / Entries_Per_Leaf)
Total_Index_Pages ≈ Leaf_Pages * (1 + 1/Fanout + 1/Fanout^2 + ...)  ≈ Leaf_Pages * Fanout / (Fanout - 1)

For high fanout (typical), the non-leaf overhead is less than 1% of the leaf level.
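The index-sizing formulas can be scripted the same way, reusing the illustrative entry sizes from section A.4:

```python
# Index sizing: leaf pages plus the geometric-series estimate for the
# non-leaf levels.
import math

def index_pages(total_rows, key_size, rid_size=4, entry_overhead=10,
                page_size=4096, leaf_overhead=22):
    entry_size = key_size + rid_size + entry_overhead
    entries_per_leaf = (page_size - leaf_overhead) // entry_size
    leaf_pages = math.ceil(total_rows / entries_per_leaf)
    f = page_size // entry_size              # non-leaf fanout
    return math.ceil(leaf_pages * f / (f - 1))

# 10M rows with an 8-byte key: the non-leaf levels add well under 1%.
print(index_pages(10_000_000, 8))
```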

A.6.2 IOPS Estimation

OLTP workload:

IOPS = TPS * Avg_IO_Per_Transaction * (1 - Hit_Ratio)

Where:
- TPS = transactions per second
- Avg_IO_Per_Transaction = average logical I/Os per transaction (from accounting or monitoring)
- Hit_Ratio = buffer pool hit ratio as a decimal

Example. 5,000 TPS, 20 logical I/Os per transaction, 95% hit ratio:

IOPS = 5000 * 20 * (1 - 0.95) = 5000 * 20 * 0.05 = 5,000

You need a storage subsystem capable of sustaining 5,000 random IOPS, a modest requirement for modern SSD storage: even a single enterprise drive sustains tens of thousands of random IOPS.

A.6.3 Log Space Estimation

Log_Rate = TPS * Avg_Log_Bytes_Per_Transaction
Daily_Log_Volume = Log_Rate * Seconds_Per_Day

Example. 2,000 TPS, 500 bytes average log per transaction:

Log_Rate = 2000 * 500 = 1,000,000 bytes/sec = ~1 MB/sec
Daily_Log_Volume = 1 MB/sec * 86,400 = ~84 GB/day

On z/OS, this determines how many active and archive log data sets you need. On LUW, this determines log file size and number of primary/secondary logs.

A.6.4 Backup Window Estimation

Backup_Time = Database_Size / Backup_Throughput + Overhead

Typical backup throughput with modern storage:
- Disk-to-disk: 100-500 MB/sec per stream
- Disk-to-tape (z/OS): 50-150 MB/sec per tape unit
- Parallel streams multiply throughput roughly linearly, with diminishing returns above 8-12 streams

Example. 2 TB database, 4 parallel streams at 200 MB/sec each:

Backup_Time = 2,048,000 MB / (4 * 200 MB/sec) = 2,560 sec ≈ 43 minutes

Add 10-20% for catalog operations, compression overhead, and log archival.

A.6.5 Recovery Time Estimation

Recovery time is harder to predict and depends on the recovery strategy:

Recovery_Time = Restore_Time + Log_Apply_Time + Validation
Log_Apply_Time = Log_Volume_Since_Backup / Log_Apply_Rate

Log apply rates are typically 2-5x slower than log generation rate due to random I/O patterns during redo processing. z/OS system-level recovery (using RECOVER TABLESPACE) benefits from parallel log apply. LUW's HADR standby replays logs continuously, so failover time is measured in seconds rather than hours.


A.7 Summary of Key Formulas

Formula                              Purpose
FF = 1/COLCARDF                      Equal predicate filter factor
Est_Rows = CARD * FF                 Cardinality estimation
Height = ⌈log_F(N)⌉ + 1              B+ tree levels
HR = (LR - PR) / LR                  Buffer pool hit ratio
IOPS = TPS * IO/Txn * (1 - HR)       Physical I/O demand
Pages = ceil(Rows / RPP)             Table storage sizing
Log/day = TPS * bytes/txn * 86400    Log volume planning

These formulas are approximations. Real systems exhibit non-uniform distributions, correlated columns, skewed access patterns, and caching effects that complicate the math. But these approximations are what DB2's optimizer starts with, and understanding them is the first step to understanding why the optimizer chose the plan it chose---and what to do when it chose wrong.


A.8 Join Cost Models

Understanding the cost models for each join method helps you interpret EXPLAIN output and predict when the optimizer will switch strategies.

A.8.1 Nested-Loop Join (NLJOIN)

For each row of the outer table, read the matching rows from the inner table (typically via an index).

Cost_NLJ = CARD(outer) * (Index_Probe_Cost + Matching_Rows * Data_Page_Cost)

Where:
- Index_Probe_Cost = number of index levels traversed per probe (typically 2-3 I/Os, often cached down to 0-1)
- Matching_Rows = average matching rows in the inner table per outer row
- Data_Page_Cost = cost to fetch data pages (reduced by buffer pool caching)

Nested-loop join is optimal when the outer table is small and the inner table has a selective index. It is the only join method that supports non-equijoin predicates efficiently.

A.8.2 Hash Join (HSJOIN)

Build a hash table from the smaller (build) input, then probe it with each row of the larger (probe) input.

Cost_HJ = Pages(build) + Pages(probe) + CPU_hash_cost
         + Spill_Cost   (if hash table exceeds available memory)

Hash join is optimal for large equijoins where both tables are large and no useful index exists. It requires the build input to fit in memory (the sort heap); if it does not, DB2 spills partitions to disk, significantly increasing cost.

A.8.3 Merge Join (MSJOIN)

Both inputs are sorted on the join key, then merged in a single pass.

Cost_MJ = Sort_Cost(R) + Sort_Cost(S) + Pages(R) + Pages(S)

If one or both inputs are already sorted (e.g., delivered by an index scan in key order), the sort cost drops to zero for that input, making merge join very efficient. DB2's optimizer considers this "sort avoidance" when evaluating merge join.

A.8.4 Join Method Selection Rules of Thumb

Condition Likely Best Join
Small outer, indexed inner Nested-loop
Large tables, no useful indexes Hash join
Both inputs pre-sorted on join key Merge join
Non-equijoin (>, <, BETWEEN) Nested-loop
Memory-constrained environment Merge join (lower memory than hash)
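A toy cost comparison illustrates the crossover between methods. All constants are illustrative, not DB2's actual cost model:

```python
# Toy comparison of the three join cost models above.
def nlj_cost(outer_card, probe_io, matching_rows, data_page_cost=1.0):
    # For each outer row: probe the index, then fetch matching data pages.
    return outer_card * (probe_io + matching_rows * data_page_cost)

def hash_cost(build_pages, probe_pages, spill_cost=0.0):
    # One pass over each input, plus any spill if the build side overflows.
    return build_pages + probe_pages + spill_cost

def merge_cost(pages_r, pages_s, sort_r=0.0, sort_s=0.0):
    # Sort each input (zero if already ordered), then merge in one pass.
    return sort_r + sort_s + pages_r + pages_s

# A small indexed lookup favors nested-loop; two large unindexed
# tables favor hash join (assuming no spill):
print(nlj_cost(100, 1, 1))           # 200.0
print(hash_cost(50_000, 80_000))     # 130000.0
```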

A.9 Compression Ratio Estimation

DB2 supports row-level compression (LUW and z/OS) and columnar compression (BLU Acceleration on LUW). Estimating compression ratios helps with capacity planning.

A.9.1 Row Compression

Row compression uses a dictionary built from repeating patterns in the data. Typical compression ratios:

Compressed_Size = Original_Size * (1 - Compression_Ratio)

Typical ratios:
- Tables with many repeated values (status codes, dates, short strings): 60-80% reduction
- Tables with unique or high-entropy data (GUIDs, hashes): 10-20% reduction
- Average across a typical OLTP database: 40-60% reduction

On LUW, estimate compression before enabling:

-- Estimate compression savings for a table (table function)
SELECT * FROM TABLE(SYSPROC.ADMIN_GET_TAB_COMPRESS_INFO('MYSCHEMA', 'MYTABLE')) AS T;

A.9.2 Columnar Compression (BLU)

Column-organized tables typically achieve 3-10x compression ratios compared to uncompressed row-organized tables, because:
- Column-homogeneous data compresses better than mixed-type rows
- Sorted columns exhibit long runs of identical values
- Frequency encoding replaces common values with short codes
- Dictionary encoding eliminates redundant string storage


A.10 Little's Law and Its Application to DB2

Little's Law is a fundamental queueing theory result that relates throughput, concurrency, and response time:

L = λ * W

Where:
- L = average number of concurrent requests (threads in DB2)
- λ = throughput (requests per second)
- W = average response time (seconds per request)

Application to DB2 thread sizing:

If your target throughput is 2,000 TPS and average transaction response time is 10 ms:

L = 2000 * 0.010 = 20 concurrent threads

You need at least 20 concurrent database connections to sustain this throughput. In practice, add headroom for variability: 2-3x the calculated value.

Application to connection pool sizing:

Pool_Size = Target_TPS * Avg_Response_Time_Seconds * Safety_Factor

For 5,000 TPS, 15 ms average response time, and a 2x safety factor:

Pool_Size = 5000 * 0.015 * 2 = 150 connections

Little's Law explains why adding connections beyond the saturation point does not increase throughput---it only increases response time and resource contention.