
Chapter 11: Data Definition Language — CREATE, ALTER, DROP: Building and Evolving Database Structures


Learning Objectives

By the end of this chapter, you will be able to:

  1. Create tables with appropriate data types, constraints, and platform-specific options on both z/OS and LUW
  2. Create and manage tablespaces on both z/OS and LUW, understanding the fundamental architectural differences
  3. Use ALTER TABLE to modify existing structures safely and understand the implications of each change
  4. Understand DROP dependencies and cascading effects, and the operational risks of removing objects
  5. Work with sequences, identity columns, and generated columns for automated value generation
  6. Build the complete Meridian National Bank schema using DDL

Opening: The Other Half of SQL

DML asks questions of data that already exists. DDL shapes the container that data lives in. Get this wrong, and you will be ALTER-ing in production at 2 AM.

Every SELECT, INSERT, UPDATE, and DELETE that you have written in the preceding chapters assumes something profound: that a table already exists, that its columns have specific data types, that constraints enforce the business rules you take for granted, and that the physical storage underneath is configured to handle your workload. None of that happens by accident. It all begins with DDL — Data Definition Language.

DDL is where the database architect and the database administrator converge. A developer writes a SELECT; a DBA writes a CREATE TABLE. A developer asks, "Why is this query slow?" A DBA answers, "Because the table was created with the wrong data types and no useful indexes." The quality of your DDL determines the ceiling of your application's performance, the integrity of its data, and the flexibility of its evolution over time.

This chapter is the largest in Part II for a reason. We are not merely learning syntax. We are learning how to make decisions that will persist for years — sometimes decades — in production systems. A column data type, once populated with a billion rows, is not trivially changed. A constraint, once relied upon by dozens of applications, cannot be casually dropped. A tablespace, once allocated, defines the physical boundaries within which your data lives and your utilities operate.

We will work through the complete DDL vocabulary: CREATE TABLE, CREATE TABLESPACE, CREATE INDEX, ALTER TABLE, DROP, sequences, identity columns, generated columns, and aliases. For each statement, we will explore both DB2 for z/OS and DB2 for LUW syntax, because the differences — particularly around tablespace management — are substantial. And at the end of this chapter, we will bring everything together to build the complete Meridian National Bank schema: every table, every constraint, every index, every tablespace. That schema will be the foundation for everything else in this book.

Let us begin with the statement you will write more often than any other in your career as a DB2 professional.


11.1 CREATE TABLE — The Foundation

The CREATE TABLE statement does exactly what it says: it creates a new table in the database. But the simplicity of that description conceals a remarkable amount of power and nuance. A CREATE TABLE statement specifies the table's name, its columns (with data types and optional constraints), its table-level constraints, and a set of platform-specific options that control everything from compression to partitioning.

Basic Syntax

The minimal CREATE TABLE statement is deceptively simple:

CREATE TABLE schema_name.table_name (
    column_name  data_type  [NOT NULL]  [DEFAULT default_value],
    ...
);

In practice, production CREATE TABLE statements are far more elaborate. Here is a realistic example for a Meridian Bank customer table:

-- DB2 for LUW
CREATE TABLE meridian.customer (
    customer_id       INTEGER        NOT NULL GENERATED ALWAYS AS IDENTITY
                                      (START WITH 1, INCREMENT BY 1, NO CACHE),
    tax_id            VARCHAR(11)    NOT NULL,
    first_name        VARCHAR(50)    NOT NULL,
    last_name         VARCHAR(50)    NOT NULL,
    date_of_birth     DATE           NOT NULL,
    email_address     VARCHAR(100),
    phone_primary     VARCHAR(20),
    customer_since    DATE           NOT NULL DEFAULT CURRENT DATE,
    customer_status   CHAR(1)        NOT NULL DEFAULT 'A'
                                      CONSTRAINT chk_cust_status
                                      CHECK (customer_status IN ('A','I','C','S')),
    credit_score      SMALLINT,
    row_updated       TIMESTAMP      NOT NULL DEFAULT CURRENT TIMESTAMP,
    CONSTRAINT pk_customer PRIMARY KEY (customer_id),
    CONSTRAINT uq_customer_tax UNIQUE (tax_id)
);

Every element in that statement represents a decision: the choice of INTEGER over BIGINT for the primary key, the length of VARCHAR columns, the use of GENERATED ALWAYS versus GENERATED BY DEFAULT for the identity, the explicit constraint names, the DEFAULT values for status and timestamp columns. We will examine each of these decisions throughout this chapter.

Schema Qualification

Every table in DB2 belongs to a schema. On z/OS, the schema defaults to the authorization ID of the creator unless explicitly specified. On LUW, the schema defaults to the value of the CURRENT SCHEMA special register.

Experienced DB2 professionals always explicitly qualify table names with the schema:

CREATE TABLE meridian.customer ( ... );

This eliminates ambiguity and prevents accidental creation under the wrong schema — a mistake that is surprisingly common in development environments where multiple developers share a database.
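When a deployment script must use unqualified names anyway, the safer pattern is to pin the schema first. A minimal sketch for LUW (the branch_code table is illustrative; on z/OS, SET CURRENT SQLID plays the analogous role for DDL qualification):

```sql
-- DB2 for LUW: make unqualified names resolve to MERIDIAN
SET CURRENT SCHEMA = 'MERIDIAN';

-- Now creates MERIDIAN.BRANCH_CODE rather than <your-auth-id>.BRANCH_CODE
CREATE TABLE branch_code (
    code       CHAR(3)      NOT NULL PRIMARY KEY,
    code_desc  VARCHAR(50)  NOT NULL
);
```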

z/OS Note: On z/OS, schema names are limited to 8 characters (the traditional TSO authorization ID length). On LUW, schema names can be up to 128 characters. This difference matters when designing schemas that will run on both platforms.

The IN Clause — Placing Tables in Tablespaces

On z/OS, you must specify where the table lives using the IN clause:

-- DB2 for z/OS
CREATE TABLE meridian.customer (
    customer_id       INTEGER        NOT NULL,
    ...
)
IN MERIDDB.CUSTTS;

Here, MERIDDB is the database and CUSTTS is the tablespace. On z/OS, a table must reside in a tablespace, and that tablespace must reside in a database. This three-level hierarchy (database > tablespace > table) is fundamental to z/OS physical design.

On LUW, the IN clause specifies only the tablespace name (databases are top-level objects in LUW, not containers for tablespaces):

-- DB2 for LUW
CREATE TABLE meridian.customer (
    ...
)
IN ts_customer_data;

If you omit the IN clause on LUW, the table goes into the default tablespace for the schema. On z/OS, omitting the IN clause causes DB2 to create an implicit database and tablespace — a practice universally discouraged in production.


Check Your Understanding (Box 1)

Pause and answer from memory:

  1. What does the z/OS IN clause specify, and what is the three-level hierarchy (database, tablespace, table) it reflects?
  2. Why do experienced DB2 professionals always explicitly qualify table names with a schema?
  3. What happens on z/OS if you omit the IN clause from a CREATE TABLE?

If any of these are unclear, re-read Section 11.1 before continuing.


11.2 Data Types Deep Dive

Choosing the right data type is one of the most consequential decisions in database design. Once a table is populated with millions or billions of rows, changing a column's data type is expensive, disruptive, and sometimes practically impossible without downtime. Get it right the first time.

Numeric Types

DB2 provides a rich set of numeric types:

| Type | Size | Range | Use When |
|------|------|-------|----------|
| SMALLINT | 2 bytes | -32,768 to 32,767 | Status codes, small counters, ages |
| INTEGER | 4 bytes | -2,147,483,648 to 2,147,483,647 | Primary keys (moderate volume), quantities |
| BIGINT | 8 bytes | -9.2 x 10^18 to 9.2 x 10^18 | Primary keys (high volume), large counters |
| DECIMAL(p,s) | Variable | Up to 31 digits | Financial amounts — always |
| REAL | 4 bytes | ~7 significant digits | Scientific data (rare in business) |
| DOUBLE | 8 bytes | ~15 significant digits | Scientific data, approximate analytics |
| DECFLOAT(16) | 8 bytes | 16 significant digits | IEEE 754 decimal — financial interchange |
| DECFLOAT(34) | 16 bytes | 34 significant digits | High-precision financial calculations |

The cardinal rule of financial data: Never use REAL or DOUBLE for money. Floating-point arithmetic introduces rounding errors that, across millions of transactions, produce discrepancies that auditors will find and regulators will punish. Use DECIMAL(p,s) for all monetary amounts. At Meridian Bank, every dollar amount is DECIMAL(15,2) — 15 total digits, 2 after the decimal point, supporting values up to $9,999,999,999,999.99.

DECFLOAT deserves special mention. Introduced in DB2 9.5 (LUW) and DB2 9 (z/OS), DECFLOAT implements the IEEE 754-2008 decimal floating-point standard. Unlike binary floating-point (REAL, DOUBLE), decimal floating-point represents decimal fractions exactly. DECFLOAT is appropriate when you need floating-point range and behavior but cannot tolerate the rounding errors inherent in binary floating-point representation. Financial interchange formats like SWIFT messages sometimes specify DECFLOAT.

Choosing between INTEGER and BIGINT for primary keys: INTEGER supports approximately 2.1 billion positive values. If your table might eventually exceed that number of rows — or if you are generating IDs with gaps (as identity columns and sequences do) — use BIGINT. The storage cost is only 4 bytes more per row, but the cost of altering a primary key column from INTEGER to BIGINT on a billion-row table is measured in hours of downtime.

Character Types

| Type | Max Length | Storage | Use When |
|------|------------|---------|----------|
| CHAR(n) | 254 bytes | Fixed, always n bytes | Fixed-length codes: state codes (CHAR(2)), status flags (CHAR(1)), currency codes (CHAR(3)) |
| VARCHAR(n) | 32,672 bytes (LUW) / 32,704 bytes (z/OS) | Variable, 2-byte length prefix + actual data | Names, addresses, descriptions — most string data |
| CLOB(n) | 2 GB | LOB storage | Large text: documents, log entries, XML as text |
| GRAPHIC(n) | 127 double-byte chars | Fixed, always 2n bytes | z/OS: DBCS (double-byte character set) data |
| VARGRAPHIC(n) | 16,336 double-byte chars | Variable | z/OS: variable-length DBCS data |
| DBCLOB(n) | 1 GB | LOB storage | z/OS: large DBCS text |

CHAR vs. VARCHAR: Use CHAR only for truly fixed-length data. A first_name column declared as CHAR(50) wastes enormous space — the name "Li" occupies 50 bytes, padded with 48 trailing spaces. VARCHAR(50) stores "Li" in just 4 bytes (2 for the length prefix, 2 for the data). Across millions of rows, this difference translates to gigabytes of wasted storage, more I/O, and slower queries.
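You can see the padding directly with LENGTH, which counts trailing blanks in a CHAR value. A small illustrative sketch (the demo table is hypothetical):

```sql
CREATE TABLE demo_pad (
    name_fixed  CHAR(50),
    name_var    VARCHAR(50)
);

INSERT INTO demo_pad VALUES ('Li', 'Li');

-- CHAR comes back blank-padded to its full declared length;
-- VARCHAR returns only the characters actually stored
SELECT LENGTH(name_fixed) AS fixed_len,   -- 50
       LENGTH(name_var)   AS var_len      -- 2
FROM demo_pad;
```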

z/OS Specifics: GRAPHIC and VARGRAPHIC

The GRAPHIC and VARGRAPHIC types exist primarily for z/OS installations that process double-byte character set (DBCS) data — primarily Japanese, Chinese, and Korean text. On LUW, Unicode support through VARCHAR with UTF-8 encoding has largely eliminated the need for separate graphic types. On z/OS, where EBCDIC encoding is still common, GRAPHIC types remain relevant for international banking and government applications.

VARCHAR length selection: Do not declare VARCHAR(1000) "just in case" when the actual maximum is 100 characters. While VARCHAR stores only the actual data, the declared maximum affects:

  • The DB2 optimizer's cardinality and cost estimates
  • Memory allocation for sort work areas
  • ODBC/JDBC buffer allocation in client applications
  • REORG and LOAD utility memory usage

Choose a maximum that is generous enough to accommodate real data but not absurdly oversized.

Date and Time Types

| Type | Storage | Format (ISO) | Example |
|------|---------|--------------|---------|
| DATE | 4 bytes | YYYY-MM-DD | 2026-03-16 |
| TIME | 3 bytes | HH.MM.SS | 14.30.00 |
| TIMESTAMP | 10 bytes | YYYY-MM-DD-HH.MM.SS.nnnnnn | 2026-03-16-14.30.00.000000 |
| TIMESTAMP(p) | 7-13 bytes | Variable fractional seconds (p = 0 to 12) | TIMESTAMP(12) for picosecond precision |

Always use DATE for dates, not VARCHAR: Storing dates as VARCHAR(10) in 'YYYY-MM-DD' format is a common anti-pattern that sacrifices date arithmetic, range comparisons, and storage efficiency. DB2 stores a DATE in 4 bytes; a VARCHAR(10) date requires 12 bytes and cannot be directly used in date arithmetic without CAST operations.
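With a genuine DATE column, labeled-duration arithmetic works natively — none of the following is possible against a VARCHAR date without CAST gymnastics:

```sql
-- Customers who joined in the last 90 days: no string parsing required
SELECT customer_id, customer_since
FROM   meridian.customer
WHERE  customer_since >= CURRENT DATE - 90 DAYS;

-- Labeled durations compose naturally
SELECT customer_id,
       customer_since + 1 YEAR AS first_anniversary
FROM   meridian.customer;
```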

TIMESTAMP precision: DB2 supports TIMESTAMP with fractional-second precision from 0 to 12 digits. The default is 6 (microseconds). For audit trails and row-change tracking, TIMESTAMP(6) is usually sufficient. For high-frequency trading or event logging where sub-microsecond precision matters, TIMESTAMP(12) provides picosecond granularity.

BOOLEAN (LUW Only)

DB2 for LUW (starting with version 11.1) supports the BOOLEAN data type:

-- DB2 for LUW only
CREATE TABLE meridian.feature_flag (
    flag_name     VARCHAR(50)   NOT NULL,
    is_enabled    BOOLEAN       NOT NULL DEFAULT FALSE,
    ...
);

DB2 for z/OS does not support BOOLEAN. The conventional z/OS workaround is CHAR(1) with a CHECK constraint:

-- DB2 for z/OS equivalent
is_enabled  CHAR(1)  NOT NULL  DEFAULT 'N'
    CONSTRAINT chk_enabled CHECK (is_enabled IN ('Y', 'N'))

XML Type

Both z/OS and LUW support the XML data type for storing well-formed XML documents. XML columns are stored in a separate hierarchical storage structure optimized for XQuery and SQL/XML operations:

CREATE TABLE meridian.document_store (
    doc_id      INTEGER       NOT NULL,
    doc_type    VARCHAR(20)   NOT NULL,
    doc_content XML,
    ...
);

XML columns cannot have default values, cannot be used in primary keys or unique constraints, and have different indexing requirements (XML indexes using XMLPATTERN). We will revisit XML storage in Chapter 14 (Physical Database Design).

Type Selection Best Practices

The following guidelines reflect hard-won production experience:

  1. Financial amounts: DECIMAL(15,2) minimum. Never REAL or DOUBLE.
  2. Primary keys: INTEGER for tables under ~500 million rows. BIGINT otherwise. Always.
  3. Status/flag columns: CHAR(1) with CHECK constraint. Document the valid values.
  4. Names and variable text: VARCHAR with a realistic maximum. Not VARCHAR(4000) "just in case."
  5. Dates: DATE type. Not VARCHAR. Not INTEGER (e.g., 20260316). DATE.
  6. Timestamps for auditing: TIMESTAMP NOT NULL DEFAULT CURRENT TIMESTAMP.
  7. Country, currency, language codes: CHAR(2) or CHAR(3) matching ISO standards.
  8. Large text: CLOB, stored in a separate LOB tablespace on z/OS.

Check Your Understanding (Box 2)

From memory:

  1. Why should financial amounts never be stored as REAL or DOUBLE?
  2. What is the storage difference between CHAR(50) and VARCHAR(50) when storing the value "Li"?
  3. On which platform is the BOOLEAN data type available, and what is the workaround on the other?

Retrieval practice now prevents confusion later when these choices matter in production.


11.3 Constraints in Detail

Constraints are the database's enforcement mechanism. They guarantee that the data in your tables conforms to the rules your business requires — regardless of which application, user, or batch process writes that data. A constraint enforced at the database level cannot be circumvented by a buggy application or a careless ad hoc UPDATE.

PRIMARY KEY Constraint

A primary key uniquely identifies each row. It implies NOT NULL and UNIQUE:

-- Inline (single-column) form
CREATE TABLE meridian.branch (
    branch_id   INTEGER  NOT NULL  PRIMARY KEY,
    ...
);

-- Table-level (named) form — preferred
CREATE TABLE meridian.branch (
    branch_id     INTEGER  NOT NULL,
    branch_name   VARCHAR(100)  NOT NULL,
    ...
    CONSTRAINT pk_branch PRIMARY KEY (branch_id)
);

Always name your constraints. An unnamed constraint gets a system-generated name like SQL260316140230 — meaningless in error messages, migration scripts, and catalog queries. Named constraints (pk_branch, fk_account_customer, chk_status) make every error message, every SYSIBM.SYSCHECKS catalog row, and every ALTER statement self-documenting.
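Named constraints also make the catalog self-documenting. A quick sketch against the LUW catalog (on z/OS, the analogous views include SYSIBM.SYSTABCONST, SYSIBM.SYSCHECKS, and SYSIBM.SYSRELS):

```sql
-- DB2 for LUW: list every constraint defined on the CUSTOMER table
SELECT constname,
       type          -- P = primary key, U = unique, F = foreign key, K = check
FROM   syscat.tabconst
WHERE  tabschema = 'MERIDIAN'
  AND  tabname   = 'CUSTOMER';
```

With named constraints, the result reads like documentation (pk_customer, uq_customer_tax, chk_cust_status) rather than a list of opaque system-generated identifiers.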

A table can have at most one primary key. On z/OS, the primary key creates a unique index automatically (in the same tablespace by default). On LUW, the behavior is similar — a unique index is created to enforce the primary key.

Composite primary keys use the table-level form:

CONSTRAINT pk_account_holder PRIMARY KEY (account_id, holder_id)

Composite primary keys should be used when a single column does not naturally uniquely identify a row — for example, in junction tables that represent many-to-many relationships. However, avoid composite keys with more than three columns; they make foreign key references verbose and join conditions error-prone.
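As a sketch, an account-to-holder junction table uses exactly this pattern (the holder_role column is illustrative):

```sql
CREATE TABLE meridian.account_holder (
    account_id   INTEGER  NOT NULL,
    holder_id    INTEGER  NOT NULL,
    holder_role  CHAR(1)  NOT NULL DEFAULT 'P',   -- e.g., P = primary, J = joint
    CONSTRAINT pk_account_holder PRIMARY KEY (account_id, holder_id),
    CONSTRAINT fk_ah_account FOREIGN KEY (account_id)
        REFERENCES meridian.account (account_id),
    CONSTRAINT fk_ah_holder  FOREIGN KEY (holder_id)
        REFERENCES meridian.customer (customer_id)
);
```

The two-column key guarantees that a given customer can appear at most once per account, while still allowing both many accounts per customer and many holders per account.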

UNIQUE Constraint

UNIQUE ensures no two rows have the same value(s) in the specified column(s). Unlike PRIMARY KEY, UNIQUE columns can allow NULLs (though DB2's handling of NULLs in unique constraints varies by platform and version):

CONSTRAINT uq_customer_email UNIQUE (email_address)

On DB2 for z/OS, a UNIQUE constraint on a nullable column allows at most one NULL. On DB2 for LUW (11.1+), a unique index created with the EXCLUDE NULL KEYS option allows multiple NULLs — uniqueness is enforced only for non-NULL values. This is an important platform difference for columns that are optional but must be unique when present.

FOREIGN KEY Constraint

Foreign keys enforce referential integrity between tables. They guarantee that a value in one table (the child) exists in another table (the parent):

CREATE TABLE meridian.account (
    account_id    INTEGER       NOT NULL,
    customer_id   INTEGER       NOT NULL,
    branch_id     INTEGER       NOT NULL,
    ...
    CONSTRAINT pk_account PRIMARY KEY (account_id),
    CONSTRAINT fk_acct_cust FOREIGN KEY (customer_id)
        REFERENCES meridian.customer (customer_id)
        ON DELETE RESTRICT,
    CONSTRAINT fk_acct_branch FOREIGN KEY (branch_id)
        REFERENCES meridian.branch (branch_id)
        ON DELETE RESTRICT
);

The ON DELETE clause specifies what happens when a parent row is deleted:

| Rule | Effect | When to Use |
|------|--------|-------------|
| RESTRICT | Block the delete if child rows exist | Default — safest option. Use for core entities. |
| CASCADE | Delete all child rows automatically | Dependent data that has no meaning without the parent (e.g., order line items when the order is deleted) |
| SET NULL | Set the foreign key column to NULL | When the child row should survive but the relationship becomes unknown |
| NO ACTION | Same as RESTRICT but checked at statement end | Useful with deferred constraint checking |
| SET DEFAULT | Set the foreign key column to its DEFAULT value | Rare — use when a "default parent" exists |

ON DELETE CASCADE is powerful and dangerous. A single DELETE on a parent table can cascade through multiple levels of child tables, removing thousands or millions of rows. In a banking application, cascading a customer delete through accounts, transactions, and audit records would be catastrophic. Use CASCADE only for genuinely dependent data, and always document the cascade chain.
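The one textbook-safe case is worth seeing concretely. A sketch using a hypothetical shop schema:

```sql
CREATE TABLE shop.orders (
    order_id   INTEGER  NOT NULL,
    CONSTRAINT pk_orders PRIMARY KEY (order_id)
);

CREATE TABLE shop.order_line (
    order_id   INTEGER  NOT NULL,
    line_no    SMALLINT NOT NULL,
    CONSTRAINT pk_order_line PRIMARY KEY (order_id, line_no),
    CONSTRAINT fk_line_order FOREIGN KEY (order_id)
        REFERENCES shop.orders (order_id)
        ON DELETE CASCADE       -- a line item is meaningless without its order
);

-- This single statement also silently deletes every matching order_line row
DELETE FROM shop.orders WHERE order_id = 42;
```

Contrast this with Meridian's account tables, where RESTRICT is the only defensible choice: a transaction history must survive even when someone tries to delete its customer.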

z/OS vs. LUW: Both platforms support RESTRICT, NO ACTION, CASCADE, and SET NULL; SET DEFAULT is rarely encountered in practice. Both platforms support the ON UPDATE clause, but its practical use is limited because primary keys should rarely change.

CHECK Constraints

CHECK constraints enforce arbitrary conditions on column values:

-- Column-level CHECK
customer_status  CHAR(1)  NOT NULL
    CONSTRAINT chk_cust_status CHECK (customer_status IN ('A','I','C','S')),

-- Table-level CHECK (can reference multiple columns)
CONSTRAINT chk_balance_sign CHECK (
    (account_type = 'CREDIT' AND current_balance <= 0)
    OR
    (account_type <> 'CREDIT' AND current_balance >= 0)
)

Table-level CHECK constraints can reference multiple columns, enabling cross-column validation. However, CHECK constraints cannot reference other tables, call functions with side effects, or use subqueries. For validation that requires cross-table logic, you will need triggers (Chapter 12) or application-layer enforcement.

Informational Constraints

DB2 supports "informational" constraints — constraints that are defined but not enforced by the database engine. These serve as metadata for the optimizer:

-- DB2 for LUW
ALTER TABLE meridian.account
    ADD CONSTRAINT info_positive_bal
    CHECK (current_balance >= 0)
    NOT ENFORCED
    ENABLE QUERY OPTIMIZATION;

When ENABLE QUERY OPTIMIZATION is specified, the optimizer uses the constraint as a hint for query rewriting and access path selection, even though DB2 does not enforce it. This is useful when the constraint is enforced by the application and you want the optimizer to benefit from that knowledge. However, if the application violates the constraint, queries may return incorrect results — the optimizer trusts the constraint unconditionally.

On z/OS, informational constraints are specified differently but serve the same purpose:

-- DB2 for z/OS
ALTER TABLE meridian.account
    ADD CONSTRAINT info_positive_bal
    CHECK (current_balance >= 0)
    NOT ENFORCED;

11.4 CREATE TABLESPACE — z/OS

On DB2 for z/OS, tablespace management is one of the most critical and nuanced aspects of physical database design. The tablespace is the unit of storage, recovery, locking granularity, and utility operation. A poorly designed tablespace strategy will haunt you through every REORG, COPY, RECOVER, and locking timeout for the life of the application.

The z/OS Storage Hierarchy

On z/OS, the physical hierarchy is:

Storage Group (STOGROUP)
  └── Database
        └── Tablespace
              └── Table(s)
                    └── Index (in its own index space)

A storage group defines the DASD (Direct Access Storage Device) volumes where data is physically stored. A database is a logical grouping of tablespaces and index spaces. A tablespace is a VSAM data set that contains one or more tables.

Tablespace Types on z/OS

DB2 for z/OS has evolved its tablespace architecture significantly over the decades. The current recommendation is to use Universal Table Spaces (UTS), which come in two flavors:

Partition-by-Range UTS: For large tables that benefit from range partitioning. Each partition is a separate data set, enabling parallel utility operations and partition-level independence:

CREATE TABLESPACE TRNTS
    IN MERIDDB
    USING STOGROUP MERIDSGRP
    BUFFERPOOL BP1
    LOCKSIZE ROW
    CLOSE NO
    COMPRESS YES
    NUMPARTS 256
    DSSIZE 4G
    SEGSIZE 32
;

Partition-by-Growth UTS: For tables that start small and grow unpredictably. DB2 automatically adds partitions as the data grows:

CREATE TABLESPACE CUSTTS
    IN MERIDDB
    USING STOGROUP MERIDSGRP
    BUFFERPOOL BP1
    LOCKSIZE ROW
    CLOSE NO
    COMPRESS YES
    MAXPARTITIONS 128
    DSSIZE 4G
    SEGSIZE 32
;

The key difference: partition-by-range requires explicit PARTITION clauses on the CREATE TABLE (defining key ranges), while partition-by-growth partitions are added automatically and transparently.
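As a sketch of the partition-by-range side, the key ranges are declared on the table itself (column list abbreviated; partition boundaries here are illustrative):

```sql
-- DB2 for z/OS: table-controlled range partitioning in the PBR tablespace
CREATE TABLE meridian.transaction (
    transaction_id    BIGINT        NOT NULL,
    account_id        INTEGER       NOT NULL,
    transaction_date  DATE          NOT NULL,
    amount            DECIMAL(15,2) NOT NULL
)
IN MERIDDB.TRNTS
PARTITION BY RANGE (transaction_date)
   (PARTITION 1 ENDING AT ('2024-12-31'),
    PARTITION 2 ENDING AT ('2025-12-31'),
    PARTITION 3 ENDING AT ('2026-12-31'));
```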

Legacy types (segmented, classic partitioned) still exist in older installations but should not be used for new development. IBM has stated its strategic direction is UTS.

Key Tablespace Parameters

| Parameter | Purpose | Typical Values |
|-----------|---------|----------------|
| BUFFERPOOL | Assigns the buffer pool for I/O caching | BP0 (4K), BP8K0, BP16K0, BP32K (by page size) |
| LOCKSIZE | Default lock granularity | ROW (most OLTP), PAGE (batch-heavy), TABLE (rare) |
| CLOSE | Whether data sets are closed when not in use | NO for active tables, YES for infrequently accessed |
| COMPRESS | Enable data compression | YES for most production tables |
| DSSIZE | Maximum data set size per partition | 1G, 2G, 4G, 8G, 16G, 32G, 64G |
| SEGSIZE | Number of pages per segment | 4 to 64 (32 is common) |
| MAXPARTITIONS | Maximum number of partitions | Up to 4096 for partition-by-growth |
| PRIQTY / SECQTY | Primary and secondary space allocation (KB) | Depends on data volume estimates |
| FREEPAGE / PCTFREE | Free space for insert activity | FREEPAGE 0, PCTFREE 10 (typical OLTP) |
| TRACKMOD | Track modified pages for incremental copies | YES (always, for recovery) |
| LOGGED / NOT LOGGED | Whether changes are logged for recovery | LOGGED (always in production) |

LOCKSIZE guidance: Use ROW for OLTP workloads where concurrency matters. PAGE locking is more efficient for batch processing where contention is low. TABLESPACE locking is essentially serialization and should only be used for reference tables that are rarely updated.

Buffer pool assignment is critical for performance. High-activity tablespaces should be in buffer pools with enough memory to cache their frequently accessed pages. We will explore buffer pool tuning in detail in Chapter 25.

LOB Tablespaces

LOB (Large Object) columns require separate LOB tablespaces on z/OS:

CREATE LOB TABLESPACE DOCLOB
    IN MERIDDB
    USING STOGROUP MERIDSGRP
    BUFFERPOOL BP8K0
    LOG YES
    CLOSE NO
;

When a table contains CLOB, BLOB, or DBCLOB columns, the LOB data is stored separately from the base table row. The base row contains a "LOB descriptor" that points to the LOB tablespace. This architectural separation is important because:

  • LOB data can be very large (up to 2 GB per value)
  • LOB tablespaces have different space management characteristics
  • LOB data is often accessed differently from base row data (less frequently, sequentially)

Creating the z/OS Storage Infrastructure

Before you can create tables, you must create the storage infrastructure. The typical sequence is:

-- Step 1: Create the storage group
CREATE STOGROUP MERIDSGRP
    VOLUMES ('VOL001', 'VOL002', 'VOL003')
    VCAT MERIDCAT;

-- Step 2: Create the database
CREATE DATABASE MERIDDB
    STOGROUP MERIDSGRP
    BUFFERPOOL BP0
    CCSID UNICODE;

-- Step 3: Create tablespaces
CREATE TABLESPACE CUSTTS
    IN MERIDDB
    USING STOGROUP MERIDSGRP
    BUFFERPOOL BP1
    LOCKSIZE ROW
    CLOSE NO
    COMPRESS YES
    DSSIZE 4G
    SEGSIZE 32;

-- Step 4: Now create tables in the tablespace
CREATE TABLE meridian.customer (
    ...
) IN MERIDDB.CUSTTS;

This multi-step process is characteristic of z/OS — the DBA has explicit control over every layer of the storage hierarchy.


11.5 Tablespace Management — LUW

DB2 for LUW takes a fundamentally different approach to tablespace management. While z/OS requires explicit VSAM data set management through storage groups and volumes, LUW abstracts much of the physical storage layer, offering simpler management with less granular control.

Tablespace Types on LUW

DB2 for LUW has three tablespace management approaches:

Automatic Storage: The modern default. DB2 manages data file placement and growth automatically across one or more storage paths defined at the database level:

-- Database created with automatic storage paths
CREATE DATABASE meridiandb ON '/data/db2/path1', '/data/db2/path2';

-- Tablespace uses automatic storage by default
CREATE TABLESPACE ts_customer_data
    PAGESIZE 8K
    MANAGED BY AUTOMATIC STORAGE
    EXTENTSIZE 32
    PREFETCHSIZE AUTOMATIC
    OVERHEAD 7.5
    TRANSFERRATE 0.06
    NO FILE SYSTEM CACHING;

Database-Managed Space (DMS): The DBA specifies the file or device containers. DMS provides more control than automatic storage and was the preferred choice before automatic storage matured:

CREATE TABLESPACE ts_transaction_data
    PAGESIZE 16K
    MANAGED BY DATABASE
    USING (
        FILE '/data/db2/trn_data01.dbf' 10G,
        FILE '/data/db2/trn_data02.dbf' 10G
    )
    EXTENTSIZE 32
    PREFETCHSIZE AUTOMATIC;

System-Managed Space (SMS): The operating system manages the files. SMS is the simplest option but offers the least control. It is suitable for development and small workloads:

CREATE TABLESPACE ts_temp_work
    MANAGED BY SYSTEM
    USING ('/data/db2/temp_work');

Page Sizes

LUW supports four page sizes: 4K, 8K, 16K, and 32K. The page size affects the maximum row length and the maximum number of columns:

| Page Size | Max Row Length | Max Columns | Use When |
|-----------|----------------|-------------|----------|
| 4 KB | ~4,005 bytes | 500 | Small rows, many rows per page |
| 8 KB | ~8,101 bytes | 1,012 | General purpose — good default |
| 16 KB | ~16,293 bytes | 1,012 | Wide rows, many VARCHAR columns |
| 32 KB | ~32,677 bytes | 1,012 | Very wide rows, staging tables |

The page size must be chosen before creating the tablespace and cannot be changed afterward. Buffer pools are page-size-specific: a tablespace with 8K pages must use a buffer pool configured for 8K pages.
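Because of that pairing, the buffer pool must exist before the tablespace that uses it. A minimal LUW sketch (the buffer pool and tablespace names are illustrative):

```sql
-- DB2 for LUW: an 8K buffer pool must exist before an 8K tablespace can use it
CREATE BUFFERPOOL bp8k SIZE AUTOMATIC PAGESIZE 8K;

CREATE TABLESPACE ts_wide_rows
    PAGESIZE 8K
    BUFFERPOOL bp8k
    MANAGED BY AUTOMATIC STORAGE;
```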

Extent Size and Prefetch

Extent size is the number of pages allocated at a time when the tablespace needs more space. A larger extent size reduces allocation overhead but may waste space for small tables. Common values range from 8 to 64 pages.

Prefetch size determines how many pages are read in a single I/O request during sequential prefetch. Setting this to AUTOMATIC lets DB2 calculate the optimal prefetch size based on the extent size and the number of containers — the recommended setting for most workloads.

Tablespace for Temporary Tables

DB2 for LUW uses separate system temporary tablespaces for sort operations and intermediate results. You must have at least one system temporary tablespace for each page size used by user tablespaces:

CREATE SYSTEM TEMPORARY TABLESPACE ts_temp_8k
    PAGESIZE 8K
    MANAGED BY AUTOMATIC STORAGE
    EXTENTSIZE 32;

User temporary tablespaces hold declared global temporary tables (DGTTs):

CREATE USER TEMPORARY TABLESPACE ts_user_temp
    PAGESIZE 8K
    MANAGED BY AUTOMATIC STORAGE;

LUW vs. z/OS Tablespace Comparison

| Aspect | z/OS | LUW |
|--------|------|-----|
| Storage abstraction | STOGROUP > VSAM data sets | Storage paths > files or devices |
| Management model | Always DBA-managed | Automatic, DMS, or SMS |
| Page sizes | 4K, 8K, 16K, 32K | 4K, 8K, 16K, 32K |
| Partitioning | Tablespace-level (partition-by-range, partition-by-growth) | Table-level partitioning (independent of tablespace) |
| LOB storage | Separate LOB tablespace required | LOB data in same or separate tablespace |
| Lock granularity | LOCKSIZE on tablespace | Lock granularity at table/row level via DB configuration |
| Compression | Tablespace-level COMPRESS YES | Table-level COMPRESS YES |
| Buffer pool binding | Tablespace bound to specific buffer pool | Tablespace bound to specific buffer pool |

The most important conceptual difference: on z/OS, tablespace partitioning drives table partitioning. On LUW, table partitioning (using range, hash, or list partitioning in the CREATE TABLE statement) is independent of the tablespace structure.
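A sketch of the LUW side makes the contrast concrete — the partitioning scheme travels with the table (the tablespace name and date ranges are illustrative):

```sql
-- DB2 for LUW: range partitioning is a property of the table, not the tablespace
CREATE TABLE meridian.transaction (
    transaction_id    BIGINT        NOT NULL,
    transaction_date  DATE          NOT NULL,
    amount            DECIMAL(15,2) NOT NULL
)
IN ts_txn_data
PARTITION BY RANGE (transaction_date)
   (STARTING FROM ('2024-01-01')
    ENDING AT ('2026-12-31')
    EVERY 3 MONTHS);
```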


Check Your Understanding (Box 3)

  1. Name the three tablespace management approaches on DB2 for LUW.
  2. What z/OS tablespace type does IBM recommend for new development?
  3. Why must you have a system temporary tablespace for each page size in LUW?

These architectural distinctions will be tested in certification exams and matter in production.


11.6 CREATE INDEX

An index is a separate physical structure that provides fast access to table rows based on the values in one or more columns. Without indexes, every query would require a full table scan — reading every row to find the ones that match. Indexes are what make relational databases practical at scale.

B-Tree Index Fundamentals

DB2 uses B-tree (balanced tree) indexes as its primary index structure on both z/OS and LUW. A B-tree index organizes key values in a hierarchical tree structure where:

  • The root page is the top of the tree
  • Non-leaf pages contain key ranges and pointers to lower-level pages
  • Leaf pages contain key values and pointers (RIDs) to the actual data rows

A query using the index traverses from root to leaf in typically 2-4 levels, regardless of table size. A table with 10 million rows might have a 3-level index, requiring only 3 page reads to locate any single row — versus reading millions of pages in a table scan.

Basic CREATE INDEX Syntax

-- Unique index (enforces uniqueness)
CREATE UNIQUE INDEX meridian.ix_customer_pk
    ON meridian.customer (customer_id ASC);

-- Non-unique index for query performance
CREATE INDEX meridian.ix_customer_name
    ON meridian.customer (last_name ASC, first_name ASC);

Unique Indexes

A unique index enforces that no two rows have the same value(s) in the indexed columns. Primary key and unique constraints automatically create unique indexes. You can also create standalone unique indexes:

CREATE UNIQUE INDEX meridian.ix_customer_taxid
    ON meridian.customer (tax_id);

Clustering Indexes

A clustering index tells DB2 to attempt to store rows in the physical order defined by the index. This dramatically improves the performance of range scans and sequential access:

-- Same syntax on DB2 for z/OS and DB2 for LUW
CREATE INDEX meridian.ix_txn_date
    ON meridian.transaction (transaction_date ASC)
    CLUSTER;

A table can have only one clustering index. Choose it carefully — it should correspond to the most common access pattern. For a transaction table, clustering by transaction_date is almost always the right choice, because most queries filter by date range.

Over time, as rows are inserted, the physical clustering degrades. The REORG utility restores physical clustering. We will cover REORG in Chapter 17.
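On LUW, the catalog exposes how well a table still matches its clustering index; a sketch (CLUSTERRATIO is populated by RUNSTATS, and a low value suggests a REORG is overdue):

```sql
-- Values near 100 mean rows are still close to index order;
-- values well below that usually justify scheduling a REORG.
SELECT indname, clusterratio
FROM   syscat.indexes
WHERE  tabschema = 'MERIDIAN'
  AND  tabname   = 'TRANSACTION';
```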

INCLUDE Columns

DB2 for LUW (and z/OS in later versions) supports INCLUDE columns in unique indexes. These columns are stored in the index leaf pages but are not part of the uniqueness key:

CREATE UNIQUE INDEX meridian.ix_account_pk
    ON meridian.account (account_id)
    INCLUDE (account_type, current_balance, customer_id);

This creates an "index-only" access path: queries that need only account_id, account_type, current_balance, and customer_id can be satisfied entirely from the index without accessing the base table. This is one of the most powerful performance optimizations in DB2.
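For example, a query that touches only the indexed and INCLUDEd columns never has to read a data page:

```sql
-- Satisfied entirely from ix_account_pk: every referenced column
-- lives in the index leaf pages, so no base-table access is needed.
SELECT account_id, account_type, current_balance
FROM   meridian.account
WHERE  account_id = 1001;
```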

Index Compression

Both platforms support index compression to reduce storage and I/O:

-- Same syntax on DB2 for z/OS and DB2 for LUW
CREATE INDEX meridian.ix_txn_date
    ON meridian.transaction (transaction_date)
    COMPRESS YES;

Index compression works by deduplicating common prefixes in the index keys. It is most effective on indexes with long, repetitive key values and is generally recommended for large indexes.

Partitioned Indexes (z/OS)

On z/OS, indexes on partitioned tablespaces can be either partitioned (DPSI — Data-Partitioned Secondary Index) or non-partitioned (NPSI — Non-Partitioned Secondary Index):

-- Partitioned index (DPSI) — one index partition per data partition
CREATE INDEX meridian.ix_txn_acct
    ON meridian.transaction (account_id)
    PARTITIONED;

-- Non-partitioned index (NPSI) — the default when PARTITIONED is omitted
CREATE INDEX meridian.ix_txn_refno
    ON meridian.transaction (reference_number);

DPSIs enable partition-level independence — you can REORG, COPY, or RECOVER a single partition without affecting the entire index. NPSIs are required for uniqueness across partitions (unless the partitioning key is part of the unique key).

Index Design Guidelines

  1. Always index foreign key columns. Without an index on the child table's FK column, every DELETE on the parent table requires a full scan of the child table to check for referencing rows.
  2. Index columns used in WHERE clauses, JOIN conditions, and ORDER BY. These are the columns the optimizer uses for access path selection.
  3. Choose the clustering index based on the most common range scan. Date columns are common clustering choices for transaction tables.
  4. Use INCLUDE columns to enable index-only access for frequently executed queries that need only a few columns.
  5. Do not over-index. Every index imposes a cost on INSERT, UPDATE, and DELETE operations, because the index must be maintained. A table with 20 indexes incurs 20 index updates for every row modification.
  6. Name your indexes consistently. Convention: ix_{table}_{column(s)} (e.g., ix_customer_name, ix_txn_date).
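Guideline 1 can be checked from the catalog. A minimal LUW sketch that lists each foreign key and its column list so you can compare against existing index definitions (SYSCAT.REFERENCES is the LUW catalog view; on z/OS the equivalent information is in SYSIBM.SYSRELS and SYSIBM.SYSFOREIGNKEYS):

```sql
-- List every foreign key in the schema with its column list;
-- each FK_COLNAMES value should have a supporting index whose
-- leading columns match (compare against SYSCAT.INDEXES.COLNAMES).
SELECT constname, tabname, fk_colnames
FROM   syscat.references
WHERE  tabschema = 'MERIDIAN'
ORDER BY tabname;
```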

11.7 ALTER TABLE — Evolving Without Rebuilding

Production tables change. New business requirements demand new columns. Data quality issues require new constraints. Regulatory mandates require expanded column lengths. The ALTER TABLE statement is how you evolve a table's structure without dropping and recreating it.

ADD COLUMN

The most common ALTER TABLE operation:

ALTER TABLE meridian.customer
    ADD COLUMN middle_name VARCHAR(50);

ALTER TABLE meridian.customer
    ADD COLUMN loyalty_tier CHAR(1) NOT NULL DEFAULT 'S'
        CONSTRAINT chk_loyalty CHECK (loyalty_tier IN ('S','G','P','D'));

Critical rule: Added columns are always appended to the end of the table. You cannot insert a column at a specific position. If column order matters for your application (it should not), you must drop and recreate the table.

On z/OS, adding a NOT NULL column with a DEFAULT value does not immediately rewrite existing rows. Instead, DB2 stores the default in the catalog and materializes it on first access or during the next REORG. This is a significant performance advantage for large tables — adding a column is an instant catalog-only operation.

On LUW, the behavior is similar starting with DB2 11.1 — adding a NOT NULL column with DEFAULT is an online, non-disruptive operation for most data types.

ALTER COLUMN

Modifying existing columns is more constrained than adding new ones:

-- Increase VARCHAR length (safe — always works)
ALTER TABLE meridian.customer
    ALTER COLUMN email_address SET DATA TYPE VARCHAR(200);

-- Change default value
ALTER TABLE meridian.customer
    ALTER COLUMN customer_status SET DEFAULT 'A';

-- Drop the default
ALTER TABLE meridian.customer
    ALTER COLUMN customer_status DROP DEFAULT;

What you can safely alter:

  • Increase VARCHAR length (always safe)
  • Increase DECIMAL precision (from DECIMAL(10,2) to DECIMAL(15,2))
  • Change or drop DEFAULT values
  • Set or drop NOT NULL (with conditions)

What you cannot easily alter:

  • Decrease column length (would truncate data)
  • Change data type (INTEGER to VARCHAR, DATE to TIMESTAMP)
  • Rename a column (supported on LUW 9.7+, not on z/OS without workarounds)

For changes that ALTER TABLE cannot handle, the standard procedure is:

  1. Create a new table with the desired structure
  2. Copy data from the old table to the new table (with appropriate CAST operations)
  3. Drop the old table
  4. Rename the new table (or use an alias)

This is a significant undertaking for large tables and should be planned carefully with appropriate downtime windows.
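The four steps above can be sketched as follows (the customer_code column and its type change are illustrative; on LUW, RENAME TABLE completes the swap):

```sql
-- 1. Create the new structure (illustrative columns only;
--    customer_code changes from INTEGER to VARCHAR(10))
CREATE TABLE meridian.customer_new (
    customer_id   INTEGER      NOT NULL PRIMARY KEY,
    customer_code VARCHAR(10)  NOT NULL
);

-- 2. Copy the data, casting where the types changed
INSERT INTO meridian.customer_new (customer_id, customer_code)
SELECT customer_id, CAST(customer_code AS VARCHAR(10))
FROM   meridian.customer;

-- 3. Remove the old table
DROP TABLE meridian.customer;

-- 4. Take over the old name
RENAME TABLE meridian.customer_new TO customer;
```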

ADD CONSTRAINT

Adding constraints to existing tables:

-- Add a foreign key
ALTER TABLE meridian.account
    ADD CONSTRAINT fk_acct_branch
    FOREIGN KEY (branch_id)
    REFERENCES meridian.branch (branch_id)
    ON DELETE RESTRICT;

-- Add a check constraint
ALTER TABLE meridian.transaction
    ADD CONSTRAINT chk_amount_positive
    CHECK (transaction_amount > 0);

When you add a constraint to a table that already contains data, DB2 validates the existing rows against the constraint. If any rows violate it, the ALTER fails. On z/OS, depending on the CURRENT RULES setting, adding a CHECK constraint can instead put the tablespace in CHECK-pending (CHKP) status until the data is validated using the CHECK DATA utility. On LUW, the validation occurs during the ALTER TABLE execution.
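Before adding a constraint to a populated table, it is worth finding the violators yourself; a sketch for the check constraint above:

```sql
-- Rows that would make the ADD CONSTRAINT fail.
-- (NULLs satisfy a CHECK constraint, so they are not violators.)
SELECT transaction_id, transaction_amount
FROM   meridian.transaction
WHERE  NOT (transaction_amount > 0)
  AND  transaction_amount IS NOT NULL;
```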

DROP CONSTRAINT

ALTER TABLE meridian.account
    DROP CONSTRAINT fk_acct_branch;

ALTER TABLE meridian.transaction
    DROP CHECK chk_amount_positive;

Dropping a constraint is immediate and does not require a REORG. However, dropping a PRIMARY KEY or UNIQUE constraint also drops the associated unique index, which may affect query performance.

Online Schema Changes

Both platforms have made significant progress toward online schema changes — alterations that do not require application downtime:

z/OS (DB2 12+):

  • ADD COLUMN is immediate (catalog-only)
  • Added columns may put the tablespace in REORG-pending (AREOR) status, but the table remains accessible
  • Most ALTER COLUMN operations are catalog-only, with materialization deferred to REORG
  • Some changes require a REORG TABLESPACE to materialize

LUW (11.1+):

  • ADD COLUMN with DEFAULT is typically non-disruptive
  • Column length increases are online
  • Adding NOT ENFORCED constraints is online
  • Adding enforced constraints requires data validation (may lock the table)

REORG Pending States (z/OS)

On z/OS, certain ALTER operations put the tablespace or index into a "pending" state:

State | Abbreviation | Cause | Resolution
Advisory REORG-pending | AREOR | Column added or type changed | REORG TABLESPACE
CHECK-pending | CHKP | Constraint added to populated table | CHECK DATA utility
REBUILD-pending | RBDP | Index affected by an ALTER | REBUILD INDEX

While in a pending state, the table remains accessible for most operations, but some utilities and operations may be restricted. The pending state is DB2's way of saying: "I have accepted your change, but I need you to complete it by running the appropriate utility."

Monitor pending states with the -DISPLAY DATABASE command or by querying the SYSIBM.SYSTABLESPACE catalog table.
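On z/OS, the usual first check is the DISPLAY DATABASE command; a sketch using this chapter's database name (issued as a DB2 command, not SQL):

```sql
-- Spaces in restrictive states (e.g., CHKP):
-DISPLAY DATABASE(MERIDDB) SPACENAM(*) RESTRICT

-- Spaces in advisory states (e.g., AREOR):
-DISPLAY DATABASE(MERIDDB) SPACENAM(*) ADVISORY
```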


11.8 DROP — Removing Objects

The DROP statement removes objects from the database. It is the most dangerous DDL statement, and its effects are permanent — there is no UNDO for a DROP.

DROP TABLE

DROP TABLE meridian.customer;

This removes the table, all its data, all its indexes, all its constraints, and all dependent objects (views, triggers, etc. that reference the table). On z/OS, the tablespace that contained the table still exists — you must drop it separately if desired. On LUW, the tablespace also persists.

CASCADE vs. RESTRICT

DB2 handles dependent objects differently depending on the platform:

z/OS: DROP TABLE automatically drops all dependent indexes and invalidates dependent plans and packages. Views that reference the dropped table become inoperable (they are not automatically dropped — they remain in the catalog but fail when accessed). This behavior can cause cascading failures in production if applications depend on views of the dropped table.

LUW: DROP TABLE succeeds even when dependent objects exist. Indexes, triggers, and referential constraints that involve the table are dropped along with it; dependent views and MQTs are not dropped but are invalidated and fail (or are revalidated, where automatic revalidation applies) the next time they are used:

-- Succeeds; dependent views become invalid rather than being dropped
DROP TABLE meridian.customer;

DROP INDEX

DROP INDEX meridian.ix_customer_name;

Dropping an index removes the access path but does not affect the data. However, queries that relied on the index may suddenly become much slower — switching from a fast index scan to a full table scan. Always check the EXPLAIN output for production queries before dropping an index.

On z/OS, dropping an index also drops the associated index space (VSAM data set).

DROP TABLESPACE

-- DB2 for z/OS
DROP TABLESPACE MERIDDB.CUSTTS;

-- DB2 for LUW
DROP TABLESPACE ts_customer_data;

On z/OS, DROP TABLESPACE drops the tablespace and all tables and indexes within it. This is a bulk destruction operation. On LUW, you cannot drop a tablespace that still contains tables — you must drop (or move) the tables first.

The "DROP TABLE in Production" Nightmare

Every experienced DBA has either witnessed or heard about the accidental DROP TABLE in production. The scenarios are predictable:

  1. A developer runs a script intended for the test environment against production.
  2. A DBA copies a table name wrong in a migration script.
  3. An automated deployment tool drops a table as part of a "recreate" strategy.

Mitigation strategies:

  • Grant DROP privileges sparingly. Only DBAs should have DROP authority on production tables.
  • Use schemas to separate environments. Test tables in schema TEST, production tables in schema PROD. A DROP TABLE TEST.customer is survivable; a DROP TABLE PROD.customer is not.
  • Require two-person approval for DROP in production. This is a process control, not a technical control, but it catches mistakes.
  • Maintain current backups. If a DROP happens, you need to be able to RECOVER the table (z/OS) or RESTORE from a backup (LUW) quickly.
  • Use aliases or synonyms as an abstraction layer. Applications reference an alias; the actual table can be swapped without changing application code.

On z/OS, the RECOVER utility cannot restore an object that no longer exists; recovering a dropped table typically means recreating the objects and using DSN1COPY with OBID translation from a prior image copy. On LUW, if the tablespace was created with DROPPED TABLE RECOVERY ON, the table's data can be extracted during a ROLLFORWARD; otherwise recovery requires restoring a backup and rolling forward to a point in time before the DROP. Neither process is instant — both require careful planning and execution, often under intense pressure.


11.9 Sequences and Identity Columns

Generating unique numeric values is a fundamental requirement. Order numbers, transaction IDs, customer numbers — all require guaranteed uniqueness without relying on application logic that might have bugs or concurrency issues. DB2 provides two mechanisms: sequences and identity columns.

CREATE SEQUENCE

A sequence is a standalone database object that generates unique numeric values:

CREATE SEQUENCE meridian.seq_transaction_id
    AS BIGINT
    START WITH 1000000
    INCREMENT BY 1
    NO MAXVALUE
    NO CYCLE
    CACHE 50;

You retrieve the next value using the NEXT VALUE FOR expression:

INSERT INTO meridian.transaction (transaction_id, account_id, amount)
VALUES (NEXT VALUE FOR meridian.seq_transaction_id, 1001, 250.00);

And you can retrieve the most recently generated value in the current session:

SELECT PREVIOUS VALUE FOR meridian.seq_transaction_id
FROM SYSIBM.SYSDUMMY1;

Identity Columns

An identity column is a column that automatically generates a value on INSERT:

CREATE TABLE meridian.customer (
    customer_id  INTEGER  NOT NULL
        GENERATED ALWAYS AS IDENTITY
        (START WITH 1, INCREMENT BY 1, CACHE 20),
    ...
);

GENERATED ALWAYS vs. GENERATED BY DEFAULT:

Option | Behavior | Use When
GENERATED ALWAYS | DB2 always generates the value. INSERTs that specify a value for this column fail. | New tables where you want absolute control over ID generation
GENERATED BY DEFAULT | DB2 generates a value only if the INSERT does not provide one. | Data migration scenarios where you need to preserve existing ID values

GENERATED ALWAYS is the safer choice for new applications. It prevents accidental or malicious insertion of duplicate values. However, GENERATED BY DEFAULT is necessary when loading data from another system that already has assigned IDs.

Gaps in Sequences and Identity Columns

Both sequences and identity columns can produce gaps in the generated values. Gaps occur when:

  1. A transaction generates a value but then rolls back
  2. The database is shut down and cached values are lost
  3. Multiple threads each cache a range of values, and one thread uses its range faster than another

Gaps are normal and expected. Do not design your application to rely on gapless sequences — that constraint imposes severe concurrency limitations.
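The first cause is easy to demonstrate; a sketch:

```sql
INSERT INTO meridian.transaction (transaction_id, account_id, amount)
VALUES (NEXT VALUE FOR meridian.seq_transaction_id, 1001, 99.00);

ROLLBACK;
-- The rollback undoes the INSERT, but the sequence does not "give back"
-- the generated value: the next NEXT VALUE FOR skips it, leaving a gap.
```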

Caching and Performance

The CACHE clause specifies how many values DB2 pre-allocates in memory:

CACHE 50   -- Pre-allocate 50 values in memory at a time
NO CACHE   -- Persist every value as it is generated (slower; avoids losing
           -- unused values at shutdown, though rollbacks still leave gaps)

Larger cache sizes improve INSERT throughput because fewer disk I/O operations are needed to update the sequence's current value. However, larger caches mean more potential gaps if the database is shut down abnormally. A cache of 20-100 is typical for OLTP workloads.
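The cache size is not locked in at creation time; it can be tuned later without recreating the sequence:

```sql
-- Raise the cache on an existing sequence; its current value is unaffected
ALTER SEQUENCE meridian.seq_transaction_id CACHE 100;
```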

z/OS performance consideration: On z/OS, sequence and identity column caching is especially important in data sharing environments. Without caching, every NEXT VALUE FOR call requires a lock on the sequence catalog row, creating a serialization point across all members of the data sharing group.

Sequence vs. Identity: When to Use Which

Factor | Sequence | Identity
Multiple tables | Yes — one sequence can feed many tables | No — tied to one column in one table
Explicit NEXT VALUE FOR call | Required | Not needed (auto-generated on INSERT)
Schema independence | Standalone object | Part of the table definition
Migration/loading | Application can insert any explicit value; the sequence is simply not used | GENERATED BY DEFAULT allows explicit values
Cross-table ordering | Possible with shared sequence | Not possible

Use identity columns for single-table primary keys. Use sequences when you need a single counter shared across multiple tables or when you need to generate the value before the INSERT (e.g., to use it as a foreign key in a related table within the same transaction).
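Generating the value before the INSERT looks like this; a sketch that reuses one generated ID for a parent row and a dependent row in the same transaction (the audit columns and the seq_audit_id sequence are illustrative, not part of the Meridian DDL):

```sql
-- Parent row consumes the next sequence value...
INSERT INTO meridian.transaction (transaction_id, account_id, amount)
VALUES (NEXT VALUE FOR meridian.seq_transaction_id, 1001, 250.00);

-- ...and the dependent row reuses the same value in this session.
INSERT INTO meridian.audit_log (audit_id, transaction_id, change_detail)
VALUES (NEXT VALUE FOR meridian.seq_audit_id,
        PREVIOUS VALUE FOR meridian.seq_transaction_id,
        'Transaction posted');
```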


Check Your Understanding (Box 4)

  1. What is the difference between GENERATED ALWAYS and GENERATED BY DEFAULT?
  2. Why do gaps occur in identity column values, and should you try to prevent them?
  3. What does the CACHE clause control, and what is the trade-off?

These distinctions appear on certification exams and in production debugging scenarios.


11.10 Generated Columns and Row Change Timestamps

Beyond identity columns, DB2 supports columns whose values are derived from expressions involving other columns in the same row or from database state.

Expression-Based Generated Columns

A generated column computes its value automatically from other columns in the same row:

CREATE TABLE meridian.account (
    account_id       INTEGER       NOT NULL,
    first_name       VARCHAR(50)   NOT NULL,
    last_name        VARCHAR(50)   NOT NULL,
    full_name        VARCHAR(101)
        GENERATED ALWAYS AS (first_name || ' ' || last_name),
    daily_limit      DECIMAL(10,2) NOT NULL,
    monthly_limit    DECIMAL(12,2)
        GENERATED ALWAYS AS (daily_limit * 30),
    ...
);

Generated columns are computed when the row is inserted or when the source columns are updated. You cannot directly INSERT or UPDATE a generated column — DB2 always computes the value from the expression.

Use cases for generated columns:

  • Concatenated names or addresses for display
  • Computed totals or derived amounts
  • Normalized or uppercased search columns
  • Hash values for data comparison
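The "normalized search column" use case, sketched on a hypothetical demo table: a generated uppercase copy of last_name, indexed so that case-insensitive lookups can use the index instead of applying UPPER() to every row at query time:

```sql
CREATE TABLE meridian.customer_search_demo (
    customer_id     INTEGER     NOT NULL,
    last_name       VARCHAR(50) NOT NULL,
    last_name_upper VARCHAR(50)
        GENERATED ALWAYS AS (UPPER(last_name))
);

CREATE INDEX meridian.ix_search_upper
    ON meridian.customer_search_demo (last_name_upper);

-- A lookup such as
--   SELECT customer_id FROM meridian.customer_search_demo
--   WHERE last_name_upper = UPPER('o''brien')
-- can use the index rather than scanning and uppercasing every row.
```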

Row Change Timestamp Columns

A row change timestamp column automatically records when a row was last modified:

-- DB2 for z/OS
CREATE TABLE meridian.customer (
    customer_id      INTEGER      NOT NULL,
    ...
    row_changed      TIMESTAMP    NOT NULL
        GENERATED ALWAYS FOR EACH ROW ON UPDATE AS ROW CHANGE TIMESTAMP,
    ...
);

DB2 for LUW (9.5 and later) supports the same GENERATED ALWAYS FOR EACH ROW ON UPDATE AS ROW CHANGE TIMESTAMP syntax, so the definition above works on both platforms. On older LUW releases, or where a row change timestamp column cannot be used, you can approximate one:

-- Fallback approach without a row change timestamp column
CREATE TABLE meridian.customer (
    customer_id      INTEGER      NOT NULL,
    ...
    row_changed      TIMESTAMP    NOT NULL DEFAULT CURRENT TIMESTAMP,
    ...
);

With this fallback, an UPDATE trigger must maintain the row_changed column, because the DEFAULT fires only on INSERT. A native ROW CHANGE TIMESTAMP column is maintained automatically by DB2 with no trigger overhead.

Row change timestamps are valuable for:

  • Optimistic concurrency control: Applications read the row change timestamp when fetching data and check it before updating to detect concurrent modifications
  • Incremental data extraction: ETL processes can select only rows modified since the last extraction
  • Audit trailing: Knowing when any column in a row last changed
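The optimistic concurrency pattern, sketched (the :old_row_changed host variable stands for the timestamp the application captured when it first read the row):

```sql
-- Read phase: remember the timestamp along with the data
SELECT customer_status, row_changed
FROM   meridian.customer
WHERE  customer_id = 1001;

-- Write phase: succeed only if nobody changed the row in between
UPDATE meridian.customer
SET    customer_status = 'A'
WHERE  customer_id = 1001
  AND  row_changed  = :old_row_changed;
-- Zero rows updated => another transaction got there first; re-read and retry.
```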


11.11 CREATE ALIAS and Synonyms

Aliases provide an alternative name for a table, view, or other alias. They serve as an abstraction layer between applications and physical database objects.

CREATE ALIAS

-- DB2 for z/OS
CREATE ALIAS meridian.cust FOR meridian.customer;

-- DB2 for LUW
CREATE ALIAS meridian.cust FOR meridian.customer;

Applications can then use meridian.cust wherever they would use meridian.customer. The alias is resolved at execution time, so it always refers to the current underlying table.

Use Cases for Aliases

Schema management: When multiple schemas exist (DEV, TEST, PROD), aliases can provide a common name that points to the appropriate environment:

-- In the DEV environment
CREATE ALIAS app.customer FOR dev.customer;

-- In the PROD environment
CREATE ALIAS app.customer FOR prod.customer;

Applications reference app.customer, and the alias resolves to the correct physical table based on the environment.

Distributed access: When accessing tables across DB2 subsystems (z/OS) or instances (LUW), aliases simplify the three-part name syntax:

-- Instead of: SELECT * FROM remote_server.remote_schema.customer
CREATE ALIAS meridian.remote_customer
    FOR remote_server.remote_schema.customer;

-- Now: SELECT * FROM meridian.remote_customer

Zero-downtime table replacement: To replace a table (e.g., after a structural rebuild), you can:

  1. Create the new table with a temporary name
  2. Load data into the new table
  3. Drop the old alias and create a new one pointing to the new table
  4. Applications continue to reference the alias without interruption
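The swap in steps 3–4 is two short statements, and it is fast because only catalog entries change; a sketch (identity and generated columns would need special handling during the copy):

```sql
-- 1-2: build and load the replacement (structural changes go here)
CREATE TABLE meridian.customer_v2 LIKE meridian.customer;
INSERT INTO meridian.customer_v2 SELECT * FROM meridian.customer;

-- 3: repoint the alias
DROP ALIAS app.customer;
CREATE ALIAS app.customer FOR meridian.customer_v2;

-- 4: applications keep using app.customer, unaware of the swap
```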

z/OS vs. LUW: On z/OS, the CREATE SYNONYM statement also exists and is functionally similar to CREATE ALIAS, but it is deprecated; IBM recommends CREATE ALIAS for new development. On LUW, SYNONYM is accepted as an alternative keyword for ALIAS, so CREATE SYNONYM simply creates an alias.


11.12 The Meridian National Bank Complete DDL

This section brings together every DDL concept from this chapter to build the authoritative Meridian National Bank schema. This is the schema that all subsequent chapters will reference — for queries, data modification, performance tuning, security, and administration.

Schema Overview

Meridian National Bank's core banking schema consists of the following tables:

Table | Purpose | Estimated Rows (5-year)
CUSTOMER | Customer master record | 2 million
ADDRESS | Customer addresses (1:many) | 5 million
BRANCH | Bank branches | 250
EMPLOYEE | Bank employees | 3,500
ACCOUNT | Customer accounts | 4 million
ACCOUNT_HOLDER | Many-to-many: customers to accounts | 5 million
TRANSACTION | Financial transactions | 500 million
TRANSACTION_TYPE | Reference: transaction type codes | 50
LOAN | Loan accounts | 800,000
LOAN_PAYMENT | Loan payment history | 20 million
CARD | Debit/credit cards linked to accounts | 3 million
FEE_SCHEDULE | Fee definitions by account type | 200
AUDIT_LOG | System audit trail | 1 billion

Design Principles

The schema follows these principles:

  1. Named constraints: Every PRIMARY KEY, FOREIGN KEY, UNIQUE, and CHECK constraint has an explicit, descriptive name.
  2. Consistent data types: All monetary amounts are DECIMAL(15,2). All identifiers are INTEGER or BIGINT based on expected volume. All status flags are CHAR(1) with CHECK constraints.
  3. Audit columns: Every table includes created_at TIMESTAMP NOT NULL DEFAULT CURRENT TIMESTAMP and updated_at TIMESTAMP NOT NULL DEFAULT CURRENT TIMESTAMP.
  4. Referential integrity: All relationships are enforced with FOREIGN KEY constraints. ON DELETE is RESTRICT by default.
  5. Index coverage: Primary keys, foreign keys, and commonly queried columns are indexed.

Key Tables — LUW DDL

The complete DDL is in code/meridian-complete-ddl.sql. Here we highlight the key design decisions.

CUSTOMER table: The root entity. customer_id uses GENERATED ALWAYS AS IDENTITY because this is a new system with no legacy IDs to preserve. The tax_id column has a UNIQUE constraint because no two customers should share a tax identifier.

ACCOUNT table: Links to CUSTOMER through the ACCOUNT_HOLDER junction table, supporting joint accounts (one account, multiple holders) and customers with multiple accounts. The account_number is a business-facing identifier separate from the surrogate primary key.

TRANSACTION table: The highest-volume table in the system. On z/OS, it would be in a partition-by-range tablespace partitioned by transaction_date, enabling partition-level utility operations and archival. The transaction_id uses BIGINT because 500 million rows over five years will approach INTEGER limits, especially with gaps from caching.

AUDIT_LOG table: The largest table by row count. Designed for append-only writes — rows are never updated or deleted (enforced by application policy and security grants). Uses BIGINT for audit_id and CLOB for the change_detail column to accommodate variable-length change descriptions.

Cross-Platform Considerations

The code/meridian-complete-ddl.sql script is written for DB2 for LUW. For z/OS, the following changes would be necessary:

  1. Add IN clauses to place each table in its designated tablespace
  2. Create storage groups, databases, and tablespaces before creating tables (see code/tablespace-examples.sql)
  3. Replace BOOLEAN columns with CHAR(1) + CHECK constraint
  4. Add ROW CHANGE TIMESTAMP columns where appropriate
  5. Adjust VARCHAR maximum lengths if any exceed z/OS limits
  6. Create explicit index spaces for each index

The z/OS tablespace DDL for the Meridian schema is provided in code/tablespace-examples.sql, demonstrating the complete storage hierarchy.


Spaced Review: Concepts from Chapters 2, 4, and 9

The following questions revisit material from earlier chapters. Attempting to recall this information now — rather than re-reading it — strengthens your long-term retention.

From Chapter 2 (The Relational Model):

  1. Define "referential integrity" in your own words. How does the FOREIGN KEY constraint from this chapter implement the relational model's referential integrity rule?
  2. What is the difference between a candidate key and a primary key? How do the PRIMARY KEY and UNIQUE constraints map to these concepts?

From Chapter 4 (Setting Up Your Environment):

  1. When you created your first database using CREATE DATABASE, what objects did DB2 create automatically? How do the tablespaces discussed in this chapter relate to those automatically created objects?

From Chapter 9 (Data Modification):

  1. When an INSERT statement omits a column, what value does that column receive? How do the DEFAULT clauses from this chapter's CREATE TABLE statements determine that behavior?
  2. If you attempt to DELETE a row from the CUSTOMER table that has child rows in the ACCOUNT_HOLDER table, and the foreign key specifies ON DELETE RESTRICT, what happens? Write the exact error you would expect.

11.13 DDL Best Practices Summary

These guidelines reflect the collective experience of DB2 professionals across decades of production operations.

Naming Conventions

Establish and enforce consistent naming conventions before writing any DDL:

Object | Convention | Example
Table | Singular noun, descriptive | CUSTOMER, ACCOUNT, TRANSACTION
Column | snake_case, no abbreviations except standard ones | customer_id, first_name, account_type
Primary key | pk_{table} | pk_customer
Foreign key | fk_{child}_{parent} | fk_account_customer
Unique constraint | uq_{table}_{column(s)} | uq_customer_taxid
Check constraint | chk_{table}_{description} | chk_cust_status
Index | ix_{table}_{column(s)} | ix_customer_name
Tablespace (z/OS) | {table_abbrev}TS | CUSTTS, TRNTS
Sequence | seq_{purpose} | seq_transaction_id

Version Control Your DDL

DDL scripts belong in version control (Git) alongside application code. Every change to the schema should be:

  1. Written as a migration script (ALTER statements, not full recreations)
  2. Reviewed by a DBA before execution
  3. Tested in a non-production environment
  4. Applied using a consistent deployment process

Document Your Decisions

Every non-obvious DDL decision should be documented with SQL comments:

-- BIGINT chosen because 5-year projected volume (500M rows)
-- exceeds INTEGER capacity with identity gaps
-- Reviewed: 2026-03-16, DBA: J. Nakamura
transaction_id  BIGINT  NOT NULL  GENERATED ALWAYS AS IDENTITY
    (START WITH 1, INCREMENT BY 1, CACHE 100),

Test Your Constraints

After creating a table with constraints, verify them:

-- Attempt to violate each constraint
INSERT INTO meridian.customer (tax_id, first_name, last_name, date_of_birth, customer_status)
VALUES ('999-99-9999', 'Test', 'User', '2000-01-01', 'X');
-- Expected: SQLSTATE 23513 (check constraint violation)

INSERT INTO meridian.customer (tax_id, first_name, last_name, date_of_birth)
VALUES (NULL, 'Test', 'User', '2000-01-01');
-- Expected: SQLSTATE 23502 (NOT NULL violation)

Plan for Evolution

Design your schema with change in mind:

  • Use VARCHAR instead of CHAR for columns that might need to grow
  • Use INTEGER or BIGINT primary keys even if SMALLINT would suffice today
  • Leave room in DECIMAL precision for future requirements
  • Document which columns are candidates for future changes

Chapter Summary

This chapter covered the complete DDL vocabulary for DB2, from CREATE TABLE through DROP, with deep dives into both z/OS and LUW platform-specific features. The key takeaways:

  1. CREATE TABLE is the most consequential DDL statement. Data type choices, constraint definitions, and physical placement decisions persist for the life of the table — potentially decades in production.

  2. Data types must be chosen with precision. DECIMAL for money, INTEGER/BIGINT for keys based on volume projections, VARCHAR with realistic maximums for variable text, DATE/TIMESTAMP for temporal data. Never use floating-point for financial values. Never store dates as strings.

  3. Constraints are the database's enforcement mechanism. PRIMARY KEY, UNIQUE, FOREIGN KEY, and CHECK constraints guarantee data integrity regardless of which application writes the data. Name every constraint explicitly.

  4. Tablespace architecture differs fundamentally between z/OS and LUW. z/OS requires explicit creation of the storage hierarchy (STOGROUP > DATABASE > TABLESPACE) before creating tables. LUW offers automatic storage that handles physical management with less DBA intervention.

  5. Indexes provide fast access paths. B-tree indexes, clustering indexes, INCLUDE columns, and compression are essential tools for query performance. But every index costs INSERT/UPDATE/DELETE performance — index judiciously.

  6. ALTER TABLE enables schema evolution without rebuilding. ADD COLUMN and column length increases are safe and often online. Data type changes and column removals require more complex procedures.

  7. DROP is permanent and dangerous. Guard DROP privileges carefully, maintain backups, and use aliases as abstraction layers.

  8. Sequences and identity columns generate unique values automatically. GENERATED ALWAYS is safer; GENERATED BY DEFAULT accommodates data migration. Cache sizes trade gap potential for throughput.

  9. Generated columns and row change timestamps provide computed values and change tracking without application overhead.

  10. The Meridian Bank schema demonstrates all these concepts working together in a realistic banking application.

In Chapter 12, we will build on this foundation with views, triggers, and stored procedures — the programmable layer that sits on top of the physical tables you have learned to create.


Return to Part II: SQL Mastery | Continue to Chapter 12: Views, Triggers, and Stored Procedures