Reading data is safe. Modifying data changes the world. In DB2, you need to do it right or undo it fast.
In This Chapter
- Learning Objectives
- Opening: The Line You Cannot Uncross
- 9.1 INSERT -- Adding Data
- 9.2 UPDATE -- Changing Existing Data
- 9.3 DELETE -- Removing Data
- 9.4 MERGE -- The Upsert Pattern
- 9.5 Transaction Control -- COMMIT and ROLLBACK
- 9.6 SAVEPOINT -- Partial Rollback
- 9.7 The SELECT FROM INSERT/UPDATE/DELETE Pattern
- 9.8 Identity Columns and Sequences
- 9.9 Bulk Operations and Performance
- 9.10 Data Modification Safety Patterns
- 9.11 Meridian Bank Data Operations
- Spaced Review: Concepts from Earlier Chapters
- Summary
Chapter 9: Data Modification: INSERT, UPDATE, DELETE, MERGE, and Transaction Control
Learning Objectives
By the end of this chapter, you will be able to:
- Insert single and multiple rows with INSERT and INSERT...SELECT
- Update rows safely with WHERE-qualified UPDATE statements
- Delete rows with DELETE and understand cascading effects
- Use MERGE for upsert patterns (insert-or-update)
- Understand transaction control: COMMIT, ROLLBACK, SAVEPOINT
- Apply data modification to the Meridian National Bank progressive project
Opening: The Line You Cannot Uncross
Every SQL statement you have written so far in this book has been a SELECT. You have filtered rows, joined tables, aggregated results, built CTEs, and crafted subqueries of impressive complexity. But none of it changed a single byte on disk. SELECT is read-only. It is a tourist visiting a city, taking photographs, leaving everything exactly as it was found.
This chapter is different.
INSERT, UPDATE, DELETE, and MERGE are the four verbs of the data manipulation language (DML). They add rows, change values, remove records, and synchronize datasets. When you execute them, something happens in the real world. A customer's address changes. An account balance moves. A transaction record appears where none existed before. And once you COMMIT those changes, they are permanent. No amount of regret or caffeine will bring back the old data unless you planned for recovery before you needed it.
This is why data modification in DB2 demands a different mindset from querying. It demands discipline. It demands transactions. It demands the habit of testing with SELECT before executing with UPDATE. It demands understanding what COMMIT actually means at the log level, and what ROLLBACK can and cannot rescue.
We will cover all of this. We will also cover the patterns that experienced DB2 professionals use every day: identity columns and sequences for generating keys, MERGE for ETL upsert logic, the SELECT FROM INSERT trick for auditing, bulk loading strategies for performance, and the safety habits that separate a careful DBA from a headline in the postmortem report.
By the end of this chapter, you will not only know the syntax of every data modification statement DB2 supports. You will know when to use each one, how to protect yourself with transactions, and why the habits in this chapter matter more than in any other.
Let us begin with the simplest act of creation: adding a row to a table.
9.1 INSERT -- Adding Data
The INSERT statement adds new rows to a table. It is the most common write operation in any database, and DB2 supports several variations, each with its own use case.
9.1.1 Single-Row INSERT with a Column List
The canonical form specifies both the target columns and the values to insert:
INSERT INTO meridian.customer
(customer_id, first_name, last_name, email, date_of_birth, branch_id)
VALUES
(10001, 'Elena', 'Vasquez', 'evasquez@email.com', '1988-07-14', 101);
The column list is technically optional -- DB2 allows you to omit it if you supply values for every column in table order. Do not do this. Always specify the column list. Here is why:
- Readability. A reader can see exactly which columns receive values without consulting the table definition.
- Resilience. If someone adds a column to the table later, your INSERT still works. Without a column list, it breaks immediately because the value count no longer matches the column count.
- Self-documentation. The column list makes the intent unambiguous. It is free documentation.
DB2 enforces data type compatibility at INSERT time. If you try to insert the string 'not-a-date' into a DATE column, DB2 raises SQLSTATE 22007 (invalid datetime format). If you try to insert a value that violates a CHECK constraint, DB2 raises SQLSTATE 23513. If you violate a UNIQUE or PRIMARY KEY constraint, you get SQLSTATE 23505. These are your guardrails. Respect them.
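Re-running the single-row INSERT from Section 9.1.1 makes the guardrail concrete. This is a sketch; the exact message text varies by client tool:

```sql
-- Customer 10001 already exists, so the primary key rejects the duplicate:
INSERT INTO meridian.customer
    (customer_id, first_name, last_name, email, date_of_birth, branch_id)
VALUES
    (10001, 'Elena', 'Vasquez', 'evasquez@email.com', '1988-07-14', 101);
-- Fails with SQLCODE -803, SQLSTATE 23505 (duplicate key)
```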
9.1.2 Multi-Row INSERT (VALUES Clause)
DB2 supports inserting multiple rows in a single statement:
INSERT INTO meridian.branch
(branch_id, branch_name, city, state, manager_id)
VALUES
(101, 'Downtown Main', 'Hartford', 'CT', 5001),
(102, 'West Side', 'Hartford', 'CT', 5002),
(103, 'Eastbrook Plaza', 'Manchester', 'CT', 5003),
(104, 'Riverside Commons', 'Glastonbury', 'CT', 5004);
This is more efficient than four separate INSERT statements because:
- Network round trips are reduced. One statement, one trip to the server.
- Logging is more efficient. DB2 can batch the log writes.
- Lock acquisition is reduced. The engine acquires the page or row locks once, inserts all four rows, and releases them.
[z/OS] DB2 for z/OS supports multi-row INSERT natively as of DB2 10. On older versions, you would use multi-row FETCH/INSERT through application programs.
[LUW] DB2 for LUW has supported the multi-row VALUES clause since version 9.7. The practical limit is not defined by SQL syntax but by the statement size limit (2 MB for static SQL, 2 GB for dynamic SQL on LUW).
9.1.3 INSERT...SELECT -- Inserting from a Query
The most powerful form of INSERT derives its rows from a SELECT statement:
INSERT INTO meridian.high_value_customers
(customer_id, full_name, total_balance, classification_date)
SELECT
c.customer_id,
c.first_name || ' ' || c.last_name,
SUM(a.current_balance),
CURRENT DATE
FROM meridian.customer c
JOIN meridian.account a ON a.customer_id = c.customer_id
GROUP BY c.customer_id, c.first_name, c.last_name
HAVING SUM(a.current_balance) > 100000.00;
This is the workhorse of ETL processing, data migration, and table population. The SELECT can be as complex as anything you learned in Chapters 5 through 8 -- joins, subqueries, CTEs, aggregation, window functions. Everything is fair game.
A few important rules:
- The column count and types must be compatible. The SELECT must produce the same number of columns as the INSERT's column list, and the types must be assignment-compatible.
- The target table must not appear in the FROM clause in ways that would create read-write conflicts. DB2 handles this by materializing the SELECT result before inserting. This prevents the "Halloween Problem" -- a classic database bug where an INSERT that reads from the same table it writes to sees its own newly inserted rows and loops forever. DB2 solves this internally, but you should be aware it adds a materialization step that uses temporary space.
- INSERT...SELECT can move millions of rows in a single statement. This is powerful but dangerous. A single INSERT that generates 10 million log records will fill the active log, potentially causing a log-full condition. We address this in Section 9.9 on bulk operations.
9.1.4 Default Values
When a column has a DEFAULT defined, you can use the DEFAULT keyword:
INSERT INTO meridian.account
(account_id, customer_id, account_type, current_balance, status, open_date)
VALUES
(20001, 10001, 'CHECKING', 0.00, DEFAULT, DEFAULT);
Here, status might default to 'ACTIVE' and open_date might default to CURRENT DATE. If you omit a column from the column list entirely, and that column has a DEFAULT, DB2 uses the default. If the column has no default and does not allow NULLs, DB2 raises an error.
You can also use the DEFAULT VALUES clause to insert a row where every column takes its default:
INSERT INTO meridian.audit_log DEFAULT VALUES;
This only works if every column either has a default defined or allows NULL.
9.1.5 The NULL Question
If a column allows NULL and you omit it from the column list, NULL is inserted. You can also be explicit:
INSERT INTO meridian.customer
(customer_id, first_name, last_name, email, date_of_birth, branch_id)
VALUES
(10002, 'Marcus', 'Chen', NULL, '1975-03-22', 102);
Marcus does not have an email address on file. Being explicit about NULL makes your intent clear. Omitting the column achieves the same result but forces the reader to know the table's nullable columns.
9.2 UPDATE -- Changing Existing Data
UPDATE modifies the values of existing rows. It is the most dangerous of the DML statements, because a poorly constructed UPDATE can silently change data you did not intend to touch, and you will not know it happened until a customer calls.
9.2.1 The Basic UPDATE
UPDATE meridian.customer
SET email = 'elena.vasquez@newmail.com'
WHERE customer_id = 10001;
This is the simplest form: SET one column to a new value, WHERE one specific row is identified. The WHERE clause is not optional -- it is essential. An UPDATE without WHERE updates every row in the table.
The golden rule of UPDATE: write the WHERE clause first. Before you type SET, write WHERE. Before you execute the UPDATE, run it as a SELECT to see which rows match. This habit will save your career at least once.
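In practice the habit looks like this: run the WHERE clause as a SELECT, inspect the rows it returns, then attach the identical WHERE to the UPDATE.

```sql
-- Step 1: preview exactly which rows will be touched
SELECT customer_id, email
FROM meridian.customer
WHERE customer_id = 10001;

-- Step 2: only after the preview looks right, reuse the same WHERE
UPDATE meridian.customer
SET email = 'elena.vasquez@newmail.com'
WHERE customer_id = 10001;
```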
9.2.2 Updating Multiple Columns
UPDATE meridian.customer
SET email = 'elena.vasquez@newmail.com',
last_name = 'Vasquez-Rodriguez',
branch_id = 103
WHERE customer_id = 10001;
All changes in a single UPDATE are atomic. Either all three columns change, or none do (if the statement fails). You do not need a transaction wrapper for this -- a single SQL statement is always atomic in DB2.
9.2.3 UPDATE with Expressions
You can use expressions, functions, and calculations in SET clauses:
-- Apply 2.5% interest to all savings accounts
UPDATE meridian.account
SET current_balance = current_balance * 1.025,
last_interest_date = CURRENT DATE
WHERE account_type = 'SAVINGS'
AND status = 'ACTIVE';
The expression current_balance * 1.025 uses the current value of the column before the update. DB2 evaluates all expressions in the SET clause using the row's pre-update values. This means the order of columns in the SET clause does not matter -- they all see the "before" picture.
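A classic illustration of "everything sees the before picture": two columns can be swapped in a single UPDATE, with no temporary column, because neither SET expression sees the other's change. (A contrived example for demonstration only.)

```sql
-- Both expressions read the pre-update row, so this swaps the two values
-- rather than setting both columns to the same value:
UPDATE meridian.customer
SET first_name = last_name,
    last_name  = first_name
WHERE customer_id = 10001;
```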
9.2.4 UPDATE with Subquery
You can use a subquery in the SET clause to derive values from other tables:
-- Set each customer's loyalty_tier based on their total balance
UPDATE meridian.customer c
SET loyalty_tier = (
SELECT CASE
WHEN SUM(a.current_balance) >= 250000 THEN 'PLATINUM'
WHEN SUM(a.current_balance) >= 100000 THEN 'GOLD'
WHEN SUM(a.current_balance) >= 25000 THEN 'SILVER'
ELSE 'STANDARD'
END
FROM meridian.account a
WHERE a.customer_id = c.customer_id
)
WHERE EXISTS (
SELECT 1 FROM meridian.account a
WHERE a.customer_id = c.customer_id
);
This is a correlated UPDATE -- the subquery in the SET clause references the outer table (c.customer_id). The correlated subquery executes conceptually once per row of the outer UPDATE, exactly as correlated subqueries work in SELECT (Chapter 8).
The WHERE EXISTS clause is critical. Without it, every customer who has no accounts would still be updated: a scalar aggregate subquery always returns exactly one row, SUM() over zero rows is NULL, and the CASE falls through to its ELSE branch -- so account-less customers would have whatever tier they already hold overwritten with 'STANDARD'. The EXISTS check ensures we only update customers who actually have accounts.
9.2.5 UPDATE with JOIN (DB2 Syntax)
DB2 for LUW supports an UPDATE with a FROM clause, similar to SQL Server syntax:
-- [LUW] Update customer email from a staging table
UPDATE meridian.customer c
SET email = s.new_email
FROM meridian.customer_staging s
WHERE c.customer_id = s.customer_id
AND s.change_type = 'EMAIL_UPDATE';
[z/OS] DB2 for z/OS does not support the FROM clause in UPDATE. Instead, you use a correlated subquery or a MERGE statement (Section 9.4):
-- [z/OS] Equivalent using correlated subquery
UPDATE meridian.customer c
SET email = (
SELECT s.new_email
FROM meridian.customer_staging s
WHERE s.customer_id = c.customer_id
AND s.change_type = 'EMAIL_UPDATE'
)
WHERE EXISTS (
SELECT 1
FROM meridian.customer_staging s
WHERE s.customer_id = c.customer_id
AND s.change_type = 'EMAIL_UPDATE'
);
The correlated subquery approach works on both platforms and is the portable choice.
9.2.6 Mass UPDATE Safety
When an UPDATE affects thousands or millions of rows, several concerns arise:
- Lock escalation. DB2 acquires a lock for each row (or page) you update. When the number of locks exceeds a threshold, DB2 escalates to a table-level lock, which blocks all other users. On z/OS, the LOCKMAX parameter on the tablespace controls this threshold. On LUW, the LOCKLIST and MAXLOCKS database configuration parameters govern escalation.
- Log volume. Every changed row generates a log record containing both the before-image and after-image of the data. A mass UPDATE on a million-row table produces a massive volume of log data. If the active log fills, DB2 halts until log space is available.
- Undo complexity. If the UPDATE fails partway through, DB2 must roll back all changes made so far. The larger the UPDATE, the longer the rollback takes.
The safe pattern for mass updates is to batch them:
-- Update in batches of 10,000, committing each batch
-- (This is pseudocode; actual implementation requires a loop
-- in a stored procedure or application program)
UPDATE meridian.account
SET interest_rate = 0.0275
WHERE account_id IN (
    SELECT account_id
    FROM meridian.account
    WHERE account_type = 'SAVINGS'
      AND interest_rate <> 0.0275
    FETCH FIRST 10000 ROWS ONLY
);
COMMIT;
-- Repeat until no rows are updated
We will revisit batch processing patterns in Section 9.9 and again in Chapter 17 when we discuss DB2 utilities.
9.3 DELETE -- Removing Data
DELETE removes rows from a table. Like UPDATE, it is a statement that demands respect.
9.3.1 WHERE-Qualified DELETE
DELETE FROM meridian.transaction_history
WHERE transaction_date < '2020-01-01'
AND status = 'ARCHIVED';
The WHERE clause identifies which rows to remove. Every rule about WHERE in UPDATE applies here: write it first, test it with SELECT first, and never omit it unless you genuinely intend to empty the table.
9.3.2 DELETE with Subquery
You can use subqueries to identify rows for deletion:
-- Delete customers who have no accounts
DELETE FROM meridian.customer
WHERE customer_id NOT IN (
SELECT DISTINCT customer_id
FROM meridian.account
WHERE customer_id IS NOT NULL
);
Recall from Chapter 8 the NULL pitfall with NOT IN: if the subquery returns any NULL values, NOT IN returns UNKNOWN for every comparison, and no rows are deleted. The WHERE customer_id IS NOT NULL in the subquery protects against this. Alternatively, use NOT EXISTS:
DELETE FROM meridian.customer c
WHERE NOT EXISTS (
SELECT 1
FROM meridian.account a
WHERE a.customer_id = c.customer_id
);
NOT EXISTS is immune to the NULL pitfall and is the preferred pattern for this type of operation.
9.3.3 DELETE without WHERE -- The Nuclear Option
DELETE FROM meridian.transaction_history;
This deletes every row in the table. DB2 will do it. DB2 will not ask "are you sure?" DB2 will not send you a warning email. It will acquire locks, write log records for every deleted row, fire any DELETE triggers, and enforce any referential integrity rules. And if the table has 500 million rows, this will take a very long time, fill your log, and potentially crash your batch window.
If you truly want to empty a table, TRUNCATE is almost always the better choice.
9.3.4 TRUNCATE vs. DELETE
TRUNCATE TABLE removes all rows from a table, but it works fundamentally differently from DELETE:
TRUNCATE TABLE meridian.transaction_staging IMMEDIATE;
| Characteristic | DELETE (all rows) | TRUNCATE |
|---|---|---|
| Logging | Full logging (before/after image for every row) | Minimal logging (logs the action, not individual rows) |
| Triggers | Fires DELETE triggers for each row | Does NOT fire triggers |
| Referential integrity | Enforces foreign key rules | Requires RESTRICT behavior or no dependent tables |
| Speed | Slow for large tables | Very fast, regardless of table size |
| ROLLBACK | Can be rolled back within a transaction | Behavior varies by platform (see below) |
| WHERE clause | Supported | Not supported (all-or-nothing) |
| Identity columns | Identity value continues from last value | Can reset identity with RESTART IDENTITY |
[LUW] On DB2 for LUW (available since version 9.7), the IMMEDIATE keyword is mandatory: the truncation is processed immediately, cannot be undone, and the TRUNCATE must be the first statement in its unit of work.
[z/OS] DB2 for z/OS introduced TRUNCATE TABLE in DB2 9. There, IMMEDIATE is optional: without it, the truncation can still be rolled back within the current unit of work; with it, the change takes effect immediately and cannot be rolled back. Before DB2 9, the equivalent was the LOAD utility with REPLACE and an empty dataset, or DELETE with a REORG afterward.
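[LUW] also offers storage and trigger options on TRUNCATE. A typical staging-table reset might look like the following sketch (clause availability differs on z/OS):

```sql
-- REUSE STORAGE keeps the allocated pages for the next load;
-- IGNORE DELETE TRIGGERS makes explicit that no triggers fire
TRUNCATE TABLE meridian.transaction_staging
    REUSE STORAGE
    IGNORE DELETE TRIGGERS
    IMMEDIATE;
```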
When to use TRUNCATE:
- Clearing staging tables between ETL loads
- Resetting test data
- Any time you need to remove all rows and do not need trigger execution or WHERE filtering
When to use DELETE:
- Removing specific rows (WHERE clause)
- When triggers must fire (audit logging, cascade logic)
- When you need row-level logging for recovery
9.3.5 Cascading Deletes via Foreign Keys
When you define a foreign key with ON DELETE CASCADE, deleting a parent row automatically deletes all child rows:
-- Assuming this foreign key definition exists:
-- ALTER TABLE meridian.account
-- ADD CONSTRAINT fk_account_customer
-- FOREIGN KEY (customer_id)
-- REFERENCES meridian.customer (customer_id)
-- ON DELETE CASCADE;
-- This delete removes the customer AND all their accounts:
DELETE FROM meridian.customer
WHERE customer_id = 10001;
Cascading deletes are powerful and dangerous. A single DELETE on a parent table can cascade through multiple levels of child tables, removing thousands of rows. This behavior is invisible in the DELETE statement itself -- you must know the foreign key definitions to predict the full impact.
Alternative ON DELETE actions:
| Action | Behavior |
|---|---|
| ON DELETE CASCADE | Child rows are deleted |
| ON DELETE SET NULL | Child foreign key columns are set to NULL |
| ON DELETE RESTRICT | DELETE is rejected if child rows exist |
| ON DELETE NO ACTION | Similar to RESTRICT but checked at statement end |
Best practice for Meridian Bank: Use ON DELETE RESTRICT for financial data. You do not want a customer deletion to silently cascade through accounts, transactions, and audit records. Instead, require explicit cleanup of dependent records before the parent can be removed.
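A sketch of what that guardrail looks like in practice, assuming the foreign key from Section 9.3.5 were defined with ON DELETE RESTRICT instead of CASCADE:

```sql
-- Rejected while customer 10001 still has accounts
-- (SQLCODE -532, referential constraint violation):
DELETE FROM meridian.customer
WHERE customer_id = 10001;

-- Explicit cleanup of dependents first; then the parent delete succeeds:
DELETE FROM meridian.account
WHERE customer_id = 10001;

DELETE FROM meridian.customer
WHERE customer_id = 10001;
```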
9.4 MERGE -- The Upsert Pattern
MERGE is the Swiss Army knife of data modification. It combines INSERT, UPDATE, and optionally DELETE into a single statement, matching a source dataset against a target table and performing different actions depending on whether a match exists.
9.4.1 The Business Problem
You receive a daily feed of customer data from an external system. Some customers are new (they should be inserted). Some customers already exist but have updated information (they should be updated). You need to handle both cases in a single operation. This is the "upsert" pattern -- update if the row exists, insert if it does not.
Before MERGE, you would write this as two statements with conditional logic:
-- The old way (do not do this)
UPDATE meridian.customer SET ... WHERE customer_id = ?;
-- If SQLCODE = 100 (no row updated), then:
INSERT INTO meridian.customer VALUES (...);
This approach has a race condition: between the UPDATE and the INSERT, another session could insert the same row, causing a duplicate key error. MERGE solves this atomically.
9.4.2 MERGE Syntax
MERGE INTO meridian.customer AS target
USING meridian.customer_daily_feed AS source
ON target.customer_id = source.customer_id
WHEN MATCHED THEN
UPDATE SET
target.first_name = source.first_name,
target.last_name = source.last_name,
target.email = source.email,
target.phone = source.phone,
target.last_updated = CURRENT TIMESTAMP
WHEN NOT MATCHED THEN
INSERT (customer_id, first_name, last_name, email, phone,
date_of_birth, branch_id, enrollment_date)
VALUES (source.customer_id, source.first_name, source.last_name,
source.email, source.phone, source.date_of_birth,
source.branch_id, CURRENT DATE);
The anatomy of MERGE:
- MERGE INTO -- the target table that will be modified.
- USING -- the source dataset. This can be a table, a view, a subquery, or a VALUES clause.
- ON -- the matching condition. This determines whether a source row "matches" a target row.
- WHEN MATCHED THEN UPDATE -- what to do when the ON condition finds a match.
- WHEN NOT MATCHED THEN INSERT -- what to do when no match exists.
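The USING source does not have to be a table. Here is a minimal single-row upsert sketch with a VALUES clause as the source (customer 10005 is a hypothetical record, not part of the Meridian data used elsewhere):

```sql
MERGE INTO meridian.customer AS target
USING (VALUES (10005, 'Priya', 'Nair', 'pnair@email.com'))
      AS source (customer_id, first_name, last_name, email)
ON target.customer_id = source.customer_id
WHEN MATCHED THEN
    UPDATE SET target.email = source.email
WHEN NOT MATCHED THEN
    INSERT (customer_id, first_name, last_name, email)
    VALUES (source.customer_id, source.first_name,
            source.last_name, source.email);
```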
9.4.3 MERGE with Additional Conditions
You can add conditions to the WHEN clauses:
MERGE INTO meridian.customer AS target
USING meridian.customer_daily_feed AS source
ON target.customer_id = source.customer_id
WHEN MATCHED AND source.action_code = 'UPDATE' THEN
UPDATE SET
target.email = source.email,
target.phone = source.phone,
target.last_updated = CURRENT TIMESTAMP
WHEN MATCHED AND source.action_code = 'DEACTIVATE' THEN
UPDATE SET
target.status = 'INACTIVE',
target.last_updated = CURRENT TIMESTAMP
WHEN NOT MATCHED AND source.action_code = 'NEW' THEN
INSERT (customer_id, first_name, last_name, email, branch_id,
enrollment_date, status)
VALUES (source.customer_id, source.first_name, source.last_name,
source.email, source.branch_id, CURRENT DATE, 'ACTIVE');
This pattern is common in ETL processing where the source feed includes an action code that tells you what operation to perform.
9.4.4 MERGE with DELETE
DB2 supports a DELETE action in the WHEN MATCHED clause:
MERGE INTO meridian.customer AS target
USING meridian.customer_daily_feed AS source
ON target.customer_id = source.customer_id
WHEN MATCHED AND source.action_code = 'DELETE' THEN
DELETE
WHEN MATCHED AND source.action_code = 'UPDATE' THEN
UPDATE SET
target.email = source.email,
target.last_updated = CURRENT TIMESTAMP
WHEN NOT MATCHED AND source.action_code = 'INSERT' THEN
INSERT (customer_id, first_name, last_name, email, branch_id)
VALUES (source.customer_id, source.first_name, source.last_name,
source.email, source.branch_id);
[z/OS] The DELETE clause in MERGE is supported as of DB2 11 for z/OS.
[LUW] Supported since DB2 9.7 for LUW.
9.4.5 MERGE Performance Considerations
MERGE is generally more efficient than separate INSERT and UPDATE statements because:
- It scans the source data once.
- It resolves the match condition once per source row.
- It acquires locks in a single pass rather than two.
However, MERGE has its own performance considerations:
- The ON clause should use indexed columns. If the match condition does not align with an index on the target table, MERGE performs a table scan for every source row. For a million-row source feeding into a billion-row target, this is catastrophic.
- MERGE acquires exclusive locks on matched rows. In high-concurrency environments, this can cause contention if multiple MERGE statements target the same rows.
- Log volume. MERGE generates log records for every inserted, updated, and deleted row. A large MERGE can fill the log just as fast as a large INSERT or UPDATE.
- Error handling. If a single row in the MERGE violates a constraint, the entire MERGE fails by default. Consider using the ATOMIC or NOT ATOMIC syntax to control this behavior.
9.4.6 Real-World ETL Pattern
Here is a complete MERGE pattern for Meridian Bank's daily transaction feed processing:
MERGE INTO meridian.transaction_history AS target
USING (
SELECT
transaction_id,
account_id,
transaction_type,
amount,
transaction_date,
description,
external_ref
FROM meridian.daily_transaction_feed
WHERE feed_date = CURRENT DATE
AND validation_status = 'VALID'
) AS source
ON target.transaction_id = source.transaction_id
WHEN MATCHED THEN
UPDATE SET
target.amount = source.amount,
target.description = source.description,
target.last_modified = CURRENT TIMESTAMP
WHEN NOT MATCHED THEN
INSERT (transaction_id, account_id, transaction_type,
amount, transaction_date, description, external_ref,
posted_timestamp)
VALUES (source.transaction_id, source.account_id,
source.transaction_type, source.amount,
source.transaction_date, source.description,
source.external_ref, CURRENT TIMESTAMP);
The USING clause here is a subquery that filters and validates the source data before the MERGE processes it. This is a best practice: clean your data before it hits the target table.
9.5 Transaction Control -- COMMIT and ROLLBACK
Every data modification statement you have seen so far operates within a transaction. Understanding transactions is not optional -- it is the foundation of data integrity in DB2 and every other serious database system.
9.5.1 What Is a Transaction?
A transaction is a logical unit of work -- a sequence of one or more SQL statements that must succeed or fail as a group. DB2 calls this a unit of work (UOW).
The classic example is a funds transfer at Meridian Bank:
-- Transfer $500 from account 20001 to account 20002
UPDATE meridian.account
SET current_balance = current_balance - 500.00
WHERE account_id = 20001;
UPDATE meridian.account
SET current_balance = current_balance + 500.00
WHERE account_id = 20002;
INSERT INTO meridian.transaction_history
(transaction_id, account_id, transaction_type, amount, transaction_date)
VALUES
(NEXT VALUE FOR meridian.transaction_seq, 20001, 'TRANSFER_OUT', -500.00, CURRENT DATE);
INSERT INTO meridian.transaction_history
(transaction_id, account_id, transaction_type, amount, transaction_date)
VALUES
(NEXT VALUE FOR meridian.transaction_seq, 20002, 'TRANSFER_IN', 500.00, CURRENT DATE);
COMMIT;
All four statements form a single transaction. If any one of them fails -- maybe account 20002 does not exist, or a CHECK constraint rejects the negative balance -- the entire transaction is rolled back. The $500 does not vanish from account 20001 without appearing in account 20002. This is atomicity, the "A" in ACID.
9.5.2 COMMIT -- Making Changes Permanent
COMMIT ends the current transaction and makes all changes permanent:
COMMIT;
-- or, equivalently:
COMMIT WORK;
After COMMIT:
- All changes are written to the DB2 transaction log and are durable. Even if the system crashes one millisecond after COMMIT returns, the changes survive.
- All locks held by the transaction are released (assuming Cursor Stability or stricter isolation -- see Chapter 26).
- The changes become visible to other transactions (subject to their isolation level).
COMMIT is not just a suggestion. It is the point of no return. Before COMMIT, everything is reversible. After COMMIT, the only way to "undo" the changes is to execute compensating SQL (an UPDATE that sets the values back, a DELETE that removes the inserted rows, etc.).
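For example, if the email change from Section 9.2.1 were committed by mistake, the only remedy is a compensating UPDATE -- and it works only if the old value was captured somewhere beforehand (an audit table, a report, application logs):

```sql
-- Hypothetical compensating statement: 'evasquez@email.com' is the
-- pre-change value, recovered here from an assumed audit record
UPDATE meridian.customer
SET email = 'evasquez@email.com'
WHERE customer_id = 10001;

COMMIT;
```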
9.5.3 ROLLBACK -- Undoing Changes
ROLLBACK undoes all changes made since the last COMMIT (or since the connection began, if no COMMIT has been issued):
ROLLBACK;
-- or, equivalently:
ROLLBACK WORK;
After ROLLBACK:
- All changes since the last COMMIT are undone. Inserted rows disappear. Updated rows revert to their previous values. Deleted rows reappear.
- All locks held by the transaction are released.
- The database is in the exact state it was in at the last COMMIT point.
ROLLBACK is your safety net. But it only works if you have not already committed. This is why experienced DBAs never set autocommit on for interactive sessions where they are running DML against production data.
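With autocommit off, the safety net becomes a workflow: modify, verify inside the same transaction, then decide. A sketch (last_activity_date is a hypothetical column used for illustration):

```sql
UPDATE meridian.account
SET status = 'DORMANT'
WHERE last_activity_date < CURRENT DATE - 2 YEARS;

-- Still uncommitted: inspect the result within the same transaction
SELECT COUNT(*)
FROM meridian.account
WHERE status = 'DORMANT';

-- Satisfied? COMMIT. Surprised? ROLLBACK, and nothing happened.
ROLLBACK;
```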
9.5.4 Implicit vs. Explicit Transactions
Explicit transactions are what we have been discussing: you issue SQL statements, and then explicitly COMMIT or ROLLBACK.
Implicit commit happens automatically in certain situations:
- DDL statements. On many platforms, issuing a CREATE TABLE or ALTER TABLE implicitly commits any pending transaction. DB2 for LUW follows this behavior. [z/OS] DB2 for z/OS does NOT implicitly commit on DDL -- DDL is part of the current unit of work and can be rolled back. This is a significant platform difference.
- Normal program termination. If a program ends normally without an explicit COMMIT or ROLLBACK, DB2 typically commits the pending work. The exact behavior depends on the precompiler options and connection settings.
- Abnormal termination. If a program crashes or the connection drops, DB2 rolls back the pending work.
9.5.5 Autocommit Behavior
[LUW] The DB2 Command Line Processor (CLP) and many client tools run with autocommit ON by default. This means every individual SQL statement is automatically committed as soon as it completes. To disable autocommit:
db2 +c "UPDATE meridian.customer SET email = 'test@test.com' WHERE customer_id = 10001"
The +c flag disables autocommit for that command. In the CLP, you can also use:
db2 UPDATE COMMAND OPTIONS USING c OFF
[z/OS] DB2 for z/OS does not have an "autocommit" concept in the traditional sense. Programs running under TSO, CICS, or batch use explicit COMMIT statements. The SPUFI and DSNTEP2 query tools have configurable commit behavior.
Warning
Running DML in a tool with autocommit enabled means every statement is immediately permanent. There is no ROLLBACK after the fact. If you accidentally run an UPDATE without a WHERE clause and autocommit is on, every row in the table is changed and committed before you can react. Disable autocommit when doing interactive DML work.
9.5.6 COMMIT Frequency and Its Impact on Logging
Every data modification generates log records. DB2 writes these to the active log (a circular set of log files on LUW, or log datasets on z/OS). When the active log fills, DB2 archives it and reuses the space.
A long-running transaction that modifies millions of rows without committing creates two problems:
- Log space consumption. The transaction's log records must remain in the active log until the transaction commits or rolls back, because DB2 needs them for potential rollback. If the transaction generates more log than the active log can hold, DB2 enters a log full condition and halts all work until log space is freed.
- Lock accumulation. The transaction holds locks on every row it has modified. These locks prevent other transactions from accessing those rows. As the transaction grows, contention increases, and lock escalation becomes likely.
The solution is commit frequency -- committing periodically within a long-running batch operation:
-- Pseudocode for a batch update with periodic commits
DECLARE counter INTEGER DEFAULT 0;
FOR each_row AS cursor FOR
SELECT account_id FROM meridian.account
WHERE account_type = 'SAVINGS' AND status = 'ACTIVE'
DO
UPDATE meridian.account
SET current_balance = current_balance * 1.025
WHERE account_id = each_row.account_id;
SET counter = counter + 1;
IF MOD(counter, 5000) = 0 THEN
COMMIT;
END IF;
END FOR;
COMMIT; -- Final commit for remaining rows
Choosing commit frequency is a tradeoff:
| Commit Frequency | Advantages | Disadvantages |
|---|---|---|
| Every row (1) | Minimal lock holding, minimal log | Maximum overhead from frequent commits |
| Every 1,000-10,000 rows | Good balance of throughput and resource use | Standard recommendation for most workloads |
| Every 100,000+ rows | Best throughput for batch | Risk of log fill, long rollback if failure |
| Never (one giant transaction) | Maximum throughput in theory | Log fill, lock escalation, extremely long rollback |
The typical recommendation for Meridian Bank batch jobs: commit every 5,000 to 10,000 rows, depending on row size and log capacity.
9.6 SAVEPOINT -- Partial Rollback
Sometimes you need finer control than "commit everything" or "roll back everything." SAVEPOINT provides intermediate markers within a transaction that you can roll back to without losing earlier work.
9.6.1 SAVEPOINT Syntax
-- Begin a complex enrollment transaction
INSERT INTO meridian.customer
(customer_id, first_name, last_name, email, branch_id)
VALUES (10003, 'Sarah', 'Thompson', 'sthompson@email.com', 101);
SAVEPOINT after_customer ON ROLLBACK RETAIN CURSORS;
INSERT INTO meridian.account
(account_id, customer_id, account_type, current_balance)
VALUES (20003, 10003, 'CHECKING', 500.00);
SAVEPOINT after_checking ON ROLLBACK RETAIN CURSORS;
-- Try to open a savings account, but something goes wrong
INSERT INTO meridian.account
(account_id, customer_id, account_type, current_balance)
VALUES (20004, 10003, 'SAVINGS', -100.00);
-- This might fail due to a CHECK constraint on current_balance >= 0
-- If the savings account insert failed, roll back to after the checking account
ROLLBACK TO SAVEPOINT after_checking;
-- The customer and checking account inserts are still intact
-- We can try again or skip the savings account
COMMIT;
After ROLLBACK TO SAVEPOINT after_checking:
- The failed savings account insert is undone.
- The customer insert and checking account insert remain.
- We are still in the same transaction -- no commit has occurred.
9.6.2 ON ROLLBACK RETAIN CURSORS
The ON ROLLBACK RETAIN CURSORS clause is important. Without it, a rollback to the savepoint closes any cursors that were opened after the savepoint. With it, cursors remain open and positioned. This is essential for batch processing where you are iterating through a cursor and need to roll back individual row operations without losing your position.
[z/OS] DB2 for z/OS supports ON ROLLBACK RETAIN CURSORS as of DB2 10.
[LUW] Supported since DB2 9.7.
9.6.3 RELEASE SAVEPOINT
You can explicitly release a savepoint when you no longer need it:
RELEASE SAVEPOINT after_customer;
After releasing, you can no longer roll back to that savepoint. Releasing savepoints is optional but good practice in long transactions to free the internal resources DB2 uses to track them.
9.6.4 Nested Savepoints
DB2 supports multiple savepoints within a transaction. They behave like a stack:
SAVEPOINT sp1 ON ROLLBACK RETAIN CURSORS;
-- ... work ...
SAVEPOINT sp2 ON ROLLBACK RETAIN CURSORS;
-- ... more work ...
SAVEPOINT sp3 ON ROLLBACK RETAIN CURSORS;
-- ... even more work ...
ROLLBACK TO SAVEPOINT sp2;
-- sp3 is implicitly released. Work done after sp2 is undone.
-- sp1 is still active.
Rolling back to sp2 also releases sp3 (and any savepoints created between sp2 and sp3). You cannot roll back to sp3 after rolling back to sp2.
9.6.5 Use Cases for SAVEPOINT
- Multi-step enrollment. As shown above, enroll a customer and open multiple accounts. If one account opening fails, roll back that step without losing the customer enrollment.
- Batch processing with error tolerance. Process 10,000 records. If one record fails validation, roll back just that record's changes and continue with the next.
- Tentative changes. Insert a row, check a business rule with a subsequent query, and roll back the insert if the rule is violated.
- Stored procedure error handling. Within a stored procedure, set a savepoint at the beginning. If any operation fails, roll back to the savepoint and return an error code without affecting the caller's transaction.
9.7 The SELECT FROM INSERT/UPDATE/DELETE Pattern
DB2 supports a powerful pattern: you can wrap an INSERT, UPDATE, or DELETE inside a SELECT to retrieve the affected rows. This is invaluable for auditing and capturing generated values.
9.7.1 SELECT FROM INSERT
SELECT customer_id, enrollment_date
FROM FINAL TABLE (
INSERT INTO meridian.customer
(customer_id, first_name, last_name, email, branch_id, enrollment_date)
VALUES
(NEXT VALUE FOR meridian.customer_seq, 'David', 'Park',
'dpark@email.com', 102, CURRENT DATE)
);
The FINAL TABLE clause returns the row as it exists after the INSERT (including any values generated by defaults, triggers, or identity columns). This is extremely useful for retrieving auto-generated keys:
-- If customer_id is an identity column:
SELECT customer_id
FROM FINAL TABLE (
INSERT INTO meridian.customer
(first_name, last_name, email, branch_id)
VALUES
('David', 'Park', 'dpark@email.com', 102)
);
-- Returns the generated customer_id without a separate query
Without this pattern, you would need to INSERT the row and then retrieve the generated key with IDENTITY_VAL_LOCAL() or PREVIOUS VALUE FOR sequence -- a two-step process with a potential race condition in some scenarios.
9.7.2 SELECT FROM UPDATE
SELECT customer_id, email AS old_email, 'elena.v@newmail.com' AS new_email
FROM FINAL TABLE (
UPDATE meridian.customer
SET email = 'elena.v@newmail.com'
WHERE customer_id = 10001
);
OLD TABLE and FINAL TABLE:
- FINAL TABLE returns the row after modification.
- OLD TABLE returns the row before modification.
-- Capture the before-image for audit logging
INSERT INTO meridian.audit_log (table_name, operation, old_values, change_timestamp)
SELECT 'CUSTOMER', 'UPDATE',
'email=' || email,
CURRENT TIMESTAMP
FROM OLD TABLE (
UPDATE meridian.customer
SET email = 'elena.v@newmail.com'
WHERE customer_id = 10001
);
This single statement performs the UPDATE and captures the old values into an audit log -- atomically, in one operation.
9.7.3 SELECT FROM DELETE
-- Delete expired sessions and log them
INSERT INTO meridian.expired_session_log
(session_id, customer_id, expired_at, logged_at)
SELECT session_id, customer_id, last_activity, CURRENT TIMESTAMP
FROM OLD TABLE (
DELETE FROM meridian.active_sessions
WHERE last_activity < CURRENT TIMESTAMP - 30 MINUTES
);
This deletes expired sessions and simultaneously logs them to an archive table. Without this pattern, you would need a SELECT to find the rows, an INSERT to log them, and a DELETE to remove them -- three statements instead of one, with the risk that the set of rows changes between the SELECT and the DELETE.
9.7.4 Platform Support
[LUW] SELECT FROM INSERT/UPDATE/DELETE with FINAL TABLE and OLD TABLE is fully supported since DB2 9.7.
[z/OS] SELECT FROM INSERT is supported. SELECT FROM UPDATE and SELECT FROM DELETE have more limited support and may require specific APAR levels. Check the DB2 for z/OS Knowledge Center for your specific version.
9.8 Identity Columns and Sequences
When inserting rows, you often need a unique key value. DB2 provides two mechanisms for generating unique numeric values: identity columns and sequence objects.
9.8.1 Identity Columns
An identity column is a column whose values are automatically generated by DB2:
CREATE TABLE meridian.customer (
customer_id INTEGER NOT NULL
GENERATED ALWAYS AS IDENTITY
(START WITH 10000 INCREMENT BY 1),
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
email VARCHAR(100),
branch_id INTEGER NOT NULL
);
GENERATED ALWAYS vs. GENERATED BY DEFAULT:
| Option | Behavior | Use Case |
|---|---|---|
| GENERATED ALWAYS | DB2 always generates the value. You cannot specify it in INSERT. | New tables where DB2 controls all key generation |
| GENERATED BY DEFAULT | DB2 generates a value only if you do not provide one. | Data migration scenarios where you need to preserve existing keys |
With GENERATED ALWAYS:
-- This works:
INSERT INTO meridian.customer (first_name, last_name, email, branch_id)
VALUES ('Elena', 'Vasquez', 'evasquez@email.com', 101);
-- This FAILS (cannot specify a value for GENERATED ALWAYS):
INSERT INTO meridian.customer (customer_id, first_name, last_name, email, branch_id)
VALUES (99999, 'Elena', 'Vasquez', 'evasquez@email.com', 101);
With GENERATED BY DEFAULT, both of the above would succeed. The first would generate a value; the second would use 99999. However, GENERATED BY DEFAULT creates a risk of duplicate key errors if the manually supplied value collides with a generated value.
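One common way to manage that collision risk after a migration is to resynchronize the identity generator once the load completes. A hedged sketch; the restart value 150000 is illustrative and would be derived from MAX(customer_id) + 1 on the loaded data:

```sql
-- Sketch: after loading migrated rows with explicit keys into a
-- GENERATED BY DEFAULT column, restart the generator above the
-- highest loaded key so future generated values cannot collide.
ALTER TABLE meridian.customer
    ALTER COLUMN customer_id RESTART WITH 150000;
```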
Retrieving the generated identity value:
-- After an INSERT:
VALUES IDENTITY_VAL_LOCAL();
-- Returns the most recently generated identity value in the current session
-- Or, using the SELECT FROM INSERT pattern (Section 9.7):
SELECT customer_id FROM FINAL TABLE (
INSERT INTO meridian.customer (first_name, last_name, email, branch_id)
VALUES ('Marcus', 'Chen', 'mchen@email.com', 102)
);
IDENTITY_VAL_LOCAL() is session-scoped -- it returns the last identity value generated by an INSERT in your session, regardless of what other sessions are doing. It is safe in concurrent environments.
9.8.2 Sequence Objects
Sequences are standalone database objects that generate numeric values independently of any table:
CREATE SEQUENCE meridian.transaction_seq
AS BIGINT
START WITH 1000000
INCREMENT BY 1
NO MAXVALUE
NO CYCLE
CACHE 50;
To use a sequence in an INSERT:
INSERT INTO meridian.transaction_history
(transaction_id, account_id, transaction_type, amount, transaction_date)
VALUES
(NEXT VALUE FOR meridian.transaction_seq, 20001, 'DEPOSIT', 1500.00, CURRENT DATE);
NEXT VALUE FOR advances the sequence and returns the next value. PREVIOUS VALUE FOR returns the last value generated by NEXT VALUE FOR in the current session (similar to IDENTITY_VAL_LOCAL() but for sequences).
Key sequence parameters:
- CACHE n -- DB2 pre-allocates n values in memory for performance. This is critical for high-insert environments. Without caching, every sequence value requires a catalog update. With CACHE 50, DB2 only updates the catalog every 50th value.
- NO CYCLE -- The sequence does not wrap around when it reaches MAXVALUE. It raises an error instead. For primary keys, you almost always want NO CYCLE.
- ORDER / NO ORDER -- [z/OS] In a data sharing environment, ORDER guarantees that values are generated in strict chronological order across all members. This adds overhead due to cross-member coordination. NO ORDER allows slight reordering for better performance.
9.8.3 Identity vs. Sequence: When to Use Which
| Criterion | Identity Column | Sequence Object |
|---|---|---|
| Scope | Tied to one table | Can be used across multiple tables |
| Syntax | Value generated automatically on INSERT | Must explicitly use NEXT VALUE FOR |
| Flexibility | Limited (one per table) | Unlimited (create as many as you need) |
| Data migration | Awkward with GENERATED ALWAYS | Easy (just stop using the sequence during load) |
| Multi-table keys | Not suitable | Ideal (one sequence generates keys for related tables) |
Meridian Bank convention: Use sequences for transaction IDs (shared across multiple transaction tables) and identity columns for single-table surrogate keys (customer_id, account_id).
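The multi-table use of a sequence can be sketched as follows. The detail table meridian.transaction_detail is an illustrative assumption; the point is that PREVIOUS VALUE FOR lets a second INSERT reuse the session's last generated key without a second increment:

```sql
-- Sketch: one sequence keys rows in two related tables.
INSERT INTO meridian.transaction_history
    (transaction_id, account_id, transaction_type, amount, transaction_date)
VALUES
    (NEXT VALUE FOR meridian.transaction_seq, 20001, 'FEE', -25.00, CURRENT DATE);

-- PREVIOUS VALUE FOR returns the value just generated in this session,
-- so the detail row carries the same transaction_id.
INSERT INTO meridian.transaction_detail
    (transaction_id, detail_text)
VALUES
    (PREVIOUS VALUE FOR meridian.transaction_seq, 'Monthly maintenance fee');
```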
9.8.4 Gaps and Why They Happen
Sequences and identity columns can have gaps in their generated values. A value is consumed when NEXT VALUE FOR is called (or when an INSERT triggers identity generation), but if the transaction rolls back, the value is not "returned." It is lost.
Common causes of gaps:
- Transaction rollback. INSERT fails or the application rolls back. The sequence value is consumed but the row does not exist.
- Cached values lost. If DB2 crashes with cached sequence values in memory, those values are lost. On restart, the sequence continues from the next cache block.
- Multi-row INSERT where some rows fail. In a batch insert, if one row violates a constraint and the entire statement rolls back, the sequence values already generated for the statement are consumed even though no rows were inserted.
Gaps are normal and expected. Do not design systems that assume gapless sequences. If you need gapless numbering (invoice numbers in some jurisdictions, for example), you must implement it in application logic with serialized access -- and accept the significant performance penalty.
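The application-logic approach to gapless numbering can be sketched with a single-row counter table. This is a hedged illustration: meridian.invoice_counter and its column next_invoice_no are assumed objects, and the FINAL TABLE form is LUW syntax (see Section 9.7.4 for z/OS support):

```sql
-- Sketch: gapless numbering via a serialized counter table.
-- The UPDATE's exclusive lock on the single counter row serializes
-- every caller until COMMIT -- that serialization IS the performance
-- penalty the text warns about.
SELECT next_invoice_no
FROM FINAL TABLE (
    UPDATE meridian.invoice_counter
    SET next_invoice_no = next_invoice_no + 1
);
-- Insert the invoice using the returned number, then COMMIT.
-- If the transaction rolls back, the counter update rolls back too,
-- so no number is ever lost -- unlike a sequence.
```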
9.9 Bulk Operations and Performance
When you need to insert millions of rows, single-row INSERT statements executed in a loop are the wrong tool. DB2 provides several mechanisms for high-performance data loading.
9.9.1 Batch INSERT (Multi-Row Values)
As shown in Section 9.1.2, multi-row INSERT reduces overhead:
INSERT INTO meridian.transaction_history
(transaction_id, account_id, transaction_type, amount, transaction_date)
VALUES
(1000001, 20001, 'DEPOSIT', 1500.00, '2025-01-15'),
(1000002, 20001, 'WITHDRAWAL', -200.00, '2025-01-16'),
(1000003, 20002, 'DEPOSIT', 3000.00, '2025-01-15'),
-- ... hundreds or thousands of rows
(1000500, 20150, 'TRANSFER', -750.00, '2025-01-17');
For programmatic batch loading, use parameterized inserts with array binding. In JDBC:
PreparedStatement ps = conn.prepareStatement(
"INSERT INTO meridian.transaction_history " +
"(transaction_id, account_id, transaction_type, amount, transaction_date) " +
"VALUES (?, ?, ?, ?, ?)");
for (TransactionRecord rec : records) {
ps.setLong(1, rec.getId());
ps.setLong(2, rec.getAccountId());
ps.setString(3, rec.getType());
ps.setBigDecimal(4, rec.getAmount());
ps.setDate(5, rec.getDate());
ps.addBatch();
if (++count % 5000 == 0) {
ps.executeBatch();
conn.commit();
}
}
ps.executeBatch();
conn.commit();
This batches 5,000 rows per round trip, committing each batch. The JDBC driver can optimize batched inserts into multi-row protocol operations.
9.9.2 The LOAD Utility (Preview)
For truly large data volumes (millions to billions of rows), DB2's LOAD utility bypasses the SQL engine entirely and writes data directly to the tablespace:
[LUW]
db2 "LOAD FROM customer_data.del OF DEL
MODIFIED BY COLDEL|
INSERT INTO meridian.customer
(customer_id, first_name, last_name, email, branch_id)"
[z/OS] The LOAD utility is a standalone utility executed via JCL or the DB2 utilities panel.
LOAD is dramatically faster than INSERT because it:
- Bypasses the SQL parser and optimizer
- Writes directly to data pages
- Can run in parallel across partitions
- Optionally skips logging (with recovery implications)
- Builds indexes in a single pass after loading
We will cover LOAD in detail in Chapter 17 (DB2 Utilities). For now, know that it exists and is the right tool for initial data population and large batch loads.
9.9.3 IMPORT and EXPORT (LUW)
[LUW] DB2 for LUW provides IMPORT and EXPORT utilities that are simpler than LOAD but still more efficient than raw INSERT:
-- Export data to a file
db2 "EXPORT TO customers.csv OF DEL
SELECT * FROM meridian.customer WHERE branch_id = 101"
-- Import data from a file
db2 "IMPORT FROM customers.csv OF DEL
INSERT INTO meridian.customer_backup"
IMPORT uses SQL INSERT statements internally, so it is slower than LOAD but supports triggers, constraints, and logging.
9.9.4 NOT LOGGED INITIALLY
[LUW] For large bulk operations where you can afford to lose the data and reload it in case of failure, you can suppress logging:
ALTER TABLE meridian.transaction_staging ACTIVATE NOT LOGGED INITIALLY;
INSERT INTO meridian.transaction_staging
SELECT * FROM meridian.external_feed;
COMMIT;
NOT LOGGED INITIALLY tells DB2 not to write transaction log records for changes to this table within the current unit of work. This dramatically speeds up bulk inserts but has a critical consequence: if the transaction rolls back (or the system crashes before COMMIT), the table is marked invalid and must be dropped and recreated. There is no recovery.
[z/OS] The equivalent concept on z/OS is using the LOAD utility with LOG(NO). There is no NOT LOGGED INITIALLY clause for SQL DML on z/OS.
Use NOT LOGGED INITIALLY only for:
- Staging tables that can be easily repopulated
- Initial data load into new tables
- Situations where speed is more important than recoverability
9.10 Data Modification Safety Patterns
This section is the most important in the chapter. The technical syntax of INSERT, UPDATE, DELETE, and MERGE is straightforward. The discipline of using them safely in production is what separates professionals from amateurs.
9.10.1 Always Test with SELECT First
Before running any UPDATE or DELETE, convert it to a SELECT to see which rows will be affected:
-- You want to run this:
UPDATE meridian.account
SET status = 'CLOSED'
WHERE last_activity_date < '2020-01-01'
AND current_balance = 0;
-- First, run this:
SELECT account_id, customer_id, status, last_activity_date, current_balance
FROM meridian.account
WHERE last_activity_date < '2020-01-01'
AND current_balance = 0;
-- Check: Is the row count what you expected?
-- Check: Are the right rows showing up?
-- Check: Did you miss a WHERE condition?
-- THEN run the UPDATE.
This takes 30 seconds. It has saved countless production databases from accidental mass updates.
9.10.2 The "UPDATE without WHERE" Horror Story
It happens to someone every year. A DBA is writing an UPDATE for a single customer's email address. They type the SET clause, mean to add a WHERE clause, but accidentally hit Enter (or the tool they are using has autocommit on). The result:
UPDATE meridian.customer
SET email = 'john.smith@email.com';
-- Every customer in the table now has the same email address.
-- If autocommit is on, this is permanent.
Prevention measures:
- Disable autocommit for interactive DML sessions.
- Write WHERE first. Type UPDATE table SET ... WHERE ... and fill in the WHERE before the SET.
- Use a transaction. Begin a transaction, run the UPDATE, SELECT the affected rows to verify, and only then COMMIT.
- Require row count confirmation. Some tools (like SPUFI on z/OS) display the row count after DML. If you expect 1 row updated and see 50,000, ROLLBACK immediately.
- Use FETCH FIRST in testing. When developing an UPDATE, add FETCH FIRST 1 ROW ONLY (LUW) or an additional restrictive condition to limit the blast radius during testing.
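The transaction-based workflow looks like this in practice (a sketch assuming autocommit is off; the customer_id and email are illustrative):

```sql
-- With autocommit disabled, the UPDATE is tentative until COMMIT.
UPDATE meridian.customer
SET email = 'john.smith@email.com'
WHERE customer_id = 10001;

-- Verify before making it permanent:
SELECT customer_id, email
FROM meridian.customer
WHERE customer_id = 10001;

-- Row count was 1 and the data looks right? Then:
COMMIT;
-- Wrong rows touched? Undo everything instead:
-- ROLLBACK;
```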
9.10.3 Transaction Size Limits
There is no hard SQL-level limit on transaction size, but practical limits exist:
- Active log capacity. If your transaction generates more log data than the active log can hold, DB2 enters a log-full condition. On z/OS, this triggers a "DSNJ032I" message and all activity stops. On LUW, you get SQL0964C.
- Lock memory. Every lock consumes memory. A transaction that locks millions of rows may exhaust the lock memory pool, causing lock escalation or even an out-of-memory condition.
- Rollback time. If a 10-million-row UPDATE fails after 9 million rows, DB2 must undo all 9 million changes. This can take longer than the original operation because undo processing is typically sequential.
Rule of thumb: If your batch modifies more than 10,000 rows, implement periodic commits.
9.10.4 Lock Escalation Awareness
DB2 starts by acquiring row-level locks (or page-level locks on z/OS, depending on the tablespace LOCKSIZE parameter). When the number of locks for a single table exceeds a threshold, DB2 escalates to a table-level exclusive lock.
Table-level exclusive lock means: nobody else can read or write the table until your transaction commits. In a production environment, this can cause cascading timeouts and application failures.
[z/OS] The escalation threshold is controlled by the LOCKMAX parameter on the tablespace. Setting LOCKMAX to SYSTEM uses the subsystem default (defined in DSNZPARM).
[LUW] The escalation threshold is determined by the LOCKLIST and MAXLOCKS database configuration parameters. When a single application holds more than MAXLOCKS percent of the LOCKLIST, escalation occurs.
Prevention:
- Commit frequently in batch operations.
- Process data in smaller batches.
- Consider using LOCK TABLE explicitly if you know you will need exclusive access (this avoids the overhead of acquiring and escalating individual locks).
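The explicit LOCK TABLE approach can be sketched as follows; the interest-posting UPDATE is an illustrative payload:

```sql
-- Sketch: take the table lock up front for a batch window where
-- exclusive access is already known to be required. The lock is
-- held until the transaction ends with COMMIT or ROLLBACK.
LOCK TABLE meridian.account IN EXCLUSIVE MODE;

UPDATE meridian.account
SET current_balance = current_balance * 1.025
WHERE account_type = 'SAVINGS';

COMMIT;  -- releases the table lock
```

Because the table lock is acquired once, DB2 never needs to track or escalate individual row locks for this transaction.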
9.10.5 Row Count Verification
Always check the row count after DML:
UPDATE meridian.customer
SET loyalty_tier = 'GOLD'
WHERE customer_id = 10001;
-- In an application, check SQLERRD(3) for the row count
-- In CLP: the message shows "DB20000I The SQL command completed successfully."
-- followed by the number of rows affected.
If you expected 1 row and the count says 0, the WHERE clause did not match. If it says 50,000, your WHERE clause is too broad. Either way, do not commit until you have verified the count.
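In embedded SQL PL, the check can be made programmatic with GET DIAGNOSTICS. A hedged sketch; the variable name, SQLSTATE value, and message text are illustrative choices:

```sql
-- Sketch (SQL PL compound statement): verify the row count
-- and refuse to proceed if it is not exactly what was expected.
BEGIN
    DECLARE v_rows INTEGER;

    UPDATE meridian.customer
    SET loyalty_tier = 'GOLD'
    WHERE customer_id = 10001;

    GET DIAGNOSTICS v_rows = ROW_COUNT;

    IF v_rows <> 1 THEN
        -- Unexpected count: raise an error so the caller rolls back
        SIGNAL SQLSTATE '75001'
            SET MESSAGE_TEXT = 'Unexpected update row count';
    END IF;
END
```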
9.11 Meridian Bank Data Operations
Let us apply everything from this chapter to the Meridian National Bank progressive project. We will implement four core data operations that the bank performs daily.
9.11.1 New Customer Enrollment
When a new customer walks into a Meridian branch, the teller initiates an enrollment process that creates the customer record and opens their initial accounts:
-- Step 1: Create the customer record
-- Using a sequence for customer_id
INSERT INTO meridian.customer
(customer_id, first_name, last_name, date_of_birth, ssn_hash,
email, phone, address_line1, city, state, zip_code,
branch_id, enrollment_date, status)
VALUES
(NEXT VALUE FOR meridian.customer_seq,
'James', 'Morrison', '1982-11-03',
HEX(HASH('459-22-1847', 2)), -- SHA-256 hash of SSN
'jmorrison@email.com', '860-555-0142',
'45 Elm Street', 'Hartford', 'CT', '06103',
101, CURRENT DATE, 'ACTIVE');
SAVEPOINT after_customer_create ON ROLLBACK RETAIN CURSORS;
-- Step 2: Open a checking account
INSERT INTO meridian.account
(account_id, customer_id, account_type, current_balance,
interest_rate, open_date, status)
VALUES
(NEXT VALUE FOR meridian.account_seq,
PREVIOUS VALUE FOR meridian.customer_seq,
'CHECKING', 500.00,
0.001, CURRENT DATE, 'ACTIVE');
SAVEPOINT after_checking ON ROLLBACK RETAIN CURSORS;
-- Step 3: Open a savings account
INSERT INTO meridian.account
(account_id, customer_id, account_type, current_balance,
interest_rate, open_date, status)
VALUES
(NEXT VALUE FOR meridian.account_seq,
PREVIOUS VALUE FOR meridian.customer_seq,
'SAVINGS', 1000.00,
0.025, CURRENT DATE, 'ACTIVE');
-- Step 4: Record the initial deposits as transactions
INSERT INTO meridian.transaction_history
(transaction_id, account_id, transaction_type, amount,
transaction_date, description)
VALUES
(NEXT VALUE FOR meridian.transaction_seq,
PREVIOUS VALUE FOR meridian.account_seq,
'DEPOSIT', 1000.00,
CURRENT DATE, 'Initial deposit - account opening');
COMMIT;
Notice the use of savepoints: if the savings account opening fails (perhaps due to a system issue), we can roll back to after_checking and complete the enrollment with just the checking account, rather than losing the entire enrollment.
9.11.2 Account-to-Account Transfer
The funds transfer is the canonical banking transaction:
-- Transfer $250 from checking (20001) to savings (20002)
-- Both accounts belong to the same customer
-- Verify sufficient funds first (SELECT before DML)
SELECT account_id, account_type, current_balance
FROM meridian.account
WHERE account_id IN (20001, 20002)
AND status = 'ACTIVE';
-- Execute the transfer
UPDATE meridian.account
SET current_balance = current_balance - 250.00
WHERE account_id = 20001
AND current_balance >= 250.00; -- Prevent overdraft
-- Check that exactly 1 row was updated
-- If 0 rows: insufficient funds or account not found
UPDATE meridian.account
SET current_balance = current_balance + 250.00
WHERE account_id = 20002;
-- Record both sides of the transfer
-- NOTE: every NEXT VALUE FOR reference to the same sequence within a
-- single statement returns the same value. The +1/-1 arithmetic below
-- (valid because transaction_seq uses INCREMENT BY 1) cross-links the
-- two rows without consuming extra values or duplicating keys.
INSERT INTO meridian.transaction_history
(transaction_id, account_id, transaction_type, amount,
transaction_date, description, related_transaction_id)
VALUES
(NEXT VALUE FOR meridian.transaction_seq,
20001, 'TRANSFER_OUT', -250.00,
CURRENT DATE, 'Transfer to savings',
NEXT VALUE FOR meridian.transaction_seq + 1);
INSERT INTO meridian.transaction_history
(transaction_id, account_id, transaction_type, amount,
transaction_date, description, related_transaction_id)
VALUES
(NEXT VALUE FOR meridian.transaction_seq,
20002, 'TRANSFER_IN', 250.00,
CURRENT DATE, 'Transfer from checking',
NEXT VALUE FOR meridian.transaction_seq - 1);
COMMIT;
The CHECK constraint on current_balance >= 0 (if defined) provides a second line of defense against overdrafts, but checking in the WHERE clause is preferred because it gives the application a clear signal (0 rows updated means insufficient funds) rather than an error.
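If that constraint is not already in the table definition, it can be added after the fact. A sketch; the constraint name is an illustrative choice:

```sql
-- Sketch: defense-in-depth against overdrafts at the database layer.
ALTER TABLE meridian.account
    ADD CONSTRAINT chk_balance_nonneg CHECK (current_balance >= 0);
-- With this in place, an UPDATE that would drive the balance negative
-- fails with a constraint violation even if the application forgets
-- the WHERE-clause guard.
```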
9.11.3 End-of-Day Batch Processing
At the end of each business day, Meridian runs a batch process that updates account statistics and archives processed transactions. This uses MERGE to update or insert daily summary records:
-- End-of-day: merge daily transaction summaries
MERGE INTO meridian.daily_account_summary AS target
USING (
SELECT
account_id,
CURRENT DATE AS summary_date,
COUNT(*) AS transaction_count,
SUM(CASE WHEN amount > 0 THEN amount ELSE 0 END) AS total_credits,
SUM(CASE WHEN amount < 0 THEN ABS(amount) ELSE 0 END) AS total_debits,
SUM(amount) AS net_change
FROM meridian.transaction_history
WHERE transaction_date = CURRENT DATE
GROUP BY account_id
) AS source
ON target.account_id = source.account_id
AND target.summary_date = source.summary_date
WHEN MATCHED THEN
UPDATE SET
target.transaction_count = source.transaction_count,
target.total_credits = source.total_credits,
target.total_debits = source.total_debits,
target.net_change = source.net_change,
target.last_updated = CURRENT TIMESTAMP
WHEN NOT MATCHED THEN
INSERT (account_id, summary_date, transaction_count,
total_credits, total_debits, net_change, last_updated)
VALUES (source.account_id, source.summary_date,
source.transaction_count, source.total_credits,
source.total_debits, source.net_change,
CURRENT TIMESTAMP);
COMMIT;
This MERGE is idempotent -- you can run it multiple times for the same day, and it will update the existing summary rather than inserting duplicates.
9.11.4 Monthly Interest Calculation
At month end, Meridian calculates and applies interest to all savings accounts:
-- Monthly interest calculation batch job
-- Process in batches of 5,000 accounts to manage log volume
-- First, calculate and record the interest as transactions
INSERT INTO meridian.transaction_history
(transaction_id, account_id, transaction_type, amount,
transaction_date, description)
SELECT
NEXT VALUE FOR meridian.transaction_seq,
a.account_id,
'INTEREST',
ROUND(a.current_balance * (a.interest_rate / 12), 2),
CURRENT DATE,
'Monthly interest - ' || VARCHAR_FORMAT(CURRENT DATE, 'YYYY-MM')
FROM meridian.account a
WHERE a.account_type = 'SAVINGS'
AND a.status = 'ACTIVE'
AND a.current_balance > 0;
-- Then, update the balances
UPDATE meridian.account a
SET current_balance = current_balance + (
SELECT ROUND(a2.current_balance * (a2.interest_rate / 12), 2)
FROM meridian.account a2
WHERE a2.account_id = a.account_id
)
WHERE a.account_type = 'SAVINGS'
AND a.status = 'ACTIVE'
AND a.current_balance > 0;
COMMIT;
In production, this would be implemented in a stored procedure with batch commits every 5,000 accounts, error handling, and a restart capability. We will build that stored procedure in Chapter 12.
Note on the interest calculation: The approach above has a subtle point -- the UPDATE's subquery reads the pre-update current_balance, which is correct because the interest should be calculated on the opening balance, not the balance after interest is added. DB2's behavior of evaluating SET expressions against pre-update values makes this work correctly.
Spaced Review: Concepts from Earlier Chapters
This chapter marks a transition from read-only SQL to data modification. Before moving on, let us reinforce key concepts from earlier chapters that directly support what you learned here.
From Chapter 2: The Relational Model
Review question: Why do foreign key constraints matter for DELETE operations?
Foreign keys enforce referential integrity -- the rule that every value in a foreign key column must correspond to an existing value in the referenced primary key column. When you DELETE a row from a parent table, DB2 must check whether any child rows reference it. The ON DELETE action (CASCADE, SET NULL, RESTRICT, NO ACTION) determines what happens. Without foreign keys, orphaned rows accumulate silently, and no one notices until a JOIN returns fewer rows than expected.
From Chapter 5: SQL Fundamentals
Review question: How does the WHERE clause in a SELECT relate to the WHERE clause in an UPDATE or DELETE?
They are identical. The WHERE clause in UPDATE and DELETE uses exactly the same predicate syntax as SELECT. Every comparison operator, every logical connector (AND, OR, NOT), every pattern match (LIKE), every range test (BETWEEN), and every subquery (IN, EXISTS) works the same way. This is why "test with SELECT first" works -- if a SELECT with your WHERE clause returns the correct rows, the same WHERE clause in an UPDATE or DELETE will affect those same rows.
From Chapter 7: Aggregation and Grouping
Review question: How do aggregate functions appear in INSERT...SELECT statements?
You can use GROUP BY, HAVING, and all aggregate functions (SUM, COUNT, AVG, MIN, MAX) in the SELECT portion of an INSERT...SELECT. The grouped result set is what gets inserted into the target table. This is the pattern used in Section 9.11.3 for the daily account summary, where the source query aggregates transactions by account.
Summary
This chapter covered the complete spectrum of data modification in DB2:
INSERT adds data. Use column lists for clarity and resilience. Multi-row INSERT and INSERT...SELECT handle bulk scenarios. Default values and NULL handling require explicit thought.
UPDATE changes data. The WHERE clause is not optional -- it is your primary safety mechanism. Correlated subqueries and JOIN syntax (LUW) allow complex updates driven by other tables. Mass updates require batch processing with periodic commits.
DELETE removes data. WHERE-qualified DELETEs are targeted; DELETE without WHERE is a nuclear option. TRUNCATE is faster for clearing entire tables. Cascading deletes through foreign keys can have far-reaching, invisible effects.
MERGE combines INSERT and UPDATE (and optionally DELETE) into a single atomic statement. It solves the upsert pattern cleanly and is the backbone of ETL processing.
Transaction control is the framework that makes all of this safe. COMMIT makes changes permanent. ROLLBACK undoes them. SAVEPOINT provides intermediate recovery points. Understanding commit frequency, log impact, and lock behavior is essential for production systems.
Identity columns and sequences generate unique values for keys. They can have gaps -- this is normal. Choose identity columns for single-table keys and sequences for cross-table keys.
Bulk operations require different tools: multi-row INSERT for moderate volumes, LOAD for massive volumes, and NOT LOGGED INITIALLY (LUW) for staging tables where speed trumps recoverability.
Safety patterns are the most important lesson in this chapter. Test with SELECT first. Write WHERE before SET. Disable autocommit for interactive DML. Verify row counts. Use transactions. These habits are not optional -- they are the difference between a routine data change and an incident report.
In the next chapter, we move to advanced SQL: window functions, OLAP operations, temporal tables, and recursive queries. These tools will let you write sophisticated analytical queries against the data you now know how to create, modify, and protect.
Return to Part II: SQL Mastery | Previous: Chapter 8: Subqueries and CTEs | Next: Chapter 10: Advanced SQL