Part II: SQL Mastery

SQL is the language you use to talk to DB2. That sentence sounds simple, but its implications run deep. Every query you write, every table you create, every procedure you define, every permission you grant — all of it flows through SQL. If your SQL is weak, everything you do with DB2 is weak. If your SQL is strong, you have leverage over every problem the database can throw at you.

This is the largest part of the book, spanning eight chapters, and that size is proportional to the importance of the subject. We are not going to rush through syntax tables and move on. We are going to build your SQL fluency methodically, starting from simple retrieval and ending with sophisticated programming constructs that most DB2 professionals never fully master.

What This Part Covers

The eight chapters in Part II form a deliberate progression. Each one assumes you have internalized the material that came before it.

Chapter 5 covers the SELECT statement — the foundation of all data retrieval. We start with single-table queries, work through filtering with WHERE, sorting with ORDER BY, and eliminating duplicates with DISTINCT. This chapter moves quickly if you have prior SQL experience, but do not skip it. DB2 has idioms and behaviors in its SELECT processing that differ from other database systems, and those differences matter when you reach the performance chapters.

Chapter 6 introduces joins and subqueries. You will learn inner joins, outer joins (LEFT, RIGHT, FULL), cross joins, and self-joins. We will cover correlated and non-correlated subqueries, EXISTS and NOT EXISTS, and the important question of when to use a join versus a subquery. The Meridian Bank dataset becomes essential here — joining customer records to accounts, accounts to transactions, branches to employees. These are the multi-table queries that make up the bulk of real-world SQL.

Chapter 7 tackles aggregation, grouping, and the HAVING clause. We will work through COUNT, SUM, AVG, MIN, MAX, and then move into grouped aggregations and the subtleties of NULL handling in aggregate functions. The ROLLUP, CUBE, and GROUPING SETS extensions get thorough treatment because they solve reporting problems that would otherwise require multiple queries or application-side logic.

Chapter 8 covers set operations (UNION, INTERSECT, EXCEPT), common table expressions (CTEs), and recursive queries. CTEs are one of the most powerful features in modern SQL, and recursive CTEs unlock an entire category of problems — hierarchical data, graph traversal, running totals — that flat relational queries cannot address. We will build several recursive queries against Meridian's organizational hierarchy and account relationship structures.

Chapter 9 is where things get genuinely advanced: window functions and OLAP expressions. ROW_NUMBER, RANK, DENSE_RANK, NTILE, LAG, LEAD, running aggregates with OVER clauses, and frame specifications. This chapter separates SQL practitioners from SQL experts. Most developers learn window functions superficially; we will learn them deeply enough that you can solve complex analytical problems in pure SQL without resorting to procedural code.

Chapter 10 shifts from querying to data modification: INSERT, UPDATE, DELETE, and MERGE. We cover single-row and multi-row operations, INSERT from SELECT, conditional updates, and the MERGE statement that combines insert-or-update logic into a single atomic operation. We will also discuss identity columns, sequence objects, and generated columns — the mechanisms DB2 provides for automatic value creation.

Chapter 11 moves to Data Definition Language. CREATE TABLE, ALTER TABLE, DROP TABLE, and the full taxonomy of data types, constraints, and table options across both z/OS and LUW. We will cover referential integrity constraints, check constraints, and the platform-specific options like tablespace assignment on z/OS and automatic storage on LUW. This chapter is the bridge between SQL and database design — you will understand not just how to create objects, but what options are available and why you might choose one over another.

Chapter 12 covers views, triggers, user-defined functions, and stored procedures. These are the programmable objects that let you encapsulate business logic inside the database. Views provide logical data independence. Triggers enforce complex business rules at the data layer. User-defined functions extend DB2's built-in capabilities. Stored procedures package SQL and procedural logic into callable units that reduce network round trips and centralize data access patterns. We will write substantial examples of each, using DB2 SQL PL as the procedural language.

Why It Matters

I have reviewed thousands of lines of production SQL over my career, and I can tell you that the gap between adequate SQL and excellent SQL is enormous — not just in elegance, but in performance, maintainability, and correctness. A developer who does not understand window functions will write a correlated subquery that scans a table repeatedly. A developer who does not understand CTEs will create deeply nested inline views that no one can read or debug. A developer who does not understand MERGE will write separate INSERT and UPDATE statements with a race condition between them.

SQL fluency is not about memorizing syntax. It is about having a rich enough mental vocabulary that when you encounter a data problem, you can see the SQL solution directly. You should be able to look at a business requirement — "show me each customer's three most recent transactions along with the running balance" — and immediately decompose it into window functions and CTEs without fumbling through intermediate steps.

DB2's SQL dialect has specific strengths that we will exploit throughout this part. Its implementation of recursive CTEs is mature and performant. Its window function support is comprehensive. Its MERGE statement handles complex upsert scenarios gracefully. Knowing these strengths lets you write DB2 SQL that plays to the optimizer's capabilities rather than fighting against them.

The Meridian Bank Dataset in Action

Every chapter in Part II uses the Meridian National Bank dataset extensively. By Chapter 6, you will be writing queries that join customers to accounts, accounts to transactions, and transactions to branch records. By Chapter 9, you will be computing rolling 30-day average balances, ranking branches by deposit growth, and identifying customers whose transaction patterns deviate from their historical norms.

This is not contrived. These are the kinds of queries that real banking applications execute thousands of times per day. The Meridian dataset is large enough to be interesting — large enough that a poorly written query will be noticeably slower than a well-written one — but small enough that you can understand the full data landscape and verify your results manually when needed.

Dual-Platform SQL

One of the realities of the DB2 world is that z/OS and LUW share a common SQL core but diverge in specific areas. Throughout Part II, I will flag these differences explicitly. Some are minor — slightly different function names or syntax variations. Others are significant — z/OS handles temporal tables differently from LUW, certain optimizer hints are platform-specific, and the procedural language support has meaningful differences.

Where the platforms diverge, you will see both versions. Where they are identical, I will say so once and move on. The goal is complete coverage without redundant repetition.

How to Approach This Part

Work every example. Type the SQL yourself rather than copying it. Modify the queries and observe how the results change. When you encounter a new construct, try to break it — give it NULL values, empty result sets, edge cases. Understanding how SQL behaves at the boundaries is just as important as understanding how it behaves in the common case.

If you come to this book with prior SQL experience in another database system — Oracle, SQL Server, PostgreSQL — you will move through Chapters 5 and 6 quickly. Do not let that early speed tempt you into skimming Chapters 8 and 9. The advanced material is where the real value is, and it is where DB2-specific knowledge becomes critical.

By the time you finish Part II, you will have the SQL vocabulary to express any data operation DB2 supports. That is not an exaggeration. These eight chapters cover the complete SQL surface area that a DB2 professional needs. The chapters that follow will use this vocabulary constantly, so make sure the foundation is solid before you move on.

Chapters in This Part