> "The art of database querying is not in knowing the syntax — it is in knowing which question to ask."
In This Chapter
- Opening: From Business Question to Database Answer
- 5.1 The SELECT Statement — Your Primary Tool
- 5.2 The WHERE Clause — Filtering with Precision
- 5.3 Logical Operators — AND, OR, NOT, and Precedence
- 5.4 ORDER BY — Sorting Results
- 5.5 DISTINCT — Eliminating Duplicates
- 5.6 Built-In Scalar Functions
- 5.7 The CASE Expression — Conditional Logic in SQL
- 5.8 NULL — The Billion-Dollar Concept
- 5.9 FETCH FIRST N ROWS ONLY
- 5.10 Column Aliases and Expressions
- 5.11 Querying the Meridian Bank — Putting It All Together
- 5.12 Spaced Review — Chapters 1-4
- Summary
- What is Next
Chapter 5: SQL Fundamentals — SELECT, WHERE, ORDER BY, and the Art of Asking Questions
"The art of database querying is not in knowing the syntax — it is in knowing which question to ask."
Opening: From Business Question to Database Answer
It is Monday morning at Meridian National Bank. The branch manager at the Hartford Main Street location opens her email to find three requests:
- The compliance team needs a list of all customers who opened accounts in the last 90 days.
- A loan officer wants to see every account with a balance above $100,000.
- The marketing department is asking for all customers in Connecticut, sorted by how long they have been with the bank.
Each of these is a question — a question directed at data. And in the world of relational databases, the language for asking questions is SQL. More specifically, the answer to every one of these requests begins with a single word: SELECT.
This chapter teaches you to write SQL queries that retrieve, filter, sort, and transform data stored in DB2 tables. By the end, you will be able to answer every one of those Monday-morning requests — and hundreds more like them — with confidence.
We will work exclusively with the Meridian National Bank database you created in Chapter 4. If you have not yet run the create-meridian-database.sql and sample-data-load.sql scripts, do so now before continuing. Every example in this chapter assumes that data is in place.
5.1 The SELECT Statement — Your Primary Tool
The SELECT statement is the most frequently used statement in SQL. It retrieves data from one or more tables and returns it as a result set. At its simplest, SELECT requires only two clauses: what you want and where to get it.
Basic Syntax
SELECT column1, column2, ...
FROM schema.table_name;
Let us start with the most fundamental query against our Meridian Bank database — retrieving all columns and all rows from the BRANCHES table:
SELECT *
FROM MERIDIAN.BRANCHES;
The asterisk (*) is a wildcard meaning "all columns." DB2 expands it into every column defined on the table, in the order they were declared in the CREATE TABLE statement. You will see output similar to this (abbreviated for readability):
BRANCH_ID BRANCH_CODE BRANCH_NAME CITY STATE_CODE STATUS
--------- ----------- ------------------------ ---------- ---------- ------
1 MNB001 Meridian Main Street Hartford CT A
2 MNB002 Meridian Westside Hartford CT A
3 MNB003 Meridian Downtown Boston Boston MA A
4 MNB004 Meridian Park Avenue New York NY A
5 MNB005 Meridian Princeton Princeton NJ A
Selecting Specific Columns
In practice, you should almost never use SELECT * in production code. There are two reasons. First, it retrieves every column, including large columns you may not need, which wastes network bandwidth and memory. Second, if someone later adds or removes a column from the table, your application's behavior changes silently.
Instead, name the columns you need:
SELECT BRANCH_NAME, CITY, STATE_CODE
FROM MERIDIAN.BRANCHES;
Result:
BRANCH_NAME CITY STATE_CODE
------------------------ ---------- ----------
Meridian Main Street Hartford CT
Meridian Westside Hartford CT
Meridian Downtown Boston Boston MA
Meridian Park Avenue New York NY
Meridian Princeton Princeton NJ
This is a projection — you are projecting only the columns you care about from a wider table.
Selecting from the CUSTOMERS Table
Let us see a few of our customers:
SELECT CUSTOMER_ID, FIRST_NAME, LAST_NAME, CITY, STATE_CODE, CUSTOMER_SINCE
FROM MERIDIAN.CUSTOMERS;
Result (first 10 rows shown):
CUSTOMER_ID FIRST_NAME LAST_NAME CITY STATE_CODE CUSTOMER_SINCE
----------- ---------- --------------- -------------- ---------- --------------
100000 Alice Thornton Hartford CT 1998-06-15
100001 Benjamin Hartwell Hartford CT 2001-03-20
100002 Carmen Dominguez West Hartford CT 2010-01-05
100003 Daniel Fitzgerald Hartford CT 1999-11-01
100004 Hartford Coffee Roasters Hartford CT 2015-05-10
100005 Elena Vasquez West Hartford CT 2018-07-22
100006 Franklin Wu Boston MA 2006-02-14
100007 Grace O'Malley Boston MA 2008-10-30
100008 Boston Tech Solutions Boston MA 2011-01-20
100009 Henry Nakamura New York NY 2010-06-15
Notice that customer IDs begin at 100000 — we set START WITH 100000 in the IDENTITY specification in Chapter 4.
The Anatomy of a SELECT Statement
Before we go further, it is worth understanding the full logical structure of a SELECT statement. Although we will not use all of these clauses until later chapters, knowing where each piece fits will help you understand how queries are built:
SELECT [DISTINCT] column_list -- What to return (Chapter 5)
FROM table_references -- Where to get data (Chapter 5)
WHERE search_condition -- Row filtering (Chapter 5)
GROUP BY grouping_columns -- Aggregation groups (Chapter 7)
HAVING group_condition -- Group filtering (Chapter 7)
ORDER BY sort_specification -- Result ordering (Chapter 5)
FETCH FIRST n ROWS ONLY -- Result limiting (Chapter 5)
DB2 processes these clauses in a specific logical order that differs from the order you write them:
- FROM — Identify the source table(s)
- WHERE — Filter individual rows
- GROUP BY — Form groups (covered in Chapter 7)
- HAVING — Filter groups (covered in Chapter 7)
- SELECT — Evaluate expressions and apply DISTINCT
- ORDER BY — Sort the result set
- FETCH FIRST — Limit the number of rows returned
This logical order explains why you cannot reference a column alias defined in the SELECT clause within the WHERE clause — the WHERE is processed before the SELECT. We will revisit this point in later chapters.
Schema Qualification
In every example above, we wrote MERIDIAN.BRANCHES rather than just BRANCHES. This is schema qualification. On DB2 for LUW, if you have set the current schema to MERIDIAN (with SET CURRENT SCHEMA = 'MERIDIAN'), you can omit the prefix. On DB2 for z/OS, you typically qualify table names explicitly or rely on the CURRENT SQLID setting.
Throughout this chapter, we will include the schema qualifier for clarity. In your own work, decide on a convention and apply it consistently.
Platform Note — z/OS vs. LUW: The SELECT syntax shown in this chapter is identical on both platforms. The only differences you may encounter relate to the command-line tool you use to execute them (SPUFI, DSNTEP2, or QMF on z/OS versus the db2 CLP or IBM Data Studio on LUW). The SQL itself is the same.
SELECT and the Relational Model
If you recall from Chapter 2, the relational model defines two fundamental operations on relations: selection (choosing rows) and projection (choosing columns). In SQL, the WHERE clause performs selection, and the column list in the SELECT clause performs projection. The naming is confusing — the SQL keyword SELECT actually corresponds to the relational operation of projection — but the underlying concepts are precise.
When you write SELECT BRANCH_NAME, CITY FROM MERIDIAN.BRANCHES WHERE STATE_CODE = 'CT', you are performing a projection (narrowing to two columns) combined with a selection (filtering to Connecticut). Every query you write in this chapter is a composition of these two fundamental operations.
5.2 The WHERE Clause — Filtering with Precision
A SELECT without a WHERE clause returns every row in the table. Most of the time, you need a subset. The WHERE clause specifies a condition that each row must satisfy to be included in the result.
Comparison Operators
DB2 supports the standard comparison operators:
| Operator | Meaning | Example |
|---|---|---|
= |
Equal to | STATUS = 'A' |
<> |
Not equal to | STATE_CODE <> 'CT' |
< |
Less than | CURRENT_BALANCE < 1000.00 |
> |
Greater than | CURRENT_BALANCE > 100000.00 |
<= |
Less than or equal to | RISK_RATING <= 2 |
>= |
Greater than or equal to | CUSTOMER_SINCE >= '2020-01-01' |
Let us answer one of those Monday-morning questions: which accounts have a balance over $100,000?
SELECT ACCOUNT_NUMBER, CUSTOMER_ID, ACCOUNT_TYPE_CODE, CURRENT_BALANCE
FROM MERIDIAN.ACCOUNTS
WHERE CURRENT_BALANCE > 100000.00;
Result:
ACCOUNT_NUMBER CUSTOMER_ID ACCOUNT_TYPE_CODE CURRENT_BALANCE
---------------- ----------- ----------------- ---------------
MNB-MMA-000001 100000 MMA 250000.00
MNB-CHK-000007 100006 CHI 156200.00
MNB-MMA-000002 100006 MMA 500000.00
MNB-SAV-000004 100003 SAV 120500.00
MNB-CHK-000009 100008 CHK 234500.00
MNB-MMA-000003 100009 MMA 750000.00
MNB-CHK-000012 100011 CHI 325000.00
MNB-SAV-000008 100011 SAV 450000.00
The BETWEEN Predicate
BETWEEN tests whether a value falls within an inclusive range. It is equivalent to writing >= lower AND <= upper, but it reads more naturally:
SELECT FIRST_NAME, LAST_NAME, CUSTOMER_SINCE
FROM MERIDIAN.CUSTOMERS
WHERE CUSTOMER_SINCE BETWEEN '2010-01-01' AND '2019-12-31';
This returns customers who joined the bank in the 2010s. Remember: BETWEEN is inclusive on both ends. The row where CUSTOMER_SINCE = '2010-01-01' (Carmen Dominguez) will be included.
The IN Predicate
When you need to match against a list of specific values, use IN:
SELECT BRANCH_NAME, CITY, STATE_CODE
FROM MERIDIAN.BRANCHES
WHERE STATE_CODE IN ('MA', 'NY');
Result:
BRANCH_NAME CITY STATE_CODE
------------------------ -------- ----------
Meridian Downtown Boston Boston MA
Meridian Park Avenue New York NY
IN is equivalent to writing STATE_CODE = 'MA' OR STATE_CODE = 'NY', but it is more readable when the list has more than two values.
The LIKE Predicate — Pattern Matching
LIKE performs pattern matching using two wildcards:
%matches zero or more characters_matches exactly one character
Find all customers whose last name starts with "S":
SELECT FIRST_NAME, LAST_NAME, CITY
FROM MERIDIAN.CUSTOMERS
WHERE LAST_NAME LIKE 'S%';
Result:
FIRST_NAME LAST_NAME CITY
---------- --------- ---------
Karen Sterling Princeton
Patrick Sullivan Princeton
Find all account numbers containing "SAV":
SELECT ACCOUNT_NUMBER, CURRENT_BALANCE
FROM MERIDIAN.ACCOUNTS
WHERE ACCOUNT_NUMBER LIKE '%SAV%';
The underscore wildcard is useful for fixed-format data. To find any branch code where the fourth character is '0' and the sixth is '3':
SELECT BRANCH_CODE, BRANCH_NAME
FROM MERIDIAN.BRANCHES
WHERE BRANCH_CODE LIKE 'MNB_0_';
Performance Warning: A LIKE pattern that begins with
%(such asLIKE '%SAV%') cannot use a standard index on that column. DB2 must scan every row. For small tables this is fine. For tables with millions of rows, consider alternative approaches like full-text indexes or computed columns. We will revisit this in the indexing chapter (Chapter 14).
IS NULL and IS NOT NULL
You cannot test for NULL with =. The expression MIDDLE_NAME = NULL does not work the way you might expect — it always evaluates to UNKNOWN (we will explain why in Section 5.8). Instead, use the special predicates IS NULL and IS NOT NULL:
SELECT FIRST_NAME, LAST_NAME, MIDDLE_NAME
FROM MERIDIAN.CUSTOMERS
WHERE MIDDLE_NAME IS NULL;
Result:
FIRST_NAME LAST_NAME MIDDLE_NAME
---------- --------------- -----------
Benjamin Hartwell -
Hartford Coffee Roasters -
Elena Vasquez -
Isabella Romano -
Liam Okafor -
Nathan Blackwood -
Olivia Chambers - (Note: shown as - to represent NULL)
Patrick Sullivan -
Qian Liu -
Nine of our twenty customers have no middle name on file. This kind of discovery is common in data quality work — a theme we will explore in Case Study 2.
Comparing Dates in WHERE
Date comparisons deserve special attention because they are among the most common filtering operations in business applications. In DB2, date literals are written as strings in the format 'YYYY-MM-DD', and DB2 implicitly converts them to DATE values for comparison:
SELECT FIRST_NAME, LAST_NAME, CUSTOMER_SINCE
FROM MERIDIAN.CUSTOMERS
WHERE CUSTOMER_SINCE >= '2015-01-01'
ORDER BY CUSTOMER_SINCE;
You can also compare dates to special registers like CURRENT DATE:
SELECT ACCOUNT_NUMBER, OPEN_DATE
FROM MERIDIAN.ACCOUNTS
WHERE OPEN_DATE >= CURRENT DATE - 1 YEAR;
This finds accounts opened within the last year. The expression CURRENT DATE - 1 YEAR computes a date exactly one year before today. DB2 supports labeled durations with YEAR, YEARS, MONTH, MONTHS, DAY, and DAYS.
Be careful with date ranges. If you want all transactions from November 2025, this works correctly:
WHERE TRANSACTION_DATE BETWEEN '2025-11-01' AND '2025-11-30'
But if you were filtering a TIMESTAMP column rather than a DATE column, BETWEEN '2025-11-01' AND '2025-11-30' would miss timestamps on November 30 after midnight (because the string '2025-11-30' converts to 2025-11-30-00.00.00). For timestamp columns, use:
WHERE TRANSACTION_TS >= '2025-11-01' AND TRANSACTION_TS < '2025-12-01'
This pattern — using >= on the start and < on the day after the end — is the safest way to filter timestamp ranges.
Combining WHERE with Subqueries (Preview)
Although we cover subqueries formally in Chapter 8, it is worth noting that the WHERE clause can contain subqueries — queries nested inside other queries. For example, to find customers who have at least one account:
SELECT FIRST_NAME, LAST_NAME
FROM MERIDIAN.CUSTOMERS
WHERE CUSTOMER_ID IN (
SELECT CUSTOMER_ID
FROM MERIDIAN.ACCOUNTS
);
The inner SELECT produces a list of customer IDs from the ACCOUNTS table, and the outer WHERE filters the CUSTOMERS table to only those IDs. We will explore this powerful technique thoroughly in Chapter 8.
Check Your Understanding — Box 1
- Write a SELECT statement that returns only the
FIRST_NAME,LAST_NAME, andEMAIL_ADDRESScolumns from the CUSTOMERS table.- What is the difference between
WHERE STATUS = 'A'andWHERE STATUS IN ('A')? (Answer: They are functionally identical for a single value, but IN is designed for lists.)- A colleague writes
WHERE CLOSE_DATE = NULLto find accounts that have not been closed. Will this work? Why or why not?- Write a query to find all branches in the state of Connecticut (STATE_CODE = 'CT').
5.3 Logical Operators — AND, OR, NOT, and Precedence
Real-world filtering usually involves more than one condition. SQL provides three logical operators to combine predicates.
AND — Both Conditions Must Be True
Find active customers in Connecticut:
SELECT FIRST_NAME, LAST_NAME, CITY, STATUS
FROM MERIDIAN.CUSTOMERS
WHERE STATE_CODE = 'CT'
AND STATUS = 'A';
Every row in the result satisfies both conditions.
OR — At Least One Condition Must Be True
Find customers in either Connecticut or Massachusetts:
SELECT FIRST_NAME, LAST_NAME, STATE_CODE
FROM MERIDIAN.CUSTOMERS
WHERE STATE_CODE = 'CT'
OR STATE_CODE = 'MA';
NOT — Negation
Find all customers who are not in New York:
SELECT FIRST_NAME, LAST_NAME, STATE_CODE
FROM MERIDIAN.CUSTOMERS
WHERE NOT STATE_CODE = 'NY';
You can also combine NOT with other predicates:
WHERE NOT (STATE_CODE = 'NY' OR STATE_CODE = 'NJ')
WHERE LAST_NAME NOT LIKE 'S%'
WHERE ACCOUNT_TYPE_CODE NOT IN ('CHK', 'CHI')
WHERE CLOSE_DATE IS NOT NULL
Operator Precedence: The Critical Rule
SQL evaluates logical operators in this order:
NOT(highest precedence)ANDOR(lowest precedence)
This means that AND binds more tightly than OR. Consider this query:
-- CAUTION: This may not do what you expect
SELECT FIRST_NAME, LAST_NAME, STATE_CODE, CUSTOMER_TYPE
FROM MERIDIAN.CUSTOMERS
WHERE STATE_CODE = 'CT'
OR STATE_CODE = 'MA'
AND CUSTOMER_TYPE = 'I';
Because AND is evaluated before OR, DB2 interprets this as:
WHERE STATE_CODE = 'CT'
OR (STATE_CODE = 'MA' AND CUSTOMER_TYPE = 'I')
This returns all Connecticut customers (regardless of type) plus only individual customers from Massachusetts. If you wanted individual customers from either state, you must use parentheses:
SELECT FIRST_NAME, LAST_NAME, STATE_CODE, CUSTOMER_TYPE
FROM MERIDIAN.CUSTOMERS
WHERE (STATE_CODE = 'CT' OR STATE_CODE = 'MA')
AND CUSTOMER_TYPE = 'I';
Best Practice: Always use parentheses when mixing AND and OR. Even if you know the precedence rules, the next person reading your SQL may not. Explicit parentheses make your intent unambiguous.
Combining Multiple Conditions — A Realistic Example
Find individual customers in Connecticut who joined before 2010 and have a low risk rating:
SELECT CUSTOMER_ID, FIRST_NAME, LAST_NAME, CUSTOMER_SINCE, RISK_RATING
FROM MERIDIAN.CUSTOMERS
WHERE STATE_CODE = 'CT'
AND CUSTOMER_TYPE = 'I'
AND CUSTOMER_SINCE < '2010-01-01'
AND RISK_RATING <= 2;
Result:
CUSTOMER_ID FIRST_NAME LAST_NAME CUSTOMER_SINCE RISK_RATING
----------- ---------- ---------- -------------- -----------
100000 Alice Thornton 1998-06-15 1
100001 Benjamin Hartwell 2001-03-20 2
100003 Daniel Fitzgerald 1999-11-01 1
100015 Maria Gonzalez 2002-04-15 2
These are the bank's long-standing, low-risk customers in Connecticut — exactly the kind of segment a relationship manager would want to see.
5.4 ORDER BY — Sorting Results
SQL does not guarantee the order of rows in a result set unless you specify one. Even if your results happen to come back in a particular order today, that order may change tomorrow as the optimizer chooses a different access path or data is reorganized. If you need a specific order, you must use ORDER BY.
Basic Sorting
Sort customers alphabetically by last name:
SELECT FIRST_NAME, LAST_NAME, CITY
FROM MERIDIAN.CUSTOMERS
ORDER BY LAST_NAME;
The default sort direction is ascending (ASC). You can be explicit:
ORDER BY LAST_NAME ASC
Descending Order
Sort accounts by balance from highest to lowest:
SELECT ACCOUNT_NUMBER, CUSTOMER_ID, CURRENT_BALANCE
FROM MERIDIAN.ACCOUNTS
ORDER BY CURRENT_BALANCE DESC;
Result (first five rows):
ACCOUNT_NUMBER CUSTOMER_ID CURRENT_BALANCE
---------------- ----------- ---------------
MNB-MMA-000003 100009 750000.00
MNB-MMA-000002 100006 500000.00
MNB-SAV-000008 100011 450000.00
MNB-CHK-000012 100011 325000.00
MNB-MMA-000001 100000 250000.00
Multi-Column Sorting
You can sort on multiple columns. DB2 sorts by the first column, then breaks ties using the second, and so on:
SELECT STATE_CODE, CITY, LAST_NAME, FIRST_NAME
FROM MERIDIAN.CUSTOMERS
ORDER BY STATE_CODE, CITY, LAST_NAME;
This first groups customers by state, then by city within each state, then alphabetically by last name within each city.
You can mix directions:
ORDER BY STATE_CODE ASC, CURRENT_BALANCE DESC
Sorting by Column Position
You can refer to columns by their position in the SELECT list:
SELECT FIRST_NAME, LAST_NAME, CUSTOMER_SINCE
FROM MERIDIAN.CUSTOMERS
ORDER BY 3 DESC;
Here 3 refers to CUSTOMER_SINCE, the third column in the SELECT list. This approach is convenient for quick interactive queries but should be avoided in production code — if someone reorders the columns in the SELECT list, the sort changes silently.
Sorting by Expression
You can sort by computed values, even if they do not appear in the SELECT list:
SELECT FIRST_NAME, LAST_NAME, CUSTOMER_SINCE
FROM MERIDIAN.CUSTOMERS
ORDER BY YEAR(CURRENT DATE) - YEAR(CUSTOMER_SINCE) DESC;
This sorts by approximate customer tenure, longest first.
NULLS FIRST and NULLS LAST
When a sort column contains NULL values, where should those rows appear? DB2 for LUW defaults to sorting NULLs as higher than any non-NULL value, meaning NULLs appear last in ascending order and first in descending order. You can override this:
SELECT FIRST_NAME, LAST_NAME, MIDDLE_NAME
FROM MERIDIAN.CUSTOMERS
ORDER BY MIDDLE_NAME NULLS FIRST;
This places customers with no middle name at the top of the result, followed by those with middle names in ascending order.
ORDER BY MIDDLE_NAME DESC NULLS LAST
Platform Note — z/OS: DB2 for z/OS does not support the
NULLS FIRST/NULLS LASTsyntax. On z/OS, NULLs sort high (they appear last in ascending order, first in descending order). If you need different behavior on z/OS, use a CASE expression in the ORDER BY:
sql ORDER BY CASE WHEN MIDDLE_NAME IS NULL THEN 0 ELSE 1 END, MIDDLE_NAME
Collation and Sort Order
The sort order for character data depends on the database's collation setting. Our Meridian database was created with COLLATE USING IDENTITY, which sorts based on the Unicode code point values. This means uppercase letters sort before lowercase letters (A=65, Z=90, a=97, z=122). In practice, since DB2 stores VARCHAR data as-entered and our sample data uses typical mixed case, this rarely causes surprises.
If your database uses a locale-sensitive collation (such as COLLATE USING SYSTEM on LUW), the sort order follows the rules of that locale — accented characters sort near their unaccented equivalents, for example.
Check Your Understanding — Box 2
- Write a query that returns all customers sorted by state code ascending, then by last name descending within each state.
- What is the danger of using column positions (like
ORDER BY 3) in production code?- True or false: Without an ORDER BY clause, DB2 guarantees rows will be returned in primary key order. (Answer: False.)
- On DB2 for LUW, where do NULL values appear by default when you sort ascending?
5.5 DISTINCT — Eliminating Duplicates
Sometimes you want to know which unique values exist in a column, not every row. The DISTINCT keyword eliminates duplicate rows from the result set.
Which states do our customers live in?
SELECT DISTINCT STATE_CODE
FROM MERIDIAN.CUSTOMERS;
Result:
STATE_CODE
----------
CT
MA
NJ
NY
Without DISTINCT, you would get 20 rows (one per customer), with many duplicate state codes.
DISTINCT on Multiple Columns
DISTINCT applies to the entire row. When you list multiple columns, DB2 eliminates rows where all listed columns match:
SELECT DISTINCT STATE_CODE, CITY
FROM MERIDIAN.CUSTOMERS;
Result:
STATE_CODE CITY
---------- -------------
CT Hartford
CT West Hartford
MA Boston
NJ Princeton
NY New York
DISTINCT with ORDER BY
When you use DISTINCT, any column in the ORDER BY clause must also appear in the SELECT list. This makes sense logically — DB2 eliminates duplicates based on the selected columns, so it can only sort by values it retains:
-- This works: sorting by a column that is in the SELECT list
SELECT DISTINCT STATE_CODE, CITY
FROM MERIDIAN.CUSTOMERS
ORDER BY STATE_CODE, CITY;
-- This would fail: sorting by a column not in the DISTINCT set
-- SELECT DISTINCT STATE_CODE FROM MERIDIAN.CUSTOMERS ORDER BY CITY;
When Not to Use DISTINCT
DISTINCT adds a sort or hash operation to eliminate duplicates, which has a performance cost. Do not use it as a band-aid to hide a logic error in a query — especially a query with joins (which we cover in Chapter 6). If you are getting unexpected duplicates, the correct fix is usually to adjust your join logic, not to slap DISTINCT on the SELECT.
A useful rule of thumb: if you add DISTINCT and the row count changes, ask yourself why duplicates existed. If the answer is "because I joined to a table that has multiple matching rows," then you have a design decision to make about your query — DISTINCT might be the right answer, or you might need to restructure the query.
5.6 Built-In Scalar Functions
A scalar function operates on a single value and returns a single value. DB2 provides a rich library of built-in scalar functions. You use them in the SELECT list, WHERE clause, ORDER BY, and virtually anywhere an expression is allowed.
String Functions
UPPER and LOWER — Convert case:
SELECT UPPER(FIRST_NAME) AS FIRST_UPPER,
LOWER(EMAIL_ADDRESS) AS EMAIL_LOWER
FROM MERIDIAN.CUSTOMERS
WHERE CUSTOMER_ID = 100000;
Result:
FIRST_UPPER EMAIL_LOWER
----------- -------------------------
ALICE alice.thornton@email.com
A common use: case-insensitive searching.
SELECT FIRST_NAME, LAST_NAME
FROM MERIDIAN.CUSTOMERS
WHERE UPPER(LAST_NAME) = 'O''MALLEY';
SUBSTR — Extract a substring:
SUBSTR(string, start, length)
SELECT ACCOUNT_NUMBER,
SUBSTR(ACCOUNT_NUMBER, 5, 3) AS ACCT_TYPE_PART
FROM MERIDIAN.ACCOUNTS
WHERE CUSTOMER_ID = 100000;
Result:
ACCOUNT_NUMBER ACCT_TYPE_PART
---------------- --------------
MNB-CHK-000001 CHK
MNB-SAV-000001 SAV
MNB-MMA-000001 MMA
LENGTH — Returns the length of a string:
SELECT FIRST_NAME, LAST_NAME, LENGTH(LAST_NAME) AS NAME_LEN
FROM MERIDIAN.CUSTOMERS
ORDER BY LENGTH(LAST_NAME) DESC;
TRIM — Remove leading, trailing, or both whitespace (or specified characters):
SELECT TRIM(BOTH ' ' FROM ' Hello ') AS TRIMMED
FROM SYSIBM.SYSDUMMY1;
Result: Hello
On DB2, SYSIBM.SYSDUMMY1 is a system table with one row — useful for evaluating expressions that do not reference user tables. On z/OS, you can also use SYSIBM.SYSDUMMY1 for the same purpose.
CONCAT or || — Concatenate strings:
SELECT FIRST_NAME || ' ' || LAST_NAME AS FULL_NAME
FROM MERIDIAN.CUSTOMERS
WHERE CUSTOMER_ID = 100002;
Result: Carmen Dominguez
You can also use the CONCAT function:
SELECT CONCAT(CONCAT(FIRST_NAME, ' '), LAST_NAME) AS FULL_NAME
FROM MERIDIAN.CUSTOMERS
WHERE CUSTOMER_ID = 100002;
The || operator is more readable when concatenating multiple values.
REPLACE — Replace occurrences within a string:
SELECT PHONE_PRIMARY,
REPLACE(PHONE_PRIMARY, '-', '') AS PHONE_DIGITS
FROM MERIDIAN.CUSTOMERS
WHERE CUSTOMER_ID = 100000;
Result:
PHONE_PRIMARY PHONE_DIGITS
------------- ------------
860-555-1001 8605551001
Numeric Functions
ROUND — Round to a specified number of decimal places:
SELECT INTEREST_RATE,
ROUND(INTEREST_RATE * 100, 2) AS RATE_PERCENT
FROM MERIDIAN.ACCOUNTS
WHERE INTEREST_RATE > 0;
ABS — Absolute value:
SELECT TRANSACTION_ID, AMOUNT, ABS(AMOUNT) AS ABS_AMOUNT
FROM MERIDIAN.TRANSACTIONS
WHERE AMOUNT < 0
FETCH FIRST 5 ROWS ONLY;
MOD — Modulus (remainder after division):
SELECT CUSTOMER_ID, MOD(CUSTOMER_ID, 2) AS IS_ODD
FROM MERIDIAN.CUSTOMERS
FETCH FIRST 5 ROWS ONLY;
FLOOR and CEILING (or CEIL):
SELECT FLOOR(3.7) AS FLOOR_VAL, CEILING(3.2) AS CEIL_VAL
FROM SYSIBM.SYSDUMMY1;
Result: FLOOR_VAL = 3, CEIL_VAL = 4
Date and Time Functions
DB2 provides special registers and functions for date/time work.
CURRENT DATE, CURRENT TIME, CURRENT TIMESTAMP — These are special registers, not functions (no parentheses):
SELECT CURRENT DATE AS TODAY,
CURRENT TIME AS NOW,
CURRENT TIMESTAMP AS RIGHT_NOW
FROM SYSIBM.SYSDUMMY1;
YEAR, MONTH, DAY — Extract components from a date:
SELECT FIRST_NAME, LAST_NAME,
YEAR(CUSTOMER_SINCE) AS JOIN_YEAR,
MONTH(CUSTOMER_SINCE) AS JOIN_MONTH,
DAY(CUSTOMER_SINCE) AS JOIN_DAY
FROM MERIDIAN.CUSTOMERS
WHERE CUSTOMER_ID = 100000;
Result:
FIRST_NAME LAST_NAME JOIN_YEAR JOIN_MONTH JOIN_DAY
---------- --------- --------- ---------- --------
Alice Thornton 1998 6 15
Date Arithmetic — DB2 supports direct arithmetic on dates:
SELECT CURRENT DATE AS TODAY,
CURRENT DATE - 90 DAYS AS NINETY_DAYS_AGO,
CURRENT DATE + 1 YEAR AS NEXT_YEAR
FROM SYSIBM.SYSDUMMY1;
This is how we answer that compliance question: "Show customers who opened accounts in the last 90 days."
SELECT C.FIRST_NAME, C.LAST_NAME, A.ACCOUNT_NUMBER, A.OPEN_DATE
FROM MERIDIAN.ACCOUNTS A,
MERIDIAN.CUSTOMERS C
WHERE A.CUSTOMER_ID = C.CUSTOMER_ID
AND A.OPEN_DATE >= CURRENT DATE - 90 DAYS;
Note
This query uses a basic two-table reference with a WHERE join condition. We will formally cover joins in Chapter 6. The syntax is shown here because the business question requires it. If you have not yet studied joins, focus on the date arithmetic for now.
DAYS — Converts a date to an integer representing the number of days since December 31, 0000:
SELECT DAYS(CURRENT DATE) - DAYS(CUSTOMER_SINCE) AS DAYS_AS_CUSTOMER
FROM MERIDIAN.CUSTOMERS
WHERE CUSTOMER_ID = 100000;
TIMESTAMPDIFF — Calculate the difference between two timestamps in a specified unit:
SELECT FIRST_NAME, LAST_NAME,
TIMESTAMPDIFF(16, CHAR(CURRENT TIMESTAMP - CREATED_TS)) AS DAYS_SINCE_CREATED
FROM MERIDIAN.CUSTOMERS
WHERE CUSTOMER_ID = 100000;
The first argument to TIMESTAMPDIFF is an integer code indicating the unit: 1=fractions of a second, 2=seconds, 4=minutes, 8=hours, 16=days, 32=weeks, 64=months, 128=quarters, 256=years.
Conversion Functions
CAST — The universal type converter:
SELECT CAST(CURRENT_BALANCE AS INTEGER) AS BALANCE_INT,
CAST(CUSTOMER_ID AS VARCHAR(10)) AS CUST_ID_STR
FROM MERIDIAN.ACCOUNTS
WHERE ACCOUNT_ID = 1000000;
INTEGER, DECIMAL, VARCHAR — Convenience cast functions:
SELECT INTEGER(CURRENT_BALANCE) AS BALANCE_INT,
DECIMAL(INTEREST_RATE * 100, 5, 2) AS RATE_PCT,
VARCHAR(CUSTOMER_ID) AS CUST_ID_STR
FROM MERIDIAN.ACCOUNTS
WHERE ACCOUNT_ID = 1000000;
CHAR — Convert to fixed-length character string:
SELECT CHAR(OPEN_DATE, ISO) AS OPEN_DATE_ISO
FROM MERIDIAN.ACCOUNTS
WHERE ACCOUNT_ID = 1000000;
Result: 2025-06-15 (ISO format: YYYY-MM-DD)
DIGITS — Convert a numeric value to a character string of digits (no sign, no decimal point, padded with leading zeros). Useful on z/OS for formatting:
SELECT DIGITS(CUSTOMER_ID) AS CUST_DIGITS
FROM MERIDIAN.CUSTOMERS
WHERE CUSTOMER_ID = 100000;
Check Your Understanding — Box 3
- Write a query that displays each customer's full name (FIRST_NAME + space + LAST_NAME) as a single column called FULL_NAME.
- How would you find all customers whose email address contains "email.com" (case-insensitive)?
- Write a query that shows the year each branch was opened.
- What is the purpose of
SYSIBM.SYSDUMMY1?- What is the difference between CAST(X AS INTEGER) and INTEGER(X)?
5.7 The CASE Expression — Conditional Logic in SQL
The CASE expression lets you embed if-then-else logic directly within a SQL query. It is not a statement — it is an expression that returns a value, which means it can be used anywhere a value is expected: in the SELECT list, WHERE clause, ORDER BY, or even inside function calls.
Simple CASE
The simple form compares one expression against multiple values:
SELECT CUSTOMER_ID, FIRST_NAME, LAST_NAME,
CASE CUSTOMER_TYPE
WHEN 'I' THEN 'Individual'
WHEN 'B' THEN 'Business'
ELSE 'Unknown'
END AS CUSTOMER_TYPE_DESC
FROM MERIDIAN.CUSTOMERS;
Result (first five rows):
CUSTOMER_ID FIRST_NAME LAST_NAME CUSTOMER_TYPE_DESC
----------- ---------- --------------- ------------------
100000 Alice Thornton Individual
100001 Benjamin Hartwell Individual
100002 Carmen Dominguez Individual
100003 Daniel Fitzgerald Individual
100004 Hartford Coffee Roasters Business
Searched CASE
The searched form evaluates independent Boolean conditions. This is more flexible:
SELECT ACCOUNT_NUMBER, CURRENT_BALANCE,
CASE
WHEN CURRENT_BALANCE >= 100000.00 THEN 'Premium'
WHEN CURRENT_BALANCE >= 25000.00 THEN 'Gold'
WHEN CURRENT_BALANCE >= 5000.00 THEN 'Standard'
ELSE 'Basic'
END AS ACCOUNT_TIER
FROM MERIDIAN.ACCOUNTS
ORDER BY CURRENT_BALANCE DESC;
Result (first eight rows):
ACCOUNT_NUMBER CURRENT_BALANCE ACCOUNT_TIER
---------------- --------------- ------------
MNB-MMA-000003 750000.00 Premium
MNB-MMA-000002 500000.00 Premium
MNB-SAV-000008 450000.00 Premium
MNB-CHK-000012 325000.00 Premium
MNB-MMA-000001 250000.00 Premium
MNB-CHK-000009 234500.00 Premium
MNB-CHK-000007 156200.00 Premium
MNB-SAV-000004 120500.00 Premium
DB2 evaluates WHEN conditions in order and returns the value for the first condition that is true. If no condition is true and there is no ELSE clause, the CASE expression returns NULL.
Best Practice: Always include an ELSE clause, even if you think all cases are covered. Without it, unexpected data could silently produce NULLs in your result set.
CASE in the WHERE Clause
You can use CASE in the WHERE clause, though the need is less common:
SELECT FIRST_NAME, LAST_NAME, RISK_RATING
FROM MERIDIAN.CUSTOMERS
WHERE CASE
WHEN RISK_RATING <= 2 THEN 'LOW'
WHEN RISK_RATING = 3 THEN 'MEDIUM'
ELSE 'HIGH'
END = 'LOW';
This is equivalent to WHERE RISK_RATING <= 2, but the CASE form can be useful in more complex scenarios.
CASE in ORDER BY
Sort customers with business accounts first, then individuals:
SELECT CUSTOMER_ID, FIRST_NAME, LAST_NAME, CUSTOMER_TYPE
FROM MERIDIAN.CUSTOMERS
ORDER BY CASE CUSTOMER_TYPE
WHEN 'B' THEN 1
WHEN 'I' THEN 2
ELSE 3
END,
LAST_NAME;
Decoding Status Codes
One of the most common uses of CASE is translating terse status codes into human-readable labels:
SELECT ACCOUNT_NUMBER, CURRENT_BALANCE,
CASE STATUS
WHEN 'A' THEN 'Active'
WHEN 'I' THEN 'Inactive'
WHEN 'C' THEN 'Closed'
WHEN 'F' THEN 'Frozen'
ELSE 'Unknown (' || STATUS || ')'
END AS STATUS_DESC
FROM MERIDIAN.ACCOUNTS;
CASE and Data Type Consistency
All result expressions in a CASE must be compatible data types. DB2 will implicitly cast between compatible types (for example, VARCHAR(10) and VARCHAR(20)), but mixing fundamentally different types causes an error:
-- This would fail: mixing VARCHAR and INTEGER in CASE results
-- CASE WHEN X = 1 THEN 'Low' WHEN X = 2 THEN 100 END
-- Fix: ensure all results are the same type
CASE WHEN X = 1 THEN 'Low' WHEN X = 2 THEN '100' END
Nested CASE Expressions
CASE expressions can be nested — a CASE inside a CASE — for multi-dimensional classification:
SELECT ACCOUNT_NUMBER,
ACCOUNT_TYPE_CODE,
CURRENT_BALANCE,
CASE ACCOUNT_TYPE_CODE
WHEN 'CHK' THEN
CASE
WHEN CURRENT_BALANCE >= 10000 THEN 'Premium Checking'
WHEN CURRENT_BALANCE >= 1000 THEN 'Standard Checking'
ELSE 'Basic Checking'
END
WHEN 'SAV' THEN
CASE
WHEN CURRENT_BALANCE >= 50000 THEN 'High-Yield Savings'
ELSE 'Regular Savings'
END
ELSE 'Other: ' || ACCOUNT_TYPE_CODE
END AS DETAILED_CLASSIFICATION
FROM MERIDIAN.ACCOUNTS
ORDER BY ACCOUNT_TYPE_CODE, CURRENT_BALANCE DESC;
While powerful, deeply nested CASE expressions can become hard to read. If you find yourself nesting three or more levels deep, consider whether a lookup table or a view might express the logic more clearly.
5.8 NULL — The Billion-Dollar Concept
NULL is one of the most important concepts in relational databases, and one of the most misunderstood. NULL does not mean zero. It does not mean empty string. It means unknown or not applicable. Understanding how DB2 handles NULL is critical to writing correct queries.
Three-Valued Logic
In most programming languages, a Boolean expression is either true or false. In SQL, there is a third possibility: UNKNOWN. This arises whenever NULL is involved in a comparison.
Consider the expression MIDDLE_NAME = 'Ann' when MIDDLE_NAME is NULL. Is it true? We do not know, because the value is unknown. Is it false? We do not know that either. So SQL evaluates it as UNKNOWN.
The WHERE clause includes a row only when the condition evaluates to TRUE. Rows where the condition is FALSE or UNKNOWN are excluded. This leads to a critical consequence:
-- This does NOT find rows where MIDDLE_NAME is NULL
SELECT * FROM MERIDIAN.CUSTOMERS WHERE MIDDLE_NAME = NULL;
-- Returns: 0 rows (always!)
-- This is the correct way
SELECT * FROM MERIDIAN.CUSTOMERS WHERE MIDDLE_NAME IS NULL;
Truth Tables with UNKNOWN
Here is how the three logical operators behave when UNKNOWN is involved:
AND Truth Table:
| A | B | A AND B |
|---|---|---|
| TRUE | UNKNOWN | UNKNOWN |
| FALSE | UNKNOWN | FALSE |
| UNKNOWN | UNKNOWN | UNKNOWN |
OR Truth Table:
| A | B | A OR B |
|---|---|---|
| TRUE | UNKNOWN | TRUE |
| FALSE | UNKNOWN | UNKNOWN |
| UNKNOWN | UNKNOWN | UNKNOWN |
NOT:
| A | NOT A |
|---|---|
| UNKNOWN | UNKNOWN |
The key takeaway: FALSE AND UNKNOWN is FALSE (because something that is already false cannot become true). TRUE OR UNKNOWN is TRUE (because something already true remains true). Everything else involving UNKNOWN yields UNKNOWN.
NULL in Expressions
Any arithmetic or string operation involving NULL produces NULL:
SELECT 5 + NULL FROM SYSIBM.SYSDUMMY1; -- Result: NULL
SELECT 'Hello' || NULL FROM SYSIBM.SYSDUMMY1; -- Result: NULL
SELECT NULL = NULL FROM SYSIBM.SYSDUMMY1; -- Result: UNKNOWN (not TRUE!)
This last point deserves emphasis: NULL is not equal to NULL. Two unknown values are not considered equal, because we do not know what either of them represents.
NULL in Aggregate Functions
Aggregate functions (covered in Chapter 7) generally ignore NULL values. For example, AVG(INTEREST_RATE) computes the average of non-NULL interest rates. The rows where INTEREST_RATE is NULL are not included in the count or the sum. The exception is COUNT(*), which counts all rows regardless of NULL values.
Common NULL Pitfalls
Pitfall 1: NOT IN with NULL values
-- Suppose RELATED_ACCOUNT_ID contains some NULL values
SELECT ACCOUNT_ID
FROM MERIDIAN.ACCOUNTS
WHERE ACCOUNT_ID NOT IN (
SELECT RELATED_ACCOUNT_ID
FROM MERIDIAN.TRANSACTIONS
);
If any value in the subquery result is NULL, the entire NOT IN evaluates to UNKNOWN for every row, and no rows are returned. This is one of the most subtle bugs in SQL. The fix is to filter NULLs in the subquery:
WHERE ACCOUNT_ID NOT IN (
SELECT RELATED_ACCOUNT_ID
FROM MERIDIAN.TRANSACTIONS
WHERE RELATED_ACCOUNT_ID IS NOT NULL
)
Or use NOT EXISTS instead (covered in Chapter 8).
Pitfall 2: Counting NULLs
SELECT COUNT(MIDDLE_NAME) FROM MERIDIAN.CUSTOMERS; -- Returns 11 (non-NULL only)
SELECT COUNT(*) FROM MERIDIAN.CUSTOMERS; -- Returns 20 (all rows)
If you want to count how many customers have no middle name, use:
SELECT COUNT(*) - COUNT(MIDDLE_NAME) AS NULL_COUNT
FROM MERIDIAN.CUSTOMERS;
Or:
SELECT COUNT(*)
FROM MERIDIAN.CUSTOMERS
WHERE MIDDLE_NAME IS NULL;
Pitfall 3: String concatenation with NULL
SELECT FIRST_NAME || ' ' || MIDDLE_NAME || ' ' || LAST_NAME AS FULL_NAME
FROM MERIDIAN.CUSTOMERS;
For customers without a middle name, this returns NULL for the entire expression — not "Alice Thornton" but NULL. The fix is COALESCE (see below).
COALESCE — The NULL Handler
COALESCE returns the first non-NULL value in its argument list:
COALESCE(value1, value2, ..., valueN)
Fix the full-name concatenation:
SELECT FIRST_NAME || ' ' ||
COALESCE(MIDDLE_NAME || ' ', '') ||
LAST_NAME AS FULL_NAME
FROM MERIDIAN.CUSTOMERS;
Result for Alice Thornton: Alice Marie Thornton
Result for Benjamin Hartwell: Benjamin Hartwell
Use COALESCE to provide default values in your output:
SELECT BRANCH_NAME,
COALESCE(FAX_NUMBER, 'No fax on file') AS FAX
FROM MERIDIAN.BRANCHES;
NULLIF — Creating NULLs from Values
NULLIF returns NULL if two values are equal; otherwise it returns the first value:
NULLIF(expression1, expression2)
A common use is to prevent division by zero:
-- Without NULLIF: division by zero error if SOME_COUNT = 0
-- With NULLIF: returns NULL instead of dividing by zero
SELECT TOTAL_AMOUNT / NULLIF(SOME_COUNT, 0) AS AVERAGE_AMOUNT
Another use: treat a sentinel value as NULL:
-- If PHONE_SECONDARY is stored as 'N/A' for some rows, treat it as NULL
SELECT CUSTOMER_ID,
NULLIF(PHONE_SECONDARY, 'N/A') AS PHONE_ALT
FROM MERIDIAN.CUSTOMERS;
Check Your Understanding — Box 4
- What does the expression
5 > NULLevaluate to?- You run
SELECT COUNT(CLOSE_DATE) FROM MERIDIAN.ACCOUNTS;and get 0. What does this tell you about the data?- Write a COALESCE expression that displays a customer's secondary phone number, or their primary phone number if secondary is NULL, or 'No phone' if both are NULL.
- Why does
WHERE COLUMN_X NOT IN (1, 2, NULL)return no rows?
5.9 FETCH FIRST N ROWS ONLY
When you are exploring data or building a report preview, you often want just the first few rows rather than the entire result set. DB2 provides FETCH FIRST N ROWS ONLY for this purpose.
Basic Usage
SELECT ACCOUNT_NUMBER, CURRENT_BALANCE
FROM MERIDIAN.ACCOUNTS
ORDER BY CURRENT_BALANCE DESC
FETCH FIRST 5 ROWS ONLY;
Result:
ACCOUNT_NUMBER CURRENT_BALANCE
---------------- ---------------
MNB-MMA-000003 750000.00
MNB-MMA-000002 500000.00
MNB-SAV-000008 450000.00
MNB-CHK-000012 325000.00
MNB-MMA-000001 250000.00
This is the DB2 SQL standard syntax. It works on both DB2 for LUW and DB2 for z/OS.
FETCH FIRST with ONLY vs. WITH TIES
DB2 12 for z/OS and DB2 11.5 for LUW support FETCH FIRST N ROWS WITH TIES, which includes additional rows that have the same sort value as the Nth row:
SELECT ACCOUNT_TYPE_CODE, CURRENT_BALANCE
FROM MERIDIAN.ACCOUNTS
ORDER BY ACCOUNT_TYPE_CODE
FETCH FIRST 3 ROWS WITH TIES;
If multiple rows share the same ACCOUNT_TYPE_CODE as the third row, they are all included.
OFFSET — Skipping Rows
To paginate results, combine OFFSET with FETCH FIRST:
SELECT ACCOUNT_NUMBER, CURRENT_BALANCE
FROM MERIDIAN.ACCOUNTS
ORDER BY CURRENT_BALANCE DESC
OFFSET 5 ROWS
FETCH FIRST 5 ROWS ONLY;
This skips the top 5 and returns the next 5 — effectively "page 2" of a 5-per-page result set.
DB2 Syntax vs. LIMIT
If you are coming from MySQL or PostgreSQL, you may be accustomed to writing:
-- MySQL/PostgreSQL syntax — NOT standard SQL
SELECT * FROM table LIMIT 5;
SELECT * FROM table LIMIT 5 OFFSET 10;
DB2 does not support the LIMIT keyword. The equivalent DB2 syntax is:
-- DB2 standard syntax
SELECT * FROM table FETCH FIRST 5 ROWS ONLY;
SELECT * FROM table OFFSET 10 ROWS FETCH FIRST 5 ROWS ONLY;
Optimization Implications
FETCH FIRST is not just a convenience — it can affect the optimizer's choice of access plan. When DB2 knows you only need the first N rows, it may choose an access path that delivers those rows quickly (such as scanning an index in the right order) rather than one that would be optimal for the complete result set. This is particularly effective when combined with ORDER BY on an indexed column.
However, do not assume that FETCH FIRST makes a query "free." DB2 still needs to identify the qualifying rows and sort them (if ORDER BY is specified). For a large table with no suitable index, DB2 may still need to scan the entire table and sort all qualifying rows before returning the top N.
FETCH FIRST Without ORDER BY
You can use FETCH FIRST without ORDER BY, but the result is non-deterministic — you get N rows, but you do not control which N rows:
-- Returns 5 rows, but which 5? No guarantee.
SELECT ACCOUNT_NUMBER, CURRENT_BALANCE
FROM MERIDIAN.ACCOUNTS
FETCH FIRST 5 ROWS ONLY;
This is useful for quick data exploration ("just show me a few rows so I can see the format"), but never for reports or application queries where specific rows matter.
Practical Pagination Example
Implementing pagination for a web application is a common use case. Here is a complete example showing three pages of 10 accounts each:
-- Page 1: Rows 1-10
SELECT ACCOUNT_NUMBER, CURRENT_BALANCE
FROM MERIDIAN.ACCOUNTS
ORDER BY ACCOUNT_NUMBER
FETCH FIRST 10 ROWS ONLY;
-- Page 2: Rows 11-20
SELECT ACCOUNT_NUMBER, CURRENT_BALANCE
FROM MERIDIAN.ACCOUNTS
ORDER BY ACCOUNT_NUMBER
OFFSET 10 ROWS
FETCH FIRST 10 ROWS ONLY;
-- Page 3: Rows 21-30
SELECT ACCOUNT_NUMBER, CURRENT_BALANCE
FROM MERIDIAN.ACCOUNTS
ORDER BY ACCOUNT_NUMBER
OFFSET 20 ROWS
FETCH FIRST 10 ROWS ONLY;
Note that pagination using OFFSET becomes increasingly expensive for large offsets. For page 1,000 of a million-row table, DB2 must skip 999,000 rows. We will discuss keyset pagination as a more scalable alternative in the performance chapter (Chapter 14).
5.10 Column Aliases and Expressions
Column Aliases
You have already seen the AS keyword in several examples. It gives a column in the result set a custom name:
SELECT FIRST_NAME AS "First Name",
LAST_NAME AS "Last Name",
CUSTOMER_SINCE AS "Member Since"
FROM MERIDIAN.CUSTOMERS;
The AS keyword is optional — FIRST_NAME "First Name" works too — but including it improves readability.
If the alias does not contain spaces or special characters, you can omit the double quotes:
SELECT FIRST_NAME AS FNAME, LAST_NAME AS LNAME
FROM MERIDIAN.CUSTOMERS;
Unquoted aliases are automatically uppercased by DB2.
Expressions in the SELECT List
You can include computed values in the SELECT list:
SELECT ACCOUNT_NUMBER,
CURRENT_BALANCE,
OVERDRAFT_LIMIT,
CURRENT_BALANCE + OVERDRAFT_LIMIT AS TOTAL_AVAILABLE
FROM MERIDIAN.ACCOUNTS
WHERE CUSTOMER_ID = 100000;
Result:
ACCOUNT_NUMBER CURRENT_BALANCE OVERDRAFT_LIMIT TOTAL_AVAILABLE
---------------- --------------- --------------- ---------------
MNB-CHK-000001 12450.75 500.00 12950.75
MNB-SAV-000001 85200.00 0.00 85200.00
MNB-MMA-000001 250000.00 0.00 250000.00
String Literals
You can include literal values in your SELECT:
SELECT 'Customer: ' || FIRST_NAME || ' ' || LAST_NAME AS LABEL,
'Active' AS STATUS_LABEL
FROM MERIDIAN.CUSTOMERS
WHERE STATUS = 'A'
FETCH FIRST 3 ROWS ONLY;
Arithmetic with Dates
Date arithmetic in the SELECT list is common:
SELECT FIRST_NAME, LAST_NAME,
CUSTOMER_SINCE,
DAYS(CURRENT DATE) - DAYS(CUSTOMER_SINCE) AS DAYS_AS_CUSTOMER,
YEAR(CURRENT DATE) - YEAR(CUSTOMER_SINCE) AS APPROX_YEARS
FROM MERIDIAN.CUSTOMERS
ORDER BY CUSTOMER_SINCE
FETCH FIRST 5 ROWS ONLY;
This shows how long each customer has been with the bank.
5.11 Querying the Meridian Bank — Putting It All Together
Let us bring together everything from this chapter to answer realistic questions that Meridian National Bank staff might ask.
Query 1: Customer Directory for the Hartford Region
"Give me a directory of all individual customers in Connecticut, showing their full name and contact information, sorted alphabetically."
SELECT CUSTOMER_ID,
FIRST_NAME || ' ' ||
COALESCE(MIDDLE_NAME || ' ', '') ||
LAST_NAME AS FULL_NAME,
ADDRESS_LINE1,
CITY,
ZIP_CODE,
COALESCE(EMAIL_ADDRESS, 'No email') AS EMAIL,
COALESCE(PHONE_PRIMARY, 'No phone') AS PHONE
FROM MERIDIAN.CUSTOMERS
WHERE STATE_CODE = 'CT'
AND CUSTOMER_TYPE = 'I'
ORDER BY LAST_NAME, FIRST_NAME;
This query uses: column selection, string concatenation, COALESCE for NULLs, WHERE with AND, and ORDER BY on two columns.
Query 2: High-Value Account Summary
"Show me all accounts with balances over $50,000, categorized by tier."
SELECT ACCOUNT_NUMBER,
ACCOUNT_TYPE_CODE,
CURRENT_BALANCE,
CASE
WHEN CURRENT_BALANCE >= 500000.00 THEN 'Platinum'
WHEN CURRENT_BALANCE >= 100000.00 THEN 'Gold'
WHEN CURRENT_BALANCE >= 50000.00 THEN 'Silver'
END AS TIER,
INTEREST_RATE * 100 AS RATE_PERCENT
FROM MERIDIAN.ACCOUNTS
WHERE CURRENT_BALANCE >= 50000.00
ORDER BY CURRENT_BALANCE DESC;
Query 3: Recent Transaction Analysis
"Show me Alice Thornton's checking account transactions from December 2025, with a readable transaction type."
SELECT TRANSACTION_DATE,
CASE TRANSACTION_TYPE
WHEN 'DEP' THEN 'Deposit'
WHEN 'WDR' THEN 'Withdrawal'
WHEN 'TRF' THEN 'Transfer'
WHEN 'PMT' THEN 'Payment'
WHEN 'FEE' THEN 'Fee'
WHEN 'INT' THEN 'Interest'
WHEN 'CHK' THEN 'Check'
WHEN 'ADJ' THEN 'Adjustment'
ELSE TRANSACTION_TYPE
END AS TRANS_TYPE,
DESCRIPTION,
AMOUNT,
RUNNING_BALANCE,
CHANNEL
FROM MERIDIAN.TRANSACTIONS
WHERE ACCOUNT_ID = 1000000
AND TRANSACTION_DATE BETWEEN '2025-12-01' AND '2025-12-31'
ORDER BY TRANSACTION_DATE, TRANSACTION_TS;
Result:
TRANSACTION_DATE TRANS_TYPE DESCRIPTION AMOUNT RUNNING_BALANCE CHANNEL
---------------- ---------- ------------------------- --------- --------------- -------
2025-12-01 Deposit Payroll direct deposit 5000.00 11022.50 ACH
2025-12-03 Check Rent payment - check #1043 -1250.00 9772.50 CHECK
2025-12-05 Payment Electric bill payment -92.00 9680.50 ONLINE
2025-12-07 Withdrawal ATM withdrawal -300.00 9380.50 ATM
2025-12-10 Transfer Transfer to money market -3000.00 6380.50 ONLINE
2025-12-15 Deposit Payroll direct deposit 5000.00 11380.50 ACH
2025-12-15 Payment Internet service payment -45.00 11335.50 ONLINE
Query 4: Branch Activity Overview
"For each branch, show the branch name, city, and when it was opened. Include how many years the branch has been open."
SELECT BRANCH_CODE,
BRANCH_NAME,
CITY || ', ' || STATE_CODE AS LOCATION,
OPEN_DATE,
YEAR(CURRENT DATE) - YEAR(OPEN_DATE) AS APPROX_YEARS_OPEN
FROM MERIDIAN.BRANCHES
WHERE STATUS = 'A'
ORDER BY OPEN_DATE;
Result:
BRANCH_CODE BRANCH_NAME LOCATION OPEN_DATE APPROX_YEARS_OPEN
----------- ------------------------ --------------- ---------- -----------------
MNB001 Meridian Main Street Hartford, CT 1998-03-15 28
MNB002 Meridian Westside Hartford, CT 2003-06-01 23
MNB003 Meridian Downtown Boston Boston, MA 2005-09-22 21
MNB004 Meridian Park Avenue New York, NY 2010-01-10 16
MNB005 Meridian Princeton Princeton, NJ 2015-04-18 11
Query 5: Account Type Distribution
"List each distinct account type code that appears in the accounts table, along with how many accounts of that type exist and whether the type is interest-bearing."
We will preview an aggregate function here (COUNT) that is formally covered in Chapter 7, because this is a natural question to ask:
SELECT A.ACCOUNT_TYPE_CODE,
T.ACCOUNT_TYPE_NAME,
T.CATEGORY,
T.INTEREST_BEARING,
COUNT(*) AS NUM_ACCOUNTS
FROM MERIDIAN.ACCOUNTS A,
MERIDIAN.ACCOUNT_TYPES T
WHERE A.ACCOUNT_TYPE_CODE = T.ACCOUNT_TYPE_CODE
GROUP BY A.ACCOUNT_TYPE_CODE, T.ACCOUNT_TYPE_NAME,
T.CATEGORY, T.INTEREST_BEARING
ORDER BY NUM_ACCOUNTS DESC;
Preview: GROUP BY and COUNT are aggregate features covered in Chapter 7. The key lesson here is that SELECT queries rarely exist in isolation — they build on each other. By the end of Part II, you will be combining all of these tools fluently.
Query 6: Customers Without Recent Activity
"Find all savings accounts that have not had activity since October 2025."
SELECT ACCOUNT_NUMBER,
CUSTOMER_ID,
CURRENT_BALANCE,
LAST_ACTIVITY_DATE
FROM MERIDIAN.ACCOUNTS
WHERE ACCOUNT_TYPE_CODE = 'SAV'
AND (LAST_ACTIVITY_DATE < '2025-10-01'
OR LAST_ACTIVITY_DATE IS NULL)
ORDER BY LAST_ACTIVITY_DATE NULLS FIRST;
Notice the NULL handling: an account that has never had activity (LAST_ACTIVITY_DATE is NULL) should also be flagged. The OR clause ensures we catch those rows, and NULLS FIRST puts them at the top where they will get immediate attention.
Query 7: Employee Compensation Lookup
"Show all employees and their annual salary, formatted as currency, sorted by branch and then salary descending."
SELECT E.EMPLOYEE_NUMBER,
E.FIRST_NAME || ' ' || E.LAST_NAME AS EMPLOYEE_NAME,
E.JOB_TITLE,
E.BRANCH_ID,
'$' || TRIM(CHAR(INTEGER(E.SALARY))) AS ANNUAL_SALARY
FROM MERIDIAN.EMPLOYEES E
WHERE E.STATUS = 'A'
ORDER BY E.BRANCH_ID, E.SALARY DESC;
Formatting Note: DB2 SQL does not have built-in currency formatting functions. The approach above produces a rough approximation (e.g.,
$125000). For production reports, formatting is typically handled by the application layer or a reporting tool like Cognos or BIRT, not in SQL.
Check Your Understanding — Box 5
- Write a query that finds all transactions on account 1000000 where the amount is negative (a debit) and the channel is either 'ATM' or 'ONLINE'.
- Modify Query 3 above to show only transactions with an absolute value greater than $1,000.
- How would you find all customers whose first name starts with a vowel (A, E, I, O, or U)?
- Write a query that finds the top 3 accounts by balance, but only among checking accounts (ACCOUNT_TYPE_CODE IN ('CHK', 'CHI')).
5.12 Spaced Review — Chapters 1-4
As you settle into SQL fundamentals, let us reinforce key concepts from earlier chapters.
From Chapter 1: Why DB2?
DB2 has been in continuous development since 1983. It runs on more platforms than any other enterprise RDBMS — from z/OS mainframes handling millions of transactions per second to Linux servers running departmental applications. The SQL you are learning in this chapter works on all of them.
From Chapter 2: The Relational Model
Every SELECT query operates on relations (tables). The columns in your SELECT list are attributes, and the rows returned are tuples. When you write WHERE STATE_CODE = 'CT', you are applying a selection operation. When you list specific columns, you are applying a projection. These formal terms map directly to the SQL keywords you now know.
From Chapter 3: DB2 Architecture
When you submit a SELECT statement, DB2's optimizer evaluates multiple access paths — index scans, table scans, various join methods — and chooses the one it estimates will be fastest. The decisions the optimizer makes depend on table statistics (gathered by RUNSTATS), the indexes available, and configuration parameters. This is why the same query can perform differently after you add an index or update statistics. We ran RUNSTATS on all our tables in the Chapter 4 setup script specifically so the optimizer would have good information from the start.
From Chapter 4: Your Working Environment
You should now be comfortable connecting to the MERIDIAN database and executing SQL from either the db2 command line processor (LUW) or SPUFI/Data Studio (z/OS). If any of the queries in this chapter did not produce the expected output, go back to Chapter 4 and verify that your sample data loaded correctly. Run this quick check:
SELECT COUNT(*) AS CUSTOMER_COUNT FROM MERIDIAN.CUSTOMERS;
-- Expected: 20
SELECT COUNT(*) AS ACCOUNT_COUNT FROM MERIDIAN.ACCOUNTS;
-- Expected: 35
SELECT COUNT(*) AS TRANSACTION_COUNT FROM MERIDIAN.TRANSACTIONS;
-- Expected: approximately 95 rows
If these counts do not match, re-run sample-data-load.sql.
Spaced Review Questions
- (Chapter 1) Name two industries where DB2 for z/OS is dominant. (Banking, insurance, government, airlines, healthcare)
- (Chapter 2) What is the difference between a primary key and a unique constraint? (A primary key cannot contain NULLs; a unique constraint can allow a single NULL on LUW.)
- (Chapter 3) What component of DB2 decides whether to use an index or a table scan? (The optimizer.)
- (Chapter 4) What command collects table statistics on DB2 for LUW? (RUNSTATS.)
Summary
This chapter covered the foundational SQL skills that every DB2 practitioner must master. Let us recap the major pieces:
SELECT retrieves data from tables. Specify columns explicitly rather than using SELECT * in production code.
WHERE filters rows using comparison operators (=, <>, <, >, <=, >=), range predicates (BETWEEN), list predicates (IN), pattern matching (LIKE), and NULL testing (IS NULL, IS NOT NULL).
Logical operators (AND, OR, NOT) combine conditions. Remember that AND binds more tightly than OR — use parentheses to make your intent clear.
ORDER BY sorts results. Without it, row order is undefined. Use ASC/DESC, multiple columns, and NULLS FIRST/LAST to get exactly the order you need.
DISTINCT eliminates duplicate rows but comes with a performance cost.
Scalar functions transform individual values: string functions (UPPER, LOWER, SUBSTR, TRIM, LENGTH, CONCAT, REPLACE), numeric functions (ROUND, ABS, MOD, FLOOR, CEILING), date functions (CURRENT DATE, YEAR, MONTH, DAY, DAYS, TIMESTAMPDIFF), and conversion functions (CAST, INTEGER, DECIMAL, VARCHAR, CHAR).
CASE expressions provide conditional logic within SQL — simple CASE for value matching, searched CASE for Boolean conditions.
NULL represents unknown or missing data. It is not zero, not empty string, and not equal to itself. Use IS NULL / IS NOT NULL to test for it, COALESCE to provide defaults, and NULLIF to generate NULLs from sentinel values. Beware of NULL in NOT IN subqueries and string concatenation.
FETCH FIRST N ROWS ONLY limits result sets. Combine with OFFSET for pagination. DB2 does not support the LIMIT keyword.
Column aliases (AS) and expressions in the SELECT list let you shape and label your output.
These are the tools you will use in every SQL query you write. In Chapter 6, we will extend them by joining multiple tables together, unlocking the full power of the relational model.
What is Next
Chapter 6 introduces JOINs — the mechanism for combining data from multiple tables in a single query. You will learn INNER JOIN, LEFT/RIGHT OUTER JOIN, FULL OUTER JOIN, CROSS JOIN, and self-joins, all applied to the Meridian Bank database. The ability to join tables transforms SQL from a simple lookup tool into a language for answering arbitrarily complex business questions.
Chapter 5 of "IBM DB2: From First Query to Enterprise Architecture" Part II: SQL Mastery
Related Reading
Explore this topic in other books
IBM DB2 Embedded SQL Advanced COBOL Advanced SQL Learning COBOL Embedded SQL & DB2 Intermediate COBOL Embedded SQL Fundamentals