Case Study 1: Complex Business Logic — Loan Eligibility Determination

Background

Meridian National Bank is launching a new "Premier Business Loan" product targeted at established customers with strong financial profiles. The product management team has defined a multi-criteria eligibility framework, and the data engineering team must implement it as a SQL query that can be embedded in the loan origination system.

The eligibility criteria are:

  1. Tenure: The customer must have been with the bank for at least 3 years (DATE_OPENED on or before the date 3 years ago from today).
  2. Deposit Relationship: The customer must have a total deposit balance (sum of SAVINGS, CHECKING, and MONEY_MARKET accounts) of at least $75,000.
  3. Account Diversity: The customer must hold at least 2 different account types.
  4. Transaction Activity: The customer must have conducted at least 12 transactions in the past 6 months across all their accounts.
  5. Credit Standing: The customer must have no existing loans with a STATUS of 'DELINQUENT' or 'DEFAULT'.
  6. Branch Standing: The customer's primary branch must be in 'ACTIVE' status (not a branch being closed or consolidated).

If all six criteria are met, the customer is eligible. The query must also return a "relationship score" that sums: 1 point per year of tenure (capped at 10), 1 point per $25,000 in deposits (capped at 10), and 1 point per distinct account type (capped at 5). This score is used for prioritizing loan offers.

The Challenge

Each criterion involves a different table or combination of tables. Some criteria require aggregation, others require existence checks, and one requires a NOT EXISTS pattern. Implementing all six in a single flat query with nested subqueries would be unreadable and unmaintainable. This is a textbook case for CTEs.

Solution: CTE-Based Eligibility Engine

WITH tenured_customers AS (
    -- Criterion 1: At least 3 years of tenure
    SELECT CUSTOMER_ID,
           DATE_OPENED,
           YEAR(CURRENT DATE) - YEAR(DATE_OPENED) AS TENURE_YEARS
    FROM   CUSTOMER
    WHERE  DATE_OPENED <= CURRENT DATE - 3 YEARS
),
deposit_relationships AS (
    -- Criterion 2: Total deposits >= $75,000
    SELECT a.CUSTOMER_ID,
           SUM(a.BALANCE) AS TOTAL_DEPOSITS,
           COUNT(DISTINCT a.ACCOUNT_TYPE) AS ACCOUNT_TYPE_COUNT
    FROM   ACCOUNT a
    WHERE  a.ACCOUNT_TYPE IN ('SAVINGS', 'CHECKING', 'MONEY_MARKET')
    GROUP BY a.CUSTOMER_ID
    HAVING SUM(a.BALANCE) >= 75000
),
diverse_accounts AS (
    -- Criterion 3: At least 2 different account types
    SELECT CUSTOMER_ID
    FROM   deposit_relationships
    WHERE  ACCOUNT_TYPE_COUNT >= 2
),
active_transactors AS (
    -- Criterion 4: At least 12 transactions in past 6 months
    SELECT a.CUSTOMER_ID,
           COUNT(*) AS RECENT_TXN_COUNT
    FROM   TRANSACTION t
    JOIN   ACCOUNT a ON t.ACCOUNT_ID = a.ACCOUNT_ID
    WHERE  t.TRANSACTION_DATE >= CURRENT DATE - 6 MONTHS
    GROUP BY a.CUSTOMER_ID
    HAVING COUNT(*) >= 12
),
clean_credit AS (
    -- Criterion 5: No delinquent or defaulted loans
    SELECT DISTINCT c.CUSTOMER_ID
    FROM   CUSTOMER c
    WHERE  NOT EXISTS (
        SELECT 1
        FROM   LOAN l
        WHERE  l.CUSTOMER_ID = c.CUSTOMER_ID
          AND  l.STATUS IN ('DELINQUENT', 'DEFAULT')
    )
),
active_branch_customers AS (
    -- Criterion 6: Primary branch is active
    SELECT c.CUSTOMER_ID
    FROM   CUSTOMER c
    JOIN   BRANCH b ON c.BRANCH_ID = b.BRANCH_ID
    WHERE  b.STATUS = 'ACTIVE'
)
SELECT c.CUSTOMER_ID,
       c.FIRST_NAME,
       c.LAST_NAME,
       tc.TENURE_YEARS,
       dr.TOTAL_DEPOSITS,
       dr.ACCOUNT_TYPE_COUNT,
       at.RECENT_TXN_COUNT,
       -- Relationship score calculation
       LEAST(tc.TENURE_YEARS, 10)
       + LEAST(INTEGER(dr.TOTAL_DEPOSITS / 25000), 10)
       + LEAST(dr.ACCOUNT_TYPE_COUNT, 5)
           AS RELATIONSHIP_SCORE
FROM   CUSTOMER c
JOIN   tenured_customers tc    ON c.CUSTOMER_ID = tc.CUSTOMER_ID
JOIN   deposit_relationships dr ON c.CUSTOMER_ID = dr.CUSTOMER_ID
JOIN   diverse_accounts da     ON c.CUSTOMER_ID = da.CUSTOMER_ID
JOIN   active_transactors at   ON c.CUSTOMER_ID = at.CUSTOMER_ID
JOIN   clean_credit cc         ON c.CUSTOMER_ID = cc.CUSTOMER_ID
JOIN   active_branch_customers abc ON c.CUSTOMER_ID = abc.CUSTOMER_ID
ORDER BY RELATIONSHIP_SCORE DESC,
         dr.TOTAL_DEPOSITS DESC;

Analysis

Why CTEs Work Here

Each CTE encapsulates one business rule. The main query joins them all with INNER JOINs, which means a customer must satisfy every criterion to appear in the result. This is the AND-logic pattern: six INNER JOINs enforce six simultaneous conditions.

If the business later changes a criterion — say, lowering the deposit threshold to $50,000 — you change exactly one CTE. The rest of the query is untouched. If a new criterion is added (e.g., "customer must reside in an eligible state"), you add one new CTE and one new JOIN line.

The NOT EXISTS Pattern for Clean Credit

Criterion 5 uses NOT EXISTS rather than NOT IN. This is deliberate. If any LOAN row has a NULL CUSTOMER_ID, a NOT IN approach would silently return zero results for all customers. NOT EXISTS correctly handles NULLs by checking for the existence of matching rows rather than set membership.

Performance Considerations

The optimizer has several paths it can take:

  1. Start with the most selective CTE. If only 5% of customers have $75,000+ in deposits, the deposit_relationships CTE is the most selective. DB2's optimizer may start with this small set and probe other tables for the remaining criteria.

  2. Semi-join transformation. CTEs like diverse_accounts and active_branch_customers that produce only CUSTOMER_ID values are candidates for semi-join execution — DB2 need only verify existence, not retrieve data.

  3. Index opportunities. Key indexes for this query: - CUSTOMER(CUSTOMER_ID, DATE_OPENED) — for the tenure filter - ACCOUNT(CUSTOMER_ID, ACCOUNT_TYPE, BALANCE) — for the deposit aggregation - TRANSACTION(ACCOUNT_ID, TRANSACTION_DATE) — for the activity count - LOAN(CUSTOMER_ID, STATUS) — for the credit standing check - BRANCH(BRANCH_ID, STATUS) — for the branch status filter

Without these indexes, the query could become expensive on large data sets. With them, each CTE can use indexed access, and the joins between CTEs use CUSTOMER_ID, which is the primary key of the CUSTOMER table.

Testing Strategy

A major benefit of the CTE approach is incremental testability. You can test each criterion independently:

-- Test criterion 1 alone
WITH tenured_customers AS (
    SELECT CUSTOMER_ID, DATE_OPENED,
           YEAR(CURRENT DATE) - YEAR(DATE_OPENED) AS TENURE_YEARS
    FROM   CUSTOMER
    WHERE  DATE_OPENED <= CURRENT DATE - 3 YEARS
)
SELECT COUNT(*) AS TENURED_COUNT FROM tenured_customers;

Run each CTE standalone to verify the count and spot-check results. Then combine them incrementally, adding one JOIN at a time and watching how the result set narrows. This stepwise approach catches logic errors early and makes debugging straightforward.

Alternative: Scoring Without Hard Cutoffs

A variation replaces the hard eligibility cutoffs with a scoring model. Instead of filtering with INNER JOINs, use LEFT JOINs and CASE expressions to compute a score. Customers above a threshold score are eligible. This is more flexible but harder to explain to regulators who need deterministic yes/no criteria.

Key Takeaways

  1. CTEs turn complex eligibility logic into readable, maintainable SQL. Each business rule is one CTE. The main query is a simple multi-way join.

  2. INNER JOINs enforce AND logic. A customer must appear in every CTE to be in the result.

  3. NOT EXISTS is the safe choice for negative criteria. It handles NULLs correctly and maps to efficient anti-semi-join plans.

  4. Incremental testability is a CTE superpower. Test each rule independently before combining.

  5. Appropriate indexes are critical for multi-CTE queries. Each CTE touches different tables; each table needs indexes that support its CTE's access pattern.