Case Study 1: Batch Window Optimization

Background

Meridian National Bank runs a nightly batch cycle between 11:00 PM and 5:00 AM — a six-hour window. Within this window, the following jobs must complete in sequence:

  1. Transaction posting (ETL from branch feeds): ~45 minutes
  2. Interest calculation for all savings and CD accounts: ~60 minutes
  3. Overdraft detection and fee assessment: ~30 minutes
  4. Account balance reconciliation: ~90 minutes
  5. Regulatory reporting extracts: ~40 minutes
  6. Backup: ~60 minutes

Total: approximately 5 hours and 25 minutes — leaving only 35 minutes of slack. As the bank has grown from 200,000 to 500,000 customers and transaction volumes have doubled, the batch window is being breached two to three nights per week. The backup starts late, and online processing is delayed in the morning.

The DBA team has been asked to reduce batch elapsed time by 40% without hardware upgrades.

The Problem Queries

Job 2: Interest Calculation (60 minutes)

-- Current interest calculation: runs for each eligible account
SELECT a.account_id,
       a.current_balance,
       p.interest_rate,
       a.current_balance * p.interest_rate / 365 AS daily_interest
FROM   ACCOUNTS a,
       PRODUCTS p
WHERE  a.product_id = p.product_id
  AND  a.status = 'A'
  AND  a.account_type IN ('SAV', 'CD', 'MMA')
  AND  YEAR(a.last_interest_date) = YEAR(CURRENT DATE)
  AND  MONTH(a.last_interest_date) < MONTH(CURRENT DATE)
ORDER BY a.branch_id, a.account_id;

Issues identified: - Old-style comma join (not a performance issue per se, but obscures intent) - YEAR() and MONTH() on last_interest_date are non-sargable - ORDER BY requires a sort since no supporting index exists - No EXPLAIN has ever been run; no one knows the actual access path

Job 3: Overdraft Detection (30 minutes)

-- Find accounts with negative available balance
SELECT a.account_id, a.account_number, a.current_balance,
       c.customer_name, c.phone, c.email,
       (SELECT SUM(h.hold_amount) FROM HOLDS h
        WHERE h.account_id = a.account_id AND h.status = 'A') AS total_holds
FROM   ACCOUNTS a
JOIN   CUSTOMERS c ON c.customer_id = a.customer_id
WHERE  a.account_type = 'CHK'
  AND  a.current_balance - COALESCE(
       (SELECT SUM(h.hold_amount) FROM HOLDS h
        WHERE h.account_id = a.account_id AND h.status = 'A'), 0) < 0;

Issues identified: - Correlated subquery in WHERE clause executes for every checking account - Same correlated subquery appears twice (SELECT list and WHERE clause) - No index on HOLDS(account_id, status)

Job 4: Account Reconciliation (90 minutes — the biggest target)

-- Reconcile: sum of posted transactions should match balance change
SELECT a.account_id,
       a.account_number,
       a.current_balance,
       a.opening_balance,
       (SELECT SUM(t.txn_amount)
        FROM   TRANSACTIONS t
        WHERE  t.account_id = a.account_id
          AND  YEAR(t.txn_date) = YEAR(CURRENT DATE)
          AND  MONTH(t.txn_date) = MONTH(CURRENT DATE)
          AND  DAY(t.txn_date) = DAY(CURRENT DATE)) AS today_net,
       a.current_balance - a.opening_balance AS expected_change
FROM   ACCOUNTS a
WHERE  a.status = 'A'
  AND  ABS(
       a.current_balance - a.opening_balance -
       COALESCE((SELECT SUM(t.txn_amount)
                 FROM   TRANSACTIONS t
                 WHERE  t.account_id = a.account_id
                   AND  YEAR(t.txn_date) = YEAR(CURRENT DATE)
                   AND  MONTH(t.txn_date) = MONTH(CURRENT DATE)
                   AND  DAY(t.txn_date) = DAY(CURRENT DATE)), 0)
       ) > 0.01;

Issues identified: - Three date functions (YEAR, MONTH, DAY) make the transaction predicate non-sargable - Correlated subquery appears twice (SELECT list and WHERE clause) - ABS() in WHERE clause prevents sargability on the expression - For 2 million active accounts, the subquery executes 2 million times

Your Task

Part 1: Rewrite Each Query

Rewrite all three queries to eliminate performance problems. For each rewrite: - Replace non-sargable predicates with sargable equivalents - Replace correlated subqueries with JOINs or derived tables - Add appropriate index recommendations - Preserve the exact same result set

Part 2: Estimate the Impact

Based on the pattern of improvements we have seen in this chapter (e.g., sargable rewrites typically yield 100-300x getpage reduction), estimate the new elapsed time for each job.

Part 3: New Batch Schedule

Given your estimated improvements, create a revised batch schedule. Calculate the new total time and the slack in the batch window.

Part 4: Index Recommendations

List every index you would create to support the rewritten queries. For each index, specify: - Table and columns (in order) - Whether it supports a specific predicate, join, or sort - The estimated size impact

Part 5: Monitoring

Write monitoring queries that the operations team can run to verify the batch is performing as expected. Include queries that check: - Elapsed time per job - Getpage counts per job - Any query that exceeds a threshold

Solution Outline

Job 2 Rewrite: Interest Calculation

-- Rewritten: sargable date predicate, explicit JOIN
SELECT a.account_id,
       a.current_balance,
       p.interest_rate,
       a.current_balance * p.interest_rate / 365 AS daily_interest
FROM   ACCOUNTS a
JOIN   PRODUCTS p ON p.product_id = a.product_id
WHERE  a.status = 'A'
  AND  a.account_type IN ('SAV', 'CD', 'MMA')
  AND  a.last_interest_date >= :first_of_current_month  -- e.g., '2025-06-01'
  AND  a.last_interest_date <  :first_of_current_month   -- Wait: re-read the original logic
  -- Original: YEAR matches current year AND MONTH is less than current month
  -- This means: last interest was this year but in a prior month
  -- Rewrite: last_interest_date >= Jan 1 of current year
  --      AND last_interest_date < first day of current month
  AND  a.last_interest_date >= :jan_first_current_year
  AND  a.last_interest_date <  :first_of_current_month
ORDER BY a.branch_id, a.account_id;

-- Supporting index:
CREATE INDEX IX_ACCT_STATUS_TYPE_INTDATE
    ON ACCOUNTS(status, account_type, last_interest_date, branch_id, account_id)
    INCLUDE (current_balance, product_id);
-- This index supports the WHERE predicates and the ORDER BY (after filtering)

Estimated improvement: From 60 minutes to ~8 minutes (sargable predicates + index support).

Job 3 Rewrite: Overdraft Detection

-- Rewritten: single derived table for holds, no repeated subquery
SELECT a.account_id, a.account_number, a.current_balance,
       c.customer_name, c.phone, c.email,
       COALESCE(h.total_holds, 0) AS total_holds
FROM   ACCOUNTS a
JOIN   CUSTOMERS c ON c.customer_id = a.customer_id
LEFT JOIN (
    SELECT account_id, SUM(hold_amount) AS total_holds
    FROM   HOLDS
    WHERE  status = 'A'
    GROUP BY account_id
) h ON h.account_id = a.account_id
WHERE  a.account_type = 'CHK'
  AND  a.current_balance - COALESCE(h.total_holds, 0) < 0;

-- Supporting index:
CREATE INDEX IX_HOLDS_ACCT_STATUS ON HOLDS(account_id, status) INCLUDE (hold_amount);

Estimated improvement: From 30 minutes to ~5 minutes.

Job 4 Rewrite: Account Reconciliation

-- Rewritten: sargable date range, single derived table
SELECT a.account_id,
       a.account_number,
       a.current_balance,
       a.opening_balance,
       COALESCE(ts.today_net, 0) AS today_net,
       a.current_balance - a.opening_balance AS expected_change
FROM   ACCOUNTS a
LEFT JOIN (
    SELECT account_id, SUM(txn_amount) AS today_net
    FROM   TRANSACTIONS
    WHERE  txn_date >= CURRENT DATE
      AND  txn_date <  CURRENT DATE + 1 DAY
    GROUP BY account_id
) ts ON ts.account_id = a.account_id
WHERE  a.status = 'A'
  AND  ABS(a.current_balance - a.opening_balance - COALESCE(ts.today_net, 0)) > 0.01;

-- Supporting index (if not already exists):
CREATE INDEX IX_TXN_DATE_ACCT ON TRANSACTIONS(txn_date, account_id) INCLUDE (txn_amount);

Estimated improvement: From 90 minutes to ~12 minutes.

Revised Batch Schedule

Job Original Tuned Savings
1. Transaction posting 45 min 45 min (no SQL change) 0 min
2. Interest calculation 60 min 8 min 52 min
3. Overdraft detection 30 min 5 min 25 min
4. Reconciliation 90 min 12 min 78 min
5. Regulatory extracts 40 min 40 min (no SQL change) 0 min
6. Backup 60 min 60 min (no change) 0 min
Total 5h 25m 2h 50m 2h 35m

New slack: 3 hours and 10 minutes — a 48% reduction in total batch time, exceeding the 40% target.