Case Study 1: Batch Window Optimization
Background
Meridian National Bank runs a nightly batch cycle between 11:00 PM and 5:00 AM — a six-hour window. Within this window, the following jobs must complete in sequence:
- Transaction posting (ETL from branch feeds): ~45 minutes
- Interest calculation for all savings and CD accounts: ~60 minutes
- Overdraft detection and fee assessment: ~30 minutes
- Account balance reconciliation: ~90 minutes
- Regulatory reporting extracts: ~40 minutes
- Backup: ~60 minutes
Total: approximately 5 hours and 25 minutes — leaving only 35 minutes of slack. As the bank has grown from 200,000 to 500,000 customers and transaction volumes have doubled, the batch window is being breached two to three nights per week. The backup starts late, and online processing is delayed in the morning.
The DBA team has been asked to reduce batch elapsed time by 40% without hardware upgrades.
The Problem Queries
Job 2: Interest Calculation (60 minutes)
-- Current interest calculation: runs for each eligible account
SELECT a.account_id,
a.current_balance,
p.interest_rate,
a.current_balance * p.interest_rate / 365 AS daily_interest
FROM ACCOUNTS a,
PRODUCTS p
WHERE a.product_id = p.product_id
AND a.status = 'A'
AND a.account_type IN ('SAV', 'CD', 'MMA')
AND YEAR(a.last_interest_date) = YEAR(CURRENT DATE)
AND MONTH(a.last_interest_date) < MONTH(CURRENT DATE)
ORDER BY a.branch_id, a.account_id;
Issues identified:
- Old-style comma join (not a performance issue per se, but obscures intent)
- YEAR() and MONTH() on last_interest_date are non-sargable
- ORDER BY requires a sort since no supporting index exists
- No EXPLAIN has ever been run; no one knows the actual access path
Job 3: Overdraft Detection (30 minutes)
-- Find accounts with negative available balance
SELECT a.account_id, a.account_number, a.current_balance,
c.customer_name, c.phone, c.email,
(SELECT SUM(h.hold_amount) FROM HOLDS h
WHERE h.account_id = a.account_id AND h.status = 'A') AS total_holds
FROM ACCOUNTS a
JOIN CUSTOMERS c ON c.customer_id = a.customer_id
WHERE a.account_type = 'CHK'
AND a.current_balance - COALESCE(
(SELECT SUM(h.hold_amount) FROM HOLDS h
WHERE h.account_id = a.account_id AND h.status = 'A'), 0) < 0;
Issues identified: - Correlated subquery in WHERE clause executes for every checking account - Same correlated subquery appears twice (SELECT list and WHERE clause) - No index on HOLDS(account_id, status)
Job 4: Account Reconciliation (90 minutes — the biggest target)
-- Reconcile: sum of posted transactions should match balance change
SELECT a.account_id,
a.account_number,
a.current_balance,
a.opening_balance,
(SELECT SUM(t.txn_amount)
FROM TRANSACTIONS t
WHERE t.account_id = a.account_id
AND YEAR(t.txn_date) = YEAR(CURRENT DATE)
AND MONTH(t.txn_date) = MONTH(CURRENT DATE)
AND DAY(t.txn_date) = DAY(CURRENT DATE)) AS today_net,
a.current_balance - a.opening_balance AS expected_change
FROM ACCOUNTS a
WHERE a.status = 'A'
AND ABS(
a.current_balance - a.opening_balance -
COALESCE((SELECT SUM(t.txn_amount)
FROM TRANSACTIONS t
WHERE t.account_id = a.account_id
AND YEAR(t.txn_date) = YEAR(CURRENT DATE)
AND MONTH(t.txn_date) = MONTH(CURRENT DATE)
AND DAY(t.txn_date) = DAY(CURRENT DATE)), 0)
) > 0.01;
Issues identified:
- Three date functions (YEAR, MONTH, DAY) make the transaction predicate non-sargable
- Correlated subquery appears twice (SELECT list and WHERE clause)
- ABS() in WHERE clause prevents sargability on the expression
- For 2 million active accounts, the subquery executes 2 million times
Your Task
Part 1: Rewrite Each Query
Rewrite all three queries to eliminate performance problems. For each rewrite: - Replace non-sargable predicates with sargable equivalents - Replace correlated subqueries with JOINs or derived tables - Add appropriate index recommendations - Preserve the exact same result set
Part 2: Estimate the Impact
Based on the pattern of improvements we have seen in this chapter (e.g., sargable rewrites typically yield 100-300x getpage reduction), estimate the new elapsed time for each job.
Part 3: New Batch Schedule
Given your estimated improvements, create a revised batch schedule. Calculate the new total time and the slack in the batch window.
Part 4: Index Recommendations
List every index you would create to support the rewritten queries. For each index, specify: - Table and columns (in order) - Whether it supports a specific predicate, join, or sort - The estimated size impact
Part 5: Monitoring
Write monitoring queries that the operations team can run to verify the batch is performing as expected. Include queries that check: - Elapsed time per job - Getpage counts per job - Any query that exceeds a threshold
Solution Outline
Job 2 Rewrite: Interest Calculation
-- Rewritten: sargable date predicate, explicit JOIN
SELECT a.account_id,
a.current_balance,
p.interest_rate,
a.current_balance * p.interest_rate / 365 AS daily_interest
FROM ACCOUNTS a
JOIN PRODUCTS p ON p.product_id = a.product_id
WHERE a.status = 'A'
AND a.account_type IN ('SAV', 'CD', 'MMA')
AND a.last_interest_date >= :first_of_current_month -- e.g., '2025-06-01'
AND a.last_interest_date < :first_of_current_month -- Wait: re-read the original logic
-- Original: YEAR matches current year AND MONTH is less than current month
-- This means: last interest was this year but in a prior month
-- Rewrite: last_interest_date >= Jan 1 of current year
-- AND last_interest_date < first day of current month
AND a.last_interest_date >= :jan_first_current_year
AND a.last_interest_date < :first_of_current_month
ORDER BY a.branch_id, a.account_id;
-- Supporting index:
CREATE INDEX IX_ACCT_STATUS_TYPE_INTDATE
ON ACCOUNTS(status, account_type, last_interest_date, branch_id, account_id)
INCLUDE (current_balance, product_id);
-- This index supports the WHERE predicates and the ORDER BY (after filtering)
Estimated improvement: From 60 minutes to ~8 minutes (sargable predicates + index support).
Job 3 Rewrite: Overdraft Detection
-- Rewritten: single derived table for holds, no repeated subquery
SELECT a.account_id, a.account_number, a.current_balance,
c.customer_name, c.phone, c.email,
COALESCE(h.total_holds, 0) AS total_holds
FROM ACCOUNTS a
JOIN CUSTOMERS c ON c.customer_id = a.customer_id
LEFT JOIN (
SELECT account_id, SUM(hold_amount) AS total_holds
FROM HOLDS
WHERE status = 'A'
GROUP BY account_id
) h ON h.account_id = a.account_id
WHERE a.account_type = 'CHK'
AND a.current_balance - COALESCE(h.total_holds, 0) < 0;
-- Supporting index:
CREATE INDEX IX_HOLDS_ACCT_STATUS ON HOLDS(account_id, status) INCLUDE (hold_amount);
Estimated improvement: From 30 minutes to ~5 minutes.
Job 4 Rewrite: Account Reconciliation
-- Rewritten: sargable date range, single derived table
SELECT a.account_id,
a.account_number,
a.current_balance,
a.opening_balance,
COALESCE(ts.today_net, 0) AS today_net,
a.current_balance - a.opening_balance AS expected_change
FROM ACCOUNTS a
LEFT JOIN (
SELECT account_id, SUM(txn_amount) AS today_net
FROM TRANSACTIONS
WHERE txn_date >= CURRENT DATE
AND txn_date < CURRENT DATE + 1 DAY
GROUP BY account_id
) ts ON ts.account_id = a.account_id
WHERE a.status = 'A'
AND ABS(a.current_balance - a.opening_balance - COALESCE(ts.today_net, 0)) > 0.01;
-- Supporting index (if not already exists):
CREATE INDEX IX_TXN_DATE_ACCT ON TRANSACTIONS(txn_date, account_id) INCLUDE (txn_amount);
Estimated improvement: From 90 minutes to ~12 minutes.
Revised Batch Schedule
| Job | Original | Tuned | Savings |
|---|---|---|---|
| 1. Transaction posting | 45 min | 45 min (no SQL change) | 0 min |
| 2. Interest calculation | 60 min | 8 min | 52 min |
| 3. Overdraft detection | 30 min | 5 min | 25 min |
| 4. Reconciliation | 90 min | 12 min | 78 min |
| 5. Regulatory extracts | 40 min | 40 min (no SQL change) | 0 min |
| 6. Backup | 60 min | 60 min (no change) | 0 min |
| Total | 5h 25m | 2h 50m | 2h 35m |
New slack: 3 hours and 10 minutes — a 48% reduction in total batch time, exceeding the 40% target.