Case Study 34.2: UDF Library for Financial Calculations
Background
Meridian National Bank's reporting team generates over 200 daily reports, ranging from branch-level summaries to regulatory filings. Many of these reports require the same financial calculations: compound interest, present value, loan amortization, risk scoring, and data masking. Currently, these calculations are duplicated across reports — some in SQL inline expressions, some in COBOL copybooks, and some in Excel macros applied after data extraction.
The inconsistency is more than an annoyance. A recent internal audit found that three different reports used three different formulas for calculating APY (Annual Percentage Yield), producing slightly different numbers. While the differences were small (within 0.02%), they raised regulatory concerns about data consistency.
The Challenge
Create a centralized library of financial calculation functions that: - Can be used directly in SQL queries by report developers - Produces identical results regardless of which report uses them - Handles edge cases (NULL inputs, zero denominators, negative rates) gracefully - Performs well enough to be used in queries scanning millions of rows - Is versioned and auditable for regulatory compliance
Solution Design
Function Categories
The team organized the UDF library into five categories:
- Time Value of Money:
FN_COMPOUND_INTEREST,FN_PRESENT_VALUE,FN_FUTURE_VALUE,FN_APY,FN_MONTHLY_PAYMENT - Risk Assessment:
FN_RISK_SCORE,FN_DTI_RATIO,FN_COVERAGE_RATIO - Data Masking:
FN_MASK_ACCOUNT,FN_MASK_SSN,FN_MASK_PHONE - Date Utilities:
FN_IS_BUSINESS_DAY,FN_NEXT_BUSINESS_DAY,FN_BUSINESS_DAYS_BETWEEN - Formatting:
FN_FORMAT_CURRENCY,FN_FORMAT_PHONE,FN_FORMAT_ACCOUNT_NUM
Design Decisions
DETERMINISTIC vs. NOT DETERMINISTIC: Pure mathematical functions (FN_COMPOUND_INTEREST, FN_APY, FN_MONTHLY_PAYMENT) are marked DETERMINISTIC, allowing DB2 to cache and optimize. Functions that read tables (FN_IS_BANK_HOLIDAY, FN_CURRENT_RATE) are NOT DETERMINISTIC.
RETURNS NULL ON NULL INPUT: Applied to all mathematical functions. If any input is NULL, the output is NULL — no function invocation overhead.
CALLED ON NULL INPUT: Applied only to masking functions where NULL should produce the string 'N/A' rather than NULL.
Performance Considerations
The team discovered that using scalar UDFs in WHERE clauses of large queries caused significant performance degradation:
-- SLOW: UDF prevents index use on ACCOUNT_NUMBER
SELECT * FROM MERIDIAN.ACCOUNTS
WHERE MERIDIAN.FN_MASK_ACCOUNT(ACCOUNT_NUMBER) = 'XXXX-1234';
-- FAST: Equivalent predicate that uses the index
SELECT * FROM MERIDIAN.ACCOUNTS
WHERE RIGHT(ACCOUNT_NUMBER, 4) = '1234';
Guideline established: Use UDFs in SELECT lists and HAVING clauses (post-filter). Avoid UDFs in WHERE clauses on indexed columns.
For the risk scoring function, which was called for every row in a 3-million-row report, the team created a generated column approach:
ALTER TABLE MERIDIAN.ACCOUNTS
ADD COLUMN RISK_SCORE_CACHED INTEGER
GENERATED ALWAYS AS (MERIDIAN.FN_RISK_SCORE(BALANCE, OVERDRAFT_COUNT, DAYS_OPEN));
This pre-computes the risk score on INSERT/UPDATE, making query-time access free.
Table Function: Amortization Schedule
The most complex function was FN_AMORTIZATION_SCHEDULE, which generates a full payment schedule. Using a recursive CTE (as shown in Section 34.8.2), the function generates up to 360 rows (for a 30-year mortgage) in a single call.
Report developers use it like this:
SELECT s.*
FROM MERIDIAN.LOANS l,
TABLE(MERIDIAN.FN_AMORTIZATION_SCHEDULE(
l.PRINCIPAL, l.ANNUAL_INTEREST_RATE, l.TERM_MONTHS
)) AS s
WHERE l.LOAN_TYPE = 'MORTGAGE' AND l.STATUS = 'ACTIVE';
This replaces a COBOL batch program that previously took 45 minutes to generate the same data.
Testing Approach
Each function has a corresponding set of test cases validated against an independent Excel model (maintained by the compliance team):
| Function | Test Cases | Validation Method |
|---|---|---|
| FN_COMPOUND_INTEREST | 15 | Excel EFFECT() comparison |
| FN_MONTHLY_PAYMENT | 12 | Excel PMT() comparison |
| FN_APY | 8 | Regulatory formula verification |
| FN_RISK_SCORE | 20 | Manual calculation by risk team |
| FN_MASK_ACCOUNT | 10 | Pattern matching verification |
All tests are automated in a test procedure that runs nightly and reports any discrepancies.
Results
| Metric | Before UDF Library | After UDF Library |
|---|---|---|
| APY calculation variants | 3 different formulas | 1 canonical function |
| Report development time | 2-3 days | 0.5-1 day |
| Calculation audit findings | 3 per quarter | 0 |
| Amortization report runtime | 45 minutes (batch) | 3 minutes (SQL) |
| Masking inconsistencies | Frequent | None |
Lessons Learned
- Performance testing is essential. A UDF that works perfectly for 100 rows can bring a 10-million-row report to a halt. Test at production scale before deployment.
- Generated columns bridge the performance gap. Pre-computing UDF results as generated columns gives you the consistency of a function with the performance of a stored value.
- Versioning matters for compliance. When a regulator asks "which formula did you use for this 2024 report?", you need to point to the exact function version that was active at that time.
- Document the math. Every financial function includes a comment block with the formula, its source (e.g., "OCC Regulation CC, Section 229.2"), and sample calculations.
Discussion Questions
- Should masking functions like
FN_MASK_SSNbe implemented as UDFs that every developer can call, or should masking be enforced at a different layer (views, column masks)? - How would you handle a regulatory change that requires modifying
FN_APY? What about reports that have already been generated using the old formula? - A developer proposes creating a UDF that calls an external web service for real-time exchange rates. What concerns would you raise?