Case Study 34.2: UDF Library for Financial Calculations

Background

Meridian National Bank's reporting team generates over 200 daily reports, ranging from branch-level summaries to regulatory filings. Many of these reports require the same financial calculations: compound interest, present value, loan amortization, risk scoring, and data masking. Currently, these calculations are duplicated across reports — some in SQL inline expressions, some in COBOL copybooks, and some in Excel macros applied after data extraction.

The inconsistency is more than an annoyance. A recent internal audit found that three different reports used three different formulas for calculating APY (Annual Percentage Yield), producing slightly different numbers. While the differences were small (within 0.02%), they raised regulatory concerns about data consistency.

The Challenge

Create a centralized library of financial calculation functions that: - Can be used directly in SQL queries by report developers - Produces identical results regardless of which report uses them - Handles edge cases (NULL inputs, zero denominators, negative rates) gracefully - Performs well enough to be used in queries scanning millions of rows - Is versioned and auditable for regulatory compliance

Solution Design

Function Categories

The team organized the UDF library into five categories:

  1. Time Value of Money: FN_COMPOUND_INTEREST, FN_PRESENT_VALUE, FN_FUTURE_VALUE, FN_APY, FN_MONTHLY_PAYMENT
  2. Risk Assessment: FN_RISK_SCORE, FN_DTI_RATIO, FN_COVERAGE_RATIO
  3. Data Masking: FN_MASK_ACCOUNT, FN_MASK_SSN, FN_MASK_PHONE
  4. Date Utilities: FN_IS_BUSINESS_DAY, FN_NEXT_BUSINESS_DAY, FN_BUSINESS_DAYS_BETWEEN
  5. Formatting: FN_FORMAT_CURRENCY, FN_FORMAT_PHONE, FN_FORMAT_ACCOUNT_NUM

Design Decisions

DETERMINISTIC vs. NOT DETERMINISTIC: Pure mathematical functions (FN_COMPOUND_INTEREST, FN_APY, FN_MONTHLY_PAYMENT) are marked DETERMINISTIC, allowing DB2 to cache and optimize. Functions that read tables (FN_IS_BANK_HOLIDAY, FN_CURRENT_RATE) are NOT DETERMINISTIC.

RETURNS NULL ON NULL INPUT: Applied to all mathematical functions. If any input is NULL, the output is NULL — no function invocation overhead.

CALLED ON NULL INPUT: Applied only to masking functions where NULL should produce the string 'N/A' rather than NULL.

Performance Considerations

The team discovered that using scalar UDFs in WHERE clauses of large queries caused significant performance degradation:

-- SLOW: UDF prevents index use on ACCOUNT_NUMBER
SELECT * FROM MERIDIAN.ACCOUNTS
WHERE MERIDIAN.FN_MASK_ACCOUNT(ACCOUNT_NUMBER) = 'XXXX-1234';

-- FAST: Equivalent predicate that uses the index
SELECT * FROM MERIDIAN.ACCOUNTS
WHERE RIGHT(ACCOUNT_NUMBER, 4) = '1234';

Guideline established: Use UDFs in SELECT lists and HAVING clauses (post-filter). Avoid UDFs in WHERE clauses on indexed columns.

For the risk scoring function, which was called for every row in a 3-million-row report, the team created a generated column approach:

ALTER TABLE MERIDIAN.ACCOUNTS
ADD COLUMN RISK_SCORE_CACHED INTEGER
GENERATED ALWAYS AS (MERIDIAN.FN_RISK_SCORE(BALANCE, OVERDRAFT_COUNT, DAYS_OPEN));

This pre-computes the risk score on INSERT/UPDATE, making query-time access free.

Table Function: Amortization Schedule

The most complex function was FN_AMORTIZATION_SCHEDULE, which generates a full payment schedule. Using a recursive CTE (as shown in Section 34.8.2), the function generates up to 360 rows (for a 30-year mortgage) in a single call.

Report developers use it like this:

SELECT s.*
FROM MERIDIAN.LOANS l,
     TABLE(MERIDIAN.FN_AMORTIZATION_SCHEDULE(
         l.PRINCIPAL, l.ANNUAL_INTEREST_RATE, l.TERM_MONTHS
     )) AS s
WHERE l.LOAN_TYPE = 'MORTGAGE' AND l.STATUS = 'ACTIVE';

This replaces a COBOL batch program that previously took 45 minutes to generate the same data.

Testing Approach

Each function has a corresponding set of test cases validated against an independent Excel model (maintained by the compliance team):

Function Test Cases Validation Method
FN_COMPOUND_INTEREST 15 Excel EFFECT() comparison
FN_MONTHLY_PAYMENT 12 Excel PMT() comparison
FN_APY 8 Regulatory formula verification
FN_RISK_SCORE 20 Manual calculation by risk team
FN_MASK_ACCOUNT 10 Pattern matching verification

All tests are automated in a test procedure that runs nightly and reports any discrepancies.

Results

Metric Before UDF Library After UDF Library
APY calculation variants 3 different formulas 1 canonical function
Report development time 2-3 days 0.5-1 day
Calculation audit findings 3 per quarter 0
Amortization report runtime 45 minutes (batch) 3 minutes (SQL)
Masking inconsistencies Frequent None

Lessons Learned

  1. Performance testing is essential. A UDF that works perfectly for 100 rows can bring a 10-million-row report to a halt. Test at production scale before deployment.
  2. Generated columns bridge the performance gap. Pre-computing UDF results as generated columns gives you the consistency of a function with the performance of a stored value.
  3. Versioning matters for compliance. When a regulator asks "which formula did you use for this 2024 report?", you need to point to the exact function version that was active at that time.
  4. Document the math. Every financial function includes a comment block with the formula, its source (e.g., "OCC Regulation CC, Section 229.2"), and sample calculations.

Discussion Questions

  1. Should masking functions like FN_MASK_SSN be implemented as UDFs that every developer can call, or should masking be enforced at a different layer (views, column masks)?
  2. How would you handle a regulatory change that requires modifying FN_APY? What about reports that have already been generated using the old formula?
  3. A developer proposes creating a UDF that calls an external web service for real-time exchange rates. What concerns would you raise?