Case Study: MedClaim — Testing Adjudication Logic with Edge Cases

DataField.Dev

Background

MedClaim Health Services processes approximately 500,000 insurance claims per month through their CLM-ADJUD program. Each claim passes through a pipeline of business rules: eligibility verification, benefit determination, allowed amount calculation, coordination of benefits, and payment computation. The program has over 120 distinct business rules, many of which interact in complex ways.

James Okafor, the team lead, had long relied on a set of 50 "golden" test claims — real claims from production (anonymized) that covered the most common scenarios. After each code change, a tester would submit the golden claims and compare the output to known-good baselines. This process took approximately six hours per test cycle.

The problem became acute when regulatory changes required MedClaim to implement new telehealth billing rules, modify preventive care coverage, and update coordination of benefits logic — all within a 90-day window. The six-hour test cycle was no longer sustainable.

The Challenge

The CLM-ADJUD program presented unique testing challenges:

Combinatorial explosion: With 6 eligibility conditions, 4 benefit types, 3 coordination of benefits scenarios, and 5 payment calculation paths, there were theoretically 6 × 4 × 3 × 5 = 360 distinct processing paths.
Dependent logic: The output of each processing stage became input to the next. An error in eligibility checking could cascade through benefit determination and payment calculation.
Regulatory precision: Claims processing errors were not just costly — they were potentially illegal. CMS (Centers for Medicare & Medicaid Services) audits required demonstrable accuracy.
Data complexity: A single claim record had 127 fields. Setting up test data was laborious and error-prone.
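
The path count in the first item is a plain product of the four dimensions. A minimal Python sketch makes that concrete. Python is used here only for illustration, since CLM-ADJUD itself is a COBOL program, and the labels are placeholders because the case study gives only the counts, not the names:

```python
from itertools import product

# Placeholder labels: only the counts (6, 4, 3, 5) come from the case study.
eligibility_conditions = [f"ELIG-{i}" for i in range(1, 7)]
benefit_types = [f"BEN-{i}" for i in range(1, 5)]
cob_scenarios = [f"COB-{i}" for i in range(1, 4)]
payment_paths = [f"PAY-{i}" for i in range(1, 6)]

# Each tuple is one theoretical route through the adjudication pipeline.
all_paths = list(product(eligibility_conditions, benefit_types,
                         cob_scenarios, payment_paths))

print(len(all_paths))  # 6 * 4 * 3 * 5 = 360
```

Writing one hand-built claim per path would mean 360 records of 127 fields each, which is why the team needed a way to prune the space rather than enumerate it.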

Sarah Kim's Decision Table Approach

Sarah Kim, the business analyst, proposed using decision tables to systematically identify test cases. She worked with James to decompose the adjudication pipeline into stages, then built a decision table for each stage.

For eligibility alone, the decision table identified 12 unique test scenarios: 6 binary conditions yield 2^6 = 64 theoretical combinations, but most were impossible or redundant. Benefit determination contributed another 8 scenarios, and coordination of benefits 6 more.

In total, the decision tables for all pipeline stages identified 42 critical test scenarios, compared to the 50 golden claims that had been chosen haphazardly over the years.
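
To illustrate how a decision table shrinks 64 combinations to a handful of scenarios, here is a hedged Python sketch. The six condition names and the "first failed check decides the outcome" rule are assumptions made for the example, not MedClaim's actual eligibility logic. With this toy rule the table collapses to 7 scenarios, not the 12 that Sarah's real table produced.

```python
from itertools import product

# Hypothetical condition names; the case study does not list the real ones.
CONDITIONS = [
    "member_active",
    "coverage_in_effect",
    "provider_in_network",
    "service_covered",
    "preauth_valid",
    "timely_filing",
]

def outcome(flags):
    """Assumed rule: the first failed check decides the denial reason."""
    for name, ok in zip(CONDITIONS, flags):
        if not ok:
            return f"DENY:{name}"
    return "ELIGIBLE"

# Enumerate every True/False combination, then group by resulting outcome.
combos = list(product([True, False], repeat=len(CONDITIONS)))
scenarios = {}
for flags in combos:
    scenarios.setdefault(outcome(flags), []).append(flags)

print(len(combos))     # 64 theoretical combinations
print(len(scenarios))  # 7 distinct scenarios under this toy rule
for label, members in scenarios.items():
    print(f"{label}: {len(members)} combinations, e.g. {members[0]}")
```

Each group needs only one representative test claim, because every combination in the group exercises the same rule. A real table would also mark combinations that cannot occur in production and drop them.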

The Critical Discovery

When Sarah mapped the 50 golden claims to the decision table, she found:
- 15 of the 42 critical scenarios were not covered by any golden claim
- 23 of the 50 golden claims were redundant: they tested the same scenario as another claim
- The remaining 27 golden claims each covered a unique scenario that the decision table also identified (27 of 42, or about 64% coverage)

The uncovered scenarios included:
- Coordination of benefits where the secondary payer's liability exceeded the billed amount
- A claim for a procedure that was covered but required pre-authorization that was expired (not missing, but expired)
- A claim where the member had both Medicare and commercial insurance with conflicting allowed amounts
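
The mapping exercise behind these findings amounts to counting how many golden claims land on each scenario. The Python sketch below uses synthetic identifiers (S01 to S42 for scenarios, G01 to G50 for claims) and an invented assignment chosen only so that the totals match the case study. The real mapping was done by hand against the decision tables.

```python
from collections import Counter

# Synthetic IDs: 42 decision-table scenarios and 50 golden claims.
scenarios = [f"S{n:02d}" for n in range(1, 43)]

# Invented assignment: the 50 claims cycle through the first 27 scenarios,
# so claims G28..G50 repeat scenarios already hit by G01..G23.
claim_to_scenario = {
    f"G{n:02d}": scenarios[(n - 1) % 27] for n in range(1, 51)
}

hits = Counter(claim_to_scenario.values())

uncovered = [s for s in scenarios if s not in hits]
redundant = sum(count - 1 for count in hits.values())
coverage = len(hits) / len(scenarios)

print(len(uncovered))     # 15 scenarios with no golden claim
print(redundant)          # 23 claims that duplicate another claim's scenario
print(f"{coverage:.0%}")  # 64%
```

The same three numbers fall out of any assignment in which 50 claims reach 27 of the 42 scenarios: 42 - 27 = 15 uncovered and 50 - 27 = 23 redundant.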

TDD for the Telehealth Rules

For the new telehealth billing rules, James mandated a TDD approach. Tomás Rivera, the DBA, set up a dedicated test DB2 subsystem. The team wrote tests first, then implemented the rules.

The telehealth rules required:
1. Recognizing telehealth modifier codes (GT, 95, CR)
2. Applying a "facility fee" reduction for certain telehealth services
3. Verifying that the provider was licensed in the member's state
4. Applying different copay rules for telehealth vs. in-person visits

The TDD process for just the facility fee rule:

Step 1: Sarah wrote the business rule in plain English: "When a claim has modifier GT and procedure code is in the office visit range (99201-99215), reduce the allowed amount by 15%."

Step 2: James translated this to test cases:

Test 1: Office visit with GT modifier → 15% reduction
Test 2: Office visit without GT modifier → no reduction
Test 3: Non-office visit with GT modifier → no reduction
Test 4: Boundary: procedure code 99201 with GT → reduction
Test 5: Boundary: procedure code 99215 with GT → reduction
Test 6: Just outside range: 99200 with GT → no reduction
Test 7: Just outside range: 99216 with GT → no reduction

Step 3: All tests failed (the code didn't exist yet).

Step 4: James implemented the rule. Tests 1-5 passed, but test 6 failed: the implementation compared the procedure code with >= "99200", which wrongly included 99200 in the range. The boundary test caught the error before it was ever deployed.

Step 5: James fixed the boundary condition. All 7 tests passed.
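
The seven tests and the corrected rule can be sketched in Python as follows. This illustrates the technique and is not the team's COBOL code. The function names, the 100.00 sample allowed amount, the non-office-visit code 90834, and rounding to cents are all assumptions made for the example.

```python
from decimal import Decimal

REDUCTION = Decimal("0.15")
LOW, HIGH = 99201, 99215  # inclusive office visit range from the rule

def in_office_visit_range(code):
    # Non-numeric codes are treated as outside the range (assumption).
    return code.isdigit() and LOW <= int(code) <= HIGH

def adjusted_allowed(code, modifiers, allowed):
    """Reduce the allowed amount by 15% for GT office visits."""
    if "GT" in modifiers and in_office_visit_range(code):
        # Rounding to cents is an assumption; the rule does not specify it.
        return (allowed * (1 - REDUCTION)).quantize(Decimal("0.01"))
    return allowed

def first_attempt_in_range(code):
    # Mirrors the Step 4 bug: lower bound of "99200" instead of "99201".
    return "99200" <= code <= "99215"

FULL, REDUCED = Decimal("100.00"), Decimal("85.00")
CASES = [
    ("office visit with GT", "99213", {"GT"}, REDUCED),
    ("office visit without GT", "99213", set(), FULL),
    ("non-office visit with GT", "90834", {"GT"}, FULL),
    ("lower boundary", "99201", {"GT"}, REDUCED),
    ("upper boundary", "99215", {"GT"}, REDUCED),
    ("just below range", "99200", {"GT"}, FULL),
    ("just above range", "99216", {"GT"}, FULL),
]

for number, (label, code, mods, expected) in enumerate(CASES, start=1):
    actual = adjusted_allowed(code, mods, FULL)
    assert actual == expected, f"Test {number} ({label}): got {actual}"

print(f"{len(CASES)} tests passed")
print(first_attempt_in_range("99200"))  # True: the defect test 6 exposed
```

Keeping the cases in a table means that a new boundary or modifier costs one extra row, not a new test routine.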

Results

Metric	Before	After
Test scenarios	50 (haphazard)	42 (systematic) + 28 (telehealth)
Coverage of critical paths	64%	100% of decision table scenarios
Test execution time	6 hours (manual)	4 minutes (automated)
Defects found pre-deployment	~2 per release	7 found during test writing
CMS audit findings	3 (previous year)	0 (following year)
Regulatory change deployment	90 days (regulatory window)	Completed in 67 days

The Ripple Effect

The success of the CLM-ADJUD testing initiative had organizational consequences:

Other teams adopted the approach. The payment processing team (CLM-PAY) and provider reporting team (RPT-PROVIDER) both began building test suites.
Sarah Kim's role expanded. She became the "test architect," responsible for maintaining decision tables across all MedClaim programs.
Hiring criteria changed. MedClaim began listing "experience with automated testing" in their COBOL developer job postings — a first for the organization.
The golden claims evolved. Rather than being discarded, the original 50 golden claims became an integration test suite that complemented the new unit tests.

Discussion Questions

Why were 23 of the 50 golden claims redundant? What does this tell you about the limitations of ad-hoc test selection?
The TDD approach caught a boundary condition error (test 6) before deployment. Would this error likely have been caught by the golden claims? Why or why not?
Sarah Kim, a business analyst rather than a developer, played a central role in test design. Why is domain expertise at least as important as technical expertise in testing?
MedClaim's CMS audit findings dropped from 3 to 0 after implementing automated testing. What is the connection between unit testing and regulatory compliance?