Case Study 2: Insurance Claims Schema Design

DataField.Dev

Case Study 2: Insurance Claims Schema Design

Designing a Logical Model for ClaimShield Insurance

Background

ClaimShield Insurance is a mid-size property and casualty insurance company processing 200,000 claims per year across three product lines: auto insurance, homeowner's insurance, and commercial property insurance. The company is replacing a 15-year-old claims management system with a modern platform backed by IBM DB2.

You have been hired as the data modeling consultant. Your task: design the logical data model for the claims management domain. The model must support claim intake, investigation, adjustment, payment, and regulatory reporting.

The Legacy System

The existing system stored claims as hierarchical records in indexed flat files on an AS/400. A single claim "record" was actually spread across four files:

CLMHDR — Claim header: claim number, policy number, date of loss, status, adjuster code
CLMPTY — Parties: up to 8 parties per claim stored as repeating groups (PARTY1_NAME, PARTY1_ROLE, PARTY2_NAME, PARTY2_ROLE, ...)
CLMPAY — Payments: up to 30 payment slots per claim
CLMNOT — Notes: free-text adjuster notes, limited to 10 entries

The problems were textbook First Normal Form violations at industrial scale. CLMPTY stored up to 8 parties using positional fields. If a claim involved 9 parties — which happened regularly in multi-vehicle accidents — the system could not record the ninth. Payment history was similarly constrained: long-running litigation claims frequently exceeded the 30-payment limit, forcing adjusters to create "continuation claims" linked by free-text notes.

Requirements Gathering

Six weeks of stakeholder interviews produced these core data requirements:

No arbitrary limits. A claim can have any number of parties, payments, notes, or documents.
Temporal tracking. Every status change, payment, and reserve adjustment must be tracked with who, when, and why.
Multi-coverage support. A single loss event can trigger claims under multiple coverages (a house fire involves dwelling coverage, personal property coverage, and additional living expense coverage). Each coverage has independent reserves and payments.
Party flexibility. The same attorney might represent the claimant on one claim and the insured on another. The same body shop might be a vendor on hundreds of claims.
Integration with policy data. The claims schema must reference — not duplicate — policy data.
Document management. Track document metadata (the documents themselves are stored in a content management system).
Fraud detection readiness. The schema must support pattern-matching queries to identify suspicious claims networks.

The Logical Model

Core Entities

The model was designed around a clear separation between the loss event, the claim, and the financial tracking.

LOSS_EVENT — The real-world event that caused the loss. | Attribute | Type | Notes | |-----------|------|-------| | LOSS_EVENT_ID | Integer (PK) | | | EVENT_DATE | Date | | | EVENT_TYPE | Code | COLLISION, WEATHER, FIRE, THEFT, VANDALISM, LIABILITY | | DESCRIPTION | Varchar(2000) | | | LOCATION_ADDRESS | Varchar(200) | | | LOCATION_STATE | Char(2) | | | POLICE_REPORT_NUMBER | Varchar(30) | | | CATASTROPHE_CODE | Varchar(10) | For named storms, wildfires, etc. |

Design decision: LOSS_EVENT is separate from CLAIM because one event can generate multiple claims. A storm that damages both a house and a car creates two claims (one against the homeowner's policy, one against the auto policy) but one loss event. Without this entity, correlating related claims requires string matching on descriptions.

CLAIM — A formal request for payment filed against a specific policy. | Attribute | Type | Notes | |-----------|------|-------| | CLAIM_ID | Integer (PK) | | | CLAIM_NUMBER | Varchar(20) (unique) | Customer-facing | | POLICY_ID | FK to POLICY | References the policy system | | LOSS_EVENT_ID | FK to LOSS_EVENT | | | CLAIMANT_ID | FK to PARTY | May differ from policyholder | | DATE_REPORTED | Date | | | ESTIMATED_AMOUNT | Decimal(15,2) | Initial estimate | | APPROVED_AMOUNT | Decimal(15,2) | After adjustment | | CURRENT_STATUS | Code | FILED, INVESTIGATING, ADJUSTED, APPROVED, DENIED, CLOSED | | ASSIGNED_ADJUSTER_ID | FK to EMPLOYEE | | | PRIORITY | Code | LOW, MEDIUM, HIGH, CRITICAL | | CLOSURE_DATE | Date | |

CLAIM_COVERAGE — The intersection between a claim and a policy coverage. This was the most consequential design decision. | Attribute | Type | Notes | |-----------|------|-------| | CLAIM_COVERAGE_ID | Integer (PK) | | | CLAIM_ID | FK to CLAIM | | | COVERAGE_CODE | FK to COVERAGE_TYPE | | | RESERVE_AMOUNT | Decimal(15,2) | Current reserve | | PAID_AMOUNT | Decimal(15,2) | Total paid to date | | DEDUCTIBLE_AMOUNT | Decimal(10,2) | | | COVERAGE_LIMIT | Decimal(15,2) | | | STATUS | Code | |

When an adjuster sets a reserve of $15,000 on dwelling coverage and $8,000 on personal property coverage for the same fire claim, those are two rows in CLAIM_COVERAGE with their own payment and reserve histories. The legacy system had no way to make this distinction, which had been a persistent audit finding.

Party Model

PARTY — Any person or organization that participates in the insurance ecosystem. | Attribute | Type | Notes | |-----------|------|-------| | PARTY_ID | Integer (PK) | | | PARTY_TYPE | Code | INDIVIDUAL, ORGANIZATION | | TAX_ID_LAST_FOUR | Char(4) | | | PRIMARY_PHONE | Varchar(20) | | | PRIMARY_EMAIL | Varchar(200) | | | ADDRESS_LINE1 | Varchar(200) | | | CITY | Varchar(100) | | | STATE_CODE | Char(2) | | | ZIP_CODE | Varchar(10) | |

CLAIM_PARTY — Junction table: parties play roles on claims. | Attribute | Type | Notes | |-----------|------|-------| | CLAIM_PARTY_ID | Integer (PK) | | | CLAIM_ID | FK to CLAIM | | | PARTY_ID | FK to PARTY | | | ROLE_CODE | Code | INSURED, CLAIMANT, WITNESS, ATTORNEY, VENDOR, MEDICAL_PROVIDER | | IS_PRIMARY | Boolean | | | EFFECTIVE_DATE | Date | |

The same person can appear with different roles across different claims. A body shop that is a VENDOR on hundreds of claims has one PARTY row and hundreds of CLAIM_PARTY rows.

Financial Tracking

CLAIM_PAYMENT — Every payment issued or recovery received. | Attribute | Type | Notes | |-----------|------|-------| | PAYMENT_ID | Integer (PK) | | | CLAIM_COVERAGE_ID | FK to CLAIM_COVERAGE | Payments are coverage-level | | PAYEE_PARTY_ID | FK to PARTY | | | PAYMENT_TYPE | Code | INDEMNITY, EXPENSE, SUBROGATION_RECOVERY, SALVAGE | | AMOUNT | Decimal(15,2) | | | PAYMENT_DATE | Date | | | PAYMENT_METHOD | Code | CHECK, ACH, WIRE | | CHECK_NUMBER | Varchar(20) | | | STATUS | Code | PENDING, ISSUED, CLEARED, VOIDED |

Critical design decision: Payments are tied to CLAIM_COVERAGE, not directly to CLAIM. This enables accurate coverage-level financial reporting. A $5,000 payment under dwelling coverage reduces the dwelling reserve, not the personal property reserve.

CLAIM_RESERVE_HISTORY — Temporal record of every reserve change. | Attribute | Type | Notes | |-----------|------|-------| | RESERVE_HISTORY_ID | Integer (PK) | | | CLAIM_COVERAGE_ID | FK to CLAIM_COVERAGE | | | PREVIOUS_AMOUNT | Decimal(15,2) | | | NEW_AMOUNT | Decimal(15,2) | | | CHANGE_DATE | Timestamp | | | CHANGED_BY | FK to EMPLOYEE | | | REASON_CODE | Code | | | NOTES | Varchar(1000) | |

Workflow and Audit

CLAIM_STATUS_HISTORY — Every status transition with accountability. | Attribute | Type | Notes | |-----------|------|-------| | STATUS_HISTORY_ID | Integer (PK) | | | CLAIM_ID | FK to CLAIM | | | PREVIOUS_STATUS | Code | | | NEW_STATUS | Code | | | TRANSITION_DATE | Timestamp | | | CHANGED_BY | FK to EMPLOYEE | | | REASON_CODE | Code | |

CLAIM_DOCUMENT — Metadata for photos, reports, estimates. | Attribute | Type | Notes | |-----------|------|-------| | DOCUMENT_ID | Integer (PK) | | | CLAIM_ID | FK to CLAIM | | | DOCUMENT_TYPE | Code | PHOTO, POLICE_REPORT, MEDICAL_RECORD, ESTIMATE, INVOICE | | FILE_NAME | Varchar(200) | | | STORAGE_REFERENCE_ID | Varchar(100) | Pointer to content management system | | UPLOAD_DATE | Timestamp | | | UPLOADED_BY | FK to EMPLOYEE | |

FRAUD_INVESTIGATION — Parallel investigation process for flagged claims. | Attribute | Type | Notes | |-----------|------|-------| | INVESTIGATION_ID | Integer (PK) | | | CLAIM_ID | FK to CLAIM | | | OPENED_DATE | Date | | | INVESTIGATOR_ID | FK to EMPLOYEE | | | FRAUD_TYPE | Code | STAGED_ACCIDENT, INFLATED_CLAIM, ARSON, PHANTOM_VEHICLE | | OUTCOME | Code | OPEN, CONFIRMED_FRAUD, CLEARED, REFERRED_TO_LAW_ENFORCEMENT | | CLOSED_DATE | Date | | | SAVINGS_AMOUNT | Decimal(15,2) | Amount saved by detecting fraud |

Reference Tables

LOSS_TYPE — Categorization of losses
COVERAGE_TYPE — Definitions of coverage categories
PAYMENT_TYPE — Indemnity, expense, subrogation recovery, salvage
DOCUMENT_TYPE — Police report, medical record, estimate, photo
NOTE_TYPE — General, investigation, legal, medical, supervisory

Total: 19 entities (7 core, 6 supporting, 6 reference)

Key Design Decisions

Decision 1: LOSS_EVENT as an Independent Entity

A loss event is distinct from a claim. Separating them enables: - Correlation of related claims from the same event (multi-policy, multi-vehicle) - Catastrophe tracking (all claims from Hurricane X, regardless of policy type) - Prevention of duplicate claim filing for the same event

Decision 2: Coverage-Level Financial Tracking

Tying payments and reserves to CLAIM_COVERAGE rather than directly to CLAIM was the decision that drew the most pushback. "It is more joins," the development team argued. The answer was normalization discipline: a payment belongs to a specific coverage, and that coverage has its own lifecycle. The additional join was negligible — CLAIM_COVERAGE typically has 1-3 rows per claim, and a clustered index on CLAIM_ID resolves the join with a single index probe.

Decision 3: Party-Role Pattern

Rather than duplicating party information in each claim, we created a shared PARTY table with role-based junction (CLAIM_PARTY). This enables fraud detection: queries can identify networks of claimants, attorneys, and medical providers appearing together across multiple claims.

Decision 4: Immutable History Tables

CLAIM_STATUS_HISTORY and CLAIM_RESERVE_HISTORY are append-only. No UPDATE or DELETE is permitted. This satisfies regulatory audit requirements and enables precise reconstruction of any claim's state at any point in time.

Normalization Validation

Issue found: The initial CLAIM design included POLICYHOLDER_NAME and POLICY_NUMBER. Both are transitive dependencies (they depend on POLICY_ID). Removed in favor of JOINs to the policy system.

Issue found: CLAIM_PAYMENT initially included CLAIM_NUMBER. Transitive dependency through CLAIM_COVERAGE_ID -> CLAIM_ID. Removed.

All entities pass 1NF, 2NF, and 3NF after corrections.

Comparison with the Meridian Bank Model

Pattern	Meridian Bank	ClaimShield Insurance
Supertype/subtype	CUSTOMER, ACCOUNT	PARTY, COVERED_ITEM
Junction table	ACCOUNT_HOLDER	CLAIM_PARTY
Effective dating	FEE_SCHEDULE, INTEREST_RATE_SCHEDULE	CLAIM_RESERVE_HISTORY
Status tracking	PENDING_TRANSACTION + TRANSACTION	CLAIM_STATUS_HISTORY + CURRENT_STATUS
Audit trail	AUDIT_LOG	CLAIM_STATUS_HISTORY, CLAIM_RESERVE_HISTORY
Regulatory	CTR_FILING, ALERT	FRAUD_INVESTIGATION

The patterns are universal. Once you learn to recognize them, you can model any industry.

Results

After migrating 15 years of claims history (3.2 million claims, 18 million payment records):

Zero arbitrary limits. The most party-heavy claim had 23 parties. The most payment-heavy had 847 payments over a 12-year litigation case.
Reporting time reduced by 90%. The monthly loss development report dropped from a 4-hour batch job to a 15-minute SQL query.
Fraud detection enabled. Pattern-matching queries identified suspicious claims networks in the first month, flagging $1.8 million in potentially fraudulent claims.

Discussion Questions

Explain why tying payments to CLAIM_COVERAGE rather than directly to CLAIM is important for actuarial accuracy. What specific calculations would produce incorrect results under the flattened approach?
The model uses a foreign key from CLAIM.POLICY_ID to a table in a separate system. Can DB2 enforce this foreign key constraint across subsystems? If not, how would you ensure data consistency?
The CLAIM entity has both ESTIMATED_AMOUNT and APPROVED_AMOUNT. Should the history of estimate revisions be tracked? If so, design the entity.
A business analyst asks: "Can we calculate average claim processing time by adjuster?" Trace the query path through the model and identify the required JOINs.
The model does not include a SUBROGATION entity (recovery of payments from at-fault third parties). Design this entity and show how it relates to CLAIM and CLAIM_PAYMENT.

Return to Chapter 13 | Continue to Key Takeaways