Case Study 2: Meridian Financial — Saying No to an Unfair Model

Context

Meridian Financial's credit scoring system — the XGBoost ensemble with 500 trees and 200 features, encountered throughout Chapters 24, 28, 29, 31, 34, and 35 — processes 15,000 applications per day. The system has been stable for eight months: AUC is 0.847 on the monthly validation sample, the adverse impact ratio (AIR) for all protected groups exceeds the regulatory threshold of 0.80, and the quarterly model risk review (SR 11-7) has produced no findings.

The Head of Consumer Lending, Priya Sundaram, has a problem. Default rates in the personal loan portfolio have increased by 0.8 percentage points over the past two quarters. The credit loss provision has grown by $12M. The CFO has asked every business unit to propose loss mitigation strategies by end of quarter.

Priya's proposal to the data science team is straightforward: "Add zip code to the credit model. My analysts ran the numbers — zip code improves AUC by 2.1 percentage points, which translates to an estimated $4.3M annual reduction in default losses. The data is already in our systems. Can your team have this deployed by end of month?"

The staff data scientist must respond.

The Analysis

Step 1: Understand the Request

The staff DS begins by understanding the request on its own terms, not by immediately objecting. Priya is not asking for something absurd. Zip code is genuinely predictive of default risk because it correlates with local economic conditions (unemployment rate, cost of living, industry concentration) that affect borrowers' ability to repay. The $4.3M loss reduction estimate is plausible. The data is available. The implementation is straightforward — add one feature to the model, retrain, validate, deploy.

This understanding is important because the staff DS cannot credibly say no without first acknowledging what is valid about the request. A refusal that does not engage with the business rationale will be perceived as obstruction, not judgment.

Step 2: Identify the Problem

Zip code is a proxy for race, ethnicity, and national origin. This is not speculation — it is empirically well-documented and legally recognized. The relationship between residential location and demographic composition in the United States is a product of historical practices (redlining, restrictive covenants, exclusionary zoning) whose effects persist in current geographic patterns.

The Equal Credit Opportunity Act (ECOA) prohibits discrimination on the basis of race, color, religion, national origin, sex, marital status, or age. The act covers both intentional discrimination (disparate treatment) and unintentional discrimination that produces disproportionate adverse effects on protected groups (disparate impact). Facially neutral variables — variables that do not name a protected characteristic — can still produce disparate impact if they serve as proxies.
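
The adverse impact ratio used throughout this case follows the four-fifths rule: the approval rate for a protected group divided by the approval rate for the most-favored reference group, flagged when the ratio falls below 0.80. A minimal sketch of the computation (the counts here are illustrative, not Meridian's actual application volumes):

```python
def adverse_impact_ratio(approvals_protected: int, total_protected: int,
                         approvals_reference: int, total_reference: int) -> float:
    """Four-fifths rule: protected-group approval rate over reference-group rate."""
    rate_protected = approvals_protected / total_protected
    rate_reference = approvals_reference / total_reference
    return rate_protected / rate_reference


# Illustrative counts only -- not Meridian's actual figures.
air = adverse_impact_ratio(430, 1000, 500, 1000)
print(f"AIR: {air:.2f}")                      # 0.43 / 0.50 = 0.86
print(f"Passes 0.80 threshold: {air >= 0.80}")
```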

The staff DS runs a quick analysis using the fairness auditing tools from Chapter 31:

from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ProxyAnalysis:
    """Analyze whether a candidate feature is a proxy for protected attributes.

    Measures correlation between the candidate feature and protected
    attributes, and estimates the disparate impact of including the
    feature in the credit model.

    Attributes:
        feature_name: Name of the candidate feature.
        protected_attributes: Protected attributes to test against.
        correlations: Pearson correlation with each protected attribute.
        mutual_information: MI with each protected attribute.
        air_before: Adverse impact ratio before adding feature.
        air_after: Adverse impact ratio after adding feature.
        auc_gain: AUC improvement from adding the feature.
    """
    feature_name: str
    protected_attributes: List[str] = field(default_factory=list)
    correlations: Dict[str, float] = field(default_factory=dict)
    mutual_information: Dict[str, float] = field(default_factory=dict)
    air_before: Dict[str, float] = field(default_factory=dict)
    air_after: Dict[str, float] = field(default_factory=dict)
    auc_gain: float = 0.0

    def is_proxy(self, correlation_threshold: float = 0.30) -> bool:
        """Determine whether the feature is a proxy for any protected attribute.

        A feature is considered a proxy if its correlation with any
        protected attribute exceeds the threshold.

        Args:
            correlation_threshold: Minimum absolute correlation to flag.

        Returns:
            True if the feature is a proxy for any protected attribute.
        """
        return any(
            abs(corr) > correlation_threshold
            for corr in self.correlations.values()
        )

    def disparate_impact_delta(self) -> Dict[str, float]:
        """Compute the change in adverse impact ratio from adding the feature.

        Returns:
            Dictionary mapping protected attribute to AIR change.
            Negative values indicate increased disparate impact.
        """
        delta = {}
        for attr in self.protected_attributes:
            if attr in self.air_before and attr in self.air_after:
                delta[attr] = self.air_after[attr] - self.air_before[attr]
        return delta

    def summary(self) -> str:
        """Generate a human-readable summary of the proxy analysis.

        Returns:
            Multi-line summary string.
        """
        lines = [f"Proxy Analysis: {self.feature_name}"]
        lines.append(f"AUC gain: +{self.auc_gain:.3f}")
        lines.append("")
        lines.append("Correlations with protected attributes:")
        for attr, corr in self.correlations.items():
            flag = " [PROXY]" if abs(corr) > 0.30 else ""
            lines.append(f"  {attr}: {corr:+.3f}{flag}")
        lines.append("")
        lines.append("Adverse Impact Ratio change:")
        for attr, delta in self.disparate_impact_delta().items():
            direction = "worse" if delta < 0 else "better"
            lines.append(
                f"  {attr}: {self.air_before[attr]:.3f} -> "
                f"{self.air_after[attr]:.3f} ({direction})"
            )
        return "\n".join(lines)


# Simulate the analysis for zip code at Meridian Financial.
zip_code_analysis = ProxyAnalysis(
    feature_name="applicant_zip_code",
    protected_attributes=["race_ethnicity", "national_origin", "age_group"],
    correlations={
        "race_ethnicity": 0.58,    # Strong proxy.
        "national_origin": 0.42,   # Moderate proxy.
        "age_group": 0.12,         # Weak relationship.
    },
    mutual_information={
        "race_ethnicity": 0.31,
        "national_origin": 0.19,
        "age_group": 0.04,
    },
    air_before={
        "race_ethnicity": 0.86,    # Above 0.80 threshold.
        "national_origin": 0.83,   # Above 0.80 threshold.
        "age_group": 0.91,
    },
    air_after={
        "race_ethnicity": 0.71,    # Below 0.80 threshold.
        "national_origin": 0.74,   # Below 0.80 threshold.
        "age_group": 0.89,
    },
    auc_gain=0.021,
)

print(zip_code_analysis.summary())
print()
print(f"Is proxy: {zip_code_analysis.is_proxy()}")

Output:

Proxy Analysis: applicant_zip_code
AUC gain: +0.021

Correlations with protected attributes:
  race_ethnicity: +0.580 [PROXY]
  national_origin: +0.420 [PROXY]
  age_group: +0.120

Adverse Impact Ratio change:
  race_ethnicity: 0.860 -> 0.710 (worse)
  national_origin: 0.830 -> 0.740 (worse)
  age_group: 0.910 -> 0.890 (worse)

Is proxy: True

The results are unambiguous. Zip code has a 0.58 correlation with race/ethnicity and a 0.42 correlation with national origin — both well above the 0.30 proxy threshold. Adding zip code to the model drops the adverse impact ratio for race/ethnicity from 0.86 to 0.71 — below the 0.80 regulatory threshold. The model would fail a fair lending examination.

Step 3: Say No — With the Framework

The staff DS meets with Priya. The conversation follows the four-step framework from Section 38.5.

1. Acknowledge the value. "I understand the pressure — the $12M credit loss increase is serious, and the CFO needs mitigation strategies. Your analysis is correct: zip code does improve prediction by 2.1 AUC points, and the loss reduction estimate is reasonable."

2. Explain the tradeoff. "The problem is that zip code is a strong proxy for race and national origin. Our analysis shows a 0.58 correlation with race/ethnicity. Adding it would drop our adverse impact ratio from 0.86 to 0.71 — well below the 0.80 threshold that regulators use as a bright line for disparate impact. A fair lending examination would flag this. The regulatory risk is not theoretical — the CFPB and OCC have brought enforcement actions against lenders for exactly this pattern. Fines in recent cases have ranged from $7M to $98M, plus restitution, plus reputational damage."

3. Offer an alternative. "Here is what I think we can do instead. The predictive signal in zip code is partly driven by legitimate economic factors — local unemployment rates, cost of living, industry concentration — that affect repayment ability. If we can extract those legitimate signals and use them directly, we capture much of the predictive improvement without the discriminatory proxy."

4. Escalate if needed. "If you believe the loss reduction is critical enough to accept the regulatory risk, I would need to escalate to Legal and Compliance. I'm not the right person to make that tradeoff — but I can prepare the analysis they would need to evaluate it."

Step 4: Build the Alternative

The staff DS proposes a three-week analysis to decompose the zip code signal:

@dataclass
class LegitimateSignalDecomposition:
    """Decompose a proxy variable's signal into legitimate and illegitimate components.

    The goal is to identify which portion of the proxy's predictive
    power comes from legitimate economic factors (which can be used
    directly) vs. demographic composition (which cannot).

    Attributes:
        proxy_feature: The proxy variable being decomposed.
        legitimate_features: Economic variables that explain the proxy.
        residual_proxy_power: AUC gain from proxy after controlling for
            legitimate features (attributable to demographics).
        legitimate_power: AUC gain from legitimate features alone.
        total_proxy_power: Original AUC gain from the proxy.
    """
    proxy_feature: str
    legitimate_features: List[str] = field(default_factory=list)
    residual_proxy_power: float = 0.0
    legitimate_power: float = 0.0
    total_proxy_power: float = 0.0

    @property
    def legitimate_fraction(self) -> float:
        """Fraction of proxy's signal attributable to legitimate factors."""
        if self.total_proxy_power == 0:
            return 0.0
        return self.legitimate_power / self.total_proxy_power

    @property
    def proxy_fraction(self) -> float:
        """Fraction of proxy's signal attributable to demographics."""
        return 1.0 - self.legitimate_fraction


# After the three-week analysis:
decomposition = LegitimateSignalDecomposition(
    proxy_feature="applicant_zip_code",
    legitimate_features=[
        "county_unemployment_rate",
        "county_median_household_income",
        "county_cost_of_living_index",
        "zip_industry_concentration_hhi",
        "zip_housing_price_index_yoy",
    ],
    residual_proxy_power=0.006,   # Only 0.6 AUC points from demographics.
    legitimate_power=0.015,        # 1.5 AUC points from economics.
    total_proxy_power=0.021,       # Original 2.1 AUC points.
)

print(f"Total proxy AUC gain: +{decomposition.total_proxy_power:.3f}")
print(f"Legitimate economic signal: +{decomposition.legitimate_power:.3f} "
      f"({decomposition.legitimate_fraction:.0%})")
print(f"Demographic residual: +{decomposition.residual_proxy_power:.3f} "
      f"({decomposition.proxy_fraction:.0%})")

Output:

Total proxy AUC gain: +0.021
Legitimate economic signal: +0.015 (71%)
Demographic residual: +0.006 (29%)

The decomposition reveals that 71% of the zip code's predictive signal comes from legitimate economic factors. The remaining 29% is attributable to demographic composition — the portion that creates disparate impact.
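
The decomposition arithmetic is simple once three models have been scored: the legitimate share is the AUC lift from the economic features alone, and the residual is whatever additional lift the proxy adds on top of them. A sketch using the case's AUC figures, assuming (as the numbers imply) that the model with both feature sets reaches the same 0.868 AUC as the zip code model:

```python
def decompose_auc_gain(auc_base: float, auc_with_legit: float,
                       auc_with_legit_and_proxy: float) -> dict:
    """Split a proxy's AUC gain into legitimate and residual components.

    Assumes three retrained models: baseline, baseline plus the
    legitimate economic features, and baseline plus both the economic
    features and the proxy.
    """
    legitimate = auc_with_legit - auc_base
    residual = auc_with_legit_and_proxy - auc_with_legit
    total = legitimate + residual
    return {
        "legitimate": legitimate,
        "residual": residual,
        "legitimate_fraction": legitimate / total,
    }


# AUC values from the Meridian analysis: 0.847 baseline, 0.862 with
# economic features, 0.868 with economic features plus zip code.
parts = decompose_auc_gain(0.847, 0.862, 0.868)
print(f"Legitimate: +{parts['legitimate']:.3f}")
print(f"Residual:   +{parts['residual']:.3f}")
print(f"Legitimate fraction: {parts['legitimate_fraction']:.0%}")
```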

The alternative model uses the five legitimate economic features directly, without zip code. The result:

Metric                   Current Model   With Zip Code    With Economic Features
AUC                      0.847           0.868 (+0.021)   0.862 (+0.015)
AIR (race/ethnicity)     0.86            0.71             0.84
AIR (national origin)    0.83            0.74             0.82
Est. loss reduction      —               $4.3M            $3.1M

The economic-features model captures $3.1M of the $4.3M loss reduction (72%) while keeping the adverse impact ratio above the 0.80 threshold for all protected groups. It is both more predictive than the current model and more defensible than the zip code model.

The Organizational Dynamics

Priya's initial reaction to the "no" is frustration: "You're leaving $1.2M on the table because of a correlation statistic." The staff DS does not argue. Instead, they reframe:

"Think of it this way. The $4.3M model has a regulatory risk attached. If a fair lending exam flags it — and our analysis says it would — the cost is $7M to $98M in fines, plus restitution, plus 12-18 months of remediation effort during which the model is frozen. The $3.1M model carries none of that risk. On an expected-value basis, the $3.1M model is the better bet."
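
The expected-value framing can be made concrete. The enforcement probability and all-in cost below are illustrative assumptions for the sake of the sketch (the case gives only the $7M to $98M fine range), not figures from Meridian's analysis:

```python
def expected_value_musd(annual_benefit: float, p_enforcement: float,
                        enforcement_cost: float) -> float:
    """Risk-adjusted annual value in $M: benefit minus expected penalty."""
    return annual_benefit - p_enforcement * enforcement_cost


# Hypothetical inputs: a 25% chance of enforcement and a $50M all-in
# cost (fines, restitution, remediation) are assumptions, not case facts.
ev_zip = expected_value_musd(4.3, 0.25, 50.0)
ev_econ = expected_value_musd(3.1, 0.0, 50.0)
print(f"Zip code model EV:  {ev_zip:+.1f} $M")
print(f"Economic model EV:  {ev_econ:+.1f} $M")
```

Under any plausible enforcement probability, the expected penalty swamps the $1.2M incremental benefit, which is the point of the reframing.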

This reframing works because it translates the ethical argument into a business argument. Priya is a sophisticated business leader — she understands expected value, risk-adjusted returns, and regulatory cost. Framing the decision in her vocabulary, not in the vocabulary of fairness metrics and proxy analysis, is the translation skill that makes staff-level communication effective.

Priya asks: "Can we get the remaining $1.2M some other way?" The staff DS: "Yes. The economic features we identified update quarterly. If we build a pipeline to ingest county-level economic data monthly instead of quarterly, we capture more timely signal — which may close part of the gap. I'd estimate 6 weeks of data engineering effort and an additional $300-500K in loss reduction. I'll write a design document."

The Broader Lesson

This case study illustrates three principles of staff-level technical leadership:

1. Saying no requires saying yes to something else. The staff DS did not simply refuse the request. They invested three weeks in building an alternative that captured 72% of the value without the regulatory risk. The "no" was credible because it was accompanied by a concrete, viable alternative.

2. The same argument needs different translations for different audiences. To the data science team, the argument is technical: "zip code has a 0.58 correlation with race/ethnicity and drops AIR below 0.80." To Priya, the argument is financial: "the expected cost of regulatory action exceeds the loss reduction." To Legal (if escalation were needed), the argument is regulatory: "the model would fail an ECOA disparate impact analysis under the CFPB's 2022 guidance." The underlying analysis is the same; the framing changes to match the audience's evaluative framework.

3. The proxy decomposition is the technical contribution; the communication is the leadership contribution. A senior data scientist could have run the proxy analysis. The staff-level contribution was integrating the technical finding into a business decision framework, managing the stakeholder relationship, proposing a constructive alternative, and navigating the organizational dynamics — all within the three-week timeline, which still met Priya's quarterly deadline.

The credit model with economic features shipped on time. The quarterly model risk review noted the new features and the proxy analysis — which became part of the model's fair lending documentation, strengthening its regulatory position. The $1.2M gap was partially closed in the following quarter by the monthly economic data pipeline. And Priya, despite her initial frustration, cited the collaboration as an example of data science adding strategic value — not just technical value — in her next leadership review.