Case Study 11.2: Algorithmic Redlining in Digital Lending
The Markup Investigation and Its Aftermath
Case Overview: In August 2021, the investigative journalism outlet The Markup published an analysis of nearly 9 million mortgage loan applications drawn from federal Home Mortgage Disclosure Act (HMDA) data. The analysis documented that Black, Hispanic, and Asian mortgage applicants were denied at substantially higher rates than white applicants with similar reported financial characteristics, across lenders and loan types. The investigation reignited debate about whether algorithmic mortgage underwriting systems are perpetuating the patterns of racial exclusion that characterized the era of explicit redlining — and what, if anything, regulators and the industry are doing about it.
Primary Issue Areas: Algorithmic redlining, disparate impact in mortgage lending, HMDA data transparency, fair lending examination, automated underwriting systems
Regulatory Agencies Involved: Consumer Financial Protection Bureau (CFPB); Department of Housing and Urban Development (HUD); Department of Justice (DOJ); Federal Financial Institutions Examination Council (FFIEC)
Applicable Law: Fair Housing Act (42 U.S.C. § 3605); Equal Credit Opportunity Act (15 U.S.C. § 1691); Home Mortgage Disclosure Act (12 U.S.C. § 2801)
Part 1: The Markup Methodology — HMDA Data Analysis
The Markup is a nonprofit investigative news outlet focused on the accountability of algorithms and technology. For its mortgage lending investigation, published under the headline "The Secret Bias Hidden in Mortgage-Approval Algorithms," The Markup used the same data source that fair lending researchers have relied on for decades: HMDA data.
What HMDA Data Contains
The Home Mortgage Disclosure Act, enacted in 1975 and significantly expanded in 2015, requires covered financial institutions to report detailed data on every mortgage application they receive. The 2015 HMDA expansion added fields that made the data far more useful for fair lending analysis, including:
- Applicant race and ethnicity
- Applicant sex and age
- Income
- Loan-to-value (LTV) ratio
- Debt-to-income (DTI) ratio
- Credit score range (the reported range, not the exact score)
- Property type and purpose
- Automated underwriting system used and recommendation received
- Interest rate and points charged
- Application outcome (approved, denied, withdrawn, incomplete)
The CFPB makes this data publicly available each year. It covers millions of applications from thousands of lenders and is the most comprehensive public dataset on mortgage lending in the United States.
The Markup's Analytical Approach
The Markup analyzed 2019 HMDA data — filed in 2020, covering approximately 8.9 million mortgage applications. Their analytical approach was to compare denial rates across racial and ethnic groups after controlling for the financial characteristics that HMDA data includes: income, loan amount, loan type, loan purpose, property type, occupancy type, and lien status. They also controlled for LTV ratio and DTI ratio where available.
The comparison asked, in essence: among applicants who look similar on the financial characteristics we can observe, how much does race predict the likelihood of denial?
The methodology was peer-reviewed before publication. It was also necessarily limited: HMDA data does not include credit scores in disaggregated form (only ranges), does not include complete credit history data, and does not include the full set of variables that lenders' automated underwriting systems use. The Markup acknowledged these limitations prominently and argued that the disparities they documented were large enough to be meaningful even accounting for unobserved variables.
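A stripped-down version of this approach — comparing denial rates among applicants who match on observable financial characteristics — can be sketched by bucketing applications into strata and computing denial rates by group within each stratum. The records, field names, and bins below are invented for illustration; The Markup's actual analysis used regression over the full HMDA field set.

```python
from collections import defaultdict

# Toy records: (race, income_bin, dti_bin, denied). Fields and values are
# hypothetical -- real HMDA records carry many more variables.
applications = [
    ("white", "mid", "low", 0), ("white", "mid", "low", 0),
    ("white", "mid", "low", 1), ("black", "mid", "low", 1),
    ("black", "mid", "low", 0), ("black", "mid", "low", 1),
    ("white", "low", "high", 1), ("black", "low", "high", 1),
]

def stratified_denial_rates(apps):
    """Denial rate by race within each (income, DTI) stratum -- a bucketed
    stand-in for 'controlling for' observable financial characteristics."""
    strata = defaultdict(lambda: defaultdict(lambda: [0, 0]))  # stratum -> race -> [denied, total]
    for race, inc, dti, denied in apps:
        cell = strata[(inc, dti)][race]
        cell[0] += denied
        cell[1] += 1
    return {s: {r: d / n for r, (d, n) in races.items()}
            for s, races in strata.items()}

rates = stratified_denial_rates(applications)
```

Within the ("mid", "low") stratum — applicants who look alike on the controls — the toy data still shows a denial-rate gap by race, which is the shape of the comparison The Markup ran at scale.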
Part 2: The Findings — Racial Disparities in Mortgage Denial Rates
The Markup's findings documented substantial racial disparities in mortgage denial rates that persisted across lender types, loan types, and geographic regions.
Overall Denial Rate Disparities
After controlling for available financial characteristics, The Markup found that:
- Black applicants were 80% more likely to be denied than comparable white applicants (odds ratio: 1.80)
- Latino applicants were 40% more likely to be denied (odds ratio: 1.40)
- Asian applicants were 50% more likely to be denied (odds ratio: 1.50)
- Native American applicants were 70% more likely to be denied (odds ratio: 1.70)
These disparities were not driven by outlier lenders. The Markup examined the 89 metro areas with the largest number of applications and found racial disparities in denial rates in virtually all of them, with variation in magnitude but not in direction.
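The figures above are odds ratios, which compare the odds of denial (denials relative to approvals) rather than raw denial rates. For a single two-group comparison the arithmetic reduces to the sketch below; the counts are made up solely to reproduce a ratio of 1.80, and in a regression setting the analogous quantity is exp(coefficient) on the group indicator.

```python
def odds_ratio(denied_a, total_a, denied_b, total_b):
    """Odds of denial for group A relative to group B."""
    odds_a = denied_a / (total_a - denied_a)   # denied : approved
    odds_b = denied_b / (total_b - denied_b)
    return odds_a / odds_b

# Hypothetical counts chosen to yield an odds ratio of 1.80:
#   group A: 180 denials among 1,180 applications (odds 180/1000 = 0.18)
#   group B: 100 denials among 1,100 applications (odds 100/1000 = 0.10)
ratio = odds_ratio(180, 1180, 100, 1100)
```

Note that an odds ratio of 1.80 only approximates "80% more likely to be denied" in the rate sense when denial is relatively uncommon; at higher denial rates the two measures diverge.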
Lender-Specific Findings
The Markup named specific lenders whose denial rate disparities exceeded the national average. These included major national banks, regional banks, credit unions, and independent mortgage companies. Notably, several fintech mortgage lenders — companies that had publicly marketed their technology as more objective and equitable than traditional loan officers — appeared in the list of lenders with above-average racial disparities.
Pricing Disparities
Beyond approval/denial disparities, the investigation found evidence of pricing disparities — racial and ethnic minority borrowers who were approved paid higher interest rates on average than comparable white borrowers. These pricing disparities were smaller in magnitude than the approval disparities but statistically robust.
The Cross-Lender Consistency
One of the most important features of The Markup's findings was the cross-lender consistency of the disparities. If racial disparities in mortgage approval reflected individual loan officer bias, we would expect large variation across lenders — some with severe bias, others without. Instead, the racial disparities were remarkably consistent across lenders of different types, sizes, and ownership structures. That pattern points to a systemic cause — a shared automated underwriting infrastructure or shared data inputs — rather than individual human bias.
Part 3: How Algorithmic Underwriting Might Produce These Results
The cross-lender consistency of racial disparities in mortgage denial rates points toward the automated underwriting systems (AUS) that dominate the mortgage market as a likely common cause. Understanding how these systems work — and how they might produce racially disparate outcomes — requires examining their architecture and data inputs.
The GSE Automated Underwriting Systems
Fannie Mae's Desktop Underwriter (DU) and Freddie Mac's Loan Product Advisor (LPA, formerly Loan Prospector) are used in the vast majority of conforming mortgage applications. Lenders submit application data to these systems, which return a recommendation: "Approve/Eligible," "Refer with Caution," or "Ineligible." These recommendations are highly influential — an "Approve/Eligible" recommendation from DU or LPA is typically sufficient for the lender to approve the loan. A "Refer" recommendation typically means additional underwriting scrutiny.
These systems use credit score as a primary input, along with LTV, DTI, and loan characteristics. Credit score disparities by race — documented consistently in research, and attributable to the historical factors discussed in Section 11.3 — would directly translate into AUS recommendation disparities, even if the AUS itself does not use race as an input.
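This translation mechanism — disparate inputs passing through a race-blind rule — can be shown with a toy cutoff rule. The score lists below are hypothetical stand-ins for group-level score distributions (the gap mirrors the direction documented in the research the text cites, not real data), and the cutoff is arbitrary.

```python
# Hypothetical score distributions for two groups; not real data.
group_a_scores = [700, 720, 680, 740, 660]   # higher-scoring group
group_b_scores = [640, 700, 620, 680, 600]   # lower-scoring group

def approval_rate(scores, cutoff=680):
    """A race-blind AUS-style rule: approve iff score >= cutoff.
    Race is not an input anywhere in this function."""
    return sum(s >= cutoff for s in scores) / len(scores)

# The identical rule, applied identically, yields different approval
# rates because the input distributions differ.
rate_a = approval_rate(group_a_scores)
rate_b = approval_rate(group_b_scores)
```

The rule never sees race, yet its approval rates differ by group — which is exactly how credit score disparities would translate into AUS recommendation disparities.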
Proprietary Model Overlays
Many lenders apply proprietary model overlays on top of or instead of GSE AUS recommendations, particularly for portfolio loans and jumbo mortgages. These proprietary models — which are not publicly available and not subject to the same scrutiny as GSE models — may use additional variables including neighborhood characteristics that can serve as racial proxies.
The Role of Geographic Variables
Neighborhood-level variables — property value trends, school district quality, local employment patterns, proximity to commercial development — are among the most likely mechanisms for algorithmic redlining. These variables have legitimate uses in mortgage underwriting: property value trends affect collateral quality, and collateral quality is directly relevant to mortgage default risk and recovery. But these variables are also heavily correlated with the racial composition of neighborhoods, because neighborhood racial composition is itself correlated with the historical pattern of investment (and disinvestment) that redlining produced.
A model that penalizes loan applications in neighborhoods with low property value appreciation, or in neighborhoods near commercial zoning, or in neighborhoods with lower school quality ratings, will systematically disadvantage applicants from majority-minority neighborhoods — not because of their race, but because of where they live, and those locations are correlated with race due to historical discrimination.
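A basic first step in auditing a candidate neighborhood-level feature for proxy risk is measuring how strongly it correlates with protected-class composition. The tract data below is hypothetical and purely illustrative; a strongly negative correlation like this one would flag the feature for closer review before it is used in a model.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient, stdlib-only."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical census tracts: (minority population share,
# 5-year property appreciation %). Invented for illustration.
tracts = [(0.10, 9.0), (0.25, 7.5), (0.40, 6.0), (0.60, 4.0), (0.85, 2.5)]
minority_share = [t[0] for t in tracts]
appreciation = [t[1] for t in tracts]

r = pearson_r(minority_share, appreciation)   # strongly negative here
```

Correlation alone does not prove a feature acts as a racial proxy in a given model, but a feature this entangled with neighborhood composition cannot be treated as race-neutral without further analysis.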
Part 4: The Response from Lenders — Explaining vs. Justifying
The Markup contacted every lender named in its investigation for comment before publication. The responses varied but fell into recognizable patterns.
The "Unobserved Variables" Defense
The most common response was some version of: we do not discriminate, and the disparities in The Markup's analysis reflect unobserved financial variables — particularly credit scores — that The Markup's analysis did not fully control for. This defense is partially valid as a methodological observation and wholly inadequate as a substantive response.
It is true that HMDA data does not include full credit score information, and credit scores are strongly associated with race (for the historical reasons discussed in Chapter 11). To the extent that racial disparities in HMDA denial rates are "explained" by credit score disparities, the denial rate disparities are not random variation; they are the downstream effect of historical credit access discrimination filtering through the credit scoring system. Saying that racial disparities in mortgage denial are "just" a credit score effect is not an explanation that dispels the concern — it relocates it.
The "We Follow the Rules" Response
Several lenders responded that they comply with all applicable fair lending laws. This is likely true for most lenders named in the investigation. Compliance with existing law, as the Apple Card case also demonstrated, does not establish that outcomes are fair. Current fair lending law — applied to automated underwriting systems — does not require lenders to audit or address racial disparities in automated system recommendations.
The More Thoughtful Responses
A small number of lenders engaged more substantively, acknowledging that disparate outcomes were a genuine concern and describing their fair lending testing processes. These responses were the exception, not the rule.
The pattern of industry response to The Markup investigation — defensive, dismissive, and focused on the methodological limitations of HMDA data rather than on the substantive question of why racial disparities exist — reflects an industry culture that treats fair lending compliance as a legal obligation to be managed rather than an ethical commitment to equitable outcomes.
Part 5: CFPB's Response — Enhanced HMDA Data Requirements and Fair Lending Examinations
The CFPB's response to The Markup investigation and to the broader evidence of racial disparities in mortgage lending took several forms.
Enhanced HMDA Examination
The CFPB announced in 2022 that it was prioritizing fair lending examinations of mortgage lenders, with particular focus on automated underwriting systems and the use of algorithmic models. CFPB examination teams were directed to review lenders' AUS recommendations and compare them with demographic data to identify potential disparate impact.
The CFPB also updated its fair lending examination procedures to specifically address AI and ML models, requiring examiners to assess whether lenders have conducted disparate impact analysis of their automated systems and whether they have validated their models for bias.
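The text does not specify the CFPB's exact examination computations, but one common screening step in disparate impact analysis is an adverse impact ratio over AUS recommendations — borrowing the four-fifths (0.8) heuristic from EEOC employment guidelines as a review threshold, which is an assumption here rather than a stated CFPB rule. A sketch with hypothetical recommendation logs:

```python
def approve_eligible_rate(recommendations):
    """Share of AUS recommendations that came back 'Approve/Eligible'."""
    return sum(r == "Approve/Eligible" for r in recommendations) / len(recommendations)

def adverse_impact_ratio(group_recs, ref_recs, threshold=0.8):
    """Group approval rate relative to the reference group. The 0.8 cutoff
    is the EEOC four-fifths heuristic, used here as a screening assumption."""
    air = approve_eligible_rate(group_recs) / approve_eligible_rate(ref_recs)
    return air, air < threshold   # (ratio, flagged for review?)

# Hypothetical recommendation logs for two demographic groups
ref_group = ["Approve/Eligible"] * 9 + ["Refer with Caution"]        # 90% approve
test_group = ["Approve/Eligible"] * 6 + ["Refer with Caution"] * 4   # 60% approve

air, flagged = adverse_impact_ratio(test_group, ref_group)
```

A flagged ratio is a trigger for deeper analysis — regression with risk controls, file review — not a finding of discrimination by itself.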
HMDA Data Enhancement
The expanded HMDA data fields added in 2015 — including AUS recommendation, LTV, DTI, and interest rate — significantly enhanced the data's ability to support fair lending analysis. The CFPB has continued to refine the HMDA data collection and publication process, and researchers have used the expanded data to conduct more sophisticated analyses than were previously possible.
Enforcement Actions
The DOJ and CFPB have brought a series of fair lending enforcement actions since 2021 that reflect the evidence of algorithmic redlining documented by The Markup and academic researchers. The DOJ's 2023 consent order with City National Bank, which found redlining in the Los Angeles market, included requirements for enhanced marketing, community outreach, and credit access in majority-minority census tracts. The CFPB's enforcement actions against Trident Mortgage Company and others similarly required affirmative steps to expand credit access.
Part 6: Congressional Attention
The Markup investigation attracted significant Congressional attention, particularly from members of the House Financial Services Committee and the Senate Banking Committee.
Representative Maxine Waters, who chaired the House Financial Services Committee at the time of publication, called the findings "deeply troubling" and convened hearings on algorithmic discrimination in mortgage lending. Witnesses from the CFPB, academic researchers, and fair housing advocates testified about the evidence of racial disparities in automated underwriting.
Congressional attention produced several notable responses. The CFPB was pressed to expand its use of HMDA data in fair lending examinations and to develop clearer guidance on how disparate impact doctrine applies to automated underwriting systems. There were legislative proposals to require lenders to conduct and disclose disparate impact testing of automated underwriting systems, though as of the time of this writing no such legislation has been enacted at the federal level.
The Illinois legislature enacted the Illinois Artificial Intelligence Video Interview Act and related legislation that, while primarily focused on employment, established a precedent for state-level algorithmic bias regulation that influenced discussions about algorithmic lending. Several other states enacted or considered legislation specifically addressing algorithmic discrimination in financial services.
Part 7: The Limits of HMDA Data — What It Shows and Doesn't Show
Intellectually honest evaluation of The Markup investigation requires understanding what HMDA data can and cannot show.
What HMDA Data Shows
HMDA data is comprehensive in its coverage of applications and outcomes. It provides income, loan amount, LTV, DTI, property characteristics, loan type, and the automated underwriting recommendation. This is substantially more information than was available in HMDA data before the 2015 expansion. When The Markup found that Black applicants were 80% more likely to be denied than white applicants with similar HMDA-observable characteristics, that finding was statistically robust and meaningful.
What HMDA Data Does Not Show
HMDA data does not include the complete set of variables that lenders' automated underwriting systems use. Most importantly, it does not include credit scores in a fully disaggregated form. The 2015 HMDA expansion added credit score range (not the specific score) for most applications, but the ranges are broad — a 680-719 range includes scores that may predict very different default rates.
HMDA also does not include complete credit history data: the number of accounts, their age, the payment history on each, utilization ratios, or collections and derogatory marks. All of these factors are inputs to automated underwriting systems and to credit scores, and they are associated with race for the historical reasons discussed in Section 11.3.
This limitation means that HMDA data cannot definitively establish that racial disparities in denial rates are caused by discrimination rather than by unobserved credit risk variables that are themselves correlated with race. The Markup's findings establish disparate outcomes. They suggest but do not conclusively prove discriminatory mechanism.
The Research Frontier
Several economists have attempted to use HMDA data together with other data sources — including credit bureau data purchased from the bureaus or obtained through regulatory channels — to examine whether racial disparities persist after controlling for credit scores. Bhutta and Hizmo (2021) examined racial disparities in mortgage interest rates using 2018-2019 HMDA data and found that Black and Hispanic borrowers paid higher rates even after controlling for credit score and other risk factors. The magnitude of the unexplained disparity was smaller than The Markup's raw disparity figures but remained statistically and economically significant.
Part 8: The Case for Paired Testing of Mortgage Algorithms
The most powerful method for detecting discriminatory lending — and the method that has produced the clearest evidence in housing discrimination cases — is paired testing. In a paired test, testers with identical financial profiles but different demographic characteristics apply for credit (or housing) and the outcomes are compared. Pairs are carefully constructed to match on the financial variables that should determine the outcome, isolating demographic characteristics as the only systematic difference.
Paired testing has been used in employment and housing discrimination cases for decades and has been accepted by courts as direct evidence of discrimination. It could in principle be applied to algorithmic mortgage lending, but the application raises several challenges.
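In code, the protocol reduces to running matched pairs through a decision function and tallying the net rate of differential treatment. The decision model below has a deliberately planted bias so the test has something to detect; the model, field names, and tester signals are all invented for illustration and do not represent any real underwriting system.

```python
def net_differential(model, base_profiles, signal_a, signal_b):
    """Paired test: each pair shares one financial profile and differs only
    in the demographic signal. Returns the net rate at which signal_a is
    favored (positive means signal_a applicants fare better)."""
    a_favored = b_favored = 0
    for profile in base_profiles:
        approved_a = model({**profile, "applicant_name": signal_a})
        approved_b = model({**profile, "applicant_name": signal_b})
        if approved_a and not approved_b:
            a_favored += 1
        elif approved_b and not approved_a:
            b_favored += 1
    return (a_favored - b_favored) / len(base_profiles)

def biased_toy_model(app):
    """Toy model with a planted bias: near the DTI limit, group B testers
    are held to a stricter standard. Purely illustrative."""
    within_limit = app["dti"] <= 0.43
    penalized = app["applicant_name"] == "group_b_tester" and app["dti"] > 0.40
    return within_limit and not penalized

profiles = [{"dti": d} for d in (0.30, 0.38, 0.41, 0.42, 0.43, 0.50)]
gap = net_differential(biased_toy_model, profiles, "group_a_tester", "group_b_tester")
```

A net differential near zero is consistent with equal treatment on the tested signal; a large positive gap, replicated over enough pairs, is the kind of direct evidence courts have accepted from physical paired testing.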
The Digital Application Challenge
Physical paired testing — testers who walk into bank branches and apply — is logistically feasible. For algorithmic lending, which is predominantly conducted online, paired testing requires submitting matched digital applications. Many lenders have implemented fraud detection systems that flag multiple applications from the same IP address or with similar information, which can interfere with digital paired testing.
Scale and Representativeness
Individual paired tests establish discrimination in individual cases; detecting systemic algorithmic bias requires paired tests at scale. Conducting thousands of paired mortgage applications is expensive and logistically demanding, but the Department of Justice and fair housing organizations have conducted paired testing studies with enough scale to draw statistically meaningful conclusions.
Regulatory Adoption
Several fair housing advocates and researchers have argued that regulators should require lenders to conduct paired testing of their automated underwriting systems as a condition of model approval, and to disclose the results. This is analogous to the pre-market testing requirements that pharmaceutical regulators impose — before a drug can be marketed, its sponsor must demonstrate its safety and efficacy. Before a mortgage algorithm can be deployed, lenders could be required to demonstrate that it does not produce discriminatory outcomes.
This proposal has not been adopted as a regulatory requirement, but some regulators have indicated openness to it as a supervisory tool.
Part 9: What Happened Next — Changes (or Lack Thereof) in Lending Patterns
HMDA Data Trends
Annual HMDA data releases since The Markup's 2021 investigation allow tracking of whether racial disparities in mortgage denial rates have changed. The short answer is: the disparities persist, with modest variation from year to year. The 2021 and 2022 HMDA data show racial disparities in approval rates that are broadly similar in magnitude to the 2019 data The Markup analyzed. Regulatory attention, Congressional scrutiny, and industry awareness do not appear to have substantially changed the pattern within the period covered by available data.
Industry Initiatives
Several mortgage industry organizations have announced voluntary initiatives to address racial disparities in lending, including commitments to increase mortgage lending to Black and Hispanic borrowers, programs to help potential borrowers build credit, and partnerships with housing counseling agencies. The Mortgage Bankers Association and the National Association of Realtors have published reports acknowledging the racial homeownership gap and proposing policy approaches to address it.
These initiatives are genuine but their impact on algorithmic underwriting patterns has been limited. Credit score requirements, DTI limits, and LTV standards — which are substantially driven by GSE guidelines and secondary market requirements — have not materially changed. The fundamental architecture of algorithmic mortgage underwriting, which produces the documented racial disparities, has not been redesigned.
GSE Actions
Fannie Mae and Freddie Mac have taken some steps toward addressing racial disparities in automated underwriting. Fannie Mae introduced an option to incorporate rent payment history into DU recommendations in 2021, which has the potential to improve scores for thin-file applicants who pay rent reliably. Freddie Mac expanded its automated underwriting to consider bank account data in some circumstances. Both GSEs have announced research initiatives focused on racial equity in mortgage lending.
These are incremental improvements, not structural changes. The fundamental dependence on credit scores — which encode historical discrimination in credit access — as a primary input to automated underwriting systems remains intact.
The Accountability Deficit
One of the most striking features of the post-Markup landscape is the accountability deficit: despite substantial evidence of racial disparities in algorithmic mortgage underwriting, no major lender has faced a significant enforcement action specifically premised on the use of a biased automated underwriting system. Enforcement actions for redlining have focused on marketing and outreach practices — failure to advertise in minority communities, failure to locate branches in minority neighborhoods — rather than on the automated underwriting systems that make approval and pricing decisions.
This enforcement gap reflects the legal challenges of bringing disparate impact claims against algorithmic systems discussed in Part 6 and in the main chapter text. It also reflects resource constraints: regulators lack the technical capacity to independently audit and validate lenders' proprietary automated underwriting systems at scale.
Discussion Questions
- The Markup found that racial disparities in mortgage denial rates persisted after controlling for the financial variables in HMDA data, but acknowledged that the data does not include credit scores in full detail. How should policymakers weigh evidence of disparate outcomes that might be partially explained by unobserved variables? At what point does the burden shift to lenders to explain why their outcomes are not discriminatory?
- The racial disparities documented by The Markup were consistent across lenders — suggesting a systemic cause such as shared automated underwriting infrastructure — rather than varying by lender, which would suggest individual loan officer bias. What are the implications of this finding for how regulators should approach fair lending examinations? If the problem is systemic, what are the systemic remedies?
- Paired testing has been used successfully to document housing discrimination and is accepted by courts as direct evidence. Design a paired testing protocol for a digital mortgage algorithm. What challenges would you need to address? What legal and logistical obstacles might prevent widespread adoption of algorithmic paired testing?
- The HMDA data that supported The Markup investigation is publicly available. If you were a fair housing researcher with access to this data and basic statistical skills, what analysis would you conduct? What additional data sources would you want access to, and what would the combination allow you to conclude?