Chapter 11: Quiz

Bias in Financial Services and Credit

Instructions: This quiz covers material from Chapter 11. Multiple choice and true/false questions should be answered by selecting the best or correct option. Short answer questions should be answered in 3–5 sentences. Applied scenario questions require longer responses of approximately 150–200 words. Time suggested: 45–60 minutes.


Part I: Multiple Choice (8 questions, 2 points each)

Question 1

Under the Equal Credit Opportunity Act (ECOA), which of the following best describes "disparate impact" in algorithmic lending?

A) A credit algorithm that explicitly uses race as an input variable to make lending decisions

B) A credit policy that is facially neutral but produces disproportionately adverse outcomes for a protected class, without adequate business justification

C) A pattern of loan officer decisions that, when reviewed statistically, reveals individual bias in some decisions

D) An algorithm that denies applications from protected groups at higher rates solely because their average incomes are lower


Question 2

The Apple Card gender discrimination controversy illustrated which of the following problems most directly?

A) Goldman Sachs intentionally programmed gender bias into its algorithm

B) The algorithm used gender as a direct input variable, violating ECOA

C) A discriminatory outcome was produced by an algorithm that did not directly use gender as a variable, and current law was insufficient to prove illegal discrimination

D) The New York DFS found that Goldman Sachs had violated state fair lending law and imposed a consent order


Question 3

The Markup's 2021 investigation used HMDA data to find that Black mortgage applicants were denied at what rate compared to comparable white applicants?

A) Approximately 20% more likely to be denied

B) Approximately 50% more likely to be denied

C) Approximately 80% more likely to be denied

D) Approximately twice as likely to be denied (100% more likely)


Question 4

Under the four-fifths rule (as applied to fair lending), a disparate impact ratio of 0.75 for Black applicants compared to white applicants means:

A) Black applicants have 75% of their applications approved

B) Black applicants are approved at a rate that is 75% of the white applicant approval rate, which is below the 0.80 threshold indicating prima facie adverse impact

C) 75% of Black applicants experience some form of discrimination

D) The model is 75% accurate for Black applicants vs. 100% accurate for white applicants


Question 5

Which of the following variables in a mortgage underwriting model is MOST likely to constitute algorithmic redlining?

A) Debt-to-income ratio at the time of application

B) Number of late payments in the past 24 months

C) Census tract property value appreciation rate over the past five years

D) Loan-to-value ratio based on independent appraisal


Question 6

The "thin-file problem" in credit scoring refers to:

A) The practice of financial institutions maintaining incomplete records for minority applicants

B) The situation of individuals with insufficient credit history for scoring models to generate reliable scores, disproportionately affecting communities historically excluded from credit

C) A technical limitation of the FICO model that prevents it from scoring applicants with fewer than two credit accounts

D) The gap between what HMDA data reports and what lenders' internal models actually use


Question 7

ECOA's adverse action notice requirement, as interpreted by the CFPB's 2022 circular, requires AI-driven credit models to:

A) Provide applicants with a copy of the model's decision function upon request

B) Select reason codes from the CFPB's standardized list without modification

C) Provide reasons that accurately reflect the model's actual decision, not just the most common denial reasons from a standard list

D) Obtain written consent from applicants before using AI to make credit decisions


Question 8

Which of the following best describes the regulatory status of many fintech lenders compared to traditional banks?

A) Fintechs face stricter fair lending requirements because they are supervised by both the CFPB and state regulators simultaneously

B) Fintechs are regulated identically to banks because they perform equivalent financial functions

C) Many fintechs operate with lighter federal supervision than banks, creating a potential gap in fair lending enforcement coverage

D) Fintechs are exempt from ECOA because they use technology rather than human decision-makers


Part II: True or False (5 questions, 2 points each)

Question 9

True or False: The Community Reinvestment Act (CRA) prohibits discrimination in credit decisions and is enforced by the Consumer Financial Protection Bureau through formal enforcement actions.

(If False, correct the statement.)


Question 10

True or False: A credit model that does not include race, national origin, sex, or any other ECOA-protected characteristic as an input variable cannot produce discriminatory outcomes, because discrimination requires intent.

(If False, correct the statement.)


Question 11

True or False: HMDA data includes applicant credit scores in fully disaggregated form (the specific three-digit score), which is why The Markup's investigation was able to control perfectly for credit quality in its analysis of racial disparities in mortgage approval rates.

(If False, correct the statement.)


Question 12

True or False: The New York Department of Financial Services investigation of Goldman Sachs's Apple Card algorithm found that Goldman Sachs had illegally discriminated against women applicants and required a consent order with remediation payments.

(If False, correct the statement.)


Question 13

True or False: Under the EU AI Act (2024), AI systems used in credit scoring and creditworthiness assessment are classified as high-risk AI systems, requiring pre-deployment conformity assessment and registration in an EU database.

(If True, note whether U.S. law imposes equivalent pre-deployment requirements.)


Part III: Short Answer (4 questions, 5 points each)

Question 14

Explain the "proxy problem" in algorithmic credit scoring. Give two concrete examples of variables that might serve as racial proxies in a mortgage underwriting model, and explain the mechanism by which each produces racially disparate outcomes. Your answer should be 3–5 sentences.


Question 15

A bank's fraud detection model produces false positive rates (legitimate transactions incorrectly flagged as fraudulent) that are significantly higher for Black customers than for white customers, even though the model does not include race as an input variable. Explain how this can happen mechanically, and identify what legal framework, if any, provides consumers with recourse when they experience such disparate treatment. Your answer should be 3–5 sentences.


Question 16

What is "algorithmic redlining," and how does it differ from the historical practice of redlining? In your answer, identify at least one specific category of variable that can produce algorithmic redlining effects and explain why that variable is likely to be correlated with neighborhood racial composition. Your answer should be 3–5 sentences.


Question 17

The SR 11-7 / OCC 2011-12 guidance on model risk management requires banks to validate models they use for significant decisions. Identify three specific components of a model validation process for a credit underwriting AI that address fair lending risk specifically (not just general model accuracy). For each component, briefly explain what it tests and what it is designed to detect. Your answer should be 3–5 sentences.


Part IV: Applied Scenarios (3 questions, 10 points each)

Question 18

Scenario — The Adverse Action Notice

A community bank uses an AI model to make small business loan decisions. The model uses 60 variables and a neural network architecture. Theresa Washington, a Black woman who owns a catering business, applies for a $75,000 loan to purchase equipment. The model denies her application. The bank sends her an adverse action notice that says: "Reason for denial: Insufficient collateral."

An internal SHAP analysis of the decision shows the following top factors:

- Years in business: 2.1 years (negative contribution — below the model's preferred threshold of 3 years)
- Business credit score: 612 (negative — below threshold)
- Industry category "Food Service" (negative — higher-risk category in training data)
- Annual revenue of $210,000 (positive contribution — above average for loan size)
- Personal credit score of 720 (positive — but weighted less than business factors)

Questions: (a) Is the adverse action notice compliant with ECOA's Regulation B requirements? Why or why not? (b) Draft a corrected adverse action notice based on the SHAP analysis. (c) Theresa suspects that "Food Service" being flagged as a high-risk industry may have discriminatory effects on minority-owned businesses. What information would you need to evaluate her concern? What legal framework applies?


Question 19

Scenario — The Fair Lending Examination

You are a CFPB fair lending examiner reviewing Metropolitan Mortgage Company's automated underwriting system. Metropolitan uses Fannie Mae's Desktop Underwriter (DU) for conforming loans and a proprietary AI model (called "MetroScore") for its portfolio loans. Your analysis of Metropolitan's 2023 HMDA data for portfolio loans shows:

Group      Applications   Approved   Approval Rate   DIR vs. White
White      1,240          942        76.0%           (reference)
Black      290            168        57.9%           0.76
Hispanic   385            255        66.2%           0.87
Asian      175            143        81.7%           1.07

Metropolitan's conforming loan (DU-underwritten) approval rates show no significant racial disparities. Metropolitan argues that the disparity in portfolio loans reflects genuine credit risk differences not captured in HMDA data.

Questions: (a) Which groups fail the four-fifths rule for portfolio loans? (b) What does the contrast between portfolio loan disparities and conforming loan approval rates suggest about where the bias is likely arising? (c) What additional data or documentation would you request from Metropolitan before concluding your examination? (d) If Metropolitan cannot justify the disparity as reflecting legitimate credit risk differences, what remedies would you recommend?


Question 20

Scenario — Algorithmic Redlining Decision

You are the Chief Compliance Officer at Lakefront Credit Union. Your data science team has built a home equity loan model that includes "neighborhood investment score" as one of its 40 features. This score is constructed from census tract data including: median home value appreciation (5-year), school district quality rating, local business permit activity, and code violation rate.

Your fair lending analyst reports that:

- The neighborhood investment score is the fifth-most important variable in the model (importance weight: 5.8%)
- The score is correlated with census tract racial composition: predominantly white tracts average 65 out of 100; majority-minority tracts average 42 out of 100
- Removing the score reduces the model's predictive AUC from 0.79 to 0.77 and increases the Black applicant approval rate from 52% to 56%
- With the score included, the Black applicant DIR is 0.74 (below the four-fifths threshold)
- Without the score, the Black applicant DIR is 0.82 (above the four-fifths threshold)

Questions: (a) Based solely on these facts, would you recommend deploying the model with or without the neighborhood investment score? Justify your recommendation using fair lending principles and the specific data provided. (b) The data science team argues that removing the score reduces model accuracy, which will lead to more defaults and ultimately higher rates for all borrowers. How do you weigh this argument? (c) Are there any modifications to the neighborhood investment score — short of removing it entirely — that might reduce its proxy discrimination effects while preserving some of its predictive value? (d) What regulatory engagement, if any, would you recommend before deploying either version of this model?


Answer Key

Part I: Multiple Choice

  1. B — Disparate impact arises when a facially neutral policy produces disproportionate adverse effects on a protected class without adequate business justification. Option A describes disparate treatment; C describes a pattern of individual human bias; D describes a disparity traceable to a legitimate underwriting factor (income), which would ordinarily supply the business justification that defeats a disparate impact claim.

  2. C — The core lesson of Apple Card is that discriminatory outcomes can be produced without discriminatory intent, and that current law made the resulting disparity difficult to establish as illegal discrimination. There was no explicit gender coding (eliminating A and B), and the DFS found no violation (eliminating D).

  3. C — The Markup's investigation found Black applicants were approximately 80% more likely to be denied than comparable white applicants in the 2019 HMDA data.

  4. B — The four-fifths rule holds that a disparate impact ratio below 0.80 indicates prima facie adverse impact. A ratio of 0.75 means the protected group's approval rate is 75% of the reference group rate — below threshold.

  5. C — Census tract property value appreciation is the most direct example of algorithmic redlining: it uses geographic data that reflects historical redlining patterns and correlates strongly with neighborhood racial composition. The other options are individual financial characteristics not specifically tied to geographic racial patterns.

  6. B — The thin-file problem describes the self-reinforcing exclusion of individuals with insufficient credit history from credit access, disproportionately affecting historically excluded communities. It is not a record-keeping issue (eliminates A), not a technical FICO limitation per se (C is too narrow), and not related to HMDA reporting (eliminates D).

  7. C — The CFPB's 2022 circular specifically stated that lenders cannot use standardized boilerplate reason codes as a safe harbor when those codes do not accurately reflect the model's actual decision. Lenders must provide reasons that are specific to the individual decision.

  8. C — Many fintech lenders operate as nonbank entities subject to state licensing rather than federal bank examination, creating supervisory gaps. Fintechs are generally subject to ECOA and other fair lending laws but may have less intense supervision in practice.

Part II: True or False

  9. False. The CRA does not prohibit discrimination per se; it affirmatively requires depository institutions to meet the credit needs of all communities, including low- and moderate-income neighborhoods. CRA compliance is assessed by the OCC, FDIC, and Federal Reserve (not primarily the CFPB), and is enforced through examination ratings that affect merger approvals — not typically through formal enforcement actions.

  10. False. A credit model can produce discriminatory outcomes through proxy variables — features that are correlated with protected characteristics even when the protected characteristics themselves are excluded. Discrimination in the legal sense (under disparate impact doctrine) does not require intent. The Apple Card case illustrates that gendered outputs can arise from an algorithm that does not name gender as a variable.

  11. False. HMDA's public data does not include applicant credit scores. The 2015 HMDA expansion required lenders to report credit scores to regulators, but they are withheld from the public loan-level dataset for privacy reasons. This is why The Markup acknowledged that its analysis could not fully control for credit quality, leaving open the possibility that the racial disparities in approval rates are partially explained by unobserved credit score differences.

  12. False. The New York Department of Financial Services investigation concluded in March 2021 that it found no evidence of illegal discrimination by Goldman Sachs. The DFS did identify governance concerns and recommended enhanced bias testing, but it did not find a violation of law and did not require a consent order or remediation payments.

  13. True. The EU AI Act classifies AI systems for creditworthiness assessment as high-risk, requiring pre-deployment conformity assessment and registration in an EU database. U.S. law does not impose equivalent pre-deployment requirements; the primary U.S. framework (SR 11-7) requires model validation but does not mandate pre-deployment regulatory clearance or registration in a public database.

Part III: Short Answer

Suggested elements for full-credit responses:

Question 14 — Proxy problem: A proxy variable correlates with a protected characteristic and produces discriminatory outcomes indirectly. Examples: (1) neighborhood risk score — correlates with census tract racial composition due to historical redlining; penalizes applicants from majority-minority neighborhoods that were historically deprived of investment; (2) length of credit history — correlates with race because historical credit exclusion left minority communities with shorter average credit histories; the model penalizes the effect of past discrimination as if it were a current risk indicator. Full-credit answers identify the correlation mechanism and the underlying historical cause.

Question 15 — False positives arise when the model flags "unusual" transactions based on patterns learned from a predominantly white, higher-income customer base; minority customers' legitimate behavior (different geographic patterns, different merchant categories) deviates more from those learned norms, triggering more false positives. Legal framework: CFPB UDAAP (unfair, deceptive, or abusive acts or practices) authority potentially reaches discriminatory fraud detection; state consumer protection laws may apply. Full-credit answers identify both the mechanism and the applicable (if limited) legal framework.

Question 16 — Algorithmic redlining uses geographically correlated variables (neighborhood appreciation rates, school district ratings, census tract crime statistics) that are facially neutral but correlated with neighborhood racial composition due to historical redlining. Differs from historical redlining in that it uses these proxies rather than explicit racial maps and is not intentional. Full-credit answers identify a specific variable type and explain the causal chain from historical redlining to current neighborhood characteristics to model output.

Question 17 — Components: (1) demographic disparity analysis — running the model on a stratified sample to measure approval rates and outcomes across protected groups; (2) variable correlation analysis — identifying which input variables are correlated with protected characteristics (proxy screening); (3) matched-pair / comparative file review — comparing outcomes for matched pairs of applicants differing only in demographic characteristics to isolate demographic effects; (4) adverse action reason code audit — verifying that generated reason codes accurately reflect the model's actual decision for sampled applications; (5) out-of-time validation with demographic stratification — testing on holdout data from a later period to assess whether demographic disparities persist or worsen. Full-credit answers identify three distinct components with clear descriptions of what each tests.
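The proxy-screening component (item 2 above) can be sketched as a simple correlation scan over model inputs. The data below is synthetic and the 0.3 review threshold is illustrative, not a regulatory standard:

```python
# Sketch of proxy screening: flag features whose correlation with a
# protected-class indicator exceeds a manual-review threshold.
# All data, names, and thresholds here are illustrative.
import random

random.seed(0)
n = 1000
# Synthetic protected-class indicator (True = member of protected group).
protected = [random.random() < 0.3 for _ in range(n)]
# A feature deliberately constructed to correlate with the indicator
# (e.g., a neighborhood score shaped by historical disinvestment).
feature = [random.gauss(42 if p else 65, 10) for p in protected]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

REVIEW_THRESHOLD = 0.3  # illustrative cutoff for fair lending review

r = pearson([float(p) for p in protected], feature)
if abs(r) > REVIEW_THRESHOLD:
    print(f"flag feature for fair lending review: r = {r:.2f}")
```

In practice a validation team would run this scan across all model inputs and against several demographic indicators, then subject any flagged feature to business-necessity analysis rather than automatic removal.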

Part IV: Applied Scenarios

Scoring rubric for Applied Scenarios:

  • 10 points: Accurately identifies the relevant legal framework, applies it correctly to the specific facts, addresses all sub-questions, and demonstrates nuanced understanding of the tension between business interests and fair lending obligations
  • 7–9 points: Addresses all sub-questions with generally accurate legal analysis; may miss nuance or fail to address one element
  • 4–6 points: Addresses most sub-questions but with significant gaps in legal accuracy or practical application
  • 1–3 points: Addresses some elements but demonstrates fundamental misunderstanding of the legal framework or the facts

Question 18 — Key points:

(a) The notice is not compliant. The adverse action reason "insufficient collateral" does not appear among the top SHAP factors and is not the principal reason for the denial. Regulation B requires that the specific principal reasons be provided — not a paraphrase or an inaccurate characterization.

(b) A corrected notice should list: (1) years in business below minimum threshold; (2) business credit score below required level; (3) food service industry classification (higher-risk category). Note: if "food service" is listed, the bank must be prepared to show this categorization is not itself discriminatory.
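Selecting principal reasons from a SHAP decomposition, as in (b), amounts to sorting the negative contributions. A minimal sketch — the numeric SHAP values below are invented for illustration, since the scenario gives only signs and factor names:

```python
# SHAP contributions for one denied application (illustrative magnitudes;
# the Question 18 scenario specifies only the signs, not the values).
contributions = {
    "Years in business below minimum threshold": -0.8,
    "Business credit score below required level": -0.6,
    "Food service industry classification": -0.4,
    "Annual revenue": +0.5,
    "Personal credit score": +0.2,
}

# Regulation B requires the specific principal reasons for the denial:
# take the factors with the largest negative contributions (common
# practice caps the list at four reasons).
principal_reasons = sorted(
    (k for k, v in contributions.items() if v < 0),
    key=lambda k: contributions[k],
)[:4]

for i, reason in enumerate(principal_reasons, 1):
    print(f"{i}. {reason}")
```

Note that "insufficient collateral" would never surface from this procedure, which is precisely why the original notice fails the accuracy requirement.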

(c) To evaluate the food service concern: compare denial rates across industry categories by race; test whether food service businesses owned by white vs. minority owners are denied at different rates within the same industry category; assess whether the industry risk classification was developed from data that reflects actual credit risk or from patterns that may encode racial bias in business lending.

Question 19 — Key points:

(a) Black applicants (DIR = 0.76) fail the four-fifths rule. Hispanic applicants (0.87) and Asian applicants (1.07) pass.
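The DIRs in the scenario table can be verified directly from the application counts; a short calculation:

```python
# Approval data from the Question 19 scenario table:
# group -> (applications, approvals)
data = {
    "White": (1240, 942),
    "Black": (290, 168),
    "Hispanic": (385, 255),
    "Asian": (175, 143),
}

white_rate = data["White"][1] / data["White"][0]  # reference approval rate

for group, (apps, approved) in data.items():
    rate = approved / apps
    dir_ = rate / white_rate  # disparate impact ratio vs. White applicants
    flag = "FAIL" if dir_ < 0.80 else "pass"
    print(f"{group:9s} rate={rate:.1%}  DIR={dir_:.2f}  four-fifths: {flag}")
# The Asian DIR computes to about 1.076, so it may display as 1.08 here;
# the scenario table's 1.07 reflects truncation rather than rounding.
```

Only the Black applicant DIR falls below the 0.80 threshold, matching the answer in (a).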

(b) The contrast between DU-underwritten conforming loans (no significant disparity) and MetroScore portfolio loans (significant Black disparity) strongly suggests that the MetroScore proprietary model — not the underlying financial characteristics of applicants — is the source of the disparity.

(c) Request: (i) full MetroScore model documentation including variable list and weights; (ii) demographic disparity testing Metropolitan has conducted on MetroScore; (iii) training data composition and time period; (iv) any prior validation or audit of MetroScore for fair lending compliance; (v) sample of denied Black applicant files for comparative file review against approved white applicants with similar financial profiles.

(d) If unjustified: require Metropolitan to conduct full disparate impact analysis of MetroScore; require remediation of potentially harmed applicants (re-evaluation and monetary remediation where appropriate); require enhanced model validation before continued use; potentially require model redesign or replacement.

Question 20 — Key points:

(a) Removing the neighborhood investment score is strongly recommended. The DIR without the score (0.82) is above the four-fifths threshold; with it (0.74) it is below. The score is demonstrably correlated with neighborhood racial composition (predominantly white tracts average 65 out of 100 versus 42 for majority-minority tracts); the predictive accuracy cost is modest (AUC of 0.79 versus 0.77); and inclusion cannot be justified as necessary when a less discriminatory model is available at similar performance. Under disparate impact doctrine, the "less discriminatory alternative" test is directly applicable.
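The recommendation in (a) reduces to comparing the two model variants on both axes; a minimal decision sketch using the scenario's figures:

```python
# Compare two model variants on accuracy (AUC) and fairness (Black DIR),
# using the numbers given in the Question 20 scenario.
FOUR_FIFTHS = 0.80

variants = {
    "with neighborhood score":    {"auc": 0.79, "black_dir": 0.74},
    "without neighborhood score": {"auc": 0.77, "black_dir": 0.82},
}

# Less-discriminatory-alternative logic: among variants that clear the
# four-fifths threshold, prefer the most accurate one.
compliant = {k: v for k, v in variants.items() if v["black_dir"] >= FOUR_FIFTHS}
choice = max(compliant, key=lambda k: compliant[k]["auc"]) if compliant else None

print("recommended variant:", choice)
```

This is deliberately simplified: a real analysis would also search for intermediate alternatives, such as the modified scores discussed in (c), rather than treating "with" and "without" as the only options.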

(b) The accuracy argument deserves consideration but does not override fair lending obligations. An AUC reduction of 0.02 is modest and may not materially increase defaults. Even if it did, using a proxy variable that pushes the racial disparity below the four-fifths threshold requires the institution to demonstrate that the variable is necessary to the business purpose — a showing that a 0.02 AUC difference probably does not support.

(c) Possible modifications: (i) include only the property value appreciation component of the score (directly relevant to collateral quality), not the school district and code violation components; (ii) normalize the score within peer geographic areas rather than using raw scores that reflect historical investment patterns; (iii) use the score only as a flag for manual review rather than as a direct numerical input.

(d) Regulatory engagement: Given the proximity to the four-fifths threshold even with the score removed, and the demonstrated correlation with racial composition, seeking CFPB or OCC engagement before deployment would be prudent. At minimum, the credit union should document its fair lending analysis thoroughly before deployment.