Quiz — Chapter 29: Algorithmic Fairness and Bias in Compliance Systems
Instructions
This quiz contains 14 questions covering sources of algorithmic bias, fairness metrics, regulatory obligations, and remediation approaches. Questions are a mix of multiple choice, short answer, and applied analysis. An answer key is provided at the end.
Questions
Question 1
Which of the following best describes measurement bias in a machine learning compliance system?
A. The model was trained on historical decisions that reflected discriminatory lending practices
B. Training data from populations subject to more intense monitoring creates the appearance of higher risk in those populations
C. Human labellers assigned different labels to identical cases based on the applicant's name
D. The training dataset contains fewer examples from certain demographic groups
Question 2
A credit scoring model has been trained on ten years of lending decisions. Lending officers during that period consistently applied higher scrutiny to loan applications from applicants in certain postcodes. The model has learned to assign lower scores to applicants from those postcodes. Which type of bias does this primarily represent?
A. Representation bias
B. Aggregation bias
C. Historical bias
D. Label bias
Question 3
Demographic parity requires which of the following?
A. That the model's true positive rate is equal across demographic groups
B. That the model's positive prediction rate is equal across demographic groups
C. That the model's predicted probabilities are equally accurate across demographic groups
D. That the model would make the same decision for the same individual regardless of their demographic group
Question 4
The four-fifths rule holds that a potential disparate impact exists when:
A. The minority group's false positive rate exceeds the majority group's false positive rate by more than 20%
B. The minority group's approval rate is less than 80% of the majority group's approval rate
C. The minority group's precision is less than 80% of the majority group's precision
D. The minority group represents less than 20% of the training dataset
Question 5
A KYC verification model shows an overall accuracy of 94%. When results are broken down by demographic group, Group A shows 98% accuracy and Group B shows 76% accuracy. Which type of bias does this illustrate?
A. Label bias
B. Measurement bias
C. Historical bias
D. Aggregation bias
Question 6
According to the impossibility theorem (Chouldechova 2017; Kleinberg et al. 2016), which statement is correct?
A. Demographic parity, equalized odds, and calibration can all be achieved simultaneously with sufficiently advanced algorithms
B. When base rates differ across demographic groups, demographic parity, equalized odds, and calibration cannot all be satisfied simultaneously
C. Fairness is achievable only when the training dataset contains equal numbers of examples from each demographic group
D. Counterfactual fairness is equivalent to equalized odds in all cases
Question 7
The NIST Face Recognition Vendor Test (2019) found which of the following?
A. Commercial facial recognition algorithms showed equal false match rates across all demographic groups tested
B. False match rates for Asian and African American faces were 10 to 100 times higher than for Caucasian faces in some commercial algorithms
C. The algorithms showed higher false positive rates for darker skin tones but equal false negative rates
D. Only government-developed algorithms showed significant demographic performance disparities
Question 8
Under the UK Equality Act 2010, indirect discrimination occurs when:
A. A person is treated less favourably because of a protected characteristic with explicit discriminatory intent
B. A provision, criterion, or practice applies equally to everyone but has disproportionate adverse effects on people sharing a protected characteristic
C. A firm fails to make reasonable adjustments for a disabled person
D. A protected characteristic is directly used as a feature in an algorithmic decision system
Question 9
How many protected characteristics does the UK Equality Act 2010 identify?
A. Five
B. Seven
C. Nine
D. Twelve
Question 10
Equalized odds (Hardt et al. 2016) requires:
A. Equal approval rates across demographic groups
B. Equal true positive rates across demographic groups only
C. Equal true positive rates AND equal false positive rates across demographic groups
D. Equal predicted probabilities across demographic groups
Question 11
Calibration in a machine learning model means:
A. The model applies the same feature weights to all demographic groups
B. The model's predicted probabilities are equally accurate across demographic groups — a score of 0.7 means a 70% probability of the predicted outcome for all groups
C. The model achieves the same overall accuracy rate regardless of which demographic group is presented
D. The model's approval rate does not vary by more than 5% across demographic groups
Question 12
Counterfactual fairness asks:
A. Whether the model would be accurate if it had been trained on a counterfactual dataset
B. Whether historical counterfactual data could have produced different outcomes
C. Whether a specific individual would have received a different decision if their protected characteristic had been different, all else equal
D. Whether the model's output is consistent with a counterfactual model trained without any demographic data
Question 13
When a financial firm discovers that a vendor-supplied KYC system is producing a 3.8× rejection rate differential across demographic groups, which of the following best describes the firm's regulatory position?
A. The firm can rely on the vendor's validation documentation to demonstrate compliance
B. The firm is not responsible for outcomes produced by third-party systems if the vendor's model was independently validated
C. The firm bears full regulatory responsibility for the outcomes produced by the system, regardless of the vendor relationship
D. The firm's obligation is limited to notifying the vendor and requesting a model update
Question 14
A compliance team is drafting a fairness clause for a vendor contract covering a credit decisioning algorithm. Which of the following elements is most important to include?
A. A requirement that the model use no features correlated with demographic group membership
B. A requirement that the vendor provide disaggregated performance metrics by demographic group, along with a process for investigating and remediating identified disparities
C. A requirement that the model's overall accuracy not decrease following any fairness adjustment
D. A requirement that the vendor indemnify the firm for any Equality Act claims arising from the model's decisions
Answer Key
1. B — Measurement bias occurs when differential monitoring intensity in training data creates the appearance of higher risk in monitored populations, unrelated to genuine underlying risk differences.
2. C — The model is learning patterns from historical human decisions that encoded discriminatory scrutiny, which is historical bias. Note that D (label bias) is closely related and also defensible as a partial answer — the key distinction is that the bias flows from the historical pattern of human decisions embedded in the outcome labels, making historical bias the most precise categorisation.
3. B — Demographic parity requires equal positive prediction rates (approval rates) across demographic groups. Option A describes equalized odds (TPR). Option C describes calibration. Option D describes counterfactual fairness.
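The parity check reduces to comparing per-group approval rates. A minimal Python sketch with invented group labels and predictions (the `approval_rate` helper is illustrative, not from the chapter):

```python
# Hypothetical decisions for eight applicants in two groups.
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
preds  = [1, 1, 1, 0, 1, 0, 0, 0]

def approval_rate(groups, preds, g):
    """Positive-prediction (approval) rate within group g."""
    decisions = [p for grp, p in zip(groups, preds) if grp == g]
    return sum(decisions) / len(decisions)

rate_a = approval_rate(groups, preds, "A")  # 0.75
rate_b = approval_rate(groups, preds, "B")  # 0.25
# Demographic parity would require rate_a == rate_b; here it is violated.
```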
4. B — The four-fifths rule holds that potential disparate impact exists when the minority group's approval rate is less than 80% of the majority group's approval rate.
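The rule is a single ratio; a short sketch with hypothetical approval rates:

```python
def adverse_impact_ratio(minority_rate, majority_rate):
    """Four-fifths rule: a ratio of minority to majority approval rates
    below 0.8 flags potential disparate impact."""
    return minority_rate / majority_rate

ratio = adverse_impact_ratio(0.30, 0.50)  # 0.6
flagged = ratio < 0.8                     # True: potential disparate impact
```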
5. D — Aggregation bias describes a situation where strong aggregate performance conceals poor performance for demographic subgroups. The 94% overall accuracy masks the 76% accuracy for Group B.
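The arithmetic behind the question can be reproduced with hypothetical counts chosen to match the quoted accuracy figures:

```python
# Invented case counts that yield the rates in Question 5.
group_a = {"n": 900, "correct": 882}  # 98% accuracy
group_b = {"n": 200, "correct": 152}  # 76% accuracy

overall = (group_a["correct"] + group_b["correct"]) / (group_a["n"] + group_b["n"])
acc_a = group_a["correct"] / group_a["n"]  # 0.98
acc_b = group_b["correct"] / group_b["n"]  # 0.76
# overall is 0.94: the aggregate figure conceals Group B's poor performance.
```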
6. B — The impossibility theorem proves that when base rates differ across groups, demographic parity, equalized odds, and calibration cannot simultaneously be satisfied. This is a mathematical result, not a limitation of current algorithms.
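The tension can be demonstrated numerically: if error rates are held equal across groups (equalized odds) while base rates differ, Bayes' rule forces the positive predictive values apart, so calibration fails. A sketch with illustrative numbers:

```python
def ppv(tpr, fpr, base_rate):
    """Positive predictive value from error rates and prevalence (Bayes' rule)."""
    p = base_rate
    return tpr * p / (tpr * p + fpr * (1 - p))

# Identical error rates for both groups, so equalized odds holds...
tpr, fpr = 0.8, 0.1
# ...but the groups have different base rates:
ppv_a = ppv(tpr, fpr, 0.5)  # ~0.889
ppv_b = ppv(tpr, fpr, 0.2)  # ~0.667
# Unequal PPVs: a score meaning "positive" cannot be equally reliable for
# both groups, so calibration cannot also be satisfied.
```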
7. B — The NIST FRVT demographic effects report found false match (false positive) rates 10 to 100 times higher for Asian and African American faces than for Caucasian faces in some commercial one-to-one matching algorithms, a disparity attributed in part to representation bias in training datasets.
8. B — Indirect discrimination under the Equality Act occurs when a neutral provision, criterion, or practice has disproportionate adverse effects on people sharing a protected characteristic. Discriminatory intent is not required. Option A describes direct discrimination.
9. C — The UK Equality Act 2010 identifies nine protected characteristics: age, disability, gender reassignment, marriage and civil partnership, pregnancy and maternity, race, religion or belief, sex, and sexual orientation.
10. C — Equalized odds requires both equal true positive rates (TPR) and equal false positive rates (FPR) across demographic groups. Equal approval rates alone describes demographic parity.
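The two rates the definition compares can be sketched on invented outcomes:

```python
def rates(y_true, y_pred):
    """True positive rate and false positive rate from labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp / (tp + fn), fp / (fp + tn)

# Hypothetical outcomes for two groups:
tpr_a, fpr_a = rates([1, 1, 0, 0], [1, 1, 1, 0])  # TPR=1.0, FPR=0.5
tpr_b, fpr_b = rates([1, 1, 0, 0], [1, 0, 0, 0])  # TPR=0.5, FPR=0.0
# Equalized odds requires tpr_a == tpr_b AND fpr_a == fpr_b — violated here.
```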
11. B — Calibration means that a model's predicted probabilities accurately reflect true probabilities of the outcome for all demographic groups — a score of 0.7 should mean a 70% probability of the predicted outcome regardless of which group the individual belongs to.
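A per-group calibration check on a single score bucket can be sketched as follows, with data invented to show one miscalibrated group:

```python
# Cases all scored ~0.7 by a hypothetical model: (group, score, actual outcome).
scored = [
    ("A", 0.7, 1), ("A", 0.7, 1), ("A", 0.7, 1), ("A", 0.7, 1), ("A", 0.7, 0),
    ("B", 0.7, 1), ("B", 0.7, 0), ("B", 0.7, 0), ("B", 0.7, 0), ("B", 0.7, 0),
]

def observed_rate(rows, group):
    """Empirical positive rate among a group's cases in this score bucket."""
    outcomes = [o for g, s, o in rows if g == group]
    return sum(outcomes) / len(outcomes)

# Group A: 0.8 observed — close to the 0.7 score.
# Group B: 0.2 observed — far below it: the model is miscalibrated for B.
```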
12. C — Counterfactual fairness asks whether a specific individual would have received a different decision if their protected characteristic had been different, with all other characteristics held constant. It is the closest algorithmic analogue to the legal concept of direct discrimination.
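On a deliberately biased toy scoring function, the counterfactual test is a single flipped input. The model and weights below are invented for illustration; a real test also requires a causal model of how other features depend on the protected characteristic:

```python
def score(income, protected_group):
    """A toy model that is biased by construction: it penalises group B directly."""
    base = 0.002 * income
    penalty = 10 if protected_group == "B" else 0
    return base - penalty

applicant = {"income": 40000, "protected_group": "B"}
actual = score(applicant["income"], applicant["protected_group"])
counterfactual = score(applicant["income"], "A")  # same person, group flipped
# actual != counterfactual, so counterfactual fairness is violated.
```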
13. C — The FCA and other financial regulators do not transfer regulatory responsibility to vendors. The financial firm bears full responsibility for the customer outcomes produced by any system it deploys, regardless of whether the system was developed by a third party. The firm cannot rely on vendor validation as a compliance defence.
14. B — Disaggregated performance reporting by demographic group is the most essential fairness contract clause because it makes disparities visible and creates an obligation for the vendor to investigate and remediate. Option A is overly broad and would exclude legitimate features. Option C conflates accuracy with fairness in a way that is not tenable under the impossibility theorem. Option D is a legal indemnity, not a fairness mechanism.
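The reporting obligation in option B can be sketched with hypothetical counts chosen to reproduce the 3.8× differential from Question 13:

```python
# Invented application counts per demographic group.
apps = {"A": {"total": 100, "rejected": 20},
        "B": {"total": 100, "rejected": 76}}

rejection_rates = {g: c["rejected"] / c["total"] for g, c in apps.items()}
differential = rejection_rates["B"] / rejection_rates["A"]
# rejection_rates -> {'A': 0.2, 'B': 0.76}; differential -> ~3.8
# Disaggregated reporting makes this disparity visible; the clause then
# obliges the vendor to investigate and remediate it.
```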