Chapter 31: Quiz
Test your understanding of fairness in machine learning. Answers follow each question.
Question 1
What is proxy discrimination, and why does removing protected attributes from a model's features not prevent it?
Answer
**Proxy discrimination** occurs when a model uses facially neutral features — such as zip code, employer, or credit history — that are strongly correlated with protected attributes like race or gender. Removing the protected attribute itself does not prevent proxy discrimination because the statistical relationship between the proxy features and the protected attribute persists in the data. A model trained on zip code effectively learns neighborhood-level demographic patterns that reproduce discriminatory outcomes, even without ever seeing a race variable. The model does not need direct access to the protected attribute to replicate its statistical footprint. This is why fairness analysis requires examining model outcomes disaggregated by protected group, not merely auditing the feature list for prohibited variables.
Question 2
Define demographic parity and the four-fifths rule. How are they related?
Answer
**Demographic parity** (also called statistical parity) requires that the probability of receiving a positive prediction is the same across all protected groups: $P(\hat{Y}=1 \mid A=0) = P(\hat{Y}=1 \mid A=1)$. In other words, the selection rate must be independent of group membership. The **four-fifths rule** is a practical threshold for demographic parity used in U.S. employment law (Uniform Guidelines on Employee Selection Procedures, 1978): if the selection rate for any protected group is less than four-fifths (80%) of the selection rate for the group with the highest rate, the selection procedure is considered to have disparate impact. The four-fifths rule operationalizes demographic parity as a ratio rather than a difference: it computes $\min(\text{SR}_a) / \max(\text{SR}_a)$ and flags values below 0.80. The two are related in that the four-fifths rule is a relaxation of strict demographic parity — it permits some disparity (up to a 20% relative difference) rather than requiring exact equality.
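The ratio computation can be sketched in a few lines of Python; the selection counts below are made up for illustration:

```python
# Hypothetical selection data: selected / applicants per group.
selection_rates = {
    "group_a": 30 / 100,   # 30 selected of 100 applicants
    "group_b": 21 / 100,   # 21 selected of 100 applicants
}

# Four-fifths rule: ratio of the lowest selection rate to the highest.
ratio = min(selection_rates.values()) / max(selection_rates.values())
flagged = ratio < 0.80  # disparate impact indicated when below 0.80

print(f"demographic parity ratio = {ratio:.2f}, flagged = {flagged}")
```

Here the ratio is 0.70, below the 0.80 threshold, so the procedure would be flagged.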
Question 3
What is the difference between equalized odds and equal opportunity?
Answer
**Equalized odds** (Hardt, Price, and Srebro, 2016) requires that both the true positive rate (TPR) and the false positive rate (FPR) are equal across protected groups: $P(\hat{Y}=1 \mid Y=y, A=0) = P(\hat{Y}=1 \mid Y=y, A=1)$ for $y \in \{0, 1\}$. **Equal opportunity** is a relaxation of equalized odds that constrains only the TPR: $P(\hat{Y}=1 \mid Y=1, A=0) = P(\hat{Y}=1 \mid Y=1, A=1)$. Equal opportunity requires that among truly qualified individuals, the model identifies them at equal rates regardless of group membership — but it permits the false positive rate to differ across groups. Equal opportunity is less restrictive and therefore easier to achieve while maintaining accuracy. It focuses on the most ethically salient error: denying a positive outcome to someone who deserves it.
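Both criteria reduce to per-group confusion-matrix rates. A minimal sketch, with toy labels and predictions invented for illustration:

```python
def rates_by_group(y_true, y_pred, groups):
    """Per-group TPR and FPR from binary labels and predictions.

    Equalized odds requires both rates to match across groups;
    equal opportunity requires only the TPR to match.
    """
    out = {}
    for g in set(groups):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        tp = sum(y_true[i] == 1 and y_pred[i] == 1 for i in idx)
        fn = sum(y_true[i] == 1 and y_pred[i] == 0 for i in idx)
        fp = sum(y_true[i] == 0 and y_pred[i] == 1 for i in idx)
        tn = sum(y_true[i] == 0 and y_pred[i] == 0 for i in idx)
        out[g] = {"tpr": tp / (tp + fn), "fpr": fp / (fp + tn)}
    return out

# Toy data: group "b" has a higher TPR and FPR than group "a".
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
rates = rates_by_group(y_true, y_pred, groups)
```

On this toy data both TPR and FPR differ across groups, so the predictions satisfy neither criterion.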
Question 4
Why does calibration by group differ from predictive parity, and why is calibration a stronger condition?
Answer
**Predictive parity** requires equal positive predictive value (PPV) across groups at the decision threshold: $P(Y=1 \mid \hat{Y}=1, A=0) = P(Y=1 \mid \hat{Y}=1, A=1)$. It is a statement about the meaning of a positive prediction at a single threshold. **Calibration by group** requires that for every predicted probability $s$, the true probability of a positive outcome is $s$ for all groups: $P(Y=1 \mid S=s, A=a) = s$ for all $s, a$. Calibration is stronger because it constrains the model's probability estimates at every score value, not just at the decision boundary. A calibrated model satisfies predictive parity at every possible threshold, while a model with predictive parity may only satisfy it at one specific threshold. Calibration ensures that scores are semantically comparable across groups — a score of 0.7 means the same thing regardless of group.
Question 5
State the impossibility theorem (Chouldechova, 2017). What are its three conditions, and what does it prove?
Answer
The impossibility theorem states that for a binary classifier applied to two groups with unequal base rates ($\text{BR}_0 \neq \text{BR}_1$), the following three conditions cannot hold simultaneously unless the classifier is perfect: (1) **calibration by group** — equal PPV and equal FDR across groups; (2) **equal false positive rates** — $\text{FPR}_0 = \text{FPR}_1$; and (3) **equal false negative rates** — $\text{FNR}_0 = \text{FNR}_1$. Conditions 2 and 3 together constitute equalized odds. The theorem proves that calibration and equalized odds are mathematically incompatible whenever base rates differ, which is the case in virtually every real-world prediction problem. The proof follows from Bayes' theorem: PPV is a function of TPR, FPR, and the base rate, so equalizing TPR and FPR across groups with different base rates necessarily produces different PPVs.
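The Bayes' theorem step can be verified numerically. The rates below are arbitrary illustrative values, not from any real model:

```python
def ppv(tpr, fpr, base_rate):
    # Bayes' theorem: PPV = TPR * BR / (TPR * BR + FPR * (1 - BR))
    return (tpr * base_rate) / (tpr * base_rate + fpr * (1 - base_rate))

# Identical TPR and FPR for both groups, so equalized odds holds exactly.
tpr, fpr = 0.8, 0.1
ppv_a = ppv(tpr, fpr, 0.3)   # group with base rate 0.30
ppv_b = ppv(tpr, fpr, 0.1)   # group with base rate 0.10

# ppv_a != ppv_b: with unequal base rates, calibration (equal PPV)
# cannot hold at the same time as equalized odds.
```

With these numbers, PPV is about 0.77 for the high-base-rate group and about 0.47 for the other, despite identical error rates.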
Question 6
What is the key practical implication of the impossibility theorem for deploying ML systems?
Answer
The key implication is that **fairness is a choice, not a discovery**. There is no universal fairness criterion that a model can satisfy; every deployment must explicitly select which fairness criteria to enforce and accept the tradeoffs imposed by the impossibility theorem. This means the choice of fairness criterion is an ethical and domain-specific decision, not a purely technical one. It must be made by people with the authority and context to make ethical decisions — including legal, domain, and community stakeholders — and it must be documented, defended, and monitored over time. Any claim of "algorithmic fairness" that does not specify which criterion, which protected attributes, and which measured values is incomplete.
Question 7
What is individual fairness, and what makes it harder to implement than group fairness?
Answer
**Individual fairness** (Dwork et al., 2012) requires that similar individuals receive similar predictions: $d_{\text{output}}(\hat{Y}(x_i), \hat{Y}(x_j)) \leq L \cdot d_{\text{input}}(x_i, x_j)$, where $d_{\text{input}}$ is a distance metric on the feature space and $d_{\text{output}}$ is a distance metric on predictions. The fundamental difficulty is defining $d_{\text{input}}$ — what it means for two individuals to be "similar." This requires domain knowledge and ethical judgment. For example, if two loan applicants differ only in zip code, whether they are "similar" depends on whether zip code is considered a legitimate creditworthiness signal or a proxy for race. The distance metric encodes the ethical question directly, and there is no universally agreed-upon metric. Group fairness metrics, by contrast, require only group labels and standard confusion-matrix computations, making them straightforward to compute even though the choice of which metric to enforce remains a value judgment.
Question 8
What is counterfactual fairness, and what does it require that other fairness definitions do not?
Answer
**Counterfactual fairness** (Kusner et al., 2017) requires that a prediction would remain the same in a counterfactual world where the individual belonged to a different protected group, with all non-descendant variables held constant. Formally: $P(\hat{Y}_{A \leftarrow a}(U) = y \mid X=x, A=a) = P(\hat{Y}_{A \leftarrow a'}(U) = y \mid X=x, A=a)$. Unlike group fairness metrics (which require only outcomes and group labels) or individual fairness (which requires a distance metric), counterfactual fairness requires a **causal model** — a directed acyclic graph specifying the causal relationships between the protected attribute, all features, and the outcome. This is a much stronger requirement, because the causal graph determines which features are causally downstream of the protected attribute (and must be adjusted in the counterfactual) and which are not. The causal graph is often contested — whether education is causally downstream of race, for instance, is an empirical and philosophical question — and different graphs yield different definitions of counterfactual fairness.
Question 9
Why is intersectional fairness analysis necessary? Give an example of how single-attribute analysis can miss a fairness violation.
Answer
**Intersectional fairness analysis** is necessary because a model can satisfy fairness criteria with respect to each protected attribute individually while being unfair to subgroups defined by the intersection of attributes. For example, a hiring model might have equal selection rates for men and women (passes gender demographic parity) and equal selection rates for Black and white applicants (passes race demographic parity), but dramatically lower selection rates for Black women specifically. This happens when the model overselects Black men and white women, balancing the aggregate statistics while disadvantaging the intersection. Buolamwini and Gebru (2018) documented this in facial recognition: systems achieved high overall accuracy for both dark-skinned and light-skinned faces and for both male and female faces, but had dramatically higher error rates for dark-skinned women. Intersectional analysis computes metrics for cross-product groups (race × gender, race × age, etc.) to detect these masked disparities.
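The masking effect is easy to reproduce numerically. The applicant counts below are constructed so that every single-attribute marginal is balanced while one intersection is not:

```python
# Toy pool: (race, gender, applicants, selected) per intersectional cell.
cells = [
    ("black", "male",   10, 6),
    ("black", "female", 10, 2),
    ("white", "male",   10, 2),
    ("white", "female", 10, 6),
]

def rate(keep):
    """Selection rate over the cells matched by the predicate `keep`."""
    selected = sum(s for r, g, n, s in cells if keep(r, g))
    total = sum(n for r, g, n, s in cells if keep(r, g))
    return selected / total

# Single-attribute checks pass: every marginal selection rate is 0.40.
assert rate(lambda r, g: r == "black") == rate(lambda r, g: r == "white")
assert rate(lambda r, g: g == "male") == rate(lambda r, g: g == "female")

# The cross-product view exposes the masked disparity (0.20 for Black women).
by_cell = {(r, g): s / n for r, g, n, s in cells}
```

Exactly as the answer describes, overselecting Black men and white women balances both marginals while the intersection is disadvantaged.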
Question 10
Describe the reweighing pre-processing method. What does it correct for, and what can it not correct for?
Answer
**Reweighing** (Kamiran and Calders, 2012) assigns sample weights to training instances to make the label $Y$ statistically independent of the protected attribute $A$. The weight for each instance is $w(y, a) = P(Y=y) \cdot P(A=a) / P(Y=y, A=a)$. This upweights underrepresented (label, group) combinations — for example, positive outcomes in a disadvantaged group — and downweights overrepresented ones. The reweighed dataset has approximately equal base rates across groups. Reweighing corrects for **marginal statistical dependence** between $Y$ and $A$: it ensures that the training signal does not reflect the raw base-rate disparity. However, it **cannot correct for proxy discrimination** through features that correlate with $A$ but are not $A$ itself. If zip code is correlated with race, reweighing does not change the zip code values or their relationship with the outcome — the downstream model can still learn proxy patterns from the features.
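The weight formula can be computed directly from empirical counts. The toy dataset below has base rates of 0.30 and 0.10 for the two groups:

```python
from collections import Counter

# Toy training set of (label y, group a) pairs with unequal base rates.
data = [(1, 0)] * 30 + [(0, 0)] * 70 + [(1, 1)] * 10 + [(0, 1)] * 90
n = len(data)

c_y = Counter(y for y, a in data)    # marginal counts of Y
c_a = Counter(a for y, a in data)    # marginal counts of A
c_ya = Counter(data)                 # joint counts of (Y, A)

def weight(y, a):
    # Kamiran-Calders: w(y, a) = P(Y=y) * P(A=a) / P(Y=y, A=a)
    return (c_y[y] / n) * (c_a[a] / n) / (c_ya[(y, a)] / n)
```

Positives in the low-base-rate group get the largest weight (here `weight(1, 1)` is 2.0 versus roughly 0.67 for `weight(1, 0)`), so the weighted base rates equalize across groups.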
Question 11
How does Fairlearn's ExponentiatedGradient enforce fairness constraints? What is the core algorithmic idea?
Answer
Fairlearn's `ExponentiatedGradient` (Agarwal et al., 2018) reduces the constrained optimization problem — minimize error subject to a fairness constraint — to a sequence of **cost-sensitive classification** problems. The algorithm maintains a set of Lagrange multiplier weights on the fairness constraints. At each iteration, it (1) solves a weighted classification problem where the costs encode the current constraint weights, producing a classifier, and (2) updates the constraint weights based on the fairness violation of the current classifier — increasing weights on violated constraints and decreasing weights on satisfied ones. This is an exponentiated gradient (multiplicative weights) update on the Lagrangian dual. The final output is a randomized classifier: a weighted combination of the classifiers produced at each iteration, where the weights are the mixing probabilities from the game-theoretic equilibrium. The key insight is that any fairness-constrained optimization over a base classifier class can be solved by repeatedly solving standard (unconstrained) cost-sensitive classification problems.
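The two-step loop can be sketched in a few lines. This is a conceptual sketch of the multiplicative-weights idea, not Fairlearn's internals: `fit_weighted` is a toy stand-in for the cost-sensitive learner whose behavior (less violation as the constraint weight grows) is assumed for illustration:

```python
import math

def fit_weighted(lam):
    # Toy stand-in for the cost-sensitive classification step: as the
    # constraint weight lam grows, the returned "classifier" violates
    # the fairness constraint less (assumed behavior, illustration only).
    return {"violation": max(0.0, 0.3 - 0.1 * lam)}

eta = 0.5      # step size for the dual update
lam = 1.0      # Lagrange multiplier on the single fairness constraint
history = []
for _ in range(10):
    clf = fit_weighted(lam)                  # (1) solve weighted problem
    history.append(clf)
    lam *= math.exp(eta * clf["violation"])  # (2) multiplicative update:
                                             #     raise weight while violated
# The final randomized classifier mixes the classifiers in `history`.
```

The multiplicative update drives the constraint weight up until the produced classifiers stop violating the constraint, mirroring the dual dynamics described above.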
Question 12
Explain adversarial debiasing. What role does the adversary_loss_weight ($\lambda$) parameter play?
Answer
**Adversarial debiasing** (Zhang, Lemoine, and Mitchell, 2018) trains two networks jointly: a predictor that maps features $X$ to predictions $\hat{Y}$, and an adversary that tries to predict the protected attribute $A$ from the predictor's output. The predictor is trained to maximize prediction accuracy while minimizing the adversary's ability to recover $A$ — effectively forcing the predictor's output to be uninformative about group membership. The combined loss is $L_{\text{pred}} - \lambda \cdot L_{\text{adv}}$, where $L_{\text{pred}}$ is the prediction loss, $L_{\text{adv}}$ is the adversary loss, and $\lambda$ is the `adversary_loss_weight`. Setting $\lambda = 0$ recovers the standard unconstrained model. Increasing $\lambda$ places more weight on fooling the adversary (enforcing fairness) at the expense of prediction accuracy. The parameter $\lambda$ thus controls the fairness-accuracy tradeoff and represents the team's explicit choice about how much accuracy to sacrifice for fairness.
Question 13
What is post-processing threshold adjustment, and why is it particularly useful in regulated industries?
Answer
**Post-processing threshold adjustment** applies group-specific decision thresholds to a model's scores, rather than a single global threshold. For each group, the threshold is chosen to satisfy the desired fairness criterion (e.g., equalized odds, demographic parity) while maintaining accuracy above a floor. It is particularly useful in regulated industries for two reasons. First, **separation of model and policy**: the model itself is validated once (meeting regulatory requirements for model risk management, such as SR 11-7), and the post-processing layer is adjusted independently. This avoids the need to re-validate the entire model when fairness adjustments are made. Second, **transparency**: the threshold adjustment is a simple, interpretable intervention — "we apply a threshold of 0.42 for group 0 and 0.38 for group 1" — which is easier to explain to regulators and auditors than an opaque in-processing method like adversarial debiasing. However, applying different thresholds to different groups may itself raise legal or ethical concerns (disparate treatment), requiring careful legal review.
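The intervention really is as simple as it sounds; a sketch using the illustrative thresholds quoted above (how the thresholds are fitted is omitted):

```python
# Group-specific thresholds layered on top of a validated model's scores.
thresholds = {"group_0": 0.42, "group_1": 0.38}

def decide(score, group):
    """Binary decision from a score, using the group's threshold."""
    return int(score >= thresholds[group])

# The same score of 0.40 clears group_1's threshold but not group_0's.
d0 = decide(0.40, "group_0")  # 0: below 0.42
d1 = decide(0.40, "group_1")  # 1: at or above 0.38
```

Because the model's scores are untouched, only this thin decision layer needs review when the fairness policy changes.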
Question 14
What is reject option classification, and what intuition motivates it?
Answer
**Reject option classification** (Kamiran, Karim, and Zhang, 2012) applies fairness corrections only to instances in the **uncertainty region** — where the model's score is close to the decision boundary (within a specified margin). For instances far from the boundary, the model's prediction is accepted unchanged. For instances near the boundary, predictions are flipped in favor of the disadvantaged group: borderline cases from the disadvantaged group are given the favorable outcome, and borderline cases from the advantaged group are given the unfavorable outcome. The intuition is that the model is least confident about borderline cases, so flipping their predictions has the smallest accuracy cost. The fairness correction is concentrated where it does the least damage to predictive quality, limiting the fairness-accuracy tradeoff to the region where the model is already uncertain.
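A minimal sketch of the decision rule; the threshold, margin, and group names are illustrative defaults, not values from the paper:

```python
def reject_option(score, group, threshold=0.5, margin=0.1,
                  disadvantaged="group_1"):
    """Override predictions only inside the uncertainty region."""
    if abs(score - threshold) <= margin:
        # Borderline case: decide in favor of the disadvantaged group.
        return 1 if group == disadvantaged else 0
    # Confident region: keep the model's own prediction unchanged.
    return int(score >= threshold)
```

A borderline score of 0.55 becomes a favorable outcome for the disadvantaged group and an unfavorable one for the advantaged group, while confident scores (0.9 or 0.2) are never touched.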
Question 15
What is the difference between Fairlearn's MetricFrame and AIF360's ClassificationMetric? When would you use each?
Answer
Fairlearn's `MetricFrame` takes any scikit-learn-compatible metric function and computes it disaggregated by protected attribute. It is flexible (any metric), integrates natively with the scikit-learn ecosystem, and supports operations like `.difference()` and `.ratio()` for computing disparities. AIF360's `ClassificationMetric` is a specialized class that provides a fixed set of fairness-specific metrics (statistical parity difference, average odds difference, equal opportunity difference, Theil index, etc.) computed from AIF360's own dataset wrappers. `MetricFrame` is preferred for day-to-day monitoring and integration with existing ML pipelines because it works directly with numpy arrays and any sklearn metric. `ClassificationMetric` is useful when you need metrics that are not available as standalone sklearn functions (e.g., Theil index, generalized entropy index) or when using AIF360's built-in mitigation algorithms, which expect AIF360 dataset objects. In practice, many teams use Fairlearn for monitoring and reporting and AIF360 for specific mitigation algorithms not available in Fairlearn.
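The pattern `MetricFrame` implements — any metric, disaggregated by group, with a disparity summary — can be mimicked in pure Python. This sketch illustrates the idea only; the function names are not Fairlearn's API:

```python
def metric_by_group(metric, y_true, y_pred, groups):
    """Compute `metric` separately for each group (the MetricFrame pattern)."""
    vals = {}
    for g in sorted(set(groups)):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        vals[g] = metric([y_true[i] for i in idx], [y_pred[i] for i in idx])
    return vals

def accuracy(yt, yp):
    return sum(t == p for t, p in zip(yt, yp)) / len(yt)

# Toy data: group "b" is predicted perfectly, group "a" is not.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 0]
groups = ["a", "a", "a", "b", "b", "b"]

vals = metric_by_group(accuracy, y_true, y_pred, groups)
diff = max(vals.values()) - min(vals.values())  # analogue of .difference()
```

Any scalar metric with the `(y_true, y_pred)` signature can be dropped in, which is exactly what makes this pattern convenient for day-to-day monitoring.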
Question 16
What is a fairness review board, and why is it cross-functional?
Answer
A **fairness review board** (FRB) is an organizational body that reviews ML systems for fairness before deployment and periodically after deployment. It is cross-functional because fairness decisions require expertise that no single discipline possesses. Data scientists provide technical assessment (computing metrics, evaluating tradeoffs). Legal and compliance professionals ensure regulatory requirements are met (ECOA, FCRA, Title VII). Domain experts provide business context (what does the model's decision mean for real people?). Ethicists or external advisors assess values alignment (which fairness criterion reflects the organization's values?). Representatives from affected communities provide lived experience (what harms are actually felt?). The impossibility theorem guarantees that fairness involves tradeoffs among incompatible criteria, and making those tradeoffs is an ethical decision that must be informed by multiple perspectives. A purely technical team may optimize the wrong criterion; a purely legal team may stop at regulatory compliance without addressing substantive harm.
Question 17
How should a team decide which fairness criterion to enforce for a specific model? Describe the metric selection framework.
Answer
The metric selection framework maps the primary harm the model could cause to the fairness criterion that best addresses it. If the primary harm is denying a benefit to qualified individuals (e.g., creditworthy applicants denied loans), **equal opportunity** is favored — it ensures equal TPR across groups. If the primary harm is differential error rates (errors concentrated in one group), **equalized odds** is favored. If the primary harm is underrepresentation in outcomes, **demographic parity** is favored. If predictions must mean the same thing across groups (a score of 0.7 must have the same semantics), **calibration or predictive parity** is favored. If the decision is subject to disparate impact law, the **four-fifths rule** provides a legally grounded threshold. If individual-level treatment is the core concern, **individual or counterfactual fairness** is favored. In practice, teams monitor multiple metrics and set binding constraints on one or two. The choice must be documented with its rationale, reviewed by the FRB, and revisited periodically.
Question 18
Why must fairness be monitored continuously, not just assessed once at deployment?
Answer
Fairness can degrade over time for several reasons. **Data distribution shift:** the demographic composition of the population may change (e.g., a lender expands to a new geographic market), altering base rates and selection rates. **Feature drift:** upstream features may change their relationship with the protected attribute (e.g., a data source update changes how zip codes are encoded). **Label drift:** the ground truth outcomes may shift (e.g., economic conditions change default rates differently across groups). **Feedback loops:** the model's own decisions may alter future data (e.g., denied applicants cannot build credit, reinforcing the disparity). **Population self-selection:** who applies may change in response to the model's perceived fairness. Any of these can cause a model that was fair at deployment to become unfair over time. Continuous monitoring — computing fairness metrics at every retraining and scoring batch, with alerting thresholds and quarterly FRB reviews — ensures that drift is caught before it causes harm.
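The alerting-threshold part of such a monitoring loop can be sketched simply; the floor and warning margin below are illustrative policy choices, not prescribed values:

```python
def fairness_alert(parity_ratio, floor=0.80, warn_margin=0.05):
    """Classify a monitored demographic parity ratio for alerting."""
    if parity_ratio < floor:
        return "ALERT"   # below the four-fifths floor: escalate to the FRB
    if parity_ratio < floor + warn_margin:
        return "WARN"    # approaching the floor: investigate drift
    return "OK"
```

Run against each scoring batch, a check like this turns a slow drift (say, 0.83 down to 0.76 over six months) into a warning before it becomes a violation.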
Question 19
In the StreamRec context, what is creator fairness, and how does it differ from user fairness?
Answer
**Creator fairness** concerns whether content creators receive equitable exposure (recommendations, impressions) from the algorithm, relative to their content production. A recommendation system that systematically directs most impressions to a small set of established creators, while ignoring equally good content from smaller or newer creators, may be unfair to the underexposed creators — particularly if the underexposure correlates with creator demographics (region, language, account age). Creator fairness is measured by the exposure equity ratio: the fraction of impressions a group receives divided by the fraction of content it produces. **User fairness** concerns whether users from different demographic groups receive equally good recommendations — measured by metrics like Hit@10, NDCG@10, and completion rate across user demographic groups. The two can conflict: maximizing user fairness (giving every user the best possible recommendations) may concentrate exposure on a small set of popular creators, harming creator fairness. A fairness-aware re-ranking module must balance both objectives.
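The exposure equity ratio is a direct computation; the impression and content counts below are invented for illustration:

```python
# Exposure equity ratio per creator group:
# (share of impressions received) / (share of content produced).
impressions = {"established": 9000, "emerging": 1000}
content     = {"established": 6000, "emerging": 4000}

total_imp = sum(impressions.values())
total_con = sum(content.values())

equity = {g: (impressions[g] / total_imp) / (content[g] / total_con)
          for g in impressions}
# established: 1.5 (over-exposed); emerging: 0.25 (under-exposed)
```

A ratio of 1.0 means a group's exposure matches its share of production; here emerging creators receive a quarter of the exposure their output would warrant.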
Question 20
A credit scoring model passes the four-fifths rule at deployment (demographic parity ratio = 0.83). Six months later, the ratio drops to 0.76. The equalized odds difference remains stable. What are the most likely root causes, and what should the team do?