Chapter 26 Quiz: Fairness, Explainability, and Transparency
Multiple Choice
Question 1. A model satisfies demographic parity when:
a) The model's accuracy is the same for all demographic groups
b) The probability of a positive outcome is the same across all demographic groups
c) Similar individuals receive similar predictions regardless of group membership
d) The model's predicted probabilities match actual outcome rates for all groups
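To make the definition concrete, here is a minimal sketch of a demographic parity check; the predictions and group labels are invented for illustration:

```python
# Hedged sketch: demographic parity holds when the positive-prediction
# rate is (approximately) equal across groups. All data here is made up.
preds  = [1, 0, 1, 1, 0, 1, 0, 0]                 # model's positive/negative calls
groups = ["a", "a", "a", "a", "b", "b", "b", "b"] # group membership per instance

def positive_rates(preds, groups):
    """Positive-prediction rate for each group."""
    rates = {}
    for g in sorted(set(groups)):
        idx = [i for i, gg in enumerate(groups) if gg == g]
        rates[g] = sum(preds[i] for i in idx) / len(idx)
    return rates

rates = positive_rates(preds, groups)
print(rates)  # {'a': 0.75, 'b': 0.25} → rates differ, so parity is violated
```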
Question 2. The impossibility theorem of algorithmic fairness (Chouldechova/Kleinberg) states that:
a) It is impossible to build a machine learning model that is completely fair
b) When base rates differ across groups, a classifier cannot simultaneously satisfy calibration (predictive parity) and equalized odds (equal false positive and false negative rates)
c) Fairness and accuracy are always inversely correlated
d) No machine learning model can satisfy any fairness definition perfectly
Question 3. Under the 4/5ths rule, adverse impact is indicated when a group's selection rate is less than what percentage of the highest group's selection rate?
a) 50 percent
b) 70 percent
c) 80 percent
d) 90 percent
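The rule behind this question can be checked in a few lines; the group names and selection rates below are invented:

```python
# Sketch of a 4/5ths (four-fifths) rule check. A group's selection rate is
# divided by the highest group's rate; a ratio below 0.80 flags adverse
# impact. Group names and rates are hypothetical.
def adverse_impact_ratios(selection_rates):
    """Each group's selection rate divided by the highest group's rate."""
    top = max(selection_rates.values())
    return {g: rate / top for g, rate in selection_rates.items()}

rates = {"group_a": 0.60, "group_b": 0.45}  # selected / applied, per group
ratios = adverse_impact_ratios(rates)
flagged = {g: r for g, r in ratios.items() if r < 0.80}

print(ratios)   # group_b: 0.45 / 0.60 = 0.75 → below 0.80
print(flagged)  # adverse impact indicated for group_b
```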
Question 4. Which of the following best describes the difference between disparate treatment and disparate impact?
a) Disparate treatment concerns outcomes; disparate impact concerns inputs
b) Disparate treatment involves intentional use of protected characteristics; disparate impact involves facially neutral practices that disproportionately affect a protected group
c) Disparate treatment applies to AI systems; disparate impact applies only to human decision-makers
d) Disparate treatment is a statistical test; disparate impact is a legal standard
Question 5. A model is described as interpretable when:
a) A post-hoc explanation method (like SHAP) can approximate its behavior
b) Its internal mechanics are simple enough that a human can trace the decision logic directly
c) It produces the same output every time for the same input
d) It has been validated on a test set with known outcomes
Question 6. SHAP values are grounded in which mathematical framework?
a) Bayesian probability theory
b) Information theory (entropy)
c) Cooperative game theory (Shapley values)
d) Linear algebra (matrix decomposition)
Question 7. Which property makes SHAP values particularly useful for individual prediction explanations?
a) They are always positive for important features
b) They are additive — they sum to the difference between the prediction and the average prediction
c) They are identical for all instances
d) They only work with tree-based models
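The additivity property can be verified by hand on a tiny example. The sketch below computes exact Shapley values for a made-up two-feature linear model over a three-row background dataset (all numbers invented) and shows that the values sum to the prediction minus the average prediction:

```python
# Exact Shapley values for a two-feature toy model, illustrating
# additivity. The model, instance, and background data are invented.
background = [(1.0, 0.0), (3.0, 2.0), (5.0, 4.0)]  # reference dataset
x = (4.0, 1.0)                                      # instance to explain

def model(x1, x2):
    return 2.0 * x1 + 3.0 * x2   # simple linear "model"

def coalition_value(S):
    """Average model output with features in S fixed to x's values
    and the remaining features drawn from the background data."""
    outs = []
    for b in background:
        z = [x[i] if i in S else b[i] for i in range(2)]
        outs.append(model(*z))
    return sum(outs) / len(outs)

def shapley(i):
    # With only two features, the Shapley formula reduces to two equally
    # weighted marginal contributions: joining the empty coalition and
    # joining the coalition that already contains the other feature.
    other = 1 - i
    return 0.5 * (coalition_value({i}) - coalition_value(set())) + \
           0.5 * (coalition_value({0, 1}) - coalition_value({other}))

phi = [shapley(0), shapley(1)]
base = coalition_value(set())           # average (baseline) prediction
print(sum(phi), model(*x) - base)       # both are -1.0: additivity holds
```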
Question 8. LIME generates explanations by:
a) Computing exact Shapley values for each feature
b) Creating a local linear approximation of the model's behavior around a specific instance
c) Removing features one at a time and measuring accuracy degradation
d) Analyzing the model's internal weights and biases
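The idea tested here can be sketched without the lime library itself: perturb an instance, query the black-box model, and fit a distance-weighted linear surrogate. The model, kernel width, and sample count below are invented for illustration:

```python
import numpy as np

# Hedged sketch of LIME's core idea (not the actual lime package): the
# surrogate's coefficients approximate the model's local behavior.
rng = np.random.default_rng(0)

def black_box(X):
    # Nonlinear "model" we want to explain locally (made up).
    return np.sin(X[:, 0]) + X[:, 1] ** 2

x0 = np.array([0.0, 1.0])                  # instance to explain
Z = x0 + 0.1 * rng.normal(size=(500, 2))   # local perturbations
y = black_box(Z)

# Proximity kernel: samples closer to x0 get larger weight.
w = np.exp(-np.sum((Z - x0) ** 2, axis=1) / 0.01)

# Weighted least squares with an intercept column.
A = np.column_stack([np.ones(len(Z)), Z - x0])
sw = np.sqrt(w)
coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
print(coef[1:])  # ≈ local gradient at x0: cos(0) = 1 and 2·x1 = 2
```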
Question 9. When would you choose LIME over SHAP for explaining a model's predictions?
a) When regulatory compliance requires consistency and an audit trail
b) When you need global feature importance across all predictions
c) When you need a quick, model-agnostic explanation for debugging a specific prediction
d) When the model is a decision tree
Question 10. A model card, as defined by Mitchell et al. (2019), is analogous to:
a) A software license agreement
b) A nutrition label for food products
c) A financial audit report
d) A user manual for a software application
Question 11. Which of the following is NOT a standard section in a model card?
a) Intended use and out-of-scope uses
b) Performance metrics broken down by demographic group
c) Source code and model weights
d) Ethical considerations and known biases
Question 12. Timnit Gebru et al.'s "Datasheets for Datasets" framework focuses on documenting:
a) The algorithms used to process the data
b) The provenance, composition, collection process, and intended uses of training data
c) The hyperparameters used during model training
d) The deployment infrastructure and serving architecture
Question 13. Under GDPR Article 22, individuals have the right to:
a) Access the source code of any AI system that makes decisions about them
b) Demand that all AI systems be replaced with human decision-makers
c) Not be subject to decisions based solely on automated processing that significantly affect them, and to obtain meaningful information about the logic involved
d) Receive monetary compensation for any automated decision they disagree with
Question 14. The EU AI Act classifies AI systems into risk categories. Credit scoring and hiring algorithms are classified as:
a) Unacceptable risk (prohibited)
b) High risk (subject to transparency, oversight, and fairness requirements)
c) Limited risk (transparency obligations only)
d) Minimal risk (no specific requirements)
Question 15. Permutation importance measures a feature's importance by:
a) Examining the feature's coefficient in a linear model
b) Counting how often the feature is used for splitting in tree-based models
c) Randomly shuffling the feature and measuring the degradation in model performance
d) Computing the correlation between the feature and the target variable
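A minimal sketch of the shuffling procedure, using a synthetic dataset and a stand-in "model" that relies only on the first feature (everything here is invented):

```python
import numpy as np

# Permutation importance sketch: shuffle one column at a time and
# measure how much the score drops relative to the unshuffled baseline.
rng = np.random.default_rng(42)

X = rng.normal(size=(1000, 3))
y = (X[:, 0] > 0).astype(int)          # only feature 0 determines the label

def score(X):
    preds = (X[:, 0] > 0).astype(int)  # stand-in for a trained classifier
    return (preds == y).mean()         # accuracy

baseline = score(X)
importances = []
for j in range(X.shape[1]):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])   # break the feature-target link
    importances.append(baseline - score(Xp))

print(importances)  # large drop for feature 0; zero for features 1 and 2,
                    # since this stand-in model ignores them entirely
```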
Question 16. In Athena's churn model, Tom discovers that zip code is the third most important feature. The primary concern is that zip code:
a) Is computationally expensive to encode
b) Serves as a proxy for race and socioeconomic status, introducing indirect discrimination
c) Is not available for all customers
d) Changes too frequently to be a reliable predictor
Question 17. Athena's solution to the zip code problem involves:
a) Removing all geographic features from the model
b) Replacing zip code with geographic purchasing pattern features that capture useful signal with less demographic encoding
c) Adding race as an explicit feature to control for the correlation
d) Using a more complex neural network that learns to ignore zip code
Question 18. A partial dependence plot (PDP) shows:
a) The importance of each feature relative to all other features
b) The marginal effect of one feature on predictions, averaged over all instances
c) The correlation between two features
d) The distribution of SHAP values for a single feature
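The averaging involved in a partial dependence plot can be sketched directly: for each grid value, fix the feature at that value for every instance and average the model's predictions. The toy model and data below are invented:

```python
import numpy as np

# Partial dependence sketch for feature 0 of a made-up model. For a
# linear dependence on feature 0, the PDP is a straight line of slope 2.
rng = np.random.default_rng(1)

X = rng.normal(size=(200, 2))

def model(X):
    return 2.0 * X[:, 0] + np.sin(X[:, 1])   # toy prediction function

grid = np.linspace(-2, 2, 5)
pdp = []
for v in grid:
    Xv = X.copy()
    Xv[:, 0] = v                  # fix feature 0 at the grid value
    pdp.append(model(Xv).mean())  # average over all other features

print(list(zip(grid, pdp)))  # adjacent grid points differ by 2.0: slope 2
```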
Short Answer
Question 19. In 2-3 sentences, explain why a model that satisfies calibration might still be considered unfair. Use a concrete example.
Question 20. A colleague argues that the impossibility theorem means fairness is a futile pursuit and that organizations should focus only on accuracy. In 3-4 sentences, explain why this argument is flawed.
Question 21. Explain the difference between a global SHAP explanation and a local SHAP explanation. When would a business stakeholder need each type?
Question 22. Professor Okonkwo says: "If you can't explain your model's decision to the person affected by it, you shouldn't be making that decision algorithmically." Identify one scenario where you agree with this statement and one scenario where you might disagree. Justify both positions in 3-4 sentences total.
Question 23. A company deploys a customer segmentation model that groups customers into "premium," "standard," and "budget" tiers for marketing purposes. An external audit reveals that customers from a minority demographic group make up 74 percent of the "budget" tier, while that group represents only 31 percent of the total customer base. The company argues that the model does not use race as an input. Using concepts from this chapter, explain in 4-5 sentences why the company's defense is insufficient and what steps it should take.
True or False
Question 24. True or False: SHAP values can only be computed for tree-based models such as random forests and gradient boosting.
Question 25. True or False: Under the EU AI Act, a company deploying a chatbot for customer service must disclose to users that they are interacting with an AI system, even though chatbots are classified as "limited risk" rather than "high risk."
Answer key available in Appendix B: Answers to Selected Exercises and Quiz Questions.