Appendix G: Quick Reference Cards

Rapid Lookup Reference for AI Ethics Practitioners


Reference Card 1: Fairness Metrics Cheat Sheet

The Six Core Fairness Metrics


Metric 1: Demographic Parity (Statistical Parity / Group Fairness)

Definition: The proportion of individuals receiving a positive outcome is equal across demographic groups.

In Plain Language: The approval rate should be the same for Group A and Group B — regardless of other characteristics.

When to Use:
- When the goal is proportional representation in outcomes
- Anti-discrimination law contexts where equal selection rates are required
- When base rates of the underlying characteristic are similar across groups
- When there is reason to believe historical data encodes discrimination that should not be perpetuated

Key Limitation:
- Does not account for legitimate differences in qualifications or risk between groups
- Can require approving less-qualified applicants from one group to match approval rates
- Mathematically: satisfying demographic parity is incompatible with predictive parity when base rates differ

Example: A hiring algorithm that screens 1,000 male applicants and 1,000 female applicants should, under demographic parity, select the same percentage of each group. If it selects 30% of men but only 15% of women, it violates demographic parity.
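The hiring example can be checked mechanically. A minimal sketch in Python, assuming binary decisions and a parallel list of group labels (the helper name and data are illustrative, not from any particular library):

```python
# Demographic parity check: compare positive-decision rates across groups.
# Helper name and data are illustrative only.

def demographic_parity_gap(decisions, groups):
    """Return (max rate difference between groups, per-group positive rates)."""
    counts = {}
    for d, g in zip(decisions, groups):
        pos, total = counts.get(g, (0, 0))
        counts[g] = (pos + d, total + 1)
    rates = {g: pos / total for g, (pos, total) in counts.items()}
    return max(rates.values()) - min(rates.values()), rates

# The card's hiring example: 30% of men selected vs. 15% of women.
decisions = [1] * 300 + [0] * 700 + [1] * 150 + [0] * 850
groups = ["men"] * 1000 + ["women"] * 1000
gap, rates = demographic_parity_gap(decisions, groups)
print(rates)          # {'men': 0.3, 'women': 0.15}
print(round(gap, 4))  # 0.15 -> demographic parity is violated
```

A gap of zero means exact demographic parity; in practice, tolerance thresholds such as the four-fifths ratio are often applied rather than demanding a strict zero gap.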


Metric 2: Equal Opportunity

Definition: The true positive rate (recall, sensitivity) is equal across demographic groups — among those who deserve a positive outcome, the same proportion of each group receives it.

In Plain Language: Among qualified applicants, the acceptance rate should be the same for all groups.

When to Use:
- When the cost of false negatives (missing qualified candidates) falls unequally on groups
- Employment and lending contexts where deserving individuals are the focus
- When you want to ensure that merit is rewarded equally across groups

Key Limitation:
- Only addresses false negatives (people who should be accepted but are rejected); does not address false positives
- Requires reliable ground-truth labels for who "deserves" a positive outcome, which may themselves be biased

Example: Among loan applicants who would repay their loans if approved, the same proportion of Black and white applicants should receive loan approval. If 80% of creditworthy white applicants are approved but only 55% of creditworthy Black applicants are approved, equal opportunity is violated.
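The loan example translates directly into a true-positive-rate comparison. A minimal sketch, assuming ground-truth creditworthiness labels are available (the helper name is hypothetical):

```python
# Equal opportunity check: compare true positive rates (recall) across groups.
# Hypothetical helper; y_true == 1 means "would repay if approved".

def tpr_by_group(y_true, y_pred, groups):
    """Among actual positives in each group, the share predicted positive."""
    stats = {}
    for yt, yp, g in zip(y_true, y_pred, groups):
        if yt == 1:  # equal opportunity only conditions on qualified individuals
            tp, pos = stats.get(g, (0, 0))
            stats[g] = (tp + yp, pos + 1)
    return {g: tp / pos for g, (tp, pos) in stats.items()}

# The card's lending example: 80% vs. 55% of creditworthy applicants approved.
y_true = [1] * 100 + [1] * 100  # all applicants here are creditworthy
y_pred = [1] * 80 + [0] * 20 + [1] * 55 + [0] * 45
groups = ["white"] * 100 + ["Black"] * 100
print(tpr_by_group(y_true, y_pred, groups))  # {'white': 0.8, 'Black': 0.55}
```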


Metric 3: Equalized Odds

Definition: Both the true positive rate AND the false positive rate are equal across demographic groups.

In Plain Language: Among people who deserve a positive outcome, equal proportions get one; among people who do not deserve a positive outcome, equal proportions still get one incorrectly.

When to Use:
- When both types of errors (false positives and false negatives) have significant consequences
- Criminal justice contexts where both wrongful conviction (false positive) and wrongful acquittal (false negative) matter
- When the most stringent error-rate-based fairness standard is required

Key Limitation:
- Impossible to satisfy simultaneously with calibration when base rates differ (the Chouldechova impossibility result)
- Requires ground truth about who "deserves" a positive outcome

Example: If COMPAS assigns a "high risk" score to a defendant, equalized odds requires that (1) among defendants who will reoffend, equal proportions of Black and white defendants get the "high risk" score; AND (2) among defendants who will not reoffend, equal proportions of Black and white defendants are falsely labeled "high risk."
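Both error rates in the definition can be computed per group. A minimal sketch (hypothetical helper; the data is constructed so that TPRs match but FPRs do not, so equal opportunity holds while equalized odds is violated):

```python
# Equalized odds check: both TPR and FPR must match across groups.
# Hypothetical helper; y_true == 1 means the person will reoffend.

def error_rates_by_group(y_true, y_pred, groups):
    """Return {group: (TPR, FPR)}."""
    counts = {}
    for yt, yp, g in zip(y_true, y_pred, groups):
        tp, pos, fp, neg = counts.get(g, (0, 0, 0, 0))
        if yt == 1:
            tp, pos = tp + yp, pos + 1
        else:
            fp, neg = fp + yp, neg + 1
        counts[g] = (tp, pos, fp, neg)
    return {g: (tp / pos, fp / neg) for g, (tp, pos, fp, neg) in counts.items()}

# Constructed data: equal TPR (0.8) but unequal FPR (0.4 vs. 0.1).
y_true = [1] * 10 + [0] * 10 + [1] * 10 + [0] * 10
y_pred = [1] * 8 + [0] * 2 + [1] * 4 + [0] * 6 + [1] * 8 + [0] * 2 + [1] * 1 + [0] * 9
groups = ["A"] * 20 + ["B"] * 20
print(error_rates_by_group(y_true, y_pred, groups))
# {'A': (0.8, 0.4), 'B': (0.8, 0.1)} -> equalized odds violated
```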


Metric 4: Predictive Parity (Calibration / Test Fairness)

Definition: Among individuals who receive the same predicted score or decision, the actual outcome rate is the same across demographic groups.

In Plain Language: When the model says someone has a 40% risk of X, that 40% prediction should be equally accurate for all demographic groups.

When to Use:
- When the score is used to communicate a probability (e.g., recidivism risk, credit risk)
- When decision-makers rely on the score's literal meaning, not just its ranking
- The standard emphasized by Northpointe in the COMPAS controversy

Key Limitation:
- Can be satisfied while still producing racially disparate false positive rates (as in the COMPAS case)
- Impossible to simultaneously satisfy with equalized odds when base rates differ across groups

Example: If a recidivism algorithm assigns a score of 7/10 to Black and white defendants, and among those scored 7/10 exactly 70% of both groups reoffend, the algorithm is calibrated even if it assigns the 7/10 score to very different proportions of Black and white defendants.
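The calibration example can be verified by conditioning on the score. A minimal sketch with a hypothetical helper and constructed data:

```python
# Calibration check: among people assigned the same score, observed outcome
# rates should match across groups. Helper name and data are illustrative.

def outcome_rate_at_score(scores, outcomes, groups, score_value):
    """Among individuals assigned `score_value`, outcome rate per group."""
    stats = {}
    for s, o, g in zip(scores, outcomes, groups):
        if s == score_value:
            hits, n = stats.get(g, (0, 0))
            stats[g] = (hits + o, n + 1)
    return {g: hits / n for g, (hits, n) in stats.items()}

# The card's example: among defendants scored 7/10, 70% of both groups reoffend,
# even though different proportions of each group received that score.
scores   = [7] * 10 + [7] * 20 + [3] * 5
outcomes = [1] * 7 + [0] * 3 + [1] * 14 + [0] * 6 + [0] * 5
groups   = ["Black"] * 10 + ["white"] * 20 + ["white"] * 5
print(outcome_rate_at_score(scores, outcomes, groups, 7))
# {'Black': 0.7, 'white': 0.7} -> calibrated at score 7
```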


Metric 5: Individual Fairness

Definition: Similar individuals should receive similar predictions or outcomes — regardless of their demographic group.

In Plain Language: Two people who are alike in all relevant ways should be treated alike.

When to Use:
- When fairness is best conceptualized at the individual level
- When group-level metrics are insufficient to capture inequities
- Theoretical counterpoint to group-based metrics

Key Limitation:
- Requires defining a meaningful notion of "similar" individuals — a hard problem that can hide contested value judgments
- Does not guarantee equal treatment across demographic groups at the population level
- Often difficult to operationalize in practice

Example: Two loan applicants with the same credit score, income, and debt-to-income ratio should receive the same lending decision regardless of their race or gender.
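A brute-force spot check pairs up similar individuals and compares their scores. A sketch under strong assumptions: the similarity test below is a toy choice, and choosing it well is exactly the hard problem noted above.

```python
# Individual-fairness spot check: similar applicants should get similar scores.
# The similarity metric is a toy, contestable choice -- defining "similar" is
# itself the hard problem this metric faces.

def similar(feats_a, feats_b, tol=0.05):
    """Toy similarity: every (normalized) feature within `tol`."""
    return all(abs(x - y) <= tol for x, y in zip(feats_a, feats_b))

def individual_fairness_violations(score, applicants, max_gap=0.1):
    """Return index pairs of similar applicants whose scores differ by > max_gap."""
    violations = []
    for i in range(len(applicants)):
        for j in range(i + 1, len(applicants)):
            (fa, ga), (fb, gb) = applicants[i], applicants[j]
            if similar(fa, fb) and abs(score(fa, ga) - score(fb, gb)) > max_gap:
                violations.append((i, j))
    return violations

# Invented scoring rule that (unfairly) keys on group membership:
score = lambda feats, group: 0.9 if group == "A" else 0.2
applicants = [((0.70, 0.50), "A"), ((0.71, 0.49), "B")]  # near-identical features
print(individual_fairness_violations(score, applicants))  # [(0, 1)]
```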


Metric 6: Counterfactual Fairness

Definition: An individual's outcome would be the same in the counterfactual world where they belonged to a different demographic group, with all causally non-relevant characteristics unchanged.

In Plain Language: Changing only your race (or gender, etc.) should not change the model's decision about you.

When to Use:
- Causal inference frameworks; when you want to isolate the direct effect of protected characteristics
- Theoretically rigorous basis for assessing direct discrimination

Key Limitation:
- Requires a causal model of how protected characteristics relate to other variables — which is itself contested
- Difficult to operationalize; often theoretical rather than practical
- Does not address disparate impact arising from variables causally downstream of race (e.g., neighborhood, education)

Example: If you applied for a loan and were rejected, you were treated counterfactually fairly if you would have received the same decision had you been of a different race with all non-race characteristics unchanged.
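A common practical approximation is an attribute-flip test. The sketch below checks only the direct effect of the protected attribute; full counterfactual fairness would also update variables causally downstream of it, which requires a causal model. All names and thresholds here are invented.

```python
# Attribute-flip test: does changing only the protected attribute change the
# decision? This approximates counterfactual fairness but ignores variables
# causally downstream of the attribute, which a full causal model would update.

def flip_test(model, applicant, attr, values):
    """Return the set of decisions the model gives as `attr` varies."""
    return {model(dict(applicant, **{attr: v})) for v in values}

def model(a):
    # Invented rule that improperly uses race in scoring.
    score = a["credit_score"] + (20 if a["race"] == "X" else 0)
    return "approve" if score >= 700 else "reject"

applicant = {"credit_score": 690, "race": "Y"}
print(sorted(flip_test(model, applicant, "race", ["X", "Y"])))
# ['approve', 'reject'] -> the decision depends on race: the flip test fails
```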


Reference Card 2: Ethical Frameworks at a Glance

Framework | Core Claim | Application to AI | Strength | Weakness | Key Thinkers
Consequentialism / Utilitarianism | The right action maximizes overall welfare (happiness, preference satisfaction) | Evaluate AI by aggregate benefits minus harms; optimize for wellbeing | Directly attentive to outcomes; allows comparison and trade-offs | Can justify sacrificing individuals for aggregate benefit; measurement problems; whose welfare counts? | Jeremy Bentham, John Stuart Mill
Kantian Deontology | Actions are right or wrong in themselves, not merely by their consequences; treat persons as ends, never merely as means | AI must respect human dignity; certain uses of AI are prohibited regardless of consequences | Protects individual rights against utilitarian trade-offs; clear prohibitions | May be too absolute; provides limited guidance on how to weigh competing duties | Immanuel Kant
Virtue Ethics | Right action flows from the character of a virtuous agent | Focus on cultivating AI developers and organizations with good character; trustworthy AI as expression of virtue | Attentive to practice and culture, not just rules | Doesn't resolve specific dilemmas; difficult to institutionalize | Aristotle, Alasdair MacIntyre
Rawlsian Justice | A just society is one that rational persons would choose from behind a veil of ignorance; the difference principle | AI systems are just if those who don't know which group they belong to would consent to them; AI must benefit the least advantaged | Powerful tool for identifying unfair AI systems; centers the least advantaged | Relies on hypothetical consent; doesn't fully address historical injustice | John Rawls
Capabilities Approach | Justice requires that all persons have access to a threshold of core human capabilities | Evaluate AI by whether it expands or contracts human capabilities; especially attentive to the most marginalized | Specific and substantive; centers concrete human flourishing | Identifying and weighting capabilities is contested; may be difficult to operationalize | Amartya Sen, Martha Nussbaum
Care Ethics | Moral life is constituted by relationships and responsibilities of care; attentiveness to particular persons and contexts | AI must attend to the relational and contextual dimensions of harm; who bears the care burden of AI systems? | Centers relationships and vulnerability; resists abstraction | May not generate action-guiding principles; can be difficult to scale | Carol Gilligan, Nel Noddings

Reference Card 3: EU AI Act Risk Tiers

The Four-Tier Risk Classification


TIER 1: UNACCEPTABLE RISK — PROHIBITED

These AI applications are prohibited entirely, apart from the narrow exceptions noted below.

Prohibited Practice | Description
Social scoring by governments | Classifying people by social behavior for government purposes with detrimental consequences
Subliminal manipulation | Using techniques beyond consciousness to distort behavior causing significant harm
Exploitation of vulnerability | Targeting vulnerable persons (children, disabled) to distort behavior causing harm
Remote biometric ID in public spaces | Real-time identification for law enforcement, with narrow exceptions
Predictive policing based on profiling | Criminal risk assessment based solely on personal characteristics
Untargeted facial image database scraping | Building databases from internet or CCTV facial images
Emotion inference in workplace/education | Inferring emotional states of workers or students (with exceptions for medical/safety)

Key requirement: Prohibition. Outside the narrow exceptions noted above, no mitigation or safeguard can make these uses compliant.


TIER 2: HIGH RISK

High-risk AI systems are permitted subject to extensive mandatory requirements.

Categories of High-Risk AI:
- Biometric identification and categorization
- Critical infrastructure (energy, water, transport)
- Education and vocational training
- Employment and worker management (including hiring tools)
- Essential private and public services (credit scoring, benefits eligibility)
- Law enforcement (risk assessment, polygraphs, crime analytics)
- Migration and border control
- Administration of justice and democratic processes

Key Requirements:
- Conformity assessment before deployment
- Registration in EU AI Act database
- Risk management system throughout lifecycle
- Training data governance requirements
- Technical documentation and logging
- Transparency and provision of information to users
- Human oversight measures
- Robustness, accuracy, and cybersecurity requirements

Who Must Comply: Providers (who place AI on the market) and deployers (who use it in their business)


TIER 3: LIMITED RISK (TRANSPARENCY OBLIGATIONS)

These AI systems must meet transparency requirements but face no substantive restrictions.

Applies to:
- Chatbots and conversational AI (must disclose AI identity)
- Deepfakes (must label as AI-generated)
- Emotion recognition systems (must notify users)
- AI that generates or manipulates images, audio, video (must label)

Key Requirement: Users must be informed they are interacting with or viewing AI-generated content. No functional restrictions on design or use.


TIER 4: MINIMAL RISK

The vast majority of AI applications. No specific requirements under the AI Act.

Examples: Spam filters, AI-enabled video games, most recommendation systems, AI in manufacturing processes

Note: General-purpose AI models (including large language models) have their own distinct requirements under Chapter V of the Act, separate from this four-tier classification.


Reference Card 4: AI Bias Types (Suresh & Guttag Taxonomy)

Bias Type | Definition | Example | Detection Method
Historical Bias | Bias that exists in the world and is reflected in data even when data is perfectly collected | Medical dataset where Black patients appear less sick because they've received less care historically | Compare model performance to ground truth independent of historical access patterns
Representation Bias | Under-representation of a subpopulation in the training dataset | Facial recognition trained predominantly on lighter-skinned faces performs poorly on darker-skinned faces | Audit training dataset for demographic representation; compare to target population
Measurement Bias | Errors in how features or labels are measured that differ across groups | Arrest records used as a proxy for criminal behavior, but arrest rates reflect policing intensity, not actual crime | Validate measurement against independent ground truth; test for differential measurement error by group
Aggregation Bias | A model trained on aggregate data fails to perform well for subgroups with different underlying patterns | A diabetes prediction model trained on the general population fails for Native American patients with distinct disease presentation | Build separate models or explicitly model subgroup differences; test performance by subgroup
Learning Bias | The model learning process amplifies biases in the data | A word embedding model that associates "programmer" with male pronouns, amplifying gender associations beyond their training data frequency | Measure correlation between model outputs and protected attributes; compare to training data base rates
Deployment Bias | The model is used in a context different from what it was designed for, causing biased outcomes | A medical model designed for research populations is deployed for clinical decisions in a different demographic setting | Conduct ongoing monitoring in deployment; compare deployment population to training population
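Several of these detection methods reduce to simple dataset audits. A minimal sketch of a representation-bias audit, comparing dataset group shares to target-population shares (the population figures below are invented for illustration):

```python
# Representation-bias audit: compare training-set group shares to the target
# population. Population shares here are invented for illustration.

def representation_gaps(dataset_groups, population_shares):
    """Return {group: dataset_share - population_share} for each group."""
    counts = {}
    for g in dataset_groups:
        counts[g] = counts.get(g, 0) + 1
    total = len(dataset_groups)
    return {g: counts.get(g, 0) / total - share
            for g, share in population_shares.items()}

dataset = ["lighter"] * 900 + ["darker"] * 100
population = {"lighter": 0.6, "darker": 0.4}
gaps = representation_gaps(dataset, population)
print({g: round(v, 2) for g, v in gaps.items()})
# {'lighter': 0.3, 'darker': -0.3} -> darker-skinned faces under-represented
```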

Reference Card 5: Key Laws and Regulations

Law / Regulation | What It Is | Who It Applies To | Key AI Ethics Requirement | Enforcement Body
GDPR (EU, 2018) | Comprehensive data protection law | Any organization processing personal data of EU residents | Art. 22: right not to be subject to solely automated decisions with significant effects; rights to access and erasure; data minimization; purpose limitation | National Data Protection Authorities; EU Data Protection Board
EU AI Act (2024) | First comprehensive AI regulation | Providers and deployers of AI systems in the EU market | Prohibits unacceptable-risk AI practices; mandatory requirements (conformity assessment, human oversight, documentation) for high-risk systems; transparency for limited-risk systems | EU AI Office; national market surveillance authorities
CCPA / CPRA (California) | State consumer privacy law | Businesses that collect California residents' personal information above size thresholds | Right to know, delete, opt-out, and correct personal information; limits on use of sensitive data; automated decision-making rights (CPRA) | California Privacy Protection Agency
ECOA (U.S.) | Equal Credit Opportunity Act (1974) | Any creditor | Prohibits credit discrimination based on race, color, religion, national origin, sex, marital status, age; adverse action notice requirement; applies to algorithmic credit scoring | CFPB, federal banking agencies, DOJ
Fair Housing Act (U.S.) | Prohibits housing discrimination (1968) | Sellers, landlords, lenders, real estate agents, platform operators | Prohibits discrimination in sale, rental, financing of housing; disparate impact doctrine applies; applies to algorithmic housing recommendations | HUD, DOJ
EEOC Guidance (U.S.) | EEOC Technical Assistance on AI (2022) | Employers subject to Title VII, ADA, ADEA | AI hiring tools must not create disparate impact; ADA reasonable accommodation may require alternatives to AI screening; four-fifths rule applies | EEOC
Illinois BIPA | Biometric Information Privacy Act (2008) | Illinois entities collecting biometric data (fingerprints, face scans, iris scans) | Written consent before collection; retention and destruction policy; prohibition on sale; private right of action with statutory damages | Private litigation; IL AG
NYC Local Law 144 (2023) | NYC Automated Employment Decision Tool Law | NYC employers using AEDT tools in hiring or promotion | Annual bias audit by independent auditor; public posting of audit results; notice to candidates; opt-out mechanism | NYC DCWP (Department of Consumer and Worker Protection)

Reference Card 6: XAI Methods Comparison

Method | How It Works | When to Use | Output Type | Key Limitation
LIME (Local Interpretable Model-agnostic Explanations) | Perturbs inputs near the instance being explained, fits a simple interpretable model to the black box's local behavior, identifies which features most influence the specific prediction | Explaining individual predictions of any classifier; tabular data, text, or images | Feature importance for a specific instance (local explanation) | Explanations may be inconsistent for similar inputs; computationally intensive; faithfulness to the underlying model is not guaranteed
SHAP (SHapley Additive exPlanations) | Uses cooperative game theory (Shapley values) to assign each feature a fair contribution to the prediction, accounting for all possible feature orderings | Explaining individual predictions and global feature importance; when theoretical guarantees of consistency and accuracy are needed | Feature attribution values for a specific instance; global feature importance when averaged | More computationally expensive than LIME; can be misinterpreted as causal when features are correlated; less intuitive than LIME for non-technical audiences
Counterfactual Explanations | Identifies the minimal change to the input that would change the model's output (e.g., "You would have been approved if your income were $5,000 higher") | Providing actionable explanations to individuals; situations where recourse matters | Description of the nearest decision boundary; actionable changes | May identify counterfactuals that are not actually achievable (e.g., "change your age"); multiple counterfactuals may exist and it's unclear which to surface
Surrogate Decision Trees | Trains an interpretable decision tree to approximate the black box model's behavior globally or locally | When global interpretability is needed; regulatory contexts where decision rules must be auditable | Rule-based decision logic (if-then-else tree) | Loses accuracy when approximating complex models; trees can become unwieldy at the depth needed to capture complex behavior
Saliency Maps | For image or text models, computes the gradient of the output with respect to the input to identify which pixels or tokens most influenced the prediction | Image classification; NLP; understanding what the model "looks at" | Heatmap overlay on input image or highlighted tokens in text | Can produce unstable or noisy attributions; gradients measure local sensitivity, not necessarily causal influence; may highlight irrelevant regions
Integrated Gradients | Computes the average gradient of the output with respect to the input along a path from a baseline (e.g., blank image) to the actual input, providing attribution scores | When more reliable saliency attributions are needed than gradient-only methods; NLP and image models | Feature attribution scores (similar to SHAP for neural networks) | Requires choice of baseline, which affects attributions; does not provide globally interpretable rules; can still produce counterintuitive attributions
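Counterfactual explanations are the easiest of these methods to illustrate. A brute-force sketch against an invented approval rule, searching for the smallest income change that flips the decision (real methods, such as Wachter-style approaches, optimize over all features jointly):

```python
# Brute-force counterfactual explanation: find the smallest income increase
# that flips a simple, invented approval rule. Thresholds are illustrative.

def approve(income, debt_ratio):
    return income >= 45_000 and debt_ratio <= 0.4

def income_counterfactual(income, debt_ratio, step=1_000, max_steps=100):
    """Smallest income increase (in `step` units) that flips the decision."""
    if approve(income, debt_ratio):
        return 0  # already approved; no change needed
    for k in range(1, max_steps + 1):
        if approve(income + k * step, debt_ratio):
            return k * step
    return None  # no achievable counterfactual along this feature alone

print(income_counterfactual(41_000, 0.35))  # 4000
print(income_counterfactual(41_000, 0.60))  # None (income alone can't flip it)
```

The second call illustrates a limitation from the table: when the decision hinges on another feature, no counterfactual along this single axis exists, and the method must search other (possibly non-actionable) features.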

These reference cards provide quick-lookup definitions. For deeper treatment of any concept, consult the relevant chapter of the textbook. For regulatory details, always verify current requirements with qualified legal counsel, as AI law is developing rapidly.