Quiz: Responsible AI Development
Test your understanding before moving to the next chapter. Target: 70% or higher (20 of 28 points) to proceed.
Section 1: Multiple Choice (1 point each)
1. Model cards (Mitchell et al., 2019) are best described as:
- A) Marketing documents for AI products
- B) Structured documentation that records an AI model's purpose, performance, limitations, and ethical considerations for multiple audiences
- C) Technical specifications for model architecture and hyperparameters
- D) Legal compliance documents required by the EU AI Act
Answer
**B)** Structured documentation that records an AI model's purpose, performance, limitations, and ethical considerations for multiple audiences. *Explanation:* Section 29.2 describes model cards as documentation designed for multiple audiences: technical reviewers assessing methodology, ethics committees evaluating risks, regulators verifying compliance, and affected communities understanding how the model works and where it may fail. They are not marketing documents (A), purely technical specs (C), or legally mandated (D), though they may support regulatory compliance.

2. Datasheets for datasets (Gebru et al., 2021) are analogous to:
- A) A food nutrition label
- B) A product warranty
- C) A clinical trial protocol that discloses the study population and methodology
- D) A software license agreement
Answer
**C)** A clinical trial protocol that discloses the study population and methodology. *Explanation:* Section 29.3 draws this analogy explicitly: "Publishing model cards without datasheets is like publishing a drug label without disclosing the clinical trial population." A datasheet documents the dataset's composition, collection methodology, known biases, and limitations -- just as a clinical trial protocol documents the study population, enabling others to evaluate whether the results generalize.

3. The ModelCard dataclass in the chapter generates a report structured for which of the following audiences?
- A) Only data scientists
- B) Technical reviewers, ethics committees, regulators, and affected communities
- C) Only senior management
- D) Only legal counsel
Answer
**B)** Technical reviewers, ethics committees, regulators, and affected communities. *Explanation:* Section 29.4.1 states that the report is "structured for multiple audiences: Technical reviewers can assess performance and methodology. Ethics committees can evaluate risks and safeguards. Regulators can verify compliance and accountability. Affected communities can understand how the model works and where it may fail."

4. VitraMed's model card revealed that the clinical prediction model's overall accuracy was 92%. What did disaggregated analysis show?
- A) Accuracy was uniform across all demographic groups
- B) Accuracy was significantly lower for patients over 65 and for patients from predominantly Hispanic neighborhoods
- C) Accuracy was higher for minority populations due to oversampling
- D) Accuracy data was unavailable for demographic subgroups
Answer
**B)** Accuracy was significantly lower for patients over 65 and for patients from predominantly Hispanic neighborhoods. *Explanation:* Section 29.3.3 and the model card application in Section 29.4.2 reveal that VitraMed's model performed less accurately for older patients and patients from clinics in predominantly Hispanic neighborhoods -- likely because these populations were underrepresented in the training data. The aggregate 92% accuracy concealed this disparity.

5. Red-teaming in AI development refers to:
- A) A software testing methodology focused on code quality
- B) Adversarial testing by a team specifically tasked with finding failure modes, vulnerabilities, and harmful outputs that standard testing misses
- C) A management technique for motivating development teams
- D) Testing AI systems exclusively for cybersecurity vulnerabilities
Answer
**B)** Adversarial testing by a team specifically tasked with finding failure modes, vulnerabilities, and harmful outputs that standard testing misses. *Explanation:* Section 29.5 defines red-teaming as an adversarial testing methodology drawn from military and cybersecurity practice. Red teams adopt the perspective of adversaries or edge-case users, actively trying to make the system fail. This differs from standard testing (which verifies that the system works as designed) by seeking the failures that the designers did not anticipate.

6. Model drift refers to:
- A) A model being deployed in a different country than where it was developed
- B) Changes in model performance over time due to evolving data distributions, changing real-world conditions, or degradation of input quality
- C) The process of updating a model's architecture
- D) Moving a model from development to production environments
Answer
**B)** Changes in model performance over time due to evolving data distributions, changing real-world conditions, or degradation of input quality. *Explanation:* Section 29.6 defines model drift as the degradation of model performance after deployment. This can occur through concept drift (the relationship between features and outcomes changes), data drift (the distribution of input data shifts), or performance degradation (the model's accuracy declines as conditions evolve). Drift requires ongoing monitoring because a model that performs well at launch may perform poorly months later.

7. The chapter argues that publishing model cards without datasheets is problematic because:
- A) Datasheets are legally required while model cards are optional
- B) The model card may show high aggregate performance while the datasheet would reveal that the training data systematically underrepresents certain populations
- C) Datasheets are easier to produce than model cards
- D) Model cards and datasheets contain identical information
Answer
**B)** The model card may show high aggregate performance while the datasheet would reveal that the training data systematically underrepresents certain populations. *Explanation:* Section 29.3.3 makes this point directly: "The model card says 'accuracy: 92%,' but the datasheet would reveal 'accuracy: 92% for suburban patients aged 35-65; accuracy: 78% for urban patients over 75.' The aggregate hides the disparity." Without the datasheet, users cannot evaluate whether the model's documented performance will generalize to their specific context.

8. The OECD AI Principles include five principles. Which of the following is one of them?
- A) Maximum data collection for model improvement
- B) Transparency and explainability
- C) Algorithmic secrecy for competitive advantage
- D) Automated decision-making without human oversight
Answer
**B)** Transparency and explainability. *Explanation:* Section 29.1 describes the OECD AI Principles (adopted May 2019): inclusive growth/sustainable development/well-being; human-centered values and fairness; transparency and explainability; robustness/security/safety; and accountability. Options A, C, and D contradict the principles' intent.

9. The ModelCard dataclass includes a `review_status` field. This field is designed to document:
- A) Whether the model has been commercially licensed
- B) Whether the model has been reviewed by an ethics committee and the outcome of that review
- C) The model's technical review score on code quality
- D) Whether users have reviewed the model's outputs
Answer
**B)** Whether the model has been reviewed by an ethics committee and the outcome of that review. *Explanation:* The `review_status` field (Section 29.4.1) is part of the "Governance" section of the model card, designed to record information like "Approved by Ethics Committee 2026-02-15." It connects the model documentation to the organizational governance infrastructure discussed in Chapters 26 and 28.

10. The chapter describes a documentation pipeline connecting multiple tools. The correct order is:
- A) Model card -> Datasheet -> Quality audit -> Lineage tracker
- B) Datasheet -> Quality audit -> Lineage tracker -> Model card
- C) Lineage tracker -> Model card -> Datasheet -> Quality audit
- D) Quality audit -> Model card -> Lineage tracker -> Datasheet
Answer
**B)** Datasheet -> Quality audit -> Lineage tracker -> Model card. *Explanation:* Section 29.4.1 describes the pipeline: "the datasheet documents the training data, the quality auditor assesses its integrity, the lineage tracker follows it through the pipeline, and the model card documents the resulting model." This sequence follows the logical flow from data documentation through processing to model documentation.

Section 2: True/False with Justification (1 point each)
11. "A model card should only include information that shows the model in a favorable light."
Answer
**False.** *Explanation:* The entire purpose of model cards is to provide honest, comprehensive documentation including limitations, ethical concerns, fairness gaps, and out-of-scope uses. Section 29.2 emphasizes that model cards should document where the model may fail, not just where it succeeds. The `ModelCard` dataclass includes dedicated fields for `limitations`, `risks_and_harms`, and `ethical_considerations`, and the code generates a WARNING if no ethical considerations or limitations are documented.

12. "Red-teaming is only relevant for large language models and chatbots, not for traditional machine learning models."
Answer
**False.** *Explanation:* Section 29.5 describes red-teaming as applicable to any AI system, not just language models. Red-teaming can test for adversarial inputs that fool image classifiers, edge cases that cause prediction models to fail, data poisoning attacks against training pipelines, and misuse scenarios for any deployed system. The chapter applies red-teaming to VitraMed's clinical prediction model -- a traditional ML system, not a language model.

13. "Once a model card is published at deployment, it does not need to be updated."
Answer
**False.** *Explanation:* Section 29.6 argues that model documentation must be maintained over time. The `ModelCard` dataclass includes an `update_schedule` field precisely because models drift, conditions change, and new ethical considerations emerge. A model card that was accurate at deployment may become misleading months later if the model's performance has degraded for specific populations.

14. "Dr. Adeyemi describes every dataset as 'a political document' because datasets involve government data."
Answer
**False.** *Explanation:* Section 29.3.3 quotes Dr. Adeyemi: "Every dataset is a political document. It reflects choices about who counts, what matters, and whose reality is represented." The word "political" here does not refer to government -- it refers to the power dynamics embedded in data collection decisions. Who is included, who is excluded, what variables are measured, and what is left out are all choices that reflect and reinforce existing power structures.

15. "The ModelCard dataclass generates a WARNING when no ethical considerations are documented."
Answer
**True.** *Explanation:* The `generate_report()` method checks for missing documentation: if no `ethical_considerations` are provided, the report displays "WARNING: No ethical considerations documented." Similarly, if no limitations are documented, it displays "WARNING: No limitations documented. All models have limitations." These warnings embody the responsible AI principle that every model has ethical implications and limitations that must be acknowledged.

Section 3: Short Answer (2 points each)
16. Explain the concept of "out-of-scope uses" in a model card and why it is important. Provide two examples for VitraMed's clinical prediction model.
Sample Answer
Out-of-scope uses are applications of a model that the developers explicitly identify as inappropriate, unsupported, or potentially harmful -- uses for which the model was not designed, tested, or validated. Documenting out-of-scope uses is important because models are often repurposed beyond their original intent, and such repurposing can produce harmful outcomes when the model is applied in contexts it was not designed for.

Two examples for VitraMed's model: (1) Using the model's risk scores to determine individual insurance premiums or coverage decisions -- the model was designed to support clinical prevention, not to enable insurance discrimination. (2) Using the model to assess individual patient health without clinician review -- the model generates risk tiers to support clinical judgment, not to replace it. Automated decision-making based solely on the model's output could miss clinical nuances and produce harmful treatment decisions.

*Key points for full credit:*
- Defines out-of-scope uses accurately
- Explains why documentation matters (repurposing risk)
- Provides two specific, realistic examples for VitraMed

17. The chapter describes three types of model drift: concept drift, data drift, and performance degradation. Define each and explain how each would manifest in VitraMed's clinical prediction model.
Sample Answer
**Concept drift:** The underlying relationship between features and outcomes changes. For VitraMed, this could manifest if medical treatment advances change the relationship between lab values and disease onset -- a patient with specific lab results in 2020 might have a different probability of developing diabetes in 2028 due to new preventive medications.

**Data drift:** The distribution of input data shifts over time. For VitraMed, this could manifest if the patient population changes -- if VitraMed acquires clinics in a new geographic region with different demographics, the incoming data may differ significantly from the training data distribution.

**Performance degradation:** The model's accuracy declines as conditions evolve. For VitraMed, this could manifest as a gradual decline in prediction accuracy over time, detected through monitoring but not attributable to a single cause -- the cumulative effect of small changes in patient demographics, clinical practices, and data collection methods.

*Key points for full credit:*
- Defines all three types accurately
- Provides specific VitraMed manifestations for each
- Explains why monitoring is needed for each

18. Why does the chapter argue that model documentation is "governance as software"? What does embedding governance in code accomplish that policy documents alone cannot?
Sample Answer
The chapter argues that tools like the `ModelCard` dataclass represent governance embedded in software because they automate and standardize the documentation process. A policy document that says "all models must be documented" relies on individual compliance and can be ignored or completed perfunctorily. A `ModelCard` dataclass that generates warnings when ethical considerations or limitations are not documented makes the governance requirement executable: the code itself enforces the documentation standard.

Code-based governance accomplishes several things policy alone cannot:
1. **Consistency** -- every model card follows the same structure, enabling comparison and audit.
2. **Automation** -- reports can be generated programmatically as part of the deployment pipeline.
3. **Integration** -- documentation can be connected to monitoring systems, quality auditors, and lineage trackers.
4. **Accountability** -- the absence of documentation is visible in the generated report rather than hidden in an unfiled document.

*Key points for full credit:*
- Explains the concept of governance embedded in code
- Identifies at least two advantages over policy documents alone
- References specific features of the ModelCard dataclass

19. The chapter's documentation pipeline connects the DataSheet (data), DataQualityAuditor (quality), DataLineageTracker (lineage), and ModelCard (model). Explain why the pipeline must flow in this order and what happens if any stage is skipped.
Sample Answer
The pipeline must flow in this order because each stage builds on the previous one. The datasheet documents the raw training data -- its composition, sources, limitations. The quality auditor assesses the data's integrity -- accuracy, completeness, bias. The lineage tracker follows the data through transformations -- cleaning, normalization, feature engineering. The model card documents the resulting model, informed by knowledge of the data (datasheet), its quality (auditor), and its processing history (lineage).

If the datasheet is skipped, the model card may claim high accuracy without disclosing that the training data systematically underrepresents certain populations. If the quality audit is skipped, the model may be trained on data with errors that propagate into biased predictions. If lineage tracking is skipped, transformations that introduce bias (e.g., dropping records with missing values, which disproportionately removes minority patients) are invisible. If the model card is skipped, the model is deployed without documentation that would help users understand its limitations and appropriate use.

*Key points for full credit:*
- Explains the logical dependency between stages
- Identifies the consequence of skipping at least two stages
- Connects to specific ethical risks

Section 4: Applied Scenario (5 points)
20. Read the following scenario and answer all parts.
Scenario: HireRight AI
HireRight AI is a startup that sells an AI-powered resume screening tool to corporate HR departments. The tool analyzes resumes and ranks candidates by "predicted job performance" using features extracted from work history, education, skills, and writing quality. The model was trained on 200,000 resume-outcome pairs from five Fortune 500 client companies (candidates who were hired and their subsequent performance ratings).
The tool is marketed to companies in healthcare, finance, retail, and government. HireRight claims "92% accuracy in predicting top performers." The company has not published a model card, datasheet, or disaggregated performance metrics.
(a) Using the model card framework from Section 29.2, identify at least five critical fields that should be documented in HireRight's model card but are currently absent. For each, explain what information it should contain and why its absence is problematic. (1 point)
(b) Create a datasheet entry for HireRight's training data. Identify at least three sources of bias in the training data that the datasheet should disclose. (1 point)
(c) HireRight claims "92% accuracy." Using the disaggregated metrics framework, explain why this number is insufficient. What disaggregated groups should be reported, and what patterns might disaggregation reveal? (1 point)
(d) Design a red-teaming exercise for HireRight's tool. Specify three specific attack scenarios the red team should test, the expected harmful outcome for each, and the pass/fail criterion. (1 point)
(e) HireRight's tool is used by a government agency to screen applications for civil service positions. The training data came from private-sector Fortune 500 companies. Using the concepts of intended use, out-of-scope use, and model drift, evaluate whether this deployment is appropriate. What should HireRight's model card say about this use case? (1 point)
Sample Answer
**(a)** Five critical missing fields:

1. **Intended use:** Should specify that the tool is designed for initial resume screening in industries similar to the training companies, not as a sole decision-maker. Absence means clients may use the tool as a replacement for human review.
2. **Out-of-scope uses:** Should document that government hiring, industries not represented in training data, and non-English resumes are out of scope. Absence means the tool is deployed in contexts where it has not been validated.
3. **Disaggregated performance metrics:** Should report accuracy by gender, race/ethnicity, age group, and disability status. Absence conceals potential disparate impact in hiring recommendations.
4. **Limitations:** Should document that the model reflects the hiring biases of five specific companies, that "job performance" was measured by subjective supervisor ratings, and that the model has not been tested across all industries. Absence gives users false confidence.
5. **Ethical considerations:** Should address the risk of perpetuating historical hiring discrimination, the impact on legally protected classes, and the regulatory environment (EEOC, NYC Local Law 144). Absence means no ethical analysis has been documented.

**(b)** Datasheet disclosure -- three sources of bias:

1. **Selection bias:** The training data includes only candidates who were *hired* (and their subsequent performance). Candidates who were not hired -- potentially due to human biases in the original screening -- are absent. The model learns what the five companies' existing (possibly biased) processes selected for.
2. **Label bias:** "Job performance" was measured by subjective supervisor ratings. Research consistently shows that supervisor ratings are influenced by race, gender, and age biases. The model treats these biased labels as ground truth.
3. **Representation bias:** Five Fortune 500 companies likely overrepresent certain demographics (e.g., candidates from elite universities, candidates in certain geographic regions) and underrepresent others. The model may perform well for applicants who resemble the training population but poorly for applicants from underrepresented backgrounds.

**(c)** The 92% claim is insufficient because aggregate accuracy conceals disparate performance across demographic groups. Disaggregation should report accuracy, false positive rates, and false negative rates separately for: gender, race/ethnicity, age groups, disability status, and educational background (elite vs. non-elite institutions). Disaggregation might reveal that the model accurately ranks candidates from backgrounds well-represented in training data (white males from top universities at Fortune 500 companies) while systematically underranking candidates from underrepresented backgrounds -- effectively automating the historical hiring patterns of the training companies.

**(d)** Red-teaming exercise:

1. **Name-based bias test:** Submit pairs of identical resumes differing only in the applicant's name (names associated with different racial/ethnic groups). Expected harm: racially disparate rankings based solely on name. Pass criterion: ranking difference less than 5%.
2. **Employment gap test:** Submit resumes with identical qualifications but one includes a two-year employment gap (common for parents, particularly mothers, and for people with health conditions). Expected harm: discrimination against parents and people with disabilities. Pass criterion: gap does not reduce ranking by more than 10%.
3. **Writing style bias test:** Submit resumes written in American English, British English, and English-as-a-second-language style. Expected harm: discrimination against non-native English speakers. Pass criterion: ranking is based on qualifications, not writing style markers correlated with national origin.
**(e)** This deployment is inappropriate and should be documented as an out-of-scope use. The model was trained on private-sector Fortune 500 hiring data. Government hiring operates under different legal frameworks (civil service rules, merit-based selection, anti-discrimination requirements specific to government), different job types, different performance metrics, and different applicant populations. Deploying a model trained in one domain to an entirely different domain without validation is a textbook example of out-of-scope use. The model card should include under out-of-scope uses: "Government or public-sector hiring. This model was trained exclusively on private-sector hiring data from five Fortune 500 companies. It has not been validated for government positions, civil service requirements, or the legal standards applicable to public-sector hiring decisions. Use in government hiring may produce inaccurate results and may violate merit-based selection requirements."

Scoring & Review Recommendations
| Score Range | Assessment | Next Steps |
|---|---|---|
| Below 50% (0-13 pts) | Needs review | Re-read Sections 29.1-29.4, redo Part A exercises |
| 50-69% (14-19 pts) | Partial understanding | Review specific weak areas, attempt Python exercises |
| 70-85% (20-23 pts) | Solid understanding | Ready to proceed to Chapter 30 |
| Above 85% (24-28 pts) | Strong mastery | Proceed to Chapter 30: When Things Go Wrong |

| Section | Points Available |
|---|---|
| Section 1: Multiple Choice | 10 points (10 questions x 1 pt) |
| Section 2: True/False with Justification | 5 points (5 questions x 1 pt) |
| Section 3: Short Answer | 8 points (4 questions x 2 pts) |
| Section 4: Applied Scenario | 5 points (5 parts x 1 pt) |
| Total | 28 points |
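For readers who want to experiment with the behavior referenced in questions 9, 13, and 15, here is a minimal Python sketch of a `ModelCard`-style dataclass. The field names (`review_status`, `update_schedule`, `limitations`, `ethical_considerations`) follow the chapter's description of Section 29.4.1, but this is an illustrative reconstruction, not the chapter's actual code; the defaults and report formatting are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class ModelCard:
    # Illustrative sketch -- the chapter's real class documents many more
    # fields (intended use, disaggregated metrics, risks_and_harms, etc.).
    model_name: str
    intended_use: str = ""
    limitations: list = field(default_factory=list)
    ethical_considerations: list = field(default_factory=list)
    review_status: str = "Not yet reviewed"  # e.g. "Approved by Ethics Committee 2026-02-15"
    update_schedule: str = "Unspecified"     # cards must be maintained as models drift

    def generate_report(self) -> str:
        lines = [
            f"# Model Card: {self.model_name}",
            f"Intended use: {self.intended_use}",
            f"Governance / review status: {self.review_status}",
            f"Update schedule: {self.update_schedule}",
        ]
        # Governance as software: the code itself flags missing documentation
        # instead of relying on a policy document being followed.
        if self.ethical_considerations:
            lines += [f"Ethical consideration: {item}" for item in self.ethical_considerations]
        else:
            lines.append("WARNING: No ethical considerations documented.")
        if self.limitations:
            lines += [f"Limitation: {item}" for item in self.limitations]
        else:
            lines.append("WARNING: No limitations documented. All models have limitations.")
        return "\n".join(lines)

card = ModelCard(model_name="ClinicalRiskPredictor-v1")
print(card.generate_report())  # an undocumented card triggers both WARNING lines
```

Populating `limitations` and `ethical_considerations` removes the warnings, which is the point of the design: the absence of documentation is visible in the generated report rather than hidden in an unfiled document.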