Chapter 19: Quiz — Auditing AI Systems

20 questions. Select the best answer for each multiple-choice question. For short-answer questions, provide a concise response of 2–4 sentences.


1. Which of the following best describes an "impact audit" of an AI system?

A) A technical examination of the AI model's accuracy and performance metrics across demographic groups
B) An assessment of real-world outcomes for affected populations resulting from AI deployment
C) A compliance review confirming that the AI system's developer followed documented development procedures
D) An adversarial test of the AI system's robustness to malicious inputs


2. NYC Local Law 144's bias audit requirement applies to:

A) All AI systems used by employers in New York City, regardless of purpose
B) AI systems used in hiring, promotion, and performance management for all U.S. employers
C) Automated employment decision tools used by New York City employers to make hiring and promotion decisions
D) AI systems used in credit decisions, housing, and employment by companies based in New York City


3. Which of the following is the primary advantage of external AI auditing over internal AI auditing?

A) External auditors have more technical expertise in machine learning than internal auditors
B) External auditors have access to more data about the AI system than internal auditors
C) External auditors have independence from the organization being audited, reducing conflicts of interest
D) External auditors can complete audits more quickly than internal auditors because they are not involved in ongoing operations


4. ProPublica's COMPAS investigation found that COMPAS was:

A) More accurate at predicting recidivism for Black defendants than for white defendants
B) More likely to produce false positive errors (labeling non-recidivists as high-risk) for Black defendants than for white defendants
C) Less accurate overall than human prediction of recidivism for both racial groups
D) Equally calibrated across racial groups, meaning predictions were equally inaccurate for Black and white defendants
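For reference when working through this question, the false positive rate in option B can be computed directly from predictions and outcomes. The sketch below uses hypothetical toy data, not ProPublica's actual dataset:

```python
# A "false positive" here is a defendant labeled high-risk (prediction = 1)
# who did not reoffend (outcome = 0).

def false_positive_rate(predictions, outcomes):
    """FPR = high-risk labels among non-recidivists / all non-recidivists."""
    labels_for_non_recidivists = [
        p for p, o in zip(predictions, outcomes) if o == 0
    ]
    if not labels_for_non_recidivists:
        return 0.0
    return sum(labels_for_non_recidivists) / len(labels_for_non_recidivists)

# Hypothetical toy data: 1 = labeled high-risk / did reoffend, 0 = otherwise.
preds = [1, 1, 0, 1, 0, 0, 1, 0]
outcomes = [1, 0, 0, 1, 0, 0, 0, 1]
print(false_positive_rate(preds, outcomes))  # 2 of 5 non-recidivists flagged: 0.4
```

An audit comparing this quantity across demographic groups is the kind of analysis the question describes.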


5. The "access problem" in external AI auditing refers to:

A) The challenge of providing affected individuals with access to the AI decisions that affected them
B) Organizations' reluctance to give external auditors access to proprietary models and training data
C) The difficulty that auditors face in accessing sufficient computing resources to test large AI models
D) The limited access that consumers have to information about AI systems that affect them


6. When researchers describe the "mathematical incompatibility" of fairness metrics, they mean that:

A) Different research teams using different mathematical approaches will calculate fairness metrics differently
B) Fairness metrics can only be calculated using specialized mathematical software
C) When base rates differ across groups, it is impossible to satisfy all major fairness criteria simultaneously
D) The mathematical complexity of modern AI models makes fairness analysis infeasible
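The incompatibility this question probes can be seen with a small numerical sketch. The confusion-matrix counts below are hypothetical (not COMPAS data): two groups with different base rates, a classifier with equal precision and equal recall in both, and yet unequal false positive rates:

```python
# Toy illustration: equal precision (PPV) and equal recall across two groups
# cannot coexist with equal false positive rates when base rates differ.

def ppv(tp, fp):
    return tp / (tp + fp)

def fpr(fp, tn):
    return fp / (fp + tn)

# Group A: 50 recidivists, 50 non-recidivists (base rate 0.5)
a_tp, a_fp, a_fn, a_tn = 40, 10, 10, 40
# Group B: 20 recidivists, 80 non-recidivists (base rate 0.2)
b_tp, b_fp, b_fn, b_tn = 16, 4, 4, 76

# Precision is 0.8 in both groups; recall is 0.8 in both groups...
print(ppv(a_tp, a_fp), ppv(b_tp, b_fp))  # 0.8 0.8
# ...yet the false positive rates diverge: 0.2 vs 0.05.
print(fpr(a_fp, a_tn), fpr(b_fp, b_tn))
```

With the higher-base-rate group, maintaining the same precision forces more false positives relative to the non-recidivist population, which is the arithmetic at the heart of the COMPAS debate.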


7. Which of the following is an element of a comprehensive algorithmic impact assessment?

A) An audit of the AI vendor's financial statements to assess business viability
B) A systematic identification of potential harms to affected populations, organized by harm type and likelihood
C) A penetration test of the organization's cybersecurity systems
D) A comparison of the AI system's performance to the state of the art in academic research


8. The Federal Reserve's SR 11-7 guidance is significant in AI auditing because it:

A) Requires federal agencies to publish bias audits of their AI systems
B) Establishes model validation requirements for financial institutions, making it the most established U.S. regulatory AI audit mandate
C) Prohibits financial institutions from using AI in credit decisions without FDIC approval
D) Requires banks to obtain third-party audits for all AI systems regardless of risk level


9. A "model card" is best described as:

A) A physical access card required to use high-risk AI systems in secure government facilities
B) A standardized documentation template for AI models describing intended use, performance metrics, limitations, and training data
C) A credit-style score that rates an AI model's safety and compliance
D) A regulatory certification issued by NIST upon completion of a conformity assessment


10. The EU AI Act's conformity assessment requirement differs from NYC LL 144's bias audit requirement primarily because:

A) The EU AI Act covers only employment tools while LL 144 covers all AI applications
B) The EU AI Act requires assessment against a broader set of standards, covers more AI applications, and integrates assessment into a mandatory registration system
C) The EU AI Act requires only self-assessment while LL 144 requires independent third-party audit
D) The EU AI Act applies only to AI systems developed in EU member states


11. "Red-teaming" in the context of generative AI auditing involves:

A) A regulatory agency reviewing an AI company's internal safety documentation
B) A team of engineers reviewing code for security vulnerabilities before deployment
C) Structured adversarial testing by researchers attempting to elicit harmful behaviors from an AI system
D) The analysis of AI-generated content for factual accuracy by subject matter experts


12. Which of the following is NOT identified in the chapter as a challenge in external AI auditing?

A) Trade secret protection of proprietary models and training data
B) The risk that audit findings could become evidence in litigation against the audited organization
C) The absence of universal audit standards for AI systems
D) The prohibition on external auditors testifying in regulatory proceedings


13. "Datasheets for Datasets" is:

A) A regulatory requirement that companies document the data used in FDA-regulated medical AI
B) A standardized documentation format for AI training datasets, recording provenance, collection methodology, and known limitations
C) A financial audit requirement for companies that collect personal data at scale
D) A privacy impact assessment template required under GDPR for data processing activities


14. Continuous monitoring of deployed AI systems is necessary primarily because:

A) AI systems require constant retraining to maintain their performance
B) Regulatory requirements typically mandate real-time audit reporting
C) AI systems can experience drift (degradation in performance or emerging fairness issues) as the world changes after deployment
D) AI system vendors often modify systems after deployment without notifying deploying organizations


15. Canada's Directive on Automated Decision-Making is notable in AI auditing because it:

A) Was the first mandatory bias audit requirement for employment AI in the Americas
B) Requires federal government departments to complete algorithmic impact assessments before deploying automated decision systems, scaled to the stakes of the decision
C) Established the first AI audit credentialing program, analogous to the CPA designation
D) Created the first international standards body for AI auditing


16. The "credentialing gap" in AI auditing refers to:

A) The difficulty of credentialing AI systems that operate across national borders
B) The absence of a recognized professional credential for AI auditors, analogous to the CPA for financial auditors
C) The gap between AI auditors' technical credentials and their domain expertise
D) The failure of most universities to offer AI auditing programs


17. A key limitation of the ProPublica COMPAS investigation as a model for AI auditing is that it:

A) Was conducted by journalists rather than credentialed auditors, limiting its scientific validity
B) Could identify that racial disparities existed in predictions but could not determine why they existed or what in the system caused them
C) Used data from only one county in Florida, making it difficult to generalize to other jurisdictions
D) Found results that were contradicted by all subsequent academic analyses of COMPAS


18. Algorithmic Impact Assessments (AIAs) are analogous to which of the following established practices?

A) Financial due diligence in mergers and acquisitions
B) Consumer product safety testing required by the CPSC
C) Environmental impact assessments required for major federal projects under NEPA
D) Antitrust review of mergers by the DOJ and FTC


19. (Short Answer) Explain why the selection of a fairness metric for an AI bias audit is a normative judgment rather than a purely technical one. Use the COMPAS controversy as an example to illustrate your answer.


20. (Short Answer) What is the primary limitation of self-assessment (as allowed for most high-risk AI systems under the EU AI Act) relative to third-party assessment? What provisions of the EU AI Act might compensate for this limitation?


Answer Key: 1-B, 2-C, 3-C, 4-B, 5-B, 6-C, 7-B, 8-B, 9-B, 10-B, 11-C, 12-D, 13-B, 14-C, 15-B, 16-B, 17-B, 18-C. Questions 19–20 are short answer; see discussion guide for model responses.