Exercises — Chapter 30: The EU AI Act and Algorithmic Accountability
Exercise 1: AI System Classification Under the Four Risk Tiers
Instructions
For each of the six AI systems described below, (a) identify the most likely risk tier under the EU AI Act (Prohibited / High-Risk / Limited Risk / Minimal Risk / Uncertain — requires legal review), (b) cite the specific legal provision or Annex III category that supports your classification, and (c) identify any ambiguity or classification dispute that might arise.
Systems to Classify
System A: Automated Overdraft Refusal Model
A UK retail bank operates an ML model that automatically refuses overdraft requests for EU-resident customers when the predicted probability of default exceeds a set threshold. No human reviews the refusal before it is communicated to the customer. The customer receives an automated email: "Your overdraft request has been declined."
System B: Executive Compensation Benchmarking Tool
A financial services firm uses an AI tool to analyze market salary data and recommend compensation bands for senior executives. The tool processes publicly available salary surveys and generates recommended pay ranges. It is used internally by HR leadership; individual employees are not affected by its outputs directly.
System C: Real-Time Facial Expression Analysis in Video Interviews
A financial services firm uses an AI tool to analyze facial expressions and micro-expressions during recorded video interviews with job applicants. The tool generates an "engagement and authenticity score" that is provided to hiring managers alongside the video recording.
System D: Customer Sentiment Monitoring in Call Centre
A financial firm uses voice analytics AI to classify the emotional tone of customer service calls (frustrated, satisfied, neutral) in real time. The outputs are used to prioritize escalations and are shown on the supervising team leader's dashboard. The system runs continuously in the background of all calls without customer notification.
System E: Generative AI for Compliance Document Drafting
A compliance team uses a large language model to draft initial versions of internal policies, procedure documents, and regulatory correspondence. A human compliance officer always reviews and approves the final document before it is issued. The LLM is not used for customer-facing decisions or credit assessments.
System F: AML Risk Score Contributing to Account Restrictions
A bank's AML monitoring system generates a risk score for each customer account. When the score exceeds a threshold, the system automatically places a restriction on the account (blocks outbound payments) and generates a SAR referral queue item for a compliance analyst to review. The restriction takes effect before human review.
Exercise 2: Designing an Article 9 Risk Management System for a Credit Scoring AI
Instructions
You have been appointed as the EU AI Act compliance officer for Meridian Bank's retail credit scoring system, MeridianScore v3.1. The system is a gradient boosting model used to evaluate creditworthiness for personal loans, credit cards, and mortgage pre-qualification for EU-resident customers.
Design a risk management system for MeridianScore v3.1 that satisfies Article 9 of the EU AI Act. Your design should:
(a) Identify and categorize the known and foreseeable risks associated with MeridianScore, including risks to individuals (discriminatory outcomes, inaccurate assessments, automation bias in decision-making), risks to the firm (regulatory penalty, reputational harm, model drift), and risks to financial stability (systemic credit misallocation).
(b) Specify the risk evaluation methodology. How will identified risks be estimated and evaluated? What quantitative metrics will be used for accuracy, fairness, and stability monitoring? How will the evaluation framework address risks that emerge post-deployment that were not identified during development?
(c) Specify risk mitigation measures. For each major risk category identified in (a), describe at least one concrete mitigation measure. Include measures addressing: (i) discriminatory outcomes by protected characteristic; (ii) model drift over time; (iii) adversarial or fraudulent input manipulation; (iv) data quality failures.
(d) Describe the residual risk disclosure process. After mitigation, what residual risks remain? How will these be disclosed to deployers (the relationship managers and credit decision teams who use MeridianScore's outputs) in accordance with Article 13's transparency requirements?
(e) Define the lifecycle scope. How and when will the risk management system be reviewed and updated? What triggers a mandatory review (model update, significant performance change, new regulatory guidance, adverse event)?
Your answer should be structured as a policy framework outline — not a narrative essay. Use numbered sections, bullet points for specific requirements, and brief explanatory notes where needed.
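Part (b) asks for quantitative metrics. As a hint at the level of concreteness expected, here is a minimal sketch of two candidate metrics: a demographic-parity gap for fairness monitoring and the population stability index (PSI) for drift monitoring. The group labels, score bins, thresholds, and data are invented for illustration.

```python
import math

# Illustrative sketch for part (b): two candidate monitoring metrics.
# Group labels, score bins, and data below are invented.

def approval_rate_gap(outcomes: dict[str, list[int]]) -> float:
    """Largest difference in approval rate across groups (demographic parity gap).

    outcomes maps a group label to a list of 1 (approved) / 0 (declined).
    """
    rates = [sum(decisions) / len(decisions) for decisions in outcomes.values()]
    return max(rates) - min(rates)

def population_stability_index(expected: list[float], actual: list[float]) -> float:
    """PSI between two binned score distributions (proportions per bin).

    A common drift heuristic: PSI above 0.25 signals a significant shift.
    """
    return sum(
        (a - e) * math.log(a / e)
        for e, a in zip(expected, actual)
        if e > 0 and a > 0
    )

outcomes = {"group_a": [1, 1, 0, 1], "group_b": [1, 0, 0, 0]}
print(f"approval-rate gap: {approval_rate_gap(outcomes):.2f}")  # 0.50

baseline = [0.25, 0.25, 0.25, 0.25]  # score distribution at validation time
current = [0.10, 0.20, 0.30, 0.40]   # score distribution seen in production
print(f"PSI: {population_stability_index(baseline, current):.3f}")  # 0.228
```

A real Article 9 framework would compute metrics like these per protected characteristic and per portfolio segment, with documented thresholds and escalation paths.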
Exercise 3: Drafting an Article 13 Instructions-for-Use Document for a Fraud Detection System
Instructions
Article 13 of the EU AI Act requires that high-risk AI systems be accompanied by instructions for use that provide sufficient transparency for deployers to understand outputs and use the system appropriately.
You are the provider of FraudSentinel, an ML-based real-time transaction fraud scoring system sold to financial institutions. FraudSentinel outputs a fraud probability score (0.0–1.0) and a risk category (Low / Medium / High / Block) for each transaction. High scores trigger automatic payment holds; Block-category transactions are rejected automatically.
Draft the key sections of an Article 13-compliant instructions-for-use document for FraudSentinel. Your document must include the following sections:
Section 1: System Description and Intended Purpose
What FraudSentinel does; what it is designed to be used for; what it must not be used for.
Section 2: Performance Metrics
The accuracy metrics that characterize FraudSentinel's performance. Include: overall precision and recall; false positive rate (legitimate transactions blocked); false negative rate (fraudulent transactions missed); performance variation across transaction types, geographic regions, and customer segments. Specify the conditions under which stated metrics are valid.
Section 3: Known Risks and Limitations
Document at least four known risks or limitations that deployers must be aware of. For each, describe the risk and specify what deployers should do to mitigate it.
Section 4: Human Oversight Requirements
Specify the human oversight measures that deployers must implement in accordance with Article 14. Include: the minimum qualification and training required for oversight personnel; the circumstances in which human review is required before a Hold decision becomes a Block; the process for customer dispute resolution; and the reporting requirements for detected system malfunctions.
Section 5: Input Data Requirements
What input data does FraudSentinel require? What are the minimum data quality standards for each input field? What should deployers do when required input data is missing or of poor quality?
Section 6: Monitoring and Logging
What events does FraudSentinel automatically log? What additional monitoring should deployers implement? What performance thresholds should trigger escalation to the provider?
Your document should be professionally structured — written as a real technical compliance document, not an academic exercise.
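For Section 2, the headline metrics are standard confusion-matrix ratios. A minimal sketch, with invented transaction counts, shows how each is derived:

```python
# Sketch of how Section 2's headline metrics are derived from a confusion
# matrix. The transaction counts below are invented for illustration.

def fraud_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Standard ratios for a fraud classifier, named as in Section 2."""
    return {
        "precision": tp / (tp + fp),            # flagged transactions that were fraud
        "recall": tp / (tp + fn),               # fraud that was caught
        "false_positive_rate": fp / (fp + tn),  # legitimate transactions blocked
        "false_negative_rate": fn / (fn + tp),  # fraud that was missed
    }

metrics = fraud_metrics(tp=80, fp=20, fn=40, tn=9860)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Stating the validity conditions matters as much as the numbers: these ratios shift with fraud prevalence, so metrics measured on one transaction mix do not transfer to another.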
Exercise 4: Code Exercise — Implementing a classification_wizard() Method
Instructions
Extend the AIActComplianceRegister class from the chapter's Python section by implementing a classification_wizard() method. The method should:
- Accept a system record identifier or prompt the user to identify a system from the register
- Ask a structured sequence of classification questions based on the EU AI Act's risk tier logic
- Based on answers, suggest the most likely risk tier and the relevant legal provision
- Output a classification recommendation with rationale
Starting Code
```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import Optional


class AIRiskTier(Enum):
    PROHIBITED = "Prohibited — Must cease use"
    HIGH_RISK = "High Risk — Full compliance required"
    LIMITED_RISK = "Limited Risk — Transparency obligations"
    MINIMAL_RISK = "Minimal Risk — General law only"
    UNCERTAIN = "Classification Uncertain — Legal review needed"


class ComplianceStatus(Enum):
    COMPLIANT = "Compliant"
    PARTIAL = "Partially Compliant — Gaps Identified"
    NON_COMPLIANT = "Non-Compliant"
    NOT_YET_ASSESSED = "Not Yet Assessed"


@dataclass
class AISystemRecord:
    system_id: str
    name: str
    description: str
    provider: str
    deployer: str
    use_case: str
    affects_eu_customers: bool
    decision_type: str
    autonomous_decision: bool
    risk_tier: AIRiskTier = AIRiskTier.UNCERTAIN
    compliance_status: ComplianceStatus = ComplianceStatus.NOT_YET_ASSESSED
    classification_rationale: str = ""
    ce_marking: bool = False
    eu_database_registered: bool = False
    technical_documentation_complete: bool = False
    risk_management_system: bool = False
    human_oversight_implemented: bool = False
    logging_implemented: bool = False
    next_review_date: Optional[date] = None

    def compliance_gaps(self) -> list[str]:
        if self.risk_tier != AIRiskTier.HIGH_RISK:
            return []
        gaps = []
        if not self.technical_documentation_complete:
            gaps.append("Technical documentation not complete (Article 11 + Annex IV)")
        if not self.risk_management_system:
            gaps.append("Risk management system not implemented (Article 9)")
        if not self.human_oversight_implemented:
            gaps.append("Human oversight measures not implemented (Article 14)")
        if not self.logging_implemented:
            gaps.append("Automatic logging not implemented (Article 12)")
        if not self.ce_marking:
            gaps.append("CE marking and Declaration of Conformity not completed (Articles 47 and 48)")
        if not self.eu_database_registered:
            gaps.append("Not registered in EU database (Article 49)")
        return gaps

    def readiness_score(self) -> float:
        if self.risk_tier != AIRiskTier.HIGH_RISK:
            return 1.0
        requirements = [
            self.technical_documentation_complete,
            self.risk_management_system,
            self.human_oversight_implemented,
            self.logging_implemented,
            self.ce_marking,
            self.eu_database_registered,
        ]
        return sum(requirements) / len(requirements)


class AIActComplianceRegister:
    def __init__(self, firm_name: str):
        self.firm_name = firm_name
        self._systems: dict[str, AISystemRecord] = {}

    def register(self, system: AISystemRecord) -> None:
        self._systems[system.system_id] = system

    def high_risk_systems(self) -> list[AISystemRecord]:
        return [s for s in self._systems.values() if s.risk_tier == AIRiskTier.HIGH_RISK]

    def systems_with_gaps(self) -> list[tuple[AISystemRecord, list[str]]]:
        result = []
        for system in self.high_risk_systems():
            gaps = system.compliance_gaps()
            if gaps:
                result.append((system, gaps))
        return result

    def unclassified_systems(self) -> list[AISystemRecord]:
        return [s for s in self._systems.values() if s.risk_tier == AIRiskTier.UNCERTAIN]

    def inventory_summary(self) -> dict:
        all_systems = list(self._systems.values())
        return {
            "total_systems": len(all_systems),
            "prohibited": sum(1 for s in all_systems if s.risk_tier == AIRiskTier.PROHIBITED),
            "high_risk": sum(1 for s in all_systems if s.risk_tier == AIRiskTier.HIGH_RISK),
            "limited_risk": sum(1 for s in all_systems if s.risk_tier == AIRiskTier.LIMITED_RISK),
            "minimal_risk": sum(1 for s in all_systems if s.risk_tier == AIRiskTier.MINIMAL_RISK),
            "unclassified": sum(1 for s in all_systems if s.risk_tier == AIRiskTier.UNCERTAIN),
            "fully_compliant": sum(
                1 for s in all_systems if s.compliance_status == ComplianceStatus.COMPLIANT
            ),
            "eu_affecting": sum(1 for s in all_systems if s.affects_eu_customers),
        }

    def readiness_dashboard(self) -> list[dict]:
        # Sort on the numeric score, not the formatted percentage string,
        # so that e.g. "100%" does not sort before "17%".
        systems = sorted(self.high_risk_systems(), key=lambda s: s.readiness_score())
        return [
            {
                "system_id": s.system_id,
                "name": s.name,
                "tier": s.risk_tier.value.split(" —")[0],
                "readiness": f"{s.readiness_score():.0%}",
                "gaps": len(s.compliance_gaps()),
                "eu_affecting": s.affects_eu_customers,
            }
            for s in systems
        ]

    def classification_wizard(self, system_id: str) -> dict:
        """
        TODO: Implement this method.

        The method should:
        1. Retrieve the system record from self._systems by system_id
        2. Ask a structured sequence of YES/NO classification questions
        3. Apply the EU AI Act decision tree:
           - Does the system engage in a prohibited practice? (Article 5) -> PROHIBITED
           - Does the system affect EU customers? -> If no, limited Act scope
           - Does the system perform creditworthiness assessment? -> HIGH_RISK (Annex III(5)(b))
           - Does the system perform insurance risk assessment? -> HIGH_RISK (Annex III(5)(c))
           - Does the system perform recruitment/selection? -> HIGH_RISK (Annex III(4)(a))
           - Does the system perform biometric identification? -> HIGH_RISK (Annex III(1))
           - Does the system interact with humans without disclosing it is AI? -> LIMITED_RISK
           - Is the system a deepfake generator? -> LIMITED_RISK
           - Otherwise: MINIMAL_RISK with note to review Annex III completeness
        4. Return a dict with keys: system_id, name, suggested_tier, rationale, provision,
           requires_legal_review (bool), classification_confidence ("High"/"Medium"/"Low")

        The method should NOT use input() — instead accept a responses dict parameter
        for testability, with fallback to interactive prompting.

        Hint: Structure as a decision tree with early exits. Keep the question sequence
        logical and explainable — each question should correspond to a specific
        Article 5 prohibition or Annex III category.
        """
        raise NotImplementedError("Implement classification_wizard()")
```
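Before implementing the full method, it may help to prototype the decision tree in isolation. The sketch below is a standalone, simplified version of the tier logic described in the TODO, not a full solution: it omits prompting, record updates, and confidence scoring, and the question keys are illustrative.

```python
# Standalone, simplified sketch of the early-exit decision tree described in
# the TODO. Not a full solution: it omits prompting, record updates, and
# confidence scoring, and the question keys are illustrative.

def suggest_tier(responses: dict[str, bool]) -> tuple[str, str]:
    """Map YES/NO answers to a suggested tier and supporting provision."""
    if responses.get("prohibited_practice"):
        return "PROHIBITED", "Article 5"
    if not responses.get("affects_eu_customers"):
        return "UNCERTAIN", "Limited Act scope; confirm territorial reach (Article 2)"
    if responses.get("creditworthiness_assessment"):
        return "HIGH_RISK", "Annex III(5)(b)"
    if responses.get("insurance_risk_assessment"):
        return "HIGH_RISK", "Annex III(5)(c)"
    if responses.get("recruitment_selection"):
        return "HIGH_RISK", "Annex III(4)(a)"
    if responses.get("biometric_identification"):
        return "HIGH_RISK", "Annex III(1)"
    if responses.get("undisclosed_ai_interaction") or responses.get("deepfake_generator"):
        return "LIMITED_RISK", "Article 50"
    return "MINIMAL_RISK", "No Annex III match; review Annex III for completeness"

print(suggest_tier({"affects_eu_customers": True, "creditworthiness_assessment": True}))
# ('HIGH_RISK', 'Annex III(5)(b)')
```

Your implementation should wrap logic of this shape inside the register, drawing answers from the `responses` parameter (or interactive prompts) and writing the outcome back to the `AISystemRecord`.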
Requirements
Your implementation must:
- Accept an optional `responses: dict[str, bool] = None` parameter alongside `system_id` — if responses are provided (keyed to question identifiers), use them instead of prompting; if not provided, use `input()` prompts
- Cover at least the following classification branches: Article 5 prohibited practices, Annex III(5)(b) credit, Annex III(5)(c) insurance, Annex III(4) employment, Annex III(1) biometric, Article 50 chatbot disclosure, and minimal risk
- Return a typed dict with keys: `system_id`, `name`, `suggested_tier` (an `AIRiskTier` value), `rationale` (string), `provision` (string citing specific Act article), `requires_legal_review` (bool), `classification_confidence` (one of "High", "Medium", "Low")
- After generating the recommendation, optionally update the system record's `risk_tier` and `classification_rationale` fields if the user confirms
Write a test harness demonstrating the wizard with at least three different AI systems (one that classifies as high-risk, one as limited-risk, one as uncertain requiring legal review).
Exercise 5: Building an EU AI Act Compliance Timeline for an EU-Serving Firm
Instructions
You are the Head of RegTech Compliance at Atlas Financial Services, a German asset manager with €12 billion AUM that serves retail and institutional clients across Germany, France, and the Netherlands. Atlas uses twelve AI systems across its operations. Following an initial inventory, the relevant systems have been classified as follows:
| System ID | Name | Classification | EU Customers Affected |
|---|---|---|---|
| AFS-001 | InvestScore (retail suitability scoring) | High-Risk (Annex III(5)(b)) | Yes |
| AFS-002 | HiringBot (recruitment screening) | High-Risk (Annex III(4)) | Yes |
| AFS-003 | SentinelAML (AML transaction monitoring) | Uncertain — legal review | Yes |
| AFS-004 | AtlasChat (customer chatbot) | Limited Risk | Yes |
| AFS-005 | ReportDrafter (LLM for compliance reports) | Minimal Risk | No |
| AFS-006 | PortfolioOptimizer (internal quant model) | Minimal Risk | No |
The current date is 1 March 2026. The 2 August 2026 application deadline for high-risk system obligations is five months away.
Task A: Build a Compliance Timeline
Create a structured compliance timeline for Atlas Financial Services from 1 March 2026 through 31 August 2026. The timeline should cover:
- Legal review completion for AFS-003
- Technical documentation preparation for AFS-001 and AFS-002
- Risk management system design and implementation (Article 9) for AFS-001 and AFS-002
- Human oversight framework design and implementation (Article 14) for AFS-001 and AFS-002
- Data governance review (Article 10) including bias testing for AFS-001 and AFS-002
- Logging implementation (Article 12) for any high-risk systems lacking it
- Disclosure language implementation for AFS-004
- Internal conformity assessment for AFS-001 and AFS-002
- CE marking and Declaration of Conformity execution
- EU database registration
- Post-deadline monitoring plan
For each milestone, estimate the duration (in weeks), identify the responsible function (legal, compliance, model risk, IT, HR, external counsel), and identify any dependencies that affect sequencing.
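One way to check that your milestone sequencing is internally consistent is to express the dependencies as a graph and topologically sort it. The milestone names and dependency edges below are hypothetical, not a model answer:

```python
# Hypothetical milestone graph for Task A; names and dependency edges are
# illustrative, not a model answer. Each milestone maps to its prerequisites.
from graphlib import TopologicalSorter

milestones = {
    "legal_review_afs003": set(),
    "data_governance_review": set(),                      # Article 10, incl. bias testing
    "risk_mgmt_system": {"data_governance_review"},       # Article 9
    "human_oversight_framework": set(),                   # Article 14
    "logging_implementation": set(),                      # Article 12
    "technical_documentation": {"risk_mgmt_system", "human_oversight_framework",
                                "logging_implementation"},
    "conformity_assessment": {"technical_documentation"},
    "ce_marking": {"conformity_assessment"},
    "eu_database_registration": {"ce_marking"},
}

# static_order() raises CycleError if the plan is circular; otherwise it
# yields an execution order in which every prerequisite comes first.
order = list(TopologicalSorter(milestones).static_order())
print(order)
```

Attaching week estimates and responsible functions to each node then turns this into the timeline the task asks for, with the dependency-driven sequencing made explicit.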
Task B: Risk Assessment of Timeline Feasibility
Five months is a short runway for two full conformity assessments plus an AML legal review and two high-risk compliance implementations. Identify the three biggest risks to the timeline and propose mitigation strategies for each.
Task C: Budget Estimate
Based on the compliance work described in the timeline, provide an order-of-magnitude budget estimate for the Atlas Financial Services AI Act compliance program. Break down costs by category: external legal counsel, external technical consultants, internal staff time, IT infrastructure changes, and ongoing monitoring costs post-August 2026. Justify your estimates with reference to comparable regulatory implementation programs (GDPR, SR 11-7 model validation, DORA).