Case Study 01 — Chapter 30: The EU AI Act and Algorithmic Accountability

"Cornerstone's AI Inventory Exercise — The Systems Nobody Knew Were AI"


Background

Three weeks after the board presentation, Priya Nair was running the first working session of Cornerstone Financial Group's AI Act compliance program. The steering committee had given her the mandate: produce a complete inventory of all AI systems potentially subject to the EU AI Act, with preliminary risk tier classifications, by the end of the quarter.

She had hired a specialist legal counsel from a Brussels firm with EU AI Act experience and assembled a cross-functional working group: the Chief Data Officer, heads of model risk from retail and wholesale banking, the Chief Information Security Officer, a senior FCA compliance manager, and representatives from HR, marketing, and payments operations.

The first task was straightforward in theory: identify all AI systems in use across the firm. In practice, it immediately generated three classification disputes that Priya had not fully anticipated.


Problem 1: The Excel Macro That Turned Out to Be an AI System

The retail credit team's Senior Manager, James Okafor, attended the first working session to represent the retail lending function. When Priya's working group asked for a list of all AI systems in retail lending, James submitted a one-item list: RetailScore v4.2, the gradient boosting model already identified during the board preparation work.

During the session, a member of the model risk team asked a follow-up question: "What about the customer profitability scoring tool that feeds the relationship manager dashboard?"

James hesitated. "That's not AI. That's just a spreadsheet. It's a regression formula that was built in 2019."

The model risk manager pulled up the tool's specification document. The tool was, in fact, a linear regression model with twenty-three input variables — customer income, transaction history, product holdings, tenure — trained on three years of historical profitability data. The output was a predicted profitability score used to triage customer relationship manager workload and, indirectly, to determine which customers were offered retention incentives when they indicated they wanted to close an account.

Brussels counsel was direct: "Under the AI Act's definition of an AI system — a machine-based system that infers from input data how to generate outputs such as predictions, recommendations, or decisions — a trained linear regression model is an AI system. It doesn't matter that it runs in Excel. It doesn't matter that it was built in 2019. The question is whether it meets the definition, and this does."
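Counsel's definitional point can be made concrete with a toy sketch: even a least-squares regression fitted in plain code (or in Excel) infers its parameters from historical data rather than having them authored by hand. The feature, the figures, and the function names below are hypothetical illustrations, not Cornerstone's actual tool.

```python
# Toy illustration: a linear regression "learned" from historical data infers,
# from input data, how to generate a predicted output -- which is what brings
# it within the AI Act's definition. All data here is invented.

def fit_simple_regression(xs, ys):
    """Ordinary least squares for one feature: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical "training data": customer tenure (years) vs. profitability.
tenure = [1, 2, 3, 4, 5]
profit = [10.0, 14.0, 18.5, 22.0, 26.5]

b0, b1 = fit_simple_regression(tenure, profit)

def profitability_score(tenure_years):
    """The 'Excel formula': its parameters were inferred from data, not authored."""
    return b0 + b1 * tenure_years
```

The scoring function looks like "just a formula" once deployed, but its coefficients were learned from data rather than written by a human, and that provenance, not the runtime environment, is what the definition turns on.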

The profitability tool was added to the inventory. Its risk classification — high-risk under Annex III(5)(b) if retention incentive decisions affected EU customers' access to financial services, or minimal-risk if it was purely an internal management tool — required a closer analysis of how its outputs were actually used in decisions affecting customers.

The broader implication was immediate: if one team had a 2019 regression model that nobody thought of as AI, how many others did the firm have? The inventory expanded from the initial forty-seven systems Priya had counted from IT registers to a list of sixty-three candidates requiring review.
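As a sketch of how an expanded inventory like this might be triaged programmatically, the following first-pass classifier maps register fields to preliminary tiers. The field names and the two Annex III mappings shown are simplifying assumptions for illustration, not a complete legal analysis.

```python
# Sketch of a preliminary inventory triage check for a model register.
# The use-case categories and Annex III mappings are illustrative only;
# every "candidate" result still requires legal review.

ANNEX_III_USES = {
    "credit_scoring": "Annex III(5)(b)",
    "recruitment_screening": "Annex III(4)(a)",
}

def preliminary_tier(system):
    """Rough first-pass classification for an inventory register entry."""
    if not system.get("infers_from_data"):
        return "out of scope (not an AI system under the Act's definition)"
    use = system.get("use_case")
    if use in ANNEX_III_USES:
        return f"candidate high-risk ({ANNEX_III_USES[use]}), needs legal review"
    return "candidate minimal-risk, confirm outputs do not affect customers"

# Hypothetical register entry for the 2019 profitability tool.
profitability_tool = {
    "name": "Customer profitability scorer (Excel, 2019)",
    "infers_from_data": True,   # trained linear regression
    "use_case": "internal_triage",
}
```

A check like this cannot settle classification, but it forces every register entry to answer the threshold question ("does it infer from data?") before anyone argues about risk tiers.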


Problem 2: The Vendor Who Said "It's Rules, Not AI"

The payments team's FraudGuard ML system had been identified as probably high-risk during the board preparation work. When the compliance program reached out to the vendor — VendorCo Ltd, a London-headquartered fraud detection provider — to request the Annex IV technical documentation, VendorCo's response was unexpected.

VendorCo's compliance team sent a letter asserting that FraudGuard was not an AI system under the EU AI Act. "FraudGuard," the letter stated, "operates on the basis of a rule engine. Transactions are scored by applying a series of defined rules. This is not a machine learning system and does not fall within the Act's definition of an AI system."

Priya's Brussels counsel reviewed the letter alongside FraudGuard's marketing materials, the system's technical architecture document (obtained from Cornerstone's procurement records), and the vendor's own website. The architecture document showed that the "rules" in FraudGuard's rule engine were generated by a machine learning model trained on historical fraud data and updated quarterly based on new fraud patterns. The rule engine was, in effect, a frozen snapshot of an ML model's learned decision boundaries, re-expressed as a set of thresholds and conditionals.

This was, counsel concluded, precisely the kind of attempted definitional arbitrage the Act's broad AI system definition was intended to prevent. A rule engine whose rules are derived from and updated by an ML model retains the characteristics of an ML system — its outputs reflect statistical patterns learned from training data, not explicit human-authored logic. The EU AI Act's definition was deliberately technology-neutral for this reason.
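The "frozen snapshot" point can be illustrated with a toy threshold learner: the shipped artifact reads like an authored rule, but its cutoff is a decision boundary inferred from labeled data. The transaction amounts, labels, and function names below are invented for illustration.

```python
# Toy illustration of an ML-derived "rule": learn a cutoff from labeled fraud
# data, then freeze it as an apparently hand-written rule. Data is invented.

def learn_threshold(amounts, is_fraud):
    """Pick the amount cutoff that best separates fraud from non-fraud."""
    best_t, best_acc = None, -1.0
    for t in sorted(set(amounts)):
        preds = [a >= t for a in amounts]
        acc = sum(p == f for p, f in zip(preds, is_fraud)) / len(amounts)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

# Hypothetical "training data": transaction amounts with fraud labels.
amounts = [20, 35, 50, 400, 650, 900]
labels  = [False, False, False, True, True, True]

threshold = learn_threshold(amounts, labels)

# The shipped "rule engine": it looks like authored logic, but the threshold
# encodes a decision boundary inferred from the training data above.
def fraud_rule(amount):
    return amount >= threshold
```

Re-expressing the learned cutoff as an IF/THEN conditional changes its presentation, not its provenance: the rule still reflects statistical patterns in the training data, which is the characteristic the Act's technology-neutral definition targets.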

Priya now faced a vendor accountability problem. Under the Act, VendorCo was the provider of FraudGuard: it developed the system and placed it on the market. Cornerstone, as the entity deploying FraudGuard in its operations, was the deployer. Article 13 required VendorCo to supply Cornerstone with instructions for use, and the Annex IV technical documentation had to exist to support the system's conformity assessment. VendorCo, on its current position, was declining to produce either.

The compliance program needed to resolve whether to:

  1. Engage in formal dialogue with VendorCo to challenge its classification position and negotiate documentation production;
  2. Seek independent legal opinion on whether FraudGuard's specific technical architecture constituted an AI system under the Act's definition;
  3. Treat FraudGuard as high-risk and build Cornerstone's own deployer-side compliance infrastructure under Article 26 (human oversight, input data controls, monitoring, and log retention) regardless of the vendor's position, while seeking replacement vendor options;
  4. Refer the classification question to the relevant national competent authority for guidance.

Each option had cost, timeline, and legal risk implications. There was also a residual question: if VendorCo's classification was wrong and the system was high-risk, and Cornerstone continued to deploy it after the August 2026 deadline without compliant technical documentation, who bore enforcement exposure — the vendor, the deployer, or both?


Problem 3: The Hiring Tool Nobody Told Compliance About

The AI Act working group discovered, in week two of the inventory exercise, that the HR department had been using an AI-powered job application screening tool — TalentFilter, provided by HRtech Partners Inc — for eighteen months. The tool had been procured by HR under a software subscription that did not route through the usual IT procurement or model risk governance channels. Legal and compliance had never reviewed it.

TalentFilter used an NLP model to analyze CV text, cover letter language, and application form responses, generating a shortlisting score for each applicant. The model had been trained by HRtech on a proprietary dataset of "successful hires" from across its client base — a dataset whose composition, training methodology, and demographic representativeness Cornerstone had never assessed or even inquired about.

The AI Act classification was unambiguous: Annex III(4)(a) explicitly listed AI systems used for recruitment or selection of natural persons as high-risk. Cornerstone was operating a high-risk AI system in EU recruitment processes that it had never classified, never documented, and never subjected to any of the Article 9–15 obligations.

The discovery raised three immediate questions:

First, vendor accountability: HRtech Partners Inc, as the provider, bore primary responsibility for conformity assessment and technical documentation. But had HRtech conducted an EU AI Act conformity assessment? Did it even know its product was high-risk? The contract predated widespread awareness of the EU AI Act in the mid-market HR software sector.

Second, Cornerstone's deployer obligations: Even if HRtech produced compliant documentation, Cornerstone as deployer was independently obligated under Article 26 to implement human oversight measures and to ensure the tool was used within its intended purpose, and under Article 27 to conduct a fundamental rights impact assessment before deploying it in recruitment. None of these deployer-side obligations had been addressed.

Third, legacy hiring decisions: TalentFilter had been used to screen approximately 2,400 job applications over eighteen months. If the model was later found to be non-compliant — or worse, if it was found to have produced discriminatory shortlisting outcomes — what were Cornerstone's legal and reputational exposures to applicants who had been screened out?

The compliance program's immediate recommendation was to suspend use of TalentFilter pending a full review. The HR Director objected: three active recruitment cycles were mid-process, and suspending the tool meant reverting to manual CV screening, for which there was no capacity within the current hiring timeline.

The working group had to make a recommendation under time pressure, with incomplete information and competing institutional interests. There was no clean answer.


Discussion Questions

1. Definitional boundaries and inventory completeness

The Cornerstone inventory expanded from 47 to 63 systems once teams began examining tools that "weren't AI." What practical strategies should a financial institution use to achieve inventory completeness? How should firms approach the classification of regression models, rule engines with ML-derived rules, and scoring tools built in general-purpose software like Excel or R? What governance structures — model inventory registers, procurement gatekeeping, mandatory model declarations — would prevent unregistered AI systems from being deployed in the first place?

2. Vendor accountability and the provider/deployer distinction

When a vendor asserts that its product is not an AI system under the Act's definition, what options does the deploying financial institution have? Consider: (a) the legal exposure facing the deployer if the vendor's classification is wrong; (b) the leverage available to the deployer in contract negotiations and vendor management; (c) the role of procurement processes in creating AI Act obligations — specifically, whether contract clauses requiring vendor EU AI Act compliance documentation should be standard in financial services AI procurement. Who ultimately bears the compliance burden when a vendor refuses to produce Annex IV documentation?

3. The legacy deployment problem

TalentFilter had been operating for eighteen months before the AI Act inventory exercise uncovered it. The Act applies to systems placed on the market or put into service, and contains transition provisions for pre-existing systems — but those provisions do not eliminate obligations; they only delay them until the transition deadline. What should Cornerstone do about the 2,400 applications screened by a system that may have lacked adequate bias testing or representativeness validation? Does the firm have any obligation to those applicants? How should this situation inform future AI procurement governance?

4. The tension between compliance timelines and operational continuity

The recommendation to suspend TalentFilter during active recruitment cycles illustrates a common tension in compliance programs: the right answer from a legal risk perspective (suspend the non-compliant tool) conflicts with the right answer from an operational continuity perspective (complete the hiring process). How should compliance programs structure decision frameworks for this type of conflict? What criteria should govern the decision to continue operating a potentially non-compliant AI system during a transition period?

5. The role of proportionality

The EU AI Act's high-risk framework applies uniformly to a small firm's single credit scoring model and a large bank's portfolio of forty-seven AI systems. Is this appropriate? How should a proportionality principle — if one were to be applied — affect the compliance obligations of smaller financial institutions? And does the Act's current structure (which permits self-assessment for most Annex III systems) already incorporate proportionality implicitly?