Case Study 2: NYC Local Law 144 --- The First AI Hiring Law in Practice
Introduction
On December 11, 2021, New York City became the first major US jurisdiction to enact a law specifically regulating the use of artificial intelligence in employment decisions. Local Law 144, which took effect on January 1, 2023, with enforcement beginning July 5, 2023, requires employers and employment agencies that use automated employment decision tools (AEDTs) to conduct annual independent bias audits, publish audit results, and notify candidates about the use of automated tools.
The law was modest in scope --- it applied only to employment decisions, only in New York City, and only to automated tools that "substantially assist or replace" human decision-making. Yet its implementation generated a disproportionate amount of controversy, confusion, and insight. NYC Local Law 144 became a real-world laboratory for AI hiring regulation, revealing the practical challenges that larger, more comprehensive AI laws --- including the EU AI Act --- will face at scale.
For business leaders and AI practitioners, the law's first two years of implementation offer lessons that no theoretical framework can provide. The gap between legislative intent and operational reality proved wider, and more instructive, than anyone anticipated.
The Problem the Law Aimed to Solve
The use of AI in hiring grew rapidly through the early 2020s. By 2023, an estimated 83 percent of large US employers used some form of automated screening in their hiring process, according to a Harvard Business School/Accenture study. These tools ranged from simple resume keyword filters to sophisticated systems analyzing video interviews, writing samples, voice patterns, and even facial expressions.
The benefits were real: faster screening, reduced recruiter workload, and the potential (at least theoretically) for more consistent evaluation criteria. But so were the risks:
Documented bias. Research had demonstrated that AI hiring tools could perpetuate and amplify existing biases. Amazon famously scrapped an AI recruiting tool in 2018 after discovering it penalized resumes that included the word "women's" (as in "women's chess club" or "women's studies") because the system had been trained on a decade of historical hiring data that reflected the company's predominantly male engineering workforce. Academic studies showed that facial analysis systems used in video interviews performed less accurately for darker-skinned individuals and that NLP-based resume screeners could discriminate based on names, zip codes, and educational institutions that correlated with race and socioeconomic status.
Opacity. Most candidates had no idea that an algorithm was evaluating them, let alone how. Employers using AEDTs rarely disclosed the fact, and the systems themselves were proprietary black boxes. Candidates who were rejected had no way to know whether the decision was made by a human, an algorithm, or some combination of the two.
Accountability vacuum. When a human recruiter discriminated, existing employment law provided a clear framework for liability and redress. When an algorithm discriminated, the legal framework was ambiguous. Was the employer liable for using a tool it didn't fully understand? Was the vendor liable for selling a biased product? Could a candidate even demonstrate that discrimination had occurred without access to the algorithm?
NYC Local Law 144 was designed to address opacity and accountability, not to ban AI hiring tools. Its core philosophy was transparency: if you use an AEDT, you must audit it for bias, publish the results, and tell candidates what you're doing.
The Law's Requirements
Scope: What Counts as an AEDT?
The law defines an automated employment decision tool as "any computational process, derived from machine learning, statistical modeling, data analytics, or artificial intelligence, that issues simplified output, including a score, classification, or recommendation, that is used to substantially assist or replace discretionary decision making for making employment decisions that impact natural persons."
This definition was intentionally broad --- it captured not just AI-powered video interview analysis tools but also resume screening software, candidate ranking algorithms, and any automated system that scored or classified job applicants.
The phrase "substantially assist or replace" was critical. A system that merely organized resumes alphabetically would not qualify. A system that ranked candidates by predicted job performance and presented the top 50 to recruiters for further review would qualify --- it substantially assisted the decision, even though a human made the final call.
Independent Bias Audit
The law requires an independent bias audit of any AEDT used in hiring or promotion decisions:
- The audit must be conducted by an independent auditor (not the employer or the vendor)
- It must be conducted within the year prior to the AEDT's use
- The audit must calculate selection rates and impact ratios for sex categories (male, female, non-binary/other), race/ethnicity categories (as defined by the EEOC), and intersectional categories (sex crossed with race/ethnicity)
- The audit results --- including selection rates and impact ratios for each category --- must be published on the employer's website
Definition: The impact ratio is the selection rate for a given demographic group divided by the selection rate for the most-selected group. An impact ratio below 0.8 (the "four-fifths rule") is commonly used as a threshold for identifying potential adverse impact, though NYC Local Law 144 did not mandate a specific threshold --- it required disclosure of the ratios, not a pass/fail determination.
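The impact-ratio arithmetic itself is simple to sketch. The snippet below, using invented applicant records, computes selection rates and impact ratios for sex categories and for the intersectional sex-by-race/ethnicity categories the law requires, and flags any group falling below the four-fifths threshold:

```python
from collections import defaultdict

# Hypothetical audit data: (sex, race_ethnicity, selected) per applicant.
applicants = [
    ("male", "white", True), ("male", "white", True), ("male", "white", False),
    ("female", "white", True), ("female", "white", False), ("female", "white", False),
    ("male", "black", True), ("male", "black", False),
    ("female", "black", False), ("female", "black", False),
]

def impact_ratios(records, key):
    """Selection rate per group, divided by the most-selected group's rate."""
    selected = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        group = key(rec)
        total[group] += 1
        selected[group] += rec[2]  # True counts as 1
    rates = {g: selected[g] / total[g] for g in total}
    top = max(rates.values())
    return {g: (rates[g], rates[g] / top) for g in rates}

# Sex categories alone, then intersectional (sex x race/ethnicity), as the law requires.
by_sex = impact_ratios(applicants, key=lambda r: r[0])
intersectional = impact_ratios(applicants, key=lambda r: (r[0], r[1]))

for group, (rate, ratio) in sorted(intersectional.items()):
    flag = "  <- below four-fifths threshold" if ratio < 0.8 else ""
    print(f"{group}: selection rate {rate:.2f}, impact ratio {ratio:.2f}{flag}")
```

Note that the law mandates publishing these ratios, not acting on them; the four-fifths comparison in the last loop is the conventional EEOC screening heuristic, not a requirement of Local Law 144.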
Candidate Notice
Employers must provide notice to candidates:
- At least 10 business days before using an AEDT on a candidate's application
- The notice must include information about the AEDT's use and describe the job qualifications and characteristics that the tool evaluates
- Candidates must be informed of their right to request an alternative selection process or accommodation (though the law does not specify what the alternative must be)
- If the AEDT collects personal data, additional privacy-related disclosures may be required
Implementation: Where Theory Met Reality
Challenge 1: The Auditor Market
When the law took effect, there was no established market for AEDT bias audits. A handful of firms --- including academic researchers, legal consultants, and emerging AI audit startups --- positioned themselves as auditors, but standards for what constituted a quality audit were virtually nonexistent.
The result was wide variation in audit rigor:
Minimal audits. Some auditors offered inexpensive, standardized assessments that calculated the required selection rates and impact ratios using historical data, with limited analysis of the AEDT's underlying logic, training data, or edge cases. These audits technically complied with the law but provided limited insight into actual bias.
Comprehensive audits. A smaller number of auditors conducted thorough evaluations that included analysis of training data composition, feature importance evaluation, testing with synthetic and counterfactual data, intersectional analysis beyond the minimum requirements, and qualitative assessment of the AEDT's impact in context.
The cost gap reflected the quality gap. Minimal audits could be completed for $5,000-$15,000. Comprehensive audits cost $50,000-$150,000 or more. Employers overwhelmingly chose the cheaper option, particularly in the law's first year.
Business Insight: The variance in audit quality is a cautionary tale for any jurisdiction implementing AI audit requirements. Without standards for auditor qualifications, methodologies, and reporting, mandatory audits can become compliance theater --- checking a regulatory box without providing meaningful assurance. The EU AI Act's conformity assessment requirements will face the same challenge at a much larger scale.
Challenge 2: The Data Problem
Calculating selection rates and impact ratios requires demographic data about candidates. But most employers do not systematically collect race and ethnicity data for all applicants, and many candidates decline to provide it. This created a practical paradox: the law required analysis by demographic category, but the data necessary for that analysis was often incomplete or unavailable.
Responses varied:
- Some employers began collecting voluntary demographic data from all applicants, adding a step to the application process that some candidates found intrusive
- Some used proxy data (zip codes, names) to infer demographic characteristics --- a practice with its own ethical and legal complications
- Some auditors conducted "testing audits" using synthetic candidate profiles designed to test for differential treatment across demographic groups, rather than relying on historical application data
- Some employers concluded that the data limitations made meaningful audit impossible and sought legal opinions on whether their compliance efforts were sufficient
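The "testing audit" approach in the list above can be sketched as a counterfactual probe: score two synthetic profiles that are identical except for a single demographic signal and inspect the gap. In this illustration, `score_candidate` is a hypothetical stand-in for a vendor's proprietary model; a real audit would query the deployed system instead:

```python
def score_candidate(profile):
    # Toy scoring logic for illustration only; a real testing audit would
    # call the vendor's API or a sandboxed copy of the deployed model.
    score = profile["years_experience"] * 10
    if "captain" in profile["activities"]:  # proxy features can leak bias
        score += 5
    return score

def counterfactual_gap(base_profile, field, value_a, value_b):
    """Score two profiles identical except in one field; return the difference."""
    a = {**base_profile, field: value_a}
    b = {**base_profile, field: value_b}
    return score_candidate(a) - score_candidate(b)

base = {"years_experience": 4, "activities": "chess club captain", "name": ""}
# Names chosen as racially coded signals, in the style of classic resume audits.
gap = counterfactual_gap(base, "name", "Emily Walsh", "Lakisha Washington")
print(f"Score gap attributable to name alone: {gap}")  # 0 here; nonzero would flag bias
```

The appeal of this design is that it sidesteps the historical-data problem entirely: no real applicant demographics are needed, only paired synthetic inputs.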
Challenge 3: Employer Responses
Employer responses to the law fell into several categories:
Compliance. Large employers with dedicated legal and HR teams generally implemented compliance programs, conducted audits, and published results. Banks, professional services firms, and technology companies with significant NYC hiring presence were among the most compliant.
Avoidance. Some employers restructured their hiring processes to avoid triggering the law:
- Removing the automated scoring component and using AI tools purely for organizing (not ranking) candidates
- Shifting screening decisions to human recruiters who used AI-generated information as one input among many, arguing that the tool no longer "substantially assisted or replaced" discretionary decision-making
- Moving hiring operations outside NYC where possible
Noncompliance. Some employers simply ignored the law, calculating that enforcement was unlikely. This calculation was not unreasonable in the law's first year: the NYC Department of Consumer and Worker Protection (DCWP) had limited resources for investigation and enforcement, and no high-profile enforcement action had been taken by mid-2024.
Abandonment. A small number of employers abandoned AI hiring tools entirely, reverting to fully manual screening. These tended to be smaller companies that concluded the compliance cost exceeded the benefit of using AI.
Challenge 4: Audit Quality and Interpretation
Published audit results revealed significant inconsistencies in how auditors interpreted the law's requirements:
- Some audits reported impact ratios at a high level (overall male vs. female selection rates) without intersectional analysis
- Some audits covered only specific roles or departments, not the employer's full use of the AEDT
- Some audits were conducted on historical data that predated the AEDT's current version, raising questions about relevance
- The law did not specify a minimum sample size, leading some audits to report impact ratios based on very small numbers of applicants in specific categories
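The small-sample problem in the last point is easy to quantify: an impact ratio computed from a handful of applicants carries enormous statistical uncertainty. A minimal sketch using the Wilson score interval (a standard technique, not something the law or the DCWP rules prescribe) shows how the same observed 50 percent selection rate is nearly uninformative at n = 4 but tightly bounded at n = 400:

```python
import math

def wilson_interval(successes, n, z=1.96):
    """95% Wilson score confidence interval for a proportion; wide when n is small."""
    if n == 0:
        return (0.0, 1.0)
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    margin = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return (center - margin, center + margin)

# Same observed selection rate (50%), very different certainty:
for selected, total in [(2, 4), (200, 400)]:
    lo, hi = wilson_interval(selected, total)
    print(f"{selected}/{total} selected: rate 0.50, 95% CI [{lo:.2f}, {hi:.2f}]")
```

An impact ratio built from the n = 4 group could plausibly sit anywhere from well below 0.8 to well above 1.0, which is why the absence of a minimum sample size undermined the comparability of published audits.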
More fundamentally, the law required disclosure of impact ratios but did not require employers to act on the results. An employer could publish audit results showing significant adverse impact against a demographic group and continue using the tool. The law's theory was that transparency would create market and reputational pressure for improvement. Whether that theory held in practice remained uncertain.
What the Law Revealed
The Gap Between Regulation and Governance
NYC Local Law 144 demonstrated that regulatory compliance and genuine AI governance are not the same thing. An employer could be fully compliant --- audit conducted, results published, candidates notified --- while operating an AEDT that systematically disadvantaged certain demographic groups. The law created transparency but not accountability.
This gap is not unique to NYC Local Law 144. It exists in any regulatory framework that relies primarily on disclosure rather than substantive requirements. The EU AI Act addresses this gap by requiring high-risk AI systems to meet substantive standards for accuracy, fairness, and robustness --- not just transparency. But whether those substantive standards will be effectively enforced remains an open question.
The Market for AI Auditing
The law catalyzed the emergence of AI auditing as a professional discipline, but also revealed how immature that discipline remains. Questions that arose during implementation --- Who qualifies as an independent auditor? What methodologies are acceptable? How should statistical significance be addressed in small-sample audits? What constitutes a material finding? --- do not yet have standardized answers.
The analogy to financial auditing is instructive. Financial auditing developed over more than a century, with gradually standardized methodologies, professional certifications, regulatory oversight of auditors, and established norms for independence and materiality. AI auditing is at the very beginning of this journey. NYC Local Law 144 accelerated the journey but also exposed how far there is to go.
The Importance of Enforcement
A law without enforcement is a suggestion. NYC Local Law 144's limited enforcement activity in its first 18 months reduced its deterrent effect and emboldened noncompliance. The lesson for other jurisdictions is that regulatory frameworks must be accompanied by adequate enforcement resources --- investigators, technical evaluators, and legal staff --- from the outset. The EU AI Act's tiered enforcement structure (AI Office, national competent authorities, market surveillance authorities) reflects awareness of this challenge, but staffing and funding those entities remains a work in progress.
Candidate Experience and Power Dynamics
The law's notice requirements were designed to empower candidates. In practice, the impact was uncertain. Candidates received notice that an AEDT would be used, but most had no realistic ability to opt out of a competitive hiring process. Requesting an "alternative selection process" could signal to an employer that the candidate was likely to be litigious --- hardly a positive signal in a job application. The power asymmetry between employer and candidate meant that the law's transparency provisions, while valuable, did not fundamentally alter the dynamics of the hiring process.
Implications for the EU AI Act and Beyond
NYC Local Law 144 is a preview of the challenges that larger AI regulatory frameworks will face. Several specific lessons are directly relevant to the EU AI Act's implementation:
1. Standards for auditors and audits must be developed proactively. Waiting until the law takes effect to address audit quality creates a period of regulatory uncertainty and compliance theater. The EU's reliance on harmonized standards is an attempt to avoid this problem, but standards development is running behind implementation deadlines.
2. Data requirements must be realistic. Regulations that require demographic analysis must grapple with the fact that demographic data is often incomplete, voluntarily provided, or legally restricted. Technical solutions (synthetic testing, counterfactual analysis) may be necessary complements to statistical analysis of historical data.
3. Disclosure without accountability is insufficient. Transparency is necessary but not sufficient. Regulations that require disclosure of bias findings without mandating corrective action may create an illusion of oversight without materially reducing harm.
4. Enforcement must be resourced from day one. Regulatory frameworks without enforcement budgets are aspirational documents. The gap between legal requirements and actual compliance is a function of enforcement credibility.
5. Avoidance strategies are predictable. Regulated parties will restructure their activities to avoid triggering regulatory requirements. Lawmakers must anticipate avoidance strategies and draft definitions that are robust to creative compliance.
Epilogue: The Ripple Effect
Despite its limitations, NYC Local Law 144 had effects far beyond New York City:
- Colorado's AI Act (2024) drew on NYC's experience, requiring impact assessments rather than just bias audits and imposing obligations on both developers and deployers
- Illinois, Maryland, and California introduced AI hiring legislation that referenced NYC's approach while attempting to address its gaps
- The EU AI Act's employment provisions were informed by NYC's early implementation experience, particularly regarding audit standards and demographic data challenges
- Employers operating nationally in the US often adopted NYC-compliant practices across all locations, creating a voluntary national floor --- an example of what scholars call the "California effect" (or, in this case, the "New York effect"), where a jurisdiction's regulations become a national standard through corporate compliance decisions
NYC Local Law 144 was imperfect, incompletely enforced, and narrower than its advocates wanted. It was also the first AI regulation in the United States to require specific actions from specific entities by a specific date. For all its limitations, it moved the conversation from hypothetical to operational --- and that transition, more than any specific provision, is its most enduring contribution.
Discussion Questions
- NYC Local Law 144 requires disclosure of bias audit results but does not require employers to take corrective action. Is mandatory disclosure sufficient, or should the law require employers to meet specific fairness thresholds? What are the tradeoffs?
- Some employers responded to the law by removing AI scoring components from their hiring tools, effectively reverting to AI-assisted (but not AI-driven) screening. Does this response serve the law's objectives, or does it undermine them?
- The wide variation in audit quality --- from $5,000 minimal audits to $150,000 comprehensive evaluations --- suggests that audit standards are needed. Who should set these standards? Government regulators? Professional associations? Academic institutions? What are the risks of each approach?
- The law's notice requirements inform candidates that an AEDT is being used but give candidates limited power to influence the process. Is transparency without agency meaningful? What additional protections would be needed to genuinely empower candidates?
- NYC Local Law 144 applies only to New York City, but many employers have adopted its requirements nationwide. What does this suggest about the relationship between local regulation and national practice? Is local regulation an effective path to broader change, or does it create an uneven playing field?
- Compare NYC Local Law 144's approach (transparency and audit) with the EU AI Act's approach (conformity assessment with substantive requirements) for employment-related AI systems. Which is more likely to be effective? Which is more practical to implement?
This case study connects to Chapter 25's examination of bias in AI hiring systems (which provides the technical context for why bias audits matter), Chapter 27's governance frameworks (which provide the organizational infrastructure for conducting meaningful audits), and Chapter 30's discussion of responsible AI in practice (which addresses how organizations can move beyond compliance to genuine fairness).