Case Study 15.1: "The Welfare Algorithm Letter"

Communicating Automated Benefits Decisions — The Arkansas Medicaid Case


Overview

In 2016, the state of Arkansas implemented a new automated system to determine how many hours of home-based care Medicaid recipients with disabilities were entitled to receive. The system used a proprietary algorithm — developed by a vendor called Brocade — that assessed functional ability through scores on a standardized assessment instrument and translated those scores into authorized care hours. Within months of the system's deployment, thousands of recipients saw their care hours drastically cut, in many cases by 30%, 40%, or more, with no explanation beyond a standard form letter citing an "assessment update."

Recipients were not told the algorithm's logic. They were not told which factors in their assessment drove the reduction. They were not told whether the reduction reflected a change in their own condition or a change in the formula being applied. When they appealed, the appeals were reviewed by the same agency using the same algorithm. Many recipients went months without adequate care. For individuals who relied on home care workers to assist with feeding, bathing, toileting, and medication management, these reductions had severe and immediate consequences.

In the landmark case Ledgerwood v. Jobe (2016) — which became one of the foundational cases in US administrative law on algorithmic decision-making — a federal district court ruled that Arkansas's implementation of the system violated the due process rights of Medicaid recipients. The court found that the state's communications were constitutionally inadequate: they did not provide sufficient information for recipients to understand why their benefits had been cut, and the appeal process was not meaningfully designed to remedy errors.


Background: The Arkansas Home Care Medicaid Program

Arkansas's Medicaid home care program serves individuals with physical and cognitive disabilities who require assistance with activities of daily living: bathing, dressing, grooming, feeding, toileting, and similar tasks. Prior to 2016, care hours were determined by independent nurse assessors who reviewed each recipient's individualized situation and exercised professional judgment to recommend an appropriate level of care. Nurses could take into account factors not captured in standardized assessments: the particular circumstances of a recipient's living situation, the availability of family support, patterns of need that varied over time.

In 2016, Arkansas contracted with Brocade to implement an algorithmic system called the Assessment and Level of Care (ALC) Tool. The tool was designed to standardize the determination of care hours based on recipients' scores on a standardized assessment instrument, the Arkansas Independent Assessment (ARIA). Under the new system, care hours were determined by a formula that mapped ARIA subscores to care hour allocations. Human nurse assessors still conducted the ARIA assessments, but the ALC tool — not a human professional — translated assessment scores into authorized hours.
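To make the mechanics concrete, here is a minimal hypothetical sketch of a tiered score-to-hours mapping of the kind described above. The subscore names, tier cutoffs, and hour allocations are invented for illustration; the actual ALC formula was proprietary and was never published in this form.

```python
# Hypothetical sketch of a tiered score-to-hours mapping in the style of
# the ALC tool. Subscore names, cutoffs, and hours are illustrative only.

def care_tier(aria_subscores: dict[str, int]) -> int:
    """Collapse assessment subscores into a single care tier (illustrative)."""
    total = sum(aria_subscores.values())
    if total >= 24:
        return 3  # highest assessed need
    if total >= 16:
        return 2
    return 1

# Weekly authorized hours per tier (invented values)
TIER_HOURS = {1: 16, 2: 32, 3: 56}

def authorized_hours(aria_subscores: dict[str, int]) -> int:
    """Translate assessment subscores into weekly authorized care hours."""
    return TIER_HOURS[care_tier(aria_subscores)]

# A one-point change in a single subscore can cross a tier boundary:
print(authorized_hours({"bathing": 8, "feeding": 8, "mobility": 8}))  # 56
print(authorized_hours({"bathing": 8, "feeding": 8, "mobility": 7}))  # 32
```

A design like this also illustrates the cliff effect: a small change in one subscore can move a recipient across a tier boundary and cut weekly hours sharply, which is consistent with the abrupt reductions recipients experienced.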

The rationale for the change was efficiency and consistency: eliminating the variability in individual nurse judgment would produce more uniform outcomes and reduce costs. The anticipated cost reduction was substantial — and, as subsequent litigation revealed, was a significant driver of the system's design parameters.


The Letters That Said Nothing

When recipients received notifications of their care hour reductions, the letters were substantively empty. A typical letter stated that the recipient's services were being modified following an "updated assessment" and provided the new authorized hours. It noted that the recipient had the right to appeal within 30 days. It provided no information about which elements of the ARIA assessment had driven the change, no comparison between the current assessment scores and previous scores, no explanation of the formula used to translate scores into hours, and no indication of whether the change reflected a deterioration in the recipient's condition or simply a change in the algorithmic criteria.

For a recipient like Ledgerwood — a woman with fibromyalgia and other conditions who had her care hours cut from 56 per week to 32 per week — the letter communicated nothing useful. She could not tell from the letter whether her hours had been reduced because her assessed level of need had dropped (which she knew could not be the case — her condition had not improved), because the ARIA assessment instrument had changed, or because the algorithm's formula had changed. She had no basis on which to construct a meaningful appeal.

When she did appeal, the appeal was reviewed by a nurse employed by the Department of Human Services — but the nurse used the same ALC tool to evaluate the appeal. The nurse could not deviate from the tool's output, regardless of any information the appellant presented. The appeal process thus provided a nominal human review that was not substantively different from the original automated determination.


The Legal Framework: Due Process

Due process requirements for administrative benefit decisions derive from two overlapping sources: the Fifth and Fourteenth Amendments to the US Constitution, which prohibit deprivation of life, liberty, or property without due process; and the statutory and regulatory frameworks governing specific benefit programs.

In the landmark 1970 case Goldberg v. Kelly, the Supreme Court held that welfare recipients have a protected property interest in their benefits and cannot have those benefits terminated without adequate prior notice and an opportunity to be heard. The Court specified that notice must provide the specific reasons for the proposed termination in sufficient detail to permit the recipient to prepare a response, and that the hearing must be a genuine opportunity to present evidence and argument, not a rubber-stamp of the agency's original decision.

Applying Goldberg's framework to algorithmic benefit decisions, the district court in Ledgerwood v. Jobe concluded that Arkansas's system failed on multiple dimensions:

Inadequate notice. The termination letters did not explain the algorithmic formula used to calculate care hours, the specific inputs from the ARIA assessment that drove the recipient's allocation, or whether the change reflected a change in the recipient's condition or in the algorithmic criteria. Without this information, recipients could not meaningfully prepare an appeal. The court emphasized that constitutional adequacy of notice depends on whether the recipient receives information sufficient to challenge the decision — not just information sufficient to be aware that a decision was made.

Inadequate hearing. The appeal process was constitutionally inadequate because the appeals officer was bound by the ALC tool's output. A hearing that cannot result in a different outcome because the decision-maker is constrained by an algorithm provides no genuine opportunity to be heard. The court noted that this was not merely a procedural objection: if the algorithm contained errors or produced systematically wrong outputs for particular recipient profiles, the appeal process provided no mechanism for identifying or correcting those errors.

Failure to disclose algorithm logic. The court expressed concern — though this was not the primary holding — about the state's refusal to disclose the ALC tool's formula and parameters, citing vendor confidentiality agreements. The court found it troubling that a government program determining citizens' constitutional entitlements was operating through a black-box algorithm whose logic was shielded from the affected citizens and, effectively, from judicial review.
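The court's notice standard can be read as a checklist: notice is adequate only if it gives the recipient enough to challenge the decision, not merely enough to know one was made. A hypothetical sketch of that checklist (field names are invented for illustration):

```python
# Hypothetical checklist of what an adequate notice must contain under the
# court's standard described above. Field names are illustrative.

REQUIRED_NOTICE_FIELDS = [
    "new_authorized_hours",
    "previous_authorized_hours",
    "aria_subscores_current",     # the inputs that drove the allocation
    "aria_subscores_previous",    # so the recipient can see what changed
    "formula_version_current",    # did the criteria change...
    "formula_version_previous",   # ...or did the assessed condition?
    "appeal_deadline_days",
]

def notice_is_adequate(notice: dict) -> bool:
    """True only if every required field is present and non-empty."""
    return all(notice.get(f) not in (None, "", {}) for f in REQUIRED_NOTICE_FIELDS)

# The letters described in this case would fail the check: they contained
# only the new hours and the appeal deadline.
print(notice_is_adequate({"new_authorized_hours": 32, "appeal_deadline_days": 30}))
```

The point of the sketch is structural: most of the fields the checklist demands simply did not exist anywhere in the state's notice-generation pipeline, which is why the letters could not be fixed by rewording alone.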


What the Algorithm Got Wrong

Subsequent analysis revealed that the ALC tool contained errors — not just in its communication, but in its substance. The algorithm failed to account for certain medical conditions that affected care needs: recipients with spasticity or uncontrolled seizures, for example, needed higher care hours than their ARIA scores predicted because standard tasks took them longer and required more assistance. The ARIA assessment did not capture these needs in a way the algorithm could process, and the algorithm did not flag cases where its standard formula was likely to be inaccurate.

The algorithm also failed to account for the interaction between care needs and living arrangements. A recipient living alone required more authorized hours for the same level of functional limitation than a recipient with a family caregiver, because there was no one else to perform the tasks that fell outside the authorized hours. The ALC tool did not incorporate this factor.
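One way to surface gaps like these is to flag cases where the standard formula is known to be unreliable and route them to human review, rather than silently applying it. A minimal sketch of that kind of exception logic — the condition names, record fields, and rules here are assumptions for illustration, not features of the actual ALC tool:

```python
# Hypothetical sketch of the exception flagging the ALC tool lacked.
# Condition names, fields, and rules are illustrative assumptions.

FLAG_CONDITIONS = {"spasticity", "uncontrolled_seizures"}

def review_flags(record: dict) -> list[str]:
    """Return reasons this case should go to human review rather than
    receiving the standard formula's output unmodified."""
    flags = []
    if FLAG_CONDITIONS & set(record.get("diagnoses", [])):
        flags.append("condition outside the formula's validated range")
    if record.get("lives_alone") and not record.get("family_caregiver"):
        flags.append("no informal caregiver; standard hours may be too low")
    return flags

print(review_flags({"diagnoses": ["spasticity"], "lives_alone": True}))
```

A rule set like this does not fix the formula; it makes the formula's known blind spots visible inside the system, so that correction does not have to wait for litigation.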

These errors would have been discoverable through the appeal process — if the appeal process had given recipients the information they needed to identify them, and if appeals had been decided by humans with authority to deviate from the tool's output when presented with evidence that it was producing a wrong result. Without that information and that authority, the errors were invisible from within the system. They became visible only through litigation.


The Aftermath: Reform and Resistance

The court's ruling required Arkansas to provide meaningful individualized explanations for care hour determinations, to create a genuine human review process for appeals, and to disclose the ALC tool's methodology to recipients and their advocates. Implementation of these requirements was slow and contested.

Arkansas revised its termination notice to include additional information about the ARIA assessment scores and the care tier to which the recipient had been assigned, but advocates argued that the revised notices still did not give recipients enough information to understand why their new hours differed from what had previously been authorized. The appeal process was redesigned to give reviewers more authority to exercise independent judgment, but enforcement of this authority varied.

The case generated attention from disability rights advocates, legal scholars, and AI ethics researchers, and became a touchstone in the developing legal and policy debate about algorithmic accountability in government. Similar challenges were subsequently brought in other states, and the case influenced the development of federal guidance on algorithmic decision-making in benefit programs.


Lessons and Analysis

The Arkansas Medicaid case illustrates several critical principles for AI communication in high-stakes government contexts.

The stakes matter enormously. Benefits terminations that reduce an individual's care hours can result in immediate, serious harm — falls, malnutrition, medication errors, and in severe cases, death or hospitalization. When the stakes of an AI decision are this high, communication requirements should be correspondingly more demanding, not less. The minimal notice that might be adequate for a low-stakes automated communication is wholly inadequate when the communication concerns a change in life-sustaining services.

Explanation must be built into the model. Arkansas could not provide meaningful individual explanations in part because the ALC tool had not been designed to produce them. The tool calculated a care tier from ARIA subscores without recording which subcomponents most influenced the outcome or how the outcome compared to similar cases. Meaningful explanation capacity requires explanation to be designed into the system — built into the data structures that record each decision, not retrofitted through post-hoc analysis.
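As a sketch of what "designed in" might mean: a decision record that captures, at the moment of determination, the inputs used, the formula version applied, and the prior decision, so that a later notice or appeal can answer what changed. All class and field names here are hypothetical, not drawn from the actual ALC tool.

```python
# Hypothetical decision record designed for explanation. The point is that
# each determination stores its inputs and context when it is made, so
# explanations never require post-hoc reconstruction.

from dataclasses import dataclass, field
from datetime import date
from typing import Optional

@dataclass
class CareDecisionRecord:
    recipient_id: str
    decided_on: date
    aria_subscores: dict[str, int]   # the inputs actually used
    formula_version: str             # so a formula change is visible
    authorized_hours: int
    previous_hours: Optional[int] = None
    previous_subscores: dict[str, int] = field(default_factory=dict)

    def changed_factors(self) -> dict[str, tuple[int, int]]:
        """Subscores that differ from the previous assessment: name -> (old, new)."""
        return {
            k: (self.previous_subscores[k], v)
            for k, v in self.aria_subscores.items()
            if k in self.previous_subscores and self.previous_subscores[k] != v
        }
```

A notice letter generated from such a record could state which subscores changed and whether the formula version changed — precisely the two questions the recipients in this case could not answer from their letters.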

Vendor confidentiality cannot override constitutional rights. The state's argument that it could not disclose the algorithm's logic because of vendor confidentiality agreements was rejected, and rightly so. Government agencies that deploy vendor AI systems to make decisions affecting citizens' constitutional rights cannot shield those systems from accountability through contractual arrangements with vendors. Procurement contracts should require transparency; agencies should not accept vendor opacity terms for systems used in high-stakes benefit administration.

Appeal processes must be genuinely corrective. An appeal process that cannot produce different outcomes is not an appeal process; it is a delay mechanism. Genuine appeal requires human reviewers with authority to override automated decisions, access to the information needed to evaluate whether the automated decision was correct, and a realistic timeline that limits the harm caused by erroneous decisions while the appeal is pending.

Algorithms in benefits administration require public disclosure. The ALC tool's parameters were not public. Neither legislators who had authorized the home care program, nor advocacy organizations representing program recipients, nor independent researchers studying benefits administration had access to information about how the algorithm worked. This opacity prevented external accountability that might have identified the algorithm's errors before they harmed thousands of recipients. Government algorithms used to determine citizens' entitlements should be publicly disclosed, subject to independent audit, and subject to regular review.


Reflection Questions

  1. Arkansas argued that algorithmic care hour determinations were more consistent and less subject to individual nurse bias than the previous professional judgment system. Is consistency a sufficient justification for using an algorithm in this context? How should consistency and individual accuracy be weighed against each other?

  2. The federal court required meaningful individual explanations for care hour determinations. What would such explanations look like in practice? How much information is "enough" for a recipient to be able to meaningfully challenge a determination?

  3. Vendor confidentiality is a significant barrier to algorithmic accountability in government. What contractual or regulatory requirements should govern the procurement of AI systems used in government benefit administration?

  4. How should the design of the ALC tool — which was developed before concerns about algorithmic accountability were widely discussed — be evaluated in retrospect? Who bears responsibility for the tool's errors: the vendor, the state agency, the political officials who authorized the system, or some combination?

  5. Consider the argument that the previous system — where nurses exercised individual professional judgment — was itself not free from bias and inconsistency, and that in some ways the algorithmic system was more equitable. How should this argument affect our evaluation of the case?