Case Study 34.2: Rafael's Dilemma — When the System Works as Designed and That's the Problem

DataField.Dev

The Situation

Organization: Meridian Capital (fictional US broker-dealer; acquired in 2024; Rafael Torres now consulting)

Rafael Torres's role: Post-acquisition integration consultant advising on compliance system governance

Challenge: An automated trade surveillance system correctly identifies a pattern of behavior, but applying its output as designed would cause harm that Rafael does not believe the designers anticipated

Timeline: Q2 2024

Background

After Meridian Capital's acquisition by a larger financial services group, Rafael Torres was retained as a compliance integration consultant. One of his responsibilities: assessing Meridian's automated trade surveillance systems against the acquiring firm's governance standards.

Meridian's trade surveillance system used ML-based detection to flag potential market manipulation patterns: spoofing, layering, and front-running (Chapter 22). The system had a documented false positive rate of approximately 22%, measured as a share of alerts: roughly 1 in 5 alerts flagged a legitimate trade pattern rather than manipulation.

The workflow for flagged alerts ran in stages. A team of junior surveillance analysts reviewed each alert. If an analyst confirmed the alert as suspicious, it was escalated to the head of surveillance, who decided whether to refer it to legal and compliance for formal investigation. If legal determined there was sufficient evidence, a formal notification could be filed with FINRA (in the US) or, in EU equivalents, with the relevant national competent authority (NCA).

The system was well-designed and had generated several enforcement referrals that resulted in confirmed regulatory actions. Rafael was not questioning its technical quality.

The Ethical Problem Rafael Found

During his review, Rafael examined the pattern of alerts that had been escalated through the full workflow over the prior 24 months.

He found something that he could not initially explain. The alert-to-escalation rate — the proportion of flagged alerts that were escalated beyond the first analyst review — was 31% for traders on certain desks and 12% for traders on others. The differential was not explained by the underlying trading strategies, which were similar across desks.

The escalation differential was explained by something else. The surveillance team's escalation decisions correlated significantly with whether the trader had been involved in previous escalations — regardless of the outcome of those escalations (i.e., regardless of whether the prior escalation had resulted in a finding of wrongdoing or been cleared as a false positive). A trader who had been escalated before was substantially more likely to be escalated again.
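The two comparisons described above can be sketched in a few lines. This is a minimal illustration rather than Meridian's actual analysis: the record fields (`desk`, `prior_escalation`, `escalated`) and the sample data are hypothetical, chosen only to show how an alert-to-escalation rate is grouped first by desk and then by whether the trader had any earlier escalation.

```python
from collections import defaultdict

# Hypothetical alert records; field names and values are illustrative only.
SAMPLE_ALERTS = [
    {"desk": "A", "prior_escalation": True, "escalated": True},
    {"desk": "A", "prior_escalation": True, "escalated": True},
    {"desk": "A", "prior_escalation": False, "escalated": False},
    {"desk": "A", "prior_escalation": True, "escalated": False},
    {"desk": "B", "prior_escalation": False, "escalated": False},
    {"desk": "B", "prior_escalation": False, "escalated": True},
    {"desk": "B", "prior_escalation": False, "escalated": False},
    {"desk": "B", "prior_escalation": True, "escalated": False},
]


def escalation_rate_by(alerts, field):
    """Return {group value: share of that group's alerts that were escalated}."""
    totals = defaultdict(int)
    escalated = defaultdict(int)
    for alert in alerts:
        group = alert[field]
        totals[group] += 1
        if alert["escalated"]:
            escalated[group] += 1
    return {group: escalated[group] / totals[group] for group in totals}


# Rate per desk, then rate split by whether the trader had a prior escalation.
by_desk = escalation_rate_by(SAMPLE_ALERTS, "desk")
by_history = escalation_rate_by(SAMPLE_ALERTS, "prior_escalation")
```

A large gap in `by_history` alongside similar trading strategies would point toward the review process rather than the trading itself. A real analysis would also need a significance test and controls for desk and strategy, which this sketch omits.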

This meant the surveillance system, combined with the human escalation workflow, was creating a feedback loop: once flagged, a trader was more likely to be flagged again — not because they were more likely to engage in misconduct, but because the escalation history itself influenced subsequent analyst decisions.
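The mechanics of such a loop can be illustrated with a small expected-value calculation. The sketch below is an assumption-laden toy model, not something taken from the case data: it supposes that every alert for a trader is escalated with one fixed probability while the trader has no escalation history and with a higher fixed probability afterwards, independent of the trader's actual conduct. The function name and the two probabilities are hypothetical.

```python
def expected_escalation_path(p_no_history, p_with_history, n_alerts):
    """Expected escalation probability at each successive alert for one trader.

    Toy assumption: analysts escalate with probability p_no_history while the
    trader has no prior escalation, and with p_with_history once any prior
    escalation exists, regardless of the trader's actual conduct.
    """
    share_with_history = 0.0  # probability the trader already has a history
    path = []
    for _ in range(n_alerts):
        p_escalate = (
            (1 - share_with_history) * p_no_history
            + share_with_history * p_with_history
        )
        path.append(p_escalate)
        # A trader without a history gains one the first time an alert is escalated.
        share_with_history += (1 - share_with_history) * p_no_history
    return path


# Illustrative values only; they are not the rates reported in the case.
path = expected_escalation_path(p_no_history=0.10, p_with_history=0.30, n_alerts=20)
```

Under these assumptions the expected escalation probability rises with every alert and drifts toward the higher rate, even though nothing about the trader's behavior has changed.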

The feedback loop was most pronounced for traders in a specific demographic group — not because the model had been designed with any demographic features, but because certain desks historically had higher escalation rates, and those desks disproportionately included traders of a particular background.

Rafael wrote in his consultation notes: "The model does not know the demographic. The analysts may not consciously know they are doing it. But the data shows a differential that the model's design did not anticipate and that the governance process has not corrected."

Rafael's Three Options

Rafael identified three courses of action, each with different ethical and practical implications.

Option 1: Document and escalate internally. Report the finding to the acquiring firm's Head of Compliance and Chief Risk Officer. Recommend a review of the surveillance workflow and escalation process. This is the conservative, procedurally correct approach — raise the concern to the appropriate institutional decision-makers.

Risk: internal escalation may be slow, may be deprioritized, may not result in meaningful remediation. The harm continues while the process runs.

Option 2: Include the finding explicitly in the formal integration assessment report. The integration assessment will be read by senior leadership and potentially by regulators during any examination of the post-acquisition integration process. Making the finding explicit and formal creates a paper trail and increases the probability of action.

Risk: the finding reflects poorly on both Meridian's surveillance governance and the acquiring firm's due diligence. A frank assessment may be politically uncomfortable. Rafael's consulting relationship may be affected.

Option 3: Recommend immediate suspension of the escalation protocol pending review. Suspend the current escalation process and replace it with a temporary protocol that removes prior escalation history from analyst consideration until the bias can be assessed. This is the most protective of traders who may be experiencing unfair escalation patterns.

Risk: suspending the escalation protocol may reduce surveillance effectiveness. If the temporary protocol misses a genuine manipulation pattern that the suspended protocol would have caught, the acquiring firm may be exposed to regulatory criticism.

The Ethical Analysis

Rafael worked through each framework.

Consequentialism. Continuing the current workflow while the review proceeds generates ongoing harm to traders being unfairly escalated. Option 3 generates the least ongoing harm but may generate a different harm (reduced surveillance effectiveness, regulatory risk). The calculation is not obvious — but the scale of ongoing harm (daily, affecting multiple individuals, indefinitely) argues for prompt action.

Deontology. The traders being disproportionately escalated have a right not to be subject to biased surveillance — even if the bias is unintentional and structural rather than deliberate. That right does not depend on the consequentialist calculation. Continuing a biased process because it is convenient or because the institutional review takes time does not respect that right.

Virtue ethics. Rafael asked himself: what would a compliance professional of good character do in this situation? The answer he arrived at was not what would be easiest or least disruptive, but what would take the documented harm seriously and act to prevent its continuation.

What Rafael Did

Rafael chose a combination of Option 2 and a modified version of Option 3.

He included the finding explicitly in the formal integration assessment, with full supporting data. He recommended an immediate operational change: removing prior escalation history from the analyst review interface for a 90-day review period (analysts would see the alert details but not the trader's prior escalation history). He noted that this change was operationally straightforward to implement and would not significantly impair surveillance effectiveness, because the alert itself, not the history, was the basis for escalation.
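One way to picture the recommended change is as a filter applied to each alert record before it reaches the analyst. The sketch below is hypothetical: the case does not describe the interface or its data schema, so the field names and the `analyst_view` helper are assumptions made for illustration.

```python
# Hypothetical field names; the case does not describe the interface schema.
HISTORY_FIELDS = frozenset({"prior_escalation_count", "prior_escalation_outcomes"})


def analyst_view(alert, hidden_fields=HISTORY_FIELDS):
    """Return a copy of the alert record without prior-escalation fields."""
    return {key: value for key, value in alert.items() if key not in hidden_fields}


alert = {
    "alert_id": "A-1001",
    "pattern": "layering",
    "trader_id": "T-17",
    "prior_escalation_count": 2,
    "prior_escalation_outcomes": ["cleared", "cleared"],
}

# The analyst sees the alert details; the stored record keeps its history.
view = analyst_view(alert)
```

The filter leaves the stored record untouched, so the history remains available for the root cause analysis. A visible trader identifier could still let an analyst recall earlier escalations, so hiding the history fields alone may not fully blind the review.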

He also recommended a 90-day root cause analysis: why had the escalation rate differentials developed? Was it in the model's alert generation, in analyst decision-making, or in both?

The acquiring firm's compliance leadership was uncomfortable with the explicitness of the formal report and asked: "Can't we just say 'escalation process enhancements recommended' rather than documenting the specific finding?"

Rafael's response: "We could. But if this is later examined — in a discrimination claim, in a regulatory review, in an employment dispute — the documentation that we identified a specific pattern and chose not to record it clearly would be worse for the firm than documentation that we identified it and acted."

The formal language stayed. The operational change was implemented. The 90-day review found that the differential in escalation rates was predominantly driven by the prior-history bias in analyst decision-making, not by the model's alert generation. Protocol changes were implemented; in the six months that followed, the gap in escalation rates narrowed from 31% versus 12% (19 percentage points) to 18% versus 14% (4 percentage points).

Discussion Questions

1. The differential escalation rate was produced by the combination of the surveillance model and the human review process — neither of which, individually, was designed to produce a biased outcome. How should institutions identify and govern these emergent patterns, which are not visible in the design of any single system component?

2. Rafael faced a professional tension: the explicit documentation of a bias finding would be politically uncomfortable for his client and might affect his consulting relationship. How should compliance professionals navigate the tension between professional honesty and institutional comfort? Does the fiduciary or duty-of-care dimension of the compliance role resolve this tension?

3. The acquiring firm asked Rafael to soften the documentation — "escalation process enhancements recommended" rather than explicitly documenting the specific bias finding. Rafael declined. Was his reasoning — that undisclosed findings are worse than disclosed ones if later examined — purely self-interested (protecting the firm from future liability), or was there an ethical principle at stake independent of liability risk?

4. The operational change — removing prior escalation history from analyst view — was described as operationally straightforward. What due process considerations apply to traders who had been historically over-escalated? Should they be notified of the finding? Should any past escalation records be reviewed and potentially corrected?

5. The 90-day review found that the bias was predominantly in analyst decision-making rather than the model. What does this imply for the design of human-in-the-loop review processes for automated surveillance? How can human review processes be designed to add genuine value (human judgment, contextual understanding) without amplifying biases that the automated system does not contain?