Case Study 2: Building a First Communications Surveillance Program at Meridian Asset Management

Overview

This case study follows Priya Nair's engagement at Meridian Asset Management — a tier-2 investment manager with approximately £18 billion in assets under management — as she advises on the design and implementation of a first-generation communications surveillance program. The engagement surfaces the practical challenges of applying regulatory communications monitoring requirements to a real institution: data volume, false positive management, alert fatigue, and the cultural resistance of a trading floor that views compliance monitoring as an intrusion rather than a safeguard.

Key regulatory framework: MiFID II Article 16(7) (recording obligation); UK MAR Article 16 (STOR obligation, which presupposes communications review capability); FCA Market Watch 69 (communications surveillance expectations); FCA Conduct of Business Sourcebook (COBS) 11.8 (recording requirements)

Characters: Priya Nair (Big 4 RegTech consultant), Jonathan Adler (Meridian CRO), Sasha Merritt (Meridian Head of Compliance), Rafael Torres (consultant, providing specialist broker-dealer perspective)


Background: The Engagement

Meridian Asset Management had operated for eleven years as a UK-regulated investment manager. Its core business was managing multi-asset portfolios for pension funds, insurance companies, and sovereign wealth fund clients. Its trading function — approximately forty traders across equities, fixed income, and FX — used Bloomberg terminals as the primary communication platform for dealer-to-counterparty communication, supplemented by internal Symphony messaging for desk-to-desk coordination and traditional telephony recorded under MiFID II.

Meridian had a working trade surveillance program — a licensed instance of a market surveillance platform that monitored order and execution data. It did not have a communications surveillance program in any meaningful sense. The firm's recorded calls were archived but not monitored. Bloomberg Chat messages were retained per the five-year retention requirement but were not systematically reviewed. Email was subject to a keyword alert system maintained by the IT security team, which had been designed primarily for data loss prevention rather than market abuse detection.

The trigger for the engagement was twofold. First, an FCA Supervisory Review letter received in September highlighted that Meridian's surveillance framework did not include communications monitoring and asked the firm to explain how it met its obligations under MAR Article 16 without the ability to review communications in the context of suspicious transaction investigations. Second, Rafael Torres, a consultant whom Meridian's CRO Jonathan Adler had engaged separately to advise on trade surveillance calibration, flagged the communications gap as Meridian's most significant unaddressed regulatory exposure.

Rafael introduced Priya, whose specialist expertise in communications surveillance technology — developed through three previous engagements at comparable-sized investment managers — made her the natural lead for the engagement. Priya arrived for the initial scoping meeting in mid-October with a framework diagram drawn from her previous work and a set of diagnostic questions designed to expose the full scope of the gap.


Phase 1: Diagnostic — Mapping the Communications Landscape

Priya's first task was to map the communications channels actually in use at Meridian, as distinct from the channels that compliance policy assumed were in use. The gap between these two maps is, in Priya's experience, consistently wider than compliance teams expect.

Her diagnostic process involved structured interviews with desk heads and traders across the three primary desks, a review of MiFID II recording logs to identify which channels were being captured, a technical review of the IT infrastructure supporting retained communications, and a review of Meridian's existing policies on use of communication channels for business purposes.

The resulting communications map contained six distinct streams:

Channel                         | Volume Estimate      | MiFID II Recording Status      | Current Surveillance Status
Bloomberg Chat (BNET)           | ~2,400 messages/day  | Retained (Bloomberg MSG feed)  | Not monitored
Internal Symphony               | ~800 messages/day    | Retained (Symphony archive)    | Not monitored
Recorded telephony              | ~150 calls/day       | Retained (NICE recording)      | Not monitored
Email (Microsoft 365)           | ~3,500 messages/day  | Retained (Exchange archive)    | Keyword DLP alerts only
WhatsApp (personal devices)     | Unknown              | NOT retained                   | Not monitored
Microsoft Teams (personal chat) | ~200 messages/day    | Retained (Teams archive)       | Not monitored

The WhatsApp finding was the most immediately concerning. Several traders on the FX desk acknowledged using WhatsApp groups to communicate informally with counterparty dealers. Priya was careful in her diagnostic interviews not to suggest that this was necessarily abusive — informal dealer-to-dealer communication about market color is a longstanding practice — but she noted for Sasha that WhatsApp use on personal devices for business-related communications likely violated both Meridian's own policy and the MiFID II recording obligation. A separate workstream was opened to address personal device usage.

The total addressable message volume across retained channels was approximately 7,000 messages per day — roughly 1.75 million messages per year, assuming 250 trading days. Even at an optimistic ten minutes per message for a meaningful contextual review, reviewing every message would require roughly 290,000 analyst-hours annually (1.75 million × 10 minutes ÷ 60). Systematic manual review was not a realistic proposition. The surveillance program would need to be automated, layered, and tightly targeted.


Phase 2: Technology Selection and Architecture Design

Priya presented Meridian's Compliance and Technology leadership with a three-layer architecture for communications surveillance, designed to produce a manageable number of alerts from the 7,000-message daily volume without creating a system so narrow that it missed genuinely problematic communications.

Layer 1: Keyword and Phrase Alerting

The first layer applied a library of high-risk terms to all ingested communications. This is the most widely deployed form of communications surveillance and the simplest to implement. The keyword library drew from: the FCA's published STOR case studies and Market Watch guidance; historical market abuse communications excerpts from published enforcement notices (the LIBOR investigation had produced an extraordinary public record of manipulative language); Priya's proprietary term library developed across previous engagements; and Meridian-specific terms relevant to instruments in its portfolio.

The challenge with keyword alerting is calibration. A keyword library that includes the term "push" will fire on every message in which a trader asks a counterparty to "push the price." A library that requires the phrase "move the fixing" is highly specific but will miss novel phrasing. Priya recommended a tiered keyword structure: Tier 1 (high specificity, high risk — e.g., "don't let it fix below"; "if you buy this I'll sell you that"; "we need the rate lower") triggering immediate analyst review; Tier 2 (medium specificity, context-dependent — e.g., "coordinate"; "get a better fix"; "cover this for me") triggering contextual analysis before alert generation.

Based on Meridian's instrument profile and desk structure, Priya estimated that a well-calibrated keyword library would generate approximately 25-40 Tier 1 alerts per day and 60-80 Tier 2 alerts per day across all channels. The Tier 2 alerts, after contextual filtering, would reduce to 10-15 that required analyst review.
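
A minimal sketch of how this tiered structure might be expressed in code — the phrase lists, field names, and matching logic below are illustrative assumptions rather than Meridian's actual library:

```python
import re
from dataclasses import dataclass

# Illustrative phrase lists only — a production library would be far larger
# and maintained against enforcement material and desk-specific language.
TIER1_PHRASES = ["don't let it fix below", "we need the rate lower"]
TIER2_PHRASES = ["coordinate", "get a better fix", "cover this for me"]

@dataclass
class KeywordAlert:
    tier: int          # 1 = immediate analyst review; 2 = contextual filtering first
    phrase: str
    trader_id: str
    channel: str

def scan_message(message: dict) -> list[KeywordAlert]:
    """Scan one ingested message (assumed fields: text, trader_id, channel)."""
    text = message["text"].lower()
    alerts = []
    for phrase in TIER1_PHRASES:
        if phrase in text:  # high-specificity phrases: plain substring match
            alerts.append(KeywordAlert(1, phrase, message["trader_id"], message["channel"]))
    for phrase in TIER2_PHRASES:
        # Word boundaries stop "coordinate" firing inside longer words.
        if re.search(rf"\b{re.escape(phrase)}\b", text):
            alerts.append(KeywordAlert(2, phrase, message["trader_id"], message["channel"]))
    return alerts
```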

Layer 2: Behavioral Pattern Analysis

The second layer applied behavioral analytics to communications metadata — not content, but patterns. Who was messaging whom? With what frequency? Were there communications with counterparties in time windows correlated with suspicious order activity flagged by the trade surveillance system? Were there communications patterns that deviated from a trader's historical baseline — for example, a spike in external messaging volume in the period before a material market move?

This layer operated primarily on metadata, which created a significant advantage from a data protection and GDPR perspective: metadata analysis does not involve reading message content and therefore creates fewer tensions with employees' reasonable privacy expectations. Priya walked Jonathan through this distinction carefully, noting that Meridian's employees had been made aware of the recording and retention of communications via the standard MiFID II disclosure (a point that Sasha confirmed had been included in the firm's employment contracts and client-facing disclosures). However, behavioral analytics operating at the metadata level was less intrusive and easier to defend as proportionate under GDPR Article 5(1)(c).

The behavioral layer was expected to generate 5-10 alerts per week — a much lower volume, because it required a genuine anomaly in communication patterns to trigger.
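
As an illustration of how little content such a check needs, a baseline-deviation test of the kind described might look like the sketch below; the trailing-window length and three-sigma threshold are assumptions, not Priya's calibrated values.

```python
from statistics import mean, stdev

def volume_anomaly(history: list[int], today: int, z_threshold: float = 3.0) -> bool:
    """Flag a trader whose external-message count today deviates sharply from
    their trailing daily baseline (e.g. the previous 60 trading days).
    Operates purely on counts — no message content is read."""
    if len(history) < 10:
        return False          # too little baseline data to judge
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return today > mu     # flat baseline: any increase is a deviation
    return (today - mu) / sigma > z_threshold
```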

Layer 3: ML-Based Contextual Analysis

The third layer applied a fine-tuned NLP classification model to communications flagged by Layers 1 or 2, as well as to a random sample of all communications (approximately 2% daily, to provide ongoing calibration data). The model had been pre-trained on a corpus that included published enforcement notices, historical STOR cases (anonymized), and general financial communications.

Priya was candid with Sasha about the limitations of Layer 3. The model was not infallible, and its outputs were classification scores rather than determinations. A message classified as "high risk" by the model required human review; the model's role was to prioritize and contextualize, not to replace analyst judgment. She also emphasized that the model's training data reflected past manipulation patterns; genuinely novel manipulation schemes would likely evade it. Regular retraining — Priya recommended quarterly — was essential to maintain detection quality.
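
Meridian's production model was a fine-tuned NLP classifier whose details are not reproduced here; as a stand-in, the sketch below shows the scoring interface with a simple TF-IDF and logistic-regression baseline, with placeholder training data standing in for labelled enforcement and STOR corpora.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def train_risk_scorer(texts: list[str], labels: list[int]):
    """labels: 1 = suspicious, 0 = benign. Both classes must be present."""
    model = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2)),
        LogisticRegression(max_iter=1000),
    )
    model.fit(texts, labels)
    return model

def risk_score(model, message_text: str) -> float:
    """A classification score in [0, 1] — a prioritisation signal for
    human review, not a determination of abuse."""
    return float(model.predict_proba([message_text])[0, 1])
```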


Phase 3: The False Positive Problem — Legitimate Business Jargon

The most challenging element of the implementation, in Priya's experience, was managing false positives generated by legitimate financial services jargon. This was not merely an efficiency problem; it was a cultural and legal problem. Compliance analysts reviewing large volumes of flagged communications about entirely normal trading activity would develop alert fatigue, reducing the quality of their review of genuinely suspicious communications. And a firm that filed STORs based on communications that turned out to be innocent — or that generated a pattern of complaints from traders about intrusive monitoring of legitimate activities — would face both reputational and legal risks.

Priya conducted a calibration exercise in the third week of the engagement, running a prototype version of the keyword library against two months of historical Bloomberg Chat messages across all desks. The raw output was striking: the uncalibrated library generated 847 alerts per day. Investigation of a 5% sample revealed that the vast majority were false positives driven by the following categories:

Category 1: Directional language used in legitimate context. Phrases such as "we need this to go lower" or "I'm trying to push the price down" appeared frequently in the context of traders negotiating better execution prices for client orders — a perfectly normal activity. The library was firing on surface-level language without understanding the counterparty and transactional context.

Priya's solution: Add counterparty context filters. Communications between Meridian traders and registered, FCA-authorized counterparties (brokers, prime brokers, market makers) in the context of execution-related conversations were subject to a lower base-rate prior for manipulation than communications with unregistered counterparties or communications in channels not typically used for order execution.

Category 2: Information sharing that resembles tipping but is legitimate research. Sell-side analysts sharing views on issuers — "we think XYZ is going to miss consensus by 15-20%; not our main call but worth watching" — triggered terms related to inside information disclosure. In every instance reviewed, the information was clearly analyst opinion based on publicly available data.

Priya's solution: Implement a "registered analyst" whitelist, so communications from approved research provider contacts in research-designated channels (Bloomberg {MSG} research context) were scored differently from identical-content communications in order-execution channels.

Category 3: Coordination language in legitimate index rebalancing context. The passive funds desk generated substantial communication around index rebalancing events, using language such as "we all need to be on the same side for the rebalance" and "we should coordinate timing on the index roll." This language sounded alarming in isolation; in context, it was a description of legitimate pre-announced, transparent index tracking activity.

Priya's solution: Create a contextual flag for known rebalancing windows (based on index provider published schedules) that applied reduced sensitivity to coordination language during these periods, with the logic documented in the system's configuration for audit purposes.
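
Taken together, the three solutions amount to context-dependent score adjustments applied before an alert is raised. A sketch of how they might combine — the whitelists, schedule, weights, and message fields are all illustrative placeholders:

```python
from datetime import date

# Hypothetical reference data; in production these would be maintained against
# FCA registers, approved research contacts, and index provider schedules.
AUTHORISED_COUNTERPARTIES = {"BROKER_A", "PRIME_B"}
REGISTERED_ANALYSTS = {"analyst@research.example"}
REBALANCE_WINDOWS = [(date(2024, 3, 14), date(2024, 3, 15))]

def contextual_weight(msg: dict) -> float:
    """Multiplier on a Tier 2 hit's base score (assumed fields: counterparty,
    sender, channel_type, date). Below an alerting threshold, the hit is
    logged for audit purposes but not surfaced to an analyst."""
    weight = 1.0
    if msg.get("counterparty") in AUTHORISED_COUNTERPARTIES and msg.get("channel_type") == "execution":
        weight *= 0.4  # Category 1: execution talk with authorised counterparties
    if msg.get("sender") in REGISTERED_ANALYSTS and msg.get("channel_type") == "research":
        weight *= 0.3  # Category 2: approved research contacts in research channels
    if any(start <= msg["date"] <= end for start, end in REBALANCE_WINDOWS):
        weight *= 0.5  # Category 3: reduced sensitivity in known rebalancing windows
    return weight
```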

After three iterations of calibration refinement, the alert volume dropped from 847 per day to 42 per day — a reduction of approximately 95% — while, based on Priya's assessment of the cases reviewed, not materially reducing coverage of genuinely suspicious communications. The 42 daily alerts comprised approximately 28 Tier 1 keyword alerts, 12 Tier 2 alerts after contextual filtering, and one or two behavioral pattern alerts, consistent with the behavioral layer's expected 5-10 per week.


Phase 4: Integration with Trade Surveillance

A critical design decision in the architecture was the integration between the communications surveillance output and the existing trade surveillance system. Priya had seen, in previous engagements, implementations where communications surveillance and trade surveillance operated in entirely parallel silos — each generating its own alerts, each reviewed by different analysts, with no systematic mechanism to correlate a flagged communication with a flagged trading pattern.

At Meridian, Priya designed an integration layer that linked the two systems through the trader identifier and timestamp. When the trade surveillance system generated an alert for a specific trader and instrument in a specific time window, the communications surveillance system automatically retrieved and surfaced any communications from that trader in a configurable window (defaulting to one hour before and after the trading event). Conversely, when a high-risk communication alert was generated, the system pulled the flagged trader's order and execution activity for the surrounding window.
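
A minimal sketch of the correlation logic, assuming a queryable message archive whose records carry trader identifiers and timestamps (the field names and the in-memory list are simplifications):

```python
from datetime import datetime, timedelta

def correlated_comms(comms_store: list[dict], trader_id: str,
                     event_time: datetime,
                     window: timedelta = timedelta(hours=1)) -> list[dict]:
    """Given a trade surveillance alert for trader_id at event_time, return
    that trader's messages within the configurable window (default ±1 hour)."""
    lo, hi = event_time - window, event_time + window
    return [m for m in comms_store
            if m["trader_id"] == trader_id and lo <= m["timestamp"] <= hi]
```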

This integration was, in Priya's view, the most valuable capability improvement in the entire program. Manipulation cases — particularly LIBOR-style benchmark manipulation and pump-and-dump schemes — are almost always visible in both order flow and communications. A surveillance system that requires a human analyst to manually cross-reference the two streams is slower, more error-prone, and less likely to produce the kind of integrated evidentiary picture that supports a quality STOR.

Rafael Torres, reviewing Priya's architecture during a joint call in week five, observed that this integration was precisely what had been missing from the trade surveillance system he had helped design at a previous broker-dealer client. "We had excellent order analytics," he said, "but when we needed to investigate a potential coordination case, the communications pull was a manual process that took days. By the time we had the full picture, the FCA had already received a STOR from another firm and was three steps ahead of us."


Phase 5: Alert Fatigue — The Human Side of the Problem

Four weeks into the pilot, with the calibrated system running against live communications, Sasha raised a concern that Priya recognized immediately from previous engagements: analyst fatigue.

Meridian's compliance team comprised four analysts with market surveillance responsibilities. The daily alert volume of 42 was, in theory, manageable — approximately ten alerts per analyst per day. In practice, the alerts were unevenly distributed by time of day (the FX desk, which was most active in London morning hours, generated a disproportionate share of communications alerts during the first two hours of the session), and each alert required not just a content review but a contextual review incorporating trade data, counterparty information, and the trader's communication history.

By week four, Priya observed that two analysts had developed a habit of resolving Tier 2 alerts within two to three minutes — a timeline that did not allow for meaningful contextual review. When she discussed this with Sasha, the analysts explained that the alerts felt routine and that the pressure to maintain a clear queue was leading them to make quick dispositions rather than thorough ones.

Priya's response was threefold:

Alert restructuring. She proposed reducing the Tier 2 daily alert target further, from 12 to 6, by tightening the contextual filters. At Meridian's scale, the incremental detection risk of the narrower net was low — the firm's trading footprint did not generate the kind of high-frequency, high-volume activity that requires broad-net detection — while the benefit of giving analysts more time per alert was higher-quality review of the alerts that were generated.

Rotation and specialization. Priya recommended that alert review responsibilities be rotated on a monthly basis, but that each analyst develop specialist knowledge of a particular desk's communication style and trading context. An analyst familiar with the linguistic patterns of the FX desk would be better positioned to distinguish genuine coordination language from legitimate execution discussion than a generalist reviewing FX alerts for the first time.

Documented disposition standards. Each alert disposition — whether closed without further action or escalated — was required to include a minimum of three substantive observations about the communication content, the trade context, and the reason for the disposition. This created a written record that both supported regulatory defensibility and provided a quality control mechanism: a supervisor reviewing disposition records could identify whether analysts were engaging substantively with each alert or processing them superficially.
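
Expressed as a validation rule, the disposition standard might look like the sketch below; the record shape and the crude word-count test for "substantive" are assumptions for illustration.

```python
from dataclasses import dataclass, field

MIN_OBSERVATIONS = 3  # content, trade context, and rationale at minimum

@dataclass
class Disposition:
    alert_id: str
    analyst: str
    outcome: str                          # "closed" or "escalated"
    observations: list[str] = field(default_factory=list)

    def validate(self) -> None:
        """Reject dispositions that lack three substantive observations."""
        substantive = [o for o in self.observations if len(o.split()) >= 5]
        if len(substantive) < MIN_OBSERVATIONS:
            raise ValueError(
                f"Disposition for alert {self.alert_id} must record at least "
                f"{MIN_OBSERVATIONS} substantive observations before it can be saved."
            )
```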


Phase 6: Outcomes and Regulatory Response

Six months after go-live, Meridian's communications surveillance program had produced three STOR-supporting cases — cases where the communications evidence was material to the decision to file a STOR with the FCA. In two cases, the communications were the primary trigger: the trade surveillance system had generated only a low-severity alert, but communications review surfaced direct evidence of information sharing that established reasonable grounds for suspicion of insider dealing. In the third case, the communications corroborated a trade surveillance alert by revealing coordination language consistent with a marking-the-close pattern.

The FCA's follow-up response to the September Supervisory Review letter — provided after Meridian submitted a detailed description of the new program — was positive. The FCA noted that the program's architecture, calibration methodology, and integration with trade surveillance reflected a thoughtful and proportionate approach to the communications monitoring obligation. The FCA specifically commended the documentation of alert thresholds and calibration decisions, which it described as providing good regulatory transparency.

Priya presented the engagement results to a wider audience at an industry compliance conference in London the following spring. Her key message to the audience of compliance officers from comparable-sized firms: "The right question is not 'how do we monitor everything?' The answer to that question is always 'you can't, and you'll destroy your compliance team trying.' The right question is 'how do we build a system that finds the genuine cases reliably enough to meet our regulatory obligation and defensible enough to demonstrate that obligation is being met?' Those are different engineering problems, and the second one is much more achievable."


Discussion Questions

  1. Priya's calibration exercise reduced alert volume from 847 per day to 42 per day — a 95% reduction. From a regulatory perspective, what documentation would Meridian need to maintain to demonstrate that this reduction did not reduce the program's effectiveness at detecting genuine market abuse?

  2. The WhatsApp finding — traders using personal devices for business-related communications — is a common issue at investment firms. Meridian opened a separate workstream to address it. What are the firm's legal options for addressing this gap? What are the practical challenges of each option?

  3. Rafael Torres observed that the cross-system integration — linking communications alerts to trade data — was the most valuable capability in the program. Why is integrated, cross-stream surveillance analytically superior to siloed surveillance? What are the technical and organizational challenges of achieving this integration?

  4. The false positive categories identified in the calibration exercise — directional language in legitimate context, analyst research communications, index rebalancing coordination — all reflect the difficulty of applying keyword-based rules to the rich context of financial communication. What are the limitations of keyword-based surveillance that make it necessary to layer behavioral and ML-based approaches on top of it?

  5. Alert fatigue is described as a human problem as much as a technology problem. Priya's response involved alert restructuring, rotation and specialization, and documented disposition standards. Which of these interventions do you consider most important for regulatory defensibility? Which do you consider most important for detection effectiveness? Are they the same?

  6. The program produced three STOR-supporting cases in its first six months. How would you assess whether this number represents program success or program underperformance? What benchmarks or reference points would you use in that assessment?