Case Study 1: Meridian's Shadow Model Problem
Background
Three months after the Model Risk Committee approved the fraud detection model — the approval that followed Rafael Torres's successful presentation of SHAP-based explainability — a different kind of problem surfaced at Meridian Capital.
The catalyst was a routine model inventory audit. Rafael had commissioned it partly in response to Dr. Marchetti's challenge, and partly because his own experience with the fraud model had made him genuinely uncertain about how well the firm actually understood its model landscape. He hired an external quantitative governance consultancy, Redpoint Advisory, to conduct a thirty-day inventory of all quantitative systems across Meridian's front and middle office.
Redpoint's lead consultant, Hannah Beech, was methodical and persistent. She interviewed forty-three people across eight business units. She reviewed system architecture diagrams, data flow documentation, and every tool that any team described as "automated." She asked, repeatedly, a version of the same question: what quantitative inputs does this system take, and what quantitative outputs does it produce that someone uses to make a decision?
On day nineteen, she sat down with Marcus Webb, a quantitative strategist on the equities desk. Marcus had been at Meridian for six years. He was thoughtful, technically sophisticated, and genuinely confused about why Hannah was asking him about a spreadsheet he had built to help the desk manage their algorithmic trading parameters.
"It's just a tool," he said. "I use it every morning. It takes the overnight volatility signals, the beta estimates, and the execution cost model outputs, and gives me adjusted weights for the three algos we run."
Hannah asked to see it.
The spreadsheet had forty-three parameters organized across six worksheets. The core logic was a set of nested IF/THEN conditions that translated input signal values into parameter adjustments — adjustments that directly modified the bid-offer aggression, the order slice timing, and the risk exposure limits for Meridian's three equities trading algorithms. Marcus had been running it every morning for eighteen months. During that time, the spreadsheet had influenced approximately 340 trading days' worth of algorithmic parameter settings.
"When was this validated?" Hannah asked.
Marcus looked at her. "Validated?"
The Question: Is This a Model?
The answer to this question is not merely semantic. It determines whether Meridian has an undisclosed governance gap of eighteen months and — more seriously — whether the desk has been operating with an unvalidated quantitative system that has been influencing live trading positions.
SR 11-7 defines a model as "a quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates." The definition is deliberately broad. The Federal Reserve and OCC have been explicit in examinations and supervisory letters that the definition encompasses not just econometric models and machine learning systems but also analytical tools embedded in spreadsheets, pricing models, and vendor-supplied systems.
Marcus Webb's spreadsheet meets every element of the definition. It applies mathematical logic (the IF/THEN parameter mapping constitutes a functional transformation). It processes input data (volatility signals, beta estimates, execution cost outputs). It produces quantitative estimates (adjusted algorithm parameters) that are used in trading decisions. The fact that it is implemented in Excel rather than Python, that it was built by Marcus himself rather than by a dedicated quant team, and that he thinks of it as a "tool" rather than a "model" does not change its functional character.
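To make the point concrete, the kind of logic the spreadsheet contained might look like the following sketch. This is a hypothetical reconstruction for illustration only: every threshold, parameter name, and adjustment value here is invented, not taken from Meridian's actual tool.

```python
# Hypothetical sketch of an IF/THEN parameter-mapping tool of the kind
# described in the case study. All names and numbers are illustrative.

def adjust_algo_params(vol_signal: float, beta_est: float,
                       exec_cost: float) -> dict:
    """Map overnight signals to algorithm parameter adjustments.

    Each branch is a deterministic mathematical transformation of
    numeric inputs into numeric outputs used in trading decisions,
    which is exactly what brings a tool within the SR 11-7 definition.
    """
    if vol_signal > 0.30:                     # elevated volatility regime
        aggression = 0.5 if exec_cost > 0.002 else 0.7
        slice_secs = 120
    elif vol_signal > 0.15:                   # moderate regime
        aggression = 0.8
        slice_secs = 60
    else:                                     # calm regime
        aggression = 1.0
        slice_secs = 30
    # Beta scales the per-algo risk exposure limit
    risk_limit = round(1_000_000 / max(beta_est, 0.5))
    return {"aggression": aggression,
            "slice_secs": slice_secs,
            "risk_limit": risk_limit}
```

Nothing in this logic is statistically exotic, yet a wrong threshold or a wrong input would silently change live trading parameters every morning, which is precisely the risk model validation exists to catch.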
Hannah called Rafael the same afternoon.
What Happened Next
Rafael immediately convened a meeting with Marcus, Marcus's desk head (David Liang), the Head of Model Risk (Victoria Ashworth), and Dr. Marchetti. He described the situation plainly: Meridian had an unregistered, unvalidated quantitative model that had been influencing equities trading parameters for eighteen months. The model's creator had not known it was a model. His desk head had not known it needed to be registered. No one had done anything wrong in the ordinary sense — but the governance gap was real and material.
The meeting produced three immediate decisions. First, Marcus's spreadsheet — now officially designated Model MDL-007, "Equities Parameter Adjustment Tool v1.0" — was registered in the model inventory with the designation PENDING VALIDATION and a notation that it had been in production use since April of the preceding year without validation. Second, David Liang committed that the equities desk would not modify the spreadsheet's logic until validation was complete. Modifying an unvalidated model during the validation process would have compounded the problem. Third, Victoria Ashworth's model risk team was assigned to conduct emergency retrospective validation within forty-five days.
The retrospective validation was the harder problem. Normally, validation occurs before production deployment. A retrospective validation must assess whether a model that has already been making decisions was actually sound — a question with different implications depending on what the validation finds. If the model is sound, the gap is a process failure that can be remediated. If the model is not sound, the firm has a potentially material incident: eighteen months of trading decisions influenced by a flawed quantitative tool.
Victoria Ashworth's team worked through the spreadsheet over three weeks. On day twenty-two, they found an error.
The error was in the beta estimation logic on the third worksheet. A formula that was intended to compute a rolling 60-day beta using the most recent two months of data was, due to an off-by-one error in the date range reference (the reference anchored the lookup window one data block, roughly two months, too early), computing a 60-day beta using data from two months prior. The resulting two-month lag was neither intentional nor documented. During normal market conditions, the error would have had minimal practical effect: beta estimates do not change dramatically month to month. But during two periods of elevated volatility in the preceding eighteen months — one in March and one in October of the prior year — the lagged beta estimate had differed meaningfully from the contemporaneous estimate, and the parameter adjustments derived from it had been suboptimal as a result.
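The class of error the validation team found can be shown with a simplified rolling-window calculation. This sketch does not reproduce the spreadsheet's actual layout; it assumes, purely for illustration, a daily series and a block size of 42 trading days (roughly two months).

```python
# Simplified illustration of an off-by-one window-anchoring bug of the
# kind described above. Data layout and block size are hypothetical.

BLOCK = 42  # trading days in roughly two months (illustrative)

def rolling_window_correct(series: list, window: int = 60) -> list:
    """Intended behaviour: the window ends at the latest observation."""
    return series[-window:]

def rolling_window_buggy(series: list, window: int = 60) -> list:
    """Buggy behaviour: the end index is anchored one block too early,
    so the 60-day window silently uses data from two months prior."""
    end = len(series) - 1 * BLOCK   # should be len(series) - 0 * BLOCK
    return series[end - window:end]
```

Both functions return a 60-element window, so nothing breaks visibly; only the window's position differs. That is what made the error survive eighteen months: in calm markets the lagged window and the current window produce nearly identical beta estimates, and the discrepancy only becomes material when conditions change quickly.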
The trading impact was quantified at approximately £340,000 in additional execution costs and slippage over the two affected periods — not a material loss by Meridian's standards, but not trivial, and significant as evidence of what unvalidated models can do.
The Cultural Change
Rafael spent more time on the aftermath of MDL-007 than on the formal remediation. The formal remediation was straightforward: fix the formula, re-validate with the correction, establish going-forward monitoring, register the model properly, and add the incident to the quarterly model risk report. The harder work was cultural.
Marcus Webb had not been negligent. He had built a tool to help him do his job, using techniques he knew, and it had worked well enough that he had continued using it. He had no idea that the regulatory framework required him to register it and subject it to independent validation. David Liang had oversight responsibility but had similarly not recognized the tool as a model in the governance sense. The desk had no process for assessing whether new analytical tools crossed the threshold into "model" territory.
Meridian's model governance policy, as written, required developers to register models before deployment. The policy was not wrong. But it assumed that the people building models would recognize that what they were building was a model, and that assumption had failed.
Rafael worked with Victoria Ashworth to implement three changes. First, a decision tree was added to the model governance intranet page: a five-question flowchart that any Meridian employee could use to assess whether a tool they were building or using was likely to qualify as a model under SR 11-7. The first question was: "Does this tool take numerical inputs and produce numerical outputs that inform a business decision?" If yes, continue. The final question was: "If this tool produced a wrong answer for 30 consecutive days before anyone noticed, what would the consequence be?" The tool was deliberately accessible: written for non-specialists, not just for quantitative professionals.
Second, a model pre-registration requirement was added to the technology change management process: any request to deploy a new analytical tool to a production environment required a model pre-assessment form to be completed before IT would execute the deployment. This created a checkpoint at the moment of deployment rather than relying on developers to self-identify after the fact.
Third, training was delivered to all front-office quant teams on what constitutes a model under SR 11-7, with specific examples drawn from Meridian's own context — including, with Marcus Webb's consent, a sanitized version of the MDL-007 story. Marcus participated in two of the training sessions and spoke directly about his experience. His presence turned what could have been a cautionary tale about rule-breaking into an honest account of a knowledge gap — and that framing made the training more effective than any policy document would have been.
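A flowchart like the one in the first change can be encoded so that the pre-registration checkpoint in the second change applies it mechanically. The sketch below is a minimal, assumed encoding: only the first and last questions come from the case study, and the intermediate screening questions are hypothetical placeholders, since the text does not specify them.

```python
# Minimal sketch of a model pre-assessment triage. Only the first and
# last questions are from the case study; the intermediate screens
# (Q2-Q4) are hypothetical placeholders.

def model_pre_assessment(answers: dict) -> str:
    """Return a triage outcome from yes/no answers to the flowchart."""
    # Q1 (from the text): numerical inputs producing numerical outputs
    # that inform a business decision?
    if not answers.get("numeric_in_out_informs_decision", False):
        return "not a model: no assessment needed"
    # Q2-Q4: hypothetical intermediate screens, e.g. embedded
    # assumptions, production use, output feeding trading activity.
    intermediate = ("embeds_assumptions", "used_in_production",
                    "output_feeds_trading")
    if not any(answers.get(q, False) for q in intermediate):
        return "likely a tool: document and monitor"
    # Q5 (from the text): material consequence if the tool produced a
    # wrong answer for 30 consecutive days before anyone noticed?
    if answers.get("wrong_30_days_material_consequence", False):
        return "likely a model: register for validation"
    return "borderline: refer to model risk team"
```

Applied retrospectively, Marcus Webb's spreadsheet would answer yes to every question and land squarely in the "register for validation" outcome, which is the point of putting the check at the deployment gate rather than relying on self-identification.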
Discussion Questions
1. Marcus Webb did not consider his spreadsheet a "model" and had no awareness that it required registration and validation. What organizational mechanisms could have caught the MDL-007 situation earlier, before eighteen months of unvalidated operation? Identify at least three preventive controls.
2. SR 11-7 applies to "vendor models" as well as internally developed models — meaning the firm using a vendor model is responsible for validating and governing it, even if the vendor does not disclose the methodology. How does this create particular challenges for smaller financial institutions that depend heavily on third-party scoring tools and risk platforms? What steps can such firms take?
3. The MDL-007 validation found a formula error that caused approximately £340,000 in additional execution costs. How should this finding be treated from a regulatory reporting perspective? Does it require disclosure to the FCA? Does it affect the firm's capital calculations? What factors determine whether a model error becomes a reportable incident?
4. Rafael and Victoria Ashworth implemented a decision tree and a pre-registration checkpoint to prevent future shadow models. Evaluate these controls: are they sufficient? What could circumvent them? What additional controls might be worth considering?
5. The MDL-007 situation involved a model built by a skilled individual contributor who did not recognize the governance implications of what he had built. How should a firm balance the goals of empowering technically skilled employees to build useful tools quickly and ensuring robust model governance oversight? Is there a tension between innovation speed and governance discipline, or can these goals be aligned?