Chapter 27: Case Study 2 — Cornerstone's Concentration Problem: Three Critical Systems, One Cloud Provider

Background

Cornerstone Financial Group is a mid-tier UK investment bank with operations across London, Frankfurt, and Singapore. It has approximately £18 billion in total assets, a trading book, a significant custody business, and a retail lending subsidiary. Its compliance function is structured around three pillars: financial crime, market conduct, and operational risk. The Head of Operational Risk Compliance, Daniel Reeves, has responsibility for third-party risk management and operational resilience.

Cornerstone began its cloud migration program in 2019, initially with internal productivity tools and collaboration software, and progressively with more critical workloads. By 2024, Cornerstone operated a hybrid cloud estate: some critical systems remained on-premise in its own data centers, while others had migrated to cloud. The migrations had been managed project by project, approved by IT governance committees, with compliance sign-off on individual migrations but no overarching view of the cloud estate as a whole.

In early 2025, as DORA began to apply, Cornerstone initiated a DORA compliance assessment. One of the first exercises was building the ICT third-party register required by DORA Article 28, a comprehensive inventory of all ICT third-party provider relationships. It was while building this register that Daniel's team made an uncomfortable discovery.

The Discovery

Across Cornerstone's hybrid estate, nine systems had been classified as critical ICT systems, meaning their failure would directly impair an Important Business Service. The nine were trade settlement, custody accounting, regulatory reporting, AML monitoring, market surveillance, client portal access, collateral management, treasury operations, and sanctions screening.

When Daniel's team mapped each critical system to its cloud provider, the result was stark: seven of the nine critical systems were hosted on a single cloud provider. Two systems — the legacy custody accounting platform and the on-premise regulatory reporting system — remained in Cornerstone's own data centers. Every other critical system was on the same hyperscale cloud provider.
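The mapping exercise amounts to a tally over the register. A minimal sketch in Python, assuming the register is held as a plain dictionary; the hosting labels `primary_cloud` and `on_premise` and the function name `hosting_shares` are illustrative and are not taken from Cornerstone's actual register:

```python
from collections import Counter

# The nine critical systems from the case study, mapped to where each is hosted.
# The labels "primary_cloud" and "on_premise" are illustrative placeholders.
CRITICAL_SYSTEMS = {
    "trade settlement": "primary_cloud",
    "custody accounting": "on_premise",
    "regulatory reporting": "on_premise",
    "AML monitoring": "primary_cloud",
    "market surveillance": "primary_cloud",
    "client portal access": "primary_cloud",
    "collateral management": "primary_cloud",
    "treasury operations": "primary_cloud",
    "sanctions screening": "primary_cloud",
}


def hosting_shares(register):
    """Return each hosting location's share of the systems in the register."""
    counts = Counter(register.values())
    total = len(register)
    return {host: count / total for host, count in counts.items()}


shares = hosting_shares(CRITICAL_SYSTEMS)
# Print the largest share first.
for host, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{host}: {share:.0%}")
```

Run against the nine systems above, the tally reports roughly 78% on the primary cloud provider and 22% on-premise, which is the 7:2 split the team found.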

The concentration had not been a deliberate decision. It was the accumulated result of seven separate project decisions, each of which had independently concluded that the same provider offered the best combination of price, features, and existing enterprise relationships. No one had maintained a system-level view of cloud provider concentration during the migration program.

The Regulatory Conversation

Cornerstone's lead supervisor at the PRA, as part of its DORA implementation review, asked to see Cornerstone's ICT third-party register and its concentration risk assessment.

Daniel submitted the ICT third-party register, which was comprehensive and well-organized. It clearly showed the 7:2 concentration on a single cloud provider.

The supervisor then asked to see the concentration risk assessment.

Cornerstone did not have one.

The supervisor's written follow-up was direct: under DORA Article 29, financial entities are required to identify, assess, and manage concentration risk arising from ICT third-party arrangements. Cornerstone's ICT third-party register had made the concentration visible, but the absence of an accompanying risk assessment meant that Cornerstone could not demonstrate that it had actively understood or managed the risk. The supervisor asked Cornerstone to submit a concentration risk assessment within sixty days.

Building the Concentration Risk Assessment

Daniel convened a cross-functional working group including IT, compliance, operational risk, and finance to build Cornerstone's first formal concentration risk assessment. The assessment addressed five areas.

Failure Scenario Analysis. The working group modeled what would happen to Cornerstone's operations under three scenarios: a regional cloud outage affecting the primary region for ninety minutes; a multi-region cloud outage affecting all of the provider's EU regions for four hours; and a prolonged multi-region outage lasting twenty-four hours. For each scenario, the working group assessed the operational impact, the regulatory impact (which important business services would be impaired, for how long), and the manual fallback capacity. The results were sobering. A four-hour multi-region outage would simultaneously impair AML monitoring, market surveillance, sanctions screening, trade settlement, and the client portal. Manual fallback capacity for some of these functions was limited: AML monitoring and sanctions screening had manual fallback procedures that could operate at roughly twenty-five percent of normal transaction volume, which would require transaction processing to be significantly curtailed. Trade settlement had no meaningful manual fallback.
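One way to make the fallback figures concrete is to convert each scenario into hours of unprocessed volume. The sketch below is illustrative only. It assumes volume arrives at a constant rate and that each system is unavailable for the full outage in every scenario. It covers only the three systems whose fallback capacity the assessment states, and the function name `backlog_hours` is a hypothetical helper.

```python
# Outage scenarios from the assessment, with durations in hours.
SCENARIOS = {
    "regional outage": 1.5,
    "multi-region outage": 4.0,
    "prolonged multi-region outage": 24.0,
}

# Manual fallback capacity as a fraction of normal volume, where the case
# study states it. Systems with no stated fallback figure are left out.
FALLBACK_CAPACITY = {
    "AML monitoring": 0.25,
    "sanctions screening": 0.25,
    "trade settlement": 0.0,  # no meaningful manual fallback
}


def backlog_hours(outage_hours, fallback_fraction):
    """Hours of normal-rate volume left unprocessed by the end of the outage.

    Assumes a constant arrival rate, which is a simplification.
    """
    return outage_hours * (1.0 - fallback_fraction)


for scenario, hours in SCENARIOS.items():
    for system, capacity in FALLBACK_CAPACITY.items():
        backlog = backlog_hours(hours, capacity)
        print(f"{scenario:30s} {system:20s} backlog = {backlog:5.2f} h")
```

Under these assumptions, a four-hour outage leaves three hours of AML monitoring volume unprocessed, and a twenty-four-hour outage leaves a full day of trade settlement volume outstanding.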

Recovery Assessment. For each critical system hosted on the provider, the working group assessed Cornerstone's recovery options in the event of an extended outage. For six of the seven systems, recovery was dependent on the provider restoring service — Cornerstone had not built alternative cloud deployments or maintained warm-standby arrangements on any other provider. For one system (the client portal), a failover to an Azure-hosted backup was available, but this had not been tested in eighteen months.

Contractual Review. The working group reviewed the contractual arrangements with the cloud provider for each critical system. They found that two of the seven systems had been migrated before the current enterprise agreement, and those arrangements were based on older terms that did not include the DORA Article 30 provisions. The enterprise agreement for the five more recently migrated systems did include the relevant addenda. This finding added a contractual remediation workstream to the concentration risk program.

Dependency Mapping. The working group mapped the dependencies between the seven cloud-hosted critical systems. They found that two systems, AML monitoring and the client portal, were technically dependent on the customer identity and access management (CIAM) system, which was also cloud-hosted. An outage of CIAM would cascade to impair AML monitoring and the client portal simultaneously. This dependency had not been documented in the operational resilience self-assessment, which had treated each system as independent.
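A cascade of this kind can be found mechanically once dependencies are recorded as a graph. A minimal sketch, assuming dependencies are stored as a dictionary from each system to the systems that rely on it; only the CIAM dependency reported in the case study is included, and the function name `cascade` is illustrative:

```python
from collections import deque

# Edges point from a system to the systems that depend on it.
# Only the dependency reported in the case study is included.
DEPENDENTS = {
    "CIAM": ["AML monitoring", "client portal"],
}


def cascade(failed, dependents):
    """Return every system impaired, directly or transitively, when `failed` goes down."""
    impaired = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for downstream in dependents.get(current, []):
            if downstream not in impaired:
                impaired.add(downstream)
                queue.append(downstream)
    return impaired


print(sorted(cascade("CIAM", DEPENDENTS)))
# prints ['AML monitoring', 'client portal']
```

Treating each system as independent is equivalent to running this with an empty graph, which is why the self-assessment missed the shared failure.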

Cost-Benefit of Diversification. The working group considered whether Cornerstone should migrate some critical systems to a second cloud provider to reduce concentration. The analysis was nuanced. The direct cost of dual-provider cloud architecture — licensing, integration engineering, operational overhead — was estimated at £2.8 million over three years. The indirect cost of maintaining expertise in two cloud environments was harder to quantify but real. Against this, the working group modeled the probability of a prolonged multi-region outage affecting all EU regions of a major cloud provider and the associated regulatory and commercial costs. They concluded that wholesale diversification was not cost-justified given the relatively low probability of multi-region failure for a major provider — but that a targeted approach was warranted.
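The shape of this trade-off can be illustrated with a break-even calculation. The case study gives the three-year cost but not the outage probability or loss figures the working group used, so the loss values below are hypothetical and chosen only to show how the break-even point moves. The sketch also assumes that diversification would fully remove the loss and that at most one such outage occurs per year.

```python
DIVERSIFICATION_COST_GBP = 2_800_000  # three-year estimate from the case study
HORIZON_YEARS = 3


def break_even_annual_probability(loss_per_outage_gbp,
                                  cost_gbp=DIVERSIFICATION_COST_GBP,
                                  years=HORIZON_YEARS):
    """Annual outage probability at which expected avoided loss equals the cost.

    Solves years * p * loss = cost for p. Assumes diversification removes
    the loss entirely, which is a simplification.
    """
    return cost_gbp / (years * loss_per_outage_gbp)


# HYPOTHETICAL loss figures, not taken from the case study.
for loss in (10_000_000, 50_000_000, 250_000_000):
    p = break_even_annual_probability(loss)
    print(f"loss GBP {loss:>11,}: break-even annual probability = {p:.2%}")
```

If the estimated annual probability of a prolonged multi-region outage sits below the break-even value for a plausible loss, full diversification is hard to justify on expected cost alone. An expected-value view of this kind understates low-probability, high-impact events, so it is best read as one input to the judgment, not the whole of it.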

The Solution: Targeted Resilience, Not Mandatory Diversification

Cornerstone's concentration risk assessment concluded with a set of recommendations that the PRA accepted as a proportionate response.

The first recommendation was to establish warm-standby capacity for the two highest-criticality systems, trade settlement and regulatory reporting, on a second cloud provider. This did not require full migration: it required maintaining a configured, tested standby environment that could be activated within the recovery time objective (RTO). The cost was significantly less than full multi-cloud operation.

The second recommendation was to implement and test manual fallback procedures for AML monitoring and sanctions screening that could sustain the full transaction volume — not twenty-five percent — for up to twenty-four hours. This was achievable through operational process redesign and additional staffing protocols rather than technology change.

The third recommendation was to remediate the dependency on CIAM by deploying a resilient, multi-instance CIAM configuration that could survive a partial cloud failure without cascading to AML monitoring and the client portal.

The fourth recommendation was to remediate the contractual gaps in the two legacy system arrangements and ensure that all critical systems were covered by DORA-compliant contract terms.

The fifth recommendation was to integrate cloud provider concentration metrics into Cornerstone's Key Risk Indicators dashboard, reviewed monthly by the operational risk committee and quarterly by the board risk committee. Concentration would no longer be a silent risk.
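A concentration metric for such a dashboard could be as simple as a Herfindahl-style index over hosting locations, mapped to a red/amber/green status. The sketch below is one possible form and not Cornerstone's actual indicator. The amber and red thresholds are hypothetical, and the function names are illustrative.

```python
def herfindahl_index(counts):
    """Sum of squared shares; 1.0 means every system sits with one host."""
    total = sum(counts.values())
    return sum((n / total) ** 2 for n in counts.values())


def kri_status(value, amber, red):
    """Map a metric value to a red/amber/green status."""
    if value >= red:
        return "red"
    if value >= amber:
        return "amber"
    return "green"


# Critical-system counts per hosting location, from the case study.
critical_system_counts = {"primary_cloud": 7, "on_premise": 2}

# HYPOTHETICAL thresholds for illustration; a firm would set its own.
AMBER, RED = 0.40, 0.60

hhi = herfindahl_index(critical_system_counts)
print(f"HHI = {hhi:.3f}, status = {kri_status(hhi, AMBER, RED)}")
# prints HHI = 0.654, status = red
```

Recomputing the index each time the register changes would show concentration drifting upward as individual migrations are approved.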

The PRA reviewed the assessment and the recommendations. Its response acknowledged that diversification was not always the correct answer: "Concentration risk does not mandate multi-cloud. It mandates understanding, and governance proportionate to the dependency." The PRA required Cornerstone to implement the first three recommendations within twelve months and submit progress updates quarterly.

The Broader Question: Is Multi-Cloud Always the Answer?

Cornerstone's case raises a question that many compliance and IT professionals face: if cloud concentration is a regulatory risk, does regulatory compliance require a multi-cloud strategy?

The answer, based on the regulatory frameworks reviewed in this chapter, is no — but it requires explanation.

DORA Article 29 requires concentration risk assessment and management, not concentration elimination. The EBA Outsourcing Guidelines and UK SS2/21 similarly require that concentration risk be understood and governed. None of these frameworks mandates multi-cloud architecture as such. What they require is that firms can demonstrate they have assessed the risk, implemented proportionate controls, and can operate within their impact tolerances even if their primary cloud provider experiences a significant outage.

For some firms and some workloads, multi-cloud is the right answer — particularly for critical systems where manual fallback is insufficient and RTO requirements are aggressive. For others, resilience within a single provider (multi-region deployment, robust failover testing, manual fallback procedures) combined with contractual protections may be sufficient. And for others, the cost of multi-cloud exceeds the risk reduction benefit, particularly for smaller firms with less complex operations.

The regulatory expectation is documented analysis, not a particular architecture. A firm that has rigorously assessed its concentration risk, modeled failure scenarios, tested its recovery capability, and concluded — on the basis of evidence — that single-provider cloud with specific resilience measures meets its regulatory obligations is in a very different position from a firm that has simply never thought about concentration. The former has exercised regulatory judgment. The latter has abdicated it.

Discussion Questions

1. Cornerstone's cloud concentration developed over five years through a series of individually reasonable project decisions, none of which considered the system-level consequence. Design a governance process that would have made cloud provider concentration visible and managed during an ongoing migration program. What information would need to be tracked, and who would be responsible for maintaining the system-level view?

2. The PRA stated that "Concentration risk does not mandate multi-cloud. It mandates understanding, and governance proportionate to the dependency." Evaluate this position. Are there levels of concentration at which the regulatory expectation shifts from "understand it" to "reduce it"? How would a firm know when it has crossed that threshold?

3. Cornerstone's failure scenario analysis found that a four-hour multi-region outage would impair five critical systems simultaneously. Consider the regulatory implications: DORA requires that major ICT incidents be reported to the competent authority. How would Cornerstone classify and report a cloud provider multi-region outage under DORA's incident reporting framework?

4. The concentration risk assessment found a cascade dependency: CIAM failure would impair both AML monitoring and the client portal. Dependency mapping had not been included in the operational resilience self-assessment, which had treated each system as independent. Why is cascade failure modeling particularly important for cloud-hosted systems? What features of cloud architecture make cascade failures more likely than in on-premise environments?

5. The cost-benefit analysis of full multi-cloud diversification produced a three-year cost estimate of £2.8 million. Critique the methodology of a cost-benefit analysis for concentration risk reduction. What costs and benefits are difficult to quantify, and how should the analysis handle the low-probability, high-impact nature of a multi-region cloud outage? What role should regulatory expectation play in the analysis when the regulator has not explicitly mandated diversification?