Case Study 35.2: TIBER-EU Bank Red Team Exercises and the Equation Group Tools Leak

DataField.Dev

Part 1: TIBER-EU — Red Teaming Financial Infrastructure

Background

In May 2018, the European Central Bank (ECB) published TIBER-EU (Threat Intelligence-Based Ethical Red Teaming), a framework for conducting controlled red team exercises against financial institutions across the European Union. TIBER-EU represented a watershed moment for red teaming: a standardized, regulator-backed framework under which the most critical financial institutions in Europe undergo realistic adversary simulation exercises. Adoption is decided by each national or European authority, which then determines which institutions fall in scope.

The framework was born from the recognition that traditional compliance-based security assessments -- checkbox audits, vulnerability scans, and constrained penetration tests -- were insufficient to evaluate the resilience of financial institutions against sophisticated threat actors. Nation-state groups, organized crime syndicates, and hacktivist organizations regularly target banks, payment processors, and financial market infrastructures. The 2016 Bangladesh Bank heist, in which attackers stole $81 million through the SWIFT network, demonstrated that even the most established financial institutions were vulnerable to determined adversaries.

The TIBER-EU Framework

TIBER-EU is structured around three phases, each involving specific roles and deliverables:

Phase 1: Preparation

The financial institution identifies its "critical functions" -- the services and systems whose disruption would pose systemic risk. A threat intelligence provider is contracted to produce a Targeted Threat Intelligence (TTI) report specific to the institution, identifying the most relevant threat actors, their TTPs, and the most likely attack scenarios.

The TTI report is not a generic threat assessment. It analyzes the specific institution's threat landscape based on its sector, geography, technology stack, and business relationships. The report identifies 3-5 realistic attack scenarios that a determined adversary might execute against the institution.

Phase 2: Testing

An independent red team provider (separate from the threat intelligence provider) uses the TTI report to design and execute a red team exercise. The exercise follows the attack scenarios identified in the TTI report, using realistic TTPs of the identified threat actors.

Key characteristics of TIBER-EU tests:

Live production environment: Tests are conducted against the institution's actual production systems, not lab replicas. This is essential because the goal is to test real defenses, real detection capabilities, and real response procedures.
Limited knowledge: Only a small "white team" (typically 2-4 people) at the institution knows about the exercise. The SOC, IT teams, and business units are unaware and respond as they would to a real attack.
Full scope: Testing may include social engineering, physical access attempts, and supply chain vectors, in addition to traditional cyber attacks.
Objective-based: The red team pursues specific objectives related to the institution's critical functions (e.g., access the payment processing system, demonstrate the ability to modify transaction records).
Duration: Typically 10-12 weeks of active testing.

Phase 3: Closure

After the exercise, a comprehensive debrief brings together the red team, blue team (now informed of the exercise), white team, threat intelligence provider, and regulatory authorities. A remediation plan is developed, and the institution commits to addressing identified weaknesses within defined timeframes.

The regulatory authority reviews the results and remediation plan. While TIBER-EU does not result in pass/fail compliance determinations, the findings inform the regulatory authority's assessment of the institution's resilience.
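The three phases and their participants can be summarized in a small data model. This is an illustrative sketch only: the class, field, and function names are hypothetical, and the participants and outputs are limited to those named in the description above rather than the framework's full document set.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    participants: tuple[str, ...]
    outputs: tuple[str, ...]


# Contents mirror the prose above; this is not the official TIBER-EU deliverable list.
PHASES = (
    Phase("Preparation",
          ("institution", "threat intelligence provider"),
          ("critical functions", "Targeted Threat Intelligence report")),
    Phase("Testing",
          ("red team provider", "white team"),
          ("executed attack scenarios",)),
    Phase("Closure",
          ("red team provider", "blue team", "white team",
           "threat intelligence provider", "regulatory authority"),
          ("debrief", "remediation plan")),
)


def phases_involving(role: str) -> list[str]:
    """Return the names of the phases in which a given role takes part."""
    return [phase.name for phase in PHASES if role in phase.participants]


print(phases_involving("blue team"))
```

Querying for the blue team returns only the Closure phase, which reflects the limited-knowledge design: defenders learn about the exercise only once testing has ended.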

National Implementations

Multiple European countries have adopted TIBER-EU as a national framework:

Netherlands (TIBER-NL): The first implementation, predating TIBER-EU, served as the model for the European framework
Germany (TIBER-DE): Implemented by BaFin and the Bundesbank
Ireland (TIBER-IE): Led by the Central Bank of Ireland
Denmark (TIBER-DK): Implemented by Danmarks Nationalbank, the Danish central bank
Belgium (TIBER-BE): Led by the National Bank of Belgium
United Kingdom (CBEST): Not a TIBER-EU implementation, but the Bank of England's own intelligence-led testing framework, which preceded and influenced TIBER-EU

Key Findings from TIBER-EU Exercises

While specific results are confidential, aggregate findings reported by regulatory authorities and industry participants reveal common themes:

Social engineering remains effective. Despite investment in security awareness training, phishing and social engineering consistently succeed as initial access vectors in TIBER-EU exercises. Financial institution employees, including those in security roles, fall for well-crafted social engineering campaigns.

Detection gaps in lateral movement. Many institutions detect initial access attempts (phishing emails, exploitation attempts) but struggle to detect lateral movement within the internal network. Once a red team achieves a foothold, the path to critical functions is often unobstructed.
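One simple heuristic for the lateral movement gap is to look for a single source host that suddenly authenticates to many distinct internal hosts. The sketch below assumes a hypothetical, simplified log format of (source, destination) logon pairs; the function name and the threshold are illustrative, and a real deployment would baseline normal behavior per host rather than use a fixed number.

```python
from collections import defaultdict


def flag_fan_out(auth_events, threshold=3):
    """Return source hosts that logged on to more than `threshold` distinct hosts.

    `auth_events` is an iterable of (source_host, destination_host) pairs,
    a hypothetical simplification of successful-logon records.
    """
    destinations = defaultdict(set)
    for source, destination in auth_events:
        if source != destination:  # ignore local logons
            destinations[source].add(destination)
    return sorted(src for src, dsts in destinations.items() if len(dsts) > threshold)


# Invented sample data: ws-017 touches four different servers, ws-042 only one.
events = [
    ("ws-017", "fs-01"), ("ws-017", "db-02"), ("ws-017", "app-03"),
    ("ws-017", "dc-01"), ("ws-042", "fs-01"), ("ws-042", "fs-01"),
]
print(flag_fan_out(events))  # ['ws-017']
```

Counting distinct destinations rather than raw events keeps a workstation that repeatedly uses one file server from being flagged.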

Legacy system risks. Financial institutions frequently run critical systems on legacy infrastructure with limited security monitoring. These systems are often the weakest point in the institution's defenses and therefore the easiest step in an attacker's kill chain.

Incident response coordination. Even institutions with mature SOC capabilities struggle with cross-team coordination during complex incidents. The handoff between detection (SOC), investigation (IR team), and business decision-making (management) frequently introduces delays.
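The handoff delays described above can be measured directly from an exercise timeline. The following is a minimal sketch, assuming each stage is recorded with an ISO-8601 timestamp; the stage names and timestamps are invented for illustration.

```python
from datetime import datetime


def handoff_delays(timeline):
    """Return minutes elapsed between consecutive stages of an incident timeline.

    `timeline` is an ordered list of (stage_name, ISO-8601 timestamp) pairs.
    """
    parsed = [(stage, datetime.fromisoformat(ts)) for stage, ts in timeline]
    return {
        f"{a} -> {b}": (tb - ta).total_seconds() / 60
        for (a, ta), (b, tb) in zip(parsed, parsed[1:])
    }


# Invented timestamps for illustration only.
incident = [
    ("soc_alert", "2024-03-04T09:12:00"),
    ("ir_engaged", "2024-03-04T10:47:00"),
    ("management_decision", "2024-03-04T14:02:00"),
]
delays = handoff_delays(incident)
for handoff, minutes in delays.items():
    print(f"{handoff}: {minutes:.0f} min")
```

Tracking each handoff separately, rather than a single time-to-respond figure, shows whether the delay sits in detection, investigation, or decision-making.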

Third-party dependencies. TIBER-EU exercises often reveal that critical functions depend on third-party services and infrastructure whose security the institution does not directly control.
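Third-party exposure of a critical function can be enumerated by walking its dependency graph. The sketch below assumes the institution maintains a dependency map; the component names, the map itself, and the function name are hypothetical.

```python
def third_party_exposure(function, depends_on, third_party):
    """Return every third-party component reachable from a critical function.

    `depends_on` maps a component to the components it relies on directly;
    `third_party` is the set of components operated by external providers.
    """
    seen, stack = set(), [function]
    while stack:
        node = stack.pop()
        for dependency in depends_on.get(node, ()):
            if dependency not in seen:  # also guards against cycles
                seen.add(dependency)
                stack.append(dependency)
    return sorted(seen & third_party)


# Hypothetical dependency map; all names are invented.
depends_on = {
    "payment_processing": ["core_banking", "message_gateway"],
    "core_banking": ["hosted_database"],
    "message_gateway": ["managed_network_provider"],
}
third_party = {"hosted_database", "managed_network_provider"}
print(third_party_exposure("payment_processing", depends_on, third_party))
```

The traversal is transitive on purpose: the externally hosted database is not a direct dependency of payment processing, yet it still sits on the path to the critical function.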

Impact on the Financial Sector

TIBER-EU has driven measurable improvements in financial sector security:

Institutions that have undergone TIBER-EU exercises report significant increases in detection capability in subsequent exercises
The framework has established red teaming as a standard practice in European financial regulation
Cross-border exercises have tested systemic resilience of the European financial system
The framework has influenced similar programs elsewhere, including the threat-led penetration testing requirements of the EU's Digital Operational Resilience Act (DORA) and the FEER framework in Saudi Arabia

Part 2: The Equation Group Tools Leak — When Red Team Tools Go Public

Background

In August 2016, a group calling itself "The Shadow Brokers" began leaking hacking tools alleged to belong to the Equation Group, widely believed to be the U.S. National Security Agency's (NSA) Tailored Access Operations (TAO) unit. The leaks occurred in multiple waves between August 2016 and April 2017, releasing some of the most sophisticated offensive tools ever developed.

The leaked tools were not theoretical concepts or proof-of-concept code. They were fully operational, weaponized tools that had been used in real intelligence operations. When they became publicly available, they transformed the threat landscape overnight and raised fundamental questions about the development, stockpiling, and security of offensive cyber capabilities.

The Leaked Arsenal

EternalBlue (patched in MS17-010, CVE-2017-0144): An exploit targeting a vulnerability in Microsoft's implementation of version 1 of the Server Message Block (SMBv1) protocol. EternalBlue could achieve remote code execution on unpatched Windows systems without authentication, making it one of the most powerful network exploitation tools ever made public.

EternalRomance: Another SMB exploit, targeting a different vulnerability that was also fixed in MS17-010, providing an alternative exploitation path for Windows systems.

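DoublePulsar: A kernel-mode backdoor implant designed to be deployed via EternalBlue or EternalRomance. DoublePulsar runs in memory and does not survive a reboot; its role is to give the operator a covert channel for injecting further payloads into the compromised system.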

FuzzBunch: An exploitation framework similar to Metasploit, providing a command-line interface for deploying the various exploits and implants.

DanderSpritz: A sophisticated post-exploitation framework with extensive capabilities for persistence, credential harvesting, and data exfiltration.
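From a defender's point of view, the SMB exploits in this list only work against hosts that expose a vulnerable SMBv1 service and lack the MS17-010 update. A minimal triage sketch follows, assuming a hypothetical asset inventory that already records those three facts per host; the `Host` fields and host names are invented, and the code does not scan or probe anything.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    smb_reachable: bool      # TCP 445 reachable from other network segments
    smbv1_enabled: bool
    ms17_010_patched: bool


def exposed_hosts(inventory):
    """Return names of hosts that remain exposed to the MS17-010 SMB exploits."""
    return [
        host.name for host in inventory
        if host.smb_reachable and host.smbv1_enabled and not host.ms17_010_patched
    ]


# Invented inventory records for illustration.
inventory = [
    Host("legacy-atm-gw", smb_reachable=True, smbv1_enabled=True, ms17_010_patched=False),
    Host("fileserver-01", smb_reachable=True, smbv1_enabled=False, ms17_010_patched=True),
    Host("hr-laptop-12", smb_reachable=False, smbv1_enabled=True, ms17_010_patched=False),
]
print(exposed_hosts(inventory))  # ['legacy-atm-gw']
```

The third host is unpatched but not reachable over SMB, so it is lower priority here; a real program would still patch it, since reachability changes once an attacker is inside the segment.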

The WannaCry and NotPetya Catastrophes

The leaked tools had immediate and devastating consequences:

WannaCry (May 2017): WannaCry, a ransomware worm attributed by the U.S. and UK governments to the North Korean Lazarus Group, used EternalBlue to spread automatically across networks. It infected over 200,000 systems in 150 countries within days. The UK's National Health Service (NHS) was severely impacted, with hospitals turning away patients and canceling surgeries. Total damages were estimated at $4-8 billion.

NotPetya (June 2017): NotPetya, destructive malware attributed by the U.S., the UK, and allied governments to the Russian military intelligence service (GRU), used EternalBlue as one of its propagation methods. It was initially targeted at Ukraine but spread globally. Maersk (the world's largest container shipping company) had to rebuild roughly 45,000 PCs and 4,000 servers. Merck (pharmaceutical company) suffered $870 million in damages. FedEx's TNT Express unit was crippled. Total global damages exceeded $10 billion.

Implications for Red Teaming

The Equation Group tools leak had profound implications for the red teaming community:

Offensive tool security: The leak demonstrated that even the most sophisticated organizations can lose control of their tools. Red teams must consider: What happens if our tools are discovered, leaked, or stolen? Tool security is not just about keeping tools secret during an engagement; it is about managing the lifecycle of offensive capabilities.
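One small piece of tool lifecycle management is knowing exactly which files make up the toolset and noticing when they change. The sketch below records a SHA-256 manifest of a tool directory and compares it with a later snapshot; the function names are hypothetical, and this covers inventory and integrity only, not access control or secure storage.

```python
import hashlib
from pathlib import Path


def build_manifest(tool_dir):
    """Map each file under `tool_dir` (by relative path) to its SHA-256 digest."""
    root = Path(tool_dir)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*")) if path.is_file()
    }


def diff_manifest(baseline, current):
    """Report files added, removed, or modified relative to a baseline manifest."""
    shared = baseline.keys() & current.keys()
    return {
        "added": sorted(current.keys() - baseline.keys()),
        "removed": sorted(baseline.keys() - current.keys()),
        "modified": sorted(name for name in shared if baseline[name] != current[name]),
    }
```

A team might build the baseline when an engagement starts and run the comparison at teardown, so that any tool that was altered, left behind, or unexpectedly added is accounted for before the environment is released.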

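The equities problem: The leaked tools exploited vulnerabilities that the NSA had discovered and kept for its own use rather than reporting them to Microsoft. The SMB flaws were patched in March 2017, reportedly only after the agency learned the tools had been compromised and alerted the vendor. The case sharpened the debate over the U.S. government's Vulnerabilities Equities Process (VEP), which weighs intelligence collection against defensive security. After the leaks, the U.S. government published a revised VEP charter in November 2017, but the tension remains: should government agencies report vulnerabilities they discover, or stockpile them for offensive use?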

Weaponization timeline: The time between the tool leak and its use in devastating attacks was measured in weeks. EternalBlue was published on 14 April 2017; WannaCry struck four weeks later (12 May) and NotPetya less than eleven weeks later (27 June). This demonstrates how quickly sophisticated tools can be repurposed by other threat actors. Red teams must consider the potential for their tools and techniques to be co-opted.
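The intervals involved can be worked out from the publicly reported dates of the 2017 events. The dates below are not given in this case study and are supplied here for illustration; the variable and function names are hypothetical.

```python
from datetime import date

# Publicly reported dates (not stated in the case study itself).
MS17_010_RELEASED = date(2017, 3, 14)   # Microsoft security bulletin
ETERNALBLUE_LEAKED = date(2017, 4, 14)  # Shadow Brokers release with EternalBlue
WANNACRY_OUTBREAK = date(2017, 5, 12)
NOTPETYA_OUTBREAK = date(2017, 6, 27)


def days_between(start, end):
    """Number of calendar days from `start` to `end`."""
    return (end - start).days


intervals = {
    "patch to leak": days_between(MS17_010_RELEASED, ETERNALBLUE_LEAKED),
    "leak to WannaCry": days_between(ETERNALBLUE_LEAKED, WANNACRY_OUTBREAK),
    "leak to NotPetya": days_between(ETERNALBLUE_LEAKED, NOTPETYA_OUTBREAK),
    "patch to WannaCry": days_between(MS17_010_RELEASED, WANNACRY_OUTBREAK),
}
for label, days in intervals.items():
    print(f"{label}: {days} days")
```

The last interval is the one defenders controlled: a fix had been available for about two months when WannaCry hit, so the damage reflects patch latency as much as exploit sophistication.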

Tool development ethics: The incident raised questions about the ethics of developing offensive tools. The tools were created for national security purposes but caused billions of dollars in collateral damage when leaked. Red teams, while operating at a different scale, face analogous questions about the tools they develop.

Detection development: The leaked tools provided defenders with unprecedented insight into nation-state offensive capabilities. Security vendors could now develop specific detections for tools like DoublePulsar and EternalBlue. This illustrates how tool exposure (whether through leaks or red team reporting) drives defensive improvement.

Combined Discussion Questions

Regulatory red teaming: Should TIBER-EU-style regulatory red teaming be mandated for critical infrastructure sectors beyond finance (energy, healthcare, telecommunications)? What are the practical challenges?
Testing in production: TIBER-EU requires testing against live production systems. What are the risks and benefits of this approach compared to testing in laboratory environments? How do you manage the risk of disrupting critical financial services?
Tool stockpiling: The Equation Group leak demonstrates the risk of stockpiling offensive tools. How should organizations (both government and private sector red teams) manage the security of their offensive tool arsenals?
Vulnerability equities: When a red team discovers a zero-day vulnerability during an engagement, what are their ethical obligations? Should they report it to the vendor immediately, or use it for the duration of the engagement?
Collateral damage: WannaCry and NotPetya caused billions in damage because leaked offensive tools were repurposed. How should the development of offensive tools account for the potential for misuse if they are leaked or stolen?
Continuous improvement: TIBER-EU exercises show that repeat exercises reveal improvement. What is the optimal frequency for red team exercises in critical infrastructure organizations? How do you balance cost with benefit?

Connections to Chapter Content

The TIBER-EU framework connects to Section 35.2 (engagement planning and threat modeling), demonstrating how threat intelligence drives red team exercise design. The Equation Group tools leak connects to Section 35.7 (advanced red team techniques) and raises important ethical considerations about offensive tool development and management. Both cases reinforce the central message of Section 35.8 (reporting and debrief): the value of red teaming lies not in the attack itself but in the improvements it drives.