Case Study 14.2: Collection #1-5 Credential Dumps and the SolarWinds "solarwinds123" Password

Part A: Collection #1 Through #5 — The Largest Credential Dumps in History

Overview

In January 2019, security researcher Troy Hunt, creator of the HaveIBeenPwned breach notification service, was alerted to a massive data collection being shared on underground forums and the cloud storage service MEGA. Dubbed "Collection #1," this aggregate dataset contained approximately 773 million unique email addresses and 21 million unique passwords, drawn from thousands of individual data breaches spanning years of cybercriminal activity.

Collection #1 was just the beginning. Within weeks, researchers discovered Collections #2 through #5, bringing the total to approximately 2.2 billion unique email-password combinations, at the time the largest aggregate credential dump publicly documented. These collections represented not a single breach but the consolidation of the entire cybercriminal credential economy into downloadable datasets.

The Scale of Exposure

Collection	Approximate Size	Unique Emails	Unique Passwords
Collection #1	87 GB	773 million	21 million
Collection #2	528 GB	~600 million	Additional millions
Collection #3	37 GB	~300 million	Additional millions
Collection #4	178 GB	~500 million	Additional millions
Collection #5	42 GB	~300 million	Additional millions
Combined	~872 GB	~2.2 billion unique pairs	Hundreds of millions

To put this in perspective: the world's population is approximately 8 billion people. 2.2 billion unique credential pairs works out to roughly one pair for every four people on Earth, although the number of distinct individuals affected is smaller, because many people appear in the data under several addresses and passwords.

Origins and Composition

The Collections were not the result of a single hack. They were compiled from thousands of individual breaches over many years, including:

Major platform breaches: LinkedIn (2012, 117M), Adobe (2013, 153M), MySpace (2016, 360M), Dropbox (2012, 68M)
Small website breaches: Thousands of smaller sites that were hacked via SQL injection, default credentials, or other common vulnerabilities
Credential scraping: Automated tools that tested leaked credentials against other services and recorded successful logins
Phishing campaigns: Credentials harvested through phishing over many years
Stealer malware: Credentials extracted from victims' browsers by information-stealing malware

The data had been circulating in cybercriminal circles for years before becoming publicly available. Collection #1's appearance on a public forum represented the democratization of data that professional cybercriminals had previously guarded.

Impact on Credential Stuffing

The Collections transformed credential stuffing from a specialized cybercriminal activity into something accessible to anyone with basic technical skills:

Before the Collections: Attackers needed to purchase or steal individual breach databases, often paying hundreds or thousands of dollars for high-quality, "fresh" credentials.

After the Collections: 2.2 billion credential pairs were freely available for download. Any attacker could:

1. Download the combined dataset
2. Filter for a target organization's email domain (e.g., @medsecure.com)
3. Run automated credential stuffing against the organization's login portal
4. At a 0.5-2% success rate, potentially compromise dozens of accounts
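The arithmetic behind step 4 can be made concrete. The sketch below assumes a hypothetical organization with 5,000 of its email addresses present in the dataset (an illustrative figure, not one taken from this case) and applies the 0.5-2% success range quoted above. The function name `expected_compromises` is likewise invented for the example.

```python
def expected_compromises(exposed_accounts: int, success_rate: float) -> float:
    """Expected number of valid logins if every exposed pair is tried once."""
    return exposed_accounts * success_rate


# Hypothetical figure for illustration: addresses at the target domain found in the dump.
EXPOSED_ACCOUNTS = 5_000

low = expected_compromises(EXPOSED_ACCOUNTS, 0.005)  # 0.5% success rate
high = expected_compromises(EXPOSED_ACCOUNTS, 0.02)  # 2% success rate
print(f"Expected compromised accounts: {low:.0f} to {high:.0f}")
```

Under these assumptions the estimate lands in the "dozens of accounts" range described above. From a defender's point of view, the same calculation gives a rough lower bound on how many accounts need forced resets or MFA enrollment.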

Real-World Credential Stuffing Attacks Enabled by the Collections

The availability of massive credential datasets led to an explosion of credential stuffing attacks:

Dunkin' Donuts (2018-2019): Multiple credential stuffing attacks compromised customer loyalty accounts, enabling gift card theft.
State Farm (2019): Approximately 100 million credential stuffing attempts over several months.
The North Face (2022): 194,905 customer accounts compromised via credential stuffing.
Spotify (2020): 350,000 accounts compromised through credential stuffing with a leaked database of 380 million records.

Defensive Lessons

The Collections demonstrate several critical lessons:

1. Assume Credentials Are Compromised: With 2.2 billion pairs available, the statistical likelihood that some of your organization's users have exposed credentials is virtually 100%. Defense must not rely on password secrecy alone.

2. Breach Screening Is Essential: Organizations should check new and existing passwords against known breach databases. Troy Hunt's HaveIBeenPwned Passwords API provides this service with privacy-preserving k-anonymity, and NIST SP 800-63B directs verifiers to compare prospective passwords against lists of values known to be compromised.

# Checking a password against HaveIBeenPwned (k-anonymity model)
import hashlib
import requests

def check_pwned(password):
    """Return how many times the password appears in the Pwned Passwords corpus."""
    sha1 = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    prefix, suffix = sha1[:5], sha1[5:]

    # Only the 5-character hash prefix leaves the machine
    response = requests.get(f"https://api.pwnedpasswords.com/range/{prefix}", timeout=10)
    response.raise_for_status()  # Fail loudly instead of reporting "not found" on an HTTP error

    for line in response.text.splitlines():
        hash_suffix, count = line.split(':')
        if hash_suffix == suffix:
            return int(count)  # Number of times this password appeared in breaches
    return 0  # Not found in breaches
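The privacy property of the range lookup can be examined offline. The following sketch splits the hash the same way as the function above and parses a response body in the SUFFIX:COUNT format. The helper names `split_hash` and `count_in_range` are invented for this example, and the sample response body, including its counts, is made up for illustration.

```python
import hashlib


def split_hash(password: str) -> tuple[str, str]:
    """Return (prefix, suffix) of the uppercase SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range(range_body: str, suffix: str) -> int:
    """Look up a suffix in a SUFFIX:COUNT response body; return 0 if absent."""
    for line in range_body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate == suffix:
            return int(count)
    return 0


prefix, suffix = split_hash("password")
print(prefix)  # only these five characters would be sent to the service

# Hypothetical response body: the entries and counts are made up for illustration.
sample_body = "0" * 35 + ":3\n" + suffix + ":12345\n"
print(count_in_range(sample_body, suffix))
```

Because the comparison against the 35-character suffix happens locally, the service only learns a prefix that is shared by many unrelated hashes.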

3. MFA Is the Most Effective Countermeasure: Even perfectly valid credentials are useless if multi-factor authentication is required. MFA is the single most effective defense against credential stuffing.

4. Credential Monitoring Should Be Continuous: New breaches are disclosed constantly. Organizations should continuously monitor for their employees' credentials appearing in new breach databases, not just perform one-time checks.

Part B: SolarWinds and the "solarwinds123" Password

Overview

In December 2020, the cybersecurity world was rocked by the disclosure of the SolarWinds supply chain attack—one of the most sophisticated and far-reaching cyberattacks in history. Russian intelligence (SVR, attributed to the APT29/Cozy Bear group) had compromised SolarWinds' Orion software build system and injected malicious code into updates distributed to approximately 18,000 organizations worldwide, including multiple U.S. government agencies, Fortune 500 companies, and critical infrastructure operators.

While the supply chain compromise itself was a masterclass in advanced tradecraft, a parallel discovery cast the incident in a starkly different light: a SolarWinds intern had set the password for a critical update server to "solarwinds123," and this password had been publicly exposed on GitHub since 2018.

The Password Incident

In February 2021, Congressional hearings on the SolarWinds breach drew attention to a report by security researcher Vinoth Kumar, who in November 2019 had discovered that the password for SolarWinds' update server (accessible at downloads.solarwinds.com) was "solarwinds123." SolarWinds executives told lawmakers that the password had been set by an intern and accidentally published in a public GitHub repository.

Kumar reported the exposure to SolarWinds, and the password was changed. However, the password had been publicly accessible for approximately a year before being reported, and the damage—if any was done through this specific vector—may have already occurred.

Why "solarwinds123" Is a Case Study in Password Failure

The "solarwinds123" incident encapsulates virtually every password security failure discussed in this chapter:

1. Predictability: The password follows one of the most common patterns seen in leaked password lists such as RockYou: a familiar base word (here, the company name) followed by a simple number sequence. Any targeted dictionary attack would try "solarwinds" combined with common suffixes among its first guesses.

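# Rule-based candidate generation: no hash is cracked here; hashcat only prints
# the guesses that the best64 rule set derives from the base word
# (adjust the rule path to match your hashcat installation)
echo "solarwinds" | hashcat --stdout -r best64.rule
# Output typically includes: solarwinds, Solarwinds, SOLARWINDS, solarwinds1, solarwinds123, etc.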

2. Critical System, Minimal Security: The update server is one of the most security-critical systems in SolarWinds' infrastructure. Software updates are distributed from this server to thousands of enterprise and government customers. The password protecting it was weaker than what most email services would accept.

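3. Public Exposure: The password was committed to a public GitHub repository, a tragically common occurrence. GitHub's secret scanning detects certain structured secret types (API keys, cloud credentials) and alerts the repository owner or the issuing provider, but it does not remove them from the repository, and a simple password such as "solarwinds123" has no distinctive format for pattern-based scanners to match.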

4. Insider Threat (Unintentional): An intern set the password, illustrating that even unintentional insider actions can create catastrophic vulnerabilities. Without proper access controls and oversight, a junior employee's mistake can compromise an entire organization.

5. Delayed Detection: The password was publicly exposed for approximately a year before a security researcher noticed and reported it. This suggests that SolarWinds was not conducting regular security audits of their infrastructure, monitoring for credential exposure, or scanning for their company name in public repositories.
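The kind of repository scan whose absence points 3 and 5 describe can be approximated with a context-based heuristic: instead of recognizing the secret's format, look for a credential-like key next to an assigned value. The sketch below is a deliberately minimal illustration, not how any particular scanning product works. The pattern, the function name `scan_text`, and the sample file contents are all assumptions made for the example.

```python
import re

# Heuristic pattern (an assumption for this sketch): a credential-like key,
# then = or :, then a quoted or bare value of at least four characters.
CREDENTIAL_ASSIGNMENT = re.compile(
    r"""(?ix)
    \b(password|passwd|pwd|secret)\b   # credential-like key
    \s*[:=]\s*                         # assignment or mapping separator
    ["']?([^\s"']{4,})["']?            # the value itself
    """
)


def scan_text(text: str) -> list[tuple[int, str, str]]:
    """Return (line_number, key, value) for each suspicious assignment."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = CREDENTIAL_ASSIGNMENT.search(line)
        if match:
            findings.append((number, match.group(1), match.group(2)))
    return findings


# Hypothetical configuration file contents, loosely modeled on the incident.
sample = """\
host = downloads.example.com
user = ftp_upload
password = "solarwinds123"
"""
for finding in scan_text(sample):
    print(finding)
```

A heuristic like this produces false positives (placeholder values, test fixtures) and misses secrets stored under unusual key names, so in practice it would complement rather than replace a dedicated scanner and a review of commit history.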

The Broader Supply Chain Attack

While the "solarwinds123" password may not have been the direct vector for the sophisticated SVR attack (which involved compromising the software build pipeline), it is emblematic of the security culture that allowed such an attack to succeed:

If a critical server could be protected with "solarwinds123," what other security gaps existed?
If an intern could set passwords on critical infrastructure without review, what access controls were in place?
If public credential exposure went undetected for a year, what monitoring existed?

The SVR attackers compromised SolarWinds through a far more sophisticated method—injecting malicious code into the Orion software build system (the SUNBURST backdoor). But the "solarwinds123" password revealed an organization whose security posture was fundamentally inadequate for its role in the software supply chain.

Impact

The SolarWinds breach affected:

U.S. government agencies: Treasury, Commerce, Homeland Security, State Department, and others
Technology companies: Microsoft, Intel, Cisco, VMware
Critical infrastructure: Energy companies, telecommunications providers
Healthcare organizations: Multiple healthcare entities used SolarWinds Orion
Security companies: FireEye (now Mandiant) discovered the breach when their own red team tools were stolen

Estimated cleanup costs exceeded $100 million for SolarWinds alone, with total economic impact across all affected organizations estimated in the billions.

Relevance to MedSecure

The SolarWinds case raises critical questions for MedSecure:

Vendor Password Security: MedSecure depends on dozens of software vendors for EHR systems, medical devices, and IT infrastructure. How confident is MedSecure that these vendors protect their update mechanisms with appropriate authentication?

Service Account Hygiene: Does MedSecure have service accounts with passwords like "medsecure123" or "admin123"? The Kerberoasting and password spraying techniques from this chapter would quickly find them.

Secret Management: Are credentials accidentally stored in code repositories, configuration files, or documentation? Automated secret scanning should be implemented across all MedSecure development environments.

Intern and Contractor Access: Do interns and contractors at MedSecure have access to set passwords on critical systems? Access controls must ensure that security-critical configuration requires senior approval.
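The service account question above lends itself to a simple automated check at password-set time, when the plaintext is briefly available. The sketch below flags passwords that consist of a known base word followed only by digits or a few common symbols. The function name `is_predictable` and the base word list are hypothetical; a real deployment would draw the list from the organization's own names, products, and defaults.

```python
import re


def is_predictable(password: str, base_words: list[str]) -> bool:
    """True if the password is a known base word plus optional digits or symbols."""
    lowered = password.lower()
    for word in base_words:
        # Base word, followed only by digits or a few common trailing symbols.
        if re.fullmatch(re.escape(word.lower()) + r"[0-9!@#$]*", lowered):
            return True
    return False


# Hypothetical base words: the organization name plus generic defaults.
BASE_WORDS = ["medsecure", "admin", "password", "welcome"]

for candidate in ["medsecure123", "admin123", "Welcome2024!", "t7#Qw9v-Lm2x"]:
    print(candidate, is_predictable(candidate, BASE_WORDS))
```

This only catches the base-word-plus-suffix pattern. It is meant to sit alongside breach screening and length requirements, not to serve as a complete password policy.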

Combined Lessons

🔵 Blue Team Perspective: The Collection #1-5 dumps and the SolarWinds password incident together illustrate the full spectrum of password security failures:

Credential stuffing (enabled by massive breach datasets) requires MFA, breach screening, and rate limiting

Weak passwords on critical infrastructure require automated policy enforcement and regular auditing

Credential exposure in code repositories requires automated secret scanning (tools like GitGuardian, TruffleHog, or GitHub's built-in secret scanning)

Comprehensive defense requires assuming credentials will be compromised and building layers of protection beyond passwords: MFA, network segmentation, monitoring, and least-privilege access
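Of the controls listed above, rate limiting is the easiest to illustrate in a few lines. The sketch below counts failed logins per source address inside a sliding time window and blocks the source once a threshold is reached. The class name `FailedLoginLimiter`, the thresholds, and the documentation-range IP address are all assumptions for the example. Timestamps are passed in explicitly so the behavior is easy to follow.

```python
from collections import defaultdict, deque


class FailedLoginLimiter:
    """Block a source after too many failed logins inside a sliding window."""

    def __init__(self, max_failures: int = 5, window_seconds: float = 60.0):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)  # source -> timestamps of failures

    def _prune(self, source: str, now: float) -> None:
        # Drop failures that have aged out of the window.
        times = self._failures[source]
        while times and now - times[0] >= self.window_seconds:
            times.popleft()

    def record_failure(self, source: str, now: float) -> None:
        self._prune(source, now)
        self._failures[source].append(now)

    def is_blocked(self, source: str, now: float) -> bool:
        self._prune(source, now)
        return len(self._failures[source]) >= self.max_failures


limiter = FailedLoginLimiter(max_failures=3, window_seconds=60)
for t in (0, 1, 2):
    limiter.record_failure("203.0.113.7", now=t)
print(limiter.is_blocked("203.0.113.7", now=3))    # checked inside the window
print(limiter.is_blocked("203.0.113.7", now=120))  # checked after the window has passed
```

Credential stuffing campaigns commonly spread attempts across many source addresses, so a per-address limit like this one slows an attack but does not stop it, which is why the list above pairs it with MFA and breach screening.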

Discussion Questions

The Collection datasets contain credentials from thousands of individual breaches compiled over years. What responsibility do organizations have to notify users when their historical breach data appears in new compilations?
The "solarwinds123" password was set by an intern. How should organizations balance the need to give interns meaningful work with the need to protect critical systems?
With 2.2 billion credential pairs publicly available, is the traditional password model fundamentally broken? What would a post-password authentication world look like?
Should there be legal liability for organizations that suffer breaches due to provably weak passwords on critical systems? How would such liability be structured?
How would you design a credential security program for MedSecure that addresses both the credential stuffing threat (external) and the weak internal password threat?

References

Hunt, T. (2019). "The 773 Million Record 'Collection #1' Data Breach." troyhunt.com.
Hern, A. (2019). "'Collection #1' data breach: 773m records and 22m passwords exposed." The Guardian.
SolarWinds. (2021). "SolarWinds Security Advisory."
U.S. Senate Committee on Intelligence. (2021). "Hearing on the Hack of U.S. Networks by a Foreign Adversary."
Kumar, V. (2021). Testimony before the U.S. House of Representatives Committee on Oversight and Reform.
Mandiant (FireEye). (2020). "Highly Evasive Attacker Leverages SolarWinds Supply Chain to Compromise Multiple Global Victims Via SUNBURST Backdoor."
CISA. (2021). "Emergency Directive 21-01: Mitigate SolarWinds Orion Code Compromise."