Case Study 2: A Credential-Stuffing Wave at a Consumer Streaming Service

"We didn't get hacked. Our users' passwords got hacked — somewhere else, years ago — and the bill came due on our login page." — Incident lead, StreamHarbor (constructed)

Executive Summary

Meridian is a bank with sixty-five crown-jewel identities it can hand hardware keys; this case study is the other world: a consumer service with thirty million accounts, no way to ship anyone a security key, and a login page hammered by automation around the clock. StreamHarbor is a (constructed) subscription streaming service that wakes up one Tuesday to a flood of account-takeover complaints: customers locked out, payment methods changed, profiles renamed, gift balances drained. StreamHarbor was never breached. Its users were, elsewhere, in other companies' breaches years earlier, and an attacker is now cashing in those leaked passwords against StreamHarbor's login through credential stuffing. This is a detection-and-response case: where Case Study 1 designed an authentication standard from a blank page, here you join an incident in progress, read the telemetry to distinguish stuffing from brute force or a password spray, and choose the layered controls that stop an attack you cannot prevent at the source. All names, figures, and logs are constructed for teaching (Tier 3).

Skills applied: authentication-log analysis (distinguishing stuffing from spraying and brute force by shape); account-takeover (ATO) detection; layered credential-attack defense (breach screening, smart throttling, bot defense, MFA); measuring a login page's success rate as a detection signal; communicating "we weren't breached, our users were."

Background

StreamHarbor has ~30 million accounts, nearly all protected by a password alone, the default for a consumer service that competes on frictionless sign-up. It offers optional MFA, but adoption is low (consumers opt out), and the service stores gift-card balances and saved payment methods that make an account worth taking over. Its security team is small relative to its user base, with a SOC that watches application telemetry and a fraud team that watches transactions.

For years, StreamHarbor's login page absorbed a low, constant background of failed logins: the internet's ambient automated tide (the §1.3 idea). What changes on this Tuesday is the volume and the shape. Over a few hours, failed logins spike twentyfold and, critically, a small but steady stream of the attacker's attempts succeeds, followed within minutes by changes to email, password, and payment method on the taken-over accounts. The fraud queue fills with "I'm locked out of my account" tickets.

🔗 Connection: StreamHarbor's core vulnerability is not in its code; it is in human password reuse across the whole internet — exactly the engine of credential stuffing from §16.6. The attacker holds billions of email:password pairs leaked from other companies' breaches and bets that some fraction of StreamHarbor's users reused one of those passwords. StreamHarbor's clean security record is irrelevant; the attack rides in on credentials that were correct the moment the user typed them, because they are the user's real password, reused.

The Investigation

Phase 1 — Read the shape: stuffing, spray, or brute force?

The SOC's first job is to name the attack, because the three credential attacks look superficially alike (lots of failed logins) but demand different responses. Analyst Maya pulls an hour of authentication logs. The §16.6 mental model — shape — is the tool:

StreamHarbor auth log (illustrative; source IPs in documentation range 198.51.100.0/24):
  10:00:01  user=a3391@ex   src=198.51.100.7    result=FAIL   reason=bad_password
  10:00:01  user=z8810@ex   src=198.51.100.182  result=SUCCESS
  10:00:02  user=k0042@ex   src=198.51.100.55   result=FAIL   reason=bad_password
  10:00:02  user=m7723@ex   src=198.51.100.99   result=FAIL   reason=bad_password
  10:00:03  user=p1180@ex   src=198.51.100.7    result=SUCCESS
  10:00:03  user=t6650@ex   src=198.51.100.244  result=FAIL   reason=bad_password
   ... ~120,000 DISTINCT users in one hour; ~1.5% SUCCESS; thousands of source IPs ...

Maya characterizes the shape against the three patterns:

Attack	Usernames	Passwords	Sources	Success rate	Match?
Credential stuffing	very many distinct	one per user (leaked)	many (botnet/proxies)	low but nonzero	YES
Password spraying	many distinct	one/few common values	one or few	very low	No (no shared password)
Brute force	one or few	very many guesses	one or few	~0 until hit	No (not one account)

The verdict is credential stuffing: an enormous number of distinct users, each tried with what appears to be a single specific password (not a guess list), from thousands of rotating IPs to evade per-IP limits, with a low-but-real ~1.5% success rate. That 1.5% is the entire problem: against roughly 120,000 attempts an hour (about one per targeted user), it is about 1,800 successful attacker logins an hour, each a potential takeover that can be monetized within minutes.
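Maya's comparison against the table can be sketched as a small classifier over the same four log fields. This is a minimal sketch under stated assumptions, not StreamHarbor's tooling: the function name `classify_shape` and all three thresholds are hypothetical, and because the log excerpt carries no password field, the sketch separates spraying from stuffing using only the Sources column of the table, which is a simplification.

```python
from collections import Counter

def classify_shape(events, few_users=5, few_sources=5, many_guesses=100):
    """Label an auth-log window by shape.

    events: iterable of (ts, user, src, result) tuples, as in the log excerpt.
    The three thresholds are hypothetical teaching values, not tuned numbers.
    """
    events = list(events)
    if not events:
        return "no_data", {}
    attempts_per_user = Counter(user for _, user, _, _ in events)
    sources = {src for _, _, src, _ in events}
    successes = sum(1 for _, _, _, result in events if result == "SUCCESS")
    stats = {
        "distinct_users": len(attempts_per_user),
        "distinct_sources": len(sources),
        "max_attempts_per_user": max(attempts_per_user.values()),
        "success_rate": round(successes / len(events), 4),
    }
    users, srcs = stats["distinct_users"], stats["distinct_sources"]
    if users <= few_users and stats["max_attempts_per_user"] >= many_guesses:
        label = "brute_force"          # one or few accounts, many guesses each
    elif users > few_users and srcs <= few_sources:
        label = "password_spraying"    # many accounts, one or few sources
    elif users > few_users and srcs > few_sources:
        label = "credential_stuffing"  # many accounts, many sources
    else:
        label = "inconclusive"
    return label, stats

# Synthetic windows (made up for illustration, not incident data).
stuffing = [(i, f"user{i}", f"198.51.100.{i % 20}",
             "SUCCESS" if i % 50 == 0 else "FAIL") for i in range(100)]
spraying = [(i, f"user{i}", "198.51.100.7", "FAIL") for i in range(100)]
brute = [(i, "user0", "198.51.100.7", "FAIL") for i in range(200)]

for name, window in [("stuffing", stuffing), ("spraying", spraying), ("brute", brute)]:
    print(name, classify_shape(window))
```

A label from a sketch like this is a starting hypothesis for the analyst, not a verdict; a distributed spray, for instance, would need password-level evidence to tell apart from stuffing.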

🛡️ Defender's Lens: The single most useful aggregate signal is the login page's overall success rate. In steady state, StreamHarbor's real users succeed ~92% of the time (people mistype). During the stuffing wave, the denominator explodes with attacker attempts that mostly fail, dragging the measured success rate down to ~12%. A sudden collapse in success rate, with a simultaneous spike in distinct usernames and source IPs, is a near-unambiguous stuffing signature — and it is a cheap metric to alert on, because it needs no per-account logic. Maya's first SIEM rule out of this incident is exactly that: alert when global login success rate drops below a threshold while attempt volume rises.
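The dilution effect behind this signal is simple arithmetic, and a hedged sketch makes it concrete. The attempt volumes below (10,000 legitimate and 76,000 attacker attempts in a window) are hypothetical numbers chosen only to show how a 92% baseline can fall to roughly 12%; the function names, the 0.50 rate threshold, and the 3× volume factor are likewise assumptions, not values from Maya's actual rule.

```python
def blended_success_rate(legit_attempts, legit_rate, attack_attempts, attack_rate):
    """Overall success rate when attacker attempts are mixed into real traffic."""
    total = legit_attempts + attack_attempts
    successes = legit_attempts * legit_rate + attack_attempts * attack_rate
    return successes / total

def stuffing_alert(baseline, current, min_rate=0.50, volume_factor=3.0):
    """baseline/current: (successes, attempts) for comparable time windows.

    Fires only when the success rate collapses AND volume rises together,
    so an ordinary quiet hour or a small outage alone does not trigger it.
    Thresholds are hypothetical defaults for illustration.
    """
    _base_ok, base_n = baseline
    cur_ok, cur_n = current
    if base_n == 0 or cur_n == 0:
        return False
    rate = cur_ok / cur_n
    return rate < min_rate and cur_n >= volume_factor * base_n

# Hypothetical window: 10,000 real attempts at 92%, plus 76,000 stuffed at 1.5%.
rate = blended_success_rate(10_000, 0.92, 76_000, 0.015)
print(round(rate, 3))                                     # roughly 0.12
print(stuffing_alert((9_200, 10_000), (10_340, 86_000)))  # wave: True
print(stuffing_alert((9_200, 10_000), (9_100, 10_000)))   # normal hour: False
```

Requiring both conditions is the design choice worth noting: a success-rate drop alone can come from a broken deployment, and a volume rise alone can come from a marketing push.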

Theo (visiting from Meridian as part of a cross-org information-sharing group) offers the cross-account detection logic, the same group-by-the-right-thing idea from §16.6 — here grouping by source behavior and distinct-user fan-out rather than by password, because in stuffing each user has a different password but the sources fan out across many accounts:

# Illustrative stuffing detector: flag sources hitting MANY distinct users with low success.
# (Hand-traced teaching logic; not executed.)
from collections import defaultdict

def stuffing_sources(events, user_fanout=50, max_success_rate=0.10):
    """events: iterable of (ts, user, src, result) tuples. Flag a src that touched
    >= user_fanout distinct users with a success rate <= max_success_rate
    (classic stuffing shape)."""
    seen = defaultdict(set)    # src -> distinct users attempted
    ok = defaultdict(int)      # src -> successful logins
    total = defaultdict(int)   # src -> all attempts
    for _ts, user, src, result in events:
        seen[src].add(user)
        total[src] += 1
        if result == "SUCCESS":
            ok[src] += 1
    flagged = []
    for src, users in seen.items():
        rate = ok[src] / total[src]  # total[src] >= 1 for every src in seen
        if len(users) >= user_fanout and rate <= max_success_rate:
            flagged.append((src, len(users), round(rate, 3)))
    return flagged

# Hand-traced: a source 198.51.100.7 that made 60 attempts, one per distinct user,
# with 3 successes (3/60 = 5%):
#   stuffing_sources(events) -> [('198.51.100.7', 60, 0.05)]

Phase 2 — Confirm account takeover and scope the damage

Distinguishing the attack from the damage matters: a failed login is noise; a successful login followed by an account change is a takeover. The fraud and SOC teams join the auth log to the account-event log and find the ATO finishing move repeatedly:

10:03:11  user=p1180@ex  result=SUCCESS           src=198.51.100.7
10:03:40  user=p1180@ex  event=EMAIL_CHANGED       new=attacker-mailbox@ex
10:03:52  user=p1180@ex  event=PASSWORD_CHANGED
10:04:18  user=p1180@ex  event=PAYMENT_METHOD_ADD  + GIFT_BALANCE_REDEEM

The pattern — successful login from a stuffing-flagged source, then a rapid email change, password change, and payment/gift action — is the takeover signature. Changing the email first is deliberate: it hijacks the recovery channel so the real owner cannot reset their way back in (the same recovery-path lesson as Case Study 1, here weaponized at scale). Scoping the incident means counting these sequences, not the raw successes, and identifying every account that saw a post-login change from a flagged source.
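The scoping rule in this paragraph (count sequences, not raw successes) can be sketched as a join between the two logs. This is a minimal illustration under assumptions: the function name `confirmed_takeovers`, the 15-minute window, and the tuple layouts are hypothetical, the event names are taken from the excerpt above, and `flagged_sources` is assumed to be a set of source IPs already flagged by a fan-out detector such as `stuffing_sources`.

```python
from datetime import datetime, timedelta

# Event names as they appear in the account-event excerpt.
SENSITIVE = {"EMAIL_CHANGED", "PASSWORD_CHANGED",
             "PAYMENT_METHOD_ADD", "GIFT_BALANCE_REDEEM"}

def confirmed_takeovers(auth_events, account_events, flagged_sources,
                        window=timedelta(minutes=15)):
    """Return {user: [sensitive events]} for logins that match the ATO sequence.

    auth_events:    (ts, user, src, result) tuples
    account_events: (ts, user, event) tuples
    A user counts only if a SUCCESS from a flagged source is followed by a
    sensitive account change within `window` (a hypothetical cutoff).
    """
    logins = {}
    for ts, user, src, result in auth_events:
        if result == "SUCCESS" and src in flagged_sources:
            logins.setdefault(user, []).append(ts)

    taken = {}
    for ts, user, event in account_events:
        if event not in SENSITIVE:
            continue
        # Keep the event if it follows any flagged login closely enough.
        if any(timedelta(0) <= ts - login_ts <= window
               for login_ts in logins.get(user, [])):
            taken.setdefault(user, []).append(event)
    return taken

def t(hms):
    """Parse an HH:MM:SS log timestamp (date ignored for this sketch)."""
    return datetime.strptime(hms, "%H:%M:%S")

auth = [
    (t("10:03:11"), "p1180@ex", "198.51.100.7", "SUCCESS"),
    (t("10:00:01"), "z8810@ex", "198.51.100.182", "SUCCESS"),  # no changes follow
]
acct = [
    (t("10:03:40"), "p1180@ex", "EMAIL_CHANGED"),
    (t("10:03:52"), "p1180@ex", "PASSWORD_CHANGED"),
    (t("10:04:18"), "p1180@ex", "PAYMENT_METHOD_ADD"),
]
result = confirmed_takeovers(auth, acct, flagged_sources={"198.51.100.7"})
print(len(result), result)
```

In this toy data there are two successful logins but only one confirmed takeover, which is the gap between "raw successes" and the number that belongs in a disclosure.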

⚠️ Common Pitfall: Measuring the wrong number under pressure. Executives ask "how many accounts were breached?" and a panicked team reports the failed-login spike (millions) or the raw successes (thousands), neither of which is the answer. The number that matters for customer notification and remediation is confirmed takeovers — successful logins from attack-correlated sources followed by an account change. Getting this number right (and not overstating it) is the difference between an accurate disclosure and a self-inflicted reputational wound.

Phase 3 — Layered response: you cannot stop the source, so stack the controls

StreamHarbor cannot un-leak its users' passwords or stop the attacker from owning a botnet. So the response is pure defense in depth on the login — each layer assuming the one before it will let some attacks through (Theme 4), stacked until the attack is uneconomical:

LAYERED CREDENTIAL-STUFFING RESPONSE (each layer catches what the prior one misses)
  1. BLOCK / THROTTLE the obvious: rate-limit per source + reputation block the worst IPs
     -> attacker rotates IPs (botnet), so this alone is insufficient — but it cuts volume.
  2. BOT DEFENSE: device fingerprinting + challenge (CAPTCHA) on anomalous patterns
     -> raises the per-attempt cost; filters cheap automation.
  3. BREACH-PASSWORD SCREENING: at next login/reset, force users whose password appears
     in a breach corpus to change it (k-anonymity check, §16.6 / authn.py).
     -> removes the attacker's actual ammunition: a reused-from-a-breach password.
  4. RISK-BASED / STEP-UP MFA: when a login looks risky (new device, flagged source,
     impossible travel), REQUIRE a second factor even for opt-out users.
     -> even a CORRECT stuffed password now fails without the second factor.
  5. PROTECT THE FINISHING MOVE: require re-authentication / step-up MFA for
     email change, password change, and payment changes; notify the OLD email on change.
     -> breaks the takeover even if a login slips through.
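The k-anonymity check named in layer 3 can be sketched as a hash-prefix range query: the service hashes the candidate password, sends only a short prefix of the hash to the breach corpus, and compares the returned suffixes locally, so the full hash never leaves the service. What follows is a minimal offline sketch, not the book's authn.py: `split_hash`, `is_breached`, and `make_stub_lookup` are hypothetical names, the `SUFFIX:COUNT` line format and the 5-character SHA-1 prefix are assumptions modeled on public breached-password range services, and the lookup is an in-memory stub rather than a network call.

```python
import hashlib

def split_hash(password, prefix_len=5):
    """SHA-1 the password and split the hex digest into (prefix, suffix)."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:prefix_len], digest[prefix_len:]

def is_breached(password, range_lookup):
    """Check a password against a breach corpus without revealing its full hash.

    range_lookup(prefix) is a hypothetical callable that returns lines of the
    form 'SUFFIX:COUNT' for every corpus hash starting with `prefix`.
    Only the prefix is shared; the suffix comparison happens locally.
    """
    prefix, suffix = split_hash(password)
    for line in range_lookup(prefix):
        candidate, _, count = line.strip().partition(":")
        if candidate == suffix:
            return int(count) > 0
    return False

def make_stub_lookup(breached_passwords):
    """Build an in-memory stand-in for a remote range endpoint (demo only)."""
    table = {}
    for pw in breached_passwords:
        prefix, suffix = split_hash(pw)
        table.setdefault(prefix, []).append(f"{suffix}:1")
    return lambda prefix: table.get(prefix, [])

lookup = make_stub_lookup(["password123", "letmein"])
print(is_breached("password123", lookup))                   # True
print(is_breached("correct horse battery staple", lookup))  # False
```

SHA-1 appears here only as a lookup key into a corpus that is assumed to be indexed that way; it says nothing about how StreamHarbor should store its own password hashes.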

The decisive layers are 3, 4, and 5. Breach-password screening (layer 3) attacks the root: stuffing only works because users reused a breached password, so forcing those specific users off those specific passwords removes the attacker's ammunition. Risk-based step-up MFA (layer 4) makes a correct stolen password insufficient on a risky login — the §16.6 "MFA neutralizes stuffing's premise" lesson, applied selectively so StreamHarbor's frictionless-login business model survives (most legitimate logins see no extra friction; only risky ones get challenged). And protecting the finishing move (layer 5) means that even if a takeover login succeeds, the attacker cannot complete the monetization — changing the email or draining the balance — without clearing another check, and the real owner is notified on their old channel before the recovery path is hijacked.
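Layers 4 and 5 are both policy decisions, and a small sketch shows how selectively they apply friction. Everything here is an illustrative assumption rather than StreamHarbor's implementation: the `LoginContext` fields mirror the risk signals named in layer 4, the action names are hypothetical, and `challenge_fallback` stands in for whatever second check a service can offer a user who never enrolled in MFA (for example, a one-time code to a previously verified channel).

```python
from dataclasses import dataclass

# Hypothetical action names for this sketch.
SENSITIVE_ACTIONS = {"EMAIL_CHANGE", "PASSWORD_CHANGE", "PAYMENT_CHANGE"}

@dataclass(frozen=True)
class LoginContext:
    known_device: bool
    source_flagged: bool
    impossible_travel: bool
    mfa_enrolled: bool

def login_decision(ctx: LoginContext) -> str:
    """Layer 4: add friction only when the login looks risky."""
    risky = (not ctx.known_device) or ctx.source_flagged or ctx.impossible_travel
    if not risky:
        return "allow"  # most legitimate logins take this path
    # A correct password alone is no longer enough on a risky login.
    return "challenge_mfa" if ctx.mfa_enrolled else "challenge_fallback"

def action_policy(action: str, old_email: str) -> dict:
    """Layer 5: sensitive changes need step-up and notify the OLD address."""
    if action not in SENSITIVE_ACTIONS:
        return {"step_up": False, "notify": []}
    return {"step_up": True, "notify": [old_email]}

usual = LoginContext(known_device=True, source_flagged=False,
                     impossible_travel=False, mfa_enrolled=False)
stuffed = LoginContext(known_device=False, source_flagged=True,
                       impossible_travel=False, mfa_enrolled=False)
print(login_decision(usual))    # allow
print(login_decision(stuffed))  # challenge_fallback
print(action_policy("EMAIL_CHANGE", "owner@ex"))
```

The usual-device login passes untouched, which is how the frictionless model survives, while the stuffed login and the email change each hit a check the attacker's stolen password cannot clear.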

🚪 Threshold Concept: You usually cannot stop a credential-stuffing attack; you can only make it uneconomical. The attacker's whole model is volume × a tiny success rate × easy monetization. Every layer you add either shrinks the success rate (breach screening, step-up MFA) or breaks the monetization (protecting account changes), and the attacker, who is running this against thousands of services at once, moves to a softer target the moment the return on effort against your site drops below the next site's. Defense against opportunistic credential attacks is not about being impregnable; it is about being more expensive to attack than your neighbor, which is the §1.3 asymmetry turned to your advantage.
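The volume × success rate × monetization model can be put into a few lines of arithmetic. Every number below is a hypothetical placeholder chosen to show the direction of the effect, not a measured attacker cost or account value; only the 1.5% starting success rate comes from the case, and the function name is an assumption.

```python
def expected_return_per_attempt(success_rate, monetize_rate,
                                value_per_account, cost_per_attempt):
    """Attacker's expected profit for one login attempt.

    success_rate:      chance a stuffed credential logs in
    monetize_rate:     chance a successful login can be cashed out
    value_per_account: payout when cash-out works (hypothetical currency units)
    cost_per_attempt:  proxies, CAPTCHA solving, tooling (hypothetical)
    """
    return success_rate * monetize_rate * value_per_account - cost_per_attempt

# Before the layers: 1.5% success, cash-out always works, attempts are cheap.
before = expected_return_per_attempt(0.015, 1.0, 20.0, 0.01)

# After the layers (illustrative): screening and step-up cut the success rate,
# protected account changes cut monetization, bot defense raises the cost.
after = expected_return_per_attempt(0.002, 0.1, 20.0, 0.02)

print(round(before, 3), round(after, 3))
```

With these placeholder inputs the per-attempt return flips from positive to negative, which is the point at which an opportunistic attacker has a reason to move on.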

Phase 4 — Recovery, notification, and the honest message

For the confirmed-takeover accounts, StreamHarbor force-resets credentials, reverses fraudulent payment and gift actions where possible, restores hijacked email addresses, and notifies affected users on a verified channel. The harder communications problem is the narrative, because the press will write "streaming service hacked" and the truth is more specific and more important: StreamHarbor's systems were not breached; its users' reused passwords, leaked from other companies long ago, were replayed against its login. The team's message threads that needle honestly — taking responsibility for the defenses ("we are adding breach-password screening and risk-based MFA so a reused password can no longer be enough") without claiming a breach that did not happen, and using the moment to push MFA adoption among the broad user base.

🔗 Connection: This is the same lesson Meridian's Dana delivered to her board in Case Study 1, seen from the opposite end. Meridian pre-empted the credential-attack problem by matching authentication strength to value before an incident; StreamHarbor is retrofitting those same controls during one. Both arrive at the identical control set — breach screening, MFA (ideally phishing-resistant or at least risk-based step-up), and protected recovery — because the controls are dictated by the attacks, which are the same everywhere. The cheaper path is Meridian's: build it before the bill comes due.

Discussion Questions

Maya distinguished stuffing from spraying and brute force by shape. Reconstruct the three shapes from memory (usernames × passwords × sources × success rate). Why does naming the attack correctly change the response?
StreamHarbor competes on frictionless login and most users opted out of MFA. How does risk-based step-up MFA preserve the business model while still defeating stuffing? What is the residual risk?
The attacker changed the account's email first. Explain why, and connect it to the account-recovery lesson from Case Study 1. What layer-5 control specifically defeats this move?
The team agonized over the public message ("we weren't breached, our users were"). Is this distinction fair to make, or does it sound like deflecting responsibility? Argue both sides, then state what responsibility StreamHarbor does bear.
Of the five response layers, which two would you implement first if you could only deploy two this week, and why? Defend your choice on a risk basis.

Your Turn

You are handed one hour of a consumer service's authentication logs showing a 25× spike in failed logins. Write the investigation: (1) the three or four fields you would compute to determine which credential attack it is, and the shape that distinguishes stuffing from spraying; (2) the one aggregate metric you would alert on going forward; (3) a prioritized, layered response of at least four controls, naming for each what it stops and what it lets through; and (4) the two-sentence honest customer message. Keep it to one page. If your response is a single control, you have not yet internalized defense in depth — stuffing is beaten by a stack, not a silver bullet.

Key Takeaways

Credential stuffing rides in on your users' reused passwords, leaked elsewhere — a clean security record offers no protection, because the credentials are correct when typed.
Name the attack by its shape: stuffing = many users × one leaked password each × many sources, low success; spraying = many users × few common passwords × few sources; brute force = one account × many guesses. The response differs for each.
The login success rate is a cheap, powerful aggregate signal: a sudden collapse with rising attempt volume is a near-unambiguous stuffing alarm, needing no per-account logic.
Measure confirmed takeovers (login + account change), not failed-login volume or raw successes — it is the number that drives notification and remediation, and overstating it is self-inflicted harm.
You cannot stop the source — stack the controls: rate-limit/bot-defense, breach-password screening (removes the ammunition), risk-based step-up MFA (a correct stolen password no longer suffices), and protect the finishing move (re-auth + old-channel notification on email/password/payment changes).
Defense against opportunistic credential attacks means being more expensive than the next target — every layer shrinks the success rate or breaks monetization until the attacker moves on.
The controls are the same as Meridian's; the timing is worse — retrofitting under fire costs more than building authentication to the threat in advance.