Key Takeaways: Incident Response

DataField.Dev

A one-page reference. Reread before an exam or before responding to a real incident. Dense by design.

The premise

It's not if, it's when. Prevention lowers frequency and severity; it never reaches zero. A program is judged by how it responds. Incident response is a security control in its own right — identical intrusions reach opposite outcomes based on IR capability.
Event vs. incident. An event is any observable occurrence; a security incident is an event (or correlated set of events) that violates or imminently threatens security policy and harms (or credibly threatens) confidentiality, integrity, or availability (the CIA triad). Triage funnels many events down to the few real incidents.

The NIST SP 800-61 lifecycle (memorize the order — it's a LOOP)

PREPARE → DETECT & ANALYZE → CONTAIN, ERADICATE, RECOVER → POST-INCIDENT ACTIVITY → (back to PREPARE)

Phase	The job	Key idea
1. Prepare	Plan, playbooks, runbooks, tools, comms, roles, training	Where the leverage is; lessons feed back here
2. Detect & Analyze	Triage: real? how bad? how far? what now?	Corroborate across sources; scope early
3. Contain, Eradicate, Recover	Stop spread; remove attacker; restore safely	Iterates with analysis; type sets posture
4. Post-Incident Activity	Blameless postmortem; owned action items	Closes the loop into Prepare

Detection/analysis and containment iterate — you contain what you've scoped, keep scoping, contain more. Not a one-way checklist.
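
The iteration between scoping and containment can be pictured as a worklist: contain what is known, pivot from it, and queue whatever the pivot turns up. A minimal sketch, assuming hypothetical `pivot` and `contain` callables and made-up host names (none of them come from a real toolkit):

```python
from collections import deque

def scope_and_contain(initial_hosts, pivot, contain):
    """Contain each known-compromised host, then pivot from it to find more."""
    queue = deque(initial_hosts)
    seen = set(initial_hosts)
    contained = []
    while queue:
        host = queue.popleft()
        contain(host)                  # contain what you have scoped
        contained.append(host)
        for related in pivot(host):    # keep scoping from the new evidence
            if related not in seen:
                seen.add(related)
                queue.append(related)  # contain more on a later pass
    return contained

# Hypothetical evidence: which other hosts each host's logs implicate.
LINKS = {"ws-14": ["fs-02", "ws-31"], "fs-02": ["dc-01"], "ws-31": [], "dc-01": ["fs-02"]}

isolated = []
order = scope_and_contain(["ws-14"], lambda h: LINKS.get(h, []), isolated.append)
print(order)  # ['ws-14', 'fs-02', 'ws-31', 'dc-01']
```

This eager version matches a fast, destructive incident. For a stealthy intruder the same traversal would collect the full set first and contain it in one coordinated step.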

Preparation artifacts (build before the fire)

Artifact	What it is
IR plan	Governing doc: definitions, severity matrix, escalation, notification paths; short enough to use
Playbook	Scenario-level decision/coordination procedure (ransomware, phishing/BEC, account compromise, exfil)
Runbook	Step-by-step technical task (isolate host, disable account + revoke sessions, `krbtgt` reset)
Comms plan	Who tells whom, when; internal cadence, legal/insurer, regulators, customers
Roles	Incident commander (decision authority) + technical lead, scribe, comms, legal, SMEs
Out-of-band copy	Plan, contacts, runbooks, comms channel — off the systems the plan protects

Build playbooks in risk priority (likely × damaging): ransomware, phishing/BEC, account compromise, data exfiltration first.
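
The likely × damaging ordering can be made explicit with a small scoring sketch. The 1 to 5 scales, the scores, and the two extra scenarios (`ddos`, `lost laptop`) are illustrative assumptions, not figures from the chapter:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Scenario:
    name: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (expected)
    damage: int      # assumed scale: 1 (minor) to 5 (severe)

    @property
    def risk(self) -> int:
        return self.likelihood * self.damage

def playbook_order(scenarios):
    """Highest risk first; ties keep their input order (stable sort)."""
    return sorted(scenarios, key=lambda s: s.risk, reverse=True)

# Hypothetical scores for illustration only.
SCENARIOS = [
    Scenario("ddos", 2, 3),
    Scenario("ransomware", 4, 5),
    Scenario("phishing/BEC", 5, 3),
    Scenario("account compromise", 4, 3),
    Scenario("data exfiltration", 3, 4),
    Scenario("lost laptop", 3, 1),
]

for s in playbook_order(SCENARIOS):
    print(f"{s.risk:>2}  {s.name}")
```

With these made-up scores the first four playbooks to write are ransomware, phishing/BEC, account compromise, and data exfiltration.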

Severity classification (drives speed, escalation, resourcing, notification)

SEV	Trigger (example, bank)	Response
SEV-1 Critical	Customer data / core banking / DC compromise; active ransomware; regulatory/outage stakes	Full team, 24/7, IC, war room; exec + legal + insurer + regulator
SEV-2 High	Single privileged account / sensitive server; lateral-movement-capable malware	IR lead + engineers ≤30 min; legal on standby
SEV-3 Medium	Single non-priv account; contained commodity malware	SOC, business hours, documented
SEV-4 Low	Recon, blocked attack, isolated low-risk event	Log and trend
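
One way to encode the matrix so that the worst observed trigger always wins. This is a sketch under assumptions: the signal names are hypothetical labels for the triggers in the table, and a real matrix would be tuned to the organization:

```python
from enum import IntEnum
from typing import Iterable

class Sev(IntEnum):
    CRITICAL = 1
    HIGH = 2
    MEDIUM = 3
    LOW = 4

# Hypothetical signal names standing in for the table's triggers.
SEV1_SIGNALS = frozenset({"customer_data", "core_banking", "domain_controller", "active_ransomware"})
SEV2_SIGNALS = frozenset({"privileged_account", "sensitive_server", "lateral_movement_malware"})
SEV3_SIGNALS = frozenset({"nonpriv_account", "contained_commodity_malware"})

RESPONSE = {
    Sev.CRITICAL: "Full team, 24/7, IC, war room; exec + legal + insurer + regulator",
    Sev.HIGH: "IR lead + engineers within 30 min; legal on standby",
    Sev.MEDIUM: "SOC, business hours, documented",
    Sev.LOW: "Log and trend",
}

def classify(signals: Iterable[str]) -> Sev:
    """Return the most severe level that any observed signal triggers."""
    observed = set(signals)
    if observed & SEV1_SIGNALS:
        return Sev.CRITICAL
    if observed & SEV2_SIGNALS:
        return Sev.HIGH
    if observed & SEV3_SIGNALS:
        return Sev.MEDIUM
    return Sev.LOW  # recon, blocked attacks, isolated low-risk events

sev = classify({"contained_commodity_malware", "privileged_account"})
print(f"SEV-{sev.value}: {RESPONSE[sev]}")
```

Checking the tiers from most to least severe means a mixed bag of signals can never be under-classified by the order in which they arrived.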

Triage — the four questions (decision under uncertainty)

Is it real? Corroborate across log sources (SIEM, EDR, firewall, auth, proxy). An alert is a hypothesis.
How bad? Severity by the matrix; which assets, which CIA leg.
How far (SCOPE)? The question novices under-ask. Pivot on every IoC (Ch.22), every identity action, build the timeline.
What now? Escalate / declare / contain / keep watching.

Under-scoping (clean the one host, miss the other footholds) reignites incidents. Tipping off (loud containment too early) warns the attacker, who then abandons the footholds you know about and moves to ones you don't.
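
The four questions can be walked in code as a rough illustration. This is not the bluekit `ir.py` implementation; the `Alert` fields, the two-source corroboration threshold, and the severity-to-action mapping are all assumptions made for the sketch:

```python
from dataclasses import dataclass, field

@dataclass
class Alert:
    rule: str
    sources: set = field(default_factory=set)     # log sources showing the activity
    severity: int = 4                             # SEV level from the matrix (1 = worst)
    hosts: set = field(default_factory=set)       # hosts implicated so far
    identities: set = field(default_factory=set)  # accounts implicated so far

# Assumed mapping from severity to next step; a real IR plan defines its own.
ACTION_BY_SEVERITY = {1: "declare", 2: "escalate", 3: "contain", 4: "keep watching"}

def triage_sketch(alert: Alert, min_sources: int = 2) -> dict:
    """Answer the four triage questions for one alert."""
    real = len(alert.sources) >= min_sources  # 1. Is it real?
    if not real:
        # Uncorroborated: still a hypothesis, so gather more evidence.
        return {"real": False, "severity": None, "scope": None, "action": "keep watching"}
    scope = {"hosts": sorted(alert.hosts), "identities": sorted(alert.identities)}
    return {
        "real": True,
        "severity": alert.severity,                    # 2. How bad?
        "scope": scope,                                # 3. How far?
        "action": ACTION_BY_SEVERITY[alert.severity],  # 4. What now?
    }

corroborated = Alert("impossible-travel login", {"auth", "proxy"}, 2, {"ws-14"}, {"j.doe"})
lone = Alert("edr-beacon", {"edr"}, 1)
print(triage_sketch(corroborated)["action"])  # escalate
print(triage_sketch(lone)["action"])          # keep watching
```

The scope recorded here is only what the alert already names; pivoting on each IoC and identity is what grows it.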

Containment — the four competing concerns; the type sets the posture

Concern	Pulls toward
Stop the damage	Immediate aggressive isolation
Preserve evidence	Don't power off / don't wipe — isolate, keep powered
Business continuity	Surgical containment, keep critical services up
Avoid tipping off	Quiet, coordinated containment

Incident type	Posture
Ransomware / wiper / active exfil (fast, destructive)	IMMEDIATE aggressive — isolate now, scope in parallel
APT / persistent / slow insider (stealthy)	QUIET thorough scoping first, then coordinated containment everywhere at once

Short-term containment: fast, reversible (isolate host, disable account + revoke sessions, block C2).
Long-term containment: durable holding pattern (clean replacement system, temp firewall rules, broad credential reset).
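
The type-to-posture table above reduces to a small lookup. A hedged sketch, not the bluekit `containment(incident_type)` function itself; the type labels are hypothetical spellings of the table's rows:

```python
# Hypothetical labels for the incident types named in the posture table.
FAST_DESTRUCTIVE = frozenset({"ransomware", "wiper", "active_exfil"})
STEALTHY = frozenset({"apt", "persistent", "slow_insider"})

POSTURES = {
    "immediate": "Isolate now; scope in parallel.",
    "quiet": "Scope thoroughly first; then contain everywhere at once.",
}

def containment_posture(incident_type: str) -> str:
    """Map an incident type to 'immediate' or 'quiet' containment."""
    kind = incident_type.strip().lower()
    if kind in FAST_DESTRUCTIVE:
        return "immediate"
    if kind in STEALTHY:
        return "quiet"
    # No silent default: an unknown type needs a human decision.
    raise ValueError(f"unclassified incident type {incident_type!r}: ask the incident commander")

posture = containment_posture("Ransomware")
print(posture, "->", POSTURES[posture])
```

Raising on an unknown type is a deliberate choice in this sketch: defaulting to either posture could tip off a stealthy attacker or let a destructive one run.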

Eradication & Recovery

Eradicate (only as good as your scoping):
- Wipe and reimage deeply compromised hosts; do not "disinfect" (can't prove all implants gone).
- Remove persistence (scheduled tasks, services, run keys, web shells, rogue accounts).
- Rotate all accessible credentials; for DC compromise, krbtgt double-reset.
- Close the initial vector (patch, fix config, remove standing access), or you're waiting for re-entry.

Recover (deliberate, not rushed):
- Restore from known-good, tested, OFFLINE/immutable backups (the thing ransomware deletes first).
- Stage by business priority; validate against baseline; heightened monitoring post-recovery.
- Verify before closure (vector closed, persistence gone, creds rotated, stable).
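
The verify-before-closure step works well as an explicit gate. A minimal sketch, assuming hypothetical check names that mirror the four items in parentheses:

```python
# Hypothetical check names mirroring the closure criteria.
CLOSURE_CHECKS = (
    "vector_closed",
    "persistence_removed",
    "credentials_rotated",
    "stable_under_monitoring",
)

def outstanding_checks(status: dict) -> list:
    """List closure checks that are missing or not yet true."""
    return [check for check in CLOSURE_CHECKS if not status.get(check, False)]

def can_close(status: dict) -> bool:
    return not outstanding_checks(status)

status = {"vector_closed": True, "persistence_removed": True, "credentials_rotated": False}
print(can_close(status))           # False
print(outstanding_checks(status))  # ['credentials_rotated', 'stable_under_monitoring']
```

A check that was never recorded counts as failed, so an incident cannot be closed simply because nobody filled in the form.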

Tabletop exercise

Discussion-based simulation; walk a realistic scenario step by step; no production impact.
Cheapest, highest-leverage IR investment. Run regularly (Meridian: quarterly).
A tabletop that finds no problems was facilitated too gently — the point is to surface gaps.
Use timed injects that force hard decisions (containment, ransom, the clock).
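
A facilitator can keep timed injects in a simple schedule and release them as the exercise clock advances. The inject texts and timings below are illustrative assumptions, not a scenario from the chapter:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Inject:
    minute: int  # minutes after exercise start
    prompt: str  # new fact handed to the participants

# Illustrative ransomware injects; a real exercise would write its own.
INJECTS = [
    Inject(0, "EDR flags mass file renames on a file server."),
    Inject(15, "A ransom note appears; the attacker sets a payment deadline."),
    Inject(30, "Backups for one business unit fail their restore test."),
    Inject(45, "Legal asks whether the regulatory clock has started."),
]

def due_injects(injects, elapsed_minutes, delivered):
    """Return injects whose time has come and that were not yet delivered."""
    return [
        inject
        for inject in sorted(injects, key=lambda i: i.minute)
        if inject.minute <= elapsed_minutes and inject not in delivered
    ]

delivered = set()
first_batch = due_injects(INJECTS, 20, delivered)
delivered.update(first_batch)
second_batch = due_injects(INJECTS, 45, delivered)
print([i.minute for i in first_batch], [i.minute for i in second_batch])  # [0, 15] [30, 45]
```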

The ransom decision (pre-decided, escalated — never improvised)

It is a strategic, legal, ethical, and sanctions-compliance decision, not a technical or merely financial one.
Paying sanctioned groups may be unlawful; payment guarantees nothing (decryptor or deletion).
Default: recover from backups, do not pay; revisited only by named execs + counsel + insurer if recovery is genuinely impossible.

Communications & the regulatory clock

Keep an internal cadence: IC status update every 30–60 min for SEV-1, even if the update is "no change".
Out-of-band channel if the attacker may be in email/Teams.
Legal + cyber-insurer are often the first external calls; late insurer notice can void coverage.
Banking: notify the federal regulator within 36 hours of determining a qualifying computer-security incident. Healthcare (HIPAA): notify individuals + HHS within 60 days; ≥500 individuals adds media + HHS portal. State breach laws apply in parallel.
The clock can start before you fully understand the incident — the determination is itself a deadline-bound decision the IC must drive.
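
Because the deadline is fixed the moment the clock starts, it helps to compute it immediately and track hours remaining. A sketch using the two windows quoted above; the timestamps are hypothetical, and which event starts each clock is a question for counsel:

```python
from datetime import datetime, timedelta, timezone

BANKING_WINDOW = timedelta(hours=36)  # from determination of a qualifying incident
HIPAA_WINDOW = timedelta(days=60)     # outer limit quoted for individual + HHS notice

def notification_deadline(clock_start: datetime, window: timedelta) -> datetime:
    """Deadline = moment the clock started + the regulatory window."""
    if clock_start.tzinfo is None:
        raise ValueError("use timezone-aware timestamps so deadlines are unambiguous")
    return clock_start + window

def hours_remaining(deadline: datetime, now: datetime) -> float:
    """Negative once the deadline has passed."""
    return (deadline - now).total_seconds() / 3600

determined_at = datetime(2025, 3, 3, 22, 15, tzinfo=timezone.utc)  # hypothetical
deadline = notification_deadline(determined_at, BANKING_WINDOW)
print(deadline.isoformat())  # 2025-03-05T10:15:00+00:00

now = datetime(2025, 3, 4, 10, 15, tzinfo=timezone.utc)  # hypothetical
print(hours_remaining(deadline, now))  # 24.0
```

Requiring timezone-aware timestamps avoids the common failure where a war room in one timezone and a regulator in another disagree about when 36 hours ran out.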

Blameless postmortem

Ground rule: improve systems, not assign blame. People don't come to work to cause incidents.
Blame → silence → blindness. Blamelessness → truth → durable fixes (and more reporting).
The trigger (a click, a missed patch) is rarely the root cause (a systemic gap). Push past it ("five whys" ends at a process/design gap, never a person).
Output: timeline → what went well → what went poorly/got lucky → root cause → owned, deadlined, tracked action items. A report with no action items guarantees recurrence.
Capture MTTD/MTTR per incident; the trend is the honest measure of maturity (Ch.36).
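
MTTD and MTTR fall straight out of the postmortem timeline. A minimal sketch, assuming MTTD means compromise start to detection and MTTR means detection to resolution (definitions vary between organizations); the incident timestamps are made up:

```python
from datetime import datetime
from statistics import mean

def hours_between(start: datetime, end: datetime) -> float:
    return (end - start).total_seconds() / 3600

# Hypothetical timelines taken from two postmortems.
INCIDENTS = [
    {"began": datetime(2025, 1, 10, 8, 0),
     "detected": datetime(2025, 1, 10, 20, 0),
     "resolved": datetime(2025, 1, 11, 20, 0)},
    {"began": datetime(2025, 2, 1, 0, 0),
     "detected": datetime(2025, 2, 1, 6, 0),
     "resolved": datetime(2025, 2, 1, 18, 0)},
]

def mttd_hours(incidents) -> float:
    """Mean time to detect: compromise start to detection (assumed definition)."""
    return mean([hours_between(i["began"], i["detected"]) for i in incidents])

def mttr_hours(incidents) -> float:
    """Mean time to respond: detection to resolution (assumed definition)."""
    return mean([hours_between(i["detected"], i["resolved"]) for i in incidents])

print(mttd_hours(INCIDENTS), mttr_hours(INCIDENTS))  # 9.0 18.0
```

Computing the same two numbers per quarter gives the trend line; a single incident's figures say little on their own.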

Certification crosswalk

Concept	CompTIA Security+	(ISC)² CISSP domain
IR lifecycle / phases	4.0 Security Operations (incident response)	Security Operations
Event vs. incident; severity	4.0	Security Operations
Playbooks / runbooks	4.0	Security Operations
Containment/eradication/recovery	4.0	Security Operations
Incident commander / IR team / comms	4.0; 5.0 Security Program Management and Oversight	Security Operations; Security & Risk Mgmt
Tabletop / exercises	4.0; 5.0	Security Operations
Breach notification (HIPAA/state/banking)	5.0 Security Program Management and Oversight	Security & Risk Management
Lessons learned / blameless review	4.0	Security Operations

Project additions this chapter

Meridian program: IR plan (definitions, SEV-1–SEV-4 matrix, chain of command + incident commander, notification paths incl. 36-hour clock) + starter playbooks (ransomware, phishing/BEC, account compromise, exfil) + supporting runbooks (host isolation, account disable/revoke, krbtgt reset) + out-of-band comms plan + a quarterly ransomware tabletop validating it all.
bluekit toolkit: ir.py — triage(alert) (signals → severity + action) and containment(incident_type) (type → containment posture).

Common pitfalls

Storing the IR plan/contacts/runbooks only on the systems they protect (no out-of-band copy).
Treating the lifecycle as a strict line instead of an iterating loop.
Under-scoping (clean the one host, declare victory) or tipping off a stealthy attacker.
A playbook with no runbooks — "isolate the host" with no 3 a.m.-executable procedure.
Powering off a host (losing memory) when isolation would have preserved evidence.
Improvising the ransom decision under the clock instead of following a pre-written policy.
Missing the regulatory clock because legal/GRC were not at the table early.
A beautiful postmortem with no owned, deadlined action items — the lesson re-taught at full price.
Assuming every incident is loud; the quiet exfiltration breach is often more damaging, and you cannot scope down what you did not log.