Key Takeaways: Incident Response
A one-page reference. Reread before an exam or before responding to a real incident. Dense by design.
The premise
- It's not if, it's when. Prevention lowers frequency and severity; it never reaches zero. A program is judged by how it responds. Incident response is a security control in its own right — identical intrusions reach opposite outcomes based on IR capability.
- Event vs. incident. An event is any observable occurrence; a security incident is an event (or correlated set) that violates/imminently threatens security policy and harms (or credibly threatens) the CIA triad. Triage funnels many events down to the few real incidents.
The NIST SP 800-61 lifecycle (memorize the order — it's a LOOP)
PREPARE → DETECT & ANALYZE → CONTAIN, ERADICATE, RECOVER → POST-INCIDENT ACTIVITY → (back to PREPARE)
| Phase | The job | Key idea |
|---|---|---|
| 1. Prepare | Plan, playbooks, runbooks, tools, comms, roles, training | Where the leverage is; lessons feed back here |
| 2. Detect & Analyze | Triage: real? how bad? how far? what now? | Corroborate across sources; scope early |
| 3. Contain, Eradicate, Recover | Stop spread; remove attacker; restore safely | Iterates with analysis; type sets posture |
| 4. Post-Incident Activity | Blameless postmortem; owned action items | Closes the loop into Prepare |
Detection/analysis and containment iterate — you contain what you've scoped, keep scoping, contain more. Not a one-way checklist.
Preparation artifacts (build before the fire)
| Artifact | What it is |
|---|---|
| IR plan | Governing doc: definitions, severity matrix, escalation, notification paths; short enough to use |
| Playbook | Scenario-level decision/coordination procedure (ransomware, phishing/BEC, account compromise, exfil) |
| Runbook | Step-by-step technical task (isolate host, disable account + revoke sessions, krbtgt reset) |
| Comms plan | Who tells whom, when; internal cadence, legal/insurer, regulators, customers |
| Roles | Incident commander (decision authority) + technical lead, scribe, comms, legal, SMEs |
| Out-of-band copy | Plan, contacts, runbooks, comms channel — off the systems the plan protects |
Build playbooks in risk priority (likely × damaging): ransomware, phishing/BEC, account compromise, data exfiltration first.
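The "likely × damaging" ordering can be sketched as a simple score-and-sort. This is a minimal illustration: the `Scenario` class, the 1 to 5 scales, and every score below are hypothetical, not values from the Meridian program.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    likelihood: int  # 1 (rare) .. 5 (expected); illustrative scale
    impact: int      # 1 (minor) .. 5 (severe); illustrative scale

    @property
    def risk(self) -> int:
        # Risk priority = likelihood x impact
        return self.likelihood * self.impact


def playbook_order(scenarios):
    """Sort scenarios by descending risk; ties keep their input order."""
    return sorted(scenarios, key=lambda s: s.risk, reverse=True)


# Made-up scores, for illustration only
backlog = [
    Scenario("lost laptop", 3, 2),
    Scenario("ransomware", 4, 5),
    Scenario("phishing/BEC", 5, 4),
    Scenario("DDoS", 2, 3),
]

for s in playbook_order(backlog):
    print(f"{s.risk:>2}  {s.name}")
```

A real program would take its scores from its own risk register; the point is only that the build order falls out of the product, not out of whichever scenario was in the news last.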
Severity classification (drives speed, escalation, resourcing, notification)
| SEV | Trigger (example, bank) | Response |
|---|---|---|
| SEV-1 Critical | Customer data / core banking / DC compromise; active ransomware; regulatory/outage stakes | Full team, 24/7, IC, war room; exec + legal + insurer + regulator |
| SEV-2 High | Single privileged account / sensitive server; lateral-movement-capable malware | IR lead + engineers ≤30 min; legal on standby |
| SEV-3 Medium | Single non-priv account; contained commodity malware | SOC, business hours, documented |
| SEV-4 Low | Recon, blocked attack, isolated low-risk event | Log and trend |
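One way to make the matrix executable is a lookup that returns the most severe level any observed signal triggers. The sketch below is an assumption-laden illustration: the signal names are invented labels for the example triggers in the table, and `classify` is not the toolkit's `triage` function.

```python
from enum import IntEnum


class Sev(IntEnum):
    SEV1 = 1  # Critical
    SEV2 = 2  # High
    SEV3 = 3  # Medium
    SEV4 = 4  # Low


# Hypothetical signal labels mapped to the table's example triggers
SEV_TRIGGERS = {
    Sev.SEV1: {"customer_data", "core_banking", "dc_compromise", "active_ransomware"},
    Sev.SEV2: {"privileged_account", "sensitive_server", "lateral_capable_malware"},
    Sev.SEV3: {"nonpriv_account", "contained_commodity_malware"},
}


def classify(signals: set) -> Sev:
    """Return the most severe level triggered by any signal; default to SEV-4."""
    for sev in sorted(SEV_TRIGGERS):  # SEV1 is checked first
        if signals & SEV_TRIGGERS[sev]:
            return sev
    return Sev.SEV4


print(classify({"nonpriv_account", "dc_compromise"}).name)
print(classify({"blocked_port_scan"}).name)
```

Checking the most severe level first means one critical signal outranks any number of minor ones, which matches how the matrix is meant to be read.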
Triage — the four questions (decision under uncertainty)
- Is it real? Corroborate across log sources (SIEM, EDR, firewall, auth, proxy). An alert is a hypothesis.
- How bad? Severity by the matrix; which assets, which CIA leg.
- How far (SCOPE)? The question novices under-ask. Pivot on every IoC (Ch.22), every identity action, build the timeline.
- What now? Escalate / declare / contain / keep watching.
Under-scoping (clean the one host, miss the other footholds) reignites incidents. Tipping off (loud containment too early) prompts the attacker to abandon the footholds you know about and fall back to ones you have not found.
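The first and third questions ("is it real?" and "how far?") can be sketched as a pivot on one indicator across log sources. This is a minimal illustration: the record layout, the `corroborate` helper, and the two-source threshold are all assumptions, and the addresses come from documentation-reserved ranges.

```python
def corroborate(ioc, sightings, min_sources=2):
    """Pivot on one IoC: which independent sources saw it, and on which hosts?"""
    hits = [s for s in sightings if s["ioc"] == ioc]
    sources = sorted({s["source"] for s in hits})
    hosts = sorted({s["host"] for s in hits})
    return {
        "ioc": ioc,
        "sources": sources,  # "is it real?": independent confirmation
        "hosts": hosts,      # "how far?": a starting scope, not the final one
        "corroborated": len(sources) >= min_sources,  # threshold is an assumption
    }


# Illustrative records only; a real pivot would query SIEM/EDR/proxy logs
sightings = [
    {"source": "EDR", "host": "ws-014", "ioc": "203.0.113.7"},
    {"source": "proxy", "host": "ws-014", "ioc": "203.0.113.7"},
    {"source": "proxy", "host": "ws-022", "ioc": "203.0.113.7"},
    {"source": "firewall", "host": "ws-031", "ioc": "198.51.100.9"},
]

print(corroborate("203.0.113.7", sightings))
print(corroborate("198.51.100.9", sightings))
```

An uncorroborated indicator is not cleared; it stays a hypothesis that needs more data before a decision.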
Containment — the four competing concerns; the type sets the posture
| Concern | Pulls toward |
|---|---|
| Stop the damage | Immediate aggressive isolation |
| Preserve evidence | Don't power off / don't wipe — isolate, keep powered |
| Business continuity | Surgical containment, keep critical services up |
| Avoid tipping off | Quiet, coordinated containment |
| Incident type | Posture |
|---|---|
| Ransomware / wiper / active exfil (fast, destructive) | IMMEDIATE aggressive — isolate now, scope in parallel |
| APT / persistent / slow insider (stealthy) | QUIET thorough scoping first, then coordinated containment everywhere at once |
- Short-term containment: fast, reversible (isolate host, disable account + revoke sessions, block C2).
- Long-term containment: durable holding pattern (clean replacement system, temp firewall rules, broad credential reset).
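The posture table reduces to a lookup from incident type to containment stance. The sketch below is hypothetical: the `Posture` enum, the type labels, and the fall-through to the incident commander are assumptions, and `containment_posture` is an illustration rather than the toolkit's `containment(incident_type)`.

```python
from enum import Enum


class Posture(Enum):
    IMMEDIATE = "isolate now, scope in parallel"
    QUIET = "scope quietly, then contain everywhere at once"
    UNDECIDED = "escalate to the incident commander"


# Hypothetical labels for the two rows of the posture table
FAST_DESTRUCTIVE = {"ransomware", "wiper", "active_exfil"}
STEALTHY = {"apt", "persistent", "slow_insider"}


def containment_posture(incident_type: str) -> Posture:
    """Map an incident type to a starting posture; unknown types go to the IC."""
    kind = incident_type.strip().lower()
    if kind in FAST_DESTRUCTIVE:
        return Posture.IMMEDIATE
    if kind in STEALTHY:
        return Posture.QUIET
    return Posture.UNDECIDED


for kind in ("Ransomware", "apt", "unknown_beacon"):
    print(f"{kind}: {containment_posture(kind).value}")
```

Returning an explicit "undecided" value instead of guessing keeps the ambiguous cases in front of a person with decision authority.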
Eradication & Recovery
Eradicate (only as good as your scoping):
- Wipe and reimage deeply compromised hosts — do not "disinfect" (can't prove all implants gone).
- Remove persistence (scheduled tasks, services, run keys, web shells, rogue accounts).
- Rotate every credential the attacker could have reached; for DC compromise, reset the krbtgt password twice.
- Close the initial vector (patch, fix config, remove standing access) — or you're waiting for re-entry.
Recover (deliberate, not rushed):
- Restore from known-good, tested, OFFLINE/immutable backups (the thing ransomware deletes first).
- Stage by business priority; validate against baseline; heightened monitoring post-recovery.
- Verify before closure (vector closed, persistence gone, creds rotated, stable).
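The "verify before closure" step works as a gate: the incident stays open until every check is explicitly confirmed. A minimal sketch follows, with hypothetical check names that mirror the four items in the parenthetical.

```python
# Hypothetical check names mirroring the closure criteria listed above
CLOSURE_CHECKS = (
    "vector_closed",
    "persistence_removed",
    "credentials_rotated",
    "stable_under_monitoring",
)


def open_items(status: dict) -> list:
    """Return the checks not yet confirmed; a missing entry counts as not done."""
    return [check for check in CLOSURE_CHECKS if not status.get(check, False)]


def ready_to_close(status: dict) -> bool:
    return not open_items(status)


status = {"vector_closed": True, "persistence_removed": True}
print(open_items(status))
print(ready_to_close(status))
```

Treating a missing entry as "not done" is the deliberate choice here: silence never closes an incident.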
Tabletop exercise
- Discussion-based simulation; walk a realistic scenario step by step; no production impact.
- Cheapest, highest-leverage IR investment. Run regularly (Meridian: quarterly).
- A tabletop that finds no problems was facilitated too gently — the point is to surface gaps.
- Use timed injects that force hard decisions (containment, ransom, the clock).
The ransom decision (pre-decided, escalated — never improvised)
- It is a strategic, legal, ethical, and sanctions-compliance decision, not a technical or merely financial one.
- Paying sanctioned groups may be unlawful; payment guarantees nothing (decryptor or deletion).
- Default: recover from backups, do not pay; revisited only by named execs + counsel + insurer if recovery is genuinely impossible.
Communications & the regulatory clock
- Internal cadence: IC status update every 30–60 min for SEV-1, even when the update is "no change".
- Out-of-band channel if the attacker may be in email/Teams.
- Legal + cyber-insurer are often the first external calls; late insurer notice can void coverage.
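- Banking: notify the primary federal regulator as soon as possible and no later than 36 hours after determining that a qualifying computer-security incident has occurred. Healthcare (HIPAA): notify affected individuals without unreasonable delay and within 60 days of discovery; a breach affecting 500 or more individuals also requires HHS notice in that same window plus media notice, while smaller breaches are reported to HHS annually. State breach laws apply in parallel.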
- The clock can start before you fully understand the incident — the determination is itself a deadline-bound decision the IC must drive.
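The arithmetic behind the clocks is simple enough to automate, which removes one source of error at 3 a.m. The sketch below assumes the two windows named above and timezone-aware timestamps; the rule keys and function names are hypothetical, and which event legally starts each clock is a question for counsel, not for code.

```python
from datetime import datetime, timedelta, timezone

# Windows taken from the bullets above; rule keys are illustrative
WINDOWS = {
    "banking_36h": timedelta(hours=36),  # from the determination
    "hipaa_60d": timedelta(days=60),     # from discovery
}


def notification_deadline(start: datetime, rule: str) -> datetime:
    """Latest notification time for a rule, given when its clock started."""
    if start.tzinfo is None:
        raise ValueError("use timezone-aware timestamps")
    return start + WINDOWS[rule]


def time_remaining(start: datetime, rule: str, now: datetime) -> timedelta:
    return notification_deadline(start, rule) - now


determined = datetime(2024, 3, 1, 22, 15, tzinfo=timezone.utc)
now = datetime(2024, 3, 2, 22, 15, tzinfo=timezone.utc)

print(notification_deadline(determined, "banking_36h").isoformat())
print(time_remaining(determined, "banking_36h", now))
```

In this example a determination made at 22:15 UTC on 1 March yields a deadline of 10:15 UTC on 3 March, so a full day later only 12 hours remain.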
Blameless postmortem
- Ground rule: improve systems, not assign blame. People don't come to work to cause incidents.
- Blame → silence → blindness. Blamelessness → truth → durable fixes (and more reporting).
- The trigger (a click, a missed patch) is rarely the root cause (a systemic gap). Push past it ("five whys" ends at a process/design gap, never a person).
- Output: timeline → what went well → what went poorly/got lucky → root cause → owned, deadlined, tracked action items. A report with no action items guarantees recurrence.
- Capture MTTD/MTTR per incident; the trend is the honest measure of maturity (Ch.36).
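Capturing the metric per incident only needs three timestamps. The sketch below assumes one common pair of definitions (MTTD as start of compromise to detection, MTTR as detection to resolution); organizations define both differently, and the field names and incident records here are invented.

```python
from datetime import datetime
from statistics import mean


def mttd_mttr_hours(incidents):
    """Return (MTTD, MTTR) in hours under the definitions assumed above."""
    detect = [(i["detected"] - i["started"]).total_seconds() / 3600 for i in incidents]
    resolve = [(i["resolved"] - i["detected"]).total_seconds() / 3600 for i in incidents]
    return mean(detect), mean(resolve)


# Invented incident records, for illustration only
incidents = [
    {
        "started": datetime(2024, 1, 1, 0, 0),
        "detected": datetime(2024, 1, 1, 6, 0),
        "resolved": datetime(2024, 1, 2, 6, 0),
    },
    {
        "started": datetime(2024, 2, 1, 0, 0),
        "detected": datetime(2024, 2, 1, 2, 0),
        "resolved": datetime(2024, 2, 1, 14, 0),
    },
]

mttd, mttr = mttd_mttr_hours(incidents)
print(f"MTTD {mttd:.1f} h, MTTR {mttr:.1f} h")
```

Whatever definitions are chosen, they need to stay fixed across reporting periods, because the trend is only meaningful if each period measures the same interval.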
Certification crosswalk
| Concept | CompTIA Security+ | (ISC)² CISSP domain |
|---|---|---|
| IR lifecycle / phases | 4.0 Security Operations (incident response) | Security Operations |
| Event vs. incident; severity | 4.0 | Security Operations |
| Playbooks / runbooks | 4.0 | Security Operations |
| Containment/eradication/recovery | 4.0 | Security Operations |
| Incident commander / IR team / comms | 4.0; 5.0 Governance | Security Operations; Security & Risk Mgmt |
| Tabletop / exercises | 4.0; 5.0 | Security Operations |
| Breach notification (HIPAA/state/banking) | 5.0 Governance, Risk & Compliance | Security & Risk Management |
| Lessons learned / blameless review | 4.0 | Security Operations |
Project additions this chapter
- Meridian program: IR plan (definitions, SEV-1–SEV-4 matrix, chain of command + incident commander, notification paths incl. 36-hour clock) + starter playbooks (ransomware, phishing/BEC, account compromise, exfil) + supporting runbooks (host isolation, account disable/revoke, krbtgt reset) + out-of-band comms plan + a quarterly ransomware tabletop validating it all.
- bluekit toolkit: ir.py, with triage(alert) (signals → severity + action) and containment(incident_type) (type → containment posture).
Common pitfalls
- Storing the IR plan/contacts/runbooks only on the systems they protect (no out-of-band copy).
- Treating the lifecycle as a strict line instead of an iterating loop.
- Under-scoping (clean the one host, declare victory) or tipping off a stealthy attacker.
- A playbook with no runbooks — "isolate the host" with no 3 a.m.-executable procedure.
- Powering off a host (losing memory) when isolation would have preserved evidence.
- Improvising the ransom decision under the clock instead of following a pre-written policy.
- Missing the regulatory clock because legal/GRC were not at the table early.
- A beautiful postmortem with no owned, deadlined action items — the lesson re-taught at full price.
- Assuming every incident is loud; the quiet exfiltration breach is often more damaging, and you cannot scope down what you did not log.