Key Takeaways: Incident Response

A one-page reference. Reread before an exam or before responding to a real incident. Dense by design.

The premise

  • It's not if, it's when. Prevention lowers frequency and severity; it never reaches zero. A program is judged by how it responds. Incident response is a security control in its own right — identical intrusions reach opposite outcomes based on IR capability.
  • Event vs. incident. An event is any observable occurrence; a security incident is an event (or correlated set) that violates/imminently threatens security policy and harms (or credibly threatens) the CIA triad. Triage funnels many events down to the few real incidents.

The NIST SP 800-61 lifecycle (memorize the order — it's a LOOP)

PREPARE → DETECT & ANALYZE → CONTAIN, ERADICATE, RECOVER → POST-INCIDENT ACTIVITY → (back to PREPARE)

Phase | The job | Key idea
1. Prepare | Plan, playbooks, runbooks, tools, comms, roles, training | Where the leverage is; lessons feed back here
2. Detect & Analyze | Triage: real? how bad? how far? what now? | Corroborate across sources; scope early
3. Contain, Eradicate, Recover | Stop spread; remove attacker; restore safely | Iterates with analysis; type sets posture
4. Post-Incident Activity | Blameless postmortem; owned action items | Closes the loop into Prepare

Detection/analysis and containment iterate — you contain what you've scoped, keep scoping, contain more. Not a one-way checklist.

Preparation artifacts (build before the fire)

Artifact | What it is
IR plan | Governing doc: definitions, severity matrix, escalation, notification paths; short enough to use
Playbook | Scenario-level decision/coordination procedure (ransomware, phishing/BEC, account compromise, exfil)
Runbook | Step-by-step technical task (isolate host, disable account + revoke sessions, krbtgt reset)
Comms plan | Who tells whom, when; internal cadence, legal/insurer, regulators, customers
Roles | Incident commander (decision authority) + technical lead, scribe, comms, legal, SMEs
Out-of-band copy | Plan, contacts, runbooks, comms channel, all kept off the systems the plan protects

Build playbooks in risk priority (likely × damaging): ransomware, phishing/BEC, account compromise, data exfiltration first.
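
The risk-priority ordering above can be sketched as a likelihood × impact score. This is a minimal illustration: the 1-5 scales, the scores, and the `lost_device` scenario are assumptions made up for the example, not values from the chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (expected)
    impact: int      # assumed scale: 1 (minor) to 5 (severe)

    @property
    def risk(self) -> int:
        # Risk priority = likely x damaging
        return self.likelihood * self.impact


def playbook_backlog(scenarios):
    """Return scenarios ordered highest risk first (ties keep input order)."""
    return sorted(scenarios, key=lambda s: s.risk, reverse=True)


# Illustrative scores only; a real program would take these from its own risk assessment.
SCENARIOS = [
    Scenario("lost_device", likelihood=3, impact=2),
    Scenario("ransomware", likelihood=4, impact=5),
    Scenario("phishing_bec", likelihood=5, impact=4),
    Scenario("data_exfiltration", likelihood=3, impact=5),
    Scenario("account_compromise", likelihood=4, impact=4),
]

if __name__ == "__main__":
    for s in playbook_backlog(SCENARIOS):
        print(f"{s.risk:>2}  {s.name}")
```

With these assumed scores the four starter playbooks sort ahead of the low-impact scenario, which is the point of scoring before writing.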

Severity classification (drives speed, escalation, resourcing, notification)

SEV | Trigger (example, bank) | Response
SEV-1 Critical | Customer data / core banking / DC compromise; active ransomware; regulatory/outage stakes | Full team, 24/7, IC, war room; exec + legal + insurer + regulator
SEV-2 High | Single privileged account / sensitive server; lateral-movement-capable malware | IR lead + engineers ≤30 min; legal on standby
SEV-3 Medium | Single non-priv account; contained commodity malware | SOC, business hours, documented
SEV-4 Low | Recon, blocked attack, isolated low-risk event | Log and trend
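
A severity matrix like the one above can be expressed as a lookup that returns the most severe level any signal triggers. This is a sketch under stated assumptions: the signal names are hypothetical labels for the matrix rows, and defaulting unknown signals to SEV-4 is a choice made for the example.

```python
from enum import IntEnum


class Severity(IntEnum):
    SEV1 = 1  # Critical
    SEV2 = 2  # High
    SEV3 = 3  # Medium
    SEV4 = 4  # Low


# Hypothetical signal names; each maps to the matrix row it triggers.
SIGNAL_SEVERITY = {
    "customer_data_compromise": Severity.SEV1,
    "core_banking_compromise": Severity.SEV1,
    "domain_controller_compromise": Severity.SEV1,
    "active_ransomware": Severity.SEV1,
    "privileged_account_compromise": Severity.SEV2,
    "sensitive_server_compromise": Severity.SEV2,
    "lateral_movement_malware": Severity.SEV2,
    "standard_account_compromise": Severity.SEV3,
    "contained_commodity_malware": Severity.SEV3,
    "reconnaissance": Severity.SEV4,
    "blocked_attack": Severity.SEV4,
}

RESPONSE = {
    Severity.SEV1: "full team, 24/7, IC, war room; exec + legal + insurer + regulator",
    Severity.SEV2: "IR lead + engineers within 30 min; legal on standby",
    Severity.SEV3: "SOC, business hours, documented",
    Severity.SEV4: "log and trend",
}


def classify(signals):
    """Return the most severe level triggered by any signal.

    Unknown signals fall back to SEV-4 (an assumption for this sketch).
    """
    levels = [SIGNAL_SEVERITY.get(s, Severity.SEV4) for s in signals]
    return min(levels, default=Severity.SEV4)


if __name__ == "__main__":
    sev = classify(["reconnaissance", "privileged_account_compromise"])
    print(sev.name, "->", RESPONSE[sev])
```

Taking the minimum works because lower numbers mean higher severity; one SEV-1 signal outranks any number of lower ones.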

Triage — the four questions (decision under uncertainty)

  1. Is it real? Corroborate across log sources (SIEM, EDR, firewall, auth, proxy). An alert is a hypothesis.
  2. How bad? Severity by the matrix; which assets, which CIA leg.
  3. How far (SCOPE)? The question novices under-ask. Pivot on every IoC (Ch.22), every identity action, build the timeline.
  4. What now? Escalate / declare / contain / keep watching.
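
The first question ("Is it real?") can be sketched as counting how many independent log sources saw the same indicator. The two-source threshold and the sample observations are assumptions for illustration, not a rule from the chapter.

```python
# Log sources named in the triage list above.
SOURCES = {"siem", "edr", "firewall", "auth", "proxy"}


def corroborating_sources(indicator, observations):
    """Return the distinct recognized sources that recorded the indicator.

    observations: iterable of (source, indicator) pairs.
    """
    return {src for src, seen in observations if seen == indicator and src in SOURCES}


def is_corroborated(indicator, observations, min_sources=2):
    # Assumed rule of thumb: the alert stays a hypothesis until
    # at least min_sources independent sources agree.
    return len(corroborating_sources(indicator, observations)) >= min_sources


# Hypothetical observations; "helpdesk" is not a recognized log source and is ignored.
OBSERVATIONS = [
    ("edr", "evil.example"),
    ("proxy", "evil.example"),
    ("proxy", "evil.example"),
    ("helpdesk", "evil.example"),
    ("auth", "unrelated.example"),
]

if __name__ == "__main__":
    print(sorted(corroborating_sources("evil.example", OBSERVATIONS)))
    print(is_corroborated("evil.example", OBSERVATIONS))
```

Duplicate hits from one source count once, since repeated lines in the same log do not add independent evidence.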

Under-scoping (clean the one host, miss the other footholds) reignites incidents. Tipping off (loud containment too early) prompts the attacker to abandon the footholds you know about and fall back to ones you don't.
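
Pivoting on indicators to answer "how far?" is essentially a graph traversal: hosts link to the indicators seen on them, and indicators link to every other host where they appear. The sketch below assumes sightings arrive as simple (host, indicator) pairs; the host names and indicators are made up.

```python
from collections import deque


def scope(seed_host, sightings):
    """Expand scope from one host by pivoting on shared indicators.

    sightings: iterable of (host, indicator) pairs, e.g. from EDR or proxy logs.
    Returns (hosts, indicators) reachable from the seed host.
    """
    by_host, by_ioc = {}, {}
    for host, ioc in sightings:
        by_host.setdefault(host, set()).add(ioc)
        by_ioc.setdefault(ioc, set()).add(host)

    hosts, iocs = {seed_host}, set()
    queue = deque([seed_host])
    while queue:
        host = queue.popleft()
        for ioc in by_host.get(host, ()):
            if ioc in iocs:
                continue  # already pivoted on this indicator
            iocs.add(ioc)
            for other in by_ioc[ioc]:
                if other not in hosts:
                    hosts.add(other)
                    queue.append(other)
    return hosts, iocs


# Hypothetical sightings: ws-200 shares nothing with the seed and stays out of scope.
SIGHTINGS = [
    ("ws-014", "hash:aa11"),
    ("ws-014", "c2:203.0.113.7"),
    ("srv-db2", "c2:203.0.113.7"),
    ("srv-db2", "acct:svc_backup"),
    ("ws-101", "acct:svc_backup"),
    ("ws-200", "hash:ffff"),
]

if __name__ == "__main__":
    hosts, iocs = scope("ws-014", SIGHTINGS)
    print(sorted(hosts))
    print(sorted(iocs))
```

Starting from one alerting workstation, the traversal reaches a server and a second workstation that a single-host cleanup would have missed. In practice an analyst still has to prune indicators that are too common to be meaningful.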

Containment — the four competing concerns; the type sets the posture

Concern | Pulls toward
Stop the damage | Immediate aggressive isolation
Preserve evidence | Don't power off, don't wipe; isolate and keep powered
Business continuity | Surgical containment, keep critical services up
Avoid tipping off | Quiet, coordinated containment

Incident type | Posture
Ransomware / wiper / active exfil (fast, destructive) | IMMEDIATE aggressive: isolate now, scope in parallel
APT / persistent / slow insider (stealthy) | QUIET thorough scoping first, then coordinated containment everywhere at once

  • Short-term containment: fast, reversible (isolate host, disable account + revoke sessions, block C2).
  • Long-term containment: durable holding pattern (clean replacement system, temp firewall rules, broad credential reset).
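
The type-to-posture mapping can be sketched as a small lookup. This is an illustration of the idea only, not the toolkit's actual `containment(incident_type)`; the type labels, the returned dictionary shape, and the escalate-by-default fallback are assumptions.

```python
# Hypothetical type labels grouped the way the posture table groups them.
FAST_DESTRUCTIVE = {"ransomware", "wiper", "active_exfil"}
STEALTHY = {"apt", "persistent_access", "slow_insider"}

SHORT_TERM_ACTIONS = [
    "isolate host (keep it powered)",
    "disable account and revoke sessions",
    "block C2",
]


def containment_posture(incident_type):
    """Map an incident type to a containment posture and ordering of work."""
    kind = incident_type.strip().lower()
    if kind in FAST_DESTRUCTIVE:
        return {
            "posture": "immediate_aggressive",
            "sequence": ["contain now", "scope in parallel"],
            "actions": SHORT_TERM_ACTIONS,
        }
    if kind in STEALTHY:
        return {
            "posture": "quiet_coordinated",
            "sequence": ["scope thoroughly", "contain everywhere at once"],
            "actions": SHORT_TERM_ACTIONS,
        }
    # Assumption: anything unrecognized goes to the incident commander to decide.
    return {"posture": "escalate_to_ic", "sequence": ["classify first"], "actions": []}


if __name__ == "__main__":
    for kind in ("ransomware", "apt", "unknown_thing"):
        print(kind, "->", containment_posture(kind)["posture"])
```

The actions are the same in both postures; what changes is the ordering of scoping and containment, which is why the function returns a sequence rather than only a label.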

Eradication & Recovery

Eradicate (only as good as your scoping):

  • Wipe and reimage deeply compromised hosts; do not "disinfect" (you can't prove all implants are gone).
  • Remove persistence (scheduled tasks, services, run keys, web shells, rogue accounts).
  • Rotate all accessible credentials; for DC compromise, krbtgt double-reset.
  • Close the initial vector (patch, fix config, remove standing access), or you're waiting for re-entry.

Recover (deliberate, not rushed):

  • Restore from known-good, tested, OFFLINE/immutable backups (the thing ransomware deletes first).
  • Stage by business priority; validate against baseline; heightened monitoring post-recovery.
  • Verify before closure (vector closed, persistence gone, creds rotated, stable).
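
The "verify before closure" step can be sketched as a gate that lists whatever is still unconfirmed. The check names are hypothetical labels for the four conditions above; treating a missing answer as a blocker is a design choice for the example.

```python
# Hypothetical names for the four closure conditions.
CLOSURE_CHECKS = (
    "initial_vector_closed",
    "persistence_removed",
    "credentials_rotated",
    "systems_stable",
)


def closure_blockers(status):
    """Return the checks not yet confirmed True; an empty list means ready to close.

    status: dict of check name -> bool. Missing or non-True values count as blockers.
    """
    return [check for check in CLOSURE_CHECKS if status.get(check) is not True]


if __name__ == "__main__":
    status = {
        "initial_vector_closed": True,
        "persistence_removed": True,
        "credentials_rotated": False,
    }
    print(closure_blockers(status))
```

Requiring an explicit `True` means an unanswered check cannot silently pass, which matches the intent of verifying rather than assuming.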

Tabletop exercise

  • Discussion-based simulation; walk a realistic scenario step by step; no production impact.
  • Cheapest, highest-leverage IR investment. Run regularly (Meridian: quarterly).
  • A tabletop that finds no problems was facilitated too gently — the point is to surface gaps.
  • Use timed injects that force hard decisions (containment, ransom, the clock).

The ransom decision (pre-decided, escalated — never improvised)

  • It is a strategic, legal, ethical, and sanctions-compliance decision, not a technical or merely financial one.
  • Paying sanctioned groups may be unlawful; payment guarantees neither a working decryptor nor deletion of stolen data.
  • Default: recover from backups, do not pay; revisited only by named execs + counsel + insurer if recovery is genuinely impossible.

Communications & the regulatory clock

  • Keep an internal cadence: an IC status update every 30–60 min for SEV-1, even if the update is "no change".
  • Out-of-band channel if the attacker may be in email/Teams.
  • Legal + cyber-insurer are often the first external calls; late insurer notice can void coverage.
  • Banking: notify the federal regulator within 36 hours of determining a qualifying computer-security incident. Healthcare (HIPAA): notify individuals + HHS within 60 days; ≥500 individuals adds media + HHS portal. State breach laws apply in parallel.
  • The clock can start before you fully understand the incident — the determination is itself a deadline-bound decision the IC must drive.
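
The deadline arithmetic behind the regulatory clock is simple enough to sketch. The function names and sample timestamp are assumptions; whether an incident qualifies, and when the clock actually started, remain decisions for the IC and counsel.

```python
from datetime import datetime, timedelta, timezone

# Windows taken from the bullets above.
BANKING_WINDOW = timedelta(hours=36)
HIPAA_WINDOW = timedelta(days=60)


def notification_deadline(clock_start, window):
    """Return the latest notification time for a clock that started at clock_start."""
    if clock_start.tzinfo is None:
        # Naive timestamps invite time-zone mistakes during a multi-site incident.
        raise ValueError("use a timezone-aware timestamp")
    return clock_start + window


def time_remaining(deadline, now):
    """Return the time left before the deadline (negative if already missed)."""
    return deadline - now


if __name__ == "__main__":
    determined_at = datetime(2024, 3, 1, 22, 15, tzinfo=timezone.utc)  # hypothetical
    deadline = notification_deadline(determined_at, BANKING_WINDOW)
    print("banking deadline:", deadline.isoformat())
    print("remaining:", time_remaining(deadline, determined_at + timedelta(hours=30)))
```

Recording the determination time the moment it is made, in UTC, is what makes this calculation trustworthy later.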

Blameless postmortem

  • Ground rule: improve systems, don't assign blame. People don't come to work to cause incidents.
  • Blame → silence → blindness. Blamelessness → truth → durable fixes (and more reporting).
  • The trigger (a click, a missed patch) is rarely the root cause (a systemic gap). Push past it ("five whys" ends at a process/design gap, never a person).
  • Output: timeline → what went well → what went poorly/got lucky → root cause → owned, deadlined, tracked action items. A report with no action items guarantees recurrence.
  • Capture MTTD/MTTR per incident; the trend is the honest measure of maturity (Ch.36).
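
Per-incident timestamps make the MTTD/MTTR trend easy to compute. The sketch assumes MTTD is the mean of (detected − started) and MTTR is the mean of (resolved − detected); organizations define MTTR differently (respond, recover, resolve), so treat those definitions and the sample incidents as assumptions.

```python
from datetime import datetime
from statistics import mean


def mean_hours(incidents, start_key, end_key):
    """Mean elapsed hours between two timestamps across incidents."""
    return mean([(i[end_key] - i[start_key]).total_seconds() / 3600 for i in incidents])


def mttd_hours(incidents):
    # Assumed definition: time from attacker activity starting to detection.
    return mean_hours(incidents, "started", "detected")


def mttr_hours(incidents):
    # Assumed definition: time from detection to resolution.
    return mean_hours(incidents, "detected", "resolved")


# Hypothetical incidents for illustration.
INCIDENTS = [
    {
        "started": datetime(2024, 1, 10, 8, 0),
        "detected": datetime(2024, 1, 12, 8, 0),
        "resolved": datetime(2024, 1, 13, 20, 0),
    },
    {
        "started": datetime(2024, 4, 2, 9, 0),
        "detected": datetime(2024, 4, 2, 21, 0),
        "resolved": datetime(2024, 4, 3, 9, 0),
    },
]

if __name__ == "__main__":
    print(f"MTTD: {mttd_hours(INCIDENTS):.1f} h")
    print(f"MTTR: {mttr_hours(INCIDENTS):.1f} h")
```

A single pair of numbers says little; computing them per quarter and watching the direction is what shows whether the program is maturing.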

Certification crosswalk

Concept CompTIA Security+ (ISC)² CISSP domain
IR lifecycle / phases 4.0 Security Operations (incident response) Security Operations
Event vs. incident; severity 4.0 Security Operations
Playbooks / runbooks 4.0 Security Operations
Containment/eradication/recovery 4.0 Security Operations
Incident commander / IR team / comms 4.0; 5.0 Governance Security Operations; Security & Risk Mgmt
Tabletop / exercises 4.0; 5.0 Security Operations
Breach notification (HIPAA/state/banking) 5.0 Governance, Risk & Compliance Security & Risk Management
Lessons learned / blameless review 4.0 Security Operations

Project additions this chapter

  • Meridian program: IR plan (definitions, SEV-1–SEV-4 matrix, chain of command + incident commander, notification paths incl. 36-hour clock) + starter playbooks (ransomware, phishing/BEC, account compromise, exfil) + supporting runbooks (host isolation, account disable/revoke, krbtgt reset) + out-of-band comms plan + a quarterly ransomware tabletop validating it all.
  • bluekit toolkit: ir.py, with triage(alert) (signals → severity + action) and containment(incident_type) (type → containment posture).

Common pitfalls

  • Storing the IR plan/contacts/runbooks only on the systems they protect (no out-of-band copy).
  • Treating the lifecycle as a strict line instead of an iterating loop.
  • Under-scoping (clean the one host, declare victory) or tipping off a stealthy attacker.
  • A playbook with no runbooks — "isolate the host" with no 3 a.m.-executable procedure.
  • Powering off a host (losing memory) when isolation would have preserved evidence.
  • Improvising the ransom decision under the clock instead of following a pre-written policy.
  • Missing the regulatory clock because legal/GRC were not at the table early.
  • A beautiful postmortem with no owned, deadlined action items — the lesson re-taught at full price.
  • Assuming every incident is loud; the quiet exfiltration breach is often more damaging, and you cannot scope down what you did not log.