Prerequisites

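  • Chapter 21 (the SIEM: log collection, correlation, and alerting)
  • Chapter 2 (the cyber kill chain, MITRE ATT&CK, threat intelligence, and IoCs)
  • Chapter 10 (network visibility: flow and Zeek data)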

Learning Objectives

  • Distinguish alert-driven detection from hypothesis-driven threat hunting, and explain when each is the right tool.
  • Apply the pyramid of pain to choose detections that cost the adversary the most, and distinguish indicator-based from behavior-based detection.
  • Turn threat intelligence and MITRE ATT&CK techniques into deployable detections, and write a Sigma rule from a technique.
  • Run a hypothesis-driven hunt end to end: form a hypothesis, query the data, triage results, and convert a finding into a durable detection.
  • Measure detection coverage against ATT&CK, find the gaps, and reason about false negatives.

Chapter 22: Threat Detection and Hunting: Indicators of Compromise, Threat Intelligence, and Hunting for Adversaries

"The adversary is already inside your network. The only question is whether you have looked." — a working assumption of every mature threat-hunting team

Overview

For nine months in 2020, the most sophisticated intrusion in recent memory sat quietly inside thousands of organizations, and almost no one's alerts fired. The SolarWinds Orion software-update mechanism — trusted, signed, installed everywhere — had been weaponized. The malicious code that came down with a routine update (the industry calls it Sunburst) did almost nothing loud. It waited up to two weeks before doing anything at all. It checked whether it was being watched and went dormant if it was. When it finally reached out to its operators, it did so over HTTPS to domains that looked mundane, on an irregular, jittered schedule designed to blend into the ordinary noise of an enterprise network. There was no exploit to block, no malware signature to match, no failed login to count. By every measure that a signature-and-alert security program is built to catch, nothing was happening.

And yet it was detectable — by the organizations that went looking. The tell was not any single bad indicator. It was a behavior: a server-management product, which has no legitimate business talking to the open internet, was beaconing out to an external domain on a faint, regular rhythm. No rule said "block this." A human had to form a hypothesis — if an adversary were living in our environment, how would their command-and-control traffic look, and where would it hide? — and then go hunt for it in data that no alert had ever flagged. That is the difference between the two halves of this chapter. The first half is detection: building the rules that fire automatically when known-bad things happen. The second half is hunting: looking, proactively and by hand, for the bad things your rules were never written to catch — because a competent adversary's entire job is to do things your rules do not cover.

This chapter sits one step above the SIEM you stood up in Chapter 21. A SIEM collects and correlates logs and fires alerts; this chapter is about what you make it fire on, and about what you do when you accept — as you must — that your alerts will never be complete. You will learn to read threat intelligence and the MITRE ATT&CK framework not as reading material but as a backlog of detections to build. You will write your first Sigma rule, the portable detection format that lets one rule run on any SIEM. You will run a hypothesis-driven hunt against the SolarWinds-style beaconing scenario at Meridian. And you will learn to measure what you cannot see — your detection coverage — because the most dangerous thing in security is the attack you have no chance of catching and do not know it.

In this chapter, you will learn to:

  • Place every detection on the pyramid of pain and prefer the ones that hurt the adversary most.
  • Distinguish indicator-based detection (matching known-bad hashes, IPs, domains) from behavioral detection (matching adversary technique), and know the strengths and failure modes of each.
  • Convert a piece of threat intelligence or an ATT&CK technique into a deployable detection, expressed as a Sigma rule mapped to a technique ID.
  • Run a hypothesis-driven hunt through the full loop — hypothesis → data → query → triage → finding → new detection — using the Meridian beaconing hunt as the worked example.
  • Build a detection coverage map against ATT&CK, find your blind spots, and reason explicitly about false negatives.

Learning Paths

This is a SOC-track chapter at its core — it is where an analyst grows from clearing a queue into actively running down adversaries. Weight it accordingly.

  • 🛡️ SOC Analyst: This entire chapter is yours. §22.2 (the pyramid of pain) and §22.4 (writing Sigma) are the craft of detection engineering; §22.5 (hunting) is the skill that separates senior analysts from ticket-closers. Do the Meridian hunt by hand.
  • 🏗️ Security Engineer: Focus on §22.4: detection-as-code is built and version-controlled like any other software, and you will own the pipeline that ships it. §22.6 (coverage) frames it as an engineering backlog.
  • 📋 GRC: Skim §22.1 and §22.6. "Detection coverage against ATT&CK" is a metric you will report; understanding what it measures (and what it cannot) keeps you honest in front of auditors.
  • 📜 Certification Prep: Threat intelligence, IoCs, the kill chain, ATT&CK, and threat hunting are tested on Security+ and weave through CISSP's Security Operations domain. The key-takeaways.md file maps every term to its exam.


22.1 From alerts to hunting

By the time you reach this chapter you have a SIEM (Chapter 21) ingesting logs and firing alerts, network visibility (Chapter 10) feeding it flow and Zeek data, and a working vocabulary for how attacks unfold — the cyber kill chain and MITRE ATT&CK from Chapter 2. So why is detection a whole discipline of its own, rather than just "write some alert rules and watch the queue"?

Because alert-driven monitoring has a structural blind spot, and naming it precisely is the first move of this chapter. An alert fires when something you anticipated happens. Someone wrote a rule — "more than ten failed logins from one source in a minute," "a connection to this known-bad IP," "this antivirus signature matched" — and the rule encodes a thing the team already knew to look for. This is indispensable and it catches an enormous amount of attack traffic, most of it the indiscriminate, automated kind from Chapter 1. But it can only ever catch what someone already thought to write down. A competent adversary's whole craft is to operate in the space your rules do not cover: to use a credential instead of an exploit, to run a built-in Windows tool instead of malware, to move slowly enough that no threshold trips. Against that adversary, "watch the queue" is a strategy that waits for the attacker to make a mistake you happened to have a rule for.
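The first example rule in the paragraph above, "more than ten failed logins from one source in a minute," can be sketched as a sliding-window counter. This is a minimal illustration rather than any SIEM's implementation: the function name `failed_login_alerts` and the `(timestamp_seconds, source)` event shape are assumptions made for the example.

```python
from collections import defaultdict, deque

def failed_login_alerts(events, threshold=10, window_s=60):
    """Return the sources with MORE than `threshold` failed logins
    inside any sliding window of `window_s` seconds.

    `events` is an iterable of (timestamp_seconds, source) tuples, assumed
    to be sorted by timestamp (a hypothetical, simplified event shape).
    """
    recent = defaultdict(deque)   # source -> timestamps still inside the window
    alerted = set()
    for ts, src in events:
        q = recent[src]
        q.append(ts)
        # Drop timestamps that have aged out of the window.
        while q and ts - q[0] >= window_s:
            q.popleft()
        if len(q) > threshold:
            alerted.add(src)
    return alerted

# A noisy brute force: 20 failures in 20 seconds trips the rule.
noisy = [(t, "203.0.113.7") for t in range(20)]
# A patient attacker: one failure every 10 seconds never exceeds the threshold.
slow = [(t * 10, "198.51.100.9") for t in range(20)]

print(failed_login_alerts(noisy))   # {'203.0.113.7'}
print(failed_login_alerts(slow))    # set()
```

Both sources made the same twenty attempts. Only the pace differs, which is the blind spot the paragraph describes: the rule catches what it anticipated and nothing slower.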

Let us define the two modes cleanly.

Threat detection is the practice of identifying malicious or unauthorized activity in an environment, whether automatically (a rule fires) or through analysis. It is the umbrella term for everything in this chapter. Detection has two broad postures, and a mature program runs both.

The first posture is detection engineering: the disciplined practice of designing, building, testing, and maintaining the automated detections — the rules, signatures, and analytics — that fire alerts without a human in the loop. A detection engineer treats a detection rule the way a software engineer treats code: it is written to a spec (a threat to catch), tested against true and false positives, version-controlled, documented with its rationale and its ATT&CK mapping, and maintained as the environment and the threat change. The output of detection engineering is the alert queue your SOC works. This is reactive in the sense that it waits for the bad thing to happen — but proactive in that the team decided in advance, deliberately, what "bad" looks like.

The second posture is threat hunting: the proactive, human-led search through data for adversary activity that has not triggered any alert. The hunter does not wait for a rule to fire; the hunter assumes — per this chapter's epigraph — that an adversary may already be inside, undetected by the existing rules, and goes looking. Hunting is what you do precisely because detection engineering can never be complete. Every successful hunt that finds something the rules missed is also a gift to detection engineering: it reveals a gap, and a good hunt ends by closing that gap with a new automated detection so the same thing is caught for free next time.

        DETECTION ENGINEERING                 THREAT HUNTING
        (automated, anticipated)              (human-led, proactive)
   ┌─────────────────────────────┐      ┌─────────────────────────────┐
   │ Write rule for a KNOWN bad  │      │ Hypothesis: "IF an adversary │
   │ thing  ──►  rule fires  ──► │      │ were here, THEY would ..."   │
   │ alert  ──►  analyst triages │      │   ──► query the raw data     │
   └─────────────┬───────────────┘      │   ──► investigate results    │
                 │                       │   ──► found something the    │
                 │   feeds the queue     │       rules MISSED?          │
                 ▼                       └───────────┬─────────────────┘
            ALERT QUEUE  ◄───────── new detection ───┘
   (catches the anticipated)        (every hunt finding becomes a rule,
                                     so the gap is closed permanently)

Figure 22.1 — The two postures and the loop between them. Detection engineering catches what you anticipated; hunting finds what you didn't, then hands its findings back to engineering as new rules. The arrow from hunting back to the queue is the most important one: a hunt that doesn't produce a durable detection has only postponed the next miss.

The relationship between the two is the spine of this chapter. Detection engineering raises the floor — it automates everything you already know how to catch, so humans don't waste attention on solved problems. Hunting probes the ceiling — it finds the novel and the stealthy, then converts each discovery into a new piece of floor. A program with only detection engineering is brittle: it catches yesterday's attacks and misses anything new. A program with only hunting doesn't scale: you cannot hand-search for every attack every day. You need both, and you need the loop between them.

🚪 Threshold Concept: Absence of alerts is not evidence of absence of attackers. This is the single most important idea in the chapter, and it reorganizes how you think about a quiet SOC. A silent alert queue can mean you are safe — or it can mean an adversary is operating entirely within your blind spots, exactly as Sunburst did for nine months across thousands of victims whose alerts never fired. The mature defender treats a quiet queue as a hypothesis to test by hunting, not as a verdict of safety. You earn the right to believe you are not compromised by looking for compromise and not finding it — not by waiting and seeing nothing.

🔄 Check Your Understanding:

  1. In one sentence each, distinguish detection engineering from threat hunting. What is the single most important output of a successful hunt?
  2. Why is "our alert queue is quiet, so we're secure" a dangerous inference? Tie your answer to the SolarWinds timeline.

Answers

  1. Detection engineering builds and maintains the automated rules that fire alerts on anticipated bad activity; threat hunting is the proactive, human-led search for adversary activity that no rule caught. The most important output of a hunt is a new detection that closes the gap the hunt revealed, so the same activity is caught automatically next time.
  2. Because a quiet queue can equally mean "no attacker" or "an attacker operating entirely inside your blind spots": alerts only fire on what you anticipated. Sunburst sat undetected for roughly nine months in thousands of environments whose queues were quiet the entire time; their silence was not safety.

22.2 IoCs and the pyramid of pain

To detect adversaries you match on the traces they leave — but not all traces are equal, and understanding why is the most important conceptual tool a detection engineer owns. We met the indicator of compromise (IoC) back in Chapter 2: a forensic artifact that suggests a system has been compromised — a file hash of known malware, a malicious IP address or domain, a registry key a backdoor creates, a distinctive string in a payload. IoCs are the raw material of indicator-based detection: you collect known-bad indicators (from your own incidents, from threat intelligence, from the community) and you alert when one appears in your telemetry. 🔗 Connection: this is exactly the ioc_match function you will build in this chapter's Project Checkpoint — given an event and a set of known-bad indicators, does the event contain one?
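Indicator matching as described above reduces to a set-membership test. The sketch below is one possible shape for such a function, not the Project Checkpoint's required signature: the flat-dict event, the field names, and the example indicators (reserved documentation addresses and a `.example` domain) are all assumptions for illustration.

```python
def ioc_match(event, indicators):
    """Return the sorted list of known-bad indicators found among the event's values.

    `event` is a dict of field -> value and `indicators` is a set of known-bad
    strings (hashes, IPs, domains). Both shapes are assumptions for this sketch.
    An empty list means no match.
    """
    values = {str(v).strip().lower() for v in event.values()}
    bad = {i.strip().lower() for i in indicators}
    return sorted(values & bad)

indicators = {"198.51.100.23", "evil-updates.example"}

event = {
    "src_ip": "10.0.4.17",
    "dest_ip": "198.51.100.23",
    "dest_host": "Evil-Updates.example",   # case differs from the indicator
}
clean = {"src_ip": "10.0.4.17", "dest_ip": "192.0.2.80", "dest_host": "intranet.example"}

print(ioc_match(event, indicators))   # ['198.51.100.23', 'evil-updates.example']
print(ioc_match(clean, indicators))   # []
```

Lower-casing both sides is a small design choice that matters in practice: domains and hex hashes arrive in mixed case, and an exact-string comparison would miss them.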

Indicator matching is fast, cheap, and precise, and you should absolutely do it. But it has a brutal limitation, and a security researcher named David Bianco captured that limitation in a model that every hunter internalizes: the pyramid of pain. The pyramid ranks the types of indicator you can detect on by how much pain it causes the adversary when you deny them that indicator — equivalently, how hard it is for the adversary to change it and carry on. The insight is that the cheap, easy indicators are also the ones the adversary can swap out in seconds, while the expensive, hard ones force the adversary to retool, retrain, or rethink their whole operation.

Here is the pyramid, from the indicators that cost the adversary nothing to change (bottom) to the ones that cost them dearly (top):

                          ▲  MOST pain to the adversary
                         ╱ ╲     (hardest for them to change)
                        ╱TTPs╲          ── Tactics, Techniques & Procedures:
                       ╱───────╲           HOW they operate. Change = relearn
                      ╱  Tools   ╲          their craft. ◄── HUNT HERE
                     ╱─────────────╲     ── Attacker software/utilities.
                    ╱ Network/Host   ╲       Change = rebuild/find new tools.
                   ╱   Artifacts      ╲   ── Registry keys, file paths, user-agents,
                  ╱─────────────────────╲     C2 URI patterns. Annoying to change.
                 ╱     Domain Names      ╲ ── Mildly annoying: re-register, but cheap.
                ╱─────────────────────────╲── IP Addresses: trivial — new server/proxy.
               ╱        Hash Values        ╲─ Trivial: flip one byte, new hash.
              ╱─────────────────────────────╲
             ──────────────────────────────────  LEAST pain (adversary changes freely)

Figure 22.2 — The pyramid of pain (after David Bianco). Detections built on the bottom layers are cheap to write and cheap for the adversary to evade — you'll be playing whack-a-mole forever. Detections built on the top layers are hard to write but force the adversary to fundamentally change how they operate. A program that only matches hashes and IPs is, by this model, optimizing for the adversary's convenience.

Walk up the pyramid and feel the cost shift.

  • Hash values (bottom). A cryptographic hash of a malicious file is a perfect, unambiguous indicator — and worthless the moment the adversary recompiles or changes a single byte, which produces a completely different hash for free. You will match hashes (it costs you nothing and catches the lazy and the unchanged), but you cannot rely on them.
  • IP addresses. A known-bad command-and-control IP is slightly more durable, but an adversary spins up a new server or routes through a new proxy or a cloud provider in minutes. Blocking an IP is a speed bump.
  • Domain names. A malicious domain costs a few dollars and a few minutes to register a replacement, though it is marginally more painful than an IP because of the registration and DNS propagation friction.
  • Network and host artifacts. Now it gets interesting. A distinctive URI pattern the malware uses to phone home, a peculiar User-Agent string, a registry key a backdoor creates, a named pipe, a specific file path — these are baked into the tooling. Changing them means changing the tool, which is real work. This is the first layer where you start to actually inconvenience the adversary.
  • Tools. Detecting the adversary's tools — by their behavior, their on-disk characteristics, the artifacts they generate regardless of cosmetic changes — forces the adversary to find or build a different tool, which can cost them weeks and money.
  • TTPs (top) — tactics, techniques, and procedures, the ATT&CK vocabulary from Chapter 2. This is how the adversary operates: that they dump credentials from memory, that they use a particular living-off-the-land binary for execution, that they establish persistence a certain way, that their C2 beacons on a regular interval. To evade a detection built on a technique, the adversary cannot swap a value — they must change the way they do their job, relearn a new method, and risk that the new method is detected too. This is the most painful thing you can do to them, and it is exactly what behavioral detection targets.
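The bottom layer's weakness is easy to demonstrate. The sketch below uses a made-up byte string as a stand-in for a malicious file (an assumption, since no real sample is involved) and shows that changing a single bit yields a hash the blocklist has never seen.

```python
import hashlib

# A stand-in for a malicious file; the bytes are invented for this example.
original = b"MZ\x90\x00 pretend this is a malicious binary"

# Flip the lowest bit of the final byte, standing in for a recompile or a
# padding change. The file is the same length but no longer byte-identical.
patched = original[:-1] + bytes([original[-1] ^ 0x01])

h1 = hashlib.sha256(original).hexdigest()
h2 = hashlib.sha256(patched).hexdigest()

blocklist = {h1}            # a hash-layer detection for the original sample

print(h1 in blocklist)      # True: the unchanged sample is caught
print(h2 in blocklist)      # False: the one-bit variant slips past
```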

This brings us to the central distinction of modern detection. Behavioral detection (also called behavior-based or anomaly-aware detection) identifies adversary activity — the technique, the pattern, the sequence of actions — rather than a specific static indicator. Instead of "alert on hash 5f4dcc3b...," a behavioral detection says "alert when a process spawned by a web server launches a command shell," or "alert when a server-management application makes an outbound connection to the internet on a regular interval." These detections survive the adversary changing their malware, their IP, and their domain, because they fire on the behavior that the adversary's goal requires them to perform. They are harder to write, and they generate more false positives that you must tune away — but they sit at the top of the pyramid where the pain lives.
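The first behavioral example in the paragraph above, "alert when a process spawned by a web server launches a command shell," might be sketched as a predicate over process-creation events. The field names (`parent_image`, `image`, `sha256`) and the short process lists are illustrative assumptions, not a complete or authoritative inventory.

```python
# Illustrative, incomplete lists; a real deployment would tune these.
WEB_SERVERS = {"w3wp.exe", "httpd.exe", "nginx.exe", "httpd", "nginx"}
SHELLS = {"cmd.exe", "powershell.exe", "sh", "bash"}

def basename(path):
    """Final path component, lower-cased, for Windows or POSIX style paths."""
    return path.replace("\\", "/").rsplit("/", 1)[-1].lower()

def web_server_spawned_shell(event):
    """True when the parent process is a web server and the child is a shell."""
    return (basename(event["parent_image"]) in WEB_SERVERS
            and basename(event["image"]) in SHELLS)

# Two intrusions using different payloads (different hashes), same behavior.
first = {"parent_image": r"C:\Windows\System32\inetsrv\w3wp.exe",
         "image": r"C:\Windows\System32\cmd.exe", "sha256": "aaaa"}
second = {"parent_image": "/usr/sbin/nginx",
          "image": "/bin/bash", "sha256": "bbbb"}
benign = {"parent_image": r"C:\Windows\explorer.exe",
          "image": r"C:\Windows\System32\cmd.exe", "sha256": "cccc"}

print(web_server_spawned_shell(first))    # True
print(web_server_spawned_shell(second))   # True
print(web_server_spawned_shell(benign))   # False
```

The predicate never reads the `sha256` field, so recompiling the payload changes nothing about whether it fires.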

🛡️ Defender's Lens: The SolarWinds Sunburst operators were experts at the bottom of the pyramid. They changed hashes (custom malware, recompiled), used fresh domains and IPs, and blended their C2 into normal traffic — defeating every indicator-based detection by design. But they could not change one thing without abandoning their objective: a server-management product had to reach out to their infrastructure to receive commands. That outbound-beacon-from-a-thing-that-should-never-beacon is a TTP — top of the pyramid — and it is what the organizations that caught Sunburst hunted for. When you cannot win at the bottom of the pyramid, you climb it.

⚠️ Common Pitfall: Building a detection program that is entirely IoC feeds. It is seductive: threat-intel vendors sell you millions of indicators, you load them into the SIEM, dashboards light up, and it looks like coverage. But an all-indicator program is, by the pyramid, optimized for the adversary's convenience — it catches commodity malware and reused infrastructure while a targeted adversary sails through by changing values that cost them nothing. Indicators are the floor of detection, not the ceiling. Spend your scarce engineering time climbing toward behavior and TTPs.

🔄 Check Your Understanding:

  1. An adversary's malware is detected by its file hash, so they recompile it. Which layer of the pyramid did your detection live on, and why was it so cheap to evade? Where would you move the detection to make evasion painful?
  2. Give one concrete example of a behavioral (TTP-level) detection for a credential-dumping attack, phrased as "alert when …". Why does it survive the adversary changing their tooling?

Answers

  1. The detection lived on the hash layer (the bottom), which is trivial to evade because recompiling or changing one byte yields a brand-new hash for free, while the malware's behavior is unchanged. Move the detection up to a TTP/behavioral level (e.g., alert on the behavior the malware performs, such as the way it establishes persistence or beacons) so that evasion requires changing how the adversary operates, not just rebuilding a file.
  2. For example: "alert when a process reads the memory of the Windows LSASS process." Credential dumping requires accessing LSASS memory regardless of which tool does it, so changing tools doesn't evade the detection; the adversary would have to change how they obtain credentials, which is far more costly.

22.3 Threat intelligence that drives detections

You cannot detect a threat you have never heard of and cannot describe. Threat intelligence — defined in Chapter 2 as evidence-based knowledge about adversaries, their capabilities, and their intentions — is the raw input that tells detection engineering and hunting what to look for. 🔗 Connection: we use Chapter 2's definitions of threat intelligence and IoCs freely here; this chapter is about turning that knowledge into deployed detections rather than re-defining it. The danger is treating intelligence as a stream of articles to read. Intelligence that does not change a detection, a hunt, or a decision is, for an operational team, just news.

It helps to know the three tiers of threat intelligence, because they drive different defensive actions:

  • Strategic intelligence is high-level: which adversaries target your sector, their motivations, the geopolitics. It informs the CISO's risk decisions and what you prioritize. ("Financial-sector institutions are being targeted by a group using supply-chain compromise" is strategic.)
  • Operational intelligence is campaign-level: the TTPs of a specific adversary or campaign, mapped to ATT&CK. ("This group establishes persistence via technique T1543, dumps credentials via T1003, and beacons over HTTPS with a 30-minute jittered interval.") This is the gold for detection engineering — it tells you which behaviors to write rules for.
  • Tactical intelligence is the indicators: specific hashes, IPs, domains, URLs from a campaign. ("These five domains and three hashes are associated with the campaign.") This feeds your indicator matching — useful, but bottom-of-the-pyramid and short-lived.

The practical machine that manages all of this is a threat intelligence platform (TIP): a system that aggregates threat-intelligence feeds from many sources (commercial vendors, open-source feeds, government advisories like CISA's, industry sharing groups, and your own past incidents), normalizes and de-duplicates the indicators, scores them for confidence and relevance, enriches them with context, and pushes the operational ones out to your detection tooling — your SIEM, your firewalls, your EDR. A TIP is to threat intelligence roughly what a SIEM is to logs: the aggregation-and-distribution hub that turns a flood of raw feeds into something a team can actually operationalize. The community has also built standard formats so intelligence can be machine-shared: STIX (a structured language for describing threats — actors, campaigns, indicators, relationships) and TAXII (the protocol for transporting STIX between organizations and platforms). When CISA or a sector ISAC (Information Sharing and Analysis Center) publishes indicators for an active campaign, they often publish them as STIX so your TIP can ingest them automatically.
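The normalize, de-duplicate, and filter steps a TIP performs can be sketched in a few lines. This is a toy under stated assumptions: each feed item is a dict with `value`, `confidence` (0 to 100), and `source`, the confidence score is assumed to arrive with the feed, and the cut-off of 70 is arbitrary. Real platforms also score relevance and enrich indicators with context, which this sketch omits.

```python
def normalize(value):
    """Lower-case, trim, and undo common 'defanging' (hxxp, [.])."""
    v = value.strip().lower().replace("[.]", ".")
    if v.startswith("hxxp"):
        v = "http" + v[4:]
    return v

def consolidate(feed_items, min_confidence=70):
    """De-duplicate indicators across feeds, keeping the highest confidence
    seen and every contributing source, then drop low-confidence entries."""
    merged = {}
    for item in feed_items:
        key = normalize(item["value"])
        entry = merged.setdefault(
            key, {"value": key, "confidence": 0, "sources": set()})
        entry["confidence"] = max(entry["confidence"], item["confidence"])
        entry["sources"].add(item["source"])
    return [e for e in merged.values() if e["confidence"] >= min_confidence]

feed = [
    {"value": "Evil-Updates[.]example", "confidence": 60, "source": "vendor-a"},
    {"value": "evil-updates.example ",  "confidence": 85, "source": "isac"},
    {"value": "203.0.113.50",           "confidence": 30, "source": "vendor-a"},
]

for indicator in consolidate(feed):
    print(indicator["value"], indicator["confidence"], sorted(indicator["sources"]))
# evil-updates.example 85 ['isac', 'vendor-a']
```

Three raw feed items collapse to one actionable indicator: the two spellings of the domain merge, and the low-confidence IP is filtered out before it ever reaches the SIEM.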

The crucial discipline — and the place new teams go wrong — is the pipeline from intelligence to detection. Receiving intelligence is not detecting. The loop is:

  1. Ingest the intelligence (a feed, an advisory, a report on a campaign hitting your sector).
  2. Triage it for relevance: does this adversary or technique plausibly target an organization like yours, given your assets and exposure? An advisory about an industrial-control-system worm matters less to a bank than an advisory about a financial-sector credential-phishing campaign. Irrelevant intelligence is filtered out, deliberately, so it does not drown the signal.
  3. Extract what is actionable. From a campaign report, pull both the indicators (for matching) and — more valuably — the techniques (mapped to ATT&CK), which become behavioral-detection or hunt candidates.
  4. Deploy or hunt. Turn each actionable item into one of two things: a detection (a rule that fires automatically — covered in §22.4) or a hunt hypothesis (a one-time proactive search — covered in §22.5). Indicators usually become detections; techniques can become either, depending on whether you can express them as a reliable rule.
  5. Feed back. Whatever you learn — a true positive, a false positive, a new variant — flows back to refine your intelligence and your detections.
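Steps 2 through 4 of the loop above can be sketched as a small routing function. Everything here is hypothetical scaffolding for illustration: the report shape, the sector labels, and the `rule_ready` set (techniques the team can already express as a reliable rule) are assumptions, and real triage involves analyst judgment that a set intersection cannot capture.

```python
def plan_actions(report, our_sectors, rule_ready):
    """Map one intelligence report to (action, item) pairs.

    Step 2 (triage): skip reports that do not target a sector we belong to.
    Step 3 (extract): read indicators and ATT&CK technique IDs from the report.
    Step 4 (deploy or hunt): indicators become detections; a technique becomes
    a detection only if it is in `rule_ready`, otherwise a hunt hypothesis.
    """
    if not (report["sectors"] & our_sectors):
        return []                      # irrelevant to us: filtered out on purpose
    actions = [("detection", ioc) for ioc in report["indicators"]]
    for technique in report["techniques"]:
        kind = "detection" if technique in rule_ready else "hunt"
        actions.append((kind, technique))
    return actions

bank = {"financial"}
campaign = {
    "sectors": {"financial"},
    "indicators": ["c2-one.example"],
    "techniques": ["T1195.002", "T1071.001"],
}
ics_worm = {"sectors": {"energy"}, "indicators": ["plc-c2.example"], "techniques": []}

print(plan_actions(campaign, bank, rule_ready={"T1071.001"}))
# [('detection', 'c2-one.example'), ('hunt', 'T1195.002'), ('detection', 'T1071.001')]
print(plan_actions(ics_worm, bank, rule_ready=set()))
# []
```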

📟 War Story: A constructed but representative example. A regional bank's threat-intel analyst receives, through a financial-sector ISAC, a report that a named adversary is compromising software-update mechanisms to gain initial access, then beaconing out over HTTPS on a jittered interval. The lazy response is to load the report's three listed C2 domains into the SIEM as a blocklist and call it done — bottom-of-the-pyramid, and the domains are already burned by the time the report is public. The mature response is to read the report for techniques: "compromise of a trusted update mechanism" and "jittered HTTPS beaconing from servers." Those become a hunt hypothesis ("which of our server-class hosts are making regular outbound connections to the internet?") and, once validated, a durable behavioral detection. The indicators get loaded too — they cost nothing — but the team knows they are the least valuable thing in the report.

⚖️ Authorization & Ethics: Threat intelligence is shared in a community built on trust, and much of it carries handling restrictions — most commonly the Traffic Light Protocol (TLP), which tags information RED/AMBER/GREEN/CLEAR to govern how widely you may share it. TLP:RED means "for your eyes only, do not redistribute"; TLP:CLEAR means "share freely." Honoring these markings is not bureaucracy — it is what keeps the sharing ecosystem functioning, because a source whose RED intelligence leaks will stop sharing. Treat received intelligence with the discretion its marking requires, and never post indicators from a restricted feed to a public site.

🔄 Check Your Understanding:

  1. A vendor feed delivers two million indicators a day. Why is loading all of them into your SIEM as alert rules a mistake, and what should a TIP do with that feed instead?
  2. From the three tiers of intelligence (strategic / operational / tactical), which is most valuable to a detection engineer, and why?

Answers

  1. Loading two million indicators as alert rules drowns the SOC in low-value, short-lived, bottom-of-the-pyramid matches and false positives, and most of those indicators are irrelevant to your environment. A TIP should normalize and de-duplicate the feed, score indicators for confidence and relevance to your organization, filter out the irrelevant, enrich the rest with context, and push only the high-confidence, relevant ones to your tooling.
  2. Operational intelligence (the campaign-level TTPs mapped to ATT&CK) is most valuable to a detection engineer, because techniques sit near the top of the pyramid of pain and yield durable behavioral detections, whereas tactical indicators are short-lived and strategic intelligence informs priorities rather than specific rules.

22.4 Detection engineering with Sigma and ATT&CK

Now we build. Detection engineering is where threat intelligence and ATT&CK techniques become rules that actually fire. The problem every SOC hits immediately is portability: a detection written in Splunk's SPL doesn't run on Microsoft Sentinel's KQL, which doesn't run on Elastic's query DSL. Write a rule three times, maintain it three times, and watch the three copies drift apart. The community's answer is Sigma.

A Sigma rule is a generic, vendor-agnostic detection written in YAML that describes a detection logically — what fields, what values, what condition — independent of any specific SIEM's query language. A tool called a Sigma "backend" then converts the rule into the native query language of whatever SIEM you run. Write once in Sigma, compile to SPL for Splunk, KQL for Sentinel, Lucene for Elastic. Sigma is, in the slogan the community uses, "the common language for detections" — the analog of what Snort signatures are for network IDS or what YARA (which we meet at the end of this section) is for files. Because a Sigma rule is just a text file, it lives in version control, gets code-reviewed, and ships through a pipeline — this is what people mean by detection-as-code (a term from Chapter 21): detections managed with the same rigor and tooling as software.

Let us read a real Sigma rule, because the anatomy is learnable in one example. This rule detects a classic technique: a Microsoft Office application spawning a command interpreter (cmd.exe or powershell.exe), the signature of a malicious macro running, which has launched countless intrusions.

title: Office Application Spawning a Command Shell
id: 9b8c1d2e-3f4a-5b6c-7d8e-9f0a1b2c3d4e   # stable UUID, generated once
status: stable
description: >
  Detects a Microsoft Office application (Word, Excel, PowerPoint, Outlook)
  spawning a command interpreter — a strong signal of malicious macro execution.
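references:
  - https://attack.mitre.org/techniques/T1059/003/
  - https://attack.mitre.org/techniques/T1059/001/
author: Meridian SOC (Theo Brandt)
date: 2026/05/01
tags:
  - attack.execution
  - attack.t1059.003          # ATT&CK: Command and Scripting Interpreter: Windows Command Shell
  - attack.t1059.001          # ATT&CK: Command and Scripting Interpreter: PowerShell (the selection also matches powershell.exe)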
logsource:
  category: process_creation
  product: windows
detection:
  selection:
    ParentImage|endswith:
      - '\winword.exe'
      - '\excel.exe'
      - '\powerpnt.exe'
      - '\outlook.exe'
    Image|endswith:
      - '\cmd.exe'
      - '\powershell.exe'
  condition: selection
falsepositives:
  - Rare legitimate macros or document-automation tools that shell out.
level: high

Read it top to bottom and the structure reveals itself:

  • Metadata (title, id, status, description, author, date): the identity and provenance of the rule. The id is a stable UUID so the rule can be referenced and tracked even if the title changes.
  • references and tags: the rule is mapped to ATT&CK. The tag attack.t1059.003 ties this detection to the real technique T1059.003 — Command and Scripting Interpreter: Windows Command Shell. This mapping is not decoration: it is how you later measure coverage (§22.6). Every detection should carry its technique IDs.
  • logsource: what data this rule reads — here, Windows process-creation events (the kind you collect via Sysmon or native Windows auditing). The Sigma backend uses this to know which index/table to query in your SIEM.
  • detection: the logic. A named selection block lists field-to-value conditions (the parent process ends with an Office executable AND the spawned process is a shell), and the condition combines selections with boolean logic. Here it is simply selection, but real rules combine multiple blocks (e.g., selection and not filter).
  • falsepositives and level: honesty and severity. The rule author documents the benign cases that might trip it (some legitimate macros shell out) and sets a severity. This is what makes a detection maintainable: the next analyst knows what false positives to expect and how urgent a hit is.

That single rule, compiled by a Sigma backend, becomes a working query on whatever SIEM Meridian runs. And because it is tagged with its technique, it counts toward Meridian's coverage of ATT&CK Execution.
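To make the selection semantics concrete, here is a minimal evaluator for the Office rule's logic, written as a sketch and not as how any Sigma backend works internally. It assumes events arrive as flat dicts, handles only the `endswith` modifier, and compares case-insensitively, which is Sigma's default for string values. The function name `matches_selection` is hypothetical.

```python
def matches_selection(event, selection):
    """Evaluate a Sigma-style selection against one event.

    Semantics mirrored here: every field in the selection must match (AND),
    and a list of values under one field means any value may match (OR).
    Only the `endswith` modifier is supported in this sketch.
    """
    for key, values in selection.items():
        field, _, modifier = key.partition("|")
        if modifier != "endswith":
            raise ValueError(f"unsupported modifier: {modifier!r}")
        actual = str(event.get(field, "")).lower()
        if not any(actual.endswith(v.lower()) for v in values):
            return False
    return True

selection = {
    "ParentImage|endswith": ["\\winword.exe", "\\excel.exe",
                             "\\powerpnt.exe", "\\outlook.exe"],
    "Image|endswith": ["\\cmd.exe", "\\powershell.exe"],
}

macro = {"ParentImage": r"C:\Program Files\Microsoft Office\root\Office16\WINWORD.EXE",
         "Image": r"C:\Windows\System32\cmd.exe"}
benign = {"ParentImage": r"C:\Windows\explorer.exe",
          "Image": r"C:\Windows\System32\cmd.exe"}

print(matches_selection(macro, selection))    # True
print(matches_selection(benign, selection))   # False
```

The leading backslash in each value is doing real work: it anchors the match to a whole file name, so a binary named `notcmd.exe` does not satisfy `\cmd.exe`.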

🧩 Try It in the Lab: In your own lab SIEM (or just on paper), take the rule above and extend it. Add a second selection block that excludes a known-good document-automation service account, so the rule fires only when the shell-spawning is not that service. Express it as condition: selection and not filter. Then write the rule's falsepositives note for your new version. You have just done the core loop of detection engineering: write the logic, then tune it against the benign reality of your environment. (Use only your own lab.)

Now, the worked example that matters for this chapter — turning the SolarWinds beaconing behavior into a detection. Recall the TTP from §22.2: a server-management product reaching out to the internet on a regular, jittered interval. We cannot write a clean process-creation rule for this; beaconing is a network-behavioral pattern, and it is a perfect illustration of a detection that lives at the top of the pyramid and therefore takes more than a simple field match. Here is the logic expressed as a Sigma-style network rule against proxy/Zeek connection logs:

title: Regular Outbound Beacon from a Server-Class Host
id: 1a2b3c4d-5e6f-7a8b-9c0d-1e2f3a4b5c6d
status: experimental
description: >
  Detects a host in the server tier making repeated outbound HTTPS
  connections to a single external destination at a near-regular interval —
  the signature of automated command-and-control (C2) beaconing.
references:
  - https://attack.mitre.org/techniques/T1071/001/   # Application Layer Protocol: Web Protocols
  - https://attack.mitre.org/techniques/T1573/        # Encrypted Channel
tags:
  - attack.command_and_control
  - attack.t1071.001
  - attack.t1573                       # matches the Encrypted Channel reference above
logsource:
  category: proxy            # or Zeek conn.log / NetFlow
  product: zeek
detection:
  selection:
    src_zone: 'server'                 # source is in the server tier
    dest_category: 'external'          # destination is outside the org
    dest_port: 443
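  timing:
    # Behavioral condition, evaluated by a separate correlation/analytics job
    # (not by the Sigma field matcher): for each (src_host, dest_host) pair,
    # COUNT(connections) >= 12 over 6h AND low variance in inter-arrival time
    # (jittered but still regular).
  # Pseudo-syntax: Sigma conditions cannot call scoring functions; the
  # analytics job computes beacon_score and compares it with its threshold.
  condition: selection | beacon_score(timestamps) >= threshold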
falsepositives:
  - Legitimate software update checks, telemetry, certificate-revocation
    checks, and SaaS keep-alives — which is why context (a SERVER beaconing
    to a NEW external host) matters more than the beacon alone.
level: high

Notice how this rule differs from the Office rule: pure Sigma field-matching can select the candidate connections, but the beaconing pattern itself ("many connections at a regular interval") is a statistical property over many events, which is why the condition invokes a beacon_score over timestamps. This is exactly the beacon_score(timestamps) function you built in Chapter 10's pktflow.py. 🔗 Connection: the network-monitoring layer from Chapter 10 produces the flow data; this chapter's detection logic consumes it. The behavioral detection is a collaboration between a selection (narrow to server-tier-to-external connections) and an analytic (score the regularity). This is the shape of most top-of-pyramid detections: a cheap filter to cut the data down, then an expensive analytic on what remains.
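As a reminder of what the analytic half might look like, here is one possible regularity score, sketched in Python. This is a hypothetical stand-in, not necessarily the beacon_score you wrote in Chapter 10: it assumes timestamps are in seconds, scores regularity as one minus the coefficient of variation of the gaps between connections, and borrows the minimum of 12 connections from the rule's comment.

```python
from statistics import mean, pstdev


def beacon_score(timestamps: list[float], min_events: int = 12) -> float:
    """Return a regularity score in [0, 1]; higher means more beacon-like.

    Assumed definition: 1 - (stddev of gaps / mean gap), floored at 0.
    """
    if len(timestamps) < min_events:
        return 0.0                       # too few events to judge regularity
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average_gap = mean(gaps)
    if average_gap <= 0:
        return 0.0                       # every event shares one timestamp
    relative_jitter = pstdev(gaps) / average_gap
    return max(0.0, 1.0 - relative_jitter)


# A perfectly regular 120-second beacon (synthetic timestamps).
regular = [i * 120.0 for i in range(20)]
# Bursty, human-like activity (made-up values for contrast).
bursty = [0, 2, 3, 50, 51, 400, 405, 410, 2000, 2001, 2003, 9000]

print(beacon_score(regular))  # 1.0
print(beacon_score(bursty))   # 0.0
```

Dividing by the mean gap makes the score independent of the beacon interval, so a 2-minute beacon and a 1-hour beacon with the same relative jitter score alike. A jittered beacon scores below 1.0, and how far below you still alert is the threshold the rule's condition refers to.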

Finally, a brief introduction to YARA, because it is the file-and-memory counterpart to Sigma's log focus. YARA is a pattern-matching tool and rule language for identifying and classifying files (and process memory) by their content — you write rules describing byte strings, text strings, and structural conditions, and YARA scans files or memory and reports matches. Where Sigma rules run against logs in your SIEM, YARA rules run against files and memory on endpoints and in malware analysis. A YARA rule for a piece of malware might say "match any file containing these three distinctive strings AND this byte sequence AND under 2 MB." YARA sits a little higher on the pyramid than a raw hash, because a good YARA rule keys on characteristics the malware author finds hard to change (a unique algorithm, a distinctive set of strings) rather than the whole-file hash that changes with one edited byte. We introduce it here as part of the detection-engineering toolkit and as a hunting aid; malware analysis proper is beyond our defensive scope, but knowing that YARA is "Sigma for files" completes your mental map of the detection-rule landscape.

   WHAT IT MATCHES        RULE LANGUAGE     RUNS AGAINST
   ──────────────         ─────────────     ───────────────────────────
   Log events / behavior   Sigma            SIEM (compiled to SPL/KQL/...)
   Network packets         Snort/Suricata   IDS/IPS sensor (Ch.7)
   Files & process memory  YARA             Endpoints, malware sandbox

Figure 22.3 — The three detection-rule families and where each runs. Sigma is the focus of this chapter (log/behavioral detection); Snort/Suricata you met with IDS in Chapter 7; YARA is the file/memory counterpart introduced here. A mature detection program uses all three, each on the data it is built for.
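The YARA-style condition described above ("these strings AND this byte sequence AND under 2 MB") can be imitated in a few lines of Python to show why it outlives a hash. This sketch does not use YARA or its rule language; the function name and every marker value are invented for the demonstration and come from no real sample.

```python
import hashlib

TWO_MB = 2 * 1024 * 1024


def yara_like_match(data: bytes, strings: list[bytes], byte_seq: bytes,
                    max_size: int = TWO_MB) -> bool:
    """True if data is under max_size and holds every string plus byte_seq."""
    return (len(data) < max_size
            and byte_seq in data
            and all(marker in data for marker in strings))


# Entirely made-up markers, for demonstration only.
MARKERS = [b"example-mutex-01", b"example-user-agent", b"example-config-key"]
BYTE_SEQ = bytes.fromhex("deadbeef0102")

sample = b"\x00" * 16 + b"".join(MARKERS) + BYTE_SEQ
edited = sample + b"\x01"            # one extra byte: new hash, same traits
clean = b"just an ordinary file"

same_hash = hashlib.sha256(sample).digest() == hashlib.sha256(edited).digest()
print(same_hash)                                     # False
print(yara_like_match(sample, MARKERS, BYTE_SEQ))    # True
print(yara_like_match(edited, MARKERS, BYTE_SEQ))    # True
print(yara_like_match(clean, MARKERS, BYTE_SEQ))     # False
```

The one-byte edit defeats the hash comparison but not the content match, which is the sense in which a content rule sits higher on the pyramid than a raw hash.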

🔄 Check Your Understanding: 1. What problem does Sigma solve that writing rules directly in SPL or KQL does not? What does the tags: field with attack.t1059.003 enable downstream? 2. Why can't the SolarWinds-style beaconing detection be written as a pure field-match Sigma rule, and what extra ingredient does it require?

Answers

  1. Sigma solves portability and maintainability: one rule written in vendor-agnostic YAML compiles to any SIEM's query language, so you write and maintain it once instead of re-implementing it per platform — and as a text file it lives in version control as detection-as-code. The attack.t1059.003 tag maps the detection to a specific ATT&CK technique, which enables coverage measurement (§22.6) — you can ask "which techniques do our detections cover?" only if the detections are tagged.
  2. Beaconing is a statistical property over many events (many connections at a regular/jittered interval), not a single field value, so a field-match rule can only select candidate connections; detecting the beacon itself requires a behavioral analytic — a beacon_score over the connection timestamps — layered on top of the selection.

22.5 Hypothesis-driven hunts

Detection engineering automates what you can anticipate. Now we do the other half — we go looking for what we didn't anticipate. Threat hunting, defined in §22.1, is the proactive, human-led search for adversary activity that no alert caught. The mature form of it is hypothesis-driven hunting: hunting structured around a specific, testable hypothesis about adversary behavior, rather than aimlessly browsing logs hoping something jumps out. The hypothesis is what makes a hunt rigorous, repeatable, and — crucially — bounded, so it ends in a defensible conclusion instead of dragging on forever.

A good hunt hypothesis has a particular shape. It is specific (names a behavior, not a vibe), testable (you can confirm or refute it with data you actually have), and grounded (motivated by threat intelligence, an ATT&CK technique, a gap in your detections, or an anomaly). The template:

"If an adversary were performing [technique X] in our environment, we would expect to see [observable Y] in [data source Z]. Let's look."

Compare a vague hunt ("let's look for anything weird in the proxy logs") to a hypothesis-driven one ("if an adversary had compromised a server and were running C2, we would expect to see a server-tier host making regular, jittered outbound HTTPS connections to a single external destination — let's query Zeek conn.log for that pattern"). The second is a hunt; the first is a hobby. The hypothesis tells you exactly what data to pull, exactly what pattern to look for, and exactly what result confirms or refutes it.

The hunt loop has six steps, and the Meridian beaconing hunt walks through all of them:

   ┌──────────────────────────────────────────────────────────────────┐
   │  THE HUNT LOOP                                                     │
   │                                                                    │
   │  1. HYPOTHESIZE   ── grounded in intel / ATT&CK / a gap            │
   │        │                                                           │
   │  2. SCOPE DATA    ── which data source answers it? do we have it?  │
   │        │                                                           │
   │  3. QUERY/ANALYZE ── pull the data, look for the predicted pattern │
   │        │                                                           │
   │  4. TRIAGE        ── investigate hits: malicious, benign, unknown? │
   │        │                                                           │
   │  5. CONCLUDE      ── confirm / refute / inconclusive (+ data gap?) │
   │        │                                                           │
   │  6. OPERATIONALIZE ─ turn a finding into a DURABLE DETECTION,      │
   │        └──────────── and note any visibility gap to fix            │
   └──────────────────────────────────────────────────────────────────┘

Figure 22.4 — The hypothesis-driven hunt loop. Step 6 is what distinguishes a hunt from a one-time investigation: the hunt's output is not just "did we find the adversary?" but a permanent improvement to the program — a new detection if you found something, or a documented visibility gap if you couldn't even look properly.
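One way to keep hunts honest about the template and about Step 6 is to record each hunt as a small structured object. The sketch below is a hypothetical record format, not part of bluekit: the Hunt class, its field names, and the can_close rule are assumptions that simply encode the hypothesis template and the requirement that a hunt ends with a durable output.

```python
from dataclasses import dataclass, field


@dataclass
class Hunt:
    technique: str                      # ATT&CK technique ID
    observable: str                     # what we expect to see
    data_source: str                    # where we expect to see it
    conclusion: str = "open"            # confirmed / refuted / inconclusive
    new_detections: list[str] = field(default_factory=list)
    visibility_gaps: list[str] = field(default_factory=list)

    def hypothesis(self) -> str:
        """Render the hunt in the chapter's hypothesis template."""
        return (f"If an adversary were performing {self.technique} in our "
                f"environment, we would expect to see {self.observable} "
                f"in {self.data_source}.")

    def can_close(self) -> bool:
        """Step 6: close only with a conclusion AND a durable output."""
        concluded = self.conclusion in {"confirmed", "refuted", "inconclusive"}
        return concluded and bool(self.new_detections or self.visibility_gaps)


hunt = Hunt(
    technique="T1071.001",
    observable="a server-tier host making regular outbound HTTPS "
               "connections to a single external destination",
    data_source="Zeek conn.log",
)
print(hunt.hypothesis())
print(hunt.can_close())   # False: no conclusion and no durable output yet

hunt.conclusion = "confirmed"
hunt.new_detections.append("Regular Outbound Beacon from a Server-Class Host")
print(hunt.can_close())   # True
```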

The Meridian hunt, step by step. Priya Nair, Meridian's incident-response and threat-hunting lead, has just read the financial-sector ISAC report from §22.3: an adversary compromising software-update mechanisms and beaconing over jittered HTTPS. Meridian runs SolarWinds-style server-management software in its on-prem data center. No alert has fired. Priya tasks Theo with a hunt.

Step 1 — Hypothesize. Theo writes it down explicitly, mapped to ATT&CK:

"If an adversary had compromised one of our server-management or server-tier hosts (initial access via a trusted update, T1195.002 — Supply Chain Compromise: Compromise Software Supply Chain) and were running command-and-control (T1071.001 — Application Layer Protocol: Web Protocols), we would expect to see a host in the server network tier making repeated, near-regular outbound HTTPS (port 443) connections to a single external destination that the host has no legitimate reason to contact. We would expect this in our Zeek conn.log / proxy logs."

Step 2 — Scope the data. Does Meridian have the data to test this? Yes — Chapter 10's network-monitoring design put a Zeek sensor on the data-center egress, and conn.log records every connection with source, destination, port, and timestamp. Theo confirms the server tier's IP ranges with Sam (the network architect) so he can filter to server-tier sources. This step matters: if the data didn't exist, the hunt would immediately surface a visibility gap — which is itself a valuable finding (see Step 6).

Step 3 — Query and analyze. Theo writes the hunt query. In SIEM pseudo-SQL against the Zeek connection data:

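-- Hunt: regular outbound beacons from the server tier (Meridian, illustrative)
-- Assumes inter_arrival_seconds (the gap since the same pair's previous
-- connection) is precomputed at ingest or in an earlier step.
SELECT  src_ip, dest_ip,
        COUNT(*)                              AS conn_count,
        COUNT(DISTINCT dest_ip)               AS distinct_dsts,  -- always 1 per (src, dest) group
        STDDEV(inter_arrival_seconds)         AS jitter_stddev,
        AVG(inter_arrival_seconds)            AS mean_interval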
FROM    zeek_conn
WHERE   src_zone = 'server'                   -- server tier only
  AND   dest_category = 'external'            -- leaving the org
  AND   dest_port = 443                       -- HTTPS
  AND   _time > NOW() - INTERVAL '7 days'
GROUP BY src_ip, dest_ip
HAVING  COUNT(*) >= 100                        -- chatty enough to be automated
   AND  STDDEV(inter_arrival_seconds) < 60     -- low jitter = regular = beacon-like
ORDER BY conn_count DESC;

The query encodes the hypothesis directly: server-tier sources, external HTTPS destinations, grouped by source-destination pair, keeping only pairs that are frequent (automated, not human) and regular (low standard deviation in the gap between connections — the statistical fingerprint of a beacon). This is beacon_score from Chapter 10 expressed in SQL. Theo runs it (in his lab/SIEM — never against anything unauthorized) and gets a handful of rows. Most are immediately explainable:

src_ip          dest_ip          conn_count  distinct_dsts  jitter_stddev  mean_interval
10.20.5.11      <Windows Update> 412         1              48.2           300.1   <- patching
10.20.5.11      <vendor SaaS>    288         1              55.0           600.0   <- telemetry
10.20.5.40      203.0.113.77     336         1              9.4            120.0   <- ??? 

Step 4 — Triage. The first two rows are benign and expected: a server checking for Windows updates and a monitoring agent's telemetry beacon to a known vendor. This is the essential hunting skill — knowing your environment's normal well enough to dismiss benign regularity, which is why baselining (Chapter 10) is a prerequisite. The third row is different. Host 10.20.5.40 — which Sam confirms is the server-management application server — is beaconing every 120 seconds, with very low jitter (stddev 9.4), to 203.0.113.77, an external IP it has no documented reason to contact. Theo enriches the destination: the IP belongs to a hosting provider, the domain it resolves to was registered three months ago, and it appears in no Meridian asset or vendor record. Every additional fact deepens the suspicion. This is the moment a hunt becomes an incident.
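The first pass of this triage, separating destinations with a documented reason from those without one, is mechanical enough to sketch in Python. Treat it as an illustration under assumptions: the KNOWN_DESTINATIONS inventory and the triage function are hypothetical, and the placeholder destination names are copied from the result table rather than resolved to real addresses.

```python
# Hypothetical inventory of destinations with a documented business reason.
KNOWN_DESTINATIONS = {
    "<Windows Update>": "patching",
    "<vendor SaaS>": "monitoring telemetry",
}


def triage(rows: list[dict], known: dict = KNOWN_DESTINATIONS):
    """Split hunt hits into explained and unexplained destinations."""
    explained, unexplained = [], []
    for row in rows:
        reason = known.get(row["dest_ip"])
        if reason is None:
            unexplained.append(row)
        else:
            explained.append({**row, "reason": reason})
    # Show the most regular (lowest-jitter) unexplained beacons first.
    unexplained.sort(key=lambda row: row["jitter_stddev"])
    return explained, unexplained


rows = [
    {"src_ip": "10.20.5.11", "dest_ip": "<Windows Update>", "jitter_stddev": 48.2},
    {"src_ip": "10.20.5.11", "dest_ip": "<vendor SaaS>", "jitter_stddev": 55.0},
    {"src_ip": "10.20.5.40", "dest_ip": "203.0.113.77", "jitter_stddev": 9.4},
]

explained, unexplained = triage(rows)
print([row["reason"] for row in explained])     # ['patching', 'monitoring telemetry']
print([row["dest_ip"] for row in unexplained])  # ['203.0.113.77']
```

A lookup like this only narrows the list. The judgment that follows (who owns the host, how old the domain is, whether any vendor record explains it) still belongs to the analyst.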

Step 5 — Conclude. The hypothesis is confirmed (pending formal IR confirmation): a server-tier host exhibits exactly the predicted C2-beacon behavior to an unexplained external destination. Theo does not try to remediate on his own — he hands off to incident response (Chapter 24), preserving the evidence. But the hunt's job is done: it found, by hypothesis and query, the thing no alert caught, exactly as the real SolarWinds hunters did.

Step 6 — Operationalize. This is what makes it a hunt and not just a lucky investigation. Theo and Priya take the query that found the beacon and turn it into the durable behavioral detection from §22.4 — the "Regular Outbound Beacon from a Server-Class Host" Sigma rule — so that the next time any server-tier host beacons to a new external destination, an alert fires automatically and no human has to think to look. They also note a visibility improvement: enriching outbound destinations against the asset/vendor inventory automatically would have surfaced "server talking to an unknown external host" even faster. The hunt has permanently raised Meridian's detection floor.

🛡️ Defender's Lens: Notice that the hunt did not require any threat intelligence about this specific attacker's infrastructure — no hash, no known-bad IP, none of the bottom-of-the-pyramid indicators that were useless against Sunburst anyway. It required only a hypothesis about behavior (servers shouldn't beacon to the internet) and the data to test it. This is why hunting is the answer to the stealthy adversary: you are not matching what they are, you are looking for what they must do. The adversary controls their tools, their malware, and their infrastructure — but they do not control the fact that C2 requires talking to home.

⚠️ Common Pitfall: Hunting without baselining your normal first. A hunter who does not know that host 10.20.5.11's regular beacon to Windows Update is benign will either (a) flag it as malicious and cry wolf, destroying the team's credibility, or (b) learn to ignore all regular beacons, including the malicious one. Triage in Step 4 is only possible if you can separate benign regularity from malicious regularity — and that requires having characterized your environment's normal in advance (Chapter 10). The hunt's hardest part is usually not finding anomalies; it is knowing which anomalies are yours.

🔄 Check Your Understanding: 1. Rewrite this vague hunt as a proper hypothesis using the template: "let's see if anyone is stealing data." (Pick a plausible technique and data source.) 2. In the Meridian hunt, two of the beaconing source-destination pairs were benign and one was malicious, yet all three were "regular outbound connections." What distinguished the malicious one during triage, and what prior work made that distinction possible?

Answers

  1. For example: "If an adversary were exfiltrating data over the network (T1048 — Exfiltration Over Alternative Protocol / T1041 — Exfiltration Over C2 Channel), we would expect to see one or more internal hosts sending an unusually large volume of outbound data to a single external destination, especially during off-hours — let's query NetFlow/Zeek conn.log for outbound byte counts per host-destination pair over the last 7 days and flag outliers." (Any specific technique + observable + data source that is confirmable is acceptable.)
  2. The malicious beacon came from the server-management application server and went to an unknown, recently registered external destination found in no asset or vendor record, whereas the two benign beacons went to known, explainable destinations (Windows Update, a known vendor). The distinction was possible because of prior baselining and an asset/vendor inventory (knowing the environment's normal), without which all three would look identical.

22.6 Measuring coverage

You have built detections and run a hunt. Now the question that should keep a detection lead up at night: what can't we see? The most dangerous gap in a security program is not a detection that fires too often — that is annoying but visible. It is the technique an adversary can use against you that no detection covers and no one knows is uncovered. To manage that, you have to measure it.

Detection coverage (also called ATT&CK coverage) is the systematic mapping of your detections to the MITRE ATT&CK matrix, to see which adversary techniques you can detect and which are blind spots. Recall from Chapter 2 that ATT&CK organizes adversary behavior into tactics (the columns — the adversary's goals, like Initial Access, Execution, Persistence, Credential Access, Command and Control, Exfiltration) and techniques (the cells within each column — the specific ways to achieve each goal). Coverage mapping asks, for every technique, "do we have a detection (and the data source it needs) that would catch an adversary using this?" The answer for each cell is roughly: good coverage, partial coverage, no coverage, or not applicable to our environment. The result is a heatmap over the matrix — the single most clarifying artifact a detection program produces.

        ATT&CK COVERAGE HEATMAP (Meridian, illustrative excerpt)
        ███ good   ▒▒▒ partial   ░░░ none / blind spot

  Initial    Execution  Persist.   Cred.      Command &  Exfil.
  Access                            Access     Control
  ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐
  │ Phishing│ │ Cmd/   │ │ Sched. │ │ OS Cred│ │ Web    │ │ Exfil  │
  │  ███    │ │ Script │ │ Task   │ │ Dumping│ │ Protos │ │ over C2│
  │         │ │  ███   │ │  ▒▒▒   │ │  ░░░   │ │  ███   │ │  ▒▒▒   │
  ├────────┤ ├────────┤ ├────────┤ ├────────┤ ├────────┤ ├────────┤
  │ Supply  │ │ WMI    │ │ Registry│ │ Brute  │ │ Encrypt│ │ Exfil  │
  │ Chain   │ │        │ │ Run Key│ │ Force  │ │ Channel│ │ to Web │
  │  ░░░ ◄──│ │  ░░░   │ │  ▒▒▒   │ │  ███   │ │  ▒▒▒   │ │  ░░░   │
  └────────┘ └────────┘ └────────┘ └────────┘ └────────┘ └────────┘
        ▲                              ▲
        │                              │
   BLIND SPOT: the exact technique  BLIND SPOT: no detection for
   class SolarWinds used — and we   credential dumping from memory —
   had no detection until the hunt  an adversary could harvest creds
   built one.                       and we'd never see it.

Figure 22.5 — A detection-coverage heatmap over an excerpt of the ATT&CK matrix. Green cells (███) have solid detections (tagged with their technique IDs, per §22.4); red cells (░░░) are blind spots, and ▒▒▒ marks partial coverage. The value of this picture is that it makes the invisible visible: before the Meridian hunt, "Supply Chain Compromise" and "Web Protocols (C2)" were uncovered, and the heatmap would have shown both as red: gaps to prioritize, not surprises to discover during a breach.

Two honest caveats keep this from becoming a vanity metric. First, coverage is not binary and not a percentage to brag about. "We cover 80% of ATT&CK" is nearly meaningless, because techniques are not equally important to your environment and "covered" hides a spectrum from "we have one fragile rule" to "we have robust, tested, multi-source detection." A coverage map is a prioritization tool — it shows you where to invest next — not a scoreboard. Second, coverage depends on data, not just rules. You cannot detect credential dumping if you don't collect the process-and-memory telemetry that would reveal it, no matter how good your rule is. A red cell can mean "we have no rule" or "we have no data" — and the second is the more expensive gap, because building the rule is useless until the data exists. Mapping coverage therefore doubles as a data-source gap analysis: it tells you not only which detections to write but which telemetry to start collecting.
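Both caveats can be made mechanical. The sketch below derives a per-technique status from the ATT&CK tags on a detection catalog and from the telemetry actually collected, so that "no rule" and "no data" show up as different kinds of gap. Everything in it is illustrative: the detection titles, the REQUIRED_DATA mapping, the data-source names, and the COLLECTED set are assumptions, not Meridian's real inventory.

```python
# Hypothetical catalog: each detection carries Sigma-style ATT&CK tags.
detections = [
    {"title": "Office spawns shell",
     "tags": ["attack.execution", "attack.t1059.003"]},
    {"title": "Server-tier beacon",
     "tags": ["attack.command_and_control", "attack.t1071.001"]},
    {"title": "LSASS memory access",
     "tags": ["attack.credential_access", "attack.t1003.001"]},
]

# Assumed mapping: technique -> telemetry a detection for it would need.
REQUIRED_DATA = {
    "T1059.003": "process_creation",
    "T1071.001": "zeek_conn",
    "T1003.001": "process_access",
    "T1195.002": "software_inventory",
}
COLLECTED = {"process_creation", "zeek_conn"}  # telemetry we actually ingest


def technique_ids(tags: list[str]) -> set[str]:
    """Extract technique IDs such as 'T1059.003' from 'attack.*' tags."""
    ids = set()
    for tag in tags:
        suffix = tag.removeprefix("attack.")
        if suffix[:1] == "t" and suffix[1:2].isdigit():
            ids.add(suffix.upper())
    return ids


def coverage(detections, required_data, collected) -> dict[str, str]:
    """Classify every technique by whether a rule and its data both exist."""
    ruled = set()
    for detection in detections:
        ruled |= technique_ids(detection["tags"])
    report = {}
    for technique, source in required_data.items():
        has_rule, has_data = technique in ruled, source in collected
        if has_rule and has_data:
            report[technique] = "covered"
        elif has_rule:
            report[technique] = "rule but no data"
        elif has_data:
            report[technique] = "data but no rule"
        else:
            report[technique] = "no rule, no data"
    return report


for technique, status in coverage(detections, REQUIRED_DATA, COLLECTED).items():
    print(technique, status)
# T1059.003 covered
# T1071.001 covered
# T1003.001 rule but no data
# T1195.002 no rule, no data
```

The third line of output is the case a flat percentage hides: a rule exists, so a naive count calls the technique covered, yet the rule can never fire because its telemetry is not collected.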

This is where false negatives demand to be named explicitly. A false negative is a real attack that your detection failed to catch — the malicious event that happened and produced no alert. It is the silent, invisible counterpart to the false positive (the benign event that wrongly alerted, from Chapter 21). The asymmetry between them is the heart of detection's hardest tradeoff and worth stating sharply:

  • A false positive is visible and costs you attention: an analyst investigates, finds nothing, moves on. Too many cause alert fatigue (Chapter 21), which is bad — but you know they are happening.
  • A false negative is invisible and costs you a breach: the attack succeeded and you never knew, because nothing fired. You cannot see your own false negatives by definition — if you could, they would be detections.

This invisibility is exactly why hunting and coverage measurement exist. You cannot count false negatives directly, but you can attack their hiding places: a coverage map reveals the techniques where false negatives are guaranteed (no detection, no data — anything there is a false negative by construction), and hunting actively searches the space your detections don't cover, converting would-be false negatives into findings. Tuning a detection is always a balance: loosen it to reduce false negatives (catch more) and you raise false positives (more noise); tighten it to reduce false positives and you raise false negatives (miss more). There is no setting with zero of both. The professional move is to make that tradeoff consciously and per-detection, knowing which way each rule is tuned and why — and to use coverage and hunting to keep the false-negative side, the side you cannot see, from quietly growing.
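The tradeoff is easiest to see on a small labeled set, such as the events from a red-team exercise where you do know which ones were malicious. The Python sketch below sweeps an alert threshold over made-up detection scores and counts both kinds of error. The scores and labels are invented for illustration, and in production you would not have the labels, which is the whole problem.

```python
# (detection score, truly malicious?) pairs; invented for illustration.
labeled = [
    (0.95, True), (0.90, False), (0.80, True), (0.70, False),
    (0.60, False), (0.55, True), (0.30, False), (0.10, False),
]


def errors(threshold: float) -> tuple[int, int]:
    """Return (false_positives, false_negatives) when alerting at >= threshold."""
    false_positives = sum(1 for score, bad in labeled
                          if score >= threshold and not bad)
    false_negatives = sum(1 for score, bad in labeled
                          if score < threshold and bad)
    return false_positives, false_negatives


for threshold in (0.50, 0.75, 0.92):
    print(threshold, errors(threshold))
# 0.5 (3, 0)    loose: catches everything, three benign alerts
# 0.75 (1, 1)   middle: one of each
# 0.92 (0, 2)   tight: quiet queue, two attacks missed
```

Because a benign event (0.90) scores above a malicious one (0.80) in this data, no threshold yields zero of both. Moving the threshold only chooses which kind of error you accept.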

🛡️ Defender's Lens: The most valuable sentence a detection lead can say to a board is, "Here are the adversary techniques we currently cannot detect, ranked by how likely an adversary targeting a bank is to use them, and here is the plan to close them." That sentence is only possible if you have mapped coverage. It converts the terrifying, unbounded question "are we missing something?" into a bounded, fundable backlog. Compliance asks "do you have detection?" (a yes/no that invites a comforting lie); a coverage map answers the real question, "detection of what, and what are we blind to?" — which is the difference between security and security theater (Theme 5: compliance is the floor, not the ceiling).

🔄 Check Your Understanding: 1. Why is "we cover 80% of ATT&CK" a misleading way to report detection coverage? Give two distinct reasons. 2. Explain why you cannot directly count your false negatives, and name the two practices from this chapter that attack the problem indirectly.

Answers

  1. First, ATT&CK techniques are not equally relevant or equally dangerous to your environment, so a flat percentage weights a trivial technique the same as a critical one. Second, "covered" hides a wide quality spectrum — from a single fragile, untested rule to robust multi-source detection — and it conflates "we have a rule" with "we have the data the rule needs," so the percentage can be inflated by rules that cannot actually fire. Coverage is a prioritization map, not a score.
  2. False negatives are real attacks that produced no alert — by definition you have no record of them, so you cannot count what left no trace (if you could see it, it would be a detection). The two indirect attacks on the problem are coverage mapping (which reveals techniques with no detection/data, where false negatives are guaranteed) and threat hunting (which proactively searches the uncovered space and converts would-be false negatives into findings).

Project Checkpoint

Meridian's SIEM is live (Chapter 21) and its first correlation rules are firing. This chapter turns that reactive monitoring into a detection and hunting program, and adds the detect.py module to bluekit.

Program increment — the detection & hunting program. Theo and Priya formalize three things, which become a section of Meridian's security program document. (1) A detection-engineering practice: detections are written as Sigma rules, each tagged with its ATT&CK technique IDs, version-controlled as detection-as-code, and documented with a description, expected false positives, and a severity — so the catalog is auditable and maintainable. (2) A hunting cadence: a standing schedule (say, one hypothesis-driven hunt per week), each hunt grounded in threat intelligence or an ATT&CK gap, run through the six-step loop (Figure 22.4), and required to end by either producing a new detection or documenting a visibility gap. (3) A coverage map: Meridian's detections plotted on the ATT&CK matrix (Figure 22.5), reviewed quarterly, with the top blind spots ranked by relevance to a financial institution and fed into the detection-engineering backlog. The SolarWinds-style beaconing hunt is the program's first documented hunt; the durable beacon-detection rule it produced is the catalog's newest entry; and "Supply Chain Compromise" and "C2 over web protocols," once red cells, are the first gaps the program is closing.

bluekit increment — detect.py. We add the chapter's two canonical functions: ioc_match(event, iocs) — indicator-based detection (does an event contain a known-bad indicator?) — and attack_technique(event) — a tiny behavioral classifier that maps an event to the ATT&CK technique it most resembles. As always, the code is illustrative and never executed during authoring; the expected output is hand-traced.

# bluekit/detect.py  — Chapter 22 increment
"""Two faces of detection: indicator matching (bottom of the pyramid)
and behavioral technique-tagging (top of the pyramid).
Illustrative and hand-traced; nothing here is executed at authoring time.
"""

def ioc_match(event: dict, iocs: dict) -> list[str]:
    """Return one 'IoC[field]=value' note per known-bad indicator in the event.
    `iocs` maps an indicator type (an event field name) -> set of known-bad values."""
    hits = []
    for field, bad_values in iocs.items():        # e.g. "dest_ip" -> {...}
        value = event.get(field)
        if value is not None and value in bad_values:
            hits.append(f"IoC[{field}]={value}")
    return hits


def attack_technique(event: dict) -> str:
    """Map an event to the ATT&CK technique it most resembles (behavioral)."""
    parent = (event.get("parent_image") or "").lower()
    image = (event.get("image") or "").lower()
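    # Match the shell by file name; a bare substring test ("cmd" in image)
    # would also hit unrelated paths that merely contain "cmd".
    if parent.endswith(("winword.exe", "excel.exe")) and image.endswith("cmd.exe"):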
        return "T1059.003"      # Command and Scripting Interpreter: Windows Command Shell
    if event.get("src_zone") == "server" and event.get("dest_category") == "external" \
            and event.get("beacon_regular"):
        return "T1071.001"      # Application Layer Protocol: Web Protocols (C2 beacon)
    if "lsass" in (event.get("target_image") or "").lower():
        return "T1003.001"      # OS Credential Dumping: LSASS Memory
    return "unknown"


if __name__ == "__main__":
    # The sha256 value is a shortened placeholder, not a real 64-hex-character digest.
    iocs = {"dest_ip": {"203.0.113.77"}, "sha256": {"5f4dcc3b2ab7"}}
    e1 = {"dest_ip": "203.0.113.77", "src_zone": "server",
          "dest_category": "external", "beacon_regular": True}
    e2 = {"parent_image": r"C:\Office\winword.exe", "image": r"C:\Windows\cmd.exe"}
    print("e1 IoCs:    ", ioc_match(e1, iocs))
    print("e1 technique:", attack_technique(e1))
    print("e2 technique:", attack_technique(e2))

# Expected output:
# e1 IoCs:     ['IoC[dest_ip]=203.0.113.77']
# e1 technique: T1071.001
# e2 technique: T1059.003

Trace it by hand to see the two postures in one module. For event e1 (the Meridian beacon), ioc_match finds the known-bad destination IP — indicator detection, which only works because we already knew that IP was bad. But attack_technique independently classifies e1 as T1071.001 from its behavior (server tier → external → regular beacon) — behavioral detection, which would have fired even if the IP were brand-new and on no blocklist. That is the whole lesson of the pyramid of pain in a few lines of Python: the IoC match is precise but brittle; the technique match is robust to the adversary changing their infrastructure. Meridian now detects the SolarWinds-style beacon two ways, and only one of them depends on having seen the attacker before.
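You can check the "brand-new IP" claim directly. The following standalone sketch uses condensed stand-ins for the two functions so that it runs on its own, and moves the beacon to a different destination. The replacement address 198.51.100.23 is an arbitrary documentation-range IP chosen for this example, and is_server_beacon is a hypothetical helper that reproduces only the C2 branch of attack_technique.

```python
def ioc_match(event: dict, iocs: dict) -> list[str]:
    """Condensed stand-in: one note per field whose value is known-bad."""
    return [f"IoC[{name}]={event[name]}"
            for name, bad_values in iocs.items()
            if event.get(name) in bad_values]


def is_server_beacon(event: dict) -> bool:
    """Condensed stand-in for the T1071.001 branch of attack_technique."""
    return (event.get("src_zone") == "server"
            and event.get("dest_category") == "external"
            and bool(event.get("beacon_regular")))


iocs = {"dest_ip": {"203.0.113.77"}}
original = {"dest_ip": "203.0.113.77", "src_zone": "server",
            "dest_category": "external", "beacon_regular": True}
# Same behavior, new infrastructure the blocklist has never seen.
rotated = dict(original, dest_ip="198.51.100.23")

print(ioc_match(original, iocs), is_server_beacon(original))
# ['IoC[dest_ip]=203.0.113.77'] True
print(ioc_match(rotated, iocs), is_server_beacon(rotated))
# [] True
```

Changing one value silences the indicator match, while the behavioral check is unaffected because nothing about how the host behaves has changed.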

Summary

This chapter built the discipline of finding adversaries — both the ones you anticipated and the ones you didn't.

  • Threat detection has two postures that feed each other: detection engineering (building and maintaining the automated rules that fire on anticipated bad activity) and threat hunting (the proactive, human-led search for activity no rule caught). A mature program runs both, with a loop that turns every hunt finding into a new automated detection.
  • Absence of alerts is not evidence of absence of attackers. A quiet queue is a hypothesis to test by hunting, not a verdict of safety — SolarWinds Sunburst proved this across thousands of silent environments for nine months.
  • The pyramid of pain (hashes → IPs → domains → artifacts → tools → TTPs) ranks indicators by how much it costs the adversary to change them. Indicator-based detection (matching hashes/IPs/domains) is cheap and brittle; behavioral detection (matching technique/TTP) is harder but forces the adversary to change how they operate. Climb the pyramid.
  • Threat intelligence drives detection through a pipeline: ingest → triage for relevance → extract indicators and techniques → deploy as a detection or a hunt → feed back. A threat intelligence platform (TIP) aggregates, de-duplicates, scores, and distributes feeds (often via STIX/TAXII). Operational (TTP-level) intelligence is the gold; tactical indicators are short-lived.
  • A Sigma rule is a portable, vendor-agnostic YAML detection that compiles to any SIEM's query language — write once, run anywhere — and lives in version control as detection-as-code. Anatomy: metadata, ATT&CK tags, logsource, detection (selection + condition), falsepositives, level. YARA is the file/memory counterpart; Snort/Suricata the network-IDS counterpart.
  • Hypothesis-driven hunting structures a hunt around a specific, testable hypothesis — "if adversary did [technique], we'd see [observable] in [data source]" — run through the six-step loop: hypothesize → scope data → query → triage → conclude → operationalize. Step 6 (turn the finding into a durable detection or document a visibility gap) is what makes it a hunt and not a one-off investigation.
  • Detection coverage maps your (ATT&CK-tagged) detections onto the ATT&CK matrix to expose blind spots; it is a prioritization tool and a data-source gap analysis, not a percentage to brag about. A false negative — a real attack that produced no alert — is invisible by definition; coverage mapping and hunting are the two ways to attack the false-negative problem indirectly. Tuning always trades false negatives against false positives.
  • bluekit gained detect.py (ioc_match, attack_technique), and Meridian's program gained a detection-engineering practice, a hunting cadence, and a coverage map.

Spaced Review

Retrieve before you scroll. These revisit the SIEM you built last chapter and the threat foundations from early in the book.

  1. (Ch.21) Your SIEM's correlate(events, rule) fires an alert when a correlation rule matches. How does a Sigma rule in this chapter relate to a SIEM correlation rule — what does Sigma add that writing the correlation rule directly in your SIEM does not?
  2. (Ch.2) Define indicator of compromise (IoC) and TTP in one sentence each, and place each on the pyramid of pain. Why does an adversary find it so much cheaper to change an IoC than a TTP?
  3. (Ch.21) Last chapter you tuned rules to fight alert fatigue from false positives. This chapter introduced false negatives. Why is the false negative the more dangerous of the two, and why can't you simply tune a rule to eliminate both at once?
  4. (Ch.2) The cyber kill chain describes an intrusion as stages. How does building detections across multiple ATT&CK tactics (Initial Access, Execution, C2, Exfiltration) give a defender more chances to catch an adversary than a single detection would?

Answers

  1. A Sigma rule *is* a detection rule, but written in a portable, vendor-agnostic YAML format that *compiles to* your SIEM's native correlation/query language (SPL, KQL, etc.). Over writing the correlation rule directly, Sigma adds **portability** (one rule runs on any SIEM), **maintainability and version control** (it is text: detection-as-code), and a standard place to carry **ATT&CK technique tags** so the detection counts toward coverage.
  2. An *IoC* is a forensic artifact suggesting compromise (a bad hash, IP, domain, registry key); a *TTP* is the tactic/technique/procedure describing *how* an adversary operates (e.g., dumping credentials from LSASS memory). IoCs sit at the bottom of the pyramid (hashes/IPs/domains), TTPs at the top. Changing an IoC costs the adversary almost nothing (recompile for a new hash, re-register a domain) because it is just a value; changing a TTP forces them to relearn and re-tool *how they do their job*, which is expensive and risky.
  3. A false positive is *visible* (an analyst investigates and finds nothing), so it costs attention and causes fatigue but you know it happened; a false negative is a *real attack that fired no alert*, invisible by definition, so it costs you an undetected breach you don't even know to investigate. You can't eliminate both at once because tuning is a tradeoff: loosening a rule to catch more (fewer false negatives) admits more noise (more false positives), and tightening it does the reverse; there is no setting with zero of both.
  4. Each ATT&CK tactic is a *stage the adversary must pass through* to reach their goal, and a detection at each stage is an independent chance to catch them; an adversary who evades your Initial Access detection may still trip your Credential Access or C2 detection. Multiple detections across the kill chain mean the defender does not have to be right at one single point: defense in depth applied to detection (Theme 4).
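
The tradeoff in answer 3 can be made concrete with a toy threshold sweep. This is a sketch under stated assumptions: the scored events are invented for illustration, and `confusion` is a hypothetical helper rather than part of bluekit. The scores of real attacks and benign events overlap, as they usually do for behavioral detections.

```python
# Hypothetical scored events: (anomaly_score, is_real_attack).
# Attack and benign scores overlap, which is what makes tuning a tradeoff.
EVENTS = [
    (0.95, True), (0.80, True), (0.55, True), (0.30, True),
    (0.90, False), (0.60, False), (0.40, False), (0.20, False), (0.10, False),
]


def confusion(events, threshold):
    """Count (false_positives, false_negatives) when alerting at score >= threshold."""
    false_positives = sum(
        1 for score, attack in events if score >= threshold and not attack
    )
    false_negatives = sum(
        1 for score, attack in events if score < threshold and attack
    )
    return false_positives, false_negatives


for threshold in (0.25, 0.50, 0.85):
    fp, fn = confusion(EVENTS, threshold)
    print(f"threshold={threshold:.2f} false_positives={fp} false_negatives={fn}")
# threshold=0.25 false_positives=3 false_negatives=0
# threshold=0.50 false_positives=2 false_negatives=1
# threshold=0.85 false_positives=1 false_negatives=3
```

Raising the threshold trades noise for misses and lowering it does the reverse. In this toy data no threshold reaches zero of both, because one benign event scores higher than a real attack. In practice the false-negative column is the one you cannot observe directly, which is why the chapter pairs tuning with coverage mapping and hunting.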

What's Next

You can now find adversaries — both the ones your rules anticipated and the ones you had to hunt for. The Meridian beaconing hunt ended at the most important hand-off in security operations: Theo found the C2 beacon and immediately escalated, because finding an adversary and evicting one are different jobs. Chapter 24 picks up exactly there, with incident response — the disciplined process of preparation, detection and analysis, containment, eradication, recovery, and lessons learned that turns a confirmed detection into a contained, eradicated, and survived incident. You will run the Meridian ransomware tabletop and learn to make containment decisions under uncertainty. (Chapter 23 first sharpens a related skill — vulnerability management and prioritization — closing the loop between the weaknesses adversaries exploit and the speed at which you fix them.) And much later, Chapter 34 returns to this chapter's hardest problem — the false positives and false negatives of behavioral detection — to ask what machine learning can, and cannot, do to help.