Exercises: Threat Detection and Hunting

These exercises build the two crafts of this chapter — engineering detections and hunting for what they miss. Difficulty is marked ⭐ (recall/application), ⭐⭐ (analysis), and ⭐⭐⭐ (synthesis/open-ended). A dagger (†) marks problems with a full worked solution in Appendix: Answers to Selected Exercises — try every problem before you read one.

Write Sigma rules in YAML, hunt queries in any SIEM dialect (SQL/SPL/KQL — pick one and stay consistent), and ATT&CK mappings using real technique IDs where you are confident of them. Where an exercise asks you to score, hunt, or design, the reasoning matters more than landing on one "right" answer. Only ever run queries against your own lab or a system you are explicitly authorized to test.


Part A — Core vocabulary and the pyramid ⭐

1.† In one sentence each, define threat detection, detection engineering, and threat hunting, then write one sentence describing the loop that connects detection engineering and hunting.

2. Place each indicator on the pyramid of pain (from least to most painful for the adversary to change) and state, for each, roughly how the adversary would evade a detection built on it: (a) a file's SHA-256 hash; (b) a C2 IP address; (c) the technique "dumping credentials from LSASS memory"; (d) a malicious domain; (e) a distinctive C2 URI pattern baked into the malware; (f) the adversary's custom tool.

3.† Distinguish indicator-based detection from behavioral detection in one sentence each. Give one concrete advantage and one concrete limitation of each.

4. Define false negative and contrast it with a false positive (Chapter 21). Explain in one sentence why a security team can measure its false positives but cannot directly measure its false negatives.

5. A colleague says, "Our alert queue has been quiet all month — we're clearly secure." Identify the flawed assumption using a concept from §22.1, and state what you would do to actually earn the conclusion "we are not currently compromised."

6.† Name the three tiers of threat intelligence (strategic / operational / tactical), give a one-line example of each for a bank, and state which tier is most valuable to a detection engineer and why.


Part B — Analyze the IoCs / telemetry ⭐⭐

7.† Analyze this beacon. You are handed this (illustrative) summary of outbound connections from Meridian's server tier over 7 days. All destinations are external; the source zone is server.

src_ip       dest_ip          conn_count  distinct_dsts  jitter_stddev(s)  mean_interval(s)
10.20.5.11   <Windows Update> 412         1              48.2              300.1
10.20.5.22   <CDN edge>       1503        37             612.0             variable
10.20.5.40   203.0.113.77     336         1              9.4               120.0
10.20.5.51   198.51.100.9     288         1              7.1               600.0

(a) Which row(s) look most like C2 beaconing, and which fields drove your conclusion? (b) Which row is almost certainly benign human/CDN web traffic, and why? (c) What single enrichment step would most quickly confirm whether 10.20.5.40's destination is malicious? (d) Map the suspected behavior to an ATT&CK tactic and technique.
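
One way to compare the single-destination rows on equal footing is to divide the jitter by the mean interval, giving a coefficient of variation. The sketch below is only an illustration under stated assumptions, not the chapter's scoring method: the function name and the `rows` layout are hypothetical, the numbers are copied from the table above, and the CDN row is left out because its mean interval is listed as variable.

```python
def coefficient_of_variation(jitter_stddev_s: float, mean_interval_s: float) -> float:
    """Jitter relative to the mean interval; smaller means more clock-like."""
    if mean_interval_s <= 0:
        raise ValueError("mean interval must be positive")
    return jitter_stddev_s / mean_interval_s


# (src_ip, jitter_stddev_s, mean_interval_s), copied from the table above.
rows = [
    ("10.20.5.11", 48.2, 300.1),
    ("10.20.5.40", 9.4, 120.0),
    ("10.20.5.51", 7.1, 600.0),
]

scores = {src: coefficient_of_variation(j, m) for src, j, m in rows}

# Print the most regular (lowest ratio) hosts first.
for src, cv in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{src}: cv={cv:.3f}")
```

A low ratio is a reason to triage a host, not a verdict: scheduled updaters and health checks can also be very regular, which is why part (c) asks for enrichment.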

8. Analyze this process tree. A process-creation log shows: outlook.exe → cmd.exe → powershell.exe -enc <base64> → an outbound connection to a new external host. (a) What is the most likely attack story? (b) Which ATT&CK tactics are represented across this chain? (c) Why is detecting on the chain (Office → shell → encoded PowerShell → network) higher on the pyramid of pain than alerting on the specific base64 string?

9.† You receive a threat-intel report listing: 3 file hashes, 2 IP addresses, 1 domain, and a paragraph describing that the adversary "achieves persistence by creating a scheduled task that runs a signed-but-abused binary, then beacons over HTTPS every 15 minutes." Separate this report into indicators (for matching) and techniques (for behavioral detection/hunting). For each technique, state whether you would make it a detection or a hunt, and why.

10. A vendor proudly reports that their feed adds "4 million fresh indicators per month." Using the pyramid of pain and the idea of relevance triage, explain why this number is closer to a liability than an asset, and describe what a threat intelligence platform should do with such a feed before any of it reaches your SOC.


Part C — Write the rule (Sigma / detection-as-code) ⭐⭐

11.† Write the Sigma rule. Write a Sigma rule that detects rundll32.exe being launched with a command line that contains a URL (a common technique for downloading and executing a remote payload). Include: title, id (any UUID), description, the ATT&CK tags (System Binary Proxy Execution: Rundll32 is T1218.011), logsource (Windows process creation), a detection block with a selection and condition, at least one realistic falsepositives entry, and a level.

12. Take the "Office Application Spawning a Command Shell" rule from §22.4 and add a filter block that suppresses a known-good document-automation service account (say, svc-docgen). Rewrite the condition to selection and not filter, and update the falsepositives note accordingly. Explain in one sentence why adding the filter is tuning toward fewer false positives and what risk that introduces.
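
To see what the condition selection and not filter means operationally, it can help to evaluate it by hand against two events. The sketch below is a hedged illustration in Python rather than Sigma: the field names (ParentImage, Image, User), the process list, and the MERIDIAN domain prefix are assumptions made for the example and may not match the §22.4 rule exactly.

```python
from typing import Mapping


def endswith_any(value: str, suffixes: list[str]) -> bool:
    """Case-insensitive 'ends with any of these' check."""
    return any(value.lower().endswith(s.lower()) for s in suffixes)


def matches(event: Mapping[str, str]) -> bool:
    # selection: an Office parent launching a command shell (assumed fields).
    selection = endswith_any(
        event.get("ParentImage", ""),
        ["\\winword.exe", "\\excel.exe", "\\outlook.exe"],
    ) and endswith_any(
        event.get("Image", ""),
        ["\\cmd.exe", "\\powershell.exe"],
    )
    # filter: the known-good document-automation account.
    is_filtered = event.get("User", "").lower().endswith("svc-docgen")
    # condition: selection and not filter
    return selection and not is_filtered


# Hypothetical events: same process chain, different user.
alice = {
    "ParentImage": r"C:\Program Files\Microsoft Office\root\Office16\WINWORD.EXE",
    "Image": r"C:\Windows\System32\cmd.exe",
    "User": r"MERIDIAN\alice",
}
docgen = {**alice, "User": r"MERIDIAN\svc-docgen"}

print(matches(alice), matches(docgen))
```

The two events differ only in the User field, so the pair isolates exactly what the filter changes about the rule's behavior.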

13.† Map to ATT&CK. For each detection idea, give the most appropriate ATT&CK tactic and a plausible technique ID (state your confidence; describe generically if unsure): (a) a user account added to the local Administrators group; (b) wmic used to spawn a process on a remote host; (c) clearing the Windows Security event log; (d) a DNS query for a long, high-entropy, algorithmically generated domain; (e) a new service installed that runs from a temp directory.

14. Write the rule from intel. Convert this one-line piece of operational intelligence into a Sigma rule (logic, tags, false positives): "The adversary establishes persistence by adding a Run key under HKCU\Software\Microsoft\Windows\CurrentVersion\Run pointing to an executable in the user's profile." (ATT&CK: Boot or Logon Autostart Execution: Registry Run Keys / Startup Folder is T1547.001.)

15. Explain detection-as-code in two or three sentences. List three concrete benefits of keeping Sigma rules in version control that you do not get from writing detections directly in a SIEM's web UI.


Part D — Run a hunt (hypothesis-driven) ⭐⭐–⭐⭐⭐

16.† Write the hypothesis. Convert each vague hunt into a proper hypothesis using the template "If an adversary were doing [technique], we would expect to see [observable] in [data source]": (a) "let's look for lateral movement"; (b) "let's see if anyone is exfiltrating data"; (c) "let's check for persistence somewhere."

17. Write the hunt query. For your hypothesis 16(b) (exfiltration), write a hunt query (SQL/SPL/KQL) against flow or proxy data that would surface the predicted observable. Annotate each clause with what it encodes from the hypothesis, and name one benign pattern your query would catch that you'd have to triage away.
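
If it helps to prototype the logic before committing to a SIEM dialect, the same aggregation can be sketched in a few lines of Python. Everything here is hypothetical and for illustration only: the record layout (src_ip, dest_ip, bytes_out), the sample values, and the 1 GB threshold are assumptions, not data from the chapter. The exercise still expects the query itself in SQL, SPL, or KQL.

```python
from collections import defaultdict

# Hypothetical flow records: (src_ip, dest_ip, bytes_out). Illustrative values only.
flows = [
    ("10.20.5.60", "192.0.2.10", 4_000_000),
    ("10.20.5.60", "192.0.2.10", 6_000_000),
    ("10.20.5.61", "192.0.2.99", 900_000_000),
    ("10.20.5.61", "192.0.2.99", 700_000_000),
]


def outbound_totals(records):
    """Sum outbound bytes per (source, destination) pair, like a GROUP BY."""
    totals = defaultdict(int)
    for src, dst, bytes_out in records:
        totals[(src, dst)] += bytes_out
    return dict(totals)


def over_threshold(totals, threshold_bytes):
    """Keep only pairs whose total exceeds the threshold, like a HAVING clause."""
    return sorted(pair for pair, total in totals.items() if total > threshold_bytes)


totals = outbound_totals(flows)
suspects = over_threshold(totals, threshold_bytes=1_000_000_000)
print(suspects)
```

Each function corresponds to one clause you would annotate in the real query: the grouping encodes "one host sending to one destination," and the threshold encodes "more than this host normally sends."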

18.† Run the loop. Walk the full six-step hunt loop (Figure 22.4) for the hypothesis: "If an adversary had dumped credentials from LSASS, we would expect to see a non-system process opening a handle to lsass.exe." For each step, write 1–2 sentences: (1) the hypothesis (already given — restate with the ATT&CK ID), (2) what data source answers it and whether a typical org collects it, (3) the query/analytic in words, (4) how you'd triage hits (name a benign process that legitimately touches LSASS), (5) a plausible conclusion, (6) the durable detection and/or visibility gap you'd operationalize.

19. In the Meridian beaconing hunt (§22.5), three server-tier hosts all showed "regular outbound connections," but only one was malicious. Explain precisely what triage information separated the malicious host from the two benign ones, and name the prior program investment (from an earlier chapter) without which that triage would have been impossible.

20.† Why does a good hunt end with operationalization (Step 6) rather than with "we found / didn't find the adversary"? Describe the two possible operational outputs of a hunt and why a hunt that produces neither has, at best, only postponed the next miss.

21. ⭐⭐⭐ Design a hunt program. Sketch a one-month hunting cadence for Meridian's SOC: how many hunts, where the hypotheses come from (name at least three sources), how each hunt is documented, and how hunt findings flow back into detection engineering. What would you measure to know the hunting program is working?


Part E — Coverage and measurement ⭐⭐

22.† Your team has Sigma rules for: phishing-link clicks, Office-spawning-a-shell, brute-force logins, and known-bad-IP connections. Sketch (in words or a small grid) an ATT&CK coverage picture across the tactics Initial Access, Execution, Credential Access, and Command and Control. Identify the two biggest blind spots and explain why each matters to a bank.

23. A manager wants to report "92% ATT&CK coverage" to the board. Give three reasons this single percentage is misleading, and propose a more honest one-sentence way to report coverage.

24.† Explain why a "red cell" (no coverage) in a coverage heatmap can mean two very different things, and why one of those meanings is the more expensive problem to fix. How does coverage mapping double as a data-source gap analysis?

25. Detection tuning always trades false negatives against false positives. For a credential-stuffing detection on the online-banking portal, argue which way you would tune it (toward catching more vs. alerting less) and justify the tradeoff in terms of the cost of each error type for a bank.


Part F — CTF-style challenge ⭐⭐⭐

26.† The beacon that hid in the noise. You are given (illustrative) hourly connection counts from one server-tier host to one external destination over 24 hours. The adversary added jitter to evade simple "exactly every N seconds" detection:

hour:   00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23
count:  29 31 30 28 32 30 29 31 30 33 28 30 31 29 30 32 28 31 30 29 31 30 28 30

(a) Argue from this data why a fixed-interval detector ("alert only if interval == 120s exactly") would miss this beacon, while a low-variance detector (STDDEV(inter_arrival) < threshold) would catch it. (b) The adversary's next move is to randomize the count more aggressively (say, 5–55 connections/hour). What second signal, beyond regularity of timing, could still betray the beacon? (c) Which layer of the pyramid of pain is this whole cat-and-mouse playing out on, and what higher-pyramid detection would end the game?
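
The table gives hourly counts rather than inter-arrival times, so one hedged way to quantify "regular" from this data alone is the dispersion of the counts. The sketch below computes the mean, the population variance, and their ratio (often called the index of dispersion or Fano factor). Comparing that ratio with 1, the value expected for purely random Poisson arrivals, is an assumption introduced here as a rough yardstick, not a method stated in the chapter.

```python
from statistics import mean, pvariance

# Hourly connection counts copied from the table above (hours 00 to 23).
counts = [29, 31, 30, 28, 32, 30, 29, 31, 30, 33, 28, 30,
          31, 29, 30, 32, 28, 31, 30, 29, 31, 30, 28, 30]

avg = mean(counts)               # average connections per hour
var = pvariance(counts)          # population variance of the hourly counts
dispersion = var / avg           # near 1 for Poisson-like traffic, near 0 for clock-like
mean_interval_s = 3600 / avg     # implied average spacing between connections

print(f"mean={avg}, variance={var:.2f}, dispersion={dispersion:.3f}, "
      f"mean interval={mean_interval_s:.0f}s")
```

A dispersion far below 1 says the counts are much steadier than random arrivals at the same rate would be, which is the property a low-variance detector keys on even when no two intervals are exactly equal.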


Part G — Interleaved & forward-looking ⭐⭐

27. (with Ch.21) Your SIEM ingests and normalizes server logs (Chapter 21's normalize) and fires correlation rules (correlate). Explain where a Sigma rule fits relative to those two functions: does Sigma replace normalization, correlation, or neither? What does compiling a Sigma rule produce that the SIEM then runs?

28. (with Ch.2 and Ch.10) The cyber kill chain (Ch.2) describes stages; network monitoring (Ch.10) produced beacon_score. Explain how a single behavioral signal (regular outbound beaconing) maps to a single kill-chain stage, and why a detection program needs signals across multiple stages rather than one excellent detector at one stage.

29. (with Ch.16/19) A hunt hypothesizes adversary use of valid credentials rather than malware ("living off the land"). Why does this kind of adversary defeat indicator-based detection almost entirely, and which earlier-chapter controls (name two) reduce the damage even when detection is hard?

30. ⭐⭐⭐ Open reflection. The chapter argues "absence of alerts is not evidence of absence of attackers." Pick a field outside security where the same epistemics apply — where "no alarm" is routinely mistaken for "no problem" (medical screening, structural inspection, financial-fraud monitoring). What does that field do to go looking for the problems its alarms miss, and what could a SOC borrow from it?


Solutions to daggered (†) problems are in the Answers appendix. The remaining problems are deliberately open — bring them to a study group, your lab, or your instructor.