Case Study 1: The Heartbeat in the Flow Data — Hunting C2 Beaconing at Meridian

"It wasn't any one connection. It was that all of them were the same connection, over and over, on a clock." — Theo Brandt, junior SOC analyst, Meridian Regional Bank (constructed)

Executive Summary

A peer institution's breach disclosure crosses Dana Okafor's desk: an attacker lived inside the peer bank's network for four months, controlled by an encrypted command-and-control (C2) channel that no endpoint or firewall ever flagged. Dana asks the question that drives this entire case study: "If that were happening inside us right now, would we see it?" The honest answer is "only if someone looks at the network the right way" — and so junior analyst Theo Brandt, with threat-hunting lead Priya Nair, runs a proactive hunt for beaconing across Meridian's newly collected flow data. They find it: one workstation that has been checking in with an external server every hour, around the clock, for three weeks, over encrypted HTTPS that every individual device judged normal. This case study follows the hunt step by step — building the query, scoring the regularity with beacon_score, pivoting through Zeek logs to confirm, and handing a scoped finding to incident response. It is the chapter's central skill made concrete: catching a behavior that is invisible at the packet level and obvious from the flow census. All names, addresses, and figures are constructed for teaching (Tier 3).

Skills applied: flow-data analysis; building and reasoning about a network baseline; beacon detection via inter-arrival-time regularity (beacon_score); pivoting across Zeek conn.log and ssl.log by connection uid, and to dns.log by client host and resolved name; distinguishing automated check-ins from human traffic; metadata analysis of encrypted channels; scoping a compromise from retained telemetry; the network-monitoring-to-IR handoff.

Background

Meridian's network-monitoring design (this chapter's Project Checkpoint) went live two months ago. Flow export (NetFlow/IPFIX) was enabled on the core and edge routers, feeding a collector with 13 months of retention; Zeek sensors were placed on taps at the internet uplink, the cardholder-data-environment (CDE) link, and the boundary between the user/branch zones and the data center; and all of it forwards into the nascent SIEM. Until now, the team had mostly used this telemetry reactively — to investigate alerts that fired elsewhere. The peer-bank disclosure prompts something new: a hunt, a deliberate search for an adversary that no alert has surfaced. This is the threat-hunting mindset that Chapter 22 will formalize; here, Theo and Priya apply its simplest and most productive hypothesis.

The hypothesis is one sentence: if an attacker has a foothold inside Meridian, some internal host is beaconing to an external C2 server on a regular interval, and that regularity will be visible in our flow data even though the channel is encrypted. It is a good first hunt because it requires no threat intel, no signatures, and no prior alert — only the flow data Meridian already collects and the understanding, from §10.5, that beaconing's tell is timing, not content.

It is worth pausing on why this is the right first hunt for a team that has never hunted before, because the reasoning generalizes. A productive hunt hypothesis has three properties, all of which this one has. It is falsifiable — there either is or is not an internal host with metronomic external check-ins, and the flow data can answer definitively. It is independent of the attacker's specific tools — it does not care which malware family, which C2 framework, or which destination is involved, because it keys on a behavior (regular check-ins) that is nearly universal to remote control rather than on any signature that a new campaign would evade. And it is cheap to run against data you already have — no new collection, no vendor feed, just a query over existing flow records. When you design your own hunts later (Chapter 22), test each candidate against those three properties; a hypothesis that fails any of them tends to produce either no answer or an answer only for the one attacker you already knew about.

🔗 Connection: This hunt is the network-telemetry half of the SolarWinds (Sunburst) lesson that recurs throughout the book. In that real campaign, compromised software beaconed to attacker infrastructure over channels designed to blend in. The defensive takeaway — captured in this case study and revisited as detection in Chapter 22 — is that beaconing leaves a timing signature in network telemetry that survives encryption. Meridian is practicing, on its own network, exactly the hunt that would have shortened that intrusion.

The Hunt

Phase 1 — Framing the question the flow data can answer

Theo starts not by staring at logs but by stating, precisely, what he is looking for in terms of fields the flow data actually contains. Beaconing, from §10.5, is many small flows from one internal host to one external destination, at a regular interval. Translated into a flow query, that is:

HUNT QUERY (plain language):
  Among flows where  src is INTERNAL (10.0.0.0/8)  and  dst is EXTERNAL,
  GROUP BY (src_ip, dst_ip, dst_port),
  KEEP groups with at least N connections over the window (say N >= 50 over 7 days),
  and for each, COMPUTE the regularity of the connection start times.
  RANK by regularity (beacon_score) x number of check-ins.

Two design choices matter here, and both come straight from the chapter. First, the grouping key is (source, destination, destination port) — because a beacon is defined by a persistent pairing, and collapsing all of one host's connections together would drown the signal in the host's ordinary browsing. Second, the minimum count (here, fifty connections over a week) exists because regularity is only meaningful when there is enough of it: three evenly spaced connections could be coincidence; fifty cannot. This is the same reasoning the beacon_score function encodes when it returns 0.0 for fewer than three timestamps and why a real hunt weights the score by the number of check-ins.
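The plain-language query above can be sketched in Python roughly as follows. This is a minimal illustration, not the chapter's implementation: the beacon_score shown here is a hypothetical stand-in defined as one minus the coefficient of variation of the inter-arrival gaps, the hunt function and the flow tuple layout are assumptions, "external" is taken to mean simply "not in 10.0.0.0/8", and the demo flows are synthetic.

```python
from collections import defaultdict
from ipaddress import ip_address, ip_network
from statistics import mean, pstdev

INTERNAL = ip_network("10.0.0.0/8")  # internal range named in the hunt query
MIN_CHECKINS = 50                     # N from the hunt query


def beacon_score(timestamps):
    """Hypothetical stand-in for the chapter's function: 1 - CV of the gaps."""
    if len(timestamps) < 3:           # too few points to call anything regular
        return 0.0
    ts = sorted(timestamps)
    gaps = [b - a for a, b in zip(ts, ts[1:])]
    avg = mean(gaps)
    if avg <= 0:
        return 0.0
    return max(0.0, 1.0 - pstdev(gaps) / avg)


def hunt(flows, min_checkins=MIN_CHECKINS):
    """flows: iterable of (src_ip, dst_ip, dst_port, start_time_in_seconds)."""
    groups = defaultdict(list)
    for src, dst, port, start in flows:
        # keep internal -> external flows only
        if ip_address(src) in INTERNAL and ip_address(dst) not in INTERNAL:
            groups[(src, dst, port)].append(start)
    rows = []
    for key, starts in groups.items():
        if len(starts) >= min_checkins:
            score = beacon_score(starts)
            rows.append((key, len(starts), score, score * len(starts)))
    return sorted(rows, key=lambda r: r[3], reverse=True)


# Synthetic demo data: an exact hourly beacon, an erratic host, a sparse pair.
flows = [("10.20.4.55", "198.51.100.7", 443, h * 3600) for h in range(168)]
t = 0
for i in range(60):
    flows.append(("10.20.12.9", "192.0.2.53", 443, t))
    t += 10 if i % 2 == 0 else 20000
flows += [("10.20.3.7", "203.0.113.20", 443, s) for s in (0, 900, 1800)]
ranked = hunt(flows)
```

Each returned row keeps the check-in count and the score as separate columns next to their product, so an analyst can read regularity and volume independently. The sparse three-connection pair never reaches the ranking because it falls below the minimum count.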

⚠️ Common Pitfall: Hunting for beacons by looking for "connections to known-bad IPs." That is signature thinking, and it fails against a fresh C2 server with no reputation — exactly the kind the peer bank's attacker used. The power of beacon hunting is that it is behavioral: it needs no prior knowledge of the destination at all. Theo deliberately does not filter by reputation; he lets the timing surface candidates, then investigates what they turn out to be.

Phase 2 — Running the query and reading the candidates

Theo runs the query over seven days of flow data and gets back a ranked list of (source, destination, port) groups with their check-in counts and beacon scores. The top of the list:

rank  src_ip       dst_ip:port        check-ins(7d)   avg_interval   beacon_score
─────────────────────────────────────────────────────────────────────────────────
  1   10.20.4.55   198.51.100.7:443        168           ~3600 s         0.98
  2   10.20.12.9   192.0.2.53:443         1,420          varies          0.41
  3   10.20.3.7    203.0.113.20:443         96           ~900 s          0.62
  4   10.20.8.40   198.51.100.7:443         168           ~3600 s         0.97

Figure CS1.1 — Top beacon-hunt candidates over seven days. Rank 1 (10.20.4.55) checks in 168 times — exactly 24 per day for 7 days — to one external IP, with a near-perfect regularity score of 0.98. Rank 4 is a second internal host beaconing to the same destination, a strong corroborating signal. Rank 2 is high-volume but irregular (a busy host doing normal varied traffic). Rank 3 is moderately regular but sparse.

Theo reads the list the way the chapter taught him. Rank 2's 1,420 connections look alarming by volume, but its beacon score of 0.41 says the timing is irregular — this is a busy machine doing ordinary, varied work (likely a host polling a cloud service with natural variation), not a beacon. He sets it aside; volume is not the beacon signal, regularity is. Rank 1 is the opposite: a modest 168 connections but a regularity score of 0.98, and the count decodes to exactly 24 per day — one per hour — for seven straight days. That is not how a person uses a computer. The corroborator is Rank 4: a second internal host beaconing to the same external IP on the same cadence, which is far more consistent with two compromised hosts phoning the same C2 than with two users independently developing identical hourly habits.
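To make the contrast concrete, the four rows of Figure CS1.1 can be sorted two ways. The sketch below uses only the figures from that table; the 0.9 cut-off for "metronomic" is an assumption chosen for illustration, not a value from the chapter.

```python
from operator import itemgetter

# Rows transcribed from Figure CS1.1: (src_ip, dst, check_ins_7d, beacon_score)
candidates = [
    ("10.20.4.55", "198.51.100.7:443", 168, 0.98),
    ("10.20.12.9", "192.0.2.53:443", 1420, 0.41),
    ("10.20.3.7", "203.0.113.20:443", 96, 0.62),
    ("10.20.8.40", "198.51.100.7:443", 168, 0.97),
]

by_volume = sorted(candidates, key=itemgetter(2), reverse=True)
by_regularity = sorted(candidates, key=itemgetter(3), reverse=True)

# 168 check-ins over 7 days decodes to a per-day rate
per_day = candidates[0][2] / 7

# Destinations contacted on a regular cadence by more than one internal host
REGULAR = 0.9  # assumed cut-off, for illustration only
shared = {}
for src, dst, _, score in candidates:
    if score >= REGULAR:
        shared.setdefault(dst, []).append(src)
shared = {dst: srcs for dst, srcs in shared.items() if len(srcs) > 1}
```

Sorting by volume puts the busy but irregular host first; sorting by regularity puts the two hourly hosts first, and the shared-destination check surfaces the corroborating pair.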

🛡️ Defender's Lens: Notice the discipline of not chasing the biggest number. A newcomer's eye jumps to Rank 2's 1,420 connections; an experienced hunter's eye goes to Rank 1's 0.98 score and the suspiciously round 24/day. The whole value of beacon_score is that it separates "a lot of traffic" (often benign) from "metronomic traffic" (often malicious). When you build behavioral detections, always ask which property actually carries the signal — here, regularity — and rank by that, not by whatever number is largest.

Phase 3 — Confirming with the timing detail

A high beacon score is a strong lead, not a conviction. Theo pulls the actual connection start times for 10.20.4.55 → 198.51.100.7:443 over a representative 24 hours and lays them out:

00:00:11   06:00:09   12:00:14   18:00:08
01:00:13   07:00:07   13:00:12   19:00:16
02:00:08   08:00:15   14:00:09   20:00:11
03:00:12   09:00:10   15:00:13   21:00:07
04:00:09   10:00:14   16:00:08   22:00:14
05:00:14   11:00:09   17:00:12   23:00:10
   (every flow ~2 KB out, ~1.2 KB in, duration < 1 s; jitter only a few seconds)

Figure CS1.2 — 24 consecutive hourly check-ins from the suspect host. The connection lands within a few seconds of the top of every hour, all day and all night, with near-constant tiny byte counts. The overnight check-ins (when no human is at the keyboard) and the few-second jitter are the dead giveaway: this is software on a schedule, not a person.

Running these timestamps through beacon_score confirms the regularity numerically — the inter-arrival gaps cluster tightly around 3,600 seconds, so the coefficient of variation is tiny and the score sits near 0.98. But the timing detail adds two human-legible confirmations the bare score does not: the check-ins continue overnight and on the weekend, when the workstation's user is demonstrably absent; and the byte counts are near-constant in both directions, the signature of a fixed-format check-in message rather than the variable sizes of real browsing. Theo now has high confidence. What he does not yet have is the what — because the channel is encrypted, the flow data cannot tell him what is being commanded or sent. For that, he pivots.
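The inter-arrival statistics behind that score can be checked directly from the 24 start times in Figure CS1.2. The sketch below computes only the gaps and their coefficient of variation; it does not reproduce the chapter's beacon_score, whose exact scaling may differ, and the "overnight" window of 00:00 to 06:00 is an assumption.

```python
from statistics import mean, pstdev

# Start times transcribed from Figure CS1.2, in chronological order
starts = """
00:00:11 01:00:13 02:00:08 03:00:12 04:00:09 05:00:14
06:00:09 07:00:07 08:00:15 09:00:10 10:00:14 11:00:09
12:00:14 13:00:12 14:00:09 15:00:13 16:00:08 17:00:12
18:00:08 19:00:16 20:00:11 21:00:07 22:00:14 23:00:10
""".split()


def to_seconds(hms):
    """Convert HH:MM:SS to seconds since midnight."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


secs = [to_seconds(t) for t in starts]
gaps = [b - a for a, b in zip(secs, secs[1:])]   # inter-arrival times
cv = pstdev(gaps) / mean(gaps)                    # coefficient of variation
offsets = [s % 3600 for s in secs]                # seconds past the hour
overnight = [t for t in starts if int(t[:2]) < 6] # assumed unattended hours

print(f"gaps {min(gaps)}..{max(gaps)} s, CV {cv:.4f}, "
      f"offsets {min(offsets)}..{max(offsets)} s, {len(overnight)} overnight")
```

Every gap falls within a few seconds of 3,600, so the coefficient of variation is a small fraction of one percent, and a quarter of the check-ins land in hours when nobody is at the desk.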

Phase 4 — Pivoting through Zeek to enrich the finding

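This is where the three-altitude model pays off. Flow data found the beacon; Zeek's richer logs tell Theo more about it without his needing to read encrypted payloads. conn.log and ssl.log describe the same connection and join on its uid; dns.log records the earlier lookup, which is a separate connection with its own uid, so it is matched by client address and resolved IP instead.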

conn.log confirms the cadence across all connections (not just the 24-hour sample) and shows conn_state = SF (clean establishment and teardown) every time — a stable, healthy channel, consistent with a well-behaved implant rather than a flaky misconfiguration.
ssl.log reveals the TLS details: the server presents a certificate for cdn-sync.example, a domain registered (per a quick lookup) six weeks ago, roughly three weeks before the beaconing started. A freshly registered domain hosting a destination that one of your workstations contacts every hour is a classic C2 tell. The SNI and certificate are visible precisely because, as §10.2 stressed, encryption hides the payload, not the metadata. (Under TLS 1.3 the certificate itself is encrypted during the handshake; the SNI usually remains visible unless Encrypted Client Hello is in use.)
dns.log shows how 10.20.4.55 learned the IP: it resolved cdn-sync.example shortly before the beaconing began, and re-resolves it periodically. Crucially, the DNS here is not itself the covert channel (the query volume is normal and the names are not high-entropy) — which tells Theo the exfiltration/command path is the HTTPS channel, not DNS tunneling. He notes this because it shapes where he looks next.
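The "not high-entropy" judgment about the DNS names can be made mechanical with a Shannon-entropy check on each queried label. This is a rough sketch: the function names and both thresholds are illustrative assumptions rather than tuned values, and a real detector would also look at query volume per host.

```python
import math
from collections import Counter


def shannon_entropy(label):
    """Bits per character of a DNS label; 0.0 for an empty string."""
    if not label:
        return 0.0
    counts = Counter(label)
    n = len(label)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def looks_tunneled(query, min_len=20, min_entropy=3.5):
    """Flag a query whose leftmost label is both long and high-entropy.

    Both thresholds are illustrative assumptions, not tuned values.
    """
    label = query.split(".")[0]
    return len(label) >= min_len and shannon_entropy(label) >= min_entropy


print(shannon_entropy("cdn-sync"))         # short, word-like label
print(looks_tunneled("cdn-sync.example"))  # the name seen in this case
```

A short, readable label such as cdn-sync scores low and is not flagged, whereas the long random-looking labels typical of DNS tunneling score much higher.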

🔗 Connection: The pivot from conn.log to ssl.log by shared uid, and on to dns.log by client address and resolved name, is the daily craft of network security monitoring described in §10.3, and it is what makes Zeek so much more than a packet logger. Each log answers a different question about the same channel: that it happened (conn.log), what server it reached (ssl.log), and how that server was found (dns.log). Stitched together, they turn a flow-data lead into an evidenced narrative, which is exactly the material a SIEM correlation rule (Chapter 21) or a responder (Chapter 24) needs.
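A minimal sketch of that pivot over in-memory records follows. The field names (uid, id.orig_h, id.resp_h, id.resp_p, conn_state, server_name, query, answers) follow Zeek's conn.log, ssl.log and dns.log; the record values and the enrich helper are constructed for illustration. Note that conn.log and ssl.log rows for the same connection share a uid, while the DNS lookup is a separate connection with its own uid, so the sketch matches it by client address and answer instead.

```python
# Constructed records shaped like Zeek JSON logs; values are invented.
conn = [
    {"ts": 1000.0, "uid": "CAbc1", "id.orig_h": "10.20.4.55",
     "id.resp_h": "198.51.100.7", "id.resp_p": 443, "conn_state": "SF"},
    {"ts": 990.0, "uid": "CDns1", "id.orig_h": "10.20.4.55",
     "id.resp_h": "10.0.0.53", "id.resp_p": 53, "conn_state": "SF"},
]
ssl = [{"ts": 1000.1, "uid": "CAbc1", "server_name": "cdn-sync.example"}]
dns = [{"ts": 990.0, "uid": "CDns1", "id.orig_h": "10.20.4.55",
        "query": "cdn-sync.example", "answers": ["198.51.100.7"]}]


def enrich(host, dst_ip, dst_port, conn, ssl, dns):
    """Attach TLS and DNS context to each matching conn.log record."""
    ssl_by_uid = {r["uid"]: r for r in ssl}
    out = []
    for c in conn:
        key = (c["id.orig_h"], c["id.resp_h"], c["id.resp_p"])
        if key != (host, dst_ip, dst_port):
            continue
        tls = ssl_by_uid.get(c["uid"], {})            # same connection: join on uid
        lookups = [d for d in dns                     # earlier lookup: join on host + answer
                   if d["id.orig_h"] == host
                   and dst_ip in d.get("answers", [])
                   and d["ts"] <= c["ts"]]
        latest = max(lookups, key=lambda d: d["ts"], default=None)
        out.append({"uid": c["uid"],
                    "conn_state": c["conn_state"],
                    "server_name": tls.get("server_name"),
                    "resolved_from": latest["query"] if latest else None})
    return out


evidence = enrich("10.20.4.55", "198.51.100.7", 443, conn, ssl, dns)
```

Each output row carries the three strands for one check-in: the connection state, the server name seen in the TLS handshake, and the query that produced the destination IP.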

Phase 4.5 — Ruling out the benign explanations

A disciplined analyst does not declare "compromise" the moment the score is high; plenty of legitimate software beacons. Automatic-update checkers, telemetry agents, cloud-sync clients, monitoring tools, and license servers all phone home on fixed schedules and would score high on beacon_score. The difference between a good hunter and a generator of false alarms is the habit of actively trying to explain the pattern away before escalating. Theo runs the suspect through that gauntlet:

BENIGN HYPOTHESIS                          EVIDENCE FOR / AGAINST
────────────────────────────────────────  ───────────────────────────────────────────
"It's a software update check."            AGAINST: updaters hit vendor CDNs with known
                                           certs; this cert is for a 6-week-old domain
                                           (cdn-sync.example) on no allow-list.
"It's a corporate telemetry/monitoring     AGAINST: 10.20.4.55 is a standard teller
 agent we deployed."                       workstation; the destination is in no asset
                                           inventory or approved-egress list.
"It's a cloud-sync client (backup/Drive)." AGAINST: byte counts are tiny + symmetric
                                           (~2KB/1.2KB); real sync shows large, lopsided
                                           transfers, not constant micro-exchanges.
"It's just one weird-but-legit app."       AGAINST: a SECOND host (10.20.8.40) beacons to
                                           the SAME new destination on the SAME cadence.

Figure CS1.2b — The benign-explanation gauntlet. Each ordinary reason a host might beacon is checked against the evidence and fails. The single most damning fact is the bottom row: two unrelated workstations independently developing an identical hourly habit to the same six-week-old domain is not how legitimate software behaves — it is how two compromised hosts phoning one C2 behave.

🛡️ Defender's Lens: Notice that ruling out benign explanations is also how you tune a beacon detector so it is usable in production. The legitimate beacons Theo enumerated (update CDNs, your own telemetry agents, approved cloud-sync) are exactly the destinations you add to an allow-list so the detection stops flagging them — turning a noisy "everything regular is suspicious" into a sharp "regular traffic to a destination we have not vetted is suspicious." A behavioral detection without this curation drowns the analyst in false positives and gets ignored; the curation is what makes it survive contact with a real SOC (the alert-fatigue problem Chapter 21 addresses head-on).
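The curation step can be sketched as a filter applied after scoring. The allow-list entries, the function names and the 0.9 score threshold below are all hypothetical; in practice the list would come from the organization's own asset inventory and approved-egress records.

```python
# Hypothetical allow-list of vetted destinations (SNI or certificate name).
VETTED = {"updates.vendor.example", "telemetry.corp.example", "sync.cloud.example"}


def is_vetted(server_name, vetted=VETTED):
    """True if the name equals a vetted entry or is a subdomain of one."""
    name = (server_name or "").lower().rstrip(".")
    return any(name == v or name.endswith("." + v) for v in vetted)


def unvetted_beacons(candidates, min_score=0.9):
    """Keep regular traffic to destinations nobody has vetted.

    candidates: dicts with 'src', 'server_name' and 'score' keys.
    min_score is an assumed threshold, not a value from the chapter.
    """
    return [c for c in candidates
            if c["score"] >= min_score and not is_vetted(c["server_name"])]


candidates = [
    {"src": "10.20.4.55", "server_name": "cdn-sync.example", "score": 0.98},
    {"src": "10.20.6.12", "server_name": "eu.updates.vendor.example", "score": 0.99},
    {"src": "10.20.12.9", "server_name": "api.cloud.example", "score": 0.41},
]
alerts = unvetted_beacons(candidates)
```

Matching on the full name or a dot-delimited suffix, rather than a bare substring, keeps a look-alike such as notupdates.vendor.example from slipping through the allow-list.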

This step also illustrates a subtle point about evidence. None of the four benign hypotheses could be disproved by a single fact in isolation — a tiny symmetric transfer alone might be a quirky app; a new domain alone might be a new-but-legitimate service. It is the combination — new uncatalogued destination, plus tiny constant symmetric payloads, plus around-the-clock timing, plus a second host on the identical pattern — that no innocent explanation survives. Building a case from converging independent signals, rather than one decisive smoking gun, is the normal texture of network investigation, and it is why having multiple telemetry sources (flow, conn.log, ssl.log, dns.log) matters: each adds an independent strand to the rope.

Phase 5 — Scoping with retained telemetry, and the handoff

Before this becomes an incident, Priya asks the two questions that always follow a confirmed beacon: how long has this been going on, and has data left? Because Meridian retains 13 months of flow data and 90 days of Zeek logs, Theo can actually answer.

He extends the flow query backward and finds the hourly check-ins for 10.20.4.55 begin 21 days ago, matching, within a day, the host's first DNS resolution of cdn-sync.example and falling about three weeks after the domain was registered. He then runs the §10.5 exfiltration check on the same host: total outbound bytes per destination over those 21 days. The beacon flows themselves are tiny (a few KB each), and crucially there is no large outbound transfer to 198.51.100.7 or anywhere unusual; the cumulative outbound to the C2 is consistent with check-ins only, not bulk theft. The provisional, evidence-based conclusion: the host is under C2 control and has been for three weeks, but no large-scale exfiltration has occurred yet. That is a meaningfully different situation from the peer bank's (the team has caught it in the foothold-and-control phase, before the loss), and it changes the urgency and the containment options.
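Both scoping questions reduce to simple aggregations over the retained flow records. The sketch below assumes a flow tuple of (src, dst, start_epoch, bytes_out); the scope function, the 100 MiB "bulk" threshold and the demo flows are illustrative assumptions, not values from the chapter.

```python
from collections import defaultdict

BULK_BYTES = 100 * 1024 * 1024   # assumed "bulk" threshold (100 MiB), to be tuned


def scope(flows, host, c2_ip):
    """Return (first check-in time, outbound bytes per destination, bulk destinations).

    flows: iterable of (src, dst, start_epoch, bytes_out).
    """
    first_seen = None
    out_by_dst = defaultdict(int)
    for src, dst, start, bytes_out in flows:
        if src != host:
            continue
        out_by_dst[dst] += bytes_out
        if dst == c2_ip and (first_seen is None or start < first_seen):
            first_seen = start
    bulk = {d: b for d, b in out_by_dst.items() if b >= BULK_BYTES}
    return first_seen, dict(out_by_dst), bulk


# Synthetic demo: 21 days of hourly 2,048-byte check-ins plus ordinary traffic.
T0 = 1_700_000_000
flows = [("10.20.4.55", "198.51.100.7", T0 + h * 3600, 2048) for h in range(21 * 24)]
flows.append(("10.20.4.55", "192.0.2.80", T0 + 500, 5_000_000))
first_seen, out_by_dst, bulk = scope(flows, "10.20.4.55", "198.51.100.7")
```

Summing over every destination, not only the known C2, is what lets the check notice a large transfer that left by some other path.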

SCOPED FINDING (handed to Incident Response, Chapter 24):
  Confirmed:  10.20.4.55 (and corroborating 10.20.8.40) beaconing to
              198.51.100.7:443 (cert: cdn-sync.example, registered ~6 wks ago),
              hourly, for ~21 days, encrypted HTTPS.
  Evidence:   beacon_score 0.98; 168 check-ins/wk; overnight+weekend pattern;
              constant ~2KB/1.2KB sizes; conn_state SF; DNS resolution timeline.
  Exfil:      NO bulk outbound observed to the C2 over 21 days (flow-confirmed).
  NOT yet known (network blind spot): the encrypted payload contents -> need
              endpoint forensics on 10.20.4.55 to identify the implant + tasking.
  Recommend:  contain both hosts; preserve; begin IR. (-> Ch.24)

Figure CS1.3 — The scoped finding. Note what network telemetry could establish (the channel, its duration, its cadence, the absence of bulk exfiltration) and what it explicitly could not (the payload contents and the implant identity), which defines the handoff to endpoint forensics and incident response.

The "no bulk exfiltration yet" line deserves care, because it is the kind of statement that, said carelessly, gets a defender in trouble. What the flow data actually supports is narrower than "no data left": it supports "no large-volume outbound transfer to the C2 or to any anomalous destination occurred in the retained window." That leaves three honest caveats Theo writes into the finding rather than glossing over. First, small amounts of data can ride out inside the beacon check-ins themselves: a few kilobytes an hour is invisible as a volume anomaly but adds up, so "no bulk theft" is not "nothing sensitive left." Second, the conclusion is bounded by the data that exists: it is confident for 21 days because that is how far back the beacon goes, well within what the collector holds, but it can say nothing about a hypothetical earlier channel that predates collection, which began only two months ago however long the retention policy is. Third, data could have left by a path the C2 beacon does not reveal (a different destination, or a covert channel), which is why Theo also checked the host's total outbound to all destinations and its dns.log, not just its traffic to the known C2. A finding that states its own boundaries is worth far more to the responders who inherit it than a confident-sounding overreach, because they will make containment and notification decisions on the basis of exactly these words.
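The first caveat is easy to quantify. As a back-of-the-envelope sketch, assume every outbound byte of every check-in carried stolen data; that is an upper bound, not something the flow data shows, and it uses the "~2 KB out" figure from Figure CS1.2 rounded to 2,000 bytes.

```python
CHECKINS_PER_DAY = 24          # hourly beacon
DAYS = 21                      # duration established in Phase 5
OUT_BYTES_PER_CHECKIN = 2_000  # "~2 KB out" per flow, treated as an upper bound

checkins = CHECKINS_PER_DAY * DAYS
max_bytes_out = checkins * OUT_BYTES_PER_CHECKIN
print(f"{checkins} check-ins, at most ~{max_bytes_out / 1_000_000:.1f} MB outbound")
```

Roughly a megabyte over three weeks would never register as a volume anomaly, yet it is ample room for credentials or a small file of account numbers, which is why the finding says "no bulk exfiltration" rather than "no exfiltration."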

It is also worth noting what this hunt cost, because that is what makes it repeatable rather than a heroic one-off. It consumed an afternoon of one junior analyst's time and one query over data Meridian was already collecting and paying to retain. It required no new tooling, no threat-intelligence subscription, and no prior alert. That economics is the entire argument for building the visibility layer: once the flow data is collected and retained, hunts like this are cheap, and a SOC can run them on a schedule — weekly, say — turning a fortunate catch into a standing capability. Priya's note to Dana afterward made exactly this point: the value was not that Theo was brilliant (the method is mechanical), but that the data was there to be queried. The investment decision that mattered was made months earlier, when someone funded the collector and chose a 13-month retention.

🚪 Threshold Concept: The hunt did not "prevent" anything — the host was already compromised. Its value was visibility: it converted an unknown, ongoing intrusion into a scoped, evidenced finding with a known start date, a known channel, and a confident statement about what had and had not been stolen. That is what network monitoring buys you. Prevention will always eventually fail (Theme 4); the question that decides whether a failure becomes a catastrophe is whether you can see it in time to scope and stop it. Meridian could, because someone decided — before the incident — to collect and retain the flow data.

Discussion Questions

Theo deliberately ignored Rank 2 (1,420 connections, score 0.41) and pursued Rank 1 (168 connections, score 0.98). Restate the principle this illustrates, and describe a situation where ranking by volume instead of regularity would send a hunter down the wrong path.
Meridian's long retention let the team scope the beacon, but the hunt query itself ran over only seven days of flow data. How would this hunt have gone if the attacker had used a once-per-day beacon and Meridian retained only 7 days of flow data?
The team concluded "no bulk exfiltration yet" from flow data. What are the limits of that conclusion — what kinds of data theft might not show up as a large outbound flow, and which Chapter 9 / §10.5 idea would you check to cover that gap?
The finding explicitly separates "what the network could establish" from "what it could not." Why is naming the blind spot (encrypted payload → need endpoint forensics) a sign of a mature analysis rather than an incomplete one?
Two internal hosts beaconed to the same C2. List three benign explanations a careful analyst should rule out before concluding "two compromised hosts," and how the evidence in Figures CS1.1–CS1.3 argues against each.

Your Turn

You are handed seven days of flow data (as (src, dst, dst_port, start_time, bytes) rows) for a small network. (1) Write, in pseudocode, the hunt query from Phase 1: the internal→external filter, the grouping key, the minimum-count threshold, and the ranking by regularity × count. (2) For one candidate group, list the timestamps you would extract and explain how you would distinguish a true beacon from a host legitimately polling a service every few minutes. (3) Specify the exact Zeek pivots (which logs, by what key, looking for what) you would run to enrich a confirmed beacon. (4) Draft the three-line scoped finding you would hand to incident response, explicitly stating one thing the network telemetry cannot tell you. Keep it to one page.

Key Takeaways

Beacon hunting is behavioral, not signature-based: it needs no prior knowledge of the destination, because the tell is the regularity of the timing, which survives encryption and a clean reputation.
Rank by the property that carries the signal. For beaconing that is regularity (beacon_score), not volume; the biggest number is frequently benign and the metronomic small one is frequently malicious.
Flow data finds it; Zeek enriches it. The flow census surfaces the cadence; pivoting from conn.log to ssl.log by uid, and to dns.log by client host and resolved name, reconstructs what server was reached and how it was found, without reading payloads.
Long retention is what makes scoping possible. 13 months of flow let Theo establish a 21-day duration and confirm the absence of bulk exfiltration; a 7-day window would have shown the beacon but not its history, and would miss a slow daily beacon entirely.
Name the blind spot. Network telemetry could prove the channel, its duration, and the lack of bulk exfiltration, but not the encrypted payload or the implant identity — which precisely defines the handoff to endpoint forensics and incident response (Chapter 24).
Visibility, not prevention, is the win here. The host was already compromised; the hunt's value was converting an unknown intrusion into a scoped, evidenced, contain-able finding — the whole purpose of the network-monitoring layer.