Key Takeaways: Security Information and Event Management (SIEM)

DataField.Dev

A scannable, reference-grade summary for review and the exam. Roughly 80% tables and decision aids. If you remember one thing: a breach is usually a failure to correlate data, not to collect it — and fidelity, not coverage, is the currency of a SOC.

Core definitions (the term ledger for this chapter)

Term	One-line definition
SIEM	System that ingests, normalizes, correlates, and alerts on logs from many sources for real-time detection and investigation.
Log source	Any system/application that produces timestamped event records (AD, EDR, firewall, cloud, app…).
Normalization	Mapping fields from many source formats onto one common schema (consistent names/formats).
Parsing	Extracting the meaningful fields out of a raw log message (the step before normalization).
Correlation rule	Logic that fires an alert when a defined pattern occurs across events/sources/time.
Use case (detection)	A named threat scenario to detect, with its logic, sources, severity, response, and false-positive risk.
Alert fatigue	Desensitization/degraded performance from too many alerts (mostly false positives) — causes missed attacks.
False positive	An alert with no real malicious activity. (False negative: a real attack with no alert.)
Log retention	How long logs are kept; driven by detection/hunting needs and compliance (PCI-DSS, GLBA).
Data lake	Cheap, long-term store of vast raw data; schema-on-read; no real-time correlation by itself.
SOAR	Security Orchestration, Automation, and Response — automates the response to alerts via playbooks.
Detection-as-code	Managing detection rules as version-controlled, reviewable, testable text (e.g., Sigma).
Dashboard	At-a-glance visual of metrics/events from SIEM queries (operational vs. executive).

Log-source priority (collect top-down, by detection value)

#	Source	Method	Why it ranks here
1	Identity / auth (AD, Entra, IdP, VPN)	agent + API	Identity is the new perimeter; catches credential abuse across the whole kill chain
2	Endpoint detection (EDR)	vendor agent	Process creation, persistence, defense evasion — code execution lives here
3	Cloud control plane (CloudTrail / Azure / GCP)	API pull	The only place cloud attacker actions are recorded
4	Network edge (firewall, proxy, DNS, IDS/IPS)	syslog (TLS)	Context + C2/exfil detection (builds on Ch.10)
5	Servers (Windows/Linux)	agent / syslog	Interactive logons, privilege changes, key services
6	Critical applications (core/online banking)	agent / app logs	Access to crown jewels; high-value custom detections
7	SaaS / email (M365, mail gateway)	API pull	Phishing, account takeover

Rule of thumb: Easy-to-collect ≠ worth-collecting. Collect what serves a use case; send a cheaper full copy to a data lake.

Collection methods

Method	Use for	Trade-off
Agent	Hosts, endpoints (rich data)	Reliable; must deploy/maintain on every machine
Syslog	Network devices, Unix (use TLS)	Universal and cheap; loose, inconsistent format
API pull	Cloud / SaaS (no agents allowed)	High value (identity/cloud); scheduled, rate-limited
Stream (Kafka-style)	Very high-volume sources	Decouples firehose from SIEM; more infrastructure

Normalization at a glance

RAW (3 sources, same event)                    NORMALIZED (common schema)
 sshd "Failed password for jchen from X"   ┐    {timestamp, source, user, src_ip,
 4625 TargetUserName=jchen IpAddress=X     ┼──►  action, outcome, host}
 firewall "DENY TCP X:p -> Y:22"           ┘    -> one query/rule works across all

Parse to get fields out; normalize to give them common names/formats.
Map to a published model — ECS, OCSF, or Splunk CIM — don't invent field names.
UTC everywhere; NTP on every source. Time-based correlation breaks under clock drift.
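
The parse-then-normalize step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the bluekit `siem.py` implementation: the regular expressions are hypothetical and match only the simplified message shapes shown in the diagram, the extra `ts` and `host` parameters are added for self-containment, and lowercasing the user name is one possible convention.

```python
import re
from datetime import datetime, timezone

# Hypothetical patterns for two of the simplified raw formats shown above.
SSHD_RE = re.compile(r"Failed password for (?:invalid user )?(?P<user>\S+) from (?P<src_ip>\S+)")
WIN_RE = re.compile(r"TargetUserName=(?P<user>\S+)\s+IpAddress=(?P<src_ip>\S+)")
PARSERS = {"sshd": SSHD_RE, "windows": WIN_RE}


def to_utc_iso(ts: datetime) -> str:
    """Render a timezone-aware timestamp as ISO 8601 in UTC."""
    if ts.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    return ts.astimezone(timezone.utc).isoformat()


def normalize(raw: str, source: str, ts: datetime, host: str) -> dict:
    """Parse one raw failed-logon message and map it onto the common schema."""
    pattern = PARSERS.get(source)
    if pattern is None:
        raise ValueError(f"no parser for source {source!r}")
    match = pattern.search(raw)          # parsing: pull the fields out
    if match is None:
        raise ValueError(f"unparseable {source} message: {raw!r}")
    return {                             # normalization: common names and formats
        "timestamp": to_utc_iso(ts),
        "source": source,
        "user": match["user"].lower(),   # assumed convention: lowercase user names
        "src_ip": match["src_ip"],
        "action": "login",
        "outcome": "failure",
        "host": host,
    }


ts = datetime(2024, 1, 1, 14, 0, tzinfo=timezone.utc)
ssh_event = normalize("Failed password for jchen from 203.0.113.77", "sshd", ts, "web01")
win_event = normalize("4625 TargetUserName=JChen IpAddress=203.0.113.77", "windows", ts, "dc01")
```

After normalization both events carry the same `user` and `src_ip` fields, so a single query or rule can match either source.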

The correlation ladder (simple → powerful)

Rung	Type	Example	Fidelity
1	Single-event (atomic)	Audit log cleared (Win 1102); login from a banned country	High but narrow
2	Threshold	50 failed logins from one source, many accounts, 5 min (spray)	Good; counting is the power
3	Sequence / temporal	Failure burst → success for same account (brute force worked)	High; reads ordered attacks
4	Cross-source	IDS exploit alert → host's outbound to new IP (exploit → C2)	High; needs normalization
5	Behavioral / baseline	Service account logs in interactively for the first time	Powerful; noisy early (→ Ch.34)

Why correlation beats single-event alerting: attacks are processes (kill chain, Ch.2); each step looks ordinary, but the sequence/combination betrays the attacker.
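
As an illustration of rung 2, a threshold rule for password spraying might look like the sketch below. It assumes events already normalized to dictionaries with `timestamp`, `src_ip`, `user`, `action`, and `outcome` keys; the function name and the default threshold of 10 distinct accounts are hypothetical and would need tuning.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def detect_spray(events, min_accounts=10, window=timedelta(minutes=5)):
    """Return source IPs whose failed logins touch at least `min_accounts`
    distinct users inside any sliding window of length `window`."""
    failures = defaultdict(list)  # src_ip -> [(timestamp, user), ...]
    for e in events:
        if e["action"] == "login" and e["outcome"] == "failure":
            failures[e["src_ip"]].append((e["timestamp"], e["user"]))

    flagged = set()
    for src_ip, hits in failures.items():
        hits.sort(key=lambda h: h[0])
        start = 0
        for end in range(len(hits)):
            # Advance the left edge until the window spans at most `window`.
            while hits[end][0] - hits[start][0] > window:
                start += 1
            users = {user for _, user in hits[start:end + 1]}
            if len(users) >= min_accounts:
                flagged.add(src_ip)
                break
    return flagged


# Example: twelve accounts tried from one address within twelve seconds.
t0 = datetime(2024, 1, 1, 12, 0)
demo = [
    {"timestamp": t0 + timedelta(seconds=i), "src_ip": "203.0.113.77",
     "user": f"user{i}", "action": "login", "outcome": "failure"}
    for i in range(12)
]
print(detect_spray(demo))  # {'203.0.113.77'}
```

Counting distinct accounts rather than raw failures is a design choice: it separates a spray from one user repeatedly mistyping a password.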

Meridian's first ten use cases (starter catalog)

#	Use case	Rung	ATT&CK (approx.)
1	Brute force followed by success	sequence	T1110 → T1078
2	Password spraying (one src, many users)	threshold	T1110.003
3	Impossible travel	sequence/geo	T1078
4	Security/audit log cleared	single-event	T1070.001
5	New privileged-group membership added	single-event	T1098 / T1078
6	Service account interactive logon (new)	behavioral	T1078.002
7	Disabled/expired account login attempt	single-event	T1078
8	MFA disabled or reset	single-event	T1556
9	Outbound to known-bad / new external IP	cross-source	T1071 (C2)
10	Mass file access or deletion (ransomware)	threshold	T1486
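
Use case 1 (brute force followed by success) is a sequence rule, and its logic can be sketched as follows. The event shape (dictionaries with `timestamp`, `user`, `action`, `outcome`), the function name, and the defaults of 5 failures in 10 minutes are assumptions for illustration only.

```python
from datetime import datetime, timedelta


def brute_force_then_success(events, min_failures=5, window=timedelta(minutes=10)):
    """Yield (user, success_time) when a successful login follows at least
    `min_failures` failed logins for the same account within `window`."""
    recent_failures = {}  # user -> timestamps of recent failed logins
    for e in sorted(events, key=lambda e: e["timestamp"]):
        if e["action"] != "login":
            continue
        user, ts = e["user"], e["timestamp"]
        fails = recent_failures.setdefault(user, [])
        # Drop failures that have aged out of the window.
        fails[:] = [t for t in fails if ts - t <= window]
        if e["outcome"] == "failure":
            fails.append(ts)
        elif e["outcome"] == "success":
            if len(fails) >= min_failures:
                yield user, ts
            fails.clear()  # a success resets the sequence for this account


t0 = datetime(2024, 1, 1, 3, 0)
demo = [
    {"timestamp": t0 + timedelta(seconds=10 * i), "user": "jchen",
     "action": "login", "outcome": "failure"}
    for i in range(6)
]
demo.append({"timestamp": t0 + timedelta(minutes=2), "user": "jchen",
             "action": "login", "outcome": "success"})
demo.append({"timestamp": t0 + timedelta(minutes=3), "user": "asmith",
             "action": "login", "outcome": "success"})
alerts = list(brute_force_then_success(demo))
```

Here only `jchen` alerts: the success for `asmith` has no preceding failure burst, which is exactly the ordering information a single-event rule cannot see.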

Querying: the same shape in three dialects

Investigation: count logins per user from one source IP over the last hour, highest count first.

Language	Used by	Style
SQL	databases, data lakes	leads with `SELECT … GROUP BY … ORDER BY`
SPL	Splunk	pipeline: `search | stats … by | sort`
KQL	Microsoft Sentinel/Defender	pipeline: `where | summarize … by | sort`

-- SQL
SELECT user, COUNT(*) AS attempts FROM events
WHERE action = 'login' AND src_ip = '203.0.113.77' AND timestamp >= NOW() - INTERVAL '1' HOUR
GROUP BY user ORDER BY attempts DESC;

-- SPL
index=auth action=login src_ip="203.0.113.77" earliest=-1h
| stats count AS attempts by user | sort - attempts

// KQL
Events | where timestamp >= ago(1h) | where action == "login" and src_ip == "203.0.113.77"
| summarize attempts=count() by user | sort by attempts desc

Shape to memorize: filter (lead with time bound) → aggregate → sort → (sometimes join).
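
The same filter, aggregate, sort shape can be written in plain Python over normalized events, which is handy for checking a query's logic offline. This is a sketch: the function name and event keys are assumptions, and `now` is passed in explicitly so the result is reproducible.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def logins_by_user(events, src_ip, now, lookback=timedelta(hours=1)):
    """Count login events per user for one source IP over the lookback
    period, highest count first."""
    cutoff = now - lookback
    counts = Counter(
        e["user"]
        for e in events
        if e["timestamp"] >= cutoff      # filter: time bound first
        and e["action"] == "login"
        and e["src_ip"] == src_ip
    )
    return counts.most_common()          # aggregate, then sort descending


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
demo = [
    {"timestamp": now - timedelta(minutes=5), "user": "jchen",
     "action": "login", "src_ip": "203.0.113.77"},
    {"timestamp": now - timedelta(minutes=6), "user": "jchen",
     "action": "login", "src_ip": "203.0.113.77"},
    {"timestamp": now - timedelta(minutes=7), "user": "asmith",
     "action": "login", "src_ip": "203.0.113.77"},
    {"timestamp": now - timedelta(hours=2), "user": "asmith",
     "action": "login", "src_ip": "203.0.113.77"},   # outside the last hour
    {"timestamp": now - timedelta(minutes=1), "user": "bob",
     "action": "login", "src_ip": "198.51.100.5"},   # different source IP
]
result = logins_by_user(demo, "203.0.113.77", now)
```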

Alert tuning — the fatigue toolkit

Technique	What it does	Watch out for
Tune thresholds/conditions	Narrow the rule (raise count, require new IP, span many accounts)	Don't narrow away the real attack
Allowlist known-benign	Exclude scanners, backup/service accounts, monitoring	An allowlist is a documented hole — review it
Aggregate / deduplicate	100 identical alerts → one "100×"	Don't merge genuinely distinct events
Risk-based alerting	Weak signals accumulate a score; surface high scorers	Tune weights; needs entity tracking
Suppress / schedule	Mute known maintenance/batch windows	Keep the window tight and documented

Decision rule: a noisy-but-valuable rule → TUNE (narrow its conditions). A noisy-and-worthless rule → consider disabling it as a documented risk decision. Disabling a valuable rule = a silent false negative, which is strictly worse than visible noise.
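
Risk-based alerting from the table above can be sketched as a per-entity score. The signal names, weights, and threshold here are hypothetical placeholders, not recommended values; real deployments tune them per environment.

```python
from collections import defaultdict

# Hypothetical weights for weak signals; tune per environment.
WEIGHTS = {
    "new_country_login": 30,
    "mfa_reset": 40,
    "rare_process": 20,
    "off_hours_logon": 10,
}


def risk_alerts(signals, threshold=60):
    """Sum weighted weak signals per entity and return the entities at or
    above `threshold`, highest score first."""
    scores = defaultdict(int)
    seen = set()
    for entity, signal in signals:
        if (entity, signal) in seen:     # count each signal once per entity
            continue
        seen.add((entity, signal))
        scores[entity] += WEIGHTS.get(signal, 0)
    hits = [(entity, score) for entity, score in scores.items() if score >= threshold]
    return sorted(hits, key=lambda h: h[1], reverse=True)


demo = [
    ("jchen", "new_country_login"),
    ("jchen", "mfa_reset"),
    ("jchen", "mfa_reset"),              # duplicate, ignored
    ("asmith", "off_hours_logon"),
    ("asmith", "rare_process"),
]
print(risk_alerts(demo))  # [('jchen', 70)]
```

No single signal for `jchen` would justify an alert on its own; the accumulated score does, while `asmith` (score 30) stays below the line.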

The fatigue arithmetic: 800 alerts/day × 3% true = 24 true positives buried among 776 false positives; a 5-analyst SOC (capacity ~100 alerts/day) cannot find them. Fidelity, not coverage.
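
The arithmetic can be checked with a few lines of Python. The function name is hypothetical, and the last figure rests on an added assumption: that the SOC's capacity is about 100 alerts per day and that analysts pick alerts with no way of prioritizing the true ones.

```python
def fatigue(alerts_per_day: int, true_rate: float, daily_capacity: int) -> dict:
    """Break a daily alert volume into true/false positives and estimate
    how many true positives are seen if triage is effectively random."""
    true_pos = round(alerts_per_day * true_rate)
    false_pos = alerts_per_day - true_pos
    triaged_share = min(1.0, daily_capacity / alerts_per_day)
    return {
        "true_positives": true_pos,
        "false_positives": false_pos,
        "untriaged": max(0, alerts_per_day - daily_capacity),
        "expected_true_seen": triaged_share * true_pos,  # assumes random triage
    }


summary = fatigue(alerts_per_day=800, true_rate=0.03, daily_capacity=100)
```

Under those assumptions the SOC looks at one alert in eight, so it expects to see roughly 3 of the 24 true positives and leaves 700 alerts untouched each day.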

SIEM vs. Data Lake vs. SOAR

	SIEM	Data Lake	SOAR
Job	Detect & investigate	Store cheaply, long	Respond (automate)
Real-time correlation?	Yes	No (by itself)	Acts on SIEM alerts
Schema	on write (normalized)	on read	n/a
Cost driver	ingest volume / storage	raw storage (cheap)	integrations
Used for	alerting, queries, dashboards	retention, hunting, forensics	playbooks: enrich, contain, ticket

Modern pattern: high-value logs → SIEM (hot, real-time); cheaper full copy → data lake (cold, long retention); response orchestrated by SOAR.

Dashboards & metrics (bridge to Ch.36)

Audience	Dashboard	Shows
SOC	Operational	open alerts by severity, oldest un-triaged, noisy rules, log-source health
Leadership	Executive	trends: MTTD, MTTR, FP rate, ATT&CK coverage, program status

Metrics are born in the SIEM: MTTD/MTTR need timestamped activity + alert records. No logging → no measurement (developed in Ch.36).
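
A sketch of how MTTD and MTTR fall out of timestamped records. Definitions vary between organizations; this example assumes MTTD is measured from the start of malicious activity to detection and MTTR from detection to resolution, and the record keys `started`, `detected`, and `resolved` are hypothetical.

```python
from datetime import datetime, timedelta
from statistics import mean


def mean_times(incidents):
    """Return (MTTD, MTTR) as timedeltas for a non-empty list of incidents."""
    detect = [(i["detected"] - i["started"]).total_seconds() for i in incidents]
    respond = [(i["resolved"] - i["detected"]).total_seconds() for i in incidents]
    return timedelta(seconds=mean(detect)), timedelta(seconds=mean(respond))


t = datetime(2024, 1, 1)
demo = [
    {"started": t, "detected": t + timedelta(hours=2), "resolved": t + timedelta(hours=6)},
    {"started": t, "detected": t + timedelta(hours=4), "resolved": t + timedelta(hours=6)},
]
mttd, mttr = mean_times(demo)
```

Every input is a timestamp that only exists if the activity and the alert were logged, which is the point of the note above.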

Recurring themes surfaced

Theme	In this chapter
1 — Process, not product	A SIEM is a loop (detect→investigate→respond→tune); tuning is standing operations
2 — Asymmetry	Attacker needs one ignored alert; alert fatigue hands it over for free
4 — Defense in depth / assume breach	Logging assumes prevention fails; correlation catches the attacker already inside
5 — Compliance is the floor	PCI-DSS/GLBA mandate logging, but the goal is detection beyond the audit

Certification crosswalk

Concept	CompTIA Security+	(ISC)² CISSP
SIEM, log aggregation/correlation	Security Operations (logging & monitoring)	Domain 7 — Security Operations
Log sources, collection, retention	Monitoring; data sources	Domain 7; Domain 2 (asset/retention)
Normalization, time sync (NTP/UTC)	Logging concepts	Domain 7
Correlation rules / use cases	Detection & alerting	Domain 7 (detective controls)
Alert fatigue, false pos/neg, tuning	Alerting & monitoring	Domain 7
SOAR / automation	Automation & orchestration	Domain 7
Dashboards, MTTD/MTTR	Reporting	Domain 7; Domain 1 (governance metrics)

Common pitfalls (quick hit-list)

☐ "Collect everything" → expensive, noisy SIEM that gets ignored. Collect by use case.
☐ Easy-to-collect sources prioritized over high-value ones.
☐ No time bound on queries → scans terabytes, degrades the SIEM.
☐ Clocks not synced / not UTC → sequence correlation silently fails.
☐ Importing hundreds of vendor-default rules → instant alert fatigue.
☐ Disabling a noisy-but-valuable rule instead of tuning it → silent blind spot.
☐ Allowlists added without owner/justification/review → undocumented holes.
☐ Logs only on the host → an attacker clears them. Ship off-box.
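
One way to keep allowlists from becoming undocumented holes is to store each entry with its owner, justification, and review date, then flag anything incomplete or overdue. The sketch below is illustrative; the class, field, and function names are assumptions rather than part of any particular SIEM.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AllowlistEntry:
    pattern: str          # what is excluded, e.g. a scanner IP or service account
    owner: str            # who answers for the exclusion
    justification: str    # why it is considered benign
    review_by: date       # when it must be re-confirmed


def needs_review(entries, today):
    """Return entries that lack an owner or justification, or whose review
    date has arrived or passed."""
    return [
        e for e in entries
        if not e.owner or not e.justification or e.review_by <= today
    ]


entries = [
    AllowlistEntry("10.0.5.20", "vuln-mgmt team", "authorized scanner", date(2025, 6, 1)),
    AllowlistEntry("svc-backup", "", "nightly backup job", date(2025, 6, 1)),
    AllowlistEntry("10.0.9.9", "netops", "monitoring probe", date(2024, 1, 1)),
]
flagged = needs_review(entries, today=date(2024, 12, 1))
```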

Program + toolkit additions

Program artifact: Meridian's logging & monitoring standard — prioritized source list (with owners/onboarding), normalization + UTC/NTP, ≥1-year retention with hot/cold split, first-ten use-case catalog, detections-as-code, weekly tuning review.
bluekit module: siem.py — normalize(raw, source) (raw event → common schema) and correlate(events, rule) (threshold/sequence rule → alerts).
Builds on: Ch.10 (network monitoring feeds the SIEM), Ch.7 (firewall/IDS as sources). Sets up: Ch.22 (detection engineering & hunting, Sigma), Ch.24 (IR consumes alerts; SOAR orchestrates response), Ch.36 (metrics & board reporting). Spaced review: Ch.10, Ch.6.