Case Study 1: Meridian's AWS Posture Review and Remediation

DataField.Dev


"We didn't get attacked. We just left the doors open and got lucky that no one walked in first." — Sam Whitfield, Security Engineer, Meridian Regional Bank (constructed)

Executive Summary

Meridian Regional Bank's AWS footprint had grown the way most do: not by plan, but by accretion. Over five years, individual teams had created accounts, spun up storage buckets, written IAM policies under deadline, and opened network ports to "get something working". Each decision was reasonable in isolation; none was reviewed as a whole. After the program-maturation initiative began (Chapter 1), CISO Dana Okafor asked security engineer Sam Whitfield and junior analyst Theo Brandt to do something the bank had never done: a complete security posture review of its AWS environment, and a remediation plan to close what they found. They expected to hunt for attackers. What they found was an environment full of self-inflicted openings: a public bucket of loan documents, IAM users with administrative wildcards, a database reachable from the entire internet, and audit logging that covered only one of the regions the bank used. None of it had been breached. All of it could have been, and the bank would not have known. This case study follows the review from first scan to a hardened baseline, applying every concept in the chapter to a realistic (constructed) environment. The scenario and all figures are constructed for teaching (Tier 3).

Skills applied: the shared-responsibility model as a control-ownership map; reading S3 ACLs and IAM policies to find public storage and over-broad access; security-group analysis for 0.0.0.0/0 exposure; prioritizing CSPM findings by real risk; enabling and protecting CloudTrail; designing preventive guardrails (Block Public Access, service control policies); building a cloud security baseline that evidences the customer side of PCI-DSS.

Background

Meridian runs AWS as its primary cloud, alongside Microsoft Entra ID and M365 (Chapter 1's infrastructure facts are frozen). Its AWS usage spans several workloads: a loan-document archive in S3; a customer-facing marketing site; a data-analytics environment that processes (de-identified, in theory) transaction extracts; and a handful of internal applications migrated from the on-premises data center. The bank holds data in scope for both PCI-DSS (cardholder data) and the GLBA Safeguards Rule (customer financial information), which means the customer side of the shared-responsibility line is not optional hygiene — it is a regulatory obligation Meridian must evidence to assessors.

The trigger for the review was mundane and therefore typical. During an unrelated audit, Elena Vasquez (GRC) was asked a simple question by the PCI assessor: "Can you show me that none of your cloud storage holding cardholder-adjacent data is publicly accessible?" Elena could not. Nobody at Meridian could answer "is anything public?" with confidence, because nobody had ever looked across all the accounts at once. Dana turned that gap into a mandate: two weeks, a full posture review, and a baseline that makes "is anything public?" an answerable question forever.

Sam framed the review around the chapter's mental model before touching a console. "Our problem isn't AWS," he told Theo. "AWS secures the hardware and the hypervisor better than we ever could. Our problem is everything on our side of the line — the data, the identity, the configuration. That's the half we own, and that's the half we've never audited."

🔗 Connection: This is the shared-responsibility model (§15.2) used as a scoping tool. By drawing the line first, Sam excludes from the review everything the provider owns (datacenter, hardware, hypervisor — not Meridian's to audit) and concentrates the team's two weeks entirely on the always-yours layers where the actual risk lives: storage exposure, IAM, network configuration, and logging.

The Review

Phase 1 — Turn on the lights: posture scanning and logging

You cannot remediate what you cannot see, so the team's first move was to gain visibility across every account. Sam connected a CSPM tool (read-only) to all of Meridian's AWS accounts, configured to evaluate them against the CIS AWS Foundations Benchmark — the industry-standard baseline of secure cloud settings. Within an hour the tool returned its first picture of reality, and it was not flattering:

   Meridian AWS — initial CSPM scan (illustrative summary)
   ───────────────────────────────────────────────────────────
   Accounts scanned:                         7
   S3 buckets total:                         214
     - Public (READ to AllUsers):            3        <-- investigate NOW
     - Unencrypted at rest:                  41
   IAM users with "Action":"*" policies:     6        <-- investigate NOW
   IAM users without MFA:                    19
   Long-lived access keys (>90 days old):    23
   Security groups open to 0.0.0.0/0 on
     a sensitive port (22/3389/3306/5432):   11       <-- investigate NOW
   CloudTrail coverage:                      1 of 14 regions   <-- blind in 13 regions
   ───────────────────────────────────────────────────────────
   Total findings:                           ~1,900
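Two of the identity counts in a summary like this (users without MFA, access keys older than 90 days) can be derived from the IAM credential report, which AWS delivers as CSV. The following is a minimal sketch, not the CSPM tool's actual logic. The sample rows and user names are invented for illustration, and the choice to expect MFA only on users with a console password is an assumption.

```python
import csv
import io
from datetime import datetime, timezone

MAX_KEY_AGE_DAYS = 90  # threshold used in the scan summary

# Invented sample in the column layout of the IAM credential report (subset of columns).
SAMPLE_REPORT = """user,password_enabled,mfa_active,access_key_1_active,access_key_1_last_rotated,access_key_2_active,access_key_2_last_rotated
alice,true,true,false,N/A,false,N/A
bob,true,false,true,2024-01-01T00:00:00+00:00,false,N/A
svc-batch,false,false,true,2024-05-20T00:00:00+00:00,false,N/A
"""


def audit_credential_report(report_csv, now):
    """Return console users lacking MFA and active keys older than the limit."""
    no_mfa, stale_keys = [], []
    for row in csv.DictReader(io.StringIO(report_csv)):
        # Assumption: only users who can sign in to the console need MFA.
        if row["password_enabled"] == "true" and row["mfa_active"] != "true":
            no_mfa.append(row["user"])
        for slot in ("1", "2"):
            if row.get(f"access_key_{slot}_active") != "true":
                continue
            rotated = row.get(f"access_key_{slot}_last_rotated") or "N/A"
            if rotated == "N/A":
                continue
            age_days = (now - datetime.fromisoformat(rotated)).days
            if age_days > MAX_KEY_AGE_DAYS:
                stale_keys.append((row["user"], slot, age_days))
    return {"no_mfa": no_mfa, "stale_keys": stale_keys}


if __name__ == "__main__":
    as_of = datetime(2024, 6, 1, tzinfo=timezone.utc)
    print(audit_credential_report(SAMPLE_REPORT, as_of))
```

Passing the evaluation date in explicitly, rather than reading the clock inside the function, keeps the check repeatable when the same report is re-examined later.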

Two things mattered immediately. First, the raw number — roughly 1,900 findings — was meaningless as a to-do list; nobody fixes 1,900 things in two weeks, and treating them as equal would guarantee the team spent week one on unencrypted log buckets while a public bucket of loan documents sat open. Second, three finding types were not routine hygiene but potential active exposures: the 3 public buckets, the 6 administrative wildcard policies, and the 11 wide-open security groups. Those went to the top.

But before remediating anything, Theo raised the right question: "If we start changing configurations and someone is already in here, will we even know what they touched?" The answer, with CloudTrail covering one of fourteen regions, was no. So the team's genuine first action was not a fix but comprehensive logging: CloudTrail enabled in all regions and delivered to a dedicated, locked-down logging account, with S3 Object Lock on the destination bucket so that the trail could not be deleted even by someone who compromised a production account. "Logs are the ground truth," Sam said, echoing a conviction the reader has carried since Chapter 1. "We're about to make a lot of changes. I want every one of them recorded, and I want a record an attacker couldn't erase."
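The coverage gap Theo identified can be checked mechanically. The sketch below assumes trail descriptions shaped like the entries CloudTrail returns when you list trails (the `IsMultiRegionTrail`, `LogFileValidationEnabled`, and `S3BucketName` fields); the trail and bucket names are hypothetical. It does not verify Object Lock, which is a property of the destination bucket rather than of the trail.

```python
def trail_gaps(trails, logging_bucket):
    """List reasons the trail set falls short of 'all regions, central bucket'."""
    problems = []
    multi_region = [t for t in trails if t.get("IsMultiRegionTrail")]
    if not multi_region:
        problems.append("no multi-region trail")
    for trail in multi_region:
        if not trail.get("LogFileValidationEnabled"):
            problems.append(f"{trail['Name']}: log file validation disabled")
        if trail.get("S3BucketName") != logging_bucket:
            problems.append(f"{trail['Name']}: not delivered to the logging account bucket")
    return problems


# Hypothetical state before the review: one single-region trail.
BEFORE = [{
    "Name": "default-trail",
    "HomeRegion": "us-east-1",
    "IsMultiRegionTrail": False,
    "LogFileValidationEnabled": False,
    "S3BucketName": "example-prod-logs",
}]

# Hypothetical state after: one multi-region trail into the central bucket.
AFTER = [{
    "Name": "org-trail",
    "HomeRegion": "us-east-1",
    "IsMultiRegionTrail": True,
    "LogFileValidationEnabled": True,
    "S3BucketName": "example-central-trail-logs",
}]

if __name__ == "__main__":
    print(trail_gaps(BEFORE, "example-central-trail-logs"))
    print(trail_gaps(AFTER, "example-central-trail-logs"))
```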

🛡️ Defender's Lens: Notice the ordering. The team turned on visibility (CSPM) and evidence (CloudTrail) before touching a single insecure setting. This is deliberate: remediation is change, change can mask or be masked by an intruder's activity, and you want the audit trail complete before you start. A team that begins by frantically closing buckets without first ensuring the log is on may erase the only evidence that one of those buckets was already being read.

Phase 2 — The three urgent exposures

With logging on, the team triaged the three urgent finding types. Each is a worked example of the chapter's core skills.

The public buckets. Theo pulled the ACL on the worst of the three — meridian-loan-docs, the loan-document archive:

   Bucket: meridian-loan-docs
   Grants:
     - Grantee: { ID: "ownerid..." }                              FULL_CONTROL
     - Grantee: { URI: ".../groups/global/AllUsers" }   READ      <-- PUBLIC
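A grant list like the one above can be screened for public grantees with a few lines of code. This is a minimal sketch assuming an ACL dictionary in the shape S3 returns for a bucket ACL (`Grants`, `Grantee`, `URI`, `Permission`). It matches on the group-name suffix so that the truncated URI shown in the listing is left as-is, and it also flags the AuthenticatedUsers group, which covers any AWS account holder and is effectively public.

```python
# Predefined S3 groups that make a grant effectively public.
PUBLIC_GROUP_SUFFIXES = (
    "/groups/global/AllUsers",
    "/groups/global/AuthenticatedUsers",
)


def public_grants(acl):
    """Return (group, permission) pairs for every grant to a public group."""
    found = []
    for grant in acl.get("Grants", []):
        uri = grant.get("Grantee", {}).get("URI", "")
        if uri.endswith(PUBLIC_GROUP_SUFFIXES):
            group = uri.rsplit("/", 1)[-1]
            found.append((group, grant["Permission"]))
    return found


# The ACL from the listing above, with the URI kept truncated as shown.
LOAN_DOCS_ACL = {
    "Grants": [
        {"Grantee": {"ID": "ownerid..."}, "Permission": "FULL_CONTROL"},
        {"Grantee": {"URI": ".../groups/global/AllUsers"}, "Permission": "READ"},
    ]
}

if __name__ == "__main__":
    print(public_grants(LOAN_DOCS_ACL))
```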

The AllUsers READ grant meant anyone on the internet who knew or guessed the bucket's name could list and download every loan document in it — names, addresses, financial details. Theo searched CloudTrail for the event that created this grant and found a PutBucketAcl call from fourteen months earlier, made by a contractor's IAM user, with x-amz-acl: public-read. The contractor had needed to share a few files quickly, made the whole bucket public, and never reverted it; the contractor's engagement had ended a year ago. The bucket had been world-readable for fourteen months. The team then did the most important and most uncomfortable analysis of the whole review: had anyone actually downloaded it? They searched S3 access logs (which, fortunately, had been retained) for GetObject requests from sources outside Meridian's known ranges. The finding — a relief, but a thin one — was that access appeared limited to known automated bucket scanners probing the listing, with no evidence of bulk download. "We got lucky," Sam wrote in the report. "The next bucket might not be. We do not build a security program on luck."
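The "had anyone actually downloaded it?" question reduces to filtering access-log entries by operation and source address. The sketch below assumes the log lines have already been parsed into dictionaries with hypothetical keys `remote_ip`, `operation`, and `key`; the operation strings follow the S3 server access log convention (`REST.GET.OBJECT` for an object read, `REST.GET.BUCKET` for a listing). The "known ranges" and all sample addresses are invented.

```python
import ipaddress
from collections import Counter

# Hypothetical address ranges Meridian would treat as its own.
KNOWN_RANGES = [ipaddress.ip_network(c) for c in ("10.40.0.0/16", "198.51.100.0/24")]


def is_external(ip):
    """True when the address falls outside every known range."""
    addr = ipaddress.ip_address(ip)
    return not any(addr in net for net in KNOWN_RANGES)


def external_object_reads(records):
    """Count object downloads per external source address."""
    counts = Counter()
    for rec in records:
        if rec["operation"] == "REST.GET.OBJECT" and is_external(rec["remote_ip"]):
            counts[rec["remote_ip"]] += 1
    return counts


# Invented sample: internal reads plus an external scanner that only lists.
SAMPLE_LOG = [
    {"remote_ip": "10.40.3.15", "operation": "REST.GET.OBJECT", "key": "loan-0001.pdf"},
    {"remote_ip": "198.51.100.20", "operation": "REST.GET.OBJECT", "key": "loan-0002.pdf"},
    {"remote_ip": "203.0.113.7", "operation": "REST.GET.BUCKET", "key": "-"},
    {"remote_ip": "203.0.113.7", "operation": "REST.GET.BUCKET", "key": "-"},
]

if __name__ == "__main__":
    print(dict(external_object_reads(SAMPLE_LOG)))
```

An empty result here means only that the retained logs show no external object reads. As the team's wording in the report implies, that is weaker than proof that none occurred.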

Remediation was immediate (revert all three buckets to private) but the team understood that fixing three buckets does not fix the problem, which is that any engineer can make any bucket public at any time. That required a guardrail, in Phase 4.

The other two public buckets told their own small stories, and both are instructive because neither was the result of malice or even carelessness in the moment. The second, meridian-reports-archive, had been made public by a script — an old deployment automation that set public-read on every bucket it created, a default someone had copied from a tutorial years earlier and propagated without understanding. The third, meridian-public-assets, was intentionally public and correctly so: it served a few images for the marketing website, contained nothing sensitive, and was exactly the kind of legitimate public bucket that exists. This last one mattered for the plan, because it meant "block all public buckets" could not be a blunt rule that broke a working website — the team had to handle the legitimate case (serve the assets through a content-delivery network, with the bucket itself private behind it) before enabling the account-wide guardrail. Sam noted the lesson for the report: "Security controls that ignore the legitimate exceptions get turned off the first time they break something real. Find the real exceptions first, design for them, then lock the door."

The administrative wildcards. Six IAM users carried the policy you now recognize on sight:

{ "Version": "2012-10-17",
  "Statement": [{ "Effect": "Allow", "Action": "*", "Resource": "*" }] }

Each was effectively an account administrator. Worse, three of the six were service identities — used by batch jobs and automation — meaning their long-lived credentials lived in configuration files and CI systems where they could leak. Sam's analysis was blunt: "Any one of these credentials, if it leaks, is total account compromise. A service account with * is a loaded gun pointed at the whole environment." The team used AWS's access-analyzer tooling, which observes what permissions each identity actually uses over time and proposes a right-sized policy. For the loan-processing job, the analyzer confirmed it used exactly s3:GetObject and s3:ListBucket on two buckets — so the * policy was replaced with the least-privilege policy from §15.3. The job kept working; its blast radius shrank from "everything" to "read two buckets."
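Spotting the administrative wildcard is a simple structural test on the policy document. The sketch below is one way to express it, together with an illustrative right-sized replacement for the loan-processing job. The bucket names are hypothetical, and the replacement is not the §15.3 policy itself, only a policy with the same two actions. The check deliberately looks only for the literal `*`/`*` pair; service-level wildcards such as `s3:*` or `NotAction` constructs would need additional handling.

```python
def _as_list(value):
    """IAM allows a single string or dict where a list is also valid."""
    return value if isinstance(value, list) else [value]


def has_admin_wildcard(policy):
    """True if any Allow statement grants every action on every resource."""
    for stmt in _as_list(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        actions = _as_list(stmt.get("Action", []))
        resources = _as_list(stmt.get("Resource", []))
        if "*" in actions and "*" in resources:
            return True
    return False


ADMIN_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "*", "Resource": "*"}],
}

# Hypothetical bucket names; ListBucket applies to the bucket ARN,
# GetObject to the objects inside it.
RIGHT_SIZED_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": [
                "arn:aws:s3:::example-loan-input",
                "arn:aws:s3:::example-loan-archive",
            ],
        },
        {
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": [
                "arn:aws:s3:::example-loan-input/*",
                "arn:aws:s3:::example-loan-archive/*",
            ],
        },
    ],
}

if __name__ == "__main__":
    print(has_admin_wildcard(ADMIN_POLICY), has_admin_wildcard(RIGHT_SIZED_POLICY))
```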

Not every identity right-sized cleanly, and the messy cases are where the real engineering judgment lived. One of the human users with a * policy turned out to be a senior engineer who genuinely performed a wide and unpredictable range of administrative tasks; the access-analyzer's "actually used" set was large and varied. Here the team made a deliberate tradeoff: rather than grant a single sprawling policy, they moved the engineer to assuming a privileged role with MFA required and the session logged, so that the broad access was time-boxed to when it was needed and fully recorded, rather than standing permanently attached to a user whose long-lived key could leak. This is a preview of the privileged-access ideas the program develops later (just-in-time access, session recording), and it makes a subtle point: least privilege does not always mean "a smaller permanent grant." Sometimes it means "the same broad capability, but only when assumed, only with strong authentication, and only with a recording": privilege bounded in time and accountability rather than in scope. The other two service identities, by contrast, were tightened to narrow permanent policies and had their long-lived access keys replaced with role-based credentials, removing the leakable secret entirely.
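The "only when assumed, only with strong authentication" half of that pattern lives in the role's trust policy. The sketch below builds one that permits `sts:AssumeRole` only when the caller authenticated with MFA, using the `aws:MultiFactorAuthPresent` condition key. The account ID is a placeholder, and the helper names are invented for illustration; session logging is provided by CloudTrail rather than by anything in this policy.

```python
import json


def mfa_gated_trust_policy(account_id):
    """Trust policy: principals in the account may assume the role only with MFA."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
            "Action": "sts:AssumeRole",
            "Condition": {"Bool": {"aws:MultiFactorAuthPresent": "true"}},
        }],
    }


def requires_mfa(trust_policy):
    """True if every Allow statement carries the MFA-present condition."""
    statements = trust_policy.get("Statement", [])
    allows = [s for s in statements if s.get("Effect") == "Allow"]
    return bool(allows) and all(
        s.get("Condition", {}).get("Bool", {}).get("aws:MultiFactorAuthPresent") == "true"
        for s in allows
    )


if __name__ == "__main__":
    policy = mfa_gated_trust_policy("111122223333")  # placeholder account ID
    print(json.dumps(policy, indent=2))
    print(requires_mfa(policy))
```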

🔗 Connection: Replacing the senior engineer's standing * policy with an assumable, MFA-gated, session-logged role is exactly the privileged-access pattern of Chapter 19 (PAM), arriving early because the cloud forces the issue. The principle is least privilege (Chapter 3) applied with nuance: the goal is to minimize what a compromised credential can do, and an admin capability that exists only inside a logged, MFA-protected session is far less dangerous than the same capability sitting permanently on a user whose access key might be checked into a repository.

The wide-open security groups. Eleven security groups exposed a sensitive port to 0.0.0.0/0. The most alarming protected an analytics database:

   Security group: sg-analytics-db (BEFORE)
     TCP  5432  0.0.0.0/0   "Postgres"   <-- database open to the entire internet
     TCP  22    0.0.0.0/0   "SSH"        <-- admin login open to the entire internet

A PostgreSQL database with port 5432 open to the world is, as the chapter put it, a breach with a countdown timer — automated scanners brute-force such ports continuously. The team restricted the source to exactly what needed access:

   Security group: sg-analytics-db (AFTER)
     TCP  5432  10.40.0.0/24      "analytics app subnet only"
     TCP  22    10.40.255.10/32   "bastion host only"

Same application functionality; the database now reachable only from the application subnet, SSH only through a single bastion host. The principle is identical to the least-privilege IAM policy: restrict to exactly the sources that legitimately need access, expose nothing to 0.0.0.0/0 that does not have to be.
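The before-and-after comparison can be reproduced as a check over ingress rules. This sketch assumes rules shaped like the `IpPermissions` entries EC2 returns when describing a security group (`IpProtocol`, `FromPort`, `ToPort`, `IpRanges`, `Ipv6Ranges`); the port list is the one from the scan summary, and including the IPv6 any-address is an addition not discussed in the text.

```python
SENSITIVE_PORTS = {22, 3389, 3306, 5432}
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}  # IPv4 and IPv6 "anywhere"


def world_open_sensitive_ports(group):
    """Return the sensitive TCP ports this group exposes to the whole internet."""
    exposed = set()
    for perm in group.get("IpPermissions", []):
        cidrs = [r["CidrIp"] for r in perm.get("IpRanges", [])]
        cidrs += [r["CidrIpv6"] for r in perm.get("Ipv6Ranges", [])]
        if not OPEN_CIDRS.intersection(cidrs):
            continue
        protocol = perm.get("IpProtocol")
        if protocol == "-1":  # all protocols, all ports
            exposed |= SENSITIVE_PORTS
            continue
        if protocol != "tcp":
            continue
        low, high = perm.get("FromPort"), perm.get("ToPort")
        if low is None or high is None:
            continue
        exposed |= {p for p in SENSITIVE_PORTS if low <= p <= high}
    return sorted(exposed)


SG_BEFORE = {"GroupName": "sg-analytics-db", "IpPermissions": [
    {"IpProtocol": "tcp", "FromPort": 5432, "ToPort": 5432,
     "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
    {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
     "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
]}

SG_AFTER = {"GroupName": "sg-analytics-db", "IpPermissions": [
    {"IpProtocol": "tcp", "FromPort": 5432, "ToPort": 5432,
     "IpRanges": [{"CidrIp": "10.40.0.0/24"}]},
    {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
     "IpRanges": [{"CidrIp": "10.40.255.10/32"}]},
]}

if __name__ == "__main__":
    print(world_open_sensitive_ports(SG_BEFORE), world_open_sensitive_ports(SG_AFTER))
```

Checking port ranges rather than exact ports matters: a rule opening 0-65535 to the world exposes 5432 just as surely as a rule that names it.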

⚠️ Common Pitfall: Theo's instinct on the security groups was to delete the dangerous rules and move on. Sam stopped him: "Before you change a production firewall rule, confirm what legitimately connects to it, or you'll cause an outage and security will get blamed for breaking things." They checked flow logs to confirm the analytics database's real callers before restricting the source. Security that causes outages loses the political capital it needs to do the rest of its job — a lesson that recurs across the program.

Phase 3 — From 1,900 findings to a defensible plan

With the three urgent exposures closed, the team faced the remaining ~1,900 findings and the real discipline of cloud security: turning an overwhelming list into a defensible plan. They applied the risk-based prioritization from Chapter 1 — not "what does the scanner rate highest?" but "what is the actual risk to Meridian?" — and sorted findings into waves:

| Wave | Finding class | Why this priority | Example count |
|---|---|---|---|
| 1: Now | Public buckets; `*` IAM policies; `0.0.0.0/0` on sensitive ports | Direct, internet-exposed paths to data or total compromise | 3 + 6 + 11 |
| 2: This sprint | MFA missing on human users; long-lived keys >90 days; root account in use | High-value identity weaknesses; credential leak → account takeover | 19 + 23 |
| 3: This quarter | Unencrypted buckets/DBs holding sensitive data | Real but requires an additional failure (data already accessed) to harm | 41 |
| 4: Backlog/guardrail | Lower-severity benchmark deviations; logging gaps in unused regions | Address systemically via guardrails, not one at a time | ~1,800 |

The insight that made the list tractable was the same one from Chapter 1's risk model: an unencrypted log bucket and a public customer-data bucket are not equal findings, even if the CSPM tool lists them at similar severities. Risk is exposure times consequence, and the team folded in context the scanner could not know — which buckets held regulated data, which identities were internet-reachable, which findings could be eliminated wholesale by a single guardrail rather than fixed individually 1,800 times.
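The "exposure times consequence" idea can be made concrete with a small scoring sketch. Everything numeric here is an assumption for illustration: the 1-to-5 scales, the scores assigned to each finding, and the wave thresholds are invented, and this is not the book's riskcalc.py. The point is only that context-aware scores, not scanner severity, decide the wave.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    name: str
    exposure: int     # 1 = needs several other failures, 5 = exploitable from the internet now
    consequence: int  # 1 = negligible, 5 = regulated data or total account compromise

    @property
    def risk(self):
        return self.exposure * self.consequence


def wave(finding):
    """Map a risk score to a remediation wave (thresholds are illustrative)."""
    if finding.risk >= 20:
        return 1  # now
    if finding.risk >= 12:
        return 2  # this sprint
    if finding.risk >= 6:
        return 3  # this quarter
    return 4      # backlog / guardrail


FINDINGS = [
    Finding("unencrypted log bucket", exposure=2, consequence=1),
    Finding("human user without MFA", exposure=3, consequence=5),
    Finding("public loan-document bucket", exposure=5, consequence=5),
    Finding("unencrypted bucket with sensitive data", exposure=2, consequence=4),
    Finding("wildcard policy on a service key", exposure=4, consequence=5),
]

# Order the plan by wave, highest risk first within a wave.
PLAN = sorted(FINDINGS, key=lambda f: (wave(f), -f.risk))

if __name__ == "__main__":
    for f in PLAN:
        print(wave(f), f.risk, f.name)
```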

🔄 Check Your Understanding: The team put "MFA missing on human users" in Wave 2, not Wave 1, even though missing MFA is a serious weakness. Why might that ordering be defensible — and under what circumstance would you promote it to Wave 1? (Hint: consider that a missing MFA is only exploitable after a password is compromised, whereas a public bucket is exploitable right now by anyone; but if you had evidence of active credential-stuffing against those accounts, the calculus changes.)

Phase 4 — Make the fixes permanent: guardrails and the baseline

Fixing the current findings was necessary but insufficient; without systemic change, the same findings would reappear as new engineers created new buckets and policies. So the team's final and most important work was preventive: building guardrails that make the dangerous states unreachable, and codifying everything into a cloud security baseline every account must meet.

The guardrails Sam implemented:

S3 Block Public Access, enabled account-wide and locked in place by a service control policy: no bucket in any Meridian account can be made public, regardless of any engineer's bucket-level setting, and no one inside an account can switch the setting off. The fourteen-month exposure becomes structurally impossible.
A service control policy denying 0.0.0.0/0 on ports 22, 3389, 3306, and 5432 — no security group can open those admin/database ports to the world, even by an account administrator. The guardrail sits above the accounts, so it cannot be overridden from within one.
A service control policy denying StopLogging and DeleteTrail — no one can disable CloudTrail, removing an attacker's ability to cover their tracks and protecting the bank's evidence.
Encryption-at-rest default on new storage and databases, so the secure state is the default state.
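Two of the guardrails above translate directly into deny statements. The following is a hedged sketch of what such a service control policy could look like, not Meridian's actual policy: it denies `cloudtrail:StopLogging` and `cloudtrail:DeleteTrail`, and denies `s3:PutAccountPublicAccessBlock` so the account-level Block Public Access setting cannot be altered from inside an account. The second statement assumes Block Public Access has already been enabled before the policy is attached. The port guardrail is not sketched here.

```python
import json

# Illustrative SCP; the Sid values are arbitrary labels.
GUARDRAIL_SCP = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ProtectCloudTrail",
            "Effect": "Deny",
            "Action": ["cloudtrail:StopLogging", "cloudtrail:DeleteTrail"],
            "Resource": "*",
        },
        {
            "Sid": "ProtectAccountBlockPublicAccess",
            "Effect": "Deny",
            "Action": "s3:PutAccountPublicAccessBlock",
            "Resource": "*",
        },
    ],
}


def denied_actions(policy):
    """Collect every action named in a Deny statement."""
    denied = set()
    for stmt in policy["Statement"]:
        if stmt["Effect"] != "Deny":
            continue
        actions = stmt["Action"]
        denied.update(actions if isinstance(actions, list) else [actions])
    return denied


if __name__ == "__main__":
    print(json.dumps(GUARDRAIL_SCP, indent=2))
    print(sorted(denied_actions(GUARDRAIL_SCP)))
```

Because an explicit deny in a service control policy overrides any allow granted inside the account, even an account administrator holding the `*` policy shown earlier could not perform these actions.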

These guardrails embody the chapter's central operational idea: prevent with guardrails, detect with CSPM. The CSPM tool stays connected as the detective backstop — if some configuration slips past the guardrails or is not covered by one, CSPM flags it, with public-exposure and logging-disabled findings now wired directly to alerts in Marcus's SOC queue. And the highest-severity CloudTrail events (a bucket attempting to go public, a * policy created, a 0.0.0.0/0 rule, logging disabled, root used) became the SOC's first cloud detection rules, feeding the SIEM the bank had begun building.
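A first cut of those detection rules can be expressed as a lookup on CloudTrail's `eventName` plus a check of `userIdentity.type` for root activity. This is a minimal sketch, not the SOC's actual rule set: the labels are invented, and several matches are only candidates, since deciding whether a `PutBucketAcl` call actually made a bucket public or an ingress rule actually used 0.0.0.0/0 requires inspecting the request parameters, which this sketch does not parse.

```python
# Event names worth alerting on, with a short analyst-facing label.
HIGH_SEVERITY_EVENTS = {
    "StopLogging": "logging disabled",
    "DeleteTrail": "trail deleted",
    "PutBucketAcl": "bucket ACL changed (check for public grant)",
    "AuthorizeSecurityGroupIngress": "ingress rule added (check for 0.0.0.0/0)",
    "PutUserPolicy": "inline user policy changed (check for wildcard)",
    "AttachUserPolicy": "managed policy attached to user (check for wildcard)",
}


def classify(event):
    """Return an alert label for a CloudTrail record, or None if not of interest."""
    if event.get("userIdentity", {}).get("type") == "Root":
        return "root account used"
    return HIGH_SEVERITY_EVENTS.get(event.get("eventName"))


def alerts(events):
    """Pair each interesting event with its label, preserving order."""
    labelled = ((classify(e), e) for e in events)
    return [(label, e["eventName"]) for label, e in labelled if label]


# Invented sample records carrying only the fields the sketch reads.
SAMPLE_EVENTS = [
    {"eventName": "DescribeInstances", "userIdentity": {"type": "IAMUser"}},
    {"eventName": "StopLogging", "userIdentity": {"type": "IAMUser"}},
    {"eventName": "ConsoleLogin", "userIdentity": {"type": "Root"}},
    {"eventName": "PutBucketAcl", "userIdentity": {"type": "AssumedRole"}},
]

if __name__ == "__main__":
    for label, name in alerts(SAMPLE_EVENTS):
        print(f"{name}: {label}")
```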

Finally, Sam wrote the one-page cloud security baseline — the program artifact this chapter contributes (see the Project Checkpoint in index.md). It became Meridian's standard for every AWS account and, crucially, the document Elena handed the PCI assessor to answer the question that started the whole review. "Is anything public?" now had an answer: no — and here is the guardrail that makes it impossible, the CSPM scan that verifies it continuously, and the alert that fires if anyone tries.

🛡️ Defender's Lens: The arc of this review is the arc of cloud security maturity in miniature. Meridian moved from unknown (nobody could answer "is anything public?"), through detected (CSPM made the exposures visible), to prevented (guardrails made them impossible) and monitored (CSPM + CloudTrail detections catch what slips through). That progression — unknown → detected → prevented → monitored — is the goal for every class of cloud risk, and it is defense in depth applied to the control plane.

Phase 5 — What it cost, and what it would have cost

When Dana presented the review to the board's Audit Committee, she made a point of framing it the way the board would understand: in the language of risk and cost from Chapter 1, not in the language of ACLs and security groups. The review itself had cost two engineers two weeks and the price of a CSPM subscription — a few tens of thousands of dollars, all in, much of it recurring and therefore budgetable. Against that, she laid out what the public loan-document bucket alone could have cost had a malicious scanner found it first: mandatory breach notification to affected customers under GLBA and state law; regulatory examination and potential penalties; the cost of credit monitoring offered to affected individuals; legal fees; and the reputational damage to a bank — an institution whose entire business is trust — of a headline reading "Bank left customer loan documents readable by anyone on the internet for fourteen months." Industry experience puts the fully loaded cost of a data breach in the millions; the bucket had been one curious stranger away from that outcome for over a year.

The number that landed hardest with the committee was not the potential breach cost but the duration: the bucket had been exposed for fourteen months, and Meridian had possessed no capability to know. "We were not breached," Dana told them, "but we could not have told you whether we were. The CSPM and the logging we just turned on are the difference between finding the next one in minutes and finding it in a newspaper." This is the business case for cloud posture management stated as plainly as it can be: the cost of the controls is small, recurring, and predictable; the cost of the exposure they prevent is large, sudden, and existential for a trust-based institution — and the controls also convert an unknowable risk into a managed one, which is itself the thing a board is asking for when it asks "are we secure?"

⚠️ Common Pitfall: A trap Dana deliberately avoided was presenting the board a wall of technical findings — "we found 3 public buckets, 6 wildcard policies, 11 open security groups, and 1,900 total CSPM findings." A board does not know whether 1,900 is good or bad, and the number invites either panic or glazed-over dismissal. She translated instead into risk reduced and capability gained: a specific, serious exposure closed; a class of exposure made structurally impossible; and a new ability to detect the next one in minutes. Translating technical findings into business-legible risk is the skill that gets the next budget cycle approved — the same translation the riskcalc.py bands performed back in Chapter 1, now at the scale of a board conversation that Chapter 36 develops fully.

Discussion Questions

The team enabled CloudTrail and CSPM before remediating the urgent exposures. Argue for and against this ordering. Under what circumstances might closing a public bucket first — before logging is fully on — be the right call?
Three of the six *-policy identities were service accounts whose credentials live in configuration. Why is an over-broad service identity arguably more dangerous than an over-broad human one, and how does this preview the machine-identity problem of Chapter 20?
The review found no evidence that the public loan-document bucket had been bulk-downloaded, only that automated scanners had probed it. Should Meridian treat this as "no breach occurred," or is the distinction between "we found no evidence of harm" and "no harm occurred" important? How would it affect breach-notification obligations under GLBA and state law?
Sam insisted on guardrails (preventive) over relying on CSPM findings (detective) for the recurring problems. When is a detective control genuinely sufficient, and when must a risk be addressed with a preventive guardrail? Give a cloud example of each.
The CSPM tool returned ~1,900 findings. Discuss the failure mode of a team that treats the CSPM dashboard itself as "the work." How does this mirror the unwatched-SIEM pitfall from Chapter 1?

Your Turn

Take a cloud environment you can reason about (a personal AWS/Azure/GCP account, your employer's if you are authorized, or the constructed Meridian environment above) and reproduce the review's logic on paper:

Draw the line. For one workload, list which security tasks are the provider's and which are yours (shared-responsibility map). Scope your review to your side only.
Find the urgent three. Identify (or hypothesize) any public storage, any * IAM policy, and any 0.0.0.0/0 rule on a sensitive port. For each, write the exploit in one sentence and the fix.
Prioritize. Sort a hypothetical list of ten findings into "now / this sprint / this quarter / backlog" waves, justifying each placement by risk (exposure × consequence), not by scanner severity.
Make it permanent. For each of the urgent three, name the guardrail that would make the dangerous state impossible, not just the one-time fix.

Keep it to two pages. If you cannot decide a finding's wave, note what additional context (what data is in the bucket? is the identity internet-reachable?) you would need — that gap is itself a finding.

Key Takeaways

A cloud posture review starts by drawing the shared-responsibility line, which scopes the work to the always-yours layers (data, identity, configuration) where the real risk lives — not the provider's hardware.
Visibility and evidence come first. Enable CSPM (to see misconfigurations) and comprehensive, delete-proof CloudTrail (to record everything) before remediating, so changes are recorded and an in-progress intruder is not masked.
The urgent three in almost every cloud environment are public storage, * IAM policies, and 0.0.0.0/0 on sensitive ports — direct, internet-exposed paths to data or total compromise. Triage these ahead of the long tail.
Risk-based prioritization turns thousands of CSPM findings into a defensible plan; an unencrypted log bucket and a public customer-data bucket are not equal, whatever the scanner's severity says.
Fix the cause, not just the instance. Replace one-time remediations with guardrails (Block Public Access, service control policies) that make the dangerous state structurally impossible, and keep CSPM as the detective backstop and CloudTrail-fed detections as the alerting layer.
The maturity arc is unknown → detected → prevented → monitored, and the cloud security baseline is the artifact that makes "is anything public?" a permanently answerable question — and evidences the customer side of PCI-DSS and GLBA.