Case Study 2: Secrets in the Repository

A Team Discovers Hardcoded Secrets in AI-Generated Code and Implements Proper Secrets Management

Background

DataForge is a Series A startup with a team of eight developers building a SaaS platform for data pipeline management. The platform connects to customer databases, cloud storage services, and third-party APIs. DataForge's engineering team adopted AI coding assistants early and built most of the platform through vibe-coding workflows.

On a Tuesday morning, DataForge received an email from GitHub's secret scanning service: a valid AWS access key had been detected in a public commit. The team had accidentally pushed an internal repository to public visibility during a GitHub organization restructuring the previous weekend. The repository had been public for approximately 36 hours before the team noticed the alert.

This case study follows the team through incident response, forensic investigation, and the implementation of a comprehensive secrets management strategy.

The Discovery

The GitHub alert identified an AWS access key in src/connectors/s3_connector.py:

# The committed code — line 14
AWS_ACCESS_KEY = "AKIAIOSFODNN7EXAMPLE"
AWS_SECRET_KEY = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
S3_BUCKET = "dataforge-customer-pipelines"

class S3Connector:
    def __init__(self):
        self.client = boto3.client(
            's3',
            aws_access_key_id=AWS_ACCESS_KEY,
            aws_secret_access_key=AWS_SECRET_KEY,
        )

The lead developer, James, immediately recognized the severity. These were not example keys—they were production credentials with read/write access to customer data stored in S3.

Phase 1: Incident Response (Hours 0–4)

Hour 0: Containment

James followed the emergency response protocol:

Step 1: Rotate the compromised credentials immediately.

# James logged into the AWS console and:
# 1. Created a new access key for the service account
# 2. Deactivated the old access key (not deleted — for audit trail)
# 3. Updated the running services with the new key
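
The console steps above can also be scripted. A minimal sketch, assuming a boto3-style IAM client is injected (the method names and response shapes match boto3's `iam` client; the service-account name used below is hypothetical):

```python
def rotate_access_key(iam, user_name: str, old_key_id: str):
    """Create a replacement key, then deactivate (not delete) the old one."""
    # 1. Create the new access key for the service account
    new_key = iam.create_access_key(UserName=user_name)["AccessKey"]
    # 2. Deactivate the compromised key, keeping it for the audit trail
    iam.update_access_key(
        UserName=user_name, AccessKeyId=old_key_id, Status="Inactive"
    )
    # 3. The caller distributes the new key to running services
    return new_key["AccessKeyId"], new_key["SecretAccessKey"]
```

Injecting the client keeps the rotation logic testable without AWS access; in production you would pass `boto3.client("iam")`.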

Step 2: Make the repository private.

# Changed repository visibility from public to private via GitHub settings

Step 3: Alert the team.

James posted in the team's incident channel:

SECURITY INCIDENT: AWS production credentials were exposed in a public GitHub repository for approximately 36 hours. I have rotated the keys and made the repo private. Everyone stop what you are doing — we need to assess the damage.

Hour 1: Forensic Investigation

The team checked AWS CloudTrail logs for unauthorized access using the compromised credentials:

aws cloudtrail lookup-events \
    --lookup-attributes AttributeKey=AccessKeyId,AttributeValue=AKIAIOSFODNN7EXAMPLE \
    --start-time 2025-11-15T00:00:00Z \
    --end-time 2025-11-18T00:00:00Z

The CloudTrail logs revealed:

  - 3 ListBuckets calls from an unfamiliar IP address in Eastern Europe.
  - 1 GetObject call that downloaded a customer configuration file.
  - No PutObject, DeleteObject, or privilege escalation attempts.

The team determined that an attacker had discovered the keys (likely through automated GitHub scanning bots) and performed reconnaissance, but had not conducted a large-scale data exfiltration.
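
A triage helper in the spirit of this review can separate read-only reconnaissance from destructive activity. The sketch below operates on CloudTrail event dicts keyed by `EventName`; the set of write events is an illustrative assumption, not a complete list:

```python
# Event names that indicate modification or privilege escalation rather
# than reconnaissance (illustrative subset, not exhaustive)
WRITE_EVENTS = {
    "PutObject", "DeleteObject", "PutBucketPolicy",
    "CreateUser", "AttachUserPolicy",
}

def triage_events(events):
    """Split CloudTrail events into reconnaissance and destructive activity."""
    recon, destructive = [], []
    for event in events:
        if event["EventName"] in WRITE_EVENTS:
            destructive.append(event)
        else:
            recon.append(event)
    return recon, destructive
```

Run against the incident's events, this yields four reconnaissance calls and no destructive ones, matching the team's conclusion.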

Hours 2–4: Notification and Broader Assessment

James and the CTO made the following decisions:

  1. Customer notification: The customer whose configuration file was accessed was notified within 4 hours, per DataForge's data breach notification policy.
  2. Legal consultation: DataForge's legal counsel was informed to assess regulatory obligations.
  3. Comprehensive secret scan: The team ran a full scan of every repository to find other hardcoded secrets.

# Install and run gitleaks across all repositories
gitleaks detect --source . --report-format json --report-path gitleaks_report.json

The results were alarming.

Phase 2: The Full Extent of the Problem

The gitleaks scan across DataForge's 12 repositories found 47 hardcoded secrets:

| Secret Type | Count | Repositories |
| --- | --- | --- |
| AWS access keys | 4 | s3_connector, etl_service, backup_scripts |
| Database connection strings | 8 | api_server, etl_service, migration_scripts, analytics |
| API keys (Stripe, SendGrid, Twilio) | 11 | billing_service, notification_service, api_server |
| JWT signing secrets | 3 | api_server, auth_service, admin_panel |
| OAuth client secrets | 5 | auth_service, google_connector, slack_integration |
| Encryption keys | 2 | data_encryption, etl_service |
| SMTP credentials | 3 | notification_service, api_server, admin_panel |
| Internal service tokens | 11 | Various |

Every secret had the same origin story: an AI coding assistant had generated a "complete working example" with placeholder or development credentials, and the developer had replaced the placeholder with real credentials directly in the code instead of moving them to environment variables.

The team held a retrospective to understand how this happened.

Phase 3: Root Cause Analysis

The team identified several contributing factors:

1. AI-generated code normalized hardcoded credentials.

When developers prompted the AI for "a working S3 connector," the generated code always included credential variables at the top of the file. The pattern was so consistent that developers treated it as the "right way" to configure services. One developer said: "Every example the AI showed me had the credentials right there in the code. I assumed that was how you do it in Python."

2. No pre-commit hooks or CI checks for secrets.

The team had pre-commit hooks for code formatting (Black) and linting (Ruff) but nothing for secret scanning. Secrets passed through code review because reviewers focused on functionality, not security.

3. No .env.example template or configuration documentation.

New developers had no guidance on how to configure the application. They copied credential values from other files or asked teammates to share them over Slack, then pasted them into the code.

4. Development and production used the same credentials.

There was no separation between development and production environments. Developers used production AWS keys locally because "it was easier."

Phase 4: Implementing Proper Secrets Management

The team spent two weeks implementing a comprehensive secrets management strategy. They divided the work into four phases.

Phase 4a: Immediate — Remove All Hardcoded Secrets

Every hardcoded secret was replaced with environment variable lookups:

# BEFORE (vulnerable)
AWS_ACCESS_KEY = "AKIAIOSFODNN7EXAMPLE"
AWS_SECRET_KEY = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"

# AFTER (secure)
import os

def get_required_env(name: str) -> str:
    """Retrieve a required environment variable or fail fast."""
    value = os.environ.get(name)
    if not value:
        raise EnvironmentError(
            f"Required environment variable '{name}' is not set. "
            f"See .env.example for required variables."
        )
    return value

AWS_ACCESS_KEY = get_required_env("AWS_ACCESS_KEY_ID")
AWS_SECRET_KEY = get_required_env("AWS_SECRET_ACCESS_KEY")

The team created a centralized configuration module:

# src/config.py
import os
from dataclasses import dataclass, field

@dataclass
class AppConfig:
    """Application configuration loaded from environment variables."""
    # Database
    database_url: str = field(default_factory=lambda: os.environ["DATABASE_URL"])

    # AWS
    aws_access_key: str = field(default_factory=lambda: os.environ["AWS_ACCESS_KEY_ID"])
    aws_secret_key: str = field(default_factory=lambda: os.environ["AWS_SECRET_ACCESS_KEY"])
    aws_region: str = field(default_factory=lambda: os.environ.get("AWS_REGION", "us-east-1"))
    s3_bucket: str = field(default_factory=lambda: os.environ["S3_BUCKET_NAME"])

    # Authentication
    jwt_secret: str = field(default_factory=lambda: os.environ["JWT_SECRET"])

    # Third-party services
    stripe_api_key: str = field(default_factory=lambda: os.environ["STRIPE_API_KEY"])
    sendgrid_api_key: str = field(default_factory=lambda: os.environ["SENDGRID_API_KEY"])

    def __repr__(self) -> str:
        """Mask all secret values in string representation."""
        return "AppConfig(***masked***)"

    @classmethod
    def validate(cls) -> "AppConfig":
        """Create and validate configuration, failing fast if variables are missing."""
        required_vars = [
            "DATABASE_URL", "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY",
            "S3_BUCKET_NAME", "JWT_SECRET", "STRIPE_API_KEY", "SENDGRID_API_KEY"
        ]
        missing = [var for var in required_vars if not os.environ.get(var)]
        if missing:
            raise EnvironmentError(
                f"Missing required environment variables: {', '.join(missing)}\n"
                f"See .env.example for required variables."
            )
        return cls()

Phase 4b: Development Environment Setup

The team created a .env.example file documenting every required variable:

# .env.example — Copy to .env and fill in values
# NEVER commit the .env file

# Database
DATABASE_URL=postgresql://user:password@localhost:5432/dataforge_dev

# AWS (use development credentials with limited permissions)
AWS_ACCESS_KEY_ID=your-dev-access-key
AWS_SECRET_ACCESS_KEY=your-dev-secret-key
AWS_REGION=us-east-1
S3_BUCKET_NAME=dataforge-dev-bucket

# Authentication
JWT_SECRET=generate-a-random-256-bit-key-for-dev

# Third-party services (use test/sandbox keys)
STRIPE_API_KEY=sk_test_your_stripe_test_key
SENDGRID_API_KEY=your-sendgrid-dev-key

They updated .gitignore:

# Secrets — NEVER commit these
.env
.env.local
.env.production
*.pem
*.key
credentials.json
service-account-key.json

They also created separate AWS IAM users for development with minimal permissions (read-only access to a development S3 bucket only), ensuring that even if development credentials were leaked, the blast radius was minimal.
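
One way to express that minimal-permission policy is standard IAM policy JSON. The sketch below scopes read-only access to the development bucket from the .env.example above (the exact action list is an assumption about what the dev workflow needs):

```python
import json

# Read-only access to the development bucket only; no write, delete,
# or IAM actions, so a leaked dev key has minimal blast radius
DEV_READONLY_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::dataforge-dev-bucket",
                "arn:aws:s3:::dataforge-dev-bucket/*",
            ],
        }
    ],
}

policy_document = json.dumps(DEV_READONLY_POLICY)
```
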

Phase 4c: Production Secrets with AWS Secrets Manager

For production, the team adopted AWS Secrets Manager:

# src/secrets_loader.py
import json
import boto3
from botocore.exceptions import ClientError

def load_production_secrets(secret_name: str, region: str = "us-east-1") -> dict:
    """Load secrets from AWS Secrets Manager for production deployment."""
    client = boto3.client("secretsmanager", region_name=region)
    try:
        response = client.get_secret_value(SecretId=secret_name)
        return json.loads(response["SecretString"])
    except ClientError as e:
        error_code = e.response["Error"]["Code"]
        if error_code == "ResourceNotFoundException":
            raise EnvironmentError(f"Secret '{secret_name}' not found in Secrets Manager")
        elif error_code == "AccessDeniedException":
            raise EnvironmentError(f"Access denied to secret '{secret_name}'")
        raise

def configure_app_from_secrets(app, environment: str = "production"):
    """Configure a Flask app from AWS Secrets Manager."""
    if environment == "production":
        secrets = load_production_secrets("dataforge/production")
        for key, value in secrets.items():
            app.config[key] = value
    elif environment == "development":
        # Development uses a local .env file loaded by python-dotenv;
        # the values then reach the app via os.environ (see src/config.py)
        from dotenv import load_dotenv
        load_dotenv()

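Because Secrets Manager is a network service that is billed and rate-limited per API call, the team cached fetched secrets for the lifetime of the process. A minimal sketch with an injected fetch function, so it works with `load_production_secrets` above or any callable returning a dict:

```python
import functools

def cached_secrets(fetch):
    """Wrap a secrets fetcher so the backend is called at most once per secret name."""
    @functools.lru_cache(maxsize=None)
    def load(secret_name: str) -> dict:
        return fetch(secret_name)
    return load
```

Note the trade-off: cached values go stale after a rotation, so long-running services still need a restart (or a TTL) when secrets change.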
The production infrastructure was configured so that the ECS task role had permission to read from Secrets Manager. No AWS access keys were needed in the application code at all—the application used IAM role-based authentication:

# In production, use IAM roles instead of access keys
class S3Connector:
    def __init__(self):
        # boto3 automatically uses the IAM role attached to the ECS task
        # No access keys needed!
        self.client = boto3.client('s3', region_name='us-east-1')

Phase 4d: Automated Prevention

The team implemented four layers of automated secret detection:

Layer 1: Pre-commit hooks

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.0
    hooks:
      - id: gitleaks
  - repo: https://github.com/Yelp/detect-secrets
    rev: v1.4.0
    hooks:
      - id: detect-secrets
        args: ['--baseline', '.secrets.baseline']
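
Scanners like gitleaks ship large curated rule sets; a toy version of a single rule shows the approach, which is regex matching over committed text. The AWS access key pattern below is the widely published AKIA prefix followed by 16 uppercase alphanumerics; real rules also cover entropy checks and hundreds of other providers:

```python
import re

# AWS access key IDs: "AKIA" followed by 16 characters from [0-9A-Z]
AWS_KEY_RE = re.compile(r"\bAKIA[0-9A-Z]{16}\b")

def find_aws_keys(text: str) -> list:
    """Return any AWS access key IDs appearing in the given text."""
    return AWS_KEY_RE.findall(text)
```
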

Layer 2: CI/CD pipeline

# .github/workflows/security.yml
name: Security Checks
on: [push, pull_request]
jobs:
  secret-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # Full history for comprehensive scanning
      - uses: gitleaks/gitleaks-action@v2
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

Layer 3: GitHub secret scanning

The team enabled GitHub's built-in secret scanning and push protection for the entire organization. Push protection blocks pushes that contain recognized secret patterns before they reach the repository.

Layer 4: Periodic full-repository scans

A weekly scheduled job scanned all repositories for secrets that might have slipped through:

# .github/workflows/weekly-secret-scan.yml
name: Weekly Secret Scan
on:
  schedule:
    - cron: '0 6 * * 1'  # Every Monday at 6 AM
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Run gitleaks
        uses: gitleaks/gitleaks-action@v2

Phase 5: Git History Cleanup

Even after removing secrets from the current code, they remained in git history. The team used git filter-repo to rewrite history:

# Created a patterns file listing all secrets to remove
# Then ran filter-repo to scrub them from every commit
git filter-repo --replace-text secrets_to_remove.txt --force

After rewriting history, the team force-pushed to all repositories and required every developer to re-clone. Because forks and GitHub's cached views can retain the old commits even after a force push, they also contacted GitHub support to have the exposed data purged.
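
The replace-text file maps each literal secret to a scrubbed marker using filter-repo's `literal==>replacement` line format. A small helper (hypothetical, not part of filter-repo) could build it from a gitleaks JSON report, whose findings each carry a `Secret` field:

```python
import json

def build_replace_text(report_json: str) -> str:
    """Produce filter-repo --replace-text rules: one 'literal==>replacement' per secret."""
    findings = json.loads(report_json)
    # Deduplicate and sort so the output is stable across runs
    secrets = sorted({f["Secret"] for f in findings if f.get("Secret")})
    return "\n".join(f"{s}==>***REMOVED***" for s in secrets)
```
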

Outcome and Metrics

Six months after the incident, the team measured the impact of their secrets management program:

| Metric | Before | After |
| --- | --- | --- |
| Hardcoded secrets in codebase | 47 | 0 |
| Secrets detected in PRs (blocked by hooks) | N/A | 12 (all caught pre-merge) |
| Time to rotate a compromised secret | Unknown | < 15 minutes |
| Environments sharing credentials | All shared | Fully separated |
| Developers with production database access | 8/8 | 2/8 (senior + lead) |
| Secret scanning coverage | 0% | 100% of repositories |

Lessons Learned

  1. AI coding assistants normalize insecure patterns. When every AI-generated example includes hardcoded credentials, developers internalize that pattern as normal. Teams must establish and document the correct pattern (environment variables, secret managers) and explicitly train developers to recognize the AI's pattern as insecure.

  2. Prevention is cheaper than remediation. Installing pre-commit hooks takes 10 minutes. The incident response, secret rotation, git history rewriting, customer notification, and secrets management implementation consumed approximately 120 person-hours over two weeks.

  3. Separate development and production credentials. If the exposed keys had been development-only credentials with no access to customer data, the incident would have been a minor issue instead of a breach notification event.

  4. IAM roles eliminate the need for long-lived credentials. By switching from access keys to IAM role-based authentication for AWS services, the team eliminated an entire category of secrets that could be leaked.

  5. Multiple layers of defense are essential. Pre-commit hooks catch most secrets, but developers can bypass them. CI/CD scanning catches what hooks miss, but force pushes can skip CI. GitHub push protection adds another layer. Periodic scans catch anything that slipped through all other layers.

  6. Fail fast. The centralized configuration module validates that all required environment variables are present at startup. If a variable is missing, the application crashes immediately with a clear error message, rather than failing mysteriously at runtime when a feature tries to use an unconfigured service.

Discussion Questions

  1. Could this incident have been prevented entirely? What single control would have been most effective?
  2. How would you handle the situation if CloudTrail logs showed that an attacker had downloaded a large volume of customer data?
  3. The team used AWS Secrets Manager. What would you recommend for a team that uses multiple cloud providers?
  4. How should the AI coding assistant prompts be modified to avoid generating hardcoded credentials in the first place?
  5. Design an onboarding process for new developers that ensures they understand and follow the secrets management policy from day one.