Case Study 1: Dirty COW (CVE-2016-5195)

The Race Condition That Haunted Linux for Nine Years

Overview

Dirty COW (Copy-On-Write) is one of the most significant Linux kernel vulnerabilities ever discovered. Assigned CVE-2016-5195, this race condition existed in the Linux kernel's memory subsystem for approximately nine years before its discovery in October 2016. The vulnerability affected virtually every Linux distribution, every Android device, and countless embedded systems running Linux kernels from version 2.6.22 (released in 2007) through 4.8.2. Its combination of widespread impact, reliability, and ease of exploitation made it one of the most dangerous privilege escalation vulnerabilities in Linux history.

Technical Background

Copy-On-Write (COW) Mechanism

To understand Dirty COW, you must first understand the Copy-On-Write mechanism. When a process forks, the child process initially shares the parent's memory pages rather than receiving a full copy. The pages are marked as read-only. When either process attempts to write to a shared page, the kernel intercepts the write, creates a private copy of the page for the writing process, and then allows the write to proceed on the new copy. This optimization dramatically reduces the memory overhead of process creation. The same mechanism backs private file mappings created with mmap() and MAP_PRIVATE: the process reads the file's cached pages directly and receives a private copy only when it writes. This is the case Dirty COW abuses.
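COW itself is invisible to a program; what can be observed is the isolation it preserves. The following is a minimal sketch of that isolation across fork(). The function name child_write_stays_private is hypothetical and chosen only for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a write made by the child after fork()
 * stays invisible to the parent, 0 otherwise or on error. */
int child_write_stays_private(void)
{
    char *buf = malloc(16);
    if (buf == NULL)
        return 0;
    strcpy(buf, "parent");

    pid_t pid = fork();
    if (pid < 0) {
        free(buf);
        return 0;
    }
    if (pid == 0) {
        /* Child: the page is shared read-only after fork(), so this store
         * faults and the kernel hands the child its own copy first. */
        strcpy(buf, "child");
        _exit(strcmp(buf, "child") == 0 ? 0 : 1);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        free(buf);
        return 0;
    }

    /* Parent: its view of the page is untouched by the child's write. */
    int ok = WIFEXITED(status) && WEXITSTATUS(status) == 0 &&
             strcmp(buf, "parent") == 0;
    free(buf);
    return ok;
}
```

Calling child_write_stays_private() from a small main() is expected to return 1 on any POSIX system. The same result would hold if the kernel copied every page eagerly at fork time; COW only changes when the copy happens.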

The Vulnerability

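The vulnerability existed in the get_user_pages() function in the Linux kernel's memory management subsystem. When asked to write to a read-only private mapping, the function first broke COW and then retried the page lookup without the write requirement, assuming the private copy would still be there. Nothing guaranteed that the copy survived until the retry.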

The attack exploits a race condition between two kernel operations:

The madvise(MADV_DONTNEED) system call: This tells the kernel that the application no longer needs certain memory pages, causing the kernel to discard the private copy and fall back to the original (shared, read-only) mapping.
The write() system call to /proc/self/mem: This writes directly to the process's own memory through the /proc filesystem.
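The effect of the first call can be observed safely on a scratch file, with no race involved. The sketch below assumes a Linux system with a writable /tmp; the function name and the temporary file are hypothetical and exist only for this demonstration. It modifies a private mapping, calls madvise(MADV_DONTNEED), and checks that the mapping shows the file's original bytes again.

```c
#define _DEFAULT_SOURCE
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if MADV_DONTNEED drops a private
 * modification so that the mapping shows the file's bytes again. */
int dontneed_restores_file_view(void)
{
    char path[] = "/tmp/cow_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return 0;
    unlink(path); /* the open descriptor keeps the scratch file alive */

    const char original[] = "original";
    if (write(fd, original, sizeof original) != (ssize_t)sizeof original) {
        close(fd);
        return 0;
    }

    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    char *map = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return 0;
    }

    /* Writing to the private mapping triggers COW: the process now has
     * its own copy of the page, and the file itself is untouched. */
    memcpy(map, "modified", 8);
    int saw_private = memcmp(map, "modified", 8) == 0;

    /* Discard the private copy. The next access repopulates the page
     * from the underlying file. */
    int rc = madvise(map, page, MADV_DONTNEED);
    int saw_original = rc == 0 && memcmp(map, "original", 8) == 0;

    munmap(map, page);
    close(fd);
    return saw_private && saw_original;
}
```

On its own this behavior is documented and harmless. It becomes a problem only when it can happen between the kernel's COW break and the retry that follows it.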

By racing these two operations in separate threads, an attacker could create a window where the kernel writes data to the original read-only page instead of a private copy. The sequence works as follows:

Thread A opens a read-only file and maps it into memory using mmap() with MAP_PRIVATE (COW semantics).
Thread A attempts to write to the mapped memory via /proc/self/mem. The kernel begins the COW process: it creates a private copy of the page and then retries the page lookup, this time without requiring write access, expecting to find that copy.
Thread B calls madvise(MADV_DONTNEED) on the mapped memory. This discards the private copy, reverting to the original read-only mapping.
If the timing is right, Thread A's write completes against the original read-only page in the page cache, effectively modifying a read-only file.

Impact

Because the vulnerability allows writing to any file the user can read (regardless of write permissions), the exploitation possibilities were vast:

Overwriting /etc/passwd to add a root user or remove the root password
Modifying SUID binaries to inject malicious code
Overwriting system libraries to achieve code execution as any process
Gaining root access on any unpatched Linux system, Android device, or embedded device

The Discovery

Phil Oester, a security researcher, discovered Dirty COW in October 2016 while investigating a production server compromise. He found evidence that the vulnerability was being actively exploited in the wild before any patch was available. This made Dirty COW a zero-day exploit at the time of disclosure.

Linus Torvalds acknowledged that he had attempted to fix a related race condition in 2005, but his fix was incomplete and was subsequently reverted because it caused problems on the s390 architecture. The race then remained in the kernel until the 2016 fix.

Exploitation in Practice

The Basic Exploit

The most common exploit for Dirty COW targeted /etc/passwd:

// Simplified concept (not complete exploit code).
// Shared state (running, proc_mem_fd, map_offset, payload, payload_len,
// mapped_addr, page_size) is set up elsewhere and omitted here.

// Thread 1: Write to /proc/self/mem at the mapped offset
void *writer_thread(void *arg) {
    (void)arg;
    while (running) {
        lseek(proc_mem_fd, map_offset, SEEK_SET);
        write(proc_mem_fd, payload, payload_len);
    }
    return NULL;
}

// Thread 2: Race by discarding the private copy
void *madvise_thread(void *arg) {
    (void)arg;
    while (running) {
        madvise(mapped_addr, page_size, MADV_DONTNEED);
    }
    return NULL;
}

The exploit would:
1. Map /etc/passwd into memory (read-only)
2. Start two threads racing against each other
3. Replace the root:x: line with root:: (empty password field)
4. After successful write, su root requires no password

Real-World Attack Chain

In the incident Phil Oester investigated, the attackers used Dirty COW as part of a larger attack chain:

Initial access through a web application vulnerability
Low-privilege shell as the web server user
Dirty COW exploitation to gain root access
Installation of a rootkit for persistence
Data exfiltration of sensitive information

The attackers had been leveraging the exploit for months before discovery, demonstrating the danger of unknown kernel vulnerabilities in production environments.

MedSecure Scenario Connection

In the MedSecure Health Systems engagement, the patient portal server runs an outdated Ubuntu 16.04 installation with kernel 4.4.0-31, well within the affected range. During the engagement:

The penetration testing team gains access as www-data through a SQL injection vulnerability.
Kernel enumeration reveals the vulnerable version: Linux medsecure-portal 4.4.0-31-generic.
Linux Exploit Suggester confirms Dirty COW as "highly probable."
However, the team opts for a less risky PATH hijacking attack on a custom SUID binary instead, documenting Dirty COW as an additional finding.
The report recommends immediate kernel patching and notes that the system had been vulnerable for years.

This decision reflects real-world penetration testing judgment: kernel exploits carry risk of system instability, and a less destructive alternative was available. The Dirty COW vulnerability was documented as a critical finding regardless.
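The kernel enumeration step can be sketched as a simple range check against the upstream versions given earlier in this case study (2.6.22 through 4.8.2). This is a first-pass triage aid only, under a significant assumption: it compares the leading major.minor.patch of a uname -r style string and ignores distribution backports, so a hit must still be confirmed against the vendor's advisory. All names in the sketch are hypothetical.

```c
#include <stdio.h>

struct kver { int major, minor, patch; };

/* Parse the leading "major.minor[.patch]" of a uname -r style string.
 * Returns 1 on success, 0 if the string does not start with a version. */
int parse_kver(const char *release, struct kver *out)
{
    out->patch = 0; /* stays 0 when the string has no patch component */
    int n = sscanf(release, "%d.%d.%d",
                   &out->major, &out->minor, &out->patch);
    return n >= 2;
}

/* Three-way comparison: negative, zero, or positive like strcmp(). */
static int kver_cmp(struct kver a, struct kver b)
{
    if (a.major != b.major) return a.major < b.major ? -1 : 1;
    if (a.minor != b.minor) return a.minor < b.minor ? -1 : 1;
    if (a.patch != b.patch) return a.patch < b.patch ? -1 : 1;
    return 0;
}

/* Returns 1 if the version lies in [2.6.22, 4.8.3), the upstream range
 * stated in this case study. Assumption: no backported fix is present. */
int in_upstream_dirty_cow_range(const char *release)
{
    const struct kver first_affected = {2, 6, 22};
    const struct kver first_fixed = {4, 8, 3};
    struct kver v;

    if (!parse_kver(release, &v))
        return 0;
    return kver_cmp(v, first_affected) >= 0 &&
           kver_cmp(v, first_fixed) < 0;
}
```

For the MedSecure release string 4.4.0-31-generic the check returns 1. Because patched distribution kernels often keep an old upstream version number, the same check can also return 1 for a kernel that already carries the fix, which is why tools report such findings as probable rather than confirmed.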

Defensive Response

Immediate Patches

The Linux kernel team released patches rapidly:
- Kernel 4.8.3 and later included the fix
- All major distributions released emergency patches within days
- Android security bulletin included the fix in the November 2016 update

The Fix

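The patch did not add locking. It removed the shortcut that made the race possible: instead of dropping FOLL_WRITE from the request after breaking COW, the kernel now keeps FOLL_WRITE and sets a new internal flag, FOLL_COW, to record that the COW break has already happened. During page table traversal, a write through a read-only entry is allowed only when FOLL_COW is set and the entry is dirty, which is true of the freshly created private copy but not of the original page-cache page. If madvise() discards the copy in between, the check fails and the kernel performs the COW again instead of falling through to the original page.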
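The difference between the old and new lookup decisions can be modeled in user space. The sketch below is a toy model, not kernel code: struct model_pte, the two example pages, and both function names are hypothetical, and the flag values are illustrative. The condition in new_lookup_allows follows the shape of the check introduced by the fix.

```c
#include <stdbool.h>

/* Flag names mirror the kernel's; the numeric values are illustrative. */
enum { FOLL_WRITE = 0x01, FOLL_FORCE = 0x10, FOLL_COW = 0x4000 };

/* Toy stand-in for a page table entry. */
struct model_pte { bool writable; bool dirty; };

/* The original page-cache page: mapped read-only and clean. */
static const struct model_pte PAGE_CACHE_PAGE = { false, false };
/* The private copy created by the COW break: read-only but dirty. */
static const struct model_pte COW_COPY = { false, true };

/* Pre-fix model: once FOLL_WRITE has been dropped for the retry, the
 * request looks like a read and any present page satisfies it. */
bool old_lookup_allows(struct model_pte pte, unsigned flags)
{
    if (flags & FOLL_WRITE)
        return pte.writable;
    return true;
}

/* Post-fix model: FOLL_WRITE stays set, and a read-only entry is only
 * accepted when COW has been done and the entry is the dirty copy. */
bool new_lookup_allows(struct model_pte pte, unsigned flags)
{
    if (!(flags & FOLL_WRITE))
        return true;
    return pte.writable ||
           ((flags & FOLL_FORCE) && (flags & FOLL_COW) && pte.dirty);
}
```

In this model, old_lookup_allows(PAGE_CACHE_PAGE, FOLL_FORCE) is true, which corresponds to the racing write landing on the original page. With the fix, new_lookup_allows(PAGE_CACHE_PAGE, FOLL_WRITE | FOLL_FORCE | FOLL_COW) is false while the same call with COW_COPY is true, so only the private copy can receive the write.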

Long-Term Lessons

Patch management is critical: Every unpatched kernel in the affected range was exploitable, and systems that lagged on kernel updates stayed exposed long after a fix was available.
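In-memory exploitation bypasses traditional defenses: Dirty COW modified the cached copy of the file in memory rather than going through the normal file write path, so monitoring that relies on write events or timestamps could miss the change, while tools that compare file content can still catch it.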
Race conditions are subtle and long-lived: Complex kernel code can harbor race conditions for years before discovery.
Active exploitation precedes disclosure: The vulnerability was exploited in the wild before it was publicly known, underscoring the importance of defense-in-depth.

🔵 Blue Team Perspective

Detection:
- Monitor for unusual patterns of madvise() and /proc/self/mem access
- Deploy kernel exploit detection tools like Falco with rules for Dirty COW signatures
- Monitor for unexpected changes to /etc/passwd, /etc/shadow, and SUID binaries
- File integrity monitoring (FIM) on critical system files
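The file integrity item can be illustrated with a content-based check. A write that bypasses the normal file write path may leave timestamps untouched, so the sketch compares a hash of the file's bytes against a stored baseline. It is a minimal illustration under stated assumptions: hash_file and file_matches_baseline are hypothetical names, and FNV-1a is used only to keep the example free of dependencies. A real FIM tool would use a cryptographic hash and protect its baseline from tampering.

```c
#include <stdint.h>
#include <stdio.h>

/* Compute the 64-bit FNV-1a hash of a file's contents.
 * Returns 0 on success and stores the hash in *out, -1 on I/O error. */
int hash_file(const char *path, uint64_t *out)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    uint64_t h = 14695981039346656037ULL; /* FNV-1a offset basis */
    unsigned char buf[4096];
    size_t n;

    while ((n = fread(buf, 1, sizeof buf, f)) > 0) {
        for (size_t i = 0; i < n; i++) {
            h ^= buf[i];
            h *= 1099511628211ULL; /* FNV-1a prime */
        }
    }

    int failed = ferror(f);
    fclose(f);
    if (failed)
        return -1;

    *out = h;
    return 0;
}

/* Returns 1 if the file still matches the baseline hash, 0 if its
 * content changed, and -1 if it could not be read. */
int file_matches_baseline(const char *path, uint64_t baseline)
{
    uint64_t current;
    if (hash_file(path, &current) != 0)
        return -1;
    return current == baseline;
}
```

A monitoring job would record hash_file() results for files such as /etc/passwd while the system is known to be clean, then call file_matches_baseline() on a schedule and alert on any result other than 1.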

Prevention:
- Implement automated patch management with rapid deployment for critical kernel vulnerabilities
- Use grsecurity or PaX kernel hardening patches where possible
- Enable kernel live-patching (kpatch/livepatch) for zero-downtime security updates
- Deploy SELinux or AppArmor in enforcing mode to limit exploitation impact
- Consider read-only root filesystems for server deployments

Incident Response:
- If Dirty COW exploitation is suspected, immediately preserve volatile evidence (running processes, memory dumps)
- Check /etc/passwd and /etc/shadow for unauthorized modifications
- Review system call audit logs for madvise() patterns
- Full system rebuild may be necessary after confirmed exploitation
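The /etc/passwd check can be partly automated. The sketch below flags the two modifications described in this case study: an empty password field and an account other than root with UID 0. The function name and flag constants are hypothetical, and the check covers only these two patterns; it does not replace a comparison against a known-good copy.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical result flags; they can be combined with bitwise OR. */
enum {
    PASSWD_OK = 0,
    PASSWD_EMPTY_PASSWORD = 1, /* second field is empty */
    PASSWD_EXTRA_UID0 = 2,     /* UID 0 on an account not named root */
    PASSWD_MALFORMED = 4       /* fewer than four fields or bad UID */
};

/* Inspect one passwd(5) line of the form name:password:uid:gid:... */
int check_passwd_line(const char *line)
{
    const char *c1 = strchr(line, ':');
    if (c1 == NULL)
        return PASSWD_MALFORMED;
    const char *c2 = strchr(c1 + 1, ':');
    if (c2 == NULL)
        return PASSWD_MALFORMED;
    const char *c3 = strchr(c2 + 1, ':');
    if (c3 == NULL || c3 == c2 + 1)
        return PASSWD_MALFORMED;

    /* The UID field must be a number that ends exactly at the colon. */
    char *end;
    long uid = strtol(c2 + 1, &end, 10);
    if (end != c3)
        return PASSWD_MALFORMED;

    int flags = PASSWD_OK;
    if (c2 == c1 + 1)
        flags |= PASSWD_EMPTY_PASSWORD;

    size_t name_len = (size_t)(c1 - line);
    int is_root = name_len == 4 && strncmp(line, "root", 4) == 0;
    if (uid == 0 && !is_root)
        flags |= PASSWD_EXTRA_UID0;

    return flags;
}
```

A responder would read the file line by line and report every line for which check_passwd_line() returns a nonzero value. For example, root::0:0:root:/root:/bin/bash yields PASSWD_EMPTY_PASSWORD, the state left behind by the basic exploit described above.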

Discussion Questions

Why did the Dirty COW vulnerability survive in the kernel for nine years despite active development and code review?
How does the decision between using Dirty COW versus a less risky privilege escalation method in the MedSecure engagement reflect real-world penetration testing ethics?
What organizational failures allowed the MedSecure portal to run an unpatched kernel for years?
How would mandatory kernel live-patching change the risk calculus for kernel vulnerabilities?
Given that Dirty COW was exploited in the wild before disclosure, what does this imply about the number of similar undisclosed vulnerabilities?

References

CVE-2016-5195 National Vulnerability Database Entry
Phil Oester, "Dirty COW" (https://dirtycow.ninja/)
Linus Torvalds, kernel patch commit for CVE-2016-5195
Enrico Perla and Massimiliano Oldani, "A Guide to Kernel Exploitation: Attacking the Core" (kernel race condition context)
Linux kernel git log showing the 2005 attempted fix and subsequent revert