In This Chapter
- Memory Corruption: The Original Sin of C Programming
- The Stack Buffer Overflow: Assembly-Level Mechanics
- Shellcode: Position-Independent Attack Code
- NOP Sleds
- Format String Vulnerabilities
- Heap Corruption
- The History: The Morris Worm (1988)
- A Complete GDB Walkthrough: Overwriting the Return Address
- Prevention: Writing Secure C
- Summary
Chapter 35: Buffer Overflows and Memory Corruption
🔐 Security Note — Read This First: This chapter explains how buffer overflow vulnerabilities work at the assembly level. The purpose is entirely defensive: understanding the mechanics makes you a better C programmer who writes secure code, a better security engineer who can protect systems, and a better analyst who can audit code for these vulnerabilities. Every exploitation mechanism shown is paired with its prevention. The historical examples are studied because they shaped the security landscape we work in today. This knowledge is how you protect systems, not how you attack them.
Memory Corruption: The Original Sin of C Programming
C gives you direct control over memory. You can address any byte, cast any pointer, and perform pointer arithmetic without bounds checking. This power is why C produces fast, portable, and predictable code that runs on everything from microcontrollers to supercomputers. It is also why memory-corruption bugs in C code have been one of the most persistent sources of security vulnerabilities.
The fundamental issue: C does not check whether your code accesses memory it owns. If you write buf[100] when buf is only 64 bytes, the compiler dutifully generates a write to buf + 100. Whatever is at that address gets overwritten. If that address is the return address on the stack, the program will jump to wherever you pointed it when the function returns.
This is the buffer overflow. It is not exotic. It is just C doing exactly what you told it to do.
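To make the missing check concrete, here is a minimal sketch of the test that C does not perform for you. The helper name `checked_store` is hypothetical and not part of any standard library; it only illustrates the comparison that separates a valid write from an overflow.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: store `value` at buf[index] only if index is in range.
   Returns 0 on success, -1 if the write would land outside the buffer. */
int checked_store(char *buf, size_t size, size_t index, char value)
{
    assert(buf != NULL);
    if (index >= size) {
        return -1;          /* plain C would have performed this write anyway */
    }
    buf[index] = value;
    return 0;
}
```

With a 64-byte buffer, `checked_store(buf, sizeof buf, 100, 'A')` reports an error, whereas the bare statement `buf[100] = 'A'` compiles to an unconditional store.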
This chapter examines buffer overflows and related memory corruption vulnerabilities at the assembly level. We will see the stack layout, trace through an overflow step by step, understand what shellcode is and why it is position-independent, examine heap corruption, and look at the history that shaped modern mitigations. Chapter 36 covers those mitigations; this chapter is about understanding the underlying vulnerability.
The Stack Buffer Overflow: Assembly-Level Mechanics
The Stack Layout of a Vulnerable Function
Consider this vulnerable C function:
#include <string.h>
void vulnerable(const char *input) {
    char buf[64];
    strcpy(buf, input); // No bounds checking!
    // ... process buf ...
}
When this function is called, the stack looks like this:
High address
┌─────────────────────────────┐
│ Return Address (8 bytes) │ ← at rbp+8: address to return to after ret
├─────────────────────────────┤
│ Saved RBP (8 bytes) │ ← RBP points here after prologue
├─────────────────────────────┤
│ buf[63] │
│ buf[62] │
│ ... │
│ buf[1] │
│ buf[0] │ ← buf starts here (rbp-0x40)
├─────────────────────────────┤
│ (saved argument, alignment padding) │
└─────────────────────────────┘
Low address (RSP points here)
The assembly for the prologue and strcpy call:
push rbp
mov rbp, rsp
sub rsp, 0x50 ; 80 bytes: 64 for buf + 16 for the saved argument and alignment
mov [rbp-0x48], rdi ; save the input pointer inside the frame
; strcpy(buf, input):
lea rdi, [rbp-0x40] ; RDI = &buf[0] = rbp - 64
mov rsi, [rbp-0x48] ; RSI = input (reloaded from the frame)
call strcpy@plt
During the Overflow
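If input is 63 characters or fewer, strcpy writes the characters and the terminating null byte into buf and stops. Nothing outside buf is overwritten, and normal execution continues. A 64-character input already goes one byte too far: its null terminator lands on the low byte of the saved RBP.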
If input is 80 bytes:
After strcpy writes 80 bytes starting at buf[0]:
┌─────────────────────────────┐
│ bytes [72-79] of input │ ← overwrites Return Address!
├─────────────────────────────┤
│ bytes [64-71] of input │ ← overwrites Saved RBP
├─────────────────────────────┤
│ bytes [0-63] of input │ ← fills buf[0..63]
└─────────────────────────────┘
When vulnerable() executes ret:
1. RSP points to the overwritten return address
2. ret pops the value at RSP into RIP
3. Execution jumps to whatever 8-byte address is at bytes [72-79] of input
4. The attacker controls where execution goes
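The steps above can be replayed safely with a toy model of the frame. The sketch below is not a real stack frame: `struct frame_model` and `model_strcpy` are hypothetical names, the struct simply places an 8-byte saved RBP and an 8-byte return address directly above a 64-byte buffer, and the copy is clamped to the struct so the demonstration itself never writes out of bounds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model of the frame, lowest address first. */
struct frame_model {
    char     buf[64];
    uint64_t saved_rbp;   /* offset 64 */
    uint64_t ret_addr;    /* offset 72 */
};

/* Copy the string and its terminator the way strcpy would, starting at buf[0],
   but never past the end of the model. */
void model_strcpy(struct frame_model *f, const char *input)
{
    assert(f != NULL && input != NULL);
    size_t n = strlen(input) + 1;       /* characters plus the '\0' */
    if (n > sizeof *f) {
        n = sizeof *f;                  /* clamp: keep the demo in bounds */
    }
    memcpy(f, input, n);
}
```

Feeding the model a 63-character string leaves `saved_rbp` and `ret_addr` untouched. Feeding it 80 'A' characters leaves `ret_addr` equal to 0x4141414141414141, which is the value ret would try to load into RIP.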
Stack Diagrams: Before, During, After
Before overflow (input = "AAAA...A" × 63, which fits in buf together with its null terminator):
┌──────────────┐
│ return addr │ 0x401234 (legitimate return address) ← RBP + 8
├──────────────┤
│ saved rbp │ 0x7ffd5520 (legitimate value) ← RBP
├──────────────┤
│ buf + 56..63 │ 'A' 'A' 'A' 'A' 'A' 'A' 'A' '\0'
│ buf + 48..55 │ 'A' 'A' 'A' 'A' 'A' 'A' 'A' 'A'
│ ... │
│ buf + 0..7 │ 'A' 'A' 'A' 'A' 'A' 'A' 'A' 'A'
└──────────────┘ ← buf + 0 = RBP - 0x40
During overflow (input = "AAAA...A" × 80):
┌──────────────┐
│ 0x4141414141414141 │ ← Return Address OVERWRITTEN with 'AAAAAAAA'
├──────────────┤
│ 0x4141414141414141 │ ← Saved RBP OVERWRITTEN
├──────────────┤
│ 'A' × 64 │ buf completely filled
└──────────────┘
After ret (return address slot holds 0x4141414141414141):
- ret tries to load 0x4141414141414141 into RIP
- This address is non-canonical on x86-64, so the CPU faults on the ret itself → SIGSEGV (Segmentation fault)
- If the attacker supplies a valid, mapped address instead: controlled execution
What the Attacker Needs
To exploit a stack buffer overflow, an attacker needs to know:
1. The offset: how many bytes from the start of the buffer to the return address
2. A target address: where to redirect execution
The offset can be calculated from the stack layout (distance from the start of buf up to RBP + 8 bytes of saved RBP = offset to return address; here 64 + 8 = 72). It can also be found empirically using cyclic patterns in GDB (discussed below).
For x86-64 on Linux with 48-bit virtual addressing, user-space addresses lie below 0x0000800000000000, so they fit in 6 bytes. The upper 2 bytes of the overwritten 8-byte return address must be zero for the value to be a canonical user-space address.
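The two pieces of arithmetic in this section can be written down directly. This is a sketch under the assumption of 48-bit virtual addressing (bits 63..47 must all be equal for an address to be canonical); the function names are illustrative, not from any library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Offset from buf[0] to the return address: the distance from buf up to RBP,
   plus the 8 bytes of saved RBP. */
size_t offset_to_return_address(size_t buf_to_rbp)
{
    assert(buf_to_rbp <= SIZE_MAX - 8);
    return buf_to_rbp + 8;
}

/* True if addr is canonical under 48-bit addressing: bits 63..47 all equal. */
bool is_canonical48(uint64_t addr)
{
    uint64_t top = addr >> 47;          /* the 17 uppermost bits */
    return top == 0 || top == 0x1FFFF;
}

/* True if addr is canonical and in the lower (user-space) half. */
bool is_user_space48(uint64_t addr)
{
    return (addr >> 47) == 0;
}
```

For the frame in this chapter, `offset_to_return_address(64)` gives 72. The legitimate return address 0x401234 passes both checks, while 0x4141414141414141 is not canonical at all.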
Finding the Offset with GDB
The pwndbg cyclic pattern approach:
(gdb) cyclic 200
aaaaaaaabaaaaaaacaaaaaaadaaaaaaaeaaaaaaafaaaaaaagaaaaaaahaaaaaaai...
(gdb) run $(cyclic 200)
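Program received signal SIGSEGV (raised at the ret in vulnerable)
; the pattern is a non-canonical address, so the fault fires on ret; read the pending return address from the stack
(gdb) x/gx $rsp
0x7ffd...: 0x616161616161616a
(gdb) cyclic -l 0x616161616161616a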
72
The offset is 72 bytes: 64 bytes of buffer + 8 bytes of saved RBP = 72 bytes before reaching the return address. Bytes [72-79] of the input are the return address.
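The idea behind a cyclic pattern can be sketched in a few lines of C. This is a deliberately simplified stand-in, not the de Bruijn sequence that pwndbg generates: chunk i is the letter 'a' + i followed by seven 'a' characters, so only the first 26 chunks (208 bytes) are unique. Both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill out[0..len) with 8-byte chunks "aaaaaaaa", "baaaaaaa", "caaaaaaa", ...
   Simplified: only valid for len up to 26 * 8 bytes. */
void simple_pattern(char *out, size_t len)
{
    assert(out != NULL && len <= 26 * 8);
    for (size_t i = 0; i < len; i++) {
        out[i] = (i % 8 == 0) ? (char)('a' + i / 8) : 'a';
    }
}

/* Given the 8 bytes found in the return-address slot, read as a little-endian
   integer, return the offset of that chunk in the pattern, or -1 if the value
   is not one of the pattern's chunks. */
long simple_pattern_offset(uint64_t value)
{
    unsigned char first = (unsigned char)(value & 0xFF);  /* lowest-addressed byte */
    if ((value >> 8) != 0x61616161616161ULL || first < 'a' || first > 'z') {
        return -1;
    }
    return (long)(first - 'a') * 8;
}
```

In this simplified pattern the chunk at offset 72 is "jaaaaaaa", which reads back from memory as 0x616161616161616a, so `simple_pattern_offset(0x616161616161616a)` returns 72.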
Shellcode: Position-Independent Attack Code
🔐 Security Note: Understanding shellcode is necessary for: writing exploit mitigations, understanding what NX/DEP protects against, analyzing malware that uses shellcode injection, and understanding why modern exploit techniques exist. The following describes shellcode mechanics for that defensive purpose.
What Shellcode Is
Shellcode is a small piece of machine code designed to perform an action when executed — classically, spawning a shell (/bin/sh). The name comes from its original purpose; modern shellcode might open a network connection, add a user, or run any arbitrary command.
Two requirements make shellcode different from ordinary code:
- Position-independent: The code cannot use absolute addresses because the address where it will run is unpredictable (especially with ASLR). Any address it needs must be computed at run time, relative to RIP or RSP.
- Null-byte-free: Many overflow vectors use strcpy or similar string functions that stop at \0. If the shellcode contains a null byte, the copy truncates.
The execve System Call Sequence
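The minimal action is execve("/bin/sh", NULL, NULL); the listing below passes a one-entry argv instead of a NULL argv:
- Syscall number: 59 (0x3b), placed in RAX
- RDI: pointer to the string /bin/sh
- RSI: argv (NULL, or a pointer to a NULL-terminated array of string pointers)
- RDX: NULL (envp)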
Position-independent version:
; execve("//bin/sh", argv, NULL) with argv = { "//bin/sh", NULL }
; Position-independent, null-byte-free
section .text
global shellcode
shellcode:
    ; Build "//bin/sh" on the stack
    xor rdx, rdx ; rdx = 0 (envp = NULL)
    ; "/bin/sh" is only 7 bytes, so an 8-byte immediate would contain a 0x00 byte.
    ; Pad with an extra slash instead ("//bin/sh" resolves to the same path):
    mov rbx, 0x68732f6e69622f2f ; '//bin/sh' in little-endian order (no null bytes)
    push rdx ; push null terminator
    push rbx ; push '//bin/sh'
    mov rdi, rsp ; rdi = pointer to "//bin/sh"
    push rdx ; argv[1] = NULL
    push rdi ; argv[0] = pointer to "//bin/sh"
    mov rsi, rsp ; rsi = argv
    xor rax, rax
    mov al, 59 ; rax = 59 = execve syscall
    syscall
This code:
- Uses only stack operations relative to RSP (no absolute addresses)
- Builds the string "//bin/sh" on the stack at runtime (position-independent)
- Sets RAX with xor rax, rax followed by mov al, 59, which avoids the zero bytes that the 32-bit immediate of mov rax, 59 would contain
⚠️ Common Mistake: Simply writing mov rax, 59 produces the encoding 48 C7 C0 3B 00 00 00; note the three null bytes 00 00 00. If the overflow vector is strcpy, those null bytes terminate the copy and truncate the shellcode. The mov al, 59 workaround (B0 3B) avoids this, provided RAX has already been zeroed.
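The truncation point is easy to find programmatically, which is also how an analyst might check whether a captured byte sequence could have survived a string copy. A minimal sketch, with a hypothetical function name:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Index of the first 0x00 byte in code[0..len), or len if there is none.
   A strcpy-style copy would stop at this index. */
size_t first_null_byte(const unsigned char *code, size_t len)
{
    assert(code != NULL);
    const unsigned char *hit = memchr(code, 0x00, len);
    return hit ? (size_t)(hit - code) : len;
}
```

For the 7-byte encoding 48 C7 C0 3B 00 00 00 the function returns 4, meaning only the first four bytes would be copied. For B0 3B it returns 2, the full length, so nothing is lost.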
Why Shellcode Is Historical
Classic shellcode injection — overflow the buffer, write shellcode into the buffer, overwrite the return address to point to the buffer — stopped working when NX/DEP was deployed. The stack became non-executable. Shellcode in the buffer would never execute; the CPU would raise a fault at the first instruction in non-executable memory.
Understanding shellcode is still essential because:
- Malware still uses shellcode injection into executable memory (via mmap/mprotect as in Chapter 34)
- Shellcode is part of CTF challenges
- Understanding what NX/DEP prevents explains why it was a major mitigation
- Understanding position-independent code is essential for understanding ROP (Chapter 37)
NOP Sleds
A NOP sled is a sequence of NOP instructions (0x90) placed before shellcode. Its purpose in classic exploitation: if the attacker cannot predict the exact address of the shellcode (stack addresses vary slightly), but can guess within a range, filling that range with NOPs means that landing anywhere in the sled slides execution down to the actual shellcode.
Input payload structure (pre-ASLR era):
┌─────────────────────────────┐
│ NOP NOP NOP NOP ... │ ← sled: any landing here slides to shellcode
│ (hundreds of bytes) │
├─────────────────────────────┤
│ shellcode (~72 bytes) │ ← actual code
├─────────────────────────────┤
│ padding to return address │
├─────────────────────────────┤
│ target address (return) │ ← guess an address in the sled
└─────────────────────────────┘
ASLR made this unreliable by randomizing stack addresses across a large range. NOP sleds are now primarily a malware and CTF artifact.
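From the defender's side, a long run of 0x90 bytes in untrusted input is a simple signal worth flagging. The sketch below measures the longest such run; the function name and any threshold you compare against are assumptions for illustration, and real sleds may use other instructions with no effect, which this check would miss.

```c
#include <assert.h>
#include <stddef.h>

/* Length of the longest run of consecutive 0x90 bytes in data[0..len). */
size_t longest_nop_run(const unsigned char *data, size_t len)
{
    assert(data != NULL || len == 0);
    size_t best = 0, run = 0;
    for (size_t i = 0; i < len; i++) {
        run = (data[i] == 0x90) ? run + 1 : 0;
        if (run > best) {
            best = run;
        }
    }
    return best;
}
```

A caller might treat a result of several hundred as suspicious in a field that should contain text, since ordinary input rarely contains that many identical non-printable bytes in a row.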
Format String Vulnerabilities
Format string bugs are a related but distinct class of memory corruption. They arise from incorrect use of printf-family functions:
// Correct:
printf("%s", user_input);
// Vulnerable:
printf(user_input); // user_input is the format string!
How printf Uses the Stack
printf is a variadic function. It reads arguments from registers (RDI, RSI, RDX, RCX, R8, R9) for the first six, then from the stack for additional arguments. When the format string contains more format specifiers than arguments were passed, printf reads whatever is on the stack.
With input "%x %x %x %x %x %x %x":
printf reads:
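- RDI: format string = "%x %x %x %x %x %x %x"
- RSI, RDX, RCX, R8, R9: whatever was there (consumed as arguments 2-6 by the first five %x)
- [RSP], [RSP+8], ...: whatever is on the caller's stack (consumed by the remaining two %x)
Output: 7 hex values, five from leftover register contents and two from stack contents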
This is an information disclosure: the attacker reads memory they should not have access to, including:
- Stack canary values (enabling canary bypass)
- Return addresses (enabling ASLR bypass)
- Pointers to data (enabling further memory access)
The %n Specifier: Writing to Memory
The %n format specifier writes the number of characters printed so far to the memory location pointed to by the corresponding argument. If the attacker controls the format string and the stack contains a useful pointer:
printf("AAAA%n"); // writes 4 to the memory location that would be the arg
By controlling the position on the stack (using %K$n to access the K-th argument) and the count of characters (using padding), an attacker can write arbitrary values to arbitrary addresses. This is the format string write-what-where primitive.
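The counting behaviour of %n is easiest to see in its legitimate form, where the programmer supplies both a literal format string and the destination pointer. The sketch below uses padding to control the count, which is the same lever an attacker pulls; the function name is hypothetical, and some hardened C libraries restrict or omit %n.

```c
#include <assert.h>
#include <stdio.h>

/* Returns the count that %n stores after printing the integer 0 padded to
   `width` columns. The format string is a literal and the pointer is ours. */
int count_via_percent_n(int width)
{
    char out[256];
    int written = -1;
    assert(width >= 1 && width < (int)sizeof out);
    snprintf(out, sizeof out, "%*d%n", width, 0, &written);
    return written;
}
```

Calling `count_via_percent_n(100)` stores 100 through the pointer. In a format string bug the attacker chooses the padding width and arranges for the pointer argument to be a value they control, which turns the same mechanism into a write of a chosen number to a chosen address.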
Stack Layout During printf
During printf("%p %p %p %p %p %p %p %p") with no further arguments, each %p consumes the next argument slot. Stack offsets are relative to the caller's RSP at the call:
│ buf (containing the format string) │ ← further up the stack, reachable with a large enough K
│ ... │
│ [rsp+0x10] │ ← 8th %p
│ [rsp+0x08] │ ← 7th %p
│ [rsp+0x00] │ ← 6th %p: first stack slot
│ R9 │ ← 5th %p
│ R8 │ ← 4th %p
│ RCX │ ← 3rd %p
│ RDX │ ← 2nd %p
│ RSI │ ← 1st %p
│ RDI │ ← printf's first arg (fmt)
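The mapping from a positional index K (as in %K$p) to the place printf reads from is simple arithmetic under the System V x86-64 calling convention. The sketch below covers integer and pointer conversions only, since floating-point arguments travel in XMM registers; both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Register holding the K-th argument after the format string (K starts at 1),
   or NULL if that argument is read from the stack. */
const char *vararg_register(unsigned k)
{
    static const char *const regs[] = { "RSI", "RDX", "RCX", "R8", "R9" };
    assert(k >= 1);
    return (k <= 5) ? regs[k - 1] : NULL;
}

/* Byte offset from the caller's RSP (at the call instruction) of the K-th
   argument, or -1 if that argument is passed in a register. */
long vararg_stack_offset(unsigned k)
{
    assert(k >= 1);
    return (k <= 5) ? -1 : (long)(k - 6) * 8;
}
```

So %1$p prints RSI, %5$p prints R9, and %6$p is the first value taken from the stack, at offset 0 from the caller's RSP.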
Prevention: Never pass untrusted data as the format string argument to printf. Always use printf("%s", user_input). Modern compilers warn about this (-Wformat-security) and it is caught by static analysis tools.
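The rule amounts to keeping the format string a constant and passing untrusted text only as an argument. A minimal sketch, with a hypothetical wrapper name:

```c
#include <assert.h>
#include <stdio.h>

/* Format untrusted text as data. Returns what snprintf returns: the length the
   full output would have, or a negative value on error. */
int format_untrusted(char *out, size_t out_size, const char *untrusted)
{
    assert(out != NULL && out_size > 0 && untrusted != NULL);
    /* Never snprintf(out, out_size, untrusted): that would make it the format. */
    return snprintf(out, out_size, "%s", untrusted);
}
```

Passing the string `%x %x %n` through this wrapper produces exactly those eight characters in the output. No registers or stack slots are read, and nothing is written through %n.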
Heap Corruption
Stack overflows are largely mitigated by modern defenses. Heap corruption, meaning bugs that corrupt the malloc heap, now accounts for a large share of real-world memory-corruption exploits.
Use-After-Free
char *ptr = malloc(64);
// ... use ptr ...
free(ptr);
// ... later ...
ptr[0] = 'A'; // use-after-free: write to freed memory
At the assembly level:
mov rdi, [rbp-8] ; load ptr
call free@plt ; free it
; ... code that forgets ptr is freed ...
mov rax, [rbp-8] ; reload ptr (still points to freed memory!)
mov BYTE [rax], 0x41 ; write to freed memory
The freed memory has been returned to the allocator's free list. Between the free and the use, the allocator may have given the same memory to another allocation. Writing to the freed pointer now corrupts that other object.
If the other object happens to be a function pointer, a vtable pointer, or a security-relevant data structure, the attacker can redirect execution.
Double-Free
free(ptr);
// ... some code ...
free(ptr); // free same pointer twice
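A double free puts the same chunk on a free list twice, which corrupts the allocator's free list data structures. In glibc's ptmalloc2, two later malloc calls can then return the same memory, and a write through the first allocation overwrites the fd (forward pointer) that the allocator still treats as free-list metadata, potentially allowing controlled writes on subsequent allocations.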
Modern glibc versions have extensive double-free detection that terminates the process. But custom allocators or older glibc versions may be vulnerable.
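One common defensive habit against both bugs is to clear a pointer at the moment it is freed. The sketch below shows the idea with a hypothetical helper; it only protects the one pointer variable it is given, so any other copies of the pointer remain dangling.

```c
#include <assert.h>
#include <stdlib.h>

/* Free *pp and reset the caller's pointer, so a later use is a NULL dereference
   (a clean crash) instead of a silent write into reused memory. */
void free_and_clear(char **pp)
{
    assert(pp != NULL);
    free(*pp);      /* free(NULL) is defined to do nothing */
    *pp = NULL;
}
```

Calling `free_and_clear(&ptr)` twice is harmless, because the second call passes NULL to free. A later `ptr[0] = 'A'` faults immediately rather than corrupting whichever object now occupies the old chunk.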
Heap Chunk Headers (glibc malloc)
Understanding why heap corruption is dangerous requires understanding how malloc organizes memory:
glibc malloc chunk structure:
┌────────────────────────────┐
│ prev_size (8 bytes) │ ← size of previous chunk (if free)
├────────────────────────────┤
│ size (8 bytes) │ ← size of this chunk (+ flags in low bits)
├────────────────────────────┤
│ User data │ ← what malloc() returns a pointer to
│ (requested size, aligned) │
├────────────────────────────┤
│ Padding │
└────────────────────────────┘
When free, user data area contains:
│ fd (forward pointer) │ ← next free chunk in bin
│ bk (backward pointer) │ ← previous free chunk in bin
│ ... │
An overflow into an adjacent heap chunk can corrupt the size field or the fd/bk pointers of the next chunk. When the allocator later processes that chunk (during a free or malloc), it follows the corrupted pointers — potentially writing attacker-controlled data to attacker-controlled locations.
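The size field is worth a worked example, because its value determines where the allocator believes the next chunk starts. The sketch below models the request-to-chunk-size rounding and the flag bits as they are commonly described for glibc on x86-64 (8-byte size field, 16-byte alignment, 32-byte minimum chunk, flags in the low three bits). These constants and names are assumptions of the model, not code taken from glibc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIZE_SZ        8u     /* bytes in the size field (x86-64) */
#define ALIGN_MASK     15u    /* chunks are 16-byte aligned */
#define MIN_CHUNK_SIZE 32u    /* smallest chunk: header + fd + bk */
#define FLAG_BITS      7u     /* low three bits of the size field are flags */
#define PREV_INUSE     1u     /* lowest flag: previous chunk is in use */

/* Chunk size the allocator would use for malloc(request) in this model. */
size_t chunk_size_for_request(size_t request)
{
    assert(request <= SIZE_MAX - SIZE_SZ - ALIGN_MASK);
    size_t padded = (request + SIZE_SZ + ALIGN_MASK) & ~(size_t)ALIGN_MASK;
    return padded < MIN_CHUNK_SIZE ? MIN_CHUNK_SIZE : padded;
}

/* Strip the flag bits from a raw size field. */
size_t chunk_size_from_field(size_t field)
{
    return field & ~(size_t)FLAG_BITS;
}
```

Under these assumptions malloc(64) occupies an 80-byte chunk, and with the previous chunk in use its size field reads 0x51. An overflow that changes that one byte changes how far the allocator steps to find the following chunk.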
Prevention:
- Use memory-safe languages for code handling untrusted input
- Enable AddressSanitizer (-fsanitize=address) during testing — it detects UAF and heap overflows
- Use valgrind during development
- Enable glibc heap hardening (tcache security, MALLOC_CHECK_)
- Use safe allocators with added integrity checking (jemalloc, tcmalloc with checks)
The History: The Morris Worm (1988)
The Morris Worm of November 1988 was the first widely known exploit of a buffer overflow vulnerability on the internet. Written by Robert Morris (then a Cornell graduate student), it used a vulnerability in the BSD Unix fingerd daemon.
The fingerd vulnerability:
// Simplified version of the vulnerable code
void get_finger_request(int sock) {
    char line[512];
    gets(line); // gets() has NO bounds checking whatsoever
    // ... process request ...
}
gets() reads from stdin until a newline or EOF, writing to the buffer with no size limit. The Morris Worm sent a carefully crafted string longer than 512 bytes, overwriting the return address with a shellcode pointer.
The worm:
1. Exploited fingerd buffer overflow to gain a shell on the target
2. Used the shell to download and execute its main body
3. Spread to new systems and repeated
It infected an estimated 6,000 computers — roughly 10% of the internet at the time. The internet was small enough that 6,000 machines was a significant fraction of all connected systems. The economic damage was estimated at $100,000 to $10 million.
The Morris Worm established several patterns that persist today:
- Buffer overflows in C programs are exploitable for code execution
- Network-facing services are high-value targets
- Worms can spread rapidly without human action
- Security through obscurity fails
gets() was deprecated in C99 and removed from C11. The modern equivalent vulnerability is using strcpy, sprintf, scanf("%s", ...) without bounds checking.
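For contrast with the gets() call above, here is a minimal sketch of a bounded line reader built on fgets. The function name `read_line` and its return convention are choices made for this example, not a standard interface.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Read one line into buf (at most size-1 characters) and strip the newline.
   Returns 0 on success, -1 on EOF or error. Overlong lines are truncated;
   the remainder stays in the stream for the next call. */
int read_line(char *buf, size_t size, FILE *in)
{
    assert(buf != NULL && in != NULL && size > 0 && size <= INT_MAX);
    if (fgets(buf, (int)size, in) == NULL) {
        return -1;
    }
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}
```

With an 8-byte buffer and the input line `0123456789abcdef`, the first call yields `0123456` and later calls return the rest. However long the line is, nothing is ever written past the end of buf.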
A Complete GDB Walkthrough: Overwriting the Return Address
This walkthrough demonstrates, in a controlled lab environment, how a buffer overflow overwrites a return address. The purpose is to understand the mechanism concretely so you can identify and prevent it in code you write or audit.
Vulnerable program:
// vulnerable.c — INTENTIONALLY VULNERABLE FOR EDUCATIONAL PURPOSES
// Compile: gcc -g -fno-stack-protector -z execstack vulnerable.c -o vulnerable
// The -fno-stack-protector and -z execstack flags disable protections
// to demonstrate the bare vulnerability. NEVER disable these in production.
#include <string.h>
#include <stdio.h>
void vulnerable(const char *input) {
    char buf[64];
    strcpy(buf, input);
    printf("buf: %s\n", buf);
}
int main(int argc, char **argv) {
    if (argc < 2) return 1;
    vulnerable(argv[1]);
    return 0;
}
GDB session:
$ gcc -g -fno-stack-protector -z execstack vulnerable.c -o vulnerable
$ gdb -q ./vulnerable
(gdb) set disassembly-flavor intel
; Disassemble to see the stack layout
(gdb) disassemble vulnerable
0x401196 <vulnerable>: push rbp
0x401197 <vulnerable+1>: mov rbp,rsp
0x40119a <vulnerable+4>: sub rsp,0x50 ; 80 bytes (64+16 align)
0x40119e <vulnerable+8>: mov [rbp-0x48],rdi ; save input ptr
0x4011a2 <vulnerable+12>: lea rdi,[rbp-0x40] ; buf at rbp-0x40 (offset 64)
0x4011a6 <vulnerable+16>: mov rsi,[rbp-0x48] ; input
0x4011aa <vulnerable+20>: call strcpy@plt
; ...
0x4011c0 <vulnerable+42>: leave
0x4011c1 <vulnerable+43>: ret
; buf is at rbp-0x40 (64 bytes below rbp)
; saved rbp is at rbp (8 bytes)
; return address is at rbp+8
; offset from buf to return address = 64 + 8 = 72 bytes
; Set breakpoint before ret
(gdb) break *0x4011c1
(gdb) run $(python3 -c "print('A'*64 + 'B'*8 + 'C'*8)")
Breakpoint 1, 0x00000000004011c1 in vulnerable ()
(gdb) info registers rsp rbp
rsp 0x7ffd...cb18 (leave moved RBP into RSP and popped the saved RBP; RSP now points to the return address)
rbp 0x4242424242424242 (saved rbp = 8 × 'B' = 0x42)
(gdb) x/8xb $rsp
0x7ffd...cb18: 0x43 0x43 0x43 0x43 0x43 0x43 0x43 0x43
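; The return address slot holds 0x4343434343434343 (8 × 'C')
; When ret executes, it tries to load 0x4343434343434343 into RIP
; That address is non-canonical, so the ret itself faults → SIGSEGV
(gdb) stepi
Program received signal SIGSEGV
(gdb) info registers rip
rip 0x4011c1 <vulnerable+43> ; still at ret: the fault fires before RIP changes
(gdb) x/gx $rsp
0x7ffd...cb18: 0x4343434343434343 ; Confirmed: we control the return address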
The walkthrough demonstrates concretely what the attacker controls. The offset calculation (64 + 8 = 72) matches the rbp-0x40 stack layout. Every byte position in the overflow payload has a specific effect.
Prevention: Writing Secure C
Every vulnerability in this chapter has a straightforward prevention:
| Dangerous Function | Reason | Safe Alternative |
|---|---|---|
| gets(buf) | No bounds checking, removed from C11 | fgets(buf, sizeof(buf), stdin) |
| strcpy(dst, src) | No bounds check | strlcpy(dst, src, sizeof(dst)) or strncpy + ensure null |
| sprintf(buf, fmt, ...) | Can overflow buf | snprintf(buf, sizeof(buf), fmt, ...) |
| scanf("%s", buf) | No bounds check | scanf("%63s", buf) (with explicit size) |
| strcat(dst, src) | No bounds check | strlcat(dst, src, sizeof(dst)) |
| printf(user_str) | Format string vuln | printf("%s", user_str) |
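Where strlcpy is unavailable, the same guarantee can be had from standard functions alone. The sketch below refuses the copy outright when the source does not fit, rather than truncating silently; the function name and return convention are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst only if it fits, terminator included.
   Returns 0 on success; on failure returns -1 and leaves dst an empty string. */
int copy_if_fits(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && dst_size > 0 && src != NULL);
    size_t len = strlen(src);
    if (len >= dst_size) {
        dst[0] = '\0';
        return -1;
    }
    memcpy(dst, src, len + 1);   /* len + 1 copies the '\0' as well */
    return 0;
}
```

Note that dst_size must be the real capacity of the destination. Writing sizeof(dst) is only correct when dst is an array in scope, not a pointer parameter.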
Beyond function choices:
- Enable compiler warnings: -Wall -Wextra -Wformat-security
- Enable AddressSanitizer: -fsanitize=address (catches buffer overflows, use-after-free at runtime)
- Use -D_FORTIFY_SOURCE=2 together with optimization (-O1 or higher) for additional compile-time and runtime checks
- Keep production binaries compiled with stack canaries and NX (enabled by default in most modern Linux distributions' toolchains)
- Use checksec to verify your binaries have expected mitigations
📊 C Comparison: In C++: std::string and std::vector handle buffer management. In Rust: the borrow checker prevents use-after-free at compile time; bounds checks prevent buffer overflows at runtime. In Python/Java/Go: bounds checking is automatic. Assembly gives you the same risk as C: you manage memory manually, and mistakes are exploitable. The mitigation architecture in Chapter 36 does not make C safe; it makes exploiting C vulnerabilities harder. The safest path is using memory-safe languages where performance permits and writing careful C where it does not.
🔄 Check Your Understanding:
1. A function has a 32-byte buffer. What is the offset to the return address on x86-64 (assuming standard frame layout with saved RBP)?
2. Why must shellcode be position-independent?
3. Why does %n in a format string cause a security vulnerability?
4. What is the difference between a stack buffer overflow and a heap buffer overflow in terms of exploitation potential?
5. Why did the Morris Worm use the gets() vulnerability rather than a more sophisticated technique?
Summary
Buffer overflows are the original sin of C programming: writing past the end of a buffer overwrites adjacent memory, and on the stack, that adjacent memory includes the return address. Understanding this at the assembly level — seeing the stack layout, tracing the overflow byte by byte, understanding what the return address is and how ret uses it — transforms "buffer overflow" from an abstract concern into a concrete, preventable vulnerability. Shellcode, NOP sleds, format string bugs, and heap corruption are variations on the same theme: C trusts you to manage memory correctly, and when you do not (or when an attacker can make you not), the consequences extend to arbitrary code execution. Modern mitigations (Chapter 36) make exploitation much harder. Writing careful C and using memory-safe languages where possible makes vulnerabilities far less likely to exist in the first place.