Case Study 26-2: INT3 — The Debugger's Best Friend
How Software Breakpoints Work at the Instruction Level
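When you type break main in GDB, something specific happens at the machine level: GDB arranges to overwrite one byte of main with 0xCC, replacing the first byte of the instruction where execution should stop. For code built with debug info, that is normally the first instruction after the function prologue; break *main targets the very first instruction. The byte is actually written when the program is next resumed. This byte is the INT3 instruction, also called the breakpoint instruction. When the CPU executes it, it raises exception vector 3 (#BP), and the kernel notifies the debugger. GDB restores the original byte, presents you with the register state, and waits for your next command.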
This case study traces the complete path from GDB command to INT3 firing, explains the assembly mechanics, and shows how to implement a simple software debugger from scratch.
The INT3 Instruction
INT3 is a one-byte instruction with opcode 0xCC. Its two-byte cousin, INT 3 (encoded as 0xCD 0x03), raises the same vector but occupies two bytes. The one-byte encoding exists specifically for breakpoints: it can replace the first byte of any instruction, regardless of that instruction's length, without spilling into the instruction that follows.
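The value of the one-byte encoding is easiest to see on a plain byte buffer. The following is a minimal sketch in C, not how any particular debugger is implemented: `struct breakpoint`, `bp_insert`, and `bp_remove` are hypothetical names, and the buffer stands in for the target's code.

```c
#include <stdint.h>

enum { INT3_OPCODE = 0xCC };

/* Hypothetical bookkeeping record for one software breakpoint. */
struct breakpoint {
    uint8_t *addr;    /* first byte of the patched instruction */
    uint8_t  saved;   /* original byte that 0xCC replaced */
    int      enabled; /* non-zero while 0xCC is in place */
};

/* Save the original first byte, then overwrite only that byte.
 * The remaining bytes of the instruction are left untouched. */
void bp_insert(struct breakpoint *bp, uint8_t *addr) {
    bp->addr = addr;
    bp->saved = *addr;
    *addr = INT3_OPCODE;
    bp->enabled = 1;
}

/* Put the original byte back so the instruction can run normally. */
void bp_remove(struct breakpoint *bp) {
    if (bp->enabled) {
        *bp->addr = bp->saved;
        bp->enabled = 0;
    }
}
```

Calling `bp_insert` on a buffer holding `55 48 89 E5` (PUSH RBP; MOV RBP, RSP) changes only the first byte to 0xCC. On a one-byte instruction such as RET (0xC3), the following instruction is still intact, which a two-byte 0xCD 0x03 patch could not guarantee.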
When 0xCC executes:
1. The CPU fires exception vector 3 (#BP)
2. RIP is pushed pointing to the byte after the 0xCC
3. The kernel's #BP handler executes
4. The kernel raises SIGTRAP for the process; if the process is being traced, it stops and the debugger is notified through its wait call
The key detail: RIP points past the INT3, not at it. When the debugger wants to continue, it must first restore the original byte at the breakpoint address, then set RIP back by one byte (to re-execute the original instruction), then single-step one instruction, then re-insert the breakpoint.
Examining INT3 in Action with GDB
# Compile with debug info
gcc -g -o test_program test.c
# Start GDB
gdb test_program
# Set a breakpoint
(gdb) break main
Breakpoint 1 at 0x401126: file test.c, line 5.
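# Run until the breakpoint is hit
(gdb) run
Starting program: test_program
Breakpoint 1, main () at test.c:5
# Read the byte at the breakpoint address through GDB:
(gdb) x/1bx 0x401126
0x401126 <main>: 0x55 ← GDB reports the original byte (0x55 = PUSH RBP), not 0xCC
# Reading it through an expression gives the same answer:
(gdb) p/x *(unsigned char*)0x401126
$1 = 0x55 ← still the original byte
Both reads return the original byte on purpose. GDB keeps a shadow copy of every byte it overwrites and substitutes it into any memory read that covers an inserted breakpoint, so neither x nor print ever shows the 0xCC. By default GDB also removes its breakpoints from the program's memory while the program is stopped and re-inserts them on resume; set breakpoint always-inserted on keeps the 0xCC in place, but reads through GDB are still masked.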
Implementing a Minimal Software Debugger
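Here is a minimal software debugger skeleton written in assembly with raw system calls. It traces a hardcoded target and leaves the resume-after-breakpoint sequence as a comment:
; mindbg.asm — A minimal software debugger using ptrace syscalls
; Usage: ./mindbg (traces the hardcoded target /tmp/target)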
;
; Demonstrates: fork, ptrace, breakpoints via POKETEXT, single-step
; Build: nasm -f elf64 mindbg.asm -o mindbg.o && ld mindbg.o -o mindbg
; ptrace request codes
PTRACE_TRACEME equ 0
PTRACE_PEEKTEXT equ 1
PTRACE_POKETEXT equ 4
PTRACE_CONT equ 7
PTRACE_SINGLESTEP equ 9
PTRACE_GETREGS equ 12
PTRACE_SETREGS equ 13
PTRACE_ATTACH equ 16
PTRACE_DETACH equ 17
; waitpid status macros
; WIFEXITED(s) = (s & 0x7F) == 0
; WIFSTOPPED(s) = (s & 0xFF) == 0x7F
; WSTOPSIG(s) = (s >> 8) & 0xFF
SYS_FORK equ 57
SYS_EXECVE equ 59
SYS_EXIT equ 60
SYS_WAIT4 equ 61
SYS_PTRACE equ 101
SYS_WRITE equ 1
%define SIGTRAP 5
section .bss
child_pid: resq 1
wait_status: resd 1
; user_regs_struct: 216 bytes (27 × 8-byte registers)
; Layout: r15, r14, r13, r12, rbp, rbx, r11, r10, r9, r8,
; rax, rcx, rdx, rsi, rdi, orig_rax, rip, cs, eflags,
; rsp, ss, fs_base, gs_base, ds, es, fs, gs
regs: resb 216
section .text
global _start
_start:
; For simplicity, trace a hardcoded target program
; In a real debugger, we'd parse argv
; Fork
mov rax, SYS_FORK
syscall
test rax, rax
js .error
jz .child
; Parent: debugger
mov [child_pid], rax
; Wait for child to stop (after PTRACE_TRACEME + execve)
mov rdi, rax ; child PID
lea rsi, [wait_status]
xor rdx, rdx
xor r10, r10
mov rax, SYS_WAIT4
syscall
; Child is now stopped at entry point (first instruction)
; Set a breakpoint at a known address (hardcoded for demo)
; In reality, you'd look up the symbol table
call set_breakpoint
; Continue execution
call ptrace_continue
; Wait for breakpoint hit or exit
.debug_loop:
mov rdi, [child_pid]
lea rsi, [wait_status]
xor rdx, rdx
xor r10, r10
mov rax, SYS_WAIT4
syscall
; Check if stopped (WIFSTOPPED)
mov eax, [wait_status]
and eax, 0xFF
cmp eax, 0x7F ; 0x7F = stopped
jne .child_exited
; Check stop signal (WSTOPSIG)
mov eax, [wait_status]
shr eax, 8
and eax, 0xFF
cmp eax, SIGTRAP
je .handle_breakpoint
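; Other signal: forward it to the child. PTRACE_CONT's data argument
; is the signal to deliver; passing 0 would suppress it instead.
mov r10d, eax ; eax still holds WSTOPSIG
mov rax, SYS_PTRACE
mov rdi, PTRACE_CONT
mov rsi, [child_pid]
xor rdx, rdx
syscall
jmp .debug_loop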
.handle_breakpoint:
; Print register state
call print_registers
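; Simplification: a real debugger would now restore the original byte,
; set RIP back by 1, single-step, re-insert the breakpoint, and continue.
; Continuing directly, as done here, resumes one byte past the 0xCC,
; i.e. in the middle of the patched instruction.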
call ptrace_continue
jmp .debug_loop
.child_exited:
; Child exited or was killed: nothing left to trace (status is not decoded here)
mov rax, SYS_EXIT
xor rdi, rdi
syscall
.child:
; Child: set up to be traced
; ptrace(PTRACE_TRACEME, 0, 0, 0)
mov rax, SYS_PTRACE
mov rdi, PTRACE_TRACEME
xor rsi, rsi
xor rdx, rdx
xor r10, r10
syscall
; Execute the target program
; (in a real debugger, use the path from argv)
mov rax, SYS_EXECVE
mov rdi, target_path
mov rsi, target_argv
xor rdx, rdx
syscall
; If execve fails, exit
mov rax, SYS_EXIT
mov rdi, 1
syscall
.error:
mov rax, SYS_EXIT
mov rdi, 1
syscall
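set_breakpoint:
; Read-modify-write: POKETEXT writes a whole 8-byte word, so fetch the
; current word first and change only its lowest byte to 0xCC.
; (simplified: the original byte is not saved for later restoration)
; Raw ptrace(PTRACE_PEEKTEXT, pid, addr, &word): unlike the glibc wrapper,
; the raw syscall stores the word at the address passed in r10.
sub rsp, 8 ; scratch slot for the peeked word
mov rax, SYS_PTRACE
mov rdi, PTRACE_PEEKTEXT
mov rsi, [child_pid]
mov rdx, [breakpoint_addr] ; address to break at
mov r10, rsp
syscall
pop r10 ; r10 = original 8 bytes at addr
mov r10b, 0xCC ; replace only the first (lowest) byte
; ptrace(PTRACE_POKETEXT, pid, addr, word_with_0xCC)
mov rax, SYS_PTRACE
mov rdi, PTRACE_POKETEXT
mov rsi, [child_pid] ; rdx still holds the address
syscall
ret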
ptrace_continue:
mov rax, SYS_PTRACE
mov rdi, PTRACE_CONT
mov rsi, [child_pid]
xor rdx, rdx
xor r10, r10
syscall
ret
print_registers:
; ptrace(PTRACE_GETREGS, pid, 0, &regs)
mov rax, SYS_PTRACE
mov rdi, PTRACE_GETREGS
mov rsi, [child_pid]
xor rdx, rdx
lea r10, [regs]
syscall
; Print RIP (at offset 128 in user_regs_struct)
; ... (print formatted output using write syscall)
ret
section .data
target_path: db "/tmp/target", 0
target_argv: dq target_path, 0
breakpoint_addr: dq 0x401126 ; hardcoded for demo
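The status checks in the debug loop mirror the standard wait macros. As a cross-check, here is a small C sketch of the same decision, assuming Linux with glibc; `classify_wait_status` and `signal_to_forward` are hypothetical helper names, not part of the listing above.

```c
#include <signal.h>
#include <sys/wait.h>

enum stop_kind { CHILD_GONE, STOPPED_BY_TRAP, STOPPED_BY_OTHER_SIGNAL };

/* Same tests as the assembly: (s & 0xFF) == 0x7F, then (s >> 8) & 0xFF. */
enum stop_kind classify_wait_status(int status) {
    if (!WIFSTOPPED(status))
        return CHILD_GONE; /* exited normally or killed by a signal */
    if (WSTOPSIG(status) == SIGTRAP)
        return STOPPED_BY_TRAP;
    return STOPPED_BY_OTHER_SIGNAL;
}

/* Value to pass as PTRACE_CONT's data argument:
 * 0 swallows the debugger's own SIGTRAP, anything else is delivered. */
int signal_to_forward(int status) {
    if (!WIFSTOPPED(status) || WSTOPSIG(status) == SIGTRAP)
        return 0;
    return WSTOPSIG(status);
}
```

A stop caused by SIGTRAP is reported as the status `(5 << 8) | 0x7F`, which is `0x057F`; an exit with code 1 is `0x0100`, whose low byte is zero, so it is not a stop.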
The ptrace System Call
ptrace is the foundation of all Linux debuggers and the mechanism behind strace. It lets one process (the tracer) observe and control another (the tracee). Key operations:
| Request | Effect |
|---|---|
| PTRACE_TRACEME | Child tells kernel to let its parent trace it |
| PTRACE_PEEKTEXT | Read 8 bytes from tracee's memory |
| PTRACE_POKETEXT | Write 8 bytes into tracee's memory (used to insert 0xCC) |
| PTRACE_GETREGS | Copy all registers into a user_regs_struct |
| PTRACE_SETREGS | Set all registers from a user_regs_struct |
| PTRACE_CONT | Resume execution (optionally deliver a signal) |
| PTRACE_SINGLESTEP | Execute exactly one instruction, then stop (sets TF) |
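The register layout used by PTRACE_GETREGS can be checked against the C header. This sketch assumes x86-64 Linux with glibc's `<sys/user.h>`; `rip_offset` and `read_rip` are hypothetical helper names.

```c
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>

/* Compile-time checks of the layout on x86-64 Linux. */
_Static_assert(sizeof(struct user_regs_struct) == 216,
               "27 registers of 8 bytes each");
_Static_assert(offsetof(struct user_regs_struct, rip) == 128,
               "rip is the 17th field");

/* Byte offset of RIP inside the buffer filled by PTRACE_GETREGS. */
size_t rip_offset(void) {
    return offsetof(struct user_regs_struct, rip);
}

/* Fetch the stopped tracee's RIP. Returns 0 on success, -1 on error. */
int read_rip(pid_t pid, unsigned long long *rip_out) {
    struct user_regs_struct regs;
    if (ptrace(PTRACE_GETREGS, pid, NULL, &regs) == -1)
        return -1;
    *rip_out = regs.rip;
    return 0;
}
```

Note that the glibc `ptrace` wrapper takes its arguments in the order request, pid, addr, data, which matches the rdi, rsi, rdx, r10 assignments in the assembly listing.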
POKETEXT can only write a full 64-bit word, so inserting a single 0xCC byte is a read-modify-write. The debugger reads the original word at the target address with PEEKTEXT, replaces the lowest byte (the byte at that address, since x86-64 is little-endian) with 0xCC, and writes the word back with POKETEXT. When the breakpoint fires, the debugger does the reverse: it writes the original word back, sets RIP back by 1 (to re-execute the original instruction), and uses SINGLESTEP to execute that one instruction before re-inserting the breakpoint.
🔐 Security Note:
ptrace is powerful enough to read and modify arbitrary memory and registers of any process you can trace. This is why container security systems restrict it: an unprivileged process can use ptrace to completely control a child process. Seccomp policies often block or limit ptrace.
The lesson: every time you use break foo in GDB, you are asking it to write 0xCC into a live process's memory. The instruction set includes a one-byte encoding for this operation specifically because it is so fundamental to software development.