Chapter 5: Your Development Environment
The Toolchain Is Part of the Skill
Learning assembly without a functioning toolchain is like learning to cook without a kitchen. You can read recipes all day; until you've assembled, linked, run, and debugged a real program, you don't know assembly. This chapter gets you to the point where you can write assembly, assemble and link it, run it, and debug it, and understand exactly what each tool is doing.
The toolchain for assembly development on Linux is: NASM (the assembler), GCC/binutils (compiler, linker, and binary analysis tools), GDB (debugger), Make (build automation), and QEMU (for the OS kernel project). All of it is open source, all of it is free, and all of it runs on any x86-64 Linux system.
Installing the Tools
On a Debian/Ubuntu-based system:
# Core tools
sudo apt update
sudo apt install nasm gcc binutils gdb make
# For the kernel project
sudo apt install qemu-system-x86
# Optional but useful for security work
sudo apt install ghidra # if available in your repos
# Or: download from ghidra-sre.org
# ROPgadget (Python)
pip3 install ROPgadget
On Fedora/RHEL:
sudo dnf install nasm gcc binutils gdb make qemu-system-x86
On Arch Linux:
sudo pacman -S nasm gcc binutils gdb make qemu
Verify the installation:
nasm --version # should print: NASM version 2.15.x or later
ld --version # should print: GNU ld (GNU Binutils) 2.x
gdb --version # should print: GNU gdb (Ubuntu/Fedora/etc) 12.x or later
objdump --version # part of binutils
readelf --version # part of binutils
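The individual checks above can be folded into one loop that reports anything still missing. This is a minimal sketch for a POSIX shell; the function name check_tools is an illustrative choice, not part of any of the tools.

```shell
# check_tools: report which of the named commands are on PATH.
# Returns 0 if all are present, 1 if at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf 'ok: %s\n' "$tool"
        else
            printf 'MISSING: %s\n' "$tool"
            missing=1
        fi
    done
    return "$missing"
}

# The core toolchain this chapter installs.
check_tools nasm ld gdb objdump readelf make \
    || echo "install the missing tools before continuing"
```

A non-zero return makes the function usable as a guard at the top of a build script.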
On macOS
macOS users can install tools via Homebrew:
brew install nasm
brew install binutils # provides gobjdump, readelf as 'greadelf'
# GDB requires code signing on macOS; LLDB (included with Xcode) works similarly
Note: macOS uses a different binary format (Mach-O instead of ELF) and a different kernel, so the system call numbers and some tool behaviors differ. For this book's examples, using a Linux environment (native, WSL2 on Windows, or a Linux VM) is strongly recommended.
On Windows with WSL2
# Install WSL2 with Ubuntu:
wsl --install -d Ubuntu
Once WSL2 is running, follow the Debian/Ubuntu instructions above. WSL2 provides a full Linux kernel and complete Linux user space, so all examples in this book work without modification.
Your First NASM Program
Before we debug anything, let's write, assemble, link, and run a working program.
; hello.asm — Hello World for x86-64 Linux
; Every line explained.
section .data
; msg: define bytes for the message
; 'db' = define byte(s)
; 10 = ASCII newline character
msg db "Hello, Assembly!", 10
; len: an assemble-time constant = current position ($) minus msg's address
len equ $ - msg ; len = 17 (16 chars + newline)
section .text
; global _start: makes _start visible to the linker as the entry point
global _start
_start:
; System call: sys_write(fd, buf, count)
; On Linux x86-64, syscall number goes in RAX
; Arguments: RDI=first, RSI=second, RDX=third
mov rax, 1 ; syscall number 1 = sys_write
mov rdi, 1 ; argument 1: fd = 1 (stdout)
mov rsi, msg ; argument 2: pointer to message
mov rdx, len ; argument 3: byte count
syscall ; execute the system call
; After syscall: RAX = bytes written (17 if successful)
; System call: sys_exit(status)
mov rax, 60 ; syscall number 60 = sys_exit
xor rdi, rdi ; argument 1: exit status = 0
syscall ; execute and terminate
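The value of len can be cross-checked without assembling anything, because it is just the byte count of the string plus its newline. A small shell sketch (the variable name msg_len is made up for illustration):

```shell
# Count the bytes NASM emits for msg: 16 characters plus the newline (byte 10).
msg_len=$(printf 'Hello, Assembly!\n' | wc -c)
msg_len=$((msg_len))    # normalize: some wc implementations pad with spaces
echo "len = $msg_len"   # prints: len = 17
```

If the message text changes, equ recomputes len at assembly time, so the count passed to sys_write stays in step with the data.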
Assembling: NASM
nasm -f elf64 hello.asm -o hello.o
Flags:
- -f elf64: output format is ELF 64-bit (for Linux x86-64)
- hello.asm: input file
- -o hello.o: output file name
NASM produces an object file (hello.o). It contains the machine code, but the addresses aren't resolved yet and it can't be executed directly.
Linking: LD
ld hello.o -o hello
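The linker combines the object file(s), resolves any symbols, and produces an executable. For a standalone assembly program with no C library dependencies, ld alone works. Programs that call C functions are linked with gcc instead, and they define main rather than _start, because the C runtime supplies its own _start (see hello_c.asm later in this chapter):
gcc -no-pie hello_c.o -o hello_c # when linking with the C runtime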
Running
./hello
# Output: Hello, Assembly!
echo $? # check exit status
# Output: 0
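The assemble, link, and run steps above chain naturally into one helper. The following is a sketch rather than a replacement for a real build system; build_and_run is a hypothetical name, and it assumes a standalone program with a _start entry point that can be linked with ld alone.

```shell
# build_and_run NAME: assemble NAME.asm, link it with ld, run it,
# and report the program's exit status.
build_and_run() {
    name=$1
    if [ ! -f "$name.asm" ]; then
        echo "no such source file: $name.asm" >&2
        return 1
    fi
    nasm -f elf64 "$name.asm" -o "$name.o" || return 1   # assemble
    ld "$name.o" -o "$name" || return 1                  # link
    status=0
    "./$name" || status=$?                               # run
    echo "exit status: $status"
}

# Usage, from the directory that holds hello.asm:
#   build_and_run hello
```

Stopping at the first failing step keeps a stale executable from being run after an assembly error.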
Checking the Result
# Disassemble the executable
objdump -d hello
# Show ELF headers
readelf -h hello
# Show all sections
readelf -S hello
# Show symbol table
nm hello
# Show what system calls the program makes (trace)
strace ./hello
The strace output is particularly instructive:
execve("./hello", ["./hello"], 0x7fff.../* 40 vars */) = 0
write(1, "Hello, Assembly!\n", 17) = 17
exit(0) = ?
+++ exited with 0 +++
Three system calls: execve (kernel loads the program), write (our sys_write), exit (our sys_exit). Clean and direct.
The Makefile for Assembly Projects
Hand-typing the assemble and link commands becomes tedious quickly. A Makefile automates the build:
# Makefile for assembly projects
# Usage:
# make -- build all targets
# make clean -- remove build artifacts
# make run -- build and run
# make debug -- build and start GDB session
# Configuration
NASM := nasm
LD := ld
OBJDUMP := objdump
GDB := gdb
NASM_FLAGS := -f elf64 -g -F dwarf # -g: debug info, -F dwarf: DWARF format
LD_FLAGS :=
# Source files and targets
# Wildcard: every .asm file in this directory becomes a target
SRCS := $(wildcard *.asm)
TARGETS := $(patsubst %.asm,%,$(SRCS))
# Default target: build all
all: $(TARGETS)
# Pattern rules: .asm -> .o -> executable
# (each recipe line below must start with a TAB character, not spaces)
%.o: %.asm
	$(NASM) $(NASM_FLAGS) $< -o $@
%: %.o
	$(LD) $(LD_FLAGS) $< -o $@
# Convenience targets
run: hello
	./hello
debug: hello
	$(GDB) ./hello
disasm: hello
	$(OBJDUMP) -d -M intel hello
clean:
	rm -f *.o $(TARGETS)
.PHONY: all run debug disasm clean
Save this as Makefile in your project directory. Now:
make # build hello from hello.asm
make run # build and run
make debug # build and launch GDB
make disasm # show disassembly in Intel syntax
make clean # remove build artifacts
The -g -F dwarf NASM flags add debug information to the object file, which lets GDB show source line numbers alongside assembly instructions.
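The SRCS and TARGETS lines in the Makefile map every .asm file to an executable name by stripping the suffix. The same mapping, sketched in shell for anyone who wants to preview what make will try to build (asm_targets is a hypothetical helper name):

```shell
# asm_targets [DIR]: list the executables the Makefile would build in DIR,
# i.e. every *.asm file name with the .asm suffix removed.
asm_targets() {
    dir=${1:-.}
    for src in "$dir"/*.asm; do
        [ -e "$src" ] || continue    # the glob matched nothing
        base=${src##*/}              # strip the directory part
        echo "${base%.asm}"          # strip the suffix, like patsubst
    done
}

asm_targets .
```

Running make -n prints the commands make would execute for those targets without running them.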
GDB: The Assembly Debugger
GDB is your most important debugging tool. For assembly work, you'll use a specific subset of its commands. Let's walk through a complete debugging session with the hello world program.
Starting a GDB Session
# Method 1: start GDB with the program as argument
gdb ./hello
# Method 2: start GDB, then load the program
gdb
(gdb) file ./hello
At the GDB prompt, set a breakpoint and run:
(gdb) break _start
Breakpoint 1 at 0x401000: file hello.asm, line 11.
(gdb) run
Starting program: /path/to/hello
Breakpoint 1, _start () at hello.asm:11
11 mov rax, 1
GDB stops at the first instruction of _start.
Examining Registers
(gdb) info registers
rax 0x0 0
rbx 0x0 0
rcx 0x0 0
rdx 0x0 0
rsi 0x0 0
rdi 0x0 0
rbp 0x0 0
rsp 0x7fffffffdfb0 140737488347056
r8 0x0 0
r9 0x0 0
r10 0x0 0
r11 0x0 0
r12 0x0 0
r13 0x0 0
r14 0x0 0
r15 0x0 0
rip 0x401000 0x401000 <_start>
eflags 0x202 [ IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
# To show a specific register:
(gdb) print $rax
$1 = 0
(gdb) print/x $rax # print in hex
$2 = 0x0
(gdb) info register rax rdi rsi rdx # show specific registers
Stepping Through Instructions
# stepi (si): execute ONE assembly instruction
(gdb) stepi
12 mov rdi, 1
(gdb) info register rax
rax 0x1 1
# rax is now 1 (sys_write)
(gdb) stepi
13 mov rsi, msg
(gdb) stepi
14 mov rdx, len
(gdb) stepi
15 syscall
(gdb) info register rax rdi rsi rdx
rax 0x1 1
rdi 0x1 1
rsi 0x402000 4202496 # address of msg
rdx 0x11 17 # len = 17
# nexti (ni): like stepi but steps over function calls (doesn't enter)
# For syscall, stepi steps through it (one "instruction")
(gdb) stepi
# The syscall executes -- we see the output:
Hello, Assembly!
18 mov rax, 60
(gdb) info register rax
rax 0x11 17 # sys_write returned 17 (bytes written)
Examining Memory
# x/FMT ADDRESS: examine memory
# Format: N = count, F = format (x=hex, d=decimal, s=string, i=instruction)
# Size: b=byte, h=halfword(16), w=word(32), g=giant(64)
# Examine 17 bytes at the msg address, as hex bytes:
(gdb) x/17xb 0x402000
0x402000: 0x48 0x65 0x6c 0x6c 0x6f 0x2c 0x20 0x41
0x402008: 0x73 0x73 0x65 0x6d 0x62 0x6c 0x79 0x21
0x402010: 0x0a
# H e l l o , [space] A s s e m b l y ! \n
# Examine as string:
(gdb) x/s 0x402000
0x402000: "Hello, Assembly!\n"
# Examine current instruction at RIP (here, just after the write syscall):
(gdb) x/i $rip
=> 0x40101e <_start+30>: mov $0x3c,%rax # AT&T syntax: 0x3c = 60
# Examine 5 instructions starting at _start:
(gdb) x/5i _start
0x401000 <_start>: mov $0x1,%rax
0x401007 <_start+7>: mov $0x1,%rdi
0x40100e <_start+14>: mov $0x402000,%rsi
0x401015 <_start+21>: mov $0x11,%rdx
0x40101c <_start+28>: syscall
# Examine stack:
(gdb) x/8gx $rsp # 8 giant (64-bit) hex values at RSP
0x7fffffffdfb0: 0x0000000000000001 0x00007fffffffe2c4
0x7fffffffdfc0: 0x0000000000000000 0x00007fffffffe2da
0x7fffffffdfd0: 0x00007fffffffe300 0x00007fffffffe30e
0x7fffffffdfe0: 0x00007fffffffe325 0x0000000000000000
# These are argc (1), argv[0], the NULL that ends argv, and then the environment pointers
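Reading a hex dump back into text is a useful habit when inspecting memory. As a sketch, the bytes GDB printed for msg can be decoded in the shell; hex_to_text is a hypothetical helper written for this illustration, not a standard utility.

```shell
# hex_to_text BYTE...: print the characters for a list of hex byte values.
# Each value is converted to octal because printf's %b understands \0ooo escapes.
hex_to_text() {
    for byte in "$@"; do
        printf '%b' "\\0$(printf '%03o' "$byte")"
    done
}

# The 17 bytes shown by x/17xb at the msg address:
hex_to_text 0x48 0x65 0x6c 0x6c 0x6f 0x2c 0x20 0x41 \
            0x73 0x73 0x65 0x6d 0x62 0x6c 0x79 0x21 0x0a
# prints: Hello, Assembly!
```

The final 0x0a is the newline that the 10 in the db directive placed after the text.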
Using the TUI (Text User Interface)
GDB has a split-screen mode that shows registers and assembly simultaneously:
(gdb) layout regs # show register window + source/asm window
(gdb) layout asm # show assembly (disassembly) window
(gdb) layout split # show source + assembly
(gdb) focus regs # keyboard focus to register window
(gdb) focus asm # keyboard focus to assembly window
# Press Ctrl+X, then A, to toggle TUI mode on/off
In TUI mode, the current instruction is highlighted and registers update automatically after each stepi.
Auto-Display: Watch Registers Automatically
(gdb) display /x $rax # show rax in hex after every command
(gdb) display /x $rdi
(gdb) display /x $rsi
(gdb) display /x $rdx
(gdb) info display # show all auto-displays
(gdb) undisplay 1 # remove display 1
After each stepi, GDB will automatically print the current values of all displayed registers. This is the most efficient way to trace register changes.
A Complete GDB Session Transcript
Here is a session that traces the complete hello world program:
$ gdb hello
(gdb) break _start
Breakpoint 1 at 0x401000
(gdb) run
Breakpoint 1, 0x0000000000401000 in _start ()
(gdb) display /x $rax
(gdb) display /x $rdi
(gdb) display /x $rsi
(gdb) display /x $rdx
(gdb) stepi
0x0000000000401007 in _start ()
1: /x $rax = 0x1 ← rax is now 1 (after mov rax, 1)
2: /x $rdi = 0x0
3: /x $rsi = 0x0
4: /x $rdx = 0x0
(gdb) stepi
0x000000000040100e in _start ()
1: /x $rax = 0x1
2: /x $rdi = 0x1 ← rdi is now 1 (after mov rdi, 1)
3: /x $rsi = 0x0
4: /x $rdx = 0x0
(gdb) stepi
0x0000000000401015 in _start ()
1: /x $rax = 0x1
2: /x $rdi = 0x1
3: /x $rsi = 0x402000 ← rsi is now address of msg
4: /x $rdx = 0x0
(gdb) stepi
0x000000000040101c in _start ()
1: /x $rax = 0x1
2: /x $rdi = 0x1
3: /x $rsi = 0x402000
4: /x $rdx = 0x11 ← rdx is now 17 (len = 17)
(gdb) stepi ← this executes syscall (sys_write)
Hello, Assembly! ← output appears!
0x000000000040101e in _start ()
1: /x $rax = 0x11 ← rax is now 17 (return value: bytes written)
2: /x $rdi = 0x1
3: /x $rsi = 0x402000
4: /x $rdx = 0x11
(gdb) stepi
0x0000000000401025 in _start ()
1: /x $rax = 0x3c ← rax is now 60 (sys_exit)
2: /x $rdi = 0x1
3: /x $rsi = 0x402000
4: /x $rdx = 0x11
(gdb) stepi
0x0000000000401028 in _start ()
1: /x $rax = 0x3c
2: /x $rdi = 0x0 ← rdi is now 0 (exit code)
3: /x $rsi = 0x402000
4: /x $rdx = 0x11
(gdb) stepi ← executes syscall (sys_exit); program terminates
[Inferior 1 (process 12345) exited normally]
This trace shows exactly what happens at each step: which register is set, what value it gets, and what the system call returns.
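A session like this can also be replayed non-interactively, which is handy for repeating the same trace after every edit. The sketch below writes the commands to a GDB command file and runs it in batch mode. It assumes the executable is ./hello in the current directory; the file name trace.gdb is an arbitrary choice.

```shell
# Write a GDB command file that steps through the write syscall.
# The heredoc delimiter is quoted so the shell leaves $rax etc. untouched.
cat > trace.gdb <<'EOF'
set disassembly-flavor intel
break _start
run
display/i $pc
display/x $rax
display/x $rdi
display/x $rsi
display/x $rdx
stepi
stepi
stepi
stepi
stepi
EOF

# Run it only when both GDB and the hello executable are present.
if command -v gdb >/dev/null 2>&1 && [ -x ./hello ]; then
    gdb -batch -x trace.gdb ./hello
else
    echo "gdb or ./hello not found; wrote trace.gdb only"
fi
```

The set disassembly-flavor intel line makes GDB print instructions in the same syntax as the NASM source instead of its AT&T default.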
The Binary Analysis Toolkit
Beyond GDB, several other tools are essential:
objdump: Disassembly and More
# Disassemble in Intel syntax (default on Linux is AT&T)
objdump -d -M intel hello
# Show all sections with their sizes
objdump -h hello
# Show the symbol table
objdump -t hello
# Disassemble ALL sections, including data (data bytes get decoded as if they were instructions, so that part of the output is meaningless)
objdump -D hello
# Disassemble with source interleaving (requires -g debug info)
objdump -d -S hello
readelf: ELF Structure Analysis
# ELF header
readelf -h hello
# Program headers (segments — what gets loaded into memory)
readelf -l hello
# Section headers
readelf -S hello
# Symbol table
readelf -s hello
# Relocation entries
readelf -r hello.o # for object files
# Dynamic section (for dynamically linked executables)
readelf -d hello_dynamic
# String dump of a section
readelf -p .rodata hello
readelf -p .data hello
nm: Symbol Table
nm hello # all symbols with addresses
nm -D hello_dynamic # dynamic symbols only
nm -n hello # sorted by address (numeric)
nm -u hello # undefined symbols (need to be resolved by dynamic linker)
strace: System Call Tracer
strace ./hello # trace all system calls
strace -e write ./hello # trace only write() calls
strace -c ./hello # count and summarize system calls
ltrace: Library Call Tracer
ltrace ./hello_with_printf # trace calls to shared library functions
Linking with the C Library
Sometimes you want to call C library functions (printf, strlen, malloc) from assembly. The setup is slightly different:
; hello_c.asm — calls printf() from assembly
extern printf
section .data
fmt db "Hello from assembly! Value: %d", 10, 0 ; printf format string
section .text
global main ; use 'main' not '_start' when linking with libc
main:
push rbp
mov rbp, rsp
; printf(fmt, 42)
lea rdi, [rel fmt] ; first argument: format string
mov esi, 42 ; second argument: integer value
xor eax, eax ; convention: al = number of vector registers used by a variadic call (0)
call printf ; call the C function
; return 0
xor eax, eax
pop rbp
ret
Build:
nasm -f elf64 hello_c.asm -o hello_c.o
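gcc -no-pie hello_c.o -o hello_c # gcc handles linking with libc; -no-pie avoids relocation errors from the plain 'call printf' on toolchains that default to PIE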
./hello_c
# Output: Hello from assembly! Value: 42
When using main instead of _start, the C runtime (_start provided by crt1.o) calls main after initialization. You can use all standard C library functions.
Note the extern printf declaration: it tells NASM that printf is defined elsewhere. The linker records the reference, and the dynamic loader binds it to the definition in libc.so.6 when the program starts. Without the declaration, NASM would report an error when you reference printf without defining it.
QEMU: The Kernel Testing Platform
For the MinOS kernel project, you'll need QEMU — a system emulator that can run a complete virtual x86-64 machine:
# Install
sudo apt install qemu-system-x86
# Verify
qemu-system-x86_64 --version
QEMU usage for the kernel project (you'll use this extensively in Part V):
# Boot from a floppy image (classic method):
qemu-system-x86_64 -fda minos.img -m 32M -no-reboot
# Boot from a hard disk image:
qemu-system-x86_64 -drive format=raw,file=minos.img -m 64M
# With serial output to terminal (useful for early kernel debugging):
qemu-system-x86_64 -kernel minos.bin -m 64M -serial stdio
# With GDB remote debugging:
qemu-system-x86_64 -fda minos.img -m 32M -s -S
# Then in another terminal:
gdb minos.elf
(gdb) target remote :1234 # connect to QEMU's GDB server
(gdb) continue
The -S flag starts QEMU with the CPU paused, and -s opens a GDB server on TCP port 1234 (shorthand for -gdb tcp::1234). Together they let you attach GDB and debug the kernel from the very first instruction.
Recommended Project Directory Structure
assembly-projects/
├── Makefile ← the Makefile from above
├── hello.asm ← first program
├── exercises/
│ ├── ch01/
│ │ ├── add.c
│ │ ├── sum.c
│ │ └── Makefile
│ ├── ch02/
│ └── ...
├── minos/ ← kernel project
│ ├── boot/
│ │ ├── boot.asm ← bootloader
│ │ └── Makefile
│ ├── kernel/
│ └── include/
└── tools/
├── syscall_table.txt ← reference: x86-64 syscall numbers
└── registers.txt ← reference: calling convention
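The directories in the layout above can be created in one step. This is a sketch that roots the tree at ./assembly-projects by default; the PROJECT_ROOT variable is a hypothetical override introduced here, and only the directories are created, not the files.

```shell
# Root of the tree; set PROJECT_ROOT beforehand to place it elsewhere.
root=${PROJECT_ROOT:-assembly-projects}

# mkdir -p creates intermediate directories and is harmless if they already exist.
mkdir -p "$root/exercises/ch01" "$root/exercises/ch02" \
         "$root/minos/boot" "$root/minos/kernel" "$root/minos/include" \
         "$root/tools"

# List the directories that now exist.
find "$root" -type d | sort
```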
Summary
You now have a functioning assembly development environment with:
- NASM for assembling .asm files to .o object files
- ld/gcc for linking object files to executables
- GDB for debugging with register inspection and single-stepping
- objdump, readelf, nm, strace for binary analysis
- A Makefile template for project automation
- QEMU for the kernel project
The key GDB commands for assembly debugging:
- break _start / run: start execution and stop at the entry point
- stepi (si): step one instruction
- info registers / print $rax: examine register values
- x/16xb addr / x/8gx $rsp: examine memory
- display /x $reg: auto-display a register on every step
- layout regs / layout asm: TUI mode for combined register/assembly view
With these tools, you can observe exactly what every instruction does. That observability is the foundation of assembly debugging and, by extension, the foundation of security research and performance optimization.
🛠️ Lab Exercise: Set up the toolchain on your system. Assemble and run hello.asm. Start a GDB session and step through every instruction, observing the register values after each step. The goal is to predict the register state before each step and verify your prediction. Do this until the predictions and reality match without surprises.
📐 OS Kernel Project, Step 0: Verify that QEMU is installed by running qemu-system-x86_64 --version. Create the directory structure for MinOS: mkdir -p ~/minos/boot ~/minos/kernel ~/minos/include. In Chapter 7, you'll write the first bytes of the MinOS bootloader.