Chapter 5: Your Development Environment

The Toolchain Is Part of the Skill

Learning assembly without a functioning toolchain is like learning to cook without a kitchen. You can read recipes all day; until you've assembled, linked, run, and debugged a real program, you don't know assembly. This chapter gets you to the point where you can write assembly, assemble and link it, run it, and debug it, and understand exactly what each tool is doing.

The toolchain for assembly development on Linux is: NASM (the assembler), GCC/binutils (compiler, linker, and binary analysis tools), GDB (debugger), Make (build automation), and QEMU (for the OS kernel project). All of it is open source, all of it is free, and all of it runs on any x86-64 Linux system.


Installing the Tools

On a Debian/Ubuntu-based system:

# Core tools
sudo apt update
sudo apt install nasm gcc binutils gdb make strace ltrace

# For the kernel project
sudo apt install qemu-system-x86

# Optional but useful for security work
sudo apt install ghidra   # if available in your repos
# Or: download from https://ghidra-sre.org

# ROPgadget (Python)
pip3 install ROPgadget

On Fedora/RHEL:

sudo dnf install nasm gcc binutils gdb make strace ltrace qemu-system-x86

On Arch Linux:

sudo pacman -S nasm gcc binutils gdb make strace ltrace qemu

Verify the installation:

nasm --version      # should print: NASM version 2.15.x or later
ld --version        # should print: GNU ld (GNU Binutils) 2.x
gdb --version       # should print: GNU gdb (Ubuntu/Fedora/etc) 12.x or later
objdump --version   # part of binutils
readelf --version   # part of binutils
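
If one of these commands is not found, the package either did not install or is not on your PATH. As a rough sketch (the helper name check_tools is made up for this example and is not part of any tool), a short shell function can report everything that is missing in one pass:

```shell
#!/bin/sh
# check_tools (hypothetical helper): print the names of any commands
# from the argument list that are not found on PATH.
check_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Drop the leading space before printing.
    echo "${missing# }"
}

missing_tools=$(check_tools nasm ld gdb objdump readelf make)
if [ -n "$missing_tools" ]; then
    echo "Missing: $missing_tools"
else
    echo "All tools found"
fi
```

The list of names passed to check_tools is just the set used in this chapter; extend it with qemu-system-x86_64 or strace as needed.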

On macOS

macOS users can install tools via Homebrew:

brew install nasm
brew install binutils   # provides gobjdump, readelf as 'greadelf'
# GDB requires code signing on macOS; LLDB (included with Xcode) works similarly

Note: macOS uses a different binary format (Mach-O instead of ELF) and a different kernel, so system call numbers and some tool behaviors differ. For this book's examples, using a Linux environment (native, WSL2 on Windows, or a Linux VM) is strongly recommended.

On Windows with WSL2

# Install WSL2 with Ubuntu:
wsl --install -d Ubuntu

Once WSL2 is running, follow the Debian/Ubuntu instructions above. WSL2 provides a full Linux kernel and complete Linux user space, so all examples in this book work without modification.


Your First NASM Program

Before we debug anything, let's write, assemble, link, and run a working program.

; hello.asm — Hello World for x86-64 Linux
; Every line explained.

section .data
    ; msg: define bytes for the message
    ; 'db' = define byte(s)
    ; 10 = ASCII newline character
    msg     db "Hello, Assembly!", 10
    ; len: an assemble-time constant = current position ($) minus msg's address
    len     equ $ - msg             ; len = 17 (16 chars + newline)

section .text
    ; global _start: makes _start visible to the linker as the entry point
    global _start

_start:
    ; System call: sys_write(fd, buf, count)
    ; On Linux x86-64, syscall number goes in RAX
    ; Arguments: RDI=first, RSI=second, RDX=third
    mov     rax, 1          ; syscall number 1 = sys_write
    mov     rdi, 1          ; argument 1: fd = 1 (stdout)
    mov     rsi, msg        ; argument 2: pointer to message
    mov     rdx, len        ; argument 3: byte count
    syscall                 ; execute the system call
    ; After syscall: RAX = bytes written (17 if successful)

    ; System call: sys_exit(status)
    mov     rax, 60         ; syscall number 60 = sys_exit
    xor     rdi, rdi        ; argument 1: exit status = 0
    syscall                 ; execute and terminate
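
The value of len can be checked independently of NASM. This is a small shell sketch, assuming a POSIX shell with printf, wc, and tr available; it counts the bytes in the same string and prints the count in decimal and in hex (the hex form is how a debugger will usually show it).

```shell
# Count the bytes NASM emits for: db "Hello, Assembly!", 10
# (16 characters plus one newline byte).
msg_len=$(printf 'Hello, Assembly!\n' | wc -c | tr -d ' ')
echo "len = $msg_len"

# The same value in hexadecimal.
msg_len_hex=$(printf '0x%x' "$msg_len")
echo "len = $msg_len_hex"
```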

Assembling: NASM

nasm -f elf64 hello.asm -o hello.o

Flags:
- -f elf64: output format is ELF 64-bit (for Linux x86-64)
- hello.asm: input file
- -o hello.o: output file name

NASM produces an object file (hello.o). It contains the machine code, but the addresses aren't resolved yet and it can't be executed directly.

Linking: LD

ld hello.o -o hello

The linker combines the object file(s), resolves any symbols, and produces an executable. For a standalone assembly program with no C library dependencies, ld alone works. For programs that call C functions, use gcc to link:

gcc -no-pie hello_c.o -o hello_c   # for a program that defines main and calls libc (hello_c.asm, later in this chapter)

Running

./hello
# Output: Hello, Assembly!

echo $?    # check exit status
# Output: 0
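
The assemble and link steps can be wrapped in a small shell function. This is only a sketch: build_asm is a name invented here, and the NASM and LD variables are an assumption made so the tool names can be overridden.

```shell
# Tool names; override in the environment if your tools live elsewhere.
NASM=${NASM:-nasm}
LD=${LD:-ld}

# build_asm FILE.asm (hypothetical helper): assemble FILE.asm to FILE.o,
# then link FILE.o into an executable named FILE.
# The link step is skipped if the assembler reports an error.
build_asm() {
    name=${1%.asm}
    "$NASM" -f elf64 "$name.asm" -o "$name.o" && "$LD" "$name.o" -o "$name"
}

# Usage (uncomment once hello.asm exists in the current directory):
# build_asm hello.asm && ./hello
```

Chaining the two commands with && means a syntax error in the source never leaves a stale executable looking freshly built.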

Checking the Result

# Disassemble the executable
objdump -d hello

# Show ELF headers
readelf -h hello

# Show all sections
readelf -S hello

# Show symbol table
nm hello

# Show what system calls the program makes (trace)
strace ./hello

The strace output is particularly instructive:

execve("./hello", ["./hello"], 0x7fff.../* 40 vars */) = 0
write(1, "Hello, Assembly!\n", 17)      = 17
exit(0)                                  = ?
+++ exited with 0 +++

Three system calls: execve (kernel loads the program), write (our sys_write), exit (our sys_exit). Clean and direct.


The Makefile for Assembly Projects

Hand-typing the assemble and link commands becomes tedious quickly. A Makefile automates the build:

# Makefile for assembly projects
# Usage:
#   make          -- build all targets
#   make clean    -- remove build artifacts
#   make run      -- build and run
#   make debug    -- build and start GDB session

# Configuration
NASM    := nasm
LD      := ld
OBJDUMP := objdump
GDB     := gdb

# -g: emit debug info, -F dwarf: use the DWARF debug format
NASM_FLAGS  := -f elf64 -g -F dwarf
LD_FLAGS    :=

# Source files and targets
# wildcard: every .asm file in this directory becomes a target
SRCS    := $(wildcard *.asm)
TARGETS := $(patsubst %.asm,%,$(SRCS))

# Default target: build all
all: $(TARGETS)

# Pattern rules: .asm -> .o -> executable
# (recipe lines must begin with a tab character, not spaces)
%.o: %.asm
	$(NASM) $(NASM_FLAGS) $< -o $@

%: %.o
	$(LD) $(LD_FLAGS) $< -o $@

# Convenience targets
run: hello
	./hello

debug: hello
	$(GDB) ./hello

disasm: hello
	$(OBJDUMP) -d -M intel hello

clean:
	rm -f *.o $(TARGETS)

.PHONY: all run debug disasm clean

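Save this as Makefile in your project directory. Make requires each recipe line to begin with a tab character, so if you copy the file from a web page, check that the indentation did not turn into spaces. Now: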

make          # build hello from hello.asm
make run      # build and run
make debug    # build and launch GDB
make disasm   # show disassembly in Intel syntax
make clean    # remove build artifacts

The -g -F dwarf NASM flags add debug information to the object file, which lets GDB show source line numbers alongside assembly instructions.
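
The wildcard and patsubst lines in the Makefile derive one target name per source file. As an illustration only (list_targets is a made-up helper, not something Make provides), the same mapping from source files to target names looks like this in shell:

```shell
# list_targets DIR (hypothetical helper): print the executable name that
# would be derived for each .asm file in DIR, one per line.
list_targets() {
    for src in "$1"/*.asm; do
        [ -e "$src" ] || continue      # no .asm files: the glob stays literal
        base=${src##*/}                # strip the directory part
        echo "${base%.asm}"            # strip the .asm suffix, like patsubst
    done
}

# Usage: list the targets for the current directory.
list_targets .
```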


GDB: The Assembly Debugger

GDB is your most important debugging tool. For assembly work, you'll use a specific subset of its commands. Let's walk through a complete debugging session with the hello world program.

Starting a GDB Session

# Method 1: start GDB with the program as argument
gdb ./hello

# Method 2: start GDB, then load the program
gdb
(gdb) file ./hello

At the GDB prompt, set a breakpoint and run:

(gdb) break _start
Breakpoint 1 at 0x401000: file hello.asm, line 20.

(gdb) run
Starting program: /path/to/hello

Breakpoint 1, _start () at hello.asm:20
20          mov     rax, 1

GDB stops before executing the first instruction of _start. The line it prints is always the one that will run next. The line numbers refer to your copy of hello.asm; they match this transcript only if the file was typed in exactly as listed above, comments and blank lines included.

Examining Registers

(gdb) info registers
rax            0x0                 0
rbx            0x0                 0
rcx            0x0                 0
rdx            0x0                 0
rsi            0x0                 0
rdi            0x0                 0
rbp            0x0                 0
rsp            0x7fffffffdfb0      140737488347056
r8             0x0                 0
r9             0x0                 0
r10            0x0                 0
r11            0x0                 0
r12            0x0                 0
r13            0x0                 0
r14            0x0                 0
r15            0x0                 0
rip            0x401000            0x401000 <_start>
eflags         0x202               [ IF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0
gs             0x0                 0

; To show a specific register:
(gdb) print $rax
$1 = 0
(gdb) print/x $rax      ; print in hex
$2 = 0x0
(gdb) info register rax rdi rsi rdx   ; show specific registers

Stepping Through Instructions

; stepi (si): execute ONE assembly instruction
(gdb) stepi
21          mov     rdi, 1

(gdb) info register rax
rax            0x1                 1
; rax is now 1 (sys_write)

(gdb) stepi
22          mov     rsi, msg
(gdb) stepi
23          mov     rdx, len
(gdb) stepi
24          syscall

; All four mov instructions have now executed; syscall is next.
(gdb) info register rax rdi rsi rdx
rax            0x1                 1
rdi            0x1                 1
rsi            0x402000            4202496     ; address of msg
rdx            0x11                17          ; len = 17

; nexti (ni): like stepi but steps over function calls (doesn't enter)
; For syscall, stepi steps through it (one "instruction")
(gdb) stepi
; The syscall executes -- we see the output:
Hello, Assembly!
28          mov     rax, 60

(gdb) info register rax
rax            0x11                17           ; sys_write returned 17 (bytes written)

Examining Memory

; x/FMT ADDRESS: examine memory
; Format: N = count, F = format (x=hex, d=decimal, s=string, i=instruction)
; Size: b=byte, h=halfword(16), w=word(32), g=giant(64)

; Examine 17 bytes at the msg address, as hex bytes:
(gdb) x/17xb 0x402000
0x402000:       0x48    0x65    0x6c    0x6c    0x6f    0x2c    0x20    0x41
0x402008:       0x73    0x73    0x65    0x6d    0x62    0x6c    0x79    0x21
0x402010:       0x0a
; H  e  l  l  o  ,  [space]  A  s  s  e  m  b  l  y  !  \n

; Examine as string:
(gdb) x/s 0x402000
0x402000:       "Hello, Assembly!\n"

; Examine current instruction at RIP:
(gdb) x/i $rip
=> 0x40101e <_start+30>:        mov    $0x3c,%rax    ; AT&T syntax: 0x3c = 60

; Examine 5 instructions starting at _start:
(gdb) x/5i _start
   0x401000 <_start>:           mov    $0x1,%rax
   0x401007 <_start+7>:         mov    $0x1,%rdi
   0x40100e <_start+14>:        mov    $0x402000,%rsi
   0x401015 <_start+21>:        mov    $0x11,%rdx
   0x40101c <_start+28>:        syscall

; Examine stack:
(gdb) x/8gx $rsp        ; 8 giant (64-bit) hex values at RSP
0x7fffffffdfb0: 0x0000000000000001      0x00007fffffffe2c4
0x7fffffffdfc0: 0x0000000000000000      0x00007fffffffe2da
0x7fffffffdfd0: 0x00007fffffffe300      0x00007fffffffe30e
0x7fffffffdfe0: 0x00007fffffffe325      0x0000000000000000
; These are argc (1), argv[0], the NULL that terminates argv, then the environment pointers
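
The byte values in the x/17xb dump can be cross-checked outside the debugger. A minimal sketch, assuming od, tr, and sed are available (all are standard on Linux): dump the same string as hex bytes and compare the result with what GDB printed.

```shell
# Dump the message as space-separated hex bytes on one line.
hex_bytes=$(printf 'Hello, Assembly!\n' | od -An -tx1 | tr '\n' ' ' | tr -s ' ' | sed 's/^ //; s/ $//')
echo "$hex_bytes"
# Prints: 48 65 6c 6c 6f 2c 20 41 73 73 65 6d 62 6c 79 21 0a
```

If the two dumps disagree, the usual cause is examining the wrong address; nm hello shows where msg actually landed.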

Using the TUI (Text User Interface)

GDB has a split-screen mode that shows registers and assembly simultaneously:

(gdb) layout regs    ; show register window + source/asm window
(gdb) layout asm     ; show assembly (disassembly) window
(gdb) layout split   ; show source + assembly
(gdb) focus regs     ; keyboard focus to register window
(gdb) focus asm      ; keyboard focus to assembly window
(gdb) Ctrl+X, A      ; toggle TUI mode on/off

In TUI mode, the current instruction is highlighted and registers update automatically after each stepi.

Auto-Display: Watch Registers Automatically

(gdb) display /x $rax    ; show rax in hex after every command
(gdb) display /x $rdi
(gdb) display /x $rsi
(gdb) display /x $rdx
(gdb) info display       ; show all auto-displays
(gdb) undisplay 1        ; remove display 1

After each stepi, GDB will automatically print the current values of all displayed registers. This is the most efficient way to trace register changes.
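
The same displays can be driven non-interactively with GDB's -batch and -ex options, which run a list of commands and then exit. A sketch, assuming ./hello was built as described earlier; trace_entry is a name invented for this example, and the GDB variable is an assumption that lets the debugger name be overridden.

```shell
GDB=${GDB:-gdb}

# trace_entry PROGRAM (hypothetical helper): break at _start, run,
# register an auto-display for RAX, then single-step twice.
# Single quotes keep the shell from expanding $rax.
trace_entry() {
    "$GDB" -batch \
        -ex 'break _start' \
        -ex 'run' \
        -ex 'display /x $rax' \
        -ex 'stepi' \
        -ex 'stepi' \
        "$1"
}

# Usage (uncomment once ./hello exists):
# trace_entry ./hello
```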

A Complete GDB Session Transcript

Here is a session that traces the complete hello world program:

$ gdb hello

(gdb) break _start
Breakpoint 1 at 0x401000

(gdb) run
Breakpoint 1, 0x0000000000401000 in _start ()

(gdb) display /x $rax
1: /x $rax = 0x0
(gdb) display /x $rdi
2: /x $rdi = 0x0
(gdb) display /x $rsi
3: /x $rsi = 0x0
(gdb) display /x $rdx
4: /x $rdx = 0x0

(gdb) stepi
0x0000000000401007 in _start ()
1: /x $rax = 0x1        ← rax is now 1 (after mov rax, 1)
2: /x $rdi = 0x0
3: /x $rsi = 0x0
4: /x $rdx = 0x0

(gdb) stepi
0x000000000040100e in _start ()
1: /x $rax = 0x1
2: /x $rdi = 0x1        ← rdi is now 1 (after mov rdi, 1)
3: /x $rsi = 0x0
4: /x $rdx = 0x0

(gdb) stepi
0x0000000000401015 in _start ()
1: /x $rax = 0x1
2: /x $rdi = 0x1
3: /x $rsi = 0x402000   ← rsi is now address of msg
4: /x $rdx = 0x0

(gdb) stepi
0x000000000040101c in _start ()
1: /x $rax = 0x1
2: /x $rdi = 0x1
3: /x $rsi = 0x402000
4: /x $rdx = 0x11       ← rdx is now 17 (len = 17)

(gdb) stepi                ← this executes syscall (sys_write)
Hello, Assembly!           ← output appears!
0x000000000040101e in _start ()
1: /x $rax = 0x11       ← rax is now 17 (return value: bytes written)
2: /x $rdi = 0x1
3: /x $rsi = 0x402000
4: /x $rdx = 0x11

(gdb) stepi
0x0000000000401025 in _start ()
1: /x $rax = 0x3c       ← rax is now 60 (sys_exit)
2: /x $rdi = 0x1
3: /x $rsi = 0x402000
4: /x $rdx = 0x11

(gdb) stepi
0x0000000000401028 in _start ()
1: /x $rax = 0x3c
2: /x $rdi = 0x0        ← rdi is now 0 (exit code)
3: /x $rsi = 0x402000
4: /x $rdx = 0x11

(gdb) stepi             ← executes syscall (sys_exit); program terminates
[Inferior 1 (process 12345) exited normally]

This trace shows exactly what happens at each step: which register is set, what value it gets, and what the system call returns.
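
The addresses in the transcript follow from the instruction lengths: each stepi advances RIP by the size of the instruction just executed. The sketch below replays that arithmetic in shell. The lengths are read off the transcript's own addresses (7 bytes for each mov with an immediate, 2 for syscall); an assembler is free to choose shorter encodings, so the numbers on your machine may differ.

```shell
# Start at the address of _start reported by the breakpoint.
addr=$((0x401000))
trace=""

# Lengths, in bytes, of: mov rax,1 / mov rdi,1 / mov rsi,msg /
# mov rdx,len / syscall / mov rax,60 (as implied by the transcript).
for len in 7 7 7 7 2 7; do
    addr=$((addr + len))
    trace="$trace $(printf '0x%x' "$addr")"
done

trace=${trace# }
echo "$trace"
# Prints: 0x401007 0x40100e 0x401015 0x40101c 0x40101e 0x401025
```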


The Binary Analysis Toolkit

Beyond GDB, several other tools are essential:

objdump: Disassembly and More

# Disassemble in Intel syntax (default on Linux is AT&T)
objdump -d -M intel hello

# Show all sections with their sizes
objdump -h hello

# Show the symbol table
objdump -t hello

# Disassemble ALL sections (data bytes get decoded as if they were instructions, so ignore that part of the output)
objdump -D hello

# Disassemble with source interleaving (requires -g debug info)
objdump -d -S hello

readelf: ELF Structure Analysis

# ELF header
readelf -h hello

# Program headers (segments — what gets loaded into memory)
readelf -l hello

# Section headers
readelf -S hello

# Symbol table
readelf -s hello

# Relocation entries
readelf -r hello.o     # for object files

# Dynamic section (for dynamically linked executables)
readelf -d hello_dynamic

# String dump of a section
readelf -p .rodata hello
readelf -p .data hello
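
The first thing readelf -h prints is the file's magic bytes, 7f 45 4c 46 (0x7f followed by the ASCII letters ELF). A hedged sketch of the same check using only head and od; is_elf is a name made up for this example.

```shell
# is_elf FILE (hypothetical helper): succeed if FILE starts with the
# four-byte ELF magic number 0x7f 'E' 'L' 'F'.
is_elf() {
    magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "7f454c46" ]
}

# Usage (uncomment once ./hello exists):
# is_elf ./hello && echo "hello is an ELF file"
```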

nm: Symbol Table

nm hello                # all symbols with addresses
nm -D hello_dynamic     # dynamic symbols only
nm -n hello             # sorted by address (numeric)
nm -u hello             # undefined symbols (need to be resolved by dynamic linker)

strace: System Call Tracer

strace ./hello          # trace all system calls
strace -e write ./hello # trace only write() calls
strace -c ./hello       # count and summarize system calls

ltrace: Library Call Tracer

ltrace ./hello_with_printf    # trace calls to shared library functions

Linking with the C Library

Sometimes you want to call C library functions (printf, strlen, malloc) from assembly. The setup is slightly different:

; hello_c.asm — calls printf() from assembly
extern printf

section .data
    fmt     db "Hello from assembly! Value: %d", 10, 0   ; printf format string

section .text
    global main                ; use 'main' not '_start' when linking with libc

main:
    push    rbp
    mov     rbp, rsp

    ; printf(fmt, 42)
    lea     rdi, [rel fmt]     ; first argument: format string
    mov     esi, 42            ; second argument: integer value
    xor     eax, eax           ; variadic call: al = number of vector (XMM) registers used (0)
    call    printf             ; call the C function

    ; return 0
    xor     eax, eax
    pop     rbp
    ret

Build:

nasm -f elf64 hello_c.asm -o hello_c.o
gcc -no-pie hello_c.o -o hello_c  # gcc handles linking with libc; -no-pie because a plain 'call printf' may not link as a PIE on some toolchains
./hello_c
# Output: Hello from assembly! Value: 42

When using main instead of _start, the C runtime (_start provided by crt1.o) calls main after initialization. You can use all standard C library functions.

Note the extern printf declaration — this tells NASM that printf is defined elsewhere (in libc.so.6, resolved at link time). Without it, NASM would report an error when you reference printf without defining it.


QEMU: The Kernel Testing Platform

For the MinOS kernel project, you'll need QEMU — a system emulator that can run a complete virtual x86-64 machine:

# Install
sudo apt install qemu-system-x86

# Verify
qemu-system-x86_64 --version

QEMU usage for the kernel project (you'll use this extensively in Part V):

# Boot from a floppy image (classic method):
qemu-system-x86_64 -fda minos.img -m 32M -no-reboot

# Boot from a hard disk image:
qemu-system-x86_64 -drive format=raw,file=minos.img -m 64M

# With serial output to terminal (useful for early kernel debugging):
qemu-system-x86_64 -kernel minos.bin -m 64M -serial stdio

# With GDB remote debugging:
qemu-system-x86_64 -fda minos.img -m 32M -s -S
# Then in another terminal:
gdb minos.elf
(gdb) target remote :1234   ; connect to QEMU's GDB server
(gdb) continue

The -s flag opens a GDB server on TCP port 1234, and -S starts QEMU with the CPU paused until a debugger tells it to continue. Together they let you debug the kernel from the very first instruction.


A suggested directory layout for the work in this book:

assembly-projects/
├── Makefile              ← the Makefile from above
├── hello.asm             ← first program
├── exercises/
│   ├── ch01/
│   │   ├── add.c
│   │   ├── sum.c
│   │   └── Makefile
│   ├── ch02/
│   └── ...
├── minos/                ← kernel project
│   ├── boot/
│   │   ├── boot.asm      ← bootloader
│   │   └── Makefile
│   ├── kernel/
│   └── include/
└── tools/
    ├── syscall_table.txt  ← reference: x86-64 syscall numbers
    └── registers.txt      ← reference: calling convention

Summary

You now have a functioning assembly development environment with:
- NASM for assembling .asm files to .o object files
- ld/gcc for linking object files to executables
- GDB for debugging with register inspection and single-stepping
- objdump, readelf, nm, strace for binary analysis
- A Makefile template for project automation
- QEMU for the kernel project

The key GDB commands for assembly debugging:
- break _start / run: start execution and stop at the entry point
- stepi (si): step one instruction
- info registers / print $rax: examine register values
- x/16xb addr / x/8gx $rsp: examine memory
- display /x $reg: auto-display a register on every step
- layout regs / layout asm: TUI mode for combined register/assembly view

With these tools, you can observe exactly what every instruction does. That observability is the foundation of assembly debugging and, by extension, the foundation of security research and performance optimization.

🛠️ Lab Exercise: Set up the toolchain on your system. Assemble and run hello.asm. Start a GDB session and step through every instruction, observing the register values after each step. The goal is to predict the register state before each step and verify your prediction. Do this until the predictions and reality match without surprises.

📐 OS Kernel Project — Step 0: Verify that QEMU is installed. Run qemu-system-x86_64 --version. Create the directory structure for MinOS: mkdir -p ~/minos/boot ~/minos/kernel ~/minos/include. In Chapter 7, you'll write the first bytes of the MinOS bootloader.