Appendix I: NASM Directives and Preprocessor Reference
NASM (Netwide Assembler) is the assembler used for all x86-64 examples in this book. This appendix covers NASM directives, pseudo-instructions, and preprocessor features.
Full documentation: https://www.nasm.us/doc/
Source File Structure
A NASM source file consists of sections. Each section maps to a segment in the output binary.
section .text ; executable code (default section)
section .data ; initialized read-write data
section .rodata ; initialized read-only data
section .bss ; zero-initialized (uninitialized) data
Alternative spelling: segment is synonymous with section in NASM.
Data Definition Directives
These directives define data inline in the assembly source:
| Directive | Size | Example |
|---|---|---|
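| DB | 1 byte | db 0x41 or db 'A' |
| DW | 2 bytes (word) | dw 0x1234 |
| DD | 4 bytes (dword) | dd 0x12345678 |
| DQ | 8 bytes (qword) | dq 0x0000000000401000 |
| DT | 10 bytes (tword, x87 extended) | dt 3.14159 |
| DO | 16 bytes (oword) | do 1.0 |
| DY | 32 bytes (yword, YMM) | see note below |
| DZ | 64 bytes (zword, ZMM) | see note below |
Note: the NASM manual states that DT, DO, DY, and DZ do not accept integer constants as operands. To emit zero-filled 16-, 32-, or 64-byte blocks, use times 16 db 0, times 32 db 0, or times 64 db 0 instead.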
Multiple values on one line:
my_bytes: db 0x90, 0x90, 0x90, 0xC3 ; nop, nop, nop, ret
my_string: db "Hello, World!", 10, 0 ; string + newline + null
my_array: dd 1, 2, 3, 4, 5 ; five 32-bit integers
String literals may use single or double quotes. The two forms are equivalent, and no escape sequences are processed inside them. Use explicit byte values for special characters: db 10 for newline, db 0 for a null terminator. NASM also accepts backquoted strings, which do process C-style escapes such as \n.
Uninitialized Data (BSS)
Uninitialized data uses RES* directives (reserve):
| Directive | Reserves | Example |
|---|---|---|
| RESB | N bytes | buffer: resb 64 |
| RESW | N words (2 bytes each) | array: resw 100 |
| RESD | N dwords (4 bytes each) | intarray: resd 256 |
| RESQ | N qwords (8 bytes each) | ptrs: resq 32 |
| REST | N 10-byte values | x87buf: rest 4 |
section .bss
input_buffer: resb 256 ; 256 bytes, zero-initialized at runtime
count: resd 1 ; one 32-bit integer
table: resq 16 ; 16 × 8 = 128 bytes
Constants and Equates
EQU defines a constant that is substituted at assembly time. It does not allocate storage.
STDIN equ 0
STDOUT equ 1
STDERR equ 2
SYS_WRITE equ 1
SYS_EXIT equ 60
BUF_SIZE equ 4096
; Used in code:
mov rdi, STDOUT
mov rax, SYS_WRITE
mov rdx, BUF_SIZE
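The $ symbol evaluates to the assembly position at the start of the current line, and $$ evaluates to the start of the current section. Because both are known at assembly time, differences such as $ - label or $ - $$ yield constant sizes and offsets: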
msg: db "Hello, World!", 10
msglen equ $ - msg ; length = current position - start of msg
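As a small sketch of `$$` in use (the label names below are illustrative and do not appear elsewhere in this appendix), `$ - $$` gives the number of bytes emitted so far in the current section, and `label - $$` gives a label's offset within it:

```nasm
section .data
first:      dq 0                ; 8 bytes at offset 0 of .data
second:     dd 0                ; 4 bytes at offset 8 of .data

data_used   equ $ - $$          ; bytes emitted in .data so far (8 + 4)
second_off  equ second - $$     ; offset of 'second' from the section start
```

Both equates are plain numbers, because each is the difference of two positions in the same section.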
Labels
A label followed by a colon defines a symbol at the current position:
my_function:
push rbp
mov rbp, rsp
; ...
pop rbp
ret
Local labels begin with . and are scoped to the previous non-local label. This allows reusing common names like .loop, .done, .return without conflicts:
strlen:
xor rcx, rcx
.loop:
cmp BYTE [rdi + rcx], 0
je .done
inc rcx
jmp .loop
.done:
mov rax, rcx
ret
memset:
; .loop, .done here are different labels than strlen's versions
xor ecx, ecx ; rcx = byte index, starting at 0
test rdx, rdx
je .done
.loop:
mov [rdi + rcx], sil
inc rcx
cmp rcx, rdx
jb .loop ; unsigned compare, since the count is a size
.done:
ret
Global and External Declarations
global _start ; make _start visible to linker (entry point)
global my_function ; make my_function callable from C
global my_variable ; make a data symbol visible externally
extern printf ; declare printf as external (from libc)
extern _GLOBAL_OFFSET_TABLE_ ; for PIC code (if needed)
For ELF64 shared libraries, use global symbol:function to specify the symbol type and, optionally, an expression giving its size:
global my_func:function (my_func.end - my_func)
my_func:
; ...
.end:
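To show global and extern working together, here is a minimal sketch of a program that exports main and calls printf from libc. The file name and the format string are assumptions made for illustration; it is meant to be linked with gcc -no-pie, as in the build patterns later in this appendix.

```nasm
; hello_printf.asm (hypothetical file name)
; Build: nasm -f elf64 -o hello_printf.o hello_printf.asm
;        gcc -no-pie -o hello_printf hello_printf.o
default rel

extern printf                   ; resolved by the linker from libc
global main                     ; the C runtime startup code calls main

section .rodata
fmt:    db "value = %d", 10, 0  ; format string, newline, null terminator

section .text
main:
        sub     rsp, 8          ; realign the stack to 16 bytes before the call
        lea     rdi, [fmt]      ; 1st argument: format string
        mov     esi, 42         ; 2nd argument: integer to print
        xor     eax, eax        ; variadic call: AL = vector registers used (0)
        call    printf
        xor     eax, eax        ; return 0 from main
        add     rsp, 8
        ret

section .note.GNU-stack noalloc noexec nowrite progbits
```

Exporting main rather than _start lets the C runtime initialize libc before the code runs, which printf relies on.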
Section Attributes
Custom section attributes control alignment and flags:
section .text ; executable, readable
section .data ; writable, readable, allocatable
section .rodata ; readable, not writable (read-only data)
section .bss ; writable, allocatable, zero-initialized
section .note.GNU-stack noexec ; marks stack as non-executable
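Emitting a .note.GNU-stack section is good practice for any object file linked with GNU ld: it tells the linker that your code does not need an executable stack. If an object file lacks this section, the linker may mark the whole program's stack as executable.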
section .note.GNU-stack noexec
Numeric Literals
| Format | Example | Value |
|---|---|---|
| Decimal | 42 | 42 |
| Hexadecimal | 0x2A or 0x2a or 2Ah | 42 |
| Octal | 0o52 or 52q or 52o | 42 |
| Binary | 0b101010 or 101010b | 42 |
| Float | 3.14 | 3.14 (for FP directives) |
| Character | 'A' | 65 |
Address Arithmetic
Square brackets mark a memory operand; the expression inside them is the effective address:
mov rax, [rbx] ; load from address in RBX
mov rax, [rbx + 8] ; load from RBX + 8
mov rax, [rbx + rcx*8] ; load from RBX + RCX*8
mov rax, [rbx + rcx*8 + 16] ; full SIB form
REL modifier for RIP-relative addressing (preferred for PIC code):
mov rax, [rel my_data] ; RIP-relative load (PIC-compatible)
lea rsi, [rel msg] ; RIP-relative address
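Without the REL modifier, NASM's default in 64-bit mode is an absolute address (default abs). Absolute references to symbols cannot be used in position-independent executables or shared libraries, which ASLR-enabled builds rely on, and the link typically fails with a relocation error. To make RIP-relative addressing the default: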
default rel ; make all [symbol] references RIP-relative by default
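A minimal sketch of default rel in context follows. The function name bump and the variable counter are hypothetical, chosen only for this illustration:

```nasm
default rel                     ; every [symbol] below is RIP-relative

section .data
counter: dq 0                   ; 64-bit counter, initially zero

section .text
global bump

; bump: increments counter and returns the new value in RAX
bump:
        mov     rax, [counter]  ; RIP-relative load, no REL keyword needed
        inc     rax
        mov     [counter], rax  ; RIP-relative store
        ret
```

References that include a register, such as [rbx + 8], are unaffected; the default only changes how displacement-only memory operands are encoded.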
Size Specifiers
When the operand size is ambiguous, use explicit size specifiers:
mov BYTE [rdi], 0 ; store 1 byte
mov WORD [rdi], 0 ; store 2 bytes
mov DWORD [rdi], 0 ; store 4 bytes
mov QWORD [rdi], 0 ; store 8 bytes
These are required when the destination is a memory address and the source is an immediate (since NASM cannot infer the size):
; These would be ambiguous without size specifier:
mov [rdi], 42 ; ERROR: size ambiguous
mov DWORD [rdi], 42 ; OK: 4-byte store
Preprocessor Macros
%define — Simple Text Substitution
%define SYSCALL_WRITE 1
%define SYSCALL_EXIT 60
%define NULL 0
mov rax, SYSCALL_WRITE
Parameterized define:
%define ARRAY_ELEM(base, idx, size) [base + idx*size]
mov rax, ARRAY_ELEM(rdi, rcx, 8) ; [rdi + rcx*8]
%macro — Multi-Line Macro
%macro PUSH_ALL 0 ; 0 = no parameters
push rax
push rbx
push rcx
push rdx
push rsi
push rdi
push r8
push r9
%endmacro
%macro POP_ALL 0
pop r9
pop r8
pop rdi
pop rsi
pop rdx
pop rcx
pop rbx
pop rax
%endmacro
Macro with parameters (%1, %2, etc.):
; LOAD_ARG n, reg — loads the nth argument (0-indexed) into reg
%macro LOAD_ARG 2
%if %1 == 0
mov %2, rdi
%elif %1 == 1
mov %2, rsi
%elif %1 == 2
mov %2, rdx
%endif
%endmacro
LOAD_ARG 0, rax ; expands to: mov rax, rdi
LOAD_ARG 1, rbx ; expands to: mov rbx, rsi
Local labels in macros use %% prefix:
%macro SAFE_DIV 2 ; dividend, divisor → quotient in RAX
test %2, %2
jz %%div_zero
mov rax, %1
cqo
idiv %2
jmp %%done
%%div_zero:
xor eax, eax
%%done:
%endmacro
%assign — Numeric Assignment
%assign i 0
%rep 8
dq i * 8 ; generates: dq 0, dq 8, dq 16, ...
%assign i i+1
%endrep
Conditional Assembly
%ifdef DEBUG
; Debugging code — only assembled when DEBUG is defined
push rdi
lea rdi, [rel debug_msg]
call puts
pop rdi
%endif
%ifndef WINDOWS
; Linux/macOS code
mov rdi, 1
%else
; Windows code
mov rcx, 1
%endif
Define constants on the command line: nasm -DDEBUG -DVERSION=2 file.asm
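A value passed with -D can also be tested numerically with %if. The sketch below assumes a VERSION macro like the one on the command line above and supplies a fallback when it is not defined; the banner strings are illustrative:

```nasm
%ifndef VERSION
  %define VERSION 1             ; fallback when -DVERSION=... is not given
%endif

section .rodata
%if VERSION >= 2
banner: db "v2 build", 10       ; assembled with: nasm -DVERSION=2 ...
%else
banner: db "v1 build", 10       ; assembled by default
%endif
banner_len equ $ - banner       ; length of whichever string was chosen
```

Only one of the two db lines reaches the assembler, so banner is defined exactly once.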
%include — File Inclusion
%include "macros.asm"
%include "constants.asm"
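Included files are often wrapped in an include guard so that including them twice does not define the same symbols twice. The following is a sketch of what a constants.asm might look like; the guard name CONSTANTS_ASM is an assumption, and the values are the Linux x86-64 constants used earlier in this appendix:

```nasm
; constants.asm (illustrative contents)
%ifndef CONSTANTS_ASM
%define CONSTANTS_ASM           ; marks this file as already included

STDOUT      equ 1
SYS_WRITE   equ 1
SYS_EXIT    equ 60

%endif                          ; CONSTANTS_ASM
```

The second time the file is included, CONSTANTS_ASM is already defined and the preprocessor skips everything up to %endif.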
Output Formats
Specify with -f on the command line:
| Format | Flag | Use |
|---|---|---|
| ELF64 | -f elf64 | Linux 64-bit object file |
| ELF32 | -f elf32 | Linux 32-bit object file |
| Mach-O 64 | -f macho64 | macOS 64-bit object file |
| PE64 (COFF) | -f win64 | Windows 64-bit object file |
| Flat binary | -f bin | Raw binary (for bootloaders) |
| Intel HEX | -f ihex | For embedded/firmware |
Examples:
# Linux ELF object file:
nasm -f elf64 -o output.o source.asm
# macOS Mach-O object file:
nasm -f macho64 -o output.o source.asm
# Flat binary (bootloader, no file headers):
nasm -f bin -o boot.bin boot.asm
# With debug information (DWARF):
nasm -f elf64 -g -F dwarf -o output.o source.asm
Commonly Used Build Patterns
Simple Linux Executable (no C library)
nasm -f elf64 -o program.o program.asm
ld -o program program.o
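For reference, a minimal program.asm that builds with the two commands above might look like the following sketch. It uses only the Linux write and exit system calls, so no C library is required:

```nasm
; program.asm: prints a message and exits, using raw Linux system calls
global _start                   ; default entry point expected by ld

section .rodata
msg:    db "Hello, World!", 10
msglen  equ $ - msg

section .text
_start:
        mov     eax, 1          ; system call number: write
        mov     edi, 1          ; fd = stdout
        lea     rsi, [rel msg]  ; buffer address (RIP-relative)
        mov     edx, msglen     ; byte count
        syscall

        mov     eax, 60         ; system call number: exit
        xor     edi, edi        ; exit status 0
        syscall
```

Writing to the 32-bit registers (eax, edi, edx) zero-extends into the full 64-bit registers, which is why the shorter forms are safe here.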
Linux Executable with C Library (for libc functions)
nasm -f elf64 -o program.o program.asm
gcc -no-pie -o program program.o # link against libc, no PIE
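# OR (paths shown are for Debian/Ubuntu; other distributions differ):
ld -dynamic-linker /lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 \
-o program /usr/lib/x86_64-linux-gnu/crt1.o /usr/lib/x86_64-linux-gnu/crti.o \
program.o -lc /usr/lib/x86_64-linux-gnu/crtn.o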
Mixed Assembly and C
# Assemble the assembly file:
nasm -f elf64 -o asm_func.o asm_func.asm
# Compile the C file:
gcc -c -o c_code.o c_code.c
# Link together:
gcc -o program c_code.o asm_func.o
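As a sketch of what asm_func.asm could contain, the function below follows the System V AMD64 calling convention (first two integer arguments in RDI and RSI, result in RAX). The name add_scaled and its behavior are hypothetical, chosen only to illustrate the interface:

```nasm
; asm_func.asm
; C prototype (declare this in c_code.c):  long add_scaled(long a, long b);
; Returns a + 2*b.
global add_scaled               ; make the symbol visible to the C code

section .text
add_scaled:
        lea     rax, [rdi + rsi*2]  ; rax = a + 2*b in a single instruction
        ret

section .note.GNU-stack noalloc noexec nowrite progbits
```

On the C side, a call such as add_scaled(3, 4) would then return 11.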
Bootloader (flat binary)
nasm -f bin -o boot.bin boot.asm
# boot.bin is a raw sector image (exactly 512 bytes only if boot.asm pads itself to that size)
# Write to a disk image:
dd if=boot.bin of=disk.img conv=notrunc
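A flat binary is exactly as long as the bytes the source emits, so a boot sector has to pad itself. The sketch below assumes a legacy BIOS boot sector, which is loaded at address 0x7C00 and must end with the signature 0xAA55; it does nothing except halt:

```nasm
; boot.asm: minimal BIOS boot sector that halts the CPU
bits 16                         ; the BIOS starts the CPU in 16-bit real mode
org 0x7C00                      ; address at which the BIOS loads the sector

start:
        cli                     ; disable maskable interrupts
.hang:
        hlt                     ; stop until the next interrupt
        jmp     .hang           ; if woken (e.g. by an NMI), halt again

times 510 - ($ - $$) db 0       ; pad with zeros up to byte 510
dw 0xAA55                       ; boot signature in the last two bytes
```

The times expression uses $ - $$ to count the bytes emitted so far, so the output is always 512 bytes as long as the code fits in the first 510.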
List File Generation (for debugging)
nasm -f elf64 -l program.lst -o program.o program.asm
# program.lst shows: line number, offset, hex bytes, source line
NASM vs. GAS Syntax Comparison
| Feature | NASM | GAS (Intel mode) | GAS (AT&T mode) |
|---|---|---|---|
| Operand order | dest, src | dest, src | src, dest |
| Register prefix | none | none | % prefix |
| Immediate prefix | none | none | $ prefix |
| Memory access | [address] | [address] | (address) |
| SIB addressing | [base+idx*scale+disp] | same | disp(base,idx,scale) |
| Size specifiers | BYTE, WORD, DWORD, QWORD | BYTE PTR, etc. | size suffix (b/w/l/q) |
| Section directive | section .text | .section .text | .section .text |
| Data definition | db, dw, dd, dq | .byte, .short, .int, .quad | same as GAS Intel |
| BSS reservation | resb, resw, resd, resq | .skip, .zero | same |
| Constants | equ | = or .equ | same |
| Macros | %macro/%endmacro | .macro/.endm | same |
NASM is used in this book for all x86-64 examples because its Intel syntax is consistent, its error messages are clear, and its macro system is powerful without being complex.