Appendix I: NASM Directives and Preprocessor Reference

NASM (Netwide Assembler) is the assembler used for all x86-64 examples in this book. This appendix covers NASM directives, pseudo-instructions, and preprocessor features.

Full documentation: https://www.nasm.us/doc/


Source File Structure

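A NASM source file is organized into sections. Each section becomes a section of the same name in the output object file; the linker later groups those sections into the loadable segments of the final binary.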

section .text           ; executable code (default section)
section .data           ; initialized read-write data
section .rodata         ; initialized read-only data
section .bss            ; zero-initialized (uninitialized) data

Alternative spelling: segment is synonymous with section in NASM.


Data Definition Directives

These directives define data inline in the assembly source:

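Directive   Size                              Example
DB          1 byte                            db 0x41 or db 'A'
DW          2 bytes (word)                    dw 0x1234
DD          4 bytes (dword)                   dd 0x12345678
DQ          8 bytes (qword)                   dq 0x0000000000401000
DT          10 bytes (tword, x87 extended)    dt 3.14159
DO          16 bytes (oword)                  do 1.0
DY          32 bytes (yword, YMM)             (see note)
DZ          64 bytes (zword, ZMM)             (see note)

Note: DT, DO, DY, and DZ do not accept integer constants as operands. DT and DO take floating-point constants. For zeroed or integer data of SIMD width, repeat a smaller directive instead, for example times 4 dq 0 for 32 bytes.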

Multiple values on one line:

my_bytes:   db 0x90, 0x90, 0x90, 0xC3      ; nop, nop, nop, ret
my_string:  db "Hello, World!", 10, 0       ; string + newline + null
my_array:   dd 1, 2, 3, 4, 5               ; five 32-bit integers

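String literals may use single or double quotes. The two forms are equivalent, and neither processes escape sequences, so special characters are written as explicit byte values: 10 for newline, 0 for the null terminator. NASM also accepts backquoted strings (delimited by ` characters), which do process C-style escapes such as \n and \0.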


Uninitialized Data (BSS)

Uninitialized data uses RES* directives (reserve):

Directive   Reserves                      Example
RESB        N bytes                       buffer: resb 64
RESW        N words (2 bytes each)        array: resw 100
RESD        N dwords (4 bytes each)       intarray: resd 256
RESQ        N qwords (8 bytes each)       ptrs: resq 32
REST        N 10-byte values              x87buf: rest 4

section .bss
    input_buffer:   resb 256    ; 256 bytes, zero-initialized at runtime
    count:          resd 1      ; one 32-bit integer
    table:          resq 16     ; 16 × 8 = 128 bytes

Constants and Equates

EQU defines a constant that is substituted at assembly time. It does not allocate storage.

STDIN     equ 0
STDOUT    equ 1
STDERR    equ 2
SYS_WRITE equ 1
SYS_EXIT  equ 60
BUF_SIZE  equ 4096

; Used in code:
    mov     rdi, STDOUT
    mov     rax, SYS_WRITE
    mov     rdx, BUF_SIZE

The special symbol $ evaluates to the assembly position at the start of the current line, and $$ evaluates to the start of the current section. Subtracting a label from $ gives the size of everything defined since that label:

msg:    db "Hello, World!", 10
msglen  equ $ - msg           ; length = current position - start of msg

Labels

A label followed by a colon defines a symbol at the current position:

my_function:
    push    rbp
    mov     rbp, rsp
    ; ...
    pop     rbp
    ret

Local labels begin with . and are scoped to the previous non-local label. This allows reusing common names like .loop, .done, .return without conflicts:

strlen:
    xor     rcx, rcx
.loop:
    cmp     BYTE [rdi + rcx], 0
    je      .done
    inc     rcx
    jmp     .loop
.done:
    mov     rax, rcx
    ret

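memset:
    ; rdi = destination, sil = fill byte, rdx = byte count
    ; .loop and .done here are different labels from strlen's versions
    test    rdx, rdx
    je      .done               ; nothing to do for a zero count
    xor     ecx, ecx            ; index = 0 (writing ECX clears all of RCX)
.loop:
    mov     [rdi + rcx], sil
    inc     rcx
    cmp     rcx, rdx
    jb      .loop               ; unsigned compare, since the count is a size
.done:
    ret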
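
As a small illustration of that scoping, the NASM manual describes a local label as shorthand for a full name formed from the enclosing non-local label, so it can still be reached from another scope by spelling that name out. The routine names below (count_down, restart) are hypothetical and only sketch the idea:

```nasm
section .text

count_down:                     ; hypothetical routine: counts EDI down to zero
    mov     eax, edi
.loop:                          ; full name: count_down.loop
    test    eax, eax
    jz      .done
    dec     eax
    jmp     .loop
.done:                          ; full name: count_down.done
    ret

restart:                        ; a new non-local label starts a new scope
    mov     eax, 3
    jmp     count_down.loop     ; reach the other routine's local label by full name
```

Jumping into another routine's local label is rarely good style; the full-name form is mostly useful in size expressions and in debugger symbol listings.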

Global and External Declarations

global _start           ; make _start visible to linker (entry point)
global my_function      ; make my_function callable from C
global my_variable      ; make a data symbol visible externally
extern printf           ; declare printf as external (from libc)
extern _GLOBAL_OFFSET_TABLE_  ; for PIC code (if needed)
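
A minimal sketch of global and extern working together, assuming a Linux x86-64 target linked through gcc so that libc supplies the startup code. The label fmt and the file name are made up for illustration:

```nasm
; hello_printf.asm (hypothetical file name)
; build: nasm -f elf64 -o hello_printf.o hello_printf.asm
;        gcc -no-pie -o hello_printf hello_printf.o

global main                     ; libc's startup code calls main
extern printf                   ; resolved by the linker from libc

section .rodata
fmt:    db "answer = %d", 10, 0

section .text
main:
    sub     rsp, 8              ; re-align the stack to 16 bytes before the call
    lea     rdi, [rel fmt]      ; 1st argument: format string
    mov     esi, 42             ; 2nd argument: the integer to print
    xor     eax, eax            ; AL = number of vector registers used (variadic call)
    call    printf
    xor     eax, eax            ; return 0 from main
    add     rsp, 8
    ret

section .note.GNU-stack noalloc noexec nowrite progbits
```

The example links with -no-pie so that the direct call to printf needs no PLT annotation in the source.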

For ELF64 shared libraries, give each exported symbol a type, and optionally a size expression, after its name in the global directive:

global my_func:function (my_func.end - my_func)
my_func:
    ; ...
.end:

Section Attributes

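Attributes written after the section name control a section's alignment and flags. The standard sections get sensible defaults, so they are usually declared by name alone: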

section .text           ; executable, readable
section .data           ; writable, readable, allocatable
section .rodata         ; readable, not writable (read-only data)
section .bss            ; writable, allocatable, zero-initialized
section .note.GNU-stack noexec  ; marks stack as non-executable

Explicitly declaring .note.GNU-stack is good practice, especially for library code: it tells the linker that your code does not need an executable stack. The form most often seen spells out every attribute:

section .note.GNU-stack noalloc noexec nowrite progbits
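
The attributes can also be set on ordinary data sections. The following is a sketch for ELF output only; the section name .mytable and the labels are hypothetical, and other output formats accept a different set of attributes:

```nasm
; Raise the alignment of .data, e.g. for cache-line or AVX-512 aligned loads
section .data align=64
vec_ones:   times 16 dd 1       ; 64 bytes, starting on a 64-byte boundary

; A custom read-only section with its ELF flags spelled out
section .mytable progbits alloc noexec nowrite align=16
lookup:     dq 1, 2, 4, 8
```

vec_ones is aligned only because it is the first item in this file's .data section; data placed later in the section needs its own alignment directive.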

Numeric Literals

Format        Example                    Value
Decimal       42                         42
Hexadecimal   0x2A or 0x2a or 2Ah        42
Octal         0o52 or 52q or 52o         42
Binary        0b101010 or 101010b        42
Float         3.14                       3.14 (for FP directives)
Character     'A'                        65
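
To make the equivalences concrete, the sketch below emits the same byte value several times using different notations. The labels are placeholders:

```nasm
section .rodata
same_value:
    db 42                       ; decimal
    db 0x2A, 2Ah                ; hexadecimal, prefix and suffix forms
    db 0o52, 52q                ; octal
    db 0b101010, 101010b        ; binary
    db 0b0010_1010              ; underscores may be used to group digits
    db '*'                      ; character constant: ASCII 42
wider:
    dq 0x0000_7FFF_FFFF_F000    ; grouping helps with long constants
    dd 0FFh                     ; suffix-form hex must start with a digit, hence the leading 0
```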

Address Arithmetic

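Square brackets denote a memory operand. The expression inside them is the effective address, built from a base register, an optional index register scaled by 1, 2, 4, or 8, and an optional displacement: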

mov     rax, [rbx]          ; load from address in RBX
mov     rax, [rbx + 8]      ; load from RBX + 8
mov     rax, [rbx + rcx*8]  ; load from RBX + RCX*8
mov     rax, [rbx + rcx*8 + 16]  ; full SIB form

REL modifier for RIP-relative addressing (preferred for PIC code):

mov     rax, [rel my_data]  ; RIP-relative load (PIC-compatible)
lea     rsi, [rel msg]      ; RIP-relative address

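Without REL, NASM defaults to absolute addressing. Absolute references are not position-independent: they typically fail to link into PIE executables and shared libraries, which ASLR depends on. To change the default: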

default rel     ; make all [symbol] references RIP-relative by default
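
One consequence worth sketching: a RIP-relative operand cannot also carry a base or index register, so indexed access to a named table is usually done in two steps. The routine below is a hypothetical example (table and table_get are illustrative names), assuming the System V calling convention:

```nasm
default rel

section .rodata
table:  dq 10, 20, 30, 40       ; hypothetical lookup table

section .text
global table_get

; table_get(index in RDI) -> table[index] in RAX
table_get:
    lea     rdx, [table]        ; RIP-relative because of default rel
    mov     rax, [rdx + rdi*8]  ; index from the loaded base address
    ret
```

Operands that use only registers, such as [rdx + rdi*8], are unaffected by default rel.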

Size Specifiers

When the operand size is ambiguous, use explicit size specifiers:

mov     BYTE [rdi], 0       ; store 1 byte
mov     WORD [rdi], 0       ; store 2 bytes
mov     DWORD [rdi], 0      ; store 4 bytes
mov     QWORD [rdi], 0      ; store 8 bytes

A size specifier is required whenever no register operand fixes the size, for example when the destination is memory and the source is an immediate:

; Without a size specifier the store width is unknown:
mov     [rdi], 42           ; ERROR: operation size not specified
mov     DWORD [rdi], 42     ; OK: 4-byte store
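
The same rule applies to other instructions whose only sized operand is memory. A few illustrative cases (the label size_examples is a placeholder):

```nasm
section .text
size_examples:
    inc     DWORD [rdi]             ; increment a 32-bit value in memory
    cmp     BYTE [rsi], 0           ; compare one byte against zero
    movzx   eax, BYTE [rsi]         ; zero-extend one byte into EAX
    movsx   rax, WORD [rsi + 2]     ; sign-extend a 16-bit value into RAX
    mov     [rdi], eax              ; no specifier needed: EAX implies 4 bytes
    ret
```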

Preprocessor Macros

%define — Simple Text Substitution

%define SYSCALL_WRITE   1
%define SYSCALL_EXIT    60
%define NULL            0

mov     rax, SYSCALL_WRITE

Parameterized define:

%define ARRAY_ELEM(base, idx, size)  [base + idx*size]

mov     rax, ARRAY_ELEM(rdi, rcx, 8)   ; [rdi + rcx*8]

%macro — Multi-Line Macro

%macro PUSH_ALL 0           ; 0 = no parameters
    push    rax
    push    rbx
    push    rcx
    push    rdx
    push    rsi
    push    rdi
    push    r8
    push    r9
%endmacro

%macro POP_ALL 0
    pop     r9
    pop     r8
    pop     rdi
    pop     rsi
    pop     rdx
    pop     rcx
    pop     rbx
    pop     rax
%endmacro

Macro with parameters (%1, %2, etc.):

; LOAD_ARG n, reg — loads the nth argument (0-indexed) into reg
%macro LOAD_ARG 2
    %if %1 == 0
        mov     %2, rdi
    %elif %1 == 1
        mov     %2, rsi
    %elif %1 == 2
        mov     %2, rdx
    %endif
%endmacro

LOAD_ARG 0, rax     ; expands to: mov rax, rdi
LOAD_ARG 1, rbx     ; expands to: mov rbx, rsi
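
Macros can also take a variable number of parameters. The sketch below follows the multipush pattern from the NASM manual, using %0 (the parameter count) and %rotate; the macro names and the register choice are illustrative:

```nasm
%macro MULTIPUSH 1-*            ; one or more parameters
    %rep %0                     ; %0 = number of parameters passed
        push    %1
        %rotate 1               ; shift parameters left: old %2 becomes %1
    %endrep
%endmacro

%macro MULTIPOP 1-*
    %rep %0
        %rotate -1              ; shift right first, so the last parameter becomes %1
        pop     %1
    %endrep
%endmacro

save_and_restore:
    MULTIPUSH rbx, r12, r13     ; push rbx / push r12 / push r13
    ; ... body that uses rbx, r12, r13 ...
    MULTIPOP  rbx, r12, r13     ; pop r13 / pop r12 / pop rbx
    ret
```

Passing the same register list to both macros keeps the push and pop order mirrored automatically.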

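Labels defined inside a macro use the %% prefix. NASM expands each one to a unique name per invocation, so the macro can be used more than once without label clashes:

%macro SAFE_DIV 2               ; dividend, divisor → quotient in RAX
    ; %2 must be a 64-bit register other than RAX or RDX (both are overwritten)
    test    %2, %2
    jz      %%div_zero
    mov     rax, %1
    cqo                         ; sign-extend RAX into RDX:RAX
    idiv    %2                  ; quotient in RAX, remainder in RDX
    jmp     %%done
%%div_zero:
    xor     eax, eax            ; division by zero yields 0
%%done:
%endmacro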

%assign — Numeric Assignment

%assign i 0
%rep 8
    dq i * 8        ; generates: dq 0, dq 8, dq 16, ...
    %assign i i+1
%endrep

Conditional Assembly

%ifdef DEBUG
    ; Debugging code — only assembled when DEBUG is defined
    push    rdi
    lea     rdi, [rel debug_msg]
    call    puts
    pop     rdi
%endif

%ifndef WINDOWS
    ; Linux/macOS code
    mov     rdi, 1
%else
    ; Windows code
    mov     rcx, 1
%endif

Preprocessor symbols can be defined on the command line with -D: nasm -DDEBUG -DVERSION=2 file.asm
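
A common pattern, sketched here under assumed names (BUF_SIZE, io_buffer, buf.asm), is to give a symbol a default value that a -D option can override, and to reject bad values at assembly time:

```nasm
; Hypothetical build: nasm -f elf64 -DBUF_SIZE=8192 -o buf.o buf.asm
%ifndef BUF_SIZE
    %define BUF_SIZE 4096       ; default when -DBUF_SIZE=... is not given
%endif

%if (BUF_SIZE & 15) != 0
    %error "BUF_SIZE must be a multiple of 16"
%endif

section .bss
io_buffer:  resb BUF_SIZE
```

%if evaluates a numeric expression, while %ifdef and %ifndef only test whether a symbol is defined.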

%include — File Inclusion

%include "macros.asm"
%include "constants.asm"
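
Because %include pastes the file's text in place, including the same file twice can trigger redefinition errors. One way to guard against that, shown here with made-up contents for constants.asm, is an include guard built from %ifndef:

```nasm
; constants.asm (hypothetical contents)
%ifndef CONSTANTS_ASM
%define CONSTANTS_ASM

SYS_WRITE   equ 1
SYS_EXIT    equ 60

%endif  ; CONSTANTS_ASM
```

Directories to search for included files can be added with the -I command-line option.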

Output Formats

Specify with -f on the command line:

Format         Flag          Use
ELF64          -f elf64      Linux 64-bit object file
ELF32          -f elf32      Linux 32-bit object file
Mach-O 64      -f macho64    macOS 64-bit object file
PE64 (COFF)    -f win64      Windows 64-bit object file
Flat binary    -f bin        Raw binary (for bootloaders)
Intel HEX      -f ihex       For embedded/firmware

Examples:

# Linux ELF object file:
nasm -f elf64 -o output.o source.asm

# macOS Mach-O object file:
nasm -f macho64 -o output.o source.asm

# Flat binary (bootloader, no file headers):
nasm -f bin -o boot.bin boot.asm

# With debug information (DWARF):
nasm -f elf64 -g -F dwarf -o output.o source.asm

Commonly Used Build Patterns

Simple Linux Executable (no C library)

nasm -f elf64 -o program.o program.asm
ld -o program program.o
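
For reference, a minimal program.asm that these two commands would build might look like the following. It is a sketch assuming Linux x86-64 system call numbers (1 for write, 60 for exit), as used elsewhere in this appendix:

```nasm
; program.asm
global _start                   ; ld looks for _start as the entry point

section .rodata
msg:    db "Hello, World!", 10
msglen  equ $ - msg

section .text
_start:
    mov     eax, 1              ; SYS_write
    mov     edi, 1              ; fd = stdout
    lea     rsi, [rel msg]      ; buffer address
    mov     edx, msglen         ; byte count
    syscall

    mov     eax, 60             ; SYS_exit
    xor     edi, edi            ; exit status 0
    syscall
```

There is no C runtime here, so the program must end with an exit system call rather than ret.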

Linux Executable with C Library (for libc functions)

nasm -f elf64 -o program.o program.asm
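gcc -no-pie -o program program.o     # link against libc, no PIE
# OR link by hand (paths are for Debian/Ubuntu and vary by distribution):
ld -dynamic-linker /lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 \
   -o program /usr/lib/x86_64-linux-gnu/crt1.o /usr/lib/x86_64-linux-gnu/crti.o \
   program.o -lc /usr/lib/x86_64-linux-gnu/crtn.o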

Mixed Assembly and C

# Assemble the assembly file:
nasm -f elf64 -o asm_func.o asm_func.asm

# Compile the C file:
gcc -c -o c_code.o c_code.c

# Link together:
gcc -o program c_code.o asm_func.o
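
A sketch of what asm_func.asm could contain, assuming the System V AMD64 calling convention. The function add_scaled and its C prototype are hypothetical:

```nasm
; asm_func.asm
; Assumed declaration on the C side (in c_code.c):
;     long add_scaled(long a, long b, long scale);

global add_scaled

section .text
; Arguments: a = RDI, b = RSI, scale = RDX. Return value in RAX.
add_scaled:
    mov     rax, rsi
    imul    rax, rdx            ; b * scale
    add     rax, rdi            ; a + b * scale
    ret

section .note.GNU-stack noalloc noexec nowrite progbits
```

The routine touches only caller-saved registers and no global data, so it links unchanged into both PIE and non-PIE executables.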

Bootloader (flat binary)

nasm -f bin -o boot.bin boot.asm
# boot.bin is a raw image with no headers; it is exactly one 512-byte
# sector only if boot.asm pads itself to 512 bytes
# Write to a disk image:
dd if=boot.bin of=disk.img conv=notrunc
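
A minimal boot.asm sketch that produces exactly 512 bytes, assuming a legacy BIOS boot. It uses $ - $$ (the offset from the start of the section) to compute the padding; the code itself only halts:

```nasm
; boot.asm (hypothetical minimal boot sector)
bits 16                         ; BIOS starts the boot sector in 16-bit real mode
org 0x7C00                      ; BIOS loads the sector at address 0x7C00

start:
    cli                         ; no interrupt handlers are set up here
.hang:
    hlt
    jmp     .hang

    times 510 - ($ - $$) db 0   ; pad with zeros up to byte 510
    dw 0xAA55                   ; boot signature in the last two bytes
```

If the code grows past 510 bytes, the times count becomes negative and NASM reports an error instead of silently producing an oversized image.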

List File Generation (for debugging)

nasm -f elf64 -l program.lst -o program.o program.asm
# program.lst shows: line number, offset, hex bytes, source line

NASM vs. GAS Syntax Comparison

Feature             NASM                        GAS (Intel mode)              GAS (AT&T mode)
Operand order       dest, src                   dest, src                     src, dest
Register prefix     none                        none                          % prefix
Immediate prefix    none                        none                          $ prefix
Memory access       [address]                   [address]                     (address)
SIB addressing      [base+idx*scale+disp]       same                          disp(base,idx,scale)
Size specifiers     BYTE, WORD, DWORD, QWORD    BYTE PTR, etc.                size suffix (b/w/l/q)
Section directive   section .text               .section .text                .section .text
Data definition     db, dw, dd, dq              .byte, .short, .int, .quad    same as GAS Intel
BSS reservation     resb, resw, resd, resq      .skip, .zero                  same
Constants           equ                         = or .equ                     same
Macros              %macro/%endmacro            .macro/.endm                  same

NASM is used in this book for all x86-64 examples because its Intel syntax is consistent, its error messages are clear, and its macro system is powerful without being complex.