Chapter 6: The NASM Assembler

The Assembler Is Not the CPU

There are two distinct moments when your assembly source code does something: when the assembler processes it, and when the CPU executes it. The assembler (NASM) runs at build time. It reads text, evaluates expressions, processes macros, and emits binary bytes. The CPU runs at runtime. It fetches those bytes, decodes them, and executes instructions.

NASM is more powerful than a simple lookup table. It has a full preprocessor with macros, conditionals, and include files. It has an expression evaluator that computes addresses and constants at build time. It has directives that control how bytes are laid out in the output file. Understanding what NASM does at assembly time — versus what the CPU does at runtime — is fundamental to understanding why assembly code is written the way it is.
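A minimal sketch of that split, assuming Linux x86-64 and the ELF64 output format (the file name, label, and numbers are illustrative and not from the original text). Both halves put 42 in a register, but only the second makes the CPU do arithmetic:

```nasm
; build_vs_run.asm (hypothetical file name)
; Assumed build: nasm -f elf64 build_vs_run.asm -o build_vs_run.o
;                ld build_vs_run.o -o build_vs_run
section .text
global _start

_start:
    ; Build time: NASM evaluates 6 * 7 and emits the bytes for "mov edi, 42".
    mov     edi, 6 * 7

    ; Runtime: the CPU performs this multiplication when the program runs.
    mov     eax, 6
    imul    eax, eax, 7         ; eax = 42, computed by the CPU

    ; Exit, using the build-time value in edi as the status (sys_exit = 60).
    mov     eax, 60
    syscall
```

Disassembling the result should show a plain immediate in the first instruction: the multiplication in the source text has disappeared by the time the CPU sees it.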

This chapter covers NASM comprehensively: syntax, sections, data declarations, labels, the expression system, the preprocessor, and output formats. By the end, you'll be able to write NASM code that is expressive, maintainable, and correct.


NASM Syntax: Intel Style

NASM uses Intel assembly syntax. This means:
- Destination operand is listed first: mov rax, rbx means rax ← rbx (copy rbx to rax)
- No register sigils: registers are written plainly (rax, rbx) with no % prefix
- Memory operands use brackets: mov rax, [rbx] loads from the address in rbx
- Immediate values are plain numbers: mov rax, 42 or mov rax, 0xFF

This contrasts with AT&T syntax (used by GAS, the GNU Assembler, and the default in objdump):
- Source first: movq %rbx, %rax (same operation as mov rax, rbx)
- % prefix for registers: %rax, %rbx
- $ prefix for immediates: $42, $0xFF
- Size suffixes on mnemonics (b, w, l, q): movq, movl

All examples in this book use NASM Intel syntax. GCC's assembly output and GNU objdump use AT&T syntax by default; pass -masm=intel to GCC or -M intel to objdump to get Intel syntax instead. The Chapter 1 case study showed both side by side.
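As a quick reference, the rules above can be lined up instruction by instruction. This is an illustrative fragment rather than a complete program; the AT&T forms appear only in comments:

```nasm
; Intel syntax (NASM)                   ; AT&T equivalent (GAS)
mov     rax, rbx                        ; movq  %rbx, %rax
mov     rax, 42                         ; movq  $42, %rax
mov     rax, [rbx]                      ; movq  (%rbx), %rax
mov     rax, [rbx + rcx*8 + 16]         ; movq  16(%rbx,%rcx,8), %rax
mov     dword [rbx], 1                  ; movl  $1, (%rbx)
```

The last line shows where the size information lives: Intel syntax puts it on the memory operand (dword), while AT&T puts it in the mnemonic suffix (l).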


Structure of a NASM Source File

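A NASM source file consists of three kinds of lines:
1. Instructions: a mnemonic plus operands; these emit machine code bytes
2. Directives: control the assembler's behavior and emit no machine code themselves
3. Preprocessor directives: begin with %, processed before assembly

Data declarations such as db and dq are what NASM calls pseudo-instructions: they emit data bytes rather than CPU instructions.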

; This is a comment (anything after semicolon)

; ========= PREPROCESSOR DIRECTIVES =========
%define MAX_SIZE    1024        ; compile-time constant (like #define)
%include "macros.inc"           ; paste another file here

; ========= SECTIONS =========
section .data                   ; switch to data section

; ========= DATA DECLARATIONS =========
    msg     db "Hello", 10, 0   ; define bytes
    count   dq 0                ; define qword

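; ========= LABELS =========
section .text                   ; switch to code section
global global_label             ; export global_label to the linker

global_label:                   ; non-local label (exported by the global directive above)
.local_label:                   ; local label (scoped to the preceding non-local label)

; ========= INSTRUCTIONS =========
    mov     rax, 1              ; instruction
    syscall                     ; instruction with no operands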

Sections

Sections group code and data and tell the assembler (and linker) what memory region each part belongs to.

Standard Sections

section .text       ; executable code
                    ; mapped as r-xp (read-execute)
                    ; all instructions go here

section .data       ; initialized read-write data
                    ; mapped as rw-p
                    ; global variables with initial values

section .rodata     ; read-only data
                    ; mapped as r--p
                    ; string literals, constants, lookup tables

section .bss        ; uninitialized (zero-initialized) data
                    ; mapped as rw-p
                    ; takes no space in the ELF file
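To see the four standard sections working together, here is a minimal sketch of a complete program. It assumes Linux x86-64 and `nasm -f elf64`; all symbol names are invented for illustration:

```nasm
; sections_demo.asm (hypothetical file name)
section .rodata
    greeting    db "hello", 10          ; read-only bytes
    GREET_LEN   equ $ - greeting        ; 6

section .data
    call_count  dq 0                    ; writable, stored in the file

section .bss
    scratch     resb 64                 ; writable, zero-filled at load, not stored in the file

section .text
global _start
_start:
    inc     qword [rel call_count]      ; allowed: .data is writable
    mov     eax, 1                      ; sys_write
    mov     edi, 1                      ; stdout
    lea     rsi, [rel greeting]
    mov     edx, GREET_LEN
    syscall
    mov     eax, 60                     ; sys_exit
    xor     edi, edi
    syscall
```

The same `inc` aimed at `greeting` would assemble without complaint, but on a typical Linux system the write would fault at runtime because the page holding .rodata is mapped without write permission.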

Custom Sections

You can define custom sections for special purposes:

section .init       ; code run before main() (for C programs)
section .fini       ; code run after main() exits
section .my_data    ; completely custom section name

Custom sections are used in kernel development for things like the Global Descriptor Table (GDT), the Interrupt Descriptor Table (IDT), and per-CPU data.
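With the ELF output formats, a section name NASM does not recognize defaults to read-only, non-executable data with byte alignment, so a custom section usually needs its attributes spelled out. A hedged sketch (the section and symbol names are made up for illustration):

```nasm
; Assumes -f elf64. Attribute keywords follow the section name.
section .percpu progbits alloc noexec write align=64
    cpu_slot    times 8 dq 0            ; 64 bytes of writable data

section .text
global touch_slot
touch_slot:
    mov     qword [rel cpu_slot], 1     ; works only because of the "write" attribute
    ret
```

Where the section finally lands in memory is decided by the linker (and, in kernel work, by a linker script); the attributes here only describe what kind of section it is.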

The align Directive in Sections

section .data
    db 1                ; at some offset
    align 8             ; pad to 8-byte boundary
    qword_val dq 0      ; now 8-byte aligned

section .text
    align 16            ; ensure 16-byte alignment (for SIMD code)
    my_hot_function:
        ; function body

The linker also aligns sections by default (.text is typically 4096-byte page-aligned). Individual symbols within a section may need explicit align for SIMD requirements.
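Two details of `align` are worth a small sketch: by default it pads with NOP bytes (0x90), and in .bss the companion `alignb` reserves space instead of emitting bytes. The symbol names below are illustrative:

```nasm
section .data
    flag        db 1
    align 8, db 0                       ; pad with zero bytes instead of the default NOPs
    total       dq 0                    ; 8-byte aligned

section .bss
    small       resb 3
    alignb 16                           ; pads by reserving space, which suits .bss
    vec_buf     resb 64                 ; 16-byte aligned
```

NOP padding is harmless in data but makes hex dumps harder to read, which is one reason to give an explicit fill in data sections.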


Labels

Labels give names to positions in the code. A label is a symbol that the assembler resolves to an address.

Global Labels

Global labels are visible outside the current file — the linker can see them. Use global to declare them:

global _start          ; make _start visible to linker
global my_function     ; make my_function callable from C

_start:
    ; code here

my_function:
    ; code here
    ret

Local Labels

Labels beginning with . are local to the enclosing global label. They can be reused in different functions without naming conflicts:

global func_a
global func_b

func_a:
    ; ...
    jz  .done           ; .done refers to func_a's .done
.done:
    ret

func_b:
    ; ...
    jnz .done           ; .done refers to func_b's .done (different label!)
.done:
    ret

Without the . prefix, the label would be an ordinary non-local label visible throughout the file, and NASM would report a redefinition error as soon as done was defined in a second function.
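Internally, NASM forms a local label's full name by appending it to the preceding non-local label, and that full name can be used from anywhere in the file. A small sketch (the function bodies are invented for illustration):

```nasm
section .text
global func_a
global func_b

func_a:
    test    rdi, rdi
    jz      .done               ; resolves to func_a.done
    dec     rdi
.done:
    ret

func_b:
    test    rdi, rdi
    jz      func_a.done         ; reach into func_a by its full name
    inc     rdi
.done:                          ; this one is func_b.done
    ret
```

Jumping into another function's local label is rarely good style, but the full-name form is useful in debuggers and map files, where local labels show up as func_a.done and func_b.done.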

Using Labels as Data Addresses

Labels in the data section give names to data locations:

section .data
    msg     db "Hello", 10     ; msg is the address of the H
    ; $ is the address after the last byte of msg
    msglen  equ $ - msg        ; msglen = address_of_$ - address_of_msg = 6

section .text
    ; Using the label as an address:
    mov  rsi, msg              ; rsi = address of msg
    mov  rdx, msglen           ; rdx = 6 (a compile-time constant, not an address)
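A label on its own always means an address; brackets are what turn it into a memory access. The following sketch contrasts the two (the names are illustrative, and `rel` requests RIP-relative addressing, which is the usual choice in 64-bit code):

```nasm
section .data
    counter dq 42

section .text
global read_counter
read_counter:
    mov     rax, counter            ; rax = address of counter (a 64-bit immediate)
    lea     rdx, [rel counter]      ; rdx = the same address, computed RIP-relative
    mov     rax, [rel counter]      ; rax = 42, the qword stored at that address
    ret
```

Forgetting the brackets is a common source of bugs: the code assembles cleanly and then operates on the pointer value instead of the data.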

Data Declarations in Depth

Define Directives: Initialized Data

section .data

; === Byte (1 byte each) ===
single_byte     db 42                   ; 0x2A
char_val        db 'A'                  ; 0x41
neg_byte        db -1                   ; 0xFF (two's complement)
hex_byte        db 0xFF                 ; 0xFF
string          db "Hello, World!", 0   ; null-terminated string
string_with_nl  db "Line 1", 10, "Line 2", 10, 0   ; embedded newlines
mixed           db 0x41, 'A', 65, 0b01000001        ; all four ways to say 65

; === Word (2 bytes each) ===
word_val        dw 0x1234               ; stored as: 0x34, 0x12 (little-endian!)
neg_word        dw -1                   ; 0xFFFF
unicode_char    dw 0x0041               ; 'A' in UTF-16 (little-endian)

; === Doubleword (4 bytes each) ===
dword_val       dd 0x12345678           ; stored as: 0x78, 0x56, 0x34, 0x12
float_approx    dd 3.14159              ; IEEE 754 32-bit float
int_array       dd 1, 2, 3, 4, 5       ; 5 doublewords (20 bytes total)

; === Quadword (8 bytes each) ===
qword_val       dq 0x0102030405060708
double_approx   dq 3.14159265358979    ; IEEE 754 64-bit double
large_const     dq 0x8000000000000000  ; INT64_MIN

; === Ten-byte (x87 extended precision float, rarely used) ===
ext_float       dt 3.14159265358979323846  ; 80-bit extended float
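NASM accepts several more spellings for the same constant, and it treats the three kinds of string quotes differently. A short sketch of both (labels are illustrative):

```nasm
section .data
    ; The same byte value (65) in several NASM literal notations:
    same_byte   db 65, 0x41, 41h, 101o, 0100_0001b, 'A'

    ; Single and double quotes are literal; backquotes process C-style escapes.
    raw_text    db "a\n"                ; 3 bytes: 'a', '\', 'n'
    esc_text    db `a\n`                ; 2 bytes: 'a', 0x0A
```

This is why the listings in this chapter write newlines as a separate `10` after a double-quoted string: inside double quotes, `\n` is just two ordinary characters.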

Reserve Directives: BSS (Uninitialized)

section .bss

; === Reserve N units of each size ===
byte_buffer     resb 1024     ; 1024 bytes
word_buffer     resw 512      ; 512 words = 1024 bytes
dword_buffer    resd 256      ; 256 dwords = 1024 bytes
qword_buffer    resq 128      ; 128 qwords = 1024 bytes

Reservations in .bss take no space in the ELF file; only their size is recorded. The OS guarantees they're zero-initialized at program start.
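A typical use for .bss is an I/O buffer whose contents only exist at runtime. The sketch below assumes Linux x86-64 system call numbers (read = 0, write = 1, exit = 60); the names are illustrative:

```nasm
BUF_SIZE equ 4096

section .bss
    inbuf   resb BUF_SIZE               ; 4096 bytes reserved, none stored in the file

section .text
global _start
_start:
    xor     eax, eax                    ; sys_read
    xor     edi, edi                    ; fd 0 = stdin
    lea     rsi, [rel inbuf]
    mov     edx, BUF_SIZE
    syscall                             ; rax = bytes read, or a negative error code

    test    rax, rax
    jle     .exit                       ; nothing read or error: skip the echo
    mov     rdx, rax                    ; echo exactly what was read
    mov     eax, 1                      ; sys_write
    mov     edi, 1                      ; stdout
    lea     rsi, [rel inbuf]
    syscall
.exit:
    mov     eax, 60                     ; sys_exit
    xor     edi, edi
    syscall
```

Growing BUF_SIZE to a megabyte would leave the executable's size on disk essentially unchanged, since only the reservation size is recorded.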

The TIMES Directive: Repeated Values

section .data
    zeros       times 16  db 0         ; 16 zero bytes
    ones        times 8   dw 0xFFFF    ; 8 words, all 0xFFFF
    stride      times 4   dd 4         ; 4 dwords, all value 4
    padding     times 64  db 0x90      ; 64 NOP instructions (0x90 = NOP)

TIMES is evaluated at assembly time. The count must be a compile-time constant.
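The count may be any constant expression, including one that uses `$`, which makes `times` handy for padding a field to a fixed width. A minimal sketch with invented names:

```nasm
section .data
name_field:
    db "NASM"                               ; 4 bytes of payload
    times 16 - ($ - name_field) db ' '      ; 12 spaces: pad the field to 16 bytes
NAME_FIELD_SIZE equ $ - name_field          ; = 16
```

If the payload ever grows past 16 bytes the count goes negative and NASM rejects the line, so the layout constraint is checked at build time.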

Practical Pattern: Self-Measuring Data

section .data
    ; The message and its length computed automatically
    msg     db "Hello, Assembly World!", 10
    MSGLEN  equ $ - msg         ; MSGLEN = length including newline

    ; A binary header
    header_magic    db 0x7F, 'E', 'L', 'F'     ; ELF magic
    header_class    db 2                         ; 64-bit
    header_pad      times 11 db 0               ; 11 bytes padding
    HEADER_SIZE     equ $ - header_magic         ; = 16

The $ and $$ Operators

These are position operators evaluated by the assembler:

- `$` evaluates to the address of the current line (the next byte to be emitted)
- `$$` evaluates to the start of the current section

```nasm
section .data
    msg     db "Hello"      ; 5 bytes at some address, say 0x402000
    msglen  equ $ - msg     ; $ is now 0x402005; $ - msg = 5

section .rodata
    ; Padding to align to 512 bytes within the section:
    table_start:
        times 10 dq 0       ; 80 bytes of data
    ; Pad to next 512-byte boundary within this section:
    times (512 - ($ - $$) % 512) % 512 db 0
    ; This computes: how many bytes to add to reach the next multiple of 512
```

The expression `(512 - ($ - $$) % 512) % 512` is a standard NASM idiom for padding. Breakdown:

- `$ - $$` = current offset within section
- `($ - $$) % 512` = remainder when divided by 512
- `512 - ($ - $$) % 512` = bytes needed to reach next 512-byte boundary
- `% 512` at the end: handles the case where we're already aligned (would give 512, mod makes it 0)

---

## EQU: Compile-Time Constants

`EQU` defines a symbol with a value that is substituted at assembly time. Unlike labels, `EQU` symbols don't correspond to memory addresses; they're purely assembler-time values.

```nasm
; System constants
SYS_READ    equ 0
SYS_WRITE   equ 1
SYS_EXIT    equ 60
STDIN       equ 0
STDOUT      equ 1

; Array parameters
ARRAY_SIZE  equ 1024
ELEMENT_SZ  equ 8                           ; 64-bit elements
ARRAY_BYTES equ ARRAY_SIZE * ELEMENT_SZ     ; 8192 -- computed at assembly time

; Bit masks
FLAG_READ   equ 0x01
FLAG_WRITE  equ 0x02
FLAG_EXEC   equ 0x04
ALL_FLAGS   equ FLAG_READ | FLAG_WRITE | FLAG_EXEC  ; 0x07

; Usage: these substitute at assembly time
mov  rax, SYS_WRITE     ; mov rax, 1
mov  rdx, ARRAY_BYTES   ; mov rdx, 8192
test rax, FLAG_WRITE    ; test rax, 2
```

`EQU` differs from `%define` in that `EQU` evaluates its expression once, where the definition appears, and binds the symbol to that number; `%define` is textual substitution (like C's `#define`).

---

## The NASM Preprocessor

The NASM preprocessor runs before assembly, performing textual substitutions and conditional compilation. All preprocessor directives begin with `%`.

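Because the preprocessor only moves text around, it never evaluates `$` itself. The sketch below, with invented names, shows how the same expression behaves when captured by `EQU` versus stored by `%define`:

```nasm
section .data
start_marker:
    FIXED_OFF   equ $ - start_marker        ; evaluated once, here: 0
%define FLOAT_OFF ($ - start_marker)        ; stored as text, evaluated at each use

    db 1, 2, 3, 4                           ; offsets 0..3
    first_val   db FIXED_OFF                ; offset 4: stores 0
    second_val  db FLOAT_OFF                ; offset 5: stores 5 ($ is this line's offset)
```

`FIXED_OFF` remembers where it was defined; `FLOAT_OFF` is re-evaluated wherever it is pasted.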
### Text Substitution: %define

```nasm
; Simple text substitution
%define MAX_BUF     4096
%define NEWLINE     10      ; ASCII newline
%define NULL        0

; Multi-token substitution. A %define body is a single line, and ';' starts a
; comment, so it cannot hold several instructions; use %macro for sequences.
%define ZERO_RAX    xor eax, eax

; Parametric %define (like function-like macros in C)
%define ARRAY_ELEM(base, index, size)   [(base) + (index) * (size)]

; Usage:
ZERO_RAX                            ; xor eax, eax
mov rax, MAX_BUF                    ; mov rax, 4096
mov rbx, ARRAY_ELEM(rdi, rcx, 8)    ; mov rbx, [rdi + rcx * 8]
```

### Numeric Assignments: %assign

`%assign` is like `%define`, but it evaluates its argument as a numeric expression immediately, which makes it suitable for counters that are reassigned:

```nasm
%assign counter 0
%assign counter counter + 1     ; counter = 1
%assign counter counter + 1     ; counter = 2
%assign counter counter + 1     ; counter = 3

; Useful for auto-numbering (each directive needs its own line):
%assign error_code 0
E_OK        equ error_code
%assign error_code error_code+1
E_NOTFOUND  equ error_code
%assign error_code error_code+1
E_PERM      equ error_code
%assign error_code error_code+1
; E_OK=0, E_NOTFOUND=1, E_PERM=2
```

### Multi-Line Macros: %macro / %endmacro

The `%macro` directive defines reusable code blocks:

```nasm
; Basic macro: no arguments
%macro save_frame 0
    push rbp
    mov  rbp, rsp
%endmacro

%macro restore_frame 0
    mov rsp, rbp
    pop rbp
    ret
%endmacro

; Macro with arguments: %N for argument N
%macro print_string 2       ; 2 arguments: %1 = address, %2 = length
    mov rax, 1      ; sys_write
    mov rdi, 1      ; stdout
    mov rsi, %1     ; buffer address
    mov rdx, %2     ; length
    syscall
%endmacro

; Macro with a greedy parameter: with 1+, %1 absorbs the rest of the line,
; commas included (%0 is the parameter count)
%macro debug_print 1+       ; 1 or more arguments
    ; Print a debug message to stderr
    section .data
    .debug_msg_%+ __LINE__ db %1    ; concatenation hack; complex -- see below
    section .text
    mov rax, 1
    mov rdi, 2      ; stderr
    ; ...
%endmacro
```

### Local Labels in Macros

When a macro contains labels, those labels must be local to avoid conflicts when the macro is used multiple times:

```nasm
%macro loop_n_times 1   ; %1 = count
    mov rcx, %1
%%loop_start:           ; %% prefix makes it local to this macro invocation
    ; loop body here
    dec rcx
    jnz %%loop_start
%endmacro

; Both uses create different internal labels:
loop_n_times 10     ; uses a generated name such as ..@1.loop_start
; other code
loop_n_times 20     ; uses a different one, such as ..@2.loop_start -- no conflict!
```

Without `%%`, the label would conflict on the second use.

### Macros with Default Arguments

```nasm
; Macro with optional third argument (default = 0)
%macro store_byte 2-3 0     ; 2 required args, 1 optional with default 0
    mov BYTE [%1], %2
    %if %3 != 0             ; %3 is the default (0) when no stride is given
        add %1, %3          ; advance pointer if a nonzero stride was provided
    %endif
%endmacro

store_byte rdi, al          ; store al at [rdi], no advance
store_byte rdi, al, 1       ; store al at [rdi], then rdi += 1
```

### Practical Macro Library

Here's a small set of macros for systems programming:

```nasm
; syscall wrappers
%macro sys_write 3      ; fd, buf, len
    mov rax, 1
    mov rdi, %1
    mov rsi, %2
    mov rdx, %3
    syscall
%endmacro

%macro sys_read 3       ; fd, buf, maxlen
    mov rax, 0
    mov rdi, %1
    mov rsi, %2
    mov rdx, %3
    syscall
%endmacro

%macro sys_exit 1       ; status
    mov rax, 60
    mov rdi, %1
    syscall
%endmacro

; function frame macros
%macro prologue 0
    push rbp
    mov  rbp, rsp
%endmacro

%macro epilogue 0
    pop rbp
    ret
%endmacro

; register preservation
%macro preserve 1-*     ; variable number of registers
    %rep %0
        push %1
        %rotate 1
    %endrep
%endmacro

%macro restore 1-*      ; in reverse order!
    %rep %0
        %rotate -1
        pop %1
    %endrep
%endmacro

; Usage:
my_function:
    prologue
    preserve rbx, r12, r13      ; pushes rbx, r12, r13 in order
    ; function body using rbx, r12, r13...
    restore rbx, r12, r13       ; pops r13, r12, rbx (reverse order!)
    epilogue
```

---

## Conditional Assembly: %if / %elif / %else / %endif

```nasm
; Conditional compilation based on a defined symbol
%ifdef DEBUG
    ; This code only assembled if DEBUG is defined:
    ; (define with: nasm -DDEBUG ...)
    %macro debug_msg 1
        sys_write 2, %1, debug_len_%1
    %endmacro
%else
    %macro debug_msg 1
        ; no-op in release mode
    %endmacro
%endif

; Conditional on numeric value
%define TARGET_OS 1     ; 1 = Linux, 2 = macOS
%if TARGET_OS == 1
    %define SYSCALL_WRITE   1
    %define SYSCALL_EXIT    60
%elif TARGET_OS == 2
    %define SYSCALL_WRITE   0x2000004   ; macOS x86-64: BSD class 0x2000000 + 4
    %define SYSCALL_EXIT    0x2000001   ; macOS x86-64: BSD class 0x2000000 + 1
%else
    %error "Unknown TARGET_OS"
%endif

; Check if a macro is defined:
%ifndef STACK_CANARY
    %define STACK_CANARY 0  ; default: no canary
%endif
```

---

## %include: Splitting Code Across Files

```nasm
; main.asm:
%include "include/syscall.inc"      ; paste syscall wrapper macros
%include "include/constants.inc"    ; paste constant definitions
%include "include/structures.inc"   ; paste structure definitions

; After %include, the included file's content is as if it was typed here.
; This is purely textual inclusion: no linking, no separate compilation.
; For separate compilation (object files), you link with ld, not %include.
```

---

## NASM Output Formats

NASM can produce several output formats:

```bash
# ELF64: standard Linux object files (for linking with ld or gcc)
nasm -f elf64 file.asm -o file.o

# Flat binary: no headers, raw machine code (for bootloaders)
nasm -f bin file.asm -o file.bin

# Flat binary with specific origin address:
# (in the source file, use: ORG 0x7C00 for a BIOS bootloader)
nasm -f bin bootloader.asm -o bootloader.bin

# macOS 64-bit object format (for macOS development)
nasm -f macho64 file.asm -o file.o

# 32-bit ELF (for 32-bit programs or 32-bit kernel code)
nasm -f elf32 file.asm -o file.o

# Win64 (64-bit Windows COFF object format)
nasm -f win64 file.asm -o file.obj
```

For the MinOS bootloader, we'll use `-f bin` to produce raw machine code with no ELF headers, just the bytes that the BIOS will load and execute.

---

## Common NASM Errors and Their Meaning

```
error: operation size not specified
```

You wrote `mov [rbp-8], 0`, and NASM doesn't know if this is a byte, word, dword, or qword store. Fix: `mov QWORD [rbp-8], 0`.

```
error: symbol `foo' is multiply defined
```

You used the same non-local label twice. Fix: use `.foo` (local labels) or choose unique names.

```
error: invalid combination of opcode and operands
```

The combination you wrote doesn't exist. Common cause: trying to `mov [mem], [mem]` (memory-to-memory MOV doesn't exist) or using the wrong operand size combination.

```
warning: uninitialised space declared in non-BSS section
```

You used `resb`/`resw`/`resd`/`resq` in `.data` instead of `.bss`. The reservation works but generates a warning; switch to `section .bss`.

```
error: `times' count must be a constant
```

You used a runtime variable as the count for `times`. `times` requires a compile-time constant.

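The fixes for the first, third, and last of these errors can be seen together in one short sketch. The names are illustrative, and the stack frame exists only so that `[rbp-8]` refers to valid memory:

```nasm
PAD_COUNT equ 4                     ; assembly-time constant, usable with times

section .data
    padding times PAD_COUNT db 0    ; fine: the count is known at build time

section .bss
    src_val resq 1
    dst_val resq 1

section .text
global fixed_forms
fixed_forms:
    push    rbp
    mov     rbp, rsp
    sub     rsp, 16                 ; room for one local at [rbp-8]

    mov     qword [rbp-8], 0        ; explicit size resolves "operation size not specified"

    mov     rax, [rel src_val]      ; no memory-to-memory mov: load into a register...
    mov     [rel dst_val], rax      ; ...then store it

    leave
    ret
```

When one operand is a register, as in the last two `mov` instructions, NASM infers the size from it and no size keyword is needed.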

---

## A Complete Example: A Reusable Print Function

This example uses macros and `%include` to build a small, reusable I/O library:

```nasm
; io.asm: reusable I/O functions using the macro library
%include "include/constants.inc"
%include "include/macros.inc"

section .data
    newline_char db 10

section .bss
    digit_buf resb 24       ; buffer for number-to-string conversion

section .text               ; code must follow a switch back to .text
global print_string     ; void print_string(const char *s, size_t len)
global print_cstr       ; void print_cstr(const char *s) [null-terminated]
global print_newline    ; void print_newline(void)
global print_uint64     ; void print_uint64(uint64_t n)

; ============================================================
; print_string: write exactly `len` bytes of `s` to stdout
; Args: rdi = s, rsi = len
; ============================================================
print_string:
    ; sys_write expects rdi = fd, rsi = buffer, rdx = length,
    ; so shift the two arguments along before loading the fd.
    mov rdx, rsi    ; rdx = length
    mov rsi, rdi    ; rsi = buffer
    mov rax, SYS_WRITE
    mov rdi, STDOUT
    syscall
    ret

; ============================================================
; print_cstr: write a null-terminated string to stdout
; Args: rdi = null-terminated string
; ============================================================
print_cstr:
    push rbx            ; callee-saved
    mov  rbx, rdi       ; rbx = string start
    ; Find the null terminator
    xor  ecx, ecx
.scan:
    cmp  BYTE [rdi + rcx], 0
    je   .found_end
    inc  rcx
    jmp  .scan
.found_end:
    ; rcx = length
    mov  rsi, rbx       ; buffer
    mov  rdx, rcx       ; length
    mov  rax, SYS_WRITE
    mov  rdi, STDOUT
    syscall
    pop  rbx
    ret

; ============================================================
; print_newline: write a newline to stdout
; ============================================================
print_newline:
    mov rax, SYS_WRITE
    mov rdi, STDOUT
    lea rsi, [rel newline_char]
    mov rdx, 1
    syscall
    ret

; ============================================================
; print_uint64: print a 64-bit unsigned integer to stdout,
;               followed by a newline
; Args: rdi = value
; ============================================================
print_uint64:
    push rbx
    ; Convert integer to decimal string (least significant digit first)
    lea  rbx, [rel digit_buf + 23]  ; start at end of buffer
    mov  BYTE [rbx], 10             ; newline terminator
    dec  rbx
    mov  rax, rdi                   ; value to convert
    mov  rcx, 10                    ; divisor

    ; Handle zero case
    test rax, rax
    jnz  .convert_loop
    mov  BYTE [rbx], '0'
    dec  rbx
    jmp  .print_number

.convert_loop:
    xor  rdx, rdx                   ; zero high bits before division
    div  rcx                        ; rax = quotient, rdx = remainder (digit)
    add  dl, '0'                    ; convert to ASCII
    mov  [rbx], dl                  ; store digit
    dec  rbx
    test rax, rax                   ; more digits?
    jnz  .convert_loop

.print_number:
    inc  rbx                        ; rbx now points to first digit
    ; digit_buf + 24 points one past the newline, so
    ; length = (digit_buf + 24) - rbx
    lea  rdx, [rel digit_buf + 24]
    sub  rdx, rbx
    mov  rsi, rbx                   ; buffer start
    mov  rax, SYS_WRITE
    mov  rdi, STDOUT
    syscall
    pop  rbx
    ret
```

---

## Summary

NASM is a full-featured assembler with:

- Intel syntax (destination first, no sigils, brackets for memory)
- Sections that the linker maps into ELF segments
- Local and global labels for code organization
- Data declaration directives (db/dw/dd/dq/resb/resw/resd/resq) that map precisely to binary output
- `$` and `$$` for compile-time position arithmetic
- `EQU` for compile-time constants
- A preprocessor with `%define`, `%macro`, `%if`, and `%include` for code reuse and conditional compilation
- Multiple output formats (elf64, bin, macho64) for different target environments

The key insight: everything NASM does happens at build time. The CPU knows nothing about labels, macros, %include, or EQU. When NASM is done, there are only bytes.

🔄 Check Your Understanding: What is the difference between %define BUFSIZE 1024 and BUFSIZE equ 1024?

Answer: Both produce the same machine code when used as mov rax, BUFSIZE; in each case mov rax, 1024 is what gets assembled. The difference is:

  • %define BUFSIZE 1024 is a preprocessor text substitution. It can be used for non-numeric replacements, can be removed with %undef or replaced by a later %define, and can be checked with %ifdef. The substitution happens in the preprocessor pass before assembly.

  • BUFSIZE equ 1024 is an assembler directive that defines a numeric symbol. It's evaluated during the assembly pass, cannot be redefined, and cannot be used for non-numeric substitutions. EQU can capture labels and $/$$ (positions) at the point of definition; the preprocessor cannot evaluate them, so a %define containing $ stays as text and is evaluated wherever it is later expanded.

Practical rule: use %define for text substitutions and conditional compilation flags; use EQU for computed sizes and offsets that reference assembly-time positions.
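One consequence of this split is easy to check directly: `%ifdef` only knows about preprocessor macros. A minimal sketch with invented names:

```nasm
%define PP_SIZE 1024            ; preprocessor macro
EQ_SIZE equ 1024                ; assembler symbol

section .data
%ifdef PP_SIZE
    saw_define  db 1            ; assembled: %ifdef can see %define names
%endif
%ifdef EQ_SIZE
    saw_equ     db 1            ; skipped: EQU symbols are invisible to %ifdef
%endif
    both_work   dq PP_SIZE + EQ_SIZE    ; 2048: either form works inside an expression
```

This is why build flags passed with -D on the command line behave like `%define`: they have to exist before the assembler proper ever runs.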