Case Study 5.2: Building a Multi-File Assembly Project with Make

A Makefile-based project structure for larger assembly programs


Overview

Real assembly projects are rarely single files. A kernel has separate files for the bootloader, interrupt handlers, memory management, and process management. A cryptographic library has separate files for AES key expansion, encryption, decryption, and utility functions. A CTF exploit toolkit has separate files for different gadget chains.

This case study builds a properly structured multi-file assembly project with:

- A clean directory layout
- A Makefile that handles dependencies and incremental builds
- Shared definitions via %include
- Separate modules for different functionality
- Automated testing


Project Structure

string-utils/
├── Makefile
├── include/
│   ├── macros.inc      ; shared macros (syscall wrappers, etc.)
│   └── constants.inc   ; shared constants (syscall numbers, etc.)
├── src/
│   ├── main.asm        ; entry point and tests
│   ├── strlen.asm      ; string length
│   ├── strcpy.asm      ; string copy
│   └── strcmp.asm      ; string comparison
├── obj/                ; object files (created by make)
└── bin/                ; executables (created by make)
    └── string-utils    ; the built executable

The Include Files

include/constants.inc

; constants.inc -- system call numbers and common constants
; Include with: %include "include/constants.inc"

; System call numbers (Linux x86-64)
%define SYS_READ    0
%define SYS_WRITE   1
%define SYS_OPEN    2
%define SYS_CLOSE   3
%define SYS_EXIT    60

; File descriptors
%define STDIN   0
%define STDOUT  1
%define STDERR  2

; Boolean values
%define TRUE    1
%define FALSE   0
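
Every source file in this project pulls in constants.inc, and an include file such as macros.inc could later include it as well. A guard against double inclusion keeps the file from being processed twice in that situation. The following is a minimal sketch, not part of the original file; the guard symbol CONSTANTS_INC is a name assumed for this example, and only a few of the definitions are repeated.

```nasm
; constants.inc with an include guard (sketch)
; CONSTANTS_INC is a hypothetical guard symbol chosen for this example.
%ifndef CONSTANTS_INC
%define CONSTANTS_INC

%define SYS_WRITE   1
%define SYS_EXIT    60
%define STDOUT      1

%endif ; CONSTANTS_INC
```

The second time the file is included, CONSTANTS_INC is already defined, so the preprocessor skips everything up to the %endif.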

include/macros.inc

; macros.inc -- reusable macro definitions
; Include with: %include "include/macros.inc"

; sys_write macro: write string to file descriptor
; Usage: sys_write STDOUT, my_msg, my_msg_len
%macro sys_write 3
    mov  rax, SYS_WRITE
    mov  rdi, %1            ; file descriptor
    mov  rsi, %2            ; buffer
    mov  rdx, %3            ; length
    syscall
%endmacro

; sys_write_str: write a null-terminated string to fd
; Usage: sys_write_str STDOUT, my_msg
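; Note: requires str_len (src/strlen.asm); declare it with `extern str_len`
; Clobbers: rax, rdi, rsi, rdx, plus rcx and r11 (overwritten by syscall)
%macro sys_write_str 2
    ; Save rdi so the fd survives the str_len call when %1 is rdi itself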
    push rdi
    mov  rdi, %2
    call str_len            ; result in rax
    mov  rdx, rax           ; length
    pop  rdi
    mov  rsi, %2            ; buffer
    mov  rdi, %1            ; fd
    mov  rax, SYS_WRITE
    syscall
%endmacro

; save_regs: push a set of registers
; Usage: save_regs rbx, r12, r13
%macro save_regs 1-*
    %rep %0
        push %1
        %rotate 1
    %endrep
%endmacro

; restore_regs: pop a set of registers (in reverse order)
; Usage: restore_regs rbx, r12, r13
%macro restore_regs 1-*
    %rep %0
        %rotate -1
        pop %1
    %endrep
%endmacro

; function_prologue: standard function entry
; Usage: function_prologue
%macro function_prologue 0
    push rbp
    mov  rbp, rsp
%endmacro

; function_epilogue: standard function exit
; Usage: function_epilogue
%macro function_epilogue 0
    mov  rsp, rbp
    pop  rbp
    ret
%endmacro
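
To show how these macros combine, here is a minimal sketch of a routine that uses them. The names print_banner, banner, and banner_len are hypothetical and do not appear in the project; rbx and r12 are saved only to illustrate the push and pop order, since the routine does not otherwise use them.

```nasm
; Sketch: using the macros from macros.inc in a small routine.
; print_banner, banner and banner_len are hypothetical names for this example.
%include "include/constants.inc"
%include "include/macros.inc"

section .data
    banner      db "string-utils", 10
    banner_len  equ $ - banner

section .text
    global print_banner

print_banner:
    function_prologue               ; push rbp / mov rbp, rsp
    save_regs rbx, r12              ; expands to: push rbx / push r12
    sys_write STDOUT, banner, banner_len
    restore_regs rbx, r12           ; expands to: pop r12 / pop rbx
    function_epilogue               ; mov rsp, rbp / pop rbp / ret
```

Passing the same register list to save_regs and restore_regs is enough: the %rotate -1 in restore_regs walks the list backwards, so the pops mirror the pushes.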

The Source Files

src/strlen.asm

; strlen.asm -- string length function
; size_t str_len(const char *s)
; Args: rdi = null-terminated string
; Returns: rax = length (not including null terminator)

%include "include/constants.inc"

section .text
    global str_len

str_len:
    ; We scan for the null byte using SCASB (scan string byte)
    ; SCASB: compare AL with [RDI], set flags, advance RDI (if DF=0)
    ; REPNE: repeat while not equal (ZF=0)

    push  rcx
    push  rdi

    cld                 ; clear direction flag (scan forward)
    xor   eax, eax      ; AL = 0 (the null byte we're scanning for)
    mov   rcx, -1       ; scan up to 2^64-1 bytes (effectively unlimited)

    repne scasb         ; scan: compare AL with [RDI], advance RDI until AL=[RDI]
    ; After: RDI points one past the null byte
    ; RCX was decremented for each byte scanned

    ; Length = (initial RCX - final RCX) - 1   (the scan also counted the null byte)
    ;        = (-1 - final_rcx) - 1 = -2 - final_rcx = ~final_rcx - 1
    not   rcx           ; NOT inverts all bits: rcx = ~final_rcx
    lea   rax, [rcx-1]  ; rax = ~rcx - 1 = length without null byte

    ; Alternative calculation:
    ; After REPNE SCASB, RDI points past the null byte
    ; Saved original RDI is on the stack
    ; Length = (current RDI - 1) - original RDI

    pop   rdi
    pop   rcx
    ret
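
The alternative calculation mentioned in the comments above can be sketched as follows. It avoids the RCX arithmetic by subtracting pointers instead. The name str_len_scasb is hypothetical, chosen so that it does not clash with str_len.

```nasm
; Sketch: SCASB-based length using pointer subtraction.
; str_len_scasb is a hypothetical name, chosen to avoid clashing with str_len.
section .text
    global str_len_scasb

str_len_scasb:
    mov   rdx, rdi          ; remember the start of the string
    cld                     ; scan forward
    xor   eax, eax          ; AL = 0, the byte to search for
    mov   rcx, -1           ; no practical limit on the scan
    repne scasb             ; stops with RDI one past the null byte
    lea   rax, [rdi-1]      ; address of the null byte
    sub   rax, rdx          ; length = null address - start address
    ret
```

This variant uses only caller-saved registers (rax, rcx, rdx, rdi), so it needs no push or pop.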

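The SCASB version is correct, but its length arithmetic is easy to get wrong. A plain byte-by-byte loop is easier to follow, and it is the version that belongs in src/strlen.asm (only one definition of str_len may be linked into the program):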

; strlen.asm -- clean implementation
; size_t str_len(const char *s)
; Returns length of null-terminated string

section .text
    global str_len

str_len:
    ; rdi = string pointer
    mov   rax, rdi          ; save start address in rax

.scan:
    cmp   BYTE [rdi], 0     ; is current byte the null terminator?
    je    .done             ; if yes, we're done
    inc   rdi               ; advance pointer
    jmp   .scan             ; continue scanning

.done:
    sub   rdi, rax          ; length = (end pointer) - (start pointer)
    mov   rax, rdi          ; return length in rax
    ret

src/strcmp.asm

; strcmp.asm -- string comparison
; int str_cmp(const char *s1, const char *s2)
; Returns: 0 if equal, <0 if s1<s2, >0 if s1>s2

%include "include/constants.inc"

section .text
    global str_cmp

str_cmp:
    ; rdi = s1, rsi = s2

.compare_loop:
    movzx eax, BYTE [rdi]   ; eax = *s1 (zero-extended)
    movzx ecx, BYTE [rsi]   ; ecx = *s2 (zero-extended)

    cmp   al, cl            ; compare bytes
    jne   .not_equal        ; if different, we have our result

    test  al, al            ; are we at the null terminator?
    jz    .equal            ; if both bytes are 0, strings are equal

    inc   rdi               ; advance s1
    inc   rsi               ; advance s2
    jmp   .compare_loop

.equal:
    xor   eax, eax          ; return 0
    ret

.not_equal:
    sub   eax, ecx          ; return s1[i] - s2[i] (positive if s1>s2, negative if s1<s2)
    ret
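
A caller that needs the ordering, and not just equality, has to look at the sign of the result. The sketch below shows one way to do that; classify and its return codes are hypothetical and not part of the project.

```nasm
; Sketch: branching on the sign of str_cmp's result.
; classify and its -1/0/1 return codes are hypothetical names for this example.
extern str_cmp

section .text
    global classify

; int classify(const char *a, const char *b)
; Returns -1 if a < b, 0 if a == b, 1 if a > b.
classify:
    call  str_cmp           ; eax = difference of the first mismatching bytes
    test  eax, eax          ; set SF/ZF from the 32-bit result
    js    .less             ; negative: a sorts before b
    jnz   .greater          ; positive: a sorts after b
    xor   eax, eax          ; zero: strings are equal
    ret
.less:
    mov   eax, -1
    ret
.greater:
    mov   eax, 1
    ret
```

The sign check uses eax, not rax. Because sub eax, ecx zero-extends its result into rax, the 64-bit register is never negative, and a js after test rax, rax would never be taken.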

src/strcpy.asm

; strcpy.asm -- string copy
; char* str_cpy(char *dest, const char *src)
; Copies src to dest (including null terminator)
; Returns: rax = dest

%include "include/constants.inc"

section .text
    global str_cpy

str_cpy:
    ; rdi = dest, rsi = src
    push  rdi               ; save dest (return value)

.copy_loop:
    movzx eax, BYTE [rsi]   ; load byte from src
    mov   [rdi], al         ; store to dest
    test  al, al            ; is it the null terminator?
    jz    .done             ; if so, we're done
    inc   rdi               ; advance dest
    inc   rsi               ; advance src
    jmp   .copy_loop

.done:
    pop   rax               ; return original dest
    ret

src/main.asm

; main.asm -- entry point and test suite for string utilities

%include "include/constants.inc"
%include "include/macros.inc"

; Declare external functions (defined in other source files)
extern str_len
extern str_cmp
extern str_cpy

section .data
    ; Test strings
    test_str1   db "Hello, World!", 0
    test_str2   db "Hello, World!", 0
    test_str3   db "Hello, Assembly!", 0
    empty_str   db 0

    ; Messages ("PASS"/"FAIL" plus newline: 5 bytes each, the length print_pass_fail writes)
    msg_pass    db "PASS", 10
    msg_fail    db "FAIL", 10

    test_hdr    db "=== String Utils Tests ===", 10
    test_hdr_len equ $ - test_hdr

    ; Test labels
    lbl_strlen  db "str_len:  "
    lbl_strlen_len equ $ - lbl_strlen
    lbl_strcmp  db "str_cmp:  "
    lbl_strcmp_len equ $ - lbl_strcmp
    lbl_strcpy  db "str_cpy:  "
    lbl_strcpy_len equ $ - lbl_strcpy

section .bss
    copy_buf    resb 32     ; destination buffer for str_cpy tests

section .text
    global _start

; print_pass_fail: print PASS or FAIL based on rdi (1=pass, 0=fail)
print_pass_fail:
    test  rdi, rdi
    jnz   .pass
    ; Print FAIL
    mov   rax, SYS_WRITE
    mov   rdi, STDOUT
    lea   rsi, [rel msg_fail]
    mov   rdx, 5
    syscall
    ret
.pass:
    mov   rax, SYS_WRITE
    mov   rdi, STDOUT
    lea   rsi, [rel msg_pass]
    mov   rdx, 5
    syscall
    ret

_start:
    ; Print header
    mov   rax, SYS_WRITE
    mov   rdi, STDOUT
    lea   rsi, [rel test_hdr]
    mov   rdx, test_hdr_len
    syscall

    ; ===== Test str_len =====
    ; Print label
    mov   rax, SYS_WRITE
    mov   rdi, STDOUT
    lea   rsi, [rel lbl_strlen]
    mov   rdx, lbl_strlen_len
    syscall

    ; Test: str_len("Hello, World!") should return 13
    lea   rdi, [rel test_str1]
    call  str_len
    ; rax should be 13
    xor   rdi, rdi
    cmp   rax, 13
    sete  dil               ; dil=1 if equal (pass), 0 if not
    call  print_pass_fail

    ; Test: str_len("") should return 0
    ; (add more tests...)

    ; ===== Test str_cmp =====
    mov   rax, SYS_WRITE
    mov   rdi, STDOUT
    lea   rsi, [rel lbl_strcmp]
    mov   rdx, lbl_strcmp_len
    syscall

    ; Test: str_cmp(test_str1, test_str2) should return 0 (equal)
    lea   rdi, [rel test_str1]
    lea   rsi, [rel test_str2]
    call  str_cmp
    ; rax should be 0
    xor   rdi, rdi
    test  rax, rax
    setz  dil               ; dil=1 if zero (equal), pass
    call  print_pass_fail

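    ; Print the label again so the second str_cmp result gets its own labelled line
    mov   rax, SYS_WRITE
    mov   rdi, STDOUT
    lea   rsi, [rel lbl_strcmp]
    mov   rdx, lbl_strcmp_len
    syscall

    ; Test: str_cmp(test_str1, test_str3) should return non-zero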
    lea   rdi, [rel test_str1]
    lea   rsi, [rel test_str3]
    call  str_cmp
    ; rax should be non-zero
    xor   rdi, rdi
    test  rax, rax
    setnz dil               ; dil=1 if non-zero (not equal), pass
    call  print_pass_fail

    ; ===== Test str_cpy =====
    mov   rax, SYS_WRITE
    mov   rdi, STDOUT
    lea   rsi, [rel lbl_strcpy]
    mov   rdx, lbl_strcpy_len
    syscall

    ; Test: str_cpy(copy_buf, test_str1) then str_cmp(copy_buf, test_str1) == 0
    lea   rdi, [rel copy_buf]
    lea   rsi, [rel test_str1]
    call  str_cpy

    lea   rdi, [rel copy_buf]
    lea   rsi, [rel test_str1]
    call  str_cmp
    xor   rdi, rdi
    test  rax, rax
    setz  dil
    call  print_pass_fail

    ; Exit
    mov   rax, SYS_EXIT
    xor   rdi, rdi
    syscall
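
The empty-string test that _start leaves as a comment could look like the fragment below. It is a sketch meant to be placed inside _start, directly after the first str_len test, and it relies on empty_str, lbl_strlen, and print_pass_fail from main.asm; it is not a standalone program.

```nasm
    ; Print the label so this result gets its own labelled line
    mov   rax, SYS_WRITE
    mov   rdi, STDOUT
    lea   rsi, [rel lbl_strlen]
    mov   rdx, lbl_strlen_len
    syscall

    ; Test: str_len("") should return 0
    lea   rdi, [rel empty_str]
    call  str_len
    xor   edi, edi          ; clear rdi first, because xor also changes the flags
    test  rax, rax          ; ZF=1 if the returned length is 0
    setz  dil               ; dil=1 if zero (pass)
    call  print_pass_fail
```

With this fragment in place, the program prints one additional "str_len:  PASS" line after the first one.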

The Makefile

# Makefile for string-utils multi-file assembly project
# Note: each recipe line must begin with a tab character, not spaces.

NASM      := nasm
LD        := ld
OBJDUMP   := objdump -d -M intel

NASM_FLAGS := -f elf64 -g -F dwarf -I ./include/
LD_FLAGS   :=

SRC_DIR   := src
OBJ_DIR   := obj
BIN_DIR   := bin
INC_DIR   := include

TARGET    := $(BIN_DIR)/string-utils

SRCS      := $(wildcard $(SRC_DIR)/*.asm)
OBJS      := $(patsubst $(SRC_DIR)/%.asm,$(OBJ_DIR)/%.o,$(SRCS))

# Default target
all: dirs $(TARGET)

# Create output directories if they don't exist
dirs:
    mkdir -p $(OBJ_DIR) $(BIN_DIR)

# Pattern rule: .asm -> .o
# The dependency on the include files ensures recompilation if they change
$(OBJ_DIR)/%.o: $(SRC_DIR)/%.asm $(wildcard $(INC_DIR)/*.inc)
    $(NASM) $(NASM_FLAGS) $< -o $@

# Link all objects into the executable
$(TARGET): $(OBJS)
    $(LD) $(LD_FLAGS) $(OBJS) -o $@

# Convenience targets
run: all
    $(TARGET)

debug: all
    gdb $(TARGET)

disasm: all
    $(OBJDUMP) $(TARGET)

clean:
    rm -rf $(OBJ_DIR) $(BIN_DIR)

# Show all source files and their object file targets
list:
    @echo "Sources: $(SRCS)"
    @echo "Objects: $(OBJS)"
    @echo "Target:  $(TARGET)"

.PHONY: all dirs run debug disasm clean list

Building and Running

make list          # show what will be built
make               # build everything
make run           # build and run

Expected output:

=== String Utils Tests ===
str_len:  PASS
str_cmp:  PASS
str_cmp:  PASS
str_cpy:  PASS

Key Makefile Features

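Include-file dependencies: The pattern rule $(OBJ_DIR)/%.o: $(SRC_DIR)/%.asm $(wildcard $(INC_DIR)/*.inc) lists every .inc file as a prerequisite of every object file, so when any .inc file changes, all .asm files are reassembled. This is deliberately coarse: Make does not know which source includes which file, so it rebuilds everything that might be affected. Without these prerequisites, changes to constants.inc or macros.inc would not trigger a rebuild.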

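The -I ./include/ flag: Adds ./include/ to the directories NASM searches for %include files, so %include "constants.inc" resolves no matter which directory NASM is run from. The sources in this case study use the longer form %include "include/constants.inc", which NASM resolves relative to the current directory and which therefore also works when make is run from the project root.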

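Directory creation: The dirs target creates the obj/ and bin/ directories before the build. Without this, the first make would fail with "No such file or directory" for the output paths. Because dirs is an ordinary prerequisite of all, this ordering is only guaranteed for serial builds; under make -j, or when a single object is requested directly (make obj/strlen.o), the directory may not exist yet. Order-only prerequisites are the usual remedy in GNU Make.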

Incremental builds: Because Make compares file timestamps, running make again after changing only strlen.asm reassembles only strlen.o and relinks, rather than reassembling all four files. For large projects with hundreds of source files, this can be the difference between a 30-second full build and a 1-second incremental build.


What This Pattern Enables

This project structure scales to larger codebases: the MinOS kernel project in this book uses the same pattern with additional complexity (cross-compilation targets, flat binary output for the bootloader, ELF output for the kernel proper). The principles — modular source files, shared include files, clean Makefile dependency tracking — remain constant as the project grows.