Case Study 23-1: Building a Minimal ELF Executable by Hand

Objective

Construct a small "hello world" ELF executable entirely by hand in NASM: no C runtime, no libc, no linker, and no section header table. By building a near-minimal valid ELF, we account for every byte of the format: the ELF header, the program header table, and the machine code, laid out back to back in a single 167-byte file. A final variant shows how overlapping the headers can shrink the file further.


Background: Minimum Required ELF Structure

An ELF executable needs at minimum:
1. ELF Header (64 bytes): identifies the file and points to the program header table
2. Program Header Table (56 bytes per entry): tells the kernel what to map into memory
3. Code: the actual instructions to execute

The section header table is optional for execution — it is only needed by tools like readelf and debuggers. The smallest valid runnable ELF omits it entirely.
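
The two fixed sizes quoted above can be checked by writing the structures down as NASM struc definitions. This is a reference sketch only: the names Elf64_Ehdr and Elf64_Phdr are borrowed from the C declarations in <elf.h> and are an assumption of this example, not something the rest of the case study relies on. The fragment reserves no bytes in the output file.

```nasm
; elf64_structs.inc (hypothetical file name): layout of the two ELF64 headers
BITS 64

struc Elf64_Ehdr
    .e_ident:     resb 16     ; magic, class, data encoding, version, OS/ABI, padding
    .e_type:      resw 1      ; ET_EXEC, ET_DYN, ...
    .e_machine:   resw 1      ; EM_X86_64 = 0x3e
    .e_version:   resd 1
    .e_entry:     resq 1      ; entry point virtual address
    .e_phoff:     resq 1      ; file offset of the program header table
    .e_shoff:     resq 1      ; file offset of the section header table (0 if none)
    .e_flags:     resd 1
    .e_ehsize:    resw 1
    .e_phentsize: resw 1
    .e_phnum:     resw 1
    .e_shentsize: resw 1
    .e_shnum:     resw 1
    .e_shstrndx:  resw 1
endstruc                      ; 16 + 2+2+4 + 8+8+8 + 4 + 6*2 = 64 bytes

struc Elf64_Phdr
    .p_type:      resd 1      ; PT_LOAD = 1
    .p_flags:     resd 1      ; PF_X = 1, PF_W = 2, PF_R = 4
    .p_offset:    resq 1
    .p_vaddr:     resq 1
    .p_paddr:     resq 1
    .p_filesz:    resq 1
    .p_memsz:     resq 1
    .p_align:     resq 1
endstruc                      ; 4 + 4 + 6*8 = 56 bytes
```

NASM defines Elf64_Ehdr_size and Elf64_Phdr_size automatically at endstruc, so a source file that includes these definitions could use those symbols in place of hard-coded 64 and 56.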


Strategy: Overlapping the Headers

The trick used in "tiny ELF" binaries: the ELF header and program header table can overlap with each other, and even with the code. As long as the bytes that the kernel reads for the header fields are correct, unused bytes can hold code.

The classic approach (by Brian Raiter and others):
- ELF header starts at byte 0
- Program header starts at byte 64 (immediately after the ELF header), or at an earlier offset inside the ELF header when the two are overlapped
- Code starts immediately after the program header

For clarity, we will NOT use the overlapping trick in our primary example, but we will show the overlap as an optimization at the end.


Version 1: Standard Layout (No Tricks)

; tiny_elf.asm — Minimal ELF executable
; Build: nasm -f bin -o hello tiny_elf.asm && chmod +x hello && ./hello
; Size: 167 bytes (120 bytes of headers + 34 bytes of code + 13 bytes of data)

BITS 64

; ============================================================
; Define constants
; ============================================================
VADDR     equ 0x400000        ; Virtual load address
EHDR_SIZE equ 64              ; ELF header size (64-bit)
PHDR_SIZE equ 56              ; Program header size (64-bit)
HEADERS   equ EHDR_SIZE + PHDR_SIZE   ; Total header size

; ============================================================
; ELF Header (64 bytes)
; ============================================================
elf_start:
    db 0x7f, 'E', 'L', 'F'   ; e_ident magic
    db 2                       ; EI_CLASS: 2 = ELFCLASS64
    db 1                       ; EI_DATA: 1 = ELFDATA2LSB (little-endian)
    db 1                       ; EI_VERSION: 1 = EV_CURRENT
    db 0                       ; EI_OSABI: 0 = ELFOSABI_NONE (System V)
    dq 0                       ; EI_ABIVERSION + padding (8 bytes)

    dw 2                       ; e_type: 2 = ET_EXEC (executable)
    dw 0x3e                    ; e_machine: 0x3e = EM_X86_64
    dd 1                       ; e_version: 1 = EV_CURRENT
    dq VADDR + HEADERS         ; e_entry: entry point virtual address (after headers)
    dq EHDR_SIZE               ; e_phoff: program header table offset (right after ELF header)
    dq 0                       ; e_shoff: section header table offset (none)
    dd 0                       ; e_flags: no flags for x86-64
    dw EHDR_SIZE               ; e_ehsize: 64 bytes
    dw PHDR_SIZE               ; e_phentsize: 56 bytes
    dw 1                       ; e_phnum: 1 program header entry
    dw 64                      ; e_shentsize: 64 (arbitrary, no SHT)
    dw 0                       ; e_shnum: 0 sections
    dw 0                       ; e_shstrndx: 0 (no section name table)

; ============================================================
; Program Header Table — 1 PT_LOAD entry (56 bytes)
; ============================================================
    dd 1                       ; p_type: 1 = PT_LOAD (load this segment)
    dd 5                       ; p_flags: 5 = PF_R | PF_X (read + execute)
    dq 0                       ; p_offset: load from byte 0 of file
    dq VADDR                   ; p_vaddr: load to this virtual address
    dq VADDR                   ; p_paddr: physical address (same as vaddr)
    dq file_size               ; p_filesz: bytes to load from file
    dq file_size               ; p_memsz: bytes to allocate in memory
    dq 0x200000                ; p_align: 2 MB alignment (standard for PT_LOAD)

; ============================================================
; Code: Hello World using Linux syscalls directly
; Entry point is here (VADDR + HEADERS = 0x400078)
; ============================================================
code_start:
    ; write(1, message, 13)
    mov rax, 1          ; SYS_write = 1
    mov rdi, 1          ; fd = 1 (stdout)
    lea rsi, [rel message]  ; buf = &message (RIP-relative)
    mov rdx, 13         ; len = 13
    syscall

    ; exit(0)
    mov rax, 60         ; SYS_exit = 60
    xor rdi, rdi        ; status = 0
    syscall

message:
    db "Hello, ELF!", 10, 0    ; 11 characters + \n + NUL = 13 bytes (write() sends the NUL too)

file_size equ $ - elf_start   ; Total file size (computed by NASM)

Build and Run

nasm -f bin -o hello tiny_elf.asm
chmod +x hello
./hello
# Hello, ELF!

# Check file size
ls -l hello
# -rwxr-xr-x 1 user user 167 hello

# Verify ELF structure
readelf -h hello
# ELF Header:
#   Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
#   Class:                             ELF64
#   Data:                              2's complement, little endian
#   Type:                              EXEC (Executable file)
#   Machine:                           Advanced Micro Devices X86-64
#   Entry point address:               0x400078
#   Start of program headers:          64 (bytes into file)
#   Number of program headers:         1
#   Number of section headers:         0

readelf -l hello
# Elf file type is EXEC (Executable file)
# Entry point 0x400078
# There is 1 program header, starting at offset 64
#
# Program Headers:
#   Type     Offset VirtAddr   PhysAddr   FileSiz  MemSiz   Flg  Align
#   LOAD     0x0000 0x00400000 0x00400000 0x0000a7 0x0000a7 R E  0x200000
#
# Note: entry point 0x400078 is file offset 0x78 = 120 decimal = 64 + 56 = EHDR + PHDR

strace ./hello
# execve("./hello", ["./hello"], 0x7ffe... /* 53 vars */) = 0
# write(1, "Hello, ELF!\n\0", 13)         = 13
# exit(0)                                  = ?

Byte-by-Byte Walkthrough

Offset  Bytes                          Meaning
──────────────────────────────────────────────────────────────
0x00    7f 45 4c 46                    ELF magic: \x7fELF
0x04    02                             64-bit
0x05    01                             Little-endian
0x06    01                             Version 1
0x07    00 00 00 00 00 00 00 00 00     OS/ABI + padding
0x10    02 00                          e_type: ET_EXEC
0x12    3e 00                          e_machine: EM_X86_64
0x14    01 00 00 00                    e_version: 1
0x18    78 00 40 00 00 00 00 00        e_entry: 0x400078 (our code)
0x20    40 00 00 00 00 00 00 00        e_phoff: 64 (program headers at offset 64)
0x28    00 00 00 00 00 00 00 00        e_shoff: 0 (no section headers)
0x30    00 00 00 00                    e_flags: 0
0x34    40 00                          e_ehsize: 64
0x36    38 00                          e_phentsize: 56
0x38    01 00                          e_phnum: 1
0x3a    40 00                          e_shentsize: 64 (unused)
0x3c    00 00                          e_shnum: 0
0x3e    00 00                          e_shstrndx: 0
──────────── Program Header (offset 64 = 0x40) ──────────────
0x40    01 00 00 00                    p_type: PT_LOAD
0x44    05 00 00 00                    p_flags: PF_R | PF_X
0x48    00 00 00 00 00 00 00 00        p_offset: 0 (map from start of file)
0x50    00 00 40 00 00 00 00 00        p_vaddr: 0x400000
0x58    00 00 40 00 00 00 00 00        p_paddr: 0x400000
0x60    a7 00 00 00 00 00 00 00        p_filesz: 167 bytes
0x68    a7 00 00 00 00 00 00 00        p_memsz: 167 bytes
0x70    00 00 20 00 00 00 00 00        p_align: 0x200000
──────────── Code (offset 120 = 0x78) ───────────────────────
0x78    b8 01 00 00 00                 mov eax, 1 (NASM's short encoding of mov rax, 1)
0x7d    bf 01 00 00 00                 mov edi, 1
0x82    48 8d 35 11 00 00 00           lea rsi, [rip+0x11] (next insn 0x89 + 0x11 = 0x9a)
0x89    ba 0d 00 00 00                 mov edx, 13
0x8e    0f 05                          syscall
0x90    b8 3c 00 00 00                 mov eax, 60
0x95    48 31 ff                       xor rdi, rdi
0x98    0f 05                          syscall
──────────── Data (offset 154 = 0x9a) ───────────────────────
0x9a    48 65 6c 6c 6f 2c 20 45        "Hello, E"
0xa2    4c 46 21 0a 00                 "LF!\n\0" (file ends at 0xa7 = 167)

Version 2: The Overlap Trick (Smallest Possible ELF)

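The e_ident array ends with 8 bytes that the kernel ignores (EI_ABIVERSION plus 7 padding bytes), and several later header fields are ignored as well. These bytes can hold parts of the program header. Hand-tuned x86-64 ELF files that print a message are reported to fit in under 100 bytes using this technique. The fragment here sketches the idea: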

; ultra_tiny.asm — ELF header overlaps with program header
; Warning: This is intentionally unreadable — educational only
BITS 64

org 0x400000     ; Load address

ehdr:
    db 0x7f, 'E', 'L', 'F', 2, 1, 1, 0  ; e_ident[0..7]
phdr:
    dd 1           ; p_type: PT_LOAD         ← overlaps e_ident[8..11]
    dd 5           ; p_flags: PF_R|PF_X      ← overlaps e_ident[12..15]

    dw 2           ; e_type: ET_EXEC
    dw 0x3e        ; e_machine: EM_X86_64
    dd 1           ; e_version
    dq _start      ; e_entry: entry point

    dq phdr - ehdr ; e_phoff = 8 (program header at offset 8, within ELF header!)
    dq 0           ; e_shoff = 0

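; The rest of the program header (p_offset onward) is elided in this simplified sketch.
; Note: with phdr at offset 8, p_offset (file offsets 16..23) lands on e_type, e_machine,
; and e_version, and p_vaddr lands on e_entry, so a working layout must choose its
; overlap point so that every shared byte satisfies both headers.
    ; ...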

_start:
    mov eax, 60    ; exit(42)
    mov edi, 42
    syscall

The complete ultra-tiny technique (from Muppet Labs' "Tiny ELF" research) can produce a working ELF under 80 bytes. The details are complex but the principle is: many fields in the ELF header are unused by the kernel for basic execution; their bytes can be repurposed for other header data or even code.
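
As a minimal sketch of that principle, the listing here starts the program header at offset 56, so that its first 8 bytes share storage with the last four 16-bit fields of the ELF header. It assumes the Linux loader reads e_phnum but ignores e_shentsize, e_shnum, and e_shstrndx when executing a file; the file name overlap8.asm is hypothetical, and the layout has not been checked against every kernel version. Tools such as readelf may complain about the bogus section count.

```nasm
; overlap8.asm (hypothetical name): program header overlaps the last 8 bytes of the ELF header
; Build: nasm -f bin -o overlap8 overlap8.asm && chmod +x overlap8
BITS 64
org 0x400000

ehdr:
    db 0x7f, 'E', 'L', 'F', 2, 1, 1, 0   ; e_ident[0..7]
    dq 0                       ; e_ident[8..15]: EI_ABIVERSION + padding
    dw 2                       ; e_type: ET_EXEC
    dw 0x3e                    ; e_machine: EM_X86_64
    dd 1                       ; e_version
    dq _start                  ; e_entry
    dq phdr - ehdr             ; e_phoff = 56
    dq 0                       ; e_shoff
    dd 0                       ; e_flags
    dw 64                      ; e_ehsize
    dw 56                      ; e_phentsize
phdr:                          ; shared bytes: ELF header field | program header field
    dw 1                       ; e_phnum = 1      | p_type, low half
    dw 0                       ; e_shentsize = 0  | p_type, high half  -> p_type = 1 (PT_LOAD)
    dw 5                       ; e_shnum = 5      | p_flags, low half  -> p_flags = 5 (PF_R|PF_X)
    dw 0                       ; e_shstrndx = 0   | p_flags, high half
    dq 0                       ; p_offset
    dq ehdr                    ; p_vaddr = 0x400000
    dq ehdr                    ; p_paddr
    dq file_size               ; p_filesz
    dq file_size               ; p_memsz
    dq 0x1000                  ; p_align

_start:
    mov eax, 60                ; SYS_exit
    mov edi, 42                ; status = 42
    syscall

file_size equ $ - ehdr         ; 112 bytes of headers + 12 bytes of code
```

If the assumption about ignored fields holds, running the file and then `echo $?` should report 42. The overlap saves only 8 bytes compared with the 120-byte header layout of Version 1, but it shows the rule every overlap must obey: each shared byte has to be a valid value for both structures at once.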


What We Learn From This Exercise

1. ELF Is Simple

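The kernel needs only:
- Valid magic bytes, a matching e_machine, and e_type = ET_EXEC (ET_DYN is accepted too)
- A program header table (located by e_phoff, e_phentsize, and e_phnum) with a PT_LOAD entry specifying what to map and where
- A valid e_entry pointing to executable code

That's it.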

The section header table, .symtab, .strtab, .debug_* — none are needed to run. They exist for tooling (debuggers, disassemblers, readelf).

2. The Loader's Job Is Just mmap

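The kernel's execve implementation:
1. Reads the ELF header and checks magic, type, and machine
2. For each PT_LOAD segment, maps the file range at the specified vaddr with the requested permissions (the in-kernel equivalent of mmap), zero-filling any part of memsz beyond filesz
3. Sets up the initial stack (argc, argv, envp, and the auxiliary vector)
4. Jumps to e_entry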

That's nearly the complete picture. (Plus PT_INTERP handling for dynamic executables.)

3. Virtual Address 0x400000 Is Arbitrary

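The traditional Linux x86-64 load address for non-PIE executables is 0x400000 (4 MB), but any user-space address above vm.mmap_min_addr works, as long as p_vaddr and p_offset are congruent modulo the page size (a page-aligned address with p_offset = 0 satisfies this). Only the header fields e_entry, p_vaddr, and p_paddr hold absolute addresses; the Version 1 code itself uses RIP-relative addressing. In Version 1, changing VADDR updates all three fields; in Version 2, change the org directive.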

4. No _start, No main, No libc

Our binary has none of the C runtime infrastructure: no crt-provided _start, no __libc_start_main, no main, no exit() cleanup. We go directly from kernel entry (e_entry) to syscalls. Freestanding code such as operating system kernels, bootloaders, and some bare-metal embedded programs is likewise built without a C runtime, although it talks to hardware instead of making syscalls.


Extending the Minimal ELF

Add multiple PT_LOAD segments to create proper RX + RW layout:

; Add a second PT_LOAD for writable data
; p_flags = PF_R | PF_W = 6 for the data segment
; p_offset = data file offset, p_vaddr = data virtual address
; This matches what gcc/ld produce for normal programs

Add a PT_GNU_STACK header to mark the stack as non-executable (security hardening):

; PT_GNU_STACK (type = 0x6474e551)
; p_flags = PF_R | PF_W = 6 (no execute!)
; p_memsz = 0 (just a marker)

These extensions turn our 167-byte toy into a structure nearly identical to what gcc -nostdlib generates.
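
Both extensions can be combined in one file. The listing is a sketch under stated assumptions: the file name two_seg.asm, the 4 KB PAGE constant, and the choice to map the data one page above the code are all illustrative, not taken from any toolchain output. The data segment's p_vaddr keeps the same offset within its page as p_offset, which is the congruence the loader relies on. To show that the second segment really is writable, the code patches the first byte of the message before printing it.

```nasm
; two_seg.asm (hypothetical name): RX + RW PT_LOAD segments plus PT_GNU_STACK
; Build: nasm -f bin -o two_seg two_seg.asm && chmod +x two_seg && ./two_seg
BITS 64

VADDR     equ 0x400000        ; load address of the RX segment
PAGE      equ 0x1000          ; page size assumed for x86-64 Linux
EHDR_SIZE equ 64
PHDR_SIZE equ 56

elf_start:
    db 0x7f, 'E', 'L', 'F', 2, 1, 1, 0   ; magic, ELFCLASS64, little-endian, version, OS/ABI
    dq 0                       ; EI_ABIVERSION + padding
    dw 2                       ; e_type: ET_EXEC
    dw 0x3e                    ; e_machine: EM_X86_64
    dd 1                       ; e_version
    dq VADDR + (code_start - elf_start)  ; e_entry
    dq EHDR_SIZE               ; e_phoff
    dq 0                       ; e_shoff
    dd 0                       ; e_flags
    dw EHDR_SIZE               ; e_ehsize
    dw PHDR_SIZE               ; e_phentsize
    dw 3                       ; e_phnum: three entries now
    dw 64                      ; e_shentsize (unused)
    dw 0                       ; e_shnum
    dw 0                       ; e_shstrndx

; PT_LOAD #1: headers + code, read + execute
    dd 1                       ; p_type: PT_LOAD
    dd 5                       ; p_flags: PF_R | PF_X
    dq 0                       ; p_offset
    dq VADDR                   ; p_vaddr
    dq VADDR                   ; p_paddr
    dq rx_size                 ; p_filesz
    dq rx_size                 ; p_memsz
    dq PAGE                    ; p_align

; PT_LOAD #2: data, read + write, mapped one page above the code
    dd 1                       ; p_type: PT_LOAD
    dd 6                       ; p_flags: PF_R | PF_W
    dq data_start - elf_start  ; p_offset
    dq DATA_VADDR              ; p_vaddr (same offset within its page as p_offset)
    dq DATA_VADDR              ; p_paddr
    dq data_size               ; p_filesz
    dq data_size               ; p_memsz
    dq PAGE                    ; p_align

; PT_GNU_STACK: a marker only, nothing is loaded
    dd 0x6474e551              ; p_type: PT_GNU_STACK
    dd 6                       ; p_flags: PF_R | PF_W (no execute)
    dq 0, 0, 0, 0, 0           ; p_offset, p_vaddr, p_paddr, p_filesz, p_memsz
    dq 16                      ; p_align

code_start:
    mov esi, DATA_VADDR        ; rsi = run-time address of message
    mov byte [rsi], 'H'        ; store into the RW segment
    mov eax, 1                 ; SYS_write
    mov edi, 1                 ; fd = 1 (stdout)
    mov edx, data_size         ; len
    syscall
    mov eax, 60                ; SYS_exit
    xor edi, edi               ; status = 0
    syscall
rx_size equ $ - elf_start      ; headers + code

data_start:
message:
    db "?ello, RW!", 10        ; first byte is patched to 'H' at run time
data_size  equ $ - data_start
DATA_VADDR equ VADDR + PAGE + (data_start - elf_start)
```

When run, this should print "Hello, RW!". Moving the store so that it targets an address inside the first segment instead would be expected to fault, because that mapping is read + execute only. The file on disk is not modified: the data segment is a private mapping.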


Summary

By building a minimal ELF from scratch, we have seen that:
- Every field in the ELF header has a concrete, verifiable purpose
- Besides the ELF header itself, the program header table is the only structure required for execution; sections are optional
- The kernel's loader is simpler than expected: map segments, jump to entry
- strace and readelf are essential tools for verifying ELF correctness
- Understanding the format makes all subsequent tool output (readelf, objdump, nm) immediately interpretable