In This Chapter
- What Happens After Assembly
- 23.1 Object Files: The Currency of Linking
- 23.2 ELF Format: Executable and Linkable Format
- 23.3 Sections: The Building Blocks
- 23.4 Symbol Tables: Names and Addresses
- 23.5 Relocations: Instructions to the Linker
- 23.6 Linking: Symbol Resolution and Relocation
- 23.7 Linker Scripts: Controlling Memory Layout
- 23.8 Static vs. Dynamic Linking
- 23.9 Position-Independent Code (PIC)
- 23.10 The Loader: From File to Process
- 23.11 Full Worked Example: Tracing a Link
- 23.12 Tools Reference
- Check Your Understanding
- Summary
Chapter 23: Linking, Loading, and ELF
What Happens After Assembly
You have written assembly. You have run the assembler and produced a .o file. Now what? The journey from object file to running process involves two more tools: the linker (ld) and the loader (the kernel + dynamic linker). Understanding this pipeline is not merely academic — it determines what symbols are visible, how relocations work, why linking fails with "undefined reference," what the ELF format actually contains, and how the OS maps your program into memory.
This chapter tears open the black box between as and execution.
23.1 Object Files: The Currency of Linking
An object file (.o) is the direct output of the assembler. It contains:
- Machine code (not yet fully linked — addresses are incomplete)
- Symbols (names and their values/addresses)
- Relocations (instructions to the linker: "fill in this address later")
- Section data (.text, .data, .bss, .rodata, etc.)
Object files are the inputs to the linker. The linker combines multiple .o files (and archives .a) into a single executable or shared library.
source.c → [cc1] → source.s
source.s → [as] → source.o ← object file
source.o → [ld] → a.out ← executable or shared library
a.out → [exec syscall + ld.so] → running process
Examining an Object File
# Compile to object file only
gcc -c -O0 example.c -o example.o
# List sections
readelf -S example.o
# List symbols
readelf -s example.o
# List relocations
readelf -r example.o
# Disassemble
objdump -d example.o
23.2 ELF Format: Executable and Linkable Format
ELF (Executable and Linkable Format) is the standard binary format on Linux, FreeBSD, and most Unix-like systems. The same format serves three purposes:
- Relocatable object file (.o): output of the assembler, input to the linker
- Executable: output of the linker, directly executable
- Shared library (.so): position-independent code, loaded by the dynamic linker
ELF File Layout
ELF File
╔══════════════════════════════════════╗
║ ELF Header (64 bytes) ║ ← magic, type, arch, entry point
║ e_ident[16] magic + class + ABI ║
║ e_type ET_REL/ET_EXEC/ET_DYN ║
║ e_machine EM_X86_64 (0x3E) ║
║ e_entry entry point address ║
║ e_phoff program header offset ║
║ e_shoff section header offset ║
╠══════════════════════════════════════╣
║ Program Header Table (executables) ║ ← segments for loader
║ PT_LOAD: .text + .rodata ║ ← read+execute
║ PT_LOAD: .data + .bss ║ ← read+write
║ PT_DYNAMIC: dynamic linking info ║
║ PT_GNU_STACK: stack permissions ║
╠══════════════════════════════════════╣
║ .text section ║ ← machine code
╠══════════════════════════════════════╣
║ .rodata section ║ ← read-only data (string literals)
╠══════════════════════════════════════╣
║ .data section ║ ← initialized global/static variables
╠══════════════════════════════════════╣
║ .bss section ║ ← uninitialized variables (size only, no bytes)
╠══════════════════════════════════════╣
║ .symtab section ║ ← symbol table
╠══════════════════════════════════════╣
║ .strtab section ║ ← string table (symbol names)
╠══════════════════════════════════════╣
║ .rel.text / .rela.text section ║ ← relocations for .text
╠══════════════════════════════════════╣
║ .debug_* sections (if -g) ║ ← DWARF debug info
╠══════════════════════════════════════╣
║ Section Header Table ║ ← index of all sections
╚══════════════════════════════════════╝
ELF Header Fields
// From /usr/include/elf.h (simplified)
typedef struct {
unsigned char e_ident[16]; // Magic: \x7fELF + class + data + version + OS/ABI
uint16_t e_type; // ET_NONE=0, ET_REL=1, ET_EXEC=2, ET_DYN=3, ET_CORE=4
uint16_t e_machine; // EM_X86_64=62, EM_AARCH64=183, EM_RISCV=243
uint32_t e_version; // EV_CURRENT=1
uint64_t e_entry; // Entry point virtual address (0 for .o files)
uint64_t e_phoff; // Program header table offset
uint64_t e_shoff; // Section header table offset
uint32_t e_flags; // Architecture-specific flags
uint16_t e_ehsize; // Size of this header (64 for ELF64)
uint16_t e_phentsize; // Size of one program header entry (56)
uint16_t e_phnum; // Number of program header entries
uint16_t e_shentsize; // Size of one section header entry (64)
uint16_t e_shnum; // Number of section header entries
uint16_t e_shstrndx; // Index of section name string table section
} Elf64_Ehdr;
Reading the magic bytes of an ELF file:
xxd /bin/ls | head -1
# 0000000: 7f45 4c46 0201 0100 0000 0000 0000 0000 .ELF............
#          ^^^^ ^^^^ ← 7f 45 4c 46 = \x7fELF (magic)
#                    ^^ ← 02 = 64-bit (ELFCLASS64)
#                      ^^ ← 01 = little-endian (ELFDATA2LSB)
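The same check can be done programmatically. The following is a minimal sketch, assuming the first 16 bytes of the file have already been read into a buffer; the helper names elf_class_bits and elf_is_little_endian are hypothetical and not part of any library.

```c
#include <stddef.h>
#include <string.h>

/* Offsets into e_ident, as laid out by the ELF specification. */
enum { IDX_CLASS = 4, IDX_DATA = 5 };

/* Hypothetical helper: returns 64 or 32 for a plausible ELF image,
   0 if the buffer is too short or the magic does not match. */
int elf_class_bits(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    if (buf == NULL || len < 16 || memcmp(buf, magic, sizeof magic) != 0)
        return 0;
    switch (buf[IDX_CLASS]) {
    case 1:  return 32;   /* ELFCLASS32 */
    case 2:  return 64;   /* ELFCLASS64 */
    default: return 0;    /* invalid class byte */
    }
}

/* Returns 1 for little-endian (ELFDATA2LSB), 0 otherwise.
   Call only after elf_class_bits() has accepted the buffer. */
int elf_is_little_endian(const unsigned char *buf)
{
    return buf[IDX_DATA] == 1;
}
```

For the sixteen bytes shown in the xxd output, elf_class_bits returns 64 and elf_is_little_endian returns 1.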
23.3 Sections: The Building Blocks
ELF files are organized into sections. Each section has a type, flags (allocatable, writable, executable), and content.
Core Sections
| Section | SHT_type | Flags | Content |
|---|---|---|---|
| .text | PROGBITS | ALLOC+EXECINSTR | Machine code |
| .data | PROGBITS | ALLOC+WRITE | Initialized global/static variables |
| .bss | NOBITS | ALLOC+WRITE | Uninitialized variables (zero-initialized at load) |
| .rodata | PROGBITS | ALLOC | Read-only constants, string literals |
| .symtab | SYMTAB | (none) | Symbol table (name → address mapping) |
| .strtab | STRTAB | (none) | String table for symbol names |
| .shstrtab | STRTAB | (none) | String table for section names |
| .rel.X | REL | (none) | Relocations for section X (no addend) |
| .rela.X | RELA | (none) | Relocations for section X (with addend) |
| .debug_* | PROGBITS | (none) | DWARF debug information |
| .note.* | NOTE | (none) | Build ID, OS notes |
.bss: The Zero-Cost Section
.bss (Block Started by Symbol) is a special section that contains no bytes in the file, only a size. At load time, the kernel allocates zero-initialized pages for the .bss region. This means a 4 MB array of global zeros takes 0 bytes in the .o file:
// This goes in .bss — zero bytes in the file
int big_array[1000000]; // 4 MB in memory, 0 bytes in .o
// This goes in .data — 4 bytes in the file
int initialized = 42; // 4 bytes in both .o and in memory
# Verify: see section sizes
size example.o
# text data bss dec hex filename
# 245 12 4000000 4000257 3d0a01 example.o
# ^^^^^^
# 4 MB in .bss — but the file is tiny
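The file-size versus memory-size distinction comes down to the section type. As a rough sketch (the function names and the DEMO_ constants are illustrative; the numeric values 1 and 8 are the SHT_PROGBITS and SHT_NOBITS values from the ELF specification):

```c
#include <stdint.h>

enum { DEMO_SHT_PROGBITS = 1, DEMO_SHT_NOBITS = 8 };

/* Bytes a section occupies in the file: a NOBITS section records a
   size in its header but stores no content. */
uint64_t section_file_bytes(uint32_t sh_type, uint64_t sh_size)
{
    return sh_type == DEMO_SHT_NOBITS ? 0 : sh_size;
}

/* Bytes an allocatable section occupies in memory once loaded:
   always the full size, whatever the type. */
uint64_t section_mem_bytes(uint32_t sh_type, uint64_t sh_size)
{
    (void)sh_type;
    return sh_size;
}
```

For the 4,000,000-byte .bss above, section_file_bytes gives 0 while section_mem_bytes gives 4000000; for the 4-byte initialized variable in .data both give 4.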
Segment vs. Section
Sections are for the linker; segments are for the loader:
- Section: Named unit of content (.text, .data, .bss)
- Segment (Program Header): Memory mapping instructions for the OS loader
The linker combines sections into segments based on permissions. Sections with ALLOC+EXECINSTR go into a read-execute PT_LOAD segment; sections with ALLOC+WRITE go into a read-write PT_LOAD segment.
Sections Segments (PT_LOAD)
.text (rx) ─────────► LOAD [r-x] 0x400000 – 0x401fff
.rodata (r) ─────────►
.data (rw) ─────────► LOAD [rw-] 0x402000 – 0x402fff
.bss (rw) ─────────►
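The permission-based grouping can be sketched as a function from section flags to segment permissions. This is an illustration, not the linker's actual algorithm: the function name is hypothetical, the flag values (WRITE=0x1, ALLOC=0x2, EXECINSTR=0x4) come from the ELF specification, and the sketch reports the minimal permissions a section needs. Whether read-only data ends up sharing the code segment, as in the diagram above, or gets its own read-only segment depends on the linker and its options.

```c
#include <stddef.h>
#include <stdint.h>

enum { DEMO_SHF_WRITE = 0x1, DEMO_SHF_ALLOC = 0x2, DEMO_SHF_EXECINSTR = 0x4 };

/* Returns the minimal permissions a section needs at run time,
   or NULL when the section is not loaded at all (no ALLOC flag). */
const char *segment_perms(uint64_t sh_flags)
{
    if (!(sh_flags & DEMO_SHF_ALLOC))
        return NULL;               /* .symtab, .strtab, .debug_* */
    if (sh_flags & DEMO_SHF_EXECINSTR)
        return "r-x";              /* .text */
    if (sh_flags & DEMO_SHF_WRITE)
        return "rw-";              /* .data, .bss */
    return "r--";                  /* .rodata */
}
```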
23.4 Symbol Tables: Names and Addresses
The symbol table maps names to addresses (or sizes, types, etc.). Every global function, global variable, and external reference is a symbol.
Symbol Structure
typedef struct {
uint32_t st_name; // Index into string table (.strtab)
uint8_t st_info; // Binding (LOCAL/GLOBAL/WEAK) + Type (FUNC/OBJECT/NOTYPE)
uint8_t st_other; // Visibility (DEFAULT/PROTECTED/HIDDEN)
uint16_t st_shndx; // Section index (SHN_UNDEF=0 for undefined)
uint64_t st_value; // Address/value (offset within section for .o files)
uint64_t st_size; // Size of the symbol (function body size, etc.)
} Elf64_Sym;
Symbol Binding
- LOCAL (STB_LOCAL): Visible only within the object file. C static functions/variables.
- GLOBAL (STB_GLOBAL): Visible to all object files. Default for non-static globals.
- WEAK (STB_WEAK): Like global, but can be overridden by a GLOBAL symbol of the same name.
Symbol Type
- FUNC (STT_FUNC): A function (code symbol).
- OBJECT (STT_OBJECT): A data variable.
- NOTYPE (STT_NOTYPE): Unspecified (often external/undefined references).
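Binding and type share the single st_info byte: the binding sits in the upper four bits and the type in the lower four. The sketch below uses the same arithmetic as the ELF64_ST_BIND and ELF64_ST_TYPE macros in elf.h; the function names themselves are made up for this example.

```c
#include <stdint.h>

/* Upper nibble: binding. Lower nibble: type. */
unsigned sym_bind(uint8_t st_info) { return st_info >> 4; }
unsigned sym_type(uint8_t st_info) { return st_info & 0x0f; }

const char *sym_bind_name(uint8_t st_info)
{
    switch (sym_bind(st_info)) {
    case 0:  return "LOCAL";
    case 1:  return "GLOBAL";
    case 2:  return "WEAK";
    default: return "OTHER";
    }
}

const char *sym_type_name(uint8_t st_info)
{
    switch (sym_type(st_info)) {
    case 0:  return "NOTYPE";
    case 1:  return "OBJECT";
    case 2:  return "FUNC";
    case 3:  return "SECTION";
    case 4:  return "FILE";
    default: return "OTHER";
    }
}
```

An st_info of 0x12 therefore decodes as GLOBAL FUNC, the combination a non-static C function gets.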
Reading Symbol Tables
readelf -s example.o
# Num: Value Size Type Bind Vis Ndx Name
# 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
# 1: 0000000000000000 0 FILE LOCAL DEFAULT ABS example.c
# 2: 0000000000000000 32 FUNC GLOBAL DEFAULT 1 add
# 3: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND printf
# ^^^
# UND = undefined (needs to be resolved)
# nm is the traditional tool
nm -C example.o
# 0000000000000000 T add ← T = .text (defined function)
# U printf ← U = undefined (external reference)
23.5 Relocations: Instructions to the Linker
When the assembler generates code that references a symbol (a function call, a global variable access), it does not know the final address. Instead, it writes a placeholder and records a relocation entry: "at this offset in .text, fill in the address of symbol X."
Relocation Structure (RELA format, with addend)
typedef struct {
uint64_t r_offset; // Where to apply the relocation (offset within section)
uint64_t r_info; // Symbol index + relocation type (packed together)
int64_t r_addend; // Addend for the relocation formula
} Elf64_Rela;
The relocation type tells the linker how to compute the final value:
- R_X86_64_PC32: 32-bit PC-relative reference (call/jump target). Formula: S + A - P, where S = symbol address, A = addend, P = address of the location being patched (the section's final address plus r_offset).
- R_X86_64_64: 64-bit absolute address. Formula: S + A.
- R_X86_64_PLT32: Call via PLT (for function calls to shared library functions).
- R_X86_64_GOTPCREL: Load from GOT (for accessing global variables in shared libraries).
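To make the first two formulas concrete, here is a small sketch of how a linker might patch the bytes once S, A, and P are known. The function names are hypothetical, and real linkers handle many more relocation types and error cases.

```c
#include <stdint.h>

/* R_X86_64_PC32: computes S + A - P and stores the low 32 bits
   little-endian at `where`. Returns 0 on success, -1 if the result
   does not fit in a signed 32-bit field. */
int apply_pc32(unsigned char *where, uint64_t S, int64_t A, uint64_t P)
{
    int64_t value = (int64_t)(S + (uint64_t)A - P);

    if (value < INT32_MIN || value > INT32_MAX)
        return -1;                    /* target too far away for 32 bits */
    uint32_t v = (uint32_t)value;
    for (int i = 0; i < 4; i++)
        where[i] = (unsigned char)(v >> (8 * i));
    return 0;
}

/* R_X86_64_64: the full 64-bit value S + A, stored little-endian. */
void apply_abs64(unsigned char *where, uint64_t S, int64_t A)
{
    uint64_t v = S + (uint64_t)A;
    for (int i = 0; i < 8; i++)
        where[i] = (unsigned char)(v >> (8 * i));
}
```

With illustrative values S = 0x401020, A = -4, and P = 0x40100a, apply_pc32 writes the bytes 12 00 00 00, because 0x401020 - 4 - 0x40100a = 0x12. A backward reference produces a negative displacement stored in two's complement.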
Watching Relocations in Action
// simple.c
extern int printf(const char *fmt, ...);
extern int global_var;
void demo(void) {
printf("hello\n"); // Will generate a relocation for printf
global_var = 42; // Will generate a relocation for global_var
}
gcc -c -O0 simple.c -o simple.o
readelf -r simple.o
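# Relocation section '.rela.text' at offset 0x... contains 3 entries:
# Offset Info Type Sym. Value Sym. Name + Addend
# 000000000015 000300000002 R_X86_64_PC32 0000000000000000 .rodata - 4
# 00000000001a 000500000004 R_X86_64_PLT32 0000000000000000 printf - 4
# 000000000020 000600000002 R_X86_64_PC32 0000000000000000 global_var - 8
#
# (Illustrative output: exact offsets, symbol indices, and types vary by toolchain.)
# Offset 0x15: the 4-byte displacement of the lea that loads the "hello\n" string
# Offset 0x1a: the 4-byte displacement of the call to printf (undefined here, defined in libc)
# Addend -4: the CPU adds the displacement to the address of the next instruction,
#   so the linker fills in (target_addr - (here + 4))
# Addend -8 for global_var: a 4-byte immediate (42) follows the displacement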
objdump -d simple.o
# 0000000000000012 <demo>:
# 12: 48 8d 3d 00 00 00 00 lea 0x0(%rip),%rdi ← 00 00 00 00 = placeholder
# 19: e8 00 00 00 00 call 1e <demo+0xc> ← 00 00 00 00 = placeholder
The 00 00 00 00 placeholders in the object file will be filled in by the linker once it knows where printf actually lives.
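The Info column in the readelf output is the packed r_info field: the symbol index in the upper 32 bits and the relocation type in the lower 32. A short sketch of the unpacking follows, using the same arithmetic as the ELF64_R_SYM and ELF64_R_TYPE macros in elf.h. The function names are illustrative, and the type numbers are those assigned by the x86-64 psABI.

```c
#include <stdint.h>

uint32_t rela_sym(uint64_t r_info)  { return (uint32_t)(r_info >> 32); }
uint32_t rela_type(uint64_t r_info) { return (uint32_t)(r_info & 0xffffffffu); }

/* Names for the four relocation types discussed in this section. */
const char *x86_64_reloc_name(uint32_t type)
{
    switch (type) {
    case 1:  return "R_X86_64_64";
    case 2:  return "R_X86_64_PC32";
    case 4:  return "R_X86_64_PLT32";
    case 9:  return "R_X86_64_GOTPCREL";
    default: return "(other)";
    }
}
```

An Info value of 0x000600000002 unpacks to symbol index 6 and type 2, which is R_X86_64_PC32.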
23.6 Linking: Symbol Resolution and Relocation
The linker performs two passes:
Pass 1 — Symbol Resolution: Build a global symbol table. Each .o file's symbols are added. Undefined symbols (UND) must be satisfied by another .o file, an archive (.a), or a shared library (.so).
Pass 2 — Relocation: For each relocation entry in each .o file, compute the final address and patch the machine code.
Static Linking with an Archive
An archive (.a) is a collection of .o files. The linker includes only the .o files from the archive that satisfy undefined symbols — not the entire archive.
# Create archive
ar rcs libmylib.a module1.o module2.o module3.o
# Link with archive
gcc main.o -L. -lmylib -o program
# Equivalent: ld main.o libmylib.a -o program (simplified)
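The linker processes its inputs left to right and searches each archive once, at the point where it appears on the command line, pulling in only the members that define symbols still undefined at that moment. A library must therefore come after the object files that reference it: main.o -lmylib works, while -lmylib main.o leaves main.o's references unresolved. Dependencies inside a single archive are handled automatically: if main.o references foo from module1.o, and module1.o references bar from module2.o, the linker keeps rescanning libmylib.a until no new members are needed. If two libraries depend on each other, list one twice (-lA -lB -lA) or, with GNU ld, wrap them in --start-group ... --end-group.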
Linker Symbol Resolution Rules
- If a symbol is defined as GLOBAL in more than one .o file: error (duplicate symbol).
- If a symbol is defined as WEAK in one file and GLOBAL in another: the GLOBAL definition wins.
- If a symbol is defined as WEAK in multiple files: the linker picks one of them, and code should not depend on which.
- If a symbol remains undefined after all inputs: error ("undefined reference to 'foo'").
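The first three rules can be written as a small decision function that a linker might apply when it meets a second definition of a name it has already seen. This is a sketch with hypothetical names. Keeping the first of two WEAK definitions is an assumption made here for determinism, not something the rules above guarantee.

```c
enum binding { BIND_GLOBAL, BIND_WEAK };
enum outcome { KEEP_EXISTING, TAKE_NEW, DUPLICATE_ERROR };

/* Decide what happens when `incoming` defines a symbol that
   `existing` has already defined. */
enum outcome resolve_definitions(enum binding existing, enum binding incoming)
{
    if (existing == BIND_GLOBAL && incoming == BIND_GLOBAL)
        return DUPLICATE_ERROR;   /* "multiple definition of ..." */
    if (existing == BIND_WEAK && incoming == BIND_GLOBAL)
        return TAKE_NEW;          /* GLOBAL overrides WEAK */
    return KEEP_EXISTING;         /* GLOBAL beats a later WEAK;
                                     WEAK vs. WEAK keeps the first (assumption) */
}
```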
Common linking errors:
undefined reference to 'foo' → foo not defined in any input
multiple definition of 'bar' → bar defined in two .o files as GLOBAL
cannot find -lfoo → libfoo.a or libfoo.so not found
23.7 Linker Scripts: Controlling Memory Layout
The linker uses a linker script to determine how sections are combined and where they are placed in the output binary. The default script (built into ld) handles normal executables. Custom linker scripts are needed for:
- Embedded systems (no OS, specific memory map)
- Custom executables (non-standard layout)
- Operating system kernels
Default Script Behavior (Simplified)
SECTIONS {
. = 0x400000; /* start at virtual address 0x400000 */
.text : { *(.text) } /* all .text from all .o files */
.rodata : { *(.rodata) } /* all .rodata */
. = ALIGN(0x1000); /* align to page boundary */
.data : { *(.data) } /* all .data */
.bss : { *(.bss) } /* all .bss */
}
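The ALIGN(0x1000) step rounds the location counter up to the next page boundary. For a power-of-two alignment the arithmetic is a single mask operation, sketched here in C; align_up is a name chosen for this example.

```c
#include <stdint.h>

/* Rounds addr up to the next multiple of align.
   align must be a power of two (0x1000 for 4 KiB pages). */
uint64_t align_up(uint64_t addr, uint64_t align)
{
    return (addr + align - 1) & ~(align - 1);
}
```

If .text and .rodata end at 0x401234, align_up(0x401234, 0x1000) places .data at 0x402000. An address that is already aligned is returned unchanged.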
MinOS Kernel Linker Script
The MinOS kernel project uses a custom linker script that places the kernel at a specific physical address:
/* minOS/linker.ld */
OUTPUT_FORMAT("elf64-x86-64")
OUTPUT_ARCH(i386:x86-64)
ENTRY(_start)
SECTIONS {
/* Kernel loaded at 1 MB physical = 0x100000 */
/* Higher-half virtual = 0xFFFFFFFF80100000 (identity-mapped initially) */
. = 0x100000;
_kernel_start = .;
.text : {
*(.multiboot) /* Multiboot header must be first */
*(.text)
*(.text.*)
}
. = ALIGN(0x1000);
.rodata : {
*(.rodata)
*(.rodata.*)
}
. = ALIGN(0x1000);
.data : {
*(.data)
*(.data.*)
}
. = ALIGN(0x1000);
_bss_start = .;
.bss : {
*(COMMON) /* Uninitialized C externals */
*(.bss)
*(.bss.*)
}
_bss_end = .;
_kernel_end = .;
/* Discard compiler comments and note sections (not needed in the kernel image) */
/DISCARD/ : { *(.comment) *(.note*) }
}
The symbols _bss_start, _bss_end, _kernel_start, _kernel_end are defined by the linker script and become accessible in assembly/C code:
// In kernel C code
extern char _bss_start[], _bss_end[];
extern char _kernel_start[], _kernel_end[];
// Zero the BSS (kernel must do this itself — no OS to zero it)
void zero_bss(void) {
memset(_bss_start, 0, _bss_end - _bss_start);
}
23.8 Static vs. Dynamic Linking
Static Linking
All library code is copied into the executable at link time. No external dependencies at runtime.
gcc -static main.c -o main_static
gcc main.c -o main_dynamic # default dynamic link, built for comparison
ls -lh main_static
# -rwxr-xr-x 1 user user 892K main_static ← libc is embedded
ls -lh main_dynamic
# -rwxr-xr-x 1 user user 16K main_dynamic ← libc is external
Advantages: No external dependencies, predictable behavior, easier distribution. Disadvantages: Larger binaries, library security fixes reach the program only when it is relinked, no sharing of library code between processes.
Dynamic Linking
Library code is not copied into the executable. At runtime, the dynamic linker (/lib64/ld-linux-x86-64.so.2) loads the required shared libraries and resolves symbols.
ldd /bin/ls
# linux-vdso.so.1 (0x00007fff...) ← virtual DSO from kernel
# libselinux.so.1 => /lib/x86_64-linux-gnu/libselinux.so.1
# libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6
# libpcre2-8.so.0 => /lib/x86_64-linux-gnu/libpcre2-8.so.0
# /lib64/ld-linux-x86-64.so.2 => /lib64/ld-linux-x86-64.so.2
Advantages: Smaller executables, library code shared between processes, security fixes applied by updating the library alone. Disadvantages: Version mismatches ("DLL hell"), startup overhead, deployment complexity.
23.9 Position-Independent Code (PIC)
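Shared libraries must be position-independent: they can be loaded at any virtual address in any process. Without PIC, a shared library would need the same address in every process (impractical once many libraries compete for address space), or the dynamic linker would have to patch its code at load time, which prevents processes from sharing those pages.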
The Problem
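; Non-PIC code: assumes it is loaded at a fixed address
mov rax, [global_var] ; WRONG for .so: encodes an absolute address
call printf ; WRONG for .so: direct call to a target in another module, whose location is unknown at link time
If the library is loaded at 0x7f000000, the absolute address baked into the mov points at the wrong place, and the call has no fixed target to encode.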
PIC Solution: RIP-Relative Addressing
; PIC code — uses RIP-relative addressing
mov rax, [rip + offset_to_got] ; Load address from GOT (correct regardless of load address)
call printf@PLT ; Call through PLT (correct regardless of load address)
On x86-64, a shared library reaches its own code and data through RIP-relative addressing. The distance from the current instruction to a target inside the same library is constant regardless of where the library is loaded, because both the instruction and the target move together.
# Force PIC generation
gcc -fPIC -shared -o libfoo.so foo.c
# Inspect: look for RIP-relative addressing (-M intel selects Intel syntax)
objdump -d -M intel libfoo.so | grep -A3 "rip"
# lea rax,[rip+0x100] ← RIP-relative
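The claim that the distance stays constant is simple arithmetic: the load base is added to both ends and cancels out. A tiny sketch, where the function name and the offsets in the note below are made up for illustration:

```c
#include <stdint.h>

/* Displacement encoded in a RIP-relative instruction: the target
   address minus the address of the next instruction. Both are
   offsets within the same library, shifted by the same load base.
   The result is assumed to fit in 32 bits. */
int32_t rip_disp(uint64_t load_base, uint64_t next_insn_off, uint64_t target_off)
{
    uint64_t rip    = load_base + next_insn_off;
    uint64_t target = load_base + target_off;
    return (int32_t)(target - rip);
}
```

With a next-instruction offset of 0x1107 and a target offset of 0x1207, the result is 0x100 whether the library is loaded at 0x7f0000000000 or at 0x555555554000.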
GOT and PLT
PIC introduces two new data structures, covered in depth in Chapter 24:
- GOT (Global Offset Table): Contains runtime addresses of global variables and functions. Filled by the dynamic linker.
- PLT (Procedure Linkage Table): Jump stubs for external function calls. Enables lazy binding (resolving calls on first use, not at startup).
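Lazy binding can be imitated in plain C with a function pointer standing in for a GOT slot. This is only an analogy under simplifying assumptions: the real mechanism uses assembly stubs and the dynamic linker's symbol lookup, and every name below is invented for the illustration.

```c
typedef int (*fn_t)(int);

static int real_double(int x) { return 2 * x; }  /* stands in for the library function */
static int resolver_stub(int x);                 /* stands in for the resolver path */

/* One "GOT slot": starts out pointing at the resolver,
   and is patched to the real target on first use. */
static fn_t got_slot = resolver_stub;
static int  resolve_count = 0;

static int resolver_stub(int x)
{
    resolve_count++;          /* the symbol lookup would happen here, once */
    got_slot = real_double;   /* patch the slot */
    return got_slot(x);       /* complete the original call */
}

/* Equivalent of a call through the PLT: always an indirect
   jump through the slot, whatever it currently holds. */
int call_via_plt(int x)
{
    return got_slot(x);
}

int lazy_resolutions(void) { return resolve_count; }
```

The first call_via_plt(21) goes through the resolver and returns 42. Every later call jumps straight to real_double, and lazy_resolutions() stays at 1.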
23.10 The Loader: From File to Process
When you execute a program, the kernel does the following:
Kernel Phase (execve syscall)
- Read ELF header: Verify magic bytes, check architecture, find PT_LOAD segments and PT_INTERP segment.
- Map PT_LOAD segments: mmap each segment into the process address space with appropriate permissions.
- Find the interpreter (PT_INTERP): Usually /lib64/ld-linux-x86-64.so.2, the dynamic linker.
- Map the dynamic linker into the process address space.
- Jump to the dynamic linker's entry point, passing the auxiliary vector (AT_PHDR, AT_ENTRY, etc.) on the stack.
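The "appropriate permissions" in the mapping step come from each segment's p_flags field. Below is a rough sketch of the translation into mmap protection bits, plus the page rounding a loader needs because mappings start on page boundaries. The function names are hypothetical; the PF_ values (X=1, W=2, R=4) are from the ELF specification, and the PROT_ constants come from the POSIX header sys/mman.h.

```c
#include <stdint.h>
#include <sys/mman.h>

enum { DEMO_PF_X = 0x1, DEMO_PF_W = 0x2, DEMO_PF_R = 0x4 };

/* Translate ELF segment flags into mmap/mprotect protection bits. */
int prot_from_pflags(uint32_t p_flags)
{
    int prot = PROT_NONE;
    if (p_flags & DEMO_PF_R) prot |= PROT_READ;
    if (p_flags & DEMO_PF_W) prot |= PROT_WRITE;
    if (p_flags & DEMO_PF_X) prot |= PROT_EXEC;
    return prot;
}

/* Start of the page containing vaddr; page_size must be a power of two. */
uint64_t page_floor(uint64_t vaddr, uint64_t page_size)
{
    return vaddr & ~(page_size - 1);
}
```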
Dynamic Linker Phase (ld-linux.so)
- Process .dynamic section: Read DT_NEEDED entries (required shared libraries), DT_SYMTAB, DT_RELA, DT_JMPREL, etc.
- Load required shared libraries: Process each library's own DT_NEEDED entries in turn until the full dependency graph is loaded (glibc walks it breadth-first).
- Perform relocations: For each DT_RELA entry, compute and write the final address.
- Run .init sections and DT_INIT_ARRAY constructors (C++ global constructors, __attribute__((constructor))).
- Jump to the program's entry point (e_entry in the ELF header → _start → main).
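The constructor step is easy to observe from C. The sketch below relies on the GCC/Clang constructor attribute mentioned above (a compiler extension, not standard C); the function and variable names are illustrative.

```c
/* Zero-initialized as part of loading, before any code runs. */
static int init_count = 0;

/* GCC/Clang extension: a pointer to this function is placed in the
   init array, so startup code calls it before main. */
__attribute__((constructor))
static void early_setup(void)
{
    init_count++;
}

/* Number of constructors that have run so far. */
int constructors_run(void)
{
    return init_count;
}
```

By the time main starts, constructors_run() already returns 1, even though nothing in main ever calls early_setup.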
Virtual Memory Layout
After loading, the process address space looks like:
Virtual Address Space (x86-64 Linux)
┌─────────────────────────────────────┐ 0xFFFFFFFFFFFFFFFF
│ Kernel (not accessible from user)   │
├─────────────────────────────────────┤ 0xFFFF800000000000
│                                     │
│ (non-canonical gap)                 │
│                                     │
├─────────────────────────────────────┤ 0x00007FFFFFFFFFFF
│ Stack (grows downward)              │
│   argv, environ, auxv               │
│   ...                               │
│ [stack guard page]                  │
├─────────────────────────────────────┤
│ [mmap base: chosen by the kernel]   │
│ mmap region (grows downward)        │
│   Shared libraries (.so files)      │
│   mmap'd files                      │
├─────────────────────────────────────┤
│ Heap (grows upward via brk/mmap)    │
├─────────────────────────────────────┤
│ .bss (zero-initialized)             │
│ .data (initialized)                 │
│ .rodata (read-only)                 │
│ .text (executable)                  │
├─────────────────────────────────────┤ 0x0000000000400000
│ (unmapped: null pointer guard)      │
└─────────────────────────────────────┘ 0x0000000000000000
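The non-canonical gap follows from a simple rule: with 48-bit virtual addresses, bits 63 through 47 of a pointer must all be equal. Here is a sketch of that test, assuming 48-bit addressing as in the diagram (systems with 5-level paging use a wider range); the function names are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* With 48-bit virtual addresses, bits 63..47 must be all zeros
   (user half) or all ones (kernel half). */
bool is_canonical48(uint64_t addr)
{
    uint64_t top = addr >> 47;     /* the upper 17 bits */
    return top == 0 || top == 0x1ffff;
}

/* True for addresses in the lower (user) half of the diagram. */
bool is_user_address(uint64_t addr)
{
    return addr <= 0x00007fffffffffffULL;
}
```

The boundary values in the diagram behave as expected: 0x00007FFFFFFFFFFF and 0xFFFF800000000000 are canonical, while 0x0000800000000000, one byte past the top of user space, is not.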
ASLR: Address Space Layout Randomization
Modern Linux (and macOS, Windows) uses ASLR to randomize the base addresses of the stack, heap, and shared libraries. This prevents attackers from hardcoding target addresses in exploits.
# Run the same command twice: libc is mapped at a different address each time
ldd /bin/ls | grep libc
# libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f3d8b200000) ← first run
# libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f9c3a100000) ← second run
The executable itself is not randomized unless compiled as a PIE (Position-Independent Executable):
# Compile as PIE (default in modern distributions)
gcc -fPIE -pie main.c -o main
# Check
readelf -h main | grep Type
# Type: DYN (Position-Independent Executable)
23.11 Full Worked Example: Tracing a Link
Let's trace a complete link of a two-file project and see every step.
Source Files
// math_utils.c
int add(int a, int b) { return a + b; }
int sub(int a, int b) { return a - b; }
// main.c
#include <stdio.h>
extern int add(int, int);
extern int sub(int, int);
int main(void) {
printf("%d\n", add(3, 4));
printf("%d\n", sub(10, 3));
return 0;
}
Step 1: Compile to Object Files
gcc -c -O0 math_utils.c -o math_utils.o
gcc -c -O0 main.c -o main.o
Step 2: Inspect Object Files
nm math_utils.o
# 0000000000000000 T add ← defined at offset 0 in .text
# 0000000000000020 T sub ← defined at offset 32 in .text
nm main.o
# U add ← undefined reference
# U printf ← undefined reference
# U sub ← undefined reference
# 0000000000000000 T main ← defined at offset 0 in .text
readelf -r main.o
# Relocation section '.rela.text' contains 6 entries:
# Offset Type Sym. Name
# ... R_X86_64_PC32 add ← needs add's address
# ... R_X86_64_PC32 .rodata ← address of the "%d\n" format string
# ... R_X86_64_PLT32 printf ← needs printf via PLT
# ... R_X86_64_PC32 sub ← needs sub's address
# ... R_X86_64_PC32 .rodata
# ... R_X86_64_PLT32 printf
Step 3: Link
gcc -no-pie main.o math_utils.o -o program
# -no-pie keeps fixed 0x40xxxx addresses; a PIE build shows small offsets instead
# ld resolves (example addresses; yours will differ):
# add → math_utils.o, placed at 0x0000000000401180
# sub → math_utils.o, placed at 0x00000000004011a0
# printf → libc.so.6 (via PLT, see Chapter 24)
Step 4: Verify
nm program | grep -E "add|sub|main|printf"
# 0000000000401180 T add
# 0000000000401136 T main
# U printf@GLIBC_2.2.5 ← still "undefined" in the file; bound at runtime via the PLT
# 00000000004011a0 T sub
objdump -d program | grep -A3 "call"
# call 401180 <add> ← direct call, address filled in
# call 401030 <printf@plt> ← call through PLT for dynamic symbol
The linker filled in the relocation for add (known at link time) and left printf to be resolved at runtime via the PLT mechanism.
23.12 Tools Reference
| Tool | Purpose | Key flags |
|---|---|---|
| readelf | Read ELF metadata | -h header, -S sections, -s symbols, -r relocations, -l segments, -d dynamic |
| objdump | Disassemble + dump | -d disassemble, -D all sections, -s hex dump, -t symbol table |
| nm | List symbols | -C demangle C++, -u undefined only, -D dynamic symbols |
| size | Section sizes | (no flags needed) |
| ldd | List shared libs | (no flags needed) |
| file | Identify file type | Identifies ELF type, architecture |
| strings | Extract strings | -n 8 minimum 8 chars |
| strip | Remove symbols | --strip-debug removes debug only |
| objcopy | Copy/convert | --only-section=.text extract one section |
| ld | Direct linker | -T script.ld custom script, -Map output.map generate map |
| ar | Archive manager | rcs create, t list, x extract |
Check Your Understanding
- Why does .bss have no bytes in the object file?
- What is the difference between a section and a segment?
- What does a relocation entry contain, and when is it processed?
- Why does compiling a shared library require -fPIC?
- What is the role of the PT_INTERP segment?
- How does the linker handle a WEAK symbol vs. a GLOBAL symbol of the same name?
- What is ASLR and why does it require PIE executables to work on the main binary?
Summary
The journey from object file to running process passes through: ELF sections (content organized by type and permissions), the symbol table (names and addresses), relocation entries (linker instructions for patching addresses), the linker (resolving symbols, applying relocations, combining sections into segments), the static/shared library choice, position-independent code (RIP-relative addressing for .so), and the loader (kernel mapping + dynamic linker bootstrapping). Understanding this pipeline makes debugging link errors, reading binary tools' output, and writing embedded system linker scripts second nature.