Chapter 23 Key Takeaways: Linking, Loading, and ELF
- ELF (Executable and Linkable Format) serves three roles: relocatable object file (.o, output of the assembler), executable (output of the linker), and shared library (.so). All three use the same format and differ only in their e_type value: ET_REL=1, ET_EXEC=2, ET_DYN=3.
- The ELF header (64 bytes in ELF64) identifies the binary and points to two tables: the program header table (for the loader; it describes the segments to map) and the section header table (for tools; it describes how the content is organized). The section header table is optional for execution and can be removed without affecting program behavior. Note that strip, by default, removes symbols and debug sections but leaves the section header table in place.
- .bss occupies zero bytes in the file but allocates memory at load time. The section header records only a size; the OS maps zero-initialized pages for the .bss virtual address range. A 4 MB array of global zeros adds no data to the compiled object file.
- ELF sections are for the linker; ELF segments (program headers) are for the loader. The linker combines sections into segments based on permissions, for example: .text + .rodata → read-execute PT_LOAD; .data + .bss → read-write PT_LOAD. The loader ignores sections entirely.
- A relocation entry records where to patch (r_offset), which symbol to look up and which relocation type to apply (both packed into r_info), and an addend for the calculation (r_addend). The relocation type (R_X86_64_PC32, R_X86_64_PLT32, R_X86_64_64) specifies the formula: S + A - P for PC-relative, S + A for absolute, where S is the symbol's address, A is the addend, and P is the address being patched. The linker applies these formulas during pass 2.
- Symbol binding determines link-time visibility: LOCAL symbols (C static) are invisible outside their .o file; GLOBAL symbols are visible to all inputs; WEAK symbols can be overridden by GLOBAL symbols of the same name. Duplicate GLOBAL definitions are a linker error; duplicate WEAK definitions are allowed (one wins silently).
- Archive (.a) linking is order-dependent: the linker scans each archive once, at its position on the command line, and includes only the .o members that satisfy currently undefined symbols. If an undefined symbol appears after the archive was scanned, it will not be resolved from that archive. Put libraries after the object files that use them on the command line.
- Position-Independent Code (PIC) is required for shared libraries. PIC uses RIP-relative addressing, so references are encoded as offsets from the current instruction and stay correct regardless of load address. Compile with -fPIC for shared libraries and -fPIE for position-independent executables (required for ASLR to randomize the main executable).
- The linker script is the authority on memory layout: it specifies section order, virtual addresses, and alignment, and it defines symbols visible to C/assembly code. Custom linker scripts are required for OS kernels, bootloaders, and embedded firmware, where the default memory layout does not apply.
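The relocation formulas in the takeaways above come down to a few lines of arithmetic. The Python sketch below is illustrative only: apply_relocation and its parameters are hypothetical names, the addresses are invented, and only R_X86_64_PC32 and R_X86_64_64 are modelled (R_X86_64_PLT32 is left out).

```python
import struct


def apply_relocation(section, section_addr, r_offset, r_type, sym_addr, r_addend):
    """Patch `section` (a bytearray) in place for one relocation entry.

    Hypothetical helper: section_addr is the address the linker assigned to
    the section, so the patched place is P = section_addr + r_offset.
    """
    S, A, P = sym_addr, r_addend, section_addr + r_offset
    if r_type == "R_X86_64_PC32":
        # PC-relative: store S + A - P as a signed 32-bit displacement.
        struct.pack_into("<i", section, r_offset, S + A - P)
    elif r_type == "R_X86_64_64":
        # Absolute: store S + A as a full 64-bit address.
        struct.pack_into("<Q", section, r_offset, S + A)
    else:
        raise ValueError(f"relocation type not modelled: {r_type}")


# A `call rel32` at the start of .text: opcode E8 plus a 4-byte placeholder.
text = bytearray(b"\xe8\x00\x00\x00\x00")
apply_relocation(text, section_addr=0x401000, r_offset=1,
                 r_type="R_X86_64_PC32", sym_addr=0x401020, r_addend=-4)
print(text.hex())  # e81b000000
```

The addend of -4 is there because the CPU measures the displacement from the end of the 4-byte field (the next instruction), not from the field itself. A value that does not fit the field makes struct raise an error, which loosely mirrors a linker's relocation overflow diagnostic.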
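The binding rules for LOCAL, GLOBAL, and WEAK symbols can be expressed as a small decision procedure. This is a toy model, not a real linker's algorithm: resolve is a hypothetical function, and picking the first weak definition is an assumption made here, since the takeaway only says that one of them wins.

```python
def resolve(definitions):
    """Pick the winning definition of one symbol name.

    `definitions` is a list of (binding, object_file) pairs in command-line
    order. LOCAL definitions never take part in cross-file resolution.
    """
    strong = [d for d in definitions if d[0] == "GLOBAL"]
    weak = [d for d in definitions if d[0] == "WEAK"]
    if len(strong) > 1:
        files = ", ".join(obj for _, obj in strong)
        raise ValueError(f"multiple definition in: {files}")
    if strong:
        return strong[0]  # a GLOBAL definition overrides any WEAK ones
    if weak:
        return weak[0]    # assumption: the first WEAK definition wins
    return None           # no usable definition: the symbol stays undefined


print(resolve([("WEAK", "default.o"), ("GLOBAL", "custom.o")]))
```

With these rules, a library can ship a WEAK default handler and let an application replace it simply by defining a GLOBAL symbol with the same name.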
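The order dependence of archive linking is easiest to see in a simulation. The sketch below models only symbol names; link, the tuple layout of its inputs, and the file names are all hypothetical. It assumes, as a simplification, that one archive is rescanned until it contributes no further members, and that an archive is never revisited once the linker has moved past it.

```python
def link(inputs):
    """Return (unresolved symbols, archive members pulled in).

    `inputs` is the command line in order. Each entry is either
    ("obj", name, defines, needs) or ("lib", name, members), where members
    maps a member name to a (defines, needs) pair of sets.
    """
    defined, undefined, pulled = set(), set(), []

    def add(defines, needs):
        defined.update(defines)
        undefined.update(needs)
        undefined.difference_update(defined)

    for entry in inputs:
        if entry[0] == "obj":
            _, _, defines, needs = entry
            add(defines, needs)  # object files are always included
            continue
        _, lib, members = entry
        progress = True
        while progress:  # members of one archive may depend on each other
            progress = False
            for member, (defines, needs) in members.items():
                key = f"{lib}({member})"
                if key not in pulled and defines & undefined:
                    add(defines, needs)
                    pulled.append(key)
                    progress = True
    return undefined, pulled


main_o = ("obj", "main.o", {"main"}, {"helper"})
libutil = ("lib", "libutil.a", {"util.o": ({"helper"}, set())})
print(link([main_o, libutil]))  # library after its user: nothing unresolved
print(link([libutil, main_o]))  # library first: helper stays unresolved
```

In the second call the archive is scanned while nothing is undefined yet, so util.o is never pulled in, which is the situation that shows up in practice as an "undefined reference" error.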
- KEEP prevents the linker's garbage collector from removing unreferenced sections. The Multiboot header, interrupt vector tables, and similar structures referenced only by external tools or hardware (GRUB, the CPU) need KEEP to survive --gc-sections.
- The loader (kernel + dynamic linker) turns an ELF file into a running process: the kernel reads the PT_LOAD segments and maps each one with mmap; the PT_INTERP segment specifies the dynamic linker path; the dynamic linker loads shared libraries, applies relocations (filling GOT entries), runs constructors, then jumps to e_entry → _start → main.
- ASLR (Address Space Layout Randomization) randomizes stack, heap, and library base addresses on each run so that exploits cannot rely on hardcoded addresses. The main executable is randomized only when built as PIE (-fPIE -pie, which produces ET_DYN). Non-PIE executables (ET_EXEC) always load at the same address.
- Essential ELF tools: readelf (all ELF metadata), objdump (disassembly + hex dumps), nm (symbol table), size (section sizes), ldd (shared library dependencies), strings (embedded strings), strip (remove symbols and debug information), ar (archive management), ld -Map (linker map generation).
- OS kernels and firmware must zero .bss themselves and, when running from flash/ROM, copy .data to RAM before calling C code. There is no operating system to do it for them. The linker script defines _bss_start/_bss_end symbols, and the boot assembly uses them for rep stosb zero-initialization, typically one of the first tasks a kernel entry point performs.
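The header fields named in the takeaways above (e_type, e_entry, and the offsets of the two tables) sit at fixed positions in the ELF64 header, so they can be decoded with a single struct format. This is a minimal sketch that assumes a 64-bit little-endian file; parse_ehdr and the synthetic sample header are illustrative and not taken from any real binary.

```python
import struct

# ELF64 file header, little-endian: e_ident[16], then e_type ... e_shstrndx.
ELF64_EHDR = struct.Struct("<16sHHIQQQIHHHHHH")
E_TYPE_NAMES = {1: "ET_REL", 2: "ET_EXEC", 3: "ET_DYN"}


def parse_ehdr(data):
    """Decode the ELF64 header fields discussed in this chapter."""
    if len(data) < ELF64_EHDR.size:
        raise ValueError("shorter than an ELF64 header")
    (ident, e_type, _machine, _version, e_entry, e_phoff, e_shoff,
     _flags, _ehsize, _phentsize, e_phnum, _shentsize, e_shnum,
     _shstrndx) = ELF64_EHDR.unpack_from(data)
    if ident[:4] != b"\x7fELF":
        raise ValueError("missing ELF magic")
    if ident[4] != 2 or ident[5] != 1:  # EI_CLASS=ELFCLASS64, EI_DATA=ELFDATA2LSB
        raise ValueError("this sketch only handles 64-bit little-endian ELF")
    return {
        "type": E_TYPE_NAMES.get(e_type, hex(e_type)),
        "entry": e_entry,
        "phoff": e_phoff, "phnum": e_phnum,  # program header table (loader)
        "shoff": e_shoff, "shnum": e_shnum,  # section header table (tools)
    }


# Synthetic header for a PIE-style binary with no section header table.
ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
sample = ELF64_EHDR.pack(ident, 3, 62, 1, 0x1040, 64, 0, 0, 64, 56, 2, 64, 0, 0)
print(ELF64_EHDR.size, parse_ehdr(sample)["type"])  # 64 ET_DYN
```

Passing the first 64 bytes of a real 64-bit little-endian ELF file to parse_ehdr should report the same kind of information as readelf -h, and the type field shows whether an executable was built as PIE (ET_DYN) or not (ET_EXEC).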
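The loader's view of a file, and the reason .bss costs no file space, can be sketched by walking the program header table. The code below assumes ELF64 little-endian program headers; load_plan and the two synthetic segments are hypothetical, and the result only describes what a loader would map rather than mapping anything.

```python
import struct

# Elf64_Phdr: p_type, p_flags, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_align
ELF64_PHDR = struct.Struct("<IIQQQQQQ")
PT_LOAD = 1
PF_X, PF_W, PF_R = 1, 2, 4


def load_plan(phdr_table, phnum):
    """Describe each PT_LOAD segment: where it goes and how much is zero-filled."""
    plan = []
    for i in range(phnum):
        (p_type, p_flags, p_offset, p_vaddr, _paddr,
         p_filesz, p_memsz, _align) = ELF64_PHDR.unpack_from(
            phdr_table, i * ELF64_PHDR.size)
        if p_type != PT_LOAD:
            continue  # only PT_LOAD segments are mapped into memory
        perms = "".join(ch if p_flags & bit else "-"
                        for ch, bit in (("r", PF_R), ("w", PF_W), ("x", PF_X)))
        plan.append({
            "vaddr": p_vaddr,
            "perms": perms,
            "file_range": (p_offset, p_offset + p_filesz),
            # Bytes present in memory but not in the file: the .bss part.
            "zero_fill": p_memsz - p_filesz,
        })
    return plan


# Two synthetic segments: code, then data followed by a 4 MiB .bss.
table = (
    ELF64_PHDR.pack(PT_LOAD, PF_R | PF_X, 0, 0x400000, 0x400000,
                    0x1000, 0x1000, 0x1000)
    + ELF64_PHDR.pack(PT_LOAD, PF_R | PF_W, 0x1000, 0x401000, 0x401000,
                      0x200, 0x200 + 4 * 1024 * 1024, 0x1000)
)
for segment in load_plan(table, 2):
    print(segment["perms"], hex(segment["vaddr"]), segment["zero_fill"])
```

The second segment has p_memsz larger than p_filesz by exactly 4 MiB: the file stores only 0x200 bytes of initialized data, and the remainder is zero-filled memory that exists only after loading.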