Chapter 24: Dynamic Linking in Depth

Open Assembly Language Project

In This Chapter

The Invisible Machinery
24.1 The Problem: Position-Independent Calls to Unknown Addresses
24.2 The PLT: Procedure Linkage Table
24.3 The GOT: Global Offset Table
24.4 Lazy Binding: Step-by-Step Trace
24.5 Relocations for Dynamic Linking
24.6 RELRO: Hardening the GOT
24.7 LD_PRELOAD: Function Interposition
24.8 dlopen/dlsym/dlclose: Runtime Dynamic Loading
24.9 Weak Symbols and Symbol Versioning
24.10 Library Search Path: DT_RPATH and DT_RUNPATH
24.11 The .dynamic Section
24.12 Constructors and Destructors
24.13 Security: GOT Overwrite and Countermeasures
24.14 Complete Worked Example: Tracing printf Through PLT and GOT
Check Your Understanding
Summary

The Invisible Machinery

Every time a Linux process calls printf, malloc, or sin, it goes through an invisible machinery of tables, stubs, and resolver functions. Most programmers never see this — the C compiler, linker, and dynamic linker conspire to make it transparent. But for assembly programmers, security researchers, and performance engineers, understanding the PLT/GOT mechanism is essential. This chapter makes it visible.

24.1 The Problem: Position-Independent Calls to Unknown Addresses

At link time, the linker knows that main.o calls printf, but it does not know where printf will be in memory at runtime. Why?

printf is in libc.so.6
libc.so.6 is loaded at a random address (ASLR)
The load address is not known until the dynamic linker runs at program startup

The linker cannot hard-code printf's address into the call instruction. It must leave a placeholder and arrange for the dynamic linker to fill it in.

There are two possible strategies:

1. Load-time relocation: At startup, the dynamic linker patches every call site with the correct address. Simple, but it requires patching .text (which is read-only) and means all symbols are resolved at startup (slow for large programs).
2. Lazy binding with PLT/GOT: Each function gets a small "stub" in the PLT. On the first call, the stub resolves the address and caches it in the GOT. Subsequent calls jump straight through the GOT entry. Only functions that are actually called are resolved.

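Linux ELF toolchains have traditionally used lazy binding by default, although many distributions now link with -z now (see Section 24.6), which resolves every symbol at startup instead.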

24.2 The PLT: Procedure Linkage Table

The PLT is a section (.plt) in the executable or shared library. It contains small stubs — one per external function. Each stub is exactly 16 bytes on x86-64:

PLT structure for printf:

printf@plt:
    jmp     QWORD PTR [rip + printf@GOTPCREL]   ; Jump via GOT entry
    push    0x0                                   ; Push relocation index (printf is entry 0 in .rela.plt)
    jmp     .plt.0                                ; Jump to resolver

.plt.0 (PLT[0] — the resolver stub):
    push    QWORD PTR [rip + _GLOBAL_OFFSET_TABLE_+8]  ; Push link_map ptr
    jmp     QWORD PTR [rip + _GLOBAL_OFFSET_TABLE_+16] ; Jump to _dl_runtime_resolve

The Three Instructions of a PLT Entry

Instruction 1: jmp [GOT+offset]

This is an indirect jump through the GOT. The GOT entry initially contains the address of instruction 2 (the push that follows). On first call, this jumps into the PLT itself. After resolution, the GOT entry contains printf's actual address, so this instruction jumps directly to printf.

Instruction 2: push index

Pushes the relocation index — which PLT entry is being resolved. The resolver uses this to find the right symbol in the relocation table.

Instruction 3: jmp PLT[0]

Jumps to the resolver stub (PLT entry 0), which pushes the link_map pointer and then jumps to _dl_runtime_resolve in ld-linux.so.

24.3 The GOT: Global Offset Table

The GOT is a section (.got, .got.plt) containing a table of pointers — one per external symbol. These pointers are updated by the dynamic linker at runtime.

GOT structure:

GOT[0]: Address of .dynamic section    ← ld.so uses this
GOT[1]: Pointer to link_map structure  ← filled by ld.so
GOT[2]: Pointer to _dl_runtime_resolve ← filled by ld.so
GOT[3]: <initially PLT[1]+6>           ← printf stub: initially points back into PLT
GOT[4]: <initially PLT[2]+6>           ← another function...

The GOT is in a writable segment (.got.plt has RW permissions). This is what makes lazy binding possible — the dynamic linker can update the GOT without touching the read-only .text or .plt sections.

GOT Addressing: GOTPCREL

External variable accesses also go through the GOT:

extern int global_counter;
global_counter++;

Compiles to (with -fPIC):

; Load global_counter's address from its GOT entry
mov rax, QWORD PTR [rip + global_counter@GOTPCREL]  ; RAX = GOT[global_counter] = &global_counter
mov edx, DWORD PTR [rax]                            ; EDX = *(&global_counter) = global_counter
add edx, 1
mov DWORD PTR [rax], edx                            ; Write back

The chain of indirection: rip + offset → GOT[entry] → &global_counter → global_counter. The variable's address must be loaded from the GOT before the variable itself can be touched, which is one more memory load than a direct reference needs. This overhead is why -fPIC can be slightly slower than non-PIC code, though on modern hardware it is usually negligible.

24.4 Lazy Binding: Step-by-Step Trace

Let's trace printf("hello\n") on the first call, then the second call.

First Call

main:
    call printf@plt           ; (1) Jump to PLT stub

printf@plt:
    jmp [GOT+printf_offset]   ; (2) Indirect jump through GOT
    ; GOT entry = plt_stub+6 (points to next instruction)

    push 0                    ; (3) Push relocation index for printf
    jmp plt0                  ; (4) Jump to resolver

plt0:
    push [GOT+8]              ; (5) Push link_map pointer (GOT[1])
    jmp [GOT+16]              ; (6) Jump to _dl_runtime_resolve (GOT[2])

_dl_runtime_resolve:          ; (7) In ld-linux.so
    ; Find the symbol "printf" using the relocation index
    ; Look up "printf" in the symbol table of libc.so
    ; Write printf's actual address to GOT[printf_offset]
    ; Jump to printf                                         (8)

printf:                       ; (9) Actual printf in libc
    ; ... printf executes ...
    ret                       ; (10) Returns to main

After step 7, the GOT has been updated: GOT[printf_offset] now contains printf's actual address.

Second Call

main:
    call printf@plt           ; (1) Jump to PLT stub

printf@plt:
    jmp [GOT+printf_offset]   ; (2) Indirect jump through GOT
    ; GOT entry = printf (now updated!)
    ; → jumps directly to printf                             (3)

printf:                       ; (4) printf executes immediately
    ret

The second call costs one extra indirect jump (through PLT → GOT → printf) compared to a direct call. This overhead is typically 1-2 nanoseconds — negligible for any function that does real work.

24.5 Relocations for Dynamic Linking

RELA Entries in `.rela.plt`

Each PLT entry has a corresponding RELA relocation:

readelf -r /bin/cat
# Relocation section '.rela.plt' at offset 0x... contains 14 entries:
#   Offset          Info           Type           Sym. Value    Sym. Name + Addend
# 000000403fd8  000100000007 R_X86_64_JUMP_SLOT 0000000000000000 printf@GLIBC_2.2.5 + 0
# 000000403fe0  000200000007 R_X86_64_JUMP_SLOT 0000000000000000 malloc@GLIBC_2.2.5 + 0

R_X86_64_JUMP_SLOT: This relocation type says: "write the final address of this symbol into this GOT slot." The dynamic linker applies this during lazy resolution (or at startup with LD_BIND_NOW=1).

RELA Entries in `.rela.dyn`

For non-PLT relocations (global variables, copy relocations):

readelf -r /bin/cat
# Relocation section '.rela.dyn' at offset 0x... contains 3 entries:
#   Type                   Sym. Name
#   R_X86_64_COPY          stdin@@GLIBC_2.2.5
#   R_X86_64_COPY          stdout@@GLIBC_2.2.5

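R_X86_64_COPY: Used when a non-PIC executable refers directly to a global variable defined in a shared library. The linker reserves space for the variable in the executable's .bss, and at startup the dynamic linker copies the symbol's initial value from the shared library into that space. The library's own references are then bound to the executable's copy, so the executable and the library always access the same memory location.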

24.6 RELRO: Hardening the GOT

A writable GOT is a classic attack target. An attacker with an arbitrary-write primitive can overwrite GOT entries to redirect function calls:

Attacker: overwrite GOT[printf] with address of shellcode
Next call to printf() → executes shellcode instead

RELRO (RELocation Read-Only) mitigates this:

Partial RELRO (-Wl,-z,relro): After the dynamic linker completes startup relocations, the .got section is made read-only with mprotect. Only .got.plt (the lazy-binding entries) remains writable. Most distributions enable partial RELRO by default.

Full RELRO (-Wl,-z,relro -Wl,-z,now): Disables lazy binding (-z now = bind all symbols at startup). All GOT entries are resolved immediately, then the entire GOT including .got.plt is marked read-only. More startup overhead (all symbols resolved at once), but the GOT is completely read-only at runtime.

# Check RELRO status
checksec --file=/bin/ls
# RELRO:     Full RELRO

# Compile with full RELRO
gcc -fPIE -pie -Wl,-z,relro -Wl,-z,now main.c -o main_fullrelro

# Verify: .got.plt is not writable (merged into read-only segment)
readelf -l main_fullrelro | grep -A2 GNU_RELRO
# GNU_RELRO   ...  r--  0x1000
# This segment covers .got.plt — now read-only

24.7 LD_PRELOAD: Function Interposition

The dynamic linker's symbol resolution order can be exploited for interposition: replacing a library function with your own implementation.

Mechanism

When LD_PRELOAD=/path/to/mylib.so program is used, mylib.so is loaded before all other libraries. Symbols in mylib.so take precedence over identically-named symbols in libc. The preloaded library can call the original function via dlsym(RTLD_NEXT, "function_name").

Example: malloc Debugger

// malloc_debug.c
// Intercept malloc/free to track allocations
// Build: gcc -fPIC -shared -o malloc_debug.so malloc_debug.c -ldl
// Use:   LD_PRELOAD=./malloc_debug.so ./your_program

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <dlfcn.h>
#include <malloc.h>   // malloc_usable_size (glibc extension)

// Function pointers to the real malloc/free
static void *(*real_malloc)(size_t) = NULL;
static void  (*real_free)(void *)   = NULL;

// Statistics (plain counters: not thread-safe, adequate for a demo)
static size_t total_allocated = 0;
static size_t total_freed     = 0;
static size_t alloc_count     = 0;

// Set while we are logging, so that a malloc/free made by fprintf itself
// is counted but not logged (logging it would recurse)
static int in_hook = 0;

// Initialize: find the real malloc/free
static void init(void) __attribute__((constructor));
static void init(void) {
    if (real_malloc && real_free)
        return;                     // already resolved by an early malloc/free
    real_malloc = dlsym(RTLD_NEXT, "malloc");
    real_free   = dlsym(RTLD_NEXT, "free");
    if (!real_malloc || !real_free) {
        fprintf(stderr, "malloc_debug: failed to find real malloc/free\n");
        exit(1);
    }
}

// Print summary at exit
// Note: calloc/realloc are not interposed here, so the totals are approximate
static void summary(void) __attribute__((destructor));
static void summary(void) {
    fprintf(stderr, "\n=== malloc_debug summary ===\n");
    fprintf(stderr, "Allocations: %zu\n", alloc_count);
    fprintf(stderr, "Total allocated: %zu bytes\n", total_allocated);
    fprintf(stderr, "Total freed:     %zu bytes\n", total_freed);
    if (total_allocated > total_freed)
        fprintf(stderr, "LEAK: %zu bytes not freed!\n", total_allocated - total_freed);
    else
        fprintf(stderr, "No leaks detected.\n");
}

// Interposed malloc
void *malloc(size_t size) {
    if (!real_malloc)
        init();                     // malloc can be called before our constructor runs
    void *ptr = real_malloc(size);
    if (ptr) {
        // Count the usable size so that free() can subtract the same figure
        total_allocated += malloc_usable_size(ptr);
        alloc_count++;
        if (!in_hook) {
            in_hook = 1;
            fprintf(stderr, "malloc(%zu) = %p\n", size, ptr);
            in_hook = 0;
        }
    }
    return ptr;
}

// Interposed free
void free(void *ptr) {
    if (!real_free)
        init();
    if (ptr) {
        // The allocator's own bookkeeping gives us the block size,
        // so no extra header is needed before the returned pointer
        total_freed += malloc_usable_size(ptr);
        if (!in_hook) {
            in_hook = 1;
            fprintf(stderr, "free(%p)\n", ptr);
            in_hook = 0;
        }
    }
    real_free(ptr);
}

gcc -fPIC -shared -o malloc_debug.so malloc_debug.c -ldl
LD_PRELOAD=./malloc_debug.so ls /tmp
# malloc(1024) = 0x55a3b2c001a0
# malloc(64)   = 0x55a3b2c005b0
# ... (many allocations from libc internals)
# free(0x55a3b2c001a0)
# ...
# === malloc_debug summary ===
# Allocations: 47
# Total allocated: 32768 bytes
# Total freed:     32768 bytes
# No leaks detected.

Security Implication

The dynamic linker disables LD_PRELOAD for setuid programs: in secure-execution mode, arbitrary preload paths are ignored. Without that rule, an attacker who could inject LD_PRELOAD into a privileged process's environment could completely replace its library functions. For the same reason, an unexpected LD_PRELOAD setting is one of the first things security tools check.

24.8 dlopen/dlsym/dlclose: Runtime Dynamic Loading

LD_PRELOAD and linking work at load time. dlopen works at runtime — loading a shared library explicitly after the program has started.

API

#include <dlfcn.h>

// Load a shared library (or NULL = search in current process)
void *dlopen(const char *filename, int flag);
// RTLD_LAZY: resolve symbols lazily (like normal loading)
// RTLD_NOW:  resolve all symbols immediately (check for missing symbols)
// RTLD_GLOBAL: make symbols available to subsequently loaded libraries
// RTLD_LOCAL: symbols are not available to subsequently loaded libraries (default)

// Find a symbol in a loaded library
void *dlsym(void *handle, const char *symbol);
// RTLD_DEFAULT: search in default order (preloaded libs + executable + libraries)
// RTLD_NEXT: search in libraries loaded after the caller

// Decrement reference count; unload if count reaches 0
int dlclose(void *handle);

// Get error string (call immediately after failed dlopen/dlsym/dlclose)
char *dlerror(void);

Example: Plugin System

// plugin_host.c — Loads plugins at runtime
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    const char *name;
    void (*init)(void);
    void (*run)(const char *input);
    void (*cleanup)(void);
} Plugin;

int load_and_run_plugin(const char *path, const char *input) {
    // Load the plugin shared library
    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (!handle) {
        fprintf(stderr, "dlopen error: %s\n", dlerror());
        return -1;
    }

    // Find the plugin descriptor (a Plugin struct exported by the .so)
    Plugin *plugin = (Plugin *)dlsym(handle, "plugin_descriptor");
    if (!plugin) {
        fprintf(stderr, "dlsym error: %s\n", dlerror());
        dlclose(handle);
        return -1;
    }

    // Use the plugin
    printf("Loaded plugin: %s\n", plugin->name);
    plugin->init();
    plugin->run(input);
    plugin->cleanup();

    dlclose(handle);
    return 0;
}

int main(int argc, char *argv[]) {
    if (argc < 3) {
        fprintf(stderr, "Usage: %s plugin.so input_string\n", argv[0]);
        return 1;
    }
    return load_and_run_plugin(argv[1], argv[2]);
}

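// rot13_plugin.c: a plugin shared library
#include <stdio.h>
#include <string.h>
#include <ctype.h>

// Must match the Plugin definition in plugin_host.c.
// In a real project both files would include one shared header.
typedef struct {
    const char *name;
    void (*init)(void);
    void (*run)(const char *input);
    void (*cleanup)(void);
} Plugin;

static void rot13_init(void) {
    printf("[rot13] Plugin initialized\n");
}

static void rot13_run(const char *input) {
    printf("[rot13] Encoding: ");
    for (const char *p = input; *p; p++) {
        // <ctype.h> functions require a value representable as unsigned char
        unsigned char c = (unsigned char)*p;
        if (isalpha(c)) {
            unsigned char base = isupper(c) ? 'A' : 'a';
            c = (unsigned char)((c - base + 13) % 26 + base);
        }
        putchar(c);
    }
    putchar('\n');
}

static void rot13_cleanup(void) {
    printf("[rot13] Plugin cleaned up\n");
}

// The descriptor that plugin_host looks for
__attribute__((visibility("default")))
Plugin plugin_descriptor = {
    .name    = "ROT13 Encoder",
    .init    = rot13_init,
    .run     = rot13_run,
    .cleanup = rot13_cleanup,
};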

gcc -fPIC -shared -o rot13_plugin.so rot13_plugin.c
gcc -o plugin_host plugin_host.c -ldl
./plugin_host ./rot13_plugin.so "Hello, World!"
# Loaded plugin: ROT13 Encoder
# [rot13] Plugin initialized
# [rot13] Encoding: Uryyb, Jbeyq!
# [rot13] Plugin cleaned up

24.9 Weak Symbols and Symbol Versioning

Weak Symbols for Optional Dependencies

A weak symbol is a soft dependency: if the symbol is not defined anywhere (no library provides it), the weak reference resolves to NULL (or 0) instead of causing a link error.

#include <pthread.h>

// Optional function that may or may not be available
extern int __attribute__((weak)) pthread_mutex_lock(pthread_mutex_t *);

void safe_lock(pthread_mutex_t *m) {
    // Check if pthreads is linked (on glibc older than 2.34,
    // single-threaded programs won't have it)
    if (pthread_mutex_lock != NULL) {
        pthread_mutex_lock(m);
    }
}

glibc has used weak symbols extensively for optional features. Before glibc 2.34 merged libpthread into libc, for example, libc referenced several pthread functions weakly: programs that did not link -lpthread got a NULL pointer and paid no threading overhead. On glibc 2.34 and later these functions always live in libc, so the check above always succeeds there.

Symbol Versioning

glibc uses symbol versioning to maintain ABI compatibility while allowing function behavior to change:

nm -D /lib/x86_64-linux-gnu/libc.so.6 | grep -w memcpy
# ... memcpy@@GLIBC_2.14
# ... memcpy@GLIBC_2.2.5

On x86-64, where GLIBC_2.2.5 is the oldest version tag, programs linked against a glibc older than 2.14 reference memcpy@GLIBC_2.2.5; programs linked today reference memcpy@@GLIBC_2.14 (the @@ marks the default version, the one the linker picks for new links). Both symbols exist in the same library file. When a program is linked, the linker records which version of each symbol it used. At runtime, the dynamic linker provides exactly that version, even if the installed glibc is newer.

This is why you can run a binary compiled on Ubuntu 20.04 on Ubuntu 22.04 without recompiling: glibc maintains all old symbol versions.

Version Scripts

Shared library authors control symbol versioning with a version script:

/* version.map */
MYLIB_1.0 {
    global:
        mylib_open;
        mylib_close;
        mylib_read;
    local:
        *;          /* hide everything else */
};

MYLIB_2.0 {
    global:
        mylib_open_v2;   /* new version of mylib_open with different API */
} MYLIB_1.0;             /* MYLIB_2.0 inherits all MYLIB_1.0 symbols */

gcc -fPIC -shared -Wl,--version-script=version.map -o libmylib.so mylib.c

The local: *; in the version script hides all symbols not explicitly listed in global:. This reduces namespace pollution and prevents users from depending on internal implementation functions.

24.10 Library Search Path: DT_RPATH and DT_RUNPATH

When the dynamic linker loads a shared library, it searches in this order:

DT_RPATH (if set in the executable's .dynamic section and no DT_RUNPATH is present): deprecated, cannot be overridden by LD_LIBRARY_PATH
LD_LIBRARY_PATH environment variable (ignored for setuid programs, where it would be insecure)
DT_RUNPATH (if set): the recommended replacement for DT_RPATH
/etc/ld.so.cache (maintained by ldconfig)
/lib and /usr/lib (system directories)

# Embed an RPATH in the executable (finds libfoo relative to executable)
gcc -Wl,-rpath,'$ORIGIN/../lib' -o program main.c -L./lib -lfoo

# $ORIGIN expands to the directory containing the executable
# This enables relocatable installations without LD_LIBRARY_PATH

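# Check what RPATH is embedded
readelf -d program | grep -E "RPATH|RUNPATH"
# (RPATH)   Library rpath: [$ORIGIN/../lib]
# Linkers that default to --enable-new-dtags emit a (RUNPATH) entry instead;
# pass -Wl,--disable-new-dtags to request DT_RPATH explicitly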

$ORIGIN is expanded by the dynamic linker at runtime to the directory containing the executable (or the library that has the RPATH). This enables self-contained installations where the executable and its libraries can be moved together.

24.11 The `.dynamic` Section

Every dynamically-linked executable and shared library has a .dynamic section containing the information the dynamic linker needs. It is an array of (tag, value) pairs:

readelf -d /bin/ls | head -30
# Dynamic section at offset 0x... contains 28 entries:
#   Tag        Type          Name/Value
#  0x0000001 (NEEDED)       Shared library: [libselinux.so.1]
#  0x0000001 (NEEDED)       Shared library: [libc.so.6]
#  0x000000f (RPATH)        Library rpath: [/usr/lib/x86_64-linux-gnu]
#  0x000000c (INIT)         0x401000
#  0x000000d (FINI)         0x411234
#  0x0000019 (INIT_ARRAY)   0x413f90
#  0x000001b (INIT_ARRAYSZ) 8 (bytes)
#  0x000001a (FINI_ARRAY)   0x413f98
#  0x000001c (FINI_ARRAYSZ) 8 (bytes)
#  0x0000006 (SYMTAB)       0x4003b0
#  0x000000b (SYMENT)       24 (bytes)
#  0x0000005 (STRTAB)       0x400520
#  0x000000a (STRSZ)        235 (bytes)
#  0x0000007 (RELA)         0x4006b0
#  0x0000008 (RELASZ)       ...
#  0x0000017 (JMPREL)       0x400700  ← .rela.plt location
#  0x0000015 (DEBUG)        0x0
#  0x0000000 (NULL)         0x0       ← end of .dynamic

Key tags:

- DT_NEEDED: Required shared library (one entry per dependency)
- DT_INIT / DT_FINI: Constructor/destructor function addresses
- DT_INIT_ARRAY / DT_FINI_ARRAY: Arrays of constructor/destructor function pointers (C++ global constructors go here)
- DT_SYMTAB, DT_STRTAB: Dynamic symbol and string tables
- DT_JMPREL: .rela.plt location (PLT relocations for lazy binding)
- DT_RELA, DT_RELASZ: .rela.dyn location (non-lazy relocations)

24.12 Constructors and Destructors

DT_INIT_ARRAY contains function pointers called before main. DT_FINI_ARRAY contains function pointers called after main returns (during exit()).

// constructor.c
#include <stdio.h>

// Called before main()
void __attribute__((constructor)) my_init(void) {
    printf("Before main\n");
}

// Called after main() returns
void __attribute__((destructor)) my_fini(void) {
    printf("After main\n");
}

int main(void) {
    printf("In main\n");
    return 0;
}

gcc constructor.c -o constructor
./constructor
# Before main
# In main
# After main

Multiple constructors run in initialization order (determined by link order and priority). The linker places function pointers in the .init_array section, which becomes the DT_INIT_ARRAY. The C runtime (_start → __libc_start_main) iterates this array before calling main.

LD_PRELOAD libraries' constructors run before the main program's constructors — this is how malloc_debug.so's init() function (marked __attribute__((constructor))) can set up real_malloc before any user code calls malloc.

24.13 Security: GOT Overwrite and Countermeasures

Classic GOT Overwrite Attack

Attacker finds a heap overflow or format string vulnerability
Writes attacker-controlled data to GOT[exit] (or GOT[printf], etc.)
When the program calls exit(), the overwritten GOT entry redirects to shellcode

This attack was ubiquitous in the early 2000s and is why format string bugs (printf(user_input)) are so dangerous.

Modern Defenses

Defense	Mechanism	Limitation
Full RELRO	GOT marked read-only after startup	Startup overhead; no lazy binding
ASLR	Randomizes library addresses	Defeated by info leaks
PIE	Randomizes executable base	Requires ASLR to be effective
Stack canaries	Detects stack overflows	Does not protect GOT
Shadow stack (Intel CET)	Hardware-protected copy of return addresses	Needs CPU and kernel support; does not protect GOT
NX bit	Non-executable stack/heap	Defeated by ROP chains

The combination of Full RELRO + PIE + ASLR + NX + stack canaries is the modern baseline for Linux binary hardening. Check any production binary:

checksec --file=/bin/ls
# RELRO:     Full RELRO
# STACK CANARY: Canary found
# NX:        NX enabled
# PIE:       PIE enabled
# RPATH:     No RPATH
# RUNPATH:   No RUNPATH

24.14 Complete Worked Example: Tracing printf Through PLT and GOT

Let's trace printf("hello\n") at the machine instruction level, combining GDB with our knowledge of PLT/GOT.

// trace_demo.c
#include <stdio.h>
int main(void) {
    printf("hello\n");
    return 0;
}

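# -no-pie, -z lazy, and -fcf-protection=none reproduce the classic fixed-address, lazily bound PLT shown below;
# -fno-builtin stops GCC from turning this printf call into puts
gcc -g -no-pie -fno-builtin -fcf-protection=none -Wl,-z,lazy -o trace_demo trace_demo.c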
gdb trace_demo

(gdb) break main
(gdb) run
Breakpoint 1, main () at trace_demo.c:3

# Find the PLT entry for printf:
(gdb) disassemble main
# ...
# 0x0000000000401140 <+9>:  call   0x401030 <printf@plt>

(gdb) disassemble 'printf@plt'
# Dump of assembler code for function printf@plt:
# 0x0000000000401030 <+0>:  jmp    QWORD PTR [rip+0x2fd2]   # 0x404008 <printf@got.plt>
# 0x0000000000401036 <+6>:  push   0x0
# 0x000000000040103b <+11>: jmp    0x401020 <.plt>

# Before first call: what's in the GOT?
(gdb) x/xg 0x404008
# 0x404008 <printf@got.plt>:   0x0000000000401036
# → Points to push instruction in PLT (lazy binding not yet resolved)

# Step into the call one instruction at a time:
(gdb) stepi   # repeat until execution enters printf@plt
(gdb) stepi   # execute the jmp [GOT]
# Now at 0x401036 (the push instruction in PLT)

(gdb) stepi   # push 0x0
(gdb) stepi   # jmp .plt[0]
# Now at .plt[0] resolver stub

(gdb) stepi   # push [GOT+8] (link_map)
(gdb) stepi   # jmp [GOT+16] = _dl_runtime_resolve
# Now inside ld-linux.so's _dl_runtime_resolve

(gdb) break printf
(gdb) continue  # _dl_runtime_resolve patches the GOT, then jumps (it does not return) to printf

# Now check the GOT again:
(gdb) x/xg 0x404008
# 0x404008 <printf@got.plt>:   0x00007ffff7c60d10
# → Now points to actual printf in libc!

(gdb) info symbol 0x00007ffff7c60d10
# printf in section .text of /lib/x86_64-linux-gnu/libc.so.6

This GDB session confirms:

1. Before first call: GOT[printf] = PLT stub + 6 (back to push instruction)
2. During first call: PLT → PLT[0] → _dl_runtime_resolve → updates GOT → printf
3. After first call: GOT[printf] = actual printf address

Check Your Understanding

Why does lazy binding exist? What is the tradeoff vs. bind-at-startup?
What does LD_BIND_NOW=1 do?
What is the difference between RTLD_GLOBAL and RTLD_LOCAL in dlopen?
Why can LD_PRELOAD not be used with setuid programs?
What is the difference between DT_RPATH and DT_RUNPATH? Which is preferred?
Explain why Full RELRO prevents GOT overwrite attacks.
What is a WEAK symbol, and in what scenario is it useful?

Summary

Dynamic linking turns library calls into a two-level indirection: PLT stub → GOT → actual function. Lazy binding defers resolution until first call, improving startup time. The dynamic linker (ld-linux.so) populates the GOT at runtime, manages symbol versioning for ABI stability, runs constructors/destructors, and supports dlopen for runtime plugin systems. Security hardening (Full RELRO, PIE, ASLR) progressively closes the attack surface exposed by the writable GOT. LD_PRELOAD enables powerful diagnostics and instrumentation through function interposition. Understanding this machinery — visible in GDB, readelf, and ltrace — transforms dynamic linking from magic into a comprehensible, debuggable system.
