Chapter 24 Exercises: Dynamic Linking in Depth

Exercise 1: Observing Lazy Binding with GDB

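Compile the following program and trace it in GDB to observe lazy binding. If your toolchain enables immediate binding by default (some distributions do), add -Wl,-z,lazy to the compile command so that the PLT stubs resolve lazily: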

#include <stdio.h>
#include <string.h>

int main(void) {
    char buf[64];
    printf("Enter text: ");
    fgets(buf, sizeof(buf), stdin);
    size_t len = strlen(buf);
    printf("Length: %zu\n", len);
    return 0;
}

a) Set a breakpoint at main and run the program. Before the first printf call, use x/xg to read the GOT entry for printf. What does it contain?

b) Single-step through the first printf@plt call until _dl_runtime_resolve returns. Re-read the GOT entry. What changed?

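c) Continue to the second printf call. Does it still go through _dl_runtime_resolve? Verify with GDB by setting a breakpoint on _dl_runtime_resolve and checking whether it is hit. (On recent x86-64 glibc the resolver may carry a suffix such as _dl_runtime_resolve_xsave; rbreak ^_dl_runtime_resolve covers the variants.)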

d) How does the value stored in the GOT entry differ before and after resolution? How many PLT stubs does this binary contain?

Exercise 2: Disabling Lazy Binding

a) Compile the program from Exercise 1 and run it with LD_BIND_NOW=1:

LD_BIND_NOW=1 ./program

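Using GDB, verify that _dl_runtime_resolve is not called while the program runs: set the variable inside GDB with set environment LD_BIND_NOW 1, put a breakpoint on the resolver, and check that it is never hit. (ltrace shows the library calls themselves, but not the resolver.)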

b) Compare startup time for a larger program (e.g., /usr/bin/python3) with and without LD_BIND_NOW:

time LD_BIND_NOW=1 python3 -c "pass"
time python3 -c "pass"

Is there a measurable difference? Why does the trade-off differ between a short-running program and a long-running server?

c) What is the relationship between LD_BIND_NOW=1 and the -z now linker flag? Which is more appropriate for a production security-hardened binary?

Exercise 3: Writing an LD_PRELOAD Interposer

Write an LD_PRELOAD library that intercepts all fopen calls and logs:

- The filename being opened
- The mode string
- The result (success or failure)
- A timestamp in microseconds (use clock_gettime(CLOCK_MONOTONIC))

// fopen_trace.c
// Build: gcc -fPIC -shared -o fopen_trace.so fopen_trace.c -ldl
// Use:   LD_PRELOAD=./fopen_trace.so ls /tmp

Requirements:

- Use dlsym(RTLD_NEXT, "fopen") to get the real fopen
- Output to stderr (so it does not interfere with stdout output)
- Handle the case where fopen is called before your constructor runs (during dynamic linker initialization)

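Test with: LD_PRELOAD=./fopen_trace.so cat /etc/hostname. If nothing is logged, the target program probably opens files with open() rather than fopen(); in that case, test against a small program of your own that calls fopen.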
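
The requirement about calls that arrive before the constructor runs is usually met by resolving the real function lazily inside the wrapper itself. The following is a minimal sketch of that pattern, assuming glibc on Linux; the helper name now_usec, the typedef fopen_fn, and the log format are illustrative choices, not part of the exercise.

```c
#define _GNU_SOURCE             /* exposes RTLD_NEXT in <dlfcn.h> on glibc */
#include <dlfcn.h>
#include <errno.h>
#include <stdio.h>
#include <time.h>

typedef FILE *(*fopen_fn)(const char *, const char *);

/* Resolved on first use, so the wrapper works even if it is called
 * before any constructor in this library has run. */
static fopen_fn real_fopen;

/* Monotonic timestamp in microseconds (hypothetical helper). */
static long long now_usec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000000LL + ts.tv_nsec / 1000;
}

/* Same name as the libc function: with LD_PRELOAD this definition is
 * found first, and RTLD_NEXT locates the next one in search order. */
FILE *fopen(const char *path, const char *mode)
{
    if (real_fopen == NULL) {
        real_fopen = (fopen_fn)dlsym(RTLD_NEXT, "fopen");
        if (real_fopen == NULL)
            return NULL;        /* no underlying fopen could be found */
    }

    FILE *fp = real_fopen(path, mode);
    int saved_errno = errno;    /* logging must not disturb errno */

    fprintf(stderr, "[%lld us] fopen(\"%s\", \"%s\") = %s\n",
            now_usec(), path, mode, fp != NULL ? "ok" : "FAILED");

    errno = saved_errno;
    return fp;
}
```

The sketch logs with fprintf on stderr, which does not itself call fopen, so the wrapper cannot recurse into itself.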

Exercise 4: dlopen Plugin Architecture

Extend the plugin system from the chapter to support:

a) A plugin that implements a word-count function:

// wc_plugin.c
// Implements: int word_count(const char *text);

b) A plugin that implements ROT13 encoding (from the chapter example).

c) A plugin loader that:

- Scans a plugins/ directory for all .so files
- Loads each with dlopen(RTLD_NOW | RTLD_LOCAL)
- Calls plugin_info() (returns const char *name) on each
- Lists available plugins
- Accepts a plugin name and input string as arguments
- Runs the selected plugin

Requirements: Handle dlerror() correctly; call dlclose() on all handles at exit.
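
As a starting point for part (a), a word-count plugin could look like the sketch below. It assumes that words are separated by whitespace as classified by isspace and that plugin_info takes no arguments; the returned name "wc" is an arbitrary choice.

```c
/* wc_plugin.c: hypothetical word-count plugin.
 * Build (assumed): gcc -fPIC -shared -o plugins/wc_plugin.so wc_plugin.c */
#include <ctype.h>
#include <stddef.h>

/* Name reported to the loader; "wc" is an illustrative choice. */
const char *plugin_info(void)
{
    return "wc";
}

/* Counts maximal runs of non-whitespace characters. */
int word_count(const char *text)
{
    int count = 0;
    int in_word = 0;

    if (text == NULL)
        return 0;

    /* unsigned char avoids undefined behavior in isspace for bytes >= 0x80 */
    for (const unsigned char *p = (const unsigned char *)text; *p != '\0'; ++p) {
        if (isspace(*p)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            ++count;
        }
    }
    return count;
}
```

A loader would obtain both functions with dlsym(handle, "plugin_info") and dlsym(handle, "word_count") after a successful dlopen.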

Exercise 5: Symbol Visibility

Create a shared library with both public and private symbols:

// mylib.c
// Public API:
int mylib_public_add(int a, int b);
int mylib_public_mul(int a, int b);

// Private implementation details:
static int helper_1(int x);        // C static — implicit hidden
int __attribute__((visibility("hidden"))) helper_2(int x);  // explicit hidden
int implementation_detail(int x);  // hidden via version script

a) Build without a version script. Use nm -D libmylib.so to list exported symbols. How many symbols are exported?

b) Build with __attribute__((visibility("default"))) on only mylib_public_add and mylib_public_mul, and __attribute__((visibility("hidden"))) on all helpers. Rebuild and recount exported symbols.

c) Build with a version script that explicitly lists only the two public functions. What is the minimum version script to achieve this?

d) Why does hiding symbols improve shared library performance? (Hint: think about how the PLT handles calls to hidden symbols vs. visible symbols within the same library.)
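
For part (b), one common arrangement is to wrap the visibility attributes in macros and compile with -fvisibility=hidden so that unannotated symbols default to hidden. The sketch below assumes GCC or Clang; the macro names MYLIB_API and MYLIB_LOCAL and the helper bodies are made up for illustration.

```c
/* mylib.c: hypothetical layout for the visibility exercise.
 * Build (assumed): gcc -fPIC -shared -fvisibility=hidden -o libmylib.so mylib.c */
#define MYLIB_API   __attribute__((visibility("default")))
#define MYLIB_LOCAL __attribute__((visibility("hidden")))

/* Internal linkage: never enters the dynamic symbol table. */
static int helper_1(int x) { return x + 1; }

/* External linkage inside the library, but not exported from it. */
MYLIB_LOCAL int helper_2(int x) { return x * 2; }

/* Exported API. Calls to the helpers bind directly, without a PLT stub. */
MYLIB_API int mylib_public_add(int a, int b)
{
    return helper_1(a + b) - 1;
}

MYLIB_API int mylib_public_mul(int a, int b)
{
    return helper_2(a * b) / 2;
}
```

Running nm -D on the resulting library should then list the two mylib_public_ functions but neither helper.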

Exercise 6: RELRO Investigation

a) Compile a program three ways:

gcc -no-pie -Wl,-z,norelro main.c -o main_norelro
gcc -fPIE -pie -Wl,-z,relro,-z,lazy main.c -o main_partialrelro
gcc -fPIE -pie -Wl,-z,relro,-z,now main.c -o main_fullrelro

Run checksec on each binary (checksec --file= followed by the binary's name). What RELRO status does each show?

b) For each binary, use readelf -l to find the GNU_RELRO program header. What virtual address range does it cover? What sections are in that range?

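c) For main_partialrelro, use GDB to find the printf entry in the GOT (.got.plt) before the first printf call. Check the page permissions in /proc/PID/maps (recent GDB versions also show them in info proc mappings), then attempt a write with set {long}address = 0. Does it succeed?

d) For main_fullrelro, repeat (c) once startup has completed. Is the GOT page still mapped writable? Does set {long}address = 0 succeed? Keep in mind that GDB writes through ptrace, which can bypass page protections, so compare the debugger's result with what the mapping permissions imply for a store executed by the program itself.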
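
Page permissions can also be checked from inside the process, independent of the debugger. The sketch below looks up an address in /proc/self/maps and reports whether its mapping is writable. It assumes Linux; the function name addr_is_writable is hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Returns 1 if addr lies in a writable mapping, 0 if it is mapped but
 * not writable, and -1 if no mapping was found or /proc is unavailable. */
int addr_is_writable(const void *addr)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return -1;

    unsigned long target = (unsigned long)(uintptr_t)addr;
    char line[1024];
    int result = -1;

    while (fgets(line, sizeof line, maps) != NULL) {
        unsigned long start, end;
        char perms[5];

        /* Each line starts with "start-end perms", e.g. "7f..-7f.. r--p". */
        if (sscanf(line, "%lx-%lx %4s", &start, &end, perms) != 3)
            continue;

        if (target >= start && target < end) {
            result = (perms[1] == 'w');
            break;
        }
    }

    fclose(maps);
    return result;
}
```

Called from main with the runtime address of a GOT entry, it gives a second opinion on whether RELRO has made that page read-only.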

Exercise 7: Weak Symbols for Optional Threading

Write a library that uses weak symbols to optionally support multithreading:

// thread_safe_counter.c
#include <stdint.h>

// Weak references to pthread functions
extern int __attribute__((weak)) pthread_mutex_init(void *, void *);
extern int __attribute__((weak)) pthread_mutex_lock(void *);
extern int __attribute__((weak)) pthread_mutex_unlock(void *);

// Counter that uses mutex if pthreads is available
typedef struct {
    volatile uint64_t value;
    unsigned char mutex[40];   // pthread_mutex_t is 40 bytes on x86-64 glibc
    int has_threading;
} SafeCounter;

void safe_counter_init(SafeCounter *c);
void safe_counter_increment(SafeCounter *c);
uint64_t safe_counter_get(SafeCounter *c);

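Implement these functions. When pthread_mutex_init is NULL (pthreads not linked), operate without locking. When it is non-NULL, use a real mutex. Note that on glibc 2.34 and later the pthread functions are part of libc itself, so the weak references resolve to non-NULL even without -lpthread; on such systems both test builds below take the mutex path.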

Test both cases:

gcc -c thread_safe_counter.c -o counter.o
gcc counter.o main.c -o test_single      # no -lpthread
gcc counter.o main.c -lpthread -o test_multi
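
The core of the exercise is testing a weak reference against NULL before calling through it. A minimal sketch follows, assuming GCC or Clang on Linux. It deliberately deviates from the struct above in one respect: the mutex buffer is enlarged to 64 bytes so that the sketch does not depend on the x86-64 size of pthread_mutex_t.

```c
#include <stddef.h>
#include <stdint.h>

/* Weak references: each resolves to NULL if no definition is linked in. */
extern int __attribute__((weak)) pthread_mutex_init(void *, void *);
extern int __attribute__((weak)) pthread_mutex_lock(void *);
extern int __attribute__((weak)) pthread_mutex_unlock(void *);

typedef struct {
    volatile uint64_t value;
    unsigned char mutex[64];   /* assumption: oversized on purpose, see lead-in */
    int has_threading;
} SafeCounter;

void safe_counter_init(SafeCounter *c)
{
    c->value = 0;
    /* Only use the mutex path if all three functions are present. */
    c->has_threading = pthread_mutex_init != NULL &&
                       pthread_mutex_lock != NULL &&
                       pthread_mutex_unlock != NULL;
    /* Fall back to the unlocked path if initialization fails. */
    if (c->has_threading && pthread_mutex_init(c->mutex, NULL) != 0)
        c->has_threading = 0;
}

void safe_counter_increment(SafeCounter *c)
{
    if (c->has_threading)
        pthread_mutex_lock(c->mutex);
    c->value++;
    if (c->has_threading)
        pthread_mutex_unlock(c->mutex);
}

uint64_t safe_counter_get(SafeCounter *c)
{
    uint64_t v;
    if (c->has_threading)
        pthread_mutex_lock(c->mutex);
    v = c->value;
    if (c->has_threading)
        pthread_mutex_unlock(c->mutex);
    return v;
}
```

The file must not include <pthread.h>, because the void * prototypes used here conflict with the real declarations.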

Exercise 8: Symbol Versioning

Create a shared library libcalc with two versions of a function:

// calc_v1.c — version 1.0 API
int calc_add(int a, int b) { return a + b; }   // CALC_1.0

// calc_v2.c — version 2.0 API (extended)
int calc_add_v2(int a, int b, int carry) { return a + b + carry; }  // CALC_2.0

// The old calc_add still exists for backward compatibility (aliased to calc_add_v2 with carry=0)

Write a version script calc.map that:

- Exports calc_add under version CALC_1.0
- Exports calc_add under version CALC_2.0 (with new behavior) for newly linked programs
- Keeps both versions in the library

Then:

a) Build a program old_program that links against calc_add@CALC_1.0.

b) Update the library to change calc_add's behavior (the new version returns a + b + 100).

c) Verify that old_program still gets the old behavior (the CALC_1.0 version).

d) Compile new_program and verify that it gets the new behavior (the CALC_2.0 version).

Exercise 9: Runtime Library Path

a) Create a small shared library libgreeting.so with a function const char *greet(void) that returns "Hello!". Install it to $HOME/mylibs/.

b) Compile a program that uses it:

gcc main.c -L$HOME/mylibs -lgreeting -o greet_program
./greet_program   # Fails: cannot find libgreeting.so

c) Fix the runtime path three different ways:

1. LD_LIBRARY_PATH=$HOME/mylibs ./greet_program
2. Embed RPATH: gcc main.c -L$HOME/mylibs -lgreeting -Wl,-rpath,$HOME/mylibs -o greet_rpath
3. Embed $ORIGIN RPATH: gcc main.c -L. -lgreeting -Wl,-rpath,'$ORIGIN' -o greet_origin (after copying libgreeting.so to the same directory as the binary)

d) Use readelf -d on the binaries from methods 2 and 3 to verify the embedded path. Depending on linker defaults, it appears as either an RPATH or a RUNPATH entry.

e) Which method is most suitable for distributable software? Why?

Exercise 10: ltrace — Tracing Library Calls

ltrace intercepts and logs shared library function calls (analogous to strace for syscalls).

a) Run: ltrace /bin/cat /etc/hostname 2>&1 | head -30. Identify the sequence of library calls: which function opens the file? Which reads it? Which writes to stdout?

b) Use ltrace -e malloc+free+realloc ./your_program to trace memory allocation in a program that uses dynamic data structures. How many allocations occur for a small program?

c) Write a program that calls printf in a loop 10 times. Use ltrace to verify that each call is recorded. Note that _dl_runtime_resolve never appears in the trace, not even around the first call. Why?

d) Compare ltrace output with strace output for the same program. What is the relationship between library calls and system calls? (Hint: printf eventually calls write(1, ...).)

Exercise 11: Constructors and Destructors

a) Write a shared library with initialization and cleanup code using constructors/destructors:

// mylib.c
void __attribute__((constructor(200))) lib_early_init(void);   // priority 200
void __attribute__((constructor(300))) lib_late_init(void);    // priority 300
void __attribute__((destructor(200))) lib_late_cleanup(void);  // priority 200
void __attribute__((destructor(300))) lib_early_cleanup(void); // priority 300

b) Add constructors with the same priorities in the main program. In what order do they all run? (Constructors run in ascending priority order; destructors run in reverse.)

c) Explain why LD_PRELOAD libraries' constructors run before the main executable's constructors. In what scenario does this ordering matter?

d) Find the DT_INIT_ARRAY entry in your compiled library using readelf -d. Verify it contains pointers to your constructor functions using readelf -x .init_array.
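
To observe the ordering rather than infer it, each constructor can append a tag to a static buffer that is inspected later. The sketch below assumes GCC or Clang; the buffer, the record helper, and constructor_order are illustrative names, and the functions are made static so that they do not clash with other definitions.

```c
#include <stdio.h>
#include <string.h>

/* Zero-initialized before any constructor runs. */
static char ctor_log[32];

/* Appends a tag without overflowing the buffer. */
static void record(const char *tag)
{
    strncat(ctor_log, tag, sizeof ctor_log - strlen(ctor_log) - 1);
}

/* Lower priority numbers run first for constructors. */
__attribute__((constructor(200))) static void lib_early_init(void) { record("200 "); }
__attribute__((constructor(300))) static void lib_late_init(void)  { record("300 "); }

/* Destructors run in the opposite order: 300 before 200. */
__attribute__((destructor(300))) static void lib_early_cleanup(void)
{
    fputs("cleanup 300\n", stderr);
}

__attribute__((destructor(200))) static void lib_late_cleanup(void)
{
    fputs("cleanup 200\n", stderr);
}

/* Lets the main program inspect the order in which constructors ran. */
const char *constructor_order(void)
{
    return ctor_log;
}
```

By the time main starts, constructor_order() returns "200 300 "; the two cleanup lines appear on stderr at exit.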

Challenge Exercise: Implement a Minimal Dynamic Linker

Write a C program that acts as a minimal ELF loader/dynamic linker. It should:

1. Accept an ELF path as argument
2. Parse the ELF header to find PT_LOAD segments
3. Use mmap to load each segment at the specified virtual address
4. Parse the .dynamic section to find DT_NEEDED libraries
5. For each needed library, find it in standard search paths and mmap it
6. Process R_X86_64_JUMP_SLOT relocations to connect the program's GOT to the library's functions
7. Jump to the program's entry point

You do not need to implement lazy binding (use eager resolution). You do not need to handle C++ exceptions or thread-local storage. The target programs should be simple C programs that use only a few libc functions.
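
The first parsing step can be sketched as follows: read the ELF header, validate it, and walk the program header table looking for PT_LOAD entries. This assumes a 64-bit ELF file on Linux with <elf.h> available; the function name count_pt_load is hypothetical, and a real loader would mmap each segment instead of printing it.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Prints each PT_LOAD segment of a 64-bit ELF file and returns how many
 * were found, or -1 if the file cannot be read or is not ELF64. */
int count_pt_load(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    Elf64_Ehdr eh;
    int count = -1;

    if (fread(&eh, sizeof eh, 1, f) == 1 &&
        memcmp(eh.e_ident, ELFMAG, SELFMAG) == 0 &&
        eh.e_ident[EI_CLASS] == ELFCLASS64 &&
        eh.e_phentsize == sizeof(Elf64_Phdr)) {

        count = 0;
        for (unsigned i = 0; i < eh.e_phnum; ++i) {
            Elf64_Phdr ph;
            long off = (long)(eh.e_phoff + (Elf64_Off)i * eh.e_phentsize);

            if (fseek(f, off, SEEK_SET) != 0 ||
                fread(&ph, sizeof ph, 1, f) != 1) {
                count = -1;     /* truncated or corrupt header table */
                break;
            }
            if (ph.p_type == PT_LOAD) {
                ++count;
                printf("PT_LOAD vaddr=%#lx filesz=%#lx memsz=%#lx flags=%#x\n",
                       (unsigned long)ph.p_vaddr, (unsigned long)ph.p_filesz,
                       (unsigned long)ph.p_memsz, (unsigned)ph.p_flags);
            }
        }
    }

    fclose(f);
    return count;
}
```

The p_flags bits (PF_R, PF_W, PF_X) map onto the PROT_ flags for mmap, and the gap between p_filesz and p_memsz is the zero-filled .bss region.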

This is a significant project (~500 lines of C), but understanding each step solidifies everything in Chapters 23 and 24.