Chapter 23 Exercises: Linking, Loading, and ELF


Exercise 1: ELF Header Inspection

Using readelf -h on a compiled program on your system:

a) What is the e_type value? What does it mean (ET_EXEC vs. ET_DYN)?

b) What is the e_machine value in decimal and hex? Verify it matches EM_X86_64 (62 / 0x3E) or EM_AARCH64 (183 / 0xB7) depending on your system.

c) What is e_entry? Use nm program | grep _start to verify this address is the _start symbol.

d) Why does a PIE executable compiled with -fPIE -pie show ET_DYN instead of ET_EXEC?
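
To cross-check what readelf -h prints for parts (a) and (b), the numeric header fields can be mapped to names in a few lines of C. This is a minimal sketch, assuming the Linux <elf.h> header; the helper names elf_type_name and elf_machine_name are hypothetical and only cover the values mentioned in this exercise.

```c
#include <elf.h>     /* ET_* and EM_* constants (Linux header) */
#include <stdint.h>

/* Map an e_type value to its symbolic name (hypothetical helper). */
const char *elf_type_name(uint16_t e_type)
{
    switch (e_type) {
    case ET_REL:  return "ET_REL";   /* relocatable object (.o) */
    case ET_EXEC: return "ET_EXEC";  /* executable linked at a fixed address */
    case ET_DYN:  return "ET_DYN";   /* shared object or PIE executable */
    case ET_CORE: return "ET_CORE";  /* core dump */
    default:      return "unknown";
    }
}

/* Map an e_machine value to its symbolic name (only two cases handled). */
const char *elf_machine_name(uint16_t e_machine)
{
    switch (e_machine) {
    case EM_X86_64:  return "EM_X86_64";   /* 62 / 0x3E */
    case EM_AARCH64: return "EM_AARCH64";  /* 183 / 0xB7 */
    default:         return "other";
    }
}
```

Passing the e_type and e_machine fields of an Elf64_Ehdr read from your program should reproduce the names readelf reports.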


Exercise 2: Section Sizes and .bss

Create a C file with:

int small_initialized = 42;
char big_uninitialized[1024 * 1024];  // 1 MB

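Compile with gcc -c test.c -o test.o (with GCC older than 10, add -fno-common so the uninitialized array is placed in .bss rather than emitted as a common symbol). Then: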

a) Run size test.o. Report the .bss, .data, and .text sizes.

b) Run ls -l test.o. Report the actual file size in bytes.

c) Explain the discrepancy: the .bss size shows 1 MB, but the file size is tiny.

d) Now add char big_initialized[1024 * 1024] = {1};. Rerun size. What happened and why?
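
One way to test your explanation for part (c) is to look at the section header type rather than the size alone. The sketch below assumes the Linux <elf.h> definitions; section_file_bytes is a hypothetical helper, not part of any library.

```c
#include <elf.h>
#include <stdint.h>

/* Number of bytes a section occupies in the file itself.
 * SHT_NOBITS sections (such as .bss) record a size in memory
 * but contribute no bytes on disk. */
uint64_t section_file_bytes(const Elf64_Shdr *sh)
{
    return sh->sh_type == SHT_NOBITS ? 0 : sh->sh_size;
}
```

Comparing this value with sh_size for .bss and for .data before and after part (d) should line up with what ls -l reports.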


Exercise 3: Undefined References and Archives

Create three C files:

- a.c: defines int foo(void) { return 1; } and calls bar() (extern)
- b.c: defines int bar(void) { return 2; } and calls baz() (extern)
- c.c: defines int baz(void) { return 3; }

Compile each file to an object (gcc -c a.c b.c c.c), then create an archive: ar rcs libabc.a a.o b.o c.o

Then try:

gcc main.o -L. -labc -o program     # where main.o calls foo(); -L. lets the linker find libabc.a in the current directory

a) Does this link succeed? Why?

b) Now try gcc -L. -labc main.o -o program (library before main.o). Does it fail? Explain why the order matters for archive linking.

c) When would you need -labc -lxyz -labc (listing a library twice)?


Exercise 4: Symbol Visibility

// visibility_test.c
static int private_var = 1;       // static = LOCAL binding
int public_var = 2;               // global = GLOBAL binding
int __attribute__((weak)) weak_var = 3;   // weak binding

static void private_func(void) {}
void public_func(void) {}
void __attribute__((weak)) weak_func(void) {}

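Compile to an object file and run readelf -s visibility_test.o. For each symbol, identify:

- Which have LOCAL binding? Which have GLOBAL? Which have WEAK?
- What Ndx value do they show? (Ndx is the index of the section that defines the symbol, so symbols in .text and .data show different numbers; check them against readelf -S. UND = undefined.)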

Then link two object files where both define weak_var. What value does the final program use? How does the linker choose?
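
The Bind column that readelf -s prints comes from the upper four bits of each symbol's st_info byte. The following is a small sketch of that decoding, assuming the Linux <elf.h> macros; symbol_bind_name is a hypothetical helper name.

```c
#include <elf.h>

/* Decode the binding stored in the high nibble of st_info. */
const char *symbol_bind_name(unsigned char st_info)
{
    switch (ELF64_ST_BIND(st_info)) {
    case STB_LOCAL:  return "LOCAL";   /* e.g. static variables and functions */
    case STB_GLOBAL: return "GLOBAL";  /* ordinary external definitions */
    case STB_WEAK:   return "WEAK";    /* __attribute__((weak)) definitions */
    default:         return "OTHER";
    }
}
```

Applied to the st_info field of each Elf64_Sym in visibility_test.o, it should agree with the readelf output you recorded.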


Exercise 5: Relocation Types

Create:

// reloc_test.c
extern int ext_var;
extern void ext_func(void);
void demo(void) {
    ext_func();          // Generates R_X86_64_PLT32 (R_X86_64_PC32 with older toolchains)
    ext_var = 42;        // Generates R_X86_64_PC32, or with PIC R_X86_64_GOTPCREL (newer binutils report R_X86_64_REX_GOTPCRELX)
}

Compile twice:

gcc -c -O0 reloc_test.c -o reloc_no_pic.o    # add -fno-pie for strictly non-PIC code if your GCC defaults to PIE
gcc -c -O0 -fPIC reloc_test.c -o reloc_pic.o

Run readelf -r on both. Compare:

a) What relocation type is used for ext_func() in the non-PIC version?

b) What relocation type is used for ext_var in the PIC version? Why is it different?

c) What does R_X86_64_GOTPCREL (or its REX_GOTPCRELX variant) mean in practice: what does the linker do with it?
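
The Type and Sym columns of readelf -r are both packed into the 64-bit r_info field of each Elf64_Rela entry. The sketch below shows one way to unpack them, assuming the Linux <elf.h> macros; the two helper names are hypothetical, and the fallback definitions are only there for older headers that lack the newer constants.

```c
#include <elf.h>
#include <stdint.h>

/* Newer relocation types may be missing from older <elf.h> versions. */
#ifndef R_X86_64_GOTPCRELX
#define R_X86_64_GOTPCRELX 41
#endif
#ifndef R_X86_64_REX_GOTPCRELX
#define R_X86_64_REX_GOTPCRELX 42
#endif

/* Low 32 bits of r_info: the relocation type. */
const char *x86_64_reloc_name(uint64_t r_info)
{
    switch (ELF64_R_TYPE(r_info)) {
    case R_X86_64_PC32:          return "R_X86_64_PC32";
    case R_X86_64_PLT32:         return "R_X86_64_PLT32";
    case R_X86_64_GOTPCREL:      return "R_X86_64_GOTPCREL";
    case R_X86_64_GOTPCRELX:     return "R_X86_64_GOTPCRELX";
    case R_X86_64_REX_GOTPCRELX: return "R_X86_64_REX_GOTPCRELX";
    default:                     return "other";
    }
}

/* High 32 bits of r_info: index into the symbol table. */
uint32_t reloc_symbol_index(uint64_t r_info)
{
    return (uint32_t)ELF64_R_SYM(r_info);
}
```

Only the relocation types named in this exercise are handled; anything else is reported as "other".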


Exercise 6: Writing a Linker Script

Write a linker script for a bare-metal embedded system with:

- Flash memory (read-only): starts at 0x08000000, 512 KB
- RAM (read-write): starts at 0x20000000, 128 KB

Requirements:

- .text, .rodata go in flash
- .data (initialized variables) go in RAM at runtime, but are stored in flash (must be copied at startup)
- .bss goes in RAM (zeroed at startup)
- Define symbols _data_load (where .data is in flash), _data_start / _data_end (where it goes in RAM), _bss_start / _bss_end

After writing the script, write the ARM Cortex-M startup code (in C or assembly) that:

1. Copies .data from flash (_data_load) to RAM (_data_start to _data_end)
2. Zeros .bss (_bss_start to _bss_end)
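
The two startup steps can be prototyped on a host machine before being wired to the linker symbols. The sketch below is only that: copy_data and zero_bss are hypothetical helper names, word-aligned regions are assumed, and connecting them to _data_load, _data_start, _data_end, _bss_start, and _bss_end in a reset handler is left as the exercise.

```c
#include <stdint.h>

/* Copy initialized data word by word from its load address (src)
 * to its runtime location [dst, dst_end). */
void copy_data(uint32_t *dst, const uint32_t *dst_end, const uint32_t *src)
{
    while (dst < dst_end)
        *dst++ = *src++;
}

/* Clear every word in [start, end). */
void zero_bss(uint32_t *start, const uint32_t *end)
{
    while (start < end)
        *start++ = 0;
}
```

Testing the loops on ordinary arrays first makes it easier to separate logic errors from linker script errors once the code runs on the target.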


Exercise 7: Segment Permissions

Run readelf -l /bin/cat (or any compiled program) and examine the PT_LOAD segments.

a) How many PT_LOAD segments are there?

b) What are the permissions (R, W, X) of each? Which contains .text? Which contains .data?

c) Why is the segment containing .text not writable (R-X rather than RWX)? What security property does this enforce?

d) Many programs have one or more additional PT_LOAD segments with only R permission. What sections would go there?

e) Look for the GNU_STACK program header. What permissions does it show (RW vs RWE)? What is the security implication of an executable stack?
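
The Flg column in readelf -l is a rendering of the p_flags bit mask in each program header. As a rough sketch, assuming the Linux <elf.h> constants, the mask can be decoded and checked for the writable-and-executable combination discussed in parts (c) and (e); both function names are hypothetical.

```c
#include <elf.h>
#include <stdint.h>

/* Render p_flags as a three-character permission string. */
const char *segment_perms(uint32_t p_flags)
{
    /* PF_X = 1, PF_W = 2, PF_R = 4, so the low three bits index the table. */
    static const char *const names[8] = {
        "---", "--X", "-W-", "-WX", "R--", "R-X", "RW-", "RWX"
    };
    return names[p_flags & (PF_R | PF_W | PF_X)];
}

/* Returns 1 if a segment is both writable and executable. */
int is_writable_and_executable(uint32_t p_flags)
{
    return (p_flags & PF_W) != 0 && (p_flags & PF_X) != 0;
}
```

Note that readelf itself prints E rather than X for the execute bit.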


Exercise 8: Static vs. Dynamic Binary Size

Create a minimal C program:

#include <stdio.h>
int main(void) { printf("hello\n"); return 0; }

Compile four ways:

gcc -O2 hello.c -o hello_dynamic     # dynamic (default)
gcc -O2 -static hello.c -o hello_static    # static link libc
gcc -O2 -s hello.c -o hello_stripped  # stripped (symbol table and debug info removed)
gcc -O2 -fPIE -pie hello.c -o hello_pie   # PIE

a) Report the file size of each variant.

b) Run ldd on each. Which ones show "not a dynamic executable"?

c) Run nm -D hello_dynamic | wc -l and nm hello_static | wc -l. Compare the number of symbols.

d) Use size to compare section sizes between dynamic and static. Where does the extra code in the static binary come from?


Exercise 9: ASLR and PIE

a) Run the following command twice in a row and compare the addresses:

cat /proc/self/maps | grep "r-xp"

b) Compile a program with and without PIE:

gcc -no-pie hello.c -o hello_nopie
gcc -fPIE -pie hello.c -o hello_pie

Run each through readelf -h and note the e_type (ET_EXEC vs. ET_DYN).

c) For hello_nopie: run it twice and check if the .text address changes. For hello_pie: do the same. Explain the difference.

d) Verify with cat /proc/$(pidof hello_pie)/maps vs. cat /proc/$(pidof hello_nopie)/maps while each program is running (use a sleep loop in the program).
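
Comparing maps output by eye gets tedious over several runs, so the relevant fields can be pulled out of each line programmatically. This is a minimal sketch assuming the usual "start-end perms offset dev inode path" layout of /proc/<pid>/maps lines; mapping_start and mapping_is_executable are hypothetical helper names.

```c
#include <stdlib.h>
#include <string.h>

/* Start address of a mapping: the hex number before the '-'. */
unsigned long mapping_start(const char *line)
{
    return strtoul(line, NULL, 16);   /* parsing stops at the '-' */
}

/* Returns 1 if the permission field (e.g. "r-xp") has the x bit set. */
int mapping_is_executable(const char *line)
{
    const char *sp = strchr(line, ' ');   /* perms follow the first space */
    return sp != NULL && strlen(sp) > 3 && sp[3] == 'x';
}
```

Recording mapping_start for the executable mapping across two runs of hello_nopie and hello_pie gives the comparison that part (c) asks for.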


Exercise 10: objdump — Before and After Linking

Compile demo.c (a simple function that calls printf) to both .o and linked executable.

a) Run objdump -d demo.o and find the call instruction to printf. What address does it call? (The 4-byte displacement should be 00 00 00 00, a placeholder that makes the call appear to target the next instruction until the relocation is applied.)

b) Run objdump -d demo (the linked executable) and find the call instruction to printf. What address does it call now? Is it the actual printf function, or a PLT stub?

c) Find the PLT entry for printf using objdump -d demo | grep -A4 "printf@plt". Describe what the instructions in the PLT stub do (classically three; toolchains that enable CET/IBT may show a different layout). A high-level answer is enough, since Chapter 24 covers this in detail.

d) Use readelf -r demo and compare to readelf -r demo.o. What happened to the R_X86_64_PLT32 (or, with older toolchains, R_X86_64_PC32) relocation for printf? Did it stay, or was it replaced by a different relocation type?
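
The addresses in parts (a) and (b) follow from simple arithmetic on the call displacement. The sketch below assumes the 5-byte E8 rel32 form of call and the standard S + A - P calculation for PC-relative 32-bit relocations; the function names and the addresses in the usage note are hypothetical.

```c
#include <stdint.h>

/* Target of a 5-byte "call rel32": the displacement is relative
 * to the address of the NEXT instruction. */
uint64_t call_target(uint64_t insn_addr, int32_t rel32)
{
    return insn_addr + 5 + (uint64_t)(int64_t)rel32;
}

/* Value stored in the displacement field for a PC-relative relocation:
 * S + A - P, where S is the symbol (or PLT entry) address, A the addend,
 * and P the address of the 4-byte field being patched. */
int32_t pc32_field(uint64_t S, int64_t A, uint64_t P)
{
    int64_t value = (int64_t)S + A - (int64_t)P;
    return (int32_t)value;   /* assumes the result fits in 32 bits */
}
```

For a call at 0x401136 whose displacement field sits at 0x401137, a PLT entry at 0x401030 and an addend of -4 give a stored displacement of -0x10b, and call_target(0x401136, -0x10b) returns 0x401030 again. With the unrelocated displacement of 0, the same function returns 0x40113b, the address of the next instruction.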


Exercise 11: Linker Map File

Compile a multi-file project and generate a linker map:

gcc -Wl,-Map=output.map main.c utils.c -o program

Open output.map and answer:

a) At what address does .text start? Where does it end?

b) Which .o file contributed to .text first? Why (what determines the order)?

c) Find the __libc_start_main symbol. What does this tell you about where main is called from?

d) Look for .bss entries. Do any symbols from your code appear there?


Exercise 12: Custom Section Attributes

Place a function in a custom named section:

void __attribute__((section(".fast_code"))) time_critical(void) {
    // performance-sensitive code
}

a) Compile and run readelf -S. Does .fast_code appear as a section?

b) Write a linker script snippet that places .fast_code in a separate segment with R-X permissions, located at address 0x500000.

c) When would placing code in a custom section be useful? Give two real-world use cases (hint: think about firmware or Linux kernel modules).


Exercise 13: The _start Symbol

Examine the entry point of a compiled C program:

objdump -d program | grep -A 20 "<_start>:"

a) Identify the sequence of calls in _start. Which functions are called before main?

b) What arguments does _start pass to __libc_start_main?

c) Write a minimal _start in assembly that calls main and then calls the exit syscall directly (without libc):

section .text
global _start
_start:
    ; your code here

d) The skeleton above is NASM syntax, so assemble it with nasm -f elf64 _start.s -o _start.o, then link without the C runtime: gcc -nostdlib -nostartfiles _start.o main.o -o minimal. Does it work?


Challenge Exercise: ELF File Parser

Write a C program that reads an ELF64 file and prints:

1. The ELF type (ET_REL/ET_EXEC/ET_DYN) as a string
2. The machine architecture (EM_X86_64/EM_AARCH64) as a string
3. The entry point address
4. A list of all sections: name, type, flags (A=alloc, W=write, X=exec), and size
5. A list of all GLOBAL symbols: name, binding, type, value (address), and size

Use mmap to map the file, then walk the ELF structures directly using Elf64_Ehdr, Elf64_Shdr, Elf64_Sym from <elf.h>.

Test your parser on /bin/ls and verify your output matches readelf -hSs /bin/ls.
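
A parser that walks raw structures should reject anything that is not an ELF64 file before casting the mapping to Elf64_Ehdr. One possible first check is sketched below, assuming the Linux <elf.h> constants; looks_like_elf64 is a hypothetical helper name, and a complete parser would also need to validate offsets and sizes against the file length.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if buf begins with a plausible ELF64 identification block. */
int looks_like_elf64(const unsigned char *buf, size_t len)
{
    if (len < EI_NIDENT)
        return 0;                         /* too short to hold e_ident */
    if (memcmp(buf, ELFMAG, SELFMAG) != 0)
        return 0;                         /* magic is not 0x7f 'E' 'L' 'F' */
    return buf[EI_CLASS] == ELFCLASS64;   /* reject 32-bit files */
}
```

Calling it on the mmap'd pointer and the file size from fstat lets the program exit with a clear error message instead of reading garbage.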