Chapter 23 Exercises: Linking, Loading, and ELF
Exercise 1: ELF Header Inspection
Using readelf -h on a compiled program on your system:
a) What is the e_type value? What does it mean (ET_EXEC vs. ET_DYN)?
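b) What is the e_machine value in decimal and hex? readelf -h prints the machine by name, so dump the two bytes at file offset 0x12 (for example xxd -s 0x12 -l 2 program) to see the raw value. Verify it matches EM_X86_64 (62 / 0x3E) or EM_AARCH64 (183 / 0xB7) depending on your system.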
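c) What is e_entry? Use nm program | grep -w _start to verify that this address belongs to the _start symbol (the -w keeps names such as __libc_start_main out of the output; nm needs an unstripped binary).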
d) Why does a PIE executable compiled with -fPIE -pie show ET_DYN instead of ET_EXEC?
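To cross-check the readelf -h output for parts (a) to (c), the same fields can be read straight from the file. The following is a minimal sketch, assuming a Linux system with <elf.h> and a file whose byte order matches the host; read_elf64_header and print_elf64_summary are hypothetical helper names, not part of any library.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads the ELF64 header of `path` into `out`.
 * Returns 0 on success, -1 if the file cannot be read or is not ELF64. */
int read_elf64_header(const char *path, Elf64_Ehdr *out)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    size_t got = fread(out, 1, sizeof *out, f);
    fclose(f);

    if (got != sizeof *out)
        return -1;                      /* file shorter than an ELF64 header */
    if (memcmp(out->e_ident, ELFMAG, SELFMAG) != 0)
        return -1;                      /* missing "\x7fELF" magic */
    if (out->e_ident[EI_CLASS] != ELFCLASS64)
        return -1;                      /* 32-bit ELF uses Elf32_Ehdr instead */
    return 0;
}

/* Prints the raw values of the three fields Exercise 1 asks about. */
int print_elf64_summary(const char *path)
{
    Elf64_Ehdr h;
    if (read_elf64_header(path, &h) != 0)
        return -1;

    printf("e_type    = %u\n", (unsigned)h.e_type);
    printf("e_machine = %u (0x%X)\n", (unsigned)h.e_machine, (unsigned)h.e_machine);
    printf("e_entry   = 0x%llx\n", (unsigned long long)h.e_entry);
    return 0;
}
```

Calling print_elf64_summary on your own program from a small main prints the numeric values that readelf -h shows only as names.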
Exercise 2: Section Sizes and .bss
Create a C file with:
int small_initialized = 42;
char big_uninitialized[1024 * 1024]; // 1 MB
Compile with gcc -c test.c -o test.o. Then:
a) Run size test.o. Report the .bss, .data, and .text sizes.
b) Run ls -l test.o. Report the actual file size in bytes.
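c) Explain the discrepancy: the .bss size shows 1 MB, but the file size is tiny. (GCC versions before 10 default to -fcommon, which emits big_uninitialized as a COMMON symbol that size may not count under .bss for test.o; if you see 0, recompile with -fno-common.)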
d) Now add char big_initialized[1024 * 1024] = {1};. Rerun size. What happened and why?
Exercise 3: Undefined References and Archives
Create three C files:
- a.c: defines int foo(void) { return 1; } and calls bar() (extern)
- b.c: defines int bar(void) { return 2; } and calls baz() (extern)
- c.c: defines int baz(void) { return 3; }
Create an archive: ar rcs libabc.a a.o b.o c.o
Then try:
gcc main.o -L. -labc -o program # where main.o calls foo(); -L. lets the linker find libabc.a in the current directory
a) Does this link succeed? Why?
b) Now try gcc -L. -labc main.o -o program (library before main.o). Does it fail? Explain why the order matters for archive linking.
c) When would you need -labc -lxyz -labc (listing a library twice)?
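For concreteness, the three source files might look like the sketch below. It is shown as one listing that also compiles as a single translation unit; split it at the marked comments into a.c, b.c, and c.c before building the archive. The function bodies follow the descriptions above, and the forward declarations are an addition needed so each file compiles on its own.

```c
/* ---- a.c ---- */
int bar(void);              /* defined in b.c */

int foo(void)
{
    bar();                  /* leaves an undefined reference to bar in a.o */
    return 1;
}

/* ---- b.c ---- */
int baz(void);              /* defined in c.c */

int bar(void)
{
    baz();                  /* leaves an undefined reference to baz in b.o */
    return 2;
}

/* ---- c.c ---- */
int baz(void)
{
    return 3;
}
```

A matching main.c only needs to declare int foo(void) and return foo() from main. Compile each file with gcc -c before running ar.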
Exercise 4: Symbol Visibility
// visibility_test.c
static int private_var = 1; // static = LOCAL binding
int public_var = 2; // global = GLOBAL binding
int __attribute__((weak)) weak_var = 3; // weak binding
static void private_func(void) {}
void public_func(void) {}
void __attribute__((weak)) weak_func(void) {}
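Compile to an object file without optimization (for example gcc -c -O0 visibility_test.c) and run readelf -s visibility_test.o. If private_var or private_func is missing because the compiler discarded it as unused, mark it __attribute__((used)). For each symbol, identify:
- Which have LOCAL binding? Which have GLOBAL? Which have WEAK?
- What Ndx value do they show? (Ndx is the index of the section that holds the symbol; look it up in the readelf -S output to see whether it is .text, .data, or another section. UND = undefined.)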
Then link two object files where both define weak_var. What value does the final program use? How does the linker choose?
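The linking experiment needs a second object file with its own weak definition. The sketch below is one hypothetical candidate: the file name weak_other.c, the value 30, and the accessor read_weak_var are all illustrative choices, not part of the exercise text.

```c
/* weak_other.c (hypothetical second translation unit) */

/* A second WEAK definition of the same symbol, with a different value
 * so the definition the linker keeps is easy to recognise. */
int __attribute__((weak)) weak_var = 30;

/* Accessor so a test program can print whichever definition survived. */
int read_weak_var(void)
{
    return weak_var;
}
```

Link it with visibility_test.o and a small main that prints read_weak_var(), then swap the order of the two object files on the command line and compare the results.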
Exercise 5: Relocation Types
Create:
// reloc_test.c
extern int ext_var;
extern void ext_func(void);
void demo(void) {
ext_func(); // Generates R_X86_64_PLT32 or R_X86_64_PC32
ext_var = 42; // Generates R_X86_64_PC32, or R_X86_64_GOTPCREL / R_X86_64_REX_GOTPCRELX (if PIC)
}
Compile twice:
gcc -c -O0 reloc_test.c -o reloc_no_pic.o
gcc -c -O0 -fPIC reloc_test.c -o reloc_pic.o
Run readelf -r on both. Compare:
a) What relocation type is used for ext_func() in the non-PIC version?
b) What relocation type is used for ext_var in the PIC version? Why is it different? (Recent toolchains may report R_X86_64_REX_GOTPCRELX, a relaxable variant of R_X86_64_GOTPCREL.)
c) What does R_X86_64_GOTPCREL mean physically: what does the linker do with it?
Exercise 6: Writing a Linker Script
Write a linker script for a bare-metal embedded system with:
- Flash memory (read-only): starts at 0x08000000, 512 KB
- RAM (read-write): starts at 0x20000000, 128 KB
Requirements:
- .text, .rodata go in flash
- .data (initialized variables) go in RAM at runtime, but are stored in flash (must be copied at startup)
- .bss goes in RAM (zeroed at startup)
- Define symbols _data_load (where .data is in flash), _data_start / _data_end (where it goes in RAM), _bss_start / _bss_end
After writing the script, write the ARM Cortex-M startup code (in C or assembly) that:
1. Copies .data from flash (_data_load) to RAM (_data_start to _data_end)
2. Zeros .bss (_bss_start to _bss_end)
Exercise 7: Segment Permissions
Run readelf -l /bin/cat (or any compiled program) and examine the PT_LOAD segments.
a) How many PT_LOAD segments are there?
b) What are the permissions (R, W, X) of each? Which contains .text? Which contains .data?
c) Why is the segment that contains .text not writable (R-X rather than RWX)? What security property does this enforce?
d) Some programs have a third PT_LOAD segment with just R permission. What sections would go there?
e) Look for the GNU_STACK program header. What permissions does it show (RW vs RWE)? What is the security implication of an executable stack?
Exercise 8: Static vs. Dynamic Binary Size
Create a minimal C program:
#include <stdio.h>
int main(void) { printf("hello\n"); return 0; }
Compile four ways:
gcc -O2 hello.c -o hello_dynamic # dynamic (default)
gcc -O2 -static hello.c -o hello_static # static link libc
gcc -O2 -s hello.c -o hello_stripped # stripped (symbol table and relocation info removed)
gcc -O2 -fPIE -pie hello.c -o hello_pie # PIE
a) Report the file size of each variant.
b) Run ldd on each. Which ones show "not a dynamic executable"?
c) Run nm -D hello_dynamic | wc -l and nm hello_static | wc -l. Compare the number of symbols.
d) Use size to compare section sizes between dynamic and static. Where does the extra code in the static binary come from?
Exercise 9: ASLR and PIE
a) Run the following command twice in a row and compare the addresses:
cat /proc/self/maps | grep "r-xp"
b) Compile a program with and without PIE:
gcc -no-pie hello.c -o hello_nopie
gcc -fPIE -pie hello.c -o hello_pie
Run each through readelf -h and note the e_type (ET_EXEC vs. ET_DYN).
c) For hello_nopie: run it twice and check if the .text address changes. For hello_pie: do the same. Explain the difference.
d) Verify with cat /proc/$(pidof hello_pie)/maps and cat /proc/$(pidof hello_nopie)/maps while each program is running (use a sleep loop in the program).
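The sleep loop mentioned in part (d) could be sketched as follows. This is one possible helper, not a required design: report_and_wait and landmark are hypothetical names, and the function assumes a POSIX system for getpid and sleep.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Any function in the program's own .text works as an address landmark. */
static void landmark(void) {}

/* Prints the PID and the runtime address of a function in .text, then
 * sleeps so /proc/<pid>/maps can be inspected from another terminal.
 * Returns the printed address so callers can compare runs. */
uintptr_t report_and_wait(unsigned seconds)
{
    uintptr_t addr = (uintptr_t)&landmark;

    printf("pid=%ld landmark=0x%" PRIxPTR "\n", (long)getpid(), addr);
    fflush(stdout);                 /* make the line visible before sleeping */

    while (seconds > 0)
        seconds = sleep(seconds);   /* sleep() returns the time left if interrupted */

    return addr;
}
```

Call report_and_wait(60) from main in hello.c, build it once with -no-pie and once with -fPIE -pie, and compare the printed address across several runs of each binary.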
Exercise 10: objdump — Before and After Linking
Compile demo.c (a simple function that calls printf) to both .o and linked executable.
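a) Run objdump -d demo.o and find the call instruction to printf. What address does it call? (The four displacement bytes after the e8 opcode should be 00 00 00 00, a placeholder that is meaningless until the linker patches it; objdump -dr shows the pending relocation next to the instruction.)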
b) Run objdump -d demo (the linked executable) and find the call instruction to printf. What address does it call now? Is it the actual printf function, or a PLT stub?
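c) Find the PLT entry for printf using objdump -d demo | grep -A4 "printf@plt". Describe at a high level what the instructions in the PLT stub do (classically three; toolchains that enable -fcf-protection may show an endbr64 instruction and a separate .plt.sec entry instead). Chapter 24 covers this in detail.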
d) Use readelf -r demo and compare to readelf -r demo.o. What happened to the R_X86_64_PLT32 (or R_X86_64_PC32) relocation for printf? Did it stay, or was it replaced by a different relocation type?
Exercise 11: Linker Map File
Compile a multi-file project and generate a linker map:
gcc -Wl,-Map=output.map main.c utils.c -o program
Open output.map and answer:
a) At what address does .text start? Where does it end?
b) Which .o file contributed to .text first? Why (what determines the order)?
c) Find the __libc_start_main symbol. What does this tell you about where main is called from?
d) Look for .bss entries. Do any symbols from your code appear there?
Exercise 12: Custom Section Attributes
Place a function in a custom named section:
void __attribute__((section(".fast_code"))) time_critical(void) {
// performance-sensitive code
}
a) Compile and run readelf -S. Does .fast_code appear as a section?
b) Write a linker script snippet that places .fast_code in a separate segment with R-X permissions, located at address 0x500000.
c) When would placing code in a custom section be useful? Give two real-world use cases (hint: think about firmware or Linux kernel modules).
Exercise 13: The _start Symbol
Examine the entry point of a compiled C program:
objdump -d program | grep -A 20 "<_start>:"
a) Identify the sequence of calls in _start. Which functions are called before main?
b) What arguments does _start pass to __libc_start_main?
c) Write a minimal _start in assembly that calls main and then calls the exit syscall directly (without libc):
section .text
global _start
_start:
; your code here
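d) Assemble and link without the C runtime. The skeleton above uses NASM syntax, which gcc cannot assemble directly, so run nasm -f elf64 _start.asm -o _start.o and then gcc -nostdlib -nostartfiles _start.o main.o -o minimal. Does it work, provided main calls no libc functions?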
Challenge Exercise: ELF File Parser
Write a C program that reads an ELF64 file and prints:
1. The ELF type (ET_REL/ET_EXEC/ET_DYN) as a string
2. The machine architecture (EM_X86_64/EM_AARCH64) as a string
3. The entry point address
4. A list of all sections: name, type, flags (A=alloc, W=write, X=exec), and size
5. A list of all GLOBAL symbols: name, binding, type, value (address), and size
Use mmap to map the file, then walk the ELF structures directly using Elf64_Ehdr, Elf64_Shdr, Elf64_Sym from <elf.h>.
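Test your parser on /bin/ls and verify that your output matches readelf -hSs /bin/ls. System binaries are usually stripped, so expect symbols only from .dynsym there; also test on one of your own unstripped programs to exercise .symtab.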
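As a possible starting point for the mmap step, the sketch below maps a file read-only and validates the ELF magic and class before handing back the header pointer. It is scaffolding only and leaves the section and symbol walks to you; map_elf64 is a hypothetical helper name, and the code assumes a Linux system with <elf.h>.

```c
#include <elf.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: maps `path` read-only and returns a pointer to its
 * ELF64 header, or NULL on failure. On success *len_out receives the
 * mapping length so the caller can munmap() it later. */
const Elf64_Ehdr *map_elf64(const char *path, size_t *len_out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || (size_t)st.st_size < sizeof(Elf64_Ehdr)) {
        close(fd);
        return NULL;
    }

    void *base = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (base == MAP_FAILED)
        return NULL;

    const Elf64_Ehdr *eh = base;
    if (memcmp(eh->e_ident, ELFMAG, SELFMAG) != 0 ||
        eh->e_ident[EI_CLASS] != ELFCLASS64) {
        munmap(base, (size_t)st.st_size);
        return NULL;                /* not an ELF64 file */
    }

    *len_out = (size_t)st.st_size;
    return eh;
}
```

From the returned pointer, the section header table starts e_shoff bytes into the mapping. Check every offset and size against the mapping length before dereferencing, since a truncated or malformed file can point outside the mapped region.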