Chapter 21 Exercises: Understanding Compiler Output
Exercise 1: AT&T to Intel Syntax Translation
Translate each AT&T assembly instruction to Intel (NASM) syntax:
a) movq %rbx, %rax
b) addl $42, %eax
c) movl (%rbx), %eax
d) movq -16(%rbp), %rax
e) leaq 8(%rax,%rcx,4), %rdx
f) cmpq $0, -8(%rbp)
g) imulq %rcx
h) movb %al, (%rdi,%rsi)
i) sarq $3, %rax
j) cmovneq %rbx, %rax
Exercise 2: Intel to AT&T Syntax Translation
Translate each Intel (NASM) instruction to AT&T syntax:
a) mov rax, rbx
b) add rax, 100
c) mov qword [rbp-8], rax
d) lea rdx, [rax + rcx*8 + 16]
e) imul eax, ecx, 7
f) movzx eax, byte [rsi]
g) cmp rdi, rsi
h) jne .loop
Exercise 3: Predict Compiler Output
For each C function, predict what GCC -O2 (x86-64) assembly will look like. Then verify using Compiler Explorer.
a)
int double_it(int x) { return x * 2; }
b)
int max(int a, int b) { return a > b ? a : b; }
c)
int is_even(int x) { return x % 2 == 0; }
d)
void swap(int *a, int *b) {
    int t = *a; *a = *b; *b = t;
}
e)
int64_t collatz_steps(int64_t n) {
    int64_t steps = 0;
    while (n != 1) {
        if (n % 2 == 0) n /= 2;
        else n = 3*n + 1;
        steps++;
    }
    return steps;
}
Exercise 4: Identify Optimization Patterns
The following are GCC -O2 outputs for simple C functions. Identify which optimization was applied and write the original C code:
a)
foo:
    leal (%rdi,%rdi,8), %eax
    ret
b)
bar:
    movl $1, %eax
    testl %edi, %edi
    jne .L2
    xorl %eax, %eax
.L2:
    ret
c)
baz:
    movl %edi, %eax
    negl %eax
    testl %edi, %edi
    cmovns %edi, %eax
    ret
d)
qux:
    movl %edi, %eax
    movl $1717986919, %edx
    imull %edx
    sarl $2, %edx
    sarl $31, %edi
    subl %edi, %edx
    movl %edx, %eax
    ret
(Hint: this implements integer division by a constant. Which constant?)
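One way to test a guess for part (d) is to transcribe the instruction sequence into C and compare it against ordinary division. The sketch below does only that. The names qux_model and matches_divisor are hypothetical, and the code assumes that right-shifting a negative signed integer is an arithmetic shift (true for GCC and Clang, but implementation-defined in C).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C model of the qux listing, one statement per instruction.
   Assumes >> on negative signed values is an arithmetic shift. */
int32_t qux_model(int32_t x) {
    int64_t prod = (int64_t)x * 1717986919LL; /* imull: 64-bit signed product */
    int32_t hi = (int32_t)(prod >> 32);       /* %edx holds the high 32 bits */
    hi >>= 2;                                 /* sarl $2, %edx */
    int32_t sign = x >> 31;                   /* sarl $31, %edi: -1 if x < 0, else 0 */
    return hi - sign;                         /* subl %edi, %edx */
}

/* Returns 1 if qux_model(x) == x / d for every sampled x, else 0. */
int matches_divisor(int32_t d) {
    if (d == 0 || d == -1) return 0; /* avoid division by zero and INT32_MIN / -1 */
    const int32_t edges[] = { INT32_MIN, INT32_MIN + 1, -1, 0, 1,
                              INT32_MAX - 1, INT32_MAX };
    for (size_t i = 0; i < sizeof edges / sizeof edges[0]; i++)
        if (qux_model(edges[i]) != edges[i] / d) return 0;
    for (int32_t x = -100000; x <= 100000; x++)
        if (qux_model(x) != x / d) return 0;
    return 1;
}
```

Calling matches_divisor with a candidate constant returns 1 only when the model agrees with x / d on every sampled input. That is evidence for the guess, not a proof over all 32-bit values.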
Exercise 5: Jump Table Analysis
Compile the following with gcc -O2 -S:
const char *month_name(int month) {
    switch (month) {
        case 1: return "January";
        case 2: return "February";
        case 3: return "March";
        case 4: return "April";
        case 5: return "May";
        case 6: return "June";
        case 7: return "July";
        case 8: return "August";
        case 9: return "September";
        case 10: return "October";
        case 11: return "November";
        case 12: return "December";
        default: return "Invalid";
    }
}
a) Does GCC generate a jump table or a series of comparisons?
b) What is the indirect jump instruction?
c) What bounds check is performed before the jump table access?
d) What would happen if you added case 100: return "Hundred";? (Try it!)
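For comparison with what the compiler emits, here is a hand-written table lookup for the same function. It is a sketch, not GCC's output: month_name_table and its layout are hypothetical, and the single unsigned comparison is just one common way to fold the lower and upper range checks into one branch.

```c
/* Hypothetical hand-written equivalent of month_name using a data table. */
const char *month_name_table(int month) {
    static const char *const names[12] = {
        "January", "February", "March", "April", "May", "June",
        "July", "August", "September", "October", "November", "December"
    };
    /* Subtracting 1 and comparing as unsigned rejects month < 1 and
       month > 12 with one test: values below 1 wrap to large numbers. */
    unsigned idx = (unsigned)month - 1u;
    if (idx >= 12u) return "Invalid";
    return names[idx];
}
```

Comparing this against the -O2 output for the switch version shows how closely the compiler's range check and indexing match a manual table, and where they differ.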
Exercise 6: Optimization Level Comparison
Compile this function at -O0, -O1, -O2, -O3, and -Os:
int64_t dot_product(const int64_t *a, const int64_t *b, int n) {
    int64_t sum = 0;
    for (int i = 0; i < n; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}
For each level, record: (a) the total number of instructions, (b) the number of load instructions, and (c) whether any SIMD instructions are used.
Exercise 7: Verbose Assembly Annotation
Compile the following function with -fverbose-asm -O1 and annotate the output. For each assembly instruction, write a comment explaining which C variable and operation it corresponds to.
int linear_search(int *arr, int len, int target) {
    for (int i = 0; i < len; i++) {
        if (arr[i] == target) return i;
    }
    return -1;
}
Exercise 8: Recursive Function Optimization
Compile the following two factorial implementations at -O2:
// Version A: traditional recursive
uint64_t fact_rec(int n) {
    if (n <= 1) return 1;
    return n * fact_rec(n - 1);
}
// Version B: tail-recursive
uint64_t fact_tail(int n, uint64_t acc) {
    if (n <= 1) return acc;
    return fact_tail(n - 1, (uint64_t)n * acc);
}
a) Does GCC optimize fact_rec to avoid function calls?
b) Does GCC optimize fact_tail with tail-call optimization (converting to a loop)?
c) Which -O levels enable tail-call optimization, and which -f flag turns it on or off explicitly?
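When reading the output for part (b), it can help to have the loop form in mind. The function below is a hand-written sketch of what eliminating the tail call amounts to at the C level. fact_loop is a hypothetical name, and the compiler's actual code may be structured differently.

```c
#include <stdint.h>

/* Hypothetical loop form of fact_tail: the recursive call becomes an
   update of the two parameters followed by a jump back to the top. */
uint64_t fact_loop(int n, uint64_t acc) {
    while (n > 1) {
        acc = (uint64_t)n * acc; /* second argument of the tail call */
        n = n - 1;               /* first argument of the tail call */
    }
    return acc;
}
```

If the -O2 output for fact_tail contains a backward jump and no call instruction, it has the same shape as this loop.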
Exercise 9: Compiler Explorer Multi-Architecture
Using Compiler Explorer (godbolt.org), compile this function for x86-64 and ARM64:
uint64_t sum_array(const uint64_t *arr, int n) {
    uint64_t sum = 0;
    for (int i = 0; i < n; i++) {
        sum += arr[i];
    }
    return sum;
}
Compile at -O2 for both architectures. Answer:
a) How many instructions in the loop body for each architecture?
b) Which architecture uses more instructions for the loop?
c) Does either compiler auto-vectorize (use SIMD instructions) at -O2?
d) Try adding -march=native on x86-64 and -march=armv8-a on ARM64. Does vectorization appear?
Exercise 10: Clang vs. GCC
Using Compiler Explorer, compile the absolute value function from Exercise 4c with both GCC and Clang at -O2:
int abs_signed(int x) {
    return x < 0 ? -x : x;
}
a) What instruction does GCC use?
b) What instruction does Clang use?
c) Are they equivalent?
Then change int to int64_t. How does the output change?
Exercise 11: Reading Object File Symbols
Assemble functions.asm from Chapter 20, then use nm and objdump to inspect the resulting object file:
nasm -f elf64 functions.asm -o functions.o
nm functions.o
objdump -d functions.o
a) What does nm output show for asm_strlen?
b) What does nm show for functions declared extern in the assembly?
c) In objdump -d output, do the addresses start at 0? Why?
d) What relocation entries does objdump -r show?
Exercise 12: Identifying ABI in Compiler Output
Compile the following function at -O0 and answer the questions below from the output:
int64_t complex_calc(int64_t a, int64_t b, int64_t c, int64_t d, int64_t e, int64_t f, int64_t g) {
    return a + b + c + d + e + f + g;
}
a) Which registers hold arguments a through f?
b) Where is argument g (the 7th argument)?
c) How does the function access g?
d) What is the return value register?
Exercise 13: CMOV Identification
Identify which conditional move instruction (CMOV variant) each code sequence uses and what condition it tests:
a)
testl %edi, %edi
cmovg %esi, %eax # (after cmp or test)
b)
cmpl %esi, %edi
cmovge %edi, %eax
c)
cmpl %esi, %edi
cmovb %esi, %eax # unsigned
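The signed and unsigned condition codes are easiest to tell apart by compiling matching C functions and comparing the output. The pair below is a sketch for that purpose. The names are hypothetical, and whether a given compiler emits a CMOV at all (rather than a branch or some other sequence) depends on the compiler version and flags.

```c
/* Signed comparison: expect a signed condition code if a CMOV is emitted. */
int max_signed(int a, int b) {
    return a > b ? a : b;
}

/* Unsigned comparison: same source shape, but the operand type changes
   which condition code is correct. */
unsigned max_unsigned(unsigned a, unsigned b) {
    return a > b ? a : b;
}
```

The same bit patterns give different answers: max_signed(-1, 1) returns 1, while max_unsigned(0xFFFFFFFFu, 1u) returns 0xFFFFFFFF (assuming a 32-bit int), so the two functions cannot test the same condition.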
Exercise 14: -fverbose-asm vs. Manual Annotation
Take any medium-complexity C function (10-20 lines) of your choice. Compile it with -O2 -fverbose-asm -S. For each assembly instruction, verify the compiler's comment and add your own explanation of what is happening at the C semantic level.
Exercise 15: Decompilation Challenge
The following is GCC -O2 output. Write the C code that produced it:
mystery:
    xorl %eax, %eax
    testl %edi, %edi
    jle .Ldone
.Lloop:
    movl (%rsi), %ecx
    addq $4, %rsi
    leal 1(%rax), %eax
    testl %ecx, %ecx
    jne .Lnot_zero
    leal -1(%rax), %eax
.Lnot_zero:
    decl %edi
    jne .Lloop
.Ldone:
    ret
(The exact output may vary. The exercise is to reason about what C logic produces this structure.)