Chapter 21 Exercises: Understanding Compiler Output

Exercise 1: AT&T to Intel Syntax Translation

Translate each AT&T assembly instruction to Intel (NASM) syntax:

a) movq %rbx, %rax
b) addl $42, %eax
c) movl (%rbx), %eax
d) movq -16(%rbp), %rax
e) leaq 8(%rax,%rcx,4), %rdx
f) cmpq $0, -8(%rbp)
g) imulq %rcx
h) movb %al, (%rdi,%rsi)
i) sarq $3, %rax
j) cmovneq %rbx, %rax

Exercise 2: Intel to AT&T Syntax Translation

Translate each Intel (NASM) instruction to AT&T syntax:

a) mov rax, rbx
b) add rax, 100
c) mov qword [rbp-8], rax
d) lea rdx, [rax + rcx*8 + 16]
e) imul eax, ecx, 7
f) movzx eax, byte [rsi]
g) cmp rdi, rsi
h) jne .loop

Exercise 3: Predict Compiler Output

For each C function, predict what GCC -O2 (x86-64) assembly will look like. Then verify using Compiler Explorer.

a)

int double_it(int x) { return x * 2; }

b)

int max(int a, int b) { return a > b ? a : b; }

c)

int is_even(int x) { return x % 2 == 0; }

d)

void swap(int *a, int *b) {
    int t = *a; *a = *b; *b = t;
}

e)

#include <stdint.h>

int64_t collatz_steps(int64_t n) {
    int64_t steps = 0;
    while (n != 1) {
        if (n % 2 == 0) n /= 2;
        else n = 3*n + 1;
        steps++;
    }
    return steps;
}

Exercise 4: Identify Optimization Patterns

The following are GCC -O2 outputs for simple C functions. Identify which optimization was applied and write the original C code:

a)

foo:
    leal    (%rdi,%rdi,8), %eax
    ret

b)

bar:
    movl    $1, %eax
    testl   %edi, %edi
    jne     .L2
    xorl    %eax, %eax
.L2:
    ret

c)

baz:
    movl    %edi, %eax
    negl    %eax
    testl   %edi, %edi
    cmovns  %edi, %eax
    ret

d)

qux:
    movl    %edi, %eax
    movl    $1717986919, %edx
    imull   %edx
    sarl    $2, %edx
    sarl    $31, %edi
    subl    %edi, %edx
    movl    %edx, %eax
    ret

(Hint: this implements integer division by a constant. Which constant?)
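The pattern in part (d) replaces a divide instruction with a multiplication by a precomputed "magic" reciprocal, a shift, and a sign correction. As a sketch of the same idea in C, here is signed division by 3, deliberately not the constant used in the exercise. The name div3_magic is hypothetical, and the code assumes that >> on a negative signed value is an arithmetic shift, which is implementation-defined in C but holds for GCC and Clang.

```c
#include <stdint.h>

/* Hypothetical helper: computes x / 3 without a divide instruction.
 * 0x55555556 is ceil(2^32 / 3). Assumes arithmetic right shift of
 * negative signed values (true for GCC and Clang). */
int32_t div3_magic(int32_t x) {
    int64_t product = (int64_t)x * 0x55555556LL; /* full 64-bit product */
    int32_t high = (int32_t)(product >> 32);     /* upper half, like %edx after imull */
    int32_t sign = x >> 31;                      /* 0 if x >= 0, -1 if x < 0 */
    return high - sign;                          /* adds 1 for negative x: round toward zero */
}
```

Comparing div3_magic(x) against x / 3 for a handful of positive and negative inputs is a quick way to convince yourself that the sign correction is needed.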

Exercise 5: Jump Table Analysis

Compile the following with gcc -O2 -S:

const char *month_name(int month) {
    switch (month) {
    case 1: return "January";
    case 2: return "February";
    case 3: return "March";
    case 4: return "April";
    case 5: return "May";
    case 6: return "June";
    case 7: return "July";
    case 8: return "August";
    case 9: return "September";
    case 10: return "October";
    case 11: return "November";
    case 12: return "December";
    default: return "Invalid";
    }
}

a) Does GCC generate a jump table or a series of comparisons?
b) What is the indirect jump instruction?
c) What bounds check is performed before the jump table access?
d) What would happen if you added case 100: return "Hundred";? (Try it!)
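For comparison with whatever the compiler generates, the same function can be written by hand as a table lookup. The sketch below (month_name_table is a hypothetical name) shows the general shape of table-driven dispatch: one range check, then an indexed load. It is not a claim about what GCC emits for month_name.

```c
/* Hypothetical hand-written equivalent of month_name(). */
const char *month_name_table(int month) {
    static const char *const names[12] = {
        "January", "February", "March",     "April",
        "May",     "June",     "July",      "August",
        "September", "October", "November", "December"
    };
    /* Subtracting 1 and comparing as unsigned rejects both
     * month < 1 and month > 12 with a single comparison. */
    unsigned index = (unsigned)month - 1u;
    if (index >= 12u) {
        return "Invalid";
    }
    return names[index];
}
```

When you read the compiler's output, look for an analogous single comparison guarding the table access.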

Exercise 6: Optimization Level Comparison

Compile this function at -O0, -O1, -O2, -O3, and -Os:

#include <stdint.h>

int64_t dot_product(const int64_t *a, const int64_t *b, int n) {
    int64_t sum = 0;
    for (int i = 0; i < n; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}

For each level, record: (a) the total number of instructions, (b) the number of load instructions, and (c) whether the output uses SIMD instructions.

Exercise 7: Verbose Assembly Annotation

Compile the following function with -O1 -fverbose-asm -S and annotate the output. For each assembly instruction, write a comment explaining which C variable and operation it corresponds to.

int linear_search(int *arr, int len, int target) {
    for (int i = 0; i < len; i++) {
        if (arr[i] == target) return i;
    }
    return -1;
}

Exercise 8: Recursive Function Optimization

Compile the following two factorial implementations at -O2:

#include <stdint.h>

// Version A: traditional recursive
uint64_t fact_rec(int n) {
    if (n <= 1) return 1;
    return n * fact_rec(n - 1);
}

// Version B: tail-recursive
uint64_t fact_tail(int n, uint64_t acc) {
    if (n <= 1) return acc;
    return fact_tail(n - 1, (uint64_t)n * acc);
}

a) Does GCC optimize fact_rec to avoid function calls?
b) Does GCC apply tail-call optimization to fact_tail (converting the recursion to a loop)?
c) Which -O levels enable tail-call optimization, and which -f flag turns it on or off explicitly?
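When tail-call optimization applies to a self-recursive function, it amounts to reusing the current stack frame: the arguments are overwritten with their new values and control jumps back to the top. A hand-written sketch of that transformation for Version B follows. The name fact_loop is hypothetical, and whether GCC actually produces an equivalent loop is still for you to verify.

```c
#include <stdint.h>

/* Hypothetical loop form of fact_tail(): each iteration plays the
 * role of one tail call, updating acc and n in place. */
uint64_t fact_loop(int n, uint64_t acc) {
    while (n > 1) {
        acc = (uint64_t)n * acc; /* new value of the second argument */
        n = n - 1;               /* new value of the first argument */
    }
    return acc;                  /* base case: n <= 1 */
}
```

Calling fact_loop(5, 1) mirrors fact_tail(5, 1); compiling both side by side makes it easy to compare the generated loops.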

Exercise 9: Compiler Explorer Multi-Architecture

Using Compiler Explorer (godbolt.org), compile this function for x86-64 and ARM64:

#include <stdint.h>

uint64_t sum_array(const uint64_t *arr, int n) {
    uint64_t sum = 0;
    for (int i = 0; i < n; i++) {
        sum += arr[i];
    }
    return sum;
}

Compile at -O2 for both architectures, then answer:

a) How many instructions are in the loop body for each architecture?
b) Which architecture uses more instructions for the loop?
c) Does either compiler auto-vectorize (use SIMD instructions) at -O2?
d) Try adding -march=native on x86-64 and -march=armv8-a on ARM64. Does vectorization appear?

Exercise 10: Clang vs. GCC

Using Compiler Explorer, compile this absolute value function with both GCC and Clang at -O2:

int abs_signed(int x) {
    return x < 0 ? -x : x;
}

a) What instruction does GCC use?
b) What instruction does Clang use?
c) Are they equivalent?

Then try: what if you change int to int64_t?
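When comparing the two compilers, it helps to know that an absolute value can also be computed without a branch or a conditional move, using a sign mask. The sketch below is one well-known formulation, given for recognition purposes only; abs_mask is a hypothetical name, and the code assumes an arithmetic right shift for negative values (true for GCC and Clang). It makes no claim about which compiler emits which sequence.

```c
#include <limits.h>

/* Hypothetical branch-free absolute value.
 * mask is all ones when x is negative, zero otherwise.
 * The arithmetic is done in unsigned to avoid signed overflow;
 * as with -x, the result for INT_MIN is not representable. */
int abs_mask(int x) {
    unsigned mask = (unsigned)(x >> (sizeof(int) * CHAR_BIT - 1));
    return (int)(((unsigned)x ^ mask) - mask);
}
```

For a negative x, XOR with the all-ones mask flips every bit and subtracting the mask adds 1, which is two's complement negation; for a non-negative x, both steps are no-ops.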

Exercise 11: Reading Object File Symbols

Assemble functions.asm from Chapter 20, then use nm and objdump to inspect the resulting object file:

nasm -f elf64 functions.asm -o functions.o
nm functions.o
objdump -d functions.o

a) What does nm output show for asm_strlen?
b) What does nm show for functions declared extern in the assembly?
c) In objdump -d output, do the addresses start at 0? Why?
d) What relocation entries does objdump -r show?

Exercise 12: Identifying ABI in Compiler Output

Compile the following function at -O0 and answer the questions below by inspecting the output:

#include <stdint.h>

int64_t complex_calc(int64_t a, int64_t b, int64_t c, int64_t d,
                     int64_t e, int64_t f, int64_t g) {
    return a + b + c + d + e + f + g;
}

a) Which registers hold arguments a through f?
b) Where is argument g (the 7th argument)?
c) How does the function access g?
d) What is the return value register?

Exercise 13: CMOV Identification

Identify which conditional move instruction (CMOV variant) each code sequence uses and what condition it tests:

a)

testl %edi, %edi
cmovg %esi, %eax    # (after cmp or test)

b)

cmpl %esi, %edi
cmovge %edi, %eax

c)

cmpl %esi, %edi
cmovb %esi, %eax    # unsigned
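The suffix of a CMOV instruction encodes the flag condition it tests: signed comparisons use the g/ge/l/le family, while unsigned comparisons use a/ae/b/be. Two small C selects for experimenting with this in Compiler Explorer are sketched below. The function names are hypothetical, and whether a compiler emits CMOV or a branch for them depends on the compiler and flags.

```c
/* Hypothetical examples: the same select shape with signed and
 * unsigned operands, so the comparison condition differs. */
int max_signed(int a, int b) {
    return a >= b ? a : b;   /* signed comparison */
}

unsigned min_unsigned(unsigned a, unsigned b) {
    return a < b ? a : b;    /* unsigned comparison */
}
```

Note that the bit pattern 0xFFFFFFFF is -1 when compared as signed but the largest value when compared as unsigned, which is why the two condition families cannot be interchanged.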

Exercise 14: -fverbose-asm vs. Manual Annotation

Take any medium-complexity C function (10-20 lines) of your choice. Compile it with -O2 -fverbose-asm -S. For each assembly instruction, verify the compiler's comment and add your own explanation of what is happening at the C semantic level.

Exercise 15: Decompilation Challenge

The following is GCC -O2 output. Write the C code that produced it:

mystery:
    xorl    %eax, %eax
    testl   %edi, %edi
    jle     .Ldone
.Lloop:
    movl    (%rsi), %ecx
    addq    $4, %rsi
    leal    1(%rax), %eax
    testl   %ecx, %ecx
    jne     .Lnot_zero
    leal    -1(%rax), %eax    # element was zero: undo the increment
.Lnot_zero:
    decl    %edi
    jne     .Lloop
.Ldone:
    ret

(The exact output may vary. The exercise is to reason about what C logic produces this structure.)