Chapter 8 Exercises: Data Movement and Addressing Modes

Section A: MOV and Size Rules

Exercise 1. What is the value of RAX after each of the following instruction sequences? Work through each independently (treat each as starting fresh with RAX = 0xAABBCCDDEEFF1122).

(a)

mov rax, 0xAABBCCDDEEFF1122
mov eax, 0x12345678

(b)

mov rax, 0xAABBCCDDEEFF1122
mov ax, 0x9900

(c)

mov rax, 0xAABBCCDDEEFF1122
mov al, 0xFF

(d)

mov rax, 0xAABBCCDDEEFF1122
mov ah, 0xFF

Exercise 2. A programmer writes the following code intending to sum the values 1 through 8 into RAX using a loop. Find and fix the bug.

    xor rax, rax          ; total = 0
    mov cl, 1             ; i = 1 (stored in CL)
.loop:
    add rax, rcx          ; total += i  ← potential bug here
    inc cl
    cmp cl, 9
    jl .loop

Hint: CL is the low byte of RCX. What might be in the upper bits of RCX?


Exercise 3. Explain why the following instruction is not encodable and provide two correct alternatives:

mov [rdi], [rsi]          ; copy 8 bytes from [rsi] to [rdi]

Section B: Addressing Mode Identification

Exercise 4. Identify the addressing mode used in each instruction and list all components (base, index, scale, displacement):

(a) mov rax, [rbp - 16]
(b) mov rax, [rdi + rdx*4]
(c) mov rcx, [rbx + r9*8 + 32]
(d) mov rdx, [0x7fff00000000]
(e) mov rsi, [rel my_array]
(f) mov r8, [rbx]


Exercise 5. Given the following C struct:

struct Node {
    int32_t  key;       // offset 0
    uint32_t color;     // offset 4
    uint64_t value;     // offset 8
    struct Node *left;  // offset 16
    struct Node *right; // offset 24
};               // sizeof = 32

Write the NASM instruction to do each of the following. In every part, assume RDI holds the node pointer.

(a) Load node->key into EAX
(b) Load node->value into RBX
(c) Load node->right into RCX
(d) Store the low 32 bits of RDX (that is, EDX) into node->color; color is only 4 bytes wide
(e) Load node->left->value into RSI (requires two instructions)
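Before hard-coding displacements in assembly, it can help to confirm the offsets in the struct comments. The following is a minimal sketch, assuming a C11 compiler and an LP64 target such as x86-64 Linux (8-byte pointers); on other targets the pointer offsets would differ.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

struct Node {
    int32_t  key;
    uint32_t color;
    uint64_t value;
    struct Node *left;
    struct Node *right;
};

/* Compile-time checks; these assume 8-byte pointers (LP64). */
static_assert(offsetof(struct Node, key)   == 0,  "key at offset 0");
static_assert(offsetof(struct Node, color) == 4,  "color at offset 4");
static_assert(offsetof(struct Node, value) == 8,  "value at offset 8");
static_assert(offsetof(struct Node, left)  == 16, "left at offset 16");
static_assert(offsetof(struct Node, right) == 24, "right at offset 24");
static_assert(sizeof(struct Node)          == 32, "sizeof = 32");
```

If the file compiles, each displacement you write in a NASM memory operand for this struct matches what the C compiler would use.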


Exercise 6. Write the NASM code to load element matrix[3][2] of a 2D array declared as int64_t matrix[5][5] into RAX. Assume RDI holds the address of matrix[0][0].

Use the formula: element address = base + (row × 5 + col) × 8
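The row-major formula can be cross-checked against a C compiler's own indexing. This is a sketch only: the names elem_offset and check_offsets are hypothetical helpers, and the worked value uses a different element (matrix[1][4]) so the exercise itself is left to you.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { ROWS = 5, COLS = 5 };

/* Byte offset of m[row][col] from &m[0][0] for int64_t m[5][5]:
   (row * 5 + col) * 8 */
size_t elem_offset(size_t row, size_t col) {
    return (row * COLS + col) * sizeof(int64_t);
}

/* Compare the formula with the addresses the compiler computes. */
void check_offsets(void) {
    static int64_t m[ROWS][COLS];
    const char *base = (const char *)&m[0][0];
    for (size_t r = 0; r < ROWS; r++) {
        for (size_t c = 0; c < COLS; c++) {
            ptrdiff_t actual = (const char *)&m[r][c] - base;
            assert(actual == (ptrdiff_t)elem_offset(r, c));
        }
    }
}
```

For example, elem_offset(1, 4) is (1 × 5 + 4) × 8 = 72, so matrix[1][4] lives 72 bytes past the base address.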


Section C: LEA Arithmetic

Exercise 7. Express each of the following C expressions as a single LEA instruction (or two LEA instructions if necessary). Inputs are in the registers shown; write results to RAX.

(a) rax = rdi * 3 (input: RDI)
(b) rax = rdi * 5 (input: RDI)
(c) rax = rdi * 9 (input: RDI)
(d) rax = rdi * 5 + 7 (input: RDI)
(e) rax = rdi + rsi * 4 (inputs: RDI, RSI)
(f) rax = rdi * 10 (input: RDI; may use two LEAs)
(g) rax = rdi * 25 (input: RDI; may use two LEAs)


Exercise 8. A compiler generates this code for a function that computes a * 12:

lea rax, [rdi + rdi*2]   ; rax = ?
shl rax, 2               ; rax = ?

Trace the computation step by step and verify that the result is rdi * 12. Why might the compiler prefer this over imul rax, rdi, 12?
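One way to check a trace like this is to model LEA's address arithmetic in C and compare against a plain multiplication. The sketch below is an illustration under assumptions: lea and times18 are hypothetical helper names, and the multiplier 18 is chosen to be different from the ones in the exercises.

```c
#include <assert.h>
#include <stdint.h>

/* Models: lea dst, [base + index*scale + disp]
   scale must be 1, 2, 4, or 8; the result wraps modulo 2^64,
   as 64-bit address arithmetic does. */
uint64_t lea(uint64_t base, uint64_t index, unsigned scale, int32_t disp) {
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return base + index * scale + (uint64_t)(int64_t)disp;
}

/* Hypothetical example: x * 18 computed as
       lea rax, [rdi + rdi*8]   ; rax = x * 9
       shl rax, 1               ; rax = x * 18 */
uint64_t times18(uint64_t x) {
    uint64_t rax = lea(x, x, 8, 0);
    rax <<= 1;
    return rax;
}
```

The same helper can be used to test your own decompositions: write each LEA as a call to lea, each SHL as a left shift, and compare the result with x multiplied by the target constant for a few inputs.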


Exercise 9. The following LEA and SHL sequence computes rax = rdi * 60. Fill in the intermediate value at each step:

lea rax, [rdi + rdi*4]   ; rax = _____ (what multiple of rdi?)
lea rax, [rax + rax*2]   ; rax = _____ (now what multiple?)
shl rax, 2               ; rax = _____ (final multiple)

Exercise 10. Why is lea rax, [rax + 1] sometimes preferred over add rax, 1 or inc rax in performance-critical code? Give a specific scenario where this matters.


Section D: MOVZX and MOVSX

Exercise 11. Fill in the value of RAX after each instruction. Assume the memory byte at [rbx] contains 0xC4.

(a) movzx rax, byte [rbx]
(b) movsx rax, byte [rbx]
(c) movzx eax, byte [rbx]
(d) movsx eax, byte [rbx]

(Note: 0xC4 = 196 unsigned, -60 signed)
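The unsigned and signed readings of a byte correspond to zero extension and sign extension. As a rough model only, the C sketch below mimics the 64-bit destination forms of MOVZX and MOVSX for a different byte, 0x90 (144 unsigned, -112 signed). The function names are hypothetical, and the cast to int8_t relies on the two's-complement conversion that mainstream compilers provide.

```c
#include <assert.h>
#include <stdint.h>

/* Models: movzx r64, byte [mem]  (upper 56 bits become 0) */
uint64_t movzx_byte(uint8_t b) {
    return (uint64_t)b;
}

/* Models: movsx r64, byte [mem]  (upper 56 bits copy bit 7).
   Assumes the uint8_t -> int8_t conversion wraps (two's complement). */
uint64_t movsx_byte(uint8_t b) {
    return (uint64_t)(int64_t)(int8_t)b;
}
```

With these definitions, movzx_byte(0x90) yields 0x0000000000000090 and movsx_byte(0x90) yields 0xFFFFFFFFFFFFFF90; a byte with bit 7 clear, such as 0x7F, gives the same result from both.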


Exercise 12. A 16-bit value in memory at [rdi] contains 0x8010. Fill in the value of RAX after each instruction:

(a) movzx rax, word [rdi] → RAX = ?
(b) movsx rax, word [rdi] → RAX = ?


Exercise 13. Write a function int64_t sign_extend_array_sum(int8_t *arr, int count) in NASM. The function should sum all signed byte values in the array, returning a 64-bit signed result. Use MOVSX for proper sign extension.

Function signature (System V AMD64 ABI):
- RDI = arr pointer
- ESI = count
- Return in RAX


Exercise 14. A programmer loads an array index with:

mov cl, [rbp - 1]        ; load index from stack (stored as int8_t)
mov rax, [array + rcx*8] ; access array[index]

This code has a potential bug. Describe the bug, give a specific input value for the index that demonstrates the bug, and provide the fix.


Section E: Putting It Together

Exercise 15. Write a complete NASM program that does the following:
- Declare a static array of 8 int64_t values: {100, 200, 300, 400, 500, 600, 700, 800}
- Load the 4th element (index 3) using base+index×scale addressing
- Load the 6th element (index 5) and compute their sum in RAX
- Store the sum back into the 8th element (index 7)
- Exit with status 0

Include the section .data and section .text parts.


Exercise 16 (Challenge). Implement void matrix_transpose(int64_t *A, int64_t *B, int n) in NASM, where A and B are both n×n matrices stored in row-major order. The function should compute B = A^T (B[i][j] = A[j][i]).

Signature:
- RDI = A (source matrix)
- RSI = B (destination matrix)
- EDX = n (matrix dimension, assume n ≤ 8)

Use base+index×scale addressing wherever possible.


Exercise 17 (Debug). The following code attempts to copy a string (a sequence of bytes) from src to dst, stopping at the null terminator. Find all bugs:

section .data
    src: db "Hello", 0
    dst: times 10 db 0

section .text
global _start
_start:
    mov rsi, src           ; RSI = src address
    mov rdi, dst           ; RDI = dst address
.loop:
    mov al, [rsi]          ; load byte
    mov [rdi], al          ; store byte
    inc rsi
    inc rdi
    cmp al, 0              ; stop if null?
    jne .loop              ; WRONG: should it be je or jne?
    ; also: is RIP-relative addressing being used correctly?
    ; also: what happens after the loop?
    mov eax, 60
    xor edi, edi
    syscall

List all bugs and provide the corrected version.


Exercise 18. True or false, and explain why:

(a) mov eax, ecx and mov rax, rcx have the same observable effect if RAX and RCX are both less than 2^32.

(b) lea rax, [rax] is a valid no-op (does nothing observable).

(c) mov [rbx+rcx*3], rax is a valid NASM instruction.

(d) xchg rax, rbx and the four-instruction sequence push rax; push rbx; pop rax; pop rbx always produce the same result.

(e) movzx eax, ax and mov eax, eax are equivalent for all possible values of RAX.


Exercise 19. Write NASM code to pack four 16-bit values (A, B, C, D) stored at memory locations [rdi], [rdi+2], [rdi+4], [rdi+6] into a single 64-bit register RAX, with A in bits 63:48, B in bits 47:32, C in bits 31:16, and D in bits 15:0. Use only MOV, MOVZX, SHL, and OR instructions.


Exercise 20 (Performance Analysis). Consider two implementations of accessing records[i].score where Record has the layout from the chapter (32 bytes total, score at offset 24):

; Implementation A:
imul rcx, rdi, 32         ; rcx = i * 32
mov rax, [rsi + rcx + 24] ; load records[i].score

; Implementation B:
lea rcx, [rdi + rdi*4]    ; rcx = i * 5
shl rcx, 3                ; rcx *= 8, so rcx = i * 40  ← BUG
mov rax, [rsi + rcx + 24]

Implementation B has a bug. Find it, explain what it actually computes, and provide the correct version using LEA+SHL.