Chapter 4 Exercises: Memory

Section A: Memory Layout

Exercise 4.1 — Process Memory Map Reading

Run the readmaps program from the chapter (or use cat /proc/self/maps directly). For the output, identify:

a) The address range of the main executable's code section (.text). How large is it?
b) The address range of the stack. How much stack space is currently mapped?
c) Find the entry for libc.so.6. How many separate mappings does it have? What permissions does each have?
d) Is there a [heap] entry? If your program doesn't use malloc, there may not be one. What does this tell you about when the heap is created?
e) Find the [vdso] and [vvar] entries. What is the vDSO (Virtual Dynamic Shared Object) and why does it have execute permission?
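When working out region sizes by hand, it can help to check the hex subtraction mechanically. The following is a minimal Python sketch, assuming the usual six-column layout of a maps line (range, permissions, offset, device, inode, optional path). The helper name parse_maps_line and the sample line are made up for illustration and do not come from the chapter.

```python
def parse_maps_line(line):
    """Split one /proc/<pid>/maps line into its main fields."""
    fields = line.strip().split(maxsplit=5)
    start_text, end_text = fields[0].split("-")
    start, end = int(start_text, 16), int(end_text, 16)
    return {
        "start": start,
        "end": end,                      # exclusive: first address past the mapping
        "size": end - start,             # size in bytes
        "perms": fields[1],
        "path": fields[5] if len(fields) > 5 else "",  # empty for anonymous mappings
    }

# Hypothetical line, not taken from any real process.
sample = "00400000-00452000 r-xp 00000000 08:02 173521    /usr/bin/example"
region = parse_maps_line(sample)
print(hex(region["start"]), hex(region["end"]), region["size"], region["perms"])
```

The end address is exclusive, so the size is simply end minus start; here that is 0x52000 bytes.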


Exercise 4.2 — Memory Segment Classification

For each of the following, state which segment (.text, .rodata, .data, .bss, stack, heap) it belongs to, and justify your answer:

a) The machine code for a function
b) A global variable initialized to 0
c) A global const char* greeting = "Hello"; (two parts: the pointer, and the string)
d) A local variable declared inside a function
e) A buffer returned by malloc(1024)
f) A jump table generated by the compiler for a switch statement
g) A global int counter = 100;
h) An uninitialized global array char buffer[4096];


Exercise 4.3 — Address Space Layout

For a non-PIE x86-64 Linux executable, the typical base addresses are:
- .text starts at or just above 0x400000
- The stack is near the top of user space
- libc.so.6 is mapped somewhere in the 0x7f... range

a) What is the highest user-space virtual address on a 48-bit virtual address system?
b) If the stack top is at 0x7fffffffe000 and the stack has been grown by 8,192 bytes, where is the current RSP?
c) Why is there a "canonical hole" in the x86-64 address space between user space and kernel space? What happens if you try to access an address in this hole?
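The canonical-address rule can be stated as a bit test: with 48-bit virtual addresses, bits 63 through 47 must all be equal. The Python sketch below encodes that test so you can probe individual addresses. The function name is_canonical and the sample addresses are illustrative assumptions, and the sketch says nothing about why the rule exists or what the CPU does on a violation.

```python
def is_canonical(address, va_bits=48):
    """True if bits 63 down to va_bits-1 of a 64-bit address are all equal."""
    if not 0 <= address < 1 << 64:
        raise ValueError("address must fit in 64 bits")
    upper = address >> (va_bits - 1)            # bits 63..47 when va_bits is 48
    all_ones = (1 << (64 - va_bits + 1)) - 1    # 17 one-bits when va_bits is 48
    return upper in (0, all_ones)

# Illustrative addresses only.
print(is_canonical(0x0000_1234_5678_9ABC))   # True: upper bits are all 0
print(is_canonical(0xFFFF_9000_0000_0000))   # True: upper bits are all 1
print(is_canonical(0x0001_0000_0000_0000))   # False: upper bits are mixed
```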


Section B: Alignment

Exercise 4.4 — Alignment Calculations

For each of the following, determine whether the address is correctly aligned for the given data type:

a) Address 0x100 for a dword (4 bytes)
b) Address 0x102 for a dword (4 bytes)
c) Address 0x108 for a qword (8 bytes)
d) Address 0x110 for an XMM register value (16 bytes)
e) Address 0x118 for an XMM register value (16 bytes)
f) Address 0x200 for an AVX register value (32 bytes)
g) Address 0x220 for an AVX register value (32 bytes)
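An address is aligned to N bytes (N a power of two) exactly when its low log2(N) bits are zero, which is the same as address mod N being zero. A small Python sketch of that test follows. The helper name is_aligned is an assumption, and the addresses are deliberately different from the ones in the exercise.

```python
def is_aligned(address, alignment):
    """True if address is a multiple of alignment (which must be a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a power of two")
    return address & (alignment - 1) == 0

# Addresses chosen for illustration; they are not the ones in the exercise.
print(is_aligned(0x3004, 4))    # True: the low two bits are 00
print(is_aligned(0x3004, 8))    # False: 0x3004 mod 8 is 4
print(is_aligned(0x3040, 32))   # True: the low five bits are 00000
```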


Exercise 4.5 — Stack Alignment Calculation

A function is called. Before the call instruction, RSP = 0x7FFFFFFFEFF0 (16-byte aligned — verify this).

a) After the call instruction (which pushes an 8-byte return address), what is RSP?
b) After push rbp, what is RSP?
c) After sub rsp, 32, what is RSP? Is it 16-byte aligned?
d) If the function then calls another function (via call), is RSP 16-byte aligned at the call instruction? (It needs to be.)
e) Suppose a function needs exactly 12 bytes of local variable space. The programmer writes sub rsp, 12. Is this correct? If not, what should they write instead?
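The bookkeeping in this exercise is plain subtraction plus one rounding step. The Python sketch below walks a similar sequence with different numbers, assuming a hypothetical starting RSP and a hypothetical 40 bytes of locals. The helper name align_up is also an assumption, not something from the chapter.

```python
def align_up(size, alignment=16):
    """Round size up to the next multiple of alignment (a power of two)."""
    return (size + alignment - 1) & ~(alignment - 1)

rsp = 0x7FFFFFFFD000          # hypothetical RSP just before a CALL
assert rsp % 16 == 0          # aligned at the call site

rsp -= 8                      # CALL pushes the 8-byte return address
rsp -= 8                      # push rbp
rsp -= align_up(40)           # 40 bytes of locals are rounded up to 48
print(hex(rsp), rsp % 16 == 0)    # 0x7fffffffcfc0 True
```

Rounding the local-variable area up to a multiple of 16 keeps RSP aligned once the return address and the saved RBP have together consumed 16 bytes.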


Exercise 4.6 — Data Declaration Layout

Given the following NASM data section:

section .data
    a   db  1
    b   dw  0x0203
    c   dd  0x04050607
    d   dq  0x08090A0B0C0D0E0F

If a is at address 0x1000, write out the bytes at addresses 0x1000 through 0x100E, one byte per address (the four declarations occupy 15 bytes, because NASM inserts no padding between them). Show both the address and the byte value in hex.
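To check a hand-written byte layout, you can pack the same values little-endian and print them one address at a time. The following Python sketch uses the standard struct module. The helper name layout and the two sample declarations are illustrative assumptions and differ from the exercise data.

```python
import struct

def layout(base, declarations):
    """Return (address, byte) pairs for back-to-back little-endian values.

    declarations is a list of (struct format code, value) pairs; the codes
    "B", "H", "I" and "Q" have the same sizes as NASM's db, dw, dd and dq.
    """
    raw = b"".join(struct.pack("<" + code, value) for code, value in declarations)
    return [(base + offset, byte) for offset, byte in enumerate(raw)]

# Hypothetical data, not the declarations from the exercise:
#   x  dw  0x1234
#   y  dd  0xAABBCCDD
for address, byte in layout(0x2000, [("H", 0x1234), ("I", 0xAABBCCDD)]):
    print(f"0x{address:04X}: 0x{byte:02X}")
```

The "<" prefix asks struct for little-endian packing with no padding, so the least significant byte of each value lands at the lowest address.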


Section C: NASM Data Declarations

Exercise 4.7 — Data Declaration Writing

Write NASM data declarations for the following C declarations:

// (a)
const char *msg = "Error: file not found";

// (b)
int32_t values[] = {1, 2, 3, 4, 5, 6, 7, 8};

// (c) -- note: the pointer and string are in different sections
const char *greeting = "Hello, World!";

// (d) -- BSS declarations
char input_buffer[256];
int32_t result;
double accumulator;

// (e) -- initialized with sizeof-like calculation
uint8_t data[] = {0xDE, 0xAD, 0xBE, 0xEF};
// Also: a constant for the length of data

Exercise 4.8 — The $ and $$ Operators

Given this NASM code:

section .data
    start_marker:
    msg1        db "Hello"
    msg1_end:
    msg1_len    equ $ - msg1            ; (a) What is msg1_len?

    padding     times (8 - ($ - $$) % 8) % 8 db 0   ; (b) What does this do?

    msg2        db "World"
    total_len   equ $ - start_marker    ; (c) What is total_len?

Answer each part:
a) What is msg1_len?
b) Explain what the times expression does. Why might you want it?
c) What is total_len? (Include the padding bytes in your calculation.)

Section D: Stack Operations

Exercise 4.9 — Stack Trace

Trace the stack state after each instruction. Show the RSP value and the contents at RSP (and adjacent addresses). Start with RSP = 0x7FFFE000:

mov  rax, 0x1111111111111111
mov  rbx, 0x2222222222222222
mov  rcx, 0x3333333333333333

push rax    ; step 1
push rbx    ; step 2
push rcx    ; step 3
pop  rdx    ; step 4 -- what value does rdx get?
pop  rsi    ; step 5 -- what value does rsi get?

For each step, draw the stack:

Address       │ Value
─────────────────────
0x7FFFE000    │ (uninitialized)
0x7FFFDFF8    │
0x7FFFDFF0    │
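If you want to check a trace mechanically, PUSH and POP are easy to model: PUSH subtracts 8 from RSP and then stores, POP loads and then adds 8. The toy Python model below is only a sketch. The class name Stack, the starting RSP and the pushed values are all hypothetical and deliberately different from the exercise.

```python
class Stack:
    """Toy model of the x86-64 stack: 8-byte slots, growing downward."""

    def __init__(self, rsp):
        self.rsp = rsp
        self.memory = {}                  # address -> 64-bit value

    def push(self, value):
        self.rsp -= 8                     # PUSH decrements RSP first...
        self.memory[self.rsp] = value     # ...then stores at the new RSP

    def pop(self):
        value = self.memory[self.rsp]     # POP loads from [RSP]...
        self.rsp += 8                     # ...then increments RSP
        return value                      # the old slot keeps its stale value

# Hypothetical starting RSP and values, different from the exercise.
s = Stack(0x7FFFC000)
s.push(0xAAAA)
s.push(0xBBBB)
print(hex(s.rsp))      # 0x7fffbff0
print(hex(s.pop()))    # 0xbbbb
print(hex(s.rsp))      # 0x7fffbff8
```

Note that pop does not erase anything: the value stays in memory below RSP until a later push overwrites it.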

Exercise 4.10 — PUSH/POP in Function Prologues

The following function uses PUSH/POP for its prologue and epilogue:

my_function:
    push rbp          ; (1) RSP before call was 0x7FFFE010; after CALL = 0x7FFFE008
    mov  rbp, rsp     ; (2)
    push rbx          ; (3)
    push r12          ; (4)
    push r13          ; (5)
    sub  rsp, 24      ; (6) -- allocate local variables
    ; ... function body ...
    add  rsp, 24      ; (7)
    pop  r13          ; (8)
    pop  r12          ; (9)
    pop  rbx          ; (10)
    pop  rbp          ; (11)
    ret               ; (12)

After step 5 (all three registers pushed, but before sub rsp, 24), is RSP 16-byte aligned? Show your calculation. After sub rsp, 24, is RSP aligned? If a function called from within my_function requires 16-byte aligned RSP at the CALL site, does my_function maintain this invariant?


Section E: Addressing Modes

Exercise 4.11 — Addressing Mode Identification

For each instruction, identify the addressing mode of the memory operand and compute the effective address in terms of the register values:

; Assume: rax=0x1000, rbx=0x2000, rcx=5, rdx=0x10

mov r8, [rax]            ; (a) mode and effective address?
mov r8, [rbx + 8]        ; (b)
mov r8, [rax + rcx*8]    ; (c)
mov r8, [rbx + rcx*4 + 20]  ; (d)
mov r8, [rel my_var]     ; (e) NASM syntax for RIP-relative -- what is different about this one?
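For the register-based forms, the effective address is always base + index*scale + displacement, with any missing component treated as zero (or scale 1). A short Python sketch of that formula follows. The function name effective_address and the register values are assumptions chosen to differ from the exercise, and the RIP-relative form is intentionally not modelled.

```python
def effective_address(base=0, index=0, scale=1, disp=0):
    """Compute base + index*scale + disp, wrapped to 64 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86-64 only encodes scale factors 1, 2, 4 and 8")
    return (base + index * scale + disp) & 0xFFFFFFFFFFFFFFFF

# Hypothetical register values, different from the exercise.
rsi, rdi = 0x5000, 3
print(hex(effective_address(base=rsi, disp=0x10)))            # [rsi + 0x10]  -> 0x5010
print(hex(effective_address(base=rsi, index=rdi, scale=4)))   # [rsi + rdi*4] -> 0x500c
```

LEA evaluates the same formula but writes the result to the destination register instead of using it to access memory.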

Exercise 4.12 — LEA as Arithmetic

For each LEA instruction, compute the value stored in the destination register. No memory is accessed.

; Assume: rax=10, rbx=3, rcx=0x100

lea rdx, [rax + 1]           ; (a) rdx = ?
lea rdx, [rax + rbx]         ; (b) rdx = ?
lea rdx, [rbx + rbx*2]       ; (c) rdx = ?  (hint: rbx + rbx*2 = rbx*3)
lea rdx, [rax + rbx*4 + 8]   ; (d) rdx = ?
lea rdx, [rcx*2]             ; (e) rdx = ?  (hint: no base)

Section F: Synthesis

Exercise 4.13 — Complete Memory Layout Program

Modify the readmaps program from the chapter to:
a) Count the number of mapped regions (lines in /proc/self/maps)
b) Count regions with execute permission (containing 'x' in the permissions field)
c) Print a summary: "Total regions: N, executable: M"

This requires parsing the output of /proc/self/maps in assembly — a useful exercise in string processing.
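Before writing the assembly, it can be useful to have a reference result to compare against. The Python sketch below counts regions in maps-style text in the way the exercise describes. The function name summarize_maps and the three sample lines are made up, and the sketch assumes the permissions are always the second whitespace-separated field.

```python
def summarize_maps(text):
    """Return (total regions, regions whose permissions include 'x')."""
    total = 0
    executable = 0
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:          # skip blank or malformed lines
            continue
        total += 1
        if "x" in fields[1]:         # fields[1] is the permissions column, e.g. r-xp
            executable += 1
    return total, executable

# Hypothetical maps output, not from a real process.
sample = """\
00400000-00401000 r--p 00000000 08:02 1234 /tmp/demo
00401000-00402000 r-xp 00001000 08:02 1234 /tmp/demo
7ffc10000000-7ffc10021000 rw-p 00000000 00:00 0 [stack]
"""
total, executable = summarize_maps(sample)
print(f"Total regions: {total}, executable: {executable}")
```

To compare against your own program, one option is to save what it reads from /proc/self/maps to a file and pass that file's contents to the function, since a Python process reading its own maps would see a different address space.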


Exercise 4.14 — Stack Canary Location

On x86-64 Linux with glibc, the stack canary is stored at fs:0x28 (thread-local storage offset 0x28). Write a NASM snippet that:
a) Reads the canary value from TLS: mov rax, QWORD [fs:0x28]
b) Stores a copy of it on the stack at [rbp-8] (the standard compiler location)
c) After the function body, verifies that the copy hasn't changed

verify_canary_example:
    push rbp
    mov  rbp, rsp
    sub  rsp, 16

    ; Read canary from TLS and save it:
    mov  rax, QWORD [fs:0x28]
    mov  QWORD [rbp-8], rax

    ; ... simulated function body ...
    ; (Some code that could potentially corrupt the stack)

    ; Verify canary hasn't changed:
    mov  rax, QWORD [rbp-8]
    xor  rax, QWORD [fs:0x28]   ; XOR: if unchanged, result is 0
    jnz  .canary_failed          ; if not zero, canary was modified!

    xor  eax, eax
    leave
    ret

.canary_failed:
    ; Stack was corrupted -- terminate
    mov  rax, 60
    mov  rdi, 1
    syscall

Explain: What value would [rbp-8] contain if a buffer overflow had overwritten it? How does the XOR comparison detect this?


Exercise 4.15 — Preview: Buffer on the Stack

Consider the following NASM function that declares a local buffer:

read_input:
    push rbp
    mov  rbp, rsp
    sub  rsp, 64                ; allocate 64-byte buffer

    ; Buffer is at [rbp-64] to [rbp-1] (64 bytes)
    lea  rdi, [rbp-64]          ; pointer to buffer
    mov  rsi, 64                ; max bytes to read
    ; ... read into buffer ...

    leave
    ret

a) What is the address of the first byte of the buffer relative to RBP?
b) What is at [rbp+0] (the 8 bytes starting at RBP itself)?
c) What is at [rbp+8]?
d) If an attacker could write more than 64 bytes into the buffer, which adjacent memory locations would be overwritten first? What is at those locations?
e) Without implementing an exploit: what would happen to program control if the value at [rbp+8] were overwritten with an attacker-controlled value?

(This is a preview of Chapter 11's buffer overflow material. Just describe the memory layout — don't implement an exploit.)