Chapter 20 Key Takeaways: Calling C from Assembly and Assembly from C

  1. extern declares a symbol defined elsewhere; global exports a symbol from the current file. Every assembly function you want C to call must be declared global. Every C function you want to call from assembly, and every C variable you want to access, must be declared extern.

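  2. Link mixed C+assembly projects with gcc, not ld directly. gcc automatically adds libc and the C runtime startup files (crt1.o, crti.o, crtn.o); crt1.o provides _start, which sets up the C runtime environment and then calls main. Invoking ld directly omits all of these unless you list them yourself, so the link typically fails with undefined symbols or produces a program that crashes before main is called.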

  3. For variadic function calls (printf, scanf, etc.), AL must hold the number of XMM registers used for floating-point arguments (the ABI requires only an upper bound, from 0 to 8). xor eax, eax sets AL=0 (no FP args). mov eax, 1 sets AL=1 (one FP arg in XMM0). The callee uses AL to decide whether to spill the XMM argument registers, so a stale or wrong value can make printf print garbage or crash.

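  4. Always save malloc's return value (RAX) into a callee-saved register (RBX, R12-R15) before any subsequent function call. Any call may clobber RAX and every other caller-saved register, whereas the callee must preserve callee-saved registers, so the pointer survives. Because those registers are callee-saved for your own caller too, push the one you use in your prologue and pop it before returning.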

  5. The System V AMD64 ABI specifies exactly 6 callee-saved GP registers: RBX, RBP, R12, R13, R14, R15 (RSP must likewise hold its original value on return). Any other register (RAX, RCX, RDX, RSI, RDI, R8-R11) may be modified by a called function. Save only what you need; restore exactly what you saved, in reverse order.

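  6. The red zone (128 bytes below RSP) can be used by leaf functions without adjusting RSP. It is not safe in non-leaf functions (those that call other functions), because call pushes the return address into it and the callee is free to overwrite the rest. It is also unavailable in kernel code, where an interrupt can push onto the same stack at any time (kernels are built with -mno-red-zone). In user space, the ABI guarantees that signal delivery will not clobber it. Misusing the red zone in a non-leaf function causes silent data corruption.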

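  7. Small structs (≤ 16 bytes) are passed in registers: up to 2 GP registers for integer structs, or XMM registers for floating-point fields. Large structs (> 16 bytes) are passed by value in memory: the caller copies the struct into the stack argument area, and the callee reads it at a fixed offset from RSP (no pointer is passed). Returning a large struct uses a "hidden first argument": the caller allocates space and passes a pointer to it in RDI, which shifts the visible arguments to RSI, RDX, and so on, and the callee returns that same pointer in RAX.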

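  8. The ABI requires RSP to be 16-byte aligned immediately before every CALL. CALL then pushes an 8-byte return address, so on function entry RSP is 8 bytes off a 16-byte boundary. The standard function prologue (push rbp; mov rbp, rsp) pushes 8 more bytes and so restores 16-byte alignment. A subsequent sub rsp, N must therefore use N as a multiple of 16, unless you pushed an odd number of additional registers after rbp, in which case N must be 8 more than a multiple of 16.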

  9. C++ uses name mangling (_Z3fooi for foo(int)). Use extern "C" in C++ headers to suppress mangling for functions implemented in assembly or plain C. Without it, the C++ compiler emits a call to the mangled name, and the linker fails with an undefined reference because the assembly file exports only the unmangled name.

  10. call printf actually calls printf@plt, the PLT stub. With lazy binding, the first call through the stub invokes the dynamic linker, which resolves printf's address and stores it in the GOT. On subsequent calls the GOT already holds the address, and the stub jumps through the GOT entry straight to printf. A writable GOT is also a security attack surface (GOT overwrite); linking with full RELRO (-z relro -z now) resolves every symbol at load time and makes the GOT read-only.

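  11. Accessing an external C global variable requires [rel varname] to load its value; varname alone gives the address, not the contents. For position-independent code (shared libraries), go through the GOT: mov rax, [rel varname wrt ..got] loads the variable's address from its GOT entry, and a second load such as mov eax, [rax] fetches the value.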

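  12. xor eax, eax is the canonical way to zero RAX before a variadic call; writing EAX zero-extends into the full 64-bit register. It encodes in 2 bytes, 3 bytes shorter than the 5-byte mov eax, 0. Unlike mov, it overwrites the arithmetic flags, which is harmless immediately before a call because flags are not preserved across calls. It's idiomatic: any reviewer recognizes it immediately as "set AL=0 for no FP args."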
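
The stack-alignment arithmetic in takeaway 8 is easy to get wrong once extra callee-saved registers are pushed, so it may help to check it with a small model. The following is a minimal sketch in C, not code from the chapter: the helper names (is_aligned16, frame_size, rsp_at_nested_call) are hypothetical, and the model assumes the prologue is push rbp, then zero or more further 8-byte pushes, then a single sub rsp, N.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: is this stack pointer value 16-byte aligned? */
int is_aligned16(uint64_t rsp) {
    return (rsp & 15u) == 0;
}

/*
 * Hypothetical helper: smallest N >= local_bytes for `sub rsp, N` that
 * leaves RSP 16-byte aligned. extra_pushes counts the 8-byte pushes made
 * after `push rbp` (for example push rbx, push r12).
 */
uint64_t frame_size(uint64_t local_bytes, unsigned extra_pushes) {
    /* Bytes already consumed below the aligned pre-call RSP:
       8 (return address) + 8 (push rbp) + 8 per extra push. */
    uint64_t used = 8 + 8 + 8ull * extra_pushes;
    /* Round the whole frame up to a multiple of 16, then remove the
       part that the call and the pushes already account for. */
    uint64_t total = (used + local_bytes + 15) & ~(uint64_t)15;
    return total - used;
}

/* Simulate call + prologue; return the RSP a nested call would see. */
uint64_t rsp_at_nested_call(uint64_t rsp_before_call,
                            uint64_t local_bytes,
                            unsigned extra_pushes) {
    assert(is_aligned16(rsp_before_call)); /* ABI rule at the call site */
    uint64_t rsp = rsp_before_call;
    rsp -= 8;                                     /* call: return address */
    rsp -= 8;                                     /* push rbp             */
    rsp -= 8ull * extra_pushes;                   /* push rbx, r12, ...   */
    rsp -= frame_size(local_bytes, extra_pushes); /* sub rsp, N           */
    return rsp;
}
```

Under these assumptions, frame_size(24, 0) gives 32 (a multiple of 16, as takeaway 8 states), while frame_size(16, 1) gives 24: after one extra push, N has to be 8 more than a multiple of 16 for a nested call to see an aligned RSP.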