Chapter 4 Further Reading: Memory
Core References
1. "What Every Programmer Should Know About Memory," Ulrich Drepper, 2007. Free PDF at people.redhat.com/drepper/cpumemory.pdf
The definitive non-academic treatment of the memory hierarchy for working programmers. Covers DRAM organization, the cache hierarchy (L1/L2/L3), the TLB (Translation Lookaside Buffer), NUMA (Non-Uniform Memory Access), and the performance implications of each. Particularly relevant to Chapter 4's alignment section: Drepper explains why a misaligned access that straddles two cache lines is slower than an aligned one, and why the stride of an array traversal determines how well the hardware prefetcher can keep up. A little over 100 pages; dense but practical. Essential reading for anyone writing performance-critical assembly.
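The cache-line split mentioned above can be checked with simple arithmetic. The following is a minimal sketch, assuming 64-byte cache lines (typical of current x86-64 processors, but an assumption here rather than something taken from Drepper's text); the function name is illustrative.

```python
CACHE_LINE = 64  # assumed line size in bytes; typical on x86-64, not universal


def crosses_cache_line(addr: int, size: int, line: int = CACHE_LINE) -> bool:
    """Return True if an access of `size` bytes at `addr` touches two lines."""
    first_line = addr // line
    last_line = (addr + size - 1) // line
    return first_line != last_line


# An 8-byte load at offset 60 covers bytes 60..67, so it straddles lines 0 and 1.
print(crosses_cache_line(60, 8))
# The same load at offset 56 covers bytes 56..63 and stays inside line 0.
print(crosses_cache_line(56, 8))
```

A naturally aligned access (address a multiple of its size, with the size a power of two no larger than the line) can never straddle a line, which is one reason alignment directives matter in hand-written assembly.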
2. Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A, Intel Corporation. Chapters 4-5 (Paging and Protection) and the "Memory Ordering" section of the Multiple-Processor Management chapter. Free at intel.com/sdm
Volume 3A covers system programming topics, including the virtual memory system. The Paging and Protection chapters (Chapters 4 and 5 in many revisions; numbering shifts between revisions, so search by title) describe the x86-64 paging structures in detail: the 4-level page table (PML4, PDPT, PD, PT), page sizes (4KB, 2MB, 1GB), and the protection bits, including the NX/XD bit. The "Memory Ordering" section (Section 8.2 in many revisions) covers when the CPU is allowed to reorder loads and stores relative to each other, and which instructions (MFENCE, LFENCE, SFENCE, and LOCK-prefixed instructions) enforce ordering. It is the reference for understanding why the stack layout from the chapter is reliable even on out-of-order CPUs.
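As a companion to the paging description, here is a small sketch of how a virtual address is divided among the four table levels. It assumes 48-bit virtual addresses and 4KB pages (9 index bits per level plus a 12-bit page offset); the function and dictionary keys are illustrative names, not Intel's.

```python
def split_virtual_address(va: int) -> dict:
    """Split a 48-bit x86-64 virtual address into 4-level paging indices."""
    return {
        "pml4": (va >> 39) & 0x1FF,   # bits 47..39 select the PML4 entry
        "pdpt": (va >> 30) & 0x1FF,   # bits 38..30 select the PDPT entry
        "pd": (va >> 21) & 0x1FF,     # bits 29..21 select the page-directory entry
        "pt": (va >> 12) & 0x1FF,     # bits 20..12 select the page-table entry
        "offset": va & 0xFFF,         # bits 11..0 are the byte offset in the page
    }


# A hypothetical address near the top of user space.
print(split_virtual_address(0x00007FFFFFFFE123))
```

With 2MB or 1GB pages the walk stops one or two levels earlier and the remaining low bits all become page offset, which this sketch does not model.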
3. "Linux Kernel Development," Robert Love, Addison-Wesley, 3rd Edition, 2010
Chapter 15 covers the Linux process address space from the kernel's perspective. It explains the mm_struct and vm_area_struct data structures that the kernel uses to track memory regions (the same information exposed in /proc/self/maps). Chapter 12 covers memory allocation inside the kernel. Seeing how the kernel manages the process address space makes it clearer what each line of /proc/self/maps represents. Love's book is clear and well organized, and it is accessible to programmers who don't plan to become kernel developers.
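Each line of /proc/self/maps corresponds roughly to one vm_area_struct. The sketch below parses a single line into its fields; the sample line is made up for illustration, and the Mapping type and function name are hypothetical.

```python
from typing import NamedTuple


class Mapping(NamedTuple):
    start: int
    end: int
    perms: str
    offset: int
    path: str


def parse_maps_line(line: str) -> Mapping:
    """Parse one /proc/<pid>/maps line into a Mapping."""
    # Fields: address range, permissions, file offset, device, inode, optional path.
    fields = line.split(maxsplit=5)
    start_hex, end_hex = fields[0].split("-")
    path = fields[5].strip() if len(fields) > 5 else ""
    return Mapping(int(start_hex, 16), int(end_hex, 16),
                   fields[1], int(fields[2], 16), path)


# Illustrative sample line, not captured from a real process.
sample = "7ffd1c3e2000-7ffd1c403000 rw-p 00000000 00:00 0                          [stack]"
m = parse_maps_line(sample)
print(hex(m.start), hex(m.end), m.perms, m.path, m.end - m.start)
```

On a Linux system the same function can be applied to every line read from open("/proc/self/maps") to list the regions of the running interpreter.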
4. "The Linux Programming Interface," Michael Kerrisk, No Starch Press, 2010. Chapters 49-50 (Memory Mappings, Virtual Memory Operations)
Kerrisk's treatment of virtual memory from the user-space programmer's perspective. Chapter 6 gives the background on process memory layout and virtual memory management. Chapter 49 covers memory mappings and mmap(). Chapter 50 covers virtual memory operations: mprotect, mlock, mincore, and madvise. The mmap system call introduced in Chapter 4 of this book is covered exhaustively there, including the main flags (MAP_PRIVATE, MAP_ANONYMOUS, MAP_SHARED, MAP_FIXED) and their implications. The definitive reference for Linux system programming.
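To see the flags in action without writing C, the sketch below requests one private anonymous page through Python's standard mmap module, which wraps the same system call. It assumes a Unix-like system (the flags and prot arguments are not available on Windows); the helper name is illustrative.

```python
import mmap


def anonymous_private_page() -> mmap.mmap:
    """Map one zero-filled page that is private to this process."""
    return mmap.mmap(
        -1,                                   # no backing file
        mmap.PAGESIZE,                        # one page
        flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
        prot=mmap.PROT_READ | mmap.PROT_WRITE,
    )


region = anonymous_private_page()
region[0:5] = b"hello"      # writes land in the process's private copy
print(region[0:5])
region.close()              # unmaps the page
```

Swapping MAP_PRIVATE for MAP_SHARED would make the page visible to child processes created by fork, which is the distinction Kerrisk spends most time on.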
Virtual Memory and Paging
5. "Operating Systems: Three Easy Pieces," Remzi and Andrea Arpaci-Dusseau. Chapters 12-24 (Virtualization: Memory). Free PDF at ostep.org
The most accessible explanation of virtual memory for students. These chapters cover address spaces, address translation, segmentation, paging, TLBs, and multi-level page tables. OSTEP presents the concepts that Intel's manual assumes you already know. If the description of "canonical addresses" or "4-level paging" in the chapter was unclear, OSTEP's background is the right starting point. The homework assignments include simulation exercises that build intuition for page table lookups.
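For readers revisiting the chapter's notion of canonical addresses, the rule fits in a few lines. This is a sketch that assumes 48-bit virtual addressing, where bits 63 through 47 must all be equal; the function name is illustrative.

```python
def is_canonical(va: int, va_bits: int = 48) -> bool:
    """True if the upper bits of a 64-bit address are a sign extension."""
    upper = va >> (va_bits - 1)                # bits 63..47 when va_bits is 48
    all_ones = (1 << (64 - va_bits + 1)) - 1   # the same field with every bit set
    return upper == 0 or upper == all_ones


print(is_canonical(0x00007FFFFFFFFFFF))  # top of the lower half
print(is_canonical(0x0000800000000000))  # first address in the non-canonical hole
print(is_canonical(0xFFFF800000000000))  # bottom of the upper half
```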
6. "Computer Architecture: A Quantitative Approach," Hennessy and Patterson, 6th Edition (Morgan Kaufmann)
Chapter 2 covers memory hierarchy design: cache organization, cache performance, virtual memory, and memory protection. Appendix B reviews the fundamentals. Hennessy and Patterson provide the quantitative framework for understanding how cache misses translate to performance penalties and how TLB efficiency affects overall throughput. This is the academic foundation for the performance intuitions in this chapter's alignment section.
ELF Format and Memory Layout
7. ELF Specification: Executable and Linkable Format. Tool Interface Standard (TIS) Committee. Available at refspecs.linuxbase.org
The formal specification of the ELF file format used for Linux executables, object files, and shared libraries. Describes the ELF header, program headers (segments, which map to the runtime memory layout), section headers (sections, used by the linker), symbol tables, relocation entries, and dynamic linking structures. When readelf output in this chapter shows "PT_LOAD segments" or "SHF_ALLOC sections," this specification defines what those mean. Free and authoritative.
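The first bytes of the ELF header described by the specification can be decoded by hand. The sketch below reads the identification bytes plus e_type and e_machine from a hand-built header prefix rather than a real file; the function name and returned keys are illustrative, and the value 62 for x86-64 comes from the later x86-64 ABI supplement, not the original TIS document.

```python
import struct


def parse_elf_ident(header: bytes) -> dict:
    """Decode e_ident plus e_type and e_machine from the start of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class, ei_data = header[4], header[5]
    endian = "<" if ei_data == 1 else ">"   # ELFDATA2LSB = 1, ELFDATA2MSB = 2
    # e_type and e_machine are two 16-bit fields right after the 16-byte e_ident.
    e_type, e_machine = struct.unpack_from(endian + "HH", header, 16)
    return {
        "bits": {1: 32, 2: 64}.get(ei_class),   # ELFCLASS32 = 1, ELFCLASS64 = 2
        "little_endian": ei_data == 1,
        "e_type": e_type,
        "e_machine": e_machine,
    }


# Hand-built 20-byte prefix: 64-bit, little-endian, ET_EXEC (2), EM_X86_64 (62).
sample = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8) + struct.pack("<HH", 2, 62)
print(parse_elf_ident(sample))
```

Passing the first 20 bytes of a real binary should yield the same class, data encoding, type, and machine that readelf -h reports.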
8. "Linkers and Loaders," John R. Levine, Morgan Kaufmann, 1999. Free online at iecc.com/linker
The definitive explanation of the linking and loading process. Chapters cover object files, symbols, relocation, shared libraries, and dynamic linking. Particularly relevant: the explanation of how the linker combines .text sections from multiple object files, how it resolves relocations, and how the dynamic linker (ld.so) maps shared libraries into the process address space. The book predates x86-64, but the fundamental concepts are unchanged.
Security and Memory Layout
9. "Smashing the Stack for Fun and Profit," Aleph One (Elias Levy), Phrack Magazine, Issue 49, 1996. Available at phrack.org
The paper that introduced buffer overflow exploitation to a wide audience. Though written for 32-bit x86 on Linux, its core concepts (stack layout, return address location, shellcode placement) carry over to x86-64 once adjusted for 8-byte addresses and register-based argument passing. The examples as written will not work on a modern system with the mitigations described in the next entry enabled. Reading this paper with the stack frame diagram from Chapter 4 in hand makes every step concrete. Historically significant and still technically valuable as a mental model builder.
10. "Exploit Mitigation Techniques," various authors. PaX Team, Grsecurity, lwn.net articles
Not a single publication but a body of documentation for the kernel patches, compiler features, and linker features that implement NX, ASLR, RELRO, stack canaries, and CFI (Control Flow Integrity). The PaX ASLR documentation explains the randomization of the stack, heap, mmap region, and executable base address. The GCC stack protector documentation explains canary placement (which this chapter references). Understanding these mitigations at the implementation level (which bytes they protect, which conditions they detect) requires understanding the memory layout they protect. They are the practical consequences of the stack frame knowledge from this chapter.