Case Study 27-2: How mmap() Creates Shared Memory Between Processes

The Page Tables Behind MAP_SHARED

Shared memory is one of the most interesting things you can do with the virtual memory system. Two separate processes, with entirely separate virtual address spaces, both see the same physical pages. A write in one process immediately appears in the other — no syscall, no copy, just a write to a virtual address that happens to map to the same physical frame.

This case study shows exactly what happens at the page table level when you create shared memory with mmap(MAP_SHARED|MAP_ANONYMOUS) and then fork.

The Setup

; shared_memory_demo.asm
; Demonstrates MAP_SHARED across fork: parent and child share physical pages.
; Build: nasm -f elf64 shared_memory_demo.asm -o shm_demo.o && ld shm_demo.o -o shm_demo

section .data
    msg_parent   db "Parent wrote: 0xDEADBEEFCAFEBABE", 10
    msg_parent_l equ $ - msg_parent
    msg_child    db "Child read:   0x"
    msg_hex      db "????????????????", 10   ; 16-char hex placeholder
    msg_child_l  equ $ - msg_child

section .bss
    shared_page: resq 1     ; will hold pointer to shared mapping

section .text
global _start

_start:
    ;=== Step 1: Create shared anonymous mapping ===
    ; mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS, -1, 0)
    mov rax, 9              ; sys_mmap
    xor rdi, rdi            ; addr = kernel chooses
    mov rsi, 4096           ; one page
    mov rdx, 3              ; PROT_READ | PROT_WRITE
    mov r10, 0x21           ; MAP_SHARED(0x01) | MAP_ANONYMOUS(0x20)
    mov r8, -1              ; fd = -1 (anonymous)
    xor r9, r9              ; offset = 0
    syscall
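    ; A raw syscall reports failure as -errno in rax (-4095..-1).
    ; MAP_FAILED (-1) is only the libc wrapper's convention.
    cmp rax, -4095
    jae .mmap_failed        ; unsigned compare: anything >= -4095 is an error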
    mov [shared_page], rax  ; save the mapping address
    mov r12, rax            ; r12 = shared page pointer

    ;=== Step 2: Initialize the shared page ===
    ; Start from a known value (anonymous pages are already zero-filled;
    ; the stores also fault the page in before fork)
    mov qword [r12], 0      ; clear to 0 initially
    mov qword [r12 + 8], 0  ; clear next qword too

    ;=== Step 3: Fork ===
    mov rax, 57             ; sys_fork
    syscall
    test rax, rax
    js .fork_failed
    jz .child_code

    ;=== Parent code ===
    ; Write to shared memory
    mov rax, 0xDEADBEEFCAFEBABE
    mov [r12], rax          ; write to shared page

    ; Print confirmation
    mov rax, 1
    mov rdi, 1
    mov rsi, msg_parent
    mov rdx, msg_parent_l
    syscall

    ; Wait for child
    mov rax, 61             ; sys_wait4
    mov rdi, -1             ; wait for any child
    xor rsi, rsi
    xor rdx, rdx
    xor r10, r10
    syscall

    ; Unmap
    mov rax, 11             ; sys_munmap
    mov rdi, r12
    mov rsi, 4096
    syscall

    ; Exit
    mov rax, 60
    xor rdi, rdi
    syscall

    ;=== Child code ===
.child_code:
    ; Crude delay so the parent usually writes first. This is a race, not
    ; synchronization: real code would wait on a futex or an atomic flag.
    ; For demo: just loop 100M times
    mov rcx, 100000000
.wait_loop:
    dec rcx
    jnz .wait_loop

    ; Read the value the parent wrote
    mov rbx, [r12]          ; read from shared page

    ; Print the value
    ; ... (convert rbx to hex, print) ...
    ; For brevity: print a message
    mov rax, 1
    mov rdi, 1
    mov rsi, msg_child
    mov rdx, msg_child_l
    syscall

    ; Verify it matches expected value
    ; (cmp has no 64-bit immediate form, so load the constant into a register)
    mov rax, 0xDEADBEEFCAFEBABE
    cmp rbx, rax
    jne .mismatch
    ; success: parent's write visible in child
    mov rax, 60
    xor rdi, rdi
    syscall

.mismatch:
    mov rax, 60
    mov rdi, 1              ; non-zero = failure
    syscall

.mmap_failed:
.fork_failed:
    mov rax, 60
    mov rdi, 1
    syscall

What Happens in the Page Tables

Before fork:

Parent's Virtual Address Space:
┌─────────────────────────────────┐
│ virtual: 0x7f1234560000         │ → Physical Frame: 0x0001A000
│ (shared mapping, MAP_SHARED)    │   [contains: 0x0000000000000000]
└─────────────────────────────────┘

Physical Memory:
Frame 0x1A:  [00 00 00 00 00 00 00 00 ...]

After fork:

Parent's PTE:                      Child's PTE:
VA: 0x7f1234560000                 VA: 0x7f1234560000
  → PA: 0x1A000                      → PA: 0x1A000
  flags: P|R/W|U                     flags: P|R/W|U
         ↑                                   ↑
         └─────────── SAME PHYSICAL PAGE ────┘

Physical Memory:
Frame 0x1A:  [00 00 00 00 00 00 00 00 ...]
             ↑ both parent and child see this

After parent writes 0xDEADBEEFCAFEBABE:

Physical Memory:
Frame 0x1A:  [BE BA FE CA EF BE AD DE ...]
              ↑ both processes' virtual addresses map here
              ↑ child reads this with an ordinary load: no syscall, no copy

MAP_SHARED vs MAP_PRIVATE After Fork

The behavior is fundamentally different:

MAP_PRIVATE | MAP_ANONYMOUS + fork: Both parent and child initially share physical pages (copy-on-write). When either writes, a new physical frame is allocated, the content is copied, and that process's PTE is updated to point to the new frame. The other process is unaffected.

MAP_SHARED | MAP_ANONYMOUS + fork: Both parent and child map the same physical pages for as long as each keeps the mapping. Writes by either are immediately visible to the other. No copy-on-write ever occurs for this mapping (though the kernel still reference-counts the physical pages).

# Demonstrate with strace to see mmap flags:
strace ./shm_demo 2>&1 | grep mmap
# mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS, -1, 0) = 0x7f...
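
The difference between the two flags can also be observed from C. The following is a minimal sketch, not part of the demo above: the helper name `parent_view_after_child_write` and its `ok` out-parameter are hypothetical, chosen only for illustration. It maps one anonymous page, lets a forked child store a value, and returns what the parent reads once the child has exited.

```c
#define _DEFAULT_SOURCE
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: map one anonymous page with the given sharing flag
 * (MAP_SHARED or MAP_PRIVATE), let a forked child store `value` into it,
 * and return what the parent reads after the child has exited.
 * *ok is set to 1 only if every step succeeded. */
static uint64_t parent_view_after_child_write(int sharing_flag,
                                              uint64_t value, int *ok)
{
    *ok = 0;
    uint64_t *page = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                          sharing_flag | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return 0;
    page[0] = 0;                  /* touch the page before forking */

    pid_t pid = fork();
    if (pid < 0) {
        munmap(page, 4096);
        return 0;
    }
    if (pid == 0) {
        /* Child: with MAP_PRIVATE this store triggers copy-on-write and
         * lands in a private frame; with MAP_SHARED it lands in the
         * frame the parent also maps. */
        page[0] = value;
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0)
        *ok = 1;

    uint64_t seen = page[0];      /* parent's view after the child is gone */
    munmap(page, 4096);
    return seen;
}
```

Called with MAP_SHARED, the helper should return the child's value; called with MAP_PRIVATE, it should return 0, because the child's store went to its own copy of the page.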

File-Backed MAP_SHARED

MAP_SHARED with a file descriptor creates a mapping whose stores modify the file's cached pages, which the kernel then writes back to the file:

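#include <fcntl.h>      // open
#include <stdint.h>     // uint64_t
#include <sys/mman.h>   // mmap, munmap
#include <unistd.h>     // close

// Map a file for shared read-write access
// (file_size is assumed to be known and no larger than the file)
int fd = open("data.bin", O_RDWR);
if (fd < 0) { /* handle error */ }
void *ptr = mmap(NULL, file_size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
if (ptr == MAP_FAILED) { /* handle error */ }
close(fd);  // can close fd after mapping; mapping remains

// Write to the mapping: this modifies the file's cached page
*(uint64_t*)ptr = 0xDEADBEEF;  // written back to the file later

// Another process mapping the same file with MAP_SHARED sees this write

// Unmap (dirty pages are still written back eventually; munmap does not wait)
munmap(ptr, file_size);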

This is the mechanism behind memory-mapped database files and memory-mapped message queues. Writes to the mapping are visible immediately to other processes that map the same file with MAP_SHARED, because they all map the same cached pages, and are flushed to the underlying storage later. The msync() syscall (26) with MS_SYNC blocks until the dirty pages have been written out.
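
As an illustration of the write-then-flush sequence, here is a hedged sketch in C. The helper name `write_via_mapping` and the 4096-byte file length are assumptions made for the example. It stores a value through a MAP_SHARED file mapping, calls msync() with MS_SYNC, and then reads the same bytes back through the file descriptor with pread().

```c
#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: store `value` at offset 0 of the file at `path`
 * through a MAP_SHARED mapping, force write-back with msync(MS_SYNC),
 * then read the same bytes back with pread().
 * Returns 0 on success, -1 on any failure. */
static int write_via_mapping(const char *path, uint64_t value,
                             uint64_t *readback)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    /* The file must be at least as long as the range we map and touch. */
    if (ftruncate(fd, 4096) != 0) {
        close(fd);
        return -1;
    }

    uint64_t *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    p[0] = value;                          /* dirties the cached page */
    int rc = msync(p, 4096, MS_SYNC);      /* wait for write-back */

    /* read()/pread() go through the same page cache, so the file's
     * contents match what was stored through the mapping. */
    if (rc == 0 &&
        pread(fd, readback, sizeof *readback, 0) != (ssize_t)sizeof *readback)
        rc = -1;

    munmap(p, 4096);
    close(fd);
    return rc;
}
```

The pread() check shows only that the mapping and the file agree. What msync(MS_SYNC) adds is durability: it returns after the dirty pages have been handed to storage, which is what matters if the machine loses power afterwards.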

Performance Characteristics

Shared memory communication is the fastest inter-process communication available:
- Writing to a shared mapping: ~1 ns (a cache write, no kernel involvement)
- A pipe write: ~5000 ns (kernel copies data)
- A socket send/recv on localhost: ~10,000 ns

The cost of shared memory is synchronization: without mutexes or atomic operations (Chapter 30), concurrent access can corrupt data. The raw speed of the mechanism is why databases, message brokers, and other high-performance IPC systems use shared memory.
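
One way to make the ordering between writer and reader explicit is an atomic flag placed in the shared page. The sketch below is an assumption-laden illustration, not the chapter's method: the `shared_slot` layout and the `handshake` helper are hypothetical, and the child spin-waits with sched_yield() rather than sleeping on a futex. The parent stores the payload and then sets the flag with release ordering; the child loads the flag with acquire ordering before reading the payload.

```c
#define _DEFAULT_SOURCE
#include <sched.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical layout of the shared page, for this sketch only. */
struct shared_slot {
    atomic_int ready;     /* 0 = payload not yet written, 1 = valid */
    uint64_t   payload;
};

/* Parent publishes `value`; the child waits for the flag and reports
 * through its exit status whether it read the same value.
 * Returns 1 if the child saw the value, 0 if not, -1 on error. */
static int handshake(uint64_t value)
{
    struct shared_slot *s = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                                 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (s == MAP_FAILED)
        return -1;
    atomic_init(&s->ready, 0);

    pid_t pid = fork();
    if (pid < 0) {
        munmap(s, 4096);
        return -1;
    }
    if (pid == 0) {
        /* Child: spin until the parent publishes. The acquire load
         * pairs with the parent's release store below. */
        while (atomic_load_explicit(&s->ready, memory_order_acquire) == 0)
            sched_yield();
        _exit(s->payload == value ? 0 : 1);
    }

    /* Parent: write the payload first, then set the flag. */
    s->payload = value;
    atomic_store_explicit(&s->ready, 1, memory_order_release);

    int status;
    int result = -1;
    if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        result = (WEXITSTATUS(status) == 0);
    munmap(s, 4096);
    return result;
}
```

Unlike a fixed delay loop, this does not depend on how fast either process runs: the child cannot read the payload until the flag says it is valid. A spin-wait burns CPU while it waits, so it suits only short waits.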

🔐 Security Note: Two processes can share memory in three ways: they inherit a MAP_SHARED mapping that a common ancestor created before fork, they both map the same file with MAP_SHARED, or they map the same POSIX shared memory object (shm_open). The kernel enforces file permissions on shared file mappings. An anonymous MAP_SHARED mapping has no name, so only processes that inherit it through fork can use it.