Chapter 25 Key Takeaways: System Calls
- A system call is a supervised privilege escalation. The `syscall` instruction switches the CPU from ring 3 (user mode) to ring 0 (kernel mode) at a fixed entry point held in the `LSTAR` MSR. It is not a function call: user code cannot choose the target address, so no attacker can redirect it to arbitrary kernel code.
- `syscall` destroys `RCX` and `R11`. The instruction saves `RIP` into `RCX` and `RFLAGS` into `R11` for later use by `SYSRET`. Any values your program had in these registers are gone after a `syscall`. This is why the Linux syscall ABI uses `R10` (not `RCX`) for the fourth argument.
- The Linux x86-64 syscall ABI: RAX = syscall number; RDI, RSI, RDX, R10, R8, R9 = arguments 1–6; RAX = return value. A return value in the range -4095 to -1 means error; its absolute value is the errno code.
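
The register conventions above can be sketched in C with GCC-style inline assembly. This is a minimal illustration, assuming x86-64 Linux and a GCC or Clang compiler; `raw_syscall6` is a hypothetical helper name, not part of any library.

```c
#include <sys/syscall.h>   /* SYS_write, SYS_close, ... (syscall numbers) */

/* Hypothetical helper: issue a raw x86-64 Linux syscall with six arguments. */
static long raw_syscall6(long nr, long a1, long a2, long a3,
                         long a4, long a5, long a6)
{
    long ret;
    /* Arguments 4-6 have no single-letter constraint, so pin them by name.
     * Note that argument 4 goes in R10, not RCX. */
    register long r10 __asm__("r10") = a4;
    register long r8  __asm__("r8")  = a5;
    register long r9  __asm__("r9")  = a6;

    __asm__ volatile (
        "syscall"
        : "=a"(ret)                          /* RAX: return value          */
        : "a"(nr),                           /* RAX: syscall number        */
          "D"(a1), "S"(a2), "d"(a3),         /* RDI, RSI, RDX: args 1-3    */
          "r"(r10), "r"(r8), "r"(r9)         /* R10, R8, R9: args 4-6      */
        : "rcx", "r11", "memory");           /* RCX and R11 are clobbered  */
    return ret;
}
```

For example, `raw_syscall6(SYS_write, 1, (long)"hi\n", 3, 0, 0, 0)` writes to standard output and normally returns 3; on failure it returns a negative errno value rather than -1. Listing `rcx` and `r11` as clobbers is what tells the compiler not to keep live values in those registers across the instruction.
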
- Error handling in raw assembly: If RAX is negative after a syscall, `-RAX` equals the errno. The C library converts this to a `-1` return plus a write to the thread-local `errno` variable. In raw assembly, you manage this yourself.
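
The conversion the C library performs can be sketched as a small pure function. This is an illustration rather than glibc's actual code; `libc_style_result` is a hypothetical name, and the -4095 to -1 error range is the convention Linux uses so that large valid results (such as high `mmap` addresses) are not mistaken for errors.

```c
#include <errno.h>

/* Hypothetical helper: turn a raw kernel return value into the libc
 * convention (-1 plus errno on error, the value itself otherwise). */
static long libc_style_result(long raw)
{
    /* Only -4095..-1 are error codes; compare as unsigned so that one
     * test covers exactly that range. */
    if ((unsigned long)raw >= (unsigned long)-4095L) {
        errno = (int)-raw;   /* e.g. raw == -2 becomes errno == ENOENT */
        return -1;
    }
    return raw;
}
```

In raw assembly the same logic is typically a compare against -4095 followed by a `neg` of RAX on the error path.
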
- The vDSO eliminates ring transitions for hot paths. `gettimeofday` and `clock_gettime` can be called millions of times per second without ever entering the kernel, because the kernel maps a shared page of time data into every process. This is ~10x faster than a real syscall.
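
The two paths can be put side by side in C. This sketch assumes a Linux system whose C library routes `clock_gettime` through the vDSO (glibc typically does on x86-64 and ARM64); calling `syscall(SYS_clock_gettime, ...)` forces a real kernel entry for comparison. The function names are hypothetical.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Monotonic time in nanoseconds via the library call, which normally
 * resolves to the vDSO and stays in user mode. Returns -1 on failure. */
static int64_t now_via_libc(void)
{
    struct timespec ts;
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return -1;
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Same clock, but through the generic syscall() wrapper, which always
 * performs a real ring transition. Returns -1 on failure. */
static int64_t now_via_syscall(void)
{
    struct timespec ts;
    if (syscall(SYS_clock_gettime, CLOCK_MONOTONIC, &ts) != 0)
        return -1;
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}
```

Timing each function in a loop of a few million iterations is a simple way to observe the difference on a given machine; the exact ratio depends on the CPU, the clock source, and the kernel configuration.
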
- ARM64 uses `SVC #0` with X8 holding the syscall number. The argument registers X0–X5 correspond to arguments 1–6, and the return value comes back in X0. Crucially, ARM64 syscall numbers are completely different from x86-64 numbers (e.g., write=64 on ARM64, write=1 on x86-64).
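
The contrast between the two architectures can be seen in a single function that issues `write` on either one. This is a sketch assuming Linux and a GCC or Clang compiler; `raw_write` is a hypothetical name.

```c
#include <stddef.h>

/* Hypothetical helper: raw write(2) on x86-64 or ARM64 Linux.
 * Returns the byte count, or a negative errno value on failure. */
static long raw_write(int fd, const void *buf, size_t len)
{
#if defined(__x86_64__)
    long ret;
    __asm__ volatile (
        "syscall"
        : "=a"(ret)
        : "a"(1L),                          /* write is number 1 here   */
          "D"((long)fd), "S"(buf), "d"(len)
        : "rcx", "r11", "memory");
    return ret;
#elif defined(__aarch64__)
    register long        x8 __asm__("x8") = 64;   /* write is number 64 */
    register long        x0 __asm__("x0") = fd;   /* arg 1, then result */
    register const void *x1 __asm__("x1") = buf;  /* arg 2              */
    register size_t      x2 __asm__("x2") = len;  /* arg 3              */
    __asm__ volatile (
        "svc #0"
        : "+r"(x0)
        : "r"(x8), "r"(x1), "r"(x2)
        : "memory");
    return x0;
#else
#error "this sketch covers only x86-64 and ARM64 Linux"
#endif
}
```

The error convention is the same on both architectures: a bad file descriptor yields `-EBADF` in the return register.
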
- `strace` makes the system call layer visible without modifying the program. Every file open, network connection, memory allocation, and process creation appears in the trace. For debugging and security analysis, it is often the fastest way to understand what a program is actually doing.
- The C standard library is mostly syscall wrappers. `open`, `read`, `write`, `fork`, and `exec` are thin wrappers around raw syscalls, and even `malloc` obtains its memory through `brk` or `mmap`. Building your own minimal libc demonstrates that there is no magic below the syscall interface.
- `sys_mmap` is the general-purpose memory interface. It handles anonymous memory allocation (like `malloc` for large chunks), file mapping, and shared memory between processes. The flags `MAP_PRIVATE|MAP_ANONYMOUS` give you zeroed anonymous memory; `MAP_SHARED` gives shared memory that survives a `fork`.
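
Both flag combinations can be exercised from C through the `mmap` library wrapper. This is a minimal sketch assuming Linux; the two function names are hypothetical.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if a fresh private anonymous mapping reads back as all
 * zero bytes, 0 if not, and -1 if mmap itself fails. */
static int anon_page_is_zeroed(size_t len)
{
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    int zeroed = 1;
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0) { zeroed = 0; break; }
    }
    munmap(p, len);
    return zeroed;
}

/* The child stores `value` into a shared anonymous mapping; the parent
 * returns what it reads there after the child exits (-1 on failure). */
static int value_seen_by_parent(int value)
{
    int *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;
    *shared = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(shared, sizeof *shared);
        return -1;
    }
    if (pid == 0) {            /* child: write and leave immediately */
        *shared = value;
        _exit(0);
    }
    waitpid(pid, NULL, 0);     /* parent: wait, then read the child's write */
    int seen = *shared;
    munmap(shared, sizeof *shared);
    return seen;
}
```

With `MAP_PRIVATE` in place of `MAP_SHARED` in the second function, the child's store would land in its own copy-on-write page and the parent would still read 0.
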
- `sys_execve` replaces the process image entirely. On success, it never returns: the calling code no longer exists. On failure, it returns a negative errno and the original code continues. This is the foundation of all process launching: shell, loader, and supervisor all use it.
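
The "never returns on success" behavior shapes how launchers are written, as this fork-then-exec sketch shows. It assumes Linux and uses the libc `execve` wrapper, which reports failure as -1 with `errno` set rather than as a raw negative value. `spawn_and_wait` is a hypothetical name, and the exit code 127 for a failed exec follows the shell convention.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run argv[0] in a child process and return its
 * exit status, 127 if the exec failed, or -1 on fork/wait problems. */
static int spawn_and_wait(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        execve(argv[0], argv, environ);
        /* Reached only if execve failed: on success this code has
         * already been replaced by the new program image. */
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

There is no success branch after `execve` because there is nothing left to run it: any statement following the call is by definition the failure path.
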
- The MinOS syscall dispatcher uses `swapgs` to access per-CPU data. On `syscall` entry, the kernel is running on the user's stack. Before doing anything else, it must switch to the kernel stack. The per-CPU kernel stack address is stored in kernel GS-relative memory, accessed after `swapgs` swaps in the kernel GS base.
- Suspicious syscall patterns are security-relevant. Reading SSH keys then opening a network connection to an unknown IP, calling `ptrace(PTRACE_TRACEME)` as an anti-debugging check, writing to `.bashrc` or crontab directories: these behaviors are visible in `strace` output regardless of how obfuscated the binary is, because the program must eventually make system calls to carry them out.
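
A first-pass filter over saved `strace` output can be as simple as substring matching. The sketch below is deliberately crude: the function name and the marker list are illustrative assumptions, and real analysis would correlate sequences of calls rather than flag single lines.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if one line of strace output contains
 * any marker from a small, illustrative watch list, else 0. */
static int looks_suspicious(const char *trace_line)
{
    static const char *const markers[] = {
        "PTRACE_TRACEME",     /* anti-debugging self-trace       */
        "/.ssh/id_",          /* private SSH key files           */
        "/.bashrc",           /* shell startup persistence       */
        "/etc/cron",          /* system crontab directories      */
        "/var/spool/cron",    /* per-user crontabs               */
    };

    for (size_t i = 0; i < sizeof markers / sizeof markers[0]; i++) {
        if (strstr(trace_line, markers[i]) != NULL)
            return 1;
    }
    return 0;
}
```

Feeding it each line of a trace captured with `strace -f -o trace.log` gives a quick shortlist to inspect by hand. A hit is only a prompt for closer inspection: plenty of legitimate programs read `.bashrc` or touch cron directories.
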