Case Study 23-2: Writing a MinOS Kernel Linker Script and Boot Sequence
Objective
Apply the linker script knowledge from the chapter to the MinOS kernel project. We will write a complete linker script that places the kernel in memory correctly for a GRUB-booted x86 system, implement the boot assembly that satisfies the Multiboot2 specification, and trace the full path from GRUB handing control to our kernel_main() function. The kernel in this case study stays in 32-bit protected mode; the switch to 64-bit long mode is left to Chapter 25.
Background: Booting Without an OS
A kernel has no operating system beneath it. This means:
- No dynamic linker — the kernel is statically linked
- No main() — the entry point is a raw assembly label
- No automatic zero-initialization of .bss — the kernel must do it
- No standard load address — the kernel specifies its own memory map
GRUB (Grand Unified Bootloader) solves the problem of getting the kernel from disk into memory. The Multiboot2 specification defines a contract: the kernel places a specific header structure in its binary, GRUB finds it, loads the kernel, and jumps to the entry point with machine state in a known condition.
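The header half of that contract can be sketched in C. This is a minimal host-side illustration, not part of the MinOS sources: the helper names (mb2_checksum, put_le32, mb2_build_minimal_header) are hypothetical, and the layout assumed is the smallest possible header, four 32-bit fields followed by an 8-byte end tag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MB2_HEADER_MAGIC 0xE85250D6u  /* Multiboot2 header magic */
#define MB2_ARCH_I386    0u           /* 32-bit protected mode */
#define MB2_HEADER_LEN   24u          /* 16-byte fixed part + 8-byte end tag */

/* magic + arch + length + checksum must equal 0 modulo 2^32. */
static uint32_t mb2_checksum(uint32_t magic, uint32_t arch, uint32_t length) {
    return (uint32_t)(0u - (magic + arch + length));
}

/* Store a 32-bit value in little-endian byte order, as x86 does. */
static void put_le32(uint8_t *p, uint32_t v) {
    p[0] = (uint8_t)(v);
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Hypothetical helper: write a minimal header (fixed fields + end tag). */
static void mb2_build_minimal_header(uint8_t out[MB2_HEADER_LEN]) {
    assert(out != NULL);
    put_le32(out + 0,  MB2_HEADER_MAGIC);
    put_le32(out + 4,  MB2_ARCH_I386);
    put_le32(out + 8,  MB2_HEADER_LEN);
    put_le32(out + 12, mb2_checksum(MB2_HEADER_MAGIC, MB2_ARCH_I386,
                                    MB2_HEADER_LEN));
    put_le32(out + 16, 0);  /* end tag: type = 0 (u16), flags = 0 (u16) */
    put_le32(out + 20, 8);  /* end tag: size = 8 */
}
```

Because the fields are stored little-endian, the first four bytes GRUB looks for in the file are D6 50 52 E8, not the magic read left to right.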
The Memory Map
For a 32-bit protected mode kernel loaded by GRUB, the conventional layout is:
Physical Memory Layout (at boot, highest address at the top)
┌────────────────────────────────┐ 0xFFFFFFFF
│ Reserved (firmware, ACPI, MMIO)│
├────────────────────────────────┤ (top of RAM, machine-dependent)
│ Available RAM                  │
├────────────────────────────────┤ 0x00100000 + kernel_size
│ Kernel (loaded by GRUB)        │ ← We go here
│   .multiboot (header)          │
│   .text      (code)            │
│   .rodata    (constants)       │
│   .data      (initialized vars)│
│   .bss       (zeroed vars)     │
├────────────────────────────────┤ 0x00100000 (1 MB)
│ Reserved (VGA, BIOS ROM)       │
├────────────────────────────────┤ 0x000A0000
│ Conventional memory            │
├────────────────────────────────┤ 0x00000500
│ Real-mode IVT, BIOS data area  │
└────────────────────────────────┘ 0x00000000
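The kernel is loaded at 1 MB (0x00100000). Because minOS.elf is an ELF file, GRUB reads its PT_LOAD program headers and copies each segment to the physical address recorded there; a non-ELF image would have to supply the same information through the Multiboot2 address tag.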
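The two boundaries that matter most in this layout, 0x000A0000 and 0x00100000, can be captured in a few lines of C. The sketch below is illustrative only; the enum and the function classify_phys are hypothetical names, and everything at or above 1 MB is lumped together even though the upper end of the address space is not all RAM.

```c
#include <assert.h>  /* host-side checks only */
#include <stdint.h>

enum phys_region {
    REGION_LOW_MEMORY,       /* below 0xA0000: IVT, BIOS data, conventional RAM */
    REGION_LEGACY_RESERVED,  /* 0xA0000..0xFFFFF: VGA memory and BIOS ROM */
    REGION_ABOVE_1MB         /* 0x100000 and up: kernel image, then free RAM */
};

#define LEGACY_HOLE_START 0x000A0000u
#define KERNEL_LOAD_ADDR  0x00100000u

/* Classify a physical address against the boot-time layout. */
static enum phys_region classify_phys(uint32_t addr) {
    if (addr < LEGACY_HOLE_START) return REGION_LOW_MEMORY;
    if (addr < KERNEL_LOAD_ADDR)  return REGION_LEGACY_RESERVED;
    return REGION_ABOVE_1MB;
}
```

For example, classify_phys(0xB8000) returns REGION_LEGACY_RESERVED: the VGA text buffer the kernel writes to later sits inside the reserved hole just below the kernel.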
The Linker Script: minOS/linker.ld
/* minOS/linker.ld
 * Linker script for the MinOS teaching kernel.
 * Links a 32-bit kernel that GRUB loads at 1 MB physical.
 *
 * A production 64-bit kernel would typically run in the higher half
 * (virtual 0xFFFFFFFF80000000 + 1 MB) while being loaded at physical
 * 0x100000. For simplicity, this case study uses a lower-half layout:
 * virtual and physical addresses are both 0x100000.
 */
OUTPUT_FORMAT("elf32-i386") /* must match nasm -f elf32 and ld -m elf_i386 */
OUTPUT_ARCH(i386)
/* Entry point: the Multiboot2 entry point in boot.asm */
ENTRY(_start)
/* ============================================================
* Memory Regions
* ============================================================ */
MEMORY {
/* Kernel loaded at 1 MB, up to 16 MB available */
kernel (rwx) : ORIGIN = 0x100000, LENGTH = 15M
}
/* ============================================================
* Section Layout
* ============================================================ */
SECTIONS {
/* Start at 1 MB */
. = 0x100000;
_kernel_start = .;
/* .multiboot MUST lie within the first 32768 bytes of the file and be
 * 8-byte aligned. GRUB scans for the Multiboot2 header in this range. */
.multiboot : {
KEEP(*(.multiboot)) /* KEEP prevents --gc-sections from removing it */
}
/* Code: executable, read-only */
.text ALIGN(0x1000) : {
*(.text._start) /* _start first within .text */
*(.text)
*(.text.*) /* per-function sections (-ffunction-sections), .text.startup, etc. */
}
/* Read-only data: constants, string literals */
.rodata ALIGN(0x1000) : {
*(.rodata)
*(.rodata.*)
}
/* Read-write data: initialized global variables */
.data ALIGN(0x1000) : {
*(.data)
*(.data.*)
}
/* Uninitialized data: zeroed by our boot code */
. = ALIGN(0x1000);
_bss_start = .;
.bss : {
*(COMMON) /* C "common" symbols (tentative definitions, e.g. with -fcommon) */
*(.bss)
*(.bss.*)
}
_bss_end = .;
_kernel_end = .;
/* ============================================================
* Discard sections the kernel does not need at runtime
* ============================================================ */
/DISCARD/ : {
*(.comment) /* Compiler version strings */
*(.note.GNU-stack) /* Stack attributes (Linux-specific, not needed) */
*(.eh_frame) /* Unwind tables (GCC can emit them even for C; unused here) */
*(.note.gnu.build-id) /* Build ID (useful for debug but optional) */
}
}
Key Linker Script Elements
KEEP(*(.multiboot)): When the link uses --gc-sections (as our Makefile does), the linker discards input sections that nothing reachable from the entry point refers to. The Multiboot header is read only by GRUB, never by our code, so without KEEP it would be a candidate for removal.
ALIGN(0x1000): Each section starts on a 4 KB page boundary. This ensures each section can have independent page permissions (future: mark .text RX, .rodata R, .data RW).
_bss_start and _bss_end: These are linker-defined symbols — their addresses are the start and end of the .bss section. They are visible to C code as extern char _bss_start[].
/DISCARD/: Removes sections that would add size without benefit. .eh_frame can be 10-20% of a small kernel's size if not discarded.
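Two of these elements reduce to simple address arithmetic, sketched below in host-side C. The function names are hypothetical. align_up mirrors what ALIGN(0x1000) does to the location counter, and region_size shows how C code turns a pair of linker symbols (which have addresses but no storage) into a byte count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (a power of two). */
static uintptr_t align_up(uintptr_t addr, uintptr_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Size of [start, end) given two symbol addresses, as with
 * _bss_start and _bss_end declared as extern char arrays. */
static size_t region_size(const char *start, const char *end) {
    return (size_t)(end - start);
}

/* Number of 4 KB pages needed to cover [start, end). */
static size_t pages_spanned(uintptr_t start, uintptr_t end) {
    uintptr_t first = start & ~(uintptr_t)0xFFF;
    return (align_up(end, 0x1000) - first) / 0x1000;
}
```

In the kernel itself, region_size(_bss_start, _bss_end) would give the number of bytes the boot code has to zero.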
The Boot Assembly: minOS/boot.asm
GRUB requires a Multiboot2 header. When GRUB finds a kernel with this header, it loads the kernel at the address specified (or into available memory), sets registers to Multiboot magic values, and jumps to _start.
; minOS/boot.asm
; Multiboot2 header and kernel boot sequence.
; GRUB loads this, sets EBX = ptr to Multiboot info structure,
; sets EAX = 0x36d76289 (Multiboot2 magic), then jumps to _start.
;
; NOTE: GRUB starts us in 32-bit protected mode with paging disabled.
; For a 64-bit kernel, we must enable long mode here.
; This example shows the 32-bit bootstrap; long mode setup is Chapter 25.
BITS 32 ; GRUB delivers us in 32-bit protected mode
section .multiboot ; Goes first (within first 32768 bytes)
; ============================================================
; Multiboot2 Header (placed in .multiboot section)
; GRUB searches for the magic 0xe85250d6, stored little-endian as bytes d6 50 52 e8
; ============================================================
MULTIBOOT2_MAGIC equ 0xe85250d6
MULTIBOOT_ARCH_32 equ 0 ; Protected mode i386
align 8
mb2_header:
dd MULTIBOOT2_MAGIC ; Magic number
dd MULTIBOOT_ARCH_32 ; Architecture
dd mb2_end - mb2_header ; Header length
dd 0x100000000 - (MULTIBOOT2_MAGIC + MULTIBOOT_ARCH_32 + (mb2_end - mb2_header)) ; Checksum: the four fields sum to 0 mod 2^32
; Tags follow here. Minimum: end tag.
; End tag:
dw 0, 0 ; type=0, flags=0
dd 8 ; size=8
mb2_end:
; ============================================================
; Bootstrap stack (before we set up the real kernel stack)
; ============================================================
section .bss
align 16
bootstrap_stack_bottom:
resb 16384 ; 16 KB bootstrap stack
bootstrap_stack_top:
; ============================================================
; Kernel Entry Point: _start (32-bit, called by GRUB)
; ============================================================
section .text
global _start
_start:
; Set up a temporary stack immediately. Without a stack,
; we cannot call any functions.
mov esp, bootstrap_stack_top
; Keep the Multiboot2 values in registers while we zero .bss.
; The bootstrap stack itself lives in .bss, so anything pushed now
; would be wiped by the zeroing loop below. rep stosb touches only
; EAX, ECX, and EDI, so EBX and ESI survive it.
mov esi, eax ; uint32_t magic (should be 0x36d76289)
; EBX already holds multiboot_info *mb_info
; Zero the BSS section.
; The linker defined _bss_start and _bss_end for us.
; We must zero BSS before any C code runs (C guarantees global vars are 0).
extern _bss_start
extern _bss_end
cld ; Multiboot2 leaves the direction flag undefined; count upward
mov edi, _bss_start ; destination
xor eax, eax ; fill value = 0
mov ecx, _bss_end
sub ecx, edi ; count = _bss_end - _bss_start
rep stosb ; memset(edi, 0, ecx)
; In a real 64-bit kernel, we would now:
; 1. Set up a minimal GDT for long mode
; 2. Enable PAE (set CR4.PAE)
; 3. Set up PML4 page tables (identity map first 4 GB)
; 4. Load CR3 (page table base)
; 5. Set EFER.LME (long mode enable)
; 6. Set CR0.PG (enable paging); this activates long mode
; 7. Far jump to 64-bit code segment
; For this case study, we stay in 32-bit protected mode and call kernel_main.
; Call kernel_main(uint32_t magic, void *mb_info). cdecl pushes arguments
; right to left; the extra 8 bytes keep ESP 16-byte aligned at the call.
sub esp, 8
push ebx ; mb_info (second argument)
push esi ; magic (first argument)
extern kernel_main
call kernel_main
; kernel_main should never return. If it does, halt.
halt:
cli ; Disable interrupts
hlt ; Halt the CPU
jmp halt ; Loop in case of NMI
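On the C side, the two values GRUB hands over can be sanity-checked before anything else relies on them. The following is a small sketch with hypothetical helper names; it assumes only the EAX magic value quoted above and the Multiboot2 rule that the information structure starts on an 8-byte boundary.

```c
#include <assert.h>  /* host-side checks only */
#include <stdint.h>

#define MB2_BOOTLOADER_MAGIC 0x36D76289u  /* value a Multiboot2 loader leaves in EAX */

/* Returns 1 if the kernel was entered by a Multiboot2-compliant loader. */
static int mb2_magic_ok(uint32_t magic) {
    return magic == MB2_BOOTLOADER_MAGIC;
}

/* Returns 1 if the info pointer is non-null and 8-byte aligned. */
static int mb2_info_aligned(uintptr_t info_addr) {
    return info_addr != 0 && (info_addr & 7u) == 0;
}
```

A kernel_main that finds either check failing has little choice but to report the problem (if it can) and halt, since the rest of the boot information cannot be trusted.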
The Kernel Entry Point: minOS/kernel.c
/* minOS/kernel.c
* First C code to run after BSS is zeroed and boot.asm calls us.
*/
#include <stdint.h>
#include <stddef.h>
/* VGA text buffer at physical address 0xB8000.
* Each character is 2 bytes: [ASCII][attribute].
* 80 columns × 25 rows = 2000 characters = 4000 bytes.
*/
#define VGA_WIDTH 80
#define VGA_HEIGHT 25
#define VGA_ATTR_WHITE_ON_BLACK 0x0F
static volatile uint16_t *vga = (volatile uint16_t *)0xB8000;
static int vga_col = 0, vga_row = 0;
static void vga_putchar(char c) {
if (c == '\n') {
vga_col = 0;
vga_row++;
return;
}
vga[vga_row * VGA_WIDTH + vga_col] = (uint16_t)(unsigned char)c | (VGA_ATTR_WHITE_ON_BLACK << 8); /* unsigned char: no sign extension into the attribute byte */
vga_col++;
if (vga_col >= VGA_WIDTH) {
vga_col = 0;
vga_row++;
}
}
static void vga_puts(const char *s) {
while (*s) vga_putchar(*s++);
}
static void vga_puthex(uint32_t v) {
const char *hex = "0123456789ABCDEF";
vga_puts("0x");
for (int i = 28; i >= 0; i -= 4)
vga_putchar(hex[(v >> i) & 0xF]);
}
/* Linker-defined symbols */
extern char _kernel_start[], _kernel_end[];
extern char _bss_start[], _bss_end[];
/* Kernel main — called from boot.asm */
void kernel_main(uint32_t multiboot_magic, void *multiboot_info) {
/* Clear VGA screen */
for (int i = 0; i < VGA_WIDTH * VGA_HEIGHT; i++)
vga[i] = ' ' | (VGA_ATTR_WHITE_ON_BLACK << 8);
vga_puts("MinOS v0.1 - Booted!\n"); /* ASCII only: VGA text mode does not decode UTF-8 */
vga_puts("Multiboot2 magic: ");
vga_puthex(multiboot_magic);
vga_puts("\n");
vga_puts("Multiboot info: ");
vga_puthex((uint32_t)(uintptr_t)multiboot_info);
vga_puts("\n");
vga_puts("Kernel start: ");
vga_puthex((uint32_t)(uintptr_t)_kernel_start);
vga_puts("\n");
vga_puts("Kernel end: ");
vga_puthex((uint32_t)(uintptr_t)_kernel_end);
vga_puts("\n");
vga_puts("BSS: ");
vga_puthex((uint32_t)(uintptr_t)_bss_start);
vga_puts(" - ");
vga_puthex((uint32_t)(uintptr_t)_bss_end);
vga_puts("\n");
uint32_t bss_size = (uint32_t)(_bss_end - _bss_start);
vga_puts("BSS size (bytes): ");
vga_puthex(bss_size);
vga_puts("\n");
vga_puts("\nKernel halted. All is well.\n");
/* Halt */
for (;;) {
__asm__ volatile ("cli; hlt");
}
}
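The cell encoding that vga_putchar relies on can be isolated into two small helpers. This is a host-testable sketch with hypothetical names (vga_entry, vga_index); the cast through unsigned char is a defensive choice of the sketch, so that bytes at or above 0x80 cannot sign-extend into the attribute half.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VGA_WIDTH  80
#define VGA_HEIGHT 25

/* Pack a character (low byte) and an attribute (high byte) into one cell. */
static uint16_t vga_entry(char c, uint8_t attr) {
    return (uint16_t)((uint16_t)(unsigned char)c | ((uint16_t)attr << 8));
}

/* Linear index of (row, col) in the 80x25 cell array at 0xB8000. */
static size_t vga_index(int row, int col) {
    assert(row >= 0 && row < VGA_HEIGHT && col >= 0 && col < VGA_WIDTH);
    return (size_t)row * VGA_WIDTH + (size_t)col;
}
```

With attribute 0x0F (white on black), the letter 'A' becomes the cell value 0x0F41, and the last cell on screen has index 1999.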
The Makefile: minOS/Makefile
# minOS/Makefile
CC = gcc
AS = nasm
LD = ld
# Flags for a 32-bit freestanding build (no standard library, no PIE)
CFLAGS = -m32 -ffreestanding -fno-stack-protector -fno-pie -nostdlib -O2 -Wall -Wextra
ASFLAGS = -f elf32
LDFLAGS = -m elf_i386 -T linker.ld --gc-sections
OBJS = boot.o kernel.o
.PHONY: all clean run inspect
all: minOS.elf
%.o: %.asm
$(AS) $(ASFLAGS) $< -o $@
%.o: %.c
$(CC) $(CFLAGS) -c $< -o $@
minOS.elf: $(OBJS) linker.ld
	$(LD) $(LDFLAGS) -Map=minOS.map -o $@ $(OBJS)
	@echo "Kernel size:"
	@size minOS.elf
	@echo "Linker map written to minOS.map"
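# QEMU's -kernel option understands Multiboot1 but not Multiboot2, so we
# boot through a GRUB rescue ISO instead. This assumes grub-mkrescue and
# xorriso are installed, and that a grub.cfg (not shown) contains an entry
# such as:
#   menuentry "MinOS" {
#       multiboot2 /boot/minOS.elf
#       boot
#   }
minOS.iso: minOS.elf grub.cfg
	mkdir -p iso/boot/grub
	cp minOS.elf iso/boot/minOS.elf
	cp grub.cfg iso/boot/grub/grub.cfg
	grub-mkrescue -o $@ iso
run: minOS.iso
	qemu-system-i386 -cdrom minOS.iso -display curses
# Inspect the result
inspect: minOS.elf
	readelf -h minOS.elf
	readelf -S minOS.elf
	nm -n minOS.elf | head -20
clean:
	rm -rf *.o minOS.elf minOS.map minOS.iso iso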
Build and Run
# Prerequisites: nasm, gcc with 32-bit support (or an i686-elf-gcc cross-compiler), qemu-system-i386, and GRUB tools (grub-mkrescue with xorriso) to produce bootable media
make all
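# Output from 'make all' looks like this (illustrative; exact sizes depend on the toolchain):
# Kernel size:
# text data bss dec hex filename
# 1247 4 16388 17639 44e7 minOS.elf
# Run in QEMU:
make run
# The QEMU display should show something like the following (illustrative values).
# .bss starts on a page boundary, and the kernel end equals the BSS end
# because the linker script places nothing after .bss.
# MinOS v0.1 - Booted!
# Multiboot2 magic: 0x36D76289
# Multiboot info: 0x0009FC00
# Kernel start: 0x00100000
# Kernel end: 0x00108024
# BSS: 0x00104000 - 0x00108024
# BSS size (bytes): 0x00004024
# Kernel halted. All is well.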
Reading the Linker Map
The -Map=minOS.map flag generates a human-readable map of where everything ended up:
# minOS.map (excerpt)
Memory Configuration
Name Origin Length
kernel 0x00100000 0x00f00000
Linker script and memory map
.multiboot 0x00100000 0x20
0x00100000 boot.o(.multiboot)
.text 0x00101000 0x4cf
*(.text._start)
.text._start 0x00101000 0x40 boot.o
*(.text)
.text 0x00101040 0x48f kernel.o
.rodata 0x00102000 0x120
*(.rodata)
.rodata 0x00102000 0x120 kernel.o
.data 0x00103000 0x4
*(.data)
.data 0x00103000 0x4 kernel.o
.bss 0x00104000 0x4024
*(COMMON)
.bss 0x00104000 0x4024 boot.o ← bootstrap_stack!
_bss_start = 0x00104000
_bss_end = 0x00108024
_kernel_start = 0x00100000
_kernel_end = 0x00108024
The linker map reveals:
- Each section's precise start address and size
- Which object file contributed to each section
- The values of linker-defined symbols (_bss_start, etc.)
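This is indispensable for debugging: if the kernel crashes on boot, the map tells you where each function and section landed, so a faulting address can be traced back to the object file that contributed it, and it shows exactly which address range the boot code is expected to zero as .bss.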
What This Teaches About Linking
1. The Linker Controls Everything
The linker script is the authority on the kernel's memory layout. Changing the load address (the . = 0x100000; assignment, kept in step with ORIGIN in the MEMORY block) makes the linker recompute every symbol address, section placement, and cross-reference. No source code changes are needed.
2. BSS Must Be Zeroed Manually
In a user program, the OS zeros .bss before main. In a kernel, there is no OS — the kernel is the OS. If boot.asm skips the rep stosb BSS-zeroing step, any global variable (like vga_col = 0) might contain garbage from GRUB's memory contents.
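For comparison, the same zeroing step can be written in C. The sketch below uses a hypothetical helper, zero_range, and is meant as an illustration of what rep stosb does rather than a replacement for it: in MinOS the bootstrap stack itself sits in .bss, so a C version could only run on a stack placed somewhere else.

```c
#include <assert.h>
#include <stddef.h>

/* Zero every byte in [start, end), the C equivalent of rep stosb with AL = 0.
 * The volatile pointer keeps the compiler from replacing the loop with a
 * call to memset, which a freestanding kernel may not provide. */
static void zero_range(char *start, char *end) {
    assert(start <= end);
    for (volatile char *p = start; p < (volatile char *)end; p++)
        *p = 0;
}
```

In the kernel the call would be zero_range(_bss_start, _bss_end), using the linker-defined symbols declared as extern char arrays.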
3. KEEP Prevents Dead Code Elimination
The Multiboot header has no C code references — it is only "referenced" by GRUB (external to our build). Without KEEP, --gc-sections would remove .multiboot as unreachable, producing a binary GRUB cannot find. This is the canonical use case for KEEP.
4. Freestanding vs. Hosted Compilation
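The -ffreestanding flag tells GCC not to assume that a hosted C standard library exists: main() gets no special treatment and library functions are not treated as built-ins. It does not stop GCC from emitting calls to memcpy, memmove, memset, or memcmp (for example, when copying a struct), so a kernel that triggers them must supply its own implementations or the link will fail. The freestanding headers <stdint.h> (fixed-width types) and <stddef.h> (NULL, size_t) remain usable because GCC itself provides them.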
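Because no libc is linked, any memset or memcpy the kernel needs has to come from the kernel itself. A minimal sketch of such routines follows; the k_ prefix is a hypothetical naming choice so the example can be compiled and tested on a host, whereas a kernel would normally export them under the standard names.

```c
#include <assert.h>  /* host-side checks only */
#include <stddef.h>

/* Byte-wise fill; returns dst like the standard memset. */
static void *k_memset(void *dst, int value, size_t n) {
    unsigned char *d = dst;
    while (n--)
        *d++ = (unsigned char)value;
    return dst;
}

/* Byte-wise copy of non-overlapping regions; returns dst. */
static void *k_memcpy(void *dst, const void *src, size_t n) {
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
    return dst;
}
```

These versions favor clarity over speed; a real kernel would usually replace them with word-sized or rep-based variants once it boots reliably.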
Summary
The MinOS linker script and boot sequence demonstrate linker scripts as practical engineering tools, not just arcane configuration. The linker script:
- Places the Multiboot header within GRUB's scanning range
- Aligns sections for future page-level permission control
- Defines BSS boundary symbols that the boot assembly uses for zero-initialization
- Discards sections that add size without runtime benefit
The resulting kernel occupies about 17 KB of memory, most of it the 16 KB bootstrap stack in .bss, boots in QEMU in under 1 second, and demonstrates the key elements of the linking process covered here: custom section layout, KEEP, linker symbols, the map file, and freestanding compilation.