Case Study 39-2: RISC-V Assembly — Hello World on RISC-V

Introduction

Writing hello world in a new assembly language is the quickest way to understand what an ISA looks like and how it differs from what you know. This case study writes, assembles, links, and runs a RISC-V 64-bit hello world in QEMU, then compares it systematically to the equivalent x86-64 and ARM64 programs. RISC-V is the most important new ISA in decades; understanding it through the lens of assembly you already know is the most efficient path.

The Programs: Three Architectures, One Goal

x86-64 (for comparison)

; hello_x86.asm — x86-64 Linux hello world
section .data
msg:    db "Hello from x86-64!", 0x0A
len:    equ $ - msg

section .text
global _start
_start:
    mov rax, 1          ; sys_write = 1
    mov rdi, 1          ; fd = stdout
    lea rsi, [rel msg]  ; buffer
    mov rdx, len        ; length
    syscall

    mov rax, 60         ; sys_exit = 60
    xor rdi, rdi        ; exit code = 0
    syscall

ARM64 (for comparison)

// hello_arm64.s — ARM64 Linux hello world
.section .data
msg:    .ascii "Hello from ARM64!\n"
        .set len, . - msg

.section .text
.global _start
_start:
    mov x8, #64         // sys_write = 64
    mov x0, #1          // fd = stdout
    adr x1, msg         // buffer
    mov x2, #len        // length
    svc #0

    mov x8, #93         // sys_exit = 93
    mov x0, #0          // exit code = 0
    svc #0

RISC-V 64-bit

# hello_riscv.s — RISC-V 64-bit Linux hello world
# GNU assembler syntax (gas), RV64GC

.section .data
msg:    .ascii "Hello from RISC-V!\n"
        .set len, . - msg

.section .text
.global _start
_start:
    li      a7, 64          # sys_write = 64
    li      a0, 1           # fd = stdout
    la      a1, msg         # buffer address
    li      a2, len         # length
    ecall                   # invoke syscall

    li      a7, 93          # sys_exit = 93
    li      a0, 0           # exit code = 0
    ecall                   # invoke syscall

Building and Running

Setup

# Install cross-compilation tools (Ubuntu/Debian)
sudo apt install gcc-riscv64-linux-gnu binutils-riscv64-linux-gnu qemu-user

# Alternative: use pre-built RISC-V toolchain from riscv.org

# Assemble
riscv64-linux-gnu-as hello_riscv.s -o hello_riscv.o

# Link (no libc needed: the program makes raw syscalls, so the result is a static executable)
riscv64-linux-gnu-ld hello_riscv.o -o hello_riscv

# Run under QEMU user-mode emulation
qemu-riscv64 ./hello_riscv

Output:

Hello from RISC-V!

Disassembly

riscv64-linux-gnu-objdump -d hello_riscv

For the 19-byte string "Hello from RISC-V!\n", the disassembly looks like this. Exact addresses depend on the toolchain version and linker layout, and a toolchain that defaults to RV64GC may emit 2-byte compressed forms for some of these instructions; the uncompressed 4-byte encodings are shown here:

0000000000010078 <_start>:
   10078:  04000893   li      a7,64        # sys_write
   1007c:  00100513   li      a0,1         # fd = 1
   10080:  00001597   auipc   a1,0x1       # PC-relative load of msg address (high)
   10084:  f8858593   addi    a1,a1,-120   # PC-relative load of msg address (low)
   10088:  01300613   li      a2,19        # length = 19
   1008c:  00000073   ecall               # syscall
   10090:  05d00893   li      a7,93        # sys_exit
   10094:  00000513   li      a0,0         # exit code = 0
   10098:  00000073   ecall               # syscall
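
The machine words in the listing can be checked by hand. The sketch below is a minimal Python decoder for the I-type addi encoding that li expands to; the function name decode_addi and the REG_ABI table are illustrative helpers, not part of any toolchain.

```python
# Minimal sketch: decode a 32-bit RISC-V I-type "addi" word into its fields.
# decode_addi and REG_ABI are hypothetical names used only for illustration.

REG_ABI = [
    "zero", "ra", "sp", "gp", "tp", "t0", "t1", "t2", "s0", "s1",
    "a0", "a1", "a2", "a3", "a4", "a5", "a6", "a7",
    "s2", "s3", "s4", "s5", "s6", "s7", "s8", "s9", "s10", "s11",
    "t3", "t4", "t5", "t6",
]

def decode_addi(word):
    """Return (rd, rs1, imm) for an addi instruction word."""
    opcode = word & 0x7F            # bits 6:0
    rd = (word >> 7) & 0x1F         # bits 11:7
    funct3 = (word >> 12) & 0x7     # bits 14:12
    rs1 = (word >> 15) & 0x1F       # bits 19:15
    imm = (word >> 20) & 0xFFF      # bits 31:20
    if imm >= 0x800:                # sign-extend the 12-bit immediate
        imm -= 0x1000
    if opcode != 0x13 or funct3 != 0:
        raise ValueError("not an addi encoding")
    return REG_ABI[rd], REG_ABI[rs1], imm

if __name__ == "__main__":
    for w in (0x04000893, 0x00100513, 0xF8858593, 0x01300613, 0x05D00893):
        print(f"{w:08x}", decode_addi(w))
```

Each li in the listing decodes to an addi whose source register is zero (x0), which is exactly the expansion the disassembler hides behind the li mnemonic.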

Instruction-by-Instruction Comparison

| Task | x86-64 | ARM64 | RISC-V |
|---|---|---|---|
| Load syscall number | mov rax, 1 | mov x8, #64 | li a7, 64 |
| Load integer constant | mov rdi, 1 | mov x0, #1 | li a0, 1 |
| Load PC-relative address | lea rsi, [rel msg] | adr x1, msg | la a1, msg |
| Execute syscall | syscall | svc #0 | ecall |
| Zero a register | xor rdi, rdi | mov x0, #0 or eor x0, x0, x0 | li a0, 0 or mv a0, x0 |

Observations

1. Syscall instruction name:
   - x86-64: syscall (SYSCALL instruction)
   - ARM64: svc #0 (Supervisor Call)
   - RISC-V: ecall (Environment Call)

All three trap into the operating system kernel, though each architecture defines its own privilege levels and trap mechanism.

2. Syscall argument registers:
   - x86-64: number in RAX, args in RDI, RSI, RDX, R10, R8, R9
   - ARM64: number in X8, args in X0-X5
   - RISC-V: number in A7 (x17), args in A0-A5 (x10-x15)

3. Syscall numbers:
   - x86-64: write = 1; exit = 60
   - ARM64: write = 64; exit = 93
   - RISC-V: write = 64; exit = 93

Note: ARM64 and RISC-V share Linux syscall numbers because both use the kernel's generic syscall table, which newer architecture ports adopt. x86-64 has its own historic numbering.
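
Observations 1 through 3 can be collected into a single lookup table. The Python sketch below does only that, using the values listed above; SYSCALL_ABI and syscall_setup are hypothetical names chosen for this illustration.

```python
# Minimal sketch: the Linux syscall conventions from the observations above,
# expressed as data. SYSCALL_ABI and syscall_setup are illustrative names.

SYSCALL_ABI = {
    "x86-64": {
        "insn": "syscall",
        "number_reg": "rax",
        "arg_regs": ["rdi", "rsi", "rdx", "r10", "r8", "r9"],
        "numbers": {"write": 1, "exit": 60},
    },
    "arm64": {
        "insn": "svc #0",
        "number_reg": "x8",
        "arg_regs": ["x0", "x1", "x2", "x3", "x4", "x5"],
        "numbers": {"write": 64, "exit": 93},
    },
    "riscv64": {
        "insn": "ecall",
        "number_reg": "a7",
        "arg_regs": ["a0", "a1", "a2", "a3", "a4", "a5"],
        "numbers": {"write": 64, "exit": 93},
    },
}

def syscall_setup(arch, name, *args):
    """Return the (register, value) assignments needed before the trap."""
    abi = SYSCALL_ABI[arch]
    if len(args) > len(abi["arg_regs"]):
        raise ValueError("too many syscall arguments")
    setup = [(abi["number_reg"], abi["numbers"][name])]
    setup += list(zip(abi["arg_regs"], args))
    return setup

if __name__ == "__main__":
    for arch in SYSCALL_ABI:
        print(arch, syscall_setup(arch, "write", 1, "msg", 19),
              SYSCALL_ABI[arch]["insn"])
```

Reading the three rows side by side makes the point of the comparison: the structure of a syscall is identical everywhere, and only the register names and numbers change.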

4. PC-relative address loading:
   - x86-64: lea rsi, [rip + offset] is one instruction (RIP-relative addressing is encoded in the ModRM byte)
   - ARM64: adr x1, label is one instruction when the target is within ±1MB; adrp + add reaches farther
   - RISC-V: la a1, label is a pseudo-instruction that expands to auipc + addi in non-PIC code, so two instructions (linker relaxation may shorten this in some layouts)

RISC-V's la (Load Address) is a pseudo-instruction because no single instruction has room for a full address: I-type immediates such as the one in addi are only 12 bits. The auipc instruction adds a 20-bit immediate, shifted left by 12, to the PC; addi then adds the signed low 12 bits. Together the pair reaches any address within roughly ±2GB of the PC.
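
The split between the auipc and addi immediates has one subtlety: addi sign-extends its 12-bit immediate, so the upper part must be rounded up whenever the low 12 bits would otherwise read as negative. The Python sketch below shows the arithmetic under that assumption; split_pcrel is a hypothetical helper name, and the example addresses are taken from the disassembly above.

```python
# Minimal sketch: split a PC-relative offset into the auipc (hi20) and
# addi (lo12) immediates. split_pcrel is an illustrative name only.

def split_pcrel(pc, target):
    """Return (hi20, lo12) such that pc + (hi20 << 12) + lo12 == target."""
    offset = target - pc
    hi = (offset + 0x800) >> 12     # round so that lo fits in [-2048, 2047]
    lo = offset - (hi << 12)
    if not -(1 << 19) <= hi < (1 << 19):
        raise ValueError("target is out of auipc + addi range")
    return hi & 0xFFFFF, lo         # hi as the 20-bit field value

if __name__ == "__main__":
    # auipc sits at 0x10080 in the listing; the pair resolves to 0x11008.
    print(split_pcrel(0x10080, 0x11008))
```

For the listing above this yields an upper immediate of 1 and a lower immediate of -120, matching auipc a1,0x1 followed by addi a1,a1,-120.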

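5. Load immediate:
   - x86-64: mov rax, 64 is 7 bytes in its REX.W form (REX + opcode + ModRM + 4-byte immediate); an assembler may instead emit the 5-byte mov eax, 64, which zero-extends into rax
   - ARM64: mov x8, #64 is 4 bytes (MOVZ encoding)
   - RISC-V: li a7, 64 is a pseudo-instruction for addi a7, x0, 64, which is 4 bytes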

Both RISC ISAs (ARM64 and RISC-V) use fixed-width 4-byte instructions here; RISC-V's optional C extension adds 2-byte compressed forms.

The li Pseudo-Instruction

li (Load Immediate) is one of several RISC-V pseudo-instructions that the assembler expands to one or more real instructions:

| Pseudo | Expands to | Range / purpose |
|---|---|---|
| li rd, imm | addi rd, x0, imm | 12-bit signed (-2048 to 2047) |
| li rd, large_imm | lui rd, upper; addiw rd, rd, lower (addi on RV32) | 32-bit |
| la rd, label | auipc rd, hi; addi rd, rd, lo | PC ± 2GB |
| mv rd, rs | addi rd, rs, 0 | register copy |
| ret | jalr x0, x1, 0 | return via ra |
| nop | addi x0, x0, 0 | no operation |
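
The first two li rows can be sketched as a small expansion routine. The Python below is an illustration, not the assembler's actual algorithm: expand_li is a hypothetical name, it handles only immediates that fit in 32 signed bits, and it assumes RV64, where the second step is addiw rather than addi so that the result is correctly sign-extended.

```python
# Minimal sketch: expand "li rd, imm" for 12-bit and 32-bit signed immediates.
# expand_li is an illustrative name; RV64 is assumed (addiw for the low part).

def expand_li(rd, imm):
    """Return the list of real instructions for li rd, imm."""
    if -2048 <= imm <= 2047:
        return [f"addi {rd}, x0, {imm}"]
    if not -(1 << 31) <= imm < (1 << 31):
        raise ValueError("only 32-bit signed immediates are handled here")
    # Round the upper part because the low 12 bits are sign-extended.
    hi = ((imm + 0x800) >> 12) & 0xFFFFF
    lo = ((imm & 0xFFF) ^ 0x800) - 0x800
    seq = [f"lui {rd}, {hi:#x}"]
    if lo != 0:
        seq.append(f"addiw {rd}, {rd}, {lo}")
    return seq

if __name__ == "__main__":
    for value in (64, 0x12345, 0x12FFF, 4096):
        print(value, expand_li("a0", value))
```

Note how 0x12FFF needs an upper part of 0x13 and a lower part of -1: the same rounding that la applies to its auipc immediate.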

This is RISC-V's design philosophy: keep the ISA minimal, use pseudo-instructions for convenience. The assembler expands them; the hardware only sees real instructions.

The Elegance of RISC-V

Compared to x86-64:
- No implicit register uses: RISC-V has no equivalent of rax being special for division, rcx for shifts, or rdx:rax for multiply. Apart from the hardwired-zero x0, every integer register is general-purpose.
- Fixed instruction width: all RISC-V base instructions are 4 bytes (2 bytes with the compressed C extension). No variable-length encoding complexity.
- No instruction prefixes: x86-64 REX, VEX, and EVEX prefixes are a complexity burden. RISC-V has none.
- Open ISA: you can read the full specification (about 200 pages), understand it completely, and design hardware to implement it. The x86-64 ISA specification runs to thousands of pages.
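
The fixed-width point has a concrete consequence for decoders: the length of a RISC-V instruction is determined by the low bits of its first 16-bit parcel, with no prefix scanning. The Python sketch below covers only the 2-byte and 4-byte cases discussed here; insn_length is a hypothetical helper name.

```python
# Minimal sketch: determine a RISC-V instruction's length from its first
# 16-bit parcel. insn_length is an illustrative name; only the standard
# 16-bit (compressed) and 32-bit encodings are handled.

def insn_length(first_halfword):
    """Return the instruction length in bytes (2 or 4)."""
    if (first_halfword & 0b11) != 0b11:
        return 2                    # compressed (C extension) encoding
    if (first_halfword & 0b11100) != 0b11100:
        return 4                    # standard 32-bit encoding
    raise ValueError("longer encodings are not handled in this sketch")

if __name__ == "__main__":
    print(insn_length(0x0893))      # low half of 0x04000893 (li a7,64)
    print(insn_length(0x0073))      # low half of 0x00000073 (ecall)
    print(insn_length(0x0001))      # c.nop, a compressed instruction
```

An x86-64 decoder, by contrast, has to walk prefixes, opcode bytes, ModRM, SIB, displacement, and immediate fields before it knows where the next instruction starts.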

The hello world in RISC-V is nearly identical to ARM64 — which is expected, since both are clean modern RISC ISAs. The experience of writing it confirms: with assembly fundamentals, learning a new ISA is a matter of learning the register names and instruction encoding, not learning new concepts.

Running in QEMU Full System Emulation

For a more complete experience:

# Download RISC-V Fedora or Ubuntu image
# (instructions at wiki.qemu.org/Documentation/Platforms/RISCV)

qemu-system-riscv64 \
    -machine virt \
    -cpu rv64 \
    -m 2G \
    -bios /usr/lib/riscv64-linux-gnu/opensbi/generic/fw_dynamic.bin \
    -kernel Image \
    -append "root=/dev/vda rw" \
    -drive file=rootfs.img,format=qcow2,id=hd0,if=none \
    -device virtio-blk-device,drive=hd0 \
    -netdev user,id=net0 \
    -device virtio-net-device,netdev=net0 \
    -nographic

This boots a full RISC-V Linux system. You can compile and run programs natively, install packages, and explore the system — all on emulated RISC-V hardware.

What Comes Next with RISC-V

If this brief introduction is interesting:
1. Read the RISC-V ISA specification Volume I (Unprivileged): it is short, clear, and free
2. Try writing a RISC-V version of the MinOS bootloader for qemu-system-riscv64
3. Follow the Linux RISC-V port: the kernel code for the RISC-V architecture entry points is in arch/riscv/ and closely parallels arch/x86/ and arch/arm64/
4. Get real hardware: a StarFive VisionFive 2 is a Linux-capable RISC-V board at ~$70

The open nature of RISC-V means that understanding it deeply — at the assembly level, as you do now — gives you leverage to contribute to hardware design, compiler development, and operating systems work in ways that proprietary ISAs simply do not allow.