Case Study 39-2: RISC-V Assembly — Hello World on RISC-V
Introduction
Writing hello world in a new assembly language is the quickest way to see what an ISA looks like and how it differs from the ones you know. This case study writes, assembles, links, and runs a 64-bit RISC-V hello world under QEMU, then compares it systematically to the equivalent x86-64 and ARM64 programs. RISC-V is arguably the most significant new ISA in decades, and approaching it through assembly you already know is the most efficient path.
The Programs: Three Architectures, One Goal
x86-64 (for comparison)
; hello_x86.asm — x86-64 Linux hello world
section .data
msg: db "Hello from x86-64!", 0x0A
len: equ $ - msg
section .text
global _start
_start:
mov rax, 1 ; sys_write = 1
mov rdi, 1 ; fd = stdout
lea rsi, [rel msg] ; buffer
mov rdx, len ; length
syscall
mov rax, 60 ; sys_exit = 60
xor rdi, rdi ; exit code = 0
syscall
ARM64 (for comparison)
// hello_arm64.s — ARM64 Linux hello world
.section .data
msg: .ascii "Hello from ARM64!\n"
.set len, . - msg
.section .text
.global _start
_start:
mov x8, #64 // sys_write = 64
mov x0, #1 // fd = stdout
adr x1, msg // buffer
mov x2, #len // length
svc #0
mov x8, #93 // sys_exit = 93
mov x0, #0 // exit code = 0
svc #0
RISC-V 64-bit
# hello_riscv.s — RISC-V 64-bit Linux hello world
# GNU assembler syntax (gas), RV64GC
.section .data
msg: .ascii "Hello from RISC-V!\n"
.set len, . - msg
.section .text
.global _start
_start:
li a7, 64 # sys_write = 64
li a0, 1 # fd = stdout
la a1, msg # buffer address
li a2, len # length
ecall # invoke syscall
li a7, 93 # sys_exit = 93
li a0, 0 # exit code = 0
ecall # invoke syscall
Building and Running
Setup
# Install cross-compilation tools (Ubuntu/Debian)
sudo apt install gcc-riscv64-linux-gnu binutils-riscv64-linux-gnu qemu-user
# Alternative: use pre-built RISC-V toolchain from riscv.org
Assemble, Link, Run
# Assemble
riscv64-linux-gnu-as hello_riscv.s -o hello_riscv.o
# Link (no libc involved: the program makes raw syscalls, so the result is a static executable)
riscv64-linux-gnu-ld hello_riscv.o -o hello_riscv
# Run under QEMU user-mode emulation
qemu-riscv64 ./hello_riscv
Output:
Hello from RISC-V!
Disassembly
riscv64-linux-gnu-objdump -d hello_riscv
The listing below assumes uncompressed 4-byte encodings (for example, assembling with -march=rv64g) and shows .set len resolved to 19, the byte length of "Hello from RISC-V!\n". With the default RV64GC target the assembler may substitute 2-byte compressed encodings for some instructions, and exact addresses depend on the linker:
0000000000010078 <_start>:
10078: 04000893 li a7,64 # sys_write
1007c: 00100513 li a0,1 # fd = 1
10080: 00001597 auipc a1,0x1 # PC-relative load of msg address (high)
10084: f8858593 addi a1,a1,-120 # PC-relative load of msg address (low)
10088: 01300613 li a2,19 # length = 19
1008c: 00000073 ecall # syscall
10090: 05d00893 li a7,93 # sys_exit
10094: 00000513 li a0,0 # exit code = 0
10098: 00000073 ecall # syscall
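To check a listing like this by hand, the li lines can be decoded from their raw words. The following Python sketch handles only the I-type addi encoding (which is what li expands to for small constants); the decode_addi name and the ABI table are illustrative helpers, not part of any toolchain.

```python
# ABI names for integer registers x0..x31
ABI = ["zero", "ra", "sp", "gp", "tp", "t0", "t1", "t2", "s0", "s1",
       "a0", "a1", "a2", "a3", "a4", "a5", "a6", "a7",
       "s2", "s3", "s4", "s5", "s6", "s7", "s8", "s9", "s10", "s11",
       "t3", "t4", "t5", "t6"]

def decode_addi(word):
    """Decode a 32-bit addi word into (rd, rs1, imm). Hypothetical helper."""
    opcode = word & 0x7F          # bits 6:0
    rd = (word >> 7) & 0x1F       # bits 11:7
    funct3 = (word >> 12) & 0x7   # bits 14:12
    rs1 = (word >> 15) & 0x1F     # bits 19:15
    imm = (word >> 20) & 0xFFF    # bits 31:20
    if imm >= 0x800:              # sign-extend the 12-bit immediate
        imm -= 0x1000
    if opcode != 0x13 or funct3 != 0:
        raise ValueError("not an addi instruction")
    return ABI[rd], ABI[rs1], imm

for word in (0x04000893, 0x00100513, 0xF8858593, 0x01300613, 0x05D00893):
    print(f"{word:08x}", decode_addi(word))
```

A destination paired with the zero source register corresponds to the li form that objdump prints.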
Instruction-by-Instruction Comparison
| Task | x86-64 | ARM64 | RISC-V |
|---|---|---|---|
| Load syscall number | mov rax, 1 | mov x8, #64 | li a7, 64 |
| Load integer constant | mov rdi, 1 | mov x0, #1 | li a0, 1 |
| Load PC-relative address | lea rsi, [rel msg] | adr x1, msg | la a1, msg |
| Execute syscall | syscall | svc #0 | ecall |
| Zero a register | xor rdi, rdi | mov x0, #0 or mov x0, xzr | li a0, 0 or mv a0, x0 |
Observations
1. Syscall instruction name:
- x86-64: syscall (SYSCALL instruction)
- ARM64: svc #0 (Supervisor Call)
- RISC-V: ecall (Environment Call)
All three invoke the operating system, but via different mechanisms and privilege model transitions.
2. Syscall argument registers:
- x86-64: number in RAX, args in RDI, RSI, RDX, R10, R8, R9
- ARM64: number in X8, args in X0-X5
- RISC-V: number in A7 (x17), args in A0-A5 (x10-x15)
3. Syscall numbers:
- x86-64 write = 1; exit = 60
- ARM64 write = 64; exit = 93
- RISC-V write = 64; exit = 93
Note: ARM64 and RISC-V share syscall numbers because both ports use Linux's generic syscall table (include/uapi/asm-generic/unistd.h). x86-64 keeps its own historic numbering.
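The register conventions and syscall numbers listed in observations 2 and 3 can be collected into one lookup table. This is a minimal Python sketch for illustration: the register_plan function and table names are made up here, and only the write and exit numbers quoted in this case study are included.

```python
# Syscall conventions as listed in this case study (Linux).
SYSCALL_ABI = {
    "x86-64":  {"number": "rax", "args": ["rdi", "rsi", "rdx", "r10", "r8", "r9"]},
    "arm64":   {"number": "x8",  "args": ["x0", "x1", "x2", "x3", "x4", "x5"]},
    "riscv64": {"number": "a7",  "args": ["a0", "a1", "a2", "a3", "a4", "a5"]},
}

SYSCALL_NUMBERS = {
    "x86-64":  {"write": 1,  "exit": 60},
    "arm64":   {"write": 64, "exit": 93},
    "riscv64": {"write": 64, "exit": 93},
}

def register_plan(arch, name, *args):
    """Return {register: value} for one syscall. Hypothetical helper."""
    abi = SYSCALL_ABI[arch]
    if len(args) > len(abi["args"]):
        raise ValueError("too many syscall arguments")
    plan = {abi["number"]: SYSCALL_NUMBERS[arch][name]}
    plan.update(zip(abi["args"], args))
    return plan

# write(1, msg, 19) on each architecture; "msg" stands in for the buffer address
for arch in SYSCALL_ABI:
    print(arch, register_plan(arch, "write", 1, "msg", 19))
```

Reading the output side by side shows that the three hello world programs differ only in register names and syscall numbers, not in structure.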
4. PC-relative address loading:
- x86-64: lea rsi, [rip + offset] — one instruction (ModRM encoding handles it)
- ARM64: adr x1, label — one instruction when within ±1MB; adrp + add for farther
- RISC-V: la a1, label is a pseudo-instruction that normally assembles to two instructions, auipc + addi
la (Load Address) has to be a pseudo-instruction because no single RISC-V instruction can carry a full address: I-type immediates are 12 bits and U-type immediates are 20 bits. auipc shifts its 20-bit immediate left by 12 and adds it to the PC; addi then adds the signed low 12 bits. Together the pair reaches any address within about ±2GB of the PC.
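The split between auipc and addi has one subtlety: because addi sign-extends its 12-bit immediate, the upper part must be rounded up whenever the low part comes out negative. The Python sketch below shows the arithmetic; split_pcrel is a hypothetical name, and the target address 0x11008 is inferred from the disassembly earlier in this case study rather than stated there.

```python
def split_pcrel(offset):
    """Split a signed PC-relative offset into (hi20, lo12) for auipc + addi."""
    lo = ((offset & 0xFFF) ^ 0x800) - 0x800   # sign-extended low 12 bits
    hi = (offset - lo) >> 12                  # remainder, compensated for lo's sign
    return hi, lo

def resolve(pc, hi, lo):
    """Address produced by 'auipc rd, hi' at pc followed by 'addi rd, rd, lo'."""
    return pc + (hi << 12) + lo

pc = 0x10080        # address of the auipc in the listing
target = 0x11008    # assumed address of msg, inferred from the listing
hi, lo = split_pcrel(target - pc)
print(hi, lo)                      # matches auipc a1,0x1 / addi a1,a1,-120
print(hex(resolve(pc, hi, lo)))
```

The raw offset here is 0xF88, whose low 12 bits read as -120 once sign-extended, which is why the upper part becomes 1 rather than 0.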
5. Load immediate:
- x86-64: mov rax, 64 is 7 bytes as written (REX.W + opcode + ModRM + 4-byte immediate); assemblers such as NASM may shorten it to the equivalent 5-byte mov eax, 64
- ARM64: mov x8, #64 — 4 bytes (MOVZ encoding)
- RISC-V: li a7, 64 — pseudo-instruction → addi a7, x0, 64 — 4 bytes
Both RISC ISAs (ARM64 and RISC-V) use fixed-width 4-byte instructions here; RISC-V's optional C extension adds 2-byte forms for common cases.
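To make the 4-byte claim for li a7, 64 concrete, the sketch below builds the addi a7, x0, 64 word from its fields. It is a minimal Python illustration of the I-type layout, with encode_addi as a hypothetical helper name; it does not cover any other instruction format.

```python
def encode_addi(rd, rs1, imm):
    """Encode 'addi rd, rs1, imm' as a 32-bit word (register numbers 0..31)."""
    if not -2048 <= imm <= 2047:
        raise ValueError("immediate does not fit in 12 signed bits")
    opcode = 0x13   # OP-IMM
    funct3 = 0x0    # addi
    return ((imm & 0xFFF) << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode

word = encode_addi(rd=17, rs1=0, imm=64)   # a7 is x17, x0 is the zero register
print(f"{word:08x}")                       # the word shown in the disassembly
print(word.to_bytes(4, "little").hex())    # bytes as stored in memory
```

The whole instruction, immediate included, fits in one 32-bit word, which is also why the immediate is limited to 12 bits.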
The li Pseudo-Instruction
li (Load Immediate) is one of several RISC-V pseudo-instructions that the assembler expands to one or more real instructions:
| Pseudo | Expands to | Range / effect |
|---|---|---|
| li rd, imm | addi rd, x0, imm | 12-bit signed (-2048 to 2047) |
| li rd, large_imm | lui rd, upper; addi rd, rd, lower (addiw on RV64) | 32-bit |
| la rd, label | auipc rd, hi; addi rd, rd, lo | PC ± 2GB |
| mv rd, rs | addi rd, rs, 0 | register copy |
| ret | jalr x0, x1, 0 | return via ra |
| nop | addi x0, x0, 0 | no operation |
This is RISC-V's design philosophy: keep the ISA minimal, use pseudo-instructions for convenience. The assembler expands them; the hardware only sees real instructions.
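The li expansion can be modeled in a few lines. The Python sketch below covers only constants that fit in 32 signed bits and is not the assembler's actual algorithm; expand_li is a hypothetical name. It assumes that on RV64 the second instruction is addiw, so that the result is sign-extended from 32 bits.

```python
def expand_li(rd, imm, xlen=64):
    """Return the real instructions for 'li rd, imm' (32-bit signed range only)."""
    if not -(1 << 31) <= imm < (1 << 31):
        raise ValueError("this sketch handles 32-bit constants only")
    if -2048 <= imm <= 2047:
        return [f"addi {rd}, x0, {imm}"]
    lo = ((imm & 0xFFF) ^ 0x800) - 0x800    # signed low 12 bits
    hi = ((imm - lo) >> 12) & 0xFFFFF       # upper 20 bits, adjusted for lo's sign
    add = "addiw" if xlen == 64 else "addi"
    insns = [f"lui {rd}, {hi:#x}"]
    if lo:                                   # skip the add when the low part is zero
        insns.append(f"{add} {rd}, {rd}, {lo}")
    return insns

for value in (64, 4096, 0x12345678, 0x12345FFF):
    print(hex(value), expand_li("a0", value))
```

The last case shows the rounding: the low 12 bits 0xFFF are treated as -1, so the upper part is bumped from 0x12345 to 0x12346.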
The Elegance of RISC-V
Compared to x86-64:
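- No implicit register uses: RISC-V has no equivalent of rax being special for division, rcx for shifts, rdx:rax for multiply. Apart from x0, which is hardwired to zero, every integer register is general-purpose in the base ISA; roles such as ra and sp are ABI conventions.
- Fixed instruction width: all RISC-V base instructions are 4 bytes, and the optional C extension adds 2-byte forms. The length is determined by the low bits of the first halfword, so there is none of x86's byte-by-byte variable-length decoding.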
- No instruction prefixes: x86-64 REX, VEX, EVEX prefixes are a complexity burden. RISC-V has none.
- Open ISA: you can read the full unprivileged specification (a few hundred pages), understand it completely, and design hardware to implement it. The x86-64 ISA specification runs to thousands of pages.
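The fixed-width point can be made concrete. In RISC-V the two lowest bits of an instruction's first halfword tell the decoder whether it is a 2-byte compressed instruction or a 4-byte one. The Python sketch below applies that rule to a byte stream; the function names are illustrative, and longer reserved encodings are rejected rather than handled.

```python
def insn_length(first_halfword):
    """Length in bytes of the instruction starting with this 16-bit halfword."""
    if first_halfword & 0b11 != 0b11:
        return 2                      # compressed (C extension)
    if first_halfword & 0b11100 != 0b11100:
        return 4                      # standard 32-bit instruction
    raise ValueError("encodings longer than 32 bits are not handled here")

def instruction_lengths(code):
    """Walk a little-endian code buffer and return each instruction's length."""
    lengths = []
    i = 0
    while i < len(code):
        half = int.from_bytes(code[i:i + 2], "little")
        n = insn_length(half)
        lengths.append(n)
        i += n
    return lengths

# li a7,64 (4 bytes), c.li a0,1 (2 bytes), ecall (4 bytes)
code = bytes.fromhex("93080004") + bytes.fromhex("0545") + bytes.fromhex("73000000")
print(instruction_lengths(code))
```

Compare this with x86-64, where finding an instruction boundary means parsing prefixes, opcode, ModRM, SIB, displacement, and immediate in turn.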
The hello world in RISC-V is nearly identical to ARM64 — which is expected, since both are clean modern RISC ISAs. The experience of writing it confirms: with assembly fundamentals, learning a new ISA is a matter of learning the register names and instruction encoding, not learning new concepts.
Running in QEMU Full System Emulation
For a more complete experience:
# Download RISC-V Fedora or Ubuntu image
# (instructions at wiki.qemu.org/Documentation/Platforms/RISCV)
qemu-system-riscv64 \
-machine virt \
-cpu rv64 \
-m 2G \
-bios /usr/lib/riscv64-linux-gnu/opensbi/generic/fw_dynamic.bin \
-kernel Image \
-append "root=/dev/vda rw" \
-drive file=rootfs.img,format=qcow2,id=hd0 \
-device virtio-blk-device,drive=hd0 \
-netdev user,id=net0 \
-device virtio-net-device,netdev=net0 \
-nographic
This boots a full RISC-V Linux system. You can compile and run programs natively, install packages, and explore the system — all on emulated RISC-V hardware.
What Comes Next with RISC-V
If this brief introduction has caught your interest:
1. Read the RISC-V ISA specification Volume I (Unprivileged): it is short, clear, and free
2. Try writing a RISC-V version of the MinOS bootloader for qemu-system-riscv64
3. Follow the Linux RISC-V port: the kernel code for the RISC-V architecture entry points is in arch/riscv/ and closely parallels arch/x86/ and arch/arm64/
4. Get real hardware: a StarFive VisionFive 2 is a Linux-capable RISC-V board at ~$70
The open nature of RISC-V means that understanding it deeply — at the assembly level, as you do now — gives you leverage to contribute to hardware design, compiler development, and operating systems work in ways that proprietary ISAs simply do not allow.