Appendix K: ASCII, Encoding, and Number Reference

ASCII Table (0–127)

Control Characters (0–31)

Dec	Hex	Symbol	Name
0	0x00	NUL	Null — string terminator in C
1	0x01	SOH	Start of Heading
2	0x02	STX	Start of Text
3	0x03	ETX	End of Text
4	0x04	EOT	End of Transmission
5	0x05	ENQ	Enquiry
6	0x06	ACK	Acknowledge
7	0x07	BEL	Bell (audible alert)
8	0x08	BS	Backspace
9	0x09	HT	Horizontal Tab
10	0x0A	LF	Line Feed (Unix newline `\n`)
11	0x0B	VT	Vertical Tab
12	0x0C	FF	Form Feed
13	0x0D	CR	Carriage Return (`\r`; Windows line endings are CR LF, `\r\n`)
14	0x0E	SO	Shift Out
15	0x0F	SI	Shift In
16	0x10	DLE	Data Link Escape
17	0x11	DC1	Device Control 1 (XON)
18	0x12	DC2	Device Control 2
19	0x13	DC3	Device Control 3 (XOFF)
20	0x14	DC4	Device Control 4
21	0x15	NAK	Negative Acknowledge
22	0x16	SYN	Synchronous Idle
23	0x17	ETB	End of Transmission Block
24	0x18	CAN	Cancel
25	0x19	EM	End of Medium
26	0x1A	SUB	Substitute
27	0x1B	ESC	Escape (terminal escape sequences)
28	0x1C	FS	File Separator
29	0x1D	GS	Group Separator
30	0x1E	RS	Record Separator
31	0x1F	US	Unit Separator

Printable Characters (32–126) and DEL (127)

Dec	Hex	Char	Dec	Hex	Char	Dec	Hex	Char	Dec	Hex	Char
32	0x20	(space)	56	0x38	8	80	0x50	P	104	0x68	h
33	0x21	!	57	0x39	9	81	0x51	Q	105	0x69	i
34	0x22	"	58	0x3A	:	82	0x52	R	106	0x6A	j
35	0x23	#	59	0x3B	;	83	0x53	S	107	0x6B	k
36	0x24	$	60	0x3C	<	84	0x54	T	108	0x6C	l
37	0x25	%	61	0x3D	=	85	0x55	U	109	0x6D	m
38	0x26	&	62	0x3E	>	86	0x56	V	110	0x6E	n
39	0x27	'	63	0x3F	?	87	0x57	W	111	0x6F	o
40	0x28	(	64	0x40	@	88	0x58	X	112	0x70	p
41	0x29	)	65	0x41	A	89	0x59	Y	113	0x71	q
42	0x2A	*	66	0x42	B	90	0x5A	Z	114	0x72	r
43	0x2B	+	67	0x43	C	91	0x5B	[	115	0x73	s
44	0x2C	,	68	0x44	D	92	0x5C	\	116	0x74	t
45	0x2D	-	69	0x45	E	93	0x5D	]	117	0x75	u
46	0x2E	.	70	0x46	F	94	0x5E	^	118	0x76	v
47	0x2F	/	71	0x47	G	95	0x5F	_	119	0x77	w
48	0x30	0	72	0x48	H	96	0x60	`	120	0x78	x
49	0x31	1	73	0x49	I	97	0x61	a	121	0x79	y
50	0x32	2	74	0x4A	J	98	0x62	b	122	0x7A	z
51	0x33	3	75	0x4B	K	99	0x63	c	123	0x7B	{
52	0x34	4	76	0x4C	L	100	0x64	d	124	0x7C	|
53	0x35	5	77	0x4D	M	101	0x65	e	125	0x7D	}
54	0x36	6	78	0x4E	N	102	0x66	f	126	0x7E	~
55	0x37	7	79	0x4F	O	103	0x67	g	127	0x7F	DEL

Key ASCII Ranges (useful for validation code)

Range	Values	Characters
Digits	0x30–0x39	`0`–`9`
Uppercase	0x41–0x5A	`A`–`Z`
Lowercase	0x61–0x7A	`a`–`z`
Uppercase → lowercase	add 0x20	`A` (0x41) → `a` (0x61)
Lowercase → uppercase	subtract 0x20	mask bit 5: `AND 0xDF`
Digit → value	subtract 0x30	`'7'` (0x37) - 0x30 = 7
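The range checks and conversions in the table can be sketched in Python as follows. The function names are illustrative, not from any library, and each function takes a byte value (an integer 0–255) rather than a string.

```python
def is_digit(b: int) -> bool:
    return 0x30 <= b <= 0x39          # '0'..'9'

def is_upper(b: int) -> bool:
    return 0x41 <= b <= 0x5A          # 'A'..'Z'

def is_lower(b: int) -> bool:
    return 0x61 <= b <= 0x7A          # 'a'..'z'

def to_lower(b: int) -> int:
    # Add 0x20 (equivalently, set bit 5), but only for uppercase letters.
    return b + 0x20 if is_upper(b) else b

def to_upper(b: int) -> int:
    # Clear bit 5 (AND 0xDF), but only for lowercase letters.
    return b & 0xDF if is_lower(b) else b

def digit_value(b: int) -> int:
    if not is_digit(b):
        raise ValueError(f"0x{b:02X} is not an ASCII digit")
    return b - 0x30
```

The range check before each conversion matters: applying `AND 0xDF` blindly to `{` (0x7B) would yield `[` (0x5B).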

UTF-8 Encoding

UTF-8 is the dominant text encoding on the internet and in Linux systems. It is a variable-width encoding of Unicode code points.

Encoding Rules

Code point range	Byte sequence
U+0000 to U+007F (ASCII)	`0xxxxxxx` (1 byte)
U+0080 to U+07FF	`110xxxxx 10xxxxxx` (2 bytes)
U+0800 to U+FFFF	`1110xxxx 10xxxxxx 10xxxxxx` (3 bytes)
U+10000 to U+10FFFF	`11110xxx 10xxxxxx 10xxxxxx 10xxxxxx` (4 bytes)

The type of each byte is identified by its high bits:
- 0xxxxxxx: single-byte (ASCII range)
- 110xxxxx: start of 2-byte sequence
- 1110xxxx: start of 3-byte sequence
- 11110xxx: start of 4-byte sequence
- 10xxxxxx: continuation byte
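The encoding rules above can be sketched as a small hand-written encoder. The name `utf8_encode` is illustrative; in practice Python's built-in `str.encode('utf-8')` does the same job. The sketch also rejects the surrogate range U+D800 to U+DFFF, which UTF-8 does not permit.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one Unicode code point to UTF-8 using the bit patterns above."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not an encodable code point: {cp:#x}")
    if cp <= 0x7F:                       # 0xxxxxxx
        return bytes([cp])
    if cp <= 0x7FF:                      # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6),
                      0x80 | (cp & 0x3F)])
    if cp <= 0xFFFF:                     # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])
```

For instance, `utf8_encode(0x20AC).hex(' ')` gives `e2 82 ac`, the three-byte encoding of the euro sign.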

Examples

Character	Code Point	UTF-8 Bytes
`A`	U+0041	`41`
`€`	U+20AC	`E2 82 AC`
`©`	U+00A9	`C2 A9`
`你`	U+4F60	`E4 BD A0`
`😀`	U+1F600	`F0 9F 98 80`

Assembly: Testing for ASCII vs. Multi-byte UTF-8

; Test if a byte is ASCII (single-byte UTF-8):
test    al, 0x80        ; if bit 7 is 0, it is ASCII
jz      .is_ascii

; Test if a byte is a UTF-8 continuation byte (clobbers AL):
and     al, 0xC0        ; keep only the top two bits
cmp     al, 0x80        ; 10xxxxxx = continuation
je      .is_continuation
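The same two bit tests, written in Python, are enough to count code points in a byte string: every byte that is not a continuation byte starts a new character. This is a sketch that assumes the input is already valid UTF-8; the function names are illustrative.

```python
def is_ascii(b: int) -> bool:
    return (b & 0x80) == 0        # bit 7 clear: 0xxxxxxx

def is_continuation(b: int) -> bool:
    return (b & 0xC0) == 0x80     # top two bits are 10

def count_code_points(data: bytes) -> int:
    # Each code point contributes exactly one non-continuation byte.
    return sum(1 for b in data if not is_continuation(b))
```

Counting this way avoids decoding the text at all, which is why the continuation-byte test is a common building block in string-length routines.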

Number Base Conversion Reference

Powers of 2

Power	Value	Hex	Notes
2^0	1	0x1
2^1	2	0x2
2^2	4	0x4
2^3	8	0x8
2^4	16	0x10	1 hex digit
2^5	32	0x20
2^6	64	0x40
2^7	128	0x80	High bit of byte
2^8	256	0x100	Max byte value (255) + 1
2^9	512	0x200
2^10	1,024	0x400	1 KiB
2^11	2,048	0x800
2^12	4,096	0x1000	1 page (4 KiB)
2^13	8,192	0x2000
2^14	16,384	0x4000
2^15	32,768	0x8000	High bit of 16-bit
2^16	65,536	0x10000	64 KiB
2^20	1,048,576	0x100000	1 MiB
2^21	2,097,152	0x200000	2 MiB page
2^30	1,073,741,824	0x40000000	1 GiB
2^31	2,147,483,648	0x80000000	High bit of 32-bit
2^32	4,294,967,296	0x100000000	4 GiB
2^40	1,099,511,627,776	0x10000000000	1 TiB
2^48	281,474,976,710,656	0x1000000000000	256 TiB; size of a 48-bit virtual address space
2^63	9,223,372,036,854,775,808	0x8000000000000000	High bit of 64-bit

Hexadecimal Quick Reference

Hex	Binary	Dec	Hex	Binary	Dec
0	0000	0	8	1000	8
1	0001	1	9	1001	9
2	0010	2	A	1010	10
3	0011	3	B	1011	11
4	0100	4	C	1100	12
5	0101	5	D	1101	13
6	0110	6	E	1110	14
7	0111	7	F	1111	15
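Because each hex digit maps to exactly four bits, a hex value can be expanded to binary one digit at a time. A minimal sketch, with an illustrative lookup table built from the same sixteen rows as the table above:

```python
# One entry per hex digit: '0' -> '0000', ..., 'F' -> '1111'.
HEX_TO_BIN = {f"{n:X}": f"{n:04b}" for n in range(16)}

def hex_to_binary(hex_digits: str) -> str:
    """Expand a string of hex digits into space-separated 4-bit groups."""
    return " ".join(HEX_TO_BIN[d] for d in hex_digits.upper())
```

For example, `hex_to_binary("3F")` returns `"0011 1111"`.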

Two's Complement Quick Reference

For an N-bit signed integer:
- Range: -2^(N-1) to +2^(N-1) - 1
- Negative number: invert all bits, add 1

N	Min (signed)	Max (signed)	Max (unsigned)
8	-128 (0x80)	127 (0x7F)	255 (0xFF)
16	-32,768 (0x8000)	32,767 (0x7FFF)	65,535 (0xFFFF)
32	-2,147,483,648 (0x80000000)	2,147,483,647 (0x7FFFFFFF)	4,294,967,295 (0xFFFFFFFF)
64	-9,223,372,036,854,775,808	9,223,372,036,854,775,807	18,446,744,073,709,551,615
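Python integers are unbounded, so fixed-width two's complement has to be emulated with masks. The helpers below are a sketch with illustrative names; `bits` is the register width N from the table.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's complement integer."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):        # sign bit set
        return value - (1 << bits)
    return value

def to_unsigned(value: int, bits: int) -> int:
    """Return the N-bit pattern that represents a (possibly negative) value."""
    return value & ((1 << bits) - 1)

def negate(value: int, bits: int) -> int:
    """Two's complement negation: invert all bits, add 1, wrap to N bits."""
    mask = (1 << bits) - 1
    return ((value ^ mask) + 1) & mask
```

Note that negating the minimum value wraps back to itself: `negate(0x80, 8)` is again `0x80`, because +128 does not fit in 8 signed bits.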

Common Small Negatives in Hex (32-bit and 64-bit)

Decimal	32-bit Hex	64-bit Hex
-1	0xFFFFFFFF	0xFFFFFFFFFFFFFFFF
-2	0xFFFFFFFE	0xFFFFFFFFFFFFFFFE
-4	0xFFFFFFFC	0xFFFFFFFFFFFFFFFC
-8	0xFFFFFFF8	0xFFFFFFFFFFFFFFF8
-16	0xFFFFFFF0	0xFFFFFFFFFFFFFFF0
-128	0xFFFFFF80	0xFFFFFFFFFFFFFF80

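On Linux, a raw system call signals failure by returning a negated errno in the range -4095 to -1, that is, 0xFFFFFFFFFFFFF001 to 0xFFFFFFFFFFFFFFFF. For example, 0xFFFFFFFFFFFFFFF4 is -12, which is -ENOMEM (errno 12). A value such as 0xFFFFFFFFFFFFF000 is -4096 = -0x1000, just outside that range, so it is not an error code; any return value outside the error range, including a large address from mmap, is a valid result.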
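When reading register dumps, it can help to decode a raw 64-bit return value mechanically. The sketch below assumes the Linux convention that a raw system call reports failure as a negated errno between -4095 and -1; the function name and the returned tuple shape are illustrative.

```python
import errno

MAX_ERRNO = 4095                 # assumed upper bound of the error range
MASK64 = (1 << 64) - 1

def decode_syscall_return(rax: int):
    """Classify a raw 64-bit syscall return value as an error or a result."""
    rax &= MASK64
    signed = rax - (1 << 64) if rax >> 63 else rax
    if -MAX_ERRNO <= signed <= -1:
        # Look up the symbolic name; fall back to the number if unknown.
        return ("error", errno.errorcode.get(-signed, f"errno {-signed}"))
    return ("value", rax)
```

With this helper, `decode_syscall_return(0xFFFFFFFFFFFFFFF4)` reports `ENOMEM`, while 0xFFFFFFFFFFFFF000 (-4096) falls outside the error range and is returned as an ordinary value.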

Byte Order (Endianness)

Little-Endian (x86-64, ARM64, RISC-V default)

The least significant byte is stored at the lowest address. The value 0x0000000000401160 in memory (as stored by x86-64):

Address:  0x7fff0000  0x7fff0001  0x7fff0002  0x7fff0003  0x7fff0004  0x7fff0005  0x7fff0006  0x7fff0007
Value:    0x60        0x11        0x40        0x00        0x00        0x00        0x00        0x00

Reading left to right: 60 11 40 00 00 00 00 00 is 0x0000000000401160 in little-endian.

Big-Endian (network byte order, some MIPS/SPARC configurations)

The most significant byte is stored at the lowest address. The same value 0x0000000000401160 big-endian:

Address:  0x7fff0000  0x7fff0001  ...  0x7fff0007
Value:    0x00        0x00        ...  0x60

Conversion in Assembly (x86-64)

; Swap bytes of RAX (convert between endianness):
bswap   rax         ; reverse byte order of 64-bit register
bswap   eax         ; reverse byte order of 32-bit register (clears upper 32 bits of RAX)
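To check a byte swap by hand, the effect of `bswap` can be emulated in Python by serializing the value in one byte order and reading it back in the other. The names `bswap64` and `bswap32` are illustrative helpers, not library functions.

```python
def bswap64(x: int) -> int:
    """Reverse the byte order of a 64-bit value, like `bswap rax`."""
    return int.from_bytes((x & 0xFFFFFFFFFFFFFFFF).to_bytes(8, "little"), "big")

def bswap32(x: int) -> int:
    """Reverse the byte order of a 32-bit value, like `bswap eax`."""
    return int.from_bytes((x & 0xFFFFFFFF).to_bytes(4, "little"), "big")
```

Applied to the running example, `hex(bswap64(0x401160))` gives `0x6011400000000000`, and swapping twice returns the original value.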

Python Packing Conventions

import struct

# Little-endian (x86-64, ARM64):
struct.pack('<Q', 0x401160)   # b'\x60\x11\x40\x00\x00\x00\x00\x00'
struct.pack('<I', 0x401160)   # b'\x60\x11\x40\x00'

# Big-endian (network):
struct.pack('>Q', 0x401160)   # b'\x00\x00\x00\x00\x00\x40\x11\x60'

# pwntools:
from pwn import p64, p32, u64, u32
p64(0x401160)                  # little-endian 8 bytes
u64(b'\x60\x11\x40\x00' + b'\x00' * 4)  # unpack

IEEE 754 Floating-Point Quick Reference

Single Precision (32-bit, `float`)

Field	Bits	Description
Sign	1 (bit 31)	0 = positive, 1 = negative
Exponent	8 (bits 30-23)	Biased by 127
Mantissa	23 (bits 22-0)	Fractional part (implicit leading 1)

Special values:
- 0x7F800000 = +Infinity
- 0xFF800000 = -Infinity
- 0x7FC00000 = NaN (quiet)
- 0x00000000 = +0.0
- 0x3F800000 = 1.0
- 0x40000000 = 2.0
- 0x3F000000 = 0.5
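The field layout and the bit patterns listed above can be checked with Python's `struct` module, which reinterprets the same four bytes as either an integer or a float. The helper names here are illustrative.

```python
import struct

def float32_fields(bits: int):
    """Split a 32-bit pattern into (sign, biased exponent, mantissa)."""
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    return sign, exponent, mantissa

def bits_to_float32(bits: int) -> float:
    # Pack as a 32-bit unsigned integer, unpack the same bytes as a float.
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def float32_to_bits(x: float) -> int:
    return struct.unpack("<I", struct.pack("<f", x))[0]
```

For 1.0 (0x3F800000) the fields are sign 0, biased exponent 127, mantissa 0: the true exponent is 127 - 127 = 0 and the implicit leading 1 supplies the value.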

Double Precision (64-bit, `double`)

Field	Bits	Description
Sign	1 (bit 63)	0 = positive, 1 = negative
Exponent	11 (bits 62-52)	Biased by 1023
Mantissa	52 (bits 51-0)	Fractional part (implicit leading 1)

Special values:
- 0x7FF0000000000000 = +Infinity
- 0x3FF0000000000000 = 1.0
- 0x4000000000000000 = 2.0
- 0x3FE0000000000000 = 0.5
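For doubles, the value of a normal number can be rebuilt directly from the three fields as (-1)^sign × (1 + mantissa / 2^52) × 2^(exponent - 1023). The sketch below applies that formula; it deliberately rejects exponent fields 0 and 0x7FF, which encode zero, subnormals, infinities, and NaN. The function names are illustrative.

```python
import struct

def double_to_bits(x: float) -> int:
    # Reinterpret the 8 bytes of a double as a 64-bit unsigned integer.
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def decode_normal_double(bits: int) -> float:
    """Rebuild a normal double from its sign, exponent, and mantissa fields."""
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF
    mantissa = bits & ((1 << 52) - 1)
    if exponent in (0, 0x7FF):
        raise ValueError("zero, subnormal, infinity, or NaN")
    return (-1.0) ** sign * (1 + mantissa / 2**52) * 2.0 ** (exponent - 1023)
```

Round-tripping a value through both helpers, as in `decode_normal_double(double_to_bits(-1.5))`, is a quick way to confirm the field boundaries.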

Assembly: Examining Floats

; Store the raw bit pattern of 1.0f to memory and load it as a float:
mov     DWORD [rsp - 4], 0x3F800000   ; 1.0f as raw bits (uses the red zone below RSP)
movss   xmm0, DWORD [rsp - 4]         ; load as float

; Inspect the same register as a float and as raw bits in GDB:
; (gdb) p/f $xmm0.v4_float[0]         shows as 1.0
; (gdb) p/x $xmm0.v4_int32[0]         shows as 0x3f800000