Appendix K: ASCII, Encoding, and Number Reference


ASCII Table (0–127)

Control Characters (0–31)

Dec Hex Symbol Name
0 0x00 NUL Null — string terminator in C
1 0x01 SOH Start of Heading
2 0x02 STX Start of Text
3 0x03 ETX End of Text
4 0x04 EOT End of Transmission
5 0x05 ENQ Enquiry
6 0x06 ACK Acknowledge
7 0x07 BEL Bell (audible alert)
8 0x08 BS Backspace
9 0x09 HT Horizontal Tab
10 0x0A LF Line Feed (Unix newline \n)
11 0x0B VT Vertical Tab
12 0x0C FF Form Feed
13 0x0D CR Carriage Return (\r; Windows line endings are CR LF, \r\n)
14 0x0E SO Shift Out
15 0x0F SI Shift In
16 0x10 DLE Data Link Escape
17 0x11 DC1 Device Control 1 (XON)
18 0x12 DC2 Device Control 2
19 0x13 DC3 Device Control 3 (XOFF)
20 0x14 DC4 Device Control 4
21 0x15 NAK Negative Acknowledge
22 0x16 SYN Synchronous Idle
23 0x17 ETB End of Transmission Block
24 0x18 CAN Cancel
25 0x19 EM End of Medium
26 0x1A SUB Substitute
27 0x1B ESC Escape (terminal escape sequences)
28 0x1C FS File Separator
29 0x1D GS Group Separator
30 0x1E RS Record Separator
31 0x1F US Unit Separator

Printable Characters (32–126) and DEL (127)

Dec Hex Char Dec Hex Char Dec Hex Char Dec Hex Char
32 0x20 (space) 56 0x38 8 80 0x50 P 104 0x68 h
33 0x21 ! 57 0x39 9 81 0x51 Q 105 0x69 i
34 0x22 " 58 0x3A : 82 0x52 R 106 0x6A j
35 0x23 # 59 0x3B ; 83 0x53 S 107 0x6B k
36 0x24 $ 60 0x3C < 84 0x54 T 108 0x6C l
37 0x25 % 61 0x3D = 85 0x55 U 109 0x6D m
38 0x26 & 62 0x3E > 86 0x56 V 110 0x6E n
39 0x27 ' 63 0x3F ? 87 0x57 W 111 0x6F o
40 0x28 ( 64 0x40 @ 88 0x58 X 112 0x70 p
41 0x29 ) 65 0x41 A 89 0x59 Y 113 0x71 q
42 0x2A * 66 0x42 B 90 0x5A Z 114 0x72 r
43 0x2B + 67 0x43 C 91 0x5B [ 115 0x73 s
44 0x2C , 68 0x44 D 92 0x5C \ 116 0x74 t
45 0x2D - 69 0x45 E 93 0x5D ] 117 0x75 u
46 0x2E . 70 0x46 F 94 0x5E ^ 118 0x76 v
47 0x2F / 71 0x47 G 95 0x5F _ 119 0x77 w
48 0x30 0 72 0x48 H 96 0x60 ` 120 0x78 x
49 0x31 1 73 0x49 I 97 0x61 a 121 0x79 y
50 0x32 2 74 0x4A J 98 0x62 b 122 0x7A z
51 0x33 3 75 0x4B K 99 0x63 c 123 0x7B {
52 0x34 4 76 0x4C L 100 0x64 d 124 0x7C |
53 0x35 5 77 0x4D M 101 0x65 e 125 0x7D }
54 0x36 6 78 0x4E N 102 0x66 f 126 0x7E ~
55 0x37 7 79 0x4F O 103 0x67 g 127 0x7F DEL

Key ASCII Ranges (useful for validation code)

Range Values Characters
Digits 0x30–0x39 0–9
Uppercase 0x41–0x5A A–Z
Lowercase 0x61–0x7A a–z
Uppercase → lowercase add 0x20 A (0x41) → a (0x61)
Lowercase → uppercase subtract 0x20 mask bit 5: AND 0xDF
Digit → value subtract 0x30 '7' (0x37) - 0x30 = 7
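The ranges and offsets above translate directly into validation helpers. The following is a minimal Python sketch operating on byte values; the function names are hypothetical and chosen only for illustration. Note the range checks before flipping bit 5: applying AND 0xDF blindly would also alter non-letters such as '{'.

```python
def is_digit(b: int) -> bool:
    """True if byte value b is an ASCII digit '0'..'9' (0x30-0x39)."""
    return 0x30 <= b <= 0x39

def is_upper(b: int) -> bool:
    """True if b is an ASCII uppercase letter (0x41-0x5A)."""
    return 0x41 <= b <= 0x5A

def is_lower(b: int) -> bool:
    """True if b is an ASCII lowercase letter (0x61-0x7A)."""
    return 0x61 <= b <= 0x7A

def to_lower(b: int) -> int:
    """Set bit 5 (OR 0x20), but only for uppercase letters."""
    return b | 0x20 if is_upper(b) else b

def to_upper(b: int) -> int:
    """Clear bit 5 (AND 0xDF), but only for lowercase letters."""
    return b & 0xDF if is_lower(b) else b

def digit_value(b: int) -> int:
    """Convert an ASCII digit byte to its numeric value by subtracting 0x30."""
    if not is_digit(b):
        raise ValueError(f"not an ASCII digit: {b:#04x}")
    return b - 0x30

print(chr(to_lower(ord('A'))))   # a
print(chr(to_upper(ord('z'))))   # Z
print(digit_value(ord('7')))     # 7
print(chr(to_upper(ord('{'))))   # { (unchanged: not a letter)
```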

UTF-8 Encoding

UTF-8 is the dominant text encoding on the internet and in Linux systems. It is a variable-width encoding that represents each Unicode code point in one to four bytes.

Encoding Rules

Code point range Byte sequence
U+0000 to U+007F (ASCII) 0xxxxxxx (1 byte)
U+0080 to U+07FF 110xxxxx 10xxxxxx (2 bytes)
U+0800 to U+FFFF 1110xxxx 10xxxxxx 10xxxxxx (3 bytes)
U+10000 to U+10FFFF 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx (4 bytes)

The type of each byte is identified by its high bits:
- 0xxxxxxx: single-byte (ASCII range)
- 110xxxxx: start of 2-byte sequence
- 1110xxxx: start of 3-byte sequence
- 11110xxx: start of 4-byte sequence
- 10xxxxxx: continuation byte
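As a cross-check of the encoding rules, here is a minimal Python sketch that builds the byte sequence by hand and compares it with Python's built-in encoder. The function name `utf8_encode` is hypothetical. The rejection of surrogates (U+D800 to U+DFFF) is not shown in the table above; it is added here because those code points are not valid in UTF-8.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one Unicode code point as UTF-8 using the bit patterns above."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not an encodable code point: {cp:#x}")
    if cp <= 0x7F:                      # 0xxxxxxx
        return bytes([cp])
    if cp <= 0x7FF:                     # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6),
                      0x80 | (cp & 0x3F)])
    if cp <= 0xFFFF:                    # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    return bytes([0xF0 | (cp >> 18),    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])

for cp in (0x41, 0xA9, 0x20AC, 0x1F600):
    encoded = utf8_encode(cp)
    assert encoded == chr(cp).encode("utf-8")   # agrees with the built-in encoder
    print(f"U+{cp:04X}", " ".join(f"{b:02X}" for b in encoded))
# U+0041 41
# U+00A9 C2 A9
# U+20AC E2 82 AC
# U+1F600 F0 9F 98 80
```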

Examples

Character Code Point UTF-8 Bytes
A U+0041 41
€ U+20AC E2 82 AC
© U+00A9 C2 A9
你 U+4F60 E4 BD A0
😀 U+1F600 F0 9F 98 80

Assembly: Testing for ASCII vs. Multi-byte UTF-8

; Test if a byte is ASCII (single-byte UTF-8):
test    al, 0x80        ; if bit 7 is 0, it is ASCII
jz      .is_ascii

; Test if a byte is a UTF-8 continuation byte
; (AND overwrites AL; copy the byte first if it is still needed):
and     al, 0xC0
cmp     al, 0x80        ; 10xxxxxx = continuation
je      .is_continuation
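The same mask-and-compare tests can be expressed in Python. This sketch (all function names hypothetical) derives the sequence length from a lead byte and counts code points by skipping continuation bytes. It assumes the input is already well-formed UTF-8 and does no validation beyond the lead-byte pattern.

```python
def is_continuation(b: int) -> bool:
    """10xxxxxx: same test as AND 0xC0 / CMP 0x80."""
    return (b & 0xC0) == 0x80

def sequence_length(lead: int) -> int:
    """Total bytes in the sequence started by `lead`, or 0 if it is not a lead byte."""
    if (lead & 0x80) == 0x00:   # 0xxxxxxx (same test as TEST AL, 0x80)
        return 1
    if (lead & 0xE0) == 0xC0:   # 110xxxxx
        return 2
    if (lead & 0xF0) == 0xE0:   # 1110xxxx
        return 3
    if (lead & 0xF8) == 0xF0:   # 11110xxx
        return 4
    return 0                    # continuation byte or invalid lead

def count_code_points(data: bytes) -> int:
    """Count code points in well-formed UTF-8 by counting non-continuation bytes."""
    return sum(1 for b in data if not is_continuation(b))

data = "A\u20ac\u00a9".encode("utf-8")      # 41  E2 82 AC  C2 A9
print(len(data))                            # 6
print(count_code_points(data))              # 3
print([sequence_length(b) for b in data])   # [1, 3, 0, 0, 2, 0]
```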

Number Base Conversion Reference

Powers of 2

Power Value Hex Notes
2^0 1 0x1
2^1 2 0x2
2^2 4 0x4
2^3 8 0x8
2^4 16 0x10 1 hex digit
2^5 32 0x20
2^6 64 0x40
2^7 128 0x80 High bit of byte
2^8 256 0x100 Number of values in 1 byte (0xFF + 1)
2^9 512 0x200
2^10 1,024 0x400 1 KiB
2^11 2,048 0x800
2^12 4,096 0x1000 1 page (4 KiB)
2^13 8,192 0x2000
2^14 16,384 0x4000
2^15 32,768 0x8000 High bit of 16-bit
2^16 65,536 0x10000 64 KiB
2^20 1,048,576 0x100000 1 MiB
2^21 2,097,152 0x200000 2 MiB page
2^30 1,073,741,824 0x40000000 1 GiB
2^31 2,147,483,648 0x80000000 High bit of 32-bit
2^32 4,294,967,296 0x100000000 4 GiB
2^40 1,099,511,627,776 0x10000000000 1 TiB
2^48 281,474,976,710,656 0x1000000000000 Size of a 48-bit virtual address space (256 TiB)
2^63 9,223,372,036,854,775,808 0x8000000000000000 High bit of 64-bit

Hexadecimal Quick Reference

Hex Binary Dec Hex Binary Dec
0 0000 0 8 1000 8
1 0001 1 9 1001 9
2 0010 2 A 1010 10
3 0011 3 B 1011 11
4 0100 4 C 1100 12
5 0101 5 D 1101 13
6 0110 6 E 1110 14
7 0111 7 F 1111 15

Two's Complement Quick Reference

For an N-bit signed integer:
- Range: -2^(N-1) to +2^(N-1) - 1
- Negative number: invert all bits, add 1

N Min (signed) Max (signed) Max (unsigned)
8 -128 (0x80) 127 (0x7F) 255 (0xFF)
16 -32,768 (0x8000) 32,767 (0x7FFF) 65,535 (0xFFFF)
32 -2,147,483,648 (0x80000000) 2,147,483,647 (0x7FFFFFFF) 4,294,967,295 (0xFFFFFFFF)
64 -9,223,372,036,854,775,808 9,223,372,036,854,775,807 18,446,744,073,709,551,615
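Because Python integers are unbounded, converting between signed and unsigned N-bit views needs explicit masking. The following minimal sketch (function names hypothetical) reproduces the limits in the table and the invert-and-add-1 rule.

```python
def to_unsigned(value: int, bits: int) -> int:
    """Return the N-bit two's complement bit pattern of value."""
    return value & ((1 << bits) - 1)

def to_signed(value: int, bits: int) -> int:
    """Interpret the low N bits of value as a signed two's complement integer."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):       # sign bit set
        return value - (1 << bits)
    return value

def negate(value: int, bits: int) -> int:
    """Negate an N-bit pattern: invert all bits, add 1, wrap to N bits."""
    mask = (1 << bits) - 1
    return ((value ^ mask) + 1) & mask

print(hex(to_unsigned(-128, 32)))   # 0xffffff80
print(to_signed(0x80, 8))           # -128
print(to_signed(0x7FFF, 16))        # 32767
print(hex(negate(1, 64)))           # 0xffffffffffffffff
print(hex(negate(0x80, 8)))         # 0x80 (the minimum has no positive counterpart)
```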

Common Small Negatives in Hex (32-bit and 64-bit)

Decimal 32-bit Hex 64-bit Hex
-1 0xFFFFFFFF 0xFFFFFFFFFFFFFFFF
-2 0xFFFFFFFE 0xFFFFFFFFFFFFFFFE
-4 0xFFFFFFFC 0xFFFFFFFFFFFFFFFC
-8 0xFFFFFFF8 0xFFFFFFFFFFFFFFF8
-16 0xFFFFFFF0 0xFFFFFFFFFFFFFFF0
-128 0xFFFFFF80 0xFFFFFFFFFFFFFF80

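A system call return value in the range 0xFFFFFFFFFFFFF001 to 0xFFFFFFFFFFFFFFFF (-4095 to -1) is a negated errno on Linux; for example, 0xFFFFFFFFFFFFFFF4 is -12, which is -ENOMEM. A value like 0xFFFFFFFFFFFFF000 is -4096 = -0x1000, which falls just outside that error range, so it is not an error code. Restricting errors to this narrow range is how calls that return addresses, such as mmap, keep large return values distinguishable from failures.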
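When reading raw register dumps, the check can be automated. On Linux, a raw system call reports failure by returning a value from -4095 to -1 (a negated errno); anything else is an ordinary result. The sketch below assumes a 64-bit return value taken from RAX. The name `decode_syscall_return` and the tuple it returns are hypothetical.

```python
MAX_ERRNO = 4095  # largest errno value the Linux kernel encodes in a return value

def decode_syscall_return(rax: int):
    """Classify a raw 64-bit syscall return value as an errno or a plain value."""
    rax &= (1 << 64) - 1
    signed = rax - (1 << 64) if rax & (1 << 63) else rax
    if -MAX_ERRNO <= signed <= -1:
        return ("error", -signed)       # positive errno number
    return ("value", rax)

print(decode_syscall_return(0xFFFFFFFFFFFFFFF4))   # ('error', 12), i.e. ENOMEM
print(decode_syscall_return(0xFFFFFFFFFFFFF001))   # ('error', 4095)
print(decode_syscall_return(0xFFFFFFFFFFFFF000))   # ('value', 18446744073709547520)
print(decode_syscall_return(3))                    # ('value', 3)
```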


Byte Order (Endianness)

Little-Endian (x86-64, ARM64, RISC-V default)

The least significant byte is stored at the lowest address. The value 0x0000000000401160 is laid out in memory by x86-64 as follows:

Address:  0x7fff0000  0x7fff0001  0x7fff0002  0x7fff0003  0x7fff0004  0x7fff0005  0x7fff0006  0x7fff0007
Value:    0x60        0x11        0x40        0x00        0x00        0x00        0x00        0x00

Reading left to right: 60 11 40 00 00 00 00 00 is 0x0000000000401160 in little-endian.

Big-Endian (network byte order, some MIPS/SPARC configurations)

The most significant byte is stored at the lowest address. The same value 0x0000000000401160 stored big-endian:

Address:  0x7fff0000  0x7fff0001  ...  0x7fff0007
Value:    0x00        0x00        ...  0x60

Conversion in Assembly (x86-64)

; Swap bytes of RAX (convert between endianness):
bswap   rax         ; reverse byte order of 64-bit register
bswap   eax         ; reverse byte order of 32-bit register (clears upper 32 bits)
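To check a byte swap by hand, the effect of bswap can be modeled in Python with `int.to_bytes` and `int.from_bytes`. This is a sketch of the instruction's result on a register value, not of its implementation; the function names are hypothetical.

```python
def bswap64(x: int) -> int:
    """Reverse the byte order of a 64-bit value (models bswap rax)."""
    return int.from_bytes((x & 0xFFFFFFFFFFFFFFFF).to_bytes(8, "little"), "big")

def bswap32(x: int) -> int:
    """Reverse the low 32 bits; the result has the upper 32 bits clear (models bswap eax)."""
    return int.from_bytes((x & 0xFFFFFFFF).to_bytes(4, "little"), "big")

print(hex(bswap64(0x401160)))            # 0x6011400000000000
print(hex(bswap32(0x401160)))            # 0x60114000
print(hex(bswap64(bswap64(0x401160))))   # 0x401160 (swapping twice restores the value)

# Reading a little-endian memory dump back into an integer:
dump = bytes.fromhex("6011400000000000")
print(hex(int.from_bytes(dump, "little")))   # 0x401160
```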

Python Packing Conventions

import struct

# Little-endian (x86-64, ARM64):
struct.pack('<Q', 0x401160)   # b'\x60\x11\x40\x00\x00\x00\x00\x00'
struct.pack('<I', 0x401160)   # b'\x60\x11\x40\x00'

# Big-endian (network):
struct.pack('>Q', 0x401160)   # b'\x00\x00\x00\x00\x00\x40\x11\x60'

# pwntools:
from pwn import p64, p32, u64, u32
p64(0x401160)                  # little-endian 8 bytes
u64(b'\x60\x11\x40\x00' + b'\x00' * 4)  # unpack

IEEE 754 Floating-Point Quick Reference

Single Precision (32-bit, float)

Field Bits Description
Sign 1 (bit 31) 0 = positive, 1 = negative
Exponent 8 (bits 30-23) Biased by 127
Mantissa 23 (bits 22-0) Fractional part (implicit leading 1)

Special values:
- 0x7F800000 = +Infinity
- 0xFF800000 = -Infinity
- 0x7FC00000 = NaN (quiet)
- 0x00000000 = +0.0
- 0x3F800000 = 1.0
- 0x40000000 = 2.0
- 0x3F000000 = 0.5

Double Precision (64-bit, double)

Field Bits Description
Sign 1 (bit 63) 0 = positive, 1 = negative
Exponent 11 (bits 62-52) Biased by 1023
Mantissa 52 (bits 51-0) Fractional part (implicit leading 1)

Special values:
- 0x7FF0000000000000 = +Infinity
- 0x3FF0000000000000 = 1.0
- 0x4000000000000000 = 2.0
- 0x3FE0000000000000 = 0.5
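The bit patterns above can be verified with Python's `struct` module, which reinterprets a float's bytes as an integer. The helper names in this minimal sketch are hypothetical. NaN is left out of the printed checks because its exact payload bits can vary.

```python
import struct

def float32_bits(x: float) -> int:
    """Raw 32-bit pattern of x as a single-precision float."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def float64_bits(x: float) -> int:
    """Raw 64-bit pattern of x as a double-precision float."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def split_float32(bits: int):
    """Split a 32-bit pattern into (sign, biased exponent, mantissa)."""
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF      # biased by 127
    mantissa = bits & 0x7FFFFF          # 23 fraction bits
    return sign, exponent, mantissa

print(hex(float32_bits(1.0)))             # 0x3f800000
print(hex(float32_bits(float("-inf"))))   # 0xff800000
print(hex(float64_bits(0.5)))             # 0x3fe0000000000000
print(split_float32(0x3F800000))          # (0, 127, 0): 1.0 x 2^(127-127)
print(split_float32(0x3F000000))          # (0, 126, 0): 1.0 x 2^(126-127)
```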

Assembly: Examining Floats

; Store the raw bits of 1.0f to memory and load them back as a float
; ([rsp - 4] is below the stack pointer; this relies on the System V red zone):
mov     DWORD [rsp - 4], 0x3F800000   ; 1.0f as raw bits
movss   xmm0, DWORD [rsp - 4]         ; load as float

; Inspect the same register as a float and as raw integer bits in GDB:
; (gdb) p/f $xmm0.v4_float[0]         shows as 1.0
; (gdb) p/x $xmm0.v4_int32[0]         shows as 0x3f800000