Chapter 2 Key Takeaways: Numbers in the Machine
The 14 Most Important Points from This Chapter
1. The machine does not know the type of its bit patterns.
The value 0xFF in a byte is simultaneously 255 (unsigned), -1 (signed two's complement), and the character ÿ in ISO 8859-1 (Latin-1); it is not an ASCII character, since ASCII defines only 0x00 through 0x7F. The instruction that operates on it determines the interpretation. The CPU executes the instruction's rule; the programmer decides the meaning.
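A minimal C sketch of the same idea, assuming a platform that provides `int8_t` (which C guarantees to be two's complement when it exists). The helper names `as_unsigned` and `as_signed` are illustrative, not from the chapter:

```c
#include <stdint.h>
#include <string.h>

/* Read one byte as an unsigned quantity: 0xFF -> 255. */
unsigned as_unsigned(uint8_t bits) {
    return bits;
}

/* Read the very same byte as signed two's complement: 0xFF -> -1. */
int as_signed(uint8_t bits) {
    int8_t s;
    memcpy(&s, &bits, 1);   /* copy the bit pattern; no value conversion */
    return s;
}
```

Calling both helpers with 0xFF returns 255 and -1 respectively; the stored byte never changes, only the rule used to read it.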
2. Two's complement is mathematically principled, not an arbitrary convention.
For an n-bit two's complement integer, the MSB has weight −2^(n-1). The flip-and-add-1 negation rule follows directly from this: for any value P, ~P + 1 = -P because P + ~P = -1 (all ones in two's complement). This is a theorem, not a definition.
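The negation rule can be checked on raw 8-bit patterns. This is a small sketch in C; `negate_bits` is a hypothetical helper that works on unsigned bytes so that no signed overflow is involved:

```c
#include <stdint.h>

/* Negate an 8-bit two's complement pattern by flipping the bits and adding 1. */
uint8_t negate_bits(uint8_t p) {
    return (uint8_t)(~p + 1u);   /* only the low 8 bits are kept */
}
```

For example, 5 (0x05) becomes 0xFB, the 8-bit pattern for -5, and applying the function again returns 0x05. The pattern 0x80 (-128) maps to itself because +128 is not representable in 8 bits.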
3. Carry Flag (CF) signals unsigned overflow; Overflow Flag (OF) signals signed overflow.
The two flags are set independently and checked independently. Depending on the operands, both, neither, or only one may be set. Using the wrong conditional jump for a given number type (e.g., the signed jl instead of the unsigned jb) is a silent bug that produces wrong results without any error signal.
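The independence of the two flags can be modeled in C for 8-bit addition. This is a sketch of the flag rules, not of how the hardware computes them; the type `add8_flags` and function `add8` are illustrative names:

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint8_t result;
    bool carry;     /* models CF: the unsigned sum did not fit in 8 bits */
    bool overflow;  /* models OF: the signed sum did not fit in 8 bits */
} add8_flags;

add8_flags add8(uint8_t a, uint8_t b) {
    add8_flags f;
    unsigned wide = (unsigned)a + (unsigned)b;   /* compute with room to spare */
    f.result = (uint8_t)wide;
    f.carry = wide > 0xFF;
    /* Signed overflow: both operands share a sign bit that the result lacks. */
    f.overflow = ((a ^ f.result) & (b ^ f.result) & 0x80) != 0;
    return f;
}
```

With this model, 0xFF + 0x01 sets carry only, 0x7F + 0x01 sets overflow only, 0x80 + 0x80 sets both, and 0x01 + 0x01 sets neither.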
4. Writing a 32-bit register zeroes the upper 32 bits; 8- and 16-bit writes do not.
mov eax, 0 sets all 64 bits of RAX to zero. mov ax, 0 leaves the upper 48 bits of RAX unchanged. mov al, 0 leaves the upper 56 bits unchanged. The asymmetry between 32-bit and narrower writes is unique to x86-64 and a common source of subtle bugs.
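C cannot address sub-registers directly, so the following is only a software model of the write rules described above. The function names `write32`, `write16`, and `write8` are hypothetical; each takes the old 64-bit register value and returns the new one:

```c
#include <stdint.h>

/* Model of a 32-bit write (mov eax, v): the result is zero-extended,
   so the old upper 32 bits are discarded. */
uint64_t write32(uint64_t rax, uint32_t v) {
    (void)rax;               /* old contents play no role */
    return (uint64_t)v;
}

/* Model of a 16-bit write (mov ax, v): upper 48 bits are preserved. */
uint64_t write16(uint64_t rax, uint16_t v) {
    return (rax & ~UINT64_C(0xFFFF)) | v;
}

/* Model of an 8-bit write (mov al, v): upper 56 bits are preserved. */
uint64_t write8(uint64_t rax, uint8_t v) {
    return (rax & ~UINT64_C(0xFF)) | v;
}
```

Starting from 0x1122334455667788 and writing zero, the three models yield 0, 0x1122334455660000, and 0x1122334455667700 respectively.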
5. Sign extension (MOVSX) and zero extension (MOVZX) are not interchangeable.
movzx rax, al fills upper bits with zeros (treats AL as unsigned). movsx rax, al fills upper bits with copies of bit 7 (treats AL as signed). Using MOVZX on negative signed data yields a positive value instead of the intended negative one: AL = 0xFF (-1) becomes 255. Using MOVSX on unsigned data has the opposite effect: 0xFF (255) becomes 0xFFFFFFFFFFFFFFFF.
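The two extension rules can be written out explicitly in C. This is a sketch using hypothetical helpers `zero_extend8` and `sign_extend8`; a compiler would normally generate MOVZX or MOVSX from ordinary casts, but spelling out the bit operations makes the difference visible:

```c
#include <stdint.h>

/* Like MOVZX: the upper 56 bits are filled with zeros. */
uint64_t zero_extend8(uint8_t al) {
    return (uint64_t)al;
}

/* Like MOVSX: the upper 56 bits are filled with copies of bit 7. */
uint64_t sign_extend8(uint8_t al) {
    return (al & 0x80) ? ((uint64_t)al | ~UINT64_C(0xFF))
                       : (uint64_t)al;
}
```

For the byte 0xFF, `zero_extend8` returns 0x00000000000000FF and `sign_extend8` returns 0xFFFFFFFFFFFFFFFF. For bytes below 0x80 the two agree.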
6. MUL computes RDX:RAX = RAX × operand. DIV divides RDX:RAX by the operand.
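Both instructions implicitly use RDX as an operand or result. Forgetting to zero RDX before DIV means you're dividing a 128-bit value instead of a 64-bit one; if the quotient then does not fit in 64 bits, the CPU raises a divide error (#DE). For signed IDIV, sign-extend RAX into RDX with CQO instead of zeroing it. Forgetting that MUL writes RDX means you may have a corrupted register after a multiplication.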
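To see why MUL needs two registers, here is a portable C sketch of an unsigned 64 × 64 → 128-bit multiply built from 32-bit halves. The struct `u128` and function `mul64` are illustrative names; `hi` plays the role of RDX and `lo` the role of RAX:

```c
#include <stdint.h>

typedef struct { uint64_t hi, lo; } u128;   /* hi ~ RDX, lo ~ RAX */

/* Full 128-bit product of two unsigned 64-bit values. */
u128 mul64(uint64_t a, uint64_t b) {
    uint64_t a_lo = a & 0xFFFFFFFFu, a_hi = a >> 32;
    uint64_t b_lo = b & 0xFFFFFFFFu, b_hi = b >> 32;

    uint64_t p0 = a_lo * b_lo;   /* contributes at bit 0  */
    uint64_t p1 = a_lo * b_hi;   /* contributes at bit 32 */
    uint64_t p2 = a_hi * b_lo;   /* contributes at bit 32 */
    uint64_t p3 = a_hi * b_hi;   /* contributes at bit 64 */

    /* Sum of three values below 2^32 each: cannot overflow 64 bits. */
    uint64_t mid = (p0 >> 32) + (p1 & 0xFFFFFFFFu) + (p2 & 0xFFFFFFFFu);

    u128 r;
    r.lo = (mid << 32) | (p0 & 0xFFFFFFFFu);
    r.hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
    return r;
}
```

Note that `hi` is written even when the product fits in 64 bits: `mul64(3, 4)` yields `hi = 0`, which is exactly how MUL overwrites whatever was previously in RDX.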
7. IEEE 754 floating-point cannot exactly represent most decimal fractions.
0.1, 0.2, and 0.3 are all approximated. 0.1 + 0.2 ≠ 0.3 in IEEE 754 arithmetic because both 0.1 and 0.2 are approximated, their sum accumulates the approximation errors, and the result differs from the approximated 0.3 by one ULP (unit in the last place).
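The one-ULP difference can be measured by comparing raw bit patterns. This sketch assumes `double` is IEEE 754 binary64 (as on x86-64) and that both inputs are finite with the same sign; `double_bits` and `ulp_distance` are hypothetical helper names:

```c
#include <stdint.h>
#include <string.h>

/* Return the raw 64-bit pattern of a double. */
uint64_t double_bits(double x) {
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Number of representable doubles between a and b
   (valid for finite values of the same sign). */
uint64_t ulp_distance(double a, double b) {
    uint64_t ba = double_bits(a), bb = double_bits(b);
    return ba > bb ? ba - bb : bb - ba;
}
```

Under those assumptions, `ulp_distance(0.1 + 0.2, 0.3)` is 1: the sum lands on the double immediately next to the one that represents 0.3.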
8. NaN is not equal to NaN — use this to detect NaN.
A comparison involving NaN returns "unordered" (false for all ordered comparisons). The IEEE 754 property is: x == x is false only if x is NaN. In assembly, UCOMISS or UCOMISD sets PF=1 when a NaN is involved; jp (jump if parity) branches on this condition.
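The same self-comparison trick works in C. This is a minimal sketch with illustrative names (`is_nan`, `nan_is_unordered`); it assumes the code is compiled without options such as `-ffast-math`, which allow the compiler to assume NaN never occurs:

```c
#include <math.h>      /* NAN macro */
#include <stdbool.h>

/* NaN is the only value that compares unequal to itself. */
bool is_nan(double x) {
    return x != x;
}

/* Every ordered comparison and every equality test against NaN is false. */
bool nan_is_unordered(void) {
    double n = NAN;
    return !(n < 1.0) && !(n > 1.0) && !(n == 1.0) && !(n == n);
}
```

In production code the standard `isnan` macro from `<math.h>` expresses the same test more readably.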
9. Denormal numbers cause massive performance penalties.
When a floating-point result underflows (is too small for normal representation), the CPU produces a denormal (subnormal) number. Operations on denormals can be 10-100x slower than normal FP operations on CPUs that don't have hardware support. The MXCSR register's FTZ (flush to zero) and DAZ (denormals as zero) bits mitigate this at the cost of slightly different arithmetic near zero.
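To find out whether a computation is producing denormals at all, a value can be tested against the smallest normal double. This sketch assumes the default floating-point environment (FTZ and DAZ off); `is_subnormal` is a hypothetical helper:

```c
#include <float.h>     /* DBL_MIN: smallest positive normal double */
#include <stdbool.h>

/* True if x is nonzero but too small in magnitude for a normal double. */
bool is_subnormal(double x) {
    double mag = x < 0 ? -x : x;
    return mag > 0.0 && mag < DBL_MIN;
}
```

For instance, `DBL_MIN / 2` is subnormal, while `DBL_MIN` itself and 0.0 are not.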
10. x86-64 is little-endian: least-significant byte at the lowest address.
The value 0x01020304 stored at address A occupies bytes: A=0x04, A+1=0x03, A+2=0x02, A+3=0x01. This is counterintuitive when reading hex dumps but consistent: the base address always holds the least-significant byte regardless of data size.
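The layout rule reduces to a shift: the byte at offset k from the base address holds bits 8k through 8k+7 of the value. A small C sketch, with `le_byte` as an illustrative name:

```c
#include <stdint.h>

/* Byte that a little-endian machine stores at address A + offset
   for a 32-bit value stored at address A (offset 0..3). */
uint8_t le_byte(uint32_t value, unsigned offset) {
    return (uint8_t)(value >> (8 * offset));
}
```

For 0x01020304, offsets 0 through 3 give 0x04, 0x03, 0x02, 0x01, matching the listing above.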
11. Network byte order (big-endian) requires conversion on x86-64.
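Parsing IP packet headers, DNS records, and the fields of most internet protocols requires byte swapping: BSWAP for 32- and 64-bit values, and XCHG (xchg al, ah) or ROL (rol ax, 8) for 16-bit values, because the result of BSWAP on a 16-bit register is undefined. Forgetting the swap converts a port number like 0x0050 (80) into 0x5000 (20480), a subtle bug that can waste hours.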
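The swap itself is only shifts and masks. Below is a portable C sketch with hypothetical helpers `swap16` and `swap32`; real programs would typically call `ntohs`/`ntohl` from the platform's socket headers instead:

```c
#include <stdint.h>

/* Exchange the two bytes of a 16-bit value
   (the effect of xchg al, ah or rol ax, 8). */
uint16_t swap16(uint16_t v) {
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Reverse the four bytes of a 32-bit value
   (the effect of BSWAP on a 32-bit register). */
uint32_t swap32(uint32_t v) {
    return (v << 24) | ((v & 0xFF00u) << 8) | ((v >> 8) & 0xFF00u) | (v >> 24);
}
```

`swap16(0x0050)` returns 0x5000, which is exactly the wrong port number a program sees when it skips the conversion.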
12. The RFLAGS register records arithmetic outcomes but is not automatically checked.
After any arithmetic instruction, the flags are set, but nothing happens unless you follow up with a conditional jump or conditional move that checks the relevant flag. Overflow in x86-64 is a silent event; the programmer is responsible for checking.
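C has no access to the flags, so the equivalent of "checking OF" has to be an explicit test. This is one possible sketch that tests before adding, because signed overflow is undefined behavior in C; `add_overflows_i64` is a hypothetical name:

```c
#include <stdbool.h>
#include <stdint.h>

/* True if a + b would not fit in a signed 64-bit integer
   (the situation in which an ADD would set OF). */
bool add_overflows_i64(int64_t a, int64_t b) {
    if (b > 0) return a > INT64_MAX - b;
    if (b < 0) return a < INT64_MIN - b;
    return false;
}
```

A caller would invoke this before performing the addition, much as assembly code follows an ADD with jo to act on the flag.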
13. Use integers for money; floating-point is for physical quantities.
Store monetary values in the smallest denomination as 64-bit integers (e.g., cents). int64_t can represent up to about 9.2 quintillion cents (roughly 92 quadrillion dollars), more than enough for any real transaction. Floating-point accumulates rounding errors that cause calculated totals to drift from exact values.
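The drift is easy to reproduce. The sketch below adds the same payment repeatedly, once as integer cents and once as floating-point dollars; the function names are illustrative and `double` is assumed to be IEEE 754 binary64:

```c
#include <stdint.h>

/* Sum `count` payments of `cents` each with exact integer arithmetic. */
int64_t total_cents(int64_t cents, int count) {
    int64_t total = 0;
    for (int i = 0; i < count; i++) total += cents;
    return total;
}

/* The same sum carried out in floating-point dollars, for comparison. */
double total_dollars(double dollars, int count) {
    double total = 0.0;
    for (int i = 0; i < count; i++) total += dollars;
    return total;
}
```

Ten payments of 10 cents total exactly 100 cents, while ten additions of 0.10 in `double` do not compare equal to 1.0.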
14. The integer range chart to memorize:
| Width | Unsigned Min | Unsigned Max | Signed Min | Signed Max |
|---|---|---|---|---|
| 8-bit | 0 | 255 (0xFF) | -128 (0x80) | +127 (0x7F) |
| 16-bit | 0 | 65,535 (0xFFFF) | -32,768 (0x8000) | +32,767 (0x7FFF) |
| 32-bit | 0 | ~4.3 billion (0xFFFFFFFF) | ~-2.1 billion (0x80000000) | ~+2.1 billion (0x7FFFFFFF) |
| 64-bit | 0 | ~1.8×10^19 | ~-9.2×10^18 | ~+9.2×10^18 |
Visual Summary: Two's Complement 8-bit Number Line
Unsigned:    0    1    2 ...  126  127  128  129 ...  254  255
Binary:   0000 0000 0000 ... 0111 0111 1000 1000 ... 1111 1111   (upper nibble)
          0000 0001 0010 ... 1110 1111 0000 0001 ... 1110 1111   (lower nibble)
Signed:      0    1    2 ...  126  127 -128 -127 ...   -2   -1
                                    ↑    ↑
                           0x7F (max) | 0x80 (min)
                            Signed overflow point
Visual Summary: IEEE 754 Double Precision Layout
63 62                52 51                                  0
┌─┬────────────────────┬─────────────────────────────────────┐
│S│ Exponent (11 bits) │ Mantissa (52 bits)                  │
└─┴────────────────────┴─────────────────────────────────────┘
 │                       │
 │                       └── stored mantissa; value = 1.mantissa (implicit leading 1)
 └── 0=positive, 1=negative
Value = (-1)^S × 2^(E-1023) × 1.M   (normal numbers, 1 ≤ E ≤ 2046)
Special: E=0 → zero (M=0) or denormal (M≠0, value = (-1)^S × 2^(-1022) × 0.M); E=2047 → infinity (M=0) or NaN (M≠0)
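The layout above translates directly into shifts and masks. This is a sketch that assumes `double` is IEEE 754 binary64; the struct `double_fields` and function `decode_double` are illustrative names:

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    unsigned sign;       /* bit 63 */
    unsigned exponent;   /* bits 62..52, biased by 1023 */
    uint64_t mantissa;   /* bits 51..0, without the implicit leading 1 */
} double_fields;

/* Split a double into its three stored fields. */
double_fields decode_double(double x) {
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);   /* reinterpret the 8 bytes as an integer */

    double_fields f;
    f.sign = (unsigned)(bits >> 63);
    f.exponent = (unsigned)((bits >> 52) & 0x7FF);
    f.mantissa = bits & ((UINT64_C(1) << 52) - 1);
    return f;
}
```

For example, 1.0 decodes to sign 0, exponent 1023, mantissa 0, and -2.0 decodes to sign 1, exponent 1024, mantissa 0.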