Chapter 9 Exercises: Arithmetic and Logic
Section A: Flag Prediction
Exercise 1. For each instruction sequence, predict the final values of CF, OF, ZF, and SF. Show your work.
(a) After:
mov rax, 0x8000000000000000 ; INT64_MIN
sub rax, 1
(b) After:
mov rax, 0xFFFFFFFFFFFFFFFF ; UINT64_MAX
add rax, 1
(c) After:
mov rax, 100
cmp rax, 200
(d) After:
mov rax, 200
cmp rax, 100
(e) After:
mov rax, 0
test rax, rax
Exercise 2. After each ADD instruction, state whether the overflow is a signed overflow, an unsigned overflow, both, or neither:
(a) mov rax, 0x7FFFFFFFFFFFFFFF; add rax, 1
(b) mov rax, 0xFFFFFFFFFFFFFFFF; add rax, 1
(c) mov rax, 0x8000000000000000; add rax, 0x8000000000000000
(d) mov rax, 0x7FFFFFFF00000000; add rax, 0x0000000100000000
Exercise 3. Explain why INC/DEC do not modify CF, and give a concrete example where this behavior could cause a bug if you are not careful. Then give an example where this behavior is intentionally useful (in a multi-precision arithmetic loop).
Section B: Multi-Precision Arithmetic
Exercise 4.
Write NASM code to compute the sum of two 128-bit unsigned integers:
- A is stored as [a_hi:a_lo] at memory addresses [rbp-8] and [rbp-16]
- B is stored as [b_hi:b_lo] at memory addresses [rbp-24] and [rbp-32]
- Store the result in [rdx:rax]
- If the addition overflows 128 bits, jump to label overflow
Exercise 5.
Extend Exercise 4 to 256-bit addition. You have:
- A = [a3:a2:a1:a0] in registers R8, R9, R10, R11 (a0 in R8, a3 in R11)
- B = [b3:b2:b1:b0] in registers R12, R13, R14, R15
Write the 4-instruction ADC chain. After your chain, how do you detect 256-bit overflow?
Exercise 6.
Implement uint128_t negate_128(uint128_t x) in assembly, where x is passed as [rsi:rdi] (rdi=low, rsi=high) and the result is returned in [rdx:rax] (rax=low, rdx=high).
Hint: Two's complement negation of a 128-bit value: negate the low half, then negate-with-borrow the high half. Alternatively: ~x + 1 (bitwise NOT, then add 1 as a 128-bit number).
Section C: Multiplication and Division
Exercise 7. Write the assembly code to compute the following. Choose the appropriate MUL/IMUL form for each:
(a) rax = rdi * 100 (signed, result fits in 64 bits)
(b) rdx:rax = rdi * rsi (unsigned, full 128-bit result; inputs in RDI, RSI)
(c) rcx = [rbx+8] * 7 (signed, memory operand, result fits in 64 bits)
(d) rax = rdi * rsi (signed, you know the product fits in 64 bits)
Exercise 8. The following code has a bug. Identify it, explain what goes wrong, and fix it:
; Compute rax = large_value / 7
mov rax, 0x123456789ABCDEF0
mov rbx, 7
idiv rbx ; bug: what is in RDX before this?
Exercise 9.
Write a function divmod(int64_t a, int64_t b) in NASM that computes both the quotient and the remainder of signed division. Return the quotient in RAX and the remainder in RDX.
Function signature: a in RDI, b in RSI.
Exercise 10. Implement division by 10 using only shifts and additions (no DIV instruction). The algorithm for dividing by 10 using multiply-high is:
q = (x * M) >> (32 + 3)
where M = 0x1999999A (a magic multiplier). Implement this in assembly for a 32-bit unsigned dividend, returning the quotient in EAX.
Hint: Use the 64-bit IMUL form: imul rax, rdi, M (truncated), then shift right.
Section D: Implementing Functions
Exercise 11.
Implement the C function int64_t abs64(int64_t x) in NASM without using any conditional jumps. Use NEG and CMOV instructions (CMOV is in Chapter 10, but you can preview it: cmovl dst, src moves src to dst if SF≠OF).
Then implement it using only arithmetic (no branches, no CMOV): Hint: right-shift to get sign mask, XOR and subtract.
Exercise 12.
Implement int64_t max64(int64_t a, int64_t b) in NASM, returning the larger of the two signed 64-bit integers. Use only CMP and CMOV (no branches). Function: a in RDI, b in RSI, return in RAX.
Exercise 13.
Implement int64_t clamp(int64_t x, int64_t lo, int64_t hi) in NASM. The function should return x clamped to the range [lo, hi]: if x < lo return lo; if x > hi return hi; otherwise return x.
Function: x in RDI, lo in RSI, hi in RDX. Return in RAX.
Section E: Logic and Bit Manipulation Preview
Exercise 14. Write NASM code for each of the following bit manipulation operations on RAX:
(a) Set bits 3, 5, and 7 of RAX (b) Clear bits 0, 1, and 2 of RAX (c) Toggle bits 4, 8, and 16 of RAX (d) Isolate the low 4 bits of RAX (zero all other bits) (e) Extract bits 8:15 of RAX into a clean value (zero all other bits, shift down)
Exercise 15. Implement a bitfield insert operation: given a value in RDI, insert the low N bits of RSI into RDI starting at bit position P. Assume N=8, P=16 (insert a byte into bits 23:16 of RDI). Return result in RAX.
Section F: Shift Operations
Exercise 16. Predict the output of the following shift operations. Assume RAX = 0xA5A5A5A5A5A5A5A5:
(a) shl rax, 4
(b) shr rax, 4
(c) sar rax, 4
(d) rol rax, 8
(e) ror rax, 8
Exercise 17.
Implement a 128-bit logical left shift in assembly. The 128-bit value is stored as [rdx:rax] (rdx=high, rax=low). Shift count is in RCX (assume 0 ≤ CL ≤ 63).
Use SHLD and SHL to implement this efficiently.
Exercise 18.
A function needs to compute x / 7 for unsigned 32-bit values using the multiply-by-reciprocal trick. The magic number for dividing by 7 (32-bit) is 0x24924925, with a post-shift of 0. The actual formula involves:
t1 = mulhi(x, 0x24924925) // upper 32 bits of 64-bit multiply
t2 = x - t1
t3 = t2 >> 1
q = (t1 + t3) >> 2
Implement this in NASM. The input x is in EDI.
Exercise 19 (Challenge: Big Integer).
Write a NASM function that adds a 64-bit unsigned integer (in RSI) to a big integer stored as an array of 64-bit limbs in little-endian order (least significant limb at [rdi]). The big integer has N limbs, where N is in EDX.
The function should propagate the carry through all N limbs if necessary.
Exercise 20 (Challenge: Constant-Time Operations).
Implement int64_t ct_select(int64_t a, int64_t b, int64_t condition) in NASM, where:
- If condition != 0, return a
- If condition == 0, return b
- The function must run in constant time (no branches, no memory accesses dependent on condition)
This is the fundamental building block for constant-time cryptographic code. Use only arithmetic and logical instructions.
Hint: Compute a mask that is all-ones if condition != 0 and all-zeros if condition == 0. Then use AND to select.