Chapter 9 Exercises: Arithmetic and Logic

Section A: Flag Prediction

Exercise 1. For each instruction sequence, predict the final values of CF, OF, ZF, and SF. Show your work.

(a) After:

mov rax, 0x8000000000000000   ; INT64_MIN
sub rax, 1

(b) After:

mov rax, 0xFFFFFFFFFFFFFFFF   ; UINT64_MAX
add rax, 1

(c) After:

mov rax, 100
cmp rax, 200

(d) After:

mov rax, 200
cmp rax, 100

(e) After:

mov rax, 0
test rax, rax

Exercise 2. After each ADD instruction, state whether the overflow is a signed overflow, an unsigned overflow, both, or neither:

(a) mov rax, 0x7FFFFFFFFFFFFFFF; add rax, 1 (b) mov rax, 0xFFFFFFFFFFFFFFFF; add rax, 1 (c) mov rax, 0x8000000000000000; add rax, 0x8000000000000000 (d) mov rax, 0x7FFFFFFF00000000; add rax, 0x0000000100000000


Exercise 3. Explain why INC/DEC do not modify CF, and give a concrete example where this behavior could cause a bug if you are not careful. Then give an example where this behavior is intentionally useful (in a multi-precision arithmetic loop).


Section B: Multi-Precision Arithmetic

Exercise 4. Write NASM code to compute the sum of two 128-bit unsigned integers: - A is stored as [a_hi:a_lo] at memory addresses [rbp-8] and [rbp-16] - B is stored as [b_hi:b_lo] at memory addresses [rbp-24] and [rbp-32] - Store the result in [rdx:rax] - If the addition overflows 128 bits, jump to label overflow


Exercise 5. Extend Exercise 4 to 256-bit addition. You have: - A = [a3:a2:a1:a0] in registers R8, R9, R10, R11 (a0 in R8, a3 in R11) - B = [b3:b2:b1:b0] in registers R12, R13, R14, R15

Write the 4-instruction ADC chain. After your chain, how do you detect 256-bit overflow?


Exercise 6. Implement uint128_t negate_128(uint128_t x) in assembly, where x is passed as [rsi:rdi] (rdi=low, rsi=high) and the result is returned in [rdx:rax] (rax=low, rdx=high).

Hint: Two's complement negation of a 128-bit value: negate the low half, then negate-with-borrow the high half. Alternatively: ~x + 1 (bitwise NOT, then add 1 as a 128-bit number).


Section C: Multiplication and Division

Exercise 7. Write the assembly code to compute the following. Choose the appropriate MUL/IMUL form for each:

(a) rax = rdi * 100 (signed, result fits in 64 bits) (b) rdx:rax = rdi * rsi (unsigned, full 128-bit result; inputs in RDI, RSI) (c) rcx = [rbx+8] * 7 (signed, memory operand, result fits in 64 bits) (d) rax = rdi * rsi (signed, you know the product fits in 64 bits)


Exercise 8. The following code has a bug. Identify it, explain what goes wrong, and fix it:

; Compute rax = large_value / 7
mov rax, 0x123456789ABCDEF0
mov rbx, 7
idiv rbx                   ; bug: what is in RDX before this?

Exercise 9. Write a function divmod(int64_t a, int64_t b) in NASM that computes both the quotient and the remainder of signed division. Return the quotient in RAX and the remainder in RDX.

Function signature: a in RDI, b in RSI.


Exercise 10. Implement division by 10 using only shifts and additions (no DIV instruction). The algorithm for dividing by 10 using multiply-high is:

q = (x * M) >> (32 + 3)

where M = 0x1999999A (a magic multiplier). Implement this in assembly for a 32-bit unsigned dividend, returning the quotient in EAX.

Hint: Use the 64-bit IMUL form: imul rax, rdi, M (truncated), then shift right.


Section D: Implementing Functions

Exercise 11. Implement the C function int64_t abs64(int64_t x) in NASM without using any conditional jumps. Use NEG and CMOV instructions (CMOV is in Chapter 10, but you can preview it: cmovl dst, src moves src to dst if SF≠OF).

Then implement it using only arithmetic (no branches, no CMOV): Hint: right-shift to get sign mask, XOR and subtract.


Exercise 12. Implement int64_t max64(int64_t a, int64_t b) in NASM, returning the larger of the two signed 64-bit integers. Use only CMP and CMOV (no branches). Function: a in RDI, b in RSI, return in RAX.


Exercise 13. Implement int64_t clamp(int64_t x, int64_t lo, int64_t hi) in NASM. The function should return x clamped to the range [lo, hi]: if x < lo return lo; if x > hi return hi; otherwise return x.

Function: x in RDI, lo in RSI, hi in RDX. Return in RAX.


Section E: Logic and Bit Manipulation Preview

Exercise 14. Write NASM code for each of the following bit manipulation operations on RAX:

(a) Set bits 3, 5, and 7 of RAX (b) Clear bits 0, 1, and 2 of RAX (c) Toggle bits 4, 8, and 16 of RAX (d) Isolate the low 4 bits of RAX (zero all other bits) (e) Extract bits 8:15 of RAX into a clean value (zero all other bits, shift down)


Exercise 15. Implement a bitfield insert operation: given a value in RDI, insert the low N bits of RSI into RDI starting at bit position P. Assume N=8, P=16 (insert a byte into bits 23:16 of RDI). Return result in RAX.


Section F: Shift Operations

Exercise 16. Predict the output of the following shift operations. Assume RAX = 0xA5A5A5A5A5A5A5A5:

(a) shl rax, 4 (b) shr rax, 4 (c) sar rax, 4 (d) rol rax, 8 (e) ror rax, 8


Exercise 17. Implement a 128-bit logical left shift in assembly. The 128-bit value is stored as [rdx:rax] (rdx=high, rax=low). Shift count is in RCX (assume 0 ≤ CL ≤ 63).

Use SHLD and SHL to implement this efficiently.


Exercise 18. A function needs to compute x / 7 for unsigned 32-bit values using the multiply-by-reciprocal trick. The magic number for dividing by 7 (32-bit) is 0x24924925, with a post-shift of 0. The actual formula involves:

t1 = mulhi(x, 0x24924925)  // upper 32 bits of 64-bit multiply
t2 = x - t1
t3 = t2 >> 1
q  = (t1 + t3) >> 2

Implement this in NASM. The input x is in EDI.


Exercise 19 (Challenge: Big Integer). Write a NASM function that adds a 64-bit unsigned integer (in RSI) to a big integer stored as an array of 64-bit limbs in little-endian order (least significant limb at [rdi]). The big integer has N limbs, where N is in EDX.

The function should propagate the carry through all N limbs if necessary.


Exercise 20 (Challenge: Constant-Time Operations). Implement int64_t ct_select(int64_t a, int64_t b, int64_t condition) in NASM, where: - If condition != 0, return a - If condition == 0, return b - The function must run in constant time (no branches, no memory accesses dependent on condition)

This is the fundamental building block for constant-time cryptographic code. Use only arithmetic and logical instructions.

Hint: Compute a mask that is all-ones if condition != 0 and all-zeros if condition == 0. Then use AND to select.