Chapter 13 Key Takeaways: Bit Manipulation
-
AND isolates and clears bits; OR sets bits; XOR toggles bits. These three operations form the complete vocabulary of bitwise manipulation. Every bit manipulation problem reduces to some combination of them.
-
TEST reg, maskis the preferred way to check specific bits without modifying the register.TESTcomputes AND without storing the result, setting ZF (zero if no bits in common) and SF. Follow withJZ(bits not set) orJNZ(at least one bit set). -
BT reg, bit_indexcopies a specific bit to CF without modifying the source register. BTS sets the bit, BTR clears it, BTC toggles it. The memory formBT [mem], regtreats memory as a bit array of arbitrary length — useful for large bitsets. -
BSF(Bit Scan Forward) andBSR(Bit Scan Reverse) have undefined behavior when the source is zero. Always check for zero before using them, or preferTZCNT/LZCNT(BMI1) which return 64 for zero input. -
LZCNTandTZCNT(BMI1) are strictly better thanBSR/BSFfor new code. They have well-defined behavior for zero input, provide leading/trailing zero counts directly (without the conversion step needed for BSR/BSF), and have no false-zero-input undefined behavior. -
POPCNTcounts set bits in a single instruction. It replaces multi-step software algorithms (the parallel counting technique) that required 10+ instructions. Always use POPCNT when available (all modern x86-64 CPUs have it). -
BMI1 provides ANDN, BLSI, BLSMSK, BLSR, BEXTR. These cover the most common single-bit manipulation patterns:
BLSIisolates the lowest set bit,BLSRclears it,BLSMSKcreates a mask up to and including it. Available on Intel Haswell (2013) and AMD Piledriver. -
BMI2 provides PEXT, PDEP, BZHI, and the flag-free shifts SARX/SHLX/SHRX/RORX. PEXT and PDEP are the "gather bits" and "scatter bits" operations used in hardware-accelerated CRC, codec implementations, and spatial indexing. SARX/SHLX/SHRX avoid false flag dependencies in shift-heavy code.
-
x & (x-1)clears the lowest set bit;x & (-x)isolates it. These are the fundamental building blocks of algorithms that iterate over set bits, test for powers of 2, and manipulate bit positions. -
The XOR swap (
a ^= b; b ^= a; a ^= b) swaps two registers without a temporary, but fails if both operands are the same variable. In practice, XCHG or a three-MOV sequence is clearer; use XOR swap only when saving one instruction is genuinely important. -
XOR is the foundation of symmetric cryptography because it is the only Boolean operation that is both invertible and bit-uniform. If the key is truly random and used only once, XOR with the key (one-time pad) is information-theoretically secure. The problem is key management, not the operation itself.
-
Simple repeating-key XOR is not secure. Ciphertext1 XOR Ciphertext2 = Plaintext1 XOR Plaintext2 when the same key is used. AES and other modern ciphers use XOR for key mixing but surround it with nonlinear transformations (S-Boxes) that prevent this attack.
-
The
imul rax, 0x0101010101010101trick replicates a byte to all 8 bytes of a 64-bit register. This is used for XOR cipher (key broadcast), SIMD-style byte operations on 64-bit values, andmemset-like operations. The multiplier0x01010101...ensures each byte of the result gets a copy of the original byte. -
Bit manipulation instructions (AND, OR, XOR, NOT, shifts, rotates) clear CF and OF. They do not modify CF (except shifts, which put the last shifted-out bit in CF). This is important when bit manipulation instructions appear in sequences that also use conditional branches based on CF.
-
PARITY computation:
popcnt rax, rdi; and rax, 1. The parity of a value (odd or even number of set bits) is the LSB of the popcount. This is used in ECC (error-correcting codes), CRC computation, and certain encoding schemes.