Case Study: Building a Send-Over-a-Noisy-Wire Pipeline (RSA + Hamming)

DataField.Dev

Case Study: Building a Send-Over-a-Noisy-Wire Pipeline (RSA + Hamming)

"What I cannot create, I do not understand." — Richard Feynman

Executive Summary

This is a build-heavy capstone walkthrough that combines two tracks: Track A (RSA, §39.3) and Track D (error-correcting codes, §39.6). You will assemble a complete, end-to-end pipeline that takes a short text message, encrypts it so an eavesdropper learns nothing, error-protects it so a single bit flipped in transit is silently repaired, sends it across a deliberately noisy simulated wire, and recovers the exact original at the far end. Nothing here is a new algorithm — every component is a dmtoolkit function you built across the book — but wiring them into a working whole, in the right order, with the round-trip proven to invert, is the capstone skill. The dangerous bugs are all at the seams between stages, and the chapter's discipline ("compose proven pieces; state the exact guarantee") is what keeps the seams sound.

The headline design decision is the ordering of the two protections — encrypt-then-encode, not encode-then-encrypt — and there is a clean reason for it that the build will make concrete. Every byte of expected output is hand-derived from the worked keys and Hamming layout in the chapter, so you can check the pipeline against your own pencil.

Skills applied

Composing crypto.py (Track A) and coding.py (Track D) into a single staged pipeline.
Designing the order of stages from a correctness requirement, not convenience.
Hand-deriving the output of each stage and the full round-trip (no execution).
Stating the pipeline's correctness as a composition of two proven theorems (Euler, Ch. 25; the minimum-distance bound, Ch. 38).
Reporting the pipeline's honest guarantee and its sharp limits (§39.7).

Background

The scenario

You are building the transport layer for a tiny telemetry device — a sensor that must send a short status code to a base station over a cheap radio link. Two non-negotiable requirements:

Confidentiality. Anyone listening to the radio must not learn the status. The device holds the base station's public key and encrypts with it; only the base station, holding the private key, can read it. (This is exactly the RSA setup of Track A.)
Integrity over a noisy channel. The radio is cheap and the environment is electrically noisy: a bit occasionally flips in transit. A flipped bit in ciphertext is catastrophic — decryption of corrupted ciphertext yields garbage, not "almost the right message." So the bits actually sent must be error-protected with enough redundancy to repair a single flip per block. (This is Track D.)

Neither track alone suffices. RSA gives confidentiality but no error tolerance; Hamming gives error tolerance but no confidentiality. The capstone is to combine them — and the combination is more than the sum of its parts because the order matters.

💡 Intuition: Think of the two protections as two envelopes. Encryption is an opaque, tamper-soft envelope: it hides the contents, but if you scribble on it (flip a bit), the contents inside are destroyed. Error-correction is a rugged, transparent shipping box: it survives a dent (a bit flip), but anyone can see inside. You want the opaque envelope inside the rugged box — encrypt first, then wrap the ciphertext in error-correction — so the message is both hidden and dent-proof. Put them the other way and a dent reaches the fragile envelope before the box can absorb it.

Why this matters

This encrypt-then-protect pattern is the real architecture of essentially every secure link you use: TLS encrypts, and the underlying physical/link layer (Wi-Fi, cellular, Ethernet) carries error-correcting codes beneath it. CDs and QR codes layer Reed–Solomon error-correction under their data; deep-space links (the Voyager probes) stack codes to survive a channel that flips bits constantly. You are building the two-layer idea in miniature, from primitives you understand to the bottom.

Phase 1: The plan — stages and their order

Lay out the pipeline as a sequence of total functions, each with a one-line contract (the §39.7 design-first habit). Sender stages on the left run top-to-bottom; receiver stages reverse them bottom-to-top.

#	Sender stage	Contract	Receiver stage (reverse)
1	`text_to_int`	text → big integer $m$ ($m < n$)	`int_to_text`: integer → text
2	`rsa_encrypt`	$m \mapsto c = m^e \bmod n$	`rsa_decrypt`: $c \mapsto m = c^d \bmod n$
3	`int_to_bits`	ciphertext integer → bit list	`bits_to_int`: bits → integer
4	`hamming_encode` per nibble	4 data bits → 7 code bits	`hamming_decode`: 7 → 4 (repairs 1 flip)
—	the noisy wire	flips ≤ 1 bit per 7-bit block	(the flip is corrected in stage 4)

The ordering rationale, stated as a requirement and not a preference: error-correction must be the outermost layer (last to apply on send, first to undo on receive) because that is the layer the noise hits. The wire flips bits in whatever is physically transmitted — the Hamming-encoded bits. If we had encoded then encrypted, the noise would land on ciphertext that Hamming never sees, and the single flip would survive into rsa_decrypt, which would then produce garbage (RSA has no error tolerance — one wrong bit in $c$ changes $c^d \bmod n$ completely). So: encrypt for secrecy, then Hamming-encode for the channel. The correctness of the whole pipeline hinges on this single decision.

⚠️ Common Pitfall: Reasoning "encryption is the more important protection, so do it last/outermost." Importance is not the criterion — which layer the noise touches is. The outermost layer must be the one engineered against the channel's failure mode. Get this backwards and the pipeline still runs, still type-checks, and silently corrupts every message that hits a bit flip. It is the seam-bug the whole case study is about.

Phase 2: Build the bit-plumbing (stages 3 and the glue)

The two tracks speak different languages: RSA produces integers, Hamming consumes 4-bit nibbles. The new code in this pipeline is entirely the glue between them — converting an integer to a flat bit list, chopping it into nibbles, and reversing both. This is the "thin new layer" §39.2 promised; the heavy lifting is all in the Toolkit.

def int_to_bits(x, width):
    """Integer -> list of `width` bits, most-significant first."""
    return [(x >> (width - 1 - i)) & 1 for i in range(width)]

def bits_to_int(bits):
    """Inverse of int_to_bits: bit list (MSB first) -> integer."""
    x = 0
    for b in bits:
        x = (x << 1) | b
    return x

def nibbles(bits):
    """Chop a bit list into 4-bit groups (the input width Hamming expects)."""
    return [bits[i:i + 4] for i in range(0, len(bits), 4)]

# Sanity check the round-trip on a small integer, by hand:
print(int_to_bits(0b1011, 4))                 # 11 -> four bits
print(bits_to_int([1, 0, 1, 1]))              # back to 11
print(nibbles([1, 0, 1, 1, 0, 1, 1, 0]))      # two nibbles
# Expected output:
# [1, 0, 1, 1]
# 11
# [[1, 0, 1, 1], [0, 1, 1, 0]]

Hand-derive each line. int_to_bits(0b1011, 4): $0b1011 = 11$; bit $i$ is $(11 \gg (3-i)) \,\&\, 1$, giving for $i=0,1,2,3$ the bits $1,0,1,1$. bits_to_int([1,0,1,1]): shift-and-or accumulates $((((0{\cdot}2+1){\cdot}2+0){\cdot}2+1){\cdot}2+1) = 11$. nibbles splits 8 bits into two groups of 4. The two conversions are inverse by construction — bits_to_int(int_to_bits(x, w)) == x for any $0 \le x < 2^w$ — which is the property the receiver depends on.

🔄 Check Your Understanding 1. Why must the sender and receiver agree on the bit width ahead of time? What goes wrong if the sender uses width=12 but the receiver assumes width=8? 2. The ciphertext integer $c$ might not be a multiple of 4 bits long. What must the glue do so that nibbles produces only full 4-bit groups?

Answer

The width is the padding agreement. int_to_bits(5, 8) is [0,0,0,0,0,1,0,1]; if the receiver reads it as width 4 it loses the leading zeros and mis-aligns every nibble — a silent corruption. The width is part of the protocol, exactly like RSA's $mpad the bit list up to a multiple of 4 (prepend zeros) before chopping, and the receiver must know to strip that padding — or, equivalently, both sides fix the total bit-width from $n$ so the length is constant. An unpadded last nibble of 1–3 bits would break hamming_encode, which demands exactly 4.

Phase 3: Wire the full pipeline and run one message end to end

Now compose everything. We use the chapter's verified toy RSA key ($n = 3233$, $e = 17$, $d = 2753$) and send the one-byte message "A" (value 65), which fits in one block since $65 < 3233$. We deliberately flip one bit on the wire to prove the error-correction earns its keep.

from dmtoolkit.crypto import rsa_keygen_from_primes, rsa_encrypt, rsa_decrypt
from dmtoolkit.coding import hamming_encode, hamming_decode

def text_to_int(s):  return int.from_bytes(s.encode("utf-8"), "big")
def int_to_text(m):  return m.to_bytes((m.bit_length() + 7) // 8, "big").decode("utf-8")

pub, d = rsa_keygen_from_primes(61, 53, 17)        # n = 3233, e = 17, d = 2753
n = pub[0]

# ---- SENDER ----
m = text_to_int("A")                                # 65
c = rsa_encrypt(m, pub)                              # 65^17 mod 3233 = 2790
cbits = int_to_bits(c, 12)                           # 2790 in 12 bits
blocks = [hamming_encode(nib) for nib in nibbles(cbits)]   # 3 nibbles -> 3 codewords
print("ciphertext:", c)
print("on the wire:", blocks)
# Expected output:
# ciphertext: 2790
# on the wire: [[1, 0, 1, 1, 0, 1, 0], [0, 0, 1, 0, 1, 1, 0], [1, 1, 0, 0, 1, 1, 0]]

Hand-derive the sender side. RSA: $c = 65^{17} \bmod 3233 = 2790$ (the chapter's verified value). Now $2790$ in 12 bits: $2790 = 2048 + 512 + 128 + 64 + 32 + 4 + 2 = 101011100110_2$, i.e. [1,0,1,0,1,1,1,0,0,1,1,0]. Split into three nibbles: [1,0,1,0], [1,1,1,0], [0,1,1,0]. Each is Hamming-encoded into a 7-bit codeword laid out as $[p_1, p_2, d_1, p_4, d_2, d_3, d_4]$, with the even-parity bits $p_1 = d_1 \oplus d_2 \oplus d_4$, $p_2 = d_1 \oplus d_3 \oplus d_4$, and $p_4 = d_2 \oplus d_3 \oplus d_4$ (the §39.6 layout that sends [1,0,1,1] to [0,1,1,0,0,1,1]). Working the three nibbles: [1,0,1,0] gives $p_1{=}1, p_2{=}0, p_4{=}1 \to$ [1,0,1,1,0,1,0]; [1,1,1,0] gives all parities 0 $\to$ [0,0,1,0,1,1,0]; [0,1,1,0] gives $p_1{=}1, p_2{=}1, p_4{=}0 \to$ [1,1,0,0,1,1,0]. These three 7-bit blocks are what physically travels.

# ---- THE NOISY WIRE: flip one bit in the first block ----
import copy
sent = copy.deepcopy(blocks)
sent[0][2] ^= 1                                      # corrupt one bit, block 0, position index 2
print("corrupted block 0:", sent[0])
# Expected output:
# corrupted block 0: [1, 0, 0, 1, 0, 1, 0]

# ---- RECEIVER ----
recovered_nibbles = [hamming_decode(b) for b in sent]   # repairs the single flip
rbits = [bit for nib in recovered_nibbles for bit in nib]
c_back = bits_to_int(rbits)
m_back = rsa_decrypt(c_back, d, n)                       # 2790^2753 mod 3233 = 65
print("decoded ciphertext:", c_back)
print("message:", int_to_text(m_back))
# Expected output:
# decoded ciphertext: 2790
# message: A

Hand-derive the receiver side — this is where the two theorems do their work:

Hamming repair. Block 0 was [1,0,1,1,0,1,0]; we flipped position index 2 to get [1,0,0,1,0,1,0], Hamming distance 1 from the codeword. hamming_decode recomputes the parity checks, the syndrome points at the flipped position, it flips it back, and extracts the original nibble [1,0,1,0]. Blocks 1 and 2 were untouched and decode to [1,1,1,0] and [0,1,1,0]. The flip is gone.
Reassemble. Concatenating the three repaired nibbles gives back [1,0,1,0,1,1,1,0,0,1,1,0], which bits_to_int reads as $2790$ — exactly the ciphertext the sender produced. The channel error left no trace.
Decrypt. $c_{\text{back}} = 2790$, so $m = 2790^{2753} \bmod 3233 = 65$ (the chapter's verified round-trip), and int_to_text(65) is "A". The original message is recovered, despite the bit flip.

That is the whole system working: a flipped bit on a noisy wire, an eavesdropper who sees only the meaningless codewords of 2790, and a base station that reads "A" perfectly.

🐛 Find the Error: A colleague "optimizes" the pipeline by swapping stages 2 and 4 — Hamming-encode the plaintext bits first, then RSA-encrypt the result. Walk through what happens to the single bit flip on the wire under this ordering, and explain why the base station now reads garbage.

Answer

Under encode-then-encrypt, the bits on the wire are ciphertext, which Hamming never wrapped — so when the wire flips a bit, there is no error-correcting structure around the ciphertext to repair it. The receiver decrypts a ciphertext that differs from the sent one by a bit; since RSA has no error tolerance ($c'^d \bmod n$ is unrelated to $c^d \bmod n$ when $c' \ne c$), the result is garbage, and the inner Hamming layer — applied to the already-destroyed plaintext — cannot help. This is precisely why error-correction must be the outermost layer: it has to surround the bits the channel actually touches.

Phase 4: Correctness as a composition of two theorems

The pipeline's correctness is not one proof but a chain of two, one per track — and the capstone point is that you cite each, never re-derive it. State the end-to-end claim precisely first.

Pipeline round-trip claim. If the message integer satisfies $m < n$, and the channel flips at most one bit per 7-bit block, then the receiver recovers $m$ exactly.

The strategy. Decompose the round-trip into the two stages that can fail and invoke the theorem that each cannot. Hamming guarantees the bits arrive uncorrupted (so the receiver's ciphertext equals the sender's); RSA then guarantees the plaintext is recovered from that ciphertext. Chain the two and the message survives.

Proof (by composition). Let the sender's ciphertext be $c = m^e \bmod n$ and let the bits sent be the Hamming encodings of $c$'s nibbles.

Stage 4 (channel + Hamming). Each 7-bit block is a Hamming(7,4) codeword with minimum distance $d = 3$. By the single-error-correction theorem (Chapter 38; restated in §39.6: a code with $d \ge 3$ corrects any single-bit error by decoding to the nearest codeword, via the triangle inequality), each block with at most one flipped bit decodes to its original nibble. Concatenating the repaired nibbles and applying bits_to_int — the exact inverse of the sender's int_to_bits — yields $c$ unchanged. So the receiver's ciphertext equals the sender's: $c_{\text{back}} = c$.

Stage 2 (RSA). Since $c_{\text{back}} = c = m^e \bmod n$ and $0 \le m < n$, the RSA correctness theorem (Chapter 25; $m^{ed} \equiv m \pmod n$, from Euler's theorem with CRT for the non-coprime case) gives $\,\texttt{rsa\_decrypt}(c, d, n) = c^d \bmod n = m^{ed} \bmod n = m$.

Composing the two stages, the receiver computes $m$, and int_to_text recovers the original string. $\blacksquare$

🔗 Connection. Notice the shape: the pipeline is correct because each of its two load-bearing pieces is separately proven, and the seams between them (the bit-plumbing of Phase 2) are exact inverses by construction. This is the chapter's thesis in its strongest form — a correct system is a composition of correct parts — and here it is literally a composition of two named theorems from two different parts of the book.

Discussion Questions

The proof assumes "at most one bit flipped per 7-bit block." A real radio produces burst errors — several consecutive flips. Explain why a burst could put two flips in one block, defeating Hamming(7,4), and name the §39.6 technique (interleaving) that spreads a burst across blocks so each block again sees at most one error.
We sent a one-byte message because the toy modulus $n = 3233$ only holds $m < 3233$. To send "Hello", which §39.3 extension is required before the Hamming layer, and at which stage number does it slot into the Phase-1 table?
The pipeline encrypts with a toy key. State the §39.7-style honest limitation in one sentence — what exactly is and is not guaranteed about confidentiality — and contrast "the round-trip is proven correct" with "the encryption is secure."
Hamming(7,4) expands 4 bits to 7 (rate $4/7$), and RSA expands a small message to a full modulus-sized ciphertext. Estimate the total size blow-up from plaintext byte to bits-on-the-wire for this toy system, and discuss why real systems use hybrid encryption (RSA on a short key, symmetric cipher on the bulk) to control it.
Where in the pipeline would a bug fail loudly (an exception) versus silently (a wrong message that looks fine)? Identify one of each, and connect to why the chapter insists silent failures are the ones a correctness argument must rule out.

Your Turn: Extensions

Option A (add blocking, send a word). Implement the §39.3 blocking layer and extend the pipeline to send "Hi" (two bytes) under the toy key, splitting into blocks each below $n$, Hamming-protecting each, and reversing on receipt. Hand-derive the two ciphertext blocks and confirm the round-trip.
Option B (interleave against bursts). Add an interleaving stage between Hamming-encode and the wire: transmit the bits of several blocks in a rotated order so a 2-bit burst lands in two different blocks. Simulate a 2-bit burst by hand and show the de-interleaved blocks each suffer at most one flip, so the pipeline still recovers the message.
Option C (upgrade to SECDED). Swap Hamming(7,4) for Hamming(8,4) (add one overall parity bit — the §39.6 extension used in ECC memory) so the channel layer corrects single errors and detects double errors. State the new guarantee precisely, and explain what the receiver should do when it detects (but cannot correct) a double error.

Key Takeaways

Order the stages from the failure mode, not from importance. Error-correction must be outermost because the channel touches the outermost bits; encrypt-then-encode is a correctness requirement, and the reverse silently corrupts.
The new code is the glue; the power is in the Toolkit. The only fresh logic was bit-plumbing between RSA's integers and Hamming's nibbles — built as exact inverses so the seams hold.
Pipeline correctness is a composition of proven theorems. Hamming's minimum-distance bound (Ch. 38) delivers the ciphertext intact; RSA's $m^{ed}\equiv m$ (Ch. 25) recovers the plaintext. Cite each; re-derive neither.
State the exact guarantee. "The round-trip is proven correct under ≤1 flip per block" is true and defensible; "the system is secure" is not — the key is a toy. Naming that boundary is the §39.7 honesty the capstone is built to teach.