38 min read

> "We stand today on the brink of a revolution in cryptography."

Prerequisites

  • 23
  • 24

Learning Objectives

  • Analyze the classical substitution and shift ciphers and explain precisely why frequency analysis breaks them.
  • Distinguish symmetric-key from public-key cryptography and articulate the key-distribution problem that motivates the public-key idea.
  • Implement RSA end to end — key generation, encryption, and decryption — in fewer than thirty lines of Python.
  • Prove that RSA decryption inverts encryption, citing Euler's theorem and the Chinese Remainder Theorem where each is needed.
  • Explain RSA's security as a computational-hardness assumption (integer factoring) and identify the textbook-RSA pitfalls that real systems must avoid.

Chapter 25: Cryptography — From Caesar Cipher to RSA

"We stand today on the brink of a revolution in cryptography." — Whitfield Diffie and Martin Hellman, New Directions in Cryptography (1976)

Overview

Every time your browser shows a padlock, a small miracle of number theory has just happened. Two computers that have never met, exchanging messages over a wire that anyone can tap, agreed on how to keep a secret — without ever transmitting that secret in the clear. For thousands of years this was believed to be impossible. To share a secret, the reasoning went, you must first share a key, and to share a key you need a secure channel, and if you already had a secure channel you wouldn't need the secret in the first place. It is a perfect circle, and humanity was trapped inside it from Caesar's legions to the Cold War.

In the 1970s, that circle was broken — not by a better lock or a faster courier, but by a piece of pure mathematics that had been sitting in number-theory textbooks for two centuries, dismissed as beautiful and useless. The branch of math that G. H. Hardy in 1940 proudly called the most useless of all, the one with no conceivable military application, turned out to be the thing that secures every bank transfer, every password, every private message on Earth. This chapter is the payoff of Part IV. Everything you built — divisibility and the Euclidean algorithm (Chapter 22), modular arithmetic and Euler's theorem (Chapter 23), the group and field structure underneath (Chapter 24) — was scaffolding for one moment: the moment you implement RSA yourself and understand, line by line, why it works and why it is hard to break.

You are not going to take any of this on faith. You'll see the classical ciphers and break them. You'll feel the key-distribution problem as a genuine impossibility, not a story. Then you'll watch public-key cryptography dissolve it, build RSA in under thirty lines, and prove — with a proof whose strategy we state before the first symbol — that decryption undoes encryption for every possible message. Finally, you'll see exactly where RSA's security comes from (the believed hardness of factoring large integers) and exactly how a naive implementation leaks the very secrets it was built to protect.

In this chapter, you will learn to:

  • Encrypt and decrypt with the shift and substitution ciphers, and break them with frequency analysis.
  • State the key-distribution problem precisely and explain why symmetric crypto alone cannot solve it at internet scale.
  • Explain the public-key idea: a published lock anyone can snap shut, a private key only you can open.
  • Generate an RSA key pair, encrypt a message, and decrypt it, all from scratch.
  • Prove RSA's correctness using Euler's theorem (and patch the proof with the Chinese Remainder Theorem for the case the textbook usually skips).
  • Reason about RSA's security as a hardness assumption, and name the pitfalls — textbook determinism, small exponents, shared moduli — that break it in practice.

Learning Paths

🏃 Fast Track: If you already know modular exponentiation cold and just want the RSA payoff, skim 25.1–25.3 for the vocabulary, then work 25.4 (the algorithm) and the Project Checkpoint by hand for a tiny key pair. Read the statement of the correctness theorem in 25.5; the proof can wait.

📖 Standard Path: Read in order. Break a real shift cipher in 25.1 before reading the break. Trace the RSA round-trip in 25.4 with pencil and paper for $p=5, q=11$ before you read the Python. Work every 🔄 Check Your Understanding.

🔬 Deep Dive: After 25.5, prove RSA correctness without assuming $\gcd(m,n)=1$ — the case the Euler-only proof quietly drops — using the Chinese Remainder Theorem. Then read 25.6 and work out why deterministic "textbook" RSA fails semantic security even when factoring is hard.


25.1 Classical ciphers and why they fail

Start with the problem cryptography has always tried to solve. You want to send a message — call it the plaintext — to a friend, across a channel where an eavesdropper can read everything that passes. You transform the plaintext into scrambled ciphertext before sending; your friend reverses the transformation to recover the plaintext; the eavesdropper, seeing only ciphertext, learns nothing useful. The transformation is governed by a secret called the key. This whole arrangement has a name.

A cipher is a pair of algorithms — one to encrypt and one to decrypt — parameterized by a key, such that decrypting an encrypted message with the matching key recovers the original. The input to encryption is the plaintext; its output is the ciphertext.

The oldest cipher anyone names is Caesar's cipher: shift every letter three positions forward in the alphabet, wrapping around at the end. A becomes D, B becomes E, ..., X becomes A. The suspicious thing — and your first clue that classical cryptography is in trouble — is that this is arithmetic you already know. Number the letters A=0, B=1, ..., Z=25. Then Caesar's cipher is

$$c \equiv m + 3 \pmod{26},$$

and decryption is $m \equiv c - 3 \pmod{26}$. The "shift by three" is just one choice; the shift cipher allows any shift $k \in \{0, 1, \dots, 25\}$ as the key:

$$\text{encrypt: } c \equiv m + k \pmod{26}, \qquad \text{decrypt: } m \equiv c - k \pmod{26}.$$

🔗 Connection. That congruence is exactly the modular arithmetic of Chapter 23 — addition in $\mathbb{Z}_{26}$. Encryption adds the key; decryption adds its additive inverse $-k$. The cipher is correct precisely because every element of $\mathbb{Z}_{26}$ has an additive inverse, a fact you proved when you established that $\mathbb{Z}_n$ is a ring (Chapter 24). Cryptography is not a separate subject bolted onto number theory; it is number theory, viewed as a tool.

Here is why the shift cipher is hopeless. The key is a single number from 0 to 25, so there are only 26 possible keys — and one of them (the shift $k=0$) doesn't encrypt at all. An attacker who intercepts ciphertext simply tries all 25 nontrivial shifts and reads the one that produces English. This is a brute-force attack, and against the shift cipher it costs 25 tries.

💡 Intuition: The size of the key space is the first thing you check about any cipher. If an attacker can enumerate every key faster than you can drink a coffee, the cipher is broken before the mathematics even begins. The shift cipher has a key space of size 26. Chapter 15's counting tells you a modern symmetric key of 128 bits has a key space of size $2^{128} \approx 3.4 \times 10^{38}$ — that is the difference between a toy and a tool.

The obvious fix is to use a richer key. Instead of shifting every letter by the same amount, scramble the alphabet by an arbitrary permutation: maybe A→Q, B→W, C→E, .... This is the substitution cipher, and its key is a full permutation of the 26 letters. How big is that key space? By Chapter 16's count of permutations it is $26! \approx 4.03 \times 10^{26}$ — about 88 bits. Brute force is now utterly hopeless: enumerating $26!$ keys at a billion per second would take longer than the age of the universe. Surely this cipher is secure?

It is not, and the reason is one of the most important ideas in the subject: a large key space is necessary but nowhere near sufficient. The substitution cipher preserves structure. Whatever letter E maps to, it maps to consistently, and E is the most common letter in English (about 12.7% of letters in typical text — a Tier-1 figure reported across standard references such as Katz–Lindell). So in any reasonably long ciphertext, the most common symbol is almost certainly the encryption of E. The next most common is probably T, then A, O, and so on. Pairs of letters ("digraphs") like TH and ER leave the same fingerprints. This attack is frequency analysis, described in the 9th century by the Arab polymath al-Kindi, and it shreds the substitution cipher with nothing more than a frequency count and patience.

Frequency analysis is even easier against the shift cipher, and watching it work makes the principle concrete. Suppose you intercept the shift-cipher ciphertext WKLV LV D VHFUHW PHVVDJH and want the key without trying all 25 shifts. The attacker's first move is always the same: count the letters. V is the most common, appearing five times. The instinct is to guess the most common ciphertext letter decrypts to the most common English letter, E — but here the correct decryption sends V to S, not E, because this sample is far too short to be statistically reliable. So the right discipline is: let frequency propose a few likely keys, then confirm by reading. The two or three highest-frequency letters give you only a handful of candidate shifts to test, and the moment one shift turns the ciphertext into English you are done. Trying $k = 3$ — i.e., shifting every letter back by 3, so the common V (=21) becomes S (=18) — yields THIS IS A SECRET MESSAGE. Against a 26-key cipher this is instantaneous; against a $26!$-key substitution cipher it is slower but still decisive, because you solve the alphabet one high-frequency letter at a time. Here is a frequency counter — the attacker's first tool — in a few lines:

from collections import Counter

def letter_frequencies(text):
    """Return (letter, count) pairs, most common first, for A-Z only."""
    letters = [ch.upper() for ch in text if ch.isalpha()]
    return Counter(letters).most_common()

ciphertext = "WKLV LV D VHFUHW PHVVDJH"   # 'THIS IS A SECRET MESSAGE' shifted by 3
print(letter_frequencies(ciphertext)[:3])
# Expected output:
# [('V', 5), ('H', 4), ('W', 2)]

The most common ciphertext letter is V (five occurrences), then H (four). In this short sample V decrypts to S rather than the textbook E — a reminder that short texts are statistically noisy — but even here the top frequencies hand the attacker a tiny search of candidate shifts. On a paragraph of real English, the top ciphertext letters line up with E, T, A, O closely enough to crack the cipher by counting and a little reading.

🔄 Check Your Understanding 1. The shift cipher and the substitution cipher both have keys, yet one has a key space of size 26 and the other of size $26!$. Why is the substitution cipher still breakable despite its enormous key space? 2. Caesar's cipher is the special case of which more general cipher, and what is its key? 3. Why does frequency analysis work less reliably on a very short ciphertext than on a long one?

Answers

  1. Because the substitution cipher preserves letter frequencies: each plaintext letter always maps to the same ciphertext symbol, so the statistical fingerprint of the language survives encryption. Frequency analysis exploits that fingerprint and ignores the key space entirely. A big key space stops brute force; it does nothing against structure. 2. Caesar's cipher is the shift cipher with key $k = 3$. 3. Letter frequencies are statistical averages over large samples; a short ciphertext is a tiny, noisy sample in which the most common plaintext letter may not be the most common letter at all (in our example S beat E). The longer the text, the closer the observed counts track the true language frequencies, and the more reliably the top ciphertext symbol pins down E.

The deep lesson of the classical ciphers is that security through obscurity fails and structure leaks. This is now codified as Kerckhoffs's principle: a cryptosystem must remain secure even if the attacker knows everything about it except the key. We assume the enemy has our source code. The only secret is the key. Every cipher in the rest of this chapter is designed under that assumption.


25.2 The key-distribution problem

The ciphers of 25.1 share one structural feature that no amount of cleverness about which permutation to use can fix. They are symmetric.

Symmetric-key cryptography is encryption in which the same key (or two keys trivially derived from each other) is used to encrypt and to decrypt. The sender and receiver must both hold this single shared secret, and anyone who learns it can both read and forge messages.

Symmetric encryption is fast, well understood, and — in modern form (AES, which you may meet in a systems course) — believed extremely secure. But it has an Achilles' heel that is not about the algorithm at all. It is about logistics. Before Alice and Bob can exchange a single symmetric-encrypted message, they must already share the key. How did the key get from Alice to Bob?

If they meet in person and whisper it, fine — but that defeats the purpose of remote communication. If Alice emails the key to Bob, the eavesdropper reads the email and now holds the key. If Alice encrypts the key... with what? Another key, which must itself be shared first. The regress never ends. This is the key-distribution problem (also called key exchange): symmetric cryptography can protect a conversation only after a shared secret has somehow been established, and establishing it over an insecure channel is exactly the problem you were trying to solve.

🧩 Productive Struggle. Before reading on, try to design a scheme that lets Alice send Bob a secret message over a channel an eavesdropper can fully read, where Alice and Bob have never exchanged anything beforehand. Spend five real minutes. Most people conclude it is impossible — and for all of recorded history, everyone did. The fact that it is not impossible is one of the great surprises of twentieth-century mathematics.

The scale of the problem gets worse with the number of participants. Suppose $n$ people all want to communicate privately and pairwise. With symmetric keys, every pair needs its own shared secret, and by Chapter 16's combinations the number of pairs is

$$\binom{n}{2} = \frac{n(n-1)}{2}.$$

For a company of $n = 1000$ employees that is $\binom{1000}{2} = 499{,}500$ keys to generate, distribute, and protect. For the roughly two billion devices on the modern web it is on the order of $10^{18}$ keys. Pre-distributing a unique secret to every possible pair of machines on the internet is not merely inconvenient; it is physically impossible. Symmetric cryptography alone cannot secure the internet, and the reason is not weak math — it is the combinatorics of key distribution.

⚠️ Common Pitfall: Students sometimes think a faster or fancier symmetric cipher would solve this. It cannot. The key-distribution problem is orthogonal to the strength of the cipher. You could have a perfect, unbreakable symmetric cipher and still be unable to use it with a stranger, because you have no way to agree on the key. The problem to be solved is not "how do we scramble the message" but "how do we agree on a secret across a wire the enemy controls."

🔄 Check Your Understanding 1. Why does emailing a symmetric key (even an encrypted one) fail to solve the key-distribution problem? 2. A network grows from 100 to 200 mutually communicating parties. By what factor does the number of required pairwise symmetric keys grow (approximately)?

Answers

  1. Sending the key in the clear hands it to the eavesdropper; encrypting it first requires another shared key, which must itself be distributed — an infinite regress. 2. Keys scale as $\binom{n}{2} \approx n^2/2$, so doubling $n$ roughly quadruples the count: $\binom{200}{2}/\binom{100}{2} = (200\cdot199)/(100\cdot99) \approx 4.02$.

25.3 Public-key cryptography: the idea

The escape from the circle is an idea so counterintuitive that it took until 1976 for anyone to publish it (Diffie and Hellman; we now know it was discovered a few years earlier in classified British work by James Ellis, Clifford Cocks, and Malcolm Williamson, and kept secret). Here is the idea, stripped to its essence, before any number theory.

💡 Intuition: the open padlock. Imagine you want people to send you secret packages. You manufacture thousands of identical padlocks and hand them out freely — you leave them on every street corner, publish them in the newspaper, mail them to strangers. The padlocks snap shut with a click; locking one requires no key. But opening a locked padlock requires the single physical key, which you keep and never share. Now anyone can put a message in a box, snap one of your padlocks shut, and send it. The eavesdropper can see the locked box and even has a padlock of their own — but a padlock does not help you open a locked box. Only your private key does.

That asymmetry — anyone can lock, only the holder can unlock — is the whole game. Translated to mathematics, it means we want a pair of keys:

Public-key cryptography (also called asymmetric cryptography) uses a pair of keys per participant: a public key, published openly and used by anyone to encrypt messages to that participant, and a private key, kept secret and used only by that participant to decrypt. The two keys are mathematically linked, but deriving the private key from the public key must be computationally infeasible.

Two things change immediately. First, the key-distribution problem evaporates: there is no shared secret to distribute. Alice looks up Bob's public key (he can post it on his website), encrypts with it, and sends. Bob decrypts with his private key, which never left his machine. The eavesdropper has Bob's public key too — and it does them no good, because the public key only locks. Second, the combinatorial explosion collapses: $n$ participants need only $n$ key pairs total, one each, not $\binom{n}{2}$ shared secrets.

What kind of mathematical object behaves like an open padlock? We need a function $f$ that is easy to compute but hard to invertunless you possess a secret piece of information, in which case inversion becomes easy again. That last clause is the crucial twist.

A trapdoor function is a function $f$ that is easy to evaluate ($x \mapsto f(x)$ is cheap for everyone) and computationally infeasible to invert ($y \mapsto x$ such that $f(x)=y$ is intractable in general), except for anyone who holds a secret "trapdoor," for whom inversion is also easy.

A one-way function alone is not enough — that would be a padlock no one, including you, could open. The trapdoor is what lets the legitimate receiver, and only the receiver, invert the function. Public-key cryptography is the search for trapdoor functions. RSA is the most famous one ever found, and its trapdoor is built entirely from the number theory you already know.

🚪 Threshold Concept. Public-key cryptography is a genuine before-and-after in how you see computation. Before it, "secret communication requires a pre-shared secret" feels like a law of nature. After it, you understand that computational hardness can substitute for secrecy: the attacker can know the algorithm, hold the public key, and watch every byte on the wire, and still be defeated — not because they lack information, but because turning that information into the plaintext would require solving a problem we believe is intractable. Security stops being about hiding and starts being about hardness. Almost everything in modern cryptography, and a great deal of the rest of this book's Part VI (what is hard to compute, and why), grows from this single shift.

🔄 Check Your Understanding 1. In the padlock analogy, which physical object corresponds to the public key and which to the private key? Why is it safe to hand out the public object freely? 2. Why is a plain one-way function (easy forward, hard backward, no trapdoor) useless for encryption, even though it is hard to invert?

Answers

  1. The padlock is the public key; the physical key that opens it is the private key. Handing out padlocks is safe because owning a padlock lets you lock but never unlock — the public key only encrypts. 2. Because the legitimate receiver must be able to decrypt. A function no one can invert protects the message from everyone, including its intended recipient — useless. The trapdoor is precisely what lets the receiver (and only the receiver) invert it.

25.4 RSA: key generation, encryption, decryption

RSA — named for Ron Rivest, Adi Shamir, and Leonard Adleman, who published it in 1978 — turns the trapdoor idea into arithmetic. Every piece of it is something you built in Chapters 22–24. The trapdoor is this: it is easy to multiply two large primes together, but believed to be hard to factor the product back into those primes. The private key is, in effect, knowledge of the factors.

We describe RSA as three algorithms: key generation, encryption, decryption. Here, key generation is the procedure that produces a matched public/private key pair.

Key generation

  1. Choose two distinct primes $p$ and $q$ (in practice each hundreds of digits long; for learning, tiny). Compute the modulus $n = pq$.
  2. Compute the totient $\phi(n)$. Because $p$ and $q$ are distinct primes, Chapter 23's totient formula gives $$\phi(n) = (p-1)(q-1).$$
  3. Choose a public exponent $e$ with $1 < e < \phi(n)$ and $\gcd(e, \phi(n)) = 1$. (Common real choice: $e = 65537$.)
  4. Compute the private exponent $d$ as the modular inverse of $e$ modulo $\phi(n)$ — that is, the unique $d$ with $$e \cdot d \equiv 1 \pmod{\phi(n)},$$ which you find with the extended Euclidean algorithm from Chapter 22 (the mod_inverse you wrote for the Toolkit in Chapter 23).

The public key is the pair $(n, e)$ — published. The private key is $d$ (kept secret); the primes $p, q$ and $\phi(n)$ must also be discarded or guarded, since any of them reveals $d$.

🔗 Connection. Step 4 is only solvable because of step 3. A modular inverse of $e$ mod $\phi(n)$ exists if and only if $\gcd(e, \phi(n)) = 1$ — exactly the existence criterion you proved in Chapter 23 from Bézout's identity (Chapter 22). The seemingly arbitrary condition "$\gcd(e, \phi(n)) = 1$" in step 3 is not a convention; it is the precondition that makes the private key exist at all.

Encryption and decryption

Represent the message as an integer $m$ with $0 \le m < n$. (Longer messages are split into blocks; real systems also pad — see 25.6.) Then:

$$\boxed{\ \text{encrypt: } c \equiv m^{e} \pmod{n}, \qquad \text{decrypt: } m \equiv c^{d} \pmod{n}.\ }$$

That is the entire system. Encryption raises the message to the public exponent; decryption raises the ciphertext to the private exponent; both reduce mod $n$. The astonishing claim — proved in 25.5 — is that these two operations are inverses: $\left(m^{e}\right)^{d} \equiv m \pmod n$ for every message $m$.

💡 Intuition: $e$ and $d$ are inverses in the exponent — they multiply to $1$ modulo $\phi(n)$. Encrypting then decrypting raises $m$ to the power $ed$, and because $ed \equiv 1 \pmod{\phi(n)}$, Euler's theorem (the engine you forged in Chapter 23) collapses $m^{ed}$ back to $m$. RSA is Euler's theorem wearing a disguise.

A worked example, by hand

Let's run RSA with primes small enough to trace with a pencil. Take $p = 5$ and $q = 11$.

  • $n = pq = 55$.
  • $\phi(n) = (p-1)(q-1) = 4 \cdot 10 = 40$.
  • Choose $e = 3$. Check: $\gcd(3, 40) = 1$. ✓
  • Compute $d = 3^{-1} \bmod 40$. We need $3d \equiv 1 \pmod{40}$. Trying $d = 27$: $3 \cdot 27 = 81 = 2 \cdot 40 + 1 \equiv 1 \pmod{40}$. ✓ So $d = 27$.

Public key $(n, e) = (55, 3)$; private key $d = 27$.

Encrypt the message $m = 7$: $$c \equiv 7^{3} = 343 \pmod{55}.$$ Reduce: $343 = 6 \cdot 55 + 13$, so $c = 13$.

Decrypt the ciphertext $c = 13$: $$m \equiv 13^{27} \pmod{55}.$$ Computing $13^{27} \bmod 55$ directly looks brutal, but fast modular exponentiation (Chapter 23) makes it painless. First reduce powers of 13 mod 55 by repeated squaring: $$13^1 \equiv 13, \quad 13^2 = 169 \equiv 4, \quad 13^4 \equiv 4^2 = 16, \quad 13^8 \equiv 16^2 = 256 \equiv 36, \quad 13^{16} \equiv 36^2 = 1296 \equiv 31 \pmod{55}.$$ (Each reduction: $169 = 3\cdot55+4$; $256 = 4\cdot55+36$; $1296 = 23\cdot55+31$.) Now $27 = 16 + 8 + 2 + 1$, so $$13^{27} = 13^{16} \cdot 13^{8} \cdot 13^{2} \cdot 13^{1} \equiv 31 \cdot 36 \cdot 4 \cdot 13 \pmod{55}.$$ Multiply step by step, reducing as we go: $31 \cdot 36 = 1116 \equiv 26$ (since $1116 = 20\cdot55+16$ — wait, recompute: $20 \cdot 55 = 1100$, $1116 - 1100 = 16$, so $31 \cdot 36 \equiv 16$). Then $16 \cdot 4 = 64 \equiv 9$. Then $9 \cdot 13 = 117 = 2\cdot55 + 7 \equiv 7 \pmod{55}$. We recover $m = 7$. The round trip worked.

⚠️ Common Pitfall: Arithmetic mod $n$ is unforgiving — a single miscomputed reduction derails the whole chain, as the deliberate stumble above shows (always re-reduce after every multiplication, never at the end). This is also why you never compute $13^{27}$ as a giant integer and then reduce: in real RSA the exponent has hundreds of digits and $c^d$ would have more digits than there are atoms in the universe. Fast modular exponentiation keeps every intermediate value below $n$. It is not an optimization; it is the only way the computation is possible at all.

RSA in Python, from scratch

Here is the entire cryptosystem, leaning on the numbertheory helpers you built in Chapters 22–23 (mod_inverse, and Python's built-in three-argument pow for fast modular exponentiation, which is exactly the mod_pow you implemented):

def mod_inverse(a, m):
    # extended Euclidean algorithm (from Chapter 22-23 Toolkit)
    old_r, r = a % m, m
    old_s, s = 1, 0
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
    return old_s % m              # a * old_s == 1 (mod m)

p, q = 61, 53                     # two distinct primes
n = p * q                         # modulus = 3233
phi = (p - 1) * (q - 1)           # 60 * 52 = 3120
e = 17                            # gcd(17, 3120) == 1
d = mod_inverse(e, phi)           # private exponent

m = 65                            # message (0 <= m < n)
c = pow(m, e, n)                  # encrypt: m^e mod n
back = pow(c, d, n)               # decrypt: c^d mod n
print(d, c, back)
# Expected output:
# 2753 2790 65

The output says the private exponent is $d = 2753$, the ciphertext of $65$ is $2790$, and decrypting $2790$ returns $65$. (You can verify $d$ by hand: $17 \cdot 2753 = 46801 = 15 \cdot 3120 + 1$, so $17 \cdot 2753 \equiv 1 \pmod{3120}$. ✓) Twenty lines of arithmetic, and you have reproduced the mechanism that secures the web.

🔄 Check Your Understanding 1. In key generation, which quantities are published, which are kept secret, and which must be destroyed after $d$ is computed? 2. Why must $\gcd(e, \phi(n)) = 1$? (Name the chapter result you are invoking.) 3. In the hand example, we found $d = 27$ from $e = 3$, $\phi(n) = 40$. Verify $ed \equiv 1 \pmod{\phi(n)}$ and state, in one sentence, what that congruence guarantees about encrypt-then- decrypt.

Answers

  1. Published: $(n, e)$. Secret: $d$. Destroyed/guarded: $p$, $q$, and $\phi(n)$ — each of them lets an attacker recompute $d$. 2. Because $d$ must be the modular inverse of $e$ modulo $\phi(n)$, and (by the result in Chapter 23, from Bézout's identity in Chapter 22) that inverse exists iff $\gcd(e, \phi(n)) = 1$. 3. $3 \cdot 27 = 81 = 2\cdot 40 + 1 \equiv 1 \pmod{40}$. It guarantees $ed \equiv 1 \pmod{\phi(n)}$, which (next section) forces $(m^e)^d \equiv m \pmod n$ — decryption inverts encryption.

25.5 Why RSA works: a proof via Euler

We have asserted that decryption inverts encryption. Now we prove it. This is the mathematical heart of the chapter, and it is a beautiful, short proof — but only if you state the strategy first.

What we must prove. For RSA keys generated as in 25.4, and for any integer message $m$ with $0 \le m < n$, $$\left(m^{e}\right)^{d} \equiv m \pmod{n}.$$

The strategy first. Encrypt-then-decrypt raises $m$ to the power $ed$. By construction $ed \equiv 1 \pmod{\phi(n)}$, which means $ed = 1 + k\phi(n)$ for some nonnegative integer $k$ (this is just the definition of congruence, from Chapter 23). So $m^{ed} = m^{1 + k\phi(n)} = m \cdot \left(m^{\phi(n)} \right)^{k}$. Euler's theorem (Chapter 23) says $m^{\phi(n)} \equiv 1 \pmod n$ *when* $\gcd(m,n)=1$, which would instantly collapse the parenthesized factor to $1$ and leave $m$. The entire proof is: rewrite the exponent using $ed \equiv 1$, then apply Euler. The only subtlety is the clause "when $\gcd(m,n)=1$" — we will first prove the clean case, then explain why the result holds even without it.

Proof (main case: $\gcd(m, n) = 1$). By the choice of $d$ in key generation, $ed \equiv 1 \pmod{\phi(n)}$, so there is an integer $k \ge 0$ with $$ed = 1 + k\,\phi(n).$$ Therefore $$\left(m^{e}\right)^{d} = m^{ed} = m^{1 + k\phi(n)} = m \cdot \left(m^{\phi(n)}\right)^{k}.$$ Since $\gcd(m, n) = 1$, Euler's theorem (Chapter 23) gives $m^{\phi(n)} \equiv 1 \pmod{n}$. Hence $$m \cdot \left(m^{\phi(n)}\right)^{k} \equiv m \cdot 1^{k} = m \pmod{n}.$$ Combining, $\left(m^{e}\right)^{d} \equiv m \pmod n$. $\blacksquare$

That proof is complete and correct whenever the message $m$ is coprime to $n$. But $n = pq$, and a message could (in principle) be a multiple of $p$ or of $q$, in which case $\gcd(m, n) \ne 1$ and Euler's theorem does not directly apply. The textbook story usually stops at the clean case; we won't, because the gap is exactly where careful number theory earns its keep.

The strategy for the general case. Instead of working mod $n$ all at once, work modulo each prime separately, then glue the two results back together with the Chinese Remainder Theorem (Chapter 23). Modulo a single prime $p$, Fermat's little theorem handles even the case $p \mid m$ — because if $p \mid m$ then both $m^{ed}$ and $m$ are $\equiv 0 \pmod p$, and they trivially agree. So we'll show $m^{ed} \equiv m \pmod{p}$ and $m^{ed} \equiv m \pmod{q}$ unconditionally, and CRT promotes those two congruences to one mod $n = pq$.

Proof (general case, any $0 \le m < n$). We show $m^{ed} \equiv m \pmod{p}$ for every $m$; the same argument with $q$ gives $m^{ed} \equiv m \pmod{q}$.

Recall $ed = 1 + k\phi(n) = 1 + k(p-1)(q-1)$.

Case (i): $p \mid m$. Then $m \equiv 0 \pmod p$, so $m^{ed} \equiv 0 \equiv m \pmod p$. Done.

Case (ii): $p \nmid m$. Then $\gcd(m, p) = 1$ (since $p$ is prime), so Fermat's little theorem (Chapter 23) gives $m^{p-1} \equiv 1 \pmod p$. Now $$m^{ed} = m^{\,1 + k(p-1)(q-1)} = m \cdot \left(m^{\,p-1}\right)^{k(q-1)} \equiv m \cdot 1^{k(q-1)} = m \pmod{p}.$$ Either way, $m^{ed} \equiv m \pmod p$.

By the identical argument, $m^{ed} \equiv m \pmod q$. So $p \mid (m^{ed} - m)$ and $q \mid (m^{ed} - m)$. Because $p$ and $q$ are distinct primes, they are coprime, and the Chinese Remainder Theorem (equivalently, the fact that coprime divisors of a number have a product that also divides it) gives $pq \mid (m^{ed} - m)$, i.e. $$m^{ed} \equiv m \pmod{pq = n}.$$ Therefore $\left(m^{e}\right)^{d} \equiv m \pmod{n}$ for every message $m$ with $0 \le m < n$. $\blacksquare$

💡 Intuition: The CRT-based proof is the "right" proof: it works for all messages and reveals why $n$ must be a product of distinct primes. The structure "prove it mod $p$, prove it mod $q$, glue with CRT" is the same move you'll use whenever a statement about a composite modulus is easier one prime at a time — and it is exactly how high-speed RSA implementations actually decrypt (they compute mod $p$ and mod $q$ separately, then CRT-combine, for a roughly 4× speedup).

🐛 Find the Error. A student proves RSA correctness as follows: "Since $ed \equiv 1 \pmod{\phi(n)}$, we have $ed = 1$, so $m^{ed} = m^1 = m$. Done." Where is the error, and why does it matter that $ed$ is generally much larger than 1?

Answer

The student confuses a congruence with an equality. $ed \equiv 1 \pmod{\phi(n)}$ does not mean $ed = 1$; it means $ed = 1 + k\phi(n)$ for some integer $k$, and in fact $ed > \phi(n) > 1$ always (since $e, d > 1$). The whole point of the real proof is that the extra factor $(m^{\phi(n)})^k$, coming from the $k\phi(n)$ term, collapses to $1$ via Euler/Fermat. Dropping it assumes the conclusion. In the hand example, $ed = 81$, emphatically not $1$.

🔄 Check Your Understanding 1. Exactly which earlier theorem makes $\left(m^{\phi(n)}\right)^k$ collapse to $1$ in the main-case proof, and what hypothesis does it require? 2. Why does the general-case proof need Fermat's little theorem instead of Euler's theorem? 3. Where in the general-case proof is the hypothesis "$p$ and $q$ are distinct" actually used?

Answers

  1. Euler's theorem, $m^{\phi(n)} \equiv 1 \pmod n$; it requires $\gcd(m, n) = 1$. 2. Because we work modulo a single prime $p$, where Fermat's little theorem ($m^{p-1}\equiv 1 \pmod p$ for $p \nmid m$) applies, *and* the case $p \mid m$ is handled trivially — Euler mod $n$ can't cover $p \mid m$.
  2. At the final CRT step: $p$ and $q$ must be coprime for $p \mid x$ and $q \mid x$ to imply $pq \mid x$. If $p = q$, then $pq = p^2$ would *not* follow from $p \mid x$ alone.

25.6 Why RSA is secure — and how it breaks

Correctness says RSA works for the legitimate parties. Security asks the opposite question: can an attacker, holding the public key $(n, e)$ and the ciphertext $c$, recover $m$? Notice what the attacker is missing — the private exponent $d$. And here is the pivotal observation: computing $d$ from the public information is as hard as factoring $n$.

The chain is short. If the attacker could factor $n$ into $p$ and $q$, they would compute $\phi(n) = (p-1)(q-1)$ and then $d = e^{-1} \bmod \phi(n)$ by the extended Euclidean algorithm — recovering the private key in seconds. So if factoring is easy, RSA is broken. The security of RSA rests on the converse belief:

The RSA / factoring assumption. For large $n = pq$ with $p, q$ random primes of equal size, no known algorithm factors $n$ in feasible time. Multiplying $p$ and $q$ is fast (polynomial in their size); reversing the multiplication — factoring — is believed to require time that grows faster than any polynomial in the number of digits. This easy-forward / hard-backward gap is the trapdoor: the factors are the secret that makes inversion easy.

This is exactly the trapdoor function of 25.3 made concrete. The function "multiply two primes" is easy; its inverse "factor the product" is — as far as anyone knows — intractable; and the trapdoor (the factors themselves, equivalently $d$) restores easy inversion for the key holder.

The asymmetry is worth feeling at scale. Multiplying two 1024-bit primes to form a 2048-bit modulus takes a fraction of a millisecond. Factoring that 2048-bit modulus with the best known classical algorithm (the general number field sieve) would take vastly longer than the age of the universe on all the computers on Earth combined. The two operations are inverses of each other, yet one is instant and the other is, for practical purposes, impossible — and that gap is the entire security of the system. This is also where Chapter 15's keyspace counting returns: an RSA key's strength is measured not by enumerating keys (the $d$ that an attacker seeks is unique) but by the cost of the factoring shortcut, which is why recommended modulus sizes have grown from 1024 to 2048 to 3072 bits as factoring methods and hardware improved. Choosing a key size is choosing how far out of reach to put the factoring attack.

⚠️ Common Pitfall: "RSA has been proven secure." It has not. RSA's security is a conjecture: it rests on the unproven assumption that factoring is hard, which is closely related to the great open problems of complexity theory you'll meet in Chapter 37. No one has proven factoring is hard; no one has found a fast classical algorithm either. We trust RSA the way we trust that P ≠ NP — on the strength of decades of failed attacks, not a proof. (A large enough quantum computer running Shor's algorithm would factor efficiently — which is why "post-quantum" cryptography is an active field.)

Even granting the factoring assumption, the textbook RSA of 25.4 — encrypt $m$ as $m^e \bmod n$, nothing more — is not safe to deploy, and seeing why sharpens your understanding of what security means.

  • It is deterministic. The same plaintext always encrypts to the same ciphertext. So an attacker who suspects the message is "YES" or "NO" can encrypt both with the public key and compare — no factoring required. Real RSA injects random padding (e.g., OAEP) so that encryption is randomized and identical plaintexts yield different ciphertexts.
  • Small messages and small exponents leak. If $e = 3$ and $m$ is small enough that $m^3 < n$, then $c = m^3$ with no modular wraparound, and the attacker just takes an ordinary integer cube root. Padding (again) prevents tiny $m$.
  • Never share a modulus. If two users share the same $n$ but have different exponents, each can compute the other's private key from their own. Every key pair needs its own freshly generated $n$.

💡 Intuition: Security is not a property of the core equation alone; it is a property of the whole protocol — padding, randomness, key sizes, key management. "The math is sound" and "the system is secure" are different claims. This is the same lesson as the substitution cipher in 25.1: a big key space (there, $26!$; here, an unfactorable $n$) does not by itself imply security if the structure leaks. Determinism is structure, and structure leaks.

One more idea, in preview, shows the power of the public/private split. Because the keys are inverses, you can run RSA backwards: the holder of the private key can compute $s \equiv m^{d} \pmod n$, and anyone can verify with the public key that $s^{e} \equiv m \pmod n$. Only the private-key holder could have produced such an $s$, so $s$ acts as an unforgeable digital signature — a way to prove authorship of a message rather than to hide it. We introduce it only as a glimpse; signatures are a study of their own, and the capstone (Chapter 39, Track A) lets you build one on top of the RSA you implement here.

🔄 Check Your Understanding 1. State the precise sense in which "breaking RSA is as hard as factoring." Which direction of that relationship is a theorem, and which is an assumption? 2. Textbook RSA encrypts "ATTACK" and "RETREAT" to fixed ciphertexts. Describe an attack that reads the message without factoring $n$. 3. In one sentence, how does a digital signature reuse the same RSA arithmetic for the opposite purpose?

Answers

  1. Theorem: if you can factor $n$, you can recover $d$ and break RSA (compute $\phi(n)$, invert $e$). Assumption: that there is no feasible way to break RSA without factoring — i.e., that factoring (and recovering $d$) is hard. 2. Since textbook RSA is deterministic, the attacker encrypts both candidate plaintexts with the public key and compares to the observed ciphertext; a match reveals the message — no factoring needed. This is why real RSA randomizes via padding.
  2. The private-key holder raises the message to $d$ (instead of the public $e$) to sign, and anyone raises the signature to $e$ to verify — encryption run backwards proves authorship instead of providing secrecy.

Project Checkpoint: the RSA core of the Toolkit

This is the chapter your whole dmtoolkit has been building toward. In Chapter 24 you began crypto.py; now you complete its RSA core, reusing numbertheory.py from Chapters 22–23 (mod_inverse for the private exponent; fast modular exponentiation — Python's three-argument pow, which is the mod_pow you wrote — for encrypt and decrypt). Add to dmtoolkit/crypto.py:

from dmtoolkit.numbertheory import mod_inverse   # built in Ch.22-23

def rsa_keygen_from_primes(p, q, e):
    """Build an RSA key pair from chosen primes p, q and public exponent e.
    Returns (public, private) = ((n, e), d). Requires gcd(e, (p-1)(q-1)) == 1."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = mod_inverse(e, phi)          # exists iff gcd(e, phi) == 1
    return (n, e), d

def rsa_encrypt(m, pub):
    n, e = pub
    return pow(m, e, n)              # m^e mod n  (fast modular exponentiation)

def rsa_decrypt(c, priv, n):
    return pow(c, priv, n)           # c^d mod n

pub, d = rsa_keygen_from_primes(61, 53, 17)
c = rsa_encrypt(65, pub)
m = rsa_decrypt(c, d, pub[0])
print(pub, d, c, m)
# Expected output:
# ((3233, 17), 2753) 2753 2790 65

The expected output is hand-derived: $n = 61 \cdot 53 = 3233$, $\phi = 60 \cdot 52 = 3120$, $d = 17^{-1} \bmod 3120 = 2753$ (since $17 \cdot 2753 = 46801 = 15\cdot 3120 + 1$), $c = 65^{17} \bmod 3233 = 2790$, and $65^{17 \cdot 2753} \equiv 65 \pmod{3233}$ recovers $m = 65$ by the correctness theorem of 25.5.

The full rsa_keygen(bits) from the canonical Toolkit API additionally generates random primes of a given bit length (using the primality testing that belongs with Chapter 23's number-theory module); project-checkpoint.py includes that wrapper, but the prime-generation step is shown, not executed. Toward the capstone: with rsa_keygen, rsa_encrypt, and rsa_decrypt in place, your Toolkit can now perform real public-key encryption — the foundation of Capstone Track A (Chapter 39), where you'll wrap this core in message encoding, padding, and a digital-signature mode.

Toolkit so far: logic.py (Ch.1–3), sets.py (Ch.8), relations.py (Ch.12–13), combinatorics.py (Ch.15–17), recurrences.py (Ch.6, 18–19), probability.py (Ch.20–21), numbertheory.py (Ch.22–23), and now a working crypto.py. You can encrypt a message to a public key and decrypt it with a private key — the thing this entire part was for.


Summary

RSA is the worked payoff of Part IV: classical secrecy meets a trapdoor built from primes, totients, and Euler's theorem.

Ciphers and their fates

Cipher Key Key-space size How it breaks
Shift (Caesar = shift 3) one shift $k \in \mathbb{Z}_{26}$ 26 brute force (≤ 25 tries)
Substitution a permutation of 26 letters $26! \approx 4\times10^{26}$ frequency analysis (structure leaks)
Symmetric (modern, e.g. AES) shared secret, $\ge 128$ bits $\ge 2^{128}$ strong cipher — but blocked by key distribution
RSA (public-key) public $(n,e)$ / private $d$ governed by hardness of factoring $n$ factor $n$ (believed infeasible)

The two structural problems and their resolution

Problem Statement Resolution
Key distribution symmetric crypto needs a pre-shared secret, impossible over an open channel public-key crypto: no shared secret; publish a lock, keep the key
Scaling $n$ parties need $\binom{n}{2} = \tfrac{n(n-1)}{2}$ symmetric keys public-key: $n$ key pairs total

RSA reference

Step Operation
Key generation pick distinct primes $p,q$; $n=pq$; $\phi(n)=(p-1)(q-1)$; pick $e$ with $\gcd(e,\phi(n))=1$; $d \equiv e^{-1}\pmod{\phi(n)}$
Public / private key public $(n,e)$; private $d$ (discard/guard $p,q,\phi(n)$)
Encrypt $c \equiv m^{e}\pmod n$, for $0\le m < n$
Decrypt $m \equiv c^{d}\pmod n$
Why it works $ed\equiv 1\pmod{\phi(n)}\Rightarrow m^{ed}\equiv m\pmod n$ (Euler; CRT for $\gcd(m,n)\ne1$)
Why it's secure recovering $d$ from $(n,e)$ is as hard as factoring $n$ (an assumption, not a theorem)

Key definitions: a cipher is an encrypt/decrypt pair keyed by a secret; symmetric crypto shares one key, public-key crypto splits a published key from a private one; a trapdoor function is easy forward, hard to invert without a secret; key generation produces a matched key pair; a digital signature runs RSA backwards (sign with $d$, verify with $e$) to prove authorship.

Pitfalls: re-reduce mod $n$ after every multiplication; never compute $c^d$ as a giant integer (use fast modular exponentiation); textbook RSA is deterministic and therefore insecure — real systems randomize with padding, avoid tiny $m$ / tiny $e$, and never share a modulus.


Spaced Review

Retrieval keeps knowledge available. Answer from memory before checking.

  1. (Ch. 24) RSA decryption lives in $\mathbb{Z}_n$. Is the structure $(\mathbb{Z}_n, +, \times)$ a field when $n = pq$ with $p \ne q$ both prime? Justify in one line, and say which $n$ do make $\mathbb{Z}_n$ a field.
  2. (Ch. 24) RSA's correctness for $\gcd(m,n)=1$ uses the group of units modulo $n$. What is the order of that group, and what theorem names it?
  3. (Ch. 23) State Euler's theorem and the precise hypothesis it needs. Where exactly does that hypothesis appear in the main-case RSA proof?
  4. (Ch. 23) You need $d$ with $ed \equiv 1 \pmod{\phi(n)}$. Which algorithm computes it, and what condition guarantees it exists?
  5. (Ch. 24 / Ch. 23) In the hand example $p=5, q=11, e=3$, the units group mod 55 has order 40. Show how that number both determines $\phi(n)$ and constrains the choice of $e$.

Answers

  1. No — $\mathbb{Z}_{pq}$ is not a field, because it has zero divisors: $p \cdot q \equiv 0 \pmod{pq}$ with neither factor zero, so $p$ has no multiplicative inverse. $\mathbb{Z}_n$ is a field iff $n$ is prime. 2. The group of units modulo $n$ has order $\phi(n) = (p-1)(q-1)$; Lagrange's theorem (Chapter 24) — or directly Euler's theorem (Chapter 23) — gives $m^{\phi(n)} \equiv 1$ for units. 3. Euler's theorem: if $\gcd(a,n)=1$ then $a^{\phi(n)} \equiv 1 \pmod n$. In the RSA proof it is applied to collapse $\left(m^{\phi(n)}\right)^k \equiv 1$, and its hypothesis is exactly the "$\gcd(m,n)=1$" assumption of the main case. 4. The extended Euclidean algorithm (Chapter 22/23); $d = e^{-1} \bmod \phi(n)$ exists iff $\gcd(e, \phi(n)) = 1$. 5. $\phi(55) = (5-1)(11-1) = 4 \cdot 10 = 40$, the order of the units group; and $e$ must satisfy $\gcd(e, 40) = 1$, so $e=3$ is allowed (while $e=2,4,5,8,10$ would not be).

What's Next

RSA hides a message by making it computationally infeasible to read. The next chapter, Hashing, Checksums, and Error Detection, turns the same arithmetic — modular reduction, the pigeonhole principle (Chapter 17) — to a different goal: not hiding data, but fingerprinting it. You'll see how a hash function maps arbitrary data to a fixed-size digest, why collisions are unavoidable (the pigeonhole principle, made quantitative), and how checksums and cyclic redundancy checks catch the bit-flips that creep into every real transmission. Cryptographic hashing, glimpsed there, is the other pillar — alongside the public-key encryption you just built — on which the security of the internet stands.