Case Study 1: Building a Simple Encryption Tool

DataField.Dev

Case Study 1: Building a Simple Encryption Tool

Overview

Encryption — transforming readable text into an unreadable form and back again — is one of the oldest applications of text processing. In this case study, we build two classic ciphers using the string operations from Chapter 10: the Caesar cipher (shifting each letter by a fixed amount) and a substitution cipher (mapping each letter to a different letter via a key). Along the way, we will use character arithmetic, string indexing, case conversion, and the Ord/Chr functions extensively.

⚠️ Important. These ciphers are historically interesting but cryptographically broken. Do not use them to protect real secrets. Modern encryption requires algorithms like AES or RSA, which are far beyond simple character manipulation.

Part 1: The Caesar Cipher

How It Works

The Caesar cipher shifts every letter in the plaintext by a fixed number of positions in the alphabet. With a shift of 3:

A → D, B → E, C → F, ..., X → A, Y → B, Z → C
a → d, b → e, etc.

Non-letter characters (digits, spaces, punctuation) are left unchanged.

Implementation

The key insight is modular arithmetic on character codes. For an uppercase letter c with shift n:

Subtract the base (Ord('A')) to get a value 0–25.
Add the shift.
Apply mod 26 to wrap around.
Add the base back.

function CaesarEncrypt(const plaintext: String; shift: Integer): String;
var
  i: Integer;
  c: Char;
  base: Integer;
begin
  Result := plaintext;
  { Normalize shift to 0..25 range }
  shift := ((shift mod 26) + 26) mod 26;

  for i := 1 to Length(Result) do
  begin
    c := Result[i];
    if (c >= 'A') and (c <= 'Z') then
    begin
      base := Ord('A');
      Result[i] := Chr(base + (Ord(c) - base + shift) mod 26);
    end
    else if (c >= 'a') and (c <= 'z') then
    begin
      base := Ord('a');
      Result[i] := Chr(base + (Ord(c) - base + shift) mod 26);
    end;
    { Non-letter characters remain unchanged }
  end;
end;

function CaesarDecrypt(const ciphertext: String; shift: Integer): String;
begin
  { Decryption is just encryption with the negative shift }
  Result := CaesarEncrypt(ciphertext, -shift);
end;

Testing the Caesar Cipher

var
  original, encrypted, decrypted: String;
begin
  original := 'The quick brown fox jumps over the lazy dog!';

  encrypted := CaesarEncrypt(original, 13);
  WriteLn('Original:  ', original);
  WriteLn('Encrypted: ', encrypted);

  decrypted := CaesarDecrypt(encrypted, 13);
  WriteLn('Decrypted: ', decrypted);

  { Verify round-trip }
  if original = decrypted then
    WriteLn('Round-trip successful!')
  else
    WriteLn('ERROR: Round-trip failed!');
end.

Expected output:

Original:  The quick brown fox jumps over the lazy dog!
Encrypted: Gur dhvpx oebja sbk whzcf bire gur ynml qbt!
Decrypted: The quick brown fox jumps over the lazy dog!
Round-trip successful!

Note that shift 13 gives us ROT13 — a cipher where encryption and decryption are the same operation.

Brute-Force Attack

Because there are only 25 meaningful shifts (shift 0 and shift 26 are identity), we can try all of them:

procedure BruteForce(const ciphertext: String);
var
  shift: Integer;
begin
  WriteLn('Brute-force attack on Caesar cipher:');
  WriteLn('------------------------------------');
  for shift := 1 to 25 do
    WriteLn(Format('Shift %2d: %s', [shift,
            Copy(CaesarEncrypt(ciphertext, shift), 1, 50)]));
end;

This demonstrates why the Caesar cipher is trivially breakable — an attacker simply checks all 25 possibilities and picks the one that produces readable English.

Part 2: The Substitution Cipher

How It Works

A substitution cipher replaces each letter with a different letter according to a key — a permutation of the alphabet. For example:

Plain:  ABCDEFGHIJKLMNOPQRSTUVWXYZ
Key:    QWERTYUIOPASDFGHJKLZXCVBNM

With this key, A encrypts to Q, B to W, C to E, and so on.

Generating a Key

We need a procedure that generates a random permutation of the alphabet:

function GenerateSubstitutionKey: String;
var
  i, j: Integer;
  temp: Char;
begin
  Result := 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
  { Fisher-Yates shuffle }
  for i := 26 downto 2 do
  begin
    j := Random(i) + 1;
    { Swap Result[i] and Result[j] }
    temp := Result[i];
    Result[i] := Result[j];
    Result[j] := temp;
  end;
end;

Encryption and Decryption

function SubstitutionEncrypt(const plaintext, key: String): String;
var
  i, idx: Integer;
  c: Char;
begin
  Result := plaintext;
  for i := 1 to Length(Result) do
  begin
    c := Result[i];
    if (c >= 'A') and (c <= 'Z') then
    begin
      idx := Ord(c) - Ord('A') + 1;
      Result[i] := key[idx];
    end
    else if (c >= 'a') and (c <= 'z') then
    begin
      idx := Ord(c) - Ord('a') + 1;
      Result[i] := Chr(Ord(LowerCase(key[idx])[1]));
    end;
  end;
end;

function SubstitutionDecrypt(const ciphertext, key: String): String;
var
  reverseKey: String;
  i: Integer;
begin
  { Build the reverse mapping }
  SetLength(reverseKey, 26);
  for i := 1 to 26 do
    reverseKey[Ord(key[i]) - Ord('A') + 1] := Chr(Ord('A') + i - 1);

  Result := SubstitutionEncrypt(ciphertext, reverseKey);
end;

The decryption function builds a reverse key — if key[3] = 'E' (meaning C encrypts to E), then reverseKey[5] = 'C' (meaning E decrypts back to C).

Testing the Substitution Cipher

var
  key, original, encrypted, decrypted: String;
begin
  Randomize;
  key := GenerateSubstitutionKey;
  WriteLn('Key:       ', key);

  original := 'MEET ME AT THE BRIDGE AT MIDNIGHT';
  encrypted := SubstitutionEncrypt(original, key);
  WriteLn('Original:  ', original);
  WriteLn('Encrypted: ', encrypted);

  decrypted := SubstitutionDecrypt(encrypted, key);
  WriteLn('Decrypted: ', decrypted);

  if original = decrypted then
    WriteLn('Round-trip successful!')
  else
    WriteLn('ERROR: Round-trip failed!');
end.

Part 3: Frequency Analysis Attack

The substitution cipher has 26! (approximately 4 x 10^26) possible keys, so brute force is impractical. However, letter frequency analysis can crack it. In English, E is the most common letter (~12.7%), followed by T (~9.1%), A (~8.2%), and so on.

procedure FrequencyAnalysis(const ciphertext: String);
var
  freq: array['A'..'Z'] of Integer;
  c: Char;
  i: Integer;
  total: Integer;
begin
  for c := 'A' to 'Z' do
    freq[c] := 0;
  total := 0;

  for i := 1 to Length(ciphertext) do
  begin
    c := UpCase(ciphertext[i]);
    if (c >= 'A') and (c <= 'Z') then
    begin
      Inc(freq[c]);
      Inc(total);
    end;
  end;

  WriteLn('Letter Frequency Analysis:');
  WriteLn('--------------------------');
  for c := 'A' to 'Z' do
    if freq[c] > 0 then
      WriteLn(Format('  %s: %3d (%5.1f%%)  %s',
                     [c, freq[c], 100.0 * freq[c] / total,
                      StringOfChar('#', freq[c])]));
end;

This analysis reveals which ciphertext letter probably corresponds to E, which to T, and so on — allowing a cryptanalyst to recover the key without trying all 26! possibilities.

Discussion Questions

The Caesar cipher has 25 possible keys; the substitution cipher has approximately 4 x 10^26. Why is the substitution cipher still considered insecure despite this enormous key space?
Both ciphers preserve spaces and punctuation unchanged. How does this make the ciphers easier to break? What would you change to fix this weakness?
Our CaesarEncrypt function uses mod 26 for wrapping. Why did we write ((shift mod 26) + 26) mod 26 instead of simply shift mod 26? What happens with negative shifts if we use only shift mod 26?
The substitution cipher key is a permutation of the alphabet. How many characters of information does the key contain, and is there a more compact way to represent or communicate it?
What string operations from Chapter 10 were most critical in implementing these ciphers? Could you implement them without character arithmetic (Ord/Chr)?

Extension Activities

Activity 1: Vigenere Cipher. Implement the Vigenere cipher, which uses a keyword to determine a different shift for each position. For example, with keyword KEY, the first letter is shifted by 10 (K=10), the second by 4 (E=4), the third by 24 (Y=24), then back to 10 for the fourth letter, and so on.

Activity 2: Auto-Cracker. Write a program that attempts to automatically crack a Caesar cipher by scoring each possible shift against English letter frequencies. The shift whose output most closely matches typical English frequencies is probably correct.

Activity 3: One-Time Pad. Implement a one-time pad — a substitution where the key is as long as the message and is truly random. Demonstrate that the same ciphertext can decrypt to two completely different plaintexts with two different keys, proving that the one-time pad is theoretically unbreakable.