Chapter 2 Exercises: Cryptographic Foundations

DataField.Dev

Chapter 2 Exercises: Cryptographic Foundations

These exercises progress from conceptual understanding through implementation to analysis. The coding exercises build directly on the chapter's Python examples — you should have the hashlib and ecdsa libraries available. Exercises marked with a star (*) are more challenging and suitable for students with stronger mathematical backgrounds.

Section A: Hash Functions

Exercise 2.1 — Implement a Simple (Insecure) Hash Function

Write a simple hash function that maps any string input to a 4-digit number (0000-9999). Do not use SHA-256 — the point is to build something simple from scratch so you understand the concept.

Requirements: 1. The function must be deterministic (same input always produces the same output). 2. It must accept strings of any length. 3. It should be somewhat uniform (different inputs should produce different outputs most of the time).

Starter code:

def simple_hash(input_string: str) -> str:
    """
    A simple (insecure) hash function that produces a 4-digit output.

    Steps:
    1. Convert each character to its ASCII value.
    2. Combine them using multiplication and addition.
    3. Take modulo 10000 to get a 4-digit result.
    """
    hash_value = 0
    for i, char in enumerate(input_string):
        # YOUR CODE HERE: combine ord(char) with hash_value
        # Hint: use both the character value and its position (i)
        pass
    return str(hash_value).zfill(4)

Tasks: a. Complete the implementation. Test it on at least 10 different inputs. b. Find two different inputs that produce the same hash (a collision). How many inputs did you need to try? How does this compare to the birthday bound for a 4-digit hash (sqrt(10000) = 100)? c. Explain why this function is unsuitable for cryptographic applications. Which of the three security properties does it violate, and why?

Expected output: A working hash function, at least one collision, and a written explanation of its weaknesses.

Exercise 2.2 — The Avalanche Effect, Quantified

Using SHA-256, systematically measure the avalanche effect:

import hashlib

def avalanche_test(base_string: str, num_variants: int = 100) -> dict:
    """
    Modify the last character of base_string through num_variants values.
    For each variant, compute the SHA-256 hash and measure how many bits
    differ from the hash of the original string.

    Return a dictionary with:
    - 'mean_bits_changed': average bits changed across all variants
    - 'min_bits_changed': minimum bits changed
    - 'max_bits_changed': maximum bits changed
    - 'std_dev': standard deviation of bits changed
    """
    # YOUR CODE HERE
    pass

Tasks: a. Implement the function and run it with base_string = "Test message". b. Plot a histogram of the number of changed bits across 1000 variants (use matplotlib if available, or print a text-based histogram). c. What distribution does the histogram approximate? Why does this make sense given the design goals of SHA-256? d. Repeat with a different base string. Does the distribution change? Should it?

Exercise 2.3 — Brute-Force Preimage Attack

Write a program that attempts a brute-force preimage attack on a truncated hash:

import hashlib
import time

def brute_force_preimage(target_prefix: str, max_attempts: int = 10_000_000) -> dict:
    """
    Find a string whose SHA-256 hash starts with target_prefix.

    For example, if target_prefix = "0000", find any string m such that
    SHA-256(m) starts with "0000".

    Return a dict with 'input', 'hash', 'attempts', and 'time'.
    """
    # YOUR CODE HERE
    pass

Tasks: a. Find a preimage for target prefix "0" (4 bits). How many attempts? b. Find a preimage for "00" (8 bits). How many attempts? c. Repeat for "000" (12 bits), "0000" (16 bits), and "00000" (20 bits). d. Plot the number of attempts vs. the number of leading zero bits. What is the relationship? Does this match the theoretical expectation of 2^n attempts for n leading zero bits? e. Explain how this exercise relates to Bitcoin mining. What is the "difficulty" in proof-of-work?

Exercise 2.4 — Hash Chain

A hash chain is a sequence where each element is the hash of the previous one:

h0 = SHA-256("seed")
h1 = SHA-256(h0)
h2 = SHA-256(h1)
...

Tasks: a. Generate a hash chain of length 1000 starting from the seed "blockchain-ch2". b. Verify that you can traverse the chain forward (given h_n, compute h_{n+1}) but not backward (given h_{n+1}, you cannot efficiently find h_n). c. Use your hash chain to build a simple one-time password system: the server stores h_1000; the user authenticates by presenting h_999, then h_998, etc. Implement generate_chain(), authenticate(), and demonstrate 5 successful authentications. d. Explain why this one-time password system is secure even if an attacker observes a used password.

Exercise 2.5 — Birthday Attack Simulation*

Empirically verify the birthday paradox for hash functions:

import hashlib
import random
import string

def birthday_attack(hash_bits: int, num_trials: int = 100) -> float:
    """
    Simulate a birthday attack on a hash function truncated to hash_bits bits.

    For each trial:
    1. Generate random strings and compute their truncated hashes.
    2. Stop when you find a collision.
    3. Record the number of hashes computed.

    Return the average number of hashes needed across all trials.
    """
    # YOUR CODE HERE
    pass

Tasks: a. Run the simulation for 8, 12, 16, 20, and 24 bits (truncated SHA-256). b. For each bit length, compare the average number of hashes to the theoretical birthday bound (sqrt(pi/2 * 2^n)). c. Why is it infeasible to run this experiment for the full 256-bit SHA-256? d. If quantum computers provide a cubic root improvement via Grover's algorithm applied to the birthday problem (they do not — Grover provides a square root improvement on preimage attacks, and the birthday bound is already a square root), how would this change the collision resistance of SHA-256? Research and state the actual quantum collision-finding complexity.

Section B: Public-Key Cryptography and Digital Signatures

Exercise 2.6 — Key Generation and Address Derivation

Using the ecdsa library, write a program that:

from ecdsa import SECP256k1, SigningKey
import hashlib

def generate_keypair():
    """Generate a private key, public key, and simplified address."""
    # YOUR CODE HERE
    # Return: (private_key_hex, public_key_hex, address)
    pass

def verify_derivation(private_key_hex: str):
    """Given a private key hex string, re-derive the public key and address."""
    # YOUR CODE HERE
    pass

Tasks: a. Generate 5 key pairs and display them in a formatted table. b. Implement verify_derivation: given a private key, show that you can always re-derive the same public key and address. c. Generate 10,000 key pairs and verify that no two produce the same public key. What is the probability of a collision in 10,000 keys? How does this compare to 2^256? d. Measure the time to generate 10,000 key pairs. Extrapolate: how long would it take to generate 2^128 key pairs (the birthday bound for 256-bit keys)?

Exercise 2.7 — Digital Signature Verification Lab

Create a comprehensive signature verification lab:

from ecdsa import SECP256k1, SigningKey, BadSignatureError
import hashlib

class SignatureVerificationLab:
    def __init__(self):
        self.alice_private = SigningKey.generate(curve=SECP256k1)
        self.alice_public = self.alice_private.get_verifying_key()
        self.bob_private = SigningKey.generate(curve=SECP256k1)
        self.bob_public = self.bob_private.get_verifying_key()

    def test_valid_signature(self):
        """Sign a message with Alice's key, verify with Alice's public key."""
        # YOUR CODE HERE
        pass

    def test_tampered_message(self):
        """Sign a message, modify it, show verification fails."""
        # YOUR CODE HERE
        pass

    def test_wrong_key(self):
        """Sign with Alice's key, verify with Bob's key — should fail."""
        # YOUR CODE HERE
        pass

    def test_signature_uniqueness(self):
        """Sign the same message twice — are the signatures identical?"""
        # Hint: RFC 6979 deterministic nonces mean yes for the same key
        # YOUR CODE HERE
        pass

    def test_different_messages(self):
        """Sign two different messages — show signatures are different."""
        # YOUR CODE HERE
        pass

Tasks: a. Implement all five test methods. Each should print a clear PASS/FAIL result. b. For test_signature_uniqueness: the ecdsa library uses RFC 6979 (deterministic nonces). Research why deterministic nonces are preferred over random nonces. Write a 200-word explanation. c. Add a sixth test: test_signature_malleability. Research ECDSA signature malleability and demonstrate it. Explain why this was a problem for early Bitcoin and how BIP 66 (strict DER encoding) addressed it.

Exercise 2.8 — Transaction Signing System

Build a simple transaction signing and verification system:

import json
from ecdsa import SECP256k1, SigningKey
import hashlib

class Wallet:
    def __init__(self, name: str):
        self.name = name
        self.private_key = SigningKey.generate(curve=SECP256k1)
        self.public_key = self.private_key.get_verifying_key()
        self.address = hashlib.sha256(
            self.public_key.to_string()
        ).hexdigest()[:40]

    def sign_transaction(self, recipient_address: str, amount: float) -> dict:
        """Create and sign a transaction."""
        # YOUR CODE HERE
        # Return a dict with: sender, recipient, amount, signature, public_key
        pass

class TransactionVerifier:
    @staticmethod
    def verify(transaction: dict) -> bool:
        """Verify a signed transaction."""
        # YOUR CODE HERE
        pass

    @staticmethod
    def verify_batch(transactions: list) -> dict:
        """Verify a batch of transactions. Return counts of valid/invalid."""
        # YOUR CODE HERE
        pass

Tasks: a. Implement the Wallet and TransactionVerifier classes. b. Create three wallets (Alice, Bob, Charlie). Generate 5 transactions between them. c. Verify all 5 transactions. d. Create a forged transaction (someone tries to spend from Alice's address without her private key). Verify that it is rejected. e. Measure the time to sign and verify 1000 transactions. What are the bottlenecks?

Exercise 2.9 — ECDSA Nonce Reuse Attack*

Demonstrate the catastrophic consequence of ECDSA nonce reuse:

Background: When the same nonce k is used for two different messages, the private key can be extracted. The math: - Given signatures (r, s1) and (r, s2) for messages with hashes z1 and z2: - k = (z1 - z2) * (s1 - s2)^(-1) mod n - private_key = (s1 * k - z1) * r^(-1) mod n

Task: Using the low-level ECDSA operations in the ecdsa library, demonstrate this attack: a. Generate a key pair. b. Sign two different messages using the same (fixed) nonce. c. Extract the private key from the two signatures. d. Verify that the extracted key matches the original. e. Write a 300-word analysis of the 2013 Android Bitcoin wallet vulnerability (Case Study 1), explaining how this same flaw was exploited in the wild.

Note: This exercise demonstrates a real attack. Never use fixed or predictable nonces in production code. The ecdsa library's default behavior (RFC 6979) prevents this.

Section C: Merkle Trees

Exercise 2.10 — Merkle Tree Visualization

Extend the chapter's MerkleTree class with a visualization method:

def print_tree(self):
    """Print a visual representation of the Merkle tree."""
    # Show each level of the tree with connecting lines
    # Use truncated hashes (first 8 hex chars) for readability
    # YOUR CODE HERE
    pass

Tasks: a. Implement print_tree() and display a tree with 8 transactions. b. Display a tree with 5 transactions (odd number). Show how the duplicate leaf is handled. c. Display a tree with 1 transaction. What is the Merkle root in this case?

Exercise 2.11 — Merkle Proof Verification Chain

Write a program that:

Creates a Merkle tree from 16 transactions.
Generates proofs for all 16 transactions.
Verifies all 16 proofs.
Modifies one transaction and shows that its proof (and only its proof) becomes invalid while all other proofs remain valid.

def full_merkle_test(num_transactions: int = 16):
    """
    Create a tree, prove all transactions, modify one, re-verify.
    Print a detailed report of what changed and what didn't.
    """
    # YOUR CODE HERE
    pass

Tasks: a. Implement and run the test. b. When a transaction is modified, which nodes in the tree change? Trace the path from the modified leaf to the root. c. How many nodes change out of the total? Express this as a function of n (total transactions) and i (index of the modified transaction).

Exercise 2.12 — Merkle Tree Performance Benchmarking

Measure the performance characteristics of Merkle trees:

import time

def benchmark_merkle(sizes: list = [100, 1000, 10000, 100000]):
    """
    For each size in sizes:
    1. Build a Merkle tree with that many transactions.
    2. Generate a proof for a random transaction.
    3. Verify the proof.
    Measure and report time for each operation.
    """
    # YOUR CODE HERE
    pass

Tasks: a. Run benchmarks for 100, 1,000, 10,000, and 100,000 transactions. b. For each size, report: tree construction time, proof generation time, proof verification time, and proof size (number of hashes). c. Plot proof size vs. number of transactions. Verify that it is O(log n). d. A Bitcoin block can have up to ~2,700 transactions. Based on your benchmarks, how long does it take to verify a single Merkle proof for a Bitcoin-sized block?

Exercise 2.13 — Merkle Tree for File Integrity*

Extend the Merkle tree concept to verify file integrity:

import os
import hashlib

class FileMerkleTree:
    def __init__(self, directory_path: str, chunk_size: int = 4096):
        """
        Build a Merkle tree over the contents of a directory.
        Each file is split into chunks, each chunk is a leaf.
        """
        # YOUR CODE HERE
        pass

    def get_file_proof(self, filename: str) -> dict:
        """Generate a proof that a specific file's content is part of the tree."""
        # YOUR CODE HERE
        pass

    def verify_file(self, filename: str, proof: dict) -> bool:
        """Verify that a file matches its proof against the Merkle root."""
        # YOUR CODE HERE
        pass

Tasks: a. Implement FileMerkleTree for a directory of your choice (create test files if needed). b. Generate the Merkle root for the directory. c. Modify one file and show that the root changes. d. Explain how this concept is used in torrent files (BitTorrent) and content-addressed storage (IPFS).

Section D: Integration and Analysis

Exercise 2.14 — Cryptographic Primitives Integration

Combine all four primitives to simulate a simplified blockchain transaction flow:

def simulate_transaction_flow():
    """
    1. Alice generates a key pair.
    2. Alice constructs a transaction to send funds to Bob.
    3. Alice signs the transaction with ECDSA.
    4. A "miner" collects transactions and builds a Merkle tree.
    5. The miner creates a block header including the Merkle root and previous hash.
    6. The miner performs (simplified) proof-of-work.
    7. A "verifier" checks: signature, Merkle proof, and block hash.
    Print each step with the cryptographic values involved.
    """
    # YOUR CODE HERE
    pass

Tasks: a. Implement the full flow and print detailed output for each step. b. Intentionally break each component (forge a signature, tamper with a transaction, modify the Merkle tree) and show that verification fails. c. Measure the time for each step. Which is the bottleneck?

Exercise 2.15 — Written Analysis: Cryptographic Assumptions

Write a 500-word essay answering the following question:

"What happens to blockchain security if any one of the three cryptographic properties of hash functions (preimage resistance, second preimage resistance, collision resistance) is broken, while the others remain intact?"

Address each property separately: a. If only preimage resistance breaks, what attacks become possible? b. If only second preimage resistance breaks, what attacks become possible? c. If only collision resistance breaks, what attacks become possible? d. Which property is most critical for blockchain security, and why?

Reference the SHA-1 collision (SHAttered, 2017) as an example of collision resistance being broken while preimage resistance remains intact.

Exercise 2.16 — Research Exercise: Beyond secp256k1

Research and write a comparison (400-500 words) of three elliptic curves: - secp256k1 (used by Bitcoin, Ethereum) - Curve25519 (used by Signal, Tor, and many modern protocols) - secp256r1/P-256 (NIST standard, used by many government and enterprise systems)

Address: a. Why did Satoshi choose secp256k1 over the NIST standard P-256? b. What is the "nothing-up-my-sleeve" concern with P-256, and does secp256k1 address it? c. Why have some newer projects (like Ed25519-based systems) chosen Curve25519 over secp256k1? d. If you were designing a new blockchain in 2026, which curve would you choose and why?