Case Study: Building a Turing-Machine Simulator and Watching the Undecidable Gap Appear
"What I cannot create, I do not understand." — Richard P. Feynman (found on his blackboard, 1988)
Executive Summary
In this build-focused case study you will construct, from scratch, a working Turing-machine
simulator — the formal model of §36.1 turned into runnable code — and then use it to build the very
boundary the chapter proves. You will write the simulation loop, encode three machines (even length,
length divisible by three, and a unary incrementer that writes to the tape), hand-trace their
configurations, and verify each against its specification. Then comes the payoff: because a real
simulator must decide when to stop simulating, you will see the halting problem appear inside your
own code as the "LOOP?" verdict — the place where the simulator honestly admits it cannot tell
"still working" from "loops forever." Finally you will build the self-application gadget applied_to_self
and trace how the harmless "run a program on itself" move becomes the proof's weapon $D$.
Where Case Study 1 audited a claim, this one builds the artifact. Every line of code here is meant to
run in your head; we hand-derive each # Expected output: exactly as the book requires, and never
execute anything.
Skills applied
- Implementing the §36.1 transition-function-driven simulation loop, including tape extension.
- Encoding decision problems as transition tables and tracing configurations by hand.
- Building a bounded universal simulator and locating the undecidable gap as the
"LOOP?"return. - Constructing the self-application mechanism and connecting it to the diagonal program $D$ (§36.4).
- Verifying a built component against its specification without a single execution.
Background
The scenario
You are adding a small computability.py module to your team's teaching toolkit (the book's
dmtoolkit). The goal is pedagogical: give students a runnable Turing machine so the 7-tuple stops
being abstract, and let them feel the halting problem by hitting it in code. The module needs three
things: a general simulator run_tm, a few example machines, and the applied_to_self helper that
exhibits self-reference. The engineering constraint is the interesting one — a Turing tape is unbounded
and a machine may never halt, but your simulator runs on a real computer with finite time. How your
simulator handles "might run forever" is where the chapter's theorem becomes concrete.
💡 Intuition: A simulator is a recognizer with a stopwatch. Run the machine and, if it reaches an accept/reject state, report it — that part is faithful. But you cannot wait forever to find out it won't halt (§36.4 says no finite procedure can decide that in general). So a practical simulator imposes a step budget and, when the budget is exhausted, returns an honest "I don't know" rather than a dishonest "loops." That
"LOOP?"is the undecidable gap, built right into the API.
Why this matters
Writing the simulator does three things at once. It makes the formal model in §36.1 operational, so "transition function" becomes a dictionary you can index. It demonstrates the Church–Turing thesis from the inside: your Python loop is simulating a Turing machine, which can in turn simulate any computation — the two are inter-translatable (§36.2). And it stages the halting problem honestly: the moment you must choose what to return when the budget runs out, you are face to face with §36.4. Build it once and the abstraction is yours for good.
Phase 1: Build the simulation loop
Start from the definition. A Turing machine is $M = (Q, \Sigma, \Gamma, \delta, q_0, q_{\text{accept}}, q_{\text{reject}})$. The only moving parts at run time are the current state, the head position, and the tape. The loop reads the symbol under the head, looks up $\delta$, writes, moves, and switches state — until it lands in a halting state. We add tape extension (the tape is unbounded, so we grow it with blanks on demand) and, deferring the budget to Phase 3, first build the unbounded loop to match §36.1 exactly.
def run_tm_basic(delta, tape, start="q0", accept="ACC", reject="REJ", blank="_"):
"""Faithful (unbounded) TM loop -- mirrors 36.1. May NEVER return (that is the point)."""
state, head, tape = start, 0, list(tape)
while state not in (accept, reject):
if head < 0: # head walked off the left end
tape.insert(0, blank); head = 0
if head == len(tape): # head walked off the right end
tape.append(blank)
state, write, move = delta[(state, tape[head])]
tape[head] = write
head += 1 if move == "R" else -1
return state
even = {("q0", "a"): ("q1", "a", "R"), ("q0", "_"): ("ACC", "_", "R"),
("q1", "a"): ("q0", "a", "R"), ("q1", "_"): ("REJ", "_", "R")}
print(run_tm_basic(even, "aa"), run_tm_basic(even, "aaa"))
# Expected output:
# ACC REJ
Let's verify, not just trust. Trace aa by writing each configuration (the chapter's notation: tape
with the head's state inserted at the head position):
$$ q_0\,aa \;\to\; a\,q_1\,a \;\to\; aa\,q_0\,\sqcup \;\to\; \text{ACC}. $$
Two a's flip $q_0 \to q_1 \to q_0$; the end-blank in $q_0$ sends us to ACC. For aaa: $q_0 \to q_1
\to q_0 \to q_1$, and the end-blank in $q_1$ sends us to REJ. Output ACC REJ — matching the hand
trace and §36.1.
🔄 Check Your Understanding Trace
run_tm_basic(even, "")(empty input) by hand. What configuration does it start in, what does it read first, and what does it return?
Answer
It starts as $q_0\,\sqcup$ (head over a blank, since the input is empty). The first read is
_in stateq0, whose rule is("ACC", "_", "R"), so it goes straight to ACC. That is correct: zero is an even length.
Phase 2: Encode and verify two more machines
A simulator is only as convincing as the variety of machines it runs. Build two more, each a transition table, and verify each against its spec by hand. The first counts length mod 3; the second actually writes to the tape (so far our machines only read).
# Machine 1: accept iff the number of a's is divisible by 3. States q0,q1,q2 = len mod 3.
div3 = {("q0", "a"): ("q1", "a", "R"), ("q0", "_"): ("ACC", "_", "R"),
("q1", "a"): ("q2", "a", "R"), ("q1", "_"): ("REJ", "_", "R"),
("q2", "a"): ("q0", "a", "R"), ("q2", "_"): ("REJ", "_", "R")}
print(run_tm_basic(div3, "aaa"), run_tm_basic(div3, "aaaa"))
# Expected output:
# ACC REJ
Trace aaa: $q_0 \to q_1 \to q_2 \to q_0$, end-blank in $q_0 \Rightarrow$ ACC (3 is divisible by 3).
Trace aaaa: $q_0 \to q_1 \to q_2 \to q_0 \to q_1$, end-blank in $q_1 \Rightarrow$ REJ (4 is not).
The state index is the running length mod 3, and acceptance checks "mod 3 equals 0" — a clerk counting
on three fingers.
Now a machine that modifies the tape: a unary incrementer. Input is a block of 1s representing a
number in unary ($n$ ones $=$ the number $n$); the machine appends one more 1, computing $n+1$. It
walks right to the first blank and writes a 1 there.
# Machine 2: unary increment. Walk right over 1's to the end-blank; write a 1; accept.
inc = {("q0", "1"): ("q0", "1", "R"), # scan right across the 1's
("q0", "_"): ("q1", "1", "R"), # at the end blank, WRITE a 1, step right
("q1", "_"): ("ACC", "_", "R")} # land on the new blank -> accept
print(run_tm_basic(inc, "11")) # 2 -> 3, tape becomes 111
# Expected output:
# ACC
Trace 11 (the number 2), tracking the tape as it changes:
$$ q_0\,11 \;\to\; 1\,q_0\,1 \;\to\; 11\,q_0\,\sqcup \;\xrightarrow{\text{write }1}\; 111\,q_1\,\sqcup \;\to\; \text{ACC}. $$
At the third step the head is on the end-blank in $q_0$; the rule writes 1 (turning 11_ into 111)
and moves right; the new cell is blank in $q_1$, which accepts. The tape now holds 111 — the number 3.
The machine computed, not just recognized: it transformed its input. This is the feature a DFA lacks
(§36.1, the 🔗 Connection) and the reason Turing machines reach the top of the Chomsky hierarchy.
⚠️ Common Pitfall: forgetting tape extension. A beginner's simulator indexes
tape[head]after the head walks off the written input and crashes with anIndexError. The twoifguards inrun_tm_basic(head < 0andhead == len(tape)) implement the unbounded tape by lazily growing it with blanks. Drop them and your "Turing machine" is secretly bounded — a different, weaker model. The unbounded tape is the one infinite ingredient §36.1 insists on.🔄 Check Your Understanding In the
incmachine, what doesrun_tm_basic(inc, "")(the number 0) return, and what is on the tape when it accepts?
Answer
Start $q_0\,\sqcup$: the head reads
_inq0, so the rule("q1", "1", "R")writes a1and moves right; then_inq1$\Rightarrow$ACC. The tape holds a single1— the number 1 — so it computed $0 + 1 = 1$. It returnsACC.
Phase 3: Build in the step budget — and meet the undecidable gap
Every machine above halts, so the unbounded loop was safe. But the whole point of §36.4 is that some
machines never halt, and your simulator must run on a finite computer. Add a max_steps budget. The
design decision — what to return when the budget is exhausted — is where the halting problem enters
your code.
def run_tm(delta, tape, start="q0", accept="ACC", reject="REJ", blank="_", max_steps=10_000):
"""Bounded TM simulator. Returns 'ACC'/'REJ', or 'LOOP?' if the budget is hit.
'LOOP?' is HONEST: we CANNOT decide halting (36.4), so we never claim 'REJ'/'loops'."""
state, head, tape = start, 0, list(tape)
for _ in range(max_steps):
if state in (accept, reject):
return state
if head < 0:
tape.insert(0, blank); head = 0
if head == len(tape):
tape.append(blank)
state, write, move = delta[(state, tape[head])]
tape[head] = write
head += 1 if move == "R" else -1
return "LOOP?" # budget exhausted: 'maybe loops' -- the undecidable gap, made visible
# A deliberately NON-HALTING machine: it walks right forever, never reaching ACC/REJ.
spin = {("q0", "a"): ("q0", "a", "R"), ("q0", "_"): ("q0", "_", "R")}
print(run_tm(even, "aaaa"), run_tm(even, "aaa"), run_tm(spin, "a", max_steps=50))
# Expected output:
# ACC REJ LOOP?
Trace the three calls. even on aaaa: four flips $q_0\to q_1\to q_0\to q_1\to q_0$, end-blank in
$q_0$ $\Rightarrow$ ACC (reached well within the budget). even on aaa: ends in $q_1$ on the
blank $\Rightarrow$ REJ. spin on a: the head marches right forever — on the a, then across blank
after blank, always in $q_0$, never halting. After 50 steps the budget runs out and the simulator
returns "LOOP?".
Sit with that third return value. The simulator cannot print "REJ" or "loops" for spin,
because that would be claiming to have decided non-halting — exactly what §36.4 forbids. All it can
honestly say is "LOOP?": "I did not see it halt within the budget; I cannot conclude it never will."
Raise max_steps to a million and spin still returns "LOOP?"; there is no finite budget that
converts the honest "LOOP?" into a justified "loops", because halting times are unbounded — some
machines halt at step $N+1$ for any $N$ you pick (the §36.4 Find-the-Error trap, now a fact about your
own simulator).
🚪 Threshold Concept. You did not choose to put the halting problem in your code; it appeared unavoidably the moment you made the simulator finite. Any honest simulator of a Turing-complete model has a
"LOOP?"-shaped hole, because deciding which inputs would have halted is the halting problem itself. This is the chapter's theorem reaching out of the page and into your editor: the limit is not a missing feature you could add next sprint — it is structural.
Phase 3.5: Programs as data — a simulator that reads a machine off its input
So far run_tm takes delta as a Python object. But §36.4's whole enabling move is that a program is
a finite string you can feed to another program (the encoding $\langle P, x\rangle$). To feel that, we
build a small step toward a universal simulator: one that reads a machine's transition table as
text and runs it. This is the Turing-machine version of an interpreter, and it is what makes "run a
program on its own description" (Phase 4) more than a slogan.
We encode a machine as newline-separated rows state,read -> next,write,move, parse them into a delta
dictionary, then hand that to run_tm. The encoder and decoder are pure string manipulation — exactly
the "programs are data" idea (§36.4).
def encode_tm(delta):
"""Serialize a transition table to text: one rule per line."""
return "\n".join(f"{s},{r}->{ns},{w},{mv}" for (s, r), (ns, w, mv) in delta.items())
def decode_tm(text):
"""Parse the text back into a delta dict -- the 'read a machine off the tape' step."""
delta = {}
for line in text.splitlines():
left, right = line.split("->")
s, r = left.split(",")
ns, w, mv = right.split(",")
delta[(s, r)] = (ns, w, mv)
return delta
def universal_run(machine_text, tape, **kw):
"""A mini 'universal' simulator: decode a machine FROM TEXT, then run it (36.2/36.4)."""
return run_tm(decode_tm(machine_text), tape, **kw)
src = encode_tm(even) # 'even' from Phase 1, now as a STRING
print(universal_run(src, "aa"), universal_run(src, "aaa"))
# Expected output:
# ACC REJ
Trace the round trip. encode_tm(even) turns the four-rule table into text such as
q0,a->q1,a,R / q0,_->ACC,_,R / q1,a->q0,a,R / q1,_->REJ,_,R (one rule per line). decode_tm
parses each line back into the same (state, read) -> (next, write, move) entries, reconstructing the
identical delta. universal_run then simulates it: aa $\to$ ACC, aaa $\to$ REJ, matching
Phase 1. The machine survived the trip through text unchanged — which is the entire point.
🔗 Connection.
universal_runis a baby universal Turing machine (§36.2, §36.4): a single machine that, given an encoding of any machine $M$ and an input $w$, simulates $M$ on $w$. Turing's 1936 paper built exactly this — the ancestor of every interpreter, virtual machine, andeval. It is also the construction that lets the halting problem even be stated: $\mathrm{HALT}$ takes $\langle P, x\rangle$ as input precisely because a universal machine can run $P$ from its description.💡 Intuition: Notice that
universal_runinherits the same"LOOP?"gap asrun_tm— feed it the text ofspinand it returns"LOOP?"after the budget. Building the universal layer did not add any power to decide halting; it only made "programs as data" tangible. A universal simulator can run any machine, but it still cannot decide whether the machine it is running will halt — that is §36.4, one level up.⚠️ Common Pitfall: "an interpreter could detect the loop for us." Wrapping a machine in an interpreter (or a VM, or a sandbox) feels like it should grant new visibility into the machine's fate. It does not. The interpreter is itself a program; if it could decide whether the interpreted machine halts, it would be the halting decider §36.4 forbids. More tooling around the execution changes convenience, never the computability of "will it halt?".
Phase 4: Build the self-application gadget and connect it to $D$
The last piece is the mechanism behind the proof: feeding a program its own description. Build the helper, exhibit a safe self-application, then trace how the same shape becomes the contradiction $D$.
def applied_to_self(program):
"""Run a program on its own description -- the diagonal 'run it on itself' move (36.4)."""
return program(program)
def name_of(f): # a one-argument program we can safely apply to itself
return f.__name__
print(applied_to_self(name_of)) # name_of(name_of) -> the string "name_of"
# Expected output:
# name_of
applied_to_self(name_of) evaluates name_of(name_of). name_of ignores the body of its argument
and returns its __name__ attribute, which is the string "name_of". It halts; it is harmless. The
pattern — a program taking itself as input — is all we need; here it is benign.
Now build the weaponized version. We cannot supply a real halting decider H (the theorem forbids
it), so we assume one and show the self-application makes every verdict wrong — the §36.4 construction,
matching the chapter's make_D and example-03.
def make_D(H):
"""Given a (hypothetical) halting decider H, build the contrarian D.
D(p): ask H whether p halts on p, then DO THE OPPOSITE."""
def D(p):
if H(p, p) == "halts":
while True: # H predicted 'halts' -> D loops forever
pass
return "halted" # H predicted 'loops' -> D halts
return D
def trace_D_on_D():
# Consider applied_to_self(D) = D(D). Try both verdicts H could give for D-on-D:
for verdict in ("halts", "loops"):
actual = "loops" if verdict == "halts" else "halts" # D does the opposite
print(f"H(D,D)='{verdict}' -> D(D) actually {actual} -> contradicts H")
trace_D_on_D()
# Expected output:
# H(D,D)='halts' -> D(D) actually loops -> contradicts H
# H(D,D)='loops' -> D(D) actually halts -> contradicts H
The two helpers share one shape: applied_to_self runs program(program); D is run on D, i.e.
D(D). The difference is the body. name_of reads only its argument's name and returns peacefully.
D reads H's prediction about D(D) and then contradicts it — so whatever H would say, D
does the opposite, and no correct, always-halting H can survive. Self-application is not the villain;
self-application combined with the freedom to negate a prediction about yourself is. That is the liar's
paradox compiled into a program, and it is why the decider H you would need to upgrade your simulator's
"LOOP?" into a real "loops" cannot exist.
🔗 Connection. Lay this beside Chapter 10's
diagonal_escape. There, $g(n) = 1 - f_n(n)$ ran down the diagonal of a proposed list of functions and flipped each bit, producing a function on no list. Here,Druns down the diagonal of "every program applied to itself" and flips the halting behavior, producing a behaviorHcannot correctly predict. Same diagonal, same flip — yourapplied_to_selfplusmake_Disdiagonal_escapeaimed at programs (§36.4).
Discussion Questions
- Your
run_tmreturns"LOOP?", never"loops", when the budget is hit. Explain to a teammate why returning"loops"would be a correctness bug (not just a style choice), referencing the bounded versus unbounded distinction in §36.4 and the Find-the-Error callout. - The
incmachine writes to the tape; theevenanddiv3machines only read. Which feature of the Turing-machine definition (§36.1) does writing exercise, and why is it the feature that separates Turing machines from the DFAs of Chapter 35? - Suppose you "improve"
run_tmby detecting repeated configurations (same state, head position, and tape) and returning"REJ"when one recurs, reasoning "a repeat means an infinite loop." Does this makerun_tma halting decider? Explain why it catches some non-halting machines but not all, and connect to why no finite trick closes the gap (§36.4). applied_to_self(name_of)halts butapplied_to_self(D)(i.e.,D(D)) cannot consistently do either. What single property ofD's body, absent fromname_of's, creates the paradox? State it in one sentence.- By the Church–Turing thesis (§36.2), your Python simulator and a Turing machine compute the same
functions. Use that to argue that the
"LOOP?"gap is not an artifact of Python — it would appear in a simulator written in any Turing-complete language. Which sentence of §36.2 makes this airtight?
Your Turn: Extensions
- Option A (binary incrementer). Build a Turing machine that adds 1 to a binary number written on
the tape (handle the carry, including the all-ones case
111 -> 1000). Give its transition table, hand-trace it on11(which is 3, $\to$100), and verify withrun_tm's expected output. - Option B (a decider with a guaranteed bound). Encode a machine that always halts within a number
of steps you can predict from the input length (e.g.,
even), and argue that for this machinerun_tmwill never return"LOOP?"oncemax_stepsexceeds the input length. Contrast withspin. - Option C (configuration-cycle detector). Implement the "repeated configuration $\Rightarrow$
reject" idea from Discussion Question 3 as
run_tm_cycle. Hand-trace it onspin(does it detect the cycle?) and on a machine that loops by growing the tape forever (does it?). Explain, from your two traces, exactly which non-halting machines this catches and which it misses. - Option D (toolkit increment). Package
run_tm,applied_to_self, and your three machines ascomputability.py, matching the chapter's Project Checkpoint. Confirm your signatures match (run_tm(delta, tape, ...),applied_to_self(program)) so the module composes with the rest ofdmtoolkit. - Option E (round-trip a machine). Using the Phase 3.5
encode_tm/decode_tm, encode thediv3machine to text, hand-write the exact linesencode_tmproduces, decode them, and confirmuniversal_runon the decoded text gives the sameACC/REJverdicts as Phase 2. Then explain in two sentences why running a machine from its text encoding is the construction that makes the $\langle P, x\rangle$ input of $\mathrm{HALT}$ meaningful (§36.4).
Key Takeaways
- A Turing machine is a dictionary plus a loop. The transition function is a
(state, symbol) -> (state, write, move)table; simulation reads, writes, moves, and switches state until a halting state is reached. Building it makes the §36.1 model concrete. - Verify built machines by tracing configurations, not by running them. Each machine here was
checked against its spec by hand —
ACC/REJderived from the configuration sequence — exactly the book's no-execution discipline. - The unbounded tape is non-negotiable; the step budget is unavoidable. Lazy blank-extension
implements the one infinite ingredient; a finite computer forces a
max_stepsbound on the otherwise-infinite loop. "LOOP?"is the halting problem, sitting in your code. An honest finite simulator cannot convert "did not halt within the budget" into "loops forever" — that conversion is the undecidable problem itself, and no finite trick (not even cycle detection) closes the gap.- Self-application is harmless until paired with self-negation.
applied_to_self(name_of)halts; the proof'sDrun onDcannot — becauseDreads a prediction about itself and defies it. That is Cantor's diagonal flip (Chapter 10) compiled into a program, and the reason a halting decider — the thing that would fix your simulator's gap — cannot exist.