Learning Objectives

  • Analyze the Klein-Gordon equation and identify its problems for single-particle quantum mechanics
  • Derive the Dirac equation by linearizing the relativistic energy-momentum relation
  • Show that spin emerges as a necessary consequence of combining relativity with quantum mechanics
  • Interpret the negative-energy solutions and connect them to the prediction of antimatter
  • Explain why the marriage of quantum mechanics and special relativity ultimately demands quantum field theory

Chapter 29: Relativistic Quantum Mechanics: The Dirac Equation and What Comes Next

"The equation was more intelligent than I was." — Paul Dirac, on his equation predicting antimatter

"It is always the way — each new discovery reveals more questions than it answers." — attributed to Frank Wilczek

Everything we have built in this textbook so far rests on a foundation that, if you examine it honestly, has a crack running through it. The Schrodinger equation — the backbone of Chapters 2 through 28 — is not Lorentz invariant. It treats time and space on fundamentally different footings: first-order in time, second-order in space. This asymmetry is perfectly acceptable in the non-relativistic domain where $v \ll c$, and every result we have obtained remains valid in that regime. But the universe does not care about our convenience. Electrons in heavy atoms move at substantial fractions of the speed of light. Particle accelerators routinely create particles with kinetic energies far exceeding their rest mass energy. And the most striking prediction of relativistic quantum mechanics — the existence of antimatter — has no analogue whatsoever in the non-relativistic theory.

This chapter tells the story of what happens when you take the marriage of quantum mechanics and special relativity seriously. The first attempt — the Klein-Gordon equation — gets some things right but fails catastrophically as a single-particle wave equation. Dirac's brilliant resolution — a first-order relativistic wave equation for spin-1/2 particles — is one of the great triumphs of theoretical physics, predicting both electron spin and antimatter from pure mathematical reasoning. But even Dirac's equation, perfect as it is, points beyond itself to a deeper framework: quantum field theory, where particles can be created and destroyed, and the vacuum itself seethes with activity.

We will derive everything from first principles, maintaining the Dirac notation you have used since Chapter 8 and connecting every result back to the hydrogen atom — the running example that ties this textbook together.

🏃 Fast Track: If you are primarily interested in the conceptual story — why spin is a relativistic effect, how antimatter was predicted, and why QFT is necessary — you can read Sections 29.1, 29.4, 29.6, 29.8–29.9, and 29.11, skipping the technical derivations in Sections 29.2–29.3 and 29.5. But the derivations are where the real magic lives, and this chapter is one of those rare cases where the mathematics is genuinely more illuminating than any verbal summary.


29.1 Special Relativity Meets Quantum Mechanics

The Problem in One Sentence

The Schrodinger equation is built on the non-relativistic energy-momentum relation:

$$E = \frac{p^2}{2m}$$

which becomes, after the canonical quantization prescription $E \to i\hbar\partial_t$ and $\mathbf{p} \to -i\hbar\nabla$:

$$i\hbar\frac{\partial}{\partial t}|\Psi\rangle = -\frac{\hbar^2}{2m}\nabla^2|\Psi\rangle$$

This equation treats time and space asymmetrically — first derivative in $t$, second derivative in $\mathbf{x}$ — and is manifestly not Lorentz covariant. Under a Lorentz boost, it does not transform into an equation of the same form. For non-relativistic systems, this is fine; the Schrodinger equation is the correct $v/c \to 0$ limit of any relativistic wave equation. But when particles move at speeds comparable to $c$, or when we need the precision that relativistic corrections provide (as in the fine structure of hydrogen), we must start from the correct relativistic energy-momentum relation.

When Does Relativity Matter?

To appreciate when relativistic effects become important, consider the speed of an electron in the ground state of a hydrogen-like atom with nuclear charge $Z$:

$$v \sim Z\alpha c$$

where $\alpha \approx 1/137$ is the fine-structure constant. For hydrogen ($Z = 1$), the electron moves at about $0.7\%$ of the speed of light: comfortably non-relativistic, though the $\alpha^2 \approx 5 \times 10^{-5}$ corrections are measurable as fine structure. For uranium ($Z = 92$), the inner electrons reach $v \sim 0.67c$, and relativistic effects are not perturbative corrections but dominant features of the physics. For a hydrogen-like uranium ion with a point nucleus, the $1s$ binding energy is about 15% larger than the non-relativistic prediction (roughly 132 keV versus 115 keV). Heavy-element chemistry (why gold is yellow, why mercury is liquid at room temperature, why lead-acid batteries work) depends on relativistic quantum mechanics.
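As a quick numerical illustration of the estimate $v \sim Z\alpha c$, the sketch below tabulates $v/c$ and $(Z\alpha)^2$ for a few nuclei. The function name `electron_speed_fraction` and the choice of elements are ours, and the estimate ignores screening by the other electrons, so treat the output as an order-of-magnitude guide only.

```python
# Rough estimate v/c ~ Z * alpha for the 1s electron of a hydrogen-like ion.
ALPHA = 1 / 137.035999  # fine-structure constant (dimensionless)


def electron_speed_fraction(Z: int) -> float:
    """Return the order-of-magnitude estimate v/c = Z * alpha."""
    return Z * ALPHA


if __name__ == "__main__":
    for name, Z in [("hydrogen", 1), ("gold", 79), ("uranium", 92)]:
        beta = electron_speed_fraction(Z)
        # beta**2 sets the relative size of the leading relativistic correction
        print(f"{name:9s} Z={Z:3d}  v/c ~ {beta:.4f}  (Z*alpha)^2 ~ {beta**2:.2e}")
```

For light atoms $(Z\alpha)^2$ is tiny and perturbation theory is appropriate; for heavy atoms it approaches order one, which is why relativity stops being a small correction there.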

📊 By the Numbers: The fine-structure constant $\alpha = e^2/(4\pi\epsilon_0\hbar c) \approx 1/137.036$ is the fundamental coupling constant of electromagnetism. It sets the scale of all relativistic corrections in atomic physics: fine structure is $O(\alpha^2)$ relative to the gross structure, the Lamb shift is $O(\alpha^3)$, and so on. The remarkable smallness of $\alpha$ is what makes non-relativistic quantum mechanics work as well as it does for light atoms.

The Relativistic Energy-Momentum Relation

Special relativity tells us that the energy $E$ and momentum $\mathbf{p}$ of a free particle of rest mass $m$ satisfy:

$$E^2 = p^2c^2 + m^2c^4 \tag{29.1}$$

or equivalently, in terms of the four-momentum $p^\mu = (E/c, \mathbf{p})$:

$$p^\mu p_\mu = m^2c^2 \tag{29.2}$$

This is the equation we must quantize. But how? The naive approach — take the square root to get $E = \sqrt{p^2c^2 + m^2c^4}$ and then replace $E$ and $\mathbf{p}$ with operators — immediately runs into trouble. The square root of an operator involving spatial derivatives is not a well-defined local differential operator. What does $\sqrt{-\hbar^2c^2\nabla^2 + m^2c^4}$ even mean?

There are two strategies:

  1. Don't take the square root. Quantize $E^2 = p^2c^2 + m^2c^4$ directly. This gives the Klein-Gordon equation.
  2. Find a way to take the square root. This is Dirac's stroke of genius, yielding the Dirac equation.

Both strategies are physically important, and both teach us something essential about the structure of relativistic quantum mechanics. Let us pursue them in order.

🔗 Connection: The energy-momentum relation $E^2 = p^2c^2 + m^2c^4$ was first derived in Chapter 7 (Section 7.8) as the relativistic generalization needed for high-energy time evolution. Here we finally build a full theory on it.


29.2 The Klein-Gordon Equation: First Attempt

Derivation

We quantize Eq. (29.1) directly, without taking the square root. Replace:

$$E \to i\hbar\frac{\partial}{\partial t}, \qquad \mathbf{p} \to -i\hbar\nabla$$

Since we are quantizing $E^2$, we need $(i\hbar\partial_t)^2 = -\hbar^2\partial_t^2$:

$$-\hbar^2\frac{\partial^2\phi}{\partial t^2} = -\hbar^2 c^2\nabla^2\phi + m^2c^4\phi$$

Rearranging:

$$\frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2} - \nabla^2\phi + \frac{m^2c^2}{\hbar^2}\phi = 0 \tag{29.3}$$

This is the Klein-Gordon equation. It was first written down by Schrodinger himself in 1925 (before his famous non-relativistic equation!) and independently by Oskar Klein and Walter Gordon in 1926–1927.

In the compact notation of four-vectors, define $\partial_\mu = (\frac{1}{c}\frac{\partial}{\partial t}, \nabla)$ and $\partial^\mu = (\frac{1}{c}\frac{\partial}{\partial t}, -\nabla)$, so that $\partial_\mu\partial^\mu = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2 \equiv \Box$ (the d'Alembertian). Then the Klein-Gordon equation becomes:

$$\left(\Box + \frac{m^2c^2}{\hbar^2}\right)\phi = 0 \tag{29.4}$$

This is beautifully Lorentz covariant. The d'Alembertian $\Box$ is a Lorentz scalar, $m^2c^2/\hbar^2$ is a Lorentz scalar, and the equation transforms properly under Lorentz transformations. From a purely mathematical standpoint, this is exactly what we wanted: a relativistic wave equation.

Plane Wave Solutions

The plane wave solutions of the Klein-Gordon equation are:

$$\phi(\mathbf{r}, t) = A\exp\left(\frac{i}{\hbar}(\mathbf{p}\cdot\mathbf{r} - Et)\right)$$

Substitution into Eq. (29.3) yields the dispersion relation $E^2 = p^2c^2 + m^2c^4$, exactly as required. But there are two solutions for $E$:

$$E = \pm\sqrt{p^2c^2 + m^2c^4} \tag{29.5}$$

The positive-energy solutions are expected — they correspond to the familiar relativistic particles. But the negative-energy solutions cannot be dismissed. They are mathematically valid solutions of the Klein-Gordon equation, and any attempt to restrict to positive energies alone fails because the positive-energy and negative-energy subspaces mix under time evolution in the presence of interactions. We will return to this problem.

📊 By the Numbers: For an electron at rest ($p = 0$), the two solutions give $E = +m_ec^2 = +0.511$ MeV and $E = -m_ec^2 = -0.511$ MeV. The negative-energy solution is separated from the positive by a gap of $2m_ec^2 \approx 1.022$ MeV. This energy gap will turn out to be the threshold for electron-positron pair creation.
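The two branches of Eq. (29.5) are easy to tabulate. The following is a minimal sketch; the helper name `kg_energies` and the sample momenta are our own choices, and energies are expressed in MeV with $pc$ as the momentum variable.

```python
import math

M_E_C2_MEV = 0.51099895  # electron rest energy in MeV


def kg_energies(pc_mev: float, mc2_mev: float = M_E_C2_MEV) -> tuple[float, float]:
    """Both roots of E^2 = (pc)^2 + (mc^2)^2, Eq. (29.5), in MeV."""
    e = math.hypot(pc_mev, mc2_mev)
    return e, -e


if __name__ == "__main__":
    for pc in (0.0, 0.5, 5.0):
        e_plus, e_minus = kg_energies(pc)
        print(f"pc = {pc:4.1f} MeV:  E = {e_plus:+.3f} or {e_minus:+.3f} MeV, "
              f"gap = {e_plus - e_minus:.3f} MeV")
```

The gap between the branches is smallest at $p = 0$, where it equals $2m_ec^2$, and grows with momentum.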


29.3 Problems with Klein-Gordon: Negative Probability

The Klein-Gordon equation passes the test of Lorentz covariance with flying colors. But it fails as a single-particle quantum mechanical wave equation for a devastating reason: it does not yield a positive-definite probability density.

The Continuity Equation

In non-relativistic quantum mechanics, the probability density is $\rho = |\psi|^2 = \psi^*\psi$, and the continuity equation $\partial\rho/\partial t + \nabla\cdot\mathbf{j} = 0$ ensures conservation of total probability. We derived this in Chapter 2 by taking the Schrodinger equation times $\psi^*$ minus the complex conjugate Schrodinger equation times $\psi$.

Let us try the same procedure with the Klein-Gordon equation. Take $\phi^*$ times Eq. (29.3), subtract $\phi$ times the complex conjugate of Eq. (29.3):

$$\phi^*\left(\frac{1}{c^2}\partial_t^2\phi - \nabla^2\phi + \frac{m^2c^2}{\hbar^2}\phi\right) = 0$$

$$\phi\left(\frac{1}{c^2}\partial_t^2\phi^* - \nabla^2\phi^* + \frac{m^2c^2}{\hbar^2}\phi^*\right) = 0$$

Subtracting:

$$\frac{1}{c^2}\left(\phi^*\partial_t^2\phi - \phi\partial_t^2\phi^*\right) - \left(\phi^*\nabla^2\phi - \phi\nabla^2\phi^*\right) = 0$$

The spatial part gives us $\nabla\cdot(\phi^*\nabla\phi - \phi\nabla\phi^*)$ as before. The temporal part gives $\partial_t(\phi^*\partial_t\phi - \phi\partial_t\phi^*)$. We can write this as a continuity equation:

$$\frac{\partial\rho_{\text{KG}}}{\partial t} + \nabla\cdot\mathbf{j}_{\text{KG}} = 0 \tag{29.6}$$

with:

$$\rho_{\text{KG}} = \frac{i\hbar}{2mc^2}\left(\phi^*\frac{\partial\phi}{\partial t} - \phi\frac{\partial\phi^*}{\partial t}\right) \tag{29.7}$$

$$\mathbf{j}_{\text{KG}} = -\frac{i\hbar}{2m}\left(\phi^*\nabla\phi - \phi\nabla\phi^*\right) \tag{29.8}$$

The probability current $\mathbf{j}_{\text{KG}}$ looks exactly like the non-relativistic one (as it should). But look at $\rho_{\text{KG}}$. It involves $\partial\phi/\partial t$, which can have either sign. For a positive-energy plane wave, $\phi \propto e^{-iEt/\hbar}$ gives $\rho_{\text{KG}} > 0$. But for a negative-energy plane wave, $\phi \propto e^{+i|E|t/\hbar}$ gives $\rho_{\text{KG}} < 0$.

A negative probability density is physically meaningless.
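The sign flip can be checked directly by evaluating Eq. (29.7) on a plane wave. The sketch below works in units $\hbar = c = 1$ and uses the analytic time derivative of $\phi = e^{i(px - Et)}$; the function name `kg_density` and the sample values of $m$, $p$, $x$, $t$ are illustrative assumptions.

```python
import cmath
import math


def kg_density(E: float, m: float = 1.0, p: float = 0.5,
               x: float = 0.3, t: float = 0.7) -> float:
    """rho_KG of Eq. (29.7) for phi = exp(i(p x - E t)), in units hbar = c = 1."""
    phi = cmath.exp(1j * (p * x - E * t))
    dphi_dt = -1j * E * phi            # analytic time derivative of the plane wave
    rho = (1j / (2 * m)) * (phi.conjugate() * dphi_dt - phi * dphi_dt.conjugate())
    return rho.real                    # the imaginary part vanishes identically


if __name__ == "__main__":
    m, p = 1.0, 0.5
    E = math.sqrt(p**2 + m**2)
    print("positive-energy branch:", kg_density(+E, m, p))   # equals +E/m
    print("negative-energy branch:", kg_density(-E, m, p))   # equals -E/m
```

For a plane wave the density reduces to $\rho_{\text{KG}} = (E/mc^2)|\phi|^2$, so its sign simply follows the sign of $E$.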

Why This Happens: Second-Order in Time

The root cause is that the Klein-Gordon equation is second-order in time. To specify the initial conditions for a second-order equation, you need both $\phi(\mathbf{r}, 0)$ and $\dot{\phi}(\mathbf{r}, 0)$. This means $\rho_{\text{KG}}$ — which depends on $\dot{\phi}$ — is not determined by $\phi$ alone. You can choose initial conditions that make $\rho_{\text{KG}}$ negative anywhere you like.

By contrast, the Schrodinger equation is first-order in time. You need only $\psi(\mathbf{r}, 0)$ to specify the state, and $\rho = |\psi|^2 \geq 0$ automatically.

Other Problems

The Klein-Gordon equation has additional difficulties as a single-particle equation:

  1. Negative-energy solutions (discussed above) — these cannot be simply discarded because interactions mix positive and negative energies.

  2. No spin — the Klein-Gordon field $\phi$ is a scalar. It describes spin-0 particles (like pions), not spin-1/2 particles (like electrons). Nature's most common matter particles have spin-1/2.

  3. The Klein paradox: a Klein-Gordon particle encountering a step potential $V_0 > 2mc^2$ shows a reflection probability greater than one (with a correspondingly negative transmission coefficient), suggesting that particles are being created at the barrier. This is not a bug but a hint: single-particle relativistic quantum mechanics is inconsistent whenever energies are large enough to create particle-antiparticle pairs. The threshold energy $2mc^2$ is precisely the energy needed to pull a particle-antiparticle pair out of the vacuum, and any potential strong enough to exceed this threshold will inevitably produce pairs. A single-particle theory has no vocabulary for this process.

  4. Superluminal propagation — the Klein-Gordon propagator $G(\mathbf{r}, t; \mathbf{r}', 0)$ is nonzero outside the light cone (for $|\mathbf{r} - \mathbf{r}'| > ct$). This means the particle has a nonzero amplitude to travel faster than light, violating relativistic causality. In QFT, this is resolved by the cancellation between particle and antiparticle contributions — the commutator of field operators vanishes at spacelike separation.

⚠️ Common Misconception: The Klein-Gordon equation is not "wrong." It is the correct relativistic wave equation for spin-0 particles when reinterpreted within quantum field theory, where $\phi$ becomes a field operator and $\rho_{\text{KG}}$ is reinterpreted as a charge density (which can be negative). The equation is wrong only as a single-particle wave equation with probabilistic interpretation.

💡 Key Insight: The problems of the Klein-Gordon equation are not mathematical failures — they are physical messages. Nature is telling us three things: (1) electrons are not described by scalar wave equations, (2) single-particle descriptions break down at relativistic energies, and (3) the correct framework must somehow accommodate particle creation and annihilation. All three messages will be heeded in what follows.


29.4 Dirac's Equation: Linearizing the Energy-Momentum Relation

Dirac's Insight

In late 1927, Paul Adrien Maurice Dirac — then 25 years old and a fellow of St John's College, Cambridge — set out to find a relativistic wave equation for the electron. He understood the problems with the Klein-Gordon equation. His key demands were:

  1. First-order in time — so that the probability density $\rho = \psi^\dagger\psi$ is automatically positive-definite, just as in the Schrodinger equation.
  2. Lorentz covariant — the equation must have the same form in every inertial frame.
  3. Consistent with the relativistic energy-momentum relation — iterating the equation must recover $E^2 = p^2c^2 + m^2c^4$.

Taken together, the first two conditions require the equation to be first-order in both $\partial/\partial t$ and $\nabla$ (since Lorentz covariance treats $t$ and $\mathbf{x}$ symmetrically). Dirac therefore sought an equation of the form:

$$i\hbar\frac{\partial\psi}{\partial t} = \left(c\,\boldsymbol{\alpha}\cdot\hat{\mathbf{p}} + \beta mc^2\right)\psi \tag{29.9}$$

where $\hat{\mathbf{p}} = -i\hbar\nabla$, and $\boldsymbol{\alpha} = (\alpha_1, \alpha_2, \alpha_3)$ and $\beta$ are objects to be determined. The Hamiltonian is:

$$\hat{H}_D = c\,\boldsymbol{\alpha}\cdot\hat{\mathbf{p}} + \beta mc^2 \tag{29.10}$$

Determining the Constraints on $\alpha_i$ and $\beta$

For this equation to be consistent with special relativity, it must reproduce the correct energy-momentum relation when iterated. Apply $\hat{H}_D$ twice:

$$\hat{H}_D^2 = \left(c\,\boldsymbol{\alpha}\cdot\hat{\mathbf{p}} + \beta mc^2\right)^2$$

Expand, being very careful not to assume that $\alpha_i$ and $\beta$ commute with each other (they might be matrices):

$$\hat{H}_D^2 = c^2\sum_{i,j}\alpha_i\alpha_j\hat{p}_i\hat{p}_j + mc^3\sum_i(\alpha_i\beta + \beta\alpha_i)\hat{p}_i + \beta^2 m^2c^4$$

For this to equal $E^2 = p^2c^2 + m^2c^4$ (as an operator equation $\hat{H}_D^2 = c^2\hat{p}^2 + m^2c^4$), we need:

$$c^2\sum_{i,j}\alpha_i\alpha_j\hat{p}_i\hat{p}_j = c^2\hat{p}^2 = c^2\sum_i\hat{p}_i^2$$

Since $\hat{p}_i\hat{p}_j = \hat{p}_j\hat{p}_i$, the symmetric part of $\alpha_i\alpha_j$ is what matters:

$$\frac{1}{2}(\alpha_i\alpha_j + \alpha_j\alpha_i) = \delta_{ij}\mathbb{I} \tag{29.11}$$

For the cross-terms to vanish:

$$\alpha_i\beta + \beta\alpha_i = 0 \qquad \text{for all } i \tag{29.12}$$

For the mass term:

$$\beta^2 = \mathbb{I} \tag{29.13}$$

From Eq. (29.11) with $i = j$: $\alpha_i^2 = \mathbb{I}$ for each $i$.

These conditions are impossible to satisfy if $\alpha_i$ and $\beta$ are ordinary numbers. But they can be satisfied if $\alpha_i$ and $\beta$ are matrices. Dirac's great realization was this: the "square root" of the relativistic energy-momentum relation requires the wave function to have multiple components. The electron is not described by a single complex function but by a multi-component spinor.
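To see the obstruction concretely, suppose for a moment that $\alpha_i$ and $\beta$ were ordinary commuting numbers. Then Eq. (29.12) would read

$$\alpha_i\beta + \beta\alpha_i = 2\alpha_i\beta = 0 \quad\Longrightarrow\quad \alpha_i = 0 \ \ \text{or} \ \ \beta = 0,$$

which contradicts $\alpha_i^2 = 1$ and $\beta^2 = 1$. Only non-commuting objects can anticommute with one another while each still squares to the identity.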

What Size Must the Matrices Be?

From the conditions above, $\alpha_i$ and $\beta$ are Hermitian matrices (to make $\hat{H}_D$ Hermitian) satisfying:

  • $\alpha_i^2 = \beta^2 = \mathbb{I}$ (eigenvalues $\pm 1$)
  • $\{\alpha_i, \alpha_j\} = 2\delta_{ij}\mathbb{I}$ and $\{\alpha_i, \beta\} = 0$ (anticommutation)
  • $\text{Tr}(\alpha_i) = \text{Tr}(\beta) = 0$ (since, e.g., $\alpha_i = -\beta\alpha_i\beta$, so $\text{Tr}(\alpha_i) = -\text{Tr}(\beta\alpha_i\beta) = -\text{Tr}(\alpha_i)$)

Tracelessness combined with eigenvalues $\pm 1$ means the dimension $N$ must be even. For $N = 2$, we only have the three Pauli matrices $\sigma_1, \sigma_2, \sigma_3$ (which are traceless, Hermitian, and square to $\mathbb{I}$), but we need four such anticommuting matrices ($\alpha_1, \alpha_2, \alpha_3, \beta$), and three is not enough.
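The claim that three is not enough can be made explicit. Any $2\times 2$ Hermitian matrix can be written as $M = a_0\mathbb{I}_2 + \mathbf{a}\cdot\boldsymbol{\sigma}$ with real coefficients $a_0$ and $\mathbf{a}$ (our notation). Using $\{\sigma_i, \sigma_j\} = 2\delta_{ij}\mathbb{I}_2$:

$$\{M, \sigma_i\} = 2a_0\,\sigma_i + 2a_i\,\mathbb{I}_2$$

Since $\sigma_i$ and $\mathbb{I}_2$ are linearly independent, requiring this to vanish for $i = 1, 2, 3$ forces $a_0 = 0$ and $a_1 = a_2 = a_3 = 0$, that is, $M = 0$. No nonzero fourth $2\times 2$ matrix anticommutes with all three Pauli matrices.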

Therefore, the minimum dimension is $N = 4$. The Dirac wave function is a four-component spinor:

$$\psi = \begin{pmatrix} \psi_1 \\ \psi_2 \\ \psi_3 \\ \psi_4 \end{pmatrix} \tag{29.14}$$

💡 Key Insight: The four-component nature of the Dirac spinor is not put in by hand — it is forced by the requirements of Lorentz covariance and positive-definite probability. The internal structure of the electron (two spin states and the existence of the positron) emerges from the mathematics itself. As Dirac said, "The equation was more intelligent than I was."

The Standard Representation

The standard (or Dirac) representation of $\alpha_i$ and $\beta$ is:

$$\alpha_i = \begin{pmatrix} 0 & \sigma_i \\ \sigma_i & 0 \end{pmatrix}, \qquad \beta = \begin{pmatrix} \mathbb{I}_2 & 0 \\ 0 & -\mathbb{I}_2 \end{pmatrix} \tag{29.15}$$

where $\sigma_i$ are the $2 \times 2$ Pauli matrices you know from Chapter 13, and $\mathbb{I}_2$ is the $2 \times 2$ identity. You can verify that these satisfy all the required anticommutation relations.

Checkpoint: Verify explicitly that $\{\alpha_1, \alpha_2\} = 0$ and $\{\alpha_1, \beta\} = 0$ using the standard representation. This is a matrix multiplication exercise worth doing once.
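If you would like to confirm your hand calculation, the following sketch checks Eqs. (29.11)–(29.13) numerically for the standard representation. It is written in plain Python with nested lists so that it needs no third-party libraries; all helper names (`matmul`, `block`, `anticommutator`, `dirac_conditions_hold`) are our own and not part of any standard package.

```python
# Plain-Python check of Eqs. (29.11)-(29.13) in the standard representation.
# Matrices are nested lists of (complex) numbers.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(4)] for i in range(4)]


def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[r] + tr[r] for r in range(2)] + [bl[r] + br[r] for r in range(2)]


def scaled_identity(c, n=4):
    return [[c if i == j else 0 for j in range(n)] for i in range(n)]


ZERO2 = [[0, 0], [0, 0]]
ID2 = [[1, 0], [0, 1]]
NEG_ID2 = [[-1, 0], [0, -1]]
PAULI = [
    [[0, 1], [1, 0]],        # sigma_1
    [[0, -1j], [1j, 0]],     # sigma_2
    [[1, 0], [0, -1]],       # sigma_3
]

ALPHA = [block(ZERO2, s, s, ZERO2) for s in PAULI]   # Eq. (29.15)
BETA = block(ID2, ZERO2, ZERO2, NEG_ID2)


def dirac_conditions_hold():
    ok = anticommutator(BETA, BETA) == scaled_identity(2)   # beta^2 = I
    for i in range(3):
        ok = ok and anticommutator(ALPHA[i], BETA) == scaled_identity(0)
        for j in range(3):
            expected = scaled_identity(2 if i == j else 0)
            ok = ok and anticommutator(ALPHA[i], ALPHA[j]) == expected
    return ok


if __name__ == "__main__":
    print("all anticommutation relations satisfied:", dirac_conditions_hold())
```

Because every matrix entry is $0$, $\pm 1$, or $\pm i$, the arithmetic is exact and the comparisons can use strict equality.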

The Dirac Equation for a Free Particle: Explicit Form

In the standard representation, the Dirac equation $(i\hbar\partial_t - c\boldsymbol{\alpha}\cdot\hat{\mathbf{p}} - \beta mc^2)\psi = 0$ reads explicitly:

$$i\hbar\frac{\partial}{\partial t}\begin{pmatrix} \psi_1 \\ \psi_2 \\ \psi_3 \\ \psi_4 \end{pmatrix} = \begin{pmatrix} mc^2 & 0 & c\hat{p}_z & c(\hat{p}_x - i\hat{p}_y) \\ 0 & mc^2 & c(\hat{p}_x + i\hat{p}_y) & -c\hat{p}_z \\ c\hat{p}_z & c(\hat{p}_x - i\hat{p}_y) & -mc^2 & 0 \\ c(\hat{p}_x + i\hat{p}_y) & -c\hat{p}_z & 0 & -mc^2 \end{pmatrix} \begin{pmatrix} \psi_1 \\ \psi_2 \\ \psi_3 \\ \psi_4 \end{pmatrix}$$

This is a system of four coupled first-order partial differential equations. The off-diagonal blocks (the $\sigma_i$ matrices) couple the upper two components ($\psi_1, \psi_2$) to the lower two components ($\psi_3, \psi_4$). In the non-relativistic limit, the upper components dominate for positive-energy solutions and are called the "large components," while the lower components are suppressed by a factor of $v/c$ and are called the "small components." For negative-energy solutions, the roles are reversed.
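For a plane wave the momentum operators can be replaced by their eigenvalues, and the explicit matrix above becomes an ordinary $4\times 4$ numerical matrix. The sketch below squares it and checks that the result is $(p^2c^2 + m^2c^4)\,\mathbb{I}_4$, as the constraints (29.11)–(29.13) guarantee. The function names and the sample momentum are illustrative choices, and units with $c = 1$ are assumed by default.

```python
def dirac_hamiltonian(px, py, pz, m, c=1.0):
    """Explicit 4x4 free-particle matrix from the text, for a momentum eigenstate."""
    mc2 = m * c**2
    pm = c * (px - 1j * py)   # c (p_x - i p_y)
    pp = c * (px + 1j * py)   # c (p_x + i p_y)
    z = c * pz
    return [
        [mc2, 0, z, pm],
        [0, mc2, pp, -z],
        [z, pm, -mc2, 0],
        [pp, -z, 0, -mc2],
    ]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def max_deviation_from_dispersion(px, py, pz, m, c=1.0):
    """Largest entry of |H^2 - (p^2 c^2 + m^2 c^4) I|; should vanish up to rounding."""
    h = dirac_hamiltonian(px, py, pz, m, c)
    h2 = matmul(h, h)
    e2 = (px**2 + py**2 + pz**2) * c**2 + m**2 * c**4
    return max(abs(h2[i][j] - (e2 if i == j else 0.0))
               for i in range(4) for j in range(4))


if __name__ == "__main__":
    print("max deviation:", max_deviation_from_dispersion(0.3, -1.2, 2.0, 1.5))
```

Since $\hat{H}_D^2$ is proportional to the identity, each eigenvalue of $\hat{H}_D$ must be $+E$ or $-E$ with $E = \sqrt{p^2c^2 + m^2c^4}$.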

🔵 Historical Note: Dirac's derivation of his equation was published in the Proceedings of the Royal Society in February 1928. The paper, "The Quantum Theory of the Electron," is only 11 pages long and is remarkably close to the derivation we have just presented. Dirac was 25 years old. Within two years, the equation would predict antimatter; within four, the prediction would be confirmed. It remains one of the most consequential pieces of mathematical reasoning in the history of science.


29.5 The Gamma Matrices and the Dirac Algebra

Covariant Form of the Dirac Equation

While the form $i\hbar\partial_t\psi = \hat{H}_D\psi$ makes the Hamiltonian structure clear, it is not manifestly Lorentz covariant because it singles out the time direction. To write a covariant form, define the gamma matrices:

$$\gamma^0 = \beta, \qquad \gamma^i = \beta\alpha_i \tag{29.16}$$

In the standard representation:

$$\gamma^0 = \begin{pmatrix} \mathbb{I}_2 & 0 \\ 0 & -\mathbb{I}_2 \end{pmatrix}, \qquad \gamma^i = \begin{pmatrix} 0 & \sigma_i \\ -\sigma_i & 0 \end{pmatrix} \tag{29.17}$$

Multiplying the Dirac equation (29.9) by $\beta/c$ from the left and defining $\bar{\psi} = \psi^\dagger\gamma^0$ (the Dirac adjoint), we arrive at the manifestly covariant Dirac equation:

$$\left(i\hbar\gamma^\mu\partial_\mu - mc\right)\psi = 0 \tag{29.18}$$

where $\partial_\mu = (\frac{1}{c}\frac{\partial}{\partial t}, \nabla)$ and repeated indices are summed (Einstein summation convention). This is often written even more compactly using Feynman's slash notation, $\slashed{\partial} \equiv \gamma^\mu\partial_\mu$:

$$\left(i\hbar\slashed{\partial} - mc\right)\psi = 0 \tag{29.19}$$

This is the Dirac equation in its most elegant form. It is a set of four coupled first-order partial differential equations for the four components of $\psi$.
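For readers who want the intermediate step from Eq. (29.9) to Eq. (29.18) spelled out, here is one way to write it. Multiplying Eq. (29.9) from the left by $\beta/c$, with $\hat{\mathbf{p}} = -i\hbar\nabla$ and $\beta^2 = \mathbb{I}$, gives

$$i\hbar\,\beta\,\frac{1}{c}\frac{\partial\psi}{\partial t} = -i\hbar\,\beta\alpha_i\,\partial_i\psi + mc\,\psi$$

Identifying $\gamma^0 = \beta$, $\gamma^i = \beta\alpha_i$, and $\partial_0 = \frac{1}{c}\frac{\partial}{\partial t}$, and moving all terms to one side:

$$i\hbar\left(\gamma^0\partial_0 + \gamma^i\partial_i\right)\psi - mc\,\psi = 0$$

which is Eq. (29.18) with the sum over $\mu$ written out.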

The Clifford Algebra

The defining algebraic property of the gamma matrices, inherited from the conditions on $\alpha_i$ and $\beta$, is:

$$\{\gamma^\mu, \gamma^\nu\} \equiv \gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = 2g^{\mu\nu}\mathbb{I}_4 \tag{29.20}$$

where $g^{\mu\nu} = \text{diag}(+1, -1, -1, -1)$ is the Minkowski metric (in the $(+, -, -, -)$ convention). This algebraic structure is called a Clifford algebra, and it completely determines the properties of the Dirac matrices up to unitary equivalence.

Some important properties that follow immediately from Eq. (29.20):

  • $(\gamma^0)^2 = \mathbb{I}_4$, $(\gamma^i)^2 = -\mathbb{I}_4$
  • $\gamma^0$ is Hermitian: $(\gamma^0)^\dagger = \gamma^0$
  • $\gamma^i$ is anti-Hermitian: $(\gamma^i)^\dagger = -\gamma^i$
  • More generally: $(\gamma^\mu)^\dagger = \gamma^0\gamma^\mu\gamma^0$

The Fifth Gamma Matrix

From the four gamma matrices, we can construct a fifth independent $4\times 4$ matrix:

$$\gamma^5 \equiv i\gamma^0\gamma^1\gamma^2\gamma^3 \tag{29.21}$$

In the standard representation:

$$\gamma^5 = \begin{pmatrix} 0 & \mathbb{I}_2 \\ \mathbb{I}_2 & 0 \end{pmatrix}$$

This matrix anticommutes with all four gamma matrices ($\{\gamma^5, \gamma^\mu\} = 0$), satisfies $(\gamma^5)^2 = \mathbb{I}_4$, and is Hermitian. It plays a crucial role in the theory of parity, chirality, and the weak interaction.
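The Clifford algebra (29.20) and the explicit form of $\gamma^5$ can both be checked mechanically. The sketch below builds the gamma matrices from Eq. (29.16) in the standard representation using plain Python lists; the helper names (`matmul`, `block`, `clifford_algebra_holds`, `gamma5`) are ours.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[r] + tr[r] for r in range(2)] + [bl[r] + br[r] for r in range(2)]


ZERO2 = [[0, 0], [0, 0]]
ID2 = [[1, 0], [0, 1]]
NEG_ID2 = [[-1, 0], [0, -1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

BETA = block(ID2, ZERO2, ZERO2, NEG_ID2)
ALPHA = [block(ZERO2, s, s, ZERO2) for s in PAULI]
GAMMA = [BETA] + [matmul(BETA, a) for a in ALPHA]   # gamma^0, gamma^1..3, Eq. (29.16)
METRIC = [1, -1, -1, -1]                            # diagonal of g^{mu nu}


def clifford_algebra_holds():
    """Check {gamma^mu, gamma^nu} = 2 g^{mu nu} I for all index pairs."""
    for mu in range(4):
        for nu in range(4):
            ab = matmul(GAMMA[mu], GAMMA[nu])
            ba = matmul(GAMMA[nu], GAMMA[mu])
            target = 2 * METRIC[mu] if mu == nu else 0
            for i in range(4):
                for j in range(4):
                    if ab[i][j] + ba[i][j] != (target if i == j else 0):
                        return False
    return True


def gamma5():
    """gamma^5 = i gamma^0 gamma^1 gamma^2 gamma^3, Eq. (29.21)."""
    prod = GAMMA[0]
    for g in GAMMA[1:]:
        prod = matmul(prod, g)
    return [[1j * x for x in row] for row in prod]


if __name__ == "__main__":
    print("Clifford algebra satisfied:", clifford_algebra_holds())
    for row in gamma5():
        print(row)
```

The same check would pass in any other representation related to this one by a unitary change of basis, although the explicit entries of $\gamma^5$ would differ.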

The Dirac Algebra: A Complete Basis

The sixteen independent products of gamma matrices form a complete basis for all $4\times 4$ matrices:

| Type | Matrices | Count |
|---|---|---|
| Scalar | $\mathbb{I}_4$ | 1 |
| Vector | $\gamma^\mu$ | 4 |
| Tensor | $\sigma^{\mu\nu} = \frac{i}{2}[\gamma^\mu, \gamma^\nu]$ | 6 |
| Pseudovector | $\gamma^5\gamma^\mu$ | 4 |
| Pseudoscalar | $\gamma^5$ | 1 |
| Total | | 16 |

These sixteen matrices, called the Dirac algebra or the gamma matrix algebra, are the building blocks for constructing all possible Lorentz-covariant interactions involving Dirac spinors. Each type transforms in a definite way under the Lorentz group, which is why they are classified by their tensor character.

🔵 Historical Note: Dirac's original 1928 paper, "The Quantum Theory of the Electron," is remarkably concise — only 11 pages. The derivation we have followed is essentially Dirac's own, with only minor notational modernization. The paper is readable by any student who has followed the development in this chapter and is well worth studying in the original. See the Further Reading section.


29.6 Spin Emerges from the Dirac Equation

One of the most beautiful results in all of theoretical physics is that spin is not an additional postulate in relativistic quantum mechanics — it is a consequence. We introduced spin as an empirical fact in Chapter 13, motivated by the Stern-Gerlach experiment. The Pauli matrices appeared as an ad hoc addition to the non-relativistic theory. In the Dirac equation, spin arises inevitably from the structure of the mathematics.

Angular Momentum Conservation

Consider a free Dirac particle. The orbital angular momentum operator is $\hat{\mathbf{L}} = \hat{\mathbf{r}} \times \hat{\mathbf{p}}$. Let us check whether $\hat{\mathbf{L}}$ is conserved — i.e., whether $[\hat{L}_i, \hat{H}_D] = 0$.

The Dirac Hamiltonian for a free particle is $\hat{H}_D = c\,\boldsymbol{\alpha}\cdot\hat{\mathbf{p}} + \beta mc^2$. The mass term commutes with $\hat{\mathbf{L}}$ (it is proportional to the identity in position space). But:

$$[\hat{L}_i, c\alpha_j\hat{p}_j] = c\alpha_j[\hat{L}_i, \hat{p}_j] = c\alpha_j(i\hbar\epsilon_{ijk}\hat{p}_k) = i\hbar c\,\epsilon_{ijk}\alpha_j\hat{p}_k$$

where we used $[\hat{L}_i, \hat{p}_j] = i\hbar\epsilon_{ijk}\hat{p}_k$. This is not zero. Orbital angular momentum alone is not conserved for a free Dirac particle.

But total angular momentum must be conserved for a free particle (there are no external torques). So there must be another contribution to angular momentum that, when combined with $\hat{\mathbf{L}}$, is conserved. Define:

$$\hat{\mathbf{S}} = \frac{\hbar}{2}\boldsymbol{\Sigma}, \qquad \text{where} \qquad \Sigma_i = \begin{pmatrix} \sigma_i & 0 \\ 0 & \sigma_i \end{pmatrix} \tag{29.22}$$

This is a $4\times 4$ generalization of the spin operator from Chapter 13. Now compute (the mass term drops out because $\Sigma_i$ commutes with $\beta$):

$$[\hat{S}_i, \hat{H}_D] = \frac{\hbar}{2}[\Sigma_i, c\alpha_j\hat{p}_j] = \frac{\hbar c}{2}[\Sigma_i, \alpha_j]\hat{p}_j$$

Working out the commutator $[\Sigma_i, \alpha_j]$ using the explicit matrix representations, one finds:

$$[\Sigma_i, \alpha_j] = 2i\epsilon_{ijk}\alpha_k$$

Therefore, relabeling the summed indices $j \leftrightarrow k$ and using the antisymmetry of $\epsilon_{ijk}$:

$$[\hat{S}_i, \hat{H}_D] = i\hbar c\,\epsilon_{ijk}\alpha_k\hat{p}_j = -i\hbar c\,\epsilon_{ijk}\alpha_j\hat{p}_k$$

Compare with the orbital part: $[\hat{L}_i, \hat{H}_D] = +i\hbar c\,\epsilon_{ijk}\alpha_j\hat{p}_k$. The two contributions cancel exactly:

$$[\hat{L}_i + \hat{S}_i, \hat{H}_D] = 0 \tag{29.23}$$

The total angular momentum $\hat{\mathbf{J}} = \hat{\mathbf{L}} + \hat{\mathbf{S}}$ is conserved, where $\hat{\mathbf{S}}$ is an intrinsic angular momentum — spin — that was not put in by hand but emerged from the requirement that the wave equation be both first-order in time and Lorentz covariant.
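The matrix identity $[\Sigma_i, \alpha_j] = 2i\epsilon_{ijk}\alpha_k$ used above, together with the fact that $\hat{\mathbf{S}}^2 = \tfrac{3}{4}\hbar^2\,\mathbb{I}_4$ (the value $s(s+1)\hbar^2$ for $s = \tfrac{1}{2}$), can be verified with a few lines of plain Python. This is a sketch with our own helper names; indices run over 0, 1, 2 rather than 1, 2, 3.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][s] for k in range(n)) for s in range(n)]
            for r in range(n)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[r][s] - ba[r][s] for s in range(4)] for r in range(4)]


def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[r] + tr[r] for r in range(2)] + [bl[r] + br[r] for r in range(2)]


def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2


ZERO2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
SIGMA = [block(s, ZERO2, ZERO2, s) for s in PAULI]   # Sigma_i of Eq. (29.22)
ALPHA = [block(ZERO2, s, s, ZERO2) for s in PAULI]   # alpha_i of Eq. (29.15)


def spin_commutator_identity_holds():
    """Check [Sigma_a, alpha_b] = 2i * eps_abc * alpha_c for all a, b."""
    for a in range(3):
        for b in range(3):
            lhs = commutator(SIGMA[a], ALPHA[b])
            rhs = [[sum(2 * 1j * levi_civita(a, b, c) * ALPHA[c][r][s]
                        for c in range(3))
                    for s in range(4)] for r in range(4)]
            if lhs != rhs:
                return False
    return True


def sigma_squared():
    """Sigma_1^2 + Sigma_2^2 + Sigma_3^2, expected to equal 3 * identity."""
    total = [[0] * 4 for _ in range(4)]
    for sig in SIGMA:
        sq = matmul(sig, sig)
        total = [[total[r][s] + sq[r][s] for s in range(4)] for r in range(4)]
    return total


if __name__ == "__main__":
    print("commutator identity holds:", spin_commutator_identity_holds())
    print("Sigma . Sigma =", sigma_squared())
```

With $\hat{\mathbf{S}} = \tfrac{\hbar}{2}\boldsymbol{\Sigma}$, the result $\boldsymbol{\Sigma}\cdot\boldsymbol{\Sigma} = 3\,\mathbb{I}_4$ is what identifies the new angular momentum as spin one-half.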

The Electron's Magnetic Moment

Couple the Dirac equation to an electromagnetic field by the minimal coupling prescription (covariant derivative):

$$\hat{\mathbf{p}} \to \hat{\mathbf{p}} - \frac{e}{c}\mathbf{A}, \qquad i\hbar\frac{\partial}{\partial t} \to i\hbar\frac{\partial}{\partial t} - e\Phi$$

where $\Phi$ and $\mathbf{A}$ are the scalar and vector potentials, and $e < 0$ for the electron. In the non-relativistic limit of the Dirac equation (which we obtain by separating the four-component spinor into "large" and "small" components and expanding in powers of $v/c$), the Hamiltonian becomes:

$$\hat{H} \approx \frac{(\hat{\mathbf{p}} - \frac{e}{c}\mathbf{A})^2}{2m} + e\Phi - \frac{e\hbar}{2mc}\boldsymbol{\sigma}\cdot\mathbf{B} + \cdots \tag{29.24}$$

The last term is the interaction of the electron's magnetic moment with the magnetic field. Comparing with $-\boldsymbol{\mu}\cdot\mathbf{B}$ and using $\hat{\mathbf{S}} = \frac{\hbar}{2}\boldsymbol{\sigma}$, we identify:

$$\boldsymbol{\mu} = \frac{e\hbar}{2mc}\boldsymbol{\sigma} = \frac{e}{mc}\hat{\mathbf{S}} = g_s\frac{e}{2mc}\hat{\mathbf{S}} \tag{29.25}$$

with $g_s = 2$.

This is extraordinary. In the non-relativistic Pauli theory (Chapter 13), the electron's g-factor $g_s = 2$ had to be inserted by hand as an experimental fact. The Dirac equation predicts $g_s = 2$ from first principles. It is not an input; it is an output.

💡 Key Insight: Before Dirac, physics had three separate mysteries: (1) Why does the electron have spin? (2) Why is its g-factor 2 rather than 1? (3) Why does the fine structure of hydrogen have the form it does? Dirac's equation answers all three with a single stroke — they are all consequences of combining quantum mechanics with special relativity. This is one of the most powerful unifications in the history of physics.

⚠️ Common Misconception: It is tempting to conclude that $g_s$ is exactly 2. That is what the Dirac equation predicts, but the measured value is $g_s = 2.00231930436256\ldots$; the small excess is the famous anomalous magnetic moment. The deviation from 2 is a radiative correction computed in quantum electrodynamics (QED). The leading correction, first computed by Schwinger in 1948, gives $g_s = 2(1 + \alpha/2\pi + \cdots)$, where $\alpha \approx 1/137$ is the fine-structure constant. This is one of the most precisely tested predictions in all of science, agreeing with experiment to better than 10 significant figures.
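To get a feel for the size of the leading radiative correction, the sketch below evaluates $g_s = 2(1 + \alpha/2\pi)$ and compares it with the measured value quoted above. The function name `g_schwinger` is ours, and only the first-order term is included, so the residual difference reflects higher-order QED terms that are not computed here.

```python
import math

ALPHA = 1 / 137.035999084          # fine-structure constant (CODATA 2018)
G_MEASURED = 2.00231930436256      # measured value quoted in the text


def g_schwinger(alpha: float = ALPHA) -> float:
    """Dirac value plus the leading QED correction: g = 2 (1 + alpha / (2 pi))."""
    return 2.0 * (1.0 + alpha / (2.0 * math.pi))


if __name__ == "__main__":
    g1 = g_schwinger()
    print(f"Dirac:              g = 2")
    print(f"Dirac + Schwinger:  g = {g1:.10f}")
    print(f"measured:           g = {G_MEASURED:.10f}")
    print(f"remaining difference: {g1 - G_MEASURED:.2e}")
```

The single term $\alpha/2\pi$ already accounts for almost all of the deviation from 2; the remainder comes from terms of order $\alpha^2$ and beyond.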

Worked Example: The Magnetic Moment Prediction

Let us trace the logic explicitly. The non-relativistic Hamiltonian for a charged particle in a magnetic field is:

$$\hat{H} = \frac{(\hat{\mathbf{p}} - \frac{e}{c}\mathbf{A})^2}{2m} + e\Phi$$

Expanding the kinetic term: $(\hat{\mathbf{p}} - \frac{e}{c}\mathbf{A})^2 = \hat{p}^2 - \frac{e}{c}(\hat{\mathbf{p}}\cdot\mathbf{A} + \mathbf{A}\cdot\hat{\mathbf{p}}) + \frac{e^2}{c^2}A^2$. In the Coulomb gauge ($\nabla\cdot\mathbf{A} = 0$) and for a uniform field $\mathbf{B} = \nabla\times\mathbf{A}$ with $\mathbf{A} = \frac{1}{2}\mathbf{B}\times\mathbf{r}$, the leading magnetic interaction is:

$$\hat{H}_{\text{mag}} = -\frac{e}{2mc}\hat{\mathbf{L}}\cdot\mathbf{B}$$

This gives a g-factor of $g_L = 1$ for orbital angular momentum — exactly the classical result. But for spin, the Dirac equation gives:

$$\hat{H}_{\text{spin-mag}} = -\frac{e}{mc}\hat{\mathbf{S}}\cdot\mathbf{B} = -g_s\frac{e}{2mc}\hat{\mathbf{S}}\cdot\mathbf{B}$$

with $g_s = 2$. The factor of 2 difference between the orbital and spin g-factors is one of the most striking predictions of the Dirac equation. Before Dirac, this factor had been measured experimentally (from the anomalous Zeeman effect, studied in Chapter 13) but had no theoretical explanation. After Dirac, it was a mathematical consequence of the structure of the gamma matrices.
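The step from the expanded kinetic term to $\hat{H}_{\text{mag}}$ in this worked example can be filled in as follows. In the Coulomb gauge $\hat{\mathbf{p}}\cdot\mathbf{A} = \mathbf{A}\cdot\hat{\mathbf{p}}$ when acting on a wave function, so the term linear in $\mathbf{A}$ becomes, with $\mathbf{A} = \frac{1}{2}\mathbf{B}\times\mathbf{r}$:

$$-\frac{e}{2mc}\left(\hat{\mathbf{p}}\cdot\mathbf{A} + \mathbf{A}\cdot\hat{\mathbf{p}}\right) = -\frac{e}{mc}\,\mathbf{A}\cdot\hat{\mathbf{p}} = -\frac{e}{2mc}\,(\mathbf{B}\times\mathbf{r})\cdot\hat{\mathbf{p}} = -\frac{e}{2mc}\,\mathbf{B}\cdot(\mathbf{r}\times\hat{\mathbf{p}}) = -\frac{e}{2mc}\,\hat{\mathbf{L}}\cdot\mathbf{B}$$

The second-to-last equality is the cyclic property of the scalar triple product. The term quadratic in $\mathbf{A}$ is of second order in the field and is dropped here.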


29.7 The Dirac Hydrogen Atom: Energy Levels with Fine Structure

Setup

The hydrogen atom in the Dirac theory is described by the Dirac equation with the Coulomb potential:

$$\hat{H}_D = c\,\boldsymbol{\alpha}\cdot\hat{\mathbf{p}} + \beta mc^2 - \frac{e^2}{r} \tag{29.26}$$

where we work in Gaussian-CGS units, in which $e^2$ stands for the SI combination $e^2/(4\pi\epsilon_0)$ (these units simplify relativistic electrodynamics considerably, and we will use them in this section). Solving this equation exactly is one of the great technical achievements of quantum mechanics. The solution, first obtained by Darwin (Charles Galton Darwin, grandson of the Darwin) and by Gordon in 1928, uses the fact that $\hat{J}^2$, $\hat{J}_z$, $\hat{K}$, and parity are all conserved, where $\hat{K} = \beta(\hat{\boldsymbol{\Sigma}}\cdot\hat{\mathbf{L}} + \hbar)$ is a Dirac-specific operator that has no non-relativistic analogue.

The Exact Energy Levels

The exact result for the energy eigenvalues of the Dirac hydrogen atom is:

$$E_{n,j} = mc^2\left[1 + \left(\frac{\alpha}{n - j - \frac{1}{2} + \sqrt{(j + \frac{1}{2})^2 - \alpha^2}}\right)^2\right]^{-1/2} \tag{29.27}$$

where $\alpha = e^2/(\hbar c) \approx 1/137.036$ is the fine-structure constant, $n = 1, 2, 3, \ldots$ is the principal quantum number, and $j = \frac{1}{2}, \frac{3}{2}, \ldots, n - \frac{1}{2}$ is the total angular momentum quantum number.

This is one of the most beautiful formulas in physics. Several features stand out immediately:

  • The formula depends on $n$ and $j$ only — not on $l$. The orbital angular momentum quantum number has disappeared entirely. For fixed $n$ and $j$, different values of $l$ (e.g., $2S_{1/2}$ with $l = 0$ and $2P_{1/2}$ with $l = 1$) give the same energy.
  • The formula contains a square root $\sqrt{(j+1/2)^2 - \alpha^2}$. For a nucleus of charge $Z$, $\alpha$ is replaced by $Z\alpha$, and for $Z\alpha > j + 1/2$ (i.e., for $Z > (j + 1/2)/\alpha \approx 137\,(j + 1/2)$) the argument becomes negative. For $j = 1/2$, this critical charge is $Z_c = 1/\alpha \approx 137$. Beyond this, the Dirac equation for a point nucleus has no normalizable ground state: the system "falls to the center." This signals the onset of spontaneous pair creation in supercritical fields, a dramatic prediction that has been searched for in heavy-ion collisions but not yet conclusively confirmed.
  • In the limit $\alpha \to 0$ (turning off electromagnetism), $E_{n,j} \to mc^2$ — the electron's rest energy, with zero binding. All corrections are proportional to powers of $\alpha$.
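The three features listed above can be checked numerically. The following is a minimal sketch of Eq. (29.27), assuming the usual replacement $\alpha \to Z\alpha$ for a point nucleus of charge $Z$; the function name `dirac_energy` and the numerical constants are choices made here for illustration, not values taken from the text.

```python
import math

ALPHA = 1 / 137.035999   # fine-structure constant (assumed CODATA-level value)
MC2_EV = 510998.95       # electron rest energy in eV (assumed)

def dirac_energy(n, j, Z=1, alpha=ALPHA, mc2=MC2_EV):
    """Exact Dirac energy of Eq. (29.27), with alpha -> Z*alpha (assumption)."""
    if not (n >= 1 and 0.5 <= j <= n - 0.5):
        raise ValueError("need n >= 1 and 1/2 <= j <= n - 1/2")
    za = Z * alpha
    k = j + 0.5
    arg = k * k - za * za
    if arg < 0:
        # Square-root argument negative: no real energy for a point nucleus
        raise ValueError("Z*alpha > j + 1/2: supercritical point charge")
    denom = n - k + math.sqrt(arg)
    return mc2 / math.sqrt(1.0 + (za / denom) ** 2)

# The energy depends on (n, j) only, so 2S_1/2 and 2P_1/2 share one value.
ground_binding = dirac_energy(1, 0.5) - MC2_EV
print(f"1S_1/2 binding energy: {ground_binding:.4f} eV")
print(f"n=2: E(j=3/2) - E(j=1/2) = "
      f"{dirac_energy(2, 1.5) - dirac_energy(2, 0.5):.3e} eV")

# Turning off electromagnetism: the energy tends to the rest energy.
print(dirac_energy(1, 0.5, alpha=1e-9) - MC2_EV)
```

Because `l` never appears as an argument, the degeneracy in $l$ is built into the function signature; calling it with `Z=138` and `j=0.5` raises the error that marks the critical charge.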

Expansion in Powers of $\alpha$

Since $\alpha \approx 1/137 \ll 1$, we can expand Eq. (29.27) in powers of $\alpha$:

$$E_{n,j} \approx mc^2\left[1 - \frac{\alpha^2}{2n^2} - \frac{\alpha^4}{2n^4}\left(\frac{n}{j + \frac{1}{2}} - \frac{3}{4}\right) + \cdots\right] \tag{29.28}$$

Each term has a clear physical interpretation:

  1. $mc^2$ — the rest energy. Subtract this to get the binding energy.

  2. $-mc^2\alpha^2/(2n^2) = -13.6\,\text{eV}/n^2$ — the familiar Bohr/Schrodinger energy levels from Chapter 5! This is the non-relativistic result, independent of $l$ or $j$. The entire degeneracy structure of the hydrogen atom at order $\alpha^2$ is the Schrodinger result.

  3. $-mc^2\alpha^4/(2n^4)\left(\frac{n}{j+1/2} - \frac{3}{4}\right)$ — the fine-structure correction, of order $\alpha^4$. This depends on $j$ but not on $l$ separately. For a given $n$, states with different $l$ but the same $j$ remain degenerate in the Dirac theory.

🔗 Connection: In Chapter 18, we computed the fine structure of hydrogen perturbatively, finding three separate contributions: (1) the relativistic kinetic energy correction, (2) spin-orbit coupling, and (3) the Darwin term. Each contribution separately depends on $l$. But when added together, the dependence on $l$ cancels, and the result depends only on $j$. This "miraculous" cancellation is not miraculous at all — it is guaranteed by the Dirac equation, which treats all three effects simultaneously and exactly. The perturbative calculation of Chapter 18 is the $\alpha^4$ approximation to the exact Dirac result (29.27).
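The cancellation of the $l$ dependence can be verified with exact rational arithmetic. The sketch below assumes the standard textbook forms of the three first-order corrections for hydrogen (they are not reproduced in this section, so treat them as assumptions that should match Chapter 18), all expressed in units of $mc^2\alpha^4$; the function names are illustrative.

```python
from fractions import Fraction as F

def relativistic(n, l):
    """Relativistic kinetic-energy correction (assumed standard form)."""
    return -F(1, 2 * n**4) * (F(n) / (l + F(1, 2)) - F(3, 4))

def spin_orbit(n, l, j):
    """Spin-orbit correction; vanishes for l = 0 (assumed standard form)."""
    if l == 0:
        return F(0)
    num = j * (j + 1) - l * (l + 1) - F(3, 4)
    return F(1, 4 * n**3) * num / (l * (l + F(1, 2)) * (l + 1))

def darwin(n, l):
    """Darwin term; nonzero only for l = 0 (assumed standard form)."""
    return F(1, 2 * n**3) if l == 0 else F(0)

def dirac_alpha4(n, j):
    """alpha^4 coefficient read off from Eq. (29.28)."""
    return -F(1, 2 * n**4) * (F(n) / (j + F(1, 2)) - F(3, 4))

for n in (1, 2, 3):
    for l in range(n):
        for j in (l - F(1, 2), l + F(1, 2)):
            if j < 0:
                continue  # j = -1/2 does not exist for l = 0
            total = relativistic(n, l) + spin_orbit(n, l, j) + darwin(n, l)
            # Each piece depends on l, but the sum matches the j-only Dirac term
            assert total == dirac_alpha4(n, j)
            print(f"n={n} l={l} j={j}: total = {total}")
```

Using `Fraction` rather than floats makes the agreement exact, so the check is a statement about the algebra and not about rounding.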

Numerical Example: The $n = 2$ Fine Structure

For $n = 2$, the non-relativistic energy is $E_2 = -13.6/4 = -3.4$ eV. The fine-structure splitting between $j = 3/2$ and $j = 1/2$ levels is:

$$\Delta E_{\text{fs}} = \frac{mc^2\alpha^4}{32}\left(\frac{2}{1} - \frac{2}{2}\right) = \frac{mc^2\alpha^4}{32} = \frac{(0.511 \times 10^6\,\text{eV})}{32}\left(\frac{1}{137}\right)^4$$

$$\Delta E_{\text{fs}} \approx 4.53 \times 10^{-5}\,\text{eV} \approx 0.365\,\text{cm}^{-1}$$

This corresponds to a frequency of about $10.9$ GHz. The fine structure is tiny compared to the gross structure — smaller by a factor of $\alpha^2 \approx 5 \times 10^{-5}$ — but easily measurable with spectroscopic techniques.
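The arithmetic of this example is easy to reproduce. The sketch below evaluates the leading-order splitting $mc^2\alpha^4/32$, converts it to frequency and wavenumber, and compares it with the difference of the exact energies from Eq. (29.27). The constants are assumed CODATA-level values, not numbers taken from the text.

```python
import math

ALPHA = 1 / 137.035999      # fine-structure constant (assumed value)
MC2_EV = 510998.95          # electron rest energy in eV (assumed value)
H_EV_S = 4.135667696e-15    # Planck constant in eV*s
C_CM_S = 2.99792458e10      # speed of light in cm/s

def dirac_energy(n, j):
    """Exact Dirac hydrogen energy, Eq. (29.27), in eV."""
    k = j + 0.5
    denom = n - k + math.sqrt(k * k - ALPHA**2)
    return MC2_EV / math.sqrt(1 + (ALPHA / denom) ** 2)

leading = MC2_EV * ALPHA**4 / 32                      # alpha^4 estimate
exact = dirac_energy(2, 1.5) - dirac_energy(2, 0.5)   # full formula
freq_ghz = leading / H_EV_S / 1e9
wavenumber = leading / (H_EV_S * C_CM_S)

print(f"leading-order splitting: {leading:.3e} eV")
print(f"exact Dirac splitting:   {exact:.3e} eV")
print(f"frequency:  {freq_ghz:.2f} GHz")
print(f"wavenumber: {wavenumber:.3f} cm^-1")
```

The two splittings differ only at relative order $\alpha^2$, which is why the $\alpha^4$ term alone is adequate for this estimate.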

The Dirac Degeneracy

In the exact Dirac spectrum, states with the same $n$ and $j$ are degenerate, even if they have different $l$. Specifically, for $n = 2$:

  • $2S_{1/2}$ ($l = 0$, $j = 1/2$) and $2P_{1/2}$ ($l = 1$, $j = 1/2$) are degenerate.
  • $2P_{3/2}$ ($l = 1$, $j = 3/2$) has a different (higher) energy.

This degeneracy between $2S_{1/2}$ and $2P_{1/2}$ is a prediction of the Dirac equation. As we will see in Section 29.10, this prediction is wrong — the Lamb shift breaks this degeneracy. But it took until 1947 to measure the splitting, and explaining it required quantum electrodynamics.

📊 By the Numbers: The Hydrogen Fine Structure Hierarchy

| Level | $n$ | $l$ | $j$ | Energy correction (in units of $mc^2\alpha^4$) |
|---|---|---|---|---|
| $1S_{1/2}$ | 1 | 0 | 1/2 | $-1/8$ |
| $2S_{1/2}$ | 2 | 0 | 1/2 | $-5/128$ |
| $2P_{1/2}$ | 2 | 1 | 1/2 | $-5/128$ (degenerate with $2S_{1/2}$) |
| $2P_{3/2}$ | 2 | 1 | 3/2 | $-1/128$ |
| $3S_{1/2}$ | 3 | 0 | 1/2 | $-1/72$ |
| $3P_{1/2}$ | 3 | 1 | 1/2 | $-1/72$ (degenerate with $3S_{1/2}$) |
| $3P_{3/2}$ | 3 | 1 | 3/2 | $-1/216$ |
| $3D_{3/2}$ | 3 | 2 | 3/2 | $-1/216$ (degenerate with $3P_{3/2}$) |
| $3D_{5/2}$ | 3 | 2 | 5/2 | $-1/648$ |
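The coefficients for each level follow directly from the $\alpha^4$ term of Eq. (29.28), namely $-\frac{1}{2n^4}\left(\frac{n}{j+1/2} - \frac{3}{4}\right)$. As a minimal sketch, the loop below generates them as exact fractions and also converts them to eV; the level labels and the constants are illustrative choices made here.

```python
from fractions import Fraction as F

ALPHA = 1 / 137.035999   # fine-structure constant (assumed value)
MC2_EV = 510998.95       # electron rest energy in eV (assumed value)
LETTERS = "SPDF"         # spectroscopic letters for l = 0, 1, 2, 3

def alpha4_coefficient(n, j):
    """Fine-structure coefficient in units of m c^2 alpha^4, from Eq. (29.28)."""
    return -F(1, 2 * n**4) * (F(n) / (j + F(1, 2)) - F(3, 4))

rows = {}
for n in (1, 2, 3):
    for l in range(n):
        for j in (l - F(1, 2), l + F(1, 2)):
            if j > 0:
                rows[f"{n}{LETTERS[l]}_{j}"] = alpha4_coefficient(n, j)

for label, coeff in rows.items():
    shift_ev = float(coeff) * MC2_EV * ALPHA**4
    print(f"{label:8s} {str(coeff):>8s}   {shift_ev:.3e} eV")
```

Levels that share $n$ and $j$ come out with identical fractions, which is the Dirac degeneracy in tabular form.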

29.8 Negative-Energy Solutions and the Dirac Sea

The Problem Returns

The free-particle Dirac equation has plane-wave solutions:

$$\psi(\mathbf{r}, t) = u(\mathbf{p})\exp\left(\frac{i}{\hbar}(\mathbf{p}\cdot\mathbf{r} - Et)\right)$$

where $u(\mathbf{p})$ is a four-component spinor. For each momentum $\mathbf{p}$, there are four independent solutions: two with $E = +\sqrt{p^2c^2 + m^2c^4}$ (spin-up and spin-down positive-energy) and two with $E = -\sqrt{p^2c^2 + m^2c^4}$ (spin-up and spin-down negative-energy).
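The fourfold structure of the solutions can be seen by diagonalizing the free Dirac Hamiltonian $c\,\boldsymbol{\alpha}\cdot\mathbf{p} + \beta mc^2$ at a fixed momentum. The sketch below assumes the Dirac (standard) representation of $\boldsymbol{\alpha}$ and $\beta$ and natural units $m = c = 1$; the momentum value is arbitrary and chosen only for illustration.

```python
import numpy as np

# Pauli matrices
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
zero = np.zeros((2, 2), dtype=complex)
one = np.eye(2, dtype=complex)

# Dirac representation (assumed convention): alpha_i off-diagonal, beta diagonal
alphas = [np.block([[zero, s], [s, zero]]) for s in (sx, sy, sz)]
beta = np.block([[one, zero], [zero, -one]])

def dirac_hamiltonian(p, m=1.0, c=1.0):
    """4x4 matrix c*alpha.p + beta*m*c^2 for a momentum vector p."""
    return c * sum(a * pi for a, pi in zip(alphas, p)) + beta * m * c**2

p = np.array([0.3, -0.4, 1.2])          # illustrative momentum
H = dirac_hamiltonian(p)
energies = np.linalg.eigvalsh(H)        # ascending order
expected = np.sqrt(p @ p + 1.0)         # sqrt(p^2 c^2 + m^2 c^4) with m = c = 1

print("eigenvalues:", np.round(energies, 6))
print("sqrt(p^2 + m^2):", round(float(expected), 6))
```

The eigenvalues come in two degenerate pairs, $\pm\sqrt{p^2c^2 + m^2c^4}$, with the twofold degeneracy of each sign corresponding to the two spin states.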

The negative-energy solutions are more troublesome for the Dirac equation than for the Klein-Gordon equation, because they are part of the same equation that correctly describes positive-energy electrons. We cannot simply dismiss them. Moreover, the negative-energy spectrum extends continuously down to $-\infty$: there is no lower bound on the energy. This means that a positive-energy electron could radiate photons and cascade down through the negative-energy states, losing energy forever. Atoms would be unstable. Matter would collapse.

This is clearly not what happens. If it were, every hydrogen atom in the universe would have collapsed within a tiny fraction of a second, and the stability of matter, the most basic empirical fact about the physical world, would be violated.

The transition rate for an electron to radiate a photon and drop from a positive-energy state to a negative-energy state can be estimated using the golden rule from Chapter 21. The rate is enormous — proportional to $\alpha(mc^2/\hbar) \sim 10^{18}\,\text{s}^{-1}$ — because the available phase space is infinite (the negative-energy continuum extends to $-\infty$). The entire matter content of the universe would disintegrate in about $10^{-18}$ seconds.
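The order of magnitude quoted here is a one-line evaluation of $\alpha\,mc^2/\hbar$. The sketch below only reproduces that scale; it is not a golden-rule calculation, all numerical prefactors are omitted, and the constants are assumed values.

```python
ALPHA = 1 / 137.035999        # fine-structure constant (assumed value)
MC2_EV = 510998.95            # electron rest energy in eV (assumed value)
HBAR_EV_S = 6.582119569e-16   # reduced Planck constant in eV*s

rate = ALPHA * MC2_EV / HBAR_EV_S   # characteristic rate in s^-1
lifetime = 1.0 / rate               # corresponding time scale in s

print(f"alpha * mc^2 / hbar = {rate:.2e} s^-1")
print(f"time scale          = {lifetime:.2e} s")
```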

Something must prevent these transitions. But what?

Dirac's Audacious Solution: The Dirac Sea

In 1930, Dirac proposed a radical interpretation. Since electrons are fermions and obey the Pauli exclusion principle, what if all the negative-energy states are already occupied? The vacuum is not empty — it is a "sea" of electrons filling every negative-energy state. An electron in a positive-energy state cannot transition to a negative-energy state because those states are full.

This picture — the Dirac sea — is extravagant. It posits an infinite number of invisible electrons filling up all negative-energy states. But it makes a stunning prediction.

The Prediction of Holes

If a negative-energy electron absorbs a photon with energy $E \geq 2mc^2$, the electron can be promoted to a positive-energy state. This leaves behind a "hole" in the Dirac sea. What are the properties of this hole?

Consider a filled Dirac sea with one electron removed from a state with energy $-|E|$, momentum $-\mathbf{p}$, and charge $-e$ (where $e > 0$ is the elementary charge). The missing electron means the sea has:

  • Energy higher by $|E|$ (relative to the filled sea), so the hole has positive energy $+|E|$
  • Momentum different by $+\mathbf{p}$ (the missing $-\mathbf{p}$ creates a momentum of $+\mathbf{p}$), so the hole has positive momentum $+\mathbf{p}$
  • Charge different by $+e$ (the missing $-e$ creates a charge of $+e$), so the hole has positive charge
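The bookkeeping behind these three statements can be written out explicitly. As a sketch, let $E_{\text{sea}}$, $\mathbf{P}_{\text{sea}}$, and $Q_{\text{sea}}$ (labels introduced here for illustration, not notation from the text) denote the total energy, momentum, and charge of the completely filled sea. Removing one electron changes each total by minus the removed value:

$$E_{\text{hole}} = \left[E_{\text{sea}} - (-|E|)\right] - E_{\text{sea}} = +|E|$$

$$\mathbf{p}_{\text{hole}} = \left[\mathbf{P}_{\text{sea}} - (-\mathbf{p})\right] - \mathbf{P}_{\text{sea}} = +\mathbf{p}$$

$$q_{\text{hole}} = \left[Q_{\text{sea}} - (-e)\right] - Q_{\text{sea}} = +e$$

Only differences relative to the filled sea enter, so the (formally infinite) totals never need to be evaluated.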

The hole behaves exactly like a particle with the same mass as the electron but with positive charge. Dirac had predicted the existence of a new particle — the anti-electron — from pure theoretical reasoning.

The process of promoting a negative-energy electron to a positive-energy state, creating both a visible electron and a visible hole, is what we now call pair creation. It requires energy $\geq 2mc^2$ (to bridge the gap from the negative-energy sea to the positive-energy continuum). The reverse process — a positive-energy electron falling into an empty negative-energy state, annihilating both the electron and the hole — is pair annihilation, releasing energy $\geq 2mc^2$ as photons.

Initially (and understandably), Dirac hoped this particle might be the proton, since it was the only known positively charged particle. But the hole must have the same mass as the electron (since $E = +\sqrt{p^2c^2 + m^2c^4}$ applies equally to the hole), and the proton is 1836 times heavier. J. Robert Oppenheimer and Hermann Weyl independently pointed out that the hole must be a new particle with the electron's mass and opposite charge. Oppenheimer further showed that if the holes were protons, the rate of electron-proton annihilation would be catastrophically fast — every hydrogen atom would self-destruct in about $10^{-10}$ seconds. Since hydrogen manifestly exists, the hole cannot be the proton.

🔵 Historical Note: Dirac was initially reluctant to predict a new particle. In his 1931 paper, he wrote: "A hole, if there were one, would be a new kind of particle, unknown to experimental physics, having the same mass and opposite charge to an electron." The qualifier "if there were one" reflects his caution. Within a year, the particle would be found.


29.9 Antimatter: Dirac's Prediction, Anderson's Discovery

The Experimental Confirmation

On August 2, 1932, Carl David Anderson at Caltech was studying cosmic rays using a cloud chamber immersed in a magnetic field. He observed a track that curved the wrong way — a particle with the mass of an electron but positive charge. He named it the positron.

Anderson's discovery confirmed Dirac's prediction, though Anderson later claimed he was unaware of Dirac's theoretical work at the time. Patrick Blackett and Giuseppe Occhialini, working at the Cavendish Laboratory, subsequently confirmed the discovery and explicitly connected it to Dirac's theory. Dirac received the Nobel Prize in 1933 (shared with Schrodinger), and Anderson received it in 1936.

Beyond the Dirac Sea: Modern Interpretation

The Dirac sea picture, while historically important and pedagogically vivid, has serious conceptual difficulties:

  1. It is specific to fermions. Bosons do not obey the exclusion principle, so there is no "filled sea" to prevent cascading. Yet bosons also have antiparticles (e.g., the $W^+$ is the antiparticle of the $W^-$).

  2. It requires an infinite, unobservable background. The vacuum is defined as an infinite sea of negative-energy electrons, whose charge and energy are somehow subtracted by hand.

  3. It does not generalize well. In quantum field theory, the Dirac sea is replaced by a much more elegant framework.

The modern interpretation, due to Ernst Stückelberg (1941) and Richard Feynman (1949), is the Feynman-Stückelberg interpretation: a negative-energy particle traveling backward in time is equivalent to a positive-energy antiparticle traveling forward in time.

Mathematically, this is implemented by redefining the negative-energy solution:

$$\psi_{\text{neg}} = v(\mathbf{p})\exp\left(\frac{i}{\hbar}(-\mathbf{p}\cdot\mathbf{r} + |E|t)\right)$$

This is the negative-energy solution with energy $-|E|$ and momentum $-\mathbf{p}$, relabeled: it is read as a positron with positive energy $|E|$ and momentum $\mathbf{p}$ traveling forward in time. In this interpretation there are no negative-energy states at all, just particles and antiparticles, both with positive energy.

Pair Creation and Annihilation

The Dirac theory predicts two fundamentally new processes:

Pair creation: A photon with energy $E_\gamma \geq 2m_ec^2 \approx 1.022$ MeV can convert into an electron-positron pair in the presence of a nucleus (which absorbs the recoil momentum):

$$\gamma \to e^- + e^+$$

Pair annihilation: An electron and positron can annihilate, converting their rest mass energy into photons:

$$e^- + e^+ \to \gamma + \gamma$$

(At least two photons are required to conserve both energy and momentum: in the center-of-mass frame the total momentum is zero, which a single photon cannot carry.) These processes are routinely observed in particle physics experiments and in medical PET (positron emission tomography) scans. At the Large Hadron Collider at CERN, particle-antiparticle pairs are created copiously in high-energy proton-proton collisions.

🧪 Experiment: In a PET scan, a radioactive tracer emits positrons, each of which annihilates with a nearby electron, producing two 511 keV photons traveling in opposite directions. Detecting these back-to-back photons pinpoints the location of the tracer in the body. This is Dirac's equation at work in the hospital.
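The back-to-back 511 keV photons follow from energy and momentum conservation. As a short worked check, assume the pair annihilates essentially at rest, so the total momentum is zero and the total energy is $2m_ec^2$. A photon of energy $E_\gamma$ carries momentum of magnitude $E_\gamma/c$, so:

$$\text{one photon:}\quad |\mathbf{p}_\gamma| = \frac{E_\gamma}{c} = 2m_ec \neq 0 \quad (\text{momentum is not conserved})$$

$$\text{two photons:}\quad \mathbf{p}_1 + \mathbf{p}_2 = 0,\quad E_1 + E_2 = 2m_ec^2 \;\Rightarrow\; E_1 = E_2 = m_ec^2 \approx 511\ \text{keV}$$

Equal and opposite momenta force equal photon energies, which is why the two detected photons are collinear and each carries the electron rest energy.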

⚖️ Interpretation: Every known particle has an antiparticle (some particles, like the photon, are their own antiparticle). The existence of antimatter is one of the most robustly confirmed predictions in all of physics. It follows from the conjunction of quantum mechanics and special relativity — neither theory alone predicts it.

Antimatter and the CPT Theorem

The existence of antimatter is intimately connected to the CPT theorem, one of the deepest results in quantum field theory. The theorem states that any Lorentz-invariant quantum field theory with a Hermitian Hamiltonian and local interactions is invariant under the combined operation of:

  • C (charge conjugation): replace every particle with its antiparticle
  • P (parity): spatial inversion $\mathbf{r} \to -\mathbf{r}$
  • T (time reversal): $t \to -t$

The CPT theorem guarantees that every particle must have an antiparticle with exactly the same mass and lifetime but opposite charge and magnetic moment. Any violation of CPT would signal a breakdown of Lorentz invariance itself — one of the most fundamental symmetries of nature.


29.10 The Lamb Shift: Beyond Dirac

The Dirac Prediction and Its Failure

As we discussed in Section 29.7, the Dirac equation predicts that the $2S_{1/2}$ and $2P_{1/2}$ states of hydrogen are exactly degenerate. For nearly two decades (1928–1947), this prediction was accepted as correct. The experimental precision was not sufficient to test it.

In 1947, Willis Lamb and Robert Retherford used microwave spectroscopy — a technique developed from wartime radar research — to directly measure the transition frequency between the $2S_{1/2}$ and $2P_{1/2}$ states. Their result:

$$\Delta E_{\text{Lamb}} = E(2S_{1/2}) - E(2P_{1/2}) \approx 1057\,\text{MHz} \approx 4.37 \times 10^{-6}\,\text{eV} \tag{29.29}$$

The $2S_{1/2}$ state is higher in energy than the $2P_{1/2}$ state. The Dirac equation, which says they should be degenerate, is wrong.

Why Does the Dirac Equation Fail?

The Dirac equation treats the electromagnetic field classically — the Coulomb potential $-e^2/r$ is just a classical function, not a quantum object. But the electromagnetic field is itself quantized. The electron interacts not only with the classical Coulomb field of the proton but also with the quantum fluctuations of the electromagnetic vacuum.

Three effects contribute to the Lamb shift:

  1. Electron self-energy: The electron emits and reabsorbs virtual photons, which briefly change its effective mass and "smear out" its position over a distance of order $\sqrt{(2\alpha/\pi)\ln(1/\alpha)}\,\hbar/(m_ec) \sim 10^{-13}$ m. This smearing is more significant for $S$-states (which have nonzero probability at $r = 0$) than for $P$-states, breaking the degeneracy.

  2. Vacuum polarization: Virtual electron-positron pairs momentarily appear in the vacuum near the proton, partially screening its charge. This effect slightly decreases the $2S_{1/2}$ energy (opposite sign to the self-energy) and contributes about $-27$ MHz to the total Lamb shift.

  3. Vertex correction: The electron's interaction with the quantized electromagnetic field modifies its effective magnetic moment (the anomalous magnetic moment), which also shifts the energy levels.

The dominant contribution is the electron self-energy, computed by Hans Bethe in a famous calculation performed on the train ride home from the 1947 Shelter Island conference:

$$\Delta E_{\text{Bethe}} \approx \frac{4\alpha^5mc^2}{3\pi n^3}\ln\left(\frac{1}{\alpha^2}\right)\delta_{l,0} \tag{29.30}$$

Bethe's non-relativistic estimate gave about 1040 MHz, within 2% of the experimental value. It remains one of the most celebrated back-of-the-envelope estimates in the history of physics.
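It is instructive to put numbers into Eq. (29.30) for the $2S$ state. The sketch below evaluates the prefactor with two choices of logarithm: the schematic $\ln(1/\alpha^2)$ as written, and $\ln(mc^2/\langle E\rangle)$ with an average excitation energy $\langle E\rangle \approx 17.8$ Ry. The latter is an assumption introduced here (it is the value usually quoted for Bethe's calculation and is not stated in the text). The function name and constants are likewise illustrative.

```python
import math

ALPHA = 1 / 137.035999      # fine-structure constant (assumed value)
MC2_EV = 510998.95          # electron rest energy in eV (assumed value)
H_EV_S = 4.135667696e-15    # Planck constant in eV*s
RYDBERG_EV = 13.605693      # Rydberg energy in eV

def bethe_shift_mhz(n, log_factor):
    """Prefactor of Eq. (29.30) for an S state times a supplied logarithm, in MHz."""
    energy_ev = 4 * ALPHA**5 * MC2_EV / (3 * math.pi * n**3) * log_factor
    return energy_ev / H_EV_S / 1e6

# Logarithm exactly as written in Eq. (29.30)
rough = bethe_shift_mhz(2, math.log(1 / ALPHA**2))

# ASSUMPTION: average excitation energy of 17.8 Ry in the logarithm
refined = bethe_shift_mhz(2, math.log(MC2_EV / (17.8 * RYDBERG_EV)))

print(f"with ln(1/alpha^2):      {rough:.0f} MHz")
print(f"with ln(mc^2 / 17.8 Ry): {refined:.0f} MHz")
```

Read literally, $\ln(1/\alpha^2)$ gives roughly 1.3 GHz, the right order of magnitude; the 17.8 Ry choice lands close to the 1040 MHz quoted above. The comparison shows how sensitive the estimate is to the argument of the logarithm.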

The remaining discrepancy was resolved by the full relativistic QED calculation of Kroll and Lamb, and independently by French and Weisskopf, over the following two years. The complete result includes the electron self-energy (+1017 MHz), vacuum polarization ($-27$ MHz), the vertex correction (+68 MHz), and smaller higher-order terms. The total agrees with experiment to extraordinary precision.

The theoretical tools developed to carry out these calculations — Feynman diagrams, renormalization, the systematic perturbative expansion in powers of $\alpha$ — became the foundation of modern quantum field theory and, ultimately, of the Standard Model of particle physics. In a very real sense, the Lamb shift launched modern theoretical physics.

📊 By the Numbers: The Lamb shift is an effect of order $\alpha^5mc^2 \sim 10^{-6}$ eV. Compare this with the fine structure ($\alpha^4mc^2 \sim 10^{-4}$ eV), the gross structure ($\alpha^2mc^2 \sim 10$ eV), and the hyperfine structure ($\alpha^4(m_e/m_p)mc^2 \sim 10^{-7}$ eV). Each layer of the hydrogen energy spectrum reveals a new piece of physics.

🔗 Connection: The hierarchy of hydrogen energy corrections is:

  • Gross structure (Bohr/Schrodinger, Ch 5): $\sim \alpha^2 mc^2 \sim 10$ eV
  • Fine structure (Dirac/spin-orbit, Ch 18 and this chapter): $\sim \alpha^4 mc^2 \sim 10^{-4}$ eV
  • Lamb shift (QED, this section): $\sim \alpha^5 mc^2 \sim 10^{-6}$ eV
  • Hyperfine structure (nuclear spin): $\sim \alpha^4(m_e/m_p) mc^2 \sim 10^{-7}$ eV

Each step down in energy scale requires a more sophisticated theoretical framework. The hydrogen atom is the Rosetta Stone of quantum physics — and we have been reading it throughout this textbook.
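The scaling combinations in this hierarchy can be evaluated directly. The sketch below computes only the bare products of $\alpha$, $m_e/m_p$, and $mc^2$; the level-dependent numerical prefactors (such as $1/2n^2$) are omitted, so the results agree with the round figures quoted above only up to such factors. The constants are assumed values.

```python
ALPHA = 1 / 137.035999          # fine-structure constant (assumed value)
MC2_EV = 510998.95              # electron rest energy in eV (assumed value)
ME_OVER_MP = 1 / 1836.15267     # electron-to-proton mass ratio (assumed value)

scales = {
    "gross structure":     ALPHA**2 * MC2_EV,
    "fine structure":      ALPHA**4 * MC2_EV,
    "Lamb shift":          ALPHA**5 * MC2_EV,
    "hyperfine structure": ALPHA**4 * ME_OVER_MP * MC2_EV,
}

for name, value in scales.items():
    print(f"{name:20s} {value:.2e} eV")
```

Each successive entry is smaller than the previous one, by a factor of $\alpha^2$, $\alpha$, and roughly $(m_e/m_p)/\alpha$ respectively.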


29.11 Why Quantum Field Theory Is Necessary

The Dirac equation is a magnificent achievement, but it is ultimately an incomplete theory. The problems that drove us from the Klein-Gordon equation to the Dirac equation were partially solved — spin, positive probability density, $g_s = 2$ — but deeper problems remain. These problems are not technical inconveniences; they are the universe telling us that single-particle relativistic quantum mechanics is not the final framework. Let us catalogue the reasons why quantum field theory (QFT) is the necessary next step.

Problem 1: Particle Number Is Not Conserved

The Klein paradox, pair creation, and pair annihilation all demonstrate that the number of particles is not fixed in relativistic processes. When a photon creates an electron-positron pair, we go from one particle to two. When an electron and positron annihilate, two fermions disappear and two photons appear in their place. A theory built on a single-particle wave function has no room for these processes.

In QFT, particle number is not a fixed parameter but a quantum observable. The state space is a Fock space (a direct sum of $n$-particle Hilbert spaces for all $n = 0, 1, 2, \ldots$), and operators can create and destroy particles. The Dirac field $\hat{\psi}(\mathbf{r}, t)$ is no longer a wave function but a field operator that, in the standard convention, annihilates an electron or creates a positron at the point $(\mathbf{r}, t)$; its adjoint $\hat{\psi}^\dagger$ creates an electron or annihilates a positron.

Problem 2: The Klein Paradox

A Dirac electron encountering a step potential $V_0 > E + mc^2$ shows a transmitted current moving in the opposite direction to the incident particle, so that the reflected current exceeds the incident one and the reflection coefficient is greater than 1. In the single-particle Dirac theory, this is paradoxical. In QFT, the resolution is straightforward: the strong potential creates electron-positron pairs at the barrier. The created electrons add to the reflected beam, while the positrons move into the region of the step. The total particle number changes, which is precisely what a single-particle theory cannot accommodate.

The Klein paradox is not merely a theoretical curiosity. In the vicinity of superheavy nuclei ($Z \gtrsim 170$, achievable momentarily in heavy-ion collisions), the electric field is strong enough to spontaneously create $e^+e^-$ pairs from the vacuum. This phenomenon, called spontaneous pair creation or vacuum breakdown, has been searched for at GSI (the heavy-ion research center in Darmstadt, Germany), so far without conclusive confirmation, and is an active area of research. It is often described as the electromagnetic analogue of Hawking radiation from black holes.

Problem 3: The Dirac Sea Is Unsatisfying

The Dirac sea picture "works" for fermions, but it:

  • Requires an infinite, unobservable background
  • Does not apply to bosons, which also have antiparticles
  • Is representation-dependent (there are other ways to fill the sea)

In QFT, the vacuum is defined as the state annihilated by all annihilation operators: $\hat{a}_\mathbf{p}|0\rangle = \hat{b}_\mathbf{p}|0\rangle = 0$, where $\hat{a}$ annihilates electrons and $\hat{b}$ annihilates positrons. Both electrons and positrons are positive-energy excitations above this vacuum. No infinite sea is needed.
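A toy model makes this definition concrete. The sketch below keeps just one electron mode and one positron mode, giving a four-dimensional Fock space, and represents $\hat{a}$ and $\hat{b}$ as matrices using a Jordan-Wigner sign string. That representation, the restriction to two modes, and all variable names are choices made here for illustration, not constructions from the text.

```python
import numpy as np

# Single-mode building blocks in the basis (|0>, |1>)
lower = np.array([[0.0, 1.0], [0.0, 0.0]])   # maps |1> to |0>, kills |0>
parity = np.diag([1.0, -1.0])                # (-1)^N sign string
eye = np.eye(2)

# Two fermionic modes: electron (first factor) and positron (second factor)
a = np.kron(lower, eye)       # annihilates an electron
b = np.kron(parity, lower)    # annihilates a positron (sign string ensures {a, b} = 0)

vacuum = np.zeros(4)
vacuum[0] = 1.0               # |0>_electron (x) |0>_positron

# Pair creation: act with both creation operators on the vacuum
pair = a.T @ b.T @ vacuum

number = a.T @ a + b.T @ b        # total particle number
charge = -(a.T @ a) + b.T @ b     # charge in units of e (electron -1, positron +1)

print("a|0> =", a @ vacuum)
print("b|0> =", b @ vacuum)
print("N in pair state:", pair @ number @ pair)
print("Q in pair state:", pair @ charge @ pair)
```

The vacuum is annihilated by both operators, and the pair state carries particle number 2 and total charge 0. With a Hamiltonian proportional to `number`, both excitations cost positive energy, so no filled sea is required.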

Problem 4: Interactions Require Field Quantization

The Lamb shift demonstrates that treating the electromagnetic field classically while quantizing the electron is inconsistent — it misses real, measurable effects. The electromagnetic field itself must be quantized, with photons as its quanta. The interaction between electrons and photons is described by quantum electrodynamics (QED), the prototypical quantum field theory, which treats the Dirac field and the electromagnetic field on equal footing as quantized fields.

Problem 5: Causality and Locality

In non-relativistic quantum mechanics, the propagator $G(\mathbf{r}, t; \mathbf{r}', 0)$ is nonzero for all $\mathbf{r}$ at any $t > 0$, no matter how large $|\mathbf{r} - \mathbf{r}'|$ or how small $t$. This means the particle can "propagate" faster than light — a violation of relativistic causality. The single-particle Dirac propagator has the same problem.

In QFT, this is resolved. While the propagator for a single particle can be superluminal, observable quantities (which involve both particles and antiparticles) respect causality. The key result is that field operators at spacelike-separated points commute (for bosons) or anticommute (for fermions):

$$[\hat{\phi}(x), \hat{\phi}(y)] = 0 \quad \text{for } (x - y)^2 < 0 \tag{29.31}$$

This ensures that no measurement at point $x$ can influence a measurement at a spacelike-separated point $y$. Causality is saved — but only by the conspiracy between particles and antiparticles that QFT provides.

Problem 6: The Spin-Statistics Connection and the CPT Theorem

Why are electrons fermions (spin-1/2, Fermi-Dirac statistics, Pauli exclusion) while photons are bosons (spin-1, Bose-Einstein statistics, no exclusion)? In non-relativistic quantum mechanics, this is imposed as an axiom. In QFT, the spin-statistics theorem proves that half-integer-spin particles must be fermions and integer-spin particles must be bosons, on pain of violating either causality or the positivity of energy. This is another triumph of the relativistic framework.

Closely related is the CPT theorem: any Lorentz-invariant local quantum field theory with a Hermitian Hamiltonian is automatically invariant under the combined transformation of charge conjugation (C), parity (P), and time reversal (T). The CPT theorem guarantees that every particle has an antiparticle with the same mass and lifetime but opposite quantum numbers. Any observed violation of CPT invariance would signal a breakdown of one of the most fundamental assumptions of modern physics — either Lorentz invariance or quantum mechanics itself.

Neither the spin-statistics theorem nor the CPT theorem can be derived within single-particle quantum mechanics, not even with the Dirac equation. They are intrinsically field-theoretic results, requiring the full machinery of creation and annihilation operators acting on Fock space.

The Road Ahead

Quantum field theory is the natural culmination of the journey that began with the Schrodinger equation. Each step along the way — from the Klein-Gordon equation to the Dirac equation to QED — has been driven by the demand for consistency between quantum mechanics and special relativity. The final framework, QFT, resolves all the problems we have identified:

| Problem | Single-Particle QM | QFT Resolution |
|---|---|---|
| Negative-energy solutions | Dirac sea (fermions only) | Antiparticle creation operators |
| Klein paradox | Paradoxical transmission | Pair creation at barrier |
| Particle creation/annihilation | Impossible | Fock space, field operators |
| Lamb shift | Cannot compute | Virtual photon self-energy |
| Causality violation | Superluminal propagation | $[\hat{\phi}(x), \hat{\phi}(y)] = 0$ at spacelike separation |
| Spin-statistics connection | Imposed as axiom | Proved from Lorentz invariance + positivity |

💡 Key Insight: The pattern of physics revealed by this chapter is remarkable. Each time we achieve a theoretical milestone (Klein-Gordon, Dirac), the new theory both solves old problems and points to new ones that demand an even deeper framework. The Dirac equation is not wrong — it is the correct single-particle limit of QED and remains indispensable for atomic, molecular, and condensed-matter physics. But it is incomplete, and the universe's insistence on completeness drives us inevitably toward quantum field theory.

⚠️ Common Misconception: Students sometimes conclude that "the Dirac equation is obsolete" because QFT is more fundamental. This is wrong. The Dirac equation remains the correct description of spin-1/2 particles in external fields whenever pair creation is negligible, which covers the vast majority of atomic and molecular physics. The Dirac equation for hydrogen, coupled with QED corrections computed perturbatively, gives some of the most precise predictions in all of science. What is obsolete is not the Dirac equation but the single-particle probabilistic interpretation of the Dirac equation.


29.12 Summary

This chapter has traced the arc from the non-relativistic Schrodinger equation to the threshold of quantum field theory. Let us consolidate the key results.

The Klein-Gordon Equation

$$\left(\Box + \frac{m^2c^2}{\hbar^2}\right)\phi = 0$$

  • Lorentz covariant, second-order wave equation
  • Describes spin-0 particles
  • Probability density is not positive-definite (disqualifies it as a single-particle wave equation)
  • Correctly reinterpreted in QFT as a field equation for scalar bosons (pions, Higgs)

The Dirac Equation

$$(i\hbar\gamma^\mu\partial_\mu - mc)\psi = 0$$

  • Lorentz covariant, first-order wave equation for spin-1/2 particles
  • Four-component spinor wave function
  • Gamma matrices satisfy the Clifford algebra: $\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu}\mathbb{I}_4$
  • Automatically predicts:
      • Electron spin ($s = 1/2$)
      • Correct magnetic moment ($g_s = 2$)
      • Fine structure of hydrogen
      • Existence of antimatter

Fine Structure from the Dirac Equation

$$E_{n,j} = mc^2\left[1 + \left(\frac{\alpha}{n - j - \frac{1}{2} + \sqrt{(j + \frac{1}{2})^2 - \alpha^2}}\right)^2\right]^{-1/2}$$

  • Depends on $n$ and $j$ only (not $l$ separately)
  • Recovers the Schrodinger energy levels at order $\alpha^2$
  • Fine structure at order $\alpha^4$ unifies relativistic correction, spin-orbit coupling, and Darwin term

Antimatter

  • Negative-energy solutions of the Dirac equation predict the existence of antiparticles
  • The positron (anti-electron) was discovered by Anderson in 1932
  • Every particle has an antiparticle (CPT theorem)
  • Particle-antiparticle creation and annihilation are fundamental processes

Why QFT

The marriage of quantum mechanics and special relativity demands quantum field theory because:

  1. Particle number is not conserved

  2. The vacuum has structure (fluctuations, virtual pairs)

  3. Causality requires both particles and antiparticles

  4. The spin-statistics connection can be derived only in QFT

  5. The Lamb shift requires quantization of the electromagnetic field

The Dirac equation is not the end of the story — it is the last chapter before a new book begins. That new book is quantum field theory, and while it lies beyond the scope of this textbook, everything we have built here — the operator formalism, Hilbert spaces, symmetry principles, perturbation theory, angular momentum algebra, and now the Dirac equation — forms its foundation.

"In science one must search for ideas. Once an idea is obtained, its development involves only a technical problem." — Paul Dirac


Looking Forward

Chapter 30 will survey the state of the art in quantum mechanics and quantum technology, showing how the foundational concepts of this textbook connect to the cutting edge of physics in the 21st century. Chapters 31–37 (Part VII: Advanced Topics) will develop several of the advanced frameworks we have glimpsed here — path integrals, Berry phases, open quantum systems, second quantization — for students who wish to go deeper.


Chapter 29 has brought us to the frontier where quantum mechanics meets special relativity. The Dirac equation stands as one of the most beautiful equations in physics — a single line of mathematics that predicts spin, antimatter, and the fine structure of atoms. But nature's deepest secrets live in the quantum fields that the Dirac equation only hints at. The journey continues.