
Learning Objectives

  • Prove the variational theorem and explain why ⟨ψ|Ĥ|ψ⟩/⟨ψ|ψ⟩ ≥ E₀ for any trial state |ψ⟩
  • Design effective trial wavefunctions with adjustable parameters for ground state estimation
  • Calculate the ground state energy of helium using a screened hydrogen wavefunction
  • Analyze the hydrogen molecule ion H₂⁺ using LCAO trial functions
  • Implement variational Monte Carlo for ground state estimation of arbitrary potentials

Chapter 19: The Variational Principle: When You Can't Solve It, Bound It

"The variational method is the most powerful approximate method in quantum mechanics. It is remarkably robust: even a terrible guess for the wavefunction gives a useful bound on the energy." — Hans Bethe

"Make things as simple as possible, but not simpler." — Albert Einstein (attributed)

Here is the uncomfortable truth about quantum mechanics: you can solve the Schrödinger equation exactly for a depressingly small number of systems. The hydrogen atom, the harmonic oscillator, the infinite square well, a few others. The moment you add a second electron — even in the humble helium atom — the exact analytical solution vanishes. For helium, for lithium, for every molecule and solid, you are stuck.

In Chapter 17, we developed perturbation theory — our first systematic weapon against unsolvable problems. Perturbation theory works beautifully when the problem is close to one you can solve. But what happens when there is no nearby solvable problem? What if the perturbation is not small? What if you do not even know where to start?

This chapter introduces the variational principle — an approach so powerful, so flexible, and so fundamentally different from perturbation theory that it forms the backbone of modern computational quantum mechanics. The idea is deceptively simple: you do not need to solve the Schrödinger equation to get a useful answer. You need only guess a wavefunction, compute its energy expectation value, and then prove — rigorously — that this energy is an upper bound on the true ground state energy. Then you optimize your guess.

The variational method does not give you the exact answer. But it gives you a bound — a number that the true ground state energy cannot exceed. And with clever choices and persistent optimization, that bound can be astonishingly tight. For helium, a system that defeated perturbation theory at the few-percent level (Chapter 17), the variational method yields results accurate to better than 0.1% with a simple one-parameter trial function — and with more sophisticated wavefunctions, accuracy exceeding ten decimal places.

Let us see how this works.


19.1 The Variational Theorem: Proof and Meaning

Statement of the Theorem

The variational theorem is one of the most important results in quantum mechanics. It states:

Variational Theorem. Let $\hat{H}$ be the Hamiltonian of a quantum system with ground state energy $E_0$ (the lowest eigenvalue of $\hat{H}$). For any normalized state $|\psi\rangle$ in the Hilbert space,

$$\langle \psi | \hat{H} | \psi \rangle \geq E_0$$

with equality if and only if $|\psi\rangle$ is the ground state $|0\rangle$.

This is a remarkable statement. It says that no matter what state you choose — no matter how wildly wrong your guess — the expectation value of the Hamiltonian in that state is always greater than or equal to the true ground state energy. The ground state energy is the absolute floor. You cannot go below it.

For an unnormalized trial state $|\tilde{\psi}\rangle$, the theorem reads:

$$E[\tilde{\psi}] \equiv \frac{\langle \tilde{\psi} | \hat{H} | \tilde{\psi} \rangle}{\langle \tilde{\psi} | \tilde{\psi} \rangle} \geq E_0$$

The quantity $E[\tilde{\psi}]$ is called the energy functional or Rayleigh quotient. It maps each state in the Hilbert space to a real number, and the minimum of this functional over all possible states is exactly $E_0$.

Proof

The proof is elegant and instructive. Let $\{|n\rangle\}$ be the complete set of energy eigenstates of $\hat{H}$, with eigenvalues $E_n$ ordered so that $E_0 \leq E_1 \leq E_2 \leq \cdots$. Since these eigenstates form a complete orthonormal basis, we can expand any normalized trial state as:

$$|\psi\rangle = \sum_{n=0}^{\infty} c_n |n\rangle$$

where $c_n = \langle n | \psi \rangle$ are the expansion coefficients. Normalization requires:

$$\langle \psi | \psi \rangle = \sum_{n=0}^{\infty} |c_n|^2 = 1$$

Now compute the energy expectation value:

$$\langle \psi | \hat{H} | \psi \rangle = \sum_{n=0}^{\infty} \sum_{m=0}^{\infty} c_m^* c_n \langle m | \hat{H} | n \rangle = \sum_{n=0}^{\infty} |c_n|^2 E_n$$

where we used $\hat{H}|n\rangle = E_n|n\rangle$ and $\langle m | n \rangle = \delta_{mn}$.

Here comes the key step. Since $E_n \geq E_0$ for all $n$, we can write:

$$\langle \psi | \hat{H} | \psi \rangle = \sum_{n=0}^{\infty} |c_n|^2 E_n \geq \sum_{n=0}^{\infty} |c_n|^2 E_0 = E_0 \sum_{n=0}^{\infty} |c_n|^2 = E_0$$

Therefore $\langle \psi | \hat{H} | \psi \rangle \geq E_0$. $\square$

Equality holds if and only if $|c_n|^2 = 0$ for all $n \neq 0$, i.e., $|\psi\rangle = e^{i\phi}|0\rangle$ — the trial state is the ground state (up to a phase).

💡 Key Insight: The proof reveals why the variational principle works. Every trial state has "contamination" from excited states. These excited state components can only raise the energy expectation value, never lower it. The ground state is the absolute minimum, so any deviation from it pushes the energy up.
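The inequality in the proof is easy to verify numerically: on any finite-dimensional Hilbert space, the Rayleigh quotient of a random state never drops below the lowest eigenvalue. A minimal sketch (the 4×4 Hermitian matrix is an arbitrary stand-in for a Hamiltonian, not a physical system):

```python
import numpy as np

rng = np.random.default_rng(0)

# An arbitrary Hermitian "Hamiltonian" on a 4-dimensional Hilbert space.
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (A + A.conj().T) / 2
E0 = np.linalg.eigvalsh(H)[0]          # exact ground state energy

def rayleigh(H, psi):
    """Energy functional <psi|H|psi> / <psi|psi>."""
    return (psi.conj() @ H @ psi).real / (psi.conj() @ psi).real

# Every random trial state gives an upper bound on E0.
bounds = [rayleigh(H, rng.normal(size=4) + 1j * rng.normal(size=4))
          for _ in range(1000)]

print(min(bounds) >= E0 - 1e-9)        # True: no trial state beats E0
```

Even the best of a thousand random guesses only approaches $E_0$ from above, exactly as the proof demands.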

What the Theorem Does NOT Say

The variational theorem gives an upper bound on $E_0$. It does not tell you:

  • How close your upper bound is to $E_0$ (it could be terrible).
  • Anything about excited state energies (in general, $\langle \psi | \hat{H} | \psi \rangle$ is not an upper bound on $E_1$ for an arbitrary $|\psi\rangle$).
  • The wavefunction itself (even if your energy is very close to $E_0$, your trial wavefunction might be a poor approximation to $|0\rangle$).

The last point deserves emphasis. The energy is a single number — a weighted average of all the eigenvalues. It is possible for a trial function to get the energy almost exactly right while getting the spatial distribution of the wavefunction quite wrong. The energy is less sensitive to errors in $|\psi\rangle$ than the wavefunction itself.

⚠️ Common Misconception: "A good variational energy means a good variational wavefunction." This is false. The energy is a quadratic functional of the wavefunction, so small errors in $|\psi\rangle$ produce second-order errors in $E$. An energy that is accurate to 0.1% could correspond to a wavefunction that is accurate to only ~3%.

Extension to Excited States

The variational theorem can be extended to excited states, but only with additional constraints. If your trial state is guaranteed to be orthogonal to the true ground state — $\langle 0 | \psi \rangle = 0$ — then:

$$\langle \psi | \hat{H} | \psi \rangle \geq E_1$$

More generally, if $|\psi\rangle$ is orthogonal to the lowest $k$ eigenstates, then $\langle \psi | \hat{H} | \psi \rangle \geq E_k$. This is the basis of the variational method for excited states, but in practice it requires knowing the lower eigenstates exactly (or at least their symmetry), which limits its applicability.

A more practical approach uses symmetry: if the ground state has a certain symmetry (e.g., even parity) and your trial function has a different symmetry (e.g., odd parity), then the trial function is automatically orthogonal to the ground state. In this case, the variational theorem gives an upper bound on the lowest energy eigenvalue with that symmetry.

🔗 Connection: In Chapter 10, we showed that symmetry operators commute with the Hamiltonian, so energy eigenstates can be labeled by symmetry quantum numbers. This fact is what makes the variational method for excited states practical: design a trial function with the symmetry of the desired excited state.

Checkpoint: Before proceeding, make sure you can answer these questions:

  1. Why does the proof require that the energy eigenstates form a complete basis?
  2. If your trial function accidentally equals the first excited state, what does the variational theorem tell you?
  3. Can the variational method ever give you a lower bound on $E_0$?


19.2 Trial Wavefunctions and Variational Parameters

The power of the variational method lies entirely in the choice of trial wavefunction. A good trial function captures the essential physics of the problem; a bad one wastes your time with a useless upper bound. Choosing well is as much art as science, and it is the part of the variational method that textbooks cannot fully teach — you develop intuition through practice.

Anatomy of a Trial Wavefunction

A variational trial wavefunction has the general form:

$$|\psi(\alpha_1, \alpha_2, \ldots, \alpha_k)\rangle$$

where $\alpha_1, \alpha_2, \ldots, \alpha_k$ are variational parameters — free parameters that you will adjust to minimize the energy. The variational procedure is:

  1. Choose a functional form for $|\psi(\boldsymbol{\alpha})\rangle$ guided by physical intuition.
  2. Compute the energy functional $E(\boldsymbol{\alpha}) = \langle \psi(\boldsymbol{\alpha}) | \hat{H} | \psi(\boldsymbol{\alpha}) \rangle / \langle \psi(\boldsymbol{\alpha}) | \psi(\boldsymbol{\alpha}) \rangle$.
  3. Minimize $E(\boldsymbol{\alpha})$ with respect to all parameters: $\partial E / \partial \alpha_i = 0$ for all $i$.
  4. Report $E_{\min} = E(\boldsymbol{\alpha}_{\text{opt}})$ as your upper bound on $E_0$.

The variational theorem guarantees that $E_{\min} \geq E_0$, so the lower you can push $E_{\min}$, the better your bound.
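As an illustration of the four steps, consider the 1D harmonic oscillator $\hat{H} = \hat{p}^2/2 + \hat{x}^2/2$ (with $\hbar = m = \omega = 1$) and a Gaussian trial $\psi \propto e^{-\alpha x^2}$; standard Gaussian integrals give $\langle T \rangle = \alpha/2$ and $\langle V \rangle = 1/(8\alpha)$. A minimal sketch of the procedure (this example and the numerical bounds are illustrative choices, not from the text):

```python
from scipy.optimize import minimize_scalar

# Steps 1-2: Gaussian trial psi ~ exp(-alpha x^2) for H = p^2/2 + x^2/2.
# Standard Gaussian integrals give <T> = alpha/2 and <V> = 1/(8 alpha).
def energy(alpha):
    return alpha / 2 + 1 / (8 * alpha)

# Step 3: minimize over the variational parameter.
res = minimize_scalar(energy, bounds=(1e-3, 10), method="bounded")
alpha_opt, E_min = res.x, res.fun

# Step 4: report the upper bound.  This trial family happens to contain
# the exact ground state, so we recover alpha = 1/2 and E0 = 1/2.
print(alpha_opt, E_min)
```

Because the Gaussian family contains the true ground state here, the bound is tight; the hydrogen example below shows what happens when it does not.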

Guiding Principles for Trial Functions

Here are the rules of thumb that experienced practitioners use:

1. Respect the boundary conditions. If the true wavefunction must vanish at certain boundaries (e.g., at the walls of an infinite well, or at $r \to \infty$ for a bound state), your trial function must do the same. Violating boundary conditions does not invalidate the variational theorem, but it produces needlessly poor results.

2. Capture the qualitative shape. The trial function should have the same general features as the true ground state: it should peak where the potential is lowest, decay where the potential is high, and have no nodes (for the ground state). A Gaussian is often a reasonable first guess for a confining potential.

3. Include the right physics. If the system has a known feature — screening, correlation, a cusp condition — build it into the trial function. For atoms, the wavefunction must satisfy the Kato cusp condition: $\psi'(0)/\psi(0) = -Z/a_0$ at each nucleus, where $Z$ is the nuclear charge. This follows from the Coulomb singularity in the potential.

4. More parameters are better (usually). Adding parameters to a trial function can only decrease (or leave unchanged) the optimized energy, never increase it. A 3-parameter trial function will always give a result at least as good as a 1-parameter trial function. But more parameters means harder optimization — there is a practical tradeoff.

5. Use symmetry. If the Hamiltonian has certain symmetries, the ground state wavefunction must respect those symmetries. Do not waste variational freedom on components that will be projected out by symmetry.

Example: The Gaussian Trial Function for the Hydrogen Atom

To illustrate the method, let us apply the variational principle to a problem we can solve exactly: the hydrogen atom ground state. This will let us see how well the method works and learn from its limitations.

The hydrogen Hamiltonian (in atomic units, $\hbar = m_e = e = 1$, $a_0 = 1$) is:

$$\hat{H} = -\frac{1}{2}\nabla^2 - \frac{1}{r}$$

The exact ground state energy is $E_0 = -1/2$ hartree $= -13.6$ eV, with wavefunction $\psi_0(r) = \pi^{-1/2} e^{-r}$.

Suppose we did not know this and tried a Gaussian trial function:

$$\psi(r; \alpha) = \left(\frac{2\alpha}{\pi}\right)^{3/4} e^{-\alpha r^2}$$

This is normalized: $\langle \psi | \psi \rangle = 1$ for all $\alpha > 0$. The parameter $\alpha$ controls the width.

Computing the energy expectation value (the integrals are standard Gaussians):

$$\langle T \rangle = \frac{3\alpha}{2}, \qquad \langle V \rangle = -2\sqrt{\frac{2\alpha}{\pi}}$$

$$E(\alpha) = \frac{3\alpha}{2} - 2\sqrt{\frac{2\alpha}{\pi}}$$

Minimizing: $\frac{dE}{d\alpha} = \frac{3}{2} - \sqrt{\frac{2}{\pi\alpha}} = 0$, giving $\alpha_{\text{opt}} = \frac{8}{9\pi}$, and:

$$E_{\min} = -\frac{4}{3\pi} \approx -0.4244 \text{ hartree}$$

📊 By the Numbers: The Gaussian trial function gives $E_{\min} = -0.4244$ hartree, compared to the exact $E_0 = -0.5000$ hartree. The error is about 15% — not terrible for such a crude guess, but not great either. The problem is that the Gaussian has the wrong behavior near $r = 0$ (it is flat instead of having a cusp) and the wrong decay at large $r$ ($e^{-\alpha r^2}$ instead of $e^{-r}$).

This example teaches two lessons. First, even a qualitatively wrong trial function gives a legitimate upper bound — the variational theorem guarantees it. Second, getting the physics right matters enormously. If we had instead used the trial function $\psi(r; \alpha) = N e^{-\alpha r}$ (the correct exponential form with adjustable decay rate), we would recover the exact answer $E_0 = -0.5$ hartree with $\alpha_{\text{opt}} = 1$ — because this family includes the true ground state.
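Both trial families can be checked numerically. A short sketch minimizing the two one-parameter energy functions from this section (for the exponential family, the standard results are $\langle T \rangle = \alpha^2/2$ and $\langle V \rangle = -\alpha$):

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Gaussian trial for hydrogen: E(alpha) = 3 alpha/2 - 2 sqrt(2 alpha/pi).
def E_gauss(a):
    return 1.5 * a - 2 * np.sqrt(2 * a / np.pi)

# Exponential trial psi ~ exp(-alpha r): <T> = alpha^2/2, <V> = -alpha.
def E_exp(a):
    return a**2 / 2 - a

res_g = minimize_scalar(E_gauss, bounds=(1e-3, 5), method="bounded")
res_e = minimize_scalar(E_exp, bounds=(1e-3, 5), method="bounded")

print(res_g.x, res_g.fun)  # alpha = 8/(9 pi) ~ 0.283, E = -4/(3 pi) ~ -0.4244
print(res_e.x, res_e.fun)  # alpha = 1, E = -0.5: family contains psi_0
```

The exponential family recovers the exact energy because it contains the true ground state; the Gaussian family bottoms out 15% short, no matter how hard you optimize.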

💡 Key Insight: The variational method is only as good as the family of trial functions you search over. If the true ground state lies within your family, you will find it exactly. If it does not, you will find the best approximation within that family. This is why physical intuition about the correct functional form matters so much.

The Variational Procedure as a Map

It is helpful to think of the variational method geometrically. The full Hilbert space is infinite-dimensional. Your trial wavefunction, parameterized by $k$ numbers $(\alpha_1, \ldots, \alpha_k)$, defines a $k$-dimensional surface — a variational manifold — within this Hilbert space. The variational procedure finds the point on this surface that has the lowest energy. If the true ground state happens to lie on this surface, you find it exactly; otherwise, you find the closest point (in an energy sense) on the surface to the ground state.

Adding more parameters expands the surface, always including the previous surfaces as subsets. This is why more parameters give (at worst) the same energy and (generically) a better energy.


19.3 The Helium Ground State: Beating Perturbation Theory

The helium atom is the simplest system that cannot be solved exactly. It has two electrons orbiting a nucleus of charge $Z = 2$, and the electron-electron repulsion $e^2/r_{12}$ makes the problem three-body and analytically intractable.

The Helium Hamiltonian

In atomic units:

$$\hat{H} = -\frac{1}{2}\nabla_1^2 - \frac{1}{2}\nabla_2^2 - \frac{Z}{r_1} - \frac{Z}{r_2} + \frac{1}{r_{12}}$$

with $Z = 2$ and $r_{12} = |\mathbf{r}_1 - \mathbf{r}_2|$.

Without the electron-electron repulsion ($1/r_{12}$), each electron would independently occupy a hydrogen-like ground state with nuclear charge $Z$. The ground state would be:

$$\psi_0(\mathbf{r}_1, \mathbf{r}_2) = \frac{Z^3}{\pi} e^{-Z(r_1 + r_2)}$$

with energy $E_0^{(0)} = -Z^2 = -4$ hartree $= -108.8$ eV.

The experimental ground state energy of helium is $E_0^{\text{exp}} = -2.9037$ hartree $= -79.0$ eV. The zeroth-order approximation ($-4$ hartree) overshoots by 38% — the electron-electron repulsion is not a small perturbation.

First-Order Perturbation Theory (Review)

In Chapter 17, we treated $1/r_{12}$ as a perturbation and computed the first-order correction:

$$E^{(1)} = \left\langle \psi_0 \left| \frac{1}{r_{12}} \right| \psi_0 \right\rangle = \frac{5Z}{8} = \frac{5}{4}$$

This is a famous six-dimensional integral that Unsöld evaluated in 1927. The first-order energy is:

$$E_0 \approx -Z^2 + \frac{5Z}{8} = -4 + 1.25 = -2.75 \text{ hartree}$$

🔗 Connection: The perturbation theory result $E_0 \approx -2.75$ hartree (Chapter 17) has an error of 5.3% compared to experiment ($-2.9037$ hartree). This is decent for first-order perturbation theory, but we can do much better with the variational method.

The Screening Variational Approach

The key physical insight is that each electron partially screens the nuclear charge from the other electron. If one electron is, on average, between the nucleus and the other electron, the outer electron "sees" an effective nuclear charge $Z_{\text{eff}} < Z$. We can build this insight into a trial function.

Our trial wavefunction uses the same hydrogen-like product form, but with an adjustable effective nuclear charge $Z_{\text{eff}}$ (which we will call $Z'$ for brevity):

$$\psi(\mathbf{r}_1, \mathbf{r}_2; Z') = \frac{(Z')^3}{\pi} e^{-Z'(r_1 + r_2)}$$

When $Z' = Z = 2$, this reduces to the unperturbed ground state (no screening). When $Z' < 2$, the electrons spread out farther from the nucleus — physically, each electron is partially screening the nucleus from the other.

Computing the Energy

We need $E(Z') = \langle \psi(Z') | \hat{H} | \psi(Z') \rangle$, where $\hat{H}$ is the full helium Hamiltonian with $Z = 2$ (not $Z'$). This is a crucial distinction: $Z'$ is the variational parameter in the trial function, but the Hamiltonian always has the true nuclear charge $Z = 2$.

A standard trick: write $\hat{H}$ as the sum of a $Z'$-hydrogen Hamiltonian plus corrections:

$$\hat{H} = \hat{H}_0(Z') + (Z' - Z)\left(\frac{1}{r_1} + \frac{1}{r_2}\right) + \frac{1}{r_{12}}$$

where $\hat{H}_0(Z') = -\frac{1}{2}\nabla_1^2 - \frac{1}{2}\nabla_2^2 - \frac{Z'}{r_1} - \frac{Z'}{r_2}$.

Our trial function is the exact ground state of $\hat{H}_0(Z')$: each electron contributes $-(Z')^2/2$, so the two-electron total is $-(Z')^2$, giving:

$$\langle \hat{H}_0(Z') \rangle = -(Z')^2$$

For the correction terms, using hydrogen-like expectation values $\langle 1/r \rangle = Z'$ (for a 1s state with effective charge $Z'$):

$$\left\langle (Z' - Z)\left(\frac{1}{r_1} + \frac{1}{r_2}\right) \right\rangle = 2(Z' - Z) Z'$$

And from Unsöld's integral with $Z$ replaced by $Z'$:

$$\left\langle \frac{1}{r_{12}} \right\rangle = \frac{5Z'}{8}$$

Putting it all together:

$$E(Z') = -(Z')^2 + 2(Z' - Z)Z' + \frac{5Z'}{8}$$

$$= -(Z')^2 + 2(Z')^2 - 2ZZ' + \frac{5Z'}{8}$$

$$= (Z')^2 - 2ZZ' + \frac{5Z'}{8}$$

Minimizing

$$\frac{dE}{dZ'} = 2Z' - 2Z + \frac{5}{8} = 0$$

$$Z'_{\text{opt}} = Z - \frac{5}{16} = 2 - \frac{5}{16} = \frac{27}{16} = 1.6875$$

The optimal effective nuclear charge is $27/16$ — each electron screens the nuclear charge by $5/16$ of an electron charge. This is the most famous result in variational quantum mechanics.

Substituting back, with $Z = 2$ and $Z' = 27/16$:

$$E_{\min} = \left(\frac{27}{16}\right)^2 - 2 \cdot 2 \cdot \frac{27}{16} + \frac{5}{8} \cdot \frac{27}{16} = \frac{729}{256} - \frac{1728}{256} + \frac{270}{256}$$

$$E_{\min} = -\frac{729}{256} \approx -2.8477 \text{ hartree}$$
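The minimization is quick to verify numerically; a minimal sketch (the bracketing interval is an arbitrary choice):

```python
from scipy.optimize import minimize_scalar

Z = 2  # true nuclear charge of helium

def E(Zp):
    """Variational energy of the screened product trial function (hartree)."""
    return Zp**2 - 2 * Z * Zp + 5 * Zp / 8

res = minimize_scalar(E, bounds=(0.5, 3), method="bounded")
print(res.x)    # 27/16 = 1.6875
print(res.fun)  # -729/256 ~ -2.8477 hartree
```

The optimizer lands on $Z' = 27/16$ and $E_{\min} = -729/256$ hartree, matching the analytic result.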

📊 By the Numbers: Helium Ground State Energy Comparison

| Method | Energy (hartree) | Error |
| --- | --- | --- |
| Zeroth-order ($Z' = 2$) | $-4.000$ | 37.8% |
| First-order perturbation | $-2.750$ | 5.3% |
| Variational ($Z' = 27/16$) | $-2.848$ | 1.9% |
| Hylleraas (6 params, 1929) | $-2.9033$ | 0.01% |
| Modern CI (thousands of params) | $-2.90372$ | $< 0.001\%$ |
| Experiment | $-2.9037$ | |

The one-parameter variational result ($-2.848$ hartree, 1.9% error) already beats first-order perturbation theory ($-2.750$ hartree, 5.3% error) — and it required less mathematical machinery. This is the power of the variational method: by encoding the right physics (screening) into the trial function, you can leapfrog over the perturbative approach.

🔵 Historical Note: Egil Hylleraas, a Norwegian physicist working in Göttingen in 1929, applied the variational method to helium with a trial function that explicitly included the electron-electron distance $r_{12}$ as a coordinate. His 6-parameter function achieved 0.01% accuracy. By 1959, Pekeris had extended this approach to 1078 terms, achieving energy accuracy to six decimal places. Today, helium's ground state energy is known to 40+ significant figures from variational calculations — more precisely than it can be measured experimentally.

Physical Interpretation of $Z_{\text{eff}} = 27/16$

The result $Z_{\text{eff}} = 27/16 \approx 1.69$ has a beautiful physical interpretation. Each electron sees the full nuclear charge $Z = 2$ minus a screening of $5/16 \approx 0.31$ from the other electron. The screening is less than one full electron charge — this makes sense because the electrons' probability distributions overlap significantly, so neither electron fully screens the nucleus from the other.

The screening is also symmetric: each electron contributes equally to screening the other. This is built into our trial function, which treats the two electrons identically. A more sophisticated trial function could allow asymmetric screening (one electron closer, one farther), which is the basis of the Hylleraas-type wavefunctions.


19.4 The Hydrogen Molecule Ion $\text{H}_2^+$

The helium atom showed the variational method applied to a spherically symmetric problem with one parameter. The hydrogen molecule ion $\text{H}_2^+$ — a single electron orbiting two protons — introduces a fundamentally different challenge: the system has cylindrical symmetry, not spherical, and the natural trial function is a linear combination of atomic orbitals.

The Problem

$\text{H}_2^+$ consists of two protons (labeled A and B) separated by a distance $R$, with a single electron. In the Born-Oppenheimer approximation (the protons are much heavier than the electron, so we can treat them as stationary), the electronic Hamiltonian is:

$$\hat{H}_{\text{el}} = -\frac{1}{2}\nabla^2 - \frac{1}{r_A} - \frac{1}{r_B}$$

where $r_A$ and $r_B$ are the distances from the electron to proton A and proton B, respectively. (We add the proton-proton repulsion $1/R$ later.)

When the protons are far apart ($R \to \infty$), the electron is localized on one proton or the other, and the system is just a hydrogen atom. The ground state energy approaches $-1/2$ hartree.

When the protons are close together ($R \to 0$), the system approaches $\text{He}^+$ (a helium ion), with ground state energy $-Z^2/2 = -2$ hartree.

At intermediate distances, the electron shares itself between both protons — a chemical bond forms.

The LCAO Trial Function

The simplest and most physically motivated trial function is a Linear Combination of Atomic Orbitals (LCAO):

$$|\psi\rangle = c_A |1s_A\rangle + c_B |1s_B\rangle$$

where $|1s_A\rangle$ is the hydrogen 1s orbital centered on proton A, and $|1s_B\rangle$ is the hydrogen 1s orbital centered on proton B. In position space:

$$\psi(\mathbf{r}) = c_A \frac{1}{\sqrt{\pi}} e^{-r_A} + c_B \frac{1}{\sqrt{\pi}} e^{-r_B}$$

The variational parameters are $c_A$ and $c_B$ (subject to normalization), but symmetry simplifies things enormously. The Hamiltonian is symmetric under interchange of protons A and B, so the ground state must be symmetric: $c_A = c_B$. The antisymmetric combination $c_A = -c_B$ gives the first excited state.

The bonding orbital (symmetric, ground state):

$$\psi_+(r) = \frac{1}{\sqrt{2(1 + S)}} \left( \phi_A(r) + \phi_B(r) \right)$$

The antibonding orbital (antisymmetric, excited state):

$$\psi_-(r) = \frac{1}{\sqrt{2(1 - S)}} \left( \phi_A(r) - \phi_B(r) \right)$$

where $\phi_A$ and $\phi_B$ are normalized 1s orbitals and $S = \langle \phi_A | \phi_B \rangle$ is the overlap integral.

Key Integrals

Three integrals characterize the LCAO solution:

Overlap integral: $$S(R) = \langle \phi_A | \phi_B \rangle = e^{-R}\left(1 + R + \frac{R^2}{3}\right)$$

Coulomb integral (energy of an electron on A in the field of B): $$J(R) = -\left\langle \phi_A \left| \frac{1}{r_B} \right| \phi_A \right\rangle = -\frac{1}{R}\left(1 - (1 + R)e^{-2R}\right)$$

Exchange integral (the quantum-mechanical term with no classical analogue): $$K(R) = -\left\langle \phi_A \left| \frac{1}{r_B} \right| \phi_B \right\rangle = -(1 + R)e^{-R}$$

The energy of the bonding state, including nuclear repulsion, is:

$$E_+(R) = E_{1s} + \frac{J + K}{1 + S} + \frac{1}{R}$$

where $E_{1s} = -1/2$ hartree. The antibonding energy is:

$$E_-(R) = E_{1s} + \frac{J - K}{1 - S} + \frac{1}{R}$$
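The bonding curve can be evaluated directly from these formulas; a minimal sketch that minimizes $E_+(R)$ numerically (the bracketing interval is an arbitrary choice):

```python
import numpy as np
from scipy.optimize import minimize_scalar

E1s = -0.5  # hydrogen 1s energy, hartree

def S(R):  return np.exp(-R) * (1 + R + R**2 / 3)          # overlap
def J(R):  return -(1 / R) * (1 - (1 + R) * np.exp(-2 * R))  # Coulomb
def K(R):  return -(1 + R) * np.exp(-R)                      # exchange

def E_plus(R):
    """Bonding-state energy including nuclear repulsion (hartree)."""
    return E1s + (J(R) + K(R)) / (1 + S(R)) + 1 / R

res = minimize_scalar(E_plus, bounds=(0.5, 8), method="bounded")
print(res.x, res.fun)   # R_e ~ 2.49 a0, E_+ ~ -0.565 hartree
```

The minimum reproduces the equilibrium distance and energy quoted below.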

Results and the Chemical Bond

Minimizing $E_+(R)$ over $R$ gives:

$$R_e \approx 2.49 \, a_0 = 1.32 \text{ \AA}, \qquad E_+(R_e) \approx -0.5648 \text{ hartree}$$

The binding energy (energy below the separated atom limit $-0.5$ hartree) is:

$$D_e = 0.5648 - 0.5 = 0.0648 \text{ hartree} = 1.76 \text{ eV}$$

📊 By the Numbers: Comparison with experiment for H₂⁺:

| Quantity | LCAO Variational | Experiment |
| --- | --- | --- |
| $R_e$ (equilibrium distance) | 1.32 Å | 1.06 Å |
| $D_e$ (binding energy) | 1.76 eV | 2.79 eV |

The simple LCAO gets the qualitative physics right (the molecule is bound) but the quantitative agreement is only fair. The LCAO overestimates the bond length and underestimates the binding energy. Improving the trial function — e.g., adding a variational parameter $Z'$ to the atomic orbitals, or including $2s$ and $2p$ contributions — dramatically improves the results.

The antibonding state $\psi_-$ is always repulsive: $E_-(R) > -1/2$ hartree for all $R > 0$. There is no stable molecule in the antibonding state.

💡 Key Insight: Chemical bonding is a quantum phenomenon. The exchange integral $K$ — the term that creates the energy splitting between bonding and antibonding — has no classical analogue. It arises from the coherent superposition of $\phi_A$ and $\phi_B$, i.e., from quantum interference. Covalent bonding is literally a quantum interference effect.

Improving the LCAO: Adding $Z'$

We can combine the LCAO approach with the screening idea from helium. Replace the 1s orbitals $e^{-r}$ with $e^{-Z'r}$, where $Z'$ is a variational parameter. Now we optimize over both $R$ (bond length) and $Z'$ (orbital size). The result:

$$R_e = 2.00 \, a_0 = 1.06 \text{ \AA}, \qquad D_e = 2.35 \text{ eV}, \qquad Z'_{\text{opt}} = 1.24$$

The optimized $Z' > 1$ means the orbitals contract (electrons are pulled closer to the nuclei in the molecule than in the free atom). This improvement brings the binding energy to 84% of the experimental value with just two parameters.


19.5 The Linear Variational Method (Ritz Method)

From Nonlinear to Linear Optimization

In Sections 19.3 and 19.4, our variational parameters appeared nonlinearly in the trial function (e.g., $Z'$ in the exponent $e^{-Z'r}$). This generally requires numerical optimization. But there is a special class of trial functions where the optimization can be done exactly by solving a matrix eigenvalue problem.

The linear variational method (or Ritz method) uses a trial function that is a linear combination of fixed basis functions:

$$|\psi\rangle = \sum_{i=1}^{N} c_i |\phi_i\rangle$$

where $\{|\phi_i\rangle\}$ are $N$ fixed (not adjustable) basis functions and $\{c_i\}$ are the variational parameters. The energy functional becomes:

$$E[\mathbf{c}] = \frac{\sum_{i,j} c_i^* c_j H_{ij}}{\sum_{i,j} c_i^* c_j S_{ij}}$$

where $H_{ij} = \langle \phi_i | \hat{H} | \phi_j \rangle$ is the Hamiltonian matrix and $S_{ij} = \langle \phi_i | \phi_j \rangle$ is the overlap matrix.

Deriving the Secular Equation

Minimizing $E$ with respect to the coefficients $c_k^*$:

$$\frac{\partial E}{\partial c_k^*} = 0 \implies \sum_j H_{kj} c_j = E \sum_j S_{kj} c_j$$

This is a generalized eigenvalue problem:

$$\mathbf{H} \mathbf{c} = E \, \mathbf{S} \mathbf{c}$$

The nontrivial solutions exist only when the secular determinant vanishes:

$$\det(\mathbf{H} - E\mathbf{S}) = 0$$

This is a polynomial equation of degree $N$ in $E$, yielding $N$ eigenvalues $E_1 \leq E_2 \leq \cdots \leq E_N$. Each eigenvalue is an upper bound on the corresponding true eigenvalue:

$$E_k^{\text{Ritz}} \geq E_k^{\text{true}} \quad \text{for } k = 1, 2, \ldots, N$$

This remarkable result — called the Hylleraas-Undheim-MacDonald theorem — means the linear variational method gives upper bounds on all eigenvalues, not just the ground state. (It follows from the interlacing theorem for eigenvalues of bordered matrices.)

💡 Key Insight: The linear variational method transforms the Schrödinger equation from a differential equation (infinite-dimensional) into a matrix eigenvalue problem ($N$-dimensional). This is the fundamental idea behind every computational quantum mechanics package: choose a basis, compute $H_{ij}$ and $S_{ij}$, diagonalize. The entire field of quantum chemistry rests on this foundation.

If the Basis is Orthonormal

If $\langle \phi_i | \phi_j \rangle = \delta_{ij}$ (orthonormal basis), then $\mathbf{S} = \mathbf{I}$ and the generalized eigenvalue problem reduces to a standard one:

$$\mathbf{H} \mathbf{c} = E \, \mathbf{c}$$

This is just matrix diagonalization — find the eigenvalues and eigenvectors of the Hamiltonian matrix. The lowest eigenvalue is the variational estimate of $E_0$, and the corresponding eigenvector gives the optimal coefficients.

Example: Two-State System

The simplest nontrivial case is $N = 2$. With an orthonormal basis $\{|\phi_1\rangle, |\phi_2\rangle\}$:

$$\mathbf{H} = \begin{pmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{pmatrix}$$

The secular equation gives:

$$E_{\pm} = \frac{H_{11} + H_{22}}{2} \pm \sqrt{\left(\frac{H_{11} - H_{22}}{2}\right)^2 + |H_{12}|^2}$$

This is the standard formula for a two-level system. The off-diagonal matrix element $H_{12}$ splits the levels apart — the lower one goes down, the upper one goes up. This is the variational origin of level repulsion.
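The closed form is easy to check against direct diagonalization; a sketch with arbitrary illustrative matrix elements:

```python
import numpy as np

# Arbitrary 2x2 real symmetric Hamiltonian in an orthonormal basis
# (the numbers are illustrative, not a specific physical system).
H11, H22, H12 = -1.0, -0.4, 0.3
H = np.array([[H11, H12], [H12, H22]])

# Closed-form Ritz eigenvalues ...
avg = (H11 + H22) / 2
gap = np.hypot((H11 - H22) / 2, H12)
E_lo, E_hi = avg - gap, avg + gap

# ... agree with direct diagonalization, and exhibit level repulsion:
# the lower level is pushed below min(H11, H22), the upper above max.
print(np.allclose(np.linalg.eigvalsh(H), [E_lo, E_hi]))  # True
```

With these numbers, $E_-$ drops below $H_{11}$ and $E_+$ rises above $H_{22}$, the level repulsion described above.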

🔗 Connection: This two-state Ritz problem is exactly the $\text{H}_2^+$ calculation from Section 19.4, with $|\phi_1\rangle = |1s_A\rangle$ and $|\phi_2\rangle = |1s_B\rangle$. The bonding and antibonding orbitals are the two eigenstates of the $2 \times 2$ Hamiltonian matrix. The exchange integral $K$ is the off-diagonal matrix element that creates the bond.

Systematic Improvement: The Basis Set Convergence

A key feature of the linear variational method is systematic improvability. Start with a small basis set (say, $N = 3$). Compute the variational energies. Then add more basis functions ($N = 10$, $N = 50$, $N = 200$). The eigenvalues can only decrease (or stay the same) as $N$ increases, monotonically approaching the exact eigenvalues from above.

This convergence is the foundation of modern computational quantum mechanics:

  • In quantum chemistry, the basis sets (STO-3G, 6-31G*, cc-pVDZ, cc-pVTZ, etc.) are systematically enlarged to approach the complete basis set limit.
  • In condensed matter physics, plane wave basis sets are truncated at a kinetic energy cutoff, and convergence is tested by increasing the cutoff.
  • In nuclear physics, harmonic oscillator basis functions are used, with convergence studied as a function of the number of oscillator shells.

In all cases, the variational theorem guarantees that each increase in basis size either improves the answer or confirms that it has converged.

⚠️ Common Misconception: "More basis functions always give a significantly better answer." In practice, there are diminishing returns. The first few basis functions capture the gross features of the wavefunction and produce large energy improvements. Later basis functions capture fine details and produce progressively smaller improvements. The energy converges from above, usually roughly exponentially in the number of basis functions for well-chosen bases.
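
The monotone approach from above is easy to observe numerically. A minimal sketch (assumptions: $\hbar = m = 1$, the quartic oscillator $\hat{H} = -\tfrac{1}{2}\frac{d^2}{dx^2} + x^4$ as a test problem with known $E_0 \approx 0.668$, and particle-in-a-box sines on $[-5, 5]$ as a nested orthonormal basis):

```python
import numpy as np

# Ritz method for H = -1/2 d^2/dx^2 + x^4 in a sine basis on [-a, a]
a = 5.0
x = np.linspace(-a, a, 2001)
dx = x[1] - x[0]
V = x ** 4

def ritz_ground(N):
    n = np.arange(1, N + 1)
    k = n * np.pi / (2 * a)                            # box-state wavenumbers
    phi = np.sqrt(1 / a) * np.sin(np.outer(k, x + a))  # orthonormal basis, shape (N, npts)
    Vmat = (phi * V) @ phi.T * dx                      # potential matrix by quadrature
    H = np.diag(k ** 2 / 2) + Vmat                     # kinetic part is diagonal
    return np.linalg.eigvalsh(H)[0]

energies = [ritz_ground(N) for N in (2, 4, 8, 16, 32)]
print(energies)   # decreases monotonically toward E_0 ~ 0.668
```

Because each smaller basis is a subset of the next, each $\mathbf{H}$ is a leading submatrix of the larger one, and Cauchy interlacing guarantees the monotone decrease seen in the output.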


19.6 Variational Monte Carlo

The Problem with High Dimensions

The variational method as described so far requires computing integrals of the form:

$$E[\psi] = \frac{\int \psi^*(\mathbf{r}) \hat{H} \psi(\mathbf{r}) \, d\mathbf{r}}{\int |\psi(\mathbf{r})|^2 \, d\mathbf{r}}$$

For a single particle in three dimensions, the integrals are three-dimensional — manageable by quadrature. But for $N$ electrons, $\mathbf{r}$ is a $3N$-dimensional vector, and the integrals are $3N$-dimensional. For helium ($N = 2$), that is 6 dimensions. For a carbon atom ($N = 6$), it is 18 dimensions. For a modest molecule like benzene ($N = 42$ electrons), it is 126 dimensions.

Standard numerical integration (quadrature on a grid) is hopeless in high dimensions. With $M$ grid points per dimension, you need $M^{3N}$ evaluations — this is the curse of dimensionality. Even $M = 10$ and $N = 10$ gives $10^{30}$ evaluations, far beyond any computer's reach.

Monte Carlo to the Rescue

The solution is Monte Carlo integration — estimating integrals by random sampling. The key insight is to rewrite the energy expectation value as:

$$E[\psi] = \frac{\int |\psi(\mathbf{r})|^2 \cdot \frac{\hat{H}\psi(\mathbf{r})}{\psi(\mathbf{r})} \, d\mathbf{r}}{\int |\psi(\mathbf{r})|^2 \, d\mathbf{r}} = \left\langle E_L(\mathbf{r}) \right\rangle_{|\psi|^2}$$

where $E_L(\mathbf{r}) = \hat{H}\psi(\mathbf{r}) / \psi(\mathbf{r})$ is the local energy — the result of applying the Hamiltonian to $\psi$ and dividing by $\psi$ at each point in configuration space. (Here we assume $\psi$ is real and positive, as it is for the ground state of most systems.)

The energy is the average of the local energy over the probability distribution $|\psi(\mathbf{r})|^2$. If we can draw random samples $\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_M$ from the distribution $|\psi(\mathbf{r})|^2$, then:

$$E[\psi] \approx \frac{1}{M} \sum_{i=1}^{M} E_L(\mathbf{r}_i)$$

with a statistical error that scales as $1/\sqrt{M}$ — independent of the dimensionality. This is the miracle of Monte Carlo: the convergence rate $1/\sqrt{M}$ holds in 3 dimensions, 30 dimensions, or 300 dimensions.
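
When $|\psi|^2$ can be sampled directly, the estimator is easy to demonstrate without Metropolis. A sketch for the 1D harmonic oscillator ($\hbar = m = \omega = 1$) with Gaussian trial $\psi \propto e^{-\alpha x^2}$, whose local energy works out to $E_L(x) = \alpha + (\tfrac{1}{2} - 2\alpha^2)x^2$, matching the analytic $E(\alpha) = \alpha/2 + 1/(8\alpha)$ on average:

```python
import numpy as np

rng = np.random.default_rng(1)

def qho_energy_mc(alpha, M):
    # |psi|^2 ~ exp(-2 alpha x^2): a Gaussian with variance 1/(4 alpha)
    x = rng.normal(scale=np.sqrt(1 / (4 * alpha)), size=M)
    E_L = alpha + (0.5 - 2 * alpha ** 2) * x ** 2   # local energy H psi / psi
    return E_L.mean(), E_L.std() / np.sqrt(M)

for M in (100, 10_000, 1_000_000):
    E, err = qho_energy_mc(0.3, M)
    print(M, E, err)   # error shrinks like 1/sqrt(M); exact E(0.3) = 0.5666...
```

At $\alpha = 1/2$ the trial function is the exact ground state, $E_L \equiv 1/2$, and the statistical error vanishes identically.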

The VMC Algorithm

The Variational Monte Carlo (VMC) algorithm combines Monte Carlo integration with variational optimization:

Input: Trial wavefunction $\psi(\mathbf{r}; \boldsymbol{\alpha})$ with variational parameters $\boldsymbol{\alpha}$.

Algorithm:

  1. Choose initial parameter values $\boldsymbol{\alpha}$.
  2. Use the Metropolis algorithm to generate $M$ samples $\{\mathbf{r}_i\}$ from $|\psi(\mathbf{r}; \boldsymbol{\alpha})|^2$:
      • Start from a random configuration $\mathbf{r}$.
      • Propose a move: $\mathbf{r}' = \mathbf{r} + \delta \boldsymbol{\eta}$, where $\boldsymbol{\eta}$ is a random displacement.
      • Accept the move with probability $\min\left(1, \frac{|\psi(\mathbf{r}')|^2}{|\psi(\mathbf{r})|^2}\right)$.
      • After a burn-in period, record the configurations as samples.
  3. Compute the energy estimate: $\bar{E}(\boldsymbol{\alpha}) = \frac{1}{M}\sum_{i=1}^{M} E_L(\mathbf{r}_i; \boldsymbol{\alpha})$.
  4. Compute the variance: $\sigma^2 = \frac{1}{M}\sum_{i=1}^{M} (E_L(\mathbf{r}_i) - \bar{E})^2$.
  5. Adjust $\boldsymbol{\alpha}$ to minimize $\bar{E}$ (using gradient descent, stochastic reconfiguration, or other optimization methods).
  6. Repeat steps 2-5 until convergence.

Output: Optimized parameters $\boldsymbol{\alpha}_{\text{opt}}$ and energy estimate $\bar{E} \pm \sigma/\sqrt{M}$.

💡 Key Insight: If $\psi$ were the exact eigenstate, the local energy $E_L(\mathbf{r})$ would be a constant $E_0$ everywhere (because $\hat{H}\psi = E_0\psi$). The variance of $E_L$ is a measure of how far $\psi$ is from being an exact eigenstate. This is called the zero-variance principle: as the trial function approaches the exact eigenstate, the statistical noise in VMC vanishes.
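
The algorithm and the zero-variance principle can both be seen in a minimal sketch for the hydrogen atom (atomic units), where the trial function $\psi = e^{-\alpha r}$ has local energy $E_L = -\alpha^2/2 + (\alpha - 1)/r$ and exact variational energy $E(\alpha) = \alpha^2/2 - \alpha$; the walker count, step size, and burn-in below are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(2)

def vmc_hydrogen(alpha, n_walkers=400, n_steps=600, burn=200, step=0.6):
    """VMC for hydrogen with trial psi = exp(-alpha r), atomic units."""
    R = rng.normal(size=(n_walkers, 3))              # initial configurations
    E_samples = []
    for t in range(n_steps):
        Rp = R + step * rng.normal(size=R.shape)     # proposed moves
        r, rp = np.linalg.norm(R, axis=1), np.linalg.norm(Rp, axis=1)
        # Metropolis ratio: |psi(R')|^2 / |psi(R)|^2 = exp(-2 alpha (r' - r))
        accept = rng.random(n_walkers) < np.exp(-2 * alpha * (rp - r))
        R[accept] = Rp[accept]
        if t >= burn:                                # discard burn-in
            E_samples.append(-0.5 * alpha ** 2 + (alpha - 1) / np.linalg.norm(R, axis=1))
    E_L = np.concatenate(E_samples)
    return E_L.mean(), E_L.std()

for a in (0.8, 1.0, 1.2):
    E, sigma = vmc_hydrogen(a)
    print(f"alpha = {a:.1f}:  E = {E:+.3f},  sigma = {sigma:.3f}")
```

At $\alpha = 1$ the trial function is exact, so $E_L \equiv -1/2$ at every sampled point: the energy comes out exactly $-0.5$ hartree with zero variance, while $\alpha = 0.8$ and $\alpha = 1.2$ both give noisy estimates near $E(\alpha) = -0.48$.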

The Metropolis Algorithm

The Metropolis algorithm is worth understanding in detail because it appears throughout computational physics. The goal is to generate samples from a probability distribution $P(\mathbf{r}) = |\psi(\mathbf{r})|^2 / \int |\psi|^2 d\mathbf{r}$ without having to compute the normalization integral (the denominator).

The algorithm constructs a Markov chain — a sequence of configurations where each depends only on the previous one — that converges to $P(\mathbf{r})$ in the long run. The acceptance criterion $\min(1, |\psi(\mathbf{r}')|^2/|\psi(\mathbf{r})|^2)$ ensures detailed balance: in equilibrium, the rate of transitions from $\mathbf{r}$ to $\mathbf{r}'$ equals the rate from $\mathbf{r}'$ to $\mathbf{r}$, weighted by the target distribution.

Practical considerations:

  • The step size $\delta$ should be tuned so that approximately 50% of proposed moves are accepted.
  • A burn-in period (typically $10^3$ to $10^4$ steps) should be discarded to allow the chain to equilibrate.
  • Successive samples are correlated, so the effective number of independent samples is $M_{\text{eff}} = M / \tau_{\text{corr}}$, where $\tau_{\text{corr}}$ is the autocorrelation time.

🧪 Experiment: In the code examples accompanying this chapter (see code/example-01-variational.py), you will implement VMC for the hydrogen atom and verify that with a simple Jastrow trial function, the algorithm converges to the exact ground state energy $E_0 = -0.5$ hartree.

VMC in Practice

VMC is the simplest quantum Monte Carlo method, but it is already powerful enough for research applications. Modern VMC calculations use:

  • Slater-Jastrow trial functions: a Slater determinant (antisymmetric, captures exchange) multiplied by a Jastrow factor (symmetric, captures correlation): $\psi = e^{J(\{r_{ij}\})} \det[\phi_k(\mathbf{r}_i)]$.
  • Backflow transformations: the coordinates in the Slater determinant are replaced by quasi-coordinates that depend on all particle positions, improving the nodal surface.
  • Neural network wavefunctions: recent breakthroughs use deep neural networks as variational ansätze, with VMC-like optimization. The FermiNet and PauliNet architectures have achieved chemical accuracy for small molecules.

🔗 Connection: VMC is a stepping stone to diffusion Monte Carlo (DMC), which projects out the exact ground state by simulating imaginary-time diffusion. DMC is exact for bosonic systems but suffers from the fermion sign problem for many-electron systems — one of the great unsolved problems in computational physics. We will encounter the sign problem again in Chapter 34 (second quantization) and Chapter 37 (quantum field theory).


19.7 Comparison: Perturbation Theory vs. Variational Method

We now have two major approximation methods in our toolkit. When should you use each?

Systematic Comparison

| Feature | Perturbation Theory (Ch 17) | Variational Method |
| --- | --- | --- |
| Basic idea | Expand around a solvable problem | Optimize a trial wavefunction |
| When it works best | Problem is close to a solvable one | You have physical insight into the ground state shape |
| What it gives | Energy corrections order by order; wavefunction corrections | Upper bound on ground state energy; approximate wavefunction |
| Accuracy control | Higher orders (but series may diverge) | More variational parameters (monotonic improvement) |
| Excited states | Natural (corrections for each level) | Requires orthogonality constraints or symmetry |
| Systematic? | Yes (order by order, but may not converge) | Yes (Ritz method with basis set expansion) |
| Convergence | Not guaranteed (asymptotic series) | Guaranteed (from above, monotonically) |
| Requires | Solvable $\hat{H}_0$ and small $\hat{V}$ | Good physical intuition for trial function |
| Computational cost | Integrals over known eigenstates | Optimization over parameter space |
| Dimensional scaling | Fixed cost per order | VMC scales polynomially with $N$ |

When They Agree and When They Don't

For the helium ground state, both methods give reasonable results:

  • First-order perturbation: $-2.750$ hartree (5.3% error)
  • One-parameter variational: $-2.848$ hartree (1.9% error)

The variational method wins here because (1) the electron-electron repulsion is not truly "small," so perturbation theory converges slowly, and (2) the screening physics is easy to capture variationally.

But for fine structure corrections to hydrogen (Chapter 18), perturbation theory is the natural tool. The relativistic and spin-orbit corrections are genuinely small ($\alpha^2 \sim 10^{-4}$ times the gross structure), and the hydrogen eigenstates are known exactly. A variational approach would be awkward and unnecessary.

💡 Key Insight: Perturbation theory and the variational method are complementary, not competing. The best quantum physicists use both, often on the same problem. Perturbation theory tells you how corrections enter (their functional dependence on parameters). The variational method tells you how large those corrections are (by optimizing numerically). Together, they provide both qualitative understanding and quantitative accuracy.

The Best of Both Worlds: Variational Perturbation Theory

In many modern applications, the two methods are combined. One approach:

  1. Use a variational calculation to find an optimized zeroth-order Hamiltonian $\hat{H}_0$.
  2. Define the perturbation as $\hat{V} = \hat{H} - \hat{H}_0$.
  3. Apply perturbation theory on top of the variational solution.

This often converges much faster than either method alone, because the variational step ensures that $\hat{V}$ is genuinely small relative to $\hat{H}_0$.

Another approach is many-body perturbation theory (MBPT) starting from a Hartree-Fock (variational) reference state. The Hartree-Fock method is a specific variational approach restricted to single Slater determinant trial functions. MBPT then systematically includes correlation effects order by order. This is the Møller-Plesset (MP2, MP3, MP4) hierarchy used extensively in quantum chemistry.

🔗 Connection: The Ritz variational method (Section 19.5) is the foundation of the Hartree-Fock method and density functional theory (DFT), which we encountered in Chapter 16 (multi-electron atoms). Nearly all of modern quantum chemistry and materials science computation rests on variational foundations.

A Decision Flowchart

When facing an unsolvable quantum problem, ask yourself:

  1. Is the problem close to a solvable one? If yes, try perturbation theory first (Chapter 17-18).
  2. Do you have good intuition for the ground state shape? If yes, try a nonlinear variational calculation (Sections 19.2-19.4).
  3. Do you have a good basis set? If yes, try the linear variational / Ritz method (Section 19.5).
  4. Is the system high-dimensional ($N > 3$)? If yes, you probably need Monte Carlo methods (Section 19.6).
  5. Do you need both qualitative understanding and quantitative accuracy? Combine both methods.

19.8 Summary and Progressive Project

Chapter Summary

This chapter introduced the variational principle — one of the two pillars of quantum approximation methods (the other being perturbation theory). The key ideas:

The Variational Theorem (Section 19.1): For any trial state $|\psi\rangle$, the energy expectation value $\langle \psi | \hat{H} | \psi \rangle \geq E_0$. The proof follows from expanding $|\psi\rangle$ in the energy eigenbasis and using $E_n \geq E_0$. The theorem provides a rigorous upper bound — the closer you can push the energy down, the better your approximation.

Trial Wavefunctions (Section 19.2): The art of the variational method lies in choosing trial functions that capture the essential physics. Key principles: respect boundary conditions, get the qualitative shape right, include relevant physics (cusp conditions, screening, correlation), and use symmetry. More parameters always help but increase computational cost.

Helium Ground State (Section 19.3): With a one-parameter screened wavefunction $\psi \propto e^{-Z'(r_1 + r_2)}$, we found $Z'_{\text{opt}} = 27/16$ and $E = -2.848$ hartree — beating first-order perturbation theory's $-2.750$ hartree and achieving 1.9% accuracy compared to the experimental $-2.904$ hartree.

Hydrogen Molecule Ion (Section 19.4): The LCAO trial function $\psi = c_A \phi_A + c_B \phi_B$ describes chemical bonding through quantum interference. The exchange integral $K$ — with no classical analogue — creates the energy splitting between bonding and antibonding states. Adding a variational $Z'$ parameter significantly improves quantitative predictions.

The Ritz Method (Section 19.5): When the trial function is a linear combination of fixed basis functions, the variational optimization reduces to a generalized eigenvalue problem $\mathbf{H}\mathbf{c} = E\mathbf{S}\mathbf{c}$. This gives upper bounds on all eigenvalues simultaneously (Hylleraas-Undheim-MacDonald theorem) and is the foundation of modern computational quantum mechanics.

Variational Monte Carlo (Section 19.6): For high-dimensional systems, Monte Carlo integration evaluates energy integrals by random sampling. The local energy $E_L = \hat{H}\psi/\psi$ is averaged over $|\psi|^2$ using the Metropolis algorithm. The statistical error scales as $1/\sqrt{M}$ regardless of dimensionality, defeating the curse of dimensionality. The zero-variance principle guarantees that the statistical noise vanishes as the trial function approaches an exact eigenstate.

Perturbation vs. Variational (Section 19.7): The two methods are complementary. Perturbation theory works when the problem is near a solvable one and provides order-by-order corrections. The variational method works when you have physical intuition and provides rigorous bounds with guaranteed convergence. Modern methods combine both.

Looking Ahead

In Chapter 20, we develop the WKB approximation — our third major approximation method. The WKB method is semiclassical: it works when the potential varies slowly on the scale of the de Broglie wavelength. Where the variational method excels at ground state energies, WKB excels at tunneling rates and highly excited states.

In Chapter 21, we extend perturbation theory to time-dependent problems, opening the door to transition rates, absorption and emission of radiation, and Fermi's golden rule.

Together, Chapters 17–22 give you a complete toolkit for attacking the vast majority of quantum mechanical problems that cannot be solved exactly.

Progressive Project: Variational Solver Module

In this chapter's project component, you will build the variational solver for the Quantum Simulation Toolkit. The specific deliverables:

Module: variational_solve()

Implement a general-purpose variational solver that:

  1. Takes a Hamiltonian (as a function or matrix), a trial wavefunction class, and initial parameter values.
  2. Computes the energy functional $E(\boldsymbol{\alpha})$ by numerical integration or matrix diagonalization.
  3. Minimizes $E$ using scipy.optimize.minimize.
  4. Returns the optimized energy and parameters.
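
One possible grid-based sketch of variational_solve (assumptions: 1D, real trial functions, $\hbar = m = 1$, Nelder-Mead as the optimizer; the exact signature is illustrative and not fixed by the project template):

```python
import numpy as np
from scipy.optimize import minimize

def variational_solve(V, trial, alpha0, x):
    """Minimize E(alpha) = <psi|H|psi> / <psi|psi> for H = -1/2 d2/dx2 + V(x)."""
    dx = x[1] - x[0]
    Vx = V(x)

    def energy(alpha):
        psi = trial(x, alpha)
        dpsi = np.gradient(psi, dx)
        kinetic = 0.5 * np.sum(dpsi ** 2) * dx     # <T> = (1/2) int |psi'|^2 dx
        potential = np.sum(Vx * psi ** 2) * dx
        norm = np.sum(psi ** 2) * dx
        return (kinetic + potential) / norm

    res = minimize(energy, np.atleast_1d(alpha0), method="Nelder-Mead")
    return res.fun, res.x

# sanity check: Gaussian trial for the harmonic oscillator (omega = 1)
x = np.linspace(-8, 8, 4001)
E, alpha = variational_solve(lambda x: 0.5 * x ** 2,
                             lambda x, a: np.exp(-a[0] * x ** 2), [0.2], x)
print(E, alpha)   # expect E ~ 0.5 at alpha ~ 0.5 (Worked Example 19.1)
```

The kinetic term uses the integration-by-parts form $\frac{1}{2}\int |\psi'|^2 dx$, which is numerically better behaved than differentiating twice.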

Module: optimize_trial()

A convenience wrapper that handles:

  • Normalization of the trial function.
  • Computation of $\langle T \rangle$ and $\langle V \rangle$ separately (for the virial theorem check).
  • Multiple random restarts to avoid local minima.
  • Convergence diagnostics.

Module: ritz_method()

Implement the linear variational method:

  1. Takes a set of basis functions and a Hamiltonian.
  2. Constructs the $\mathbf{H}$ and $\mathbf{S}$ matrices.
  3. Solves the generalized eigenvalue problem using scipy.linalg.eigh.
  4. Returns all eigenvalues and eigenvectors.
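
A possible sketch of ritz_method for grid-sampled 1D basis functions (assumptions: real basis, $\hat{H} = -\tfrac{1}{2}\frac{d^2}{dx^2} + V(x)$ applied by finite differences; the interface is illustrative):

```python
import numpy as np
from scipy.linalg import eigh

def ritz_method(basis, V, x):
    """Solve H c = E S c for basis functions sampled on a grid (hbar = m = 1)."""
    dx = x[1] - x[0]
    phi = np.array(basis)                      # shape (N, npts)
    # apply H = -1/2 d2/dx2 + V(x) by centered finite differences
    d2 = (np.roll(phi, -1, axis=1) - 2 * phi + np.roll(phi, 1, axis=1)) / dx ** 2
    d2[:, [0, -1]] = 0.0                       # discard wrap-around at the edges
    Hphi = -0.5 * d2 + V(x) * phi
    S = phi @ phi.T * dx                       # overlap matrix
    H = phi @ Hphi.T * dx                      # Hamiltonian matrix
    H = 0.5 * (H + H.T)                        # symmetrize away quadrature noise
    return eigh(H, S)                          # all eigenvalues and eigenvectors

# non-orthogonal Gaussians as a basis for the harmonic oscillator
x = np.linspace(-8, 8, 4001)
basis = [np.exp(-0.5 * (x - c) ** 2) for c in (-1.0, 0.0, 1.0)]
E, C = ritz_method(basis, lambda x: 0.5 * x ** 2, x)
print(E)   # lowest eigenvalue ~ 0.5, since e^{-x^2/2} lies in the span
```

Because the basis is not orthonormal, the overlap matrix $\mathbf{S}$ is essential: eigh(H, S) solves the generalized problem $\mathbf{H}\mathbf{c} = E\mathbf{S}\mathbf{c}$ directly.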

Application: Helium and H₂⁺

Apply the toolkit to:

  • Reproduce the $Z_{\text{eff}} = 27/16$ result for helium.
  • Compute the $\text{H}_2^+$ potential energy curve $E(R)$ and find the equilibrium bond length.
  • Compare variational results with perturbation theory results from Chapter 17's toolkit module.

See code/project-checkpoint.py for the implementation template.

Checkpoint: You should now be able to:

  1. Prove the variational theorem and explain each step of the proof.
  2. Design a trial wavefunction for a given potential, choosing appropriate functional form and parameters.
  3. Calculate the helium ground state energy with the screening parameter and reproduce $Z' = 27/16$.
  4. Set up and solve the LCAO problem for H₂⁺, explaining the origin of the chemical bond.
  5. Explain the Ritz method and set up the secular determinant for a given basis.
  6. Describe the VMC algorithm, including the Metropolis sampling step and the zero-variance principle.
  7. Decide whether to use perturbation theory or the variational method for a given problem.


Worked Examples

Worked Example 19.1: Variational Bound for the QHO

Problem: Use the trial function $\psi(x; \alpha) = N e^{-\alpha x^2}$ (a Gaussian) as a variational trial function for the quantum harmonic oscillator $\hat{H} = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + \frac{1}{2}m\omega^2 x^2$. Find the optimal $\alpha$ and the resulting energy bound. Compare with the exact ground state energy.

Solution:

The normalized Gaussian is $\psi(x) = \left(\frac{2\alpha}{\pi}\right)^{1/4} e^{-\alpha x^2}$.

Kinetic energy: $$\langle T \rangle = -\frac{\hbar^2}{2m} \int_{-\infty}^{\infty} \psi \frac{d^2\psi}{dx^2} dx = \frac{\hbar^2 \alpha}{2m}$$

Potential energy: $$\langle V \rangle = \frac{1}{2}m\omega^2 \langle x^2 \rangle = \frac{1}{2}m\omega^2 \cdot \frac{1}{4\alpha} = \frac{m\omega^2}{8\alpha}$$

Total energy: $$E(\alpha) = \frac{\hbar^2 \alpha}{2m} + \frac{m\omega^2}{8\alpha}$$

Minimize: $$\frac{dE}{d\alpha} = \frac{\hbar^2}{2m} - \frac{m\omega^2}{8\alpha^2} = 0 \implies \alpha_{\text{opt}} = \frac{m\omega}{2\hbar}$$

Optimal energy: $$E_{\min} = \frac{\hbar^2}{2m} \cdot \frac{m\omega}{2\hbar} + \frac{m\omega^2}{8} \cdot \frac{2\hbar}{m\omega} = \frac{\hbar\omega}{4} + \frac{\hbar\omega}{4} = \frac{1}{2}\hbar\omega$$

This is exactly the ground state energy! The Gaussian trial function is the exact ground state of the QHO, so the variational method recovers the exact answer. (The exact ground state wavefunction is a Gaussian with $\alpha = m\omega/2\hbar$.)

💡 Key Insight: When the trial function family includes the exact ground state, the variational method finds it exactly. This is both a strength (the method is self-correcting) and a diagnostic (if you suspect your family includes the true ground state, verifying this is straightforward).


Worked Example 19.2: Variational Estimate for the Linear Potential

Problem: A particle of mass $m$ is bound by the linear potential $V(x) = \beta|x|$ (with $\beta > 0$), which is relevant to quark confinement. Use the trial function $\psi(x; \alpha) = N e^{-\alpha x^2}$ to estimate the ground state energy.

Solution:

We already know $\langle T \rangle = \frac{\hbar^2 \alpha}{2m}$ for this trial function. For the potential energy:

$$\langle V \rangle = \beta \langle |x| \rangle = \beta \int_{-\infty}^{\infty} |x| \left(\frac{2\alpha}{\pi}\right)^{1/2} e^{-2\alpha x^2} dx = \beta \cdot \frac{1}{\sqrt{2\pi\alpha}}$$

The total energy: $$E(\alpha) = \frac{\hbar^2 \alpha}{2m} + \frac{\beta}{\sqrt{2\pi\alpha}}$$

Minimize: $$\frac{dE}{d\alpha} = \frac{\hbar^2}{2m} - \frac{\beta}{2\sqrt{2\pi}} \alpha^{-3/2} = 0$$

$$\alpha_{\text{opt}} = \left(\frac{m\beta}{\hbar^2\sqrt{2\pi}}\right)^{2/3}$$

Substituting back: $$E_{\min} = \frac{3}{2} \left(\frac{\hbar^2}{m}\right)^{1/3} \left(\frac{\beta}{\sqrt{2\pi}}\right)^{2/3} \approx 0.813 \left(\frac{\hbar^2 \beta^2}{m}\right)^{1/3}$$

The exact ground state energy (from the Airy function solution) is:

$$E_0 = \left(\frac{\hbar^2 \beta^2}{2m}\right)^{1/3} \cdot a_1'$$

where $a_1' \approx 1.0188$ is the magnitude of the first zero of $\mathrm{Ai}'$, the derivative of the Airy function (the ground state is even, so the boundary condition is $\psi'(0) = 0$). The ratio $E_{\min}/E_0 \approx 1.005$: the Gaussian trial function gives a result within about 0.5% of the exact answer. Not bad for a one-parameter guess.
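
The comparison can be checked numerically in units $\hbar = m = \beta = 1$; scipy.special.ai_zeros supplies the Airy-derivative zero:

```python
import numpy as np
from scipy.special import ai_zeros

alpha = (2 * np.pi) ** (-1 / 3)                  # optimal Gaussian parameter
E_var = alpha / 2 + 1 / np.sqrt(2 * np.pi * alpha)
a1p = abs(ai_zeros(1)[1][0])                     # |first zero of Ai'| ~ 1.0188
E_exact = 0.5 ** (1 / 3) * a1p                   # (hbar^2 beta^2 / 2m)^(1/3) |a_1'|
print(E_var, E_exact, E_var / E_exact)
```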


Worked Example 19.3: Ritz Method for a Perturbed Infinite Well

Problem: Consider a particle in an infinite square well of width $L$ with a perturbation $V(x) = V_0 \sin(\pi x / L)$. Use the Ritz method with the first three unperturbed eigenstates as basis functions to find the ground state energy.

Solution:

The unperturbed eigenstates are $\phi_n(x) = \sqrt{2/L} \sin(n\pi x/L)$ with energies $E_n^{(0)} = n^2 \pi^2 \hbar^2 / (2mL^2) \equiv n^2 E_1$.

The basis is orthonormal, so $S_{mn} = \delta_{mn}$ and we need to compute the $3 \times 3$ Hamiltonian matrix $H_{mn} = E_n^{(0)} \delta_{mn} + V_{mn}$.

The perturbation matrix elements: $$V_{mn} = V_0 \frac{2}{L} \int_0^L \sin\frac{m\pi x}{L} \sin\frac{n\pi x}{L} \sin\frac{\pi x}{L} \, dx$$

Using the product formula $\sin A \sin B = \frac{1}{2}[\cos(A-B) - \cos(A+B)]$:

$$V_{mn} = \frac{V_0}{L} \int_0^L \left[\cos\frac{(m-n)\pi x}{L} - \cos\frac{(m+n)\pi x}{L}\right] \sin\frac{\pi x}{L} \, dx$$

Before evaluating any integrals, note a parity shortcut: about the well center $x = L/2$, $\phi_n$ is even for odd $n$ and odd for even $n$, while the perturbation $\sin(\pi x/L)$ is even. Hence $V_{mn} = 0$ whenever $m + n$ is odd, so $V_{12} = V_{21} = V_{23} = V_{32} = 0$. Evaluating the remaining integrals (with $u = \pi x/L$, using standard results such as $\int_0^\pi \cos 2u \, \sin u \, du = -2/3$), the nonzero elements for the $3 \times 3$ basis are:

$$V_{11} = \frac{8V_0}{3\pi}, \qquad V_{22} = \frac{32V_0}{15\pi}, \qquad V_{33} = \frac{72V_0}{35\pi}, \qquad V_{13} = V_{31} = -\frac{8V_0}{15\pi}$$

The Hamiltonian matrix is:

$$\mathbf{H} = \begin{pmatrix} E_1 + V_{11} & 0 & V_{13} \\ 0 & 4E_1 + V_{22} & 0 \\ V_{13} & 0 & 9E_1 + V_{33} \end{pmatrix}$$

Since $\phi_2$ decouples (all its off-diagonal elements vanish), the ground state problem reduces to a $2 \times 2$ matrix in the $\{\phi_1, \phi_3\}$ subspace. The ground state energy from the $2 \times 2$ secular equation is:

$$E_{\text{gs}} = \frac{(E_1 + V_{11}) + (9E_1 + V_{33})}{2} - \sqrt{\left(\frac{(9E_1 + V_{33}) - (E_1 + V_{11})}{2}\right)^2 + V_{13}^2}$$

For $V_0 \ll E_1$, this reduces to the first-order perturbation theory result $E_{\text{gs}} \approx E_1 + V_{11}$ plus a second-order correction from the mixing with $\phi_3$ — exactly consistent with Chapter 17. The Ritz method gives a better answer because it includes the effect of mixing between states non-perturbatively.
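
The matrix elements and the Ritz ground state energy can be verified numerically (units $\hbar = m = L = 1$, so $E_1 = \pi^2/2$; $V_0 = 1$ is an illustrative choice):

```python
import numpy as np

L_well, V0 = 1.0, 1.0
x = np.linspace(0, L_well, 4001)
dx = x[1] - x[0]
phi = np.array([np.sqrt(2 / L_well) * np.sin(n * np.pi * x / L_well)
                for n in (1, 2, 3)])
E0 = np.diag([n ** 2 * np.pi ** 2 / 2 for n in (1, 2, 3)])

pert = V0 * np.sin(np.pi * x / L_well)
Vmat = (phi * pert) @ phi.T * dx          # V_mn by quadrature
print(np.round(Vmat, 4))                  # V_12 = V_23 = 0 by parity

E = np.linalg.eigvalsh(E0 + Vmat)
print(E[0])                               # Ritz ground state energy
```

The lowest eigenvalue comes out slightly below $E_1 + V_{11}$, the first-order perturbative estimate, because the mixing with $\phi_3$ lowers the ground state.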


Conceptual Summary Table

| Concept | Definition | Key Equation | Section |
| --- | --- | --- | --- |
| Variational principle | Energy expectation value of any trial state bounds the ground state from above | $\langle \psi \vert \hat{H} \vert \psi \rangle \geq E_0$ | 19.1 |
| Trial wavefunction | Parameterized guess for the ground state | $\vert \psi(\alpha_1, \ldots, \alpha_k) \rangle$ | 19.2 |
| Energy functional | Maps trial states to energy values (Rayleigh quotient) | $E[\psi] = \langle \psi \vert \hat{H} \vert \psi \rangle / \langle \psi \vert \psi \rangle$ | 19.1 |
| Ritz method | Linear variational method: expand in fixed basis, solve secular equation | $\det(\mathbf{H} - E\mathbf{S}) = 0$ | 19.5 |
| Local energy | Hamiltonian applied to trial function divided by trial function | $E_L(\mathbf{r}) = \hat{H}\psi(\mathbf{r})/\psi(\mathbf{r})$ | 19.6 |

New Terminology

| Term | Definition |
| --- | --- |
| Variational principle | The theorem that $\langle \psi \vert \hat{H} \vert \psi \rangle / \langle \psi \vert \psi \rangle \geq E_0$ for any state $\vert \psi \rangle$ |
| Trial wavefunction | A parameterized function used as an approximation to the true ground state |
| Variational parameter | An adjustable parameter in the trial wavefunction, optimized to minimize energy |
| Upper bound | A value guaranteed to be greater than or equal to the true ground state energy |
| Energy functional | The mapping $E[\psi] = \langle \psi \vert \hat{H} \vert \psi \rangle / \langle \psi \vert \psi \rangle$, also called the Rayleigh quotient |
| Ritz variational method | The linear variational method using a fixed basis set; reduces to a matrix eigenvalue problem |
| Linear variational method | Using a trial function that is a linear combination of basis functions with coefficients as variational parameters |
| Secular equation | The generalized eigenvalue equation $\mathbf{H}\mathbf{c} = E\mathbf{S}\mathbf{c}$ arising from the Ritz method |
| Secular determinant | $\det(\mathbf{H} - E\mathbf{S}) = 0$; the condition for nontrivial solutions of the secular equation |
| Variational Monte Carlo | A stochastic method that estimates ground state energies by sampling from $\vert \psi \vert^2$ using the Metropolis algorithm |

Techniques Introduced

| Technique | Description | Primary Application |
| --- | --- | --- |
| Nonlinear variational optimization | Choose a parameterized trial function, compute $E(\boldsymbol{\alpha})$ analytically or numerically, minimize over parameters | Ground state energies of atoms (helium), molecules (H₂⁺) |
| Linear variational (Ritz) method | Expand in a fixed basis, construct $\mathbf{H}$ and $\mathbf{S}$ matrices, solve the generalized eigenvalue problem | All of computational quantum mechanics / quantum chemistry |
| Variational Monte Carlo | Estimate high-dimensional energy integrals by Metropolis sampling from $\vert \psi \vert^2$ | Many-electron systems (atoms, molecules, solids) |