Appendix E — Computational Chemistry Software Setup
Reference for the computational exercises across the book, plus enough method-selection guidance to extend your work beyond what we assigned. Computational chemistry is now standard infrastructure in modern organic — treat it as a sixth spectroscopic tool.
1. Why computational chemistry matters in modern organic
Three concrete uses appear throughout this book:
- Predicting reactivity — HOMO/LUMO surfaces (Ch 2, 19), electrostatic potential maps (Ch 3), Fukui indices for site selectivity in EAS (Ch 21).
- Transition state energies — locating TS structures to rationalize selectivity (Ch 10 SN2 vs SN1 partitioning, Ch 19 endo/exo Diels-Alder, Ch 39 sigmatropic stereospecificity).
- NMR shift prediction — DFT GIAO calculations to assign ambiguous diastereomers and natural product structures (Ch 6, Ch 38). Modern GIAO/DP4+ analysis frequently distinguishes regio- and stereoisomers when experimental NMR is ambiguous.
Other routine uses: IR frequency assignment, conformational scanning, pKa estimation, dipole moment, partial charges, NBO bonding analysis, NCI (non-covalent interaction) surfaces.
Limits to remember: computational results are estimates, not measurements. Typical errors are ±2 kcal/mol for DFT energies on organic systems, ±0.1-0.3 ppm for ¹H NMR shifts, and ±2-5 ppm for ¹³C shifts. Reliability generally drops as systems get larger and more flexible and improves with method cost, so pick the method to match the question.
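To put a ±2 kcal/mol uncertainty in perspective, the sketch below converts a free-energy difference between two competing pathways into a product ratio using the Boltzmann factor exp(ΔΔG/RT). The function name and the sampled ΔΔG values are illustrative assumptions, not part of the book's exercises.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def product_ratio(ddg_kcal: float, temp_k: float = 298.15) -> float:
    """Ratio favoured:disfavoured for pathways whose barriers differ by ddg_kcal."""
    return math.exp(ddg_kcal / (R_KCAL * temp_k))

# Illustrative values only: how fast the ratio grows with the energy gap
for ddg in (0.5, 1.0, 2.0, 3.0):
    print(f"ddG = {ddg:.1f} kcal/mol -> ratio ~ {product_ratio(ddg):.1f}:1")
```

At room temperature a 2 kcal/mol gap already corresponds to a ratio of roughly 29:1, so a DFT error of that size can move a predicted selectivity from "modest" to "essentially complete" or back.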
2. Tool tiers
Free / open source
| Tool | Purpose | Notes |
|---|---|---|
| Avogadro | GUI builder + visualizer + FF optimization | Drives most of the book's computational exercises |
| ORCA | DFT, MP2, CCSD(T), TS search | Free for academic use; binaries from orcaforum.kofo.mpg.de |
| NWChem | Multi-purpose QM, plane-wave DFT | DOE-supported; strong for solid state, weaker UX |
| PSI4 | Python-driven QM | Excellent for scripting; pip-installable |
| xTB / CREST | GFN-xTB semi-empirical | Fast conformer searches; Grimme group |
| RDKit | Cheminformatics, 2D/3D, FF | Python; used in our scripts/generate_structures.py |
| Open Babel | File format conversion, FF | CLI swiss-army knife |
| Jmol / PyMOL / VMD | Visualization | PyMOL excellent for protein-ligand views |
Academic (license required)
| Tool | Strength |
|---|---|
| Gaussian | Industry standard; widest method coverage; GaussView GUI |
| GAMESS | Free for academic, written request; strong MCSCF/CASSCF |
| Spartan | Best teaching GUI; bundled curricula |
| Schrödinger (Jaguar, Glide, Maestro) | Pharma-grade docking + QM |
| Q-Chem | Strong excited-state, range-separated DFT |
| Molpro | High-accuracy correlated methods |
Web / cloud
- WebMO — browser front-end to Gaussian/GAMESS/Q-Chem; departmental install
- Chemcraft — Windows visualization for output files (free for non-commercial)
- IQmol — free GUI tied to Q-Chem
- Chemcompute.org — free shared GAMESS/PSI4 access for students
For this book's exercises, the free stack (Avogadro + ORCA + RDKit) covers every problem.
3. Setting up Avogadro
avogadro.cc. Versions referenced here: Avogadro 1.2.x (classic, stable) and Avogadro 2.0+ (rewrite, faster, plugin-based). The book's screenshots use 1.2.x; 2.0 has identical menus for the operations we need.
Windows
- Download the .exe installer from avogadro.cc.
- Run the installer. Default install path: C:\Program Files\Avogadro.
- Launch from the Start menu.
macOS
- Download the .dmg.
- Drag Avogadro.app to /Applications.
- On first launch: right-click → Open (bypasses Gatekeeper warning).
Linux
sudo apt install avogadro # Debian/Ubuntu
sudo dnf install avogadro # Fedora
flatpak install flathub cc.avogadro.Avogadro2 # Avogadro 2 via Flatpak
First-run sanity check
- Click empty canvas → places C atom.
- Drag → forms bond to new C.
- Add Hs: Build → Add Hydrogens (or toolbar H button).
- Optimize: Extensions → Optimize Geometry (default UFF).
- Energy reported in status bar.
Force-field choice within Avogadro
- UFF: Universal; covers full periodic table; less accurate for organics.
- MMFF94: Best default for closed-shell organics. Use this.
- GAFF (General Amber Force Field): parameterized for drug-like organic molecules and ligands; designed to pair with AMBER biomolecular force fields.
- Ghemical: Legacy; avoid.
4. Installing ORCA
orcaforum.kofo.mpg.de. Free for academic and personal non-commercial use; registration required before download. Commercial users need a license. Versions current at time of writing: ORCA 6.x.
Download
- Register on the ORCA forum.
- Download the platform binary (Windows: .zip; Linux: .tar.xz; macOS: ARM/x86 builds).
- Extract to a permanent path, e.g. C:\orca\ or /opt/orca/.
Environment variables (Windows PowerShell)
$env:Path = "C:\orca;$env:Path"
$env:OMP_NUM_THREADS = "4"
Add the same Path entry permanently in System Properties → Environment Variables.
Linux/macOS .bashrc or .zshrc
export PATH=/opt/orca:$PATH
export LD_LIBRARY_PATH=/opt/orca:$LD_LIBRARY_PATH
export OMP_NUM_THREADS=4
First input file — water DFT optimization
File water.inp:
! B3LYP def2-SVP Opt
%pal nprocs 4 end
* xyz 0 1
O 0.000000 0.000000 0.117790
H 0.000000 0.755453 -0.471161
H 0.000000 -0.755453 -0.471161
*
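Run:
/opt/orca/orca water.inp > water.out
Because the input requests %pal nprocs 4, ORCA must be called by its full path (adjust /opt/orca to your install location) and needs the MPI library it was built against; the core count is set by %pal, not by OMP_NUM_THREADS. Output appears in water.out. The optimized geometry is written to water.xyz and the orbitals to water.gbw. Thermochemistry (ZPE, H, S, G) is printed near the end of the .out file only when Freq is also requested.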
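Pulling the energy out of the output by hand gets tedious once you run more than a few jobs. The following is a minimal sketch that assumes the output contains lines beginning with "FINAL SINGLE POINT ENERGY" (one per optimization step, so the last one belongs to the final geometry); the function name and the sample text with its energies are made up for illustration.

```python
import re

def final_energy(output_text: str) -> float:
    """Return the last 'FINAL SINGLE POINT ENERGY' value (Hartree) found in the text."""
    matches = re.findall(r"FINAL SINGLE POINT ENERGY\s+(-?\d+\.\d+)", output_text)
    if not matches:
        raise ValueError("no final energy found; did the job finish?")
    return float(matches[-1])

# Hypothetical excerpt standing in for the contents of water.out
sample = """
FINAL SINGLE POINT ENERGY       -76.320000000000
(more optimization cycles)
FINAL SINGLE POINT ENERGY       -76.321500000000
"""
print(final_energy(sample))
```

For a real job, read the file first, for example `final_energy(open("water.out").read())`.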
Opt + Freq + single-point (typical workflow)
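File opt.inp (geometry + frequencies, small basis):
! B3LYP def2-SVP Opt Freq
%pal nprocs 8 end
%maxcore 3000
* xyz 0 1
[coordinates]
*
File sp.inp (single point, larger basis; run after opt.inp finishes):
! B3LYP def2-TZVP
%pal nprocs 8 end
%maxcore 3000
* xyzfile 0 1 opt.xyz
ORCA merges all ! lines in a job into one keyword list, so a second ! line does not start a second calculation; stacking two basis sets in one job only creates a conflict. Running the single point as its own input that reads the optimized geometry (opt.xyz) gives the standard "optimize cheap, refine energy" pattern.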
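A small helper can generate one input file per step of the "optimize cheap, refine energy" workflow. This is a sketch under the assumption that the single point is run as its own job and reads the optimized geometry through a `* xyzfile` line; the function name, its parameters, and the file name opt.xyz are hypothetical.

```python
def orca_input(keywords, *, coords=None, xyzfile=None,
               charge=0, mult=1, nprocs=8, maxcore_mb=3000):
    """Build ORCA input text from inline coordinates or an external .xyz file."""
    if (coords is None) == (xyzfile is None):
        raise ValueError("give exactly one of coords or xyzfile")
    lines = [f"! {keywords}", f"%pal nprocs {nprocs} end", f"%maxcore {maxcore_mb}"]
    if xyzfile is not None:
        lines.append(f"* xyzfile {charge} {mult} {xyzfile}")
    else:
        lines.append(f"* xyz {charge} {mult}")
        lines.extend(row.strip() for row in coords.strip().splitlines())
        lines.append("*")
    return "\n".join(lines) + "\n"

WATER = """
O 0.000000 0.000000 0.117790
H 0.000000 0.755453 -0.471161
H 0.000000 -0.755453 -0.471161
"""

opt_text = orca_input("B3LYP def2-SVP Opt Freq", coords=WATER)
sp_text = orca_input("B3LYP def2-TZVP", xyzfile="opt.xyz")
print(opt_text)
print(sp_text)
```

Writing `opt_text` to opt.inp and `sp_text` to sp.inp gives two jobs that can be run in sequence.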
5. Setting up RDKit in Python
Conda (recommended)
conda create -n rdkit-env python=3.11
conda activate rdkit-env
conda install -c conda-forge rdkit
Pip
pip install rdkit
The pure-pip wheel works on Linux/macOS/Windows since RDKit 2022.09.
Minimum example — SMILES to 3D to FF-optimized
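from rdkit import Chem
from rdkit.Chem import AllChem, Draw
# 2-bromobutane; keep the H-free molecule for the 2D depiction
mol2d = Chem.MolFromSmiles('CCC(C)Br')
mol = Chem.AddHs(mol2d)
# Embed 3D coords using ETKDG (Riniker-Landrum); returns -1 on failure
if AllChem.EmbedMolecule(mol, AllChem.ETKDGv3()) == -1:
    raise RuntimeError('3D embedding failed')
# Optimize with MMFF94 (returns 0 when converged)
AllChem.MMFFOptimizeMolecule(mol)
# Write .xyz for ORCA / Avogadro
Chem.MolToXYZFile(mol, '2-bromobutane.xyz')
# 2D depiction, drawn from the molecule without explicit Hs or 3D coords
Draw.MolToFile(mol2d, '2-bromobutane.png', size=(400, 400))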
Conformer search (ETKDG)
from rdkit import Chem
from rdkit.Chem import AllChem
mol = Chem.AddHs(Chem.MolFromSmiles('CC(C)(C)OC1CCCCC1'))
params = AllChem.ETKDGv3()
params.numThreads = 4
cids = AllChem.EmbedMultipleConfs(mol, numConfs=50, params=params)
# Optimize each conformer, collect energies
results = AllChem.MMFFOptimizeMoleculeConfs(mol, numThreads=4)
energies = [(cid, e) for cid, (status, e) in zip(cids, results) if status == 0]
energies.sort(key=lambda x: x[1])
best_cid, best_e = energies[0]
print(f"Lowest-E conformer: id={best_cid}, E={best_e:.3f} kcal/mol")
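The sorted conformer energies can be turned into approximate populations with Boltzmann weights. The sketch below uses only the standard library; the three relative energies are invented for illustration, and populations derived from force-field energies should be read as rough guidance only.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def boltzmann_populations(energies_kcal, temp_k=298.15):
    """Fractional populations from conformer energies given in kcal/mol."""
    e_min = min(energies_kcal)
    weights = [math.exp(-(e - e_min) / (R_KCAL * temp_k)) for e in energies_kcal]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical relative energies for three conformers
pops = boltzmann_populations([0.0, 0.5, 1.7])
for e, p in zip([0.0, 0.5, 1.7], pops):
    print(f"dE = {e:.1f} kcal/mol -> {100 * p:.1f} %")
```

Subtracting the minimum energy before exponentiating keeps the weights well scaled even when absolute energies are large.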
6. Method selection guide
| Method class | When to use | Typical cost (relative) | Comment |
|---|---|---|---|
| MM / Force field (MMFF, UFF, GAFF, OPLS) | Conformer searches, large molecule pre-opt, FF-relevant scans | 1 | No electronic structure — no bond breaking, no excited states |
| Semi-empirical (PM6, PM7, GFN2-xTB, AM1) | Geometry pre-opt of 100-1000 atom systems; rough TS scans | 10-100 | GFN2-xTB now competitive with low-cost DFT for organics |
| DFT | The default for organic chemistry — energies, geometries, IR, NMR, TS | 10³-10⁴ | See functional table below |
| MP2 | When DFT fails for dispersion or anion stability; small benchmarks | 10⁴-10⁵ | Scales O(N⁵); double-hybrid DFT is often better value |
| CCSD(T) | Gold standard for small-molecule energies, benchmarks | 10⁶+ | Practical limit ~20 heavy atoms; basis set extrapolation typical |
DFT functional choices
| Functional | Use case |
|---|---|
| B3LYP | Generic workhorse; underestimates dispersion — pair with D3(BJ) or D4 correction |
| B3LYP-D3(BJ) | Add empirical dispersion to B3LYP — now near-default for organic geometries |
| ωB97X-D | Range-separated + dispersion; excellent for thermochemistry, kinetics |
| M06-2X | Truhlar's functional, strong for kinetics, noncovalent interactions |
| PBE0 | Hybrid GGA, fast, reliable |
| B97-3c, r²SCAN-3c | "3c" composite methods (Grimme) — DFT + basis + corrections bundled, cheap and accurate |
| TPSS, M06-L | Pure (non-hybrid) functionals — cheaper, OK for geometries |
| DLPNO-CCSD(T) | Not a DFT functional: a local coupled-cluster method, listed here as the high-accuracy reference; extends CCSD(T)-like accuracy to 100+ atoms |
For 90% of organic questions in this book: B3LYP-D3(BJ)/def2-SVP for geometry + frequencies, ωB97X-D/def2-TZVP for single-point energies. For NMR: mPW1PW91/6-311+G(2d,p) GIAO is a widely cited recipe.
7. Basis set guide
| Basis | Quality | Use case |
|---|---|---|
| STO-3G | Minimal | Pedagogy only — don't publish |
| 3-21G, 6-31G | Split-valence | Quick scans, very rough |
| 6-31G(d) = 6-31G* | Polarization on heavies | Old default; OK for geometry |
| 6-31+G(d) | + diffuse on heavies | Anions, lone-pair-heavy systems |
| 6-311+G(d,p) | Triple-zeta, polarization on H + heavies, diffuse | Standard energy/property basis |
| def2-SVP | Karlsruhe split-valence + polarization | Modern default geometry |
| def2-TZVP | Triple-zeta + polarization | Energies, thermochemistry |
| def2-TZVPP | Larger polarization | Tighter benchmarks |
| def2-QZVPP | Quadruple-zeta | Near-CBS benchmarks |
| cc-pVDZ, cc-pVTZ, cc-pVQZ | Dunning correlation-consistent | CCSD(T) extrapolations |
| aug-cc-pVTZ | + diffuse | Anions, polarizabilities |
Heuristic: def2-SVP for geometry, def2-TZVP for energy, add diffuse functions when treating anions, hydrogen-bonded clusters, or excited states.
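The "CCSD(T) extrapolations" entry for the cc-pVXZ family refers to estimating the complete-basis-set (CBS) limit from two basis sets. One widely used form is the two-point inverse-cubic formula for the correlation energy, sketched below. It applies to the correlation part only (the Hartree-Fock part is usually handled separately), and the function name and test numbers are illustrative assumptions.

```python
def cbs_two_point(e_corr_x, e_corr_y, x, y):
    """Inverse-cubic CBS estimate from correlation energies at cardinal numbers x > y.

    Assumes E_corr(X) = E_CBS + A / X**3, e.g. x=4 (cc-pVQZ) and y=3 (cc-pVTZ).
    """
    return (x**3 * e_corr_x - y**3 * e_corr_y) / (x**3 - y**3)

# Synthetic data that follow the assumed model exactly (E_CBS = -0.300, A = 0.1)
e_tz = -0.300 + 0.1 / 3**3
e_qz = -0.300 + 0.1 / 4**3
print(cbs_two_point(e_qz, e_tz, 4, 3))
```

Real correlation energies follow the inverse-cubic model only approximately, so the extrapolated value remains an estimate.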
8. Common calculation types
| Type | ORCA keyword | What you get |
|---|---|---|
| Geometry optimization | Opt | Local minimum on PES; .xyz of optimized structure |
| Frequency | Freq | Vibrational frequencies, IR intensities, ZPE, thermochemistry (S, H, G) |
| Single-point energy | (no Opt) | Electronic energy at fixed geometry |
| Transition state | OptTS (with an initial Hessian, e.g. %geom Calc_Hess true end) | Saddle point; exactly one imaginary frequency |
| QST2 / QST3 | (Gaussian) | TS interpolated between two/three reference structures |
| IRC | IRC | Reaction path forward + backward from TS to reactants/products |
| NMR shielding | NMR | Isotropic shieldings σ; δ = σ(TMS) - σ, with TMS computed at the same level of theory |
| NBO | %nbo block (requires the external NBO program) | Natural bond orbital populations, hyperconjugation analysis |
| NCI | Post-processing of the computed density (e.g. NCIPLOT, Multiwfn) | Non-covalent interaction surfaces (Yang and co-workers) |
| Excited state (TD-DFT) | %tddft nroots N end | Vertical excitation energies, oscillator strengths |
| Solvation | ! CPCM(water) | Implicit solvent correction (PCM, SMD, COSMO) |
TS validation: always run Freq after OptTS and check that exactly one imaginary frequency is present and its mode visually corresponds to the bond making/breaking motion. Then run IRC to verify it connects the intended reactant and product.
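The imaginary-frequency count can be checked automatically before opening a visualizer. The sketch below assumes the frequency block lists one mode per line in the form `6:  -450.12 cm**-1`, with imaginary modes printed as negative numbers; the sample text and function names are made up, and the regular expression may need adjusting for your ORCA version.

```python
import re

# Assumed line format in the vibrational frequency block, e.g. "   6:   -450.12 cm**-1"
FREQ_LINE = re.compile(r"^\s*\d+:\s+(-?\d+\.\d+)\s+cm\*\*-1", re.MULTILINE)

def frequencies(output_text):
    """Collect every frequency (cm^-1) printed in the assumed format."""
    return [float(v) for v in FREQ_LINE.findall(output_text)]

def classify(freqs):
    """Label a stationary point by its number of imaginary (negative) modes."""
    n_imag = sum(1 for f in freqs if f < 0.0)
    if n_imag == 0:
        return "minimum"
    if n_imag == 1:
        return "first-order saddle (TS candidate)"
    return f"higher-order saddle ({n_imag} imaginary modes)"

# Hypothetical excerpt of a frequency block
sample = """
VIBRATIONAL FREQUENCIES
-----------------------
   0:         0.00 cm**-1
   1:         0.00 cm**-1
   6:      -450.12 cm**-1 ***imaginary mode***
   7:       210.45 cm**-1
   8:       985.30 cm**-1
"""
freqs = frequencies(sample)
print(classify(freqs))
```

A single imaginary mode is necessary but not sufficient: the mode still has to be inspected and the IRC run, as described above.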
9. Computational exercise solutions (cross-reference)
| Exercise | Chapter | Method | Expected result |
|---|---|---|---|
| Methane, ethane build | Ch 1 | Avogadro UFF | Tetrahedral C (Td symmetry for methane); C-H 1.09 Å; H-C-H ~109.5° |
| Ethane rotation barrier | Ch 1 | MMFF94 dihedral scan | ~2.9 kcal/mol staggered → eclipsed |
| Ethylene HOMO/LUMO | Ch 2 | Avogadro ext. or ORCA HF/STO-3G | HOMO = π; LUMO = π*; both have a node in the molecular plane, and π* adds one at the midpoint of C-C |
| Electrostatic potential, HCl | Ch 3 | DFT B3LYP/6-31G(d) | δ⁻ red on Cl, δ⁺ blue on H |
| Cyclohexane chair | Ch 5 | MMFF94 opt | Chair lower than twist-boat by ~5 kcal/mol |
| Methyl A-value | Ch 5 | MMFF94 axial vs equatorial | ΔE ≈ 1.7 kcal/mol (eq favored) |
| IR prediction of acetone | Ch 6 | B3LYP/6-31G(d) Freq | C=O stretch ~1715 cm⁻¹ (after 0.96 scale) |
| ¹H NMR of toluene | Ch 6 | mPW1PW91/6-311+G(2d,p) GIAO | δ ~7.2 (aryl), 2.3 (CH₃) |
| SN2 TS for Cl⁻ + CH₃Br | Ch 10 | ωB97X-D/def2-TZVP OptTS | Trigonal bipyramidal C; one imag freq ~-450 cm⁻¹ |
| Carbocation stabilities | Ch 11 | DFT isodesmic | Relative E: 3° < 2° < 1° < methyl (3° lowest in energy, so most stable) |
| Diels-Alder endo/exo | Ch 19 | M06-2X/def2-SVP | endo TS ~1-2 kcal/mol lower |
| Aromatic substitution σ⁺ | Ch 21 | B3LYP charges, Fukui | Para preferred for OMe; meta for NO₂ |
| Aldol TS Zimmerman-Traxler | Ch 28 | B3LYP TS | Chair TS with Z-enolate gives syn product |
For each: build in Avogadro → preoptimize MMFF94 → export .xyz → wrap with ORCA input → run → analyze.
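Two of the rows above need a small post-processing step before comparing with experiment: computed NMR shieldings must be referenced to TMS, and harmonic IR frequencies are usually scaled. The sketch below shows both conversions. The shielding and raw-frequency numbers are invented for illustration, and the reference shielding must come from a TMS calculation at the same level of theory.

```python
def chemical_shift(sigma_nucleus, sigma_reference):
    """delta (ppm) = sigma(reference) - sigma(nucleus), both isotropic shieldings."""
    return sigma_reference - sigma_nucleus

def scale_frequency(raw_cm1, factor=0.96):
    """Apply an empirical scale factor to a harmonic frequency (cm^-1)."""
    return raw_cm1 * factor

# Hypothetical values: an aryl 1H shielding, a TMS 1H shielding, a raw C=O stretch
print(f"delta = {chemical_shift(24.6, 31.8):.1f} ppm")
print(f"scaled C=O = {scale_frequency(1786.0):.0f} cm^-1")
```

The appropriate scale factor depends on the functional and basis set; 0.96 is the value quoted for the acetone exercise.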
10. Sanity-checking results
Computational chemistry rewards skepticism. Run through this checklist on any new result before trusting it:
| Symptom | Cause | Fix |
|---|---|---|
| Multiple imaginary frequencies after Opt | Not a minimum: higher-order saddle point | Displace along each imaginary mode, reoptimize |
| One imaginary freq after Opt (not TS) | True first-order saddle (a transition state), or a spurious small imaginary mode from a flat PES or loose convergence | Tighten convergence (! TightOpt); displace along the mode and reopt |
| No imaginary freq after OptTS | Not a TS | Restart from displaced TS guess |
| SCF convergence failure | Bad initial guess; near-degeneracy | Use ! SlowConv or ! VerySlowConv; start from converged orbitals of a cheaper calculation (! MORead with %moinp "file.gbw") |
| Wildly wrong bond length (>0.1 Å off) | Symmetry constraint accidentally imposed, or wrong charge/multiplicity | Check * line: charge, multiplicity correct? |
| Energy off by ~30 kcal/mol vs expectation | Forgot dispersion correction on B3LYP | Use B3LYP-D3(BJ) or switch functional |
| Atomic charges look nonsensical (wrong sign, strongly basis-set dependent) | Mulliken artifact | Use Hirshfeld, CM5, or NBO charges instead |
| Frequencies look fine but G° wrong | Solvation omitted, or the intended temperature differs from the default 298.15 K | Add ! CPCM(solvent); set the temperature in the %freq block (e.g. %freq Temp 298.15 end) |
| TS connects wrong reactants/products | Wrong saddle | Run IRC; if wrong, restart from better guess |
| "Wrong" stereochemistry from opt | Local minimum, not global | Run conformer search first (xTB/CREST or RDKit ETKDG) |
Always visualize the result. A geometry that converges to the right energy but looks distorted is usually a bug.
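Visual inspection can be backed up by a quick numeric check of key bond lengths against the 0.1 Å threshold in the table. The sketch below parses XYZ text and flags bonds that deviate from reference values you supply. The function names and the 0.96 Å O-H reference are illustrative assumptions; the coordinates are the water starting geometry from section 4.

```python
import math

def parse_xyz(text):
    """Parse XYZ text: atom count, comment line, then 'symbol x y z' rows."""
    lines = text.strip().splitlines()
    n_atoms = int(lines[0])
    atoms = []
    for row in lines[2:2 + n_atoms]:
        symbol, x, y, z = row.split()[:4]
        atoms.append((symbol, float(x), float(y), float(z)))
    return atoms

def distance(a, b):
    """Distance in angstrom between two (symbol, x, y, z) tuples."""
    return math.dist(a[1:], b[1:])

def suspicious_bonds(atoms, expected, tol=0.1):
    """expected maps (i, j) atom-index pairs to reference lengths in angstrom."""
    flagged = []
    for (i, j), ref in expected.items():
        d = distance(atoms[i], atoms[j])
        if abs(d - ref) > tol:
            flagged.append((i, j, round(d, 3), ref))
    return flagged

water = """3
water, starting geometry from section 4
O 0.000000 0.000000 0.117790
H 0.000000 0.755453 -0.471161
H 0.000000 -0.755453 -0.471161
"""
atoms = parse_xyz(water)
print(suspicious_bonds(atoms, {(0, 1): 0.96, (0, 2): 0.96}))
```

An empty list means every checked bond is within the tolerance; anything flagged is a prompt to recheck charge, multiplicity, and the starting structure.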
11. Reading the literature
When citing computational results, papers report at minimum:
- Method: functional + basis set, e.g., "B3LYP-D3(BJ)/def2-TZVP//B3LYP-D3(BJ)/def2-SVP" (energy basis // geometry basis).
- Software + version: "ORCA 5.0.4" or "Gaussian 16, rev C.01."
- Solvation model: "SMD(toluene)" or "gas phase."
- Thermal corrections: "Gibbs free energies at 298.15 K, 1 atm."
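The "//" notation usually implies a composite free energy: the electronic energy from the higher level is combined with the thermal correction obtained at the geometry level, G ≈ E(high) + [G(low) - E(low)]. The sketch below shows that bookkeeping as it is commonly done; it is not a recipe taken from any specific paper, and all energies in it are invented.

```python
HARTREE_TO_KCAL = 627.509474

def composite_gibbs(e_high, e_low, g_low):
    """G ~ E_high + (G_low - E_low); all values in Hartree."""
    return e_high + (g_low - e_low)

def barrier_kcal(g_ts, g_reactant):
    """Free-energy barrier in kcal/mol from two Gibbs energies in Hartree."""
    return (g_ts - g_reactant) * HARTREE_TO_KCAL

# Hypothetical energies for a reactant and a transition state
g_reactant = composite_gibbs(e_high=-100.500, e_low=-100.300, g_low=-100.250)
g_ts = composite_gibbs(e_high=-100.468, e_low=-100.270, g_low=-100.222)
print(f"barrier ~ {barrier_kcal(g_ts, g_reactant):.1f} kcal/mol")
```

Both species must be treated with the same pair of levels, the same solvation model, and the same temperature for the difference to be meaningful.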
Typical error bars for organic chemistry DFT:
- Bond lengths: ±0.01-0.02 Å
- Bond angles: ±1-2°
- Reaction enthalpies: ±2-3 kcal/mol (good functional); ±5+ (poor choice)
- Activation barriers: ±1-3 kcal/mol
- ¹H NMR shifts: ±0.1-0.3 ppm after referencing
- ¹³C NMR shifts: ±2-5 ppm
- IR frequencies: scale factor ~0.96-0.97 needed; ±20 cm⁻¹ residual
Foundational papers to cite for method validation:
- Becke 1993; Lee, Yang, Parr 1988 (B3LYP)
- Grimme et al. 2010 / 2011 (D3, BJ damping)
- Weigend & Ahlrichs 2005 (def2 basis sets)
- Zhao & Truhlar 2008 (M06 family)
- Chai & Head-Gordon 2008 (ωB97X-D)
- Riniker & Landrum 2015 (ETKDG)
- Grimme et al. 2017 (GFN-xTB)
Computational chemistry is fast, cheap, and increasingly trustworthy — but only when you understand what you asked the computer to do.