In This Chapter
- 31.1 Retrosynthetic logic: working backward
- 31.2 The disconnection cookbook
- 31.3 Strategic bond reasoning
- 31.4 Protecting groups: the rules
- 31.5 Worked retrosynthesis: Lidocaine
- 31.6 Worked retrosynthesis: Ibuprofen (the BHC route)
- 31.7 Worked retrosynthesis: Atenolol (a beta-blocker)
- 31.8 Convergent vs linear synthesis
- 31.9 Modern tools: computer-aided synthesis planning
- 31.10 Summary
Chapter 31 — Synthesis Workshop 2: Retrosynthetic Analysis and Multi-Step Synthesis Design
"Retrosynthetic analysis is the chemist's superpower. Given any target molecule, you can work backward to commercially available starting materials, finding a route that no one has done before, in a few minutes of thought. The principles are simple; the application is endless." — paraphrase from Corey, The Logic of Chemical Synthesis
"A synthesis is just a problem in graph theory. The target is one node; the starting materials are other nodes; the chemistry is the edges. Find the shortest path." — paraphrase from a synthesis text
This is the second synthesis workshop, building on the first (Chapter 14). Now you have the full toolkit of Chapters 10–30: substitution and elimination (Chs 10–14), addition (Chs 15–19), aromatic chemistry (Chs 20–23), all three carbonyl reactivity families (Chs 24–29), and amine chemistry (Ch 30). You can now plan multi-step syntheses of drug-sized molecules — and that is what this chapter teaches.
The chapter:
1. Reviews the retrosynthetic disconnection rules for each major functional group.
2. Develops strategic-bond reasoning: which bonds to disconnect first.
3. Discusses protecting-group strategy.
4. Explores convergent vs. linear synthesis.
5. Walks through complete retrosyntheses of several real drugs (lidocaine, ibuprofen, atenolol, others).
By the end of this chapter you should be able to:
- Perform multi-step retrosynthesis on a drug-sized molecule.
- Identify strategic bond disconnections.
- Plan protecting-group strategies.
- Compare and execute linear vs convergent syntheses.
- Justify choice of reagents and conditions for each step.
- Design clean syntheses with high overall yield.
31.1 Retrosynthetic logic: working backward
The basic move of retrosynthesis: given a target molecule (TM), identify a bond that you could form (in the synthesis direction) using a reliable reaction. Disconnect that bond on paper; the result is one or two precursors that are simpler than the target. Repeat the process on each precursor until you reach commercially available starting materials.
Mechanism Map 31.1: A retrosynthesis cycle.
- Look at the target. Identify the most complex / "central" bond — typically a C-C or C-N bond involving a carbonyl or other reactive site.
- Mentally disconnect the bond. The two fragments are the precursors.
- For each precursor, identify the synthetic equivalent (e.g., an enolate for the donor side; a carbonyl for the electrophile side).
- Repeat for each precursor. Keep going until each fragment is simple.
- Reverse the disconnection sequence to get the synthesis (forward) direction.
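The cycle above can be sketched as a short recursive routine. This is a minimal illustration, not a real planning tool: the rule table `DISCONNECTIONS` and the `COMMERCIAL` set are hypothetical, hand-written data based on the lidocaine route of Section 31.5, and molecules are plain name strings rather than structures.

```python
# Toy retrosynthesis: disconnect repeatedly until every fragment is commercial.
# DISCONNECTIONS and COMMERCIAL are illustrative assumptions, not a database.

# target -> (forward reaction, precursors)
DISCONNECTIONS = {
    "lidocaine": (
        "SN2 amination",
        ["2-chloro-N-(2,6-dimethylphenyl)acetamide", "diethylamine"],
    ),
    "2-chloro-N-(2,6-dimethylphenyl)acetamide": (
        "amide formation",
        ["2,6-dimethylaniline", "chloroacetyl chloride"],
    ),
}
COMMERCIAL = {"diethylamine", "2,6-dimethylaniline", "chloroacetyl chloride"}


def retrosynthesize(target):
    """Return the steps in forward order as (reaction, precursors, product)."""
    if target in COMMERCIAL:
        return []  # nothing to make: stop disconnecting
    if target not in DISCONNECTIONS:
        raise ValueError(f"no known disconnection for {target!r}")
    reaction, precursors = DISCONNECTIONS[target]
    steps = []
    for precursor in precursors:
        steps.extend(retrosynthesize(precursor))  # make each precursor first
    steps.append((reaction, precursors, target))  # then form the target bond
    return steps


route = retrosynthesize("lidocaine")
for number, (reaction, precursors, product) in enumerate(route, 1):
    print(f"Step {number}: {' + '.join(precursors)} -> {product} ({reaction})")
```

Because each precursor's steps are collected before the step that consumes it, the returned list is already in the forward (synthesis) direction, which is the last move of the cycle.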
The notation: the symbol "$\Rightarrow$" denotes a retrosynthetic disconnection (read "can be made from"); it points from the target to its precursors. The symbol "$\to$" denotes a reaction in the forward (synthesis) direction.
Synthons vs. synthetic equivalents
A synthon is an idealized fragment of the target with electrons drawn explicitly. For example, the synthon for a Michael acceptor is "an electrophilic β-C of an α,β-unsaturated carbonyl" — drawn as $C^+$. The synthetic equivalent is the actual molecule used: methyl vinyl ketone (an enone with the right reactivity).
Similarly, a Grignard synthon is "$R^-$" (a carbanion); the synthetic equivalent is R-MgX.
This vocabulary lets you think abstractly about the chemistry, then choose specific reagents.
31.2 The disconnection cookbook
For each functional group in the target, there is a canonical retrosynthesis. Here are the most useful ones from the carbonyl chemistry of Chs 24–30:
Esters
$$RCOOR' \Rightarrow RCOOH + R'OH$$ Forward: Fischer esterification (acid + alcohol + H⁺) OR acid chloride + alcohol.
Amides
$$RCONR'_2 \Rightarrow RCOOH + R'_2NH$$ Forward: DCC coupling (mild, no racemization); OR acid chloride + amine; OR mixed anhydride + amine.
Secondary alcohols
$$R-CH(OH)-R' \Rightarrow R-CHO + R'^-$$ Forward: aldehyde + Grignard (R'-MgX) → secondary alcohol after workup.
Tertiary alcohols
$$R-C(OH)(R')(R'') \Rightarrow R-CO-R' + R''^-$$ Forward: ketone + Grignard → tertiary alcohol.
β-hydroxy carbonyls
$$R-CH(OH)-CHR'-CO-R'' \Rightarrow R-CHO + R'CH_2-CO-R''$$ Forward: aldol reaction (the second compound's α-C attacks the first compound's C=O).
α,β-Unsaturated carbonyls (enones)
$$R-CH=CR'-CO-R'' \Rightarrow R-CHO + R'CH_2-CO-R''$$ Forward: aldol condensation (aldol + dehydration). Same disconnection as β-hydroxy, but the product has lost water.
β-Keto esters
$$RCO-CHR'-CO_2R'' \Rightarrow RCO_2R'' + R'CH_2-CO_2R''$$ Forward: Claisen condensation (the second ester's α-C attacks the first ester's C=O).
1,5-Dicarbonyls
$$R-CO-CH_2CH_2CH_2-CO-R' \Rightarrow R-CO-CH_3 + CH_2=CH-CO-R'$$ Forward: Michael addition (an enolate of the first compound attacks the β-C of the second). Three carbons separate the two carbonyls, which is what makes this a 1,5-relationship.
6-Membered enones (Robinson annulation products)
$$\text{6-ring enone fused to other ring} \Rightarrow \text{1,5-diketone} \Rightarrow \text{ketone enolate + MVK}$$ Forward: Robinson annulation (Michael + intramolecular aldol + dehydration).
Amines from carbonyls
$$RCH_2-NHR' \Rightarrow RCHO + R'NH_2$$ Forward: reductive amination (carbonyl + amine + NaBH₃CN).
Primary amines from alkyl halides
$$RCH_2-NH_2 \Rightarrow RCH_2-X (X = Br, I)$$ Forward: Gabriel synthesis (phthalimide + R-X, then hydrazine).
Aromatic functional groups
$$Ar-OH \Rightarrow Ar-N_2^+ \Rightarrow Ar-NH_2$$ Forward: ArNH₂ + HNO₂/HCl → diazonium; + H₂O + heat → ArOH.
$$Ar-X \Rightarrow Ar-N_2^+$$ Forward: Sandmeyer reaction (CuX + diazonium).
$$Ar-NH_2 \Rightarrow Ar-NO_2$$ Forward: reduction of nitro (Sn/HCl, or H₂/Pd).
Alkenes
$$RCH=CHR' \Rightarrow RCHO\ (\text{aldehyde}) + R'CH_2P^+Ph_3\ X^-$$ Forward: Wittig reaction (aldehyde + ylide → alkene; the ylide is generated from the phosphonium salt with base).
α-Alkylated carbonyls
$$RCO-CHR'(R'') \Rightarrow RCO-CH_2R' + R''-X$$ Forward: enolate (LDA at -78 °C) + alkyl halide (SN2).
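One way to keep the cookbook at hand is as a lookup table. The sketch below transcribes a subset of the entries from this section; the dictionary keys, the `Disconnection` type, and the `suggest` helper are hypothetical names chosen for illustration, and recognizing which functional groups a target contains is assumed to be done by the chemist.

```python
from typing import NamedTuple


class Disconnection(NamedTuple):
    precursors: tuple  # the simpler pieces on the right of the => arrow
    forward: str       # the reaction that joins them in the synthesis


# Subset of the Section 31.2 cookbook, keyed by functional group in the target.
COOKBOOK = {
    "ester": Disconnection(("carboxylic acid", "alcohol"), "Fischer esterification"),
    "amide": Disconnection(("carboxylic acid", "amine"), "DCC coupling"),
    "secondary alcohol": Disconnection(("aldehyde", "Grignard reagent"), "Grignard addition"),
    "tertiary alcohol": Disconnection(("ketone", "Grignard reagent"), "Grignard addition"),
    "beta-hydroxy carbonyl": Disconnection(("aldehyde", "enolizable carbonyl"), "aldol reaction"),
    "enone": Disconnection(("aldehyde", "enolizable carbonyl"), "aldol condensation"),
    "beta-keto ester": Disconnection(("ester", "enolizable ester"), "Claisen condensation"),
    "1,5-dicarbonyl": Disconnection(("ketone enolate", "enone"), "Michael addition"),
    "amine": Disconnection(("aldehyde", "amine"), "reductive amination"),
    "alkene": Disconnection(("aldehyde", "phosphonium ylide"), "Wittig reaction"),
}


def suggest(functional_groups):
    """Return (group, Disconnection) pairs for the groups the cookbook covers."""
    return [(fg, COOKBOOK[fg]) for fg in functional_groups if fg in COOKBOOK]


for group, entry in suggest(["amide", "nitrile", "ester"]):
    print(f"{group} => {' + '.join(entry.precursors)}  (forward: {entry.forward})")
```

Groups without an entry (here "nitrile") are silently skipped, so the output lists only the disconnections this section actually covers.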
31.3 Strategic bond reasoning
Not all disconnections are equally good. A "strategic bond" is one that satisfies three criteria:
- Reliable: the forward reaction is known to work for this substrate type.
- Simplifying: the precursor is structurally simpler than the target.
- Cleavable on paper: the disconnection makes sense based on the target's functional groups.
Examples of strategic bonds:
- The C-C bond between an α-C and a C=O (aldol disconnection): always strategic for any β-hydroxy or enone.
- The C-N bond in an amide (acid + amine disconnection): always strategic; very reliable.
- The C-C bond in a 1,5-dicarbonyl (Michael disconnection): strategic for 1,5-relationships.
- A C-C bond at a quaternary carbon (might be made by Grignard, alkylation, or rearrangement): often worth deep retrosynthesis.
Disconnections that are usually NOT strategic:
- Random C-C bonds without a functional group nearby (no good way to make them).
- C-H bonds in the middle of a chain (no easy retrosynthesis).
- Aromatic ring C-C bonds (made by EAS chemistry; usually as the start, not by disconnection).
The Corey approach to retrosynthesis (E. J. Corey, Nobel 1990) is a systematic application of these rules. Modern computer-aided synthesis planning (CASP) tools (Synthia, IBM RXN, etc.) automate the process.
31.4 Protecting groups: the rules
When the target has multiple functional groups that would interfere with each other in a synthesis, protect the more reactive group while you work on the other. Then deprotect at the end.
Common protecting groups
| Functional Group | Protecting Group | Conditions to add | Conditions to remove |
|---|---|---|---|
| Aldehyde / ketone | Acetal (e.g., dioxolane from ethylene glycol) | Diol + H⁺, remove H₂O | aqueous H⁺ |
| Alcohol | TBS or TBDPS silyl ether | TBSCl + imidazole | TBAF (F⁻) or aqueous HF |
| Alcohol | Methyl ether (sometimes) | NaH + MeI | BBr₃ or HBr |
| Alcohol | Benzyl ether | NaH + BnBr | H₂/Pd, Na/NH₃ |
| Amine | Boc (tert-butyloxycarbonyl) | Boc₂O + base | TFA (acid) |
| Amine | Cbz (carbobenzyloxy) | CbzCl + base | H₂/Pd or HBr/AcOH |
| Amine | Fmoc | Fmoc-Cl + base | piperidine (base) |
| Carboxylic acid | Methyl ester | CH₂N₂ or MeOH/H⁺ | NaOH/H₂O |
| Diol | Acetonide (from acetone) | acetone + H⁺ | aqueous H⁺ |
| Phenol | Methyl ether | NaH + MeI | BBr₃ |
Protection rules
- Choose orthogonal protecting groups that are deprotected by different conditions. This lets you remove one without affecting another.
- Keep each protecting group on for as few steps as the route allows: install it shortly before the step that needs it and remove it once the conflict is gone, especially if the group is expensive or labile.
- Use selective protections when possible. For example, only the more accessible OH of a diol might react with TBSCl in 5 minutes; the less accessible can be selectively protected next.
- Remember the protecting groups in your final analysis — count them as additional steps.
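The orthogonality rule can be checked mechanically once each group is assigned a removal-condition class. The sketch below is a simplification: the four class labels (acid, base, fluoride, hydrogenolysis) are this example's own coarse summary of the "Conditions to remove" column in the table above, and `REMOVAL` and `orthogonality_conflicts` are hypothetical names.

```python
from itertools import combinations

# Coarse removal-condition class for each protecting group (from the table above).
REMOVAL = {
    "acetal": "acid",                  # aqueous H+
    "acetonide": "acid",               # aqueous H+
    "Boc": "acid",                     # TFA
    "TBS": "fluoride",                 # TBAF
    "benzyl ether": "hydrogenolysis",  # H2/Pd
    "Cbz": "hydrogenolysis",           # H2/Pd
    "Fmoc": "base",                    # piperidine
    "methyl ester": "base",            # NaOH/H2O
}


def orthogonality_conflicts(groups):
    """Return pairs of protecting groups that share a removal-condition class."""
    return [
        (a, b)
        for a, b in combinations(groups, 2)
        if REMOVAL[a] == REMOVAL[b]
    ]


print(orthogonality_conflicts(["Boc", "Fmoc", "TBS"]))    # []
print(orthogonality_conflicts(["Boc", "acetal", "Cbz"]))  # [('Boc', 'acetal')]
```

An empty list means every group in the set can, in this coarse model, be removed without touching the others. Real selectivity depends on the exact conditions, so a flagged pair is a prompt to look closer rather than a verdict.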
When NOT to use protecting groups
If you can run the reaction selectively without protecting, do so. For example:
- NaBH₄ reduces aldehydes/ketones but not esters/amides; use this selectivity to spare the slower-reacting groups.
- LiAlH₄ reduces nearly every carbonyl group and is often too aggressive; use NaBH₄ or DIBAL-H when a milder reagent is sufficient.
- DCC coupling of acid + amine is mild enough to spare alcohols.
The rule of thumb: try the reaction without protecting groups first. Only add them if there's a clear conflict.
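The "try it without protection first" test can be phrased as a set intersection: which bystander groups would the chosen reagent also attack? The sketch below encodes only the two reducing agents whose scope is stated above; the `REDUCES` table and the `groups_at_risk` helper are hypothetical, and the scope sets are deliberately coarse.

```python
# Functional groups each hydride reagent reduces, as stated in this section.
REDUCES = {
    "NaBH4": {"aldehyde", "ketone"},
    "LiAlH4": {"aldehyde", "ketone", "ester", "amide"},
}


def groups_at_risk(reagent, target_group, other_groups):
    """Return the bystander groups that the reagent would also reduce."""
    scope = REDUCES[reagent]
    if target_group not in scope:
        raise ValueError(f"{reagent} does not reduce a {target_group}")
    return sorted(set(other_groups) & scope)


# Reducing a ketone in the presence of an ester:
print(groups_at_risk("NaBH4", "ketone", ["ester"]))   # []
print(groups_at_risk("LiAlH4", "ketone", ["ester"]))  # ['ester']
```

An empty result suggests no protecting group is needed for that step; a non-empty one points to either a milder reagent or a protection step.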
31.5 Worked retrosynthesis: Lidocaine
Target: lidocaine, a local anesthetic.
Structure: 2,6-dimethylaniline (an aromatic amine) attached via an amide bond to a methylene group, which carries an N,N-diethylamino substituent. Systematic name: 2-(diethylamino)-N-(2,6-dimethylphenyl)acetamide.
(CH3)2-Ar-NH-CO-CH2-N(Et)2
Retrosynthesis
Step 1: Identify the amide (C-N bond between ring-N and acyl C). Disconnect this bond. $$\text{(CH}_3\text{)}_2\text{-Ar-NH-CO-CH}_2\text{-N(Et)}_2 \Rightarrow \text{(CH}_3\text{)}_2\text{-Ar-NH}_2 + \text{HOOC-CH}_2\text{-N(Et)}_2$$
The retrosynthetic precursors are 2,6-dimethylaniline (commercial) and N,N-diethylglycine (also commercial, or can be made from diethylamine + chloroacetic acid).
Step 2: For the N,N-diethylglycine, disconnect the C-N bond. $$\text{HOOC-CH}_2\text{-N(Et)}_2 \Rightarrow \text{HOOC-CH}_2\text{-Cl} + \text{HN(Et)}_2$$
The precursors are chloroacetic acid (or chloroacetyl chloride) and diethylamine.
Step 3: Stop here — both precursors are commercially available.
Forward synthesis
- Step 1: 2,6-dimethylaniline + chloroacetyl chloride (in EtOAc, with Et₃N as base) → 2-chloro-N-(2,6-dimethylphenyl)acetamide. (One acyl substitution: amide formation from aryl amine + acid chloride.)
- Step 2: The chloride from step 1 is displaced by diethylamine (SN2 on the chloromethyl carbon). Conditions: diethylamine (excess) + heat. → lidocaine.
Total: 2 steps from commercial materials. Linear synthesis. Yield typically > 80% per step → > 64% overall.
31.6 Worked retrosynthesis: Ibuprofen (the BHC route)
Target: ibuprofen, one of the most widely used NSAIDs.
Structure: 2-(4-isobutylphenyl)propanoic acid. A para-isobutyl-substituted benzene with an α-methylcarboxylic acid attached.
Retrosynthesis
Step 1: Identify the COOH; disconnect the C-COOH bond to give a precursor with one carbon fewer plus a one-carbon source (suggesting carboxylation of a Grignard, or carbonylation). $$\text{4-iBu-Ar-CH(CH}_3\text{)-COOH} \Rightarrow \text{4-iBu-Ar-CH(CH}_3\text{)-MgBr} + CO_2$$
Forward: Grignard + CO₂ → carboxylate; protonate to give COOH.
The Grignard precursor is 1-(1-bromoethyl)-4-isobutylbenzene.
Step 2: Disconnect the C-Br bond; the precursor is the alcohol or alkene. $$\text{4-iBu-Ar-CH(CH}_3\text{)-Br} \Rightarrow \text{4-iBu-Ar-CHOH-CH}_3$$
Forward: alcohol + HBr → alkyl bromide.
Step 3: For the alcohol, disconnect to the corresponding ketone (4-isobutylacetophenone). $$\text{4-iBu-Ar-CHOH-CH}_3 \Rightarrow \text{4-iBu-Ar-CO-CH}_3$$
Forward: ketone + NaBH₄ → secondary alcohol.
Step 4: For 4-isobutylacetophenone, disconnect via Friedel-Crafts acylation to isobutylbenzene + acetic anhydride or acetyl chloride. $$\text{4-iBu-Ar-CO-CH}_3 \Rightarrow \text{4-iBu-Ar-H} + (CH_3CO)_2O$$
Forward: Friedel-Crafts acylation (anhydride + AlCl₃).
Step 5: Isobutylbenzene is commercial.
Forward synthesis (BHC process, 1990s)
- Isobutylbenzene + acetic anhydride + HF (a Friedel-Crafts catalyst) → 4-isobutylacetophenone.
- 4-Isobutylacetophenone + H₂ (catalytic hydrogenation, e.g. over Raney nickel) → 1-(4-isobutylphenyl)ethanol.
- 1-(4-Isobutylphenyl)ethanol + CO (Pd-catalyzed carbonylation) → ibuprofen. This one step replaces the bromide / Grignard / CO₂ sequence of the paper retrosynthesis above.
Total: 3 steps, linear. The BHC process replaced an older 6-step Boots process. Industrial-scale: tons per year.
31.7 Worked retrosynthesis: Atenolol (a beta-blocker)
Target: atenolol, a β1-selective beta-blocker.
Structure: 4-(2-hydroxy-3-(isopropylamino)propoxy)benzeneacetamide. Aryl ether at the para position; the side chain is -O-CH₂-CHOH-CH₂-NH-CH(CH₃)₂.
Retrosynthesis
Step 1: Identify the amine. Disconnect the C-N bond (between the side-chain CH₂ and the amine). $$\text{Ar-O-CH}_2\text{-CHOH-CH}_2\text{-NH-CH(CH}_3\text{)}_2 \Rightarrow \text{Ar-O-CH}_2\text{-CHOH-CH}_2\text{-X} + H_2N-CH(CH_3)_2$$
Forward: SN2 between an alkyl halide (or epoxide) and isopropylamine.
Step 2: Recognize that "Ar-O-CH₂-CHOH-CH₂-X" is the alcohol-and-leaving-group form of an epoxide. The epoxide is: Ar-O-CH₂-cyclo(epoxide)-CH₂.
$$\text{Ar-O-CH}_2\text{-CHOH-CH}_2\text{-X} \Rightarrow \text{Ar-O-CH}_2\text{-CH(O)-CH}_2 \text{ (epoxide)}$$
Equivalently, the precursor is a glycidyl (2,3-epoxypropyl) group attached to the phenol oxygen, i.e. an aryl glycidyl ether.
Step 3: Disconnect the aryl ether: Ar-O-CH₂- comes from a phenol + an electrophile. $$\text{Ar-O-CH}_2\text{-CH(O)-CH}_2 \Rightarrow \text{Ar-OH} + \text{Cl-CH}_2\text{-CH(O)-CH}_2 \text{ (epichlorohydrin)}$$
Forward: phenol + epichlorohydrin + base → aryl glycidyl ether.
Step 4: 4-Hydroxyphenylacetamide (the phenol with the amide) is the precursor.
Forward synthesis
- 4-Hydroxyphenylacetamide + epichlorohydrin + NaOH → aryl glycidyl ether.
- Aryl glycidyl ether + isopropylamine (excess) → atenolol (the amine opens the epoxide regioselectively at the less-hindered CH₂; the OH is on the central CH).
Total: 2 steps from commercial materials. Convergent? Slightly — one branch is the phenol, the other is the epichlorohydrin/isopropylamine pair.
31.8 Convergent vs linear synthesis
A linear synthesis processes a single starting material through many steps in sequence. Yield: $Y_{\text{overall}} = Y_1 \times Y_2 \times ... \times Y_n$.
A convergent synthesis makes two or more fragments separately, then couples them. Each branch can have many steps; what limits the overall yield is the longest linear sequence, not the total number of steps.
Math: a 10-step linear synthesis with 80% yield per step gives $0.80^{10} = 10.7\%$ overall yield. A convergent synthesis with two 5-step branches (each at 80% per step) plus one coupling step has 11 steps in total, but its longest linear sequence is only 6 steps. The branches run in parallel, so their yields are not multiplied together; measured from the starting material of either branch, the overall yield is $0.80^5 \times 0.80 \text{ (coupling)} = 0.80^6 \approx 26.2\%$, about 2.4× higher than linear.
This is why convergent synthesis is preferred for complex natural products and drugs. Industrial routes to complex targets are convergent wherever the structure allows.
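The yield arithmetic can be reproduced in a few lines. The sketch assumes the usual convention that a convergent route's overall yield is measured along its lowest-yielding branch and then through the coupling, since the branches are run in parallel rather than in series; `linear_yield` and `convergent_yield` are hypothetical helper names.

```python
def linear_yield(step_yields):
    """Overall yield of steps run in sequence: the product of the step yields."""
    overall = 1.0
    for y in step_yields:
        overall *= y
    return overall


def convergent_yield(branches, coupling_yield):
    """Yield along the lowest-yielding branch, then through the coupling step."""
    limiting_branch = min(linear_yield(branch) for branch in branches)
    return limiting_branch * coupling_yield


linear = linear_yield([0.80] * 10)
convergent = convergent_yield([[0.80] * 5, [0.80] * 5], coupling_yield=0.80)

print(f"10-step linear:          {linear:.1%}")      # 10.7%
print(f"5 + 5 + 1 convergent:    {convergent:.1%}")  # 26.2%
print(f"advantage of convergent: {convergent / linear:.1f}x")
```

Under this convention the 5 + 5 + 1 plan behaves like a 6-step linear sequence, which is where its advantage over the 10-step linear route comes from.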
Strategy for convergent synthesis
- Look at the target. Identify a "central" or "strategic" bond — preferably a C-C or C-N bond at a structurally important position.
- Disconnect that bond. The two halves are the convergent precursors.
- Plan each half's synthesis separately. Aim to make each half have similar complexity (so neither dominates the time/cost).
- Plan the coupling step (a Grignard, an aldol, an amide formation, a Pd-coupling, etc.).
- Verify: count total steps; check protecting groups; estimate yields.
If the convergent synthesis has 4 steps + 4 steps + 1 coupling = 9 steps, and the linear would have 12 steps, the convergent saves 3 steps and gives much higher overall yield.
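The step count in that comparison, and the longest linear sequence that actually controls the yield, can be tallied as follows. The helper names are hypothetical, and the 80% per-step figure is an assumption carried over from the earlier math paragraph, not a property of any particular route.

```python
def total_steps(branch_lengths, coupling_steps=1):
    """Every reaction that has to be run, across all branches."""
    return sum(branch_lengths) + coupling_steps


def longest_linear_sequence(branch_lengths, coupling_steps=1):
    """Steps any single starting material passes through on its way to the target."""
    return max(branch_lengths) + coupling_steps


branches = [4, 4]  # two 4-step fragments joined by one coupling
print(total_steps(branches))              # 9
print(longest_linear_sequence(branches))  # 5

# Assumed 80% per step: compare with a 12-step linear route.
per_step = 0.80
print(f"convergent: {per_step ** longest_linear_sequence(branches):.1%}")
print(f"linear:     {per_step ** 12:.1%}")
```

The convergent plan saves three reactions in the flask, but the larger gain comes from cutting the longest linear sequence from 12 steps to 5.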
31.9 Modern tools: computer-aided synthesis planning
In the 21st century, retrosynthesis has been partly automated:
- Synthia (formerly Chematica): commercial software, now offered by MilliporeSigma (Merck KGaA), that performs retrosynthesis using rules + machine learning.
- IBM RXN: predicts forward reactions and proposes retrosynthesis paths.
- AiZynthFinder: open-source retrosynthesis tool.
- Reaxys: database of known reactions; useful for finding precedents.
These tools complement (not replace) human retrosynthetic intuition. Still, they have made it possible for chemists to plan complex syntheses in minutes that would have taken days of manual work.
The 2020s have seen a renaissance of computer-aided synthesis, with AI-driven approaches showing strong performance. Even more sophisticated tools are coming.
31.10 Summary
- Retrosynthesis = systematic disconnection of target into precursors using known reactions in reverse.
- Strategic bond = reliable to form, simplifying. Carbonyl C-C and C-N bonds are typically strategic.
- Functional-group disconnections cookbook:
  - Ester ⇒ acid + alcohol.
  - Amide ⇒ acid + amine.
  - 2° alcohol ⇒ aldehyde + Grignard.
  - 3° alcohol ⇒ ketone + Grignard.
  - β-hydroxy carbonyl ⇒ aldol partners.
  - α,β-unsaturated carbonyl ⇒ aldol condensation partners.
  - β-keto ester ⇒ Claisen partners.
  - 1,5-dicarbonyl ⇒ Michael partners.
  - 6-ring enone ⇒ Robinson annulation.
  - Amine ⇒ reductive amination.
- Protecting groups: acetals (carbonyl), TBS/TBDPS (OH), Boc (NH), Bn (OH), Cbz (NH), methyl ester (COOH).
- Linear vs. convergent: convergent is preferred for complex syntheses; gives higher yield by avoiding sequential multiplications.
- Modern tools (Synthia, IBM RXN, AiZynthFinder) automate parts of retrosynthesis using AI.
- Worked examples: lidocaine (2 steps), ibuprofen (3 steps via BHC), atenolol (2 steps).
This concludes Part VI. Carbonyl chemistry is the foundation; you now have the synthetic toolkit to make most drug-sized molecules. Part VII turns to the bioorganic and special topics: carbohydrates, nucleic acids, lipids, proteins, and the medicinal chemistry of pharmaceutical synthesis.