Chapter 4 — Functional Groups and Nomenclature: The Vocabulary of Molecules

"The vocabulary of organic chemistry is not a burden. It is the only thing that lets one chemist talk to another about anything at all." — paraphrased from countless professors


This chapter is a vocabulary chapter. It is short compared to the chapters around it, but its contents are non-negotiable: you cannot think about or discuss an organic molecule without being able to name its parts and name the molecule as a whole.

The chapter has two main jobs:

  1. Functional groups. The recurring structural motifs that organize organic molecules into families. A molecule's functional group largely determines its chemistry. Learning the twenty or so most common functional groups is among the highest-payoff investments of the whole course.
  2. IUPAC nomenclature. The systematic naming scheme that lets any organic chemist anywhere in the world unambiguously identify any molecule from its name.

We will go quickly. Work through the material, do the exercises, and proceed to Chapter 5.


4.1 Hydrocarbons

Organic molecules built from only carbon and hydrogen are called hydrocarbons. Four families:

Alkanes — saturated, only single bonds

Alkanes have the general formula $C_nH_{2n+2}$ for open chains. All carbons are $sp^3$. Named with the suffix -ane:

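| n | Name | Formula |
|---|------|---------|
| 1 | methane | $CH_4$ |
| 2 | ethane | $C_2H_6$ |
| 3 | propane | $C_3H_8$ |
| 4 | butane | $C_4H_{10}$ |
| 5 | pentane | $C_5H_{12}$ |
| 6 | hexane | $C_6H_{14}$ |
| 7 | heptane | $C_7H_{16}$ |
| 8 | octane | $C_8H_{18}$ |
| 9 | nonane | $C_9H_{20}$ |
| 10 | decane | $C_{10}H_{22}$ |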

Cyclic alkanes (cycloalkanes) have formula $C_nH_{2n}$ (two fewer hydrogens because the two end carbons are joined). Named with the prefix cyclo-: cyclopropane, cyclobutane, cyclopentane, cyclohexane (the most important — covered in detail in Chapter 5).
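The two general formulas can be checked mechanically. The following is a minimal Python sketch, not part of the chapter's own material: the function name `alkane` and the `STEMS` list are illustrative choices, and only the ten stems tabulated above are covered.

```python
# Stems of the first ten unbranched alkanes, as tabulated above.
STEMS = ["meth", "eth", "prop", "but", "pent", "hex", "hept", "oct", "non", "dec"]


def alkane(n: int, cyclic: bool = False) -> tuple[str, str]:
    """Return (name, molecular formula) for an unbranched alkane or a cycloalkane."""
    if not 1 <= n <= len(STEMS):
        raise ValueError("only n = 1..10 are tabulated in this sketch")
    if cyclic and n < 3:
        raise ValueError("a ring needs at least three carbons")
    # Open chain: C_n H_(2n+2). Ring: C_n H_(2n), two hydrogens fewer.
    hydrogens = 2 * n if cyclic else 2 * n + 2
    name = ("cyclo" if cyclic else "") + STEMS[n - 1] + "ane"
    carbons = "C" if n == 1 else f"C{n}"
    return name, f"{carbons}H{hydrogens}"


print(alkane(4))               # ('butane', 'C4H10')
print(alkane(6, cyclic=True))  # ('cyclohexane', 'C6H12')
```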

Alkenes — one or more C=C double bonds

Named with the suffix -ene, with a locant indicating where the double bond starts:

  • ethene ($CH_2=CH_2$, common name ethylene)
  • propene ($CH_3CH=CH_2$)
  • 1-butene ($CH_2=CHCH_2CH_3$)
  • 2-butene ($CH_3CH=CHCH_3$, which can be cis or trans, aka Z or E)

Alkynes — C≡C triple bonds

Named with the suffix -yne:

  • ethyne ($HC \equiv CH$, common name acetylene)
  • propyne ($CH_3C \equiv CH$)
  • 1-butyne, 2-butyne

Aromatic hydrocarbons (arenes)

Compounds containing one or more benzene rings. Full treatment in Chapter 20. Simple examples: benzene itself, toluene (methylbenzene), xylene (dimethylbenzene), naphthalene (two fused rings).


4.2 Oxygen-containing functional groups

Alcohols (R-OH)

A hydroxyl group attached to an $sp^3$ carbon. Suffix -ol or prefix hydroxy-. Typical $pK_a \approx 16$.

  • methanol, $CH_3OH$
  • ethanol, $CH_3CH_2OH$
  • 2-propanol (isopropyl alcohol), $(CH_3)_2CHOH$
  • cyclohexanol

Classified as primary (1°, one carbon attached to the C-OH), secondary (2°, two carbons attached), or tertiary (3°, three carbons attached). The classification matters for reactivity in substitution, elimination, and oxidation.

Phenols (Ar-OH)

A hydroxyl attached to an aromatic ring. Much more acidic than alcohols ($pK_a \approx 10$) because the phenoxide anion is stabilized by delocalization into the ring (Chapter 22).

  • phenol, $C_6H_5OH$
  • p-cresol (4-methylphenol)
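To put the two acidities side by side: a gap of six $pK_a$ units corresponds to a factor of a million in $K_a$. Using the approximate values quoted above (16 for a typical alcohol, 10 for phenol):

$$\frac{K_a(\text{phenol})}{K_a(\text{alcohol})} = \frac{10^{-10}}{10^{-16}} = 10^{6}$$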

Ethers (R-O-R')

Two carbons bonded to the same oxygen. No unique suffix in IUPAC; named as alkoxy- substituents (or by the older "R-O-R' ether" nomenclature).

  • dimethyl ether, $CH_3OCH_3$ (IUPAC: methoxymethane)
  • diethyl ether, $CH_3CH_2OCH_2CH_3$ (common lab solvent)
  • tetrahydrofuran (THF), a cyclic ether

Ethers are relatively inert and make excellent solvents for many organic reactions.

Carbonyl compounds — the $C=O$ family

The $C=O$ group (carbonyl) appears in many functional groups, distinguished by what else is attached.

Figure 4.1 — The major carbonyl-containing functional groups. Each is characterized by the atoms attached to the carbonyl carbon: R,H for aldehyde; R,R for ketone; R,OH for carboxylic acid; R,OR for ester; R,NR₂ for amide; R,X for acid halide; R,O-C(=O)R for anhydride. All react via the same mechanistic families (nucleophilic addition, nucleophilic acyl substitution) covered in Part VI.

| Family | Structure | Suffix | Example | Notes |
|--------|-----------|--------|---------|-------|
| Aldehyde | R-CHO | -al | ethanal (acetaldehyde) | $C=O$ with at least one H |
| Ketone | R-CO-R' | -one | propan-2-one (acetone) | $C=O$ with two C neighbors |
| Carboxylic acid | R-COOH | -oic acid | ethanoic acid (acetic acid) | $pK_a \approx 5$ |
| Ester | R-COOR' | -oate | ethyl ethanoate (ethyl acetate) | fruity smell |
| Amide | R-CO-NR'₂ | -amide | ethanamide (acetamide) | N can be NH₂, NHR, NR₂ |
| Acid halide | R-COX | -oyl halide | ethanoyl chloride (acetyl chloride) | highly reactive |
| Anhydride | (R-CO)₂O | -oic anhydride | ethanoic anhydride | two carbonyls sharing O |
| Nitrile | R-C≡N | -nitrile | ethanenitrile (acetonitrile) | technically $C \equiv N$, not $C=O$, but similar electrophilic chemistry |

Carbonyl chemistry is the heart of organic reactivity (see Part VI). Of the ~80 reactions in your Synthesis Toolkit by the end of this book, perhaps half involve a carbonyl.


4.3 Nitrogen-containing functional groups

Amines (R-NH₂, R₂NH, R₃N)

Nitrogen with one, two, or three alkyl/aryl groups attached; classified as primary, secondary, tertiary (different from alcohol classification!). A fourth category — quaternary ammonium ($R_4N^+$) — has four groups and a permanent positive charge.
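The difference between the two classification schemes is easy to get wrong, so here is a small Python sketch of it. The function names are illustrative, not taken from the chapter. The point is only which atom's carbon neighbors are counted: the carbon bearing the OH for alcohols, the nitrogen itself for amines.

```python
CLASSES = {1: "primary", 2: "secondary", 3: "tertiary"}


def classify_alcohol(carbons_on_oh_carbon: int) -> str:
    """Classify by the number of carbons bonded to the carbon that bears the OH."""
    if carbons_on_oh_carbon not in CLASSES:
        raise ValueError("expected 1, 2, or 3 carbon neighbors")
    return CLASSES[carbons_on_oh_carbon]


def classify_amine(carbons_on_nitrogen: int) -> str:
    """Classify by the number of carbons bonded to the nitrogen itself."""
    if carbons_on_nitrogen == 4:
        return "quaternary ammonium"
    if carbons_on_nitrogen not in CLASSES:
        raise ValueError("expected 1 to 4 carbon neighbors")
    return CLASSES[carbons_on_nitrogen]


# 2-propanol, (CH3)2CHOH: the C-OH carbon has two carbon neighbors.
print(classify_alcohol(2))  # secondary
# Its nitrogen analog, (CH3)2CHNH2: the nitrogen has only one carbon neighbor.
print(classify_amine(1))    # primary
```

The same carbon skeleton therefore gives a secondary alcohol but a primary amine.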

Basicity: simple alkyl amines have $pK_{aH}$ around 9–11. The value drops when the nitrogen lone pair is less available: to about 5 for pyridine, where the lone pair sits in an $sp^2$ orbital, and to 4.6 for aniline, where it is delocalized into the ring.

Nitriles (R-C≡N)

Already listed with the carbonyl family in Section 4.2 because of their similar electrophilic chemistry. Note that nitriles hydrolyze to carboxylic acids under acidic or basic conditions.

Nitro groups (R-NO₂)

$-NO_2$, drawn as $-N^{+}(=O)O^{-}$ with formal charges (see Chapter 2). Strongly electron-withdrawing. Nitrobenzene derivatives are precursors to many drugs. Despite the negative formal charge on oxygen, nitro groups are generally not good leaving groups.

Imines (R₂C=NR')

Formed from a carbonyl compound and a primary amine. Imines show up repeatedly in Chapter 25 and in biosynthesis.


4.4 Halides

Alkyl halides (R-X where X = F, Cl, Br, I) have a halogen attached to an $sp^3$ carbon. They are the subject of Chapters 10–14.

Vinyl halides have the halogen on an $sp^2$ carbon of an alkene. They do not undergo $S_N$ reactions the way alkyl halides do, for reasons explained in Chapter 15.

Aryl halides have the halogen on an aromatic ring. They undergo nucleophilic aromatic substitution only under special conditions (Chapter 23) but are the starting materials for palladium-catalyzed cross-couplings (Chapter 37).

Alkyl halides themselves are classified as primary/secondary/tertiary based on the carbon bearing the halogen.


4.5 Sulfur-containing functional groups

Thiols (R-SH) — the sulfur analog of alcohols. More acidic ($pK_a \approx 10$) and more nucleophilic than alcohols, because S is larger and more polarizable.

Sulfides (R-S-R') — sulfur analog of ethers.

Sulfoxides (R₂S=O) and sulfones (R₂SO₂) — oxidized forms of sulfides.

Sulfonic acids (R-SO₃H) — analogous to carboxylic acids but much more acidic ($pK_a \approx 0$). Sulfonate esters (tosylates, mesylates, triflates) are excellent leaving groups in substitution reactions.


4.6 IUPAC nomenclature — the systematic naming scheme

The International Union of Pure and Applied Chemistry (IUPAC) maintains an official set of rules for naming organic molecules. The goal: any chemist should be able to draw a structure from an IUPAC name, and vice versa, without ambiguity.

The rules are extensive. We present the essentials for naming alkanes, alkenes, alkynes, alcohols, and simple derivatives. Appendix H is a complete nomenclature reference for when things get complicated.

The four-step procedure

Step 1. Find the parent chain. Identify the longest continuous carbon chain that contains the principal functional group (if any). For alkanes, this is simply the longest carbon chain. The parent chain name determines the base of the compound's name.

Step 2. Number the parent chain. Number the carbons of the parent chain to give the lowest possible locants to:

  • The principal functional group (highest priority; it determines the suffix).
  • Multiple bonds (-ene, -yne), if present.
  • Substituents.

If the principal functional group is at one end, start numbering from that end. If there is a choice, start from the end that gives the overall lowest set of locants.

Step 3. Identify substituents. Every branch or group off the parent chain is a substituent. Name each one, including its locant (the number of the carbon it's attached to).

Step 4. Assemble the name. Format is generally:

<substituents in alphabetical order> <parent chain> <suffix>

Locants are placed in front of the group they describe and are separated from it by hyphens.
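Step 4 is mechanical enough to sketch in code. The Python function below is an illustration under simplifying assumptions, not a full IUPAC implementation: its name and arguments are hypothetical, the caller must already have done Steps 1 to 3, and it handles only simple substituent names with multipliers up to tetra-.

```python
from collections import defaultdict

MULTIPLIERS = {1: "", 2: "di", 3: "tri", 4: "tetra"}


def assemble_name(substituents, parent_stem, suffix, suffix_locant=None):
    """Assemble a name from (locant, substituent) pairs, a parent stem, and a suffix.

    parent_stem is the chain name without its ending, e.g. "butan".
    suffix is "e" for a plain alkane, "ol", "oic acid", and so on.
    """
    # Collect the locants of identical substituents so they share one prefix.
    grouped = defaultdict(list)
    for locant, name in substituents:
        grouped[name].append(locant)

    # Alphabetical order by substituent name; multipliers (di, tri) are ignored.
    prefixes = []
    for name in sorted(grouped):
        locants = sorted(grouped[name])
        locant_text = ",".join(str(loc) for loc in locants)
        prefixes.append(f"{locant_text}-{MULTIPLIERS[len(locants)]}{name}")

    if suffix_locant is None:
        core = parent_stem + suffix
    else:
        core = f"{parent_stem}-{suffix_locant}-{suffix}"
    return "-".join(prefixes) + core


print(assemble_name([], "butan", "ol", 2))                     # butan-2-ol
print(assemble_name([(3, "hydroxy")], "butan", "oic acid"))    # 3-hydroxybutanoic acid
print(assemble_name([(2, "methyl"), (4, "ethyl")], "hexan", "e"))  # 4-ethyl-2-methylhexane
```

In the last example, ethyl is cited before methyl because substituents are ordered alphabetically and not by locant.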

The suffix-priority table

When a molecule has multiple functional groups, only the highest-priority group determines the suffix; the rest become substituents (prefixes). Priority order (high to low):

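| Rank | Functional group | Suffix (when principal) | Prefix (when not principal) |
|------|------------------|-------------------------|-----------------------------|
| 1 | Carboxylic acid | -oic acid | carboxy- |
| 2 | Ester | -oate | alkoxycarbonyl- |
| 3 | Amide | -amide | carbamoyl- |
| 4 | Nitrile | -nitrile | cyano- |
| 5 | Aldehyde | -al | oxo- or formyl- |
| 6 | Ketone | -one | oxo- |
| 7 | Alcohol | -ol | hydroxy- |
| 8 | Amine | -amine | amino- |
| 9 | Alkene | -ene | (always in parent) |
| 10 | Alkyne | -yne | (always in parent) |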

Halogens and alkyl substituents are always prefixes and never suffixes. They do not compete for principal-group status.
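The table lookup can be written as a short Python sketch. The function name `split_groups` and the data layout are illustrative assumptions. Only ranks 1 to 8 are included, since alkenes and alkynes stay in the parent, and halogens and alkyl groups never compete.

```python
# (group, suffix when principal, prefix when not), highest priority first,
# copied from ranks 1-8 of the table above.
PRIORITY = [
    ("carboxylic acid", "-oic acid", "carboxy-"),
    ("ester", "-oate", "alkoxycarbonyl-"),
    ("amide", "-amide", "carbamoyl-"),
    ("nitrile", "-nitrile", "cyano-"),
    ("aldehyde", "-al", "oxo- or formyl-"),
    ("ketone", "-one", "oxo-"),
    ("alcohol", "-ol", "hydroxy-"),
    ("amine", "-amine", "amino-"),
]
RANK = {name: index for index, (name, _, _) in enumerate(PRIORITY)}


def split_groups(groups):
    """Return (principal group, its suffix, {other group: its prefix})."""
    # Groups missing from the table (halogens, alkyl groups) are skipped here:
    # they are always named as prefixes and never compete for the suffix.
    ranked = [g for g in groups if g in RANK]
    if not ranked:
        return None, None, {}
    principal = min(ranked, key=RANK.__getitem__)
    suffix = PRIORITY[RANK[principal]][1]
    prefixes = {g: PRIORITY[RANK[g]][2] for g in ranked if g != principal}
    return principal, suffix, prefixes


print(split_groups(["alcohol", "carboxylic acid"]))
# ('carboxylic acid', '-oic acid', {'alcohol': 'hydroxy-'})
```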

Worked examples

Worked Problem 4.1 — Naming a simple alcohol

Structure: $CH_3CH_2CH(OH)CH_3$

Step 1: Longest chain is 4 carbons (butane parent). It contains an alcohol.

Step 2: Number from the end nearest the OH. Numbering the structure as written from the left puts the OH on C3; numbering from the right puts it on C2. The direction matters here, and the lower locant wins: 2.

Step 3: No other substituents.

Step 4: butan + ol with locant 2.

Name: butan-2-ol (or 2-butanol — both accepted)
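The choice of numbering direction in Step 2 can be sketched in a few lines of Python. This is an illustration with a hypothetical function name. It compares one set of locants counted from each end and keeps the lower set, and it does not model the precedence of the principal group over multiple bonds and substituents.

```python
def best_numbering(chain_length, positions):
    """Return the lower of the two locant sets obtained by numbering from either end.

    positions are counted from the left end of the chain as drawn.
    """
    forward = sorted(positions)
    backward = sorted(chain_length + 1 - p for p in positions)
    # Python compares lists element by element, so min() picks the set that is
    # lower at the first point of difference.
    return min(forward, backward)


# CH3CH2CH(OH)CH3 as written: the OH is on the third carbon from the left.
print(best_numbering(4, [3]))     # [2]
# A six-carbon chain with groups on carbons 3 and 5 counted from the left.
print(best_numbering(6, [3, 5]))  # [2, 4]
```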

Worked Problem 4.2 — Naming a compound with multiple groups

Structure: $CH_3CH(OH)CH_2COOH$ (3-hydroxybutanoic acid)

Step 1: Longest chain containing the highest-priority group (carboxylic acid) is 4 carbons. Parent: butanoic acid.

Step 2: Number from the carboxylic-acid end. C1 is the COOH carbon.

Step 3: The hydroxyl is on C3, and it is a substituent (since COOH is principal). Written as 3-hydroxy-.

Step 4: Combine.

Name: 3-hydroxybutanoic acid

(This compound, β-hydroxybutyrate, is a real metabolite and a "ketone body" produced during fasting.)

Worked Problem 4.3 — Aspirin, with a mouthful of IUPAC

Structure: acetylsalicylic acid (Figure 1.1).

The molecule has a benzene ring, a carboxylic acid, and an ester (two oxygens, with an acetyl group).

Priority: carboxylic acid > ester. The carboxylic acid is the suffix.

The parent is benzoic acid: a benzene ring bearing the COOH, whose ring carbon is C1. The remaining substituent sits on the adjacent ring carbon, C2.

The acetoxy group (-OCOCH₃) is the ester in prefix form.

IUPAC name: 2-(acetyloxy)benzoic acid (or 2-acetoxybenzoic acid)

In practice, everyone calls it aspirin or acetylsalicylic acid. IUPAC names are precise; common names are convenient.

Stereochemistry in names

For alkenes, the geometry of the double bond is indicated by $E$ (entgegen, "opposite") or $Z$ (zusammen, "together"), using Cahn-Ingold-Prelog priority (Chapter 7):

  • (E)-2-butene: higher-priority groups on opposite sides of the double bond.
  • (Z)-2-butene: higher-priority groups on the same side.

For stereocenters, the descriptor $R$ or $S$ is written in parentheses at the front of the name, together with the stereocenter's locant when the name needs one:

  • (R)-2-butanol
  • (S)-ibuprofen

Chapter 7 covers the priority rules. For now, just know that these letters appear in real names.

Common names — the living exceptions

Many organic compounds have common names that are widely used in addition to (and sometimes in preference to) the IUPAC names. You should know a few:

  • methanol (IUPAC): also methyl alcohol, wood alcohol
  • ethanol (IUPAC): also ethyl alcohol, grain alcohol
  • acetone: propan-2-one (2-propanone)
  • acetic acid: ethanoic acid
  • aspirin (acetylsalicylic acid): 2-(acetyloxy)benzoic acid
  • glycerol: propane-1,2,3-triol
  • urea: carbamide
  • glycine: 2-aminoacetic acid
  • benzene: irreducibly common; no systematic alternative is in general use

Chemists use the convenient name when one exists and reach for the IUPAC name when the name must be unambiguous (a patent application, a regulatory document, a scientific publication).


4.7 Putting it all together — applying to the anchor examples

Aspirin

Structure: benzene with a carboxylic acid at C1 and an acetoxy ester at C2.

  • Functional groups: carboxylic acid, ester, aromatic ring.
  • Principal group: carboxylic acid (priority 1).
  • IUPAC name: 2-(acetyloxy)benzoic acid.
  • Common name: acetylsalicylic acid. Parent acid: salicylic acid.

Ibuprofen

Structure: a benzene ring bearing an isobutyl (2-methylpropyl) group and, para to it, a propanoic acid chain attached through its C2.

  • Functional groups: carboxylic acid, aromatic ring, alkyl substituents.
  • Principal group: carboxylic acid.
  • IUPAC name: (2RS)-2-[4-(2-methylpropyl)phenyl]propanoic acid.
  • The molecule has one chiral carbon (C2 of the propanoic acid, the $\alpha$-carbon that carries the methyl group). Ibuprofen is marketed as a racemate, though only the $(S)$ enantiomer is pharmacologically active.

Acetaminophen (paracetamol)

Structure: benzene with a hydroxyl at C1 and an acetamido group (NHCOCH₃) at C4.

  • Functional groups: phenol, amide (as substituent), aromatic ring.
  • Principal group: amide (priority 3), which outranks the phenol OH. The parent is therefore acetamide, the ring is cited as a substituent on nitrogen (hence the N- locant), and the OH becomes a hydroxy- prefix.
  • IUPAC name: N-(4-hydroxyphenyl)acetamide (also widely written as 4-acetamidophenol, which names the ring as a phenol instead).
  • Common name: paracetamol (international) or acetaminophen (US/Canada).

Thalidomide

Structure: phthalimide (the benzene-fused imide) whose nitrogen is bonded directly to C3 of a glutarimide (2,6-dioxopiperidine) ring.

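  • Functional groups: two imides (each a nitrogen flanked by two carbonyls, that is, a doubly acylated amine), one chiral carbon, aromatic ring.
  • Principal group: the imide carbonyls of the fused ring system, expressed as the -dione suffix on isoindole; the glutarimide ring is cited as a dioxopiperidinyl substituent.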
  • IUPAC name: 2-(2,6-dioxopiperidin-3-yl)-1H-isoindole-1,3(2H)-dione.
  • Common name: thalidomide. (The IUPAC name is a mouthful — an instance where almost nobody actually uses the formal name.)

Between them, the anchor molecules of this book carry many of the functional groups you will meet in subsequent chapters. Every time one of them appears, you will be doing nomenclature practice implicitly.


4.8 Summary

  1. Functional groups are the structural motifs that organize organic molecules. Learn the ~20 common ones. The functional group largely determines the molecule's chemistry.
  2. IUPAC nomenclature provides unambiguous naming. Four-step procedure: find the parent chain, number it, identify substituents, assemble the name.
  3. Suffix priority picks which functional group gets to be the principal one in the name. Carboxylic acids > esters > amides > nitriles > aldehydes > ketones > alcohols > amines > alkenes > alkynes.
  4. Common names persist for many everyday compounds. Use them when they are unambiguous in context; use IUPAC when precision is required.
  5. Stereochemistry descriptors ($E$, $Z$, $R$, $S$) are part of the name for chiral or geometrically isomeric molecules. We will cover how to assign them in Chapter 7.

Chapter 5 returns to energetics — alkane conformations, thermodynamics, and kinetics — and then Chapter 6 introduces spectroscopy. Chapter 4's vocabulary will be used constantly.