Chapter 32 Key Takeaways: Digital Audio — Sampling, Quantization & the Nyquist Theorem

The Nyquist-Shannon Theorem

The Nyquist-Shannon theorem provides a mathematical guarantee. A continuous signal whose highest frequency component is f_max can be perfectly reconstructed from samples taken at any rate greater than 2 × f_max. This is not an approximation — it is a mathematical theorem that establishes exact equivalence between the bandlimited analog signal and its discrete sample sequence.
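
The reconstruction guarantee can be illustrated with Whittaker-Shannon (sinc) interpolation. The following is a minimal sketch, not a production resampler: the 8 kHz sample rate, the 1 kHz test tone, and the function names are illustrative assumptions, and because the sum is truncated to a finite record the result is only approximately exact.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, fs, t):
    """Whittaker-Shannon interpolation: estimate x(t) from samples taken at rate fs."""
    return sum(s * sinc(fs * t - n) for n, s in enumerate(samples))

fs = 8000.0   # sample rate in Hz (illustrative assumption)
f = 1000.0    # tone frequency, well below fs / 2
samples = [math.sin(2 * math.pi * f * n / fs) for n in range(2000)]

# Evaluate between two sample instants near the middle of the record,
# where truncating the (ideally infinite) sum matters least.
t = 1000.37 / fs
estimate = reconstruct(samples, fs, t)
exact = math.sin(2 * math.pi * f * t)
error = abs(estimate - exact)
print(f"estimate={estimate:.6f} exact={exact:.6f} error={error:.2e}")
```

The residual error comes from using a finite number of samples; it shrinks as the record grows, consistent with the theorem's claim for the ideal infinite sum.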

The Nyquist frequency is exactly half the sample rate. For CD audio at 44,100 Hz, the Nyquist frequency is 22,050 Hz. Any signal component below this frequency is perfectly captured; any component above this frequency must be removed before sampling or it will alias.

Aliasing is not just lost high-frequency information; it is phantom low-frequency content. A frequency component above the Nyquist limit does not simply disappear in the digital representation. It folds back into the audio band as a false frequency: f_alias = |f − n × f_s|, where f_s is the sample rate and n is the integer that places the result between 0 and f_s/2. This alias is audible, generally inharmonic (unrelated to the original signal), and indistinguishable from genuine audio at that frequency once it has been sampled.
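
The folding rule can be sketched in a few lines. The function name alias_frequency is hypothetical; it picks the multiple of the sample rate implicitly by taking the remainder and then reflecting around Nyquist.

```python
def alias_frequency(f, fs):
    """Frequency in [0, fs/2] at which a tone of frequency f appears after sampling at fs."""
    f_mod = f % fs                  # remove whole multiples of the sample rate
    return min(f_mod, fs - f_mod)   # fold the upper half back around Nyquist

fs = 44100
for f in (1000, 25000, 30000, 45100):
    print(f, "Hz ->", alias_frequency(f, fs), "Hz")
```

For CD-rate sampling, a 25,000 Hz tone lands at 19,100 Hz and a 45,100 Hz tone lands at 1,000 Hz, while a 1,000 Hz tone is unchanged.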

Quantization and Dynamic Range

Each bit of bit depth adds approximately 6 dB of dynamic range. The formula DR ≈ 6.02 × B + 1.76 dB gives the theoretical maximum signal-to-noise ratio for a full-scale sine wave quantized to B bits. 16-bit audio provides approximately 98 dB; 24-bit provides approximately 146 dB. For playback in real acoustic environments (ambient noise floors above −60 dBFS), 16-bit is sufficient; for recording and mixing (where headroom is valuable), 24-bit is preferred.
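
As a quick check of the arithmetic, the sketch below evaluates the rule of thumb alongside the closed form it rounds, 20 log10(2^B) + 10 log10(1.5), which assumes a full-scale sine and uniformly distributed quantization error. The function names are illustrative.

```python
import math

def dynamic_range_approx_db(bits):
    """Rule-of-thumb form: 6.02 dB per bit plus 1.76 dB."""
    return 6.02 * bits + 1.76

def dynamic_range_exact_db(bits):
    """Closed form for a full-scale sine with uniform quantization error."""
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)

for bits in (16, 24):
    print(bits, round(dynamic_range_approx_db(bits), 2),
          round(dynamic_range_exact_db(bits), 2))
```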

Quantization noise is most problematic at very low signal levels. At high signal levels (near full scale), quantization error is negligible relative to the signal amplitude. At very low levels, the signal uses only a few quantization levels per cycle, and the quantization error becomes correlated with the signal — producing audible granular distortion rather than white noise.

Dithering converts correlated distortion into uncorrelated noise, which is a beneficial trade. Adding random noise before quantization randomizes the quantization error, replacing correlated distortion with a constant white noise floor. The brain finds random noise much less objectionable than correlated distortion at the same power level. Noise-shaped dithering concentrates this noise at frequencies where hearing is least sensitive, achieving a perceived dynamic range of roughly 120 dB from 16-bit audio.
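
The effect is easiest to see in the limiting case of a tone smaller than half a quantization step. The sketch below is an illustration under stated assumptions: it uses triangular-PDF (TPDF) dither, which is one common choice and not necessarily the one the chapter has in mind, a step size of 1.0, and a 0.4-step sine. Without dither the tone rounds to silence; with dither it survives inside a noise floor.

```python
import math
import random

random.seed(0)  # fixed seed so the run is repeatable

def quantize(x):
    """Round to the nearest integer step (1 LSB = 1.0)."""
    return math.floor(x + 0.5)

def tpdf_dither():
    """Triangular-PDF dither spanning +/-1 LSB: the sum of two uniform values."""
    return random.uniform(-0.5, 0.5) + random.uniform(-0.5, 0.5)

N = 20000
amplitude = 0.4  # in LSB, below half a quantization step
signal = [amplitude * math.sin(2 * math.pi * n / 100) for n in range(N)]

plain = [quantize(x) for x in signal]                      # no dither
dithered = [quantize(x + tpdf_dither()) for x in signal]   # dither added first

def tone_amplitude(y):
    """Estimate how much of the original tone is present in y by projecting onto it."""
    return 2 / N * sum(v * math.sin(2 * math.pi * n / 100) for n, v in enumerate(y))

plain_level = tone_amplitude(plain)
dithered_level = tone_amplitude(dithered)
print("undithered tone level:", plain_level)
print("dithered tone level:  ", round(dithered_level, 3))
```

The undithered output contains no trace of the tone, while the dithered output recovers a level close to the original 0.4 steps, at the cost of added broadband noise.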

CD Standard and Format Choices

The CD standard was derived from physical requirements and historical accident. The 44,100 Hz sample rate emerged from video tape compatibility requirements. The 16-bit depth was chosen to exceed the dynamic range of analog tape while being feasible with late-1970s IC technology. These are engineering choices made under constraints, not theoretically optimal specifications.

High-resolution audio (96 kHz/24-bit) provides clear benefits in production but uncertain benefits for delivery. During recording and mixing, 24-bit provides valuable headroom and 96 kHz provides gentler anti-aliasing filter characteristics. For final delivery to listeners, the perceptual evidence for improvement over 16/44.1 is mixed — multiple controlled listening tests find no reliable difference for most listeners and most program material.

Anti-Aliasing and Reconstruction

Anti-aliasing filters must precede the ADC, not follow it. Once aliasing occurs, the alias cannot be distinguished from genuine audio content at the aliased frequency. Filtering after sampling cannot remove aliases. The filter must remove all content above the Nyquist frequency before any sampling occurs.
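
A short numerical check makes the point concrete. In this sketch (the 25 kHz tone and the variable names are illustrative assumptions), a cosine above Nyquist and a cosine at its alias frequency yield the same sample values at 44,100 Hz, so no processing applied after sampling can tell them apart.

```python
import math

fs = 44100               # sample rate in Hz
f_high = 25000           # tone above the 22,050 Hz Nyquist frequency
f_alias = fs - f_high    # 19,100 Hz, inside the audio band

n_samples = 64
high = [math.cos(2 * math.pi * f_high * n / fs) for n in range(n_samples)]
low = [math.cos(2 * math.pi * f_alias * n / fs) for n in range(n_samples)]

# Largest sample-by-sample difference; only floating-point rounding remains.
max_diff = max(abs(a - b) for a, b in zip(high, low))
print("largest difference between the two sample sequences:", max_diff)
```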

Oversampling with sigma-delta conversion is how modern converters avoid steep analog filters. By sampling at 64× to 512× the target rate, the Nyquist frequency moves far above human hearing, allowing a very gentle analog anti-aliasing filter. Digital filtering then reduces the sample rate to 44.1 or 48 kHz, with near-brick-wall characteristics that are impractical in analog but straightforward in digital.
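
To put rough numbers on "gentle", the sketch below measures the room an analog filter has between the top of the audio band and the Nyquist frequency, in octaves. The 20 kHz band edge is an assumption for illustration, and the helper name is hypothetical.

```python
import math

def transition_octaves(band_edge_hz, sample_rate_hz):
    """Octaves between the top of the audio band and the Nyquist frequency."""
    nyquist = sample_rate_hz / 2
    return math.log2(nyquist / band_edge_hz)

band_edge = 20_000   # assumed upper edge of the audio band, in Hz
base_rate = 44_100

direct = transition_octaves(band_edge, base_rate)             # sampling at 44.1 kHz
oversampled = transition_octaves(band_edge, 64 * base_rate)   # 64x oversampling

print(f"44.1 kHz:         {direct:.2f} octaves")
print(f"64 x 44.1 kHz:    {oversampled:.2f} octaves")
```

Sampling directly at 44.1 kHz leaves about a seventh of an octave for the filter to go from passband to full attenuation; 64× oversampling adds six further octaves, which is why a low-order analog filter suffices.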

Reconstruction filters must remove spectral images introduced by the staircase output of DACs. The ideal reconstruction filter is a brick-wall low-pass at Nyquist, whose impulse response is a sinc function; it introduces pre- and post-ringing around sharp transients (Gibbs phenomenon). Real filters approximate this ideal with various trade-offs between transition band steepness, phase distortion, and ringing behavior.

Theme Connections

Theme 2 (Technology as Mediator): The digital audio system mediates between the continuous physics of sound and the discrete mathematics of digital representation. The mediation is — under Nyquist conditions — mathematically perfect within the captured frequency band. But the choices of what frequency band to capture and at what precision are engineering decisions that shape what music can sound like and what music can be made.

Theme 1 (Reductionism vs. Emergence): The Nyquist theorem is a reductionist triumph: it says that a continuous signal is fully determined by a discrete sequence of numbers. But the vinyl renaissance case study reminds us that the musical experience is not reducible to the signal alone — the ritual, the physical object, the community context all contribute to what listening means, and these aspects of musical experience escape any purely physical measurement.