Part VII: Recording, Technology & Signal Processing
The Moment Everything Changed
For most of human history, music was ephemeral. It existed only in the moment of its making — in the breath of a singer, the bow on a string, the finger on a drum skin — and then it was gone. Even the most carefully notated score was not the music; it was a set of instructions for recreating something that would have to be experienced live, in the air, between people. When the performance ended, it ended absolutely.
Thomas Edison changed that on December 6, 1877, when he spoke "Mary Had a Little Lamb" into a tinfoil cylinder and played it back. The implications of that moment are still unfolding. Recording did not merely preserve music the way a photograph preserves a face. It transformed music's relationship to time, to authorship, to the human body, to commerce, to culture, and to physics itself. A recording is not a capture of music. It is a translation — from pressure waves in air, to electrical signals, to magnetic domains, to binary code, to speaker membrane, to air again. At every stage of that translation, something is gained and something is changed.
Part VII is the story of those translations. Every chapter covers a different technology that mediates between the physics of sound and the human experience of music — and every mediation has a physics, a history, and a set of irreversible consequences.
Guiding Question for Part VII: "When technology mediates between physics and experience, what is gained and what is irretrievably lost?"
Theme 4 at Full Power
This is Theme 4's home territory. Technology as Mediator has appeared throughout the textbook — the piano's action mediates between finger pressure and string vibration, equal temperament mediates between mathematical purity and keyboard convenience, the concert hall mediates between the physics of sound radiation and the experience of the audience. But in Part VII, the mediation becomes the subject itself.
We will find that technological mediation is never neutral. A microphone does not capture sound; it transduces it, with a frequency response that emphasizes some ranges and attenuates others. A digital converter does not sample sound; it quantizes it, imposing a grid of discrete values on a continuous signal. A compression algorithm does not preserve music; it identifies what the human ear is least likely to notice and discards that first. Each of these operations has a physical description, and each has aesthetic and cultural consequences that extend far beyond the physics.
Theme 3 (Constraint and Creativity) also runs strongly through Part VII. The 3-minute single emerged from the physical limitations of early shellac 78 rpm discs. The loudness wars of the 1990s emerged from the constraint of limited dynamic range in broadcasting. The lo-fi aesthetic that became a defining sound of certain indie genres emerged directly from the constraints of cheap cassette recordings. In music, constraints do not only limit; they generate.
What Each Chapter Contributes
Chapter 31: The Physics of Recording begins with the transduction chain: how pressure waves in air are converted to electrical signals by a microphone, how those signals are amplified without distortion (or are deliberately distorted for artistic effect), and how the signal is eventually encoded onto a physical medium. We examine the physics of different microphone types (dynamic, condenser, ribbon), the mathematics of signal-to-noise ratio and dynamic range, and the physical basis of magnetic recording. Aiko Tanaka appears prominently here: as part of her research on spectral features of musical performance, she records a chamber ensemble with multiple microphone configurations and analyzes how the choice of microphone placement changes the captured frequency content — not just the timbre, but the spatial information. She discovers, with some frustration, that the "same" performance sounds like genuinely different physics depending on where you put the microphone.
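The relationship between bit depth and dynamic range mentioned above has a standard closed form: an ideal N-bit quantizer driven by a full-scale sine wave has a signal-to-noise ratio of about 6.02N + 1.76 dB. A minimal sketch (the function name is ours, not the textbook's):

```python
import math

def quantization_snr_db(bits: int) -> float:
    """Theoretical SNR of an ideal N-bit quantizer for a full-scale
    sine wave. Derivation: signal power / quantization-noise power
    = 1.5 * 2**(2N), which in decibels is about 6.02*N + 1.76 dB."""
    return 10 * math.log10(1.5 * 2 ** (2 * bits))

# 16-bit (CD) audio: ~98 dB; 24-bit studio masters: ~146 dB.
print(round(quantization_snr_db(16), 1))
print(round(quantization_snr_db(24), 1))
```

This is the idealized upper bound; real converters fall short of it, and Chapter 31's treatment of noise in the full transduction chain explains why.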
Chapter 32: Digital Audio — Sampling, Quantization, and the Nyquist Theorem is one of the most mathematically rich chapters in Part VII. We develop the Nyquist-Shannon sampling theorem from first principles: to reconstruct a signal whose highest frequency is f, you must sample at a rate greater than 2f. This is not an engineering convention; it is a mathematical theorem with a proof, and understanding it changes how you hear the difference between analog and digital audio. We examine quantization — the conversion of continuous amplitude to discrete values — and its primary artifact: quantization noise, which has a specific spectral signature. The chapter includes a detailed treatment of dither (deliberately added noise that paradoxically reduces the perceptual impact of quantization) and ends with the question of whether human hearing can actually detect differences above CD quality — a question that turns out to be more complicated than the audio industry would prefer.
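The consequence of violating the sampling theorem is aliasing: a component above half the sampling rate folds back into the audible band at a predictable frequency. A small illustration (the function is ours, for demonstration only):

```python
def alias_frequency(f_signal: float, f_sample: float) -> float:
    """Apparent frequency of a pure tone after sampling.
    Sampling folds the spectrum around multiples of f_sample;
    the tone reappears at its image inside [0, f_sample / 2]."""
    f = f_signal % f_sample          # fold into one sampling period
    return min(f, f_sample - f)     # reflect into the Nyquist band

# A 30 kHz tone sampled at CD rate (44.1 kHz) is heard at 14.1 kHz:
print(alias_frequency(30000, 44100))
# A 1 kHz tone is safely below Nyquist and is unchanged:
print(alias_frequency(1000, 44100))
```

This is why converters place an anti-aliasing filter before the sampler: once a component has folded, no later processing can tell it apart from a genuine tone at the aliased frequency.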
Chapter 33: Audio Compression — What Gets Thrown Away and Why confronts the most consequential and least understood technology in modern music. We examine psychoacoustic masking (the physical basis of why some sounds hide others), critical bands in the basilar membrane, and how MP3 and AAC codecs use these phenomena to selectively discard audio information. This chapter features the textbook's most direct collision between physics and human consequence: Aiko Tanaka discovers that several of her archival recordings of rare performance traditions, stored in lossy formats, have had their most analytically significant spectral features compressed away. The information she needs for her research has been discarded by an algorithm that correctly determined most listeners would not notice. This is not a failure of the compression algorithm. It is a fundamental consequence of building technology around average perceptual requirements.
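The core move of a perceptual codec can be caricatured in a few lines: rank spectral components by level and discard those far enough below the loudest to be masked. The sketch below uses a single global threshold; real codecs like MP3 and AAC compute frequency-dependent masking curves per critical band, so treat this as a deliberate oversimplification:

```python
def toy_lossy_keep(spectrum, threshold_db=-40.0):
    """Toy model of perceptual coding's central operation:
    zero out spectral components more than threshold_db below
    the strongest component, on the theory that loud components
    mask quiet ones. spectrum: list of complex DFT bins."""
    peak = max(abs(c) for c in spectrum)
    cutoff = peak * 10 ** (threshold_db / 20)  # dB -> linear ratio
    return [c if abs(c) >= cutoff else 0j for c in spectrum]

# A bin 60 dB below the peak is discarded; one 6 dB below survives:
print(toy_lossy_keep([1 + 0j, 0.001 + 0j, 0.5j]))
```

Aiko Tanaka's problem in this chapter follows directly from this logic: the components her analysis needed were exactly the ones an average-listener masking model judged safe to throw away.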
Chapter 34: Room Acoustics and Sound Design returns to the physical environment and asks: what happens after the signal leaves the speaker? Room acoustics is the physics of how sound interacts with architectural space — the geometry of early reflections, the exponential decay of reverberation, the comb-filtering that occurs when direct and reflected sound arrive at slightly different times. We cover the Sabine equation for reverberation time, the physics of acoustic diffusers and absorbers, and the engineering of spaces designed for specific sonic purposes. We then examine the inverse problem: how do recording engineers and producers use artificial reverberation to place recorded sounds in fictional acoustic spaces that may never have existed? The convolution reverb, which applies the impulse response of a real space to a recorded signal, is examined in detail — it is a physically precise technology with deeply fictional applications.
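Both calculations named above are compact enough to sketch. The Sabine equation gives reverberation time as RT60 = 0.161 V / A (V in cubic meters, A the total absorption in metric sabins), and convolution reverb is literally the convolution of a dry signal with a room's impulse response. Function names and the example numbers are ours:

```python
def sabine_rt60(volume_m3, surfaces):
    """Sabine reverberation time: RT60 = 0.161 * V / A, where
    A = sum of (surface area * absorption coefficient).
    surfaces: iterable of (area_m2, absorption_coefficient)."""
    absorption = sum(area * alpha for area, alpha in surfaces)
    return 0.161 * volume_m3 / absorption

def convolve(x, h):
    """Direct convolution, the core of convolution reverb: each
    input sample launches a scaled copy of the room's impulse
    response h, and the output is the sum of all the copies."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

# A hypothetical 10,000 m^3 hall with 1,000 sabins of absorption:
print(round(sabine_rt60(10000, [(2000, 0.5)]), 2))  # RT60 in seconds
```

Production convolution reverbs do the same mathematics via FFT for speed, but the physical meaning is identical: the fictional space is entirely encoded in h.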
Chapter 35: Spatial Audio — Stereo, Surround, and Binaural Sound examines how technology encodes and reproduces spatial information. The physics of human sound localization — interaural time differences, interaural level differences, and the head-related transfer function (HRTF) — determines what spatial information the auditory system uses. Stereo reproduces a simplified subset of this information; binaural recording attempts to capture it more completely; Dolby Atmos and other immersive formats try to reconstruct full three-dimensional acoustic scenes from speaker arrays. We examine the physics of each approach and its limitations: what no current technology can fully reproduce, and why. The Choir and the Particle Accelerator appears here in a striking parallel: particle detectors, like binaural microphones, must reconstruct the three-dimensional trajectory of an event from a set of point measurements — and the reconstruction algorithms face structurally similar trade-offs.
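The interaural time difference mentioned above has a classic approximation, Woodworth's spherical-head model: ITD = (a/c)(θ + sin θ), where a is head radius, c the speed of sound, and θ the source azimuth. A sketch using typical textbook values for a and c (not measurements from any particular listener):

```python
import math

def itd_seconds(azimuth_deg, head_radius_m=0.0875, c=343.0):
    """Woodworth's spherical-head approximation of interaural
    time difference: ITD = (a / c) * (theta + sin(theta)).
    azimuth_deg is the source angle from straight ahead;
    default head radius and speed of sound are nominal values."""
    theta = math.radians(azimuth_deg)
    return (head_radius_m / c) * (theta + math.sin(theta))

# Straight ahead: no time difference. Fully to one side (90 deg):
# roughly 0.66 ms, the maximum ITD the auditory system works with.
print(itd_seconds(0))
print(round(itd_seconds(90) * 1e6))  # microseconds
```

That sub-millisecond number is the scale on which stereo panning, binaural rendering, and HRTF processing all operate, which is why timing errors that seem negligible elsewhere in audio are fatal to spatial reproduction.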
⚠️ The Mediation Is Never Transparent
A persistent intuition about recording technology is that "better" technology means "more transparent" technology — that the ideal microphone, the ideal codec, the ideal speaker system would become invisible, delivering a perfect window onto the original sound. Part VII argues that this intuition, while partially useful, is physically impossible. Every transduction introduces transformation. The question is not how to eliminate the mediation but how to understand and control it.
📊 The Spotify Spectral Dataset in Part VII
The streaming era's sonic characteristics — the loudness normalization introduced by Spotify and Apple Music, the shift toward mix styles that survive earphone listening, the premium placed on the first 30 seconds (the threshold for streaming royalty payment) — are visible in the Spotify Spectral Dataset as systematic shifts in audio features over time. Part VII provides the physical tools to read those shifts precisely: they are not aesthetic mysteries but measurable consequences of technology reshaping creative constraint.
The Question of Fidelity
Underlying every chapter in Part VII is a question that turns out to be harder than it looks: fidelity to what? The intuitive answer — fidelity to the original sound — immediately runs into the problem that the "original sound" is itself a physical event that existed in a specific space, heard by a specific set of ears, at a specific moment that is now gone. Recording does not capture that event; it captures a transduced representation of one aspect of that event, heard by specific microphones, in specific positions, processed by specific electronics.
High-fidelity audio is more accurately described as high-fidelity to a prior stage of the mediation chain. The 192 kHz/24-bit studio master is "more faithful" than the MP3 only in the sense that fewer transformations have been applied — but it is still a transduction, still a construction, still a mediation. Understanding this does not make high-resolution audio worthless. It makes the concept of "fidelity" more honest and more interesting.
The chapters ahead will make you hear technology differently. Not with suspicion — the technologies in Part VII are among the most sophisticated and beneficial physical instruments humans have ever built — but with awareness. The mediation is always there. The question is what it costs and what it gives.
Part VII begins with Chapter 31: The Physics of Recording.