Chapter 31 Exercises: The Physics of Recording — From Edison to Digital

Part A: Mechanical Recording and Groove Physics

A1. A phonograph cylinder rotates so that its surface moves at 40 cm/s past the stylus. A 200 Hz tone is recorded on the cylinder. What is the spatial wavelength of the groove undulations corresponding to this tone? If the stylus tip has a radius of 0.5 mm, can it accurately reproduce this feature? Show your calculation and explain your reasoning.
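
One way to set up the A1 calculation is sketched below. It assumes only that the groove wavelength is the distance the surface travels during one cycle of the tone; the function name is illustrative and not from the chapter. Interpreting the ratio is left to the reader.

```python
def groove_wavelength_m(surface_speed_m_s, frequency_hz):
    """Distance the recording surface travels during one cycle of the tone."""
    return surface_speed_m_s / frequency_hz

# Values from exercise A1: 40 cm/s surface speed, 200 Hz tone, 0.5 mm tip radius.
wavelength_mm = groove_wavelength_m(0.40, 200.0) * 1000.0
stylus_diameter_mm = 2 * 0.5

print(f"groove wavelength: {wavelength_mm:.2f} mm")
print(f"stylus diameter:   {stylus_diameter_mm:.2f} mm")
print(f"wavelength / diameter: {wavelength_mm / stylus_diameter_mm:.1f}")
```

Comparing the wavelength with the tip diameter (rather than the radius) is a simplifying choice; a full answer would also consider groove curvature at the recorded amplitude.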

A2. Early phonograph recordings captured frequencies up to approximately 4,000 Hz reliably. A modern microphone captures up to 20,000 Hz. Given that both are capturing the same tuba performance, describe in physical terms what information the phonograph missed that the microphone captured. Be specific: what acoustic features of the tuba (harmonics, attack transients, breath noise) would be missing from the phonograph recording?

A3. The shift from cylinder to flat disc recordings in the 1890s–1900s was partly driven by manufacturing economics. A flat disc can be pressed from a master mold, while cylinders had to be recorded one at a time. Explain the physics of why a disc can be pressed from a mold while maintaining the encoded information, and why this is harder to do with cylinders. Consider the geometry of the groove information in both cases.

A4. Describe the "proximity effect" in directional (cardioid) microphones and explain why it occurs in physical terms. A recording engineer places a cardioid microphone 3 inches from a singer's mouth for a jazz recording. The engineer then moves the microphone to 18 inches. Qualitatively describe how the recording will change, and whether the proximity effect will be more or less pronounced at each distance.

A5. An elliptical stylus has a smaller radius of curvature along the direction of groove travel (0.3 mil) than across the groove (0.7 mil), where 1 mil = 0.001 inch. A spherical stylus has a uniform radius of 0.7 mil. Explain why the elliptical stylus produces better high-frequency reproduction. At what approximate frequency does the spherical stylus begin to fail to track the groove accurately, given a disc surface speed of 35 cm/s? (Hint: assume the stylus fails when its diameter exceeds the groove wavelength.)
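
The hint in A5 can be turned into a short calculation, sketched here under the hint's own criterion (failure when the tip diameter equals the groove wavelength). Applying the same criterion to the 0.3 mil minor radius of the elliptical tip is an assumption made for comparison, and the function name is illustrative.

```python
MIL_IN_M = 25.4e-6  # 1 mil = 0.001 inch = 25.4 micrometres

def tracking_limit_hz(surface_speed_m_s, tip_radius_mil):
    """Frequency at which the groove wavelength shrinks to the tip diameter."""
    diameter_m = 2 * tip_radius_mil * MIL_IN_M
    return surface_speed_m_s / diameter_m

# 35 cm/s surface speed from the exercise.
spherical_limit = tracking_limit_hz(0.35, 0.7)
# Assumption: the same criterion applied to the elliptical tip's 0.3 mil radius.
elliptical_limit = tracking_limit_hz(0.35, 0.3)

print(f"spherical (0.7 mil):        {spherical_limit / 1000:.1f} kHz")
print(f"elliptical (0.3 mil radius): {elliptical_limit / 1000:.1f} kHz")
```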


Part B: Magnetic Recording Physics

B1. A tape recorder operates at 15 IPS (inches per second). A 10,000 Hz tone is recorded on this tape. What is the spatial wavelength of the magnetic pattern on the tape in millimeters? If the same recorder runs at 3.75 IPS, what happens to this spatial wavelength? What practical consequence does this have for high-frequency reproduction?
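
A minimal sketch of the B1 arithmetic follows. It assumes the recorded wavelength is simply tape speed divided by frequency, with 1 inch = 25.4 mm; the helper name is illustrative.

```python
INCH_IN_MM = 25.4

def tape_wavelength_mm(speed_ips, frequency_hz):
    """Length of tape that passes the head during one cycle of the tone."""
    return speed_ips * INCH_IN_MM / frequency_hz

wavelength_15 = tape_wavelength_mm(15.0, 10_000.0)
wavelength_375 = tape_wavelength_mm(3.75, 10_000.0)

print(f"15 IPS:   {wavelength_15 * 1000:.1f} micrometres")
print(f"3.75 IPS: {wavelength_375 * 1000:.2f} micrometres")
print(f"ratio:    {wavelength_15 / wavelength_375:.1f}")
```

The practical consequence asked for in the exercise depends on how these wavelengths compare with the playback head gap, which is not specified here.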

B2. Explain the concept of magnetic hysteresis and why it creates distortion in magnetic recording. Draw (or describe in precise terms) the shape of a hysteresis curve (B-H curve) and identify: (a) the remanent magnetization point, (b) the coercive force point, (c) the saturation region, and (d) where small-signal audio recording operates without bias.

B3. AC bias in magnetic recording is set at a frequency of 80 kHz on a particular tape machine. Explain in physical terms why this ultrasonic signal, which is far above the audio band, improves the quality of audio recording at frequencies like 1,000 Hz. What would happen to the recorded audio quality if the bias frequency were set to 8,000 Hz instead? What would happen if the bias level were doubled?

B4. Tape saturation is often described as adding "warmth" to audio by generating even-order harmonic distortion. Suppose a pure 440 Hz sine wave recorded with heavy saturation emerges with significant 880 Hz and 1,760 Hz content. (a) Are these harmonics musically consonant with the 440 Hz fundamental? Why or why not? (b) If the same saturation were applied to a 300 Hz tone, what harmonics would be added? Are these consonant with the fundamental? (c) Does this suggest why tape saturation tends to work better with some instruments than others?
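
The harmonic frequencies in B4 can be tabulated with a short sketch. It takes the exercise's premise (second and fourth harmonics) as given and expresses each harmonic as a number of octaves above the fundamental; the function names are illustrative.

```python
import math

def even_harmonics(fundamental_hz, orders=(2, 4)):
    """Frequencies of the given harmonic orders (2nd and 4th by default)."""
    return [n * fundamental_hz for n in orders]

def octaves_above(frequency_hz, fundamental_hz):
    """Interval between a harmonic and its fundamental, in octaves."""
    return math.log2(frequency_hz / fundamental_hz)

for fundamental in (440.0, 300.0):
    for harmonic in even_harmonics(fundamental):
        interval = octaves_above(harmonic, fundamental)
        print(f"{fundamental:.0f} Hz -> {harmonic:.0f} Hz ({interval:.0f} octave(s) up)")
```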

B5. A Dolby B noise reduction system boosts frequencies above 1,000 Hz by up to 10 dB during recording, then attenuates them by the matching amount during playback. (a) If the recording is played on a non-Dolby player, how would the sound be affected? (b) If a recording made without Dolby is played on a Dolby B playback unit, how would the sound be affected? (c) Why does this noise reduction scheme reduce the audibility of tape hiss specifically? What assumption about where tape hiss concentrates spectrally does this scheme rely on?


Part C: Microphones and Signal Chain

C1. Compare the transduction mechanisms of dynamic, condenser, and ribbon microphones. For each: (a) name the physical principle used (Faraday's law, capacitance variation, etc.), (b) identify what moves in response to sound, (c) describe one application where this type is preferred, and (d) identify one physical limitation of this type.

C2. A condenser microphone has a sensitivity of −40 dBV/Pa (meaning it produces −40 dBV output at 1 Pascal of sound pressure). A typical voice at 1 meter produces a sound pressure level of about 60 dB SPL (0.02 Pa). What voltage does this microphone produce? If the preamplifier following this mic has a gain of 50 dB, what voltage emerges from the preamp? If the preamp's noise floor is −130 dBV (referred to input), what is the signal-to-noise ratio at the preamp output?
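
The level arithmetic in C2 can be sketched as follows. The pressure is computed from the SPL figure using the standard 20 micropascal reference rather than taken from the parenthetical value. The sketch assumes the preamp's input-referred noise is the only noise source (microphone self-noise is ignored), and the function names are illustrative.

```python
import math

P_REF_PA = 20e-6  # standard reference pressure for dB SPL

def spl_to_pascal(spl_db):
    return P_REF_PA * 10 ** (spl_db / 20)

def dbv_to_volts(level_dbv):
    return 10 ** (level_dbv / 20)

pressure_pa = spl_to_pascal(60.0)
# Sensitivity is -40 dBV at 1 Pa; scale by the actual pressure in dB re 1 Pa.
mic_dbv = -40.0 + 20 * math.log10(pressure_pa / 1.0)
preamp_out_dbv = mic_dbv + 50.0
# With input-referred noise, the ratio is the same before and after the gain.
snr_db = mic_dbv - (-130.0)

print(f"pressure:   {pressure_pa:.3f} Pa")
print(f"mic output: {mic_dbv:.1f} dBV = {dbv_to_volts(mic_dbv) * 1000:.2f} mV")
print(f"preamp out: {preamp_out_dbv:.1f} dBV = {dbv_to_volts(preamp_out_dbv) * 1000:.1f} mV")
print(f"SNR:        {snr_db:.1f} dB")
```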

C3. An omnidirectional microphone is placed in the center of a reverberant room. A cardioid microphone is placed at the same position, aimed at a guitar amplifier 6 feet away. Compare the recordings produced by each microphone in terms of: (a) room-to-direct sound ratio, (b) frequency response, (c) sensitivity to sound sources behind the microphone. Under what musical circumstances would you prefer each?

C4. A dynamic range compressor is set with a threshold of −12 dBFS, a ratio of 4:1, and an attack time of 10 ms. A vocal signal arrives with occasional peaks that reach −6 dBFS (6 dB above the threshold). (a) After compression, what is the output level of these peaks? (b) What happens to the signal during the 10 ms attack time before the compressor engages? (c) How would the recording sound different if the ratio were 20:1 (near-limiting)?
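
For part (a) of C4, the static gain curve of a compressor can be sketched as below. This is a hard-knee model with no attack or release behaviour and no make-up gain, which are assumptions beyond what the exercise states; the function name is illustrative.

```python
def compressor_output_db(input_db, threshold_db, ratio):
    """Static hard-knee compressor curve: levels above threshold are scaled by 1/ratio."""
    if input_db <= threshold_db:
        return input_db
    return threshold_db + (input_db - threshold_db) / ratio

peak_4_to_1 = compressor_output_db(-6.0, -12.0, 4.0)
peak_20_to_1 = compressor_output_db(-6.0, -12.0, 20.0)

print(f"4:1  -> peak at {peak_4_to_1:.1f} dBFS")
print(f"20:1 -> peak at {peak_20_to_1:.1f} dBFS")
```

Because the model is static, it says nothing about part (b): during the attack time the real gain reduction has not yet reached the value computed here.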

C5. Explain the physics of the inverse-square law as it applies to microphone placement. A vocalist singing at 90 dB SPL at 1 meter is recorded with a microphone at 1 meter, then at 2 meters. (a) What is the SPL at each microphone position? (b) If the studio has a diffuse reverberant field at 70 dB SPL throughout the room, what is the direct-to-reverberant ratio at 1 meter? At 2 meters? (c) Which placement would produce a more "intimate" sound, and why?
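
The numbers in C5 follow from the inverse-square law, sketched here under the assumption of a point source in free field for the direct sound and a uniform 70 dB SPL reverberant field, as the exercise specifies. The function name is illustrative.

```python
import math

def direct_spl_db(spl_at_ref_db, distance_m, ref_distance_m=1.0):
    """Direct-sound level falls by 20*log10 of the distance ratio (about 6 dB per doubling)."""
    return spl_at_ref_db - 20 * math.log10(distance_m / ref_distance_m)

REVERB_SPL_DB = 70.0  # diffuse field level given in the exercise

for distance in (1.0, 2.0):
    direct = direct_spl_db(90.0, distance)
    ratio = direct - REVERB_SPL_DB
    print(f"{distance:.0f} m: direct {direct:.1f} dB SPL, direct-to-reverberant {ratio:.1f} dB")
```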


Part D: Stereo and Spatial Audio

D1. The stereo illusion relies on Interaural Level Difference (ILD) and Interaural Time Difference (ITD). (a) What is the maximum ITD for a sound directly to one side of a listener with ears separated by 17 cm? Assume sound travels at 343 m/s. (b) At which frequencies is ILD the more important cue — low or high? Explain physically why the head is a better sound blocker at higher frequencies. (c) What happens to the stereo image if a listener sits far to the left of the ideal "sweet spot" between the speakers?
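
Part (a) of D1 can be estimated with the simplest possible model, sketched here: the extra path to the far ear is taken as the straight-line ear separation, ignoring the curved path around the head. The second figure, the frequency whose wavelength equals the ear separation, is offered only as a rough scale for where head shadowing starts to matter and is not a value from the chapter.

```python
SPEED_OF_SOUND_M_S = 343.0
EAR_SEPARATION_M = 0.17

# Straight-line approximation: extra travel distance equals the ear separation.
max_itd_us = EAR_SEPARATION_M / SPEED_OF_SOUND_M_S * 1e6

# Rough scale only: frequency whose wavelength matches the ear separation.
head_scale_hz = SPEED_OF_SOUND_M_S / EAR_SEPARATION_M

print(f"maximum ITD (straight-line model): {max_itd_us:.0f} microseconds")
print(f"wavelength equals ear separation at about {head_scale_hz:.0f} Hz")
```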

D2. A recording engineer pans a guitar "25% left" using intensity panning (meaning the left channel is 3 dB louder than the right). Explain why a listener wearing headphones hears this differently than a listener sitting between two speakers at equal distance. In the headphone case, is the ILD cue accurately reproduced? In the speaker case, does each ear actually receive a different signal?

D3. Binaural recording uses two microphones placed in the ear canals of a dummy head to capture the Head-Related Transfer Function (HRTF). This type of recording, when played back on headphones, can produce convincing three-dimensional spatial images. (a) What acoustic information does the HRTF contain that standard stereo microphone techniques miss? (b) Why does binaural recording lose its spatial qualities when played on loudspeakers? (c) Why do the spatial qualities also degrade when the listener's HRTF differs substantially from the dummy head's?

D4. A recording was made with a coincident X-Y stereo microphone pair (two cardioid mics angled 90 degrees apart, capsules at the same point). Another recording of the same performance was made with a spaced omni pair (two omnidirectional mics 50 cm apart). Compare these two techniques in terms of: (a) the type of stereo cue they primarily produce (ILD or ITD), (b) mono compatibility (what happens when both channels are summed to mono), and (c) the perceived width and depth of the stereo image.

D5. Modern surround sound systems (5.1, 7.1, Dolby Atmos) extend the two-channel stereo concept. Dolby Atmos adds overhead speakers to the traditional horizontal array. (a) Identify the additional spatial dimension that overhead speakers address that two-channel stereo cannot. (b) What additional cues (beyond ILD and ITD) does the human auditory system use for vertical localization, and how would overhead speakers exploit these? (c) What is the fundamental physical reason why no loudspeaker-based system can exactly recreate the binaural cues of a real acoustic environment?


Part E: Digital Recording, Historical Context, and Synthesis

E1. The Compact Disc uses a 780-nanometer laser to read pits on an aluminum surface through a polycarbonate disc. The minimum pit length is 0.833 micrometers and the track pitch is 1.6 micrometers. (a) Why must the pit depth be approximately one-quarter of the laser wavelength inside the polycarbonate? (b) The CD spiral track is approximately 5.38 km long. At a constant linear velocity of 1.2–1.4 m/s, calculate the approximate playing time this represents. (c) The DVD uses a 650 nm laser. Qualitatively, how does this shorter wavelength enable higher data density?
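
The numerical parts of E1 can be sketched as follows. The playing time uses only the track length and velocity range given in the exercise. The quarter-wave depth additionally assumes a refractive index of about 1.55 for polycarbonate, a typical figure that the exercise does not supply.

```python
TRACK_LENGTH_M = 5380.0
LASER_WAVELENGTH_NM = 780.0
POLYCARBONATE_INDEX = 1.55  # assumption: typical value, not given in the exercise

def playing_time_min(track_length_m, linear_velocity_m_s):
    """Time to scan the whole spiral at constant linear velocity, in minutes."""
    return track_length_m / linear_velocity_m_s / 60.0

time_slow = playing_time_min(TRACK_LENGTH_M, 1.2)
time_fast = playing_time_min(TRACK_LENGTH_M, 1.4)

# The wavelength shortens inside the disc material; the pit depth is a quarter of that.
quarter_wave_nm = LASER_WAVELENGTH_NM / POLYCARBONATE_INDEX / 4.0

print(f"playing time at 1.2 m/s: {time_slow:.1f} min")
print(f"playing time at 1.4 m/s: {time_fast:.1f} min")
print(f"quarter wavelength in polycarbonate: {quarter_wave_nm:.0f} nm")
```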

E2. Glenn Gould recorded the Goldberg Variations twice: in 1955 (live studio session) and 1981 (heavily edited multitrack production). Compare these two recordings from the perspective of the philosophy of recording. (a) What does it mean for the 1981 recording to be "more perfect" than a single-take performance? (b) Is the 1981 recording more or less "authentic" than the 1955 recording? Define "authentic" before answering. (c) Can a recording be considered a primary musical work in its own right, or is it always a document of something else?

E3. The "loudness war" (discussed in Case Study 2) involved mastering engineers applying heavy compression to CDs to make them sound louder on radio and in comparison tests. From a physics perspective: (a) If a CD is mastered so that the average level is −6 dBFS (6 dB below maximum), what headroom remains for peaks? (b) If the same music is remastered so the average level is −2 dBFS, what happens to dynamic range? (c) Why did the adoption of loudness normalization (measured in LUFS) by streaming services effectively end the commercial incentive for the loudness war?

E4. Consider a scenario where recording technology had never been invented. How might Western art music have developed differently? Consider: (a) the role of notation in preserving and transmitting musical works before recording, and its limitations; (b) which genres of music (jazz, blues, electronic music) might not have developed as they did without recording; (c) how the economic structure of the music industry would differ; (d) whether the concert experience itself would have evolved differently without the comparison point of studio recordings.

E5. The text states that "recording technology does not preserve music — it translates music into a new medium." Write a 400-word analysis applying this claim to a specific recording you know well. Identify: (a) at least two specific ways the recording medium (microphone choice, room acoustics, mixing decisions, mastering) has shaped the sound of this recording; (b) what features of the live performance or original composition are absent or transformed in the recording; (c) whether you believe the recording is a document of the music, a new work of art, or something between these positions.