Chapter 35 Further Reading: Spatial Audio & 3D Sound

Foundational Texts on Spatial Hearing

Blauert, Jens. Spatial Hearing: The Psychophysics of Human Sound Localization. Revised ed. MIT Press, 1997. The definitive academic reference on the physics and psychoacoustics of three-dimensional hearing. Covers ITD, ILD, pinna filtering, the cone of confusion, and the precedence effect in comprehensive, rigorous detail. Includes original experimental data that established much of what we know about spatial hearing. Graduate-level but accessible to motivated undergraduates.

Hartmann, William M. Signals, Sound, and Sensation. AIP Press/Springer, 1997. A rigorous but readable treatment of acoustics and psychoacoustics with strong chapters on spatial hearing. Covers ITD and ILD detection thresholds, binaural unmasking, and the psychophysics of localization. Unusual in integrating physical and perceptual analysis more tightly than most comparable texts.

Middlebrooks, John C., and David M. Green. "Sound Localization by Human Listeners." Annual Review of Psychology 42 (1991): 135–159. A landmark review paper on the psychoacoustics of sound localization that synthesizes the experimental literature on ITD, ILD, and HRTF contributions to spatial hearing. Essential reading for understanding the perceptual side of spatial audio.

HRTF and Binaural Audio

Møller, Henrik. "Fundamentals of Binaural Technology." Applied Acoustics 36, nos. 3–4 (1992): 171–218. The most comprehensive review of binaural recording and reproduction technology. Covers HRTF measurement, dummy head design, headphone equalization, and the physics of binaural reproduction. Dated in some technical details but unmatched in conceptual coverage.

Xie, Bosun. Head-Related Transfer Function and Virtual Auditory Display. 2nd ed. J. Ross Publishing, 2013. The most thorough modern text on HRTF theory and application. Covers HRTF measurement, modeling, personalization, and application to virtual auditory displays. Includes Chinese research contributions that are underrepresented in English-language literature.

Zotkin, Dmitry N., Ramani Duraiswami, and Larry S. Davis. "Rendering Localized Spatial Audio in a Virtual Auditory Space." IEEE Transactions on Multimedia 6, no. 4 (2004): 553–564. An important paper on the practical implementation of HRTF-based spatial audio rendering, including methods for interpolating between measured HRTF directions and approaches to reducing the computational cost of real-time binaural rendering.

Ambisonics

Gerzon, Michael A. "Periphony: With-Height Sound Reproduction." Journal of the Audio Engineering Society 21, no. 1 (1973): 2–10. Gerzon's original paper introducing the ambisonics framework and describing the periphonic (three-dimensional) sound reproduction system that became first-order ambisonics. A foundational document in spatial audio history.

Daniel, Jérôme. "Représentation de champs acoustiques, application à la transmission et à la reproduction de scènes sonores complexes dans un contexte multimédia." Ph.D. thesis, Université Paris 6, 2000. The most complete mathematical treatment of higher-order ambisonics, establishing the theoretical foundations for HOA encoding, decoding, and the relationship between ambisonic order and spatial resolution. In French but widely cited in English-language HOA literature.

Rafaely, Boaz. Fundamentals of Spherical Array Processing. Springer, 2015. A rigorous treatment of spherical harmonic analysis applied to acoustic field decomposition. Covers microphone array design for ambisonics capture, spherical harmonic transforms, and the mathematics underlying HOA processing. Requires a background in linear algebra.

Spatial Audio Technology and Formats

Pulkki, Ville. "Spatial Sound Reproduction with Directional Audio Coding." Journal of the Audio Engineering Society 55, no. 6 (2007): 503–516. Introduces DirAC (Directional Audio Coding), an analysis-synthesis framework for parametric spatial audio coding. Relevant for understanding how spatial audio is efficiently compressed and coded for streaming.

The Dolby Atmos White Paper. Available at dolby.com. Dolby's technical documentation on the Atmos format, covering the distinction between beds and objects, the rendering engine, and the cinema and consumer variants of the format. An accessible technical introduction to object-based audio.

ITU-R BS.2051-3: Advanced Sound System for Programme Production. The international standard specifying 3D audio formats for broadcast, including the "9+10+3" system (commonly known as 22.2). This document defines the loudspeaker configurations and audio format requirements for immersive broadcast audio.

VR Audio

Wenzel, Elizabeth M., et al. "Localization Using Nonindividualized Head-Related Transfer Functions." Journal of the Acoustical Society of America 94, no. 1 (1993): 111–123. The foundational study on the consequences of using non-individualized (generic) HRTFs for spatial audio rendering. Demonstrates quantitatively the elevation errors and front-back confusions that result from HRTF mismatch — the core problem of VR audio personalization.

Caponetto, A., et al. "Personalized Spatial Audio with Automatic Equalization Using Measured Headphone and HRTF Data." AES 137th Convention, 2014. A practical study of HRTF personalization using in-ear microphone measurement — the approach being pursued by Meta and others for consumer VR headsets.

Savioja, Lauri, et al. "Creating Interactive Virtual Acoustic Environments." Journal of the Audio Engineering Society 47, no. 9 (1999): 675–705. A comprehensive review of the techniques available for real-time acoustic simulation in interactive virtual environments, covering ray tracing, image source methods, and the computational tradeoffs involved. Foundational reading for VR audio engineering.

Spatial Music and Creative Applications

Emmerson, Simon. Living Electronic Music. Ashgate, 2007. Explores the history and aesthetics of electronically mediated music, including early experiments in spatial composition, tape music, and electronic spatial manipulation. Contextualizes "Tomorrow Never Knows" and other pioneering spatial audio works.

Wishart, Trevor. On Sonic Art. Revised ed. Harwood Academic Publishers, 1996. A theoretical treatment of the creative possibilities of electronic music, including extended discussion of spatial composition, transformation of acoustic space, and the relationship between physical sound and musical experience.

Zvonar, Richard. "A History of Spatial Music." eContact! 7, no. 4 (2005). Available at econtact.ca. A comprehensive historical survey of compositional and technological approaches to spatial music, from Berlioz's spatially distributed orchestras through ambisonics and beyond.

Online Resources and Tools

Spatial Audio Designer (SAD) — Plugin from Noise Makers — noisemakers.fr Professional ambisonic production tools. Their resources section includes tutorials on first-order and higher-order ambisonics that are clear and technically grounded.

IEM Plugin Suite (Institute of Electronic Music and Acoustics) — plugins.iem.at Free and open-source tools for ambisonics encoding, decoding, and visualization. Includes the EnergyVisualizer tool that visually displays the directional energy distribution of an ambisonic scene — an excellent tool for developing intuition about spherical harmonic spatial encoding.

3D Tune-In Toolkit — 3d-tune-in.eu Open-source library for binaural audio rendering with HRTF personalization. Includes interfaces for HRTF measurement, processing, and real-time binaural rendering. Useful for research-level exploration of the topics in this chapter.

Ambisonic Decoder Toolbox (ADT) — bitbucket.org/ambidecodertoolbox Open-source MATLAB/GNU Octave tools for ambisonic decoding matrix computation. Useful for exploring the mathematics of decoder design discussed in Section 35.4.

SADIE Database — sadie-project.org A freely available HRTF database with measurements from 20 subjects (including human subjects and the KEMAR dummy head) at high angular resolution. Useful for studying individual differences in HRTFs and for prototyping binaural rendering applications.