Chapter 9 Exercises: The Voice as Instrument


Part A: The Source-Filter Model and Acoustic Fundamentals

Exercise A1 — Source and Filter Identification A singer sustains the vowel /a/ (as in "father") on a D4 (293 Hz). She then transitions to /i/ (as in "beat") on the same pitch. (a) Describe what changes in the source (if anything) during this transition. (b) Describe what changes in the filter during this transition. (c) If the singer's F2 for /a/ is 1090 Hz and for /i/ is 2290 Hz, and her fundamental is 293 Hz, identify which harmonic (3rd, 4th, 5th, 6th, 7th, or 8th) is closest to F2 in each vowel. (d) Explain why the /i/ vowel typically sounds brighter than /a/ even at the same pitch and loudness.
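
For checking part (c), a small helper like the one below may be useful. It is only a sketch: the function name `nearest_harmonic` is hypothetical, and the example uses illustrative values rather than those of the exercise.

```python
def nearest_harmonic(f0_hz: float, target_hz: float) -> tuple[int, float]:
    """Return (harmonic number, harmonic frequency) closest to target_hz."""
    # Harmonics sit at integer multiples of f0. Round to the nearest multiple,
    # but never go below the 1st harmonic (the fundamental itself).
    n = max(1, round(target_hz / f0_hz))
    return n, n * f0_hz


# Illustrative values only: a 100 Hz fundamental and a formant at 520 Hz.
number, frequency = nearest_harmonic(100.0, 520.0)
print(number, frequency)  # 5 500.0
```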

Exercise A2 — Glottal Source Characteristics The glottal source spectrum rolls off at approximately -12 dB per octave. (a) If the fundamental (f₀ = 200 Hz) has a level of 60 dB SPL, what is the approximate source level at 400 Hz? At 800 Hz? At 1600 Hz? (b) Sketch this roll-off on a frequency axis from 200 Hz to 3200 Hz. (c) A formant at 3000 Hz provides +18 dB of boost. What is the output level at 3000 Hz, given the source level there obtained by extending your calculation from (a)? (d) Why is the singer's formant acoustically useful despite the steep source roll-off?
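
The roll-off arithmetic can be sketched as follows, assuming an idealized straight-line slope in dB per octave (real glottal spectra only approximate this). The function name `source_level_db` and the example numbers are illustrative and are not taken from the exercise.

```python
import math


def source_level_db(f_hz, f0_hz, level_at_f0_db, slope_db_per_octave=-12.0):
    """Level of an idealized source spectrum at f_hz, in dB."""
    # The number of octaves between f0 and f is log2 of the frequency ratio.
    octaves_above_f0 = math.log2(f_hz / f0_hz)
    return level_at_f0_db + slope_db_per_octave * octaves_above_f0


# Illustrative values only: 70 dB at a 100 Hz fundamental, evaluated
# two octaves higher at 400 Hz.
level = source_level_db(400.0, 100.0, 70.0)
print(level)  # 46.0

# A formant boost is added in dB on top of the source level.
print(level + 10.0)  # 56.0
```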

Exercise A3 — Vocal Tract as Resonator Consider the vocal tract as a tube approximately 17 cm long, closed at the glottis end and open at the lips. (a) Using the formula for a quarter-wave resonator (f = v/(4L), where v = 343 m/s), calculate the fundamental resonant frequency of this tube. (b) What are the frequencies of the next three resonant modes? (c) Compare these resonant frequencies to typical formant values for the neutral /ə/ vowel. How well does the simple tube model match? (d) The vocal tract is not a simple cylindrical tube. Name three ways in which its actual geometry differs from a cylinder and explain what each difference does to the resonant frequencies.
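
As a cross-check for parts (a) and (b), the modes of an ideal closed-open tube fall at odd multiples of the quarter-wave frequency, f_n = (2n - 1)v/(4L). The sketch below assumes a uniform, lossless tube. The function name `closed_open_tube_modes` is hypothetical, and the 10 cm length is an illustrative value, not the one in the exercise.

```python
def closed_open_tube_modes(length_m, count, speed_of_sound=343.0):
    """Resonant frequencies (Hz) of an ideal tube closed at one end."""
    # Only odd multiples of the quarter-wave frequency v / (4 L) resonate.
    return [
        (2 * n - 1) * speed_of_sound / (4 * length_m)
        for n in range(1, count + 1)
    ]


# Illustrative values only: a 10 cm tube, first three modes.
modes = closed_open_tube_modes(0.10, 3)
print([round(f, 1) for f in modes])  # [857.5, 2572.5, 4287.5]
```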

Exercise A4 — The Mucosal Wave (a) Explain the difference between the "vocal cord" model of vibration (simple string model) and the "mucosal wave" model. (b) Why is the mucosal wave a more accurate description of what actually happens during phonation? (c) Describe what a stroboscopic laryngoscopy of a healthy vs. a nodule-affected vocal fold would reveal about the mucosal wave. (d) Relate the mucosal wave to the efficiency of voice production. Why does a singer with vocal nodules use more subglottal pressure to achieve the same volume as a healthy singer?

Exercise A5 — Formant Measurement The following are formant frequency measurements for an adult male speaker:

Token      F1 (Hz)   F2 (Hz)
Vowel X    730       1090
Vowel Y    300       870
Vowel Z    270       2290

(a) Using the standard F1-F2 vowel chart, identify vowels X, Y, and Z (use IPA symbols or common English words). (b) Describe the tongue position associated with each vowel (height and frontness/backness). (c) If the same vowels were measured in an adult female speaker, how would the formant values likely differ, and why? (d) Why are the ratios of formant frequencies more useful than absolute formant values for vowel identity recognition?


Part B: Registers, Vibrato, and Extended Techniques

Exercise B1 — Registers and Physiology A baritone singer's chest voice extends to approximately F4 (349 Hz), and his falsetto begins around G4 (392 Hz). (a) Describe the physiological differences in vocal fold configuration between his chest voice on E4 and his falsetto on A4. (b) Why does the passaggio region (F4–G4) present a challenge for this singer? (c) What does "mixed voice" mean physiologically, and how might training help the baritone smooth the passaggio? (d) If the baritone performs Baroque repertoire as a countertenor, using primarily falsetto, how does his acoustic output differ from a true alto voice (female singer using modal register)?

Exercise B2 — Vibrato Analysis A singer's vibrato has a rate of 6 Hz and a depth of ±40 cents. (a) If the singer is sustaining a note at A4 (440 Hz), what are the highest and lowest instantaneous frequencies during vibrato? (Recall: 100 cents = 1 semitone; 1 semitone ≈ 6% frequency change.) (b) A listener hears this note. What pitch do they perceive? Explain why. (c) If two singers in a section both have 6 Hz vibrato but their vibrato phases are opposite (one at peak when the other is at trough), describe what the combined signal sounds like compared to a single voice. (d) Why might a choral director ask singers to "reduce vibrato" for a piece of early Renaissance polyphony but encourage full vibrato for late Romantic repertoire?
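
The 6% figure in part (a) is an approximation. The exact conversion is f = f_ref × 2^(cents/1200), sketched below. The function names are hypothetical, and the example uses a 220 Hz note with ±100 cents rather than the values in the exercise.

```python
def shift_by_cents(f_hz, cents):
    """Frequency reached by moving `cents` away from f_hz (1200 cents = 1 octave)."""
    return f_hz * 2 ** (cents / 1200)


def vibrato_extremes(center_hz, depth_cents):
    """Lowest and highest instantaneous frequency for a +/- depth_cents vibrato."""
    return shift_by_cents(center_hz, -depth_cents), shift_by_cents(center_hz, depth_cents)


# Illustrative values only: 220 Hz with a one-semitone (100 cent) excursion.
low, high = vibrato_extremes(220.0, 100.0)
print(round(low, 2), round(high, 2))  # 207.65 233.08
```

Note that the excursion is symmetric in cents but not in hertz: the upward swing covers slightly more hertz than the downward one.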

Exercise B3 — Overtone Singing A Tuvan throat singer establishes a drone at G2 (98 Hz). (a) List the frequencies of the first 10 harmonics of this drone. (b) The singer shapes their vocal tract to create a sharp formant at approximately 784 Hz. Which harmonic of the drone does this emphasize? (c) What is the musical interval between the emphasized harmonic and the drone fundamental? Is this interval consonant or dissonant? (d) The singer then shifts the formant to approximately 980 Hz. Which harmonic is now emphasized? What is the new musical interval? (e) What does this exercise reveal about the relationship between the harmonic series and musical scales?
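
To check harmonic frequencies and intervals, the following sketch lists a harmonic series and expresses the distance from the fundamental to the n-th harmonic in octaves plus equal-tempered semitones. The function names are hypothetical, and the 110 Hz drone is an illustrative value, not the one in the exercise.

```python
import math


def harmonic_series(f0_hz, count):
    """Frequencies of the first `count` harmonics of f0_hz."""
    return [n * f0_hz for n in range(1, count + 1)]


def interval_above_fundamental(n):
    """(whole octaves, leftover semitones) from the fundamental to harmonic n."""
    semitones = 12 * math.log2(n)  # equal-tempered semitones
    octaves, remainder = divmod(semitones, 12)
    return int(octaves), remainder


# Illustrative values only: a 110 Hz drone.
print(harmonic_series(110.0, 6))  # [110.0, 220.0, 330.0, 440.0, 550.0, 660.0]
octaves, rest = interval_above_fundamental(6)
print(octaves, round(rest, 2))  # 2 7.02
```

A remainder close to 7 semitones corresponds to a perfect fifth. The small excess over exactly 7 is the gap between the just and the equal-tempered fifth.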

Exercise B4 — Whistle Register and Physics The soprano Mariah Carey's whistle register extends to approximately C8 (4186 Hz). (a) Given that the typical female vocal range extends to about E6 (1318 Hz) in head voice, how many octaves above that limit is Carey's whistle-register C8? (b) The whistle register is thought to involve edge-tone oscillation rather than full mucosal contact. Explain why this mechanism would allow for much higher frequencies. (c) The whistle register produces very little harmonic richness — the sound is nearly sinusoidal. Using the source-filter model, explain why this might be the case. (d) A singing teacher claims that the whistle register is "not real singing." Using physics, argue for or against this claim.

Exercise B5 — Comparing Vocal Techniques Across Traditions Compare the following pairs of vocal techniques. For each pair, describe both the physical production mechanism and the resulting acoustic output: (a) Western operatic soprano vs. Hindustani classical singer (compare production approach, acoustic strategies, and performance context) (b) Tuvan khoomei sygyt vs. Western falsetto (compare what aspects of the source-filter model each emphasizes) (c) Tibetan low chant vs. Western bass voice (compare the extremes of glottal vibration) (d) Sprechstimme vs. natural speech (compare the degree of pitch control imposed on the laryngeal system) (e) Belting (musical theater) vs. operatic chest voice (compare subglottal pressure, vowel modification, and singer's formant use)


Part C: The Singer's Formant and Projection

Exercise C1 — The Spectral Gap The orchestra produces a dense spectrum from 100 Hz to about 2500 Hz, with much lower energy in the 2500–3500 Hz range. (a) Why does the orchestra have lower acoustic energy in this range? What is it about orchestral instruments that creates this gap? (b) A trained operatic tenor sings a note at A3 (220 Hz). His singer's formant is at 3000 Hz. Which harmonic of his voice (3rd? 4th? 5th? ...) falls closest to the singer's formant? (c) This harmonic is enhanced by 18 dB due to the singer's formant. If without the enhancement, this harmonic was at the same level as the orchestra in that frequency range (say, 80 dB SPL), what is its level with the enhancement? (d) Explain in perceptual terms why an 18 dB advantage in a narrow frequency range allows the voice to "cut through" the orchestra.
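
For part (d), it can help to translate a decibel gain into linear ratios. The sketch below uses the standard definitions (20 log10 for sound pressure, 10 log10 for power or intensity). The function names are hypothetical, and the 20 dB example is illustrative rather than the value in the exercise.

```python
def db_to_pressure_ratio(gain_db):
    """Sound-pressure (amplitude) ratio corresponding to a gain in dB."""
    return 10 ** (gain_db / 20)


def db_to_power_ratio(gain_db):
    """Power (intensity) ratio corresponding to a gain in dB."""
    return 10 ** (gain_db / 10)


# Illustrative value only: a 20 dB boost.
print(db_to_pressure_ratio(20.0), db_to_power_ratio(20.0))  # 10.0 100.0
```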

Exercise C2 — Formant Tuning at High Pitches A soprano sings a high C5 (523 Hz). Her vowel /a/ normally has F1 at approximately 700 Hz. (a) The harmonics of C5 are at 523, 1046, 1569, 2092 Hz, etc. Is the F1 frequency (700 Hz) exactly between two harmonics, or does one harmonic fall near it? (b) The soprano modifies the vowel, opening her jaw slightly, which raises F1 to approximately 1000 Hz. Now which harmonic is closest to F1? (c) By how many Hz is the nearest harmonic now from F1? Compare this to the case in (a). (d) Explain why the soprano would benefit acoustically from this vowel modification. What does the audience experience? (e) Is this vowel modification a "mistake"? Defend your answer.

Exercise C3 — The Epilaryngeal Tube The epilaryngeal tube — the narrow constriction just above the vocal folds — plays a key role in creating the singer's formant. (a) Explain the physical principle by which a narrow tube section embedded in a wider tube creates a distinct resonance. (b) Why does this resonance occur in the 2800–3200 Hz range for trained singers? (c) Describe two training techniques that a classical singing teacher might use that, acoustically, are developing the singer's formant through epilaryngeal tube narrowing. (d) Is the singer's formant present in all singing traditions? Discuss one tradition where it is highly developed and one where it is not the primary acoustic strategy.

Exercise C4 — Cross-Cultural Acoustic Strategies Different vocal traditions solve the "projection problem" (being heard) in different ways. Compare: (a) Indian classical singing (khyal): typically performed in relatively intimate spaces with tanpura and tabla accompaniment. What projection strategy is needed? How does this differ from opera? (b) Mongolian long song (Urtiin Duu): sung in wide open steppe environments. What acoustic challenges does this present, and how might the singing style address them? (c) Western choral singing: 40 singers projecting into a reverberant cathedral. How does the room environment help with projection, and what acoustic strategies are optimized for reverberant spaces? (d) Pop singing with close microphone technique: the singer barely needs acoustic projection. How does this change the physical production approach, and what new possibilities does it create?

Exercise C5 — Designing a Vocal Pedagogy Based on Physics You are designing a one-year vocal training program for adult beginners based strictly on acoustic physics. For each module, identify the physical principle being trained and the expected acoustic outcome: (a) Months 1–2: Breath Support — What is the physical relationship between subglottal pressure and vocal fold vibration? Why does "breath support" matter acoustically? (b) Months 3–4: Register Integration — Design an exercise sequence based on the physics of the passaggio. What physiological changes are you trying to develop? (c) Months 5–6: Resonance Development — How would you use acoustic feedback (e.g., spectral analysis software) to help students visualize and develop their singer's formant? (d) Months 7–8: Vowel Modification — Based on formant tuning principles, what guidance would you give students about vowel modification on high pitches? (e) Months 9–12: Style Integration — A student wants to learn both operatic and contemporary commercial music (CCM) styles. What physical adjustments distinguish these styles, and how would you sequence the training?


Part D: Language, Evolution, and Cultural Acoustics

Exercise D1 — Vowel System Design Language X has exactly 5 vowel phonemes. Language Y has exactly 10 vowel phonemes. (a) Based on acoustic optimization principles, where in the F1-F2 vowel space would you expect Language X's vowels to be located? Describe their distribution. (b) Language Y needs to fit 10 vowels into the same vowel space. What challenges does this create for speakers and listeners? How does acoustic distance between vowel categories change? (c) High front vowels such as /i/ (low F1, high F2, often perceived as bright) are common worldwide. Is there an acoustic physics reason why high front vowels might be acoustically salient? (Hint: think about the harmonic series and formant interactions.) (d) Invent a minimal 5-vowel system for a language where all 5 vowels must be easily distinguishable even in very noisy environments. Describe each vowel using its approximate F1 and F2 values and explain your design choices.

Exercise D2 — Tone Languages and Music (a) In Mandarin Chinese, the four tones are: level high (Tone 1), rising (Tone 2), dipping-then-rising (Tone 3), and falling (Tone 4). How does a Mandarin speaker's brain process pitch information differently from an English speaker's brain? What implications might this have for musical pitch perception? (b) When Mandarin is sung, the musical melody may conflict with the word tones. For example, a word with a falling tone sung on a rising melodic phrase creates an acoustic conflict. Describe how Chinese opera and Chinese popular music each handle this problem differently. (c) A linguist proposes that tone languages evolved in populations living in dense jungle environments where visual communication was difficult. Propose an acoustic physics argument for or against this hypothesis. (d) Historically, Gregorian chant was developed for Latin — a language without lexical tone. Would Gregorian chant have developed differently if it had been developed for a tone language? Explain your reasoning using acoustic physics.

Exercise D3 — The Descended Larynx (a) Explain why the position of the larynx affects the number of distinct vowels a vocal system can produce. Use F1-F2 space in your explanation. (b) Adult male humans have larynxes that sit slightly lower than those of adult female humans (relative to neck length). How might this difference relate to the different F1 formant values observed in males vs. females? (c) Infant humans are born with the larynx high (like other primates). The larynx descends during the first two years of life. At what developmental stage would you expect children to begin producing the full adult vowel inventory? Explain. (d) If evolution had found a way to give humans the low larynx without the choking risk (perhaps by redesigning the epiglottis), how might music and language have developed differently — or would they?

Exercise D4 — Choral Physics A university choir of 80 singers prepares a concert in a 500-seat hall with a 2.5-second reverberation time. (a) The sopranos (20 singers) all sing A5 (880 Hz) at mezzo-forte. If each singer produces 65 dB SPL at 1 meter, estimate the combined SPL at 1 meter using the incoherent addition model. (b) During the same passage, a single soprano soloist steps forward and sings the same note. She sings at forte (70 dB SPL). Can she be heard distinctly above the choir? Explain using acoustic principles. (c) The conductor wants to improve "blend" in the soprano section. She instructs the singers to match vowel formants and reduce vibrato depth. Explain physically why each of these instructions would improve blend. (d) The hall has a 2.5-second reverberation time. How does this affect the choir's effective loudness compared to a dead (0.5-second reverb) room? How should the choir adjust its performance style for each room?
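
The incoherent addition model in part (a) sums acoustic powers rather than pressures, so the levels must be converted out of decibels before adding. A minimal sketch, with a hypothetical function name and illustrative values that differ from those in the exercise:

```python
import math


def combine_incoherent_db(levels_db):
    """Combined SPL (dB) of uncorrelated sources, found by summing their powers."""
    # Convert each level to a relative power, add, then convert back to dB.
    total_power = sum(10 ** (level / 10) for level in levels_db)
    return 10 * math.log10(total_power)


# Illustrative values only: ten uncorrelated sources at 60 dB each.
print(round(combine_incoherent_db([60.0] * 10), 1))  # 70.0
```

For N equal sources this reduces to the single-source level plus 10 log10(N), so doubling the number of singers adds only about 3 dB.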

Exercise D5 — Research and Analysis: The Voice in Recording Before recording technology, vocal performance was shaped by the acoustic demands of the live hall. Since the 1920s, microphones have transformed vocal aesthetics. (a) Compare the vocal production of a pre-microphone popular singer (e.g., Enrico Caruso) with a post-microphone era singer (e.g., Frank Sinatra, a leading exponent of close-mic crooning). What physical production differences would you expect to find? (b) The close microphone technique exploits the "proximity effect": an increase in bass frequencies when a cardioid microphone is placed very close to a source. How does this change the acoustic character of the recorded voice relative to the live voice? (c) Modern pop vocal production often includes pitch correction (Auto-Tune), compression, reverb, and harmonic saturation. Map each of these processes onto the source-filter model: which processes modify the source, which modify the filter, and which modify neither? (d) Is the physical act of singing into a close microphone fundamentally different from singing in a concert hall? What new acoustic freedoms does microphone technique create, and what constraints does it impose?


Part E: Integration and Creative Application

Exercise E1 — The Voice as Scientific Object Design a study to determine whether professional opera singers develop the singer's formant through training or whether singers with a natural singer's formant are selected into opera training. Your study should: (a) Define the acoustic measurements you would take and how you would take them. (b) Identify the participant groups you would need. (c) Describe a longitudinal component (measurements over time). (d) Identify confounding variables and how to control for them. (e) Propose what evidence would support the "training" hypothesis vs. the "selection" hypothesis.

Exercise E2 — Cross-Traditional Vocal Analysis Obtain (or recall from memory/listening experience) examples of the following vocal styles. For each, write a 200-word acoustic description: (a) Western classical soprano (e.g., Maria Callas, Renée Fleming) (b) Bollywood playback singing (c) Tuvan throat singing (d) Overtone or harmonic choral music (e.g., Gyuto Tibetan monks) For each: identify the dominant acoustic strategy (what physical aspects of the voice are most prominently deployed?), describe the likely formant structure, and relate the technique to the cultural performance context.

Exercise E3 — Voice Synthesis from First Principles Without using electronic technology, design an acoustic "voice simulator" using physical materials: (a) What would you use as the "glottal source"? How would it generate a harmonic-rich buzz? (b) What would you use as the "vocal tract"? How would you make it variable? (c) What is the minimum set of formant tuning controls needed to distinguish at least three vowels? (d) What are the fundamental physical limitations of your design? What aspects of the real voice can it not replicate? (e) This design is essentially what the human larynx + vocal tract is. Reflect on the evolutionary process that produced this system: what "design decisions" did natural selection make, and what alternatives might it have taken?

Exercise E4 — Acoustic Diagnosis A singer presents with the following symptoms: voice sounds rough and breathy, pitch accuracy has declined, high notes are effortful, and she describes a "scratching" sensation during phonation. (a) Based on these symptoms, propose two possible acoustic diagnoses (pathologies) and explain how each would produce these symptoms physically. (b) For each diagnosis, describe what a laryngoscopy would likely reveal. (c) For each diagnosis, describe the physical mechanism of treatment that would restore normal voice quality. (d) Explain why "vocal rest" is a physically meaningful (not just common-sense) prescription for voice recovery. (e) If this singer is a professional who cannot take time off, what acoustic strategies might she use to minimize further damage while continuing to perform?

Exercise E5 — Composition for the Voice You are composing a work for unaccompanied mixed choir (SATB, 40 singers). Using knowledge from this chapter: (a) If you want the choir to produce a very loud, bright sound, what vowels would you choose for the climactic moment? Why? (b) If you want the choir to sound soft and "blended," what vowels would you choose and why? (c) You want to incorporate overtone singing. What voice part (bass, tenor, alto, soprano) would be most effective as the overtone singer, and why? What drone pitch would give the richest overtone series in the human hearing range? (d) You want to incorporate a soprano soloist who must be heard above the full choir. What tessitura (pitch range) would you write for her to maximize her acoustic advantage over the choir? (e) The concert will be performed in a reverberant Gothic cathedral (reverberation time approximately 7 seconds). How does this change your compositional decisions about tempo, rhythm, and vowel choice?