
Appendix A: Mathematical Foundations (Intuitive)

This appendix collects the mathematical tools that appear throughout the textbook. The goal here is not to derive everything from first principles but to give you a sturdy, intuitive grasp of each idea so that when you encounter it in a chapter, it feels like meeting a familiar face rather than a stranger. If you are a physics student who has taken calculus, some of this will be review; if you are a music student encountering these ideas for the first time, work through the examples slowly and use the "Chapter Links" sidebars to see exactly how each concept shows up in the main text.


A.1 Waves and Sine Functions

What a Sine Wave Is

Imagine you are watching a point on the rim of a bicycle wheel as someone slowly rolls the wheel past you. The point starts at the rightmost position, rises as the wheel turns, reaches the top, descends, hits the bottom, and returns to where it started. If you plot the height of that point against time, you get a smooth, repeating S-curve. That curve is a sine wave.

More precisely, a sine wave is the projection of uniform circular motion onto a straight line. The circle is rotating at a constant rate; the sine wave is what you see when you collapse that rotation down to one dimension. This geometric origin is the reason sine waves are everywhere in physics: any system that is pulled back toward equilibrium by a restoring force proportional to its displacement (a spring, a pendulum swinging through small angles, a vibrating string) will produce exactly this kind of motion.

Amplitude →   1 |    *       *
              0.5|  *   *   *   *
  y(t)         0 |*       *       *——— time →
             -0.5|          *   *   *
              -1 |            *
                  |—————|—————|—————|
                  0    T/2    T    3T/2

The plot above shows a pure sine wave. Notice that it is perfectly symmetrical, rising and falling in an identical curved pattern. The horizontal axis is time; the vertical axis is the value of the wave at that moment — which could be air pressure, string displacement, or electrical voltage depending on context.

The Three Numbers That Define a Sine Wave

Every pure sine wave is completely described by exactly three numbers:

Amplitude (A) is the peak value — how far the wave swings from zero. In sound, amplitude corresponds to loudness: a larger amplitude means more air pressure variation and therefore a louder sound. In the diagram above, the amplitude is 1. If we doubled A to 2, the wave would swing from +2 to −2 but otherwise look identical — same shape, same timing, just taller.

Frequency (f) is the number of complete cycles per second, measured in Hertz (Hz). If f = 440 Hz, the wave completes 440 full oscillations every second. That is the note A above middle C, the standard orchestral tuning pitch. Higher frequency means more cycles per second, which means higher pitch. Lower frequency means fewer cycles per second, which means lower pitch.

Phase (φ) is the starting angle of the wave, measured in radians. It answers the question: where in its cycle is the wave at time t = 0? A phase of 0 means the wave starts at zero and goes up immediately. A phase of π/2 (90 degrees) means the wave starts at its maximum. Phase matters enormously when two waves interact — two waves with the same frequency and amplitude but opposite phases (φ = π apart) will cancel each other out completely. This is the principle behind noise-canceling headphones.

The Equation

The master equation for a sine wave is:

y(t) = A × sin(2π × f × t + φ)

Let us read every symbol in plain English:

  • y(t) is the value of the wave at time t. This might be air pressure at a microphone, displacement of a guitar string, or voltage from a speaker amplifier.
  • A is the amplitude (peak value). Units depend on context: Pascals for pressure, meters for displacement.
  • sin(...) is the sine function, which takes an angle and returns a number between −1 and +1. It is the mathematical engine that generates the smooth oscillation.
  • 2π converts frequency from cycles per second into radians per second. One complete cycle = 2π radians. So 2πf gives the angular frequency ω (omega), in radians per second.
  • f is frequency in Hz. For A4, f = 440.
  • t is time in seconds.
  • φ (phi) is the initial phase in radians.

The product 2π × f × t grows steadily as time increases, and passing this ever-growing angle into the sine function is what makes the wave repeat.
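As a minimal sketch of the equation above, the following Python snippet evaluates y(t) at quarter-period steps through one cycle of A4. The function name `sine_wave` and its parameter names are illustrative choices, not notation from the text.

```python
import math

def sine_wave(t, amplitude=1.0, frequency=440.0, phase=0.0):
    """Return y(t) = A * sin(2*pi*f*t + phi) for time t in seconds."""
    return amplitude * math.sin(2 * math.pi * frequency * t + phase)

period = 1 / 440.0  # duration of one cycle of A4, in seconds

# Step through one cycle in quarter periods: zero, peak, zero, trough, zero.
for k in range(5):
    t = k * period / 4
    print(f"t = {t * 1000:.3f} ms   y = {sine_wave(t):+.3f}")
```

Passing `phase=math.pi / 2` shifts the starting point to the peak, and doubling `amplitude` doubles every printed value without changing the timing.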

Chapter Links: The sine wave equation appears in Chapter 1 (Introduction to Sound as Waveform), Chapter 5 (Resonance and Standing Waves), and Chapter 12 (Fourier Analysis in Music).

Why Sine Waves Are "Pure"

Sine waves (of any amplitude and phase) are the only solutions to the simplest possible oscillation equation:

d²y/dt² = −(2πf)² × y

This equation says: the acceleration of y is proportional to y but in the opposite direction. A spring exerts this kind of force. A pendulum (for small angles) does too. A column of air in a tube does as well. Any system described by this equation will naturally produce sine waves and only sine waves. That is why physicists call the sine wave the "natural" or "pure" oscillation — it is not a combination of anything; it is the ground floor of oscillatory motion.
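One way to check this claim numerically is to step the equation forward in time and compare the result with a sinusoid. The sketch below is only an illustration: the function name `simulate_oscillator`, the step size, and the choice of the semi-implicit Euler method are all assumptions of this example. It starts from y = 1 at rest, so the matching sinusoid is a cosine, which is a sine wave with phase π/2.

```python
import math

def simulate_oscillator(frequency, duration, dt=1e-4):
    """Step d2y/dt2 = -(2*pi*f)**2 * y forward from y = 1, velocity = 0.

    Uses semi-implicit Euler (an assumption of this sketch): update the
    velocity from the acceleration, then the position from the new velocity.
    Returns a list of (t, y) pairs.
    """
    omega_sq = (2 * math.pi * frequency) ** 2
    y, v = 1.0, 0.0
    samples = [(0.0, y)]
    steps = round(duration / dt)
    for n in range(1, steps + 1):
        v += -omega_sq * y * dt   # acceleration is proportional to -y
        y += v * dt
        samples.append((n * dt, y))
    return samples

# One second of a 1 Hz oscillator, compared with the sinusoid cos(2*pi*t).
trace = simulate_oscillator(frequency=1.0, duration=1.0)
worst = max(abs(y - math.cos(2 * math.pi * t)) for t, y in trace)
print(f"largest deviation from a pure sinusoid: {worst:.5f}")
```

The small remaining deviation comes from the finite step size, not from the equation itself; shrinking `dt` shrinks it further.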

Reference Table: Common Musical Frequencies

| Note | Frequency (Hz) | Period (ms) | Wavelength in Air (cm) |
|---|---|---|---|
| A0 (lowest piano) | 27.5 | 36.4 | 1,247 |
| C2 | 65.4 | 15.3 | 524 |
| A2 | 110 | 9.09 | 312 |
| Middle C (C4) | 261.6 | 3.82 | 131 |
| A4 (concert A) | 440 | 2.27 | 78.0 |
| C5 | 523.3 | 1.91 | 65.5 |
| A5 | 880 | 1.14 | 39.0 |
| C8 (highest piano) | 4,186 | 0.239 | 8.2 |

Wavelengths assume speed of sound ≈ 343 m/s in air at 20°C. Notice that bass notes have wavelengths measured in meters — comparable to room dimensions — which is why bass frequencies are strongly affected by room acoustics and why subwoofer placement matters.


A.2 Frequency, Period, and Wavelength

The Reciprocal Relationship: Period and Frequency

Period and frequency are two ways of describing the same thing from opposite viewpoints. The period T is the time it takes to complete one full cycle, measured in seconds. The frequency f is the number of cycles completed per second.

They are exact reciprocals:

T = 1/f       f = 1/T

If a wave completes 440 cycles per second (f = 440 Hz), each cycle takes 1/440 seconds ≈ 0.00227 seconds = 2.27 milliseconds. This is the period of concert A.

The intuition is simple: the faster the wave oscillates (higher f), the less time each oscillation takes (smaller T). You cannot have a high-frequency wave with a long period — these are mutually exclusive.

Worked Example: The lowest note on a standard piano is A0 at 27.5 Hz. What is its period?

T = 1/27.5 ≈ 0.0364 seconds = 36.4 milliseconds

That means each oscillation of the lowest piano note takes about 36 thousandths of a second — imperceptible as individual events, but repeated 27.5 times per second they create the sensation of a rumbling bass pitch.

Wavelength in a Medium

When a sound wave travels through air, it has a wavelength λ (lambda) — the physical distance from one pressure peak to the next. Wavelength, frequency, and the speed of sound c are related by:

λ = c / f

The speed of sound in air at room temperature (20°C) is approximately 343 m/s. This speed is fixed by the medium — it does not depend on frequency or amplitude. Therefore, high-frequency sounds have short wavelengths and low-frequency sounds have long wavelengths.

Worked Example: What is the wavelength of A4 (440 Hz)?

λ = 343 / 440 ≈ 0.780 m = 78 cm

What about a bass note at 80 Hz?

λ = 343 / 80 ≈ 4.3 m

That 4.3-meter wavelength is comparable to the dimensions of a typical room, which is why bass frequencies interact so strongly with room boundaries (standing waves, room modes).
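The two relationships in this section, T = 1/f and λ = c/f, are simple enough to wrap in a pair of helper functions. The sketch below is illustrative only; the function names and the constant `SPEED_OF_SOUND` are choices made for this example, with the speed taken from the 343 m/s figure used in the text.

```python
SPEED_OF_SOUND = 343.0  # m/s in air at about 20 °C, as assumed in the text

def period_ms(frequency_hz):
    """Period T = 1/f, converted to milliseconds."""
    return 1000.0 / frequency_hz

def wavelength_m(frequency_hz, speed=SPEED_OF_SOUND):
    """Wavelength lambda = c/f, in meters."""
    return speed / frequency_hz

for name, f in [("A0", 27.5), ("80 Hz bass", 80.0), ("A4", 440.0)]:
    print(f"{name}: T = {period_ms(f):.2f} ms, wavelength = {wavelength_m(f):.2f} m")
```

Passing a different `speed` (for example, a value for warmer air or for water) shows how the same frequency maps to a different wavelength in a different medium.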

Frequency Ranges Reference Table

| Frequency Range | Category | Musical/Physical Notes |
|---|---|---|
| 20 – 60 Hz | Sub-bass | Organ pedal, kick drum felt, room modes |
| 60 – 250 Hz | Bass | Bass guitar, cello low notes, male voice fundamental |
| 250 – 500 Hz | Low-midrange | Piano midrange, vocal warmth |
| 500 Hz – 2 kHz | Midrange | Core of most instruments, vocal intelligibility |
| 2 – 4 kHz | Upper-mid | Presence, consonant clarity, nasal tones |
| 4 – 8 kHz | Presence/high-mid | Sibilance, high guitar, tin whistle |
| 8 – 20 kHz | Air/treble | Cymbals, breath noise, recording "air" |
| > 20 kHz | Ultrasound | Dog whistles, sonar, medical imaging |
| 20 Hz – 20 kHz | Human hearing | Shrinks with age, especially at high end |
| 261.6 Hz | Middle C | Reference point for keyboard instruments |

The Octave: Why Doubling Makes "Same but Higher"

In virtually every musical culture on Earth, a pitch that is exactly twice the frequency of another is perceived as the "same note, just higher." An A at 440 Hz and an A at 880 Hz are both called "A." Why does doubling produce this perceptual equivalence?

The answer lies partly in the physics of harmonics. When a typical pitched instrument plays A at 440 Hz, it also produces overtones at 880, 1320, 1760 Hz and so on. The note at 880 Hz is already present as the second harmonic of the lower A, and every harmonic of the upper A (880, 1760, ...) coincides with a harmonic of the lower one. The two pitches share a dense harmonic overlap, which makes them sound closely related, almost the same.

Psychoacoustically, the brain's pitch-processing mechanism has a compressive, approximately logarithmic character. On a logarithmic pitch scale, each octave spans the same perceptual "distance" regardless of where in the frequency range you are. The interval from 110 to 220 Hz feels the same size as the interval from 440 to 880 Hz.

This logarithmic octave equivalence is why we can meaningfully say that "doubling frequency equals one octave," and it is the foundation of all the ratio-based music theory in Section A.3.

Chapter Links: Wavelength and room acoustics appear in Chapter 8 (Room Acoustics and Standing Waves). Octave equivalence and logarithmic pitch are explored in Chapter 3 (Pitch Perception and Psychoacoustics).


A.3 Ratios and Intervals

What a Ratio Means Musically

In music, an interval is the relationship between two pitches — and relationships are best expressed as ratios. If note A has frequency f₁ and note B has frequency f₂, the interval between them is characterized by the ratio f₂/f₁.

What matters for musical perception is not the absolute difference (f₂ − f₁) but the ratio. The interval from 220 Hz to 330 Hz (ratio 3:2) sounds identical to the interval from 440 Hz to 660 Hz (also ratio 3:2). Both are perfect fifths, even though the first pair differs by 110 Hz and the second pair differs by 220 Hz. Ratios, not differences, define musical intervals.

Integer Ratios and Consonance

The simplest integer ratios correspond to the intervals that most human cultures regard as the most consonant (harmonious, stable-sounding):

| Interval Name | Frequency Ratio | Example (from A4 = 440 Hz) |
|---|---|---|
| Unison | 1 : 1 | 440 Hz (same note) |
| Octave | 2 : 1 | 880 Hz |
| Perfect Fifth | 3 : 2 | 660 Hz |
| Perfect Fourth | 4 : 3 | 586.7 Hz |
| Major Third | 5 : 4 | 550 Hz |
| Minor Third | 6 : 5 | 528 Hz |
| Major Sixth | 5 : 3 | 733.3 Hz |
| Minor Seventh | 7 : 4 | 770 Hz |
| Minor Second | 16 : 15 | 469.3 Hz (very dissonant) |

The physical reason these simple ratios sound consonant involves the alignment of overtones. When you play a perfect fifth (3:2), the overtones of both notes land on many shared frequencies: the 3rd harmonic of the lower note equals the 2nd harmonic of the upper note. This spectral alignment reduces beating (interference) and produces a smooth, fused sound.

Complex ratios (like 16:15 for a minor second) produce many misaligned overtones that beat against each other rapidly, generating the roughness perceived as dissonance.
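The overtone-alignment argument can be made concrete by listing which low harmonics of two notes coincide exactly. The sketch below is a simplification that only counts exact coincidences among the first 12 harmonics of each note; it does not model beating or roughness. The function name `shared_harmonics` and the cutoff of 12 are assumptions of this example.

```python
from fractions import Fraction

def shared_harmonics(ratio, count=12):
    """Return pairs (m, n) where harmonic m of the lower note coincides
    with harmonic n of the upper note, for the given frequency ratio."""
    ratio = Fraction(ratio)
    pairs = []
    for m in range(1, count + 1):
        for n in range(1, count + 1):
            if m == n * ratio:   # m * f1 == n * f2, where f2 = ratio * f1
                pairs.append((m, n))
    return pairs

print("perfect fifth 3:2  ->", shared_harmonics(Fraction(3, 2)))
print("minor second 16:15 ->", shared_harmonics(Fraction(16, 15)))
```

Within this range the fifth produces several coinciding pairs, starting with (3, 2), while the minor second produces none: its first exact coincidence would not occur until the 16th harmonic of the lower note.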

Cents: The Logarithmic Unit

Ratios are the "correct" way to think about intervals, but they become cumbersome for precise comparison. A musician needs to know whether a particular tuning is 3 cents sharp or 7 cents flat — that precision requires a finer unit than "roughly 3:2."

The cent is defined such that one octave equals exactly 1200 cents, and one semitone (equal temperament) equals exactly 100 cents. The cent is a logarithmic unit: each cent is a ratio of 2^(1/1200), a tiny frequency multiplier.

Why logarithmic? Because our perception of pitch intervals is logarithmic. The perceptual "distance" from 440 to 880 Hz equals the distance from 880 to 1760 Hz — both are one octave. On a linear frequency scale, these distances are 440 Hz and 880 Hz — very different. On a logarithmic scale, they are identical. Cents live on the logarithmic scale, so they match how we actually hear.

Converting a frequency ratio to cents:

cents = 1200 × log₂(f₂ / f₁)

The log₂ (logarithm base 2) asks: "to what power must I raise 2 to get this ratio?"

  • log₂(2) = 1, so an octave (ratio 2:1) gives 1200 × 1 = 1200 cents. ✓
  • log₂(3/2) ≈ 0.585, so a pure fifth gives 1200 × 0.585 ≈ 702 cents.
  • log₂(4/3) ≈ 0.415, so a pure fourth gives 1200 × 0.415 ≈ 498 cents.
  • log₂(5/4) ≈ 0.322, so a major third gives 1200 × 0.322 ≈ 386 cents.

Intuition for log₂: The logarithm base 2 counts octaves. log₂(4) = 2 means "4 is two octaves above 1." log₂(8) = 3. If the ratio is between 1 and 2, log₂ gives a number between 0 and 1 — meaning the interval is less than an octave.
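The conversion formula translates directly into code. The following is a minimal sketch using Python's standard `math.log2`; the helper names `ratio_to_cents` and `cents_to_ratio` are illustrative, not taken from the text.

```python
import math

def ratio_to_cents(ratio):
    """Size of an interval in cents: 1200 * log2(f2 / f1)."""
    return 1200 * math.log2(ratio)

def cents_to_ratio(cents):
    """Inverse conversion: the frequency ratio that spans the given cents."""
    return 2 ** (cents / 1200)

for name, ratio in [("octave", 2 / 1), ("pure fifth", 3 / 2),
                    ("pure fourth", 4 / 3), ("pure major third", 5 / 4)]:
    print(f"{name:17s} {ratio_to_cents(ratio):7.1f} cents")

# One equal-tempered semitone is 100 cents.
print(f"100 cents = ratio {cents_to_ratio(100):.5f}")
```

To measure how far a played pitch is from its target, pass `played_hz / target_hz` as the ratio: a positive result means sharp, a negative one flat.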

Equal Temperament Reference Table

Modern instruments use equal temperament: the octave is divided into 12 equal semitones. "Equal" here means equal in ratio, not equal in Hertz — each semitone is a ratio of 2^(1/12) ≈ 1.05946.

| Semitone | Note (from C) | ET Ratio | Cents | Just Ratio (approx.) | Deviation (ET − just) |
|---|---|---|---|---|---|
| 0 | C | 1.0000 | 0 | 1:1 | 0 |
| 1 | C#/Db | 1.0595 | 100 | 16:15 | −12 cents |
| 2 | D | 1.1225 | 200 | 9:8 | −4 cents |
| 3 | D#/Eb | 1.1892 | 300 | 6:5 | −16 cents |
| 4 | E | 1.2599 | 400 | 5:4 | +14 cents |
| 5 | F | 1.3348 | 500 | 4:3 | +2 cents |
| 6 | F#/Gb | 1.4142 | 600 | 45:32 | +10 cents |
| 7 | G | 1.4983 | 700 | 3:2 | −2 cents |
| 8 | G#/Ab | 1.5874 | 800 | 8:5 | −14 cents |
| 9 | A | 1.6818 | 900 | 5:3 | +16 cents |
| 10 | A#/Bb | 1.7818 | 1000 | 7:4 | +31 cents |
| 11 | B | 1.8877 | 1100 | 15:8 | +12 cents |
| 12 | C' | 2.0000 | 1200 | 2:1 | 0 |

The "Deviation" column shows how far equal temperament departs from pure integer ratios. The perfect fifth (G, 700 cents) is only 2 cents flat from pure — nearly imperceptible. The major third (E, 400 cents) is 14 cents sharp from pure — a musically significant difference that gives equal temperament its characteristic slightly brash sound compared to just intonation.
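The deviations quoted above can be recomputed from the cents formula: take the equal-tempered size (100 cents per semitone) and subtract the size of the just ratio. The sketch below does this for a few of the intervals; the function name `et_deviation_cents` and the selection of rows are choices made for this example.

```python
import math

# (label, semitones above C, just-ratio numerator, just-ratio denominator)
INTERVALS = [
    ("E  major third", 4, 5, 4),
    ("F  perfect fourth", 5, 4, 3),
    ("G  perfect fifth", 7, 3, 2),
    ("Bb harmonic seventh", 10, 7, 4),
]

def et_deviation_cents(semitones, numerator, denominator):
    """Equal-tempered size minus just size, in cents (positive = ET is sharp)."""
    just_cents = 1200 * math.log2(numerator / denominator)
    return 100 * semitones - just_cents

for label, semitones, num, den in INTERVALS:
    print(f"{label:20s} {et_deviation_cents(semitones, num, den):+6.1f} cents")
```

A negative result means the equal-tempered interval is narrower (flatter) than the pure ratio, as with the fifth; a positive result means it is wider (sharper), as with the major third.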

Chapter Links: Ratios and consonance are explored deeply in Chapter 4 (Consonance, Dissonance, and Harmony). Temperament and tuning systems occupy Chapters 6 and 7.


A.4 Logarithms and Decibels

Why Logarithms Match Human Perception

The human ear is capable of detecting sounds across an enormous range of intensities, from the faintest audible sound to the roar of a jet engine. The ratio of intensities between these extremes is approximately 10^12 (one trillion). If sound levels were reported on a linear scale in Watts per square meter, you would need to say "that's 0.000000000001 W/m² for the faintest audible sound and about 1 W/m² for a jet engine", which is deeply inconvenient.

More importantly, our perception is not linear. In the 1830s, the physiologist Ernst Weber and later Gustav Fechner formalized the observation that equal ratios of stimulus correspond to equal steps in perception. If doubling the intensity produces one noticeable step in loudness, then you need to double again (quadruple the original) to get another step — not merely add another unit. This Weber-Fechner Law implies that loudness is approximately proportional to the logarithm of intensity.

The decibel (dB) scale was designed to match this logarithmic perception. It is defined relative to a reference level:

dB (SPL) = 20 × log₁₀(A / A₀)    [for amplitude, pressure]
dB (SPL) = 10 × log₁₀(P / P₀)    [for power, intensity]

Why 20 and 10? Because power goes as amplitude squared (P ∝ A²). If you double amplitude (ratio 2), you quadruple power (ratio 4). log₁₀(2) ≈ 0.301. Multiplying by 20 gives 6.02 dB for a doubling of amplitude. log₁₀(4) ≈ 0.602. Multiplying by 10 gives 6.02 dB for a quadrupling of power. Both formulas produce the same dB value for the same physical change — they are two ways of expressing the same thing.

The reference level A₀ for Sound Pressure Level (SPL) is 20 micropascals (20 × 10⁻⁶ Pa) — approximately the quietest sound a young adult with normal hearing can detect at 1 kHz.
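The two decibel formulas and the SPL reference level can be sketched in a few lines of Python. The function names below are illustrative; the only values taken from the text are the factors 20 and 10 and the 20 micropascal reference.

```python
import math

P_REF = 20e-6  # SPL reference pressure in pascals (20 micropascals)

def db_from_amplitude_ratio(ratio):
    """Decibels for a ratio of amplitudes or pressures: 20 * log10(A / A0)."""
    return 20 * math.log10(ratio)

def db_from_power_ratio(ratio):
    """Decibels for a ratio of powers or intensities: 10 * log10(P / P0)."""
    return 10 * math.log10(ratio)

def spl_db(pressure_pa):
    """Sound pressure level of an RMS pressure given in pascals."""
    return db_from_amplitude_ratio(pressure_pa / P_REF)

# Doubling amplitude quadruples power; both routes give the same dB change.
print(f"amplitude x2: {db_from_amplitude_ratio(2):.2f} dB")
print(f"power x4:     {db_from_power_ratio(4):.2f} dB")
print(f"1 Pa:         {spl_db(1.0):.1f} dB SPL")
```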

Key Properties of Decibels

Adding dB values: When two independent sound sources of equal power combine, the total intensity doubles. A doubling of power = +3 dB. Two violins playing the same note will be about 3 dB louder than one violin — not 6 dB, not twice the dB value.

10 dB ≈ double perceived loudness: Research by Fletcher and others showed that approximately 10 dB increase corresponds to a perceived doubling of loudness (though this varies with frequency and listener). This is a rough perceptual rule, not a physical law.

Every 6 dB = doubling the amplitude: In audio engineering, a 6 dB step (more precisely 20 × log₁₀(2) ≈ 6.02 dB) corresponds to doubling amplitude. This is used constantly in recording and mixing.

Decibel Reference Table

| dB SPL | Sound Source | Perception |
|---|---|---|
| 0 | Threshold of hearing | Inaudible to most adults |
| 10 | Rustling leaves | Barely perceptible |
| 20 | Whisper (1 m away) | Very quiet |
| 30 | Quiet bedroom at night | Quiet |
| 40 | Library, soft background | Quiet ambient |
| 60 | Normal conversation | Comfortable |
| 70 | Busy restaurant | Loud |
| 80 | Alarm clock, loud traffic | Annoyingly loud |
| 85 | NIOSH recommended 8-hour exposure limit (OSHA action level) | Hearing damage possible |
| 90 | Lawnmower, motorcycle | Loud; protective gear advised |
| 100 | Power saw, subway train | Very loud |
| 110 | Rock concert (near stage) | Painful for sustained exposure |
| 120 | Threshold of pain | Physical pain in ear |
| 130 | Jet engine at 100 m | Immediate hearing risk |
| 140 | Gunshot at close range | Instant damage possible |
| 194 | Theoretical maximum SPL in air | Pressure wave goes to vacuum |

Worked Example: Adding Two Sound Sources

A trumpet plays at 90 dB and another trumpet joins in at the same level. What is the combined level?

The physical intensities add, not the decibels:

I₁ = I₂ = I (same level)
I_total = 2I

dB_total = 10 × log₁₀(2I / I₀)
         = 10 × log₁₀(2) + 10 × log₁₀(I / I₀)
         = 10 × 0.301 + 90
         ≈ 3 + 90 = 93 dB

Two identical sources add only 3 dB — this surprises students who expect a doubling in dB. The reason is that the dB scale already "compresses" large ratios; adding a second equal source is only a 2:1 intensity ratio, which maps to just 3 dB.
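The same calculation generalizes to any number of independent sources: convert each level to a relative intensity, add the intensities, and convert back. The sketch below assumes the sources are incoherent (their intensities simply add), which is the case treated in the worked example; the function name `combine_levels_db` is illustrative.

```python
import math

def combine_levels_db(levels_db):
    """Combined level of independent sources: add intensities, not decibels."""
    total_intensity = sum(10 ** (level / 10) for level in levels_db)
    return 10 * math.log10(total_intensity)

print(f"two trumpets at 90 dB:  {combine_levels_db([90, 90]):.1f} dB")
print(f"ten trumpets at 90 dB:  {combine_levels_db([90] * 10):.1f} dB")
print(f"90 dB plus 70 dB:       {combine_levels_db([90, 70]):.2f} dB")
```

The last line illustrates a practical consequence: a source 20 dB quieter than another adds almost nothing to the combined level.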

Chapter Links: Decibels and loudness perception appear in Chapter 3 (Psychoacoustics), Chapter 9 (Dynamics and Dynamic Range), and Chapter 15 (Audio Recording and Signal Chain).


A.5 The Fourier Series (Intuitive)

The Central Idea

Jean-Baptiste Joseph Fourier showed, in the early 19th century, something that initially seems almost magical: any periodic function, no matter how complicated its shape, can be written as a sum of sine and cosine waves at different frequencies.

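This is not an approximation. For the reasonably well-behaved waveforms met in audio, the sum converges to the target function as more terms are added (at an instantaneous jump, it converges to the midpoint of the jump). A square wave, a sawtooth wave, the pressure waveform of a violin: all of these are (in principle) exact sums of pure sine waves.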

The practical implication for music is profound: every sustained musical timbre is a recipe for mixing harmonics. The characteristic steady-state sound of a clarinet vs. an oboe vs. a violin is largely encoded in which harmonics are present and at what amplitudes. If you could dial up and down the amplitudes of harmonics at will, you could synthesize any periodic waveform from pure sine waves. This is exactly what additive synthesizers do.

Reading a Fourier Series

A Fourier series for a periodic function with fundamental frequency f₀ looks like:

y(t) = A₀ + A₁sin(2πf₀t + φ₁) + A₂sin(2×2πf₀t + φ₂) + A₃sin(3×2πf₀t + φ₃) + ...

  • A₀ is the DC offset, the average value of the function. For audio centered at zero pressure, this is 0.
  • A₁, A₂, A₃, ... are the amplitudes of the 1st, 2nd, 3rd, ... harmonics. The 1st harmonic is the fundamental; the higher ones are the overtones.
  • f₀ is the fundamental frequency — the perceived pitch.
  • 2f₀, 3f₀, 4f₀, ... are the harmonics. They are all integer multiples of f₀.
  • φ₁, φ₂, φ₃, ... are the phases of each harmonic.

To "read" a Fourier series, look at each term and ask: what frequency is this, how loud is it (amplitude), and where does it start (phase)? The amplitudes tell you the spectral content — the tone color.

The Square Wave: Building Complexity from Pure Tones

A square wave alternates instantly between +1 and −1, spending equal time at each value. Its Fourier series is:

y(t) = (4/π) × [sin(2πf₀t) + (1/3)sin(3×2πf₀t) + (1/5)sin(5×2πf₀t) + ...]

Only odd harmonics (1st, 3rd, 5th, 7th, ...) are present. Each has amplitude 1/n where n is the harmonic number. As you add more terms, the sum approaches the square wave:

1 term:   ~~~~~  (smooth sine)
3 terms:  _|‾|_  (roughly squarish, with ripples)
10 terms: |‾‾‾|  (clearly square, with small Gibbs ripples at corners)
∞ terms:  exact square wave

The clarinet's tone is dominated by odd harmonics — this is why clarinets sound "hollow" and are sometimes compared to a square wave.
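The convergence of the square-wave series can be checked numerically by summing a finite number of odd harmonics. The sketch below evaluates the partial sum at the middle of the positive half-cycle (t = T/4), where the ideal square wave equals +1. The function name `square_partial_sum` and the choice of f₀ = 1 Hz are assumptions of this example.

```python
import math

def square_partial_sum(t, f0, n_terms):
    """Sum the first n_terms odd harmonics of the square-wave series at time t."""
    total = 0.0
    for k in range(n_terms):
        n = 2 * k + 1                      # odd harmonic numbers 1, 3, 5, ...
        total += math.sin(2 * math.pi * n * f0 * t) / n
    return 4 / math.pi * total

f0 = 1.0             # fundamental in Hz, so the period is 1 second
t_mid = 0.25         # a quarter period: middle of the positive half-cycle

for n_terms in (1, 3, 10, 100, 1000):
    value = square_partial_sum(t_mid, f0, n_terms)
    print(f"{n_terms:5d} terms: y = {value:.4f}")
```

With a single term the value overshoots to 4/π; as terms are added, it settles toward 1.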

Spectral Content of Common Waveforms

| Waveform | Harmonics Present | Character |
|---|---|---|
| Pure sine | 1st only | Flute-like, pure |
| Square | Odd only (1/n amplitude) | Hollow, nasal (clarinet) |
| Sawtooth | All harmonics (1/n amplitude) | Bright, buzzy (string, brass) |
| Triangle | Odd only (1/n² amplitude) | Softer than square |
| Pulse (narrow) | All harmonics (nearly equal amplitude) | Thin, percussive |

From Fourier Series to Fourier Transform

The Fourier series applies to periodic functions. The Fourier Transform extends the idea to non-periodic signals — like a spoken word or a single piano note that decays over time. Instead of a discrete list of harmonics, the Fourier Transform produces a continuous spectrum of frequencies.

In practice, audio is analyzed using the Discrete Fourier Transform (DFT) computed via the Fast Fourier Transform (FFT) algorithm. The FFT takes a block of N audio samples and returns N complex numbers. For real-valued audio only the first N/2 + 1 of them are distinct (the rest are mirror-image conjugates), and each represents the amplitude and phase at a specific frequency bin.

The frequency resolution of an FFT is:

Δf = sample_rate / N

For N = 4096 samples at a 44,100 Hz sample rate: Δf = 44100/4096 ≈ 10.8 Hz per bin. Two frequencies closer together than roughly this spacing cannot be reliably separated.
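To see frequency bins in action without any external library, the sketch below computes a DFT directly from its definition and locates the peak for a test tone. This is a deliberately naive O(N²) implementation for illustration; an FFT produces the same numbers far faster. The sample rate, block length, and tone frequency are arbitrary values chosen so that the tone falls exactly on a bin.

```python
import cmath
import math

def dft(samples):
    """Naive discrete Fourier transform, straight from the definition."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
        for k in range(n)
    ]

def bin_frequency(k, sample_rate, n):
    """Center frequency of bin k: k * sample_rate / N."""
    return k * sample_rate / n

sample_rate = 8000   # Hz, arbitrary for this example
n = 64               # block length, so bins are 8000 / 64 = 125 Hz apart
tone = 1000.0        # Hz, chosen to land exactly on bin 8

samples = [math.sin(2 * math.pi * tone * i / sample_rate) for i in range(n)]
spectrum = dft(samples)
magnitudes = [abs(c) for c in spectrum[: n // 2 + 1]]   # bins from 0 Hz up to sample_rate / 2
peak_bin = max(range(len(magnitudes)), key=magnitudes.__getitem__)

print(f"resolution: {sample_rate / n} Hz per bin")
print(f"peak at bin {peak_bin} = {bin_frequency(peak_bin, sample_rate, n)} Hz")
```

A tone that falls between two bin centers would spread its energy across neighboring bins, which is one reason longer blocks (larger N) give a sharper picture of pitch.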

Chapter Links: Fourier series appear in Chapter 11 (Timbre and Spectral Analysis). The FFT is implemented in Python throughout Chapters 12–16. The connection between waveform shape and timbre is explored in Chapter 13.


A.6 Basic Statistics for Music Analysis

Mean and Standard Deviation

When we analyze audio features — the spectral centroid of 100 recordings, the tempo of songs across a decade, the fundamental frequency variation in a vocalist's vibrato — we are working with datasets that need statistical summarization.

The mean (average) is the most familiar summary: add all values and divide by the count.

mean = (x₁ + x₂ + ... + xₙ) / n

The standard deviation σ (sigma) measures how spread out the values are around the mean:

σ = sqrt[ Σ(xᵢ − mean)² / n ]

Intuitively: compute how far each value is from the mean, square those distances (to make them positive), average them, and take the square root. A small σ means the data clusters tightly around the mean; a large σ means it is widely scattered.

Example: A soprano sings a sustained A4 (440 Hz). Analysis of 500 ms of audio using pitch tracking yields individual estimates: 438, 441, 440, 443, 439, 440, 442, ... Hz. The mean might be 440.3 Hz (very close to target). The standard deviation might be 1.8 Hz (tight intonation) or 6.4 Hz (wavering intonation with excessive vibrato).
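As a small worked instance, the sketch below applies the two formulas to the seven pitch estimates quoted in the example. Only those seven values are used, so the results differ from the "might be" figures in the text, which refer to a longer hypothetical analysis. The function names are illustrative.

```python
import math

def mean(values):
    """Arithmetic mean: sum of the values divided by their count."""
    return sum(values) / len(values)

def std_dev(values):
    """Standard deviation with n in the denominator, as in the formula above."""
    m = mean(values)
    return math.sqrt(sum((x - m) ** 2 for x in values) / len(values))

pitches_hz = [438, 441, 440, 443, 439, 440, 442]   # first estimates from the example

print(f"mean = {mean(pitches_hz):.2f} Hz")
print(f"standard deviation = {std_dev(pitches_hz):.2f} Hz")
```

Python's standard library offers the same calculation as `statistics.pstdev`; the related `statistics.stdev` divides by n − 1 instead of n and gives a slightly larger value.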

Correlation

Correlation measures the linear relationship between two variables, on a scale from −1 to +1:

  • +1: Perfect positive relationship — when one variable increases, the other increases proportionally.
  • 0: No linear relationship. The variables may be unrelated, or they may be related in a purely nonlinear way; a correlation of zero does not by itself establish independence.
  • −1: Perfect negative relationship — when one increases, the other decreases proportionally.

The formula is Pearson's correlation coefficient:

r = Σ[(xᵢ − x̄)(yᵢ − ȳ)] / [n × σₓ × σᵧ]

Example in music research: If we analyze 500 songs and measure both the "spectral brightness" (high-frequency content) and listener ratings of "energy," we might find r = 0.72 — a strong positive correlation. Brighter-sounding songs tend to be rated as more energetic. This does not prove that brightness causes the rating; it shows a strong statistical association.
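Pearson's formula can be written out directly. The sketch below uses small made-up lists purely to exercise the function; they are not data from any study, and the names `brightness` and `energy` are hypothetical stand-ins for the two measurements described above.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: sum of co-deviations divided by n * sigma_x * sigma_y."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sigma_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sigma_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return co_dev / (n * sigma_x * sigma_y)

# Toy values invented for illustration only.
brightness = [1, 2, 3, 4, 5]
energy = [2, 1, 4, 3, 5]
print(f"toy brightness vs. energy: r = {pearson_r(brightness, energy):.2f}")

# A perfectly determined but nonlinear relationship can still give r = 0.
xs = [-2, -1, 0, 1, 2]
squares = [x * x for x in xs]
print(f"x vs. x squared:           r = {pearson_r(xs, squares):.2f}")
```

The second case is a reminder that r only measures linear association: y = x² is completely determined by x, yet over a symmetric range the correlation vanishes.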

Histograms and Distributions

A histogram counts how many data points fall into each "bin" of values. For audio, we might histogram the distribution of note durations in a jazz solo, or the distribution of spectral centroid values across a genre.

A key distribution for music analysis is the normal (Gaussian) distribution, the famous bell curve:

         *
       * | *
      *  |  *
    *    |    *
  *      |      *
—————————|—————————
      mean

Many biological measurements (vocal pitch variation, timing deviation in human performance) follow near-normal distributions. Many audio features (spectral centroid across many recordings) may be skewed or multimodal — not normal. Always visualize data before assuming normality.

The standard deviation defines the width of the bell curve: in a normal distribution, 68% of values fall within 1σ of the mean, 95% within 2σ, and 99.7% within 3σ.
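These percentages follow from the shape of the normal curve and can be reproduced with the error function in Python's standard library. The sketch below is illustrative; the helper name `fraction_within` is a choice made for this example.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} sigma: {100 * fraction_within(k):.1f}%")
```

The rule only applies when the data really are close to normal, which is why it pays to look at a histogram first.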

What p < 0.05 Means (and Its Limits)

In music research papers, you will frequently encounter statements like "listeners preferred the equal-tempered version significantly more (p = 0.03)." The p-value requires careful interpretation.

The p-value is the probability of getting results at least this extreme if there were truly no effect — if the null hypothesis (H₀: no difference) were true. p = 0.03 means: "if there were truly no preference difference, we would get results this extreme only 3% of the time by chance."

By convention, p < 0.05 is called "statistically significant": the result falls below the agreed threshold at which researchers reject the null hypothesis, on the grounds that chance alone would rarely produce data this extreme.

Critical caveats that responsible researchers acknowledge:

  1. p < 0.05 does not mean there is a 95% chance the hypothesis is true. The p-value is not the probability of the hypothesis; it is the probability of the data given no effect.

  2. Statistical significance ≠ practical significance. A study with 10,000 participants might find a "significant" difference of 0.5 Hz in pitch perception that is completely musically irrelevant.

  3. Multiple comparisons inflate false positive rates. If you test 20 different features and find p < 0.05 for one of them, that result has a high chance of being a false positive: with 20 tests at the 0.05 level, you would expect about one false positive by chance alone even if no real effects exist.

  4. Replication is the gold standard. A single study with p = 0.04 should be treated with appropriate skepticism. Replicated results from multiple independent labs carry much more weight.
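The multiple-comparisons caveat can be quantified with a short calculation. The sketch below assumes the tests are independent and that the null hypothesis is true for every one of them, which is a simplification; the function names are illustrative.

```python
def expected_false_positives(n_tests, alpha=0.05):
    """Average number of false positives when every null hypothesis is true."""
    return n_tests * alpha

def prob_at_least_one_false_positive(n_tests, alpha=0.05):
    """Chance that at least one of n independent tests falls below alpha by luck."""
    return 1 - (1 - alpha) ** n_tests

for n_tests in (1, 5, 20):
    print(f"{n_tests:2d} tests: expect {expected_false_positives(n_tests):.2f} false positives, "
          f"P(at least one) = {prob_at_least_one_false_positive(n_tests):.2f}")
```

With 20 tests, the chance of at least one spurious "significant" result is well over one half, which is why a lone hit among many comparisons deserves skepticism.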

Chapter Links: Statistical analysis appears in Chapter 28 (Empirical Research in Music Psychology), Chapter 33 (Machine Learning for Music Analysis), and Chapter 38 (Cross-Cultural Comparisons of Musical Scales).


Summary of Key Formulas

| Concept | Formula | Units |
|---|---|---|
| Sine wave | y(t) = A sin(2πft + φ) | Varies |
| Period-frequency | T = 1/f | s, Hz |
| Wavelength | λ = c/f | m |
| Angular frequency | ω = 2πf | rad/s |
| Interval in cents | cents = 1200 log₂(f₂/f₁) | cents |
| Octave | f₂/f₁ = 2 | dimensionless |
| Equal temperament semitone | ratio = 2^(1/12) ≈ 1.0595 | dimensionless |
| dB (amplitude) | dB = 20 log₁₀(A/A₀) | dB |
| dB (power) | dB = 10 log₁₀(P/P₀) | dB |
| FFT frequency resolution | Δf = f_s / N | Hz |
| Standard deviation | σ = √[Σ(xᵢ−x̄)²/n] | same as data |

This appendix provides a reference foundation. For deeper treatment of any topic, consult the chapters indicated in the Chapter Links callouts, or the resources listed in Appendix B.