Chapter 18 Quiz: Information Theory & Music

Instructions: Answer each question, then reveal the answer by clicking the disclosure triangle.


Question 1. What is Shannon's definition of "information," and how does it differ from the everyday meaning of the word?

Reveal Answer

**Shannon's definition:** The information content of a message is proportional to how surprising it is — specifically, I(m) = -log₂P(m) bits, where P(m) is the probability of the message. A highly expected event (P close to 1) carries near-zero information; a highly unexpected event (P close to 0) carries a large amount of information.

**How it differs from the everyday meaning:** In ordinary speech, "information" suggests content, facts, meaning — the *semantic* content of a message. Shannon's definition is purely *statistical*: it ignores meaning entirely and measures only how predictable or unpredictable the message is. A completely meaningless sequence of symbols can carry maximum information in Shannon's sense if the symbols are maximally unpredictable. Conversely, a highly meaningful message ("the sun rose today") can carry essentially zero Shannon information if it was fully expected.

**The application to music:** musical information is the surprise value of each note given its context, not the "meaning" of the note in any deeper sense.
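The surprisal formula can be checked in a few lines of Python (a minimal sketch; the probabilities are illustrative, not from any corpus):

```python
import math

def information_content(p: float) -> float:
    """Shannon information (surprisal) of an event with probability p, in bits."""
    return -math.log2(p)

print(information_content(0.99))  # near-certain event: ~0.014 bits, almost no information
print(information_content(0.5))   # a fair coin flip: exactly 1 bit
print(information_content(1/64))  # a 1-in-64 surprise: 6 bits
```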

Question 2. What is Shannon entropy (H), and how is it different from the information content of a single message?

Reveal Answer

**Information content of a single message:** I(m) = -log₂P(m) bits. This is the information carried by one specific event.

**Shannon entropy:** H = -Σ P(m) log₂P(m). This is the *average* information content, averaged over all possible messages weighted by their probability. It is a property of the *source* (the random process generating messages), not of any specific message. Intuitively, entropy is the expected surprise per message. A source that always produces the same message (P=1 for one message, 0 for all others) has H=0 — no surprise ever. A source that produces all messages with equal probability has maximum entropy H = log₂(N) bits, where N is the number of possible messages.

**For music:** the entropy of a melody is the average information content per note — how surprised you are, on average, by each successive note.
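The entropy formula is easy to verify numerically. A minimal sketch in Python (the distributions are illustrative):

```python
import math

def entropy(probs) -> float:
    """Shannon entropy H = -sum(p * log2(p)) in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([1.0]))            # always the same message: 0 bits, no surprise ever
print(entropy([1/8] * 8))        # uniform over 8 messages: log2(8) = 3 bits, the maximum
print(entropy([0.7, 0.2, 0.1]))  # skewed source: somewhere between 0 and log2(3)
```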

Question 3. In Aiko's entropy experiment, what was the most surprising finding, and what did she learn from it?

Reveal Answer

**The surprising finding:** When Aiko computed the entropy of her own composition and compared it to Bach's chorale, she found that **Bach had *lower* conditional entropy** than her composition — particularly when context was taken into account (bigram and trigram entropy). Bach's entropy dropped rapidly as more context was included; her own composition's entropy dropped much more slowly.

**What she learned:** She had been conflating two things she assumed were the same:

- **High entropy** (statistically unpredictable, lacking consistent grammar) — which she had
- **Structural sophistication** (following complex but consistent rules) — which Bach had

Bach's low conditional entropy reflects that his tonal and contrapuntal grammar strongly constrains each note given its context. The rules are sophisticated; the output, given those rules, is highly predictable. Her own composition avoided conventional predictability but had no alternative consistent grammar — it was closer to controlled randomness than to structural depth.

The lesson: **structural complexity and informational randomness are opposites, not synonyms**. The most structurally complex music may have low conditional entropy, because the sophistication is in the rules, not in the unpredictability of the output.

Question 4. What is ITPRA theory, and how does it connect to information theory?

Reveal Answer

**ITPRA theory** (David Huron, 2006) describes the cycle of expectation and response to musical events in five stages:

- **I**magination: the brain generates predictions about upcoming events
- **T**ension: arousal and preparation as the anticipated moment approaches
- **P**rediction: comparison of the actual event with the prediction
- **R**eaction: automatic, rapid response to prediction error (surprise)
- **A**ppraisal: conscious evaluation of whether the surprise was good or bad

**Connection to information theory:**

- The Imagination stage corresponds to computing a probability distribution over possible next events (estimating P(m) for each possible next note).
- The prediction error at stage P corresponds to the information content I(m) = -log₂P(m): more unexpected events produce larger prediction errors.
- The Reaction magnitude (neurally measured as ERAN amplitude) correlates with information content.
- The Appraisal stage evaluates whether the high-information event was musically appropriate and aesthetically valuable.

ITPRA shows that the brain implements information-theoretic processing, but adds layers that Shannon entropy alone cannot capture: the distinction between expected and unexpected deviations, and the evaluation of whether violations are aesthetically positive.

Question 5. Why does the chapter claim that "tonality is a compression scheme"? What is being compressed?

Reveal Answer

**The claim:** Tonal harmony reduces the conditional entropy of pitch sequences by constraining what notes are likely given the context. This is information compression: the listener's model of the current key and harmonic progression allows them to predict many notes correctly, requiring less cognitive processing per note.

**What is being compressed:** The uncertainty about what comes next. Without any tonal grammar, the listener faces maximum uncertainty (up to log₂(12) ≈ 3.58 bits per note). With tonal grammar internalized, the listener faces much lower conditional entropy at many moments (the next note after the leading tone is almost certainly the tonic: perhaps 0.5 bits of uncertainty rather than 3.58 bits).

**What this means cognitively:** Tonal grammar acts like a dictionary or grammar for music — it allows the listener to decode the message using a shared code that dramatically reduces the information per symbol. This frees attentional resources for the musically meaningful deviations from the expected: the expressive gestures, the narrative arc, the moments of genuine surprise. Tonal music is "easy" to follow in part because tonality compresses its information load.
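The compression claim can be illustrated numerically. A minimal sketch (the post-leading-tone distribution below is invented for illustration, not estimated from a corpus):

```python
import math

def entropy(probs) -> float:
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# No grammar: all 12 pitch classes equally likely.
print(entropy([1/12] * 12))  # ~3.585 bits per note

# Hypothetical distribution after a leading tone: the tonic dominates,
# and a few alternatives share the remaining probability mass.
after_leading_tone = [0.90, 0.05, 0.03, 0.02]
print(entropy(after_leading_tone))  # ~0.62 bits: the grammar has "compressed" the uncertainty
```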

Question 6. What is Lempel-Ziv complexity and how does it differ from Shannon entropy as a measure of musical complexity?

Reveal Answer

**Lempel-Ziv complexity (LZ complexity):** A measure of how many distinct phrases (non-repeating substrings) a sequence contains. It is computed by the LZ parsing algorithm: reading left to right, each new phrase is the shortest substring that has not appeared earlier. Low LZ complexity = many repeated substrings = the sequence is compressible. High LZ complexity = many new substrings = the sequence is incompressible.

**Differences from Shannon entropy:**

1. **No prior knowledge required:** Shannon entropy requires estimating probability distributions, which requires either a long sequence or prior knowledge of the source. LZ complexity can be computed directly from any sequence of any length.
2. **Captures structural patterns, not just statistics:** LZ complexity measures how much new material the sequence introduces, regardless of whether that material was statistically predictable. Shannon entropy measures average surprise; LZ measures structural novelty.
3. **Algorithmic vs. statistical:** LZ complexity is closer to Kolmogorov complexity (the length of the shortest program generating the sequence) than to Shannon entropy (which measures statistical properties).

For music: LZ complexity captures motivic and thematic structure (repeating motives reduce LZ complexity) while Shannon entropy captures the statistical distribution of pitch classes. Both are useful, and they can diverge: a piece can have low Shannon entropy (predictable pitch statistics) but moderate LZ complexity (many distinct melodic subpatterns).
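A minimal implementation of the parsing idea described above (a simple LZ76-style phrase count; this sketch is quadratic and written for clarity, not efficiency):

```python
def lz_complexity(s: str) -> int:
    """Count phrases in a left-to-right LZ parse: each new phrase is the
    shortest prefix of the remaining text that has not appeared earlier."""
    phrases, i = 0, 0
    while i < len(s):
        length = 1
        # Grow the candidate phrase while it already occurs in the text seen so far.
        while i + length <= len(s) and s[i:i + length] in s[:i + length - 1]:
            length += 1
        phrases += 1
        i += length
    return phrases

print(lz_complexity("abcabcabcabc"))  # repetitive sequence: few phrases (4)
print(lz_complexity("abcdefghijkl"))  # all-new material: many phrases (12)
```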

Question 7. What is "conditional entropy" and why is it more musically informative than "unigram entropy"?

Reveal Answer

**Unigram entropy** treats each note as independent — it computes the entropy of the distribution of pitch classes without regard for what came before. This measures how uniformly distributed the pitch classes are, but ignores the sequential structure of melody.

**Conditional entropy** H(X_n | X_{n-1}, ..., X_{n-k}) measures the entropy of the next note *given* the context of the previous k notes. It answers: "knowing what just happened, how uncertain are we about what comes next?"

**Why it is more musically informative:**

- Music is deeply sequential — notes make sense in context, not in isolation
- Tonal grammar, voice-leading rules, and melodic expectations all operate at the level of conditional probabilities
- Two melodies with identical unigram entropy can have very different conditional entropies: one might have strong sequential patterns (low conditional entropy); the other might be sequentially random (conditional entropy matching its unigram entropy)

Aiko's experiment illustrated this: her composition and Bach's chorale had similar unigram entropies, but very different bigram and trigram entropies. The conditional measures revealed that Bach's music has a richer, more constraining grammar.
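The divergence between the two measures is easy to demonstrate with a toy two-note "melody". A minimal sketch (the probabilities are estimated from observed counts, so they are only as good as the sample):

```python
import math
from collections import Counter

def unigram_entropy(seq) -> float:
    """Entropy of the symbol distribution, ignoring order."""
    counts = Counter(seq)
    n = len(seq)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def bigram_conditional_entropy(seq) -> float:
    """H(X_n | X_{n-1}): average uncertainty about the next symbol given the previous one."""
    pairs = list(zip(seq, seq[1:]))
    context_counts = Counter(ctx for ctx, _ in pairs)
    pair_counts = Counter(pairs)
    total = len(pairs)
    h = 0.0
    for (ctx, _nxt), c in pair_counts.items():
        p_pair = c / total                      # P(context, next)
        p_next_given_ctx = c / context_counts[ctx]  # P(next | context)
        h -= p_pair * math.log2(p_next_given_ctx)
    return h

# A strictly alternating melody: uniform unigram distribution,
# but the next note is fully determined by the previous one.
seq = list("CGCGCGCGCGCG")
print(unigram_entropy(seq))             # 1.0 bit per note
print(bigram_conditional_entropy(seq))  # 0.0 bits: no uncertainty given context
```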

Question 8. The neuroscience of musical expectation involves dopamine release at moments of "tension-release." How does this connect to information theory?

Reveal Answer

**The connection:** Dopamine in the brain is associated with reward, motivation, and learning. In music listening, neuroimaging studies have found that dopamine peaks are largest not at moments of simple pleasure but at moments of **tension-release** — when an expected event occurs after a period of uncertainty.

**Information-theoretic interpretation:**

- During "tension" (the approach to a cadence, a building dominant chord): the listener's prediction model generates a specific expectation (resolution to tonic), but with some uncertainty — the entropy of the expected outcome is above zero because alternatives are possible (a deceptive cadence, etc.)
- At "release" (the resolution occurs): the prediction is confirmed. The information content of the event is low — it was expected. But the *relief* of resolution, the reduction in uncertainty, produces a dopamine release. This is the brain rewarding itself for having maintained a correct prediction model.

The dopamine is not released because the event was surprising (high information); it is released because the tension (held uncertainty) is resolved. The reward is for successful prediction under uncertainty.

Great music, in this framework, maximizes the duration and intensity of tension while maintaining the listener's belief that resolution is possible. The longer the tension is held, the greater the reward when it finally releases. This is why deferred resolution (a long dominant pedal before the tonic arrives) is so emotionally powerful.

Question 9. What did Claude Shannon mean when he described information theory as applying to the "statistical" properties of messages, not their semantic content?

Reveal Answer

Shannon explicitly stated that information theory applies to the *statistical* structure of messages — the probabilities of different symbols and symbol sequences — and makes no claims about the *meaning* or *semantic content* of those messages. For Shannon, a message about the weather and a random sequence of symbols are both "messages"; their information content is determined entirely by how predictable they are, not by what they mean. The sentence "It will rain tomorrow" may have the same Shannon information content as "The wok sprang algebra" if both are equally likely in their respective contexts, even though one is meaningful and the other nonsensical.

**For music, this means:**

- Information theory can tell us how predictable each note is given its context, but it cannot tell us what the music *means* emotionally, culturally, or expressively
- A maximally random sequence of pitches has maximum Shannon entropy but conveys no musical meaning
- A Bach chorale has lower Shannon entropy but immensely more musical meaning

Shannon information theory captures one important dimension of music (the statistical structure of pitch sequences) while being entirely silent on others (meaning, expression, beauty, cultural significance). This is the fundamental limitation that the chapter's thought experiment about *4'33"* illustrates: the information content of silence is zero by Shannon's measure, but the meaning of *4'33"* is enormous.

Question 10. What is "Kolmogorov complexity" and how does it relate to musical creativity?

Reveal Answer

**Kolmogorov complexity** K(x) of a string x is the length of the shortest computer program that generates x. It is a measure of the string's "algorithmic information content" — how much information is needed to describe it completely.

- Low K(x): the string can be described compactly, because it has a pattern (e.g., "repeat ABCDEF a thousand times")
- High K(x): the string can only be described by writing it out in full — no pattern allows compression

**Relation to musical creativity:** Musical creativity, in this framework, can be understood as finding **low Kolmogorov complexity rules that generate high Shannon entropy output**: discovering simple, elegant generative systems (compact rules) that produce music that is locally surprising and difficult to predict (high conditional entropy).

This is what Schoenberg did: the twelve-tone system is a very compact rule set (the tone row and four symmetry operations) that generates music with high surface entropy (the output is unpredictable from conventional tonal expectations). The system is elegant (low K); the music is complex (high H).

By contrast:

- Pure randomness: high Shannon entropy AND high Kolmogorov complexity (no short description)
- Pure repetition: low Shannon entropy AND low Kolmogorov complexity

The creative "sweet spot" is low K (elegant rules) producing high H (surprising output): maximum impact for minimum generative complexity. This is why mathematical structures like the tone row, sonata form, and the fugue are such powerful creative tools — they are compact generative rules with rich outputs.
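Kolmogorov complexity is uncomputable, but a general-purpose compressor gives a practical upper bound. A minimal sketch using Python's zlib, with compressed length standing in for K(x):

```python
import random
import zlib

def compressed_size(s: str) -> int:
    """zlib-compressed length in bytes: a crude, computable proxy for K(x)."""
    return len(zlib.compress(s.encode("utf-8"), 9))

patterned = "ABCDEF" * 1000  # "repeat ABCDEF a thousand times": low K
random.seed(0)
scrambled = "".join(random.choice("ABCDEF") for _ in range(6000))  # no pattern: high K

print(compressed_size(patterned))  # tiny: the pattern compresses away
print(compressed_size(scrambled))  # far larger: no short description exists
```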

Question 11. How does the entropy measure differ for Western tonal music and Indian raga, and what does this comparison suggest about cross-cultural information theory?

Reveal Answer

**Western tonal music:** The grammar consists of key, scale, and harmonic function. Within a key, the conditional entropy is reduced by knowledge of these rules. The grammar constrains pitch sequences but is relatively general — many melodic contours are possible within a key.

**Indian raga:** The raga specifies not only the scale (which notes are used) but characteristic melodic phrases (*pakad*), ornamentation patterns (*gamaka*), permitted and forbidden ascending/descending motion, time of day or season for performance, and emotional character (*rasa*). This is a much more specific prior model. A listener who knows the raga has a much more constrained prediction model, and the conditional entropy within the raga grammar is potentially even lower than within Western tonal grammar.

**What this suggests:**

1. **Musical information is relational:** The entropy of a piece is not absolute but depends on the listener's model. Raga music has very low entropy for a trained Hindustani listener; it has high entropy for a Western listener with no knowledge of the raga.
2. **Different systems, different compression schemes:** Both Western tonality and raga are information compression schemes, but they compress along different dimensions. Neither is "objectively" more efficient; each is efficient for listeners who have internalized its grammar.
3. **Cultural learning as grammar acquisition:** Learning to appreciate any musical tradition is, in part, the acquisition of a grammar that allows you to build prediction models — thereby reducing the entropy of the music and enabling sophisticated appreciation of musically meaningful deviations.

Question 12. What does Aiko conclude about the difference between "avoiding conventional predictability" and "achieving structural depth"?

Reveal Answer

Aiko's key conclusion (section 18.7): these are **not the same thing**, and she had been confusing them.

**Avoiding conventional predictability** means writing music that does not follow the expectations of the tonal (or other established) grammar. This produces high conditional entropy *relative to the conventional grammar*. Her composition was doing this: it was unpredictable by tonal standards.

**Achieving structural depth** means following a sophisticated, consistent set of rules that makes the music's local moves predictable within that system, even if the system itself is novel. Bach achieved this: his music has low conditional entropy within the tonal-contrapuntal system, because the system is richly constraining and consistently applied.

**The implication:** Avoiding conventional predictability can produce either structural depth (if you're working within a sophisticated alternative grammar) or structural shallowness (if you're just being unpredictable by avoiding all grammars). Aiko's music was the second type: it avoided tonal predictability without establishing an alternative grammar, so it was informationally random — close to noise.

Her response: "My job is to make the grammar worth learning." The goal is not to be unpredictable per se, but to define a grammar — a consistent set of rules — and then work within it so deeply that the music is predictable by *that* grammar while remaining surprising by conventional standards. Schoenberg's twelve-tone system is the paradigmatic example of this.

Question 13. What does the Spotify Spectral Dataset analysis suggest about the relationship between musical entropy and listener personality?

Reveal Answer

The Spotify-scale analysis (section 18.8) found that:

- Different genres have characteristic entropy profiles (jazz and experimental music have higher pitch and harmonic entropy; pop and EDM have lower entropy; classical falls in between)
- Listener preferences for these genres correlate with personality traits measured by the Big Five personality model
- Specifically: listeners who prefer high-entropy genres (experimental, jazz) tend to score higher on "openness to experience"; listeners who prefer low-entropy genres (pop, EDM) tend to score higher on "conscientiousness" and lower on "openness"

**What this suggests:** There may be individual differences in how the brain processes uncertainty and prediction error. "Open to experience" individuals may find high-prediction-error environments rewarding — they seek out novelty and uncertainty. "Conscientious" individuals may find confirmed expectations more rewarding than violated ones. If this is true, music serves different cognitive needs for different people: high-entropy music satisfies the need for novelty and cognitive challenge; low-entropy music satisfies the need for predictability and comfort.

**Important caution:** The correlations are statistically significant but modest in effect size. Personality does not determine music preference; many other factors (cultural background, social identity, exposure, peer influence) are likely more important. The finding is suggestive, not definitive.

Question 14. What is the "information content of silence" according to Shannon, and how does John Cage's 4'33" challenge this measure?

Reveal Answer

**Shannon's measure:** Silence (no sound) in a musical context has near-zero information content, because silence is usually expected to continue until the performer plays. At any given moment in a performance, the probability of continued silence is close to 1 (for the first few seconds) or close to 0 (if the performer has been silent for a long time and is expected to play). Either way, the information content is low once the initial decision to be silent is encoded.

**How *4'33"* challenges this:**

1. **The frame transforms meaning:** Cage designated silence as music, placing it in a concert frame that creates the expectation of sound. This dramatically changes the listener's model — they expect sound but receive silence — making each additional second of silence highly informative (high prediction error). Shannon's measure would give the same value whether the silence is designated as music or not; but the *experience* is completely different because of the frame.
2. **Ambient sounds as information source:** Cage intended the ambient sounds to be the music. These environmental sounds are high-entropy (unpredictable), so the "music" of *4'33"* is, in Shannon's terms, maximum-entropy music — pure, unstructured information.
3. **Meaning beyond statistics:** The piece's most important information lies in its *conceptual* content — the statement that any sound can be music, that silence is a musical material, that the boundary between music and non-music is a cultural construct. This conceptual information is not captured by any statistical measure of the pitch sequence.

*4'33"* is thus a perfect demonstration of the limits of Shannon's framework: it is simultaneously the highest-entropy music imaginable (if we count the ambient noise) and a piece of profound meaning that cannot be captured by any probability distribution.

Question 15. Why does the chapter argue that musical education changes the information content of music for the listener — and is this a paradox?

Reveal Answer

**How education changes information content:** Musical training gives listeners more accurate prediction models. A trained listener can predict more notes correctly (has more accurate conditional probability estimates) and therefore experiences lower information content per note than an untrained listener hearing the same music.

**The apparent paradox:** If education reduces information content, then educated listeners receive *less* information per note. But we generally think educated listeners "get more" out of music — they appreciate more, hear more, understand more. How can receiving less information per note correspond to getting more out of the music?

**Resolution:**

1. **Lower entropy enables higher-level processing:** When basic pitch prediction is low-effort (the grammar has been internalized), cognitive resources are freed for higher-level processing: tracking multiple voices, appreciating form, noticing subtle expressive details, catching intertextual references.
2. **Sensitivity to meaningful deviations:** A listener with an accurate model is more sensitive to *deviations* from that model — deceptive cadences, unusual harmonies, subtle rhythmic displacements. The same events carry more information for the trained listener *relative to their expectations*, because their expectations are more precisely defined.
3. **Appreciation vs. information:** "Appreciating more" is not the same as "receiving more Shannon information." Appreciation involves meaning, recognition, contextual understanding — all of which go beyond statistical predictability.

So the apparent paradox resolves: education reduces the per-note information content while increasing the richness and depth of musical understanding. This is possible because information theory captures only one dimension of musical experience.

Question 16. What is Markov chain modeling of chord progressions, and what does it reveal about musical grammar?

Reveal Answer

**Markov chain modeling of chord progressions:** Treats the sequence of chords in a tonal piece as a probabilistic process where the probability of the next chord depends only on the current chord (or, in higher-order models, on the last k chords). The model is specified by a transition matrix: P(next chord | current chord). For common-practice Western music, transition probabilities include:

- P(I after V) is high (0.4–0.6) — the dominant-to-tonic resolution is frequent
- P(vi after V) is lower (the deceptive cadence) but non-zero
- P(vi# after I) is very low — unusual harmonic motion

**What this reveals about musical grammar:**

1. **Tonal grammar has memory:** The Markov model captures how the current harmonic context constrains what comes next — the "grammar" of chord progression.
2. **Context reduces entropy:** For most chord types, knowing the current chord substantially reduces the entropy of the next chord compared to the uniform prior. This is the compression mechanism of tonality.
3. **Deviations are informative:** Events with low transition probability (deceptive cadences, unusual harmonies) carry more information and produce stronger perceptual responses.
4. **Different styles have different matrices:** Baroque, Classical, Romantic, jazz, and pop all have different transition probability matrices — different grammars. Comparing matrices reveals how musical style has evolved.
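A minimal sketch of such a model in Python. The transition probabilities below are invented for illustration (loosely echoing the ranges above) and are not corpus estimates:

```python
import math

# Hypothetical first-order transition probabilities for a tonal style.
transitions = {
    "V": {"I": 0.55, "vi": 0.15, "V": 0.10, "IV": 0.10, "ii": 0.10},
    "I": {"IV": 0.30, "V": 0.25, "vi": 0.20, "ii": 0.15, "I": 0.10},
}

def conditional_entropy_given(chord: str) -> float:
    """Entropy (bits) of the next chord, given the current one."""
    return -sum(p * math.log2(p) for p in transitions[chord].values() if p > 0)

# Knowing we are on V sharply constrains what comes next;
# compare with a uniform prior over, say, 7 diatonic chords: log2(7) ~ 2.81 bits.
print(conditional_entropy_given("V"))
print(math.log2(7))
```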

Question 17. The chapter compares musical grammar to linguistic grammar as information reduction tools. What are the similarities and what are the key differences?

Reveal Answer

**Similarities:**

- Both are **grammar systems** that constrain what can follow what — linguistic grammar constrains word sequences; tonal grammar constrains note sequences
- Both are **learned through immersion**: children acquire language grammar and musical grammar (in their culture) through exposure, without explicit instruction, during a developmental window
- Both enable **efficient communication**: the grammar reduces entropy, lowering the cognitive load of processing the sequence
- Both produce **violation sensitivity**: grammatically unexpected events (semantically anomalous sentences, harmonically unexpected chords) produce measurable neural responses (N400 for language, ERAN for music)

**Key differences:**

- **Semantic content:** Linguistic sentences have propositional meaning (they can be true or false; they describe states of affairs). Musical sequences do not have propositional meaning in the same sense — the "meaning" of a chord progression is expressive and emotional, not propositional
- **Universal grammar:** There is debate about whether human language has a universal deep grammar (Chomsky's Universal Grammar hypothesis). Musical grammar is more clearly culturally variable — different cultures have different musical grammars, and these do not appear to share the same universal deep structure that some claim for language
- **Multiple simultaneous voices:** Tonal grammar must handle multiple simultaneous pitches (chords, counterpoint), while linguistic grammar is primarily sequential
- **Iconicity:** Some musical structures have quasi-iconic relationships to what they depict (rising pitch = rising emotion, slow tempo = sadness). Language is primarily arbitrary (the word "sad" does not sound sad)

Question 18. Why does the chapter say that information theory is "necessary but not sufficient" for understanding music?

Reveal Answer

**Necessary:** Information theory captures a real and important dimension of musical structure — the statistical predictability of pitch sequences, the role of expectation and violation, the compression efficiency of tonal grammar. These are genuine properties of music that have measurable effects on musical experience, neurological processing, and emotional response. Any complete theory of music must be consistent with information-theoretic findings.

**Not sufficient:** Information theory is silent about several dimensions of musical experience that are musically central:

1. **Semantic meaning:** Music conveys emotional and expressive states that are not reducible to statistical surprise. A melody can be sad or joyful; information theory assigns no valence to the direction or character of expectation violations.
2. **Cultural meaning:** Music is embedded in cultural contexts that give it significance beyond statistical structure. The same pitch sequence can be a national anthem (laden with cultural meaning) or background music (culturally neutral). Information theory cannot distinguish these.
3. **Performance variation:** Two performances of the same score have identical information content in Shannon's sense, yet may be experienced as radically different in quality and expression. Information theory cannot capture what differentiates great performances from mediocre ones.
4. **Aesthetic beauty:** Some music is beautiful; other music, with similar statistical properties, is not. The concept of beauty eludes information theory entirely.
5. **The frame problem** (*4'33"*): Information theory cannot capture information that lies in intention, context, and conceptual framing rather than in the statistical sequence of sounds.

The chapter's position: information theory is one of several necessary theoretical frameworks for understanding music, and it is more foundationally important than most people realize, but it is not sufficient by itself for a complete account.

Question 19. How does the concept of "redundancy" in information theory apply to music, and why might some redundancy be necessary rather than wasteful?

Reveal Answer

**Redundancy in information theory:** R = 1 - H/H_max. A source with H = H_max (maximum entropy) has zero redundancy — every symbol carries maximum information. A source with H = 0 has R = 1 — complete redundancy: every symbol is fully predictable, and no information is carried.

**Redundancy in music:** Musical redundancy includes: repetition of themes, motives, and phrases; use of conventional harmonic progressions that are highly predictable; metric regularity; use of a limited pitch set (scale/key). All of these reduce entropy below the maximum.

**Why redundancy is necessary rather than wasteful:**

1. **Error correction:** Shannon's noisy-channel coding theorem shows that some redundancy is necessary to transmit information reliably over a noisy channel. In music, "noise" includes poor acoustics, listener inattention, background sounds, and unfamiliarity with the style. Redundancy (repetition of themes, use of predictable patterns) allows the listener to reconstruct the music even when some of it is missed. A completely non-redundant (maximum entropy) musical signal would be catastrophically disrupted by any attention lapse or acoustic interference.
2. **Pattern recognition:** The brain recognizes and tracks musical patterns using redundancy — recurring themes, motives, and harmonic patterns anchor the listener's model of the piece. Without any redundancy, there is nothing to recognize, no basis for expectation, no structure to follow.
3. **Emotional payoff:** The satisfaction of resolution (very low entropy, very redundant) requires that the expectation was built and sustained (moderate entropy). The redundancy of the resolution is not wasteful — it is the payoff for the sustained expectation. Without the low-entropy resolution, the higher-entropy tension has no release.
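The redundancy formula can be computed from observed symbol frequencies. A minimal unigram-level sketch (sequential redundancy, such as repeated phrases, would require conditional entropy instead):

```python
import math
from collections import Counter

def redundancy(seq: str) -> float:
    """R = 1 - H/H_max, with H_max = log2(alphabet size) over the observed symbols."""
    counts = Counter(seq)
    n = len(seq)
    h = -sum(c / n * math.log2(c / n) for c in counts.values())
    h_max = math.log2(len(counts))
    return 1 - h / h_max

print(redundancy("ABABABAB"))  # 0.0: uniform unigram distribution (order is ignored here)
print(redundancy("AAAAAAAB"))  # ~0.46: a skewed distribution is partly redundant
```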

Question 20. How should we interpret the finding that pop music has lower harmonic entropy than jazz? Does this make pop music "worse" than jazz?

Reveal Answer

**The finding:** Pop music (particularly mainstream commercial pop) has lower harmonic entropy than jazz — its chord progressions are more predictable. The I-V-vi-IV progression (used in hundreds of popular songs) has near-zero conditional entropy after the first chord: once you know the style and the first chord, you can predict the rest.

**Does this make pop music "worse"?** No — and the reasons for this "no" are important:

1. **Information content and aesthetic value are not the same.** Shannon entropy measures statistical predictability, not beauty, significance, or emotional impact. A maximally random sequence of chords has maximum entropy; it is not great music.
2. **Low entropy can be mastery.** A pop songwriter who has internalized the I-V-vi-IV grammar and uses it with precise control of production, timbre, vocal delivery, and lyrical imagery may achieve profound aesthetic results within a low-entropy harmonic framework. The constraint focuses creativity on other dimensions.
3. **Different genres optimize different dimensions.** Jazz optimizes for harmonic surprise and improvisational spontaneity. Pop optimizes for immediate accessibility, emotional directness, and commercial memorability. These are different aesthetic goals, not different positions on a single scale of quality.
4. **Audience matters.** Low harmonic entropy in pop music enables immediate enjoyment by listeners who have not internalized jazz harmony. This accessibility is a feature, not a failure.
5. **The cultural dimension.** Calling pop music "worse" based on entropy would be a category error: it would impose the values of one tradition (jazz, which prizes harmonic complexity) on a tradition with different values. A complete aesthetic evaluation must be culturally and contextually sensitive.