Code Lab: Musical Entropy Calculator
Claude Shannon's information theory gives us a precise way to measure the unpredictability of a message. When applied to music, entropy quantifies how surprising a melody is: a highly repetitive sequence has low entropy, while a uniformly random sequence approaches the maximum. In this lab, you will compute Shannon entropy for musical note sequences, build Markov chain models of melodic transitions, and compare the information content of different musical styles.
Shannon Entropy of a Note Sequence
The Shannon entropy of a discrete source is defined as $H = -\sum p_i \log_2 p_i$, where $p_i$ is the probability of symbol $i$. For a melody, each symbol is a note (or pitch class).
import numpy as np
import matplotlib.pyplot as plt
from collections import Counter
def shannon_entropy(sequence):
    """Compute Shannon entropy in bits for a discrete sequence."""
    counts = Counter(sequence)
    total = len(sequence)
    probs = np.array([c / total for c in counts.values()])
    return -np.sum(probs * np.log2(probs))
# Example: a simple melody (MIDI note numbers)
melody = [60, 62, 64, 65, 67, 65, 64, 62, 60, 60, 62, 64, 62, 60, 60]
H = shannon_entropy(melody)
print(f"Melody: {melody}")
print(f"Shannon entropy: {H:.3f} bits")
print(f"Max possible entropy (log2 of {len(set(melody))} symbols): "
      f"{np.log2(len(set(melody))):.3f} bits")
The ratio of measured entropy to maximum possible entropy tells us how efficiently the melody uses its available pitch vocabulary.
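For example, for the melody above (restating shannon_entropy so the snippet runs on its own; "efficiency" is simply our name for this ratio):

```python
import numpy as np
from collections import Counter

def shannon_entropy(sequence):
    """Shannon entropy in bits (restated from above)."""
    counts = Counter(sequence)
    probs = np.array([c / len(sequence) for c in counts.values()])
    return -np.sum(probs * np.log2(probs))

melody = [60, 62, 64, 65, 67, 65, 64, 62, 60, 60, 62, 64, 62, 60, 60]
H = shannon_entropy(melody)
H_max = np.log2(len(set(melody)))  # 5 distinct pitches
efficiency = H / H_max
print(f"Efficiency: {H:.3f} / {H_max:.3f} = {efficiency:.3f}")
# An efficiency near 1 means the melody spreads its pitches almost uniformly.
```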
Note Distribution Analysis
Visualizing the frequency distribution of pitches reveals the structural biases in a melody at a glance.
def plot_distribution(sequence, title="Note Distribution"):
    """Plot the probability distribution of symbols in a sequence."""
    counts = Counter(sequence)
    labels = sorted(counts.keys())
    freqs = [counts[note] / len(sequence) for note in labels]
    plt.figure(figsize=(8, 3))
    plt.bar(range(len(labels)), freqs, tick_label=labels)
    plt.xlabel("Note (MIDI number)")
    plt.ylabel("Probability")
    plt.title(f"{title} | H = {shannon_entropy(sequence):.3f} bits")
    plt.tight_layout()
    plt.show()
# Generate three contrasting sequences
np.random.seed(42)
random_melody = np.random.choice([60, 62, 64, 65, 67, 69, 71], size=200)
structured_melody = np.tile([60, 64, 67, 64], 50) # arpeggiated triad
jazz_melody = np.random.choice(
    [60, 62, 63, 65, 67, 69, 70, 72],
    size=200,
    p=[0.18, 0.08, 0.12, 0.10, 0.18, 0.08, 0.14, 0.12]
)
for seq, label in [(random_melody, "Random"),
                   (structured_melody, "Highly Structured (Arpeggio)"),
                   (jazz_melody, "Jazz-like (Blues Scale, Uneven Weights)")]:
    plot_distribution(seq, title=label)
Notice how the random melody has entropy close to the theoretical maximum, the arpeggio has low entropy because it uses only three pitches, and the jazz-like sequence falls in between with its uneven but non-trivial distribution.
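These observations are easy to check numerically. The sketch below restates shannon_entropy and regenerates the three sequences with the same seed and draw order as above, so the printed values match the plot titles:

```python
import numpy as np
from collections import Counter

def shannon_entropy(sequence):
    """Shannon entropy in bits (restated from above)."""
    counts = Counter(sequence)
    probs = np.array([c / len(sequence) for c in counts.values()])
    return -np.sum(probs * np.log2(probs))

np.random.seed(42)
random_melody = np.random.choice([60, 62, 64, 65, 67, 69, 71], size=200)
structured_melody = np.tile([60, 64, 67, 64], 50)
jazz_melody = np.random.choice(
    [60, 62, 63, 65, 67, 69, 70, 72], size=200,
    p=[0.18, 0.08, 0.12, 0.10, 0.18, 0.08, 0.14, 0.12])

for seq, label in [(random_melody, "Random"),
                   (structured_melody, "Arpeggio"),
                   (jazz_melody, "Jazz-like")]:
    H, H_max = shannon_entropy(seq), np.log2(len(set(seq)))
    print(f"{label:10s} H = {H:.3f} bits (max {H_max:.3f})")
```

The arpeggio's entropy is exactly 1.5 bits: its three pitches occur with probabilities 0.25, 0.5, and 0.25, and $-(2 \times 0.25 \times \log_2 0.25 + 0.5 \times \log_2 0.5) = 1.5$.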
Transition Probabilities and Markov Chains
First-order entropy treats notes independently. A Markov chain model captures how notes relate to their predecessors, giving us the conditional entropy, which is always less than or equal to the first-order entropy.
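Written out: if $p_i$ is the empirical probability of note $i$ and $T_{ij}$ is the probability that note $i$ is followed by note $j$, the conditional entropy is $H_{\text{cond}} = -\sum_i p_i \sum_j T_{ij} \log_2 T_{ij}$, i.e. the entropy of each row of the transition matrix, averaged with weights $p_i$. This is exactly the quantity the markov_entropy function computes.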
def transition_matrix(sequence, states=None):
    """Compute a first-order Markov transition matrix."""
    if states is None:
        states = sorted(set(sequence))
    state_idx = {s: i for i, s in enumerate(states)}
    n = len(states)
    T = np.zeros((n, n))
    for a, b in zip(sequence[:-1], sequence[1:]):
        T[state_idx[a], state_idx[b]] += 1
    row_sums = T.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1  # avoid division by zero
    T /= row_sums
    return T, states
def markov_entropy(sequence):
    """Compute the conditional (Markov) entropy of a sequence."""
    T, states = transition_matrix(sequence)
    state_counts = Counter(sequence)
    total = sum(state_counts.values())
    H = 0.0
    for i, s in enumerate(states):
        p_state = state_counts[s] / total
        row = T[i]
        row_entropy = -np.sum(row[row > 0] * np.log2(row[row > 0]))
        H += p_state * row_entropy
    return H
for seq, label in [(random_melody, "Random"),
                   (structured_melody, "Structured"),
                   (jazz_melody, "Jazz-like")]:
    H1 = shannon_entropy(seq)
    Hm = markov_entropy(seq)
    print(f"{label:25s} H1 = {H1:.3f} bits H_markov = {Hm:.3f} bits "
          f"redundancy = {H1 - Hm:.3f} bits")
The gap between first-order and Markov entropy measures the sequential redundancy of the melody: how much information the previous note gives you about the next one.
# Visualize the transition matrix for the jazz-like melody
T, states = transition_matrix(jazz_melody)
plt.figure(figsize=(6, 5))
plt.imshow(T, cmap="Blues", vmin=0, vmax=0.5)
plt.colorbar(label="Transition probability")
plt.xticks(range(len(states)), states)
plt.yticks(range(len(states)), states)
plt.xlabel("Next note")
plt.ylabel("Current note")
plt.title("Markov Transition Matrix (Jazz-like Melody)")
plt.tight_layout()
plt.show()
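A fitted transition matrix can also be used generatively: walking it as a random process produces new melodies with the same local transition statistics. The sketch below does this for a jazz-like melody. Note that sample_markov is a hypothetical helper introduced here (not part of the lab code above), and the melody is regenerated with a fresh generator, so its exact notes differ from the earlier run, which drew random_melody first.

```python
import numpy as np

def transition_matrix(sequence, states=None):
    """First-order Markov transition matrix (restated from above)."""
    if states is None:
        states = sorted(set(sequence))
    idx = {s: i for i, s in enumerate(states)}
    T = np.zeros((len(states), len(states)))
    for a, b in zip(sequence[:-1], sequence[1:]):
        T[idx[a], idx[b]] += 1
    row_sums = T.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1  # avoid division by zero
    return T / row_sums, states

def sample_markov(T, states, length=100, seed=0):
    """Generate a note sequence by walking the transition matrix.
    (Hypothetical helper, not defined in the lab above.)"""
    rng = np.random.default_rng(seed)
    current = rng.integers(len(states))
    notes = [states[current]]
    for _ in range(length - 1):
        current = rng.choice(len(states), p=T[current])
        notes.append(states[current])
    return notes

# Regenerate a jazz-like melody with its own generator.
rng0 = np.random.default_rng(42)
jazz_melody = rng0.choice(
    [60, 62, 63, 65, 67, 69, 70, 72], size=200,
    p=[0.18, 0.08, 0.12, 0.10, 0.18, 0.08, 0.14, 0.12])
T, states = transition_matrix(jazz_melody)
generated = sample_markov(T, states, length=100)
print(generated[:16])
```

Because the walk only looks one note back, the generated sequence should have roughly the same Markov entropy as the original, even though it is a brand-new melody.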
Entropy Rate Over Time
We can track how entropy evolves as a melody unfolds, revealing sections of high predictability (verses, ostinatos) and high surprise (improvisations, modulations).
def windowed_entropy(sequence, window_size=32, step=4):
    """Compute entropy in a sliding window across the sequence."""
    positions, entropies = [], []
    for start in range(0, len(sequence) - window_size + 1, step):
        window = sequence[start : start + window_size]
        positions.append(start + window_size // 2)
        entropies.append(shannon_entropy(window))
    return np.array(positions), np.array(entropies)
composite = np.concatenate([structured_melody, jazz_melody, random_melody])
pos, ent = windowed_entropy(composite, window_size=40, step=4)
plt.figure(figsize=(10, 3))
plt.plot(pos, ent)
plt.axvline(200, color="gray", linestyle="--", alpha=0.6, label="Style boundary")
plt.axvline(400, color="gray", linestyle="--", alpha=0.6)
plt.xlabel("Position in sequence")
plt.ylabel("Entropy (bits)")
plt.title("Windowed Entropy Across Three Musical Styles")
plt.legend()
plt.tight_layout()
plt.show()
The plot shows three plateaus corresponding to the structured, jazz-like, and random sections, suggesting that windowed entropy can serve as a coarse fingerprint of musical style.
Try It Yourself
- Real melody analysis. Encode a well-known melody (such as "Twinkle Twinkle Little Star" or "Happy Birthday") as a list of MIDI note numbers. Compute its first-order and Markov entropy. How does it compare to the synthetic sequences above? What does the transition matrix reveal about the melody's structure?
- Higher-order Markov models. Extend the markov_entropy function to compute second-order conditional entropy, where the probability of the next note depends on the previous two notes. How much additional redundancy does the second-order model capture for each of the three styles?
- Entropy and perception. Generate ten random melodies with entropies ranging from 0.5 bits to the maximum. If you have a MIDI playback library (e.g., mido or pretty_midi), listen to each one. At roughly what entropy level does a melody begin to sound "random" rather than "structured"? Write a brief paragraph relating your observation to information-theoretic concepts.