Code Lab: Musical Entropy Calculator
Claude Shannon's information theory gives us a precise way to measure the unpredictability of a message. When applied to music, entropy quantifies how surprising a melody is: a highly repetitive sequence has low entropy, while a uniformly random sequence approaches the maximum. In this lab, you will compute Shannon entropy for musical note sequences, build Markov chain models of melodic transitions, and compare the information content of different musical styles.
Shannon Entropy of a Note Sequence
The Shannon entropy of a discrete source is defined as $H = -\sum p_i \log_2 p_i$, where $p_i$ is the probability of symbol $i$. For a melody, each symbol is a note (or pitch class).
import numpy as np
import matplotlib.pyplot as plt
from collections import Counter
def shannon_entropy(sequence):
    """Compute Shannon entropy in bits for a discrete sequence."""
    counts = Counter(sequence)
    total = len(sequence)
    probs = np.array([c / total for c in counts.values()])
    return -np.sum(probs * np.log2(probs))
# Example: a simple melody (MIDI note numbers)
melody = [60, 62, 64, 65, 67, 65, 64, 62, 60, 60, 62, 64, 62, 60, 60]
H = shannon_entropy(melody)
print(f"Melody: {melody}")
print(f"Shannon entropy: {H:.3f} bits")
print(f"Max possible entropy (log2 of {len(set(melody))} symbols): "
      f"{np.log2(len(set(melody))):.3f} bits")
The ratio of measured entropy to maximum possible entropy tells us how efficiently the melody uses its available pitch vocabulary.
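For example, for the melody above (restating shannon_entropy so the snippet runs on its own; "efficiency" is simply our name for this ratio):

```python
import numpy as np
from collections import Counter

def shannon_entropy(sequence):
    """Shannon entropy in bits (restated from above)."""
    counts = Counter(sequence)
    probs = np.array([c / len(sequence) for c in counts.values()])
    return -np.sum(probs * np.log2(probs))

melody = [60, 62, 64, 65, 67, 65, 64, 62, 60, 60, 62, 64, 62, 60, 60]
H = shannon_entropy(melody)
H_max = np.log2(len(set(melody)))  # 5 distinct pitches
efficiency = H / H_max
print(f"Efficiency: {H:.3f} / {H_max:.3f} = {efficiency:.3f}")
# An efficiency near 1 means the melody spreads its pitches almost uniformly.
```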
Note Distribution Analysis
Visualizing the frequency distribution of pitches reveals the structural biases in a melody at a glance.
def plot_distribution(sequence, title="Note Distribution"):
    """Plot the probability distribution of symbols in a sequence."""
    counts = Counter(sequence)
    labels = sorted(counts.keys())
    freqs = [counts[note] / len(sequence) for note in labels]
    plt.figure(figsize=(8, 3))
    plt.bar(range(len(labels)), freqs, tick_label=labels)
    plt.xlabel("Note (MIDI number)")
    plt.ylabel("Probability")
    plt.title(f"{title} | H = {shannon_entropy(sequence):.3f} bits")
    plt.tight_layout()
    plt.show()
# Generate three contrasting sequences
np.random.seed(42)
random_melody = np.random.choice([60, 62, 64, 65, 67, 69, 71], size=200)
structured_melody = np.tile([60, 64, 67, 64], 50) # arpeggiated triad
jazz_melody = np.random.choice(
    [60, 62, 63, 65, 67, 69, 70, 72],
    size=200,
    p=[0.18, 0.08, 0.12, 0.10, 0.18, 0.08, 0.14, 0.12]
)
for seq, label in [(random_melody, "Random"),
                   (structured_melody, "Highly Structured (Arpeggio)"),
                   (jazz_melody, "Jazz-like (Blues Scale, Uneven Weights)")]:
    plot_distribution(seq, title=label)
Notice how the random melody has entropy close to the theoretical maximum, the arpeggio has low entropy because it uses only three pitches, and the jazz-like sequence falls in between with its uneven but non-trivial distribution.
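These observations are easy to check numerically. The sketch below restates shannon_entropy and regenerates the three sequences with the same seed and draw order as above, so the printed values match the plot titles:

```python
import numpy as np
from collections import Counter

def shannon_entropy(sequence):
    """Shannon entropy in bits (restated from above)."""
    counts = Counter(sequence)
    probs = np.array([c / len(sequence) for c in counts.values()])
    return -np.sum(probs * np.log2(probs))

np.random.seed(42)
random_melody = np.random.choice([60, 62, 64, 65, 67, 69, 71], size=200)
structured_melody = np.tile([60, 64, 67, 64], 50)
jazz_melody = np.random.choice(
    [60, 62, 63, 65, 67, 69, 70, 72], size=200,
    p=[0.18, 0.08, 0.12, 0.10, 0.18, 0.08, 0.14, 0.12])

for seq, label in [(random_melody, "Random"),
                   (structured_melody, "Arpeggio"),
                   (jazz_melody, "Jazz-like")]:
    H, H_max = shannon_entropy(seq), np.log2(len(set(seq)))
    print(f"{label:10s} H = {H:.3f} bits (max {H_max:.3f})")
```

The arpeggio's entropy is exactly 1.5 bits: its three pitches occur with probabilities 0.25, 0.5, and 0.25, and $-(2 \times 0.25 \times \log_2 0.25 + 0.5 \times \log_2 0.5) = 1.5$.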
Transition Probabilities and Markov Chains
First-order entropy treats notes independently. A Markov chain model captures how notes relate to their predecessors, giving us the conditional entropy, which is always less than or equal to the first-order entropy.
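Written out: if $p_i$ is the empirical probability of note $i$ and $T_{ij}$ is the probability that note $i$ is followed by note $j$, the conditional entropy is $H_{\text{cond}} = -\sum_i p_i \sum_j T_{ij} \log_2 T_{ij}$, i.e. the entropy of each row of the transition matrix, averaged with weights $p_i$. This is exactly the quantity the markov_entropy function computes.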
def transition_matrix(sequence, states=None):
    """Compute a first-order Markov transition matrix."""
    if states is None:
        states = sorted(set(sequence))
    state_idx = {s: i for i, s in enumerate(states)}
    n = len(states)
    T = np.zeros((n, n))
    for a, b in zip(sequence[:-1], sequence[1:]):
        T[state_idx[a], state_idx[b]] += 1
    row_sums = T.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1  # avoid division by zero
    T /= row_sums
    return T, states
def markov_entropy(sequence):
    """Compute the conditional (Markov) entropy of a sequence."""
    T, states = transition_matrix(sequence)
    state_counts = Counter(sequence)
    total = sum(state_counts.values())
    H = 0.0
    for i, s in enumerate(states):
        p_state = state_counts[s] / total
        row = T[i]
        row_entropy = -np.sum(row[row > 0] * np.log2(row[row > 0]))
        H += p_state * row_entropy
    return H
for seq, label in [(random_melody, "Random"),
                   (structured_melody, "Structured"),
                   (jazz_melody, "Jazz-like")]:
    H1 = shannon_entropy(seq)
    Hm = markov_entropy(seq)
    print(f"{label:25s} H1 = {H1:.3f} bits H_markov = {Hm:.3f} bits "
          f"redundancy = {H1 - Hm:.3f} bits")
The gap between first-order and Markov entropy measures the sequential redundancy of the melody: how much information the previous note gives you about the next one.
# Visualize the transition matrix for the jazz-like melody
T, states = transition_matrix(jazz_melody)
plt.figure(figsize=(6, 5))
plt.imshow(T, cmap="Blues", vmin=0, vmax=0.5)
plt.colorbar(label="Transition probability")
plt.xticks(range(len(states)), states)
plt.yticks(range(len(states)), states)
plt.xlabel("Next note")
plt.ylabel("Current note")
plt.title("Markov Transition Matrix (Jazz-like Melody)")
plt.tight_layout()
plt.show()
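A fitted transition matrix can also be used generatively: walking it as a random process produces new melodies with the same local transition statistics. The sketch below does this for a jazz-like melody. Note that sample_markov is a hypothetical helper introduced here (not part of the lab code above), and the melody is regenerated with a fresh generator, so its exact notes differ from the earlier run, which drew random_melody first.

```python
import numpy as np

def transition_matrix(sequence, states=None):
    """First-order Markov transition matrix (restated from above)."""
    if states is None:
        states = sorted(set(sequence))
    idx = {s: i for i, s in enumerate(states)}
    T = np.zeros((len(states), len(states)))
    for a, b in zip(sequence[:-1], sequence[1:]):
        T[idx[a], idx[b]] += 1
    row_sums = T.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1  # avoid division by zero
    return T / row_sums, states

def sample_markov(T, states, length=100, seed=0):
    """Generate a note sequence by walking the transition matrix.
    (Hypothetical helper, not defined in the lab above.)"""
    rng = np.random.default_rng(seed)
    current = rng.integers(len(states))
    notes = [states[current]]
    for _ in range(length - 1):
        current = rng.choice(len(states), p=T[current])
        notes.append(states[current])
    return notes

# Regenerate a jazz-like melody with its own generator.
rng0 = np.random.default_rng(42)
jazz_melody = rng0.choice(
    [60, 62, 63, 65, 67, 69, 70, 72], size=200,
    p=[0.18, 0.08, 0.12, 0.10, 0.18, 0.08, 0.14, 0.12])
T, states = transition_matrix(jazz_melody)
generated = sample_markov(T, states, length=100)
print(generated[:16])
```

Because the walk only looks one note back, the generated sequence should have roughly the same Markov entropy as the original, even though it is a brand-new melody.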
Entropy Rate Over Time
We can track how entropy evolves as a melody unfolds, revealing sections of high predictability (verses, ostinatos) and high surprise (improvisations, modulations).
def windowed_entropy(sequence, window_size=32, step=4):
    """Compute entropy in a sliding window across the sequence."""
    positions, entropies = [], []
    for start in range(0, len(sequence) - window_size + 1, step):
        window = sequence[start : start + window_size]
        positions.append(start + window_size // 2)
        entropies.append(shannon_entropy(window))
    return np.array(positions), np.array(entropies)
composite = np.concatenate([structured_melody, jazz_melody, random_melody])
pos, ent = windowed_entropy(composite, window_size=40, step=4)
plt.figure(figsize=(10, 3))
plt.plot(pos, ent)
plt.axvline(200, color="gray", linestyle="--", alpha=0.6, label="Style boundary")
plt.axvline(400, color="gray", linestyle="--", alpha=0.6)
plt.xlabel("Position in sequence")
plt.ylabel("Entropy (bits)")
plt.title("Windowed Entropy Across Three Musical Styles")
plt.legend()
plt.tight_layout()
plt.show()
The plot shows three plateaus corresponding to the structured, jazz-like, and random sections, suggesting that windowed entropy can serve as a coarse fingerprint of musical style.
Try It Yourself
- Real melody analysis. Encode a well-known melody (such as "Twinkle Twinkle Little Star" or "Happy Birthday") as a list of MIDI note numbers. Compute its first-order and Markov entropy. How does it compare to the synthetic sequences above? What does the transition matrix reveal about the melody's structure?
- Higher-order Markov models. Extend the markov_entropy function to compute second-order conditional entropy, where the probability of the next note depends on the previous two notes. How much additional redundancy does the second-order model capture for each of the three styles?
- Entropy and perception. Generate ten random melodies with entropies ranging from 0.5 bits to the maximum. If you have a MIDI playback library (e.g., mido or pretty_midi), listen to each one. At roughly what entropy level does a melody begin to sound "random" rather than "structured"? Write a brief paragraph relating your observation to information-theoretic concepts.