Case Study 2: Gene Expression Cluster Maps and the Origin of Heatmap Genomics

In 1998, Michael Eisen, Paul Spellman, Patrick Brown, and David Botstein published a paper titled "Cluster analysis and display of genome-wide expression patterns." The paper introduced a pair of software tools, Cluster and TreeView, that produced two-dimensional cluster maps of gene expression data. The visual conventions of that paper (rows for genes, columns for conditions, red-green diverging colors, dendrograms on two sides) became the standard for displaying expression data in molecular biology. More than two decades later, gene expression papers still routinely use the same format.


The Situation: Too Much Data, Not Enough Eyes

In the mid-1990s, molecular biology entered a new era. Microarray technology, developed at Stanford, Affymetrix, and other labs, allowed researchers to measure the expression levels of thousands of genes simultaneously. Instead of studying one gene at a time, the dominant mode of biology for decades, a single experiment could now produce a matrix of tens of thousands of measurements: rows for genes, columns for experimental conditions (different cell types, time points, treatments, and so on).

This was an extraordinary increase in data volume. A single microarray experiment produced as much information as a career's worth of single-gene work. But it also produced a visualization problem. How do you display a matrix of 20,000 genes × 50 conditions in a form that a human can read? Listing the numbers in a spreadsheet was useless — no one can scan a million cells and find patterns. Plotting bar charts was also useless — there would be a million bars. The existing visualization toolkit, developed for datasets with dozens of observations, could not handle the new scale.

Michael Eisen was a postdoc at Stanford in 1997-1998, working in Pat Brown's lab and David Botstein's statistics-meets-biology group. Eisen came from a background in both computational biology and software development, and he was well-positioned to notice the problem and build a tool to solve it. His insight was to combine two existing ideas — the heatmap and hierarchical clustering — into a single integrated visualization.

The heatmap idea went back decades. Statisticians had been producing colored tables to display matrices since at least the 1970s. The hierarchical clustering idea was also well-established — agglomerative clustering algorithms had been published in the 1960s, and dendrogram visualizations were a standard part of the statistical toolkit. Neither idea was new. What was new was their combination with the specific needs of gene expression analysis: reorder the rows and columns simultaneously through hierarchical clustering, display the reordered matrix as a colored heatmap, and show the dendrograms alongside so readers could see the clustering structure.

Eisen and colleagues built this tool. It was called Cluster for the clustering part and TreeView for the visualization part. The tools were written in C++ and distributed for free. They became, almost overnight, the standard tool for molecular biologists working with microarray data.

The Data: Yeast Cell Cycle Gene Expression

The specific dataset that appeared in the 1998 paper was a study of yeast gene expression during the cell cycle. Pat Brown's lab at Stanford had used microarrays to measure the expression of approximately 6,200 yeast genes at 18 time points during the cell cycle. The time points spanned a full cycle — from one cell division to the next — and the resulting matrix was 6,200 × 18.

The question was: which genes are cell-cycle regulated, and when in the cycle are they expressed? Biologists had known for decades that some genes turn on and off during the cell cycle — the genes that produce DNA replication machinery turn on during S phase; the genes that produce mitotic spindles turn on during M phase; and so on. But most of the cycle-regulated genes had been identified one at a time, through genetic screens or targeted studies. The microarray data allowed a systematic survey of every gene simultaneously.

To find the pattern in a 6,200 × 18 matrix, Eisen and colleagues used hierarchical clustering. They computed the correlation between every pair of gene expression profiles (each profile is an 18-element vector of expression levels across the time points). Genes with similar profiles (for example, two genes that peak at the same phase of the cycle) had high correlation; genes with dissimilar profiles had low correlation. The clustering algorithm took these correlations and built a tree, joining the most similar genes first and working outward to the most dissimilar.
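The correlate-then-cluster procedure described above can be sketched with SciPy. This is a minimal illustration on synthetic profiles, not the paper's data or code. The average-linkage rule, the noise level, and the two-group setup are assumptions made for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
n_time = 18
t = np.linspace(0, 2 * np.pi, n_time, endpoint=False)

# Synthetic stand-in for expression profiles (not real yeast data):
# five "genes" peaking early in the cycle, five peaking half a cycle later.
early = np.sin(t) + 0.1 * rng.normal(size=(5, n_time))
late = -np.sin(t) + 0.1 * rng.normal(size=(5, n_time))
profiles = np.vstack([early, late])  # shape (10, 18)

# Correlation distance = 1 - Pearson correlation for every pair of rows.
dist = pdist(profiles, metric="correlation")

# Agglomerative clustering: repeatedly merge the two closest clusters.
# Average linkage is an assumption chosen for this sketch.
tree = linkage(dist, method="average")

# Cut the tree into two flat clusters and label each gene.
labels = fcluster(tree, t=2, criterion="maxclust")
print(labels)
```

With profiles this cleanly separated, the five early-peaking rows share one label and the five late-peaking rows share the other.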

The result of the clustering was a reordering of the 6,200 genes such that adjacent rows in the reordered matrix had similar profiles. When the reordered matrix was displayed as a heatmap — rows for genes, columns for time points, color for expression level — the clustering became visible. Groups of genes that turned on at the same phase of the cycle appeared as horizontal bands of the same color. Each band corresponded to a biological "cluster" — a group of genes that likely participated in the same biological process at the same phase.
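The reordering step can be sketched as follows: read the leaf order off the tree and use it to permute the rows. The data, group sizes, and linkage settings are hypothetical, chosen only to make the effect easy to check.

```python
import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage

rng = np.random.default_rng(1)
t = np.linspace(0, 2 * np.pi, 18, endpoint=False)

# Three synthetic groups of four profiles, each peaking at a different phase.
phases = np.repeat([0.0, 2 * np.pi / 3, 4 * np.pi / 3], 4)
true_group = np.repeat([0, 1, 2], 4)
matrix = np.sin(t[None, :] - phases[:, None])
matrix = matrix + 0.1 * rng.normal(size=matrix.shape)

# Shuffle the rows so the groups start out interleaved, as in raw data.
shuffle = rng.permutation(len(matrix))
matrix, group = matrix[shuffle], true_group[shuffle]

# Cluster rows on correlation distance (average linkage is an assumption).
tree = linkage(matrix, method="average", metric="correlation")

# The dendrogram's leaf order is the row order used for the heatmap.
order = leaves_list(tree)
reordered = matrix[order]
reordered_group = group[order]

print("before:", group)
print("after: ", reordered_group)
```

After reordering, rows from the same group sit next to each other, which is what produces the horizontal bands once the matrix is drawn as a heatmap.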

The Visualization: Red-Green Heatmaps with Dendrograms

Eisen's visualization had several key features that became convention:

Red-green diverging colormap. Gene expression was traditionally measured as a ratio — expression in the experimental condition divided by expression in a control. A ratio greater than 1 meant the gene was up-regulated; less than 1 meant down-regulated. Eisen mapped up-regulation to red and down-regulation to green, with black at the midpoint (ratio = 1, no change). This became the standard for gene expression heatmaps and is still dominant today, even though red-green is a terrible choice for colorblind readers.

Rows for genes, columns for conditions. This orientation became convention because genes are typically the "observations" in a biological sense: there are many genes and only a few conditions. Placing the longer axis (genes) vertically lets more genes fit on the page at readable row heights.

Dendrograms on the top and left. The column dendrogram (top) shows how the conditions cluster; the row dendrogram (left) shows how the genes cluster. Both are shown by default, which is important because both the gene grouping and the condition grouping are biologically meaningful.

Log-scale color. Gene expression ratios are often highly skewed — a gene can be up-regulated 20-fold or down-regulated 20-fold in different conditions. On a linear color scale, the extremes would dominate. Eisen used log2 of the ratio, so a twofold change in either direction was symmetric around zero and moderate colors corresponded to moderate changes.

Black background. The original TreeView displayed the heatmap against a black background with colored cells. Black was chosen partly for aesthetic reasons and partly because it made the bright red and green cells more visible. This aesthetic was widely imitated and became a kind of visual signature of gene expression analysis.
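The two color-related conventions above (log2 ratios, and red and green with black at the midpoint) can be sketched with Matplotlib. The colormap name, the example ratios, and the saturation limit of ±3 are assumptions for illustration; they are not TreeView's actual settings.

```python
import numpy as np
from matplotlib.colors import LinearSegmentedColormap, Normalize

# Expression ratios (experiment / control); 1.0 means no change.
ratios = np.array([0.05, 0.5, 1.0, 2.0, 20.0])

# log2 makes fold changes symmetric: 0.5 -> -1, 2.0 -> +1, 1.0 -> 0.
log_ratios = np.log2(ratios)

# Green (down) -> black (no change) -> red (up).
green_black_red = LinearSegmentedColormap.from_list(
    "green_black_red", ["#00ff00", "#000000", "#ff0000"]
)

# Symmetric limits keep log2 ratio 0 at the midpoint of the colormap.
# The limit of 3 (an 8-fold change) is a hypothetical saturation point;
# clip=True sends anything beyond it to the end colors.
limit = 3.0
norm = Normalize(vmin=-limit, vmax=limit, clip=True)

rgba = green_black_red(norm(log_ratios))
for ratio, value, color in zip(ratios, log_ratios, rgba):
    print(f"ratio {ratio:>5}: log2 = {value:+.2f}, RGB = {np.round(color[:3], 2)}")
```

On the raw ratio scale, a 20-fold increase (20.0) and a 20-fold decrease (0.05) sit at very different distances from 1; after the log transform they are equal and opposite.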

The integration of all these features into a single tool — one button in TreeView produced a fully-labeled, clustered, color-coded heatmap — was what made Cluster/TreeView so successful. A biologist could drop in a microarray data file, click "cluster," and within seconds see a publication-ready figure. No custom code, no hours of reformatting, no coordination between separate tools for clustering and display.

The Impact: A Universal Standard in Molecular Biology

Eisen's 1998 paper became one of the most-cited papers in bioinformatics history. As of the mid-2020s, it has been cited more than 20,000 times, making it one of the most influential methods papers ever published in the field. Cluster/TreeView was downloaded by essentially every molecular biology lab that worked with expression data. The visual conventions Eisen established — red-green heatmaps, dendrograms on two sides, log-scale color — became so ubiquitous that readers of biology journals learned to recognize them at a glance.

More than two decades later, the conventions are still dominant. Open any current issue of Nature, Cell, Science, or any biology journal with genomics content, and you will find heatmap figures that look nearly identical to Eisen's 1998 originals. The red-green colormap has come under some criticism for colorblind accessibility, and some authors now use blue-red or white-red alternatives. But the core layout (a matrix with dendrograms) is unchanged.

The tools themselves have evolved. Cluster/TreeView is still available, but most modern analyses use R tools such as pheatmap, ComplexHeatmap, or the heatmap.2 function from gplots, or Python tools such as seaborn's sns.clustermap. These tools reproduce the same layout and conventions; they differ mostly in customization options and integration with other libraries. Seaborn's clustermap function, introduced around version 0.5, closely follows the layout of the Eisen-era gene expression heatmaps.

When you call sns.clustermap on a correlation matrix, you are using a visualization that was designed for gene expression data more than a decade before seaborn existed. The tool has been generalized, so you can cluster any numeric matrix, not just gene expression. But the layout, the dendrograms, the diverging color scale, and the philosophy of "reorder for structure" all trace back to Eisen's paper.
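A minimal sketch of that call, on a hypothetical dataset of six variables forming two correlated blocks, looks like this. The variable names, the "vlag" colormap, and the output filename are assumptions for the example. The method and metric arguments are spelled out only to make the clustering choices visible.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line in a notebook

import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(2)
n = 100

# Hypothetical data: the "a" columns share one signal, the "b" columns another.
base_a, base_b = rng.normal(size=n), rng.normal(size=n)
data = pd.DataFrame({
    "a1": base_a + 0.3 * rng.normal(size=n),
    "b1": base_b + 0.3 * rng.normal(size=n),
    "a2": base_a + 0.3 * rng.normal(size=n),
    "b2": base_b + 0.3 * rng.normal(size=n),
    "a3": base_a + 0.3 * rng.normal(size=n),
    "b3": base_b + 0.3 * rng.normal(size=n),
})
corr = data.corr()

# Cluster rows and columns, draw the reordered matrix plus both dendrograms.
# center=0 pins the midpoint of the diverging colormap to zero correlation.
grid = sns.clustermap(
    corr, method="average", metric="euclidean",
    cmap="vlag", center=0, vmin=-1, vmax=1,
)

# Row labels in the order the clustering placed them.
row_order = [corr.index[i] for i in grid.dendrogram_row.reordered_ind]
print(row_order)
grid.savefig("clustermap_sketch.png")
```

The interleaved input columns come out grouped, with the three "a" variables adjacent and the three "b" variables adjacent. The blue-red "vlag" colormap is one of the colorblind-friendlier alternatives to red-green.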

Theory Connection: Why the Format Persists

The persistence of Eisen's layout is a case study in how visualization conventions establish themselves and resist change.

The layout works for several reasons. First, the matrix display scales. A heatmap can show thousands of rows at once, which is essential for genomics data. A scatter plot of thousands of points would be overwhelming; a bar chart would not fit. Among the standard chart types, the heatmap is the one that can display thousands of observations compactly.

Second, the dendrograms encode structure that the heatmap alone cannot. The color matrix shows the values; the dendrograms show the clustering. Together they answer two questions: "what does each gene do?" (the matrix) and "which genes group together?" (the trees). A heatmap without dendrograms answers only the first question; a tree without a heatmap answers only the second. The combined display answers both.

Third, the convention allows comparison across papers. Because every gene expression paper uses the same layout, a biologist can scan across hundreds of papers and quickly extract comparable information. If one paper used rows for conditions and another used columns, the comparison would require mental transposition. The shared convention reduces cognitive load.

Fourth, and more subtly, the convention has become a visual signature of the field. A heatmap with dendrograms and red-green cells means "gene expression data" to a reader who has seen hundreds of such figures. The format itself communicates the context, even before the reader has parsed the individual rows and columns. This is the same phenomenon that lets us recognize a bar chart as "discrete comparisons" or a line chart as "time series" — the format carries semantic meaning through convention.

The flip side of convention is inertia. The red-green colormap has been criticized for decades. Roughly 8% of men and 0.5% of women have red-green color blindness and cannot distinguish the two colors reliably. For these readers, an Eisen-style heatmap is difficult or impossible to read. Alternatives (blue-red, cyan-magenta, perceptually uniform diverging maps) exist and are demonstrably better for colorblind accessibility. But the red-green convention persists because generations of readers expect it. Changing the convention requires a coordinated shift in publication practices, and such shifts are slow.

This tension — between the benefits of a shared convention and the costs of a suboptimal one — is a recurring theme in visualization. Eisen's conventions are, on balance, probably net positive: the field has a shared language for displaying complex data, and the language works. But the red-green issue is a reminder that conventions should be re-examined periodically and updated when the evidence warrants.


Discussion Questions

  1. On convention inertia. Eisen's red-green colormap is demonstrably worse for colorblind readers than alternatives, yet it persists. What would it take to change the convention? Who has the authority to do so?

  2. On scale. A gene expression heatmap with 20,000 rows is at the limit of what a page can display. Beyond that limit, what tools would you use? Interactive exploration? Dimensionality reduction?

  3. On cluster interpretation. Eisen's dendrograms grouped co-expressed genes, which biologists interpreted as "functional modules." This interpretation is not always right — sometimes genes co-express for reasons other than shared function. How should papers communicate the uncertainty?

  4. On software longevity. Cluster/TreeView is 25+ years old and still available. Most research software does not survive that long. What made this tool durable?

  5. On generalization. The tools this chapter uses (sns.clustermap and friends) descend from software built for gene expression but now serve many fields. What other visualization conventions have escaped their origin disciplines?

  6. On your own use. The next time you use sns.clustermap, will you think about the history behind it? Does knowing the history change how you use the tool?


Eisen's 1998 paper is a reminder that visualization tools are not invented in a vacuum — they are built to solve specific problems in specific fields, and their conventions carry the fingerprints of those origins. When you produce a cluster map in seaborn, you are working in a tradition that began with a 6,200 × 18 yeast matrix and became a universal language for displaying molecular data. The tool has generalized; the conventions have traveled; the original data is long since analyzed and published. But the visual vocabulary persists, and every cluster map you make is, in some sense, a descendant of that first one.