Case Study 1: Hadley Wickham, ggplot2, and the Rise of the Grammar of Graphics in Statistical Computing

In 2005, Hadley Wickham, then a PhD student at Iowa State University, released the first version of an R package called ggplot. The package was his attempt to implement Leland Wilkinson's 1999 book The Grammar of Graphics as a practical plotting library. Within a few years, ggplot (and its successor, ggplot2) had become the de facto standard for statistical visualization in R. Within a decade, it had influenced Python (Altair), JavaScript (Vega/Vega-Lite), Julia (Gadfly), and the visualization curriculum at universities worldwide. The ggplot2 story is a case study in how one person's insistence on theoretical rigor can reshape an entire field.

The Situation: R's Base Graphics in the Early 2000s

R, the statistical programming language, first appeared in 1993 as a reimplementation of S, a proprietary language from Bell Labs, and was released as open-source software under the GPL in 1995. By the early 2000s, R had become the dominant language for academic statistics: widely used in universities, backed by a growing package ecosystem (CRAN), and freely available. Its plotting system, inherited from S, produced statistical graphics with a terse but idiosyncratic API. You called functions like plot(), hist(), boxplot(), and pairs(), and R drew a chart on a "device" (a graphics window or file).

R's base graphics worked. Most statistical papers that used R produced charts with it. But the API was limited. Producing a grouped or faceted chart required writing loops over the data and manually arranging subplots. Changing the color scheme required remembering how R's palette system worked. Adding a legend required manually positioning it with legend(). Every non-trivial chart was a collection of small procedural workarounds, and the resulting code was hard to read and hard to modify.

An alternative approach existed: the lattice package, written by Deepayan Sarkar and built on Bill Cleveland's "Trellis" framework. Lattice handled faceting and conditioning elegantly — you could produce a 3×3 grid of scatter plots colored by a third variable with a single function call. Lattice was a significant improvement over base R, and it attracted users. But lattice was still imperative and procedural; it had its own conventions, its own idioms, its own quirks. It was better than base graphics, but it was not a paradigm shift.

Meanwhile, in 1999, Leland Wilkinson — a statistician and software developer — published The Grammar of Graphics, a 650-page book that laid out a theoretical framework for statistical visualization. Wilkinson argued that every chart could be decomposed into a small set of components (data, aesthetic mappings, geometric objects, scales, statistical transformations, coordinate systems, facets), and that a well-designed graphics system should expose these components directly. The book was academic, mathematically rigorous, and not immediately practical — Wilkinson had built a commercial implementation called GPL (Graphics Production Language), but it was proprietary and did not gain wide use.

Hadley Wickham was a statistics graduate student at Iowa State when he first read The Grammar of Graphics. He was, by his own later accounts, obsessed with the idea. He wanted to build a practical implementation in R that would make grammar-of-graphics concepts accessible to working statisticians. He began coding what would become ggplot while working toward his PhD, which he completed at Iowa State in 2008.

The Library: ggplot and the Layered Grammar

The first ggplot release in 2005 was a rough prototype. Wickham reworked the design several times over the next two years, and in 2007 he released ggplot2 with a new API that he called the layered grammar of graphics. The layered grammar was a refinement of Wilkinson's ideas, specialized for R and tuned for practical use.

The key ggplot2 pattern was:

ggplot(data, aes(x = col1, y = col2, color = col3)) +
  geom_point() +
  facet_wrap(~ col4)

This created a scatter plot of col1 vs. col2, colored by col3, faceted by col4. The ggplot(data, aes(...)) call bound the data and the aesthetic mappings. The + operator added layers: geom_point() added a point layer, geom_line() would add a line layer, geom_smooth() would add a regression smoothing layer. Faceting was a one-line addition. Every chart was a composition of explicit layers with explicit mappings.
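The mechanics of the + operator can be sketched in a few lines of Python. This is a toy model written for illustration, not ggplot2's implementation: the Plot, Layer, and Facet classes and the describe() method are hypothetical names, and nothing is drawn. The sketch shows only how each + can return a new specification with one more component attached.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Layer:
    """One geometric layer (hypothetical stand-in for a geom_* call)."""
    geom: str


@dataclass(frozen=True)
class Facet:
    """Split the plot into panels by one column (stand-in for facet_wrap)."""
    by: str


@dataclass(frozen=True)
class Plot:
    """Immutable plot specification: data name, aesthetic mappings, layers, facet."""
    data: str
    aes: Dict[str, str]
    layers: Tuple[Layer, ...] = ()
    facet: Optional[Facet] = None

    def __add__(self, other):
        # Each `+` builds a new Plot; the left-hand operand is never modified.
        if isinstance(other, Layer):
            return replace(self, layers=self.layers + (other,))
        if isinstance(other, Facet):
            return replace(self, facet=other)
        return NotImplemented

    def describe(self) -> str:
        geoms = " + ".join(layer.geom for layer in self.layers) or "no layers"
        maps = ", ".join(f"{k}={v}" for k, v in self.aes.items())
        panels = f", one panel per {self.facet.by}" if self.facet else ""
        return f"{geoms} of {self.data} [{maps}]{panels}"


base = Plot("data", {"x": "col1", "y": "col2", "color": "col3"})
chart = base + Layer("point") + Facet("col4")
print(chart.describe())
# prints: point of data [x=col1, y=col2, color=col3], one panel per col4
```

Because base is left untouched, the same mappings can be reused for a second chart, for example base + Layer("line"), which is the property that makes layers easy to add, remove, or swap.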

Compared to R's base graphics or lattice, ggplot2 code was:

Shorter: a faceted colored chart in three lines, not ten.
More readable: each component was named explicitly (aes, geom_point, facet_wrap).
More consistent: every chart used the same pattern, so learning one geom_* taught you the structure of all of them.
More composable: you could add, remove, or swap layers without rewriting the whole chart.
Explicitly configurable: scales, themes, and coordinate systems were first-class objects that you could customize and reuse.

The learning curve was real. Users who were fluent in base R or lattice had to un-learn their imperative habits and adopt the layered grammar mental model. Many did. ggplot2's adoption grew steadily from 2007 through the early 2010s, and by 2015, it had become the de facto standard for statistical graphics in R — used in thousands of papers, taught in statistics courses, and treated as a baseline expectation for graduate students in the field.

The Ecosystem: Tidyverse and the Wickham Stack

Wickham's influence did not stop at ggplot2. Over the following years, he built a suite of R packages that together formed what is now called the tidyverse:

dplyr (2014): a grammar for data manipulation — select, filter, mutate, group_by, summarize. Replaced R's base data-frame manipulation idioms.
tidyr (2014): a library for reshaping data between wide and long formats. Introduced the concept of "tidy data" (Wickham's 2014 paper of the same name formalized the idea).
readr (2015): a replacement for R's base read.csv with better defaults and speed.
purrr (2015): a functional programming library for lists and iteration.
tibble (2016): a modernized data frame.
stringr (2009): a consistent string manipulation API.
forcats (2016): a factor (categorical) manipulation API.

Each of these packages applied the same design philosophy: consistent naming, pipeable verbs, explicit types, and opinionated defaults. They all interoperated cleanly. Using the tidyverse meant writing R code that read top-to-bottom like a recipe, with each step transforming the data toward the final analysis.
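The "pipeable verbs" idea can be illustrated with a small Python sketch. This is an analogy under stated assumptions, not dplyr's API: pipe, filter_rows, mutate, and summarize_by are hypothetical helpers operating on a list of dictionaries, and the rows are made-up data.

```python
from functools import reduce
from statistics import mean


def pipe(data, *steps):
    """Pass `data` through each step in order, like a chain of piped verbs."""
    return reduce(lambda acc, step: step(acc), steps, data)


def filter_rows(predicate):
    """Verb: keep only the rows for which `predicate` is true."""
    return lambda rows: [r for r in rows if predicate(r)]


def mutate(**new_cols):
    """Verb: add columns computed from each row, returning new row dicts."""
    return lambda rows: [
        {**r, **{name: fn(r) for name, fn in new_cols.items()}} for r in rows
    ]


def summarize_by(key, **aggs):
    """Verb: group rows by `key`, then reduce each group to one row."""
    def step(rows):
        groups = {}
        for r in rows:
            groups.setdefault(r[key], []).append(r)
        return [
            {key: k, **{name: agg(g) for name, agg in aggs.items()}}
            for k, g in groups.items()
        ]
    return step


# Illustrative data only.
rows = [
    {"group": "a", "mass_g": 1000, "ok": True},
    {"group": "a", "mass_g": 3000, "ok": True},
    {"group": "b", "mass_g": 4000, "ok": True},
    {"group": "b", "mass_g": 9000, "ok": False},
]

result = pipe(
    rows,
    filter_rows(lambda r: r["ok"]),
    mutate(mass_kg=lambda r: r["mass_g"] / 1000),
    summarize_by("group", mean_kg=lambda g: mean(r["mass_kg"] for r in g)),
)
print(result)
# prints: [{'group': 'a', 'mean_kg': 2.0}, {'group': 'b', 'mean_kg': 4.0}]
```

Each verb takes a table and returns a new one, so the call to pipe reads top to bottom in the order the transformations happen.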

The tidyverse became the dominant R idiom of the 2010s. Courses at universities taught it. Book-length introductions (Wickham's R for Data Science) became bestsellers. Job listings for data scientists specified "tidyverse experience" as a requirement. The tidyverse was a package ecosystem, but it was also a philosophy — a way of thinking about data analysis in which each step was explicit, each verb was named, and the whole pipeline was compositional.

ggplot2 was the visualization component of this philosophy. Tidyverse users used it because it fit the overall mindset: verbs + pipelines + compositional design. The grammar of graphics was the visualization grammar that matched the dplyr data grammar, and together they produced a coherent analytical style.

The Influence: From R to Python, JavaScript, and Beyond

By the mid-2010s, ggplot2's design had influenced many of the visualization libraries then under development. In Python, the efforts include seaborn (2012, heavily influenced by ggplot2 but implemented over matplotlib), the pure-Python ggplot package (2013, which was never quite satisfactory), Altair (2016, built on Vega-Lite, which is itself a grammar-of-graphics implementation), and plotnine (2017, a direct Python port of ggplot2).

JavaScript got its grammar-of-graphics implementation through Vega (2014) and Vega-Lite (2016), both developed at the University of Washington Interactive Data Lab led by Jeffrey Heer. Vega was a low-level specification language that expressed charts as JSON documents; Vega-Lite was a higher-level wrapper that emphasized compositionality and interactivity. Both explicitly cited ggplot2 and Wilkinson's book as influences. Altair, in turn, was built on top of Vega-Lite, bringing the grammar-of-graphics approach back into Python with a declarative API.
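To make "charts as JSON documents" concrete, here is a minimal sketch that builds a small Vega-Lite-style specification as a Python dictionary and serializes it with the standard library. The column names and data values are illustrative assumptions; the keys ("data", "mark", "encoding") follow the Vega-Lite specification format.

```python
import json

# A scatter plot described as data, not as drawing commands.
spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {
        "values": [  # illustrative rows
            {"col1": 1.0, "col2": 2.1, "col3": "a"},
            {"col1": 2.0, "col2": 3.9, "col3": "a"},
            {"col1": 3.0, "col2": 6.2, "col3": "b"},
        ]
    },
    "mark": "point",
    "encoding": {
        "x": {"field": "col1", "type": "quantitative"},
        "y": {"field": "col2", "type": "quantitative"},
        "color": {"field": "col3", "type": "nominal"},
    },
}

# The whole chart is an ordinary JSON document that can be stored or sent
# to any renderer that understands the format.
text = json.dumps(spec, indent=2)

# Changing the chart type is a change to one value, not a rewrite.
line_spec = {**spec, "mark": "line"}
```

Since the chart is plain data, it can be generated, inspected, and transformed by a program in any language, which is what lets a Python library like Altair target a JavaScript renderer.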

The influence extended to commercial tools as well. Tableau, the market-leading BI product, built its VizQL language on grammar-of-graphics principles (VizQL grew out of Polaris, the research system that co-founder Chris Stolte developed as a PhD student of Pat Hanrahan at Stanford). Microsoft's Power BI and Google's Data Studio both borrowed grammar-of-graphics concepts for their chart authoring tools. By the late 2010s, the grammar of graphics had become a universal language for thinking about statistical charts, even among people who did not know it by that name.

Theory Connection: Why the Grammar Wins

The grammar of graphics wins because it is a productive framework. With matplotlib's imperative model or R's base graphics, you learn a vocabulary of chart types — bar chart, line chart, scatter plot, histogram, box plot — and the library offers a function for each. When you want a chart type that is not in the vocabulary, you either improvise or give up.

With the grammar of graphics, there is no fixed vocabulary. Every chart is a composition of primitives. Want a chart with points and lines? Add both layers. Want a chart with a regression smoothing overlay? Add a geom_smooth layer. Want to facet by one variable and color by another? Add a facet and a color encoding. Want a custom chart type that nobody has ever built before? Compose it from primitives. The grammar is generative: it can express chart types that the library author never anticipated, because the library is built on a composable theory rather than a fixed list of functions.

This is why the grammar of graphics has spread so far. It is not just a good library design; it is a good mental model for thinking about charts. Once you internalize the primitives (data, mark, encoding, scale, transform, facet, composition), you start seeing every chart as a composition of them, and you can produce new charts by combining primitives in new ways. The generativity is the feature.

The threshold concept of this chapter — "declaration over instruction" — follows directly. Declarative code is more compositional than imperative code because declarative primitives can be combined without worrying about execution order or state. Imperative code has hidden dependencies (which Axes is current? what did the last function call modify?) that make composition fragile. Declarative code does not.
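A toy contrast makes the hidden-dependency point concrete. Everything here is a hypothetical sketch, not matplotlib's or Altair's API: the imperative half mimics a "current axes" global, and the declarative half represents charts as plain values (the "layer" and "hconcat" keys echo Vega-Lite's vocabulary, but the values are simplified to strings).

```python
# --- Imperative: every call reads or writes hidden "current axes" state ---
canvas = {"left": [], "right": []}
_state = {"current": "left"}


def set_current(name):
    _state["current"] = name


def draw(mark):
    # Where the mark lands depends on whatever ran before this call.
    canvas[_state["current"]].append(mark)


def add_trend_line():
    # A reusable helper that silently relies on the current axes.
    draw("trend")


draw("points")        # lands on "left"
set_current("right")
draw("bars")          # lands on "right"
add_trend_line()      # intended for "left", but lands on "right"


# --- Declarative: charts are values; combining them touches no state ---
def layered(*marks):
    return {"layer": list(marks)}


def side_by_side(*charts):
    return {"hconcat": list(charts)}


right = layered("bars")
left = layered("points", "trend")   # the trend belongs to the chart it is written in
dashboard = side_by_side(left, right)

print(canvas)
# prints: {'left': ['points'], 'right': ['bars', 'trend']}
print(dashboard)
# prints: {'hconcat': [{'layer': ['points', 'trend']}, {'layer': ['bars']}]}
```

In the imperative half the trend line ends up on the wrong panel because of an earlier call; in the declarative half, left and right can be defined in either order and the composed result is the same.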

The Impact: A Generation of Statistical Thinkers

Wickham's impact on statistical computing is hard to overstate. He became a professor at Rice University, then moved to RStudio (now Posit) as its Chief Scientist. He has published multiple books, dozens of papers, and hundreds of open-source packages. His R for Data Science book (co-authored with Garrett Grolemund, 2016) is the standard introduction to the tidyverse and ggplot2 for statistics and data science students worldwide. The second edition, with Mine Çetinkaya-Rundel, was published in 2023.

More importantly, Wickham's work has trained a generation of statisticians and data scientists to think about visualization (and data manipulation, and data modeling) in grammar-of-graphics terms. Students who learn ggplot2 first often struggle with matplotlib's imperative model when they encounter it later — the grammar-of-graphics mental model is so productive that imperative tools feel clumsy. This is not a small effect. The way people think about a problem is influenced by the tools they first learn, and a generation of analysts who learned ggplot2 brings grammar-of-graphics thinking to every new visualization tool they touch.

Altair, the subject of this chapter, is one of the beneficiaries. A Python user coming from ggplot2 will find Altair immediately familiar: alt.Chart(data).mark_point().encode(x="col1:Q", y="col2:Q", color="col3:N") is the direct analog of ggplot(data, aes(x=col1, y=col2, color=col3)) + geom_point(). The Python syntax differs, but the concepts are the same. The learning curve is short because the grammar is shared.

Discussion Questions

On Wilkinson vs. Wickham. Wilkinson wrote a 650-page theoretical book that had modest direct impact. Wickham wrote practical implementations that had massive impact. What does this say about the relationship between theory and practice in software?
On tidyverse adoption. The tidyverse became the dominant R idiom, but some R users still prefer base R. What are the legitimate reasons to resist the tidyverse? What are the illegitimate ones?
On Altair's heritage. Altair is a direct descendant of ggplot2 via Vega-Lite. Should Altair be seen as a Python implementation of ggplot2, or as a distinct library with its own philosophy?
On generative frameworks. The chapter argues that the grammar of graphics is a "generative framework" — it can express chart types the library author never anticipated. Is this always a good thing, or does it come with costs?
On Python's grammar-of-graphics story. Python has had several attempts at grammar-of-graphics libraries (ggplot, plotnine, Altair). Why has the story taken so long compared to R, where ggplot2 became the standard within a few years of its release?
On your own use. If you have used ggplot2 in R, does the Altair API feel familiar? If you have not, is the grammar-of-graphics approach easier or harder than matplotlib for the charts you need to produce?

ggplot2 changed R, and then changed Python, and then changed JavaScript, and then changed commercial BI tools. The grammar of graphics is no longer a theory in a 650-page book; it is the dominant paradigm for statistical visualization across languages and ecosystems. When you write an Altair chart, you are writing in a language whose grammar was formalized by Wilkinson, popularized by Wickham, and generalized by the Vega team. The lineage matters because it explains why Altair works the way it does — and it suggests that the grammar-of-graphics approach is not just one option among many, but the best-tested and most influential framework for thinking about data visualization that we currently have.