Further Reading: Essential Chart Types in matplotlib


Tier 1: Essential Reading

VanderPlas, Jake. Python Data Science Handbook. 2nd ed. O'Reilly Media, 2023. Already recommended for Chapter 10, VanderPlas's book is the most practical source for the chart types in this chapter. Chapter 4 of the handbook covers line charts, scatter plots, histograms, and others with worked examples. The book is freely available at jakevdp.github.io/PythonDataScienceHandbook/ as Jupyter notebooks. Essential reading for concrete code examples beyond what this textbook chapter covers.

The Matplotlib Gallery. matplotlib.org/stable/gallery/ The gallery is organized by chart type. Browse the "Lines, bars and markers" section for line, bar, and scatter plot examples, the "Pie and polar charts" section (though pie charts are rarely recommended), and the "Statistics" section for histograms and box plots. Every gallery example includes full source code. This is the first place to look when you need to produce a specific chart type variant.

Matejka, Justin, and George Fitzmaurice. "Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing." Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, 2017: 1290-1294. The paper that introduced the Datasaurus Dozen discussed in Case Study 1. The paper is short (5 pages), readable, and freely available through the ACM Digital Library or the authors' website at autodeskresearch.com/publications/samestats. Essential reading for anyone who wants to understand the rhetorical power of scatter plots as demonstrations of the "statistics alone are lossy" principle.


Tier 2: Recommended Reading

Wilke, Claus O. Fundamentals of Data Visualization. O'Reilly Media, 2019. Wilke's chapters on specific chart types (amounts, distributions, proportions, relationships) complement the matplotlib-specific coverage in this chapter. Wilke's examples are in R/ggplot2, but the principles transfer to matplotlib. His discussion of when to use each chart type, and when to avoid specific anti-patterns (like pie charts with many slices), is particularly useful as a design reference. Freely available at clauswilke.com/dataviz.

Few, Stephen. Show Me the Numbers: Designing Tables and Graphs to Enlighten. 2nd ed. Analytics Press, 2012. Few's book is the definitive practical guide to choosing between chart types for business communication. His chapters on comparison charts, distribution charts, and relationship charts provide extensive examples and specific recommendations. Pair with VanderPlas for a complete view: VanderPlas gives you the matplotlib code, Few gives you the design rationale for choosing between chart types.

Knaflic, Cole Nussbaumer. Storytelling with Data: A Data Visualization Guide for Business Professionals. Wiley, 2015. Already recommended for Chapter 7, Knaflic is also relevant here because her book includes detailed discussion of each chart type — when to use it, when to avoid it, and how to polish it. Her "Big Idea" framework (covered in Chapter 9 of this textbook) sits at the level above individual chart choices but informs the specific chart decisions discussed in this chapter.

Cairo, Alberto. The Truthful Art: Data, Charts, and Maps for Communication. New Riders, 2016. Cairo's book includes chapters on each of the major chart types, with particular attention to the ethical dimensions (when a chart type can mislead) that complement Chapter 4 of this textbook. His discussion of scatter plots and the ethics of encoded variables is directly relevant to Section 11.3.

Healy, Kieran. Data Visualization: A Practical Introduction. Princeton University Press, 2019. An R/ggplot2-focused book on data visualization that covers the same chart types as this chapter with different examples and a slightly more statistical emphasis. Worth reading as a complement if you use R in addition to Python, or if you want a second perspective on the chart-type decisions. Freely available at socviz.co.

Wickham, Hadley, Mine Çetinkaya-Rundel, and Garrett Grolemund. R for Data Science. 2nd ed. O'Reilly Media, 2023. The canonical R data science book. The visualization chapters use ggplot2, not matplotlib, but the chart-type recommendations are substantively similar. Useful as a cross-language reference, and the ggplot2 examples clarify several design choices by contrast with matplotlib's approach.


Tier 3: Tools, References, and Specific Chart Types

| Resource | URL / Source | Description |
| --- | --- | --- |
| matplotlib plot_types reference | matplotlib.org/stable/plot_types/ | A visual index of matplotlib chart types organized by category (line, bar, scatter, statistical, etc.). Each entry links to the relevant API documentation and gallery examples. A faster way to find the right method name than searching the API docs directly. |
| matplotlib Axes API | matplotlib.org/stable/api/axes_api.html | The comprehensive list of methods on the Axes class. Use this to find the exact signature for any plot method, including all the parameters and their defaults. This is the page to keep bookmarked as you work through the chart types. |
| seaborn documentation | seaborn.pydata.org | seaborn is a higher-level interface built on matplotlib. It provides simpler APIs for several of the chart types in this chapter (especially scatter, distribution plots, and box plots). We will cover seaborn in Part IV (Chapters 16-19), but you can preview it now if you want a less verbose alternative to raw matplotlib for statistical visualizations. |
| The datasauRus R package | cran.r-project.org/package=datasauRus | The R implementation of the Datasaurus Dozen dataset. If you work in R as well as Python, this is the canonical source. |
| The datasaurus Python package | pypi.org/project/datasaurus/ | A Python package that provides the Datasaurus Dozen data for use with pandas and matplotlib. pip install datasaurus gives you the data without having to download the CSV manually. |
| showyourstripes.info | showyourstripes.info | Ed Hawkins's public tool for generating warming stripes for any country, region, or city. Discussed in Case Study 2. Free to use and generates PNG and SVG files that can be shared. |
| Warming stripes reproduction in matplotlib | matplotlib.org/stable/gallery/showcase/warming_stripes.html | The matplotlib gallery has an example that reproduces the warming stripes, along with the source code. Worth studying after reading Case Study 2. |
| Anscombe's Quartet in Python | seaborn.pydata.org/examples/anscombes_quartet.html | seaborn's documentation includes a reproduction of Anscombe's Quartet as a small-multiple figure. Compare to the Datasaurus Dozen reproduction in Case Study 1. |
| The Python Graph Gallery | python-graph-gallery.com | A catalog of Python chart types with code examples in matplotlib, seaborn, and Plotly. Organized by chart category. Useful for finding specific chart variants and their implementation. |
| Real Python matplotlib tutorials | realpython.com/tutorials/matplotlib/ | A collection of free tutorials on specific matplotlib topics. Variable quality but generally well-written, with particular focus on practical use cases. |
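
If you would like to try the Anscombe's Quartet comparison listed above without installing seaborn, the following is a minimal sketch in plain matplotlib. It is not the seaborn documentation's code: the data values are typed in from Anscombe's published quartet, and the variable names (quartet, summary) and the 2x2 layout are choices made here for illustration. It draws the four datasets as small multiples and prints the mean of y and the correlation for each, which come out nearly identical even though the scatter plots look nothing alike.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Anscombe's quartet: datasets I-III share the same x values.
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

# One panel per dataset, with shared axes so the shapes are directly comparable.
fig, axes = plt.subplots(2, 2, sharex=True, sharey=True, figsize=(6, 6))
summary = {}
for ax, (label, (xs, ys)) in zip(axes.flat, quartet.items()):
    ax.scatter(xs, ys)
    ax.set_title(f"Dataset {label}")
    # Record the summary statistics that the plots make look so different.
    summary[label] = (np.mean(ys), np.corrcoef(xs, ys)[0, 1])

for label, (mean_y, r) in summary.items():
    print(f"{label}: mean(y) = {mean_y:.2f}, r = {r:.3f}")
```

In a notebook, drop the matplotlib.use("Agg") line and the figure will display inline.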

Notes on Choosing Between Matplotlib and Higher-Level Libraries

For the chart types in this chapter, matplotlib is rarely the most concise option. seaborn produces cleaner scatter plots with less code. The pandas plot methods produce line charts in a single line. Plotly produces interactive versions of the same chart types. Altair uses a grammar-of-graphics approach that is often shorter for complex charts. Each alternative trades some of matplotlib's low-level control for that convenience.

So why learn matplotlib? Because seaborn and pandas plotting are built directly on matplotlib, and understanding matplotlib's architecture gives you the ability to customize and debug their output. Plotly and Altair use their own rendering stacks, but the same vocabulary of figures, axes, and encodings carries over. When seaborn's default output does not meet your needs, you drop down to matplotlib. When pandas plotting produces something unexpected, you investigate the underlying matplotlib Axes. When you need a chart type that no higher-level library supports, you implement it in matplotlib.
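
As a rough illustration of "dropping down," the sketch below starts from a pandas one-liner and then works on the matplotlib Axes that pandas returns. The DataFrame, its column names, and the sales numbers are hypothetical, invented only for this example. seaborn's axes-level functions return an Axes in the same way, so the same pattern applies there.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical monthly sales figures, made up for illustration only.
df = pd.DataFrame(
    {"month": [1, 2, 3, 4, 5, 6], "sales": [12, 15, 14, 19, 23, 22]}
)

# The quick pandas call; it returns the underlying matplotlib Axes.
ax = df.plot(x="month", y="sales", legend=False)

# From here on, everything is ordinary matplotlib.
ax.set_title("Monthly sales")
ax.set_xlabel("Month")
ax.set_ylabel("Units sold")
for side in ("top", "right"):
    ax.spines[side].set_visible(False)

# Investigating what pandas actually drew: the Line2D artists live on the Axes.
line = ax.get_lines()[0]
print(type(ax).__name__, line.get_ydata())

fig = ax.get_figure()
fig.tight_layout()
```

Keeping a reference to the returned Axes is the key habit: every method in the Axes API is then available, regardless of which library created the chart.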

Think of matplotlib as the assembly language of Python visualization: verbose, sometimes awkward, but powerful enough to produce anything, and always available as a fallback. Higher-level libraries are the higher-level languages: faster for common tasks and, in the case of seaborn and pandas, built on the same foundation.

A reasonable workflow:

1. For exploratory charts in a notebook: use pandas df.plot() or seaborn for quick results.
2. For publication-quality charts where you have design control: use matplotlib directly.
3. For interactive dashboards: use Plotly or Bokeh.
4. For grammar-of-graphics style declarative charts: use Altair.

This chapter covered matplotlib because it is the foundation. Part IV will cover seaborn for statistical shortcuts. Part V will cover Plotly and Altair for interactive and declarative alternatives. Knowing all of them lets you pick the right tool for each job.


A note on reading order: If you want one source for practical examples, read VanderPlas's Python Data Science Handbook Chapter 4. If you want one source for design rationale, read Wilke's Fundamentals of Data Visualization. If you want to go deeper on specific chart types, browse the matplotlib gallery daily — it is the fastest way to build a mental library of "charts I can produce" and their source code. All three are free online.