Key Takeaways: Statistical and Scientific Visualization
- Publication figures have specific requirements. Journals specify figure widths (single-column ~3.5 in / 89 mm, double-column ~7 in / 183 mm), minimum font sizes (7-9 pt), font families (Arial/Helvetica), and file formats (PDF/EPS preferred, TIFF at 300+ DPI). Read the guidelines for your target journal before designing the figure.
- Font embedding is mandatory. Set `mpl.rcParams["pdf.fonttype"] = 42` and `mpl.rcParams["ps.fonttype"] = 42` to force Type 42 TrueType output. Journals reject figures with Type 3 fonts because many PDF processors cannot handle them.
- Reusable style modules save time. Create a Python module (`nature_style.py`, `plos_style.py`) that applies rcParams and provides convenience functions for sized figures and panel labels. Import at the top of every figure script. Ensures consistency and reduces per-figure effort.
- Panel labels use axes coordinates. Add labels with `ax.text(-0.15, 1.05, "a", transform=ax.transAxes, fontsize=12, fontweight="bold")`. The `transform=ax.transAxes` means coordinates are fractions of the axes. Negative x and y > 1 position text outside the top-left of the axes.
- Error bars and confidence bands are essential. Use `ax.errorbar` for discrete estimates and `ax.fill_between` for continuous bands. Always disclose what the error represents (SEM, SD, 95% CI, bootstrap interval) in the caption.
- Significance brackets via statannotations. The `statannotations` library automates pairwise significance annotations on seaborn categorical plots. Configure the test, significance format, and position; the library runs the test and draws the bracket.
- Colorblind and grayscale safety. Use the Wong palette (or similar) for categorical colors and verify grayscale readability. Redundant encoding (color + line style + marker) is the surest way to ensure accessibility.
- Forest plots, volcano plots, Manhattan plots, QQ plots. Each is a specialized format for a specific statistical context (meta-analysis, high-throughput screening, GWAS, distributional diagnostics). Know the conventions for any format your field uses regularly.
- Effect sizes beat p-value thresholds. Modern guidance emphasizes effect sizes with confidence intervals over p-value stars. Tools like DABEST make estimation plots accessible. The format you choose shapes the reasoning the reader applies, so choose deliberately.
- Reproducibility is part of figure quality. Publish the data and code alongside the paper. Show full distributions, not just summaries. Report sample sizes. Use version control. A figure is not just the image; it is also the data and code behind it.
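
Several of the takeaways above (font embedding, a reusable style helper, panel labels, error bars and bands, and redundant encoding with the Wong palette) can be combined in one short script. What follows is a minimal sketch, not a journal-approved template: the helper names `apply_style`, `sized_figure`, and `panel_label` are hypothetical, the font size and figure dimensions are illustrative, and the data are randomly generated placeholders.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np

# Wong colorblind-safe palette as hex codes.
WONG = ["#000000", "#E69F00", "#56B4E9", "#009E73",
        "#F0E442", "#0072B2", "#D55E00", "#CC79A7"]


def apply_style():
    """Apply journal-style rcParams (illustrative values; check your journal)."""
    mpl.rcParams.update({
        "pdf.fonttype": 42,  # embed TrueType (Type 42) fonts in PDF output
        "ps.fonttype": 42,   # same for PostScript/EPS output
        "font.family": "sans-serif",
        "font.sans-serif": ["Arial", "Helvetica", "DejaVu Sans"],
        "font.size": 8,
    })


def sized_figure(width_in=7.0, height_in=2.8, ncols=2):
    """Create a figure at an explicit physical size in inches."""
    return plt.subplots(1, ncols, figsize=(width_in, height_in),
                        constrained_layout=True)


def panel_label(ax, letter):
    """Place a bold panel letter just outside the top-left corner of the axes."""
    ax.text(-0.15, 1.05, letter, transform=ax.transAxes,
            fontsize=12, fontweight="bold")


apply_style()
fig, (ax_a, ax_b) = sized_figure()

# Panel a: discrete estimates with error bars (placeholder data, n = 10 each).
rng = np.random.default_rng(0)
x = np.arange(1, 6)
samples = rng.normal(loc=x[:, None], scale=0.5, size=(5, 10))
mean = samples.mean(axis=1)
sem = samples.std(axis=1, ddof=1) / np.sqrt(samples.shape[1])
ax_a.errorbar(x, mean, yerr=sem, fmt="o-", color=WONG[5], capsize=2,
              label="mean ± SEM (n = 10)")
ax_a.legend(frameon=False)
panel_label(ax_a, "a")

# Panel b: continuous curves with bands, redundantly encoded by
# color, line style, and marker. The band half-width is arbitrary.
t = np.linspace(0, 10, 200)
encodings = [("-", "o", WONG[1], 0.0), ("--", "s", WONG[2], 1.0)]
for linestyle, marker, color, phase in encodings:
    y = np.sin(t + phase)
    ax_b.plot(t, y, linestyle=linestyle, marker=marker, markevery=25,
              markersize=3, color=color, label=f"phase {phase:g}")
    ax_b.fill_between(t, y - 0.2, y + 0.2, color=color, alpha=0.3)
ax_b.legend(frameon=False)
panel_label(ax_b, "b")

# Write a vector PDF; swap in your own output path.
out_path = os.path.join(tempfile.gettempdir(), "figure_sketch.pdf")
fig.savefig(out_path)
```

In practice the three helpers would live in a shared module and be imported by each figure script, so that only the plotting calls differ from figure to figure.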
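
To make the point about effect sizes with confidence intervals concrete, here is a small sketch of a mean difference with a percentile bootstrap interval. It uses only the Python standard library and is not DABEST's API; the function name `bootstrap_mean_diff` is hypothetical and the two samples are made-up illustrative values, not real measurements.

```python
import random
import statistics


def bootstrap_mean_diff(control, treated, n_boot=5000, alpha=0.05, seed=0):
    """Return (point estimate, lower, upper) for mean(treated) - mean(control).

    The interval is an approximate percentile bootstrap interval:
    each group is resampled with replacement, independently.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    point = statistics.fmean(treated) - statistics.fmean(control)

    diffs = []
    for _ in range(n_boot):
        c = rng.choices(control, k=len(control))
        t = rng.choices(treated, k=len(treated))
        diffs.append(statistics.fmean(t) - statistics.fmean(c))

    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return point, lower, upper


# Made-up illustrative samples (n = 8 per group).
control = [4.1, 3.8, 4.4, 4.0, 3.9, 4.3, 4.2, 3.7]
treated = [4.9, 5.2, 4.6, 5.0, 5.4, 4.8, 5.1, 4.7]

point, lower, upper = bootstrap_mean_diff(control, treated)
print(f"mean difference = {point:.2f}, "
      f"95% bootstrap CI [{lower:.2f}, {upper:.2f}], n = 8 per group")
```

The printed line is the kind of statement a caption should carry: the estimate, what the interval represents, and the sample size.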
Chapter 27 is applied rather than conceptually novel. The discipline it teaches — checking every figure against external requirements — serves you whenever you produce a chart for an audience with standards, whether a journal, a client, a brand guide, or a style committee. Chapter 28 closes Part VI with big data visualization strategies.