Case Study 2: The S&P 500 Chart and a Century of Financial Time Series

The chart of the S&P 500 stock index from 1920 to the present is one of the most familiar time series visualizations in the world. Financial news uses it daily; investment textbooks reproduce it routinely; newspapers include it whenever there is a market event. The chart's design has converged on a specific set of conventions over the decades: log scale on the y-axis, major recessions shaded, key events annotated. Those conventions did not emerge by accident. Each one solves a specific visualization problem created by the extreme scale and volatility of the underlying data. The S&P 500 chart is a case study in how conventions develop and why they persist.


The Situation: A Century of Compounded Returns

The S&P 500 (Standard & Poor's 500) is a stock market index representing the combined value of 500 large US companies. Its predecessor, the S&P 90, was created in 1926; the S&P 500 as we know it was launched in 1957. Data for the index going back to 1871 can be reconstructed from earlier composite indices and individual stock records. Researchers like Robert Shiller at Yale have published cleaned historical time series that allow visualization of an "S&P 500 equivalent" back to the 19th century.

Over a century of data, the S&P 500 has gone from about $5 (in nominal dollars) to over $5000. That is a factor of 1000 — three orders of magnitude. The data is extremely volatile: the 1929 crash, the Great Depression, the 1937 slump, the 1940s war years, the 1970s stagflation, the 1987 Black Monday, the 2000 dot-com crash, the 2008 financial crisis, the 2020 COVID crash, the subsequent rally — all visible in a long-term chart. Visualizing this data is a challenge because the scale is so large and the events are so many.

The conventional S&P 500 long-term chart has a specific design that has converged over decades of trial and error. It is worth examining each design choice to understand what problem it solves.

Design Choice 1: Log Scale on the Y-Axis

A linear-scaled chart of the S&P 500 from 1920 to 2024 is nearly useless. The early decades (1920–1970) appear as a flat line near zero because the index never rose much above 100, a sliver of an axis that has to reach past 5000. The recent decades (2000–2024) dominate the visual, with dramatic vertical movements. A reader looking at the linear chart would conclude that the market had been flat for half a century or more and then suddenly exploded. That reading is accurate in absolute point terms but misleading, because the "explosion" is mostly the accumulation of compound returns over decades.

The fix is a log scale. On a log y-axis, equal percentage changes appear as equal vertical distances. A move from $10 to $20 (100% gain) looks the same as a move from $100 to $200 (100% gain) and a move from $1000 to $2000 (100% gain). Because stock returns compound multiplicatively, the log scale is the right scale for comparing returns across eras.
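The arithmetic behind this claim can be checked directly. The following is a minimal sketch using only the standard library; the helper name log_distance is introduced here for illustration and is not part of any plotting API. It compares the vertical distance each of the three moves above would cover on a linear axis and on a log axis.

```python
import math


def log_distance(start, end):
    """Vertical distance between two values on a log10 axis, in decades."""
    return math.log10(end) - math.log10(start)


# The three 100% gains mentioned in the text.
moves = [(10, 20), (100, 200), (1000, 2000)]

linear = [end - start for start, end in moves]            # distance on a linear axis
distances = [log_distance(start, end) for start, end in moves]  # distance on a log axis

for (start, end), lin, dist in zip(moves, linear, distances):
    print(f"{start:>5} -> {end:<5} linear: {lin:>5}   log10: {dist:.4f}")
```

The linear distances differ by a factor of ten from one move to the next, while the log distances are all equal to log10(2), which is why a doubling looks the same in every era on a log axis.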

On a log scale, the S&P 500 chart reveals long-term patterns that are invisible on a linear chart. The post-1945 bull market, the 1970s stagflation, the 1990s boom, the 2000s lost decade, and the post-2010 bull market all become visible. Crashes and rallies of similar percentage size appear as vertical movements of similar height, regardless of when they occurred. The log scale does not flatten the recent decades; it makes the early decades legible.

This is a specific case of the general principle from Section 25.15: log scales are the right choice for data that spans many orders of magnitude and where percentage changes are meaningful. The S&P 500 satisfies both criteria.
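To see the difference between the two scales side by side, a sketch along the following lines may help. It assumes matplotlib is installed and uses a synthetic series (5 growing at a constant rate to about 5000 over 104 years) as a stand-in; it is not real S&P 500 data, and the variable names are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line for interactive use
import matplotlib.pyplot as plt

# Synthetic stand-in for the index: constant growth of about 6.9% a year
# takes 5 to roughly 5000 between 1920 and 2024. NOT real market data.
years = list(range(1920, 2025))
growth = 1000 ** (1 / 104)
levels = [5 * growth ** (year - 1920) for year in years]

fig, (ax_lin, ax_log) = plt.subplots(1, 2, figsize=(10, 4))

# Left panel: linear axis, where the early decades hug zero.
ax_lin.plot(years, levels)
ax_lin.set_title("Linear y-axis")

# Right panel: same data on a log axis, where constant growth is a straight line.
ax_log.plot(years, levels)
ax_log.set_yscale("log")
ax_log.set_title("Log y-axis")

# Call fig.savefig(...) or plt.show() to view the result.
```

On the log panel the constant-growth series is a straight line, so any bend in a real index chart corresponds to a change in growth rate rather than to the passage of time.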

Design Choice 2: Recession Shading

Most long-term S&P 500 charts include shaded regions marking US recessions, as defined by the National Bureau of Economic Research (NBER). The shading is subtle — usually a light gray band covering the recession months — but it adds critical context.

Without the shading, a reader looking at a downturn might wonder "was this a recession or just a market correction?" Recessions are specific economic events with official dates; market downturns are broader and can happen without a recession. The shading answers the question directly: you see the recession, you see the market's response, and the relationship is visible.

The recession shading also illustrates an important subtlety: the stock market tends to lead the economy. Market downturns typically begin before the recession itself does, and well before it is officially announced. The shading makes this visible: the market often peaks 6 to 12 months before the shaded band begins and troughs before the band ends. A reader can see the lead-lag relationship without needing to analyze it statistically.

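In matplotlib, the recession shading is implemented with axvspan:

import matplotlib.pyplot as plt
import pandas as pd

# NBER recession dates as (peak month, trough month). The list is partial:
# recessions before 1929 and those between 1938 and 1973 are not included.
recessions = [
    ("1929-08-01", "1933-03-01"),  # Great Depression
    ("1937-05-01", "1938-06-01"),
    ("1973-11-01", "1975-03-01"),  # OPEC crisis
    ("1980-01-01", "1980-07-01"),
    ("1981-07-01", "1982-11-01"),
    ("1990-07-01", "1991-03-01"),
    ("2001-03-01", "2001-11-01"),
    ("2007-12-01", "2009-06-01"),  # Global financial crisis
    ("2020-02-01", "2020-04-01"),  # COVID
]

fig, ax = plt.subplots()
# ... plot the index on ax here ...
for start, end in recessions:
    # One light gray band per recession, spanning the full height of the axes.
    ax.axvspan(pd.Timestamp(start), pd.Timestamp(end), alpha=0.15, color="gray")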

The same pattern works for any economic or financial chart: store the events as a list of (start, end) pairs or as rows of a DataFrame, iterate, and add shaded bands. This is the Section 25.19 pattern applied to a specific domain.
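When the events live in a DataFrame rather than a list of tuples, the loop looks much the same. The sketch below is one possible arrangement, assuming pandas and matplotlib are available; the helper shade_periods and the column names start, end, and label are choices made for this example, not an established API. The plotted line is a placeholder, not market data.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd


def shade_periods(ax, periods, color="gray", alpha=0.15):
    """Draw one vertical band per row of `periods` (columns: start, end)."""
    patches = []
    for row in periods.itertuples(index=False):
        patches.append(ax.axvspan(row.start, row.end, color=color, alpha=alpha))
    return patches


# Two of the recessions listed above, stored as rows with a descriptive label.
periods = pd.DataFrame(
    {
        "start": pd.to_datetime(["2007-12-01", "2020-02-01"]),
        "end": pd.to_datetime(["2009-06-01", "2020-04-01"]),
        "label": ["Global financial crisis", "COVID"],
    }
)

fig, ax = plt.subplots()
dates = pd.date_range("2005-01-01", "2022-01-01", freq="MS")
ax.plot(dates, list(range(len(dates))))  # placeholder line, not market data
bands = shade_periods(ax, periods)
```

Keeping the events in a DataFrame makes it easy to filter them (for example, to the date range of the chart) or to reuse the label column for annotations.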

Design Choice 3: Event Annotations

A long-term S&P 500 chart usually has a handful of annotated events: the 1929 crash, Black Monday 1987, the dot-com peak, the 2008 crisis, the COVID crash. Each is marked with a label and often a small callout line. These annotations turn the chart from "a squiggly line going up" into "a narrative about specific events that shaped the market."

The choice of which events to annotate is editorial. Too few, and the chart seems to leave out obvious moments. Too many, and the chart becomes visually cluttered. A typical long-term chart annotates 5 to 10 events, chosen for their historical importance and their visual distinctiveness. The dot-com peak near 1527 in March 2000 is memorable; a minor 5% correction in June 1996 is not, even though it is technically the same kind of event.

Annotations also illustrate the narrative dimension of time series visualization. A chart without annotations is data; a chart with annotations is a story. The annotations guide the reader through the important moments, and the reader comes away with a mental model of "how the market has evolved" rather than just "look, it went up."
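In matplotlib, a label with a small callout line can be produced with ax.annotate. The following is a rough sketch, assuming pandas and matplotlib are available; the series values and the event names "Event A" and "Event B" are hypothetical placeholders rather than real index levels or real events.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical index levels at a few dates, for illustration only.
series = pd.Series(
    [100.0, 140.0, 90.0, 180.0, 120.0, 260.0],
    index=pd.to_datetime(
        ["1990-01-01", "1995-01-01", "2000-01-01",
         "2005-01-01", "2010-01-01", "2015-01-01"]
    ),
)

# The editorial choice lives in this list: (date, label) for each event.
events = [("2000-01-01", "Event A"), ("2010-01-01", "Event B")]

fig, ax = plt.subplots()
ax.plot(series.index, series.values)
ax.set_yscale("log")

labels = []
for date, text in events:
    when = pd.Timestamp(date)
    level = series.loc[when]
    labels.append(
        ax.annotate(
            text,
            xy=(when, level),            # the data point being described
            xytext=(0, 30),              # label sits 30 points above the point
            textcoords="offset points",
            ha="center",
            arrowprops={"arrowstyle": "-", "color": "gray"},  # plain callout line
        )
    )
```

Offsetting the label in points rather than data units keeps the callout the same length on a log axis, where a fixed data offset would look different at different index levels.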

Design Choice 4: Nominal vs. Real Returns

A long-term stock chart can be shown in nominal dollars (raw prices unadjusted for inflation) or real dollars (adjusted for CPI inflation). The choice matters a lot over a century.

The S&P 500 in nominal dollars: $5 (1920) → $5000+ (2024), a factor of 1000.

The S&P 500 in real dollars (adjusted for inflation): roughly $50 (1920, in 2024 dollars) → $5000 (2024), a factor of 100.

That is a tenfold difference in the apparent gain. Nominal returns include inflation, which looks like extra gain but is actually just the dollar losing value. Real returns are the "true" gain in purchasing power. Most long-term charts show nominal returns because they are what investors actually see in their account statements, but academic and research charts often show real returns for comparability across eras.

The choice is an editorial one with different implications. Showing nominal returns makes the long-term gain look more dramatic and supports narratives of "stocks always go up." Showing real returns deflates that narrative — the 1965–1982 period, which was roughly flat in nominal terms, was actually a significant real loss because inflation was high. Disclosing which is shown is essential; a chart that uses nominal returns while implying real gain is a form of visual lie.
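The adjustment itself is a single multiplication: a nominal value is scaled by the ratio of the base-period price level to the price level at the time. The sketch below uses round numbers chosen to reproduce the figures quoted above; the price index values are hypothetical (prices assumed to rise tenfold), not actual CPI readings, and the helper to_real is named here for illustration.

```python
def to_real(nominal, price_level, base_price_level):
    """Express a nominal value in base-period dollars."""
    return nominal * base_price_level / price_level


# Round illustrative numbers matching the figures in the text.
nominal = {1920: 5.0, 2024: 5000.0}
# Hypothetical price index: prices assumed to rise tenfold over the period.
price_index = {1920: 10.0, 2024: 100.0}

# Convert both observations to 2024 dollars.
real = {
    year: to_real(nominal[year], price_index[year], price_index[2024])
    for year in nominal
}

nominal_factor = nominal[2024] / nominal[1920]
real_factor = real[2024] / real[1920]

print(real)                         # {1920: 50.0, 2024: 5000.0}
print(nominal_factor, real_factor)  # 1000.0 100.0
```

With a real price series, the same function would be applied to every date using the published index for that date, and the chart's title or axis label would state the base year.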

Design Choice 5: Rolling Returns vs. Price

The price chart (S&P 500 index level over time) is one way to visualize long-term returns. Another is a rolling returns chart: a chart of the 10-year or 20-year annualized return ending at each date. This chart has no dramatic long-term upslope; it shows the historical pattern of "what return would you have earned if you had invested for the 10 years ending at this date?"

Rolling returns charts are more useful for some questions. They reveal that the 10-year return of the S&P 500 (measured on the index level, before dividends) has been positive most of the time but negative for periods ending in the late 1930s, the mid-1970s, and around 2009 to 2011. A reader who is about to invest for a decade would find this more relevant than the price chart: they want to know the distribution of possible outcomes, not the nominal price level.

Both charts show the same underlying data. They answer different questions. For "how has the market performed historically?" use the price chart. For "what returns should I expect over a decade?" use the rolling returns chart. Neither is better; they are different tools for different questions.
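A rolling annualized return is the n-th root of the n-year growth factor, minus one. The following sketch shows one way to compute it with pandas, assuming one observation per year; the function name and the synthetic levels (a doubling over one decade followed by a flat decade) are made up for illustration and are not index data.

```python
import pandas as pd


def rolling_annualized_return(prices, years):
    """Annualized return over the `years` ending at each date.

    Assumes `prices` holds one observation per year, so shifting by
    `years` rows looks back exactly that many years.
    """
    growth = prices / prices.shift(years)
    return growth ** (1 / years) - 1


# Synthetic year-end levels: a doubling over the first decade,
# then a flat decade. For illustration only.
levels = [100.0 * 2 ** (i / 10) for i in range(11)] + [200.0] * 10
prices = pd.Series(levels, index=range(2000, 2021))

rolling = rolling_annualized_return(prices, years=10)
print(rolling.dropna().round(4))
```

The first ten entries are undefined because there is no observation ten years earlier. The value for 2010 is about 7.2% a year (a doubling in a decade) and the value for 2020 is zero, even though the price level in 2020 is as high as it has ever been in this series. With monthly data, the shift would be 120 rows while the exponent would still be 1/10.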

Theory Connection: Convention as Communication Protocol

The S&P 500 chart's conventions — log scale, recession shading, event annotations, real vs. nominal disclosure — did not emerge by accident. Each solves a specific visualization problem. Log scale handles the huge range. Shading provides context. Annotations add narrative. Real/nominal disclosure prevents misinterpretation.

Together, the conventions form a communication protocol. Readers who have seen many S&P 500 charts recognize the conventions and can read them quickly. The log scale does not need explanation because the reader has internalized "this is how financial time series work." The recession shading does not need a legend because the reader knows what those gray bands mean. The conventions reduce cognitive load because they codify decisions that all competent chart makers have already made.

This is why financial charts look similar to each other. A Bloomberg S&P 500 chart, a Yahoo Finance S&P 500 chart, and a Wall Street Journal S&P 500 chart all look alike. The conventions are shared. Violating them — a linear scale, no recession shading, no annotations — would produce a chart that looks "wrong" to financial readers, even if it is technically correct.

For practitioners, the lesson is that convention is not inertia; it is compressed knowledge. When you produce a chart in a specific field, look at how experts in that field produce charts and follow their conventions unless you have a specific reason to deviate. The conventions encode the field's accumulated experience about what works for that data and that audience. You are not being lazy by following them; you are leveraging expertise.

The flip side: conventions are context-dependent. The log scale that is standard for financial time series is unfamiliar to general audiences, who may find it confusing. The recession shading that is standard for US economic charts means little to readers outside the US. The conventions apply in context; when you move to a different audience, you may need to explain what you are doing rather than assume the reader knows.

The Impact: Financial Literacy Through Conventions

The S&P 500 long-term chart is one of the most frequently reproduced data visualizations in the world. It appears in countless financial news articles about the market, in investment textbooks, and in brokerage marketing materials. Generations of investors have learned to read it: to recognize the log scale, to interpret the recession shading, to understand the event annotations.

This has plausibly had a practical consequence. Investors who regularly see long-term charts are in a better position to build intuitions about volatility, drawdowns, and long-term returns than investors who only see short-term charts. They can see that crashes happen regularly, that recoveries have followed, and that the long-term trend has been upward. This understanding is imperfect, since recency bias still affects many investors, but the conventions of the S&P 500 chart have likely contributed to financial literacy.

The flip side is that the conventions also contribute to financial narratives that may be partly marketing. "Stocks always go up over the long term" is a message that a 100-year nominal-return chart reinforces, even though it deflates somewhat when adjusted for inflation or risk. Conventions can encode accurate information or promote specific narratives, and sometimes both. A practitioner who knows the conventions can read past them; a practitioner who does not has to be more skeptical.


Discussion Questions

  1. On log scales. The log scale is standard for financial time series but unfamiliar to general audiences. Should financial charts in general-audience publications use log or linear scales?

  2. On recession shading. The NBER defines US recessions officially, but other economic downturns (emerging markets, Europe) are less formally categorized. How should a global chart handle the shading question?

  3. On event annotations. Which events deserve annotation, and who decides? The editorial choice encodes a narrative about which moments mattered. Is this a problem or a feature?

  4. On real vs. nominal returns. Most financial charts show nominal returns because investors see them in their accounts. Academic charts show real returns. Which is more honest? Does "honest" depend on the audience?

  5. On rolling returns. The rolling returns chart is more useful for some questions but less commonly used. Why? What would make it more popular?

  6. On your own financial charts. Do you follow the S&P 500 conventions when you visualize financial data? If not, why not? If yes, what audiences are you assuming?


The S&P 500 long-term chart is one of the most carefully evolved visualizations in data journalism and finance. Its conventions (log scale, recession shading, event annotations) each solve a specific problem, and together they form a communication protocol that financial readers have internalized over decades. When you visualize financial time series, you are working in this tradition whether you know it or not. Follow the conventions unless you have a good reason to deviate, and disclose your choices clearly so the reader can interpret what you are showing.