Case Study 8.1: AO3 Tag Statistics
The Archive as Data Set
Introduction
Archive of Our Own launched in 2009 and by the early 2020s had grown to host over ten million works of fan fiction in hundreds of languages, covering thousands of fandoms. Its tagging system, a robust metadata infrastructure that lets authors tag works by fandom, pairing type, content rating, genre, character, and relationship, has inadvertently produced one of the largest datasets of human creative and sexual expression ever assembled.
The tag statistics published in AO3's annual reports provide a partial but revealing window into the gender and sexual dynamics of fan creative culture. This case study examines what those statistics show, what they cannot show, and how they bear on the arguments in Chapter 8.
What the Statistics Show
Pairing type distribution: As of the most recent published annual statistics, works tagged "M/M" (male/male pairing) form the largest single category, at roughly 40–44% of works that carry a pairing-type tag. Works tagged "Gen" (no romantic or sexual relationship) account for approximately 20–25%, "M/F" (male/female) for approximately 20–22%, and "F/F" (female/female) for approximately 10–12%. "Multi" (multiple pairing types) makes up the remainder.
This distribution has been remarkably consistent since approximately 2012, when the archive reached sufficient scale to produce meaningful statistics.
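The quoted ranges leave the size of the "Multi" remainder implicit. The short Python sketch below works it out from the four ranges given above. It assumes that every work is counted in exactly one category, so that the shares sum to 100%. That is an assumption of this sketch, not something the statistics state, and if a work can carry more than one category tag the shares need not sum to 100% at all. The function name is illustrative.

```python
# Approximate share ranges (percent) as quoted in the text above.
SHARES = {
    "M/M": (40, 44),
    "Gen": (20, 25),
    "M/F": (20, 22),
    "F/F": (10, 12),
}


def implied_remainder(shares):
    """Return the (low, high) percent left over for the unlisted category.

    Assumption: each work is counted in exactly one category,
    so all shares together sum to 100.
    """
    low_total = sum(low for low, _ in shares.values())
    high_total = sum(high for _, high in shares.values())
    # The remainder is largest when every listed share sits at its low end,
    # and it can never fall below zero.
    return max(0, 100 - high_total), max(0, 100 - low_total)


multi_low, multi_high = implied_remainder(SHARES)
print(f"Implied Multi share: {multi_low}-{multi_high}%")
```

Under that assumption the remainder is at most about 10%. The upper ends of the four ranges sum to more than 100%, so they cannot all hold at once, which is a reminder that the figures are approximations rather than a precise breakdown.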
Fandom-specific variation: The aggregate distribution varies considerably by fandom. Fandoms with large female fan bases and intense relationships between male characters (Supernatural, BBC Sherlock, hockey RPF, anime fandoms) show M/M percentages well above the aggregate. Fandoms with large ensemble casts of female characters and canonical LGBTQ+ content (The 100, Wynonna Earp) show F/F percentages close to or above their M/M percentages. Fandoms originating in Japanese anime and manga with dedicated yuri or shōjo traditions also depart from the aggregate pattern.
Top pairings over time: The most-written pairings on AO3 in the 2010s and early 2020s include Destiel (Dean Winchester/Castiel from Supernatural), Derek Hale/Stiles Stilinski from Teen Wolf, and Steve Rogers/Tony Stark from the MCU, all of them M/M pairings. Destiel was the most-written pairing on the archive for multiple years, with over 90,000 works at its peak.
What the Statistics Cannot Show
The AO3 statistics have significant methodological limitations that must be acknowledged before drawing conclusions.
Selection bias in the AO3 user base: AO3 was founded by the Organization for Transformative Works, which emerged from slash fiction and fan fiction communities that were predominantly female and LGBTQ+-heavy. The site's community culture reflects this origin: it is explicitly LGBTQ+-affirming, has robust tools for tagging LGBTQ+ content, and has a social environment that is hospitable to queer fan creative work. This means AO3 oversamples from communities where M/M fan fiction is culturally valued and normalized.
The "untagged" problem: A significant percentage of AO3 works are incompletely tagged, and works without a pairing-type tag cannot be categorized. There is no good evidence that untagged works are randomly distributed across pairing types, so the tagged works may not be representative of the archive as a whole.
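One way to see how much the untagged works could matter is to bound a category's share of all works under the two extreme cases: none of the untagged works belong to the category, or all of them do. The Python sketch below does this. The 42% tagged share is taken from the middle of the M/M range quoted earlier. The 10% untagged fraction is purely hypothetical, chosen for illustration, since the statistics do not report one.

```python
def share_bounds(tagged_share, untagged_fraction):
    """Bound a category's share of ALL works when some works have no pairing tag.

    tagged_share: the category's share among tagged works (0 to 1).
    untagged_fraction: fraction of all works with no pairing tag (0 to 1).
    """
    if not (0 <= tagged_share <= 1 and 0 <= untagged_fraction <= 1):
        raise ValueError("both arguments must be between 0 and 1")
    tagged_part = tagged_share * (1 - untagged_fraction)
    # Lower bound: no untagged work belongs to the category.
    # Upper bound: every untagged work belongs to the category.
    return tagged_part, tagged_part + untagged_fraction


# Hypothetical inputs: 42% M/M among tagged works, 10% of works untagged.
low, high = share_bounds(0.42, 0.10)
print(f"M/M share of all works: between {low:.1%} and {high:.1%}")
```

The width of the interval equals the untagged fraction, so the larger the untagged share, the less the tagged percentages can say about the archive as a whole without further evidence about what the untagged works contain.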
Language and cultural bias: AO3 is primarily an English-language archive. Japanese fan fiction (hosted primarily on pixiv), Korean fan fiction (hosted on various Korean platforms), and other non-English fan creative traditions are largely absent from AO3's statistics and have different pairing type distributions.
What "M/M" does not tell us about authors: The tag describes the pairing, not the author's identity. An M/M work could be written by a straight woman, a gay man, a bisexual person of any gender, a trans woman, a non-binary person, or any other gender/sexual configuration. The tag statistics tell us about the content of the archive, not directly about the demographics of its creators.
The Statistics and Chapter 8's Theories
The AO3 statistics provide partial evidence for the three theories of slash discussed in Chapter 8:
For the equal partners theory: The persistence of M/M's dominance over M/F despite the archive's growth and increasing mainstream adoption suggests that the motivation for M/M content is not merely that it is novel or countercultural — it remains the preferred creative form even as the archive matures and diversifies. This is consistent with a persistent motivation that goes beyond novelty, as the equal partners theory proposes.
For the queerness theory: The substantial presence of F/F content (10–12%), while smaller than M/M, is significantly larger than what a "straight women fantasizing about gay men" model would predict. If the archive's female-majority user base were predominantly straight, F/F content should constitute a much smaller percentage than it does. The F/F data is consistent with a substantial queer female presence in the archive's creator community.
For the appropriation critique: The M/M vs. F/F disparity — which has persisted even as F/F has grown — could be interpreted as evidence that the archive's (female-majority) creator community is more comfortable writing male-male desire than female-female desire. If this disparity reflects creators' own demographic composition, it raises the question of whether gay male experience is being used as a creative resource by communities whose own desire is different. However, the data is consistent with multiple alternative explanations.
The Destiel Numbers Specifically
The Destiel pairing's dominance on AO3 — over 90,000 works at its peak — provides numerical context for the case study in section 8.7. This volume exceeds the entire fan fiction archives for many smaller fandoms. It represents an extraordinary concentration of creative labor around a single relationship in a single fandom.
The volume did not collapse after the November 2020 finale, which some observers predicted would end Destiel fan production. In early 2022 the pairing was still receiving new works at a substantial rate, lower than in the peak years but steady. Vesper_of_Tuesday's observation that "the community exists regardless of what the show did" is borne out by the archival data: the community continued to produce creative work around the relationship long after the canonical text had failed to affirm it.
Analysis Questions
- The AO3 statistics show M/M fan fiction consistently dominating the archive's content. Multiple explanations are possible. What additional data would allow you to distinguish between the equal partners theory, the queerness theory, and the appropriation critique as explanations for this distribution?
- The persistence of Destiel fan production after the disappointing finale suggests that the canonical text's treatment of a fan reading does not simply determine the fan community's production. What does this tell us about the relationship between canonical authority and fan creative investment?
- The F/F vs. M/M disparity could be explained by: (a) the archive's creator community underrepresenting lesbian and bisexual women writers; (b) the structural issue that mainstream media provides fewer richly developed female-female relationships for fan imagination; (c) cultural discomfort with female sexuality in fictionalized form; or (d) some combination. Evaluate each explanation and propose how a researcher might distinguish among them.
- What are the ethical implications of using AO3 tag statistics as data for academic research? The stories are publicly accessible, but were they produced with the expectation of academic analysis? How should researchers who use this data handle the gap between public accessibility and research consent?
- If you were designing a follow-up study to the AO3 tag statistics, what survey data would you want to collect from AO3 users to supplement the archival data? What would you most want to know about the people behind the statistics?