# Case Study 02: API Docs That Developers Love — Creating Outstanding API Documentation for an Open-Source Library

## Background
Tariq Osman maintained querycraft, an open-source Python library for building type-safe database queries with a fluent API. The library had been on PyPI for eighteen months, accumulated 2,400 GitHub stars, and was used by approximately 150 companies based on download statistics. Despite its popularity, the library had a persistent problem: the GitHub issues were dominated by questions that should have been answered by documentation.
A typical month's issue tracker told the story:
- 34 new issues opened
- 14 were bug reports (legitimate)
- 12 were "how do I do X?" questions
- 5 were "is this the correct way to use Y?" questions
- 3 were feature requests that already existed but were undocumented
Tariq estimated he spent 8-10 hours per week answering questions that good documentation would have prevented. The library's documentation consisted of a README with installation instructions and a single example, plus auto-generated Sphinx docs that listed every function signature without explanation. The Sphinx docs were technically comprehensive but practically useless — they told you the function existed and what parameters it accepted, but not when you would use it or how it fit into the larger picture.
Tariq decided to overhaul the documentation completely. His goal: reduce documentation-related issues by 70% within three months.
## The Problem in Detail

Tariq analyzed the previous six months of GitHub issues to categorize the documentation gaps:

1. **Missing Getting Started Guide** (28% of questions). Users could install the library but did not know how to connect to a database, create their first query, or understand the basic workflow. The README showed one example with no context.
2. **Incomplete Feature Documentation** (24% of questions). The library supported joins, subqueries, transactions, migrations, and connection pooling. None of these had usage documentation. Users discovered features by reading source code or copying patterns from Stack Overflow answers.
3. **Missing Error Handling Guide** (18% of questions). Users frequently asked about error codes, retry strategies, and how to handle connection failures. The library raised specific exception types, but no documentation explained what each exception meant or how to handle it.
4. **No Migration Guide Between Versions** (15% of questions). The library had gone through three major versions. Users upgrading from v1 to v2 or v2 to v3 had no guidance on breaking changes or how to update their code.
5. **Unclear Advanced Patterns** (15% of questions). Power users wanted to know about query optimization, custom type converters, and extending the query builder. These advanced topics had no documentation at all.
## The Documentation Strategy
Tariq designed his documentation overhaul around the Diataxis framework, creating four distinct documentation sections:
### Tutorials (Learning-Oriented)

Tariq planned three tutorials:

1. "Your First Query" — from installation to executing a SELECT
2. "Building a CRUD API" — using querycraft with FastAPI
3. "Testing with querycraft" — using the built-in test utilities
### How-To Guides (Goal-Oriented)

Based on the issue analysis, he planned guides for the twelve most common tasks:

1. How to connect to PostgreSQL, MySQL, and SQLite
2. How to perform JOIN operations
3. How to use subqueries
4. How to manage transactions
5. How to handle connection pooling
6. How to use migrations
7. How to handle errors and retries
8. How to write custom type converters
9. How to optimize query performance
10. How to use raw SQL when needed
11. How to configure logging
12. How to migrate from v2 to v3
### Reference (Information-Oriented)

The auto-generated Sphinx docs would be replaced with mkdocstrings-powered reference pages that included:

- Complete function signatures with type annotations
- Detailed parameter descriptions
- Return value descriptions with example values
- Exception documentation
- Cross-references to related functions
- Runnable examples
### Explanation (Understanding-Oriented)

Tariq planned explanation articles for:

1. The query builder design pattern and why querycraft uses it
2. How connection pooling works under the hood
3. The type system and how it maps Python types to SQL types
4. Security considerations and SQL injection prevention
## Implementation

### Phase 1: Infrastructure (Week 1)

Tariq migrated from Sphinx to MkDocs with the Material theme and mkdocstrings. His mkdocs.yml:
```yaml
site_name: querycraft Documentation
site_url: https://querycraft.readthedocs.io
repo_url: https://github.com/tariq/querycraft

theme:
  name: material
  palette:
    - scheme: default
      primary: blue
      toggle:
        icon: material/brightness-7
        name: Switch to dark mode
    - scheme: slate
      primary: blue
      toggle:
        icon: material/brightness-4
        name: Switch to light mode
  features:
    - navigation.tabs
    - navigation.sections
    - navigation.expand
    - content.code.copy
    - content.tabs.link
    - search.suggest

nav:
  - Home: index.md
  - Getting Started:
      - Installation: getting-started/installation.md
      - Your First Query: getting-started/first-query.md
      - Building a CRUD API: getting-started/crud-api.md
      - Testing: getting-started/testing.md
  - How-To Guides:
      - Database Connections: how-to/connections.md
      - Joins: how-to/joins.md
      - Subqueries: how-to/subqueries.md
      - Transactions: how-to/transactions.md
      - Error Handling: how-to/errors.md
      - Migrations: how-to/migrations.md
      - Performance: how-to/performance.md
      - Custom Types: how-to/custom-types.md
      - Raw SQL: how-to/raw-sql.md
      - Logging: how-to/logging.md
      - Upgrading: how-to/upgrading.md
  - Reference:
      - Query Builder: reference/query-builder.md
      - Connection: reference/connection.md
      - Schema: reference/schema.md
      - Migrations: reference/migrations.md
      - Exceptions: reference/exceptions.md
  - Concepts:
      - Query Builder Pattern: concepts/query-builder-pattern.md
      - Connection Pooling: concepts/connection-pooling.md
      - Type System: concepts/type-system.md
      - Security: concepts/security.md

plugins:
  - search
  - mkdocstrings:
      handlers:
        python:
          options:
            docstring_style: google
            show_source: true
            show_root_heading: true
            members_order: source

markdown_extensions:
  - admonition
  - pymdownx.details
  - pymdownx.superfences
  - pymdownx.tabbed:
      alternate_style: true
  - pymdownx.highlight:
      anchor_linenums: true
```
He set up GitHub Actions to build and deploy the docs on every push to main:
```yaml
name: docs

on:
  push:
    branches: [main]

permissions:
  contents: write  # gh-deploy pushes the built site to the gh-pages branch

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - run: pip install mkdocs-material "mkdocstrings[python]"
      - run: mkdocs gh-deploy --force
```
### Phase 2: Docstring Overhaul (Weeks 2-3)
Before writing user-facing docs, Tariq needed the source code docstrings to be comprehensive since mkdocstrings would pull from them. He used AI to audit and improve every public docstring in the library.
His workflow for each module:
- Run the docstring analyzer to identify gaps
- Feed the module to an AI assistant with the prompt:

  ```text
  Here is the source code for querycraft's query builder module.
  Please review every public method and class and:

  1. Identify methods missing docstrings
  2. Identify docstrings that are incomplete (missing Args,
     Returns, Raises, or Examples)
  3. For each gap, generate a complete Google-style docstring

  Important context:
  - This is a database query builder library
  - Users are Python developers building web applications
  - The fluent API means most methods return self for chaining
  - All SQL generation should be injection-safe (parameterized)
  ```

- Review each suggestion, correct inaccuracies, and add examples that match actual query output
- Submit the changes as pull requests for review
The key insight Tariq discovered was that AI-generated examples often used plausible but incorrect SQL output. For every example in a docstring, he ran the actual code against a test database and used the real output:
```python
def where(
    self,
    column: str,
    operator: str = "=",
    value: Any = None,
) -> "QueryBuilder":
    """Add a WHERE clause to the query.

    Adds a condition to filter results. Multiple calls to
    ``where()`` are combined with AND. Use ``or_where()``
    for OR conditions.

    Args:
        column: The column name to filter on. Supports
            dot notation for joined tables (e.g.,
            "users.email").
        operator: Comparison operator. Supported values:
            "=", "!=", "<", ">", "<=", ">=", "LIKE",
            "IN", "NOT IN", "IS NULL", "IS NOT NULL".
        value: The value to compare against. For "IN" and
            "NOT IN", pass a list. For "IS NULL" and
            "IS NOT NULL", omit this parameter.

    Returns:
        The QueryBuilder instance for method chaining.

    Raises:
        InvalidColumnError: If column does not exist in
            the selected table(s).
        InvalidOperatorError: If operator is not in the
            supported list.

    Example:
        >>> query = (
        ...     QueryBuilder("users")
        ...     .select("name", "email")
        ...     .where("age", ">=", 18)
        ...     .where("status", "=", "active")
        ... )
        >>> print(query.to_sql())
        SELECT name, email FROM users
        WHERE age >= $1 AND status = $2
        >>> print(query.params)
        [18, 'active']
    """
    ...
```
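The fluent, parameterized pattern that this docstring documents can be sketched in miniature. The following is a hypothetical illustration of the technique, not querycraft's actual implementation; the class and attribute names are invented:

```python
from typing import Any, List


class MiniQueryBuilder:
    """Toy fluent builder illustrating chaining and parameterized SQL."""

    def __init__(self, table: str) -> None:
        self._table = table
        self._columns: List[str] = ["*"]
        self._conditions: List[str] = []
        self.params: List[Any] = []

    def select(self, *columns: str) -> "MiniQueryBuilder":
        self._columns = list(columns)
        return self  # returning self is what makes the calls chain

    def where(self, column: str, operator: str = "=", value: Any = None) -> "MiniQueryBuilder":
        self.params.append(value)
        # Emit a $1, $2, ... placeholder instead of interpolating the value,
        # which keeps the generated SQL injection-safe.
        self._conditions.append(f"{column} {operator} ${len(self.params)}")
        return self

    def to_sql(self) -> str:
        sql = f"SELECT {', '.join(self._columns)} FROM {self._table}"
        if self._conditions:
            sql += " WHERE " + " AND ".join(self._conditions)
        return sql


query = (
    MiniQueryBuilder("users")
    .select("name", "email")
    .where("age", ">=", 18)
    .where("status", "=", "active")
)
print(query.to_sql())  # SELECT name, email FROM users WHERE age >= $1 AND status = $2
print(query.params)    # [18, 'active']
```

The sketch reproduces the docstring's example output, which is exactly the property Tariq was verifying: every documented example had to match what the real code emits.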
### Phase 3: Tutorials and How-To Guides (Weeks 3-5)
Tariq wrote the tutorials himself with AI assistance for drafting. His process:
- Outline the tutorial with specific learning goals
- Write the code examples first (and test them)
- Ask the AI to generate explanatory prose around the code examples
- Edit the prose for accuracy and tone
- Add admonitions for common pitfalls
For how-to guides, he used a template:
````markdown
# How to [accomplish specific task]

## Prerequisites

- querycraft >= 3.0
- [Any additional requirements]

## Steps

### Step 1: [First action]

[Brief explanation]

```python
# Complete, runnable code
```

### Step 2: [Second action]

[Brief explanation]

```python
# Complete, runnable code
```

## Complete Example

```python
# Full working example combining all steps
```

## Common Pitfalls

!!! warning "Pitfall name"
    Description of what goes wrong and how to avoid it.
````
The how-to guides were where AI assistance was most valuable. For each guide, Tariq provided the AI with:
- The specific querycraft APIs involved
- The most common question from GitHub issues related to this topic
- Any known pitfalls or confusing aspects
The AI generated comprehensive first drafts that Tariq refined.
### Phase 4: Reference Documentation (Week 5)
With mkdocstrings, the reference documentation was largely automatic. Each reference page contained a directive that pulled from the source:
```markdown
# Query Builder Reference

::: querycraft.query_builder.QueryBuilder
    options:
      show_root_heading: true
      members_order: source
      docstring_section_style: spacy
```
Tariq added cross-references between related classes, a quick reference table at the top of each page, and a complete exceptions reference that listed every exception the library could raise with its meaning and recommended handling.
### Phase 5: Testing Documentation (Week 6)
Tariq implemented documentation testing to prevent drift:
- Example extraction: A pytest plugin extracted all code blocks from documentation files and ran them as tests
- Link checking: A CI step verified all internal and external links
- Build verification: MkDocs strict mode failed the build on any warnings
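The latter two checks can be wired into the same CI workflow. A hypothetical fragment follows; the case study does not name a link-checking tool, so linkchecker stands in as one common option:

```yaml
# Additional CI steps (sketch): strict build plus link verification
- run: mkdocs build --strict              # any warning (e.g. a broken internal link) fails the build
- run: pip install linkchecker
- run: linkchecker --check-extern site/   # crawl the built site, including external links
```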
```python
# conftest.py addition for documentation testing
def pytest_collect_file(parent, file_path):
    # Collect Markdown files (except the changelog) so their fenced code
    # blocks run as tests; DocTestFile is a custom collector defined elsewhere.
    if file_path.suffix == ".md" and file_path.name != "CHANGELOG.md":
        return DocTestFile.from_parent(parent, path=file_path)
```
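The `DocTestFile` collector is custom code not shown in the case study. Its core mechanic, pulling fenced Python blocks out of Markdown and executing them, can be sketched as follows; this is a simplified, hypothetical version (a real collector would report each block as a separate test item):

```python
import re

# Match the body of every ```python fenced block in a Markdown document.
FENCE_RE = re.compile(r"`{3}python\n(.*?)`{3}", re.DOTALL)


def extract_examples(markdown: str):
    """Return the source of each fenced Python block, in document order."""
    return [match.group(1) for match in FENCE_RE.finditer(markdown)]


def run_examples(markdown: str) -> int:
    """Execute each example in a fresh namespace; an exception means a stale example."""
    examples = extract_examples(markdown)
    for source in examples:
        exec(compile(source, "<doc-example>", "exec"), {})
    return len(examples)


# Build the fence string dynamically so this snippet can embed one.
fence = "`" * 3
doc = f"Intro prose.\n\n{fence}python\nx = 1 + 1\nassert x == 2\n{fence}\n"
print(run_examples(doc))  # 1
```

An example whose assertions no longer hold raises immediately, which is exactly the "documentation drift" failure mode the CI setup is meant to catch.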
## Results
Three months after the documentation overhaul launched, Tariq measured the impact:
**Issue Volume:**

- Documentation-related issues dropped from approximately 20/month to 4/month (an 80% reduction, exceeding the 70% goal)
- Total issues dropped from 34/month to 18/month
- The remaining documentation questions were about genuinely advanced or edge-case scenarios

**Adoption:**

- PyPI downloads increased 40% in the three months following the documentation launch
- GitHub stars grew from 2,400 to 3,800
- Three new contributors submitted their first PRs, citing the documentation as a factor

**Maintenance Time:**

- Tariq's time spent answering questions dropped from 8-10 hours/week to 2-3 hours/week
- He redirected that time to feature development, releasing two new major features

**Documentation Quality Metrics:**

- 100% of public functions and classes had complete docstrings
- All 47 code examples in documentation were verified by CI
- Documentation build time: 45 seconds
- Search worked across all documentation sections
## Key Decisions That Made the Difference
1. Choosing MkDocs Material over Sphinx. The Material theme provided a modern, searchable, mobile-friendly documentation site out of the box. The Markdown-based workflow was more approachable for contributors than reStructuredText.
2. Structuring around Diataxis. Separating tutorials, how-to guides, reference, and explanation prevented the common problem of documentation that tries to serve every audience and serves none. Users could navigate directly to the type of documentation they needed.
3. Testing every code example. The single most impactful quality measure was running documentation code examples in CI. This caught three instances of documentation drift within the first month, before users encountered broken examples.
4. Using AI for drafting, not for final content. AI assistants generated first drafts that were approximately 75-85% production-ready. Human review caught subtle inaccuracies (incorrect SQL output, wrong exception types, misleading examples) that would have eroded user trust.
5. Issue-driven prioritization. By analyzing existing GitHub issues, Tariq knew exactly which documentation was most needed. This prevented the common trap of documenting what is easy to document rather than what users need most.
6. Versioned documentation. Using mike (a MkDocs versioning plugin), Tariq served documentation for v2 and v3 simultaneously, allowing users on older versions to access accurate documentation.
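With mike, publishing and aliasing versions is a handful of commands. The following is a sketch using mike's CLI; the version numbers are illustrative:

```shell
pip install mike

# Publish the v3 docs and point the "latest" alias at them
mike deploy --push --update-aliases 3.0 latest

# Keep the v2 docs available under /2.0/
mike deploy --push 2.0

# Serve "latest" at the site root by default
mike set-default --push latest
```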
## Lessons for Your Projects
Start with your issue tracker. If you have an existing project, your issue tracker tells you exactly what documentation is missing. Categorize questions and prioritize documentation that eliminates the most common ones.
Documentation is user interface design. The same principles apply: understand your users (developers consuming your API), their goals (build something with your library), and their context (they have limited time and patience). Design documentation for their journey, not for your code structure.
AI accelerates; it does not replace. AI assistance reduced the documentation effort from an estimated six months to six weeks. But the human judgment (testing every example, verifying every claim, organizing for user needs) was what made the documentation trustworthy.
Invest in testing infrastructure. Documentation tests are the single best investment for long-term documentation quality. A broken example that reaches users does more damage than no example at all.
Measure the impact. Tariq set a specific, measurable goal (70% reduction in documentation-related issues) and tracked progress against it. Without measurement, documentation improvements feel subjective. With measurement, they are demonstrably valuable.
## Connection to Chapter Concepts
This case study demonstrates the following concepts from Chapter 23:
- Section 23.2: README improvement with structured sections and working examples
- Section 23.3: API documentation using MkDocs with mkdocstrings, replacing bare Sphinx autodoc
- Section 23.5: Comprehensive docstring overhaul following Google style conventions
- Section 23.6: User guides structured using the Diataxis framework (tutorials, how-to guides, reference, explanation)
- Section 23.9: Documentation-driven approach where documentation quality was treated as a primary project metric
- Section 23.10: Documentation maintenance through CI testing of code examples and automated build verification
The code supporting this case study is available in code/case-study-code.py.