Case Study 1: Building an MCP Server for a Company Knowledge Base
Background
Meridian Software is a mid-size technology company with 150 engineers working across 12 product teams. Over the past five years, the company has accumulated a substantial body of internal documentation: architectural decision records (ADRs), runbooks, API specifications, onboarding guides, coding standards, and post-mortem reports. This documentation lives across multiple systems — Confluence pages, a Git repository of Markdown files, and a PostgreSQL database that stores structured metadata about services, teams, and dependencies.
Engineers at Meridian have adopted AI coding assistants for daily development work, using tools like Claude Code for code generation, debugging, and refactoring. However, the AI assistants have a critical blind spot: they know nothing about Meridian's internal systems, conventions, or history. When an engineer asks the AI for help designing a new microservice, the AI provides generic best practices instead of Meridian-specific patterns. When debugging a production issue, the AI cannot access runbooks or past post-mortem reports that would provide essential context.
Sarah Chen, a senior platform engineer, is tasked with closing this gap. Her goal: build an MCP server that gives AI coding assistants structured access to Meridian's internal knowledge base, so that every engineer's AI assistant becomes a Meridian expert.
The Challenge
Sarah identifies several specific pain points through interviews with engineers across teams:
- Architecture questions. Engineers frequently need to understand why a particular architectural decision was made. The ADRs exist in a Git repository, but finding the right one requires knowing it exists and where to look.
- Service dependencies. Meridian's microservices architecture has complex dependencies. Engineers often need to understand which services depend on a given service before making changes, and this information is scattered across documentation and a service registry database.
- Coding standards. Meridian has detailed coding standards, but new engineers (and even experienced ones switching teams) struggle to find and follow them. The AI assistant currently generates code that follows generic conventions instead of Meridian's.
- Incident history. When a service misbehaves, past post-mortem reports often contain crucial debugging insights. But searching through dozens of reports manually is time-consuming, especially during an active incident.
- Onboarding. New engineers spend weeks learning Meridian's systems. If the AI assistant could answer questions about internal architecture and conventions, onboarding would accelerate dramatically.
The Solution Architecture
Sarah designs an MCP server called meridian-knowledge with three categories of capabilities:
Tools
- `search_documentation` — Full-text search across all documentation sources
- `search_adrs` — Search architectural decision records by keyword, date, or status
- `get_service_info` — Retrieve information about a specific service, including dependencies, team ownership, and health metrics
- `search_postmortems` — Search past incident reports by service, date range, or keyword
- `find_coding_standard` — Find the relevant coding standard for a given language, framework, or topic
Resources
- `docs://standards/python` — Python coding standards document
- `docs://standards/api-design` — API design guidelines
- `docs://standards/testing` — Testing requirements and patterns
- `docs://architecture/overview` — High-level architecture overview
- `docs://onboarding/checklist` — New engineer onboarding checklist
Prompts
- `review_for_meridian_standards` — Code review prompt that includes Meridian-specific standards
- `design_new_service` — Service design prompt that follows Meridian's architecture patterns
- `investigate_incident` — Incident investigation prompt that searches relevant post-mortems
Implementation
Data Layer
Sarah starts with the data layer, building adapters for each documentation source:
```python
import httpx


class ConfluenceAdapter:
    """Adapter for searching and reading Confluence pages."""

    def __init__(self, base_url: str, api_token: str):
        self.base_url = base_url
        self.headers = {
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        }

    async def search(self, query: str, space: str | None = None, limit: int = 10):
        # Build a CQL (Confluence Query Language) expression for the search.
        cql = f'text ~ "{query}"'
        if space:
            cql += f' AND space = "{space}"'
        async with httpx.AsyncClient() as client:
            response = await client.get(
                f"{self.base_url}/rest/api/content/search",
                headers=self.headers,
                params={"cql": cql, "limit": limit},
            )
            response.raise_for_status()
            data = response.json()
        return [
            {
                "title": r["title"],
                "id": r["id"],
                "url": f"{self.base_url}/wiki/spaces/{r['space']['key']}/pages/{r['id']}",
                "excerpt": r.get("excerpt", ""),
            }
            for r in data.get("results", [])
        ]
```
The ADR adapter reads from the Git repository:
```python
from pathlib import Path


class ADRAdapter:
    """Adapter for searching and reading ADRs from a Git repository."""

    def __init__(self, adr_directory: str):
        self.adr_dir = Path(adr_directory)

    async def search(self, query: str, status: str | None = None):
        results = []
        for adr_file in sorted(self.adr_dir.glob("*.md")):
            content = adr_file.read_text(encoding="utf-8")
            metadata = self._parse_metadata(content)
            if status and metadata.get("status", "").lower() != status.lower():
                continue
            if query.lower() in content.lower():
                results.append({
                    "file": adr_file.name,
                    "title": metadata.get("title", adr_file.stem),
                    "status": metadata.get("status", "unknown"),
                    "date": metadata.get("date", "unknown"),
                    "summary": self._extract_summary(content),
                })
        return results

    def _parse_metadata(self, content: str) -> dict:
        metadata = {}
        # Title and metadata fields are expected in the first 20 lines.
        for line in content.split("\n")[:20]:
            if line.startswith("# "):
                metadata["title"] = line[2:].strip()
            elif line.startswith("Status:"):
                metadata["status"] = line.split(":", 1)[1].strip()
            elif line.startswith("Date:"):
                metadata["date"] = line.split(":", 1)[1].strip()
        return metadata

    def _extract_summary(self, content: str) -> str:
        lines = content.split("\n")
        summary_lines = []
        in_summary = False
        for line in lines:
            if "## Context" in line or "## Summary" in line:
                in_summary = True
                continue
            elif line.startswith("## ") and in_summary:
                break
            elif in_summary and line.strip():
                summary_lines.append(line.strip())
        # Keep the summary short: only the first three non-empty lines.
        return " ".join(summary_lines[:3])
```
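The parser above assumes each ADR carries a lightweight plain-text header in its first 20 lines. A hypothetical file it can index (the decision and its details are illustrative, not from Meridian's actual repository):

```markdown
# 0014: Use Kafka for asynchronous messaging

Status: accepted
Date: 2021-03-04

## Context
Services call each other synchronously today, which couples deployments
and amplifies outages across team boundaries.

## Decision
Adopt Kafka as the standard event bus for inter-service messaging.
```

The `# ` heading becomes the title, the `Status:` and `Date:` fields feed the filters, and everything under `## Context` up to the next `## ` heading becomes the search-result summary.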
The service registry adapter queries the PostgreSQL database:
```python
import asyncpg


class ServiceRegistryAdapter:
    """Adapter for querying the service registry database."""

    def __init__(self, connection_string: str):
        self.connection_string = connection_string

    async def get_service(self, service_name: str) -> dict:
        # Creating a pool per call keeps the example simple; a long-lived
        # server would create one pool at startup and reuse it.
        async with asyncpg.create_pool(self.connection_string) as pool:
            async with pool.acquire() as conn:
                service = await conn.fetchrow(
                    "SELECT * FROM services WHERE name = $1",
                    service_name,
                )
                if not service:
                    return {"error": f"Service not found: {service_name}"}
                deps = await conn.fetch(
                    "SELECT dependency_name, dependency_type "
                    "FROM service_dependencies WHERE service_name = $1",
                    service_name,
                )
                dependents = await conn.fetch(
                    "SELECT service_name FROM service_dependencies "
                    "WHERE dependency_name = $1",
                    service_name,
                )
                return {
                    "name": service["name"],
                    "team": service["team"],
                    "language": service["language"],
                    "repository": service["repository"],
                    "description": service["description"],
                    "dependencies": [
                        {"name": d["dependency_name"], "type": d["dependency_type"]}
                        for d in deps
                    ],
                    "dependents": [d["service_name"] for d in dependents],
                    "health_endpoint": service.get("health_endpoint"),
                }
```
MCP Server Assembly
With adapters in place, Sarah assembles the MCP server:
```python
import os

from mcp.server import Server
from mcp.types import Tool

server = Server("meridian-knowledge")

confluence = ConfluenceAdapter(
    base_url=os.environ["CONFLUENCE_URL"],
    api_token=os.environ["CONFLUENCE_TOKEN"],
)
adrs = ADRAdapter(os.environ["ADR_DIRECTORY"])
registry = ServiceRegistryAdapter(os.environ["SERVICE_DB_URL"])


@server.list_tools()
async def list_tools() -> list[Tool]:
    return [
        Tool(
            name="search_documentation",
            description=(
                "Search Meridian's internal documentation including "
                "Confluence pages, guides, and standards. Use this when "
                "you need to find information about Meridian's systems, "
                "processes, or conventions."
            ),
            inputSchema={
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "Search query for finding relevant docs",
                    },
                    "space": {
                        "type": "string",
                        "description": "Confluence space to search (optional)",
                        "enum": ["ENG", "PRODUCT", "OPS", "SECURITY"],
                    },
                    "max_results": {
                        "type": "integer",
                        "default": 5,
                    },
                },
                "required": ["query"],
            },
        ),
        Tool(
            name="search_adrs",
            description=(
                "Search Meridian's Architectural Decision Records. Use this "
                "when you need to understand why an architectural choice was "
                "made, what alternatives were considered, and what the "
                "current status of a decision is."
            ),
            inputSchema={
                "type": "object",
                "properties": {
                    "query": {"type": "string"},
                    "status": {
                        "type": "string",
                        "enum": ["accepted", "proposed", "deprecated", "superseded"],
                    },
                },
                "required": ["query"],
            },
        ),
        Tool(
            name="get_service_info",
            description=(
                "Get detailed information about a Meridian microservice "
                "including its team ownership, dependencies, dependents, "
                "language, and repository. Use this before making changes "
                "that might affect other services."
            ),
            inputSchema={
                "type": "object",
                "properties": {
                    "service_name": {
                        "type": "string",
                        "description": "Name of the service (e.g., 'auth-service', 'payment-gateway')",
                    },
                },
                "required": ["service_name"],
            },
        ),
    ]
```
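Listing tools is only half of the server; invocations arrive through a single `call_tool` handler that routes each request to the right adapter and serializes the result. The routing core can be sketched as a plain function (the handler registry and the stub adapter below are illustrative, not part of Sarah's code — in the real server this logic would live inside an `@server.call_tool()` handler that wraps the JSON in a `TextContent` result):

```python
import asyncio
import json


async def dispatch_tool(name: str, arguments: dict, handlers: dict) -> str:
    """Route a tool call to its registered handler and serialize the result."""
    if name not in handlers:
        raise ValueError(f"Unknown tool: {name}")
    result = await handlers[name](arguments)
    return json.dumps(result, indent=2)


# A stub standing in for adrs.search, to show the flow end to end.
async def fake_search_adrs(args: dict) -> list[dict]:
    return [{"title": "Use Kafka", "status": "accepted"}]


payload = asyncio.run(
    dispatch_tool("search_adrs", {"query": "kafka"}, {"search_adrs": fake_search_adrs})
)
```

Keeping the dispatch table explicit makes it cheap to add the remaining tools (`search_postmortems`, `find_coding_standard`) without touching the protocol plumbing.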
Tool Descriptions as the Secret Weapon
Sarah discovers that the most impactful part of the implementation is not the code — it is the tool descriptions. She iterates on them through testing with real engineering queries. Her initial description for search_adrs was simply "Search ADRs." After observing that the AI rarely chose the tool, she rewrote it:
"Search Meridian's Architectural Decision Records. Use this when you need to understand why an architectural choice was made, what alternatives were considered, and what the current status of a decision is."
This longer description dramatically improved the AI's ability to select the right tool at the right time. It answers the three questions an AI needs answered: What does this do? When should I use it? What will I get back?
Testing Strategy
Sarah implements a three-tier testing strategy:
Tier 1: Unit Tests for each adapter, verifying correct parsing, error handling, and edge cases. For example, testing the ADR adapter with malformed Markdown files, empty directories, and ADRs without standard metadata fields.
Tier 2: Integration Tests that verify the full MCP protocol flow — initializing the server, listing tools, and invoking them with realistic inputs. Sarah uses the mcp client library to simulate a real AI client.
Tier 3: Scenario Tests that simulate realistic engineering workflows. For example: "An engineer is designing a new notification service. The AI should search for existing ADRs about messaging patterns, check the service registry for existing notification-related services, and review the API design standards."
Sarah runs scenario tests weekly with real AI clients, recording tool call logs to identify cases where the AI fails to use the right tool or uses it incorrectly.
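A Tier-1 test can pin down the metadata-parsing contract against both well-formed and malformed ADRs. To keep this sketch self-contained, the parsing rules are restated as a standalone function; the real suite would import and exercise `ADRAdapter` directly:

```python
def parse_adr_metadata(content: str) -> dict:
    """Standalone restatement of the ADR header-parsing rules:
    scan only the first 20 lines for a title, Status, and Date."""
    metadata = {}
    for line in content.split("\n")[:20]:
        if line.startswith("# "):
            metadata["title"] = line[2:].strip()
        elif line.startswith("Status:"):
            metadata["status"] = line.split(":", 1)[1].strip()
        elif line.startswith("Date:"):
            metadata["date"] = line.split(":", 1)[1].strip()
    return metadata


def test_well_formed_adr():
    content = "# Use Kafka for events\nStatus: accepted\nDate: 2021-03-04\n"
    assert parse_adr_metadata(content) == {
        "title": "Use Kafka for events",
        "status": "accepted",
        "date": "2021-03-04",
    }


def test_malformed_adr_yields_no_metadata():
    # Empty files and files without the expected header produce an
    # empty dict, which the adapter maps to "unknown" defaults.
    assert parse_adr_metadata("") == {}
    assert parse_adr_metadata("just prose, no headings") == {}
```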
Deployment
The server is deployed in two modes:
- Local mode for development: Engineers run the server locally with stdio transport. Environment variables point to staging instances of Confluence and the service registry.
- Shared mode for production: A single instance runs on an internal server with SSE transport, accessible to all engineers. This ensures everyone gets consistent search results and reduces load on Confluence and the database.
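For local mode, each engineer registers the server in their MCP client configuration so it launches over stdio. A hypothetical entry in the format used by Claude Desktop and Claude Code (the module name and all values are placeholders; the environment variable names match the server assembly code):

```json
{
  "mcpServers": {
    "meridian-knowledge": {
      "command": "python",
      "args": ["-m", "meridian_knowledge"],
      "env": {
        "CONFLUENCE_URL": "https://confluence.staging.meridian.internal",
        "CONFLUENCE_TOKEN": "<api-token>",
        "ADR_DIRECTORY": "/path/to/adr-repo/docs/adr",
        "SERVICE_DB_URL": "postgresql://readonly@staging-db/services"
      }
    }
  }
}
```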
Sarah's team sets up monitoring with Prometheus metrics for tool call frequency, latency, error rates, and cache hit rates. Alerts fire if error rates exceed 5% or if latency exceeds 3 seconds.
Results
After three months of deployment:
- 87% of engineers adopted the MCP server in their daily workflow
- Onboarding time for new engineers decreased by approximately 35%, measured by time-to-first-meaningful-contribution
- ADR discoverability improved dramatically — searches for ADRs increased by 400% compared to the previous manual search approach
- Incident response improved because engineers could quickly find relevant post-mortem reports during active incidents
- Code review feedback related to Meridian-specific conventions decreased by 50%, as the AI now incorporated Meridian standards into its suggestions
The most unexpected benefit was knowledge sharing. Engineers discovered ADRs and documentation they never knew existed. The AI assistant surfaced relevant information proactively, creating connections across teams that had previously operated in silos.
Lessons Learned
- Tool descriptions matter more than tool implementation. The same underlying functionality with a better description performs dramatically better because the AI selects and uses the tool more appropriately.
- Start with search, not CRUD. Read-only tools are safer, easier to test, and provide immediate value. Write operations can be added later when trust in the system is established.
- Cache aggressively. Documentation changes infrequently. Caching search results for 15 minutes reduced Confluence API calls by 80% with no noticeable impact on result freshness.
- Monitor tool usage patterns. Understanding which tools engineers use most frequently (and which they do not use at all) guides investment in new tools and improvements to existing ones.
- Iterate on descriptions weekly. Sarah reviews tool call logs weekly, looking for cases where the AI chose the wrong tool or failed to use a tool that would have been helpful. Each iteration of the descriptions improves the AI's behavior.
- Security requires ongoing attention. Even with read-only tools, Sarah implemented rate limiting and audit logging. When a write tool for creating ADRs was added later, she implemented an approval workflow where the AI proposes changes but a human must approve them.
Architecture Diagram
┌────────────────────────────────────────────────────┐
│ AI Coding Assistant │
│ (Claude Code / Claude Desktop) │
│ │
│ "What messaging pattern does Meridian use?" │
│ │ │
│ MCP Client │
└────────────────────────┼─────────────────────────────┘
│ JSON-RPC (stdio or SSE)
│
┌────────────────────────┼─────────────────────────────┐
│ meridian-knowledge MCP Server │
│ │ │
│ ┌──────────────┬────┴────┬───────────────┐ │
│ │ search_docs │search_ │ get_service_ │ │
│ │ │adrs │ info │ │
│ └──────┬───────┴────┬────┴───────┬───────┘ │
│ │ │ │ │
│ ┌──────┴─────┐ ┌────┴────┐ ┌────┴──────┐ │
│ │ Confluence │ │ ADR │ │ Service │ │
│ │ Adapter │ │ Adapter │ │ Registry │ │
│ └──────┬─────┘ └────┬────┘ │ Adapter │ │
│ │ │ └────┬──────┘ │
└───────────┼────────────┼───────────┼────────────────┘
│ │ │
┌──────┴─────┐ ┌───┴────┐ ┌───┴──────┐
│ Confluence │ │ Git │ │PostgreSQL│
│ API │ │ Repo │ │ Database │
└────────────┘ └────────┘ └──────────┘
Discussion Questions
- How would you extend this system to support write operations (e.g., creating new ADRs or updating service registry entries) while maintaining appropriate safety guardrails?
- The current implementation searches each data source independently. How would you implement a unified ranking algorithm that orders results from different sources by relevance?
- What privacy and access control considerations would you add if some documentation is restricted to specific teams?
- How would this system need to change if Meridian grew to 500 engineers with 50 teams? What scalability challenges would emerge?
- Sarah's team observes that some engineers "over-rely" on the knowledge base tool, asking the AI to search for things they should learn permanently. How would you address this from both a tooling and a team culture perspective?