
> "The most powerful AI coding assistant is not the one with the most built-in features — it is the one that can be extended to fit exactly how you work." — Adapted from the Unix philosophy

Chapter 37: Custom Tools, MCP Servers, and Extending AI

"The most powerful AI coding assistant is not the one with the most built-in features — it is the one that can be extended to fit exactly how you work." — Adapted from the Unix philosophy


Learning Objectives

After completing this chapter, you will be able to:

  • Analyze the extensibility architecture of modern AI coding assistants and identify opportunities for custom tool integration (Bloom's: Analyze)
  • Evaluate the Model Context Protocol (MCP) as a standard for connecting AI assistants to external systems, comparing it with alternative integration approaches (Bloom's: Evaluate)
  • Create fully functional MCP servers that expose tools, resources, and prompts to AI coding assistants (Bloom's: Create)
  • Apply tool schema definitions and handler implementations to build custom tools with proper input validation and error handling (Bloom's: Apply)
  • Create data source integrations that give AI assistants access to databases, APIs, and file systems through structured interfaces (Bloom's: Create)
  • Apply custom slash command patterns to streamline repetitive development workflows (Bloom's: Apply)
  • Design middleware pipelines that add pre-processing, validation, logging, and post-processing to AI tool interactions (Bloom's: Create)
  • Evaluate testing strategies for custom tools, including unit testing, integration testing, and simulated AI interaction testing (Bloom's: Evaluate)
  • Apply deployment and distribution patterns for sharing MCP servers across teams and the broader community (Bloom's: Apply)
  • Analyze the growing custom tool ecosystem and predict how extensibility will shape the future of AI-assisted development (Bloom's: Analyze)

Prerequisites

This chapter assumes you have completed:

  • Chapter 36: AI Coding Agents — You understand how autonomous AI agents use tools to accomplish tasks, and how tool calling fits into the agent execution loop.
  • Chapter 20: External APIs and Integrations — You are comfortable with REST APIs, authentication patterns, and integrating external services.
  • Chapter 17: Backend Development and REST APIs — You understand server-side development, request handling, and JSON-based communication.

You should also have practical experience with Python (Chapter 5) and be comfortable with asynchronous programming concepts. If you have not worked with async/await in Python, you will still be able to follow along, but some implementation details will require additional study.


Introduction

In Chapter 36, you learned how AI coding agents use tools to interact with the world — reading files, executing commands, searching codebases, and more. Those built-in tools are powerful, but they represent only a fraction of what is possible. Every development team has unique workflows, proprietary systems, internal knowledge bases, and specialized requirements that no general-purpose AI tool can anticipate.

This is where extensibility transforms AI coding assistants from impressive demos into indispensable infrastructure. When you can teach your AI assistant to query your company's internal documentation, run your team's custom linting rules, interact with your deployment pipeline, or access your proprietary data models, the assistant stops being a generic tool and becomes a deeply integrated member of your development workflow.

The Model Context Protocol (MCP) has emerged as the open standard that makes this extensibility practical. MCP defines a structured way for AI assistants to discover and invoke external tools, access external data sources, and use pre-defined prompt templates — all through a protocol that any developer can implement. Think of MCP as doing for AI integrations what HTTP did for the web: it provides a universal language that lets AI assistants talk to any service, regardless of who built it or where it runs.

This chapter takes you from understanding the extensibility opportunity through building, testing, and deploying your own MCP servers and custom tools. By the end, you will have the knowledge to extend any MCP-compatible AI assistant with capabilities tailored precisely to your needs.


37.1 The Extensibility Opportunity

Why Built-In Tools Are Not Enough

Modern AI coding assistants ship with impressive built-in capabilities. They can read and write files, execute shell commands, search code, browse the web, and interact with version control systems. For many tasks, these capabilities are sufficient. But consider these scenarios:

  • Your team stores architectural decision records (ADRs) in a Confluence wiki. The AI assistant cannot search them.
  • Your deployment pipeline uses a custom CLI tool with dozens of flags. The AI assistant does not know about it.
  • Your company's coding standards include domain-specific rules that go beyond standard linters.
  • Your microservices communicate through a custom message bus with its own query language.
  • Your QA team tracks test results in a proprietary system that has a REST API but no AI integration.

In each case, the AI assistant lacks context that would make it dramatically more effective. The extensibility opportunity is about closing this gap — giving your AI assistant access to the same systems, knowledge, and tools that human developers use every day.

The Evolution of Tool Integration

Tool integration with AI has evolved through several generations:

Generation 1: Copy-Paste Context. Developers manually copied relevant information into chat prompts. This worked but was tedious, error-prone, and limited by context window sizes.

Generation 2: Plugin Systems. Platforms like ChatGPT introduced plugin architectures that let developers build integrations. These were platform-specific and required custom implementations for each AI provider.

Generation 3: Function Calling. AI models gained the ability to call functions defined by developers, with structured input and output. This was more flexible but still required platform-specific implementation.

Generation 4: The Model Context Protocol. MCP provides a standardized, open protocol that works across AI providers and tools. A single MCP server can be used by any MCP-compatible client — Claude, VS Code extensions, custom agents, and more.

Key Insight

The transition from platform-specific plugins to the open MCP standard mirrors the web's evolution from proprietary online services (CompuServe, AOL) to the open HTTP/HTML standards. Open standards win because they reduce the cost of integration for everyone. A tool developer writes one MCP server instead of separate integrations for each AI platform.

Categories of Custom Tools

Custom tools generally fall into several categories, each serving a different extensibility need:

| Category | Description | Examples |
| --- | --- | --- |
| Knowledge Access | Give AI access to information sources | Internal docs, wikis, knowledge bases |
| System Interaction | Let AI interact with development infrastructure | CI/CD pipelines, monitoring, deployment |
| Data Queries | Enable AI to query structured data | Databases, analytics, logs |
| Code Operations | Add specialized code analysis or transformation | Custom linters, formatters, generators |
| Workflow Automation | Automate multi-step development processes | Release processes, onboarding, reviews |
| Domain Tools | Provide domain-specific capabilities | Financial calculations, medical coding, legal research |

Understanding these categories helps you identify where custom tools would add the most value to your specific workflow.

The Build vs. Buy Decision

Before building custom tools, consider whether existing solutions already meet your needs. The MCP ecosystem is growing rapidly, and community-built servers cover many common use cases. Building custom tools makes the most sense when:

  1. Your data is proprietary. No community tool can access your internal systems.
  2. Your workflow is unique. Standard tools do not match your team's specific processes.
  3. Security requirements demand it. You need full control over how data flows to and from the AI.
  4. Integration depth matters. You need tighter integration than a generic tool provides.
  5. Competitive advantage is at stake. Your custom tooling gives your team capabilities others lack.

Practical Tip

Start with the highest-impact, lowest-complexity tool. Often this is a simple knowledge access tool that gives your AI assistant access to your team's documentation or coding standards. The immediate productivity gain builds organizational support for more ambitious tool development.


37.2 Understanding the Model Context Protocol (MCP)

What MCP Is

The Model Context Protocol (MCP) is an open standard, originally developed by Anthropic, that defines how AI applications communicate with external tools and data sources. MCP follows a client-server architecture: the AI application (or its host) acts as an MCP client, and external services implement MCP servers that expose capabilities through a structured protocol.

MCP defines three primary types of capabilities that servers can expose:

  1. Tools — Functions that the AI can invoke to perform actions. Tools have defined input schemas, execute operations, and return results. Examples: searching a database, creating a ticket, running a calculation.

  2. Resources — Data sources that the AI can read. Resources provide context without requiring the AI to take an action. Examples: configuration files, documentation pages, database schemas.

  3. Prompts — Pre-defined prompt templates that guide AI behavior for specific tasks. Prompts can include parameters and are selected by the user or the AI as needed. Examples: code review templates, debugging workflows, analysis frameworks.

The MCP Architecture

The MCP architecture consists of several key components:

┌─────────────────────────────────────────────┐
│              AI Application                  │
│  (Claude, VS Code Extension, Custom Agent)   │
│                                              │
│  ┌────────────────────────────────────────┐  │
│  │           MCP Client                   │  │
│  │  - Discovers servers                   │  │
│  │  - Manages connections                 │  │
│  │  - Routes tool calls                   │  │
│  │  - Handles responses                   │  │
│  └────────────┬───────────────────────────┘  │
└───────────────┼──────────────────────────────┘
                │  JSON-RPC over stdio/SSE/HTTP
                │
    ┌───────────┼───────────────────┐
    │           │                   │
    ▼           ▼                   ▼
┌────────┐ ┌────────┐        ┌────────┐
│ MCP    │ │ MCP    │  ...   │ MCP    │
│Server A│ │Server B│        │Server N│
│        │ │        │        │        │
│-Tools  │ │-Tools  │        │-Tools  │
│-Resrces│ │-Resrces│        │-Resrces│
│-Prompts│ │-Prompts│        │-Prompts│
└────────┘ └────────┘        └────────┘

MCP Host: The application that contains the AI model and initiates connections. This could be Claude Desktop, an IDE extension, or a custom application.

MCP Client: The protocol handler within the host that manages communication with MCP servers. The client discovers available servers, establishes connections, and routes requests.

MCP Server: An external process that implements the MCP protocol and exposes tools, resources, and/or prompts. Servers can be written in any language and can connect to any external system.

Transport Layer: MCP supports multiple transport mechanisms. The most common are:

  • stdio (Standard I/O): The client launches the server as a subprocess and communicates through stdin/stdout. This is the simplest transport and works well for local tools.
  • SSE (Server-Sent Events): The server runs as an HTTP service that uses SSE for server-to-client messages and HTTP POST for client-to-server messages. This works for remote servers.
  • Streamable HTTP: A newer transport that uses standard HTTP requests with optional streaming, providing a simpler alternative to SSE for remote deployments.

The MCP Message Flow

Communication between client and server follows the JSON-RPC 2.0 protocol. Here is a typical interaction flow:

Client                          Server
  │                                │
  │─── initialize ────────────────>│  Handshake: exchange capabilities
  │<── initialize result ──────────│
  │                                │
  │─── initialized ───────────────>│  Client confirms
  │                                │
  │─── tools/list ────────────────>│  Discover available tools
  │<── tools/list result ──────────│  Returns tool schemas
  │                                │
  │─── resources/list ────────────>│  Discover available resources
  │<── resources/list result ──────│  Returns resource descriptions
  │                                │
  │     ... AI decides to use      │
  │     a tool ...                 │
  │                                │
  │─── tools/call ────────────────>│  Invoke a specific tool
  │<── tools/call result ──────────│  Returns tool output
  │                                │

The initialization handshake is critical — it establishes which protocol version both sides support and what capabilities are available. After initialization, the client can discover tools, resources, and prompts, and then invoke them as needed during the AI's operation.

Technical Note

MCP uses JSON-RPC 2.0, which means every message is a JSON object with a jsonrpc field set to "2.0", a method field specifying the operation, and either params for requests or result/error for responses. Requests include an id field for matching responses to requests.
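Concretely, a tool invocation and its reply would look roughly like this on the wire (a sketch; the tool name, id, and arguments are invented for illustration, and the result follows the tools/call shape of a content array):

```json
{
  "jsonrpc": "2.0",
  "id": 42,
  "method": "tools/call",
  "params": {
    "name": "count_words",
    "arguments": {"text": "hello world"}
  }
}
```

```json
{
  "jsonrpc": "2.0",
  "id": 42,
  "result": {
    "content": [{"type": "text", "text": "{\"word_count\": 2}"}]
  }
}
```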

Tool Schemas in MCP

Each tool exposed by an MCP server has a schema that describes its name, purpose, and expected inputs. These schemas use JSON Schema to define input parameters, making them self-documenting and validatable:

{
  "name": "search_documentation",
  "description": "Search the company knowledge base for relevant documentation. Returns matching articles with titles, summaries, and URLs.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "The search query to find relevant documentation"
      },
      "max_results": {
        "type": "integer",
        "description": "Maximum number of results to return (default: 5)",
        "default": 5
      },
      "category": {
        "type": "string",
        "description": "Optional category filter",
        "enum": ["engineering", "product", "design", "operations"]
      }
    },
    "required": ["query"]
  }
}

The AI uses these schemas to understand what each tool does, what inputs it needs, and how to format its requests. Well-written descriptions are crucial — they are the primary way the AI decides when and how to use each tool.

Key Insight

Tool descriptions are essentially prompts. The quality of your tool's description directly affects how well the AI uses it. Write descriptions that explain not just what the tool does, but when it should be used and what kind of results it returns. Think of the description as instructions for a new team member who needs to know when to reach for this particular tool.

Resource URIs in MCP

Resources in MCP are identified by URIs and can represent any kind of readable data:

{
  "uri": "docs://engineering/architecture/microservices",
  "name": "Microservices Architecture Guide",
  "description": "Company's microservices architecture patterns and guidelines",
  "mimeType": "text/markdown"
}

Resources can be static (their content does not change during a session) or dynamic (their content is generated on each request). The AI can read resources to gather context before performing tasks, much like a developer reading documentation before writing code.
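As a sketch of what a dynamic resource handler might look like (the status:// URI scheme here is invented for illustration), the content is generated fresh on each read:

```python
import json
from datetime import datetime, timezone


def read_dynamic_resource(uri: str) -> str:
    """Generate resource content on each request (dynamic resource sketch)."""
    if uri == "status://server/health":
        # Computed fresh every time the AI reads the resource
        return json.dumps({
            "status": "ok",
            "checked_at": datetime.now(timezone.utc).isoformat(),
        })
    raise ValueError(f"Unknown resource: {uri}")
```

In a real MCP server this function body would live inside a read_resource handler like the ones shown later in this chapter.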

Prompt Templates in MCP

Prompts in MCP are reusable templates that structure the AI's approach to specific tasks:

{
  "name": "code_review",
  "description": "Structured code review following company standards",
  "arguments": [
    {
      "name": "file_path",
      "description": "Path to the file to review",
      "required": true
    },
    {
      "name": "focus_area",
      "description": "Specific aspect to focus on (security, performance, readability)",
      "required": false
    }
  ]
}

When a prompt is selected, the server returns a series of messages that establish context and guide the AI's behavior. This is particularly powerful for standardizing workflows across a team.


37.3 Building MCP Servers

Setting Up Your Development Environment

Building MCP servers in Python requires the mcp package, which provides the server framework, transport handlers, and protocol implementation. Install it along with common dependencies:

pip install mcp httpx pydantic

The mcp package provides a high-level API built on Python's asyncio, so your server code will use async/await throughout.

Your First MCP Server

Let us build a minimal MCP server that exposes a single tool. This server provides a "word count" tool that counts words in a given text:

from mcp.server import Server
from mcp.server.stdio import stdio_server
from mcp.types import Tool, TextContent
import json

# Create the server instance
server = Server("word-counter")


@server.list_tools()
async def list_tools() -> list[Tool]:
    """Return the list of tools this server provides."""
    return [
        Tool(
            name="count_words",
            description="Count the number of words in a given text.",
            inputSchema={
                "type": "object",
                "properties": {
                    "text": {
                        "type": "string",
                        "description": "The text to count words in",
                    }
                },
                "required": ["text"],
            },
        )
    ]


@server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    """Handle tool invocations."""
    if name == "count_words":
        text = arguments.get("text", "")
        word_count = len(text.split())
        return [
            TextContent(
                type="text",
                text=json.dumps({"word_count": word_count}),
            )
        ]
    raise ValueError(f"Unknown tool: {name}")


async def main():
    """Run the server using stdio transport."""
    async with stdio_server() as (read_stream, write_stream):
        await server.run(
            read_stream,
            write_stream,
            server.create_initialization_options(),
        )


if __name__ == "__main__":
    import asyncio
    asyncio.run(main())

This example demonstrates the three essential components of any MCP server:

  1. Server instantiation — Creating a Server object with a name that identifies it.
  2. Tool listing — Implementing a handler that returns tool schemas so the client knows what is available.
  3. Tool execution — Implementing a handler that receives tool calls and returns results.
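Before wiring the server to a client, the handler logic can be exercised directly. Here is a simplified stand-in that mirrors the count_words handler, with TextContent replaced by a plain dict so it runs without the mcp package installed:

```python
import asyncio
import json


async def call_tool(name: str, arguments: dict) -> list[dict]:
    """Mirror of the count_words handler, using plain dicts for illustration."""
    if name == "count_words":
        text = arguments.get("text", "")
        return [{"type": "text", "text": json.dumps({"word_count": len(text.split())})}]
    raise ValueError(f"Unknown tool: {name}")


result = asyncio.run(call_tool("count_words", {"text": "the quick brown fox"}))
print(result[0]["text"])  # {"word_count": 4}
```

Driving handlers this way makes a convenient smoke test before moving on to the MCP Inspector or a full client.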

Adding Resources to Your Server

Resources let your server expose readable data. Here is how to add documentation resources to the server:

from mcp.types import Resource


@server.list_resources()
async def list_resources() -> list[Resource]:
    """Return the list of resources this server provides."""
    return [
        Resource(
            uri="docs://guides/getting-started",
            name="Getting Started Guide",
            description="Introduction to the project and setup instructions",
            mimeType="text/markdown",
        ),
        Resource(
            uri="docs://guides/api-reference",
            name="API Reference",
            description="Complete API documentation",
            mimeType="text/markdown",
        ),
    ]


@server.read_resource()
async def read_resource(uri: str) -> str:
    """Return the content of a resource by its URI."""
    resources = {
        "docs://guides/getting-started": "# Getting Started\n\n...",
        "docs://guides/api-reference": "# API Reference\n\n...",
    }
    if uri in resources:
        return resources[uri]
    raise ValueError(f"Unknown resource: {uri}")

Adding Prompt Templates

Prompt templates provide structured guidance for common tasks:

from mcp.types import Prompt, PromptArgument, PromptMessage, TextContent


@server.list_prompts()
async def list_prompts() -> list[Prompt]:
    """Return the list of prompts this server provides."""
    return [
        Prompt(
            name="analyze_code",
            description="Analyze code for quality, performance, and security",
            arguments=[
                PromptArgument(
                    name="code",
                    description="The code to analyze",
                    required=True,
                ),
                PromptArgument(
                    name="language",
                    description="Programming language of the code",
                    required=False,
                ),
            ],
        )
    ]


@server.get_prompt()
async def get_prompt(
    name: str, arguments: dict | None = None
) -> list[PromptMessage]:
    """Return the messages for a prompt template."""
    if name == "analyze_code":
        code = arguments.get("code", "") if arguments else ""
        language = arguments.get("language", "unknown") if arguments else "unknown"
        return [
            PromptMessage(
                role="user",
                content=TextContent(
                    type="text",
                    text=(
                        f"Please analyze the following {language} code for:\n"
                        f"1. Code quality and readability\n"
                        f"2. Performance considerations\n"
                        f"3. Security vulnerabilities\n"
                        f"4. Suggested improvements\n\n"
                        f"```{language}\n{code}\n```"
                    ),
                ),
            )
        ]
    raise ValueError(f"Unknown prompt: {name}")

Server Configuration

MCP clients discover servers through configuration files. For Claude Desktop, the configuration lives in claude_desktop_config.json:

{
  "mcpServers": {
    "word-counter": {
      "command": "python",
      "args": ["/path/to/word_counter_server.py"],
      "env": {
        "PYTHONPATH": "/path/to/project"
      }
    },
    "company-docs": {
      "command": "python",
      "args": ["/path/to/docs_server.py"],
      "env": {
        "DOCS_API_KEY": "your-api-key-here"
      }
    }
  }
}

For Claude Code (the CLI), servers are configured in a project-level .mcp.json file or with the claude mcp add command; the /mcp slash command inside a session shows the status of configured servers:

{
  "mcpServers": {
    "word-counter": {
      "command": "python",
      "args": ["/path/to/word_counter_server.py"]
    }
  }
}

Practical Tip

During development, use the MCP Inspector tool to test your server interactively. The Inspector lets you connect to your server, list its tools and resources, and invoke tools with custom arguments — all without needing an AI client. Install it with npx @modelcontextprotocol/inspector and point it at your server.

Error Handling Best Practices

Robust error handling is essential for MCP servers. The AI needs clear error messages to understand what went wrong and potentially retry with different inputs:

@server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    """Handle tool invocations with proper error handling."""
    try:
        if name == "search_docs":
            query = arguments.get("query")
            if not query:
                return [
                    TextContent(
                        type="text",
                        text=json.dumps({
                            "error": "Missing required parameter: query",
                            "hint": "Provide a search query string",
                        }),
                    )
                ]
            results = await perform_search(query)
            return [
                TextContent(
                    type="text",
                    text=json.dumps({"results": results}),
                )
            ]
        return [
            TextContent(
                type="text",
                text=json.dumps({"error": f"Unknown tool: {name}"}),
            )
        ]
    except ConnectionError as e:
        return [
            TextContent(
                type="text",
                text=json.dumps({
                    "error": "Failed to connect to documentation service",
                    "details": str(e),
                    "suggestion": "The documentation service may be down. "
                                  "Try again in a few minutes.",
                }),
            )
        ]
    except Exception as e:
        return [
            TextContent(
                type="text",
                text=json.dumps({
                    "error": f"Unexpected error: {type(e).__name__}",
                    "details": str(e),
                }),
            )
        ]

Notice that errors are returned as structured JSON in the tool result, not raised as exceptions. This allows the AI to understand the error and respond intelligently, perhaps by trying a different approach or informing the user.
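A small helper keeps these error payloads consistent across tools. This is a sketch: in a real server the returned string would be wrapped in a TextContent, as in the handler above:

```python
import json


def error_payload(error: str, **extra: str) -> str:
    """Build a structured error JSON string for a tool result."""
    return json.dumps({"error": error, **extra})


payload = error_payload(
    "Missing required parameter: query",
    hint="Provide a search query string",
)
```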


37.4 Custom Tool Development

Designing Effective Tool Interfaces

The most important aspect of custom tool development is interface design. A well-designed tool interface makes it easy for the AI to understand when and how to use the tool. A poorly designed interface leads to misuse, errors, and frustration.

Principles of good tool design:

  1. Single Responsibility. Each tool should do one thing well. Prefer multiple focused tools over one tool with a mode parameter.
  2. Descriptive Naming. Use verb-noun names that clearly indicate the action: search_documentation, create_ticket, analyze_dependencies.
  3. Rich Descriptions. The tool description should explain what the tool does, when to use it, and what it returns.
  4. Sensible Defaults. Optional parameters should have reasonable defaults so the AI does not need to specify everything.
  5. Structured Output. Return JSON with consistent structure so the AI can reliably parse results.
  6. Graceful Degradation. When things go wrong, return helpful error messages rather than crashing.

Common Pitfall

Avoid creating "god tools" that accept a sub-command parameter to perform different actions. A tool named database_operation with a command parameter that accepts "query", "insert", "update", and "delete" is harder for the AI to use correctly than four separate tools: query_database, insert_record, update_record, and delete_record. The AI reasons better about focused tools with clear purposes.

Implementing Tool Handlers

Tool handlers are async functions that receive arguments, perform operations, and return results. Here is a comprehensive example of a tool handler for searching a codebase:

import json
from pathlib import Path
from mcp.types import TextContent


async def handle_search_codebase(arguments: dict) -> list[TextContent]:
    """
    Search the codebase for files matching a pattern and containing
    specific text.

    Args:
        arguments: Dictionary with 'pattern' (glob), 'text' (search string),
                   'root_dir' (base directory), and 'max_results' (limit).

    Returns:
        List of TextContent with search results as JSON.
    """
    pattern = arguments.get("pattern", "**/*.py")
    text = arguments.get("text", "")
    root_dir = arguments.get("root_dir", ".")
    max_results = arguments.get("max_results", 20)

    root_path = Path(root_dir).resolve()
    if not root_path.exists():
        return [
            TextContent(
                type="text",
                text=json.dumps({"error": f"Directory not found: {root_dir}"}),
            )
        ]

    results = []
    try:
        for file_path in root_path.glob(pattern):
            if not file_path.is_file():
                continue
            try:
                content = file_path.read_text(encoding="utf-8")
                if text and text.lower() in content.lower():
                    # Find matching lines
                    matching_lines = []
                    for i, line in enumerate(content.splitlines(), 1):
                        if text.lower() in line.lower():
                            matching_lines.append({
                                "line_number": i,
                                "content": line.strip(),
                            })
                    results.append({
                        "file": str(file_path.relative_to(root_path)),
                        "matches": matching_lines[:5],  # Limit per file
                    })
                elif not text:
                    results.append({
                        "file": str(file_path.relative_to(root_path)),
                    })
                if len(results) >= max_results:
                    break
            except (UnicodeDecodeError, PermissionError):
                continue  # Skip binary or inaccessible files
    except Exception as e:
        return [
            TextContent(
                type="text",
                text=json.dumps({"error": str(e)}),
            )
        ]

    return [
        TextContent(
            type="text",
            text=json.dumps({
                "total_matches": len(results),
                "results": results,
                "search_pattern": pattern,
                "search_text": text,
            }),
        )
    ]

Input Validation with JSON Schema

JSON Schema provides powerful validation for tool inputs. Here is how to define schemas that catch errors before they reach your handler:

TOOL_SCHEMAS = {
    "create_ticket": {
        "type": "object",
        "properties": {
            "title": {
                "type": "string",
                "description": "Ticket title (max 200 characters)",
                "maxLength": 200,
                "minLength": 1,
            },
            "description": {
                "type": "string",
                "description": "Detailed description of the issue or request",
            },
            "priority": {
                "type": "string",
                "description": "Ticket priority level",
                "enum": ["critical", "high", "medium", "low"],
                "default": "medium",
            },
            "assignee": {
                "type": "string",
                "description": "Username to assign the ticket to (optional)",
            },
            "labels": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Labels to apply to the ticket",
                "default": [],
            },
        },
        "required": ["title", "description"],
        "additionalProperties": False,
    }
}

You can validate inputs against these schemas at runtime using a library like jsonschema:

from jsonschema import validate, ValidationError


def validate_tool_input(tool_name: str, arguments: dict) -> str | None:
    """
    Validate tool arguments against the schema.

    Returns None if valid, or an error message string if invalid.
    """
    schema = TOOL_SCHEMAS.get(tool_name)
    if not schema:
        return f"No schema defined for tool: {tool_name}"
    try:
        validate(instance=arguments, schema=schema)
        return None
    except ValidationError as e:
        return f"Invalid input: {e.message}"

Building a Tool Registry

As your MCP server grows, you need a structured way to register and manage tools. A tool registry pattern keeps your code organized:

from dataclasses import dataclass
from typing import Callable, Any


@dataclass
class ToolDefinition:
    """Definition of a tool including its metadata and handler."""

    name: str
    description: str
    input_schema: dict
    handler: Callable[..., Any]


class ToolRegistry:
    """Registry for managing custom tools."""

    def __init__(self) -> None:
        self._tools: dict[str, ToolDefinition] = {}

    def register(
        self,
        name: str,
        description: str,
        input_schema: dict,
        handler: Callable[..., Any],
    ) -> None:
        """Register a new tool."""
        self._tools[name] = ToolDefinition(
            name=name,
            description=description,
            input_schema=input_schema,
            handler=handler,
        )

    def get_tool(self, name: str) -> ToolDefinition | None:
        """Get a tool definition by name."""
        return self._tools.get(name)

    def list_tools(self) -> list[ToolDefinition]:
        """Return all registered tools."""
        return list(self._tools.values())

    async def call_tool(
        self, name: str, arguments: dict
    ) -> list[TextContent]:
        """Look up and invoke a tool by name."""
        tool = self._tools.get(name)
        if not tool:
            return [
                TextContent(
                    type="text",
                    text=json.dumps({"error": f"Unknown tool: {name}"}),
                )
            ]
        return await tool.handler(arguments)

This registry can be used as a decorator pattern for even cleaner tool registration:

registry = ToolRegistry()


def tool(name: str, description: str, input_schema: dict):
    """Decorator to register a function as a tool."""
    def decorator(func):
        registry.register(name, description, input_schema, func)
        return func
    return decorator


@tool(
    name="calculate_complexity",
    description="Calculate cyclomatic complexity of Python code",
    input_schema={
        "type": "object",
        "properties": {
            "code": {
                "type": "string",
                "description": "Python source code to analyze",
            }
        },
        "required": ["code"],
    },
)
async def handle_calculate_complexity(arguments: dict) -> list[TextContent]:
    """Calculate cyclomatic complexity of the provided code."""
    code = arguments["code"]
    # Complexity calculation implementation...
    return [
        TextContent(
            type="text",
            text=json.dumps({"complexity": 5, "rating": "moderate"}),
        )
    ]

37.5 Integrating External Data Sources

Database Integration

One of the most powerful uses of custom tools is giving AI assistants structured access to databases. This requires careful design to balance utility with security:

import aiosqlite
from contextlib import asynccontextmanager


class DatabaseTool:
    """Provides AI-accessible database query capabilities."""

    def __init__(self, db_path: str, allowed_tables: list[str] | None = None):
        self.db_path = db_path
        self.allowed_tables = allowed_tables

    @asynccontextmanager
    async def get_connection(self):
        """Get a database connection with safety constraints."""
        async with aiosqlite.connect(self.db_path) as db:
            db.row_factory = aiosqlite.Row
            yield db

    async def query(self, sql: str, params: tuple = ()) -> list[dict]:
        """
        Execute a read-only SQL query.

        Only SELECT statements are allowed. If allowed_tables is set,
        only queries against those tables are permitted.
        """
        # Security: Only allow SELECT statements
        normalized = sql.strip().upper()
        if not normalized.startswith("SELECT"):
            raise ValueError("Only SELECT queries are allowed")

        # Security: Check for dangerous keywords
        dangerous = ["DROP", "DELETE", "INSERT", "UPDATE", "ALTER", "CREATE"]
        for keyword in dangerous:
            if keyword in normalized:
                raise ValueError(f"Forbidden SQL keyword: {keyword}")

        # Security: Table allowlist enforcement
        if self.allowed_tables:
            # Naive check - production code would parse the SQL AST.
            # Verify that every identifier following FROM or JOIN is
            # on the allowlist.
            allowed = {t.upper() for t in self.allowed_tables}
            tokens = normalized.replace(",", " ").split()
            for i, token in enumerate(tokens[:-1]):
                if token in ("FROM", "JOIN"):
                    referenced = tokens[i + 1].strip("();")
                    if referenced not in allowed:
                        raise ValueError(
                            f"Table not on allowlist: {referenced}"
                        )

        async with self.get_connection() as db:
            cursor = await db.execute(sql, params)
            rows = await cursor.fetchall()
            columns = [desc[0] for desc in cursor.description]
            return [dict(zip(columns, row)) for row in rows]

    async def get_schema(self) -> dict:
        """Return the database schema for AI context."""
        async with self.get_connection() as db:
            cursor = await db.execute(
                "SELECT name, sql FROM sqlite_master "
                "WHERE type='table' ORDER BY name"
            )
            tables = await cursor.fetchall()
            schema = {}
            for table in tables:
                name = table[0]
                if self.allowed_tables and name not in self.allowed_tables:
                    continue
                schema[name] = {
                    "create_sql": table[1],
                }
                # Get column info
                col_cursor = await db.execute(
                    f"PRAGMA table_info({name})"
                )
                columns = await col_cursor.fetchall()
                schema[name]["columns"] = [
                    {
                        "name": col[1],
                        "type": col[2],
                        "nullable": not col[3],
                        "primary_key": bool(col[5]),
                    }
                    for col in columns
                ]
            return schema

Security Warning

Giving AI assistants database access requires extreme caution. Always enforce read-only access unless write operations are specifically required and carefully controlled. Use table allowlists, query validation, and parameterized queries. Never allow the AI to execute arbitrary SQL — always validate and sanitize. In production, use a database user with minimal privileges and consider wrapping queries in transactions that are rolled back if they modify data.
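One safeguard mentioned in the warning can be pushed down to the connection itself. SQLite, for example, supports opening a database file in read-only mode, so even a query that slips past keyword validation cannot modify data. A minimal sketch using the standard library's synchronous sqlite3 module (aiosqlite forwards its arguments to sqlite3.connect, so the same URI form should work in the asynchronous DatabaseTool as well):

```python
import sqlite3


def readonly_connection(db_path: str) -> sqlite3.Connection:
    """Open a SQLite database in read-only mode.

    Any write attempt raises sqlite3.OperationalError at the engine
    level, independent of whatever SQL-text validation runs above it.
    """
    return sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
```

Enforcing read-only access at the connection level means the SQL-text checks become defense in depth rather than the only barrier.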

REST API Integration

Many internal systems expose REST APIs that can be wrapped as MCP tools:

import httpx
from typing import Any


class APIIntegration:
    """Wraps a REST API as MCP-accessible tools."""

    def __init__(
        self,
        base_url: str,
        api_key: str | None = None,
        headers: dict[str, str] | None = None,
    ):
        self.base_url = base_url.rstrip("/")
        self.headers = headers or {}
        if api_key:
            self.headers["Authorization"] = f"Bearer {api_key}"

    async def get(
        self, endpoint: str, params: dict[str, Any] | None = None
    ) -> dict:
        """Make a GET request to the API."""
        async with httpx.AsyncClient() as client:
            response = await client.get(
                f"{self.base_url}/{endpoint.lstrip('/')}",
                headers=self.headers,
                params=params,
                timeout=30.0,
            )
            response.raise_for_status()
            return response.json()

    async def post(
        self, endpoint: str, data: dict[str, Any] | None = None
    ) -> dict:
        """Make a POST request to the API."""
        async with httpx.AsyncClient() as client:
            response = await client.post(
                f"{self.base_url}/{endpoint.lstrip('/')}",
                headers=self.headers,
                json=data,
                timeout=30.0,
            )
            response.raise_for_status()
            return response.json()

File System Integration

For teams that store knowledge in files (Markdown documentation, configuration files, code templates), a file system integration provides structured access:

from pathlib import Path
import mimetypes


class FileSystemIntegration:
    """Provides structured access to file system resources."""

    def __init__(
        self,
        root_dirs: list[str],
        allowed_extensions: list[str] | None = None,
    ):
        self.root_dirs = [Path(d).resolve() for d in root_dirs]
        self.allowed_extensions = allowed_extensions or [
            ".md", ".txt", ".py", ".json", ".yaml", ".yml",
            ".toml", ".cfg", ".ini", ".rst",
        ]

    def _is_allowed(self, path: Path) -> bool:
        """Check if a file path is allowed to be accessed."""
        # Must be within a root directory. Use is_relative_to rather
        # than a string prefix check, so that e.g. /srv/docs-archive
        # is not mistaken for a child of /srv/docs.
        resolved = path.resolve()
        in_root = any(
            resolved.is_relative_to(root) for root in self.root_dirs
        )
        if not in_root:
            return False
        # Must have an allowed extension
        if self.allowed_extensions:
            return path.suffix.lower() in self.allowed_extensions
        return True

    async def list_files(
        self, directory: str, pattern: str = "*"
    ) -> list[dict[str, str]]:
        """List files in a directory matching a pattern."""
        dir_path = Path(directory).resolve()
        if not any(
            dir_path.is_relative_to(root) for root in self.root_dirs
        ):
            raise ValueError(f"Directory outside allowed roots: {directory}")

        files = []
        for file_path in dir_path.glob(pattern):
            if file_path.is_file() and self._is_allowed(file_path):
                mime_type, _ = mimetypes.guess_type(str(file_path))
                files.append({
                    "path": str(file_path),
                    "name": file_path.name,
                    "size": file_path.stat().st_size,
                    "mime_type": mime_type or "application/octet-stream",
                })
        return files

    async def read_file(self, file_path: str) -> dict[str, str]:
        """Read a file and return its content with metadata."""
        path = Path(file_path).resolve()
        if not self._is_allowed(path):
            raise ValueError(f"File access not allowed: {file_path}")
        if not path.exists():
            raise FileNotFoundError(f"File not found: {file_path}")

        content = path.read_text(encoding="utf-8")
        return {
            "path": str(path),
            "content": content,
            "size": len(content),
            "lines": content.count("\n") + 1,
        }
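The containment checks above hinge on calling resolve() before comparing paths: resolution collapses ".." segments, which is what defeats classic path-traversal inputs. A small self-contained illustration (Path.is_relative_to is available since Python 3.9):

```python
from pathlib import Path


def is_inside(candidate: str, root: str) -> bool:
    """Return True only if candidate resolves to a location under root."""
    # resolve() collapses ".." segments before the comparison,
    # so traversal tricks like "docs/../../etc" are caught.
    resolved = Path(candidate).resolve()
    return resolved.is_relative_to(Path(root).resolve())
```

Note that a naive string prefix check would also accept sibling directories such as /srv/docs-archive when the root is /srv/docs; comparing path components avoids that trap.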

Combining Multiple Data Sources

Real-world tools often need to combine data from multiple sources. Here is a pattern for a unified search tool that queries across databases, APIs, and files:

class UnifiedSearch:
    """Search across multiple data sources with a single query."""

    def __init__(self):
        self.sources: list[dict] = []

    def add_source(
        self,
        name: str,
        search_func,
        priority: int = 0,
    ) -> None:
        """Register a searchable data source."""
        self.sources.append({
            "name": name,
            "search": search_func,
            "priority": priority,
        })

    async def search(
        self,
        query: str,
        max_results: int = 10,
        sources: list[str] | None = None,
    ) -> dict:
        """
        Search across all registered sources.

        Results are merged and sorted by relevance score.
        """
        all_results = []
        errors = []

        for source in sorted(
            self.sources, key=lambda s: s["priority"], reverse=True
        ):
            if sources and source["name"] not in sources:
                continue
            try:
                results = await source["search"](query)
                for result in results:
                    result["source"] = source["name"]
                all_results.extend(results)
            except Exception as e:
                errors.append({
                    "source": source["name"],
                    "error": str(e),
                })

        # Sort by relevance score if available
        all_results.sort(
            key=lambda r: r.get("relevance", 0), reverse=True
        )

        return {
            "query": query,
            "total_results": len(all_results),
            "results": all_results[:max_results],
            "errors": errors if errors else None,
        }

Practical Tip

When integrating multiple data sources, always include the source name in each result. This helps the AI (and the user) understand where information came from, which is crucial for assessing reliability and finding the original source. It also helps with debugging when results seem incorrect.


37.6 Building Custom Slash Commands

What Slash Commands Are

Slash commands are user-initiated shortcuts that trigger specific behaviors in AI coding assistants. Unlike tools (which the AI decides to call based on context), slash commands are explicitly invoked by the user. They are ideal for frequently used workflows that benefit from a standardized starting point.

In Claude Code, custom slash commands can be defined at the project level (in .claude/commands/) or at the user level (in ~/.claude/commands/). Each command is a Markdown file whose content becomes the prompt template.

Creating Project-Level Commands

Project-level slash commands live in your repository and are shared with everyone on the team. Create them in the .claude/commands/ directory:

.claude/
  commands/
    review.md
    test-plan.md
    migrate.md
    deploy-checklist.md

Here is an example slash command for generating a code review:

<!-- .claude/commands/review.md -->
Review the code changes in this project. Focus on:

1. **Correctness**: Are there logical errors or edge cases not handled?
2. **Security**: Are there any security vulnerabilities (injection, auth issues, data exposure)?
3. **Performance**: Are there performance concerns (N+1 queries, unnecessary allocations, blocking calls)?
4. **Readability**: Is the code clear, well-named, and properly documented?
5. **Testing**: Are the changes adequately tested? What test cases are missing?

Follow our team's coding standards documented in CONTRIBUTING.md.

For each issue found, provide:
- Severity (critical / warning / suggestion)
- File and line reference
- Description of the issue
- Suggested fix

Summarize with a table of all findings.

Users invoke this command by typing /review in Claude Code, and the content of the Markdown file is sent as the prompt.

Parameterized Slash Commands

Slash commands can accept parameters using the $ARGUMENTS placeholder. This lets users customize the command's behavior:

<!-- .claude/commands/explain.md -->
Explain the following code or concept in detail, suitable for a developer
with intermediate experience:

$ARGUMENTS

Structure your explanation as:
1. **Overview**: What does this do at a high level?
2. **Step-by-step walkthrough**: Walk through the logic
3. **Key concepts**: What programming concepts does this use?
4. **Potential pitfalls**: What could go wrong?
5. **Related patterns**: What similar patterns exist?

Use our codebase for concrete examples when possible.

The user invokes this as /explain the authentication middleware, and $ARGUMENTS is replaced with "the authentication middleware".
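Conceptually, the substitution is a plain template replacement performed by the client before the prompt reaches the model. A toy sketch that mirrors this behavior (not Claude Code's actual implementation):

```python
def render_command(template: str, arguments: str) -> str:
    """Expand the $ARGUMENTS placeholder in a slash command template."""
    return template.replace("$ARGUMENTS", arguments)


# The command file's content becomes the template...
template = "Explain the following code or concept in detail:\n\n$ARGUMENTS"
# ...and whatever the user typed after the command name fills it in.
prompt = render_command(template, "the authentication middleware")
```

Because the expansion is purely textual, the placeholder can appear anywhere in the template, including multiple times.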

Advanced Command Patterns

Context-Gathering Commands can reference specific files to always include relevant context:

<!-- .claude/commands/new-endpoint.md -->
Create a new REST API endpoint following our project patterns.

First, read these reference files to understand our conventions:
- src/routes/users.py (for routing patterns)
- src/middleware/auth.py (for authentication approach)
- src/models/base.py (for model patterns)
- tests/routes/test_users.py (for testing patterns)

Now create a new endpoint for: $ARGUMENTS

Follow the exact same patterns, naming conventions, and error handling
approaches used in the reference files. Include:
- Route definition
- Request/response models
- Authentication middleware
- Input validation
- Error handling
- Unit tests
- Integration tests

Workflow Commands guide the AI through multi-step processes:

<!-- .claude/commands/release.md -->
Help me prepare a release for version: $ARGUMENTS

Follow these steps in order:

1. **Changelog**: Review all commits since the last release tag
   and generate a changelog grouped by: Features, Bug Fixes,
   Breaking Changes, and Dependencies.

2. **Version bump**: Update the version in pyproject.toml,
   __version__.py, and any other version references.

3. **Pre-release checks**:
   - Verify all tests pass
   - Check for uncommitted changes
   - Verify the changelog is complete

4. **Summary**: Provide a release summary I can use for the
   GitHub release description.

Do NOT create the release tag or push anything — just prepare
everything for my review.

Key Insight

The best slash commands encode tribal knowledge — the unwritten processes and conventions that experienced team members know but new members must learn. By capturing these in slash commands, you make your team's expertise available to every developer through their AI assistant. This is especially powerful for onboarding new team members.

Organizing Commands for Teams

As your command library grows, organize commands by category using subdirectories:

.claude/
  commands/
    review/
      security.md
      performance.md
      accessibility.md
    generate/
      endpoint.md
      model.md
      migration.md
      test.md
    analyze/
      dependencies.md
      complexity.md
      coverage.md
    workflow/
      release.md
      deploy.md
      hotfix.md

Users can then invoke commands like /review:security or /generate:endpoint.


37.7 AI Tool Middleware and Pipelines

The Middleware Pattern

Middleware is code that sits between the AI's tool call and the tool's actual execution. It can intercept, modify, validate, log, or enrich requests and responses. This pattern is borrowed from web frameworks like Express and Django, where middleware processes HTTP requests and responses.

For MCP servers, middleware enables:

  • Validation: Ensuring inputs meet requirements before executing
  • Logging: Recording all tool calls for audit and debugging
  • Rate Limiting: Preventing excessive API calls
  • Caching: Storing results for repeated queries
  • Authentication: Verifying permissions before allowing access
  • Transformation: Converting data formats between the AI and external systems

Implementing a Middleware Pipeline

Here is a flexible middleware implementation:

from typing import Callable, Any
from datetime import datetime
import logging

logger = logging.getLogger(__name__)


class MiddlewarePipeline:
    """Pipeline of middleware functions for processing tool calls."""

    def __init__(self):
        self._pre_handlers: list[Callable] = []
        self._post_handlers: list[Callable] = []

    def add_pre_handler(self, handler: Callable) -> None:
        """Add a handler that runs before tool execution."""
        self._pre_handlers.append(handler)

    def add_post_handler(self, handler: Callable) -> None:
        """Add a handler that runs after tool execution."""
        self._post_handlers.append(handler)

    async def execute(
        self,
        tool_name: str,
        arguments: dict,
        tool_handler: Callable,
    ) -> Any:
        """Execute the full middleware pipeline."""
        context = {
            "tool_name": tool_name,
            "arguments": arguments,
            "timestamp": datetime.utcnow().isoformat(),
            "metadata": {},
        }

        # Run pre-handlers
        for handler in self._pre_handlers:
            result = await handler(context)
            if result is not None:
                # Pre-handler returned early (e.g., cached result)
                return result

        # Execute the tool
        start_time = datetime.utcnow()
        try:
            result = await tool_handler(arguments)
            context["duration_ms"] = (
                datetime.utcnow() - start_time
            ).total_seconds() * 1000
            context["success"] = True
            context["result"] = result
        except Exception as e:
            context["duration_ms"] = (
                datetime.utcnow() - start_time
            ).total_seconds() * 1000
            context["success"] = False
            context["error"] = str(e)
            raise

        # Run post-handlers. Read and write context["result"] rather
        # than reusing the local variable, so a handler that returns
        # None does not wipe out the result for the next handler.
        for handler in self._post_handlers:
            modified = await handler(context, context["result"])
            if modified is not None:
                # Post-handler replaced the result
                context["result"] = modified

        return context["result"]

Common Middleware Functions

Logging Middleware records all tool interactions:

async def logging_middleware(context: dict) -> None:
    """Log all tool calls for audit and debugging."""
    logger.info(
        "Tool call: %s | Args: %s | Time: %s",
        context["tool_name"],
        json.dumps(context["arguments"]),
        context["timestamp"],
    )


async def logging_post_middleware(context: dict, result: Any) -> None:
    """Log tool results and performance."""
    logger.info(
        "Tool result: %s | Success: %s | Duration: %.1fms",
        context["tool_name"],
        context["success"],
        context.get("duration_ms", 0),
    )

Caching Middleware stores and retrieves results for repeated queries:

import hashlib
from datetime import datetime, timedelta


class CacheMiddleware:
    """Cache tool results to avoid redundant operations."""

    def __init__(self, ttl_seconds: int = 300):
        self._cache: dict[str, dict] = {}
        self._ttl = timedelta(seconds=ttl_seconds)

    def _cache_key(self, tool_name: str, arguments: dict) -> str:
        """Generate a cache key from tool name and arguments."""
        arg_str = json.dumps(arguments, sort_keys=True)
        return hashlib.sha256(
            f"{tool_name}:{arg_str}".encode()
        ).hexdigest()

    async def check_cache(self, context: dict) -> Any | None:
        """Check if a cached result exists for this call."""
        key = self._cache_key(
            context["tool_name"], context["arguments"]
        )
        if key in self._cache:
            entry = self._cache[key]
            if datetime.utcnow() - entry["timestamp"] < self._ttl:
                logger.debug("Cache hit for %s", context["tool_name"])
                return entry["result"]
            else:
                del self._cache[key]
        return None

    async def store_cache(self, context: dict, result: Any) -> None:
        """Store a result in the cache."""
        if context.get("success"):
            key = self._cache_key(
                context["tool_name"], context["arguments"]
            )
            self._cache[key] = {
                "result": result,
                "timestamp": datetime.utcnow(),
            }

Rate Limiting Middleware prevents excessive API calls:

from collections import defaultdict


class RateLimitMiddleware:
    """Enforce rate limits on tool calls."""

    def __init__(
        self,
        max_calls: int = 60,
        window_seconds: int = 60,
    ):
        self._max_calls = max_calls
        self._window = timedelta(seconds=window_seconds)
        self._calls: dict[str, list[datetime]] = defaultdict(list)

    async def check_rate_limit(self, context: dict) -> Any | None:
        """Check if the rate limit has been exceeded."""
        tool_name = context["tool_name"]
        now = datetime.utcnow()

        # Remove old entries
        self._calls[tool_name] = [
            t for t in self._calls[tool_name]
            if now - t < self._window
        ]

        if len(self._calls[tool_name]) >= self._max_calls:
            return [
                TextContent(
                    type="text",
                    text=json.dumps({
                        "error": "Rate limit exceeded",
                        "tool": tool_name,
                        "limit": self._max_calls,
                        "window_seconds": self._window.total_seconds(),
                        "retry_after_seconds": (
                            self._calls[tool_name][0]
                            + self._window
                            - now
                        ).total_seconds(),
                    }),
                )
            ]

        self._calls[tool_name].append(now)
        return None

Composing the Pipeline

Here is how to assemble a complete middleware pipeline:

# Create the pipeline
pipeline = MiddlewarePipeline()

# Add middleware in order
cache = CacheMiddleware(ttl_seconds=300)
rate_limiter = RateLimitMiddleware(max_calls=100, window_seconds=60)

pipeline.add_pre_handler(logging_middleware)
pipeline.add_pre_handler(rate_limiter.check_rate_limit)
pipeline.add_pre_handler(cache.check_cache)
pipeline.add_post_handler(cache.store_cache)
pipeline.add_post_handler(logging_post_middleware)

# Use the pipeline in your tool handler
@server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    """Handle tool calls through the middleware pipeline."""
    tool = registry.get_tool(name)
    if not tool:
        raise ValueError(f"Unknown tool: {name}")
    return await pipeline.execute(name, arguments, tool.handler)

Practical Tip

Start with logging middleware only. Once you have visibility into how the AI uses your tools, add validation and rate limiting based on actual usage patterns. Over-engineering the middleware pipeline before you have real data leads to unnecessary complexity.


37.8 Testing and Debugging Custom Tools

The Testing Challenge

Testing MCP servers and custom tools presents unique challenges. Unlike traditional APIs where you control both the client and the server, MCP tools are invoked by AI models whose behavior is non-deterministic. You need to test that your tools work correctly when called with the inputs an AI might provide, including edge cases and unexpected inputs.

Unit Testing Tool Handlers

Start with traditional unit tests for your tool handlers. Since handlers are async functions that accept dictionaries and return structured results, they are straightforward to test:

import pytest
import json


@pytest.mark.asyncio
async def test_search_documentation_basic():
    """Test that search returns relevant results."""
    result = await handle_search_docs({"query": "authentication"})
    assert len(result) == 1
    data = json.loads(result[0].text)
    assert "results" in data
    assert len(data["results"]) > 0


@pytest.mark.asyncio
async def test_search_documentation_empty_query():
    """Test that empty queries return an error."""
    result = await handle_search_docs({"query": ""})
    data = json.loads(result[0].text)
    assert "error" in data


@pytest.mark.asyncio
async def test_search_documentation_missing_query():
    """Test that missing query parameter returns an error."""
    result = await handle_search_docs({})
    data = json.loads(result[0].text)
    assert "error" in data


@pytest.mark.asyncio
async def test_search_documentation_max_results():
    """Test that max_results parameter is respected."""
    result = await handle_search_docs({
        "query": "test",
        "max_results": 3,
    })
    data = json.loads(result[0].text)
    assert len(data["results"]) <= 3

Integration Testing with the MCP Protocol

Integration tests verify that your server correctly implements the MCP protocol. Use the mcp library's client to test the full request/response cycle:

import pytest
from mcp.client.session import ClientSession
from mcp.client.stdio import stdio_client, StdioServerParameters


@pytest.mark.asyncio
async def test_server_initialization():
    """Test that the server initializes correctly."""
    server_params = StdioServerParameters(
        command="python",
        args=["my_server.py"],
    )
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Server should be connected
            tools = await session.list_tools()
            assert len(tools.tools) > 0


@pytest.mark.asyncio
async def test_tool_invocation():
    """Test invoking a tool through the MCP protocol."""
    server_params = StdioServerParameters(
        command="python",
        args=["my_server.py"],
    )
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool(
                "count_words",
                {"text": "hello world foo bar"},
            )
            data = json.loads(result.content[0].text)
            assert data["word_count"] == 4

Schema Validation Testing

Test that your tool schemas are valid and complete:

from jsonschema import Draft7Validator


def test_tool_schemas_are_valid():
    """Verify all tool schemas are valid JSON Schema."""
    for tool_name, schema in TOOL_SCHEMAS.items():
        # check_schema raises SchemaError if the schema itself is
        # malformed, failing the test with the offending tool name
        # in the traceback.
        Draft7Validator.check_schema(schema)


def test_tool_schemas_have_descriptions():
    """Verify all tool properties have descriptions."""
    for tool_name, schema in TOOL_SCHEMAS.items():
        properties = schema.get("properties", {})
        for prop_name, prop_schema in properties.items():
            assert "description" in prop_schema, (
                f"Missing description for {tool_name}.{prop_name}"
            )

Simulated AI Interaction Testing

The most valuable tests simulate how an AI model would interact with your tools. Create test scenarios that mimic realistic AI behavior:

@pytest.mark.asyncio
async def test_ai_workflow_search_then_read():
    """
    Simulate an AI workflow: search for documentation,
    then read the top result.
    """
    # Step 1: AI searches for documentation
    search_result = await handle_search_docs({
        "query": "deployment process",
    })
    search_data = json.loads(search_result[0].text)
    assert len(search_data["results"]) > 0

    # Step 2: AI reads the top result
    top_result = search_data["results"][0]
    read_result = await handle_read_doc({
        "doc_id": top_result["id"],
    })
    read_data = json.loads(read_result[0].text)
    assert "content" in read_data
    assert len(read_data["content"]) > 0


@pytest.mark.asyncio
async def test_ai_handles_tool_errors_gracefully():
    """
    Verify that tool errors are returned in a format
    the AI can understand and act on.
    """
    result = await handle_search_docs({
        "query": "x" * 10000,  # Extremely long query
    })
    data = json.loads(result[0].text)
    # Should return an error, not crash
    assert "error" in data or "results" in data

Debugging Techniques

Enable verbose logging to see exactly what your server receives and returns:

import logging
import sys

logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s [%(levelname)s] %(name)s: %(message)s",
    stream=sys.stderr,  # Use stderr so it does not interfere with stdio transport
)

Use the MCP Inspector for interactive debugging. The Inspector provides a web interface where you can see all available tools, test them with custom inputs, and inspect the JSON-RPC messages:

npx @modelcontextprotocol/inspector python my_server.py

Test with edge cases that AI models are likely to produce:

  • Unicode strings and special characters in inputs
  • Very long input strings that might exceed limits
  • Missing optional parameters
  • Extra parameters not defined in the schema
  • Null values where strings are expected
  • Nested objects when flat objects are expected

Common Pitfall

The most common debugging issue with MCP servers is accidentally writing to stdout. Since the stdio transport uses stdout for protocol messages, any print() statement in your server code will corrupt the message stream and cause mysterious failures. Always use stderr for debug output, or use a proper logging framework configured to write to stderr.


37.9 Deployment and Distribution

Local Deployment

The simplest deployment model runs MCP servers locally on the developer's machine. This is appropriate for:

  • Tools that access local files or databases
  • Development and testing of new servers
  • Tools with no shared state requirements
  • Security-sensitive tools that should not transmit data over the network

Local deployment uses the stdio transport. The AI client launches the server as a subprocess:

{
  "mcpServers": {
    "my-tools": {
      "command": "python",
      "args": ["/home/user/mcp-servers/my_tools.py"],
      "env": {
        "DATABASE_PATH": "/home/user/data/app.db"
      }
    }
  }
}

For Python servers, consider using uv for dependency management to ensure consistent environments:

{
  "mcpServers": {
    "my-tools": {
      "command": "uv",
      "args": [
        "run",
        "--directory", "/home/user/mcp-servers",
        "my_tools.py"
      ]
    }
  }
}

Remote Deployment

Remote MCP servers run on a separate machine and communicate over the network using SSE or Streamable HTTP transport. This is appropriate for:

  • Shared tools that multiple developers use
  • Tools that require access to server-side resources
  • Tools backed by services that need high availability
  • Tools that require centralized configuration or secrets management

Here is a basic remote MCP server setup using the SSE transport:

from mcp.server import Server
from mcp.server.sse import SseServerTransport
from starlette.applications import Starlette
from starlette.routing import Route, Mount

server = Server("remote-tools")
sse = SseServerTransport("/messages/")

# ... register tools, resources, prompts ...

async def handle_sse(request):
    """Handle SSE connections from MCP clients."""
    async with sse.connect_sse(
        request.scope, request.receive, request._send
    ) as streams:
        await server.run(
            streams[0],
            streams[1],
            server.create_initialization_options(),
        )

starlette_app = Starlette(
    routes=[
        Route("/sse", endpoint=handle_sse),
        Mount("/messages/", app=sse.handle_post_message),
    ],
)

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(starlette_app, host="0.0.0.0", port=8080)
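An AI client connects to a remote server by URL instead of launching a subprocess. The exact configuration keys vary between clients, but a typical entry (with a hypothetical hostname) looks like:

```json
{
  "mcpServers": {
    "remote-tools": {
      "url": "https://tools.example.com/sse"
    }
  }
}
```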

Containerized Deployment

For production deployments, containerize your MCP server with Docker:

FROM python:3.12-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 8080

CMD ["python", "server.py"]

With a corresponding docker-compose.yml:

version: "3.8"
services:
  mcp-server:
    build: .
    ports:
      - "8080:8080"
    environment:
      - DATABASE_URL=postgresql://user:pass@db:5432/mydb
      - API_KEY=${API_KEY}
    volumes:
      - ./data:/app/data
    restart: unless-stopped

Packaging for Distribution

To share your MCP server with others, package it as a Python package:

# pyproject.toml
[project]
name = "my-mcp-tools"
version = "1.0.0"
description = "Custom MCP tools for development workflows"
requires-python = ">=3.11"
dependencies = [
    "mcp>=1.0.0",
    "httpx>=0.27.0",
    "pydantic>=2.0.0",
]

[project.scripts]
my-mcp-tools = "my_mcp_tools.server:main"

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

Users can then install your server:

pip install my-mcp-tools

and reference the installed entry point in their client configuration:

{
  "mcpServers": {
    "my-tools": {
      "command": "my-mcp-tools"
    }
  }
}
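The my_mcp_tools.server:main entry point referenced in [project.scripts] is an ordinary synchronous function that starts the server over the stdio transport. A minimal skeleton, assuming the mcp Python SDK's stdio_server helper (module and server names are illustrative):

```python
# my_mcp_tools/server.py
import asyncio

from mcp.server import Server
from mcp.server.stdio import stdio_server

server = Server("my-mcp-tools")

# ... register tools, resources, and prompts on `server` ...

async def _run() -> None:
    # stdio_server() yields the read/write streams for the local transport
    async with stdio_server() as (read_stream, write_stream):
        await server.run(
            read_stream,
            write_stream,
            server.create_initialization_options(),
        )

def main() -> None:
    """Synchronous wrapper so the function works as a console script."""
    asyncio.run(_run())
```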

Security Considerations for Deployment

When deploying MCP servers, security must be a primary concern:

  1. Secret Management. Never hard-code API keys, database credentials, or other secrets. Use environment variables or a secrets manager.

  2. Input Sanitization. Validate and sanitize all inputs from the AI. Remember that AI-generated inputs may be influenced by prompt injection attacks.

  3. Principle of Least Privilege. Grant your server only the permissions it needs. Database users should be read-only unless writes are required. API tokens should have minimal scopes.

  4. Network Security. Remote MCP servers should use TLS, authenticate clients, and restrict access by IP or authentication tokens.

  5. Audit Logging. Log all tool invocations with timestamps, inputs, and outputs. This is essential for debugging and for detecting misuse.

  6. Data Boundaries. Be explicit about what data your server can access. Use allowlists for tables, directories, and API endpoints.
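Points 3 and 6 can be enforced in code with a deny-by-default allowlist checked before any data-access tool runs. A minimal sketch (the table names are illustrative):

```python
# Explicit allowlist: anything not listed is denied by default.
ALLOWED_TABLES = {"users", "orders", "products"}

def check_table_access(table: str) -> str:
    """Raise unless the requested table is explicitly allowlisted."""
    if table not in ALLOWED_TABLES:
        raise PermissionError(f"Access to table '{table}' is not permitted")
    return table
```

The same pattern applies to directories and API endpoints: enumerate what is allowed and refuse everything else, rather than trying to blocklist what is dangerous.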

Security Warning

MCP servers act as a bridge between AI models and your systems. A compromised or poorly secured MCP server could expose sensitive data, allow unauthorized modifications, or become a vector for attacks. Treat MCP server security with the same rigor you would apply to any API that handles sensitive data.


37.10 The Custom Tool Ecosystem

The Current Landscape

The MCP ecosystem is growing rapidly. As of early 2026, hundreds of community-built MCP servers cover common development needs:

Official and Reference Servers:

  • File system access with configurable permissions
  • GitHub and GitLab integration
  • Database connectors (PostgreSQL, SQLite, MongoDB)
  • Web search and browsing
  • Slack and communication platform integration

Community-Built Servers:

  • Jira and project management tool integration
  • Cloud provider CLIs (AWS, GCP, Azure)
  • Documentation generators
  • Code quality analyzers
  • Kubernetes cluster management
  • CI/CD pipeline interaction

Enterprise Servers:

  • Internal knowledge base connectors
  • Compliance and audit tools
  • Custom deployment pipelines
  • Proprietary data model access

Discovering and Evaluating MCP Servers

When looking for existing MCP servers, consider these sources:

  1. The MCP Server Registry — A curated directory of community servers with descriptions and quality ratings.
  2. GitHub — Search for repositories tagged with mcp-server or model-context-protocol.
  3. Package Managers — Search PyPI for mcp- prefixed packages or npm for @mcp/ scoped packages.

When evaluating a third-party MCP server, assess:

  • Maintenance status: Is it actively maintained? When was the last commit?
  • Security: Does it follow security best practices? Has it been audited?
  • Documentation: Is the tool well-documented with clear setup instructions?
  • Testing: Does it have automated tests? What is the test coverage?
  • Permissions: What access does it require? Does it follow least privilege?

Building for the Ecosystem

If you build MCP servers that could benefit others, consider open-sourcing them. The ecosystem benefits from shared solutions to common problems. When preparing a server for public distribution:

  1. Write comprehensive documentation including setup instructions, configuration options, and usage examples.
  2. Include examples of tool invocations and expected outputs.
  3. Define clear security boundaries and document what access the server requires.
  4. Write tests that others can run to verify the server works in their environment.
  5. Use semantic versioning so users can depend on stable interfaces.
  6. Provide a configuration schema so users know what options are available.

The Future of AI Tool Extensibility

The custom tool ecosystem is evolving in several directions:

Tool Composition. Future MCP extensions may allow tools to be composed — the output of one tool automatically becoming the input of another, creating complex workflows from simple building blocks. This echoes the Unix philosophy of small, composable tools connected by pipes.

Tool Discovery. AI assistants are becoming better at discovering and learning to use new tools without explicit configuration. Imagine an AI that can search a tool registry, evaluate options, install the most appropriate tool, and begin using it — all within a single conversation.

Collaborative Tool Development. Teams are beginning to use AI assistants to help build and maintain their custom tools. The AI can generate tool implementations from natural language descriptions, write tests, and even suggest improvements based on usage patterns.

Standardization and Governance. As organizations deploy more MCP servers, governance frameworks are emerging for managing tool access, reviewing tool implementations, and auditing tool usage. This mirrors the evolution of API governance in large organizations.

Key Insight

The organizations that will benefit most from AI coding assistants are not those with the best AI models — they are those that build the best tool integrations. The AI model is a commodity; the tools that connect it to your specific domain, data, and workflows are the differentiator. Investing in custom tooling is investing in your team's competitive advantage.

Connecting to Agent Workflows

Custom tools become even more powerful when combined with the agent workflows discussed in Chapter 36. An AI agent with access to custom tools can:

  • Autonomously gather context from your knowledge base before starting a task
  • Verify its work against your team's standards using custom linting tools
  • Deploy changes through your specific deployment pipeline
  • Report status to your team's communication channels
  • Access domain-specific calculations that general-purpose code cannot provide

The agent's ability to chain tool calls means that a well-designed set of custom tools can automate entire workflows that previously required human coordination across multiple systems.


Summary

This chapter has taken you from understanding the extensibility opportunity through building, testing, and deploying custom tools and MCP servers. You learned:

  • The extensibility opportunity is about closing the gap between what AI assistants know and what your team needs them to know. Custom tools give AI access to your specific systems, knowledge, and workflows.

  • The Model Context Protocol (MCP) provides an open standard for AI tool integration. Its client-server architecture, JSON-RPC communication, and support for tools, resources, and prompts make it a flexible foundation for extensibility.

  • Building MCP servers involves implementing tool listings, resource providers, and prompt templates using the mcp Python package. Good error handling and structured output are essential.

  • Custom tool development requires careful interface design following principles of single responsibility, descriptive naming, rich descriptions, and graceful degradation. Tool registries help manage growing tool collections.

  • Data source integration connects AI assistants to databases, APIs, and file systems. Security is paramount — always enforce least privilege, validate inputs, and restrict access.

  • Custom slash commands encode team knowledge and workflows as reusable prompt templates. They are the fastest way to standardize how your team uses AI assistants.

  • Middleware pipelines add cross-cutting concerns like logging, caching, rate limiting, and validation to your tool calls without modifying tool logic.

  • Testing custom tools requires unit tests, integration tests, schema validation, and simulated AI interaction scenarios. Debug with logging and the MCP Inspector.

  • Deployment options range from local stdio-based servers to remote HTTP deployments in containers. Security considerations include secret management, input sanitization, and audit logging.

  • The custom tool ecosystem is growing rapidly, with community-built servers covering many common needs and enterprises building proprietary integrations for competitive advantage.

In the next chapter, we will explore how multiple AI agents can work together on complex tasks, using the tools and protocols covered in this chapter as their shared infrastructure.


Key Terms

Model Context Protocol (MCP): An open standard for connecting AI applications to external tools and data sources through a structured client-server protocol.

MCP Server: A process that implements the MCP protocol and exposes tools, resources, and/or prompts to AI clients.

MCP Client: The component within an AI application that discovers and communicates with MCP servers.

Tool: A function exposed by an MCP server that the AI can invoke to perform actions and receive results.

Resource: A data source exposed by an MCP server that the AI can read for context.

Prompt Template: A pre-defined prompt structure exposed by an MCP server that guides AI behavior for specific tasks.

JSON-RPC 2.0: The message format used by MCP for communication between clients and servers.

stdio Transport: MCP transport that uses standard input/output streams for local communication.

SSE Transport: MCP transport that uses Server-Sent Events for remote communication.

Tool Schema: A JSON Schema definition that describes a tool's input parameters, types, and constraints.

Slash Command: A user-initiated shortcut that triggers a specific prompt template in an AI coding assistant.

Middleware: Code that intercepts tool calls to add cross-cutting concerns like logging, caching, or validation.

Tool Registry: A pattern for organizing and managing multiple tool definitions in an MCP server.