One-Line Summary
mcp-builder guides developers in creating high-quality MCP (Model Context Protocol) servers, enabling Claude to discover and use external tools and services.
Core Capabilities
- 📡 MCP server development best practices (Python + Node.js)
- 🔌 Connection management and tool registration patterns
- ✅ Built-in evaluation framework for MCP server quality validation
- 📚 Reference docs covering protocol spec and implementation patterns
File Inventory
- mcp-builder
Directory Structure Analysis
mcp-builder is a script-driven Skill organized around two core Python scripts and four reference documents.
- mcp-builder
SKILL.md Structure Analysis
The SKILL.md file (about 200 lines) follows a clear four-phase workflow:
- Phase 1: Deep Research and Planning (lines 20-75) — API Coverage vs Workflow Tools design, protocol research, framework study, implementation planning
- Phase 2: Implementation (lines 78-124) — Project setup, core infrastructure, tool implementation (input/output schemas, descriptions, annotations)
- Phase 3: Review and Test (lines 127-148) — Code quality checks, build verification, MCP Inspector testing
- Phase 4: Create Evaluations (lines 151-192) — Create 10 evaluation questions, XML output format
The recommended language is TypeScript (rationale: strong SDK support, higher-quality AI-generated TypeScript code, and static type checking), and the recommended transport is Streamable HTTP.
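Phase 4 produces an XML file of evaluation questions. The following is a minimal sketch of loading such a file, assuming a layout of `<evaluation>` containing `<qa_pair>` elements with `<question>` and `<answer>` children. These tag names and the `load_qa_pairs` helper are assumptions for illustration; SKILL.md and reference/evaluation.md define the actual format.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the tag names are assumed, not taken from SKILL.md.
SAMPLE_XML = """
<evaluation>
  <qa_pair>
    <question>How many open issues does the demo project have?</question>
    <answer>3</answer>
  </qa_pair>
  <qa_pair>
    <question>Which user created the first issue?</question>
    <answer>alice</answer>
  </qa_pair>
</evaluation>
"""


def load_qa_pairs(xml_text: str) -> list[dict[str, str]]:
    """Return one {"question", "answer"} dict per complete <qa_pair>."""
    root = ET.fromstring(xml_text.strip())
    pairs = []
    for node in root.iter("qa_pair"):
        question = (node.findtext("question") or "").strip()
        answer = (node.findtext("answer") or "").strip()
        if question and answer:  # skip incomplete pairs instead of failing
            pairs.append({"question": question, "answer": answer})
    return pairs


qa_pairs = load_qa_pairs(SAMPLE_XML)
print(len(qa_pairs), qa_pairs[0]["answer"])
```

Skipping incomplete pairs keeps one malformed entry from aborting a whole run; a stricter loader could raise instead.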
Module Relationships
connections.py provides the low-level MCP transport layer (stdio/SSE/HTTP), and evaluation.py builds the evaluation framework on top of it:
mcp-builder module relationship diagram
graph TD
SKILL[SKILL.md] -->|guides| Claude
Claude -->|invokes| eval[evaluation.py]
eval -->|import| conn[connections.py]
conn -->|MCP Transport| Stdio[stdio]
conn -->|MCP Transport| SSE[sse]
conn -->|MCP Transport| HTTP[streamable HTTP]
eval -->|tool discovery| MCP_Server[MCP Server]
eval -->|runs evaluation| Claude_API[Anthropic Claude]
eval -->|outputs| Report[Markdown Report]
subgraph scripts [scripts/]
conn
eval
end
subgraph reference [reference/]
BP[mcp_best_practices.md]
PY[python_mcp_server.md]
TS[node_mcp_server.md]
EG[evaluation.md]
end
SKILL --> reference
style SKILL fill:#4fc3f7,stroke:#0288d1,color:#000
style conn fill:#81c784,stroke:#388e3c,color:#000
style eval fill:#81c784,stroke:#388e3c,color:#000
style Claude_API fill:#ffb74d,stroke:#f57c00,color:#000
style Report fill:#ce93d8,stroke:#7b1fa2,color:#000
Complete Script Inventory
| Script | Language | Lines | Complexity | Purpose |
|------|------|------|--------|------|
| connections.py | Python | ~150 | ⭐⭐⭐ | MCP connection management: supports three transports (stdio, SSE, HTTP) |
| evaluation.py | Python | ~373 | ⭐⭐⭐⭐ | Evaluation framework: runs test questions through Claude and generates a report |
connections.py: MCP Connection Management
connections.py provides a set of pluggable MCP connection classes. The core design is an abstract base class MCPConnection + three concrete implementations (stdio/SSE/HTTP), with a factory function create_connection() for unified creation.
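The abstract-base-class-plus-factory design described above can be sketched as follows. This is an illustration only, not the code in connections.py: the real classes wrap the MCP SDK's async clients, whereas here the `describe` method, the constructor parameters, and the registry dictionary are all hypothetical, and no network I/O takes place.

```python
from abc import ABC, abstractmethod
from typing import Optional


class MCPConnection(ABC):
    """Shared interface; the method below is an illustrative assumption."""

    @abstractmethod
    def describe(self) -> str:
        """Return a human-readable summary of the transport target."""


class StdioConnection(MCPConnection):
    def __init__(self, command: str, args: Optional[list[str]] = None):
        self.command = command
        self.args = args or []

    def describe(self) -> str:
        return "stdio: " + " ".join([self.command, *self.args])


class SSEConnection(MCPConnection):
    def __init__(self, url: str):
        self.url = url

    def describe(self) -> str:
        return f"sse: {self.url}"


class HTTPConnection(MCPConnection):
    def __init__(self, url: str):
        self.url = url

    def describe(self) -> str:
        return f"http: {self.url}"


def create_connection(transport: str, **kwargs) -> MCPConnection:
    """Factory: map a transport name to its concrete connection class."""
    registry = {
        "stdio": StdioConnection,
        "sse": SSEConnection,
        "http": HTTPConnection,
    }
    try:
        cls = registry[transport.lower()]
    except KeyError:
        raise ValueError(f"Unsupported transport: {transport!r}") from None
    return cls(**kwargs)


conn = create_connection("stdio", command="python", args=["server.py"])
print(conn.describe())  # prints "stdio: python server.py"
```

Callers depend only on `MCPConnection`, so adding a transport means adding one subclass and one registry entry.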
evaluation.py: Evaluation Framework
evaluation.py is the evaluation engine of mcp-builder. It uses connections.py to connect to MCP servers, runs test questions through Claude, and generates Markdown evaluation reports.
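To make the report step concrete, here is a minimal sketch of pulling a tagged answer out of a model reply and scoring it. The `extract_tag` and `score` helpers, the sample reply, and the exact-match comparison are assumptions for illustration; the sketch makes no API call and may differ from how evaluation.py actually parses and compares answers.

```python
import re
from typing import Optional


def extract_tag(text: str, tag: str) -> Optional[str]:
    """Return the content of the last <tag>...</tag> block, or None if absent."""
    name = re.escape(tag)
    matches = re.findall(rf"<{name}>(.*?)</{name}>", text, flags=re.DOTALL)
    return matches[-1].strip() if matches else None


def score(model_output: str, expected: str) -> bool:
    """Exact-match comparison on the <response> tag (an assumed rule)."""
    response = extract_tag(model_output, "response")
    return response is not None and response == expected.strip()


# Hypothetical model reply using the summary/feedback/response tags.
sample_reply = """
<summary>Listed issues, then counted the open ones.</summary>
<feedback>The list tool would benefit from a status filter.</feedback>
<response>3</response>
"""

print(score(sample_reply, "3"))
print(extract_tag("no tags here", "response"))
```

Returning `None` for a missing tag lets the caller record an unparseable reply as a failure instead of crashing the run.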
Script Relationship Diagram
mcp-builder script dependency diagram
graph LR
A[evaluation.py] -->|import create_connection| B[connections.py]
B -->|MCP| C[MCP Server via stdio/sse/http]
A -->|Anthropic SDK| D[Claude API]
A -->|parse| E[eval XML File]
A -->|generates| F[Markdown Report]
B -->|MCPClientSession| G[list_tools / call_tool]
style A fill:#81c784,stroke:#388e3c,color:#000
style B fill:#81c784,stroke:#388e3c,color:#000
style D fill:#ffb74d,stroke:#f57c00,color:#000
style F fill:#ce93d8,stroke:#7b1fa2,color:#000
Design Highlights
- ABC Pluggable Transport Layer: The MCPConnection abstract base class defines a unified interface, with three concrete subclasses instantiated via a factory function; this is a classic Strategy pattern
- Claude-as-Judge Evaluation: evaluation.py uses Claude itself to evaluate MCP server effectiveness, controlling output format through structured XML tags (summary/feedback/response)
- Script Layering: connections.py is pure “infrastructure”, evaluation.py is “business logic” — clear separation of concerns
- AsyncExitStack Lifecycle: Ensures MCP connections are properly cleaned up even on exceptions
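The AsyncExitStack point can be illustrated with standard-library code alone. In this sketch, `fake_transport` is a hypothetical stand-in for an MCP transport or session context manager; it only records open and close events so the cleanup order is visible when a failure occurs mid-session.

```python
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager

events: list[str] = []


@asynccontextmanager
async def fake_transport(name: str):
    """Hypothetical stand-in for an MCP transport context manager."""
    events.append(f"open {name}")
    try:
        yield name
    finally:
        # Runs even when the body of the async-with block raises.
        events.append(f"close {name}")


async def run_with_cleanup() -> None:
    async with AsyncExitStack() as stack:
        await stack.enter_async_context(fake_transport("transport"))
        await stack.enter_async_context(fake_transport("session"))
        raise RuntimeError("tool call failed")  # simulate a mid-session failure


def main() -> list[str]:
    try:
        asyncio.run(run_with_cleanup())
    except RuntimeError as exc:
        events.append(f"error: {exc}")
    return events


print(main())
```

The stack unwinds in reverse order of entry, so the session closes before the transport, and the original RuntimeError still reaches the caller.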
Reusable Patterns
Porting Guide
“If you want to build connectors for other APIs…”
- Keep ABC Pattern: Extract an APIConnection abstract base class plus a factory function; this is the most universal pattern
- Replace Transport Layer: Swap stdio/SSE/HTTP for your protocol (gRPC, WebSocket, REST)
- Keep Evaluation Framework: The Claude-as-Judge pattern in evaluation.py is generic for any tool evaluation
- Replace Evaluation XML: Swap MCP test cases for your API test cases
- Remove language-specific references: Replace reference/ language guides with your tech stack docs
Common Pitfalls
⚠️ AsyncExitStack must clean up properly: Exceptions raised in __aexit__ can mask the original exception; the try/except BaseException used in connections.py is the recommended guard
⚠️ SDK Compatibility: MCP Python SDK’s ClientSession API may change — pin versions in requirements.txt
⚠️ Evaluation Prompt Design: If XML tag descriptions in EVALUATION_PROMPT are not precise enough, Claude’s output may be unparseable
⚠️ Don’t run eval on production MCP servers: evaluation.py calls tools and may cause side effects — ensure test environments are isolated
The reusable patterns are summarized below:
| Pattern | Description | Applies to... |
|---|---|---|
| ABC + Factory | Abstract base class defines interface + factory creates instances | Any connection library needing multiple transport implementations |
| Claude-as-Judge | Use Claude to evaluate output, with XML tag format control | AI output quality evaluation scenarios |
| Plug-and-Play Transport | Swap transport implementation without changing business code | Connection tools needing multi-environment deployment |
| Evaluation-Driven Development | Define test cases in XML first, then implement server | API development like MCP servers |