Bellwether
Automated behavioral documentation for MCP servers through LLM-guided testing.
Bellwether is a CLI tool that generates comprehensive behavioral documentation for Model Context Protocol (MCP) servers. Instead of relying on manually written docs, Bellwether interviews your MCP server by:
- Discovering available tools, prompts, and resources
- Generating realistic test scenarios using an LLM
- Executing tests and analyzing actual responses
- Synthesizing findings into actionable documentation
Why Bellwether?
| Problem | Solution |
|---|---|
| Documentation says one thing, but what does the server actually do? | Trust but verify - Interview the server to document real behavior |
| Breaking changes slip into production unnoticed | Drift detection - Catch behavioral changes before they hit production |
| Security vulnerabilities are hard to discover manually | Security insights - Persona-based adversarial testing |
| Manual testing is slow and expensive | CI/CD integration - Automated regression testing for MCP servers |
Key Features
- AGENTS.md Generation - Human-readable behavioral documentation generated automatically from actual server responses
- Complete MCP Coverage - Test tools, prompts, and resources with content previews and access patterns
- Drift Detection - Compare baselines to detect behavioral changes between versions with semantic diff analysis
- Multi-Persona Testing - Security tester, QA engineer, technical writer, and novice user personas for comprehensive coverage
- MCP Registry Integration - Search and discover servers from the official MCP Registry
- Verification Program - Certify your server with Bronze, Silver, Gold, or Platinum tiers
- GitHub Action - Official action for automated CI/CD integration
- Multiple Output Formats - Markdown, JSON, JUnit XML, and SARIF for GitHub Code Scanning
How It Works
MCP Server           Bellwether                   Output
    |                     |                          |
    |     tools/list      |                          |
    |<--------------------|                          |
    |                     |                          |
    |     tools/call      |  LLM generates           |
    |<--------------------|  test scenarios          |
    |                     |                          |
    |      responses      |                          |
    |-------------------->|  Analyze behavior        |
    |                     |                          |
    |                     |------------------------->  AGENTS.md
    |                     |                            baseline.json
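The two requests in the diagram, tools/list and tools/call, are JSON-RPC 2.0 methods defined by the Model Context Protocol. As a rough illustration of what travels over the wire, the sketch below prints both messages. The helper names `mcp_list_tools` and `mcp_call_tool` and the example file path are hypothetical; this is not Bellwether's implementation, and a real session also starts with an initialize handshake that is omitted here.

```sh
# Hypothetical helpers that print the JSON-RPC 2.0 requests from the diagram.
# They only build the messages; they are NOT part of Bellwether.

# Ask the server which tools it exposes.
mcp_list_tools() {
  # $1: request id
  printf '{"jsonrpc":"2.0","id":%s,"method":"tools/list"}\n' "$1"
}

# Invoke one tool by name with a JSON object of arguments.
mcp_call_tool() {
  # $1: request id, $2: tool name, $3: arguments as a JSON object
  printf '{"jsonrpc":"2.0","id":%s,"method":"tools/call","params":{"name":"%s","arguments":%s}}\n' "$1" "$2" "$3"
}

mcp_list_tools 1
mcp_call_tool 2 read_file '{"path":"/tmp/example.txt"}'
```

In an interview, the arguments passed to tools/call are the part the LLM generates, and the responses are what gets analyzed and written to AGENTS.md.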
Output Example
Bellwether generates AGENTS.md files documenting observed server behavior:
# @modelcontextprotocol/server-filesystem
> Generated by Bellwether on 2026-01-12
## Overview
A file management server providing tools for reading, writing, and searching files.
## Tools
### read_file
Read contents of a file from the specified path.
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| path | string | yes | Path to the file to read |
**Observed Behavior:**
- Returns file contents as UTF-8 text
- Binary files are returned as base64-encoded content
- Maximum file size: 10MB
**Limitations:**
- Cannot read files outside configured root directory
**Security Considerations:**
- Path traversal attempts (../) are normalized within root
Cost Efficiency
Bellwether calls an LLM to generate test scenarios and analyze responses, so each interview has a small API cost. Typical costs per interview (10 tools, 3 questions each):
| Model | Cost | Quality |
|---|---|---|
| gpt-5-mini | ~$0.02 | Good (recommended for CI) |
| claude-haiku-4-5 | ~$0.04 | Good |
| gpt-5.2 | ~$0.12 | Best |
| claude-sonnet-4-5 | ~$0.13 | Best |
| Ollama (local) | Free | Variable |
Use the --quick flag in CI for the fastest, cheapest runs (~$0.01).
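To get a rough figure for a server of a different size, the table prices can be scaled by question count. The sketch below assumes cost grows linearly with the number of questions, which the figures above do not guarantee, so treat the result as an order-of-magnitude estimate. The `estimate_cost` helper is hypothetical.

```sh
# Hypothetical helper: scale a table price (quoted for 10 tools x 3 questions)
# to a different interview size.
# ASSUMPTION: cost is linear in the total number of questions.
estimate_cost() {
  # $1: table cost in USD, $2: number of tools, $3: questions per tool
  awk -v base="$1" -v tools="$2" -v q="$3" \
    'BEGIN { printf "%.2f\n", base * (tools * q) / (10 * 3) }'
}

# 25 tools, 3 questions each, at the ~$0.02 rate listed for gpt-5-mini
estimate_cost 0.02 25 3   # prints 0.05
```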
Quick Example
# Install
npm install -g @dotsetlabs/bellwether
# Set your API key (or use Ollama for free)
export OPENAI_API_KEY=sk-xxx
# Interview a local server during development
bellwether interview node ./src/mcp-server.js
# Or interview an npm package
bellwether interview npx @modelcontextprotocol/server-filesystem /tmp
# Output: AGENTS.md with behavioral documentation
Local Development Workflow
Bellwether integrates into your development workflow to catch behavioral drift before deployment:
# 1. Test your local server
bellwether interview node ./src/mcp-server.js
# 2. Save a baseline after initial development
bellwether interview --save-baseline node ./src/mcp-server.js
# 3. Use watch mode for continuous testing
bellwether watch node ./src/mcp-server.js --watch-path ./src
# 4. Before committing, check for drift
bellwether interview --compare-baseline ./baseline.json node ./src/mcp-server.js
To avoid API costs during development, run interviews against a local Ollama model.
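Step 4 of the workflow can be wired into a Git pre-commit hook. The sketch below is a minimal version under two assumptions that the text above does not confirm: that bellwether exits with a non-zero status when it detects drift (check the CLI Reference), and that the baseline is stored at ./baseline.json. The function name `check_drift` is hypothetical.

```sh
#!/bin/sh
# Sketch of a pre-commit drift check built from the step 4 command.
# ASSUMPTION: bellwether exits non-zero when drift is detected.

check_drift() {
  baseline="${1:-./baseline.json}"    # baseline path used in the workflow
  server="${2:-./src/mcp-server.js}"  # server entry point used in the workflow

  # Skip when the tool is missing so the hook never blocks unrelated work.
  if ! command -v bellwether >/dev/null 2>&1; then
    echo "bellwether not installed; skipping drift check" >&2
    return 0
  fi
  # Without a baseline there is nothing to compare against.
  if [ ! -f "$baseline" ]; then
    echo "no baseline at $baseline; run step 2 first" >&2
    return 0
  fi

  if bellwether interview --compare-baseline "$baseline" node "$server"; then
    echo "no drift detected"
  else
    echo "drift detected; review before committing" >&2
    return 1
  fi
}

check_drift "$@"
```

Saved as .git/hooks/pre-commit and made executable, the script blocks a commit only when the comparison fails. Because each run calls an LLM, a local Ollama model keeps the hook free to run.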
Next Steps
- Installation - Install Bellwether and configure your LLM provider
- Quick Start - Run your first interview in 5 minutes
- Local Development - Test your server during development
- CLI Reference - Full command documentation
- MCP Registry - Discover servers to test
- Verification - Certify your server
- CI/CD Integration - Automate with the GitHub Action