bellwether baseline
Manage baselines for drift detection.
Synopsis
bellwether baseline save [path]
bellwether baseline compare [path]
bellwether baseline show [path]
bellwether baseline diff <path1> <path2>
bellwether baseline accept
Description
The baseline command group manages baselines for detecting structural drift in your MCP server. Baselines capture the server's tool schemas at a point in time, enabling comparison against future checks.
Every baseline subcommand requires a config file. Run bellwether init once in your project to create one.
Subcommands
save
Save check results as a baseline for drift detection.
bellwether baseline save [path]
| Option | Description | Default |
|---|---|---|
| [path] | Output path for baseline file | baseline.savePath or baseline.path |
| -c, --config <path> | Path to config file | bellwether.yaml |
| --report <path> | Path to check report JSON file | output.files.checkReport |
| -f, --force | Overwrite existing baseline | false |
Examples:
# Save baseline from last check run
bellwether baseline save
# Save to specific path
bellwether baseline save ./baselines/v1.0.0.json
# Overwrite existing baseline
bellwether baseline save --force
The save command reads the JSON report generated by bellwether check. The default filename and output directory are configured via output.files.checkReport and output.dir.
compare
Compare check results against an existing baseline.
bellwether baseline compare [baseline-path]
| Option | Description | Default |
|---|---|---|
| [baseline-path] | Path to baseline file to compare against | baseline.comparePath or baseline.path |
| -c, --config <path> | Path to config file | bellwether.yaml |
| --report <path> | Path to check report JSON file | output.files.checkReport |
| --format <format> | Output format: text, json, markdown, compact | baseline.outputFormat |
| --fail-on-drift | Exit with error if drift is detected | baseline.failOnDrift |
| --ignore-version-mismatch | Proceed even if format versions are incompatible | false |
Examples:
# Compare against baseline
bellwether baseline compare ./bellwether-baseline.json
# Fail if drift detected (for CI/CD)
bellwether baseline compare ./bellwether-baseline.json --fail-on-drift
# Output in JSON format
bellwether baseline compare ./baseline.json --format json
# Compare with custom report location
bellwether baseline compare ./baseline.json --report ./output/bellwether-check.json
Relative baseline paths for compare, show, and accept are resolved under output.dir first, then fall back to the current working directory if the file exists there.
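The lookup order described above can be sketched as a small shell function. This is only an illustration of the documented order, not Bellwether's actual implementation: the function name `resolve_baseline_path` is hypothetical, and both the handling of absolute paths and the failure when neither location has the file are assumptions.

```bash
# Hypothetical helper mirroring the documented lookup order for relative
# baseline paths. Usage: resolve_baseline_path <output_dir> <path>
resolve_baseline_path() {
  local output_dir="$1" path="$2"

  # Assumption: absolute paths are used as given (the text only covers relative paths).
  case "$path" in
    /*) printf '%s\n' "$path"; return 0 ;;
  esac

  # 1. Prefer the file under output.dir.
  if [ -f "$output_dir/$path" ]; then
    printf '%s\n' "$output_dir/$path"
    return 0
  fi

  # 2. Fall back to the current working directory, but only if the file exists there.
  if [ -f "$path" ]; then
    printf '%s\n' "$path"
    return 0
  fi

  # Assumption: report failure when neither location has the file.
  return 1
}
```

For example, `resolve_baseline_path ./output bellwether-baseline.json` prints the path under `./output` when that file exists, and otherwise the path in the current directory if the file is present there.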
show
Display the contents of a baseline file.
bellwether baseline show [path]
| Option | Description | Default |
|---|---|---|
| [path] | Path to baseline file | baseline.comparePath or baseline.path |
| -c, --config <path> | Path to config file | bellwether.yaml |
| --json | Output raw JSON | false |
| --tools | Show only tool fingerprints | false |
| --assertions | Show only assertions | false |
Examples:
# Show default baseline
bellwether baseline show
# Show specific baseline
bellwether baseline show ./baselines/v1.0.0.json
# Output raw JSON
bellwether baseline show --json
# Show only tools
bellwether baseline show --tools
diff
Compare two baseline files directly.
bellwether baseline diff <path1> <path2>
| Option | Description | Default |
|---|---|---|
| <path1> | Path to first baseline file | Required |
| <path2> | Path to second baseline file | Required |
| -c, --config <path> | Path to config file | bellwether.yaml |
| --format <format> | Output format: text, json, markdown, compact | baseline.outputFormat |
| --ignore-version-mismatch | Proceed even if format versions are incompatible | false |
Examples:
# Compare two baselines
bellwether baseline diff v1.0.0.json v1.1.0.json
# Output as markdown
bellwether baseline diff old.json new.json --format markdown
# Output as JSON for parsing
bellwether baseline diff old.json new.json --format json
accept
Accept detected drift as intentional and update the baseline with acceptance metadata. This creates an audit trail for intentional changes.
bellwether baseline accept
| Option | Description | Default |
|---|---|---|
| -c, --config <path> | Path to config file | bellwether.yaml |
| --report <path> | Path to check report JSON file | output.files.checkReport |
| --baseline <path> | Baseline file to accept drift against | baseline.comparePath or baseline.path |
| --reason <text> | Why the drift was accepted | - |
| --accepted-by <name> | Who accepted the drift (for audit trail) | - |
| --dry-run | Preview what would be accepted without writing | false |
| -f, --force | Required for accepting breaking changes | false |
Examples:
# Accept drift with a reason
bellwether baseline accept --reason "Added new delete_file tool"
# Accept breaking changes (requires --force)
bellwether baseline accept --reason "Major API update" --force
# Preview acceptance without writing
bellwether baseline accept --reason "Testing" --dry-run
# Include who accepted for audit trail
bellwether baseline accept --reason "Bug fix" --accepted-by "release-bot"
When drift includes breaking changes (tool removals, incompatible schema changes), you must use --force to confirm acceptance.
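One way to keep --force from being passed by accident is to wrap the command so that breaking changes need an explicit opt-in. The sketch below uses only the flags listed above. The wrapper name `accept_drift` and its arguments are hypothetical, and requiring a non-empty reason is a choice made for this example rather than documented CLI behavior.

```bash
# Hypothetical wrapper around `bellwether baseline accept`.
# Usage: accept_drift <reason> [yes|no]   (second argument opts in to --force)
accept_drift() {
  local reason="$1" allow_breaking="${2:-no}"

  # Example policy (an assumption, not CLI behavior): always record a reason.
  if [ -z "$reason" ]; then
    echo "usage: accept_drift <reason> [yes|no]" >&2
    return 2
  fi

  if [ "$allow_breaking" = "yes" ]; then
    # Preview with --dry-run first, then write; --force confirms breaking changes.
    bellwether baseline accept --reason "$reason" --force --dry-run || return $?
    bellwether baseline accept --reason "$reason" --force
  else
    # Without the opt-in, --force is never passed.
    bellwether baseline accept --reason "$reason" --dry-run || return $?
    bellwether baseline accept --reason "$reason"
  fi
}
```

Calling `accept_drift "Added new delete_file tool"` previews and then accepts non-breaking drift, while `accept_drift "Major API update" yes` adds --force to both steps.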
Acceptance Metadata:
When you accept drift, the baseline records:
- When the drift was accepted
- Who accepted it (if --accepted-by provided)
- Why it was accepted (the reason)
- What changes were accepted (snapshot of the diff)
This creates an audit trail for compliance and team visibility:
{
"acceptance": {
"acceptedAt": "2026-01-21T10:30:00Z",
"acceptedBy": "dev-team",
"reason": "Added new delete_file tool",
"acceptedDiff": {
"toolsAdded": ["delete_file"],
"toolsRemoved": [],
"toolsModified": [],
"severity": "info"
}
}
}
Workflow
Initial Setup
# 1. Create config
bellwether init
# 2. Run check
bellwether check npx @mcp/your-server
# 3. Save baseline
bellwether baseline save
CI/CD Integration
# In your CI pipeline:
# 1. Run check
bellwether check
# 2. Compare against committed baseline
bellwether baseline compare ./bellwether-baseline.json --fail-on-drift
Or configure baseline comparison in bellwether.yaml:
baseline:
comparePath: "./bellwether-baseline.json"
failOnDrift: true
Then run check (comparison happens automatically):
bellwether check --fail-on-drift
Version Comparison
# Compare baselines from different versions
bellwether baseline diff ./baselines/v1.0.0.json ./baselines/v2.0.0.json
Baseline Contents
A baseline file captures:
| Component | Description |
|---|---|
| Server Info | Name, version, protocol version, capabilities, instructions |
| Tools | Name, description, schema hash, title, annotations, output schema, execution/task support |
| Prompts | Prompt names, descriptions, titles, arguments |
| Resources | Resource URIs, names, descriptions, titles, MIME types |
| Performance | P50/P95 latency, success rate, confidence level per tool |
| Security Notes | Security observations per tool |
| Limitations | Known limitations per tool |
| Hash | SHA-256 hash for detecting file tampering |
| Acceptance | Optional: when/why drift was accepted (audit trail) |
Sample Baseline Structure
{
"version": "2.1.1",
"metadata": {
"mode": "check",
"generatedAt": "2026-01-15T10:30:00Z",
"cliVersion": "2.1.1",
"serverCommand": "npx @mcp/your-server",
"serverName": "your-server",
"durationMs": 1234,
"personas": [],
"model": "none"
},
"server": {
"name": "your-server",
"version": "1.0.0",
"protocolVersion": "2025-11-25",
"capabilities": ["tools", "prompts"]
},
"capabilities": {
"tools": [
{
"name": "read_file",
"description": "Read contents of a file",
"inputSchema": { "type": "object", "properties": { "path": { "type": "string" } } },
"schemaHash": "def456..."
}
]
},
"interviews": [],
"toolProfiles": [
{
"name": "read_file",
"description": "Read contents of a file",
"schemaHash": "def456...",
"assertions": [],
"securityNotes": ["Path traversal possible"],
"limitations": ["Cannot read binary files"],
"behavioralNotes": []
}
],
"assertions": [],
"summary": "File system server with read capabilities",
"hash": "abc123..."
}
Drift Detection
Comparisons are protocol-version-aware: version-specific fields are compared only when both baselines support the relevant MCP protocol version.
When comparing baselines, Bellwether detects:
| Change Type | Severity | Description |
|---|---|---|
| Tool added | Info | New tool appeared |
| Tool removed | Breaking | Existing tool disappeared |
| Schema changed | Warning/Breaking | Tool parameters changed |
| Description changed | Info | Tool help text updated |
| Annotation changed | Warning | Tool annotations (readOnlyHint, destructiveHint, etc.) changed |
| Title changed | Info | Tool/prompt/resource/resource template title changed |
| Output schema changed | Warning | Structured output schema modified |
| Task support changed | Warning | Execution/task support configuration changed |
| Server instructions changed | Info | Server-level instructions updated |
| Prompt added/removed | Breaking/Info | Prompt template appeared or disappeared |
| Resource changed | Warning | Resource URI, name, or MIME type modified |
| Performance regression | Warning | P50/P95 latency increased beyond threshold |
Severity Levels
- None: No changes detected
- Info: Non-breaking changes (additions, cosmetic updates)
- Warning: Potentially breaking changes (schema modifications)
- Breaking: Definitely breaking changes (removals, incompatible changes)
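The four levels form an ordering from None up to Breaking. The sketch below assumes that the overall severity of a comparison is the highest level among its individual changes. This page does not say how Bellwether aggregates severities, so treat that rule, and the hypothetical function names `severity_rank` and `max_severity`, as assumptions for illustration.

```bash
# Map a severity label (lowercase, as in the acceptance metadata example)
# to a numeric rank. Unknown labels are rejected.
severity_rank() {
  case "$1" in
    none)     echo 0 ;;
    info)     echo 1 ;;
    warning)  echo 2 ;;
    breaking) echo 3 ;;
    *)        return 1 ;;
  esac
}

# Print the highest severity among the arguments; "none" when there are no changes.
max_severity() {
  local worst="none" worst_rank=0 s rank
  for s in "$@"; do
    rank="$(severity_rank "$s")" || return 1
    if [ "$rank" -gt "$worst_rank" ]; then
      worst="$s"
      worst_rank="$rank"
    fi
  done
  printf '%s\n' "$worst"
}
```

Under this assumption, a comparison with one description change (info) and one annotation change (warning) would be reported as warning overall.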
Exit Codes
| Code | Meaning |
|---|---|
| 0 | Success / No drift |
| 4 | Drift detected (with --fail-on-drift) or runtime error |
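A CI step can branch on these codes. The sketch below is a hypothetical wrapper (the name `drift_gate` is not part of Bellwether) that runs the two commands from the CI/CD workflow and reports the result. Because code 4 covers both drift and runtime errors, the script cannot tell them apart from the exit code alone, and any other code is treated as unexpected.

```bash
# Hypothetical CI gate. Usage: drift_gate [baseline-path]
drift_gate() {
  local baseline="${1:-./bellwether-baseline.json}" rc=0

  # Produce a fresh check report; stop early if the check itself fails.
  bellwether check || return $?

  # Compare against the committed baseline and capture the exit code.
  bellwether baseline compare "$baseline" --fail-on-drift || rc=$?

  case "$rc" in
    0) echo "No drift detected." ;;
    4) echo "Drift detected or runtime error; inspect the output above." >&2 ;;
    *) echo "Unexpected exit code: $rc" >&2 ;;
  esac
  return "$rc"
}
```

In a pipeline, `drift_gate ./bellwether-baseline.json || exit 1` would fail the job on any non-zero result.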
Configuration
Baseline settings can be configured in bellwether.yaml:
# bellwether.yaml
baseline:
# Default baseline filename (baseline commands resolve this under output.dir)
path: "bellwether-baseline.json"
# Path used by `bellwether check` drift comparison (output.dir first, cwd fallback)
comparePath: "./bellwether-baseline.json"
# Fail if drift is detected (for CI/CD)
failOnDrift: true
| Setting | Description | Default |
|---|---|---|
| baseline.path | Default baseline filename (resolved under output.dir by baseline commands) | bellwether-baseline.json |
| baseline.comparePath | Baseline to compare against during check (output.dir first, cwd fallback) | - |
| baseline.failOnDrift | Exit with error if drift detected | false |
Check Mode vs Explore Mode
Baselines can only be created from check mode results. Check mode provides deterministic, structural testing:
# Run check and save baseline
bellwether check npx @mcp/your-server
bellwether baseline save
Check mode baselines contain:
- Tool schemas and fingerprints
- Server capabilities
- Integrity hashes
Explore mode (bellwether explore) is for documentation only and cannot be used for baselines. Explore results are LLM-powered and non-deterministic, making them unsuitable for drift detection.
# This works - check mode creates baselines
bellwether check
bellwether baseline save
# This fails - explore mode cannot create baselines
bellwether explore
bellwether baseline save # Error: baseline operations only work with check mode
See Also
- check - Run checks to generate reports
- Drift Detection - Understanding baseline comparison
- Baselines - Baseline concepts and best practices
- CI/CD Integration - Pipeline integration