
bellwether baseline

Manage baselines for drift detection.

Synopsis

bellwether baseline save [path]
bellwether baseline compare [path]
bellwether baseline show [path]
bellwether baseline diff <path1> <path2>
bellwether baseline accept

Description

The baseline command group manages baselines for detecting structural drift in your MCP server. Baselines capture the server's tool schemas at a point in time, enabling comparison against future checks.

Config Required

All baseline subcommands require a config file. Run bellwether init once in your project to create one.

Subcommands

save

Save check results as a baseline for drift detection.

bellwether baseline save [path]

| Option | Description | Default |
| --- | --- | --- |
| [path] | Output path for baseline file | baseline.savePath or baseline.path |
| -c, --config <path> | Path to config file | bellwether.yaml |
| --report <path> | Path to check report JSON file | output.files.checkReport |
| -f, --force | Overwrite existing baseline | false |

Examples:

# Save baseline from last check run
bellwether baseline save

# Save to specific path
bellwether baseline save ./baselines/v1.0.0.json

# Overwrite existing baseline
bellwether baseline save --force

Requires Check Report

The save command reads the JSON report generated by bellwether check. The default filename and output directory are configured via output.files.checkReport and output.dir.

compare

Compare check results against an existing baseline.

bellwether baseline compare [baseline-path]

| Option | Description | Default |
| --- | --- | --- |
| [baseline-path] | Path to baseline file to compare against | baseline.comparePath or baseline.path |
| -c, --config <path> | Path to config file | bellwether.yaml |
| --report <path> | Path to check report JSON file | output.files.checkReport |
| --format <format> | Output format: text, json, markdown, compact | baseline.outputFormat |
| --fail-on-drift | Exit with error if drift is detected | baseline.failOnDrift |
| --ignore-version-mismatch | Proceed even if format versions are incompatible | false |

Examples:

# Compare against baseline
bellwether baseline compare ./bellwether-baseline.json

# Fail if drift detected (for CI/CD)
bellwether baseline compare ./bellwether-baseline.json --fail-on-drift

# Output in JSON format
bellwether baseline compare ./baseline.json --format json

# Compare with custom report location
bellwether baseline compare ./baseline.json --report ./output/bellwether-check.json

Relative baseline paths for compare, show, and accept are resolved under output.dir first, then fall back to the current working directory if the file exists there.
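
The lookup order described above can be sketched as a small shell function. This only illustrates the documented order and is not Bellwether's actual implementation: the function name resolve_baseline_path and its arguments are hypothetical, and returning a non-zero status when the file exists in neither location is an assumption.

```shell
# Hypothetical helper: mirrors the documented lookup order for a relative baseline path.
# $1 = output directory (the value of output.dir), $2 = relative baseline path
resolve_baseline_path() {
  if [ -f "$1/$2" ]; then
    # First choice: the file under output.dir
    printf '%s\n' "$1/$2"
  elif [ -f "$PWD/$2" ]; then
    # Fallback: the file exists in the current working directory
    printf '%s\n' "$PWD/$2"
  else
    # Assumption: signal failure when neither location has the file
    return 1
  fi
}
```

For example, resolve_baseline_path ./output bellwether-baseline.json prints the path under ./output when the file is there, and the path under the current directory otherwise.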

show

Display the contents of a baseline file.

bellwether baseline show [path]

| Option | Description | Default |
| --- | --- | --- |
| [path] | Path to baseline file | baseline.comparePath or baseline.path |
| -c, --config <path> | Path to config file | bellwether.yaml |
| --json | Output raw JSON | false |
| --tools | Show only tool fingerprints | false |
| --assertions | Show only assertions | false |

Examples:

# Show default baseline
bellwether baseline show

# Show specific baseline
bellwether baseline show ./baselines/v1.0.0.json

# Output raw JSON
bellwether baseline show --json

# Show only tools
bellwether baseline show --tools

diff

Compare two baseline files directly.

bellwether baseline diff <path1> <path2>

| Option | Description | Default |
| --- | --- | --- |
| <path1> | Path to first baseline file | Required |
| <path2> | Path to second baseline file | Required |
| -c, --config <path> | Path to config file | bellwether.yaml |
| --format <format> | Output format: text, json, markdown, compact | baseline.outputFormat |
| --ignore-version-mismatch | Proceed even if format versions are incompatible | false |

Examples:

# Compare two baselines
bellwether baseline diff v1.0.0.json v1.1.0.json

# Output as markdown
bellwether baseline diff old.json new.json --format markdown

# Output as JSON for parsing
bellwether baseline diff old.json new.json --format json

accept

Accept detected drift as intentional and update the baseline with acceptance metadata. This creates an audit trail for intentional changes.

bellwether baseline accept

| Option | Description | Default |
| --- | --- | --- |
| -c, --config <path> | Path to config file | bellwether.yaml |
| --report <path> | Path to check report JSON file | output.files.checkReport |
| --baseline <path> | Baseline file to accept drift against | baseline.comparePath or baseline.path |
| --reason <text> | Why the drift was accepted | - |
| --accepted-by <name> | Who accepted the drift (for audit trail) | - |
| --dry-run | Preview what would be accepted without writing | false |
| -f, --force | Required for accepting breaking changes | false |

Examples:

# Accept drift with a reason
bellwether baseline accept --reason "Added new delete_file tool"

# Accept breaking changes (requires --force)
bellwether baseline accept --reason "Major API update" --force

# Preview acceptance without writing
bellwether baseline accept --reason "Testing" --dry-run

# Include who accepted for audit trail
bellwether baseline accept --reason "Bug fix" --accepted-by "release-bot"

Breaking Changes

When drift includes breaking changes (tool removals, incompatible schema changes), you must use --force to confirm acceptance.

Acceptance Metadata:

When you accept drift, the baseline records:

  • When the drift was accepted
  • Who accepted it (if --accepted-by provided)
  • Why it was accepted (the reason)
  • What changes were accepted (snapshot of the diff)

This creates an audit trail for compliance and team visibility:

{
  "acceptance": {
    "acceptedAt": "2026-01-21T10:30:00Z",
    "acceptedBy": "dev-team",
    "reason": "Added new delete_file tool",
    "acceptedDiff": {
      "toolsAdded": ["delete_file"],
      "toolsRemoved": [],
      "toolsModified": [],
      "severity": "info"
    }
  }
}
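
One way to use the accept options together is to preview the acceptance before writing it. The wrapper below is a minimal sketch built only from the flags documented above; the function name accept_drift and its argument layout are hypothetical, and passing --force to the dry run as well as to the real run is an assumption.

```shell
# Hypothetical wrapper: preview an acceptance with --dry-run, then record it.
# $1 = reason, $2 = accepted-by name, $3 = "force" to also pass --force (optional)
accept_drift() {
  reason=$1
  accepted_by=$2
  force_flag=""
  if [ "${3:-}" = "force" ]; then
    force_flag="--force"
  fi
  # Preview what would be accepted; stop if the preview itself fails.
  # $force_flag is intentionally unquoted so an empty value adds no argument.
  bellwether baseline accept --reason "$reason" --accepted-by "$accepted_by" --dry-run $force_flag || return $?
  # Write the acceptance metadata into the baseline.
  bellwether baseline accept --reason "$reason" --accepted-by "$accepted_by" $force_flag
}
```

A call such as accept_drift "Major API update" "release-bot" force would cover the breaking-change case, while omitting the third argument leaves --force off.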

Workflow

Initial Setup

# 1. Create config
bellwether init

# 2. Run check
bellwether check npx @mcp/your-server

# 3. Save baseline
bellwether baseline save

CI/CD Integration

# In your CI pipeline:

# 1. Run check
bellwether check

# 2. Compare against committed baseline
bellwether baseline compare ./bellwether-baseline.json --fail-on-drift

Or configure baseline comparison in bellwether.yaml:

baseline:
  comparePath: "./bellwether-baseline.json"
  failOnDrift: true

Then run check (comparison happens automatically):

bellwether check --fail-on-drift

Version Comparison

# Compare baselines from different versions
bellwether baseline diff ./baselines/v1.0.0.json ./baselines/v2.0.0.json

Baseline Contents

A baseline file captures:

| Component | Description |
| --- | --- |
| Server Info | Name, version, protocol version, capabilities, instructions |
| Tools | Name, description, schema hash, title, annotations, output schema, execution/task support |
| Prompts | Prompt names, descriptions, titles, arguments |
| Resources | Resource URIs, names, descriptions, titles, MIME types |
| Performance | P50/P95 latency, success rate, confidence level per tool |
| Security Notes | Security observations per tool |
| Limitations | Known limitations per tool |
| Hash | SHA-256 hash for detecting file tampering |
| Acceptance | Optional: when/why drift was accepted (audit trail) |

Sample Baseline Structure

{
  "version": "2.1.1",
  "metadata": {
    "mode": "check",
    "generatedAt": "2026-01-15T10:30:00Z",
    "cliVersion": "2.1.1",
    "serverCommand": "npx @mcp/your-server",
    "serverName": "your-server",
    "durationMs": 1234,
    "personas": [],
    "model": "none"
  },
  "server": {
    "name": "your-server",
    "version": "1.0.0",
    "protocolVersion": "2025-11-25",
    "capabilities": ["tools", "prompts"]
  },
  "capabilities": {
    "tools": [
      {
        "name": "read_file",
        "description": "Read contents of a file",
        "inputSchema": { "type": "object", "properties": { "path": { "type": "string" } } },
        "schemaHash": "def456..."
      }
    ]
  },
  "interviews": [],
  "toolProfiles": [
    {
      "name": "read_file",
      "description": "Read contents of a file",
      "schemaHash": "def456...",
      "assertions": [],
      "securityNotes": ["Path traversal possible"],
      "limitations": ["Cannot read binary files"],
      "behavioralNotes": []
    }
  ],
  "assertions": [],
  "summary": "File system server with read capabilities",
  "hash": "abc123..."
}

Drift Detection

Comparisons are protocol-version-aware: a version-specific field is compared only when both baselines support the MCP protocol version that the field belongs to.
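
The gating rule can be pictured as a check that both baselines are at or after the protocol version a field belongs to. The sketch below is an illustration only: the function name both_support is hypothetical, it assumes protocol versions are date strings in YYYY-MM-DD form (as in the sample baseline), and this page does not state which fields map to which protocol versions.

```shell
# Hypothetical check: succeed only if both protocol versions are at or after
# the version associated with a field. Versions are assumed to be YYYY-MM-DD strings.
# $1 = version associated with the field
# $2, $3 = protocolVersion values of the two baselines
both_support() {
  # Strip the dashes so the dates can be compared as integers (e.g. 20251125).
  min=$(printf '%s' "$1" | tr -d '-')
  first=$(printf '%s' "$2" | tr -d '-')
  second=$(printf '%s' "$3" | tr -d '-')
  [ "$first" -ge "$min" ] && [ "$second" -ge "$min" ]
}
```

Under these assumptions, both_support 2025-06-18 2025-11-25 2025-03-26 fails because the second baseline predates the field's version, so that field would be skipped in the comparison.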

When comparing baselines, Bellwether detects:

| Change Type | Severity | Description |
| --- | --- | --- |
| Tool added | Info | New tool appeared |
| Tool removed | Breaking | Existing tool disappeared |
| Schema changed | Warning/Breaking | Tool parameters changed |
| Description changed | Info | Tool help text updated |
| Annotation changed | Warning | Tool annotations (readOnlyHint, destructiveHint, etc.) changed |
| Title changed | Info | Tool/prompt/resource/resource template title changed |
| Output schema changed | Warning | Structured output schema modified |
| Task support changed | Warning | Execution/task support configuration changed |
| Server instructions changed | Info | Server-level instructions updated |
| Prompt added/removed | Info (added) / Breaking (removed) | Prompt template appeared or disappeared |
| Resource changed | Warning | Resource URI, name, or MIME type modified |
| Performance regression | Warning | P50/P95 latency increased beyond threshold |

Severity Levels

  • None: No changes detected
  • Info: Non-breaking changes (additions, cosmetic updates)
  • Warning: Potentially breaking changes (schema modifications)
  • Breaking: Definitely breaking changes (removals, incompatible changes)
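
The four levels form an ordering from None up to Breaking. As a rough sketch, an overall severity for a set of changes could be taken as the highest level among them. That aggregation rule is an assumption, not something this page states, and the function names severity_rank and overall_severity are hypothetical.

```shell
# Hypothetical helper: map a severity name to a numeric rank.
severity_rank() {
  case "$1" in
    none) echo 0 ;;
    info) echo 1 ;;
    warning) echo 2 ;;
    breaking) echo 3 ;;
    *) return 1 ;;  # unknown severity name
  esac
}

# Print the highest severity among the arguments ("none" when no arguments are given).
overall_severity() {
  highest=none
  highest_rank=0
  for sev in "$@"; do
    rank=$(severity_rank "$sev") || return 1
    if [ "$rank" -gt "$highest_rank" ]; then
      highest=$sev
      highest_rank=$rank
    fi
  done
  printf '%s\n' "$highest"
}
```

With this rule, a diff containing one added tool (info) and one changed annotation (warning) would be reported as warning overall.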

Exit Codes

| Code | Meaning |
| --- | --- |
| 0 | Success / No drift |
| 4 | Drift detected (with --fail-on-drift) or runtime error |
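
In a CI script, these exit codes can be turned into explicit messages. The function below is a minimal sketch: the name drift_gate and the message wording are hypothetical. Because code 4 covers both drift and runtime errors, the script cannot tell the two apart from the code alone.

```shell
# Hypothetical CI helper: compare against a baseline and interpret the exit code.
# $1 = baseline path
drift_gate() {
  status=0
  bellwether baseline compare "$1" --fail-on-drift || status=$?
  case "$status" in
    0) echo "No drift detected" ;;
    4) echo "Drift detected or runtime error; see the output above" >&2 ;;
    *) echo "Unexpected exit code: $status" >&2 ;;
  esac
  # Propagate the original exit code so the pipeline step fails when compare fails.
  return "$status"
}
```

Calling drift_gate ./bellwether-baseline.json after bellwether check keeps the step's pass/fail result identical to the compare command's while making the log easier to read.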

Configuration

Baseline settings can be configured in bellwether.yaml:

# bellwether.yaml
baseline:
  # Default baseline filename (baseline commands resolve this under output.dir)
  path: "bellwether-baseline.json"

  # Path used by `bellwether check` drift comparison (resolved under output.dir)
  comparePath: "./bellwether-baseline.json"

  # Fail if drift is detected (for CI/CD)
  failOnDrift: true

| Setting | Description | Default |
| --- | --- | --- |
| baseline.path | Default baseline filename (resolved under output.dir by baseline commands) | bellwether-baseline.json |
| baseline.comparePath | Baseline to compare against during check (output.dir first, cwd fallback) | - |
| baseline.failOnDrift | Exit with error if drift detected | false |

Check Mode vs Explore Mode

Baselines can only be created from check mode results. Check mode provides deterministic, structural testing:

# Run check and save baseline
bellwether check npx @mcp/your-server
bellwether baseline save

Check mode baselines contain:

  • Tool schemas and fingerprints
  • Server capabilities
  • Integrity hashes

Explore mode (bellwether explore) is for documentation only and cannot be used for baselines. Explore results are LLM-powered and non-deterministic, making them unsuitable for drift detection.

# This works - check mode creates baselines
bellwether check
bellwether baseline save

# This fails - explore mode cannot create baselines
bellwether explore
bellwether baseline save # Error: baseline operations only work with check mode

See Also