bellwether baseline

Manage baselines for drift detection.

Synopsis

bellwether baseline save [path]
bellwether baseline compare [baseline-path]
bellwether baseline show [path]
bellwether baseline diff <path1> <path2>
bellwether baseline accept

Description

The baseline command group manages baselines for detecting structural drift in your MCP server. Baselines capture the server's tool schemas at a point in time, enabling comparison against future checks.

Config Required

All baseline subcommands require a config file. Run bellwether init once in your project to create one.

Subcommands

save

Save check results as a baseline for drift detection.

bellwether baseline save [path]

Option	Description	Default
`[path]`	Output path for baseline file	`baseline.savePath` or `baseline.path`
`-c, --config <path>`	Path to config file	`bellwether.yaml`
`--report <path>`	Path to check report JSON file	`output.files.checkReport`
`-f, --force`	Overwrite existing baseline	`false`

Examples:

# Save baseline from last check run
bellwether baseline save

# Save to specific path
bellwether baseline save ./baselines/v1.0.0.json

# Overwrite existing baseline
bellwether baseline save --force

Requires Check Report

The save command reads the JSON report generated by bellwether check. The default filename and output directory are configured via output.files.checkReport and output.dir.

compare

Compare check results against an existing baseline.

bellwether baseline compare [baseline-path]

Option	Description	Default
`[baseline-path]`	Path to baseline file to compare against	`baseline.comparePath` or `baseline.path`
`-c, --config <path>`	Path to config file	`bellwether.yaml`
`--report <path>`	Path to check report JSON file	`output.files.checkReport`
`--format <format>`	Output format: `text`, `json`, `markdown`, `compact`	`baseline.outputFormat`
`--fail-on-drift`	Exit with error if drift is detected	`baseline.failOnDrift`
`--ignore-version-mismatch`	Proceed even if format versions are incompatible	`false`

Examples:

# Compare against baseline
bellwether baseline compare ./bellwether-baseline.json

# Fail if drift detected (for CI/CD)
bellwether baseline compare ./bellwether-baseline.json --fail-on-drift

# Output in JSON format
bellwether baseline compare ./baseline.json --format json

# Compare with custom report location
bellwether baseline compare ./baseline.json --report ./output/bellwether-check.json

Relative baseline paths given to compare, show, and accept are resolved under output.dir first. If no file exists there and one exists at the same relative path under the current working directory, that file is used instead.
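The lookup order described above can be sketched in a few lines. This is an illustration of the rule only, not Bellwether's code: resolve_baseline_path and its parameters are hypothetical names, and returning the output.dir location when neither file exists is an assumption.

```python
from pathlib import Path


def resolve_baseline_path(path: str, output_dir: str, cwd: str = ".") -> Path:
    """Hypothetical helper: output.dir first, then cwd if the file exists there."""
    candidate = Path(path)
    if candidate.is_absolute():
        # Absolute paths need no resolution.
        return candidate

    under_output = Path(output_dir) / candidate
    if under_output.exists():
        return under_output

    in_cwd = Path(cwd) / candidate
    if in_cwd.exists():
        # Fallback: used only when the file is actually present in cwd.
        return in_cwd

    # Assumption: with no file in either place, keep the output.dir location.
    return under_output
```

A file that exists in both places resolves to the copy under output.dir, so a stale baseline in the working directory is never picked up by accident.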

show

Display the contents of a baseline file.

bellwether baseline show [path]

Option	Description	Default
`[path]`	Path to baseline file	`baseline.comparePath` or `baseline.path`
`-c, --config <path>`	Path to config file	`bellwether.yaml`
`--json`	Output raw JSON	`false`
`--tools`	Show only tool fingerprints	`false`
`--assertions`	Show only assertions	`false`

Examples:

# Show default baseline
bellwether baseline show

# Show specific baseline
bellwether baseline show ./baselines/v1.0.0.json

# Output raw JSON
bellwether baseline show --json

# Show only tools
bellwether baseline show --tools

diff

Compare two baseline files directly.

bellwether baseline diff <path1> <path2>

Option	Description	Default
`<path1>`	Path to first baseline file	Required
`<path2>`	Path to second baseline file	Required
`-c, --config <path>`	Path to config file	`bellwether.yaml`
`--format <format>`	Output format: `text`, `json`, `markdown`, `compact`	`baseline.outputFormat`
`--ignore-version-mismatch`	Proceed even if format versions are incompatible	`false`

Examples:

# Compare two baselines
bellwether baseline diff v1.0.0.json v1.1.0.json

# Output as markdown
bellwether baseline diff old.json new.json --format markdown

# Output as JSON for parsing
bellwether baseline diff old.json new.json --format json

accept

Accept detected drift as intentional and update the baseline with acceptance metadata. This creates an audit trail for intentional changes.

bellwether baseline accept

Option	Description	Default
`-c, --config <path>`	Path to config file	`bellwether.yaml`
`--report <path>`	Path to check report JSON file	`output.files.checkReport`
`--baseline <path>`	Baseline file to accept drift against	`baseline.comparePath` or `baseline.path`
`--reason <text>`	Why the drift was accepted	-
`--accepted-by <name>`	Who accepted the drift (for audit trail)	-
`--dry-run`	Preview what would be accepted without writing	`false`
`-f, --force`	Required for accepting breaking changes	`false`

Examples:

# Accept drift with a reason
bellwether baseline accept --reason "Added new delete_file tool"

# Accept breaking changes (requires --force)
bellwether baseline accept --reason "Major API update" --force

# Preview acceptance without writing
bellwether baseline accept --reason "Testing" --dry-run

# Include who accepted for audit trail
bellwether baseline accept --reason "Bug fix" --accepted-by "release-bot"

Breaking Changes

When drift includes breaking changes (tool removals, incompatible schema changes), you must use --force to confirm acceptance.

Acceptance Metadata:

When you accept drift, the baseline records:

When the drift was accepted
Who accepted it (if --accepted-by provided)
Why it was accepted (the reason)
What changes were accepted (snapshot of the diff)

This creates an audit trail for compliance and team visibility:

{
  "acceptance": {
    "acceptedAt": "2026-01-21T10:30:00Z",
    "acceptedBy": "dev-team",
    "reason": "Added new delete_file tool",
    "acceptedDiff": {
      "toolsAdded": ["delete_file"],
      "toolsRemoved": [],
      "toolsModified": [],
      "severity": "info"
    }
  }
}
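For scripts that report on the audit trail, the acceptance block can be read directly from the baseline file. The sketch below assumes only the field names shown in the sample above; describe_acceptance is a hypothetical helper, not a Bellwether API.

```python
def describe_acceptance(baseline: dict) -> str:
    """Return a one-line audit summary, or a note if no drift was accepted."""
    acceptance = baseline.get("acceptance")
    if not acceptance:
        return "no accepted drift recorded"

    diff = acceptance.get("acceptedDiff", {})
    who = acceptance.get("acceptedBy", "unknown")  # present only with --accepted-by
    return (
        f"{acceptance.get('acceptedAt', 'unknown time')} {who}: "
        f"{acceptance.get('reason', 'no reason given')} "
        f"(+{len(diff.get('toolsAdded', []))} "
        f"-{len(diff.get('toolsRemoved', []))} "
        f"~{len(diff.get('toolsModified', []))} tools, "
        f"severity={diff.get('severity', 'unknown')})"
    )


# The sample acceptance block from above, as a Python dict.
example = {
    "acceptance": {
        "acceptedAt": "2026-01-21T10:30:00Z",
        "acceptedBy": "dev-team",
        "reason": "Added new delete_file tool",
        "acceptedDiff": {
            "toolsAdded": ["delete_file"],
            "toolsRemoved": [],
            "toolsModified": [],
            "severity": "info",
        },
    }
}
print(describe_acceptance(example))
# prints: 2026-01-21T10:30:00Z dev-team: Added new delete_file tool (+1 -0 ~0 tools, severity=info)
```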

Workflow

Initial Setup

# 1. Create config
bellwether init

# 2. Run check
bellwether check npx @mcp/your-server

# 3. Save baseline
bellwether baseline save

CI/CD Integration

# In your CI pipeline:

# 1. Run check
bellwether check

# 2. Compare against committed baseline
bellwether baseline compare ./bellwether-baseline.json --fail-on-drift

Or configure baseline comparison in bellwether.yaml:

baseline:
  comparePath: "./bellwether-baseline.json"
  failOnDrift: true

Then run check (comparison happens automatically):

bellwether check --fail-on-drift

Version Comparison

# Compare baselines from different versions
bellwether baseline diff ./baselines/v1.0.0.json ./baselines/v2.0.0.json

Baseline Contents

A baseline file captures:

Component	Description
Server Info	Name, version, protocol version, capabilities, instructions
Tools	Name, description, schema hash, title, annotations, output schema, execution/task support
Prompts	Prompt names, descriptions, titles, arguments
Resources	Resource URIs, names, descriptions, titles, MIME types
Performance	P50/P95 latency, success rate, confidence level per tool
Security Notes	Security observations per tool
Limitations	Known limitations per tool
Hash	SHA-256 hash for detecting file tampering
Acceptance	Optional: when/why drift was accepted (audit trail)

Sample Baseline Structure

{
  "version": "2.1.1",
  "metadata": {
    "mode": "check",
    "generatedAt": "2026-01-15T10:30:00Z",
    "cliVersion": "2.1.1",
    "serverCommand": "npx @mcp/your-server",
    "serverName": "your-server",
    "durationMs": 1234,
    "personas": [],
    "model": "none"
  },
  "server": {
    "name": "your-server",
    "version": "1.0.0",
    "protocolVersion": "2025-11-25",
    "capabilities": ["tools", "prompts"]
  },
  "capabilities": {
    "tools": [
      {
        "name": "read_file",
        "description": "Read contents of a file",
        "inputSchema": { "type": "object", "properties": { "path": { "type": "string" } } },
        "schemaHash": "def456..."
      }
    ]
  },
  "interviews": [],
  "toolProfiles": [
    {
      "name": "read_file",
      "description": "Read contents of a file",
      "schemaHash": "def456...",
      "assertions": [],
      "securityNotes": ["Path traversal possible"],
      "limitations": ["Cannot read binary files"],
      "behavioralNotes": []
    }
  ],
  "assertions": [],
  "summary": "File system server with read capabilities",
  "hash": "abc123..."
}
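Because the baseline is plain JSON, it can be inspected from a script as well as with bellwether baseline show. The following is a minimal sketch that assumes the field names in the sample above; summarize_baseline is a hypothetical helper.

```python
def summarize_baseline(baseline: dict) -> list[str]:
    """Build a short text summary from the fields shown in the sample baseline."""
    server = baseline["server"]
    mode = baseline["metadata"]["mode"]
    lines = [f"{server['name']} {server['version']} (mode: {mode})"]
    for profile in baseline.get("toolProfiles", []):
        notes = len(profile.get("securityNotes", []))
        lines.append(
            f"- {profile['name']} [{profile['schemaHash']}] security notes: {notes}"
        )
    return lines


# Usage, assuming a baseline saved at this path:
#   import json
#   from pathlib import Path
#   data = json.loads(Path("bellwether-baseline.json").read_text())
#   print("\n".join(summarize_baseline(data)))
```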

Drift Detection

Comparisons are protocol-version-aware: version-specific fields are only compared when both baselines support the relevant MCP protocol version.

When comparing baselines, Bellwether detects:

Change Type	Severity	Description
Tool added	Info	New tool appeared
Tool removed	Breaking	Existing tool disappeared
Schema changed	Warning/Breaking	Tool parameters changed
Description changed	Info	Tool help text updated
Annotation changed	Warning	Tool annotations (readOnlyHint, destructiveHint, etc.) changed
Title changed	Info	Tool/prompt/resource/resource template title changed
Output schema changed	Warning	Structured output schema modified
Task support changed	Warning	Execution/task support configuration changed
Server instructions changed	Info	Server-level instructions updated
Prompt added/removed	Info/Breaking	Prompt template appeared or disappeared
Resource changed	Warning	Resource URI, name, or MIME type modified
Performance regression	Warning	P50/P95 latency increased beyond threshold
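To make the first three rows of the table concrete, here is a simplified sketch of fingerprint-based comparison. It is not Bellwether's implementation: diff_tools is a hypothetical function that only compares the name and schemaHash of each entry in toolProfiles, and it borrows its result keys from the acceptedDiff sample.

```python
def diff_tools(old: dict, new: dict) -> dict:
    """Compare two baselines by tool name and schemaHash only."""
    old_tools = {t["name"]: t["schemaHash"] for t in old.get("toolProfiles", [])}
    new_tools = {t["name"]: t["schemaHash"] for t in new.get("toolProfiles", [])}
    return {
        # In the new baseline but not the old one.
        "toolsAdded": sorted(new_tools.keys() - old_tools.keys()),
        # In the old baseline but missing from the new one.
        "toolsRemoved": sorted(old_tools.keys() - new_tools.keys()),
        # Present in both, but the schema fingerprint differs.
        "toolsModified": sorted(
            name
            for name in old_tools.keys() & new_tools.keys()
            if old_tools[name] != new_tools[name]
        ),
    }
```

Comparing hashes rather than full schemas keeps the check cheap, but it only says that a schema changed, not how. Deciding between Warning and Breaking needs the schemas themselves.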

Severity Levels

None: No changes detected
Info: Non-breaking changes (additions, cosmetic updates)
Warning: Potentially breaking changes (schema modifications)
Breaking: Definitely breaking changes (removals, incompatible changes)
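The four levels form an ordered scale, which can be modelled as an enum. The sketch below rests on one assumption: that the overall severity of a diff is the highest severity among its individual changes. The names Severity, overall_severity, and needs_force are hypothetical. The --force rule itself comes from the accept subcommand described earlier.

```python
from enum import IntEnum


class Severity(IntEnum):
    NONE = 0
    INFO = 1
    WARNING = 2
    BREAKING = 3


def overall_severity(changes: list[Severity]) -> Severity:
    """Assumption: a diff is as severe as its most severe change."""
    return max(changes, default=Severity.NONE)


def needs_force(severity: Severity) -> bool:
    """Accepting breaking changes requires --force."""
    return severity == Severity.BREAKING
```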

Exit Codes

Code	Meaning
`0`	Success / No drift
`4`	Drift detected (with `--fail-on-drift`) or runtime error
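In a CI script, the exit code can be turned into a readable message. The wrapper below is a sketch: run_drift_gate is a hypothetical name, and it treats any code other than 0 or 4 as unexpected because the table above lists only those two.

```python
import subprocess
import sys


def run_drift_gate(command: list[str]) -> int:
    """Run a compare command and report what its exit code means."""
    result = subprocess.run(command)
    if result.returncode == 0:
        print("baseline check passed: no drift")
    elif result.returncode == 4:
        # Code 4 covers both drift (with --fail-on-drift) and runtime errors,
        # so read the command's own output to tell the two apart.
        print("baseline check failed: drift detected or runtime error", file=sys.stderr)
    else:
        print(f"unexpected exit code {result.returncode}", file=sys.stderr)
    return result.returncode


# Usage in a pipeline step:
#   sys.exit(run_drift_gate(
#       ["bellwether", "baseline", "compare",
#        "./bellwether-baseline.json", "--fail-on-drift"]
#   ))
```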

Configuration

Baseline settings can be configured in bellwether.yaml:

# bellwether.yaml
baseline:
  # Default baseline filename (baseline commands resolve this under output.dir)
  path: "bellwether-baseline.json"

  # Path used by `bellwether check` drift comparison (output.dir first, cwd fallback)
  comparePath: "./bellwether-baseline.json"

  # Fail if drift is detected (for CI/CD)
  failOnDrift: true

Setting	Description	Default
`baseline.path`	Default baseline filename (resolved under `output.dir` by baseline commands)	`bellwether-baseline.json`
`baseline.comparePath`	Baseline to compare against during check (output.dir first, cwd fallback)	-
`baseline.failOnDrift`	Exit with error if drift detected	`false`

Check Mode vs Explore Mode

Baselines can only be created from check mode results. Check mode provides deterministic, structural testing:

# Run check and save baseline
bellwether check npx @mcp/your-server
bellwether baseline save

Check mode baselines contain:

Tool schemas and fingerprints
Server capabilities
Integrity hashes

Explore mode (bellwether explore) is for documentation only and cannot be used for baselines. Explore results are LLM-powered and non-deterministic, making them unsuitable for drift detection.

# This works - check mode creates baselines
bellwether check
bellwether baseline save

# This fails - explore mode cannot create baselines
bellwether explore
bellwether baseline save  # Error: baseline operations only work with check mode
