
Baselines

Baselines capture a snapshot of your MCP server's expected behavior, enabling drift detection and regression testing.

What Is a Baseline?

A baseline is a JSON file containing:

  • Server capabilities - Tools, prompts, and resources
  • Tool schemas - Parameter types and requirements
  • Behavioral observations - How tools actually behave
  • Security findings - Any identified vulnerabilities

Creating a Baseline

bellwether interview --save-baseline npx your-server

This generates bellwether-baseline.json:

{
  "version": 1,
  "timestamp": "2026-01-12T10:30:00Z",
  "server": {
    "name": "@modelcontextprotocol/server-filesystem",
    "version": "1.0.0"
  },
  "tools": [
    {
      "name": "read_file",
      "schema": {
        "type": "object",
        "properties": {
          "path": { "type": "string" }
        },
        "required": ["path"]
      },
      "behavior": {
        "observations": [
          "Returns UTF-8 text for text files",
          "Returns base64 for binary files",
          "Maximum file size: 10MB"
        ],
        "errors": [
          "ENOENT for missing files",
          "EACCES for permission denied"
        ],
        "security": [
          "Path traversal normalized within root"
        ]
      }
    }
  ]
}
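Because the baseline is plain JSON, it can be inspected with ordinary tooling. The following is a minimal sketch, assuming the file has the structure shown in the example above; `summarize_baseline` is a hypothetical helper and not part of Bellwether.

```python
import json

# Inline copy of the example above, trimmed to the fields this sketch reads.
BASELINE_JSON = """
{
  "version": 1,
  "server": {"name": "@modelcontextprotocol/server-filesystem", "version": "1.0.0"},
  "tools": [
    {
      "name": "read_file",
      "schema": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"]
      },
      "behavior": {
        "observations": ["Returns UTF-8 text for text files"],
        "errors": ["ENOENT for missing files"],
        "security": ["Path traversal normalized within root"]
      }
    }
  ]
}
"""


def summarize_baseline(baseline: dict) -> dict:
    """Map each tool name to its required parameters and observation count.

    Hypothetical helper: it only assumes the keys visible in the example.
    """
    summary = {}
    for tool in baseline.get("tools", []):
        schema = tool.get("schema", {})
        behavior = tool.get("behavior", {})
        summary[tool["name"]] = {
            "required": list(schema.get("required", [])),
            "observations": len(behavior.get("observations", [])),
        }
    return summary


baseline = json.loads(BASELINE_JSON)
summary = summarize_baseline(baseline)
print(summary)
```

To read a real file instead, replace the inline string with `json.load(open("bellwether-baseline.json"))`.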

Custom Baseline Path

# Save to specific path
bellwether interview --save-baseline ./baselines/v1.json npx your-server

# Compare against specific baseline
bellwether interview --compare-baseline ./baselines/v1.json npx your-server

Baseline in CI/CD

Commit to Version Control

# Create baseline
bellwether interview --save-baseline npx your-server

# Commit
git add bellwether-baseline.json
git commit -m "Update behavioral baseline"

Check in CI

# GitHub Actions
- name: Check Behavioral Drift
  run: |
    bellwether interview \
      --compare-baseline ./bellwether-baseline.json \
      --fail-on-drift \
      npx your-server
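For CI systems other than GitHub Actions, the same check can be wrapped in a small script. This is a sketch only: it uses just the flags shown above, and it assumes that `--fail-on-drift` makes the command exit with a non-zero status when drift is found. `build_compare_command` and `check_drift` are hypothetical names.

```python
import shlex
import subprocess


def build_compare_command(baseline_path: str, server_command: list[str]) -> list[str]:
    """Assemble the drift-check invocation from the flags shown above."""
    return [
        "bellwether", "interview",
        "--compare-baseline", baseline_path,
        "--fail-on-drift",
        *server_command,
    ]


def check_drift(baseline_path: str, server_command: list[str]) -> int:
    """Run the check and return its exit code.

    Assumption: a non-zero exit code signals drift when --fail-on-drift is set.
    """
    result = subprocess.run(build_compare_command(baseline_path, server_command))
    return result.returncode


command = build_compare_command("./bellwether-baseline.json", ["npx", "your-server"])
print(shlex.join(command))
```

A CI job would call `check_drift(...)` and pass the returned code to `sys.exit`, so that the pipeline step fails whenever the command does.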

Updating Baselines

After an intentional change to the server, review the reported drift, then regenerate and commit the baseline:

# Review changes against the committed baseline
bellwether interview --compare-baseline ./bellwether-baseline.json npx your-server

# Update baseline
bellwether interview --save-baseline npx your-server

# Commit
git add bellwether-baseline.json
git commit -m "Update baseline: added delete_file tool"

Baseline Cloud Sync

Upload baselines to Bellwether Cloud for:

  • Historical tracking
  • Version comparison
  • Verification badges

bellwether login
bellwether link
bellwether upload

What's Captured

Category      Content
Capabilities  Tools, prompts, resources available
Schemas       Parameter types, required fields
Behavior      Observed responses, return values
Errors        Error types, messages, conditions
Security      Vulnerabilities, attack surface
Metadata      Timestamp, model used, personas

Baseline Comparison

When comparing baselines, Bellwether detects:

Change Type      Example
Added            New tool delete_file
Removed          Tool legacy_read no longer exists
Schema change    Parameter path now required
Behavior change  Error message format changed
Security change  New vulnerability detected
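The structural change types (added, removed, schema change) can be illustrated with a small diff over two baseline documents. This is a sketch under the assumption that both files follow the JSON layout shown earlier. It is not Bellwether's actual comparison logic, `diff_tools` is a hypothetical function, and behavior and security changes are out of its scope.

```python
def diff_tools(old: dict, new: dict) -> list[str]:
    """Report added tools, removed tools, and changes to required parameters."""
    old_tools = {t["name"]: t for t in old.get("tools", [])}
    new_tools = {t["name"]: t for t in new.get("tools", [])}
    changes = []

    # Tools present in only one of the two baselines.
    for name in sorted(new_tools.keys() - old_tools.keys()):
        changes.append(f"Added: new tool {name}")
    for name in sorted(old_tools.keys() - new_tools.keys()):
        changes.append(f"Removed: tool {name} no longer exists")

    # Tools present in both: compare the "required" lists of their schemas.
    for name in sorted(old_tools.keys() & new_tools.keys()):
        old_req = set(old_tools[name].get("schema", {}).get("required", []))
        new_req = set(new_tools[name].get("schema", {}).get("required", []))
        for param in sorted(new_req - old_req):
            changes.append(f"Schema change: {name} parameter {param} now required")
        for param in sorted(old_req - new_req):
            changes.append(f"Schema change: {name} parameter {param} no longer required")
    return changes


# Illustrative data only, modeled on the examples in the table above.
old = {"tools": [
    {"name": "read_file", "schema": {"required": []}},
    {"name": "legacy_read", "schema": {"required": []}},
]}
new = {"tools": [
    {"name": "read_file", "schema": {"required": ["path"]}},
    {"name": "delete_file", "schema": {"required": ["path"]}},
]}

changes = diff_tools(old, new)
for line in changes:
    print(line)
```

Keying tools by name keeps the comparison independent of the order in which the server lists them.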

See Also