CI/CD Integration

Integrate Bellwether into your CI/CD pipeline for automated behavioral testing of MCP servers.

Quick Start

The simplest CI/CD setup uses check mode (free, fast, deterministic):

# .github/workflows/bellwether.yml
name: MCP Drift Detection
on: [pull_request]

jobs:
  bellwether:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npx @dotsetlabs/bellwether init --preset ci npx @mcp/your-server
      - run: npx @dotsetlabs/bellwether check --fail-on-drift

Commit bellwether.yaml to your repository so CI always has your configuration. Check mode needs no API keys, costs nothing, and runs in seconds.

Setup

1. Create Configuration

First, initialize a CI-optimized configuration:

bellwether init --preset ci

This creates bellwether.yaml with:

  • Check mode (free, deterministic)
  • JSON reports written to output.dir
  • Fail on drift enabled

2. Create Initial Baseline

Run a check against your server, then save the result as the baseline:

bellwether check npx @mcp/your-server
bellwether baseline save

3. Commit Both Files

git add bellwether.yaml .bellwether/bellwether-baseline.json
git commit -m "Add Bellwether configuration and baseline"
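
Because later CI runs depend on both committed files, a pipeline can verify they exist before invoking Bellwether. The following is a minimal sketch, not part of Bellwether itself: the preflight function name is hypothetical, and the default paths are taken from the git add command above, so adjust them if your baseline lives elsewhere.

```shell
#!/usr/bin/env bash
# Preflight sketch: confirm the config and baseline from step 3 are present.
# "preflight" is an illustrative helper, not a Bellwether command.

preflight() {
  local config="${1:-bellwether.yaml}"
  local baseline="${2:-.bellwether/bellwether-baseline.json}"
  local f missing=0

  for f in "$config" "$baseline"; do
    if [ ! -f "$f" ]; then
      # Report every missing file instead of stopping at the first one.
      echo "missing: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Possible CI usage (commented out so the sketch has no side effects):
# preflight && npx @dotsetlabs/bellwether check --fail-on-drift
```

Failing here gives a clearer message than a runtime error from the check itself.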

GitHub Actions

Basic Drift Detection

name: MCP Drift Detection
on: [pull_request]

jobs:
  bellwether:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Run Bellwether Check
        run: npx @dotsetlabs/bellwether check --fail-on-drift

Ensure bellwether.yaml is committed. Configure your server command and baseline paths in bellwether.yaml:

server:
  command: "npx @mcp/your-server"

baseline:
  comparePath: "./bellwether-baseline.json"
  failOnDrift: true

Using the GitHub Action

For a streamlined experience, use the official GitHub Action:

- name: Detect Behavioral Drift
  uses: dotsetlabs/bellwether@v2.1.3
  with:
    version: '2.1.3'
    server-command: 'npx @mcp/your-server'
    baseline-path: './bellwether-baseline.json'
    fail-on-severity: 'warning'

The action auto-creates bellwether.yaml with --preset ci if not found.

Action Inputs

| Input | Description | Default |
| --- | --- | --- |
| version | Bellwether npm version to install (pin for reproducibility) | action ref (semver only) |
| server-command | MCP server command (required) | - |
| server-args | Arguments to pass to the server | '' |
| config-path | Path to bellwether.yaml | bellwether.yaml |
| baseline-path | Path to baseline file | bellwether-baseline.json |
| save-baseline | Save baseline after check | false |
| output-dir | Directory for output files | . |
| format | Output format: text, json, compact, github, markdown, junit, sarif | github |
| min-severity | Minimum severity to report: none, info, warning, breaking | info |
| fail-on-severity | Failure threshold: none, info, warning, breaking | breaking |
| accept-drift | Accept detected drift and update baseline | false |
| accept-reason | Reason for accepting drift | '' |
| upload-sarif | Upload SARIF to GitHub Code Scanning | true |

Action Outputs

| Output | Description |
| --- | --- |
| result | passed or failed |
| exit-code | Semantic exit code (0-5) |
| severity | Highest severity: none, info, warning, breaking, low_confidence |
| drift-detected | true or false |
| tool-count | Number of tools discovered |
| breaking-count | Number of breaking changes |
| warning-count | Number of warning changes |
| info-count | Number of info changes |
| doc-score | Documentation quality score (0-100) |
| doc-grade | Documentation quality grade (A-F) |
| security-findings | Number of security findings |
| contract-md | Path to CONTRACT.md |
| baseline-file | Path to baseline file |
| sarif-file | Path to SARIF file |
| junit-file | Path to JUnit XML file |
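
These outputs can feed later steps. As a sketch, a follow-up run step could write a short job summary. The environment variable names RESULT, DRIFT_DETECTED, SEVERITY, and BREAKING_COUNT are hypothetical: the workflow would need to map them from the action's outputs (for example from steps.<id>.outputs.drift-detected, where <id> is whatever id you gave the action step). GITHUB_STEP_SUMMARY is the file path GitHub Actions provides for job summaries.

```shell
#!/usr/bin/env bash
# Sketch: turn Bellwether action outputs into a short job summary.
# The input env vars are assumptions; map them in the step's env: block.

write_summary() {
  # Fall back to stdout when run outside GitHub Actions.
  local out="${GITHUB_STEP_SUMMARY:-/dev/stdout}"
  {
    echo "Bellwether result: ${RESULT:-unknown}"
    echo "Drift detected: ${DRIFT_DETECTED:-unknown}"
    echo "Highest severity: ${SEVERITY:-unknown}"
    echo "Breaking changes: ${BREAKING_COUNT:-0}"
  } >> "$out"
}

# In a workflow step you would call: write_summary
```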

Artifacts

The action automatically uploads:

  • bellwether-docs - The generated CONTRACT.md
  • bellwether-baseline - The baseline file (if saved)
  • bellwether-report - The JSON report

Save Baseline with Action

- name: Check and Save Baseline
  uses: dotsetlabs/bellwether@v2.1.3
  with:
    version: '2.1.3'
    server-command: 'npx @mcp/your-server'
    save-baseline: 'true'

- name: Commit Baseline
  uses: stefanzweifel/git-auto-commit-action@v5
  with:
    commit_message: 'Update MCP baseline'
    file_pattern: '.bellwether/bellwether-baseline.json'

Action with Server Environment Variables

If your MCP server needs secrets, use interpolation in your config:

# bellwether.yaml
server:
  command: "npx @mcp/your-server"
  env:
    API_KEY: "${API_KEY}"

# workflow
- name: Test with Secrets
  uses: dotsetlabs/bellwether@v2.1.3
  with:
    version: '2.1.3'
    server-command: 'npx @mcp/your-server'
  env:
    API_KEY: ${{ secrets.API_KEY }}
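
If the secret is missing (for example on pull requests from forks, where repository secrets are not exposed), the interpolated value will be empty. A guard step can fail early with a clear message. This is a minimal sketch: require_env is a hypothetical helper, and API_KEY simply mirrors the example above.

```shell
#!/usr/bin/env bash
# Sketch: fail early when a variable the MCP server needs is not provided.
# "require_env" is an illustrative helper, not a Bellwether feature.

require_env() {
  local name value
  for name in "$@"; do
    # Indirect lookup through eval keeps this portable to plain sh.
    # Only pass trusted, literal variable names.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "error: $name is not set" >&2
      return 1
    fi
  done
}

# Possible usage before the check step:
# require_env API_KEY && npx @dotsetlabs/bellwether check --fail-on-drift
```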

Explore Mode with LLM (Documentation Only)

For generating comprehensive documentation with LLM-powered analysis:

jobs:
  bellwether:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Run Explore Mode
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
        run: |
          npx @dotsetlabs/bellwether explore npx @mcp/your-server

Note: Explore mode generates AGENTS.md documentation but cannot be used for drift detection or baselines. For CI/CD drift detection, use bellwether check.


GitLab CI

bellwether:
  image: node:20
  script:
    - npx @dotsetlabs/bellwether check --fail-on-drift

Workflow Patterns

PR Checks (Check Mode)

Fast, free checks on every pull request:

name: PR Check
on: pull_request

jobs:
  bellwether:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npx @dotsetlabs/bellwether check --fail-on-drift

Nightly Explore Mode Documentation

Generate comprehensive documentation with LLM analysis:

name: Nightly Documentation
on:
  schedule:
    - cron: '0 0 * * *'

jobs:
  bellwether:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Generate documentation with explore mode
      - run: npx @dotsetlabs/bellwether explore
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}

      # Drift detection still uses check mode
      - run: npx @dotsetlabs/bellwether check --fail-on-drift

Update Baseline on Release

name: Update Baseline
on:
  release:
    types: [published]

jobs:
  update:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Generate New Baseline
        run: |
          npx @dotsetlabs/bellwether check
          npx @dotsetlabs/bellwether baseline save --force

      - name: Commit Baseline
        run: |
          git config user.name "GitHub Actions"
          git config user.email "actions@github.com"
          git add .bellwether/bellwether-baseline.json
          git commit -m "Update baseline for ${{ github.event.release.tag_name }}"
          git push

Configuration for CI

# bellwether.yaml
server:
  command: "npx @mcp/your-server"
  timeout: 30000

output:
  dir: "."

baseline:
  comparePath: "./bellwether-baseline.json" # output.dir is "." in this example
  failOnDrift: true
  severity:
    failOnSeverity: breaking # or "warning" for stricter checks

check:
  parallel: true # Faster checks
  parallelWorkers: 4 # Concurrent tool tests
  performanceThreshold: 10 # Flag >10% latency regression
  security:
    enabled: false # Set to true to enable security testing (optional)

logging:
  level: warn
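
To make the performanceThreshold comment concrete, here is a worked example of a percentage-regression test. It assumes regression is measured as the percent increase of current latency over baseline latency. The source does not spell out Bellwether's exact formula, so treat this as an illustration of the idea rather than the tool's implementation. Both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: percent latency regression relative to a baseline (integer math).
# Assumed formula: (current - baseline) * 100 / baseline.

regression_pct() {
  local baseline_ms="$1" current_ms="$2"
  echo $(( (current_ms - baseline_ms) * 100 / baseline_ms ))
}

# Succeeds (returns 0) when the regression is strictly above the threshold.
exceeds_threshold() {
  local baseline_ms="$1" current_ms="$2" threshold_pct="$3"
  [ "$(regression_pct "$baseline_ms" "$current_ms")" -gt "$threshold_pct" ]
}

# Example: 200 ms -> 230 ms is a 15% regression, above a threshold of 10.
# exceeds_threshold 200 230 10 && echo "latency regression flagged"
```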

Multiple Configurations

Create different configs for different environments:

# CI config (committed)
bellwether init --preset ci
mv bellwether.yaml configs/ci.yaml

# Development config (local)
bellwether init --preset local
mv bellwether.yaml configs/dev.yaml

Use in CI:

- run: npx @dotsetlabs/bellwether check --config ./configs/ci.yaml npx @mcp/your-server
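
A wrapper script can choose between the two files automatically. The sketch below relies on the CI environment variable, which GitHub Actions and GitLab CI both set. The pick_config name is hypothetical, and the paths match the mv commands above.

```shell
#!/usr/bin/env bash
# Sketch: select the Bellwether config based on whether we are running in CI.
# "pick_config" is an illustrative helper.

pick_config() {
  if [ -n "${CI:-}" ]; then
    echo "./configs/ci.yaml"
  else
    echo "./configs/dev.yaml"
  fi
}

# Possible usage:
# npx @dotsetlabs/bellwether check --config "$(pick_config)" npx @mcp/your-server
```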

Exit Codes

Bellwether uses granular exit codes for semantic CI/CD integration:

| Code | Meaning | Action |
| --- | --- | --- |
| 0 | No changes detected | Pipeline passes |
| 1 | Info-level changes (non-breaking) | Decide in CI (often treated as pass) |
| 2 | Warning-level changes | Decide in CI (often treated as failure) |
| 3 | Breaking changes detected | Pipeline always fails |
| 4 | Runtime error (connection, config) | Pipeline fails |
| 5 | Low confidence metrics (when check.sampling.failOnLowConfidence is true) | Pipeline fails |

Handling Exit Codes

bellwether check npx @mcp/server
case $? in
  0) echo "Clean - no drift" ;;
  1) echo "Info changes only" ;;
  2) echo "Warnings detected" ;;
  3) echo "BREAKING CHANGES!" && exit 1 ;;
  4) echo "Error occurred" && exit 1 ;;
  5) echo "Low confidence metrics" && exit 1 ;;
esac
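
For CI systems where you want the pass/fail decision in your own script, the same mapping can be wrapped in a reusable function. This is a sketch under the exit-code table above. gate_exit_code is a hypothetical helper, and its threshold names simply mirror the info, warning, and breaking severities.

```shell
#!/usr/bin/env bash
# Sketch: map a Bellwether exit code to pass/fail for a chosen threshold.
# Codes 1-3 are info/warning/breaking; 4 and 5 (runtime error, low
# confidence) are numerically above every threshold, so they always fail.

gate_exit_code() {
  local code="$1"
  local threshold="${2:-breaking}"
  local limit

  case "$threshold" in
    info)     limit=1 ;;
    warning)  limit=2 ;;
    breaking) limit=3 ;;
    *) echo "unknown threshold: $threshold" >&2; return 2 ;;
  esac

  if [ "$code" -ge "$limit" ]; then
    echo "fail"
  else
    echo "pass"
  fi
}

# Possible usage:
# bellwether check npx @mcp/server; status=$?
# [ "$(gate_exit_code "$status" warning)" = "pass" ] || exit 1
```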

Configurable Failure Threshold

# Fail on any drift (including info-level)
bellwether check --fail-on-severity info

# Fail only on breaking changes (ignore warnings)
bellwether check --fail-on-severity breaking

Output Formats for CI

JUnit XML (Jenkins, GitLab CI, CircleCI)

- name: Run Check with JUnit Output
  run: npx @dotsetlabs/bellwether check --format junit > bellwether-results.xml

- name: Publish Test Results
  uses: mikepenz/action-junit-report@v4
  with:
    report_paths: 'bellwether-results.xml'

SARIF (GitHub Code Scanning)

- name: Run Check with SARIF Output
  run: npx @dotsetlabs/bellwether check --format sarif > bellwether.sarif

- name: Upload SARIF to GitHub
  uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: bellwether.sarif

Parallel Testing

Speed up checks for servers with many tools:

- name: Fast Parallel Check
  run: npx @dotsetlabs/bellwether check --fail-on-drift

Configure parallelism in bellwether.yaml:

check:
  parallel: true
  parallelWorkers: 4

Incremental Checking

Only test tools with changed schemas (requires existing baseline):

- name: Incremental Check
  run: npx @dotsetlabs/bellwether check --fail-on-drift

Configure incremental checking in bellwether.yaml:

check:
  incremental: true
  incrementalCacheHours: 168

Security Testing

Enable security vulnerability scanning:

- name: Security Check
  run: npx @dotsetlabs/bellwether check --fail-on-drift

Configure security testing in bellwether.yaml:

check:
  security:
    enabled: true
    categories: [sql_injection, xss, path_traversal, command_injection, ssrf, error_disclosure]

Security testing detects:

  • SQL injection vulnerabilities
  • Path traversal attacks
  • Command injection
  • XSS vulnerabilities
  • SSRF attacks
  • Error disclosure (the error_disclosure category)

Security findings are included in SARIF output for GitHub Code Scanning integration.


Baseline Management

Storing Baselines

Commit baselines to version control:

# Generate baseline
bellwether check
bellwether baseline save

# Commit
git add .bellwether/bellwether-baseline.json
git commit -m "Add behavioral baseline"

Updating Baselines

When you intentionally change your server, you have three options:

Option 1: Accept with baseline accept

The baseline accept command records acceptance metadata for an audit trail:

# Run check to detect drift
bellwether check

# Review the drift, then accept with a reason
bellwether baseline accept --reason "Added new delete_file tool"

# Commit
git add .bellwether/bellwether-baseline.json
git commit -m "Update baseline: added delete_file tool"

For breaking changes, use --force:

bellwether baseline accept --reason "Major API update" --force

Option 2: Accept During Check

Accept drift as part of the check command:

bellwether check --accept-drift --accept-reason "Improved error handling"
git add .bellwether/bellwether-baseline.json
git commit -m "Update baseline: improved error handling"

Option 3: Force Save (No Audit Trail)

For simple cases without acceptance metadata:

bellwether check
bellwether baseline save --force
git add .bellwether/bellwether-baseline.json
git commit -m "Update baseline for new feature X"

Audit Trail

baseline accept records when, why, and who accepted changes. --accept-drift records when and why only. Use --accepted-by with baseline accept if you need attribution.
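
Since the audit trail is only useful when a reason is actually given, a team might wrap the accept step in a small script that refuses an empty reason. The sketch below uses only the baseline accept, --reason, and --force forms shown above. The accept_drift function and the DRY_RUN variable are hypothetical conveniences.

```shell
#!/usr/bin/env bash
# Sketch: wrapper around "bellwether baseline accept" that requires a reason.
# Set DRY_RUN=1 to print the command instead of running it.

accept_drift() {
  local reason="${1:-}"
  local force="${2:-}"

  if [ -z "$reason" ]; then
    echo "error: a reason is required for the audit trail" >&2
    return 1
  fi

  # Build the command as positional parameters so quoting is preserved.
  set -- bellwether baseline accept --reason "$reason"
  if [ "$force" = "--force" ]; then
    set -- "$@" --force
  fi

  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Possible usage:
# accept_drift "Added new delete_file tool"
# accept_drift "Major API update" --force
```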

Comparing Versions

# Compare two baselines
bellwether baseline diff ./baselines/v1.0.0.json ./baselines/v2.0.0.json

Cost Comparison

| Mode | Cost | Speed | Use Case |
| --- | --- | --- | --- |
| Check (default) | Free | Seconds | PR checks, CI gates |
| Explore with Ollama | Free | Minutes | Local dev |
| Explore with OpenAI | ~$0.01-0.10 | Minutes | Comprehensive documentation |

Check Mode Benefits

  • Free - No API costs
  • Fast - Completes in seconds
  • Deterministic - Same results every time
  • No secrets - No API keys to manage

Environment Variables

| Variable | Description | Required |
| --- | --- | --- |
| OPENAI_API_KEY | OpenAI API key | For explore mode with OpenAI |
| ANTHROPIC_API_KEY | Anthropic API key | For explore mode with Anthropic |
| OLLAMA_BASE_URL | Ollama server URL | For explore mode with Ollama (default: http://localhost:11434) |

Troubleshooting

Exit Code 1 (Info Changes)

Non-breaking changes detected:

  • New optional parameters added
  • Description updates
  • Usually safe to proceed

Exit Code 2 (Warnings)

Warning-level changes detected:

  • Check the diff output for what changed
  • May indicate behavioral changes
  • Review before deploying

Exit Code 3 (Breaking Changes)

Breaking changes detected:

  • Tool removed
  • Required parameter added
  • Type changed
  • Update baseline only after careful review

Exit Code 4 (Error)

Configuration or connection error:

  • Verify bellwether.yaml exists
  • Check server command is correct
  • Verify network connectivity

Timeout Errors

Increase timeout in config:

server:
  timeout: 120000 # 2 minutes

Debug Mode

Enable debug logging in bellwether.yaml:

logging:
  level: debug
  verbose: true

Then capture output in CI:

- run: |
    set -o pipefail  # keep the check's exit code when piping through tee
    npx @dotsetlabs/bellwether check 2>&1 | tee bellwether.log

- uses: actions/upload-artifact@v4
  if: failure()
  with:
    name: debug-logs
    path: bellwether.log

See Also