
Check vs Explore

Bellwether offers two commands to fit different workflows and budgets.

For product scope and command-tier guidance, see Core vs Advanced.

Quick Comparison

| | bellwether check | bellwether explore |
|---|---|---|
| Cost | Free | ~$0.01-0.15 per run |
| Speed | Seconds | Minutes |
| LLM Required | No | Yes |
| Deterministic | 100% | No (LLM variation) |
| What it does | Schema validation, drift detection | Behavioral analysis, edge cases, security |
| Output | Docs and/or JSON (default: CONTRACT.md, bellwether-check.json) | Docs and/or JSON (default: AGENTS.md, bellwether-explore.json) |

bellwether check (Free)

The check command validates tool schemas, parameter types, and descriptions without making any LLM calls. It's:

  • Free - No API costs
  • Fast - Completes in seconds
  • Deterministic - The same input always produces the same output
  • CI/CD friendly - No API keys required
  • Output - Generates docs and/or JSON based on output.format (default is both)

# Initialize config and run check
bellwether init npx @mcp/server
bellwether check

# With baseline comparison (configure baseline.comparePath in bellwether.yaml)
bellwether check
bellwether check --fail-on-drift # Override baseline.failOnDrift

What Check Detects

  • Tools added or removed
  • Parameter changes (name, type, required status)
  • Description changes
  • Schema hash changes
  • Tool annotation changes (readOnlyHint, destructiveHint, etc.)
  • Entity title changes (tool, prompt, resource, resource template)
  • Output schema changes
  • Execution/task support changes
  • Server instruction changes
  • Prompt added/removed/modified
  • Resource and resource template changes
  • Performance regression (P50/P95 latency, success rate)
  • Security vulnerability detection (when enabled)
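
To make the first item in the list above concrete, here is a minimal conceptual sketch of how added and removed tools can be found by comparing two lists of tool names. It is not Bellwether's implementation. The function name `diff_tools`, the one-name-per-line file format, and the tool names in the usage note are all assumptions made for illustration.

```shell
#!/bin/sh
# Conceptual sketch only: this is NOT how Bellwether is implemented.
# diff_tools OLD_FILE NEW_FILE
# Each file is assumed to hold one tool name per line (hypothetical format).
diff_tools() {
  old_sorted=$(mktemp)
  new_sorted=$(mktemp)
  # comm requires sorted input; -u also drops duplicate names
  sort -u "$1" > "$old_sorted"
  sort -u "$2" > "$new_sorted"
  # Lines present only in the new list are tools that were added
  comm -13 "$old_sorted" "$new_sorted" | sed 's/^/added: /'
  # Lines present only in the old list are tools that were removed
  comm -23 "$old_sorted" "$new_sorted" | sed 's/^/removed: /'
  rm -f "$old_sorted" "$new_sorted"
}
```

Given an old list containing `read_file` and `write_file` and a new list containing `read_file` and `list_dir`, the function prints `added: list_dir` followed by `removed: write_file`. The other items in the list (parameter, description, and schema changes) would need a comparison of the full tool definitions, which this sketch does not attempt.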

When to Use Check

  • CI/CD pipelines - Fast, free, deterministic
  • PR checks - Quick validation before merge
  • Schema validation - Ensure API contracts are maintained
  • Cost-sensitive environments - No API costs

bellwether explore (Requires LLM)

The explore command uses an LLM to intelligently probe your server from multiple perspectives. It's:

  • Comprehensive - Tests edge cases, error handling, security
  • Multi-persona - Technical writer, security tester, QA engineer, novice user perspectives
  • Rich documentation - Generates detailed AGENTS.md when output.format includes docs

# Initialize with local preset and run explore
bellwether init --preset local npx @mcp/server
bellwether explore

# Configure provider in bellwether.yaml
# llm:
#   provider: openai # or anthropic

What Explore Provides

  • Behavioral observations (how tools actually behave)
  • Edge case testing
  • Error handling patterns
  • Security analysis
  • Performance metrics
  • Limitations discovery
  • Multi-persona perspectives

When to Use Explore

  • Local development - Deep understanding of server behavior
  • Security audits - Comprehensive vulnerability testing
  • Documentation generation - Rich AGENTS.md output
  • Pre-release testing - Thorough validation before deployment

Cost Comparison

Typical costs for exploring a server with 10 tools:

| Provider/Model | Cost | Notes |
|---|---|---|
| bellwether check | $0.00 | Free, deterministic |
| Ollama (qwen3:8b) | $0.00 | Free, requires local setup |
| gpt-4.1-nano | ~$0.01-0.02 | Budget cloud option |
| claude-haiku-4-5 | ~$0.02-0.05 | Recommended |
| gpt-4.1 | ~$0.04-0.08 | Higher quality OpenAI |
| claude-sonnet-4-5 | ~$0.08-0.15 | Premium quality |
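
To turn the per-run figures above into a rough budget, the arithmetic can be sketched as below. The function name `estimate_monthly_cost` and the number of runs per month are hypothetical inputs. The per-run range comes from the table and applies to a server with about 10 tools, so treat the result as an order-of-magnitude estimate only.

```shell
#!/bin/sh
# Rough arithmetic sketch: scales a per-run cost range by a number of runs.
# estimate_monthly_cost LOW_PER_RUN HIGH_PER_RUN RUNS_PER_MONTH
# All three arguments are illustrative inputs, not values Bellwether reports.
estimate_monthly_cost() {
  awk -v low="$1" -v high="$2" -v runs="$3" \
    'BEGIN { printf "$%.2f-$%.2f\n", low * runs, high * runs }'
}
```

For example, `estimate_monthly_cost 0.02 0.05 200` (the claude-haiku-4-5 range at 200 explore runs per month) prints `$4.00-$10.00`.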

Combining Check and Explore

A common pattern is to use both commands:

  1. CI/CD: bellwether check for fast, free drift detection
  2. Local dev: bellwether explore for comprehensive testing and documentation

# CI/CD pipeline (baseline path configured in bellwether.yaml)
bellwether check --fail-on-drift

# Local development
bellwether explore # Uses config from bellwether.yaml
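
In a pipeline, the check step is often wrapped so that a failure produces a clear message. The sketch below assumes that `bellwether check --fail-on-drift` exits with a non-zero status when drift is detected. The flag name suggests this, but the exit-code behavior is not documented on this page. The function name `run_drift_gate` and the `BELLWETHER_BIN` override are hypothetical and exist only to make the wrapper easy to test.

```shell
#!/bin/sh
# Hypothetical CI wrapper around the check command.
# ASSUMPTION: `bellwether check --fail-on-drift` exits non-zero on drift.
# BELLWETHER_BIN is an illustrative override, not a Bellwether setting.
run_drift_gate() {
  if "${BELLWETHER_BIN:-bellwether}" check --fail-on-drift; then
    echo "no drift detected"
    return 0
  else
    status=$?
    # Report on stderr and propagate the original exit status to the CI job
    echo "drift detected or check failed (exit $status)" >&2
    return "$status"
  fi
}
```

Calling `run_drift_gate` as the last step of a CI job would then fail the job whenever the underlying command fails, while still printing a short reason to the job log.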

Configuration

Both commands use settings from bellwether.yaml:

server:
  command: "npx @mcp/server"
  timeout: 30000

# LLM settings (for explore command)
llm:
  provider: anthropic # or openai, ollama
  model: "" # optional, uses provider default (claude-haiku-4-5)

# Explore settings
explore:
  personas:
    - technical_writer
    - security_tester
  maxQuestionsPerTool: 3

# Baseline settings (for check command)
baseline:
  comparePath: "./bellwether-baseline.json"
  failOnDrift: false

Or use presets when initializing:

bellwether init --preset ci       # Optimized for check in CI/CD
bellwether init --preset local    # Explore with Ollama
bellwether init --preset security # Explore with security focus
bellwether init --preset thorough # Currently generates the same preset values as security
