Output Formats
Bellwether writes output in several formats: human-readable documentation, machine-readable reports, baseline snapshots for drift detection, and CI-oriented formats printed to stdout.
Available Formats
| Format | File | Use Case |
|---|---|---|
| Markdown | CONTRACT.md (check) / AGENTS.md (explore) | Human-readable documentation |
| JSON | bellwether-check.json / bellwether-explore.json | Machine-readable data |
| Baseline | Configured by baseline.path / baseline.savePath | Drift detection snapshots |
| JUnit | (stdout) | CI test reporting (bellwether check --format junit) |
| SARIF | (stdout) | GitHub Code Scanning (bellwether check --format sarif) |
| Compact | (stdout) | Single-line summary for log aggregation |
| GitHub | (stdout) | GitHub Actions annotations |
Markdown (Default)
Human-readable documentation. The check command generates CONTRACT.md, and the explore command generates AGENTS.md.
bellwether check npx your-server
# Output: CONTRACT.md
Example Output
# @modelcontextprotocol/server-filesystem
> Generated by Bellwether on 2026-01-12
## Overview
A file management server providing read/write access to the local filesystem.
## Tools
### read_file
Reads the contents of a file from the specified path.
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| path | string | yes | Path to the file |
**Observed Behavior:**
- Returns file contents as UTF-8 text
- Binary files returned as base64
- Maximum file size: 10MB
**Error Handling:**
- `ENOENT`: File not found
- `EACCES`: Permission denied
**Limitations:**
- Cannot read outside root directory
**Security Considerations:**
- Path traversal normalized within root
## Quick Reference
| Tool | Signature |
|------|-----------|
| read_file | `read_file(path)` |
| write_file | `write_file(path, content)` |
## Performance
| Tool | Calls | Avg | P95 | Max | Errors |
|------|-------|-----|-----|-----|--------|
| read_file | 5 | 45ms | 120ms | 150ms | 0% |
| write_file | 3 | 89ms | 200ms | 250ms | 0% |
### Performance Insights
- All tools performing within acceptable limits
## Prompts
### summarize_file
Generates a summary of a file's contents.
| Argument | Type | Required | Description |
|----------|------|----------|-------------|
| path | string | yes | Path to the file to summarize |
| max_length | number | no | Maximum summary length |
**Expected Output:**
Returns a structured summary prompt message suitable for LLM processing.
**Behavior Notes:**
- Works best with text files under 50KB
- Returns error for binary files
The CONTRACT.md output includes:
- Tool Profiles - Behavioral documentation for each tool
- Prompt Profiles - Documentation for prompts (if the server exposes any)
- Quick Reference - Tool signatures for easy lookup
- Performance Metrics - Response times and error rates for each tool
JSON Report
Machine-readable format for programmatic access.
JSON reports are generated when output.format includes json. File names and locations are configurable in bellwether.yaml:
output:
  dir: ".bellwether"
  files:
    checkReport: "bellwether-check.json"
    exploreReport: "bellwether-explore.json"
Each JSON report embeds a $schema pointer for validation. The schemas live in the repo under schemas/ and are published for tooling.
Example Output
The example below is abbreviated for readability. Refer to the schema for the full structure.
{
  "$schema": "https://unpkg.com/@dotsetlabs/bellwether/schemas/bellwether-explore.schema.json",
  "version": 1,
  "timestamp": "2026-01-12T10:30:00Z",
  "server": {
    "name": "@modelcontextprotocol/server-filesystem",
    "version": "0.10.1"
  },
  "tools": [
    {
      "name": "read_file",
      "description": "Reads file contents",
      "schema": {
        "type": "object",
        "properties": {
          "path": { "type": "string" }
        },
        "required": ["path"]
      },
      "interview": {
        "questionsAsked": 3,
        "observations": [...],
        "errors": [...],
        "security": [...]
      }
    }
  ],
  "prompts": [
    {
      "name": "summarize_file",
      "description": "Generate a summary of file contents",
      "arguments": [
        { "name": "path", "required": true },
        { "name": "max_length", "required": false }
      ],
      "interview": {
        "questionsAsked": 2,
        "observations": [...],
        "errors": [...]
      }
    }
  ],
  "scenarioResults": [
    {
      "type": "tool",
      "name": "read_file",
      "description": "Read existing file",
      "passed": true,
      "assertions": [...]
    }
  ],
  "cost": {
    "tokens": 1234,
    "estimatedCost": 0.02
  }
}
The JSON report includes:
- tools and prompts arrays with their respective interview results
- scenarioResults array with custom scenario test results (if scenarios were run)
- semanticInferences with inferred parameter types (check mode)
- schemaEvolution tracking response schema stability (check mode)
- errorAnalysisSummaries with root causes and remediation hints (check mode)
- documentationScore with quality grading and suggestions (check mode)
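As an illustration of programmatic access, the sketch below pulls a few headline numbers out of a parsed report. It relies only on fields visible in the abbreviated example (tools, interview.questionsAsked, scenarioResults, passed); the helper name summarize_report and the inline sample are hypothetical, and the schema remains the authority on the full structure.

```python
def summarize_report(report: dict) -> dict:
    """Collect headline numbers from a parsed Bellwether JSON report."""
    tools = report.get("tools", [])
    scenarios = report.get("scenarioResults", [])
    return {
        "tool_names": [tool["name"] for tool in tools],
        "questions_asked": sum(
            tool.get("interview", {}).get("questionsAsked", 0) for tool in tools
        ),
        "scenarios_passed": sum(1 for s in scenarios if s.get("passed")),
        "scenarios_total": len(scenarios),
    }


# Hand-written sample mirroring the abbreviated example (an assumption, not
# real output). For a real run, parse the report file with json.load instead.
sample = {
    "tools": [{"name": "read_file", "interview": {"questionsAsked": 3}}],
    "scenarioResults": [{"type": "tool", "name": "read_file", "passed": True}],
}

print(summarize_report(sample))
```

Missing keys fall back to empty values, so the same helper tolerates reports where no scenarios were run.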
Baseline Format
Save a baseline for drift detection:
bellwether check npx your-server
bellwether baseline save
# Output (default): .bellwether/bellwether-baseline.json
The baseline captures the server's behavior at a point in time. Later, compare against it:
bellwether check npx your-server
bellwether baseline compare ./bellwether-baseline.json
Baseline format versions follow the CLI package version; baselines are compatible when their major versions match.
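The major-version rule can be checked before a comparison is attempted. The following is a minimal sketch, assuming version strings of the form MAJOR.MINOR.PATCH; the function names are hypothetical and this is not Bellwether's own compatibility check.

```python
def major_version(version: str) -> int:
    """Return the major component of a MAJOR.MINOR.PATCH version string."""
    return int(version.lstrip("v").split(".")[0])


def is_compatible(baseline_version: str, cli_version: str) -> bool:
    """Baselines are treated as compatible when the major versions match."""
    return major_version(baseline_version) == major_version(cli_version)


print(is_compatible("2.1.1", "2.4.0"))  # same major version
print(is_compatible("1.9.0", "2.1.1"))  # different major version
```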
Example Baseline
{
  "version": "2.1.1",
  "metadata": {
    "mode": "check",
    "generatedAt": "2026-01-25T10:30:00Z",
    "cliVersion": "2.1.1",
    "serverCommand": "npx @modelcontextprotocol/server-filesystem /tmp",
    "serverName": "@modelcontextprotocol/server-filesystem",
    "durationMs": 1823,
    "personas": [],
    "model": "none"
  },
  "server": {
    "name": "@modelcontextprotocol/server-filesystem",
    "version": "0.10.1",
    "protocolVersion": "2025-11-25",
    "capabilities": ["tools"]
  },
  "capabilities": {
    "tools": [
      {
        "name": "read_file",
        "description": "Read contents of a file",
        "inputSchema": { "type": "object", "properties": { "path": { "type": "string" } } },
        "schemaHash": "def456..."
      }
    ]
  },
  "interviews": [],
  "toolProfiles": [
    {
      "name": "read_file",
      "description": "Read contents of a file",
      "schemaHash": "def456...",
      "assertions": [],
      "securityNotes": [],
      "limitations": [],
      "behavioralNotes": []
    }
  ],
  "assertions": [],
  "summary": "Filesystem server with 1 tool",
  "hash": "a1b2c3d4e5f6..."
}
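To show how the schemaHash fields lend themselves to drift detection, here is a rough sketch that diffs the toolProfiles of two parsed baselines. It is an illustration only, not the algorithm behind bellwether baseline compare; the function name, the sample hashes, and the list_dir tool are hypothetical.

```python
def schema_hash_changes(old: dict, new: dict) -> dict:
    """Report tools that were added, removed, or whose schemaHash changed."""
    old_hashes = {t["name"]: t.get("schemaHash") for t in old.get("toolProfiles", [])}
    new_hashes = {t["name"]: t.get("schemaHash") for t in new.get("toolProfiles", [])}
    shared = old_hashes.keys() & new_hashes.keys()
    return {
        "added": sorted(new_hashes.keys() - old_hashes.keys()),
        "removed": sorted(old_hashes.keys() - new_hashes.keys()),
        "changed": sorted(n for n in shared if old_hashes[n] != new_hashes[n]),
    }


# Hypothetical baselines reduced to the fields this sketch reads.
old_baseline = {"toolProfiles": [
    {"name": "read_file", "schemaHash": "aaa"},
    {"name": "write_file", "schemaHash": "bbb"},
]}
new_baseline = {"toolProfiles": [
    {"name": "read_file", "schemaHash": "ccc"},
    {"name": "list_dir", "schemaHash": "ddd"},
]}

print(schema_hash_changes(old_baseline, new_baseline))
```

For real drift reports, the baseline compare command remains the supported path; a hash diff like this only signals that something changed, not what changed.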
Multiple Formats
Documentation and JSON reports are written based on output.format (docs, json, or both; legacy alias: agents.md).
Control their locations in bellwether.yaml:
output:
  dir: ".bellwether"   # JSON reports
  docsDir: "."         # CONTRACT.md / AGENTS.md
Custom Output Directory
Set output.dir for JSON files and output.docsDir for markdown docs.
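The sketch below shows where files would be expected to land for a given output section. It assumes the values from the example configuration are the defaults (dir ".bellwether", docsDir ".", and the report file names shown earlier) and that paths are simple joins of directory and file name; the helper expected_outputs is hypothetical and does not reflect Bellwether internals.

```python
from pathlib import PurePosixPath


def expected_outputs(output: dict, mode: str = "check") -> dict:
    """Join output.dir / output.docsDir with the file names for one mode."""
    json_dir = PurePosixPath(output.get("dir", ".bellwether"))  # assumed default
    docs_dir = PurePosixPath(output.get("docsDir", "."))        # assumed default
    files = output.get("files", {})
    if mode == "check":
        report = files.get("checkReport", "bellwether-check.json")
        doc = "CONTRACT.md"
    else:
        report = files.get("exploreReport", "bellwether-explore.json")
        doc = "AGENTS.md"
    return {"report": str(json_dir / report), "docs": str(docs_dir / doc)}


print(expected_outputs({}))
print(expected_outputs({"dir": "reports", "docsDir": "docs"}, mode="explore"))
```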
JUnit Format
Generate JUnit XML (stdout):
bellwether check --format junit > bellwether-results.xml
check --format junit works in both modes:
- Check-only run (no baseline.comparePath): includes tool reliability and security findings from the current run.
- Baseline comparison run (baseline.comparePath set): includes drift-focused test cases (schema drift, performance regression, security deltas, schema evolution, error trends, and documentation score changes).
JUnit output includes test cases for:
- Schema changes (breaking, warning, info)
- Performance regressions
- Security findings
- Documentation quality
- Error pattern changes
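A CI step that needs a quick pass/fail tally can read the redirected file with the standard library. This sketch assumes the conventional JUnit layout (testcase elements with optional failure, error, or skipped children); the sample XML is hand-written for illustration and is not actual Bellwether output.

```python
import xml.etree.ElementTree as ET


def count_junit_results(xml_text: str) -> dict:
    """Tally test cases by outcome in a JUnit XML document."""
    root = ET.fromstring(xml_text)
    counts = {"total": 0, "failed": 0, "skipped": 0}
    for case in root.iter("testcase"):
        counts["total"] += 1
        if case.find("failure") is not None or case.find("error") is not None:
            counts["failed"] += 1
        elif case.find("skipped") is not None:
            counts["skipped"] += 1
    return counts


# Hypothetical sample; test case names are placeholders.
sample_xml = """
<testsuites>
  <testsuite name="example" tests="2" failures="1">
    <testcase name="schema: read_file" classname="example"/>
    <testcase name="performance: write_file" classname="example">
      <failure message="hypothetical regression"/>
    </testcase>
  </testsuite>
</testsuites>
"""

print(count_junit_results(sample_xml))
```

With a real run, the contents of bellwether-results.xml would be passed in place of sample_xml.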
SARIF Format
Generate SARIF (stdout):
bellwether check --format sarif > bellwether.sarif
check --format sarif works in both modes:
- Check-only run: emits reliability/security findings from the current run (for example BWH-REL, BWH-SEC/CWE-based IDs).
- Baseline comparison run: emits drift-specific rules and findings (BWH001 and above).
SARIF rules include:
- BWH001-004: Schema drift rules (breaking, warning, info)
- BWH005-006: Response structure and error pattern drift rules
- BWH007: Security finding rule
- BWH008-009: Response schema evolution rules
- BWH010-011: Error trend rules
- BWH012-013: Performance regression and confidence rules
- BWH014-015: Documentation quality rules
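Findings in a SARIF file can be grouped by rule ID with a few lines of code. The sketch assumes the standard SARIF 2.1.0 layout (runs, each with a results array whose entries carry a ruleId); the sample document, its levels, and its messages are hand-written placeholders rather than real Bellwether output.

```python
from collections import Counter


def count_by_rule(sarif: dict) -> Counter:
    """Count SARIF results per ruleId across all runs."""
    counts = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            counts[result.get("ruleId", "unknown")] += 1
    return counts


# Hypothetical sample reduced to the fields this sketch reads.
sample_sarif = {
    "version": "2.1.0",
    "runs": [
        {
            "results": [
                {"ruleId": "BWH001", "level": "error", "message": {"text": "placeholder"}},
                {"ruleId": "BWH001", "level": "error", "message": {"text": "placeholder"}},
                {"ruleId": "BWH007", "level": "warning", "message": {"text": "placeholder"}},
            ]
        }
    ],
}

print(count_by_rule(sample_sarif))
```

For a real run, the dictionary would come from parsing bellwether.sarif with json.load.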
Report Sections
Check mode reports include these sections:
Performance Metrics
─── Performance ───
Tool: read_file
P50: 45ms | P95: 120ms | Success: 98%
Confidence: high (15 samples, CV: 0.28)
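For readers unfamiliar with the figures in this section, the sketch below computes P50, P95, and a coefficient of variation (CV, standard deviation divided by mean) from latency samples. The nearest-rank percentile method and the use of the population standard deviation are assumptions; this page does not specify how Bellwether derives these values or how it maps them to a confidence level.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: smallest value covering pct percent of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def latency_stats(samples_ms):
    """Return P50, P95, and coefficient of variation for latency samples."""
    mean = statistics.mean(samples_ms)
    return {
        "p50": percentile(samples_ms, 50),
        "p95": percentile(samples_ms, 95),
        "cv": statistics.pstdev(samples_ms) / mean,
    }


# Hypothetical latency samples in milliseconds.
stats = latency_stats([10, 20, 30, 40, 50])
print(stats)
```

A lower CV means the samples cluster tightly around the mean, which is why it is a natural input to a confidence rating.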
Security Findings (with check.security.enabled)
─── Security ───
Tool: execute_query
Category: sql_injection
Risk: critical
Finding: Tool accepted SQL injection payload
Documentation Quality
─── Documentation Quality ───
Score: 85/100 (B)
Coverage: 100% | Quality: 80% | Params: 85%
Issues: 2 (1 warning, 1 info)
Error Analysis
─── Error Summary ───
Category: NotFound (ENOENT)
Root Cause: File does not exist
Remediation: Verify path before calling
See Also
- CI/CD Integration - Using outputs in pipelines
- check - Output options
- Drift Detection - Comparing outputs