
Output Formats

Bellwether generates output in multiple formats to support different use cases: human-readable documentation, machine-readable reports, and CI integrations.

Available Formats

| Format | File | Use Case |
|--------|------|----------|
| Markdown | CONTRACT.md (check) / AGENTS.md (explore) | Human-readable documentation |
| JSON | bellwether-check.json / bellwether-explore.json | Machine-readable data |
| Baseline | Configured by baseline.path / baseline.savePath | Drift detection snapshots |
| JUnit | (stdout) | CI test reporting (bellwether check --format junit) |
| SARIF | (stdout) | GitHub Code Scanning (bellwether check --format sarif) |
| Compact | (stdout) | Single-line summary for log aggregation |
| GitHub | (stdout) | GitHub Actions annotations |

Markdown (Default)

Human-readable documentation. The check command generates CONTRACT.md; the explore command generates AGENTS.md.

bellwether check npx your-server
# Output: CONTRACT.md

Example Output

# @modelcontextprotocol/server-filesystem

> Generated by Bellwether on 2026-01-12

## Overview

A file management server providing read/write access to the local filesystem.

## Tools

### read_file

Reads the contents of a file from the specified path.

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| path | string | yes | Path to the file |

**Observed Behavior:**
- Returns file contents as UTF-8 text
- Binary files returned as base64
- Maximum file size: 10MB

**Error Handling:**
- `ENOENT`: File not found
- `EACCES`: Permission denied

**Limitations:**
- Cannot read outside root directory

**Security Considerations:**
- Path traversal normalized within root

## Quick Reference

| Tool | Signature |
|------|-----------|
| read_file | `read_file(path)` |
| write_file | `write_file(path, content)` |

## Performance

| Tool | Calls | Avg | P95 | Max | Errors |
|------|-------|-----|-----|-----|--------|
| read_file | 5 | 45ms | 120ms | 150ms | 0% |
| write_file | 3 | 89ms | 200ms | 250ms | 0% |

### Performance Insights
- All tools performing within acceptable limits

## Prompts

### summarize_file

Generates a summary of a file's contents.

| Argument | Type | Required | Description |
|----------|------|----------|-------------|
| path | string | yes | Path to the file to summarize |
| max_length | number | no | Maximum summary length |

**Expected Output:**
Returns a structured summary prompt message suitable for LLM processing.

**Behavior Notes:**
- Works best with text files under 50KB
- Returns error for binary files

The CONTRACT.md output includes:

  • Tool Profiles - Behavioral documentation for each tool
  • Prompt Profiles - Documentation for prompts (if the server exposes any)
  • Quick Reference - Tool signatures for easy lookup
  • Performance Metrics - Response times and error rates for each tool

JSON Report

Machine-readable format for programmatic access.

JSON reports are generated when output.format includes json. File names and locations are configurable in bellwether.yaml:

output:
  dir: ".bellwether"
  files:
    checkReport: "bellwether-check.json"
    exploreReport: "bellwether-explore.json"

Each JSON report embeds a $schema pointer for validation. The schemas live in the repo under schemas/ and are published for tooling.

Example Output

The example below is abbreviated for readability. Refer to the schema for the full structure.

{
  "$schema": "https://unpkg.com/@dotsetlabs/bellwether/schemas/bellwether-explore.schema.json",
  "version": 1,
  "timestamp": "2026-01-12T10:30:00Z",
  "server": {
    "name": "@modelcontextprotocol/server-filesystem",
    "version": "0.10.1"
  },
  "tools": [
    {
      "name": "read_file",
      "description": "Reads file contents",
      "schema": {
        "type": "object",
        "properties": {
          "path": { "type": "string" }
        },
        "required": ["path"]
      },
      "interview": {
        "questionsAsked": 3,
        "observations": [...],
        "errors": [...],
        "security": [...]
      }
    }
  ],
  "prompts": [
    {
      "name": "summarize_file",
      "description": "Generate a summary of file contents",
      "arguments": [
        { "name": "path", "required": true },
        { "name": "max_length", "required": false }
      ],
      "interview": {
        "questionsAsked": 2,
        "observations": [...],
        "errors": [...]
      }
    }
  ],
  "scenarioResults": [
    {
      "type": "tool",
      "name": "read_file",
      "description": "Read existing file",
      "passed": true,
      "assertions": [...]
    }
  ],
  "cost": {
    "tokens": 1234,
    "estimatedCost": 0.02
  }
}

The JSON report includes:

  • tools and prompts arrays with their respective interview results
  • scenarioResults array with custom scenario test results (if scenarios were run)
  • semanticInferences with inferred parameter types (check mode)
  • schemaEvolution tracking response schema stability (check mode)
  • errorAnalysisSummaries with root causes and remediation hints (check mode)
  • documentationScore with quality grading and suggestions (check mode)
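As a rough illustration of consuming a report programmatically, the Python sketch below summarizes one using only fields visible in the abbreviated example above (tools[].name, tools[].interview.questionsAsked, scenarioResults[].passed). The function name summarize_report and the inline sample are hypothetical; the published schema remains the authoritative description of the structure.

```python
import json


def summarize_report(report: dict) -> dict:
    """Summarize a report dict using field names from the abbreviated example."""
    tools = report.get("tools", [])
    # scenarioResults is only present if scenarios were run, so default to empty.
    scenarios = report.get("scenarioResults", [])
    return {
        "tool_names": [t["name"] for t in tools],
        "questions_asked": sum(
            t.get("interview", {}).get("questionsAsked", 0) for t in tools
        ),
        "failed_scenarios": [
            s["name"] for s in scenarios if not s.get("passed", False)
        ],
    }


# Hypothetical inline sample; in practice, read the report file with json.load().
sample = json.loads("""
{
  "tools": [{"name": "read_file", "interview": {"questionsAsked": 3}}],
  "scenarioResults": [{"type": "tool", "name": "read_file", "passed": true}]
}
""")
summary = summarize_report(sample)
print(summary)
```

Using .get() with defaults keeps the sketch tolerant of optional sections, which matters because several arrays appear only in check mode or only when scenarios were run.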

Baseline Format

Save a baseline for drift detection:

bellwether check npx your-server
bellwether baseline save
# Output (default): .bellwether/bellwether-baseline.json

The baseline captures the server's behavior at a point in time. Later, compare against it:

bellwether check npx your-server
bellwether baseline compare ./bellwether-baseline.json

Baseline format versions follow the CLI package version; baselines are compatible when their major versions match.
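The compatibility rule can be sketched in a few lines of Python. This is only an illustration of "major versions match", assuming plain dotted version strings such as 2.1.1; the helper names major_version and baselines_compatible are hypothetical and are not part of the Bellwether CLI.

```python
def major_version(version: str) -> int:
    """Return the leading (major) component of a dotted version like '2.1.1'."""
    return int(version.split(".")[0])


def baselines_compatible(baseline_version: str, cli_version: str) -> bool:
    """Compatible when the major versions match, per the rule stated above."""
    return major_version(baseline_version) == major_version(cli_version)


print(baselines_compatible("2.1.1", "2.4.0"))  # True: both are major version 2
print(baselines_compatible("1.9.0", "2.1.1"))  # False: major versions differ
```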

Example Baseline

{
  "version": "2.1.1",
  "metadata": {
    "mode": "check",
    "generatedAt": "2026-01-25T10:30:00Z",
    "cliVersion": "2.1.1",
    "serverCommand": "npx @modelcontextprotocol/server-filesystem /tmp",
    "serverName": "@modelcontextprotocol/server-filesystem",
    "durationMs": 1823,
    "personas": [],
    "model": "none"
  },
  "server": {
    "name": "@modelcontextprotocol/server-filesystem",
    "version": "0.10.1",
    "protocolVersion": "2025-11-25",
    "capabilities": ["tools"]
  },
  "capabilities": {
    "tools": [
      {
        "name": "read_file",
        "description": "Read contents of a file",
        "inputSchema": { "type": "object", "properties": { "path": { "type": "string" } } },
        "schemaHash": "def456..."
      }
    ]
  },
  "interviews": [],
  "toolProfiles": [
    {
      "name": "read_file",
      "description": "Read contents of a file",
      "schemaHash": "def456...",
      "assertions": [],
      "securityNotes": [],
      "limitations": [],
      "behavioralNotes": []
    }
  ],
  "assertions": [],
  "summary": "Filesystem server with 1 tool",
  "hash": "a1b2c3d4e5f6..."
}
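To give a feel for how a snapshot like this supports drift detection, the Python sketch below compares the per-tool schemaHash values of two baselines. It is not Bellwether's actual comparison logic (bellwether baseline compare handles that); the helper names and the hash values in the sample data are invented for illustration, and only the capabilities.tools[].name and schemaHash fields come from the example above.

```python
def schema_hashes(baseline: dict) -> dict:
    """Map tool name -> schemaHash from a baseline's capabilities.tools list."""
    tools = baseline.get("capabilities", {}).get("tools", [])
    return {t["name"]: t.get("schemaHash") for t in tools}


def diff_tools(old: dict, new: dict) -> dict:
    """Classify tools as added, removed, or changed by comparing schema hashes."""
    before, after = schema_hashes(old), schema_hashes(new)
    return {
        "added": sorted(after.keys() - before.keys()),
        "removed": sorted(before.keys() - after.keys()),
        "changed": sorted(
            name for name in before.keys() & after.keys()
            if before[name] != after[name]
        ),
    }


# Hypothetical baselines with made-up hash values, for illustration only.
old = {"capabilities": {"tools": [
    {"name": "read_file", "schemaHash": "aaa"},
    {"name": "write_file", "schemaHash": "bbb"},
]}}
new = {"capabilities": {"tools": [
    {"name": "read_file", "schemaHash": "ccc"},
    {"name": "list_dir", "schemaHash": "ddd"},
]}}
result = diff_tools(old, new)
print(result)
```

Comparing hashes rather than full schemas keeps the check cheap: a tool whose hash is unchanged can be skipped without inspecting its inputSchema.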

Multiple Formats

Documentation and JSON reports are written based on output.format (docs, json, or both; legacy alias: agents.md).
Control their locations in bellwether.yaml:

output:
  dir: ".bellwether"  # JSON reports
  docsDir: "."        # CONTRACT.md / AGENTS.md

Custom Output Directory

Set output.dir for JSON files and output.docsDir for markdown docs.

JUnit Format

Generate JUnit XML (stdout):

bellwether check --format junit > bellwether-results.xml

check --format junit works in both modes:

  • Check-only run (no baseline.comparePath): includes tool reliability and security findings from the current run.
  • Baseline comparison run (baseline.comparePath set): includes drift-focused test cases (schema drift, performance regression, security deltas, schema evolution, error trends, and documentation score changes).

JUnit output includes test cases for:

  • Schema changes (breaking, warning, info)
  • Performance regressions
  • Security findings
  • Documentation quality
  • Error pattern changes
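If a CI step needs to inspect the redirected XML itself, a small Python sketch along these lines can tally results. It assumes the output follows the conventional JUnit layout (testcase elements with optional failure, error, or skipped children), which this page does not spell out; the suite and test names in the sample are invented.

```python
import xml.etree.ElementTree as ET


def count_junit_results(xml_text: str) -> dict:
    """Count total, failed, and skipped test cases in a JUnit XML document."""
    root = ET.fromstring(xml_text)
    counts = {"total": 0, "failed": 0, "skipped": 0}
    # iter() finds <testcase> elements whether the root is <testsuites> or <testsuite>.
    for case in root.iter("testcase"):
        counts["total"] += 1
        if case.find("failure") is not None or case.find("error") is not None:
            counts["failed"] += 1
        elif case.find("skipped") is not None:
            counts["skipped"] += 1
    return counts


# Hypothetical sample in conventional JUnit layout; all names are invented.
sample_xml = """<testsuites>
  <testsuite name="example-suite" tests="2" failures="1">
    <testcase name="example passing case" classname="example"/>
    <testcase name="example failing case" classname="example">
      <failure message="example failure"/>
    </testcase>
  </testsuite>
</testsuites>"""
junit_counts = count_junit_results(sample_xml)
print(junit_counts)
```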

SARIF Format

Generate SARIF (stdout):

bellwether check --format sarif > bellwether.sarif

check --format sarif works in both modes:

  • Check-only run: emits reliability/security findings from the current run (for example BWH-REL, BWH-SEC/CWE-based IDs).
  • Baseline comparison run: emits drift-specific rules and findings (BWH001 and above).

SARIF rules include:

  • BWH001-004: Schema drift rules (breaking, warning, info)
  • BWH005-006: Response structure and error pattern drift rules
  • BWH007: Security finding rule
  • BWH008-009: Response schema evolution rules
  • BWH010-011: Error trend rules
  • BWH012-013: Performance regression and confidence rules
  • BWH014-015: Documentation quality rules
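For post-processing, the Python sketch below groups findings by rule ID. It relies only on the standard SARIF 2.1.0 layout (runs[].results[] with ruleId and level); the sample results and the levels attached to each rule are invented, since this page does not state which level each rule emits.

```python
from collections import Counter


def count_by_rule(sarif: dict) -> Counter:
    """Count SARIF results per ruleId across all runs."""
    counts = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            counts[result.get("ruleId", "<no ruleId>")] += 1
    return counts


def has_errors(sarif: dict) -> bool:
    """True if any result is explicitly marked with level 'error'."""
    return any(
        result.get("level") == "error"
        for run in sarif.get("runs", [])
        for result in run.get("results", [])
    )


# Hypothetical SARIF document; rule/level pairings are assumptions for illustration.
sample_sarif = {
    "version": "2.1.0",
    "runs": [{"results": [
        {"ruleId": "BWH001", "level": "error", "message": {"text": "example"}},
        {"ruleId": "BWH001", "level": "error", "message": {"text": "example"}},
        {"ruleId": "BWH007", "level": "warning", "message": {"text": "example"}},
    ]}],
}
rule_counts = count_by_rule(sample_sarif)
print(dict(rule_counts))
print(has_errors(sample_sarif))
```

In practice the dictionary would come from json.load() on the file written by the redirect above, and has_errors could decide a pipeline's exit code.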

Report Sections

Check mode reports include these sections:

Performance Metrics

─── Performance ───
Tool: read_file
P50: 45ms | P95: 120ms | Success: 98%
Confidence: high (15 samples, CV: 0.28)
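The figures in this block are standard latency statistics. As a sketch of how such numbers can be derived from raw samples, the Python below computes nearest-rank percentiles and a coefficient of variation (CV, standard deviation divided by mean). The percentile method, the use of the sample standard deviation, and the latency values are all assumptions; this page does not document how Bellwether computes them or how CV maps to a confidence label.

```python
import math
import statistics
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def coefficient_of_variation(samples: Sequence[float]) -> float:
    """Sample standard deviation divided by the mean (dimensionless)."""
    return statistics.stdev(samples) / statistics.mean(samples)


# Invented latency samples in milliseconds, for illustration only.
latencies_ms = [40, 42, 45, 45, 47, 50, 120]
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
cv = coefficient_of_variation(latencies_ms)
print(p50, p95)  # 45 120
print(round(cv, 2))
```

A lower CV means the samples cluster tightly around the mean, which is why it is a natural input to a confidence rating alongside the sample count.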

Security Findings (with check.security.enabled)

─── Security ───
Tool: execute_query
Category: sql_injection
Risk: critical
Finding: Tool accepted SQL injection payload

Documentation Quality

─── Documentation Quality ───
Score: 85/100 (B)
Coverage: 100% | Quality: 80% | Params: 85%
Issues: 2 (1 warning, 1 info)

Error Analysis

─── Error Summary ───
Category: NotFound (ENOENT)
Root Cause: File does not exist
Remediation: Verify path before calling
