MCP Test Harness
Automated security, conformance, and operational testing for MCP servers. Targets MCP specification version 2025-11-25. Produces a 0–100 quality score.
Source on GitHub: https://github.com/Datasculptures/mcp-test-harness
Test Suites
Conformance
- Initialization lifecycle and handshake
- Protocol version negotiation
- Capability declaration vs. actual behaviour
- Tool listing, invocation, and input schema validation
- JSON-RPC error handling (parse errors, invalid requests, method not found)
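As a rough illustration of the error-handling checks listed above, the sketch below validates a single error response against the standard JSON-RPC 2.0 error codes. The function name `check_error_response` and its return shape are hypothetical and are not taken from the harness's own code.

```python
import json

# Error codes defined by the JSON-RPC 2.0 specification.
PARSE_ERROR = -32700
INVALID_REQUEST = -32600
METHOD_NOT_FOUND = -32601


def check_error_response(raw_line: str, expected_code: int) -> tuple[bool, str]:
    """Return (passed, reason) for one response line (hypothetical helper)."""
    try:
        msg = json.loads(raw_line)
    except json.JSONDecodeError as exc:
        return False, f"response is not valid JSON: {exc.msg}"
    if not isinstance(msg, dict) or msg.get("jsonrpc") != "2.0":
        return False, "missing or wrong 'jsonrpc' version field"
    error = msg.get("error")
    if not isinstance(error, dict):
        return False, "expected an 'error' object"
    if "result" in msg:
        return False, "response carries both 'result' and 'error'"
    code = error.get("code")
    if code != expected_code:
        return False, f"expected code {expected_code}, got {code}"
    if not isinstance(error.get("message"), str):
        return False, "'error.message' must be a string"
    return True, "ok"
```

Calling a method the server does not implement and passing the reply to `check_error_response(reply, METHOD_NOT_FOUND)` gives one way to express the "method not found" case.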
Security
- Shell injection probing via ; | ` $() &&
- Environment variable expansion: ${HOME}, $PATH, %USERPROFILE%
- Path traversal via file-reading tools (../, URL-encoded, null byte)
- Resource URI scope and scheme injection (javascript:, http://)
- Input validation bypass (wrong types, missing args, null bytes, oversized input, deeply nested objects, large arrays)
- Information disclosure in error responses (stack traces, file paths, credential references)
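Two of the security probes above can be sketched in a few lines: generating path-traversal payload variants and scanning an error message for leaked details. Everything here is an assumption made for illustration, including the helper names `traversal_variants` and `find_disclosures` and the regular expressions; the harness's real payloads and patterns may differ.

```python
import re
from urllib.parse import quote


def traversal_variants(target: str = "etc/passwd", depth: int = 3) -> list[str]:
    """Build plain, URL-encoded, and null-byte traversal payloads (illustrative)."""
    plain = "../" * depth + target
    return [
        plain,
        quote(plain, safe=""),  # "/" becomes %2F; "." is left unencoded
        plain + "\x00.txt",     # null byte followed by a harmless-looking suffix
    ]


# Illustrative patterns only; a real scanner would need a broader set.
DISCLOSURE_PATTERNS = {
    "stack_trace": re.compile(
        r"Traceback \(most recent call last\)|^\s+at [\w.$<>]+\(", re.M
    ),
    "file_path": re.compile(r"(?:/[\w.-]+){2,}|[A-Za-z]:\\[\w\\.-]+"),
    "credential_reference": re.compile(
        r"\b(password|passwd|secret|api[_-]?key|token)\b", re.I
    ),
}


def find_disclosures(error_message: str) -> list[str]:
    """Return the sorted names of the patterns that match the message."""
    return sorted(
        name
        for name, pattern in DISCLOSURE_PATTERNS.items()
        if pattern.search(error_message)
    )
```

A probe would send each variant as a tool argument and then run `find_disclosures` over any error text that comes back.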
Operational
- Partial JSON and empty line recovery
- Binary garbage resilience
- Rapid sequential request handling
- Large payload handling (1 MB)
- Response time baseline
- Unknown notification tolerance
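The partial-JSON and empty-line recovery check can be pictured as follows: send a truncated message, then a blank line, then a valid `ping`, and see whether the server still answers the ping. This is a minimal sketch under that assumption; `build_recovery_probe` and `server_recovered` are hypothetical names, and the harness may structure the test differently.

```python
import json


def build_recovery_probe(probe_id: int = 99) -> list[bytes]:
    """Return the lines to write to the server's stdin, in order."""
    ping = {"jsonrpc": "2.0", "id": probe_id, "method": "ping"}
    return [
        b'{"jsonrpc": "2.0", "id": 1, "meth\n',  # truncated JSON
        b"\n",                                    # empty line
        json.dumps(ping).encode("utf-8") + b"\n", # valid request
    ]


def server_recovered(response_lines: list[bytes], probe_id: int = 99) -> bool:
    """True if any response line is a successful reply to the ping."""
    for raw in response_lines:
        try:
            msg = json.loads(raw)
        except ValueError:  # covers invalid JSON and invalid UTF-8
            continue
        if isinstance(msg, dict) and msg.get("id") == probe_id and "result" in msg:
            return True
    return False
```

A parse-error reply to the truncated line is acceptable here; the probe only requires that the later ping still receives a result.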
Install
pip install mcp-test-harness
Usage
# Run all suites against a STDIO server
mcp-test-harness stdio -- python -m my_server
# Select suites
mcp-test-harness stdio --suite conformance -- python -m my_server
mcp-test-harness stdio --suite security -- python -m my_server
mcp-test-harness stdio --suite operational -- python -m my_server
# Output formats
mcp-test-harness stdio --format json -o report.json -- python -m my_server
mcp-test-harness stdio --format markdown -o report.md -- python -m my_server
# Print a shields.io badge URL for the score
mcp-test-harness stdio --badge -- python -m my_server
# Use a config file (auto-discovered as .mcp-test-harness.yaml)
mcp-test-harness stdio --config .mcp-test-harness.yaml
Options
- --suite conformance|security|operational|all: suites to run (default: all, repeatable)
- --format text|json|markdown: output format (default: text)
- --output, -o: write report to file
- --config, -c: config file path (default: .mcp-test-harness.yaml)
- --verbose, -v: show details for all tests, including passing ones
- --timeout: per-request timeout in seconds (default: 10)
- --badge: print a shields.io badge URL for the score
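For readers curious what a badge URL for the score might look like, here is a sketch that follows the shields.io static badge path format (`/badge/<label>-<message>-<color>`). The label text, the colour thresholds, and the function name `badge_url` are assumptions; the URL that --badge actually prints may differ.

```python
from urllib.parse import quote


def badge_url(score: int, label: str = "MCP score") -> str:
    """Build a shields.io static badge URL for a 0-100 score (illustrative)."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")

    # Colour thresholds are hypothetical, not the harness's grading bands.
    if score >= 90:
        color = "brightgreen"
    elif score >= 70:
        color = "yellow"
    else:
        color = "red"

    def escape(text: str) -> str:
        # shields.io treats "-" and "_" specially, so literal ones are doubled;
        # everything else is percent-encoded.
        return quote(text.replace("-", "--").replace("_", "__"), safe="")

    message = f"{score}/100"
    return f"https://img.shields.io/badge/{escape(label)}-{escape(message)}-{color}"
```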
Sample Output
MCP Test Harness v0.3.0
Spec: 2025-11-25
Server: my-server v1.0.0
Transport: stdio
=== initialization ===
✓ initialize_response_valid
✓ version_negotiation
...
8 passed
=== security ===
✓ shell_metachar_semicolon__echo
⚠ null_byte_argument — null byte in string argument not rejected
...
5 passed, 2 warnings
=== operational ===
✓ partial_json
✓ rapid_sequential_requests
...
7 passed
────────────────────────────────────────
Total: 52 passed, 0 failed, 2 warnings, 2 skipped
Score: 96/100 (A — Excellent)
Status symbols:
- ✓ pass: test passed
- ✗ fail: test failed
- ⚠ warn: advisory; the server did not reject invalid input
- ○ skip: skipped (capability not declared)
- ! error: harness error; the test could not run
Exit codes:
- Exit 0: no failures (warnings are acceptable)
- Exit 1: one or more failures detected
- Exit 2: harness error (config or startup failure)
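The exit codes make the harness easy to wire into a CI step. The wrapper below is a sketch that uses only the CLI flags shown under Usage; the helper names `run_harness` and `describe_exit` and the report file name are illustrative choices.

```python
import subprocess

# Meanings taken from the exit code list above.
EXIT_MEANINGS = {
    0: "no failures",
    1: "one or more failures detected",
    2: "harness error",
}


def describe_exit(code: int) -> str:
    """Translate a harness exit code into a short message."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_harness(server_cmd: list[str], report_path: str = "report.json") -> int:
    """Run all suites against a STDIO server and return the exit code."""
    cmd = [
        "mcp-test-harness", "stdio",
        "--format", "json", "-o", report_path,
        "--", *server_cmd,
    ]
    # check=False: a non-zero exit is an expected outcome, not an exception.
    return subprocess.run(cmd, check=False).returncode
```

A CI script could call `run_harness(["python", "-m", "my_server"])`, print `describe_exit(code)`, and then exit with the same code so that failures block the build while warnings do not.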
Architecture
Client Layer
Two client modes are provided: raw STDIO (direct JSON-RPC) and SDK-based. The protocol module handles framing and JSON-RPC message construction.
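To make the raw STDIO mode concrete, the sketch below builds an `initialize` request and frames it the way the MCP stdio transport expects: one JSON-RPC message per line, terminated by a newline. The function names and the `clientInfo` values are placeholders and do not come from the harness's protocol module.

```python
import json


def frame(message: dict) -> bytes:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    # json.dumps escapes newlines inside strings, so the output stays on one line.
    return json.dumps(message, separators=(",", ":")).encode("utf-8") + b"\n"


def initialize_request(request_id: int = 1) -> dict:
    """Build an MCP initialize request (clientInfo values are placeholders)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-11-25",
            "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.0.0"},
        },
    }


# Sent by the client after a successful initialize response; as a
# notification it carries no "id" and expects no reply.
INITIALIZED_NOTIFICATION = {"jsonrpc": "2.0", "method": "notifications/initialized"}
```

Writing `frame(initialize_request())` to the server's stdin and reading one line back from stdout is enough to begin the handshake by hand.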
Test Suites
Eight suites across three categories: conformance (initialization, capabilities, tools, errors), security (injection, validation, path_traversal, resource_scope), and operational. Each runs independently.
Report Layer
Three output formats: text (human-readable), JSON (machine-readable), and Markdown (GitHub-compatible tables). A scoring module produces a 0–100 grade per run.
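As an illustration of the Markdown format, the sketch below renders a list of results as a GitHub-compatible table. The record fields (`name`, `status`, `detail`) and the function name `to_markdown` are assumptions; the harness's real report schema is not documented here.

```python
def to_markdown(results: list[dict]) -> str:
    """Render test results as a GitHub-compatible Markdown table (illustrative)."""
    rows = ["| Test | Status | Detail |", "| --- | --- | --- |"]
    for result in results:
        # Escape pipes so free-form detail text cannot break the table layout.
        detail = result.get("detail", "").replace("|", "\\|")
        rows.append(f"| {result['name']} | {result['status']} | {detail} |")
    return "\n".join(rows)
```

For example, `to_markdown([{"name": "partial_json", "status": "pass"}])` yields a header, a separator row, and one result row with an empty detail cell.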
Test Fixtures
Includes a good_server.py (compliant MCP server) and bad_server.py (intentionally non-compliant) for harness self-testing.