The map and the trace: shipping mcp-tape and mcp-replay

3 min read · By Craig Merry
claude-code mcp workflow-atlas mcp-tape mcp-replay observability rediscovery-tax open-core

Earlier today I wrote about workflow-atlas — a Claude Code plugin that scaffolds a graphical map of every package and flow in your repo, so cold Claude sessions land oriented instead of burning 80k tokens on grep-tours. The framing: text-only context (CLAUDE.md, MEMORY, spec/plan docs) is incomplete; the structural-graph dimension is missing. Add the atlas, save the rediscovery tax, ship.

Then I closed the laptop and realized I’d only solved half the problem.

The atlas is the map. It says: here’s what Claude should know about your repo before it starts working. But there’s a complementary question I’d been ignoring: what did Claude actually do once it started? Which MCP tools did it call? With what arguments? In what order? How long did each take? Where did things go sideways?

Anthropic ships MCP Inspector for interactive testing of MCP servers — it’s great for “does this server respond to my requests correctly?” But it doesn’t capture or replay real sessions. There’s no time-travel debugger for what your agent actually did across a 2-hour Claude Code session. So I built one this evening.

mcp-tape

mcp-tape is a stdio proxy. You wrap any MCP server with it:

npm i -g mcp-tape@alpha
mcp-tape -- npx -y @modelcontextprotocol/server-filesystem /your/path
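In a Claude Code setup, wrapping means swapping the command in your MCP config so the proxy sits between client and server. A sketch assuming the standard mcpServers shape (server name and path are placeholders; check mcp-tape's docs for its exact flag handling):

```json
{
  "mcpServers": {
    "filesystem": {
      "command": "mcp-tape",
      "args": ["--", "npx", "-y", "@modelcontextprotocol/server-filesystem", "/your/path"]
    }
  }
}
```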

Every JSON-RPC message in either direction gets logged to a .jsonl trace file. The proxy is byte-for-byte transparent — the client and server can’t tell it’s there. Default redaction catches common secret shapes (AWS keys, sk- API tokens, GitHub tokens, JWTs, and fields named password / token / authorization) so a trace you accidentally share doesn’t immediately leak credentials.
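The redaction pass can be pictured as a recursive walk over each JSON-RPC payload that scrubs both sensitive field names and secret-shaped strings. A minimal sketch, where the patterns and the placeholder string are illustrative assumptions, not mcp-tape's actual rules:

```typescript
// Illustrative secret-shape patterns (not mcp-tape's actual list).
const SECRET_PATTERNS: RegExp[] = [
  /AKIA[0-9A-Z]{16}/g,                                   // AWS access key IDs
  /sk-[A-Za-z0-9_-]{20,}/g,                              // "sk-" style API tokens
  /gh[pousr]_[A-Za-z0-9]{36,}/g,                         // GitHub tokens
  /eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+/g,  // JWTs
];
const SECRET_FIELDS = new Set(["password", "token", "authorization"]);

// Recursively redact: sensitive keys are blanked outright,
// strings are scrubbed pattern-by-pattern, containers recurse.
function redact(value: unknown, key?: string): unknown {
  if (key && SECRET_FIELDS.has(key.toLowerCase())) return "[REDACTED]";
  if (typeof value === "string") {
    return SECRET_PATTERNS.reduce((s, re) => s.replace(re, "[REDACTED]"), value);
  }
  if (Array.isArray(value)) return value.map((v) => redact(v));
  if (value && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value as Record<string, unknown>).map(([k, v]) => [k, redact(v, k)])
    );
  }
  return value;
}
```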

mcp-replay

mcpreplay.dev is the renderer. Drop a trace URL into ?trace=<url> and you get four views: Timeline (every JSON-RPC message in order, click to expand the raw payload), Tools (aggregate per-tool latency and error counts), Calls (each tools/call paired with its response and latency), Raw (line-numbered JSONL pager). No build step, no framework — static HTML on Cloudflare Pages.

The trace format is open and stable at v1. Anything can produce or consume it.
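Because the trace is plain JSONL of JSON-RPC messages, a consumer is a few lines in any language. Here is a toy sketch of the aggregation behind a Tools-style view, pairing each tools/call request with its response by JSON-RPC id; the ts field and line shape are assumptions for illustration, since the post doesn't spell out the v1 schema:

```typescript
// Assumed trace-line shape: a capture timestamp plus the raw JSON-RPC message.
interface TraceLine {
  ts: number; // capture time in ms (illustrative field name)
  msg: { id?: number; method?: string; params?: any; result?: any };
}

// Group call latencies by tool name: requests open a pending entry
// keyed by JSON-RPC id; the matching response closes it.
function perToolLatency(lines: TraceLine[]): Map<string, number[]> {
  const pending = new Map<number, { tool: string; ts: number }>();
  const out = new Map<string, number[]>();
  for (const { ts, msg } of lines) {
    if (msg.method === "tools/call" && msg.id !== undefined) {
      pending.set(msg.id, { tool: msg.params?.name ?? "unknown", ts });
    } else if (msg.id !== undefined && pending.has(msg.id)) {
      const { tool, ts: start } = pending.get(msg.id)!;
      pending.delete(msg.id);
      if (!out.has(tool)) out.set(tool, []);
      out.get(tool)!.push(ts - start);
    }
  }
  return out;
}
```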

Two halves of the same thesis

These aren’t unrelated products. They share one thesis: externalize agent cognition into durable, human-readable artifacts the next session can read.

  • workflow-atlas externalizes the context: what packages exist, what flows pass through them, what’s planned, what’s in flight. The “what Claude should know before it starts” view.
  • mcp-replay externalizes the execution: what tools actually got called, with what arguments, in what order, with what latency. The “what Claude actually did once it started” view.

In one sentence: workflow-atlas is the map; mcp-replay is the trace of where you walked.

Open-core, on purpose

workflow-atlas went private earlier today as I started planning a multi-user, teams-focused enterprise version. mcp-tape and mcp-replay ship MIT, open source, free, public — they're the layer anyone using MCP can adopt without ever knowing workflow-atlas exists. The open, stable format spec is the moat against re-implementation, not the code.

Long-term, the same trace data atom powers everything. One dev uses mcp-tape to debug a session. Their team aggregates traces across members for shared visibility. Their org rolls those up for compliance and audit. Same format, same visual language, no migration cost up the funnel. It's the Grafana / MongoDB / Elastic open-core shape.

Try it

npm i -g mcp-tape@alpha
mcp-tape -- npx -y @modelcontextprotocol/server-filesystem /your/path
# open the resulting .jsonl at mcpreplay.dev/?trace=<url>

Bug reports and trace donations welcome — the more real-world MCP traffic this gets exercised against, the faster the rough edges get found. Source: github.com/craigm26/mcp-tape and github.com/craigm26/mcp-replay.