diff --git a/.changeset/mcp-tool-annotations.md b/.changeset/mcp-tool-annotations.md
new file mode 100644
index 00000000..70b00ca9
--- /dev/null
+++ b/.changeset/mcp-tool-annotations.md
@@ -0,0 +1,5 @@
+---
+"@stainless-code/codemap": patch
+---
+
+MCP `tools/list` and HTTP `GET /tools` expose advisory `readOnlyHint`, `destructiveHint`, and `idempotentHint` per tool so clients can gate auto-approval. Apply tools carry `destructiveHint`; read-only query tools carry `readOnlyHint`.
diff --git a/docs/agents.md b/docs/agents.md
index 32655d71..04d09a75 100644
--- a/docs/agents.md
+++ b/docs/agents.md
@@ -133,6 +133,8 @@ See [architecture.md § Session lifecycle wiring](./architecture.md#session-life
 **`context.index_freshness`** — session bootstrap includes index-level freshness metadata: `commit_drift` (HEAD ≠ `last_indexed_commit`), `pending_sync` (watcher debounce queue or in-flight reindex), optional disk-drift counts when watch is off, and a single `warning` string when agents should pause or re-index. **`context.start_here`** (non-compact) adds inline index summary, intent-ranked `query_recipe` cards, and top hub files with export signatures (adaptive caps by file count; optional MCP/HTTP `include_snippets` for one-line previews). Debug intent biases `sample_markers` toward FIXME/TODO. **MCP:** array-shaped JSON tools (`query`, …) keep row payloads verbatim and append a second `content` block prefixed `@codemap/index_freshness`; object-shaped tools merge `index_freshness` inline. **HTTP:** `POST /tool/*` adds `X-Codemap-Pending-Sync`, `X-Codemap-Commit-Drift`, and `X-Codemap-Warning` headers without changing JSON bodies; **`GET /health`** includes full cheap `index_freshness` when the DB is readable. Complements per-file `validate` / snippet `stale`. See [architecture.md § Context wiring](./architecture.md#context-wiring).
+**MCP ToolAnnotations** — `tools/list` (and HTTP `GET /tools`) expose advisory `readOnlyHint` / `destructiveHint` / `idempotentHint` per tool so clients can gate auto-approval. Read paths (`query`, `show`, `audit`, …) → `readOnlyHint: true`; disk-write apply tools → `destructiveHint: true` (writes still require `yes: true`); index mutators (`save_baseline`, `drop_baseline`, `ingest_coverage`) → `readOnlyHint: false` without `destructiveHint`. + **`CODEMAP_MCP_TOOLS`** — comma-separated snake_case MCP tool names. When set, only listed tools register (stderr lists the active set). Unknown names are ignored with a warning. Unset = all tools (default). **`query_batch`** registers only when listed or when unset (eval ablation). Example: `CODEMAP_MCP_TOOLS=query,context,show codemap mcp --no-watch` diff --git a/docs/architecture.md b/docs/architecture.md index 12decb92..d9b98544 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -194,9 +194,9 @@ Three **mutually exclusive** CLI entry shapes; all converge on `applyDiffPayload **Tool / resource handlers (transport-agnostic):** **`src/application/tool-handlers.ts`** + **`src/application/resource-handlers.ts`** — pure functions that take the args object an MCP tool / resource URI accepts and return a discriminated **`ToolResult`** (`{ok: true, format: 'json'|'sarif'|'annotations'|'mermaid'|'diff'|'diff-json', payload}` / `{ok: false, error}`) or a **`ResourcePayload`** (`{mimeType, text}`). MCP and HTTP both wrap the same handlers — MCP translates to `{content: [{type: "text", text}]}`, HTTP translates to `(status, body)` with the right `Content-Type`. Engine layer untouched; transport changes don't ripple into the SQL. -**MCP wiring:** **`src/cli/cmd-mcp.ts`** (argv — `--watch` / `--no-watch` / `--debounce` + `--help`; bootstrap absorbs `--root`/`--config`) + **`src/application/mcp-server.ts`** (transport — tool / resource registry, SDK glue). 
Mirrors the `cmd-audit.ts ↔ audit-engine.ts` seam — CLI parses + lifecycle; engine owns the SDK. **`runMcpServer`** bootstraps codemap once at server boot (config + resolver + DB access become module-level state), instantiates `McpServer` from **`@modelcontextprotocol/sdk`**, attaches a **`StdioServerTransport`**, and resolves on client disconnect via **`src/application/session-lifecycle.ts`** (`createStdioDisconnectMonitor` — stdin EOF, stdout EPIPE, parent-PID poll — plus SDK `transport.onclose` and SIGINT/SIGTERM). With `--watch`, **`createManagedWatchSession`** holds one client for the stdio session and **`forceStop`** drains the watcher on exit. Tool handlers reuse the existing engine entry-points: **`query`** / **`query_recipe`** call **`executeQuery`** in **`src/application/query-engine.ts`** (same `[...rows]` / `{count}` / `{group_by, groups}` envelope `--json` would print) unless **`baseline`** is set — then **`compareQueryBaseline`** in **`src/application/query-baseline.ts`** (incompatible with non-`json` **`format`** / **`group_by`**); **`ingest_coverage`** calls **`runIngestCoverageOnDb`** in **`src/application/ingest-coverage-run.ts`** (CLI twin: `codemap ingest-coverage --json`); **`query_batch`** loops per statement via **`handleQueryBatch`** → **`executeQuery`** (batch-wide defaults + per-item overrides; items are `string | {sql, summary?, changed_since?, group_by?}`); **`audit`** runs `resolveAuditBaselines` + `runAudit` from PR #33 unchanged; **`context`** / **`validate`** call `buildContextEnvelope` / `computeValidateRows` from **`src/application/context-engine.ts`** + **`src/application/validate-engine.ts`** (lifted out of `src/cli/cmd-*.ts` in PR #41 — see § Tool / resource handlers above). **`save_baseline`** is one polymorphic tool (`{name, sql? | recipe?}`) with a runtime exclusivity check — mirrors the CLI's single `--save-baseline=` verb. 
**Tool naming**: snake_case throughout — Codemap convention matching the patterns in MCP spec examples and reference servers (GitHub MCP, Cursor built-ins); the spec itself doesn't mandate it. CLI stays kebab — translation lives at the MCP-arg layer. **Resources** split by freshness contract: `codemap://schema`, `codemap://skill`, `codemap://rule`, and `codemap://mcp-instructions` use **lazy memoisation** — first `read_resource` populates a per-server-instance cache; constant for the server-process lifetime so eager-vs-lazy produce identical observable behavior. `codemap://recipes`, `codemap://recipes/{id}`, `codemap://files/{+path}`, and `codemap://symbols/{name}` are **live read-per-call** (no cache) so inline recency fields and index mutations under `--watch` don't freeze at first-read. `codemap://schema` queries `sqlite_schema` live (on first read, then cached); `codemap://skill` / `codemap://rule` / `codemap://mcp-instructions` call `assembleAgentContent(kind)` from `application/agent-content.ts`, which concatenates section files under `templates/agent-content//` and dispatches `*.gen.md` files through `RENDERERS` (live recipe catalog, live `createTables()` DDL) — see [agents.md § Section assembler](./agents.md#section-assembler-and-genmd). Output shape: each tool returns the JSON payload its CLI counterpart would print (`query batch`, `trace`, `explore`, `node`, `file`, `schema`, `context --include-snippets`, `ingest-coverage`); MCP wraps via `content: [{type: "text", text: JSON.stringify(payload)}]`. `--changed-since` git lookups are memoised per `(root, ref)` pair across batch items so a `query_batch` of N items sharing the same ref does one git invocation, not N. Per-statement errors in `query_batch` are isolated — failed statements return `{error}` in their slot while siblings still execute. 
+**MCP wiring:** **`src/cli/cmd-mcp.ts`** (argv — `--watch` / `--no-watch` / `--debounce` + `--help`; bootstrap absorbs `--root`/`--config`) + **`src/application/mcp-server.ts`** (transport — tool / resource registry, SDK glue). Mirrors the `cmd-audit.ts ↔ audit-engine.ts` seam — CLI parses + lifecycle; engine owns the SDK. **`runMcpServer`** bootstraps codemap once at server boot (config + resolver + DB access become module-level state), instantiates `McpServer` from **`@modelcontextprotocol/sdk`**, attaches a **`StdioServerTransport`**, and resolves on client disconnect via **`src/application/session-lifecycle.ts`** (`createStdioDisconnectMonitor` — stdin EOF, stdout EPIPE, parent-PID poll — plus SDK `transport.onclose` and SIGINT/SIGTERM). With `--watch`, **`createManagedWatchSession`** holds one client for the stdio session and **`forceStop`** drains the watcher on exit. Tool handlers reuse the existing engine entry-points: **`query`** / **`query_recipe`** call **`executeQuery`** in **`src/application/query-engine.ts`** (same `[...rows]` / `{count}` / `{group_by, groups}` envelope `--json` would print) unless **`baseline`** is set — then **`compareQueryBaseline`** in **`src/application/query-baseline.ts`** (incompatible with non-`json` **`format`** / **`group_by`**); **`ingest_coverage`** calls **`runIngestCoverageOnDb`** in **`src/application/ingest-coverage-run.ts`** (CLI twin: `codemap ingest-coverage --json`); **`query_batch`** loops per statement via **`handleQueryBatch`** → **`executeQuery`** (batch-wide defaults + per-item overrides; items are `string | {sql, summary?, changed_since?, group_by?}`); **`audit`** runs `resolveAuditBaselines` + `runAudit` from PR #33 unchanged; **`context`** / **`validate`** call `buildContextEnvelope` / `computeValidateRows` from **`src/application/context-engine.ts`** + **`src/application/validate-engine.ts`** (lifted out of `src/cli/cmd-*.ts` in PR #41 — see § Tool / resource handlers above). 
**`save_baseline`** is one polymorphic tool (`{name, sql? | recipe?}`) with a runtime exclusivity check — mirrors the CLI's single `--save-baseline=` verb. **Tool naming**: snake_case throughout — Codemap convention matching the patterns in MCP spec examples and reference servers (GitHub MCP, Cursor built-ins); the spec itself doesn't mandate it. CLI stays kebab — translation lives at the MCP-arg layer. **Resources** split by freshness contract: `codemap://schema`, `codemap://skill`, `codemap://rule`, and `codemap://mcp-instructions` use **lazy memoisation** — first `read_resource` populates a per-server-instance cache; constant for the server-process lifetime so eager-vs-lazy produce identical observable behavior. `codemap://recipes`, `codemap://recipes/{id}`, `codemap://files/{+path}`, and `codemap://symbols/{name}` are **live read-per-call** (no cache) so inline recency fields and index mutations under `--watch` don't freeze at first-read. `codemap://schema` queries `sqlite_schema` live (on first read, then cached); `codemap://skill` / `codemap://rule` / `codemap://mcp-instructions` call `assembleAgentContent(kind)` from `application/agent-content.ts`, which concatenates section files under `templates/agent-content//` and dispatches `*.gen.md` files through `RENDERERS` (live recipe catalog, live `createTables()` DDL) — see [agents.md § Section assembler](./agents.md#section-assembler-and-genmd). Output shape: each tool returns the JSON payload its CLI counterpart would print (`query batch`, `trace`, `explore`, `node`, `file`, `schema`, `context --include-snippets`, `ingest-coverage`); MCP wraps via `content: [{type: "text", text: JSON.stringify(payload)}]`. 
**`tools/list` ToolAnnotations** — advisory `readOnlyHint` / `destructiveHint` / `idempotentHint` per tool from **`src/application/mcp-tool-annotations.ts`** (central map beside **`mcp-tool-allowlist.ts`**); read paths (`query`, `show`, `audit`, …) → `readOnlyHint: true`; disk-write apply tools → `destructiveHint: true` (writes still require `yes: true`); index user-data mutators (`save_baseline`, `drop_baseline`, `ingest_coverage`) → `readOnlyHint: false` without `destructiveHint`. Omitted when an older `@modelcontextprotocol/sdk` lacks annotation fields (M.6 guard). `--changed-since` git lookups are memoised per `(root, ref)` pair across batch items so a `query_batch` of N items sharing the same ref does one git invocation, not N. Per-statement errors in `query_batch` are isolated — failed statements return `{error}` in their slot while siblings still execute. -**HTTP wiring:** **`src/cli/cmd-serve.ts`** (argv — `--host` / `--port` / `--token`; bootstrap absorbs `--root`/`--config`) + **`src/application/http-server.ts`** (transport — bare `node:http`; routes `POST /tool/{name}` to `tool-handlers`, `GET /resources/{encoded-uri}` to `resource-handlers`, plus `GET /health` / `GET /tools` / `GET /resources`). Default bind **`127.0.0.1:7878`** (loopback only — refuse `0.0.0.0` unless explicitly opted in via `--host 0.0.0.0`). Optional **`--token `** requires `Authorization: Bearer ` on every request; `GET /health` is auth-exempt so liveness probes work without leaking the token. **CSRF + DNS-rebinding guard** (`csrfCheck`) runs before every route — rejects `Sec-Fetch-Site: cross-site` / `same-site` (modern-browser CSRF), any present `Origin` header (including the opaque string `null`; older-browser CSRF fallback), and `Host` header mismatch on loopback bind (DNS rebinding). Non-browser clients (curl, fetch from Node, MCP hosts, CI scripts) don't send those headers and pass through. The guard runs even on `/health` so a malicious local webpage can't probe for liveness. 
Output shape: HTTP returns each tool's native JSON payload directly (NOT MCP's `{content: [...]}` wrapper — HTTP doesn't need that transport artifact); `query` / `query_recipe` match `codemap query --json` row arrays (or `{count}` / `{group_by,groups}` when `summary` / `group_by` is set, or baseline diff when `baseline` is set — incompatible with non-`json` `format` / `group_by`; save/list/drop remain separate tools); other tools match their CLI `--json` envelopes; `format: "sarif"` payloads ship as `application/sarif+json`, `format: "annotations"` / `"mermaid"` / `"diff"` as `text/plain; charset=utf-8`, `format: "diff-json"` as `application/json; charset=utf-8`, JSON otherwise. Per-request DB lifecycle: open / `PRAGMA query_only = 1` / close per call (SQLite reader concurrency); 1 MiB request-body cap rejects trivial DoS. SIGINT / SIGTERM → graceful drain via `server.close()`. Every response carries **`X-Codemap-Version: `** so consumers can pin / detect upgrades. +**HTTP wiring:** **`src/cli/cmd-serve.ts`** (argv — `--host` / `--port` / `--token`; bootstrap absorbs `--root`/`--config`) + **`src/application/http-server.ts`** (transport — bare `node:http`; routes `POST /tool/{name}` to `tool-handlers`, `GET /resources/{encoded-uri}` to `resource-handlers`, plus `GET /health` / `GET /tools` / `GET /resources`). Default bind **`127.0.0.1:7878`** (loopback only — refuse `0.0.0.0` unless explicitly opted in via `--host 0.0.0.0`). Optional **`--token `** requires `Authorization: Bearer ` on every request; `GET /health` is auth-exempt so liveness probes work without leaking the token. **CSRF + DNS-rebinding guard** (`csrfCheck`) runs before every route — rejects `Sec-Fetch-Site: cross-site` / `same-site` (modern-browser CSRF), any present `Origin` header (including the opaque string `null`; older-browser CSRF fallback), and `Host` header mismatch on loopback bind (DNS rebinding). 
Non-browser clients (curl, fetch from Node, MCP hosts, CI scripts) don't send those headers and pass through. The guard runs even on `/health` so a malicious local webpage can't probe for liveness. Output shape: HTTP returns each tool's native JSON payload directly (NOT MCP's `{content: [...]}` wrapper — HTTP doesn't need that transport artifact); `query` / `query_recipe` match `codemap query --json` row arrays (or `{count}` / `{group_by,groups}` when `summary` / `group_by` is set, or baseline diff when `baseline` is set — incompatible with non-`json` `format` / `group_by`; save/list/drop remain separate tools); other tools match their CLI `--json` envelopes; `format: "sarif"` payloads ship as `application/sarif+json`, `format: "annotations"` / `"mermaid"` / `"diff"` as `text/plain; charset=utf-8`, `format: "diff-json"` as `application/json; charset=utf-8`, JSON otherwise. Per-request DB lifecycle: open / `PRAGMA query_only = 1` / close per call (SQLite reader concurrency); 1 MiB request-body cap rejects trivial DoS. **`GET /tools`** returns the same advisory hint fields as MCP `tools/list` (`readOnlyHint` / `destructiveHint` / `idempotentHint` per entry via **`buildHttpToolCatalogEntry`**). SIGINT / SIGTERM → graceful drain via `server.close()`. Every response carries **`X-Codemap-Version: `** so consumers can pin / detect upgrades. **Watch wiring:** **`src/cli/cmd-watch.ts`** (argv — `--debounce ` / `--quiet`; bootstrap absorbs `--root`/`--config`) + **`src/application/watcher.ts`** (engine — pure debouncer + glob filter + injectable backend; production wires [chokidar v5](https://github.com/paulmillr/chokidar) selected via the 6-watcher audit in PR #46 — pure JS, runs identically on Bun + Node, ~30M repos use it). 
On every change/add/unlink event chokidar emits, the engine filters via `shouldIndexPath` (same indexed extensions as the indexer + project-local recipes; skips `node_modules` / `.git` / `dist`), debounces with a sliding window (default 250 ms), then calls `createReindexOnChange` which opens a DB, runs `runCodemapIndex({mode: 'files', files: [...changed]})`, closes the DB, and logs `reindex N file(s) in Mms` to stderr unless `--quiet`. SIGINT / SIGTERM drains pending edits via `flushNow()` before the watcher closes. **Default-ON for `mcp` / `serve` since 2026-05:** both transports embed the watcher via **`createManagedWatchSession`** in **`session-lifecycle.ts`** — MCP holds one client for the stdio session; HTTP acquires per request (excluding `/health`) and stops the watcher after the last client plus a 5s release grace (not an MCP idle shutdown). Opt out with `--no-watch`, `CODEMAP_WATCH=0`, or `CODEMAP_NO_WATCH=1`. **`src/application/watch-policy.ts`** disables the watcher on WSL2 Windows drive mounts (`/mnt/*`) unless `CODEMAP_FORCE_WATCH=1`; stderr points at `codemap agents init --git-hooks` for git-triggered freshness. Standalone `codemap watch` runs the watcher decoupled from a transport for users wiring it next to a separate MCP / HTTP process. **Audit prelude optimization:** module-level `watchActive` flag; `handleAudit` skips its incremental-index prelude when active (and marks the close as readonly to avoid a wasted checkpoint). Explicit `no_index: false` still forces the prelude. 
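Reviewer note: a minimal sketch of how a client might consume the hint fields described in the architecture notes above. The `ToolHints` shape, the `TOOL_HINTS` table, and the `toCatalogEntry` / `canAutoApprove` helpers are hypothetical names chosen for illustration; they are not the contents of `src/application/mcp-tool-annotations.ts`. The values are copied from the v1 annotation matrix in the removed plan, for a subset of tools only.

```typescript
// Hypothetical sketch: the names and shapes here are assumptions, not the
// shipped mcp-tool-annotations.ts module.
interface ToolHints {
  readOnlyHint: boolean;
  destructiveHint: boolean;
  idempotentHint: boolean;
}

// Subset of the tool set, enough to show all three classes.
type ToolName =
  | "query"
  | "show"
  | "audit"
  | "save_baseline"
  | "drop_baseline"
  | "ingest_coverage"
  | "apply";

// Read paths share one hint triple.
const READ_ONLY: ToolHints = {
  readOnlyHint: true,
  destructiveHint: false,
  idempotentHint: true,
};

// Values follow the v1 annotation matrix from the removed plan.
const TOOL_HINTS: Record<ToolName, ToolHints> = {
  query: READ_ONLY,
  show: READ_ONLY,
  audit: READ_ONLY,
  // Index mutators: not read-only, but no source-file writes.
  save_baseline: { readOnlyHint: false, destructiveHint: false, idempotentHint: true },
  drop_baseline: { readOnlyHint: false, destructiveHint: false, idempotentHint: true },
  ingest_coverage: { readOnlyHint: false, destructiveHint: false, idempotentHint: false },
  // Disk-write apply tool.
  apply: { readOnlyHint: false, destructiveHint: true, idempotentHint: false },
};

// Flattens the hints next to the tool name, as a catalog entry might look.
function toCatalogEntry(name: ToolName): { name: ToolName } & ToolHints {
  return { name, ...TOOL_HINTS[name] };
}

// Example client-side policy: auto-approve only tools that declare themselves
// read-only and non-destructive. The hints are advisory, so this is a UX
// decision on the client, not a security boundary.
function canAutoApprove(name: ToolName): boolean {
  const hints = TOOL_HINTS[name];
  return hints.readOnlyHint && !hints.destructiveHint;
}
```

Under this policy the index mutators (`save_baseline`, `drop_baseline`, `ingest_coverage`) still prompt the user even though they lack `destructiveHint`, and the server-side `yes: true` gate on apply tools is unaffected either way.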
diff --git a/docs/plans/apply-write-safety.md b/docs/plans/apply-write-safety.md
index 1469645b..7b84c4c7 100644
--- a/docs/plans/apply-write-safety.md
+++ b/docs/plans/apply-write-safety.md
@@ -89,7 +89,7 @@ bun src/index.ts apply --dry-run
 - [ ] Edit file on disk after dry-run passes but before `--yes` apply → `file content changed`, zero files modified
 - [ ] Mixed-EOL fixture file → `mixed line endings`, no write
 - [ ] Happy path unchanged: valid apply still returns `applied: true`
-- [ ] `destructiveHint` apply tools document recheck behavior in tool description (synergy with [mcp-tool-annotations](./mcp-tool-annotations.md))
+- [ ] `destructiveHint` apply tools document recheck behavior in tool description (synergy with [architecture.md § MCP wiring](../architecture.md) ToolAnnotations)
 ---
diff --git a/docs/plans/mcp-tool-annotations.md b/docs/plans/mcp-tool-annotations.md
deleted file mode 100644
index 84218e1a..00000000
--- a/docs/plans/mcp-tool-annotations.md
+++ /dev/null
@@ -1,108 +0,0 @@
-# MCP tool annotation hints — plan
-
-> **Status:** open · **Priority:** P2 · **Effort:** S (~3–5 days)
->
-> **Motivator:** MCP clients use tool metadata (`readOnlyHint`, `destructiveHint`, `idempotentHint`) to gate auto-approval, sandboxing, and UI affordances. Codemap registers tools with descriptions only — write tools (`apply`, `apply_rows`, `apply_diff_input`, `save_baseline`, `drop_baseline`, `ingest_coverage`) are indistinguishable from read tools at the protocol layer.
->
-> **Roadmap:** [§ Agent session & warm-path economics](../roadmap.md#agent-session--warm-path-economics)
-
----
-
-## Agent start here
-
-Smallest slice: add **`mcp-tool-annotations.ts`** map + thread into **one** `registerTool` call; snapshot-test `tools/list`; then roll through remaining tools and HTTP `GET /tools`. No handler behavior changes.
-
-### Key touchpoints
-
-| File | What to read |
-| --- | --- |
-| [`src/application/mcp-server.ts`](../../src/application/mcp-server.ts) | All `server.registerTool(…)` registrations |
-| [`src/application/mcp-tool-allowlist.ts`](../../src/application/mcp-tool-allowlist.ts) | `CODEMAP_MCP_TOOLS` subset — annotations must apply when allowlist active |
-| [`src/application/tool-handlers.ts`](../../src/application/tool-handlers.ts) | HTTP tool catalog if separate from MCP list |
-| [`src/application/http-server.ts`](../../src/application/http-server.ts) | `GET /tools` JSON shape |
-| [`src/application/mcp-tool-allowlist.test.ts`](../../src/application/mcp-tool-allowlist.test.ts) | Allowlist regression patterns |
-
-### Architecture
-
-```text
-MCP_TOOL_ANNOTATIONS: Record
- → mcp-server registerTool({ …, annotations })
- → tools/list response (advisory hints only)
- → GET /tools parity (architecture § HTTP wiring)
-```
-
-Apply tools already gated by `--yes` / `yes: true` in handlers — annotations are client-side UX only.
-
-### Tracer bullet (slice 1)
-
-Map + annotations on `apply` and `query` only; one test asserting `destructiveHint` / `readOnlyHint`; expand matrix to full tool set in slice 2.
-
-### Out of scope (v1)
-
-Changing apply confirmation gates; runtime enforcement of hints inside handlers.
-
----
-
-## Pre-locked decisions
-
-| # | Decision | Source |
-| --- | --- | --- |
-| M.1 | **Annotations on `tools/list` only** — no behavior change to handlers; hints are advisory per [MCP tool annotations](https://modelcontextprotocol.io/specification/2025-06-18/server/tools#toolannotations). | MCP spec |
-| M.2 | **Central map in code** — `MCP_TOOL_ANNOTATIONS: Record` beside `mcp-tool-allowlist.ts`; single source for all `registerTool` calls. | DRY; allowlist already names the tool set |
-| M.3 | **Conservative destructive set** — `apply`, `apply_rows`, `apply_diff_input` → `destructiveHint: true`, `readOnlyHint: false`. All query/show/trace/audit read paths → `readOnlyHint: true`. | Matches shipped apply confirmation gates (`--yes`) |
-| M.4 | **`save_baseline` / `drop_baseline` / `ingest_coverage`** — `readOnlyHint: false` (mutate index user-data tables); not `destructiveHint` (no source-file writes). | Distinct from apply |
-| M.5 | **HTTP `/tools` catalog parity** — expose the same hint fields on `GET /tools` JSON so non-MCP consumers benefit. | [architecture § HTTP](../architecture.md) |
-| M.6 | **SDK capability guard** — if installed `@modelcontextprotocol/sdk` version lacks annotation fields, skip silently (no runtime error). | Packaging compatibility |
-
----
-
-## Annotation matrix (v1)
-
-| Tool | readOnlyHint | destructiveHint | idempotentHint | Notes |
-| --- | --- | --- | --- | --- |
-| `query`, `query_recipe`, `query_batch` | true | false | true | |
-| `context`, `validate`, `show`, `snippet` | true | false | true | |
-| `impact`, `affected`, `trace`, `explore`, `node` | true | false | true | |
-| `audit`, `list_baselines` | true | false | true | |
-| `save_baseline` | false | false | true | Rewrites one baseline row |
-| `drop_baseline` | false | false | true | |
-| `ingest_coverage` | false | false | false | Replaces coverage slice |
-| `apply`, `apply_rows`, `apply_diff_input` | false | true | false | Disk writes |
-
----
-
-## Implementation steps
-
-1. Add `mcp-tool-annotations.ts` with typed map + `getMcpToolAnnotations(name)`.
-2. Thread `annotations` into each `server.registerTool(…, { description, inputSchema, annotations }, …)` in `mcp-server.ts`.
-3. Extend `GET /tools` handler in `http-server.ts` to include annotation fields.
-4. Unit test — snapshot `tools/list` shape (or mock server list) asserts apply tools carry `destructiveHint: true`.
-5. Update `templates/agent-content/` MCP section one line: "write tools carry `destructiveHint`".
-6. Optional: one sentence in [agents.md § MCP](../agents.md) for consumer-facing discoverability.
-
----
-
-### Verification
-
-```bash
-bun test src/application/mcp-tool-allowlist.test.ts src/application/mcp-server.test.ts
-# Snapshot tools/list — apply → destructiveHint, query → readOnlyHint
-curl -s http://127.0.0.1:7878/tools # after codemap serve --token …
-```
-
----
-
-## Acceptance
-
-- [ ] MCP `tools/list` returns annotations on all registered tools
-- [ ] `apply` / `apply_rows` / `apply_diff_input` have `destructiveHint: true`
-- [ ] `query` / `show` / `audit` have `readOnlyHint: true`
-- [ ] `GET /tools` includes same hints
-- [ ] `CODEMAP_MCP_TOOLS` allowlist subset still registers annotations correctly
-
----
-
-## Dependencies
-
-- Shipped: `mcp-server.ts`, `mcp-tool-allowlist.ts`, apply confirmation gates
-- Independent of other backlog items
diff --git a/docs/roadmap.md b/docs/roadmap.md
index 8b8f7c30..7a7185e1 100644
--- a/docs/roadmap.md
+++ b/docs/roadmap.md
@@ -81,7 +81,6 @@ Long-running MCP / HTTP sessions dominate agent workflows; one-shot CLI keeps th
 - [x] **`agents init --targets` (non-interactive IDE wiring)** — `--targets` + `--link-mode` for CI/sandboxes; MCP subset when combined with `--mcp`. Shipped [#158](https://github.com/stainless-code/codemap/pull/158); see [agents.md](./agents.md).
 - [ ] **`agents init` uninstall (teardown)** — symmetric inverse of init for failed pilots, template mistakes, or leaving a repo: remove codemap-managed MCP entries, pointer sections, and IDE symlinks only (same scoped paths as init; never delete user-authored `.agents/` siblings). `--target` filter, `--yes` non-interactive. Not the happy-path docs story — adoption stays `init --mcp --git-hooks` + committed `.agents/`. Effort: S.
 - [x] **HEAD / index freshness warning** — `index_freshness.commit_drift` + `warning` on `context` / tool metadata; boot stderr on `codemap mcp` / `serve` when concerns remain after prime. Shipped [#149](https://github.com/stainless-code/codemap/pull/149).
-- [ ] **MCP tool annotation hints** — `readOnlyHint` / `destructiveHint` / `idempotentHint` on MCP `tools/list` per [MCP ToolAnnotations](https://modelcontextprotocol.io/specification/2025-06-18/server/tools#toolannotations); central map beside `mcp-tool-allowlist.ts`. Write tools (`apply`, `apply_rows`, `apply_diff_input`) → `destructiveHint: true`; query/show/trace/audit → `readOnlyHint: true`; `save_baseline` / `drop_baseline` / `ingest_coverage` mutate index user-data only. Mirror hints on HTTP `GET /tools`. Plan: [`plans/mcp-tool-annotations.md`](./plans/mcp-tool-annotations.md). Effort: S. ### Recipe & audit enrichment diff --git a/src/application/http-server.test.ts b/src/application/http-server.test.ts index bc62b985..bb6b782c 100644 --- a/src/application/http-server.test.ts +++ b/src/application/http-server.test.ts @@ -15,6 +15,8 @@ import { resolveCodemapConfig } from "../config"; import { closeDb, createTables, openDb, upsertQueryBaseline } from "../db"; import { initCodemap } from "../runtime"; import { handleRequest } from "./http-server"; +import { MCP_TOOL_NAMES } from "./mcp-tool-allowlist"; +import { MCP_TOOL_ANNOTATIONS } from "./mcp-tool-annotations"; import { createManagedWatchSession } from "./session-lifecycle"; import { _resetWatchStateForTests } from "./watcher"; @@ -126,14 +128,38 @@ describe("http-server — health + tools catalog", () => { expect(body.version).toBe("0.0.0-test"); }); - it("GET /tools returns the catalog", async () => { + it("GET /tools returns the catalog with annotation hints", async () => { serverHandle = await startServer(); const r = await fetch(`http://127.0.0.1:${serverHandle.port}/tools`); - const body = (await r.json()) as { tools: { name: string }[] }; + const body = (await r.json()) as { + tools: { + name: string; + readOnlyHint: boolean; + destructiveHint: boolean; + idempotentHint: boolean; + }[]; + }; expect(body.tools.map((t) => t.name)).toContain("query"); expect(body.tools.map((t) => 
t.name)).toContain("audit"); expect(body.tools.map((t) => t.name)).toContain("affected"); expect(body.tools.map((t) => t.name)).toContain("trace"); + const query = body.tools.find((t) => t.name === "query"); + const apply = body.tools.find((t) => t.name === "apply"); + expect(query).toMatchObject({ + readOnlyHint: true, + destructiveHint: false, + idempotentHint: true, + }); + expect(apply).toMatchObject({ + readOnlyHint: false, + destructiveHint: true, + idempotentHint: false, + }); + expect(body.tools).toHaveLength(MCP_TOOL_NAMES.length); + for (const name of MCP_TOOL_NAMES) { + const entry = body.tools.find((t) => t.name === name); + expect(entry).toMatchObject(MCP_TOOL_ANNOTATIONS[name]); + } }); it("404 for unknown route", async () => { diff --git a/src/application/http-server.ts b/src/application/http-server.ts index a7d9f259..a5181b97 100644 --- a/src/application/http-server.ts +++ b/src/application/http-server.ts @@ -18,6 +18,8 @@ import { warnIndexFreshnessToStderr, } from "./index-freshness"; import type { IndexFreshness } from "./index-freshness"; +import { MCP_TOOL_NAMES } from "./mcp-tool-allowlist"; +import { buildHttpToolCatalogEntry } from "./mcp-tool-annotations"; import { listResources, readResource } from "./resource-handlers"; import { bindWatchClientRelease, @@ -108,29 +110,6 @@ export interface HttpServerOpts { managedWatchSession?: ManagedWatchSession; } -const TOOL_NAMES = [ - "query", - "query_batch", - "query_recipe", - "audit", - "context", - "validate", - "show", - "snippet", - "impact", - "affected", - "trace", - "explore", - "node", - "apply", - "apply_rows", - "apply_diff_input", - "save_baseline", - "list_baselines", - "drop_baseline", - "ingest_coverage", -] as const; - /** * Bootstrap codemap once at server boot, then attach a long-running HTTP * listener. 
Resolves on SIGINT / SIGTERM (drains in-flight + closes @@ -311,7 +290,7 @@ export async function handleRequest( return writeJson( res, 200, - { tools: TOOL_NAMES.map((name) => ({ name })) }, + { tools: MCP_TOOL_NAMES.map((name) => buildHttpToolCatalogEntry(name)) }, opts.version, ); } @@ -356,7 +335,7 @@ export async function handleRequest( if (method === "POST" && path.startsWith("/tool/")) { const name = path.slice("/tool/".length); - if (!(TOOL_NAMES as readonly string[]).includes(name)) { + if (!(MCP_TOOL_NAMES as readonly string[]).includes(name)) { return writeJson( res, 404, @@ -619,7 +598,7 @@ async function dispatchTool( break; } default: { - // Reachable only if TOOL_NAMES gains an entry without a switch arm — + // Reachable only if MCP_TOOL_NAMES gains an entry without a switch arm — // the route guard above catches user-typed unknown names. return writeJson( res, diff --git a/src/application/mcp-server.test.ts b/src/application/mcp-server.test.ts index a50bd315..c80bee79 100644 --- a/src/application/mcp-server.test.ts +++ b/src/application/mcp-server.test.ts @@ -22,6 +22,8 @@ import { } from "../db"; import { initCodemap } from "../runtime"; import { createMcpServer } from "./mcp-server"; +import { MCP_TOOL_NAMES } from "./mcp-tool-allowlist"; +import { MCP_TOOL_ANNOTATIONS } from "./mcp-tool-annotations"; let benchDir: string; @@ -127,6 +129,49 @@ describe("MCP server — tool allowlist", () => { await server.close(); } }); + + it("registers ToolAnnotations on every default tool", async () => { + const { client, server } = await makeClient(); + try { + const tools = await client.listTools(); + expect(tools.tools.map((t) => t.name).sort()).toEqual( + [...MCP_TOOL_NAMES].sort(), + ); + for (const tool of tools.tools) { + expect(tool.annotations).toEqual( + MCP_TOOL_ANNOTATIONS[tool.name as keyof typeof MCP_TOOL_ANNOTATIONS], + ); + } + } finally { + await server.close(); + } + }); + + it("registers MCP ToolAnnotations on allowlisted tools", async () => { 
+ const { client, server } = await makeClient({ + CODEMAP_MCP_TOOLS: "query,apply,show", + }); + try { + const tools = await client.listTools(); + const byName = new Map(tools.tools.map((t) => [t.name, t])); + expect(byName.get("query")?.annotations).toMatchObject({ + readOnlyHint: true, + destructiveHint: false, + idempotentHint: true, + }); + expect(byName.get("apply")?.annotations).toMatchObject({ + readOnlyHint: false, + destructiveHint: true, + idempotentHint: false, + }); + expect(byName.get("show")?.annotations).toMatchObject({ + readOnlyHint: true, + destructiveHint: false, + }); + } finally { + await server.close(); + } + }); }); describe("MCP server — query tool", () => { diff --git a/src/application/mcp-server.ts b/src/application/mcp-server.ts index 3e828add..e903d5bb 100644 --- a/src/application/mcp-server.ts +++ b/src/application/mcp-server.ts @@ -26,6 +26,7 @@ import { resolveMcpToolAllowlist, } from "./mcp-tool-allowlist"; import type { McpToolName } from "./mcp-tool-allowlist"; +import { withToolAnnotations } from "./mcp-tool-annotations"; import { listQueryRecipeCatalog } from "./query-recipes"; import { readResource } from "./resource-handlers"; import type { ResourcePayload } from "./resource-handlers"; @@ -201,11 +202,11 @@ export function createMcpServer(opts: ServerOpts): McpServer { function registerQueryTool(server: McpServer, opts: ServerOpts): void { server.registerTool( "query", - { + withToolAnnotations("query", { description: 'Run one read-only SQL statement against the codemap index (default `.codemap/index.db`). Returns the JSON envelope `codemap query --json` would print: row array by default, {count} under `summary`, {group_by, groups} under `group_by`, baseline diff under `baseline` (incompatible with non-json `format` / `group_by`). Pass `format: "sarif"` / `"annotations"` / `"mermaid"` / `"diff"` / `"diff-json"` to receive a formatted payload (incompatible with `summary` / `group_by` / `baseline`). 
Mermaid requires `{from, to, label?, kind?}` rows; diff requires `{file_path, line_start, before_pattern, after_pattern}` rows.', inputSchema: queryArgsSchema, - }, + }), (args) => wrapToolResult(handleQuery(args, opts.root)), ); } @@ -213,11 +214,11 @@ function registerQueryTool(server: McpServer, opts: ServerOpts): void { function registerQueryRecipeTool(server: McpServer, opts: ServerOpts): void { server.registerTool( "query_recipe", - { + withToolAnnotations("query_recipe", { description: 'Run a recipe by id (bundled or project-local). Output rows carry per-row `actions` hints (recipe-only — `query` never adds them). Parametrised recipes accept `params: {key: value}` validated against recipe frontmatter. Compose with `summary` / `changed_since` / `group_by` / `baseline` exactly like `query` (`baseline` adds `actions` on `added` rows only). Pass `format: "sarif"` / `"annotations"` / `"mermaid"` / `"diff"` / `"diff-json"` to receive a formatted payload (incompatible with `summary` / `group_by` / `baseline`); SARIF rule id derives from the recipe id (`codemap.`). List available recipes via the `codemap://recipes` resource.', inputSchema: queryRecipeArgsSchema, - }, + }), (args) => wrapToolResult(handleQueryRecipe(args, opts.root)), ); } @@ -225,11 +226,11 @@ function registerQueryRecipeTool(server: McpServer, opts: ServerOpts): void { function registerQueryBatchTool(server: McpServer, opts: ServerOpts): void { server.registerTool( "query_batch", - { + withToolAnnotations("query_batch", { description: "Run N read-only SQL statements in one round-trip. Each item is either a bare SQL string (inherits batch-wide flags) or an object {sql, summary?, changed_since?, group_by?} overriding batch-wide flags per-key. 
Returns an N-element array; per-element shape mirrors single `query`'s output for that statement's effective flag set.", inputSchema: queryBatchArgsSchema, - }, + }), (args) => wrapToolResult(handleQueryBatch(args, opts.root)), ); } @@ -237,11 +238,11 @@ function registerQueryBatchTool(server: McpServer, opts: ServerOpts): void { function registerAuditTool(server: McpServer): void { server.registerTool( "audit", - { + withToolAnnotations("audit", { description: "Structural-drift audit. Composes per-delta snapshots (files / dependencies / deprecated) into a {head, deltas} envelope. Two **primary** snapshot sources are mutually exclusive: (1) `base: ` — materialises a git committish (origin/main, HEAD~5, sha, tag) via `git archive | tar -x` to a sha-keyed cache under `.codemap/audit-cache/` (plain tree, no `.git` artifact — `git clean -xdf` and `rm -rf` both sweep it), reindexes into a cached `.codemap/index.db` at that sha, diffs against current. Cache hit on second run against same sha is sub-100ms. Requires a git repository — non-git projects get `{error: 'codemap audit: --base requires a git repository.'}`. (2) `baseline_prefix` — auto-resolves -{files,dependencies,deprecated} from `query_baselines`. Plus optional **per-delta overrides** via `baselines: {: }` that compose with either primary source. `summary: true` collapses each delta to {added: N, removed: N}. 
`no_index` controls the head-side incremental-index prelude (default re-indexes; watch-active default is no-op since the watcher keeps the index fresh; pass `no_index: false` to force).", inputSchema: auditArgsSchema, - }, + }), async (args) => wrapToolResult(await handleAudit(args)), ); } @@ -249,11 +250,11 @@ function registerAuditTool(server: McpServer): void { function registerContextTool(server: McpServer): void { server.registerTool( "context", - { + withToolAnnotations("context", { description: "Project bootstrap snapshot — returns the same envelope `codemap context` prints (project root, schema version, file count, start_here shortcuts, recipe catalog, index_freshness). Pass include_snippets for one-line export previews on hub leaders (ignored when compact: true).", inputSchema: contextArgsSchema, - }, + }), (args) => wrapToolResult(handleContext(args)), ); } @@ -261,11 +262,11 @@ function registerContextTool(server: McpServer): void { function registerValidateTool(server: McpServer): void { server.registerTool( "validate", - { + withToolAnnotations("validate", { description: "Compare on-disk SHA-256 of indexed files to the indexed `files.content_hash` column. Returns only out-of-sync rows with status `stale` / `missing` / `unindexed` (fresh paths omitted). Empty `paths` validates every indexed file. Useful for 'codemap doctor' agents that diagnose a stale index before issuing structural queries.", inputSchema: validateArgsSchema, - }, + }), (args) => wrapToolResult(handleValidate(args)), ); } @@ -273,11 +274,11 @@ function registerValidateTool(server: McpServer): void { function registerSaveBaselineTool(server: McpServer, opts: ServerOpts): void { server.registerTool( "save_baseline", - { + withToolAnnotations("save_baseline", { description: "Snapshot the rows of a SQL or recipe under `name` in query_baselines. Polymorphic input: pass exactly one of `sql` (ad-hoc SELECT) or `recipe` (catalog recipe id). 
Mirrors `codemap query --save-baseline=`'s single-verb shape; the runtime check that exactly one is set keeps the agent from accidentally saving an unintended source.", inputSchema: saveBaselineArgsSchema, - }, + }), (args) => wrapToolResult(handleSaveBaseline(args, opts.root)), ); } @@ -285,11 +286,11 @@ function registerSaveBaselineTool(server: McpServer, opts: ServerOpts): void { function registerListBaselinesTool(server: McpServer): void { server.registerTool( "list_baselines", - { + withToolAnnotations("list_baselines", { description: "List all saved baselines (no rows_json payload — use the audit tool with a baseline_prefix to compare against current). Returns the same array `codemap query --baselines --json` prints.", inputSchema: listBaselinesArgsSchema, - }, + }), () => wrapToolResult(handleListBaselines()), ); } @@ -297,11 +298,11 @@ function registerListBaselinesTool(server: McpServer): void { function registerIngestCoverageTool(server: McpServer, opts: ServerOpts): void { server.registerTool( "ingest_coverage", - { + withToolAnnotations("ingest_coverage", { description: "Ingest a coverage artifact (Istanbul JSON, LCOV, or NODE_V8_COVERAGE directory with `runtime: true`) into the index `coverage` table. Same JSON envelope as `codemap ingest-coverage --json`. Enables coverage-aware recipes (`worst-covered-exports`, `files-by-coverage`, `untested-and-dead`). Args: `path` (required), `runtime` (optional).", inputSchema: ingestCoverageArgsSchema, - }, + }), async (args) => wrapToolResult(await handleIngestCoverage(args, opts.root)), ); } @@ -309,11 +310,11 @@ function registerIngestCoverageTool(server: McpServer, opts: ServerOpts): void { function registerDropBaselineTool(server: McpServer): void { server.registerTool( "drop_baseline", - { + withToolAnnotations("drop_baseline", { description: "Delete the named baseline. 
Returns {dropped: } on success or {error} if the name doesn't exist.", inputSchema: dropBaselineArgsSchema, - }, + }), (args) => wrapToolResult(handleDropBaseline(args)), ); } @@ -321,11 +322,11 @@ function registerDropBaselineTool(server: McpServer): void { function registerShowTool(server: McpServer, opts: ServerOpts): void { server.registerTool( "show", - { + withToolAnnotations("show", { description: "Look up symbol(s) by exact name or field-qualified `query` search; returns {matches: [{name, kind, file_path, line_start, line_end, signature, ...}], disambiguation?, warning?}. Query syntax: kind:, name:, path:, in: fields plus optional free text (name LIKE, or source_fts with with_fts when indexed — FTS matches file bodies and returns every symbol in matching files). Use `snippet` for source text; use `query` tool for arbitrary SQL.", inputSchema: showArgsSchema, - }, + }), (args) => wrapToolResult(handleShow(args, opts.root)), ); } @@ -333,11 +334,11 @@ function registerShowTool(server: McpServer, opts: ServerOpts): void { function registerSnippetTool(server: McpServer, opts: ServerOpts): void { server.registerTool( "snippet", - { + withToolAnnotations("snippet", { description: "Same lookup as `show` (exact `{name}` or field-qualified `{query}` with kind:/name:/path:/in: tokens + optional `with_fts` for free text — FTS matches file bodies and returns every symbol in matching files) but each match carries `source` (file lines from disk at line_start..line_end) plus `stale` (true when content_hash drifted since indexing — line range may have shifted; agent decides whether to act or re-index) and `missing` (true when file is gone). 
Returns `{matches, disambiguation?, warning?}`; source/stale/missing are additive fields on each match.", inputSchema: snippetArgsSchema, - }, + }), (args) => wrapToolResult(handleSnippet(args, opts.root)), ); } @@ -345,11 +346,11 @@ function registerSnippetTool(server: McpServer, opts: ServerOpts): void { function registerAffectedTool(server: McpServer, opts: ServerOpts): void { server.registerTool( "affected", - { + withToolAnnotations("affected", { description: "List test files transitively impacted by changed source files (reverse BFS on `dependencies`). Same preprocessor as `codemap affected` → `affected-tests` recipe. Args: paths (explicit project-relative paths; when set, skips git — `paths: []` is explicit empty, omit paths for git discovery), changed_since (git ref when paths omitted; default HEAD; wins only when paths omitted), test_glob (SQLite GLOB; replaces default suffix globs when set), max_depth (non-negative integer BFS cap). Returns JSON array of {test_path, impact_depth, actions?} — file paths only; CI composes the runner command.", inputSchema: affectedArgsSchema, - }, + }), (args) => wrapToolResult(handleAffected(args, opts.root)), ); } @@ -357,11 +358,11 @@ function registerAffectedTool(server: McpServer, opts: ServerOpts): void { function registerImpactTool(server: McpServer): void { server.registerTool( "impact", - { + withToolAnnotations("impact", { description: "Walk the dependency / calls / imports graph from and return the blast radius. Replaces composing `WITH RECURSIVE` queries by hand. Args: target (symbol name or file path), direction (up|down|both, default both), via (dependencies|calls|imports|all, default all — symbol targets walk calls; file targets walk dependencies+imports; mismatched explicit choices land in skipped_backends), depth (default 3, 0=unbounded but cycle-detected and limit-capped), limit (default 500), summary (returns target+summary only). 
Result envelope: {target, direction, via, depth_limit, matches: [{depth, direction, edge, kind, name?, file_path}], summary: {nodes, max_depth_reached, by_kind, terminated_by: 'depth'|'limit'|'exhausted'}}.", inputSchema: impactArgsSchema, - }, + }), (args) => wrapToolResult(handleImpact(args)), ); } @@ -369,11 +370,11 @@ function registerImpactTool(server: McpServer): void { function registerTraceTool(server: McpServer, opts: ServerOpts): void { server.registerTool( "trace", - { + withToolAnnotations("trace", { description: "Shortest call path between two symbols plus budget-capped snippets. Composes `call-path` recipe + disk reads (cross-file callee lookup). Args: from, to, max_depth?, via (calls|dependencies|all), budget_chars (adaptive default 15k/10k/6k when omitted; snippet source text only). Returns {from, to, via?, path, snippets, truncated, truncation?, snippets_skipped_reason?}. `truncated` is true when snippet budget hit (`truncation.snippets`); dependency hops omit auto-snippets (`snippets_skipped_reason`). Fall back to `query_recipe` call-path when unsure.", inputSchema: traceArgsSchema, - }, + }), (args) => wrapToolResult(handleTrace(args, opts.root)), ); } @@ -381,11 +382,11 @@ function registerTraceTool(server: McpServer, opts: ServerOpts): void { function registerExploreTool(server: McpServer, opts: ServerOpts): void { server.registerTool( "explore", - { + withToolAnnotations("explore", { description: "Multi-symbol neighborhood survey with budget-capped snippets. Composes `symbol-neighborhood` (once per deduped name) + disk reads. Args: names (non-empty array), depth?, kind?, budget_chars (adaptive default 15k/10k/6k snippet chars when omitted). Returns {names, rows, snippets, truncated, truncation?} — `truncation.rows` when adaptive row cap hit (500/250/125 by repo size), `truncation.snippets` when budget hit. 
Fall back to `query_recipe` symbol-neighborhood.", inputSchema: exploreArgsSchema, - }, + }), (args) => wrapToolResult(handleExplore(args, opts.root)), ); } @@ -393,11 +394,11 @@ function registerExploreTool(server: McpServer, opts: ServerOpts): void { function registerNodeTool(server: McpServer, opts: ServerOpts): void { server.registerTool( "node", - { + withToolAnnotations("node", { description: "One-hop symbol survey: `show` center + scoped depth-1 `symbol-neighborhood` + optional inline snippets. When center is unique (`in` or single match), neighborhood filters to that instance's connected files. Args: name, kind?, in?, include_snippets (default false), budget_chars? (adaptive default 15k/10k/6k when snippets enabled and omitted; snippet source only). Returns {center, neighborhood, snippets, truncated, truncation?}. `truncated` only when `include_snippets: true` and snippet budget hit.", inputSchema: nodeArgsSchema, - }, + }), (args) => wrapToolResult(handleNode(args, opts.root)), ); } @@ -405,11 +406,11 @@ function registerNodeTool(server: McpServer, opts: ServerOpts): void { function registerApplyTool(server: McpServer, opts: ServerOpts): void { server.registerTool( "apply", - { + withToolAnnotations("apply", { description: "Apply the diff hunks a recipe describes (one per row of {file_path, line_start, before_pattern, after_pattern}) to disk. Substrate-shaped fix executor — recipe SQL is the synthesis surface, codemap executes. 
Args: recipe (id), params (k=v map for parametrised recipes), dry_run (preview only; phase-1 validates against current disk; no file is written), yes (required for the write path — non-TTY transports always need explicit consent; mutually exclusive with dry_run), force (bypass auto_fixable / apply.autoApplyRecipes gates), until_empty (fixpoint: dry-run probe → apply → reindex touched files → repeat; adds passes + terminated_by), max_passes (cap for until_empty; default 10), commit_message (git add touched files + commit after clean apply). Result envelope (same shape across modes): {mode: 'dry-run'|'apply', applied: bool, files: [{file_path, rows_applied, warnings?}], conflicts: [{file_path, line_start, before_pattern, actual_at_line, reason}], summary: {files, files_modified, rows, rows_applied, conflicts, files_with_conflicts}}; fixpoint runs add passes + terminated_by ∈ {empty, cap, conflicts, complete}. Q2 (c) all-or-nothing — any conflict aborts the whole run before any file is touched.", inputSchema: applyArgsSchema, - }, + }), async (args) => wrapToolResult(await handleApply(args, opts.root)), ); } @@ -417,11 +418,11 @@ function registerApplyTool(server: McpServer, opts: ServerOpts): void { function registerApplyDiffInputTool(server: McpServer, opts: ServerOpts): void { server.registerTool( "apply_diff_input", - { + withToolAnnotations("apply_diff_input", { description: "Apply a unified diff (git-style `-`/`+` hunks) to disk — same row contract and executor as `apply_rows`, but `diff_text` is parsed into diff rows (CLI twin: codemap apply --diff-input). Args: diff_text, dry_run, yes (required for writes), commit_message (optional git commit after clean apply). 
No recipe policy gates.", inputSchema: applyDiffInputArgsSchema, - }, + }), async (args) => wrapToolResult(await handleApplyDiffInput(args, opts.root)), ); } @@ -429,11 +430,11 @@ function registerApplyDiffInputTool(server: McpServer, opts: ServerOpts): void { function registerApplyRowsTool(server: McpServer, opts: ServerOpts): void { server.registerTool( "apply_rows", - { + withToolAnnotations("apply_rows", { description: "Apply explicit diff rows (agent-in-the-loop) — same phase-1/2 engine as `apply` but rows are supplied directly instead of from recipe SQL. Args: rows (array of {file_path, line_start, before_pattern, after_pattern}), dry_run, yes (required for writes).", inputSchema: applyRowsArgsSchema, - }, + }), (args) => wrapToolResult(handleApplyRows(args, opts.root)), ); } diff --git a/src/application/mcp-tool-annotations.test.ts b/src/application/mcp-tool-annotations.test.ts new file mode 100644 index 00000000..e2d65d71 --- /dev/null +++ b/src/application/mcp-tool-annotations.test.ts @@ -0,0 +1,91 @@ +import { describe, expect, it } from "bun:test"; + +import { MCP_TOOL_NAMES } from "./mcp-tool-allowlist"; +import { + MCP_TOOL_ANNOTATIONS, + _setSdkSupportsMcpToolAnnotationsForTests, + buildHttpToolCatalogEntry, + getMcpToolAnnotations, + sdkSupportsMcpToolAnnotations, + withToolAnnotations, +} from "./mcp-tool-annotations"; + +describe("mcp-tool-annotations", () => { + it("covers every MCP_TOOL_NAMES entry", () => { + for (const name of MCP_TOOL_NAMES) { + expect(MCP_TOOL_ANNOTATIONS[name]).toBeDefined(); + expect(getMcpToolAnnotations(name)).toBeDefined(); + } + }); + + it("apply tools carry destructiveHint", () => { + for (const name of ["apply", "apply_rows", "apply_diff_input"] as const) { + expect(getMcpToolAnnotations(name)).toMatchObject({ + readOnlyHint: false, + destructiveHint: true, + idempotentHint: false, + }); + } + }); + + it("query and audit tools carry readOnlyHint", () => { + for (const name of [ + "query", + "query_recipe", + 
"query_batch", + "audit", + "show", + ] as const) { + expect(getMcpToolAnnotations(name)).toMatchObject({ + readOnlyHint: true, + destructiveHint: false, + idempotentHint: true, + }); + } + }); + + it("sdkSupportsMcpToolAnnotations is true on pinned SDK", () => { + expect(sdkSupportsMcpToolAnnotations()).toBe(true); + }); + + it("omits MCP annotations when M.6 guard is false", () => { + _setSdkSupportsMcpToolAnnotationsForTests(false); + try { + expect(getMcpToolAnnotations("apply")).toBeUndefined(); + expect( + withToolAnnotations("apply", { + description: "test", + inputSchema: {}, + }), + ).toEqual({ description: "test", inputSchema: {} }); + } finally { + _setSdkSupportsMcpToolAnnotationsForTests(undefined); + } + }); + + it("index user-data mutators are not destructive", () => { + for (const name of [ + "save_baseline", + "drop_baseline", + "ingest_coverage", + ] as const) { + expect(getMcpToolAnnotations(name)).toMatchObject({ + readOnlyHint: false, + destructiveHint: false, + }); + } + expect(getMcpToolAnnotations("ingest_coverage")?.idempotentHint).toBe( + false, + ); + }); + + it("buildHttpToolCatalogEntry mirrors MCP hints", () => { + const entry = buildHttpToolCatalogEntry("apply"); + expect(entry).toEqual({ + name: "apply", + readOnlyHint: false, + destructiveHint: true, + idempotentHint: false, + }); + }); +}); diff --git a/src/application/mcp-tool-annotations.ts b/src/application/mcp-tool-annotations.ts new file mode 100644 index 00000000..32b79eb2 --- /dev/null +++ b/src/application/mcp-tool-annotations.ts @@ -0,0 +1,128 @@ +/** + * MCP ToolAnnotations hints for `tools/list` and HTTP `GET /tools`. + * Advisory only — handlers unchanged; see architecture.md § MCP wiring. 
+ */ + +import { ToolAnnotationsSchema } from "@modelcontextprotocol/sdk/types.js"; +import type { ToolAnnotations } from "@modelcontextprotocol/sdk/types.js"; + +import type { McpToolName } from "./mcp-tool-allowlist"; + +let sdkSupportOverrideForTests: boolean | undefined; + +/** Test hook — restore with `undefined` in `afterEach`. */ +export function _setSdkSupportsMcpToolAnnotationsForTests( + value: boolean | undefined, +): void { + sdkSupportOverrideForTests = value; +} + +/** M.6 — skip MCP annotations when an older SDK lacks ToolAnnotations on registerTool. */ +export function sdkSupportsMcpToolAnnotations(): boolean { + if (sdkSupportOverrideForTests !== undefined) { + return sdkSupportOverrideForTests; + } + const shape = ToolAnnotationsSchema?.shape; + return ( + typeof shape === "object" && + shape !== null && + "readOnlyHint" in shape && + "destructiveHint" in shape && + "idempotentHint" in shape + ); +} + +export const MCP_TOOL_ANNOTATIONS = { + query: { readOnlyHint: true, destructiveHint: false, idempotentHint: true }, + query_batch: { + readOnlyHint: true, + destructiveHint: false, + idempotentHint: true, + }, + query_recipe: { + readOnlyHint: true, + destructiveHint: false, + idempotentHint: true, + }, + context: { readOnlyHint: true, destructiveHint: false, idempotentHint: true }, + validate: { + readOnlyHint: true, + destructiveHint: false, + idempotentHint: true, + }, + show: { readOnlyHint: true, destructiveHint: false, idempotentHint: true }, + snippet: { readOnlyHint: true, destructiveHint: false, idempotentHint: true }, + impact: { readOnlyHint: true, destructiveHint: false, idempotentHint: true }, + affected: { + readOnlyHint: true, + destructiveHint: false, + idempotentHint: true, + }, + trace: { readOnlyHint: true, destructiveHint: false, idempotentHint: true }, + explore: { readOnlyHint: true, destructiveHint: false, idempotentHint: true }, + node: { readOnlyHint: true, destructiveHint: false, idempotentHint: true }, + audit: { 
readOnlyHint: true, destructiveHint: false, idempotentHint: true }, + list_baselines: { + readOnlyHint: true, + destructiveHint: false, + idempotentHint: true, + }, + save_baseline: { + readOnlyHint: false, + destructiveHint: false, + idempotentHint: true, + }, + drop_baseline: { + readOnlyHint: false, + destructiveHint: false, + idempotentHint: true, + }, + ingest_coverage: { + readOnlyHint: false, + destructiveHint: false, + idempotentHint: false, + }, + apply: { + readOnlyHint: false, + destructiveHint: true, + idempotentHint: false, + }, + apply_rows: { + readOnlyHint: false, + destructiveHint: true, + idempotentHint: false, + }, + apply_diff_input: { + readOnlyHint: false, + destructiveHint: true, + idempotentHint: false, + }, +} satisfies Record<McpToolName, ToolAnnotations>; + +export function getMcpToolAnnotations( + name: McpToolName, +): ToolAnnotations | undefined { + if (!sdkSupportsMcpToolAnnotations()) return undefined; + return MCP_TOOL_ANNOTATIONS[name]; +} + +/** HTTP `GET /tools` catalog entry — same hint fields as MCP `tools/list`. */ +export function buildHttpToolCatalogEntry(name: McpToolName): { + name: McpToolName; +} & ToolAnnotations { + return { name, ...MCP_TOOL_ANNOTATIONS[name] }; +} + +interface ToolRegisterConfig { + description: string; + inputSchema: unknown; +} + +export function withToolAnnotations<T extends ToolRegisterConfig>( + name: McpToolName, + config: T, +): T & { annotations?: ToolAnnotations } { + const annotations = getMcpToolAnnotations(name); + if (annotations === undefined) return config; + return { ...config, annotations }; +} diff --git a/templates/agent-content/skill/10-recipes-context.md b/templates/agent-content/skill/10-recipes-context.md index 00fa901c..985b6634 100644 --- a/templates/agent-content/skill/10-recipes-context.md +++ b/templates/agent-content/skill/10-recipes-context.md @@ -27,7 +27,7 @@ Replace placeholders (`'...'`) with your module path, file glob, or symbol name.
Each emitted delta carries its own `base` metadata so mixed-baseline audits are first-class. **`--base `** materialises any git committish via `git archive | tar -x` + reindex (mutually exclusive with `--baseline`). **`--format sarif`** emits SARIF 2.1.0 for Code Scanning; **`--ci`** aliases `--format sarif` + non-zero exit on additions (mutually exclusive with `--json`). `--summary` collapses each delta to `{added: N, removed: N}`. `--no-index` skips the auto-incremental-index prelude (default is to re-index first so `head` reflects current source). v1 ships no `verdict` / threshold config — `codemap audit --json | jq -e '.deltas.dependencies.added | length <= 50'` is the CI exit-code idiom until v1.x ships native thresholds. Each delta pins a canonical SQL projection and validates baseline column-set membership before diffing — schema-bump-resilient (extras dropped, missing columns surface a clean re-save command). -**MCP server (`codemap mcp [--no-watch] [--debounce ]`)** — separate top-level command exposing the structural-query surface (20 JSON-RPC tools — list below) to agent hosts (Claude Code, Cursor, Codex, generic MCP clients) over stdio. Eliminates the bash round-trip on every agent call. Bootstrap once at server boot; each tool returns the same JSON payload its CLI `--json` would print (including `query batch`, `trace`, `explore`, `node`, `file`, `schema`, `symbols`, `context --include-snippets`, and `ingest-coverage`). MCP wraps payloads in `{content: [{type: "text", text: …}]}`. **`initialize` instructions** + resource `codemap://mcp-instructions` carry the tool-selection playbook. **Watcher default-ON since 2026-05** — every tool reads a live index, `audit`'s incremental-index prelude becomes a no-op. Pass `--no-watch` (or `CODEMAP_WATCH=0`) for one-shot fire-and-forget calls without the in-process chokidar loop. 
+**MCP server (`codemap mcp [--no-watch] [--debounce ]`)** — separate top-level command exposing the structural-query surface (20 JSON-RPC tools — list below) to agent hosts (Claude Code, Cursor, Codex, generic MCP clients) over stdio. Eliminates the bash round-trip on every agent call. Bootstrap once at server boot; each tool returns the same JSON payload its CLI `--json` would print (including `query batch`, `trace`, `explore`, `node`, `file`, `schema`, `symbols`, `context --include-snippets`, and `ingest-coverage`). MCP wraps payloads in `{content: [{type: "text", text: …}]}`. **`tools/list` ToolAnnotations** — advisory `readOnlyHint` / `destructiveHint` / `idempotentHint` per tool: read paths (`query`, `show`, `audit`, …) → `readOnlyHint: true`; apply tools (`apply`, `apply_rows`, `apply_diff_input`) → `destructiveHint: true` (writes still require `yes: true`); index mutators (`save_baseline`, `drop_baseline`, `ingest_coverage`) → `readOnlyHint: false` without `destructiveHint`. HTTP `GET /tools` exposes the same hints. **`initialize` instructions** + resource `codemap://mcp-instructions` carry the tool-selection playbook. **Watcher default-ON since 2026-05** — every tool reads a live index, `audit`'s incremental-index prelude becomes a no-op. Pass `--no-watch` (or `CODEMAP_WATCH=0`) for one-shot fire-and-forget calls without the in-process chokidar loop. **HTTP server (`codemap serve [--host 127.0.0.1] [--port 7878] [--token ] [--no-watch] [--debounce ]`)** — same tool taxonomy as MCP, exposed over `POST /tool/{name}` for non-MCP consumers (CI scripts, simple `curl`, IDE plugins that don't speak MCP). Loopback-default; optional Bearer-token auth. HTTP returns each tool's native JSON payload directly (NOT MCP's `{content: [...]}` wrapper); SARIF / annotations / mermaid / diff payloads ship with `application/sarif+json` or `text/plain` Content-Type; `format: "diff-json"` uses `application/json`. Resources mirrored at `GET /resources/{encoded-uri}`. 
`GET /health` is auth-exempt; `GET /tools` / `GET /resources` are catalogs. **Watcher default-ON since 2026-05** — same `--no-watch` / `CODEMAP_WATCH=0` opt-out as `mcp`.
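
Since the hints are advisory, a client decides for itself how to act on them. The following is a minimal sketch of one possible client-side gate, not part of codemap: the `ToolHints` type, the `Approval` labels, and `approvalFor` are all hypothetical names, and the sample catalog only copies hint values from the annotation table in this change. It assumes a client that auto-approves read-only tools, asks before index mutators, and asks more strictly before destructive tools.

```typescript
// Hypothetical client-side policy. Only the three hint field names come from
// the tool catalog; the types, labels, and function below are illustrative.
interface ToolHints {
  readOnlyHint?: boolean;
  destructiveHint?: boolean;
  idempotentHint?: boolean;
}

type Approval = "auto" | "confirm" | "confirm-strict";

function approvalFor(hints: ToolHints | undefined): Approval {
  // Hints may be absent entirely (the server omits them on an SDK without
  // ToolAnnotations support), so a missing entry falls back to asking.
  if (hints === undefined) return "confirm";
  // Disk-writing tools get the strictest prompt, whatever else is set.
  if (hints.destructiveHint === true) return "confirm-strict";
  // Read paths can run without a prompt.
  if (hints.readOnlyHint === true) return "auto";
  // Index mutators: not read-only, not destructive.
  return "confirm";
}

// Sample entries mirroring the hint values listed for these three tools.
const catalog: Record<string, ToolHints> = {
  query: { readOnlyHint: true, destructiveHint: false, idempotentHint: true },
  save_baseline: {
    readOnlyHint: false,
    destructiveHint: false,
    idempotentHint: true,
  },
  apply: { readOnlyHint: false, destructiveHint: true, idempotentHint: false },
};

for (const [name, hints] of Object.entries(catalog)) {
  console.log(`${name}: ${approvalFor(hints)}`);
}
```

Checking `destructiveHint` before `readOnlyHint` keeps the gate conservative if a server ever sends contradictory values. The server-side `yes: true` requirement on apply tools still holds regardless of what the client decides.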