feat(ai-sandbox): serverless/edge run model + Cloudflare example (stacked on #774) #801
Adds the serverless/edge execution model so a trigger can start an agent run and return immediately while a durable orchestrator drives it and clients tail from a resumable cursor.
- Resumable run event-log (RunEventLog/InMemoryRunEventLog) + run driver (pipeToRunLog, RunController).
- Transport-agnostic tool-bridge (createToolBridgeCore/handleBridgeJsonRpc) + ToolBridgeProvisioner capability. node:http transport hardened: loopback bind unless Docker needs host.docker.internal, constant-time bearer compare.
- Harness adapters (claude-code/codex/gemini-cli/opencode) resolve the bridge from the provisioner capability instead of hardcoding node:http.
- Claude Code runs on Cloudflare: SandboxCapabilities.writableStdin flag + prompt delivered via file/in-shell stdin-redirection.
- Tool-bridge bearer token moved out of argv (--mcp-config via file).
- examples/sandbox-cloudflare-agent: Worker -> Durable Object -> Container reference (compile-only, not runtime-verified in-repo).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
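To make the "resumable cursor" idea concrete, here is a minimal sketch of an append-only, sequence-indexed log with replay-then-tail semantics. It is not the `RunEventLog` implementation from this PR: `SketchRunLog`, `LoggedEvent`, and the `attach(afterSeq, onEvent)` shape are hypothetical names chosen for illustration, and the real log may well be async and storage-backed.

```typescript
// Hypothetical sketch; not the @tanstack/ai-sandbox RunEventLog API.
interface LoggedEvent<T> {
  seq: number // monotonically increasing, first event is 1
  payload: T
}

class SketchRunLog<T> {
  private events: Array<LoggedEvent<T>> = []
  private listeners: Array<(event: LoggedEvent<T>) => void> = []
  private lastSeq = 0

  /** Append an event, assign it the next sequence number, notify tailers. */
  append(payload: T): LoggedEvent<T> {
    this.lastSeq += 1
    const event: LoggedEvent<T> = { seq: this.lastSeq, payload }
    this.events.push(event)
    // Iterate over a copy so a listener that detaches mid-delivery is safe.
    for (const listener of this.listeners.slice()) listener(event)
    return event
  }

  /** Replay everything after `afterSeq`, then keep tailing. Returns a detach function. */
  attach(afterSeq: number, onEvent: (event: LoggedEvent<T>) => void): () => void {
    for (const event of this.events) {
      if (event.seq > afterSeq) onEvent(event)
    }
    this.listeners.push(onEvent)
    return () => {
      this.listeners = this.listeners.filter((l) => l !== onEvent)
    }
  }
}

// A client that already saw seq 1 reconnects with cursor 1.
const log = new SketchRunLog<string>()
log.append('RUN_STARTED')
log.append('delta: hello')

const seen: string[] = []
const detach = log.attach(1, (e) => seen.push(`${e.seq}:${e.payload}`))
log.append('RUN_FINISHED')
detach()
log.append('not delivered after detach')
```

The reconnecting client receives only what it missed (seq 2) and then the live tail (seq 3), which is the property a dropped connection or a new tab relies on.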
🚀 Changeset Version Preview: 9 package(s) bumped directly, 31 bumped as dependents.
…bridge in-container
Adds the co-located variant: the harness loop AND its MCP tool-bridge run INSIDE the container (the in-container sandbox is just local-process: native stdin, a localhost node:http bridge, no public bridge URL). The Durable Object stays outside as a thin durable coordinator. Only chat()-tool EXECUTION crosses the container→orchestrator boundary, shrinking the public surface from the whole MCP protocol to one authenticated tool-exec call.
- New @tanstack/ai-sandbox exports: remoteToolStubs, toolDescriptors, httpRemoteToolExecutor, executeHostTool (+ RemoteToolExecutor). The container rebuilds chat() tools as delegating stubs; the orchestrator answers one tool-exec call with the real tool's execute(). (6 unit tests, local-process.)
- Integration test (claude-code): a real adapter in a real local-process sandbox with a fake claude that performs an MCP tools/call against the in-container bridge, proving the round-trip agent → localhost bridge → stub → executeHostTool → real tool → stream. (34 tests pass.)
- examples/sandbox-cloudflare-agent-colocated: Worker → DO → Container reference with the in-container runner (compile-only; tsc -b clean).
- docs: "Edge execution: two models" section contrasting DO-drives vs co-located.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
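The delegating-stub idea can be sketched in a few lines. This is an illustration only: `SketchTool`, `makeStubs`, `answerToolExec`, and the JSON envelope are hypothetical and are not the signatures of `remoteToolStubs`, `httpRemoteToolExecutor`, or `executeHostTool`. The executor here is a synchronous in-process function standing in for the authenticated HTTP call, which in practice would be async.

```typescript
// Hypothetical sketch; names and shapes are NOT the @tanstack/ai-sandbox API.
interface SketchTool {
  name: string
  description: string
  execute: (args: unknown) => unknown
}

// The only thing that crosses the boundary: a serialized tool-exec request.
type SketchExecutor = (requestJson: string) => string

interface SketchReply {
  ok: boolean
  result?: unknown
  error?: string
}

/** Orchestrator side: answer one tool-exec call with the real tool's execute(). */
function answerToolExec(realTools: SketchTool[], requestJson: string): string {
  const request = JSON.parse(requestJson) as { tool: string; args: unknown }
  const tool = realTools.filter((t) => t.name === request.tool)[0]
  if (!tool) {
    const miss: SketchReply = { ok: false, error: `unknown tool: ${request.tool}` }
    return JSON.stringify(miss)
  }
  const reply: SketchReply = { ok: true, result: tool.execute(request.args) }
  return JSON.stringify(reply)
}

/** Container side: rebuild tools from descriptors as stubs that only delegate. */
function makeStubs(
  descriptors: Array<{ name: string; description: string }>,
  executor: SketchExecutor,
): SketchTool[] {
  return descriptors.map((d) => ({
    name: d.name,
    description: d.description,
    execute: (args: unknown) => {
      const raw = executor(JSON.stringify({ tool: d.name, args }))
      const reply = JSON.parse(raw) as SketchReply
      if (!reply.ok) throw new Error(reply.error ?? 'tool-exec failed')
      return reply.result
    },
  }))
}

// The real tool lives only on the orchestrator side.
const realTools: SketchTool[] = [
  {
    name: 'add',
    description: 'Add two numbers',
    execute: (args) => {
      const { a, b } = args as { a: number; b: number }
      return a + b
    },
  },
]

const executor: SketchExecutor = (json) => answerToolExec(realTools, json)
const stubs = makeStubs(
  realTools.map(({ name, description }) => ({ name, description })),
  executor,
)
const sum = stubs[0].execute({ a: 2, b: 3 })
```

The stub carries no tool logic at all, which is why the public surface shrinks to one call: the container only ever sends a tool name and its arguments.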
…l edge agent DX
Promotes the CF serverless/edge orchestration out of the examples and into the
package, so an app's worker.ts is a single configured function call. The
coordinator, durable run-log, WebSocket streaming, tool-bridge, and in-container
runner all move into @tanstack/ai-sandbox-cloudflare.
New `@tanstack/ai-sandbox-cloudflare/agent` (Workers entry):
- SandboxCoordinator (abstract base): durable run-log, resumable hibernatable
WebSocket tail, startRun→pipe-under-waitUntil, watchdog alarm, status.
- ChatSandboxCoordinator (DO-drives) + ContainerSandboxCoordinator (co-located).
- createCloudflareSandboxAgent(config) → { Coordinator, Sandbox, worker };
createSandboxAgentWorker() router; DurableObjectRunEventLog;
timingSafeBearerEqualWeb (constant-time, no non-null assertions).
New `@tanstack/ai-sandbox-cloudflare/runner` (node/container entry):
- runInContainerHarness({ resolveAdapter }) — the in-container HTTP server +
chat() wiring for the co-located model; the app supplies only the adapter.
Both CF examples collapse to a ~10-line worker.ts (one createCloudflareSandboxAgent
call + the wrangler-required DO re-exports); coordinator/run-log/protocol files
deleted. Net −1,900 lines in examples. Build externalizes cloudflare:workers;
compile-only (not runtime-verified in-repo).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Address the PR #801 review findings on the edge/serverless run layer.
Critical
- coordinator: guard against duplicate concurrent WebSocket pumps on one socket (WeakSet) so a client message can't double-deliver events / corrupt the resume cursor; add WebSocket.OPEN guards before send/close.
- coordinator: open the run BEFORE building its stream, recording a RUN_ERROR + finish('error') if buildRunStream throws, so a build-time failure is still observed by tailing clients instead of hanging.
- chat/container coordinators: guard request.json() in serveBridge / serveToolExec (JSON-RPC -32700 / 400 instead of an opaque DO 500).
- coordinator: make the watchdog alarm a real backstop: fail runs stalled past WATCHDOG_STALL_MS (orchestrator presumed dead) and re-arm even on storage error.
Important
- remote-tools / container-coordinator: thread the abort signal + run context across the tool-exec boundary (per-run AbortController, fetch signal forwarding) so a cancelled run cancels in-flight host tools and tools receive their context.
- container-coordinator: parse NDJSON via parseChunkLine; unparseable/non-chunk lines become terminal RUN_ERROR chunks, never thrown or silently dropped.
- diagnostics: surface the last /health probe error from ensureRunner; log + 400-on-malformed-body in startHostToolBridge; server-side log in the tail pump.
- container-coordinator: memoize the in-flight runner boot (no EADDRINUSE race).
- run-log: derive InMemory seq from lastSeq+1 to match the DO; drop unused 'pending' status; add named TerminalRunStatus.
Tidy-ups
- contracts: make SandboxCapabilities.writableStdin a required boolean; set it on the docker / local-process providers.
- tool-bridge: replace the lone `as` in listTools with a guard (toObjectSchema).
- remote-tools: centralize ToolExecRequest + isToolExecRequest; protocol derives HarnessId from HARNESS_IDS.
- claude-code: clean up the per-run token / prompt temp files in finally.
- docs/skill: SKILL.md edge-execution section; bump docs updatedAt; clarify the co-located .dev.vars OAuth hint.
Tests
- new unit suites: parseContainerRunRequest branches, DurableObjectRunEventLog (incl. eviction/re-poll), handleBridgeJsonRpc branches + permission path, timingSafeBearerEqual / timingSafeBearerEqualWeb truth tables.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
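For the NDJSON item, the following is a rough sketch of the "never thrown, never silently dropped" behaviour, not the actual `parseChunkLine`. The chunk shape (`{ type: string, … }`), the `message` field on the error chunk, and the choice to skip blank separator lines are all assumptions made for the example.

```typescript
// Hypothetical sketch; the real parseChunkLine and chunk types may differ.
type SketchChunk = { type: string; [key: string]: unknown }

/** Type guard: an object with a string `type` is treated as a chunk. */
function isChunk(value: unknown): value is SketchChunk {
  return (
    typeof value === 'object' &&
    value !== null &&
    typeof (value as { type?: unknown }).type === 'string'
  )
}

/** Parse one line. Bad input becomes a RUN_ERROR chunk instead of an exception. */
function parseLineSketch(line: string): SketchChunk {
  let parsed: unknown
  try {
    parsed = JSON.parse(line)
  } catch {
    return { type: 'RUN_ERROR', message: `unparseable NDJSON line: ${line.slice(0, 80)}` }
  }
  if (!isChunk(parsed)) {
    return { type: 'RUN_ERROR', message: 'line is valid JSON but not a chunk' }
  }
  return parsed
}

/** Parse a whole NDJSON payload, stopping at the first terminal RUN_ERROR. */
function parseNdjsonSketch(text: string): SketchChunk[] {
  const chunks: SketchChunk[] = []
  for (const line of text.split('\n')) {
    if (line.trim() === '') continue // blank separator lines carry no data (assumption)
    const chunk = parseLineSketch(line)
    chunks.push(chunk)
    if (chunk.type === 'RUN_ERROR') break // terminal: nothing after it is emitted
  }
  return chunks
}

const chunks = parseNdjsonSketch(
  [
    '{"type":"RUN_STARTED"}',
    '{"type":"TEXT","delta":"hi"}',
    'not json at all',
    '{"type":"RUN_FINISHED"}',
  ].join('\n'),
)
```

Here the third line turns into a `RUN_ERROR` chunk that ends the stream, so a tailing client sees an explicit failure rather than a run that quietly stops.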
…stream
Two stacked bugs left runs stuck at `status:running` with nothing streamed
(claude never started its turn) in the Cloudflare sandbox agent:
- ai-claude-code: the adapter defaults `--permission-mode bypassPermissions`,
which claude maps to `--dangerously-skip-permissions` and refuses to run as
root. Sandbox containers run as root, so claude died instantly. Set
`IS_SANDBOX=1` in the CLI env (claude's documented escape hatch), merged over
any caller-provided env.
- ai-sandbox-cloudflare: `spawn()` used `@cloudflare/sandbox`'s background
process API (`startProcess` + `streamProcessLogs`), whose `onOutput`/`onExit`
callbacks never fire — so the stdout-NDJSON harness hung forever. Stream over
`exec({ stream: true, onOutput })` instead (the same proven path one-shot
`exec` uses) and resolve the exit code from its result. Do not forward the
caller's AbortSignal across the Durable Object RPC boundary (Workers RPC
cannot serialize an AbortSignal, which threw before the command ran); a failed
command now rejects `wait()` so the adapter surfaces a RUN_ERROR instead of a
silent zero-output run.
Verified end-to-end on Docker Desktop (`wrangler dev`): a run streams
RUN_STARTED → text deltas → RUN_FINISHED and settles `done`.
Also wires the Sandbox Agent page into the ts-code-mode-web example (nav link,
route, proxy, README) and bumps the example's deploy config.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
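"Merged over any caller-provided env" comes down to spread order. A tiny sketch, assuming a hypothetical `buildCliEnvSketch` helper (the adapter's actual function name and env type are not shown in this thread):

```typescript
// Hypothetical helper; only the IS_SANDBOX=1 override is taken from the commit message.
function buildCliEnvSketch(callerEnv: Record<string, string> = {}): Record<string, string> {
  // Later keys win in an object spread, so IS_SANDBOX overrides any caller value.
  return { ...callerEnv, IS_SANDBOX: '1' }
}

const cliEnv = buildCliEnvSketch({ IS_SANDBOX: '0', ANTHROPIC_API_KEY: 'placeholder' })
```

Reversing the order (`{ IS_SANDBOX: '1', ...callerEnv }`) would let a caller switch the flag off and bring back the instant-exit failure described above.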
…agent; drop colocated example
Convert `examples/sandbox-cloudflare-agent` into a TanStack Start app that ships
the UI, the agent, the run-coordinator Durable Object, and the `@cloudflare/sandbox`
container in one Cloudflare Worker (one `wrangler deploy`).
- `src/server.ts`: custom Cloudflare entry (`@cloudflare/vite-plugin` + TanStack
Start). Re-exports the `RunCoordinator`/`Sandbox` DO classes and composes
`proxyToSandbox` (preview ports) → `agent.worker` (`/runs`, `/_bridge`,
`/tool-exec`) → Start SSR (UI + `/api/*`). Typed via `satisfies ExportedHandler`.
- `src/agent.ts`: the `createCloudflareSandboxAgent({ adapter: claudeCodeText('sonnet') })`
config + demo host tool (DO-drives mode).
- `src/routes/index.tsx`: vanilla `useChat` chat UI.
- `src/routes/api.run.ts`: same-origin SSE proxy bridging the agent's
POST-then-WebSocket protocol; uses the workerd `fetch(Upgrade)` API.
- Generated `worker-configuration.d.ts` committed (`pnpm cf-typegen`); drop
`@cloudflare/workers-types` per wrangler's runtime-types guidance.
Delete `examples/sandbox-cloudflare-agent-colocated`. Colocation stays a supported
package mode (`mode: 'colocated'` + `runInContainerHarness`); the DO-drives vs
colocated tradeoff is documented in prose (docs/sandbox/overview.md, changeset,
example README) rather than exemplified.
Remove the now-superseded `_sandbox-agent` proxy route from
`examples/ts-code-mode-web` (its UI moved into the new example).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Restore examples/ts-code-mode-web to its main state. This branch had added a README and the `_sandbox-agent` proxy route (and the earlier commit removed the Header link + routeTree entries); none of it belongs here now that the edge sandbox agent is its own TanStack Start example. Net ts-code-mode-web diff vs main is now zero.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…agent
Make the sandbox-cloudflare-agent example demonstrate something real: ask the agent to build a TanStack AI chatbot.
- `tanstackAiRecipe` host tool (replaces the throwaway `lookup`): returns the current TanStack AI chatbot recipe (packages, server route, client hook, run step), grounded in docs/getting-started. The agent calls it before scaffolding, which exercises the `/_bridge` MCP path AND yields a working app. Recipe targets the Anthropic adapter so it reuses the sandbox's `ANTHROPIC_API_KEY`.
- Prompt suggestions + empty-state rewritten around the chatbot build → run → preview URL flow.
- Configurable sandbox env: an explicit `sandbox` resolver injects `createSecrets` from the Worker env, so adding a var is one line + `.dev.vars` (with a typed `AppEnv extends SandboxAgentEnv` extension documented inline). README gains a "Setting sandbox env" section and a Limitations note that env is host-controlled, not per-user (the run trigger carries only threadId + messages).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… Intent
Make the sandbox-cloudflare-agent example demonstrate something real and zero-config: ask the agent to build a self-contained TanStack Start app (kanban, dashboard, small game…) that runs with NO env, API keys, or external services, so its sandbox preview URL works for anyone.
- Dockerfile ships the `tanstack` CLI (`npm i -g @tanstack/cli`). The in-container `claude` is otherwise a bare install with none of the host's skills; rather than copy skill files in, we ship the CLI that provisions them. The agent scaffolds with `tanstack create … --intent`, which writes TanStack Intent skill mappings into the generated project for coding agents.
- `tanstackStartRecipe` host tool (replaces the throwaway `lookup`): bridged over the `/_bridge` MCP path, it returns the scaffold command (`tanstack create`), what to build (a no-env app), and the sandbox-specific run step (bind 0.0.0.0, expose the port, return the preview URL), i.e. the bits the generic skill can't know.
- Prompt suggestions + empty-state rewritten around the no-env build → run → preview URL flow.
- Configurable sandbox env: an explicit `sandbox` resolver injects `createSecrets` from the Worker env (the demo app needs none; `ANTHROPIC_API_KEY` is only for the `claude` CLI). README gains a "Setting sandbox env" section and a Limitations note that env is host-controlled, not per-user.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Turn "the agent built and ran an app" into "here's the running app" — a clickable
preview link in the chat UI.
- `exposePreview` host tool (bridged over `/_bridge`): the agent calls it with the
dev-server port once it's listening; the tool runs host-side, calls
`sandbox.exposePort(port, { hostname: PUBLIC_HOSTNAME })` on the run's container,
and returns the public URL. `proxyToSandbox` (already in server.ts) routes it back
into the container.
- `namedCloudflareSandbox` (src/sandbox-provider.ts): pins the container Durable
Object to the run's `threadId` instead of a random UUID, so the host tool can
address the same container the dev server runs in. Side benefit: deterministic
per-thread reuse across DO eviction.
- UI renders the `exposePreview` result as an "Open preview ↗" link.
- Recipe `run` step + README updated; documents the deployed-vs-local wildcard-DNS
caveat (preview URLs resolve on a deployed Worker, not under localhost).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…hability docs
The in-sandbox agent's container reaches the host over `PUBLIC_HOSTNAME` for the `tanstack` MCP tool-bridge (`/_bridge`) and preview ports. A local `pnpm dev` worker isn't reachable from that container, so runs failed with "the tanstack MCP server hasn't come up" (the bridge 404s against the wrong instance).
- `pnpm dev:tunnel` (scripts/dev-tunnel.mjs): starts a `cloudflared` quick tunnel, writes the assigned `*.trycloudflare.com` host into `.dev.vars` as `PUBLIC_HOSTNAME`, then runs vite, so the container can reach the bridge and agent runs work locally. (Quick tunnel = one hostname, so the bridge works but wildcard preview URLs still need a named tunnel or a deploy.)
- wrangler.jsonc `PUBLIC_HOSTNAME` comment + README Limitations now spell out the requirement and the exact "hasn't come up" symptom, and the README documents the tunnel workflow (quick vs named).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…e cloudflared)
The `@cloudflare/vite-plugin` has first-class Cloudflare Tunnel support and
downloads `cloudflared` itself — so the local-dev bridge tunnel needs no separate
install and no custom script.
- Drop `scripts/dev-tunnel.mjs`; `pnpm dev:tunnel` is now `TUNNEL=1 vite dev`,
which enables the plugin's `tunnel: { autoStart: true }`. (Or press `t + Enter`
in a running `pnpm dev`.)
- vite.config gates the tunnel on `TUNNEL` so plain `pnpm dev` is unchanged.
- README updated: point `PUBLIC_HOSTNAME` at the tunnel host (quick tunnel = copy
the printed `*.trycloudflare.com`; named tunnel = stable host + wildcard route
for previews).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…T_REQUEST_HOST)
The agent's container reaches the host over the bridge / tool-exec / preview URL. Those were built from a static `PUBLIC_HOSTNAME`, which doesn't fit a dev tunnel (dynamic host), leading to "the tanstack MCP server hasn't come up" locally.
Add `StartRunInput.publicHost`: the Worker captures the `POST /runs` request host and the coordinators can build the callback URL from it, so opening the UI at the tunnel URL "just works" with no hostname to copy into config.
SECURITY: the request `Host` is client-controlled and the per-run bearer token rides the derived URL, so trusting it blindly is a Host-injection / token-exfil vector. It is therefore **off by default**: `PUBLIC_HOSTNAME` stays authoritative unless `env.TRUST_REQUEST_HOST === '1'` (dev-only opt-in; never production). Both coordinators and the example's `exposePreview` gate on the same flag.
Example: `.dev.vars.example` documents `TRUST_REQUEST_HOST=1` (with the warning), and the README tunnel flow drops the manual `PUBLIC_HOSTNAME` copy in favor of it.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
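The gate described in this message can be sketched as a small pure function. `SketchEnv` and `resolveCallbackHostSketch` are hypothetical names, and throwing when nothing is configured is an assumption; only the `TRUST_REQUEST_HOST === '1'` check and the precedence of `PUBLIC_HOSTNAME` come from the commit message. The thread later records a revert of a TRUST_REQUEST_HOST commit, so read this as an illustration of the proposal rather than shipped behaviour.

```typescript
// Hypothetical sketch of the opt-in host gate; not code from the PR.
interface SketchEnv {
  PUBLIC_HOSTNAME?: string
  TRUST_REQUEST_HOST?: string
}

function resolveCallbackHostSketch(env: SketchEnv, requestHost: string | undefined): string {
  // The request Host is client-controlled, so it is honoured only on explicit opt-in.
  if (env.TRUST_REQUEST_HOST === '1' && requestHost) return requestHost
  // Otherwise the statically configured hostname stays authoritative.
  if (env.PUBLIC_HOSTNAME) return env.PUBLIC_HOSTNAME
  throw new Error('no public host configured') // assumption: fail closed
}

const prodHost = resolveCallbackHostSketch(
  { PUBLIC_HOSTNAME: 'agent.example.com' },
  'attacker.example.net',
)
const devHost = resolveCallbackHostSketch(
  { PUBLIC_HOSTNAME: 'agent.example.com', TRUST_REQUEST_HOST: '1' },
  'random-words.trycloudflare.com',
)
```

With the flag unset, a spoofed `Host` header has no effect on where the bearer-carrying callback URL points.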
…st (TRUST_REQUEST_HOST)" This reverts commit 4e752cc.
What this adds
The serverless/edge execution model for the sandbox layer: a trigger starts an agent run and returns immediately while a durable orchestrator drives it and clients tail the stream from a resumable cursor. This is the piece that makes the sandbox layer work where a request-scoped Worker can't hold a multi-minute run open.
Core (`@tanstack/ai-sandbox`)
- `RunEventLog` / `InMemoryRunEventLog`: append-only, `seq`-indexed, replay-then-tail. A dropped connection, a new tab, or a hibernated orchestrator reconnects by passing its last-seen `seq`.
- `pipeToRunLog` + `RunController` (start-without-blocking, resumable `attach`, `drain` for `waitUntil`-style flushing).
- Tool-bridge split into a transport-agnostic core (`createToolBridgeCore` / `handleBridgeJsonRpc`) + transport. `startHostToolBridge` remains the `node:http` host transport; an edge orchestrator serves the same core from its own `fetch` handler (no raw TCP listener). New `ToolBridgeProvisioner` capability injects the transport.
Adapters
- `claude-code`, `codex`, `gemini-cli`, `opencode` resolve the bridge from the `ToolBridgeProvisioner` capability (default = host transport) instead of hardcoding `node:http`. Host/Docker behaviour unchanged.
- Claude Code runs on Cloudflare: the `SandboxCapabilities.writableStdin` flag lets a provider advertise no writable host→process stdin; the adapter then delivers the prompt via a file + in-shell stdin-redirection (`claude -p … < file`) instead of a host stdin write.
Example
- `examples/sandbox-cloudflare-agent`: Worker (trigger → `202 {runId}`) → `RunCoordinator` Durable Object (drives the run under `ctx.waitUntil`, DO-storage-backed run-log, fetch-served MCP bridge, hibernatable WebSocket with cursor resume) → Container. Compile-only / not runtime-verified in CI (no Workers runtime here; examples aren't built by Nx); it typechecks against the real `@cloudflare/workers-types` + `@cloudflare/sandbox`. Run locally with `wrangler dev` (see its README).
Security (addresses findings from the #774 review)
- The tool-bridge bearer token is no longer passed in argv as `--mcp-config '{…Bearer…}'`; the config is written to a file and claude gets the path, so the token isn't readable via `ps` / `/proc/<pid>/cmdline`.
- Constant-time bearer compare (`timingSafeBearerEqual`; Web-Crypto variant in the DO).
- The `node:http` transport binds loopback by default, widening to `0.0.0.0` only for the Docker (`host.docker.internal`) case; the edge transport opens no TCP listener at all.
- … `ANTHROPIC_API_KEY`. Cloudflare's own tutorial keeps the key out of the container via an `outboundByHost` proxy that swaps a sentinel on egress; worth adopting in a follow-up.
Verification
- `@tanstack/ai-sandbox`: 119 tests (incl. new `run-log` + `run` suites), tsc + eslint clean.
- `claude-code` / `codex` / `gemini-cli` / `opencode`: tsc + tests + eslint clean.
- `ai-sandbox-cloudflare` + the example: tsc clean (example against real CF SDK types).
🤖 Generated with Claude Code
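As a rough illustration of the constant-time bearer comparison mentioned under Security, the sketch below accumulates differences over every character instead of returning at the first mismatch. It is not `timingSafeBearerEqual` or its Web-Crypto variant; the function names are hypothetical, and plain JavaScript cannot strictly guarantee constant time under a JIT. On Node, `crypto.timingSafeEqual` is the standard primitive for this.

```typescript
// Hypothetical sketch; NOT the timingSafeBearerEqual implementation from this PR.
function timingSafeEqualSketch(a: string, b: string): boolean {
  // Loop over the longer input so the iteration count does not reveal
  // the position of the first mismatching character.
  const length = Math.max(a.length, b.length)
  let diff = a.length ^ b.length // a length mismatch already counts as a difference
  for (let i = 0; i < length; i++) {
    // charCodeAt past the end yields NaN; `| 0` coerces it to 0.
    diff |= (a.charCodeAt(i) | 0) ^ (b.charCodeAt(i) | 0)
  }
  return diff === 0
}

/** Check an Authorization header value against the expected per-run token. */
function bearerMatchesSketch(authorizationHeader: string | null, expectedToken: string): boolean {
  const prefix = 'Bearer '
  if (authorizationHeader === null) return false
  if (authorizationHeader.slice(0, prefix.length) !== prefix) return false
  return timingSafeEqualSketch(authorizationHeader.slice(prefix.length), expectedToken)
}

const accepted = bearerMatchesSketch('Bearer s3cret', 's3cret')
const rejected = bearerMatchesSketch('Bearer s3creT', 's3cret')
```

An early-exit `===` comparison leaks, through response timing, how many leading characters of a guess were right; folding every position into `diff` is what removes that signal.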