rivet-dev · NathanFlurry · Jun 24, 2026
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -5,13 +5,40 @@ Agent OS is the agent-facing wrapper around secure-exec. It provides ACP session
 ## Boundaries
 
 - secure-exec dependency workflow. Manage the secure-exec dependency ONLY through `scripts/secure-exec-dep.mjs` (the `just secure-exec-*` recipes); never hand-edit the `path` / `version` / `catalog:` pins.
-  - Testing against local secure-exec changes: run `just secure-exec-local` to repoint npm (`link:`) and crates (`path = "../secure-exec/..."`) at the sibling checkout, then `node scripts/secure-exec-dep.mjs set-crate-version <sibling-version>` so the Cargo version requirement matches the sibling crate version (otherwise cargo cannot resolve the path deps). Use `just secure-exec-status` to inspect. This mode is for local builds/tests ONLY.
+  - Testing against local secure-exec changes: run `just secure-exec-local` to repoint npm (`link:`) and crates (`path = "../secure-exec/..."`) at the sibling checkout, then `node scripts/secure-exec-dep.mjs set-crate-version <sibling-version>` so the Cargo version requirement matches the sibling crate version (otherwise cargo cannot resolve the path deps). Run `pnpm install` in `../secure-exec` before building; otherwise the cargo build panics in `v8-runtime/build.rs` with "missing Node dependencies at .../packages/build-tools/node_modules" (the V8 bridge assets are built from there). Use `just secure-exec-status` to inspect. This mode is for local builds/tests ONLY.
   - Pushing changes that depend on secure-exec changes: NEVER push with local (`path:` / `link:`) dependencies. First preview-publish the secure-exec changes to their own secure-exec branch (the `preview-publish-secure-exec` flow), then point agent-os back at that exact published version with `just secure-exec-pinned` + `just secure-exec-set-version <version>` (and `set-crate-version <version>` for the crates). Only commit/push the pinned-to-remote state.
 - Keep generic runtime, kernel, VFS, language execution, and registry software behavior in secure-exec.
 - Agent OS owns ACP, sessions, agent adapters, toolkit semantics, quickstarts, and the AgentOs facade.
 - Call OS instances VMs, never sandboxes.
 - The protocol has no backwards compatibility. Clients and the sidecar ship in same-version lockstep, so never add protocol or config versioning, runtime negotiation, fallbacks, or converters. Configs such as `CreateVmConfig` carry no `version` field; the single same-version wire handshake is the only version check. Change the protocol freely and update both sides together.
 
+## Development
+
+### secure-exec dependency versions (`just`)
+
+Two independent version tracks:
+- **secure-exec** — the `@secure-exec/*` npm packages and the `secure-exec-*` Cargo crates always share **one** version (npm and crates are kept in sync; pin both to the same `<v>`).
+- **`@agentos-software/*`** software packages (registry agents / WASM commands) are on a **separate** track and version independently of secure-exec.
+
+Manage them ONLY via these recipes (never hand-edit `path`/`version`/`catalog:` pins):
+- `just secure-exec-local` — point deps at the sibling `../secure-exec` checkout for local hacking.
+- `just secure-exec-set-version <v>` — pin secure-exec to a published version: sets the `@secure-exec/*` npm packages **and** the `secure-exec-*` crates (same `<v>`, they're in sync) and switches to pinned mode.
+- `just agentos-pkgs-set-version <v>` — pin the `@agentos-software/*` software packages (separate version track).
+
+### Depending on unreleased secure-exec changes
+
+agent-os builds against the secure-exec crates and npm packages, so a secure-exec change must reach agent-os through a published version before the dependent agent-os change can be pushed. NEVER push with local (`path:`/`link:`) deps. Flow: preview-publish the secure-exec branch (the `preview-publish-secure-exec` skill), then run `just secure-exec-set-version <published-version>` (pins npm + crates and switches to pinned mode), and push only that pinned state. Caveat: a preview publishes to npm, but the crates.io job is dry-run/skipped, so a secure-exec *crate* change only flows locally (`secure-exec-local`) or via a real crates.io release.
+
+### Preview-publishing agent-os
+
+`just preview-publish <branch>` dispatches `.github/workflows/publish.yaml` to cut a **preview** (debug build, npm-only, dist-tag = sanitized branch name) — for handing a build to an external project. **Preview-publish is for previews ONLY; never cut a release with it.** Releases go through `just release` (the `scripts/publish` flow).
+
+### Testing a local build from an external project (same machine)
+
+To consume an unpublished agent-os build in another project on this machine:
+- **npm:** `pnpm -r build`, then either `pnpm pack` the package(s) and `npm install ./rivet-dev-agentos-*.tgz` in the external project, or add a `link:`/`file:` override (e.g. `"@rivet-dev/agentos": "link:/abs/path/agent-os/packages/agentos"`). The sidecar binary ships as `@rivet-dev/agentos-sidecar`.
+- **cargo:** point the external Cargo project at the local crate via a path dep or `[patch.crates-io]` override (e.g. `[patch.crates-io] agentos-sidecar = { path = "/abs/path/agent-os/crates/agentos-sidecar" }`).
+
 ## Security Model
 
 Trust model (decide which side of the boundary something is on before judging whether it is a security bug). Three components:
@@ -41,6 +68,7 @@ Trust model (decide which side of the boundary something is on before judging wh
 
 ## Website And Docs
 
+- External/consumer usage (installing `@rivet-dev/agentos` and using it in your own project) is documented in the website quickstart + Agents/Custom Software pages under `website/`, not in this file. This `CLAUDE.md` is contributor/maintainer-only.
 - The Agent OS website and docs live in `website/` (Astro + Starlight) and deploy to `agentos-sdk.dev` (docs at `agentos-sdk.dev/docs`). The marketing pages and docs were migrated out of `rivet.dev/agent-os` and `rivet.dev/docs/agent-os`, which now 301-redirect to this domain.
 - Docs styling is owned by the shared **`@rivet-dev/docs-theme`** repo (`github.com/rivet-dev/docs-theme`), consumed via `github:rivet-dev/docs-theme#<tag>` and wired in via `...docsTheme(starlight, siteConfig)`. To change any docs styling (palette, header, sidebar, code blocks, fonts), edit that repo and follow its CLAUDE.md release workflow — never restyle docs in `website/src`. This site owns only content + `website/docs.config.mjs` (sidebar icons via each item's `attrs['data-icon']`).
 - Architecture reference docs live in `website/src/content/docs/docs/architecture/` and are surfaced in `website/docs.config.mjs` under Reference → Advanced → Architecture. Treat these pages as the canonical human-facing architecture reference. When architecture behavior changes or new architecture is added, recommend the corresponding docs update to the user; do not proactively edit the docs unless the user asks for docs work or the task explicitly includes it.

diff --git a/Cargo.lock b/Cargo.lock
diff --git a/README.md b/README.md
@@ -14,14 +14,14 @@
 ## Why agentOS
 
 - **Runs inside your process**: No VMs to boot, no containers to pull. Agents start in milliseconds with minimal memory overhead.
-- **Embeds in your backend**: Agents call your functions directly via [host tools](https://rivet.dev/docs/agent-os/tools). No network hops, no complex auth between services.
+- **Embeds in your backend**: Agents call your functions directly via [bindings](https://agentos-sdk.dev/docs/bindings). No network hops, no complex auth between services.
 - **Granular security**: Deny-by-default permissions for filesystem, network, and process access. The same isolation technology trusted by browsers worldwide.
 - **Deploy anywhere**: Just an npm package. Works on your laptop, Rivet Cloud, Railway, Vercel, Kubernetes, or any container platform.
 - **Open source**: Apache 2.0 licensed. Self-host or use [Rivet Cloud](https://rivet.dev/docs/agent-os/deployment) for managed infrastructure.
 
 ### agentOS vs Sandbox
 
-agentOS is a lightweight VM that runs inside your process. Sandboxes are full Linux environments. agentOS integrates agents into your backend with [host tools](https://rivet.dev/docs/agent-os/tools) and granular permissions. Sandboxes give you a full OS for browsers, native binaries, and dev servers.
+agentOS is a lightweight VM that runs inside your process. Sandboxes are full Linux environments. agentOS integrates agents into your backend with [bindings](https://agentos-sdk.dev/docs/bindings) and granular permissions. Sandboxes give you a full OS for browsers, native binaries, and dev servers.
 
 You don't have to choose: agentOS works with sandboxes through the [sandbox extension](https://rivet.dev/docs/agent-os/sandbox), spinning up a full sandbox on demand and mounting the sandbox's file system when the workload needs it.
 
@@ -114,13 +114,13 @@ All benchmarks compare agentOS against the fastest/cheapest mainstream sandbox p
 
 ### Infrastructure
 - **[Mount external storage as a filesystem](https://rivet.dev/docs/agent-os/filesystem)**: S3-compatible storage, Google Drive, host directories, overlay filesystems, or custom backends
-- **[Host tools](https://rivet.dev/docs/agent-os/tools)**: Define JavaScript functions that agents call as CLI commands inside the VM
+- **[Bindings](https://agentos-sdk.dev/docs/bindings)**: Define JavaScript functions that agents call as CLI commands inside the VM
 - **[Cron](https://rivet.dev/docs/agent-os/cron), [webhooks](https://rivet.dev/docs/agent-os/webhooks), and [queues](https://rivet.dev/docs/agent-os/queues)**: Schedule tasks, receive external events, and serialize work with built-in primitives
 - **[Sandbox extension](https://rivet.dev/docs/agent-os/sandbox)**: Pair with full sandboxes (E2B, Daytona, etc.) for heavy workloads like browsers or native compilation
 
 ### Orchestration
 - **[Multiplayer](https://rivet.dev/docs/agent-os/multiplayer)**: Multiple clients observe and collaborate with the same agent in real time
-- **[Agent-to-agent](https://rivet.dev/docs/agent-os/agent-to-agent)**: Agents delegate work to other agents through host-defined tools
+- **[Agent-to-agent](https://rivet.dev/docs/agent-os/agent-to-agent)**: Agents delegate work to other agents through host-defined bindings
 - **[Workflows](https://rivet.dev/docs/agent-os/workflows)**: Chain agent tasks into durable workflows with retries, branching, and resumable execution
 - **[Authentication](https://rivet.dev/docs/agent-os/authentication)**: Integrate with your existing auth model (API keys, OAuth, JWTs)
 

diff --git a/crates/agentos-actor-plugin/src/config.rs b/crates/agentos-actor-plugin/src/config.rs
@@ -17,6 +17,11 @@ use anyhow::{Context, Result};
 /// Serializable mirror of [`AgentOsConfig`]. `deny_unknown_fields` enforces
 /// fail-loud behavior when callers pass fields outside this allow-list
 /// (including non-serializable fields like `schedule_driver`).
+///
+/// Keep this struct in sync with
+/// `packages/agentos/src/config.ts::nativeAgentOsOptionsSchema` and
+/// `packages/agentos/src/actor.ts::buildConfigJson`; TS preflight validation
+/// should reject the same native-boundary fields before this serde guard runs.
 #[derive(serde::Deserialize, Default, Clone)]
 #[serde(deny_unknown_fields, rename_all = "camelCase")]
 pub(crate) struct AgentOsConfigJson {

diff --git a/crates/agentos-sidecar/Cargo.toml b/crates/agentos-sidecar/Cargo.toml
@@ -18,9 +18,11 @@ agentos-protocol = { workspace = true }
 serde_json = "1.0"
 serde_bare = "0.5"
 secure-exec-sidecar = { workspace = true }
-tokio = { version = "1", features = ["sync", "time"] }
+tokio = { version = "1", features = ["sync", "time", "macros"] }
 tracing = "0.1"
-tracing-subscriber = { version = "0.3", features = ["fmt"] }
+tracing-subscriber = { version = "0.3", features = ["fmt", "env-filter"] }
+tracing-logfmt = { version = "0.3", features = ["ansi_logs"] }
+tracing-appender = "0.2"
 
 [dev-dependencies]
 agentos-bridge = { workspace = true }
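Reviewer note: the `macros` feature added to `tokio` in this hunk is what enables `tokio::select!`, which the stall watchdog in the `acp_extension.rs` change relies on. The pattern there is: keep polling the in-flight work, and each time a fixed interval passes without a result, log a warning and keep waiting, never cancelling the work. The sketch below is a std-only analogue of that pattern, not the tokio code from this diff. It swaps the async `select!` loop for a worker thread plus `mpsc::Receiver::recv_timeout`, and the name `run_with_stall_watchdog` and its callback parameter are hypothetical, chosen only for illustration.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical helper: runs `work` on a worker thread and blocks until it
/// finishes. Each time `interval` elapses without a result, `on_stall` is
/// called with the total elapsed time. The watchdog only reports; it never
/// cancels or interrupts `work`.
fn run_with_stall_watchdog<T, F, W>(work: F, interval: Duration, mut on_stall: W) -> T
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
    W: FnMut(Duration),
{
    let start = Instant::now();
    let (tx, rx) = mpsc::channel();

    let worker = thread::spawn(move || {
        // Sending fails only if the receiver was dropped, which cannot happen
        // while the caller is still blocked in the loop below.
        let _ = tx.send(work());
    });

    let result = loop {
        match rx.recv_timeout(interval) {
            // The work finished: stop watching and hand back its result.
            Ok(value) => break value,
            // Nothing arrived within `interval`: report the stall, keep waiting.
            Err(mpsc::RecvTimeoutError::Timeout) => on_stall(start.elapsed()),
            // The sender was dropped without a value, so the worker panicked.
            Err(mpsc::RecvTimeoutError::Disconnected) => {
                panic!("worker thread exited without producing a result")
            }
        }
    };

    worker.join().expect("worker thread panicked");
    result
}
```

As in the diff, the timeout restarts on every loop iteration, so warnings repeat at roughly `interval` spacing for as long as the work is pending. A caller might pass `Duration::from_secs(10)` and a closure that logs the elapsed milliseconds, mirroring the 10s warning cadence in the actual change.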

diff --git a/crates/agentos-sidecar/src/acp_extension.rs b/crates/agentos-sidecar/src/acp_extension.rs
@@ -109,22 +109,74 @@ impl AcpExtension {
         ctx: ExtensionContext<'_>,
         payload: &[u8],
     ) -> Result<ExtensionResponse, SidecarError> {
+        use tracing::Instrument as _;
         let request = decode_request(payload)?;
-        let response = match request {
-            AcpRequest::AcpCreateSessionRequest(request) => self.create_session(ctx, request).await,
-            AcpRequest::AcpGetSessionStateRequest(request) => {
-                AcpHandlerOutput::response(self.get_session_state(ctx, request).await)
+        let kind = Self::acp_request_kind(&request);
+        let start = std::time::Instant::now();
+        tracing::info!(target: "agentos_sidecar::acp_extension", kind, "ext request received");
+
+        let work = async move {
+            match request {
+                AcpRequest::AcpCreateSessionRequest(request) => {
+                    self.create_session(ctx, request).await
+                }
+                AcpRequest::AcpGetSessionStateRequest(request) => {
+                    AcpHandlerOutput::response(self.get_session_state(ctx, request).await)
+                }
+                AcpRequest::AcpCloseSessionRequest(request) => {
+                    AcpHandlerOutput::response(self.close_session(ctx, request).await)
+                }
+                AcpRequest::AcpSessionRequest(request) => self.session_request(ctx, request).await,
+                AcpRequest::AcpResumeSessionRequest(request) => {
+                    self.resume_session(ctx, request).await
+                }
             }
-            AcpRequest::AcpCloseSessionRequest(request) => {
-                AcpHandlerOutput::response(self.close_session(ctx, request).await)
+        }
+        .instrument(tracing::info_span!(
+            target: "agentos_sidecar::acp_extension",
+            "ext.request",
+            kind
+        ));
+
+        // Stall watchdog: while the request is in flight, warn periodically so a
+        // hang surfaces as a breadcrumb long before the host's 120s frame
+        // timeout. This never interrupts the work itself.
+        tokio::pin!(work);
+        let response = loop {
+            tokio::select! {
+                result = &mut work => break result,
+                _ = tokio::time::sleep(std::time::Duration::from_secs(10)) => {
+                    tracing::warn!(
+                        target: "agentos_sidecar::acp_extension",
+                        kind,
+                        elapsed_ms = start.elapsed().as_millis() as u64,
+                        "ext request still pending — possible stall before response frame",
+                    );
+                }
             }
-            AcpRequest::AcpSessionRequest(request) => self.session_request(ctx, request).await,
-            AcpRequest::AcpResumeSessionRequest(request) => self.resume_session(ctx, request).await,
         };
+
+        tracing::info!(
+            target: "agentos_sidecar::acp_extension",
+            kind,
+            elapsed_ms = start.elapsed().as_millis() as u64,
+            "ext request handled",
+        );
         let payload = encode_response(response.response.unwrap_or_else(error_response))?;
         ExtensionResponse::with_wire_events(payload, response.events)
     }
 
+    /// Stable label for an ACP request kind, used as a tracing field.
+    fn acp_request_kind(request: &AcpRequest) -> &'static str {
+        match request {
+            AcpRequest::AcpCreateSessionRequest(_) => "create_session",
+            AcpRequest::AcpGetSessionStateRequest(_) => "get_session_state",
+            AcpRequest::AcpCloseSessionRequest(_) => "close_session",
+            AcpRequest::AcpSessionRequest(_) => "session_request",
+            AcpRequest::AcpResumeSessionRequest(_) => "resume_session",
+        }
+    }
+
     async fn create_session(
         &self,
         mut ctx: ExtensionContext<'_>,