Merged
30 changes: 29 additions & 1 deletion CLAUDE.md
@@ -5,13 +5,40 @@ Agent OS is the agent-facing wrapper around secure-exec. It provides ACP session
## Boundaries

- secure-exec dependency workflow. Manage the secure-exec dependency ONLY through `scripts/secure-exec-dep.mjs` (the `just secure-exec-*` recipes); never hand-edit the `path` / `version` / `catalog:` pins.
- Testing against local secure-exec changes: run `just secure-exec-local` to repoint npm (`link:`) and crates (`path = "../secure-exec/..."`) at the sibling checkout, then `node scripts/secure-exec-dep.mjs set-crate-version <sibling-version>` so the Cargo version requirement matches the sibling crate version (otherwise cargo cannot resolve the path deps). Use `just secure-exec-status` to inspect. This mode is for local builds/tests ONLY.
- Testing against local secure-exec changes: run `just secure-exec-local` to repoint npm (`link:`) and crates (`path = "../secure-exec/..."`) at the sibling checkout, then `node scripts/secure-exec-dep.mjs set-crate-version <sibling-version>` so the Cargo version requirement matches the sibling crate version (otherwise cargo cannot resolve the path deps). Also run `pnpm install` in `../secure-exec` first, or cargo panics in `v8-runtime/build.rs` with "missing Node dependencies at .../packages/build-tools/node_modules" (the V8 bridge assets are built from there). Use `just secure-exec-status` to inspect. This mode is for local builds/tests ONLY.
- Pushing changes that depend on secure-exec changes: NEVER push with local (`path:` / `link:`) dependencies. First preview-publish the secure-exec changes to their own secure-exec branch (the `preview-publish-secure-exec` flow), then point agent-os back at that exact published version with `just secure-exec-pinned` + `just secure-exec-set-version <version>` (and `set-crate-version <version>` for the crates). Only commit/push the pinned-to-remote state.
- Keep generic runtime, kernel, VFS, language execution, and registry software behavior in secure-exec.
- Agent OS owns ACP, sessions, agent adapters, toolkit semantics, quickstarts, and the AgentOs facade.
- Call OS instances VMs, never sandboxes.
- The protocol has no backwards compatibility. Clients and the sidecar ship in same-version lockstep, so never add protocol or config versioning, runtime negotiation, fallbacks, or converters. Configs such as `CreateVmConfig` carry no `version` field; the single same-version wire handshake is the only version check. Change the protocol freely and update both sides together.

## Development

### secure-exec dependency versions (`just`)

Two independent version tracks:
- **secure-exec** — the `@secure-exec/*` npm packages and the `secure-exec-*` Cargo crates always share **one** version (npm and crates are kept in sync; pin both to the same `<v>`).
- **`@agentos-software/*`** software packages (registry agents / WASM commands) are on a **separate** track and version independently of secure-exec.

Manage them ONLY via these recipes (never hand-edit `path`/`version`/`catalog:` pins):
- `just secure-exec-local` — point deps at the sibling `../secure-exec` checkout for local hacking.
- `just secure-exec-set-version <v>` — pin secure-exec to a published version: sets the `@secure-exec/*` npm packages **and** the `secure-exec-*` crates (same `<v>`, they're in sync) and switches to pinned mode.
- `just agentos-pkgs-set-version <v>` — pin the `@agentos-software/*` software packages (separate version track).

### Depending on unreleased secure-exec changes

agent-os builds against secure-exec crates + npm packages, so a secure-exec change must reach agent-os before it can be pushed. NEVER push with local (`path:`/`link:`) deps. Flow: preview-publish the secure-exec branch (the `preview-publish-secure-exec` skill), then `just secure-exec-set-version <published-version>` (pins npm + crates + switches to pinned mode), and push only that pinned state. Caveat: a preview publishes npm but the crates.io job is dry-run/skipped — a secure-exec *crate* change only flows locally (`secure-exec-local`) or via a real crates.io release.

### Preview-publishing agent-os

`just preview-publish <branch>` dispatches `.github/workflows/publish.yaml` to cut a **preview** (debug build, npm-only, dist-tag = sanitized branch name) — for handing a build to an external project. **Preview-publish is for previews ONLY; never cut a release with it.** Releases go through `just release` (the `scripts/publish` flow).

### Testing a local build from an external project (same machine)

To consume an unpublished agent-os build in another project on this machine:
- **npm:** `pnpm -r build`, then either `pnpm pack` the package(s) and `npm install ./rivet-dev-agentos-*.tgz` in the external project, or add a `link:`/`file:` override (e.g. `"@rivet-dev/agentos": "link:/abs/path/agent-os/packages/agentos"`). The sidecar binary ships as `@rivet-dev/agentos-sidecar`.
- **cargo:** point the external Cargo project at the local crate via a path dep or `[patch.crates-io]` override (e.g. `[patch.crates-io] agentos-sidecar = { path = "/abs/path/agent-os/crates/agentos-sidecar" }`).

## Security Model

Trust model (decide which side of the boundary something is on before judging whether it is a security bug). Three components:
@@ -41,6 +68,7 @@ Trust model (decide which side of the boundary something is on before judging wh

## Website And Docs

- External/consumer usage (installing `@rivet-dev/agentos` and using it in your own project) is documented in the website quickstart + Agents/Custom Software pages under `website/`, not in this file. This `CLAUDE.md` is contributor/maintainer-only.
- The Agent OS website and docs live in `website/` (Astro + Starlight) and deploy to `agentos-sdk.dev` (docs at `agentos-sdk.dev/docs`). The marketing pages and docs were migrated out of `rivet.dev/agent-os` and `rivet.dev/docs/agent-os`, which now 301-redirect to this domain.
- Docs styling is owned by the shared **`@rivet-dev/docs-theme`** repo (`github.com/rivet-dev/docs-theme`), consumed via `github:rivet-dev/docs-theme#<tag>` and wired in via `...docsTheme(starlight, siteConfig)`. To change any docs styling (palette, header, sidebar, code blocks, fonts), edit that repo and follow its CLAUDE.md release workflow — never restyle docs in `website/src`. This site owns only content + `website/docs.config.mjs` (sidebar icons via each item's `attrs['data-icon']`).
- Architecture reference docs live in `website/src/content/docs/docs/architecture/` and are surfaced in `website/docs.config.mjs` under Reference → Advanced → Architecture. Treat these pages as the canonical human-facing architecture reference. When architecture behavior changes or new architecture is added, recommend the corresponding docs update to the user; do not proactively edit the docs unless the user asks for docs work or the task explicitly includes it.
47 changes: 47 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

8 changes: 4 additions & 4 deletions README.md
@@ -14,14 +14,14 @@
## Why agentOS

- **Runs inside your process**: No VMs to boot, no containers to pull. Agents start in milliseconds with minimal memory overhead.
- **Embeds in your backend**: Agents call your functions directly via [host tools](https://rivet.dev/docs/agent-os/tools). No network hops, no complex auth between services.
- **Embeds in your backend**: Agents call your functions directly via [bindings](https://agentos-sdk.dev/docs/bindings). No network hops, no complex auth between services.
- **Granular security**: Deny-by-default permissions for filesystem, network, and process access. The same isolation technology trusted by browsers worldwide.
- **Deploy anywhere**: Just an npm package. Works on your laptop, Rivet Cloud, Railway, Vercel, Kubernetes, or any container platform.
- **Open source**: Apache 2.0 licensed. Self-host or use [Rivet Cloud](https://rivet.dev/docs/agent-os/deployment) for managed infrastructure.

### agentOS vs Sandbox

agentOS is a lightweight VM that runs inside your process. Sandboxes are full Linux environments. agentOS integrates agents into your backend with [host tools](https://rivet.dev/docs/agent-os/tools) and granular permissions. Sandboxes give you a full OS for browsers, native binaries, and dev servers.
agentOS is a lightweight VM that runs inside your process. Sandboxes are full Linux environments. agentOS integrates agents into your backend with [bindings](https://agentos-sdk.dev/docs/bindings) and granular permissions. Sandboxes give you a full OS for browsers, native binaries, and dev servers.

You don't have to choose: agentOS works with sandboxes through the [sandbox extension](https://rivet.dev/docs/agent-os/sandbox), spinning up a full sandbox on demand and mounting the sandbox's file system when the workload needs it.

@@ -114,13 +114,13 @@ All benchmarks compare agentOS against the fastest/cheapest mainstream sandbox p

### Infrastructure
- **[Mount external storage as a filesystem](https://rivet.dev/docs/agent-os/filesystem)**: S3-compatible storage, Google Drive, host directories, overlay filesystems, or custom backends
- **[Host tools](https://rivet.dev/docs/agent-os/tools)**: Define JavaScript functions that agents call as CLI commands inside the VM
- **[Bindings](https://agentos-sdk.dev/docs/bindings)**: Define JavaScript functions that agents call as CLI commands inside the VM
- **[Cron](https://rivet.dev/docs/agent-os/cron), [webhooks](https://rivet.dev/docs/agent-os/webhooks), and [queues](https://rivet.dev/docs/agent-os/queues)**: Schedule tasks, receive external events, and serialize work with built-in primitives
- **[Sandbox extension](https://rivet.dev/docs/agent-os/sandbox)**: Pair with full sandboxes (E2B, Daytona, etc.) for heavy workloads like browsers or native compilation

### Orchestration
- **[Multiplayer](https://rivet.dev/docs/agent-os/multiplayer)**: Multiple clients observe and collaborate with the same agent in real time
- **[Agent-to-agent](https://rivet.dev/docs/agent-os/agent-to-agent)**: Agents delegate work to other agents through host-defined tools
- **[Agent-to-agent](https://rivet.dev/docs/agent-os/agent-to-agent)**: Agents delegate work to other agents through host-defined bindings
- **[Workflows](https://rivet.dev/docs/agent-os/workflows)**: Chain agent tasks into durable workflows with retries, branching, and resumable execution
- **[Authentication](https://rivet.dev/docs/agent-os/authentication)**: Integrate with your existing auth model (API keys, OAuth, JWTs)

5 changes: 5 additions & 0 deletions crates/agentos-actor-plugin/src/config.rs
@@ -17,6 +17,11 @@ use anyhow::{Context, Result};
/// Serializable mirror of [`AgentOsConfig`]. `deny_unknown_fields` enforces
/// fail-loud behavior when callers pass fields outside this allow-list
/// (including non-serializable fields like `schedule_driver`).
///
/// Keep this struct in sync with
/// `packages/agentos/src/config.ts::nativeAgentOsOptionsSchema` and
/// `packages/agentos/src/actor.ts::buildConfigJson`; TS preflight validation
/// should reject the same native-boundary fields before this serde guard runs.
#[derive(serde::Deserialize, Default, Clone)]
#[serde(deny_unknown_fields, rename_all = "camelCase")]
pub(crate) struct AgentOsConfigJson {
6 changes: 4 additions & 2 deletions crates/agentos-sidecar/Cargo.toml
@@ -18,9 +18,11 @@ agentos-protocol = { workspace = true }
serde_json = "1.0"
serde_bare = "0.5"
secure-exec-sidecar = { workspace = true }
tokio = { version = "1", features = ["sync", "time"] }
tokio = { version = "1", features = ["sync", "time", "macros"] }
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["fmt"] }
tracing-subscriber = { version = "0.3", features = ["fmt", "env-filter"] }
tracing-logfmt = { version = "0.3", features = ["ansi_logs"] }
tracing-appender = "0.2"

[dev-dependencies]
agentos-bridge = { workspace = true }
68 changes: 60 additions & 8 deletions crates/agentos-sidecar/src/acp_extension.rs
@@ -109,22 +109,74 @@ impl AcpExtension {
ctx: ExtensionContext<'_>,
payload: &[u8],
) -> Result<ExtensionResponse, SidecarError> {
use tracing::Instrument as _;
let request = decode_request(payload)?;
let response = match request {
AcpRequest::AcpCreateSessionRequest(request) => self.create_session(ctx, request).await,
AcpRequest::AcpGetSessionStateRequest(request) => {
AcpHandlerOutput::response(self.get_session_state(ctx, request).await)
let kind = Self::acp_request_kind(&request);
let start = std::time::Instant::now();
tracing::info!(target: "agentos_sidecar::acp_extension", kind, "ext request received");

let work = async move {
match request {
AcpRequest::AcpCreateSessionRequest(request) => {
self.create_session(ctx, request).await
}
AcpRequest::AcpGetSessionStateRequest(request) => {
AcpHandlerOutput::response(self.get_session_state(ctx, request).await)
}
AcpRequest::AcpCloseSessionRequest(request) => {
AcpHandlerOutput::response(self.close_session(ctx, request).await)
}
AcpRequest::AcpSessionRequest(request) => self.session_request(ctx, request).await,
AcpRequest::AcpResumeSessionRequest(request) => {
self.resume_session(ctx, request).await
}
}
AcpRequest::AcpCloseSessionRequest(request) => {
AcpHandlerOutput::response(self.close_session(ctx, request).await)
}
.instrument(tracing::info_span!(
target: "agentos_sidecar::acp_extension",
"ext.request",
kind
));

// Stall watchdog: while the request is in flight, warn periodically so a
// hang surfaces as a breadcrumb long before the host's 120s frame
// timeout. This never interrupts the work itself.
tokio::pin!(work);
let response = loop {
tokio::select! {
result = &mut work => break result,
_ = tokio::time::sleep(std::time::Duration::from_secs(10)) => {
tracing::warn!(
target: "agentos_sidecar::acp_extension",
kind,
elapsed_ms = start.elapsed().as_millis() as u64,
"ext request still pending — possible stall before response frame",
);
}
}
AcpRequest::AcpSessionRequest(request) => self.session_request(ctx, request).await,
AcpRequest::AcpResumeSessionRequest(request) => self.resume_session(ctx, request).await,
};

tracing::info!(
target: "agentos_sidecar::acp_extension",
kind,
elapsed_ms = start.elapsed().as_millis() as u64,
"ext request handled",
);
let payload = encode_response(response.response.unwrap_or_else(error_response))?;
ExtensionResponse::with_wire_events(payload, response.events)
}

/// Stable label for an ACP request kind, used as a tracing field.
fn acp_request_kind(request: &AcpRequest) -> &'static str {
match request {
AcpRequest::AcpCreateSessionRequest(_) => "create_session",
AcpRequest::AcpGetSessionStateRequest(_) => "get_session_state",
AcpRequest::AcpCloseSessionRequest(_) => "close_session",
AcpRequest::AcpSessionRequest(_) => "session_request",
AcpRequest::AcpResumeSessionRequest(_) => "resume_session",
}
}

async fn create_session(
&self,
mut ctx: ExtensionContext<'_>,
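
The stall watchdog in the hunk above polls the in-flight request and logs a warning every 10 seconds without cancelling it. As a rough illustration of that idea outside of tokio, here is a minimal std-only sketch. It is a thread-and-channel analog, not the PR's `tokio::select!` implementation, and the names `run_with_stall_watchdog` and `on_stall` are hypothetical, introduced only for this example.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical helper: runs `work` on a worker thread and calls
/// `on_stall(elapsed)` once per `interval` while the work is still pending.
/// The work itself is never interrupted. Returns the work's result together
/// with the number of stall warnings that were emitted.
pub fn run_with_stall_watchdog<T, F, W>(work: F, interval: Duration, mut on_stall: W) -> (T, u32)
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
    W: FnMut(Duration),
{
    let start = Instant::now();
    let (tx, rx) = mpsc::channel();

    // The worker sends its result over the channel when it finishes.
    thread::spawn(move || {
        // A send error only means the caller stopped waiting; ignore it.
        let _ = tx.send(work());
    });

    let mut warnings = 0;
    loop {
        // Each iteration waits up to one full `interval`, which plays the role
        // of the fresh `sleep` created in each pass of the select loop.
        match rx.recv_timeout(interval) {
            Ok(value) => return (value, warnings),
            Err(mpsc::RecvTimeoutError::Timeout) => {
                warnings += 1;
                on_stall(start.elapsed());
            }
            Err(mpsc::RecvTimeoutError::Disconnected) => {
                // The sender was dropped without a value, so the worker panicked.
                panic!("worker exited without producing a result");
            }
        }
    }
}
```

A caller might pass a closure such as `|elapsed| eprintln!("still pending after {} ms", elapsed.as_millis())` as `on_stall`. Because the timer only observes and never cancels, a slow request still completes normally; the warnings just leave a breadcrumb earlier than the host-side timeout would.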