Skip to content

feat(agent): drive harnesses over ACP via the rivet sandbox-agent#4760

Closed
mmabrouk wants to merge 1 commit into
feat/agent-tools-wp7from
feat/agent-rivet-acp-wp8
Closed

feat(agent): drive harnesses over ACP via the rivet sandbox-agent#4760
mmabrouk wants to merge 1 commit into
feat/agent-tools-wp7from
feat/agent-rivet-acp-wp8

Conversation

@mmabrouk

@mmabrouk mmabrouk commented Jun 19, 2026

Copy link
Copy Markdown
Member

This PR is part of a stack. Review bottom-up.

Each PR's diff is only its own delta. Merge from the bottom. This PR's base is #4759 (merge that first).

Context

This adds a second way to run a coding harness behind the same Harness port: drive it over the Agent Client Protocol (ACP) through a rivet sandbox-agent daemon, instead of the in-process Pi SDK calls. The harness (pi/claude) and the sandbox (local/daytona) become two independent config axes the caller sets per run, so switching engine or environment needs no redeploy and no new code path. Base branch: feat/agent-tools-wp7. This is slice #3 in docs/design/agent-workflows/pr-stack.md (runner streaming boundary, the second runtime path).

What this changes

A new AGENTA_AGENT_RUNTIME=rivet selects the rivet path; the default stays pi so the legacy in-process route never regresses.

  • Python _build_harness() gained two args. Before, it returned a Pi adapter from env alone. Now, when the runtime is rivet, it returns a RivetHarness carrying the chosen harness and sandbox, taken from the request config first and the AGENTA_AGENT_HARNESS / AGENTA_AGENT_SANDBOX env defaults second.
  • The request schema gained harness (enum pi/claude) and sandbox (enum local/daytona) fields, so a playground run can switch either axis.
  • The wire envelope grew harness, sandbox, sessionId, and secrets; everything else (agentsMd, model, messages, tools, customTools, toolCallback, trace) stays identical, so the Python side stays thin.
  • The TS CLI and HTTP server route by backend. The server defaults to auto: a request that carries harness/sandbox goes to runRivet, everything else to runPi, so one sidecar serves both.
  • runRivet.ts is the new driver. Per invoke it starts a cold sandbox, creates an ACP session for the chosen harness, writes AGENTS.md into the cwd, sends the turn, accumulates the streamed text, and tears the sandbox down.
  • The ACP event stream maps into Agenta spans (agenta-otel.ts createRivetOtel), so the run nests under the caller's /invoke span, the same shape the in-process path produces.
  • Token and cost totals roll up onto the /invoke workflow span. The harness span tree exports in a separate OTLP batch, so Agenta's per-batch roll-up cannot bridge the totals; the service stamps gen_ai.usage.* directly on the workflow span instead.

Key architectural decision to review

The most important one is the split tracing strategy in runRivet.ts and agenta-otel.ts, keyed on emitSpans: !isPi || isDaytona (runRivet.ts:646). Local Pi self-instruments: the Agenta extension propagates the traceparent and emits real turn/chat/tool spans from inside Pi, so the runner only accumulates output. Every other case (any non-Pi harness, or Pi on Daytona where the in-sandbox process cannot reach Agenta's OTLP) builds the span tree in the runner from the ACP session/update notifications. The tradeoff: the runner's reconstructed tree is an approximation of what the harness actually did (one chat span per turn, tool spans per tool_call), not the harness's own instrumentation, but it makes tracing uniform across harnesses that do not self-instrument. Confirm the two paths produce a comparable tree and that nothing double-counts when Pi runs locally.

Second, look at secret resolution and auth fallback. _resolve_harness_secrets (agent.py:154) fetches the project vault's provider_key secrets over HTTP and maps each to its standard env var, because the SDK's per-request secret context does not reach this custom route. When a provider key is present the harness authenticates with it; when absent it falls back to its own login (local Codex OAuth, or an uploaded auth.json on Daytona). Check that an empty vault degrades to OAuth rather than failing, and that keys never get baked into a Daytona image.

How to review this PR

  1. Open services/oss/src/agent.py first. Read _build_harness to see the runtime/harness/sandbox selection, then _resolve_harness_secrets and _record_usage. This is where the two axes and the usage roll-up are decided.
  2. Open services/oss/src/agent_pi/rivet_harness.py. It is a thin adapter: same port, two transports (HTTP sidecar, local subprocess), envelope-building only.
  3. Open services/agent/src/runRivet.ts for the real driver. Read runRivet end to end (the cold-per-invoke lifecycle), then buildSandboxProvider (local vs Daytona), onPermissionRequest (headless auto-approve), and applyModel (model rejection handling).
  4. Open services/agent/src/agenta-otel.ts handleUpdate for the event-to-span mapping. Note the Pi-vs-Claude chunk handling: Pi streams pure deltas, Claude streams deltas plus a cumulative snapshot, so the accumulator replaces when a chunk is a superset and appends otherwise (agenta-otel.ts:759).
  5. Skip the docs/ POC files and pnpm-lock.yaml; they are reference material and the dependency lock.

The regression most likely to break is the default pi runtime. The whole point of the runtime/auto-backend gating is that a deployment without AGENTA_AGENT_RUNTIME=rivet and a request without harness/sandbox still takes the exact in-process Pi path. Verify that selection logic in agent.py:66 and server.ts:29.

Tests / notes

This path was host-verified driving pi and claude over ACP via sandbox-agent, with the event-stream trace nesting under /invoke. Watch two operational facts on Daytona: the run uses a pre-baked snapshot that carries the rivet daemon plus the pi CLI (the rivet -full image lacks pi), and the runner installs a cookie-jar fetch so the Daytona preview proxy does not 401 later ACP requests. The server now swallows background unhandledRejection/uncaughtException from the rivet SDK so one failing run cannot take down every in-flight request on the shared sidecar.

Re-platform the agent workflow service to drive coding harnesses (Pi, Claude
Code) over the Agent Client Protocol through a rivet sandbox-agent daemon,
behind the unchanged Harness port and /invoke contract. The harness (pi/claude)
and sandbox (local/daytona) are editable playground config; tracing nests under
the /invoke span; tools are delivered Pi-native via a bundled extension; and the
model provider key resolves from the project vault.
@coderabbitai

coderabbitai Bot commented Jun 19, 2026

Copy link
Copy Markdown

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: cc9bd867-2a05-45bc-a6a4-eebc51c1628e

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • ✅ Review completed - (🔄 Check again to review again)
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/agent-rivet-acp-wp8

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@dosubot dosubot Bot added size:XXL This PR changes 1000+ lines, ignoring generated files. enhancement New feature or request labels Jun 19, 2026
@mmabrouk

Copy link
Copy Markdown
Member Author

Reviewer guide: interesting code

A map to the decisions worth scrutinizing, in suggested reading order.

  • services/oss/src/agent.py:66 — the runtime == "rivet" branch picks the new path and reads the harness/sandbox axes from request config first, env defaults second; everything above this line keeps the legacy in-process Pi behavior.
  • services/agent/src/runRivet.ts:531runRivet is the whole cold-per-invoke lifecycle (start sandbox, create ACP session, write AGENTS.md, prompt, accumulate, tear down); the finally block is where the sandbox always gets destroyed even on error.
  • services/agent/src/runRivet.ts:646emitSpans: !isPi || isDaytona is the split-tracing switch: local Pi self-instruments via the extension, every other case rebuilds the span tree in the runner from ACP updates. This is the single most important line to get right.
  • services/agent/src/agenta-otel.ts:759 — the chunk accumulator copes with two harness streaming styles in one branch: Pi sends pure deltas, Claude sends deltas plus a cumulative snapshot, so it replaces on a superset and appends otherwise.
  • services/oss/src/agent.py:154_resolve_harness_secrets fetches vault provider keys over HTTP and maps them to env vars because the SDK's per-request secret context does not reach this custom route; an empty vault must degrade to the harness's own OAuth login.
  • services/agent/src/server.ts:29 — the auto backend routes by envelope shape (harness/sandbox present means rivet), so one sidecar serves both runtimes and the Pi default cannot regress.

endpoint: request.trace?.endpoint,
authorization: request.trace?.authorization,
captureContent: request.trace?.captureContent,
emitSpans: !isPi || isDaytona,

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the split-tracing decision. Local Pi self-instruments through the Agenta extension under the propagated traceparent, so the runner only accumulates output here. Every other case (any non-Pi harness, or Pi on Daytona where the in-sandbox process cannot reach Agenta's OTLP) rebuilds the span tree from the ACP event stream. Worth confirming the reconstructed tree lines up with what Pi emits locally so a run does not look different depending on which axis it took.

if (!t) return;
// Pi streams pure deltas; Claude streams deltas plus a cumulative snapshot.
// Replace when a chunk is a superset of what we have, append otherwise.
if (t.startsWith(accumulated)) accumulated = t;

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Subtle cross-harness handling in one branch: Pi streams pure deltas while Claude streams deltas plus a cumulative snapshot, so the accumulator replaces when a chunk is a superset of what it has and appends otherwise. A harness whose streaming style fits neither assumption would corrupt the accumulated text, so this is the line to stress when adding a third harness.

Comment thread services/oss/src/agent.py
}


async def _resolve_harness_secrets() -> Dict[str, str]:

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Provider keys are resolved here over HTTP from the project vault rather than from the SDK's per-request secret context, which does not propagate to this custom route. Check the degrade path: an empty vault must return {} and let the harness fall back to its own OAuth login, not fail the run, and keys must reach Daytona only via runtime env_vars (never baked into the image).

function runAgent(request: AgentRunRequest): Promise<AgentRunResult> {
if (BACKEND === "rivet") return runRivet(request);
if (BACKEND === "pi") return runPi(request);
return request.harness || request.sandbox ? runRivet(request) : runPi(request);

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The auto backend routes purely by envelope shape: a request carrying harness or sandbox goes to rivet, everything else to the legacy in-process Pi path. This is what lets one sidecar serve both runtimes, and it is the line that guarantees the Pi default cannot regress for callers that send neither field.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
services/oss/src/agent.py (1)

413-430: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid unconditional vault secret fetch on every invoke path.

Line 429 always calls _resolve_harness_secrets(), including legacy Pi runs. That adds avoidable external I/O on the hot path and can regress latency/availability when the vault endpoint is slow.

Suggested fix
     harness_id = params.get("harness")
     sandbox_id = params.get("sandbox")
     harness = _build_harness(harness=harness_id, sandbox=sandbox_id)
+    harness_secrets: Dict[str, str] = {}
+    if isinstance(harness, RivetHarness):
+        harness_secrets = await _resolve_harness_secrets()
 
     await harness.setup()
     try:
         result = await harness.invoke(
             HarnessRequest(
@@
                 custom_tools=custom_tools,
                 tool_callback=tool_callback,
                 trace=_trace_context(),
-                secrets=await _resolve_harness_secrets(),
+                secrets=harness_secrets,
             )
         )
🧹 Nitpick comments (3)
docs/design/agent-workflows/wp-8-rivet-acp-runtime/research.md (1)

91-91: 💤 Low value

Minor: rephrase to reduce overuse of "exactly".

The static analysis tool flags the word "exactly" as overused. Consider a synonym like "matches" or "mirrors" for variety: "This matches how DaytonaRunner already injects..." This is a style note only and does not affect correctness.

Source: Linters/SAST tools

docs/design/agent-workflows/wp-8-rivet-acp-runtime/status.md (1)

71-71: 💤 Low value

Minor: capitalize GitHub for brand consistency.

The static analysis tool flags the GitHub reference. Consider capitalizing to match official branding: "a real Composio GitHub github_whoami tool..." (or keep the backtick to mark it as a code identifier). This is a style note only.

Source: Linters/SAST tools

services/agent/src/toolBridge.ts (1)

66-68: ⚡ Quick win

Add a size guard before putting serialized tool specs into env.

AGENTA_TOOL_SPECS can grow enough to hit OS env-size limits, causing hard-to-debug process launch failures. A small preflight check with a clear error path will make this much safer.

[recommendation]

Proposed patch
   const env: EnvVariable[] = [
-    { name: "AGENTA_TOOL_SPECS", value: JSON.stringify(specs) },
+    { name: "AGENTA_TOOL_SPECS", value: JSON.stringify(specs) },
     { name: "AGENTA_TOOL_CALLBACK_ENDPOINT", value: callback.endpoint },
   ];
+  const specsEnv = env[0].value;
+  // Keep a conservative cap to avoid E2BIG/argv+env spawn failures.
+  if (Buffer.byteLength(specsEnv, "utf8") > 64 * 1024) {
+    process.stderr.write("[tool-bridge] tool payload too large for env transport\n");
+    return [];
+  }

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 8f922ea4-dca7-4f22-833d-c7bf66ea68df

📥 Commits

Reviewing files that changed from the base of the PR and between 59bf20a and 9c3d141.

⛔ Files ignored due to path filters (1)
  • services/agent/pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (31)
  • .gitignore
  • docs/design/agent-workflows/README.md
  • docs/design/agent-workflows/wp-8-rivet-acp-runtime/README.md
  • docs/design/agent-workflows/wp-8-rivet-acp-runtime/architecture.md
  • docs/design/agent-workflows/wp-8-rivet-acp-runtime/context.md
  • docs/design/agent-workflows/wp-8-rivet-acp-runtime/isolation-and-fork.md
  • docs/design/agent-workflows/wp-8-rivet-acp-runtime/plan.md
  • docs/design/agent-workflows/wp-8-rivet-acp-runtime/poc/build_rivet_snapshot.py
  • docs/design/agent-workflows/wp-8-rivet-acp-runtime/poc/commit_agent_config.py
  • docs/design/agent-workflows/wp-8-rivet-acp-runtime/poc/debug-events.ts
  • docs/design/agent-workflows/wp-8-rivet-acp-runtime/poc/dump-full.ts
  • docs/design/agent-workflows/wp-8-rivet-acp-runtime/poc/package.json
  • docs/design/agent-workflows/wp-8-rivet-acp-runtime/poc/spike.ts
  • docs/design/agent-workflows/wp-8-rivet-acp-runtime/research.md
  • docs/design/agent-workflows/wp-8-rivet-acp-runtime/status.md
  • hosting/docker-compose/ee/docker-compose.dev.yml
  • services/agent/docker/Dockerfile.dev
  • services/agent/package.json
  • services/agent/scripts/build-extension.mjs
  • services/agent/src/agenta-otel.ts
  • services/agent/src/cli.ts
  • services/agent/src/piExtension.ts
  • services/agent/src/runPi.ts
  • services/agent/src/runRivet.ts
  • services/agent/src/server.ts
  • services/agent/src/toolBridge.ts
  • services/agent/src/toolBridgeServer.ts
  • services/oss/src/agent.py
  • services/oss/src/agent_pi/ports.py
  • services/oss/src/agent_pi/rivet_harness.py
  • services/oss/src/agent_pi/schemas.py


const cwd = mkdtempSync(join(tmpdir(), "wp8-dbg-"));
writeFileSync(join(cwd, "AGENTS.md"), "You are concise.\n", "utf-8");
const env: Record<string,string> = { PATH: `${binDir}:/home/mahmoud/.local/bin:${process.env.PATH ?? ""}`, PI_ACP_PI_COMMAND: join(binDir,"pi"), PI_CODING_AGENT_DIR: join(process.env.HOME??"",".pi/agent"), SANDBOX_AGENT_BIN: BIN, HOME: process.env.HOME??"" };

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Remove hardcoded personal developer path before commit.

Line 15 contains /home/mahmoud/.local/bin, a personal home directory that should not be baked into committed code. Replace with an environment-relative or system-standard path, or remove the hardcoded segment entirely and let PATH resolution handle it.

🗑️ Proposed fix to remove personal path
-const env: Record<string,string> = { PATH: `${binDir}:/home/mahmoud/.local/bin:${process.env.PATH ?? ""}`, PI_ACP_PI_COMMAND: join(binDir,"pi"), PI_CODING_AGENT_DIR: join(process.env.HOME??"",".pi/agent"), SANDBOX_AGENT_BIN: BIN, HOME: process.env.HOME??"" };
+const env: Record<string,string> = { PATH: `${binDir}:${process.env.PATH ?? ""}`, PI_ACP_PI_COMMAND: join(binDir,"pi"), PI_CODING_AGENT_DIR: join(process.env.HOME??"",".pi/agent"), SANDBOX_AGENT_BIN: BIN, HOME: process.env.HOME??"" };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const env: Record<string,string> = { PATH: `${binDir}:/home/mahmoud/.local/bin:${process.env.PATH ?? ""}`, PI_ACP_PI_COMMAND: join(binDir,"pi"), PI_CODING_AGENT_DIR: join(process.env.HOME??"",".pi/agent"), SANDBOX_AGENT_BIN: BIN, HOME: process.env.HOME??"" };
const env: Record<string,string> = { PATH: `${binDir}:${process.env.PATH ?? ""}`, PI_ACP_PI_COMMAND: join(binDir,"pi"), PI_CODING_AGENT_DIR: join(process.env.HOME??"",".pi/agent"), SANDBOX_AGENT_BIN: BIN, HOME: process.env.HOME??"" };

Source: Coding guidelines

const BIN = join(here, "node_modules/.pnpm/@sandbox-agent+cli-linux-x64@0.4.2/node_modules/@sandbox-agent/cli-linux-x64/bin/sandbox-agent");
const cwd = mkdtempSync(join(tmpdir(), "wp8-dump-"));
writeFileSync(join(cwd, "AGENTS.md"), "You are concise.\n", "utf-8");
const env: Record<string,string> = { PATH: `${binDir}:/home/mahmoud/.local/bin:${process.env.PATH ?? ""}`, PI_ACP_PI_COMMAND: join(binDir,"pi"), PI_CODING_AGENT_DIR: join(process.env.HOME??"",".pi/agent"), SANDBOX_AGENT_BIN: BIN, HOME: process.env.HOME??"" };

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Remove hardcoded personal developer path before commit.

Line 13 contains /home/mahmoud/.local/bin, identical to the issue in debug-events.ts. This personal home directory must not be committed.

🗑️ Proposed fix to remove personal path
-const env: Record<string,string> = { PATH: `${binDir}:/home/mahmoud/.local/bin:${process.env.PATH ?? ""}`, PI_ACP_PI_COMMAND: join(binDir,"pi"), PI_CODING_AGENT_DIR: join(process.env.HOME??"",".pi/agent"), SANDBOX_AGENT_BIN: BIN, HOME: process.env.HOME??"" };
+const env: Record<string,string> = { PATH: `${binDir}:${process.env.PATH ?? ""}`, PI_ACP_PI_COMMAND: join(binDir,"pi"), PI_CODING_AGENT_DIR: join(process.env.HOME??"",".pi/agent"), SANDBOX_AGENT_BIN: BIN, HOME: process.env.HOME??"" };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const env: Record<string,string> = { PATH: `${binDir}:/home/mahmoud/.local/bin:${process.env.PATH ?? ""}`, PI_ACP_PI_COMMAND: join(binDir,"pi"), PI_CODING_AGENT_DIR: join(process.env.HOME??"",".pi/agent"), SANDBOX_AGENT_BIN: BIN, HOME: process.env.HOME??"" };
const env: Record<string,string> = { PATH: `${binDir}:${process.env.PATH ?? ""}`, PI_ACP_PI_COMMAND: join(binDir,"pi"), PI_CODING_AGENT_DIR: join(process.env.HOME??"",".pi/agent"), SANDBOX_AGENT_BIN: BIN, HOME: process.env.HOME??"" };

Source: Coding guidelines

Comment on lines 446 to 447
AGENTA_HOST: ${AGENTA_HOST:-http://144.76.237.122:8280}
AGENTA_API_KEY: ${AGENTA_API_KEY:-}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Remove the hardcoded public AGENTA_HOST default to prevent accidental data egress.

Line 446 defaults to http://144.76.237.122:8280, which can send trace payloads to an external endpoint when env is unset. That is a privacy/security footgun for local dev stacks.

Suggested fix
-            AGENTA_HOST: ${AGENTA_HOST:-http://144.76.237.122:8280}
+            AGENTA_HOST: ${AGENTA_HOST:-}

callRef: string;
}

const TOOL_CALL_TIMEOUT_MS = Number(process.env.AGENTA_AGENT_TOOL_CALL_TIMEOUT_MS ?? 30000);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

find . -name "piExtension.ts" -type f

Repository: Agenta-AI/agenta

Length of output: 96


🏁 Script executed:

cat -n ./services/agent/src/piExtension.ts | head -80

Repository: Agenta-AI/agenta

Length of output: 3861


🏁 Script executed:

cat -n ./services/agent/src/piExtension.ts | sed -n '38p;58p;59p;60p;61p'

Repository: Agenta-AI/agenta

Length of output: 408


🌐 Web query:

AbortSignal.timeout() behavior with NaN milliseconds

💡 Result:

The behavior of passing NaN to AbortSignal.timeout is governed by its WebIDL definition, which uses the [EnforceRange] extended attribute on the milliseconds argument [1][2][3]. When a method argument is marked with [EnforceRange], passing a value that cannot be converted to an integer—such as NaN—results in a TypeError being thrown immediately. Consequently, AbortSignal.timeout(NaN) does not return a signal or create a timer; it throws a TypeError [1][2]. In general, the milliseconds argument for AbortSignal.timeout must be a non-negative number within the range of 0 to Number.MAX_SAFE_INTEGER [1][2]. Values outside this range or invalid types (that cannot be coerced into valid integers via EnforceRange) will cause the function to throw.

Citations:


Add timeout validation to prevent runtime TypeError on invalid env values.

At Line 38, a non-numeric AGENTA_AGENT_TOOL_CALL_TIMEOUT_MS yields NaN; at Line 58 this fails because AbortSignal.timeout(NaN) throws a TypeError due to the [EnforceRange] WebIDL constraint.

Proposed fix
-const TOOL_CALL_TIMEOUT_MS = Number(process.env.AGENTA_AGENT_TOOL_CALL_TIMEOUT_MS ?? 30000);
+const parsedTimeout = Number(process.env.AGENTA_AGENT_TOOL_CALL_TIMEOUT_MS);
+const TOOL_CALL_TIMEOUT_MS =
+  Number.isFinite(parsedTimeout) && parsedTimeout > 0 ? parsedTimeout : 30000;

Comment on lines +91 to +92
/** Continue a prior run by replaying its history. The rivet backend resumes by id. */
sessionId?: string;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

sessionId contract is currently not honored by the rivet runtime path.

Line 91 says resume-by-id is supported, but services/agent/src/runRivet.ts:531-698 never reads request.sessionId and always starts with a fresh in-memory persist driver. This makes the advertised continuation behavior incorrect for callers.

Please either wire request.sessionId into session restore logic or remove/soften this contract until restore is implemented.

// Adapters (pi-acp, claude-agent-acp) and the pi CLI live in our node_modules/.bin;
// claude CLI is on the inherited PATH. Prepend ours, keep the inherited PATH.
const extra = process.env.AGENTA_RIVET_ADAPTER_PATH;
env.PATH = [ADAPTER_BIN_DIR, extra, process.env.PATH].filter(Boolean).join(":");

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Check if the file exists and read the context around line 193
head -200 services/agent/src/runRivet.ts | tail -30

Repository: Agenta-AI/agenta

Length of output: 1474


🏁 Script executed:

# Also check the imports at the top of the file
head -30 services/agent/src/runRivet.ts

Repository: Agenta-AI/agenta

Length of output: 1448


🏁 Script executed:

# Search for other PATH-related assignments in the codebase
rg "env\.PATH\s*=" --type ts --type js -B 2 -A 2

Repository: Agenta-AI/agenta

Length of output: 513


🌐 Web query:

node:path delimiter function cross-platform Windows Unix PATH separator

💡 Result:

In Node.js, the path module provides two distinct properties to handle platform-specific separators, which are essential for writing cross-platform code [1][2]. 1. path.delimiter: This property provides the platform-specific delimiter used to separate multiple paths in environment variables like PATH [1][3]. - On Windows, it is a semicolon (;) [1][2]. - On POSIX (Unix-like systems), it is a colon (:) [1][2]. 2. path.sep: This property provides the platform-specific path segment separator used within file paths (e.g., between directory names) [1][2]. - On Windows, it is a backslash () [1][3]. - On POSIX, it is a forward slash (/) [1][2]. While Windows accepts both forward slashes (/) and backslashes () as valid path separators in many contexts, Node.js path methods exclusively use the platform-specific separator (backslash on Windows) when generating or joining paths [1][2]. For code intended to run across different operating systems, using these properties ensures correct behavior for the host environment [1][4].

Citations:


🏁 Script executed:

# Get the imports section more carefully
rg "import.*from.*node:path" services/agent/src/runRivet.ts -A 5 -B 5

Repository: Agenta-AI/agenta

Length of output: 440


🏁 Script executed:

# If not found in that way, check the first 50 lines more carefully
head -50 services/agent/src/runRivet.ts | cat -n

Repository: Agenta-AI/agenta

Length of output: 2475


🏁 Script executed:

# Check for other potential hardcoded PATH delimiters (colon or semicolon in PATH context)
rg "join\s*\(\s*['\"][:;]['\"]" --type ts --type js

Repository: Agenta-AI/agenta

Length of output: 472


Use OS-aware PATH delimiter instead of hardcoded colon.

At line 193, the PATH environment variable is joined with a hardcoded ":", which breaks on Windows (which uses ";" as the PATH delimiter). This breaks subprocess execution on Windows despite win32 CLI packages being explicitly supported.

Proposed fix
-import { dirname, join } from "node:path";
+import { delimiter, dirname, join } from "node:path";
...
-  env.PATH = [ADAPTER_BIN_DIR, extra, process.env.PATH].filter(Boolean).join(":");
+  env.PATH = [ADAPTER_BIN_DIR, extra, process.env.PATH].filter(Boolean).join(delimiter);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
env.PATH = [ADAPTER_BIN_DIR, extra, process.env.PATH].filter(Boolean).join(":");
import { delimiter, dirname, join } from "node:path";
...
env.PATH = [ADAPTER_BIN_DIR, extra, process.env.PATH].filter(Boolean).join(delimiter);

Comment on lines +531 to +533
export async function runRivet(request: AgentRunRequest): Promise<AgentRunResult> {
const harness = request.harness || process.env.AGENTA_AGENT_HARNESS || "pi";
const sandboxId = request.sandbox || process.env.AGENTA_AGENT_SANDBOX || "local";

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

sessionId continuation is not wired, so resume semantics are currently broken.

request.sessionId is never consumed, a fresh in-memory persist driver is created per invoke, and the response always returns a newly created session.id. This breaks the new continuation contract across harness boundaries and can fragment conversation/session lineage.

Also applies to: 586-587, 627-631, 684-685

Comment on lines +85 to +87
process.on("uncaughtException", (err) => {
process.stderr.write(`[pi-wrapper] uncaughtException: ${err.stack ?? err.message}\n`);
});

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Check if the file exists and view the code around lines 85-87
if [ -f "services/agent/src/server.ts" ]; then
  echo "=== File exists ==="
  wc -l "services/agent/src/server.ts"
  echo ""
  echo "=== Code at lines 80-95 ==="
  sed -n '80,95p' "services/agent/src/server.ts"
else
  echo "File not found at services/agent/src/server.ts"
  # Try to find it
  echo "Searching for server.ts files..."
  find . -name "server.ts" -type f 2>/dev/null | head -20
fi

Repository: Agenta-AI/agenta

Length of output: 587


🏁 Script executed:

# Also search for other uncaughtException handlers in the codebase
echo "=== Searching for uncaughtException handlers ==="
rg "uncaughtException" -n --context 2

Repository: Agenta-AI/agenta

Length of output: 447


🌐 Web query:

Node.js uncaughtException handler best practices should process exit

💡 Result:

The official Node.js documentation and industry best practices strongly advise that the 'uncaughtException' event should be used only as a last resort to perform synchronous cleanup of resources before shutting down the process [1][2]. Key best practices include: 1. Never attempt to resume normal operation: An uncaught exception places the application in an undefined state [1][3]. Continuing execution after an uncaught exception is widely recognized as an anti-pattern that can lead to data corruption, memory leaks, or unpredictable behavior because the application's internal state (such as database connection pools or in-flight requests) may be broken [4]. 2. Use the handler for cleanup only: If you implement an 'uncaughtException' listener, its primary purpose should be to perform synchronous cleanup of allocated resources, such as closing file descriptors or database handles [1][3]. 3. Terminate the process: After cleanup, you must terminate the process [4]. By default, Node.js prints the stack trace and exits with code 1 [1][2]. If you add a listener, you override this behavior and the process will exit with 0 unless you explicitly manage the exit [1][3]. To maintain correct signaling to external monitoring tools, you should explicitly call process.exit(1) after your cleanup logic [4]. 4. Rely on external process managers: Since the process should always be terminated after an uncaught exception, you should use an external process manager (e.g., PM2, Forever, or container orchestration like Kubernetes) to monitor the application and automatically restart it in a clean state [1][5][4]. 5. Do not use as a generic error handler: 'uncaughtException' is not a mechanism for normal error handling. Expected errors should be managed via try/catch blocks, error-first callbacks, or promise rejections [4]. In summary, the correct pattern is to log the error with sufficient context, perform necessary synchronous cleanup, and then call process.exit(1) to trigger a restart by your process manager [4].

Citations:


Process must terminate after uncaughtException to prevent undefined state.

The current handler at lines 85-87 only logs and continues. Per Node.js best practices, uncaught exceptions place the process in an undefined state, and continuing execution is an anti-pattern that risks data corruption, memory leaks, and cascading failures. The process must explicitly terminate and rely on external process managers for restart.

Safer pattern
 process.on("uncaughtException", (err) => {
   process.stderr.write(`[pi-wrapper] uncaughtException: ${err.stack ?? err.message}\n`);
+  process.exitCode = 1;
+  setImmediate(() => process.exit(1));
 });
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
process.on("uncaughtException", (err) => {
process.stderr.write(`[pi-wrapper] uncaughtException: ${err.stack ?? err.message}\n`);
});
process.on("uncaughtException", (err) => {
process.stderr.write(`[pi-wrapper] uncaughtException: ${err.stack ?? err.message}\n`);
process.exitCode = 1;
setImmediate(() => process.exit(1));
});

callRef: string;
}

const SPECS: ToolSpec[] = JSON.parse(process.env.AGENTA_TOOL_SPECS ?? "[]");

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Avoid process crash on malformed AGENTA_TOOL_SPECS.

At Line 26, JSON.parse at module scope can terminate the bridge process before it serves requests. Parse defensively and fallback to [].

Proposed fix
-const SPECS: ToolSpec[] = JSON.parse(process.env.AGENTA_TOOL_SPECS ?? "[]");
+let SPECS: ToolSpec[] = [];
+try {
+  const parsed = JSON.parse(process.env.AGENTA_TOOL_SPECS ?? "[]");
+  SPECS = Array.isArray(parsed) ? parsed : [];
+} catch {
+  SPECS = [];
+}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const SPECS: ToolSpec[] = JSON.parse(process.env.AGENTA_TOOL_SPECS ?? "[]");
let SPECS: ToolSpec[] = [];
try {
const parsed = JSON.parse(process.env.AGENTA_TOOL_SPECS ?? "[]");
SPECS = Array.isArray(parsed) ? parsed : [];
} catch {
SPECS = [];
}

const ENDPOINT = process.env.AGENTA_TOOL_CALLBACK_ENDPOINT ?? "";
const AUTH = process.env.AGENTA_TOOL_CALLBACK_AUTH;
const SPEC_BY_NAME = new Map(SPECS.map((s) => [s.name, s]));
const TOOL_CALL_TIMEOUT_MS = Number(process.env.AGENTA_AGENT_TOOL_CALL_TIMEOUT_MS ?? 30000);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

fd -t f "toolBridgeServer.ts" --exec wc -l {}

Repository: Agenta-AI/agenta

Length of output: 105


🏁 Script executed:

head -80 services/agent/src/toolBridgeServer.ts | cat -n

Repository: Agenta-AI/agenta

Length of output: 3790


🏁 Script executed:

# Verify AbortSignal.timeout behavior with NaN
python3 << 'PY'
# Check what AbortSignal.timeout does with NaN
# According to ECMAScript spec, AbortSignal.timeout(delay) expects:
# - delay to be a number >= 0
# - If NaN is passed, it would typically be coerced or throw

# The concern is valid - NaN is not a valid timeout value
print("AbortSignal.timeout() spec requirements:")
print("- Expects a non-negative number")
print("- NaN would violate this contract")
print("- Current code can produce NaN if env var is set to non-numeric string")
print("\nExample issue scenarios:")
print("1. AGENTA_AGENT_TOOL_CALL_TIMEOUT_MS='invalid' -> Number('invalid') = NaN -> AbortSignal.timeout(NaN) fails")
print("2. AGENTA_AGENT_TOOL_CALL_TIMEOUT_MS='-500' -> Number('-500') = -500 -> AbortSignal.timeout(-500) likely fails (negative)")
PY

Repository: Agenta-AI/agenta

Length of output: 494


Guard timeout env parsing to prevent NaN at runtime.

Line 30 can produce NaN if AGENTA_AGENT_TOOL_CALL_TIMEOUT_MS is set to a non-numeric value. This would cause AbortSignal.timeout() at line 60 to fail. Validate that the parsed timeout is a positive finite number before use.

Proposed fix
-const TOOL_CALL_TIMEOUT_MS = Number(process.env.AGENTA_AGENT_TOOL_CALL_TIMEOUT_MS ?? 30000);
+const parsedTimeout = Number(process.env.AGENTA_AGENT_TOOL_CALL_TIMEOUT_MS);
+const TOOL_CALL_TIMEOUT_MS =
+  Number.isFinite(parsedTimeout) && parsedTimeout > 0 ? parsedTimeout : 30000;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const TOOL_CALL_TIMEOUT_MS = Number(process.env.AGENTA_AGENT_TOOL_CALL_TIMEOUT_MS ?? 30000);
const parsedTimeout = Number(process.env.AGENTA_AGENT_TOOL_CALL_TIMEOUT_MS);
const TOOL_CALL_TIMEOUT_MS =
Number.isFinite(parsedTimeout) && parsedTimeout > 0 ? parsedTimeout : 30000;

@mmabrouk

Copy link
Copy Markdown
Member Author

Superseded. Replacing the path-based stack with PRs sliced by functional area showing final code only, so reviewers don't comment on intermediate scaffolding that a later PR rewrites. See the new set.

@mmabrouk mmabrouk closed this Jun 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request size:XXL This PR changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant