Skip to content

feat(agent): runnable tools as agent configuration#4759

Closed
mmabrouk wants to merge 1 commit into
feat/agent-workflowsfrom
feat/agent-tools-wp7
Closed

feat(agent): runnable tools as agent configuration#4759
mmabrouk wants to merge 1 commit into
feat/agent-workflowsfrom
feat/agent-tools-wp7

Conversation

@mmabrouk

@mmabrouk mmabrouk commented Jun 19, 2026

Copy link
Copy Markdown
Member

This PR is part of a stack. Review bottom-up.

Each PR's diff is only its own delta. Merge from the bottom. This PR's base is #4758 (merge that first).

Context

An agent revision can now carry a list of tools in its config, and the runtime can execute them. This is the first cut of WP-7: a Composio action you configure on an agent becomes a runnable tool that Pi calls during a run, with the provider key kept server-side. It sits on top of feat/agent-workflows (base branch) and is slice #9 (tool runtime) of pr-stack.md.

What this changes

Tools become part of the agent's saved configuration. Under an agent revision's parameters, each tool entry is either a built-in name or a discriminated object:

"tools": [
  "read_file",
  { "type": "composio", "integration": "github",
    "action": "GET_THE_AUTHENTICATED_USER", "connection": "github-tvn" }
]

At invoke time the agent service resolves that list. Built-in names pass straight through to Pi's tools: string[]. Composio references go to a new backend endpoint, POST /tools/resolve, which validates the connection and enriches each action from the catalog (description plus input schema). The resolved spec carries a call_ref slug, tools.composio.{integration}.{action}.{connection}. The TS wrapper turns each spec into a Pi customTool whose execute posts that slug back to the existing POST /tools/call. Pi drives the tool loop in the sandbox, but the Composio key and the connection auth never leave the backend.

Before this PR the agent ran with no runnable tools and a fixed two-field config (model, agents_md). After it, the playground renders the agent config as a prompt-template, so the model selector, system-message editor, and tool picker that chat and completion already use now drive the agent with no new frontend code (agent_pi/schemas.py). The handler reads the system message as the AGENTS.md and the picked tools from llm_config.tools.

call_tool no longer inlines its connection lookup. That logic moved into ToolsService.resolve_connection_by_slug, which both the execution endpoint and the new resolver share.

Key architectural decision to review

  1. The configured tool name must match Pi's session allowlist, or the call is hidden. In services/agent/src/runPi.ts, custom tool names are appended to the createAgentSession tools allowlist alongside the built-ins. Pi gates custom tools through the same allowlist, so an empty allowlist would register the customTool but never expose it to the model. This couples the resolved tool name to the allowlist entry. Check buildCustomTools and the toolAllowlist assembly.

  2. The call_ref slug round-trips __ <-> ., so slug segments forbid __. _SLUG_SEGMENT_RE in api/oss/src/core/tools/service.py rejects __ inside any segment because /tools/call swaps __ and . when it parses the OpenAI-style function name. A __ inside an integration or action name would corrupt the five-way split. The tradeoff: the function name the model sees and the call-ref slug share one grammar. Confirm the regex matches the parsing in _parse_gateway_slug (services/oss/src/agent.py) and in call_tool.

How to review this PR

  1. Open api/oss/src/core/tools/service.py first. Read resolve_agent_tools and _resolve_composio_tool for the resolution shape, then resolve_connection_by_slug for the extracted lookup. This is where the contract lives.
  2. Then services/oss/src/agent.py: _resolve_tools (the HTTP hop to /tools/resolve), _normalize_tool_ref and _parse_gateway_slug (config and picker shapes into refs), and _request_authorization (the credential reuse).
  3. Then services/agent/src/runPi.ts: buildCustomTools and callAgentaTool for the execution bridge.
  4. Skip the deleted files. agent_main.py and the two standalone composes are removed in favor of the integrated /agent/v0 mount; the qa.md doc records why.
  5. Most likely regression: an agent with no tools must not touch the backend. _resolve_tools returns early on an empty ref list, which preserves the tool-less WP-2 path. Check that early return.

Tests / notes

  • Verified live on the dev stack (2026-06-16): a real GitHub Composio connection drove the full path. /tools/resolve built the spec, Pi registered the github_whoami customTool and called it, the bridge executed the action through /tools/call, and the agent answered with live data (login, follower count). The trace nests the tool call: _agent -> invoke_agent -> turn 0 -> {chat, execute_tool} -> turn 1 -> chat.
  • WP-6 is not built yet, so resolution runs in api behind an HTTP endpoint the agent service calls. When WP-6 lands, its invoke path calls the same ToolsService.resolve_agent_tools(...) in-process and the HTTP hop drops out. Later PRs (feat(sdk): typed agent tool resolution contracts #4763/feat(agent): resolve typed tools through the service #4764/feat(agent): deliver resolved tools through the runner #4765) move resolution into the SDK and service and deliver it through the runner.
  • One thing to watch: _resolve_tools raises on any resolution failure, so a missing or inactive connection fails the invoke early with a clear message instead of surfacing as a mid-loop tool error.
  • The dev agent-pi sidecar no longer loads the stack env_file, so the sandbox cannot inherit COMPOSIO_API_KEY and the other secrets (hosting/docker-compose/ee/docker-compose.dev.yml).

@vercel

vercel Bot commented Jun 19, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
agenta-documentation Ready Ready Preview, Comment Jun 19, 2026 3:40pm

Request Review

@coderabbitai

coderabbitai Bot commented Jun 19, 2026

Copy link
Copy Markdown

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 2dabc111-1ff2-4c72-aadd-07faeb798eb7

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • ✅ Review completed - (🔄 Check again to review again)
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/agent-tools-wp7

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@dosubot dosubot Bot added size:XXL This PR changes 1000+ lines, ignoring generated files. enhancement New feature or request labels Jun 19, 2026
@mmabrouk

Copy link
Copy Markdown
Member Author

Reviewer guide: interesting code

The hunks that carry the design decisions, in reading order:

  • api/oss/src/core/tools/service.py:489resolve_agent_tools is the contract: builtins pass through as names, Composio refs validate the connection up front and enrich from the catalog, so the invoke fails fast on a bad connection instead of mid-loop.
  • api/oss/src/core/tools/service.py:42_SLUG_SEGMENT_RE forbids __ in a slug segment because /tools/call round-trips __ <-> . when parsing the function name. A __ inside a segment would corrupt the five-way split.
  • api/oss/src/core/tools/service.py:432resolve_connection_by_slug is the lookup extracted out of call_tool, now shared by the execution endpoint and the resolver. Note is_active=None so an inactive connection gives a precise error rather than "not found".
  • services/oss/src/agent.py:195_parse_gateway_slug decodes the playground picker's tools__composio__... function name into the same composio ref the resolver takes, so the picker needed no backend change.
  • services/agent/src/runPi.ts:329 — custom tool names are appended to the createAgentSession allowlist; Pi gates custom tools through the same allowlist, so an empty one would hide them from the model.
  • services/agent/src/runPi.ts:242 — the execution bridge sends arguments as an object inside the OpenAI envelope (no double-encoding), forwards the threaded Authorization, and throws on HTTP/timeout failure so Pi turns it into a tool-error result rather than a run failure.

# A slug segment is safe for the ``tools.{provider}.{integration}.{action}.{connection}``
# call-ref. ``__`` is forbidden because ``/tools/call`` round-trips ``__`` <-> ``.`` when
# parsing function names, so a ``__`` inside a segment would corrupt the split.
_SLUG_SEGMENT_RE = re.compile(r"^[a-zA-Z0-9-]+(?:_[a-zA-Z0-9-]+)*$")

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This regex forbids `` in a slug segment on purpose: `/tools/call` round-trips `` <-> `.` when parsing the OpenAI-style function name, so a `__` inside an integration or action name would corrupt the five-way split of the call-ref. The model-facing function name and the call-ref slug share one grammar.

project_id=project_id,
provider_key=provider_key,
integration_key=integration_key,
is_active=None,

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The `is_active=None` query (all, not active-only) is the reason this lookup can distinguish "inactive" from "not found". Querying active-only would collapse an inactive connection into an indistinguishable `ConnectionNotFoundError`.

request.customTools ?? [],
request.toolCallback,
);
const toolAllowlist = [

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Key coupling for review: Pi gates custom tools through the same `tools` allowlist as built-ins, so appending each customTool name here is required. An empty allowlist registers the customTool but never exposes it to the model.

id: toolCallId,
type: "function",
// Arguments as an object (not a JSON string) to avoid double-encoding.
function: { name: callRef, arguments: params ?? {} },

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arguments go as an object, not a JSON string, to avoid double-encoding on the `/tools/call` round-trip. A failure in this fetch throws, which Pi converts to a tool-error result so the loop continues rather than failing the whole run.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 33129cd3-25a8-4c72-95b0-b6e52f387a0d

📥 Commits

Reviewing files that changed from the base of the PR and between 6898615 and 59bf20a.

📒 Files selected for processing (22)
  • api/oss/src/apis/fastapi/tools/models.py
  • api/oss/src/apis/fastapi/tools/router.py
  • api/oss/src/core/tools/dtos.py
  • api/oss/src/core/tools/exceptions.py
  • api/oss/src/core/tools/service.py
  • docs/design/agent-workflows/wp-1-pi-tracing/tracing-in-the-agent-service.md
  • docs/design/agent-workflows/wp-2-agent-service/implementation-plan.md
  • docs/design/agent-workflows/wp-2-agent-service/qa.md
  • docs/design/agent-workflows/wp-7-tools/README.md
  • hosting/docker-compose/ee/docker-compose.dev.yml
  • services/agent/README.md
  • services/agent/docker-compose.agent.yml
  • services/agent/docker-compose.stack.yml
  • services/agent/scripts/register_agent_app.py
  • services/agent/src/runPi.ts
  • services/entrypoints/agent_main.py
  • services/oss/src/agent.py
  • services/oss/src/agent_pi/config.py
  • services/oss/src/agent_pi/pi_harness.py
  • services/oss/src/agent_pi/pi_http_harness.py
  • services/oss/src/agent_pi/ports.py
  • services/oss/src/agent_pi/schemas.py
💤 Files with no reviewable changes (4)
  • services/agent/docker-compose.agent.yml
  • services/entrypoints/agent_main.py
  • services/agent/scripts/register_agent_app.py
  • services/agent/docker-compose.stack.yml

Comment on lines +505 to +519
for ref in tools:
if isinstance(ref, AgentBuiltinTool):
if ref.name:
builtins.append(ref.name)
continue

if isinstance(ref, AgentComposioTool):
custom.append(
await self._resolve_composio_tool(
project_id=project_id,
ref=ref,
)
)

return AgentToolsResolution(builtins=builtins, custom=custom)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Enforce unique resolved tool names before returning the resolution.

Two Composio entries with different connection values currently default to the same name ({integration}__{action}), which creates ambiguous custom-tool registration/selection at runtime.

Proposed fix
 async def resolve_agent_tools(
@@
-        builtins: List[str] = []
-        custom: List[ResolvedAgentTool] = []
+        builtins: List[str] = []
+        custom: List[ResolvedAgentTool] = []
+        seen_names: set[str] = set()
@@
             if isinstance(ref, AgentBuiltinTool):
-                if ref.name:
-                    builtins.append(ref.name)
+                if ref.name:
+                    if ref.name in seen_names:
+                        raise ToolSlugInvalidError(
+                            slug=ref.name,
+                            detail=f"Duplicate tool name: {ref.name!r}",
+                        )
+                    seen_names.add(ref.name)
+                    builtins.append(ref.name)
                 continue
 
             if isinstance(ref, AgentComposioTool):
-                custom.append(
-                    await self._resolve_composio_tool(
-                        project_id=project_id,
-                        ref=ref,
-                    )
-                )
+                resolved = await self._resolve_composio_tool(
+                    project_id=project_id,
+                    ref=ref,
+                )
+                if resolved.name in seen_names:
+                    raise ToolSlugInvalidError(
+                        slug=resolved.call_ref,
+                        detail=f"Duplicate tool name: {resolved.name!r}",
+                    )
+                seen_names.add(resolved.name)
+                custom.append(resolved)
@@
-        name = ref.name or f"{ref.integration}__{ref.action}"
+        name = ref.name or f"{ref.integration}__{ref.action}__{ref.connection}"

Also applies to: 559-563

Comment on lines +162 to +165
/** Per-tool budget for the /tools/call round-trip. Surfaced as a tool error on timeout. */
const TOOL_CALL_TIMEOUT_MS = Number(
process.env.AGENTA_AGENT_TOOL_CALL_TIMEOUT_MS ?? 30000,
);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate AGENTA_AGENT_TOOL_CALL_TIMEOUT_MS before using it.

Number(...) can produce NaN/invalid values; passing that into timeout construction can cause runtime failures on every tool call.

Proposed fix
-const TOOL_CALL_TIMEOUT_MS = Number(
-  process.env.AGENTA_AGENT_TOOL_CALL_TIMEOUT_MS ?? 30000,
-);
+const parsedTimeoutMs = Number(
+  process.env.AGENTA_AGENT_TOOL_CALL_TIMEOUT_MS ?? 30000,
+);
+const TOOL_CALL_TIMEOUT_MS =
+  Number.isFinite(parsedTimeoutMs) && parsedTimeoutMs > 0
+    ? parsedTimeoutMs
+    : 30000;

Comment on lines +185 to +188
if (!callback?.endpoint) {
log(`skipping ${specs.length} custom tool(s): missing toolCallback endpoint`);
return [];
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail fast when customTools are provided without a callback endpoint.

Dropping tools silently here makes runs behave as tool-less without surfacing the configuration error.

Proposed fix
 export function buildCustomTools(
   specs: ResolvedToolSpec[],
   callback: ToolCallbackContext | undefined,
 ): any[] {
   if (specs.length === 0) return [];
   if (!callback?.endpoint) {
-    log(`skipping ${specs.length} custom tool(s): missing toolCallback endpoint`);
-    return [];
+    throw new Error(
+      `customTools provided (${specs.length}) but toolCallback.endpoint is missing`,
+    );
   }

@mmabrouk

Copy link
Copy Markdown
Member Author

Superseded. Replacing the path-based stack with PRs sliced by functional area showing final code only, so reviewers don't comment on intermediate scaffolding that a later PR rewrites. See the new set.

@mmabrouk mmabrouk closed this Jun 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request size:XXL This PR changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant