feat(agents): detached (background) agent-tool runs — durable completion, progress & milestones#1758
Conversation
🦋 Changeset detectedLatest commit: c8fc9bc The changes in this PR will be included in the next version bump. This PR includes changesets to release 3 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
agents
@cloudflare/ai-chat
@cloudflare/codemode
create-think
hono-agents
@cloudflare/shell
@cloudflare/think
@cloudflare/voice
@cloudflare/worker-bundler
commit: |
|
Super small convenience comment:
|
fc6a2ab to
eb12b65
Compare
Adds the design record for first-class detached sub-agent runs with a durable named-method completion hook and progress/milestone signaling, in response to #1752. Co-authored-by: Cursor <cursoragent@cursor.com>
Implements the core of rfc-detached-agent-tools (#1752): - `runAgentTool(cls, { detached })` dispatches a sub-agent without awaiting, returning `{ runId, status: "running" }`. Fire-and-forget (`detached: true`) or a durable per-run callback (`detached: { onFinish: "methodName" }`). - Durable, eviction-surviving completion delivery via a single guarded funnel with two independent ledger slots (finish / give-up) using a claim+lease, so delivery is exactly-once on the happy path and at-least-once under failure — a premature give-up can never dedupe a child's real late completion (the #1752 production incident). - Warm fast path (waitUntil) + durable self-scheduling reconcile backbone (this.schedule) that self-cancels once no detached run remains. - Reconcile fork: detached runs are never sealed `interrupted` on a lost observer (the normal state for a background run); the backbone owns them and re-arms on restart. - Absolute `maxBudgetMs` give-up ceiling (default 24h, finite because a detached run has no observer to notice a leak) surfaced as `interrupted`/`budget-exceeded`. - `cancelAgentTool(runId)` by-id cancellation through the same guarded path. Schema bumped to v10 with detached + ledger columns (idempotent migrations). Co-authored-by: Cursor <cursoragent@cursor.com>
Drives the detached delivery funnel directly: exactly-once on terminal, dedupe under concurrent fast-path/backbone deliveries, and the independent finish/give-up slots so a budget give-up never dedupes a child's real late completion (#1752). Also switches the ledger claim to rowsWritten() since UPDATE ... RETURNING row counts are not a reliable claim signal on Workers SQLite. Co-authored-by: Cursor <cursoragent@cursor.com>
Adds a changeset (minor) and a "Detached (background) runs" section to the agent-tools doc covering the detached handle, durable onFinish, budget give-up, explicit cancellation, and the inspectAgentToolRun null-means-not-yet contract from #1752. Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Adds `detached: { notify: true }` sugar: when a detached sub-agent run
finishes, a Think agent injects the result back into the chat via
submitMessages (idempotent per run + status) so the model reacts, without
hand-wiring onFinish. Override formatDetachedCompletion() to customize. Wired
generically in the base Agent by resolving the conventional notify hook by name
so the core stays decoupled from the chat layer.
Co-authored-by: Cursor <cursoragent@cursor.com>
Adds a `research_background` tool that dispatches a Researcher with
`detached: { notify: true }` (returns immediately, result posted back into the
chat on completion) and a `cancelBackground(runId)` callable built on
cancelAgentTool. Updates the system prompt and README to cover the background
flow.
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
There was a problem hiding this comment.
Devin Review found 2 new potential issues.
⚠️ 1 issue in files not directly in the diff
⚠️ Warm-path milestone notifications silently fail because the run-row query omits the configuration column (packages/agents/src/index.ts:8668-8678)
The milestone configuration is never read on the low-latency warm path (_readAgentToolRun at packages/agents/src/index.ts:8668) because its SELECT omits detached_on_milestones, so _observeForwardedProgress → _maybeDeliverDetachedMilestone always sees undefined for the configuration and returns early without delivering.
Impact: Live milestone chat notifications (the detached: { onMilestones } convenience) never fire while the parent isolate is warm; they only arrive later via the cold backbone reconcile, which has its own complete SELECT.
Column omission in _readAgentToolRun vs _cfDetachedReconcileTick
The reconcile backbone query at packages/agents/src/index.ts:8431 selects all detached columns including detached_on_milestones, detached_no_progress_budget_ms, and last_progress_at. But the shared _readAgentToolRun query at packages/agents/src/index.ts:8668 only selects: detached, detached_on_finish, detached_notify_source, detached_max_budget_at, and the four ledger columns — omitting detached_on_milestones.
The warm-path flow:
- Child emits a milestone →
_forwardAgentToolStreamcalls_observeForwardedProgress(runId, body) _observeForwardedProgressparses the milestone, callsthis._readAgentToolRun(runId)- The returned row has
detached_on_milestonesasundefined(not selected) _maybeDeliverDetachedMilestoneevaluatesrow.detached_on_milestones ?? null→null→_parseAgentToolJson(null)→undefined→ no configured names → returns without delivering
The TypeScript type marks the field as optional (detached_on_milestones?: string | null), so no compile error surfaces the omission.
…runs
Adds two complementary signaling channels to long-running agent-tool runs,
plus a round of detached-run hardening from the deep review pass.
Progress (4a) — ephemeral:
- `reportProgress({ fraction, message, phase, data? })` emits transient
`data-agent-progress` parts; coalesced, latest-snapshot-only, persisted as
`progress_json`. Surfaced via `AgentToolRunState.progress`, the `onProgress`
hook, and `inspectAgentToolRun`.
- Shared `AgentToolProgressEmitter` centralizes coalescing/persistence so Think
and AIChatAgent share one code path; `forget(runId)` clears coalescing state
on terminal cleanup (no per-run map leak).
- No-progress budget (`detachedNoProgressBudgetMs`) gives up a detached run that
reports initial activity then goes silent; resets on any progress/milestone.
Milestones (4b) — durable:
- `reportProgress({ milestone, data })` persists an ordered, observable record
(new `cf_*_agent_tool_milestones` tables), broadcasts a `data-agent-milestone`
part (never coalesced), and surfaces `AgentToolRunState.milestones`.
- `detached: { onMilestones }` injects an at-most-once chat notification on both
the warm path and the reconcile cold path. Two modes — the `string[]`
shorthand defaults to "narrate":
- "narrate" (default): synthetic assistant message, no model turn — a cheap
status line for milestones the agent needn't act on.
- "react": user-role turn so the model responds (steer / start dependent
work). Costs a turn.
Override prose via `formatDetachedMilestone()`.
Client/UX:
- Reducer projects milestones (deduped by sequence) and advances the progress
snapshot monotonically. Synthetic notify messages render as "Agent event"
with milestone/result badges instead of a raw user/assistant bubble.
- agents-as-tools example emits a "sources-gathered" milestone and uses the
narrate shorthand; tray renders progress + milestone chips.
Detached-run hardening (review pass):
- Terminal `agent-tool-event` is always broadcast on cancel (synthesize seq).
- Generic `onFinish` delivery serialized against the turn queue via
`_runDetachedDelivery` (Think/AIChatAgent enqueue on `_turnQueue`).
- `_armDetachedBackbone` race guarded with an arming mutex.
- Failed detached deliveries re-throw and emit
`agent_tool:detached:delivery_failed`; accumulating live runs emit
`agent_tool:detached:live_count_warning`.
Schema bumped to 11.
Co-authored-by: Cursor <cursoragent@cursor.com>
eb12b65 to
2717be8
Compare
Document at the call site that the submission drain runs fire-and-forget, so a nested submission turn (e.g. a detached-finish notify calling submitMessages mid-turn) is admitted safely without deadlock — and why the flag is applied to all submissions rather than scoped to detached notify. Co-authored-by: Cursor <cursoragent@cursor.com>
AIChatAgent now implements the parent-side detached chat conveniences that were
previously Think-only, so `detached: { notify }` and `detached: { onMilestones }`
work on an AIChatAgent parent instead of being silently dropped (the docs and the
base _deliverDetachedMilestone JSDoc already claimed support).
- formatDetachedCompletion + _cfDetachedNotifyFinish (auto-wired by name) inject
the completion as a user turn the model reacts to; idempotent per (runId,
status) via a deterministic message id.
- formatDetachedMilestone + _deliverDetachedMilestone override deliver milestone
notifications: "narrate" (default) persists an assistant line with no model
turn; "react" injects a user turn + reply. Idempotent per (runId, name).
AIChatAgent has no durable-submission layer and TurnQueue has no re-entrancy
bypass, so the shared _injectDetachedNotification helper avoids self-deadlock by
dispatching the react turn three ways:
- inside our own serialized finish-delivery slot -> run inline + awaited (the
ledger only marks delivered after it resolves, so the reaction is
eviction-safe);
- inside a FOREIGN active turn (e.g. cancelAgentTool mid-turn) -> fire-and-forget
a turn that runs once the slot frees (inline would interleave, enqueue-await
would deadlock); the persisted message makes it best-effort-durable;
- otherwise -> enqueue + await normally.
An in-flight id set dedupes a concurrent warm-tail + backbone delivery of the
same milestone within one isolate (milestones have no ledger claim), while
messages.some() dedupes re-delivery against persisted history across eviction.
Tests: ai-chat detached-notify covers notify react + idempotency, narrate (no
model turn), and react milestones. Docs/changesets/RFC updated to reflect that
notify + onMilestones are chat-host conveniences (Think + AIChatAgent).
Co-authored-by: Cursor <cursoragent@cursor.com>
Closes #1752.
TL;DR
Adds a first-class detached mode to
runAgentToolso a parent agent can dispatch a sub-agent, keep working, and be notified exactly once when the child finishes — even across parent Durable Object eviction. On top of that, sub-agents can stream ephemeral progress and emit durable milestones that surface live in the parent's UI and (for chat agents) can narrate themselves back into the conversation. This is the framework-owned version of the ~200 lines of poll / fast-path / idempotency / race-handling glue @rwdaigle described in #1752, plus the progress/milestone tier the RFC laid out.For chat agents there's a one-liner that injects the result back into the conversation so the model reacts to it:
And the child can report progress / milestones as it works:
1. Detached dispatch + durable, exactly-once completion
runAgentTool(Cls, { input, detached: true | { onFinish, maxBudgetMs, notify, onMilestones, noProgressBudgetMs } })returnsDetachedRunAgentToolResult({ runId, agentType, status }) immediately, without awaiting. The full run lifecycle — run row incf_agent_tool_runs,agent-tool-eventbroadcast, child recovery,onAgentToolStart/onAgentToolFinish, cost — fires regardless, exactly as on the awaited path. Detached runs deliberately inherit noAbortSignal(the child must outlive the spawning turn); cancel explicitly withcancelAgentTool(runId).Delivery is a two-tier mechanism with a claim+lease ledger:
ctx.waitUntiloff the dispatch cuts latency while the parent isolate is warm.this.schedulecallback (_cfDetachedReconcileTick, backoff cadence[5s, 15s, 30s, 120s]) that inspects the child to terminal, survives parent DO eviction, and re-arms on parent startup whenever outstanding detached runs exist. Arming is serialized so concurrent detached dispatches in one turn can't race to create multiple backbones.*_claimed_at/*_delivered_atcolumns claimed atomically via SQLiterowsWrittenunder aDETACHED_DELIVERY_LEASE_MSlease. The happy path delivers exactly once; a crash mid-delivery re-delivers (at-least-once), never zero.onFinish(named method, resolved bykeyof thisso it survives rehydration) and the globalonAgentToolFinishboth fire.This resolves the two sharp edges from #1752:
nullis not proof a run is gone. The reconciler treats anullinspect as "not yet reconciled," never as failure, and detached runs are excluded from the await-styleinterruptedsealing — so a poll racing the child's first write can't manufacture a spurious "outcome unconfirmed."interrupted/budget-exceeded) and a later real completion — deliver as two distinct events. A premature give-up can never dedupe away the child's real summary.Bounded. An absolute
maxBudgetMsceiling (default 24h, configurable via thedetachedMaxBudgetMsstatic option) gives up — surfaced asinterruptedwith thebudget-exceededreason — and tears the child down so an abandoned run can't hold a concurrency slot forever.2. Cancellation
cancelAgentTool(runId)cancels a detached (or awaited) run by id through the same guarded path, so a wiredonFinishstill fires once withstatus: "aborted", and the terminalagent-tool-eventis always broadcast to connected clients (a cancelled run's UI settles immediately).3. Chat notify (
@cloudflare/think)detached: { notify: true }(or{ notify: { source } }) injects the completion into chat viasubmitMessages, idempotent perrunId+ terminal status, so the model reacts to the result without you wiringonFinishby hand. OverrideformatDetachedCompletion(run, result)to customize or suppress the text.4. Progress (4a) + durable milestones (4b)
A sub-agent (awaited or detached) can report mid-run progress that rides its own turn stream and re-broadcasts to the parent's connected clients, surfacing on
AgentToolRunState.progress/.milestonesviauseAgentToolEvents— a background-runs tray can render a live bar / phase / milestone chips without drilling in.reportProgress({ fraction?, message?, phase?, data? }, { persist? })— ephemeral, latest-wins coalesced (fraction >= 1always flushes), broadcast as a transientdata-agent-progresspart. The run id is resolved from the active turn — no threading. A no-op (with a dev warning) on the baseAgentand outside an active run.reportProgress({ milestone, data })— promotes a signal to durable: one row per milestone (cf_agent_tool_milestones/cf_ai_chat_agent_tool_milestones) with a monotonic per-runsequence, broadcast as a persisteddata-agent-milestonepart so drill-in replay and a rehydrated parent both see it.onProgress(run, progress)parent hook fires best-effort for both progress and milestones (the snapshot carriesprogress.milestoneso consumers can branch).Latest-snapshot persistence + recovery inspect via
progress_json/last_signal_at, surfaced throughinspectAgentToolRun().progress/.milestones.Resetting no-progress budget for detached runs. Once a detached child reports at least one signal, the backbone gives up if it then goes silent for
detachedNoProgressBudgetMs(default 1h; per-rundetached: { noProgressBudgetMs }), surfaced asinterrupted/ reasonno-progress. A child that never reports is bounded only by the absolute ceiling — we never give up on a run merely for being slow.Think
detached: { onMilestones }convenience. When a configured milestone lands, Think surfaces an idempotent synthetic chat message (keyed per(runId, name)) before the run finishes — from both the warm tail and the cold backbone, collapsed to at-most-once. Two modes (thestring[]shorthand defaults to"narrate"):"narrate"(default) — a synthetic assistant message injected directly (no inference): a cheap status line that does not trigger a turn."react"— a user-role turn so the model responds (steer / start dependent work). Costs a model turn.Override wording via
formatDetachedMilestone(run, milestone). Synthetic notify/milestone messages carrymetadata.sourceso clients can render them as an agent event rather than a human turn (the example does this).Observability
New events:
agent_tool:detached:delivery_failed(a wired callback threw; the slot stays open for retry) andagent_tool:detached:live_count_warning(edge-triggered when live detached runs cross a threshold — a leak smoke alarm, since detached runs hold a concurrency slot for their whole life).Public API
runAgentTool(Cls, { …, detached })overload →DetachedRunAgentToolResult.DetachedAgentToolConfig:{ onFinish?, maxBudgetMs?, noProgressBudgetMs?, notify?, onMilestones? }.cancelAgentTool(runId)— idempotent explicit cancellation.reportProgress(...)on chat agents;onProgress(run, progress)parent hook.detachedMaxBudgetMs(24h) anddetachedNoProgressBudgetMs(1h).AgentToolInterruptedReasonmembers"budget-exceeded"and"no-progress".@cloudflare/think:detached: { notify, onMilestones };formatDetachedCompletion/formatDetachedMilestone.Storage / migration
Schema bumped to v11. New columns on
cf_agent_tool_runs(added viaaddColumnIfNotExists, so existing DOs migrate forward in place):detached,detached_on_finish,detached_max_budget_at,detached_on_milestones,progress_json,last_progress_at, and the four ledger columnsfinish_claimed_at/finish_delivered_at/give_up_claimed_at/give_up_delivered_at. New per-host milestone tablescf_agent_tool_milestones/cf_ai_chat_agent_tool_milestones.cf_agent_tool_runsis created lazily (not part of the constructor DDL snapshot), so no snapshot change.Tests
packages/agents/src/tests/agent-tool-detached.test.ts— ledger firesonFinish+ global hook exactly once; dedupes fast-path vs backbone races; delivers give-up and a later real completion as two independent slots; never re-delivers a give-up.packages/agents/src/chat/__tests__/agent-tools.test.ts— emitter coalescing + milestone emit; client reducer projection of progress/milestones (deduped by sequence, monotonic snapshot).packages/think/src/tests/submissions.test.ts— detached notify nested-admission drain; milestone notify idempotency across re-delivery in bothreactandnarratemodes.Example
examples/agents-as-toolshas aresearch_backgroundtool (detached: { notify, onMilestones: ["sources-gathered"] }) that returns immediately, emits a milestone mid-run, posts the result back into the chat on completion, and acancelBackground(runId)callable — so reviewers can try the end-to-end flow in a preview build. The client renders a background-runs tray with progress + milestone chips and renders synthetic messages as agent events.Scope (what's deferred)
Only the RFC's awaitable join point —
awaitAgentToolMilestone(phase 4c) — is intentionally deferred, gated behind a short design addendum + a real consumer. Everything else from the RFC ships here.Validation
pnpm run check— all projects typecheck + lint + format ✅agents·@cloudflare/think·@cloudflare/ai-chat·agents-as-toolstests ✅Review starting points
design/rfc-detached-agent-tools.md— full design + rationale + phasing.packages/agents/src/index.ts—runAgentTooldetached branch,_parseDetachedOption,_deliverDetachedTerminal(claim+lease),_detachedFastPath,_cfDetachedReconcileTick(backbone + no-progress budget),cancelAgentTool,_maybeDeliverDetachedMilestone.packages/agents/src/chat/agent-tools.ts—AgentToolProgressEmitter, progress/milestone reducer projection.packages/think/src/think.ts—_cfDetachedNotifyFinish,_deliverDetachedMilestone(narrate/react), milestone persistence.cc @rwdaigle — direct response to your write-up; would love your eyes on the shape.
Made with Cursor