Skip to content

fix(agent): classify upstream prompt failures without replay#1782

Open
tatoalo wants to merge 2 commits intomainfrom
fix/agent-classified-upstream-errors
Open

fix(agent): classify upstream prompt failures without replay#1782
tatoalo wants to merge 2 commits intomainfrom
fix/agent-classified-upstream-errors

Conversation

@tatoalo
Copy link
Copy Markdown
Contributor

@tatoalo tatoalo commented Apr 21, 2026

Problem

When the upstream Claude prompt stream terminated unexpectedly, the server surfaced a generic failure that did not preserve the underlying reason. Prompt-level replay also risked rerunning work after execution had already started, which is unsafe for non-idempotent ops

Changes

  • added structured upstream error classification in the Claude ACP conversion layer so terminated streams and connection failures can be identified without relying only on string matching
  • removed prompt-level replay from the initial and resume send paths so transient upstream failures do not rerun a prompt after execution may already have started

@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented Apr 21, 2026

Prompt To Fix All With AI
This is a comment left during a code review.
Path: packages/agent/src/adapters/claude/conversion/sdk-to-acp.ts
Line: 637-644

Comment:
**Exported but unused function**

`isTransientAgentError` is exported here but is not imported or called anywhere in the codebase. The prompt-level retry logic that presumably consumed it was removed in this PR, leaving this as dead code — violating the "no superfluous parts" simplicity rule.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: packages/agent/src/server/question-relay.test.ts
Line: 544-630

Comment:
**Regex patterns in `classifyAgentError` are untested**

Both new tests exercise the structured `data.classification` path — the error fixtures have `error.data.classification` pre-set, so `extractErrorClassification` returns early and `classifyAgentError`'s regex matching is never exercised. The two regexes (`API Error:\s*terminated\b` and `API Error:\s*Connection error\b`) therefore have no coverage. Given the team's preference for parameterised tests, a small `it.each` table directly on `classifyAgentError` (e.g. `"API Error: terminated"``upstream_stream_terminated`, `"API Error: Connection error"``upstream_connection_error`, arbitrary string → `agent_error`) would close this gap.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "fix(agent): classify upstream prompt fai..." | Re-trigger Greptile

Comment thread packages/agent/src/adapters/claude/conversion/sdk-to-acp.ts Outdated
Comment thread packages/agent/src/server/question-relay.test.ts
@tatoalo tatoalo self-assigned this Apr 21, 2026
@tatoalo tatoalo requested a review from a team April 21, 2026 17:28
if (!result) return "agent_error";
const text = result.trim();
// Anthropic SDK surfaces an undici fetch abort as "API Error: terminated".
if (/API Error:\s*terminated\b/i.test(text)) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you love a little bit of regex error message matching


// Prefer the structured `data` carried on RequestError if present.
const data = (error as { data?: unknown } | undefined)?.data;
if (
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks like zod would simplify it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants