fix(argo-workflows): handle Error phase webhook events#1112
Conversation
Argo Workflows finalize as `Error` (controller/infra failure) or `Failed` (user-code exit). The webhook handler only mapped `Failed`; `Error` events silently dropped at the unmapped-phase guard, leaving jobs stuck in_progress. Map `Error` -> JobStatus.Failure. Drop the workflowName fallback in getJobId since job.id is a UUID and workflowName never matches; surface missing labels as a warn log instead of a guaranteed no-op UPDATE. Add a notInArray guard on the UPDATE so the ~13 near-simultaneous sensor fires per workflow can't regress a terminal job. Add structured logs at every drop/update path so future silent failures stay loud.
|
Warning Rate limit exceeded
To continue reviewing without waiting, purchase usage credits in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (3)
📝 WalkthroughWalkthroughThis PR enhances Argo workflow webhook handling with explicit status mapping, strict input validation, and idempotent database updates. It adds structured logging at the webhook entry point and throughout job status updates to improve observability while preventing race conditions where late events could overwrite terminal job states. ChangesArgo Workflow Status & Idempotency
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
apps/api/src/routes/argoworkflow/index.ts (1)
51-59: ⚡ Quick winAdd an explicit payload type before logging/handling.
req.bodyis used as an untyped object in this path; adding a concrete payload type (or runtime-validated parse result) here would make Line 53–57 field access and thehandleArgoWorkflowcall safer.As per coding guidelines, "**/*.{ts,tsx}: Use explicit types in TypeScript code".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/api/src/routes/argoworkflow/index.ts` around lines 51 - 59, The code uses req.body as an untyped payload when calling logger.info and handleArgoWorkflow; declare or import a concrete payload type (e.g., ArgoWorkflowPayload) and replace the untyped payload with a typed/validated variable before logging and calling handleArgoWorkflow—either by typing the parsed body (const payload: ArgoWorkflowPayload = req.body) or by runtime-validating and casting (e.g., with zod/validator) to produce a typedPayload; then pass that typedPayload to logger.info and handleArgoWorkflow so field access and the function call are statically/clinically safe.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@apps/api/src/routes/argoworkflow/index.ts`:
- Around line 51-59: The code uses req.body as an untyped payload when calling
logger.info and handleArgoWorkflow; declare or import a concrete payload type
(e.g., ArgoWorkflowPayload) and replace the untyped payload with a
typed/validated variable before logging and calling handleArgoWorkflow—either by
typing the parsed body (const payload: ArgoWorkflowPayload = req.body) or by
runtime-validating and casting (e.g., with zod/validator) to produce a
typedPayload; then pass that typedPayload to logger.info and handleArgoWorkflow
so field access and the function call are statically/clinically safe.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: cf10fc61-3d86-416f-89a9-590d30f17aea
📒 Files selected for processing (3)
apps/api/src/routes/argoworkflow/__tests__/workflow.test.tsapps/api/src/routes/argoworkflow/index.tsapps/api/src/routes/argoworkflow/workflow.ts
pnpm 11 places global binaries in $PNPM_HOME/bin, not $PNPM_HOME directly. The previous PATH entry pointed at /pnpm so pnpm refused to install globals with "configured global bin directory /pnpm/bin is not in PATH". Surfaced when corepack@latest pulled pnpm 11.0.8; older pnpm tolerated the mismatch.
Export ArgoWorkflowPayload and cast req.body to it in the route handler so field access in the entry log and the call into handleArgoWorkflow is checked by tsc instead of relying on Express's `any`-typed body. Drop the now-redundant optional chaining on payload fields (lint flagged them as unnecessary). No runtime validation added — webhook stays auth-gated by shared secret, matching the TFE and GitHub webhook handlers.
Argo Workflows finalize as
Error(controller/infra failure) orFailed(user-code exit). The webhook handler only mappedFailed;Errorevents silently dropped at the unmapped-phase guard, leaving jobs stuck in_progress.Map
Error-> JobStatus.Failure. Drop the workflowName fallback in getJobId since job.id is a UUID and workflowName never matches; surface missing labels as a warn log instead of a guaranteed no-op UPDATE. Add a notInArray guard on the UPDATE so the ~13 near-simultaneous sensor fires per workflow can't regress a terminal job. Add structured logs at every drop/update path so future silent failures stay loud.Summary by CodeRabbit
Bug Fixes
Tests
Chores