avoid scheduling duplicate repeating tasks when task times out#10
avoid scheduling duplicate repeating tasks when task times out#10
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 96b20f5c29
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| { | ||
| name: task.name, | ||
| scheduledAt, | ||
| repeatAfterMS: task.repeatAfterMS, | ||
| params: task.params, | ||
| previousTaskId: task._id, | ||
| originalTaskId: task.originalTaskId || task._id, | ||
| timeoutMS: task.timeoutMS, | ||
| schedulingTimeoutAt: scheduledAt.valueOf() + 10 * 60 * 1000 | ||
| }); | ||
| } | ||
| previousTaskId: task._id | ||
| }, |
There was a problem hiding this comment.
Add a unique key for repeat-task upserts
When two workers enter _handleRepeatingTask() concurrently for the same timed-out or scheduling-timed-out task, both updateOne(..., { upsert: true }) calls can evaluate this filter before either insert is visible and both insert a repeat task; the schema only declares a non-unique { status, scheduledAt } index, so MongoDB has no unique constraint to collapse that race. The new sequential tests pass, but the production race this change targets still remains unless these fields are backed by a unique index or another atomic guard.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Pull request overview
This PR aims to prevent duplicate “next run” documents for repeating tasks when a task times out, particularly under race conditions between task polling/execution and the timeout-expiry logic.
Changes:
- Treat handler-level execution timeouts as a distinct
timed_outstatus (via aTimeoutError) instead of marking them asfailed. - Add a 10-minute buffer to
expireTimedOutTasks()so it only expires tasks that are stillin_progresswell after theirtimeoutAt. - Change repeating-task scheduling to use an upsert-based write and add regression tests for duplicate scheduling scenarios.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| test/task.test.js | Updates timeout expectations and adds tests to ensure repeats aren’t duplicated under race/double-handle scenarios. |
| src/taskSchema.js | Adds buffered timeout expiry, introduces TimeoutError for execution timeouts, and uses upsert logic when scheduling repeating tasks. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| return Task.updateOne( | ||
| { | ||
| name: task.name, | ||
| scheduledAt, | ||
| repeatAfterMS: task.repeatAfterMS, | ||
| params: task.params, | ||
| previousTaskId: task._id, | ||
| originalTaskId: task.originalTaskId || task._id, | ||
| timeoutMS: task.timeoutMS, | ||
| schedulingTimeoutAt: scheduledAt.valueOf() + 10 * 60 * 1000 | ||
| }); | ||
| } | ||
| previousTaskId: task._id | ||
| }, | ||
| { | ||
| $setOnInsert: { | ||
| name: task.name, | ||
| scheduledAt, | ||
| repeatAfterMS: task.repeatAfterMS, | ||
| params: task.params, | ||
| previousTaskId: task._id, | ||
| originalTaskId: task.originalTaskId || task._id, | ||
| timeoutMS: task.timeoutMS, | ||
| schedulingTimeoutAt: scheduledAt.valueOf() + 10 * 60 * 1000 | ||
| } | ||
| }, | ||
| { upsert: true } | ||
| ); |
There was a problem hiding this comment.
This is a reasonable concern, but highly unlikely given the buffer we give for hanging tasks. Will consider improving this for the future.
No description provided.