fix: null-safe getTime() calls in replication services (#3519) #3974
deepshekhardas wants to merge 1 commit into
…#3519)

Add optional chaining to getTime() calls on the createdAt and updatedAt fields in the runs and sessions replication services. When CDC sends rows before timestamps are fully populated, calling .getTime() on undefined crashes the replication service with a TypeError.

- runsReplicationService.server.ts: null-safe updatedAt/createdAt in #prepareTaskRunInsert and #preparePayloadInsert
- sessionsReplicationService.server.ts: null-safe createdAt/updatedAt in toSessionInsertArray
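The failure mode and the fix described above can be illustrated with a minimal sketch. The `ReplicatedRow` type and both helper functions are hypothetical names used only for illustration; they are not taken from the replication services, whose actual types are not shown in this PR.

```typescript
// Hypothetical row shape: CDC may deliver a row before its timestamps are populated.
type ReplicatedRow = {
  id: string;
  createdAt?: Date | null;
  updatedAt?: Date | null;
};

// Unsafe variant: throws a TypeError at runtime when createdAt is undefined or null.
function unsafeCreatedAtMs(row: ReplicatedRow): number {
  return (row.createdAt as Date).getTime();
}

// Null-safe variant mirroring the pattern in this PR: optional chaining plus a fallback.
// The fallback is a parameter here (defaulting to Date.now()) so the behavior is testable.
function safeCreatedAtMs(row: ReplicatedRow, fallbackMs: number = Date.now()): number {
  return row.createdAt?.getTime() ?? fallbackMs;
}

const completeRow: ReplicatedRow = { id: "run_1", createdAt: new Date(1700000000000) };
const partialRow: ReplicatedRow = { id: "run_2" }; // timestamps not yet populated
```

With these definitions, `safeCreatedAtMs(completeRow)` returns the real timestamp, `safeCreatedAtMs(partialRow)` returns the fallback, and `unsafeCreatedAtMs(partialRow)` throws.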
Hi @deepshekhardas, thanks for your interest in contributing! This project requires that pull request authors are vouched, and you are not in the list of vouched users. This PR will be closed automatically. See https://github.com/triggerdotdev/trigger.dev/blob/main/CONTRIBUTING.md for more details.
run.updatedAt?.getTime() ?? Date.now(), // updated_at
run.createdAt?.getTime() ?? Date.now(), // created_at
🚩 Date.now() fallback for created_at could produce non-deduplicatable rows in ClickHouse
The ClickHouse task_runs_v2 table uses created_at in both the sort key (ORDER BY (organization_id, project_id, environment_id, created_at, run_id)) and the partition key (PARTITION BY toYYYYMM(created_at)). ReplacingMergeTree deduplicates rows based on the sort key. If createdAt is null for a replication event and Date.now() is used as a fallback, the fabricated created_at value would differ from the real timestamp on any subsequent replication event (e.g., an update) for the same run. This means the two rows would have different sort keys and ClickHouse would not deduplicate them, even with FINAL. They could also land in different partitions. This is an inherent trade-off of the Date.now() approach — the alternative (crashing) would halt the entire replication pipeline. A possible improvement would be to skip the row entirely (similar to the !run.environmentType || !run.organizationId guard at runsReplicationService.server.ts:1032) and log a warning, rather than inserting potentially unreplaceable data. The same concern applies to sessionsReplicationService.server.ts if the sessions table has a similar schema.
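The skip-and-warn alternative suggested above could look roughly like the following sketch. Everything here is an assumption made for illustration: the `RunRow` and `PreparedInsert` shapes, the `prepareInsertOrSkip` name, and the injected `warn` callback are hypothetical and do not reflect the actual code in runsReplicationService.server.ts. Only createdAt is treated as required, since it is the field that feeds the sort and partition keys; updatedAt keeps the fallback used in this PR.

```typescript
// Hypothetical, simplified run shape; the real service's types are not shown in this PR.
type RunRow = {
  id: string;
  organizationId?: string;
  environmentType?: string;
  createdAt?: Date | null;
  updatedAt?: Date | null;
};

// Hypothetical output shape standing in for the real insert array.
type PreparedInsert = { runId: string; createdAtMs: number; updatedAtMs: number };

// Returns null instead of fabricating created_at, so no row with a made-up
// sort key ever reaches ClickHouse. `warn` and `nowMs` are injectable for testing.
function prepareInsertOrSkip(
  run: RunRow,
  warn: (message: string) => void = console.warn,
  nowMs: number = Date.now()
): PreparedInsert | null {
  // Existing style of guard mentioned in the review comment.
  if (!run.environmentType || !run.organizationId) {
    return null;
  }
  // Proposed guard: created_at is part of the sort and partition keys, so skip the row.
  if (!run.createdAt) {
    warn(`Skipping run ${run.id}: createdAt is missing`);
    return null;
  }
  return {
    runId: run.id,
    createdAtMs: run.createdAt.getTime(),
    // updated_at is not part of the sort key, so the PR's fallback is kept here.
    updatedAtMs: run.updatedAt?.getTime() ?? nowMs,
  };
}

// Usage: drop skipped rows from a batch before inserting.
function prepareBatch(runs: RunRow[], warn?: (message: string) => void): PreparedInsert[] {
  return runs
    .map((run) => prepareInsertOrSkip(run, warn))
    .filter((row): row is PreparedInsert => row !== null);
}
```

The trade-off is that a skipped row is absent until a later replication event carries the full timestamps, whereas a row inserted with a fabricated created_at may never be deduplicated against the real one.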
Caution: Review failed. The pull request is closed.
Walkthrough
Two replication service files are updated to handle missing timestamp fields defensively.
Warning: There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.
ESLint: ESLint install timed out. The project may have too many dependencies for the sandbox.