fix(gitlab): split large group sync into bounded subgroup fetches #1351
RitwijParmar wants to merge 3 commits into
No actionable comments were generated in the recent review. 🎉
Files selected for processing: 2. Files skipped from review as similar to previous changes: 1.
Walkthrough
Replaces the single …
Changes
GitLab group tree traversal with bounded pagination
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 5 passed
Actionable comments posted: 1
🧹 Nitpick comments (1)
packages/backend/src/gitlab.ts (1)
56-60: ⚡ Quick win: Avoid `Array.shift()` in the traversal queue for large group trees.

Line 60 makes dequeue O(n), so walking large subgroup trees becomes O(n²). This path is exactly the large-namespace hot path this PR is optimizing.
♻️ Proposed refactor

```diff
 export const getGitLabProjectsForGroupTree = async (
     api: GitLabApi,
     rootGroup: string,
 ): Promise<ProjectSchema[]> => {
     const projectsById = new Map<number, ProjectSchema>();
     const groupsToVisit: Array<string | number> = [rootGroup];
+    let queueIndex = 0;
     const visitedGroups = new Set<string>();

-    while (groupsToVisit.length > 0) {
-        const group = groupsToVisit.shift()!;
+    while (queueIndex < groupsToVisit.length) {
+        const group = groupsToVisit[queueIndex++]!;
         const groupKey = String(group);
         if (visitedGroups.has(groupKey)) {
             continue;
         }
         visitedGroups.add(groupKey);
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/backend/src/gitlab.ts` around lines 56 - 60, The traversal queue is using Array.shift() which performs in O(n) time, causing the overall group tree traversal to be O(n²) for large subgroup trees. Replace the shift-based dequeuing with an index-based approach by adding an index variable to track the current position in the groupsToVisit array, then access elements via that index instead of calling shift(). This will make dequeue operations O(1) and improve the overall performance of the traversal logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/backend/src/gitlab.ts`:
- Around line 31-46: The pagination loop in the while block trusts that
response.paginationInfo?.next always contains a new page value that advances
forward, but if the upstream response returns the same page or a regressing page
number, the loop will never terminate and hang the worker. Add a guard condition
after checking if nextPage exists to verify that nextPage is actually greater
than the current page value, and if it is not advancing (i.e., nextPage is less
than or equal to page), break out of the loop to prevent an infinite sync hang.
---
Nitpick comments:
In `@packages/backend/src/gitlab.ts`:
- Around line 56-60: The traversal queue is using Array.shift() which performs
in O(n) time, causing the overall group tree traversal to be O(n²) for large
subgroup trees. Replace the shift-based dequeuing with an index-based approach
by adding an index variable to track the current position in the groupsToVisit
array, then access elements via that index instead of calling shift(). This will
make dequeue operations O(1) and improve the overall performance of the
traversal logic.
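
The inline comment on lines 31-46 asks for a guard against a `next` page that does not advance. A minimal sketch of that guard follows, not the code in `packages/backend/src/gitlab.ts`. The `Page` and `FetchPage` types and the `advancePage` and `fetchAllPages` helpers are hypothetical names introduced for illustration, and the `next` field stands in for `response.paginationInfo?.next`.

```typescript
// Hypothetical shape of one page of results; not the real GitLab client types.
interface Page<T> {
    items: T[];
    // Next page number reported by the server, if any.
    next?: number | null;
}

// Hypothetical page fetcher: resolves the contents of the requested page.
type FetchPage<T> = (page: number) => Promise<Page<T>>;

// Returns the page to request next, or null when the loop should stop.
// The loop stops when no next page is reported, or when the reported page
// is less than or equal to the current one (a stuck or regressing cursor).
function advancePage(current: number, next?: number | null): number | null {
    if (next === undefined || next === null) {
        return null;
    }
    return next > current ? next : null;
}

// Collects items from every page, terminating even if the server repeats a page.
async function fetchAllPages<T>(fetchPage: FetchPage<T>, firstPage = 1): Promise<T[]> {
    const items: T[] = [];
    let page: number | null = firstPage;
    while (page !== null) {
        const response = await fetchPage(page);
        items.push(...response.items);
        page = advancePage(page, response.next);
    }
    return items;
}
```

Keeping the decision in a small pure function makes the stop conditions easy to unit test without mocking the API client.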
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: cd5ed975-df6a-4aad-bfdb-fcd6e0b5eda4
📒 Files selected for processing (3)
- CHANGELOG.md
- packages/backend/src/gitlab.test.ts
- packages/backend/src/gitlab.ts
Fixes #1139
Summary
- `includeSubgroups: true` project query with recursive subgroup traversal
- `perPage: 100` requests

This avoids the expensive GitLab server-side query that can time out for large namespaces such as `redhat/centos-stream/*`, while preserving recursive subgroup coverage.
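
The traversal described in the summary above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: `GroupClient`, `listProjects`, `listSubgroups`, `collectPages`, and `projectsForGroupTree` are hypothetical names, and the real code goes through the GitLab API client instead. The sketch walks the group tree breadth-first, fetches each group's direct projects and subgroups in pages of at most 100, and de-duplicates projects by id.

```typescript
// Hypothetical data shapes for illustration only.
interface Project { id: number; name: string; }
interface Subgroup { id: number; }
interface Page<T> { items: T[]; next?: number | null; }

// Hypothetical client: each call returns one bounded page for a single group.
interface GroupClient {
    listProjects(group: string | number, page: number, perPage: number): Promise<Page<Project>>;
    listSubgroups(group: string | number, page: number, perPage: number): Promise<Page<Subgroup>>;
}

const PER_PAGE = 100;

// Fetches pages until no next page is reported or the page stops advancing.
async function collectPages<T>(fetchPage: (page: number) => Promise<Page<T>>): Promise<T[]> {
    const items: T[] = [];
    let page = 1;
    for (;;) {
        const response = await fetchPage(page);
        items.push(...response.items);
        const next = response.next;
        if (next === undefined || next === null || next <= page) {
            break;
        }
        page = next;
    }
    return items;
}

// Walks the group tree with an index-based queue (O(1) dequeue) and
// collects each group's direct projects, keyed by project id.
async function projectsForGroupTree(
    client: GroupClient,
    rootGroup: string | number,
): Promise<Project[]> {
    const projectsById = new Map<number, Project>();
    const groupsToVisit: Array<string | number> = [rootGroup];
    const visited = new Set<string>();

    for (let i = 0; i < groupsToVisit.length; i++) {
        const group = groupsToVisit[i]!;
        const key = String(group);
        if (visited.has(key)) {
            continue;
        }
        visited.add(key);

        const projects = await collectPages((page) => client.listProjects(group, page, PER_PAGE));
        for (const project of projects) {
            projectsById.set(project.id, project);
        }

        const subgroups = await collectPages((page) => client.listSubgroups(group, page, PER_PAGE));
        for (const subgroup of subgroups) {
            groupsToVisit.push(subgroup.id);
        }
    }
    return Array.from(projectsById.values());
}
```

Each request in this sketch touches only one group's direct children, so no single call asks the server to resolve the whole namespace. The trade-off is more round trips for deep trees.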