Batch ref-update commands to stay under receive-pack cap #94

Merged
Soph merged 4 commits into main from fix/batch-ref-updates-receive-pack on Jun 18, 2026

@Soph Soph commented Jun 18, 2026

Problem

Syncing a repo with many refs fails with a server-side rejection:

sync: replicate relay failed: ... too many ref-update commands: 55006 (limit 25000)

entire-server caps a single receive-pack request at 25,000 ref-update commands (server/githttp.maxRefUpdateCommands). git-sync's relay/materialize strategies (replicate, incremental, materialized) sent every ref-update command in one request, so a repo with 55k refs exceeded the cap in a single push. Only the bootstrap strategy batched, and only by pack size — never by command count.

Fix

Batch inside the gitproto push primitives (PushPack, PushCommands, PushObjects) so every strategy benefits from one place:

  • PushPack / PushObjects send the pack with the first batch, then the remaining refs as ref-only follow-up batches. The pack carries every object for the whole push, and receive-pack commits the entire received pack — verified at both ends:

    • entire-server: CommitQuarantinedFanout ("we commit even when some refs failed connectivity… the pack may carry objects shared with refs that did pass")
    • canonical git: tmp_objdir_migrate migrates the whole quarantine, no reachability pruning

    So later ref-only batches resolve against objects already on the server.

  • PushCommands chunks all commands under the per-push limit.

A shared splitFirstBatch / chunkRefUpdates pair keeps the splitting logic in one place.
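
As a rough illustration of that split, here is a minimal Go sketch. The function names mirror the ones mentioned above, but the bodies, signatures, and the `RefUpdate` type are assumptions made for illustration and are not taken from the git-sync source.

```go
package main

import "fmt"

// RefUpdate is a hypothetical stand-in for one receive-pack command
// (old object ID, new object ID, ref name).
type RefUpdate struct {
	Name   string
	OldOID string
	NewOID string
}

// chunkRefUpdates splits cmds into consecutive batches of at most limit
// commands. A non-positive limit yields a single batch (assumed behavior).
func chunkRefUpdates(cmds []RefUpdate, limit int) [][]RefUpdate {
	if len(cmds) == 0 {
		return nil
	}
	if limit <= 0 || len(cmds) <= limit {
		return [][]RefUpdate{cmds}
	}
	batches := make([][]RefUpdate, 0, (len(cmds)+limit-1)/limit)
	for start := 0; start < len(cmds); start += limit {
		end := start + limit
		if end > len(cmds) {
			end = len(cmds)
		}
		batches = append(batches, cmds[start:end])
	}
	return batches
}

// splitFirstBatch returns the batch that would travel with the pack and
// the remaining batches that would be sent as ref-only follow-ups.
func splitFirstBatch(cmds []RefUpdate, limit int) (first []RefUpdate, rest [][]RefUpdate) {
	batches := chunkRefUpdates(cmds, limit)
	if len(batches) == 0 {
		return nil, nil
	}
	return batches[0], batches[1:]
}

func main() {
	cmds := make([]RefUpdate, 12)
	for i := range cmds {
		cmds[i].Name = fmt.Sprintf("refs/heads/b%d", i)
	}
	first, rest := splitFirstBatch(cmds, 5)
	// 12 commands with a limit of 5 give batches of 5, 5, and 2.
	fmt.Println(len(first), len(rest), len(rest[len(rest)-1])) // prints: 5 2 2
}
```

Building the first-batch split on top of the chunker means both code paths share one boundary calculation, which is the property the PR description is after.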

Batch size: default 5,000, configurable

Tested end-to-end mirroring a real 55,008-ref repo between two GitHub repos. Findings:

| Batch size | entire-server | GitHub |
|---|---|---|
| 20,000 | | ❌ 500 Internal Server Error |
| 10,000 | | ❌ 500 |
| 5,000 | | |

GitHub's receive-pack returns 500 Internal Server Error when a single push updates ~10k refs at once (the pack is accepted; it fails applying the ref updates) — far below entire-server's 25k cap. So the default is 5,000, with two override paths:

  • --target-max-ref-updates N CLI flag (on replicate, sync, plan, bootstrap).
  • GITSYNC_MAX_REF_UPDATES_PER_PUSH env var.

Precedence: flag/Pusher.MaxRefUpdates > env > default. Invalid/non-positive → default. The value rides on the Pusher (no global mutable state) and is plumbed flag → Options → syncer config → Pusher. Raise it for entire-server targets (up to 25k) to cut round trips; lower it for stricter providers.
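
The precedence rule can be sketched roughly as follows. This is an illustrative sketch only: `resolveMaxRefUpdates`, its signature, and the injected `getenv` parameter are hypothetical and not the names used in git-sync; only the env var name, the 5,000 default, and the ordering come from the description above.

```go
package main

import (
	"fmt"
	"strconv"
)

// Default and env var name as stated in the PR description.
const defaultMaxRefUpdates = 5000
const envMaxRefUpdates = "GITSYNC_MAX_REF_UPDATES_PER_PUSH"

// resolveMaxRefUpdates applies: explicit value (flag / Pusher.MaxRefUpdates)
// > env var > default. Invalid or non-positive values fall through to the
// next source. getenv is injected so the sketch does not depend on the
// real process environment.
func resolveMaxRefUpdates(explicit int, getenv func(string) string) int {
	if explicit > 0 {
		return explicit
	}
	if raw := getenv(envMaxRefUpdates); raw != "" {
		if n, err := strconv.Atoi(raw); err == nil && n > 0 {
			return n
		}
	}
	return defaultMaxRefUpdates
}

func main() {
	fakeEnv := map[string]string{envMaxRefUpdates: "10000"}
	getenv := func(k string) string { return fakeEnv[k] }

	fmt.Println(resolveMaxRefUpdates(25000, getenv)) // explicit wins: 25000
	fmt.Println(resolveMaxRefUpdates(0, getenv))     // env wins: 10000

	fakeEnv[envMaxRefUpdates] = "-1"
	fmt.Println(resolveMaxRefUpdates(0, getenv)) // invalid env, default: 5000
}
```

In real use one would presumably pass `os.Getenv` as the lookup function; the map-backed lookup here only keeps the example deterministic.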

Progress output

Ref-only follow-up batches carry no useful sideband progress, so a large push used to print a bare target: line per batch. They now push with progress suppressed; with --verbose, each batch emits one concise line instead: target: pushed ref-update batch N/M (K refs).
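
To make the N/M (K refs) format concrete, here is a small Go sketch that computes batch sizes and prints the verbose lines for the 55,008-ref case at the 5,000 default. The helper names (`batchSizes`, `batchLine`, `logBatches`) are hypothetical; only the line format and the "quiet for a single batch" behavior are taken from this PR.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// batchSizes returns how many refs land in each batch when total refs
// are split into batches of at most limit.
func batchSizes(total, limit int) []int {
	if total <= 0 || limit <= 0 {
		return nil
	}
	sizes := make([]int, 0, (total+limit-1)/limit)
	for remaining := total; remaining > 0; remaining -= limit {
		n := limit
		if remaining < limit {
			n = remaining
		}
		sizes = append(sizes, n)
	}
	return sizes
}

// batchLine formats one verbose progress line in the format quoted above.
func batchLine(n, m, k int) string {
	return fmt.Sprintf("target: pushed ref-update batch %d/%d (%d refs)", n, m, k)
}

// logBatches writes one line per batch when verbose is set and stays
// quiet when the push fits in a single batch.
func logBatches(w io.Writer, verbose bool, sizes []int) {
	if !verbose || len(sizes) < 2 {
		return
	}
	for i, k := range sizes {
		fmt.Fprintln(w, batchLine(i+1, len(sizes), k))
	}
}

func main() {
	// 55,008 refs at 5,000 per push: 11 full batches plus one of 8 refs.
	logBatches(os.Stdout, true, batchSizes(55008, 5000))
}
```

The same arithmetic shows the round-trip trade-off: at a limit of 25,000 the 55,008 refs would need 3 pushes instead of 12.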

GitHub compatibility

  • GitHub's "limit branches/tags per push" is opt-in (default: unlimited).
  • The "full pack first, ref-only rest" design relies on quarantine migration installing the whole pack — exactly how canonical git behaves.
  • GitHub's 2 GiB push-size limit is handled separately by existing pack-size limiting.
  • Note: mirroring Entire's refs/entire/checkpoints/* to GitHub can trip GitHub secret-scanning push protection; exclude that namespace (--exclude-ref-prefix refs/entire/) or disable push protection on the target. Not a git-sync concern.

Tests

  • TestChunkRefUpdates, TestEffectiveMaxRefUpdates — splitting + limit resolution
  • TestResolveMaxRefUpdatesPerPush — env override + fallback
  • TestPushCommandsBatchesOverCap, TestPushPackBatchesOverCap — splitting with an explicit limit; pack rides with first batch, remainder ref-only
  • TestPushPackUsesDefaultLimitWhenZero — 0 → default
  • TestPushCommandsVerboseLogsBatches — per-batch verbose log; quiet for a single batch

go build, go vet, and the full test suite pass. Verified live: 55,008-ref GitHub→GitHub mirror succeeds at the 5k default.

🤖 Generated with Claude Code

A sync of a repo with many refs sent every ref-update command in a single
receive-pack request, which entire-server rejects past 25,000 commands
("too many ref-update commands: 55006 (limit 25000)"). The relay/materialize
strategies (replicate, incremental, materialized) had no command-count
batching.

Batch inside the gitproto push primitives so every strategy benefits:

- PushPack / PushObjects send the pack with the first batch and the remaining
  refs as ref-only follow-ups. The pack carries every object for the whole
  push, and receive-pack commits the entire received pack (entire-server via
  CommitQuarantinedFanout, canonical git via tmp_objdir_migrate — neither
  prunes objects unreachable from the pushed tips), so later batches only move
  ref pointers.
- PushCommands chunks all commands under maxRefUpdatesPerPush (20_000, with
  headroom under the server's 25_000 cap).

Works against both entire-server and canonical git/GitHub (GitHub's per-push
branch/tag limit is opt-in, default unlimited).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 5d4b8f06380e
nodo previously approved these changes Jun 18, 2026

@nodo nodo left a comment

🎉

GitHub returns 500 Internal Server Error when a single push updates ~10k refs
at once but accepts 5k — well below entire-server's 25k cap. Lower the default
batch to 5_000 so mirroring a many-ref repo works against GitHub out of the
box, and add GITSYNC_MAX_REF_UPDATES_PER_PUSH to raise it for targets known to
tolerate larger pushes (entire-server, up to 25k) and cut round trips.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 7f01a5ec2f19
Soph and others added 2 commits June 18, 2026 18:27
Plumb a per-target ref-update batch size through the Pusher (MaxRefUpdates),
the syncer config, the unstable Options, and the replicate/sync/plan/bootstrap
commands as --target-max-ref-updates. Zero keeps the env-or-default limit
(GITSYNC_MAX_REF_UPDATES_PER_PUSH or 5000); a positive value overrides it —
raise it for entire-server targets (up to 25k), lower it for stricter
providers. No global mutable state: the value rides on the Pusher.

Also stop spewing a bare "target:" sideband line per ref-only follow-up batch:
those carry no useful progress, so push them with progress suppressed and,
when verbose, emit one concise "pushed ref-update batch N/M (K refs)" line per
batch instead.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: d908aa3f2758
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 3ab7999b3055
@Soph Soph merged commit 0529f86 into main Jun 18, 2026
3 checks passed
@Soph Soph deleted the fix/batch-ref-updates-receive-pack branch June 18, 2026 16:55