gix-pack: add a benchmark for pack generation from loose objects by ameyypawar · Pull Request #2675 · GitoxideLabs/gitoxide

Amey Pawar (ameyypawar) · 2026-06-23T18:50:09Z

What

For #2611. Adds the first gix-pack benchmark — a harness for pack generation from many loose objects — so the performance and memory questions raised there can be grounded in numbers, as Axel Ibarrondo (@cachel2) and I discussed in the thread. This is the measurement step before any code changes, not a fix for the underlying performance itself.

The harness

gix-pack/benches/pack_generation.rs separates the three phases of a gix pack create-style generation and reports, per phase, both timing (criterion) and the peak heap allocation, plus the resulting pack size:

- discover: walk the odb (`odb.iter()`) to enumerate loose object ids.
- count: `count::objects` turns those ids into a `Vec<output::Count>` (loads each object's header).
- write: `iter_from_counts` → `FromEntriesIter` resolves locations, sorts, and encodes the counts into a pack byte stream.

Note on phase boundaries: sorting the counts and resolving their pack locations both happen inside iter_from_counts, i.e. in the write phase, not in count.

Wiring: a [[bench]] target, criterion/tempfile dev-deps, and the parallel feature on the dev-dep (iter_from_counts requires a Send object-database handle). No new crates enter Cargo.lock — both deps are already used elsewhere in the workspace.

What it shows (example run; flat, unique blobs)

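| objects | discover (peak heap) | count (peak heap) | write (peak heap) | pack size |
|---------|----------------------|-------------------|-------------------|-----------|
| 1k      | 101 KiB              | 271 KiB           | 4.0 MiB           | 61 KiB    |
| 10k     | 843 KiB              | 2.8 MiB           | 6.4 MiB           | 623 KiB   |
| 50k     | 3.2 MiB              | 11.0 MiB          | 13.2 MiB          | 3.1 MiB   |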

Discovery and count memory are O(N) but modest (~64 B and ~225 B per object); write peak grows sub-linearly with object count for this fixture; the pack is the un-deltified baseline.
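The description does not say how the per-object figures were derived. As a rough cross-check, the sketch below computes two plausible readings from the example-run numbers: the average (peak divided by object count) and the marginal cost between the 10k and 50k rows. The helper names are hypothetical and not part of the benchmark; the inputs are the rounded values from the example run, so the results are only approximate. At 50k objects the average comes out at roughly 67 B (discover) and 231 B (count), and the 10k to 50k marginal cost at roughly 62 B and 215 B, both in the neighbourhood of the quoted figures.

```rust
const KIB: f64 = 1024.0;
const MIB: f64 = 1024.0 * KIB;

/// (object count, discover peak bytes, count peak bytes), copied from the example run.
const ROWS: [(f64, f64, f64); 3] = [
    (1_000.0, 101.0 * KIB, 271.0 * KIB),
    (10_000.0, 843.0 * KIB, 2.8 * MIB),
    (50_000.0, 3.2 * MIB, 11.0 * MIB),
];

/// Average bytes per object for a single row.
fn average_per_object(peak: f64, objects: f64) -> f64 {
    peak / objects
}

/// Marginal bytes per object between two (objects, peak) points.
/// This cancels any fixed overhead that does not scale with the object count.
fn marginal_per_object(lo: (f64, f64), hi: (f64, f64)) -> f64 {
    (hi.1 - lo.1) / (hi.0 - lo.0)
}

/// Hypothetical helper: one line per row plus one line for the 10k -> 50k slope.
fn per_object_report() -> String {
    let mut out = String::new();
    for &(n, discover, count) in &ROWS {
        out.push_str(&format!(
            "{n:>6} objects: discover {:.1} B/obj, count {:.1} B/obj\n",
            average_per_object(discover, n),
            average_per_object(count, n),
        ));
    }
    let (a, b) = (ROWS[1], ROWS[2]);
    out.push_str(&format!(
        "10k -> 50k marginal: discover {:.1} B/obj, count {:.1} B/obj\n",
        marginal_per_object((a.0, a.1), (b.0, b.1)),
        marginal_per_object((a.0, a.2), (b.0, b.2)),
    ));
    out
}
```

The average is noticeably higher at small object counts, which is what one would expect if each phase carries some fixed allocation on top of the per-object cost.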

Honest limits (also in the bench's doc comment)

- the peak figures are peak heap allocation, a lower bound on RSS (excludes thread stacks, memory-mapped pack pages, allocator retention);
- a global tracking allocator adds a small constant overhead to the timings, and write is multi-threaded (wall-clock varies with cores);
- the fixture is flat near-identical blobs written to a discarding sink (no writer back-pressure), and AsIs input with no pre-existing pack never attempts delta compression, so the pack size is an un-deltified baseline; quantifying the delta gap against git would need a git-packed baseline of the same objects.

Verification

Compiles and runs; cargo clippy -p gix-pack --benches and cargo fmt --check are clean.
CI compile-checks it via cargo clippy --workspace --all-targets.
It's the first benchmark in gix-pack, so the object counts are kept modest to stay runnable; raise OBJECT_COUNTS locally to probe larger, more degenerate repositories.

Adds the first gix-pack benchmark (criterion), for GitoxideLabs#2611. It isolates the counting phase (collecting the `Vec<output::Count>`) from the writing phase (streaming entries out as a pack), and reports the peak heap allocation of each phase plus the resulting pack size via a tracking global allocator, so the memory and delta-compression characteristics raised in the issue are grounded in concrete numbers.

Add a `discover` phase that walks the database (`odb.iter()`) to enumerate the loose object ids, which `count` then turns into counts - so the benchmark covers walking as well as collecting. Also make the framing honest: the counts are sorted and resolved inside the write phase (not count); the reported figure is peak heap allocation (a lower bound on RSS); the discarding sink imposes no back-pressure; and `AsIs` input never attempts delta compression, so the pack size is an un-deltified baseline rather than a measured delta gap.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0be0f4ef45

chatgpt-codex-connector · 2026-06-23T18:54:12Z

+static LIVE: AtomicUsize = AtomicUsize::new(0);
+static PEAK: AtomicUsize = AtomicUsize::new(0);
+
+unsafe impl GlobalAlloc for PeakTracking {


Delegate realloc in the tracking allocator

For benchmark phases dominated by Vec growth (discover/count collect object lists and write builds chunks), this GlobalAlloc impl changes the allocator behavior because the trait's default realloc allocates a new block, copies, and frees the old block instead of using System.realloc. That inflates both timings and the recorded peak live bytes the benchmark is meant to measure, so the allocator should delegate realloc (and adjust the counters for the size delta) rather than relying on the default.
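A minimal sketch of what the suggestion amounts to, assuming the allocator wraps `std::alloc::System`. The names `PeakTracking`, `LIVE` and `PEAK` come from the diff excerpt above; `add_live`, `sub_live` and `measure_peak` are hypothetical helpers introduced here for illustration, and none of this is the PR's actual code.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

static LIVE: AtomicUsize = AtomicUsize::new(0);
static PEAK: AtomicUsize = AtomicUsize::new(0);

struct PeakTracking;

/// Record newly live bytes and raise the peak if the new total exceeds it.
fn add_live(bytes: usize) {
    let now = LIVE.fetch_add(bytes, Ordering::Relaxed) + bytes;
    PEAK.fetch_max(now, Ordering::Relaxed);
}

/// Record bytes that are no longer live.
fn sub_live(bytes: usize) {
    LIVE.fetch_sub(bytes, Ordering::Relaxed);
}

unsafe impl GlobalAlloc for PeakTracking {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            add_live(layout.size());
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        sub_live(layout.size());
    }

    // Delegate to System.realloc instead of the trait default
    // (alloc + copy + dealloc) and account only for the size delta.
    unsafe fn realloc(&self, ptr: *mut u8, layout: Layout, new_size: usize) -> *mut u8 {
        let new_ptr = unsafe { System.realloc(ptr, layout, new_size) };
        if !new_ptr.is_null() {
            let old_size = layout.size();
            if new_size >= old_size {
                add_live(new_size - old_size);
            } else {
                sub_live(old_size - new_size);
            }
        }
        new_ptr
    }
}

#[global_allocator]
static GLOBAL: PeakTracking = PeakTracking;

/// Hypothetical helper: run `phase` and return its result together with the
/// peak number of live heap bytes above the level at the start of the call.
fn measure_peak<T>(phase: impl FnOnce() -> T) -> (T, usize) {
    let baseline = LIVE.load(Ordering::Relaxed);
    PEAK.store(baseline, Ordering::Relaxed);
    let result = phase();
    let peak = PEAK.load(Ordering::Relaxed).saturating_sub(baseline);
    (result, peak)
}
```

With the default `realloc`, the old and new blocks are both live during the copy, so every growth of a `Vec` is counted at roughly old plus new capacity; delegating keeps the counter at the new capacity only. The counters use relaxed atomics, so under concurrent allocation the peak is an approximation rather than an exact high-water mark.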

Amey Pawar (ameyypawar) and others added 4 commits June 23, 2026 23:45

gix-pack: enable the parallel feature for the benchmark

956248b

`iter_from_counts` requires a `Send` object database handle (it resolves counts across threads); enable `gix-features/parallel` for the benchmark build so the handle is `Arc`-backed.

Cargo.lock: record gix-pack's benchmark dev-dependencies

0be0f4e

Adds the criterion and tempfile edges for the new gix-pack benchmark. Both crates are already in the lock from elsewhere in the workspace, so no new packages are introduced.

chatgpt-codex-connector Bot reviewed Jun 23, 2026

Amey Pawar (ameyypawar) wants to merge 4 commits into GitoxideLabs:main from ameyypawar:bench/2611-pack-generation
