Skip to content

fix: seed fast pseudo-random numbers to differ across processes#2663

Merged
Sebastian Thiel (Byron) merged 3 commits into
GitoxideLabs:mainfrom
ameyypawar:fix/1816-rng-seed
Jun 23, 2026
Merged

fix: seed fast pseudo-random numbers to differ across processes#2663
Sebastian Thiel (Byron) merged 3 commits into
GitoxideLabs:mainfrom
ameyypawar:fix/1816-rng-seed

Conversation

@ameyypawar

Copy link
Copy Markdown
Contributor

What

gix-utils and gix-fs draw fast pseudo-random numbers from fastrand's global generators, whose per-thread seed is derived only from values that can coincide between separate processes — notably small thread IDs. Two concurrently running processes can therefore draw identical sequences, which:

Fixes #1816.

How

Adds a small gix_utils::rng module: a per-thread fastrand::Rng seeded once from a high-entropy OS source via getrandom::u64(), so sequences differ across processes. The two production call sites are routed through it — the backoff jitter in gix-utils, and the three capability probes in gix-fs (whose now-unused direct fastrand dependency is dropped).

wasm handling

getrandom is added as a target-specific dependency, excluded on wasm32-unknown-unknown, which has no entropy backend by default (and is single-process, so the cross-process concern doesn't apply there). On that target the seed falls back to a best-effort value that deliberately avoids Instant::now(), since that panics on wasm32-unknown-unknown (#7319). On wasm32-wasi* targets getrandom is included and uses the wasi backend.

Note on the public API

This adds a new public gix_utils::rng module (so gix-fs can use it across the crate boundary) — flagging in case it affects the version bump.

Out of scope

The gix-testtools port-shuffle (case 3 in the issue) is left as-is — the issue notes it isn't really a problem, only a minor test slowdown.

Verification

  • gix-utils + gix-fs build and tests pass; cargo clippy -- -D warnings and cargo fmt --check are clean.
  • Builds on wasm32-unknown-unknown (getrandom excluded, fallback seed used) and wasm32-wasip1 (getrandom included). Note: these were verified as builds, not at runtime (no local wasm host).

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: faa19e3139

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "Codex (@codex) review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "Codex (@codex) address that feedback".

Comment thread gix-utils/src/rng.rs
Comment thread gix-utils/src/rng.rs
Comment thread gix-utils/Cargo.toml Outdated
Amey Pawar (ameyypawar) added a commit to ameyypawar/gitoxide that referenced this pull request Jun 21, 2026
Per review on GitoxideLabs#2663:

- `fastrand::Rng` is a single `u64`, so a `Cell` -- move it out behind a
  cheap placeholder, advance it, and move it back -- is enough and avoids
  `RefCell`'s runtime borrow tracking, mirroring `fastrand`'s own
  thread-local generator.
- We no longer use `fastrand`'s global generator, so its `std` default
  feature isn't needed: set `default-features = false` and bump to 2.4.1.
  The `wasm32-unknown-unknown` fallback therefore seeds with a fixed value
  instead of `Rng::new()` (which needs the std-gated global); that target
  is single-process, so a fixed seed is fine.
Sebastian Thiel (Byron) pushed a commit to ameyypawar/gitoxide that referenced this pull request Jun 23, 2026
Per review on GitoxideLabs#2663:

- `fastrand::Rng` is a single `u64`, so a `Cell` -- move it out behind a
  cheap placeholder, advance it, and move it back -- is enough and avoids
  `RefCell`'s runtime borrow tracking, mirroring `fastrand`'s own
  thread-local generator.
- We no longer use `fastrand`'s global generator, so its `std` default
  feature isn't needed: set `default-features = false` and bump to 2.4.1.
  The `wasm32-unknown-unknown` fallback therefore seeds with a fixed value
  instead of `Rng::new()` (which needs the std-gated global); that target
  is single-process, so a fixed seed is fine.
Amey Pawar (ameyypawar) and others added 3 commits June 23, 2026 14:16
`gix-utils` and `gix-fs` drew fast pseudo-random numbers from `fastrand`'s
global generators, whose per-thread seed is derived only from values that
can coincide between separate processes (notably small thread IDs). Two
concurrently running processes could therefore produce identical sequences,
which weakens backoff jitter (meant to avoid a thundering herd of retries)
and can make the filesystem capability probes collide on their temporary
file names -- e.g. spuriously reporting that symlinks are unsupported (GitoxideLabs#1789).

Add a `gix_utils::rng` module that draws from a per-thread `fastrand::Rng`
seeded from a high-entropy OS source via `getrandom`, and route the affected
call sites (backoff jitter and the `gix-fs` capability probes) through it.

`getrandom` is a target-specific dependency that is excluded on
`wasm32-unknown-unknown`, which has no entropy backend by default and runs a
single process; there a best-effort seed is used that deliberately avoids
`Instant::now()` (which panics on that target).

For GitoxideLabs#1816
- `std::process::id()` traps at runtime on `wasm32-unknown-unknown`
  ("no pids on this platform"), and the fallback seed -- the only seed
  path on that target -- called it, so the first `rng::usize` would
  abort in browser/unknown wasm builds. On that single-process target,
  where cross-process distinctness is moot anyway, defer to `fastrand`'s
  own wasm-safe default seeding instead. The high-entropy `getrandom`
  seed and the `process::id()`-based fallback now compile only off that
  target.
- `fastrand::Rng` is a single `u64`, so a `Cell` -- move it out behind a
  cheap placeholder, advance it, and move it back -- is enough and avoids
  `RefCell`'s runtime borrow tracking, mirroring `fastrand`'s own
  thread-local generator.
- We no longer use `fastrand`'s global generator, so its `std` default
  feature isn't needed: set `default-features = false` and bump to 2.4.1.
  The `wasm32-unknown-unknown` fallback therefore seeds with a fixed value
  instead of `Rng::new()` (which needs the std-gated global); that target
  is single-process, so a fixed seed is fine.

Co-authored-by: Sebastian Thiel <sebastian.thiel@icloud.com>
This should avoid flaky capability tests of multiple process acting on the
same repository.

Co-authored-by: GPT 5.5 <codex@openai.com>
@Byron

Copy link
Copy Markdown
Member

Thanks for getting the fix in, and let's hope that truly works. It really depends on getrandom, but would also assume that it works better due to access to the OS entropy.

Also many thanks to Eliah Kagan (@EliahKagan) for chiming in and acknowledging Codex's auto-fix. I gave it another pass and assume it's what you had in mind (the review didn't make material changes).

@Byron

Copy link
Copy Markdown
Member

I think one windows job already failed because of exactly what this PR is supposed to improve:

  stderr ───
    failed to extract 'tests\fixtures\generated-archives\make_baseline.tar': Ignoring archive at 'tests\fixtures\generated-archives\make_baseline.tar' as GIX_TEST_IGNORE_ARCHIVES is set.
    thread 'pattern::matching::compare_baseline_with_ours' (832) panicked at gix-glob\tests\pattern\matching.rs:65:13:
    assertion `left == right` failed: baseline for matches must be false - check baseline and git version: GitMatch { pattern: "*/\\\'", value: "XXX/\\\'", is_match: true }
      left: true
     right: false

It's this job, but I am having trouble getting back to the first attempt, maybe because it kind of timed out and was extra-strange.

If it passes next time, I think we should still merge this as it nicely unifies our random number generation, independently of what it tries to achieve beyond that.

@Byron Sebastian Thiel (Byron) merged commit 386f31a into GitoxideLabs:main Jun 23, 2026
56 of 58 checks passed
@EliahKagan

Copy link
Copy Markdown
Member

I think one windows job already failed because of exactly what this PR is supposed to improve

I'm not aware of how that pattern::matching::compare_baseline_with_ours would be related to this PR or #1816 (which this PR fixes).

It's this job, but I am having trouble getting back to the first attempt, maybe because it kind of timed out and was extra-strange.

I think you're referring to the inability to expand the "Test (nextest)" step in the "test-fixtures-windows (windows-latest)" job in attempt 1 there. It shows as though the job is still running, even though its status shows as failed.

I believe this sometimes happens when a runner stops responding and the job is eventually stopped automatically for that reason. I've seen this much more often over the course of the past month. I think it's usually a flake in GitHub's infrastructure, rather than relating conceptually to whatever test is executing.

However, I'm guessing that the test failure you showed was something that happened in that step, prior to the connection to the runner being lost. My guess is that the problems are independent: we may have a flaky test that failed while nextest was runnning, and also GitHub's infrastructure may have been flaky later on. I'm not certain, though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Pseudorandom numbers are sometimes the same across processes

4 participants