fix(mempool): close same-node nonce TOCTOU via per-sender advisory lock (batch C PR 4) by tcsenpai · Pull Request #887 · kynesyslabs/node

tcsenpai · 2026-05-31T14:03:56Z

Summary

PR 4 of the batch C nonce-replay-protection sequence specified in docs/specs/audit-sweep-batch-c-nonce.md. Closes the same-node TOCTOU window that was flagged on PR #886 and originally promised in PR #885's assignNonce docstring (but never landed).

3 files, +133 / -10.

The window

assignNonce runs during confirmTransaction (validate RPC). Mempool.addTransaction runs during a separate execute RPC. Between the two, two concurrent ingress paths for the same sender can both pass validation (each sees the same stale account.nonce + pendingCount) and reach addTransaction with duplicate nonces.

PR #886's consensus-side expectedPrior reject catches them at block-apply time, but only after both occupy mempool slots and burn cleanup churn.

What this PR ships

`Mempool.addTransaction`

Wrap the per-sender re-check + insert in a Postgres transaction:

await em.query(
    "SELECT pg_advisory_xact_lock(hashtext($1))",
    [`nonce:${senderFrom}`],
)
// re-query account.nonce + countPendingByAddress INSIDE the lock
// re-check tx.content.nonce === expected
// insert (or reject)

Lock key namespaced as nonce:${sender_lowercase_hex} so future advisory-lock callers in the project can't collide. Released at commit/rollback.

Fork-gated

Only runs when nonceEnforcement is active and the tx carries a string content.from and a numeric content.nonce. Pre-fork chains stay bit-identical via the legacy insert path below the gate. Genesis-tx / awardPoints / L2PS internal paths that don't carry a native sender skip the gate.

`assignNonce` JSDoc

Rewritten to point at Mempool.addTransaction as the actual TOCTOU mitigation site. PR 2's original wording promised PR 3 would land the lock; PR 3 shipped without it and Greptile flagged the discrepancy. Now matches shipped reality.

Spec doc

PRs frontmatter: PR 3 marked merged, PR 4 added.
Risks table TOCTOU row rewritten to describe the actual PR 4 fix shape + cross-node carve-out.
PR breakdown: new PR 4 row (Low risk, fork-gated, single function).

Cross-node note

This lock is local to this node's Postgres. Cross-RPC replay across two different validators is NOT covered. That remains the consensus-side expectedPrior apply-time reject in GCRNonceRoutines (PR #886). PR 4 is the same-node companion to that cross-node safety net.

Verification

tsc --noEmit clean on both changed code files.
Manual trace of concurrent same-sender submission:
1. Client A and B both validate with account.nonce=0. Both pass (tx.content.nonce=1 in both).
2. Both reach Mempool.addTransaction concurrently. A acquires advisory lock first.
3. A re-queries inside lock: account.nonce=0, pending=0, expected=1. Match. Inserts. Releases at commit.
4. B acquires lock. Re-queries: account.nonce=0, pending=1 (A just inserted), expected=2. tx.content.nonce=1. Mismatch. Returns error without inserting.
Pre-fork trace: senderFrom set, txNonce number, but isForkActive returns false → skip the locked branch → legacy insert path runs → bit-identical re-sync.

Test plan

bun install and confirm clean tsc.
Boot devnet (post-fork from block 0).
Concurrent-submission test: fire 2 same-sender confirmTx + executeTx round-trips back-to-back with identical tx.content.nonce. Expect first to land in mempool, second to fail with Nonce TOCTOU recheck failed error.
Pre-fork chain (testnet checkout with nonceEnforcement.activationHeight: null): confirm legacy addTransaction path is taken (no advisory lock, no recheck) — re-sync stays bit-identical.

…ck (batch C PR 4) PR 4 of the batch C sequence — closes the validate→addTransaction window flagged on PR 3's Greptile review and originally promised in PR 2's `assignNonce` docstring. The window `assignNonce` runs during `confirmTransaction`, which is a separate RPC round-trip from the execute-side `Mempool.addTransaction` call that actually inserts. Between the two, two concurrent ingress paths for the same sender can both pass validation (each sees the same stale `account.nonce + pendingCount`) and reach `addTransaction` with duplicate nonces. PR 3's consensus-side `expectedPrior` reject catches them at block-apply time, but only after both occupy mempool slots and burn cleanup churn. The fix Wrap the per-sender re-check + insert inside `Mempool.addTransaction` in a Postgres transaction. Acquire a `pg_advisory_xact_lock(hashtext('nonce:' || sender))` at the top of the transaction. The lock serialises concurrent same-sender admissions on this node; the in-lock re-query of `account.nonce + countPendingByAddress` catches the second duplicate-nonce submission before it inserts. Released at commit/rollback. Lock key: `hashtext('nonce:' || sender_lowercase_hex)`. Prefixing the address with `nonce:` namespaces the keyspace so future advisory-lock callers in the project (none today) cannot collide. Fork-gated Only runs when `nonceEnforcement` is active. Pre-fork chains bit-identical (legacy `addTransaction` path preserved unchanged below the gate). Genesis-tx / awardPoints / L2PS internal paths that don't carry a native `tx.content.from` string skip the gate via the `senderFrom` and `typeof txNonce === "number"` guards. Cross-node note Mempool-side advisory locks are local to this node's Postgres. Cross-RPC replay across two different validators is NOT covered by this lock — that remains the consensus-side `expectedPrior` apply-time reject in `GCRNonceRoutines` (PR 3). Docstring cleanup `assignNonce` JSDoc rewritten to point at `Mempool.addTransaction` as the actual TOCTOU mitigation site. The original wording promised PR 3 would land the lock; PR 3 shipped without it and Greptile flagged the discrepancy. Updated to match shipped reality. Doc updates - PRs frontmatter: PR 3 marked merged, PR 4 added. - Risks table TOCTOU row: rewritten to describe the actual PR 4 fix shape (re-check + insert under advisory lock) and the cross-node carve-out. - PR breakdown: new PR 4 row. Verification - tsc --noEmit clean on both changed code files. - Manual trace of concurrent same-sender submission: 1. Client A and B both validate with account.nonce=0. Both pass (tx.content.nonce=1 in both). 2. Both reach Mempool.addTransaction concurrently. A acquires advisory lock first. 3. A re-queries inside lock: account.nonce=0, pending=0, expected=1. tx.content.nonce=1. Matches. Inserts. Releases. 4. B acquires lock. Re-queries: account.nonce=0, pending=1 (A just inserted), expected=2. tx.content.nonce=1. Mismatch. Returns error without inserting. - Pre-fork trace: senderFrom set, txNonce number, but `isForkActive` returns false → skip the locked branch → legacy insert path runs → bit-identical re-sync.

qodo-code-review · 2026-05-31T14:04:01Z

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

coderabbitai · 2026-05-31T14:04:04Z

Warning

Review limit reached

@tcsenpai, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 20 minutes and 33 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 99481827-db34-44cd-8367-ff7ad9f7b947

📥 Commits

Reviewing files that changed from the base of the PR and between 47be8cd and 2a0e79c.

📒 Files selected for processing (3)

docs/specs/audit-sweep-batch-c-nonce.md
src/libs/blockchain/mempool.ts
src/libs/blockchain/routines/validateTransaction.ts

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch bugfix/audit-sweep-batch-c-toctou-lock-2026-05-31

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

greptile-apps · 2026-05-31T14:12:03Z

Greptile Summary

This PR closes the same-node TOCTOU window in Mempool.addTransaction by wrapping the per-sender re-check and insert in a Postgres transaction at READ COMMITTED isolation protected by pg_advisory_xact_lock(hashtext('nonce:' + sender)). All three prior review concerns (isolation level, stale block-height cutoff, and misleading duplicate-hash error) are addressed directly in this iteration.

mempool.ts: Adds a fork-gated locked path that acquires a per-sender advisory lock, re-queries both the account nonce and live mempool count inside the lock, checks for duplicate hash before the nonce comparison, and only then inserts — falling through to the legacy path when nonceEnforcement is inactive or the tx lacks a native sender/nonce.
validateTransaction.ts: JSDoc-only update to assignNonce pointing at Mempool.addTransaction as the actual TOCTOU mitigation site.
docs/specs/audit-sweep-batch-c-nonce.md: Spec updated with PR 4 row and corrected TOCTOU risk description.

Confidence Score: 5/5

Safe to merge — the fork gate ensures pre-fork chains are untouched, and the advisory-lock + READ COMMITTED re-check correctly serialises concurrent same-sender inserts on this node.

All three issues flagged in previous review rounds are addressed in this iteration. The advisory-lock + READ COMMITTED design correctly serialises concurrent same-sender inserts. The fork gate ensures pre-fork chains are bit-identical. The only residual note is the hashtext() 32-bit collision space, a known throughput trade-off with no correctness impact.

No files require special attention. The advisory-lock path in mempool.ts is the only non-trivial change and its invariants are well-documented.

Important Files Changed

Filename	Overview
src/libs/blockchain/mempool.ts	Core change: wraps the per-sender re-check + insert in a Postgres transaction with pg_advisory_xact_lock at READ COMMITTED isolation. Previous review issues (isolation level, stale blockHeight, duplicate-hash misleading error) are all addressed in this version.
src/libs/blockchain/routines/validateTransaction.ts	JSDoc-only update to assignNonce concurrency caveat: rewrites the stale PR-3 promise to accurately describe PR 4's advisory-lock location. No logic changes.
docs/specs/audit-sweep-batch-c-nonce.md	Spec frontmatter updated (PR 3 marked merged, PR 4 added). TOCTOU risks-table row rewritten to describe the actual PR 4 advisory-lock shape. No code impact.

Sequence Diagram

sequenceDiagram
    participant ClientA as Client A
    participant ClientB as Client B
    participant Mempool as Mempool.addTransaction
    participant PG as PostgreSQL

    par Concurrent calls
        ClientA->>Mempool: "addTransaction(tx_A, nonce=1)"
        ClientB->>Mempool: "addTransaction(tx_B, nonce=1)"
    end

    Mempool->>PG: BEGIN READ COMMITTED
    Mempool->>PG: pg_advisory_xact_lock('nonce:sender') - Client A wins
    Note over ClientB,PG: Client B blocks
    Mempool->>PG: "hashExists? No. account.nonce=0, pendingCount=0"
    Note right of Mempool: expected=1 == nonce=1 ok
    Mempool->>PG: INSERT tx_A + COMMIT
    Note over ClientB,PG: Client B acquires lock
    Mempool->>PG: BEGIN READ COMMITTED
    Mempool->>PG: pg_advisory_xact_lock('nonce:sender') - Client B
    Mempool->>PG: "account.nonce=0, pendingCount=1 (sees tx_A)"
    Note right of Mempool: expected=2 != nonce=1 fail
    Mempool->>PG: ROLLBACK
    Mempool-->>ClientB: Nonce TOCTOU recheck failed

_{Reviews (3): Last reviewed commit: "fix(mempool): live block-height query in..." | Re-trigger Greptile}

…loop iter 1) Two Greptile findings, both real. P1 — isolation level silently breaks TOCTOU under non-default config `this.repo.manager.transaction()` opens a Postgres transaction at the server's default isolation level. Under `READ COMMITTED` (the Postgres default) every statement inside the transaction sees the latest committed data, so the in-lock re-query of `pendingCount` correctly reflects any concurrent winner. Under `REPEATABLE READ` or `SERIALIZABLE`, the transaction snapshot is taken at `BEGIN` — BEFORE the advisory lock fires — so both racing threads see the same stale `pendingCount`, both compute the same expected nonce, both pass the recheck, both insert. The TOCTOU fix silently collapses. A DBA or pooler change to `REPEATABLE READ` (not uncommon in read-heavy deployments) would defeat this PR without any observable failure mode. Pin the isolation explicitly so the lock + recheck semantics are independent of operator config. Fix: `this.repo.manager.transaction("READ COMMITTED", async em => { ... })`. TypeORM accepts an optional isolation level as the first argument. P2 — duplicate-hash concurrent submit produces misleading error The outer `Chain.checkTxExists` + `checkTransactionByHash` calls run BEFORE the advisory lock. Two concurrent submissions of the IDENTICAL transaction (same hash, same sender, same nonce) both pass those checks. The first inserts. The second then sees `pendingCount=1`, computes `expected=N+2`, and is rejected — but the error string is `"Nonce TOCTOU recheck failed: tx.content.nonce=N+1, expected=N+2"` which misleads the operator into thinking the client sent a wrong nonce when the real cause is duplicate broadcast. Fix: add `mempoolRepo.exists({ where: { hash: transaction.hash } })` as the FIRST step inside the lock. If true, return `"Transaction already in mempool"` — the same error the outer check returns — so observability stays consistent. Verification - tsc --noEmit clean. - Manual trace under `REPEATABLE READ` after the pin: 1. Both threads call `transaction("READ COMMITTED", ...)` so the txn opens at READ COMMITTED regardless of server default. 2. A acquires advisory lock, B blocks. 3. A re-queries pending=0, expected=1, inserts, commits. 4. B acquires lock. Inside-lock hash recheck: A's row visible (READ COMMITTED sees committed rows). Returns "Transaction already in mempool" if hash matches, or re-queries pending=1 → expected=2 → mismatch if hash differs but nonce was duplicated. - Pre-fork path unchanged.

…h assignNonce (PR #887 greploop iter 2) Greptile P1 — stale `blockHeight` produces cutoff mismatch `blockHeight` is captured from `getSharedState.lastBlockNumber` at the top of `addTransaction`, BEFORE any await points. The in-lock `cutoff = blockHeight - referenceBlockRoom` uses that snapshot. Meanwhile `assignNonce` (during the earlier `confirmTransaction` RPC) derives its cutoff via `Mempool.countPendingByAddress`, which calls `Chain.getLastBlockNumber()` — a live DB query. If a block is produced between the two RPCs, the two cutoffs diverge by 1. A pending mempool row whose `reference_block` equals the old cutoff is counted by `assignNonce` but excluded by the in-lock recheck (or vice versa). The recheck then rejects a still-valid tx for a phantom nonce mismatch, with no recovery path on the client side except retry. Fix: re-query `Chain.getLastBlockNumber()` inside the lock and derive the cutoff from it. Both paths now align on the same live block tip and the cutoff is identical to the one the earlier validation step used. Notes: - Adds one more DB round-trip inside the locked critical section. Acceptable: `Chain.getLastBlockNumber()` is a primary-key lookup on `chain` and runs in <1ms; the advisory lock holds for ~5-10ms in the steady state. The correctness gain dwarfs the latency cost. - The outer `blockHeight` snapshot is still used for the fork gate (`isForkActive("nonceEnforcement", blockHeight)`) which is harmless — fork-gating doesn't need apply-time precision, only "are we somewhere past activation height". Verification: - tsc --noEmit clean. - Manual trace of the block-transition race: 1. Client A submits confirmTx at block N. Validation uses cutoff = N - referenceBlockRoom. 2. Block N+1 lands. 3. Client A submits executeTx. `blockHeight` snapshot is N+1 (lastBlockNumber updated). Without the fix, in-lock cutoff is (N+1) - referenceBlockRoom, ONE HIGHER than what assignNonce used. Pending row at reference_block = cutoff_assignNonce is excluded by the in-lock count → expected nonce off by 1 → false rejection. 4. With the fix, in-lock re-queries Chain.getLastBlockNumber() which returns the same N+1 (or higher if more blocks land during the lock wait) and derives the cutoff from that. The compare is consistent and the tx admits if and only if the nonce is still valid against the current chain state.

…ork testing pass Single document covering everything surfaced during the testing pass that ran 2026-05-26 → 2026-05-31 against dev.node2. Organised by topic, not chronology, so the team has one place to look up what was found, what was fixed, and what is still pending — without scrolling Telegram or hopping between PRs. Sections: - L2PS encryption nonce reuse (SDK #87 + 3 follow-up security findings) - Governance hash mismatch on dev.node2 (still broken; real fix in DEM-727) - Test-coverage gap that allowed the governance break to slip through - Side finding — nonces were not checked or incremented per tx (Hovhannes's batch C #884–#887, batch D #888 in flight) - UX surfacing for L2PS (Demo #11) - Operational risks still open - Linear ticket map across DEM-722 epic - Source material cross-links Cross-links the raw battery output and serializer analysis already in flight via Node #876. Closes DEM-729. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

greptile-apps Bot reviewed May 31, 2026

View reviewed changes

Comment thread src/libs/blockchain/mempool.ts Outdated

Comment thread src/libs/blockchain/mempool.ts Outdated

greptile-apps Bot reviewed May 31, 2026

View reviewed changes

Comment thread src/libs/blockchain/mempool.ts Outdated

tcsenpai merged commit 924df40 into stabilisation May 31, 2026
4 checks passed

linear Bot mentioned this pull request Jun 1, 2026

Re-validate dev environment after reboot #890

Open

Shitikyan mentioned this pull request Jun 1, 2026

(closed) #892

Closed

tcsenpai mentioned this pull request Jun 1, 2026

test(devnet): cross-RPC nonce-replay e2e test for nonceEnforcement (batch E) #899

Open

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(mempool): close same-node nonce TOCTOU via per-sender advisory lock (batch C PR 4)#887

fix(mempool): close same-node nonce TOCTOU via per-sender advisory lock (batch C PR 4)#887
tcsenpai merged 3 commits into
stabilisationfrom
bugfix/audit-sweep-batch-c-toctou-lock-2026-05-31

tcsenpai commented May 31, 2026

Uh oh!

qodo-code-review Bot commented May 31, 2026

Uh oh!

coderabbitai Bot commented May 31, 2026 •

edited

Loading

Review limit reached

Uh oh!

greptile-apps Bot commented May 31, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

tcsenpai commented May 31, 2026

Summary

The window

What this PR ships

Mempool.addTransaction

Fork-gated

assignNonce JSDoc

Spec doc

Cross-node note

Verification

Test plan

Uh oh!

qodo-code-review Bot commented May 31, 2026

Qodo reviews are paused for this user.

Uh oh!

coderabbitai Bot commented May 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Uh oh!

greptile-apps Bot commented May 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Sequence Diagram

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

`Mempool.addTransaction`

`assignNonce` JSDoc

coderabbitai Bot commented May 31, 2026 •

edited

Loading

greptile-apps Bot commented May 31, 2026 •

edited

Loading