Skip to content

feat(mempool): mempool-aware nonce lookahead in assignNonce (batch C PR 2)#885

Merged
tcsenpai merged 2 commits into
stabilisationfrom
bugfix/audit-sweep-batch-c-mempool-lookahead-2026-05-30
May 30, 2026
Merged

feat(mempool): mempool-aware nonce lookahead in assignNonce (batch C PR 2)#885
tcsenpai merged 2 commits into
stabilisationfrom
bugfix/audit-sweep-batch-c-mempool-lookahead-2026-05-30

Conversation

@tcsenpai
Copy link
Copy Markdown
Contributor

Summary

PR 2 of the 3-PR batch C nonce-replay-protection sequence specified in docs/specs/audit-sweep-batch-c-nonce.md. Extends the PR 1 strict-equality check to account for txs already queued in this node's mempool. Caller remains commented out — no live behaviour change.

3 files, +83 / -9.

Why

PR 1 enforced:

tx.content.nonce === account.nonce + 1

That formula breaks for back-to-back submissions. A sender submitting N transactions in a row reads account.nonce once per call via getAddressNonce, and the SDK sets every tx with nonce = currentNonce + 1. account.nonce only advances on block inclusion (PR 3), so submissions 2..N all carry the same nonce and would have been rejected by PR 1.

What this PR ships

Mempool.countPendingByAddress(address)

New static method on Mempool. Counts mempool rows whose content.from matches the supplied lowercase hex pubkey via a LOWER(content->>'from') JSON query.

Single-writer mempool (this node) is the source of truth. The case-insensitive comparison is defence in depth — every callsite already lowercases, but stored content.from values could in principle carry mixed case.

assignNonce mempool-aware formula

Expected nonce is now:

expected = account.nonce + 1 + pendingMempoolCount

The k-th of N back-to-back submissions has pendingMempoolCount === k - 1 (the previous k-1 txs are already in mempool). The + 1 is the current tx slot.

Wraps the count call in a try/catch so DB errors return false (reject the tx) rather than propagating — same failure mode as the existing account-lookup branch. The error log now includes pendingMempoolCount for triage.

Spec doc

Frontmatter updated to status: in-progress with PRs list referencing #884 (merged), this PR, and the upcoming PR 3.

Out of scope (subsequent PRs in this sequence)

  • PR 3: GCREdit.expectedPrior field, nonce-edit emission in HandleNativeOperations, consensus-side rejection in GCRNonceRoutines, caller uncomment. Requires the single SDK type publish covered in the design doc.

Cross-node correctness note

Single-node correctness only: another node sees its own mempool, which may not contain the same pending set. A user submitting the same tx through two RPCs would briefly pass validation on both. PR 3's consensus-side expectedPrior check is the cross-node safety net — at block-application time, only one of any pair of competing same-nonce txs survives.

Verification

  • tsc --noEmit produces zero new errors in the two changed files.
  • countPendingByAddress is read-only.
  • The assignNonce caller in confirmTransaction remains commented out, so this code is reachable only via direct import (currently none in src/libs/).

Test plan

  • bun install and confirm clean tsc
  • Boot fresh devnet (wipe_and_reboot.sh); confirm no boot regression
  • Manually verify (unit-test or REPL): a known account with nonce=5, two pending txs in mempool from that sender → Mempool.countPendingByAddress(addr) === 2, assignNonce(tx with nonce=8) === true, assignNonce(tx with nonce=7) === false, assignNonce(tx with nonce=9) === false
  • Confirm mixed-case content.from rows are still counted (LOWER(...))

…PR 2)

PR 2 of the 3-PR batch C sequence specified in
`docs/specs/audit-sweep-batch-c-nonce.md`. Extends the PR 1 strict-
equality nonce check to account for txs already queued in this
node's mempool. Caller remains commented out — no live behaviour
change.

Why this is needed:
  PR 1 enforced `tx.content.nonce === account.nonce + 1`. A sender
  submitting N transactions back-to-back reads `account.nonce` once
  per submission via getAddressNonce → SDK sets all N txs with
  the same nonce. account.nonce does not advance until block
  inclusion, so submissions 2..N would have been rejected.

What this PR ships:
  - Mempool.countPendingByAddress(address): new static method that
    counts mempool rows whose content.from matches the supplied
    lowercase hex pubkey via a `LOWER(content->>'from')` JSON query.
    Single-writer mempool (this node) is the source of truth; the
    case-insensitive comparison defends against stored mixed-case
    data even though every callsite already lowercases.
  - assignNonce: extend the expected-nonce formula to
    `account.nonce + 1 + pendingMempoolCount`. Wraps the count call
    in a try/catch so DB errors return false (reject the tx) rather
    than propagating — same failure mode as the existing account-
    lookup branch. Error log now includes pendingMempoolCount for
    triage.
  - JSDoc updated to reflect mempool-aware semantics and clarify
    that single-node correctness is the goal here. Cross-node
    safety net is PR 3's consensus-side `expectedPrior` check.

Out of scope for PR 2:
  - PR 3: GCREdit `expectedPrior` field, nonce-edit emission in
    HandleNativeOperations, consensus-side rejection in
    GCRNonceRoutines, caller uncomment. Requires the single SDK
    publish covered in the design doc.

Verification:
  - tsc --noEmit clean on both changed files.
  - countPendingByAddress is read-only — no DB mutations.
  - assignNonce caller remains commented out, so this is dead code
    until PR 3 lands.
@qodo-code-review
Copy link
Copy Markdown
Contributor

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 30, 2026

Warning

Review limit reached

@tcsenpai, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 21 minutes and 54 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e74c569a-15b6-4366-b91b-04b5de34be59

📥 Commits

Reviewing files that changed from the base of the PR and between 55ce990 and 182a173.

📒 Files selected for processing (3)
  • docs/specs/audit-sweep-batch-c-nonce.md
  • src/libs/blockchain/mempool.ts
  • src/libs/blockchain/routines/validateTransaction.ts
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch bugfix/audit-sweep-batch-c-mempool-lookahead-2026-05-30

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented May 30, 2026

Greptile Summary

This PR extends the PR 1 nonce-enforcement infrastructure with a mempool-aware lookahead so back-to-back same-sender submissions are accepted in sequence. The assignNonce caller in confirmTransaction remains commented out, so no live validation behaviour changes.

  • Adds Mempool.countPendingByAddress(address) — a static query that counts in-window mempool rows from a given sender using a sender-filter + reference_block >= cutoff window, mirroring the same boundary cleanMempool uses to sweep stale rows.
  • Updates assignNonce to compute expected = account.nonce + 1 + pendingCount, wrapping the new DB call in the same fail-closed try/catch pattern used by the existing account-lookup branch.
  • Updates the spec doc with the PR list and two new risk-table rows (TOCTOU and stale-entry), both previously raised as review findings.

Confidence Score: 5/5

Safe to merge — the assignNonce caller remains commented out so no live validation path is activated by this change.

Both new code paths (countPendingByAddress and the expanded assignNonce) are unreachable through any live call site. The reference_block window filter correctly mirrors cleanMempool's cutoff, addressing the stale-entry concern raised in PR 1 review. The single new finding (missing expression index on the JSON sender field) is a pre-launch performance concern that does not affect correctness.

Before the PR 3 caller wire-up, a database migration adding a functional index on LOWER((content->>'from')) in the mempooltx table should accompany or precede that PR.

Important Files Changed

Filename Overview
src/libs/blockchain/mempool.ts Adds countPendingByAddress static method with a sender-filter + reference_block window query; correctly mirrors the cleanMempool cutoff. No functional index on the JSON content->>'from' field means each call does a full mempool scan.
src/libs/blockchain/routines/validateTransaction.ts Extends assignNonce with a mempool-aware pending-count lookup; error handling, logging, and the fallback-to-false pattern are consistent with the existing account-lookup branch. Caller remains commented out so live behaviour is unchanged.
docs/specs/audit-sweep-batch-c-nonce.md Status updated to in-progress; PR list and two new risk-table rows (TOCTOU and stale-entry) added. PR 3 entry still shows #TBD.

Sequence Diagram

sequenceDiagram
    participant Client
    participant confirmTransaction
    participant assignNonce
    participant GCR as GCR DB (account.nonce)
    participant Mempool as Mempool DB

    Client->>confirmTransaction: POST /transaction (tx)
    Note over confirmTransaction: caller commented out — not live yet
    confirmTransaction->>assignNonce: assignNonce(tx)
    assignNonce->>GCR: "findOne(pubkey=sender)"
    GCR-->>assignNonce: "account.nonce = N"
    assignNonce->>Mempool: countPendingByAddress(sender)
    Mempool-->>assignNonce: "pendingCount = k"
    assignNonce->>assignNonce: "expected = N + 1 + k"
    alt "tx.content.nonce === expected"
        assignNonce-->>confirmTransaction: true
        confirmTransaction->>Mempool: addTransaction(tx)
    else nonce mismatch
        assignNonce-->>confirmTransaction: false (reject)
    end
Loading

Reviews (2): Last reviewed commit: "fix(audit-sweep): address Greptile revie..." | Re-trigger Greptile

Comment thread src/libs/blockchain/routines/validateTransaction.ts
Comment thread src/libs/blockchain/mempool.ts
Comment thread docs/specs/audit-sweep-batch-c-nonce.md Outdated
Two code findings + one doc finding.

P1 (stale-mempool inflation) — countPendingByAddress in mempool.ts
  Previous query had no reference_block filter, so it counted every
  row matching the sender — including txs that had aged past the
  referenceBlockRoom window and were waiting for the next
  cleanMempool sweep. A sender with one or more expired-but-not-yet-
  cleaned txs would compute an expected nonce higher than what they
  could supply, so every subsequent legitimate submission would be
  rejected until cleanup ran. Add `tx.reference_block >= :cutoff`
  with `cutoff = lastBlock - referenceBlockRoom`, mirroring the
  exact rule cleanMempool uses to mark a tx stale and
  isReferenceBlockAllowed enforces on inbound RPC. Count now matches
  what will actually be executed.

P1 (TOCTOU race) — assignNonce in validateTransaction.ts
  Two concurrent same-sender submissions both read pendingCount=0
  before either is admitted to the mempool, compute the same
  expected nonce, and both pass — duplicate-nonce txs in the
  mempool simultaneously. The PR description correctly scoped the
  cross-node version to PR 3, but missed the same-node concurrent
  path. PR 2 explicitly does NOT close this window in code (the
  caller is commented out, so it is unreachable); the JSDoc now
  documents the deferral and PR 3 will wrap the validate +
  Mempool.addTransaction sequence in a per-sender critical section
  via pg_advisory_xact_lock(hashtext(sender)), released at commit,
  so concurrent same-sender submissions serialise through
  validation in order. Also added a Risks-table row in
  docs/specs/audit-sweep-batch-c-nonce.md naming the lock approach.

P2 (spec doc) — replace #TBD with the actual PR number (#885)
  Frontmatter PR list now reflects this PR's identity. Risks table
  also gained the new TOCTOU + stale-mempool rows so the design
  doc stays consistent with the iter-1 changes.

Verification:
  - tsc --noEmit clean on both changed files.
  - cleanMempool's cutoff logic re-used directly (Chain.getLastBlockNumber
    + getSharedState.referenceBlockRoom); imports already present.
@tcsenpai tcsenpai merged commit 001efe7 into stabilisation May 30, 2026
4 checks passed
tcsenpai added a commit that referenced this pull request May 31, 2026
…tch C PR 3) [DRAFT - review needed] (#886)

* feat(consensus): nonceEnforcement consensus rule + caller wire-up (batch C PR 3 / Path A)

Final PR of the 3-PR batch C sequence specified in
`docs/specs/audit-sweep-batch-c-nonce.md`. Closes the cross-RPC
replay window end-to-end and activates the `nonceEnforcement` fork
on devnet (already active at block 0 since PR 1's genesis fixture).

Path A — SDK ships type-only, node populates `expectedPrior` at
apply time. Chosen over Path B (SDK-side emit) and Path C
(SDK fork introspection) for zero breaking-change risk: old SDK
clients continue working post-fork without re-publishing. See spec
doc §"Emission-site rationale" for the trade-off matrix.

Changes:

  src/libs/blockchain/gcr/gcr_routines/GCRNonceRoutines.ts
    - When `expectedPrior` is set on a non-rollback nonce edit,
      reject the apply iff `accountGCR.nonce !== expectedPrior`.
    - Rollback path skips the check (unwinding, not validating
      forward progress).
    - Cross-RPC double-spend safety net: two competing same-nonce
      txs bundled into the same block from different RPCs both pass
      per-node validation but only one survives consensus apply.

  src/libs/blockchain/gcr/handleGCR.ts
    - In `applyTransaction`'s per-edit loop, populate `expectedPrior`
      on every `nonce` edit when `nonceEnforcement` is active at the
      apply block. Read live from `entities.accounts` so a
      hypothetical second nonce edit in the same tx sees the
      post-first-apply value.
    - Block height: `tx.blockNumber ?? lastBlockNumber ?? 0`.
    - Skip on rollback. Skip if `expectedPrior` is already set (defence
      against future SDK-side population).

  src/libs/network/endpointValidation.ts
    - Symmetric strip of `expectedPrior` on both
      `gcrEdits` (regen) and `txEditsBlanked` (SDK-shipped) before
      hash compare. Mirrors the existing `txhash` blanking pattern.
    - SDK never writes the field today; the strip is defence in depth
      against future SDK-side emission and any non-conformant clients.

  src/libs/blockchain/routines/validateTransaction.ts
    - Uncomment `assignNonce` caller (lines 80-89). Pre-fork it
      short-circuits to true (PR 1). Post-fork enforces strict
      sequential semantics with mempool lookahead (PR 2) + the
      matching consensus-side rejection above.

  package.json
    - Bump `@kynesyslabs/demosdk` from 4.0.3 to 4.0.5 — the
      published SDK that adds the `expectedPrior?: number` field to
      `GCREditNonce` (type-only, runtime unchanged).

  docs/specs/audit-sweep-batch-c-nonce.md
    - Update PRs list (#885 merged, #884 merged, #TBD = this PR).
    - Add `sdk_publish: 4.0.5` to frontmatter.
    - Rewrite §Fork activation Post-fork bullet to describe the
      Path A flow (node populates, not SDK emits).
    - Add §Emission-site rationale documenting why Path A was
      chosen over B / C.
    - Update PR 3 row in the breakdown to list actual files +
      summarise the SDK pin bump.

Verification:
  - `tsc --noEmit` clean on all six changed files.
  - SDK 4.0.5 verified locally: `expectedPrior?: number` present in
    `node_modules/@kynesyslabs/demosdk/build/types/blockchain/GCREdit.d.ts`.
  - No new e2e test in this commit; will follow up in a separate
    PR with the devnet two-RPC double-broadcast scenario once the
    base wires are merged.

Cross-RPC double-spend scenario (post-fork):
  1. Client signs tx with `content.nonce = N+1`.
  2. Broadcasts to RPC-A and RPC-B in parallel.
  3. Both RPCs pass per-node validation (mempool lookahead is
     single-node).
  4. Block-formation gathers both → applies in deterministic order.
  5. First apply: account.nonce goes N → N+1. `expectedPrior` matched
     (N).
  6. Second apply: `expectedPrior = N` (populated at apply time from
     pre-first-apply snapshot), but `accountGCR.nonce = N+1`. Mismatch
     → `GCRNonceRoutines` rejects → whole tx rolls back at consensus.

* fix: move expectedPrior populate from apply-time to validation-time (PR #886 self-review)

Caught during draft-PR review: the apply-time populate site I shipped
in the initial commit was incorrect.

Bug:
  Populating `expectedPrior` inside `HandleGCR.applyTransaction`'s
  per-edit loop reads `entities.accounts` which already reflects
  prior in-block applies. For two competing same-nonce txs in the
  same block:
    1. Tx A applies. Edit populates `expectedPrior = N` from live
       state. Apply check passes (N === N). Account → N+1.
    2. Tx B (replay) applies next. Edit populates `expectedPrior =
       N+1` (live state already advanced). Apply check passes
       (N+1 === N+1). Account → N+2. Both succeed.
  This defeats the entire cross-RPC safety net.

Fix:
  Move populate into `assignNonce` itself (validation time), using
  the already-loaded account row + pending-mempool-count:
    `expectedPrior = account.nonce + pendingCount`
  This locks the value to the pre-block snapshot the sender's nonce
  was supposed to be at, NOT what it becomes at apply time.

  For the same scenario:
    1. Tx A's edit carries `expectedPrior = 0`. Apply: account=0,
       expectedPrior=0 → match. Account → 1.
    2. Tx B's edit also carries `expectedPrior = 0` (validated on a
       node that didn't yet see Tx A). Apply: account=1,
       expectedPrior=0 → MISMATCH → reject. Whole tx rolls back.

  For back-to-back same-sender submissions (k-th tx):
    `expectedPrior = account.nonce + (k-1)`
  matching the apply order because PR 2's mempool lookahead ensures
  each pending tx applies before this one.

Hash compare safety:
  `endpointValidation` snapshots `tx.content.gcr_edits` at line 48
  BEFORE `confirmTransaction` runs (which invokes `assignNonce`).
  My mutation hits the real array but the snapshot is the
  pre-mutation shape. Both sides of the compare are still stripped
  of `expectedPrior`, so signature integrity is preserved. The
  mutated array flows downstream into the apply pipeline carrying
  the validation-time-locked `expectedPrior`.

Files:
  - src/libs/blockchain/routines/validateTransaction.ts: walk
    `tx.content.gcr_edits`, set `expectedPrior` on the matching
    `nonce` edit (sender address compared case-insensitive).
  - src/libs/blockchain/gcr/handleGCR.ts: revert the apply-time
    populate code added in the previous commit. Also drop the now-
    unused `isForkActive` and `getSharedState` imports.
  - docs/specs/audit-sweep-batch-c-nonce.md: rewrite the fork
    activation Post-fork bullet to describe the validation-time
    populate, plus explicit rationale for why apply-time is wrong.
    Update PR 3 file list to reflect the moved populate site.

Verification:
  - tsc --noEmit clean.
  - Manual trace of the cross-RPC scenario now correctly rejects
    the second replay.

* fix(audit-sweep): address Greptile + CodeRabbit review on PR #886 (greploop iter 1)

Three real findings + three doc/comment fixes.

Critical (CodeRabbit) — fork-gate the consensus rule
  GCRNonceRoutines.apply enforced the `expectedPrior` mismatch check
  unconditionally whenever the field was present. The spec says
  pre-fork nodes must ignore it, and the field is stripped from the
  hash compare in endpointValidation — so a client could attach
  `expectedPrior` on a pre-fork chain without breaking signature
  validation, and upgraded nodes would start rejecting edits that
  older nodes still accepted. Validator-split risk before the fork
  activates.

  Fix: wrap the mismatch check in `isForkActive(nonceEnforcement,
  blockHeight)` using `getSharedState.lastBlockNumber` (same gate
  source `assignNonce` uses). Pre-fork ignores the field entirely;
  post-fork enforces. Imports added: `isForkActive`,
  `getSharedState`.

Major (CodeRabbit) / P1 (Greptile) — forge-account comparison bug
  In `assignNonce`'s populate loop, `edit.account` can be a
  forge-key object (non-string), but the comparison was
  `editAccount === senderAddress` where `senderAddress` is a
  lowercase hex string. Strict equality between an object and a
  string is always false, so the populate would silently skip for
  forge-format accounts and the cross-RPC safety net would never
  engage for those transactions.

  Fix: mirror the normalization already used in `GCRNonceRoutines`:
  `typeof edit.account === "string" ? edit.account :
  forgeToHex(edit.account)`, then `.toLowerCase()`. Now matches the
  sender regardless of how the SDK ships the account field.

Doc / comment cleanup
  Three locations still carried "node fills it in at apply time"
  wording, contradicting the validation-time populate site. Updated
  in:
    - `GCRNonceRoutines.ts`: comment now describes the fork gate
      + validation-time source.
    - `endpointValidation.ts`: comment now points at `assignNonce`
      as the populate site.
    - `docs/specs/audit-sweep-batch-c-nonce.md`: Path A header
      rewritten as "node populates at validation time"; rationale
      paragraph now explains the apply-time draft was rejected,
      with the cross-RPC reasoning inline.

Doc — PR 3 row scope
  The scope-table row for PR 3 listed "e2e test" as a deliverable,
  but the diff ships no test. Removed the "e2e test" item and
  added an explicit note that cross-RPC e2e is tracked as a
  follow-up PR. Also reworded "apply-time rejection" to "consensus-
  side rejection" to match what `GCRNonceRoutines` actually does
  (rejection at apply, populate at validation).

Out of scope for this iter
  TOCTOU advisory-lock wrapper documented in `assignNonce`'s
  PR 2 docstring is still not landed; flagged in this review round
  but deferred to a follow-up PR since wrapping the validate +
  `Mempool.addTransaction` sequence in `pg_advisory_xact_lock` is
  its own structural change that affects every native-tx ingress
  path. Filed as PR 4 in the follow-up tracker.

Verification:
  - `tsc --noEmit` clean.
  - Pre-fork trace: `expectedPrior` present on edit, fork inactive,
    `isForkActive` returns false → mismatch check skipped → legacy
    apply behaviour preserved bit-identically for re-sync.
  - Post-fork trace: forge-format `edit.account` now hex-coerces
    and matches the lowercased sender → `expectedPrior` populates
    correctly → `GCRNonceRoutines` enforces the mismatch reject.

---------

Co-authored-by: tcsenpai <tcsenpai@discus.sh>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant