26 changes: 21 additions & 5 deletions src/sign_bitmap.rs
@@ -277,6 +277,13 @@ impl SignBitmap {
         let mut heaps: Vec<BinaryHeap<(u32, u32)>> = (0..nq)
             .map(|_| BinaryHeap::with_capacity(m_eff + 1))
             .collect();
+        // Cached copy of each full heap's worst kept hamming. Doc ids visit
+        // each heap strictly ascending (d ascends within a row, blocks
+        // ascend), so a candidate tying the worst hamming always loses the
+        // (hamming, doc_id) tie-break — once full, the boundary test
+        // reduces to one u32 compare against this register. u32::MAX while
+        // filling (hamming <= dim can never reach it).
+        let mut worst_bounds = vec![u32::MAX; nq];

         let mut block_start = 0usize;
         while block_start < n {
@@ -290,14 +297,18 @@ impl SignBitmap {
                 sign_scan_collect_batched(block, bn, qpv, qb_tile, tq, scores);
                 for ti in 0..tq {
                     let heap = &mut heaps[tile_start + ti];
+                    let worst = &mut worst_bounds[tile_start + ti];
                     let row = &scores[ti * bn..(ti + 1) * bn];
                     for (d, &hamming) in row.iter().enumerate() {
-                        let key = (hamming, (block_start + d) as u32);
-                        if heap.len() < m_eff {
-                            heap.push(key);
-                        } else if key < *heap.peek().expect("non-empty full collector") {
+                        if hamming >= *worst {
+                            continue;
+                        }
+                        heap.push((hamming, (block_start + d) as u32));
+                        if heap.len() > m_eff {
                             heap.pop();
-                            heap.push(key);
+                        }
+                        if heap.len() == m_eff {
+                            *worst = heap.peek().expect("full collector").0;
                         }
                     }
                 }
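For readers following the hunk above, here is a minimal standalone sketch of the same selection pattern: a bounded max-heap of `(hamming, doc_id)` pairs with a cached worst-hamming bound. The function name `top_m_ascending`, the single-row input, and the `m == 0` guard are assumptions made for illustration; they are not taken from the PR, which works on tiled blocks and a precomputed `m_eff`.

```rust
use std::collections::BinaryHeap;

/// Hypothetical helper (not part of the PR): keep the `m` smallest
/// (hamming, doc_id) pairs from one row of scores. Doc ids are the row
/// indices, so they arrive strictly ascending, as the PR comment requires.
fn top_m_ascending(scores: &[u32], m: usize) -> Vec<(u32, u32)> {
    // Assumption: guard m == 0 here so the peek() below never sees an
    // empty heap. The PR's handling of this case is not shown in the diff.
    if m == 0 {
        return Vec::new();
    }
    let mut heap: BinaryHeap<(u32, u32)> = BinaryHeap::with_capacity(m + 1);
    // u32::MAX acts as "no bound yet" while the heap is still filling.
    // This relies on real hamming values never reaching u32::MAX.
    let mut worst = u32::MAX;
    for (d, &hamming) in scores.iter().enumerate() {
        // A tie on hamming loses the (hamming, doc_id) tie-break because
        // this doc id is larger than every id already kept.
        if hamming >= worst {
            continue;
        }
        heap.push((hamming, d as u32));
        if heap.len() > m {
            heap.pop(); // evict the current worst pair
        }
        if heap.len() == m {
            worst = heap.peek().expect("full collector").0;
        }
    }
    heap.into_sorted_vec()
}
```

The result matches sorting every `(hamming, doc_id)` pair and truncating to `m`, provided doc ids are visited in ascending order. If ids could arrive out of order, the single `u32` compare would no longer be sufficient and the full tuple comparison from the removed lines would be needed.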
@@ -420,6 +431,11 @@ impl SignBitmap {
     /// to the historical per-query rescan. The CSR output contract is
     /// unchanged and bit-identical to the previous implementation.
     ///
+    /// "Serial" scopes the scan and selection: no rayon is entered for the
+    /// candidate work, so callers own that parallelism. Input finite-
+    /// validation MAY briefly use the global rayon pool for large query
+    /// buffers (order-independent boolean reduction; deterministic).
Comment on lines +434 to +437

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Informational

1. Rayon doc mismatch 🐞 Bug ⚙ Maintainability

top_m_candidates_batched_serial_csr docs claim input finite-validation may use the global rayon
pool, but assert_all_finite is implemented as a plain sequential iterator check and the streamed
candidate path shown does not enter rayon. This misstates the API’s concurrency contract and can
mislead callers relying on the “serial/no rayon” guarantee.
Agent Prompt
## Issue description
The docstring for `top_m_candidates_batched_serial_csr` states that input finite-validation “MAY briefly use the global rayon pool,” but the current implementation does not use rayon in validation.

## Issue Context
- `top_m_candidates_batched_serial_csr` calls `crate::util::assert_all_finite(queries)` and then runs the streamed scan/selection without any rayon usage.
- `assert_all_finite` is currently implemented with `v.iter().all(...)` (sequential).

## Fix Focus Areas
- src/sign_bitmap.rs[414-417]
- src/util.rs[117-131]
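
To make the mismatch concrete, below is a rough sketch of what a purely sequential finite check looks like. It is a hypothetical stand-in, not the contents of `src/util.rs`: the `f32` element type, the helper name `all_finite`, and the panic message are all assumptions.

```rust
/// Hypothetical stand-in for a sequential finite check (assumed f32 input).
/// `Iterator::all` short-circuits on the first non-finite value and runs
/// entirely on the calling thread; no thread pool is involved.
fn all_finite(v: &[f32]) -> bool {
    v.iter().all(|x| x.is_finite())
}

/// Panics if any element is NaN or infinite. The message is illustrative.
fn assert_all_finite(v: &[f32]) {
    assert!(all_finite(v), "input contains NaN or an infinite value");
}
```

If validation has this shape, the docstring can simply state that the whole call, validation included, stays on the caller's thread.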


+    ///
     /// # Example
     /// ```no_run
     /// use ordvec::SignBitmap;