Skip to content

feat: index-ceiling CLI parity + zero-size reason codes#282

Open
Nelson Spence (Fieldnote-Echo) wants to merge 8 commits into
fix/manifest-derived-limitsfrom
feat/index-ceiling-cli-parity
Open

feat: index-ceiling CLI parity + zero-size reason codes#282
Nelson Spence (Fieldnote-Echo) wants to merge 8 commits into
fix/manifest-derived-limitsfrom
feat/index-ceiling-cli-parity

Conversation

@Fieldnote-Echo

Copy link
Copy Markdown
Member

Deferred items from #277's audit triage: --max-index-artifact-bytes CLI flag (mirrors the auxiliary sibling) and CIPHER-04 zero-size shape checks (artifact_file_size_zero / auxiliary_artifact_file_size_zero, optional-absent placeholder respected, README limit-codes table updated). Fleet-authored, reviewed APPROVE; base merged forward to include the FIFO/EINTR fixes. Stacks on #277.

Expose --max-index-artifact-bytes on the ordvec-manifest CLI LimitArgs,
wiring it to ResourceLimits::max_index_artifact_bytes so the opt-in
primary-artifact read ceiling reaches feature parity with the existing
--max-auxiliary-artifact-bytes flag.

Close the deferred CIPHER-04 reason-code symmetry: validate_manifest_shape
now rejects a zero manifest-declared artifact.file_size_bytes
(artifact_file_size_zero) and validate_auxiliary_artifact_shape rejects
zero-size declarations on required auxiliary artifacts
(auxiliary_artifact_file_size_zero), mirroring the calibration and
encoder-distortion *_file_size_zero checks. Optional artifacts keep the
established zero-size absent-placeholder convention.
@chatgpt-codex-connector

Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@qodo-code-review

Copy link
Copy Markdown

PR Summary by Qodo

Add --max-index-artifact-bytes CLI flag and reject zero-size size declarations

✨ Enhancement 🐞 Bug fix 🧪 Tests 📝 Documentation 🕐 20-40 Minutes

Grey Divider

AI Description

• Expose --max-index-artifact-bytes on the CLI and wire it into ResourceLimits.
• Reject zero-valued file_size_bytes for required primary/aux artifacts with stable reason codes.
• Add regression tests and update README limit-code/flag documentation.
Diagram

graph TD
  CLI["ordvec-manifest CLI"] --> LimitArgs["LimitArgs (clap)"] --> Limits["ResourceLimits"] --> Verify["verify_manifest(...)" ] --> Shape["Shape validation"] --> Report["VerificationReport"]
  Shape --> ManifestCheck{"artifact.file_size_bytes == 0?"} --> Report
  Shape --> AuxCheck{"required aux size == 0?"} --> Report
Loading
High-Level Assessment

The following are alternative approaches to this PR:

1. Treat zero declared sizes as 'unknown' and rely on observed sizes
  • ➕ Avoids hard failures for manifests that previously used 0 as a sentinel
  • ➕ Keeps verification permissive when producers are buggy
  • ➖ Ambiguous contract: 0 could mean absent vs unknown vs invalid
  • ➖ Weakens safety guarantees and makes limits harder to reason about
  • ➖ Conflicts with required-artifact semantics and stable reason codes
2. Factor out a shared `validate_nonzero_file_size(...)` helper
  • ➕ Reduces duplication as more artifact types gain similar checks
  • ➕ Centralizes wording/codes for consistency
  • ➖ Adds indirection for a small amount of logic today
  • ➖ May prematurely generalize before more call sites exist

Recommendation: Keep the PR’s strict zero-size rejection for required artifacts: it enforces an unambiguous manifest contract and aligns with existing *_file_size_zero symmetry. If additional artifact classes add similar invariants, consider extracting a small helper later to keep reason-code emission consistent.

Files changed (4) +107 / -3

Enhancement (1) +43 / -1
main.rsExpose --max-index-artifact-bytes and test CLI/limits wiring +43/-1

Expose --max-index-artifact-bytes and test CLI/limits wiring

• Adds 'max_index_artifact_bytes' to the CLI 'LimitArgs' and maps it into 'ResourceLimits::max_index_artifact_bytes'. Adds unit tests covering both the mapping behavior and clap parsing for the new 'verify --max-index-artifact-bytes' flag.

ordvec-manifest/src/main.rs

Bug fix (1) +17 / -0
lib.rsAdd zero-size validation reason codes for primary and auxiliary artifacts +17/-0

Add zero-size validation reason codes for primary and auxiliary artifacts

• Extends 'validate_manifest_shape' to reject 'artifact.file_size_bytes == 0' with the 'artifact_file_size_zero' code. Extends 'validate_auxiliary_artifact_shape' to reject zero 'file_size_bytes' for required auxiliary artifacts with 'auxiliary_artifact_file_size_zero', while preserving the optional-absent zero-size placeholder convention.

ordvec-manifest/src/lib.rs

Tests (1) +43 / -0
manifest.rsAdd manifest verification tests for zero-size declarations and optional placeholders +43/-0

Add manifest verification tests for zero-size declarations and optional placeholders

• Adds a regression test asserting zero declared sizes on required primary and auxiliary artifacts fail verification with the new reason codes. Adds a test confirming optional auxiliary artifacts may still use a zero-size absent-placeholder without triggering the new auxiliary zero-size error.

ordvec-manifest/tests/manifest.rs

Documentation (1) +4 / -2
README.mdDocument index-artifact ceiling semantics and CLI flag +4/-2

Document index-artifact ceiling semantics and CLI flag

• Clarifies that primary index artifact verification is bounded by declared size with an optional flat cap. Adds '--max-index-artifact-bytes' to the documented CLI override flags and includes the primary index artifact bytes surface in the stable limit-codes table.

ordvec-manifest/README.md

@qodo-code-review

qodo-code-review Bot commented Jul 4, 2026

Copy link
Copy Markdown

Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (0) 🎨 UX issues (0) 🔗 Cross-repo conflicts (0) 📜 Skill insights (0)

Grey Divider


Action required

1. Index ceiling ignored on create ✓ Resolved 🐞 Bug ≡ Correctness
Description
--max-index-artifact-bytes is parsed and applied to ResourceLimits, but
create_manifest_for_index_with_options() hashes the index artifact using the full observed file
length without clamping to max_index_artifact_bytes. As a result, `ordvec-manifest create
--max-index-artifact-bytes ...` does not actually enforce the intended ceiling and can still perform
unbounded work on very large index artifacts.
Code

ordvec-manifest/src/main.rs[R246-248]

+        if let Some(value) = self.max_index_artifact_bytes {
+            limits.max_index_artifact_bytes = value;
+        }
Relevance

⭐⭐⭐ High

Repo has history accepting resource-limit enforcement; bounded hashing/limits work accepted in PR
#157 and pursued in #152.

PR-#157
PR-#152
PR-#255

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
The PR adds CLI wiring for max_index_artifact_bytes, but the library create path hashing the index
artifact does not consult that limit, while auxiliary artifacts explicitly do; therefore the new
flag cannot affect create behavior.

ordvec-manifest/src/main.rs[199-263]
ordvec-manifest/src/lib.rs[3599-3605]
ordvec-manifest/src/lib.rs[3739-3750]
ordvec-manifest/src/lib.rs[2300-2313]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The CLI flag `--max-index-artifact-bytes` is correctly wired into `ResourceLimits`, but the create path does not use `ResourceLimits.max_index_artifact_bytes` when hashing the primary index artifact. This makes the flag ineffective for `ordvec-manifest create` and contradicts the documented/expected “opt-in ceiling” behavior.

## Issue Context
- Verify and sqlite cache-key hashing already apply the ceiling via `declared_size.min(max_index_artifact_bytes)`.
- Create currently bounds by the observed file length only, unlike auxiliary artifacts which also apply a flat cap.

## Fix Focus Areas
- ordvec-manifest/src/lib.rs[3599-3605]
- ordvec-manifest/src/lib.rs[3739-3750]
- ordvec-manifest/src/main.rs[199-263]

## Suggested change
In `create_manifest_for_index_with_options()`, change the bound used for `sha256_file_bounded` from `fs::metadata(index_path)?.len()` to:

- `let observed_len = fs::metadata(index_path)?.len();`
- `let max_len = observed_len.min(options.limits.max_index_artifact_bytes);`
- `sha256_file_bounded(index_path, max_len, "artifact_file_too_large", "index artifact")?;`

Optionally add/extend a CLI/unit test to assert `create` respects `--max-index-artifact-bytes` (e.g., by setting a tiny ceiling and expecting a `LimitExceeded` error).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

Qodo Logo

@codecov

codecov Bot commented Jul 4, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@Fieldnote-Echo

Copy link
Copy Markdown
Member Author

/agentic_review

Comment thread ordvec-manifest/src/main.rs
@qodo-code-review

Copy link
Copy Markdown

Code review by qodo was updated up to the latest commit b4b6a64

Nelson Spence (Fieldnote-Echo) added a commit that referenced this pull request Jul 4, 2026
Bot-review remediation (Qodo, #282): --max-index-artifact-bytes wired
into ResourceLimits but the create path bounded the primary hash by
the probed size alone — the opt-in ceiling was ineffective for
create, unlike auxiliary artifacts. Create now mirrors verify:
declared/observed size min explicit ceiling.
@Fieldnote-Echo

Copy link
Copy Markdown
Member Author

Qodo triage: create-path index ceiling — fixed in 1c193c5 (create now mirrors verify: size min explicit ceiling). Also landed on this branch via base: single-snapshot create hashing (TOCTOU), strict 64KiB read clamping, and the primary zero-size shape check (4b35a34).

@qodo-code-review

Copy link
Copy Markdown

triage: create-path index ceiling — fixed in 1c193c5 (create now mirrors verify: size min explicit ceiling). Also landed on this branch via base: single-snapshot create hashing (TOCTOU), strict 64KiB read clamping, and the primary zero-size shape check (4b35a34).

Thanks for the heads-up — that matches the current state.

  • finding 1 is already marked implemented.
  • The branch also appears to include the follow-up fixes you mentioned from base (4b35a34), so there’s nothing left to triage here.

If you want, I can still sanity-check the updated create/verify limit behavior against the diff.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants