refactor(error): typed sub-enums in place of stringly-typed reason fields#23
Draft
zrosenbauer wants to merge 6 commits into
Draft
refactor(error): typed sub-enums in place of stringly-typed reason fields#23zrosenbauer wants to merge 6 commits into
zrosenbauer wants to merge 6 commits into
Conversation
Three public error variants previously buried the actual failure mode
inside a `reason: String` interpolated at the call site, forcing
consumers to string-match diagnostic prose to discriminate cases:
- `ParseError::MalformedTag.reason` → `kind: MalformedTagKind`
(ExpectedCloseAngle, UnterminatedOpenTag, ExpectedCloseSlashAngle,
UnterminatedComment, UnterminatedCdata)
- `ParseError::MalformedAttribute.reason` → `kind: MalformedAttrKind`
(UnexpectedNameStart, ExpectedEquals, ExpectedOpenQuote,
UnterminatedValue)
- `SelectorError::Syntax.reason` → `kind: SyntaxKind`
(UnionTooLarge, CompoundTooLong, TooManyPredicates,
NotNestingTooDeep, UnsupportedPseudoClass, NthChildMustBeOneOrGreater,
IntegerOutOfRange, ExpectedDigit, Expected{what: &'static str})
`SchemaError::InvalidRegex` now carries `#[source] source: regex::Error`
instead of stringifying the cause, so `Error::source()` walks the chain
for downstream `anyhow` / `eyre` / `tracing` consumers. `SchemaError`
drops its `Eq` derive (regex::Error is PartialEq but not Eq); PartialEq
is retained.
Display output is byte-identical to the prior format — the rstest
substring assertions and insta snapshots in fixture_suite pass
unchanged. Breaking only for consumers that constructed these variants
directly or matched on the `reason` field.
Co-Authored-By: Claude <noreply@anthropic.com>
Cover the load-bearing internals that black-box tests in tests/*.rs can't reach directly: tokenizer's looks_like_tag_start prose-vs-tag dispatch, the attr duplicate-tracker linear-to-hashset promotion at threshold, CDATA trivia split bookkeeping, and parse's sibling-scoped duplicate-id check + frame/root scope routing. docs: add executable doctests across the public API Convert the crate-level synopsis to a parse to select to mutate walkthrough and add executable examples to parse, parse_owned, Selector::parse, Markdown::update / replace_content / replace_in / replace_text / to_xml, and validate. Lifts doctest count from 3 to 12 so docs.rs shows real usage and CI catches signature drift at the public surface.
…ee-testing-layering
…ript - examples/: new rust-simple, rust-advanced, node-simple, node-advanced scaffolds + samples; workspace members updated to match. Examples build via `cargo check --workspace`. - bindings/node/src/lib.rs: expand design-notes header, document the clippy carve-outs, clarify JS regex flag handling for replace_in. - crates/marxml/src/lib.rs: drop the stale `#![doc(html_root_url)]` placeholder (was pointing at 0.0.0). - crates/marxml/src/selector/matcher.rs: explain the `#[allow(clippy::too_many_arguments)]` on walk() — packing the scratch into a struct would add a borrow indirection on the hot recursion path. - docs/ARCHITECTURE.md: minor wording. - scripts/reserve-npm-names.sh: deleted; one-time bootstrap, no longer needed now that trusted publishing is configured for every sub-package. Co-Authored-By: Claude <noreply@anthropic.com>
… docs
Outcome of a multi-reviewer audit (6 Claude agents + 1 Codex independent
run) against Rust 1.95 / Rust API Guidelines / napi-rs v3. Internal
cleanup, hot-path microopts, doc/code parity, and one public API
tightening on the validation surface. See the changeset for the
downstream-facing summary; companion contributor doc added.
Public API (minor):
- ValidationError::InvalidAttr.reason: String -> kind: InvalidAttrKind.
The new enum carries the allowed-enum list or the regex pattern as
structured fields callers can match on. Display output is
byte-identical; only consumers that destructured `reason` break.
Re-exported at the crate root.
Node binding:
- IntoNapi extension trait centralizes crate->napi::Error mapping. Call
sites read `.into_napi()?` instead of repeating .map_err(|e| Error::new(
Status::InvalidArg, e.to_string())) closures. Orphan rule blocks a
direct From impl; the trait stands in.
- toJson JSDoc no longer claims "no string round-trip" (the wrapper
hides one inside marxml.mjs — being honest about it).
- replace_in_content rustdoc now describes the RegexBuilder setter
mechanism (the prior (?flags:…) prefix claim was wrong about
mechanism though correct about observable behavior).
Internal renames + cleanup:
- mutate::try_replace_content / try_replace_in -> splice_content_report
/ splice_regex_report. The try_* prefix is reserved for fallible ops;
these always return MutationReport. Public re-exports
(replace_content_report, replace_in_report) unchanged.
- Dropped the dead splice_regex one-line wrapper. Callers go through
splice_regex_with directly.
Hot-path microopts (no benchmark required — straight wins):
- mutate::rewrite_open_tag allocates with capacity sized from
tag + existing-attr + new-attr byte budgets.
- serialize::to_xml allocates with capacity = doc.raw().len().
- tokenizer::record_seen_attr builds the lazy HashSet via
iter().map().collect() instead of with_capacity + manual loop.
- escape::decode_entities replaces .chars().next().expect("non-empty
tail") with an unreachable! — same runtime behavior, expressed via
the type system.
Diagnostic fix:
- ParseError::MalformedAttribute { kind: UnterminatedValue, .. } now
reports the line at which input ran out, not the line of the
attribute name. For attributes whose values span multiple lines and
run off the end, the error now points at the actual EOF rather than
far above. One snapshot updated.
Docs:
- Markdown::to_xml rustdoc now warns the output is re-serialized, not
byte-preserved; SourcePosition values from the source don't apply.
- docs/dsl/selectors.md describes nested :not() as
permitted-up-to-depth-64 (matches the parser); prior wording said
rejected.
- crates/marxml/Cargo.toml gets [package.metadata.docs.rs] for future
feature-gated items.
- New contributing/rust-best-practices.md codifies the rubric the
review pass used: Rust idiom checklist, napi-rs v3 specifics, 2024
edition deferrals, reviewer severity levels.
Validates: cargo fmt --all -- --check, cargo clippy --all-targets
--all-features --workspace -- -D warnings, cargo test --all-features
--workspace, pnpm test in bindings/node — all green on macOS.
Co-Authored-By: Claude <noreply@anthropic.com>
Member
Author
|
Pushed What landed in this commitPublic API (minor, new changeset
Node binding:
Internal cleanup:
Hot-path microopts (no benchmark required):
Diagnostic bug:
Docs:
Validates cleanDeferred — Ask First items not in this commitTracked as follow-ups per the AGENTS.md "Ask First" rules:
Full per-module review reports live at |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Replaces
reason: Stringon three public error variants with typed sub-enums, and wiresregex::Erroras a proper#[source]onSchemaError::InvalidRegex.Displayoutput is byte-identical to the prior format.ParseError::MalformedTag.reason→kind: MalformedTagKind(5 variants)ParseError::MalformedAttribute.reason→kind: MalformedAttrKind(4 variants)SelectorError::Syntax.reason→kind: SyntaxKind(9 variants — distinct caps + structuralExpected { what: &'static str })SchemaError::InvalidRegex.reason: String→#[source] source: regex::Error;SchemaErrordropsEq(regex::Error is PartialEq but not Eq), retainsPartialEq + CloneIncludes
.changeset/typed-error-kinds.md(minor bump, breaking pre-1.0 per SemVer 4.0).Why
Diagnostic prose buried in a
Stringis opaque to programmatic consumers — they end up doingerr.to_string().contains("…")to discriminate cases. With typed kinds, callers canmatchexhaustively on the failure mode. The#[source]wiring fixes the one place we were dropping a cause chain (regex::Errorflattened viae.to_string()), soanyhow/eyre/tracingsee the full chain.Test plan
cargo fmt --checkcargo clippy --all-targets --all-features -- -D warningscargo test --all-features --workspace— 198 tests pass, incl.fixture_suiteinsta snapshots andmalformed_tag_reports_useful_reasonrstest (substring assertions on Display)pnpm run build:debug && pnpm testinbindings/node— 33 vitest tests pass (Node error.message is unchanged because Display is preserved)scripts/check-versions.sh— crate / npm version parity intactBreaking?
Yes, for any consumer that constructed these variants directly or pattern-matched on the
reasonfield. Consumers that only usedmatches!(e, ParseError::MalformedTag { .. })or readDisplayoutput are unaffected.