fix: handle quotes & backticks in JSX in directive scanner#238
fix: handle quotes & backticks in JSX in directive scanner#238mogelbrod wants to merge 3 commits into
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #238 +/- ##
==========================================
+ Coverage 94.95% 95.10% +0.15%
==========================================
Files 10 10
Lines 2597 2821 +224
==========================================
+ Hits 2466 2683 +217
- Misses 131 138 +7
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
This PR fixes the Lingui directive comment scanner so it no longer “runs away” when it encounters quotes ('/") or backticks (`) inside JSX text nodes, which previously caused lingui-set / lingui-reset directives to be silently skipped and the wrong idPrefix to be applied.
Changes:
- Reworked
scan_source_commentsinto a small context-stack scanner (Code / JsxTag / JsxChildren) so JSX text treats quotes/backticks and////*as literal text. - Added regression coverage: a new Lingui macro snapshot + fixture for JSX containing
'and`above directives. - Strengthened string scanning by bailing out of single/double-quoted string skipping on unescaped newlines.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
crates/lingui_macro/src/comment_directive/source_scanner.rs |
Implements JSX-aware comment scanning and adds targeted unit tests for the previously failing cases. |
crates/lingui_macro/tests/lingui_directive.rs |
Adds a regression test case exercising directives following JSX text with quotes/backticks. |
crates/lingui_macro/tests/snapshots/lingui_directive__jsx_trans_with_unclosed_quotes.snap |
New snapshot verifying idPrefix is applied correctly after JSX text containing ' and `. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
I'm thiniking that is going way too far, what if we will simplify this scanner to a hardcoded search of a keywords? Just looking for the I think we can ignore cases when someone could put a lingui directive inside of a string literal, this is an extreme edge case. WDYT? |
|
I'm tarting to feel that way too. How about a regex like the following to limit results to those that appear to be within comments? const re = /\/(?:\/|\*\*?)\s*lingui-(set|reset)[ ]+([^\n]*)/ |
|
Submitted the alternative implementation in #240 |
|
closed in favor of #242 |
The comment-directive scanner could silently drop lingui-set/lingui-reset directives when a quote ('/") or backtick appeared as JSX text above it. When that happened, directives after the stray character were never registered, so message IDs were generated against the wrong (or a stale) idPrefix.
For example, with
idPrefixLeader: ".":.ais incorrectly compiled toroot.ainstead ofdifferent.a.Root cause
scan_source_commentsrecovers comments from raw source text while skipping over '…', "…", and…regions to avoid false positives.A quote or backtick in JSX text (
<p>'</p>) is content, and not a string literal opener delimeter. The scanner treated it as one and scanned ahead to the next matching delimiter — e.g. the ' in id:'.a'— swallowing every comment in between, including the directive.Fix
scan_source_commentsis now a small context-stack state machine (Code/JsxTag/JsxChildren). Inside JSX text, quotes/backticks/comment markers are treated as literal text; {…} expression containers correctly re-enter code context, so real literals and comments inside them are still handled. This unfortunately also requires adding parsing of JSX and TS type annotations, such as:<opens JSX only in expression position (tracked via the preceding significant byte) followed by a tag-name char or>(fragment). This matches .tsx semantics and prevents TS generics (Record<string, …>) and comparisons (a < b) from being mistaken for JSX. A,/; inside a tag bails back to code, covering the<T,>generic-arrow form.return <div>,export default <App/>, andyield <x>is detected by inspecting the full preceding word, while value identifiers that merely end in keyword letters (myreturn<x) are not.skip_string_literalnow bails when it hits an unescaped newline, since single/double-quoted JS strings can't span newlines — a cheap, independent guard against a stray quote running away.