Add PDF keycard parsing functionality #8946
…d sdk
- add parseKeycard.ts with pure string parsing logic (parseKeycardFromLines, buildLinesFromPDFNodes, KeycardEntry, PDFTextNode types)
- add extractKeycardFromPDF.ts with pdfjs-dist based PDF extraction (browser-only; consumer must configure GlobalWorkerOptions.workerSrc)
- add pdfjs-dist ^5.0.0 dependency to @bitgo/key-card
- export new functions from module index
- add mocha unit tests for parseKeycardFromLines covering all Part N edge cases
- add pdf parse demo UI to web-demo KeyCard page with file upload, result table, and worker configuration for webpack
WCN-19
pdfjs-dist@5.7.x requires node >=22.13.0 || >=24, which is incompatible with the node 20.x CI runner. downgrade to ^4.0.0 which supports node 18+. update web-demo worker path from .mjs to .js to match v4 build output. fix prettier formatting in web-demo KeyCard component. WCN-19
pdfjs-dist initializes browser-only globals (DOMMatrix) at module load time. using a static top-level import caused all tests in @bitgo/key-card to crash in node.js environments. switching to a dynamic import() inside extractKeycardEntriesFromPDF defers loading to call time (browser only). WCN-19
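The deferred-loading idea in this commit can be sketched as follows. This is a minimal illustration, not the code in extractKeycardFromPDF.ts: `makeLazyLoader` is a hypothetical helper, and the caching of the promise is an addition made here for clarity. In the real module the loader would presumably be `() => import('pdfjs-dist')`, so nothing browser-only runs until the extraction function is actually called.

```typescript
// Hypothetical helper (not part of @bitgo/key-card): wraps a loader so that the
// expensive or environment-sensitive module is only loaded on first use.
function makeLazyLoader<T>(load: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    // The loader runs on the first call only; later calls reuse the same promise.
    if (cached === undefined) {
      cached = load();
    }
    return cached;
  };
}

// Assumed usage in a browser-only module (shown as a comment because it needs
// pdfjs-dist installed and a browser environment):
//
//   const loadPdfJs = makeLazyLoader(() => import('pdfjs-dist'));
//   export async function extract(data: Uint8Array) {
//     const pdfjs = await loadPdfJs(); // module-level globals are touched only here
//     ...
//   }
```

Because nothing is imported at module load time, importing the package in a Node.js test run does not touch browser globals such as DOMMatrix.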
…nsions
pdfjs-dist v4 only ships .mjs worker files (no .js variants). the previous commit incorrectly changed the worker path from .mjs to .js. reverted to pdf.worker.min.mjs and added .mjs to webpack resolve extensions in both dev and prod configs so webpack can locate the file. WCN-19
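For illustration, the webpack change described above might look roughly like the fragment below. Only the presence of '.mjs' in `resolve.extensions` comes from the commit message; the variable name and the other extensions in the list are assumptions, and the actual dev and prod configs are not shown in this PR page.

```typescript
// Hypothetical fragment of a webpack config object. `resolve.extensions` is the
// list of file extensions webpack tries when resolving a module request.
const webpackResolveSketch = {
  resolve: {
    // '.mjs' is the addition described in the commit; the rest are assumed defaults.
    extensions: ['.ts', '.tsx', '.js', '.mjs'],
  },
};
```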
2d5100a to 45ec5bf
Claude finished @Marzooqa's task
I'll help you with this PR! Since you didn't specify a particular request, I'll conduct a comprehensive code review of the PDF keycard parsing functionality.
Code Review Summary
I've reviewed the PDF keycard parsing functionality implementation. Overall, this is a well-structured and thoughtfully implemented feature. Here's my detailed analysis:
✅ Strengths
1. Solid Architecture & Separation of Concerns
2. Robust PDF Text Reconstruction
3. Excellent Edge Case Handling
4. Comprehensive Test Coverage
5. Browser Safety
…d jsdoc
- extract magic number 2 to PDF_LINE_Y_TOLERANCE and PDF_NODE_GAP_THRESHOLD constants with explanatory comments
- add jsdoc to buildLinesFromPDFNodes explaining the spatial reconstruction algorithm
WCN-19
Signed-off-by: Mohammad Al Faiyaz <mohammadalfaiyaz@bitgo.com>
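A rough sketch of what a spatial reconstruction of this kind could look like is given below. It is not the implementation of buildLinesFromPDFNodes: the `SketchNode` shape, the function name `sketchBuildLines`, and the exact meaning given to the two thresholds (a vertical tolerance for grouping nodes into a line, and a horizontal gap above which a space is inserted) are all assumptions made for illustration. Only the value 2 for both constants comes from the commit message.

```typescript
// Hypothetical node shape; the real PDFTextNode type may differ.
interface SketchNode {
  text: string;
  x: number; // left edge
  y: number; // baseline; PDF coordinates grow upward
  width: number;
}

// Both values mirror the "magic number 2" mentioned in the commit; their
// semantics here are assumed.
const SKETCH_LINE_Y_TOLERANCE = 2;
const SKETCH_NODE_GAP_THRESHOLD = 2;

function sketchBuildLines(nodes: SketchNode[]): string[] {
  // Sort top to bottom (descending y), so nodes of one visual line are adjacent.
  const sorted = [...nodes].sort((a, b) => b.y - a.y);
  const rows: { y: number; nodes: SketchNode[] }[] = [];
  for (const node of sorted) {
    const row = rows[rows.length - 1];
    // A node joins the current row if its y is within tolerance of the row's first node.
    if (row !== undefined && Math.abs(row.y - node.y) <= SKETCH_LINE_Y_TOLERANCE) {
      row.nodes.push(node);
    } else {
      rows.push({ y: node.y, nodes: [node] });
    }
  }
  return rows.map((row) => {
    // Within a row, read left to right.
    const ordered = [...row.nodes].sort((a, b) => a.x - b.x);
    let line = '';
    let prevEnd: number | undefined;
    for (const n of ordered) {
      // Insert a space only when the horizontal gap is larger than the threshold.
      if (prevEnd !== undefined && n.x - prevEnd > SKETCH_NODE_GAP_THRESHOLD) {
        line += ' ';
      }
      line += n.text;
      prevEnd = n.x + n.width;
    }
    return line;
  });
}
```

Named constants make the two tolerances independently tunable, which a bare literal 2 used in both places would hide.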
5f23f81 to a9ec49c
Description
This PR adds functionality to extract and parse BitGo keycard data from PDF files. It includes:
- New parsing module (parseKeycard.ts): Core logic to parse keycard sections (A–D) from text lines extracted from PDFs, with robust handling of:
- PDF extraction module (extractKeycardFromPDF.ts): Wrapper around pdfjs-dist to extract text nodes from PDF files and convert them to keycard entries
- Web demo integration: Added file upload UI in the KeyCard component to allow users to upload a keycard PDF and view extracted sections in a table
- Dependencies: Added pdfjs-dist (^4.0.0) to both @bitgo/key-card and web-demo packages
Type of change
How Has This Been Tested?
Added comprehensive unit tests in parseKeycardFromLines.test.ts covering:
All tests verify correct reconstruction of encrypted wallet password JSON and other section values.
Checklist:
https://claude.ai/code/session_01Pj4SsrmSnoBNj8h6zX1vaz