Fix: CLI crawl --wait polling interval bug (double ms conversion)#100
Fix: CLI crawl --wait polling interval bug (double ms conversion)#100shiva-manu wants to merge 2 commits intofirecrawl:mainfrom
crawl --wait polling interval bug (double ms conversion)#100Conversation
There was a problem hiding this comment.
Pull request overview
Fixes the CLI crawl --wait polling/timeout unit handling so values are passed to the SDK in seconds (avoiding double conversion to ms that caused very long waits).
Changes:
- Pass
pollInterval/timeoutto the SDK in seconds (default poll interval now5seconds). - Convert seconds → milliseconds only for the CLI’s custom progress polling loop.
- Update/add tests to reflect seconds-based values and verify timeout behavior in progress mode.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/commands/crawl.ts | Removes ms pre-conversion for SDK polling options; converts to ms only in custom progress polling. |
| src/tests/commands/crawl.test.ts | Updates expectations to seconds and adds/adjusts progress-mode polling + timeout tests. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Converts seconds -> ms Only here | ||
| const pollMs = | ||
| crawlOptions.pollInterval !== undefined | ||
| ? crawlOptions.pollInterval * 1000 | ||
| : 5000; | ||
| const startTime = Date.now(); | ||
| const timeoutMs = timeout ? timeout * 1000 : undefined; | ||
| const timeoutMs = | ||
| crawlOptions.timeout !== undefined | ||
| ? crawlOptions.timeout * 1000 | ||
| : undefined; |
There was a problem hiding this comment.
pollMs/timeoutMs are derived by multiplying crawlOptions.pollInterval/timeout without validating they’re finite numbers. Since the CLI parses these with parseFloat, invalid input (e.g. --poll-interval foo) becomes NaN, and with the current !== undefined checks this results in pollMs = NaN (treated as 0 by setTimeout), causing a tight polling loop and potential API/CPU overload. Add a Number.isFinite(...) (and likely > 0) guard and fall back to defaults or error out when values are invalid.
| const crawlOptions: Partial<CrawlOptions> & Record<string, any> = { | ||
| integration: 'cli', | ||
| }; |
There was a problem hiding this comment.
crawlOptions is typed as Partial<CrawlOptions> & Record<string, any>, but this object is the SDK parameter bag (it includes fields like integration and maxDiscoveryDepth that are not part of CrawlOptions, and omits CLI-only fields that are in CrawlOptions). This type can be misleading and doesn’t provide useful safety. Consider defining a dedicated SDK params type (similar to agentParams in src/commands/agent.ts:196-208) or using a plain Record<string, any> here.
Fixes an issue where
firecrawl crawl --wait(without--progress) would hang for long periods due to an incorrect double conversion ofpollIntervalandtimeoutfrom seconds to milliseconds.The CLI was converting values to milliseconds before passing them to the SDK, which also performs the same conversion internally. This resulted in extremely long polling intervals (e.g., 5s → ~83 minutes).
🐛 Root Cause
pollInterval→seconds * 1000timeout→seconds * 1000