Skip to content

fix(media-embed): remove ReDoS-prone regexes in host-gated providers#5305

Merged
waleedlatif1 merged 3 commits into
stagingfrom
media-embed-redos-fix
Jul 1, 2026
Merged

fix(media-embed): remove ReDoS-prone regexes in host-gated providers#5305
waleedlatif1 merged 3 commits into
stagingfrom
media-embed-redos-fix

Conversation

@waleedlatif1

Copy link
Copy Markdown
Collaborator

Summary

  • Fixes 3 CodeQL High js/polynomial-redos alerts on packages/utils/src/media-embed.ts (introduced by feat(rich-markdown-editor): live media embeds + shared embed detection util #5290, now in staging)
  • The YouTube, Facebook, and Giphy branches used unbounded .* inside unanchored regexes. Even with the hostname gate, a crafted valid-host URL with a long path/query (e.g. https://facebook.com/ + 100k junk chars, no /videos/) triggers O(n²) backtracking — a client-side hang since getEmbedInfo runs during note rendering + editor decoration (a crafted link in shared content could freeze other users' tabs)
  • Replaced each with bounded extraction off the parsed URL (parsed.pathname / parsed.searchParams), so there is no .* left in these branches. No change to which links match

Detail

  • YouTube: youtu.be → first path segment; watchsearchParams.get('v'); embed → anchored /^\/embed\/([^/?]+)/; result validated with /^[a-zA-Z0-9_-]{11}$/
  • Facebook: detect video via /\/videos\/\d+/ on the pathname (fb.watch via anchored segment) instead of facebook.com/.*/videos/
  • Giphy: take the trailing hyphen token of the /gifs|embed/ path segment ([^/]+ + split('-').pop()), validated [a-zA-Z0-9]+

Type of Change

  • Bug fix (security — ReDoS)

Testing

  • media-embed.test.ts: existing cases pass; added coverage for extra YouTube query params, Facebook/fb.watch, and Giphy slug extraction (10 tests). tsc clean.

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

Replace the unbounded '.*' patterns flagged by CodeQL (js/polynomial-redos) in
the YouTube, Facebook, and Giphy branches with bounded extraction off the parsed
URL (pathname / searchParams). Eliminates the O(n^2) backtracking a crafted
valid-host URL could trigger, with no change to matched links.
@vercel

vercel Bot commented Jul 1, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
docs Ready Ready Preview, Comment Jul 1, 2026 1:15am

Request Review

@cursor

cursor Bot commented Jul 1, 2026

Copy link
Copy Markdown

PR Summary

Medium Risk
Security fix in client-side URL parsing used during note rendering; logic is refactored but host gating and stricter id validation should preserve safe embed behavior.

Overview
Addresses CodeQL polynomial ReDoS in getEmbedInfo by replacing unbounded .* regex ID extraction on YouTube, Facebook/fb.watch, and Giphy with bounded reads from the parsed URL (pathname segments, searchParams, and short anchored path checks).

YouTube now pulls the video id from youtu.be path, /embed/ segment, or v query param and requires an 11-character id. Facebook treats links as video only when the pathname matches /videos/\d+ (or fb.watch’s single segment). Giphy takes the last hyphen token from the /gifs or /embed segment. Matching behavior is intended to stay the same; tests add cases for extra query params, non-video Facebook URLs, and slug-style Giphy links.

Reviewed by Cursor Bugbot for commit 64df33e. Configure here.

Comment thread packages/utils/src/media-embed.ts Outdated
@greptile-apps

greptile-apps Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR removes ReDoS-prone media embed matching from host-gated providers. The main changes are:

  • Parsed-URL extraction for YouTube, Facebook, and Giphy links.
  • YouTube embed paths now take precedence over conflicting v query params.
  • Tests for provider matching and key URL edge cases.

Confidence Score: 5/5

This looks safe to merge.

  • No blocking issues found in the changed code.

Important Files Changed

Filename Overview
packages/utils/src/media-embed.ts Replaces risky provider regex matching with parsed pathname and query extraction.
packages/utils/src/media-embed.test.ts Adds tests for YouTube precedence and provider-specific embed extraction.

Reviews (2): Last reviewed commit: "fix(media-embed): dispatch YouTube id by..." | Re-trigger Greptile

Comment thread packages/utils/src/media-embed.ts Outdated
Comment thread packages/utils/src/media-embed.ts
Use the first path segment for youtu.be ids so a trailing slash still resolves
(matching the previous regex), and cover extra-query-param, si-param, embed-query,
and short-id cases.
…ents

- Resolve id from the /embed/ path segment before the ?v= query param so a valid
  embed URL with a spurious v param still embeds (was returning null)
- Remove non-TSDoc inline comments from the module and its test
@waleedlatif1

Copy link
Copy Markdown
Collaborator Author

@greptile

@waleedlatif1

Copy link
Copy Markdown
Collaborator Author

@cursor review

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit 64df33e. Configure here.

@waleedlatif1 waleedlatif1 merged commit 48752c6 into staging Jul 1, 2026
11 of 12 checks passed
@waleedlatif1 waleedlatif1 deleted the media-embed-redos-fix branch July 1, 2026 01:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant