fix: double-audio scaffold, lint rules, docs guide, Gemini 3.1 #299
Double-audio bug fix:
- scaffolding.ts: stop writing index.html in captures/ (root cause: the runtime discovered the scaffold and the real index.html as two compositions)
- New lint rule: multiple_root_compositions, errors if >1 root HTML
- New lint rule: duplicate_audio_track, warns on overlapping audio

Capture improvements (from testing 30+ websites):
- Catalog runs BEFORE extractHtml (which mutates the DOM by converting img src to data URLs). HeyKuba: 2 images → 78.
- networkidle2 instead of networkidle0 (unblocks SPAs with WebSockets)
- Lazy-load image wait, CSS background-image cataloging
- SVG naming from class/id/parent (not just aria-label)
- Gemini batch 5→20, pause 12s→2s, maxOutputTokens 300→500
- Asset descriptions sorted: captioned first

Docs:
- New guide: guides/website-to-video.mdx (full tutorial)
- CLI docs: added capture and snapshot commands
- docs.json: website-to-video in Guides nav
Gemini 3.1 Flash Lite Preview: 2.5x faster TTFT, 45% faster output, slightly cheaper ($0.25/M vs $0.30/M input), near-2.5-Flash quality. Descriptions are actually more detailed in testing.
jrusso1020
left a comment
Nice bug find and the guide reads well. A few things I'd want addressed before merging — two of them are correctness bugs in the new lint rules, and I think the PR is doing too many things at once.
🔴 Blockers
1. lintMultipleRootCompositions can never fire
const rootFiles = results.map((r) => r.file).filter((f) => !f.startsWith("compositions/"));
if (rootFiles.length > 1) { ... }
results is built earlier in lintProject: exactly one push for "index.html" and N pushes for "compositions/${file}". After the filter, rootFiles always has length 1, so the > 1 branch is unreachable. This rule can't catch the bug it was designed to catch.
To actually detect a stray scaffold, the lint needs to walk the project directory for *.html files at the root and compare against project.indexPath — not filter what lintProject already chose to read.
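A minimal sketch of that directory walk, assuming the project root is a plain directory and that indexPath is an absolute path to the one HTML file the project considers its root. The helper name findStrayRootHtml is hypothetical, not something in the codebase:

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: list root-level *.html files that are NOT the project's index.
// Assumes indexPath is absolute (or resolvable against the current working directory).
function findStrayRootHtml(projectDir: string, indexPath: string): string[] {
  const resolvedIndex = path.resolve(indexPath);
  return fs
    .readdirSync(projectDir, { withFileTypes: true })
    // Only regular files directly in the root; compositions/ and other subdirs are skipped.
    .filter((entry) => entry.isFile() && entry.name.toLowerCase().endsWith(".html"))
    .map((entry) => entry.name)
    .filter((name) => path.resolve(projectDir, name) !== resolvedIndex)
    .sort();
}
```

The rule would then error whenever the returned list is non-empty, since every entry is a second root HTML file that the runtime could pick up as a composition.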
2. lintDuplicateAudioTracks regex is order-sensitive
The regex requires attributes in the source as data-track-index → data-start → data-duration. The old scaffold that caused this bug wrote them as:
<audio id="narration" data-start="0" data-duration="28" data-track-index="0" data-volume="1" src="narration.wav">
Here data-start comes before data-track-index, so that audio tag would not match this regex. Any agent-authored <audio> with a different attribute order is silently skipped.
Fix: match <audio[^>]*> first, then extract each attribute with its own regex against the tag body. While you're there: because the scan walks allHtmlSources (root + every composition), the same <audio> reachable through both a root and a sub-composition will flag as a duplicate of itself — worth deduping by (src, start, duration) or scoping to a single file.
Also, landing two new lint rules with zero unit tests is what let #1 slip through — a single fixture per rule would have caught it.
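For item 2, the order-independent extraction could look roughly like the sketch below. Everything here is illustrative: the AudioTrack shape, the helper names, the choice to treat a missing data-duration as unbounded, and the definition of "overlap" (same track index, intersecting time ranges) are assumptions, not the rule's actual behavior.

```typescript
// Hypothetical shape for a parsed <audio> tag; field names are illustrative.
interface AudioTrack {
  src: string;
  trackIndex: number;
  start: number;
  duration: number;
}

// Read one quoted attribute from a tag string, wherever it appears in the tag.
function readAttr(tag: string, name: string): string | undefined {
  const match = new RegExp(`\\s${name}\\s*=\\s*["']([^"']*)["']`, "i").exec(tag);
  return match ? match[1] : undefined;
}

// Match whole <audio ...> tags first, then pull each attribute independently.
function extractAudioTracks(html: string): AudioTrack[] {
  const seen = new Set<string>();
  const tracks: AudioTrack[] = [];
  for (const tag of html.match(/<audio\b[^>]*>/gi) ?? []) {
    const rawDuration = readAttr(tag, "data-duration");
    const track: AudioTrack = {
      src: readAttr(tag, "src") ?? "",
      trackIndex: Number(readAttr(tag, "data-track-index") ?? "0"),
      start: Number(readAttr(tag, "data-start") ?? "0"),
      // Assumption: no data-duration means the audio plays until the end.
      duration: rawDuration === undefined ? Infinity : Number(rawDuration),
    };
    // Dedupe identical tags so one <audio> seen via two files is not flagged against itself.
    const key = [track.src, track.start, track.duration, track.trackIndex].join("|");
    if (seen.has(key)) continue;
    seen.add(key);
    tracks.push(track);
  }
  return tracks;
}

// Assumption: two tracks "overlap" when they share a track index and their time ranges intersect.
function findOverlaps(tracks: AudioTrack[]): Array<[AudioTrack, AudioTrack]> {
  const overlaps: Array<[AudioTrack, AudioTrack]> = [];
  for (let i = 0; i < tracks.length; i++) {
    for (let j = i + 1; j < tracks.length; j++) {
      const a = tracks[i];
      const b = tracks[j];
      const intersects = a.start < b.start + b.duration && b.start < a.start + a.duration;
      if (a.trackIndex === b.trackIndex && intersects) overlaps.push([a, b]);
    }
  }
  return overlaps;
}
```

A fixture containing the old scaffold's tag (data-start before data-track-index) is the obvious first test case, since that is exactly the input the order-sensitive regex skipped.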
🟡 Should fix
3. Docs/code drift on the Gemini model
You updated the model in three places but missed one:
- step-1-capture.md → "Gemini 3.1 Flash Lite" ✅
- contentExtractor.ts → updated ✅
- docs/packages/cli.mdx:357 → still says "Gemini 2.5 Flash vision (~$0.001/image)" ❌
4. gemini-3.1-flash-lite-preview is a preview model
The PR body claims "2.5x faster with richer descriptions" without numbers. Preview endpoints get deprecated on Google's schedule, not ours, so two things I'd want:
- A short note with the actual measurements — latency, sample caption quality, any rate-limit changes (the batching code still assumes 2000 RPM).
- An easy swap path: either an env override (HYPERFRAMES_GEMINI_MODEL) or a constant at the top of contentExtractor.ts so the next swap is one line.
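Both options can coexist. A rough sketch, assuming the default lives in a single constant and the override is read from an environment map passed in by the caller (resolveGeminiModel and DEFAULT_GEMINI_MODEL are hypothetical names):

```typescript
// Single place to change when the preview model is retired.
const DEFAULT_GEMINI_MODEL = "gemini-3.1-flash-lite-preview";

// Hypothetical helper: prefer HYPERFRAMES_GEMINI_MODEL when it is set and non-blank.
function resolveGeminiModel(env: Record<string, string | undefined>): string {
  const override = (env["HYPERFRAMES_GEMINI_MODEL"] ?? "").trim();
  return override.length > 0 ? override : DEFAULT_GEMINI_MODEL;
}
```

Calling it as resolveGeminiModel(process.env) keeps the lookup testable without touching the real environment, and a blank or whitespace-only value falls back to the default instead of being sent to the API as a model name.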
5. Drop the greensock/gsap-skills install line
(From @jrusso1020) We ship skills/gsap/ in this repo, so pointing users at greensock/gsap-skills is now redundant and a second source of truth we don't control. Remove both the npx skills add greensock/gsap-skills line in docs/guides/website-to-video.mdx and any similar references.
6. Lead with explicit skill invocation in the guide
(From @jrusso1020, with my take) The current guide shows implicit discovery as the happy path and treats explicit invocation as a troubleshooting fallback. I'd flip that for the published docs:
- Deterministic — users get the same behavior every time, no "why didn't it trigger?" support threads.
- Teachable — the docs actually name the thing they're telling you to use.
- Self-documenting in transcripts — easier to tell what ran when someone pastes a session.
I wouldn't make it the only pattern though — ambient discovery is part of what makes the product feel magical, and removing that story would be a loss. Concretely:
Use the /website-to-hyperframes skill to create a 25-second product launch
video from https://stripe.com. Bold, cinematic, financial infrastructure energy.
…as the primary example, with a <Note> afterwards saying "Agents will also trigger this skill automatically when they see a URL and a video request — the explicit form is just more predictable."
7. docs/docs.json indentation
"pages": [
"guides/website-to-video",
"guides/prompting",
New entry is flush-left while its siblings are at 14 spaces. Mintlify parsed it (CI green) but this drifts on the next formatter pass. One-line fix.
🟢 Scope
Four independent things in one PR: a P0-ish bug fix, two lint rules, a Mintlify guide, and a model swap. I'd split into:
- PR A (ship now): scaffold removal — self-contained, fixes a real bug.
- PR B: lint rules, once they work + tests.
- PR C: docs guide + cli.mdx edits + gsap-skills removal.
- PR D: Gemini 3.1 swap with benchmark + env override.
If anything regresses, revert blast radius becomes one feature instead of four.
👍 What's good
- The root-cause write-up in scaffolding.ts is exactly right, and the inline comment explaining why we no longer emit index.html is the kind of breadcrumb I want to find in six months.
- The website-to-video guide is well-structured, uses Mintlify components idiomatically, and the "With/Without Gemini" <Tabs> comparison is a nice touch.
Requesting changes on #1, #2, #3, #5. Happy to re-review once those are addressed; the rest are strong suggestions.
- lintMultipleRootCompositions: scan filesystem for HTML files with data-composition-id (was filtering the results array, which always has 1 entry)
- lintDuplicateAudioTracks: order-independent attribute extraction, dedup by (src,start,duration,trackIndex), Infinity fallback for missing data-duration (matches runtime behavior)
- 10 new tests for both lint rules
- docs: explicit skill invocation, remove gsap-skills, fix indentation
- Gemini: env override (HYPERFRAMES_GEMINI_MODEL), benchmark data in code comment (49 imgs: 3.1-lite ~507ms/img, 2.5-lite ~230ms/img)
- cli.mdx: version-agnostic "Gemini vision" reference

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Addressed all 7 review items

🔴 Blockers (all fixed)
1. lintMultipleRootCompositions can never fire
2. lintDuplicateAudioTracks regex is order-sensitive
3. Docs/code drift on the Gemini model
5. Drop the greensock/gsap-skills install line

🟡 Should fix (all addressed)
4. Preview model benchmark + env override: added the HYPERFRAMES_GEMINI_MODEL env override. 3.1-flash-lite-preview produces richer captions (+14 chars avg) with higher variance on cold starts. 2.5-flash-lite is faster and more reliable. Keeping 3.1 as default for caption quality; easy swap via env var. Benchmark data added as code comment.
6. Lead with explicit skill invocation: Step 2 example updated.
7. docs/docs.json indentation

Bonus from self-review
Not splitting the PR
Keeping as one PR per discussion: the changes are small and interdependent (lint rules reference the same scaffold bug the removal fixes, docs reference the same Gemini model the code uses).
What
Remove scaffold index.html from captures, add two new lint rules, add website-to-video docs guide, and switch to Gemini 3.1 Flash Lite for image captioning.
Why
How
Test plan