feat(web-app-ai-image-alt-text-sidebar): add AI Image Alt-Text Generator#467
Conversation
✅ Snyk checks have passed. No issues have been found so far.
💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse. |
|
Converting to draft. There's a lot of issues that need to be taken care of before review. |
dj4oC
left a comment
There was a problem hiding this comment.
Reviewed the web-app-ai-image-alt-text-sidebar draft. The structure is clean and i18n/ODS/test coverage is solid — notes below, ordered by severity. (Draft, so flagging rather than blocking.)
1. Security — oCIS access token sent to a configurable absolute endpoint
useAltText.ts attaches the oCIS access token as Authorization: Bearer on the request to ${cfg.endpoint}/chat/completions, where cfg.endpoint comes straight from applicationConfig.llm.endpoint. That's safe only while the endpoint is the same-origin ai-llm-proxy. If an admin points llm.endpoint directly at an external LLM (an easy misconfiguration), the user's oCIS access token is sent off-origin — CORS is not a confidentiality guarantee, and a permissive endpoint or a redirect can capture it.
Suggestion: only attach the Authorization header when the endpoint is same-origin (relative URL, or new URL(endpoint, location.origin).origin === location.origin), and/or document that llm.endpoint must be the proxy. ai-doc-summary / chat-with-file route through ai-llm-proxy; worth mirroring that contract here.
2. Correctness — panel does not react to resource changes
AltTextPanel.vue loads stored alt text only in onMounted, which fires once. oCIS reuses the sidebar-panel instance and just updates the resource prop on selection change (see chat-with-file's ChatPanel.vue, which watches () => props.resource?.id for exactly this reason). So switching from image A to image B does not reload B's stored alt text and leaves A's editableText displayed for B.
Suggestion: watch(resourceRef, ...) to reset editableText / altText / panelError and re-run loadStoredText on change.
3. Correctness — generate result can land on the wrong image
triggerGenerate() awaits a request with a 30s timeout, then assigns altText.value = result unconditionally. If the user switches files mid-generation, the result is applied to the now-selected image (same class as the load() race called out in #460). A monotonic token or aborting on resource change would close the window.
4. Type safety — props.resource as any
The panel casts props.resource as any when calling saveText / loadStoredText. The prop is typed AltTextResource (no webDavPath), but useAltTextStorage reads resource.webDavPath. It works at runtime only because index.ts passes a full Resource; the cast hides the mismatch. Typing the prop as Resource (or a shape that includes webDavPath) lets you drop the casts.
5. Minor
useAltTextStorage.loadStoredTextswallows all errors tostoredText = null(a load failure looks like "no alt text") and shares the no-guard race from #2; degrades gracefully but worth a guard for consistency.- No success feedback after Save (no toast);
copyToClipboardsilently no-ops whennavigator.clipboardis unavailable. - A 4 MB image inflates to ~5.3 MB base64 inline in the JSON body — combined with proxy/server body-size limits, very large allowed images could be rejected. The 4 MB cap is reasonable; just flagging the inflation.
Nice
Good $gettext/$pgettext coverage, ODS components throughout, the 4 MB guard, AbortSignal.timeout, status-code-based error mapping, and per-composable unit tests. Once #1–#3 are tightened this is in good shape.
|
@dj4oC All points from your review have been addressed:
Unit tests added/updated for all of the above. |
…te package.json, vite.config.ts, tsconfig.json, public/manifest.json, and l10n/translations.json under packages/web-app-ai-image-alt-text-sidebar/; run pnpm install to register the workspace package Signed-off-by: Lukas Hirt <info@hirt.cz>
…ge utility and useLlm composable (with vision-ready / text-only / unconfigured status), plus unit tests for both Signed-off-by: Lukas Hirt <info@hirt.cz>
…age composable — WebDAV PROPFIND read and PROPPATCH write for the urn:oc:ai:alt-text custom property — with unit tests covering stored-value parsing, successful save, and error handling Signed-off-by: Lukas Hirt <info@hirt.cz>
- Remove `async` from `ensureReady` in useLlm.ts (was triggering `require-await` ESLint warning; now returns Promise.resolve() explicitly) - Cast `vi.fn()` mock assignments to `as any` in useAltTextStorage.spec.ts to resolve CalledWithMock type incompatibility with vitest-mock-extended Signed-off-by: Lukas Hirt <info@hirt.cz>
…posable — image download via WebDAV arraybuffer, base64 encoding, multimodal LLM /chat/completions call, 4 MB size guard, and HTTP/network error mapping — with unit tests covering all code paths Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Lukas Hirt <info@hirt.cz>
…ith all UI states (unconfigured, text-only notice, generating spinner, result textarea with Copy/Regenerate/Save buttons, error banner, stored-text pre-fill), wiring useAltText and useAltTextStorage, plus component unit tests; the panel serves as the preview-panel display of alt text requested in issue comments Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Lukas Hirt <info@hirt.cz>
…tering the sidebarPanel and context-action extensions with isVisible guards, run type-check (pnpm check:types) and production build (pnpm build), verify dist/ai-image-alt-text.js is emitted; include a code comment in index.ts noting that search-engine indexing of the stored WebDAV property requires a future server-side oCIS change (out of scope client-side) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Lukas Hirt <info@hirt.cz>
….md if present) for the extension Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Lukas Hirt <info@hirt.cz>
…compose, CI matrix, and app config Add volume mount in docker-compose.yml, register the extension in the GitHub Actions test matrix, and provide LLM config in both ocis.apps.yaml files using llama3.2-vision (required for multimodal image understanding). Signed-off-by: Lukas Hirt <info@hirt.cz> Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Lukas Hirt <info@hirt.cz>
…ith capabilities probing, fix proxy body limit bug, and polish panel UI - Drop `vision` flag from LlmConfig; useLlm.ensureReady now accepts an optional probeVision callback and caches the result after first call - useAltText.probeVision sends a 1×1 PNG to /chat/completions: returns true on 2xx, false on explicit vision-rejection body, throws on any other HTTP/network error so status stays unconfigured (prevents the Generate button appearing when the backend is down or model missing) - Add isProbing flag exposed from useAltText; AltTextPanel shows "Checking AI capabilities…" during the probe instead of the unconfigured message - Replace oc-textarea with a native auto-growing textarea so the alt text field expands to fit content without a scroll container - Tighten generation prompt to one sentence ≤15 words; lower max_tokens 150→60 to keep responses brief - Fix ai-llm-proxy: req.destroy() before sendJson() was silently dropping the 413 response (Traefik saw a connection reset and returned 502); replace with plain reject so the response can be sent - Raise proxy default MAX_BODY_BYTES 128 KiB→6 MiB to accommodate the extension's 4 MiB image limit after base64 encoding Signed-off-by: Lukas Hirt <info@hirt.cz> Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Lukas Hirt <info@hirt.cz>
…d and pre-filled alt text Call autoGrow after storedText pre-fills the textarea so it sizes to fit on load, not just after generation. Also add box-sizing: border-box to prevent the textarea from overflowing the panel horizontally. Signed-off-by: Lukas Hirt <info@hirt.cz> Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Lukas Hirt <info@hirt.cz>
…real Playwright acceptance tests Eight real e2e tests covering: action visibility for image vs non-image files, panel open via context menu, unconfigured/text-only notices, alt text generation into the editable textarea, error banner on LLM failure, and save/reload via WebDAV. LLM endpoint is mocked with page.route(); probe vs generation distinguished by max_tokens in the request body. New AltTextPanelPage page object and per-package playwright.config.ts added alongside the tests. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Lukas Hirt <info@hirt.cz>
…om dj4oC - Security: only attach the oCIS Authorization header when the LLM endpoint is same-origin; cross-origin endpoints never receive the user's access token (isSameOrigin guard in buildHeaders) - Correctness: panel now reacts to resource changes by watching props.resource.id and resetting state + reloading stored alt text - Correctness: triggerGenerate captures the resource id before the async fetch and discards stale results/errors if the resource changed mid-flight (monotonic-id guard) - Type safety: resource prop typed as Resource instead of the narrower AltTextResource; all `as any` casts removed; size guard handles number | string - Minor: loadError ref added to useAltTextStorage for PROPFIND failures; save action shows a success toast via useMessages; clipboard fallback shows an error toast instead of silently no-oping - Tests: new/updated unit tests covering all of the above Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Lukas Hirt <info@hirt.cz>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Lukas Hirt <info@hirt.cz>
…error in e2e helpers page.route() now returns Promise<Disposable>; make mock helpers async and await the call so the return type stays Promise<void>. Signed-off-by: Lukas Hirt <info@hirt.cz> Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Lukas Hirt <info@hirt.cz>
44b1b22 to
06ba969
Compare
Problem
Teams sharing images in oCIS have no way to generate descriptive alt text for accessibility or search-indexing purposes. Writing it manually is slow and routinely skipped, leaving images undiscoverable and inaccessible to screen readers.
Solution
Selecting any image file (.jpg, .png, .webp, .gif) reveals an "Alt Text" sidebar tab. The panel downloads the image via WebDAV, base64-encodes it, and sends it to a vision-capable LLM; the suggested alt text is shown in an editable field the user can copy. Tier 1 (vision model configured): full descriptive alt text generated. Tier 2 (text-only model): panel shows a one-line notice that the configured model does not support image input and prompts the admin to switch endpoints. No LLM configured: tab hidden entirely.
Extension points
global.files.sidebar (sidebarPanel)·global.files.context-actions (action)Why ship this now
WCAG compliance is increasingly mandated in enterprise oCIS deployments, and this is the only extension in the set that addresses visual content via vision AI — a clearly distinct capability from the existing text-only AI extensions.
What was built
packages/web-app-ai-image-alt-text-sidebar— a new extension with two registration points (ai-image-alt-text.panelsidebar tab andai-image-alt-text.actioncontext-menu entry), both hidden entirely when no LLM is configured.The sidebar panel (
AltTextPanel.vue) covers all UI states: generating spinner, result textarea with Copy / Regenerate / Save buttons, a text-only-model notice, and an error banner. Generated alt text is persisted back to the file as the WebDAV custom propertyurn:oc:ai:alt-text:textvia PROPFIND/PROPPATCH, so it survives panel close and re-open. A 4 MB image size guard is applied before the base64 payload is sent to the LLM.The LLM composable (
useLlm) introduces a three-state status —vision-ready/text-only/unconfigured— driven by a new optionalvision: trueflag onLlmConfig. The existingai-doc-summarypattern is reused for the panel registration and config passthrough; the only new wiring is the image MIME-type guard and the multimodal payload.Gate
Effort: S · 🤖 Generated by extctl