Skip to content

feat(web-app-ai-image-alt-text-sidebar): add AI Image Alt-Text Generator#467

Merged
LukasHirt merged 16 commits into
mainfrom
ext/2026-06-19-ai-image-alt-text-sidebar
Jun 22, 2026
Merged

feat(web-app-ai-image-alt-text-sidebar): add AI Image Alt-Text Generator#467
LukasHirt merged 16 commits into
mainfrom
ext/2026-06-19-ai-image-alt-text-sidebar

Conversation

@LukasHirt

@LukasHirt LukasHirt commented Jun 19, 2026

Copy link
Copy Markdown
Collaborator

AI-generated · OSPO-46 · Gate: ✅ 1.00

Problem

Teams sharing images in oCIS have no way to generate descriptive alt text for accessibility or search-indexing purposes. Writing it manually is slow and routinely skipped, leaving images undiscoverable and inaccessible to screen readers.

Solution

Selecting any image file (.jpg, .png, .webp, .gif) reveals an "Alt Text" sidebar tab. The panel downloads the image via WebDAV, base64-encodes it, and sends it to a vision-capable LLM; the suggested alt text is shown in an editable field the user can copy. Tier 1 (vision model configured): full descriptive alt text generated. Tier 2 (text-only model): panel shows a one-line notice that the configured model does not support image input and prompts the admin to switch endpoints. No LLM configured: tab hidden entirely.

Extension points

global.files.sidebar (sidebarPanel) · global.files.context-actions (action)

Why ship this now

WCAG compliance is increasingly mandated in enterprise oCIS deployments, and this is the only extension in the set that addresses visual content via vision AI — a clearly distinct capability from the existing text-only AI extensions.

What was built

packages/web-app-ai-image-alt-text-sidebar — a new extension with two registration points (ai-image-alt-text.panel sidebar tab and ai-image-alt-text.action context-menu entry), both hidden entirely when no LLM is configured.

The sidebar panel (AltTextPanel.vue) covers all UI states: generating spinner, result textarea with Copy / Regenerate / Save buttons, a text-only-model notice, and an error banner. Generated alt text is persisted back to the file as the WebDAV custom property urn:oc:ai:alt-text:text via PROPFIND/PROPPATCH, so it survives panel close and re-open. A 4 MB image size guard is applied before the base64 payload is sent to the LLM.

The LLM composable (useLlm) introduces a three-state status — vision-ready / text-only / unconfigured — driven by a new optional vision: true flag on LlmConfig. The existing ai-doc-summary pattern is reused for the panel registration and config passthrough; the only new wiring is the image MIME-type guard and the multimodal payload.

Search-engine indexing of the stored WebDAV property requires a future server-side oCIS change and is out of scope here.

Gate

Check Result
Hygiene ✅ ok
Build ✅ ok
Lint ✅ ok
Unit tests ✅ ok
Score 1.00

Effort: S · 🤖 Generated by extctl

@kw-security

kw-security commented Jun 19, 2026

Copy link
Copy Markdown

Snyk checks have passed. No issues have been found so far.

Status Scan Engine Critical High Medium Low Total (0)
Open Source Security 0 0 0 0 0 issues
Licenses 0 0 0 0 0 issues
Code Security 0 0 0 0 0 issues

💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse.

@LukasHirt LukasHirt marked this pull request as draft June 19, 2026 16:22
@LukasHirt

Copy link
Copy Markdown
Collaborator Author

Converting to draft. There's a lot of issues that need to be taken care of before review.

@LukasHirt LukasHirt changed the title AI Image Alt-Text Generator feat(web-app-ai-image-alt-text-sidebar): add AI Image Alt-Text Generator Jun 19, 2026

@dj4oC dj4oC left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the web-app-ai-image-alt-text-sidebar draft. The structure is clean and i18n/ODS/test coverage is solid — notes below, ordered by severity. (Draft, so flagging rather than blocking.)

1. Security — oCIS access token sent to a configurable absolute endpoint

useAltText.ts attaches the oCIS access token as Authorization: Bearer on the request to ${cfg.endpoint}/chat/completions, where cfg.endpoint comes straight from applicationConfig.llm.endpoint. That's safe only while the endpoint is the same-origin ai-llm-proxy. If an admin points llm.endpoint directly at an external LLM (an easy misconfiguration), the user's oCIS access token is sent off-origin — CORS is not a confidentiality guarantee, and a permissive endpoint or a redirect can capture it.

Suggestion: only attach the Authorization header when the endpoint is same-origin (relative URL, or new URL(endpoint, location.origin).origin === location.origin), and/or document that llm.endpoint must be the proxy. ai-doc-summary / chat-with-file route through ai-llm-proxy; worth mirroring that contract here.

2. Correctness — panel does not react to resource changes

AltTextPanel.vue loads stored alt text only in onMounted, which fires once. oCIS reuses the sidebar-panel instance and just updates the resource prop on selection change (see chat-with-file's ChatPanel.vue, which watches () => props.resource?.id for exactly this reason). So switching from image A to image B does not reload B's stored alt text and leaves A's editableText displayed for B.

Suggestion: watch(resourceRef, ...) to reset editableText / altText / panelError and re-run loadStoredText on change.

3. Correctness — generate result can land on the wrong image

triggerGenerate() awaits a request with a 30s timeout, then assigns altText.value = result unconditionally. If the user switches files mid-generation, the result is applied to the now-selected image (same class as the load() race called out in #460). A monotonic token or aborting on resource change would close the window.

4. Type safety — props.resource as any

The panel casts props.resource as any when calling saveText / loadStoredText. The prop is typed AltTextResource (no webDavPath), but useAltTextStorage reads resource.webDavPath. It works at runtime only because index.ts passes a full Resource; the cast hides the mismatch. Typing the prop as Resource (or a shape that includes webDavPath) lets you drop the casts.

5. Minor

  • useAltTextStorage.loadStoredText swallows all errors to storedText = null (a load failure looks like "no alt text") and shares the no-guard race from #2; degrades gracefully but worth a guard for consistency.
  • No success feedback after Save (no toast); copyToClipboard silently no-ops when navigator.clipboard is unavailable.
  • A 4 MB image inflates to ~5.3 MB base64 inline in the JSON body — combined with proxy/server body-size limits, very large allowed images could be rejected. The 4 MB cap is reasonable; just flagging the inflation.

Nice

Good $gettext/$pgettext coverage, ODS components throughout, the 4 MB guard, AbortSignal.timeout, status-code-based error mapping, and per-composable unit tests. Once #1#3 are tightened this is in good shape.

@LukasHirt

Copy link
Copy Markdown
Collaborator Author

@dj4oC All points from your review have been addressed:

  1. SecuritybuildHeaders now checks isSameOrigin(endpoint) before attaching the Authorization header; cross-origin endpoints never receive the oCIS access token.
  2. Correctness (resource changes) — added a watch(() => props.resource?.id, ...) in the panel that resets state and reloads stored alt text on selection change.
  3. Correctness (race condition)triggerGenerate captures the resource id before the fetch and discards stale results/errors if the resource changed mid-flight.
  4. Type safety — prop typed as Resource from @ownclouders/web-client; AltTextResource interface removed; all as any casts dropped.
  5. MinorloadError ref added to useAltTextStorage; save action shows a success toast; clipboard fallback shows an error toast instead of silently no-oping.

Unit tests added/updated for all of the above.

@LukasHirt LukasHirt marked this pull request as ready for review June 22, 2026 08:17
@LukasHirt LukasHirt requested a review from dj4oC June 22, 2026 08:17
LukasHirt and others added 16 commits June 22, 2026 11:26
…te package.json, vite.config.ts, tsconfig.json, public/manifest.json, and l10n/translations.json under packages/web-app-ai-image-alt-text-sidebar/; run pnpm install to register the workspace package

Signed-off-by: Lukas Hirt <info@hirt.cz>
…ge utility and useLlm composable (with vision-ready / text-only / unconfigured status), plus unit tests for both

Signed-off-by: Lukas Hirt <info@hirt.cz>
…age composable — WebDAV PROPFIND read and PROPPATCH write for the urn:oc:ai:alt-text custom property — with unit tests covering stored-value parsing, successful save, and error handling

Signed-off-by: Lukas Hirt <info@hirt.cz>
- Remove `async` from `ensureReady` in useLlm.ts (was triggering
  `require-await` ESLint warning; now returns Promise.resolve() explicitly)
- Cast `vi.fn()` mock assignments to `as any` in useAltTextStorage.spec.ts
  to resolve CalledWithMock type incompatibility with vitest-mock-extended

Signed-off-by: Lukas Hirt <info@hirt.cz>
…posable — image download via WebDAV arraybuffer, base64 encoding, multimodal LLM /chat/completions call, 4 MB size guard, and HTTP/network error mapping — with unit tests covering all code paths

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Lukas Hirt <info@hirt.cz>
…ith all UI states (unconfigured, text-only notice, generating spinner, result textarea with Copy/Regenerate/Save buttons, error banner, stored-text pre-fill), wiring useAltText and useAltTextStorage, plus component unit tests; the panel serves as the preview-panel display of alt text requested in issue comments

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Lukas Hirt <info@hirt.cz>
…tering the sidebarPanel and context-action extensions with isVisible guards, run type-check (pnpm check:types) and production build (pnpm build), verify dist/ai-image-alt-text.js is emitted; include a code comment in index.ts noting that search-engine indexing of the stored WebDAV property requires a future server-side oCIS change (out of scope client-side)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Lukas Hirt <info@hirt.cz>
….md if present) for the extension

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Lukas Hirt <info@hirt.cz>
…compose, CI matrix, and app config

Add volume mount in docker-compose.yml, register the extension in the
GitHub Actions test matrix, and provide LLM config in both ocis.apps.yaml
files using llama3.2-vision (required for multimodal image understanding).

Signed-off-by: Lukas Hirt <info@hirt.cz>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Lukas Hirt <info@hirt.cz>
…ith capabilities probing, fix proxy body limit bug, and polish panel UI

- Drop `vision` flag from LlmConfig; useLlm.ensureReady now accepts an
  optional probeVision callback and caches the result after first call
- useAltText.probeVision sends a 1×1 PNG to /chat/completions: returns
  true on 2xx, false on explicit vision-rejection body, throws on any
  other HTTP/network error so status stays unconfigured (prevents the
  Generate button appearing when the backend is down or model missing)
- Add isProbing flag exposed from useAltText; AltTextPanel shows
  "Checking AI capabilities…" during the probe instead of the
  unconfigured message
- Replace oc-textarea with a native auto-growing textarea so the alt
  text field expands to fit content without a scroll container
- Tighten generation prompt to one sentence ≤15 words; lower max_tokens
  150→60 to keep responses brief
- Fix ai-llm-proxy: req.destroy() before sendJson() was silently
  dropping the 413 response (Traefik saw a connection reset and
  returned 502); replace with plain reject so the response can be sent
- Raise proxy default MAX_BODY_BYTES 128 KiB→6 MiB to accommodate the
  extension's 4 MiB image limit after base64 encoding

Signed-off-by: Lukas Hirt <info@hirt.cz>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Lukas Hirt <info@hirt.cz>
…d and pre-filled alt text

Call autoGrow after storedText pre-fills the textarea so it sizes to
fit on load, not just after generation. Also add box-sizing: border-box
to prevent the textarea from overflowing the panel horizontally.

Signed-off-by: Lukas Hirt <info@hirt.cz>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Lukas Hirt <info@hirt.cz>
…real Playwright acceptance tests

Eight real e2e tests covering: action visibility for image vs non-image files,
panel open via context menu, unconfigured/text-only notices, alt text generation
into the editable textarea, error banner on LLM failure, and save/reload via
WebDAV. LLM endpoint is mocked with page.route(); probe vs generation
distinguished by max_tokens in the request body. New AltTextPanelPage page
object and per-package playwright.config.ts added alongside the tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Lukas Hirt <info@hirt.cz>
…om dj4oC

- Security: only attach the oCIS Authorization header when the LLM
  endpoint is same-origin; cross-origin endpoints never receive the
  user's access token (isSameOrigin guard in buildHeaders)
- Correctness: panel now reacts to resource changes by watching
  props.resource.id and resetting state + reloading stored alt text
- Correctness: triggerGenerate captures the resource id before the
  async fetch and discards stale results/errors if the resource
  changed mid-flight (monotonic-id guard)
- Type safety: resource prop typed as Resource instead of the narrower
  AltTextResource; all `as any` casts removed; size guard handles
  number | string
- Minor: loadError ref added to useAltTextStorage for PROPFIND failures;
  save action shows a success toast via useMessages; clipboard fallback
  shows an error toast instead of silently no-oping
- Tests: new/updated unit tests covering all of the above

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Lukas Hirt <info@hirt.cz>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Lukas Hirt <info@hirt.cz>
…error in e2e helpers

page.route() now returns Promise<Disposable>; make mock helpers async
and await the call so the return type stays Promise<void>.

Signed-off-by: Lukas Hirt <info@hirt.cz>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Lukas Hirt <info@hirt.cz>
@LukasHirt LukasHirt force-pushed the ext/2026-06-19-ai-image-alt-text-sidebar branch from 44b1b22 to 06ba969 Compare June 22, 2026 09:26

@dj4oC dj4oC left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGMT

@LukasHirt LukasHirt merged commit ed22ea9 into main Jun 22, 2026
41 of 43 checks passed
@LukasHirt LukasHirt deleted the ext/2026-06-19-ai-image-alt-text-sidebar branch June 22, 2026 13:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants