AI Sensitive Data Scanner (Batch) by LukasHirt · Pull Request #475 · owncloud/web-extensions

LukasHirt · 2026-06-23T23:15:58Z

AI-generated · OSPO-51 · Gate: ✅ 1.00

Problem

Teams routinely share folders containing files with accidentally embedded PII,
credentials, or confidential text. Manual inspection before sharing is
impossible at scale.

Solution

Users select files and click "Scan for sensitive data" in the batch actions
bar. The extension fetches text from supported files (csv, md, pdf, txt) and sends
each to the LLM; a report modal lists per-file findings with redacted excerpts.
With structured-output models, findings are categorized (PII / credentials /
confidential); with basic text models, a plain per-file narrative is returned.
Without a configured LLM, the action opens an informational modal about the
missing setup.

Extension points

global.files.batch-actions

Why ship this now

Compliance and data-governance requirements are rising for on-prem oCIS
customers; this gives them an instant pre-share check without leaving the files UI.

What was built

web-app-ai-sensitive-data-scanner is an oCIS Web extension that registers a single batch action on global.files.batch-actions. When users select one or more files and trigger "Scan for sensitive data," the extension fetches the text content of each supported file (CSV, Markdown, PDF, plain text), sends it to the configured LLM endpoint sequentially, and displays per-file findings in a results modal. PDF content is extracted via pdfjs-dist's fake-worker pattern, capped at 12,000 characters, consistent with the approach used in other AI extensions in this repo.
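As a rough illustration of the file-type gating and the 12,000-character cap described above, a minimal sketch follows. The `ResourceLike` shape, the `capText` helper, and the constant names are assumptions made for this example; only `isSupportedFile` and the default extension list come from the PR description, and the real `src/utils/file-support.ts` may differ.

```typescript
// Sketch only: ResourceLike, capText and the constant names are hypothetical.
const DEFAULT_EXTENSIONS: readonly string[] = ['csv', 'md', 'pdf', 'txt']
const MAX_CHARS = 12_000

interface ResourceLike {
  name: string
  isFolder?: boolean
}

// Returns true when the resource is a file whose extension is in the allow-list.
function isSupportedFile(
  resource: ResourceLike,
  extensions: readonly string[] = DEFAULT_EXTENSIONS
): boolean {
  if (resource.isFolder) return false
  const dot = resource.name.lastIndexOf('.')
  // dot <= 0 covers names without an extension and dotfiles such as ".env"
  if (dot <= 0) return false
  const ext = resource.name.slice(dot + 1).toLowerCase()
  return extensions.includes(ext)
}

// Truncates extracted text so the prompt stays within the character budget.
function capText(text: string, maxChars: number = MAX_CHARS): string {
  return text.length > maxChars ? text.slice(0, maxChars) : text
}
```

Matching on the lowercased extension keeps `report.PDF` and `report.pdf` equivalent; whether the extension itself normalizes case is not stated in the PR.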

The entry point (src/index.ts) registers the action via defineWebApplication, delegating file-type gating to src/utils/file-support.ts (isSupportedFile, defaulting to csv, md, pdf, txt). Scanning logic lives in src/composables/useScanner.ts: it builds a FileScanResult per resource with progressive state transitions (pending → scanning → done | error | skipped), validates the LLM endpoint origin against window.location.origin before attaching the Bearer token, and processes files one at a time with await rather than Promise.all to avoid rate-limit collisions. ScanResultsModal.vue drives both the unconfigured-LLM path (shows a setup prompt and suppresses the scan) and the live path, rendering structured findings with category icons (pii, credentials, confidential) or a plain pre-wrap narrative block when the LLM returns non-JSON text.
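The state transitions, the same-origin gate, and the one-at-a-time processing described above could look roughly like the sketch below. It is not the extension's code: `isSameOrigin`, `scanSequentially`, the injected `callLlm` callback, and the fields on `FileScanResult` other than its status are assumptions for illustration.

```typescript
// Sketch only: function names, parameters and result fields are hypothetical.
type ScanStatus = 'pending' | 'scanning' | 'done' | 'error' | 'skipped'

interface FileScanResult {
  name: string
  status: ScanStatus
  response?: string
  error?: string
}

// True when the endpoint resolves to the same origin as the page.
function isSameOrigin(endpoint: string, pageOrigin: string): boolean {
  try {
    return new URL(endpoint, pageOrigin).origin === pageOrigin
  } catch {
    return false // unparsable endpoint: treat as not same-origin
  }
}

async function scanSequentially(
  names: string[],
  endpoint: string,
  pageOrigin: string,
  isSupported: (name: string) => boolean,
  callLlm: (name: string) => Promise<string>
): Promise<FileScanResult[]> {
  const results = names.map((name): FileScanResult => ({ name, status: 'pending' }))
  for (const result of results) {
    if (!isSupported(result.name)) {
      result.status = 'skipped'
      continue
    }
    if (!isSameOrigin(endpoint, pageOrigin)) {
      // Hard gate: no request is made, so no token leaves the page.
      result.status = 'error'
      result.error = 'LLM endpoint is not same-origin'
      continue
    }
    result.status = 'scanning'
    try {
      // Awaiting inside the loop keeps exactly one request in flight.
      result.response = await callLlm(result.name)
      result.status = 'done'
    } catch (e) {
      result.status = 'error'
      result.error = e instanceof Error ? e.message : String(e)
    }
  }
  return results
}
```

In the browser, `pageOrigin` would be `window.location.origin`; passing it as a parameter here only keeps the sketch testable outside a page.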

Two deliberate degradation tiers are supported: when the model returns valid JSON, findings are surfaced as categorized entries with redacted excerpts; when it returns prose, the raw response is stored as a narrative field and rendered verbatim. The same-origin check is a hard gate — cross-origin endpoints produce a per-file error without sending credentials. The batch action registers exclusively on global.files.batch-actions; dual-registration with global.files.context-actions was explicitly rejected during planning.
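The two degradation tiers can be sketched as a single parse step that falls back to a narrative whenever the response is not usable JSON. The `{ "findings": [...] }` response shape, the `Finding` fields, and the `parseLlmResponse` name are assumptions for this example; the PR does not specify the actual schema.

```typescript
// Sketch only: the response schema and all names below are hypothetical.
type Category = 'pii' | 'credentials' | 'confidential'

interface Finding {
  category: Category
  excerpt: string // assumed to be already redacted by the model
}

type ParsedResponse =
  | { kind: 'structured'; findings: Finding[] }
  | { kind: 'narrative'; narrative: string }

const CATEGORIES: readonly string[] = ['pii', 'credentials', 'confidential']

function isFinding(value: unknown): value is Finding {
  if (typeof value !== 'object' || value === null) return false
  const v = value as Record<string, unknown>
  return (
    typeof v.category === 'string' &&
    CATEGORIES.includes(v.category) &&
    typeof v.excerpt === 'string'
  )
}

function parseLlmResponse(raw: string): ParsedResponse {
  try {
    const parsed: unknown = JSON.parse(raw)
    const findings = (parsed as { findings?: unknown } | null)?.findings
    if (Array.isArray(findings) && findings.every(isFinding)) {
      return { kind: 'structured', findings }
    }
  } catch {
    // Not JSON: fall through to the narrative tier.
  }
  // Prose, or JSON that does not match the expected shape, is kept verbatim.
  return { kind: 'narrative', narrative: raw }
}
```

Treating malformed or unexpected JSON the same as prose means the modal always has something to render, at the cost of losing categorization for that file.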

Unit tests cover all rendering states of ScanResultsModal.vue (unconfigured, global in-progress, per-file pending/scanning/skipped/error, narrative fallback, structured findings, and re-scan button visibility). An E2E scaffold (acceptance.spec.ts, ScannerPage.ts, playwright.config.ts, global-setup.ts) is committed but the acceptance tests themselves are out of scope for this PR — they require a live oCIS instance with an LLM sidecar and are not exercised in CI.

Gate

| Check | Result |
| --- | --- |
| Hygiene | ✅ ok |
| Build | ✅ ok |
| Lint | ✅ ok |
| Unit tests | ✅ ok |
| E2E tests | ✅ ok |
| Score | 1.00 |

Effort: M · 🤖 Generated by extctl

…e `packages/web-app-ai-sensitive-data-scanner/` with `package.json`, `vite.config.ts`, `tsconfig.json`, `src/index.ts` stub, `l10n/translations.json`, and `l10n/.tx/config`

Signed-off-by: Lukas Hirt <info@hirt.cz>

Fix two cascading e2e failures caused by oCIS state pollution:

1. oc-modal-background blocks afterEach cleanup: dispatchModal creates a full-screen backdrop with pointer-events that intercepts every click, preventing deleteAllFromPersonal() from reaching the app-switcher button. Set pointer-events: none on the backdrop in ScanResultsModal.onMounted so the modal stays visible while clicks pass through to the nav.
2. Leftover test-document.txt from prior gate runs: when cleanup fails after test 3, the file lingers in oCIS, causing uploadFile() to hang on the "File already exists" conflict dialog in the next run (tests 1 and 2). Add a Playwright globalSetup that deletes the known test fixture files via WebDAV (/remote.php/dav/files/admin/) before the suite runs.

Signed-off-by: Lukas Hirt <info@hirt.cz>
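The WebDAV cleanup from this commit message might be sketched as below. This is not the committed `global-setup.ts`: `fixtureUrl`, `deleteFixtures`, the injected `fetchImpl`, and the accepted status codes are assumptions for illustration; only the `/remote.php/dav/files/admin/` path and the `test-document.txt` fixture name come from the commit message.

```typescript
// Sketch only: helper names, the auth header and status handling are hypothetical.
const DAV_ROOT = '/remote.php/dav/files/admin/'

// Builds the absolute WebDAV URL for a fixture file in the admin's personal space.
function fixtureUrl(baseUrl: string, fileName: string): string {
  return new URL(DAV_ROOT + encodeURIComponent(fileName), baseUrl).toString()
}

type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string> }
) => Promise<{ status: number }>

// Deletes each known fixture before the suite runs so uploads never hit
// the "File already exists" conflict dialog.
async function deleteFixtures(
  baseUrl: string,
  fileNames: string[],
  authHeader: string,
  fetchImpl: FetchLike
): Promise<void> {
  for (const name of fileNames) {
    const res = await fetchImpl(fixtureUrl(baseUrl, name), {
      method: 'DELETE',
      headers: { Authorization: authHeader }
    })
    const deleted = res.status >= 200 && res.status < 300
    const alreadyAbsent = res.status === 404
    if (!deleted && !alreadyAbsent) {
      throw new Error(`Could not delete ${name}: HTTP ${res.status}`)
    }
  }
}
```

Treating a 404 as success makes the setup idempotent: a clean instance and a polluted one both end in the same state.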


kw-security · 2026-06-23T23:16:11Z

✅ Snyk checks have passed. No issues have been found so far.

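| Status | Scan Engine | Critical | High | Medium | Low | Total (0) |
| --- | --- | --- | --- | --- | --- | --- |
| ✅ | Open Source Security | 0 | 0 | 0 | 0 | 0 issues |
| ✅ | Licenses | 0 | 0 | 0 | 0 | 0 issues |
| ✅ | Code Security | 0 | 0 | 0 | 0 | 0 issues |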


LukasHirt added 29 commits June 23, 2026 13:48

fix(web-app-ai-sensitive-data-scanner): repair failing stage

e6dc263

Signed-off-by: Lukas Hirt <info@hirt.cz>

fix(web-app-ai-sensitive-data-scanner): repair failing stage

a9171a3

Signed-off-by: Lukas Hirt <info@hirt.cz>

fix(web-app-ai-sensitive-data-scanner): repair failing stage

4606295

Signed-off-by: Lukas Hirt <info@hirt.cz>

fix(web-app-ai-sensitive-data-scanner): repair failing stage

334934a

Signed-off-by: Lukas Hirt <info@hirt.cz>

fix(web-app-ai-sensitive-data-scanner): repair failing stage

f927ec9

Signed-off-by: Lukas Hirt <info@hirt.cz>

fix(web-app-ai-sensitive-data-scanner): repair failing stage

48050fa

Signed-off-by: Lukas Hirt <info@hirt.cz>

fix(web-app-ai-sensitive-data-scanner): repair failing stage

4c27ab8

Signed-off-by: Lukas Hirt <info@hirt.cz>

fix(web-app-ai-sensitive-data-scanner): repair failing stage

37e3f96

fix(web-app-ai-sensitive-data-scanner): repair failing stage

149d2d1

fix: drop pointer events style from modal

0ad77a6

Signed-off-by: Lukas Hirt <info@hirt.cz>

fix(web-app-ai-sensitive-data-scanner): repair failing stage

f3b3e51

Signed-off-by: Lukas Hirt <info@hirt.cz>

fix(web-app-ai-sensitive-data-scanner): repair failing stage

58f5fee

Signed-off-by: Lukas Hirt <info@hirt.cz>

fix(web-app-ai-sensitive-data-scanner): repair failing stage

cdd970f

Signed-off-by: Lukas Hirt <info@hirt.cz>

feat(web-app-ai-sensitive-data-scanner): Implement core composables: …

03620ec

…`src/composables/useLlm.ts` (copied from `web-app-ai-doc-summary`) and `src/utils/file-support.ts` Signed-off-by: Lukas Hirt <info@hirt.cz>

fix(web-app-ai-sensitive-data-scanner): repair failing stage

eec6aaf

Signed-off-by: Lukas Hirt <info@hirt.cz>

feat(web-app-ai-sensitive-data-scanner): Implement `src/composables/u…

92bf28a

…seScan.ts`: text/PDF file fetching, sequential LLM calls with structured-output + plain-text fallback, same-origin endpoint validation, and per-file result state Signed-off-by: Lukas Hirt <info@hirt.cz>

fix(web-app-ai-sensitive-data-scanner): repair failing stage

a71083c

fix(web-app-ai-sensitive-data-scanner): repair failing stage

e05a1de

fix(web-app-ai-sensitive-data-scanner): repair failing stage

66287b4

fix(web-app-ai-sensitive-data-scanner): repair failing stage

8d15600

Signed-off-by: Lukas Hirt <info@hirt.cz>

fix(web-app-ai-sensitive-data-scanner): repair failing stage

6d7dbd1

Signed-off-by: Lukas Hirt <info@hirt.cz>

feat(web-app-ai-sensitive-data-scanner): Wire the extension entry poi…

9f2b32c

…nt: complete `src/index.ts` to register the `ActionExtension` on `global.files.batch-actions` with `isVisible` guard and `dispatchModal` handler Signed-off-by: Lukas Hirt <info@hirt.cz>

feat(web-app-ai-sensitive-data-scanner): Build `src/components/ScanRe…

7e9c8ab

…sultsModal.vue`: scanning progress, per-file findings tables (structured) and narrative fallback, unconfigured-LLM state, using ODS components Signed-off-by: Lukas Hirt <info@hirt.cz>

test(web-app-ai-sensitive-data-scanner): Write unit tests in `tests/u…

2eb943a

…nit/components/ScanResultModal.spec.ts` and add the E2E scaffold in `tests/e2e/` Signed-off-by: Lukas Hirt <info@hirt.cz>

fix(web-app-ai-sensitive-data-scanner): repair failing stage

00f7b4b

docs(web-app-ai-sensitive-data-scanner): Update README.md (and CLAUDE…

82f11a5

….md if present) for the extension Signed-off-by: Lukas Hirt <info@hirt.cz>

chore(web-app-ai-sensitive-data-scanner): register in docker-compose,…

985fe26

… CI matrix, and oCIS apps config Signed-off-by: Lukas Hirt <info@hirt.cz>

LukasHirt wants to merge 29 commits into main from ext/2026-06-22-ai-sensitive-data-scanner
