Allow XENOVA_CACHE_HOME to redirect local embedding model path by canxer314 · Pull Request #862 · rohitg00/agentmemory

canxer314 · 2026-06-08T14:42:13Z

Problem

@xenova/transformers defaults env.localModelPath to its own install directory — deep inside npm's global node_modules (e.g. .../node_modules/@xenova/transformers/models/). When users pre-download embedding models to ~/.cache/Xenova/ — a common pattern in offline or restricted-network environments — the library never looks there.

If HuggingFace CDN is unreachable (firewall, air-gapped network, etc.), every observation save logs:

[agentmemory] warn vector-index add: embed failed — skipping {"provider":"local","error":"fetch failed"}

This means every new observation lacks a vector embedding, and semantic search silently degrades to BM25-only.

Fix

In LocalEmbeddingProvider.getExtractor(), read XENOVA_CACHE_HOME from the environment after importing @xenova/transformers. When set, override both env.localModelPath (the primary file lookup path) and env.cacheDir (the download cache destination) before calling pipeline().

`XENOVA_CACHE_HOME`	Behavior
unset	Unchanged — uses default paths inside npm `node_modules`
`~/.cache/Xenova`	Finds pre-downloaded models directly, no network fetch needed
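
The two rows above can be sketched as a small pure function. This is an illustrative sketch only: `ModelPathEnv` and `applyCacheHome` are hypothetical names that do not appear in the PR, and the default paths are placeholders rather than the library's real values.

```typescript
// Minimal stand-in for the two `env` fields the PR touches (hypothetical type name).
interface ModelPathEnv {
  localModelPath: string;
  cacheDir: string;
}

// Hypothetical helper: redirects both paths when a cache home is provided.
// Returns true when the override was applied, false when defaults were kept.
function applyCacheHome(env: ModelPathEnv, cacheHome: string | undefined): boolean {
  if (!cacheHome) {
    return false; // unset or empty: leave the default paths untouched
  }
  env.localModelPath = cacheHome; // where pre-downloaded models are looked up
  env.cacheDir = cacheHome; // where any later downloads are persisted
  return true;
}

// Placeholder defaults for illustration, not the library's actual values.
const exampleEnv: ModelPathEnv = {
  localModelPath: "/placeholder/models/",
  cacheDir: "/placeholder/.cache/",
};
applyCacheHome(exampleEnv, "/home/user/.cache/Xenova");
```

In the PR itself the value comes from `process.env["XENOVA_CACHE_HOME"]` and the target object is `transformers.env`; passing both in as arguments here just keeps the sketch free of library imports.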

Why `localModelPath` + `cacheDir` both

localModelPath is the primary lookup (hub.js:392) — the library checks it first
cacheDir is where downloads are persisted — redirecting it keeps the npm directory clean

The convention matches: ~/.cache/Xenova/ already uses the {org}/{model}/ layout that @xenova/transformers expects, so XENOVA_CACHE_HOME=~/.cache/Xenova works directly with pre-downloaded models.
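
To check ahead of time that a pre-downloaded model sits where the `{org}/{model}/` layout implies, a small helper along these lines may be useful. It is a sketch under assumptions: `modelDirFor` and `hasLocalModel` are hypothetical names, the model id is a placeholder, and the check only confirms that the directory exists, not which files the library needs inside it.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: maps "org/model" onto <cacheHome>/<org>/<model>.
function modelDirFor(cacheHome: string, modelId: string): string {
  return path.join(cacheHome, ...modelId.split("/"));
}

// Hypothetical helper: true when the model directory exists under cacheHome.
// It does not inspect the directory contents.
function hasLocalModel(cacheHome: string, modelId: string): boolean {
  const dir = modelDirFor(cacheHome, modelId);
  return fs.existsSync(dir) && fs.statSync(dir).isDirectory();
}

// Usage with placeholder values:
// hasLocalModel("/home/user/.cache/Xenova", "some-org/some-model")
```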

Verification

XENOVA_CACHE_HOME=~/.cache/Xenova → embed succeeds, 384-dim vector returned
env var unset → env.localModelPath unchanged, existing behavior preserved
test/embedding-provider.test.ts unaffected (no env var set → default path)

Summary by CodeRabbit

New Features
- Added support for offline and restricted-network setups, enabling use of pre-downloaded models without network access via environment variable configuration.

@xenova/transformers defaults localModelPath to its own install dir inside npm global node_modules. When users pre-download models to ~/.cache/Xenova/ (offline or restricted-network environments), the library cannot find them and tries to fetch from huggingface.co. If that network is unreachable, every observation save logs:

vector-index add: embed failed - skipping (provider: local)

Read XENOVA_CACHE_HOME from the environment and, when set, override both env.localModelPath and env.cacheDir before the pipeline() call. Unset = zero behavior change.

vercel · 2026-06-08T14:42:18Z

@canxer314 is attempting to deploy a commit to the rohitg00's projects Team on Vercel.

A member of the Team first needs to authorize it.

coderabbitai · 2026-06-08T14:42:28Z

Walkthrough

LocalEmbeddingProvider now supports offline model loading via the XENOVA_CACHE_HOME environment variable. When set, this variable redirects both the Xenova local model lookup path and transformer cache directory, enabling pre-downloaded model usage without network access in restricted environments.

Changes

Offline model cache configuration

Layer / File(s)	Summary
Xenova cache home environment variable support `src/providers/embedding/local.ts`	The `transformers` module type is expanded to expose `env` object with `localModelPath` and `cacheDir` fields. `getExtractor()` now reads `XENOVA_CACHE_HOME` and applies it to both paths when the environment variable is set.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Poem

🐰 A local model hops into place,
No network needed for this cache—
Xenova whispers through the env,
Offline magic, oh so brave!
thump thump ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'Allow XENOVA_CACHE_HOME to redirect local embedding model path' directly and accurately describes the main change: enabling the XENOVA_CACHE_HOME environment variable to redirect the embedding model lookup path.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/providers/embedding/local.ts`:
- Around line 49-60: Remove the explanatory "WHAT" comment block above the
XENOVA cache logic and replace it with either a concise WHY comment like "//
Support offline setups via XENOVA_CACHE_HOME" or no comment at all, leaving the
code self-descriptive; ensure the code around the cacheHome variable and
assignments to transformers.env.localModelPath and transformers.env.cacheDir
(and the XENOVA_CACHE_HOME env var name) remains unchanged so the intent is
preserved without the long HOW explanation.
- Around line 36-39: Extend the Xenova types in src/xenova.d.ts to include env
(with localModelPath and cacheDir) alongside pipeline and remove the //
`@ts-ignore` usage in src/providers/embedding/local.ts; update the file to
import/use the declared types instead of an inline env type, shorten the long
comment block to a concise one-line note about configuring
localModelPath/cacheDir, and (optionally) normalize XENOVA_CACHE_HOME by
expanding a leading "~" using os.homedir() before assigning to
transformers.env.cacheDir so paths are correct at runtime.


ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e9295cfe-5f6c-459d-abd0-82c2fbbf3122

📥 Commits

Reviewing files that changed from the base of the PR and between 749c280 and 0b93ba7.

📒 Files selected for processing (1)

src/providers/embedding/local.ts

coderabbitai · 2026-06-08T14:46:51Z

+    let transformers: {
+      pipeline: Pipeline;
+      env: { localModelPath: string; cacheDir: string };
+    };


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Verify @xenova/transformers package version and check for env property documentation

# Check installed version
npm list @xenova/transformers 2>/dev/null || echo "Package not installed in current workspace"

# Search for any existing references to transformers.env in the codebase
rg -n 'transformers\.env' --type=ts -C2

Repository: rohitg00/agentmemory

Length of output: 567

🌐 Web query:

@xenova/transformers version 2.17.2 env.localModelPath env.cacheDir configuration API documentation

💡 Result:

In @xenova/transformers version 2.17.2, the env object is the central configuration API used to control the library's behavior, including model loading and caching [1][2].

env.localModelPath
This property specifies the base path from which local models are loaded [1][2]. It defaults to '/models/' [1][2]. You can configure it as follows:

import { env } from '@xenova/transformers';
env.localModelPath = '/path/to/local/models/';

env.cacheDir
This property defines the directory used for caching files when the file system (FSCache) is utilized [1][2]. By default, it is set to './.cache' [1][2]. You can configure it as follows:

import { env } from '@xenova/transformers';
env.cacheDir = '/path/to/cache/directory/';

Additional configuration notes:
- To enable local model loading, ensure env.allowLocalModels is set to true (which is typically the default) [1][2].
- For caching, the library uses both browser and file system caches depending on the environment [1][2]. You can control these using boolean flags such as env.useBrowserCache and env.useFSCache [1][2].
- The env object also exposes backend-specific configurations, such as the location of WASM files via env.backends.onnx.wasm.wasmPaths [3][4].

Citations:

1: https://huggingface.co/docs/transformers.js/v2.17.2/en/api/env

2: https://huggingface.co/docs/transformers.js/v2.17.2/api/env

3: https://huggingface.co/docs/transformers.js/v2.17.2/en/custom_usage

4: https://npmx.dev/package/@xenova/transformers/v/%5E2.17.2

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Show the relevant sections of local.ts around the env type + assignments
sed -n '1,120p' src/providers/embedding/local.ts | nl -ba | sed -n '25,95p'

# Show xenova.d.ts declarations for pipeline (and whether env is declared)
ls -la src/xenova.d.ts
sed -n '1,200p' src/xenova.d.ts | nl -ba

Repository: rohitg00/agentmemory

Length of output: 107

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Print local.ts with line numbers (no `nl` in this sandbox)
awk '{printf "%4d: %s\n", NR, $0}' src/providers/embedding/local.ts | sed -n '1,120p'
awk '{printf "%4d: %s\n", NR, $0}' src/providers/embedding/local.ts | sed -n '120,220p'

echo "---- src/xenova.d.ts ----"
awk '{printf "%4d: %s\n", NR, $0}' src/xenova.d.ts | sed -n '1,200p'

Repository: rohitg00/agentmemory

Length of output: 2760

Simplify the cache/config comments and align TypeScript typings for @xenova/transformers env

transformers.env.localModelPath and transformers.env.cacheDir are part of the documented @xenova/transformers v2.17.2 env API, so the runtime property usage is reasonable.

`src/xenova.d.ts` currently only declares pipeline, so extend it to type env (and remove the `// @ts-ignore`), instead of relying on an inline env type in `src/providers/embedding/local.ts`.

The long comment block in src/providers/embedding/local.ts (lines 49-55) explains behavior (“what”), which conflicts with the “don’t explain what” guideline; shorten it or restructure to rely on clearer naming.

Optional: if XENOVA_CACHE_HOME may be provided as ~/.cache/... literally, expand ~ via os.homedir() before assigning to transformers.env.*.
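
The optional normalization could look roughly like the sketch below. `expandLeadingTilde` is a hypothetical helper name, not something in the PR. Shells usually expand an unquoted leading `~` in an assignment before Node sees it, so this mainly matters when the value arrives literally, for example from a quoted string or a config file.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: expands a leading "~" or "~/" to the current user's
// home directory. Other forms, including "~otheruser/...", are returned unchanged.
function expandLeadingTilde(p: string): string {
  if (p === "~") {
    return os.homedir();
  }
  if (p.startsWith("~/")) {
    return path.join(os.homedir(), p.slice(2));
  }
  return p;
}

// Example: a literal "~/.cache/Xenova" resolves under the home directory.
const resolvedCacheHome = expandLeadingTilde("~/.cache/Xenova");
```

The resolved value would then be assigned to both `transformers.env.localModelPath` and `transformers.env.cacheDir`, so the lookup path and the download cache stay in agreement.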

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/providers/embedding/local.ts` around lines 36 - 39, Extend the Xenova types in src/xenova.d.ts to include env (with localModelPath and cacheDir) alongside pipeline and remove the // `@ts-ignore` usage in src/providers/embedding/local.ts; update the file to import/use the declared types instead of an inline env type, shorten the long comment block to a concise one-line note about configuring localModelPath/cacheDir, and (optionally) normalize XENOVA_CACHE_HOME by expanding a leading "~" using os.homedir() before assigning to transformers.env.cacheDir so paths are correct at runtime.

coderabbitai · 2026-06-08T14:46:51Z

+    // Pre-downloaded models (offline / restricted-network setups) live in
+    // ~/.cache/Xenova/ by convention. @xenova/transformers defaults
+    // localModelPath to its own install dir — which is deep inside npm's
+    // global node_modules and rarely holds pre-downloaded files. When
+    // XENOVA_CACHE_HOME is set, redirect both the local-model lookup and
+    // the download cache so the library finds existing files without a
+    // network fetch.
+    const cacheHome = process.env["XENOVA_CACHE_HOME"];
+    if (cacheHome) {
+      transformers.env.localModelPath = cacheHome;
+      transformers.env.cacheDir = cacheHome;
+    }


🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Remove WHAT-comments per coding guidelines.

Lines 49-55 contain comments explaining what the code does, which violates the project guideline: "Do not use code comments explaining WHAT — use clear naming instead." The logic is simple enough that a well-named helper or clear variable names suffice.

♻️ Refactor to remove WHAT-comments

-    // Pre-downloaded models (offline / restricted-network setups) live in
-    // ~/.cache/Xenova/ by convention. @xenova/transformers defaults
-    // localModelPath to its own install dir — which is deep inside npm's
-    // global node_modules and rarely holds pre-downloaded files. When
-    // XENOVA_CACHE_HOME is set, redirect both the local-model lookup and
-    // the download cache so the library finds existing files without a
-    // network fetch.
     const cacheHome = process.env["XENOVA_CACHE_HOME"];
     if (cacheHome) {
       transformers.env.localModelPath = cacheHome;
       transformers.env.cacheDir = cacheHome;
     }

If context is needed for future maintainers, consider a brief WHY comment explaining the user requirement (e.g., // Support offline setups via XENOVA_CACHE_HOME) rather than explaining the mechanics.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-    // Pre-downloaded models (offline / restricted-network setups) live in
-    // ~/.cache/Xenova/ by convention. @xenova/transformers defaults
-    // localModelPath to its own install dir — which is deep inside npm's
-    // global node_modules and rarely holds pre-downloaded files. When
-    // XENOVA_CACHE_HOME is set, redirect both the local-model lookup and
-    // the download cache so the library finds existing files without a
-    // network fetch.
-    const cacheHome = process.env["XENOVA_CACHE_HOME"];
-    if (cacheHome) {
-      transformers.env.localModelPath = cacheHome;
-      transformers.env.cacheDir = cacheHome;
-    }
+    const cacheHome = process.env["XENOVA_CACHE_HOME"];
+    if (cacheHome) {
+      transformers.env.localModelPath = cacheHome;
+      transformers.env.cacheDir = cacheHome;
+    }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/providers/embedding/local.ts` around lines 49 - 60, Remove the explanatory "WHAT" comment block above the XENOVA cache logic and replace it with either a concise WHY comment like "// Support offline setups via XENOVA_CACHE_HOME" or no comment at all, leaving the code self-descriptive; ensure the code around the cacheHome variable and assignments to transformers.env.localModelPath and transformers.env.cacheDir (and the XENOVA_CACHE_HOME env var name) remains unchanged so the intent is preserved without the long HOW explanation.

Source: Coding guidelines

coderabbitai Bot reviewed Jun 8, 2026


wbugitlab1 mentioned this pull request Jun 15, 2026

[upstream PR 862] Allow XENOVA_CACHE_HOME to redirect local embedding model path wbugitlab1/agentmemory#413

Open

3 tasks

canxer314 wants to merge 1 commit into rohitg00:main from canxer314:feature/xenova-cache-home
