
[upstream PR 862] Allow XENOVA_CACHE_HOME to redirect local embedding model path #413

@wbugitlab1


Source: pull request 862 in rohitg00/agentmemory (URL omitted to avoid GitHub cross-reference)
Title: Allow XENOVA_CACHE_HOME to redirect local embedding model path
Author: canxer314
State: open
Draft: no
Merged: no
Head: canxer314/agentmemory:feature/xenova-cache-home @ 0b93ba7
Base: main @ 749c280
Labels: (none)
Changed files: 0
Commits: 0
Created: 2026-06-08T14:42:13Z
Updated: 2026-06-08T14:46:52Z
Closed: (not closed)
Merged at: (not merged)

Original PR body:

Problem

@xenova/transformers defaults env.localModelPath to its own install directory, deep inside npm's global node_modules (e.g. .../node_modules/@xenova/transformers/models/). When users pre-download embedding models to ~/.cache/Xenova/, a common pattern in offline or restricted-network environments, the library never looks there.

If the Hugging Face CDN is unreachable (firewall, air-gapped network, etc.), every observation save logs:

[agentmemory] warn vector-index add: embed failed — skipping {"provider":"local","error":"fetch failed"}

This means every new observation lacks a vector embedding, and semantic search silently degrades to BM25-only.

Fix

In LocalEmbeddingProvider.getExtractor(), read XENOVA_CACHE_HOME from the environment after importing @xenova/transformers. When set, override both env.localModelPath (the primary file lookup path) and env.cacheDir (the download cache destination) before calling pipeline().
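
The override step described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: applyXenovaCacheHome and TransformersEnvLike are hypothetical names, the helper takes plain objects in place of the library's env and process.env so it runs without @xenova/transformers installed, and treating an empty string as "unset" is an assumption.

```typescript
// Stand-in for the two fields of the library's `env` object that the fix touches.
interface TransformersEnvLike {
  localModelPath: string;
  cacheDir: string;
}

// Hypothetical helper; the PR describes doing this inline in
// LocalEmbeddingProvider.getExtractor(), before pipeline() is called.
function applyXenovaCacheHome(
  env: TransformersEnvLike,
  processEnv: Record<string, string | undefined>,
): boolean {
  const cacheHome = processEnv["XENOVA_CACHE_HOME"];
  if (!cacheHome) {
    // Unset (or empty, by assumption): leave the library defaults untouched.
    return false;
  }
  env.localModelPath = cacheHome; // primary lookup location
  env.cacheDir = cacheHome; // where downloads are persisted
  return true;
}

// Plain objects standing in for the real `env` and `process.env`.
const demoEnv: TransformersEnvLike = {
  localModelPath: "/default/models/",
  cacheDir: "/default/.cache/",
};
const changed = applyXenovaCacheHome(demoEnv, {
  XENOVA_CACHE_HOME: "/home/user/.cache/Xenova",
});
```

In the real provider the same two assignments would presumably target the env object exported by @xenova/transformers, with process.env as the source of the variable.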

XENOVA_CACHE_HOME    Behavior
unset                Unchanged: uses default paths inside npm node_modules
~/.cache/Xenova      Finds pre-downloaded models directly, no network fetch needed

Why both localModelPath and cacheDir

  • localModelPath is the primary lookup (hub.js:392) — the library checks it first
  • cacheDir is where downloads are persisted — redirecting it keeps the npm directory clean

The convention matches: ~/.cache/Xenova/ already uses the {org}/{model}/ layout that @xenova/transformers expects, so XENOVA_CACHE_HOME=~/.cache/Xenova works directly with pre-downloaded models.
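
To make the {org}/{model}/ layout concrete, here is a small sketch of where a model file would be expected under the redirected root. expectedModelFile is a hypothetical helper, "example-org/example-model" is a placeholder id rather than the model agentmemory uses, and POSIX-style separators are assumed.

```typescript
// Hypothetical helper: joins a cache root, a model id in "{org}/{model}" form,
// and a file name into the path a local lookup would be expected to check.
function expectedModelFile(
  cacheRoot: string,
  modelId: string,
  fileName: string,
): string {
  const root = cacheRoot.replace(/\/+$/, ""); // drop any trailing slashes
  return `${root}/${modelId}/${fileName}`;
}

// Placeholder model id; substitute whichever model was pre-downloaded.
const configPath = expectedModelFile(
  "/home/user/.cache/Xenova/",
  "example-org/example-model",
  "config.json",
);
```

A check like this can help confirm that pre-downloaded files sit one level below the org directory before pointing XENOVA_CACHE_HOME at the root.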

Verification

  • XENOVA_CACHE_HOME=~/.cache/Xenova → embed succeeds, 384-dim vector returned
  • env var unset → env.localModelPath unchanged, existing behavior preserved
  • test/embedding-provider.test.ts unaffected (no env var set → default path)

Summary by CodeRabbit

  • New Features
    • Added support for offline and restricted-network setups, enabling use of pre-downloaded models without network access via environment variable configuration.

Local branch:
Fork PR:
Fork decision:
Verification:
Notes:

Metadata

Assignees

No one assigned

Labels

decision-candidate: Fork decision has not been made
upstream-open: Upstream pull request is open
upstream-pr: Tracks an upstream pull request

Projects

No projects

Milestone

No milestone
