Add PersistentProgramCache (sqlite + filestream backends) #1912

Open
cpcloud wants to merge 1 commit into NVIDIA:main from cpcloud:persistent-program-cache-178

cpcloud commented Apr 14, 2026

Summary

  • Converts cuda.core.utils from a module to a package
  • Adds ProgramCacheResource ABC with dict-like interface for compiled-program caches
  • Adds make_program_cache_key(): blake2b digest incorporating schema version, cuda-core/driver/nvrtc versions, code, options, extra_sources, and use_libdevice
  • Adds SQLiteProgramCache: LRU eviction, single-process, max_size_bytes cap
  • Adds FileStreamProgramCache: os.replace atomic writes, mtime-based eviction, multi-process safe
  • ~40 unit tests + 3 multiprocess stress tests
  • API docs added to api.rst
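
To illustrate the key-derivation idea, here is a minimal sketch of a stable 32-byte blake2b key. It is not the PR's implementation: `sketch_cache_key`, its parameters, and the field labels are hypothetical, and the real `make_program_cache_key()` covers more inputs (linker backend, NVVM fields, name expressions). Only `hashlib.blake2b` is a real API here.

```python
import hashlib


def sketch_cache_key(code, code_type, options=(), versions=(), extra_digest=None):
    """Hypothetical stand-in for make_program_cache_key(); illustration only."""
    h = hashlib.blake2b(digest_size=32)

    def feed(label, value):
        # Normalise str to bytes so "abc" and b"abc" produce the same key.
        if isinstance(value, str):
            value = value.encode("utf-8")
        # Length-prefix each value so ("ab", "c") and ("a", "bc") differ.
        h.update(label.encode("ascii"))
        h.update(len(value).to_bytes(8, "little"))
        h.update(value)

    feed("schema", b"1")          # assumed schema version tag
    feed("code", code)
    feed("code_type", code_type)
    for opt in options:           # compile options, in caller-supplied order
        feed("option", opt)
    for ver in versions:          # e.g. library, driver, and compiler versions
        feed("version", ver)
    if extra_digest is not None:  # digest of external file content, if any
        feed("extra", extra_digest)
    return h.digest()
```

Length-prefixing each field is one simple way to keep adjacent fields from bleeding into each other; any unambiguous encoding would serve the same purpose.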

Split design (two classes, not unified): different concurrency and eviction semantics make a single class with a mode flag misleading.
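
A rough sketch of what a dict-like cache ABC with context-manager support could look like. The class and method names below are hypothetical and are not taken from the PR; the in-memory subclass exists only to exercise the interface, and it stores raw bytes where the real caches store ObjectCode.

```python
from abc import ABC, abstractmethod


class SketchCacheResource(ABC):
    """Hypothetical dict-like cache interface; illustration only."""

    @abstractmethod
    def __getitem__(self, key): ...

    @abstractmethod
    def __setitem__(self, key, value): ...

    @abstractmethod
    def __delitem__(self, key): ...

    @abstractmethod
    def close(self): ...

    def __contains__(self, key):
        # Membership is defined in terms of lookup, so backends only
        # need to implement the four abstract methods.
        try:
            self[key]
        except KeyError:
            return False
        return True

    def get(self, key, default=None):
        try:
            return self[key]
        except KeyError:
            return default

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.close()
        return False  # never suppress exceptions


class InMemoryCache(SketchCacheResource):
    """Toy backend used only to demonstrate the interface."""

    def __init__(self):
        self._data = {}
        self.closed = False

    @staticmethod
    def _norm(key):
        # Accept bytes or str keys and store them uniformly as bytes.
        return key.encode("utf-8") if isinstance(key, str) else bytes(key)

    def __getitem__(self, key):
        return self._data[self._norm(key)]

    def __setitem__(self, key, value):
        self._data[self._norm(key)] = value

    def __delitem__(self, key):
        del self._data[self._norm(key)]

    def close(self):
        self.closed = True
```

With a shared interface like this, each backend can keep its own concurrency and eviction behaviour without a mode flag on a single class.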

Program.compile(cache=...) integration is out of scope (tracked by #176/#179).

Test plan

  • ~40 unit tests covering ABC contract, key generation, CRUD, eviction, corruption recovery
  • 3 multiprocess tests (concurrent writers same key, distinct keys, reader vs writer)
  • CI: end-to-end with real Program compilation (requires GPU)

Closes #178

🤖 Generated with Claude Code

cpcloud added this to the cuda.core v1.0.0 milestone Apr 14, 2026
cpcloud added the P0, feature, and cuda.core labels Apr 14, 2026
cpcloud self-assigned this Apr 14, 2026
cpcloud force-pushed the persistent-program-cache-178 branch from de57bd8 to ac38a68 Apr 14, 2026 22:15
cpcloud force-pushed the persistent-program-cache-178 branch 12 times, most recently from dec7518 to fcee07b Apr 18, 2026 11:00

Convert cuda.core.utils to a package and add persistent, on-disk caches
for compiled ObjectCode produced by Program.compile.

Public API (cuda.core.utils):
  * ProgramCacheResource  -- abstract bytes|str -> ObjectCode mapping
    with context manager and pickle-safety warning. Path-backed
    ObjectCode is rejected at write time (would store only the path).
  * SQLiteProgramCache    -- single-file sqlite3 backend (WAL mode,
    autocommit) with LRU eviction against an optional size cap. A
    threading.RLock serialises connection use so one cache object is
    safe across threads. wal_checkpoint(TRUNCATE) + VACUUM run after
    evictions so the size cap bounds real on-disk usage, not just
    logical payload. Schema-version mismatch on open wipes entries.
  * FileStreamProgramCache -- directory of atomically-written entries
    (tmp + os.replace) safe across concurrent processes, with
    best-effort size enforcement by mtime. Windows-only PermissionError
    from os.replace is swallowed as a cache miss; other platforms
    re-raise. Schema-version mismatch on open wipes entries.
  * make_program_cache_key -- stable 32-byte blake2b key over code,
    code_type, ProgramOptions (including options.name), target_type,
    name expressions (normalised str/bytes), cuda core/driver/NVRTC
    versions, linker backend+version for PTX inputs, NVVM-specific
    fields (extra_sources, use_libdevice), and an optional extra_digest
    that callers MUST supply when options pull in external file content
    (include_path, pre_include, pch, use_pch, pch_dir).
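
The "tmp + os.replace" write and the mtime-based size enforcement described for the file-backed cache can be sketched roughly as follows. This is an assumption-laden illustration, not the PR's code: `atomic_write`, `evict_oldest`, and the `.tmp-` prefix are hypothetical names, and the Windows PermissionError handling mentioned above is not modelled.

```python
import os
import tempfile
from pathlib import Path


def atomic_write(directory, name, payload):
    """Write payload to directory/name so readers never see a partial file."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the same directory so os.replace stays on one
    # filesystem, which is what makes the final rename atomic.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, directory / name)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
    return directory / name


def evict_oldest(directory, max_size_bytes):
    """Best-effort eviction: delete oldest-mtime entries until under the cap."""
    stats = []
    for p in Path(directory).iterdir():
        if p.name.startswith(".tmp-") or not p.is_file():
            continue
        try:
            st = p.stat()
        except FileNotFoundError:
            continue  # another process removed it first
        stats.append((st.st_mtime_ns, p.name, st.st_size, p))
    stats.sort(key=lambda t: (t[0], t[1]))
    total = sum(size for _, _, size, _ in stats)
    removed = []
    for _, name, size, p in stats:
        if total <= max_size_bytes:
            break
        try:
            p.unlink()
        except FileNotFoundError:
            pass  # a concurrent evictor got there first
        total -= size
        removed.append(name)
    return removed
```

Because each entry appears under its final name only after the rename, a concurrent reader sees either the old entry, the new entry, or a miss.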

sqlite3 is imported lazily so the package is usable on interpreters
built without libsqlite3.
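
As a rough illustration of the lazy import combined with LRU eviction against a size cap, consider the sketch below. `SketchSQLiteCache`, its table layout, and its method names are hypothetical; the sketch omits the RLock, schema versioning, and the wal_checkpoint + VACUUM step described above, and it uses a logical access counter instead of timestamps to keep ordering deterministic.

```python
class SketchSQLiteCache:
    """Hypothetical LRU cache over sqlite3; illustration only."""

    def __init__(self, path, max_size_bytes=None):
        # Deferred import: merely importing this module does not require
        # an interpreter built with sqlite3 support.
        import sqlite3

        # isolation_level=None puts the connection in autocommit mode.
        self._conn = sqlite3.connect(path, isolation_level=None)
        self._conn.execute("PRAGMA journal_mode=WAL")
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS entries ("
            "key BLOB PRIMARY KEY, payload BLOB NOT NULL, "
            "size INTEGER NOT NULL, accessed INTEGER NOT NULL)"
        )
        self._max = max_size_bytes

    def _tick(self):
        # Next value of a logical clock: one more than the newest access.
        row = self._conn.execute(
            "SELECT COALESCE(MAX(accessed), 0) + 1 FROM entries"
        ).fetchone()
        return row[0]

    def get(self, key):
        row = self._conn.execute(
            "SELECT payload FROM entries WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            return None
        # A hit refreshes recency so the entry is evicted later.
        self._conn.execute(
            "UPDATE entries SET accessed = ? WHERE key = ?", (self._tick(), key)
        )
        return row[0]

    def put(self, key, payload):
        self._conn.execute(
            "INSERT OR REPLACE INTO entries VALUES (?, ?, ?, ?)",
            (key, payload, len(payload), self._tick()),
        )
        self._evict()

    def _evict(self):
        if self._max is None:
            return
        while True:
            total = self._conn.execute(
                "SELECT COALESCE(SUM(size), 0) FROM entries"
            ).fetchone()[0]
            if total <= self._max:
                break
            # Drop the least recently used entry and re-check the total.
            self._conn.execute(
                "DELETE FROM entries WHERE key = "
                "(SELECT key FROM entries ORDER BY accessed LIMIT 1)"
            )

    def close(self):
        self._conn.close()
```

This sketch bounds only the logical payload size; bounding real on-disk usage would additionally need the checkpoint and VACUUM pass that the commit message describes.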

Tests: single-process CRUD, LRU/size-cap (logical and on-disk),
corruption, schema-mismatch, threaded access (SQLite), multiprocess
stress (FileStream), Windows vs POSIX PermissionError behaviour, and
an end-to-end test that compiles a real CUDA C++ kernel, stores the
ObjectCode, reopens the cache, and calls get_kernel on the deserialised
copy. Public API is documented in cuda_core/docs/source/api.rst.

cpcloud force-pushed the persistent-program-cache-178 branch from fcee07b to 7cfc5b5 Apr 18, 2026 11:05