Add PersistentProgramCache (sqlite + filestream backends)#1912
Open
cpcloud wants to merge 1 commit into NVIDIA:main from
Convert cuda.core.utils to a package and add persistent, on-disk caches
for compiled ObjectCode produced by Program.compile.
Public API (cuda.core.utils):
* ProgramCacheResource -- abstract bytes|str -> ObjectCode mapping
with context manager and pickle-safety warning. Path-backed
ObjectCode is rejected at write time (would store only the path).
* SQLiteProgramCache -- single-file sqlite3 backend (WAL mode,
autocommit) with LRU eviction against an optional size cap. A
threading.RLock serialises connection use so one cache object is
safe across threads. wal_checkpoint(TRUNCATE) + VACUUM run after
evictions so the size cap bounds real on-disk usage, not just
logical payload. Schema-version mismatch on open wipes entries.
* FileStreamProgramCache -- directory of atomically-written entries
(tmp + os.replace) safe across concurrent processes, with
best-effort size enforcement by mtime. Windows-only PermissionError
from os.replace is swallowed as a cache miss; other platforms
re-raise. Schema-version mismatch on open wipes entries.
* make_program_cache_key -- stable 32-byte blake2b key over code,
code_type, ProgramOptions (including options.name), target_type,
name expressions (normalised str/bytes), cuda core/driver/NVRTC
versions, linker backend+version for PTX inputs, NVVM-specific
fields (extra_sources, use_libdevice), and an optional extra_digest
that callers MUST supply when options pull in external file content
(include_path, pre_include, pch, use_pch, pch_dir).
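The key derivation described for make_program_cache_key can be illustrated with a minimal sketch. This is not the PR's implementation: the function name sketch_cache_key, its parameters, and the field-framing scheme are all assumptions made for illustration. It only shows the general technique of feeding labelled, length-prefixed fields into a 32-byte blake2b digest, with str and bytes inputs normalised to the same bytes.

```python
import hashlib


def sketch_cache_key(code, code_type, option_fields, versions, extra_digest=None):
    """Hypothetical helper: derive a 32-byte blake2b key from compile inputs."""
    h = hashlib.blake2b(digest_size=32)

    def feed(label, value):
        # Normalise str and bytes to the same byte representation.
        data = value if isinstance(value, bytes) else str(value).encode("utf-8")
        # Label and length-prefix every field so adjacent fields cannot
        # be confused with one another.
        h.update(label.encode("utf-8") + b"\0")
        h.update(len(data).to_bytes(8, "little"))
        h.update(data)

    feed("code", code)
    feed("code_type", code_type)
    # Sort dictionary keys so the digest does not depend on insertion order.
    for name in sorted(option_fields):
        feed("opt:" + name, repr(option_fields[name]))
    for name in sorted(versions):
        feed("ver:" + name, versions[name])
    # Callers supply this when options pull in external file content.
    if extra_digest is not None:
        feed("extra_digest", extra_digest)
    return h.digest()
```

Under these assumptions, changing any option value, version string, or the extra digest yields a different key, while passing the same source as str or bytes yields the same key.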
sqlite3 is imported lazily so the package is usable on interpreters
built without libsqlite3.
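A lazy import of this kind might look like the sketch below. The class name LazySqliteUser and its methods are hypothetical and do not come from the PR; the point is only that importing the enclosing module never touches sqlite3, and the import happens on first use of the connection.

```python
import importlib


class LazySqliteUser:
    """Hypothetical class: defers importing sqlite3 until first use."""

    def __init__(self, path=":memory:"):
        self._path = path
        self._conn = None  # no sqlite3 import has happened yet

    def _connection(self):
        if self._conn is None:
            try:
                # Imported here rather than at module top level, so the
                # module stays importable without libsqlite3.
                sqlite3 = importlib.import_module("sqlite3")
            except ImportError as exc:
                raise RuntimeError("sqlite3 is unavailable in this interpreter") from exc
            self._conn = sqlite3.connect(self._path)
        return self._conn

    def ping(self):
        # Trivial query to show the connection works once created.
        return self._connection().execute("SELECT 1").fetchone()[0]
```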
Tests: single-process CRUD, LRU/size-cap (logical and on-disk),
corruption, schema-mismatch, threaded access (SQLite), multiprocess
stress (FileStream), Windows vs POSIX PermissionError behaviour, and
an end-to-end test that compiles a real CUDA C++ kernel, stores the
ObjectCode, reopens the cache, and calls get_kernel on the deserialised
copy. Public API is documented in cuda_core/docs/source/api.rst.
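The tmp + os.replace write pattern described for FileStreamProgramCache can be sketched as follows. The helpers atomic_write and read_entry are hypothetical names, not the PR's API, and the sketch omits the Windows-specific PermissionError handling and size enforcement mentioned above. It shows only why concurrent readers see either the old entry or the new one, never a partial write.

```python
import os
import tempfile


def atomic_write(directory, name, payload):
    """Hypothetical helper: write payload to directory/name via tmp + os.replace."""
    os.makedirs(directory, exist_ok=True)
    # Create the temporary file in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())
        # os.replace swaps the entry in a single step, overwriting any
        # existing file with the same name.
        os.replace(tmp_path, os.path.join(directory, name))
    except BaseException:
        # Remove the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise


def read_entry(directory, name):
    """Return the stored bytes, or None on a cache miss."""
    try:
        with open(os.path.join(directory, name), "rb") as f:
            return f.read()
    except FileNotFoundError:
        return None
```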
Summary
* Convert cuda.core.utils from a module to a package
* ProgramCacheResource ABC with dict-like interface for compiled-program caches
* make_program_cache_key(): blake2b digest incorporating schema version, cuda-core/driver/nvrtc versions, code, options, extra_sources, and use_libdevice
* SQLiteProgramCache: LRU eviction, single-process, max_size_bytes cap
* FileStreamProgramCache: os.replace atomic writes, mtime-based eviction, multi-process safe
* Public API documented in api.rst

Split design (two classes, not unified): different concurrency and eviction semantics make a single class with a mode flag misleading.

Program.compile(cache=...) integration is out of scope (tracked by #176/#179).

Test plan
* Program compilation (requires GPU)

Closes #178
🤖 Generated with Claude Code