fix: deep-copy cache entries so mutating search results cannot corrupt the cache #617

Open
gaoflow wants to merge 1 commit into msiemens:master from gaoflow:fix/query-cache-deepcopy

gaoflow commented Jun 24, 2026

Problem

Table.search stored and returned shallow copies of the document
list (docs[:]). The Document objects inside the cache and the
list returned to the caller were therefore the same objects.
Mutating a nested mutable value (e.g. appending to a list field) in a
returned document silently corrupted the cached results, causing
subsequent identical searches to return wrong data.

Minimal reproduction:

from tinydb import TinyDB, Query
from tinydb.storages import MemoryStorage

db = TinyDB(storage=MemoryStorage)
db.insert({'name': 'Alice', 'tags': ['a', 'b']})

Q = Query()
results = db.search(Q.name == 'Alice')
results[0]['tags'].append('c')          # mutate the returned document

# Second search hits the cache — should still return ['a', 'b']
print(db.search(Q.name == 'Alice'))     # prints [{'name': 'Alice', 'tags': ['a', 'b', 'c']}]

This was reported in issue #516.
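
The aliasing behind this can be seen without TinyDB. The sketch below uses plain dicts as stand-ins for Document objects (an assumption for illustration; it is not TinyDB code) and shows that a slice copy creates a new list but keeps the same element objects:

```python
# Plain dicts stand in for TinyDB Document objects (illustrative assumption).
docs = [{'name': 'Alice', 'tags': ['a', 'b']}]

cached = docs[:]      # shallow copy: new list, same element objects
returned = docs[:]    # what the caller would receive

lists_are_distinct = cached is not returned      # the two lists differ
elements_are_shared = cached[0] is returned[0]   # but they hold the same dict

# Mutating a nested value through the caller's list...
returned[0]['tags'].append('c')

# ...is visible through the "cached" list as well.
cached_tags = cached[0]['tags']
print(lists_are_distinct, elements_are_shared, cached_tags)
```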

Fix

Use copy.deepcopy when writing to and reading from
_query_cache, so cached entries and returned results are fully
independent. The change is two lines in Table.search:

# store
self._query_cache[cond] = deepcopy(docs)

# retrieve
return deepcopy(cached_results)
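
As a rough illustration of the store/retrieve pattern above, here is a minimal, self-contained sketch. `ToyTable`, its `(key, value)` cache key, and its `search` signature are hypothetical stand-ins invented for this example; they are not TinyDB's `Table` implementation, only the same deep-copy idea applied to a toy cache:

```python
from copy import deepcopy


class ToyTable:
    """Hypothetical stand-in for a table with a query cache (not TinyDB's Table)."""

    def __init__(self, rows):
        self._rows = rows
        self._query_cache = {}

    def search(self, key, value):
        cond = (key, value)  # hashable cache key standing in for a Query
        cached_results = self._query_cache.get(cond)
        if cached_results is not None:
            # retrieve: hand out an independent copy of the cached entry
            return deepcopy(cached_results)

        # Copy each matching row so results are independent of the stored rows
        docs = [deepcopy(row) for row in self._rows if row.get(key) == value]
        # store: the cache keeps its own independent copy
        self._query_cache[cond] = deepcopy(docs)
        return docs


table = ToyTable([{'name': 'Alice', 'tags': ['a', 'b']}])
first = table.search('name', 'Alice')
first[0]['tags'].append('c')            # mutate the returned document
second = table.search('name', 'Alice')  # served from the cache
print(second)
```

In this sketch the second search is unaffected by the mutation, because the caller's list and the cached entry share no objects. The trade-off is that each deep copy costs time proportional to the size of the result.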

Tests

A new test test_query_cache_not_corrupted_by_mutation is added
alongside the existing test_query_cache. All 217 tests pass.

…t the cache

`Table.search` stored and returned shallow copies of the document list,
meaning the `Document` objects inside the cache and the caller's list
were the same objects.  Mutating a nested mutable value (e.g. appending
to a list field) inside a returned document therefore silently corrupted
the cached results, causing subsequent searches to return stale/wrong data.

Switch to `copy.deepcopy` when writing to and reading from `_query_cache`
so that cached entries and returned results are fully independent.

Fixes msiemens#516