tinydb icon indicating copy to clipboard operation
tinydb copied to clipboard

Atomic replace (second attempt)

Open richardsheridan opened this issue 2 years ago • 2 comments

Revamp of #468. Fixes #467.

I was in a position to fix this one because during the python-atomicwrites package fiasco I read somewhere that the author basically said people should be using os.replace anyway. Other platform-dependent errors were fixed basically by closing file handles in the right order.

richardsheridan avatar Jul 27 '22 16:07 richardsheridan

CI problems seem unrelated, see https://github.com/msiemens/tinydb/runs/7544311658?check_suite_focus=true#step:5:231

richardsheridan avatar Jul 27 '22 16:07 richardsheridan

A peculiar edge case of this implementation is that the storage will happily write to a "closed" handle. This is not triggered in the default case because a read is performed before writing, but it can be triggered by CachingMiddleware:

from tinydb import TinyDB, JSONStorage
from tinydb.middlewares import CachingMiddleware
db=TinyDB("test.json", storage=CachingMiddleware(JSONStorage))
db.insert({"test":1})
db.close()
db.insert({"test":2})
db.storage.flush()
db.storage.close()

I'll add a commit that checks self._handle.closed but finding this makes me wary of other edge cases. Maybe this feature should go in as a new "AtomicJSONStorage" and wait for some deprecation period and major version bump to be swapped in?

richardsheridan avatar Jul 31 '22 03:07 richardsheridan

I don't think this is not triggered in the default case since that read operation is reading from the cache instead of the underlying storage.

How about making CachingMiddleware delete its cache after closing? (Which seems reasonable to me)

So that the next operation will read from the file and raise the exception.

Currently:

class CacheMiddelware(Middleware):
    def read(self):
        if self.cache is None:
            # Empty cache: read from the storage
            self.cache = self.storage.read()

        # Return the cached data
        return self.cache

    def close(self):
        # Flush potentially unwritten data
        self.flush()

        # Let the storage clean up, too
        self.storage.close()

->

    def close(self):
        # Flush potentially unwritten data
        self.flush()
        self.cache = None  # Clear cache

        # Let the storage clean up, too
        self.storage.close()

Another solution:

In my personal TinyDB fork, I've forced all Storage classes to implement a closed property, since it's quite an important property and is often needed.

VermiIIi0n avatar Oct 14 '22 02:10 VermiIIi0n