
Feature request: override local cache logic

Open psarka opened this issue 1 year ago • 4 comments

I would love to have a local cache that does not compare the mtimes of the file in the local cache and in the remote bucket. I understand that this is not safe in general, but I'm working with a lot of immutable files, so a cached copy never goes stale.

This kind of cache might not be important to other cloudpathlib users, so the actual feature request is to allow overriding of the cache logic. I skimmed the code and did not find a way to do this, but it is possible that I missed it.

If you are interested, I could probably contribute this myself.

psarka • Apr 05 '24 07:04

Thanks @psarka for raising, this has been on our radar.

We want to expose more ways to customize cache behavior, which covers a broad set of functionality. Our goal is to factor all of the cache interaction logic out of the Client/Path classes and into their own modular objects.

This is probably a medium-term roadmap item.

If you want a relatively simple near-term workaround, you could make a subclass that overrides the _refresh_cache implementation.

from cloudpathlib.cloudpath import register_path_class
from cloudpathlib.s3 import S3Path

@register_path_class("s3")
class MyS3Path(S3Path):
    def _refresh_cache(self, force_overwrite_from_cloud: bool = False) -> None:
        # replace with your logic (e.g., always download or skip if exists)
        pass

This, of course, is not a public API, so it is subject to change.
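
For the immutable-files case, the "skip if exists" decision inside such an override might look roughly like the sketch below. This is only an illustration, not cloudpathlib's actual cache logic: should_download and its trust_existing parameter are hypothetical names, and how you get the local cache path and the remote mtime from the path object is left as an assumption.

```python
from pathlib import Path


def should_download(
    local_path: Path, remote_mtime: float, trust_existing: bool = True
) -> bool:
    """Decide whether the cached copy has to be (re)fetched.

    Hypothetical helper for illustration; not part of cloudpathlib.
    """
    if not local_path.exists():
        # Nothing cached yet, so a download is always needed.
        return True
    if trust_existing:
        # Immutable files: any cached copy is assumed to be current,
        # so the mtime comparison is skipped entirely.
        return False
    # mtime-based policy: refetch only when the remote copy is newer.
    return local_path.stat().st_mtime < remote_mtime
```

With trust_existing=True, a file is fetched once and never checked against the bucket again, which is only appropriate if the remote objects really never change.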

pjbull • Apr 10 '24 21:04

Awesome, thank you! I will try this override :+1:

psarka • Apr 12 '24 09:04