cloudpathlib
Python pathlib-style classes for cloud storage services such as Amazon S3, Azure Blob Storage, and Google Cloud Storage.
Oftentimes we're not just writing to text files but loading and saving CSVs. It could be nice to add some examples to the docs or example notebook like ``` with...
We didn't catch #115 because our CI only uses the `all` dependency case. We should think about testing cloud dependencies separately to catch these kinds of issues.
Implementation of `FsspecClient` and `FsspecPath` that work with an fsspec concrete filesystem implementation. Follow up on #96 --- - New abstract `FsspecClient`, `FsspecPath` classes. Each registered implementation in [fsspec.registry](https://filesystem-spec.readthedocs.io/en/latest/_modules/fsspec/registry.html#ReadOnlyRegistry) has...
Known issues, gotchas, things to be careful of - [ ] directory weirdness for cloud storage - [ ] explain default client + credentials system - [ ] explain that...
In order to use a persistent local cache dir, we have to pass a Client instance: ```python ladi = CloudPath( "s3://ladi/Images/FEMA_CAP/2020/70349", S3Client(local_cache_dir="data") ) ``` This works ok, but has two...
Code like this will fail intermittently, likely because a transaction hasn't finished, or because the cloud copy is updated before the local file's modification time is. ```python s3p...
Is there a package that will implement this for us? We may need to create an LRU cache that we check the size of and remove files before downloading more....
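One possible shape for the size check is sketched below. It assumes the cache is a plain local directory and that file access time is an acceptable recency signal; the function name `evict_least_recently_used` is hypothetical and not part of cloudpathlib.

```python
from pathlib import Path


def evict_least_recently_used(cache_dir, max_bytes):
    """Delete the least recently accessed files until the directory fits in max_bytes."""
    files = [p for p in Path(cache_dir).rglob("*") if p.is_file()]
    stats = {p: p.stat() for p in files}
    total = sum(s.st_size for s in stats.values())
    removed = []
    # Oldest access time first; atime may be unreliable on filesystems mounted with noatime.
    for path in sorted(files, key=lambda p: stats[p].st_atime):
        if total <= max_bytes:
            break
        path.unlink()
        total -= stats[path].st_size
        removed.append(path)
    return removed
```

Calling this before each download, with `max_bytes` reduced by the size of the incoming file, would keep the cache under a fixed budget.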
Sometimes we can use ETags; Azure provides the MD5 hash; some providers have "versions"; and there may be other options. Trusting file times is currently a little flaky. Can we...
Implement an environment variable that will warn before downloading files over a certain size so that you don't overwhelm your machine with cached files.
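A minimal sketch of such a check follows. The variable name `CLOUDPATHLIB_DOWNLOAD_WARN_BYTES` and the helper `warn_if_large` are assumptions made for illustration; the issue does not name either, and the file size would have to come from the provider's metadata before the download starts.

```python
import os
import warnings

# Hypothetical variable name; the issue does not specify one.
SIZE_WARNING_ENV_VAR = "CLOUDPATHLIB_DOWNLOAD_WARN_BYTES"


def warn_if_large(size_bytes, env=os.environ):
    """Warn and return True when size_bytes exceeds the configured threshold."""
    raw = env.get(SIZE_WARNING_ENV_VAR)
    if raw is None:
        return False  # no threshold configured, so never warn
    threshold = int(raw)
    if size_bytes > threshold:
        warnings.warn(
            f"About to download {size_bytes} bytes, "
            f"above the {threshold}-byte threshold."
        )
        return True
    return False
```

Passing the environment as a parameter keeps the check easy to test without mutating `os.environ`.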
Some URIs passed to us may have query parameters and other annoying URL cruft (e.g., `s3://bucket/a?presign=abc` or something like that). We should be able to use urllib parse to segment...
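As a rough sketch of that idea, the standard library's `urllib.parse.urlsplit` can separate the query string from the rest of the URI. The helper name `strip_uri_cruft` is hypothetical and not part of cloudpathlib.

```python
from urllib.parse import parse_qs, urlsplit


def strip_uri_cruft(uri):
    """Hypothetical helper: return the URI without query/fragment, plus parsed query params."""
    parts = urlsplit(uri)
    clean = f"{parts.scheme}://{parts.netloc}{parts.path}"
    return clean, parse_qs(parts.query)


clean_uri, params = strip_uri_cruft("s3://bucket/a?presign=abc")
# clean_uri == "s3://bucket/a"
# params == {"presign": ["abc"]}
```

One caveat: object keys that legitimately contain `?` or `#` would be truncated by this approach, so it may need to be opt-in.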