cloudpathlib
Python pathlib-style classes for cloud storage services such as Amazon S3, Azure Blob Storage, and Google Cloud Storage.
Version 1 of `list_objects` can cause consistency problems. See this answer for context: https://stackoverflow.com/a/67412931/1692709 We currently use `list_objects` for non-recursive cases: https://github.com/drivendataorg/cloudpathlib/blob/80f7afdf85dfb4f3ad0406944a5d3cf28c727435/cloudpathlib/s3/s3client.py#L147 We use the...
For example, I need to use `str()` for the `.stem` field; otherwise it shows: 
When downloading a directory, the `download_to` function ran in an infinite loop. For example, we have an example S3 bucket with one file in it: `s3://example/test_file.txt` The function...
When implementing #142 I encountered a tricky type annotation scenario and used a workaround. Using the `AnyPath` constructor instead of `to_anypath` in the code referenced below runs fine, but `mypy` complains....
Pulling the work from #206 over and rebasing onto the latest master. --------------------- * [WIP]: taking care of the corner case folder created from S3 * Fix format issues *...
This was brought up by @remi-braun in #148. Sometimes, listing a directory with the `S3Client` results in a recursion error. This was seen on Ceph where "faked" S3 directories are...
- Adds `.cloud` accessor to `pd.Series` and `pd.Index` via a pandas-path [custom accessor](https://github.com/drivendataorg/pandas-path/#custom-path-accessors) - Adds simple tests for the integration - Adds docs for the integration - Updates changelog
We don't currently support any `sync` API. Even with `upload_to` and `download_from`, we will never delete files. I'm not sure that we need/want to do anything about this, but wanted...
We currently think it is a non-goal to _manage_ folder syncing behind the scenes, but we might want to explicitly support it as a method. For example, something like: ```python...
This PR https://github.com/drivendataorg/cloudpathlib/pull/193 allows setting the `S3Client` `endpoint_url` from an environment variable `AWS_ENDPOINT_URL`. There is a long-open PR upstream in boto3 to add this functionality but for some reason...
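The environment-variable fallback described in the last item can be sketched in a few lines. This is only an illustration, not the code from the PR: `resolve_endpoint_url` and its `environ` parameter are hypothetical names introduced here, and the precedence (explicit argument first, then `AWS_ENDPOINT_URL`, then no override) is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_endpoint_url(
    explicit: Optional[str] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Hypothetical helper: pick the endpoint URL for an S3-compatible service.

    Assumed precedence: an explicitly passed value wins, then the
    AWS_ENDPOINT_URL environment variable, then None (no override).
    """
    # Allow a mapping to be injected so the lookup is easy to test.
    env = os.environ if environ is None else environ
    if explicit:
        return explicit
    # Treat an empty variable the same as an unset one.
    return env.get("AWS_ENDPOINT_URL") or None
```

The returned value could then be passed as the `endpoint_url` argument when constructing an `S3Client`; a result of `None` would leave the default endpoint in place.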