
feat(rust): generalize the cloud storage builders

Open winding-lines opened this issue 2 years ago • 2 comments

This PR generalizes the parquet support for cloud URLs. It enables all the backends supported by object_store. Note that there is still one open issue with the Azure builder; I will re-enable it once the upstream PR is fixed.
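To illustrate the idea of picking a backend from the URL, here is a minimal sketch. The `CloudBackend` enum, the `backend_for_url` function, and the scheme list are assumptions made for this example; they are not the actual polars or object_store API.

```rust
/// Hypothetical enum for illustration only; not a polars or object_store type.
#[derive(Debug, PartialEq)]
enum CloudBackend {
    Aws,
    Gcp,
    Azure,
    LocalFile,
}

/// Map a URL to a backend based on its scheme.
/// The scheme-to-backend table below is an assumption for this sketch.
fn backend_for_url(url: &str) -> Option<CloudBackend> {
    // Everything before "://" is treated as the scheme; no scheme means no match.
    let (scheme, _rest) = url.split_once("://")?;
    match scheme {
        "s3" => Some(CloudBackend::Aws),
        "gs" => Some(CloudBackend::Gcp),
        "az" => Some(CloudBackend::Azure),
        "file" => Some(CloudBackend::LocalFile),
        _ => None,
    }
}

fn main() {
    println!("{:?}", backend_for_url("s3://bucket/data.parquet"));
}
```

Returning `Option` keeps unsupported schemes explicit at the call site, so a disabled backend can simply be left out of the match.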

This PR threads the required cloud options through the code instead of reading them from the process environment. This reduces the magic provided by global state (std::env) at the expense of more changes across the different layers. I personally prefer passing options/settings explicitly.
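The difference between the two styles can be sketched roughly as follows. `CloudOptions`, `region_from_env`, `region_from_options`, and the `"region"` key are hypothetical names for this example, and `AWS_REGION` is used only as an illustrative variable; none of this is the code in the PR.

```rust
use std::collections::HashMap;
use std::env;

/// Hypothetical options container, assumed for this sketch.
#[derive(Debug, Default, Clone)]
struct CloudOptions {
    values: HashMap<String, String>,
}

impl CloudOptions {
    /// Builder-style setter that returns the updated options.
    fn with(mut self, key: &str, value: &str) -> Self {
        self.values.insert(key.to_string(), value.to_string());
        self
    }

    fn get(&self, key: &str) -> Option<&str> {
        self.values.get(key).map(String::as_str)
    }
}

/// Implicit style: the setting comes from global process state.
fn region_from_env() -> Option<String> {
    env::var("AWS_REGION").ok()
}

/// Explicit style: the caller passes the options down through each layer.
fn region_from_options(options: &CloudOptions) -> Option<String> {
    options.get("region").map(str::to_string)
}

fn main() {
    let options = CloudOptions::default().with("region", "us-east-1");
    println!("explicit: {:?}", region_from_options(&options));
    println!("implicit: {:?}", region_from_env());
}
```

The explicit version needs an extra parameter in every layer, but its result depends only on its arguments, which makes it easier to test and to use with different settings in one process.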

This PR also breaks the existing object_store.rs into a module; that file was getting bigger and had poor cohesion.

Feedback appreciated.

winding-lines avatar Dec 31 '22 23:12 winding-lines

I got some help from the developers of the object-store crate; they provided an API that is nicer for us to use. Many thanks :)

This code is ready to review at your convenience. I am not sure how to fully enable it on the Python side... @ritchie46

winding-lines avatar Jan 06 '23 03:01 winding-lines

@ritchie46 addressed feedback, tests are passing, ready for more feedback or merge 🤗

winding-lines avatar Jan 08 '23 17:01 winding-lines

@winding-lines when object_store adds more backends in the future, will that require changes in polars as well, or is it generalized now?

I ask because I see that HTTP has also been added in their main/master branch.

chitralverma avatar Jan 08 '23 18:01 chitralverma

@chitralverma in the current iteration, the deltalake and object_store teams have refactored some code out of the former and into the latter. I think we can also contribute our current layer upstream so that any future changes will be integrated just by recompiling.

Given that this PR has been open for a while, my preference would be to merge it and then release a polars version so that I can do more testing at work.

Let me know what you think :)

winding-lines avatar Jan 08 '23 19:01 winding-lines

@ritchie46 addressed feedback, tests are passing :-)

Many thanks!

winding-lines avatar Jan 10 '23 14:01 winding-lines

thanks @ritchie46 ❤️

winding-lines avatar Jan 11 '23 16:01 winding-lines