Adrian Ehrsam

Results 62 comments of Adrian Ehrsam

> Let's get the other one in first

I hope it's ok to prioritize this one from my side, so that I don't have to keep both branches up to date

How about `parallelize: bool | int` on the Python side? 🙂
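
A minimal sketch of how such a `parallelize: bool | int` argument might be interpreted. The semantics are an assumption: `True` means one worker per CPU, `False` means sequential, and an integer is an explicit worker count. `resolve_workers` is a hypothetical helper name, not part of any existing API.

```python
import os
from typing import Union


def resolve_workers(parallelize: Union[bool, int]) -> int:
    """Turn a bool-or-int flag into a concrete worker count (hypothetical helper)."""
    # bool must be checked before int, because bool is a subclass of int in Python
    if isinstance(parallelize, bool):
        # Assumed semantics: True -> use all CPUs, False -> single worker
        return (os.cpu_count() or 1) if parallelize else 1
    if parallelize < 1:
        raise ValueError("parallelize must be True, False, or a positive integer")
    return parallelize
```

With these assumed semantics, `resolve_workers(False)` gives a single worker and `resolve_workers(4)` gives four, so callers can pass either a simple on/off switch or an explicit limit.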

I finally had the time to update this branch with the new parallel parameter in Python. Hope it's looking good now!

I only did some manual tests on my own data, but could probably write some benchmarks in Python, using duckdb or polars as the source. Would it make sense to add...

I did some very basic benchmarking, but the results were not as I hoped :) While RAM consumption is significantly lower, the speed is not good enough yet. I think...

Pretty sure the non-async write causes issues. But object_store 0.10 will change a lot there, so maybe better to wait for that

I guess we also have to wait for a release of the arrow crate

Also, renaming directories (a kind of move with recurse=True) is a case that's currently not possible on a Datalake v2 account

I can pick it up, but I'd rather do it on the write.rs operation

Ok, I see partitioning makes this quite complicated 🙂 And DataFusion's MemoryExec is not helpful, so this might take some time