Gaëtan de Menten

32 comments by Gaëtan de Menten

> how do you paginate the results? do you use `LazyFrame.__getitem__` multiple times and then `collect` each? if so i worry that that would involve doing repeated calculations I am...

> Now, what happens when you run `result[:2].collect()`, and then `result[2:4].collect()`? You may expect that Polars is running the UDF for the first two elements and then for the next...

> would an offset argument in LazyFrame.head suffice for you? Now that I think of it, I don't think it's a good idea because then the Narwhals API would no...

> That's right, it only happens if there are operations which block slice pushdown. But, if you're displaying a lazyframe provided by the user, then you have no control over...

> i'm keen to understand the use-case more

Thanks a lot for taking that time, it's really appreciated.

> say you want to support duckdb. in that case, showing `from...

> it might help to speak about this over a call to understand what to do?

If you feel that helps, I am available all day tomorrow.

> i think it breaks even if the database isn't updated?

Indeed but I assume what you see is because duckdb is multithreaded by default. I suppose it evaluates different...

Haha! I did not realize multiprocessing was possible in combination with chunking. I thought it was an exclusive-or thing. I was a bit misled by that sentence in the...

> Maybe you are right and there is still something to gain there, but it sounds complicated to me. As right now I do not have much time to dedicate...

> also, just out of curiosity, you said initially the file took 25 hours to complete. How long is it taking now? what is the chunksize and how many cores...