Jay Chia

Results: 126 comments by Jay Chia

We are working on benchmarks! Will be ready in about 3 weeks or so with the prototype + initial benchmarks. In other words, if Celeborn were to support this protocol...

As an added bonus: if we could figure out that the dataframe is partitioned by filename (when no file splitting was performed), that would be really cool. This could enable...

Example use-case for counting number of distinct rows, grouped by filename:
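A minimal plain-Python sketch of that use-case, assuming each row carries the name of the file it came from. The rows, filenames, and the helper `count_distinct_rows_by_filename` are all hypothetical, and this is not the library's dataframe API, just the computation spelled out:

```python
from collections import defaultdict

# Hypothetical rows: (filename, record) pairs, as if each row remembered its source file.
rows = [
    ("a.parquet", ("alice", 1)),
    ("a.parquet", ("alice", 1)),  # duplicate row within the same file
    ("a.parquet", ("bob", 2)),
    ("b.parquet", ("carol", 3)),
]


def count_distinct_rows_by_filename(rows):
    """Return {filename: number of distinct records seen for that file}."""
    distinct = defaultdict(set)
    for filename, record in rows:
        # A set keeps only one copy of each identical record per file.
        distinct[filename].add(record)
    return {filename: len(records) for filename, records in distinct.items()}


counts = count_distinct_rows_by_filename(rows)
print(counts)
```

If the data were already known to be partitioned by filename, each file's set could in principle be built independently of the others, without moving rows between partitions.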

@michaelvay let us know how progress is so far! Happy also to pair up and walk you through any parts of the codebase that you might have questions on.

I think #2502 should fix most of this. However, there is still a potential problem when multiple dataframe iterators are running in parallel. Our current implementation doesn't place any...

> Happy to contribute a few batch inference examples with ray once overwriting `with_init_args` are all set.

Almost in! @kevinzwang has a fix :)

Hey @dioptre! This use-case is technically possible today already like so:

```python
# STEP 1: Run your own code to filter the list of files
files = get_files()
filtered_filepaths =...
```

Quick bump here for thoughts @dioptre !

Closed, but we need better docs around this I think