Jan Gorecki
Unfortunately, this is not yet supported: https://github.com/modin-project/modin/issues/75
@kaiogu I tried to push that forward, but I am getting ``` UserWarning: `DataFrame.groupby_on_multiple_columns` defaulting to pandas implementation. ``` using a recent modin from pip. I think it makes sense to wait for proper...
@mattdowle
I collected some feedback about this task from our internal discussion. Initially I will focus only on reading CSV, not binary formats. For real world data NYT will be...
I pushed a draft of the join questions. The data has 3 factor id columns (2 unique, 1 with duplicates), 3 integer id columns (2 unique, 1 with duplicates), and 1 double column. The list of initially discussed on...
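To make the column layout concrete, here is a minimal sketch in plain Python, not the actual data generator. The column names (`id1`..`id6`, `v1`), the value ranges, and the helper names `make_table` and `inner_join` are assumptions for illustration only.

```python
import random


def make_table(n, seed=0):
    """Build n rows: 3 integer ids, 3 factor (string) ids, 1 double.

    Assumed layout: id1, id2 unique ints; id3 int with duplicates;
    id4, id5 unique factors; id6 factor with duplicates; v1 double.
    """
    rng = random.Random(seed)
    ids_a = rng.sample(range(1, 10 * n), n)  # unique values
    ids_b = rng.sample(range(1, 10 * n), n)  # unique values
    rows = []
    for a, b in zip(ids_a, ids_b):
        c = rng.randint(1, max(1, n // 10))  # small range, so values repeat
        rows.append({
            "id1": a, "id2": b, "id3": c,
            "id4": f"id{a}", "id5": f"id{b}", "id6": f"id{c}",
            "v1": rng.random(),
        })
    return rows


def inner_join(left, right, on):
    """Hash join: index the right table by key, then probe with each left row."""
    index = {}
    for row in right:
        index.setdefault(row[on], []).append(row)
    out = []
    for lrow in left:
        for rrow in index.get(lrow[on], []):
            merged = dict(lrow)
            # Prefix non-key columns from the right table to avoid name clashes.
            for key, value in rrow.items():
                if key != on:
                    merged["r_" + key] = value
            out.append(merged)
    return out
```

Joining on a unique key such as `id1` yields at most one match per row, while joining on `id3` or `id6` multiplies rows, which is the case the duplicated columns are meant to exercise.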
Of the 7 questions proposed above, 5 will be categorised as `basic`, testing mostly scalability. The remaining 2, plus 3 extra, will be categorised as `advanced`, testing features. For consistency...
Note to fix the `chk` produced by spark, juliadf and maybe others. As of now they produce a `chk` of `0`, so the `solution_chk` check in the answers-validation.R script is failing. Workaround has been introduced...
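A rough sketch of how the affected entries could be listed, written in Python and not taken from answers-validation.R. The record fields (`solution`, `question`, `chk`) and the assumption that `chk` parses as a single number are hypothetical.

```python
def find_zero_chk(records):
    """Return sorted (solution, question) pairs whose chk equals zero.

    Each record is assumed to be a dict with hypothetical keys
    'solution', 'question' and 'chk' (a number or numeric string).
    """
    flagged = {
        (r["solution"], r["question"])
        for r in records
        if float(r["chk"]) == 0.0
    }
    return sorted(flagged)


# Example input with made-up values, for illustration only.
example = [
    {"solution": "spark", "question": "q1", "chk": "0"},
    {"solution": "juliadf", "question": "q1", "chk": 0},
    {"solution": "data.table", "question": "q1", "chk": "1234.5"},
]
print(find_zero_chk(example))
```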
The join task for the 5 basic questions has been implemented. The design of the datasets for join is explained in https://github.com/h2oai/db-benchmark/issues/106. As of now, the join task is missing only for clickhouse....
I think that not all solutions allow sorting during aggregation, so yes.
@ritchie46 be patient. Unfortunately I have two other projects in queue as of now.