Jan Gorecki comments

Results: 186 comments of Jan Gorecki


Add python's modin

Unfortunately not yet supported: https://github.com/modin-project/modin/issues/75

@kaiogu I tried to push that forward but I am getting ``` UserWarning: `DataFrame.groupby_on_multiple_columns` defaulting to pandas implementation. ``` using a recent pip modin. I think it makes sense to wait for proper...
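A minimal sketch of how such a fallback could be detected programmatically, using only Python's standard `warnings` module. The helper name `falls_back_to_pandas` and the stand-in function `fake_groupby` are hypothetical; the stand-in merely emits the same message text quoted above and does not call modin, so this shows the detection pattern rather than modin's actual behaviour.

```python
import warnings


def falls_back_to_pandas(operation):
    """Run `operation` and report whether it emitted a
    'defaulting to pandas' UserWarning (hypothetical helper)."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, even ones already shown once.
        warnings.simplefilter("always")
        operation()
    return any(
        issubclass(w.category, UserWarning)
        and "defaulting to pandas" in str(w.message)
        for w in caught
    )


def fake_groupby():
    # Stand-in for a real modin call (assumption): it only emits
    # the warning text quoted in the comment above.
    warnings.warn(
        "`DataFrame.groupby_on_multiple_columns` defaulting to "
        "pandas implementation.",
        UserWarning,
    )


print(falls_back_to_pandas(fake_groupby))
print(falls_back_to_pandas(lambda: None))
```

In a benchmark script, a check like this could wrap each query so that a run silently falling back to pandas is flagged rather than timed as if it were native.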

Mind re-running with DuckDB 0.2.8? Thanks!

@mattdowle

new task: read

I collected some feedback about this task from our internal discussion. Initially I will focus only on reading csv, not binary formats. For real world data NYT will be...

advanced questions for `join` tests

I pushed a draft of the join questions. Data is 3 id factor (2 unique, 1 dups), 3 id int (2 unique, 1 dups), 1 double. The list of initially discussed on...
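To make the described column layout concrete, here is a rough sketch of a table with that shape: three factor-like id columns and three integer id columns (two unique and one with duplicates in each group), plus one double. The column names (`f1`..`f3`, `i1`..`i3`, `v1`), the `make_join_table` function, and the way duplicates are produced are all assumptions for illustration, not the benchmark's actual data generator.

```python
import random


def make_join_table(n, seed=0):
    """Build n rows with the column layout described above
    (hypothetical names and generation scheme)."""
    rng = random.Random(seed)
    # Two independent permutations give two unique integer keys.
    unique_a = list(range(1, n + 1))
    unique_b = list(range(1, n + 1))
    rng.shuffle(unique_a)
    rng.shuffle(unique_b)
    rows = []
    for i in range(n):
        # Wrapping the index at n // 2 forces repeated key values.
        dup = i % max(1, n // 2) + 1
        rows.append({
            "f1": f"id{unique_a[i]}",   # factor, unique
            "f2": f"id{unique_b[i]}",   # factor, unique
            "f3": f"id{dup}",           # factor, with duplicates
            "i1": unique_a[i],          # int, unique
            "i2": unique_b[i],          # int, unique
            "i3": dup,                  # int, with duplicates
            "v1": rng.random() * 100,   # double
        })
    return rows


table = make_join_table(10)
print(table[0])
```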

advanced questions for `join` tests

Of the 7 questions proposed above, 5 are going to be categorised as `basic`, testing mostly scalability; the rest, plus 3 extra, will be categorised as `advanced`, testing features. For consistency...

advanced questions for `join` tests

Note to fix the `chk` produced by spark, juliadf and maybe others. As of now they produce a `chk` of `0`, so the `solution_chk` check in the answers-validation.R script is failing. Workaround has been introduced...

advanced questions for `join` tests

The join task for the 5 basic questions has been implemented. The design of the datasets for join is explained in https://github.com/h2oai/db-benchmark/issues/106. As of now, clickhouse is the only solution for which the join task has not yet been added....

groupby q7 should not decompose complex expression

I think that not all solutions allow sorting during aggregation, so yes.

Polars [Rust native solution]

@ritchie46 be patient. Unfortunately I have two other projects in queue as of now.