arrow-wasm

Support data transformations

Open alippai opened this issue 5 years ago • 6 comments

What I really miss in the Arrow JS lib is built-in aggregation: I have to write row-based accumulators or lookups in JS to compute synthetic aggregates (sum, avg, cumsum). As DataFusion and Polars already support this, I assume it's available in the base Arrow (Rust) library too. Could you add a few examples in this area (sort, groupby, sum)?
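To make the request concrete, here is a minimal sketch of the kind of hand-written accumulators I mean. It uses a plain `Float64Array` as a stand-in for a column and does not call any Arrow API; the function names `sum`, `avg`, and `cumsum` are hypothetical helpers, not part of Arrow JS or arrow-wasm.

```typescript
// Hypothetical helpers: a plain Float64Array stands in for an Arrow column.

// Total of all values in the column.
function sum(values: Float64Array): number {
  let total = 0;
  for (let i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}

// Arithmetic mean; NaN for an empty column.
function avg(values: Float64Array): number {
  return values.length === 0 ? NaN : sum(values) / values.length;
}

// Running total: out[i] is the sum of values[0..i].
function cumsum(values: Float64Array): Float64Array {
  const out = new Float64Array(values.length);
  let running = 0;
  for (let i = 0; i < values.length; i++) {
    running += values[i];
    out[i] = running;
  }
  return out;
}

// Example column (made-up values).
const temp = Float64Array.from([1, 2, 3, 4]);
const tempTotal = sum(temp);
const tempMean = avg(temp);
const tempRunning = cumsum(temp);
```

Every one of these loops walks the column value by value in JS, which is exactly the work I would like the library to do for me.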

P.S. While I understand that DataFusion's capabilities (a full SQL engine) could be cumbersome and overkill, did you consider exposing @ritchie46's lazy Polars API?

alippai avatar Feb 26 '21 14:02 alippai

I added sum, min, and max to vectors already and it would definitely be great to support more aggregates. Which ones would you want specifically (and are they supported in the Rust library)?

I looked into DataFusion but it doesn't compile to Wasm right now. I filed a Jira ticket already. https://issues.apache.org/jira/plugins/servlet/mobile#issue/ARROW-11615

I had not seen polars before. Thanks for the pointer.

domoritz avatar Feb 26 '21 15:02 domoritz

I saw vec.sum(), but what I meant was table.groupby("date").column(["temp", "rain"]).sum() or table.groupby([1]).column([2,3]).sum().

In SQL, I would describe what I need as: SELECT date, SUM(temp), SUM(rain) FROM myTable GROUP BY date ORDER BY date DESC, given a myTable available in Arrow IPC format.
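As a rough sketch of what that query means when written by hand today, the following groups rows by date, sums the two measures, and sorts by date descending. It works on plain JS objects, not on Arrow tables; the `Row` shape, the `sumByDateDesc` name, and the sample values are assumptions for illustration only, and dates are assumed to be ISO-8601 strings so that string order matches date order.

```typescript
// Hypothetical row shape mirroring the columns in the SQL query.
interface Row {
  date: string; // assumed ISO-8601, e.g. "2021-02-26"
  temp: number;
  rain: number;
}

// SELECT date, SUM(temp), SUM(rain) ... GROUP BY date ORDER BY date DESC
function sumByDateDesc(rows: Row[]): Row[] {
  const groups = new Map<string, Row>();
  for (const { date, temp, rain } of rows) {
    const acc = groups.get(date);
    if (acc === undefined) {
      // First row for this date starts a new group.
      groups.set(date, { date, temp, rain });
    } else {
      // Later rows are added to the group's running sums.
      acc.temp += temp;
      acc.rain += rain;
    }
  }
  // Descending order by date string.
  return Array.from(groups.values()).sort((a, b) =>
    a.date < b.date ? 1 : a.date > b.date ? -1 : 0
  );
}

// Made-up sample rows.
const myTable: Row[] = [
  { date: "2021-02-25", temp: 3, rain: 1 },
  { date: "2021-02-26", temp: 5, rain: 0 },
  { date: "2021-02-25", temp: 4, rain: 2 },
];
const grouped = sumByDateDesc(myTable);
```

With a groupby API in the library, the whole function would collapse into a single chained call like the one above.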

alippai avatar Feb 26 '21 15:02 alippai

This would make arrow-wasm one of the most powerful dataframe libs in JS instantly.

alippai avatar Feb 26 '21 15:02 alippai

That would be great and I am thinking about how best to support it (and more). The Rust Arrow library doesn't have groupby, though (or I didn't see it): https://docs.rs/arrow/3.0.0/arrow/.

domoritz avatar Feb 26 '21 16:02 domoritz

Oh, then it's something the higher-level libs (DataFusion, Polars) contribute, my mistake :) I still really love this initiative, and I'm happy to see the data science field sharing more and more code across languages.

alippai avatar Feb 26 '21 16:02 alippai

I updated the title to be more generic.

domoritz avatar Feb 26 '21 17:02 domoritz