
huri vs incanter datasets ?

Open behrica opened this issue 7 years ago • 1 comment

I was investigating the differences between the approach of 'huri' and the approach of "core.matrix dataset" (and incanter 1.9.9, which uses it) regarding datasets. I am not looking at matrix operations for the moment.

I am still not 100% sure which approach I prefer, so maybe you can provide me with some comments.

Both have pros and cons, and I was wondering what your main motivation was for starting huri rather than improving core.matrix dataset or incanter?

I want to do more data science with Clojure, and for data munging (my main use case) the choice seems to be between "core.matrix dataset + incanter" and "huri".

In a way, I want to replace the R tidyverse (https://www.tidyverse.org/) with Clojure.

behrica avatar Oct 10 '17 14:10 behrica

Though I might not be the best placed to answer @behrica's question, I'll try anyway, since we share more or less the same desire to use more Clojure for data science-y stuff.

I tried Incanter and I still use it for small and/or particular cases, but nowadays it is more or less unmaintained (you can check the latest commits, issues, and pull requests), and I think the reason is that, in Clojure, working with strictly tabular data is limiting.

For instance, I very often have to pull data from scrapers or REST APIs, and forcing JSON into a tabular representation when we already have maps just seems "wrong". Remember that you can always replicate a tabular structure with a map for every row and a keyword for every column.
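
To make that concrete, here is a minimal sketch of the "map per row, keyword per column" idea using only `clojure.core`. The sample data and the names `rows`, `:name`, `:lang`, and `:stars` are made up for illustration; this is not how huri or Incanter represent datasets internally.

```clojure
;; Hypothetical sample data: every row is a map, every column is a keyword.
(def rows
  [{:name "a" :lang "clojure" :stars 10}
   {:name "b" :lang "r"       :stars 25}
   {:name "c" :lang "clojure" :stars 5}])

;; Selecting a "column" is just mapping a keyword over the rows.
(def star-column
  (mapv :stars rows))

;; Filtering rows is an ordinary predicate over maps.
(def clojure-rows
  (filterv #(= "clojure" (:lang %)) rows))

;; Group-and-aggregate: total stars per language.
(def stars-by-lang
  (into {}
        (map (fn [[lang group]]
               [lang (reduce + (map :stars group))]))
        (group-by :lang rows)))
```

Since the rows are plain maps, nested or ragged JSON from an API can usually be kept as-is and flattened only where a computation actually needs columns.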

What I feel we are really lacking is something like scikit-learn for Python.

Anyway, feel free to contact me, who knows maybe we can help each other out :smiley:

alanmarazzi avatar May 04 '18 20:05 alanmarazzi