huri
huri vs Incanter datasets?
I was investigating the differences between huri's approach to datasets and that of the core.matrix dataset (and Incanter 1.9.9, which uses it). I am not looking at matrix operations for the moment.
I am still not 100% clear on which approach I prefer, so maybe you can give me some comments.
Both have pros and cons, and I was wondering what your main motivation was for starting huri rather than improving core.matrix dataset or Incanter.
I want to do more data science with Clojure, and for data munging (my main use case) the choice seems to be either core.matrix dataset plus Incanter, or huri.
In a way, I want to replace the R tidyverse (https://www.tidyverse.org/) with Clojure.
Though I might not be the best placed to answer @behrica's question, I'll try anyway, since we have more or less the same desire to use more Clojure for data science-y stuff.
I tried Incanter and still use it for small or particular cases, but nowadays it is more or less unmaintained (you can check the latest commits, issues, and pull requests here), and I think the reason is that in Clojure, being restricted to tabular data feels limiting.
For instance, I very often have to pull data from scrapers or REST APIs, and forcing JSON into a tabular representation when we have maps just seems "wrong". Remember that you can always replicate a tabular structure with a map for every row and a keyword for every column.
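To make that last point concrete, here is a minimal sketch using only clojure.core. The data and the column names (:city, :temp, :humidity) are invented for illustration and are not taken from huri or Incanter:

```clojure
;; A "table" as a vector of maps: one map per row, one keyword per column.
;; All values and column names here are made up for illustration.
(def rows
  [{:city "A" :temp 21.5 :humidity 40}
   {:city "B" :temp 18.0 :humidity 65}
   {:city "C" :temp 25.0 :humidity 30}])

;; Select a column: keywords act as functions on maps.
(def temps (mapv :temp rows))

;; Filter rows with an ordinary predicate.
(def warm (filterv #(> (:temp %) 20) rows))

;; Project a subset of columns.
(def slim (mapv #(select-keys % [:city :temp]) rows))
```

Since each row is just a map, a value can itself be a nested map or vector, which is roughly the shape JSON from a REST API already has.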
What I feel we are really lacking is an equivalent of Python's scikit-learn.
Anyway, feel free to contact me. Who knows, maybe we can help each other out :smiley: