`groupBy` with a `DataFrame`

Open ynfle opened this issue 2 years ago • 4 comments

Is it possible to pass in a DataFrame that can be aligned/joined with the original DataFrame, so that it provides the list of values to group by?

Thanks for your wonderful package!

ynfle, Aug 11 '21 21:08

I'm not quite sure what you mean by "aligned/joined" in this context. Do you perhaps have a small example?

bluenote10, Aug 11 '21 21:08

I am following this to try to build a Naive Bayes classifier in Nim using NimData. In the method `calc_prior`, they group the data by the target class, which isn't possible without a join if the data and the target are stored separately.

ynfle, Aug 11 '21 21:08

Here is a link to the pandas `groupby` documentation.

EDIT: Forgot the link https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.groupby.html

ynfle, Aug 11 '21 21:08

Sorry for the delay! So many things to do...

In general, NimData already has a rudimentary (and not well documented) implementation of `groupBy`:

https://github.com/bluenote10/NimData/blob/ed07c2fa76cae57477d61b08384148f308aa4c6d/src/nimdata.nim#L245-L254

> which isn't possible without a join if the data and the target are separated

Yes, you'd probably have to combine the target column with the data columns first to make it work.

bluenote10, Aug 21 '21 08:08
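
The "combine first, then group" idea from the last reply can be sketched in plain Python, the language of the tutorial being ported. This is only an illustration under assumptions: the toy data and the helper `group_by_target` are hypothetical, `calc_prior` is a guess at what the tutorial's method of the same name computes (class priors as relative frequencies), and nothing here uses or mirrors NimData's actual API.

```python
from collections import defaultdict

# Hypothetical toy data: feature rows and class labels stored separately.
features = [(5.1, 3.5), (4.9, 3.0), (6.2, 2.9), (5.9, 3.0), (6.5, 3.2)]
targets = ["a", "a", "b", "b", "b"]


def group_by_target(rows, labels):
    """Pair each row with its label (the "join"), then bucket rows by label."""
    if len(rows) != len(labels):
        raise ValueError("rows and labels must have the same length")
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)
    return dict(groups)


def calc_prior(rows, labels):
    """Estimate each class prior as group size divided by total row count."""
    groups = group_by_target(rows, labels)
    total = len(rows)
    return {label: len(members) / total for label, members in groups.items()}


priors = calc_prior(features, targets)
```

The positional `zip` plays the role of the join: it only works if the data and the target are in the same row order, which is the same alignment requirement the question describes.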