Shall we deprecate Scholar.Preprocessing?

Open josevalim opened this issue 1 year ago • 4 comments

Today the module only contains defdelegate calls to functions in other modules. Should we remove these delegations, or should we keep them?

@krstopro @msluszniak @polvalente
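For readers unfamiliar with the pattern, a delegation-only facade looks roughly like the sketch below. The Sketch.* module names are hypothetical and the scaler works on plain lists rather than Nx tensors; this is an illustration of the shape of the problem, not Scholar's actual source.

```elixir
defmodule Sketch.MinMaxScaler do
  # Hypothetical stand-in for a scaler module (plain lists, not tensors).
  # Assumes the input contains at least two distinct values.
  def fit_transform(values) do
    {vmin, vmax} = Enum.min_max(values)
    Enum.map(values, fn v -> (v - vmin) / (vmax - vmin) end)
  end
end

defmodule Sketch.Preprocessing do
  # The facade adds no logic of its own; it only forwards the call
  # under a different name.
  defdelegate min_max_scale(values), to: Sketch.MinMaxScaler, as: :fit_transform
end
```

Calling Sketch.Preprocessing.min_max_scale/1 and Sketch.MinMaxScaler.fit_transform/1 gives the same result, which is the duplication under discussion.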

josevalim, Dec 31 '23 18:12

Given that each of these modules contains fit_transform/2, I think it is safe to deprecate/remove Scholar.Preprocessing. Only the Binarizer module remains to be implemented.

Just my opinion, curious to see what the others think.
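As a rough illustration of why a per-module fit_transform/2 makes the facade redundant, here is a minimal sketch of the fit / transform / fit_transform split. Sketch.StandardScaler is a hypothetical module operating on plain lists, and the argument order of transform/2 is an assumption made for the example; none of this is copied from Scholar.

```elixir
defmodule Sketch.StandardScaler do
  # Hypothetical scaler: the fitted statistics are explicit struct fields.
  defstruct [:mean, :std]

  # Learn the mean and (population) standard deviation from the data.
  def fit(values, _opts \\ []) do
    n = length(values)
    mean = Enum.sum(values) / n
    var = Enum.sum(Enum.map(values, fn v -> (v - mean) * (v - mean) end)) / n
    %__MODULE__{mean: mean, std: :math.sqrt(var)}
  end

  # Apply previously fitted statistics to (possibly new) data.
  # Assumes std is non-zero.
  def transform(%__MODULE__{mean: mean, std: std}, values) do
    Enum.map(values, fn v -> (v - mean) / std end)
  end

  # Convenience: fit on the data and transform that same data in one call.
  def fit_transform(values, opts \\ []) do
    values |> fit(opts) |> transform(values)
  end
end
```

With fit_transform/2 living next to fit and transform, a separate one-shot function in a facade module has nothing left to add.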

krstopro, Dec 31 '23 19:12

Yes, I agree: once the binarizer is moved into a separate module, we'll be able to delete Scholar.Preprocessing.

msluszniak, Jan 01 '24 16:01

There is another option here to remove the duplication:

  1. Remove the fit_transform functions from Scholar.Preprocessing.* modules
  2. Keep the current functions in Scholar.Preprocessing as the fit_transform variant

But I don't like it much, because the documentation for the supported options all lives in the actual modules. So we would need either a way to share the options or a redirect to a separate module, which is not the best user experience. I thought I'd mention it for completeness tho.

josevalim, Jan 09 '24 14:01

We might make a more general struct, Preprocessing, with two fields: operation and data. For practical purposes, every function would have a fit part and a predict part. Then predict, based on the operation field, would call the right function. However, this makes fields like the standard deviation in normalization less explicit.
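A minimal sketch of what that could look like, assuming a hypothetical Sketch.GenericPreprocessing module, plain lists instead of tensors, and atoms as operation names (all of these are illustrative choices, not an existing API):

```elixir
defmodule Sketch.GenericPreprocessing do
  # One struct for every operation; the fitted statistics go into `data`.
  defstruct [:operation, :data]

  def fit(:standard_scale, values) do
    n = length(values)
    mean = Enum.sum(values) / n
    var = Enum.sum(Enum.map(values, fn v -> (v - mean) * (v - mean) end)) / n
    %__MODULE__{operation: :standard_scale, data: %{mean: mean, std: :math.sqrt(var)}}
  end

  def fit(:min_max_scale, values) do
    {vmin, vmax} = Enum.min_max(values)
    %__MODULE__{operation: :min_max_scale, data: %{min: vmin, max: vmax}}
  end

  # predict dispatches on the operation field via pattern matching.
  # Both clauses assume a non-zero divisor.
  def predict(%__MODULE__{operation: :standard_scale, data: %{mean: m, std: s}}, values) do
    Enum.map(values, fn v -> (v - m) / s end)
  end

  def predict(%__MODULE__{operation: :min_max_scale, data: %{min: lo, max: hi}}, values) do
    Enum.map(values, fn v -> (v - lo) / (hi - lo) end)
  end
end
```

The downside mentioned above shows up in the struct definition: the shape of data depends on operation, so a field such as std is no longer visible in the type and is only reachable as model.data.std.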

msluszniak, Jan 09 '24 18:01