category_encoders

Transformers for continuous data

Open mglowacki100 opened this issue 5 years ago • 3 comments

Hi, I know that this library is focused on categorical encoding, but I think there is value in adding at least StandardScaler and MinMaxScaler, with the same nice interface that one_hot has for dealing with NaNs, and with get_feature_names. What do you think?

mglowacki100 avatar Dec 09 '19 18:12 mglowacki100

@wdm0006 ?

janmotl avatar Dec 10 '19 09:12 janmotl

At first glance I'd say no, that is probably out of scope for this library (but it seems like a good idea in general). The universe of continuous value transformations is quite large, and bringing it into scope here would be quite a big addition.

wdm0006 avatar Dec 10 '19 20:12 wdm0006

Yes, there are a lot of continuous transformers, but they are quite similar in input/output format, at least for scalers. I think the easiest way to implement this is to wrap scikit-learn's continuous transformers. Basically, we need to handle missing/unknown values in order to use them. Additionally: invariants and feature names. This functionality could probably be shared with the categorical encoders as well.
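To make the wrapping idea concrete, here is a minimal sketch and not an actual category_encoders API. The class name `ScalerWrapper`, its `fill_value` parameter, and the exact meaning given to each `handle_missing` option are assumptions made for illustration; the option names are only loosely modelled on the ones the categorical encoders accept. It relies on scikit-learn scalers ignoring NaNs in `fit` and passing them through in `transform`.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin, clone
from sklearn.preprocessing import StandardScaler


class ScalerWrapper(TransformerMixin, BaseEstimator):
    """Hypothetical wrapper around a scikit-learn scaler (illustration only)."""

    def __init__(self, scaler=None, handle_missing="return_nan", fill_value=0.0):
        self.scaler = scaler                  # any sklearn scaler; StandardScaler if None
        self.handle_missing = handle_missing  # "error", "return_nan" or "value"
        self.fill_value = fill_value          # used only when handle_missing == "value"

    def _check_missing(self, X):
        # Mirror the "error" policy: refuse input that contains NaNs.
        if self.handle_missing == "error" and X.isna().any().any():
            raise ValueError("Columns to be scaled contain missing values")

    def fit(self, X, y=None):
        X = pd.DataFrame(X)
        self._check_missing(X)
        # Remember the input column names so they can be reported later.
        self.feature_names_ = [str(c) for c in X.columns]
        base = StandardScaler() if self.scaler is None else self.scaler
        # sklearn scalers disregard NaNs when computing their statistics.
        self.scaler_ = clone(base).fit(X.to_numpy(dtype=float))
        return self

    def transform(self, X):
        X = pd.DataFrame(X)
        self._check_missing(X)
        scaled = self.scaler_.transform(X.to_numpy(dtype=float))
        out = pd.DataFrame(scaled, columns=self.feature_names_, index=X.index)
        if self.handle_missing == "value":
            # Replace NaNs that survived scaling with a constant.
            out = out.fillna(self.fill_value)
        return out

    def get_feature_names(self):
        return list(self.feature_names_)
```

Something like `ScalerWrapper(MinMaxScaler(), handle_missing="value").fit_transform(df)` would then return a DataFrame with the original column names, with missing entries replaced by `fill_value` after scaling.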

mglowacki100 avatar Dec 12 '19 06:12 mglowacki100

Closing this, as I also agree that it would be out of scope.

PaulWestenthanner avatar Jan 07 '23 10:01 PaulWestenthanner