PaulWestenthanner
Hi @Fish-Soup I like this proposal. It's somewhat similar to interpolation for continuous data. A lot of encoders internally have an ordinal encoder that first encodes data to ordinal integers...
I guess you couldn't just return np.nan: if your function is supposed to handle NaN values, it should not return NaN. Then you'd need a second-order NaN handling strategy....
Hi @solegalli I'm not super familiar with the HashingEncoder but I've just read up on the referenced literature (given in the docs) and hope I can answer your question. If...
Thanks for that additional clarification @bmreiniger
Hi @shauryauppal Could you please also provide a dataset or, even better, a self-contained minimal reproducible example? Neither the Stack Overflow post nor the Kaggle post mentions the dataset....
Neither `stats["count"]` nor `self.smoothing` should be 0. The former cannot even be 0, while for the latter the documentation clearly states `The value must be strictly bigger than 0`. Without...
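To illustrate why a zero `smoothing` would be a problem, here is a minimal sketch. It assumes a sigmoid-style weighting between the global prior and the per-category mean, which is my recollection of how the target encoder blends the two; the function name `smoothed_target` and its defaults are hypothetical and not taken from the library.

```python
import numpy as np


def smoothed_target(count, category_mean, prior, min_samples_leaf=1, smoothing=1.0):
    """Blend the per-category mean with the global prior (assumed formula)."""
    if smoothing <= 0:
        # Dividing by smoothing below would fail or be meaningless otherwise.
        raise ValueError("smoothing must be strictly bigger than 0")
    # Weight grows towards 1 as the category count exceeds min_samples_leaf.
    weight = 1.0 / (1.0 + np.exp(-(count - min_samples_leaf) / smoothing))
    return prior * (1.0 - weight) + category_mean * weight


# With count == min_samples_leaf the weight is exactly 0.5.
halfway = smoothed_target(count=1, category_mean=1.0, prior=0.0)
# With many observations the result approaches the category mean.
well_supported = smoothed_target(count=100, category_mean=1.0, prior=0.0)
```

Under this assumed formula, `smoothing` sits in a denominator, so a value of 0 has no sensible interpretation.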
May I ask why the project wants to re-implement encoders that are already part of sklearn? I thought it was complementing sklearn in a way by only adding encoders that are...
I totally agree with point 1. For point 2 I think sklearn also supports pandas DataFrames:
```python
>>> import pandas as pd
>>> from sklearn.preprocessing import OneHotEncoder
>>> enc =...
```
I agree this would be helpful. If you've got the time feel free to go ahead. Most encoders boil down to exporting only a dictionary of the mapping. Is there...
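As a rough sketch of what "exporting only a dictionary of the mapping" could look like, here is a plain-Python ordinal mapping that is round-tripped through JSON. The helpers `fit_ordinal_mapping` and `apply_mapping` are hypothetical and not part of the library's API; the `-1` code for unseen categories is likewise just an assumption for the example.

```python
import json


def fit_ordinal_mapping(values):
    """Map each distinct category to a 1-based integer, in first-seen order."""
    distinct = dict.fromkeys(values)  # dicts preserve insertion order
    return {str(cat): idx for idx, cat in enumerate(distinct, start=1)}


def apply_mapping(values, mapping, unknown=-1):
    """Encode values with a fitted mapping; unseen categories get `unknown`."""
    return [mapping.get(str(v), unknown) for v in values]


mapping = fit_ordinal_mapping(["a", "b", "a", "c"])

# Export the mapping as JSON and restore it, e.g. in another process.
exported = json.dumps(mapping)
restored = json.loads(exported)

encoded = apply_mapping(["b", "c", "z"], restored)
```

Keys are stringified so that the mapping survives the JSON round trip unchanged.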
Not at the moment