category_encoders

A library of sklearn compatible categorical variable encoders

Results 89 category_encoders issues

[sklearn.preprocessing.OneHotEncoder](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html) has the option `sparse=True`, to return the output in a scipy.sparse matrix. This can be really useful if you have categories with high cardinality. Would it be possible to...

enhancement
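To illustrate what the sparse output requested above would buy for a high-cardinality column, here is a minimal sketch that builds a one-hot matrix directly with `scipy.sparse`. The helper name `one_hot_sparse` is hypothetical and is not part of category_encoders or scikit-learn; it only assumes NumPy and SciPy are installed.

```python
import numpy as np
from scipy import sparse


def one_hot_sparse(values):
    """Return (categories, CSR matrix) with one column per distinct value.

    Hypothetical helper for illustration only.
    """
    # np.unique returns the sorted distinct values and, for each input
    # element, the index of its category in that sorted array.
    categories, codes = np.unique(np.asarray(values), return_inverse=True)
    n_rows = len(codes)
    data = np.ones(n_rows, dtype=np.int8)
    rows = np.arange(n_rows)
    # Only one non-zero entry per row is stored, regardless of how many
    # categories (columns) exist.
    matrix = sparse.csr_matrix(
        (data, (rows, codes)), shape=(n_rows, len(categories))
    )
    return categories, matrix


categories, matrix = one_hot_sparse(["red", "green", "red", "blue"])
```

The matrix stores one non-zero per row, so memory grows with the number of rows rather than rows times categories.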

Are you planning to implement parallel encoding of features for WOE encoding?

enhancement

Versions: sklearn 0.22.1, category_encoders 2.1.0. Issue: if I use a fitted BinaryEncoder instance in a custom classifier, it raises "ValueError: Must train encoder before it can be...

bug

**Summary** `OrdinalEncoder.fit()` throws an exception when the input values are entirely numeric (e.g. `[1, 2, 3, 4, 5]`) or can be converted to numeric (e.g. `['001', '002', '003', '004',...

Hi! I came here looking for how to encode categorical variables that have a circular distance relation (such as the days of the week, where the last day, Sunday,...

enhancement
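For the circular-distance request above, one common approach (not something the issue text or the library is confirmed to provide) is to map each category to a point on the unit circle with sine and cosine. The sketch below uses only the standard library; the names `encode_cyclical` and `cyclical_distance` are hypothetical.

```python
import math

DAYS = ["monday", "tuesday", "wednesday", "thursday",
        "friday", "saturday", "sunday"]


def encode_cyclical(index, period):
    """Map position `index` within a cycle of length `period` to (sin, cos)."""
    angle = 2 * math.pi * index / period
    return math.sin(angle), math.cos(angle)


def cyclical_distance(a, b):
    """Euclidean distance between two encoded points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


# Each day becomes a pair of features instead of a single ordinal code.
encoded = {day: encode_cyclical(i, len(DAYS)) for i, day in enumerate(DAYS)}

d_sun_mon = cyclical_distance(encoded["sunday"], encoded["monday"])
d_mon_tue = cyclical_distance(encoded["monday"], encoded["tuesday"])
d_mon_thu = cyclical_distance(encoded["monday"], encoded["thursday"])
```

With this encoding, adjacent days are equally far apart even across the week boundary, which a plain ordinal code (0 to 6) does not give.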

Hi, I know that the library is focused on categorical encoding, but I think there is value in adding at least `StandardScaler` and `MinMaxScaler`, with such a nice interface like we have...

enhancement
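As a reminder of what the two scalers named in the request above compute, here is a small standard-library sketch. The function names are hypothetical and this is not the scikit-learn implementation; it assumes the usual definitions (zero mean and unit population standard deviation for standard scaling, and rescaling to [0, 1] for min-max scaling).

```python
import statistics


def standard_scale(values):
    """Subtract the mean and divide by the population standard deviation."""
    mean = statistics.fmean(values)
    # Fall back to 1.0 for a constant column to avoid division by zero.
    std = statistics.pstdev(values) or 1.0
    return [(v - mean) / std for v in values]


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    # Fall back to 1.0 for a constant column to avoid division by zero.
    span = (hi - lo) or 1.0
    return [(v - lo) / span for v in values]


standardized = standard_scale([2.0, 4.0, 6.0])
rescaled = min_max_scale([1.0, 2.0, 3.0])
```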

I am packaging this Python package on nixpkgs. When running the tests, I ran into:
```
error: [Errno 2] File b'source_data/mushrooms/agaricus-lepiota.csv' does not exist: b'source_data/mushrooms/agaricus-lepiota.csv'
```
I think that the path...

bug

I'm trying to see the output of using HashingEncoder, and I've used the original sample code from the documentation, and I don't see any differences between the transformed and non-transformed...

I know that I'm asking for a lot here, but it'd be great to have some idea of which encoding strategies are useful in which cases: classification vs regression...

Extend HashingEncoder to work with `util.hash_pandas_object` as the hashing function. **Reasoning**: Currently, HashingEncoder relies on hashlib. Hashlib is nice, however: 1. hashlib works only value by value -> no vectorization...

enhancement
help wanted
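To make the vectorization argument in the HashingEncoder request above concrete, the sketch below hashes a whole column in one call with `pandas.util.hash_pandas_object` and folds the result into a fixed number of buckets. The helper `hash_to_buckets` and the bucket count are assumptions for illustration; this shows only the hashing step, not HashingEncoder's actual output layout.

```python
import pandas as pd


def hash_to_buckets(column: pd.Series, n_buckets: int = 8) -> pd.Series:
    """Hash every value of `column` at once and map it to a bucket id.

    Hypothetical helper, not part of category_encoders.
    """
    # index=False hashes the values only, so equal values always receive
    # equal hashes regardless of their row position.
    hashed = pd.util.hash_pandas_object(column, index=False)
    # The hashes are uint64; modulo folds them into [0, n_buckets).
    return (hashed % n_buckets).astype("int64")


colors = pd.Series(["red", "green", "red", "blue"])
buckets = hash_to_buckets(colors, n_buckets=8)
```

Equal inputs land in the same bucket, and the whole column is processed without a Python-level loop over values.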