category_encoders

A library of sklearn compatible categorical variable encoders

Results 89 category_encoders issues

Fixes #136
## Proposed Changes
This pull request implements feature hierarchies in Target Encoders.
Author: @nercisla
Current status: Work in Progress

All encoders fail with the error "'XXXEncoder' object has no attribute '_get_tags'"

non-reproducible

Memory increase of WOEEncoder for category_encoders version >=2.0.0
Hi, I noticed another memory issue with `WOEEncoder`. I have submitted the same bug before in [#335](https://github.com/scikit-learn-contrib/category_encoders/issues/362), the difference between the two bugs...

enhancement
good first issue

## Expected Behavior
Similar memory usage for the different category_encoders versions or better performance for higher category_encoders versions.
## Actual Behavior
According to the experiment results, when the category_encoders version...

enhancement
good first issue

## Expected Behavior
I have not found a function to map the encoded values back to the categorical values when using category_encoders' CatBoostEncoder. I was trying to do it manually...
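A rough illustration of why such a reverse mapping is not a simple lookup, assuming a CatBoost-style ordered target statistic in which each row is encoded from the targets of earlier rows only. This is a pure-Python sketch, not the library's code; the function name `ordered_target_encode` and the smoothing weight `a` are hypothetical.

```python
from collections import defaultdict


def ordered_target_encode(categories, targets, a=1.0):
    """Encode each row from the targets of earlier rows in the same category.

    Simplified sketch: `a` is an assumed smoothing weight and the prior is
    the global target mean.
    """
    prior = sum(targets) / len(targets)
    sums = defaultdict(float)   # running target sum per category
    counts = defaultdict(int)   # running row count per category
    encoded = []
    for cat, y in zip(categories, targets):
        encoded.append((sums[cat] + prior * a) / (counts[cat] + a))
        # Update the running statistics only after encoding the row.
        sums[cat] += y
        counts[cat] += 1
    return encoded


categories = ["a", "b", "a", "a", "b"]
targets = [1, 0, 0, 1, 1]
encoded = ordered_target_encode(categories, targets)

# Collect every encoded value observed for each category.
codes = defaultdict(set)
for cat, value in zip(categories, encoded):
    codes[cat].add(value)

# Values that appear under more than one category cannot be inverted.
shared = codes["a"] & codes["b"]
print(dict(codes))
print(shared)
```

Under these assumptions one category produces several encoded values, and the first row of each category receives the same prior-only value, so a value-to-category dictionary built from training output is ambiguous.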

The test `./tests/test_encoders.py::TestEncoders::test_unique_column_is_not_predictive` fails for `QuantileEncoder`. That new supervised encoder wasn't added to this test. My cursory understanding is that the other supervised encoders smooth things so that unique levels...
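One way to see why a column of unique levels can be problematic for a quantile-based encoding, assuming an unsmoothed per-level median. This is a deliberate simplification and not `QuantileEncoder`'s actual implementation; the helper `per_level_median` is hypothetical.

```python
import statistics


def per_level_median(categories, targets):
    """Replace each category by the median target of its level (no smoothing)."""
    groups = {}
    for cat, y in zip(categories, targets):
        groups.setdefault(cat, []).append(y)
    medians = {cat: statistics.median(ys) for cat, ys in groups.items()}
    return [medians[cat] for cat in categories]


# Every level occurs exactly once, as in an ID-like column.
ids = ["u1", "u2", "u3", "u4"]
y = [3.0, 7.0, 1.0, 9.0]
encoded_unique = per_level_median(ids, y)

# With one observation per level, the median is that observation,
# so the encoded column reproduces the target.
print(encoded_unique == y)
```

In this simplified setting the encoded column is a copy of the target, which is the kind of leakage a "unique column is not predictive" check is meant to catch.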

enhancement

Is there any plan to create an ONNX converter for this library?

enhancement
help wanted

It looks like one of the best category encoders. https://github.com/pcerda/string_categorical_encoders/blob/master/column_encoder.py#L272 The original paper: https://arxiv.org/abs/1907.01860

enhancement
help wanted

CatBoostEncoder returns NaN with pandas 1.1.0 but works with pandas 1.0.4

non-reproducible

I'm putting CatBoostEncoder in a pipeline just before a RandomForestClassifier, but I'm getting an error in the RF due to all the values being NaNs. If I manually call `.fit(X,...

non-reproducible