feature_engine issues

Mad outlier rule

Added MAD outlier rule. Refactored base code and tests.

MAD-Median rule for outlier removal

4

**Is your feature request related to a problem? Please describe.** We can add an outlier rule similar to the Gaussian one (mean ± k * std) but using robust statistics (median ± k...

glevv
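A minimal sketch of the rule this request describes, assuming the truncated formula continues as median ± k * MAD (median absolute deviation), which is what the issue title suggests. The function name `mad_bounds` and the default `k` are hypothetical and not part of feature_engine. Some implementations also rescale the MAD by a constant to make it comparable to a standard deviation; that scaling is not assumed here.

```python
import numpy as np


def mad_bounds(values, k=3.0):
    """Return (lower, upper) limits as median -/+ k * MAD.

    Hypothetical helper for illustration; not a feature_engine API.
    """
    x = np.asarray(values, dtype=float)
    median = np.median(x)
    # MAD: median of the absolute deviations from the median
    mad = np.median(np.abs(x - median))
    return median - k * mad, median + k * mad


data = [1.0, 2.0, 3.0, 4.0, 100.0]
lower, upper = mad_bounds(data, k=3.0)

# Values outside the limits are flagged as outliers
outliers = [v for v in data if v < lower or v > upper]
```

Because both the median and the MAD ignore the magnitude of extreme values, the single large observation does not widen the limits the way it would with mean and standard deviation.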

Ordinal encoder outputs -1 for unknown Categories

4

This is my shot for #428. I noticed `OrdinalEncoder` inherits the `transform` method from `CategoricalMethodsMixin`. So I added an additional condition to output -1 only for `OrdinalEncoder`. Additionally I had...

datacubeR

Fix repeated features for OneHotEncoder and PolynomialFeatures

7

Hi @solegalli, this is my shot at fixing #489. After checking the code in detail, I think the issue affects not only `PolynomialFeatures` but also Sklearn `OneHotEncoder`. When using `SklearnTransformerWrapper`...

datacubeR

`OrdinalEncoder` could output -1 for unseen categories

1

The `OrdinalEncoder` has an `errors` argument which can either raise an error or output NaNs when encountering new categories. For this particular class, it'd make sense to output -1 when...

solegalli

good first issue

enhancement
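The requested behaviour can be illustrated with plain pandas. This is only a sketch of the idea (map categories to integers, then send anything unseen to -1); it does not use or reproduce the `OrdinalEncoder` implementation, and the variable names are made up for the example.

```python
import pandas as pd

train = pd.Series(["a", "b", "c", "a"])
test = pd.Series(["a", "c", "d"])  # "d" was never seen in train

# Learn an arbitrary integer code for each category seen in train
mapping = {cat: code for code, cat in enumerate(train.unique())}

# Unseen categories become NaN after map(); replace them with -1
encoded = test.map(mapping).fillna(-1).astype(int)
```

Using -1 keeps the output column integer-typed, whereas NaN forces a float column.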

Information Value for nominal variables

8

Presently, there are no packages in Python to calculate Information Value using WOE for nominal/categorical variables. As the WOE encoder is already available in Feature-engine, I am raising the...

SurajitTest

new transformer
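For context, Information Value for a categorical variable and a binary target is commonly computed as the sum over categories of (share of events minus share of non-events) times the WOE. The sketch below follows that common definition with pandas; the function name `information_value` is hypothetical, and this is not the transformer proposed in the issue. It assumes every category contains at least one event and one non-event, since otherwise the logarithm is undefined.

```python
import numpy as np
import pandas as pd


def information_value(feature, target):
    """Information Value of a categorical feature for a 0/1 target.

    Hypothetical helper for illustration; not a feature_engine API.
    """
    df = pd.DataFrame({"x": feature, "y": target})
    counts = df.groupby("x")["y"].agg(events="sum", total="count")
    counts["non_events"] = counts["total"] - counts["events"]

    # Share of all events / non-events that fall in each category
    pct_events = counts["events"] / counts["events"].sum()
    pct_non_events = counts["non_events"] / counts["non_events"].sum()

    # Weight of evidence per category
    woe = np.log(pct_events / pct_non_events)
    return float(((pct_events - pct_non_events) * woe).sum())


x = ["a", "a", "a", "b", "b", "b"]
y = [1, 1, 0, 0, 0, 1]
iv = information_value(x, y)
```

Each term of the sum is non-negative, so the result does not depend on which class is treated as the event.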

Psi auto threshold

PR for #494

glevv

Idea: add a `UnstableLabelEncoder`

6

Related to [`RareLabelEncoder`](https://feature-engine.readthedocs.io/en/latest/api_doc/encoding/RareLabelEncoder.html), I wrote an `UnstableLabelEncoder` that groups categories that are unstable over time. You define `n_time_buckets` (for example `5`) and a `time_variable`. Then I cut the `time_variable` into...

timvink

Adding auto threshold to DropHighPSIFeatures

5

**Is your feature request related to a problem? Please describe.** The staple thresholds of 0.1 and 0.25 are empirical, but there are alternatives that calculate the threshold based on the data and parameters of...

glevv

new transformer: feature selection using mrmr

11

As per description here: https://medium.com/towards-data-science/mrmr-explained-exactly-how-you-wished-someone-explained-to-you-9cf4ed27458b and references therein.

solegalli
