
add test to ensure estimators raise errors when X has duplicated column names

Open david-cortes opened this issue 1 year ago • 5 comments

Per comment in PR: https://github.com/feature-engine/feature_engine/pull/686#pullrequestreview-1544746361

After the PR linked above is merged, there will be a general check for duplicated columns, for which a test was added in test_dataframe_checks.py. The new error raised on duplicated column names should also be tested in the individual transformers in estimator_checks/* after those files get refactored.

david-cortes avatar Jul 25 '23 17:07 david-cortes
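
For readers without the PR open, the kind of check being discussed can be sketched roughly as follows. This is a minimal illustration only: the function name `check_no_duplicated_columns` and the error message are assumptions made for this example, not the code from the linked PR.

```python
import pandas as pd


def check_no_duplicated_columns(X: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: raise if X has repeated column names.

    The name and message are illustrative assumptions, not feature_engine's API.
    """
    # Index.duplicated() marks every repeat of a name after its first occurrence.
    duplicated = X.columns[X.columns.duplicated()].unique().tolist()
    if duplicated:
        raise ValueError(
            f"Input data contains duplicated variable names: {duplicated}"
        )
    return X
```

A dataframe built with `columns=["a", "b", "a"]` would trigger the `ValueError`, while one with unique column names is returned unchanged.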

Thank you @david-cortes

solegalli avatar Jul 26 '23 06:07 solegalli

hi @solegalli and @david-cortes,

Slowly dipping my toes back into feature-engine.

Is the idea to create a test/check function in dataframe_checks.py and then call it from the fit() method of the parent transformers?

Morgan-Sell avatar Sep 12 '23 20:09 Morgan-Sell

No, I think the idea was to add a generic test to estimator_checks that runs against all transformers, to ensure that they inherit this functionality properly. I am not sure whether it is overkill or not.

solegalli avatar Sep 13 '23 07:09 solegalli

Ok, so as the code base stands, feature-engine transformers raise errors when there are duplicated feature names, correct? This issue is to create a test that confirms the error is raised when expected.

Morgan-Sell avatar Sep 15 '23 17:09 Morgan-Sell

Correct!

solegalli avatar Sep 18 '23 10:09 solegalli
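
The generic test agreed on in this thread could look something like the sketch below. It is only an illustration under stated assumptions: `ToyTransformer` is a hypothetical stand-in rather than a feature_engine class, `check_error_if_duplicated_columns` is an invented name, and the assumption that the error is a `ValueError` raised in `fit()` is not confirmed by the thread.

```python
import copy

import pandas as pd


class ToyTransformer:
    """Hypothetical stand-in for a transformer; not a feature_engine class."""

    def fit(self, X, y=None):
        # Assumed behaviour: fit() rejects dataframes with repeated column names.
        if X.columns.duplicated().any():
            raise ValueError("Input data contains duplicated variable names.")
        self.feature_names_in_ = list(X.columns)
        return self


def check_error_if_duplicated_columns(estimator):
    """Hypothetical generic check: fit() must raise ValueError on duplicated names."""
    # Small dataframe in which the name "var_a" appears twice.
    X = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["var_a", "var_b", "var_a"])
    transformer = copy.deepcopy(estimator)  # leave the caller's instance untouched
    try:
        transformer.fit(X)
    except ValueError:
        return  # the expected outcome
    raise AssertionError(
        f"{type(estimator).__name__} did not raise ValueError "
        "when X has duplicated column names"
    )


# Passes silently because ToyTransformer raises as expected.
check_error_if_duplicated_columns(ToyTransformer())
```

In a real test suite the same function would be run once per transformer, for instance through a parametrized test, so that every estimator is shown to inherit the duplicated-column error.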