class_notebooks

houses.ensure_normality() does not exist in dataset.py

Open NathanDotTo opened this issue 6 years ago • 1 comments

In the Data Manipulation Classes notebook, you have:

houses.scale()
houses.ensure_normality()

The ensure_normality() function appears not to exist. You probably mean the fix_skewness function.

Note that there is also a skewed_features function, which uses Box-Cox. Box-Cox can't be applied here, because it is only defined for strictly positive values (hence the Yeo-Johnson transform used in the fix_skewness function). After applying the StandardScaler, the numerical values are zero-centred, so they range from negative to positive.
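To illustrate the domain problem, here is a minimal sketch of the two transforms written out in plain Python from their textbook definitions. The helper names (box_cox, yeo_johnson, box_cox_fails), the sample values and the choice of lambda = 0.5 are my own, purely for illustration; this is not the code in dataset.py.

```python
import math


def box_cox(x, lmbda):
    """Box-Cox transform; only defined for strictly positive x."""
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive input")
    if lmbda == 0:
        return math.log(x)
    return (x ** lmbda - 1.0) / lmbda


def yeo_johnson(x, lmbda):
    """Yeo-Johnson transform; defined for any real x."""
    if x >= 0:
        if lmbda == 0:
            return math.log1p(x)
        return ((x + 1.0) ** lmbda - 1.0) / lmbda
    if lmbda == 2:
        return -math.log1p(-x)
    return -(((1.0 - x) ** (2.0 - lmbda)) - 1.0) / (2.0 - lmbda)


def box_cox_fails(values, lmbda):
    """Return True if Box-Cox rejects any of the values."""
    try:
        for v in values:
            box_cox(v, lmbda)
    except ValueError:
        return True
    return False


# Zero-centred values, like those a standardized feature would contain
# (made-up numbers, not taken from the houses dataset).
scaled = [-1.5, -0.2, 0.0, 0.7, 2.1]

# Yeo-Johnson handles the negative and zero entries without complaint.
yj = [yeo_johnson(v, 0.5) for v in scaled]

# Box-Cox rejects the same input because of the non-positive entries.
print(box_cox_fails(scaled, 0.5))
```

Yeo-Johnson maps 0 to 0 and keeps the ordering of the values, so it can be applied after scaling, whereas Box-Cox has to be applied before any centring, if at all.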

Also, the PowerTransformer has a standardize option, which seems to apply the same zero-mean, unit-variance scaling as the StandardScaler to the transformed output anyway. But that doesn't work on its own, as there is an "overflow encountered in multiply" problem unless the StandardScaler is first applied to the data. Hence, I suppose, why standardize=False.
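The order of operations described above (scale first, then Yeo-Johnson with standardize=False) could look roughly like this with scikit-learn. It is only a sketch: the synthetic log-normal column and the skewness helper are assumptions of mine, not the houses data or the actual body of fix_skewness.

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer, StandardScaler


def skewness(a):
    """Sample skewness (third standardized moment) of an array."""
    a = np.asarray(a, dtype=float).ravel()
    centred = a - a.mean()
    return float((centred ** 3).mean() / (centred ** 2).mean() ** 1.5)


# Synthetic right-skewed feature with large magnitudes
# (a hypothetical stand-in for a price-like column).
rng = np.random.default_rng(0)
X = rng.lognormal(mean=12.0, sigma=0.5, size=(500, 1))

# Step 1: zero-centre and scale to unit variance.
X_scaled = StandardScaler().fit_transform(X)

# Step 2: Yeo-Johnson on the scaled data. standardize=False leaves the
# output unscaled, since the input was already standardized in step 1.
pt = PowerTransformer(method="yeo-johnson", standardize=False)
X_fixed = pt.fit_transform(X_scaled)

print("skewness before:", skewness(X_scaled))
print("skewness after: ", skewness(X_fixed))
print("fitted lambda:  ", pt.lambdas_[0])
```

Scaling first keeps the values fed to the transformer small, which is presumably what avoids the overflow warning mentioned above.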

NathanDotTo, Mar 05 '19 18:03

Thanks! I changed the name of the function (the class is quite unstable at the moment!) and now it is fix_skewness(), yes. The other one, skewed_features(), is simply there to show which features are skewed before fixing them. I will review both with the notes you sent.

Thanks!!!

renero, Mar 06 '19 06:03