redflag
Transform non-Gaussian features before outlier detection
A cutoff such as +/- 3 standard deviations assumes the feature is roughly Gaussian, so it isn't reliable on skewed or heavy-tailed features. Apply a transformation first, e.g. Yeo-Johnson via PowerTransformer, see https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.PowerTransformer.html and also #46
Also see "shifting transformation", eg https://arxiv.org/abs/2106.03899
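
A rough sketch of the transform-then-threshold idea, using scikit-learn's PowerTransformer with Yeo-Johnson. The helper name `zscore_outliers`, its `threshold` and `transform` parameters, and the synthetic lognormal data are all made up for illustration; none of this is an existing redflag API.

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer


def zscore_outliers(x, threshold=3.0, transform=True):
    """Flag points more than `threshold` standard deviations from the mean.

    Hypothetical helper for illustration only. If `transform` is True, the
    feature is passed through a Yeo-Johnson power transform before the
    z-score test is applied.
    """
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    if transform:
        # standardize=True (the default) returns zero-mean, unit-variance
        # output, so the transformed values can be read as z-scores.
        pt = PowerTransformer(method="yeo-johnson", standardize=True)
        z = pt.fit_transform(x)
    else:
        z = (x - x.mean()) / x.std()
    return np.abs(z.ravel()) > threshold


# Synthetic right-skewed feature (an assumption, not real data).
rng = np.random.default_rng(0)
skewed = rng.lognormal(mean=0.0, sigma=1.0, size=1000)

raw_flags = zscore_outliers(skewed, transform=False)
yj_flags = zscore_outliers(skewed, transform=True)

print("flagged without transform:", int(raw_flags.sum()))
print("flagged after Yeo-Johnson:", int(yj_flags.sum()))
```

On skewed data like this, the untransformed test tends to flag much of the long right tail, while the transformed test flags only points that are extreme relative to the reshaped distribution. Whether the threshold should stay at 3 after transforming is a separate question.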