
Add test for standardized and normalized data

Open kwinkunks opened this issue 3 years ago • 1 comment

I have utils.is_standardized(), which checks that the mean and standard deviation of a 1D array are close to 0 and 1 respectively. But this would really only work for the training data (new data scaled with the training scaler are unlikely to have these statistics), and I suspect it doesn't even do that. Robust standardization is a whole other issue too, e.g. with sklearn's RobustScaler, which by default centres on the median and scales by the interquartile range, so its output need not have zero mean or unit standard deviation.
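To make the problem concrete, here is a minimal, dependency-free sketch of that kind of check. The helper name looks_standardized, its tolerance and the toy data are all hypothetical, not the actual utils.is_standardized() implementation; in practice the same logic would likely use np.mean and np.std.

```python
import math
import statistics


def looks_standardized(values, atol=1e-3):
    """Hypothetical check: mean close to 0 and population std close to 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return (math.isclose(mean, 0.0, abs_tol=atol)
            and math.isclose(std, 1.0, abs_tol=atol))


# Toy data (assumed for illustration only).
train = [2.0, 4.0, 6.0, 8.0]
new = [5.0, 7.0, 9.0, 11.0]

# Fit the scaling statistics on the training data only.
mu = statistics.fmean(train)
sigma = statistics.pstdev(train)

# Apply the *training* statistics to both sets, as a fitted scaler would.
train_scaled = [(x - mu) / sigma for x in train]
new_scaled = [(x - mu) / sigma for x in new]

print(looks_standardized(train_scaled))  # training data passes
print(looks_standardized(new_scaled))    # new data is correctly scaled but fails
```

The new data are scaled correctly, yet their mean is shifted away from 0, so a strict test on the statistics reports a false negative.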

All the same points go for normalized data, more or less.
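A rough sketch of the equivalent check for normalized data, assuming "normalized" means min-max scaling to the range [0, 1] (it could also mean unit-norm scaling, which this does not cover). The helper name looks_normalized and the toy data are hypothetical.

```python
import math


def looks_normalized(values, atol=1e-9):
    """Hypothetical check: minimum close to 0 and maximum close to 1."""
    return (math.isclose(min(values), 0.0, abs_tol=atol)
            and math.isclose(max(values), 1.0, abs_tol=atol))


# Toy data (assumed for illustration only).
train = [2.0, 4.0, 6.0, 8.0]
new = [5.0, 7.0, 9.0, 11.0]

# Fit the range on the training data only.
low, high = min(train), max(train)

# Apply the *training* range to both sets.
train_scaled = [(x - low) / (high - low) for x in train]
new_scaled = [(x - low) / (high - low) for x in new]

print(looks_normalized(train_scaled))  # training data passes
print(looks_normalized(new_scaled))    # new data fails
print(max(new_scaled))                 # values can even exceed 1
```

As with standardization, new data scaled with the training range can legitimately fall outside [0, 1], so a strict test only describes the data the scaler was fitted on.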

So, what would be good tests for standardized and normalized data? Let's add it/them (with a sklearn detector too).

kwinkunks, Aug 26 '22 13:08

NB: replaced the function with is_standard_normal().

kwinkunks, Sep 20 '23 10:09