redflag
Add test for standardized and normalized data
I have utils.is_standardized(), which checks that the mean and standard deviation of a 1D array are close to 0 and 1 respectively. But this would really only work for the training data (new data scaled with the training scaler are unlikely to have these statistics), and I suspect it doesn't even do that. Robust standardization is a whole other issue too, e.g. with sklearn's RobustScaler.
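To make the training-vs-new-data problem concrete, here is a minimal sketch. The function `is_standardized_sketch()` and its `atol` tolerance are hypothetical stand-ins, not the actual `utils.is_standardized()` implementation, and the scaling is done by hand in NumPy to mimic what a scaler fitted on the training data would do.

```python
import numpy as np

def is_standardized_sketch(a, atol=1e-3):
    """Hypothetical stand-in for utils.is_standardized(): mean ~ 0 and std ~ 1."""
    a = np.asarray(a, dtype=float)
    return bool(np.isclose(a.mean(), 0.0, atol=atol)
                and np.isclose(a.std(), 1.0, atol=atol))

rng = np.random.default_rng(42)
train = rng.normal(loc=10.0, scale=3.0, size=1000)
new = rng.normal(loc=10.0, scale=3.0, size=50)  # same distribution, unseen data

# Fit the statistics on the training data only, then apply them to both sets.
mu, sigma = train.mean(), train.std()
train_scaled = (train - mu) / sigma
new_scaled = (new - mu) / sigma

train_ok = is_standardized_sketch(train_scaled)  # passes by construction
new_ok = is_standardized_sketch(new_scaled)      # almost certainly fails

print(train_ok, new_ok)
print(new_scaled.mean(), new_scaled.std())
```

The new data come from the same distribution and were scaled correctly, yet sampling variation alone moves their mean and standard deviation away from 0 and 1, so a tight tolerance rejects them and a loose one stops being much of a test.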
All the same points go for normalized data, more or less.
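The same thing can be sketched for normalized data, assuming "normalized" here means min-max scaling to [0, 1] (it could also mean unit-norm scaling, which this does not cover). The function `is_normalized_sketch()` is hypothetical and only illustrates the failure mode.

```python
import numpy as np

def is_normalized_sketch(a, atol=1e-3):
    """Hypothetical check: min ~ 0 and max ~ 1, i.e. min-max scaled to [0, 1]."""
    a = np.asarray(a, dtype=float)
    return bool(np.isclose(a.min(), 0.0, atol=atol)
                and np.isclose(a.max(), 1.0, atol=atol))

train = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
new = np.array([3.0, 5.0, 12.0])  # includes a value beyond the training range

# Fit the range on the training data only, then apply it to both sets.
lo, hi = train.min(), train.max()
train_scaled = (train - lo) / (hi - lo)
new_scaled = (new - lo) / (hi - lo)

train_ok = is_normalized_sketch(train_scaled)  # min is 0 and max is 1 exactly
new_ok = is_normalized_sketch(new_scaled)      # min is 0.125, max is 1.25

print(train_ok, new_ok)
```

Correctly scaled new data need not touch either end of the interval, and can fall outside it, so an exact min/max check is again only meaningful for the data the scaler was fitted on.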
So, what would be good tests for standardized and normalized data? Let's add it/them (with a sklearn detector too).
NB: replaced the function with is_standard_normal().