StatsBase.jl

[feature request] allow `transform` to avoid Z-score transforming when sigma=0

Open SimonEnsemble opened this issue 2 years ago • 1 comments

My design matrix Φ has a column of ones to handle the intercept, so any time I Z-score transform it, I get NaNs in that column from dividing by a zero standard deviation. It would be nice if `transform` had an option to handle this, i.e. to not transform a column when sigma = 0.0.

```julia
using StatsBase

feature_transform = fit(ZScoreTransform, Φ, dims=1)
Φ̂ = StatsBase.transform(feature_transform, Φ)
```

SimonEnsemble • Dec 28 '23 18:12

E.g. in scikit-learn's `StandardScaler`: "If a variance is zero, we can’t achieve unit variance, and the data is left as-is, giving a scaling factor of 1."

https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html#sklearn.preprocessing.StandardScaler
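
Until such an option exists, the requested behaviour can be approximated by hand. The following is only a minimal sketch, assuming the transform is applied column-wise (as with `dims=1` above): `zscore_skip_constant` is a hypothetical helper, not part of StatsBase, and the 3×2 matrix is made-up example data. It uses only `mean` and `std` from the `Statistics` standard library and copies any column with zero standard deviation through unchanged, so the intercept column stays a column of ones.

```julia
using Statistics

# Hypothetical helper, not part of StatsBase: Z-score each column of X,
# but copy columns whose standard deviation is zero through unchanged.
function zscore_skip_constant(X::AbstractMatrix{<:Real})
    μ = mean(X, dims=1)   # 1×p row of column means
    σ = std(X, dims=1)    # 1×p row of column standard deviations
    Z = float.(X)         # floating-point copy of the input
    for j in axes(X, 2)
        if !iszero(σ[j])
            # Standardize only the columns that actually vary.
            Z[:, j] .= (X[:, j] .- μ[j]) ./ σ[j]
        end
    end
    return Z
end

# Made-up design matrix: an intercept column of ones plus one feature.
Φ = [1.0 2.0; 1.0 4.0; 1.0 6.0]
Φ̂ = zscore_skip_constant(Φ)
# First column stays [1.0, 1.0, 1.0]; second becomes [-1.0, 0.0, 1.0]; no NaNs.
```

The exact `iszero` check is enough for a column of ones, but a column of some other repeated floating-point value may come out with a tiny nonzero standard deviation due to rounding, so a small tolerance might be preferable in practice.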

SimonEnsemble • Dec 28 '23 18:12