FeatureTransforms.jl

Want MeanStdScaling that applies to multiple columns

Open glennmoy opened this issue 5 years ago • 3 comments

MeanStdScaling computes one set of mean and std params for all the data provided.

There is no convenient way to compute separate mean and std params for, e.g., a list of columns without looping over them or creating a separate transform for each.
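To make the gap concrete, here is a minimal sketch of the looping workaround. It uses only the `Statistics` standard library rather than the FeatureTransforms.jl API, and the toy matrix `X` and the names `params` and `Z_loop` are made up for illustration:

```julia
using Statistics

# Toy data: rows are observations, columns are features on different scales.
X = [1.0 100.0;
     2.0 200.0;
     3.0 300.0]

# Behaviour described above: one mean and one std over all the data.
mu_all, sigma_all = mean(X), std(X)

# Workaround: loop over the columns and keep one (mean, std) pair per column.
params = [(mean(col), std(col)) for col in eachcol(X)]

# Apply each pair to its own column, then stitch the columns back together.
Z_loop = reduce(hcat, [(col .- m) ./ s for (col, (m, s)) in zip(eachcol(X), params)])
```

This works, but the bookkeeping (one parameter pair per column, then reassembly) is exactly what a single transform would ideally handle.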

glennmoy, Mar 17 '21 18:03

Maybe a kwarg like `:per_col = true`, but it might have further implications for how this works under the hood.
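A rough sketch of what such a kwarg could look like, written as standalone functions on top of the `Statistics` standard library. The names `fit_mean_std`, `apply_scaling` and `undo_scaling` are hypothetical and are not part of FeatureTransforms.jl; only the `per_col` name comes from the suggestion above:

```julia
using Statistics

# Hypothetical helper, not the FeatureTransforms.jl implementation.
# per_col=false: one mean/std over all of X; per_col=true: one pair per column.
function fit_mean_std(X::AbstractMatrix; per_col::Bool=false)
    if per_col
        return (mu=mean(X; dims=1), sigma=std(X; dims=1))  # 1×n row matrices
    else
        return (mu=mean(X), sigma=std(X))                  # scalars
    end
end

# Broadcasting covers both the scalar case and the 1×n case.
apply_scaling(X, p) = (X .- p.mu) ./ p.sigma
undo_scaling(Z, p) = Z .* p.sigma .+ p.mu

X = [1.0 100.0; 2.0 200.0; 3.0 300.0]
p = fit_mean_std(X; per_col=true)
Z = apply_scaling(X, p)
```

One under-the-hood implication this hints at: the stored parameters change from two scalars to arrays, so applying and inverting the transform has to broadcast along the right dimension.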

nicoleepp, Mar 17 '21 21:03

Plus one for this; it is used in GPF on a per-slice basis.

nicoleepp, Apr 22 '21 19:04

I think computing the mean and std per column should be not just an optional feature but the default. The current implementation is dangerous because people usually want statistics computed per feature (per column), and it is extremely rare to want a single set of statistics over multiple features.
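A small illustration of the risk, assuming a toy matrix whose two columns live on very different scales. It uses only the `Statistics` standard library, not the package's own types, and the variable names are made up:

```julia
using Statistics

# Two features: the first ranges over 1..3, the second over 100..300.
X = [1.0 100.0; 2.0 200.0; 3.0 300.0]

# Single set of statistics over all the data.
Z_all = (X .- mean(X)) ./ std(X)

# Separate statistics per column.
Z_col = (X .- mean(X; dims=1)) ./ std(X; dims=1)

# Per-column spread after each kind of scaling.
spread_all = std(Z_all; dims=1)
spread_col = std(Z_col; dims=1)
```

With the single set of statistics, the large-scale column dominates the mean and std, so the small-scale column ends up with almost no spread and a non-zero mean. With per-column statistics, each column comes out with zero mean and unit std.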

molet, Jun 18 '21 12:06