New operator: z-score normalization

Open ianspektor opened this issue 1 year ago • 3 comments
New EventSet.z_score_normalize() (name TBD) operator.

See here for how to compute it.

See https://github.com/google/temporian/blob/main/CONTRIBUTING.md#developing-a-new-operator for guidance.

Questions or requests for additional guidance from possible contributors more than welcome!

ianspektor avatar Jan 17 '24 20:01 ianspektor

Hey @ianspektor, I have a few questions about putting this into action:

Q1) Will this be a Python-only operator or a C++ one?

Q2) As far as I understand, we can't use scipy, so we can't call scipy.stats.zscore directly. Do we keep the same arguments as scipy.stats.zscore? Also, I'm interested in how we deal with NaNs.

Q3) What data types will this operator support? All numeric?

akshatvishu avatar Apr 02 '24 11:04 akshatvishu

Tagging @javiber, he's the go-to person from now on for all things contributing :)

ianspektor avatar Apr 04 '24 13:04 ianspektor

Hi @akshatvishu, I think we can implement this one using numpy's mean and std without going down to C++.

Scipy's implementation for future reference: https://github.com/scipy/scipy/blob/v1.13.0/scipy/stats/_stats_py.py#L3021
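
For illustration, a minimal sketch of the numpy-only computation suggested above. It works on a plain 1-D array rather than an EventSet, and the function name `z_score` and the `ddof` parameter are assumptions (the latter mirrors scipy.stats.zscore, whose default is `ddof=0`); none of this is Temporian's actual API. NaN handling is left open in this thread, so the sketch simply lets NaNs propagate.

```python
import numpy as np


def z_score(values, ddof=0):
    """Hypothetical helper: (x - mean) / std over a 1-D array of feature values.

    NaNs propagate: a single NaN in the input makes every output NaN,
    because np.mean and np.std both return NaN in that case.
    """
    values = np.asarray(values, dtype=np.float64)
    mean = np.mean(values)
    std = np.std(values, ddof=ddof)
    # A constant input has std == 0, so the division yields NaN
    # (with a runtime warning); a real operator would need to decide
    # how to handle that case.
    return (values - mean) / std


normalized = z_score([1.0, 2.0, 3.0])
```

Swapping in np.nanmean and np.nanstd would ignore NaNs instead of propagating them; which behavior the operator should have is still an open question here.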

javiber avatar Apr 10 '24 14:04 javiber