interp-net

data normalization in data preprocessing

Open KimballCai opened this issue 4 years ago • 3 comments

Is it necessary to standardize the data before it enters the model? (e.g. avg-std normalization)

Thanks, Qingpeng

KimballCai, Aug 16 '20 16:08

No, there is no need to standardize the data. The different scales of the different dimensions are accounted for in the autoencoder loss.

satyanshukla, Aug 17 '20 05:08

Hi, the performance is not good, and I want to recheck the input. I generated n samples with d features, and the maximum number of time steps is 200, so x, m, and T all have the same shape, n * d * T. Am I right? For a sample that has fewer than 200 records, we just fill the rest of x, m, and T with 0. Am I right?

Thanks, Qingpeng

KimballCai, Aug 29 '20 06:08
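
For readers following along, the zero-padding described in this question could be sketched roughly as below. This is only an illustration with NumPy, not code from interp-net: the helper names `pad_sample` and `pad_batch` are hypothetical, each sample is assumed to arrive as a `(d, t_i)` array, and truncating series longer than `max_len` is an added assumption that the thread does not discuss.

```python
import numpy as np

def pad_sample(a, max_len=200):
    """Zero-pad (or truncate) one (d, t_i) array along time to (d, max_len).

    Hypothetical helper; truncation of long series is an assumption.
    """
    a = np.asarray(a, dtype=np.float32)
    out = np.zeros((a.shape[0], max_len), dtype=np.float32)
    length = min(a.shape[1], max_len)
    out[:, :length] = a[:, :length]  # everything after `length` stays 0
    return out

def pad_batch(samples, max_len=200):
    """Stack n padded samples into one (n, d, max_len) array."""
    return np.stack([pad_sample(s, max_len) for s in samples])

# Two toy samples with d=3 features and different numbers of records.
samples = [np.ones((3, 5)), np.ones((3, 250))]
padded = pad_batch(samples)
print(padded.shape)
```

Applying the same helper to the values, the masks, and the timestamps would give x, m, and T with matching shapes, all zero beyond each sample's last record.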

Yeah, the format seems right. Make sure the mask variable is 1 where x is observed and 0 otherwise. The starting value of T should be 0 (basically, subtract the initial timestamp from each time series so that it starts at t=0). What are the time scales, though?

satyanshukla, Sep 01 '20 07:09
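
The two checks in this reply (a 0/1 mask and timestamps that start at 0) could look something like the following sketch. It assumes missing values are marked as NaN in the raw data and that each series' timestamps are sorted in increasing order; both are assumptions for illustration, and `build_mask` and `start_at_zero` are hypothetical names, not functions from interp-net.

```python
import numpy as np

def build_mask(x_raw):
    """Return 1.0 where a value is observed and 0.0 where it is missing.

    Assumes missing entries are encoded as NaN in x_raw.
    """
    return (~np.isnan(x_raw)).astype(np.float32)

def start_at_zero(t):
    """Subtract the first timestamp so the series starts at t=0.

    Assumes t is sorted in increasing order.
    """
    t = np.asarray(t, dtype=np.float64)
    return t - t[0] if t.size else t

# One toy series with d=2 features and 3 records.
x_raw = np.array([[1.0, np.nan, 3.0],
                  [np.nan, 5.0, 6.0]])
mask = build_mask(x_raw)
x = np.nan_to_num(x_raw, nan=0.0)  # store 0 where the value is missing
times = start_at_zero([10.0, 12.5, 17.0])
print(mask)
print(times)
```

Doing the shift before any zero-padding keeps the padded tail of T at 0, which matches the fill convention described earlier in the thread.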