time-series-forecasting-rnn-tensorflow

Why do you shuffle the sequence of the training data?

Open hana9090 opened this issue 7 years ago • 2 comments

In the pre-processing, before the data is fed into the network model, why do you shuffle the sequence of the training data? The order is important in time series!

hana9090, Mar 22 '18

Did you find the answer?

JuanCaBaqueroB, Oct 25 '18

The training data is a normalized view of the raw data that has been split into windows. Please see this logic:

```python
# Split data into windows
raw = []
for index in range(len(data) - adjusted_window):
    raw.append(data[index: index + adjusted_window])
```
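For context, here is a minimal, self-contained sketch of that windowing step. The series `data` and the names `window_size` / `adjusted_window` are illustrative stand-ins, and the normalization (dividing each window by its first element) is an assumption that matches the zero-leading rows shown further down, not code taken verbatim from the repo:

```python
import numpy as np

# Stand-in for a raw price series (random walk shifted away from zero)
data = list(np.random.randn(100).cumsum() + 100)
window_size = 4                      # number of inputs per sample
adjusted_window = window_size + 1    # inputs + one target value

# Split data into overlapping windows of length `adjusted_window`
raw = []
for index in range(len(data) - adjusted_window):
    raw.append(data[index: index + adjusted_window])

# Normalize each window relative to its first element, so every window
# starts at 0 (matching the arrays printed below)
windows = np.array([[p / w[0] - 1 for p in w] for w in raw])
print(windows.shape)  # (95, 5): 95 samples of 4 inputs + 1 target each
```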

This means the training data is also cut into windows. For example, for a window size of four you get:

```
[[ 0.         -0.00289975  0.00517813 -0.01367026 -0.04971002]
 [ 0.          0.00810137 -0.01080183 -0.04694641 -0.02928957]
 [ 0.         -0.01875129 -0.0546054  -0.03709046 -0.00535751]
 ...
 [ 0.         -0.01493441 -0.01937437 -0.04702321 -0.02583249]
 [ 0.         -0.00450727 -0.03257529 -0.01106331 -0.03605818]
 [ 0.         -0.0281951  -0.00658572 -0.03169376 -0.033546  ]]
```

The first four elements of each sub-array are the inputs (X) and the last one is the actual value (Y).

It is perfectly valid to shuffle those sub-arrays, as all that matters is that the order of elements is preserved within each individual sub-array.

So shuffling is not an issue.
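To make that concrete, here is a hedged sketch of how the shuffle and the X/Y split could look. A small stand-in `windows` array is defined so the snippet runs on its own; the names are illustrative, not taken from the repo:

```python
import numpy as np

# Stand-in for the normalized windows (6 samples of 4 inputs + 1 target)
windows = np.random.randn(6, 5) * 0.03
windows[:, 0] = 0.0          # first column is 0 after the normalization step

np.random.shuffle(windows)   # reorders rows only; order within each row is preserved

X = windows[:, :-1]          # shape (n_samples, 4): the input sequences
y = windows[:, -1]           # shape (n_samples,):   the values to predict

# An RNN/LSTM layer expects 3-D input: (samples, timesteps, features)
X = X.reshape(X.shape[0], X.shape[1], 1)
```

`np.random.shuffle` operates along the first axis only, so whole windows are reordered while the temporal order inside each window stays intact, which is exactly the property the answer above relies on.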

dmitryaleks, Jan 02 '19