LSTM-Neural-Network-for-Time-Series-Prediction
Rolling column 1 predicted data into other columns for next step
I could be wrong, but in model.py, predict_sequences_multiple seems to take the first sequence_length window of the test set as input to the prediction. Is it trying to simulate the first point in time after that window, using a model trained on data prior to it? If so, we can only start predicting at a point after the first sequence_length of the test data.
I then see that the predicted 1D value (normalised price) is used to populate every column of the rolling window. If we were using the columns price, volume and date (assuming a seasonally dependent stock), should it not predict in 3 dimensions and roll with the predicted values of each?
Actually, I have the same question. The specific code is:
curr_frame = curr_frame[1:]
curr_frame = np.insert(curr_frame, [window_size-2], predicted[-1], axis=0)
If my understanding is correct, the two lines above remove the earliest actual row and then add the forecast value at the end of the array. But predicted is 1D, so the single value ends up copied into every feature column. I think we have to either forecast the other Xs as well or use the actual Xs for a true rolling forecast.
Glad I am not the only one. I hope the author can fix this issue soon, since it basically makes the results of 'predict_sequences_multiple' and 'predict_sequence_full' meaningless.
before this line:
curr_frame = np.insert(curr_frame, [window_size-2], predicted[-1], axis=0)
the end of curr_frame looks like this:
...
[-5.74086586e-02, 1.45605609e-01],
[-7.21488849e-02, 2.86602396e-02],
[-5.90250473e-02, 1.85077417e-02]
after the line, it becomes:
...
[-5.74086586e-02, 1.45605609e-01],
[-7.21488849e-02, 2.86602396e-02],
[-5.90250473e-02, 1.85077417e-02]
[-5.56795821e-02, -5.56795821e-02]
You can clearly see the 1-D prediction '-5.56795821e-02' is duplicated and inserted into this 2-D array.
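For anyone who wants to reproduce the duplication without running the model, here is a minimal sketch. The toy window, the column meanings (price, volume) and the value of predicted_price are made up for illustration; only the two-step roll mirrors the code quoted in this thread.

```python
import numpy as np

# Toy window: 4 time steps, 2 features (column 0 = price, column 1 = volume).
# These numbers are invented for illustration only.
window = np.array([[0.10, 1.0],
                   [0.20, 2.0],
                   [0.30, 3.0],
                   [0.40, 4.0]])
predicted_price = 0.50  # stand-in for the scalar the model would return

# Same two steps as the quoted code: drop the oldest row, then insert the scalar
# at the end of the remaining rows.
rolled = window[1:]
rolled = np.insert(rolled, [len(rolled)], predicted_price, axis=0)

# np.insert broadcasts the scalar across the new row, so the "volume" column
# of the last row also becomes 0.50.
print(rolled[-1])
```

The window keeps its shape, but the last row holds the price prediction in both columns, which matches the output pasted above.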
BTW, thank you for this great code repository, please fix this error. @jaungiers
+1
In my opinion, there is an overlap between the training data and the validation data.
I'm trying to research and fix this problem, because the 2D NumPy array doesn't clearly indicate the stock Date column.
I am also confused about @jaungiers' approach to estimating the future sequence. Using the estimated close price for all features at t+n seems wrong to me.
Using the actual t+n values (alongside the predicted price) doesn't seem reasonable either. The whole point is to use historical data to predict a future sequence.
I think I will use the last value for the future. For example, if volume is the second feature, I will use the last volume value.
If you figure out a better approach, please share.
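A rough sketch of the "use the last value" idea, in case it helps. roll_with_carry_forward and target_col are hypothetical names, not part of the repository, and this is only one possible workaround, not the author's method.

```python
import numpy as np

def roll_with_carry_forward(frame, predicted_value, target_col=0):
    """Drop the oldest row and append a new one.

    The new row holds the prediction in target_col; every other column
    repeats its last known value (hypothetical helper, for illustration).
    """
    new_row = frame[-1].copy()          # start from the last known row
    new_row[target_col] = predicted_value
    return np.vstack([frame[1:], new_row])

# Toy data: column 0 = price, column 1 = volume (invented values).
frame = np.array([[0.10, 1.0],
                  [0.20, 2.0],
                  [0.30, 3.0]])
rolled = roll_with_carry_forward(frame, 0.40)
print(rolled)
```

Here the volume in the appended row stays at 3.0 instead of being overwritten by the predicted price. Whether holding the other features constant is a good assumption depends on the data.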
+1
I have the same question. predicted[-1] is the last prediction of the close price, but the volume is not available, so why fill the whole last row with the same close price?
curr_frame = np.insert(curr_frame, [window_size-2], predicted[-1], axis=0)
The code uses test data to make the prediction, so predict_sequences_multiple is meaningless.
def predict_sequences_multiple(self, data, window_size, prediction_len):
    prediction_seqs = []
    for i in range(int(len(data)/prediction_len)):
        # This is just the test data
        curr_frame = data[i*prediction_len]
        predicted = []
        for j in range(prediction_len):
            # This is the model using the test data to make the prediction
            predicted.append(self.model.predict(curr_frame[newaxis,:,:])[0,0])
            curr_frame = curr_frame[1:]
            curr_frame = np.insert(curr_frame, [window_size-2], predicted[-1], axis=0)
        prediction_seqs.append(predicted)
    return prediction_seqs
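To make it concrete which test windows seed each run in the function above, here is a small sketch of the outer loop's indexing only. seed_indices is a hypothetical helper, and the numbers 10 and 3 are arbitrary.

```python
def seed_indices(num_windows, prediction_len):
    """Indices of the test windows used as the starting frame of each run.

    Mirrors `data[i*prediction_len]` for i in range(int(len(data)/prediction_len)).
    Hypothetical helper for illustration; not part of the repository.
    """
    return [i * prediction_len for i in range(int(num_windows / prediction_len))]

# With 10 test windows and runs of 3 steps, runs start at windows 0, 3 and 6.
seeds = seed_indices(10, 3)
print(seeds)
```

So each run starts from a window of actual test data, rolls forward on its own predictions for prediction_len steps, and then restarts from the next actual window.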