Stock-Prediction-Models

⚠️ Data Leakage: Must not use test data when fitting MinMaxScaler()

Open shure-dev opened this issue 2 years ago • 2 comments

I think I have found a serious error.

If I'm correct, we must not use any information from the test data when preprocessing.

However, your code applies fit_transform() to both the train and test data.

This means the scaled train data can contain information from the test data (its minimum and maximum), which affects the reported accuracy.
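To make the point concrete, here is a minimal sketch of what I mean, not the repository's actual code. The price values, the `split` index, and the variable names are made up for illustration; the only real API used is scikit-learn's `MinMaxScaler` with `fit()` and `transform()`.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical closing prices, oldest first (made-up values for illustration).
close = np.array([10.0, 12.0, 11.0, 15.0, 14.0, 18.0, 20.0, 25.0]).reshape(-1, 1)

# Chronological split: time series data should not be shuffled.
split = 6
train, test = close[:split], close[split:]

# Leaky version: the min and max are computed over train AND test rows.
leaky_scaler = MinMaxScaler().fit(close)

# Leak-free version: the min and max come from the training rows only.
scaler = MinMaxScaler().fit(train)
train_scaled = scaler.transform(train)
# Test values may fall outside [0, 1]; that is expected, because the scaler
# has never seen them.
test_scaled = scaler.transform(test)
```

In the leaky version the scaler's `data_max_` is the largest test price, so every scaled training value already depends on the future.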

Please correct me if I'm wrong. Thank you.

shure-dev avatar Apr 19 '23 06:04 shure-dev

This answer seems to work well for this issue.

https://stackoverflow.com/questions/70923839/sklearn-preprocessing-with-a-rolling-window
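I am not reproducing the linked answer here. The following is only a rough sketch of the general idea its title suggests: scale each rolling window using statistics from that window alone, so no later value influences the inputs. The function name `scale_windows`, its parameters, and the sample prices are all hypothetical.

```python
def scale_windows(series, window):
    """Min-max scale each input window using only that window's own values.

    Returns a list of (scaled_window, scaled_target) pairs. The target is the
    value right after the window, scaled with the window's min and max.
    """
    pairs = []
    for end in range(window, len(series)):
        past = series[end - window:end]
        lo, hi = min(past), max(past)
        span = (hi - lo) or 1.0  # avoid division by zero on a flat window
        scaled_past = [(v - lo) / span for v in past]
        scaled_target = (series[end] - lo) / span
        pairs.append((scaled_past, scaled_target))
    return pairs


# Made-up prices, oldest first.
prices = [10.0, 12.0, 11.0, 15.0, 14.0]
pairs = scale_windows(prices, window=3)
```

The scaled target can land outside [0, 1] when the next price breaks out of the window's range, which is the same behaviour as transforming unseen test data with a scaler fit on train only.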

shure-dev avatar Apr 19 '23 17:04 shure-dev

We probably also have to take care of stationarity when working with time series data.
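One common way to reduce the trend in a price series is to model log returns instead of raw prices. This is a minimal sketch under that assumption; the helper name `log_returns` and the sample prices are hypothetical, and taking returns does not by itself guarantee stationarity.

```python
import math


def log_returns(prices):
    """Return log(p[t] / p[t-1]) for each consecutive pair of prices."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


# Made-up prices growing 10% per step: the raw series trends upward,
# but the log returns are constant.
prices = [100.0, 110.0, 121.0]
returns = log_returns(prices)
```

Each return uses only the current and previous price, so this transformation introduces no leakage from the test period.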

shure-dev avatar Apr 27 '23 16:04 shure-dev