Automated-Trader

Data Leakage

Open personal-coding opened this issue 4 years ago • 0 comments

Luke,

Your random forest is trained and evaluated on a randomized train/test split. Your technical indicators embed prior-day information in their calculations, so a randomized split leaks information between the training and test sets. The paper you cite as reference [10] has the same issue (https://www.reddit.com/r/algotrading/comments/cv83yh/overfitting).

For time series analysis, you should not use randomized splits, as that is not how data would be received in a real environment.
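
As a rough illustration of the alternative, here is a minimal sketch of a chronological split. It assumes the rows are already sorted oldest first; the function name `chronological_split`, the 20% test fraction, and the toy data are hypothetical and not taken from your repository.

```python
def chronological_split(rows, test_fraction=0.2):
    """Split time-ordered rows so every test row comes after every training row.

    Hypothetical helper: assumes `rows` is already sorted oldest first.
    """
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    if len(rows) < 2:
        raise ValueError("need at least two rows to split")
    # Hold out the most recent rows; keep at least one row on each side.
    n_test = min(len(rows) - 1, max(1, round(len(rows) * test_fraction)))
    cutoff = len(rows) - n_test
    return rows[:cutoff], rows[cutoff:]


# Toy data standing in for ten trading days, oldest first.
days = list(range(10))
train, test = chronological_split(days, test_fraction=0.2)
```

With this split the model only ever sees days that precede the test period, which is closer to how it would be used live. scikit-learn's `train_test_split(..., shuffle=False)` and `TimeSeriesSplit` cover the same idea if you would rather not write it by hand.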

personal-coding, Dec 07 '19 00:12