
_next_observation method might be looking into the future

Open renatodvc opened this issue 5 years ago • 0 comments

Hello, I'm still getting acquainted with OpenAI Gym, so I'm not entirely sure about this, but it looks like at each step the _next_observation method returns the next (future) five candles instead of the last five.

    def _next_observation(self):
        # Get the stock data points for the last 5 days and scale to between 0-1
        frame = np.array([
            self.df.loc[self.current_step: self.current_step +
                        5, 'Open'].values / MAX_SHARE_PRICE,
            self.df.loc[self.current_step: self.current_step +
                        5, 'High'].values / MAX_SHARE_PRICE,
            self.df.loc[self.current_step: self.current_step +
                        5, 'Low'].values / MAX_SHARE_PRICE,
            self.df.loc[self.current_step: self.current_step +
                        5, 'Close'].values / MAX_SHARE_PRICE,
            self.df.loc[self.current_step: self.current_step +
                        5, 'Volume'].values / MAX_NUM_SHARES,
        ])
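For reference, label-based `.loc` slicing in pandas includes both endpoints, so `self.df.loc[self.current_step: self.current_step + 5, ...]` selects six rows starting at the current step and moving forward. A minimal sketch of that behaviour, assuming a default integer index and using only the Open values from the AAPL.csv rows quoted in this issue:

```python
import pandas as pd

# Open prices from the first six rows of AAPL.csv, as quoted in this issue.
df = pd.DataFrame({"Open": [13.63, 16.50, 15.94, 18.81, 17.44, 18.12]})

current_step = 0

# Same slice shape as in _next_observation: labels current_step .. current_step + 5.
window = df.loc[current_step: current_step + 5, "Open"].values

# .loc is inclusive on both ends, so this holds rows 0 through 5 (six values),
# i.e. the current row plus the five rows that follow it.
print(len(window))
print(window.tolist())
```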

These are the first 6 rows of the dataframe (AAPL.csv):

Index Date Open High Low Close Volume
0 1998-01-02 13.63 16.25 13.50 16.25 6411700.0
1 1998-01-05 16.50 16.56 15.19 15.88 5820300.0
2 1998-01-06 15.94 20.00 14.75 18.94 16182800.0
3 1998-01-07 18.81 19.00 17.31 17.50 9300200.0
4 1998-01-08 17.44 18.62 16.94 18.19 6910900.0
5 1998-01-09 18.12 19.37 17.50 18.19 7915600.0

Setting self.current_step to 0 and printing the frame array gives:

[[0.002726, 0.0033, 0.003188, 0.003762, 0.003488, 0.003624], 
[0.00325, 0.003312, 0.004, 0.0038, 0.003724, 0.003874],
[0.0027, 0.003038, 0.00295, 0.003462, 0.003388, 0.0035],
[0.00325, 0.003176, 0.003788, 0.0035, 0.003638, 0.003638],
[0.00298568, 0.00271029, 0.0075357, 0.00433074, 0.00321814, 0.00368599]]

If we remove the normalization:

[13.63, 16.5,  15.94, 18.81, 17.44, 18.12] #Open
[16.25, 16.56, 20. , 19. , 18.62, 19.37] #High
[13.5,  15.19, 14.75, 17.31, 16.94, 17.5] #Low
[16.25, 15.88, 18.94, 17.5,  18.19, 18.19] #Close
[ 6411700. , 5820300. , 16182800. , 9300200. , 6910900. , 7915600. ] #Volume

If you compare this with the dataframe, you will see that these are the values for the current step plus the next 5 rows, i.e. future prices and volumes rather than past ones. The same logic applies to every following step, not just step 0.
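One possible way to avoid this would be to slice backwards from the current step. The sketch below is only an illustration, not the repository's code: `past_observation` and `LOOKBACK` are hypothetical names, `MAX_SHARE_PRICE = 5000` is an assumption inferred from the printed values (13.63 maps to 0.002726), and only two columns are included to keep it short. It uses position-based `.iloc`, which excludes the end of the slice, so the window holds exactly five rows strictly before `current_step`. Whether the current candle itself should be visible depends on when the trade is assumed to execute.

```python
import numpy as np
import pandas as pd

MAX_SHARE_PRICE = 5000  # assumption: inferred from 13.63 -> 0.002726 in the printed frame
LOOKBACK = 5            # hypothetical name for the window length


def past_observation(df, current_step, lookback=LOOKBACK):
    """Return scaled Open/Close values for the `lookback` rows before current_step."""
    if current_step < lookback:
        raise ValueError("not enough history before current_step")
    # .iloc is position-based and half-open: rows current_step - lookback .. current_step - 1.
    past = df.iloc[current_step - lookback: current_step]
    return np.array([
        past["Open"].values / MAX_SHARE_PRICE,
        past["Close"].values / MAX_SHARE_PRICE,
    ])


# Open and Close columns from the six AAPL.csv rows quoted in this issue.
df = pd.DataFrame({
    "Open": [13.63, 16.50, 15.94, 18.81, 17.44, 18.12],
    "Close": [16.25, 15.88, 18.94, 17.50, 18.19, 18.19],
})

obs = past_observation(df, current_step=5)
print(obs.shape)  # (2, 5)
```

With this layout the environment would need to start at a step of at least `LOOKBACK`, so that a full window of history exists.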

renatodvc, Jun 16 '19 15:06