deep_rl_trader

I think there is a look-ahead bias

mg64ve opened this issue 5 years ago · 1 comment

Hi there, nice work. However, I think there is a look-ahead bias. At every timestep you get a state, and this state includes the current close price. Then, in the step method, you calculate the profit as:

```python
self.exit_price = self.closingPrice
# reward from the current close, net of fees on both the entry and exit legs
self.reward += ((self.entry_price - self.exit_price)/self.exit_price + 1)*(1-self.fee)**2 - 1
```

In this case you are using the same information that you already used to predict the action. What do you think about it?
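To make the formula above concrete, here is a toy sanity check of the quoted reward expression (the example numbers and the helper name `short_reward` are mine, not from the repo; the formula reads like a short-side payoff, since the reward is positive when the exit price is below the entry price):

```python
def short_reward(entry_price, exit_price, fee):
    # Same expression as in the repo's step():
    # relative gain (entry - exit)/exit of a closed short,
    # with the trading fee charged on both legs via (1 - fee)**2.
    return ((entry_price - exit_price) / exit_price + 1) * (1 - fee) ** 2 - 1

# Short at 100, cover at 90, 0.05% fee per leg -> roughly +11%
r = short_reward(100.0, 90.0, 0.0005)
```

The exit price here is exactly the close price that was also part of the state the agent acted on, which is the heart of the question above.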

mg64ve avatar Aug 12 '19 06:08 mg64ve

```
state_n        <- updateState()
action_n       <- network(state_n)
reward_n       <- compute_reward(action_n, state_n)

state_n_plus_1 <- updateState()
action_n_plus_1 <- network(state_n_plus_1)
reward_n_plus_1 <- compute_reward(action_n_plus_1, state_n_plus_1)
```

The reward is computed from the current state, so there is no look-ahead bias. To put it simply: one decides to sell the stock based on past and current prices, and if one does sell, the earnings are calculated from the current price.
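The ordering described above can be sketched as a minimal loop (my own illustration with made-up prices and a stand-in policy, not the repo's code): the reward for action `a_t` is settled at the same close price the agent saw in state `s_t`, never at `s_{t+1}`.

```python
prices = [100.0, 101.0, 99.0, 102.0]  # toy close prices
entry_price = prices[0]

def compute_reward(action, current_price, entry_price):
    # Settle a "sell" at the price the agent could actually observe.
    if action == "sell":
        return (current_price - entry_price) / entry_price
    return 0.0

rewards = []
for t in range(1, len(prices)):
    state_t = prices[: t + 1]                 # past and current prices only
    current_price = state_t[-1]
    # stand-in policy: sell whenever we are in profit
    action_t = "sell" if current_price > entry_price else "hold"
    reward_t = compute_reward(action_t, current_price, entry_price)
    rewards.append(reward_t)
```

Whether this absolves the environment of look-ahead bias depends on whether the *next* decision also only sees prices up to its own timestep, which is what the loop above enforces.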

miroblog avatar Dec 31 '19 20:12 miroblog