2048_env
It doesn't seem to be a standard DDQN implementation.
https://github.com/YangRui2015/2048_env/blob/2e9b3938492e4f7a0a2b627b8607ad1c203d273a/dqn_agent.py#L96
I am new to DQN and confused by this line. It should be q_eval_4_next_state_argmax instead of q_eval_4_this_state_argmax, right?
Hi, this is 'Double DQN'. Please refer to the original paper for more details.
Thanks for your reply. But according to the DDQN algorithm from ICML 2016, I think the argmax should be computed over the next state, with the older target network then evaluating that action, rather than over the current state. The evaluation you choose comes from the current state:
q_eval_total = self.eval_net(batch_state)
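For reference, here is a minimal sketch of the target computation described in the ICML 2016 Double DQN paper, assuming PyTorch and networks named eval_net / target_net as in this repo; the helper name double_dqn_target and the omission of terminal-state masking are illustrative, not the repo's actual code:

import torch

def double_dqn_target(eval_net, target_net, batch_reward, batch_next_state, gamma):
    """Double DQN target: the online (eval) network selects the action for the
    *next* state, and the older target network evaluates that action."""
    with torch.no_grad():
        # Action selection: argmax over next-state Q-values from the online network.
        q_eval_next = eval_net(batch_next_state)                  # shape: (B, n_actions)
        next_actions = q_eval_next.argmax(dim=1, keepdim=True)    # shape: (B, 1)

        # Action evaluation: the target network scores the selected actions.
        q_target_next = target_net(batch_next_state).gather(1, next_actions)  # (B, 1)

        # Bellman target (terminal-state masking omitted for brevity).
        return batch_reward + gamma * q_target_next

The key point is that both the argmax and the evaluation use batch_next_state; only the current state's Q-values (q_eval_total in the repo) are needed for the loss against the chosen actions.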
I think you are right. I will check and revise the code later. You can use the DQN method without the 'Double' trick. Thanks!