MountainCar_DQN_RND

I have an idea. But I don't know whether it will work.

Open 514123661 opened this issue 4 years ago • 0 comments

I think the value RND returns as an intrinsic reward does not work well in DQN. I think it could be used to select actions instead: make the dimension of RND's output equal to the action dimension, so it gives one value per action. Then, in DQN, you can change the action-selection rule to use those values and get exploration that way.
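To make the idea concrete, here is a rough sketch of how it might look. It is not code from this repo, and every name in it (`ActionRND`, `novelty`, `select_action`, `beta`) is made up for illustration. It assumes the per-action squared prediction error is added to the Q-values as a bonus at selection time, and that the predictor is trained only on the output of the action that was actually taken. A linear predictor in NumPy stands in for the network to keep it short.

```python
import numpy as np


class ActionRND:
    """Hypothetical RND variant whose output has one entry per action."""

    def __init__(self, state_dim, n_actions, hidden=32, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed, randomly initialised target network (never trained).
        self.A = rng.normal(size=(hidden, state_dim))
        self.B = rng.normal(size=(n_actions, hidden))
        # Trainable predictor: linear in the state plus a bias term (assumption).
        self.W = np.zeros((n_actions, state_dim + 1))
        self.lr = lr

    def _target(self, s):
        return self.B @ np.tanh(self.A @ s)

    def _predict(self, s):
        return self.W @ np.append(s, 1.0)

    def novelty(self, s):
        # Squared prediction error, one value per action.
        return (self._predict(s) - self._target(s)) ** 2

    def update(self, s, a):
        # Gradient step on the squared error of the taken action only,
        # so the error of untried actions stays high.
        x = np.append(s, 1.0)
        residual = self._predict(s)[a] - self._target(s)[a]
        self.W[a] -= self.lr * 2.0 * residual * x


def select_action(q_values, novelty, beta=1.0):
    # Greedy choice over Q-values plus the scaled novelty bonus.
    return int(np.argmax(q_values + beta * novelty))
```

In the DQN loop this would replace epsilon-greedy: call `select_action(q_net(s), rnd.novelty(s))` to pick the action, then `rnd.update(s, a)` after the step. Whether the bonus scale `beta` needs to be annealed is something I have not tested.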

514123661 Apr 26 '20 07:04