MountainCar_DQN_RND I have an idea. But I don't know whether it will work.

Open 514123661 opened this issue 4 years ago • 0 comments

I don't think the value RND returns works well as an intrinsic reward in DQN. I think it could be used to select actions instead: make the dimension of RND's output equal to the action dimension, and then change DQN's action-selection algorithm to use that output to drive exploration.
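To make the idea concrete, here is a rough sketch of what I mean. It is not code from this repo. The names `PerActionRND` and `select_action`, the bonus weight `beta`, and the network sizes are all made up for illustration. To keep it short, the predictor trains only its output layer, and the novelty score is simply added to the Q-values before the argmax. Whether this actually explores better than the intrinsic-reward version is exactly what I am unsure about.

```python
import numpy as np


class PerActionRND:
    """Hypothetical RND variant whose output has one entry per action."""

    def __init__(self, state_dim, n_actions, hidden=32, lr=0.01, seed=0):
        rng = np.random.default_rng(seed)
        # Target network: randomly initialised and never trained.
        self.t1 = rng.normal(size=(state_dim, hidden))
        self.t2 = rng.normal(size=(hidden, n_actions))
        # Predictor network: only the output layer p2 is trained here
        # (a simplification for this sketch, not standard RND).
        self.p1 = rng.normal(size=(state_dim, hidden))
        self.p2 = rng.normal(size=(hidden, n_actions))
        self.lr = lr

    def _error(self, state):
        # Per-action prediction error and the predictor's hidden features.
        target = np.tanh(state @ self.t1) @ self.t2
        features = np.tanh(state @ self.p1)
        return features @ self.p2 - target, features

    def novelty(self, state):
        # Squared error per action, shape (n_actions,).
        err, _ = self._error(state)
        return err ** 2

    def update(self, state, action):
        # One gradient step on the squared error of the taken action only,
        # so that (state, action) pair looks less novel next time.
        err, features = self._error(state)
        self.p2[:, action] -= self.lr * 2.0 * err[action] * features


def select_action(q_values, state, rnd, beta=1.0):
    # Greedy over Q-values plus a per-action novelty bonus (assumed rule).
    return int(np.argmax(q_values + beta * rnd.novelty(state)))
```

In the DQN loop you would call `select_action(q_values, state, rnd)` in place of epsilon-greedy, and `rnd.update(state, action)` after each step.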

Apr 26 '20 07:04 514123661
