pytorch-dqn
Fixes for issues #3, #5 and #7. Agent learns better
Below is a summary of the contributions made by this PR:
- [x] Fixes for issues #3, #5 and #7
- [x] Overcomes the limitation described in #4. This PR enables the agent to collect more rewards with minimal changes to the original scripts.
- [x] Tested with PyTorch 0.2 (as requested in #5)
Thanks for your contribution! It works on PyTorch 0.2 👍
BTW, can I ask why you use gradient clipping? Does it matter for performance? Thanks for your code again :)
Glad to hear that my contributions helped you.
Clipping the gradient makes sure the gradients don't "explode", a common problem when training neural networks with gradient descent. In the DQN setting, gradient clipping ensures that the optimizer only takes small (in magnitude) steps in the direction of the gradient. A large descent step, and hence a big update to the Q-value function approximation, could throw the approximation away from (converging to) the optimal values.
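If it helps, here is a minimal sketch of where the clipping step sits in a PyTorch training loop. The network, optimizer, and loss below are placeholders written against the current PyTorch API, not the exact code from this repo:

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Illustrative Q-network and optimizer; the names and sizes are placeholders.
q_network = nn.Linear(4, 2)                         # e.g. 4-dim state -> 2 actions
optimizer = optim.RMSprop(q_network.parameters(), lr=1e-3)

state = torch.randn(32, 4)                          # dummy batch of states
target = torch.randn(32, 2)                         # dummy TD targets for illustration

loss = nn.functional.smooth_l1_loss(q_network(state), target)  # Huber loss

optimizer.zero_grad()
loss.backward()

# Clip the gradients before the optimizer step so that one large TD error
# cannot produce a huge update to the Q-value approximation.
for param in q_network.parameters():
    param.grad.data.clamp_(-1, 1)                   # element-wise clamp, common in DQN code

# Alternative: rescale the overall gradient norm instead of clamping element-wise.
# torch.nn.utils.clip_grad_norm_(q_network.parameters(), max_norm=10)

optimizer.step()
```

Either variant keeps each update small in magnitude; the element-wise clamp is the one I see most often in DQN implementations.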
Hope the explanation helps.