pytorch-rl 3 - Advantage Actor Critic (A2C) [CartPole].ipynb

3 - Advantage Actor Critic (A2C) [CartPole].ipynb - Returns do not need to be detached

Open nimrare opened this issue 4 years ago • 0 comments

Hi Ben

Thanks for the interesting notebooks. While studying the "3 - Advantage Actor Critic (A2C) [CartPole].ipynb" notebook, I came to the conclusion that detaching the returns in the update_policy() function is not necessary. The returns are computed only from the rewards, which are environment outputs and therefore not part of the computational graph. So leaving out the .detach() call should not affect the model. Would you agree?
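A minimal sketch of the reasoning, not the notebook's actual code: the helper `calculate_returns`, the reward values, and the `values` tensor standing in for critic predictions are all hypothetical. It assumes the rewards arrive as plain Python floats from the environment, so the returns tensor built from them has no autograd history.

```python
import torch

def calculate_returns(rewards, discount_factor, normalize=True):
    # Hypothetical helper: discounted returns built purely from rewards.
    returns = []
    R = 0.0
    for r in reversed(rewards):
        R = r + discount_factor * R
        returns.insert(0, R)
    returns = torch.tensor(returns)
    if normalize:
        returns = (returns - returns.mean()) / returns.std()
    return returns

# Rewards are environment outputs (plain floats), not model outputs.
rewards = [1.0, 1.0, 1.0, 1.0]
returns = calculate_returns(rewards, discount_factor=0.99)

print(returns.requires_grad)  # False: nothing to backpropagate through
print(returns.grad_fn)        # None: not attached to any graph

# Stand-in for critic predictions, which ARE part of the graph.
values = torch.randn(4, requires_grad=True)
advantages = returns - values
print(advantages.requires_grad)  # True
```

Under these assumptions, calling `.detach()` on `returns` is a no-op as far as gradients are concerned. Anything derived from the critic's predictions, such as `advantages` in this sketch, does carry gradients, so the same argument would not extend to it.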

Feb 04 '21 14:02 nimrare
