jakkarn
I just found out that the `state_dict` contains the gradients, so they should at least be partially reset when the global `state_dict` (with new gradients) is loaded into the local nn....
This should be the if-condition in the `Worker` that determines when N actions have been taken (or the episode has terminated): https://github.com/MorvanZhou/pytorch-A3C/blob/5ab27abee2c3ac3ca921ac393bfcbda4e0a91745/discrete_A3C.py#L93 The N-step loss is then computed in `push_and_pull`, in this loop: https://github.com/MorvanZhou/pytorch-A3C/blob/5ab27abee2c3ac3ca921ac393bfcbda4e0a91745/utils.py#L29
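The idea of that loop can be sketched as a backward pass over the buffered rewards, bootstrapping from the value of the last state (the function name and signature here are mine, not the repo's):

```python
import torch

def n_step_returns(rewards, v_next, gamma=0.9):
    """Discounted n-step returns, computed by iterating backwards
    over the buffered rewards. v_next is the bootstrap value:
    0.0 if the episode terminated, else V(s_{t+N})."""
    returns = []
    v_s_ = v_next
    for r in rewards[::-1]:
        v_s_ = r + gamma * v_s_  # R_t = r_t + gamma * R_{t+1}
        returns.append(v_s_)
    returns.reverse()  # restore chronological order
    return torch.tensor(returns)
```

Each entry then serves as the target for the critic and as the baseline-corrected signal for the actor in the joint loss.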
The sum of the gradients is the same as the gradient of the sum.
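That linearity is easy to check in PyTorch itself. This toy example (variable names are mine) compares backpropagating one summed loss against accumulating two separate backward passes in `.grad`:

```python
import torch

w = torch.tensor([2.0], requires_grad=True)
x1, x2 = torch.tensor([3.0]), torch.tensor([5.0])

# Gradient of the sum: one backward pass over the combined loss.
loss = (w * x1).sum() + (w * x2).sum()
loss.backward()
grad_of_sum = w.grad.clone()

# Sum of gradients: two backward passes; PyTorch accumulates in w.grad.
w.grad = None
(w * x1).sum().backward()
(w * x2).sum().backward()
sum_of_grads = w.grad.clone()

# Both equal x1 + x2.
assert torch.allclose(grad_of_sum, sum_of_grads)
```

This is why accumulating each worker's N per-step gradients is equivalent to computing one gradient of the summed N-step loss, as `push_and_pull` does.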
I would suggest reading the paper linked in the Readme: https://arxiv.org/pdf/1602.01783.pdf. It mentions the lock-free Hogwild! approach, which can make training more efficient. It's probably good for some problems, and...
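In PyTorch, the Hogwild! pattern usually means placing the global model's parameters in shared memory and letting each worker step the shared optimizer without any lock. A minimal sketch under that assumption (the model, `train` function, and hyperparameters are illustrative, not the repo's code):

```python
import torch
import torch.multiprocessing as mp

def train(shared_model):
    """One lock-free worker update: compute gradients on a local copy,
    write them into the shared model, and step (Hogwild! style)."""
    local = torch.nn.Linear(4, 2)
    local.load_state_dict(shared_model.state_dict())  # sync local <- global
    opt = torch.optim.SGD(shared_model.parameters(), lr=0.01)
    x = torch.randn(8, 4)
    loss = local(x).pow(2).mean()  # placeholder loss, not the A3C loss
    opt.zero_grad()
    loss.backward()
    # Push local gradients to the shared parameters, then update them.
    for lp, gp in zip(local.parameters(), shared_model.parameters()):
        gp.grad = lp.grad
    opt.step()

if __name__ == "__main__":
    model = torch.nn.Linear(4, 2)
    model.share_memory()  # parameters now live in shared memory
    procs = [mp.Process(target=train, args=(model,)) for _ in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```

The workers can overwrite each other's updates between `zero_grad` and `step`, which Hogwild! tolerates on the grounds that conflicting writes are rare enough in practice not to hurt convergence.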