jakkarn
I just found out that the `state_dict` contains the gradients, so they should at least be partially reset when the global `state_dict` (with new gradients) is loaded into the local nn....
This should be the if-condition in the `Worker` that determines when N actions have been taken (or the episode has terminated): https://github.com/MorvanZhou/pytorch-A3C/blob/5ab27abee2c3ac3ca921ac393bfcbda4e0a91745/discrete_A3C.py#L93 The N-step loss is then computed in `push_and_pull`, in this loop: https://github.com/MorvanZhou/pytorch-A3C/blob/5ab27abee2c3ac3ca921ac393bfcbda4e0a91745/utils.py#L29
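The idea of that loop can be sketched as a backward pass over the buffered rewards, bootstrapping from the value of the last state (the function name and signature here are mine, not the repo's):

```python
import torch

def n_step_returns(rewards, v_next, gamma=0.9):
    """Discounted n-step returns, computed by iterating backwards
    over the buffered rewards. v_next is the bootstrap value:
    0.0 if the episode terminated, else V(s_{t+N})."""
    returns = []
    v_s_ = v_next
    for r in rewards[::-1]:
        v_s_ = r + gamma * v_s_  # R_t = r_t + gamma * R_{t+1}
        returns.append(v_s_)
    returns.reverse()  # restore chronological order
    return torch.tensor(returns)
```

Each entry then serves as the target for the critic and as the baseline-corrected signal for the actor in the joint loss.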
The sum of the gradients is the same as the gradient of the sum.
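That linearity is easy to check in PyTorch itself. This toy example (variable names are mine) compares backpropagating one summed loss against accumulating two separate backward passes in `.grad`:

```python
import torch

w = torch.tensor([2.0], requires_grad=True)
x1, x2 = torch.tensor([3.0]), torch.tensor([5.0])

# Gradient of the sum: one backward pass over the combined loss.
loss = (w * x1).sum() + (w * x2).sum()
loss.backward()
grad_of_sum = w.grad.clone()

# Sum of gradients: two backward passes; PyTorch accumulates in w.grad.
w.grad = None
(w * x1).sum().backward()
(w * x2).sum().backward()
sum_of_grads = w.grad.clone()

# Both equal x1 + x2.
assert torch.allclose(grad_of_sum, sum_of_grads)
```

This is why accumulating each worker's N per-step gradients is equivalent to computing one gradient of the summed N-step loss, as `push_and_pull` does.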
I would suggest reading the paper linked in the Readme: https://arxiv.org/pdf/1602.01783.pdf. It mentions the lock-free Hogwild! approach, which can make training more efficient. It's probably good for some problems, and...
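In PyTorch, the Hogwild! pattern usually means placing the global model's parameters in shared memory and letting each worker step the shared optimizer without any lock. A minimal sketch under that assumption (the model, `train` function, and hyperparameters are illustrative, not the repo's code):

```python
import torch
import torch.multiprocessing as mp

def train(shared_model):
    """One lock-free worker update: compute gradients on a local copy,
    write them into the shared model, and step (Hogwild! style)."""
    local = torch.nn.Linear(4, 2)
    local.load_state_dict(shared_model.state_dict())  # sync local <- global
    opt = torch.optim.SGD(shared_model.parameters(), lr=0.01)
    x = torch.randn(8, 4)
    loss = local(x).pow(2).mean()  # placeholder loss, not the A3C loss
    opt.zero_grad()
    loss.backward()
    # Push local gradients to the shared parameters, then update them.
    for lp, gp in zip(local.parameters(), shared_model.parameters()):
        gp.grad = lp.grad
    opt.step()

if __name__ == "__main__":
    model = torch.nn.Linear(4, 2)
    model.share_memory()  # parameters now live in shared memory
    procs = [mp.Process(target=train, args=(model,)) for _ in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```

The workers can overwrite each other's updates between `zero_grad` and `step`, which Hogwild! tolerates on the grounds that conflicting writes are rare enough in practice not to hurt convergence.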