hindsight-experience-replay

Do all processes update the network and then sync the gradients?

Open nizhihao opened this issue 3 years ago • 1 comment

Hi, I have a question. In this distributed RL setup, because of OS scheduling, will every process be in nearly the same state and do the same thing? So do all processes compute their updates and then sync the gradients together, like a synchronous algorithm such as A2C? Thanks very much.

nizhihao, May 18 '21 13:05

Yes, it can be regarded as using a very large batch size for the training.
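
To illustrate the "very large batch" view, here is a minimal sketch in plain Python. It is not code from this repository: the toy linear model, the `mse_grad` and `sample_batch` helpers, and the `allreduce_mean` stand-in for a real all-reduce are all hypothetical names made up for the example. It assumes every worker holds identical parameters and uses the same local batch size, and checks that averaging the per-worker gradients gives the same result as one gradient over the combined batch.

```python
import random


def mse_grad(w, b, batch):
    """Gradient of the mean squared error of y ~ w*x + b over one batch."""
    n = len(batch)
    gw = sum(2.0 * (w * x + b - y) * x for x, y in batch) / n
    gb = sum(2.0 * (w * x + b - y) for x, y in batch) / n
    return gw, gb


def allreduce_mean(grads):
    """Hypothetical stand-in for an all-reduce: average gradients over workers."""
    k = len(grads)
    return tuple(sum(g[i] for g in grads) / k for i in range(len(grads[0])))


def sample_batch(rng, n):
    """Draw n noisy samples from the toy target y = 3x + 1."""
    batch = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        batch.append((x, 3.0 * x + 1.0 + rng.gauss(0.0, 0.1)))
    return batch


rng = random.Random(0)
num_workers, local_batch = 4, 8
w, b = 0.5, -0.1  # assumed identical on every worker before the step

# Each worker computes a gradient on its own local batch.
worker_batches = [sample_batch(rng, local_batch) for _ in range(num_workers)]
local_grads = [mse_grad(w, b, batch) for batch in worker_batches]

# Synchronization step: every worker ends up with the averaged gradient.
synced_grad = allreduce_mean(local_grads)

# Single-process reference: one gradient over all samples at once.
big_batch = [sample for batch in worker_batches for sample in batch]
big_grad = mse_grad(w, b, big_batch)

max_diff = max(abs(s - g) for s, g in zip(synced_grad, big_grad))
print(max_diff < 1e-12)
```

The equality relies on the equal local batch sizes: with unequal sizes, a plain average of per-worker gradients would weight samples differently from the combined batch.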

TianhongDai, May 20 '21 18:05