Reinforcement-Learning-Pytorch-Cartpole Why compute action from target net rather than online net?

Open jsrimr opened this issue 4 years ago • 0 comments

https://github.com/g6ling/Reinforcement-Learning-Pytorch-Cartpole/blob/ecb7b622cfefe825ac95388cceb6752413d90a2a/POMDP/4-R2D2-Single/train.py#L76

Another question: why do you store only the hidden state from the target net, and not the one from the online net?

Jul 09 '21 06:07 jsrimr