Asynchronous-Methods-for-Deep-Reinforcement-Learning

Implement the actor-critic methods

Open originholic opened this issue 9 years ago • 1 comment

Hello, in the asynchronous DQN paper, they also describe an on-policy method, asynchronous advantage actor-critic (A3C), which achieved better results than the other methods. Do you currently have any plans to include this method in this repo as well? I am using this repo as a starting point to try to reproduce the A3C results in the continuous action domain, but I am still trying to figure out the network model they used for the physical-state case when applied to MuJoCo, and how the policy gradient is accumulated.

Apr 10 '16 00:04 originholic

No, originholic. I'm working on other things right now :(.

Maybe in the future I will try the advantage actor-critic, but not now. I'm sorry.

Regards. Samu.

Apr 10 '16 07:04 Zeta36
