DeepMimic
DeepMimic copied to clipboard
Is DeepMimic be trained using A3C or A2C?
A3C: aka Asynchronous Advantage Actor Critic
It uses MPI, so I wonder if DeepMimic be trained using A3C?
Neither, we are using PPO for training. The implementation with MPI is using synchronous updates, so it's more akin to A2C.