pytorchrl
Deep Reinforcement Learning algorithms implemented in PyTorch
It is now worth considering a migration to PyTorch 0.4.0, as it introduces several significant changes.
The DDPG code is really slow on a Linux machine with 12 threads: training 1 epoch (10000 steps) takes 56 seconds, which is the same as using just one thread...
The Pearlmutter method only gives a "good" value of the Hessian-vector product in the first two iterations of the conjugate gradient loop.
The conjugate gradient code raises a divide-by-zero error when using numpy 1.13.3 from environmental.yaml on Mac OS with Anaconda, so we have to use 1.13.1. But 1.13.1 will raise...
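For the two conjugate gradient issues above, the following is a minimal pure-Python sketch, not the code in this repository. It assumes the solver receives the Hessian-vector product as a callback (here called `hvp`) and shows one possible way to avoid dividing by zero: stop as soon as the residual or the curvature term `p·Ap` is numerically zero. The helper `finite_difference_hvp` is a hypothetical cross-check that can be compared against a Pearlmutter-style product at each iteration. All names, tolerances, and the 2x2 test matrix are assumptions made for illustration.

```python
def dot(a, b):
    """Inner product of two equal-length sequences of floats."""
    return sum(x * y for x, y in zip(a, b))


def conjugate_gradient(hvp, b, iters=10, tol=1e-10, eps=1e-12):
    """Approximately solve H x = b, where hvp(v) returns H v.

    Both divisions are guarded: the loop exits before dividing when the
    squared residual is below `tol` or the curvature p.Ap is below `eps`.
    """
    x = [0.0] * len(b)
    r = list(b)          # residual b - H x, with x = 0 initially
    p = list(b)          # search direction
    rdotr = dot(r, r)
    for _ in range(iters):
        if rdotr < tol:
            break        # converged; also prevents 0/0 in beta
        Ap = hvp(p)
        pAp = dot(p, Ap)
        if abs(pAp) < eps:
            break        # curvature vanished; alpha would divide by zero
        alpha = rdotr / pAp
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        new_rdotr = dot(r, r)
        beta = new_rdotr / rdotr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rdotr = new_rdotr
    return x


def finite_difference_hvp(grad_fn, x, v, step=1e-5):
    """Central-difference estimate of H v from a gradient function.

    Useful only as a reference value to compare another HVP against.
    """
    x_plus = [xi + step * vi for xi, vi in zip(x, v)]
    x_minus = [xi - step * vi for xi, vi in zip(x, v)]
    g_plus = grad_fn(x_plus)
    g_minus = grad_fn(x_minus)
    return [(gp - gm) / (2.0 * step) for gp, gm in zip(g_plus, g_minus)]


# Illustrative quadratic f(x) = 0.5 * x^T A x, whose Hessian is A.
A = [[4.0, 1.0], [1.0, 3.0]]


def matvec(v):
    return [dot(row, v) for row in A]


solution = conjugate_gradient(matvec, [1.0, 2.0])
reference = finite_difference_hvp(matvec, [0.3, -0.2], [1.0, 0.5])
```

Here `matvec` plays both roles: it is the exact Hessian-vector product passed to the solver, and it is the gradient of the quadratic passed to the finite-difference check. In a real policy-optimization setting the callback would instead wrap the automatic-differentiation product, and a large gap between it and the finite-difference reference after the first iterations would point at the product rather than at the solver.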