Pytorch-DPPO
on advantages
After testing your PPO and comparing it with another implementation, I think your advantages need to be normalized: (advantages - advantages.mean()) / advantages.std(). For your reference.
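A minimal sketch of that normalization in PyTorch, for illustration only. The helper name `normalize_advantages`, the sample values, and the small `eps` term (added to avoid division by zero when all advantages are equal) are assumptions and are not taken from the repository.

```python
import torch


def normalize_advantages(advantages: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Rescale a batch of advantages to zero mean and roughly unit std."""
    # Tensor.std() uses the unbiased (n - 1) estimator by default.
    # eps is an assumed safeguard against a zero standard deviation.
    return (advantages - advantages.mean()) / (advantages.std() + eps)


# Hypothetical advantage estimates for one batch.
advantages = torch.tensor([0.5, -1.2, 3.0, 0.1, -0.4])
normalized = normalize_advantages(advantages)
```

The normalized tensor would then be used in place of the raw advantages when computing the PPO policy loss. Whether to normalize per batch or over the whole rollout is a design choice this sketch does not settle.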
Thanks for the notification, I will try this normalization. May I ask which implementation you compared against?