Proximal-Policy-Optimization-Pytorch

Proximal Policy Optimization (PPO) algorithm and its distributed implementation in PyTorch

Results 1 Proximal-Policy-Optimization-Pytorch issues

Hi, why do you use the maximum instead of the minimum when clipping the value function loss? Suppose clipping occurs when v_pred_old < v_clipped < v_pred < R, or the reverse ordering; the loss will...
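To make the case in the question concrete, here is a minimal scalar sketch of a clipped value loss. It assumes the common formulation in which the new prediction is clipped to within `eps` of the old one and both squared errors are computed; the function name `clipped_value_losses`, the clip range `eps = 0.25`, and the sample values are hypothetical and are not taken from this repository.

```python
def clipped_value_losses(v_pred, v_pred_old, ret, eps):
    """Return (unclipped, clipped) squared-error value losses for one sample.

    Assumed formulation: v_clipped = v_pred_old + clip(v_pred - v_pred_old, -eps, eps).
    """
    delta = max(-eps, min(eps, v_pred - v_pred_old))
    v_clipped = v_pred_old + delta
    loss_unclipped = (v_pred - ret) ** 2
    loss_clipped = (v_clipped - ret) ** 2
    return loss_unclipped, loss_clipped


# Hypothetical values matching the ordering v_pred_old < v_clipped < v_pred < R:
# v_pred_old = 0.0, v_clipped = 0.25, v_pred = 0.5, R = 1.0
unclipped, clipped = clipped_value_losses(v_pred=0.5, v_pred_old=0.0, ret=1.0, eps=0.25)

loss_max = max(unclipped, clipped)  # the variant the question asks about
loss_min = min(unclipped, clipped)  # the alternative the question proposes

print(loss_max, loss_min)  # 0.5625 0.25
```

With these numbers the maximum selects the clipped term and the minimum selects the unclipped term, which is the situation the question describes.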