baselines

policy entropy in PPO2

Open akhilsanand opened this issue 7 years ago • 1 comment

Hi, when applying PPO2 to a custom MuJoCo environment, the policy entropy increases continuously, even with a small entropy coefficient of 0.01 or less. As I understand it, the policy entropy should ideally decrease over time. What could be the issue? Also, are there any suggestions for managing the entropy coefficient to encourage sufficient exploration?

Regards,

akhilsanand, Jul 18 '18 20:07

Hi akhilsanand, have you found a solution for this?

Lewisracing, Nov 08 '20 20:11
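
For anyone reading the question above, a minimal sketch of the two quantities it mentions may help. It assumes a diagonal Gaussian policy, which is the usual choice for continuous MuJoCo action spaces. For such a policy the entropy depends only on the log standard deviations, so a steadily rising entropy means the learned standard deviation is growing. The names `diag_gaussian_entropy` and `linear_ent_coef` are hypothetical helpers written for illustration and are not part of the baselines API. The annealing schedule is only one possible way to manage the coefficient, not something the thread confirms as a fix.

```python
import math

import numpy as np


def diag_gaussian_entropy(logstd):
    """Entropy of a diagonal Gaussian with per-dimension log std `logstd` (1-D).

    H = sum_i (logstd_i + 0.5 * log(2 * pi * e))
    The mean does not appear, so only the std controls the entropy.
    """
    logstd = np.asarray(logstd, dtype=np.float64)
    return float(np.sum(logstd + 0.5 * math.log(2.0 * math.pi * math.e)))


def linear_ent_coef(progress_remaining, start=0.01, end=0.0):
    """Hypothetical schedule: anneal the entropy coefficient linearly.

    `progress_remaining` goes from 1.0 (start of training) to 0.0 (end).
    Values outside [0, 1] are clamped.
    """
    frac = min(max(progress_remaining, 0.0), 1.0)
    return end + frac * (start - end)


if __name__ == "__main__":
    # Two action dimensions, std = 1 (logstd = 0) versus a larger std.
    print(diag_gaussian_entropy([0.0, 0.0]))
    print(diag_gaussian_entropy([0.5, 0.5]))
    # Coefficient at the start, midpoint, and end of training.
    print([linear_ent_coef(p) for p in (1.0, 0.5, 0.0)])
```

Logging the policy's log standard deviation alongside the entropy is a cheap way to check whether the increase comes from the entropy bonus or from the policy gradient itself pushing the std up.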