Sean Zixuan Chen
When training ppo2 in a MuJoCo environment, I find that the episode reward reported in infos['episode']['r'] does not equal the sum of the per-step rewards. In the Humanoid environment, summing up...
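To make the comparison concrete, here is a minimal, dependency-free sketch of one way such a mismatch can arise. It is an assumption, not something the post confirms: it supposes that an episode monitor records raw rewards and that an outer wrapper then rescales the rewards the training loop sees. `ToyEnv`, `EpisodeMonitor`, and `ScaleReward` are hypothetical stand-ins written for this illustration, not classes from any real library.

```python
import random


class ToyEnv:
    """Hypothetical stand-in for a MuJoCo task: fixed horizon, random rewards."""

    def __init__(self, horizon=5, seed=0):
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0

    def step(self, action):
        self.t += 1
        reward = self.rng.uniform(0.0, 1.0)
        done = self.t >= self.horizon
        return 0.0, reward, done, {}


class EpisodeMonitor:
    """Sums raw rewards and reports them as info['episode']['r'] at episode end."""

    def __init__(self, env):
        self.env = env
        self.total = 0.0

    def reset(self):
        self.total = 0.0
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self.total += reward
        if done:
            info = dict(info, episode={"r": self.total})
        return obs, reward, done, info


class ScaleReward:
    """Assumed outer wrapper: rescales rewards after the monitor has seen them."""

    def __init__(self, env, scale=0.1):
        self.env = env
        self.scale = scale

    def reset(self):
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return obs, reward * self.scale, done, info


def run_episode(env):
    """Return (sum of per-step rewards seen by the caller, info['episode']['r'])."""
    env.reset()
    summed, done, info = 0.0, False, {}
    while not done:
        _, reward, done, info = env.step(None)
        summed += reward
    return summed, info["episode"]["r"]


# Monitor alone: the two numbers agree.
plain_sum, plain_reported = run_episode(EpisodeMonitor(ToyEnv()))

# Monitor inside a reward-scaling wrapper: the two numbers differ by the scale.
summed, reported = run_episode(ScaleReward(EpisodeMonitor(ToyEnv())))
print(plain_sum, plain_reported, summed, reported)
```

If this is what is happening, the per-step sum and the reported episode reward differ only by whatever transformation the outer wrapper applies, so checking the wrapper order of the environment is a reasonable first diagnostic.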