Sean Zixuan Chen

Results: 1 issue by Sean Zixuan Chen

When training ppo2 in a MuJoCo environment, I find that the episode reward reported in infos['episode']['r'] doesn't equal the sum of the per-step rewards. In the Humanoid environment, summing up...
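The excerpt is cut off before any cause is given, so the following is only a sketch of one way such a mismatch can arise. It assumes that the episode return is recorded on the raw rewards by an inner wrapper and that an outer wrapper rescales the rewards before the training loop sees them. `ToyEnv`, `EpisodeMonitor`, `ScaleReward`, and `run_episode` are hypothetical names written for this illustration; they are not the actual baselines or gym classes.

```python
import random


class ToyEnv:
    """Hypothetical stand-in for a MuJoCo task: fixed-length episodes, random rewards."""

    def __init__(self, horizon=5, seed=0):
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0

    def step(self, action):
        self.t += 1
        reward = self.rng.uniform(0.0, 10.0)
        done = self.t >= self.horizon
        return 0.0, reward, done, {}


class EpisodeMonitor:
    """Hypothetical wrapper: stores the raw episode return in info['episode']['r']."""

    def __init__(self, env):
        self.env = env
        self.rewards = []

    def reset(self):
        self.rewards = []
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self.rewards.append(reward)
        if done:
            info = dict(info, episode={"r": sum(self.rewards), "l": len(self.rewards)})
        return obs, reward, done, info


class ScaleReward:
    """Assumed transformation: rescales rewards after the monitor has recorded them."""

    def __init__(self, env, scale=0.1):
        self.env = env
        self.scale = scale

    def reset(self):
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return obs, reward * self.scale, done, info


def run_episode(env):
    """Run one episode; return (sum of observed step rewards, logged episode return)."""
    env.reset()
    step_sum, done, info = 0.0, False, {}
    while not done:
        _, reward, done, info = env.step(None)
        step_sum += reward
    return step_sum, info["episode"]["r"]


if __name__ == "__main__":
    raw_sum, raw_logged = run_episode(EpisodeMonitor(ToyEnv()))
    scaled_sum, scaled_logged = run_episode(ScaleReward(EpisodeMonitor(ToyEnv())))
    print("monitor only:   ", raw_sum, raw_logged)
    print("monitor + scale:", scaled_sum, scaled_logged)
```

With only the monitor in place, the two numbers agree up to floating-point rounding. Once the rewards are rescaled outside the monitor, the summed step rewards differ from the logged return by the scale factor. Whether this wrapper ordering is what happens in the reported ppo2 run is an assumption that would have to be checked against the actual environment setup.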