Gym.jl

PendulumEnv does not use clamped torque

Open darsnack opened this issue 6 years ago • 1 comment

PendulumEnv computes a clamped torque but then uses the unclamped torque in the subsequent calculations. That is, we calculate:

v = clamp.(u, -env.max_torque, env.max_torque)

but v is never used in any of the following lines; u is used directly instead.
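To illustrate what using the clamped value would look like, here is a minimal sketch modeled on the classic pendulum dynamics. It is not Gym.jl's actual code: `ToyPendulum`, `next_thdot`, and every field other than `max_torque` are hypothetical names chosen for this example, the constants are assumed, and a scalar torque is used instead of the broadcasted `clamp.` above.

```julia
# Hypothetical stand-in for the environment (not Gym.jl's PendulumEnv).
struct ToyPendulum
    max_torque::Float64  # torque limit, as in env.max_torque
    g::Float64           # gravity (assumed value)
    m::Float64           # mass (assumed value)
    l::Float64           # length (assumed value)
    dt::Float64          # time step (assumed value)
end

ToyPendulum() = ToyPendulum(2.0, 10.0, 1.0, 1.0, 0.05)

# One Euler update of the angular velocity that uses the clamped torque v,
# so requests outside [-max_torque, max_torque] cannot affect the dynamics.
function next_thdot(env::ToyPendulum, th, thdot, u)
    v = clamp(u, -env.max_torque, env.max_torque)
    accel = -3 * env.g / (2 * env.l) * sin(th + pi) + 3 / (env.m * env.l^2) * v
    return thdot + accel * env.dt
end

env = ToyPendulum()
# An out-of-range request (10.0) should behave exactly like the limit (2.0).
println(next_thdot(env, 0.0, 0.0, 10.0) == next_thdot(env, 0.0, 0.0, 2.0))
```

If `u` were used in place of `v` in the `accel` line, the two calls would give different results, which is the behaviour this issue describes.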

darsnack May 05 '19 04:05

Thanks for pointing this out. I tried changing it and running the examples from model-zoo, and with that change the model has difficulty learning. The gradients vanish because of the clamp, which may be why v was never used. I noticed that even without v, the model still learns to output values within the given torque range. I'm experimenting with workarounds to get it working with v.
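The vanishing gradient can be seen without any AD package: outside the allowed range, `clamp` is flat, so its slope with respect to the input is zero. The sketch below estimates that slope with a central finite difference; `finite_diff`, `clamped`, and the torque limit of 2.0 are assumptions made for this illustration and are not taken from Gym.jl or model-zoo.

```julia
# Central finite-difference estimate of df/dx (illustrative helper).
finite_diff(f, x; h=1e-6) = (f(x + h) - f(x - h)) / (2h)

const MAX_TORQUE = 2.0  # assumed limit for illustration
clamped(u) = clamp(u, -MAX_TORQUE, MAX_TORQUE)

# Inside the range the clamp is the identity, so the slope is about 1.
println(finite_diff(clamped, 0.5))
# Outside the range the output is constant, so the slope is 0 and no
# learning signal flows back to whatever produced u.
println(finite_diff(clamped, 5.0))
```

This matches the observation above: once the policy outputs a torque beyond the limit, the clamped path gives it no gradient to move back into range.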

tejank10 May 05 '19 11:05