PendulumEnv does not use clamped torque

Open darsnack opened this issue 6 years ago • 1 comments

PendulumEnv calculates a clamped torque but then uses the unclamped torque in subsequent calculations. That is, we calculate:

v = clamp.(u, -env.max_torque, env.max_torque)

but none of the following lines use v; they all use u directly.
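For illustration, here is a minimal sketch of what a step that actually uses v might look like. This is not the real PendulumEnv code: the `PendulumParams` struct, its field names other than `max_torque`, the `step_clamped` function, and the simplified dynamics (modeled on the classic Gym pendulum, without the speed limit) are all assumptions. It uses scalar `clamp` rather than the broadcast `clamp.` to keep the example small.

```julia
# Hypothetical stand-in for the environment's parameters. Only max_torque
# appears in the issue; the other fields are assumptions for this sketch.
struct PendulumParams
    max_torque::Float64
    g::Float64   # gravitational acceleration
    m::Float64   # pendulum mass
    l::Float64   # pendulum length
    dt::Float64  # integration time step
end

# One explicit integration step driven by the *clamped* torque v.
# th is the angle (0 = upright), thdot the angular velocity, u the requested torque.
function step_clamped(p::PendulumParams, th, thdot, u)
    v = clamp(u, -p.max_torque, p.max_torque)
    # Angular acceleration: gravity term plus the applied (clamped) torque.
    thddot = 3 * p.g / (2 * p.l) * sin(th) + 3 / (p.m * p.l^2) * v
    thdot_new = thdot + thddot * p.dt
    th_new = th + thdot_new * p.dt
    return th_new, thdot_new, v
end

p = PendulumParams(2.0, 10.0, 1.0, 1.0, 0.05)

# A requested torque of 5.0 exceeds max_torque, so the step is identical
# to one taken with a requested torque of exactly 2.0.
same = step_clamped(p, 0.1, 0.0, 5.0) == step_clamped(p, 0.1, 0.0, 2.0)
```

With u used directly instead of v, the two calls above would produce different states, which is the discrepancy described here.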

May 05 '19 04:05 darsnack

Thanks for pointing this out. I tried changing it and running the examples from model-zoo, and with that change the model has difficulty learning. The gradients vanish because of clamp (its derivative is zero whenever u falls outside the torque range), which may be why v was never used. I noticed that even without v, the model still learns to output values within the given torque range. I'm experimenting with workarounds to get it working with v.
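To make the vanishing-gradient point concrete, here is a small self-contained sketch that estimates derivatives with finite differences, so it needs no AD package. The names `fd_grad`, `hard`, `soft`, and `MAX_TORQUE` are made up for this example, and the tanh-based squashing is just one possible smooth alternative, not necessarily one of the workarounds mentioned above.

```julia
# Central finite-difference estimate of the derivative of f at x.
fd_grad(f, x; h = 1e-6) = (f(x + h) - f(x - h)) / (2h)

const MAX_TORQUE = 2.0  # illustrative value, an assumption

# Hard clamp, as in the issue: flat outside [-MAX_TORQUE, MAX_TORQUE].
hard(u) = clamp(u, -MAX_TORQUE, MAX_TORQUE)

# Smooth squashing into the same range (hypothetical alternative).
soft(u) = MAX_TORQUE * tanh(u / MAX_TORQUE)

g_inside  = fd_grad(hard, 0.5)  # close to 1: clamp is the identity inside the range
g_outside = fd_grad(hard, 5.0)  # zero: no learning signal once u leaves the range
g_soft    = fd_grad(soft, 5.0)  # small but positive: some signal survives
```

The trade-off with a smooth squashing is that it changes the mapping from u to applied torque everywhere, not only outside the range, so it is not a drop-in replacement for clamp.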

May 05 '19 11:05 tejank10