TRPO-TensorFlow

kl.pen = 0

Open atishsawant opened this issue 6 years ago • 0 comments

Hey,

I was running an implementation of your code, and it seems like kl_pen is always zero. I think it's because oldlog_vars and log_vars are the same. How did you get around that? If the two are the same, then the gradient is zero, and the hvp function then fails because you end up with a division by zero.
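To illustrate the symptom, here is a minimal sketch that assumes the policy is a diagonal Gaussian parameterised by means and log-variances. The function names (`gaussian_kl`, `kl_grad_log_vars`, `guarded_ratio`) are hypothetical and not taken from the repository. It only shows that the KL term and its gradient with respect to the new log-variances vanish when the old and new parameters are identical, and that a small epsilon in the denominator avoids a 0/0 in that case.

```python
import math


def gaussian_kl(old_means, old_log_vars, means, log_vars):
    """KL(old || new) between two diagonal Gaussians."""
    total = 0.0
    for mo, lo, m, l in zip(old_means, old_log_vars, means, log_vars):
        total += l - lo + (math.exp(lo) + (mo - m) ** 2) / math.exp(l) - 1.0
    return 0.5 * total


def kl_grad_log_vars(old_means, old_log_vars, means, log_vars):
    """Analytic gradient of the KL above with respect to the new log-variances."""
    return [
        0.5 * (1.0 - (math.exp(lo) + (mo - m) ** 2) / math.exp(l))
        for mo, lo, m, l in zip(old_means, old_log_vars, means, log_vars)
    ]


def guarded_ratio(num, den, eps=1e-8):
    """Hypothetical guard: assumes den >= 0 and adds eps so 0/0 becomes 0."""
    return num / (den + eps)


# Old and new parameters are identical, as described in the issue.
means = [0.3, -1.2]
log_vars = [-0.5, 0.1]

kl = gaussian_kl(means, log_vars, means, log_vars)
grad = kl_grad_log_vars(means, log_vars, means, log_vars)
grad_sq_norm = sum(g * g for g in grad)

# With a zero gradient, an unguarded ratio of two zero quantities would raise
# ZeroDivisionError; the guarded version returns 0.0 instead.
step = guarded_ratio(grad_sq_norm, grad_sq_norm)
```

Whether the division inside the repository's hvp code has this exact form is an assumption; the sketch is only meant to reproduce the zero-gradient case in isolation.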

atishsawant, Mar 19 '18 00:03