intuitive_policy_gradient
calculating the gradient
Thanks for the nice tutorial. Could you comment on where the line "slid.grad = slid.value * (1-slid.value)" is coming from?
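One plausible reading, assuming `slid.value` holds the output of a sigmoid (the tutorial's code is not shown here, so this is an assumption): the sigmoid s(x) = 1 / (1 + exp(-x)) has derivative s(x) * (1 - s(x)), which is exactly the form `value * (1 - value)`. The sketch below checks that identity numerically against a finite difference. The helper names are hypothetical and not taken from the tutorial.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid: 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad_from_output(s: float) -> float:
    """Derivative of the sigmoid, written in terms of its own output s."""
    return s * (1.0 - s)


def numerical_grad(f, x: float, eps: float = 1e-6) -> float:
    """Central finite-difference estimate of df/dx at x."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


x = 0.5                                   # arbitrary test input
s = sigmoid(x)                            # plays the role of slid.value (assumed)
analytic = sigmoid_grad_from_output(s)    # s * (1 - s)
numeric = numerical_grad(sigmoid, x)      # independent check

print(analytic, numeric)
```

If the two printed values agree to several decimal places, the expression is consistent with being the local gradient of a sigmoid node. Whether the tutorial actually uses a sigmoid at that point would need to be confirmed against its code.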