trpo
Results: 2
trpo issues
Thanks for the implementation of TRPO. There are some details that do not make sense to me so far. I can't see why kl_firstfixed is defined as follows: `kl_firstfixed = tf.reduce_sum(tf.stop_gradient(...
Hi, thanks for your implementation of TRPO. In https://github.com/wojzaremba/trpo/blob/master/main.py#L128-L132 you normalize the advantage function. I couldn't find any description of this operation in the paper (https://arxiv.org/abs/1502.05477). Why did you...
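
For readers unfamiliar with the operation asked about in the second issue, advantage normalization usually means standardizing a batch of advantage estimates to zero mean and unit standard deviation before the policy update. Below is a minimal NumPy sketch of that general idea, assuming a 1-D batch of advantages. The helper name `normalize_advantages` and the `eps` constant are illustrative assumptions and are not taken from the linked `main.py`.

```python
import numpy as np


def normalize_advantages(advantages, eps=1e-8):
    """Standardize a batch of advantage estimates (hypothetical helper)."""
    advantages = np.asarray(advantages, dtype=np.float64)
    # Center the batch so the advantages have zero mean.
    centered = advantages - advantages.mean()
    # Scale by the batch standard deviation; eps guards against division by zero.
    return centered / (advantages.std() + eps)


# Example batch of raw advantage estimates (made-up values for illustration).
adv = normalize_advantages([1.0, 2.0, 3.0, 4.0])
```

Dividing by a positive constant rescales the gradient estimate without changing its direction, and subtracting a constant plays a role similar to a baseline. This is often used as a practical variance-reduction trick; whether that is the reason in this repository is exactly what the issue asks.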