trpo issues

Results: 2 trpo issues


About kl_firstfixed

Thanks for the implementation of TRPO. There are some details that do not make sense to me so far. I can't see why kl_firstfixed is defined as follows: `kl_firstfixed = tf.reduce_sum(tf.stop_gradient(...

PeiYingjun
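The definition quoted in this issue is truncated, so the following is only a sketch of one common reading of a "first argument fixed" KL term, not the repository's code. It assumes a softmax policy over three actions; `softmax`, `kl_first_fixed`, `finite_diff_hessian` and `theta0` are hypothetical names introduced for illustration. The fixed distribution plays the role of the `tf.stop_gradient(...)` term: at the current parameters the KL value and its gradient are both zero, while its Hessian with respect to the logits equals the Fisher information matrix, which is the quantity TRPO needs for its natural-gradient step.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_first_fixed(p_fixed, logits):
    """KL(p_fixed || softmax(logits)).

    p_fixed is treated as a constant, which is the assumed role of the
    stop_gradient term in the quoted snippet.
    """
    q = softmax(logits)
    return sum(p * math.log(p / qi) for p, qi in zip(p_fixed, q))


def finite_diff_hessian(f, x, h=1e-3):
    """Central finite-difference Hessian of a scalar function f at x."""
    n = len(x)
    hess = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                y = list(x)
                y[i] += di
                y[j] += dj
                return f(y)
            hess[i][j] = (
                shifted(h, h) - shifted(h, -h)
                - shifted(-h, h) + shifted(-h, -h)
            ) / (4 * h * h)
    return hess


# Hypothetical current policy parameters (logits for three actions).
theta0 = [0.2, -0.1, 0.5]
p0 = softmax(theta0)

# KL with the first argument frozen at the current policy.
kl_at_theta0 = kl_first_fixed(p0, theta0)
hessian = finite_diff_hessian(lambda th: kl_first_fixed(p0, th), theta0)

# Fisher information of a softmax policy w.r.t. its logits: diag(p) - p p^T.
fisher = [
    [(p0[i] if i == j else 0.0) - p0[i] * p0[j] for j in range(3)]
    for i in range(3)
]
```

Under these assumptions, `hessian` and `fisher` agree up to finite-difference error, which would explain why a term that is identically zero at the current parameters is still worth defining: only its second derivative is used.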

Normalize advantage function

Hi, thanks for your implementation of TRPO. In https://github.com/wojzaremba/trpo/blob/master/main.py#L128-L132 you normalize the advantage function. I couldn't find any description of this operation in the paper (https://arxiv.org/abs/1502.05477). Why did you...

rarilurelo
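For readers unfamiliar with the operation this issue asks about, here is a minimal sketch of advantage normalization as it is commonly done in policy-gradient code. It is not copied from the linked lines; `normalize_advantages` and the `eps` value are assumptions made for illustration. A frequently cited motivation, which the issue snippet does not confirm as the author's reason, is that subtracting the mean acts like a baseline and dividing by the standard deviation keeps the surrogate loss on a consistent numerical scale across batches.

```python
import statistics


def normalize_advantages(advantages, eps=1e-8):
    """Shift advantages to zero mean and scale them to roughly unit std.

    eps is a hypothetical small constant that guards against division by
    zero when all advantages in the batch are equal.
    """
    mean = statistics.fmean(advantages)
    std = statistics.pstdev(advantages)  # population standard deviation
    return [(a - mean) / (std + eps) for a in advantages]


# Example batch of raw advantage estimates (made-up values).
raw = [1.0, 2.0, 3.0, 4.0]
normalized = normalize_advantages(raw)
```

Because the transformation is a shift followed by a positive rescaling, the ordering of the advantages within the batch is preserved.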
