Reiji Hatsugai



Hi, thanks for your implementation of TRPO. In https://github.com/wojzaremba/trpo/blob/master/main.py#L128-L132 you normalize the advantage function. I couldn't find any description of this operation in the paper (https://arxiv.org/abs/1502.05477). Why did you...
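
For readers unfamiliar with the operation being asked about, here is a minimal sketch of what "normalizing the advantage" commonly means: standardizing a batch of advantage estimates to zero mean and roughly unit variance. This is an illustration under that assumption, not a copy of the linked lines; the function name `normalize_advantages` and the `eps` parameter are hypothetical.

```python
from statistics import fmean, pstdev


def normalize_advantages(advantages, eps=1e-8):
    """Standardize a batch of advantage estimates.

    Assumption for illustration: normalization means subtracting the
    batch mean and dividing by the batch standard deviation. `eps` is a
    hypothetical small constant that avoids division by zero when all
    advantages are equal.
    """
    mean = fmean(advantages)
    std = pstdev(advantages, mu=mean)  # population standard deviation
    return [(a - mean) / (std + eps) for a in advantages]


# Example batch of raw advantage estimates (made-up numbers).
raw = [1.0, 2.0, 3.0, 4.0]
normalized = normalize_advantages(raw)
```

Subtracting the batch mean acts much like a constant baseline, and dividing by the standard deviation only rescales the estimates by a positive factor, so their ordering is unchanged. Whether the linked code does exactly this is the question the issue raises.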