ACKTR
loss function
Hi, thanks for sharing this. I just noticed that the loss function used here https://github.com/gd-zhang/ACKTR/blob/9d61318117672262c78c06a976abf3cd47a54bd6/models/model.py#L93 is different from the loss function used to construct the Fisher matrix in the original paper (Section 3.1). The loss you are using is the training loss, which is not the same as the Fisher loss. The OpenAI Baselines implementation also keeps these as two separate losses.
As far as I understand, the Fisher loss should be used to compute and update the Fisher matrix estimates, while the training loss should be used to compute the gradient for the parameter update.
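To make the distinction concrete, here is a minimal NumPy sketch of how I understand the two roles. It is not taken from this repo or from Baselines: the linear softmax policy, the function names (`grad_log_prob`, `fisher_estimate`, `train_gradient`), the random data, and the damping value are all assumptions for illustration. The Fisher estimate is built from gradients of the plain log-likelihood of actions sampled from the policy itself, while the training gradient comes from the advantage-weighted loss.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def grad_log_prob(W, states, actions):
    """Per-sample gradient of log pi(a|s) w.r.t. W, flattened to shape (N, S*A)."""
    probs = softmax(states @ W)                 # (N, A)
    onehot = np.eye(W.shape[1])[actions]        # (N, A)
    # For a linear softmax policy: d log pi(a|s) / dW = outer(s, onehot(a) - probs)
    g = states[:, :, None] * (onehot - probs)[:, None, :]
    return g.reshape(len(states), -1)

def fisher_estimate(W, states, actions):
    """'Fisher loss' role: outer products of log-likelihood gradients, no advantages."""
    g = grad_log_prob(W, states, actions)
    return g.T @ g / len(states)

def train_gradient(W, states, actions, advantages):
    """'Training loss' role: gradient of -mean(advantage * log pi(a|s))."""
    g = grad_log_prob(W, states, actions)
    return -(advantages[:, None] * g).mean(axis=0)

# Hypothetical toy data (assumption): 4 state features, 3 actions, 256 samples.
rng = np.random.default_rng(0)
S, A, N = 4, 3, 256
W = rng.normal(size=(S, A))
states = rng.normal(size=(N, S))
probs = softmax(states @ W)
# Actions are sampled from the policy itself, as the Fisher definition requires.
actions = np.array([rng.choice(A, p=p) for p in probs])
advantages = rng.normal(size=N)

F = fisher_estimate(W, states, actions)              # curvature from the Fisher loss
g = train_gradient(W, states, actions, advantages)   # gradient from the training loss

damping = 1e-2  # assumed value, keeps the linear solve well conditioned
natural_step = np.linalg.solve(F + damping * np.eye(F.shape[0]), g)
```

The point of the sketch is only that `F` and `g` come from different losses. A real K-FAC implementation would approximate `F` per layer with Kronecker factors rather than forming and inverting it explicitly.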