dreamerv2
Performance difference between TruncNormal and TanhNormal
Hey @danijar.
I just noticed that the code uses TruncNormal as the actor distribution instead of TanhNormal as in v1. Did you run any ablations on these two choices and find that TruncNormal gives better results? Or was the change made only because the entropy of TruncNormal is easier to compute than that of TanhNormal for the entropy regularizer?
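
To make the entropy point concrete, here is a minimal sketch of what I mean, not code from this repository. It assumes scalar actions bounded to [-1, 1], and the function names are hypothetical. A truncated normal has a closed-form entropy, while a tanh-squashed normal does not: its entropy is the Gaussian entropy plus E[log(1 - tanh(x)^2)], which I only know how to estimate by sampling.

```python
import math
import random


def _pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def _softplus(t):
    """Numerically stable log(1 + exp(t))."""
    return max(t, 0.0) + math.log1p(math.exp(-abs(t)))


def trunc_normal_entropy(mu, sigma, low=-1.0, high=1.0):
    """Closed-form entropy of N(mu, sigma^2) truncated to [low, high]."""
    alpha = (low - mu) / sigma
    beta = (high - mu) / sigma
    z = _cdf(beta) - _cdf(alpha)  # probability mass inside the bounds
    return (
        math.log(math.sqrt(2.0 * math.pi * math.e) * sigma * z)
        + (alpha * _pdf(alpha) - beta * _pdf(beta)) / (2.0 * z)
    )


def tanh_normal_entropy_mc(mu, sigma, num_samples=50_000, seed=0):
    """Monte Carlo estimate of the entropy of tanh(x), x ~ N(mu, sigma^2).

    Uses H[tanh(x)] = H[x] + E[log(1 - tanh(x)^2)], where the expectation
    has no closed form and is averaged over samples.
    """
    rng = random.Random(seed)
    gauss_entropy = 0.5 * math.log(2.0 * math.pi * math.e * sigma**2)
    total = 0.0
    for _ in range(num_samples):
        x = rng.gauss(mu, sigma)
        # log(1 - tanh(x)^2) = 2 * (log 2 - x - softplus(-2x))
        total += 2.0 * (math.log(2.0) - x - _softplus(-2.0 * x))
    return gauss_entropy + total / num_samples


print(trunc_normal_entropy(0.0, 1.0))    # exact, no sampling needed
print(tanh_normal_entropy_mc(0.0, 1.0))  # sample-based estimate
```

The first function is exact and cheap, whereas the second depends on the number of samples, which is the trade-off I am asking about.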