dreamer-pytorch Reward loss timescale

Open roggirg opened this issue 4 years ago • 0 comments

Hi,

I believe the reward loss should be computed against rewards[1:] rather than rewards[:-1]: https://github.com/yusukeurakami/dreamer-pytorch/blob/7e9050e8c454309de40bd0d1a4ec0256ef600147/main.py#L209
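To make the off-by-one concrete, here is a minimal sketch with made-up data, not code from the repository. It assumes that rewards[t] is stored at the same index as the observation at step t, and that the model's latent states cover steps 1..T-1; the names T, state_steps, targets_shifted and targets_unshifted are hypothetical. If the replay buffer stores rewards under a different convention, the conclusion would change.

```python
# Toy sequence: rewards[t] == t, so each value reveals its own time index.
# Assumption: rewards[t] is stored alongside the observation at step t.
T = 5
rewards = [float(t) for t in range(T)]

# Assumption: the model outputs T-1 latent states, for time steps 1..T-1.
state_steps = list(range(1, T))

targets_shifted = rewards[1:]      # slice proposed in this issue
targets_unshifted = rewards[:-1]   # slice this issue attributes to the linked line

# Both slices have the right length, so no shape error would flag a mismatch.
assert len(targets_shifted) == len(targets_unshifted) == len(state_steps)

for step, shifted, unshifted in zip(state_steps, targets_shifted, targets_unshifted):
    print(f"state at step {step}: rewards[1:] -> step {int(shifted)}, "
          f"rewards[:-1] -> step {int(unshifted)}")
```

Under these assumptions, rewards[1:] pairs each state with the reward from its own step, while rewards[:-1] pairs it with the reward from the previous step.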

If not, can you please explain your reasoning? Thanks,

Oct 22 '20 17:10 roggirg
