dreamer
Different std of models
Hello!
I was wondering why the standard deviations of the RSSM and action model distributions are parameterized by the neural networks, while the distributions of the reward model, the observation model, and the critic use a fixed unit std?
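To make the contrast concrete, here is a minimal NumPy sketch of the two kinds of output head I mean. It is only an illustration and not the repository's code: the function names `learned_std_head` and `unit_std_head`, the softplus transform, and the `min_std=0.1` floor are all assumptions on my part.

```python
import numpy as np


def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return np.logaddexp(0.0, x)


def normal_log_prob(x, mean, std):
    """Log density of a diagonal Gaussian, summed over the last axis."""
    var = np.asarray(std, dtype=float) ** 2
    per_dim = -0.5 * ((x - mean) ** 2 / var + np.log(2.0 * np.pi * var))
    return np.sum(per_dim, axis=-1)


def learned_std_head(features, min_std=0.1):
    # Hypothetical head (assumption): the network outputs 2*D values,
    # the first half is the mean and the second half a raw std that is
    # made positive with softplus plus a small floor.
    mean, raw_std = np.split(features, 2, axis=-1)
    std = softplus(raw_std) + min_std
    return mean, std


def unit_std_head(features):
    # Hypothetical head (assumption): the network outputs only the mean,
    # and the std is fixed to 1 in every dimension.
    return features, np.ones_like(features)


if __name__ == "__main__":
    target = np.array([1.0, 2.0])

    mean, std = unit_std_head(np.array([0.0, 0.0]))
    print("unit std log prob:", normal_log_prob(target, mean, std))

    mean, std = learned_std_head(np.array([0.0, 0.0, 0.0, 0.0]))
    print("learned std:", std)
    print("learned std log prob:", normal_log_prob(target, mean, std))
```

With the unit-std head, the log probability is the negative half squared error plus a constant, so maximizing it is the same as minimizing a squared-error loss. With the learned-std head, the network also controls how wide the distribution is.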