dreamerv2
Mastering Atari with Discrete World Models
dreamerv2 issues (11 results)
Hi, I have a question regarding the implementation of the advantage calculation. The code snippet is as follows:

https://github.com/danijar/dreamerv2/blob/07d906e9c4322c6fc2cd6ed23e247ccd6b7c8c41/dreamerv2/agent.py#L252-L274

```python
advantage = tf.stop_gradient(target[1:] - self._target_critic(seq['feat'][:-2]).mode())
```

Based on my understanding:...
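For anyone following the indexing in that line, here is a minimal sketch of which time steps the two slices pick out. It uses plain Python lists in place of tensors and assumes that `target` holds one entry fewer than `seq['feat']`. The horizon `H` and all variable names below are hypothetical and not taken from the repository.

```python
# Hypothetical horizon: number of time steps in seq['feat'].
H = 6

# Time index carried by each entry of seq['feat'].
feat_steps = list(range(H))

# Assumption: target has one entry per step except the last one.
target_steps = list(range(H - 1))

# Steps passed to the target critic as the baseline: seq['feat'][:-2].
baseline_steps = feat_steps[:-2]

# Steps whose targets enter the advantage: target[1:].
return_steps = target_steps[1:]

# Element-wise difference in time index between the two slices.
offsets = [r - b for r, b in zip(return_steps, baseline_steps)]

print(baseline_steps)  # [0, 1, 2, 3]
print(return_steps)    # [1, 2, 3, 4]
print(offsets)         # [1, 1, 1, 1]
```

Under that assumption both slices have the same length (`H - 2`), and each entry of `target[1:]` is paired with the baseline evaluated one index earlier in `seq['feat']`.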