sheeprl
New Dreamer V3 manuscript
Hi!
Sharing a slight change in Dreamer V3 according to the updated (2024/04/17) manuscript: https://arxiv.org/pdf/2301.04104
Their code was also updated a few hours ago: https://github.com/danijar/dreamerv3
It includes a change in the optimizer (LaProp), experiments with a larger number of envs, etc.
Just wanted to let you know! Thanks!
Closing this issue since I received an ACK.
I'll re-open it so that we can track progress and use this issue to discuss all the changes, i.e.:
- [ ] Models: the number of layers is fixed, while the number of hidden dimensions changes
- [ ] Adaptive Gradient Clipping and LaProp optimizer (never heard of either)
- [ ] LaProp optimizer (https://arxiv.org/abs/2002.04839)
- [ ] GRU with block-diagonal recurrent weights (needs investigation since I don't know what the block-diagonal variant is: https://arxiv.org/abs/1905.12340)
- [ ] The critic regresses on both imagined returns and real ones: "The critic replay loss uses the imagination returns $R^{\lambda}_t$ at the start states of the imagination rollouts as on-policy value annotations for the replay trajectory to then compute λ-returns over the replay rewards."
- [ ] Replay buffer with online queue (https://arxiv.org/abs/1909.11583)
- [ ] More parallel envs
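
For the optimizer items above, here is a rough pure-Python sketch of how I understand the two pieces, not a copy of what danijar/dreamerv3 does. It assumes a per-tensor version of Adaptive Gradient Clipping (the gradient norm is capped at a fraction of the parameter norm) and the LaProp update as I read it from the LaProp paper (normalize the gradient by its running RMS first, then apply momentum). The names `agc_clip` and `LaProp`, and all default hyperparameters (`clip`, `lr`, `mu`, `nu`, `eps`), are illustrative assumptions.

```python
import math


def agc_clip(param, grad, clip=0.3, eps=1e-3):
    """Per-tensor AGC sketch: cap ||grad|| at clip * max(||param||, eps).

    `clip` and `eps` are illustrative values, not taken from any codebase.
    """
    p_norm = max(math.sqrt(sum(p * p for p in param)), eps)
    g_norm = math.sqrt(sum(g * g for g in grad))
    max_norm = clip * p_norm
    if g_norm > max_norm:
        scale = max_norm / g_norm
        return [g * scale for g in grad]
    return list(grad)


class LaProp:
    """LaProp sketch for a flat list of floats (hypothetical helper class)."""

    def __init__(self, size, lr=1e-3, mu=0.9, nu=0.99, eps=1e-15):
        self.lr, self.mu, self.nu, self.eps = lr, mu, nu, eps
        self.m = [0.0] * size  # momentum of the normalized gradient
        self.n = [0.0] * size  # running average of squared gradients
        self.t = 0

    def step(self, param, grad):
        self.t += 1
        c_n = 1.0 - self.nu ** self.t  # bias correction for n
        c_m = 1.0 - self.mu ** self.t  # bias correction for m
        new_param = []
        for i, (p, g) in enumerate(zip(param, grad)):
            self.n[i] = self.nu * self.n[i] + (1.0 - self.nu) * g * g
            # Unlike Adam, the gradient is normalized before entering the momentum.
            normalized = g / (math.sqrt(self.n[i] / c_n) + self.eps)
            self.m[i] = self.mu * self.m[i] + (1.0 - self.mu) * normalized
            new_param.append(p - self.lr * self.m[i] / c_m)
        return new_param


# Usage: clip the gradient first, then take one LaProp step.
params = [3.0, 4.0]
grads = agc_clip(params, [30.0, 40.0])
optimizer = LaProp(size=len(params), lr=0.01)
params = optimizer.step(params, grads)
```

With this formulation the first step moves every parameter by roughly `lr` in the direction opposite to the sign of its gradient, regardless of the gradient's magnitude.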
Any help is super appreciated here :metal: :grin: Thank you all
P.S.: please add here anything that I've missed
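
On the block-diagonal GRU item in the list above: one possible reading, which still needs to be checked against the paper and the reference code, is that the hidden state is split into groups and each group gets its own small recurrent matrix, so the full recurrent weight matrix is zero outside the diagonal blocks. The sketch below only shows that matrix product in pure Python (not a full GRU cell); `block_diag_matvec`, `to_dense` and `dense_matvec` are hypothetical helper names.

```python
def block_diag_matvec(blocks, h):
    """Multiply a block-diagonal matrix (a list of square blocks) by the vector h."""
    out, start = [], 0
    for block in blocks:
        size = len(block)
        chunk = h[start:start + size]  # slice of the hidden state seen by this block
        out.extend(sum(w * x for w, x in zip(row, chunk)) for row in block)
        start += size
    return out


def to_dense(blocks):
    """Expand the blocks into the equivalent dense matrix with zeros off the diagonal."""
    total = sum(len(block) for block in blocks)
    dense = [[0.0] * total for _ in range(total)]
    start = 0
    for block in blocks:
        for i, row in enumerate(block):
            for j, w in enumerate(row):
                dense[start + i][start + j] = w
        start += len(block)
    return dense


def dense_matvec(matrix, h):
    """Plain dense matrix-vector product, used here only for comparison."""
    return [sum(w * x for w, x in zip(row, h)) for row in matrix]


# Two 2x2 blocks acting on a hidden state of size 4.
blocks = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
h = [1.0, 0.0, 0.0, 1.0]

blockwise = block_diag_matvec(blocks, h)
dense = dense_matvec(to_dense(blocks), h)

# Parameter counts: 8 weights in the blocks versus 16 in a full 4x4 matrix.
n_block_params = sum(len(block) * len(block) for block in blocks)
n_dense_params = len(h) * len(h)
```

If this reading is right, the recurrent weights shrink by a factor equal to the number of blocks, which would make a larger hidden state affordable.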