New Dreamer V3 manuscript

Open anthony0727 opened this issue 10 months ago • 2 comments

Hi!

Sharing some small changes to Dreamer V3 from the updated (2024/04/17) manuscript: https://arxiv.org/pdf/2301.04104

Their code was also updated a few hours ago: https://github.com/danijar/dreamerv3

The update includes a change of optimizer (LaProp), experiments with a larger number of envs, etc.
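
For anyone who has not met LaProp before, here is a minimal, framework-free sketch of a single update step, following my reading of the LaProp paper (https://arxiv.org/abs/2002.04839): the gradient is first divided by a bias-corrected running RMS, and momentum is then applied to that normalized gradient. The function names (`laprop_init`, `laprop_step`) and the default hyperparameters are illustrative assumptions, not the values or code used by Dreamer V3 or sheeprl.

```python
import math


def laprop_init(n_params):
    """Create empty optimizer state (hypothetical helper for this sketch)."""
    return {"t": 0, "n": [0.0] * n_params, "m": [0.0] * n_params}


def laprop_step(params, grads, state, lr=1e-3, mu=0.9, nu=0.999, eps=1e-8):
    """Apply one LaProp-style update to a flat list of float parameters.

    Defaults are placeholders chosen for illustration only.
    """
    state["t"] += 1
    t = state["t"]
    c_n = 1.0 - nu ** t  # bias correction for the second-moment estimate
    c_m = 1.0 - mu ** t  # bias correction for the momentum
    new_params = []
    for i, (p, g) in enumerate(zip(params, grads)):
        # Running average of squared gradients.
        state["n"][i] = nu * state["n"][i] + (1.0 - nu) * g * g
        # Normalize the gradient first ...
        normalized = g / (math.sqrt(state["n"][i] / c_n) + eps)
        # ... then accumulate momentum over the normalized gradient.
        state["m"][i] = mu * state["m"][i] + (1.0 - mu) * normalized
        new_params.append(p - lr * state["m"][i] / c_m)
    return new_params


state = laprop_init(2)
params = laprop_step([1.0, -1.0], [2.0, -3.0], state, lr=0.1)
```

On the very first step the normalized gradient is roughly the sign of the gradient, so each parameter moves by about `lr` regardless of the gradient scale.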

Just wanted to let you know! Thanks!

anthony0727 · Apr 25 '24 04:04

Closing this issue since I received an ACK.

anthony0727 · Apr 25 '24 08:04

I'll re-open it so that we can track progress and use this issue to discuss all the changes, i.e.:

  • [ ] Models: the number of layers is fixed across model sizes, while the number of hidden dimensions changes
  • [ ] Adaptive Gradient Clipping (never heard of it)
  • [ ] LaProp optimizer (https://arxiv.org/abs/2002.04839)
  • [ ] GRU with block-diagonal recurrent weights (needs investigation, since I don't know what the block-diagonal variant is: https://arxiv.org/abs/1905.12340)
  • [ ] The critic regresses on both imagined returns and real (replay) ones: "The critic replay loss uses the imagination returns $R^{\lambda}_t$ at the start states of the imagination rollouts as on-policy value annotations for the replay trajectory to then compute λ-returns over the replay rewards."
  • [ ] Replay buffer with online queue (https://arxiv.org/abs/1909.11583)
  • [ ] More parallel envs
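
On the Adaptive Gradient Clipping item above: as far as I understand it, the idea is to clip a gradient relative to the norm of the weights it updates, rather than against a fixed global threshold. A rough sketch, assuming per-tensor clipping over flat lists of floats (the original AGC formulation works unit-wise, so this is a simplification), with a hypothetical `agc_clip` helper and placeholder `clip` and `eps` values that should be checked against the manuscript:

```python
import math


def l2_norm(values):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(v * v for v in values))


def agc_clip(weights, grads, clip=0.3, eps=1e-3):
    """Rescale `grads` so that ||grads|| <= clip * max(||weights||, eps).

    `clip` and `eps` are illustrative assumptions, not confirmed settings.
    """
    # The eps floor keeps zero-initialized weights from freezing forever.
    max_norm = clip * max(l2_norm(weights), eps)
    g_norm = l2_norm(grads)
    if g_norm > max_norm:
        scale = max_norm / g_norm
        return [g * scale for g in grads]
    # Gradient is already small relative to the weights: leave it untouched.
    return list(grads)


clipped = agc_clip([3.0, 4.0], [6.0, 8.0])    # gradient norm 10 is scaled down to 1.5
unchanged = agc_clip([3.0, 4.0], [0.3, 0.4])  # gradient norm 0.5 stays as-is
```

In a real implementation this would run once per parameter tensor, right before the optimizer step.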

Any help is super appreciated here :metal: :grin: Thank you all

P.S.: please add here anything that I've missed

belerico · Apr 25 '24 23:04