Variable action spaces
Hello Danijar, I've been designing a model for a particular type of complex work.
Essentially, it involves working with several different action spaces.
I thought I would be able to simply change the agent's action space and it would work; however, I'm not sure about it. The number of spaces is small and fixed (around 30 different action spaces), yet they differ a lot: sometimes it's a single uint from 1 to 3; sometimes it's a tuple of (3 float32 values, a bool, and another, different set of 3 float32 values); and sometimes it's a vector of 127 bools where the model should select true/false for each entry.
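To make the question concrete, here is a minimal sketch of one workaround I could imagine. This is purely my own assumption, not something taken from the Dreamer code base: give every component of every space its own slot in one fixed flat output vector, and use a mask to mark which slots belong to the currently active space. All names (`SPACES`, `build_layout`, `mask_for`, `decode`) are hypothetical, and only the three example spaces from above are included.

```python
# Hypothetical descriptions of the three example spaces from this post.
# Each component is (kind, size):
#   "discrete": one choice among `size` options (slots hold logits)
#   "float":    `size` continuous values
#   "bool":     `size` independent true/false flags (slots hold logits)
SPACES = {
    "choice": {"pick": ("discrete", 3)},
    "mixed": {"a": ("float", 3), "flag": ("bool", 1), "b": ("float", 3)},
    "flags": {"bits": ("bool", 127)},
}


def build_layout(spaces):
    """Assign every (space, component) a [start, stop) range in one flat vector."""
    layout, offset = {}, 0
    for space_name, components in spaces.items():
        for comp_name, (kind, size) in components.items():
            layout[(space_name, comp_name)] = (kind, offset, offset + size)
            offset += size
    return layout, offset


def mask_for(layout, total, space_name):
    """Boolean mask that is True only on the slots of the active space."""
    mask = [False] * total
    for (name, _), (_, start, stop) in layout.items():
        if name == space_name:
            for i in range(start, stop):
                mask[i] = True
    return mask


def decode(layout, flat, space_name):
    """Turn the flat output vector into an action for the active space."""
    action = {}
    for (name, comp), (kind, start, stop) in layout.items():
        if name != space_name:
            continue
        values = flat[start:stop]
        if kind == "discrete":
            # Greedy pick, shifted so the result is in 1..size as described above.
            action[comp] = values.index(max(values)) + 1
        elif kind == "bool":
            action[comp] = [v > 0.0 for v in values]
        else:
            action[comp] = [float(v) for v in values]
    return action


LAYOUT, TOTAL = build_layout(SPACES)
```

With all 30 spaces the flat vector would simply get longer. The idea would be that the environment reports which space is active, and only the masked slots are decoded (and, presumably, only those would contribute to the policy loss).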
Will Dreamer converge under such circumstances? Or will I have to introduce significant changes somewhere in the model?
Cheers.