
Support for MultiDiscrete action space

Open dewet99 opened this issue 2 years ago • 1 comments

Hi there,

I have a custom Unity environment for which I wrote a wrapper, and its action space can be described by OpenAI Gym's MultiDiscrete space. I've tested my environment and the wrapper with Mava's MAPPO example, but the algorithm doesn't generate multiple discrete actions. Instead, it generates continuous actions in the shape that I require the discrete actions to be in.

In the PettingZooParallelEnvWrapper, the _convert_to_spec function from acme is used to convert to the correct specs, and support for MultiDiscrete spaces is included in the function. However, using this in action_spec doesn't actually generate discrete actions.

Is support for MultiDiscrete action spaces being planned, or did someone implement it somewhere already? Please point me in the right direction.

Thanks in advance.

dewet99, Jun 10 '22 08:06

Hey @dewet99. As mentioned in our earlier private chat, we don't currently support MultiDiscrete actions. One suggestion from @jcformanek is to add some logic to your environment wrapper that converts a Discrete action into the corresponding MultiDiscrete action. This would involve enumerating all combinations of the MultiDiscrete action space and mapping each combination to an integer (the discrete action).
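The enumeration does not have to be stored as an explicit table. Here is a minimal sketch, assuming the MultiDiscrete space is described by a tuple of sub-action sizes (the name `nvec` and its values are hypothetical, as are the two helper functions; none of this is Mava or acme API). It uses NumPy's `unravel_index` and `ravel_multi_index` to map between the flat integer and the tuple of sub-actions:

```python
import numpy as np

# Hypothetical sizes of each discrete sub-action (assumption for illustration).
nvec = (3, 2, 4)

# Number of flat Discrete actions needed to cover every combination.
n_flat = int(np.prod(nvec))


def discrete_to_multidiscrete(action, nvec):
    """Map a flat integer in [0, prod(nvec)) to one index per sub-action."""
    return np.array(np.unravel_index(action, nvec))


def multidiscrete_to_discrete(multi_action, nvec):
    """Inverse mapping: combine the sub-action indices into one flat integer."""
    return int(np.ravel_multi_index(tuple(multi_action), nvec))
```

The wrapper would then expose a Discrete space of size `n_flat` to the algorithm and call `discrete_to_multidiscrete` before stepping the underlying environment. Note that `n_flat` grows multiplicatively with the number of sub-actions, so this only stays practical for small spaces.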

Alternatively, you could try converting a continuous action to a MultiDiscrete action. The continuous action vector would have a length equal to the sum of the sizes of the individual discrete sub-actions. To select an action, you then split that vector into one segment per sub-action and take the argmax over each segment.
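A rough sketch of that per-segment argmax, assuming a 1-D vector of continuous values and the same hypothetical `nvec` tuple of sub-action sizes (the function name is made up for illustration and is not part of Mava):

```python
import numpy as np


def continuous_to_multidiscrete(values, nvec):
    """Split a continuous vector into one segment per sub-action and argmax each."""
    values = np.asarray(values)
    if values.shape != (sum(nvec),):
        raise ValueError("expected a 1-D vector of length sum(nvec)")
    # Boundaries between consecutive segments, e.g. (3, 2, 4) -> [3, 5].
    split_points = np.cumsum(nvec)[:-1]
    segments = np.split(values, split_points)
    return np.array([int(np.argmax(segment)) for segment in segments])


# Hypothetical example: three sub-actions with 3, 2 and 4 choices.
example = continuous_to_multidiscrete(
    [0.1, 0.9, 0.2, 0.7, 0.3, 0.0, 0.1, 0.2, 0.8], (3, 2, 4)
)
```

The argmax makes the selection deterministic, so any exploration would have to come from the noise in the continuous policy's output.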

I am not sure which method would work best. Hope this helps :)

DriesSmit, Jun 10 '22 13:06

Closing all TF issues as we are deprecating our TF systems.

DriesSmit, Sep 08 '22 14:09