multiagent_mujoco
Bug in action spaces?
I was checking the environment code and noticed that an action wrapper is always used to normalize the actions. The code used for this is:
```python
import gym


class NormalizedActions(gym.ActionWrapper):
    def _action(self, action):
        # Rescale an action from [-1, 1] to [action_space.low, action_space.high].
        action = (action + 1) / 2
        action *= (self.action_space.high - self.action_space.low)
        action += self.action_space.low
        return action

    def action(self, action_):
        return self._action(action_)

    def _reverse_action(self, action):
        # Map an action from [low, high] back to [-1, 1].
        action -= self.action_space.low
        action /= (self.action_space.high - self.action_space.low)
        action = action * 2 - 1
        return action
```
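The rescaling itself behaves as expected: -1 maps to the lower bound and +1 to the upper bound. A quick check of the arithmetic (the bounds below are purely illustrative, not taken from any particular scenario):

```python
import numpy as np

# Illustrative bounds only, not from any specific environment.
low, high = np.array([-0.4]), np.array([0.4])
for a in (-1.0, 0.0, 1.0):
    print(a, "->", low + (a + 1) / 2 * (high - low))
# -1.0 -> [-0.4], 0.0 -> [0.], 1.0 -> [0.4]
```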
But even though this wrapper is used, I believe the action space itself never gets updated to account for it. As a result, env.action_space still reports the original limits instead of [-1, 1], which could cause compatibility problems with certain implementations in some of the scenarios.
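If the intent is for agents to always see a [-1, 1] range, one possible fix is for the wrapper to also override self.action_space in its constructor. Below is a minimal sketch of that idea, assuming the standard gym.spaces.Box API; the class name NormalizedActionsFixed and the attribute names are hypothetical, not taken from the repository:

```python
import gym
import numpy as np


class NormalizedActionsFixed(gym.ActionWrapper):
    """Sketch: advertise [-1, 1] via action_space while rescaling internally."""

    def __init__(self, env):
        super().__init__(env)
        # Keep the env's original bounds for rescaling, but expose [-1, 1]
        # to any agent that queries self.action_space.
        self._orig_low = env.action_space.low
        self._orig_high = env.action_space.high
        self.action_space = gym.spaces.Box(
            low=-1.0, high=1.0, shape=env.action_space.shape, dtype=np.float32
        )

    def action(self, action):
        # Map an action from [-1, 1] back to the original bounds before stepping.
        return self._orig_low + (action + 1.0) / 2.0 * (self._orig_high - self._orig_low)
```

Note that the original _action() reads self.action_space.high/low, so if the wrapper overrides action_space it also needs to keep the original bounds around, as in the sketch above.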