CompoundActionSpace contains() call always fails

Open dans-msft opened this issue 4 years ago • 0 comments

The contains() call always fails for CompoundActionSpace. This prevents any use of this space for actual reinforcement learning, because Coach often calls contains() to check the values of actions.

This error occurs when running a preset against any environment that uses a CompoundActionSpace. It is even easier to trigger than that, though: this small code sample reliably reproduces the issue:

    from rl_coach.spaces import DiscreteActionSpace, CompoundActionSpace
    cas = CompoundActionSpace([DiscreteActionSpace(2), DiscreteActionSpace(2)])
    cas.contains(cas.sample())

This is the error that occurs:

    ValueError                                Traceback (most recent call last)
     in
          1 from rl_coach.spaces import DiscreteActionSpace, CompoundActionSpace
          2 cas = CompoundActionSpace([DiscreteActionSpace(2), DiscreteActionSpace(2)])
    ----> 3 cas.contains(cas.sample())

    ~/minerl/lib/python3.6/site-packages/rl_coach/spaces.py in contains(self, val)
        130         if type(val) == np.ndarray and not np.all(val.shape == self.shape):
        131             return False
    --> 132         if (self.low is not None and not np.all(val >= self.low)) \
        133                 or (self.high is not None and not np.all(val <= self.high)):
        134             # TODO: check the performance overhead this causes

    ValueError: operands could not be broadcast together with shapes (0,) (2,)

A similar error occurs on any CompoundActionSpace. The example above is the simplest one I could come up with.

Looking at the code, the problem appears to be that CompoundActionSpace does not override contains(). The call therefore falls through to Space.contains(), which compares the value against the space's shape, low, and high. Those comparisons don't make sense for a CompoundActionSpace.
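One possible shape for a fix is a contains() override that delegates to each sub-space instead of comparing against a single shape. The sketch below is only an illustration of that idea: DiscreteSpace, CompoundSpace, and the sub_spaces attribute are hypothetical stand-ins written for this example, not Coach's actual classes or attribute names.

```python
import random


class DiscreteSpace:
    """Hypothetical stand-in for a discrete action space with n actions."""

    def __init__(self, n):
        self.n = n

    def sample(self):
        # Pick one of the n actions uniformly at random
        return random.randrange(self.n)

    def contains(self, val):
        # Reject bools so that True/False are not silently accepted as 1/0
        if isinstance(val, bool) or not isinstance(val, int):
            return False
        return 0 <= val < self.n


class CompoundSpace:
    """Hypothetical stand-in for a compound space built from sub-spaces."""

    def __init__(self, sub_spaces):
        self.sub_spaces = list(sub_spaces)

    def sample(self):
        # One sampled action per sub-space, in order
        return [space.sample() for space in self.sub_spaces]

    def contains(self, val):
        # Delegate to each sub-space rather than checking one shape/low/high
        if not isinstance(val, (list, tuple)):
            return False
        if len(val) != len(self.sub_spaces):
            return False
        return all(space.contains(v) for space, v in zip(self.sub_spaces, val))


cas = CompoundSpace([DiscreteSpace(2), DiscreteSpace(2)])
sampled_ok = cas.contains(cas.sample())
```

With an override along these lines, a sampled action passes the check, while a value with the wrong number of sub-actions or an out-of-range sub-action is rejected instead of raising.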

Coach version: 1.0.0
OS: Ubuntu 18.04 LTS

dans-msft, Oct 23 '19 17:10