Can't figure out 'observation_space' configuration

Open mludvig opened this issue 4 years ago • 3 comments

Hi, I'm building a new Gym environment that needs a simple 'board' to record the game status (e.g. 10x10 cells) along with the agent position. Unfortunately, I'm unable to figure out how to set up the observation_space structure.

This is what I tried:

    self._board_size = (10, 10)
    self._board = np.ones(self._board_size, dtype=np.int32)
    self._position = np.array([
        np.random.randint(self._board_size[0]),
        np.random.randint(self._board_size[1]),
    ])

And then the observation_space:

    self.observation_space = spaces.Dict({
        "board_status": spaces.Box(
            low=np.zeros(len(self._board.flatten()), dtype=np.int32),
            high=np.ones(len(self._board.flatten()), dtype=np.int32),
            dtype=np.int32),
        "position": spaces.Box(
            low=np.array((0,0)),
            high=np.array(self._board_size),
            dtype=np.int32),
    })

Now in the step() function I return a dictionary:

    def _get_observation(self):
        return {
            "board_status": self._board.flatten(),
            "position": self._position,
        }
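
Independently of Coach, it may be worth checking that the returned dictionary really matches the declared keys, shapes, dtypes, and bounds. The following is a minimal numpy-only sketch of such a check. The names `bounds` and `check_observation` are hypothetical and not part of gym or Coach, and the upper bound for "position" is assumed here to be the last valid index (board size minus one) rather than the board size itself. gym spaces also offer a `contains` method for a similar purpose.

```python
import numpy as np

board_size = (10, 10)
board = np.ones(board_size, dtype=np.int32)
position = np.array(
    [np.random.randint(board_size[0]), np.random.randint(board_size[1])],
    dtype=np.int32,  # explicit dtype so it matches the declared one
)

# Hypothetical plain-dict mirror of the declared spaces: key -> (low, high, dtype).
# Assumption: the largest valid position index is board_size - 1.
bounds = {
    "board_status": (
        np.zeros(board.size, dtype=np.int32),
        np.ones(board.size, dtype=np.int32),
        np.int32,
    ),
    "position": (
        np.array((0, 0), dtype=np.int32),
        np.array(board_size, dtype=np.int32) - 1,
        np.int32,
    ),
}


def check_observation(obs, bounds):
    """Return a list of problems; an empty list means the observation fits."""
    problems = []
    for key, (low, high, dtype) in bounds.items():
        if key not in obs:
            problems.append(f"{key}: missing from observation")
            continue
        value = np.asarray(obs[key])
        if value.shape != low.shape:
            problems.append(f"{key}: shape {value.shape}, expected {low.shape}")
            continue  # bounds cannot be compared when shapes differ
        if value.dtype != np.dtype(dtype):
            problems.append(f"{key}: dtype {value.dtype}, expected {np.dtype(dtype)}")
        if np.any(value < low) or np.any(value > high):
            problems.append(f"{key}: value outside declared bounds")
    return problems


observation = {"board_status": board.flatten(), "position": position}
problems = check_observation(observation, bounds)
print(problems or "observation matches the declared bounds")
```

Passing the non-flattened board to the same check reports a shape mismatch, which is the kind of discrepancy this sketch is meant to surface early.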

However, when I run it with Coach, it fails:

    ValueError: The key for the input embedder (observation) must match
        one of the following keys: dict_keys(['board_status', 'position',
        'measurements', 'action', 'goal'])

My presets file has this:

    env_params = GymVectorEnvironment(level='Test-v0')

How can I return the board status and agent position to the agent?
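
One possible workaround, sketched here with numpy only, is to pack the board and the position into a single flat vector so that one Box-style observation could be used instead of a Dict. This rests on the assumption, not confirmed in this thread, that Coach exposes a plain (non-Dict) observation under the 'observation' key that the error message refers to. The helpers `pack_observation` and `unpack_observation` are hypothetical names, and the position upper bound is assumed to be board size minus one.

```python
import numpy as np

board_size = (10, 10)
n_cells = board_size[0] * board_size[1]


def pack_observation(board, position):
    """Flatten the board and append the (row, col) position as one int32 vector."""
    return np.concatenate([board.ravel(), position]).astype(np.int32)


def unpack_observation(vector, board_size):
    """Split a packed vector back into the 2-D board and the position."""
    cells = board_size[0] * board_size[1]
    board = vector[:cells].reshape(board_size)
    position = vector[cells:]
    return board, position


# Bounds that a single flat Box could be built from (low/high per element).
low = np.zeros(n_cells + 2, dtype=np.int32)
high = np.concatenate(
    [np.ones(n_cells), np.array(board_size) - 1]  # assumed max index per axis
).astype(np.int32)

board = np.ones(board_size, dtype=np.int32)
position = np.array([3, 7], dtype=np.int32)

packed = pack_observation(board, position)
restored_board, restored_position = unpack_observation(packed, board_size)
```

The trade-off is that the agent no longer sees the board and the position as separate inputs, so this is only a sketch of one option and not necessarily the approach Coach intends for Dict observation spaces.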

I have also tried returning the non-flattened board, but that failed even sooner, during initialisation:

    self.observation_space = spaces.Dict({
        "board_status": spaces.Box(
            low=np.zeros(self._board_size, dtype=np.int64),
            high=np.ones(self._board_size, dtype=np.int64),
            dtype=np.int32),
        "position": spaces.Box(
            low=np.array((0,0)),
            high=np.array(self._board_size),
            dtype=np.int32),
    })

And I got:

    Failed to instantiate Gym environment class Test-v0 with observation space type None

Can you provide any advice on how to do that, please?

I'm using:

    numpy version: 1.17.4
    gym version: 0.15.4
    rl-coach version: 1.0.1

Nov 27 '19 05:11 mludvig
