Can't figure out 'observation_space' configuration
Hi
I'm building a new gym environment that needs a simple 'board' to record the game status (e.g. 10x10 cells) and the agent position. Unfortunately, I'm unable to figure out how to set up the observation_space structure.
This is what I tried:
self._board_size = (10, 10)
self._board = np.ones(self._board_size, dtype=np.int32)
self._position = np.array([
    np.random.randint(self._board_size[0]),
    np.random.randint(self._board_size[1]),
])
And then the observation_space:
self.observation_space = spaces.Dict({
    "board_status": spaces.Box(
        low=np.zeros(len(self._board.flatten()), dtype=np.int32),
        high=np.ones(len(self._board.flatten()), dtype=np.int32),
        dtype=np.int32),
    "position": spaces.Box(
        low=np.array((0,0)),
        high=np.array(self._board_size),
        dtype=np.int32),
})
Now in the step() function I return a dictionary:
def _get_observation(self):
    return {
        "board_status": self._board.flatten(),
        "position": self._position,
    }
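As a sanity check outside coach, the returned dictionary can be compared against the declared bounds. The following is a minimal sketch using only NumPy. `BOARD_SIZE`, `BOUNDS` and `check_observation` are hypothetical names introduced for illustration and are not part of gym or coach.

```python
import numpy as np

BOARD_SIZE = (10, 10)
N_CELLS = BOARD_SIZE[0] * BOARD_SIZE[1]

# Hypothetical mirror of the (low, high) bounds declared in observation_space.
BOUNDS = {
    "board_status": (np.zeros(N_CELLS, dtype=np.int32),
                     np.ones(N_CELLS, dtype=np.int32)),
    "position": (np.array((0, 0), dtype=np.int32),
                 np.array(BOARD_SIZE, dtype=np.int32)),
}


def check_observation(obs, bounds=BOUNDS):
    """Return True if obs has exactly the expected keys, shapes and value ranges."""
    if set(obs) != set(bounds):
        return False
    for key, (low, high) in bounds.items():
        value = np.asarray(obs[key])
        # The shape must match the declared bounds exactly.
        if value.shape != low.shape:
            return False
        # Every element must lie within [low, high].
        if np.any(value < low) or np.any(value > high):
            return False
    return True


# Build an observation with the same structure as _get_observation() returns.
board = np.ones(BOARD_SIZE, dtype=np.int32)
position = np.array([
    np.random.randint(BOARD_SIZE[0]),
    np.random.randint(BOARD_SIZE[1]),
])
observation = {"board_status": board.flatten(), "position": position}
print(check_observation(observation))
```

With gym itself, calling self.observation_space.contains(observation) should perform a comparable check against the real space objects.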
However, when I run it with coach, it fails:
ValueError: The key for the input embedder (observation) must match
one of the following keys: dict_keys(['board_status', 'position',
'measurements', 'action', 'goal'])
My presets file has this:
env_params = GymVectorEnvironment(level='Test-v0')
How can I return the board status and agent position to the agent?
I have also tried returning the non-flattened board, but that failed even sooner, during initialisation:
self.observation_space = spaces.Dict({
    "board_status": spaces.Box(
        low=np.zeros(self._board_size, dtype=np.int64),
        high=np.ones(self._board_size, dtype=np.int64),
        dtype=np.int32),
    "position": spaces.Box(
        low=np.array((0,0)),
        high=np.array(self._board_size),
        dtype=np.int32),
})
And I got:
Failed to instantiate Gym environment class Test-v0 with observation space type None
Can you provide any advice on how to do that, please?
I'm using:
numpy version: 1.17.4
gym version: 0.15.4
rl-coach version: 1.0.1