MI-MVI_2016
How to implement your own gym environment
Hey Jiri, I wonder if you could give some guidance on how to use keras-rl in order to create your own "gym" environment.
For example, I see that your board_gym.py is based on core.py, but what do you need callbacks.py for?
Thanks in advance for any advice.
Hi Kjell!
I had the same questions as you, and the keras-rl author answered all of them here: https://github.com/matthiasplappert/keras-rl/issues/38
In short:
Basically, in your game class (like board_gym.py) you need to implement the step and reset methods. These are described in this abstract class: https://github.com/matthiasplappert/keras-rl/blob/master/rl/core.py#L533
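A minimal sketch of such a game class, to make the step/reset contract concrete. The LineWorld game and all of its names are made up for illustration. It is written as a plain class so it runs without keras-rl installed; a real environment would normally subclass the abstract Env class from rl/core.py. The assumption here is that step returns (observation, reward, done, info), as in the gym convention.

```python
class LineWorld:
    """Hypothetical toy game: walk along cells 0..size-1; the right end wins."""

    def __init__(self, size=5):
        self.size = size
        self.position = 0

    def reset(self):
        # Start a new episode and return the first observation.
        self.position = 0
        return self._observation()

    def step(self, action):
        # Assumed action encoding: 0 = move left, 1 = move right.
        move = 1 if action == 1 else -1
        self.position = min(max(self.position + move, 0), self.size - 1)
        done = self.position == self.size - 1
        reward = 1.0 if done else 0.0
        # Assumed return shape: (observation, reward, done, info).
        return self._observation(), reward, done, {}

    def _observation(self):
        # One-hot encoding of the current cell.
        return [1.0 if i == self.position else 0.0 for i in range(self.size)]


env = LineWorld(size=3)
observation = env.reset()
done = False
while not done:
    observation, reward, done, info = env.step(1)  # always move right
```

The agent only ever sees what reset and step return, so everything game-specific (board state, scoring rules) can stay private to the class.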
To get information out of the learning process you need a callback object. In this class you implement several methods that keras-rl calls at specific points during learning (or during testing/playing by the agent). These methods are defined here: https://github.com/matthiasplappert/keras-rl/blob/master/rl/callbacks.py#L14
For example, on_action_begin is called when the agent is about to perform its chosen action.
You can use the default *Logger classes from https://github.com/matthiasplappert/keras-rl/blob/master/rl/callbacks.py . However, I needed to collect and print some game-specific information (like the real score), so I implemented my own Logger in callbacks.py.
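A rough sketch of what such a custom logger could look like. ScoreLogger is a hypothetical name, and the class is kept standalone so it runs without keras-rl; in a real project it would subclass the Callback class from rl/callbacks.py and be passed to the agent. The 'reward' key in the logs dict is an assumption about what the library passes to on_step_end, so check it against the linked source.

```python
class ScoreLogger:
    """Hypothetical logger that sums the reward of each episode."""

    def __init__(self):
        self.episode_scores = []
        self._current_score = 0.0

    def on_episode_begin(self, episode, logs=None):
        # Reset the running total at the start of every episode.
        self._current_score = 0.0

    def on_step_end(self, step, logs=None):
        # Assumption: the logs dict carries the step reward under 'reward'.
        self._current_score += (logs or {}).get('reward', 0.0)

    def on_episode_end(self, episode, logs=None):
        self.episode_scores.append(self._current_score)
        print('episode {}: score {}'.format(episode, self._current_score))


# Driving the hooks by hand, the way the training loop would.
logger = ScoreLogger()
logger.on_episode_begin(0)
for step, reward in enumerate([0.0, 0.0, 1.0]):
    logger.on_step_end(step, {'reward': reward})
logger.on_episode_end(0)
```

The same pattern works for any game-specific number: accumulate it in the per-step hook and report it in the per-episode hook.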
Yes, I see. I will just have to wrap my head around that. The unclear part is maybe that I do not have discrete states and observations like in a board game, but continuous ones. In AirSim, I extract images and then take an action, so I do not really know how to time/clock the procedure.
Well, I don't have any experience with continuous environments. It's rather a question for the keras-rl author. I see he implemented some continuous reinforcement learning methods, so ask him there :)
But I think I read something about this problem, and the solution is to somehow convert the continuous space into a discrete one by averaging the extracted images.
Or maybe you can use a recurrent NN (it should be implemented in keras-rl): each time step would be one extracted image.
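The averaging idea could be sketched roughly like this: one step sends the action once, then grabs a fixed number of frames and averages them pixel by pixel, so the continuous simulation is clocked into discrete steps. This is only an illustration under assumptions. AveragedFrameStep, send_action and grab_frame are hypothetical names, not AirSim or keras-rl calls, and frames are treated as flat lists of pixel values.

```python
class AveragedFrameStep:
    """Hypothetical helper: one discrete step = one action + k averaged frames."""

    def __init__(self, send_action, grab_frame, frames_per_step=4):
        self.send_action = send_action    # callable that applies an action
        self.grab_frame = grab_frame      # callable that returns one frame
        self.frames_per_step = frames_per_step

    def step(self, action):
        # Apply the action once, then sample a fixed number of frames.
        self.send_action(action)
        frames = [self.grab_frame() for _ in range(self.frames_per_step)]
        # Pixel-wise mean over the sampled frames.
        return [sum(pixels) / float(len(frames)) for pixels in zip(*frames)]


# Stand-in simulator: records actions and replays two tiny 2-pixel frames.
sent_actions = []
fake_frames = iter([[0.0, 2.0], [2.0, 4.0]])
stepper = AveragedFrameStep(sent_actions.append,
                            lambda: next(fake_frames),
                            frames_per_step=2)
observation = stepper.step('forward')
```

The averaged frame would then be what the environment's step method returns as its observation. How many frames to average per step is a tuning choice that depends on the simulator's frame rate.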