MI-MVI_2016 How to implement your own gym environment

Hey Jiri, I wonder if you could give some guidance on how to use keras-rl in order to create your own "gym" environment.

For example, I see that your board_gym.py is based on core.py, but what do you need callbacks.py for?

Thanks in advance for any advice.

Oct 04 '17 09:10 Kjell-K

Hi Kjell!

I had the same questions as you, and the keras-rl author answered all of them here: https://github.com/matthiasplappert/keras-rl/issues/38

In short: in your game class (like board_gym.py) you need to implement the step and reset methods. They are described in this abstract class: https://github.com/matthiasplappert/keras-rl/blob/master/rl/core.py#L533
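To make the step/reset contract concrete, here is a minimal sketch of a toy environment. Everything in it (the GuessNumberEnv name, the guessing game, the hint encoding) is made up for illustration and is not taken from board_gym.py. It is a plain class so that it runs without keras-rl installed; in a real project it would presumably subclass the abstract environment class linked above. It assumes the usual gym-style convention that step returns (observation, reward, done, info).

```python
import random


class GuessNumberEnv:
    """Toy environment (hypothetical) exposing the reset/step interface.

    Observation is a hint: 1 = target is higher, -1 = target is lower,
    0 = no hint yet or the guess was correct.
    """

    def __init__(self, low=0, high=9, max_steps=5, seed=None):
        self.low = low
        self.high = high
        self.max_steps = max_steps
        self._rng = random.Random(seed)
        self._target = None
        self._steps = 0

    def reset(self):
        """Start a new episode and return the initial observation."""
        self._target = self._rng.randint(self.low, self.high)
        self._steps = 0
        return 0

    def step(self, action):
        """Apply one action and return (observation, reward, done, info)."""
        self._steps += 1
        if action == self._target:
            observation, reward, done = 0, 1.0, True
        else:
            observation = 1 if action < self._target else -1
            reward = 0.0
            # The episode also ends when the step budget is used up.
            done = self._steps >= self.max_steps
        info = {"steps": self._steps}
        return observation, reward, done, info


# Usage: a trivial hand-written policy that follows the hint.
env = GuessNumberEnv(seed=0)
obs = env.reset()
done = False
guess = 5
while not done:
    obs, reward, done, info = env.step(guess)
    guess += obs
```

The agent only ever talks to the game through these two methods, so all game rules, scoring and episode termination live inside step.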

To get information out of the learning process you need a callback object. In this object (class) you implement several methods which keras-rl calls at specific points during learning (or while the agent is testing/playing). These methods are defined here: https://github.com/matthiasplappert/keras-rl/blob/master/rl/callbacks.py#L14 For example, on_action_begin is called when the agent performs its chosen action. You can use the default *Logger classes from https://github.com/matthiasplappert/keras-rl/blob/master/rl/callbacks.py . However, I needed to get and print some specific information about my game (like the real score), so I implemented my own Logger in callbacks.py.
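As a rough sketch of what such a logger can look like: the class below collects the score per episode. ScoreLogger and run_episode are hypothetical names, and run_episode is only a stand-in for the agent's training loop so the example runs without keras-rl; it is not keras-rl code. The hook names follow the Callback class linked above, and the assumption that the step logs carry a "reward" entry should be checked against the keras-rl version you use.

```python
class ScoreLogger:
    """Hypothetical logger implementing a few of the callback hooks.

    With keras-rl installed this would subclass rl.callbacks.Callback.
    """

    def __init__(self):
        self.episode_scores = []
        self.actions = []
        self._current_score = 0.0

    def on_episode_begin(self, episode, logs=None):
        # Reset the running score at the start of every episode.
        self._current_score = 0.0

    def on_action_begin(self, action, logs=None):
        # Remember every action the agent chose.
        self.actions.append(action)

    def on_step_end(self, step, logs=None):
        # Assumption: the logs dict contains the reward of this step.
        logs = logs or {}
        self._current_score += logs.get("reward", 0.0)

    def on_episode_end(self, episode, logs=None):
        self.episode_scores.append(self._current_score)
        print("episode %d: score %.1f" % (episode, self._current_score))


def run_episode(callback, episode, actions_and_rewards):
    """Stand-in for the agent loop: calls the hooks in a plausible order."""
    callback.on_episode_begin(episode)
    for step, (action, reward) in enumerate(actions_and_rewards):
        callback.on_action_begin(action)
        callback.on_step_end(step, {"reward": reward})
    callback.on_episode_end(episode)


logger = ScoreLogger()
run_episode(logger, 0, [(1, 0.0), (2, 1.0)])
```

With the real library, an instance like this would be passed in the callbacks list of the agent's fit call instead of being driven by hand.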

Oct 04 '17 15:10 gorgitko

Yes, I see. I will just have to wrap my head around that. The part that is unclear to me is that I do not have discrete states and observations like in a board game, but continuous ones. In AirSim, I extract images and then take an action, so I do not really know how to time/clock the procedure.

Oct 05 '17 09:10 Kjell-K

Well, I don't have any experience with continuous environments. That's rather a question for the keras-rl author. I see he implemented some continuous reinforcement learning methods, so ask him there :)

But I think I read something about this problem, and the solution is to somehow convert the continuous space to a discrete one by averaging the extracted images. Or maybe you can use a recurrent NN (it should be implemented in keras-rl): each time step would be one extracted image.
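One possible reading of the averaging idea, as a sketch only: treat each extracted image as one time step and feed the agent the per-pixel mean of the last few frames. FrameAverager and its window parameter are hypothetical, and frames are assumed to be flat lists of pixel intensities of equal length; real AirSim images would need to be converted to that form first.

```python
from collections import deque


class FrameAverager:
    """Keeps the last `window` frames and returns their per-pixel mean."""

    def __init__(self, window=4):
        self._frames = deque(maxlen=window)

    def reset(self):
        """Forget all stored frames, e.g. at the start of an episode."""
        self._frames.clear()

    def add(self, frame):
        """Store one frame and return the mean over the stored frames."""
        self._frames.append(list(frame))
        n = len(self._frames)
        # zip(*frames) groups the values of the same pixel across frames.
        return [sum(pixels) / n for pixels in zip(*self._frames)]


averager = FrameAverager(window=2)
first = averager.add([0, 2])    # only one frame stored so far
second = averager.add([2, 4])   # mean of the two frames
```

Such a helper could be called inside the environment's step method, so that one step means: apply the action, grab one image, return the averaged observation.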

Oct 05 '17 09:10 gorgitko
