
Support newer versions of gym

Open roclark opened this issue 4 years ago • 9 comments

Version 0.9.6 of gym introduced several changes that broke compatibility with most environments, including updates to the MultiDiscrete interface and a change of the step, render, and observe methods from private to public. This patch adds compatibility with newer versions of gym by addressing the following:

  • Update instantiation of MultiDiscrete action spaces to use non-negative ranges with an inclusive lower bound and an exclusive upper bound.
  • Pad the actions being sent to the Controller class to ensure proper ranges are used for joystick orientations. For example, the X- and Y-axis ranges as instantiated with MultiDiscrete could be [0, 256), while the controller expects numbers in the range [-128, 127]. To account for this, the action supplied by gym should have its joystick axes wrapped before the action is passed to the Controller class.
  • Change all private methods required by gym to public methods (i.e. change _step() to step()).
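The joystick-wrapping step above can be sketched as follows. This is an illustrative helper, not the repo's actual code: the function name mirrors the `pad_action` mentioned later in this thread, and the action layout (two joystick axes followed by button entries) is an assumption for the example.

```python
# Hypothetical sketch: gym's newer MultiDiscrete samples each joystick axis
# in [0, 256), while the Controller class expects signed values in [-128, 127].
def pad_action(action):
    """Shift the first two entries (the X and Y joystick axes) from
    [0, 256) down to [-128, 127]; button entries pass through unchanged."""
    padded = list(action)
    padded[0] -= 128  # X axis: 0..255 -> -128..127
    padded[1] -= 128  # Y axis: 0..255 -> -128..127
    return padded

# e.g. an action sampled from MultiDiscrete([256, 256, 2, 2, 2])
print(pad_action([0, 255, 1, 0, 1]))  # -> [-128, 127, 1, 0, 1]
```

Done this way, the environment can keep the non-negative ranges gym requires while the controller still receives values in its expected signed domain.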

Closes #41

Signed-Off-By: Robert Clark [email protected]

roclark avatar Sep 18 '19 17:09 roclark

Just updated the commit with the requested change, but added a comment about the location of pad_action. Thanks again for the review!

roclark avatar Sep 20 '19 12:09 roclark

I realized that the readme links to another repo of mine with an A3C agent. I'd like to make sure these changes are either backwards compatible with that agent, or that there's a parallel PR to that repo to upgrade it as well, so the two repos stay compatible. Long term this may not be a good strategy, but right now I don't have a better idea. Have you tried running that agent? Would you be able to do this as well? If not, let me know and I can take care of it.

bzier avatar Sep 23 '19 16:09 bzier

I will check it out! I haven't looked at the other agent yet, but I will verify the functionality shortly and report back on what I find. If updates need to be made and they are fairly straightforward, I'd be happy to help. I will keep you updated! Apologies for the delay!

roclark avatar Oct 05 '19 21:10 roclark

No worries on the delay, I totally understand how it goes. It might be that the existing agent isn't (easily) able to be made compatible with the new gym. That might be fine since it is an older algorithm, but I would like to have some working agent as an example, that is capable of running and learning successfully.

bzier avatar Oct 06 '19 03:10 bzier

Just wanted to mention that I haven't forgotten about this! I think my focus is on #73 at the moment, but once I get that working (or if someone beats me to it 😃), I will take a better stab at this.

roclark avatar Jan 06 '20 14:01 roclark

I've poked around here for a couple of days as I've had time, and I have some good news and some bad news. The bad news is that the repository you linked to no longer works with the newer gym. Given the significant changes to gym, universe as a whole isn't supported anymore (the repository you forked from is now archived), so getting it working properly would require a significant rewrite.

I agree that it would be good to link to a repository that is able to train properly, so I found baby-a3c and made my own updates, which appear to work relatively well. It's not phenomenal, but it was a reasonably quick hack that solves the problem.

During this process, I found a few additional changes that needed to be made here - namely, applying another update for a newer version of gym released since I last worked on this, moving the _pad_action method up to mupen64plus_env.py, and renaming the methods in the discrete environment files.

At this point, I think this is functional, but am happy to modify anything else you might feel should be changed.

Thanks for the patience! :smiley:

roclark avatar Jan 09 '20 13:01 roclark

It's been a while since I've done any work on this, but I've made several changes recently that should now be fully functional for both continuous and discrete action spaces for Mario Kart and Smash. I've finally created a testbed where I can consistently train agents to learn various Mario Kart tracks using newer versions of gym, and I'm satisfied with the results. The change to support newer versions of gym turned out to be more straightforward than I originally planned: we mainly convert from a MultiDiscrete action space to a Box space and rename various methods. By using Box, we also don't need any special handling to place the joystick values in the proper range.
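The MultiDiscrete-to-Box conversion described above can be sketched like this. The bounds and action layout are assumptions mirroring this discussion, not the repo's actual definition; with a Box, the joystick axes carry the controller's signed range directly, so the earlier padding step disappears.

```python
import numpy as np

# Illustrative bounds: X and Y joystick axes in the controller's signed
# range, followed by three binary buttons. In gym this would become:
#   action_space = gym.spaces.Box(low=low, high=high)
low = np.array([-128, -128, 0, 0, 0], dtype=np.float32)
high = np.array([127, 127, 1, 1, 1], dtype=np.float32)

def clip_to_bounds(action):
    """Clamp an agent's raw output into the Box bounds before handing
    it to the Controller class; no range shifting is needed."""
    return np.clip(np.asarray(action, dtype=np.float32), low, high)

# An out-of-range action gets clamped rather than shifted.
print(clip_to_bounds([200, -300, 0.5, 2, 1]))
```

This is one reason the Box approach is simpler: the action space and the controller share a single domain, so the only remaining concern is clamping out-of-range agent outputs.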

I know you've been planning on looking into this yourself, but hopefully this can provide a good template. I'd be happy to share my examples on how I am currently training agents to finish tracks.

Also, it looks like my update stripped some trailing whitespace on a few lines. I'd be happy to revert those files to their original formatting if desired. Thanks!

roclark avatar Mar 24 '21 02:03 roclark

Hey @roclark, sorry I haven't responded here yet. I am planning to take a look at it, but may not get to it for another couple weeks. Just didn't want to leave you hanging here, but probably won't get around to it this weekend either.

bzier avatar Apr 10 '21 17:04 bzier

No worries @bzier, I appreciate the update!

roclark avatar Apr 15 '21 16:04 roclark