
[Proposal] Replace `gymnasium` with `pettingzoo`

Open acxz opened this issue 1 year ago • 1 comments

Pokemon battles are fundamentally a multi-agent system.

Currently we use Farama's Gymnasium, which is an RL env API for single agents. I believe this decision came about because, at the time of this project's creation, there was no standard or popular multi-agent RL env API, leaving us to choose OpenAI's Gym, which has since been succeeded by Gymnasium.

However, with the rising popularity of multi-agent systems, various multi-agent RL env APIs have been created, such as Farama's PettingZoo and Google DeepMind's OpenSpiel, among others.

Overall, PettingZoo has become a strong de facto standard for multi-agent systems: it closely resembles Gymnasium's API, is mathematically more general than the other multi-agent RL env APIs, and, more importantly, has seen adoption in popular RL libraries such as RLlib, Tianshou, and Stable-Baselines3, among others.

With this in mind, and again recognizing that Pokemon battles are a multi-agent system, I believe that we should be using pettingzoo instead of gymnasium.

Practically, this would mean an API change for us, and certain logic that we assumed or worked around because of the single-agent API should become simpler under PettingZoo's API.
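To make the kind of workaround concrete, here is a rough sketch on a hypothetical toy game (rock-paper-scissors, not poke-env's actual classes). With a single-agent `step(action)` signature the opponent has to be hidden inside the environment, whereas a dict-keyed `step` takes both players' actions at once. All class and function names here are made up for illustration.

```python
def payoff(a: int, b: int) -> int:
    """Rock-paper-scissors payoff for the first player.

    Moves are 0 = rock, 1 = paper, 2 = scissors.
    Returns 1 for a win, 0 for a draw, -1 for a loss.
    """
    return [0, 1, -1][(a - b) % 3]


class SingleAgentRPS:
    """Single-agent shape: the opponent is baked into the environment."""

    def __init__(self, opponent_policy):
        # The opponent is an implementation detail of the env,
        # so the learner never sees or controls it.
        self.opponent_policy = opponent_policy

    def step(self, action: int) -> int:
        opponent_action = self.opponent_policy()
        return payoff(action, opponent_action)


class TwoAgentRPS:
    """Multi-agent shape: both actions arrive together, keyed by agent name."""

    agents = ["player_0", "player_1"]

    def step(self, actions: dict) -> dict:
        reward = payoff(actions["player_0"], actions["player_1"])
        # Zero-sum game: the second player gets the negated reward.
        return {"player_0": reward, "player_1": -reward}
```

In the second shape neither player is special, so self-play or swapping opponents needs no changes inside the environment.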

I'd love to hear everyone's thoughts about this and the feasibility of this change. I truly believe such a change would increase the usability of this project.

acxz avatar Jun 09 '24 16:06 acxz

This sounds like a good idea, and I would be open to discussing a plan to implement it.

hsahovic avatar Jun 10 '24 23:06 hsahovic

@acxz did you ever start/try this? Wondering about this myself

caymansimpson avatar Nov 21 '24 06:11 caymansimpson

Nope never started actually implementing it. Knock yourself out.

Let me put some of my thoughts down tho. Here is my proposed list (with difficulty level):

  1. Change all instances of OpenAI branding to Farama and gym to gymnasium, or simply get rid of the OpenAI branding (easy)
  2. Add multi-agent PettingZoo versions of the current Gymnasium envs alongside the Gymnasium envs (medium). We would use PettingZoo's Parallel API for this problem domain. After pondering this for a good while, I think there is still value in keeping single-agent envs: unless we control both our player and the opponent in the Showdown match, the problem is effectively a single-agent system. However, any env in which we control the opponent's actions ourselves should use the multi-agent API (i.e. PettingZoo).
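As a rough sketch of how item 2 could look, the toy env below mirrors the shape of PettingZoo's Parallel API (`reset` returns observations and infos, `step` takes a dict of actions and returns five dicts keyed by agent name) without importing PettingZoo. The battle rules, `ToyParallelBattle`, and `FixedOpponentEnv` are hypothetical and not part of poke-env. The wrapper shows why a single-agent env still makes sense when the opponent is not under our control.

```python
AGENTS = ("player_0", "player_1")


class ToyParallelBattle:
    """Hypothetical two-player env shaped like a PettingZoo ParallelEnv.

    Each turn both players choose how much damage to deal (1 or 2);
    the battle ends when either side's HP reaches zero.
    """

    def reset(self, seed=None, options=None):
        self.hp = {agent: 3 for agent in AGENTS}
        observations = {agent: dict(self.hp) for agent in AGENTS}
        infos = {agent: {} for agent in AGENTS}
        return observations, infos

    def step(self, actions):
        # Both players' choices are submitted together and resolved in one turn.
        self.hp["player_1"] = max(0, self.hp["player_1"] - actions["player_0"])
        self.hp["player_0"] = max(0, self.hp["player_0"] - actions["player_1"])
        done = min(self.hp.values()) == 0

        def reward(me, other):
            # +1 for winning, -1 for losing, 0 for a draw or an unfinished battle.
            if not done:
                return 0.0
            return float((self.hp[me] > 0) - (self.hp[other] > 0))

        observations = {agent: dict(self.hp) for agent in AGENTS}
        rewards = {
            "player_0": reward("player_0", "player_1"),
            "player_1": reward("player_1", "player_0"),
        }
        terminations = {agent: done for agent in AGENTS}
        truncations = {agent: False for agent in AGENTS}
        infos = {agent: {} for agent in AGENTS}
        return observations, rewards, terminations, truncations, infos


class FixedOpponentEnv:
    """Single-agent view of the env: the opponent's policy is fixed inside."""

    def __init__(self, env, opponent_policy):
        self.env = env
        self.opponent_policy = opponent_policy

    def reset(self, seed=None, options=None):
        obs, infos = self.env.reset(seed=seed, options=options)
        self._opponent_obs = obs["player_1"]
        return obs["player_0"], infos["player_0"]

    def step(self, action):
        actions = {
            "player_0": action,
            "player_1": self.opponent_policy(self._opponent_obs),
        }
        obs, rewards, terms, truncs, infos = self.env.step(actions)
        self._opponent_obs = obs["player_1"]
        # Return only player_0's slice, in the Gymnasium-style 5-tuple order.
        return (obs["player_0"], rewards["player_0"], terms["player_0"],
                truncs["player_0"], infos["player_0"])
```

For example, `FixedOpponentEnv(ToyParallelBattle(), lambda obs: 1)` behaves like a single-agent env against an opponent that always deals 1 damage, while driving `ToyParallelBattle` directly with the same policy for both agents would amount to self-play.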

At this point poke-env would have multi-agent envs that allow concepts such as the gen8freeforall format, self-play, multi random formats, etc. to be implemented natively.

acxz avatar Nov 21 '24 07:11 acxz

@acxz I saw your recent PR on #1! And a few of us have been talking about how to do #2 here! Would love your thoughts and comments, or if you've already started thinking about this, to understand what you've been up to/planning. None of us have been able to prioritize it (or are planning to atm) -- and so are just trying to get a lay of the land, thoughts on approaches and figure out who can take a first crack at this. If you comment on it w/ your contact info (like a linkedin or smth), we could start a chat or smth

caymansimpson avatar Dec 08 '24 05:12 caymansimpson

I'm going to close this as the title issue has been solved - feel free to continue chatting here, in discussions, on discord or in a new issue, as you prefer.

hsahovic avatar Dec 10 '24 01:12 hsahovic

@acxz all done!

cameronangliss avatar Jan 03 '25 02:01 cameronangliss