Enable true concurrent battles
Currently, it is not possible to have concurrent battles. Although you can specify `max_concurrent_battles`, what it does (as far as I can tell) is spawn multiple battles but only allow you to call `choose_move` on them one by one. This greatly limits the speed of training, as it is much more efficient to run many environments in parallel and batch the inputs than to process them one by one.
Therefore, I propose adding a new `choose_moves` function that returns a list of `battle`s, as well as modifying the necessary source code to enable (true) simultaneous battles. Ideally, this function should not fix the length of the returned list (only cap it at a maximum), returning whatever battles have already been processed at call time, to prevent bottlenecking.
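To make the proposal concrete, here is a rough sketch of what such an API could look like. Everything here is hypothetical - neither `choose_moves` nor `ready_battles` exists in poke-env today:

```python
# Hypothetical sketch of the proposed API - none of these names exist
# in poke-env today.
from typing import List


class BatchedPlayer:
    async def ready_battles(self, max_batch: int = 128) -> List["AbstractBattle"]:
        """Return every battle currently waiting on an order, up to
        max_batch, without blocking until a full batch is reached."""
        ...

    def choose_moves(self, battles: List["AbstractBattle"]) -> List["BattleOrder"]:
        """Pick one order per battle, so a model can process the whole
        batch in a single forward pass."""
        ...
```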
Hey @carbonbased-lifeform,
Are you referring to the gym API or the general API?
General. Does it exist in the gym API? I haven't really used that one yet.
In the general API, most of the time the bottleneck is not on `poke-env`'s side, but on Pokemon Showdown's side. Processing batches of battles on `poke-env`'s side would not make things faster - it would actually probably make things slower, as you would have to wait for the state of `batch_size` battles to be ready before processing them.
In the gym API, your bottleneck might be on the model side, where batching can be helpful. I'm aware of a couple of implementations that do this, but they also have to deal with tweaking their state / action / reward tuple management to make training possible. I do not think this is suitable for the basic API, but it might be added to the docs as an advanced example.
With the general API, making `SimpleHeuristicsPlayer`s play randbats, I can run a bit more than 20 games per second on my couple-of-years-old laptop. I'm sure you could get these numbers up with a better setup.
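For reference, a minimal way to measure this yourself with the general API (assumes a Pokemon Showdown server running locally; the class is spelled `SimpleHeuristicsPlayer` in recent poke-env releases):

```python
import asyncio
import time

from poke_env.player import SimpleHeuristicsPlayer


async def main():
    # Both players connect to a locally running Pokemon Showdown server.
    p1 = SimpleHeuristicsPlayer(
        battle_format="gen8randombattle", max_concurrent_battles=10
    )
    p2 = SimpleHeuristicsPlayer(
        battle_format="gen8randombattle", max_concurrent_battles=10
    )

    start = time.perf_counter()
    await p1.battle_against(p2, n_battles=100)
    print(f"{100 / (time.perf_counter() - start):.1f} battles per second")


asyncio.run(main())
```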
> it would actually probably make things slower, as you would have to wait for the state of `batch_size` battles to be ready before processing them.
As mentioned in the original issue, I proposed that Showdown return all the battles it has already processed. Even if the batch size is not reached, we can simply take whatever games are already processed and run the model on them (this will of course benefit from finding a good balance), since varying batch sizes cause few complications at inference time (unlike training). This ensures that the device running the model (which is more often than not the bottleneck) is always at full capacity. Batching inputs is still vastly more efficient even with the concern you raised: in a small test on a relatively large model, a batch of 1 took 0.1 seconds whereas a batch of 128 took around 0.5 seconds, hence my suggestion.
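To illustrate the variable-batch-size point: inference can simply consume however many battles happen to be ready. A minimal PyTorch sketch, where `encode` (turning one battle into a feature tensor) and the list of ready battles are assumed to come from elsewhere:

```python
import torch


@torch.no_grad()
def batched_orders(model: torch.nn.Module, battles, encode):
    # The batch is simply "whatever battles are ready right now" -
    # varying batch sizes are unproblematic at inference time.
    states = torch.stack([encode(b) for b in battles])
    logits = model(states)
    # One action index per battle, in the same order as the input list.
    return logits.argmax(dim=-1).tolist()
```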
A bit of a nooby question since I haven't tried the gym API yet, but does the gym API run concurrent battles in the background as described above and thus allow batching?
Your remarks make sense w.r.t. running RL training, which is not what the general API is focused on (as opposed to the gym API). The out-of-the-box implementation of the gym API is focused on running a single battle at a time, but you can spawn multiple instances and organize your training flow to benefit from parallelization, including batching. This is what the implementations I was referring to do.
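The multi-instance pattern, in shape only - `run_worker` stands in for your own env setup and rollout loop, and is not a poke-env API:

```python
import multiprocessing as mp


def run_worker(worker_id: int, n_battles: int) -> None:
    # Placeholder: create one gym env (one Player) in this process and
    # step it for n_battles; batching across workers is handled by your
    # model-serving code, not by poke-env itself.
    ...


if __name__ == "__main__":
    workers = [mp.Process(target=run_worker, args=(i, 10_000)) for i in range(8)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
```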
Thanks a lot for the replies. I'll try out the gym API when I have the time. One more question (hope it's not a stupid one): is there any way to modify the `player_network_interface` to run concurrent battles? Given that it is possible to send messages from a single player to multiple rooms, I don't see why the following couldn't be implemented for concurrency, since naïve parallelization using multiple players seems rather inefficient.
`max_concurrent_battles` determines how many battles will be run at the same time by the `Player` object, and relies on `asyncio`. If you want to implement some kind of batching logic, as you were hinting towards, I'd recommend doing so directly in `choose_move` - everything else is doing parsing and / or state tracking, which wouldn't really benefit from parallelization.
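A hedged sketch of what batching inside `choose_move` could look like, assuming `max_concurrent_battles > 1` so several battles can be awaiting orders at once, and assuming your poke-env version awaits `choose_move` when it is a coroutine. `model_forward` is a placeholder for your own batched model call:

```python
import asyncio

from poke_env.player import Player


class BatchingPlayer(Player):
    def __init__(self, *args, batch_window: float = 0.01, **kwargs):
        super().__init__(*args, **kwargs)
        self._pending = []  # (battle, future) pairs awaiting an order
        self._batch_window = batch_window

    async def choose_move(self, battle):
        fut = asyncio.get_running_loop().create_future()
        self._pending.append((battle, fut))
        # Briefly yield so other concurrent battles can join the batch.
        await asyncio.sleep(self._batch_window)
        if self._pending:
            batch, self._pending = self._pending, []
            # model_forward is a placeholder: it should map a list of
            # battles to one move (or switch) per battle in one forward pass.
            moves = self.model_forward([b for b, _ in batch])
            for (_, f), move in zip(batch, moves):
                f.set_result(self.create_order(move))
        return await fut
```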
> In the gym API, your bottleneck might be on the model side, where batching can be helpful. I'm aware of a couple of implementations that do this, but they also have to deal with tweaking their state / action / reward tuple management to make training possible. I do not think this is suitable for the basic API, but it might be added to the docs as an advanced example.
Hi @hsahovic could you point me to some implementations like the ones you mentioned here? I'm in a situation where I'd like to run a large number of battles (1M+) and would like to be able to speed things up by running multiple battles in parallel.
My best suggestion is to get a fast CPU and try running it in the cloud. My setup can only run 200k battles in 15 hours, so one million would take about 75 hours - maybe less if I increase the concurrency from just 100 battles at a time.
You will also run into more issues with things like Revival Blessing, which was a tricky one to fix.
Also, you could look into TorchRL and remote RPC, which I had to implement personally.