
multi-thread speed up

Open xubo92 opened this issue 2 years ago • 3 comments

Hi @ekolve @Lucaweihs @mattdeitke @everyone,

I'm doing RL and am trying to create multiple sim envs at the same time to speed up rollout collection with multiple threads. But I found that even with 8 threads, rollout collection is not much faster than with a single sim env.

I can confirm that the threads are created and used correctly. Does anyone have experience using multiple envs for more efficient rollout collection?

Thank you!

xubo92 avatar Nov 02 '21 23:11 xubo92

Hi @xubo92, a few initial questions:

  1. What RL training library are you using? We generally use AllenAct as it natively supports running many environments in parallel and using multiple GPUs.
  2. Are the FPS numbers you're getting taking into account model inference / backpropagation, or are you simply collecting lots of (possibly random) rollouts?
  3. What type of task are you attempting to accomplish? In particular, what types of actions are your agents taking? (Some are much slower than others.)
  4. What FPS are you achieving?

Lucaweihs avatar Nov 03 '21 22:11 Lucaweihs

Hi @Lucaweihs

Thanks a lot for following up!

  1. I'm not using AllenAct; I'm using a specific PyTorch implementation of multi-agent RL, more specifically a multi-agent PPO algorithm. This implementation supports parallel envs. Here it is: RL code
  2. Yes, the FPS takes into account model inference in the step() function, but no backpropagation is included. Model inference uses ResNet-18 as the backbone to compute a visual embedding.
  3. I'm working on a two-agent cooperative task using a RoboTHOR map. I mostly use common basic actions, like MoveForward, RotateLeft, teleport.
  4. ~10 FPS at most.

xubo92 avatar Nov 04 '21 18:11 xubo92

Hi @xubo92,

  1. Sounds good! I suspect that the RL library isn't the issue but it might be worth trying another on your setup to double check.
  2. What FPS do you get running a single instance of AI2-THOR by itself? I.e., if you set up a script that just has the agent taking "RotateRight" actions (outside of the RL library), what FPS do you get?
  3. Gotcha, yes this should be quite fast. We get around 1.2k FPS when using 60 processes on 8 GPUs (including backpropagation and inference).
  4. That's very slow indeed. Can you give some more information about your machine's setup? I.e. how many GPUs, how many cores, what operating system, etc?
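
For the standalone check in point 2, a minimal sketch might look like the following. It is only an illustration, not an official benchmark: `measure_fps` and `benchmark_rotate_right` are hypothetical helper names, and the scene name `FloorPlan_Train1_1` and `agentMode="locobot"` are assumptions for a RoboTHOR setup that you should swap for your own configuration.

```python
import time
from typing import Callable


def measure_fps(step_fn: Callable[[], object], n_steps: int = 500, warmup: int = 20) -> float:
    """Call step_fn repeatedly and return the measured steps per second."""
    if n_steps <= 0:
        raise ValueError("n_steps must be positive")
    for _ in range(warmup):  # untimed warm-up steps (scene loading, caches)
        step_fn()
    start = time.perf_counter()
    for _ in range(n_steps):
        step_fn()
    elapsed = time.perf_counter() - start
    return n_steps / max(elapsed, 1e-9)  # guard against a zero interval


def benchmark_rotate_right(scene: str = "FloorPlan_Train1_1", n_steps: int = 500) -> float:
    """Hypothetical helper: FPS of one AI2-THOR instance taking RotateRight."""
    # Imported lazily so measure_fps can be reused without ai2thor installed.
    from ai2thor.controller import Controller

    # Scene and agent mode are assumptions; use the ones from your task.
    controller = Controller(scene=scene, agentMode="locobot")
    try:
        return measure_fps(lambda: controller.step(action="RotateRight"), n_steps)
    finally:
        controller.stop()  # always shut down the Unity process
```

Calling `benchmark_rotate_right()` outside the RL library gives a single-instance baseline. Comparing it with the FPS seen inside training should help show whether the simulator or the surrounding training loop is the bottleneck.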

Lucaweihs avatar Nov 05 '21 16:11 Lucaweihs