
[QUESTION]: Multi instance gpu rendering for multiple gpus is slower than expected

Open EsteBran opened this issue 2 years ago • 7 comments

I have a machine with multiple GPUs, and the render time per frame is short enough that using all of them for a single render is slower than using only one (I'm guessing due to read/write overhead across all 4 GPUs). However, when I use subprocess.Popen to run 4 instances of Blender, each on a separate GPU, the render time per frame goes up from ~1.3s to ~3s. Any ideas why this could be?

EsteBran avatar Jul 20 '22 21:07 EsteBran

Hey,

I suspect that each instance still uses all GPUs. Are you sure that it doesn't do that?

Best, Max

themasterlink avatar Jul 21 '22 05:07 themasterlink

I agree with Max, this is probably related to #585.

cornerfarmer avatar Jul 21 '22 13:07 cornerfarmer

I'm pretty sure each instance is running on a single GPU and that they run concurrently, since I have 4 copies of the same script, each with a different GPU specified, running in 4 processes.

EsteBran avatar Jul 21 '22 14:07 EsteBran

How did you ensure that? BlenderProc by default takes all available GPUs. I would really check, you can check with nvidia-smi.

If each script only takes one GPU, it shouldn't be slower.

PS: I think BlenderProc prints how many GPUs it uses, doesn't it?
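To make the nvidia-smi check suggested above repeatable, here is a minimal sketch that queries per-GPU utilization and lists which GPUs look busy. It assumes `nvidia-smi` is on the PATH and reports numeric values for the queried fields; the helper names and the 5% threshold are hypothetical choices, not part of BlenderProc.

```python
import subprocess

# Standard nvidia-smi query flags; output looks like "0, 97, 3050" per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,utilization.gpu,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_gpu_stats(csv_text):
    """Parse nvidia-smi CSV rows into a list of dicts (assumes numeric fields)."""
    stats = []
    for line in csv_text.strip().splitlines():
        index, util, mem = [field.strip() for field in line.split(",")]
        stats.append(
            {"index": int(index), "util_percent": int(util), "memory_mib": int(mem)}
        )
    return stats


def busy_gpus(stats, min_util_percent=5):
    """Return the indices of GPUs whose utilization is at or above the threshold."""
    return [s["index"] for s in stats if s["util_percent"] >= min_util_percent]


def query_gpu_stats():
    """Run nvidia-smi once and return the parsed per-GPU statistics."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_gpu_stats(out)
```

Calling `busy_gpus(query_gpu_stats())` while a single render script is running should list only one index if that script really is restricted to one GPU.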

themasterlink avatar Jul 21 '22 14:07 themasterlink

Hey,

So we have a solution in #630, but we are not happy with it. Our desired solution would require a significant restructuring of the init and cleanup functions, which we don't have the time for right now. So you can use the solution provided in that branch for now.

Best, Max

themasterlink avatar Jul 22 '22 07:07 themasterlink

> How did you ensure that? BlenderProc by default takes all available GPUs. I would really check, you can check with nvidia-smi.

I found a Reddit post where they were able to get one GPU per instance working. It's a bit of a clumsy implementation though, since you need multiple copies of the same script and have to change the GPU number in each one.
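One way to avoid keeping several copies of the script is to launch a single script multiple times and restrict each process to one GPU through its environment. The sketch below assumes an NVIDIA setup where `CUDA_VISIBLE_DEVICES` limits which GPUs a process can see; `render.py` and the per-GPU output directory argument are hypothetical, and this is not necessarily what the Reddit post or the BlenderProc branch does.

```python
import os
import subprocess


def build_jobs(script, gpu_ids, base_env=None):
    """Build one (command, environment) pair per GPU.

    Each environment exposes exactly one GPU via CUDA_VISIBLE_DEVICES, so the
    same script can be reused unchanged for every instance.
    """
    base_env = dict(os.environ if base_env is None else base_env)
    jobs = []
    for gpu_id in gpu_ids:
        env = dict(base_env)
        env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
        # The trailing output directory is a hypothetical script argument that
        # keeps the instances from writing to the same files.
        cmd = ["blenderproc", "run", script, f"output/gpu_{gpu_id}"]
        jobs.append((cmd, env))
    return jobs


def run_jobs(jobs):
    """Start all instances concurrently and return their exit codes."""
    procs = [subprocess.Popen(cmd, env=env) for cmd, env in jobs]
    return [proc.wait() for proc in procs]
```

For example, `run_jobs(build_jobs("render.py", range(4)))` would start four concurrent instances. Inside each process the single visible GPU then appears as device 0, so the script itself does not need to know which physical GPU it was given.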

I did test whether each instance was actually using its own GPU, rather than all instances using all of the GPUs together, and judging by the GPU usage reported by nvidia-smi and gpustat, that seems to be the case. I ended up posting something on the official Blender forums, and they had something interesting to say:

> Either way. Since you're rendering on 4 GPUs, I have a suspicion that you're using Cycles as your rendering engine. If that is the case, then you need to take this into consideration: when using Cycles, the rendering of each frame in an animation consists of three steps:
>
>   1. Scene initialization (converting the Blender scene into a Cycles scene) - This is done on the CPU.
>   2. Rendering - This is done on the GPU.
>   3. Saving of the image to disk - This is done on the CPU.
>
> After those tasks are done, Blender/Cycles moves on to the next frame. Since two of those tasks are done on the CPU, they cannot be sped up by distributing work across multiple GPUs. And since your render times are so low, I suspect the fact that those two tasks are done on the CPU is leading to you not seeing the performance uplift you want.
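The quoted breakdown can be turned into a rough model: only the GPU step shrinks when more GPUs work on one frame, while the two CPU steps stay fixed. The sketch below assumes ideal GPU scaling, which is optimistic, and the 0.5 s / 0.4 s / 0.4 s split of a ~1.3 s frame is purely hypothetical since the thread gives no per-step timings.

```python
def frame_time(cpu_init_s, gpu_render_s, cpu_save_s, num_gpus=1):
    """Per-frame wall time when only the GPU step is split across num_gpus."""
    return cpu_init_s + gpu_render_s / num_gpus + cpu_save_s


def speedup(cpu_init_s, gpu_render_s, cpu_save_s, num_gpus):
    """Speedup of one frame on num_gpus relative to a single GPU."""
    one = frame_time(cpu_init_s, gpu_render_s, cpu_save_s, 1)
    many = frame_time(cpu_init_s, gpu_render_s, cpu_save_s, num_gpus)
    return one / many


# Hypothetical split of a ~1.3 s frame: 0.5 s init, 0.4 s render, 0.4 s save.
for n in (1, 2, 4):
    print(n, round(speedup(0.5, 0.4, 0.4, n), 2))
```

With a split like this, the CPU steps dominate and adding GPUs to a single frame helps very little, which is the effect the forum answer describes.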

EsteBran avatar Jul 22 '22 15:07 EsteBran

Hey,

even though that answer is correct, we have done a lot to avoid this problem. In our case, most scenes are static and change little between renders, which keeps the CPU time quite low. We also use a setting so that this process is only done once. Our scenes are not comparable to scenes created by Blender artists, which are much more detailed and need much more CPU time than ours.

If you use the branch I linked above, you can set the GPUs used by each script and should see a speedup. Rendering one frame may still be slower in multi-GPU rendering than on a single GPU, but you will get four times the images, so overall it should still be faster. I have rendered many, many scenes with BlenderProc on many multi-GPU systems and it has always been faster in multi-GPU mode, always. If not, I really want to know which CPU is in your system and if it was produced in this century :D
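As a rough check of this argument using the numbers reported at the top of the thread (~1.3 s per frame with one instance, ~3 s per frame with four concurrent instances), the sketch below compares overall throughput. It assumes the ~3 s figure applies to each of the four instances while they all run at the same time.

```python
def images_per_second(seconds_per_frame, num_instances=1):
    """Total throughput when num_instances each finish a frame in the given time."""
    return num_instances / seconds_per_frame


single = images_per_second(1.3)    # one instance on one GPU
multi = images_per_second(3.0, 4)  # four instances, one GPU each

print(f"single: {single:.2f} img/s")    # single: 0.77 img/s
print(f"multi:  {multi:.2f} img/s")     # multi:  1.33 img/s
print(f"gain:   {multi / single:.2f}x") # gain:   1.73x
```

Under these assumptions the four instances together are faster than one, but at roughly 1.7x the gain is well short of the ideal 4x.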

Best, Max

themasterlink avatar Jul 22 '22 17:07 themasterlink

I assume this has been resolved.

themasterlink avatar Aug 25 '22 07:08 themasterlink