
BatchedMesh Example much slower on WebGPU than WebGL on Android

Open Makio64 opened this issue 1 year ago • 1 comments

Description

On Android (Samsung Galaxy S20 FE), the BatchedMesh example is much slower with WebGPU than with WebGL:

WebGPU: ~13 FPS, WebGL: ~25 FPS

Screenshot_20241007_185610_Chrome

Screenshot_20241007_185559_Chrome

Reproduction steps

  1. Load https://threejs.org/examples/?q=bat#webgpu_mesh_batch on Android
  2. Enable/disable WebGPU

Code

Live example

Screenshots

No response

Version

r169

Device

Mobile

Browser

Chrome

OS

Android

Makio64 avatar Oct 07 '24 17:10 Makio64

The multi-draw API isn't currently supported in WebGPU, which is why a single multi-draw call with 20,000 batched elements performs significantly better in WebGL, especially on smartphones.

However, there’s good news! A new MultiDrawIndirect API is on the horizon for WebGPU, which is expected to surpass the performance of the WebGL version: https://github.com/gpuweb/gpuweb/issues/1354#issuecomment-2370162949 https://issues.chromium.org/issues/369246557/dependencies

This API is already available in Chrome Canary behind the chromium-experimental-multi-draw-indirect flag, enabled through enable-unsafe-webgpu. I plan to begin working with it over the coming weeks, as multi-draw is an important part of my workflow.
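A minimal sketch of how a renderer might feature-detect this before relying on it. It assumes the flag name quoted above is also the feature string exposed on the adapter's features set, which is not confirmed in this thread; pickDrawStrategy and the two strategy names are hypothetical.

```javascript
// Assumption: the experimental feature is listed in GPUAdapter.features under
// the same name as the flag mentioned above. Not confirmed in this thread.
const MULTI_DRAW_FEATURE = 'chromium-experimental-multi-draw-indirect';

// Hypothetical helper: `adapter` only needs a set-like `features` member,
// which is what GPUAdapter.features provides.
function pickDrawStrategy(adapter) {
  if (adapter && adapter.features && adapter.features.has(MULTI_DRAW_FEATURE)) {
    return 'multi-draw-indirect';
  }
  // Fall back to one drawIndirect() call per batched element.
  return 'per-draw-indirect';
}

// In a browser, the adapter would come from:
//   const adapter = await navigator.gpu.requestAdapter();
// and the feature would also have to be listed in `requiredFeatures`
// when calling adapter.requestDevice().
```

Checking the adapter first keeps the fallback path available on browsers where the flag is off.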

In the meantime, as discussed in this PR, we can implement a workaround using multiple drawIndirect() calls with a single indirect buffer, mapped at different offsets for each draw alongside Render Bundles. This approach can mimic the upcoming MultiDrawIndirect API until it becomes widely available: https://github.com/mrdoob/three.js/pull/29197#issuecomment-2324472275
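The workaround described above can be sketched roughly as follows. This is not the three.js implementation: packIndirectArgs, encodeDraws and fakePass are hypothetical names, and the stand-in pass only records offsets so the sketch runs without a GPU. The only WebGPU call assumed is drawIndirect(indirectBuffer, indirectOffset), which reads four 32-bit unsigned integers per draw.

```javascript
// Each non-indexed indirect draw is four 32-bit unsigned integers (16 bytes):
// vertexCount, instanceCount, firstVertex, firstInstance.
const DRAW_INDIRECT_STRIDE = 4 * Uint32Array.BYTES_PER_ELEMENT;

// Pack every draw's arguments back to back into one array, ready to upload
// into a single GPUBuffer created with GPUBufferUsage.INDIRECT.
function packIndirectArgs(draws) {
  const data = new Uint32Array(draws.length * 4);
  draws.forEach((d, i) => {
    data.set([d.vertexCount, d.instanceCount, d.firstVertex, d.firstInstance], i * 4);
  });
  return data;
}

// Issue one drawIndirect() per element, stepping the byte offset each time.
// `pass` is anything exposing drawIndirect(buffer, byteOffset), such as a
// GPURenderPassEncoder or a GPURenderBundleEncoder.
function encodeDraws(pass, indirectBuffer, drawCount) {
  for (let i = 0; i < drawCount; i++) {
    pass.drawIndirect(indirectBuffer, i * DRAW_INDIRECT_STRIDE);
  }
}

// Hypothetical stand-in pass that records the offsets it is called with.
const recorded = [];
const fakePass = { drawIndirect: (buffer, offset) => recorded.push(offset) };

const args = packIndirectArgs([
  { vertexCount: 36, instanceCount: 1, firstVertex: 0, firstInstance: 0 },
  { vertexCount: 24, instanceCount: 1, firstVertex: 36, firstInstance: 0 },
]);
encodeDraws(fakePass, /* indirectBuffer */ null, 2);
```

Indexed geometry would use drawIndexedIndirect() instead, which reads five values per draw (indexCount, instanceCount, firstIndex, baseVertex, firstInstance), so the stride becomes 20 bytes. Recording the loop into a render bundle would let the encoding cost be paid once rather than every frame.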

For now, I’ll wait for @Spiri0's very promising work on implementing drawIndirect, which will provide a solid base for this: https://github.com/mrdoob/three.js/issues/29568#issuecomment-2396426753

RenaudRohlinger avatar Oct 08 '24 02:10 RenaudRohlinger

@mwyrzykowski Just a heads-up: there’s currently an issue in the official Three.js BatchedMesh WebGPU example where setting the count above 1024 causes a break in the WebGPU backend of Safari. I tested this on the latest Safari Technology Preview. https://threejs.org/examples/?q=batch#webgpu_mesh_batch

The error: [Log] GPUDeviceLostInfo {reason: "unknown", message: ""}

RenaudRohlinger avatar Nov 14 '24 09:11 RenaudRohlinger

> @mwyrzykowski Just a heads-up: there’s currently an issue in the official Three.js BatchedMesh WebGPU example where setting the count above 1024 causes a break in the WebGPU backend of Safari. I tested this on the latest Safari Technology Preview. https://threejs.org/examples/?q=batch#webgpu_mesh_batch
>
> The error: [Log] GPUDeviceLostInfo {reason: "unknown", message: ""}

Oh thank you for the report @RenaudRohlinger. Do you know which Mac you tried? I tried an M2 Mac Studio with STP 207 with 17788 instances: Screenshot 2024-11-14 at 10 13 51 AM

It might very well be Mac-related.

In any case, the performance is really bad, so at the very least I will investigate that until I can figure out how to reproduce.

mwyrzykowski avatar Nov 14 '24 18:11 mwyrzykowski

I've been working with drawIndirect since we got it in r170. It works quite well, but it would be more comfortable to use with structs:

const drawBufferStruct = struct({
   vertexCount: 'uint',
   instanceCount: 'uint',
   firstVertex: 'uint',
   firstInstance: 'uint',
});

The values can then be accessed more clearly in Fn and wgslFn:

drawBuffer.vertexCount = vertexCount;
drawBuffer.instanceCount = instanceCount;

instead of the current form:
drawBuffer.x = vertexCount;
drawBuffer.y = instanceCount;
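As a CPU-side illustration of the same idea, here is a plain JavaScript analogue. It is not the TSL struct API discussed in this comment: createDrawRecord is a hypothetical helper that gives the four indirect-draw slots named accessors over a Uint32Array, using the field order from the struct above.

```javascript
// Field order follows the struct shown above.
const FIELDS = ['vertexCount', 'instanceCount', 'firstVertex', 'firstInstance'];

// Hypothetical helper: wrap one 4-element record of a Uint32Array with
// named getters and setters, so callers never touch raw slot indices.
function createDrawRecord(data, index = 0) {
  const record = {};
  FIELDS.forEach((name, slot) => {
    Object.defineProperty(record, name, {
      get: () => data[index * FIELDS.length + slot],
      set: (value) => { data[index * FIELDS.length + slot] = value; },
      enumerable: true,
    });
  });
  return record;
}

const drawData = new Uint32Array(4);
const drawBuffer = createDrawRecord(drawData);
drawBuffer.vertexCount = 36;    // instead of drawData[0] = 36
drawBuffer.instanceCount = 100; // instead of drawData[1] = 100
```

Named fields make it harder to mix up slots such as firstVertex and firstInstance, which is the readability gain the struct proposal is after.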

This means that uniforms can be bundled efficiently on the user side, which makes them easier to handle in shaders, especially if you want to bundle a lot of different per-instance parameters into one or a few buffer arrays. I already have it working, but now I have to implement it more cleanly. Let's see if I can make it into r171. My job is currently taking a bit more of my time, but I'm just as motivated to round out the drawIndirect topic with structs so that it can be used to its full potential.

Spiri0 avatar Nov 14 '24 19:11 Spiri0

https://threejs.org/examples/?q=batch#webgpu_mesh_batch

On my M3 Pro Max, I don't crash at 20k instances in Safari, but I'm at 1 FPS, versus 120 FPS in Chrome on the same machine. @RenaudRohlinger @mwyrzykowski

Makio64 avatar Nov 15 '24 01:11 Makio64

Looks promising @Spiri0, sorry for hijacking this issue by the way. 😬

@mwyrzykowski Thanks for looking into it! I'm using a MacBook Pro M1 Max from 2021 with Safari 207 and Sequoia 15.1. image

RenaudRohlinger avatar Nov 15 '24 01:11 RenaudRohlinger

Awesome @mwyrzykowski! Performance remained stable during profiling with an instance count of 512, but when I increase it slightly (say, to around 600), I occasionally encounter this error in Safari, often right before a crash: Unhandled Promise Rejection: RangeError: Range consisting of offset and length are out of bounds. Screenshot 2024-11-15 at 10 56 49

RenaudRohlinger avatar Nov 15 '24 02:11 RenaudRohlinger

@RenaudRohlinger I have a CodePen here showing how to use the drawIndirect buffer in conjunction with compute shaders. If you feel like it, you can convert the shaders to TSL and turn it into an example, because it also shows how to use drawIndirect with storage buffers, which will practically always be the case, just like using it with compute shaders. If you don't feel like it, no problem, I will do it after the struct expansion. https://codepen.io/Spiri0/pen/PoMBvzz

With a few more buffers you can control exactly which instances should be visible and which should not, but that would be a topic for another example with structs.

P.S. sorry for hijacking this issue too 😅 But this issue already touches the drawIndirect topic so much that this can soon be made more efficient.

Spiri0 avatar Nov 16 '24 00:11 Spiri0

@RenaudRohlinger I have a question about TSL / Fn, and you know it better than I do. So far I've only used wgslFn. You also have a forum account, right? That would be a more appropriate place to discuss it than using this issue for secondary topics.

Spiri0 avatar Nov 17 '24 00:11 Spiri0

@Spiri0 Sure! https://discourse.threejs.org/u/yakuno 😊

RenaudRohlinger avatar Nov 17 '24 00:11 RenaudRohlinger