Architecture & Plans for WebGPU Support
Following up on #691 & #647 - with WebGPURenderer becoming more widely supported and compute shaders now supported in three-mesh-bvh, I wanted to lay out some of the architecture and features that a “WebGPUPathTracer” could support. This breaks things down into the architecture for getting path tracing working with compute, extra features beyond what was reasonably possible with WebGL, and separate optimizations for improving the practicality of the renderer. This issue should be considered a sort of “epic” for the overall work. We can create dedicated issues to discuss some of the subtasks if needed.
The WebGPU-specific work can go into a src/webgpu folder initially and be exported via three-gpu-pathtracer/webgpu.
Architecture
PathTracerCore
This is the upgraded, WebGPU analog to the PathTracingRenderer: the core class that path traces a given BVH-packed scene to a render target and manages blending, tiling, and render settings, manages state between render iterations, and creates intermediate buffers and uniforms for AA, randomized values, etc.
The primary architectural change from the previous iteration of the path tracer is that this will be written as WGSL compute shaders using three.js’ TSL. The basic render iteration could look like the following, based on the concept of “wavefront path tracing”:
- Update any uniforms that change per frame for AA, etc.
- Choose a “tile” of the screen to render.
- Run a compute kernel to generate camera rays into a ray queue.
- Run a kernel to intersect the scene and add a series of surface hits with material properties (or material indices) to a shade queue.
- Run a kernel to compute the fragment color from surface shading and generate the next bounce ray and shadow ray to evaluate.
- Run a kernel (or iteration of kernels) to evaluate direct light sampling / “shadow” rays or (later on) indirect light sampling using bidirectional path tracing (see #246).
All materials, etc. will now be able to be uploaded as structs rather than encoded into textures. The compute kernels can be evaluated in a loop in a generator function so the number of rays and operations can be limited per frame to reduce the performance impact.
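A rough sketch of how that generator-based iteration might look, assuming hypothetical kernel nodes and a tile list (everything here is a placeholder except the WebGPURenderer compute() call):

```js
// Hypothetical sketch only - "kernels", "tiles", and the kernel node names are
// placeholders; renderer.compute() is the existing WebGPURenderer API for
// dispatching a compute node.
function* renderIteration( renderer, kernels, tiles, maxBounces = 5 ) {

	for ( const tile of tiles ) {

		// seed camera rays for this tile into the ray queue
		renderer.compute( kernels.generateCameraRays( tile ) );

		for ( let bounce = 0; bounce < maxBounces; bounce ++ ) {

			// intersect the scene and push surface hits to the shade queue
			renderer.compute( kernels.intersect );

			// shade hits, enqueueing the next bounce ray and a shadow ray
			renderer.compute( kernels.shade );

			// evaluate direct light sampling / shadow rays
			renderer.compute( kernels.sampleLights );

			// yield so the caller can cap the amount of work done per frame
			yield;

		}

	}

}
```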
WebGPUPathTracer
This is the user-facing class for creating a path tracer from a WebGPURenderer. It can write to an individual render target or canvas and initializes the core path tracer from a passed scene, camera, lighting, etc. It is based on the WebGLPathTracer class.
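Usage could mirror the existing WebGLPathTracer API. A hypothetical sketch (neither the class nor the three-gpu-pathtracer/webgpu entry point exists yet):

```js
import * as THREE from 'three';
import { WebGPURenderer } from 'three/webgpu';
// hypothetical entry point once the class lands
import { WebGPUPathTracer } from 'three-gpu-pathtracer/webgpu';

const renderer = new WebGPURenderer( { antialias: true } );
await renderer.init();
document.body.appendChild( renderer.domElement );

const scene = new THREE.Scene();
const camera = new THREE.PerspectiveCamera();

// mirror the WebGLPathTracer API: hand the path tracer the renderer and
// scene, then accumulate one sample per frame
const pathTracer = new WebGPUPathTracer( renderer );
pathTracer.setScene( scene, camera );

renderer.setAnimationLoop( () => {

	pathTracer.renderSample();

} );
```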
Optimizations
Compressed-wide BVH
A common optimization for GPU ray tracing is to convert a typical BVH into a “Compressed Wide BVH”. This optimization should be added to three-mesh-bvh (see https://github.com/gkjohnson/three-mesh-bvh/issues/519).
Top Level Acceleration Structure
This requires creating a “scene level” spatial data structure and then writing all geometry data, BVH data, offsets, etc. to a single buffer.
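A minimal sketch of what packing each mesh's BVH and geometry into one contiguous buffer with a per-mesh offset table could look like; the input shape and field names below are assumptions, not an existing API:

```js
// Illustrative sketch: pack every mesh's BVH and geometry ArrayBuffers into
// one contiguous buffer and record where each mesh's data starts.
function packScene( meshesWithBVH ) {

	const offsets = [];
	let byteLength = 0;

	// first pass: compute where each mesh's data will land
	for ( const { bvhBuffer, geometryBuffer } of meshesWithBVH ) {

		offsets.push( {
			bvhOffset: byteLength,
			geometryOffset: byteLength + bvhBuffer.byteLength,
		} );
		byteLength += bvhBuffer.byteLength + geometryBuffer.byteLength;

	}

	// second pass: copy everything into a single contiguous buffer
	const packed = new Uint8Array( byteLength );
	meshesWithBVH.forEach( ( { bvhBuffer, geometryBuffer }, i ) => {

		packed.set( new Uint8Array( bvhBuffer ), offsets[ i ].bvhOffset );
		packed.set( new Uint8Array( geometryBuffer ), offsets[ i ].geometryOffset );

	} );

	return { packed, offsets };

}
```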
Additional Features
Denoising
The OIDN denoiser was experimented with for WebGLPathTracer but proved to be fairly complicated and brittle due to having to juggle conflicting WebGL state. WebGPU should make adding support for the OIDN denoiser much more approachable. The denoiser will need specular / view-dependent data separated from non-view-dependent data, etc., so the path tracer will likely need to be able to write and blend to separate buffers.
Variance Detection
While rendering, it should be possible to know whether a pixel has “finished” rendering, i.e. converged, by detecting how much the pixel continues to change with new samples (see #198).
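As a rough illustration of the idea, per-pixel convergence could be tracked with an online mean / variance estimate (Welford's algorithm). The threshold and the CPU-side form below are placeholders for what would really run in a compute kernel over a per-pixel stats buffer:

```js
// stats starts as { count: 0, mean: 0, m2: 0, converged: false } per pixel;
// newLuminance is the luminance of the latest sample for that pixel.
function updatePixelStats( stats, newLuminance ) {

	// Welford's online update of mean and sum of squared differences
	stats.count ++;
	const delta = newLuminance - stats.mean;
	stats.mean += delta / stats.count;
	stats.m2 += delta * ( newLuminance - stats.mean );

	const variance = stats.count > 1 ? stats.m2 / ( stats.count - 1 ) : Infinity;

	// consider the pixel converged once its relative variance drops below
	// some small (arbitrary) threshold
	stats.converged = variance / ( stats.mean * stats.mean + 1e-6 ) < 1e-4;

}
```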
Post Processing Effects
Effects like lens flares, bloom, etc. should all be done with post-processing effects. The built-in three.js effects may be sufficient for this, or new ones can be added to the three.js project.
HDR Support
High dynamic range output is supported by WebGPU canvases and can be rendered to directly by the path tracer.
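For reference, a sketch of what HDR canvas configuration looks like with the raw WebGPU API; how this gets surfaced through WebGPURenderer is still an open question here:

```js
const canvas = document.createElement( 'canvas' );
const context = canvas.getContext( 'webgpu' );

const adapter = await navigator.gpu.requestAdapter();
const device = await adapter.requestDevice();

context.configure( {
	device,
	format: 'rgba16float',             // float format so values above 1.0 survive
	toneMapping: { mode: 'extended' }, // opt in to the display's extended HDR range
	colorSpace: 'display-p3',
} );
```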
Custom Materials
The WebGLPathTracer only supports the basic “MeshStandardMaterial”, which means custom material generation, texture blending, etc. are not supported. With the TSL node system it may be possible to generate WGSL for different material paths in the shading kernel.
Other Thoughts
- Sort rays in the queue in some way to improve thread utilization (see the toy sketch after this list).
- Architect the ray queue so rays can be continuously added to the queue up to a certain sample count.
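One possible form the ray sorting could take is grouping queued hits by material so threads in a workgroup shade similar materials together. The CPU-side form below is only a toy illustration; in practice this would be a GPU sort over the shade queue:

```js
// Toy illustration: sort the queued surface hits by material index so
// consecutive threads shade similar materials and diverge less.
function sortShadeQueue( shadeQueue ) {

	return shadeQueue.slice().sort( ( a, b ) => a.materialIndex - b.materialIndex );

}
```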
I think a good first step would be to create the PathTracerCore and WebGPUPathTracer structure with support for a basic Lambertian material model to ensure everything is working well before moving on to support environment maps, etc.
cc @theblek
Looks great!
Do you think it would be useful to first write a megakernel WebGPU path tracer before proceeding to a wavefront one? I personally think this would be a great apples-to-apples comparison of WebGPU and WebGL in terms of compute performance. After that, we could cleanly investigate the benefits of wavefront path tracers.
> This requires creating a “scene level” spatial data structure and then writing all geometry data, BVH data, offsets, etc. to a single buffer.
This makes me think about the memory requirements that the path tracer will have. The division of meshes from the TLAS would theoretically allow us to load and unload meshes on demand. But that is very far-fetched and should be low priority until the actual memory consumption is measured.
> this will be written as WGSL compute shaders using three.js’ TSL
Do you think it would be best to use TSL or WGSL directly? In three-mesh-bvh we used wgslFn to create nodes from WGSL code. I wonder if using TSL would make it easier to integrate custom materials later on.
> I personally think this would be a great apples-to-apples comparison of WebGPU and WebGL in terms of compute performance.
The recent compute additions to three-mesh-bvh already show this (see the WebGPU raytrace example vs the WebGL one), so I'm not sure there's a lot of added value in comparing a WebGPU vs a WebGL path tracer. For reference, I'm seeing an almost 30% improvement in framerate in the above example when switching to WebGPU (~78 fps to ~100 fps).
There may be value in comparing the performance of a WebGPU megakernel path tracer vs a WebGPU wavefront path tracer, though, if only to ensure that the wavefront version is actually "winning" in terms of performance in the way we expect. There are enough other benefits to the wavefront approach that I think it's worth pursuing even if it's not strictly faster, but we should make sure we're not tanking performance with it, either.
I would still just limit this to a simple, Lambertian path tracer, though, until the wavefront architecture is designed and proven compared to the megakernel. Then we can make decisions about what to do next once the performance and architecture are more clear.
> The division of meshes from the TLAS would theoretically allow us to load and unload meshes on demand
There are quite a few benefits, I think - the biggest one is that reused or "instanced" geometry becomes significantly cheaper to both compute and use because you no longer have to duplicate all the geometry in a merged mesh.
> Do you think it would be best to use TSL or WGSL directly?
WGSL is significantly more readable to me, so I'd prefer to stick with wgslFn for now. I think there's still a lot of investigation to do regarding how we might repurpose custom three.js node materials for use in the path tracer, but that can come later.
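For reference, a small example of the wgslFn approach used in three-mesh-bvh, wrapping raw WGSL in a TSL node; the function body here is only illustrative:

```js
import { wgslFn } from 'three/tsl';

// wrap a raw WGSL function in a TSL node
const luminance = wgslFn( /* wgsl */ `
	fn luminance( color: vec3f ) -> f32 {

		return dot( color, vec3f( 0.2126, 0.7152, 0.0722 ) );

	}
` );

// the resulting node function can then be called from other TSL code,
// e.g. const lum = luminance( { color: someColorNode } );
```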