three-gpu-pathtracer

Wavefront pathtracer

Open TheBlek opened this issue 4 months ago • 5 comments

While doing research for #547 I stumbled upon wavefront pathtracers. According to NVIDIA's paper, this approach can speed up pathtracing by 36-220% depending on the scene: https://research.nvidia.com/sites/default/files/pubs/2013-07_Megakernels-Considered-Harmful/laine2013hpg_paper.pdf.

Implementing such a pathtracer without compute shaders would be hard. But WebGPU support for three-mesh-bvh was recently merged in and should be available in the next release.

While #547 is not required for this, I think it would be best to do a megakernel pathtracer in WebGPU first to properly measure and compare performance.

Is this something the project is interested in?

TheBlek avatar Sep 10 '25 02:09 TheBlek

Thanks for sharing this - your link is leading to a 404, but I read this blog post yesterday about the concept (not loading for me now, for some reason, but you can see it in the wayback machine). To summarize, it sounds like the path tracing process is broken up into several steps so that separate compute kernels can be used (screen ray generation, ray trace, shade, shadow, repeat), reducing the number of threads that are occupied by "useless" work. I'm wondering if these sets of "hits" or "rays" could also be sorted by spatial location, trace direction, or material type to help cut down on branching in the wavefronts where possible.
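To make the stage split concrete, here is a minimal CPU-side sketch of the loop described above. It is not taken from the paper or from this project: every name (`PathState`, `runWavefront`, and so on) is hypothetical, the "trace" step is a hash standing in for a real BVH intersection, and the shadow stage is left out for brevity. It only illustrates how separate stages operate on a shrinking queue, with a sort by material before shading.

```typescript
// Hypothetical CPU-side model of a wavefront loop; not a real GPU implementation.
interface PathState {
  pixel: number;      // pixel this path contributes to
  bounce: number;     // bounces taken so far
  materialId: number; // material hit by the last trace, -1 means a miss
  alive: boolean;     // false once the path has terminated
}

// Stage: generate one primary path per pixel.
function generate(pixelCount: number): PathState[] {
  const paths: PathState[] = [];
  for (let pixel = 0; pixel < pixelCount; pixel++) {
    paths.push({ pixel, bounce: 0, materialId: -1, alive: true });
  }
  return paths;
}

// Stage: trace. A real kernel would intersect the scene; this stand-in
// derives a material (or a miss) from a simple hash of pixel and bounce.
function trace(paths: PathState[], materialCount: number): void {
  for (const p of paths) {
    const h = (p.pixel * 31 + p.bounce * 17) % (materialCount + 1);
    p.materialId = h === materialCount ? -1 : h;
  }
}

// Optional stage: group hits by material so neighbouring work items
// take the same shading branch.
function sortByMaterial(paths: PathState[]): void {
  paths.sort((a, b) => a.materialId - b.materialId);
}

// Stage: shade. Misses and paths that reached the bounce limit terminate.
function shade(paths: PathState[], maxBounces: number): void {
  for (const p of paths) {
    p.bounce += 1;
    if (p.materialId < 0 || p.bounce >= maxBounces) p.alive = false;
  }
}

// Driver: repeat trace -> sort -> shade, compacting away dead paths so
// later iterations only touch useful work. Returns the queue size per pass.
function runWavefront(pixelCount: number, materialCount: number, maxBounces: number): number[] {
  let queue = generate(pixelCount);
  const queueSizes: number[] = [];
  while (queue.length > 0) {
    queueSizes.push(queue.length);
    trace(queue, materialCount);
    sortByMaterial(queue);
    shade(queue, maxBounces);
    queue = queue.filter((p) => p.alive); // compaction
  }
  return queueSizes;
}
```

The returned queue sizes never grow from one pass to the next, which is the point of the compaction step: terminated paths stop occupying work items.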

I hadn't heard of this concept before, but I'd had some thoughts about doing something similar in WebGL for different reasons: one of the big usability issues of the WebGL version of the path tracer was how it tanked the framerate on most devices, even when tiling the work. The solution would have been to use MRT in WebGL to store these kinds of "intermediate ray / hit" states by packing them into various textures so one full "path" could be traced over multiple frames if needed, reducing the framerate hiccups (and ultimately browser / OS lockups, since it blocks anything else from being done on the GPU).
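As a rough illustration of tracing one path over multiple frames, the sketch below keeps per-path state between calls and advances it by a fixed bounce budget each frame. It is an assumption-laden model rather than the project's design: `SlicedTracer` and its fields are hypothetical, the state lives in a plain array where a WebGL version would pack it into MRT textures, and the 0.5 attenuation per bounce is a placeholder for real shading.

```typescript
// Hypothetical per-path state that survives between frames.
interface PersistedPath {
  pixel: number;      // pixel the path writes to when it terminates
  bounce: number;     // bounces completed so far
  throughput: number; // accumulated attenuation along the path
  done: boolean;      // true once the result has been written
}

class SlicedTracer {
  readonly image: Float32Array;
  private paths: PersistedPath[];
  private maxBounces: number;

  constructor(pixelCount: number, maxBounces: number) {
    this.image = new Float32Array(pixelCount);
    this.maxBounces = maxBounces;
    this.paths = [];
    for (let pixel = 0; pixel < pixelCount; pixel++) {
      this.paths.push({ pixel, bounce: 0, throughput: 1, done: false });
    }
  }

  // Advance every unfinished path by at most `bouncesPerFrame` bounces.
  // Returns true once all paths have terminated.
  step(bouncesPerFrame: number): boolean {
    let remaining = 0;
    for (const p of this.paths) {
      for (let i = 0; i < bouncesPerFrame && !p.done; i++) {
        p.throughput *= 0.5; // placeholder for a real BSDF evaluation
        p.bounce += 1;
        if (p.bounce >= this.maxBounces) {
          this.image[p.pixel] += p.throughput; // write the finished sample
          p.done = true;
        }
      }
      if (!p.done) remaining += 1;
    }
    return remaining === 0;
  }
}
```

Calling `step(1)` once per animation frame spreads a four-bounce path over four frames, so each frame does a bounded amount of GPU-equivalent work.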

This approach would bring these benefits and more, so I'm all for it. If you're interested in helping with this and starting off a new WebGPUPathTracer, that would be great. If it's okay, I'll put together an issue with a list of steps and subwork required as well as some thoughts on architecture. I think there's a lot of potential here, and compute shading should prove to be a significant improvement over the WebGL fragment shader.

gkjohnson avatar Sep 11 '25 07:09 gkjohnson

> your link is leading to a 404

Oops, sorry for that. Fixed the link.

> I read this blog post yesterday about the concept

Yeah, this is also where I got the idea.

> The solution would have been to use MRT in WebGL to store these kinds of "intermediate ray / hit" states by packing it into various textures so one full "path" could be traced over multiple frames if needed, reducing the framerate hiccups

Oh, that is an interesting idea. Path tracers already use temporal AA, so that fits nicely. I'm thinking that with the wavefront architecture we could continuously feed the GPU new work and collect whatever is already finished, which would be a kind of async rendering. However, the task queue could grow large, the image would have a noticeable input delay, and results from stale rays could smear the image if the camera is moving, so this would need to be managed somehow.

> If it's okay I'll put together an issue with a list of steps and subwork required as well as some thoughts on architecture.

That would be awesome!

TheBlek avatar Sep 11 '25 08:09 TheBlek

> I'm thinking with the wavefront architecture we could feed the gpu with new work and take what's already done continuously and it would be kind of async rendering.

Can you elaborate on this? All the work in the path tracer is sequential (and GPU blocking, since the GPU is generating the data for its next step), so I'm wondering what we might be able to parallelize 🤔. But fundamentally it's nice that this kind of architecture lets us slice the work a bit and push further processing to the next frame, which should let the rest of the browser continue to run at interactive framerates more easily.

> results from stale rays could smear the image if camera is moving

Are you imagining a dynamic path traced scene? The current implementation is more or less designed for a static camera / objects, but with TRAA and denoising it may be possible to get some rendering that feels somewhat dynamic, though view-dependent rendering like specular can result in ghosting. I believe applications like Blender separate specular from diffuse rays to composite later for a variety of reasons, so it may be worth looking into how to do this as well.

> That would be awesome!

I'll put something together in the next week 🙏

gkjohnson avatar Sep 12 '25 00:09 gkjohnson

> Can you elaborate on this? All the work in the path tracer is sequential (and GPU blocking since the GPU is generating the data for its next step) so I'm wondering what we might be able to parallelize

It won't be parallelization but rather a kind of async computation. A wavefront pathtracer traces rays in a loop. For the previous frame, we could run only a couple of iterations, not all the way to the end, before starting to trace rays for the current frame. For each frame I thought we could add the generated screen rays to the same buffer that holds the rays to trace, to keep the kernels running, and then take terminated rays and integrate the resulting color into the current image.

But now that I think of it, this is not very useful and not really achievable either. We could just create multiple buffers and schedule pathtracing for the current frame before the last one is finished. This is much simpler and still keeps the GPU busy. Crazy ideas 🤷‍♂️
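The multiple-buffer idea can be sketched as plain bookkeeping, independent of any GPU API. The class below is hypothetical (`FrameBufferPool`, `acquire`, and `release` are illustrative names, not part of three-gpu-pathtracer or WebGPU): it assumes each in-flight frame owns one ray buffer and that a new frame can start as soon as any buffer is free.

```typescript
// Hypothetical pool that hands out ray buffers to in-flight frames.
class FrameBufferPool {
  // owners[slot] holds the frame currently using that buffer, or null if free.
  private owners: (number | null)[];

  constructor(bufferCount: number) {
    this.owners = new Array<number | null>(bufferCount).fill(null);
  }

  // Reserve a free buffer for `frame`. Returns the slot index, or null
  // when every buffer is still in use and the frame has to wait.
  acquire(frame: number): number | null {
    const slot = this.owners.indexOf(null);
    if (slot === -1) return null;
    this.owners[slot] = frame;
    return slot;
  }

  // Mark the buffer used by `frame` as free once its work has completed.
  release(frame: number): void {
    const slot = this.owners.indexOf(frame);
    if (slot !== -1) this.owners[slot] = null;
  }

  // Number of frames currently in flight.
  inFlight(): number {
    return this.owners.filter((owner) => owner !== null).length;
  }
}
```

With two buffers, frame 1 can be scheduled while frame 0 is still tracing, and frame 2 waits until one of them releases its buffer. That caps the queue growth and input delay mentioned earlier in the thread.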

TheBlek avatar Sep 12 '25 07:09 TheBlek

Crazy ideas are always a good place to start! 😁

I've created #692 to discuss more concrete details. We can create dedicated issues (or continue in this one regarding wavefront PT, for example) and link them back to that one if there are more in-depth discussions that need to be had around specifics.

gkjohnson avatar Sep 16 '25 10:09 gkjohnson