Mesh Shaders Tracking Issue

Open inner-daemons opened this issue 10 months ago • 18 comments

Replaces #3018.

Spec

Progress

Current open PR(s) and other work

  • #8456
  • #8481 by @Slightlyclueless
  • #8493
  • #8507

naga

  • [x] Add mesh, task shaders to naga shader types so wgpu-hal can stop pretending they are compute shaders - #7292
  • [x] Support in WGSL frontend - #8370
  • [ ] https://github.com/gfx-rs/wgpu/issues/8517
  • [ ] Validation - #8507
  • [ ] Support in SPIRV writer - #8456
  • [ ] Support in HLSL writer - taken on by @luc-rodriguez
  • [ ] Support in MSL writer - #8493
  • [ ] Support in WGSL regurgitator - #8481
  • [ ] Support in SPIR-V frontend
  • [ ] Reflection info for pipeline validation - #8507
  • [x] #8360

wgpu-hal backends

  • [x] Vulkan - #7089
  • [x] DX12 - #7219 (#8110) - still needs naga support
  • [x] Metal - #8139 - naga PR separate

Other

  • [x] Formal spec - #7885
  • [x] wgpu API - #7345
  • [x] wgpu-core pipeline validation - #7345 (?)
  • [ ] #8003 - #8507
  • [x] General tests - #7345
  • [ ] #8517
  • [x] Examples - #7345
  • [ ] #8343

Features

  • [x] #7262
  • [ ] Primitive ID builtin - not supported in wgpu in general yet - #8236 required
  • [ ] Queries - #8523 (unplanned)
  • [ ] Point primitives (at least in Vulkan, probably its own feature)
  • [ ] Finalize limits - #8003 - #8507

Current Priorities

  • [x] Multiview in wgpu-hal - very simple - #7278
  • [x] Naga types - needed for wgpu-hal completeness (current code pretends they are compute shaders) - #7292
  • [x] Formal spec created and added to trunk - #7885
  • [x] Experimental wgpu API - blocking many things - #7345
  • [x] Naga WGSL frontend - needed for proper wgpu implementation - #8370
  • [ ] Naga SPIRV backend - needed for proper wgpu implementation - #8456
  • [x] DX12 in wgpu-hal - desired - #8110
  • [ ] Naga reflection info, needed for proper wgpu implementation - #8507
  • [x] Lots of testing for wgpu - partly in #7345

Issues to work out

  • How to implement multiview, what features, etc
  • Queries

inner-daemons avatar Feb 22 '25 19:02 inner-daemons

@cwfitzgerald I’ve been thinking, we probably ought to have some way to indicate the “payload” for task and mesh shaders. I’m updating the original comment to add an attribute, @payload(some global variable). This would also probably mean we don’t need a new storage class. Let me know if you think there is a better option.

inner-daemons avatar Feb 26 '25 16:02 inner-daemons

Sounds fine - one thing to keep in mind is that we want the IR to be compatible enough with how SPIR-V does stuff, such that spirv-in can eventually support mesh shaders. I don't know specifics about how mesh shaders work in SPIR-V, but we should be careful not to require things that are too hard to implement in that frontend. We already require heroics in the SPIR-V frontend for atomics.

cwfitzgerald avatar Feb 27 '25 01:02 cwfitzgerald

@cwfitzgerald Thanks for the response. I'll try to keep a list of things that might be hard to implement in certain environments.

inner-daemons avatar Feb 27 '25 01:02 inner-daemons

@cwfitzgerald Should we wait for MVP before creating a wgpu API? Or should we add a wgpu API before it works properly/is validated? I think working on it right now would make tests, examples, etc far easier, as currently those live in my own branch.

inner-daemons avatar Mar 04 '25 14:03 inner-daemons

I think working on it right now would make tests, examples, etc far easier, as currently those live in my own branch.

10000% - We can add the feature to Features with the experimental prefix (so EXPERIMENTAL_MESH_SHADERS; look at raytracing for something to model this off of) so that people know it's incomplete and may have validation issues. The sooner we can get CI running mesh shaders, the better!
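As a rough illustration of the experimental-flag gating suggested here, the sketch below rejects mesh pipeline creation unless the feature was requested. It is a self-contained model: the `Features` struct, `PipelineError`, and `try_create_mesh_pipeline` are hypothetical stand-ins, not wgpu's actual types or functions. Only the flag name comes from the comment above.

```rust
/// Hypothetical stand-in for a feature bit set; NOT wgpu's real `Features` type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Features(u64);

impl Features {
    /// Flag name taken from the discussion; the bit value is arbitrary.
    const EXPERIMENTAL_MESH_SHADERS: Features = Features(1 << 0);

    /// A feature set with nothing enabled.
    const fn empty() -> Self {
        Features(0)
    }

    /// True if every bit in `other` is also set in `self`.
    fn contains(self, other: Features) -> bool {
        (self.0 & other.0) == other.0
    }
}

#[derive(Debug, PartialEq, Eq)]
enum PipelineError {
    MissingFeature(&'static str),
}

/// Hypothetical creation path: refuse mesh pipelines unless the device opted in.
fn try_create_mesh_pipeline(enabled: Features) -> Result<(), PipelineError> {
    if !enabled.contains(Features::EXPERIMENTAL_MESH_SHADERS) {
        return Err(PipelineError::MissingFeature("EXPERIMENTAL_MESH_SHADERS"));
    }
    Ok(())
}
```

Calling `try_create_mesh_pipeline(Features::empty())` returns the `MissingFeature` error, so users who have not opted in fail early instead of hitting incomplete validation later.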

As a separate PR, it also would be good to add the spec as you laid out in the last issue into the repo under docs/specs and linked in the README (you can also look to how RT did this).

cwfitzgerald avatar Mar 04 '25 21:03 cwfitzgerald

@cwfitzgerald What's the guidance on creating tests that require spirv-passthrough? Should I just include the spirv binaries, or some assembly to go along with them, or the original source code?

On an unrelated note, currently I am going with a unified create_render_pipeline in wgpu_core, but separated in wgpu and wgpu-hal. Do you think we should switch to a unified approach in wgpu-hal? There is a decent amount of code duplication. Also, my current approach simply overhauled the pipeline creation APIs, but I know wgpu-native uses wgpu-core, and it also broke the player tests. It should be fairly simple to refactor this to have the logic for pipeline creation stored in a single function and then a function to create each of the two types of pipelines, which I suspect is what you'll recommend.
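The "single function plus two thin wrappers" refactor described here could look roughly like the sketch below. Every type and function name in it is hypothetical and chosen for illustration; none of it is wgpu-core's or wgpu-hal's actual API, and the only "validation" shown is a placeholder check for empty entry point names.

```rust
/// Hypothetical descriptor for a classic vertex-based pipeline.
struct VertexPipelineDesc {
    vertex_entry: String,
    fragment_entry: Option<String>,
}

/// Hypothetical descriptor for a mesh-shader pipeline (task stage optional).
struct MeshPipelineDesc {
    task_entry: Option<String>,
    mesh_entry: String,
    fragment_entry: Option<String>,
}

/// The only part that differs between the two pipeline kinds.
enum GeometryStages {
    Vertex { entry: String },
    Mesh { task: Option<String>, mesh: String },
}

#[derive(Debug, PartialEq, Eq)]
struct Pipeline {
    stage_entries: Vec<String>,
}

#[derive(Debug, PartialEq, Eq)]
enum CreateError {
    EmptyEntryPoint,
}

/// Shared logic lives here once; both public functions funnel into it.
fn create_pipeline_inner(
    stages: GeometryStages,
    fragment: Option<String>,
) -> Result<Pipeline, CreateError> {
    let mut stage_entries = Vec::new();
    match stages {
        GeometryStages::Vertex { entry } => stage_entries.push(entry),
        GeometryStages::Mesh { task, mesh } => {
            stage_entries.extend(task); // pushes the task entry only if present
            stage_entries.push(mesh);
        }
    }
    stage_entries.extend(fragment);
    // Placeholder for the shared validation both pipeline kinds need.
    if stage_entries.iter().any(|name| name.is_empty()) {
        return Err(CreateError::EmptyEntryPoint);
    }
    Ok(Pipeline { stage_entries })
}

fn create_render_pipeline(desc: VertexPipelineDesc) -> Result<Pipeline, CreateError> {
    let stages = GeometryStages::Vertex { entry: desc.vertex_entry };
    create_pipeline_inner(stages, desc.fragment_entry)
}

fn create_mesh_pipeline(desc: MeshPipelineDesc) -> Result<Pipeline, CreateError> {
    let stages = GeometryStages::Mesh { task: desc.task_entry, mesh: desc.mesh_entry };
    create_pipeline_inner(stages, desc.fragment_entry)
}
```

With this shape, existing callers of the vertex path keep their signature, which is the property that matters for downstream users such as wgpu-native and the player tests.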

Thanks!

inner-daemons avatar Mar 07 '25 05:03 inner-daemons

@cwfitzgerald More questions! No rush to answer of course. But I worry that if we extend the WGSL language too much, IDE support will completely fail, and so developers will have a rough time. For example, with the functions that must be "generic", they could be called outside of the main entry point, at which point it may not be clear which function should use which vertex/primitive output type. The way we'd work with that in naga is just detecting which entry point a given function is called in. But, and this is really a rather small issue, IDEs might not have such a good way of detecting this. But beyond IDEs, this can cause confusing issues, like changing one function invalidating another.

A similar point is that only one payload is allowed per entry point, so this must also be cleverly determined.

I can't really think of any desirable workarounds for this, since generics are pretty much out of the question for now.
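To make the "detect which entry point a function is called in" idea concrete, here is a minimal sketch that walks a call graph from each entry point and reports which ones can reach a given helper. The graph representation, the function names, and the sample shader names are all assumptions for illustration; this is not how naga stores or analyzes functions.

```rust
use std::collections::{HashMap, HashSet};

/// Returns the entry points from which `target` is reachable via the call graph.
/// `calls` maps each function name to the functions it calls directly.
fn entry_points_reaching<'a>(
    calls: &HashMap<&'a str, Vec<&'a str>>,
    entry_points: &[&'a str],
    target: &str,
) -> Vec<&'a str> {
    let mut reaching = Vec::new();
    for &entry in entry_points {
        // Depth-first search starting at this entry point.
        let mut stack = vec![entry];
        let mut seen = HashSet::new();
        while let Some(func) = stack.pop() {
            if !seen.insert(func) {
                continue; // already visited
            }
            if func == target {
                reaching.push(entry);
                break;
            }
            if let Some(callees) = calls.get(func) {
                stack.extend(callees.iter().copied());
            }
        }
    }
    reaching
}

/// Hypothetical module: two mesh entry points share the helper `emit_vertex`.
fn sample_call_graph() -> HashMap<&'static str, Vec<&'static str>> {
    HashMap::from([
        ("ms_a", vec!["emit_vertex"]),
        ("ms_b", vec!["helper"]),
        ("helper", vec!["emit_vertex"]),
        ("ts_main", vec![]),
    ])
}
```

If a helper is reachable from more than one entry point and those entry points disagree on output type or payload, that is the ambiguous case described above: editing one entry point can invalidate a function that another one relies on.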

Like I said, not a super huge issue, no rush for responses, I haven't even come up with a spec yet much less started implementing anything for WGSL.

inner-daemons avatar Mar 12 '25 20:03 inner-daemons

Update for anyone following at home: there is only one PR until full support + testing + examples in wgpu. I will have a break in roughly a week (the week of March 24th), and I expect this PR to be mostly completed by that point (by that I mean the PR will be open and the remaining issues will be from reviews; it may still take longer to merge depending on wgpu maintenance). Currently the only issues that need to be resolved are with limits, and validation relating to them.

So, #7345 will be the PR to watch for now.

After that, the plan for naga is roughly

  • Put together a spec for the WGSL changes
  • Implement a WGSL parser & SPIR-V writer.
  • Implement a SPIR-V parser (this should solidify the IR representation more than anything)
  • Implement reflection for shaders
  • Add support for naga shaders and validation to wgpu-core
  • Implement more naga validation. This will likely need more flags for in-shader correctness checks
  • Implement naga tests & wgpu tests relying on naga
  • Implement naga writer for HLSL
  • Work on DirectX implementation
  • Then maybe look at Metal/MSL. I may get a MacBook Air sometime soon, in which case this could happen sooner.

In particular, I also plan on getting a spec created by or during my break, as well as starting work on the core naga changes.

inner-daemons avatar Mar 17 '25 04:03 inner-daemons

But I worry that if we extend the WGSL language too much, IDE support will completely fail, and so developers will have a rough time.

I wouldn't worry too much about this, there are already various static-use rules within wgsl that need to be minded.

Helper functions writing to payloads is an interesting conundrum. I don't have a great answer offhand.

cwfitzgerald avatar Mar 18 '25 19:03 cwfitzgerald

Update: I haven't accomplished anything since my last comment but I just dusted off and finished #7345. I plan on devoting more time over these next few weeks to implementing mesh shaders. Once I finish adding support for DX12 I'm gonna write up the WGSL extension spec and then start working with naga, which should be the hardest part.

Also, as I will be going to college soon, I may get a Mac laptop. If this is the case, I may work on a metal implementation. Don't hold your breath though!

If you have a good use case for this or want to do some testing, or want to request a feature, let me know :)

inner-daemons avatar Jul 06 '25 02:07 inner-daemons

Update to anyone following along, hopefully this isn't too spammy:

I got the spec landed in #7885, and finalized #7345 which should in turn finalize the wgpu api, with the exception of limits that will change (and corresponding validation). This PR is a little big though, so it will probably take some time to get reviewed! I've been working hard on #7930, which is likely to be the biggest/most complex PR of the mesh shader related ones, and can currently parse all of the related WGSL syntax, validate it, and write the SPIR-V output for task shaders at least. The outputted task shader passes spirv-val, and is correct as far as I (and ChatGPT) can tell, though I haven't tested it in a real pipeline, as I'm waiting for #7345 to land on trunk. I hope to finalize mesh shader writing for SPIR-V tomorrow or at least in the coming week. Then that PR will be good for review.

Also, I will be getting a Mac (!!) so Metal support is likely to come at some point (though it isn't a priority by any means, and I have never worked with Metal). I'm likely to work on DirectX support soon as well, though I have also never written much DirectX code so I might have to familiarize myself with the API first.

It's looking like mesh shaders will be fully supported in Vulkan very soon, and other backends not long after. I'm fairly confident I can finish all of this in time for WGPU 27.0!

inner-daemons avatar Jul 13 '25 03:07 inner-daemons

Ok this probably is starting to cross the line of spammy. But I just wanted to let everyone know, I got naga generated task and mesh shaders working with wgpu mesh shaders api, no validation errors in vulkan layers or spirv-val! Unfortunately I haven't written the code to allow fragment shaders in naga to read from per primitive inputs, so the fragment shader is written in GLSL. My model for how cull_primitive builtin would work also is incorrect, so I had to remove that. But it still involves every other feature working exactly as intended!

The wgpu-api PR is like 2.7k lines long so will take a long time to review. The naga PR is like 5k lines long, so I still will have to split it up, fix various things, and so on, not to even mention the time it will take to get each PR reviewed and merged. So I wouldn't expect complete mesh shader support to be landing on trunk anytime within the next month, and I have worked like 12-15 hours a day these past 2 days, and more today of course, so I will probably slow down now, but I will still try my darnedest to get it out for the next release!

inner-daemons avatar Jul 14 '25 19:07 inner-daemons

Ok this probably is starting to cross the line of spammy.

No please, this is great! If someone is bothered by it, they can always unsubscribe to the issue.

I have worked like 12-15 hours a day these past 2 days, and more today of course, so I will probably slow down now, but I will still try my darnedest to get it out for the next release!

This is absolute heroism, definitely take a break and recoup, you've earned it!

cwfitzgerald avatar Jul 14 '25 20:07 cwfitzgerald

Going to ping @cwfitzgerald for advice here:

Due to an issue I should've seen coming, we will have to write to a temporary output buffer before copying that into the final one on function return (for spirv). Right now, that means that the user will have to write their output type, copy it into a temporary buffer, and then have that temporary buffer copied into the final destination.

I'm concerned that all of this copying will result in larger shader sizes and significantly worse performance. So I'm thinking of either limiting output writing to the entry point function, exposing the output's underlying (or temporary) array, or even making the max output size be tied to the output type and not the entry point (see below explanation of the problem).

Perhaps worth noting is that if we only consider single entry point outputs these issues go away.

Another thing worth noting is that while the issue is there and worth addressing, it's unlikely to be an issue on most systems. This is because the output array for certain builtins must have length exactly equal to the max vertices described in the entry point. But currently, the output array has length of the maximum of the max count of entry points with the same output type. Therefore, I suspect many systems would work fine even with the issue exposed, and the issue would only even be exposed when the user has multiple mesh shader entry points with the same output types but different max output sizes in the same output spirv.
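The sizing rule in the paragraph above can be sketched as follows: the shared output array gets the maximum of the per-entry-point maxima, and any entry point whose own maximum is smaller ends up with an array longer than it declared. The struct and function names are hypothetical and only model the arithmetic; they do not reflect the naga IR.

```rust
use std::collections::HashMap;

/// Hypothetical summary of a mesh entry point, just enough for the sizing rule.
struct MeshEntryPoint {
    name: &'static str,
    output_type: &'static str,
    max_vertices: u32,
}

/// One shared array per output type, sized to the largest `max_vertices`
/// among the entry points that use that type.
fn shared_array_lengths(entry_points: &[MeshEntryPoint]) -> HashMap<&'static str, u32> {
    let mut lengths: HashMap<&'static str, u32> = HashMap::new();
    for ep in entry_points {
        let len = lengths.entry(ep.output_type).or_insert(0);
        *len = (*len).max(ep.max_vertices);
    }
    lengths
}

/// Entry points whose declared maximum differs from the shared array length,
/// i.e. the ones that would be exposed to the mismatch described above.
fn mismatched_entry_points(entry_points: &[MeshEntryPoint]) -> Vec<&'static str> {
    let lengths = shared_array_lengths(entry_points);
    entry_points
        .iter()
        .filter(|ep| lengths[ep.output_type] != ep.max_vertices)
        .map(|ep| ep.name)
        .collect()
}
```

A module with a single mesh entry point per output type always yields an empty mismatch list, which matches the observation that the problem disappears when only single entry point outputs are considered.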

I'm not known for good explanations but hopefully that got the point across

inner-daemons avatar Jul 15 '25 16:07 inner-daemons

Ok it's been a while but I got back to the project today. I am basically done with the WGSL-in and SPIRV-out side of things. In my PR #7930, you can run the mesh shader example with all shaders generated by naga! My main changes were about logic and correctness, but I also implemented @per_primitive.

The API still isn't final, so if you're interested in this, I'd appreciate it if you'd check out my updated spec and example WGSL code in the aforementioned PR. I'll describe some areas I'm displeased with below for those interested.

My main issues with it are unnecessary copies and lack of transparency. Under the hood, we use temporary vertex and primitive arrays that are simply copied to when you call setVertex or the like, and then copied from when we exit. That's 2 copies per vertex/primitive, which I'm not happy about. Also, setMeshOutputs currently actually calls the underlying function, but if we go with the method of setting everything right before the function exits, we could better ensure correctness by being sure that it isn't being called twice or not at all. But if all 3 mesh functions just end up writing to a temporary array anyway, I'd prefer that the user just sees that array and can write to it themselves since there would be no correctness issues with that.
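The two-copies-per-vertex cost can be modeled on the CPU side with the sketch below: each `set_vertex` call copies into a temporary array, and a final `flush` copies the used range into the real output. This is only an illustrative Rust model with hypothetical names (`StagedVertices`, `set_vertex`, `flush`). The actual behavior lives in generated SPIR-V, not in code like this.

```rust
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct Vertex {
    position: [f32; 4],
}

/// Models the temporary vertex array that sits between the user's writes
/// and the real output array.
struct StagedVertices {
    staging: Vec<Vertex>,
    /// Number of per-vertex copies performed so far.
    copies: usize,
}

impl StagedVertices {
    fn new(max_vertices: usize) -> Self {
        Self {
            staging: vec![Vertex::default(); max_vertices],
            copies: 0,
        }
    }

    /// First copy: user value into the temporary array.
    fn set_vertex(&mut self, index: usize, vertex: Vertex) {
        self.staging[index] = vertex;
        self.copies += 1;
    }

    /// Second copy: the first `count` temporaries into the final output on exit.
    fn flush(&mut self, count: usize, dest: &mut [Vertex]) {
        dest[..count].copy_from_slice(&self.staging[..count]);
        self.copies += count;
    }
}
```

Writing three vertices and flushing them performs six copies in this model. Letting the user write into the array directly, as proposed above, would remove the `set_vertex` half of that cost.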

Also, the @per_primitive system is a little clumsy. The attributes must be present on primitive outputs, but they are actually ignored there anyway. I did this so the primitive outputs would line up with what the fragment shader takes as input, but that's silly because primitive outputs must have one of the indices builtins, which fragment shaders can never have as input anyway. One idea I had was that @per_primitive is applied to parameters in fragment entry points instead of structs or struct members, but this I also found a little unsatisfying.

I still have a lot of work I want to do in terms of validation, but I think I will start splitting the PR and getting parts merged soon!

inner-daemons avatar Aug 10 '25 01:08 inner-daemons

I wanted to give everyone a brief update, since it seems like many parts are soon going to be merged!

As mentioned in previous comments (which you may not have read as they were very long), the mesh shading API stuff and examples have been merged into wgpu already. Right now the focus is on naga and adding wgpu-hal backends, the latter of which should be generally much easier. My behemoth PR #7930 just got reviewed yesterday (shout out @cwfitzgerald, I can't think of anybody else who would be able to actually review 6k lines of code in one sitting) and I just started breaking it off into more manageable chunks. Once #8104 gets merged in, I can actually start having multiple PRs in the pipeline at once, and since the code is already written, that means that at least on Vulkan, I expect that in no more than 2-4 weeks mesh shaders will be available!

Since the brunt of the work is done I'll try to keep subsequent updates as short as possible

inner-daemons avatar Aug 14 '25 18:08 inner-daemons

We unfortunately didn't move as fast as I wanted, but this time I really mean it when I say I think we are close!

  • DX12 mesh shaders are now implemented in wgpu-hal

  • #8139, which adds support for mesh shaders to the metal backend in wgpu-hal, is nearly here

  • #8104 just got merged yesterday. This is the first and most controversial part of the naga process, and ended up getting blocked for several months. However, with this in, the subsequent PRs should be much quicker. I've just tidied up #8370, which is the WGSL parser.

  • The SPIR-V writer has already been started as I mentioned previously, but will have to be reworked a little in accordance with the tweaks to how mesh shaders are parsed.

  • @luc-rodriguez has kindly offered to write the HLSL writer

  • I will be working on the MSL writer probably in early December

This leads me to be very optimistic about mesh shaders being fully or almost fully implemented in WGPU in time for the next release.

If you're looking to help, here are 2 ways you can do so:

  • Writing the SPIR-V parser
  • Writing the WGSL writer (regurgitator)

If anyone is interested in either of these, let me know

inner-daemons avatar Oct 31 '25 01:10 inner-daemons

Ok, there's a little bit of news but first: I need your help soon!!

More specifically, if you have an AMD GPU or AMD CPU with integrated graphics, and it is recent enough to support mesh shaders (can be checked with vulkan caps viewer or the vulkaninfo command line tool that comes with vulkan sdk), I need your help debugging some issues.

The next release happens on the 17th, but I need this to be done before then so it can get reviewed and merged, which means I need the debugging to happen even sooner. It's no big deal of course if we can't get this in, it just means that all of the mesh shader stuff will come together in the next release in 3 months. But if you're interested in speeding things up this will be massively helpful.

The issue is with the SPIR-V that is generated by #8456 not working on AMD. If you're interested in helping please add me on discord at inner_daemons or matrix at @supamaggie70:matrix.org. I will mainly just want you to test a bunch of different SPIR-V files on a branch and tell me what works, it shouldn't be too involved.

Now, for the news: the WGSL parser got merged in #8370, which represented a big step forward in finalizing the IR and some validation rules. After #8507 gets in with more expansive validation, the validation rules that I plan on adding will be in. Finally, @Slightlyclueless wrote #8481 adding WGSL writing support, which is important for apps like bevy where you do some preprocessing of shaders, and which I expect to be merged on Friday.

inner-daemons avatar Dec 10 '25 04:12 inner-daemons

GOOD NEWS

You might've already seen it in the release notes, but mesh shaders are pretty much feature complete on vulkan!

That's right, we now have full support for limits, shader parsing, shader validation, pipeline validation, SPIR-V writing, and WGSL writing (which is important for use cases like bevy)! This represents the full set of capabilities that will be relevant to Vulkan.

All of this has been included in the release as well!

Remaining bugs on Vulkan

These are all "minor" bugs that should not be important in most cases yet. I expect them to be fixed in a patch release in the next week or two.

Credits

Firstly, I'm gonna give myself some credit, as I was the main architect and author of these changes. I would also like to thank @cwfitzgerald since he was pretty much the only major reviewer for any mesh shader related PR. Without him none of this could've gotten merged, certainly not on such short notice. Also note that @ErichDonGubler and @jimblandy both contributed a review at some point.

I would also like to thank @Slightlyclueless for contributing the WGSL writer. This is very important for many use cases and I am very happy to have had some help in this process!

Another person worth noting is @9291Sam, who within 12 hours of the SPIR-V writer being merged, used it to create a whole-ass voxel chunk renderer, and discovered several issues, which were then fixed in #8749

A large number of people contributed to the final steps for the SPIR-V writer. The generated SPIR-V was functioning on Intel and Nvidia cards but not on AMD, and I don't have an AMD card so I relied on these people to test my changes and submit logs. The following is a list, excluding several people who wish to remain anonymous:

  • @ColinTimBarndt
  • @Mhowser
  • @AdamK2003

Among those, I'd especially like to point out @ColinTimBarndt. When I posted a comment asking for help testing, he immediately DMed me on both Discord and Element. Since then he has been responding to my requests almost instantly pretty much every hour that he has been awake. He has looked over the code, is responsible for reaching out to all the other testers, and even found the final bug in our code! He has dedicated so much time to this, and it wouldn't have been possible to get this done on such short notice without him, so I really can't thank him enough.

Finally, thanks to everyone else who offered to help by testing after my previous comment. If Colin hadn't been so damn useful I definitely would've needed you!

Other updates

I actually finished the MSL writer in a single 24 hour period recently, which I was shocked by! It needs to be cleaned up and has some of the same issues as the SPIR-V writer but it is functioning on all tests and the example. You can experiment with it on #8739.

I also just finished the HLSL writer like an hour ago, so that'll probably be merged soon, in #8752.

Next steps

Bugs aside, here is what I will be working on next:

  • Getting the already-written MSL writer merged in
  • Getting the already-written HLSL writer merged in
  • #8517 - testing for WGSL parsing errors
  • SPIR-V parser at some point
  • Queries? Not a priority
  • Even more tests & better example?

I have full confidence that by the next release these will all be complete!

inner-daemons avatar Dec 18 '25 04:12 inner-daemons