Support unchecked pipelines
device.create_shader_module() has a (somewhat evil) cousin device.create_shader_module_unchecked() which allows to instantiate a shader without extra runtime checks (e.g. for array bounds).
This comes very handy for memory-intensive shaders, for instance yielding an extra 25% boost for my ray tracing thingie.
This commit introduces the concept of unchecked pipelines for Bevy abstractions as well.
Welcome, new contributor!
Please make sure you've read our contributing guide and we look forward to reviewing your pull request shortly ✨
Have you seen https://github.com/bevyengine/bevy/pull/9450?
Last time I tried this, I didn't get any real performance benefits. Curious to know why.
I haven't seen that pull request 👀 -- but disabling checks did yell ~25% performance boost for my ray-tracer on Metal.
Edit: in a hindsight, 25% sounds like too much, now that I think about it - realistically it must've been something like 10%; I benchmarked it about six months ago, trying to optimize my tracing shaders, and the extra bounds checks for accessing memory and managing the traversing stack quickly added up.
Edit 2: managed to find something - https://github.com/Patryk27/strolle/commit/684edd179207a3c177a9055950b0d0917ab0b907 (note that back then the create_shader_module_unchecked() function did not exist, so I just monkey-patched wgpu not to generate those checks) - according to me from the past, disabling bounds checks got me extra 10 FPS (so like 12%, considering the overall FPS=80 of my demo).
Idk if it would make sense to tie this to release mode #[cfg(not(debug_assertions))] as a further release optimization
I think going with not(debug_assertions) might be too much - e.g. indexing out of bounds is safe in shaders (and with the bounds-check in place it defaults to returning zero), but it's undefined behavior ™️ when using create_shader_module_unchecked().
So it's sort of like compiling with --release made array[idx] UB when idx >= array.len() in typical Rust code - it would be pretty surprising 😅
FYI I tried this recently on my meshlet shaders and none of them showed any speed improvements. They weren't memory bound for the most part, though. I don't have a memory bound shader to test it on ATM.
They weren't memory bound for the most part, though. I don't have a memory bound shader to test it on ATM.
The most important thing (for me) unchecked pipelines do is disabling array-bounds checking - by default, all array accesses like array[idx] get changed into idx < array.len() ? array[idx] : default() (which might get later optimized by the compiler downstream, but it's not always feasible); in BVH traversal, this adds noticeable overhead 👀