v22.1.0.1 crashes on `wgpuQueueSubmit()` if `wgpuRenderPassEncoderRelease()` wasn't called.
I just updated to wgpu-native v22.1.0.1. If I don't call wgpuRenderPassEncoderRelease() before wgpuQueueSubmit(), I get an error:
thread '<unnamed>' panicked at /Users/runner/.cargo/git/checkouts/wgpu-53e70f8674b08dd4/5c5c8b1/wgpu-core/src/command/mod.rs:522:14:
CommandBuffer cannot be destroyed because is still in use
The code looks like this:
wgpuRenderPassEncoderEnd( render_pass );
wgpuRenderPassEncoderRelease( render_pass ); // crashes without this line
WGPUCommandBuffer command = wgpuCommandEncoderFinish( encoder, nullptr );
wgpuQueueSubmit( queue, 1, &command );
The relevant part of the backtrace:
3: core::option::expect_failed
at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/core/src/option.rs:1995:5
4: wgpu_core::command::CommandBuffer<A>::from_arc_into_baked
5: wgpu_core::device::queue::<impl wgpu_core::global::Global>::queue_submit
6: _wgpuQueueSubmit
Here is a runnable fairly minimal crashing example: https://github.com/yig/LearnWebGPU-Code/blob/step030-vanilla/main.cpp#L233
I hacked the CMakeLists.txt to run on macos-aarch64. (It hard-codes the macos-aarch64 release and adds -framework Metal linker flags. Otherwise, it should be easy to modify to reproduce on other platforms.)
git clone git@github.com:yig/LearnWebGPU-Code.git
cd LearnWebGPU-Code
git switch step030-vanilla
cmake -B build
cmake --build build -j 6
./build/App
Yeah in previous versions it was incorrectly allowed to release the RenderPass later, but in latest version that bug is fixed. Also RenderPasses are single use only so it doesn't make sense to keep it around after calling wgpuRenderPassEncoderEnd on it.
How could that be part of the spec? In JavaScript, I didn't think it was possible to manually release a RenderPass.
wgpu[Object]Release() (and its Reference() counterpart) are not in the JavaScript spec; they are part of webgpu.h.
Where are things like this RenderPass lifetime rule specified? My understanding was that the JS API and the C API should be implementable on top of each other. I don't see how the JS API could be implemented on top of the C API with this lifetime rule.
Seems like this is similar to https://github.com/gfx-rs/wgpu/issues/6145.
Yes, this is exactly the same. The extra scope for the render pass shown there serves as a workaround for a ref-counted language (including C++ with RAII). I guess a similar workaround for a garbage-collected language could be to add the scope and manually trigger the garbage collector immediately after closing the scope. I hope this can be fixed and these workarounds avoided.
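For C++ users, the scope workaround mentioned above could look roughly like the sketch below. It is only an illustration: `FakePass`, `fake_pass_end`, `fake_pass_release`, `ScopedHandle` and `record_then_submit` are hypothetical names, and the fake handle stands in for `WGPURenderPassEncoder` so that the snippet compiles without webgpu.h.

```cpp
#include <cassert>

// Hypothetical stand-in for a webgpu.h handle such as WGPURenderPassEncoder,
// so that this sketch compiles without the real header.
struct FakePass {
    bool ended = false;
    bool released = false;
};

// Stand-ins for wgpuRenderPassEncoderEnd() / wgpuRenderPassEncoderRelease().
inline void fake_pass_end(FakePass* pass) { pass->ended = true; }
inline void fake_pass_release(FakePass* pass) { pass->released = true; }

// Owns a handle and calls the given release function when it leaves scope.
template <typename Handle, void (*Release)(Handle)>
class ScopedHandle {
public:
    explicit ScopedHandle(Handle handle) : handle_(handle) {}
    ~ScopedHandle() { if (handle_) Release(handle_); }
    ScopedHandle(const ScopedHandle&) = delete;
    ScopedHandle& operator=(const ScopedHandle&) = delete;
    Handle get() const { return handle_; }
private:
    Handle handle_;
};

// Records a pass inside its own scope, then reports whether the pass was
// ended and released before the point where the submit would happen.
inline bool record_then_submit(FakePass& storage) {
    {
        ScopedHandle<FakePass*, fake_pass_release> pass(&storage);
        // ... draw calls would be recorded here ...
        fake_pass_end(pass.get());
    } // the pass is released here, before finish/submit
    assert(storage.released);
    // wgpuCommandEncoderFinish() and wgpuQueueSubmit() would go here.
    return storage.ended && storage.released;
}
```

With the real header, the same guard would presumably be instantiated as `ScopedHandle<WGPURenderPassEncoder, wgpuRenderPassEncoderRelease>`, since the release function takes the handle by value. I have not tested that against v22.1.0.1.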
I also ran into this in pygfx/wgpu-py#547 and just made the private release part of the public end method.
I am not sure there is any case where an end call is not followed by a release, so it feels somewhat redundant.
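In C++ the same idea (folding the release into the end call) might be sketched as a small helper. The names here (`FakePass`, `fake_pass_end`, `fake_pass_release`, `end_and_release`) are hypothetical and the handle is a stand-in, so this is not what wgpu-py or wgpu-native actually do internally.

```cpp
#include <cassert>

// Hypothetical stand-in for a pass encoder handle; it counts calls for illustration.
struct FakePass {
    int end_calls = 0;
    int release_calls = 0;
};

// Stand-ins for wgpuRenderPassEncoderEnd() / wgpuRenderPassEncoderRelease().
inline void fake_pass_end(FakePass* pass) { ++pass->end_calls; }
inline void fake_pass_release(FakePass* pass) { ++pass->release_calls; }

// Ends the pass, releases it, and nulls the caller's handle so that it
// cannot be ended or released a second time by accident.
template <typename Handle, typename EndFn, typename ReleaseFn>
void end_and_release(Handle& handle, EndFn end_fn, ReleaseFn release_fn) {
    if (!handle) return;  // already ended and released
    end_fn(handle);
    release_fn(handle);
    handle = nullptr;
}
```

Calling `end_and_release(pass, fake_pass_end, fake_pass_release)` twice on the same handle variable only ends and releases once, because the first call resets the handle to null.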
I have just run into this myself, and I must say the error message and fix was not at all intuitive.
Is there something I can refer to to understand the lifetime expectations and refcount rules for these objects in C/C++?
My understanding is that this is a deviation from the spec.
Just spent a couple of hours debugging this myself. Someday I will learn to check github issues first :)
Yeah in previous versions it was incorrectly allowed to release the RenderPass later, but in latest version that bug is fixed.
@rajveermalviya but surely the crash with an unrelated error message cannot be a desired behavior? I'm not arguing for that to be the only correct way of handling resources, but e.g. dawn doesn't crash and (seems to) release encoder passes by itself.
This is a wgpu bug ("deviation from the spec") for render and compute passes. Is there a bug tracker elsewhere that we should use to report this?
@yig:
This is a wgpu bug ("deviation from the spec") for render and compute passes. Is there a bug tracker elsewhere that we should use to report this?
Yep: https://github.com/gfx-rs/wgpu/issues/
I believe this issue was fixed when https://github.com/gfx-rs/wgpu/issues/6145 was fixed. This can be closed.