wgpu-native icon indicating copy to clipboard operation
wgpu-native copied to clipboard

v22.1.0.1 crashes on `wgpuQueueSubmit()` if `wgpuRenderPassEncoderRelease()` wasn't called.

Open yig opened this issue 1 year ago • 12 comments

I just updated to wgpu-native v22.1.0.1. If I don't call wgpuRenderPassEncoderRelease() before wgpuQueueSubmit(), I get an error:

thread '<unnamed>' panicked at /Users/runner/.cargo/git/checkouts/wgpu-53e70f8674b08dd4/5c5c8b1/wgpu-core/src/command/mod.rs:522:14:
CommandBuffer cannot be destroyed because is still in use

The code looks like this:

wgpuRenderPassEncoderEnd( render_pass );
wgpuRenderPassEncoderRelease( render_pass ); // crashes without this line
WGPUCommandBuffer command = wgpuCommandEncoderFinish( encoder, nullptr );
wgpuQueueSubmit( queue, 1, &command );

The relevant part of the backtrace:

   3: core::option::expect_failed
             at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/core/src/option.rs:1995:5
   4: wgpu_core::command::CommandBuffer<A>::from_arc_into_baked
   5: wgpu_core::device::queue::<impl wgpu_core::global::Global>::queue_submit
   6: _wgpuQueueSubmit

Here is a runnable fairly minimal crashing example: https://github.com/yig/LearnWebGPU-Code/blob/step030-vanilla/main.cpp#L233

I hacked the CMakeLists.txt to run on macos-aarch64. (It hard-codes the macos-aarch64 release and adds -framework Metal linker flags. Otherwise, it should be easy to modify to reproduce on other platforms.)

git clone [email protected]:yig/LearnWebGPU-Code.git
cd LearnWebGPU-Code
git switch step030-vanilla
cmake -B build
cmake --build build -j 6
./build/App

yig avatar Aug 27 '24 03:08 yig

Yeah in previous versions it was incorrectly allowed to release the RenderPass later, but in latest version that bug is fixed. Also RenderPasses are single use only so it doesn't make sense to keep it around after calling wgpuRenderPassEncoderEnd on it.

rajveermalviya avatar Aug 27 '24 06:08 rajveermalviya

How could that be part of the spec? In JavaScript, I didn't think it was possible to manually release a RenderPass.

yig avatar Aug 27 '24 11:08 yig

wgpu[Object]Release() and (the Reference() conterpart) are not in javascript spec but part of the webgpu.h.

rajveermalviya avatar Aug 27 '24 12:08 rajveermalviya

Where are things like this RenderPass lifetime rule specified? My understanding was that the JS API and the C API should be implementable on top of each other. I don't see how the JS API could be implemented on top of the C API with this lifetime rule.

yig avatar Aug 27 '24 12:08 yig

Seem like this is similar to https://github.com/gfx-rs/wgpu/issues/6145.

rajveermalviya avatar Aug 27 '24 20:08 rajveermalviya

Yes, this is exactly the same. The extra scope for the render pass shown there serves as a workaround for a ref-counted language (including C++ with RAII). I guess a similar workaround for a garbage collected language could be to add the scope and manually triggering the garbage collector immediately after closing the scope. I hope this can be fixed and these workarounds avoided.

yig avatar Aug 27 '24 22:08 yig

I also run into this in pygfx/wgpu-py#547 and just made the private release is part of the public end method.

I am not sure if there is any case where a end call is not followed by release. So it feels somewhat redundant

Vipitis avatar Aug 28 '24 17:08 Vipitis

I have just run into this myself, and I must say the error message and fix was not at all intuitive.

Is there something I can refer to to understand the lifetime expectations and refcount rules for these objects in C/C++?

trbabb avatar Oct 14 '24 18:10 trbabb

My understanding is that this is a deviation from the spec.

yig avatar Oct 15 '24 00:10 yig

Just spent a couple of hours debugging this myself. Someday I will learn to check github issues first :)

Yeah in previous versions it was incorrectly allowed to release the RenderPass later, but in latest version that bug is fixed.

@rajveermalviya but surely the crash with an unrelated error message cannot be a desired behavior? I'm not arguing for that to be the only correct way of handling resources, but e.g. dawn doesn't crash and (seems to) release encoder passes by itself.

hqhs avatar Nov 09 '24 00:11 hqhs

This is a wgpu bug ("deviation from the spec") for render and compute passes. Is there a bug tracker elsewhere that we should use to report this?

yig avatar Nov 28 '24 19:11 yig

@yig:

This is a wgpu bug ("deviation from the spec") for render and compute passes. Is there a bug tracker elsewhere that we should use to report this?

Yep: https://github.com/gfx-rs/wgpu/issues/

ErichDonGubler avatar Dec 17 '24 18:12 ErichDonGubler

I believe this issue was fixed when https://github.com/gfx-rs/wgpu/issues/6145 was fixed. This can be closed.

yig avatar Aug 20 '25 18:08 yig