Test prevent_invalid_ray_query_calls segfault on Mesa RADV driver

Open SpeedCrash100 opened this issue 1 month ago • 10 comments

Description

As requested by @Vecvec in #8527, creating an issue for the test case wgpu_gpu::ray_tracing::shader::prevent_invalid_ray_query_calls, which segfaults on the Mesa RADV driver.

Repro steps

Run the test using:

cargo xtask test prevent_invalid_ray_query_calls 

Expected vs observed behavior

Expected: the test passes.
Observed: a SEGFAULT occurs while the test is running.

Extra materials

stacktrace
#0  0x00007ffff6b75c53 in ?? () from /usr/lib/libvulkan_radeon.so
#1  0x00007ffff6736969 in ?? () from /usr/lib/libvulkan_radeon.so
#2  0x00007ffff6712d6f in ?? () from /usr/lib/libvulkan_radeon.so
#3  0x00007ffff6713342 in ?? () from /usr/lib/libvulkan_radeon.so
#4  0x00007ffff67136f9 in ?? () from /usr/lib/libvulkan_radeon.so
#5  0x00007ffff671392b in ?? () from /usr/lib/libvulkan_radeon.so
#6  0x00005555561abddb in ash::device::Device::create_compute_pipelines (self=0x5555570aa070, pipeline_cache=..., create_infos=..., 
    allocation_callbacks=...) at /home/deucalion/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/ash-0.38.0+1.3.281/src/device.rs:2185
#7  0x0000555556241e2a in wgpu_hal::vulkan::device::{impl#4}::create_compute_pipeline (self=0x5555570a3bc0, desc=0x7fffffff7a28)
    at wgpu-hal/src/vulkan/device.rs:2200
#8  0x0000555555f66277 in wgpu_hal::dynamic::device::{impl#0}::create_compute_pipeline<wgpu_hal::vulkan::Device> (self=0x5555570a3bc0, 
    desc=0x7fffffff82c0) at wgpu-hal/src/dynamic/device.rs:437
#9  0x0000555555eca50e in wgpu_core::device::resource::Device::create_compute_pipeline (self=0x7fffffff8850, desc=...)
    at wgpu-core/src/device/resource.rs:3788
#10 0x00005555560d64f5 in wgpu_core::global::Global::device_create_compute_pipeline (self=0x55555707b660, device_id=..., desc=0x7fffffff8ee0, id_in=...)
    at wgpu-core/src/device/global.rs:1623
#11 0x0000555555dcf8b9 in wgpu::backend::wgpu_core::{impl#12}::create_compute_pipeline (self=0x555557095b00, desc=0x7fffffff9560)
    at wgpu/src/backend/wgpu_core.rs:1521
#12 0x0000555555ddcf55 in wgpu::api::device::Device::create_compute_pipeline (self=0x7fffffff9a78, desc=0x7fffffff9560) at wgpu/src/api/device.rs:262
#13 0x000055555577a5cb in wgpu_gpu::ray_tracing::shader::prevent_invalid_ray_query_calls (ctx=...) at tests/tests/wgpu-gpu/ray_tracing/shader.rs:157
#14 0x00005555558ba282 in core::ops::function::Fn::call<fn(wgpu_test::run::TestingContext), (wgpu_test::run::TestingContext)> ()
    at /home/deucalion/.rustup/toolchains/1.88-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:79
#15 0x00005555558591e9 in wgpu_test::config::{impl#0}::run_sync::{closure#0}::{async_block#0}<fn(wgpu_test::run::TestingContext)> ()
    at tests/src/config.rs:98
#16 0x0000555555998e73 in core::future::future::{impl#1}::poll<alloc::boxed::Box<(dyn core::future::future::Future<Output=()> + core::marker::Send), alloc::alloc::Global>> (self=..., cx=0x7fffffffbd68)
    at /home/deucalion/.rustup/toolchains/1.88-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/future/future.rs:124
#17 0x0000555555998253 in core::panic::unwind_safe::{impl#26}::poll<core::pin::Pin<alloc::boxed::Box<(dyn core::future::future::Future<Output=()> + core::marker::Send), alloc::alloc::Global>>> (self=..., cx=0x7fffffffbd68)
    at /home/deucalion/.rustup/toolchains/1.88-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/panic/unwind_safe.rs:297
#18 0x000055555599773e in futures_lite::future::{impl#11}::poll::{closure#0}<core::panic::unwind_safe::AssertUnwindSafe<core::pin::Pin<alloc::boxed::Box<(dyn core::future::future::Future<Output=()> + core::marker::Send), alloc::alloc::Global>>>> ()
    at /home/deucalion/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-lite-2.6.1/src/future.rs:653
#19 0x0000555555998274 in core::panic::unwind_safe::{impl#23}::call_once<core::task::poll::Poll<()>, futures_lite::future::{impl#11}::poll::{closure_env#0}<core::panic::unwind_safe::AssertUnwindSafe<core::pin::Pin<alloc::boxed::Box<(dyn core::future::future::Future<Output=()> + core::marker::Send), alloc::alloc::Global>>>>> (self=...)
    at /home/deucalion/.rustup/toolchains/1.88-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/panic/unwind_safe.rs:272
#20 0x000055555599a381 in std::panicking::try::do_call<core::panic::unwind_safe::AssertUnwindSafe<futures_lite::future::{impl#11}::poll::{closure_env#0}<core::panic::unwind_safe::AssertUnwindSafe<core::pin::Pin<alloc::boxed::Box<(dyn core::future::future::Future<Output=()> + core::marker::Send), alloc::alloc::Global>>>>>, core::task::poll::Poll<()>> (data=0x7fffffffa018)
    at /home/deucalion/.rustup/toolchains/1.88-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:589
#21 0x0000555555998d6b in __rust_try ()
#22 0x0000555555998cf4 in std::panicking::try<core::task::poll::Poll<()>, core::panic::unwind_safe::AssertUnwindSafe<futures_lite::future::{impl#11}::poll::{closure_env#0}<core::panic::unwind_safe::AssertUnwindSafe<core::pin::Pin<alloc::boxed::Box<(dyn core::future::future::Future<Output=()> + core::marker::Send), alloc::alloc::Global>>>>>> (f=...)
    at /home/deucalion/.rustup/toolchains/1.88-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:552
#23 std::panic::catch_unwind<core::panic::unwind_safe::AssertUnwindSafe<futures_lite::future::{impl#11}::poll::{closure_env#0}<core::panic::unwind_safe::AssertUnwindSafe<core::pin::Pin<alloc::boxed::Box<(dyn core::future::future::Future<Output=()> + core::marker::Send), alloc::alloc::Global>>>>>, core::task::poll::Poll<()>> (f=...) at /home/deucalion/.rustup/toolchains/1.88-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:359
#24 0x00005555559976a8 in futures_lite::future::{impl#11}::poll<core::panic::unwind_safe::AssertUnwindSafe<core::pin::Pin<alloc::boxed::Box<(dyn core::future::future::Future<Output=()> + core::marker::Send), alloc::alloc::Global>>>> (self=..., cx=0x7fffffffbd68)
    at /home/deucalion/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-lite-2.6.1/src/future.rs:653
#25 0x000055555597d53c in wgpu_test::run::execute_test::{async_fn#0} () at tests/src/run.rs:88
#26 0x0000555555992c34 in wgpu_test::native::{impl#0}::from_configuration::{async_block#0} () at tests/src/native.rs:60
#27 0x0000555555998e73 in core::future::future::{impl#1}::poll<alloc::boxed::Box<(dyn core::future::future::Future<Output=()> + core::marker::Send), alloc::alloc::Global>> (self=..., cx=0x7fffffffbd68)
    at /home/deucalion/.rustup/toolchains/1.88-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/future/future.rs:124
#28 0x0000555555998c2a in pollster::block_on<core::pin::Pin<alloc::boxed::Box<(dyn core::future::future::Future<Output=()> + core::marker::Send), alloc::alloc::Global>>> (fut=...) at /home/deucalion/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/pollster-0.4.0/src/lib.rs:126
#29 0x0000555555992d68 in wgpu_test::native::{impl#0}::into_trial::{closure#0} () at tests/src/native.rs:67
#30 0x0000555555995db9 in libtest_mimic::{impl#0}::test::{closure#0}<wgpu_test::native::{impl#0}::into_trial::{closure_env#0}, alloc::string::String> (
    _test_mode=true) at /home/deucalion/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/libtest-mimic-0.8.1/src/lib.rs:119
#31 0x0000555555969d22 in core::ops::function::FnOnce::call_once<libtest_mimic::{impl#0}::test::{closure_env#0}<wgpu_test::native::{impl#0}::into_trial::{closure_env#0}, alloc::string::String>, (bool)> ()
    at /home/deucalion/.rustup/toolchains/1.88-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:250
#32 0x00005555559b281d in alloc::boxed::{impl#28}::call_once<(bool), (dyn core::ops::function::FnOnce<(bool), Output=libtest_mimic::Outcome> + core::marker::Send), alloc::alloc::Global> (self=..., args=...) at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/alloc/src/boxed.rs:1966
#33 0x00005555559d7680 in libtest_mimic::run_single::{closure#0} () at src/lib.rs:576
#34 0x00005555559dbc60 in core::panic::unwind_safe::{impl#23}::call_once<libtest_mimic::Outcome, libtest_mimic::run_single::{closure_env#0}> (self=...)
    at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/core/src/panic/unwind_safe.rs:272
#35 0x00005555559bc0a1 in std::panicking::try::do_call<core::panic::unwind_safe::AssertUnwindSafe<libtest_mimic::run_single::{closure_env#0}>, libtest_mimic::Outcome> (data=0x7fffffffbfd8) at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/std/src/panicking.rs:589
#36 0x00005555559c858b in __rust_try ()
#37 0x00005555559c518b in std::panicking::try<libtest_mimic::Outcome, core::panic::unwind_safe::AssertUnwindSafe<libtest_mimic::run_single::{closure_env#0}>> (f=...) at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/std/src/panicking.rs:552
#38 std::panic::catch_unwind<core::panic::unwind_safe::AssertUnwindSafe<libtest_mimic::run_single::{closure_env#0}>, libtest_mimic::Outcome> (f=...)
    at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/std/src/panic.rs:359
#39 0x00005555559d7631 in libtest_mimic::run_single (runner=..., test_mode=true) at src/lib.rs:576
#40 0x00005555559d64f5 in libtest_mimic::run (args=0x7fffffffc580, tests=...) at src/lib.rs:507
#41 0x0000555555993107 in wgpu_test::native::execute_native<core::iter::adapters::flatten::FlatMap<alloc::vec::into_iter::IntoIter<fn() -> wgpu_test::config::GpuTestConfiguration, alloc::alloc::Global>, core::iter::adapters::map::Map<core::iter::adapters::enumerate::Enumerate<core::slice::iter::Iter<wgpu_test::report::AdapterReport>>, wgpu_test::native::main::{closure#1}::{closure_env#0}>, wgpu_test::native::main::{closure_env#1}>> (tests=...)
    at tests/src/native.rs:134
#42 0x0000555555966d2b in wgpu_test::native::main (tests=...) at tests/src/native.rs:113
#43 0x00005555558e3675 in wgpu_gpu::main () at tests/src/lib.rs:132

Rust trace log attached as file

log.txt

Platform

OS: CachyOS as of 15.11.2025
GPU: AMD RX 9070 XT
Driver: Mesa radv 25.2.7-cachyos1.2

SpeedCrash100 avatar Nov 16 '25 05:11 SpeedCrash100

@SpeedCrash100, wgpu claims it can't find the Vulkan validation layers. Do you have them installed? They might catch something if we're doing it wrong.

Vecvec avatar Nov 16 '25 06:11 Vecvec

@Vecvec, I've updated the log with the validation layers installed and working; they don't report any problems. Please re-download it from the issue description.

SpeedCrash100 avatar Nov 16 '25 06:11 SpeedCrash100

@SpeedCrash100, in the shader from prevent_invalid_ray_query_calls there are multiple blocks, could you try removing them (maybe in multiple orders?) and see if some configurations work? It could also be useful to have things like where the crash happens (could be difficult) and info from the core dump like what address it was accessing.
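One way to make the "remove blocks in multiple orders" experiment systematic is to enumerate every combination of blocks to keep and try the smallest combinations first. The sketch below is only an illustration: the block names are placeholders (the real shader's block names are not listed in this thread), and `block_subsets` is a hypothetical helper, not part of wgpu or its test harness.

```rust
/// Returns every non-empty subset of `blocks`, smallest subsets first, so that
/// a crash can be narrowed down to the fewest blocks that still trigger it.
fn block_subsets<'a>(blocks: &[&'a str]) -> Vec<Vec<&'a str>> {
    let n = blocks.len();
    // Each bit of `mask` decides whether the block at that index is kept.
    let mut subsets: Vec<Vec<&'a str>> = (1u32..(1u32 << n))
        .map(|mask| {
            (0..n)
                .filter(|&i| (mask & (1u32 << i)) != 0)
                .map(|i| blocks[i])
                .collect()
        })
        .collect();
    // Stable sort: subsets of equal size keep their enumeration order.
    subsets.sort_by_key(|s| s.len());
    subsets
}

fn main() {
    // Placeholder names; substitute the actual blocks from shader.wgsl.
    let blocks = ["block_a", "block_b", "block_c"];
    for subset in block_subsets(&blocks) {
        println!("keep only: {:?}", subset);
    }
}
```

Each printed line is one shader variant to build and run; the first variant that still segfaults is a candidate minimal reproducer.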

Vecvec avatar Nov 25 '25 22:11 Vecvec

@Vecvec, sorry, I forgot about this. I tried commenting out this block in the shader: https://github.com/gfx-rs/wgpu/blob/75188a7fe1c7826a8122f6c80292ac73a5e8ae09/tests/tests/wgpu-gpu/ray_tracing/shader.wgsl#L66C1-L71C6 and the test passes now.

As for the exact state at the ash call shown in the backtrace:

(gdb) p create_infos
$5 = &[ash::vk::definitions::ComputePipelineCreateInfo] [
  ash::vk::definitions::ComputePipelineCreateInfo {s_type: ash::vk::enums::StructureType (29), p_next: 0x0, flags: ash::vk::bitflags::PipelineCreateFlags (0), stage: ash::vk::definitions::PipelineShaderStageCreateInfo {s_type: ash::vk::enums::StructureType (18), p_next: 0x0, flags: ash::vk::bitflags::PipelineShaderStageCreateFlags (0), stage: ash::vk::bitflags::ShaderStageFlags (32), module: ash::vk::definitions::ShaderModule (727876697588374), p_name: 0x555557ee2030, p_specialization_info: 0x0, _marker: core::marker::PhantomData<&()>}, layout: ash::vk::definitions::PipelineLayout (726777185960597), base_pipeline_handle: ash::vk::definitions::Pipeline (0), base_pipeline_index: 0, _marker: core::marker::PhantomData<&()>}]
(gdb) p pipeline_cache
$6 = ash::vk::definitions::PipelineCache (0)
(gdb) p allocation_callbacks
$7 = core::option::Option<&ash::vk::definitions::AllocationCallbacks>::None
(gdb) p self
$8 = (*mut ash::device::Device) 0x55555714da40
(gdb) 
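For readers decoding the gdb output above: the raw numbers map onto names from the Vulkan specification (structure type 29 is the compute pipeline create info, 18 is the shader stage create info, and stage flag 32, i.e. 0x20, is the compute stage). The sketch below only covers the values that appear in this session; the function names are hypothetical and are not part of ash.

```rust
/// Maps the raw `StructureType` values printed by gdb to their Vulkan names.
/// Only the two values seen in this session are decoded.
fn structure_type_name(raw: i32) -> &'static str {
    match raw {
        18 => "PIPELINE_SHADER_STAGE_CREATE_INFO",
        29 => "COMPUTE_PIPELINE_CREATE_INFO",
        _ => "not decoded by this sketch",
    }
}

/// Maps the raw `ShaderStageFlags` value printed by gdb to its Vulkan name.
fn shader_stage_name(raw: u32) -> &'static str {
    match raw {
        0x20 => "COMPUTE",
        _ => "not decoded by this sketch",
    }
}

fn main() {
    println!("s_type 29 -> {}", structure_type_name(29));
    println!("s_type 18 -> {}", structure_type_name(18));
    println!("stage 32  -> {}", shader_stage_name(32));
}
```

Read this way, the create info looks like an ordinary single-stage compute pipeline request with null `p_next` and no specialization info, which suggests (but does not prove) that nothing unusual is being passed to the driver at this level.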

SpeedCrash100 avatar Nov 29 '25 05:11 SpeedCrash100

@SpeedCrash100, no problem. I don't expect people to respond immediately (and I was away during that time anyway). It's useful to have found the cause. Does it still segfault with just that single block of code? (Yes, this sounds strange, but I was recently dealing with an error caused by an optimizer, where things happened due to a lot of non-obvious factors.) Could you also try removing the rayQueryGetCommittedIntersection call and see if it still crashes (ideally just in that block on its own)? Thank you for the state during the ash call, but I was mostly wondering about the state at the point it crashes (in #0 0x00007ffff6b75c53 in ?? () from /usr/lib/libvulkan_radeon.so), as things like what it was doing (i.e. what address it was dereferencing or jumping to) can give us clues as to where it happens.

Vecvec avatar Nov 30 '25 18:11 Vecvec

@Vecvec

> Does it still segfault with just that single block of code

It does

> Could you also try removing the rayQueryGetCommittedIntersection call and see if it still crashes

It crashes the same way. However, if the rayQueryProceed line is removed, a validation error appears instead:

thread '<unnamed>' panicked at wgpu/src/backend/wgpu_core.rs:2359:26:
    wgpu error: Validation Error

    Caused by:
      In ComputePipeline::get_bind_group_layout
        Invalid group index 0

As for the state of libvulkan_radeon.so: I got debug symbols, so here is more complete info:

Backtrace
Thread 1 "wgpu_gpu-326e1f" received signal SIGSEGV, Segmentation fault.
nir_opt_ray_query_ranges () at ../mesa-25.2.7/src/compiler/nir/nir_opt_ray_queries.c:303
303              struct rq_range *range = ranges + (uintptr_t)index_entry->data;
(gdb) bt
#0  nir_opt_ray_query_ranges () at ../mesa-25.2.7/src/compiler/nir/nir_opt_ray_queries.c:303
#1  0x00007ffff6736969 in radv_shader_spirv_to_nir () at ../mesa-25.2.7/src/amd/vulkan/radv_shader.c:542
#2  0x00007ffff6712d6f in radv_compile_cs () at ../mesa-25.2.7/src/amd/vulkan/radv_pipeline_compute.c:101
#3  0x00007ffff6713342 in radv_compute_pipeline_compile () at ../mesa-25.2.7/src/amd/vulkan/radv_pipeline_compute.c:216
#4  0x00007ffff67136f9 in radv_compute_pipeline_create () at ../mesa-25.2.7/src/amd/vulkan/radv_pipeline_compute.c:297
#5  0x00007ffff671392b in radv_create_compute_pipelines () at ../mesa-25.2.7/src/amd/vulkan/radv_pipeline_compute.c:325
#6  radv_CreateComputePipelines () at ../mesa-25.2.7/src/amd/vulkan/radv_pipeline_compute.c:354
#7  0x00007fffea42a003 in vvl::dispatch::Device::CreateComputePipelines () at /usr/src/debug/vulkan-validation-layers/Vulkan-ValidationLayers/layers/chassis/dispatch_object_manual.cpp:2254
#8  0x00007fffea4189e5 in vulkan_layer_chassis::CreateComputePipelines () at /usr/src/debug/vulkan-validation-layers/Vulkan-ValidationLayers/layers/chassis/chassis_manual.cpp:614
#9  0x00005555561abddb in ash::device::Device::create_compute_pipelines (self=0x55555714da40, pipeline_cache=..., create_infos=..., allocation_callbacks=...)
    at /home/deucalion/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/ash-0.38.0+1.3.281/src/device.rs:2185
#10 0x0000555556241e2a in wgpu_hal::vulkan::device::{impl#4}::create_compute_pipeline (self=0x555557169b20, desc=0x7fffffff7a28) at wgpu-hal/src/vulkan/device.rs:2200
#11 0x0000555555f66277 in wgpu_hal::dynamic::device::{impl#0}::create_compute_pipeline<wgpu_hal::vulkan::Device> (self=0x555557169b20, desc=0x7fffffff82c0) at wgpu-hal/src/dynamic/device.rs:437
#12 0x0000555555eca50e in wgpu_core::device::resource::Device::create_compute_pipeline (self=0x7fffffff8850, desc=...) at wgpu-core/src/device/resource.rs:3788
#13 0x00005555560d64f5 in wgpu_core::global::Global::device_create_compute_pipeline (self=0x5555570a7aa0, device_id=..., desc=0x7fffffff8ee0, id_in=...) at wgpu-core/src/device/global.rs:1623
#14 0x0000555555dcf8b9 in wgpu::backend::wgpu_core::{impl#12}::create_compute_pipeline (self=0x5555573709f0, desc=0x7fffffff9560) at wgpu/src/backend/wgpu_core.rs:1521
#15 0x0000555555ddcf55 in wgpu::api::device::Device::create_compute_pipeline (self=0x7fffffff9a78, desc=0x7fffffff9560) at wgpu/src/api/device.rs:262
#16 0x000055555577a5cb in wgpu_gpu::ray_tracing::shader::prevent_invalid_ray_query_calls (ctx=...) at tests/tests/wgpu-gpu/ray_tracing/shader.rs:157
#17 0x00005555558ba282 in core::ops::function::Fn::call<fn(wgpu_test::run::TestingContext), (wgpu_test::run::TestingContext)> ()
    at /home/deucalion/.rustup/toolchains/1.88-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:79
#18 0x00005555558591e9 in wgpu_test::config::{impl#0}::run_sync::{closure#0}::{async_block#0}<fn(wgpu_test::run::TestingContext)> () at tests/src/config.rs:98
#19 0x0000555555998e73 in core::future::future::{impl#1}::poll<alloc::boxed::Box<(dyn core::future::future::Future<Output=()> + core::marker::Send), alloc::alloc::Global>> (self=..., cx=0x7fffffffbd68)
    at /home/deucalion/.rustup/toolchains/1.88-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/future/future.rs:124
#20 0x0000555555998253 in core::panic::unwind_safe::{impl#26}::poll<core::pin::Pin<alloc::boxed::Box<(dyn core::future::future::Future<Output=()> + core::marker::Send), alloc::alloc::Global>>> (
    self=..., cx=0x7fffffffbd68) at /home/deucalion/.rustup/toolchains/1.88-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/panic/unwind_safe.rs:297
#21 0x000055555599773e in futures_lite::future::{impl#11}::poll::{closure#0}<core::panic::unwind_safe::AssertUnwindSafe<core::pin::Pin<alloc::boxed::Box<(dyn core::future::future::Future<Output=()> + core::marker::Send), alloc::alloc::Global>>>> () at /home/deucalion/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-lite-2.6.1/src/future.rs:653
#22 0x0000555555998274 in core::panic::unwind_safe::{impl#23}::call_once<core::task::poll::Poll<()>, futures_lite::future::{impl#11}::poll::{closure_env#0}<core::panic::unwind_safe::AssertUnwindSafe<core::pin::Pin<alloc::boxed::Box<(dyn core::future::future::Future<Output=()> + core::marker::Send), alloc::alloc::Global>>>>> (self=...)
    at /home/deucalion/.rustup/toolchains/1.88-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/panic/unwind_safe.rs:272
#23 0x000055555599a381 in std::panicking::try::do_call<core::panic::unwind_safe::AssertUnwindSafe<futures_lite::future::{impl#11}::poll::{closure_env#0}<core::panic::unwind_safe::AssertUnwindSafe<core::pin::Pin<alloc::boxed::Box<(dyn core::future::future::Future<Output=()> + core::marker::Send), alloc::alloc::Global>>>>>, core::task::poll::Poll<()>> (data=0x7fffffffa018)
    at /home/deucalion/.rustup/toolchains/1.88-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:589
#24 0x0000555555998d6b in __rust_try ()
#25 0x0000555555998cf4 in std::panicking::try<core::task::poll::Poll<()>, core::panic::unwind_safe::AssertUnwindSafe<futures_lite::future::{impl#11}::poll::{closure_env#0}<core::panic::unwind_safe::AssertUnwindSafe<core::pin::Pin<alloc::boxed::Box<(dyn core::future::future::Future<Output=()> + core::marker::Send), alloc::alloc::Global>>>>>> (f=...)
    at /home/deucalion/.rustup/toolchains/1.88-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:552
#26 std::panic::catch_unwind<core::panic::unwind_safe::AssertUnwindSafe<futures_lite::future::{impl#11}::poll::{closure_env#0}<core::panic::unwind_safe::AssertUnwindSafe<core::pin::Pin<alloc::boxed::Box<(dyn core::future::future::Future<Output=()> + core::marker::Send), alloc::alloc::Global>>>>>, core::task::poll::Poll<()>> (f=...)
    at /home/deucalion/.rustup/toolchains/1.88-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:359
#27 0x00005555559976a8 in futures_lite::future::{impl#11}::poll<core::panic::unwind_safe::AssertUnwindSafe<core::pin::Pin<alloc::boxed::Box<(dyn core::future::future::Future<Output=()> + core::marker::Send), alloc::alloc::Global>>>> (self=..., cx=0x7fffffffbd68) at /home/deucalion/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-lite-2.6.1/src/future.rs:653
#28 0x000055555597d53c in wgpu_test::run::execute_test::{async_fn#0} () at tests/src/run.rs:88
#29 0x0000555555992c34 in wgpu_test::native::{impl#0}::from_configuration::{async_block#0} () at tests/src/native.rs:60
#30 0x0000555555998e73 in core::future::future::{impl#1}::poll<alloc::boxed::Box<(dyn core::future::future::Future<Output=()> + core::marker::Send), alloc::alloc::Global>> (self=..., cx=0x7fffffffbd68)
    at /home/deucalion/.rustup/toolchains/1.88-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/future/future.rs:124
#31 0x0000555555998c2a in pollster::block_on<core::pin::Pin<alloc::boxed::Box<(dyn core::future::future::Future<Output=()> + core::marker::Send), alloc::alloc::Global>>> (fut=...)
    at /home/deucalion/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/pollster-0.4.0/src/lib.rs:126
#32 0x0000555555992d68 in wgpu_test::native::{impl#0}::into_trial::{closure#0} () at tests/src/native.rs:67
#33 0x0000555555995db9 in libtest_mimic::{impl#0}::test::{closure#0}<wgpu_test::native::{impl#0}::into_trial::{closure_env#0}, alloc::string::String> (_test_mode=true)
    at /home/deucalion/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/libtest-mimic-0.8.1/src/lib.rs:119
#34 0x0000555555969d22 in core::ops::function::FnOnce::call_once<libtest_mimic::{impl#0}::test::{closure_env#0}<wgpu_test::native::{impl#0}::into_trial::{closure_env#0}, alloc::string::String>, (bool)>
    () at /home/deucalion/.rustup/toolchains/1.88-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:250
#35 0x00005555559b281d in alloc::boxed::{impl#28}::call_once<(bool), (dyn core::ops::function::FnOnce<(bool), Output=libtest_mimic::Outcome> + core::marker::Send), alloc::alloc::Global> (self=..., 
    args=...) at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/alloc/src/boxed.rs:1966
#36 0x00005555559d7680 in libtest_mimic::run_single::{closure#0} () at src/lib.rs:576
#37 0x00005555559dbc60 in core::panic::unwind_safe::{impl#23}::call_once<libtest_mimic::Outcome, libtest_mimic::run_single::{closure_env#0}> (self=...)
    at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/core/src/panic/unwind_safe.rs:272
#38 0x00005555559bc0a1 in std::panicking::try::do_call<core::panic::unwind_safe::AssertUnwindSafe<libtest_mimic::run_single::{closure_env#0}>, libtest_mimic::Outcome> (data=0x7fffffffbfd8)
    at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/std/src/panicking.rs:589
#39 0x00005555559c858b in __rust_try ()
#40 0x00005555559c518b in std::panicking::try<libtest_mimic::Outcome, core::panic::unwind_safe::AssertUnwindSafe<libtest_mimic::run_single::{closure_env#0}>> (f=...)
    at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/std/src/panicking.rs:552
#41 std::panic::catch_unwind<core::panic::unwind_safe::AssertUnwindSafe<libtest_mimic::run_single::{closure_env#0}>, libtest_mimic::Outcome> (f=...)
    at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/std/src/panic.rs:359
#42 0x00005555559d7631 in libtest_mimic::run_single (runner=..., test_mode=true) at src/lib.rs:576
#43 0x00005555559d64f5 in libtest_mimic::run (args=0x7fffffffc580, tests=...) at src/lib.rs:507
#44 0x0000555555993107 in wgpu_test::native::execute_native<core::iter::adapters::flatten::FlatMap<alloc::vec::into_iter::IntoIter<fn() -> wgpu_test::config::GpuTestConfiguration, alloc::alloc::Global>, core::iter::adapters::map::Map<core::iter::adapters::enumerate::Enumerate<core::slice::iter::Iter<wgpu_test::report::AdapterReport>>, wgpu_test::native::main::{closure#1}::{closure_env#0}>, wgpu_test::native::main::{closure_env#1}>> (tests=...) at tests/src/native.rs:134
#45 0x0000555555966d2b in wgpu_test::native::main (tests=...) at tests/src/native.rs:113
#46 0x00005555558e3675 in wgpu_gpu::main () at tests/src/lib.rs:132

It tries to dereference address 0x10; here are the ASM and source parts:

Image

with the register state:

(gdb) info reg
rax            0x0                 0
rbx            0x555557f7c768      93825036437352
rcx            0x55555812c110      93825038205200
rdx            0x2                 2
rsi            0x0                 0
rdi            0x555558280390      93825039598480
rbp            0x7fffffff4d90      0x7fffffff4d90
rsp            0x7fffffff4d00      0x7fffffff4d00
r8             0xa57913            10844435
r9             0x1a                26
r10            0x1c0               448
r11            0x7ffff7a37ac0      140737348074176
r12            0x555557eeaba0      93825035840416
r13            0x555558280590      93825039598992
r14            0x9                 9
r15            0x0                 0
rip            0x7ffff6b75c53      0x7ffff6b75c53 <nir_opt_ray_query_ranges+979>
eflags         0x10202             [ IF RF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0
gs             0x0                 0
fs_base        0x7ffff7f4d900      140737353406720
gs_base        0x0                 0

It seems to me that index_entry is NULL and Mesa does not check for this.
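A quick way to sanity-check the NULL hypothesis is to compute where the `data` field would sit if the entry pointer were zero. The sketch below uses a stand-in struct whose field order is an assumption modeled on Mesa's `struct hash_entry` (a 32-bit hash, then a key pointer, then a data pointer); `HashEntryModel` and `faulting_address` are hypothetical names used only for illustration.

```rust
use std::mem::offset_of;

/// Stand-in for the hash table entry read at nir_opt_ray_queries.c:303.
/// ASSUMPTION: the field order mirrors Mesa's `struct hash_entry`.
#[allow(dead_code)]
#[repr(C)]
struct HashEntryModel {
    hash: u32,
    key: *const u8,
    data: *mut u8,
}

/// Address that reading `entry->data` touches for a given entry pointer.
fn faulting_address(entry_ptr: usize) -> usize {
    entry_ptr + offset_of!(HashEntryModel, data)
}

fn main() {
    // With a NULL entry pointer, the load lands at the field offset itself.
    println!("NULL entry -> access at {:#x}", faulting_address(0));
}
```

On a 64-bit target the `data` field sits two pointer widths into the struct, so a NULL entry would fault at 0x10, which is consistent with the address reported above. This supports the hypothesis but does not rule out other causes.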

SpeedCrash100 avatar Dec 01 '25 03:12 SpeedCrash100

> It crashes the same way. However, if the rayQueryProceed line is removed, a validation error appears

I'm not sure if that's expected, but it's a wgpu validation error (and so internal).

> As for the state of libvulkan_radeon.so: I got debug symbols, so here is more complete info:

That's great! I'll see if I can figure out what's going on in it (much easier w/ debug symbols).

> It seems to me that index_entry is NULL and Mesa does not check for this.

That seems very likely. This looks like another case of 'optimiser optimises itself into a corner', but I'll need to look at it more.

Thank you very much for all this work!

Vecvec avatar Dec 01 '25 04:12 Vecvec

I have opened the bug report as https://gitlab.freedesktop.org/mesa/mesa/-/issues/14421

Vecvec avatar Dec 05 '25 19:12 Vecvec

This also segfaults on ANV on an A750.

cwfitzgerald avatar Dec 05 '25 20:12 cwfitzgerald

That's not completely surprising, since this is in an optimization pass. I'd assumed this was a RADV-specific optimization, but I guess not.

Vecvec avatar Dec 05 '25 21:12 Vecvec