
DebugPrintf/GPU-AV preset results in access violation in amdvlk64.dll/nvoglv64.dll

Open StefanPoelloth opened this issue 10 months ago • 4 comments

Environment:

  • OS: Win11 24H2
  • GPU and driver version: 9950X3D igpu, 25.5.1
  • SDK or header version if building from repo: 1.4.309
  • Options enabled (synchronization, best practices, etc.): GPU-AV Preset

Describe the Issue

When creating a mesh shader pipeline with a particular shader, the AMD driver crashes if GPU-AV is enabled. If I modify the shader code or the descriptor access, or disable GPU-AV, it runs fine. I can provide a repro project through share.lunarg.com.

Exception thrown at 0x00007FFC57671370 (amdvlk64.dll): 0xC0000005: Access violation reading location 0x0000000000000008.
at amdvlk64.dll!00007ffc57671370()	Unknown
at amdvlk64.dll!00007ffc5740f262()	Unknown
at amdvlk64.dll!00007ffc574104ca()	Unknown
at amdvlk64.dll!00007ffc5742f662()	Unknown
at amdvlk64.dll!00007ffc57418ede()	Unknown
at amdvlk64.dll!00007ffc5739692e()	Unknown
at amdvlk64.dll!00007ffc573a9544()	Unknown
at amdvlk64.dll!00007ffc573a78b7()	Unknown
at amdvlk64.dll!00007ffc573ddc50()	Unknown
at amdvlk64.dll!00007ffc5739cf86()	Unknown
at amdvlk64.dll!00007ffc57397da2()	Unknown
at amdvlk64.dll!00007ffc570e369e()	Unknown
at amdvlk64.dll!00007ffc570e2515()	Unknown
at amdvlk64.dll!00007ffc56fdfd6d()	Unknown
at amdvlk64.dll!00007ffc57059d86()	Unknown
at amdvlk64.dll!00007ffc5705bee2()	Unknown
at amdvlk64.dll!00007ffc56ff152c()	Unknown
at amdvlk64.dll!00007ffc570367a9()	Unknown
at VkLayer_khronos_validation.dll!vvl::dispatch::Device::CreateGraphicsPipelines(VkDevice_T * device, VkPipelineCache_T * pipelineCache, unsigned int createInfoCount, const VkGraphicsPipelineCreateInfo * pCreateInfos, const VkAllocationCallbacks * pAllocator, VkPipeline_T * * pPipelines) Line 895	C++
at VkLayer_khronos_validation.dll!vulkan_layer_chassis::CreateGraphicsPipelines(VkDevice_T * device, VkPipelineCache_T * pipelineCache, unsigned int createInfoCount, const VkGraphicsPipelineCreateInfo * pCreateInfos, const VkAllocationCallbacks * pAllocator, VkPipeline_T * * pPipelines) Line 490	C++
at vulkan-1.dll!00007ffc6f4622a7()	Unknown
at [Managed to Native Transition]	
at Silk.NET.Vulkan.dll!Silk.NET.Vulkan.Vk.CreateGraphicsPipelines(Silk.NET.Vulkan.Device device, Silk.NET.Vulkan.PipelineCache pipelineCache, uint createInfoCount, Silk.NET.Vulkan.GraphicsPipelineCreateInfo pCreateInfos, Silk.NET.Vulkan.AllocationCallbacks* pAllocator, Silk.NET.Vulkan.Pipeline* pPipelines)	Unknown
at Silk.NET.Vulkan.dll!Silk.NET.Vulkan.Vk.CreateGraphicsPipelines(Silk.NET.Vulkan.Device device, Silk.NET.Vulkan.PipelineCache pipelineCache, System.ReadOnlySpan<Silk.NET.Vulkan.GraphicsPipelineCreateInfo> pCreateInfos, Silk.NET.Vulkan.AllocationCallbacks* pAllocator, Silk.NET.Vulkan.Pipeline* pPipelines)	Unknown
[snip]

Expected behavior

There shouldn't be a crash in the driver.

Additional context

This was another exception I was encountering, but a core validation check was complaining before it. (I had used an empty set layout, but the shader has bindings specified.)

Exception thrown: read access violation.
this->module_.set_index_to_bindings_layout_lut_._Mypair._Myval2._Myfirst->_Mypair._Myval2._Myfirst was 0x111011101110122.

at VkLayer_khronos_validation.dll!gpuav::spirv::DescriptorClassGeneralBufferPass::CreateFunctionCall(gpuav::spirv::BasicBlock & block, std::_Vector_iterator<std::_Vector_val<std::_Simple_types<std::unique_ptr<spirv::Instruction,std::default_delete<spirv::Instruction>>>>> * inst_it, const gpuav::spirv::InjectionData & injection_data) Line 56	C++
at VkLayer_khronos_validation.dll!gpuav::spirv::DescriptorClassGeneralBufferPass::Instrument() Line 203	C++
at [Inline Frame] VkLayer_khronos_validation.dll!gpuav::spirv::Pass::Run() Line 28	C++
at VkLayer_khronos_validation.dll!gpuav::GpuShaderInstrumentor::InstrumentShader(const vvl::enumeration<unsigned int const ,unsigned int const *> & input_spirv, unsigned int unique_shader_id, const gpuav::GpuShaderInstrumentor::InstrumentationDescriptorSetLayouts & instrumentation_dsl, const Location & loc, std::vector<unsigned int,std::allocator<unsigned int>> & out_instrumented_spirv) Line 1195	C++
at VkLayer_khronos_validation.dll!gpuav::GpuShaderInstrumentor::PreCallRecordPipelineCreationShaderInstrumentation<vku::safe_VkGraphicsPipelineCreateInfo>(const VkAllocationCallbacks * pAllocator, vvl::Pipeline & pipeline_state, vku::safe_VkGraphicsPipelineCreateInfo & modified_pipeline_ci, const Location & loc, std::vector<chassis::ShaderInstrumentationMetadata,std::allocator<chassis::ShaderInstrumentationMetadata>> & shader_instrumentation_metadata) Line 899	C++
at VkLayer_khronos_validation.dll!gpuav::GpuShaderInstrumentor::PreCallRecordCreateGraphicsPipelines(VkDevice_T * device, VkPipelineCache_T * pipelineCache, unsigned int count, const VkGraphicsPipelineCreateInfo * pCreateInfos, const VkAllocationCallbacks * pAllocator, VkPipeline_T * * pPipelines, const RecordObject & record_obj, std::vector<std::shared_ptr<vvl::Pipeline>,std::allocator<std::shared_ptr<vvl::Pipeline>>> & pipeline_states, chassis::CreateGraphicsPipelines & chassis_state) Line 424	C++
at VkLayer_khronos_validation.dll!vulkan_layer_chassis::CreateGraphicsPipelines(VkDevice_T * device, VkPipelineCache_T * pipelineCache, unsigned int createInfoCount, const VkGraphicsPipelineCreateInfo * pCreateInfos, const VkAllocationCallbacks * pAllocator, VkPipeline_T * * pPipelines) Line 482	C++
at vulkan-1.dll!00007ffc6f4622a7()	Unknown
at [Managed to Native Transition]	
at Silk.NET.Vulkan.dll!Silk.NET.Vulkan.Vk.CreateGraphicsPipelines(Silk.NET.Vulkan.Device device, Silk.NET.Vulkan.PipelineCache pipelineCache, uint createInfoCount, Silk.NET.Vulkan.GraphicsPipelineCreateInfo pCreateInfos, Silk.NET.Vulkan.AllocationCallbacks* pAllocator, Silk.NET.Vulkan.Pipeline* pPipelines)	Unknown
at Silk.NET.Vulkan.dll!Silk.NET.Vulkan.Vk.CreateGraphicsPipelines(Silk.NET.Vulkan.Device device, Silk.NET.Vulkan.PipelineCache pipelineCache, System.ReadOnlySpan<Silk.NET.Vulkan.GraphicsPipelineCreateInfo> pCreateInfos, Silk.NET.Vulkan.AllocationCallbacks* pAllocator, Silk.NET.Vulkan.Pipeline* pPipelines)	Unknown
[snip]
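
The mismatch described above (an empty set layout while the shader declares bindings) can be sketched as a small standalone check. This is a minimal illustration in C++, not the validation layers' implementation: `SetBinding`, `PipelineLayoutSketch`, and `FindMissingBindings` are hypothetical names introduced here, and the layout is modelled with plain standard containers rather than Vulkan objects.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

// Hypothetical stand-ins: a (set, binding) pair used by a shader, and a
// pipeline layout modelled as one set of declared binding numbers per set index.
using SetBinding = std::pair<std::uint32_t, std::uint32_t>;
using PipelineLayoutSketch = std::vector<std::set<std::uint32_t>>;

// Returns every (set, binding) the shader uses that the layout does not declare.
// The set index is range-checked first, so a set missing from the layout is
// reported instead of being used to index out of bounds.
std::vector<SetBinding> FindMissingBindings(const std::vector<SetBinding>& shader_uses,
                                            const PipelineLayoutSketch& layout) {
    std::vector<SetBinding> missing;
    for (const SetBinding& use : shader_uses) {
        const bool set_exists = use.first < layout.size();
        if (!set_exists || layout[use.first].count(use.second) == 0) {
            missing.push_back(use);
        }
    }
    return missing;
}
```

With a layout that has one set and no bindings, every binding the shader declares is reported as missing, which corresponds to the situation the core validation check complained about.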

StefanPoelloth May 13 '25 16:05

@StefanPoelloth I sent you a new share link to help provide a way to reproduce

It might be "enough" to debug this if you can provide the Mesh/Task and Frag shaders used when this occurred, as this might just be some SPIR-V pattern we are not handling correctly.

spencer-lunarg May 14 '25 01:05

@spencer-lunarg Thanks, I've uploaded the repro project; please check out the readme.

The task and frag shaders are basically empty. The mesh shader in question is:

layout(local_size_x = 1, local_size_y = 1, local_size_z = 1) in;
void main()
{
    for (uint i = 0; i < 32 / gl_SubgroupSize; i++) // if one of the operands of "32 / gl_SubgroupSize" is removed, the crash doesn't happen
    {
        uint lineIndex0 = Indices[4]; // if the access to Indices is removed, the crash doesn't happen
    }
}
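
Since `gl_SubgroupSize` is a built-in input rather than a compile-time constant, the loop bound `32 / gl_SubgroupSize` is only known once a subgroup size is chosen. A minimal C++ sketch of the same unsigned division shows how the trip count varies. `LoopTripCount` and the subgroup sizes below are illustrative assumptions, not values taken from the repro project or measured on the affected hardware.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the shader's loop bound `32 / gl_SubgroupSize` using unsigned
// integer division. `subgroup_size` stands in for the built-in and must be
// non-zero.
constexpr std::uint32_t LoopTripCount(std::uint32_t subgroup_size) {
    return 32u / subgroup_size;
}

// Illustrative subgroup sizes only (assumptions, not measurements).
static_assert(LoopTripCount(16) == 2, "two iterations");
static_assert(LoopTripCount(32) == 1, "one iteration");
static_assert(LoopTripCount(64) == 0, "loop body never runs");
```

For a subgroup size above 32 the division truncates to zero, so the `Indices[4]` read in the loop body would never execute at run time, even though the access is still present in the shader that gets instrumented and compiled.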

EDIT: It also happens with the Debug Printf preset. Intel and Nvidia are not affected.

StefanPoelloth May 14 '25 07:05

This problem is really annoying because debugging tools for mesh shader pipelines barely exist, and now DebugPrintf crashes the driver before anything runs.

I was also able to reproduce the problem with the Nvidia driver:

Exception thrown at 0x00007FFC303BCFF6 (nvoglv64.dll) in RenderDemo.exe: 0xC0000005: Access violation reading location 0x0000032631658D68.
 	nvoglv64.dll!00007ffc303bcff6()	Unknown
 	nvoglv64.dll!00007ffc303bd01e()	Unknown
 	nvoglv64.dll!00007ffc30405439()	Unknown
 	nvoglv64.dll!00007ffc2f2c14a8()	Unknown
 	nvoglv64.dll!00007ffc2f32bd58()	Unknown
 	nvoglv64.dll!00007ffc2f275459()	Unknown
 	nvoglv64.dll!00007ffc30106c72()	Unknown
 	nvoglv64.dll!00007ffc301057da()	Unknown
 	nvoglv64.dll!00007ffc3010b580()	Unknown
 	nvoglv64.dll!00007ffc3010abbb()	Unknown
 	nvoglv64.dll!00007ffc30109b2d()	Unknown
 	nvoglv64.dll!00007ffc3010f697()	Unknown
 	nvoglv64.dll!00007ffc300f1eda()	Unknown
 	nvoglv64.dll!00007ffc2ffd16df()	Unknown
 	VkLayer_khronos_validation.dll!vvl::dispatch::Device::CreateGraphicsPipelines(VkDevice_T * device, VkPipelineCache_T * pipelineCache, unsigned int createInfoCount, const VkGraphicsPipelineCreateInfo * pCreateInfos, const VkAllocationCallbacks * pAllocator, VkPipeline_T * * pPipelines) Line 1002	C++
 	VkLayer_khronos_validation.dll!vulkan_layer_chassis::CreateGraphicsPipelines(VkDevice_T * device, VkPipelineCache_T * pipelineCache, unsigned int createInfoCount, const VkGraphicsPipelineCreateInfo * pCreateInfos, const VkAllocationCallbacks * pAllocator, VkPipeline_T * * pPipelines) Line 490	C++
 	vulkan-1.dll!00007ffbb81122a7()	Unknown
 	[Managed to Native Transition]	
 	Silk.NET.Vulkan.dll!Silk.NET.Vulkan.Vk.CreateGraphicsPipelines(Silk.NET.Vulkan.Device device, Silk.NET.Vulkan.PipelineCache pipelineCache, uint createInfoCount, Silk.NET.Vulkan.GraphicsPipelineCreateInfo pCreateInfos, Silk.NET.Vulkan.AllocationCallbacks* pAllocator, Silk.NET.Vulkan.Pipeline* pPipelines)	Unknown
 	Silk.NET.Vulkan.dll!Silk.NET.Vulkan.Vk.CreateGraphicsPipelines(Silk.NET.Vulkan.Device device, Silk.NET.Vulkan.PipelineCache pipelineCache, System.ReadOnlySpan<Silk.NET.Vulkan.GraphicsPipelineCreateInfo> pCreateInfos, Silk.NET.Vulkan.AllocationCallbacks* pAllocator, Silk.NET.Vulkan.Pipeline* pPipelines)	Unknown
[snip]

StefanPoelloth May 15 '25 11:05

I've also seen these 5 recurring exceptions with either DebugPrintf or GPU-AV enabled; I think they might be related to the original problem.

Exception thrown: read access violation. this->layout_._Ptr was 0xFFFFFFFFFFFFFF87. 
Exception thrown: read access violation. this->layout_._Ptr was 0xD30363420.
Exception thrown: read access violation. this was nullptr.
Exception thrown: read access violation. this->layout_._Ptr was nullptr.
Exception thrown: read access violation. this->layout_._Ptr was 0x2000001A2.

The exceptions all happened at this callstack:

>	[Inline Frame] VkLayer_khronos_validation.dll!vvl::DescriptorSetLayout::GetIndexFromBinding(unsigned int) Line 323	C++
 	VkLayer_khronos_validation.dll!vvl::DescriptorSet::GetBinding(unsigned int binding) Line 879	C++
 	VkLayer_khronos_validation.dll!gpuav::LogMessageInstDescriptorClass(gpuav::Validator & gpuav, const unsigned int * error_record, std::string & out_error_msg, std::string & out_vuid_msg, const std::vector<std::shared_ptr<gpuav::DescriptorSet>,std::allocator<std::shared_ptr<gpuav::DescriptorSet>>> & descriptor_sets, const Location & loc, bool uses_shader_object, bool & out_oob_access) Line 732	C++
 	VkLayer_khronos_validation.dll!gpuav::LogInstrumentationError(gpuav::Validator & gpuav, const gpuav::CommandBuffer & cb_state, const LogObjectList & objlist, const gpuav::InstrumentationErrorBlob & instrumentation_error_blob, const std::vector<std::string,std::allocator<std::string>> & initial_label_stack, unsigned int label_command_i, unsigned int operation_index, const unsigned int * error_record, const std::vector<std::shared_ptr<gpuav::DescriptorSet>,std::allocator<std::shared_ptr<gpuav::DescriptorSet>>> & descriptor_sets, VkPipelineBindPoint pipeline_bind_point, bool uses_shader_object, bool uses_robustness, const Location & loc) Line 1039	C++
 	VkLayer_khronos_validation.dll!gpuav::PreCallSetupShaderInstrumentationResources::__l2::<lambda>(gpuav::Validator & gpuav, const gpuav::CommandBuffer & cb_state, const unsigned int * error_record, const LogObjectList & objlist, const std::vector<std::string,std::allocator<std::string>> & initial_label_stack) Line 591	C++
 	[Inline Frame] VkLayer_khronos_validation.dll!stdext::inplace_function<bool __cdecl(gpuav::Validator &,gpuav::CommandBuffer const &,unsigned int const *,LogObjectList const &,std::vector<std::string,std::allocator<std::string>> const &),256,8>::operator()(gpuav::Validator & <args_2>, const gpuav::CommandBuffer &) Line 248	C++
 	VkLayer_khronos_validation.dll!gpuav::CommandBuffer::PostProcess(VkQueue_T * queue, const std::vector<std::string,std::allocator<std::string>> & initial_label_stack, const Location & loc) Line 528	C++
 	VkLayer_khronos_validation.dll!gpuav::Queue::Retire(vvl::QueueSubmission & submission) Line 708	C++
 	VkLayer_khronos_validation.dll!vvl::Queue::ThreadFunc() Line 274	C++
 	[Inline Frame] VkLayer_khronos_validation.dll!std::invoke(void(vvl::Queue::*)() &&) Line 1540	C++
 	VkLayer_khronos_validation.dll!std::thread::_Invoke<std::tuple<void (__cdecl vvl::Queue::*)(void),vvl::Queue *>,0,1>(void * _RawVals) Line 56	C++
 	ucrtbase.dll!00007ffcd8b137b0()	Unknown
 	kernel32.dll!00007ffcda0ae8d7()	Unknown
 	ntdll.dll!00007ffcdaf114fc()	Unknown

StefanPoelloth May 16 '25 09:05