Benchmark mad crash on Jetson Nano

Open: tpoisonooo opened this issue 3 years ago • 1 comment

mad_throughput crashes on Jetson Nano with VK_ERROR_DEVICE_LOST.

This is the call chain:

mad_throughput_main.cc:189 -> GetDeviceBufferViaStagingBuffer -> vulkan_buffer_util.cc:67 -> QueueSubmitAndWait -> crash

I found no null pointers or invalid variables.

I tried to debug it with the validation layers, but Jetson Nano does not provide any (layerCount == 0):

$ vulkaninfo
Instance Extensions:
====================
Instance Extensions	count = 16
	VK_KHR_device_group_creation        : extension revision  1
	VK_KHR_display                      : extension revision 23
	VK_KHR_external_fence_capabilities  : extension revision  1
	VK_KHR_external_memory_capabilities : extension revision  1
	VK_KHR_external_semaphore_capabilities: extension revision  1
	VK_KHR_get_display_properties2      : extension revision  1
	VK_KHR_get_physical_device_properties2: extension revision  2
	VK_KHR_get_surface_capabilities2    : extension revision  1
	VK_KHR_surface                      : extension revision 25
	VK_KHR_surface_protected_capabilities: extension revision  1
	VK_KHR_wayland_surface              : extension revision  6
	VK_KHR_xcb_surface                  : extension revision  6
	VK_KHR_xlib_surface                 : extension revision  6
	VK_EXT_debug_report                 : extension revision  9
	VK_EXT_debug_utils                  : extension revision  1
	VK_EXT_display_surface_counter      : extension revision  1
Layers: count = 0

This is my draft PR: https://github.com/google/uVkCompute/pull/17

tpoisonooo, Mar 22 '22 08:03

VK_ERROR_DEVICE_LOST is often an indication that the workload takes too long to complete on the GPU (likely the case here, I think, given the weak GPU on Jetson Nano). You can try reducing the amount of work per submission to see if it helps.
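One way to reduce the work per submission is to split a large dispatch into smaller slices and submit them one at a time. The sketch below only shows the slicing arithmetic. `DispatchChunk` and `SplitDispatch` are hypothetical names for illustration and are not part of uVkCompute, and a suitable slice size for Jetson Nano would have to be found by experiment.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of uVkCompute: one slice of a larger dispatch.
struct DispatchChunk {
  uint32_t first_group;  // index of the first workgroup in this slice
  uint32_t group_count;  // number of workgroups in this slice
};

// Splits `total_groups` workgroups into slices of at most
// `max_groups_per_submit`, so each slice can be submitted and waited on
// separately instead of in one long-running submission.
std::vector<DispatchChunk> SplitDispatch(uint32_t total_groups,
                                         uint32_t max_groups_per_submit) {
  assert(max_groups_per_submit > 0);
  std::vector<DispatchChunk> chunks;
  for (uint32_t first = 0; first < total_groups;) {
    // The last slice may be smaller than the limit.
    uint32_t count = std::min(max_groups_per_submit, total_groups - first);
    chunks.push_back({first, count});
    first += count;
  }
  return chunks;
}
```

Each slice would then get its own dispatch and queue submission, with `first_group` passed to the shader (for example through a push constant) so it can compute its global offset. For a throughput benchmark, lowering the per-invocation loop count or the buffer size may be the simpler way to shrink the workload, since slicing changes what is being timed.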

antiagainst, Mar 28 '22 20:03