
Vulkan backend (docking): Fix VUIDs 03868/00067 (unsafe render semaphore usage)

Open Gabboxl opened this issue 7 months ago • 5 comments

Fixes VUID-vkQueueSubmit2-semaphore-03868 + VUID-vkQueueSubmit-pSignalSemaphores-00067 validation errors (external issue https://github.com/KhronosGroup/Vulkan-ValidationLayers/issues/10254):

[ERROR: Validation] - VUID-vkQueueSubmit2-semaphore-03868
vkQueueSubmit2(): pSubmits[0].pSignalSemaphoreInfos[0].semaphore (VkSemaphore 0x150000000015) is being signaled by VkQueue 0x258c8401f90, but it may still be in use by VkSwapchainKHR 0x430000000043.
Here are the most recently acquired image indices: 2, 1, 0, 1, 0, [1], 0, 2.
(brackets mark the last use of VkSemaphore 0x150000000015 in a presentation operation)
Swapchain image 1 was presented but was not re-acquired, so VkSemaphore 0x150000000015 may still be in use and cannot be safely reused with image index 2.

  • Remove ImGui_ImplVulkanH_FrameSemaphores structure (acquire and render semaphores have to be indexed separately)
  • Integrate ImageAcquiredSemaphore VkSemaphore into ImGui_ImplVulkanH_Frame struct, as it is a current-frame-specific feature
  • Make RenderSemaphores independent, indexed by the current SwapchainImageIndex (this avoids the validation error and ensures the render semaphore can be safely reused: if vkAcquireNextImageKHR has returned that index, the previous presentation of that image has implicitly completed)
  • This fix does NOT require the use of any new extensions

The fundamental issue is that the original version of vkQueuePresentKHR cannot signal a fence, so the only fence we can check is the one signaled by vkQueueSubmit/vkQueueSubmit2, which completes one step before the present operation. The present operation itself waits on a semaphore, the "render semaphore", which that submit signals. If we cannot determine when the present operation finishes, we cannot know when it is safe to signal or wait on the render semaphore associated with that swapchain image again, because a queued present operation could still be waiting on it.

Without any additional extension, we can exploit the behavior of vkAcquireNextImageKHR: it returns an image index only once that image has already been presented and is ready to be used again. That's why we index the "render semaphores" using the current swapchain image index.
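
To illustrate the reasoning, here is a small toy model. It is not imgui code: `CountUnsafeReuses`, `Indexing` and the integer stand-ins for semaphores are names made up for this sketch. It replays a sequence of acquired image indices and counts how often a render semaphore would be reused while the present that last waited on it may still be pending, under the simplifying assumption that re-acquiring an image is the only evidence that its previous present has completed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model, not the imgui backend.
enum class Indexing { ByFrame, ByImage };

// Returns how many times a render semaphore is picked while a present that
// waited on it might still be pending.
int CountUnsafeReuses(const std::vector<std::uint32_t>& acquired,
                      std::size_t semaphoreCount, Indexing mode)
{
    assert(semaphoreCount > 0);
    constexpr int kFree = -1;
    // pendingImage[s] = image whose present last waited on semaphore s.
    std::vector<int> pendingImage(semaphoreCount, kFree);
    int unsafe = 0;
    for (std::size_t frame = 0; frame < acquired.size(); ++frame)
    {
        const int image = static_cast<int>(acquired[frame]);
        // Assumption: re-acquiring an image means its previous present is
        // done, so any semaphore that present waited on is free again.
        for (int& p : pendingImage)
            if (p == image)
                p = kFree;
        const std::size_t s = (mode == Indexing::ByImage)
                                  ? acquired[frame]
                                  : frame % semaphoreCount;
        if (pendingImage.at(s) != kFree)
            ++unsafe; // the case VUID 03868 / 00067 complains about
        pendingImage.at(s) = image;
    }
    return unsafe;
}
```

Replaying the sequence 2, 1, 0, 1, 0, 1, 0, 2 from the validation message with two frame-indexed semaphores flags a reuse on the last acquire (image 2 picks the semaphore last used to present image 1, which was never re-acquired), while indexing by image index never flags anything in this model.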

Another solution is to use the VK_EXT_swapchain_maintenance1 extension (recently promoted to VK_KHR_swapchain_maintenance1), which makes it possible to specify a fence to be signaled by the vkQueuePresentKHR operation, but vendor adoption may still be problematic for now.

More info on the fundamental issue with examples: https://docs.vulkan.org/guide/latest/swapchain_semaphore_reuse.html

Gabboxl avatar Jul 28 '25 22:07 Gabboxl

Thank you!

Tagging @NostraMagister for possible feedback? (following https://github.com/ocornut/imgui/issues/7236#issuecomment-2258784142)

ocornut avatar Jul 30 '25 08:07 ocornut

This is still generating validation errors for me. Specifically, a "vkAcquireNextImageKHR(): Semaphore must not have any pending operations" validation error.

I added printf("Calling vkAcquireNextImageKHR. CurrentFrameIndex: %i\n", wd->CurrentFrameIndex) before the vkAcquireNextImageKHR call and this is the output when an ImGui window is dragged outside of the main viewport:

Calling vkAcquireNextImageKHR. CurrentFrameIndex: 0
Calling vkAcquireNextImageKHR. CurrentFrameIndex: 1
Calling vkAcquireNextImageKHR. CurrentFrameIndex: 0
[vulkan] Debug report from ObjectType: 5
Message: vkAcquireNextImageKHR(): Semaphore must not have any pending operations.
The Vulkan spec states: If semaphore is not VK_NULL_HANDLE, it must not have any uncompleted signal or wait operations pending (https://vulkan.lunarg.com/doc/view/1.4.321.1/windows/antora/spec/latest/chapters/VK_KHR_surface/wsi.html#VUID-vkAcquireNextImageKHR-semaphore-01779)

Calling vkAcquireNextImageKHR. CurrentFrameIndex: 1
[vulkan] Debug report from ObjectType: 5
Message: vkAcquireNextImageKHR(): Semaphore must not have any pending operations.
The Vulkan spec states: If semaphore is not VK_NULL_HANDLE, it must not have any uncompleted signal or wait operations pending (https://vulkan.lunarg.com/doc/view/1.4.321.1/windows/antora/spec/latest/chapters/VK_KHR_surface/wsi.html#VUID-vkAcquireNextImageKHR-semaphore-01779)

Calling vkAcquireNextImageKHR. CurrentFrameIndex: 0
[vulkan] Debug report from ObjectType: 5
Message: vkAcquireNextImageKHR(): Semaphore must not have any pending operations.
The Vulkan spec states: If semaphore is not VK_NULL_HANDLE, it must not have any uncompleted signal or wait operations pending (https://vulkan.lunarg.com/doc/view/1.4.321.1/windows/antora/spec/latest/chapters/VK_KHR_surface/wsi.html#VUID-vkAcquireNextImageKHR-semaphore-01779)

Calling vkAcquireNextImageKHR. CurrentFrameIndex: 1
[vulkan] Debug report from ObjectType: 5
Message: vkAcquireNextImageKHR(): Semaphore must not have any pending operations.
The Vulkan spec states: If semaphore is not VK_NULL_HANDLE, it must not have any uncompleted signal or wait operations pending (https://vulkan.lunarg.com/doc/view/1.4.321.1/windows/antora/spec/latest/chapters/VK_KHR_surface/wsi.html#VUID-vkAcquireNextImageKHR-semaphore-01779)

// etc.

The first two calls are fine, but as soon as we go back to the first frame's data we try to use an in-use semaphore, because we don't wait for the frame's RenderFence. Moving the vkWaitForFences before vkAcquireNextImageKHR fixes this, and this is indeed what the Vulkan docs do. However, @mklefrancois notes that this causes stuttering on fast GPUs, although I didn't notice any stuttering on my end.
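
A rough way to see why the order matters is the toy model below. It is a worst-case sketch, not the backend code: `FrameSlot`, `Order` and `CountPendingSemaphoreAcquires` are hypothetical names, and a slot's acquire semaphore is treated as pending until the CPU has waited on that slot's fence.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-frame slot: true while the last submit that waits on this
// slot's acquire semaphore has not been confirmed finished by a fence wait.
struct FrameSlot { bool submitInFlight = false; };

enum class Order { AcquireThenWait, WaitThenAcquire };

// Counts acquires issued with a semaphore that may still have a pending wait.
int CountPendingSemaphoreAcquires(std::size_t slotCount, std::size_t frameCount,
                                  Order order)
{
    assert(slotCount > 0);
    std::vector<FrameSlot> slots(slotCount);
    int violations = 0;
    for (std::size_t frame = 0; frame < frameCount; ++frame)
    {
        FrameSlot& slot = slots[frame % slotCount];
        if (order == Order::WaitThenAcquire)
            slot.submitInFlight = false; // stands in for vkWaitForFences
        if (slot.submitInFlight)
            ++violations;                // stands in for VUID 01779 on acquire
        // In AcquireThenWait order the fence wait would happen here, after
        // the acquire; either way the new submit makes the slot busy again.
        slot.submitInFlight = true;      // stands in for the queue submit
    }
    return violations;
}
```

With two slots and six frames, the acquire-then-wait order leaves the first two acquires clean and flags the remaining four, which mirrors the log above; waiting first flags none.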

Voxeles avatar Aug 21 '25 12:08 Voxeles

Hi @Voxeles, yes, you're right. I was fixing that issue too (#7236), and I mistakenly referenced it in this PR, which resolves a completely different issue.

I removed the reference to that issue from the PR description; this PR only fixes VUIDs 03868 and 00067.

Sorry for the mistake. I will share more on issue #7236 when I find time to work on it again. By the way, I was also testing the solution you mentioned, and I didn't experience any stuttering either.

@ocornut could you review this PR again, please? Thank you.

Gabboxl avatar Aug 25 '25 20:08 Gabboxl

FYI the easiest way of fixing #7236 without moving the vkWaitForFences is to create a second vector of semaphores specifically for use with vkAcquireNextImageKHR. You need ImageCount + 1 of these semaphores and then you cycle through them to prevent grabbing a potentially-in-use semaphore.

Basically, this is the same solution that is already in use, except instead of the struct you only have the ImageAcquiredSemaphore. The RenderCompleteSemaphore needs to be separated into a different vector like in your PR, giving us two vectors of semaphores that we then use in different ways (one is per-image, the other one we cycle through for each image).
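
As a sketch of that layout, with made-up names (`SwapchainSemaphores`, `NextAcquireSemaphore`, `RenderSemaphoreFor`) and plain integers standing in for VkSemaphore handles, so it is an illustration under those assumptions rather than the actual backend structs:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using SemaphoreHandle = std::uint64_t; // stand-in for VkSemaphore

struct SwapchainSemaphores
{
    std::vector<SemaphoreHandle> AcquireRing;    // ImageCount + 1, cycled
    std::vector<SemaphoreHandle> RenderPerImage; // ImageCount, per image index
    std::size_t AcquireCursor = 0;

    explicit SwapchainSemaphores(std::size_t imageCount)
    {
        assert(imageCount > 0);
        SemaphoreHandle nextId = 1; // fake handles; real code would create semaphores
        for (std::size_t i = 0; i < imageCount + 1; ++i)
            AcquireRing.push_back(nextId++);
        for (std::size_t i = 0; i < imageCount; ++i)
            RenderPerImage.push_back(nextId++);
    }

    // Semaphore to pass to the next acquire call; advances the ring.
    SemaphoreHandle NextAcquireSemaphore()
    {
        const SemaphoreHandle s = AcquireRing[AcquireCursor];
        AcquireCursor = (AcquireCursor + 1) % AcquireRing.size();
        return s;
    }

    // Semaphore the submit signals and the present waits on for this image.
    SemaphoreHandle RenderSemaphoreFor(std::uint32_t imageIndex) const
    {
        return RenderPerImage.at(imageIndex);
    }
};
```

The acquire semaphore has to be chosen before the image index is known, which is why it is cycled rather than indexed by image; the render semaphore is looked up only after the acquire has returned an index.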

Voxeles avatar Aug 25 '25 21:08 Voxeles

The modern way to deal with a Vulkan swapchain is to use a timeline semaphore. There is an implementation of it in vk_minimal_latest; see the Swapchain class. This is also used in nvpro_core2 for NVIDIA pro samples. Unfortunately, on some platforms using older video drivers, the timeline semaphore might not work as expected, and therefore a VkFence might be needed. However, this is an addition that shouldn't be necessary. Migrating to timeline semaphores would be nice to have.

mklefrancois avatar Aug 26 '25 07:08 mklefrancois