`VUID-vkQueueSubmit-pSignalSemaphores-00067`: incorrect use of `renderFinishedSemaphores` in the "frames in flight" chapter

Open chuigda opened this issue 6 months ago • 6 comments

As stated in http://disq.us/p/32wdmkz, a recent Vulkan Validation Layer update exposed the problem that most Vulkan tutorials (also vkguide, and other derivative works of VulkanTutorial) and Vulkan programs (even vkcube) did not correctly make use of semaphores to synchronize presentation.

I am not a Vulkan expert, but here is my understanding. In the following code:

VkSemaphore signalSemaphores[] = {renderFinishedSemaphores[currentFrame]};
submitInfo.signalSemaphoreCount = 1;
submitInfo.pSignalSemaphores = signalSemaphores;
vkQueueSubmit(graphicsQueue, 1, &submitInfo, inFlightFences[currentFrame]);

Here we are actually expecting renderFinishedSemaphores[currentFrame] to have already been acquired/consumed by the presentation queue during the previous few iterations. However, there is no such guarantee, so it is possible for the graphics queue to signal an already signaled semaphore.

To fix this problem, the program should create the renderFinishedSemaphores array with swapChainImages.size() semaphores and use the imageIndex returned from vkAcquireNextImageKHR to index into the array, ensuring that the semaphore has been acquired/consumed:

https://github.com/kennyalive/vulkan-base/commit/27bcaad9d519cc2f9c5cde4872742d4a5212eee6 https://github.com/JasonPhone/vrtr/commit/c459d8829458da8d9bd4c3ea78cb1ee5de8f3922 https://github.com/JorriFransen/v10/commit/3652d32a52ba64b5fccfb48a2abe1e7181c1dd23
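To make the difference between the two indexing schemes concrete, here is a minimal, Vulkan-free sketch of the bookkeeping. It models only one rule, in the spirit of the validation message: a semaphore that a present of image i waited on is treated as possibly still in use until image i is acquired again. The names `Indexing` and `countHazards`, and the model itself, are illustrative assumptions; they are not part of the Vulkan API or of the validation layers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// How the render-finished semaphore slot is chosen each frame.
enum class Indexing { PerFrameInFlight, PerSwapchainImage };

// Counts frames in which the chosen semaphore slot was last handed to a
// present of an image that has not been re-acquired since.
// `acquired` is the sequence of image indices returned by the acquire calls.
int countHazards(const std::vector<int>& acquired, int imageCount,
                 int framesInFlight, Indexing mode) {
    const int slotCount =
        mode == Indexing::PerFrameInFlight ? framesInFlight : imageCount;
    // pendingImage[s] = image whose present last waited on slot s, or -1.
    std::vector<int> pendingImage(slotCount, -1);
    int hazards = 0;
    for (std::size_t frame = 0; frame < acquired.size(); ++frame) {
        const int image = acquired[frame];
        assert(image >= 0 && image < imageCount);
        // Re-acquiring an image shows that its previous present has
        // consumed whichever semaphore it waited on.
        for (int& pending : pendingImage) {
            if (pending == image) pending = -1;
        }
        const int slot = mode == Indexing::PerFrameInFlight
                             ? static_cast<int>(frame % framesInFlight)
                             : image;
        if (pendingImage[slot] != -1) ++hazards;  // slot may still be in use
        pendingImage[slot] = image;  // the present of `image` waits on `slot`
    }
    return hazards;
}
```

With 3 swapchain images acquired in the order 0, 1, 2 and 2 frames in flight, the per-frame scheme reuses slot 0 for image 2 while image 0 has not been re-acquired, so the model reports one hazard. Indexing by image reports none for any acquire order, because a slot is only ever reused after its own image comes back.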

I also made my personal fix on my own Java version tutorial: old -> new

chuigda avatar May 28 '25 04:05 chuigda

Unfortunately I currently don't have much time to maintain the tutorial. Would you be willing to submit a PR for this issue?

Overv avatar Jun 09 '25 12:06 Overv

Sure, I'll find some time to fix this.

chuigda avatar Jun 10 '25 11:06 chuigda

Having run into the same bug while following the tutorial a few months ago, and after rethinking the problem, I wonder whether the whole in-flight frames concept (the frame counter that user code increments explicitly) is questionable and should be removed altogether.

Let me try to break down my thoughts:

  • The three important calls are:
    1. AcquireNextImage
    2. QueueSubmit
    3. QueuePresent
  • The three important synchronization objects are
    1. An AcquireNextImage semaphore
    2. A "Render is finished" semaphore (from queue submit)
    3. And the "Render is finished" fence (also from queue submit)

The acquire semaphore is important so that the submitted work does not start before the acquire has actually finished (so the queue submit waits on this semaphore). The queue submit signals the semaphore when rendering is finished (important for QueuePresent) and the fence, which we wait for after the next AcquireNextImage of that image (important for synchronizing host objects).

Another important thing is that QueuePresent is executed in order:

The processing of the presentation happens in issue order with other queue operations, but semaphores must be used to ensure that prior rendering and other commands in the specified queue complete before the presentation begins. The presentation command itself does not delay processing of subsequent commands on the queue. However, presentation requests sent to a particular queue are always performed in order.

Let me express this in terms of code. Assume we have SwapchainImages.size() swapchain images.

Then we need

// For the next acquire call
vk::Semaphore acquire_semaphore;
// All acquire semaphores currently in use (per swapchain image)
std::vector<vk::Semaphore> busy_acquire_semaphores;
// All QueueSubmit semaphores currently in use (per swapchain image)
std::vector<vk::Semaphore> queue_submit_semaphores;
// All QueueSubmit fences currently in use (per swapchain image)
std::vector<vk::Fence> queue_submit_fences;
// The swapchain image index acquired
u32 image_index;

All the vectors are initialized with SwapchainImages.size() objects.

Now the rendering workflow would look like this (slightly simplified):

vkAcquireNextImageKHR(acquire_semaphore, &image_index);
vkWaitForFences(queue_submit_fences[image_index]);
// [...]

auto wait_semaphore = acquire_semaphore;
auto signal_semaphore = queue_submit_semaphores[image_index];
auto signal_fence = queue_submit_fences[image_index];
vkResetFences(signal_fence);
vkQueueSubmit(wait_semaphore, signal_semaphore, signal_fence);

// Swap the acquire semaphore with the busy one for this image index
std::swap(busy_acquire_semaphores[image_index], acquire_semaphore);

// [..] 
vkQueuePresentKHR(signal_semaphore);

So:

  • We have an acquire semaphore that is not bound to any swapchain image index (since we cannot foresee which image index will be acquired)
  • We swap this acquire semaphore with the busy one for the given image index (since the busy one is no longer busy)
  • The submit semaphore and fence stay in use until acquire image returns the image index for these sync objects again (the semaphore is waited on by the presentation engine, the fence is waited on explicitly in our code)
  • QueuePresent already defines an implicit order of rendered images (I don't see a need to synchronize two separate frames with each other)
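The rotation of the spare acquire semaphore described in these points can be sketched without any Vulkan calls. In the sketch below, plain integers stand in for VkSemaphore handles, and `AcquireSemaphorePool` and `onAcquired` are hypothetical names used only for illustration. It assumes, as in the workflow above, that the fence for the acquired image has been waited on before the old semaphore is handed to a later acquire.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Integer stand-ins for semaphore handles; no Vulkan calls are made.
struct AcquireSemaphorePool {
    std::vector<int> busy;  // busy[i]: semaphore signaled by the last acquire of image i
    int spare;              // not tied to any image; passed to the next acquire

    explicit AcquireSemaphorePool(int imageCount) : spare(imageCount) {
        for (int i = 0; i < imageCount; ++i) busy.push_back(i);
    }

    // Call after the acquire (which was given `spare`) returned `imageIndex`.
    // Returns the semaphore that the submit for this image has to wait on.
    int onAcquired(int imageIndex) {
        assert(imageIndex >= 0 && imageIndex < static_cast<int>(busy.size()));
        const int waitSemaphore = spare;
        // The semaphore from this image's previous use becomes the new spare;
        // the one that was just handed to the acquire stays with the image.
        std::swap(busy[imageIndex], spare);
        return waitSemaphore;
    }
};
```

For N swapchain images the pool holds N + 1 handles, and the spare is never one of the handles currently stored in `busy`, whatever order the images come back in.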

I don't see where the need for in-flight frames is. Maybe the concept should be removed completely.

StarStarJ avatar Jun 15 '25 09:06 StarStarJ

From what I understood, the purpose of the in-flight count is to have no more than 2 frames processed at the same time, while there would often be 3 framebuffers. If you just made 3 fences, wouldn't that mean the computer would try to render them all at the same time? Even if one of the images is currently used for presenting, the render pass can still do stuff like vertex shaders. Correct me if I'm wrong.

svetli97 avatar Jun 15 '25 13:06 svetli97

From what I understood, the purpose of the in-flight count is to have no more than 2 frames processed at the same time, while there would often be 3 framebuffers.

You are right, and I think that is also why the tutorial suggests it. I have no overview of how drivers are actually implemented, but I'd claim the tutorial simplifies this topic a bit anyway, e.g.:

We choose the number 2 because we don't want the CPU to get too far ahead of the GPU. With 2 frames in flight, the CPU and the GPU can be working on their own tasks at the same time. If the CPU finishes early, it will wait till the GPU finishes rendering before submitting more work. With 3 or more frames in flight, the CPU could get ahead of the GPU, adding frames of latency.

If the CPU has to wait for the GPU to finish before it starts the next frame, you have added latency at that moment, because you don't know how long the CPU needs to prepare the next frame (or how fast the GPU is with two different frames).

A good driver, in my eyes, would hold one swapchain image, and as soon as it wants to update the held image, it swaps it with the newer of the two that are not in use. That way it can always return one of the images to the AcquireNext call and minimize blocking. This most likely gives the lowest latency:

  1. Because if you don't use vsync (or adaptive sync), the driver might swap the image a few times during one monitor cycle.
  2. As said, it's not clear how long the CPU needs to prepare a frame. You might, by luck, get a new frame just in time before the driver sends the next chunk of data to the monitor. Generally, vsync and adaptive sync always add latency, simply because it's impossible to have perfect frame times to sync GPU and CPU work.

I find this behavior likely because it is one of the reasons this issue exists in the first place: AcquireImage sometimes returns an image index early (one that was used recently).

If you want to target a specific fps value, you might want to use VK_PRESENT_MODE_FIFO_RELAXED_KHR, VK_PRESENT_MODE_FIFO_KHR or an app internal fps logic (that is unrelated to your backend).

Anyway, maybe I am wrong or oversimplifying too. I'd happily hear the opinions of others. To me it just feels like the in-flight concept adds synchronization on top of what Vulkan already gives implicitly.

If you just made 3 fences, wouldn't that mean the computer would try to render them all at the same time? Even if one of the images is currently used for presenting, the render pass can still do stuff like vertex shaders. Correct me if I'm wrong.

Sorry, but I am not entirely sure I understand what you are trying to say. You still need to acquire the next image, so you cannot just render to an image that is currently being presented (the semaphore signaled by AcquireNext is still important) ^^

StarStarJ avatar Jun 15 '25 14:06 StarStarJ

Ok I finally started to understand what's happening. And your idea of having one extra semaphore is so far the only compliant implementation I've found.

  • you need to pass a semaphore to vkAcquireNextImageKHR that is not used at the moment;
  • with 1 semaphore per image, the only guaranteed free one is the one for the image you are trying to acquire;
  • but you don't know which one it is until you call the same function that you need to pass it to;
  • a paradox!
  • so instead you assume you can't use any of them and make one extra semaphore slot that swaps with the one that was last used with the acquired image so it is guaranteed to be unused.

At the end of the day it is a simple and reliable solution. Maybe this should be in the tutorial. Or maybe there should be a way to pass an array of semaphores to vkAcquireNextImageKHR that would automatically index it with the image index.

And about the fences, I realised that the reason we only need and use 2 fences is because they are only relevant for submitting command buffers. You don't want to write to a buffer while the API is reading it and that's why you need 2. Probably. And their count isn't very directly related to the thing I was thinking about originally.

svetli97 avatar Jun 15 '25 20:06 svetli97

I thought a bit more about the fences.

In the end, fences are also there to limit the number of pre-flight images being computed, since they block the CPU, aren't they? So you will need 3 fences if you want 3 images prepared. However, you must always use fewer pre-flight frames than images in the swapchain (obviously), as otherwise a command-queue could be written while it is being read (as was already mentioned).

Then I realized that in one frame something like this happens:

  1. Wait for the flight fence
  2. Acquire the next image and signal the acquire semaphore
  3. Wait for the acquire semaphore, then submit the buffer and signal the flight fence

This means that by using 2 fences you only have to use 2 acquire semaphores, because when the flight fence is signaled, it is safe to say that the corresponding acquire semaphore was signaled too. So you need the same number of acquire semaphores as you have fences / pre-flight frames.

For the queue-submit semaphores, however, you will need 1 semaphore per image plus one extra, as the presenting and thus re-acquiring of the semaphore is not in sync with any other synchronization mechanism.

Tbh. I only started doing Vulkan things 2 weeks ago, I do not know much about it and I do not know how to express this better. These are just my thoughts and maybe they help but if I am completely wrong feel free to correct me!

Placeblock avatar Jun 17 '25 23:06 Placeblock

I think you got a few things wrong, but you also convinced me that this may actually be a false positive, though it would be very difficult for the validation layers to actually know whether the usage is OK or not.

With 2 fences and 2 acquire semaphores cycling together, the fences protect the semaphores from being misused. But it is somewhat complicated:

  • the fence is signalled when the buffer finishes rendering
  • the buffer finishes after it waited on the acquire semaphore
  • the acquire semaphore is signalled after the fence was waited on

So yes, it seems like the usage is perfectly valid. Though difficult to validate due to the chains of sync objects.
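The chain above can be checked with a small model that has no Vulkan dependency. Each frame-in-flight slot owns one binary acquire semaphore and one fence, and submitted frames finish on the "GPU" in submission order. `FrameLoopModel` and `countViolations` are hypothetical names, and the model is deliberately conservative: it treats the semaphore as signaled until the whole submit has finished, not just until the wait has executed.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

struct FrameLoopModel {
    std::vector<bool> acquireSignaled;  // acquire semaphore state per slot
    std::vector<bool> fencePending;     // true while the slot's submit is unfinished
    std::deque<std::size_t> gpuQueue;   // slots with unfinished submits, oldest first
    int violations = 0;                 // signals of an already signaled semaphore

    explicit FrameLoopModel(std::size_t slots)
        : acquireSignaled(slots, false), fencePending(slots, false) {}

    // The GPU finishes the oldest submit: its wait has consumed the acquire
    // semaphore, and its fence becomes signaled.
    void finishOldestSubmit() {
        assert(!gpuQueue.empty());
        const std::size_t slot = gpuQueue.front();
        gpuQueue.pop_front();
        acquireSignaled[slot] = false;
        fencePending[slot] = false;
    }

    void drawFrame(std::size_t slot, bool waitForFence) {
        if (waitForFence) {
            // Stand-in for blocking on the slot's fence.
            while (fencePending[slot]) finishOldestSubmit();
        }
        if (acquireSignaled[slot]) ++violations;  // acquire would signal it again
        acquireSignaled[slot] = true;             // acquire signals the semaphore
        fencePending[slot] = true;                // submit waits on it, signals the fence
        gpuQueue.push_back(slot);
    }
};

int countViolations(std::size_t frames, std::size_t slots, bool waitForFence) {
    FrameLoopModel model(slots);
    for (std::size_t f = 0; f < frames; ++f) {
        model.drawFrame(f % slots, waitForFence);
    }
    return model.violations;
}
```

With the fence wait in place the model never signals a signaled semaphore; skipping the wait makes every reuse of a slot a violation. The validity therefore rests entirely on CPU-side control flow, which is the part a validation layer cannot see.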

But I don't like seeing warning/error messages, and I would prefer code that is easier to reason about and future-proof. So I will stick with the 2 fences and 4 acquire semaphores. That way, if I ever have to modify this logic, there will be less risk of misuse. So at the end of the day the message warns you about a real danger.

Here is an example of an allegedly "correct solution" that actually fails but somehow avoids the warning. When the swapchain occasionally rearranges the images (which happens often in some present modes), you do get an error message: https://github.com/KhronosGroup/Vulkan-Tutorial/issues/57#issuecomment-2973861304 The error happens because this solution cycles the 3 semaphores but the swapchain makes no guarantee that the acquired image indices cycle the same way. So it gives a warning every time the order switches:

validation layer: vkQueueSubmit(): pSubmits[0].pSignalSemaphores[0] (VkSemaphore 0x180000000018) is being signaled by VkQueue 0x21bd76a0b50, but it may still be in use by VkSwapchainKHR 0x30000000003.
Here are the most recently acquired image indices: 0, 2, 1, 0, [2], 1, 0, 1.
(brackets mark the last use of VkSemaphore 0x180000000018 in a presentation operation)
Swapchain image 2 was presented but was not re-acquired, so VkSemaphore 0x180000000018 may still be in use and cannot be safely reused with image index 1.

Again, I think with 2 fences and 3 semaphores this is essentially the same case as with 2 fences and 2 semaphores cycled this way. The validation layers have no way of knowing what your code on the CPU would do. A random "goto" somewhere else in the code could skip the wait on fence. So it is unprovable without completely proving the whole program.

And about the render-finished-semaphores, those you can bind to specific images because at that point you do know the current image index. So the simple and safe solution is one per image.

The full drawFrame (couldn't be bothered to clean it up, sorry):

void drawFrame() {
    vkWaitForFences(device, 1, &inFlightFences[currentFrame], VK_TRUE, UINT64_MAX);
    uint32_t imageIndex;
    VkResult result = vkAcquireNextImageKHR(
        device,
        swapChain,
        UINT64_MAX,
        acquire_semaphore, // idea from https://github.com/Overv/VulkanTutorial/issues/407#issuecomment-2973620217
        VK_NULL_HANDLE,
        &imageIndex);
    if (result == VK_ERROR_OUT_OF_DATE_KHR) {
        recreateSwapChain();
        // cout << "recreate after acquire\n";
        return;
    } else if (result != VK_SUCCESS && result != VK_SUBOPTIMAL_KHR) {
        throw std::runtime_error("failed to acquire swap chain image!");
    }
    vkResetFences(device, 1, &inFlightFences[currentFrame]); // reset after passing the early return
    updateUniformBuffer(currentFrame);
    vkResetCommandBuffer(commandBuffers[currentFrame],  0);
    recordCommandBuffer(commandBuffers[currentFrame], imageIndex);
    VkSubmitInfo submitInfo{};
    submitInfo.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO;

    VkSemaphore waitSemaphores[] = {acquire_semaphore};
    VkPipelineStageFlags waitStages[] = {VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT};
    submitInfo.waitSemaphoreCount = 1;
    submitInfo.pWaitSemaphores = waitSemaphores;
    submitInfo.pWaitDstStageMask = waitStages;
    submitInfo.commandBufferCount = 1;
    submitInfo.pCommandBuffers = &commandBuffers[currentFrame];
    VkSemaphore signalSemaphores[] = {renderFinishedSemaphores[imageIndex]};
    submitInfo.signalSemaphoreCount = 1;
    submitInfo.pSignalSemaphores = signalSemaphores;
    if (vkQueueSubmit(graphicsQueue, 1, &submitInfo, inFlightFences[currentFrame]) != VK_SUCCESS) {
        throw std::runtime_error("failed to submit draw command buffer!");
    }
    VkPresentInfoKHR presentInfo{};
    presentInfo.sType = VK_STRUCTURE_TYPE_PRESENT_INFO_KHR;

    presentInfo.waitSemaphoreCount = 1;
    presentInfo.pWaitSemaphores = signalSemaphores;
    VkSwapchainKHR swapChains[] = {swapChain};
    presentInfo.swapchainCount = 1;
    presentInfo.pSwapchains = swapChains;
    presentInfo.pImageIndices = &imageIndex;
    presentInfo.pResults = nullptr; // Optional
    std::swap(imageAvailableSemaphores[imageIndex], acquire_semaphore);
    result = vkQueuePresentKHR(presentQueue, &presentInfo);
    // test after present to avoid semaphore issues
    if (result == VK_ERROR_OUT_OF_DATE_KHR
        || result == VK_SUBOPTIMAL_KHR
        || framebufferResized) {
        framebufferResized = false;
        // cout << "recreate after present\n";
        recreateSwapChain();
    } else if (result != VK_SUCCESS) {
        throw std::runtime_error("failed to present swap chain image!");
    }
    currentFrame = (currentFrame + 1) % MAX_FRAMES_IN_FLIGHT;
}

svetli97 avatar Jun 18 '25 00:06 svetli97

Here we are actually expecting renderFinishedSemaphores[currentFrame] to have already been acquired/consumed by the presentation queue during the previous few iterations. However, there is no such guarantee, so it is possible for the graphics queue to signal an already signaled semaphore.

According to the Spec, only after the command buffer is fully executed, the pSignalSemaphores in VkSubmitInfo and the fence in vkQueueSubmit will be signaled. Why is there a situation where renderFinishedSemaphores[currentFrame] is repeatedly signaled in this case?

I have carefully read your insightful words and you said 'this can not be guaranteed'. Could you please explain the specific reason in detail?

xiaomx32 avatar Jun 25 '25 01:06 xiaomx32

The issue here stems from the incomplete information the Vulkan validation layers have about the control flow on the CPU side. So it is more appropriate to say that the validation layers cannot guarantee correct use, even though the CPU-side control flow plus the fences and semaphores do guarantee it.

You can see my last comment for a detailed explanation, which I wrote after I figured out all the details of this issue. But to put it simply: the validation layers don't know that you intend to wait on the fence before you signal the semaphores, so they warn you about it. You can avoid the warnings by rewriting the logic in a way that gives the validation layers enough information to know that the use is valid.

svetli97 avatar Jun 25 '25 09:06 svetli97

I am quite new to Vulkan so if the following doesn't make sense feel free to ignore this.

Here's what i think is going on.

First of all, the error message:

validation layer: vkQueueSubmit(): pSubmits[0].pSignalSemaphores[0] (VkSemaphore 0x300000000030) is being signaled by VkQueue 0x8b34380, but it may still be in use by VkSwapchainKHR 0x30000000003.
Here are the most recently acquired image indices: [0], 1, 2.
(brackets mark the last use of VkSemaphore 0x300000000030 in a presentation operation)
Swapchain image 0 was presented but was not re-acquired, so VkSemaphore 0x300000000030 may still be in use and cannot be safely reused with image index 2.
Vulkan insight: One solution is to assign each image its own semaphore. Here are some common methods to ensure that a semaphore passed to vkQueuePresentKHR is not in use and can be safely reused:
a) Use a separate semaphore per swapchain image. Index these semaphores using the index of the acquired image.
b) Consider the VK_EXT_swapchain_maintenance1 extension. It allows using a VkFence with the presentation operation.
The Vulkan spec states: Each binary semaphore element of the pSignalSemaphores member of any element of pSubmits must be unsignaled when the semaphore signal operation it defines is executed on the device (https://vulkan.lunarg.com/doc/view/1.4.313.0/linux/antora/spec/latest/chapters/cmdbuffers.html#VUID-vkQueueSubmit-pSignalSemaphores-00067)

  1. It refers to a semaphore that may be "in use". A semaphore can be waited on or be signaled, but "in use" is undefined terminology as far as I know. Further down, however, it seems to mean that we are trying to signal an already signaled semaphore. The problem is that we can't really tell which of the different semaphores the program uses is being referred to.
  2. It refers to a swapchain image that has not been re-acquired and the semaphore related to this swapchain image. This seems to mean the offending semaphore is an imageAvailableSemaphores[currentImage], not a renderFinishedSemaphores[currentImage], and raises the question of when this image is re-acquired and the associated semaphore is reset.

I could not find out when a semaphore is reset, but since multiple processes can wait on the same semaphore, it will not be reset directly once one thread passes it, as that could mess up other threads waiting on that semaphore. The Vulkan spec states: "The internal data of a semaphore may include a reference to any resources and pending work associated with signal or unsignal operations performed on that semaphore object, collectively referred to as the semaphore’s payload." From that I assume the payload of this semaphore is the associated swapchain image, and the semaphore is reset only when the swapchain image is released.

Now, if I understand this Stack Overflow answer correctly: "vkQueuePresentKHR sends the image to the compositor (SurfaceFlinger), which displays it. The compositor re-displays this image on every display refresh until it gets a new image to display for that window. Until it gets that next image, it doesn't know whether or how many times it will need to read the buffer again in the future, and can't create a semaphore that will signal when the last read completes."

This seems to mean the swapchain image will not be released until a new one has been acquired by vkQueuePresent, and so the related imageAvailableSemaphores[currentImage] will not be reset either.

Now as @svetli97 stated:

The error happens because this solution cycles the 3 semaphores but the swapchain makes no guarantee that the acquired image indices cycle the same way.

So if the swapchain order changes, we now try to signal the semaphore whose payload is the currently presented image, which is not yet reset because vkQueuePresent has not yet acquired a new image. We are thus signalling an already signaled semaphore.

Now the suggestion of a) Use a separate semaphore per swapchain image. Index these semaphores using the index of the acquired image. makes a bit more sense. We need to fix a semaphore to each swapchain image. So when we acquire the next image we can also get a uniquely associated semaphore.

I have not yet figured out how to implement this. Probably there are examples in the Vulkan docs.

ODON1 avatar Jun 27 '25 09:06 ODON1

OK, after some more testing, here's how I fixed it.

It indeed seems to be the case that vkQueuePresent does not reset the semaphore used to signal the availability of an image until it has acquired a new image. However, contrary to my previous comment, it is the renderFinishedSemaphores[currentFrame] that is not being reset, not the imageAvailableSemaphores[currentFrame] as I wrote.

To fix it, I changed the names of the semaphores, as they didn't fit anymore: imageAvailableSemaphores -> imageReadyForWrite and renderFinishedSemaphores -> imageReadyForPresent

Now, instead of creating MAX_FRAMES_IN_FLIGHT renderFinishedSemaphores, create swapChainImages.size() imageReadyForPresent semaphores.

(the name change for imageAvailableSemaphores is just cosmetic.)

  std::vector<VkSemaphore> imageReadyForWrite;
  std::vector<VkSemaphore> imageReadyForPresent;
  std::vector<VkSemaphore> computeFinishedSemaphores;
void createSyncObjects() {
    ...
    imageReadyForPresent.resize(swapChainImages.size());
    for (size_t i = 0; i < swapChainImages.size(); i++) {
         if (vkCreateSemaphore(device, &semaphoreInfo, nullptr,
                               &imageReadyForPresent[i]) != VK_SUCCESS) {
           throw std::runtime_error(
               "failed to create graphics synchronization objects for a frame!");
         }
    }
    ...
 void cleanup() {
    ...
    for (size_t i = 0; i < swapChainImages.size(); i++) {
      vkDestroySemaphore(device, imageReadyForPresent[i], nullptr);
    }
    ...

Finally, in drawFrame, set the last submitInfo.pSignalSemaphores to &imageReadyForPresent[imageIndex] and presentInfo.pWaitSemaphores to &imageReadyForPresent[imageIndex].

(note that the code below was from the compute example in the tutorial.)

 void drawFrame() {
    ....
    uint32_t imageIndex;
    VkResult result = vkAcquireNextImageKHR(device, swapChain, UINT64_MAX,
                                            imageReadyForWrite[currentFrame],
                                            VK_NULL_HANDLE, &imageIndex);

    if (result == VK_ERROR_OUT_OF_DATE_KHR) {
      recreateSwapChain();
      return;
    } else if (result != VK_SUCCESS && result != VK_SUBOPTIMAL_KHR) {
      throw std::runtime_error("failed to acquire swap chain image!");
    }

    updateDrawUniformBuffer(currentFrame);

    vkResetCommandBuffer(commandBuffers[currentFrame], 0);
    recordCommandBuffer(commandBuffers[currentFrame], imageIndex);

    VkSemaphore waitSemaphores[] = {computeFinishedSemaphores[currentFrame],
                                    imageReadyForWrite[currentFrame]};
    submitInfo = {};
    submitInfo.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO;
    VkPipelineStageFlags waitStages[] = {
        VK_PIPELINE_STAGE_VERTEX_INPUT_BIT,
        VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT};
    submitInfo.waitSemaphoreCount = 2;
    submitInfo.pWaitSemaphores = waitSemaphores;
    submitInfo.pWaitDstStageMask = waitStages;
    submitInfo.commandBufferCount = 1;
    submitInfo.pCommandBuffers = &commandBuffers[currentFrame];
    submitInfo.signalSemaphoreCount = 1;
    submitInfo.pSignalSemaphores = &imageReadyForPresent[imageIndex];  // <-- changed: indexed by imageIndex

    if (vkQueueSubmit(graphicsQueue, 1, &submitInfo,
                      computeInFlightFences[currentFrame]) != VK_SUCCESS) {
      throw std::runtime_error("failed to submit draw command buffer!");
    }

    VkPresentInfoKHR presentInfo{};
    presentInfo.sType = VK_STRUCTURE_TYPE_PRESENT_INFO_KHR;

    presentInfo.waitSemaphoreCount = 1;
    presentInfo.pWaitSemaphores = &imageReadyForPresent[imageIndex];  // <-- changed: indexed by imageIndex

    VkSwapchainKHR swapChains[] = {swapChain};
    presentInfo.swapchainCount = 1;
    presentInfo.pSwapchains = swapChains;

    presentInfo.pImageIndices = &imageIndex;

    result = vkQueuePresentKHR(presentQueue, &presentInfo);

    if (result == VK_ERROR_OUT_OF_DATE_KHR || result == VK_SUBOPTIMAL_KHR ||
        framebufferResized) {
      framebufferResized = false;
      recreateSwapChain();
    } else if (result != VK_SUCCESS) {
      throw std::runtime_error("failed to present swap chain image!");
    }

    currentFrame = (currentFrame + 1) % MAX_FRAMES_IN_FLIGHT;
  }

ODON1 avatar Jun 27 '25 18:06 ODON1

@ODON1 a few notes about your comments:

First, my conclusion from this discussion and the tests I did is that the tutorial code does obey the spec. It is just that they added a warning for cases where valid use cannot be guaranteed by the validation layers (which cannot interpret your whole code). The word "may" in the warning is important.

So there are two things you can do:

  • ignore the warning if you are sure the use is fine
  • rewrite the logic in a way that would convince the validation layers that the use is ok

I picked the second option because I don't want to miss an important warning in a sea of false positives. And I want my code to be easier to reason about.

This leads to the following problem: you can easily assign one imageReadyForPresent semaphore per image because at that point you know the image index, but you can't do that for the imageReadyForWrite semaphore, because you get the image index from the function call that needs to take this specific semaphore. You can't know in advance which one to pass. Instead, you can make a 4th one that is guaranteed to be unused, because it is the last one you received back from acquiring an image.

So to answer your question at the end of your first reply:

Now the suggestion of a) Use a separate semaphore per swapchain image. Index these semaphores using the index of the acquired image. makes a bit more sense. We need to fix a semaphore to each swapchain image. So when we acquire the next image we can also get a uniquely associated semaphore.

I have not yet figured out how to implement this. Probably there are examples in the Vulkan docs.

The reason you can't figure out how to implement this with 3 imageReadyForWrite semaphores, one for each image, is because it's not possible. But it is easily done with a 4th semaphore. See my example code for reference. It is based on @StarStarJ 's idea for swapping around the semaphores so you always have one guaranteed unused.

And as I already said, we are essentially trying to be clear with our intent to the validation layers. The original code in the tutorial is fine. But it is good practice to write synchronisation logic as straightforward as possible. So you won't break your whole code with a small change that would cascade down the semaphore spaghetti.

svetli97 avatar Jun 27 '25 23:06 svetli97

@svetli97 I don't see why imageReadyForWrite needs to know the imageIndex.

vkAcquireNextImageKHR will just give you a usable image and you don't need a semaphore fixed to that image to signal its availability. As soon as the render has finished that semaphore becomes reset and you can use it again. So MAX_FRAMES_IN_FLIGHT nr of semaphores is fine here and you can cycle them independently of the way the engine cycles the swapchain images.

The problem is that vkQueuePresent does not reset the semaphore at the end of the loop. It holds it in use until it has acquired a new image. So you need a dedicated semaphore per image for vkQueuePresent only. So you need swapChainImages.size() nr of imageReadyForPresent semaphores and at this point you know the index of the acquired image and you can assign them uniquely.

ODON1 avatar Jun 28 '25 08:06 ODON1

The exact issue is in the vulkan docs:

https://docs.vulkan.org/guide/latest/swapchain_semaphore_reuse.html

ODON1 avatar Jun 30 '25 14:06 ODON1

I don't see why imageReadyForWrite needs to know the imageIndex.

It doesn't.

vkAcquireNextImageKHR will just give you a usable image and you don't need a semaphore fixed to that image to signal its availability. As soon as the render has finished that semaphore becomes reset and you can use it again. So MAX_FRAMES_IN_FLIGHT nr of semaphores is fine here and you can cycle them independently of the way the engine cycles the swapchain images.

This is exactly what I said. And if you do this, the validation layers will give you a warning. So I also said how you can avoid the warning (and also make your code more future-proof).

The problem is that vkQueuePresent does not reset the semaphore at the end of the loop. It holds it in use until it has acquired a new image. So you need a dedicated semaphore per image for vkQueuePresent only. So you need swapChainImages.size() nr of imageReadyForPresent semaphores and at this point you know the index of the acquired image and you can assign them uniquely.

Exactly what I said. This is the part that everyone does correctly. The questionable implementations usually concern the acquire semaphore, because it is more difficult to convince the validation layers that its use is valid.

The exact issue is in the vulkan docs:

https://docs.vulkan.org/guide/latest/swapchain_semaphore_reuse.html

Thanks for the link. So they came up with the same conclusion about how to guarantee valid use. But they didn't write it in a way that convinces the validation layers. The warnings annoy me. That's why I chose to do it in a way that solves the warnings problem as well.

svetli97 avatar Jun 30 '25 19:06 svetli97

Not sure if we're still talking about the same issue here. For me the warnings are all gone when implementing this -> (dedicated semaphore per swapchainimage for signalling to vkQueuePresent)

(EDIT): Can't see in your code above, but could you be creating MAX_FRAMES_IN_FLIGHT + 1 nr semaphores instead of swapChainImages.size() nr of semaphores with a VK_PRESENT_MODE_MAILBOX_KHR? I.e. the requested instead of the obtained nr of swapchain size?

For exactly the same reasons mentioned why vkQueuePresent can't release the current swapchainimage the minimum size of a swapchain with VK_PRESENT_MODE_MAILBOX_KHR will be 4 images. Not 3 as the tutorial is requesting.

(as soon as vkQueuePresent obtains the next image it will hold on to that as it is under the obligation to read it at next VSync. Therefore it will not allow updating this image with a newer one should it arrive. There thus will be two extra slots besides the presented image that will alternatingly take the role of 'newest image' while the other one can be swapped with an even newer one. Therefore there will be a minimum of 4 images in this swapchain)
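The sizing rule discussed here can be written down as a tiny helper. This is only a sketch: `SyncObjectCounts` and `syncObjectCounts` are hypothetical names, and `obtainedImageCount` is meant to be the number of images the created swapchain actually reports (swapChainImages.size() in the tutorial), which may be larger than the count that was requested.

```cpp
#include <cassert>
#include <cstdint>

// How many synchronization objects of each kind to create.
struct SyncObjectCounts {
    std::uint32_t acquireSemaphores;  // cycled with currentFrame
    std::uint32_t inFlightFences;     // cycled with currentFrame
    std::uint32_t presentSemaphores;  // indexed with imageIndex
};

// Per-frame objects follow the frames-in-flight count; the semaphores that
// vkQueuePresentKHR waits on follow the obtained swapchain image count.
SyncObjectCounts syncObjectCounts(std::uint32_t maxFramesInFlight,
                                  std::uint32_t obtainedImageCount) {
    assert(maxFramesInFlight > 0 && obtainedImageCount > 0);
    return SyncObjectCounts{maxFramesInFlight, maxFramesInFlight,
                            obtainedImageCount};
}
```

For example, with 2 frames in flight and a swapchain that ends up with 5 images, this gives 2 acquire semaphores, 2 fences, and 5 present semaphores.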

ODON1 avatar Jul 01 '25 07:07 ODON1

@ODON1 I had the exact same problem: on Windows the example ran smoothly without validation errors with 3 swapchain images (2 driver + 1 buffer), but when I ran it on Ubuntu, 5 swapchain images were created (4 driver + 1 buffer), still with 2 FRAMES_IN_FLIGHT for the semaphores.

This didn't cause any issues on Windows because the count was "just right", but on Linux these validation errors were thrown.

I tried sizing both semaphore arrays to the image count, but doing so only for the one used as "imageReadyForPresent" was absolutely sufficient.

Thanks for your help!

Should we recreate these semaphores on swapchain recreation?

1rrsinn avatar Sep 21 '25 17:09 1rrsinn

Indeed, as you found out, the number of swapchain images is totally distinct and separate from the number of frames in flight. You can make any combination of these you want (although I doubt using more frames in flight makes sense). But this means you need separate sets of semaphores to manage both. (It might be possible to use a single set of semaphores to manage both, but this would be more complex, more error-prone, less maintainable, and less understandable, without real benefit.)

The semaphores are 'free objects' and, if available (not already in use), can be used to signal anything you want. So they are not tied to a particular frame or swapchain. As such, they do not need to be recreated upon swapchain recreation.

ODON1 avatar Sep 23 '25 11:09 ODON1