
Updated Timeline Semaphore sample

Open bryce-young-mobica opened this issue 1 year ago • 5 comments

Description

Reworked the timeline semaphore sample to prevent it crashing on Windows.

The only obvious trigger for the crash that I could observe was the main thread calling "vkDeviceWaitIdle" whilst the compute thread was in "wait_timeline_gpu". To avoid this, I removed the "wait/signal_timeline_gpu" calls (opting to attach the "VkTimelineSemaphoreSubmitInfo" to the queue submissions instead), and restructured the compute/graphics work stages to prevent the compute thread running ahead and submitting (potentially blocking) work.

  • Replaced the "wait/signal_timeline_gpu" calls with "VkTimelineSemaphoreSubmitInfo" added to the queue submissions
  • Added an enum for identifying the timeline stages, to make it easier to read and debug the signals/waits
  • Moved the rendering work to a separate thread, leaving the main thread responsible for acquiring and presenting the swapchain image, and signalling the timeline semaphore.
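
As a rough illustration of the stage enum idea, the sketch below shows one way per-frame stages could map onto strictly increasing timeline semaphore values. The stage names, `kStageCount`, and `timeline_value` are hypothetical and not taken from the sample; the numbering scheme is an assumption made only for this example.

```cpp
#include <cstdint>

// Hypothetical stages of one frame, in the order they are expected to complete.
enum class Stage : std::uint64_t
{
    compute = 0,
    draw    = 1,
    present = 2
};

// Number of stages per frame (assumption for this sketch).
constexpr std::uint64_t kStageCount = 3;

// Maps (frame, stage) to a timeline value. The +1 keeps every value above an
// initial semaphore value of 0, since signalled values must strictly increase.
constexpr std::uint64_t timeline_value(std::uint64_t frame, Stage stage)
{
    return frame * kStageCount + static_cast<std::uint64_t>(stage) + 1;
}
```

With this numbering, frame 0 uses values 1 to 3 and frame 1 starts at 4, so each wait or signal value in a log can be read back as a frame and a named stage.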

Fixes #588

Tested on Windows, using both separate and shared queues for compute and graphics.

General Checklist:

Please ensure the following points are checked:

  • [X] My code follows the coding style
  • [X] I have reviewed file licenses
  • [X] I have commented any added functions (in line with Doxygen)
  • [X] I have commented any code that could be hard to understand
  • [X] My changes do not add any new compiler warnings
  • [X] My changes do not add any new validation layer errors or warnings
  • [X] I have used existing framework/helper functions where possible
  • [X] My changes do not add any regressions
  • [X] I have tested every sample to ensure everything runs correctly
  • [X] This PR describes the scope and expected impact of the changes I am making

Note: The Samples CI runs a number of checks including:

  • [X] I have updated the header Copyright to reflect the current year (CI build will fail if Copyright is out of date)
  • [X] My changes build on Windows, Linux, macOS and Android. Otherwise I have documented any exceptions

Sample Checklist

If your PR contains a new or modified sample, these further checks must be carried out in addition to the General Checklist:

  • [X] I have tested the sample on at least one compliant Vulkan implementation
  • [X] If the sample is vendor-specific, I have tagged it appropriately
  • [X] I have stated on what implementation the sample has been tested so that others can test on different implementations and platforms
  • [X] Any dependent assets have been merged and published in downstream modules
  • [X] For new samples, I have added a paragraph with a summary to the appropriate chapter in the readme of the folder that the sample belongs to e.g. api samples readme
  • [X] For new samples, I have added a tutorial README.md file to guide users through what they need to know to implement code using this feature. For example, see conditional_rendering
  • [X] For new samples, I have added a link to the Antora navigation so that the sample will be listed at the Vulkan documentation site

bryce-young-mobica avatar Aug 06 '24 10:08 bryce-young-mobica

But I get an error about every second time I start this demo: Validation Error: [ UNASSIGNED-CoreValidation-DrawState-InvalidImageLayout ] Object 0: handle = 0x243ed7ef060, type = VK_OBJECT_TYPE_COMMAND_BUFFER; Object 1: handle = 0x944a2c0000000039, type = VK_OBJECT_TYPE_IMAGE; | MessageID = 0x4dae5635 | vkQueueSubmit(): pSubmits[0].pCommandBuffers[0] command buffer VkCommandBuffer 0x243ed7ef060[] expects VkImage 0x944a2c0000000039[] (subresource: aspectMask 0x1 array layer 0, mip level 0) to be in layout VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL--instead, current layout is VK_IMAGE_LAYOUT_UNDEFINED.

I saw that error during development (iirc it was related to the compute work starting before the setup had finished), and I resolved it by moving the initial "game of life" setup out of the compute thread.

How are you running the tests? (I've been unable to reproduce it myself using the VS debugger or Vulkan Configurator)

bryce-young-mobica avatar Aug 16 '24 12:08 bryce-young-mobica

I can reproduce that message, but only randomly. I ran the sample many times (latest commit) and only saw that validation error twice. It's triggered from here:

image

Since it's so random (sometimes takes more than 10 runs to show up) it's pretty hard to debug.

P.S. : I'm running a debug build on Win11 with VS 2022 and using the latest SDK.

SaschaWillems avatar Aug 18 '24 18:08 SaschaWillems

I can repro the validation layer error on about every second run, on Win10, VS2022, NVIDIA RTX A3000 Laptop GPU.

It's always shared.images[0], created on line 122. If I call wait_on_timeline(Timeline::draw); in do_graphics_work() when timeline.frame == 0, the issue seems to disappear. But I have no idea what synchronization is missing or failing to make that a requirement.

asuessenbach avatar Aug 19 '24 08:08 asuessenbach

It's always shared.images[0], created on line 122. If I call wait_on_timeline(Timeline::draw); in do_graphics_work() when timeline.frame == 0, the issue seems to disappear. But I have no idea what synchronization is missing or failing to make that a requirement.

Thanks, I think that is fixing the issue by ensuring the compute commands are submitted first (image[1] is initialised in the "setup_game_of_life" function, but image[0] was only being initialised on that first submission). I've updated the "setup_game_of_life" function to run both VkImages through the compute "init_pipeline".
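
To make the gap concrete, here is a toy CPU-side model of it. This is not Vulkan code and not the sample's implementation: `Layout`, `Images`, `kNumSyncFrames = 2`, and the three functions are all assumptions for illustration, with each image reduced to just its tracked layout.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Simplified stand-in for the layouts named in the validation message.
enum class Layout { Undefined, ShaderReadOnly };

// Assumed image count for this sketch (the discussion mentions image[0] and image[1]).
constexpr std::size_t kNumSyncFrames = 2;
using Images = std::array<Layout, kNumSyncFrames>;

// Models the earlier behaviour: setup initialises only image[1]; image[0]
// would be initialised later, by the first compute submission.
Images setup_before_fix()
{
    Images images{};
    images.fill(Layout::Undefined);
    images[1] = Layout::ShaderReadOnly;
    return images;
}

// Models the updated behaviour: setup runs every image through initialisation.
Images setup_after_fix()
{
    Images images{};
    images.fill(Layout::ShaderReadOnly);
    return images;
}

// Graphics work that samples an image expects it to be in the read-only layout.
bool graphics_can_sample(const Images &images, std::size_t index)
{
    assert(index < images.size());
    return images[index] == Layout::ShaderReadOnly;
}
```

In this model, `graphics_can_sample(setup_before_fix(), 0)` is false, which corresponds to graphics work reaching image[0] before any compute submission has initialised it; after the change both indices pass regardless of which thread submits first.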

@asuessenbach please could you try the latest patch and let me know if it helps?

bryce-young-mobica avatar Aug 19 '24 11:08 bryce-young-mobica

Yep, that seems to fix the issue. Note, though, that you could handle all NumSyncFrames images in one submit, like so: 0001-Get-image-initializations-in-one-command-buffer.patch

asuessenbach avatar Aug 19 '24 12:08 asuessenbach

Merging - 3 approvals

marty-johnson59 avatar Sep 18 '24 15:09 marty-johnson59