vulkano
Advice for multi-window (and multi-swapchain) setups
This is not a bug! I'm just hoping to get some advice :bowing_man:
In order to create a multi-window application, one must create multiple swapchains. My current understanding is that in order to draw to each of these swapchains, I must call swapchain::acquire_next_image for each one; however, each of these calls may block if an image is not immediately available.
Is it the case that image availability for each of these swapchains may occur at differing rates if those swapchains are running on different physical devices or are presenting to different displays?
If so, would you opt for spawning a new thread for each window's swapchain? Or perhaps use futures, tokio and use Future::select to let the tokio runtime handle each one as they become ready?
If anyone has worked with vulkano and multiple windows I'd love to hear their experience or perhaps get a link to some working code! Once I get my head around the best approach I wouldn't mind making an example to add to vulkano-examples for this to clarify things for future users.
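To illustrate the per-thread option I'm considering, here's a minimal plain-Rust structural sketch. `draw_frame` is a hypothetical placeholder for the real acquire/record/present work, not vulkano API:

```rust
use std::thread;

// Hypothetical stand-in for a real per-window render loop body: in a vulkano
// app this would call swapchain::acquire_next_image, record command buffers,
// and present to that window's swapchain.
fn draw_frame(window_id: usize, frame: usize) {
    let _ = (window_id, frame);
}

// Spawn one render thread per window so that a blocking acquire_next_image
// on one swapchain cannot stall rendering for the others.
fn run_windows(windows: usize, frames: usize) -> Vec<usize> {
    let handles: Vec<_> = (0..windows)
        .map(|window_id| {
            thread::spawn(move || {
                for frame in 0..frames {
                    draw_frame(window_id, frame);
                }
                window_id
            })
        })
        .collect();
    // Join in spawn order and collect each thread's window id.
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    println!("{:?}", run_windows(2, 3)); // → [0, 1]
}
```

The point of the sketch is only the threading structure: each window owns its swapchain and loop, so one slow display can't back-pressure the others.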
An example would be great! I have never worked with multiple swapchains or multiple windows.
Here's the spec for the underlying Vulkan call: https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/vkspec.html#vkAcquireNextImageKHR I don't see anything about differing rates across different devices or displays.
I know from experience that linux + nvidia gpu doesn't block on this call at all, but windows + nvidia gpu does. So differing behaviour across different devices sounds very likely.
As for displays, if the windows + nvidia gpu blocking is waiting for the frame to be displayed, then monitors with different refresh rates would block for different lengths.
Thanks for this, I'm interested in a cross-platform solution so it's nice to confirm that it's at least possible to happen in some cases!
Very interesting that the call does not block on linux+nvidia but does on windows+nvidia... Is it possible that this is because a different presentation mode is being used for the swapchain on each platform? For example, perhaps it is not blocking on linux+nvidia because the MAILBOX presentation mode is being used with triple-buffering, while on windows the FIFO presentation mode is being used with a double-buffer? Just a thought, I'm still very new to Vulkan so I could be way off the mark!
I'm going to dive into this today and begin experimenting - will report back on my progress.
So a bit of context here, although not very relevant to your problem: this difference between drivers is the cause of a GPU OoM in the examples when running on linux + nvidia, because the examples rely on the timing used by other drivers, e.g. windows + nvidia. https://github.com/vulkano-rs/vulkano/issues/923 I know the problem now; I just need to sit down and figure out the best way to fix it.
It was initially thought that the linux + nvidia driver was just misbehaving and vkAcquireNextImageKHR should be blocking, but the Vulkan spec actually specifically says that it can return whenever it wants, regardless of the presentation mode in use.
From the spec for `vkAcquireNextImageKHR`:
> Applications should not rely on vkAcquireNextImageKHR blocking in order to meter their rendering speed. The implementation may return from this function immediately regardless of how many presentation requests are queued, and regardless of when queued presentation requests will complete relative to the call. Instead, applications can use fence to meter their frame generation work to match the presentation rate.
Edit: Hmm.... thinking about this again has made me uncertain; maybe we're already checking for the fence. I don't have time to look into it atm though ;-;
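For reference, the metering the spec describes can be sketched in plain Rust, with a channel standing in for a Vulkan fence and a spawned thread standing in for the GPU (all names here are hypothetical, not vulkano API):

```rust
use std::sync::mpsc;
use std::thread;

// Illustrates the spec's advice: meter frame generation on a fence rather
// than relying on acquire_next_image to block. The channel plays the role
// of the fence; the spawned thread plays the role of the GPU.
fn run_metered(frames: usize) -> usize {
    let (fence_tx, fence_rx) = mpsc::channel();
    fence_tx.send(()).unwrap(); // start with an already-signalled "fence"

    let mut completed = 0;
    for _ in 0..frames {
        // Block here until the previous frame's "fence" is signalled,
        // instead of hoping acquire_next_image blocks for us.
        fence_rx.recv().unwrap();
        // Submit this frame's work; the "GPU" signals the fence when done.
        let tx = fence_tx.clone();
        thread::spawn(move || {
            tx.send(()).unwrap();
        });
        completed += 1;
    }
    completed
}

fn main() {
    println!("{}", run_metered(3)); // → 3
}
```

In a real app the recv would be a fence wait on the frame submitted N frames ago, so the CPU never runs more than N frames ahead of the GPU.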
Ahh thanks for the extra context!
> Instead, applications can use fence to meter their frame generation work to match the presentation rate.
Thank you for highlighting this - this refers to the use of the previous_frame_end future within the triangle.rs example right? Specifically this section?
@rukai just for the record, I was testing timings of certain parts of the triangle.rs example and I noticed that on my linux + intel (integrated gpu) laptop swapchain::acquire_next_image does actually block. I thought this was interesting:
| setup | does acquire_next_image block? |
|---|---|
| linux+nvidia | no |
| linux+intel | yes |
| windows+nvidia | yes |
Just to clarify, this was with PresentMode::Fifo. PresentMode::Mailbox always returned immediately (using 2 or 3 buffers didn't seem to affect this) which makes sense to me as the idea for Mailbox is to just display the most recently presented image.
> Applications should not rely on vkAcquireNextImageKHR blocking in order to meter their rendering speed.
Do you have any advice on how to do this with vulkano? Is the assignment of previous_frame_end here (which results in dropping the current previous_frame_end GpuFuture) the part that blocks on a fence, in turn metering frame generation? It's tricky to test this on my machine by throwing in some timestamps, as acquire_next_image is blocking here. Perhaps it would be useful to know whether the triangle.rs example syncs to your refresh rate on your linux+nvidia setup, or whether it just loops as fast as possible? If it does sync, a useful test could be to time how long the previous_frame_end assignment (and in turn, the drop call) takes, given that we know the frame delay is not in the call to acquire_next_image on that setup.
Sorry for so many questions! Just trying to clarify my understanding of how (or whether) vulkano does this "fence"ing under the hood.
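To make sure I'm describing the same pattern, here's a plain-Rust toy version of what I *think* is happening with previous_frame_end. `FrameFuture` and `submit_frame` are hypothetical stand-ins, not vulkano types:

```rust
use std::thread::{self, JoinHandle};

// Toy analogue of the previous_frame_end pattern: dropping the stored
// "future" waits for the previous frame's "GPU" work (here, a thread)
// before the loop moves on -- which would be where the metering happens.
struct FrameFuture(Option<JoinHandle<()>>);

impl Drop for FrameFuture {
    fn drop(&mut self) {
        if let Some(h) = self.0.take() {
            h.join().unwrap(); // block until that frame's work completes
        }
    }
}

// Hypothetical submit: kicks off the "GPU" work and hands back its future.
fn submit_frame(n: usize) -> FrameFuture {
    FrameFuture(Some(thread::spawn(move || {
        let _ = n;
    })))
}

fn render_loop(frames: usize) -> usize {
    let mut previous_frame_end = submit_frame(0);
    for n in 1..frames {
        // Reassigning drops the old FrameFuture, waiting on frame n-1.
        previous_frame_end = submit_frame(n);
    }
    drop(previous_frame_end);
    frames
}

fn main() {
    println!("{}", render_loop(4)); // → 4
}
```

If vulkano's GpuFuture drop behaves anything like this, timing the assignment should show where the per-frame wait really lives.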
Ahh just noticed your edit in your last comment! No worries if you are unsure, I might have to do some digging through the source, unless maybe @tomaka can briefly clarify?
I'm running two windows in separate threads. Very few resources are shared across threads, and this doesn't seem to be a problem. I detect no dependence on timing between the two windows / threads. Granted, I've only tested on Linux with Nvidia's ICD. I have an issue with winit and input, but it seems surmountable with a correct use of the winit event loop.
Vulkano now has examples with multiple windows, so I think this can be closed.