Indefinite block or hidden windows when using vsync on Wayland

Open elinorbgr opened this issue 5 years ago • 5 comments

Context:

VSync handling on Wayland works as follows: Mesa sends wl_surface.frame requests to the Wayland server, asking it to fire a frame event in response when an appropriate time for drawing has come.

In theory, the Wayland server sends such events to clients that requested them in synchronization with VSync, so all goes well.

Problem:

Some Wayland servers such as Sway apply an optimization: they do not send frame events to clients whose windows are not visible at all (because they are minimized, on another workspace, or completely hidden by another window), on the rationale that there is no point drawing content that will never be displayed.

The issue appears when an app relies on EGL's VSync handling: if the window is hidden, eglSwapBuffers blocks until a frame event is received, that is, until the window becomes visible again. As a consequence, the app freezes completely as soon as it is no longer visible. This causes https://github.com/jwilm/alacritty/issues/2851, for example.

Theoretically proper solution:

The Good Way™ to handle this is for apps not to use Mesa's vsync on Wayland, but rather to request and track frame events themselves. This way they only draw at the appropriate times, but still continue processing events even if they are hidden. This is what GTK and Qt do, for example.

A way to translate this into winit/glutin would be to add a method on winit's Window to request a frame event, and to translate these events into RedrawRequested events. This means that winit-based apps would always have to disable vsync and use this special API on Wayland.

This would be the most flexible solution, but it would impose a special code path on winit users for Wayland, which goes completely against the idea of abstracting away the platform.

Pragmatic proposal:

This is why I suggest we instead integrate into glutin what I'd call "manual vsync with timeout". When VSync is requested on Wayland, glutin would not activate it in the underlying EGL, but rather manage it manually (with the frame requests/events), with the special addition that it'd block with a timeout rather than indefinitely when waiting for a frame event, for example a timeout of 1 second.

This would be a relatively small and simple code addition to glutin, and would have the consequence that Wayland apps that are not visible would see their vsync drop to 1fps, rather than completely freeze.
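To make the "block with timeout" idea concrete, here is a minimal sketch using only the Rust standard library. It is not glutin code: the `FrameWait` enum and `wait_for_frame` function are hypothetical names, and a plain `mpsc` channel stands in for the delivery of Wayland frame events. The only point is to show the difference between waiting indefinitely and waiting with an upper bound.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Outcome of waiting for the compositor before presenting a frame.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq, Eq)]
pub enum FrameWait {
    /// A frame event arrived: the surface is visible and now is a good time to draw.
    FrameEvent,
    /// No frame event arrived within the timeout: the surface is probably hidden.
    TimedOut,
    /// The event source went away (e.g. the connection was closed).
    Disconnected,
}

/// Wait for the next frame event, but never longer than `timeout`.
/// ASSUMPTION: `frame_events` is a stand-in for whatever mechanism would
/// deliver frame events from the Wayland connection.
pub fn wait_for_frame(frame_events: &Receiver<()>, timeout: Duration) -> FrameWait {
    match frame_events.recv_timeout(timeout) {
        Ok(()) => FrameWait::FrameEvent,
        Err(RecvTimeoutError::Timeout) => FrameWait::TimedOut,
        Err(RecvTimeoutError::Disconnected) => FrameWait::Disconnected,
    }
}
```

With a timeout of `Duration::from_secs(1)`, a caller that swaps buffers after every return from `wait_for_frame` would present at the compositor's pace while visible and about once per second while hidden.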

How does that sound?

elinorbgr, Oct 18 '19 20:10

I'm not precisely sure what "manag[ing] it manually (with the frame request/events)" would entail, being unfamiliar with how Wayland handles most stuff, but if it brings the behavior in line with that of the other platforms, as you say, while not splintering the API with Wayland-specific code paths, then it sounds good to me :)

goddessfreya, Oct 18 '19 22:10

Ok, nice!

I won't have the bandwidth to implement this right now, but I'd be happy to mentor anyone interested in working on it. This is a rather narrow task with a rather limited scope.

elinorbgr, Oct 19 '19 17:10

I would certainly prefer ways that allow Wayland clients to behave best, which means not running unnecessary timers and drawing in the background.

I think there are two ways to do that:

  1. We could consider allowing everyone to subscribe to RedrawRequested on vsync or vsync-like, so that swap_interval(0) would work well for anyone. This is similar to what is done in applications like Firefox. This allows all platforms to support both modes natively.

  2. Simply allowing people to implement The Wayland Way™ separately in a simple way rather than attempting to hide it. Think enable_frame_callbacks(true); swap_interval(0);, or even just always pump RedrawRequested, making it only a matter of setting the swap interval.

An important thing to note is that the current behavior is not incorrect, and that the "wayland way" is not wayland-specific, but useful on all platforms where you don't want to block a thread (it's just that the block can be longer on wayland).

Number 1 is in reality a superset of number 2, by giving everyone the option to do non-blocking swaps (for the benefit of everyone), while number 2 implements what needs to be exposed on wayland only.

Both allow simple applications with simple needs to operate in simple ways as today, but allow more complex applications with more complex needs to operate more effectively. It is not weird that complex applications use platform-dependent features, such as swap_buffers_with_damage (which is only available with EGL, not WGL or GLX). There will always be differences, and my experience tells me that trying to simulate a common behavior (i.e. shoehorning) always ends up backfiring in the end.

kennylevinsen, Oct 24 '19 18:10

> Simply allowing people to implement The Wayland Way™ separately in a simple way rather than attempting to hide it.

This is actually already possible: they just need to not activate vsync and use wayland-client to manage the frame callbacks and decide when to swap buffers. And this is actually what is recommended for Wayland clients in general: never activate Mesa's vsync. The main reason I suggested this is to provide a sane-ish default for simple apps that don't care about this, by preventing swap_buffers from blocking indefinitely.

> An important thing to note is that the current behavior is not incorrect [...]

Actually, on Wayland it is, as a client is supposed to continue processing events even if it does not receive any frame events for a long time. Not doing so may cause its internal event buffer to fill up, at which point the server will fail to send new events to it, and just kill the connection.

The thing with hooking vsync into RedrawRequested is that while it would seem to be the obvious thing to do, it poses several questions:

  • first, it'd need tight integration with winit, but that's just work to do, not a blocker
  • second, and most importantly, this is not the meaning of RedrawRequested. This event signifies that the content needs to be redrawn, but conveys nothing about when it should be drawn, as winit is mostly built on the assumption that vsync is managed automatically by swap_buffers blocking for the necessary amount of time.

So a solution like that would rather require some design decisions on the winit side (possibly introducing a new VSync event), and raises the question of how it'd integrate with different platforms and Vulkan (and certainly other cases). I agree that this would be the best thing to do, but I don't have the time nor the energy to spearhead a discussion about that.
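To illustrate the distinction between "the content needs redrawing" and "now is a good time to draw", here is a small sketch. It does not use winit's real types: the `Event` enum is a stand-in, `FrameReady` is a hypothetical vsync-like event that does not exist in winit, and `Scheduler` is an invented helper.

```rust
/// Stand-in event type, NOT winit's `Event`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Event {
    /// The content is stale and must be redrawn at some point.
    RedrawRequested,
    /// HYPOTHETICAL vsync-like event: now is an appropriate time to draw.
    FrameReady,
}

/// Tracks whether a redraw is pending and decides when to actually draw.
#[derive(Default)]
pub struct Scheduler {
    dirty: bool,
}

impl Scheduler {
    /// Returns `true` when the application should draw and swap buffers now.
    pub fn handle(&mut self, event: Event) -> bool {
        match event {
            // Remember that a redraw is needed, but do not draw yet.
            Event::RedrawRequested => {
                self.dirty = true;
                false
            }
            // Draw only if something changed since the last frame,
            // and clear the pending flag in the same step.
            Event::FrameReady => std::mem::take(&mut self.dirty),
        }
    }
}
```

In this sketch, several `RedrawRequested` events between two `FrameReady` events collapse into a single draw, and a hidden window that never receives `FrameReady` never draws but also never blocks.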

elinorbgr, Oct 24 '19 20:10

> Actually, on Wayland it is, as a client is supposed to continue processing events even if it does not receive any frame events for a long time.

Ah, yes, there's a fine detail: it is not incorrect to block on swap_buffers, and Mesa will service the connection and dispatch its own queue, but one must ensure oneself that other queues are dispatched. This means that additional threads are required.

I forgot this particular case, as the discussion about a blocked swap_buffers is rarely about correctness, and usually just focuses on surprised developers who expected swap_buffers to always return in a timely manner.
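As a rough illustration of the "additional threads" point, the sketch below keeps input processing responsive while a render thread is blocked. It uses only the standard library, with channels standing in for Wayland event queues. `render_loop` and `drain_input` are hypothetical names, and the blocking `recv()` merely imitates a vsync'd swap that waits for a frame event.

```rust
use std::sync::mpsc::Receiver;

/// Simulated render thread body: each iteration blocks until a frame event
/// arrives (imitating a blocking swap), then counts one presented frame.
/// Returns once the frame event source is closed.
pub fn render_loop(frame_events: Receiver<()>) -> usize {
    let mut presented = 0;
    // `recv()` blocks indefinitely, like a vsync'd swap on a hidden surface.
    while frame_events.recv().is_ok() {
        presented += 1;
    }
    presented
}

/// Drain every input event queued so far without waiting on the renderer.
/// ASSUMPTION: `u32` is a placeholder for a real input event type.
pub fn drain_input(input_events: &Receiver<u32>) -> Vec<u32> {
    input_events.try_iter().collect()
}
```

If `render_loop` runs on its own thread (for example via `std::thread::spawn`), the thread calling `drain_input` keeps making progress even when no frame event ever arrives.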

> second, and most importantly, this is not the meaning of RedrawRequested.

Agreed, I'd probably suggest a separate event to encompass frame-callback or vsync-like things.

> I agree that this would be the best thing to do, but I don't have the time nor the energy to spearhead a discussion about that.

Completely understandable. I thought it would be interesting to air the thoughts nonetheless.

kennylevinsen, Oct 24 '19 20:10

This is out of scope for glutin. However, the vsync source is discussed in winit in #2412.

kchibisov, Sep 03 '22 06:09