egl-wayland
egl-wayland copied to clipboard
egl-swap: provide damage rectangles to wl_surface
Previously, wlEglSendDamageEvent used the entire surface as damage to the compositor but in the wrong coordinate system. wl_surface_damage() requires coordinates for the surface which could be scaled while wl_surface_damage_buffer() expects buffer coordinates which is what surface->width and surface->height represent.
This ensures that the parameters to eglSwapBuffersWithDamage() are passed along to the compositor as well. The coordinate system is flipped between eglSwapBuffersWithDamage() and wl_surface_damage_buffer() which is handled as well.
Signed-off-by: Christian Hergert [email protected]
lgtm, thanks @chergert :)
Okay, fixed a coordinate issue with the rectangle which I was finally able to see now that I added a second commit, which passes damage regions across from the producer to the consumer thread w/ a lock-free ring-buffer (3 frames deep, up to 32 rectangles per frame).
I decided on such a large rectangle count because I'm regularly seeing rectangles from GTK in the 20's with very occasional peeks to 30.
A depth of 3 frames seem to be plenty with experimental testing.
Thank you very much for the patch. Two minor points
-
It's probably not hugely important in practice, but for the lock-free ring buffer, would we be able to use memory_order_acqure / memory_order release for the first and second atomic_thread_fence calls instead of memory_order_seq_cst for both? Admittedly I don't have a lot of experience with the stdatomics API so if I'm wrong feel free to tell me, but in theory that might be slightly faster, right?
-
Indentation is off in WlEglStreamDamageBufferRec
Otherwise I've applied the patch locally and everything looks good. I've also run it through our automated test system and that came back clean. Note that I will need to submit the change to our internal Perforce depot before merging on GitHub (sorry, don't have a great system for handling external contributions at the moment).
would we be able to use memory_order_acqure / memory_order release for the first and second atomic_thread_fence calls instead of memory_order_seq_cst for both?
I too don't have much experience with these APIs other than making our ring buffer work for Sysprof's perf_event integration. That code was just looking for a replacement for __sync_synchronize()
which is definitely the heavy-handed approach to correctness.
After going through and reading the semantics of these, I think you're right and some quick testing shows it working well.
I'll update the patches quick, and thanks for the review!
Note that I will need to submit the change to our internal Perforce depot before merging on GitHub
And no worries on this, I know the drill with regards to corporate integrations. I had to deal with the same sort of issues at VMware years ago (with perforce no less).
Turns out _Atomic
is not what we want actually causes issues (at least with GTK 4) so reverted back to volatile
.
Turns out
_Atomic
is not what we want actually causes issues (at least with GTK 4) so reverted back tovolatile
.
Could you elaborate? volatile generally isn't meant for inter-thread synchronization purposes from what I've read.
Could you elaborate? volatile generally isn't meant for inter-thread synchronization purposes from what I've read.
I don't believe we're actually relying on volatile
here either given how it's using the other barrier enforcing primitives. So it's more likely that we could drop volatile
than needing to add _Atomic
.
I was asking what issues _Atomic causes and how you determined it isn't what we want here.
I was asking what issues _Atomic causes and how you determined it isn't what we want here.
I can't answer that for you, I've never used these features from the language. I've just done the ring buffer the same way it's done for perf_event mmap streams.
It sounds like there isn't sufficient shared understanding to choose the correct solution then. "Because another app does it this way" isn't suitable in lieu of actually being able to describe why the code you're requesting to contribute is correct when questioned.
Honestly, it doesn't matter too much to me anymore because the damage thread is already broken by design. The driver constantly locks up on Wayland and so while adding a working eglSwapBuffersWithDamage()
makes it lock up less for me, I can't fix the completely brokenness underneath it which is not publicly available.
I was asking what issues _Atomic causes and how you determined it isn't what we want here.
From what I've read from https://en.cppreference.com/w/c/language/atomic, it would look like using _Atomic should be fine, and what was causing me to revert my change is just fundamental breakage in the underlying driver for which I can't work around.
Sorry to bump an old MR, but do I understand correctly that libraries such as GTK use eglSwapBuffersWithDamage
get forwarded through egl-wayland to the wlEglSendDamageEvent
being modified here? And that this note:
This ensures that the parameters to eglSwapBuffersWithDamage() are passed along to the compositor as well. The coordinate system is flipped between eglSwapBuffersWithDamage() and wl_surface_damage_buffer() which is handled as well.
... means that this will actually fix this GTK bug indicating that full surfaces are always being damaged instead of just the part that is modified?
If so, this MR remains at least partially relevant even after #115 is merged, since partial damage doesn't seem to be included there, and this would help in optimizing power draw and performance on NVIDIA on Wayland :slightly_smiling_face: .
(Just to be clear: my intent isn't to put pressure on anyone, just to add hopefully helpful information for developers and users following these discussions. In case it helps I'll create a new issue here to track it, but since it's already implemented here it seemed to only add further noise.)
I took a stab at updating the patch set to latest master, changing volatile
to the suggested _Atomic
, and then pulling a new diff from master including all the changes:
I did some testing with PAINT_DAMAGE_REGION
and this indeed solves the problem for me with Nautilus. I'll try to give it some field testing in the coming days to see if there are any regressions.
Without wanting to be too hasty, but in case everything keeps working fine, would you like me to create a new merge/pull request to replace this one? Do you want me to maintain the same commits and commit messages as are present now or is a single commit fine? (It's also fine for me if the original author or a contributor takes the patch and applies it without me as author.)
Do you want me to maintain the same commits and commit messages as are present now or is a single commit fine?
I certainly don't mind if you change/update things. Instead of pushing this through I just changed GPU vendors but I'd be happy to see it fixed regardless.
@Gert-dev Thanks, we are always open to and will consider any pull requests. In this case I think a new PR would be best. As for commits I prefer to break things up into separate commits whenever possible, as it makes reviewing and testing easier.
One thing to note is the damage thread is finally gone when running with explicit sync. For simplicity we may want to only use the passed-in damage regions when there is no damage thread, saving it for when we send our surface update. We can consider that once you've posted your updated PR though.
@amshafer Clear, thanks. I opened up https://github.com/NVIDIA/egl-wayland/pull/138 with just the commit that introduces damage rectangles to get started (which also seems to be sufficient for Nautilus and to fix the GTK issue), not the one that further changes the damage thread.