
Output freezes on niri wm with nvidia GPUs and only updates on focus changes

Open Balint66 opened this issue 1 year ago • 16 comments

Hello there!
A few days ago I started a discussion on the niri wm repository about mirroring one output to another.

It was noted there that wl-mirror should work just fine, because screencopy is implemented. In practice, this only works until the window loses focus.
I'm not sure if this is a bug on the wm's side or on wl-mirror's side, but I'm more than happy to help resolve this problem. :)

Balint66 avatar Feb 24 '25 09:02 Balint66

I added some comments to your original discussion on the niri side.

For completeness, I'm gonna repeat what I said in the niri discussion: I can't reproduce this issue, the wl-mirror window does not freeze even if something else has focus and even if it is partially scrolled off-screen on the target output.

Ferdi265 avatar Feb 24 '25 11:02 Ferdi265

As mentioned in the referenced comment in #64 above, a similar issue also seems to occur sometimes in River.

Ferdi265 avatar Apr 12 '25 18:04 Ferdi265

I don't know if it's helpful, but I am having the same issue, on Arch rather than Nix, and also specifically using niri.

Sylphraena avatar Aug 09 '25 04:08 Sylphraena

I don't know if it's helpful, but I am having the same issue, on Arch rather than Nix, and also specifically using niri.

Interesting. Could you give me some information about your setup so that I can figure out what it might take to reproduce it?

Specifically

  • GPU
  • Package versions (I assume latest Arch pkg?)
  • reproduction steps (or whether it just occurs every time)
  • which wl-mirror backends it occurs with (or whether it occurs with all backends)
  • if possible, logs of wl-mirror with --verbose and WAYLAND_DEBUG=1 set

I'll also try to reproduce it again with the latest wl-mirror and the latest niri from Arch the next time I find time for this; hopefully I can finally get to the bottom of what's happening here.

Ferdi265 avatar Aug 09 '25 07:08 Ferdi265

  • Nvidia GTX 1080
  • It would be the latest Arch package; I installed it just yesterday and ran -Syu before running it.
  • Anytime I run wl-mirror, it only refreshes when I change focus away from it. Sometimes it will keep capturing for about a second afterwards, but that shrinks to about a quarter of a second as the program runs.
  • Not sure exactly what this is asking for. Is it the commands I input when I run it? (Sorry, I'm fairly new to Linux; I only installed Arch yesterday and haven't used Wayland before.) If it's the commands, it happens with anything I've tried so far; I included the command I ran it with this time in the log file I'll attach.
  • I think I set what you were asking for. This is my terminal output from running the program. Note that the sections at the end, which are generated when the mirror screen updates, only appear when I swap focus between two panels.

wl-log.txt

Sylphraena avatar Aug 09 '25 15:08 Sylphraena

Thanks for the detailed info!

  • Nvidia GTX 1080

Nvidia drivers have a bad track record on Wayland, which has only really gotten better recently. I've heard it should be better with 10xx cards (like yours) and later, but I haven't used an Nvidia GPU since the GTX 7xx series.

  • It would be the latest Arch package; I installed it just yesterday and ran -Syu before running it.

Right, thanks.

  • Anytime I run wl-mirror, it only refreshes when I change focus away from it. Sometimes it will keep capturing for about a second afterwards, but that shrinks to about a quarter of a second as the program runs.

Interesting. The logs don't indicate any errors, but there is much less output than usual. Usually you'd get hundreds of lines per second with --verbose, since every frame is logged, but it seems something is hanging between frames.

  • Not sure exactly what this is asking for. Is it the commands I input when I run it? (Sorry, I'm fairly new to Linux; I only installed Arch yesterday and haven't used Wayland before.) If it's the commands, it happens with anything I've tried so far; I included the command I ran it with this time in the log file I'll attach.

Sorry for that, I should have been more explicit. This is the command I meant, which adds some extra logging about Wayland protocol messages:

WAYLAND_DEBUG=1 wl-mirror --verbose TARGET_OUTPUT

The part about the backends was that you could try

WAYLAND_DEBUG=1 wl-mirror --backend screencopy-shm --verbose TARGET_OUTPUT

which would attempt to capture the image in a different way which is less performant, but is less complicated and hopefully works better with nvidia drivers.

  • I think I set what you were asking for. This is my terminal output from running the program. Note that the sections at the end, which are generated when the mirror screen updates, only appear when I swap focus between two panels.

wl-log.txt

Yes, exactly, thanks for the logs. As I mentioned above, I don't see any obvious errors, but it seems to hang somewhere between frames.

My current guess is that this is related to a driver bug, but it could also be that I'm using some OpenGL thing wrong, and Mesa (the open source OpenGL implementation) is being lenient and handles it in an OK way, while the nvidia implementation hangs or fails somehow. Sadly I don't know what exactly fails or hangs, but I might be able to provide you with a package with additional logging that will help us pinpoint where exactly it hangs.

This issue being related to Nvidia GPUs could also be why I can't reproduce this, because I have an AMD GPU in my laptop and had an Intel GPU in my old one, but never used an Nvidia GPU since starting work on wl-mirror.

Ferdi265 avatar Aug 09 '25 15:08 Ferdi265

@Balint66: do you, by any chance, also use an nvidia GPU?

If not, then that changes things and it must be a different issue.

Ferdi265 avatar Aug 09 '25 15:08 Ferdi265

WAYLAND_DEBUG=1 wl-mirror --backend screencopy-shm --verbose TARGET_OUTPUT


which would attempt to capture the image in a different way which is less performant, but is less complicated and hopefully works better with nvidia drivers.

Just tried running that and it's the same issue. If there is anything else you want or need to help with debugging, I'm happy to let you know what I can. I am running the proprietary Nvidia drivers as well, if that is relevant.

Sylphraena avatar Aug 09 '25 16:08 Sylphraena

Just tried running that and it's the same issue. If there is anything else you want or need to help with debugging, I'm happy to let you know what I can.

I was hoping it would make a difference, but as neither backend actually throws any errors I think the issue is actually with rendering, and not capturing the frames. I'll think about this and see what I can come up with. I'll also double-check my OpenGL calls and see if I can somehow validate that they are used correctly or find the issue. Since other apps (including OBS) work fine, there must be a way to do it correctly.

With the additional logging from WAYLAND_DEBUG=1, can you tell me where in the logs (between which lines) the pause is when the app hangs? That would be very useful in figuring out where the hang is. I might also be able to create a custom package that has additional logging in the next few days to help with identifying that.
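
For anyone who would rather not hunt for the pause by eye, here is a minimal sketch of how the largest pauses between consecutive WAYLAND_DEBUG lines could be located. It is not part of wl-mirror, and the helper name largest_gaps is made up for illustration. It assumes each protocol line starts with a bracketed timestamp such as [3719077.167], and it skips lines without one (for example wl-mirror's own "debug:" lines). Gaps are reported in the log's own timestamp units; as far as I know, libwayland prints these as milliseconds with a microsecond fraction, so it is worth confirming the units before reading a gap as seconds.

```python
import re

# Timestamp prefix of a WAYLAND_DEBUG line, e.g. "[3719077.167] ...".
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def largest_gaps(lines, top=3):
    """Return the `top` largest gaps as (gap, line_before, line_after) tuples.

    `largest_gaps` is a hypothetical helper, not a wl-mirror or libwayland API.
    The gap is expressed in the log's own timestamp units.
    """
    stamped = []
    for line in lines:
        match = TIMESTAMP.match(line)
        if match:  # lines without a timestamp are ignored
            stamped.append((float(match.group(1)), line.rstrip("\n")))

    # Pair every timestamped line with its successor and measure the pause.
    gaps = [
        (round(later[0] - earlier[0], 3), earlier[1], later[1])
        for earlier, later in zip(stamped, stamped[1:])
    ]
    return sorted(gaps, key=lambda gap: gap[0], reverse=True)[:top]
```

Something like largest_gaps(open("wl-log.txt")) would then list the pairs of lines between which the longest pauses occur.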

I am running the proprietary Nvidia drivers as well if it is relevant.

Thanks; using the nvidia/nvidia-lts/nvidia-dkms package from the Arch repos, right?

Ferdi265 avatar Aug 09 '25 17:08 Ferdi265

as mentioned in the referenced comment in #64 above, a similar issue seems to also sometimes occur in River.

The other user who had a similar bug on River was also using an nvidia GPU, so I think we can pretty safely say this bug is triggered by having an nvidia GPU.

Ferdi265 avatar Aug 09 '25 17:08 Ferdi265

[3788555.297] {Display Queue} wl_display#1.delete_id(34)

This is the last line shown when it hangs. Sometimes the number in the brackets on the far right is 35 or 30, and there are almost always one or two of these lines together. I did see a different line once when checking, but about 99% of the time it's that one.

I can stream using Discord with no problem while on a call, as well.

Yes, those are the driver packages I installed.

Sylphraena avatar Aug 09 '25 17:08 Sylphraena

[3788555.297] {Display Queue} wl_display#1.delete_id(34)

This is the last line shown when it hangs. Sometimes the number in the brackets on the far right is 35 or 30, and there are almost always one or two of these lines together. I did see a different line once when checking, but about 99% of the time it's that one.

I can stream using Discord with no problem while on a call, as well.

Yes, those are the driver packages I installed.

This means that a Wayland object was deleted. Such objects are created and deleted all the time, and with only the ids it's hard to know what it was (I'd also need the earlier log lines, from when the object with id 34 was created). Can you send me the whole log from the WAYLAND_DEBUG=1 run? That would help me a lot.
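
To illustrate why the earlier lines matter, here is a rough sketch of how a delete_id could be matched back to the kind of object it refers to, given the full log. It is only a heuristic and not part of wl-mirror: the helper name resolve_deleted_ids is hypothetical, and it assumes objects appear in the log as interface#id tokens (as in zwlr_screencopy_frame_v1#34). Since ids get reused, it simply takes the interface most recently seen with that id before the deletion.

```python
import re

# "interface#id" tokens as they appear in WAYLAND_DEBUG output,
# e.g. "zwlr_screencopy_frame_v1#34" or "wl_buffer#36".
OBJECT = re.compile(r"([a-z_][a-z0-9_]*)#(\d+)")
# The event that tells the client an object id has been released.
DELETE = re.compile(r"wl_display#1\.delete_id\((\d+)\)")


def resolve_deleted_ids(lines):
    """Return (id, interface) pairs for every delete_id, in log order.

    `resolve_deleted_ids` is a hypothetical helper for illustration only.
    Ids that were never seen before their deletion map to "unknown".
    """
    last_interface = {}  # object id -> interface name seen most recently
    resolved = []
    for line in lines:
        deleted = DELETE.search(line)
        if deleted:
            object_id = int(deleted.group(1))
            resolved.append((object_id, last_interface.get(object_id, "unknown")))
            continue
        # Remember every object mentioned on this line, including arguments.
        for interface, object_id in OBJECT.findall(line):
            last_interface[int(object_id)] = interface
    return resolved
```

With only a short excerpt, most ids come back as "unknown", which is exactly the problem with having just the last line; with the whole log, each deleted id can usually be traced to an interface.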

Ferdi265 avatar Aug 09 '25 18:08 Ferdi265

That should be the new log file. I didn't leave it running for too long, and I swapped focus a couple of times to make sure it captured the issue. Hopefully that helps.

wl-log.txt

Sylphraena avatar Aug 09 '25 18:08 Sylphraena

Perfect!

[3719076.993] {Default Queue} zwlr_screencopy_frame_v1#34.ready(0, 3568, 573268109)
debug: mirror-screencopy::on_ready(): received ready event with width: 1920, height: 1080, stride: 7680, format:
[3719077.167] {Display Queue} wl_display#1.delete_id(32)
[3719077.195] wl_buffer#36.release()
[3719077.219] wl_callback#32.done(0)
[3719078.263] {Default Queue}  -> zwlr_screencopy_frame_v1#34.destroy()
[3719089.076] {Display Queue} wl_display#1.delete_id(34)

Interesting. There is a 10 second gap between zwlr_screencopy_frame_v1#34.destroy() and wl_display#1.delete_id(34), which means that either the Wayland compositor is taking very long to destroy these screencopy objects or wl-mirror is taking very long to import the buffer and render it, likely the latter.

Normally, wl-mirror imports the buffer, destroys the screencopy object, immediately renders the next frame and then waits for Wayland events. In this case something is taking very long between destroying the screencopy frame and when the frame is finally presented.
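
The same measurement can be repeated for every object in a log rather than a single frame. The following is a minimal sketch under a few assumptions: the helper name destroy_latencies is made up and is not part of wl-mirror, requests are marked with "->" as in the excerpt, and only requests literally named destroy() are tracked (other destructor requests are ignored). The result is in the log's own timestamp units; as far as I know, libwayland prints these as milliseconds with a microsecond fraction, so the units should be confirmed before drawing conclusions from the numbers.

```python
import re

# A timestamped WAYLAND_DEBUG line: "[3719078.263] rest of the line".
LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")
# A destroy request sent by the client, e.g. "-> zwlr_screencopy_frame_v1#34.destroy()".
DESTROY = re.compile(r"->\s*([a-z_][a-z0-9_]*)#(\d+)\.destroy\(\)")
# The matching acknowledgement from the compositor.
DELETE = re.compile(r"wl_display#1\.delete_id\((\d+)\)")


def destroy_latencies(lines):
    """Return (interface, id, latency) for each destroy() followed by delete_id.

    `destroy_latencies` is a hypothetical helper for illustration only.
    Latency is the timestamp difference in the log's own units.
    """
    pending = {}  # object id -> (timestamp of destroy request, interface)
    results = []
    for line in lines:
        match = LINE.match(line)
        if not match:
            continue  # e.g. wl-mirror's own "debug:" lines without timestamps
        timestamp, rest = float(match.group(1)), match.group(2)

        destroyed = DESTROY.search(rest)
        if destroyed:
            pending[int(destroyed.group(2))] = (timestamp, destroyed.group(1))
            continue

        deleted = DELETE.search(rest)
        if deleted and int(deleted.group(1)) in pending:
            sent, interface = pending.pop(int(deleted.group(1)))
            results.append((interface, int(deleted.group(1)), round(timestamp - sent, 3)))
    return results
```

Sorting the result by the last field would show whether the long delay is specific to screencopy frames or affects every destroyed object.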

This narrows down where I have to look significantly, thank you very much!

I'll make a package with extra logging (including timestamps for wl-mirror's logs) to better be able to figure out what exactly is hanging.

Ferdi265 avatar Aug 09 '25 19:08 Ferdi265

Looking at the code I have a hunch that the render timing in wl-mirror is wrong. That's not the core issue here though, since that only impacts performance usually and shouldn't lead to hangs, but maybe it has something to do with this.

Ferdi265 avatar Aug 09 '25 20:08 Ferdi265