gamescope
Strange flickering using gamescope for Steam games with NVIDIA
So I tested gamescope (compiled from latest master 3.11.31-beta4-6-g97288b8) yesterday on NVIDIA 515.43.04:
gamescope -f -U -w 1920 -h 1080 -- %command%
It's upscaling to 3840x2160 at 60 fps (according to the Steam FPS counter).
But across all games I tested, it feels like 60 fps yet looks like it's swapping render buffers in the wrong order, no matter whether vsync is on or off. It's hard to describe, but I'll try: when turning the camera slowly, it looks smooth. When turning or moving around faster, it looks like 30 Hz flickering with graphics constantly jumping back and forth, as if the game is rendering frames 1,2,3,4,... but gamescope is showing 2,1,4,3... It looks like games now use triple buffering instead of double buffering but show the wrong frame buffer on screen, which is supported by the fact that I'm still seeing screen tearing (even though it may be triple buffering).
As mentioned above: I can see screen tearing but this is probably some general issue with the NVIDIA driver and multi-monitor. Full composition pipeline isn't enabled (doing so doesn't fix tearing for me anyways, it looks like vsync is offset by around half a screen refresh, maybe due to using multiple monitors).
Also, sometimes after closing gamescope, my desktop turns upside down, occasionally flickering back to the right orientation. This can be solved by running another game without gamescope. It isn't solved by restarting kwin.
So I found the strange flickering may be caused by kwin because gamescope doesn't block compositing in fullscreen mode. Without kwin compositing, the flickering is gone. The intense tearing is still an issue, tho.
graphics constantly jumping back and forth
I can see this in game menus: sometimes when selecting the next menu item, it looks like it jumps to the previous menu item and back.
More fun
VK_INSTANCE_LAYERS=VK_LAYER_MESA_overlay gamescope -i -- stranglevk 40 vkcube
This command displays two Mesa overlays, one for gamescope and one for vkcube. Both show 40 fps, but on vkcube the frame time is ~25 ms and on gamescope it's constantly jumping between ~16 ms and ~33 ms. When I randomly shake the mouse, the Mesa overlay on gamescope shows 60 fps and 16 ms.
Much better:
VK_INSTANCE_LAYERS=VK_LAYER_MESA_overlay gamescope -i -r30 -- stranglevk 30 vkcube
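A rough way to see why the 40 fps case jumps between ~16 ms and ~33 ms while the 30 fps case looks steady: if each finished frame has to wait for the next refresh tick, a frame rate that doesn't divide the refresh rate evenly produces alternating intervals. The sketch below is only a simplified model under that assumption (it is not how gamescope actually schedules frames, and `present_intervals_ms` is a made-up helper name).

```python
from fractions import Fraction
from math import ceil


def present_intervals_ms(fps, refresh_hz, frames=8):
    """Intervals (ms) between shown frames, assuming every frame
    is shown at the first refresh tick at or after it is ready."""
    frame_time = Fraction(1000, fps)      # time between rendered frames
    tick = Fraction(1000, refresh_hz)     # time between refresh ticks
    # Round each frame's ready time up to the next refresh tick.
    shown = [ceil(i * frame_time / tick) * tick for i in range(frames)]
    return [round(float(b - a), 1) for a, b in zip(shown, shown[1:])]


# 40 fps limiter on a 60 Hz refresh: intervals alternate.
print(present_intervals_ms(40, 60))
# 30 fps limiter with a 30 Hz refresh (like -r30): intervals are constant.
print(present_intervals_ms(30, 30))
```

In this model the 40 fps case still averages 25 ms per frame, which would fit both overlays reporting 40 fps while only the gamescope one shows uneven frame times.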
Also, a ton of events are being generated (or something like that). glxgears process stats:
perf report
glxgears
Performance counter stats for process id '30897':
539.04 msec task-clock # 0.024 CPUs utilized
1536 context-switches # 2.849 K/sec
202 cpu-migrations # 374.737 /sec
0 page-faults # 0.000 /sec
464782409 cycles # 0.862 GHz (67.05%)
265246086 stalled-cycles-frontend # 57.07% frontend cycles idle (66.51%)
96366893 stalled-cycles-backend # 20.73% backend cycles idle (65.96%)
162872237 instructions # 0.35 insn per cycle
# 1.63 stalled cycles per insn (66.66%)
32568835 branches # 60.420 M/sec (66.99%)
4890428 branch-misses # 15.02% of all branches (66.84%)
22.543913280 seconds time elapsed
gamescope -- glxgears
Performance counter stats for process id '30991':
36861.88 msec task-clock # 0.988 CPUs utilized
14069 context-switches # 381.668 /sec
34 cpu-migrations # 0.922 /sec
0 page-faults # 0.000 /sec
113806258921 cycles # 3.087 GHz (66.62%)
2333765348 stalled-cycles-frontend # 2.05% frontend cycles idle (66.68%)
109240541753 stalled-cycles-backend # 95.99% backend cycles idle (66.74%)
3658279819 instructions # 0.03 insn per cycle
# 29.86 stalled cycles per insn (66.68%)
795305924 branches # 21.575 M/sec (66.64%)
24605241 branch-misses # 3.09% of all branches (66.64%)
37.305241359 seconds time elapsed
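For comparing the two runs, the derived columns can be recomputed from the raw counters. This is just a small sketch using the numbers copied from the two outputs above; the dictionary layout and the `derived` helper are my own naming, not anything perf provides.

```python
# Raw counters copied from the two perf stat outputs above.
runs = {
    "glxgears": {
        "task_clock_ms": 539.04,
        "elapsed_s": 22.543913280,
        "cycles": 464_782_409,
        "instructions": 162_872_237,
    },
    "gamescope -- glxgears": {
        "task_clock_ms": 36861.88,
        "elapsed_s": 37.305241359,
        "cycles": 113_806_258_921,
        "instructions": 3_658_279_819,
    },
}


def derived(run):
    """Recompute 'CPUs utilized' and 'insn per cycle' from raw counters."""
    return {
        "cpus_utilized": run["task_clock_ms"] / 1000 / run["elapsed_s"],
        "ipc": run["instructions"] / run["cycles"],
    }


for name, run in runs.items():
    d = derived(run)
    print(f"{name}: {d['cpus_utilized']:.3f} CPUs utilized, {d['ipc']:.2f} insn per cycle")
```

The point of the comparison: under gamescope the glxgears process keeps almost a full core busy while retiring roughly a tenth as many instructions per cycle as without it.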
# Overhead Command Shared Object Symbol
# ........ ........ ............................. ............................................
#
93.35% glxgears [kernel.vmlinux] [k] copy_user_generic_string
1.24% glxgears libnvidia-glcore.so.515.43.04 [.] _nv041glcore
0.95% glxgears libnvidia-glcore.so.515.43.04 [.] _nv011glcore
0.56% glxgears [vdso] [.] __vdso_clock_gettime
0.41% glxgears libnvidia-glcore.so.515.43.04 [.] _nv023glcore
0.23% glxgears [kernel.vmlinux] [k] sched_clock_cpu
...
...
93.12% 92.71% glxgears [kernel.vmlinux] [k] copy_user_generic_string
|
--92.69%--0x5f53f3c000007fff
writev
entry_SYSCALL_64_after_hwframe
do_syscall_64
__x64_sys_writev
vfs_writev
do_iter_write.part.0
do_iter_readv_writev
sock_write_iter
unix_stream_sendmsg
skb_copy_datagram_from_iter
|
|--81.40%--copy_page_from_iter
| copy_user_generic_string
|
--11.29%--_copy_from_iter
copy_user_generic_string
...
This is on X11, I'm not sure if I correctly set everything up.
EDIT: same or related: https://gitlab.freedesktop.org/xorg/xserver/-/issues/1317 (implicit synchronization issues on Nvidia, see comments) https://github.com/NVIDIA/open-gpu-kernel-modules/issues/187
Another perf report. Removed entries below ~1%.
perf report
75.45% 0.05% glxgears [kernel.vmlinux] [k] entry_SYSCALL_64_after_hwframe
|
--75.40%--entry_SYSCALL_64_after_hwframe
do_syscall_64
|
|--73.84%--do_writev
| vfs_writev
| |
| --73.84%--do_iter_write
| |
| --73.84%--do_iter_readv_writev
| sock_write_iter
| sock_sendmsg
| unix_stream_sendmsg
| |
| --73.73%--skb_copy_datagram_from_iter
| |
| |--72.24%--copy_page_from_iter
| | |
| | --72.24%--copy_user_enhanced_fast_string
| | |
| | |--2.84%--asm_sysvec_apic_timer_interrupt
| | |
| | --2.62%--irq_entries_start
| |
| --1.48%--_copy_from_iter
| |
| --1.48%--copy_user_enhanced_fast_string
|
|--0.74%--__ia32_sys_sched_yield
| |
| --0.71%--schedule
| |
| --0.70%--__schedule
|
--0.69%--syscall_exit_to_user_mode
75.40% 0.02% glxgears [kernel.vmlinux] [k] do_syscall_64
|
--75.38%--do_syscall_64
|
|--73.84%--do_writev
| vfs_writev
| |
| --73.84%--do_iter_write
| |
| --73.84%--do_iter_readv_writev
| sock_write_iter
| sock_sendmsg
| unix_stream_sendmsg
| |
| --73.73%--skb_copy_datagram_from_iter
| |
| |--72.24%--copy_page_from_iter
| | |
| | --72.24%--copy_user_enhanced_fast_string
| | |
| | |--2.84%--asm_sysvec_apic_timer_interrupt
| | |
| | --2.62%--irq_entries_start
| |
| --1.48%--_copy_from_iter
| |
| --1.48%--copy_user_enhanced_fast_string
|
|--0.74%--__ia32_sys_sched_yield
| |
| --0.71%--schedule
| |
| --0.70%--__schedule
|
--0.69%--syscall_exit_to_user_mode
73.85% 0.00% glxgears libc.so.6 [.] writev
|
--73.85%--writev
|
--73.84%--entry_SYSCALL_64_after_hwframe
do_syscall_64
do_writev
vfs_writev
|
--73.84%--do_iter_write
|
--73.84%--do_iter_readv_writev
sock_write_iter
sock_sendmsg
unix_stream_sendmsg
|
--73.73%--skb_copy_datagram_from_iter
|
|--72.24%--copy_page_from_iter
| |
| --72.24%--copy_user_enhanced_fast_string
| |
| |--2.84%--asm_sysvec_apic_timer_interrupt
| |
| --2.62%--irq_entries_start
|
--1.48%--_copy_from_iter
|
--1.48%--copy_user_enhanced_fast_string
73.84% 0.00% glxgears [kernel.vmlinux] [k] do_writev
|
---do_writev
vfs_writev
|
--73.84%--do_iter_write
|
--73.84%--do_iter_readv_writev
sock_write_iter
sock_sendmsg
unix_stream_sendmsg
|
--73.73%--skb_copy_datagram_from_iter
|
|--72.24%--copy_page_from_iter
| |
| --72.24%--copy_user_enhanced_fast_string
| |
| |--2.84%--asm_sysvec_apic_timer_interrupt
| |
| --2.62%--irq_entries_start
|
--1.48%--_copy_from_iter
|
--1.48%--copy_user_enhanced_fast_string
73.84% 0.00% glxgears [kernel.vmlinux] [k] vfs_writev
|
---vfs_writev
|
--73.84%--do_iter_write
|
--73.84%--do_iter_readv_writev
sock_write_iter
sock_sendmsg
unix_stream_sendmsg
|
--73.73%--skb_copy_datagram_from_iter
|
|--72.24%--copy_page_from_iter
| |
| --72.24%--copy_user_enhanced_fast_string
| |
| |--2.84%--asm_sysvec_apic_timer_interrupt
| |
| --2.62%--irq_entries_start
|
--1.48%--_copy_from_iter
|
--1.48%--copy_user_enhanced_fast_string
73.84% 0.00% glxgears [kernel.vmlinux] [k] do_iter_write
|
--73.84%--do_iter_write
do_iter_readv_writev
sock_write_iter
sock_sendmsg
unix_stream_sendmsg
|
--73.73%--skb_copy_datagram_from_iter
|
|--72.24%--copy_page_from_iter
| |
| --72.24%--copy_user_enhanced_fast_string
| |
| |--2.84%--asm_sysvec_apic_timer_interrupt
| |
| --2.62%--irq_entries_start
|
--1.48%--_copy_from_iter
|
--1.48%--copy_user_enhanced_fast_string
73.84% 0.00% glxgears [kernel.vmlinux] [k] do_iter_readv_writev
|
---do_iter_readv_writev
sock_write_iter
sock_sendmsg
unix_stream_sendmsg
|
--73.73%--skb_copy_datagram_from_iter
|
|--72.24%--copy_page_from_iter
| |
| --72.24%--copy_user_enhanced_fast_string
| |
| |--2.84%--asm_sysvec_apic_timer_interrupt
| |
| --2.62%--irq_entries_start
|
--1.48%--_copy_from_iter
|
--1.48%--copy_user_enhanced_fast_string
73.84% 0.00% glxgears [kernel.vmlinux] [k] sock_write_iter
|
---sock_write_iter
sock_sendmsg
unix_stream_sendmsg
|
--73.73%--skb_copy_datagram_from_iter
|
|--72.24%--copy_page_from_iter
| |
| --72.24%--copy_user_enhanced_fast_string
| |
| |--2.84%--asm_sysvec_apic_timer_interrupt
| |
| --2.62%--irq_entries_start
|
--1.48%--_copy_from_iter
|
--1.48%--copy_user_enhanced_fast_string
73.84% 0.00% glxgears [kernel.vmlinux] [k] sock_sendmsg
|
---sock_sendmsg
unix_stream_sendmsg
|
--73.73%--skb_copy_datagram_from_iter
|
|--72.24%--copy_page_from_iter
| |
| --72.24%--copy_user_enhanced_fast_string
| |
| |--2.84%--asm_sysvec_apic_timer_interrupt
| |
| --2.62%--irq_entries_start
|
--1.48%--_copy_from_iter
|
--1.48%--copy_user_enhanced_fast_string
73.84% 0.00% glxgears [kernel.vmlinux] [k] unix_stream_sendmsg
|
--73.83%--unix_stream_sendmsg
|
--73.73%--skb_copy_datagram_from_iter
|
|--72.24%--copy_page_from_iter
| |
| --72.24%--copy_user_enhanced_fast_string
| |
| |--2.84%--asm_sysvec_apic_timer_interrupt
| |
| --2.62%--irq_entries_start
|
--1.48%--_copy_from_iter
|
--1.48%--copy_user_enhanced_fast_string
73.73% 0.00% glxgears [kernel.vmlinux] [k] skb_copy_datagram_from_iter
|
--73.73%--skb_copy_datagram_from_iter
|
|--72.24%--copy_page_from_iter
| |
| --72.24%--copy_user_enhanced_fast_string
| |
| |--2.84%--asm_sysvec_apic_timer_interrupt
| |
| --2.62%--irq_entries_start
|
--1.48%--_copy_from_iter
|
--1.48%--copy_user_enhanced_fast_string
73.72% 73.43% glxgears [kernel.vmlinux] [k] copy_user_enhanced_fast_string
|
--73.22%--0
writev
entry_SYSCALL_64_after_hwframe
do_syscall_64
do_writev
vfs_writev
do_iter_write
do_iter_readv_writev
sock_write_iter
sock_sendmsg
unix_stream_sendmsg
skb_copy_datagram_from_iter
|
|--71.74%--copy_page_from_iter
| copy_user_enhanced_fast_string
| |
| |--2.79%--asm_sysvec_apic_timer_interrupt
| |
| --2.62%--irq_entries_start
|
--1.48%--_copy_from_iter
copy_user_enhanced_fast_string
73.72% 0.00% glxgears [unknown] [k] 0000000000000000
|
---0
|
--73.64%--writev
|
--73.64%--entry_SYSCALL_64_after_hwframe
do_syscall_64
do_writev
vfs_writev
|
--73.64%--do_iter_write
|
--73.63%--do_iter_readv_writev
sock_write_iter
sock_sendmsg
unix_stream_sendmsg
|
--73.52%--skb_copy_datagram_from_iter
|
|--72.04%--copy_page_from_iter
| |
| --72.03%--copy_user_enhanced_fast_string
| |
| |--2.83%--asm_sysvec_apic_timer_interrupt
| |
| --2.62%--irq_entries_start
|
--1.48%--_copy_from_iter
|
--1.48%--copy_user_enhanced_fast_string
72.24% 0.00% glxgears [kernel.vmlinux] [k] copy_page_from_iter
|
--72.24%--copy_page_from_iter
copy_user_enhanced_fast_string
|
|--2.84%--asm_sysvec_apic_timer_interrupt
|
--2.62%--irq_entries_start
6.35% 0.12% Xwayland [kernel.vmlinux] [k] entry_SYSCALL_64_after_hwframe
|
--6.23%--entry_SYSCALL_64_after_hwframe
|
--6.22%--do_syscall_64
|
|--4.20%--__sys_recvmsg
| |
| --4.16%--___sys_recvmsg
| |
| --4.05%--____sys_recvmsg
| |
| --4.02%--unix_stream_recvmsg
| |
| --4.00%--unix_stream_read_generic
| |
| --3.38%--unix_stream_read_actor
| |
| --3.38%--skb_copy_datagram_iter
| |
| --3.38%--__skb_datagram_iter
| |
| --3.34%--_copy_to_iter
| |
| --3.28%--copy_user_enhanced_fast_string
|
--1.10%--syscall_exit_to_user_mode
6.24% 0.05% Xwayland [kernel.vmlinux] [k] do_syscall_64
|
--6.19%--do_syscall_64
|
|--4.20%--__sys_recvmsg
| |
| --4.16%--___sys_recvmsg
| |
| --4.05%--____sys_recvmsg
| |
| --4.02%--unix_stream_recvmsg
| |
| --4.00%--unix_stream_read_generic
| |
| --3.38%--unix_stream_read_actor
| |
| --3.38%--skb_copy_datagram_iter
| |
| --3.38%--__skb_datagram_iter
| |
| --3.34%--_copy_to_iter
| |
| --3.28%--copy_user_enhanced_fast_string
|
--1.10%--syscall_exit_to_user_mode
4.79% 0.02% Xwayland libc.so.6 [.] recvmsg
|
--4.79%--recvmsg
|
--4.53%--entry_SYSCALL_64_after_hwframe
|
--4.50%--do_syscall_64
|
--4.20%--__sys_recvmsg
|
--4.16%--___sys_recvmsg
|
--4.05%--____sys_recvmsg
|
--4.02%--unix_stream_recvmsg
|
--4.00%--unix_stream_read_generic
|
--3.38%--unix_stream_read_actor
|
--3.38%--skb_copy_datagram_iter
|
--3.38%--__skb_datagram_iter
|
--3.34%--_copy_to_iter
|
--3.28%--copy_user_enhanced_fast_string
4.20% 0.01% Xwayland [kernel.vmlinux] [k] __sys_recvmsg
|
--4.19%--__sys_recvmsg
|
--4.16%--___sys_recvmsg
|
--4.05%--____sys_recvmsg
|
--4.02%--unix_stream_recvmsg
|
--4.00%--unix_stream_read_generic
|
--3.38%--unix_stream_read_actor
|
--3.38%--skb_copy_datagram_iter
|
--3.38%--__skb_datagram_iter
|
--3.34%--_copy_to_iter
|
--3.28%--copy_user_enhanced_fast_string
4.16% 0.02% Xwayland [kernel.vmlinux] [k] ___sys_recvmsg
|
--4.14%--___sys_recvmsg
|
--4.05%--____sys_recvmsg
|
--4.02%--unix_stream_recvmsg
|
--4.00%--unix_stream_read_generic
|
--3.38%--unix_stream_read_actor
|
--3.38%--skb_copy_datagram_iter
|
--3.38%--__skb_datagram_iter
|
--3.34%--_copy_to_iter
|
--3.28%--copy_user_enhanced_fast_string
4.05% 0.02% Xwayland [kernel.vmlinux] [k] ____sys_recvmsg
|
--4.03%--____sys_recvmsg
|
--4.02%--unix_stream_recvmsg
|
--4.00%--unix_stream_read_generic
|
--3.38%--unix_stream_read_actor
|
--3.38%--skb_copy_datagram_iter
|
--3.38%--__skb_datagram_iter
|
--3.34%--_copy_to_iter
|
--3.28%--copy_user_enhanced_fast_string
4.02% 0.01% Xwayland [kernel.vmlinux] [k] unix_stream_recvmsg
|
--4.00%--unix_stream_recvmsg
|
--4.00%--unix_stream_read_generic
|
--3.38%--unix_stream_read_actor
|
--3.38%--skb_copy_datagram_iter
|
--3.38%--__skb_datagram_iter
|
--3.34%--_copy_to_iter
|
--3.28%--copy_user_enhanced_fast_string
4.00% 0.06% Xwayland [kernel.vmlinux] [k] unix_stream_read_generic
|
--3.94%--unix_stream_read_generic
|
--3.38%--unix_stream_read_actor
|
--3.38%--skb_copy_datagram_iter
|
--3.38%--__skb_datagram_iter
|
--3.34%--_copy_to_iter
|
--3.28%--copy_user_enhanced_fast_string
3.38% 0.00% Xwayland [kernel.vmlinux] [k] unix_stream_read_actor
|
--3.38%--unix_stream_read_actor
|
--3.38%--skb_copy_datagram_iter
|
--3.38%--__skb_datagram_iter
|
--3.34%--_copy_to_iter
|
--3.28%--copy_user_enhanced_fast_string
3.38% 0.00% Xwayland [kernel.vmlinux] [k] skb_copy_datagram_iter
|
--3.38%--skb_copy_datagram_iter
__skb_datagram_iter
|
--3.34%--_copy_to_iter
|
--3.28%--copy_user_enhanced_fast_string
3.38% 0.03% Xwayland [kernel.vmlinux] [k] __skb_datagram_iter
|
--3.34%--__skb_datagram_iter
|
--3.34%--_copy_to_iter
|
--3.28%--copy_user_enhanced_fast_string
3.34% 0.05% Xwayland [kernel.vmlinux] [k] _copy_to_iter
|
--3.30%--_copy_to_iter
|
--3.28%--copy_user_enhanced_fast_string
3.32% 3.29% Xwayland [kernel.vmlinux] [k] copy_user_enhanced_fast_string
|
--3.27%--recvmsg
entry_SYSCALL_64_after_hwframe
do_syscall_64
__sys_recvmsg
___sys_recvmsg
|
--3.27%--____sys_recvmsg
unix_stream_recvmsg
unix_stream_read_generic
unix_stream_read_actor
skb_copy_datagram_iter
__skb_datagram_iter
_copy_to_iter
|
--3.26%--copy_user_enhanced_fast_string
2.84% 0.00% glxgears [kernel.vmlinux] [k] asm_sysvec_apic_timer_interrupt
|
---asm_sysvec_apic_timer_interrupt
2.62% 0.00% glxgears [kernel.vmlinux] [k] irq_entries_start
|
---irq_entries_start
2.55% 0.00% glxgears [unknown] [k] 0x89495541f6894956
|
---0x89495541f6894956
0x7f3f536b0a40
|
--2.26%--__sched_yield
|
--1.48%--entry_SYSCALL_64_after_hwframe
|
--1.44%--do_syscall_64
|
|--0.74%--__ia32_sys_sched_yield
| |
| --0.71%--schedule
| |
| --0.70%--__schedule
|
--0.66%--syscall_exit_to_user_mode
2.55% 0.00% glxgears libnvidia-glcore.so.515.43.04 [.] 0x00007f3f536b0a40
|
---0x7f3f536b0a40
|
--2.26%--__sched_yield
|
--1.48%--entry_SYSCALL_64_after_hwframe
|
--1.44%--do_syscall_64
|
|--0.74%--__ia32_sys_sched_yield
| |
| --0.71%--schedule
| |
| --0.70%--__schedule
|
--0.66%--syscall_exit_to_user_mode
2.27% 0.02% glxgears libc.so.6 [.] __sched_yield
|
--2.24%--__sched_yield
|
--1.48%--entry_SYSCALL_64_after_hwframe
|
--1.44%--do_syscall_64
|
|--0.74%--__ia32_sys_sched_yield
| |
| --0.71%--schedule
| |
| --0.70%--__schedule
|
--0.66%--syscall_exit_to_user_mode
2.16% 0.00% Xwayland [unknown] [.] 0000000000000000
|
---0
|
--0.66%--0x7fe9e57500a3
1.61% 0.00% Xwayland [unknown] [k] 0x0000564419f1f020
|
---0x564419f1f020
|
--1.54%--epoll_wait
|
--0.97%--__x64_sys_epoll_wait
|
--0.96%--do_epoll_wait
|
--0.62%--schedule_hrtimeout_range_clock
|
--0.52%--schedule
|
--0.50%--__schedule
1.56% 0.02% Xwayland libc.so.6 [.] epoll_wait
|
--1.54%--epoll_wait
|
--0.97%--__x64_sys_epoll_wait
|
--0.96%--do_epoll_wait
|
--0.62%--schedule_hrtimeout_range_clock
|
--0.52%--schedule
|
--0.50%--__schedule
1.48% 0.00% glxgears [kernel.vmlinux] [k] _copy_from_iter
|
---_copy_from_iter
|
--1.48%--copy_user_enhanced_fast_string
This problem seems to be gone with 515.48.07, but tearing is still terrible. Vsync seems generally broken with NVIDIA (at least for me: forcing the composition pipeline does not help, and I usually run without vsync now because otherwise it causes extreme stutter without eliminating tearing). Without gamescope I see at least just one line of tearing, but with gamescope I see a whole block of tearing zigzagging across a part of the screen, as if you took zigzag scissors to cut between two frames. This could be partially explained by the issues linked by @pchome in https://github.com/Plagman/gamescope/issues/495#issuecomment-1126800982.
@kakra FYI: the KWIN_X11_FORCE_SOFTWARE_VSYNC=1 and KWIN_X11_NO_SYNC_TO_VBLANK=1 environment variables may help a bit with the kwin compositor. There was a knob (or setting) in KDE for vsync, but it was removed a while ago.
@pchome It looks like putting Option "ForceCompositionPipeline" "on" into Section "Device" fixes most of the issues. Strangely, it didn't work properly before. Also, just enabling this in nvidia-settings doesn't seem to be enough.
I have the same issue on NVIDIA, regardless of driver version and with or without gamescope.
- Weird "mixed" frame pacing under Wayland (the same as the author described).
- Stutters under Wayland and Xorg with ForceCompositionPipeline. The stutters seem to be tied to a camera position or place in the game, but going into the inventory/map (The Ascent) or opening the map (Elden Ring) fixes the stutters for that particular place. (Note: the frametime graph shows no spikes at all.)
- In windowed/tiled mode, games with vsync have tearing; the line always appears at the same spot, at about 2/7 of the screen.
- Vsync works perfectly in gamescope only if started with -f. Without gamescope, it only works in the game's native fullscreen mode.
Arch 6.0, NVIDIA 520, 60 Hz screen, 5600X, 3060 Ti
Something tells me this is Nvidia's problem. Will post there I guess.