open-gpu-kernel-modules

RTX 5090 often freezes presentation for 3-5 seconds in certain games

Open matte-schwartz opened this issue 6 months ago • 50 comments

NVIDIA Open GPU Kernel Modules Version

570.123.18

Please confirm this issue does not happen with the proprietary driver (of the same version). This issue tracker is only for bugs specific to the open kernel driver.

  • [x] I confirm that this does not happen with the proprietary driver package.

Operating System and Version

CachyOS (Arch Linux)

Kernel Release

Linux blackwell 6.12.33-1-cachyos-lts #1 SMP PREEMPT_DYNAMIC Tue, 10 Jun 2025 14:43:36 +0000 x86_64 GNU/Linux

Please confirm you are running a stable release kernel (e.g. not a -rc). We do not accept bug reports for unreleased kernels.

  • [x] I am running on a stable kernel release.

Hardware: GPU

GPU 0: NVIDIA GeForce RTX 5090 (UUID: GPU-addf70ec-1a4e-f8db-39bc-59d1ab271798)

Describe the bug

When playing certain titles on my RTX 5090 rig, I experience frequent "freezes" where the game window will freeze but some audio continues to play in the background. During the freezes, I am able to use everything else on my PC normally without any indications of a GPU hang. After 3-5 seconds, the game "unfreezes" and then catches up to where it should be.

This happens with every compositor I have tried: KWin wayland, cosmic-comp, and gamescope. I have tried every 570 and 575 driver release and experience this issue on all of them.

Games I experience this freezing with include:

  • Clair Obscur: Expedition 33 (DX12)
  • The First Berserker: Khazan (DX11 and DX12)
  • Star Citizen (DX11)
  • Final Fantasy VII: Rebirth (DX12)

Games I do not experience this freezing with include:

  • Kingdom Come: Deliverance (DX11)
  • Horizon: Forbidden West (DX12)
  • Death Stranding

To Reproduce

KWin repro steps:

  1. Launch Clair Obscur: Expedition 33 on an RTX 5090 (settings should not matter)
  2. Play the game for 5-20 minutes

At some random point in that timespan, you should start running into this freezing bug.

Gamescope repro steps:

  1. Launch Steam inside of gamescope with gamescope -e -f -h <height> -w <width> -r <refresh rate> --mangoapp --adaptive-sync -- steam -steamdeck -steamos3 -steampal -gamepadui
  2. Launch Clair Obscur: Expedition 33 (make sure the only launch option you are using is SteamDeck=0 %command%)
  3. Play the game for 5-20 minutes
  4. When you encounter a freeze, try pulling up one of the Steam side menus with the Xbox logo or Shift + Tab on your keyboard

You should see a similar effect to the gameplay clip I posted below
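For anyone trying to reproduce this, the gamescope invocation from step 1 can be wrapped in a small helper so the placeholders are filled in consistently. This is only a sketch: the function name `build_gamescope_cmd` is hypothetical, and the 3840x2160 at 120 Hz values are example inputs, not values taken from the report. The flags themselves are copied verbatim from step 1.

```shell
# Sketch: print the gamescope command line from step 1 with the
# <width>, <height> and <refresh rate> placeholders filled in.
# Arguments: $1 = width, $2 = height, $3 = refresh rate (Hz).
build_gamescope_cmd() {
    printf 'gamescope -e -f -h %s -w %s -r %s --mangoapp --adaptive-sync -- steam -steamdeck -steamos3 -steampal -gamepadui\n' \
        "$2" "$1" "$3"
}

# Example values only (not from the report): a 3840x2160 display at 120 Hz.
build_gamescope_cmd 3840 2160 120
```

The helper only prints the command, so it can be reviewed before being run from a TTY or session launcher.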

Bug Incidence

Always

nvidia-bug-report.log.gz

nvidia-bug-report.log.gz

steam-1903340.log

More Info

One interesting tidbit I found is that I am able to "unfreeze" the game if I pull up a separate DRM plane, like one of the Steam side menus when running Steam's GamepadUI inside of gamescope. An example of this is here:

https://github.com/user-attachments/assets/c406a8c4-8ea3-461c-8e5e-18e1cb467015

You can see at the end of the clip it gets stuck for a few seconds but the instant I pull up the side menu, it unfreezes.

matte-schwartz avatar Jun 13 '25 05:06 matte-schwartz

Apparently another way to unfreeze presentation is to pull up a different app window (like ulauncher) on top of the main game window if you're using a desktop environment to game.

Haven't been able to reproduce this issue on my 4000 series laptop in dGPU mode yet or on any of my AMD rigs under identical usage conditions, only on Blackwell.

matte-schwartz avatar Jun 14 '25 19:06 matte-schwartz

I think I'm getting the same thing here with Kingdom Come: Deliverance II on a 5090.

https://forums.developer.nvidia.com/t/570-release-feedback-discussion/321956/554?u=arulan7106

Arulan avatar Jun 16 '25 05:06 Arulan

@Arulan yes that looks very similar.

Did some more tinkering this weekend and can confirm: 1. disabling VK_KHR_present_wait has no effect, 2. disabling MPO in the nvidia module has no effect

matte-schwartz avatar Jun 16 '25 19:06 matte-schwartz

FYI, I had this exact issue on Clair Obscur on my 5080 when using Proton Experimental Bleeding Edge about 3 weeks ago. Switching back to the "stable" Proton 10.0-1 beta fixed it for me. Not sure about the exact causal relation, but maybe this makes it easier to find the trigger.

EDIT: Scratch that. After more testing, the issue popped up again with Clair Obscur, but only after several longer successful play sessions today, and after playing for an hour or so. The same issue also popped up after playing Mirror's Edge Catalyst (DX11) for ~15 minutes via Proton 8.0.5. So it's not related to any Proton version and appears to be quite random. Sorry.

mgruberb avatar Jun 17 '25 17:06 mgruberb

Image

already testing with that, no luck

edit: going to try clearing out the prefix and re-downloading Proton 10 just to confirm

matte-schwartz avatar Jun 17 '25 17:06 matte-schwartz

I have this happening as well. On an AM5 system.

I have a friend with an AM4 system and same GPU (5090 FE) and this doesn't happen on their system for some reason running the same version of proton and OS.

Running on Bazzite May 29 Stable as that was the last driver where Clair Obscur would launch.

SimpleHeuristics avatar Jun 17 '25 18:06 SimpleHeuristics

that's very interesting, I hadn't considered that. I am running a 9800X3D on my NVIDIA rig with an AM5 motherboard (ASUS B850M-PLUS)

matte-schwartz avatar Jun 17 '25 18:06 matte-schwartz

I grabbed a gpu-trace capture after the issue occurred, and the results are... odd. It looks like some of the game's threads stop running while all the audio threads continue. The obvious gap is where I experienced the freeze:

Image

Image

Image

gpu-trace: https://app.filen.io/#/d/7a48bce5-7821-4e67-b3c1-208602aa6bb1%23f3PN4KX887TbdN04Vz0cs9m2E8ZM7Jms, use https://github.com/mikesart/gpuvis to view

matte-schwartz avatar Jun 17 '25 19:06 matte-schwartz

> I grabbed a gpu-trace capture after the issue occurred, and the results are... odd. It looks like some of the game's threads stop running while all the audio threads continue. The obvious gap is where I experienced the freeze:
>
> gpu-trace: https://app.filen.io/#/d/7a48bce5-7821-4e67-b3c1-208602aa6bb1%23f3PN4KX887TbdN04Vz0cs9m2E8ZM7Jms, use https://github.com/mikesart/gpuvis to view

That describes the issue exactly. Audio always continues for me.

For what it's worth, I'm also on a 9800x3D (MSI X870 Tomahawk).

Arulan avatar Jun 17 '25 22:06 Arulan

Since Bazzite testing updated to 575.64, I enabled the Smooth Motion feature for Clair Obscur, as that game doesn't have frame generation support.

The game now doesn't do the random visual hang described here.

I haven't tested it with smooth motion off yet.

SimpleHeuristics avatar Jun 19 '25 15:06 SimpleHeuristics

Interesting find. 575.64 makes no difference here on its own, but I've yet to experience a freeze if I enable smooth motion.

Maybe this will help NVIDIA track down the issue?

matte-schwartz avatar Jun 19 '25 18:06 matte-schwartz

> Interesting find. 575.64 makes no difference here on its own, but I've yet to experience a freeze if I enable smooth motion.
>
> Maybe will help nvidia track down the issue?

At first I was wondering if it was a GPU-load-related thing, but it doesn't seem to be. I had the game locked at 60 FPS for consistent frame timing for parries etc., and it would still freeze. With Smooth Motion, the displayed framerate is bumped to a consistent 120 FPS on top of the same 60 FPS base framerate and GPU load, and there are no freezes.

SimpleHeuristics avatar Jun 19 '25 19:06 SimpleHeuristics

Interesting. I wonder if some experimenting with DXVK_NVAPI logging/flags could yield something. I've tried to set DXVK_NVAPI_GPU_ARCH to Ada (AD100), but that didn't seem to do anything.

https://github.com/jp7677/dxvk-nvapi/blob/master/README.md
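For others who want to repeat this experiment, the override and logging can be combined into one Steam launch-option string. This is a sketch under assumptions: `DXVK_NVAPI_GPU_ARCH`, `DXVK_NVAPI_LOG_LEVEL` and `DXVK_NVAPI_LOG_PATH` are the variable names as I understand them from the README linked above, the helper name `nvapi_launch_options` is made up, and `/tmp` is just an example log directory.

```shell
# Sketch: print a Steam launch-option string that overrides the reported
# GPU architecture and turns on dxvk-nvapi logging.
# Arguments: $1 = architecture ID (e.g. AD100), $2 = directory for the log file.
nvapi_launch_options() {
    printf 'DXVK_NVAPI_GPU_ARCH=%s DXVK_NVAPI_LOG_LEVEL=info DXVK_NVAPI_LOG_PATH=%s %%command%%\n' \
        "$1" "$2"
}

# Example: report the GPU as Ada (AD100) and log to /tmp (example path).
nvapi_launch_options AD100 /tmp
```

The printed line goes into the game's launch options in Steam; as noted above, the architecture override alone did not change the freezing behaviour.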

Arulan avatar Jun 20 '25 06:06 Arulan

None of the NVAPI settings seem to make a difference here. I'm fairly certain the reason NvPresent avoids the freeze is because of the second swapchain it creates alongside the primary game swapchain.

matte-schwartz avatar Jun 23 '25 07:06 matte-schwartz

got another hang in Clair Obscur: Expedition 33 after ~20 minutes of gameplay on the new 575.64.03 driver:

nvidia-bug-report.log.gz

Did not have Smooth Motion enabled at the time.

matte-schwartz avatar Jul 01 '25 23:07 matte-schwartz

Hi all, we have filed bug 5376205 internally for tracking purposes. We will try to reproduce the issue locally and will get back to you if we need any additional information.

amrit1711 avatar Jul 04 '25 07:07 amrit1711

Enabling smooth motion seems to stop the freezes. I've tested this on at least Expedition 33 and Tekken 8.

felirx avatar Jul 04 '25 08:07 felirx

Funny enough with smooth motion I am getting the same freezes in Control, but with it off no freezes there.

Seems inconsistent as to whether or not Smooth Motion helps or hinders.

SimpleHeuristics avatar Jul 05 '25 06:07 SimpleHeuristics

I'm having similar issues with classic Mists of Pandaria. The game will randomly freeze for a few seconds (sometimes showing an older frame or just black) and then "catch up" to a current frame. Pulling up any kind of other window (alt-tab, start menu etc.) will often unfreeze the game.

I'm on Fedora 42 (kernel 6.15.4) with a 5090 FE (driver version 575.64.03). Logs at the time of the freeze below.

info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Pipeline cache marked dirty. Flush is scheduled.
info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Flushing disk cache (wakeup counter since last flush = 152). It seems like application has stopped creating new PSOs for the time being.
info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Pipeline cache marked dirty. Flush is scheduled.
info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Flushing disk cache (wakeup counter since last flush = 81). It seems like application has stopped creating new PSOs for the time being.
info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Pipeline cache marked dirty. Flush is scheduled.
info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Flushing disk cache (wakeup counter since last flush = 2). It seems like application has stopped creating new PSOs for the time being.
err:vkd3d-proton:d3d12_command_allocator_Reset: There are still 1 pending command lists awaiting execution from command allocator iface 00000000887dbbd0!
err:vkd3d-proton:d3d12_command_allocator_Reset: There are still 1 pending command lists awaiting execution from command allocator iface 00000000887dd9d0!
err:vkd3d-proton:d3d12_command_allocator_Reset: There are still 1 pending command lists awaiting execution from command allocator iface 0000000088af6f20!
err:vkd3d-proton:d3d12_command_allocator_Reset: There are still 1 pending command lists awaiting execution from command allocator iface 0000000088af7c40!
err:vkd3d-proton:d3d12_command_allocator_Reset: There are still 1 pending command lists awaiting execution from command allocator iface 0000000088af8d20!
err:vkd3d-proton:d3d12_command_allocator_Reset: There are still 1 pending command lists awaiting execution from command allocator iface 0000000088af9680!
err:vkd3d-proton:d3d12_command_allocator_Reset: There are still 1 pending command lists awaiting execution from command allocator iface 0000000088af9c20!
err:vkd3d-proton:d3d12_command_allocator_Reset: There are still 1 pending command lists awaiting execution from command allocator iface 0000000096528a20!
err:vkd3d-proton:d3d12_command_allocator_Reset: There are still 1 pending command lists awaiting execution from command allocator iface 00000000887dbf90!
info:vkd3d-proton:vkd3d_get_vk_version: vkd3d-proton - applicationVersion: 2.14.0.
info:vkd3d-proton:vkd3d_instance_init: vkd3d-proton - build: ...
info:vkd3d-proton:vkd3d_memory_info_decide_hvv_usage: Topology: No more than 1 device local heap, HVV access is viable.
info:vkd3d-proton:vkd3d_memory_info_upload_hvv_memory_properties: Topology: HVV usage is allowed, using DEVICE_LOCAL | HOST_COHERENT for UPLOAD.
info:vkd3d-proton:vkd3d_memory_info_init_budgets: Applying resizable BAR budget to memory types: 0x10.
info:vkd3d-proton:vkd3d_bindless_state_get_bindless_flags: Enabling fast paths for advanced ExecuteIndirect() graphics and compute (EXT_dgc).
info:vkd3d-proton:vkd3d_bindless_state_get_bindless_flags: Device supports VK_EXT_mutable_descriptor_type.
info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
info:vkd3d-proton:d3d12_device_caps_init_shader_model: Enabling support for SM 6.6.
info:vkd3d-proton:d3d12_device_caps_init_shader_model: Enabling support for SM 6.7.
info:vkd3d-proton:d3d12_device_caps_init_shader_model: Enabling support for SM 6.8.
fixme:vkd3d-proton:d3d12_device_caps_init_feature_options1: TotalLaneCount = 16320, may be inaccurate.
info:vkd3d-proton:d3d12_device_determine_ray_tracing_tier: DXR support enabled.
info:vkd3d-proton:d3d12_device_determine_ray_tracing_tier: DXR 1.1 support enabled.
info:vkd3d-proton:d3d12_device_caps_init_feature_level: DX Ultimate supported!
info:vkd3d-proton:vkd3d_pipeline_library_init_disk_cache: Remapping VKD3D_SHADER_CACHE to: vkd3d-proton.cache.
info:vkd3d-proton:vkd3d_pipeline_library_init_disk_cache: Attempting to load disk cache from: vkd3d-proton.cache.
info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Performing async setup of stream archive ...
info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_merge: Merging disk caches.
info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_merge: Done merging shader caches, existing entries: 1147, new entries: 76.
info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_merge: Successfully replaced shader cache with merged cache.
info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Merging pipeline libraries took 2.288 ms.
info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Mapping read-only cache took 0.054 ms.
info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Parsing stream archive took 0.262 ms.
info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Done performing async setup of stream archive.
info:vkd3d-proton:dxgi_vk_swap_chain_init: Creating swapchain (3840 x 2160), BufferCount = 2.
info:vkd3d-proton:dxgi_vk_swap_chain_init_sync_objects: Ensure maximum latency of 3 frames with KHR_present_wait.
info:vkd3d-proton:dxgi_vk_swap_chain_init_waiter_thread: Enabling present wait path for frame latency.
info:vkd3d-proton:dxgi_vk_swap_chain_init_sleep_state: Timer interval is 1.0 ms.
info:vkd3d-proton:dxgi_vk_swap_chain_recreate_swapchain_in_present_task: Got 4 swapchain images.
info:vkd3d-proton:dxgi_vk_swap_chain_recreate_swapchain_in_present_task: Got 4 swapchain images.
info:vkd3d-proton:dxgi_vk_swap_chain_recreate_swapchain_in_present_task: Got 4 swapchain images.
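
To compare logs between reporters, the `d3d12_command_allocator_Reset` errors above can be counted with a couple of small helpers. This is a rough sketch: the function names and the `PROTON_LOG_FILE` variable are hypothetical, and the default path assumes the log was produced with `PROTON_LOG=1`, which writes `steam-<appid>.log` to the home directory (1903340 is the app ID from the log attached to the original report).

```shell
# Count "pending command lists" errors logged by vkd3d-proton.
# Argument: $1 = path to a Proton log file.
count_pending_allocator_errors() {
    grep -c 'err:vkd3d-proton:d3d12_command_allocator_Reset' "$1"
}

# Count how many distinct command allocators those errors refer to,
# based on the "iface <address>" field at the end of each error line.
count_stuck_allocators() {
    grep 'err:vkd3d-proton:d3d12_command_allocator_Reset' "$1" \
        | grep -o 'iface [0-9a-f]*' \
        | sort -u \
        | wc -l \
        | tr -d ' '
}

# PROTON_LOG_FILE is a hypothetical override; the default path is an assumption.
log="${PROTON_LOG_FILE:-$HOME/steam-1903340.log}"
if [ -f "$log" ]; then
    echo "allocator reset errors: $(count_pending_allocator_errors "$log")"
    echo "distinct allocators:    $(count_stuck_allocators "$log")"
fi
```

Whether these errors are a cause or just a symptom of the freeze is not established; the counts only make it easier to see if other affected systems log the same pattern.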

vuryn avatar Jul 09 '25 11:07 vuryn

I'm having similar issues, though my freezes seem to get fixed by simply moving my mouse. I first encountered this issue when playing Wonderlands, where it was only mildly annoying since I'd constantly be moving my mouse anyway. In Clair Obscur it's much more of a problem, since I'm playing on a controller. I also noticed that if it happens in cutscenes, audio and video are desynced by the duration of the freeze after it unfreezes. I have an RTX 5080 and I'm using Arch btw, with nvidia-open 575.64.03-3, xorg-xwayland 24.1.8-1, proton-experimental.

roman-vorobiov avatar Jul 11 '25 20:07 roman-vorobiov

Depending on your setup, moving the mouse may be enough to "wake" it up, so to speak, similar to how pulling up an overlay unfreezes it for me.

matte-schwartz avatar Jul 11 '25 21:07 matte-schwartz

~~were there any potential fixes for 5376205 in 580.65.06? so far I have not had any freezes in Clair Obscur: Expedition 33 after 50 minutes, but I still need to keep playing and re-check other games to confirm this~~

Still froze eventually, and unfroze with the same workaround

matte-schwartz avatar Aug 04 '25 17:08 matte-schwartz

I was able to reduce the frequency of the freezes from every couple of minutes to every few hours with the following changes (not sure which one actually made the difference):

  • disable fTPM in the BIOS
  • use a dedicated compositor when launching games: gamescope -f --backend sdl %command%
  • bump process priority: sudo setcap 'CAP_SYS_NICE=eip' $(which gamescope)
  • I also saw people saying that lowering the PCIe speed to Gen 4 helped them, but this did nothing for me
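
To check what link speed the card is actually running at before and after changing the PCIe setting, the negotiated speed can be read from sysfs. This is a sketch with assumptions: the helper names are made up, the `current_link_speed` file is expected to contain a string such as "16.0 GT/s PCIe", and 0x10de is NVIDIA's PCI vendor ID. The link may downshift while the GPU is idle, so it is best read while a game is running.

```shell
# Map a sysfs link-speed string (e.g. "16.0 GT/s PCIe") to a PCIe generation.
pcie_gen_from_speed() {
    case "$1" in
        2.5*) echo 1 ;;
        5*)   echo 2 ;;
        8*)   echo 3 ;;
        16*)  echo 4 ;;
        32*)  echo 5 ;;
        *)    echo unknown ;;
    esac
}

# Report the currently negotiated PCIe generation for one device.
# Argument: $1 = sysfs device directory, e.g. /sys/bus/pci/devices/0000:01:00.0
current_pcie_gen() {
    pcie_gen_from_speed "$(cat "$1/current_link_speed")"
}

# Print the generation for every NVIDIA PCI function (vendor ID 0x10de).
for dev in /sys/bus/pci/devices/*; do
    [ -r "$dev/vendor" ] || continue
    [ "$(cat "$dev/vendor")" = "0x10de" ] || continue
    [ -r "$dev/current_link_speed" ] || continue
    echo "$dev: PCIe Gen $(current_pcie_gen "$dev")"
done
```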

roman-vorobiov avatar Aug 04 '25 18:08 roman-vorobiov

I have been trying to troubleshoot this problem for a few weeks now. I was getting the same issues: FPS would drop to nothing, and the game would unfreeze after a few moments or when I brought another application into focus. Trying the changes listed in this thread didn't yield much.

So next I decided to turn off any overclocks, starting with DOCP/XMP. After disabling the overclock on my memory, a game that would freeze a few times per hour hasn't done it once in three hours. I am going to continue testing, but my case seems to have stemmed from an unstable overclock on the memory controller. (Also running on a 9800X3D.)

admchatm avatar Aug 05 '25 22:08 admchatm

That was one of the first things I tried before reporting the issue, unfortunately did not seem to make a difference here. I will check again though.

matte-schwartz avatar Aug 05 '25 23:08 matte-schwartz

Hi Matte. Was curious if you saw any success. After several days, I haven't had a single performance drop like the ones we were seeing. I did find something else interesting, though, that could also be worth investigating. While I was having issues, lsfg-vk was running in the background, latching onto GameThread. While troubleshooting I disabled that background process, which might also have had something to do with it. When I re-enabled it, I got drops to 0 FPS, but they only lasted for a split second.

admchatm avatar Aug 09 '25 15:08 admchatm

Been a bit busy, but in early testing, setting memory timings to Manual and leaving everything at stock in my BIOS (rather than Auto) does seem more stable. Need to try more games in my library to confirm.

matte-schwartz avatar Aug 13 '25 07:08 matte-schwartz

The freezes persist without any memory overclock after all (DDR5 @ 4800 MHz). Will try different RAM, but I'm still leaning towards this being a driver issue.

Also, the freezing seems like it may be specific to using D3D. Star Citizen DX11 freezes while I have not had a single freeze using Vulkan instead.

matte-schwartz avatar Aug 18 '25 22:08 matte-schwartz

Issue persists with new RAM from a different vendor as well, not looking like a memory issue imo.

matte-schwartz avatar Aug 19 '25 17:08 matte-schwartz

I can confirm the same behaviour on Garuda Dragonized (zen kernel 6.16.1) with the 580 drivers. Hardware: RTX 5090 and Intel Core Ultra 265K, i.e. the CPU can be removed from the equation. Proton logs don't show anything interesting or unusual; I tried experimental, GE, and CachyOS builds.

EugeneSlepov avatar Aug 19 '25 18:08 EugeneSlepov