Nvidia-open Driver Crashes on PREEMPT_RT Kernel
NVIDIA Open GPU Kernel Modules Version
575.64
Please confirm this issue does not happen with the proprietary driver (of the same version). This issue tracker is only for bugs specific to the open kernel driver.
- [ ] I confirm that this does not happen with the proprietary driver package.
Operating System and Version
Ubuntu 22.04
Kernel Release
6.12 LTS
Please confirm you are running a stable release kernel (e.g. not a -rc). We do not accept bug reports for unreleased kernels.
- [x] I am running on a stable kernel release.
Hardware: GPU
RTX 5090
Describe the bug
Using the Nvidia open-source kernel modules (nvidia-open) with a PREEMPT_RT patched kernel leads to a consistent crash of the display manager upon system boot. The system's graphical user interface fails to launch. The driver installation itself, using the official NVIDIA .run file, completes without errors. The issue is resolved by switching the kernel's preemption model to PREEMPT_DYNAMIC, which indicates a specific conflict with the full real-time kernel configuration.
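Since the behaviour depends on the preemption model, it helps to confirm which one the running kernel was built with. The following is a minimal sketch, assuming the kernel embeds a tag such as `PREEMPT_RT` or `PREEMPT_DYNAMIC` in the version banner printed by `uname -v` (true for kernels around the 6.12 series discussed here; the banner format can differ on other versions). The function name `preemption_model` is made up for this example.

```python
import platform

# Preemption tags that may appear in the kernel version banner (`uname -v`).
KNOWN_MODELS = ("PREEMPT_RT", "PREEMPT_DYNAMIC", "PREEMPT")


def preemption_model(version_banner):
    """Return the preemption tag found in a `uname -v` style string, or None."""
    # Compare whole whitespace-separated tokens so that "PREEMPT" does not
    # accidentally match inside "PREEMPT_RT" or "PREEMPT_DYNAMIC".
    tokens = set(version_banner.split())
    for model in KNOWN_MODELS:
        if model in tokens:
            return model
    return None


if __name__ == "__main__":
    # On Linux, platform.version() returns the same string as `uname -v`.
    print(preemption_model(platform.version()) or "no preemption tag found")
```

A result of `PREEMPT_RT` corresponds to the failing configuration described above, and `PREEMPT_DYNAMIC` to the working one.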
To Reproduce
1. Perform a clean installation of a supported Linux distribution.
2. Install and boot into a PREEMPT_RT patched kernel.
3. Install the NVIDIA driver using the .run file, ensuring the nvidia-open modules are selected and compiled.
4. Reboot the system.
Bug Incidence
Always
nvidia-bug-report.log.gz
Cannot create the log file: the crash also takes down the recovery terminal, so there is no way to collect it.
More Info
No response
If it helps, I tested 24.04 (with the 565 and 570 drivers) and later 25.05 (with 575) with preempt full and it worked (using the ones from Ubuntu mainline).
Since you are using the .run file, you may need to pass the following manually: `IGNORE_PREEMPT_RT_PRESENCE=1`
Also see: https://gitlab.archlinux.org/archlinux/packaging/packages/nvidia-utils/-/blob/main/PKGBUILD?ref_type=heads#L76
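As a rough illustration of what "passing it manually" can look like, here is a sketch that prepares the installer command with the variable set in its environment. It only builds the command and does not run anything. The helper name `installer_invocation` and the .run file name are placeholders, not taken from NVIDIA's documentation; only the variable name and value come from this thread.

```python
import os


def installer_invocation(run_file, base_env=None):
    """Return (argv, env) for launching the installer with the override set."""
    env = dict(os.environ if base_env is None else base_env)
    # Variable and value as quoted in this thread.
    env["IGNORE_PREEMPT_RT_PRESENCE"] = "1"
    return ["sh", run_file], env


if __name__ == "__main__":
    # Hypothetical file name; substitute the .run file you actually downloaded.
    argv, env = installer_invocation("./NVIDIA-Linux-x86_64-575.64.run")
    print("IGNORE_PREEMPT_RT_PRESENCE=" + env["IGNORE_PREEMPT_RT_PRESENCE"], *argv)
    # To actually launch it from a root shell:
    #   import subprocess; subprocess.run(argv, env=env, check=True)
```

One thing to watch: `sudo` resets the environment by default, so a variable exported in the calling shell may not reach the installer unless it is set inside the root session or on the `sudo` command line.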
Yes I'm already using the IGNORE_PREEMPT_RT_PRESENCE=1. Also nice seeing you here. Recently started using Cachyos on my personal machine. Enjoying it so far.
Because I have an RTX 5090, I cannot use the 565 driver. The 565 driver works with older GPUs on an RT kernel, but not with the RTX 5090, which needs at least the 570 driver. Also, @luisalvarado, what kernel version did you test? The problem only happens on the RTX 5090; I have successfully tested with an RTX 4060 and an RTX A5000.
For the 5090 I tested 570 and 575. For the 4090 I only tested 565 and then 570.
Hmm, that's weird. I tested both 570 and 575, and both had the same problem on the RTX 5090. With preempt dynamic, both drivers work. Also, I'm on Ubuntu 22.04 and compiling the kernel from source.
Then by the looks of it, the problem is 22.04 and the kernel on it. Updating to either 24.04 or the latest release should fix it.
Hi, I have had exactly the same issue for months with an RTX 5090, and the `IGNORE_PREEMPT_RT_PRESENCE=1` parameter makes no difference. I am able to get into the GNOME environment, but when I open any tabs it starts freezing, and after a few seconds I can't do anything at all.
Hey there. Could you try with this change applied? https://github.com/NVIDIA/open-gpu-kernel-modules/commit/7cc00d318f788b7cf186dd4b9e0a886cf0baf82c
A similar change will show up in one of the future release branches, though I'm not sure which.
Thank you @mtijanic! I never tried the latest driver version (580.105.08), but it wasn't working on 580.76.05. After these changes, I can run the NVIDIA drivers on a PREEMPT_RT kernel. Thanks for your help!
The patch should apply cleanly to all branches, even the proprietary closed source driver, so no need to use the latest version.
I'll leave the bug open until the changes make their way to an official release. For reference, this is tracked internally as bugs 5480625, 5610356, 5319037
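For anyone applying the commit linked above by hand before it ships in a release, here is one possible way to fetch and apply it. This is a sketch, not an official procedure: it relies on GitHub serving a commit as a patch when `.patch` is appended to its URL, and on `git apply` reading a patch from standard input. The helper names `patch_url` and `apply_commit` and the `source_dir` argument are made up for the example.

```python
import subprocess
import urllib.request

COMMIT_URL = ("https://github.com/NVIDIA/open-gpu-kernel-modules/commit/"
              "7cc00d318f788b7cf186dd4b9e0a886cf0baf82c")


def patch_url(commit_url):
    """Build the patch URL GitHub exposes for a commit page URL."""
    return commit_url.rstrip("/") + ".patch"


def apply_commit(source_dir, commit_url=COMMIT_URL):
    """Download the patch and apply it inside source_dir (a source checkout)."""
    with urllib.request.urlopen(patch_url(commit_url)) as response:
        patch = response.read()
    # Dry run first, so a patch that does not fit leaves the tree untouched.
    subprocess.run(["git", "apply", "--check", "-"],
                   input=patch, cwd=source_dir, check=True)
    subprocess.run(["git", "apply", "-"],
                   input=patch, cwd=source_dir, check=True)


if __name__ == "__main__":
    # Only prints the URL; call apply_commit("<path to checkout>") to apply.
    print(patch_url(COMMIT_URL))
```

If the dry run fails, `check=True` raises `CalledProcessError` before anything in the tree is modified.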
Change has landed in 590.44.01, closing. https://github.com/NVIDIA/open-gpu-kernel-modules/blob/590.44.01/kernel-open/common/inc/nv-lock.h