GNOME fails to start after installing NVIDIA proprietary driver
Over the last two days I've tried more than five times to install the latest NVIDIA proprietary driver, and after installation GNOME fails to start: I'm left at a black screen with a flashing cursor. I'm able to access the system via SSH and via a terminal with CTRL+ALT+F2.
I've followed the tutorial here: https://docs.01.org/clearlinux/latest/tutorials/nvidia.html
My system information:
H/W path Device Class Description
======================================================
system To Be Filled By O.E.M. (To Be Filled By O.E.M.)
/0 bus Z390 Taichi
/0/0 memory 64KiB BIOS
/0/10 memory 32GiB System Memory
/0/10/0 memory 8GiB DIMM DDR4 Synchronous 3200 MHz (0.3 ns)
/0/10/1 memory 8GiB DIMM DDR4 Synchronous 3200 MHz (0.3 ns)
/0/10/2 memory 8GiB DIMM DDR4 Synchronous 3200 MHz (0.3 ns)
/0/10/3 memory 8GiB DIMM DDR4 Synchronous 3200 MHz (0.3 ns)
/0/1f memory 512KiB L1 cache
/0/20 memory 2MiB L2 cache
/0/21 memory 16MiB L3 cache
/0/22 processor Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz
/0/100 bridge 8th Gen Core 8-core Desktop Processor Host Bridge/DRAM Registers [Coffee Lake S]
/0/100/1 bridge Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16)
/0/100/1/0 display GP104 [GeForce GTX 1070]
/0/100/1/0.1 multimedia GP104 High Definition Audio Controller
/0/100/1.1 bridge Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x8)
/0/100/1.1/0 network BCM4360 802.11ac Wireless Network Adapter
/0/100/1.2 bridge Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x4)
/0/100/1.2/0 bus ASM1042A USB 3.0 Host Controller
/0/100/1.2/0/0 usb3 bus xHCI Host Controller
/0/100/1.2/0/0/1 multimedia EVGA NU Audio
/0/100/1.2/0/1 usb4 bus xHCI Host Controller
/0/100/12 generic Cannon Lake PCH Thermal Controller
/0/100/14 bus Cannon Lake PCH USB 3.1 xHCI Host Controller
/0/100/14/0 usb1 bus xHCI Host Controller
/0/100/14/0/3 bus ASM107x
/0/100/14/0/4 input Gaming Mouse G502
/0/100/14/0/d bus USB2.0 Hub
/0/100/14/1 usb2 bus xHCI Host Controller
/0/100/14/1/7 bus ASM107x
/0/100/14.2 memory RAM memory
/0/100/16 communication Cannon Lake PCH HECI Controller
/0/100/17 storage Cannon Lake PCH SATA AHCI Controller
/0/100/1c bridge Cannon Lake PCH PCI Express Root Port #7
/0/100/1c/0 /dev/fb0 bridge ASM1184e PCIe Switch Port
/0/100/1c/0/1 bridge ASM1184e PCIe Switch Port
/0/100/1c/0/1/0 wlp6s0 network Dual Band Wireless-AC 3168NGW [Stone Peak]
/0/100/1c/0/3 bridge ASM1184e PCIe Switch Port
/0/100/1c/0/3/0 enp7s0 network I211 Gigabit Network Connection
/0/100/1c/0/5 bridge ASM1184e PCIe Switch Port
/0/100/1c/0/7 bridge ASM1184e PCIe Switch Port
/0/100/1c/0/7/0 storage ASM1062 Serial ATA Controller
/0/100/1f bridge Z390 Chipset LPC/eSPI Controller
/0/100/1f.4 bus Cannon Lake PCH SMBus Controller
/0/100/1f.5 bus Cannon Lake PCH SPI Controller
/0/100/1f.6 eno1 network Ethernet Connection (7) I219-V
What version of Clear Linux and what version of the NVIDIA driver?
I noticed 430.x on the Clear Linux 5.3 kernel is not working. I haven't had a chance to see if this is a Clear Linux specific issue or an upstream issue. A workaround is to interrupt the bootloader by holding the SPACE key as the system boots and select an older 5.2.x kernel to boot.
Sorry, I should have specified the versions.
This was with the 5.3 kernel and NVIDIA Driver 430.5.
I just tried the 435.21 drivers today with 5.3 and had no issues installing or booting in to GNOME after first install. The system was stable until the first reboot. After reboot I was left with a flashing cursor and was no longer able to access terminal, system was unresponsive.
Reverting to a 5.2.x kernel in the meantime.
@gmatler did the combo kernel 5.2.x + NVIDIA 430.5 get you a successful boot?
@mrkz rolling back to 31080 with a 5.2 kernel works for me. Anything newer does not.
The driver appears to build and install ok and load ok:
$ lsmod | grep ^nvidia
nvidia_drm 45056 0
nvidia_modeset 1122304 1 nvidia_drm
nvidia 19513344 1 nvidia_modeset
The X.org log is pretty generic:
[ 145.811] (II) NVIDIA GLX Module 435.21 Sun Aug 25 08:14:27 CDT 2019
[ 145.811] (II) NVIDIA: The X server does not support PRIME Render Offload.
[ 146.187] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA GPU at PCI:1:0:0. Please
[ 146.187] (EE) NVIDIA(GPU-0): check your system's kernel log for additional error
[ 146.187] (EE) NVIDIA(GPU-0): messages and refer to Chapter 8: Common Problems in the
[ 146.187] (EE) NVIDIA(GPU-0): README for additional information.
[ 146.187] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA graphics device!
[ 146.187] (EE) NVIDIA(0): Failing initialization of X screen
The only thing that sticks out to me from journalctl are these lines. On a successful start these do not appear:
2:44 kernel: nvidia 0000:01:00.0: DMAR: 32bit DMA uses non-identity mapping
2:44 kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x24:0x59:1184)
2:44 kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
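For anyone wanting to check whether they are hitting the same failure, the lines above can be pulled out of the kernel log with a small filter. This is only a sketch: the function name filter_gpu_init_errors is made up for illustration, and the patterns simply match the NVRM: and DMAR: prefixes seen in the log excerpt above.

```shell
# Keep only the kernel log lines relevant to this failure:
# NVIDIA resource manager (NVRM) messages and IOMMU (DMAR) messages.
# Reads log text on stdin, writes matching lines to stdout.
filter_gpu_init_errors() {
    grep -E 'NVRM:|DMAR:'
}

# Usage on a live system (kernel messages from the current boot only):
#   journalctl -k -b | filter_gpu_init_errors
```

If the output contains "RmInitAdapter failed", the symptom matches what is reported in this thread.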
Another data point: I rebuilt the 5.2.17-836 kernel and booted it on Clear Linux version 31230 and the nvidia driver started working again.
This indicates to me the issue is indeed somewhere in the kernel, and not another gcc/gnome change.
Another finding that can hopefully help root cause: changing the kernel parameter intel_iommu=igfx_off to intel_iommu=off also resolves the issue on the 5.3 kernel.
EDIT: intel_iommu=on also works.
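To avoid editing the command line by hand on every boot, the parameter can probably be made persistent. The sketch below assumes the documented Clear Linux convention that clr-boot-manager appends the contents of files under /etc/kernel/cmdline.d/ to the kernel command line; the function name add_kernel_param and the file name iommu.conf are made up for illustration, and none of this was tested in this thread.

```shell
# Write a drop-in file containing one kernel parameter.
#   $1: the parameter, e.g. intel_iommu=off
#   $2: config root (defaults to /etc/kernel; override for a dry run)
add_kernel_param() {
    param="$1"
    root="${2:-/etc/kernel}"
    mkdir -p "$root/cmdline.d"
    # iommu.conf is an arbitrary, hypothetical file name.
    echo "$param" > "$root/cmdline.d/iommu.conf"
}

# Usage on a real system (as root), followed by regenerating the boot entries:
#   add_kernel_param intel_iommu=off
#   clr-boot-manager update
```

After a reboot, cat /proc/cmdline should show whether the parameter was actually picked up.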
So you say it's a kernel bug? I mean, it's a coincidence GNOME was updated at the same time?
@puneetse any updates on this? I literally can't use Clear Linux...
@SPAstef Sorry I don't have much of an update, but try the workaround I posted above. It should at least unblock you. You can hold SPACE to interrupt the bootloader and hit e to edit the kernel command-line.
One more thing I was able to test: the "mainline" kernel package also has the issue, which tells me it's probably not one of the Clear Linux kernel patches causing this. Maybe it's a conflicting config or upstream bug (but I'd expect more noise if it was)
OK, I didn't know about that SPACE interrupt thing. I have it in dual boot with Windows; I hope it will work anyway. What would be the problem with shipping the kernel with the parameter you mentioned already set?
Likely an issue between the NVIDIA driver and the specific kernel version. Changing the kernel command line in this case would have a negative impact on other uses, though, so we wouldn't be likely to make it the default.
Thanks. Another issue that I have had only since the latest version of GNOME: the Xorg session doesn't recognize the external monitor (attached to the GPU via DisplayPort), while Wayland does. (This happens with the Nouveau drivers.) Should I open a new issue for this?
EDIT: anyway it did kinda work... While I was writing this message saying it didn't, it actually did. Took it 5 minutes to show GDM but in the end it did it...
With the latest version it seems that even setting the intel_iommu option doesn't work anymore.
@SPAstef It's still working for me on 31380. Check cat /proc/cmdline to make sure it really booted without the intel_iommu option.
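A quick way to do that check without reading the whole line by eye is to print only the intel_iommu setting the running kernel was booted with. This is a minimal sketch; the function name show_intel_iommu is made up for illustration.

```shell
# Print every intel_iommu= setting found in a kernel command line.
# Reads /proc/cmdline by default; pass a file path to inspect another file.
show_intel_iommu() {
    cmdline_file="${1:-/proc/cmdline}"
    # Put each space-separated parameter on its own line, then filter.
    # If grep finds nothing, say so explicitly.
    tr ' ' '\n' < "$cmdline_file" | grep '^intel_iommu=' || echo "intel_iommu not set"
}

# Usage:
#   show_intel_iommu
```

If more than one line is printed, the parameter appears more than once on the command line, which is worth knowing when a default value and a manually added value are both present.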
Likely an issue between nvidia driver and the specific kernel version.
@bryteise while that is a recurring cat-and-mouse game, I don't think this particular issue is a generic NVIDIA vs Linux kernel problem.
I would expect more noise from other distro users if that was the case. I tested a Fedora 30/31 system with NVIDIA drivers (5.3.6 kernel, GNOME 3.32/3.34, with and without intel_iommu=igfx_off) and it works fine. So I suspect this is more specific to Clear Linux somehow.
@puneetse now it works again, but the internal display isn't recognized (this might be related to the kernel option, since the internal display is connected to the Intel on-board GPU). Still hoping this will get fixed soon :smile:
I have the same issue, and it has been resolved for now by removing intel_iommu=igfx_off and adding intel_iommu=on as @SPAstef suggested.
I've had this issue since I installed Clear Linux last week, but I was using the LTS kernel to get around it. This is the first time I'm seeing this, so I can't say if anything changed or had an effect.
For completely different reasons, I happened to disable Intel VT-d in the BIOS, and now everything seems to be working normally. I don't know if it is related to this... Or did you fix it silently?
I just wanted to confirm that disabling Intel VT-d in the BIOS resolved this issue for me. I had no graphical boot, with the same generic failure message in my Xorg log. After disabling VT-d I booted into GDM / GNOME just fine. I did not change the intel_iommu setting; it is currently set to intel_iommu=igfx_off, and I have the internal graphics card on the system disabled in the BIOS as well.
I have this problem as well. I have followed all the steps here and get the “Oh no! Something has gone wrong” error screen. My BIOS does not allow me to disable the integrated graphics. This is on a Razer Blade 13".
My laptop doesn't either; you shouldn't need to disable it. Just follow the additional steps for Optimus laptops.
Just follow the additional steps for Optimus laptops
Am I missing what these additional steps are? The only reference I see to Optimus in the docs is to turn off the iGPU via firmware.
I'm unsure how to do that since my BIOS doesn't have that option.
They've just been lazy writing that tutorial. Follow this: https://community.clearlinux.org/t/bash-scripts-to-automate-installation-of-nvidia-proprietary-driver/368
I found a solution. I installed lightdm with sudo apt install lightdm and reconfigured it with sudo dpkg-reconfigure lightdm. I don't understand why gdm3 fails to start GNOME with NVIDIA. My laptop runs Debian 10: Linux 4.19.0-6-amd64 #1 SMP Debian 4.19.67-2+deb10u2 (2019-11-11) x86_64 GNU/Linux
Didn't work for me for Wayland.