
GPU has fallen off the bus on suspend

Open salperinlea opened this issue 4 years ago • 1 comments

Prerequisites

  • [X] I have searched open and closed issues for duplicates.
  • [X] I'm using the latest released stable version

Hello, I am currently running Elementary on a new Dell XPS 15 with an NVIDIA GeForce 1650. When using said GPU, I am unable to suspend the computer. When I do so, I receive the following error on resuming from suspend:

NVRM: Xid (PCI:0000:01:00): 79, pid=589, GPU has fallen off the bus.
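For anyone triaging similar reports, here is a minimal sketch of pulling the fields out of an Xid line like the one above. The regular expression is an assumption based only on the single line quoted in this report; other driver versions may format the message differently. The names `XID_RE` and `parse_xid` are hypothetical helpers, not part of any NVIDIA tool.

```python
import re

# Pattern modelled on the single NVRM line quoted in this report.
# Assumption: other driver versions may word or order the fields differently.
XID_RE = re.compile(
    r"NVRM: Xid \(PCI:(?P<pci>[0-9a-fA-F:.]+)\): "
    r"(?P<xid>\d+), pid=(?P<pid>\d+), (?P<msg>.*)"
)


def parse_xid(line):
    """Return a dict with pci, xid, pid and msg, or None if the line does not match."""
    match = XID_RE.search(line)
    if match is None:
        return None
    return {
        "pci": match.group("pci"),
        "xid": int(match.group("xid")),
        "pid": int(match.group("pid")),
        "msg": match.group("msg").strip(),
    }


sample = "NVRM: Xid (PCI:0000:01:00): 79, pid=589, GPU has fallen off the bus."
print(parse_xid(sample))
```

Filtering a saved kernel log through `parse_xid` makes it easier to see whether Xid 79 is the only code reported around each suspend attempt.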

Setting the GPU to persistence mode does not fix the problem, and occasionally generates a whole new string of ACPI errors instead. These read:

Jun 27 01:10:14 logos kernel: [ 3708.055085] ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.PEG0.PEGP.BRT6.LCD], AE_NOT_FOUND (20190703/psargs-330)
Jun 27 01:10:14 logos kernel: [ 3708.055115] No Local Variables are initialized for Method [BRT6]
Jun 27 01:10:14 logos kernel: [ 3708.055119] Initialized Arguments for Method [BRT6]: (2 arguments defined for method invocation)
Jun 27 01:10:14 logos kernel: [ 3708.055121] Arg0: 000000005f8a557c <Obj> Integer 0000000000000002
Jun 27 01:10:14 logos kernel: [ 3708.055131] Arg1: 0000000079d417db <Obj> Integer 0000000000000000
Jun 27 01:10:14 logos kernel: [ 3708.055147] ACPI Error: Aborting method \_SB.PCI0.PEG0.PEGP.BRT6 due to previous error (AE_NOT_FOUND) (20190703/psparse-531)
Jun 27 01:10:14 logos kernel: [ 3708.055643] ACPI Error: Aborting method \EV5 due to previous error (AE_NOT_FOUND) (20190703/psparse-531)
Jun 27 01:10:14 logos kernel: [ 3708.056025] ACPI Error: Aborting method \SMEE due to previous error (AE_NOT_FOUND) (20190703/psparse-531)
Jun 27 01:10:14 logos kernel: [ 3708.056326] ACPI Error: Aborting method \SMIE due to previous error (AE_NOT_FOUND) (20190703/psparse-531)
Jun 27 01:10:14 logos kernel: [ 3708.056618] ACPI Error: Aborting method \NEVT due to previous error (AE_NOT_FOUND) (20190703/psparse-531)
Jun 27 01:10:14 logos kernel: [ 3708.056805] ACPI Error: Aborting method \_SB.PCI0.LPCB.ECDV._Q66 due to previous error (AE_NOT_FOUND) (20190703/psparse-531)

I have tried boot-parameter related fixes, setting acpi_osi=Linux, or acpi_osi=Windows, to no avail.
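The ACPI log above shows one unresolved symbol followed by a chain of aborted methods. As a rough sketch, assuming the "Aborting method ... due to previous error (...)" wording stays as quoted, the chain can be extracted like this. `ABORT_RE` and `aborted_methods` are hypothetical names, and the sample text is copied from two of the log entries above.

```python
import re

# Matches the "Aborting method" entries as worded in the log above.
# Assumption: the kernel's ACPICA messages keep this exact phrasing.
ABORT_RE = re.compile(
    r"ACPI Error: Aborting method (\S+) due to previous error \((\w+)\)"
)


def aborted_methods(log_text):
    """Return a list of (method_path, error_code) tuples in log order."""
    return ABORT_RE.findall(log_text)


# Two entries copied from the log in this report (raw strings keep the backslashes).
sample = (
    r"kernel: [ 3708.055147] ACPI Error: Aborting method \_SB.PCI0.PEG0.PEGP.BRT6 "
    r"due to previous error (AE_NOT_FOUND) (20190703/psparse-531) "
    r"kernel: [ 3708.055643] ACPI Error: Aborting method \EV5 "
    r"due to previous error (AE_NOT_FOUND) (20190703/psparse-531)"
)

for method, error in aborted_methods(sample):
    print(method, error)
```

Because `findall` scans the whole text, this works whether the log entries are on separate lines or run together.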

When I contacted NVIDIA about this, they were uniquely unhelpful, directing me to unresolved Arch Linux forum posts from 2012, so I apologize if this isn't the perfect place to post this.

I have seen suggestions that this is related to the device overheating. I will note that I run into overheating constantly; Zoom calls alone are enough to get my entire computer up to 100 °C within seconds.

Device specifications are attached. settings

salperinlea, Jun 27 '20 20:06

Forgot to mention that pcie_aspm=off has not fixed the problem, and that this error has persisted in both NVIDIA driver 440.59 and 440.100, to which I updated yesterday.
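When testing boot parameters such as pcie_aspm=off or acpi_osi=..., it helps to confirm that the running kernel actually received them. The sketch below parses a kernel command line string; on a live Linux system that string would normally come from /proc/cmdline. The helper `parse_cmdline` and the sample command line are hypothetical illustrations, not taken from the reporter's machine.

```python
import shlex


def parse_cmdline(cmdline):
    """Map each kernel parameter to the list of values it was given.

    Bare flags such as "quiet" map to [None]. A parameter may repeat
    (acpi_osi commonly does), so values are collected in a list.
    """
    params = {}
    # shlex.split keeps quoted values such as acpi_osi="Windows 2015" together.
    for token in shlex.split(cmdline):
        key, sep, value = token.partition("=")
        params.setdefault(key, []).append(value if sep else None)
    return params


# Hypothetical command line for illustration only; on a real system use
# the contents of /proc/cmdline instead.
sample = "BOOT_IMAGE=/boot/vmlinuz root=/dev/sda1 quiet pcie_aspm=off acpi_osi=Linux"

active = parse_cmdline(sample)
for name in ("pcie_aspm", "acpi_osi"):
    print(name, "->", active.get(name, "not set"))
```

If a parameter shows as "not set" after a reboot, the bootloader configuration was probably not regenerated, which would explain a fix appearing to have no effect.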

salperinlea, Jun 27 '20 20:06