AMDGPU Crashing Kernel When Suspending
Please confirm there isn't an existing open bug report
- [X] I have searched open bugs for this issue
Summary
- Motherboard: Asus Z97 Pro Gamer
- CPU: i5-4570
- GPU: Sapphire Radeon 550 4GB
- Kernel version: 6.11.5-30.current
- Current sync: 10/30/2024
A few weeks or months ago (I forget exactly how long), after one of the weekly syncs (I believe the one that moved to Linux 6.10, though it could have been a later one), my desktop PC stopped suspending properly. This happens whether I suspend via the power button (which is what I had it set to) or via the user indicator on Budgie Panel. I noticed it because, when I tried to wake the PC, the fans would instead start for half a second, stop, and then start again, and the PC would POST as normal, as though I had just turned it on (cold boot).
This has gone on unchanged through the last few syncs, including the one from a few minutes ago; I just never got around to posting this issue until now.
Checking logs did not reveal anything, so I turned to dmesg while suspending, and the last lines referred to AMDGPU in connection to the Linux kernel crashing.
Logs will be attached soon, after I crash the kernel again to obtain the last lines.
Steps to reproduce
- Press Suspend.
- System suspends.
- Kernel crashes.
- PC turns off.
Expected result
The PC should suspend normally, without the kernel crashing.
Actual result
The kernel crashed, and the PC turned off.
Environment
- [X] Is system up to date?
Repo
Shannon (stable)
Desktop Environment
Budgie
System details
System:
Host: moriel-pc Kernel: 6.11.5-307.current arch: x86_64 bits: 64
Desktop: Budgie v: 10.9.2 Distro: Solus 4.6 convergence
Machine:
Type: Desktop System: ASUS product: All Series v: N/A
serial:
Other comments
Update: Unfortunately, I was unable to capture the very moment the kernel crashed, because it crashed before dmesg managed to output anything, so the logs are incomplete (the crash happens right after the system enters the suspended state).
journal.log
Unfortunately, the logs don't have anything after the suspend operation, so they don't contain the kernel crash information.
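For anyone retracing this, here is a minimal sketch of how the previous boot's kernel messages could be pulled from the journal and narrowed down to suspend- and GPU-related lines. It assumes a persistent journal is enabled, and the function names and keyword list are hypothetical choices for illustration, not something taken from the attached log. If the kernel dies before the journal is flushed to disk, the crash lines will still be missing.

```python
import subprocess

# Illustrative keywords only (an assumption); adjust to whatever you are hunting for.
KEYWORDS = ("amdgpu", "PM: suspend", "panic", "Oops")


def filter_kernel_lines(lines, keywords=KEYWORDS):
    """Return only the lines that mention one of the keywords (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    return [ln for ln in lines if any(k in ln.lower() for k in lowered)]


def previous_boot_kernel_log():
    """Read the kernel messages of the previous boot via journalctl.

    -k limits output to kernel messages, -b -1 selects the previous boot.
    check=True raises CalledProcessError if journalctl exits with a
    non-zero status.
    """
    result = subprocess.run(
        ["journalctl", "-k", "-b", "-1", "--no-pager"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()
```

After a failed suspend and the following cold boot, `filter_kernel_lines(previous_boot_kernel_log())` would return whatever matching lines made it to disk.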
I noticed that systemd recorded a coredump a few minutes before the power button was pressed to suspend the system:
coredump
Oct 30 03:50:30 moriel-pc systemd[1]: Created slice system-systemd\x2dcoredump.slice - Slice /system/systemd-coredump.
Oct 30 03:50:30 moriel-pc systemd[1]: Started [email protected] - Process Core Dump (PID 5180/UID 0).
Oct 30 03:50:32 moriel-pc systemd-coredump[5181]: Process 2478 (im.nheko.Nheko) of user 1000 dumped core.
Stack trace of thread 2478:
#0 0x00007f77916aa87b pthread_kill (libc.so.6 + 0x9e87b)
#1 0x00007f7791650316 raise (libc.so.6 + 0x44316)
#2 0x000055c9df2681a6 _Z17stacktraceHandleri (im.nheko.Nheko + 0xa6d1a6)
#3 0x00007f77916503c0 n/a (libc.so.6 + 0x443c0)
#4 0x000055ca02d10940 n/a (n/a + 0x0)
#5 0x00007f77935b2aa7 event_add_nolock_ (libevent_core-2.1.so.7 + 0x19aa7)
#6 0x00007f77935b2ddc event_add (libevent_core-2.1.so.7 + 0x19ddc)
#7 0x00007f7794ae9410 _ZN6coeurl6Client7addsockEii (libcoeurl.so.0.3 + 0xa410)
#8 0x00007f7794ae982c _ZN6coeurl6Client7sock_cbEPviiS1_S1_ (libcoeurl.so.0.3 + 0xa82c)
#9 0x00007f7791b3cc0b n/a (libcurl.so.4 + 0x5cc0b)
#10 0x00007f7791afa592 n/a (libcurl.so.4 + 0x1a592)
#11 0x00007f7791afa96d n/a (libcurl.so.4 + 0x1a96d)
#12 0x00007f7791afab3a n/a (libcurl.so.4 + 0x1ab3a)
#13 0x00007f7791afae1d n/a (libcurl.so.4 + 0x1ae1d)
#14 0x00007f7791b3c970 curl_multi_cleanup (libcurl.so.4 + 0x5c970)
#15 0x00007f7794ae9f6f _ZN6coeurl6ClientD2Ev (libcoeurl.so.0.3 + 0xaf6f)
#16 0x00007f77947a84ea _ZN3mtx4http6ClientD1Ev (libmatrix_client.so.0.10.0 + 0x1a84ea)
#17 0x000055c9dee1390f n/a (im.nheko.Nheko + 0x61890f)
#18 0x00007f7791652c60 n/a (libc.so.6 + 0x46c60)
#19 0x00007f7791652d1e exit (libc.so.6 + 0x46d1e)
#20 0x00007f77916364f3 n/a (libc.so.6 + 0x2a4f3)
#21 0x00007f77916365a9 __libc_start_main (libc.so.6 + 0x2a5a9)
#22 0x000055c9dee02965 _start (im.nheko.Nheko + 0x607965)
Stack trace of thread 2572:
#0 0x00007f7791721c0d syscall (libc.so.6 + 0x115c0d)
#1 0x00007f7791d2da1a n/a (libglib-2.0.so.0 + 0xc0a1a)
#2 0x00007f7791d26911 n/a (libglib-2.0.so.0 + 0xb9911)
#3 0x00007f77916a89ea n/a (libc.so.6 + 0x9c9ea)
#4 0x00007f77917244cc n/a (libc.so.6 + 0x1184cc)
Stack trace of thread 2573:
#0 0x00007f7791717686 ppoll (libc.so.6 + 0x10b686)
#1 0x00007f7791d6edcd n/a (libglib-2.0.so.0 + 0x101dcd)
#2 0x00007f7791ce8566 n/a (libglib-2.0.so.0 + 0x7b566)
#3 0x00007f7791d26911 n/a (libglib-2.0.so.0 + 0xb9911)
#4 0x00007f77916a89ea n/a (libc.so.6 + 0x9c9ea)
#5 0x00007f77917244cc n/a (libc.so.6 + 0x1184cc)
Stack trace of thread 2575:
#0 0x00007f7791721c0d syscall (libc.so.6 + 0x115c0d)
#1 0x00007f7791d6e723 n/a (libglib-2.0.so.0 + 0x101723)
#2 0x00007f7791c95049 g_async_queue_pop (libglib-2.0.so.0 + 0x28049)
#3 0x00007f7782b1db75 n/a (libpangoft2-1.0.so.0 + 0xdb75)
#4 0x00007f7791d26911 n/a (libglib-2.0.so.0 + 0xb9911)
#5 0x00007f77916a89ea n/a (libc.so.6 + 0x9c9ea)
#6 0x00007f77917244cc n/a (libc.so.6 + 0x1184cc)
Stack trace of thread 2570:
#0 0x00007f7791717686 ppoll (libc.so.6 + 0x10b686)
#1 0x00007f7791d6edcd n/a (libglib-2.0.so.0 + 0x101dcd)
#2 0x00007f7791ce1754 g_main_context_iteration (libglib-2.0.so.0 + 0x74754)
#3 0x00007f77923bb2b2 _ZN20QEventDispatcherGlib13processEventsE6QFlagsIN10QEventLoop17ProcessEventsFlagEE (libQt6Core.so.6 + 0x5bb2b2)
#4 0x00007f77920ce00a _ZN10QEventLoop4execE6QFlagsINS_17ProcessEventsFlagEE (libQt6Core.so.6 + 0x2ce00a)
#5 0x00007f77921e4786 _ZN7QThread4execEv (libQt6Core.so.6 + 0x3e4786)
#6 0x00007f7792f951fa n/a (libQt6DBus.so.6 + 0x4f1fa)
#7 0x00007f779228d4df n/a (libQt6Core.so.6 + 0x48d4df)
#8 0x00007f77916a89ea n/a (libc.so.6 + 0x9c9ea)
#9 0x00007f77917244cc n/a (libc.so.6 + 0x1184cc)
Stack trace of thread 2574:
#0 0x00007f7791717686 ppoll (libc.so.6 + 0x10b686)
#1 0x00007f7791d6edcd n/a (libglib-2.0.so.0 + 0x101dcd)
#2 0x00007f7791ce6b8f g_main_loop_run (libglib-2.0.so.0 + 0x79b8f)
#3 0x00007f778ff78ec0 n/a (libgio-2.0.so.0 + 0x178ec0)
#4 0x00007f7791d26911 n/a (libglib-2.0.so.0 + 0xb9911)
#5 0x00007f77916a89ea n/a (libc.so.6 + 0x9c9ea)
#6 0x00007f77917244cc n/a (libc.so.6 + 0x1184cc)
ELF object binary architecture: AMD x86-64
Oct 30 03:50:32 moriel-pc systemd[1]: [email protected]: Deactivated successfully.
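Note that the trace above belongs to im.nheko.Nheko, a user-space process, rather than to the kernel. As a rough aid for reading traces like this one, the following sketch tallies which modules the frames fall in. The regular expression is an assumption based only on the frame lines shown here, and the helper names are hypothetical.

```python
import re
from collections import Counter

# Matches frame lines in the shape seen above, for example:
#   #14 0x00007f7791b3c970 curl_multi_cleanup (libcurl.so.4 + 0x5c970)
FRAME_RE = re.compile(
    r"#(?P<index>\d+)\s+0x(?P<addr>[0-9a-f]+)\s+(?P<symbol>\S+)\s+"
    r"\((?P<module>.+?) \+ 0x(?P<offset>[0-9a-f]+)\)"
)


def parse_frames(text):
    """Return (index, symbol, module) for every frame line found in text."""
    return [
        (int(m["index"]), m["symbol"], m["module"])
        for m in FRAME_RE.finditer(text)
    ]


def modules_by_frame_count(frames):
    """Count how many frames belong to each module, most frequent first."""
    return Counter(module for _, _, module in frames).most_common()
```

Feeding it the text of a single thread's trace gives a quick view of where that thread was spending its frames.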
@moriel5 , would you be able to attach a picture (taken with your phone, for instance) of any core dump / errors on the screen when the kernel crashes? Thanks!
If I can, I'll try doing so.
But the one time I saw anything, it was either one or two (I forget which, but no more than two) lines, which only mentioned the fact that AMDGPU crashed and took the kernel with it, without any more information.
Thanks. Exact errors are always more useful than summaries. Things that might look unimportant sometimes are helpful to us.
As is always the case.
@moriel5 Have you seen this crash recently? I'm hoping one of the newer kernels has resolved the issue. Thanks.
I may have still been seeing this crash as late as last week. I have not yet had time to sync this week, since I have been away from home since Thursday, but I intend to sync in a few hours. However, since I was unable to procure logs, there is no proof in either direction. All I know is that asus-wmi is no longer conflicting with acpi with regard to S0ix reporting, so I am back to trying to figure out what is going on with S3.