Steam Deck crashes to a black screen with the backlight on
Your system information
- Steam client version: 1679680416
- SteamOS version: 3.4.6
- Opted into Steam client beta?: [Yes/No] NO
- Opted into SteamOS beta?: [Yes/No] NO
- Have you checked for updates in Settings > System?: [Yes/No] YES
Please describe your issue in as much detail as possible:
I was playing Elden Ring in a room with an ambient temperature of 23.2 degrees Celsius. The TDP limit was set to 11 W. The screen froze with the last image still displayed. Audio kept playing for a few more seconds. Then the image vanished, but the backlight of the screen stayed on. The Steam Deck stayed like that for somewhere between 30 seconds and 2-3 minutes. After that, the usual Steam progress bar was displayed and the game mode UI started again. The Steam Deck didn't seem to reboot: the uptime since the cold boot was more than an hour after this crash.
This seems to have been a crash of the driver or some hardware instability. I've caught this Steam Deck running at 86-88 degrees Celsius at the default 15W TDP. The room temperature was at 25.5 degrees Celsius when I had mangohud enabled to see the temperatures. Setting it to 10 W seemed to allow it to run cooler (70-80 degrees Celsius). It crashed with a TDP limit of 11 W.
When games crash, SteamOS returns to the game mode without any explanation. This wasn't the same.
The Steam Deck should be able to handle running games with its default TDP of 15 W. If it can't, its cooling solution is flawed. I'm grateful for what Valve is doing for Linux, plasma and open source software as a whole. I really appreciate the time, effort and money put into this. The hardware seems to need work. The people from Steam support weren't convinced that this Steam Deck has issues with cooling.
The solution for crashes they've recommended before was to try to disable the updated fan curve. That didn't seem to do anything for the temperatures. It simply made the Steam Deck extremely noisy. Since it's one of the early ones, it's quite noisy.
If this is indeed a video driver bug in RADV or a bug in gamescope, I'm not sure why it correlates with running the Steam Deck at its default TDP or higher than 10 W. Either way, if you want me to do some experiments on this Steam Deck, please let me know. I'd much rather not discuss any further with Steam support at this point.
Steps for reproducing this issue:
- Pick up an early Steam Deck without the updated cooling solution, with a noisy fan and possibly other early-batch flaws
- Play Elden Ring or another similar game with the default TDP
- The video driver may crash at some point
Thanks for the report.
Two ideas here: could you collect logs when this happens, such as dmesg or journalctl, to help determine which component failed? Also, is it feasible for you to test with the main branch?
Cheers!
I can try to retrieve the logs when this happens again. This game (Elden Ring) appears to require Proton Experimental. The stable Proton 8 didn't allow the game to start when I checked yesterday or the day before.
Would switching to the main branch also install a BIOS update? Can I keep using the stable version of Steam or does that also need to be beta/main? I can create an image of the internal SSD before switching to main. Ideally, this shouldn't make changes to the BIOS or to anything which would make Steam support say that it's not a supported configuration anymore. Based on the issues this device has encountered so far, I'd like to request an RMA if these issues persist. A regular PC with Linux doesn't have such issues with overheating and crashes.
The Steam Deck crashed again after 2-4 minutes passed since the moment when the game was started.
The relevant journald logs are attached. steam-deck-crash.log
The system appeared to be in a bad state after the crash. Loading the game again made the Steam Deck display vertical bars. In my previous experience with other devices, this meant that the display was damaged internally or the GPU was about to die.

This is what it looks like after one reboot, one cold boot, one attempted boot with vol-, ... and power buttons pressed, and an actual boot with those three buttons pressed.

update: The screenshot I've taken when starting the game again after the crash didn't show the vertical bars seen in the photos of the screen. This makes me think the bars weren't caused by a corruption in the VRAM (a.k.a shared RAM). This screenshot was taken after the crash and before the first reboot mentioned above. I was wondering whether this is visible in the screenshot as well.

I'll provide the logs, photos and screenshots again when it crashes again.
Thanks a lot for your log! It does seem you hit a GPU reset, the gamescope process died, etc.
Are you running the stable image, with kernel 5.13? Would you consider trying the main branch, which contains a bunch of updates (both kernel and userspace)? If not testing the whole main image (which would be ideal), at least running with the latest Valve kernel 6.1 is an interesting experiment.
Let me know about those possibilities, and if you need some guidance on how to get/install them.
Thanks for the recommendation. Yes, this was the stable and unmodified SteamOS running the kernel shipped as part of the image. The only change was that I've set the password for the deck user to be able to run journalctl and dmesg via sudo. This is the vanilla SteamOS which runs from the Steam Deck's internal SSD. I haven't spent time tinkering with the one from the external SSD for a while.
I'll set aside some time today to create an image of the internal SSD before I switch this Steam Deck to the main branch. I've wasted a bit of time trying to figure out what's going on with Proton 8/Proton experimental and EAC. That's a separate story. This is what I was doing before posting those updates.
I was able to switch to the main branch on the 24th. I didn't have a lot of time to play games on the Steam Deck since. It still seemed to run hot and was still quite noisy. I don't think this hardware will last a long time running at 80-86 degrees Celsius.
I'll return with logs if it crashes again. This is its last chance before the RMA request.
The SteamOS 3.5 images from the main channel seem to not have the radv bug I've run into. The most serious issue I've seen on these images was not being able to start games until the system was rebooted. This was before the release of the most recent main image.
Unfortunately, the fan still sounds like a tiny jet engine when it spins at ~5500 RPM. The cooling is this device's weakest point.
If tickets are closed when issues are no longer an issue in an upcoming release, this can probably be closed. I'll leave it as it is.
My Steam Deck managed to crash once again while running SteamOS 3.5 from the main branch. It crashed to a black screen while running Elden Ring. It didn't recover at all. The backlight was still on.
I had to turn it off and on again in order to make it work properly again. I'm using the device with a TDP of 11 W now. I suspect that there's an issue with the hardware. It's ironic how some people who've opened their Steam Decks have no issues, yet I keep running into issues with one unit which hasn't been opened or modified at all.
I've run into another issue with the controls which stopped working in game. I could control SteamOS with the buttons on the controller. I had to force exit the game. This is probably related to Steam input or some other ridiculous Steam beta client bug.
I'll enable SSH and try to figure out how to install the kdump package. This unit seems to have some kind of hardware issue at this point. I was really hoping that the newer kernel with the newer radv fixed the issues with crashes.
Please let me know if you want me to do something else. It should be something which doesn't take too much time. I've already spent more time debugging and figuring out issues than I would've liked to. I'll ask Steam support to help me with the RMA when I run into the next crash.
The kdump package should be installed by default - you can quickly check that with the command journalctl -b | grep -i kdump. The problem I see is that your issue seems to not lead to a panic, otherwise the system would restart (and a kdump/pstore log would be collected).
One idea is to set up SSH and make sure it works. Then, if the issue happens again, try to log in via SSH. If the problem is amdgpu/mesa/radv or anything GFX-related, it is very likely that you can still connect via SSH and collect logs, and these would be really helpful to determine the root cause of the issue!
Thanks again for your effort in trying to help and provide data, very much appreciated =)
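A minimal sketch of the kdump check suggested above, assuming the journal is piped in or has been saved to a file. The helper name `kdump_status` is hypothetical, and the two message strings it matches are the ones that show up in the journal excerpts posted later in this thread; other kdump-steamos versions may word them differently.

```shell
#!/bin/sh
# Hypothetical helper: summarise kdump-steamos boot messages.
# Reads a saved journal file, or stdin when no file is given.
kdump_status() {
    lines=$(grep -i 'kdump' "$@" || true)
    case "$lines" in
        *'invalid folder'*) echo 'aborted' ;;   # the service logged that it was aborting
        *'pstore-RAM was loaded successfully'*) echo 'loaded' ;;
        *) echo 'unknown' ;;                     # no recognised kdump message
    esac
}

# Usage on the Deck:  journalctl -b | kdump_status
```

An 'unknown' result only means none of the assumed strings matched; it says nothing about whether the package is installed.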
I'll try to test again this weekend. This Steam Deck may have some hardware issues. I'll reach out to support as well to request an RMA due to the issues noticed with Elden Ring recently and the overall instability of the device, both with the stable OS and with 3.5.
I've got the same issue. Elden Ring (and only Elden Ring) freezes after about 10-30 minutes of playtime. Most of the time I can get back to the Steam-menu and exit the game, sometimes I have to hard-reset the deck. Had a longer session once - but only once.
Tested steps:
- Same behavior in Desktop-mode or Game-mode.
- Same behavior when Limiting Wattage
- Same behavior turning off the new fan curve
- Same behavior when Easy Anti Cheat is disabled (launching elden ring.exe and creating steamID.txt)
- Same behavior with multiple Proton versions (including GE)
- Same behavior after deleting protonfiles
- Same behavior after reinstalling the game
- Same behavior after verifying files
OS 3.4.8, no beta. Hardware is about 6 months old, no other issues. Almost any game runs without issues.
Editing Post (commented on mobile phone earlier)
@Screak42: Can you follow the steps to enable the sshd daemon and connect to your Steam Deck via ssh?
This should allow you to fetch the logs via sudo journalctl. This would confirm whether you run into a radv video driver crash or something else.
I'll try that over the weekend
Hey!
I had some time and managed to get it to crash after about 20 minutes of gaming. I'm not sure what I'm looking for or what to make of this. I also don't expect anything :) But if it helps, I'm happy. I guess I can look for more specific logs or dumps, but I'd need to know how to get them. SSH into the device seems to be no problem in that state. The logs in the attached zip are a bit longer than the copy and paste below, which is just what I think might be an indicator.
STEPS:
- Delete proton files (via developer)
- verified integrity of game files
- selected GE-Proton7-55 (I read it worked good for some people with Elden Ring)
- wifi enabled
- updated fan control enabled
- started the game
- wattage limited to 10W
- medium game settings
- after about 20 minutes, game froze, audio keeps playing.
- crash-time: about 15:35 local time
- I can bring up the steamdeck keyboard, take screenshots, open the "steam menu"
- I left the frozen game open
- ssh into device and looked around.
- journalctl -b | grep -i kdump without success:
Jul 04 13:25:23 potato systemd[1]: Starting SteamOS kdump loader boot-time service...
Jul 04 13:25:23 potato root[977]: kdump-steamos: pstore-RAM was loaded successfully
Jul 04 13:25:23 potato kdump-load.sh[979]: /usr/lib/kdump/kdump-load.sh: line 66: /home//.steamos/offload/var/kdump/.installed_kernels: No such file or directory
Jul 04 13:25:23 potato kdump-load.sh[980]: find: ‘/home//.steamos/offload/var/kdump/*’: No such file or directory
Jul 04 13:25:23 potato systemd[1]: Finished SteamOS kdump loader boot-time service.
Jul 04 13:25:23 potato root[998]: kdump-steamos: invalid folder (/home//.steamos/offload/var/kdump) - aborting...
So I looked around and could not find much; I only looked at what roughly matched the timestamp. It looks like it may have some info, but I'm not sure what I'm looking at or even looking for here.
- cef_log.txt
[0704/153219.551510:ERROR:shared_context_state.cc(721)] SharedContextState context lost via ARB/EXT_robustness. Reset status = GL_INNOCENT_CONTEXT_RESET_KHR
[0704/153219.553582:ERROR:gpu_service_impl.cc(1124)] Exiting GPU process because some drivers can't recover from errors. GPU process will restart shortly.
[0704/153219.572065:WARNING:gpu_process_host.cc(1256)] The GPU process has crashed 1 time(s)
- gpu-trace-daemon.log.truncated
- shader_log.txt.truncated
[2023-07-04 15:40:48] [ AppID 1868140 ] Found precompiled fossilize bucket for AppID AMD RADV VANGOGH /
- steamwebhelper.log
For completeness, I'm adding a few screenshots of my current settings, although I've tried many and it doesn't change anything in the behavior.
Thank you @Screak42 ! Regarding kdump, it seems you have exposed a bug that should be fixed in the latest version - I'll check why it's still happening.
Can you please login through SSH at any time and run sudo mkdir -p /home/.steamos/offload/var/kdump ? After that, reboot the Deck and recheck the status of kdump using the command journalctl -b | grep -i kdump .
Finally, whenever you reproduce again (and if you still can access the Deck through SSH), 2 commands could be very useful to collect, in order to debug. You can try: sudo dmesg > dmesg.txt and sudo journalctl -b > journal.txt - both files should help to understand "where" the failure is happening in your case.
Thanks again for your help!
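The two collection commands above can be wrapped so that each crash gets its own folder. This is only a sketch: `collect_logs` and the `crash-logs` directory are made-up names, not something SteamOS ships.

```shell
#!/bin/sh
# Hypothetical helper: save the kernel log and the current boot's journal
# into a timestamped directory and print that directory's path.
collect_logs() {
    out_dir="${1:-$HOME/crash-logs}/$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$out_dir" || return 1
    # The two commands suggested above, run with sudo as in that comment.
    sudo dmesg > "$out_dir/dmesg.txt"
    sudo journalctl -b > "$out_dir/journal.txt"
    echo "$out_dir"
}

# Usage over SSH after a freeze:  collect_logs
```

Keeping one directory per crash makes it easier to match a dmesg.txt with the journal.txt from the same boot when attaching them later.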
@Screak42 , sorry but I'd need another output from you! Can you run pacman -Q|grep kdump in your Deck and let me know the output? Huge thanks again!
Hey @guilhermepiccoli :) First of all, thanks for looking at this!
And ... Can do. I tried to keep everything consistent. (see below)
- pacman -Q|grep kdump
kdump-steamos 0.9-2
- uname -a
Linux potato 5.13.0-valve36-1-neptune #1 SMP PREEMPT Mon, 19 Dec 2022 23:39:41 +0000 x86_64 GNU/Linux
- flatpak --version
Flatpak 1.12.4
- sudo mkdir -p /home/.steamos/offload/var/kdump
- ls -lah /home/.steamos/offload/var/
total 28K
drwxr-xr-x 7 root root 4.0K Jul 5 10:34 .
drwxr-xr-x 7 root root 4.0K Sep 23 2022 ..
drwxr-xr-x 3 root root 4.0K Sep 23 2022 cache
drwxr-xr-x 2 root root 4.0K Jul 5 10:34 kdump
drwxr-xr-x 5 root root 4.0K Sep 23 2022 lib
drwxr-xr-x 5 root root 4.0K Oct 1 2022 log
drwxrwxrwt 65 root root 4.0K Jul 5 10:27 tmp
- reboot
- uptime
10:55:50 up 3 min, 4 users, load average: 1.82, 1.24, 0.54
- journalctl -b | grep -i kdump
Jul 05 10:52:22 potato systemd[1]: Starting SteamOS kdump loader boot-time service...
Jul 05 10:52:22 potato root[946]: kdump-steamos: pstore-RAM was loaded successfully
Jul 05 10:52:22 potato kdump-load.sh[955]: find: ‘/home//.steamos/offload/var/kdump/*’: No such file or directory
Jul 05 10:52:22 potato systemd[1]: Finished SteamOS kdump loader boot-time service.
- sudo systemctl start sshd
- leaving desktop mode
settings/conditions in game mode:
- Delete proton files (via developer)
- verified integrity of game files
- selected GE-Proton7-55 (I read it worked good for some people with Elden Ring)
- wifi enabled
- updated fan control enabled
- wattage limited to 10W
- medium game settings
- starting game-session 11:05
- freeze 11:12 (nothing particularly special happened ingame)
- this time the steam button / menu was unresponsive. the device was frozen, audio kept playing.
- SSH to device works fine.
- sudo dmesg > dmesg.txt
- journalctl -b > journal.txt
Crash 2 2023.07.05.zip
Files are attached! I hope it helps
Hi @Screak42 , you're welcome! And thanks a lot for the detailed info you collected, this is very useful!
Two things:
(a) Regarding the kdump issues, they're indeed fixed in the latest kdump version (0.92), whereas you're using 0.9. By creating the directory manually as you did, the main problem is circumvented, so now you have a fully functional kdump! Even with that other minor issue (the find one), kdump should work properly - this other minor issue is also fixed in the latest package.
(b) The freeze problem seems to be caused by a GPU reset induced by some issue with Elden Ring, likely in radv / dxvk / amdgpu. Dmesg shows the following:
[ 1200.609984] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=434558, emitted seq=434560
[ 1200.610208] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process eldenring.exe pid 9615 thread eldenring.exe pid 9693
[ 1200.610408] amdgpu 0000:04:00.0: amdgpu: GPU reset begin!
I'll suggest 2 steps to check whether the problem goes away. The first is to update your Deck to the latest version - I see you're not using the latest stable image, or else you would have kdump 0.92 heh
The second idea is to avoid using Proton GE for this test - maybe try Proton Experimental if you want, to pick up the latest official fixes/improvements. After that, let me know if it still reproduces.
Cheers!
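To look for the same signature in a saved dmesg.txt, something like the following can be used. It is a sketch that assumes the log lines look like the three quoted above; `gpu_reset_lines` and `hung_process` are illustrative names.

```shell
#!/bin/sh
# Print amdgpu job timeouts and GPU reset messages from a saved dmesg file
# (or from stdin when no file is given).
gpu_reset_lines() {
    grep -E 'amdgpu_job_timedout|GPU reset' "$@"
}

# Print the process name recorded in the "Process information" line, if any.
hung_process() {
    sed -n 's/.*Process information: process \([^ ]*\) pid.*/\1/p' "$@" | sort -u
}

# Usage:  gpu_reset_lines dmesg.txt
#         hung_process dmesg.txt
```

No output from `gpu_reset_lines` suggests the freeze did not go through an amdgpu job timeout, at least not with this wording.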
Hey @guilhermepiccoli - Thanks!
I checked, but there is no later stable OS (stable channel) available other than what I have - 3.4.8 (20230508.1). Maybe a staggered release in different areas?
Anyway, I'll move to Proton Experimental.
Yesterday I had a quite long session with the steam overlay disabled and a launch command /nolightfx
I can't say yet if it's random or not... I'll keep an eye on it :)
In case I can reproduce - does it help and make sense to collect the logs or is it pretty much clear with the GPU reset?
Thanks again @Screak42 , for your prompt response! It is indeed a good idea switching to Proton Experimental. About the stable channel, it seems you're running the latest one - so, there might be some "bug" about picking the latest version of kdump, I'll investigate. I'm running main channel, which contains the latest kdump version.
In case the issue happens again, collecting dmesg is quite important. I also received information that PROTON_LOG=1 could also be very helpful, I'm not sure how that would impact game performance for Elden Ring though...instructions to enable the Proton logging are present here.
Cheers!
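For locating the Proton log afterwards, here is a small sketch. It relies on my understanding of Proton's README, which is not stated in this thread: the launch option is `PROTON_LOG=1 %command%`, and the log is written as `steam-$APPID.log` in the home directory unless `PROTON_LOG_DIR` points elsewhere. `proton_logs` is a hypothetical helper name.

```shell
#!/bin/sh
# List Proton debug logs, newest first.
# Assumption: logs are named steam-<appid>.log and live in $PROTON_LOG_DIR,
# or in $HOME when that variable is unset.
proton_logs() {
    dir="${1:-${PROTON_LOG_DIR:-$HOME}}"
    ls -t "$dir"/steam-*.log 2>/dev/null
}

# Usage:  proton_logs | head -n 1
```

If this prints nothing after a session, the launch option was probably not applied to the game.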
Hey again ;)
Same conditions as before. Yesterday (2023.07.07) I received a Proton Experimental update, but the crashing remains. Since yesterday the crashes happen almost without exception after 3-8 minutes. Longer game sessions are no longer possible.
Reproduce issue: (2023.07.08)
- sudo systemctl start sshd
- leaving desktop mode
- select proton experimental
- Delete proton files (via developer)
- add launch option PROTON_LOG=1
- verified integrity of game files
- wifi enabled
- updated fan control enabled
- wattage limited to 10W
- medium game settings (I always use windowed mode & downscaled resolution, then enable FSR, but I get the same behavior leaving it in fullscreen settings ingame)
starting game-session: PROTON_LOG=1 via launch options (if it's supposed to generate an extra log, I was not able to find it)
08:15 start Elden Ring
08:18 crash
game freeze, then black screen, Steam UI resets, no full reboot (showing pin-password)
attached:
- core.gamescope.1000.eb05187fb1c840e680c55d0b33a91717.4246.1688797107000000.zst (I found this referenced in journal, maybe it's useful)
- dmesg.txt
- gpu-trace-daemon.log
- journal.txt
- user-1000.journal
logs 2023.07.08 screak42.zip
FYI: some people report that they RMA'd the device and did not have problems afterwards. However, I find that almost any other (even demanding) game runs just fine for me. I'm 99% certain it's not a hardware issue. The ONLY other game that's crashing more than once for me is Lords of the Fallen. But that game ... crashes everywhere. Even on my PS4. I'm using the deck in desktop mode as my everyday machine almost all the time, no problems. Even games like Noita, which can go from 10% CPU & GPU to 100% in 1 frame, have no issues.
I've run into these crashes with my Steam Deck unit on the stable SteamOS. This is an early unit from last year.
I upgraded to the SteamOS 3.5 images from main. The Steam Deck was stable for a while with that setup. It started to crash again on 3.5 after a while when running Elden Ring. I haven't been able to trigger it since enabling ssh to retrieve the logs from it when it crashes.
I've decided to reach out to Steam support to RMA it this week. Even if this issue turns out to be entirely caused by software, my Steam Deck has other issues which make it an overall disappointing experience.
I was able to get my Steam Deck to crash again.
steam deck crash 6.1.29 main 20230630.log
I've removed the wlan0-related entries containing my network SSID, MAC addresses, and other network- and device-specific details.
The Steam Deck was turned on for almost two hours when it crashed. The video driver seemed to crash. The game's sound kept working just fine. I pressed the power button to make it go to sleep. It woke up, the UI was started again and I retrieved the logs.
I had the Steam Deck limited to 12 W TDP in an attempt to make the fan a bit quieter. This doesn't seem to help with the stability issues.
My Steam support request for an RMA has been filed.
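For anyone else attaching logs, the kind of redaction mentioned above can be roughly automated. This sketch assumes that dropping every line mentioning wlan0 and masking anything shaped like a MAC address is enough; SSIDs or hostnames on other lines would still need a manual pass. `redact_log` is an illustrative name.

```shell
#!/bin/sh
# Drop wlan0 lines and mask MAC-address-shaped strings.
# Reads the given log file, or stdin when no file is given.
redact_log() {
    grep -v 'wlan0' "$@" \
        | sed -E 's/([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}/XX:XX:XX:XX:XX:XX/g'
}

# Usage:  redact_log journal.txt > journal-redacted.txt
```

The pattern requires six colon-separated hex pairs, so timestamps and PCI addresses such as 0000:04:00.0 are left untouched.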
@unclejack can you send me your support ticket ID or your Steam account ID?
@lostgoat: You should've received an email from me just now. I'd rather not post my Steam id or the ticket id here.
edit: I've left the Steam Deck running without rebooting. It appears to have recovered from that crash. I'll keep it running in case we can collect some data from it. The gpu trace didn't seem to produce output to disk.
I have a similar problem with Elden Ring on my LCD deck. Latest main.
It crashed to a black screen after 10-30 minutes, didn't recover at all, and the backlight was still on; I needed to reboot the deck.
But I have noticed that when I play docked (Valve dock, LAN) there are no crashes. Only in handheld mode.
edit: just crashed docked, but still is way more stable
@skurovec : Can you retrieve the logs from your Steam Deck and post the error in the same way I have? That would tell you whether it's this issue or https://github.com/ValveSoftware/SteamOS/issues/1312.