GPU crash in Halo Infinite
@emcy849 can you open a new issue describing your problem? Let me know in that new issue if the crash occurs just in Halo or if it also happens in other games. And if you can submit a system report after you see the crash, that would be useful. I'd like to compare it to the system report you submitted before.
Originally posted by @lostgoat in https://github.com/ValveSoftware/SteamOS/issues/1540#issuecomment-2161062778
It just happened again; system report submitted. The behaviour now is a screen freeze while game audio keeps running, so the game itself is still running and only the video output is frozen. The screen then flashes black (as the GPU resets, I presume) and comes back still frozen. Sometimes the Steam overlays still work over the frozen game output. This tracks with the fix that stops the Steam client from crashing when a GPU reset happens, so at least that's fixed.
I believe it also happens at a much lower rate in other games, though I can't think off the top of my head which ones. Right now it's been crashing while playing the Squad Battle multiplayer playlist in Halo Infinite. It's not 100% reproducible: sometimes it'll play nice for hours and sometimes it crashes really fast.
Really annoyed this has come back for me after some recent update. My Deck hadn't crashed this way for a few months and I thought it was over. And my warranty period just expired too, so I'm not happy :(
@emcy849
Halo Infinite started crashing more frequently starting with SteamOS 3.5 because Valve changed the default value for Transparent HugePages from "madvise" to "always", which increased RAM usage by default.
Here is how you can revert to the upstream Linux kernel default "madvise" setting:
Just copy & paste the following two lines into the Konsole terminal app:
sudo sed -i 's/\bGRUB_CMDLINE_LINUX_DEFAULT="\b/&transparent_hugepage=madvise /' /etc/default/grub
sudo grub-mkconfig -o /boot/efi/EFI/steamos/grub.cfg
If you don't know how to do this, then A.B.T. has written an excellent article that explains everything step-by-step:
https://medium.com/@a.b.t./here-are-some-possibly-useful-tweaks-for-steamos-on-the-steam-deck-fcb6b571b577
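After rebooting, it may be worth confirming that the kernel actually picked up the new setting. The kernel exposes the THP mode in /sys/kernel/mm/transparent_hugepage/enabled, with the active value shown in square brackets (for example: always [madvise] never). A minimal sketch for reading it; the helper name thp_active_mode and its optional path argument are made up here for illustration:

```shell
# Print the active THP mode: the word inside square brackets.
# The optional first argument is a path override (an assumption added here,
# so the helper can also be pointed at a sample file).
thp_active_mode() {
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p' "${1:-/sys/kernel/mm/transparent_hugepage/enabled}"
}

# Only query the real sysfs file when it exists and is readable.
if [ -r /sys/kernel/mm/transparent_hugepage/enabled ]; then
    thp_active_mode
fi
```

If the change took effect, this should print madvise.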
Hope it helps your crashing issue with Halo Infinite!
And in case any Valve employee reads this:
What was the reasoning behind the switch from the default "madvise" THP option to the "always" one, which is known to cause issues with occasionally drastically inflated RAM consumption, as can be seen with Halo Infinite?
Another known problematic game with THP set to "always" is Detroit: Become Human.
Perhaps sticking to the upstream Linux kernel default of "madvise" (as was the case up until SteamOS 3.4), which A.B.T. also recommends, would have been the better choice?
Thanks in advance for any reply!
Replying to https://github.com/ValveSoftware/SteamOS/issues/1547#issuecomment-2169893497
This... may explain why Halo seems to crash more for me in Squad Battle games (16 players, bigger maps) than in other 8-player, smaller-map games. Intriguing.
I'm a little hesitant to start changing kernel parameters yet though. I'd also like to know why this change was made, if this really is the issue. Why would an OOM cause a GPU reset though?
@emcy849
If you are worried about somehow harming your Steam Deck by switching to the "madvise" option, no worries: you can easily go back to Valve's default "always" THP option by running the following two commands, which undo the change I suggested above (based on A.B.T.'s SteamOS tweaks):
sudo sed -i -e 's/transparent_hugepage=madvise //' /etc/default/grub
sudo grub-mkconfig -o /boot/efi/EFI/steamos/grub.cfg
Just make sure to reboot your Steam Deck after running the above two commands, just as with the two commands I posted earlier that enable the "madvise" THP option.
Please feel free to give it a shot, since there is no risk in doing so.
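For anyone still hesitant, both sed edits from this thread can be previewed on a scratch copy before touching the real file. This is only a sketch: the function name preview_thp_edits and its path argument are hypothetical, and it assumes GNU sed (for \b and -i) and a readable /etc/default/grub.

```shell
# Apply the "enable" edit and then the "revert" edit to a scratch copy of the
# GRUB defaults file, printing the kernel command line after each step.
# The function name and its path argument are hypothetical; requires GNU sed.
preview_thp_edits() {
    src="${1:-/etc/default/grub}"
    scratch="$(mktemp)" || return 1
    cp "$src" "$scratch" || return 1

    # Same expression as the enable command, but run on the copy.
    sed -i 's/\bGRUB_CMDLINE_LINUX_DEFAULT="\b/&transparent_hugepage=madvise /' "$scratch"
    echo "after enable: $(grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$scratch")"

    # Same expression as the revert command.
    sed -i -e 's/transparent_hugepage=madvise //' "$scratch"
    echo "after revert: $(grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$scratch")"

    # Report whether the round trip restored the original file byte for byte.
    if cmp -s "$src" "$scratch"; then
        echo "round trip: identical to original"
    else
        echo "round trip: differs from original"
    fi
    rm -f "$scratch"
}
```

Running preview_thp_edits with no argument only reads /etc/default/grub and works on a temporary copy, so no sudo is needed and nothing on the system changes. A "differs from original" result would suggest the file already contained the option before the preview, for example because the enable command had been run before.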
Hope it helps!
Replying to https://github.com/ValveSoftware/SteamOS/issues/1547#issue-2354874831
The behaviour you mention is something I have experienced in the past while doing low-complexity tasks (a short black screen, then the UI comes back with frozen video). There are already a couple of reports of this from other Deck users; see https://gitlab.freedesktop.org/drm/amd/-/issues/3111#note_2248516 and https://gitlab.freedesktop.org/drm/amd/-/issues/3440, which was reported by Valve employees themselves.
If you can check your dmesg log and see if it's similar to mine below it would probably corroborate that you are seeing the same issue.
Ironically, since I'm using a more up-to-date distribution than SteamOS, I haven't been able to reproduce it for around three months, but from what I've read it seems to be an SDMA firmware bug (nothing confirmed by AMD so far).
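For the dmesg check suggested above, a small filter can narrow the kernel log down to likely GPU reset lines. This is a rough sketch, not a reproduction of anyone's actual log: the function name filter_gpu_reset_lines is made up, and the search pattern is only a guess at how amdgpu timeout and reset messages are usually worded.

```shell
# Keep only kernel log lines that mention amdgpu together with a timeout or reset.
# The pattern is a guess at typical wording, not an official list of messages.
filter_gpu_reset_lines() {
    grep -iE 'amdgpu.*(timeout|reset)'
}

# Typical use (dmesg usually needs root):
#   sudo dmesg | filter_gpu_reset_lines
```

If nothing is printed right after a freeze, widening the search to every line that mentions amdgpu may still turn up something worth attaching to the issue.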