vkd3d-proton
The Last of Us Part 1 Xid 109
While testing TLOU1 I am hitting a Xid 109:
NVRM: Xid (PCI:0000:01:00): 109, pid='<unknown>', name=<unknown>, Ch 00000116, errorString CTX SWITCH TIMEOUT, Info 0x32c07d
While testing this I have just been running around outside in "The Capitol Building" area seen below. I don't know if it can also happen in other places, as I haven't tried that.
Screenshot
Software information
The Last of Us Part 1 Ultra preset
System information
- GPU: RTX 4080
- Driver: Tested 550.54.14 and 535.161.07
- Wine version: Experimental Bleeding Edge
- VKD3D-Proton version: Master. Also tried 2.9 once with the same result
Running with VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/nvidia_icd.json
to avoid a frozen white screen on game start. Likely weirdness due to also having an iGPU.
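For anyone who wants to script the same workaround, here is a minimal sketch of pinning the Vulkan loader to the NVIDIA ICD before starting a process. The ICD path is the one from this report; the helper names `build_env` and `launch`, and whatever command you pass in, are assumptions for illustration and not part of vkd3d-proton or Proton.

```python
import os
import subprocess

# Path taken from the report above; it may differ on other distributions.
NVIDIA_ICD = "/usr/share/vulkan/icd.d/nvidia_icd.json"


def build_env(base_env=None, icd_path=NVIDIA_ICD):
    """Return a copy of the environment with VK_ICD_FILENAMES set to one ICD."""
    env = dict(os.environ if base_env is None else base_env)
    env["VK_ICD_FILENAMES"] = icd_path
    return env


def launch(command, icd_path=NVIDIA_ICD):
    """Run `command` (a list of arguments, hypothetical here) with the pinned ICD."""
    return subprocess.run(command, env=build_env(icd_path=icd_path), check=False)
```

Copying the environment instead of mutating `os.environ` keeps the override limited to the launched process.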
Log files
I've gathered some breadcrumb logs which all mention the same shader in the potential crash region. Also attached below are the dxil and spv files of that shader from a dump.
proton-breadcrumbs.tar.gz
Weird, I played for a while and it worked here, but I'm in "Downtown" right now. How long does it take before it crashes?
RTX 4070, recent dxvk-nvapi/vkd3d-proton master with Reflex support, Vulkan dev 550.40.53, Proton 8, Ultra settings
Under a minute for me. Sometimes only a few seconds. It's worth noting though that I am unsure if the nature of it is a bit racy or random, since when I was collecting breadcrumbs I had to play a lot to reproduce it again for the second log.
I see what you mean now. What's weird is that I quit in that very same location yesterday and didn't have a crash. Now it went boom after 30 seconds. Aren't Xid crashes usually driver related? I mean, I played this game without crashes with older drivers.
[tis mar 5 21:03:08 2024] NVRM: GPU at PCI:0000:01:00: GPU-c527f869-4b6b-fb00-51ea-f36233874170
[tis mar 5 21:03:08 2024] NVRM: Xid (PCI:0000:01:00): 109, pid='<unknown>', name=<unknown>, Ch 00000068, errorString CTX SWITCH TIMEOUT, Info 0x3ec06
> Aren't Xid crashes usually driver related? I mean, I played this game without crashes with older drivers.
Often yes. But I asked if I should make a vkd3d-proton issue for this one since I had breadcrumbs that led to a specific shader, and I got a yes.
Given this shader seems to do some waveop loops similar to our maximal_convergence test that also hangs the GPU on NV atm, it's likely caused by that. Currently awaiting a driver fix for the maximal_convergence test, and hopefully that fixes this too.
@Blisto91 For some reason the game even crashes during shader compilation for me now, and I don't know why. Never had that problem before. Does it work for you with latest dxvk-nvapi and vkd3d-proton?
Yes. Both master and latest stable of both. 550.67
Game still crashes with 550.67 :( The fact that it crashed during shader compile for me was because of my memory overclock, *walks away in shame* (I've fixed it now tho').
I got this now:
NVRM: GPU at PCI:0000:01:00: GPU-c527f869-4b6b-fb00-51ea-f36233874170
NVRM: Xid (PCI:0000:01:00): 11, pid='<unknown>', name=<unknown>, Ch 000000af Cl 0000c997 Off 00001028 Data 00000020
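Since the thread now has two differently shaped Xid lines (109 with an `errorString`, 11 without), here is a rough sketch of pulling the common fields out of kernel log lines like the ones pasted above. The regular expressions are assumptions based only on the lines in this issue, not on any documented NVRM log format, and `parse_xid` is a made-up helper name.

```python
import re

# Patterns inferred from the log lines quoted in this thread (assumption).
XID_RE = re.compile(
    r"NVRM: Xid \((?P<pci>PCI:[0-9a-fA-F:.]+)\): (?P<xid>\d+), (?P<rest>.*)"
)
CHANNEL_RE = re.compile(r"\bCh (?P<ch>[0-9a-fA-F]+)")
ERROR_RE = re.compile(r"errorString (?P<err>[^,]+)")


def parse_xid(line):
    """Return a dict of Xid fields, or None if the line is not an Xid report."""
    match = XID_RE.search(line)
    if match is None:
        return None
    rest = match.group("rest")
    channel = CHANNEL_RE.search(rest)
    error = ERROR_RE.search(rest)
    return {
        "pci": match.group("pci"),
        "xid": int(match.group("xid")),
        "channel": channel.group("ch") if channel else None,
        "error": error.group("err").strip() if error else None,
    }
```

Feeding it the output of `dmesg` line by line and counting the `xid` values is one way to check whether a given driver version still produces 109 after several launches.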
Seems fixed for me in Dev 550.40.61 - Please verify
It's NOT fixed. Which is weird, because I played it for a while without any issues and now it crashes almost instantly with XID 109 again :(
Still crashing with 555.42.02
[tis maj 21 15:48:18 2024] NVRM: Xid (PCI:0000:01:00): 109, pid='
Sorry, yeah, I hadn't gotten around to this again. But thanks for checking.
This is confusing me so much. Sometimes it just crashes almost straight away, sometimes it's like it never happened and the game runs fine. 555 beta again... First launch, compiled main story shaders. Ran the game, worked fine. Second launch, ran game, crashed within 10 seconds.
Yeah, it's a bit random. I'm back on RTX 4070 and reproduced it easily there. I don't know if it was just random that I didn't reproduce on 3070 last time, as I can sometimes play a while without hangs on the 4070 as well. Sometimes it takes 4-5 launches before it hangs, but quitting after a minute or two of gameplay and relaunching seems to be the fastest way to reproduce. Now I had eight launches in a row where I got a Xid 109 a few seconds after loading into the capitol building area, so sometimes it's trivial.
The hang still happens with 550.40.65 and 555.58.02 on 6.9.8-arch1-1.
The shader Blisto found is named CS_VolumetricsTemporalCombineProbeCacheFroxelsScalar. It does have subgroup operations within loops, but forcing maximal reconvergence in dxil-spirv didn't help. QA descriptor checks had no effect, and forcing barriers for this shader via VKD3D_BARRIER_HASHES also doesn't prevent the hang.
FWIW, I tried spoofing a different nvidia GPU architecture, it did not help either. Nor did disabling nvapi.
I might have to eat my words later, but vkd3d-proton at commit https://github.com/HansKristian-Work/vkd3d-proton/commit/e957460ed1aade52300d5dfb9790478cd1ab80d9 and Nvidia 560.31.02 no longer cause XID 109 crashes here. (I don't use the nvidia open driver and have GSP disabled)
Someone please verify if possible.
This is hilarious... It stopped working again. I give up. I played the game fine for about an hour, I even restarted the game maybe 2-3 times. Now I was going to play again today. Instant crash. Same drivers, nothing changed, just a few system restarts in between...
~Sorry for the spam, deleting /home/user/.cache/nvidia temporarily fixes it. So it seems like it's some cache issue with the Nvidia driver.~
@HansKristian-Work You know what, it's a race condition....
There seems to be a "frametime spike" happening about 5-6 seconds after the game has loaded its initial menu screen. If you load a saved game before that frametime spike has occurred, the game will crash with XID 109 when entering the level. So whatever that frametime spike is, it cannot happen during a game save load.
So if you wait a while (less than 10 seconds on my machine) at the main menu before loading a save, the game will work fine every time.
That's probably why it worked after a driver update, because of the shader recaching time.
Update: I can reproduce the crash in Windows too with vkd3d-proton and dxvk-nvapi.
Breadcrumbs log seems to look the same as the one @Blisto91 posted. I can reproduce it easily, like I said: start the game quickly = instant crash. Wait a while at the main menu before starting = no crash.
breadcrumb.log
How exactly do I reproduce the issue?
I hit continue on the main menu immediately and it loads into the game just fine. The game itself works fine too after that.
RTX 4070, vkd3d-proton latest git, dxvk-nvapi latest git
1. Let it finalize the shader building at the beginning
2. Set graphics to ultra
3. Load a saved game (it should probably work fine now)
4. Quit the game
5. Start the game, quickly load a saved game (continue). It should crash now (takes ~ 5-30 seconds), try steps 4-5 again if it doesn't.
I don't think location really matters but I'm in "The Woods" now. I haven't tried the Dev Linux 550.40.71 driver tho'.
I did manage to reproduce it in the end.
Ok, so I tried really hard to make it crash with Dev Linux 550.40.71 .... it doesn't anymore. At least not for me. Wonder if the root cause was the same as the Final Fantasy crash, what do you think? Was it similar in any way?
@K0bin can you try the dev driver?
I was just about to comment that a fix was added in the latest beta driver. I don't know if the two problems are related, though.
Good call, I'll try the beta driver.
I can't reproduce the issue with the beta driver. There's too much randomness involved with this bug to make any definitive statement but so far it seems like it's fixed.
I even restarted my computer twice to make sure it wasn't a fluke here.... it just won't crash anymore, which is good I guess. :) But yeah, the randomness is strong in this one.
We'll just reopen if it suddenly appears. Thanks for the help