dxvk
corrupted size vs. prev_size with Nvidia 495+ and DXVK 1.9.3+
Star Citizen works very well with DXVK, except in some specific situations. With Nvidia 470.x, it works with basically any version of DXVK. With Nvidia 495.x, it doesn't work with any DXVK version. With Nvidia 510.x, it works with DXVK 1.9.2 and earlier, but not with DXVK 1.9.3+. Only one line in the attached log seems relevant: `corrupted size vs. prev_size`
We tried every game and Wine setting with the LUG, with no difference except for one: when setting `__GL_THREADED_OPTIMIZATIONS` to 0, the error line changes to `malloc(): invalid next size (unsorted)`. We are not sure whether these error messages are related or are two completely different errors.
Software information
Star Citizen.
System information
- GPU: RTX 2070
- Driver: 510.47.03
- Wine version: 7.1
- DXVK version: 1.9.3+
I'm not seeing any DXVK messages in that log. Did you disable logging, or does it crash before even calling into any of our code? Are you using the native Vulkan renderer perhaps (if that's even a thing yet)?
This sounds like an extremely painful issue since Star Citizen is not something I can test locally, and even if I could, this looks like some sort of memory corruption that bleeds into the Linux side or happens within the Nvidia driver itself, and problems like that are exceptionally hard to debug. This could be caused by literally anything as well, including the driver, the game itself, or DXVK.
Couple of things you can do:
- try to run the game with Vulkan validation layers enabled and post a full log, to see if we're doing anything obviously wrong
- try to bisect the issue on 510.x to see which change between 1.9.2 and 1.9.3 broke it on these drivers
- try to run the game through Valgrind and see if it complains about buffer overflows etc
Would also be interesting if the game still works on AMD drivers.
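The first two suggestions above could be sketched roughly as follows. This is a minimal sketch, not an official DXVK workflow: `dxvk_debug_env` is a hypothetical helper name, the log directory and `Game.exe` are placeholders, and it assumes the Khronos validation layer is installed on the host and that the upstream tags `v1.9.2` and `v1.9.3` mark the known-good and known-bad builds.

```shell
#!/bin/sh
# Hypothetical helper (not part of DXVK): prints the environment assignments
# for a launch with Vulkan validation and verbose DXVK logging enabled.
dxvk_debug_env() {
    log_dir="${1:-$PWD/dxvk-logs}"
    # Ask the Vulkan loader to enable the Khronos validation layer.
    printf '%s\n' "VK_INSTANCE_LAYERS=VK_LAYER_KHRONOS_validation"
    # Make DXVK log as much as possible and write its log files to log_dir.
    printf '%s\n' "DXVK_LOG_LEVEL=debug"
    printf '%s\n' "DXVK_LOG_PATH=$log_dir"
}

# Example launch (assumes a log path without spaces; Game.exe is a placeholder):
#   env $(dxvk_debug_env /tmp/dxvk-logs) wine Game.exe
#
# Bisecting between the two releases, assuming the tags v1.9.2 and v1.9.3:
#   git bisect start v1.9.3 v1.9.2   # bad revision first, then good
#   ...build and install this DXVK revision, then launch the game...
#   git bisect good                  # or: git bisect bad
#   git bisect reset                 # when finished
```

Printing the assignments instead of exporting them keeps the debug settings limited to a single launch, so a normal run afterwards is not slowed down by the validation layer.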
Might it be related to this, and the next few messages, in some way? Though not present in the logs I posted, I get the `corrupted size vs. prev_size` error when using newer Proton (and by extension DXVK; just tested 34fd16b) with a version of the game from 2019, from shortly Online Party was released, to versions from somewhere in (I think early) 2021. For versions after that, it acts like the log I posted a little bit above in the linked issue.
While trying to get Proton to leave a log file for this game version (I really don't know why it's not doing it despite me specifying the environment variables, but yeah), the game actually launched successfully one time, so I saved its stdout at least, in the event it might be useful.
corrupted_size_2019ver_1.9.4.txt
working_2019ver_1.9.4.txt
It seems like the `corrupted size vs. prev_size` occurs before `info: DXVK: Read 19441 valid state cache entries` is supposed to be printed?
I've been meaning to make an issue here as alasky suggested, but idk I guess I've been lazy. I'll try to do some more DXVK bisecting now. Oh and also it doesn't seem to crash on Windows at all, I tried it a few times, as the issue template said to try it.
Correction: it also prints `corrupted size vs. prev_size` on more recent versions of A Hat in Time.
> I'm not seeing any DXVK messages in that log. Did you disable logging, or does it crash before even calling into any of our code? Are you using the native Vulkan renderer perhaps (if that's even a thing yet)?
No, it's not a thing yet, and no, logging isn't disabled.
> Couple of things you can do:
> - try to bisect the issue on 510.x to see which change between 1.9.2 and 1.9.3 broke it on these drivers
Someone in the LUG did that and says this commit is the culprit: https://github.com/doitsujin/dxvk/commit/86148ec070628f5a89fbb0a91603bae2ce89529a
> - try to run the game through Valgrind and see if it complains about buffer overflows etc
I'm lacking time but I'll try to find a moment to do it.
> Would also be interesting if the game still works on AMD drivers.
Yes it does. And yes we think that the NV driver isn't innocent there. But we also think that DXVK does something specific to NV cards that triggers this crash (see the commit potentially responsible).
Appears that it broke at 86148ec. Previous commit works.
Was typing this at the same time as the above, oops. But yeah I just got to the same conclusion. @Loacoon1 do you happen to have a 10xx or older card to test with? afaik, at least in another game with the same error print, those still work after that commit. Not that I know much, just curious.
The code that was added quite literally doesn't do anything except enable some Vulkan extensions, it doesn't change DXVK's behaviour in any way at all as long as DLSS is not used.
Does https://github.com/doitsujin/dxvk/tree/sc-test help?
Yes, it fixes it. I'm at 5/5 successful launches (A Hat in Time).
Commenting out just the `devExtensions.nvxBinaryImport.setMode(DxvkExtMode::Optional);` line in the `if (enableCudaInterop)` block also works.
@Loacoon1 Thanks for writing the issue for this.
As for Star Citizen testing: the `sc-test` branch allows the game to run as normal, where 86148ec wouldn't.
@doesthisusername's one-liner comment also works for SC.
> The code that was added quite literally doesn't do anything except enable some Vulkan extensions, it doesn't change DXVK's behaviour in any way at all as long as DLSS is not used.
Our thought is that the commit might do something that uncovers an Nvidia driver bug.
Yeah, but the point that I don't quite get is... why the heck does this only happen in specific games that don't even use DLSS?
Yeah, it seems a bit strange. I looked up what `nvxBinaryImport` is, and it looks like it's a Vulkan extension that "allows applications to import CuBIN binaries and execute them."
`strings HatinTimeGame.exe | grep -i cuda` has one result, `?enableCudaAcceleration@Compressor@nvtt@@QEAAX_N@Z`. That might be a clue? I don't have other crashing games to look at though.
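For anyone who wants to run the same check against other crashing games, a small sketch along these lines might help. `count_cuda_strings` is a hypothetical helper name, and a non-zero count only shows that the binary mentions CUDA somewhere, not that it triggers the crash.

```shell
#!/bin/sh
# Hypothetical helper: for each file given, print the number of printable
# strings that mention "cuda" (case-insensitive), then a tab and the file name.
count_cuda_strings() {
    for exe in "$@"; do
        if command -v strings >/dev/null 2>&1; then
            # -a scans the whole file, not only the loadable sections.
            n=$(strings -a "$exe" | grep -i -c cuda || true)
        else
            # Rough fallback without binutils: split on non-printable bytes.
            n=$(tr -c '[:print:]' '\n' < "$exe" | grep -i -c cuda || true)
        fi
        printf '%s\t%s\n' "$n" "$exe"
    done
}

# Example (file names are placeholders):
#   count_cuda_strings HatinTimeGame.exe SomeOtherGame.exe
```

The `|| true` is there because `grep -c` exits with a non-zero status when it counts zero matches, which would otherwise abort a script running under `set -e`.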
Is it possible for me to manually apply this fix to Proton, before it gets merged? I'm running into the same A Hat in Time crashes with Proton Experimental, and running AHiT with this fix applied seems to fix the crash, though with no Steam features (since I'm not launching through Steam to test it).
@RealTheSunCat yeah, if you build Proton from source you can comment out that line in the DXVK submodule and `make install` it to Steam, that's what I'm doing. You can also compile just DXVK with the fix and overwrite Proton Experimental's DXVK files with the output, but then Steam's gonna replace those files every time it checks them since they're modified.
@doitsujin would you accept a PR to put disabling that behind an environment variable? It's clearly misbehavior on the GPU side, but it would be nice to enable people who aren't capable of manually building dxvk to use more recent versions again.
I'm also assuming someone has flagged this to nvidia by now?
I have been maintaining a DXVK build with the workaround on my fork that has been working for the Star Citizen LUG for some time now, if that might be interesting to anyone here.
Could any of you recheck on the newly released 520.56.06 Nvidia driver?
Edit: got someone in the proton DayZ issue to check and it's still a problem.
@Blisto91 Just tested it with Star Citizen. Unfortunately, it's still the same problem with 520.56.06 (non-open version) and an RTX 3060, using DXVK v1.10.3.
I have the same problem with Star Citizen. It works with dxvk-1.9.2 and earlier.
RTX 3080, driver version 520.56.06
Yes, a driver fix from Nvidia is needed afaik.
FWIW, can confirm that this issue is still affecting 525.60.11 (RTX 3060 Ti).
At least for AHIT, I made a semi-custom Proton version that uses symlinked files from whatever the latest Proton-GE version is (using Arch AUR's `proton-ge-custom-bin` package as my GE install), just with the relevant DXVK files replaced from @gnusenpai's fork, and the game boots consistently.
Is there any news about this? I'm currently facing the same issue with Watch Dogs 2. The game crashes semi-randomly. I tried downgrading DXVK to 1.9.2, but it is still happening. Do I need to downgrade the NVIDIA driver too? ~~Or is it a different issue?~~
Seems to be the same. Adding `__GL_THREADED_OPTIMIZATIONS=0` causes the error to change to:
free(): invalid next size (fast)
free(): invalid pointer
@TheBill2001 Not yet, other than that the Nvidia devs have been notified of the issue.
You should see `corrupted size vs. prev_size` in the log if it's the same issue. Without the variable, at least.
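To check a saved log for the messages mentioned in this thread, something like the following could be used. It is only a sketch: `find_heap_errors` is a hypothetical helper, and the pattern list is just the four glibc abort lines quoted here, not an exhaustive set.

```shell
#!/bin/sh
# Hypothetical helper: print every line (with its line number) in the given
# log files that contains one of the heap-corruption messages from this thread.
# Exits non-zero when none of them is found, like grep itself.
find_heap_errors() {
    grep -n -F \
        -e 'corrupted size vs. prev_size' \
        -e 'malloc(): invalid next size' \
        -e 'free(): invalid next size' \
        -e 'free(): invalid pointer' \
        -- "$@"
}

# Example (the log file name is a placeholder):
#   find_heap_errors game-stdout.txt && echo "looks like the same issue"
```

Using `-F` makes grep treat the parentheses and dots literally, so the messages can be pasted in unchanged.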
535 drivers have reached stable. Not every fix usually makes it to the release notes, so might be worth a try as it's a big one in general.
Gonna knock on wood as I say this, but I think this might be resolved in the current drivers? Using 535.86.05, and tested with both Proton-GE and Experimental; launched it like eight consecutive times and it seems to consistently launch. But I can only say this with regards to A Hat In Time for now. Someone else advise for Star Citizen if they can?
Thanks for reporting this. Though I can't test Star Citizen, A Hat in Time indeed works fine for me too now
@Loacoon1 are you able to give this a spin in Star Citizen with the latest Nvidia drivers? 🙂
@Loacoon1 Friendly ping