corrupted size vs. prev_size with Nvidia 495+ and DXVK 1.9.3+

Open Loacoon1 opened this issue 3 years ago • 19 comments

Star Citizen works very well with DXVK, except in some specific situations. With Nvidia 470.x, it works with basically any version of DXVK. With Nvidia 495.x, it doesn't work with any DXVK version. With Nvidia 510.x, it works with DXVK 1.9.2 and earlier, but not with DXVK 1.9.3+. Only one line in the attached log seems relevant: corrupted size vs. prev_size

We tried every game and Wine setting with the LUG, with no difference except for one: when changing __GL_THREADED_OPTIMIZATIONS to 0, the error line changes to: malloc(): invalid next size (unsorted). We are not sure whether these error messages are related or are two completely different errors.

Software information

Star Citizen.

System information

  • GPU: RTX 2070
  • Driver: 510.47.03
  • Wine version: 7.1
  • DXVK version: 1.9.3+

log SC.txt

Loacoon1 avatar Feb 18 '22 17:02 Loacoon1

I'm not seeing any DXVK messages in that log, did you disable logging or does it crash before even calling into any of our code? Are you using the native Vulkan renderer perhaps (if that's even a thing yet)?

This sounds like an extremely painful issue since Star Citizen is not something I can test locally, and even if I could, this looks like some sort of memory corruption that bleeds into the Linux side or happens within the Nvidia driver itself, and problems like that are exceptionally hard to debug. This could be caused by literally anything as well, including the driver, the game itself, or DXVK.

Couple of things you can do:

  • try to run the game with Vulkan validation layers enabled and post a full log, to see if we're doing anything obviously wrong
  • try to bisect the issue on 510.x to see which change between 1.9.2 and 1.9.3 broke it on these drivers
  • try to run the game through Valgrind and see if it complains about buffer overflows etc
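For the validation-layer suggestion above, a minimal sketch of how the launch environment could be prepared. It assumes the Khronos validation layer is installed and that the Vulkan loader honours `VK_INSTANCE_LAYERS`; the helper name `debug_env` is hypothetical, and the actual Wine/Proton launch command is deliberately left out.

```python
import os
from typing import Dict, Mapping, Optional


def debug_env(
    base: Optional[Mapping[str, str]] = None,
    threaded_optimizations: Optional[bool] = None,
) -> Dict[str, str]:
    """Return a copy of the environment with debugging variables set.

    `debug_env` is a hypothetical helper, not part of DXVK or Wine.
    """
    env = dict(os.environ if base is None else base)
    # Ask the Vulkan loader to enable the Khronos validation layer
    # (assumes the layer is installed; this overwrites any existing value).
    env["VK_INSTANCE_LAYERS"] = "VK_LAYER_KHRONOS_validation"
    # Make DXVK log as much as it can.
    env["DXVK_LOG_LEVEL"] = "debug"
    # Optionally pin the Nvidia driver setting mentioned in the report.
    if threaded_optimizations is not None:
        env["__GL_THREADED_OPTIMIZATIONS"] = "1" if threaded_optimizations else "0"
    return env
```

The returned dictionary can be passed as `env=` to `subprocess.run` together with whatever command normally starts the game.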

Would also be interesting if the game still works on AMD drivers.

doitsujin avatar Feb 18 '22 17:02 doitsujin

Might it be related to this, and the next few messages, in some way? Though not present in the logs I posted, I get the corrupted size vs. prev_size error when using newer Proton (and by extension DXVK; just tested 34fd16b) with versions of the game from 2019, from around the time Online Party was released, to versions from somewhere in (I think early) 2021. For versions after that, it acts like the log I posted a little bit above in the linked issue.

While trying to get Proton to leave a log file for this game version (I really don't know why it's not doing it despite me specifying the environment variables, but yeah), the game actually launched successfully one time, so I saved its stdout at least, in the event it might be useful.

corrupted_size_2019ver_1.9.4.txt working_2019ver_1.9.4.txt

It seems like the corrupted size vs. prev_size occurs before info: DXVK: Read 19441 valid state cache entries is supposed to be printed?

I've been meaning to make an issue here as alasky suggested, but idk I guess I've been lazy. I'll try to do some more DXVK bisecting now. Oh and also it doesn't seem to crash on Windows at all, I tried it a few times, as the issue template said to try it.

Correction: it also prints corrupted size vs. prev_size on more recent versions of A Hat in Time.

doesthisusername avatar Feb 18 '22 20:02 doesthisusername

I'm not seeing any DXVK messages in that log, did you disable logging or does it crash before even calling into any of our code? Are you using the native Vulkan renderer perhaps (if that's even a thing yet)?

No, it's not a thing yet, and no, logging isn't disabled.

Couple of things you can do:

  • try to bisect the issue on 510.x to see which change between 1.9.2 and 1.9.3 broke it on these drivers

Someone in the LUG did it and says that this commit is the culprit: https://github.com/doitsujin/dxvk/commit/86148ec070628f5a89fbb0a91603bae2ce89529a
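For readers unfamiliar with bisecting, the idea is a binary search over the commit history between a known-good and a known-bad build. The sketch below shows only the search logic; `first_bad` and `is_bad` are hypothetical names, and in practice `git bisect` does this for you. Because the crash is memory corruption and the game has occasionally launched fine on an affected build, `is_bad` should probably try several launches before calling a commit good.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest bad commit.

    Assumes commits are ordered oldest to newest, commits[0] is known good,
    commits[-1] is known bad, and every commit after the first bad one is
    also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # breakage is at mid or earlier
        else:
            lo = mid  # breakage is after mid
    return commits[hi]
```

With n commits between the two releases, this needs roughly log2(n) test builds instead of n.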

  • try to run the game through Valgrind and see if it complains about buffer overflows etc

I'm lacking time but I'll try to find a moment to do it.

Would also be interesting if the game still works on AMD drivers.

Yes, it does. And yes, we think that the NV driver isn't innocent here. But we also think that DXVK does something specific to NV cards that triggers this crash (see the commit potentially responsible).

Loacoon1 avatar Feb 18 '22 22:02 Loacoon1

Appears that it broke at 86148ec. Previous commit works.

Was typing this at the same time as the above, oops. But yeah I just got to the same conclusion. @Loacoon1 do you happen to have a 10xx or older card to test with? afaik, at least in another game with the same error print, those still work after that commit. Not that I know much, just curious.

doesthisusername avatar Feb 18 '22 22:02 doesthisusername

The code that was added quite literally doesn't do anything except enable some Vulkan extensions, it doesn't change DXVK's behaviour in any way at all as long as DLSS is not used.

Does https://github.com/doitsujin/dxvk/tree/sc-test help?

doitsujin avatar Feb 18 '22 23:02 doitsujin

Yes, it fixes it. I'm at 5/5 successful launches (A Hat in Time).

doesthisusername avatar Feb 19 '22 00:02 doesthisusername

Commenting out just the devExtensions.nvxBinaryImport.setMode(DxvkExtMode::Optional); line in the if (enableCudaInterop) block also works.

doesthisusername avatar Feb 19 '22 00:02 doesthisusername

@Loacoon1 Thanks for writing the issue for this.

As for Star Citizen testing: the sc-test branch allows the game to run as normal, where 86148ec wouldn't. @doesthisusername's one-liner comment also works for SC.

gnusenpai avatar Feb 19 '22 04:02 gnusenpai

The code that was added quite literally doesn't do anything except enable some Vulkan extensions, it doesn't change DXVK's behaviour in any way at all as long as DLSS is not used.

Our thought is that the commit might do something that uncovers a Nvidia driver bug.

Loacoon1 avatar Feb 19 '22 12:02 Loacoon1

Yeah, but the point that I don't quite get is... why the heck does this only happen in specific games that don't even use DLSS?

doitsujin avatar Feb 19 '22 12:02 doitsujin

Yeah, it seems a bit strange. I looked up what nvxBinaryImport is, and it looks like it's a Vulkan extension that "allows applications to import CuBIN binaries and execute them." strings HatinTimeGame.exe | grep -i cuda has one result, ?enableCudaAcceleration@Compressor@nvtt@@QEAAX_N@Z. That might be a clue? I don't have other crashing games to look at though.
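To run the same check on other crashing games where `strings` and `grep` aren't handy, here is a rough Python equivalent of `strings <file> | grep -i <needle>`. It is a sketch, not a drop-in replacement: `find_strings` is a hypothetical helper that only looks at runs of printable ASCII of at least four bytes.

```python
import re
from typing import List


def find_strings(data: bytes, needle: str, min_len: int = 4) -> List[str]:
    """Return printable-ASCII runs in `data` that contain `needle`,
    compared case-insensitively."""
    # Runs of at least `min_len` printable ASCII bytes (space through tilde).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    wanted = needle.lower()
    hits = []
    for match in pattern.finditer(data):
        text = match.group().decode("ascii")
        if wanted in text.lower():
            hits.append(text)
    return hits
```

Usage would be along the lines of `find_strings(open("HatinTimeGame.exe", "rb").read(), "cuda")`.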

doesthisusername avatar Feb 19 '22 13:02 doesthisusername

Is it possible for me to manually apply this fix to Proton, before it gets merged? I'm running into the same A Hat in Time crashes with Proton Experimental, and running AHiT with this fix applied seems to fix the crash, though with no Steam features (since I'm not launching through Steam to test it).

TheSunCat avatar Apr 01 '22 14:04 TheSunCat

@RealTheSunCat yeah, if you build Proton from source you can comment out that line in the DXVK submodule and make install it to Steam, that's what I'm doing. You can also compile just DXVK with the fix and overwrite Proton Experimental's DXVK files with the output, but then Steam's gonna replace those files every time it checks them since they're modified.

doesthisusername avatar Apr 01 '22 19:04 doesthisusername

@doitsujin would you accept a PR to put disabling that behind an environment variable? It's clearly misbehavior on the GPU side, but it would be nice to enable people who aren't capable of manually building dxvk to use more recent versions again.

I'm also assuming someone has flagged this to nvidia by now?

TLATER avatar May 28 '22 12:05 TLATER

I have been maintaining a DXVK build with the workaround on my fork that has been working for the Star Citizen LUG for some time now, if that might be interesting to anyone here.

gnusenpai avatar May 28 '22 19:05 gnusenpai

Could any of you recheck on the newly released 520.56.06 Nvidia driver?

Edit: got someone in the proton DayZ issue to check and it's still a problem.

Blisto91 avatar Oct 12 '22 20:10 Blisto91

@Blisto91 Just tested it with Star Citizen. Unfortunately, it's still the same problem with 520.56.06 (non-open version) and an RTX 3060, using dxvk v1.10.3.

char32 avatar Oct 19 '22 10:10 char32

I have the same problem with Star Citizen. It works with dxvk-1.9.2 and earlier.

RTX3080 Driver Version: 520.56.06

Nix-id avatar Nov 09 '22 12:11 Nix-id

Yes, a driver fix from Nvidia is needed, afaik.

Blisto91 avatar Nov 09 '22 12:11 Blisto91

FWIW, can confirm that this issue is still affecting 525.60.11 (RTX 3060 Ti).

At least for AHIT, I made a semi-custom Proton version that uses symlinked files from whatever the latest Proton-GE version is (using Arch AUR's proton-ge-custom-bin package as my GE install), just with the relevant DXVK files replaced from @gnusenpai's fork, and the game boots consistently.

SeongGino avatar Dec 26 '22 00:12 SeongGino

Is there any news about this? I'm currently facing the same issue with Watch Dogs 2. The game crashes semi-randomly. I tried to downgrade DXVK to 1.9.2, but it is still happening. Do I need to downgrade the NVIDIA driver too? ~~Or is it a different issue?~~

Seems to be the same. Adding __GL_THREADED_OPTIMIZATIONS=0 causes the error to change to:

free(): invalid next size (fast)
free(): invalid pointer

TheBill2001 avatar Mar 28 '23 04:03 TheBill2001

@TheBill2001 Not yet, other than that the Nvidia devs have been notified of the issue.

You should see corrupted size vs. prev_size in the log if it's the same issue. Without the variable at least.
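A quick way to check a saved log for this is to search it for the glibc heap-corruption messages quoted in this thread. The sketch below is illustrative only: `heap_corruption_lines` is a hypothetical helper, and the marker list covers just the messages reported here, not every abort message glibc can print.

```python
from typing import List, Tuple

# Messages reported in this thread, with and without
# __GL_THREADED_OPTIMIZATIONS=0.
HEAP_CORRUPTION_MARKERS = (
    "corrupted size vs. prev_size",
    "malloc(): invalid next size",
    "free(): invalid next size",
    "free(): invalid pointer",
)


def heap_corruption_lines(log_text: str) -> List[Tuple[int, str]]:
    """Return (line number, line) pairs for lines containing a known marker."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(marker in line for marker in HEAP_CORRUPTION_MARKERS):
            hits.append((number, line.strip()))
    return hits
```

An empty result doesn't rule the issue out, since the message may only appear on stdout/stderr rather than in the log file.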

Blisto91 avatar Apr 06 '23 01:04 Blisto91

The 535 drivers have reached stable. Not every fix makes it into the release notes, so it might be worth a try, as it's a big release in general.

Blisto91 avatar Jun 14 '23 15:06 Blisto91

Gonna knock on wood as I say this, but I think this might be resolved in the current drivers? Using 535.86.05, and tested with both Proton-GE and Experimental; launched it like eight consecutive times and it seems to consistently launch. But I can only say this with regards to A Hat In Time for now. Someone else advise for Star Citizen if they can?

SeongGino avatar Jul 22 '23 05:07 SeongGino

Thanks for reporting this. Though I can't test Star Citizen, A Hat in Time indeed works fine for me too now

Gigas002 avatar Jul 22 '23 16:07 Gigas002

@Loacoon1 are you able to give this a spin in Star Citizen with the latest Nvidia drivers? 🙂

Blisto91 avatar Aug 05 '23 14:08 Blisto91

@Loacoon1 Friendly ping

Blisto91 avatar Sep 06 '23 06:09 Blisto91