
RetroAchievements causes RetroArch to crash after loading a second game

Open Jobima1st opened this issue 2 years ago • 24 comments

Description

RetroArch consistently crashes when loading a second game only when achievements are enabled. When achievements are disabled it loads content perfectly fine after having already loaded content.

Steps to reproduce the bug

  1. login with retroachievements
  2. load any game with any core
  3. load another game
  • RetroArch: [version/commit] 1.14.0 x64 from 7z pack for win 7-11

Environment information

  • OS: windows 11

Jobima1st avatar Dec 18 '22 03:12 Jobima1st

I can reproduce on the stable 1.14.0 Windows x64 downloaded from libretro.com. I cannot reproduce with a msvc2019 release build of the v1.14.0 tag.

Attaching a debugger shows this for the stacktrace:

Unhandled exception at 0x00007FFA6E644673 (mswsock.dll) in retroarch.exe: 0xC0000005: Access violation writing location 0x0000000000000020.
>	mswsock.dll!00007ffa6e644673()	Unknown
 	mswsock.dll!00007ffa6e64d691()	Unknown
 	mswsock.dll!00007ffa6e6410d8()	Unknown
 	kernel32.dll!00007ffa70347614()	Unknown
 	ntdll.dll!00007ffa717e26a1()	Unknown

Clarified steps to reproduce:

  1. Set up achievements
  2. Start a game with achievements
  3. When the game banner shows up (you have earned X of Y achievements), close the content
  4. Reload the same game
  5. Crash

The issue occurs even if I leave the game running for several minutes to ensure the background initialization task completes.

As I can't reproduce in the debugger, I'm not sure how to diagnose/repair. Also, as the problem doesn't occur when I build it myself, I can't bisect.

Jamiras avatar Dec 18 '22 16:12 Jamiras

I've been having this issue for a while now, and forgot to report it. I just grabbed stable 1.10.0 Windows 64bit build right now because I remember not having this issue somewhere back. I did the steps and it's not crashing when reloading or swapping games. I'm using the same config files from 1.14, so comparing the 1.10 build might help find the issue.

bizarf avatar Dec 19 '22 14:12 bizarf

That's a good idea. I do have older releases lying around. The behavior starts appearing in 1.11.0. It does not occur in 1.10.3.

Unfortunately, there are over 1000 commits over five months between those versions: https://github.com/libretro/RetroArch/compare/v1.10.3...v1.11.0

Jamiras avatar Dec 19 '22 15:12 Jamiras

I accidentally discovered something strange on 1.14. I loaded up a NES game with the Mesen core and decided to swap to another NES game. It loaded up fine. Starting a new session with an arcade game on the FinalBurn Neo core also lets me reload it, or swap to games on other cores, just fine.

I didn't mention above, but my crashing was happening with GBA games on the mGBA core. Oddly, GBA games are loading fine after I started a NES game with the Mesen core on my current session. If I shut down Retroarch, then load a GBA game with mGBA, then RA will crash if I reload it. The crashing will also happen with GB and GBC games with the Gambatte core, and with Genesis games with the Genesis Plus GX core. I also tried loading a NES game with the FCEUmm core and it's crashing when trying to reload it or swap to a different game. I swapped the core back to Mesen for that NES game and it's not crashing.

I'm not sure if it works only for me, but loading a NES game with Mesen, or an arcade game with FinalBurn Neo on a new session might be a temporary workaround for some reason.

bizarf avatar Dec 22 '22 02:12 bizarf

Seen over on Reddit too:

https://www.reddit.com/r/RetroArch/comments/1085v4m/retroachievements_crashing_retroarch_when_closing/

retroNUC avatar Jan 11 '23 18:01 retroNUC

Tested some different builds myself, definitely some sort of undefined behaviour difference between MinGW (broken) vs MSVC (working).

Suppose I'd have to get the MinGW debugger up and running to pinpoint the exact spot, but running an ASan MSVC build certainly shows some runtime problems in rc_libretro.c that could be cleaned up - Lots of memcmp usage for string comparisons instead of strcmp/strncmp.
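To illustrate the memcmp point: comparing a short string against a longer literal with a fixed length makes memcmp read past the end of the shorter buffer, which is what ASan reports. A minimal sketch of the safer pattern follows. The helper names `ext_equals` and `has_prefix` are hypothetical and are not taken from rc_libretro.c.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helpers for illustration; not taken from rc_libretro.c. */

/* Risky pattern: memcmp(ext, "bin", 4) always reads 4 bytes from ext,
 * even when ext is "a" (2 bytes including the terminating NUL). */

/* Whole-string comparison: strcmp stops at the first NUL in either string. */
static bool ext_equals(const char *ext, const char *wanted)
{
   return strcmp(ext, wanted) == 0;
}

/* Prefix comparison: strncmp also stops at a NUL, so a string shorter
 * than the prefix is handled without reading past its end. */
static bool has_prefix(const char *s, const char *prefix)
{
   return strncmp(s, prefix, strlen(prefix)) == 0;
}
```

With these, `ext_equals("b", "bin")` and `has_prefix("cd", "cdrom:")` both return false without touching memory beyond the shorter string.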

retroNUC avatar Jan 11 '23 20:01 retroNUC

Thread 23 received signal SIGSEGV, Segmentation fault.
0x00007ffa6a8db277 in WSPStartup () from C:\Windows\system32\mswsock.dll

Caused by the rcheevos_async_begin_request in rcheevos_client_identify_game on the second game boot.

First request: image

Second request (causes segfault): image

My best guess is that data/userdata/datacopy is being freed early, then being dereferenced on the network task.

Side note - Spotted that "rcheevos_locals.load_info.hashes_tried" isn't being reset between sessions, that could probably lead to some funkiness.

retroNUC avatar Jan 12 '23 00:01 retroNUC

Bisected to PR https://github.com/libretro/RetroArch/pull/14351 / Commit https://github.com/libretro/RetroArch/commit/e45958b25a140cbd549cfb8771395f3bcd922e48

(Network) Get rid of the timeout_enable parameter for socket_connect

retroNUC avatar Jan 14 '23 12:01 retroNUC

The difference between MinGW crashing and MSVC working is that the latter doesn't actually have SSL implemented.

On the second boot, something is calling WSAStartup() (can't tell what's spawning this on a new thread) at the same time that WSAPoll() is being called in socket_connect_with_timeout(), which is what the above commit changed the HTTP socket code to use instead of socket_connect(). On _WIN32 platforms, that last function wasn't actually doing any timeouts at all, despite the request in the parameter.

In any case, both SSL and non-SSL now go through socket_connect_with_timeout() and only the SSL one crashes, which makes me think something's not being reset/shutdown correctly for the next session.

retroNUC avatar Jan 14 '23 20:01 retroNUC

Honestly can't tell whether I hate debugging network code or multithreaded code more.

It's a dangling async task for CHEEVOS_ASYNC_FETCH_HARDCORE_USER_UNLOCKS from the first run that's causing the problems once socket_connect_with_timeout() starts being used from the second run's rcheevos async tasks.

Actually, all of the first run's async tasks seem to be sticking around until the game shutdown, which manages to clean up all of them apart from the hardcore data. Would have thought all of those threads would have cleaned themselves up as soon as their HTTP requests were finished.
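One common way to make a leftover request harmless is to tag each request with the session that queued it and drop the result if that session has ended. This is a minimal single-threaded sketch of that idea, not RetroArch's task API. Every name in it (`current_session`, `request_t`, `request_begin`, `session_end`, `request_complete`) is hypothetical, and real code would need locking around the shared counter.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical names throughout; not RetroArch's actual task API. */
static unsigned current_session = 0;

typedef struct
{
   unsigned session; /* session that queued this request */
} request_t;

/* Queue a request and remember which session it belongs to. */
static request_t *request_begin(void)
{
   request_t *req = (request_t *)calloc(1, sizeof(*req));
   if (req)
      req->session = current_session;
   return req;
}

/* Called when content is closed: older requests become stale. */
static void session_end(void)
{
   current_session++;
}

/* Completion callback: applies the result only if the request still
 * belongs to the current session. Always frees the request.
 * Returns true if the result was applied. */
static bool request_complete(request_t *req, int *state, int value)
{
   bool applied = false;
   if (req->session == current_session)
   {
      *state  = value;
      applied = true;
   }
   free(req);
   return applied;
}
```

A request that finishes after `session_end()` is freed without touching the new session's state, which is the property the dangling CHEEVOS_ASYNC_FETCH_HARDCORE_USER_UNLOCKS task appears to lack.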

retroNUC avatar Jan 15 '23 19:01 retroNUC

@retroNUC How do we proceed from here? When you revert that PR locally does it work properly then? If so, we can start figuring out a way we can fundamentally fix this.

LibretroAdmin avatar Jan 24 '23 15:01 LibretroAdmin

It's the addition of this code that's causing the problem (thanks for bisecting): https://github.com/libretro/RetroArch/blob/e45958b25a140cbd549cfb8771395f3bcd922e48/libretro-common/net/net_socket_ssl_mbed.c#L118-L126 which was added by #14351: https://github.com/libretro/RetroArch/pull/14351/files#diff-e68eaa44ad3da41536e0a0bc37e9e75f13750d7d4c29fd6cb54305b98e736f9fR225-R237

I'm not a socket programmer, so I don't know what the actual issue is, but passing false for timeout_enable in net_http_new_socket does seem to fix the issue (as it avoids the referenced new code):

#ifdef _WIN32
      if (ssl_socket_connect(conn->sock_state.ssl_ctx, addr, false, true)
            < 0)
#else
      if (ssl_socket_connect(conn->sock_state.ssl_ctx, addr, true, true)
            < 0)
#endif

Note that it was passing true prior to the changes in #14351. The function was just ignoring the parameter.

I don't know if this change would have any unintended side effects. Perhaps it's better to continue passing true and just make net_socket_ssl_mbed.c ignore the parameter as it had been before.

Jamiras avatar Jan 24 '23 19:01 Jamiras

Submitted the temp fix after speaking to @LibretroAdmin, pretty much along the lines of what @Jamiras suggested.

Keep this issue open until we fix properly, though?

retroNUC avatar Jan 24 '23 22:01 retroNUC

Same thing here

Immersion95 avatar Sep 07 '23 06:09 Immersion95

I'm encountering a similar issue. Attached is one of the logs. retroarch__2024_12_10__22_15_14.log

GoodLuckTrying avatar Dec 11 '24 03:12 GoodLuckTrying

This problem persists in the latest nightly Switch build but is absent in stable 1.20.0.

ZeROOFALL avatar Feb 15 '25 09:02 ZeROOFALL

Ahhh just ran into this exact same issue on the nightly libnx/Switch version of RetroArch, can confirm it only happens when RetroAchievements are enabled.

lreeves avatar Feb 26 '25 15:02 lreeves

@lreeves are you using Vulkan as video driver? It stopped once I switched to OpenGL.

GoodLuckTrying avatar Feb 26 '25 15:02 GoodLuckTrying

Nope opengl. I'm going to switch away from Nightly though since 1.20.0 is fresh enough for me :)

lreeves avatar Feb 26 '25 15:02 lreeves

This indeed happens on my modded switch lite after updating RetroArch to 1.21.0

LA208602-Gouthiere avatar May 21 '25 16:05 LA208602-Gouthiere

I had 1.20.0 on Nintendo Switch until 2 weeks ago and it's been happening (Only when RetroAchievements are on) since I updated as well.

GoodLuckTrying avatar May 21 '25 17:05 GoodLuckTrying

I have 1.20 on my switch, and the issue is there as well

SimpliOP avatar May 27 '25 18:05 SimpliOP

I have the same issue on 1.20 on my switch

TomGUN02 avatar Jun 03 '25 16:06 TomGUN02

I actually think I have 1.19.1, and it crashes on there too.

SimpliOP avatar Jun 03 '25 22:06 SimpliOP

Been seeing this same issue on Switch. I tried versions 1.21, 1.20, and 1.13.1 and experienced the crash on all versions. Disabling RetroAchievements fixes the issue, but I would like to have achievements on.

blakeheyman avatar Jun 30 '25 18:06 blakeheyman

Can't believe this is still an issue 5 years down

Niibhan avatar Aug 20 '25 22:08 Niibhan

Pretty sure it's two different issues. I'm also pretty sure I caused the second, and it should be fixed in the nightly.

warmenhoven avatar Aug 21 '25 02:08 warmenhoven

> Pretty sure it's two different issues. I'm also pretty sure I caused the second, and it should be fixed in the nightly.

Changing the video driver to OpenGL fixes the issue but then you lose the ability to turn on HDR because of the absence of vulkan

Niibhan avatar Aug 21 '25 10:08 Niibhan