Flycast libretro Core segfaults when using Vulkan

Open Vamp898 opened this issue 2 months ago • 23 comments

Platform / OS / Hardware: Gentoo, x86_64

Flycast version: master (20251014)

Hardware: AMD Ryzen 5700G

Description of the Issue

When running Flycast standalone, everything works as expected. I switch to Vulkan and the games still play.

With Flycast's libretro core, RetroArch crashes when starting a game (only with Vulkan).

Debugging Steps Tested

I tried building both RetroArch and Flycast from source (master); nothing changed.

Logs Gathered

[libretro INFO] core/reios/reios.cpp:633 N[REIOS]: -----------------
[libretro INFO] core/reios/reios.cpp:634 N[REIOS]: REIOS: Booting up
[libretro INFO] core/reios/reios.cpp:635 N[REIOS]: -----------------
[INFO] [Environ] SET_GEOMETRY: 640x480, Aspect: 1.333.
[libretro ERROR] core/linux/common.cpp:68 E[COMMON]: SIGSEGV @ 0x55dcec37bc52 invalid access to (nil)
[libretro ERROR] Fatal error : segfault
 in fault_handler -> /home/vamp898/flycast/core/linux/common.cpp : 81
[libretro ERROR] shell/libretro/libretro.cpp:3545 E[COMMON]: DEBUGBREAK!
Illegal instruction        ./retroarch
Thread 25 "Flycast-emu" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff955cb6c0 (LWP 1091)]
0x00007fff80c507c5 in bm_GetCodeByVAddr(unsigned int) () from /home/vamp898/.config/retroarch/cores/flycast_libretro.so
(gdb) bt
#0  0x00007fff80c507c5 in bm_GetCodeByVAddr(unsigned int) () at /home/vamp898/.config/retroarch/cores/flycast_libretro.so
#1  0x00007fff80c62036 in ??? () at /home/vamp898/.config/retroarch/cores/flycast_libretro.so
#2  0x0000000000000000 in ??? ()

Vamp898 avatar Oct 14 '25 06:10 Vamp898

Flycast uses SIGSEGV signals during normal operation, so a SIGSEGV is not necessarily an error. To catch actual errors, set gdb to ignore these signals and use a breakpoint instead:

handle SIGSEGV nostop noprint
break core/linux/common.cpp:81
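
For context on why a SIGSEGV can be benign: a common emulator technique is to write-protect memory pages and treat the resulting fault as a notification rather than a crash. The sketch below illustrates that general idea on Linux. It is not Flycast's actual fault handler, and the names `onSegv` and `writeToProtectedPage` are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <signal.h>
#include <sys/mman.h>
#include <unistd.h>

namespace {
volatile char *g_page = nullptr;      // the write-protected page (illustrative)
std::size_t g_pageSize = 0;
volatile sig_atomic_t g_faults = 0;   // number of expected faults handled

// Assumed handler shape: faults inside our page are expected and repaired,
// anything else is treated as a genuine crash.
void onSegv(int, siginfo_t *info, void *) {
    char *addr = static_cast<char *>(info->si_addr);
    char *base = const_cast<char *>(g_page);
    if (base != nullptr && addr >= base && addr < base + g_pageSize) {
        g_faults = g_faults + 1;
        // Not on the POSIX async-signal-safe list, but a plain syscall on Linux.
        mprotect(base, g_pageSize, PROT_READ | PROT_WRITE);
        return;                       // the faulting store is retried
    }
    _exit(1);                         // unexpected address: real error
}
} // namespace

// Maps a read-only page, writes to it, and returns how many faults were
// handled along the way (-1 on setup failure or if the write was lost).
int writeToProtectedPage() {
    g_faults = 0;
    g_pageSize = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    assert(g_pageSize > 0);
    void *mem = mmap(nullptr, g_pageSize, PROT_READ,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;
    g_page = static_cast<volatile char *>(mem);

    struct sigaction sa = {};
    struct sigaction old = {};
    sa.sa_sigaction = onSegv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0) {
        munmap(mem, g_pageSize);
        g_page = nullptr;
        return -1;
    }

    g_page[0] = 42;                   // faults once, handler unprotects, store retried
    int result = (g_page[0] == 42) ? static_cast<int>(g_faults) : -1;

    sigaction(SIGSEGV, &old, nullptr);  // restore the previous handler
    munmap(mem, g_pageSize);
    g_page = nullptr;
    return result;
}
```

Running a program like this under gdb with default settings stops on the expected fault as well, which is why the `handle SIGSEGV nostop noprint` setting above is useful when looking for the real crash.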

flyinghead avatar Oct 14 '25 07:10 flyinghead

[libretro ERROR] shell/libretro/libretro.cpp:3545 E[COMMON]: DEBUGBREAK!

Thread 1 "retroarch" received signal SIGILL, Illegal instruction.
0x00007fff7c7520fe in os_DebugBreak() () from /home/vamp898/.config/retroarch/cores/flycast_libretro.so
(gdb) bt
#0  0x00007fff7c7520fe in os_DebugBreak() () at /home/vamp898/.config/retroarch/cores/flycast_libretro.so
#1  0x00007fff7d7cb167 in fault_handler(int, siginfo_t*, void*) () at /home/vamp898/.config/retroarch/cores/flycast_libretro.so
#2  0x00007ffff2285600 in <signal handler called> () at /usr/lib64/libc.so.6
#3  0x0000555555b00c52 in spv::Builder::addName(unsigned int, char const*) ()
#4  0x0000555555b0c1f8 in spv::Builder::makeFunctionEntry(spv::Decoration, unsigned int, char const*, std::vector<unsigned int, std::allocator<unsigned int> > const&, std::vector<std::vector<spv::Decoration, std::allocator<spv::Decoration> >, std::allocator<std::vector<spv::Decoration, std::allocator<spv::Decoration> > > > const&, spv::Block**) ()
#5  0x0000555555b122aa in spv::Builder::makeEntryPoint(char const*) ()
#6  0x0000555555af435e in glslang::GlslangToSpv(glslang::TIntermediate const&, std::vector<unsigned int, std::allocator<unsigned int> >&, spv::SpvBuildLogger*, glslang::SpvOptions*) ()
#7  0x0000555555af56c6 in glslang::GlslangToSpv(glslang::TIntermediate const&, std::vector<unsigned int, std::allocator<unsigned int> >&, glslang::SpvOptions*) ()
#8  0x00007fff7d8a5d0c in ShaderCompiler::Compile(vk::ShaderStageFlagBits, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () at /home/vamp898/.config/retroarch/cores/flycast_libretro.so
#9  0x00007fff7d8bda11 in ShaderManager::compileQuadVertexShader(bool) () at /home/vamp898/.config/retroarch/cores/flycast_libretro.so
#10 0x00007fff7d8bbccd in QuadPipeline::CreatePipeline() () at /home/vamp898/.config/retroarch/cores/flycast_libretro.so
#11 0x00007fff7d8ea15c in VulkanContext::PresentFrame(vk::Image, vk::ImageView, vk::Extent2D const&, float) ()
    at /home/vamp898/.config/retroarch/cores/flycast_libretro.so
#12 0x00007fff7d8dd64d in BaseVulkanRenderer::presentFramebuffer() () at /home/vamp898/.config/retroarch/cores/flycast_libretro.so
#13 0x00007fff7cc359e2 in ??? () at /home/vamp898/.config/retroarch/cores/flycast_libretro.so
#14 0x00007fff7cc35d97 in rend_single_frame(bool const&) () at /home/vamp898/.config/retroarch/cores/flycast_libretro.so
#15 0x00007fff7c6cab09 in Emulator::render() () at /home/vamp898/.config/retroarch/cores/flycast_libretro.so
#16 0x00007fff7c753d5b in retro_run () at /home/vamp898/.config/retroarch/cores/flycast_libretro.so
#17 0x00005555556939ab in core_run ()
#18 0x0000555555697915 in runloop_iterate ()
#19 0x00005555556894a8 in rarch_main ()
#20 0x00007ffff226860b in ??? () at /usr/lib64/libc.so.6
#21 0x00007ffff22686ba in __libc_start_main () at /usr/lib64/libc.so.6
#22 0x0000555555679f95 in _start ()

Vamp898 avatar Oct 14 '25 07:10 Vamp898

I've seen this bug before with glslang since both Flycast and RetroArch use it and there are potential conflicts between the two versions. See #992. Make sure to compile flycast with its embedded version (USE_HOST_GLSLANG=OFF, the default). RetroArch has a similar option: ./configure --enable-builtinglslang

Which version of RetroArch are you using by the way? I can't reproduce the issue with RA 1.19.0.

flyinghead avatar Oct 14 '25 08:10 flyinghead

Thanks for the feedback.

I just rebuilt the Flycast libretro core with the above-mentioned flag explicitly off:

cmake -DUSE_HOST_GLSLANG=OFF -DLIBRETRO=on ..

I copied the core into the cores folder and tried again: same crash.

I then rebuilt RetroArch (latest stable, so 1.21.0) with ./configure --enable-builtinglslang and tried again: same crash.

So sadly this did not help. Out of curiosity I tried the exact opposite, using USE_HOST_GLSLANG=ON when building Flycast: same crash, nothing changes.

I then tried the same with RetroArch, building with --disable-builtinglslang, but configure reported that it cannot find the system glslang on Linux, even though it is installed (possibly a version conflict), so that doesn't work.

So changing these flags in Flycast does not change the situation. Changing them in RetroArch causes the loss of Vulkan support, because it no longer finds any glslang to begin with.

I tried again with gl (which also uses GLSL) and glcore (which uses Slang). Both work fine; only vulkan doesn't.

Vamp898 avatar Oct 14 '25 08:10 Vamp898

Upgraded to 1.21.0 and still no crash on my side (Ubuntu 22). Both RA and the core are using an embedded glslang lib.

flyinghead avatar Oct 14 '25 13:10 flyinghead

That is generally positive. The question is, what is causing the crash then? If there is anything I can do to isolate or debug that, I am willing to help.

I had this issue with 1.20, so I guess it is not specific to 1.21. I do not have this issue with any other core. It occurs only with the combination of RetroArch and Flycast.

But other people in the original bug report have the same issue, so I assume I am not alone.

If there is anything I can help with, I am willing to test things. For me personally, using the Flycast interface directly is an okay workaround; I have no issue with that. It also works fine with glcore in RetroArch.

I'd love to help where I can.

Vamp898 avatar Oct 14 '25 13:10 Vamp898

I would be more positive if I could reproduce the issue locally. I tried the released builds of both RA and the core to no avail.

It's clearly an issue with glslang, possibly some conflict between the two embedded versions. If you have the same issue with the retroarch and core released binaries then a build issue can be ruled out.
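
One observation consistent with such a conflict: in the backtrace posted earlier in this thread, the glslang frames (spv::Builder::addName up to glslang::GlslangToSpv) have addresses in the same 0x0000555555... range as core_run and rarch_main, not in the 0x00007fff7d... range of flycast_libretro.so. That suggests, but does not prove, that the core's call ended up in RetroArch's copy of glslang. A minimal sketch of how this could be checked from inside a process is shown below, using the glibc `dladdr` call. The helper names `moduleOf`, `functionUnderTest` and `whereIsFunctionUnderTest` are hypothetical and not part of Flycast or RetroArch.

```cpp
#include <cassert>
#include <dlfcn.h>   // dladdr, Dl_info (GNU extension; older glibc needs -ldl)
#include <string>

// Returns the path of the loaded object (executable or shared library) that
// contains addr, or an empty string if the dynamic loader cannot tell.
std::string moduleOf(const void *addr) {
    Dl_info info = {};
    if (dladdr(addr, &info) == 0 || info.dli_fname == nullptr)
        return std::string();
    return info.dli_fname;
}

// Hypothetical stand-in for the function whose origin is in question,
// for example glslang::GlslangToSpv as seen from inside the core.
void functionUnderTest() {}

// Asks the loader which object the function's address belongs to.
std::string whereIsFunctionUnderTest() {
    return moduleOf(reinterpret_cast<const void *>(&functionUnderTest));
}
```

If a check like this were placed in the core and pointed at the glslang entry point, a result naming the retroarch executable rather than flycast_libretro.so would indicate that the symbol was bound to the frontend's copy.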

flyinghead avatar Oct 14 '25 14:10 flyinghead

I was first using the RetroArch from the package manager with the core provided directly by RetroArch. The only reason I built it from source myself was to debug this issue.

Well, I hoped that building from source might already fix it, but using it to debug the issue was my second-best guess.

To avoid the package manager version, I downloaded the latest RetroArch from the homepage, installed the core using RetroArch's online update feature, set the video driver to vulkan, and got the same crash.

Vamp898 avatar Oct 14 '25 15:10 Vamp898

I have the same problem with the Flycast RetroArch core under GroovyArcade (Arch distro):

[libretro ERROR] core/linux/common.cpp:68 E[COMMON]: SIGSEGV @ (nil) invalid access to (nil)
[libretro ERROR] Fatal error : segfault
 in fault_handler -> /builds/libretro/flycast-upstream/core/linux/common.cpp : 81

turric4n avatar Oct 18 '25 18:10 turric4n

Hi,

I got this log with debug builds. For RA I used ./configure --disable-qt && DEBUG=1 make -j$(nproc), and for Flycast cmake -DLIBRETRO=on -DCMAKE_BUILD_TYPE=Debug ..

#0  fault_handler (sn=11, si=0x7fffffffbf70, segfault_ctx=0x7fffffffbe40) at /tmp/flycast/core/linux/common.cpp:81
#1  0x00007ffff3c3e540 in <signal handler called> () at /usr/lib/libc.so.6
#2  0x0000555555db9a84 in glslang::TIntermediate::getTreeRoot (this=0x2) at deps/glslang/glslang/glslang/MachineIndependent/localintermediate.h:404
#3  0x0000555555db4b67 in glslang::GlslangToSpv (intermediate=..., spirv=std::vector of length 0, capacity 0, logger=0x7fffffffceb0, options=0x0)
    at deps/glslang/glslang/SPIRV/GlslangToSpv.cpp:6948
#4  0x0000555555db4acb in glslang::GlslangToSpv (intermediate=..., spirv=std::vector of length 0, capacity 0, options=0x0) at deps/glslang/glslang/SPIRV/GlslangToSpv.cpp:6942
#5  0x00007fffdd70e178 in GLSLtoSPV
    (shaderType=vk::ShaderStageFlagBits::eVertex, glslShader="#version 430\n#define ROTATE 0\n\nlayout (location = 0) in vec3 in_pos;\nlayout (location = 1) in vec2 in_uv;\n\nlayout (location = 0) out vec2 outUV;\n\nvoid main()\n{\n#if ROTATE == 0\n\tgl_Position = vec4(in_p"..., spvShader=std::vector of length 0, capacity 0) at /tmp/flycast/core/rend/vulkan/compiler.cpp:103
#6  0x00007fffdd70e25c in ShaderCompiler::Compile
    (shaderStage=vk::ShaderStageFlagBits::eVertex, shaderText="#version 430\n#define ROTATE 0\n\nlayout (location = 0) in vec3 in_pos;\nlayout (location = 1) in vec2 in_uv;\n\nlayout (location = 0) out vec2 outUV;\n\nvoid main()\n{\n#if ROTATE == 0\n\tgl_Position = vec4(in_p"...) at /tmp/flycast/core/rend/vulkan/compiler.cpp:110
#7  0x00007fffdd740b9a in ShaderManager::compileQuadVertexShader (this=0x5555574db040, rotate=false) at /tmp/flycast/core/rend/vulkan/shaders.cpp:790
#8  0x00007fffdd73e845 in ShaderManager::GetQuadVertexShader (this=0x5555574db040, rotate=false) at /tmp/flycast/core/rend/vulkan/shaders.h:131
#9  0x00007fffdd73615e in QuadPipeline::CreatePipeline (this=0x55555723ece0) at /tmp/flycast/core/rend/vulkan/quad.cpp:110
#10 0x00007fffdd776a9d in QuadPipeline::GetPipeline (this=0x55555723ece0) at /tmp/flycast/core/rend/vulkan/quad.h:110
#11 0x00007fffdd7769e0 in QuadPipeline::BindPipeline (this=0x55555723ece0, commandBuffer=...) at /tmp/flycast/core/rend/vulkan/quad.h:100
#12 0x00007fffdd799b92 in VulkanContext::PresentFrame (this=0x7fffde9cf6a0 <theVulkanContext>, image=..., imageView=..., extent=..., aspectRatio=1.33333349)
    at /tmp/flycast/core/rend/vulkan/vk_context_lr.cpp:409
#13 0x00007fffdd77defb in BaseVulkanRenderer::presentFramebuffer (this=0x555557492fe0) at /tmp/flycast/core/rend/vulkan/vulkan_renderer.cpp:230
#14 0x00007fffdd77ee6e in VulkanRenderer::Present (this=0x555557492fe0) at /tmp/flycast/core/rend/vulkan/vulkan_renderer.cpp:311
#15 0x00007fffdca20c7b in PvrMessageQueue::present (this=0x7fffdfa60840 <pvrQueue>) at /tmp/flycast/core/hw/pvr/Renderer_if.cpp:234
#16 0x00007fffdca20993 in PvrMessageQueue::execute (this=0x7fffdfa60840 <pvrQueue>, msg=...) at /tmp/flycast/core/hw/pvr/Renderer_if.cpp:164
#17 0x00007fffdca205f1 in PvrMessageQueue::waitAndExecute (this=0x7fffdfa60840 <pvrQueue>, timeoutMs=20) at /tmp/flycast/core/hw/pvr/Renderer_if.cpp:105
#18 0x00007fffdca1edab in rend_single_frame (enabled=@0x7fffffffdbf7: true) at /tmp/flycast/core/hw/pvr/Renderer_if.cpp:260
#19 0x00007fffdc43776d in Emulator::render (this=0x7fffde9c5a00 <emu>) at /tmp/flycast/core/emulator.cpp:1059
#20 0x00007fffdc4ca402 in retro_run () at /tmp/flycast/shell/libretro/libretro.cpp:1218
#21 0x000055555564da23 in core_run () at runloop.c:8065
#22 0x000055555564c658 in runloop_iterate () at runloop.c:7405
#23 0x000055555563573d in rarch_main (argc=4, argv=0x7fffffffdf08, data=0x0) at retroarch.c:6050
#24 0x000055555563579f in main (argc=4, argv=0x7fffffffdf08) at retroarch.c:6182

gouchi avatar Oct 23 '25 15:10 gouchi

It looks like the same issue. What's your distribution and version?

flyinghead avatar Oct 23 '25 15:10 flyinghead

I am running Arch (updated) with RA from latest commit fc1acf4 and Flycast latest from main branch bf2bd7e.

gouchi avatar Oct 23 '25 15:10 gouchi

Not sure if this is related to this bug, but with further testing I found another bug.

When I go fullscreen in Flycast (not libretro, the actual application), it just freezes hard. I cannot even kill it with CTRL+C; I have to send SIGKILL to get rid of it.

This does not happen using OpenGL. Playing in a window works fine; only fullscreen is the issue.

This happens in the menu too, so even before loading any game. As soon as I go fullscreen, it's just frozen.

Sadly there is no output in the log, but when trying to kill it with CTRL+C, I get this backtrace:

(gdb) bt
#0  0x00007ffff7772fff in ioctl () from /usr/lib64/libc.so.6
#1  0x00007fffcd4cb1e0 in drmIoctl () from /usr/lib64/libdrm.so.2
#2  0x00007fffcd4d017b in drmSyncobjTimelineWait () from /usr/lib64/libdrm.so.2
#3  0x00007fff9a8b3b88 in wsi_drm_wait_for_explicit_sync_release () from /usr/lib64/libvulkan_radeon.so
#4  0x00007fff9a8b801a in x11_acquire_next_image () from /usr/lib64/libvulkan_radeon.so
#5  0x00007fff9a8b051f in wsi_AcquireNextImage2KHR () from /usr/lib64/libvulkan_radeon.so
#6  0x00007fff9a8b04a3 in wsi_AcquireNextImageKHR () from /usr/lib64/libvulkan_radeon.so
#7  0x0000555556b1058b in ?? ()
#8  0x0000555556b1d618 in ?? ()
#9  0x0000555556944235 in ?? ()
#10 0x000055555694d087 in ?? ()
#11 0x0000555556964514 in ?? ()
#12 0x0000555556964748 in ?? ()
#13 0x00005555556a1846 in main ()

I'm not sure if this is helpful, as it was captured when trying to SIGINT the application, not during runtime. During runtime it just freezes; it doesn't crash.

For reference, nothing changed with RetroArch 1.22.0 either: it still crashes.

Vamp898 avatar Nov 12 '25 14:11 Vamp898

I fixed this issue recently, which may be related: https://github.com/flyinghead/flycast/issues/1620 Make sure to use the latest master build.

flyinghead avatar Nov 12 '25 14:11 flyinghead

I check out the latest master daily.

It's in the git log too.

I removed the build directory and did a fresh, clean rebuild, but the freeze is still there. In that case it is a different bug and not related to this.

Vamp898 avatar Nov 12 '25 23:11 Vamp898

Can you install the Vulkan SDK and run Flycast standalone with validation layers enabled? Going fullscreen might trigger an error, which should be logged.

flyinghead avatar Nov 13 '25 08:11 flyinghead

I am not sure, but I assume it's the vulkan-layers package?

I hope those are the correct ones ^^ If so, this is the result:

vamp898@VampBook ~ $ ./flycast/build/flycast 
00:00:000 sdl/sdl.cpp:772 N[RENDERER]: Monitor refresh rate: 100 Hz (2560 x 1440)
00:00:007 rend/vulkan/vulkan_context.cpp:271 N[RENDERER]: Vulkan API 1.1. Device AMD Radeon Graphics (RADV RENOIR)
00:00:007 rend/vulkan/vulkan_context.cpp:425 N[RENDERER]: Device extension enabled: VK_KHR_swapchain
00:00:007 rend/vulkan/vulkan_context.cpp:428 N[RENDERER]: Device extension unavailable: VK_KHR_portability_subset
00:00:007 rend/vulkan/vulkan_context.cpp:425 N[RENDERER]: Device extension enabled: VK_EXT_provoking_vertex
00:00:010 rend/vulkan/vulkan_context.h:326 N[RENDERER]: Using depth format D32SfloatS8Uint tiling Optimal
00:00:014 ui/gui.cpp:322 N[RENDERER]: Screen DPI is 108, size 1540 x 1046. Scaling by 1.00
00:00:097 rend/vulkan/vulkan_renderer.cpp:240 N[RENDERER]: VulkanRenderer::Init
00:00:100 sdl/dreamlink.cpp:86 N[INPUT]: GUID: 0300c6b1790000001100000011010000 VID:0079 PID:0011
00:00:100 sdl/sdl_gamepad.cpp:220 N[INPUT]: SDL: Opened joystick 0 on port 0: 'SWITCH CO.,LTD. USB Gamepad' unique_id=sdl_joystick_0
00:00:156 rend/vulkan/vulkan_context.h:326 N[RENDERER]: Using depth format D32SfloatS8Uint tiling Optimal
00:03:267 rend/vulkan/vulkan_context.h:326 N[RENDERER]: Using depth format D32SfloatS8Uint tiling Optimal
00:03:290 rend/vulkan/vulkan_context.h:326 N[RENDERER]: Using depth format D32SfloatS8Uint tiling Optimal
Validation Error: [ VUID-vkAcquireNextImageKHR-semaphore-01286 ] | MessageID = 0xe9e4b2a9
vkAcquireNextImageKHR(): Semaphore must not be currently signaled.
The Vulkan spec states: If semaphore is not VK_NULL_HANDLE, it must be unsignaled (https://docs.vulkan.org/spec/latest/chapters/VK_KHR_surface/wsi.html#VUID-vkAcquireNextImageKHR-semaphore-01286)
Objects: 1
    [0] VkSemaphore 0xe400000000e4

Validation Error: [ VUID-vkAcquireNextImageKHR-surface-07783 ] | MessageID = 0xad0e15f6
vkAcquireNextImageKHR(): Application has already previously acquired 1 image from swapchain. Only 1 is available to be acquired using a timeout of UINT64_MAX (given the swapchain has 3, and VkSurfaceCapabilitiesKHR::minImageCount is 3).
The Vulkan spec states: If forward progress cannot be guaranteed for the surface used to create the swapchain member of pAcquireInfo, timeout must not be UINT64_MAX (https://docs.vulkan.org/spec/latest/chapters/VK_KHR_surface/wsi.html#VUID-vkAcquireNextImageKHR-surface-07783)
Objects: 1
    [0] VkSwapchainKHR 0xd200000000d2

Validation Error: [ VUID-vkAcquireNextImageKHR-semaphore-01286 ] | MessageID = 0xe9e4b2a9
vkAcquireNextImageKHR(): Semaphore must not be currently signaled.
The Vulkan spec states: If semaphore is not VK_NULL_HANDLE, it must be unsignaled (https://docs.vulkan.org/spec/latest/chapters/VK_KHR_surface/wsi.html#VUID-vkAcquireNextImageKHR-semaphore-01286)
Objects: 1
    [0] VkSemaphore 0xe400000000e4

Validation Error: [ VUID-vkAcquireNextImageKHR-surface-07783 ] | MessageID = 0xad0e15f6
vkAcquireNextImageKHR(): Application has already previously acquired 2 images from swapchain. Only 1 is available to be acquired using a timeout of UINT64_MAX (given the swapchain has 3, and VkSurfaceCapabilitiesKHR::minImageCount is 3).
The Vulkan spec states: If forward progress cannot be guaranteed for the surface used to create the swapchain member of pAcquireInfo, timeout must not be UINT64_MAX (https://docs.vulkan.org/spec/latest/chapters/VK_KHR_surface/wsi.html#VUID-vkAcquireNextImageKHR-surface-07783)
Objects: 1
    [0] VkSwapchainKHR 0xd200000000d2

Vamp898 avatar Nov 13 '25 08:11 Vamp898

Spot on. Thanks

flyinghead avatar Nov 13 '25 09:11 flyinghead

I might have found the issue. Can you test a small change? In core/rend/vulkan/vulkan_driver.h line 75, add context->resize(); in the catch block:

		} catch (const InvalidVulkanContext&) {
			context->resize();
		}

In core/rend/vulkan/vulkan_context.cpp line 1091, add resized = true; in the catch block:

		} catch (const InvalidVulkanContext&) {
			resized = true;
		}
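
To illustrate why flagging a resize in the catch block can help, here is a toy model of the pattern: a failed acquisition marks the context so that the swapchain is rebuilt before the next frame. It makes no Vulkan calls and is not Flycast's code. `MockContext`, `renderFrame` and the local `InvalidVulkanContext` type are hypothetical stand-ins, and the model replaces the hang seen in vkAcquireNextImageKHR with a simple repeated failure.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for the exception type named in the thread.
struct InvalidVulkanContext : std::runtime_error {
    InvalidVulkanContext() : std::runtime_error("swapchain is out of date") {}
};

// Hypothetical mock: models only the swapchain life cycle.
struct MockContext {
    bool resized = false;     // set when a rebuild has been requested
    bool outOfDate = false;   // true while the swapchain no longer fits the window
    int generation = 0;       // how many times the swapchain was rebuilt
    int framesPresented = 0;

    // Simulates a window change such as going fullscreen.
    void invalidate() { outOfDate = true; }

    // Simulates image acquisition: fails while the swapchain is out of date.
    void acquire() {
        if (outOfDate)
            throw InvalidVulkanContext();
        ++framesPresented;
    }

    // Rebuilds the swapchain if a previous frame requested it.
    void recreateIfNeeded() {
        if (!resized)
            return;
        outOfDate = false;
        resized = false;
        ++generation;
    }
};

// Renders one frame. markResized selects whether the catch block requests
// a rebuild (the suggested change) or silently swallows the error.
void renderFrame(MockContext &ctx, bool markResized) {
    ctx.recreateIfNeeded();
    try {
        ctx.acquire();
    } catch (const InvalidVulkanContext &) {
        if (markResized)
            ctx.resized = true;
    }
}

// Usage: with the flag set, the frame after the failure succeeds again.
void demo() {
    MockContext ctx;
    ctx.invalidate();
    renderFrame(ctx, true);   // fails and requests a rebuild
    renderFrame(ctx, true);   // rebuilds, then presents
    assert(ctx.framesPresented == 1 && ctx.generation == 1);
}
```

In this model, swallowing the exception without setting the flag leaves every later frame failing in the same way, which loosely mirrors a context that never recovers after going fullscreen.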

flyinghead avatar Nov 13 '25 10:11 flyinghead

This fixed the fullscreen issue in the actual application for me, yes. I can go to fullscreen and back without issues now.

Of course I tested whether this also affects the libretro core, but that one still crashes. So sadly that issue was not related.

Vamp898 avatar Nov 13 '25 11:11 Vamp898

Great news! Thank you.

This code isn't used by the RA core because this is handled by the frontend (RetroArch) so it's normal not to see any improvement there.

flyinghead avatar Nov 13 '25 11:11 flyinghead

Thank you for your hard work in the form of fast and consistent commits!

Vamp898 avatar Nov 13 '25 12:11 Vamp898

Fix pushed on master

EDIT: fix for the fullscreen freeze with standalone, not the original libretro issue.

flyinghead avatar Nov 14 '25 10:11 flyinghead