Random segmentation fault on start

Open Sidefix opened this issue 1 year ago • 5 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Is your issue described in the documentation?

  • [X] I have read the documentation

Is your issue present in the latest beta/pre-release?

This issue is present in the latest pre-release

Describe the Bug

Sunshine randomly segfaults on startup while reading the configuration file. I've only observed this on the second or a later startup of Sunshine after a machine reboot; the first startup after a fresh reboot never triggers it.

Expected Behavior

Sunshine should start up every time.

Additional Context

This has been occurring since at least 0.21.0 (the first version I ever used). It has occurred across multiple versions of my OS (Ubuntu 22.10, Ubuntu 23.04, Ubuntu 23.10 and now Ubuntu 24.04) and across multiple versions of the graphics drivers.

No configuration changes occur in between a failing startup and a successful startup.

The issue occurs randomly but relatively rarely: about 3 out of 10 startups segfault.

Retrying the startup almost always works immediately. Only a handful of times has the segfault occurred on two consecutive startups.
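
To put a firmer number on the failure rate, the startup could be repeated in a loop and the crashes counted. The following is only a rough sketch, not something from the Sunshine project: the function name `count_segfaults`, the 15-second window, and the idea that "still running after the window" means "started successfully" are all assumptions made for illustration. Note that each run ends with the process being killed, which is similar to the manual `kill -9` restarts.

```python
import signal
import subprocess


def count_segfaults(cmd, runs=10, timeout_s=15.0):
    """Start `cmd` `runs` times and return how many runs died from SIGSEGV.

    Hypothetical helper: the names and the timeout value are assumptions.
    """
    crashes = 0
    for _ in range(runs):
        try:
            result = subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            # Still running after timeout_s: treat this as a successful
            # startup. subprocess.run has already killed the child here.
            continue
        # On POSIX, a return code of -N means the child was killed by signal N.
        if result.returncode == -signal.SIGSEGV:
            crashes += 1
    return crashes
```

For example, `count_segfaults(["sunshine", "sunshine.conf"], runs=10)` would be run from the config directory after a reboot, and again after a few manual restarts, to compare the two cases.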

Host Operating System

Linux

Operating System Version

24.04

Architecture

64 bit

Sunshine commit or version

v2024.730.191523

Package

Linux - deb

GPU Type

Nvidia

GPU Model

RTXA2000

GPU Driver/Mesa Version

560.28.03

Capture Method

NvFBC (Linux)

Config

log_path = /home/pi/sunshine/configs/sunshine.log
nv_preset = p1
origin_web_ui_allowed = pc
credentials_file = /home/pi/sunshine/configs/sunshine_state.json
nvenc_spatial_aq = enabled
file_apps = /home/pi/sunshine/configs/apps.json
resolutions = [
    1440x900
]
min_log_level = 1
file_state = /home/pi/sunshine/configs/sunshine_state.json
encoder = nvenc
nvenc_preset = 7
fps = [60]
nv_rc = vbr
capture = nvfbc
native_pen_touch = disabled
channels = 3
sunshine_name = Pi
global_prep_cmd = [{"do":"","undo":""}]
high_resolution_scrolling = disabled
nvenc_twopass = full_res

Apps

No response

Relevant log output

pi@pi:~/sunshine/configs$ sunshine sunshine.conf
[nvenc_twopass] -- [full_res]
[high_resolution_scrolling] -- [disabled]
[global_prep_cmd] -- [[{"do":"","undo":""}]]
[sunshine_name] -- [Pi]
[channels] -- [3]
[native_pen_touch] -- [disabled]
[capture] -- [nvfbc]
[log_path] -- [/home/pi/sunshine/configs/sunshine.log]
[nv_preset] -- [p1]
[min_log_level] -- [1]
[origin_web_ui_allowed] -- [pc]
[credentials_file] -- [/home/pi/sunshine/configs/sunshine_state.json]
[nvenc_spatial_aq] -- [enabled]
[file_apps] -- [/home/pi/sunshine/configs/apps.json]
[resolutions] -- [[
    1440x900
]]
[file_state] -- [/home/pi/sunshine/configs/sunshine_state.json]
[encoder] -- [nvenc]
[nvenc_preset] -- [7]
[fps] -- [[60]]
[nv_rc] -- [vbr]
Warning: Unrecognized configurable option [nv_preset]
Warning: Unrecognized configurable option [nv_rc]
[2024:08:02:12:24:13]: Info: Sunshine version: v2024.730.191523
Segmentation fault (core dumped)

Aug 02 '24 09:08 Sidefix

Some additional context: I do end up killing the sunshine process with some regularity for testing a separate issue I have: #2614

Aug 02 '24 09:08 Sidefix

> Some additional context: I do end up killing the sunshine process with some regularity

I think the config file is locked by the other process and when you kill it, it doesn't release immediately?

Aug 02 '24 15:08 ReenigneArcher

@ReenigneArcher I've had it occur even with a long break between killing the original process and starting a new instance. I don't do it programmatically, I do it manually.

If I do end up killing the original process, I usually do so from the terminal with kill -9 or ctrl+C from the running window.

I.e. I don't think the previous process still has any hooks that could cause this when I usually restart. I'll keep that in mind in my repros though.

Edit: sorry if it's a bit confusing; what I'm trying to say effectively is that I kill the process in such a way that it shouldn't still have the file locked regardless, and I usually have enough of a delta time between killing the old one and starting the new one that it doesn't really make sense to me.

Aug 02 '24 15:08 Sidefix

Please try to run under gdb and get a backtrace of the crash.
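
One way to capture such a backtrace without typing into an interactive gdb prompt is batch mode. The sketch below is an illustration only: the helper names `build_gdb_command` and `run_under_gdb` and the log file name are made up, and it assumes `gdb` is installed and on the PATH. The gdb options used (`-batch`, `-ex`, `--args`) and the commands `run` and `thread apply all bt` are standard.

```python
import subprocess


def build_gdb_command(program_args):
    """Return an argv list that runs a program under gdb in batch mode.

    gdb executes `run`; once the program stops (for example on SIGSEGV),
    it prints a backtrace for every thread and then exits.
    """
    return [
        "gdb",
        "-batch",
        "-ex", "run",
        "-ex", "thread apply all bt",
        "--args", *program_args,
    ]


def run_under_gdb(program_args, log_file="gdb_backtrace.txt"):
    """Run the program under gdb and write gdb's output to log_file (assumed name)."""
    with open(log_file, "w") as log:
        completed = subprocess.run(
            build_gdb_command(program_args),
            stdout=log,
            stderr=subprocess.STDOUT,
        )
    return completed.returncode
```

Calling `run_under_gdb(["sunshine", "sunshine.conf"])` would block for as long as Sunshine keeps running, so it only produces a backtrace on a startup that actually crashes.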

Aug 03 '24 20:08 cgutman

@cgutman here you go, I hope I did this correctly, I've never used gdb before:

Thread 4 "sunshine" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x731cec800000 (LWP 80512)]
__GI_getenv (name=0x731cee1ae017 "XAUTHORITY") at ./stdlib/getenv.c:31
warning: 31	./stdlib/getenv.c: No such file or directory
(gdb) backtrace
#0  __GI_getenv (name=0x731cee1ae017 "XAUTHORITY") at ./stdlib/getenv.c:31
#1  0x0000731cee1ac6fb in XauFileName () from /lib/x86_64-linux-gnu/libXau.so.6
#2  0x0000731cee1acd56 in XauGetBestAuthByAddr ()
   from /lib/x86_64-linux-gnu/libXau.so.6
#3  0x0000731cee728d1d in xcb_connect_to_display_with_auth_info ()
   from /lib/x86_64-linux-gnu/libxcb.so.1
#4  0x0000731cf0f223ca in _XConnectXCB () from /lib/x86_64-linux-gnu/libX11.so.6
#5  0x0000731cf0f130fe in XOpenDisplay () from /lib/x86_64-linux-gnu/libX11.so.6
#6  0x0000731cf1776ebc in ?? () from /lib/x86_64-linux-gnu/libgdk-3.so.0
#7  0x0000731cf1721397 in gdk_display_manager_open_display ()
   from /lib/x86_64-linux-gnu/libgdk-3.so.0
#8  0x0000731cf2e0238a in gtk_init_check () from /lib/x86_64-linux-gnu/libgtk-3.so.0
#9  0x00005dcaf3d54b87 in ?? ()
#10 0x00005dcaf3d03ece in ?? ()
#11 0x0000731cf1ceabb4 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#12 0x0000731cf189ca94 in start_thread (arg=<optimized out>)
    at ./nptl/pthread_create.c:447
#13 0x0000731cf1929c3c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
(gdb) 

Aug 07 '24 07:08 Sidefix

It seems this issue hasn't had any activity in the past 90 days. If it's still something you'd like addressed, please let us know by leaving a comment. Otherwise, to help keep our backlog tidy, we'll be closing this issue in 10 days. Thanks!

Nov 06 '24 10:11 LizardByte-bot

This issue was closed because it has been stalled for 10 days with no activity.

Nov 17 '24 10:11 LizardByte-bot