Waybar icon indicating copy to clipboard operation
Waybar copied to clipboard

hyprland modules: Segmentation fault when quitting waybar

Open akdevservices opened this issue 1 year ago • 5 comments

There are segmentation faults (core dumped) when quitting Waybar with active hyprland modules (I tried hyprland/language and hyprland/window). Modules seem to work fine.

Steps to reproduce:

  1. Configure a hyprland module.
  2. Start waybar from the shell.
  3. Press Ctrl+C: ^C[2023-08-07 21:35:57.630] [info] Quitting.
  4. Press Ctrl+C once again: ^CSegmentation fault (core dumped)

Arch Linux, Waybar ver. 0.9.20 Hyprland ver. 0.27.2

akdevservices avatar Aug 07 '23 18:08 akdevservices

I noticed the same problem recently with hyprland/workspaces.

Using AUR packages:

  • waybar-hyprland-git 0.9.20.r64.gb084bf72-1
  • hyprland-nvidia-git 0.28.0.r26.gfe9453c6-1

zjeffer avatar Aug 10 '23 18:08 zjeffer

Work-around: switch to another window or workspace (to create a Hyprland IPC event).

I think the problem is following:

  1. SIGINT/Ctrl+C handler called, we break out of the main loop and exit the main thread.

  2. The spdlog sink gets destructed.

  3. The program is stuck terminating because of blocking fgets() call in Hyprland backend's thread. https://github.com/Alexays/Waybar/blob/58e506a6754fae213e409bcd079d38d03f70b958/src/modules/hyprland/backend.cpp#L68C56-L68C56

  4. SIGINT/Ctrl+C handler called again, we try to log, but the spdlog sink is already destructed.

stack trace:

#0  std::__atomic_base<int>::load(std::memory_order) const (__m=std::memory_order_relaxed, this=0x565771e824bb)
    at /usr/include/c++/13.2.1/bits/atomic_base.h:503
#1  spdlog::sinks::sink::should_log(spdlog::level::level_enum) const
    (this=0x565771e824b3, msg_level=spdlog::level::info)
    at /usr/src/debug/spdlog/spdlog-1.12.0/include/spdlog/sinks/sink-inl.h:14
#2  0x00007f485ba4fbfc in spdlog::logger::sink_it_(spdlog::details::log_msg const&) (this=0x565214c96d40, msg=...)
    at /usr/include/c++/13.2.1/bits/shared_ptr_base.h:1665
#3  0x00007f485ba520aa in spdlog::logger::log_it_(spdlog::details::log_msg const&, bool, bool)
    (this=0x565214c96d40, log_msg=..., log_enabled=<optimized out>, traceback_enabled=false)
    at /usr/src/debug/spdlog/spdlog-1.12.0/include/spdlog/logger-inl.h:170
#4  0x0000565212dead5b in main::{lambda(int)#3}::_FUN(int) ()
#5  0x00007f4859d6f710 in <signal handler called> () at /usr/lib/libc.so.6
#6  0x00007f4859dba699 in futex_wait (private=0, expected=2, futex_word=0x7f4838000d00)
    at ../sysdeps/nptl/futex-internal.h:146
#7  __GI___lll_lock_wait_private (futex=0x7f4838000d00) at lowlevellock.c:34
#8  0x00007f4859db7d85 in __GI__IO_flush_all () at genops.c:698
#9  0x00007f4859db815a in _IO_cleanup () at genops.c:843
#10 0x00007f4859d71dc8 in __run_exit_handlers
    (status=0, listp=0x7f4859f09680 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:132
#11 0x00007f4859d71e10 in __GI_exit (status=<optimized out>) at exit.c:141
#12 0x00007f4859d58cd7 in __libc_start_call_main
    (main=main@entry=0x565212da4760 <main>, argc=argc@entry=3, argv=argv@entry=0x7ffcdce26eb8)
    at ../sysdeps/nptl/libc_start_call_main.h:74
#13 0x00007f4859d58d8a in __libc_start_main_impl
    (main=0x565212da4760 <main>, argc=3, argv=0x7ffcdce26eb8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffcdce26ea8) at ../csu/libc-start.c:360
#14 0x0000565212db4205 in _start ()

staticssleever668 avatar Oct 08 '23 12:10 staticssleever668

I am seeing the same problem, and can confirm that switching to a different workspace, or doing something to create a Hyprland IPC event allows waybar to exit.

digitalnotions avatar May 02 '24 17:05 digitalnotions

I was observing this issue from 2 months and today i found that hyprland/window is causing this issue, if i remove this from module array then there is no issue in quitting waybar.

config:


"hyprland/workspaces": {
	"format": "<span font='18px'>{icon}</span>",
	"format-icons": {
		"1": "१",
		"2": "२",
		"3": "३",
		"4": "४",
		"5": "५",
                "6": "६",
		"default": "" 
	},
       "persistent-workspaces": {
         "*": [1, 2, 3, 4, 5, 6 ]
       }
},

ashish-kus avatar Jun 03 '24 08:06 ashish-kus

@akdevservices i tried hyprland/workspaces and hyprland/languages also but for me all of them produces same error, can you confirm if somehow you managed to fix it.

ashish-kus avatar Jun 03 '24 08:06 ashish-kus

I spent some time looking into this. I think the root cause is a bit more fundamental. Basically, I think the signal handling logic in main.cpp is, I'm sorry to say, quite unsound. (Mixing signals and threads correctly is very hard.)

(Apologies for the long explanation; I don't know how familiar people are with all this stuff.)

Fundamentally, there's not much you can do in a signal handler without causing undefined behavior. A signal handler may be called when some lock is held, including locks internal to libc (or possibly the allocator).

Quoting from the C++ reference:

The behavior is undefined if [...] the signal handler calls any function within the standard library, except

  • std::abort
  • std::_Exit
  • std::quick_exit
  • std::signal with the first argument being the number of the signal currently handled (async handler can re-register itself, but not other signals).

In addition to that, any static variables you read or write need to be std::atomic<>.

In real life this means that a signal arriving at an inopportune time likely causes a crash because, among other things,

  • The signal handlers for SIGUSR2 and SIGINT call logging functions which may cause allocations and file I/O, none of which is signal handler safe
  • They may cause Client::reset() to be called from the signal handler context, which does all kinds of things
  • It calls handleSignal() for the bars from the signal handler contexts, and those probably call library routines

If we're talking about POSIX, you can do a bit more (but still not things like memory allocation): https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04_03

The sigwait() approach in signalThread is likely much safer, although even there reap.empty() is called without the mutex held, which is not safe.


I think a lot of this headache comes from the decision to mix signals and threads, but I realize the signals are now a fundamental part of the interface. I haven't read the entire code, but I think the signal handling code is mostly limited to main.cpp? I know nobody asked me, but I suspect the cleanest way to untangle this would actually be to have two processes; call them main and worker (I'll let you bikeshed the names).

  • Have an undocumented, internal only command line option (--worker). Or, if you want, separate these into different binaries.

In main():

  • If no --worker was passed, this is the user starting waybar, the main process that listens to the signals. In that case:
  1. fork and exec self with the --worker command line flag. Establish some kind of a communication channel between these processes. It may be pipes or file descriptors or dbus or really anything except Posix signals.
  2. Enter a loop waiting for signals with sigwait(). The only functionality here is to pass the information about the received signals to the worker over that non-signal channel.

The worker, on the other hand:

  1. Realizes in main() that it's the worker because it got the --worker flag.
  2. Does all the wayland stuff, runs the bars, etc.
  3. Listens to the communication channel, reacting to the information about the signals.

sliedes avatar Jul 14 '24 03:07 sliedes