Waybar
Waybar copied to clipboard
hyprland modules: Segmentation fault when quitting waybar
There are segmentation faults (core dumped) when quitting Waybar with active hyprland modules (I tried hyprland/language
and hyprland/window
). Modules seem to work fine.
Steps to reproduce:
- Configure a hyprland module.
- Start waybar from the shell.
- Press Ctrl+C:
^C[2023-08-07 21:35:57.630] [info] Quitting.
- Press Ctrl+C once again:
^CSegmentation fault (core dumped)
Arch Linux, Waybar ver. 0.9.20 Hyprland ver. 0.27.2
I noticed the same problem recently with hyprland/workspaces
.
Using AUR packages:
-
waybar-hyprland-git 0.9.20.r64.gb084bf72-1
-
hyprland-nvidia-git 0.28.0.r26.gfe9453c6-1
Work-around: switch to another window or workspace (to create a Hyprland IPC event).
I think the problem is following:
-
SIGINT/Ctrl+C handler called, we break out of the main loop and exit the main thread.
-
The spdlog sink gets destructed.
-
The program is stuck terminating because of blocking fgets() call in Hyprland backend's thread. https://github.com/Alexays/Waybar/blob/58e506a6754fae213e409bcd079d38d03f70b958/src/modules/hyprland/backend.cpp#L68C56-L68C56
-
SIGINT/Ctrl+C handler called again, we try to log, but the spdlog sink is already destructed.
stack trace:
#0 std::__atomic_base<int>::load(std::memory_order) const (__m=std::memory_order_relaxed, this=0x565771e824bb)
at /usr/include/c++/13.2.1/bits/atomic_base.h:503
#1 spdlog::sinks::sink::should_log(spdlog::level::level_enum) const
(this=0x565771e824b3, msg_level=spdlog::level::info)
at /usr/src/debug/spdlog/spdlog-1.12.0/include/spdlog/sinks/sink-inl.h:14
#2 0x00007f485ba4fbfc in spdlog::logger::sink_it_(spdlog::details::log_msg const&) (this=0x565214c96d40, msg=...)
at /usr/include/c++/13.2.1/bits/shared_ptr_base.h:1665
#3 0x00007f485ba520aa in spdlog::logger::log_it_(spdlog::details::log_msg const&, bool, bool)
(this=0x565214c96d40, log_msg=..., log_enabled=<optimized out>, traceback_enabled=false)
at /usr/src/debug/spdlog/spdlog-1.12.0/include/spdlog/logger-inl.h:170
#4 0x0000565212dead5b in main::{lambda(int)#3}::_FUN(int) ()
#5 0x00007f4859d6f710 in <signal handler called> () at /usr/lib/libc.so.6
#6 0x00007f4859dba699 in futex_wait (private=0, expected=2, futex_word=0x7f4838000d00)
at ../sysdeps/nptl/futex-internal.h:146
#7 __GI___lll_lock_wait_private (futex=0x7f4838000d00) at lowlevellock.c:34
#8 0x00007f4859db7d85 in __GI__IO_flush_all () at genops.c:698
#9 0x00007f4859db815a in _IO_cleanup () at genops.c:843
#10 0x00007f4859d71dc8 in __run_exit_handlers
(status=0, listp=0x7f4859f09680 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:132
#11 0x00007f4859d71e10 in __GI_exit (status=<optimized out>) at exit.c:141
#12 0x00007f4859d58cd7 in __libc_start_call_main
(main=main@entry=0x565212da4760 <main>, argc=argc@entry=3, argv=argv@entry=0x7ffcdce26eb8)
at ../sysdeps/nptl/libc_start_call_main.h:74
#13 0x00007f4859d58d8a in __libc_start_main_impl
(main=0x565212da4760 <main>, argc=3, argv=0x7ffcdce26eb8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffcdce26ea8) at ../csu/libc-start.c:360
#14 0x0000565212db4205 in _start ()
I am seeing the same problem, and can confirm that switching to a different workspace, or doing something to create a Hyprland IPC event allows waybar to exit.
I was observing this issue from 2 months and today i found that hyprland/window
is causing this issue, if i remove this from module array then there is no issue in quitting waybar.
config:
"hyprland/workspaces": {
"format": "<span font='18px'>{icon}</span>",
"format-icons": {
"1": "१",
"2": "२",
"3": "३",
"4": "४",
"5": "५",
"6": "६",
"default": ""
},
"persistent-workspaces": {
"*": [1, 2, 3, 4, 5, 6 ]
}
},
@akdevservices i tried hyprland/workspaces
and hyprland/languages
also but for me all of them produces same error, can you confirm if somehow you managed to fix it.
I spent some time looking into this. I think the root cause is a bit more fundamental. Basically, I think the signal handling logic in main.cpp is, I'm sorry to say, quite unsound. (Mixing signals and threads correctly is very hard.)
(Apologies for the long explanation; I don't know how familiar people are with all this stuff.)
Fundamentally, there's not much you can do in a signal handler without causing undefined behavior. A signal handler may be called when some lock is held, including locks internal to libc (or possibly the allocator).
Quoting from the C++ reference:
The behavior is undefined if [...] the signal handler calls any function within the standard library, except
- std::abort
- std::_Exit
- std::quick_exit
- std::signal with the first argument being the number of the signal currently handled (async handler can re-register itself, but not other signals).
In addition to that, any static variables you read or write need to be std::atomic<>.
In real life this means that a signal arriving at an inopportune time likely causes a crash because, among other things,
- The signal handlers for SIGUSR2 and SIGINT call logging functions which may cause allocations and file I/O, none of which is signal handler safe
- They may cause Client::reset() to be called from the signal handler context, which does all kinds of things
- It calls handleSignal() for the bars from the signal handler contexts, and those probably call library routines
If we're talking about POSIX, you can do a bit more (but still not things like memory allocation): https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04_03
The sigwait() approach in signalThread is likely much safer, although even there reap.empty() is called without the mutex held, which is not safe.
I think a lot of this headache comes from the decision to mix signals and threads, but I realize the signals are now a fundamental part of the interface. I haven't read the entire code, but I think the signal handling code is mostly limited to main.cpp
? I know nobody asked me, but I suspect the cleanest way to untangle this would actually be to have two processes; call them main and worker (I'll let you bikeshed the names).
- Have an undocumented, internal only command line option (
--worker
). Or, if you want, separate these into different binaries.
In main()
:
- If no
--worker
was passed, this is the user starting waybar, the main process that listens to the signals. In that case:
- fork and exec self with the
--worker
command line flag. Establish some kind of a communication channel between these processes. It may be pipes or file descriptors or dbus or really anything except Posix signals. - Enter a loop waiting for signals with sigwait(). The only functionality here is to pass the information about the received signals to the worker over that non-signal channel.
The worker, on the other hand:
- Realizes in
main()
that it's the worker because it got the--worker
flag. - Does all the wayland stuff, runs the bars, etc.
- Listens to the communication channel, reacting to the information about the signals.