Triggering Kando over and over crashes dbus-broker, crashing the entire desktop (kwin_wayland)

Open auslegungssache opened this issue 9 months ago • 21 comments

Short Summary

I use Kando to switch music, which means triggering it very often within a short time. This usually ends in a complete crash of the desktop because dbus-broker (or dbus-broker-launch) crashes (see logs).

Steps to Reproduce the Issue

  1. Use KDE Plasma 6.3.2 with Wayland
  2. Trigger Kando over and over, for example: watch -n 0.5 flatpak run menu.kando.Kando -m "Main"
  3. The dbus broker soon crashes and takes the entire desktop down with it, segfaulting every application in the session
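
While reproducing, the broker's descriptor count can be watched to see whether it climbs with each trigger. This is a minimal sketch, assuming a Linux /proc filesystem and that the session broker runs under the process name dbus-broker; the helper name count_fds is made up for illustration.

```shell
#!/bin/sh
# count_fds PID: print the number of open file descriptors of a process.
# Hypothetical helper; relies on the Linux /proc filesystem.
count_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Example: print the descriptor count of every dbus-broker process owned by
# the current user (assumes the process is named "dbus-broker").
for pid in $(pgrep -u "$(id -u)" -x dbus-broker 2>/dev/null); do
    echo "dbus-broker[$pid]: $(count_fds "$pid") open fds"
done
```

Running this once before and once after a batch of triggers should show whether the count keeps growing.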

Kando Version

v1.7.0

Installation Method

Another method (specify in the comments below)

Desktop Environment

KDE on Wayland

Environment Version

KDE Plasma 6.3.2, OpenSUSE Tumbleweed

Additional Information

I have tested this issue with multiple ways of triggering Kando. Be it through my keyboard, my mouse bound to a key or even the command line, it still does it. I have also tried to see if this is caused by any of my other extensions. Sadly it occurs even on vanilla Plasma.

The issue happens regardless of whether Kando is run as an AppImage, Flatpak or anything else.

While I do think this is a bug in dbus-broker, the issue also seems to be partially on the side of Kando. I think this could be caused by Kando creating a high amount of chatter, which then crashes D-Bus. I could however be wrong about that.

Logs:

Mar 12 19:51:18 moon kwin_wayland[2976]: js: Kando: Triggered.
Mar 12 19:51:18 moon kwin_wayland[2976]: js: Kando: Triggered.
Mar 12 19:51:18 moon kwin_wayland[2976]: js: Kando: Received data request.
Mar 12 19:51:18 moon kwin_wayland[2976]: js: Kando: Successfully transmitted the data.
Mar 12 19:51:18 moon kwin_wayland[2976]: js: Kando: Registered shortcut main-menu
Mar 12 19:51:18 moon kwin_wayland[2976]: js: Kando: Triggered.
Mar 12 19:51:18 moon kwin_wayland[2976]: js: Kando: Triggered.
Mar 12 19:51:18 moon kwin_wayland[2976]: js: Kando: Received data request.
Mar 12 19:51:18 moon kwin_wayland[2976]: js: Kando: Successfully transmitted the data.
Mar 12 19:51:18 moon kwin_wayland[2976]: js: Kando: Registered shortcut main-menu
Mar 12 19:51:19 moon kwin_wayland[2976]: js: Kando: Triggered.
Mar 12 19:51:19 moon kwin_wayland[2976]: js: Kando: Triggered.
Mar 12 19:51:19 moon kwin_wayland[2976]: js: Kando: Received data request.
Mar 12 19:51:19 moon dbus-broker-launch[2955]: ERROR sockopt_get_peerpidfd @ ../src/util/sockopt.c +244: Too many open files
Mar 12 19:51:19 moon dbus-broker-launch[2955]:       peer_new_with_fd @ ../src/bus/peer.c +290
Mar 12 19:51:19 moon dbus-broker-launch[2955]:       listener_dispatch @ ../src/bus/listener.c +54
Mar 12 19:51:19 moon dbus-broker-launch[2955]:       dispatch_context_dispatch @ ../src/util/dispatch.c +344
Mar 12 19:51:19 moon dbus-broker-launch[2955]:       broker_run @ ../src/broker/broker.c +229
Mar 12 19:51:19 moon kwin_wayland[2976]: js: Kando: Successfully transmitted the data.
Mar 12 19:51:19 moon systemd[2801]: Got disconnect on API bus.
Mar 12 19:51:19 moon wireplumber[3077]: m-dbus-connection: <WpDBusConnection:0x556c00b6d460> DBus connection closed: Underlying GIOStream returned 0 bytes on an async read
Mar 12 19:51:19 moon flatpak[3781]: [2 preload-host-spawn-strategy] Dropping 0xe400004c6c0 (3) because of connection closed
Mar 12 19:51:19 moon systemd[2801]: obex.service: Main process exited, code=exited, status=1/FAILURE
Mar 12 19:51:19 moon gvfsd[3647]: A connection to the bus can't be made
Mar 12 19:51:19 moon systemd[2801]: obex.service: Failed with result 'exit-code'.
Mar 12 19:51:19 moon dbus-broker[3629]: Dispatched 706 messages @ 2(±3)μs / message.
Mar 12 19:51:19 moon flatpak[3780]: [2 preload-host-spawn-strategy] Dropping 0x39480004c6c0 (3) because of connection closed
Mar 12 19:51:19 moon 1password[3713]: [3713:0312/195119.091844:FATAL:bus.cc(1246)] D-Bus connection was disconnected. Aborting.
Mar 12 19:51:19 moon flatpak[3780]: [2:36:0312/195119.092948:FATAL:bus.cc(1247)] D-Bus connection was disconnected. Aborting.
Mar 12 19:51:19 moon flatpak[3781]: [2:34:0312/195119.092748:FATAL:bus.cc(1247)] D-Bus connection was disconnected. Aborting.
Mar 12 19:51:19 moon flatpak[3926]: [0312/195119.093041:ERROR:scoped_ptrace_attach.cc(27)] ptrace: Operation not permitted (1)
Mar 12 19:51:19 moon flatpak[3925]: [0312/195119.092831:ERROR:scoped_ptrace_attach.cc(27)] ptrace: Operation not permitted (1)
Mar 12 19:51:19 moon wireplumber[3077]: m-dbus-connection: <WpDBusConnection:0x556c00b6d460> Trying to reconnect after core sync
Mar 12 19:51:19 moon systemd[2801]: xdg-permission-store.service: Main process exited, code=exited, status=1/FAILURE
Mar 12 19:51:19 moon systemd[2801]: xdg-permission-store.service: Failed with result 'exit-code'.
Mar 12 19:51:19 moon systemd[2801]: xdg-document-portal.service: Main process exited, code=exited, status=20/n/a
Mar 12 19:51:19 moon dbus-broker[2955]: Dispatched 24923 messages @ 3(±9)μs / message.
Mar 12 19:51:19 moon dbus-broker-launch[2955]:       run @ ../src/broker/main.c +261
Mar 12 19:51:19 moon dbus-broker-launch[2955]:       main @ ../src/broker/main.c +295
Mar 12 19:51:19 moon systemd[2801]: flatpak-session-helper.service: Main process exited, code=exited, status=1/FAILURE
Mar 12 19:51:19 moon kdeconnectd[3458]: 2025-03-12T19:51:19 org.kde.pulseaudio: No object for name "alsa_output.pci-0000_12_00.6.analog-stereo"
Mar 12 19:51:19 moon dbus-broker-launch[2954]: Caught SIGCHLD of broker.
Mar 12 19:51:19 moon dbus-broker-launch[2954]: ERROR launcher_run @ ../src/launch/launcher.c +1453: Return code 1
Mar 12 19:51:19 moon dbus-broker-launch[2954]:       run @ ../src/launch/main.c +152
Mar 12 19:51:19 moon systemd[2801]: xdg-document-portal.service: Failed with result 'exit-code'.
Mar 12 19:51:19 moon systemd[2801]: flatpak-session-helper.service: Failed with result 'exit-code'.

auslegungssache avatar Mar 12 '25 19:03 auslegungssache

Hi there! Thanks for the report and sorry for the delayed response (I was ill for the last week or so 🤒). I fear that this will be very hard to debug.

The only thing which comes to my mind is that the binding and unbinding of global shortcuts is implemented in a very weird way on KWin/Wayland using KWin scripts. And the hotkeys are unbound and rebound whenever a menu is shown.

You could test this - if you run Kando from source and remove this line and this line, this part of the D-Bus communication will be skipped.

Instructions for running Kando from source are here: https://kando.menu/compile-from-source/

Schneegans avatar Mar 17 '25 07:03 Schneegans

Thank you for your answer and so sorry for inactivity 😅

I tried out removing the lines but it sadly didn't help.

dbus-broker-launch[2955]: ERROR sockopt_get_peerpidfd @ ../src/util/sockopt.c +244: Too many open files

The actual dbus-broker error message leads me to believe that this is caused by the script constantly being loaded, started, stopped and so forth.

Is there a specific reason why it was implemented this way? Or is there anything that would speak against loading the script once and rolling trigger and sendWMInfo into a single call? If not, I could have a look into writing a patch.

[Image] Here you can see org.kde.kwin.Scripting.loadScript being called twice per invocation.

EDIT: Forgot to mention - the script that gets loaded multiple times per invocation is the following: [Image]
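
The number of loadScript calls per invocation could also be counted from the command line instead of reading it off a trace. This is a rough sketch: the helper name count_load_calls is hypothetical, and it assumes the default dbus-monitor output, which prints "member=loadScript" on the header line of each such call.

```shell
#!/bin/sh
# count_load_calls: read dbus-monitor output on stdin and print how many
# loadScript method calls it contains. Hypothetical helper name.
count_load_calls() {
    grep -c 'member=loadScript'
}

# Example usage (needs a running session bus, so it is left commented out):
# record ten seconds of traffic while triggering Kando once, then count.
# timeout 10 dbus-monitor --session \
#     "type='method_call',interface='org.kde.kwin.Scripting'" | count_load_calls
```

If the count is 2 after a single trigger, that matches the observation above.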

auslegungssache avatar Mar 24 '25 22:03 auslegungssache

Well, if you find a way to improve this, go for it! Back when I implemented this, I was happy that it worked at all 😅. So let's break this down a bit. There are two scripts used by the backend:

  • The get-info.js is static and it could indeed be possible to load this only once. Maybe you could separate the loading from the starting.
  • The contents of global-shortcuts.js change every time updateShortcuts is called, so it does indeed need to be reloaded quite frequently. I do not see a proper way to avoid this. Do you have an idea?

I think our best bet would be to remove this "hack" for binding global shortcuts altogether and use the global shortcuts portal instead. I think by now this is pretty widely supported.

I have not yet looked into this - however, I have the fear that this is not designed for our use-case. Currently Kando unbinds a menu's shortcut when the menu is shown so that Turbo-Mode works. I think that unbinding and rebinding a shortcut using the portal will not work without user interaction. But one would need to confirm this.

Schneegans avatar Mar 25 '25 07:03 Schneegans

Oh man this was such a rabbit hole. Wayland support for advanced use cases is still half baked at best... I looked more into how exactly programs like lan-mouse handle this, and I see two options. I want to ask you how you imagine this being solved, since I don't want to implement something that doesn't fit into your vision 😺

Two ways of solving it

"The Proper Way"

The way I understand it, simply switching to the Global Shortcuts API wouldn't really help us here, because we'd still need to register a script that then sends the window / mouse info to Kando.

For capturing the position of the mouse, we'd have to switch to the Input Capture API. That would also solve the problem with turbo-mode, as Kando could just handle the key binding by itself and not rely on the compositor handling shortcuts properly. I don't think there is a Node.JS library for actually processing LibEi events, so this would be non-trivial to implement. I also looked into implementing this with just LibEi on its own, but although not documented, it seems that GNOME and KDE only let you interact with the API through the XDG portal 😓

The only real issue I see with this is that there is currently no way to capture the current window title / class, so that feature would just not work on Wayland for now 🤷‍♀

I think this would be the right and future proof way to do it moving forward. The Input Capture API is currently only supported on KDE, GNOME and Hyprland, but I think more will hop on later. This would allow us to remove some of the compositor-specific code as well.

As for implementing this: I looked around for a Node.JS library that'd handle the XDG Portal / LibEi, but there doesn't really seem to be any. I think implementing that is wayyy outside the scope of this project 😭 Since Kando uses native addons for Hyprland support, we could use the C library libportal to handle this for us. I sadly have no experience with C / C++ so I can't help there. The most developed and easy library for interfacing with these APIs is the Rust library ashpd (example of how to capture input events with it). It could then be connected with something like Neon. Maybe that is something we could look into?

The hacky way

We could move more of the logic into the KWin script. The KWin scripting API has pretty bad docs, so I'll look into the specifics of that later. There doesn't seem to be any interface over which one could define new D-Bus methods (or does this have to be handled by registering a new service??)

Assuming this is the case, we could rewrite the scripts, so that:

  1. the scripts only load once (at Kando startup)
  2. Kando communicates with the script over some socket / port / dbus (?) - new shortcuts, updates etc.
  3. trigger and sendWMInfo get rolled into a single call
  4. the script gets unloaded when Kando exits

However, this sounds scope-wise more like a full blown kwin extension 🫠 Maybe the IPC between Kando and KWin could also be moved completely out of D-Bus. I'm also considering reporting this to dbus-broker, as it shouldn't ever crash because of this.

auslegungssache avatar Mar 25 '25 10:03 auslegungssache

Creating something like Kando with support for Wayland is indeed really painful. It took me really long to figure out that it is actually possible to register and call KWin scripts via D-Bus.

Regarding the "Proper Way"

The Input Capture API could potentially solve the issue of Turbo Mode. However, there is a good chance that it does not work. Just as a recap - the events here are as follows:

  1. Kando registers a global shortcut with the compositor, let's say Ctrl+Space.
  2. The user presses Ctrl+Space and Kando opens its window.
  3. The user keeps Ctrl+Space pressed - at this point, the compositor should send mouse motion events to Kando indicating that Ctrl is still pressed. Usually, this does not work if the global shortcut is still registered. Maybe using the Input Capture API could help here, but I do not see how the captured input should be forwarded to Kando. If I see it correctly, the Input Capture API will forward the input to a "libEI file descriptor". Getting this to Node.js will be a challenge in itself.

The real problem of replacing the get-info.js script is however the following: Kando requires the pointer position before opening the window. The menu should be opened at the pointer position. From what I tried, getting the pointer position on Wayland is only possible after the user moved the pointer over the client window. If the pointer is stationary, the compositor does not send any events to the client indicating the pointer position. Hence, I concluded that we will need some sort of compositor extension to get the pointer position before the window is opened.

What could be a solution?

Ideally, we would use the global shortcuts portal for binding shortcuts, and hope that it does not interfere too much with Turbo Mode. To get the application title and the pointer position, we will require something which hooks into the compositor. A KWin script is the only thing that I tried - do KWin extensions have more capabilities? For GNOME, I also use a custom extension for providing the required information via D-Bus. Maybe a KWin extension could also provide a D-Bus interface?

As you figured out, we can use either C++ or any Node.js code for that. Kando already uses the Remote Desktop Portal via JS under KDE/Wayland, so I am pretty confident that using the Global Shortcuts Portal could work in a similar fashion.

Schneegans avatar Mar 25 '25 11:03 Schneegans

Electron 36 comes with built-in support for the global-shortcuts portal. I tested this and the implementation seems to be pretty mediocre. First, it has to be enabled via a command-line switch. Second, it is only enabled if the app runs as a Wayland client, so it simply doesn't work if we use XWayland (which is still the standard for Kando). Lastly, Electron only allows binding shortcuts one by one. This causes the portal dialog to pop up once for each shortcut of the application, which is pretty annoying.

So we need our own implementation!

While those issues may get fixed in the future, it will not happen anytime soon. So I decided to create our own implementation (#995). And I think it works pretty well! I had to do some pretty massive refactoring as Kando used the Electron approach of binding shortcuts one-by-one, but with the desktop portal you need to pass the entire set of shortcuts at once.

I am not sure if this fixes this issue. After all, it significantly reduces the amount of D-Bus communication between Kando and KWin. We still have the get-info script which returns the window title etc., but no script for binding the shortcuts anymore.

However, it improves the user experience on KDE Wayland anyways, as we can bind the shortcuts more or less directly from Kando, without having to search for some menu IDs in the system settings anymore!

Please test this!

You can download a test build from the bottom of this page: https://github.com/kando-menu/kando/actions/runs/15618764858 (the artifact named build-ubuntu-24.04). This contains all the different formats; you can just run the AppImage, for instance.

If you want to be on the safe side, you may want to back up your config files beforehand.

Does this work for you? Can you still reproduce the crash?

Schneegans avatar Jun 13 '25 03:06 Schneegans

I have merged #995 as I am pretty sure that it will improve the user experience anyways. But I am not sure if it resolves this issue here. So please report any findings!

Schneegans avatar Jun 14 '25 12:06 Schneegans

I have tested it out again, and it does seem to have improved a bit, as it really takes a lot of activating Kando to trigger it. However, the issue is sadly still present. That is, if I restarted my PC every day, I don't think I'd notice it. But after multiple days of use, it would bring the desktop down eventually.

I assume it has something to do with a script being reloaded every single time. I have attached a snippet of the logs.

Jun 22 19:10:48 moon kwin_wayland[366661]: js: Kando: Successfully transmitted the data.
Jun 22 19:10:49 moon kwin_wayland[366661]: js: Kando: Received data request.
Jun 22 19:10:49 moon kwin_wayland[366661]: js: Kando: Successfully transmitted the data.
Jun 22 19:10:49 moon dbus-broker-launch[366317]: ERROR sockopt_get_peerpidfd @ ../src/util/sockopt.c +244: Too many open files
Jun 22 19:10:49 moon dbus-broker-launch[366317]:       peer_new_with_fd @ ../src/bus/peer.c +290
Jun 22 19:10:49 moon dbus-broker-launch[366317]:       listener_dispatch @ ../src/bus/listener.c +54
Jun 22 19:10:49 moon dbus-broker-launch[366317]:       dispatch_context_dispatch @ ../src/util/dispatch.c +344
Jun 22 19:10:49 moon dbus-broker-launch[366317]:       broker_run @ ../src/broker/broker.c +229
Jun 22 19:10:49 moon wireplumber[3268]: m-dbus-connection: <WpDBusConnection:0x555b41691370> DBus connection closed: Underlying GIOStream returned>
Jun 22 19:10:49 moon systemd[3081]: Got disconnect on API bus.
Jun 22 19:10:49 moon gvfsd[366950]: A connection to the bus can't be made
Jun 22 19:10:49 moon systemd[3081]: xdg-permission-store.service: Main process exited, code=exited, status=1/FAILURE
Jun 22 19:10:49 moon flatpak[367413]: [2 preload-host-spawn-strategy] Dropping 0x2da00004c540 (3) because of connection closed
Jun 22 19:10:49 moon systemd[3081]: xdg-permission-store.service: Failed with result 'exit-code'.
Jun 22 19:10:49 moon dbus-broker[367201]: Dispatched 201 messages @ 3(±4)μs / message.
Jun 22 19:10:49 moon 1password[367295]: [367295:0622/191049.171464:FATAL:bus.cc(1247)] D-Bus connection was disconnected. Aborting.
Jun 22 19:10:49 moon flatpak[367413]: [2:50:0622/191049.171806:FATAL:dbus/bus.cc:1248] D-Bus connection was disconnected. Aborting.
Jun 22 19:10:49 moon wireplumber[3268]: m-dbus-connection: <WpDBusConnection:0x555b41691370> Trying to reconnect after core sync
Jun 22 19:10:49 moon flatpak[367520]: [0622/191049.171917:ERROR:third_party/crashpad/crashpad/util/linux/scoped_ptrace_attach.cc:27] ptrace: Opera>
Jun 22 19:10:49 moon dbus-broker[366317]: Dispatched 26098 messages @ 2(±4)μs / message.
Jun 22 19:10:49 moon dbus-broker-launch[366317]:       run @ ../src/broker/main.c +261
Jun 22 19:10:49 moon dbus-broker-launch[366317]:       main @ ../src/broker/main.c +295
Jun 22 19:10:49 moon dbus-broker-launch[366311]: Caught SIGCHLD of broker.
Jun 22 19:10:49 moon dbus-broker-launch[366311]: ERROR launcher_run @ ../src/launch/launcher.c +1453: Return code 1
Jun 22 19:10:49 moon dbus-broker-launch[366311]:       run @ ../src/launch/main.c +152

But at least we now know what actually causes the issue :)

auslegungssache avatar Jun 22 '25 17:06 auslegungssache

Thanks for the feedback! However, at this point, I am mostly out of ideas. The only thing one could try is to add some timeouts so that run and stop are not called in crazy fast succession.

Would you mind running Kando from source and testing this a bit? Instructions are here: https://kando.menu/compile-from-source/, if you have not done this yet.

Once this is running on your end, I could send you some ideas for testing!

Schneegans avatar Jun 23 '25 19:06 Schneegans

Oh, I just understood what 'dbus-broker-launch: ERROR sockopt_get_peerpidfd @ ../src/util/sockopt.c +244: Too many open files' means: the broker process has hit its limit on open file descriptors (FDs), and D-Bus, a core inter-process communication system, is failing because of it. Once D-Bus goes down, everything else that depends on it (KWin, WirePlumber, Flatpak, 1Password, etc.) starts to crash like dominoes. That explains the desktop breakdown you're seeing.
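
The limit that applies to the broker can be read from /proc. This is a minimal sketch, assuming Linux and a broker process named dbus-broker; soft_nofile_limit is a hypothetical helper name.

```shell
#!/bin/sh
# soft_nofile_limit PID: print the soft "Max open files" limit of a process.
# Hypothetical helper; parses the Linux-specific /proc/PID/limits file, where
# the fourth field of that row is the soft limit and the fifth the hard limit.
soft_nofile_limit() {
    awk '/^Max open files/ { print $4 }' "/proc/$1/limits"
}

# Example: show the limit that applies to each dbus-broker process of the
# current user (assumes the process is named "dbus-broker").
for pid in $(pgrep -u "$(id -u)" -x dbus-broker 2>/dev/null); do
    echo "dbus-broker[$pid]: soft limit $(soft_nofile_limit "$pid")"
done
```

"Too many open files" is reported once a process reaches this soft limit, so a low value here would fit the crash in the logs.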

yar2000T avatar Jun 24 '25 07:06 yar2000T

Yes exactly. Now one could claim this is an issue of the dbus broker itself (which it also kinda is, it should never crash). However, the current way Kando does it with two scripts being loaded on every invocation is rather hacky. (observe the D-Bus traffic in Bustle)

I tried to have a look into how one could use a single script that'd only get loaded once, and do the heavy handling there. Sadly the KDE documentation here is really sparse and I'm not an expert on this, I'm just trying to get Kando to work 😭

I'll have another look when I get around to it :) I feel like we're already on a very good path here with the new hotkeys api integration

auslegungssache avatar Jun 24 '25 07:06 auslegungssache

I know a temporary fix: you can raise the maximum number of file descriptors. Edit /etc/security/limits.conf and add:

yourusername soft nofile 65535
yourusername hard nofile 65535

Add this to your systemd user config:

mkdir -p ~/.config/systemd/user.conf.d
nano ~/.config/systemd/user.conf.d/limits.conf

With content:

[Manager]
DefaultLimitNOFILE=65535

Reboot or restart the session:

systemctl --user daemon-reexec
systemctl --user restart some-service

yar2000T avatar Jun 24 '25 07:06 yar2000T

https://www.cyberciti.biz/faq/linux-increase-the-maximum-number-of-open-files/ https://unix.stackexchange.com/questions/8945/how-can-i-increase-open-files-limit-for-all-processes

yar2000T avatar Jun 24 '25 07:06 yar2000T

But as I said, it's a temporary solution, and in the long run this needs to be fixed on the dbus-broker side.

yar2000T avatar Jun 24 '25 07:06 yar2000T

That's a good observation! Thank you! I wasn't aware you could do that.

The main concern here is that somebody uses Kando on KDE (I mean it's the most used DE) and it eventually brings their whole OS down in the middle of their work. And if they're not technically versed, there is no way to restart the DE except for a hard power off (thus losing all unsaved work)

auslegungssache avatar Jun 24 '25 07:06 auslegungssache

However, I don't think fixing it on the side of the dbus broker is the right way here. As I said, currently it loads 2 scripts per invocation. That is inherently quite hacky.

Maybe it would also be a good idea to raise an issue with the d-bus broker but I think that's secondary

auslegungssache avatar Jun 24 '25 07:06 auslegungssache

I think the easiest thing we can do is to just restart Kando and the D-Bus broker, which will close all active file descriptors.

yar2000T avatar Jun 24 '25 07:06 yar2000T

It would be awesome if a KWin script could provide some sort of IPC interface for Kando to talk to. Then we could start the script only once and request the required information via the interface whenever a menu has to be shown.

But as you said, documentation is sparse and I do not see a way how to do this currently...

Schneegans avatar Jun 24 '25 07:06 Schneegans

Would something speak against sending the mouse position and the currently opened window INSIDE of this D-Bus call in the KWin script?

Image

Thanks to the newly implemented global-shortcuts, it would theoretically be possible to do it this way:

  1. Kando registers a global shortcut on startup. The shortcut callback attaches the current mouse position and the currently opened window info (title / id) to the trigger D-Bus call sent to Kando.
  2. When the shortcut is pressed, Kando receives all the relevant data in one invocation. This would sidestep the need to load a script every time Kando is opened.

I see just one issue with this: it would not be possible to open the menu without actually pressing the key, so a menu could not be opened via the tray icon. This could potentially be solved by Kando just injecting the hotkey press.

Or is there something else I'm missing? I really wish KWin had a way to register new endpoints for a script. Sadly they're sandboxed.

auslegungssache avatar Jun 24 '25 08:06 auslegungssache

This is a good idea. Problem is, this code is not used with the new global-shortcuts portal. Instead, this code is used now: https://github.com/kando-menu/kando/blob/main/src/main/backends/linux/kde/wayland/backend.ts#L268

The one you posted is the old one using a KWin script for binding the shortcuts. Also, Kando does need this information in other places as well, for instance for picking the currently active window in the settings dialog:

Image

So we really need a way to get this information reliably via a method triggered from Kando...

Schneegans avatar Jun 24 '25 08:06 Schneegans

@auslegungssache do you still experience this issue? With the desktop portal we have reduced the D-Bus traffic regarding the KWin scripting quite a bit. Also, the only remaining script is not re-created every time, so maybe it does not increase the number of open file descriptors...

So maybe the issue is now more or less resolved?

Schneegans avatar Nov 06 '25 05:11 Schneegans