Segmentation fault on sleep

Open luispabon opened this issue 5 years ago • 17 comments

I have swaylock occasionally segfaulting at the point of locking before suspend:

https://gist.github.com/luispabon/5489fc8207b69f47691a2bd94ea290a0
https://github.com/luispabon/sway-dotfiles
https://github.com/luispabon/sway-dotfiles/blob/master/configs/swaylock/config
https://github.com/luispabon/sway-dotfiles/blob/master/configs/sway/conf.d/idle-and-lockscreen

I'm not well versed in debugging this sort of thing. I've installed what I believe to be the debug symbols. Provided that's what I need to get a trace, could you please let me know how to set things up for the next time it crashes?

These commits are all the latest versions as of Monday:

swaylock 11560381bf54d228aa09aa61bd31135ae0ea9662 (latest as of today)
sway 5becce8005e3c617afbdddda8f0da95f84540b27
wlroots 46dc4100d66567f77a413627a0a0b046ccf8094b

Ubuntu 19.04

luispabon avatar Jun 26 '19 09:06 luispabon

Please provide a stack trace. You can do so by running coredumpctl gdb and then, at the gdb prompt, bt full.

emersion avatar Jun 26 '19 10:06 emersion

I will, thank you.

luispabon avatar Jun 26 '19 10:06 luispabon

Same here:

sway commit f5d1c27226a74a234664af9b35ab226a67386e8e
swaylock commit b1a7defa0087db7b984f568c79634316bb6bf1eb
wlroots commit fb739b829305a60f99abb6b847b45aeb9c6cbf77
Linux kernel version 5.1.15 from Arch Linux

stacktrace.txt backtrace.txt

sydneymeyer avatar Jun 30 '19 13:06 sydneymeyer

https://gist.github.com/luispabon/924b65981bb7e5fafc80d84c2acb24a0

Looks the same as @sydneymeyer's.

The crash happened when I closed my laptop's lid, not after resuming from sleep.

luispabon avatar Jul 02 '19 08:07 luispabon

I have another one. It only happens when I undock my laptop before closing the lid:

Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/swaylock...
(No debugging symbols found in /usr/bin/swaylock)
[New LWP 3850]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Core was generated by `swaylock -f'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f5b1b2752c0 in xkb_state_update_mask () from /usr/lib/libxkbcommon.so.0
(gdb) bt full
#0  0x00007f5b1b2752c0 in xkb_state_update_mask () at /usr/lib/libxkbcommon.so.0
#1  0x0000561efcee6df8 in  ()
#2  0x00007f5b1a4866d0 in ffi_call_unix64 () at /usr/lib/libffi.so.6
#3  0x00007f5b1a4860a0 in ffi_call () at /usr/lib/libffi.so.6
#4  0x00007f5b1b265f8f in  () at /usr/lib/libwayland-client.so.0
#5  0x00007f5b1b2626ba in  () at /usr/lib/libwayland-client.so.0
#6  0x00007f5b1b263bfc in wl_display_dispatch_queue_pending () at /usr/lib/libwayland-client.so.0
#7  0x00007f5b1b26404c in wl_display_roundtrip_queue () at /usr/lib/libwayland-client.so.0
#8  0x0000561efcee2703 in  ()
#9  0x00007f5b1aeb0ee3 in __libc_start_main () at /usr/lib/libc.so.6
#10 0x0000561efcee2ade in  ()

This might be related: it happens after undocking. sway hiccups when the outputs go away, and I think this subsequently causes swaylock to die.

Jul 05 15:02:03 kronos sway[860]: 2019-07-05 15:02:03 - [wlroots-0.6.0/backend/drm/atomic.c:55] eDP-1: Atomic commit failed (pageflip): Invalid argument
Jul 05 15:02:03 kronos sway[860]: 2019-07-05 15:02:03 - [wlroots-0.6.0/backend/drm/atomic.c:55] DP-4: Atomic commit failed (pageflip): Invalid argument
Jul 05 15:02:03 kronos sway[860]: 2019-07-05 15:02:03 - [wlroots-0.6.0/backend/drm/atomic.c:55] DP-5: Atomic commit failed (pageflip): Invalid argument
Jul 05 15:02:03 kronos sway[860]: [
Jul 05 15:02:03 kronos sway[860]:    {
Jul 05 15:02:03 kronos sway[860]:      "success": true
Jul 05 15:02:03 kronos sway[860]:    }
Jul 05 15:02:03 kronos sway[860]:  ]
Jul 05 15:02:06 kronos sway[860]: 2019-07-05 15:02:06 - [wlroots-0.6.0/backend/drm/atomic.c:55] DP-4: Atomic commit failed (pageflip): Invalid argument
Jul 05 15:02:06 kronos sway[860]: 2019-07-05 15:02:06 - [wlroots-0.6.0/backend/drm/atomic.c:62] DP-4: Atomic commit without new changes failed (pageflip): Invalid argument
Jul 05 15:02:06 kronos sway[860]: 2019-07-05 15:02:06 - [wlroots-0.6.0/backend/drm/drm.c:809] Skipping pageflip on output 'eDP-1'
Jul 05 15:02:06 kronos sway[860]: 2019-07-05 15:02:06 - [wlroots-0.6.0/backend/drm/drm.c:809] Skipping pageflip on output 'DP-4'
Jul 05 15:02:06 kronos sway[860]: 2019-07-05 15:02:06 - [wlroots-0.6.0/backend/drm/drm.c:809] Skipping pageflip on output 'DP-5'
Jul 05 15:02:06 kronos sway[860]: [
Jul 05 15:02:06 kronos sway[860]:    {
Jul 05 15:02:06 kronos sway[860]:      "success": true
Jul 05 15:02:06 kronos sway[860]:    }
Jul 05 15:02:06 kronos sway[860]:  ]
Jul 05 15:02:18 kronos kernel: usb 1-3: USB disconnect, device number 3
Jul 05 15:02:18 kronos kernel: usb 1-3.2: USB disconnect, device number 5
Jul 05 15:02:18 kronos kernel: usb 2-4: USB disconnect, device number 3
Jul 05 15:02:18 kronos kernel: thinkpad_acpi: undocked from hotplug port replicator
Jul 05 15:02:18 kronos kernel: usb 1-3.3: USB disconnect, device number 7
Jul 05 15:02:18 kronos kernel: usb 1-3.4: USB disconnect, device number 9
Jul 05 15:02:18 kronos kernel: usb 1-3.4.2: USB disconnect, device number 10
Jul 05 15:02:18 kronos kernel: usb 1-3.4.3: USB disconnect, device number 11
Jul 05 15:02:18 kronos kernel: [drm:intel_mst_disable_dp [i915]] *ERROR* failed to update payload -22
Jul 05 15:02:19 kronos sway[860]: 2019-07-05 15:02:19 - [swaybg-1.0/main.c:168] Destroying output DP-4 (Dell Inc. DELL U2312HM 59DJP189BW8L)
Jul 05 15:02:19 kronos sway[860]: 2019-07-05 15:02:19 - [swaybg-1.0/main.c:168] Destroying output DP-5 (Dell Inc. DELL U2312HM 59DJP189BWAL)
Jul 05 15:02:21 kronos systemd-logind[768]: Lid closed.
Jul 05 15:02:21 kronos systemd-logind[768]: Suspending...
Jul 05 15:02:21 kronos systemd-logind[768]: Lid closed.
Jul 05 15:02:21 kronos systemd-logind[768]: Suspending...
Jul 05 15:02:21 kronos NetworkManager[771]: <info>  [1562331741.8375] manager: sleep: sleep requested (sleeping: no  enabled: yes)
Jul 05 15:02:21 kronos NetworkManager[771]: <info>  [1562331741.8376] device (p2p-dev-wlp3s0): state change: disconnected -> unmanaged (reason 'sleeping', sys-iface-state: 'managed')
Jul 05 15:02:21 kronos NetworkManager[771]: <info>  [1562331741.8379] manager: NetworkManager state is now ASLEEP
Jul 05 15:02:21 kronos evolution[1419]: Network disconnected.  Forced offline.
Jul 05 15:02:21 kronos audit[3850]: ANOM_ABEND auid=1000 uid=1000 gid=1000 ses=1 pid=3850 comm="swaylock" exe="/usr/bin/swaylock" sig=11 res=1
Jul 05 15:02:21 kronos kernel: swaylock[3850]: segfault at 0 ip 00007f5b1b2752c0 sp 00007ffdc3276240 error 4 in libxkbcommon.so.0.0.0[7f5b1b272000+1c000]
Jul 05 15:02:21 kronos kernel: Code: 0f 94 c2 5b 89 d0 5d 41 5c c3 41 57 41 89 f7 41 56 41 89 d6 41 55 41 89 cd 41 54 45 89 cc 55 44 89 c5 53 48 89 fb 48 83 ec 38 <f3> 0f 6f 07 f3 0f 6f 4f 10 64 48 8b 04 25 28 0>
Jul 05 15:02:21 kronos kernel: audit: type=1701 audit(1562331741.869:84): auid=1000 uid=1000 gid=1000 ses=1 pid=3850 comm="swaylock" exe="/usr/bin/swaylock" sig=11 res=1
Jul 05 15:02:21 kronos systemd[1]: Created slice system-systemd\x2dcoredump.slice.
Jul 05 15:02:21 kronos systemd[1]: Started Process Core Dump (PID 3861/UID 0).
Jul 05 15:02:21 kronos audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-coredump@0-3861-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=succes>
Jul 05 15:02:21 kronos kernel: audit: type=1130 audit(1562331741.875:85): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-coredump@0-3861-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? a>
Jul 05 15:02:22 kronos sway[860]: Releasing sleep lock 11
Jul 05 15:02:22 kronos systemd[1]: Reached target Sleep.
Jul 05 15:02:22 kronos systemd[1]: Starting Suspend...
Jul 05 15:02:22 kronos systemd-sleep[3863]: Suspending system...
Jul 05 15:02:22 kronos kernel: PM: suspend entry (deep)
Jul 05 15:02:22 kronos systemd-coredump[3862]: Process 3850 (swaylock) of user 1000 dumped core.

                                               Stack trace of thread 3850:
                                               #0  0x00007f5b1b2752c0 xkb_state_update_mask (libxkbcommon.so.0)
                                               #1  0x0000561efcee6df8 n/a (swaylock)
                                               #2  0x00007f5b1a4866d0 ffi_call_unix64 (libffi.so.6)
                                               #3  0x00007f5b1a4860a0 ffi_call (libffi.so.6)
                                               #4  0x00007f5b1b265f8f n/a (libwayland-client.so.0)
                                               #5  0x00007f5b1b2626ba n/a (libwayland-client.so.0)
                                               #6  0x00007f5b1b263bfc wl_display_dispatch_queue_pending (libwayland-client.so.0)
                                               #7  0x00007f5b1b26404c wl_display_roundtrip_queue (libwayland-client.so.0)
                                               #8  0x0000561efcee2703 n/a (swaylock)
                                               #9  0x00007f5b1aeb0ee3 __libc_start_main (libc.so.6)
                                               #10 0x0000561efcee2ade n/a (swaylock)
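
For readers unfamiliar with the kernel's "segfault at ..." line in the log above, here is a rough sketch of how its fields can be read. It assumes the x86 page-fault error-code layout (bit 0: page was present, bit 1: write access, bit 2: user mode). The struct and function names are made up for illustration and are not part of swaylock or any library.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the fields of a kernel "segfault at" line. */
struct segv_info {
	unsigned long long fault_addr; /* address whose access faulted */
	unsigned long long ip;         /* instruction pointer at the fault */
	unsigned long long sp;         /* stack pointer at the fault */
	unsigned long long map_base;   /* start of the mapping containing ip */
	unsigned long long map_size;   /* size of that mapping */
	unsigned int error;            /* x86 page-fault error code */
};

/* Parse "... segfault at A ip B sp C error D in NAME[BASE+SIZE]".
 * Returns true only if all six numeric fields were read. */
bool parse_segfault_line(const char *line, struct segv_info *out) {
	const char *p = strstr(line, "segfault at ");
	if (p == NULL) {
		return false;
	}
	int n = sscanf(p,
		"segfault at %llx ip %llx sp %llx error %x in %*[^[][%llx+%llx]",
		&out->fault_addr, &out->ip, &out->sp, &out->error,
		&out->map_base, &out->map_size);
	return n == 6;
}

/* Bit 2 set: the fault happened in user mode. */
bool segv_is_user_mode(const struct segv_info *i) {
	return (i->error & 4u) != 0;
}

/* Bit 1 set: the faulting access was a write (clear means a read). */
bool segv_is_write(const struct segv_info *i) {
	return (i->error & 2u) != 0;
}

/* Bit 0 set: the page was mapped (protection fault); clear means unmapped. */
bool segv_page_was_present(const struct segv_info *i) {
	return (i->error & 1u) != 0;
}

/* Offset of the faulting instruction within the reported mapping. */
unsigned long long segv_map_offset(const struct segv_info *i) {
	return i->ip - i->map_base;
}
```

Applied to the swaylock line above, this gives a fault address of 0, a user-mode read of an unmapped page, and an offset of 0x32c0 into the libxkbcommon mapping, which fits a NULL pointer being read inside xkb_state_update_mask.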

rumpelsepp avatar Jul 05 '19 13:07 rumpelsepp

Now that you mention undocking, @rumpelsepp, this is definitely more likely to happen to me at the end of the day, when I pull all the cables (displays, input, etc.) right before I close the lid.

luispabon avatar Jul 12 '19 15:07 luispabon

I'm on sway version 1.1-rc1-60-g5ffcea4c with swaylock version 1.4-9-gb1a7def, and the last time swaylock crashed was a week ago. Also, FWIW, I'm using kanshi from commit 76e9f41 for display management. I mention it because, as far as I remember, this has only been happening "around" disconnecting the laptop from its docking station (ThinkPad X1C6 with Ultra Dock).

sydneymeyer avatar Jul 12 '19 19:07 sydneymeyer

Oddly enough, since I switched back from a Bluetooth keyboard to a wired Apple keyboard, swaylock has started crashing reliably again. It didn't crash a single time while I was using the Bluetooth keyboard. I'll try to document this behaviour a bit better when I have some time to spare.

sydneymeyer avatar Jul 21 '19 19:07 sydneymeyer

I can confirm this is still a problem. I can reliably reproduce it by disconnecting my Thunderbolt dock (two displays, mouse, keyboard and audio interface) and then closing the lid.

However, if I trigger the lock before closing the lid, there's no problem.

luispabon avatar Nov 13 '19 11:11 luispabon

The backtrace you provided doesn't contain debug symbols. This most likely happens because the binary you're using doesn't have debug information bundled.

Can you try again with a manually compiled binary?

emersion avatar Nov 13 '19 11:11 emersion

Of course

luispabon avatar Nov 13 '19 12:11 luispabon

Is this any help? https://pastebin.com/97aV4Q6v I'm struggling to get a debug build going with Debian's debuild.

luispabon avatar Nov 13 '19 14:11 luispabon

The best option would be to compile manually with ASan (meson -Db_sanitize=address).

emersion avatar Nov 13 '19 14:11 emersion

Copy that. Is anybody here familiar enough with Debian packaging? I can find zero docs on how to get dh_auto_configure to pass any extra params on to meson.

luispabon avatar Nov 13 '19 15:11 luispabon

I can trigger this reliably by unplugging the mouse and keyboard and waiting for swayidle to trigger swaylock -f (swayidle -w timeout 30 'swaylock -f'). Interestingly, sway itself also crashes shortly after I plug the mouse and keyboard back in (in addition to swaylock crashing as in this issue). I will try to get a backtrace with debug symbols when I have time.

icasdri avatar Nov 14 '19 04:11 icasdri

I hit this same problem, also when using a Thunderbolt dock and lots of USB peripherals. I didn't have debug symbols enabled, so there's no new data (identical stack trace). I've enabled debug symbols for next time.

EDIT: Eh, what the heck. Here's the bt:

#0  0x00007f7bff845830 in xkb_state_update_mask () from /nix/store/wg61935hysd510viifbfmqyn2v052wha-libxkbcommon-0.8.4/lib/libxkbcommon.so.0
#1  0x0000000000409347 in keyboard_modifiers ()
#2  0x00007f7bfed35ff0 in ffi_call_unix64 () from /nix/store/j1gs46vkawlk9mz8lc9g0xfi94hwrcv7-libffi-3.2.1/lib/../lib64/libffi.so.6
#3  0x00007f7bfed3577a in ffi_call () from /nix/store/j1gs46vkawlk9mz8lc9g0xfi94hwrcv7-libffi-3.2.1/lib/../lib64/libffi.so.6
#4  0x00007f7bff81f29d in wl_closure_invoke () from /nix/store/41dpnsb2h58lxjp3fxks4m82lsrww4m0-wayland-1.17.0/lib/libwayland-client.so.0
#5  0x00007f7bff81bac9 in dispatch_event.isra () from /nix/store/41dpnsb2h58lxjp3fxks4m82lsrww4m0-wayland-1.17.0/lib/libwayland-client.so.0
#6  0x00007f7bff81cfb4 in wl_display_dispatch_queue_pending ()
   from /nix/store/41dpnsb2h58lxjp3fxks4m82lsrww4m0-wayland-1.17.0/lib/libwayland-client.so.0
#7  0x00007f7bff81d3d3 in wl_display_roundtrip_queue ()
   from /nix/store/41dpnsb2h58lxjp3fxks4m82lsrww4m0-wayland-1.17.0/lib/libwayland-client.so.0
#8  0x0000000000404fa2 in main ()
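
Frames #0 and #1 above, together with the "segfault at 0" kernel line posted earlier in the thread, would be consistent with the keyboard modifiers handler passing a NULL state pointer to xkb_state_update_mask, for example if a modifiers event is handled while no keymap-derived state exists. That is only a guess from the traces, not a confirmed diagnosis. Under that assumption, a guard could look roughly like the sketch below. To keep it compilable without Wayland or libxkbcommon headers, `struct mask_state`, `struct lock_seat`, `mask_state_update` and `handle_modifiers` are hypothetical stand-ins, not swaylock's actual code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the keyboard state a lock screen keeps.
 * In a real client this role would be played by a `struct xkb_state *`
 * created from the keymap the compositor sends. */
struct mask_state {
	uint32_t depressed;
	uint32_t latched;
	uint32_t locked;
	uint32_t group;
};

/* Hypothetical per-seat data handed to the listener as `data`. */
struct lock_seat {
	struct mask_state *xkb_state; /* NULL until a keymap has been processed */
	unsigned dropped_events;      /* modifier events ignored by the guard */
};

/* Stand-in for xkb_state_update_mask(): it dereferences its first
 * argument, so calling it with NULL would crash. */
void mask_state_update(struct mask_state *s, uint32_t depressed,
		uint32_t latched, uint32_t locked, uint32_t group) {
	s->depressed = depressed;
	s->latched = latched;
	s->locked = locked;
	s->group = group;
}

/* Shaped like a wl_keyboard "modifiers" callback, minus the wl_keyboard
 * pointer, so that the sketch needs no Wayland headers. */
void handle_modifiers(void *data, uint32_t serial, uint32_t depressed,
		uint32_t latched, uint32_t locked, uint32_t group) {
	struct lock_seat *seat = data;
	(void)serial; /* unused in this sketch */
	if (seat->xkb_state == NULL) {
		/* No usable keyboard state: ignore the event instead of
		 * dereferencing a NULL pointer. */
		seat->dropped_events++;
		return;
	}
	mask_state_update(seat->xkb_state, depressed, latched, locked, group);
}
```

The guard drops the event instead of creating a default state, on the assumption that a later keymap event will set things up properly. Whether that is the right behaviour for swaylock is a question for the maintainers.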

grahamc avatar Dec 17 '19 16:12 grahamc

I don't know how this would be related, but my laptop is a ThinkPad X1 Carbon 6th Gen, and this machine has a BIOS option called "Thunderbolt BIOS Assist Mode". Since I disabled this "TB Assist Mode", I've had almost no swaylock crashes (1-2 segfaults in about 3-4 months). Besides, a lot of things have become considerably more stable since I switched this option off, particularly around resuming from suspend. IIRC, the Linux kernel received support for handling this in software somewhere around 4.19.

sydneymeyer avatar Dec 17 '19 19:12 sydneymeyer