sway icon indicating copy to clipboard operation
sway copied to clipboard

Sway crashes when using `for_window [all] border pixel 1` and spawning a new window while in fullscreen

Open itshog opened this issue 1 year ago • 6 comments

Please fill out the following:

  • Sway Version: 1.10-dev-7a2ff7ba (Feb 13 2024, branch 'master')

  • Debug Log: sway debug log

  • Configuration File: default config, the only options added were the one mentioned in the title which produces the crash and those related to keyboard layout

  • Stack Trace: bt full output, other output from coredumpctl gdb sway include the following: other output, and also:

Core was generated by `/home/itshog/sway/build/sway/sway --config config-base --debug'.
Program terminated with signal SIGABRT, Aborted.
#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
44	     return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;                                                                                                         
[Current thread is 1 (Thread 0x734bec3879c0 (LWP 72603))]
  • Description:

Some times ago I read that I could use the window rule for_window [all] border pixel 1 to force all windows to have borders (at the time it was needed because some applications that I used didn't have borders without this rule), and I found out that spawning a window while another is in fullscreen with this rule active reliably crashes sway on latest master.

Steps to reproduce:

  1. Have the for_window [all] border pixel 1 option in the config file
  2. Spawn a window (like foot)
  3. Make it fullscreen
  4. Spawn another window

Removing the option from the config seems to fix the problem. I successfully reproduced the crash with different applications, both terminals and graphical ones, and I believe the relevant part of the debug log, which is common between all the crashes, is the following:

00:00:02.411 [DEBUG] [sway/tree/view.c:472] Checking criteria [all]
00:00:02.411 [DEBUG] [sway/tree/view.c:477] for_window '[all]' matches view 0x58c93ae9c470, cmd: 'border pixel 1'
00:00:02.411 [INFO] [sway/commands.c:261] Handling command 'border pixel 1'
00:00:02.411 [DEBUG] [sway/desktop/transaction.c:787] Transaction 0x58c93b13c2e0 committing with 3 instructions
sway: types/xdg_shell/wlr_xdg_toplevel.c:550: wlr_xdg_toplevel_set_size: Assertion `width >= 0 && height >= 0' failed.

Let me know if I could take other steps to investigate this further.

itshog avatar Feb 13 '24 00:02 itshog

This seems loosely related to https://github.com/swaywm/sway/issues/7946

Nefsen402 avatar Feb 13 '24 15:02 Nefsen402

I think I got it why this happens.

The height/width calculation that crashes sway happens at sway/tree/view.c in the function view_autoconfigure. It seems like when rule for_window [all] border pixel 1 is set in the config, creating a new window in fullscreen happens correctly. The problems start when the rule for_window [all] is triggered. Somehow this rule matches the not-focused, and non-fullscreen window and that crashes sway. The function view_autoconfigure is executed throughsway/commands/border.c's function *cmd_border() calling arrange_container() when setting the border to match the rule border pixel 1.

The actual crash happens simply because view_autoconfigure() calculates the new width and height by reducing the borders from the currect width and height (which are both 0) which results in width=height=-2 and that gets caught by sway: types/xdg_shell/wlr_xdg_toplevel.c:550: wlr_xdg_toplevel_set_size: Assertion `width >= 0 && height >= 0' failed.

iritmaximus avatar Mar 01 '24 10:03 iritmaximus

Also, just adding this seems to fix this issue, though this is not the best way to do it (not fixing the root cause).

diff --git a/sway/tree/view.c b/sway/tree/view.c
index 35b4b73f..ce883140 100644
--- a/sway/tree/view.c
+++ b/sway/tree/view.c
@@ -365,8 +365,8 @@ void view_autoconfigure(struct sway_view *view) {
 
        con->pending.content_x = x;
        con->pending.content_y = y;
-       con->pending.content_width = width;
-       con->pending.content_height = height;
+       con->pending.content_width = MAX(width, 0);
+       con->pending.content_height = MAX(height, 0);
 }
 
 void view_set_activated(struct sway_view *view, bool activated) {

iritmaximus avatar Mar 01 '24 12:03 iritmaximus

Also, just adding this seems to fix this issue, though this is not the best way to do it (not fixing the root cause).

diff --git a/sway/tree/view.c b/sway/tree/view.c
index 35b4b73f..ce883140 100644
--- a/sway/tree/view.c
+++ b/sway/tree/view.c
@@ -365,8 +365,8 @@ void view_autoconfigure(struct sway_view *view) {
 
        con->pending.content_x = x;
        con->pending.content_y = y;
-       con->pending.content_width = width;
-       con->pending.content_height = height;
+       con->pending.content_width = MAX(width, 0);
+       con->pending.content_height = MAX(height, 0);
 }
 
 void view_set_activated(struct sway_view *view, bool activated) {

I can reproduce the same issue. I've been using this patch for weeks and it works perfectly, though even though I tried to debug it, I didn't find the root cause.

Ferdi265 avatar May 23 '24 17:05 Ferdi265

I looked at this again, and I think I have more of an idea of where this is going wrong. The following is a timeline of things happening:

  • A client window becomes fullscreen on the active workspace
  • A new client is spawned
  • The new client's window is created, which causes view_map() in sway/tree/view.c to be called
  • view_map() then arranges the workspace. This is done differently depending on the workspace state and whether the window should spawn in fullscreen.
  • In this case, the new client's window should not spawn in fullscreen, does not have a parent container, but does have a workspace (the active workspace), so view_map() calls arrange_workspace() on that workspace.
  • Since the active workspace has a fullscreen container, arrange_workspace() will arrange only that container and not the tiling or floating containers
    • NOTE: this means our newly created window has never been arranged properly, which means its pending.width and pending.height have never been set
  • After arranging the workspace, view_map() calls view_execute_criteria(), which triggers our for_window [all] criterion
  • This causes the command border pixel 1 to be executed, calling cmd_border()
  • cmd_border() sets the window's pending.border_thickness and then calls arrange_container() on it. Since the container never had a valid width or height set, this explodes.

TL;DR: the new window never got a size since it's not actually visible, but cmd_border() reconfigures windows without rearranging the whole workspace, leading to the window being arranged with only part of the necessary state.

A better fix would probably be to bail out of attempting to reconfigure a window that was never visible and therefore has not yet been arranged the proper way and might be missing some state. view_autoconfigure() already has some checks to bail out for hidden scratchpad items, but I think this needs a more comprehensive solution to prevent things from being reconfigured that aren't ready for that yet.

Ferdi265 avatar Jun 10 '24 19:06 Ferdi265

Here are my test case and debug print patches I used to debug this: gist

Note: the debug patch intentionally crashes once the negative container size is set. This makes post-mortem-debugging the call stack at the time the negative size was set easier.

Ferdi265 avatar Jun 10 '24 19:06 Ferdi265

This appears to have been fixed by #8217, since view_autoconfigure() now has changes to never configure a negative size.

Ferdi265 avatar Jul 07 '24 21:07 Ferdi265

Please re-open if you can reproduce with latest commit.

emersion avatar Jul 11 '24 19:07 emersion