
Firefox window content freezes randomly

Open Warthog-Capital opened this issue 3 years ago • 36 comments

Qubes OS release

4.1.1

Brief summary

While using Firefox on the Fedora 35 and 36 XFCE templates, sometimes the window content freezes. However, the window title still changes when clicking on different tabs, and when switching between windows, the window content is updated as well. Closing and opening Firefox again solves the problem for the moment.

Steps to reproduce

  1. Open Firefox
  2. Surf the web, scrolling, clicking and sometimes selecting text with the mouse pointer
  3. Wait for Firefox to freeze.

Expected behavior

The window content should update immediately, while interacting with the application.

Actual behavior

The window content is only updated after switching to a different window (Alt-Tab), and then switching back to Firefox. This way it is impossible to fluently interact with the application.

Example:

  1. Click on an existing open tab. The window title changes to the title of that tab's website, but otherwise nothing happens
  2. Hit Alt-Tab twice
  3. Firefox now shows the open tab, but it is still not possible to interact with the site

Solutions I have tried

Closing Firefox and opening it again solves the problem, until it freezes the next time.

Warthog-Capital avatar Aug 14 '22 12:08 Warthog-Capital

I experience the same bug from time to time, but it's hard to find a way to reliably reproduce it.

resulin avatar Aug 15 '22 14:08 resulin

> Closing and opening Firefox again solves the problem for the moment

I've encountered the same type of issue for weeks now, but only on my "offline" development VM. It turns out that the VM ran out of memory every single time I've encountered this issue. And because the IDE was restarting the killed process without notice, I wasn't able to find a reliable way to reproduce the issue.

You should check if there are any oom-killer invocations in your logs:

cd /var/log/xen/console
grep -rl 'Xorg invoked oom-killer'

If any files are listed, you can use sed to see which programs consumed the memory and what the oom-killer did to free some up:

cat $(grep -rl 'Xorg invoked oom-killer') | sed -n '/Xorg invoked oom-killer/,/Out of memory: Killed process/p'
Truncated example log:
[2022-08-23 19:14:40] [13547.358165] Xorg invoked oom-killer: gfp_mask=0x1100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
[2022-08-23 19:14:40] [13547.358200] CPU: 0 PID: 625 Comm: Xorg Not tainted 5.15.57-1.fc32.qubes.x86_64 #1
[2022-08-23 19:14:40] [13547.358219] Call Trace:
[2022-08-23 19:14:40] [13547.358228]  <TASK>
[2022-08-23 19:14:40] [13547.358236]  dump_stack_lvl+0x46/0x5e
[2022-08-23 19:14:40] [13547.358250]  dump_header+0x4a/0x1f4
[2022-08-23 19:14:40] [13547.358261]  oom_kill_process.cold+0xb/0x10
[2022-08-23 19:14:40] [13547.358272]  out_of_memory+0xed/0x2d0
[2022-08-23 19:14:40] [13547.358284]  __alloc_pages_slowpath.constprop.0+0x96d/0xa20
[2022-08-23 19:14:40] [13547.358300]  __alloc_pages+0x1e9/0x220
[2022-08-23 19:14:40] [13547.358311]  pagecache_get_page+0x1c4/0x4b0
[2022-08-23 19:14:40] [13547.358322]  filemap_fault+0x3cf/0x790
[2022-08-23 19:14:40] [13547.358332]  __do_fault+0x37/0x150
[2022-08-23 19:14:40] [13547.358343]  do_read_fault+0x43/0x80
[2022-08-23 19:14:40] [13547.358355]  do_fault+0xba/0x1a0
[2022-08-23 19:14:40] [13547.358365]  __handle_mm_fault+0x3d9/0x6d0
[2022-08-23 19:14:40] [13547.358376]  handle_mm_fault+0xcf/0x2b0
[2022-08-23 19:14:40] [13547.358387]  do_user_addr_fault+0x1be/0x670
[2022-08-23 19:14:40] [13547.358398]  exc_page_fault+0x72/0x150
[2022-08-23 19:14:40] [13547.358410]  asm_exc_page_fault+0x21/0x30
[2022-08-23 19:14:40] [13547.358421] RIP: 0033:0x76b10f894a70
[2022-08-23 19:14:40] [13547.358434] Code: Unable to access opcode bytes at RIP 0x76b10f894a46.
[2022-08-23 19:14:40] [13547.358448] RSP: 002b:00007ffff9a6f240 EFLAGS: 00010202
[2022-08-23 19:14:40] [13547.358461] RAX: 0000000000000005 RBX: 0000649ec8b3d850 RCX: 000076b10fca658f
[2022-08-23 19:14:40] [13547.358478] RDX: 00007ffff9a6f240 RSI: 00007ffff9a6f370 RDI: 000000000000000e
[2022-08-23 19:14:40] [13547.358494] RBP: 0000649ec8b98460 R08: 000076b106ac8000 R09: 0000000000000780
[2022-08-23 19:14:40] [13547.358511] R10: 0000649ec88a5b50 R11: 0000000000007fff R12: 0000000000000000
[2022-08-23 19:14:40] [13547.358528] R13: 00007ffff9a6f8c8 R14: 00007ffff9a6f8cc R15: 0000000000000000
[2022-08-23 19:14:40] [13547.358546]  </TASK>
[2022-08-23 19:14:40] [13547.358569] Mem-Info:
...
[2022-08-23 19:14:40] [13547.358968] 8660 total pagecache pages
[2022-08-23 19:14:40] [13547.358976] 4282 pages in swap cache
[2022-08-23 19:14:40] [13547.358990] Swap cache stats: add 780235, delete 775951, find 114217/171831
[2022-08-23 19:14:40] [13547.359004] Free swap  = 0kB
[2022-08-23 19:14:40] [13547.359013] Total swap = 1048572kB
[2022-08-23 19:14:40] [13547.359027] 982943 pages RAM
[2022-08-23 19:14:40] [13547.359035] 0 pages HighMem/MovableOnly
[2022-08-23 19:14:40] [13547.359044] 32107 pages reserved
[2022-08-23 19:14:40] [13547.359053] 0 pages cma reserved
[2022-08-23 19:14:40] [13547.359061] 0 pages hwpoisoned
[2022-08-23 19:14:40] [13547.359070] Tasks state (memory values in pages):
[2022-08-23 19:14:40] [13547.359092] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
...
[2022-08-23 19:14:40] [13547.359554] [    625]  1000   625    43188    15696   258048     2373             0 Xorg
...
[2022-08-23 19:14:40] [13547.360559] [   1492]  1000  1492  1344755   379571  6025216   141791             0 java
...
[2022-08-23 19:14:40] [13547.360794] [   2458]  1000  2458   409658   256637 12570624    12776             0 node
...
[2022-08-23 19:14:40] [13547.360857] [   3611]  1000  3611 294774777     245   598016     3471           300 firefox
...
[2022-08-23 19:14:40] [13547.360878] [   3782]  1000  3782   181023     1293  1388544     3312             0 npm run start
[2022-08-23 19:14:40] [13547.360898] [   3794]  1000  3794 10900971   155524 15831040    50991             0 ng serve
[2022-08-23 19:14:41] [13547.361018] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/...-1492.scope,task=java,pid=1492,uid=1000
[2022-08-23 19:14:41] [13547.361187] Out of memory: Killed process 1492 (java) total-vm:5379020kB, anon-rss:1518180kB, file-rss:0kB, shmem-rss:104kB, UID:1000 pgtables:5884kB oom_score_adj:0
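
As a rough sketch of the same check, the "Killed process" lines can be tallied per program name, which makes a repeatedly killed process (java in the log above) easy to spot. The function name summarize_oom_kills is made up for illustration; it only assumes the kernel message format visible in the log above.

```shell
# Sketch: count how often each program was killed by the oom-killer.
# summarize_oom_kills is a hypothetical helper; it reads log text on stdin.
summarize_oom_kills() {
    # Keep only "Out of memory: Killed process <pid> (<name>)",
    # reduce it to <name>, then count occurrences per name.
    grep -oE 'Out of memory: Killed process [0-9]+ \([^)]+\)' \
        | sed -E 's/.*\(([^)]+)\)$/\1/' \
        | sort | uniq -c | sort -rn \
        | awk '{ count = $1; $1 = ""; sub(/^ /, ""); print count "\t" $0 }'
}
```

It could be fed with something like `cat /var/log/xen/console/* | summarize_oom_kills`, either in dom0 or, as suggested in the next comment, on a copy of the files in a disposable VM.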

SchnWalter avatar Aug 23 '22 17:08 SchnWalter

@SchnWalter I recommend copying the files to a disposable VM first

DemiMarie avatar Aug 24 '22 01:08 DemiMarie

I observe this from time to time too. I captured the gui-agent debug log this time. I'm not sure if any of this gives any hints; to me it looks rather normal. I do not see any process killed by oom-killer. I'm not exactly sure what action caused the freeze, but I found it frozen when I switched back to it, after doing something on another workspace (in another VM).

0x100002b is Firefox; 0x200002 is qubes-gui's 1x1 window.

(before freeze)
Aug 26 00:21:38 disp9221 qubes-gui[550]: received message type 128 for 0x100002b
Aug 26 00:21:38 disp9221 qubes-gui[550]: 0x100002b lost focus
Aug 26 00:21:38 disp9221 qubes-gui[550]: received message type 127 for 0x100002b
(after this point it was frozen)
Aug 26 00:23:43 disp9221 qubes-gui[550]: Skipping unmanaged window 0x200002received message type 142 for 0x0
Aug 26 00:23:43 disp9221 qubes-gui[550]: received message type 127 for 0x100002b
Aug 26 00:23:43 disp9221 qubes-gui[550]: Skipping unmanaged window 0x200002received message type 142 for 0x0
Aug 26 00:23:43 disp9221 qubes-gui[550]: received message type 128 for 0x100002b
Aug 26 00:23:43 disp9221 qubes-gui[550]: WM_TAKE_FOCUS sent for 0x100002b
Aug 26 00:23:43 disp9221 qubes-gui[550]: 0x100002b raised
Aug 26 00:23:43 disp9221 qubes-gui[550]: handle configure event 0x100002b w=1280 h=1040 ovr=0
Aug 26 00:23:43 disp9221 qubes-gui[550]: handle property WM_HINTS for window 0x100002b
Aug 26 00:23:43 disp9221 qubes-gui[550]: Received input hint 0x1 for Window 0x100002b
Aug 26 00:23:43 disp9221 qubes-gui[550]: received message type 124 for 0x100002b
Aug 26 00:23:43 disp9221 qubes-gui[550]: received message type 126 for 0x100002b
Aug 26 00:23:43 disp9221 qubes-gui[550]: received message type 126 for 0x100002b
Aug 26 00:23:43 disp9221 qubes-gui[550]: received message type 126 for 0x100002b

marmarek avatar Aug 25 '22 22:08 marmarek

I've been meaning to try to debug this in Firefox or tor browser as I've seen it happen in both. One area I have meant to explore is looking at the resource limits/used stats in /sys (/proc?) to see if it gets triggered by hitting hard limits (probably not) or soft limits that might temporarily be hit and then lag before being increased by firefox/tor browser.
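
One way to look at this, as a minimal sketch: on Linux, /proc/<pid>/limits lists the soft and hard limits of a process and /proc/<pid>/fd holds one entry per open file descriptor, so the two can be compared while the browser is frozen. The helper name fd_usage and its optional second argument (an alternative proc root, handy for testing) are assumptions for illustration, and only the open-files limit is covered here.

```shell
# Sketch: compare the number of open file descriptors of a process with
# its "Max open files" soft and hard limits.
# Usage: fd_usage PID [PROC_ROOT]   (PROC_ROOT defaults to /proc)
fd_usage() {
    pid=$1
    proc_root=${2:-/proc}
    # One directory entry per open file descriptor.
    open=$(ls "$proc_root/$pid/fd" | wc -l | tr -d ' ')
    # Line format: "Max open files  <soft>  <hard>  files"
    soft=$(awk '/^Max open files/ { print $4 }' "$proc_root/$pid/limits")
    hard=$(awk '/^Max open files/ { print $5 }' "$proc_root/$pid/limits")
    echo "open=$open soft=$soft hard=$hard"
}
```

For example, `fd_usage "$(pgrep -o firefox)"` would report the oldest firefox process; whether any limit is actually involved in the freeze is untested.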

B

brendanhoar avatar Aug 25 '22 23:08 brendanhoar

@marmarek what is your dmesg? Are block devices still working?

DemiMarie avatar Aug 26 '22 01:08 DemiMarie

> @marmarek what is your dmesg?

Nothing else. The above log is all I have in journal at this time.

> Are block devices still working?

Yes. All the other windows are functional too. It's only this Firefox instance that is frozen. Firefox works correctly after restarting it (for some time). This bug made me start xterm in a DispVM and launch Firefox from there (so I can restart it and load the previous session from history), instead of starting Firefox in the DispVM directly (where restarting it would kill the DispVM).

marmarek avatar Aug 26 '22 10:08 marmarek

I've been running some Firefox instances via x11trace for the past week+ and the issue did not reproduce there. At the same time, I did hit the issue a few times in another VM with the same Firefox version (but without x11trace there). It could be just coincidence (the issue doesn't happen that often), but it could also be some side effect of x11trace (increased latency reducing the chance of some race condition?).

marmarek avatar Sep 08 '22 11:09 marmarek

FWIW I have a habit of randomly selecting text with the mouse while reading. I have the impression that this issue happens most often while I am doing that kind of movement. This could be coincidental of course. I wonder if anybody else here observes a similar correlation?

Specifically, what I am doing is:

  • With a physical mouse: Hold the left mouse button and move the mouse pointer up and down, causing text to be selected and unselected in quick succession as I go up and down.
  • With a touchpad: Double-tap, and while on the second tap, keep the finger on the touchpad and move the mouse pointer up and down.

Warthog-Capital avatar Sep 11 '22 17:09 Warthog-Capital

Update:

I have seen the freeze on different VMs and with plenty of RAM to spare. The strace logs always contained some timeouts for FD 4 of the firefox process, which seems to always point to an Xorg socket. And I've seen this while reading static websites with NoScript+UBlockOrigin enabled, so nothing that would require a high refresh rate.

strace + lsof output for the frozen FF window:
-- VM#1
...
01:44:28.979846 poll([{fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=10, events=POLLIN}, {fd=27, events=POLLIN}, {fd=28, events=POLLIN}], 5, 0) = 0 (Timeout) <0.000035>
01:44:28.979929 clock_gettime(CLOCK_MONOTONIC, {tv_sec=40773, tv_nsec=737363552}) = 0 <0.000025>
...
01:44:28.981550 clock_gettime(CLOCK_MONOTONIC, {tv_sec=40773, tv_nsec=738984866}) = 0 <0.000020>
01:44:28.981682 recvmsg(4, {msg_namelen=0}, 0) = -1 EAGAIN (Resource temporarily unavailable) <0.000029>
01:44:28.981760 recvmsg(4, {msg_namelen=0}, 0) = -1 EAGAIN (Resource temporarily unavailable) <0.000021>
01:44:28.981911 clock_gettime(CLOCK_MONOTONIC, {tv_sec=40773, tv_nsec=739347632}) = 0 <0.000030>
...
01:44:28.982972 clock_gettime(CLOCK_MONOTONIC, {tv_sec=40773, tv_nsec=740437094}) = 0 <0.000070>
01:44:28.983182 poll([{fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=10, events=POLLIN}, {fd=27, events=POLLIN}, {fd=28, events=POLLIN}], 5, -1) = 1 ([{fd=4, revents=POLLIN}]) <0.686750>
01:44:29.670243 clock_gettime(CLOCK_MONOTONIC, {tv_sec=40774, tv_nsec=428295719}) = 0 <0.000608>
...
01:44:29.671429 clock_gettime(CLOCK_MONOTONIC, {tv_sec=40774, tv_nsec=428916278}) = 0 <0.000021>
01:44:29.671600 recvmsg(4, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="U\2\361\207\32+n\2\3\24\4\0\20\0\0\0\0\0\0\24\24\24\24\24\0\0\3\37%\2\0\0", iov_len=4096}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 32 <0.000028>
01:44:29.671835 recvmsg(4, {msg_namelen=0}, 0) = -1 EAGAIN (Resource temporarily unavailable) <0.000024>
01:44:29.672053 recvmsg(4, {msg_namelen=0}, 0) = -1 EAGAIN (Resource temporarily unavailable) <0.000023>
01:44:29.672226 clock_gettime(CLOCK_MONOTONIC, {tv_sec=40774, tv_nsec=429711111}) = 0 <0.000021>
01:44:29.672390 clock_gettime(CLOCK_MONOTONIC, {tv_sec=40774, tv_nsec=429873976}) = 0 <0.000021>
01:44:29.672550 clock_gettime(CLOCK_MONOTONIC, {tv_sec=40774, tv_nsec=430033491}) = 0 <0.000021>
...

FDs with socket endpoint information:

-- VM#1
$ lsof +E -a -p881 -d4
COMMAND PID USER   FD   TYPE             DEVICE SIZE/OFF  NODE NAME
Xorg    613 user   27u  unix 0x00000000625f7363      0t0 15310 @/tmp/.X11-unix/X0 type=STREAM ->INO=15712 881,firefox,4u (CONNECTED)
firefox 881 user    4u  unix 0x000000009a8e8040      0t0 15712 type=STREAM ->INO=15310 613,Xorg,27u (CONNECTED)

-- VM#2
$ lsof +E -a -p912 -d4
COMMAND PID USER   FD   TYPE             DEVICE SIZE/OFF  NODE NAME
Xorg    643 user   28u  unix 0x00000000a5f51327      0t0 16513 @/tmp/.X11-unix/X0 type=STREAM ->INO=16512 912,firefox,4u (CONNECTED)
firefox 912 user    4u  unix 0x000000007ed0b555      0t0 16512 type=STREAM ->INO=16513 643,Xorg,28u (CONNECTED)
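
For sifting through a long trace like the one above, a small filter can keep only the calls that took a long time. This is a sketch that assumes strace was run with -T, which appends the time spent in each call in angle brackets as seen above; slow_syscalls is a hypothetical helper name and the default threshold of 0.1 seconds is arbitrary.

```shell
# Sketch: print strace lines whose duration (the trailing "<seconds>"
# added by strace -T) is at least THRESHOLD seconds.
# Usage: slow_syscalls [THRESHOLD] < trace.txt
slow_syscalls() {
    threshold=${1:-0.1}
    awk -v limit="$threshold" '
        # Only lines that end in a duration such as <0.686750> are considered.
        match($0, /<[0-9]+\.[0-9]+>$/) {
            secs = substr($0, RSTART + 1, RLENGTH - 2) + 0
            if (secs >= limit) print
        }'
}
```

On the VM#1 excerpt, only the blocking poll() that waited about 0.69 seconds for FD 4 would pass the default threshold.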

There's also something else that I didn't mention: if the window doesn't completely freeze, sometimes it's redrawn on minimize+restore or when using the "shade" button (the one that causes the window to be "minimized" to the titlebar). This makes me think that the clock_gettime(CLOCK_MONOTONIC, ...) entries are to blame. They appear every 150-200 microseconds (going by the strace timestamps above), just like in https://github.com/QubesOS/qubes-issues/issues/7404#issuecomment-1097491275. I also get a "Fast TSC calibration failed" in VMs, and they use the xen clock source. I'll have to look into https://github.com/QubesOS/qubes-core-admin/commit/8c8c99c07643c996356f4fc12bbe781a31454f93
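
The spacing of those calls can be checked directly from the strace timestamps. The sketch below (the function name clock_gettime_gaps is made up for illustration) prints the gap in microseconds between back-to-back clock_gettime(CLOCK_MONOTONIC, ...) lines, assuming the HH:MM:SS.microseconds prefix shown in the trace above.

```shell
# Sketch: print the time between consecutive clock_gettime(CLOCK_MONOTONIC)
# lines of a strace log, in microseconds. Reads the trace on stdin.
# A trace that crosses midnight would yield one negative gap; not handled.
clock_gettime_gaps() {
    awk '
        /clock_gettime\(CLOCK_MONOTONIC/ {
            # $1 is the wall-clock timestamp, e.g. 01:44:29.672226
            split($1, t, ":")
            now = (t[1] * 3600 + t[2] * 60 + t[3]) * 1000000
            # +0.5 rounds to the nearest microsecond before %d truncates.
            if (seen) printf "%d\n", now - prev + 0.5
            prev = now
            seen = 1
            next
        }
        # Any other line (including "...") breaks the run of adjacent calls.
        { seen = 0 }'
}
```

For the last three lines of the VM#1 trace above it prints 164 and 160.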

Clocksource logs in dom0 & domU:
dom0# cat /sys/devices/system/clocksource/clocksource0/current_clocksource 
tsc

dom0# journalctl --boot --grep=tsc
-- Logs begin at $TIME$, end at $TIME$. --
$TIME$ dom0 kernel: tsc: Fast TSC calibration using PIT
$TIME$ dom0 kernel: tsc: Detected 3502.691 MHz processor
$TIME$ dom0 kernel: tsc: Detected 3502.636 MHz TSC
$TIME$ dom0 kernel: clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x327d0acf13d, max_idle_ns: 440795226346 ns
$TIME$ dom0 kernel: clocksource: Switched to clocksource tsc-early
$TIME$ dom0 kernel: clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x327d0acf13d, max_idle_ns: 440795226346 ns
$TIME$ dom0 kernel: clocksource: Switched to clocksource tsc


domU# cat /sys/devices/system/clocksource/clocksource0/current_clocksource 
xen

domU# journalctl --grep="tsc" --boot --no-hostname
-- Journal begins at $TIME$, ends at $TIME$. --
[$TIME$] domU kernel: tsc: Fast TSC calibration failed
[$TIME$] domU kernel: tsc: Detected 3502.636 MHz processor
[$TIME$] domU kernel: TSC deadline timer available
[$TIME$] domU kernel: clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x327d0acf13d, max_idle_ns: 440795226346 ns
[$TIME$] domU kernel: clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x327d0acf13d, max_idle_ns: 440795226346 ns
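
To compare several VMs quickly, the same sysfs directory also exposes available_clocksource next to current_clocksource. A minimal sketch, with show_clocksource as a hypothetical helper name and the directory passed as an optional argument:

```shell
# Sketch: show the clocksource in use and the ones the kernel offers.
# Usage: show_clocksource [SYSFS_DIR]
show_clocksource() {
    dir=${1:-/sys/devices/system/clocksource/clocksource0}
    echo "current: $(cat "$dir/current_clocksource")"
    echo "available: $(cat "$dir/available_clocksource")"
}
```

Running it in dom0 and in a domU would show at a glance whether tsc is offered at all inside the VM.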

Also, if I recall correctly, when I saw the oom-killer log entries, the window was never redrawn, I had to reopen firefox or the VM for it to recover.

SchnWalter avatar Sep 23 '22 10:09 SchnWalter

One more thing: the clock_gettime() strace entries, and the mouse selection mentioned by @Warthog-Capital, make me think that this issue might be related to those where mouse events were not counted as "user activity", causing the session to be locked or something.

I usually have 2-3 different Firefox instances running at the same time, each inside its own VM, some on debian-11, others on fedora-36-xfce*, usually customized from the base Qubes template. And there's always one Firefox window that doesn't receive any keyboard input for long stretches of time, only mouse clicks and scroll events.

I think that this might be related to the X session and the fixes that attached the time value as a flag on the X window object. I'll provide links when I have some time to debug this; I don't remember the issues right now.

* later edit: I have now seen this issue on VMs without modified templates: debian-11 (gnome) and fedora-36-xfce.

SchnWalter avatar Sep 28 '22 10:09 SchnWalter

This seems to happen outside of firefox with almost all other applications as well (fedora-36). I will edit this comment to confirm whether I see the fd#4 related issues in straces.

vladimir-lu avatar Oct 26 '22 17:10 vladimir-lu

Debugging something (window manager? Xen? video driver?) in dom0 could help too, because dom0 receives the "image" from the VM, so this could be related to broken dom0<->VM communication. Sorry, I don't know how this works internally.

gangamstyle avatar Dec 06 '22 16:12 gangamstyle

Happens quite often now with Tor Browser 12.0 (based on Firefox 102 ESR) in whonix-ws based qubes.

resulin avatar Dec 11 '22 07:12 resulin

I used to have this problem quite often, a few times per day, but after the recent update to Firefox 109 and after checking the Use recommended performance settings option, I haven't encountered it even once over the last week :tada: :crossed_fingers:

Not sure why Use recommended performance settings was unchecked in some of my VMs, probably a past attempt to solve this issue that likely backfired... :sweat_smile:

na-- avatar Jan 27 '23 14:01 na--

@na-- I always had Use recommended performance settings checked when encountering this problem.

resulin avatar Jan 27 '23 15:01 resulin

I also haven't hit this issue in the last week or two, but got it semi-frequently before. I'd wait a week more, and if it doesn't happen again, I think we can close it.

marmarek avatar Jan 27 '23 15:01 marmarek

still happening to me at least, with latest firefox 109.0 in fedora-36. in the past had hardware acceleration disabled, i have also recently tried with Use recommended performance settings, no difference i can tell.

~~now i can at least close the VM that firefox is frozen within cleanly, in the past i had to kill the VM.~~ never mind had to kill a firefox VM today.

mfc avatar Jan 31 '23 15:01 mfc

I've submitted #8145 that may be a way to demonstrate a root cause of at least one mechanism for random freezes

the-moog avatar Apr 16 '23 00:04 the-moog

I do not think this is strictly a Qubes issue. Just my 2c: I've had numerous problems like this on Fedora XFCE across multiple Fedora and Firefox versions. These are some observations I've had:

  • The variant of this issue I've had for longest of them all has something to do with dragging tabs - I'm not sure exactly what it is, but dragging a tab in a specific way can cause it to get stuck halfway - but trying to drag it again fixes it. If you don't fix it, you cannot switch between tabs correctly - the UI won't update.
  • A similar issue like the above can also occur except the tab does not get visually stuck halfway between tabs - this is more insidious. The UI won't update when you open a new tab, click on a tab, etc., but the title bar will - until you switch to another window and then back to Firefox, which seems to trigger a single UI update. This seems to have something to do with dragging the tab down (?). I thought for quite some time that this could only be fixed by restarting the browser, but I eventually found that dragging tabs solves this one too.

More recently though I've been having more issues:

  • At times, Firefox goes into on-off massive input lag mode, where most keyboard inputs are delayed by around 15 seconds, if not more. If this happens, sometimes it starts working again after a while, only to relapse back into input lag. Some inputs, like hotkeys (such as to open a new tab), keep working fine during this time.
  • Other times the browser simply freezes completely. Minimizing and maximizing the window will cause artifacts akin to the thread not drawing anything into the 'new' window region.

These are frustrating and seem to happen at complete random. I have not found a single way to reproduce any of these reliably.

ziplantil avatar May 14 '23 20:05 ziplantil

Just wanted to add that I haven't encountered this issue for several months now. From my side it can be closed.

Warthog-Capital avatar May 29 '23 09:05 Warthog-Capital

I still encounter it, but much more rarely than before, and I think it's a smaller version of the original bug :confused:

Very rarely, a single Firefox window freezes (others from the same session are fine), but it's mostly the browser chrome part that's frozen (the interface), not the actual content :man_shrugging: And even the interface is somewhat responsive, e.g. I can switch to the next tab (Ctrl+Tab), and if I switch to another window and back to the frozen window (e.g. 2x Alt+Tab), I see the contents of the new tab, as @ziplantil mentioned. But contrary to the original report in the issue, I usually can interact with the page contents and even scroll. I can also blindly copy the address bar contents (Ctrl+L, Ctrl+C) successfully and open the same tabs in another window... :sweat_smile:

At this point, I have no indications if this is QubesOS specific or not, but trying to search for it just leads me back to this issue, I can't find anything similar at https://bugzilla.mozilla.org/ (though I may not be searching it very well). So it might be a Firefox bug that only surfaces on QubesOS, which makes keeping the issue open somewhat worth it :man_shrugging:

na-- avatar May 29 '23 11:05 na--

@na-- Can you confirm if the logs show any hint of an OOM occurring at the time you get a frozen dialog? Look in two places, both in the app Qube and in DOM0 (or whatever domain hosts your GUI). I've found you can have two instances of an app (e.g. FF) open where one is dead and the other is not; it depends on what got killed, as the app UI is separate from the app itself. I.e., an app/process/thread can die but the GUI objects remain on the screen. The container-level UI objects usually still work. You can click, move, resize, minimize, but you can't close them, and their contents are either empty or won't update. I've also seen similar in DOM0, where the result is an unresponsive dialog, not always in the app that you are using.

the-moog avatar May 29 '23 16:05 the-moog

:thinking: I don't remember seeing any signs of an OOM, but I haven't actually explicitly looked, I'll make sure to do so next time it happens! It's unlikely because the VM I most often encounter the problem has 6GB of RAM, but it's not impossible... That said, it happens very, very rarely nowadays, so it might be a while until I have more information.

na-- avatar May 29 '23 19:05 na--

Only 6G? Mine has 32G and it still manages it, though rarely now that I've made some adjustments. It seems Xen is not that efficient with memory, and the stock Qubes images have defaults under which important IPC between DOM0 and appQubes runs with the same 'kill me' badge as a toy app or an occasional cron job.

Does your GPU eat into CPU DRAM?

The appVM I am trying in right now has 12G allocated with 12G swap. It has 7G free but not started swapping. So 5G used and I have only three apps open. One window of FF open with 8 tabs, a terminal (in which I just typed free) and Dolphin.

DOM0 (also my GUI) has 4G allocated, 81MB free (eeek - need to fix that). No apps running other than X and the display manager. Top says the display manager is a hog.

the-moog avatar May 30 '23 03:05 the-moog

After the most recent Firefox update (115.3.0esr) on Debian 11, the problem started occurring again for me. Three times today. Restarting the browser still solves the problem, until the next time.

Warthog-Capital avatar Sep 30 '23 19:09 Warthog-Capital

The proper solution is to make anything related to the Qube graphics/UI pipe, and Qube/Xen interaction/management, less prone to being taken out by the OOM killer. Currently it seems to use the software precision of a blunderbuss. Most things have the same values (see top). I tweaked a few apps by altering their startup in systemd and giving them a better OOM value, but there are a lot and I don't know what half of them do. That seemed to help, but I too have recently been seeing more frequent lockups. Plus, of course, DOM0 can run/close/restart any app on any qube at any time, so those need to be protected if they are critical and a failure would leave the Qube in a zombie state. Especially if they are non-atomic.
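
To see which processes share the same value, the per-process OOM adjustment can be read from /proc/<pid>/oom_score_adj (the same number as the oom_score_adj column in the oom-killer log earlier in this thread, where Xorg has 0 and firefox 300). A rough sketch; list_oom_adj is a made-up helper name, and the optional argument is an alternative proc root used only for testing:

```shell
# Sketch: list "oom_score_adj <TAB> pid <TAB> command" for every process,
# sorted so that the best protected (lowest value) processes come first.
# Usage: list_oom_adj [PROC_ROOT]   (PROC_ROOT defaults to /proc)
list_oom_adj() {
    proc_root=${1:-/proc}
    for d in "$proc_root"/[0-9]*; do
        # Skip entries that vanished or are not readable.
        [ -r "$d/oom_score_adj" ] || continue
        printf '%s\t%s\t%s\n' \
            "$(cat "$d/oom_score_adj" 2>/dev/null)" \
            "${d##*/}" \
            "$(cat "$d/comm" 2>/dev/null)"
    done | sort -n
}
```

For services started by systemd, the OOMScoreAdjust= setting in the unit file is the usual place to change this value; which Qubes units would need it is not established in this thread.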

the-moog avatar Oct 14 '23 19:10 the-moog

The comments above clearly say it is not OOM... And if it were OOM killing some Qubes graphics component, restarting the browser wouldn't help.

marmarek avatar Oct 14 '23 20:10 marmarek

for my issue (which may be orthogonal to OP issue), it requires killing the qube. not only can i not restart the browser, i cannot shut down (or restart) the qube cleanly.

mfc avatar Oct 14 '23 20:10 mfc

@mfc can you file a separate ticket for that?

DemiMarie avatar Oct 14 '23 20:10 DemiMarie