both nohang tests failed on Ubuntu 18.04

Open lvitya opened this issue 5 years ago • 23 comments

I have installed nohang as described. Then

$ sudo systemctl enable nohang-desktop
$ sudo systemctl start nohang-desktop
$ nohang --memload

And guess what: the system froze. After waiting for 10 minutes I performed a power cycle. After booting I checked whether nohang was running with $ systemctl list-units

This time I tried $ tail /dev/zero and the system froze again.

Is this me doing something wrong or the app? For me as a user this behaviour is not expected.

lvitya avatar May 28 '20 14:05 lvitya

Hi! I'd like to see the output of:

$ uname -a

$ cat /proc/swaps

$ cat /proc/sys/vm/swappiness

$ cat /proc/pressure/memory

See also https://github.com/hakavlad/nohang/issues/85

hakavlad avatar May 28 '20 15:05 hakavlad

And I'd like to see the journal since nohang started.

sudo journalctl -eu nohang-desktop

hakavlad avatar May 28 '20 15:05 hakavlad
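For anyone reproducing this, the diagnostics requested above can be gathered in one pass. This is only a sketch: `show_file` and `collect_report` are hypothetical helper names, not part of nohang, and the report layout is an arbitrary choice.

```shell
#!/bin/sh
# Print a labelled copy of a file, or a note when it cannot be read.
# (show_file is a hypothetical helper, not something nohang ships.)
show_file() {
    printf '== %s ==\n' "$1"
    if [ -r "$1" ]; then
        cat "$1"
    else
        printf '(not available)\n'
    fi
}

# Gather the kernel version and the memory-related files asked for above.
collect_report() {
    printf '== uname -a ==\n'
    uname -a
    show_file /proc/swaps
    show_file /proc/sys/vm/swappiness
    show_file /proc/pressure/memory
}
```

Calling `collect_report > report.txt` writes everything to one file; the journal can be appended separately with `sudo journalctl -u nohang-desktop --no-pager >> report.txt`.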

$ uname -a
Linux viktor-desktop 4.16.3-041603-generic #201804190730 SMP Thu Apr 19 07:32:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

$ cat /proc/swaps
Filename				Type		Size	Used	Priority
/dev/dm-0                               partition	8252924	3009020	-2

$ cat /proc/sys/vm/swappiness
60

$ cat /proc/pressure/memory
cat: /proc/pressure/memory: No such file or directory

lvitya avatar May 28 '20 16:05 lvitya

-- Logs begin at Thu 2020-04-23 14:41:57 EEST, end at Thu 2020-05-28 19:53:05 EEST. --
тра 28 16:51:56 viktor-desktop systemd[1]: Started Sophisticated low memory handler.
тра 28 16:51:56 viktor-desktop nohang-desktop[6284]: config: /etc/nohang/nohang-desktop.conf
тра 28 16:51:56 viktor-desktop nohang-desktop[6284]: WARNING: PSI metrics are not provided by the kernel: [Errno 2] No such file or directory: '/proc/pressure/memory'
тра 28 16:51:56 viktor-desktop nohang-desktop[6284]: Monitoring has started!
-- Reboot --
тра 28 17:05:34 viktor-desktop systemd[1]: Started Sophisticated low memory handler.
тра 28 17:05:36 viktor-desktop nohang-desktop[1243]: config: /etc/nohang/nohang-desktop.conf
тра 28 17:05:36 viktor-desktop nohang-desktop[1243]: WARNING: PSI metrics are not provided by the kernel: [Errno 2] No such file or directory: '/proc/pressure/memory'
тра 28 17:05:36 viktor-desktop nohang-desktop[1243]: Monitoring has started!
-- Reboot --
тра 28 17:21:58 viktor-desktop systemd[1]: Started Sophisticated low memory handler.
тра 28 17:21:59 viktor-desktop nohang-desktop[1270]: config: /etc/nohang/nohang-desktop.conf
тра 28 17:21:59 viktor-desktop nohang-desktop[1270]: WARNING: PSI metrics are not provided by the kernel: [Errno 2] No such file or directory: '/proc/pressure/memory'
тра 28 17:21:59 viktor-desktop nohang-desktop[1270]: Monitoring has started!
тра 28 17:58:39 viktor-desktop nohang-desktop[1270]: Warning threshold exceeded
тра 28 17:58:39 viktor-desktop nohang-desktop[1270]: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
тра 28 17:58:39 viktor-desktop nohang-desktop[1270]: Memory status that requires corrective actions:
тра 28 17:58:39 viktor-desktop nohang-desktop[1270]:   MemAvailable [2 MiB, 0.0 %] <= soft_threshold_min_mem [392 MiB, 5.0 %]
тра 28 17:58:39 viktor-desktop nohang-desktop[1270]:   SwapFree [788 MiB, 9.8 %] <= soft_threshold_min_swap [806 MiB, 10.0 %]
тра 28 17:58:39 viktor-desktop nohang-desktop[1270]: Found 26 tasks with non-zero oom_score (except init and self) in 4957ms
тра 28 17:58:39 viktor-desktop nohang-desktop[1270]: TOP-15 tasks by badness:
тра 28 17:58:39 viktor-desktop nohang-desktop[1270]:   Name                PID badness
тра 28 17:58:39 viktor-desktop nohang-desktop[1270]:   --------------- ------- -------
тра 28 17:58:39 viktor-desktop nohang-desktop[1270]:   Web Content       26404     234
тра 28 17:58:39 viktor-desktop nohang-desktop[1270]:   Web Content       26552     226
тра 28 17:58:39 viktor-desktop nohang-desktop[1270]:   Web Content       26160     225

lvitya avatar May 28 '20 16:05 lvitya

The two reboots in the journal are power cycles after the freezes. After the second reboot, I just used the PC normally. There are more journal entries after this part.

lvitya avatar May 28 '20 16:05 lvitya

PSI is required to prevent freezing under heavy swapping.

You could install a newer kernel (4.20+), maybe 5.0 or 5.3.

Demo: https://youtu.be/Y6GJqFE_ke4

hakavlad avatar May 28 '20 17:05 hakavlad

https://www.kernel.org/doc/html/latest/accounting/psi.html

hakavlad avatar May 28 '20 17:05 hakavlad
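On a kernel with PSI, /proc/pressure/memory contains a `some` line and a `full` line, each with `avg10`, `avg60`, `avg300` and `total` fields, as described in the kernel documentation linked above. A minimal sketch for reading one of those averages follows; `psi_avg10` is a hypothetical helper name, and this is not how nohang itself reads the file.

```shell
#!/bin/sh
# Print the avg10 value of the "some" or "full" line of a PSI file.
# $1: "some" or "full"; $2: PSI file (defaults to /proc/pressure/memory).
# Returns 1 when the file is missing, e.g. on kernels older than 4.20.
psi_avg10() {
    kind=$1
    file=${2:-/proc/pressure/memory}
    [ -r "$file" ] || return 1
    awk -v kind="$kind" '$1 == kind { sub(/^avg10=/, "", $2); print $2 }' "$file"
}
```

For example, `psi_avg10 full || echo "no PSI support"` prints the 10-second "full" stall percentage, or the fallback message on a kernel like the 4.16 one reported here.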

Also, disabling swap space can help you to prevent freezing: this will help to avoid prolonged swapping. Demo without swapping: https://youtu.be/UCwZS5uNLu0

hakavlad avatar May 28 '20 17:05 hakavlad
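Before disabling or shrinking swap, it can help to see how much is configured. The sketch below sums the Size column of /proc/swaps (in KiB); `swap_total_kib` is a hypothetical helper name introduced only for illustration.

```shell
#!/bin/sh
# Sum the Size column (third field, KiB) of a /proc/swaps-style file.
# $1: file to read (defaults to /proc/swaps).
swap_total_kib() {
    file=${1:-/proc/swaps}
    # Skip the header line; "+ 0" prints 0 when no swap is configured.
    awk 'NR > 1 { total += $3 } END { print total + 0 }' "$file"
}
```

With the /proc/swaps output posted earlier in this thread it prints 8252924. Swap can then be turned off until the next reboot with `sudo swapoff -a`, which needs enough free RAM to take back the pages currently swapped out.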

> You could install a newer kernel (4.20+), maybe 5.0 or 5.3.

That is OK for me. However, according to the README, PSI is not required.

> Also, disabling swap space can help you to prevent freezing: this will help to avoid prolonged swapping.

What if I have only 8 GB of RAM? I remember reading in some article that it is better to have swap enabled. I only have an SSD in this system, so theoretically swap should be fast. Could you please clarify about swap?

lvitya avatar May 28 '20 18:05 lvitya

> according to README PSI is not required

It is not required for basic usage, i.e. to handle low MemAvailable/SwapFree.

> Theoretically, swap speed should be fast

As you see, it is not fast enough to prevent freezing.

> Could you please clarify about swap?

Yes, later; please wait.

hakavlad avatar May 28 '20 18:05 hakavlad

> I read in some article that it is better to have swap enabled

Yes, the article is https://chrisdown.name/2018/01/02/in-defence-of-swap.html

hakavlad avatar May 28 '20 18:05 hakavlad

> Is this me doing something wrong or the app?

The app works as intended. It responds to low MemAvailable and SwapFree. As you can see:

тра 28 17:58:39 viktor-desktop nohang-desktop[1270]: Memory status that requires corrective actions:
тра 28 17:58:39 viktor-desktop nohang-desktop[1270]:   MemAvailable [2 MiB, 0.0 %] <= soft_threshold_min_mem [392 MiB, 5.0 %]
тра 28 17:58:39 viktor-desktop nohang-desktop[1270]:   SwapFree [788 MiB, 9.8 %] <= soft_threshold_min_swap [806 MiB, 10.0 %]
тра 28 17:58:39 viktor-desktop nohang-desktop[1270]: Found 26 tasks with non-zero oom_score (except init and self) in 4957ms

My fault is that the documentation is terrible and I did not explain what "basic usage" is.

> After waiting for 10 minutes I performed a power cycle

I think that the problem would be resolved if you waited 30 minutes. The memory will be freed after filling the swap and killing the victim.

The situation can be improved by reducing the size of the swap space. A small swap space fills up sooner, so the corrective action happens sooner.

Using PSI metrics allows you to detect freezing faster and perform the corrective action in less than one minute.

hakavlad avatar May 30 '20 19:05 hakavlad
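The threshold check visible in the journal excerpt above can be approximated as follows. This is a sketch under stated assumptions, not nohang's actual code: it assumes the percentages are relative to MemTotal and SwapTotal, and that both conditions must hold at once (the log lists both). `meminfo_kib` and `below_soft_threshold` are hypothetical names.

```shell
#!/bin/sh
# Print the value (KiB) of one field of a /proc/meminfo-style file.
# $1: field name such as MemAvailable; $2: file (defaults to /proc/meminfo).
meminfo_kib() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# Succeed when MemAvailable and SwapFree are both at or below their minimums.
# $1: min MemAvailable in percent; $2: min SwapFree in percent; $3: file.
# Assumption: percentages are taken of MemTotal and SwapTotal respectively.
below_soft_threshold() {
    file=${3:-/proc/meminfo}
    mem_total=$(meminfo_kib MemTotal "$file")
    mem_avail=$(meminfo_kib MemAvailable "$file")
    swap_total=$(meminfo_kib SwapTotal "$file")
    swap_free=$(meminfo_kib SwapFree "$file")
    # Integer form of avail/total <= pct/100: avail*100 <= pct*total.
    [ $((mem_avail * 100)) -le $(($1 * mem_total)) ] &&
        [ $((swap_free * 100)) -le $(($2 * swap_total)) ]
}
```

Under this reading, `below_soft_threshold 5 10` stays false until swap is nearly full, which matches the explanation that a large swap space delays the corrective action. With no swap configured, SwapTotal is 0, so only the memory condition matters.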

Thank you for the detailed replies. I'm looking for a good way to install a kernel with PSI support. Previously I used UKUU, but the app recently became paid-only, so I can't install it from the usual PPA. I want to find a repeatable way to install a kernel. A CLI way may be even better.

I plan to post the update here after I have some progress. This may take some time depending on how much time I can dedicate to this issue.

lvitya avatar May 30 '20 19:05 lvitya

> I think that the problem would be resolved if you waited 30 minutes. The memory will be freed after filling the swap and killing the victim.

Actually, I thought one of the features of earlyoom, nohang and the like was avoiding such a long wait. I was not aware that they require PSI support to act fast.

lvitya avatar May 30 '20 19:05 lvitya

> I'm looking for a good way to install a kernel with PSI support.

See https://itsfoss.com/ubuntu-hwe-kernel/

Maybe you could run sudo apt install --install-recommends linux-generic-hwe-18.04 xserver-xorg-hwe-18.04 to install 5.3 with PSI support.

Hardware Enablement Stacks (HWE) are incorporated into installers for select Ubuntu LTS (Long Term Support) point releases. It is a special Ubuntu feature that provides an LTS release with hardware support introduced in newer Ubuntu releases.

https://packages.ubuntu.com/en/bionic/linux-generic-hwe-18.04

hakavlad avatar May 30 '20 19:05 hakavlad
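After installing the HWE kernel and rebooting, the running kernel can be checked against the 4.20 minimum mentioned earlier. A minimal sketch, assuming `sort -V` (version sort, available in GNU coreutils) is present; `kernel_at_least` is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed when a kernel release is at least a given minimum version.
# $1: release such as the output of `uname -r`; $2: minimum such as "4.20".
kernel_at_least() {
    have=${1%%-*}   # drop the distro suffix, keeping e.g. "4.16.3"
    # The smaller of the two versions sorts first; it must be the minimum.
    lowest=$(printf '%s\n%s\n' "$2" "$have" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}
```

A version check alone is not conclusive, because the kernel must also be built with PSI and have it enabled. Combining both checks is safer: `kernel_at_least "$(uname -r)" 4.20 && [ -r /proc/pressure/memory ] && echo "PSI available"`.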

> Actually I thought one of the features of earlyoom, nohang and alike is avoiding waiting such a long time.

Without swap space you can avoid waiting. nohang without PSI works like earlyoom: it responds to MemAvailable and SwapFree. If you want to avoid prolonged freezing with swap space enabled, you need PSI support.

hakavlad avatar May 30 '20 20:05 hakavlad

> Maybe you could run sudo apt install --install-recommends linux-generic-hwe-18.04 xserver-xorg-hwe-18.04 to install 5.3 with PSI support.

I have installed Linux kernel 5.3.0 as suggested. Now the nohang tests no longer freeze Ubuntu. Thank you.

One more question: why isn't the memory freed after the hog process is killed? See, for example, this log after the tail test: https://gist.github.com/lvitya/7241dda5b3f7723d84cfec41c55779f6

Memory status after implementing a corrective action: MemAvailable: 164.9 MiB, SwapFree: 1467.9 MiB

lvitya avatar Jun 03 '20 17:06 lvitya

> One more question: why isn't the memory freed after the hog process is killed?

Processes do not free memory immediately after receiving a signal, even if it is SIGKILL. Freeing up memory can take up to several seconds. See also https://github.com/rfjakob/earlyoom/issues/128#issuecomment-507019219

hakavlad avatar Jun 13 '20 08:06 hakavlad
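Because the memory comes back with a delay, a reading taken right after the signal can look as if nothing was freed. One way to observe the recovery is to poll MemAvailable for a few seconds. This is only a sketch; `wait_for_mem_available` is a hypothetical helper name and the one-second polling interval is an arbitrary choice.

```shell
#!/bin/sh
# Poll MemAvailable once per second until it reaches a target or time runs out.
# $1: required MemAvailable in KiB; $2: timeout in seconds;
# $3: file to read (defaults to /proc/meminfo).
# Returns 0 once the target is reached, 1 on timeout.
wait_for_mem_available() {
    need=$1
    remaining=$2
    file=${3:-/proc/meminfo}
    while :; do
        avail=$(awk '$1 == "MemAvailable:" { print $2 }' "$file")
        [ "$avail" -ge "$need" ] && return 0
        [ "$remaining" -le 0 ] && return 1
        sleep 1
        remaining=$((remaining - 1))
    done
}
```

For example, `wait_for_mem_available 1048576 10` waits up to ten seconds for at least 1 GiB to become available again.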

Do you still run Ubuntu 18.04? Or did you meanwhile upgrade to 20.04 or 22.04?

alexmyczko avatar Mar 31 '23 15:03 alexmyczko

I upgraded to 20.04.

lvitya avatar Mar 31 '23 15:03 lvitya

@lvitya what is holding you back from going to 22.04?

alexmyczko avatar Mar 31 '23 15:03 alexmyczko

The transition introduces a bunch of problems. The conclusion is based on my previous experience. Too many things change at once. I still miss some Unity features. Something got broken. For example, manually added repositories don't work anymore. I should be ready to spend an uncertain amount of time fixing my user and dev environments.

lvitya avatar Mar 31 '23 16:03 lvitya

I agree something fishy is going on with Ubuntu. Here are my solutions: https://github.com/alexmyczko/autoexec.bat/tree/master/config.sys

alexmyczko avatar Mar 31 '23 16:03 alexmyczko