/tmp high usage triggering nohang

Open dim-geo opened this issue 4 years ago • 7 comments

Hello,

My /tmp is mounted in RAM (tmpfs on /tmp type tmpfs (rw,nosuid,nodev)), and I also use zram. A process filled almost all of /tmp, and zram usage was very high. nohang was triggered and killed processes, but of course this didn't cure the problem, because the process causing it was not a memory hog. After I emptied /tmp, nohang went quiet.

A nice improvement would be to check the free space of tmpfs filesystems and send the user a notification when they are filling up RAM.

nohang could identify the process filling the tmpfs and kill it, but if that's too difficult, a notification alone would be good.

dim-geo avatar May 10 '20 09:05 dim-geo

A process has filled almost all /tmp

What about limiting tmpfs size?

tmpfs /tmpfs tmpfs nodev,nosuid,noexec,noatime,size=4G 0 0

I think that normally tmpfs size should be limited.

hakavlad avatar May 10 '20 10:05 hakavlad

Hello,

BTW, nohang in pre-release versions tried to clean tmpfs: https://github.com/hakavlad/nohang-extra/blob/master/nohang_old#L49

hakavlad avatar May 10 '20 10:05 hakavlad

Hi, by default it's limited:

 Mount options
       The tmpfs filesystem supports the following mount options:

       size=bytes
              Specify an upper limit on the size of the filesystem.  The
              size is given in bytes, and rounded up to entire pages.

              The size may have a k, m, or g suffix for Ki, Mi, Gi (binary
              kilo (kibi), binary mega (mebi) and binary giga (gibi)).

              The size may also have a % suffix to limit this instance to a
              percentage of physical RAM.

              The default, when neither size nor nr_blocks is specified, is
              size=50%.

dim-geo avatar May 10 '20 10:05 dim-geo

We can easily detect tmpfs overflow via Shmem in /proc/meminfo:

              Shmem %lu (since Linux 2.6.32)
                     Amount of memory consumed in tmpfs(5) filesystems.

http://man7.org/linux/man-pages/man5/proc.5.html
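A minimal sketch of reading that field, not nohang's actual code. The function names parse_meminfo, shmem_percent and read_meminfo are hypothetical and chosen only for illustration. Note that Shmem also counts other shared memory, not only tmpfs files.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Most values are in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def shmem_percent(fields):
    """Return Shmem as a percentage of MemTotal."""
    return 100.0 * fields["Shmem"] / fields["MemTotal"]


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live file (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

On a Linux machine, shmem_percent(read_meminfo()) gives the current value.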

A tmpfs can be filled with small files written by many processes at different times.

What process do you propose to kill?

What to notify the user about?

hakavlad avatar May 10 '20 14:05 hakavlad

The first, easier feature (notification only):

When a nohang warning level is triggered, also check tmpfs usage (something like df | grep tmpfs). If any tmpfs is more than 50% full and Shmem usage is above a certain (configurable) percentage of RAM, notify the user that something is filling the tmpfs filesystems.

Simpler approach: ignore df | grep tmpfs and use only the shmem percentage.

Based on that notification, the user can take corrective action or ignore the alarm. Either way, they will either realize that I/O is failing because the filesystem is full, or nohang will start killing processes and at some point reach the offending one.
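The df | grep tmpfs check could be sketched in Python roughly as follows. This is an illustration under assumptions, not nohang's implementation: the helper names are hypothetical, and it reads /proc/mounts and calls os.statvfs instead of running df.

```python
import os


def tmpfs_mountpoints(mounts_text):
    """Return mount points whose filesystem type is tmpfs, from /proc/mounts text."""
    points = []
    for line in mounts_text.splitlines():
        parts = line.split()
        # Fields: device, mount point, fs type, options, dump, pass
        if len(parts) >= 3 and parts[2] == "tmpfs":
            # /proc/mounts writes spaces in paths as the escape \040
            points.append(parts[1].replace("\\040", " "))
    return points


def usage_percent(total_blocks, free_blocks):
    """Used share of a filesystem in percent; 0.0 if it reports no blocks."""
    if total_blocks <= 0:
        return 0.0
    return 100.0 * (total_blocks - free_blocks) / total_blocks


def tmpfs_usage(mounts_path="/proc/mounts"):
    """Map each accessible tmpfs mount point to its used percentage."""
    with open(mounts_path) as f:
        points = tmpfs_mountpoints(f.read())
    usage = {}
    for point in points:
        try:
            st = os.statvfs(point)
        except OSError:
            continue  # mount point not accessible to this user
        usage[point] = usage_percent(st.f_blocks, st.f_bfree)
    return usage
```

A caller could then warn when any value returned by tmpfs_usage() exceeds 50.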

The second, more advanced feature:

badness could take open tmpfs files into account:

0) Find all mount points of type tmpfs.
1) Take the lsof output, grep for those mount points (lsof also reports file sizes), and add the sum of the file sizes to the RAM used by each process.
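A rough sketch of that idea, with one swapped-in technique: instead of parsing lsof output, it walks /proc/<pid>/fd and matches files by device number. All function names are hypothetical, and st_size overstates sparse files (st_blocks * 512 would count allocated space instead).

```python
import os
import stat


def tmpfs_devices(mountpoints):
    """Return the set of device IDs (st_dev) of the given mount points."""
    devs = set()
    for point in mountpoints:
        try:
            devs.add(os.stat(point).st_dev)
        except OSError:
            pass  # mount point vanished or is not accessible
    return devs


def tmpfs_bytes_by_pid(tmpfs_devs, proc="/proc"):
    """Sum the sizes of open regular files on the given devices, per PID."""
    totals = {}
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # only numeric entries are processes
        fd_dir = os.path.join(proc, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited or permission denied
        seen = set()
        total = 0
        for fd in fds:
            try:
                st = os.stat(os.path.join(fd_dir, fd))  # follows the fd link
            except OSError:
                continue
            key = (st.st_dev, st.st_ino)
            if st.st_dev in tmpfs_devs and stat.S_ISREG(st.st_mode) and key not in seen:
                seen.add(key)  # count a file once even if opened several times
                total += st.st_size
        if total:
            totals[int(entry)] = total
    return totals
```

This only attributes files that a process still holds open. As noted earlier in the thread, a tmpfs can be filled by many processes at different times, and files that were written and then closed are not attributed to anyone by this approach.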

dim-geo avatar May 10 '20 15:05 dim-geo

if you see percentage above 50% and Shmem usage above a certain (configurable) percentage of ram then notify user that something is filling tmp filesystems.

use only the shmem percentage

It's easy to implement.

Low memory

Save your unsaved data! Close unused apps! Free up tmpfs! (Shmem: 60%)

Does it look good?

Maybe

Low memory

Save your unsaved data! Free up tmpfs! (Shmem: 60%)

hakavlad avatar May 10 '20 15:05 hakavlad

https://github.com/hakavlad/nohang/commit/bf431cf9b8060ef9aa4791bf668a7e175b486476

It's not configurable for now. It shows the Shmem percentage if Shmem > 30% of MemTotal.
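The rule described here could look roughly like the sketch below. It is not the code from the commit: the function name low_memory_body is hypothetical, and the message wording is borrowed from the proposal earlier in this thread.

```python
def low_memory_body(shmem_kib, mem_total_kib, threshold_percent=30.0):
    """Build the notification body, mentioning tmpfs only above the threshold."""
    base = "Save your unsaved data! Close unused apps!"
    percent = 100.0 * shmem_kib / mem_total_kib
    if percent > threshold_percent:
        # Assumed wording, taken from the proposal above
        return f"{base} Free up tmpfs! (Shmem: {percent:.0f}%)"
    return base
```

The threshold_percent parameter is an assumption standing in for the hard-coded 30% mentioned above; exposing it would be one way to make the limit configurable later.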

hakavlad avatar May 10 '20 18:05 hakavlad