High memory usage if `fs.nr_open` is very high and no `ulimit` set on Linux systems
While debugging https://github.com/kubernetes-sigs/kind/issues/2175, I tried to understand why `uwsgi` wasn't running well on a Kind cluster on Fedora 33. I came to the conclusion that it is caused by a very high value for `fs.nr_open`, which defaults to 1073741816 on Fedora 33 but only 1048576 on Ubuntu 20.10. On my machine, the very high limit causes the uWSGI `--http` process in a pod to consume more than 8Gi of memory, and if memory limits are set, the process gets OOM-killed by the kernel (please see the issue above for a test repo and logs).

The issue doesn't manifest when running `uwsgi` outside a container/pod, because of the per-user limit of 1024 set with `ulimit`. Also, the `containerd.service` unit seems to set a value of 1048576 for `fs.nr_open` by default, which helps avoid the issue when the container with `uwsgi` is run via `docker run`.
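For reference, the limits involved can be compared with a few one-liners. This is only a sketch; the pod name `example-pod` is the one from the logs below, and the `kubectl exec` line assumes the image ships a POSIX shell:

```sh
# Kernel-wide ceiling for file descriptors (1073741816 on Fedora 33, 1048576 on Ubuntu 20.10)
sysctl fs.nr_open

# Per-process soft/hard limits in the current shell (Ubuntu's per-user soft default is 1024)
ulimit -Sn
ulimit -Hn

# Limit as seen inside the pod
kubectl exec example-pod -- sh -c 'ulimit -n'
```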
Pod logs (high limit set deliberately via `sysctl -w fs.nr_open=1073741816`):
red@noctis:~/Development/kube-stuff$ kubectl logs example-pod
*** Starting uWSGI 2.0.19.1 (64bit) on [Fri Apr 2 18:05:18 2021] ***
compiled with version: 8.3.0 on 02 April 2021 17:51:03
os: Linux-5.8.0-48-generic #54-Ubuntu SMP Fri Mar 19 14:25:20 UTC 2021
nodename: example-pod
machine: x86_64
clock source: unix
pcre jit disabled
detected number of CPU cores: 8
current working directory: /
detected binary path: /usr/local/bin/uwsgi
uWSGI running as root, you can use --uid/--gid/--chroot options
*** WARNING: you are running uWSGI as root !!! (use the --uid flag) ***
*** WARNING: you are running uWSGI without its master process manager ***
your memory page size is 4096 bytes
detected max file descriptor number: 1073741816
(continues, and then hangs. The `--http` process is OOM-killed)
Raising the limit on an Ubuntu 20.10 machine to 1073741816 and trying again, without a container:

(Note that I had to use both `sysctl -w` and `ulimit -n` to raise the limits; it seems Ubuntu has a per-user limit set to 1024.)
noctis# ulimit -n 1073741816
noctis# ulimit -n
1073741816
noctis# source venv/bin/activate
(venv) noctis# ./docker-entrypoint.sh
*** Starting uWSGI 2.0.19.1 (64bit) on [Fri Apr 2 19:17:18 2021] ***
compiled with version: 10.2.0 on 02 April 2021 18:03:34
os: Linux-5.8.0-48-generic #54-Ubuntu SMP Fri Mar 19 14:25:20 UTC 2021
nodename: noctis
machine: x86_64
clock source: unix
pcre jit disabled
detected number of CPU cores: 8
current working directory: /home/red/Development/kube-stuff
detected binary path: /home/red/Development/kube-stuff/venv/bin/uwsgi
uWSGI running as root, you can use --uid/--gid/--chroot options
*** WARNING: you are running uWSGI as root !!! (use the --uid flag) ***
*** WARNING: you are running uWSGI without its master process manager ***
your processes number limit is 62286
your memory page size is 4096 bytes
detected max file descriptor number: 1073741816
(doesn't hang, but consumes 8193M of memory)
Lowering the value of `fs.nr_open` to 1048576 makes things work well on the pod. However, I wonder why the `uwsgi` process consumes so much memory when this limit is high.
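For reference, lowering the value can be done roughly like this (a sketch; the drop-in file name under `/etc/sysctl.d/` is an arbitrary choice):

```sh
# One-off change, lost on reboot (what was used for the test above)
sudo sysctl -w fs.nr_open=1048576

# Persist across reboots
echo 'fs.nr_open = 1048576' | sudo tee /etc/sysctl.d/99-nr-open.conf
sudo sysctl --system
```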
Running the same `uwsgi` app without changing any limits, and without containers:
red@noctis:~/Development/kube-stuff$ sysctl fs.nr_open
fs.nr_open = 1048576
red@noctis:~/Development/kube-stuff$ ulimit -n
1024
red@noctis:~/Development/kube-stuff$ source venv/bin/activate
(venv) red@noctis:~/Development/kube-stuff$ ./docker-entrypoint.sh
*** Starting uWSGI 2.0.19.1 (64bit) on [Fri Apr 2 19:48:49 2021] ***
compiled with version: 10.2.0 on 02 April 2021 18:03:34
os: Linux-5.8.0-48-generic #54-Ubuntu SMP Fri Mar 19 14:25:20 UTC 2021
nodename: noctis
machine: x86_64
clock source: unix
pcre jit disabled
detected number of CPU cores: 8
current working directory: /home/red/Development/kube-stuff
detected binary path: /home/red/Development/kube-stuff/venv/bin/uwsgi
*** WARNING: you are running uWSGI without its master process manager ***
your processes number limit is 62286
your memory page size is 4096 bytes
detected max file descriptor number: 1024
Memory usage is normal in this case.
Finally, I notice that if I run with `--http-socket` instead of `--http`, memory usage is what I would consider "normal" (a few hundred MiB at most), but these options are not equivalent according to the documentation.
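For comparison, the two invocations look roughly like this (a minimal sketch assuming a simple WSGI app in `app.py` on port 8080, not the actual entrypoint from the test repo):

```sh
# Spawns a separate HTTP router process, which is the one ballooning in memory here
uwsgi --http :8080 --wsgi-file app.py

# Workers speak HTTP directly, no separate router process; memory usage stays low
uwsgi --http-socket :8080 --wsgi-file app.py
```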
Does passing a lower count with `--max-fd` have the same effect?
@xrmx I hadn't tried that; I wasn't aware of that option. I tested it with `--max-fd 1024` and it no longer consumes a huge amount of memory, even when running as non-root (despite https://uwsgi-docs.readthedocs.io/en/latest/Options.html#max-fd saying it requires root privileges).

Probably the 1024 limit is a bit too low, but it works.
Yeah, the root problem is that some data structures are as big as the number of fds available, hence the ill effect you have seen.
I see, thanks for the explanation. Assuming the data structures cannot be changed, I wonder if a default limit of something like 1048576 would be sane. However, at the same time, setting such a default could break some use cases.
Just chiming in to share here since it was a result on the first page of a search query. Should help with visibility :+1:
This will likely be due to a config on your system for the container runtime (`dockerd.service`, `containerd.service`, etc.) that sets `LimitNOFILE=infinity`.

Typically `infinity` will be approx 2^30 (over 1 billion) in size, while some distros like Debian (and Ubuntu, deriving from it) have a lower 2^20 limit (1k times less), which is the default `sysctl fs.nr_open` value.

This was due to the systemd v240 (2018Q4) release that would raise `fs.nr_open` and `fs.file-max` to the highest possible value, with `fs.nr_open` being used as `infinity` IIRC.
- On some distros like Fedora, at least outside of containers this was a non-issue, as `infinity` was not used (systemd 240 kept the soft limit of 1024 but raised the hard limit to 512k, vs the kernel's default of 4096).
- On others like Debian, `pam_limits.so` has been carrying a patch for something like two decades that set `infinity` as the hard limit IIRC (or it just took whatever the hard limit was on PID 1, which would be `fs.nr_open`, same outcome AFAIK). That caused the v240 release to not play well, due to 2^20 being raised to 2^30, so they build systemd without the `fs.nr_open` bump (instead of fixing the patch for `pam_limits.so` :man_shrugging:).
Anyway... for container runtimes with systemd, they'd configure `LimitNOFILE` and have bounced between 2^20 (1048576) and 2^30 (`infinity`) a few times, with `infinity` being present since 2018-2021 depending on what you installed (and when your distro got the update). That is what raised the limits in the container, which may not appear to be the same on your host.
Often you can configure the `ulimit` per container (e.g. `docker run` has `--ulimit`; Compose and k8s have similar ulimit config settings). Or you can set `LimitNOFILE` for the systemd service config to a sane value... or, if you're lucky, like in this case, the affected software has an option to impose a limit.
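For example, the first two approaches look roughly like this (a sketch; the image name and the drop-in file name are placeholders):

```sh
# Per-container override with the Docker CLI (soft:hard)
docker run --ulimit nofile=1048576:1048576 my-uwsgi-image

# Or a drop-in for the runtime's systemd unit
sudo mkdir -p /etc/systemd/system/containerd.service.d
printf '[Service]\nLimitNOFILE=1048576\n' | \
  sudo tee /etc/systemd/system/containerd.service.d/limit-nofile.conf
sudo systemctl daemon-reload
sudo systemctl restart containerd
```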
Just to clarify, this typically only affects the soft limit value, although some software internally raises the soft limit to the hard limit (which is perfectly acceptable... it's just that 2^30 is not a sane hard limit; 2^19 is often plenty and many can get away with 2^16 just fine).
As for the memory usage: from what I've read about other affected software (Java), an array is allocated sized to the soft limit, at 8 bytes per element, so 2^30 uses approx 8.6GB of memory. The more sane 2^20 hard limit you'd see on Debian would only use 8.4MB in comparison, and the default soft limit of 1024 would need only 8.2KB.
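The arithmetic, for anyone who wants to check it (assuming 8 bytes per entry in a table sized to the soft limit):

```sh
echo $(( (1 << 30) * 8 ))   # 8589934592 bytes, approx 8.6 GB at 2^30 fds ("infinity")
echo $(( (1 << 20) * 8 ))   # 8388608 bytes, approx 8.4 MB at 2^20 fds
echo $(( 1024 * 8 ))        # 8192 bytes, approx 8.2 KB at the classic 1024 soft limit
```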
For `dockerd` and `containerd`, this problem is likely to be resolved this year, as there is a fair amount of discussion going on about not using `infinity`.
Just adding some more visibility here. This still bit me with Fedora 39's docker. For people running into this issue:

- If not essential (exposing uWSGI to the world), not using the `--http` option removes the issue.
- As mentioned above, lowering the ulimit works, either in systemd or (for a faster fix) in your bootstrap script; `ulimit -n 1048576` should do the trick (see the sketch below)!
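A minimal sketch of the bootstrap-script variant (the script name matches the `docker-entrypoint.sh` from the logs above, but the uWSGI flags, port, and app file are placeholders):

```sh
#!/bin/sh
# docker-entrypoint.sh: clamp the soft fd limit before exec'ing uWSGI,
# so its fd-sized data structures stay small even if the runtime hands us "infinity".
ulimit -n 1048576
exec uwsgi --http :8080 --wsgi-file app.py
```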
https://access.redhat.com/solutions/1479623
https://github.com/systemd/systemd/commit/a8b627aaed409a15260c25988970c795bf963812
I already described the cause above with:
> Typically `infinity` will be approx 2^30 (over 1 billion) in size, while some distros like Debian (and Ubuntu, deriving from it) have a lower 2^20 limit (1k times less), which is the default `sysctl fs.nr_open` value.
>
> This was due to the systemd v240 (2018Q4) release that would raise `fs.nr_open` and `fs.file-max` to the highest possible value, with `fs.nr_open` being used as `infinity` IIRC.
Is the intent of your links to provide additional reference / context?