weave
Memory leak on container start/restart
What you expected to happen?
That memory would be released when a container stops or restarts.
What happened?
Every time a container restarts or is stopped or killed, some memory associated with it remains allocated in the Linux kernel slab. On our host some containers are restarted frequently, and after a few days unreclaimable slab memory (SUnreclaim) grows to the point where the system becomes unresponsive and starts swapping or crashes.
The problem only affects containers attached to the weave network; restarting other containers does not increase this usage.
How to reproduce it?
The Unreclaimable slab memory usage can be seen with:
awk '/^SUnreclaim:/ {print $2/1024 " MB"}' /proc/meminfo
After almost every restart of a container attached to the weave network, this value increases.
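The reproduction above can be scripted so the growth is visible restart by restart. This is only a minimal sketch: the function names `sunreclaim_mb` and `restart_and_measure`, the 5-second settle delay, and the container name in the usage note are assumptions for illustration, not part of the original report.

```sh
#!/bin/sh
# Print unreclaimable slab memory in MB.
# Reads /proc/meminfo by default; an alternative file can be passed for testing.
sunreclaim_mb() {
    awk '/^SUnreclaim:/ {print $2/1024 " MB"}' "${1:-/proc/meminfo}"
}

# Restart a container repeatedly and print SUnreclaim after each restart.
# $1 = container name (hypothetical; must be attached to the weave network)
# $2 = number of restarts (defaults to 10)
restart_and_measure() {
    container=$1
    count=${2:-10}
    i=1
    echo "baseline: $(sunreclaim_mb)"
    while [ "$i" -le "$count" ]; do
        docker restart "$container" >/dev/null
        sleep 5   # assumed settle time before sampling
        echo "restart $i: $(sunreclaim_mb)"
        i=$((i + 1))
    done
}
```

For example, `restart_and_measure my-weave-container 20` would print a baseline followed by 20 samples. Running the same loop against a container that is not on the weave network gives a comparison series.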
Versions:
$ weave version
weave script 2.6.5
weave 2.6.5
$ docker version
Client: Docker Engine - Community
Version: 19.03.12
API version: 1.40
Go version: go1.13.10
Git commit: 48a66213fe
Built: Mon Jun 22 15:45:52 2020
OS/Arch: linux/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 19.03.12
API version: 1.40 (minimum version 1.12)
Go version: go1.13.10
Git commit: 48a66213fe
Built: Mon Jun 22 15:44:23 2020
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.2.13
GitCommit: 7ad184331fa3e55e52b890ea95e65ba581ae3429
runc:
Version: 1.0.0-rc10
GitCommit: dc9208a3303feef5b3839f4323d9beb36df0a9dd
docker-init:
Version: 0.18.0
GitCommit: fec3683
$ uname -a
Linux transcoder03 4.9.0-12-amd64 #1 SMP Debian 4.9.210-1+deb9u1 (2020-06-07) x86_64 GNU/Linux
Logs:
Nothing special in there. https://gist.github.com/ProjectIcarusX/93de29940192327c019c287e6b1d7664
Thanks for the report.
Could you post the contents of /proc/slabinfo over a few hours as the problem gets worse?
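One possible way to make such snapshots easier to compare is to diff two saved copies of /proc/slabinfo and list only the caches that grew. This is a sketch under assumptions: the function name `slab_growth` and the snapshot file names are hypothetical, it assumes the slabinfo 2.1 column layout (name, active_objs, num_objs, objsize, ...), and reading /proc/slabinfo normally requires root.

```sh
#!/bin/sh
# Compare two /proc/slabinfo snapshots and print caches whose total size grew.
# $1 = earlier snapshot, $2 = later snapshot
# Output: "<cache name> <growth in KiB>", largest growth first.
slab_growth() {
    awk '
        /^slabinfo/ || /^#/ { next }              # skip the two header lines
        NR == FNR { before[$1] = $3 * $4; next }  # first file: num_objs * objsize
        {
            delta = $3 * $4 - before[$1]          # caches absent earlier count from 0
            if (delta > 0) printf "%s %d\n", $1, delta / 1024
        }
    ' "$1" "$2" | sort -k2,2nr
}
```

Snapshots could be taken with `cat /proc/slabinfo > slab-$(date +%s).txt` as root at intervals, then compared with `slab_growth slab-earlier.txt slab-later.txt`.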
Please also post the full logs of the Weave container, or at least the first thousand lines. There may be hints there that something is going wrong even if you don't see it.
Hi, sorry for the delay. I rewrote part of our software to reduce restarts, so it took some time for RAM usage to rise to a noticeable level.
Here is the additional information:
/proc/slabinfo: https://gist.github.com/ProjectIcarusX/d7f9a73e69d63b40fb43db1e1671fd54
slabtop -s c -o c: https://gist.github.com/ProjectIcarusX/ded1ea120faa8c9586dcfc4eb8438fee
Full weave log for 3 weeks: https://transfer.sh/xXlRs/weave.log
Any update on this issue? It is still present in the latest version. I can provide a simple PoC for reproduction if necessary.