
Memory leak on container start/restart

Open ProjectIcarusX opened this issue 5 years ago • 3 comments

What you expected to happen?

That memory will be released on stop/restart

What happened?

Every time a container restarts or is stopped/killed, some memory associated with it remains in the Linux slab. On our host some containers are restarted frequently, and after a few days unreclaimable slab memory grows to the point that the system becomes unresponsive, starts swapping, or crashes.

The problem occurs only with containers attached to the weave network; restarting other containers does not affect this usage.

How to reproduce it?

The Unreclaimable slab memory usage can be seen with:

cat /proc/meminfo | grep SUnreclaim | awk '{print $2/1024 " MB"}'

This usage increases after almost every restart of a container that is attached to the weave network.
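A minimal sketch of how the growth could be measured across repeated restarts, built on the same `/proc/meminfo` check. The function names `sunreclaim_mb` and `measure_restarts` are made up for illustration, and the container name passed in is a placeholder for any container attached to the weave network.

```shell
#!/bin/sh
# Print SUnreclaim in MB from a meminfo-format file (defaults to /proc/meminfo).
sunreclaim_mb() {
    awk '/^SUnreclaim:/ {print $2/1024 " MB"}' "${1:-/proc/meminfo}"
}

# Restart a container several times and print SUnreclaim after each restart.
# $1 = container name (placeholder), $2 = number of restarts (default 10).
measure_restarts() {
    container=$1
    count=${2:-10}
    echo "before: $(sunreclaim_mb)"
    i=1
    while [ "$i" -le "$count" ]; do
        docker restart "$container" >/dev/null
        echo "after restart $i: $(sunreclaim_mb)"
        i=$((i + 1))
    done
}
```

For example, `measure_restarts my-weave-container 20` would print one reading before the loop and one after each of 20 restarts. Running the same loop against a container outside the weave network gives a baseline for comparison.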

Versions:

$ weave version
weave script 2.6.5
weave 2.6.5

$ docker version
Client: Docker Engine - Community
 Version:           19.03.12
 API version:       1.40
 Go version:        go1.13.10
 Git commit:        48a66213fe
 Built:             Mon Jun 22 15:45:52 2020
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.12
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.13.10
  Git commit:       48a66213fe
  Built:            Mon Jun 22 15:44:23 2020
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.2.13
  GitCommit:        7ad184331fa3e55e52b890ea95e65ba581ae3429
 runc:
  Version:          1.0.0-rc10
  GitCommit:        dc9208a3303feef5b3839f4323d9beb36df0a9dd
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683

$ uname -a
Linux transcoder03 4.9.0-12-amd64 #1 SMP Debian 4.9.210-1+deb9u1 (2020-06-07) x86_64 GNU/Linux

Logs:

Nothing special in there. https://gist.github.com/ProjectIcarusX/93de29940192327c019c287e6b1d7664

ProjectIcarusX, Jun 30 '20 09:06

Thanks for the report.

Could you post the contents of /proc/slabinfo over a few hours as the problem gets worse?
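One possible way to collect this, sketched under assumptions: the function names, the snapshot directory, and the one-hour interval are illustrative only, and reading `/proc/slabinfo` normally requires root. `top_slabs` estimates each cache's size as `num_objs * objsize`, which is an approximation that ignores per-slab overhead.

```shell
#!/bin/sh
# List the largest slab caches from a slabinfo-format file.
# $1 = file (defaults to /proc/slabinfo), $2 = how many caches to show (default 10).
# Size is approximated as num_objs ($3) * objsize ($4), reported in KB.
top_slabs() {
    awk '!/^(slabinfo|#)/ { printf "%d KB\t%s\n", $3 * $4 / 1024, $1 }' \
        "${1:-/proc/slabinfo}" | sort -rn | head -n "${2:-10}"
}

# Save a timestamped copy of /proc/slabinfo into a directory (hypothetical path).
snapshot_slabinfo() {
    dir=${1:-/var/tmp/slabinfo-snapshots}
    mkdir -p "$dir"
    cat /proc/slabinfo > "$dir/slabinfo-$(date +%Y%m%dT%H%M%S)"
}
```

Calling `snapshot_slabinfo` once an hour (for example from cron) and running `top_slabs` on the first and last snapshot should show which caches are growing.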

Please also post the full logs of the Weave container, at least the first thousand lines. There can be some hints there that something is going wrong even if you don't see it.

bboreham, Jul 06 '20 10:07

Hi, sorry for the delay. I rewrote part of our software to reduce the number of restarts, so it took some time for RAM usage to rise to a noticeable level.

Here is the additional information.

/proc/slabinfo: https://gist.github.com/ProjectIcarusX/d7f9a73e69d63b40fb43db1e1671fd54

slabtop -s c -o c: https://gist.github.com/ProjectIcarusX/ded1ea120faa8c9586dcfc4eb8438fee

Full weave log for 3 weeks: https://transfer.sh/xXlRs/weave.log

ProjectIcarusX, Jul 14 '20 20:07

Any update on this issue? It is still present in the latest version. I can provide a simple PoC to reproduce it if necessary.

ProjectIcarusX, Nov 03 '20 06:11