Microk8s constantly spawning new processes when idle
I have a VPS server running microk8s (previously 1.19/stable and now upgraded to 1.20/stable while troubleshooting). My server's load is constantly high (5-10), with kube-apiserver appearing as the top CPU consumer.
Digging a bit more with execsnoop-bpfcc, I found lots of new processes constantly being spawned by microk8s, in some seconds up to 50 new processes. The server is basically idle. That can't be right, can it?
Quick and dirty stats, counting spawned processes per second with grep microk8s | cut -d' ' -f1 | uniq -c on the output of execsnoop-bpfcc (full pipeline sketched after the numbers):
14 20:57:02
3 20:57:03
48 20:57:04
7 20:57:05
9 20:57:06
3 20:57:07
13 20:57:09
1 20:57:10
1 20:57:11
13 20:57:12
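For reference, the full pipeline was roughly the following (a sketch; the HH:MM:SS column assumes execsnoop-bpfcc was run with its -T/--time option):
# count processes spawned by microk8s, grouped by second
sudo execsnoop-bpfcc -T | grep microk8s | cut -d' ' -f1 | uniq -c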
But judging by the PPIDs, some processes excluded by this grep look like they were spawned by microk8s too.
One of the processes which I find a lot is the following:
20:57:07 runc 151295 27204 0 /snap/microk8s/1910/bin/runc --root /run/containerd/runc/k8s.io --log /var/snap/microk8s/common/run/containerd/io.containerd.runtime.v2.task/k8s.io/$id --log-format json exec --process /var/snap/microk8s/common/run/runc-process430775463 --detach --pid-file /var/snap/microk8s/common/run/containerd/io.containerd.runtime.v2.task/k8s.io/$id $id
This process is spawned at least once per second, always with a new, unique id (a 64 character hex-string).
The logs of containerd always contain the same 4-6 lines repeated, again with changing ids:
Feb 12 21:33:14 $host microk8s.daemon-containerd[16547]: time="2021-02-12T21:33:14.423305512+01:00" level=info msg="Exec process \"$id1\" exits with exit code 0 and error <nil>"
Feb 12 21:33:14 $host microk8s.daemon-containerd[16547]: time="2021-02-12T21:33:14.423353573+01:00" level=info msg="Exec process \"$id2\" exits with exit code 0 and error <nil>"
Feb 12 21:33:14 $host microk8s.daemon-containerd[16547]: time="2021-02-12T21:33:14.423624404+01:00" level=info msg="Finish piping \"stderr\" of container exec \"$id2\""
Feb 12 21:33:14 $host microk8s.daemon-containerd[16547]: time="2021-02-12T21:33:14.423487501+01:00" level=info msg="Finish piping \"stdout\" of container exec \"$id2\""
Feb 12 21:33:14 $host microk8s.daemon-containerd[16547]: time="2021-02-12T21:33:14.433792186+01:00" level=info msg="ExecSync for \"$id3\" returns with exit code 0"
Feb 12 21:33:14 $host microk8s.daemon-containerd[16547]: time="2021-02-12T21:33:14.435643166+01:00" level=info msg="ExecSync for \"$id3\" returns with exit code 0"
Inspection report attached: inspection-report-20210212_211908.tar.gz
I notice that calico does spawn containers every now and then, like those calico-ipam.... ones. I see that when I do htop. They come and go.
@balchua the thing is: I want to use (micro)k8s to run my software, not to constantly max out 1+ CPUs. It seems a bit excessive, but maybe my expectation is wrong and that is normal behavior? Perhaps my setup is botched? I can't really tell.
The docs state:
MicroK8s will install a minimal, lightweight Kubernetes you can run and use on practically any machine.
Hogging CPU does not feel lightweight to me.
Hi @knittl, I agree with you on that, although the Kubernetes control plane usually takes some of the node's resources. May I know your node specs? The smallest spec I've ever used is 2 CPUs x 4 GB for a node that runs the Kubernetes control plane.
You can set up a worker-only node to run your apps, with a separate control plane, to avoid resource contention with your apps.
I just wrote up the steps on how to set up a worker-only node with MicroK8s: https://discuss.kubernetes.io/t/adding-a-worker-node-only-with-microk8s/14801 I hope that helps.
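As a rough sketch, the basic join flow looks like this (placeholder address and token; the linked post covers the extra steps needed to keep the joined node worker-only):
# on the existing node: print a join command containing a one-time token
microk8s add-node
# on the new node: join using the address and token printed above (placeholders here)
microk8s join <control-plane-ip>:25000/<token>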
@balchua I totally understand that k8s will require resources to run :) It just feels a bit much for my use case: a small number of deployments (14 deployments with 18 pods in total) already causes the server to have a load average of 5–10 most of the time. Stopping microk8s makes the load go away.
The server is a VPS with the following specs: an 8-core Intel Xeon CPU at 2.2 GHz, 32 GB RAM (8 GB in use), and plenty of storage (1 TB).
Are you running multiple nodes? I am running my 3-node control plane on 2 CPU x 4 GB machines, plus 1 worker node. Running Prometheus, Linkerd, the dashboard, and my own workload doesn't bring my load average to 5 or 10. Maybe I should try your setup of 14 deployments.
No, it is a single-node setup. No Prometheus, no dashboard. Listing all pods with microk8s kubectl get -A pods outputs the following list:
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system hostpath-provisioner-5c65fbdb4f-lxjkw 1/1 Running 0 40h
kube-system calico-kube-controllers-847c8c99d-hw7sn 1/1 Running 2 41h
kube-system coredns-86f78bb79c-qvsv5 1/1 Running 0 40h
metallb-system speaker-sbsch 1/1 Running 0 40h
metallb-system controller-559b68bfd8-9hz8g 1/1 Running 0 40h
gitlab-managed-apps ingress-nginx-ingress-default-backend-c9b59c85-qczbv 1/1 Running 0 40h
gitlab-managed-apps ingress-nginx-ingress-controller-68bcfdf674-8fsck 1/1 Running 0 40h
default project-iii-596fd4774b-5gbf5 1/1 Running 0 40h
default project-iii-mysql-bc7cb98cb-98kkn 1/1 Running 0 40h
gitlab-managed-apps certmanager-cert-manager-webhook-84545b7b88-whzrl 1/1 Running 0 39h
gitlab-managed-apps certmanager-cainjector-8c559d68f-ksnv5 1/1 Running 0 39h
gitlab-managed-apps certmanager-cert-manager-855454cc95-gfct7 1/1 Running 0 39h
website-30-production production-697965db4d-kmxgs 1/1 Running 0 39h
gitlab-managed-apps runner-gitlab-runner-5b7fcdcc8-jxmjx 1/1 Running 0 38h
backend-25-review-project25-a6pfx1 review-project25-a6pfx1-postgresql-0 1/1 Running 0 38h
backend-25-review-project25-a6pfx1 review-project25-a6pfx1-f874d6d48-49npf 1/1 Running 1 38h
kube-system calico-node-w8tgj 1/1 Running 2 41h
The load average currently is (5, 10, 15 minutes): 3.55, 4.33, 3.69
This is strange. I've not seen a load average this high on good HW specs. A good way to check is to measure the load average on a bare-bones MicroK8s, i.e. with no workloads scheduled or running except whatever comes with MicroK8s by default. Then try adding your workloads one by one while measuring the load average in between (rough sketch below).
Another thing: if using top or htop to look at the load average, IMHO as long as the load average is below the number of cores, it should be fine.
So in your case above, a load average of 3.5 on an 8-core system is acceptable.
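A rough sketch of that baseline measurement (the manifest name is a placeholder):
# fresh, bare-bones install, then watch the baseline load for a while
sudo snap install microk8s --classic
watch -n 5 cat /proc/loadavg
# afterwards, add workloads one at a time and compare
microk8s kubectl apply -f <first-workload>.yaml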
A load average of 3 out of 8 cores is acceptable in theory, yes. But the system feels sluggish when connected via SSH, which it does not when microk8s is not running. And considering the server is not actively doing anything, the load still seems too high (i.e. 3 is too much for doing nothing).
I finally found some time to analyze this a bit further. I restarted all pods this morning so that they are comparable. Afterwards, I set up a cronjob to go over all running pods and read out their CPU usage as reported by the /sys filesystem:
while read -r ns pod; do { /snap/bin/microk8s kubectl exec -n "$ns" "$pod" -- cat /sys/fs/cgroup/cpu/cpuacct.stat; echo "$ns/$pod"; } | paste -sd ' '; done <pods >>cpu
The output format of my file is user $usercpu system $systemcpu $namespace/$podname with the CPU values showing the accumulated CPU of each pod. This allows me to easily filter and sort by CPU usage of pods over time.
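(The pods file is just a list of namespace and pod name per line; something like the following sketch produces it:)
/snap/bin/microk8s kubectl get pods -A --no-headers -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name >pods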
After less than 2 hours of running pods, the kube-system/calico-node-... pod comes out on top with respect to CPU usage, in both user and system numbers:
user 9111 system 7752 kube-system/calico-node-svkvd
user 10320 system 8799 kube-system/calico-node-svkvd
The metrics are taken at 10-minute intervals, with the trend of calico-node as follows (+9500/+8000); a quick way to rank pods from the collected file is sketched after the numbers:
user 811 system 761 kube-system/calico-node-svkvd
user 1896 system 1652 kube-system/calico-node-svkvd
user 3075 system 2596 kube-system/calico-node-svkvd
user 4241 system 3567 kube-system/calico-node-svkvd
user 5519 system 4709 kube-system/calico-node-svkvd
user 6703 system 5714 kube-system/calico-node-svkvd
user 7881 system 6722 kube-system/calico-node-svkvd
user 9111 system 7752 kube-system/calico-node-svkvd
user 10320 system 8799 kube-system/calico-node-svkvd
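To rank pods by their latest accumulated CPU from this file, a quick sketch (assuming the "user N system M namespace/pod" line format described above):
awk '{u[$5]=$2; s[$5]=$4} END {for (p in u) print u[p], s[p], p}' cpu | sort -rn | head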
The next non-calico pod is one of my "real" applications, which currently sits at user=4000 and system=400. It is a Spring Boot application so most of the CPU was spent during startup (after all pods have started, it showed around user=2000).
The trend of the Spring Boot application is not as steep (+1700/+300 in the same period):
user 2557 system 202 my-ns/my-application-7d9f877c47-rs65g
user 2855 system 239 my-ns/my-application-7d9f877c47-rs65g
user 3022 system 275 my-ns/my-application-7d9f877c47-rs65g
user 3242 system 314 my-ns/my-application-7d9f877c47-rs65g
user 3402 system 342 my-ns/my-application-7d9f877c47-rs65g
user 3548 system 376 my-ns/my-application-7d9f877c47-rs65g
user 3770 system 417 my-ns/my-application-7d9f877c47-rs65g
user 3919 system 449 my-ns/my-application-7d9f877c47-rs65g
user 4087 system 482 my-ns/my-application-7d9f877c47-rs65g
user 4282 system 514 my-ns/my-application-7d9f877c47-rs65g
Is this normal/expected? There are currently 22 pods running (4 in the kube-system namespace, 9 gitlab-managed apps (certmanager, ingress, runner), and 9 application pods). It looks like the calico-node pod is wasting quite a lot of CPU. Am I barking up the wrong tree?
probably related: https://github.com/ubuntu/microk8s/issues/1567
I also took a look with execsnoop-bpfcc on a totally fresh installed system (with snap install microk8s --classic); this is what I'm getting from the log after a short amount of time:
# cat /root/pbfcc.out |cut -b 36-|sort|cut -b 1-230,285-| uniq -c|sort -rn|head -n100|sort -rn|head -n50
92 /usr/bin/systemctl show --property=Id,ActiveState,UnitFileState,Type snap.microk8s.daemon-apiserver.service
92 /usr/bin/snapctl services microk8s.daemon-apiserver
92 /snap/microk8s/2074/usr/bin/openssl x509 -noout -issuer
92 /snap/microk8s/2074/usr/bin/cmp -s /var/snap/microk8s/2074/certs/csr.conf.rendered /var/snap/microk8s/2074/certs/csr.conf
92 /snap/microk8s/2074/sbin/ip route
92 /snap/microk8s/2074/sbin/ifconfig cni0
92 /snap/microk8s/2074/bin/sleep 3
92 /snap/microk8s/2074/bin/sed -i s/#MOREIPS/IP.3 = 172.16.31.207\nIP.4 = 10.1.128.192\nIP.5 = 10.1.5.0/g /var/snap/microk8s/2074/certs/csr.conf.rendered
92 /snap/microk8s/2074/bin/hostname -I
92 /snap/microk8s/2074/bin/grep default
92 /snap/microk8s/2074/bin/grep active
92 /snap/microk8s/2074/bin/grep -E (--advertise-address|--bind-address) /var/snap/microk8s/2074/args/kube-apiserver
92 /snap/microk8s/2074/bin/cp /var/snap/microk8s/2074/certs/csr.conf.template /var/snap/microk8s/2074/certs/csr.conf.rendered
87 /proc/self/exe init
85 /proc/self/fd/5 init
57 /snap/microk8s/2074/bin/runc --root /run/containerd/runc/k8s.io --log /var/snap/microk8s/common/run/containerd/io.containerd.runtime.v2.task/k8s.io/f5c2d4e86803a2eab1bb74af4b9b95abf1cf45266f70e0d0a8 --log-format json exec --proces--detach --pid-file /var/snap/microk8s/common/run/containerd/io.containerd.runtime.v2.task/k8s.io/f5c2d4e86803a2eab1bb74af4b9b95abf1cf45266f70e0d0a8 f5c2d4e86803a2eab1bb74af4b9b95abf1cf45266f70e0d0a8deb464d0c55b21
29 /usr/bin/check-status -r
29 /snap/microk8s/2074/bin/runc --root /run/containerd/runc/k8s.io --log /var/snap/microk8s/common/run/containerd/io.containerd.runtime.v2.task/k8s.io/745bfb8a154b1bdd28a7a6364a1c2efab666c82dc8395167e5 --log-format json exec --proces--detach --pid-file /var/snap/microk8s/common/run/containerd/io.containerd.runtime.v2.task/k8s.io/745bfb8a154b1bdd28a7a6364a1c2efab666c82dc8395167e5 745bfb8a154b1bdd28a7a6364a1c2efab666c82dc8395167e5669ed46e85f0c5
29 /bin/calico-node -felix-ready
28 /usr/bin/find /var/log/pods/kube-system_calico-node-mkcbb_90ee0bc4-ed13-4dbd-9c0a-81ad223fe332/calico-node -xdev -printf .
28 /usr/bin/find /var/log/pods/kube-system_calico-kube-controllers-847c8c99d-n9sjp_63a5776b-c6b4-4b6d-90bb-2e2c607f0185/calico-kube-controllers -xdev -printf .
28 /snap/microk8s/2074/usr/bin/nice -n 19 du -x -s -B 1 /var/log/pods/kube-system_calico-node-mkcbb_90ee0bc4-ed13-4dbd-9c0a-81ad223fe332/calico-node
28 /snap/microk8s/2074/usr/bin/nice -n 19 du -x -s -B 1 /var/log/pods/kube-system_calico-kube-controllers-847c8c99d-n9sjp_63a5776b-c6b4-4b6d-90bb-2e2c607f0185/calico-kube-controllers
28 /snap/microk8s/2074/usr/bin/du -x -s -B 1 /var/log/pods/kube-system_calico-node-mkcbb_90ee0bc4-ed13-4dbd-9c0a-81ad223fe332/calico-node
28 /snap/microk8s/2074/usr/bin/du -x -s -B 1 /var/log/pods/kube-system_calico-kube-controllers-847c8c99d-n9sjp_63a5776b-c6b4-4b6d-90bb-2e2c607f0185/calico-kube-controllers
28 /bin/calico-node -felix-live
27 /usr/bin/getent group microk8s
27 /snap/microk8s/2074/kubectl --kubeconfig=/var/snap/microk8s/2074/credentials/client.config get nodes --all-namespaces
27 /snap/microk8s/2074/kubectl --kubeconfig=/var/snap/microk8s/2074/credentials/client.config get all --all-namespaces
27 /snap/microk8s/2074/bin/dqlite -s file:///var/snap/microk8s/2074/var/kubernetes/backend/cluster.yaml -c /var/snap/microk8s/2074/var/kubernetes/backend/cluster.crt -k /var/snap/microk8s/2074/var/kubernetes/backend/cluster.key -f js
27 /snap/microk8s/2074/bin/chmod -R ug+rwX /var/snap/microk8s/2074/var/kubernetes/backend
27 /snap/microk8s/2074/bin/chgrp microk8s -R /var/snap/microk8s/2074/var/kubernetes/backend
26 /usr/sbin/ipset list
23 /usr/sbin/iptables --version
# microk8s kubectl get -A pods
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-node-mkcbb 1/1 Running 0 12m
kube-system calico-kube-controllers-847c8c99d-n9sjp 1/1 Running 0 12m
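For completeness, the raw pbfcc.out log used above can be captured with something as simple as this sketch (the exact invocation isn't shown here, and the 10-minute window is arbitrary):
timeout 600 execsnoop-bpfcc >/root/pbfcc.out   # run as root; trace stops after 600 seconds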
Hogging CPU does not feel lightweight to me.
@knittl : yes !
Another thing: if using top or htop to look at the load average, IMHO as long as the load average is below the number of cores, it should be fine.
So in your case above, a load average of 3.5 on an 8-core system is acceptable.
I have a load average of 1 on a 4-vCPU system with a completely idle microk8s instance, and I don't find that really acceptable for something called "micro...".
execsnoop-bpfcc at least is telling me that this is not something which was built with efficiency in mind.
Is this a bug, or is it the same old "so what? we have enough RAM/CPU today!" developer story?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
no, not stale
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Not stale, not completed.
This is still something that we are actively engaging with and trying to improve further. Some notes:
- We have seen that Calico is not particularly lightweight for some of the systems that we target. In those cases, falling back to a flannel setup is an option. We are working on improving our documentation around this.
- Some of the resource usage we see, with new processes being spawned continuously, comes from the apiserver-kicker service. This is a bash script that configures the cluster and automatically adjusts to account for some scenarios (amongst other things, making sure things do not break when connecting to a different network on a development laptop). If it becomes an issue, or you are not interested in those automatic fixes, you can always disable the service with sudo snap stop microk8s.daemon-apiserver-kicker --disable (see the sketch below).
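A quick sketch for checking and disabling the kicker with standard snap service commands:
# see whether the kicker service is currently enabled and active
sudo snap services microk8s.daemon-apiserver-kicker
# stop it now and keep it disabled across reboots
sudo snap stop --disable microk8s.daemon-apiserver-kicker
# undo later if wanted
sudo snap start --enable microk8s.daemon-apiserver-kicker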
Further, the MicroK8s team is working on moving most of this logic out of bash scripts and into the cluster-agent service, which means that we don't spawn new processes in a loop every 5 seconds without reason.
Hope this communicates our team's work to the people subscribed to this issue.
Can someone please reopen this and stop these fu..ing stale bots? They suck!
Addendum: @neoaggelos, thank you!
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Why has this been closed?
It does not look like it's been resolved!
ヾ( ・`⌓´・)ノ゙