Microk8s constantly spawning new processes when idle
I have a VPS server running microk8s (previously 1.19/stable and now upgraded to 1.20/stable while troubleshooting). My server's load is constantly high (5-10), with kube-apiserver appearing as the top CPU consumer.
Digging a bit more with execsnoop-bpfcc, I found lots of new processes constantly being spawned by microk8s, in some seconds up to 50 new processes. The server is basically idle. That can't be right, can it?
Quick and dirty stats, counting spawned processes per second with grep microk8s | cut -d' ' -f1 | uniq -c on the output of execsnoop-bpfcc (full pipeline sketched after the numbers):
14 20:57:02
3 20:57:03
48 20:57:04
7 20:57:05
9 20:57:06
3 20:57:07
13 20:57:09
1 20:57:10
1 20:57:11
13 20:57:12
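For reference, the full pipeline was roughly the following (a sketch; the HH:MM:SS column assumes execsnoop-bpfcc was run with its -T/--time option):
# count processes spawned by microk8s, grouped by second
sudo execsnoop-bpfcc -T | grep microk8s | cut -d' ' -f1 | uniq -c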
But judging by the PPIDs, some processes excluded by this grep look like they were spawned by microk8s too.
One of the processes which I find a lot is the following:
20:57:07 runc 151295 27204 0 /snap/microk8s/1910/bin/runc --root /run/containerd/runc/k8s.io --log /var/snap/microk8s/common/run/containerd/io.containerd.runtime.v2.task/k8s.io/$id --log-format json exec --process /var/snap/microk8s/common/run/runc-process430775463 --detach --pid-file /var/snap/microk8s/common/run/containerd/io.containerd.runtime.v2.task/k8s.io/$id $id
This process is spawned at least once per second, always with a new, unique id (a 64 character hex-string).
The logs of containerd always contain the same 4-6 lines repeated, again with changing ids:
Feb 12 21:33:14 $host microk8s.daemon-containerd[16547]: time="2021-02-12T21:33:14.423305512+01:00" level=info msg="Exec process \"$id1\" exits with exit code 0 and error <nil>"
Feb 12 21:33:14 $host microk8s.daemon-containerd[16547]: time="2021-02-12T21:33:14.423353573+01:00" level=info msg="Exec process \"$id2\" exits with exit code 0 and error <nil>"
Feb 12 21:33:14 $host microk8s.daemon-containerd[16547]: time="2021-02-12T21:33:14.423624404+01:00" level=info msg="Finish piping \"stderr\" of container exec \"$id2\""
Feb 12 21:33:14 $host microk8s.daemon-containerd[16547]: time="2021-02-12T21:33:14.423487501+01:00" level=info msg="Finish piping \"stdout\" of container exec \"$id2\""
Feb 12 21:33:14 $host microk8s.daemon-containerd[16547]: time="2021-02-12T21:33:14.433792186+01:00" level=info msg="ExecSync for \"$id3\" returns with exit code 0"
Feb 12 21:33:14 $host microk8s.daemon-containerd[16547]: time="2021-02-12T21:33:14.435643166+01:00" level=info msg="ExecSync for \"$id3\" returns with exit code 0"
Inspection report attached: inspection-report-20210212_211908.tar.gz
I notice that calico does spawn containers every now and then, like those calico-ipam.... ones. I see that when I do htop. They come and go.
@balchua the thing is: I want to use (micro)k8s to run my software, not to constantly max out 1+ CPUs. It seems a bit excessive, but maybe my expectation is wrong and that is normal behavior? Perhaps my setup is botched? I can't really tell.
The docs state:
MicroK8s will install a minimal, lightweight Kubernetes you can run and use on practically any machine.
Hogging CPU does not feel lightweight to me.
Hi @knittl, I agree with you on that, although the Kubernetes control plane usually takes some of the node's resources. May I know your node specs? The smallest spec I've ever used is 2 CPUs x 4 GB for a node that runs the Kubernetes control plane.
You can set up a worker-only node to run your apps, with a separate control plane, to avoid resource contention with your apps.
I just wrote up the steps on how to set up a worker-only node with MicroK8s: https://discuss.kubernetes.io/t/adding-a-worker-node-only-with-microk8s/14801 I hope that helps.
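As a rough sketch, the basic join flow looks like this (placeholder address and token; the linked post covers the extra steps needed to keep the joined node worker-only):
# on the existing node: print a join command containing a one-time token
microk8s add-node
# on the new node: join using the address and token printed above (placeholders here)
microk8s join <control-plane-ip>:25000/<token>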
@balchua I totally understand that k8s will require resources to run :) It just feels a bit much for my use case: a small number of deployments (14 deployments with 18 pods in total) already causes the server to have a load average of 5–10 most of the time. Stopping microk8s makes the load go away.
The server is a VPS with the following specs: an 8-core Intel Xeon CPU at 2.2 GHz, 32 GB RAM (8 GB in use), and plenty of storage (1 TB).
Are you running multiple nodes? I am running my 3-node control plane on 2 CPU x 4 GB machines, plus 1 worker node. Running Prometheus, Linkerd, the dashboard, and my own workload doesn't bring my load average to 5 or 10. Maybe I should try your setup of 14 deployments.
No, it is a single-node setup. No Prometheus, no dashboard. Listing all pods with microk8s kubectl get -A pods outputs the following list:
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system hostpath-provisioner-5c65fbdb4f-lxjkw 1/1 Running 0 40h
kube-system calico-kube-controllers-847c8c99d-hw7sn 1/1 Running 2 41h
kube-system coredns-86f78bb79c-qvsv5 1/1 Running 0 40h
metallb-system speaker-sbsch 1/1 Running 0 40h
metallb-system controller-559b68bfd8-9hz8g 1/1 Running 0 40h
gitlab-managed-apps ingress-nginx-ingress-default-backend-c9b59c85-qczbv 1/1 Running 0 40h
gitlab-managed-apps ingress-nginx-ingress-controller-68bcfdf674-8fsck 1/1 Running 0 40h
default project-iii-596fd4774b-5gbf5 1/1 Running 0 40h
default project-iii-mysql-bc7cb98cb-98kkn 1/1 Running 0 40h
gitlab-managed-apps certmanager-cert-manager-webhook-84545b7b88-whzrl 1/1 Running 0 39h
gitlab-managed-apps certmanager-cainjector-8c559d68f-ksnv5 1/1 Running 0 39h
gitlab-managed-apps certmanager-cert-manager-855454cc95-gfct7 1/1 Running 0 39h
website-30-production production-697965db4d-kmxgs 1/1 Running 0 39h
gitlab-managed-apps runner-gitlab-runner-5b7fcdcc8-jxmjx 1/1 Running 0 38h
backend-25-review-project25-a6pfx1 review-project25-a6pfx1-postgresql-0 1/1 Running 0 38h
backend-25-review-project25-a6pfx1 review-project25-a6pfx1-f874d6d48-49npf 1/1 Running 1 38h
kube-system calico-node-w8tgj 1/1 Running 2 41h
The load average currently is (5, 10, 15 minutes): 3.55, 4.33, 3.69
This is strange. I've not seen a load average this high on good HW specs. A good way to check is to measure the load average on a bare-bones MicroK8s, i.e. with no workloads scheduled or running except whatever comes with MicroK8s by default. Then try adding your workloads one by one while measuring the load average in between (rough sketch below).
Another thing: if using top or htop to look at the load average, IMHO as long as the load average is below the number of cores, it should be fine.
So in your case above, a load average of 3.5 on an 8-core system is acceptable.
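A rough sketch of that baseline measurement (the manifest name is a placeholder):
# fresh, bare-bones install, then watch the baseline load for a while
sudo snap install microk8s --classic
watch -n 5 cat /proc/loadavg
# afterwards, add workloads one at a time and compare
microk8s kubectl apply -f <first-workload>.yaml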
A load average of 3 out of 8 cores is acceptable in theory, yes. But the system feels sluggish when connected via SSH, which it does not when microk8s is not running. And considering the server is not actively doing anything, the load still seems too high (i.e. 3 is too much for doing nothing).
I finally found some time to analyze this a bit further. I restarted all pods this morning so that they are comparable. Afterwards, I set up a cronjob to go over all running pods and read out their CPU usage as reported by the /sys filesystem:
while read -r ns pod; do { /snap/bin/microk8s kubectl exec -n "$ns" "$pod" -- cat /sys/fs/cgroup/cpu/cpuacct.stat; echo "$ns/$pod"; } | paste -sd ' '; done <pods >>cpu
The output format of my file is user $usercpu system $systemcpu $namespace/$podname with the CPU values showing the accumulated CPU of each pod. This allows me to easily filter and sort by CPU usage of pods over time.
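(The pods file is just a list of namespace and pod name per line; something like the following sketch produces it:)
/snap/bin/microk8s kubectl get pods -A --no-headers -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name >pods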
After less than 2 hours of running pods, the kube-system/calico-node-... pod comes out on top with respect to CPU usage, in both user and system numbers:
user 9111 system 7752 kube-system/calico-node-svkvd
user 10320 system 8799 kube-system/calico-node-svkvd
The metrics are taken at 10-minute intervals, with the trend of calico-node as follows (+9500/+8000); a quick way to rank pods from the collected file is sketched after the numbers:
user 811 system 761 kube-system/calico-node-svkvd
user 1896 system 1652 kube-system/calico-node-svkvd
user 3075 system 2596 kube-system/calico-node-svkvd
user 4241 system 3567 kube-system/calico-node-svkvd
user 5519 system 4709 kube-system/calico-node-svkvd
user 6703 system 5714 kube-system/calico-node-svkvd
user 7881 system 6722 kube-system/calico-node-svkvd
user 9111 system 7752 kube-system/calico-node-svkvd
user 10320 system 8799 kube-system/calico-node-svkvd
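To rank pods by their latest accumulated CPU from this file, a quick sketch (assuming the "user N system M namespace/pod" line format described above):
awk '{u[$5]=$2; s[$5]=$4} END {for (p in u) print u[p], s[p], p}' cpu | sort -rn | head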
The next non-calico pod is one of my "real" applications, which currently sits at user=4000 and system=400. It is a Spring Boot application so most of the CPU was spent during startup (after all pods have started, it showed around user=2000).
The trend of the Spring Boot application is not as steep (+1700/+300 in the same period):
user 2557 system 202 my-ns/my-application-7d9f877c47-rs65g
user 2855 system 239 my-ns/my-application-7d9f877c47-rs65g
user 3022 system 275 my-ns/my-application-7d9f877c47-rs65g
user 3242 system 314 my-ns/my-application-7d9f877c47-rs65g
user 3402 system 342 my-ns/my-application-7d9f877c47-rs65g
user 3548 system 376 my-ns/my-application-7d9f877c47-rs65g
user 3770 system 417 my-ns/my-application-7d9f877c47-rs65g
user 3919 system 449 my-ns/my-application-7d9f877c47-rs65g
user 4087 system 482 my-ns/my-application-7d9f877c47-rs65g
user 4282 system 514 my-ns/my-application-7d9f877c47-rs65g
Is this normal/expected? There are currently 22 pods running (4 in the kube-system namespace, 9 gitlab-managed apps (certmanager, ingress, runner), and 9 application pods). It looks like the calico-node pod is wasting quite a lot of CPU. Am I barking up the wrong tree?
probably related: https://github.com/ubuntu/microk8s/issues/1567
I also took a look with execsnoop-bpfcc on a totally fresh installed system (with snap install microk8s --classic); this is what I'm getting from the log after a short amount of time:
# cat /root/pbfcc.out |cut -b 36-|sort|cut -b 1-230,285-| uniq -c|sort -rn|head -n100|sort -rn|head -n50
92 /usr/bin/systemctl show --property=Id,ActiveState,UnitFileState,Type snap.microk8s.daemon-apiserver.service
92 /usr/bin/snapctl services microk8s.daemon-apiserver
92 /snap/microk8s/2074/usr/bin/openssl x509 -noout -issuer
92 /snap/microk8s/2074/usr/bin/cmp -s /var/snap/microk8s/2074/certs/csr.conf.rendered /var/snap/microk8s/2074/certs/csr.conf
92 /snap/microk8s/2074/sbin/ip route
92 /snap/microk8s/2074/sbin/ifconfig cni0
92 /snap/microk8s/2074/bin/sleep 3
92 /snap/microk8s/2074/bin/sed -i s/#MOREIPS/IP.3 = 172.16.31.207\nIP.4 = 10.1.128.192\nIP.5 = 10.1.5.0/g /var/snap/microk8s/2074/certs/csr.conf.rendered
92 /snap/microk8s/2074/bin/hostname -I
92 /snap/microk8s/2074/bin/grep default
92 /snap/microk8s/2074/bin/grep active
92 /snap/microk8s/2074/bin/grep -E (--advertise-address|--bind-address) /var/snap/microk8s/2074/args/kube-apiserver
92 /snap/microk8s/2074/bin/cp /var/snap/microk8s/2074/certs/csr.conf.template /var/snap/microk8s/2074/certs/csr.conf.rendered
87 /proc/self/exe init
85 /proc/self/fd/5 init
57 /snap/microk8s/2074/bin/runc --root /run/containerd/runc/k8s.io --log /var/snap/microk8s/common/run/containerd/io.containerd.runtime.v2.task/k8s.io/f5c2d4e86803a2eab1bb74af4b9b95abf1cf45266f70e0d0a8 --log-format json exec --proces--detach --pid-file /var/snap/microk8s/common/run/containerd/io.containerd.runtime.v2.task/k8s.io/f5c2d4e86803a2eab1bb74af4b9b95abf1cf45266f70e0d0a8 f5c2d4e86803a2eab1bb74af4b9b95abf1cf45266f70e0d0a8deb464d0c55b21
29 /usr/bin/check-status -r
29 /snap/microk8s/2074/bin/runc --root /run/containerd/runc/k8s.io --log /var/snap/microk8s/common/run/containerd/io.containerd.runtime.v2.task/k8s.io/745bfb8a154b1bdd28a7a6364a1c2efab666c82dc8395167e5 --log-format json exec --proces--detach --pid-file /var/snap/microk8s/common/run/containerd/io.containerd.runtime.v2.task/k8s.io/745bfb8a154b1bdd28a7a6364a1c2efab666c82dc8395167e5 745bfb8a154b1bdd28a7a6364a1c2efab666c82dc8395167e5669ed46e85f0c5
29 /bin/calico-node -felix-ready
28 /usr/bin/find /var/log/pods/kube-system_calico-node-mkcbb_90ee0bc4-ed13-4dbd-9c0a-81ad223fe332/calico-node -xdev -printf .
28 /usr/bin/find /var/log/pods/kube-system_calico-kube-controllers-847c8c99d-n9sjp_63a5776b-c6b4-4b6d-90bb-2e2c607f0185/calico-kube-controllers -xdev -printf .
28 /snap/microk8s/2074/usr/bin/nice -n 19 du -x -s -B 1 /var/log/pods/kube-system_calico-node-mkcbb_90ee0bc4-ed13-4dbd-9c0a-81ad223fe332/calico-node
28 /snap/microk8s/2074/usr/bin/nice -n 19 du -x -s -B 1 /var/log/pods/kube-system_calico-kube-controllers-847c8c99d-n9sjp_63a5776b-c6b4-4b6d-90bb-2e2c607f0185/calico-kube-controllers
28 /snap/microk8s/2074/usr/bin/du -x -s -B 1 /var/log/pods/kube-system_calico-node-mkcbb_90ee0bc4-ed13-4dbd-9c0a-81ad223fe332/calico-node
28 /snap/microk8s/2074/usr/bin/du -x -s -B 1 /var/log/pods/kube-system_calico-kube-controllers-847c8c99d-n9sjp_63a5776b-c6b4-4b6d-90bb-2e2c607f0185/calico-kube-controllers
28 /bin/calico-node -felix-live
27 /usr/bin/getent group microk8s
27 /snap/microk8s/2074/kubectl --kubeconfig=/var/snap/microk8s/2074/credentials/client.config get nodes --all-namespaces
27 /snap/microk8s/2074/kubectl --kubeconfig=/var/snap/microk8s/2074/credentials/client.config get all --all-namespaces
27 /snap/microk8s/2074/bin/dqlite -s file:///var/snap/microk8s/2074/var/kubernetes/backend/cluster.yaml -c /var/snap/microk8s/2074/var/kubernetes/backend/cluster.crt -k /var/snap/microk8s/2074/var/kubernetes/backend/cluster.key -f js
27 /snap/microk8s/2074/bin/chmod -R ug+rwX /var/snap/microk8s/2074/var/kubernetes/backend
27 /snap/microk8s/2074/bin/chgrp microk8s -R /var/snap/microk8s/2074/var/kubernetes/backend
26 /usr/sbin/ipset list
23 /usr/sbin/iptables --version
# microk8s kubectl get -A pods
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-node-mkcbb 1/1 Running 0 12m
kube-system calico-kube-controllers-847c8c99d-n9sjp 1/1 Running 0 12m
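For completeness, the raw pbfcc.out log used above can be captured with something as simple as this sketch (the exact invocation isn't shown here, and the 10-minute window is arbitrary):
timeout 600 execsnoop-bpfcc >/root/pbfcc.out   # run as root; trace stops after 600 seconds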
Hogging CPU does not feel lightweight to me.
@knittl : yes !
Another thing: if using top or htop to look at the load average, IMHO as long as the load average is below the number of cores, it should be fine.
So in your case above, a load average of 3.5 on an 8-core system is acceptable.
I have a load average of 1 on a 4-vCPU system with a completely idle microk8s instance, and I don't find that really acceptable for something called "micro...".
execsnoop-bpfcc at least is telling me that this is not something which was built with efficiency in mind.
Is this a bug, or is it the same old "so what? we have enough RAM/CPU today!" developer story?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
no, not stale
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Not stale, not completed.
This is still something that we are actively engaging with and trying to improve further. Some notes:
- We have seen that Calico is not particularly lightweight for some of the systems that we target. In those cases, falling back to a flannel setup is an option. We are working on improving our documentation around this.
- Some of the resource usage we see, with new processes being spawned continuously, comes from the apiserver-kicker service. This is a bash script that configures the cluster and automatically adjusts to account for some scenarios (amongst other things, making sure things do not break when connecting to a different network on a development laptop). If it becomes an issue, or you are not interested in those automatic fixes, you can always disable the service with sudo snap stop microk8s.daemon-apiserver-kicker --disable (see the sketch below).
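A quick sketch for checking and disabling the kicker with standard snap service commands:
# see whether the kicker service is currently enabled and active
sudo snap services microk8s.daemon-apiserver-kicker
# stop it now and keep it disabled across reboots
sudo snap stop --disable microk8s.daemon-apiserver-kicker
# undo later if wanted
sudo snap start --enable microk8s.daemon-apiserver-kicker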
Further, the MicroK8s team is working on moving most of this logic out of bash scripts and into the cluster-agent service, which means that we don't spawn new processes in a loop every 5 seconds without reason.
Hope this communicates our team's work to the people subscribed to this issue.
Can someone please reopen this and stop these fu..ing stale bots? They suck!
Addendum: @neoaggelos, thank you!
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Why has this been closed?
It does not look like it's been resolved!
ヾ( ・`⌓´・)ノ゙