
kubernetes 1.19 / weave-net 2.7.0 : "Segmentation fault (core dumped)" on ARMv7

Open grunlab opened this issue 4 years ago • 5 comments

What you expected to happen?

Weave-net still working after upgrading it from 2.6.4 to 2.7.0 on kubernetes 1.19 running on ARMv7 nodes.

What happened?

After the upgrade, the "weave" container in the "weave-net" pod is in CrashLoopBackOff state.

How to reproduce it?

  • Up & running kubernetes 1.19 cluster on ARMv7 nodes with weave-net 2.6.4 installed
  • Upgrade weave-net from 2.6.4 to 2.7.0 by applying:
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
  • The first pod restarted during the RollingUpdate gets stuck in CrashLoopBackOff state:
kubectl get pod weave-net-2854t -o wide 
NAME              READY   STATUS             RESTARTS   AGE   IP              NODE        NOMINATED NODE   READINESS GATES
weave-net-2854t   1/2     CrashLoopBackOff   2          49s   192.168.0.101   master-01   <none>           <none>
  • Rollback to 2.6.4:
kubectl rollout undo daemonset weave-net 
daemonset.apps/weave-net rolled back
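
For anyone reproducing this, a minimal sketch of a helper that picks out the crashing pods from the `kubectl get pod` output during the rolling update. The function name `crashlooping_pods` is made up for illustration, and it assumes the default column layout shown above (NAME, READY, STATUS, ...), so it will not work as-is with `-A`, which adds a NAMESPACE column.

```shell
#!/bin/sh
# Hypothetical helper (not part of weave or kubectl): reads the output of
# "kubectl get pod" on stdin and prints the NAME of every pod whose STATUS
# column is CrashLoopBackOff.
# Assumes the default layout: NAME READY STATUS RESTARTS AGE ...
crashlooping_pods() {
  # NR > 1 skips the header line; $1 is NAME and $3 is STATUS.
  awk 'NR > 1 && $3 == "CrashLoopBackOff" { print $1 }'
}

# Usage against a live cluster (commented out so the snippet is safe to source):
#   kubectl get pod -n kube-system | crashlooping_pods
```

If this prints anything right after the upgrade, `kubectl rollout undo daemonset weave-net` as in the last step brings the DaemonSet back to the previous version.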

Anything else we need to know?

Versions:

  • Kubernetes version: 1.19.2
  • Docker version: 19.3.13
  • Kernel version: 5.4.61
kubectl get node -o wide
NAME        STATUS   ROLES    AGE     VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                       KERNEL-VERSION       CONTAINER-RUNTIME
master-01   Ready    master   92d     v1.19.2   192.168.0.101   <none>        Armbian 20.08.1 Buster         5.4.61-odroidxu4     docker://19.3.13
master-02   Ready    master   165d    v1.19.2   192.168.0.102   <none>        Armbian 20.08.1 Buster         5.4.61-odroidxu4     docker://19.3.13
master-03   Ready    master   158d    v1.19.2   192.168.0.103   <none>        Armbian 20.08.1 Buster         5.4.61-odroidxu4     docker://19.3.13
worker-01   Ready    <none>   157d    v1.19.2   192.168.0.104   <none>        Armbian 20.08.1 Buster         5.4.61-odroidxu4     docker://19.3.13
worker-02   Ready    <none>   22d     v1.19.2   192.168.0.105   <none>        Armbian 20.08.1 Buster         5.4.61-odroidxu4     docker://19.3.13
worker-03   Ready    <none>   2d17h   v1.19.2   192.168.0.106   <none>        Armbian 20.08.1 Buster         5.4.61-odroidxu4     docker://19.3.13
worker-04   Ready    <none>   2d17h   v1.19.2   192.168.0.107   <none>        Armbian 20.08.1 Buster         5.4.61-odroidxu4     docker://19.3.13

Logs:

kubectl logs weave-net-2854t -c weave
Segmentation fault (core dumped)

Network:

iptables, ip6tables, arptables & ebtables are running in legacy mode

update-alternatives --set iptables /usr/sbin/iptables-legacy
update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy
update-alternatives --set arptables /usr/sbin/arptables-legacy
update-alternatives --set ebtables /usr/sbin/ebtables-legacy

This was already the case before I tried to upgrade weave-net.
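
To double-check which backend is actually active after running the commands above, here is a small sketch. It relies on iptables 1.8 and later printing the backend in parentheses in `iptables --version`, e.g. `(legacy)` or `(nf_tables)`. The function name `iptables_backend` is only an illustration.

```shell
#!/bin/sh
# Hypothetical helper: classifies an "iptables --version" string.
# iptables >= 1.8 appends "(legacy)" or "(nf_tables)" to the version line;
# older releases print no backend, which is reported here as "unknown".
iptables_backend() {
  case "$1" in
    *"(legacy)"*)    echo legacy ;;
    *"(nf_tables)"*) echo nf_tables ;;
    *)               echo unknown ;;
  esac
}

# Usage on a node (commented out so the snippet is safe to source):
#   iptables_backend "$(iptables --version)"
```

The same check can be repeated for ip6tables, assuming its version string follows the same format.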

grunlab avatar Sep 18 '20 14:09 grunlab

Bump. Thank you.

grunlab avatar Oct 13 '20 18:10 grunlab

Hi

I have made some updates since opening the issue:

  • Armbian: 20.08.01 --> 20.08.17
  • Kernel: 5.4.61 --> 5.4.72
  • Kubernetes: 1.19.2 --> 1.19.3
  • Weave: 2.6.4 --> 2.6.5

But I'm still not able to upgrade weave from 2.6.5 to 2.7.0 :-(

kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
kubectl get pod -n kube-system | grep CrashLoopBackOff
weave-net-8xp6b                     1/2     CrashLoopBackOff   5          4m22s

The status of the two containers inside weave-net pod is the following:

  • container weave-npc (image: weaveworks/weave-npc:2.7.0) --> OK, running.
  • container weave (image: weaveworks/weave-kube:2.7.0) --> not OK, crashing with the following error:
kubectl logs weave-net-8xp6b -c weave
Segmentation fault (core dumped)

Pulled images:

sudo docker images | grep weave | grep 2.7.0
weaveworks/weave-npc                                  2.7.0               2d47a5fd0000        2 months ago        36.7MB
weaveworks/weave-kube                                 2.7.0               f58a4b249316        2 months ago        90.1MB

Do you need additional information? I would be very happy to help. Thank you

grunlab avatar Oct 29 '20 20:10 grunlab

The same happens for me with an Asus Tinker Board S (ARMv7 as well). v2.6.4 works, while v2.7.0 ends in Segmentation fault (core dumped). Nodes running on x86-64 are not affected by the issue.
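
In a mixed cluster it can help to confirm which nodes are 32-bit ARMv7 before upgrading. A minimal sketch, assuming `uname -m` reports a string starting with `armv7` on the affected boards (such as `armv7l`). The function name `is_armv7` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) when the given machine
# string, as printed by "uname -m", looks like 32-bit ARMv7.
is_armv7() {
  case "$1" in
    armv7*) return 0 ;;
    *)      return 1 ;;
  esac
}

# Usage on a node (commented out so the snippet is safe to source):
#   if is_armv7 "$(uname -m)"; then echo "ARMv7 node"; fi
```

From the control plane, `kubectl get nodes -L kubernetes.io/arch` shows the architecture label per node, which should read `arm` for these boards and `amd64` for x86-64.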

thewilli avatar Nov 02 '20 15:11 thewilli

Hi,

Same issue with weave-kube 2.7.0 on:

  • Ubuntu 20.04.1 LTS (Focal Fossa)
  • Raspberry Pi 3

$ uname -a
Linux host1 5.4.0-1022-raspi #25-Ubuntu SMP PREEMPT Thu Oct 15 14:22:53 UTC 2020 armv7l armv7l armv7l GNU/Linux

Ubuntu 20.04 LTS is said to be certified on Raspberry Pi:

  • https://ubuntu.com/blog/ubuntu-20-04-lts-is-certified-for-the-raspberry-pi

Weave 2.6.5 seems to be running:

$ kubectl get pods -A
NAMESPACE     NAME                                              READY   STATUS    RESTARTS   AGE
kube-system   coredns-66bff467f8-pknkx                          1/1     Running   4          27d
kube-system   coredns-66bff467f8-rsdwn                          1/1     Running   4          27d
kube-system   etcd-winuxpi3.lab.byte13.org                      1/1     Running   52         5d2h
kube-system   kube-apiserver-winuxpi3.lab.byte13.org            1/1     Running   48         5d2h
kube-system   kube-controller-manager-winuxpi3.lab.byte13.org   1/1     Running   52         5d2h
kube-system   kube-proxy-vlpbw                                  1/1     Running   4          27d
kube-system   kube-scheduler-winuxpi3.lab.byte13.org            1/1     Running   53         5d2h
kube-system   weave-net-dzcfn                                   2/2     Running   0          9m53s

byte13 avatar Nov 15 '20 11:11 byte13

I've just upgraded to 2.8.1, and the issue no longer occurs with this version on ARMv7.

sudo podman images | grep weave
docker.io/weaveworks/weave-npc            2.8.1            7f92d556d4ff  9 hours ago    39.7 MB
docker.io/weaveworks/weave-kube           2.8.1            df29c0a4002c  9 hours ago    89.8 MB
kubectl get pod -n kube-system -o wide | grep weave
weave-net-4gwzl                     2/2     Running   0          62m   192.168.0.105   worker-02   <none>           <none>
weave-net-5255p                     2/2     Running   0          58m   192.168.0.103   master-03   <none>           <none>
weave-net-5n9sw                     2/2     Running   0          55m   192.168.0.101   master-01   <none>           <none>
weave-net-6fgl8                     2/2     Running   0          63m   192.168.0.106   worker-03   <none>           <none>
weave-net-7vqrh                     2/2     Running   0          56m   192.168.0.104   worker-01   <none>           <none>
weave-net-bkc7v                     2/2     Running   0          57m   192.168.0.102   master-02   <none>           <none>
weave-net-bpbgr                     2/2     Running   0          60m   192.168.0.107   worker-04   <none>           <none>
weave-net-r25f5                     2/2     Running   0          61m   192.168.0.108   worker-05   <none>           <none>
weave-net-vmhwk                     2/2     Running   0          65m   192.168.0.113   worker-10   <none>           <none>

You can close this issue.

grunlab avatar Jan 25 '21 19:01 grunlab