Document version compatibility (all dependencies)
Hi, I just spent quite some time figuring out what was wrong with my Kubernetes cluster. It appears the latest weave is not compatible with the latest CNI. Which is totally fine, but I think it should be made more apparent.
My suggestion would be to either include that information in the release notes, or to add a compatibility table in the documentation. Maybe there is one but I haven't been able to locate one.
Also, the recommended installation method is to get a YAML from your website, which only depends on the Kubernetes version, and not the CNI version. So there should at least be a warning about that in the install doc.
As far as I can tell, latest weave works with CNI protocol version =0.3.0 and CNI release <=0.8.1
Why do you think it is not compatible with newer versions? I have not found an issue yet. I am using latest k3s, latest weave and CNI Plugins 1.1.0.
Yeah I'm sorry, it indeed works with any CNI release > 0.3 (including 1.0+) but only with protocol version 0.3 which is incompatible with containerd 1.6+ which uses 1.0 protocol version, and this broke my cluster when upgrading containerd.
Uh okay thank you for the hint. I am currently at containerd 1.5.9. Do you have logs what happens with containerd 1.6?
Sorry, I don't have the logs anymore, but pods could no longer be created or deleted, and the errors complained about missing, unparsable, or incompatible versions. The error itself was not very helpful and I spent quite some time figuring out that the issue was due to containerd requiring a different CNI protocol version than weave.
Do you have logs what happens with containerd 1.6?
I'm guessing it's the same thing I'm hitting -
Warning FailedCreatePodSandBox 4s (x10 over 118s) kubelet (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "217da23c13e0c6565689328bc8bee1e4095a44ec7a4c427b347f0a085372743d": plugin type="portmap" failed (add): failed to parse config: could not parse prevResult: could not parse prevResult: result type supports [0.3.0 0.3.1 0.4.0] but unmarshalled CNIVersion is "0.1.0"
ubuntu@on-3:~$ containerd --version
containerd github.com/containerd/containerd v1.6.1 10f428dac7cec44c864e1b830a4623af27a9fc70
ubuntu@on-3:~$
Edit: Dang it, and it's an ARM64 box, and there is no containerd 1.5.x for ARM64 :-/ ... I'm using Kubespray so I'll have to try one of the other container runtimes.
I'm hitting the same. FWICT this got introduced by https://github.com/containernetworking/cni/commit/76bf3de7f892b5adac1b20bf6fb7a1e962ad0cd1. Runtimes which include this commit don't work with CNI plugins which are missing https://github.com/containernetworking/cni/commit/27a5b994c2a55d1fceca08ec88139b61d4ad55fd (from 2017!).
The issue is that without this, weave-net makes unversioned replies like
{
"ips": [
{
"version": "4",
"address": "10.32.0.2/12",
"gateway": "10.32.0.1"
}
],
"dns": {}
}
which the runtime now interprets as having version 0.1.0. With the referenced commit added to weave's copy of cni, it is versioned again:
{
"cniVersion": "0.3.0",
"ips": [
{
"version": "4",
"address": "10.32.0.6/12",
"gateway": "10.32.0.1"
}
],
"dns": {}
}
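To make the failure mode concrete, here is a minimal sketch of the version check as described above. It is a simplified model, not the actual libcni code: the function names and the `SUPPORTED` tuple are made up for illustration, and the supported versions are simply copied from the portmap error message quoted earlier in this thread.

```python
import json

# Versions the failing "portmap" plugin reports as supported in the error above.
SUPPORTED = ("0.3.0", "0.3.1", "0.4.0")


def detect_result_version(raw: str) -> str:
    """Return the result's cniVersion, or "0.1.0" when the field is missing."""
    doc = json.loads(raw)
    return doc.get("cniVersion") or "0.1.0"


def check_prev_result(raw: str) -> str:
    """Reject a previous result whose detected version is not supported."""
    version = detect_result_version(raw)
    if version not in SUPPORTED:
        raise ValueError(
            f'result type supports [{" ".join(SUPPORTED)}] '
            f'but unmarshalled CNIVersion is "{version}"'
        )
    return version


# The two replies shown above, as JSON strings.
UNVERSIONED = (
    '{"ips": [{"version": "4", "address": "10.32.0.2/12", '
    '"gateway": "10.32.0.1"}], "dns": {}}'
)
VERSIONED = (
    '{"cniVersion": "0.3.0", "ips": [{"version": "4", '
    '"address": "10.32.0.6/12", "gateway": "10.32.0.1"}], "dns": {}}'
)

if __name__ == "__main__":
    print(check_prev_result(VERSIONED))
    try:
        check_prev_result(UNVERSIONED)
    except ValueError as err:
        print(err)
```

In this model the unversioned reply is treated as 0.1.0 and rejected by the next plugin in the chain, while the versioned reply passes.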
I attempted to update the cni version to 0.6.0, which contains the needed fixes but not the API break for cmdCheck. While that built properly, the weave container fails to start due to a panic:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x55fdf4037fbc]
goroutine 1 [running]:
main.isLocalNodeIP({0xc000191570, 0xd})
/home/abuild/rpmbuild/BUILD/weave-2.8.1/prog/kube-utils/main.go:82 +0xbc
main.getKubePeers({0x55fdf4bd72d0, 0xc000372dc0}, 0x0)
/home/abuild/rpmbuild/BUILD/weave-2.8.1/prog/kube-utils/main.go:62 +0x445
main.main()
/home/abuild/rpmbuild/BUILD/weave-2.8.1/prog/kube-utils/main.go:405 +0x87f
Failed to get peers
So the only way to fix this is to cherry-pick the fix into the vendored copy for now: https://github.com/Vogtinator/weave/commit/ef8fa923030d9b6da3ca014689871b0108486e31
With that, the cluster comes up as expected.
Hello!
I am facing the same issues on my homelab single-node cluster:
Warning FailedCreatePodSandBox 48s (x167 over 38m) kubelet (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_local-path-provisioner-566b877b9c-rh7nj_kube-system_534f4cf2-4e1f-4c8e-ae16-b04ebb4b4ba6_0(e19e036f21a730777e4f6fb77fdd3c605ca61ea9990007235749ef169fca2c39): error adding pod kube-system_local-path-provisioner-566b877b9c-rh7nj to CNI network "weave": plugin type="portmap" failed (add): failed to parse config: could not parse prevResult: could not parse prevResult: result type supports [0.3.0 0.3.1 0.4.0] but unmarshalled CNIVersion is "0.1.0"
Kubernetes and runtime versions:
$ k get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
pikachu Ready control-plane,master 95d v1.23.1 192.168.100.152 <none> Red Hat Enterprise Linux 8.5 (Ootpa) 4.18.0-348.7.1.el8_5.x86_64 cri-o://1.23.0
Which solutions do we have at this moment?
Updated images (with github.com/containernetworking/[email protected]) are temporarily uploaded below:
- https://hub.docker.com/r/alvistack/weave-kube
- https://hub.docker.com/r/alvistack/weave-npc
Also see https://github.com/weaveworks/weave/pull/3939
This does not seem to be happening only with containerd but also with cri-o. A release (incl. aarch64) with a fix would be very appreciated.
Yeah I'm sorry, it indeed works with any CNI release > 0.3 (including 1.0+) but only with protocol version 0.3 which is incompatible with containerd 1.6+ which uses 1.0 protocol version, and this broke my cluster when upgrading containerd.
containerd should work with older cni configs and plugins. We did have a change where we set up lo using a 1.0.0 config and the cni loopback plugin; we reverted that to 0.3.1 in containerd 1.6.4. containerd is built against the cni v1.0.1 library but should be backwards compatible.
let's see what we can do to fix these issues..
to CNI network "weave": plugin type="portmap" failed (add): failed to parse config: could not parse prevResult: could not parse prevResult: result type supports [0.3.0 0.3.1 0.4.0] but unmarshalled CNIVersion is "0.1.0
nod, new cni requires the config version to be specified on setup, otherwise it presumes 0.1.0...
cheers!
looks like it's not just that the config needs to be specified but the plugin also needs the setup result to have the version in it or cni will convert the result to the wrong version and flush the important parts in the result..
If correct, weave needs a fix to add config version in result.. and cni needs a fix to use/try config version if the plugin did not provide the config version in the result...
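The two fixes proposed in this comment could look roughly like the sketch below. This is only an illustration of the idea, not code from weave or from the cni library: `stamp_result`, `effective_result_version`, and the sample config are hypothetical.

```python
import json


def stamp_result(netconf_raw: str, result_raw: str) -> str:
    """Plugin side: copy the config's cniVersion into the result if absent."""
    conf = json.loads(netconf_raw)
    result = json.loads(result_raw)
    if not result.get("cniVersion"):
        result["cniVersion"] = conf.get("cniVersion") or "0.1.0"
    return json.dumps(result)


def effective_result_version(netconf_raw: str, result_raw: str) -> str:
    """Runtime side: prefer the result's version, then fall back to the config's."""
    conf = json.loads(netconf_raw)
    result = json.loads(result_raw)
    return result.get("cniVersion") or conf.get("cniVersion") or "0.1.0"


# Hypothetical network config and an unversioned plugin reply.
NETCONF = '{"cniVersion": "0.3.0", "name": "weave", "type": "weave-net"}'
RESULT = '{"ips": [{"version": "4", "address": "10.32.0.2/12"}], "dns": {}}'

if __name__ == "__main__":
    print(effective_result_version(NETCONF, RESULT))
    print(stamp_result(NETCONF, RESULT))
```

Either change on its own would stop the unversioned reply from being read as 0.1.0 in this model; doing both keeps older plugins working while fixed plugins report their version explicitly.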
Doesn't https://github.com/weaveworks/weave/pull/3939 solve this issue?
I was facing the same issue when deploying a brand new cluster. The core-dns pods wouldn't start with the error message below:
Warning FailedCreatePodSandBox 4m5s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "1557f2385ba5c7a830a441c1abcd15a0d6026c1cc50d6ecad6f21be6bf3b215c": plugin type="portmap" failed (add): failed to parse config: could not parse prevResult: could not parse prevResult: result type supports [0.3.0 0.3.1 0.4.0] but unmarshalled CNIVersion is "0.1.0"
I was using the latest version of kubeadm/kubernetes (1.24.1), containerd (1.6.4), the CNI plugins (1.1.1) and weave (installed following the documentation).
I fixed the problem thanks to this issue by downgrading containerd to 1.5.12:
- Downloaded the correct tar file
- Stopped containerd service
- Untarred containerd 1.5.12 over the previous install in /usr/local
- Restarted the containerd service
- Restarted the kubelet service
- Core-dns pods started running
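Before downgrading, it may help to confirm that a node is actually running a containerd release in the range reported in this thread. The sketch below parses the output of `containerd --version` in the format shown earlier. The 1.6.0 cutoff is an assumption taken from the "containerd 1.6+" remark above, not an official boundary, and the helper names are hypothetical. It says nothing about cri-o, which was also reported as affected.

```python
import re
from typing import Optional, Tuple

Version = Tuple[int, int, int]


def parse_containerd_version(output: str) -> Optional[Version]:
    """Extract (major, minor, patch) from `containerd --version` output."""
    match = re.search(r"\sv(\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        return None
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def in_reported_range(version: Version) -> bool:
    """True for containerd 1.6.0 and later (assumed cutoff, see lead-in)."""
    return version >= (1, 6, 0)


# Output line as posted earlier in this thread.
SAMPLE = (
    "containerd github.com/containerd/containerd v1.6.1 "
    "10f428dac7cec44c864e1b830a4623af27a9fc70"
)

if __name__ == "__main__":
    version = parse_containerd_version(SAMPLE)
    print(version, in_reported_range(version))
```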
This should be resolved by #3946 – I'm going to keep this open until I can see how it gets published as "latest", or at least see that it gets published.
Let's re-purpose this issue, since I'm hearing from some folks that there are other compatibility issues that may impact Weave net, e.g. iptables. I am using a mix of IPTables 1.8.8 and 1.8.5 in my cluster with weave net, and don't seem to be having any issues. But there might be something nuanced in here and it will be helpful to future users if we can document it.
- https://github.com/weaveworks/weave/issues/3465#issuecomment-929816278
The important bit of information in there seems to be that iptables-legacy must be used instead of nf_tables, and so far this mainly affects CentOS users.
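For anyone collecting data points for that list, a quick way to tell which backend a node uses is to look at the suffix that iptables 1.8+ prints in `iptables --version`. The sketch below only parses that string; the sample lines are illustrative and were not captured from any cluster in this thread, and `iptables_backend` is a hypothetical helper name.

```python
import re


def iptables_backend(version_output: str) -> str:
    """Return "legacy", "nf_tables", or "unknown" from `iptables --version` output."""
    match = re.search(r"\((legacy|nf_tables)\)", version_output)
    return match.group(1) if match else "unknown"


if __name__ == "__main__":
    # Illustrative sample lines, not taken from the thread.
    for line in ("iptables v1.8.5 (nf_tables)", "iptables v1.8.8 (legacy)"):
        print(line, "->", iptables_backend(line))
```

Releases before 1.8 do not print a backend suffix, so they come back as "unknown" here.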
Edit: when we have a good list going, we can close this issue by adding it to the docs. 👍