coreos-kubernetes
Calico + rkt Fails due to CNI error
This was discussed in the #kubernetes-users channel on the K8s Slack, but I wanted to make sure it was documented here.
TLDR
When Calico + rkt are enabled on the Vagrant multi-node cluster (and possibly other setups), all pods get stuck in the ContainerCreating phase because CNI is apparently not being set up.
Problem
Using the latest commit (79b7350fe2e45a1a5e9ed0f34a904eb10c158232), with rkt and Calico enabled, i.e., with
# Whether to use Calico for Kubernetes network policy.
export USE_CALICO=true
# Determines the container runtime for kubernetes to use. Accepts 'docker' or 'rkt'.
export CONTAINER_RUNTIME=rkt
set in controller-install.sh and worker-install.sh, we get the following error when trying to create a Pod:
$ kubectl create -f <spec for deployment with busybox container pod>
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
app-deploy-172403538-gvwpz 0/1 ContainerCreating 0 1m
$ kubectl describe pod app-deploy-172403538-gvwpz
[...]
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
44s 29s 6 {default-scheduler } Warning FailedScheduling no nodes available to schedule pods
12s 12s 1 {default-scheduler } Normal Scheduled Successfully assigned app-deploy-172403538-gvwpz to 172.17.4.202
12s 1s 2 {kubelet 172.17.4.202} Warning FailedSync Error syncing pod, skipping: failed to SyncPod: failed to set up pod network: cni config unintialized
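For anyone trying to reproduce this, here is a minimal sketch for listing which pods are stuck, based only on the `kubectl get pods` column layout shown above (NAME, READY, STATUS, ...). The function name `stuck_pods` is hypothetical, not part of kubectl or this repo.

```shell
# List pods whose STATUS column is ContainerCreating.
# Reads `kubectl get pods` output on stdin and prints the matching pod names.
# Assumes the default column order: NAME READY STATUS RESTARTS AGE.
stuck_pods() {
  # NR > 1 skips the header row; $3 is the STATUS column.
  awk 'NR > 1 && $3 == "ContainerCreating" { print $1 }'
}

# Usage (needs kubectl and a reachable cluster):
#   kubectl get pods | stuck_pods
```

Each printed name can then be passed to `kubectl describe pod` to look for the FailedSync event.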
Nodes possibly have a similar story:
$ kubectl describe node <node name>
[...]
OS Image: Container Linux by CoreOS 1284.0.0 (Ladybug)
[...]
Container Runtime Version: rkt://1.21.0
Kubelet Version: v1.5.1+coreos.0
Kube-Proxy Version: v1.5.1+coreos.0
[...]
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
5m 5m 1 {kubelet 172.17.4.101} Warning ImageGCFailed unable to find data for container /
[... everything else normal ...]
Specs
$ vagrant version
Installed Version: 1.8.6
Latest Version: 1.9.1
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.1", GitCommit:"82450d03cb057bab0950214ef122b67c83fb11df", GitTreeState:"clean", BuildDate:"2016-12-14T00:57:05Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.1+coreos.0", GitCommit:"cc65f5321f9230bf9a3fa171155c1213d6e3480e", GitTreeState:"clean", BuildDate:"2016-12-14T04:08:28Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}
Hints
It was mentioned in the channel that this is because of the Hyperkube image and its compatibility with rkt [and/or] Calico. I wasn't sure how to put that theory to the test, so I can't confirm it.
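Separately from that theory, one thing that might be worth checking: as far as I understand, the `cni config unintialized` message is typically reported when the kubelet has not loaded any CNI network config. The sketch below checks whether a directory contains any config files at all. The function name `cni_config_present` is hypothetical, the file extensions are an assumption about what the CNI loader accepts, and the directory to check depends on how the kubelet (and rkt) on that node are configured, so I have left it as a placeholder.

```shell
# Succeeds (exit status 0) if the given directory contains at least one
# candidate CNI network config file, and fails (exit status 1) otherwise.
# The extensions checked here are an assumption, not taken from this repo.
cni_config_present() {
  dir="$1"
  for f in "$dir"/*.conf "$dir"/*.conflist "$dir"/*.json; do
    # An unmatched glob stays a literal pattern, which fails the -f test.
    [ -f "$f" ] && return 0
  done
  return 1
}

# Usage, run on a worker node:
#   cni_config_present <cni config dir> && echo "config found" || echo "no config found"
```

If no config is found on the affected worker, that would at least narrow the problem down to whatever is supposed to write the Calico config there.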