coreos-kubernetes

Calico + rkt Fails due to CNI error

Open pop opened this issue 7 years ago • 28 comments

This was discussed in the #kubernetes-users channel on the K8s Slack but I wanted to make sure it was documented here.

TLDR

When Calico + rkt are enabled on the Vagrant (and other setups?) multi-node cluster, all pods get stuck in the ContainerCreating status because of some problem with CNI not being set up.

Problem

Using the latest commit (79b7350fe2e45a1a5e9ed0f34a904eb10c158232), when rkt and Calico are enabled, i.e. when

# Whether to use Calico for Kubernetes network policy.
export USE_CALICO=true

# Determines the container runtime for kubernetes to use. Accepts 'docker' or 'rkt'.
export CONTAINER_RUNTIME=rkt

are set in controller-install.sh and worker-install.sh, we get the following error when trying to create a Pod:

$ kubectl create -f <spec for deployment with busybox container pod>
$ kubectl get pods
NAME                         READY     STATUS              RESTARTS   AGE
app-deploy-172403538-gvwpz   0/1       ContainerCreating   0          1m
$ kubectl describe pod app-deploy-172403538-gvwpz
[...]
Events:
  FirstSeen	LastSeen	Count	From			SubObjectPath	Type		Reason			Message
  ---------	--------	-----	----			-------------	--------	------			-------
  44s		29s		6	{default-scheduler }			Warning		FailedScheduling	no nodes available to schedule pods
  12s		12s		1	{default-scheduler }			Normal		Scheduled		Successfully assigned app-deploy-172403538-gvwpz to 172.17.4.202
  12s		1s		2	{kubelet 172.17.4.202}			Warning		FailedSync		Error syncing pod, skipping: failed to SyncPod: failed to set up pod network: cni config unintialized
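
As far as I understand it, the kubelet reports this error when it has not loaded any network config from its CNI config directory, so one quick check is whether that directory on the affected worker contains any config files at all. The following is a minimal sketch, not something from the repo: `check_cni_conf` is a hypothetical helper, the default path is the upstream kubelet default rather than necessarily what these install scripts use, and the set of file extensions is an assumption that may be broader than what kubelet v1.5 actually reads.

```shell
# Sketch only: check_cni_conf is a hypothetical helper, not part of coreos-kubernetes.
# It reports whether a directory contains any CNI network config files.
check_cni_conf() {
  # Assumption: fall back to the upstream default directory; pass the directory
  # your kubelet is actually configured to read (e.g. via --cni-conf-dir).
  dir="${1:-/etc/cni/net.d}"
  if [ ! -d "$dir" ]; then
    echo "missing: $dir does not exist"
    return 1
  fi
  # Count regular files with extensions commonly used for CNI configs.
  count=$(find "$dir" -maxdepth 1 -type f \
    \( -name '*.conf' -o -name '*.conflist' -o -name '*.json' \) | wc -l | tr -d ' ')
  if [ "$count" -eq 0 ]; then
    echo "empty: no CNI config files in $dir"
    return 1
  fi
  echo "ok: $count CNI config file(s) in $dir"
}
```

Run it on the worker node (for example after `vagrant ssh`), passing whichever directory the kubelet unit there is configured with. An "empty" or "missing" result would suggest the Calico config was never written, while an "ok" result would point more towards the kubelet or runtime not picking it up.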

Nodes possibly have a similar story:

kubectl describe node <node name>
[...]
 OS Image:			Container Linux by CoreOS 1284.0.0 (Ladybug)
[...]
 Container Runtime Version:	rkt://1.21.0
 Kubelet Version:		v1.5.1+coreos.0
 Kube-Proxy Version:		v1.5.1+coreos.0
[...]
  FirstSeen	LastSeen	Count	From			SubObjectPath	Type		Reason			Message
  ---------	--------	-----	----			-------------	--------	------			-------
  5m		5m		1	{kubelet 172.17.4.101}			Warning		ImageGCFailed		unable to find data for container /
[... everything else normal ...]

Specs

$ vagrant version
Installed Version: 1.8.6
Latest Version: 1.9.1
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.1", GitCommit:"82450d03cb057bab0950214ef122b67c83fb11df", GitTreeState:"clean", BuildDate:"2016-12-14T00:57:05Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.1+coreos.0", GitCommit:"cc65f5321f9230bf9a3fa171155c1213d6e3480e", GitTreeState:"clean", BuildDate:"2016-12-14T04:08:28Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}

Hints

It was mentioned in the channel that this is because of the Hyperkube image and its compatibility with rkt [and/or] Calico. I wasn't sure how to put that theory to the test, so I can't confirm it.

pop, Jan 06 '17 00:01