contiv cni plugin is broken with cri-o runtime.

Open sbskas opened this issue 7 years ago • 4 comments

Description

When running Kubernetes with the cri-o runtime (v1.0.2), the contiv plugin is unable to add a network. When a new pod is created, the following errors are seen:

crio[1754]: time="2017-11-07 23:19:48.420019761+01:00" level=error msg="Error adding network: Contiv:Error moving to netns; Err: invalid nw name space: `/var/run/netns/k8s_kube-dns-5c48f5cf98-lgth2_kube-system_1670dee9-c2ea-11e7-a60b-782bcb3fbaa1_0-bf1577dd"

kubelet[2065]: E1107 23:19:48.460313 2065 pod_workers.go:182] Error syncing pod 1670dee9-c2ea-11e7-a60b-782bcb3fbaa1 ("kube-dns-5c48f5cf98-lgth2_kube-system(1670dee9-c2ea-11e7-a60b-782bcb3fbaa1)"), skipping: failed to "CreatePodSandbox" for "kube-dns-5c48f5cf98-lgth2_kube-system(1670dee9-c2ea-11e7-a60b-782bcb3fbaa1)" with CreatePodSandboxError: "CreatePodSandbox for pod "kube-dns-5c48f5cf98-lgth2_kube-system(1670dee9-c2ea-11e7-a60b-782bcb3fbaa1)" failed: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_kube-dns-5c48f5cf98-lgth2_kube-system_1670dee9-c2ea-11e7-a60b-782bcb3fbaa1_0(e57408547d77f78c0e99edd130b3d7a97754806976599a2553aa1bb4d3bfc466): Contiv:Error moving to netns; Err: invalid nw name space: /var/run/netns/k8s_kube-dns-5c48f5cf98-lgth2_kube-system_1670dee9-c2ea-11e7-a60b-782bcb3fbaa1_0-bf1577dd"

Expected Behavior

A new network should be created and then an interface and an address should be assigned to the deployed pod.

Observed Behavior

The pod fails to get an address and thus cannot communicate with the rest of the cluster.

Steps to Reproduce (for bugs)

1- Install cri-o v1.0.2 and remove docker.
2- Create a k8s cluster with kubeadm (v1.8.2).
3- Install contiv 1.1.6: ./install/k8s/install.sh -n <kubemaster IP>
4- netmaster ds/netplugin ds/kube-proxy should be running ok.
5- deploy/kube-dns should have 2 containers marked as running and then enters CrashLoopBackOff.

Your Environment

  • netctl version : 1.1.6
  • Orchestrator version (e.g. kubernetes, mesos, swarm): kubernetes v1.8.2, cri-o v1.0.2
  • Operating System and version : fedora 26 on Dell R710 (x4)

sbskas avatar Nov 07 '17 22:11 sbskas

A quick analysis shows that the nsToPID function expects a string in the format '/proc/<pid>/xxxx'. However, cri-o chooses to pass a symlink instead of the direct path. The CNI spec says it is legitimate to pass a symlink instead of the direct path. Maybe an os.Readlink() call before the string analysis would help. I tried to patch the code like this, without success.

sbskas avatar Nov 07 '17 22:11 sbskas

Hello guys, is there any solution?

TIA!

newtonjose avatar Jul 13 '18 12:07 newtonjose

Is this issue resolved in contiv?

kannanvr avatar Oct 28 '18 07:10 kannanvr

Is this issue Resolved on contiv?

It is now resolved in the master branch.

zhouzijiang avatar Apr 21 '19 09:04 zhouzijiang