coreos-kubernetes

Calico + rkt Fails due to CNI error

Open · pop opened this issue Jan 06 '17 · 28 comments

This was discussed in the #kubernetes-users channel on the K8s Slack but I wanted to make sure it was documented here.

TLDR

When Calico + rkt are enabled on the Vagrant multi-node cluster (and possibly other setups), all pods get stuck in the ContainerCreating phase because CNI is never set up.

Problem

Using the latest commit (79b7350fe2e45a1a5e9ed0f34a904eb10c158232), when rkt and Calico are enabled, i.e. when

# Whether to use Calico for Kubernetes network policy.
export USE_CALICO=true

# Determines the container runtime for kubernetes to use. Accepts 'docker' or 'rkt'.
export CONTAINER_RUNTIME=rkt

is set in controller-install.sh and worker-install.sh, we get the following error when trying to create a pod:

$ kubectl create -f <spec for deployment with busybox container pod>
$ kubectl get pods
NAME                         READY     STATUS              RESTARTS   AGE
app-deploy-172403538-gvwpz   0/1       ContainerCreating   0          1m
$ kubectl describe pod app-deploy-172403538-gvwpz
[...]
Events:
  FirstSeen	LastSeen	Count	From			SubObjectPath	Type		Reason			Message
  ---------	--------	-----	----			-------------	--------	------			-------
  44s		29s		6	{default-scheduler }			Warning		FailedScheduling	no nodes available to schedule pods
  12s		12s		1	{default-scheduler }			Normal		Scheduled		Successfully assigned app-deploy-172403538-gvwpz to 172.17.4.202
  12s		1s		2	{kubelet 172.17.4.202}			Warning		FailedSync		Error syncing pod, skipping: failed to SyncPod: failed to set up pod network: cni config unintialized

Nodes tell a similar story:

kubectl describe node <node name>
[...]
 OS Image:			Container Linux by CoreOS 1284.0.0 (Ladybug)
[...]
 Container Runtime Version:	rkt://1.21.0
 Kubelet Version:		v1.5.1+coreos.0
 Kube-Proxy Version:		v1.5.1+coreos.0
[...]
  FirstSeen	LastSeen	Count	From			SubObjectPath	Type		Reason			Message
  ---------	--------	-----	----			-------------	--------	------			-------
  5m		5m		1	{kubelet 172.17.4.101}			Warning		ImageGCFailed		unable to find data for container /
[... everything else normal ...]

Specs

$ vagrant version
Installed Version: 1.8.6
Latest Version: 1.9.1
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.1", GitCommit:"82450d03cb057bab0950214ef122b67c83fb11df", GitTreeState:"clean", BuildDate:"2016-12-14T00:57:05Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.1+coreos.0", GitCommit:"cc65f5321f9230bf9a3fa171155c1213d6e3480e", GitTreeState:"clean", BuildDate:"2016-12-14T04:08:28Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}

Hints

It was mentioned in the channel that this is because of the hyperkube image and its compatibility with rkt and/or Calico. I wasn't sure how to put that theory to the test, so I can't confirm it.

pop avatar Jan 06 '17 00:01 pop

For some reason, /etc/kubernetes/cni/net.d/ is empty when USE_CALICO=true. Not sure whether that's expected.
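A quick way to check that on a node (a minimal sketch; this repo's kubelet passes --cni-conf-dir=/etc/kubernetes/cni/net.d and the Calico installer is configured to write 10-calico.conf there):

# Should list 10-calico.conf once install-cni has run; on broken setups
# the directory is empty or the file is half-written.
ls -l /etc/kubernetes/cni/net.d/
cat /etc/kubernetes/cni/net.d/10-calico.conf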

yifan-gu avatar Jan 06 '17 00:01 yifan-gu

cc @heschlie

yifan-gu avatar Jan 06 '17 00:01 yifan-gu

Taking a look at this, rkt seems to be having some issues creating the Calico containers. I see:

Failed to create rkt container with error: json: error calling MarshalJSON for the type *schema.PodManifest: json: error calling MarshalJSON for type types.Volume: source for host volume cannot be empty.

The volumes section in the manifest looks ok to me, but without it the container can't come online. I'm not super familiar with rkt, are there any known issues related to volume mounts?

heschlie avatar Jan 06 '17 17:01 heschlie

@heschlie Thanks, will try to reproduce by creating the calico pod.

yifan-gu avatar Jan 06 '17 20:01 yifan-gu

Having the same error. At first I thought it was because, under Kubernetes, rkt errors when mounting a hostPath that doesn't exist, whereas with Docker the daemon creates the directory on the host if it does not exist.

I'm going to test this theory by creating those directories required by hostPath mounts before running the scripts.
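A minimal sketch of that test, pre-creating the hostPath sources the calico-node manifest uses (paths assumed from this repo's scripts):

# Run on each node before controller-install.sh / worker-install.sh so
# rkt never sees a volume whose host source is missing.
mkdir -p /lib/modules
mkdir -p /var/run/calico
mkdir -p /opt/cni/bin
mkdir -p /etc/kubernetes/cni/net.d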

UPDATE: Took a bit of trial and error, but here is the gist of the situation. The calico-node template in controller-install.sh tries to mount the DNS file /etc/resolv.conf into the calico-node:v0.23.0 container for use in TLS generation. However, the error Kubernetes throws, "source for host volume cannot be empty.", is vague. When I comment out the DNS mount point, pod startup works fine. My assumption was that this is a point of difference from Docker: rkt cannot mount individual files through Kubernetes, but Docker can.

Removing the DNS mount did end up breaking Calico, and the CNI config files were never fully written. The fix for this problem is to find a way of getting resolv.conf into the Calico template without trying to mount it as a file. @yifan-gu @heschlie

Promaethius avatar Jan 16 '17 03:01 Promaethius

Having the same problem. Are you sure that rkt can't mount individual files? https://coreos.com/rkt/docs/latest/subcommands/run.html says, under the heading "Mount Volumes into a Pod", that host volumes "can expose a directory or a file from the host to the pod."

kfirufk avatar Jan 16 '17 14:01 kfirufk

rkt has the capability, but I don't think Kubernetes is handling single-file mounts correctly. The exception being thrown comes from v.Source = "", which doesn't really make sense.

UPDATE: It looks like the error is thrown because resolv.conf is already passed as a mount in the rkt environment option:

Environment="RKT_OPTS=--uuid-file-save=${uuid_file} \
  --volume dns,kind=host,source=/run/systemd/resolve/resolv.conf \
  --mount volume=dns,target=/etc/resolv.conf \

This is why the container starts when the hostPath is removed. However, the CNI config that is written only contains one bracket:

core@rkt-calico ~ $ cat /etc/kubernetes/cni/net.d/10*
{

These are the logs from the container:

core@rkt-calico ~ $ sudo journalctl -M rkt-471ce5b0-02c3-4fe0-87ff-1bb593a7cbda -t install-cni
-- Logs begin at Mon 2017-01-16 20:40:43 UTC, end at Mon 2017-01-16 20:48:34 UTC. --
Jan 16 20:40:43 rkt-calico install-cni[6]: Installing any TLS assets from /calico-secrets
Jan 16 20:40:43 rkt-calico install-cni[6]: cp: can't stat '/calico-secrets/*': No such file or directory
Jan 16 20:40:43 rkt-calico install-cni[6]: Wrote Calico CNI binaries to /host/opt/cni/bin/
Jan 16 20:40:44 rkt-calico install-cni[6]: CNI plugin version: v1.5.2
Jan 16 20:40:44 rkt-calico install-cni[6]: Wrote CNI config: {
Jan 16 20:40:44 rkt-calico install-cni[6]: Done configuring CNI. Sleep=true

Promaethius avatar Jan 16 '17 19:01 Promaethius

cc @jonboulle @squeed

yifan-gu avatar Jan 17 '17 18:01 yifan-gu

It looks like the config map that contains the CNI config template isn't being passed correctly through rkt. I'm assuming this is an issue with a multi-line environment variable, since only the first line, the {, is being passed.

My idea for a fix that works with both Docker and rkt is to mount the config map as a file in the container, then use an additional command arg in the Calico DaemonSet YAML to cat the file into the environment variable that the script https://github.com/projectcalico/cni-plugin/blob/master/k8s-install/scripts/install-cni.sh is looking for.
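Roughly, the container entrypoint becomes something like this sketch (the /host/cni_network_config/config.conf path is just wherever the configmap gets mounted):

# Read the mounted config file into the env var that install-cni.sh
# expects, then hand off to the unmodified installer script.
export CNI_NETWORK_CONFIG=$(cat /host/cni_network_config/config.conf)
exec /install-cni.sh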

I'll submit a pull request if I'm successful.

UPDATE: Tests were successful. This method is a workaround for an actual rkt issue but it works nonetheless.

There is a new problem, however, in that calico_node's runtime is docker by default:

2017-01-20 22:12:27,693 26263 [kube-system/kubernetes-dashboard-3543765157-82bgc] INFO Calico CNI execution complete, rc=100
2017-01-20 22:12:28,405 26295 [kube-system/kubernetes-dashboard-3543765157-82bgc] INFO Starting Calico CNI plugin execution
2017-01-20 22:12:28,408 26295 [kube-system/kubernetes-dashboard-3543765157-82bgc] INFO Remove network 'calico' from container: b1a0f666-df5b-11e6-b8a9-0cc47ab58acc
2017-01-20 22:12:28,408 26295 [kube-system/kubernetes-dashboard-3543765157-82bgc] INFO Releasing IP address
2017-01-20 22:12:28,408 26295 [kube-system/kubernetes-dashboard-3543765157-82bgc] INFO Using IPAM plugin at: /opt/cni/bin/host-local
2017-01-20 22:12:28,420 26295 [kube-system/kubernetes-dashboard-3543765157-82bgc] WARNING No Calico Endpoint for workload: kube-system.kubernetes-dashboard-3543765157-82bgc
2017-01-20 22:12:28,420 26295 [kube-system/kubernetes-dashboard-3543765157-82bgc] INFO Calico CNI execution complete, rc=0
2017-01-20 22:12:30,701 26386 [kube-system/kube-dns-782804071-4g78j] INFO Starting Calico CNI plugin execution
2017-01-20 22:12:30,713 26386 [kube-system/kube-dns-782804071-4g78j] ERROR Container b16c62ef-df5b-11e6-b8a9-0cc47ab58acc was not found.
2017-01-20 22:12:30,714 26386 [kube-system/kube-dns-782804071-4g78j] ERROR Unhandled Exception killed plugin
Traceback (most recent call last):
  File "<string>", line 773, in main
  File "<string>", line 181, in execute
  File "<string>", line 200, in add
  File "calico_cni/container_engines.py", line 52, in uses_host_networking
  File "calico_cni/container_engines.py", line 70, in _docker_inspect
KeyError: 'Unable to inspect container.'
2017-01-20 22:12:30,714 26386 [kube-system/kube-dns-782804071-4g78j] ERROR CNI Error: {
  "msg": "Unhandled Exception killed plugin",
  "cniVersion": "0.1.0",
  "code": 100,
  "details": null
}
Traceback (most recent call last):
  File "<string>", line 773, in main
  File "<string>", line 181, in execute
  File "<string>", line 200, in add
  File "calico_cni/container_engines.py", line 52, in uses_host_networking
  File "calico_cni/container_engines.py", line 70, in _docker_inspect
KeyError: 'Unable to inspect container.'
2017-01-20 22:12:30,715 26386 [kube-system/kube-dns-782804071-4g78j] INFO Calico CNI execution complete, rc=100
2017-01-20 22:12:31,387 26417 [kube-system/kube-dns-782804071-4g78j] INFO Starting Calico CNI plugin execution
2017-01-20 22:12:31,390 26417 [kube-system/kube-dns-782804071-4g78j] INFO Remove network 'calico' from container: b16c62ef-df5b-11e6-b8a9-0cc47ab58acc
2017-01-20 22:12:31,390 26417 [kube-system/kube-dns-782804071-4g78j] INFO Releasing IP address
2017-01-20 22:12:31,390 26417 [kube-system/kube-dns-782804071-4g78j] INFO Using IPAM plugin at: /opt/cni/bin/host-local
2017-01-20 22:12:31,404 26417 [kube-system/kube-dns-782804071-4g78j] WARNING No Calico Endpoint for workload: kube-system.kube-dns-782804071-4g78j
2017-01-20 22:12:31,404 26417 [kube-system/kube-dns-782804071-4g78j] INFO Calico CNI execution complete, rc=0

Promaethius avatar Jan 19 '17 21:01 Promaethius

There is a new problem, however, in that calico_node's runtime is docker by default.

Looks like the version of Calico shipping in the hyperkube is really old. Newer versions of Calico do not have that issue, and when using the self-hosted install I would not expect the Calico binaries within the hyperkube container to be used at all.

caseydavenport avatar Jan 25 '17 16:01 caseydavenport

Removing the DNS mount did end up breaking Calico, and the CNI config files were never fully written

I do not think Calico should need the /etc/resolv.conf mount. Can you elaborate on what went wrong when this mount was removed?

caseydavenport avatar Jan 25 '17 16:01 caseydavenport

Hi. I tried with the latest versions and still encountered the same problem. https://github.com/projectcalico/cni-plugin/issues/253

kfirufk avatar Jan 25 '17 17:01 kfirufk

@kfirufk that issue is an rkt issue about accepting a multi-line environment variable. This was my Docker/rkt-compatible fix for it:

- name: install-cni
  image: quay.io/calico/cni:v1.5.5
  command: ["/bin/sh", "-c"]
  args: ["export CNI_NETWORK_CONFIG=$(cat /host/cni_network_config/config.conf) && /install-cni.sh"]
  env:
    # The location of the Calico etcd cluster.
    - name: ETCD_ENDPOINTS
      valueFrom:
        configMapKeyRef:
          name: calico-config
          key: etcd_endpoints
    # CNI configuration filename
    - name: CNI_CONF_NAME
      value: "10-calico.conf"
  volumeMounts:
    - mountPath: /host/opt/cni/bin
      name: cni-bin-dir
    - mountPath: /host/etc/cni/net.d
      name: cni-net-dir
    - mountPath: /calico-secrets
      name: etcd-certs
    # The CNI network config to install on each node.
    - mountPath: /host/cni_network_config
      name: cni-config

With the configmap mounted as a volume in:

volumes:
  # Used by calico/node.
  - name: lib-modules
    hostPath:
      path: /lib/modules
  - name: var-run-calico
    hostPath:
      path: /var/run/calico
  # Used to install CNI.
  - name: cni-bin-dir
    hostPath:
      path: /opt/cni/bin
  - name: cni-net-dir
    hostPath:
      path: /etc/kubernetes/cni/net.d
  # Mount in the etcd TLS secrets.
  - name: etcd-certs
    secret:
      secretName: calico-etcd-secrets
  - name: cni-config
    configMap:
      name: calico-config
      items:
        - key: cni_network_config
          path: config.conf

Promaethius avatar Jan 25 '17 17:01 Promaethius

Btw, you have to use cat <<'EOF' so that the cat command for the environment variable is written to the file literally. This means you need to split the node/CNI manifest and the config file into two separate cat sections, so the etcd address is expanded into the config file with EOF, while the $(cat ...) expression is written verbatim into the CNI/node manifest with 'EOF'.
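A minimal sketch of the quoting difference (file names hypothetical; the ETCD_ENDPOINTS variable is the one from the install scripts):

# Unquoted delimiter: ${ETCD_ENDPOINTS} is expanded while the file is written.
cat << EOF > /tmp/calico-config-excerpt.yaml
etcd_endpoints: "${ETCD_ENDPOINTS}"
EOF

# Quoted delimiter: written verbatim, so the $(cat ...) expression survives
# into the manifest and is only evaluated later, inside the container.
cat << 'EOF' > /tmp/calico-node-excerpt.yaml
args: ["export CNI_NETWORK_CONFIG=$(cat /host/cni_network_config/config.conf) && /install-cni.sh"]
EOF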

Promaethius avatar Jan 25 '17 17:01 Promaethius

@caseydavenport CNI doesn't need it, but rkt has a section of global mounts passed into the service file with resolv.conf already specified. When rkt tried to mount it twice, it broke. This is why I removed that volume from the CNI manifest: both because it isn't needed and because it's already a global mount.

Promaethius avatar Jan 25 '17 17:01 Promaethius

@caseydavenport Also included in those mounts are the Calico binaries. I still don't understand why hyperkube uses the old binaries when the new ones are mounted into the hyperkube rkt instance.

Promaethius avatar Jan 25 '17 17:01 Promaethius

Turns out most of the issues surrounding the Calico binaries mounting into rkt were caused by this if statement:

# To run a self hosted Calico install it needs to be able to write to the CNI dir
if [ "${USE_CALICO}" = "true" ]; then
    export CALICO_OPTS="--volume cni-bin,kind=host,source=/opt/cni/bin \
                        --mount volume=cni-bin,target=/opt/cni/bin"
else
    export CALICO_OPTS=""
fi

which I placed into the init function, before the templates, in the form:

# To run a self hosted Calico install it needs to be able to write to the CNI dir
if [ ${USE_CALICO} = "true" ]; then
    local CALICO_OPTS="--volume cni-bin,kind=host,source=/opt/cni/bin \
      --mount volume=cni-bin,target=/opt/cni/bin"
    echo "RKT Configured for Calico Binaries"
else
    local CALICO_OPTS=""
fi

The correct binaries now mount; however, there is a known issue preventing felix from running on the node daemons:

https://github.com/kubernetes/minikube/issues/726

Promaethius avatar Jan 27 '17 06:01 Promaethius

@Promaethius, thanks for your response. I tried to apply your fix to canal.yaml and I get the same results; I'm probably missing something.

I pasted canal.yaml to https://paste.pound-python.org/show/lg5o1T04d3zyJdSGn8WY/ please let me know what I'm missing :) thanks!

kfirufk avatar Jan 27 '17 07:01 kfirufk

@kfirufk it turns out rkt is intrinsically flawed when running privileged containers. The node daemon felix requires RW privileges on /proc/sys, which is mounted incorrectly. I'm trying to open a dialogue to find a workaround; otherwise the bugfix is slated for the rkt 1.24 milestone. I've referenced the issue in my last post and will post back when I hear something.

Promaethius avatar Jan 27 '17 07:01 Promaethius

Going to try something when I get home tonight: run the calico-node pod with the k8s annotation rkt.alpha.kubernetes.io/stage1-name-override: coreos.com/rkt/stage1-fly. This is supposed to give the pod the chroot-only isolation and RW permissions it needs to access the /proc/sys sockets with full privileges.
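For example, something like this (a hypothetical sketch; the annotation is the one named above, applied to the DaemonSet's pod template):

# Patch the calico-node DaemonSet so its pods run under rkt's stage1-fly.
kubectl patch daemonset calico-node --namespace kube-system \
  -p '{"spec":{"template":{"metadata":{"annotations":{"rkt.alpha.kubernetes.io/stage1-name-override":"coreos.com/rkt/stage1-fly"}}}}}'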

Promaethius avatar Jan 27 '17 21:01 Promaethius

I got rktnetes + Calico working flawlessly with a cross-compatible script. You can diff my fork of this GitHub project to see what I did, because the explanation is too much for my mobile screen to handle right now.

Promaethius avatar Feb 09 '17 03:02 Promaethius

@Promaethius - yay! great news! i'll look into it this weekend.

kfirufk avatar Feb 09 '17 08:02 kfirufk

@Promaethius Could you please share your solution? I couldn't find any commits on your profile page related to this.

leodotcloud avatar Feb 17 '17 00:02 leodotcloud

@Promaethius - there's something I don't understand in the multi-node/generic/controller-install.sh script of your forked project. The start_addons function uses the "/host/manifests" directory, but I don't run it from a container; I run it from Container Linux itself, so the manifests are actually located under "/srv/kubernetes/manifests". I'm probably missing something, but what? :)

kfirufk avatar Feb 18 '17 09:02 kfirufk

@kfirufk sorry about that; I made some changes today and had the wrong directory for serving the addons. I'm also making changes to include CephFS as the storage provider, an nginx-ingress controller (my custom image with Brotli compression enabled), and kube-lego by default, so Kubernetes is a little more flexible for projects. As far as Calico goes, did you have any questions?

Promaethius avatar Feb 18 '17 10:02 Promaethius

@Promaethius, I didn't finish applying all your changes. I'm working on forking your script into a more easily editable form, with each file in a relevant directory instead of one big bash script. Regarding Calico, I'll know soon enough :) I can't seem to find how to open bug reports in your forked project. 40-ExecStartPre-symlink.conf.conf should have just one ".conf", I'm guessing :)

kfirufk avatar Feb 19 '17 22:02 kfirufk

@kfirufk by all means edit it. Feel free to submit pull requests by the way. I'll do my best to explain in this thread why I found calico would not work with this generic script:

CALICO_OPTS: With export in the initialization section of the script, the CALICO_OPTS variable remained blank when passed to the hyperkube image. The correct CNI binary folder would not be mounted and the newer Calico plugin would not be written, resulting in rkt networking failures. Original:

# To run a self hosted Calico install it needs to be able to write to the CNI dir
if [ "${USE_CALICO}" = "true" ]; then
    export CALICO_OPTS="--volume cni-bin,kind=host,source=/opt/cni/bin \
                        --mount volume=cni-bin,target=/opt/cni/bin"
else
    export CALICO_OPTS=""
fi

Fixed:

    local TEMPLATE=/etc/systemd/system/kubelet.service
    local uuid_file="/var/run/kubelet-pod.uuid"
    if [ ! -f $TEMPLATE ]; then
        echo "TEMPLATE: $TEMPLATE"
        mkdir -p $(dirname $TEMPLATE)
        # To run a self hosted Calico install it needs to be able to write to the CNI dir
        if [ ${USE_CALICO} = "true" ]; then
            local CALICO_OPTS="--volume cni-bin,kind=host,source=/opt/cni/bin \
              --mount volume=cni-bin,target=/opt/cni/bin"
            mkdir -p /lib/modules
            mkdir -p /var/run/calico
            mkdir -p /opt/cni/bin
            mkdir -p /etc/kubernetes/cni/net.d
            echo "RKT Configured for Calico Binaries"
        else
            local CALICO_OPTS=""
        fi
        cat << EOF > $TEMPLATE
[Service]
Environment=KUBELET_VERSION=${K8S_VER}
Environment=KUBELET_ACI=${HYPERKUBE_IMAGE_REPO}
Environment="RKT_OPTS=--uuid-file-save=${uuid_file} \
  --volume dns,kind=host,source=/run/systemd/resolve/resolv.conf \
  --mount volume=dns,target=/etc/resolv.conf \
  --volume rkt,kind=host,source=/opt/bin/host-rkt \
  --mount volume=rkt,target=/usr/bin/rkt \
  --volume var-lib-rkt,kind=host,source=/var/lib/rkt \
  --mount volume=var-lib-rkt,target=/var/lib/rkt \
  --volume stage,kind=host,source=/tmp \
  --mount volume=stage,target=/tmp \
  --volume var-log,kind=host,source=/var/log \
  --mount volume=var-log,target=/var/log \
  ${CALICO_OPTS}"
ExecStartPre=/usr/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/usr/bin/mkdir -p /opt/cni/bin
ExecStartPre=/usr/bin/mkdir -p /var/log/containers
ExecStartPre=-/usr/bin/rkt rm --uuid-file=${uuid_file}
ExecStart=/usr/lib/coreos/kubelet-wrapper \
  --api-servers=http://127.0.0.1:8080 \
  --register-schedulable=true \
  --cni-conf-dir=/etc/kubernetes/cni/net.d \
  --network-plugin=cni \
  --container-runtime=${CONTAINER_RUNTIME} \
  --rkt-path=/usr/bin/rkt \
  --rkt-stage1-image=coreos.com/rkt/stage1-coreos \
  --allow-privileged=true \
  --pod-manifest-path=/etc/kubernetes/manifests \
  --hostname-override=${ADVERTISE_IP} \
  --cluster_dns=${DNS_SERVICE_IP} \
  --cluster_domain=cluster.local
ExecStop=-/usr/bin/rkt stop --uuid-file=${uuid_file}
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
EOF
    fi
...

Calico-Node:

  1. Because of a known issue with rkt, containers running on anything other than a stage1-fly image are unable to write into /proc/sys. This crashes the image, since calico-node uses an IP socket there. Thanks to some folks over at rkt, a temporary fix was suggested.
  2. An old version of the node was being used.
  3. /etc/resolv.conf would throw errors if mounted with rkt. (The mount is not needed for proper function, since Kubernetes already handles name service.)

Calico-CNI: Another rkt bug: it cannot accept multi-line environment variables. Anyone who hit cryptic errors about brackets would see immediately what went wrong by reading the written CNI config file: only the first line of the configuration was being received, which was just a single open {. The Docker/rkt-compatible fix for this was to mount the config as a file and then echo it into the correct environment variable before running the installer script.

Original:

kind: DaemonSet
apiVersion: extensions/v1beta1
metadata:
  name: calico-node
  namespace: kube-system
  labels:
    k8s-app: calico-node
spec:
  selector:
    matchLabels:
      k8s-app: calico-node
  template:
    metadata:
      labels:
        k8s-app: calico-node
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
        scheduler.alpha.kubernetes.io/tolerations: |
          [{"key": "dedicated", "value": "master", "effect": "NoSchedule" },
           {"key":"CriticalAddonsOnly", "operator":"Exists"}]
    spec:
      hostNetwork: true
      containers:
        # Runs calico/node container on each Kubernetes node.  This 
        # container programs network policy and routes on each
        # host.
        - name: calico-node
          image: quay.io/calico/node:v0.23.0
          env:
            # The location of the Calico etcd cluster.
            - name: ETCD_ENDPOINTS
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_endpoints
            # Choose the backend to use. 
            - name: CALICO_NETWORKING_BACKEND
              value: "none"
            # Disable file logging so 'kubectl logs' works.
            - name: CALICO_DISABLE_FILE_LOGGING
              value: "true"
            - name: NO_DEFAULT_POOLS
              value: "true"
          securityContext:
            privileged: true
          volumeMounts:
            - mountPath: /lib/modules
              name: lib-modules
              readOnly: false
            - mountPath: /var/run/calico
              name: var-run-calico
              readOnly: false
            - mountPath: /etc/resolv.conf
              name: dns
              readOnly: true
        # This container installs the Calico CNI binaries
        # and CNI network config file on each node.
        - name: install-cni
          image: quay.io/calico/cni:v1.5.2
          imagePullPolicy: Always
          command: ["/install-cni.sh"]
          env:
            # CNI configuration filename
            - name: CNI_CONF_NAME
              value: "10-calico.conf"
            # The location of the Calico etcd cluster.
            - name: ETCD_ENDPOINTS
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_endpoints
            # The CNI network config to install on each node.
            - name: CNI_NETWORK_CONFIG
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: cni_network_config
          volumeMounts:
            - mountPath: /host/opt/cni/bin
              name: cni-bin-dir
            - mountPath: /host/etc/cni/net.d
              name: cni-net-dir
      volumes:
        # Used by calico/node.
        - name: lib-modules
          hostPath:
            path: /lib/modules
        - name: var-run-calico
          hostPath:
            path: /var/run/calico
        # Used to install CNI.
        - name: cni-bin-dir
          hostPath:
            path: /opt/cni/bin
        - name: cni-net-dir
          hostPath:
            path: /etc/kubernetes/cni/net.d
        - name: dns
          hostPath:
            path: /etc/resolv.conf

Fixed:

# This manifest installs the calico/node container, as well
# as the Calico CNI plugins and network config on
# each master and worker node in a Kubernetes cluster.
kind: DaemonSet
apiVersion: extensions/v1beta1
metadata:
  name: calico-node
  namespace: kube-system
  labels:
    k8s-app: calico-node
spec:
  selector:
    matchLabels:
      k8s-app: calico-node
  template:
    metadata:
      labels:
        k8s-app: calico-node
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
        scheduler.alpha.kubernetes.io/tolerations: |
          [{"key": "dedicated", "value": "master", "effect": "NoSchedule" },
           {"key":"CriticalAddonsOnly", "operator":"Exists"}]
    spec:
      hostNetwork: true
      containers:
        # Runs calico/node container on each Kubernetes node.  This
        # container programs network policy and routes on each
        # host.
        - name: calico-node
          image: quay.io/calico/node:v1.0.1
          command: ["/bin/sh", "-c"]
          args: ["mount -o remount,rw /proc/sys && start_runit"]
          env:
            # The location of the Calico etcd cluster.
            - name: ETCD_ENDPOINTS
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_endpoints
            # Choose the backend to use.
            - name: CALICO_NETWORKING_BACKEND
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: calico_backend
            # Disable file logging so `kubectl logs` works.
            - name: CALICO_DISABLE_FILE_LOGGING
              value: "true"
            # Don't configure a default pool.  This is done by the Job
            # below.
            - name: NO_DEFAULT_POOLS
              value: "true"
            - name: FELIX_LOGSEVERITYSCREEN
              value: "info"
            # Location of the CA certificate for etcd.
            - name: ETCD_CA_CERT_FILE
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_ca
            # Location of the client key for etcd.
            - name: ETCD_KEY_FILE
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_key
            # Location of the client certificate for etcd.
            - name: ETCD_CERT_FILE
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_cert
            # Auto-detect the BGP IP address.
            - name: IP
              value: ""
          securityContext:
            privileged: true
          volumeMounts:
            - mountPath: /lib/modules
              name: lib-modules
              readOnly: false
            - mountPath: /var/run/calico
              name: var-run-calico
              readOnly: false
            - mountPath: /calico-secrets
              name: etcd-certs
        # This container installs the Calico CNI binaries
        # and CNI network config file on each node.
        - name: install-cni
          image: quay.io/calico/cni:v1.5.5
          command: ["/bin/sh", "-c"]
          args: ["export CNI_NETWORK_CONFIG=$(cat /host/cni_network_config/config.conf) && /install-cni.sh"]
          env:
            # The location of the Calico etcd cluster.
            - name: ETCD_ENDPOINTS
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_endpoints
            # CNI configuration filename
            - name: CNI_CONF_NAME
              value: "10-calico.conf"
          securityContext:
            privileged: true
          volumeMounts:
            - mountPath: /host/opt/cni/bin
              name: cni-bin-dir
            - mountPath: /host/etc/cni/net.d
              name: cni-net-dir
            - mountPath: /calico-secrets
              name: etcd-certs
            # The CNI network config to install on each node.
            - mountPath: /host/cni_network_config
              name: cni-config
      volumes:
        # Used by calico/node.
        - name: lib-modules
          hostPath:
            path: /lib/modules
        - name: var-run-calico
          hostPath:
            path: /var/run/calico
        # Used to install CNI.
        - name: cni-bin-dir
          hostPath:
            path: /opt/cni/bin
        - name: cni-net-dir
          hostPath:
            path: /etc/kubernetes/cni/net.d
        # Mount in the etcd TLS secrets.
        - name: etcd-certs
          secret:
            secretName: calico-etcd-secrets
        - name: cni-config
          configMap:
            name: calico-config
            items:
            - key: cni_network_config
              path: config.conf

Promaethius avatar Feb 20 '17 00:02 Promaethius

@Promaethius - still having problems starting Calico on the worker node. When you have time, please take a look: http://stackoverflow.com/questions/42741272/calico-node-fails-starting-on-worker-node

thanks

kfirufk avatar Mar 11 '17 22:03 kfirufk