Kube-router shouldn't rely on kube-proxy configmap

Open kvaps opened this issue 4 years ago • 15 comments

Hi, this PR https://github.com/kubernetes/kubernetes/pull/89593 introduces an important change that provides a seamless upgrade for clusters using kubeadm, without the risk of having kube-proxy installed again; however, it relies on the kube-proxy config being absent.

The problem is that kube-router, when deployed as a service proxy, relies on this kube-proxy config:

https://github.com/cloudnativelabs/kube-router/blob/5c5dc411232495f11f6292fb9e4168ae6e228047/daemonset/kubeadm-kuberouter-all-features.yaml#L53-L57
https://github.com/cloudnativelabs/kube-router/blob/5c5dc411232495f11f6292fb9e4168ae6e228047/daemonset/kubeadm-kuberouter-all-features.yaml#L83-L84
https://github.com/cloudnativelabs/kube-router/blob/5c5dc411232495f11f6292fb9e4168ae6e228047/daemonset/kubeadm-kuberouter-all-features.yaml#L127-L132

kvaps avatar Mar 30 '20 10:03 kvaps

related https://github.com/cloudnativelabs/kube-router/issues/833

murali-reddy avatar Mar 31 '20 11:03 murali-reddy

Came to say that doing this fixed an issue in k8s v1.17.2 where kube-router was trying to connect to the cluster IP for the API. Using the manifest above, I can confirm it works.

ghost avatar Mar 31 '20 15:03 ghost

Adding to this: we should also update our documentation (as proposed in #833) to show how to install kube-router using kubeadm without kube-proxy, once we get the configmap situation sorted.

aauren avatar Apr 25 '20 08:04 aauren

Came to say that doing this fixed an issue in k8s v1.17.2 where kube-router was trying to connect to the cluster IP for the API. Using the manifest above, I can confirm it works.

This is exactly the problem.

The default kubeconfig that the pod gets when not (ab)using the one from kube-proxy points at the cluster IP to talk to the API server. But without kube-proxy, kube-router is the component that actually makes cluster IPs work in the first place. A classic chicken-and-egg problem.

The difference between the kube-proxy-provided kubeconfig and the default one is that it uses the API server's IP directly (aka controlPlaneEndpoint in kubeadm) without going through the cluster IP.

In my cluster there is a cluster-info ConfigMap that contains the controlPlaneEndpoint IP. Maybe that could be used to generate a working kubeconfig for kube-router? I'm not sure, though, whether that ConfigMap exists in every possible/supported cluster.

kubectl -n kube-public get cm cluster-info -o yaml
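
For illustration, something like this could pull the endpoint out of that ConfigMap (just a sketch; it assumes the kubeadm layout, where the ConfigMap stores a kubeconfig under the "kubeconfig" data key):

# Extract the API server endpoint from the cluster-info ConfigMap
# (assumes kubeadm's "kubeconfig" data key).
kubectl -n kube-public get cm cluster-info -o jsonpath='{.data.kubeconfig}' \
  | awk '/server:/ {print $2}'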

asteven avatar May 07 '20 21:05 asteven

I agree that this is a problem. kube-router should not be relying on a config from another component (especially one that it is often meant to replace) in order to function correctly.

@asteven that cluster-info ConfigMap is a good find! However, it's in the wrong namespace. Thinking it through, we could use a hyperkube initContainer to copy it over to the kube-system namespace, but that starts to feel really hacky really fast. In the end I wonder if we'd end up in the same place as before, "borrowing" someone else's config and trying to make it work for us.
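
For what it's worth, the "copy it over" idea would amount to something like this (a hypothetical sketch only, not something we'd want to ship; the target ConfigMap name is illustrative):

# Copy the kubeconfig data out of kube-public/cluster-info and recreate it
# as a ConfigMap in kube-system.
kubectl -n kube-public get cm cluster-info -o jsonpath='{.data.kubeconfig}' > /tmp/cluster-info.conf
kubectl -n kube-system create configmap cluster-info --from-file=kubeconfig.conf=/tmp/cluster-info.conf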

I think in the end, it is correct for kube-router to assume that this configmap is provided for us. I think that the best way forward would be if kubeadm could provide our config for us the same way it does for other addon elements like kube-proxy and coredns. However, I'm not super familiar with kubeadm. Looking through their docs quickly I didn't see a lot of information about extending kubeadm, or adapting it to other proxy providers.

Does anyone in this thread have any information about how we could potentially configure or customize kubeadm to provide us this config in a way that doesn't rely on kube-proxy?

aauren avatar May 09 '20 14:05 aauren

I think in the end, it is correct for kube-router to assume that this configmap is provided for us. I think that the best way forward would be if kubeadm could provide our config for us the same way it does for other addon elements like kube-proxy and coredns. However, I'm not super familiar with kubeadm. Looking through their docs quickly I didn't see a lot of information about extending kubeadm, or adapting it to other proxy providers.

Unfortunately that is not going to happen. Upstream does not want kubeadm to know any details about addons. They will not add something just for kube-router.

But I think I found something that gives us what we need and exists on every Kubernetes node.

/etc/kubernetes/kubelet.conf

If we mount that as a volume we can run something like this on startup:

# Extract the API server URL from the kubelet's own kubeconfig.
api_server="$(awk '/server: / {print $2}' /etc/kubernetes/kubelet.conf)"

# Write a kubeconfig that uses that URL directly, authenticating with the
# pod's service account token and CA bundle.
cat > /var/lib/kube-router/kubeconfig << DONE
apiVersion: v1
kind: Config
clusters:
- cluster:
    certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    server: ${api_server}
  name: default
contexts:
- context:
    cluster: default
    namespace: default
    user: default
  name: default
current-context: default
users:
- name: default
  user:
    tokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
DONE

asteven avatar May 09 '20 20:05 asteven

With this patch I don't need the kube-proxy config map anymore.

I basically use some shell code in the init container to generate a working kubeconfig for the main container to use.

diff --git a/daemonset/kubeadm-kuberouter-all-features.yaml b/daemonset/kubeadm-kuberouter-all-features.yaml
index 08617260..5fcfd723 100644
--- a/daemonset/kubeadm-kuberouter-all-features.yaml
+++ b/daemonset/kubeadm-kuberouter-all-features.yaml
@@ -55,7 +55,7 @@ spec:
         - --run-firewall=true
         - --run-service-proxy=true
         - --bgp-graceful-restart=true
-        - --kubeconfig=/var/lib/kube-router/kubeconfig
+        - --kubeconfig=/pod/kubeconfig
         env:
         - name: NODE_NAME
           valueFrom:
@@ -81,9 +81,8 @@ spec:
           readOnly: true
         - name: cni-conf-dir
           mountPath: /etc/cni/net.d
-        - name: kubeconfig
-          mountPath: /var/lib/kube-router
-          readOnly: true
+        - name: pod-shared
+          mountPath: /pod
         - name: xtables-lock
           mountPath: /run/xtables.lock
           readOnly: false
@@ -94,7 +93,8 @@ spec:
         command:
         - /bin/sh
         - -c
-        - set -e -x;
+        - |
+          set -e -x;
           if [ ! -f /etc/cni/net.d/10-kuberouter.conflist ]; then
             if [ -f /etc/cni/net.d/*.conf ]; then
               rm -f /etc/cni/net.d/*.conf;
@@ -103,9 +103,37 @@ spec:
             cp /etc/kube-router/cni-conf.json ${TMP};
             mv ${TMP} /etc/cni/net.d/10-kuberouter.conflist;
           fi
+
+          # Generate a working kubeconfig file.
+          api_server="$(awk '/server: / {print $2}' /host/kubelet.conf)"
+          cat > /pod/kubeconfig << DONE
+          apiVersion: v1
+          kind: Config
+          clusters:
+          - cluster:
+              certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
+              server: ${api_server}
+            name: default
+          contexts:
+          - context:
+              cluster: default
+              namespace: default
+              user: default
+            name: default
+          current-context: default
+          users:
+          - name: default
+            user:
+              tokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
+          DONE
         volumeMounts:
         - name: cni-conf-dir
           mountPath: /etc/cni/net.d
+        - name: kubelet-conf
+          mountPath: /host/kubelet.conf
+          readOnly: true
+        - name: pod-shared
+          mountPath: /pod
         - name: kube-router-cfg
           mountPath: /etc/kube-router
       hostNetwork: true
@@ -128,12 +156,12 @@ spec:
       - name: kube-router-cfg
         configMap:
           name: kube-router-cfg
-      - name: kubeconfig
-        configMap:
-          name: kube-proxy
-          items:
-          - key: kubeconfig.conf
-            path: kubeconfig
+      - name: kubelet-conf
+        hostPath:
+          path: /etc/kubernetes/kubelet.conf
+      - name: pod-shared
+        emptyDir:
+          medium: Memory
       - name: xtables-lock
         hostPath:
           path: /run/xtables.lock
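
To try this out, save the diff locally and apply it to a checkout of the repo, e.g. (the patch file name is just an example):

# Apply the patch and deploy the modified manifest.
git clone https://github.com/cloudnativelabs/kube-router.git
cd kube-router
git apply /path/to/kubelet-conf.patch   # the diff shown above
kubectl apply -f daemonset/kubeadm-kuberouter-all-features.yaml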

asteven avatar May 24 '20 22:05 asteven

Thanks @asteven, it works. Would you mind explaining how the kubeconfig is generated? Any gotchas with generating it like this?

murali-reddy avatar May 26 '20 09:05 murali-reddy

@asteven ping

aauren avatar Jul 10 '20 06:07 aauren

Sorry, I thought the solution was self-explanatory.

The kubelet talks to the API server via the external IP, not the internal one (cluster IP). That's why it works even when no CNI is deployed yet. kube-router has the same requirement, so we take the external IP from the kubelet's config and use it for kube-router.

The solution I am using:

  • mounts the host's kubelet config file inside the pod at /host/kubelet.conf
  • mounts an in-memory volume at /pod to share data between the containers
  • uses a script in the init container to extract the external API server IP from the kubelet's config file and generate a working kubeconfig for kube-router with that IP
  • stores that generated kubeconfig at /pod/kubeconfig to share it between the init container and the kube-router container
  • configures kube-router to use the kubeconfig from /pod/kubeconfig

This works very well for me for the clusters I deploy on bare metal with kubeadm. I don't know whether this solution works in other setups, e.g. in the various clouds or with some other provisioner.
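
If you want to sanity check the result on a live cluster, you can inspect the generated file inside any kube-router pod (this assumes the stock manifest's k8s-app=kube-router label; the pod name is a placeholder):

# List the kube-router pods, then dump the generated kubeconfig from one:
kubectl -n kube-system get pods -l k8s-app=kube-router -o name
kubectl -n kube-system exec <kube-router-pod> -- cat /pod/kubeconfig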

asteven avatar Jul 10 '20 06:07 asteven

Thanks @asteven, it works. Would you mind explaining how the kubeconfig is generated? Any gotchas with generating it like this?

BTW, kubeadm reads some cluster-specific information (e.g. the public IP and CA) from

kubectl get cm -n kube-public cluster-info -o yaml

It is publicly accessible.

kvaps avatar Jul 10 '20 07:07 kvaps

@kvaps that was also my first thought. But where should kubectl connect to get that info? This is exactly the chicken-and-egg problem.

asteven avatar Jul 10 '20 08:07 asteven

Ping?

rgl avatar Aug 08 '21 09:08 rgl

If someone wants to put together a PR for this work, it should be relatively easy to implement @asteven's solution. In the end, I still wish that kubeadm would give us more information about the cluster so that we didn't have to borrow artifacts from other k8s components.

I personally don't find using the kubelet's config any better practice than using kube-proxy's, but I can see how not depending on a component that kube-router can potentially replace is helpful for some use cases. So I'd be willing to review / test / merge if someone wants to go through the effort of putting up a PR.

aauren avatar Aug 13 '21 14:08 aauren

I personally don't find using the kubelet's config any better practice than using kube-proxy's, but I can see how not depending on a component that kube-router can potentially replace is helpful for some use cases. So I'd be willing to review / test / merge if someone wants to go through the effort of putting up a PR.

The difference is that Kubernetes can work without kubeadm and without kube-proxy, but it cannot work without the kubelet. And the kubelet will always need to connect to the API server without a cluster IP, so it will always have the information that kube-router needs.
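
You can see this on any kubeadm-provisioned node (example output; the address varies per cluster):

# The kubelet's kubeconfig always carries a direct API server endpoint,
# never a cluster IP:
grep 'server:' /etc/kubernetes/kubelet.conf
#   server: https://192.168.1.10:6443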

asteven avatar Aug 16 '21 14:08 asteven

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] avatar Sep 05 '23 01:09 github-actions[bot]

This issue was closed because it has been stale for 5 days with no activity.

github-actions[bot] avatar Sep 10 '23 02:09 github-actions[bot]