
Destination & identity service unable to reach k8s API due to static port 443 override

Open bascht opened this issue 2 years ago • 6 comments

What is the issue?

It looks like #6887 introduced TLS detection on port 443 under the static assumption that the Kubernetes API is always reachable on port :443 when using Cilium as the CNI:

Both the destination and identity deployments are rendered with a static

 - --outbound-ports-to-ignore
 - "443"

initContainer argument, which breaks in a cluster where the API endpoints listen on a different port.
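
One quick way to confirm what was rendered (a sketch, assuming a default install into the linkerd namespace):

$ kubectl -n linkerd get deploy linkerd-destination -o yaml \
    | grep -A1 -- '--outbound-ports-to-ignore'
        - --outbound-ports-to-ignore
        - "443"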

How can it be reproduced?

K8s v1.20.4 cluster running the API server on a different port:

# $ kubectl get svc -n default -o yaml                                                                                                                                                                                                          
apiVersion: v1
items:
- apiVersion: v1
  kind: Service
  metadata:
    creationTimestamp: "2021-03-03T15:11:12Z"
    labels:
      component: apiserver
      provider: kubernetes
    name: kubernetes
    namespace: default
    resourceVersion: "192"
    uid: f7ac600e-1545-4108-a5e4-67b39d8bbe16
  spec:
    clusterIP: 10.96.0.1
    clusterIPs:
    - 10.96.0.1
    ports:
    - name: https
      port: 443
      protocol: TCP
      targetPort: 6443
    sessionAffinity: None
    type: ClusterIP
  status:
    loadBalancer: {}
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
---
# $ kubectl get endpoints kubernetes -n default -o yaml                                                             
apiVersion: v1
kind: Endpoints
metadata:
  creationTimestamp: "2021-03-03T15:11:12Z"
  labels:
    endpointslice.kubernetes.io/skip-mirror: "true"
  name: kubernetes
  namespace: default
  resourceVersion: "142154827"
  uid: 5fbc0a4b-378a-4a4c-ae55-b0813e0f0572
subsets:
- addresses:
  - ip: 172.30.0.2
  - ip: 172.30.0.3
  - ip: 172.30.0.4
  ports:
  - name: https
    port: 6443
    protocol: TCP

which will lead to the destination service not being able to reach the API:

[   246.629525s]  WARN ThreadId(01) outbound:server{orig_dst=172.30.0.4:6443}:controller{addr=localhost:8086}:endpoint{addr=127.0.0.1:8086}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)

(logs from the linkerd-proxy container)

As soon as I manually patch the deployments for identity & destination service to ignore outbound port 6443:

 - --outbound-ports-to-ignore
 - "6443"

all services come up and are healthy.
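
For reference, the manual patch looks roughly like this (a sketch; container names may differ between releases):

$ kubectl -n linkerd edit deploy/linkerd-destination deploy/linkerd-identity
# in the linkerd-init container's args, change
#   - --outbound-ports-to-ignore
#   - "443"
# to
#   - --outbound-ports-to-ignore
#   - "6443"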

Logs, error output, etc

See the "How can it be reproduced?" section above.

output of linkerd check -o short

$ linkerd check -o short                                                                                                                                                                   
Status check results are √

(that is, after setting the outbound ignore ports)

Environment

  • Kubernetes v1.20.4, self-hosted on Debian 10
  • Linux 4.19.0-20-amd64
  • CNI: Cilium v1.9.5
  • Linkerd stable-2.11.2

Possible solution

The outbound port ignores should not be set statically from an included template. This worked for us with Linkerd stable-2.10, so the breaking change was likely #6887.
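
For example, the control plane's skip list could be exposed as a chart value instead of being hard-coded. A purely illustrative sketch (the value name below is hypothetical, not an existing chart option):

# values.yaml (hypothetical value name, shown only to illustrate the idea)
controlPlaneSkipOutboundPorts: "443,6443"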

Additional context

This could be duplicating #7460 but I think I can add a bit more context or help with debugging this.

Would you like to work on fixing this bug?

maybe

bascht avatar May 31 '22 13:05 bascht

the breaking change was likely https://github.com/linkerd/linkerd2/pull/6887.

That doesn't seem likely to me. This change added 443 to the proxy's list of default opaque ports. This should only really impact application pods, since, as you mention, the control plane is configured with:

 - --outbound-ports-to-ignore
 - "443"

That is, iptables is configured so that all outbound traffic to :443 skips the proxy entirely.

I suspect the issue is this: in non-Cilium clusters, application connections to 443 are seen by iptables as connecting to 443, so the skip rule applies as expected. But when Cilium is in the mix, it rewrites the connection metadata before the iptables rules apply, so the traffic does not skip the proxy.

You could try manually editing the manifests to change 443 to 6443 to unblock control plane startup, but I'm not immediately sure what the best general-purpose solution is for working well in Cilium-enabled clusters.
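
If it helps with debugging, the rules that proxy-init installs are echoed in the init container's logs, so you can see exactly which ports got a skip rule (a sketch, assuming the default linkerd-init container name):

$ kubectl -n linkerd logs deploy/linkerd-destination -c linkerd-init
# look for the rules generated for the ports passed via --outbound-ports-to-ignore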

olix0r avatar May 31 '22 14:05 olix0r

That doesn't seem likely to me. This change added 443 to the proxy's list of default opaque ports. This should only really impact application pods, since, as you mention, the control plane is configured with:

But that would apply to Linkerd <= stable-2.10 only, no? As far as I understood, the controller was made redundant and all components talk directly to the K8s API.

That would also explain why this worked for us until the upgrade to 2.11. I just checked one of our clusters that is still running 2.10, and there the controller has our full list of .Values.proxyInit.ignoreOutboundPorts:

 - --outbound-ports-to-ignore
 - 25,443,587,3306,11211,5432,6443
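
For context, that list comes from our Helm values, roughly like this (a sketch of our configuration, not an official example):

proxyInit:
  ignoreOutboundPorts: "25,443,587,3306,11211,5432,6443"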

You could try manually editing the manifests to change 443 to 6443 to unblock control plane startup,

Yes, our current manual fix for this is to patch the generated manifest via yq to have 6443 instead of 443. :grin:
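
Roughly like this, in case it helps anyone else (a sketch of that workaround; mikefarah yq v4 syntax assumed, adjust as needed):

$ linkerd install \
    | yq '(select(.kind == "Deployment") | .spec.template.spec.initContainers[]?.args[] | select(. == "443")) = "6443"' \
    | kubectl apply -f -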

bascht avatar May 31 '22 14:05 bascht

As far as I understood the controller was made redundant and all components talk directly to the K8s API.

No, Linkerd proxies continue to use the controller components for discovery & identity.

olix0r avatar May 31 '22 15:05 olix0r

As far as I understood the controller was made redundant and all components talk directly to the K8s API.

No, Linkerd proxies continue to use the controller components for discovery & identity.

Okay, understood. I could be mixing up cause and effect then; what worked for 2.10 simply no longer works for 2.11, and the unblocking of 6443 via ignoreOutboundPorts may have been an unintended side effect all along.

bascht avatar May 31 '22 15:05 bascht

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Aug 31 '22 01:08 stale[bot]

Please don't close this one, it is still a problem.

tobgen avatar Sep 01 '22 13:09 tobgen

Likely related to #7730 and #7786.

jeremychase avatar Nov 09 '22 16:11 jeremychase

I had a look at this and came up with some steps to reproduce the problem. The issue stems from using Cilium in its kube-proxy replacement mode. As soon as I set it to disabled, the deployments were unblocked without having to patch the outbound skip ports.

Steps to reproduce locally in k3d

$ k3d cluster create cilium-no-ebpf \
              --k3s-arg "--disable-network-policy@server:*" \
              --k3s-arg "--flannel-backend=none@server:*"

$ docker exec -it k3d-cilium-no-ebpf-server-0 mount bpffs /sys/fs/bpf -t bpf

$ docker exec -it k3d-cilium-no-ebpf-server-0 mount --make-shared /sys/fs/bpf

$ docker exec -it k3d-cilium-no-ebpf-server-0 mkdir -p /run/cilium/cgroupv2

$ docker exec -it k3d-cilium-no-ebpf-server-0 mount -t cgroup2 none /run/cilium/cgroupv2

$ docker exec -it k3d-cilium-no-ebpf-server-0 mount --make-shared /run/cilium/cgroupv2/

$ kubectl create -f https://raw.githubusercontent.com/cilium/cilium/v1.9/install/kubernetes/quick-install.yaml
  • We first need to mount the bpf filesystem and make it shared.
  • This does not happen automatically when running k3s; see this issue.
  • Next, we install Cilium using the quick-install manifest.
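
Before installing Linkerd, one way to double-check which mode the Cilium agent actually settled on (a sketch; the cilium CLI inside the agent pod reports this):

$ kubectl -n kube-system exec ds/cilium -- cilium status | grep KubeProxyReplacement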

For me, the control plane pods were not starting up (only the identity pod got past init, and it kept restarting); errors from the identity pod:

NAME                                      READY   STATUS            RESTARTS      AGE
linkerd-proxy-injector-5488f9c68f-lf4lm   0/2     PodInitializing   0             47s
linkerd-destination-76f64cfc5-p8blb       0/4     PodInitializing   0             47s
linkerd-identity-f6c99f54c-7kz7w          0/2     Running           2 (27s ago)   47s

---

[    65.443977s]  WARN ThreadId(02) identity:controller{addr=localhost:8080}:endpoint{addr=127.0.0.1:8080}: linkerd_reconnect: Failed to connect error=Connection refused (os error 111)

Cilium config:

 $ kubectl get cm -n kube-system cilium-config -oyaml| grep 'kube-proxy-replacement'
  kube-proxy-replacement: probe

Looking at the Cilium logs, I haven't yet been able to find any details about packets being rewritten from 443 to 6443, though.

Quick fixes:

Two ways this can be quickly patched for people who need an immediate workaround (see the sketches after this list):

  1. Skip 6443 on the outbound side for the control plane components:
  • If running with proxy-init, this has to be patched manually, e.g.:
      initContainers:
      - args:
        - --incoming-proxy-port
        - "4143"
        - --outgoing-proxy-port
        - "4140"
        - --proxy-uid
        - "2102"
        - --inbound-ports-to-ignore
        - 4190,4191,4567,4568
        - --outbound-ports-to-ignore
        - 443,6443
  • If running with linkerd-cni, the control plane components can be annotated with config.linkerd.io/skip-outbound-ports: 6443. The CNI plugin should set the relevant iptables rule.
  2. Run Cilium with kube-proxy-replacement: disabled
  • Edit the ConfigMap and change the mode.
  • In my experiment, the pods came up as soon as the Cilium pod was restarted (presumably to let the change take effect). The control plane components did not require a restart:
$ kubectl get pods -n linkerd
NAME                                      READY   STATUS    RESTARTS       AGE
linkerd-identity-65b74c8f76-zk7hp         2/2     Running   3 (117s ago)   2m39s
linkerd-destination-5d59dc6c8b-htpxk      4/4     Running   0              2m39s
linkerd-proxy-injector-77b89fc448-p5sb8   2/2     Running   0              2m39s
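
To make these a bit more concrete, here are rough sketches (assuming the default linkerd namespace and deployment names; adjust to your setup). For option 1 with linkerd-cni, the annotation can be added to the control plane pod templates:

$ for d in linkerd-destination linkerd-identity linkerd-proxy-injector; do
    kubectl -n linkerd patch deploy "$d" -p \
      '{"spec":{"template":{"metadata":{"annotations":{"config.linkerd.io/skip-outbound-ports":"6443"}}}}}'
  done

For option 2, the mode can be flipped in the ConfigMap and the agent restarted so the change takes effect:

$ kubectl -n kube-system patch cm cilium-config -p '{"data":{"kube-proxy-replacement":"disabled"}}'
$ kubectl -n kube-system rollout restart ds/cilium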

mateiidavid avatar Nov 10 '22 16:11 mateiidavid

@bascht I have managed to reproduce this with Cilium v1.9 but haven't been able to reproduce it with v1.12. Even when socket load balancing is enabled, everything seems to work well without needing the annotation override. Can you confirm whether bumping the Cilium version fixes this for you?
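
If it helps, a rough upgrade sketch (assuming Cilium was installed via its Helm chart; the version and flags here are only an example):

$ helm repo add cilium https://helm.cilium.io/
$ helm -n kube-system upgrade cilium cilium/cilium --version 1.12.4 --reuse-values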

mateiidavid avatar Nov 15 '22 14:11 mateiidavid

@bascht Have you had a chance to try the Cilium upgrade @mateiidavid suggested above?

alpeb avatar Dec 16 '22 20:12 alpeb