KUBERNETES_SERVICE_HOST env variable not being used to communicate with API-server

Open Retna-Gjensidige opened this issue 3 years ago • 4 comments

What is the issue?

We are testing linkerd2 stable-2.12.0, and we see that the policy-controller running in the Destination pod is not able to connect to the API server and ends up in a crash/restart loop.

With our current installation, linkerd2 stable-2.11.3, everything works: the policy-controller is able to access the API server.

What we see is that the policy-controller is not using the KUBERNETES_SERVICE_HOST env variable to connect to the API server. It's using kubernetes.default.svc as the URL of the API server.
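To make the expected behavior concrete, here is a minimal Python sketch of the lookup we would expect a client to perform. It is not the kube-rs or policy-controller code. The function name `api_server_url`, the fallback to kubernetes.default.svc on port 443, and the use of KUBERNETES_SERVICE_PORT alongside the host variable are assumptions made for illustration.

```python
import os

# Assumed fallback: the in-cluster DNS name that shows up in the logs.
DEFAULT_HOST = "kubernetes.default.svc"
DEFAULT_PORT = "443"


def api_server_url(env=None):
    """Return the API server base URL, preferring the env variables.

    `env` is any mapping of environment variables; it defaults to os.environ.
    """
    env = os.environ if env is None else env
    host = env.get("KUBERNETES_SERVICE_HOST")
    port = env.get("KUBERNETES_SERVICE_PORT")
    if host and port:
        # Bracket bare IPv6 addresses so the resulting URL stays valid.
        if ":" in host and not host.startswith("["):
            host = f"[{host}]"
        return f"https://{host}:{port}"
    # Neither variable is usable: fall back to the cluster DNS name.
    return f"https://{DEFAULT_HOST}:{DEFAULT_PORT}"


if __name__ == "__main__":
    # "my-cluster.example.com" is a hypothetical FQDN, not our real API server.
    print(api_server_url({
        "KUBERNETES_SERVICE_HOST": "my-cluster.example.com",
        "KUBERNETES_SERVICE_PORT": "443",
    }))
```

With the variable set to an FQDN, the sketch returns a URL on that FQDN, which is the address our firewall rules allow. Only when the variables are missing does it fall back to kubernetes.default.svc.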

Our requirement may be specific to Azure, as we secure our AKS egress with a layer 7 firewall (info here). Azure AKS now also supports KUBERNETES_SERVICE_HOST (release notes here).

We created an issue in kubert, and @olix0r pointed us to the change made in kube-rs that affects us.

We use the Cattle vs Pets model for our AKS clusters: at a moment's notice, we can destroy a cluster and provision a brand new one. We set KUBERNETES_SERVICE_HOST to the FQDN of the API server so that workloads can reach it. This way we don't have to keep updating the firewall with the IP of the API server, which is managed by MS/Azure, is not static, and may change at any time.

How can it be reproduced?

Control the egress of the cluster via a layer 7 firewall. Set the env variable KUBERNETES_SERVICE_HOST to the FQDN of the API server for a pod that needs to access the API server.

Logs, error output, etc

{"timestamp":"2022-09-02T09:28:19.075636Z","level":"DEBUG","fields":{"service.ready":true,"message":"processing request"},"target":"tower::buffer::worker","spans":[{"name":"networkauthentications"}]}
{"timestamp":"2022-09-02T09:28:19.075645Z","level":"DEBUG","fields":{"service.ready":true,"message":"processing request"},"target":"tower::buffer::worker","spans":[{"name":"httproutes"}]}
{"timestamp":"2022-09-02T09:28:19.075706Z","level":"DEBUG","fields":{"message":"requesting"},"target":"kube_client::client::builder","spans":[{"name":"httproutes"},{"http.method":"GET","http.url":"https://kubernetes.default.svc/apis/policy.linkerd.io/v1alpha1/httproutes?","otel.kind":"client","otel.name":"list","name":"HTTP"}]}
{"timestamp":"2022-09-02T09:28:19.075820Z","level":"DEBUG","fields":{"message":"requesting"},"target":"kube_client::client::builder","spans":[{"name":"meshtlsauthentications"},{"http.method":"GET","http.url":"https://kubernetes.default.svc/apis/policy.linkerd.io/v1alpha1/meshtlsauthentications?","otel.kind":"client","otel.name":"list","name":"HTTP"}]}
{"timestamp":"2022-09-02T09:28:19.075699Z","level":"DEBUG","fields":{"message":"resolving host=\"kubernetes.default.svc\""},"target":"hyper::client::connect::dns"}
{"timestamp":"2022-09-02T09:28:19.077097Z","level":"DEBUG","fields":{"message":"requesting"},"target":"kube_client::client::builder","spans":[{"name":"serverauthorizations"},{"http.method":"GET","http.url":"https://kubernetes.default.svc/apis/policy.linkerd.io/v1beta1/serverauthorizations?","otel.kind":"client","otel.name":"list","name":"HTTP"}]}
{"timestamp":"2022-09-02T09:28:19.077093Z","level":"DEBUG","fields":{"message":"requesting"},"target":"kube_client::client::builder","spans":[{"name":"networkauthentications"},{"http.method":"GET","http.url":"https://kubernetes.default.svc/apis/policy.linkerd.io/v1alpha1/networkauthentications?","otel.kind":"client","otel.name":"list","name":"HTTP"}]}
{"timestamp":"2022-09-02T09:28:19.078258Z","level":"DEBUG","fields":{"message":"resolving host=\"kubernetes.default.svc\""},"target":"hyper::client::connect::dns"}
{"timestamp":"2022-09-02T09:28:19.079550Z","level":"DEBUG","fields":{"message":"resolving host=\"kubernetes.default.svc\""},"target":"hyper::client::connect::dns"}
{"timestamp":"2022-09-02T09:28:19.078270Z","level":"DEBUG","fields":{"message":"resolving host=\"kubernetes.default.svc\""},"target":"hyper::client::connect::dns"}
{"timestamp":"2022-09-02T09:28:19.079722Z","level":"DEBUG","fields":{"message":"connecting to 10.2.0.1:443"},"target":"hyper::client::connect::http","spans":[{"name":"pods"},{"http.method":"GET","http.url":"https://kubernetes.default.svc/api/v1/pods?&labelSelector=linkerd.io%2Fcontrol-plane-ns","otel.kind":"client","otel.name":"list","name":"HTTP"}]}
{"timestamp":"2022-09-02T09:28:19.079751Z","level":"DEBUG","fields":{"message":"connecting to 10.2.0.1:443"},"target":"hyper::client::connect::http","spans":[{"name":"authorizationpolicies"},{"http.method":"GET","http.url":"https://kubernetes.default.svc/apis/policy.linkerd.io/v1alpha1/authorizationpolicies?","otel.kind":"client","otel.name":"list","name":"HTTP"}]}
{"timestamp":"2022-09-02T09:28:19.078340Z","level":"DEBUG","fields":{"message":"resolving host=\"kubernetes.default.svc\""},"target":"hyper::client::connect::dns"}
{"timestamp":"2022-09-02T09:28:19.079558Z","level":"DEBUG","fields":{"message":"resolving host=\"kubernetes.default.svc\""},"target":"hyper::client::connect::dns"}
{"timestamp":"2022-09-02T09:28:19.081938Z","level":"DEBUG","fields":{"message":"connecting to 10.2.0.1:443"},"target":"hyper::client::connect::http","spans":[{"name":"serverauthorizations"},{"http.method":"GET","http.url":"https://kubernetes.default.svc/apis/policy.linkerd.io/v1beta1/serverauthorizations?","otel.kind":"client","otel.name":"list","name":"HTTP"}]}
{"timestamp":"2022-09-02T09:28:19.082054Z","level":"DEBUG","fields":{"message":"connecting to 10.2.0.1:443"},"target":"hyper::client::connect::http","spans":[{"name":"networkauthentications"},{"http.method":"GET","http.url":"https://kubernetes.default.svc/apis/policy.linkerd.io/v1alpha1/networkauthentications?","otel.kind":"client","otel.name":"list","name":"HTTP"}]}
{"timestamp":"2022-09-02T09:28:19.082074Z","level":"DEBUG","fields":{"message":"connecting to 10.2.0.1:443"},"target":"hyper::client::connect::http","spans":[{"name":"httproutes"},{"http.method":"GET","http.url":"https://kubernetes.default.svc/apis/policy.linkerd.io/v1alpha1/httproutes?","otel.kind":"client","otel.name":"list","name":"HTTP"}]}
{"timestamp":"2022-09-02T09:28:19.083221Z","level":"DEBUG","fields":{"message":"connected to 10.2.0.1:443"},"target":"hyper::client::connect::http","spans":[{"name":"pods"},{"http.method":"GET","http.url":"https://kubernetes.default.svc/api/v1/pods?&labelSelector=linkerd.io%2Fcontrol-plane-ns","otel.kind":"client","otel.name":"list","name":"HTTP"}]}
{"timestamp":"2022-09-02T09:28:19.083242Z","level":"DEBUG","fields":{"message":"connected to 10.2.0.1:443"},"target":"hyper::client::connect::http","spans":[{"name":"serverauthorizations"},{"http.method":"GET","http.url":"https://kubernetes.default.svc/apis/policy.linkerd.io/v1beta1/serverauthorizations?","otel.kind":"client","otel.name":"list","name":"HTTP"}]}
{"timestamp":"2022-09-02T09:28:19.083344Z","level":"DEBUG","fields":{"message":"connected to 10.2.0.1:443"},"target":"hyper::client::connect::http","spans":[{"name":"authorizationpolicies"},{"http.method":"GET","http.url":"https://kubernetes.default.svc/apis/policy.linkerd.io/v1alpha1/authorizationpolicies?","otel.kind":"client","otel.name":"list","name":"HTTP"}]}
{"timestamp":"2022-09-02T09:28:19.083535Z","level":"DEBUG","fields":{"message":"connecting to 10.2.0.1:443"},"target":"hyper::client::connect::http","spans":[{"name":"servers"},{"http.method":"GET","http.url":"https://kubernetes.default.svc/apis/policy.linkerd.io/v1beta1/servers?","otel.kind":"client","otel.name":"list","name":"HTTP"}]}
{"timestamp":"2022-09-02T09:28:19.083555Z","level":"DEBUG","fields":{"message":"connected to 10.2.0.1:443"},"target":"hyper::client::connect::http","spans":[{"name":"networkauthentications"},{"http.method":"GET","http.url":"https://kubernetes.default.svc/apis/policy.linkerd.io/v1alpha1/networkauthentications?","otel.kind":"client","otel.name":"list","name":"HTTP"}]}
{"timestamp":"2022-09-02T09:28:19.083571Z","level":"DEBUG","fields":{"message":"connected to 10.2.0.1:443"},"target":"hyper::client::connect::http","spans":[{"name":"httproutes"},{"http.method":"GET","http.url":"https://kubernetes.default.svc/apis/policy.linkerd.io/v1alpha1/httproutes?","otel.kind":"client","otel.name":"list","name":"HTTP"}]}
{"timestamp":"2022-09-02T09:28:19.083901Z","level":"DEBUG","fields":{"message":"connecting to 10.2.0.1:443"},"target":"hyper::client::connect::http","spans":[{"name":"meshtlsauthentications"},{"http.method":"GET","http.url":"https://kubernetes.default.svc/apis/policy.linkerd.io/v1alpha1/meshtlsauthentications?","otel.kind":"client","otel.name":"list","name":"HTTP"}]}
{"timestamp":"2022-09-02T09:28:19.085181Z","level":"DEBUG","fields":{"message":"connected to 10.2.0.1:443"},"target":"hyper::client::connect::http","spans":[{"name":"meshtlsauthentications"},{"http.method":"GET","http.url":"https://kubernetes.default.svc/apis/policy.linkerd.io/v1alpha1/meshtlsauthentications?","otel.kind":"client","otel.name":"list","name":"HTTP"}]}
{"timestamp":"2022-09-02T09:28:19.085214Z","level":"ERROR","fields":{"message":"failed with error error trying to connect: unexpected EOF"},"target":"kube_client::client::builder","spans":[{"name":"serverauthorizations"},{"http.method":"GET","http.url":"https://kubernetes.default.svc/apis/policy.linkerd.io/v1beta1/serverauthorizations?","otel.kind":"client","otel.name":"list","otel.status_code":"ERROR","name":"HTTP"}]}
{"timestamp":"2022-09-02T09:28:19.085219Z","level":"ERROR","fields":{"message":"failed with error error trying to connect: unexpected EOF"},"target":"kube_client::client::builder","spans":[{"name":"pods"},{"http.method":"GET","http.url":"https://kubernetes.default.svc/api/v1/pods?&labelSelector=linkerd.io%2Fcontrol-plane-ns","otel.kind":"client","otel.name":"list","otel.status_code":"ERROR","name":"HTTP"}]}
{"timestamp":"2022-09-02T09:28:19.085257Z","level":"INFO","fields":{"message":"stream failed","error":"failed to perform initial object list: HyperError: error trying to connect: unexpected EOF"},"target":"kubert::errors","spans":[{"name":"serverauthorizations"}]}
{"timestamp":"2022-09-02T09:28:19.085283Z","level":"INFO","fields":{"message":"stream failed","error":"failed to perform initial object list: HyperError: error trying to connect: unexpected EOF"},"target":"kubert::errors","spans":[{"name":"pods"}]}
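For anyone who wants to check their own logs for the same pattern, here is a small Python sketch that reads JSON log lines like the ones above, collects the hosts found in `http.url`, and lists the ERROR messages. The function name `summarize` is hypothetical, and the two sample lines are copied from the log output above.

```python
import json
from urllib.parse import urlsplit


def summarize(lines):
    """Return (hosts seen in http.url spans, messages of ERROR records)."""
    hosts, errors = set(), []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Each record may carry several spans; only HTTP spans have a URL.
        for span in record.get("spans", []):
            url = span.get("http.url")
            if url:
                hosts.add(urlsplit(url).hostname)
        if record.get("level") == "ERROR":
            errors.append(record["fields"]["message"])
    return hosts, errors


# Two lines taken verbatim from the log output above.
SAMPLE = [
    '{"timestamp":"2022-09-02T09:28:19.075706Z","level":"DEBUG",'
    '"fields":{"message":"requesting"},"target":"kube_client::client::builder",'
    '"spans":[{"name":"httproutes"},{"http.method":"GET",'
    '"http.url":"https://kubernetes.default.svc/apis/policy.linkerd.io/v1alpha1/httproutes?",'
    '"otel.kind":"client","otel.name":"list","name":"HTTP"}]}',
    '{"timestamp":"2022-09-02T09:28:19.085214Z","level":"ERROR",'
    '"fields":{"message":"failed with error error trying to connect: unexpected EOF"},'
    '"target":"kube_client::client::builder","spans":[{"name":"serverauthorizations"},'
    '{"http.method":"GET",'
    '"http.url":"https://kubernetes.default.svc/apis/policy.linkerd.io/v1beta1/serverauthorizations?",'
    '"otel.kind":"client","otel.name":"list","otel.status_code":"ERROR","name":"HTTP"}]}',
]

if __name__ == "__main__":
    hosts, errors = summarize(SAMPLE)
    print("hosts:", sorted(hosts))
    print("errors:", errors)
```

If the only host reported is kubernetes.default.svc while KUBERNETES_SERVICE_HOST is set to something else in the pod, the client is ignoring the env variable, which is what we observe.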

Output of linkerd check -o short

Linkerd core checks
===================

linkerd-identity
----------------
‼ issuer cert is valid for at least 60 days
    issuer certificate will expire on 2022-09-08T09:44:22Z
    see https://linkerd.io/2.12/checks/#l5d-identity-issuer-cert-not-expiring-soon for hints

linkerd-webhooks-and-apisvc-tls
-------------------------------
‼ proxy-injector cert is valid for at least 60 days
    certificate will expire on 2022-09-07T09:44:24Z
    see https://linkerd.io/2.12/checks/#l5d-proxy-injector-webhook-cert-not-expiring-soon for hints
‼ sp-validator cert is valid for at least 60 days
    certificate will expire on 2022-09-07T09:44:23Z
    see https://linkerd.io/2.12/checks/#l5d-sp-validator-webhook-cert-not-expiring-soon for hints
‼ policy-validator cert is valid for at least 60 days
    certificate will expire on 2022-09-07T09:44:23Z
    see https://linkerd.io/2.12/checks/#l5d-policy-validator-webhook-cert-not-expiring-soon for hints

Status check results are √

Environment

  • Kubernetes Version: 1.24.3
  • Cluster Environment: Azure - AKS
  • Host OS: Linux
  • Linkerd version: stable-2.12.0

Possible solution

No response

Additional context

No response

Would you like to work on fixing this bug?

No response

Retna-Gjensidige, Sep 06 '22 14:09

Thanks for (re-)opening this issue!

I've opened https://github.com/kube-rs/kube-rs/issues/1000 to discuss restoring support for the environment variable and https://github.com/Azure/AKS/issues/3183 to discuss how this feature is at odds with the documented client contract.

olix0r, Sep 06 '22 14:09

And https://github.com/kubernetes/kubernetes/issues/112263 discusses reconciling client-go's behavior with the documentation.

olix0r, Sep 06 '22 15:09

This is fixed by kube-rs/kube-rs#1001. Once kube-rs v0.75 is released, we'll update dependencies. This is unlikely to be included in Linkerd stable-2.12.1, but it should be available for stable-2.12.2 (probably near the end of the month).

olix0r, Sep 09 '22 23:09

Thank you @olix0r 🙏 💯 ❤️

Retna-Gjensidige, Sep 12 '22 09:09