linkerd2
Opaqueness not applied to off-cluster destination with enable-external-profiles annotation
What is the issue?
We're running Linkerd stable-2.12.2.
Linkerd is configured with:
proxy.opaquePorts: 25,587,3306,4444,5432,6379,26379,9300,11211
We set the config.linkerd.io/enable-external-profiles: "true" annotation on application Pods that connect to an off-cluster MySQL server on port 3306 (following the instructions at https://linkerd.io/2.12/features/protocol-detection/#setting-the-enable-external-profiles-annotation).
However, the application is failing to connect to the MySQL server and we see the following errors in linkerd proxy logs:
[ 12.990661s] INFO ThreadId(01) outbound:proxy{addr=10.14.0.218:3306}: linkerd_detect: Continuing after timeout: linkerd_proxy_http::version::Version protocol detection timed out after 10s
The address 10.14.0.218 is outside the cluster network ranges (defined as clusterNetworks: 172.20.0.0/17,172.20.128.0/17).
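As a quick sanity check of the claim above, here is a minimal sketch using Python's standard ipaddress module. The function name in_cluster_networks is made up for illustration; the CIDR ranges and addresses are the ones quoted in this report.

```python
import ipaddress

# clusterNetworks value quoted in this report
CLUSTER_NETWORKS = "172.20.0.0/17,172.20.128.0/17"


def in_cluster_networks(addr: str, networks: str = CLUSTER_NETWORKS) -> bool:
    """Return True if addr falls inside any of the comma-separated CIDR ranges."""
    ip = ipaddress.ip_address(addr)
    return any(
        ip in ipaddress.ip_network(cidr.strip())
        for cidr in networks.split(",")
    )


if __name__ == "__main__":
    # 10.14.0.218 is the MySQL server, 172.20.11.247 is the Pod IP from the manifest
    for addr in ("10.14.0.218", "172.20.11.247"):
        print(addr, in_cluster_networks(addr))
```

The MySQL address evaluates as outside both ranges, while the Pod IP from the manifest falls inside the first one.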
Here's the manifest metadata of the running Pod:
kind: Pod
metadata:
  annotations:
    checksum/configmap-key-config.properties: ec936facad2bfc7bf8863ae2b8d3f90356bdfc94e2940ed31654f43abb2b0efb
    cni.projectcalico.org/containerID: 590f016aabbac75a6825ad52e018ea71e4e3d09d341d8b232d6a17cf200e7eca
    cni.projectcalico.org/podIP: 172.20.11.247/32
    cni.projectcalico.org/podIPs: 172.20.11.247/32
    config.linkerd.io/enable-external-profiles: "true"
    linkerd.io/created-by: linkerd/proxy-injector stable-2.12.2
    linkerd.io/inject: enabled
    linkerd.io/proxy-version: stable-2.12.2
    linkerd.io/trust-root-sha256: 1d57b9c015280710eafad0935ee3ec0bc4d7eb430908e89ae20c5ab7e5ec9f80
    vault.security.banzaicloud.io/vault-addr: https://vault.vault.svc:8200
    vault.security.banzaicloud.io/vault-env-daemon: "false"
    vault.security.banzaicloud.io/vault-role: k8s-eventbus-maxwell
    viz.linkerd.io/tap-enabled: "true"
I was reviewing a related issue, https://github.com/linkerd/linkerd2/issues/8273, which seems to suggest that this was fixed by https://github.com/linkerd/linkerd2-proxy/pull/1617. From what I can tell, that fix should be included in stable-2.12.2, but we are not able to get this to work as expected.
For now we're using config.linkerd.io/skip-outbound-ports: "3306" as a workaround, but we are hoping to drop it and use the external profiles method instead.
How can it be reproduced?
- Deploy Linkerd stable-2.12.2
- Run an application Pod with the config.linkerd.io/enable-external-profiles: "true" annotation that connects to a MySQL server on port 3306 running off-cluster (not in the clusterNetworks range(s))
- Observe that the application fails to connect and linkerd-proxy reports "protocol detection timed out after 10s"
Logs, error output, etc
[ 12.990661s] INFO ThreadId(01) outbound:proxy{addr=10.14.0.218:3306}: linkerd_detect: Continuing after timeout: linkerd_proxy_http::version::Version protocol detection timed out after 10s
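To spot this symptom across many proxy logs, a small parser along these lines might help. It is only a sketch: the regular expression is written against the plain-text log line shown above, the names parse_detect_timeout and OPAQUE_PORTS are made up for illustration, and IPv6 addresses or the JSON log format are not handled.

```python
import re

# proxy.opaquePorts value quoted in this report
OPAQUE_PORTS = {25, 587, 3306, 4444, 5432, 6379, 26379, 9300, 11211}

# Matches the outbound span address and the detection timeout in the log line above
TIMEOUT_RE = re.compile(
    r"addr=(?P<host>[^:}\s]+):(?P<port>\d+)\}"
    r".*protocol detection timed out after (?P<timeout>\S+)"
)


def parse_detect_timeout(line: str):
    """Return details of a detection timeout, or None if the line does not match."""
    match = TIMEOUT_RE.search(line)
    if match is None:
        return None
    port = int(match.group("port"))
    return {
        "host": match.group("host"),
        "port": port,
        "timeout": match.group("timeout"),
        # True means the port is configured as opaque, so detection was not expected
        "expected_opaque": port in OPAQUE_PORTS,
    }


if __name__ == "__main__":
    sample = (
        "[ 12.990661s] INFO ThreadId(01) outbound:proxy{addr=10.14.0.218:3306}: "
        "linkerd_detect: Continuing after timeout: "
        "linkerd_proxy_http::version::Version protocol detection timed out after 10s"
    )
    print(parse_detect_timeout(sample))
```

A result with expected_opaque set to True flags exactly the case described here: protocol detection ran on a port that is listed in proxy.opaquePorts.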
output of linkerd check -o short
Linkerd core checks
===================
linkerd-version
---------------
‼ cli is up-to-date
is running version 2.12.2 but the latest stable version is 2.12.4
see https://linkerd.io/2.12/checks/#l5d-version-cli for hints
control-plane-version
---------------------
‼ control plane is up-to-date
is running version 2.12.2 but the latest stable version is 2.12.4
see https://linkerd.io/2.12/checks/#l5d-version-control for hints
linkerd-control-plane-proxy
---------------------------
‼ control plane proxies are up-to-date
some proxies are not running the current version:
* linkerd-destination-5cc958f64c-jjbhq (stable-2.12.2)
* linkerd-destination-5cc958f64c-lj8ss (stable-2.12.2)
* linkerd-destination-5cc958f64c-rjmlq (stable-2.12.2)
* linkerd-identity-84f9d7cf87-6jtxc (stable-2.12.2)
* linkerd-identity-84f9d7cf87-g5ndc (stable-2.12.2)
* linkerd-identity-84f9d7cf87-phbjm (stable-2.12.2)
* linkerd-proxy-injector-5cd47b84fd-dxpkg (stable-2.12.2)
* linkerd-proxy-injector-5cd47b84fd-phwcv (stable-2.12.2)
* linkerd-proxy-injector-5cd47b84fd-zkbq2 (stable-2.12.2)
see https://linkerd.io/2.12/checks/#l5d-cp-proxy-version for hints
Linkerd extensions checks
=========================
linkerd-viz
-----------
‼ viz extension proxies are up-to-date
some proxies are not running the current version:
* metrics-api-855d59f76c-68nz9 (stable-2.12.2)
* prometheus-f7c9f5f74-88djq (stable-2.12.2)
* tap-74db455fc9-p4gvh (stable-2.12.2)
* tap-74db455fc9-qvqxt (stable-2.12.2)
* tap-74db455fc9-v92b4 (stable-2.12.2)
* tap-injector-5875b778dc-hfmcx (stable-2.12.2)
* web-576647df96-mnvh6 (stable-2.12.2)
see https://linkerd.io/2.12/checks/#l5d-viz-proxy-cp-version for hints
Status check results are √
Environment
- Kubernetes version: 1.23.10
- Environment: kops 1.25.3
- Host OS: Ubuntu 20.04.5 LTS (Kernel 5.15.0-1021-aws)
- Linkerd version: stable-2.12.2
Possible solution
As a workaround, we are currently using the config.linkerd.io/skip-outbound-ports annotation to skip port 3306 on Pods that need to connect to an off-cluster MySQL database.
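For anyone applying the same workaround to an existing workload, the sketch below builds a merge patch that sets the annotation on a Deployment's Pod template. The helper name skip_outbound_ports_patch is hypothetical, and the assumption that the workload is a Deployment is ours; only the annotation key and the port come from this report.

```python
import json


def skip_outbound_ports_patch(ports):
    """Build a merge patch that sets the skip-outbound-ports annotation
    on a workload's Pod template (spec.template.metadata.annotations)."""
    value = ",".join(str(port) for port in ports)
    return {
        "spec": {
            "template": {
                "metadata": {
                    "annotations": {
                        "config.linkerd.io/skip-outbound-ports": value,
                    }
                }
            }
        }
    }


if __name__ == "__main__":
    # Print JSON suitable for passing to `kubectl patch ... --type merge -p`
    print(json.dumps(skip_outbound_ports_patch([3306])))
```

The printed JSON could be passed to something like kubectl patch deployment <name> --type merge -p '<json>', where <name> is a placeholder. Changing the Pod template triggers a rollout, so the new Pods are injected with the updated setting. Note that, as we understand it, skipped ports bypass the proxy entirely, which is why we would still prefer opaque handling for off-cluster destinations.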
Additional context
Opaqueness for port 3306 works just fine for MySQL database running in-cluster, so this is only affecting connections to MySQL servers running off-cluster.
Would you like to work on fixing this bug?
None
Hey folks 👋🏼
I saw this was labelled for 2.13, but just wanted to know whether you think this is an issue in stable-2.12, or possibly something we have misconfigured?
@dkulchinsky We suspect this is a problem with stable-2.12 but need to spend more time debugging before we know for certain.
Thanks @jeremychase 👍🏼 let me know if you need additional information from me.
Hey @jeremychase, @risingspiral 👋🏼
Just saw Linkerd 2.13.0 was released, congrats! 🥳
Wanted to check in to see if this issue is already covered/fixed in 2.13, or would that be in a future patch release?
@dkulchinsky It will be in the future path. In 2.13 we've begun to change the discovery system away from ServiceProfiles. I think we're unlikely to invest more in "external service profiles", but we're still keenly interested in solving the underlying problem of being able to disable protocol detection for out-of-cluster traffic.
Thanks @olix0r, I think decoupling these concerns makes total sense.
Will be watching this space for updates as this is one of those issues that we constantly trip over with our users 😓 I'm guessing there's no ETA you can share at this point?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.
still an issue AFAIK, hoping there's some news about this? @olix0r
We have the same issue on the latest 2.14.0: protocol detection still runs for one of the opaquePorts.
I have also tried setting skipSubnets (--subnets-to-ignore), but protocol detection still runs for the request:
linkerd-proxy {"timestamp":"[ 632.121291s]","level":"INFO","fields":{"message":"Continuing after timeout: linkerd_proxy_http::version::Version protocol detection timed out after 10s"},"target":"linkerd_detect","spans":[{"name":"outbound"},{"addr":"xxxxx:3306","name":"proxy"}],"threadId":"ThreadId(1)"}
Only config.linkerd.io/skip-outbound-ports works.
For the record, we hear y'all on this one: being able to do egress traffic without protocol detection delays would be a good thing.
We want to separate the solution of that problem from the mechanism of ServiceProfiles, though, especially as we've been moving more toward Gateway API. Any thoughts on what kind of mechanisms would fit your use cases particularly well?