linkerd2
HTTPRoute intermittently fails to distribute traffic
What is the issue?
When using an HTTPRoute to dynamically redistribute load from one Service to a multicluster mirrored Service, traffic is only intermittently transmitted correctly.
How can it be reproduced?
- 2 clusters, east and west, joined by a multicluster link that mirrors appropriately labelled services deployed in west into east
- a Service foo in cluster east (but no deployment to receive traffic)
- a mirrored Service in east called foo-west. This should pass traffic to a deployment of something that will return basic acks, e.g. curls.
- an HTTPRoute directing traffic received by parentRef Service foo to backendRef foo-east.
- send traffic to
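For reference, the HTTPRoute described in the steps above might look roughly like the following. This is only a sketch, not the reporter's actual manifest: the apiVersion, the route name, and the namespace are assumptions, port 80 is taken from the proxy logs, and the backend name is copied from the step as written.

```yaml
# Hypothetical manifest; not taken from the reporter's cluster.
apiVersion: gateway.networking.k8s.io/v1beta1  # assumption; the policy.linkerd.io HTTPRoute is an alternative
kind: HTTPRoute
metadata:
  name: foo-route        # hypothetical name
  namespace: default     # hypothetical namespace
spec:
  parentRefs:
    # Traffic addressed to Service foo is matched by this route.
    - group: core
      kind: Service
      name: foo
      port: 80           # port seen in the proxy logs
  rules:
    - backendRefs:
        # Backend name as written in the reproduction step.
        - name: foo-east
          port: 80
```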
Logs, error output, etc
Application curl logs:
❯ kubectl exec -it busybox-5cd4968444-zn549 -- wget http://APP.APP.svc.cluster.local/ping -O -
Defaulted container "main" out of: main, linkerd-init (init), linkerd-proxy (init)
Connecting to APP.APP.svc.cluster.local (IPADDR:80)
writing to stdout
written to stdout
☸ non-prod
❯ kubectl exec -it busybox-5cd4968444-zn549 -- wget http://APP.APP.svc.cluster.local/ping -O -
Defaulted container "main" out of: main, linkerd-init (init), linkerd-proxy (init)
Connecting to APP.APP.svc.cluster.local (IPADDR:80)
wget: server returned error: HTTP/1.1 504 Gateway Timeout
command terminated with exit code 1
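To put a number on "intermittently", the request can be repeated and the failures counted. The helper below is a minimal sketch; `count_failures` is a hypothetical name, and the commented usage line assumes the pod name and URL from the logs above.

```shell
# count_failures N CMD...: run CMD N times and print "failures/total".
count_failures() {
  total=$1; shift
  failures=0
  i=0
  while [ "$i" -lt "$total" ]; do
    # A non-zero exit status (e.g. wget receiving a 504) counts as a failure.
    "$@" >/dev/null 2>&1 || failures=$((failures + 1))
    i=$((i + 1))
  done
  echo "$failures/$total"
}

# Hypothetical usage against the pod and URL from the logs above:
# count_failures 20 kubectl exec busybox-5cd4968444-zn549 -- \
#   wget -q -O /dev/null http://APP.APP.svc.cluster.local/ping
```

The printed ratio shows how often the request fails across the run.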
Proxy sidecar:
[ 853.183882s] INFO ThreadId(01) outbound:proxy{addr=10.100.238.202:80}:service{ns=APP name=APP port=80}: linkerd_proxy_api_resolve::resolve: No endpoints
[ 856.184109s] INFO ThreadId(01) outbound:proxy{addr=10.100.238.202:80}:service{ns=APP name=APP port=80}: linkerd_proxy_balance_queue::worker: Unavailable; entering failfast timeout=3.0
[ 856.184575s] INFO ThreadId(01) outbound:proxy{addr=10.100.238.202:80}:rescue{client.addr=172.27.8.216:48586}: linkerd_app_core::errors::respond: HTTP/1.1 request failed error=logical service 10.100.238.202:80: route default.http: backend Service.APP.APP:80: Service.APP.APP:80: service in fail-fast error.sources=[route default.http: backend Service.APP.APP:80: Service.APP.APP:80: service in fail-fast, backend Service.APP.APP:80: Service.APP.APP:80: service in fail-fast, Service.APP.APP:80: service in fail-fast, service in fail-fast]
output of linkerd check -o short
❯ linkerd check -o short
linkerd-version
---------------
‼ cli is up-to-date
is running version 24.3.2 but the latest edge version is 24.5.3
see https://linkerd.io/2/checks/#l5d-version-cli for hints
control-plane-version
---------------------
‼ control plane is up-to-date
is running version 24.5.1 but the latest edge version is 24.5.3
see https://linkerd.io/2/checks/#l5d-version-control for hints
‼ control plane and cli versions match
control plane running edge-24.5.1 but cli running edge-24.3.2
see https://linkerd.io/2/checks/#l5d-version-control for hints
linkerd-control-plane-proxy
---------------------------
‼ control plane proxies are up-to-date
some proxies are not running the current version:
* linkerd-destination-888c96b5b-7pwmc (edge-24.5.1)
* linkerd-destination-888c96b5b-hl54h (edge-24.5.1)
* linkerd-destination-888c96b5b-vn62f (edge-24.5.1)
* linkerd-identity-56bbfdc7b6-2cfhj (edge-24.5.1)
* linkerd-identity-56bbfdc7b6-f9bvq (edge-24.5.1)
* linkerd-identity-56bbfdc7b6-h67sk (edge-24.5.1)
* linkerd-proxy-injector-68c6b7bc6-5vxm6 (edge-24.5.1)
* linkerd-proxy-injector-68c6b7bc6-hgmks (edge-24.5.1)
* linkerd-proxy-injector-68c6b7bc6-l45wh (edge-24.5.1)
see https://linkerd.io/2/checks/#l5d-cp-proxy-version for hints
‼ control plane proxies and cli versions match
linkerd-destination-888c96b5b-7pwmc running edge-24.5.1 but cli running edge-24.3.2
see https://linkerd.io/2/checks/#l5d-cp-proxy-cli-version for hints
linkerd-jaeger
--------------
‼ jaeger extension proxies are up-to-date
some proxies are not running the current version:
* collector-7db4655-sdwth (edge-24.5.1)
* jaeger-5c4c9ff587-5c729 (edge-24.5.1)
* jaeger-injector-6cb867b4f8-5mhnd (edge-24.5.1)
see https://linkerd.io/2/checks/#l5d-jaeger-proxy-cp-version for hints
‼ jaeger extension proxies and cli versions match
collector-7db4655-sdwth running edge-24.5.1 but cli running edge-24.3.2
see https://linkerd.io/2/checks/#l5d-jaeger-proxy-cli-version for hints
linkerd-viz
-----------
‼ viz extension proxies are up-to-date
some proxies are not running the current version:
* metrics-api-db8857cf8-mfw6c (edge-24.5.1)
* metrics-api-db8857cf8-p59sg (edge-24.5.1)
* metrics-api-db8857cf8-wxm87 (edge-24.5.1)
* tap-6d6cf4c465-2rzj8 (edge-24.5.1)
* tap-6d6cf4c465-8bshr (edge-24.5.1)
* tap-6d6cf4c465-bg6sd (edge-24.5.1)
* tap-injector-66c6f694f4-7rwx4 (edge-24.5.1)
* tap-injector-66c6f694f4-9hjpw (edge-24.5.1)
* tap-injector-66c6f694f4-vqw6r (edge-24.5.1)
* web-56d54f864d-82jcp (edge-24.5.1)
* web-56d54f864d-j4vbv (edge-24.5.1)
see https://linkerd.io/2/checks/#l5d-viz-proxy-cp-version for hints
‼ viz extension proxies and cli versions match
metrics-api-db8857cf8-mfw6c running edge-24.5.1 but cli running edge-24.3.2
see https://linkerd.io/2/checks/#l5d-viz-proxy-cli-version for hints
Status check results are √
Environment
- Kubernetes v1.29.3
- EKS cluster
- Bottlerocket nodes
- Cilium CNI in AWS VPC replacement mode
Possible solution
No response
Additional context
No response
Would you like to work on fixing this bug?
maybe