minor logspam on ClusterIP: outbound: linkerd_app_core::serve: Connection closed error=read header from client timeout
What is the issue?
Hi ... I'm seeing what I think is a fairly low level of unnecessary logs related to ClusterIPs,
e.g.
outbound: linkerd_app_core::serve: Connection closed error=read header from client timeout client.addr=172.12.168.50:46264 server.addr=10.101.127.29:80
In this case server.addr=10.101.127.29 is a ClusterIP ... traffic seems to be flowing fine to the meshed pods sitting behind it, so I'm wondering whether this is really just related to the resolution of the pods behind the ClusterIP rather than to the actual traffic.
I've tried:
- increasing the connect timeout using the config.linkerd.io/proxy-outbound-connect-timeout annotation (see the sketch below)
- enabling debug logging, but this returns a 404 and does not seem to enable debug logging
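For reference, a minimal sketch of how that annotation sits on a workload's pod template; the Deployment name, labels, image, and timeout value below are placeholders, not what is actually deployed here:

```yaml
# Sketch only: names, labels, image, and the timeout value are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
      annotations:
        # raise the proxy's outbound connect timeout (the default is 1000ms)
        config.linkerd.io/proxy-outbound-connect-timeout: "5000ms"
    spec:
      containers:
        - name: app
          image: my-app:latest
```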
The volume isn't huge, and traffic seems to be flowing, but I don't have enough experience with Linkerd to know whether this indicates a misconfiguration or some other issue that should be resolved!
Any help appreciated.
How can it be reproduced?
Send traffic to a meshed service via a ClusterIP?
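For example, a minimal sketch of a meshed client that keeps sending requests to a ClusterIP-backed service; the service name, namespace, and image below are placeholders:

```yaml
# Sketch: a meshed client pod that continuously hits a ClusterIP service.
# Service name, namespace, and image are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: mesh-client
  annotations:
    linkerd.io/inject: enabled
spec:
  containers:
    - name: curl
      image: curlimages/curl:latest
      command:
        - sh
        - -c
        - "while true; do curl -s -o /dev/null http://my-svc.my-ns.svc.cluster.local:80/; sleep 1; done"
```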
Logs, error output, etc
outbound: linkerd_app_core::serve: Connection closed error=read header from client timeout client.addr=172.12.168.50:46264 server.addr=10.101.127.29:80
output of linkerd check -o short
linkerd-version
---------------
‼ cli is up-to-date
is running version 25.6.1 but the latest edge version is 25.6.2
see https://linkerd.io/2/checks/#l5d-version-cli for hints
control-plane-version
---------------------
‼ control plane is up-to-date
is running version 25.6.1 but the latest edge version is 25.6.2
see https://linkerd.io/2/checks/#l5d-version-control for hints
linkerd-control-plane-proxy
---------------------------
‼ control plane proxies are up-to-date
some proxies are not running the current version:
* linkerd-destination-5b68d969b7-lvkpd (edge-25.6.1)
* linkerd-destination-5b68d969b7-lvtvn (edge-25.6.1)
* linkerd-destination-5b68d969b7-xkz7j (edge-25.6.1)
* linkerd-identity-6c76fd6b97-4dk55 (edge-25.6.1)
* linkerd-identity-6c76fd6b97-8jvzx (edge-25.6.1)
* linkerd-identity-6c76fd6b97-ch4qr (edge-25.6.1)
* linkerd-proxy-injector-68f4d78469-7tc8g (edge-25.6.1)
* linkerd-proxy-injector-68f4d78469-bbvnw (edge-25.6.1)
* linkerd-proxy-injector-68f4d78469-j7j8f (edge-25.6.1)
see https://linkerd.io/2/checks/#l5d-cp-proxy-version for hints
linkerd-multicluster
--------------------
‼ multicluster extension proxies are up-to-date
some proxies are not running the current version:
* controller-eks-odin-f7f8f7bbd-bjsxz (edge-25.6.1)
* linkerd-local-service-mirror-7978bf5599-4hsd9 (edge-25.6.1)
see https://linkerd.io/2/checks/#l5d-multicluster-proxy-cp-version for hints
linkerd-viz
-----------
‼ viz extension proxies are up-to-date
some proxies are not running the current version:
* metrics-api-85f64db8dd-ppkbw (edge-25.6.1)
* prometheus-5fb54bb54-tpsn7 (edge-25.6.1)
* tap-8694877466-mqbq2 (edge-25.6.1)
* tap-injector-859cbc9d57-kc7x9 (edge-25.6.1)
* web-db4dd678d-mn7f9 (edge-25.6.1)
see https://linkerd.io/2/checks/#l5d-viz-proxy-cp-version for hints
Status check results are √
Environment
- Kubernetes Version: 1.33
- Cluster Environment: AWS EKS (auto)
- Host OS: Bottlerocket
- Linkerd: version: 25.6.1
Possible solution
No response
Additional context
No response
Would you like to work on fixing this bug?
None
OK, more detail ... I managed to get debug logging working using the pod annotations.
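Concretely, this is the shape of the pod-template annotation I mean; the log filter value is just an example (a plain debug level also works, but is much noisier):

```yaml
# Sketch: pod-template annotations that turn on proxy debug logging.
# The filter string is an example; adjust as needed.
spec:
  template:
    metadata:
      annotations:
        config.linkerd.io/proxy-log-level: "warn,linkerd=debug"
```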
At 2025-06-15 13:49:19.222 I see:
[9585.967197s] INFO ThreadId(01) outbound: linkerd_app_core::serve: Connection closed error=read header from client timeout client.addr=172.12.168.50:52084 server.addr=10.101.127.29:80
The closest request (to meta-loki-gateway.meta.svc) I can find is here:
2025-06-15 13:49:18.994 [435.152893s] DEBUG ThreadId(01) hickory_proto::udp::udp_client_stream: final message: ; header 39367:QUERY:RD:NoError:QUERY:0/0/0
2025-06-15 13:49:18.994 [435.152881s] DEBUG ThreadId(01) hickory_proto::xfer: enqueueing message:QUERY:[Query { name: Name("linkerd-dst-headless.linkerd.svc.cluster.local."), query_type: SRV, query_class: IN }]
2025-06-15 13:49:18.994 [435.152874s] DEBUG ThreadId(01) hickory_resolver::name_server::name_server: existing connection: NameServerConfig { socket_addr: 10.101.0.10:53, protocol: Udp, tls_dns_name: None, http_endpoint: None, trust_negative_responses: false, bind_addr: None }
2025-06-15 13:49:18.994 [435.152863s] DEBUG ThreadId(01) hickory_resolver::name_server::name_server_pool: sending request: [Query { name: Name("linkerd-dst-headless.linkerd.svc.cluster.local."), query_type: SRV, query_class: IN }]
2025-06-15 13:49:18.994 [435.152852s] DEBUG ThreadId(01) hickory_proto::xfer::dns_handle: querying: linkerd-dst-headless.linkerd.svc.cluster.local. SRV
2025-06-15 13:49:18.994 [435.152781s] DEBUG ThreadId(01) linkerd_dns: Resolving a SRV record name=linkerd-dst-headless.linkerd.svc.cluster.local.
2025-06-15 13:49:18.505 [434.664111s] DEBUG ThreadId(01) linkerd_http_classify::channel: sending EOS to classify
2025-06-15 13:49:18.505 [434.664102s] DEBUG ThreadId(01) linkerd_http_classify::channel: dropping ResponseBody
2025-06-15 13:49:18.505 [434.664039s] DEBUG ThreadId(01) h2::codec::framed_read: received frame=Headers { stream_id: StreamId(713), flags: (0x5: END_HEADERS | END_STREAM) }
2025-06-15 13:49:18.500 [434.659214s] DEBUG ThreadId(01) h2::codec::framed_write: send frame=Data { stream_id: StreamId(713), flags: (0x1: END_STREAM) }
2025-06-15 13:49:18.500 [434.659196s] DEBUG ThreadId(01) h2::codec::framed_write: send frame=Headers { stream_id: StreamId(713), flags: (0x4: END_HEADERS) }
2025-06-15 13:49:18.500 [434.659146s] DEBUG ThreadId(01) linkerd_proxy_http::orig_proto: Upgrading request version=HTTP/1.1 absolute_form=false
2025-06-15 13:49:18.500 [434.659140s] DEBUG ThreadId(01) linkerd_proxy_http::client: method=POST uri=http://meta-loki-gateway.meta.svc:80/loki/api/v1/push version=HTTP/1.1
2025-06-15 13:49:18.500 [434.659122s] DEBUG ThreadId(01) linkerd_http_classify::channel: state=Some(State { classify: Http(StatusRanges([500..=599])), tx: Sender { chan: Tx { inner: Chan { tx: Tx { block_tail: 0x7f999842c380, tail_position: 0 }, semaphore: Semaphore { semaphore: Semaphore { permits: 10000 }, bound: 10000 }, rx_waker: AtomicWaker, tx_count: 2, rx_fields: "..." } } } })
2025-06-15 13:49:18.500 [434.659082s] DEBUG ThreadId(01) linkerd_app_outbound::http::logical::policy::route::extensions: Initializing route extensions retry=None timeouts=StreamTimeouts { response_headers: None, response_end: None, idle: None, limit: None }
2025-06-15 13:49:18.500 [434.659037s] DEBUG ThreadId(01) linkerd_app_outbound::http::logical::policy::router: Selected route meta=RouteRef(Default { name: "http" })
2025-06-15 13:49:17.311 [433.469629s] DEBUG ThreadId(01) linkerd_http_classify::channel: sending EOS to classify
2025-06-15 13:49:17.311 [433.469624s] DEBUG ThreadId(01) linkerd_http_classify::channel: dropping ResponseBody
2025-06-15 13:49:17.311 [433.469565s] DEBUG ThreadId(01) h2::codec::framed_read: received frame=Headers { stream_id: StreamId(711), flags: (0x5: END_HEADERS | END_STREAM) }
2025-06-15 13:49:17.300 [433.459010s] DEBUG ThreadId(01) h2::codec::framed_write: send frame=Data { stream_id: StreamId(711), flags: (0x1: END_STREAM) }
2025-06-15 13:49:17.300 [433.458991s] DEBUG ThreadId(01) h2::codec::framed_write: send frame=Headers { stream_id: StreamId(711), flags: (0x4: END_HEADERS) }
2025-06-15 13:49:17.300 [433.458951s] DEBUG ThreadId(01) linkerd_proxy_http::orig_proto: Upgrading request version=HTTP/1.1 absolute_form=false
2025-06-15 13:49:17.300 [433.458946s] DEBUG ThreadId(01) linkerd_proxy_http::client: method=POST uri=http://meta-loki-gateway.meta.svc:80/loki/api/v1/push version=HTTP/1.1
2025-06-15 13:49:17.300 [433.458929s] DEBUG ThreadId(01) linkerd_http_classify::channel: state=Some(State { classify: Http(StatusRanges([500..=599])), tx: Sender { chan: Tx { inner: Chan { tx: Tx { block_tail: 0x7f999842c380, tail_position: 0 }, semaphore: Semaphore { semaphore: Semaphore { permits: 10000 }, bound: 10000 }, rx_waker: AtomicWaker, tx_count: 2, rx_fields: "..." } } } })
2025-06-15 13:49:17.300 [433.458893s] DEBUG ThreadId(01) linkerd_app_outbound::http::logical::policy::route::extensions: Initializing route extensions retry=None timeouts=StreamTimeouts { response_headers: None, response_end: None, idle: None, limit: None }
2025-06-15 13:49:17.300 [433.458851s] DEBUG ThreadId(01) linkerd_app_outbound::http::logical::policy::router: Selected route meta=RouteRef(Default { name: "http" })
2025-06-15 13:49:16.705 [432.863656s] WARN ThreadId(01) linkerd_app_core::serve: Server failed to become ready error=unauthorized connection on default/all-authenticated client.addr=172.12.172.9:47558
2025-06-15 13:49:16.705 [432.863620s] INFO ThreadId(01) linkerd_app_inbound::policy::tcp: Connection denied server.group= server.kind=default server.name=all-authenticated tls=None(PortSkipped) client=172.12.172.9:47558
2025-06-15 13:49:16.705 [432.863562s] DEBUG ThreadId(01) linkerd_app_inbound::accept: Accepted policy=ServerPolicy { protocol: Opaque([Authorization { networks: [Network { net: 0.0.0.0/0, except: [] }, Network { net: ::/0, except: [] }], authentication: TlsAuthenticated { identities: {}, suffixes: [Suffix { ends_with: "" }] }, meta: Default { name: "all-authenticated" } }, Authorization { networks: [Network { net: 127.0.0.1/32, except: [] }, Network { net: ::1/128, except: [] }], authentication: Unauthenticated, meta: Default { name: "localhost" } }]), meta: Default { name: "all-authenticated" }, local_rate_limit: LocalRateLimit { meta: None, total: None, per_identity: None, overrides: {} } }
Though it's interesting that the entry at 2025-06-15 13:49:19.222,
[9585.967197s] INFO ThreadId(01) outbound: linkerd_app_core::serve: Connection closed error=read header from client timeout client.addr=172.12.168.50:52084 server.addr=10.101.127.29:80
has a time of 9585.967197s. Is this how long the connection has been open? (Is this an idle timeout?)
Thanks for the report! At first glance this seems very similar to https://github.com/linkerd/linkerd2/issues/14147. Are these connections by any chance keep-alives that might be timing out? If so, this may just be a case of increasing the outbound timeout as you've done.
I shouldn't think so ... the writes are running pretty continuously; it's a straight HTTP POST with a protobuf payload: https://grafana.com/docs/loki/latest/reference/loki-http-api/#ingest-logs
We are seeing the same issue in the traffic between our applications and our self-hosted Elastic APM server.
We also spotted the following error when our applications try to retrieve config from Consul. The traffic between applications and Consul clients is plain HTTP, so handling it as opaque seems wrong.
inbound:server{port=8500}: linkerd_app_inbound::detect: Handling connection as opaque due to policy timeout=10s
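For anyone else hitting the Consul case, a hedged sketch of declaring the port's protocol up front so the inbound proxy doesn't have to detect it at all; the namespace, name, and pod selector below are made up, and it assumes a recent policy.linkerd.io API version:

```yaml
# Sketch: declare port 8500 as HTTP/1 so the inbound proxy skips protocol
# detection instead of timing out and falling back to opaque.
# Namespace, name, and selector are placeholders.
apiVersion: policy.linkerd.io/v1beta3
kind: Server
metadata:
  namespace: consul
  name: consul-http-api
spec:
  podSelector:
    matchLabels:
      app: consul
  port: 8500
  proxyProtocol: "HTTP/1"
```

Note that once a Server is defined for a port, that port is governed by authorization policy, so it may also need an accessPolicy or AuthorizationPolicy consistent with the existing defaults.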
thank you all for filing this issue.
we were able to identify a change in the default behavior of hyper, our HTTP library, as being responsible for this. hyper introduced a 30-second idle timeout for http/1.1 connections in its 1.0 release, while the 0.14 release used prior to proxy v2.288 did not enforce such a timeout.
pull request linkerd/linkerd2-proxy#3985 removed this timeout, restoring the behavior that was present prior to the v2.288 release of the proxy. this issue should now be fixed in proxy release v2.304, which is included in the edge-25.6.4 release of linkerd2; see https://github.com/linkerd/linkerd2/releases/tag/edge-25.6.4