linkerd2
Linkerd route level RPS not coming up
What is the issue?
We deployed Linkerd stable-2.14.0 on GKE v1.24 and configured a ServiceProfile for an application. The routes are listed, but RPS and the other per-route metrics stay empty.
How can it be reproduced?
GKE v1.24 Linkerd stable-v2.14.0
Logs, error output, etc
ROUTE       SERVICE   SUCCESS   RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99
[DEFAULT]   svc1      -         -     -             -             -
healthz     svc1      -         -     -             -             -
version     svc1      -         -     -             -             -
output of linkerd check -o short
linkerd-version
‼ cli is up-to-date
is running version 2.14.0 but the latest stable version is 2.14.5
see https://linkerd.io/2.14/checks/#l5d-version-cli for hints
control-plane-version
‼ control plane is up-to-date
is running version 2.14.0 but the latest stable version is 2.14.5
see https://linkerd.io/2.14/checks/#l5d-version-control for hints
linkerd-control-plane-proxy
‼ control plane proxies are up-to-date
some proxies are not running the current version:
* linkerd-destination-d995f46dc-gcvwq (stable-2.14.0)
* linkerd-identity-86c6f76f6c-p6k52 (stable-2.14.0)
* linkerd-proxy-injector-6fc56bcd48-2x9rs (stable-2.14.0)
see https://linkerd.io/2.14/checks/#l5d-cp-proxy-version for hints
linkerd-viz
‼ linkerd-viz pods are injected
could not find proxy container for metrics-api-f46599848-6j2lz pod
see https://linkerd.io/2.14/checks/#l5d-viz-pods-injection for hints
‼ viz extension pods are running
container "linkerd-proxy" in pod "metrics-api-f46599848-6j2lz" is not ready
see https://linkerd.io/2.14/checks/#l5d-viz-pods-running for hints
‼ viz extension proxies are healthy
no "linkerd-proxy" containers found in the "linkerd" namespace
see https://linkerd.io/2.14/checks/#l5d-viz-proxy-healthy for hints
Status check results are √
Environment
GKE v1.24 Linkerd stable-2.14.0
Possible solution
No response
Additional context
No response
Would you like to work on fixing this bug?
yes
Hey @mayank-ag-dev! Those errors from linkerd check are very concerning; they look an awful lot like linkerd-viz isn't set up correctly. Maybe uninstall and reinstall it?
Assuming that you clear the Viz errors and it's still not working, we'd like to see the Service and ServiceProfile for at least one of these workloads... thanks!
Hey @kflynn, I resolved the linkerd-viz errors. Sharing the Service and ServiceProfile snippets:
apiVersion: linkerd.io/v1alpha2
kind: ServiceProfile
metadata:
  name: podinfo-svc.podinfo.svc.cluster.local
  namespace: podinfo
spec:
  routes:
  - name: health-check
    condition:
      method: GET
      pathRegex: /healthz
  - name: version
    condition:
      method: GET
      pathRegex: /version
---
apiVersion: v1
items:
- apiVersion: v1
  kind: Service
  metadata:
    name: podinfo-svc
    namespace: podinfo
  spec:
    ports:
    - name: http
      port: 9898
      protocol: TCP
      targetPort: http
    - name: grpc
      port: 9999
      protocol: TCP
      targetPort: grpc
    selector:
      app: podinfo
    type: ClusterIP
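One quick thing to rule out when comparing a Service with its ServiceProfile is a name mismatch: Linkerd expects the ServiceProfile to be named after the Service's fully qualified DNS name. The following is a minimal Python sketch of that check, not part of any Linkerd tooling. The helper names `expected_profile_name` and `profile_name_problems` are hypothetical, and the default cluster domain `cluster.local` is an assumption that may differ on your cluster.

```python
def expected_profile_name(service, namespace, cluster_domain="cluster.local"):
    """Build the FQDN a ServiceProfile for this Service is expected to carry."""
    # Assumption: the cluster uses the default "cluster.local" domain
    # unless a different one is passed in.
    return f"{service}.{namespace}.svc.{cluster_domain}"


def profile_name_problems(profile_name, service, namespace,
                          cluster_domain="cluster.local"):
    """Return a list of human-readable problems (empty if the name lines up)."""
    problems = []
    expected = expected_profile_name(service, namespace, cluster_domain)
    if profile_name != expected:
        problems.append(
            f"ServiceProfile is named {profile_name!r}, expected {expected!r}"
        )
    return problems


if __name__ == "__main__":
    # Values taken from the manifests posted in this thread.
    issues = profile_name_problems(
        "podinfo-svc.podinfo.svc.cluster.local", "podinfo-svc", "podinfo"
    )
    print(issues or "ServiceProfile name matches the Service FQDN")
```

For the manifests above the names line up, so a naming mismatch does not appear to explain the empty metrics here.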
And after the Viz errors are resolved, it's still not working?
Yes, it is still not working. Are there any configuration changes required for ServiceProfiles in stable-2.14.0? Here is the full linkerd check output:
kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API
kubernetes-version
------------------
√ is running the minimum Kubernetes API version
linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ control plane pods are ready
√ cluster networks contains all node podCIDRs
√ cluster networks contains all pods
√ cluster networks contains all services
linkerd-config
--------------
√ control plane Namespace exists
√ control plane ClusterRoles exist
√ control plane ClusterRoleBindings exist
√ control plane ServiceAccounts exist
√ control plane CustomResourceDefinitions exist
√ control plane MutatingWebhookConfigurations exist
√ control plane ValidatingWebhookConfigurations exist
√ proxy-init container runs as root user if docker container runtime is used
linkerd-identity
----------------
√ certificate config is valid
√ trust anchors are using supported crypto algorithm
√ trust anchors are within their validity period
√ trust anchors are valid for at least 60 days
√ issuer cert is using supported crypto algorithm
√ issuer cert is within its validity period
√ issuer cert is valid for at least 60 days
√ issuer cert is issued by the trust anchor
linkerd-webhooks-and-apisvc-tls
-------------------------------
√ proxy-injector webhook has valid cert
√ proxy-injector cert is valid for at least 60 days
√ sp-validator webhook has valid cert
√ sp-validator cert is valid for at least 60 days
√ policy-validator webhook has valid cert
√ policy-validator cert is valid for at least 60 days
linkerd-version
---------------
√ can determine the latest version
‼ cli is up-to-date
is running version 2.14.0 but the latest stable version is 2.14.5
see https://linkerd.io/2.14/checks/#l5d-version-cli for hints
control-plane-version
---------------------
√ can retrieve the control plane version
‼ control plane is up-to-date
is running version 2.14.0 but the latest stable version is 2.14.5
see https://linkerd.io/2.14/checks/#l5d-version-control for hints
√ control plane and cli versions match
linkerd-control-plane-proxy
---------------------------
√ control plane proxies are healthy
‼ control plane proxies are up-to-date
some proxies are not running the current version:
* linkerd-destination-64fd9c9866-pbzxt (stable-2.14.0)
* linkerd-identity-6c5fc457db-pwl7f (stable-2.14.0)
* linkerd-proxy-injector-5d85b4686f-mg77v (stable-2.14.0)
see https://linkerd.io/2.14/checks/#l5d-cp-proxy-version for hints
√ control plane proxies and cli versions match
linkerd-viz
-----------
√ linkerd-viz Namespace exists
√ can initialize the client
√ linkerd-viz ClusterRoles exist
√ linkerd-viz ClusterRoleBindings exist
√ tap API server has valid cert
√ tap API server cert is valid for at least 60 days
√ tap API service is running
√ linkerd-viz pods are injected
√ viz extension pods are running
√ viz extension proxies are healthy
‼ viz extension proxies are up-to-date
some proxies are not running the current version:
* metrics-api-75f76fbd65-44wv8 (stable-2.14.0)
* prometheus-7c74c74478-7fxkz (stable-2.14.0)
* tap-6665794f66-f6ksl (stable-2.14.0)
* tap-injector-74f66f65d5-zkw9v (stable-2.14.0)
* web-78c46f4b57-8wx9z (stable-2.14.0)
see https://linkerd.io/2.14/checks/#l5d-viz-proxy-cp-version for hints
√ viz extension proxies and cli versions match
√ prometheus is installed and configured correctly
√ viz extension self-check
Status check results are √
@mayank-ag-dev I think the biggest question here is whether you're using ServiceProfiles or HTTPRoutes. For per-route metrics at the moment, you need to be using ServiceProfiles.
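To make the per-route bucketing concrete, here is a rough Python sketch of how requests could be assigned to the routes of the ServiceProfile posted earlier in this thread. It rests on several assumptions: that `pathRegex` has to match the whole request path, that the first matching route wins, and that unmatched requests are counted under `[DEFAULT]`. Python's `re` module is only a stand-in for the proxy's own regex engine, and `match_route` is a hypothetical helper, not a Linkerd API.

```python
import re

# (route name, HTTP method, pathRegex) copied from the ServiceProfile above.
ROUTES = [
    ("health-check", "GET", "/healthz"),
    ("version", "GET", "/version"),
]


def match_route(method, path, routes=ROUTES):
    """Return the name of the first route matching the request, else [DEFAULT]."""
    for name, route_method, path_regex in routes:
        # Assumption: the regex must match the full path, not just a prefix.
        if method == route_method and re.fullmatch(path_regex, path):
            return name
    return "[DEFAULT]"


if __name__ == "__main__":
    for method, path in [("GET", "/healthz"), ("GET", "/healthz/ready"),
                         ("POST", "/version"), ("GET", "/")]:
        print(method, path, "->", match_route(method, path))
```

Under these assumptions only exact `GET /healthz` and `GET /version` requests land on the named routes. Rows that are empty everywhere, including `[DEFAULT]`, point to no meshed traffic being recorded at all, not to a route-matching problem.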
@kflynn We are using ServiceProfiles for our HTTP routes.
@mayank-ag-dev 🤦‍♂️ So sorry to ask you to confirm ServiceProfiles when you'd already posted a ServiceProfile! Let me poke a little more into this.
@kflynn Any update on this? It is having a major impact on our observability.
So far I haven't managed to reproduce this. 🙁 Are you on our Slack? If so, I'd like to connect there and try a few things with you.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.