
"Metrics not available at the moment" on minikube , prometheus installed via Lens > Settings > Lens metrics

ecerulm opened this issue 2 years ago

Describe the bug

I get "metrics not available at the moment" for all pods, even though prometheus is installed using Lens itself.

To Reproduce

  • minikube delete
  • minikube start
  • Start Lens.app > Catalog > minikube
  • minikube > Settings > Lens Metrics

  • Enable bundled Prometheus metrics stack (check)
  • Enable bundled kube-state-metrics stack (check)
  • Enable bundled node-exporter stack (check)
  • Apply

minikube > Settings > Metrics > Prometheus > Lens

minikube kubectl -- -n lens-metrics get all
NAME                                     READY   STATUS    RESTARTS   AGE
pod/kube-state-metrics-95ccdf888-tkqzz   1/1     Running   0          98s
pod/node-exporter-68fc7                  1/1     Running   0          98s
pod/prometheus-0                         1/1     Running   0          98s

NAME                         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/kube-state-metrics   ClusterIP   10.97.199.201   <none>        8080/TCP   98s
service/node-exporter        ClusterIP   None            <none>        80/TCP     98s
service/prometheus           ClusterIP   10.107.181.49   <none>        80/TCP     98s

NAME                           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/node-exporter   1         1         1       1            1           <none>          98s

NAME                                 READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/kube-state-metrics   1/1     1            1           98s

NAME                                           DESIRED   CURRENT   READY   AGE
replicaset.apps/kube-state-metrics-95ccdf888   1         1         1       98s

NAME                          READY   AGE
statefulset.apps/prometheus   1/1     98s

Run a pod

 kubectl run -ti --rm "test-$RANDOM" --image=ecerulm/ubuntu-tools:latest
root@test-1720:/# apt-get install stress
root@test-1720:/# stress --cpu 1

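(A quick sanity check that the pod is actually consuming CPU, independent of the Prometheus stack, is kubectl top; this assumes the minikube metrics-server addon is enabled and only validates the workload, not the Lens metrics pipeline.)

 minikube addons enable metrics-server
 kubectl top pod test-1720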

Expected behavior: I expect it to show CPU metrics for the pods, or a debug log somewhere that tells me why there are no metrics. As far as I know, Lens does PromQL queries against prometheus-server, but I don't know exactly which queries it runs or why they come back empty.

Environment (please complete the following information):

  • Lens Version: 5.4.3-latest.20220317.1
  • OS: macOS Big Sur 11.6.2
  • Installation method (e.g. snap or AppImage in Linux): dmg

Additional context

 minikube kubectl -- port-forward -n lens-metrics service/prometheus 8080:80
Forwarding from 127.0.0.1:8080 -> 9090
Forwarding from [::1]:8080 -> 9090
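With the port-forward in place, the Prometheus HTTP API can also be queried directly, which helps rule out scrape problems before blaming Lens (the jq filter is optional, and the job names you see depend on the bundled scrape config):

 # list scrape targets and their health
 curl -s http://127.0.0.1:8080/api/v1/targets | jq '.data.activeTargets[] | {job: .labels.job, health: .health}'
 # run an instant query
 curl -s http://127.0.0.1:8080/api/v1/query --data-urlencode 'query=up'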

From the browser I can see that container_cpu_usage_seconds_total{pod="test-1720"} has results, so I guess that Lens is doing some other query, but it's not clear which one.

ecerulm commented on Mar 21 '22

I think that Lens.app runs the following PromQL query:

https://github.com/lensapp/lens/blob/589472c2b53a0e65e452d36078a71ce4f642d1b7/src/main/prometheus/lens.ts#L82-L83

So I tried that against the Prometheus server:

minikube kubectl -- port-forward -n lens-metrics service/prometheus 8080:80

And performed sum(rate(container_cpu_usage_seconds_total{container!="", image!="", pod=~"test-32677", namespace="default"}[1m])) by (pod), and I get an empty result from the Prometheus server.

But if I try with a longer rateAccuracy, Prometheus actually returns metrics:

sum(rate(container_cpu_usage_seconds_total{container!="", image!="", pod=~"test-32677", namespace="default"}[20m])) by (pod)
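For context: rate() only produces a value when at least two samples fall inside its window, so if scrapes are more than roughly 30 seconds apart (or failing intermittently), a [1m] window can legitimately come back empty while [20m] still returns data. One way to see the actual sample spacing is to run a plain range selector in the Prometheus console, e.g. (pod name as in the example above):

 container_cpu_usage_seconds_total{container!="", image!="", pod=~"test-32677", namespace="default"}[5m]

The result lists the raw samples with their timestamps, which shows how far apart the scrapes really are.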

ecerulm commented on Mar 21 '22

It takes some time before we expect metrics to start appearing. However, we can certainly improve the UI here to make that more clear.

Nokel81 commented on Mar 21 '22

that pod "test-32677" has been running 1 hour and it's not showing up in Lens. Also if I try the same thing but installing kube-prometheus (prometheus operator) then metrics appear in Lens.app after 2 minutes.

minikube delete && minikube start --kubernetes-version=v1.23.0 --memory=6g --bootstrapper=kubeadm --extra-config=kubelet.authentication-token-webhook=true --extra-config=kubelet.authorization-mode=Webhook --extra-config=scheduler.bind-address=0.0.0.0 --extra-config=controller-manager.bind-address=0.0.0.0
minikube addons disable metrics-server
minikube kubectl -- apply --server-side -f manifests/setup
minikube kubectl -- apply -f manifests/

and then change the Lens setting to Prometheus Operator with service address monitoring/prometheus-k8s:9090, and the CPU chart in Lens.app works almost right away. But I haven't managed to get the Lens Metrics or Helm options to work on minikube.
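For what it's worth, after applying the kube-prometheus manifests, the operator stack and the service address Lens needs can be checked with something like the following (namespace and service name are the kube-prometheus defaults and may differ in customized setups):

 minikube kubectl -- -n monitoring get pods
 minikube kubectl -- -n monitoring get svc prometheus-k8s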

ecerulm commented on Mar 21 '22

I am seeing the same behaviour for 1 of my clusters, whereas another cluster in the same workspace works perfectly fine, showing all metrics. Here are more details:

  • Lens Version: 5.4.6
  • OS: Ubuntu 20.04
  • Installation method: .deb package

Here are my observations with two of my clusters in Lens:

(1.) Cloud provider: Azure; Service: Azure Kubernetes Service; Kubernetes version: 1.22; Prometheus installation: manually installed Helm chart (https://github.com/prometheus-community/helm-charts/tree/prometheus-14.12.0/charts/prometheus), in namespace "metrics"

When viewing this cluster in Lens, I don't even need to set up Settings -> Metrics -> "Helm". With the default setting of "auto detect", all the metrics (cluster, node, pod, etc.) are visible.

(2.) Cloud provider: AWS; Service: AWS EKS; Kubernetes version: 1.22; Prometheus installation: manually installed Helm chart (https://github.com/prometheus-community/helm-charts/tree/prometheus-14.12.0/charts/prometheus), in namespace "metrics"

When viewing this cluster in Lens, the default setting "auto detect" causes the metrics chart to try loading for 5 minutes and then show the message "metrics are not available at this moment".

Upon changing this setting to "Helm" and providing the Prometheus service address as "metrics/prometheus:80", I see the same behaviour: a 5-minute wait, then "metrics not available".

I even tried removing the Prometheus Helm release and installing the lens-metrics stack (enabled Prometheus, kube-state-metrics, and node-exporter in Settings -> Lens Metrics), but I still see the same behaviour.
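One way to narrow this down is to bypass Lens and query the Helm-installed Prometheus directly, along these lines (the service name depends on the release name; prometheus-server is the chart's default):

 kubectl -n metrics get svc
 kubectl -n metrics port-forward svc/prometheus-server 9090:80
 curl -s http://127.0.0.1:9090/api/v1/query --data-urlencode 'query=up'

If up shows healthy targets but Lens still displays no metrics, the problem is more likely the provider/query selection than the scraping itself.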

tapanhalani commented on Jun 07 '22

I'm experiencing issues after deploying the same chart (https://github.com/prometheus-community/helm-charts/tree/prometheus-14.12.0/charts/prometheus) in Azure AKS, even though you mention above that you are able to pull all the metrics in Lens. Prometheus: 14.12.0; Lens Version: 6.0; Kubernetes version: 1.19.11. To enable pulling metrics (CPU, memory, disk), do we need to customize anything on the Helm chart side? Thanks in advance.

nanirover commented on Nov 09 '22

I observed that the query works if you remove container!="", image!="".

In my case it is

sum(rate(container_cpu_usage_seconds_total{pod=~"camel-k-operator-84c8c9d56b-s5l6k", namespace="default"}[1m])) by (pod)

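A quick way to check whether those labels are populated at all in a given setup (and therefore whether the container!="", image!="" matchers filter everything out) is to group the series by them, for example:

 count by (container, image) (container_cpu_usage_seconds_total{pod=~"camel-k-operator-84c8c9d56b-s5l6k", namespace="default"})

If every returned series has an empty container or image label, the stricter matcher will drop them all.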

wizpresso-steve-cy-fan commented on Nov 29 '22

cc https://github.com/lensapp/lens/issues/5660#issuecomment-1159149850

wizpresso-steve-cy-fan commented on Nov 29 '22

@wizpresso-steve-cy-fan Which provider do you have set in your cluster preferences?

Nokel81 commented on Nov 29 '22

Thanks for bringing this up.

Nokel81 commented on Nov 29 '22

Though I think you probably should be using the "Helm 14.x" provider

Nokel81 commented on Nov 29 '22

@Nokel81 I used Prometheus Operator, since that is the way I installed it: I used a Helm chart to install the operator.

wizpresso-steve-cy-fan commented on Nov 30 '22

I guess my question was mostly towards @nanirover

Nokel81 commented on Nov 30 '22

@ecerulm Did you try changing the scrape_interval for your install? We have it set at 15s, which should mean that the 1m rate interval can collect 2 or more scrapes.

I guess your problem is that the scrapes seem to be failing quite often.
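For reference, rate() over a [1m] window only produces a value when at least two samples fall inside that window, so the effective scrape interval has to stay well under 30s. A minimal sketch of the relevant Prometheus config (scrape_interval matches the 15s mentioned above; scrape_timeout here is just illustrative):

 global:
   scrape_interval: 15s   # at least two samples land inside a 1m rate() window
   scrape_timeout: 10s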

NOTE: the above PR is for fixing @wizpresso-steve-cy-fan's issue

Nokel81 commented on Nov 30 '22